This is part 1. A quick recap of how far I've come with this project, my current thoughts and next steps.
Idea and current system
I want to simulate (increasingly complex) economic scenarios with AI agents that run on different base models, or on the same one. Each agent has its own "personality", i.e. a strategy profile. The agents interact in a world layer that simulates different scenarios. The first one I tested is a simple marketplace with 5 agents, each running a different strategy profile. Currently the simulation is tick-based: each tick, every agent can update its output according to its strategy.
Each tick, the agents think and output their adjusted values for the next round. This thinking is also displayed in the output section and can be looked at individually from tick to tick.
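To make the tick idea concrete, here is a minimal sketch of such a loop. All names (Agent, Decision, runSimulation) are made up for illustration and are not the actual code. It also assumes that every agent sees the same price snapshot from the previous tick, and it uses plain functions where the real agents would make a model call.

```typescript
// Hypothetical shapes for illustration; the real agents call an LLM
// instead of a pure function.
interface Decision {
  price: number;
  reasoning: string; // the "thinking" shown in the output section
}

interface Agent {
  name: string;
  decide: (tick: number, lastPrices: Record<string, number>) => Decision;
}

interface TickLog {
  tick: number;
  decisions: Record<string, Decision>;
}

function runSimulation(
  agents: Agent[],
  startPrices: Record<string, number>,
  ticks: number
): TickLog[] {
  const log: TickLog[] = [];
  let prices: Record<string, number> = { ...startPrices };
  for (let tick = 1; tick <= ticks; tick++) {
    const decisions: Record<string, Decision> = {};
    const nextPrices: Record<string, number> = {};
    // Assumption: all agents decide on the previous tick's snapshot,
    // so moves within a tick are simultaneous.
    for (const agent of agents) {
      const decision = agent.decide(tick, prices);
      decisions[agent.name] = decision;
      nextPrices[agent.name] = decision.price;
    }
    prices = nextPrices;
    log.push({ tick, decisions });
  }
  return log;
}

// Two toy strategies standing in for persona prompts.
const undercutter: Agent = {
  name: "undercutter",
  decide: (_tick, last) => {
    const lowest = Math.min(...Object.keys(last).map((k) => last[k]));
    return { price: Math.max(1, lowest - 1), reasoning: "go below the lowest price" };
  },
};

const holder: Agent = {
  name: "holder",
  decide: () => ({ price: 12, reasoning: "hold the high price" }),
};

const log = runSimulation([undercutter, holder], { undercutter: 12, holder: 12 }, 3);
console.log(log.map((entry) => entry.decisions["undercutter"].price));
```

Keeping the reasoning string next to the price in each tick's log is what makes it possible to inspect the thinking tick by tick afterwards.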
Quick tech stack: a super basic Next.js app deployed on Vercel, plus a Supabase DB for user login, storing run data etc.
First run
The first run used the marketplace with 5 agents. Each tick, 10 customers come in and buy from the cheapest source (foreshadowing the outcome haha). This was the first output I ran, right after Claude was done building the basic site and structure I had in mind. I just added 5 pre-configured agents, all running Claude Sonnet 4.5, simply because I had that API ready to go. As was to be expected, the agent I had configured to always undercut on price turned this into a race to the bottom pretty quickly.
Scenario: five merchants in a "town square". Each tick, ten customers arrive and buy from whoever is cheapest. Ties split the customers equally. There was no marginal cost in that first run, so every sale at any price was pure profit. Agents start with different prices and personas.
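The allocation rule for this first run can be sketched in a few lines. The function names and the example prices are made up for illustration, and fractional sales are an assumption so that ties split exactly evenly.

```typescript
// Hypothetical helper for the first-run rule: the cheapest merchant takes
// every customer, and merchants tied at the lowest price split them equally.
function allocateCheapest(prices: number[], customers: number): number[] {
  if (prices.length === 0) return [];
  const lowest = Math.min(...prices);
  const winners = prices.filter((p) => p === lowest).length;
  // Assumption: fractional sales are allowed so ties split evenly.
  return prices.map((p) => (p === lowest ? customers / winners : 0));
}

// With zero marginal cost, profit per merchant is simply price * sales.
function revenue(prices: number[], sales: number[]): number[] {
  return prices.map((p, i) => p * (sales[i] ?? 0));
}

const firstRunPrices = [12, 9, 15, 9, 11]; // illustrative, not run data
const firstRunSales = allocateCheapest(firstRunPrices, 10);
console.log(firstRunSales); // [0, 5, 0, 5, 0]
console.log(revenue(firstRunPrices, firstRunSales)); // [0, 45, 0, 45, 0]
```

Under this rule anyone who is not the cheapest earns nothing, so a single undercutter is enough to drag everyone down.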
By tick 3, the aggressive agent (BRIX) had pulled prices down. The mimetic agent (ECHO) followed on its next tick because that's what its persona told it to do. The cartel leader (CASS), broadcasting "let's hold prices high together," kept getting zero sales because everyone undercut him. Within five ticks the market converged near the price floor.
A snapshot from BRIX at tick 2:
Undercutting to marginal cost to maximize market share and crush competition
And CASS, who was supposed to be leading the cartel:
Maintain high price signal. Others showing interest in coordination at 10-12.
Second run
After that, I was ready to make some improvements. The biggest was a new demand function. Before, customers always bought at the lowest price. Now demand follows a function with adjustable parameters.
The new demand model: customers split among merchants using multinomial logit. Each merchant's share of the N customers each tick is:
sales_i = N × exp(−α × price_i) / Σ exp(−α × price_j)
Here α is price sensitivity (default 0.3). Profit per merchant is (price − c) × sales, where c is marginal cost (default 5). Wealth accumulates over the run; wealth is now actual profit over time, not just revenue. With α=0.3 and five merchants, a merchant priced at 14 when the other four are at 16 captures around 31% of demand instead of 100%. High-price merchants still get some customers, defection still pays but less than before, and cooperation becomes structurally possible.
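Here is a minimal sketch of that demand model, so the numbers can be checked by hand. The function names are made up for illustration; the defaults (α = 0.3, c = 5) are the ones stated above. Subtracting the minimum price before exponentiating is a numerical-stability habit of mine and does not change the shares.

```typescript
// Multinomial logit demand:
//   sales_i = N * exp(-alpha * price_i) / sum_j exp(-alpha * price_j)
function logitSales(prices: number[], customers: number, alpha = 0.3): number[] {
  if (prices.length === 0) return [];
  // Shifting all prices by the minimum cancels out in the ratio
  // and keeps the exponentials in a safe range.
  const minPrice = Math.min(...prices);
  const weights = prices.map((p) => Math.exp(-alpha * (p - minPrice)));
  const total = weights.reduce((a, b) => a + b, 0);
  return weights.map((w) => (customers * w) / total);
}

// Profit per merchant: (price - c) * sales, with c the marginal cost.
function profits(prices: number[], sales: number[], cost = 5): number[] {
  return prices.map((p, i) => (p - cost) * (sales[i] ?? 0));
}

// One merchant at 14, four merchants at 16, ten customers per tick.
const logitPrices = [14, 16, 16, 16, 16];
const logitResult = logitSales(logitPrices, 10);
console.log(logitResult.map((s) => s.toFixed(2)));
console.log(profits(logitPrices, logitResult).map((p) => p.toFixed(2)));
```

Raising alpha pushes this back toward the winner-takes-all behaviour of the first run, while alpha near zero splits customers almost evenly regardless of price.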
Second run results. Same five personas (ARIA cautious, BRIX aggressive, CASS cartel leader, DEVA game theorist, ECHO mimetic).
Finding 1: The cartel leader didn't lead. CASS started at 16 (high, as scripted) but by tick 6 had dropped down to 11 to join ARIA and DEVA, who had quietly settled at 10. A stable coalition formed, but at the cautious member's anchor price, not the leader's target.
Finding 2: Persona-lock. By tick 15, BRIX had five consecutive ticks of −18 profit while the coalition had eight ticks of stable +4. BRIX was still broadcasting messages like "your coalition crumbles when one defects" while sitting on −154 cumulative loss. The model had all the data it needed to conclude the coalition was stable. The persona prompt overrode the inference.
Final wealth after 19 ticks:
Agent  Wealth  Persona         Outcome
ARIA   +61     cautious        coalition
DEVA   +60     game theorist   coalition
CASS   +45     cartel leader   joined coalition late
ECHO   −120    mimetic         mechanical losses
BRIX   −154    aggressive      persona-locked
Next, some quality-of-life improvements were needed: a chart showing each agent's price tick by tick, and market metrics such as an HHI (Herfindahl-Hirschman Index) calculation each tick.
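For reference, the HHI is the sum of squared market shares. A minimal sketch, assuming shares are computed from units sold in a tick and expressed in percent (so the index runs up to 10,000 for a monopoly); the function name is made up, and using revenue shares instead would be an equally valid choice:

```typescript
// Herfindahl-Hirschman Index from per-merchant sales in one tick.
// Shares are in percent, so the result lies between 10000 / n and 10000.
function hhi(sales: number[]): number {
  const total = sales.reduce((a, b) => a + b, 0);
  if (total === 0) return 0; // no sales this tick, nothing to measure
  return sales.reduce((acc, s) => {
    const sharePct = (100 * s) / total;
    return acc + sharePct * sharePct;
  }, 0);
}

console.log(hhi([2, 2, 2, 2, 2])); // 2000: five equal merchants
console.log(hhi([10, 0, 0, 0, 0])); // 10000: one merchant takes everything
```

Tracked per tick, this shows at a glance whether the market is drifting toward a single dominant merchant or staying evenly split.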
Next steps
First, I need ways to punish negative profits (i.e. losses) and give more incentives for being highly profitable. I want to implement something like "Brand" and other differentiators along the demand axis, and simulate some supply/demand shocks. So overall, I need to build out the marketplace example first, in much more complex scenarios.
Second, I need to store all the API call, run, and price history data for each run in a readable format. On the one hand, this keeps everything for future evaluation (and I want to keep the data I pay money for). On the other hand, it builds the data layer for future automations, and maybe for reusing run data in other scenarios or building new agent personas from run data.
Then, I want to build out a roadmap and plan the next iterations of the scenarios, new features and analysis/write-ups.
