How Poker0dds teaches you Texas Hold'em odds by putting you at the table with autonomous agents that think, bluff, and fold just like real players.
When we set out to build Poker0dds, the goal was simple: give players a training environment where they can see their real-time win probability on every hand while competing against opponents that actually play poker. Not bots that call every bet. Not bots that fold to any raise. Real players — with tendencies, styles, and tells.
What emerged from that goal was a multi-agent system where each seat at the table is occupied by an autonomous agent with its own distinct personality, decision-making framework, and risk tolerance. This post explains how it works.
Most poker training apps use one of two extremes: bots that are trivially easy to beat (call everything, never bluff), or bots that play "optimal" GTO poker so perfectly that they feel inhuman. Neither is useful for learning.
Real poker tables are filled with a mix of personalities. You'll find the tight player who only raises with premium hands and folds to anything aggressive. The loose cannon who raises pre-flop with 7-2 offsuit just to see what happens. The calling station who simply refuses to fold regardless of the odds. Learning to read and exploit these tendencies is what separates good players from great ones.
So we built exactly that variety into Poker0dds.
Every opponent seat at the table is randomly assigned one of five archetypes when a new table session begins. Each archetype is a configuration of behavioral parameters that shapes how that agent evaluates hands, sizes bets, and decides when to bluff.
TAG (tight-aggressive): Plays few hands but bets confidently when in. Uses exact flop equity calculations. The closest thing to a textbook poker player.
LAG (loose-aggressive): Enters many pots and barrels frequently. High bluff rate and wide three-bet range. Difficult to read and put on a hand.
Rock: Folds most hands. When a Rock raises, respect it; they almost always have it. Low continuation-bet (c-bet) frequency, minimal bluffing.
Calling Station: Calls almost anything, rarely raises. Hard to bluff off a hand. Value-bet them relentlessly with strong holdings.
Maniac: Raises constantly with a wide range, high bluff frequency, and unpredictable sizing. Chaotic but occasionally dangerous.
These aren't just cosmetic labels. Each archetype carries concrete numerical parameters that feed directly into the agent's decision engine:
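A minimal sketch of what such a parameter set could look like, written in Python for readability rather than taken from the app. The field names are hypothetical, and only three values come from this post (the TAG and Maniac raise thresholds and the LAG turn-barrel rate); everything else is a placeholder chosen to match the descriptions above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Archetype:
    """Behavioral knobs for one heuristic agent (field names are hypothetical)."""
    name: str
    raise_equity: float      # minimum win probability required to raise
    turn_barrel_freq: float  # probability of betting the turn after betting the flop
    bluff_freq: float        # probability of betting with a weak hand
    sizing_jitter: float     # maximum +/- fraction applied to bet sizes


ARCHETYPES = {
    # raise_equity=0.65 is quoted in the post; the other TAG values are placeholders.
    "TAG": Archetype("TAG", 0.65, 0.50, 0.10, 0.10),
    # turn_barrel_freq=0.70 is quoted in the post; the rest are placeholders.
    "LAG": Archetype("LAG", 0.55, 0.70, 0.30, 0.15),
    # All Rock and Calling Station values are placeholders.
    "Rock": Archetype("Rock", 0.75, 0.20, 0.02, 0.05),
    "Calling Station": Archetype("Calling Station", 0.80, 0.15, 0.03, 0.10),
    # raise_equity=0.45 is quoted in the post; the rest are placeholders.
    "Maniac": Archetype("Maniac", 0.45, 0.80, 0.45, 0.30),
}

print(ARCHETYPES["TAG"].raise_equity)     # 0.65
print(ARCHETYPES["Maniac"].raise_equity)  # 0.45
```

Keeping each archetype as plain data means the decision pipeline can stay identical for every seat, with only the numbers changing.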
Every time it's an agent's turn to act, it goes through the same decision pipeline — but the outcome differs based on its archetype's parameters.
The agent calculates its current win probability against the remaining active players. Preflop, it looks up the result from our GPU-generated table of 983 million Monte Carlo simulations. On the flop, it runs a fast 5,000-sample Monte Carlo. On the turn and river, it uses exact enumeration over every combination of unseen cards. Against a single opponent, that is 45,540 combinations on the turn (46 possible river cards times 990 possible opposing hands) and 990 on the river.
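The enumeration sizes quoted above can be checked with a few lines of combinatorics. The sketch below assumes a single opponent and counts every pairing of remaining board cards with opposing hole cards; the function name is ours, not the app's.

```python
from math import comb

DECK_SIZE = 52


def heads_up_enumeration_size(board_cards: int) -> int:
    """Count (board runout, opponent hole cards) outcomes against one opponent."""
    unseen = DECK_SIZE - 2 - board_cards        # cards the acting player cannot see
    to_come = 5 - board_cards                   # community cards still to be dealt
    runouts = comb(unseen, to_come)             # ways to complete the board
    opponent_hands = comb(unseen - to_come, 2)  # opposing hole cards from what is left
    return runouts * opponent_hands


print(heads_up_enumeration_size(4))  # turn:  46 * 990 = 45540
print(heads_up_enumeration_size(5))  # river: 1 * 990 = 990
print(heads_up_enumeration_size(3))  # flop:  1081 * 990 = 1070190
```

The flop figure, over a million outcomes for one opponent alone, suggests why sampling is the cheaper choice on that street while exact enumeration is affordable on the turn and river.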
The agent considers pot odds, stack-to-pot ratio (SPR), position, whether it was the pre-flop aggressor, and whether the board is "wet" (draw-heavy) or "dry" (static). A low SPR signals a committed pot, where a different set of thresholds applies.
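Two of these inputs are simple ratios. The following sketch uses the standard definitions of pot odds and SPR; the function names are illustrative, and `pot` is assumed to already include the bet being faced.

```python
def pot_odds(pot: float, to_call: float) -> float:
    """Break-even equity for a call: the call amount over the pot after calling."""
    if to_call <= 0:
        return 0.0  # checking is free, so no equity is needed to continue
    return to_call / (pot + to_call)


def stack_to_pot_ratio(effective_stack: float, pot: float) -> float:
    """Effective remaining stack divided by the current pot."""
    if pot <= 0:
        raise ValueError("pot must be positive")
    return effective_stack / pot


# Pot was 100, opponent bets 50, so the pot is now 150 and it costs 50 to call.
print(pot_odds(150, 50))             # 0.25
print(stack_to_pot_ratio(300, 150))  # 2.0
```

In this example a call is profitable in the long run whenever the agent's equity exceeds 25%, and an SPR of 2 means the remaining stacks are only twice the pot, which is the kind of situation the text describes as committed.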
The equity threshold required to bet, call, or raise is adjusted by the archetype's parameters. A TAG requires ≥65% equity to raise; a Maniac raises with 45%. A Rock barely bluffs; a LAG barrels the turn 70% of the time. The archetype also adds random sizing jitter — so bets from the same agent at the same equity level won't always be the same size, mimicking real human inconsistency.
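As a rough illustration of this step, the sketch below applies an archetype's raise threshold and then adds sizing jitter. The two thresholds are the ones quoted above; the fold/call rule, the function names, and the jitter range are assumptions made for the example, not the app's actual logic.

```python
import random

RAISE_EQUITY = {"TAG": 0.65, "Maniac": 0.45}  # thresholds quoted in the post


def choose_action(equity: float, required_equity: float, raise_equity: float) -> str:
    """Raise above the archetype threshold, call when pot odds are met, else fold."""
    if equity >= raise_equity:
        return "raise"
    if equity >= required_equity:
        return "call"
    return "fold"


def jittered_bet(pot: float, base_fraction: float, jitter: float,
                 rng: random.Random) -> float:
    """Scale a pot-fraction bet by a random factor in [1 - jitter, 1 + jitter]."""
    factor = 1.0 + rng.uniform(-jitter, jitter)
    return round(pot * base_fraction * factor, 2)


# Same hand, same spot: 50% equity facing a bet that needs 25% to call.
print(choose_action(0.50, 0.25, RAISE_EQUITY["TAG"]))     # call
print(choose_action(0.50, 0.25, RAISE_EQUITY["Maniac"]))  # raise

rng = random.Random(7)  # seeded only to make the example repeatable
bets = [jittered_bet(100, 0.66, 0.10, rng) for _ in range(3)]
print(bets)  # three different sizes between 59.4 and 72.6
```

The same equity produces a call from the TAG and a raise from the Maniac, which is the behavioral difference the parameters are meant to create.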
Each agent has a memory of the current hand. It tracks how opponents have bet and whether they've shown aggression, and it uses that information on later streets. An agent that saw heavy pre-flop action will adjust its post-flop aggression accordingly.
The result is an opponent that folds weak hands, value-bets strong ones, continuation-bets the flop with appropriate frequency, and occasionally bluffs the river — all proportional to its personality type. Playing against a table of Maniacs feels chaotic and hard to read. Playing against a table of Rocks feels tight and controlled. Most sessions land somewhere in between.
Beyond the five heuristic archetypes, Poker0dds includes a sixth agent powered by a small neural network trained through self-play. This is the "Trained Agent" — and it plays differently from its rule-based peers.
The network architecture is deliberately compact:
Twelve input features describe the current game state — hole card strength, position, pot odds, stack depth, community card texture, number of active players, and the agent's observed history at the table. The network outputs five scores corresponding to the available actions: Fold, Check, Call, Raise, and Big Raise.
With only 3,077 trainable parameters, the model is tiny by any AI standard — it takes microseconds to run on an iPhone and requires about 12KB of storage. But it doesn't need to be large. The 12 input features already distill complex game state into meaningful signals. The network's job is to learn the mapping from those signals to good actions, and a shallow network is sufficient for that.
The model's compactness is a feature, not a limitation. Fast inference means no perceptible lag between player action and AI response, even when all seven seats need to decide in quick succession.
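One fully connected layout that reproduces the stated figures is 12 → 64 → 32 → 5. Treat that layout as our inference from the parameter count, not a confirmed description of the shipped model. The sketch below counts its weights and biases and runs a forward pass with random weights; the ReLU activation, the weight initialization, and the function names are likewise assumptions.

```python
import random

ACTIONS = ["Fold", "Check", "Call", "Raise", "Big Raise"]
LAYER_SIZES = [12, 64, 32, 5]  # assumed layout: 12 inputs, two hidden layers, 5 outputs


def dense_param_count(sizes):
    """Weights plus biases for a stack of fully connected layers."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))


def init_layers(sizes, rng):
    """Random small weights and zero biases, standing in for trained values."""
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers


def forward(layers, features):
    """Dense layers with ReLU on the hidden layers and raw scores at the output."""
    x = list(features)
    for index, (weights, biases) in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
        if index < len(layers) - 1:
            x = [max(0.0, v) for v in x]
    return x


def pick_action(scores):
    """Return the action whose score is highest."""
    return ACTIONS[max(range(len(scores)), key=scores.__getitem__)]


print(dense_param_count(LAYER_SIZES))      # 3077
print(dense_param_count(LAYER_SIZES) * 4)  # 12308 bytes if stored as 32-bit floats

scores = forward(init_layers(LAYER_SIZES, random.Random(0)), [0.5] * 12)
print(pick_action(scores))
```

At four bytes per parameter the total comes to 12,308 bytes, which lines up with the roughly 12 KB quoted above.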
One of the design goals was to make the app feel progressively harder without making early levels frustrating or late levels impossible. The roster composition changes as you advance:
| Level Range | TAG / LAG Seats | Neural Net Seats | Random Heuristic |
|---|---|---|---|
| Levels 1–4 | 0 | 0 | 7 of 7 |
| Levels 5–14 | 1 guaranteed | 1 guaranteed | 5 of 7 |
| Levels 15–20 | 2 guaranteed | 2 guaranteed | 3 of 7 |
The roster is shuffled before each session so the stronger agents don't always occupy the same seat. From level 15 on, four of the seven seats (more than half the table) play at a higher level of competency, so you can no longer cruise through by exploiting passive or random opponents.
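The table above translates into a short roster builder. This is a sketch under assumptions: the function and label names are ours, and we assume the "Random Heuristic" seats are drawn uniformly from all five archetypes.

```python
import random

HEURISTIC_ARCHETYPES = ["TAG", "LAG", "Rock", "Calling Station", "Maniac"]
OPPONENT_SEATS = 7


def build_roster(level: int, rng: random.Random) -> list:
    """Return the seven opponent seats for a level, in shuffled order."""
    if 1 <= level <= 4:
        strong, neural = 0, 0
    elif 5 <= level <= 14:
        strong, neural = 1, 1
    elif 15 <= level <= 20:
        strong, neural = 2, 2
    else:
        raise ValueError("level must be between 1 and 20")

    roster = [rng.choice(["TAG", "LAG"]) for _ in range(strong)]  # guaranteed seats
    roster += ["Trained Agent"] * neural                          # neural net seats
    while len(roster) < OPPONENT_SEATS:                           # random heuristic seats
        roster.append(rng.choice(HEURISTIC_ARCHETYPES))
    rng.shuffle(roster)  # so strong agents do not always sit in the same place
    return roster


print(build_roster(15, random.Random()))
```

Because the random seats can also land on TAG or LAG, the guaranteed counts are a floor rather than an exact number of strong heuristic opponents.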
Each table also has a themed identity drawn from 20 categories — Physicists, Chess Masters, Poker Legends, Ancient Leaders, Nobel Laureates — with seven historically famous names randomly drawn from a pool of ten per theme. It's a small touch, but it gives each table a distinct personality beyond just the AI behavior.
The AI agents and the player share the same odds infrastructure — which means the win percentages displayed on screen are the same numbers the agents use to make their decisions. This was an intentional design choice: it keeps the game honest, and it means watching how an agent bets relative to its equity is itself an educational experience.
The preflop lookup table was generated by running 983 million Monte Carlo simulations per hole card matchup across all player counts from 2 to 9. This data lives in a compact binary file loaded at startup. A lookup takes nanoseconds, and because the values are Monte Carlo estimates rather than exact enumerations, they carry a sampling error that is negligible at this sample size for every pre-flop situation the game can produce.
Post-flop accuracy scales with what's known. The flop runs 5,000 Monte Carlo samples (fast, good enough). Against a single opponent, the turn uses exact enumeration over all 45,540 combinations of river card and opposing hole cards (46 × 990). The river enumerates all 990 possible opposing hands exactly, so by the time the last community card is dealt, every win percentage displayed is exact rather than estimated.
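To put a number on "good enough", the standard error of a sampled win rate can be estimated with the usual binomial formula. The sketch assumes independent samples and treats each one as a simple win-or-lose outcome, which is a simplification since split pots exist.

```python
from math import sqrt


def standard_error(win_rate: float, samples: int) -> float:
    """Standard error of a proportion estimated from independent samples."""
    return sqrt(win_rate * (1.0 - win_rate) / samples)


worst_case = standard_error(0.5, 5000)  # 50% equity gives the largest error
print(f"{worst_case:.4f}")              # 0.0071
print(f"{1.96 * worst_case:.4f}")       # 0.0139
```

In other words, a flop estimate from 5,000 samples should usually land within about 1.4 percentage points of the true equity, which is below what most players would notice on screen.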
The current neural network was trained from scratch with a limited dataset. Future iterations could incorporate deeper self-play training loops, larger networks for the highest table tiers, or even opponent modeling — where the trained agent adapts its strategy based on observed tendencies at the specific table it's playing.
The multi-agent framework is also the foundation for the multiplayer feature we're exploring — where two human players share the same table and the AI fills the remaining seats, bringing the same autonomous decision-making to a social context.
Poker0dds is available now on the App Store. If you're curious about the odds engine, the agent architecture, or anything else covered here — reach out.