I'm going to break down the essential math you need for trading on Polymarket. I'll also share the exact roadmap and resources that helped me personally.
Let's get straight to it
A recent research paper just exposed the reality. Sophisticated traders extracted $40 million in guaranteed arbitrage profits from Polymarket in one year. The top trader alone made $2,009,631.76. These aren't lucky gamblers. They're running Bregman projections, Frank-Wolfe algorithms, and solving optimization problems that would make most computer science PhDs uncomfortable.
Bookmark This -I’m Roan, a backend developer working on system design, HFT-style execution, and quantitative trading systems. My work focuses on how prediction markets actually behave under load.
When you see a market where YES is $0.62 and NO is $0.33, you think “that adds up to $0.95, there’s arbitrage.” You’re right. What most people never realize is that while they’re manually checking whether YES plus NO equals $1, quantitative systems are solving integer programs that scan 17,218 conditions across 2^63 possible outcomes in milliseconds. By the time a human places both orders, the spread is gone. The systems have already found the same violation across dozens of correlated markets, calculated optimal position sizes accounting for order book depth and fees, executed parallel non-atomic trades, and rotated capital into the next opportunity.
The difference isn't just speed. It's mathematical infrastructure.
By the end of this article, you will understand the exact optimization frameworks that extracted $40 million from Polymarket. You'll know why simple addition fails, how integer programming compresses exponential search spaces, and what Bregman divergence actually means for pricing efficiency. More importantly, you'll see the specific code patterns and algorithmic strategies that separate hobby projects from production systems running millions in capital.Note: This isn’t a skim. If you’re serious about building systems that can scale to seven figures, read it end to end. If you’re here for quick wins or vibe coding, this isn’t for you.
Part I: The Marginal Polytope Problem (Why Simple Math Fails)
The Reality of Multi-Condition Markets
Single condition market: "Will Trump win Pennsylvania?"
YES: $0.48
NO: $0.52
Sum: $1.00
Looks perfect. No arbitrage, right?
Wrong.
Now add another market: "Will Republicans win Pennsylvania by 5+ points?"
YES: $0.32
NO: $0.68
Still both sum to $1. Still looks fine.
But there's a logical dependency. If Republicans win by 5+ points, Trump must win Pennsylvania. These markets aren't independent. And that creates arbitrage.The Mathematical Framework
For any market with n conditions, there are 2^n possible price combinations. But only n valid outcomes because exactly one condition must resolve to TRUE.
Define the set of valid payoff vectors:
Z = {φ(ω) : ω ∈ Ω}
Where φ(ω) is a binary vector showing which condition is TRUE in outcome ω.
The marginal polytope is the convex hull of these valid vectors:
M = conv(Z)
Arbitrage-free prices must lie in M. Anything outside M is exploitable.
For the Pennsylvania example:
Market A has 2 conditions, 2 valid outcomes
Market B has 2 conditions, 2 valid outcomes
Combined naive check: 2 × 2 = 4 possible outcomes
Actual valid outcomes: 3 (dependency eliminates one)
When prices assume 4 independent outcomes but only 3 exist, the mispricing creates guaranteed profit.
Why Brute Force Dies
NCAA 2010 tournament market had:
63 games (win/loss each)
2^63 = 9,223,372,036,854,775,808 possible outcomes
5,000+ securities
Checking every combination is computationally impossible.
The research paper found 1,576 potentially dependent market pairs in the 2024 US election alone. Naive pairwise verification would require checking 2^(n+m) combinations for each pair.
At just 10 conditions per market, that's 2^20 = 1,048,576 checks per pair. Multiply by 1,576 pairs. Your laptop will still be computing when the election results are already known.
The Integer Programming Solution
Instead of enumerating outcomes, describe the valid set with linear constraints.
Z = {z ∈ {0,1}^I : A^T × z ≥ b}
Real example from Duke vs Cornell market:
Each team has 7 securities (0 to 6 wins). That's 14 conditions, 2^14 = 16,384 possible combinations.
But they can't both win 5+ games because they'd meet in the semifinals.
Integer programming constraints:
Sum of z(duke, 0 to 6) = 1 Sum of z(cornell, 0 to 6) = 1 z(duke,5) + z(duke,6) + z(cornell,5) + z(cornell,6) ≤ 1
Three linear constraints replace 16,384 brute force checks.
This is how quantitative systems handle exponential complexity. They don't enumerate. They constrain.
Detection Results from Real Data
The research team analyzed markets from April 2024 to April 2025:
17,218 total conditions examined
7,051 conditions showed single-market arbitrage (41%)
Median mispricing: $0.60 per dollar (should be $1.00)
13 confirmed dependent market pairs with exploitable arbitrage
The median mispricing of $0.60 means markets were regularly wrong by 40%. Not close to efficient. Massively exploitable.
Key takeaway: Arbitrage detection isn't about checking if numbers add up. It's about solving constraint satisfaction problems over exponentially large outcome spaces using compact linear representations.
Part II: Bregman Projection (How to Actually Remove Arbitrage)
Finding arbitrage is one problem. Calculating the optimal exploiting trade is another.
You can't just "fix" prices by averaging or nudging numbers. You need to project the current market state onto the arbitrage-free manifold while preserving the information structure.
Why Standard Distance Fails
Euclidean projection would minimize:
||μ - θ||^2
This treats all price movements equally. But markets use cost functions. A price move from $0.50 to $0.60 has different information content than a move from $0.05 to $0.15, even though both are 10 cent changes.
Market makers use logarithmic cost functions (LMSR) where prices represent implied probabilities. The right distance metric must respect this structure.
The Bregman Divergence
For any convex function R with gradient ∇R, the Bregman divergence is:
D(μ||θ) = R(μ) + C(θ) - θ·μ
Where:
R(μ) is the convex conjugate of the cost function C
θ is the current market state
μ is the target price vector
C(θ) is the market maker's cost function
For LMSR, R(μ) is negative entropy:
R(μ) = Sum of μ_i × ln(μ_i)
This makes D(μ||θ) the Kullback-Leibler divergence, measuring information-theoretic distance between probability distributions.
The Arbitrage Profit Formula
The maximum guaranteed profit from any trade equals:
max over all trades δ of [min over outcomes ω of (δ·φ(ω) - C(θ+δ) + C(θ))] = D(μ||θ)*
Where μ* is the Bregman projection of θ onto M.
This is not obvious. The proof requires convex duality theory. But the implication is clear: finding the optimal arbitrage trade is equivalent to computing the Bregman projection.
Real Numbers
The top arbitrageur extracted $2,009,631.76 over one year. Their strategy was solving this optimization problem faster and more accurately than everyone else:
μ = argmin over μ in M of D(μ||θ)*
Every profitable trade was finding μ* before prices moved.
Why This Matters for Execution
When you detect arbitrage, you need to know:
What positions to take (which conditions to buy/sell)
What size (accounting for order book depth)
What profit to expect (accounting for execution risk)
Bregman projection gives you all three.
The projection μ* tells you the arbitrage-free price vector. The divergence D(μ*||θ) tells you the maximum extractable profit. The gradient ∇D tells you the trading direction.
Without this framework, you're guessing. With it, you're optimizing.
Key takeaway: Arbitrage isn't about spotting mispriced assets. It's about solving constrained convex optimization problems in spaces defined by market microstructure. The math determines profitability. You can't just "fix" prices by averaging or nudging numbers. You need to project the current market state onto the arbitrage-free manifold while preserving the information structure.
Part III: The Frank-Wolfe Algorithm (Making It Computationally Tractable)
Computing the Bregman projection directly is intractable. The marginal polytope M has exponentially many vertices.
Standard convex optimization requires access to the full constraint set. For prediction markets, that means enumerating every valid outcome. Impossible at scale.
The Frank-Wolfe algorithm solves this by reducing projection to a sequence of linear programs.
The Core Insight
Instead of optimizing over all of M at once, Frank-Wolfe builds it iteratively.
Algorithm:
1. Start with a small set of known vertices Z_0 2. For iteration t: a. Solve convex optimization over conv(Z_{t-1}) μ_t = argmin over μ in conv(Z_{t-1}) of F(μ) b. Find new descent vertex by solving IP: z_t = argmin over z in Z of ∇F(μ_t)·z c. Add to active set: Z_t = Z_{t-1} ∪ {z_t} d. Compute convergence gap: g(μ_t) = ∇F(μ_t)·(μ_t - z_t) e. Stop if g(μ_t) ≤ ε
The active set Z_t grows by one vertex per iteration. Even after 100 iterations, you're only tracking 100 vertices instead of 2^63.
The Integer Programming Oracle
Step 2b is the expensive part. Each iteration requires solving:
min over z in Z of c·z
Where c = ∇F(μ_t) is the current gradient and Z is the set of valid payoff vectors defined by integer constraints.
This is an integer linear program. NP-hard in general. But modern IP solvers like Gurobi handle these efficiently for well-structured problems.
The research team used Gurobi 5.5. Typical solve times:
Early iterations (small partial outcomes): under 1 second
Mid-tournament (30-40 games settled): 10-30 seconds
Late tournament (50+ games settled): under 5 seconds
Why does it get faster later? Because as outcomes settle, the feasible set shrinks. Fewer variables, tighter constraints, faster solves.
The Controlled Growth Problem
Standard Frank-Wolfe assumes the gradient ∇F is Lipschitz continuous with bounded constant.
For LMSR, ∇R(μ) = ln(μ) + 1. As μ approaches 0, the gradient explodes to negative infinity.
This violates standard convergence proofs.
The solution is Barrier Frank-Wolfe. Instead of optimizing over M, optimize over a contracted polytope:
M' = (1-ε)M + εu
Where u is an interior point with all coordinates strictly between 0 and 1, and ε in (0,1) is the contraction parameter.
For any ε greater than 0, the gradient is bounded on M'. The Lipschitz constant is O(1/ε).
The algorithm adaptively decreases ε as iterations progress:
If g(μ_t) / (-4g_u) < ε_{t-1}: ε_t = min{g(μ_t)/(-4g_u), ε_{t-1}/2} Else: ε_t = ε_{t-1}
This ensures ε goes to 0 asymptotically, so the contracted problem converges to the true projection.
Convergence Rate
Frank-Wolfe converges at rate O(L × diam(M) / t) where L is the Lipschitz constant and diam(M) is the diameter of M.
For LMSR with adaptive contraction, this becomes O(1/(ε×t)). As ε shrinks adaptively, convergence slows but remains polynomial.
The research showed that in practice, 50 to 150 iterations were sufficient for convergence on markets with thousands of conditions.
Production Performance
From the paper: "Once projections become practically fast, FWMM achieves superior accuracy to LCMM."
Timeline:
First 16 games: LCMM and FWMM perform similarly (IP solver too slow)
After 45 games settled: First successful 30-minute projection completes
Remaining tournament: FWMM outperforms LCMM by 38% median improvement on security prices
The crossover point is when the outcome space shrinks enough for IP solves to complete within trading timeframes.
Key takeaway: Theoretical elegance means nothing without computational tractability. Frank-Wolfe with integer programming oracles makes Bregman projection practical on markets with trillions of outcomes. This is how $40 million in arbitrage actually got computed and executed.
Part IV: Execution Under Non-Atomic Constraints (Why Order Books Change Everything)
You've detected arbitrage. You've computed the optimal trade via Bregman projection. Now you need to execute.This is where most strategies fail.
The Non-Atomic Problem
Polymarket uses a Central Limit Order Book (CLOB). Unlike decentralized exchanges where arbitrage can be atomic (all trades succeed or all fail), CLOB execution is sequential.
Your arbitrage plan:
Buy YES at $0.30 Buy NO at $0.30 Total cost: $0.60 Guaranteed payout: $1.00 Expected profit: $0.40
Reality:
Submit YES order → Fills at $0.30 ✓ Price updates due to your order Submit NO order → Fills at $0.78 ✗ Total cost: $1.08 Payout: $1.00 Actual result: -$0.08 loss
One leg fills. The other doesn't. You're exposed.
This is why the research paper only counted opportunities with at least $0.05 profit margin. Smaller edges get eaten by execution risk.
Volume-Weighted Average Price (VWAP) Analysis
Instead of assuming instant fills at quoted prices, calculate expected execution price:
VWAP = Sum of (price_i × volume_i) / Sum of (volume_i)
The research methodology:
For each block on Polygon (approximately 2 seconds): Calculate VWAP_yes from all YES trades in that block Calculate VWAP_no from all NO trades in that block If abs(VWAP_yes + VWAP_no - 1.0) > 0.02: Record arbitrage opportunity Profit = abs(VWAP_yes + VWAP_no - 1.0)
Blocks are the atomic time unit. Analyzing per-block VWAP captures the actual achievable prices, not the fantasy of instant execution.
The Liquidity Constraint
Even if prices are mispriced, you can only capture profit up to available liquidity.
Real example from the data:
Market shows arbitrage: sum of YES prices = $0.85
Potential profit: $0.15 per dollar
Order book depth at these prices: $234 total volume
Maximum extractable profit: $234 × 0.15 = $35.10
The research calculated maximum profit per opportunity as:
profit = (price deviation) × min(volume across all required positions)
For multi-condition markets, you need liquidity in ALL positions simultaneously. The minimum determines your cap.
Time Window Analysis
The research used a 950-block window (approximately 1 hour) to group related trades.
Why 1 hour? Because 75% of matched orders on Polymarket fill within this timeframe. Orders submitted, matched, and executed on-chain typically complete within 60 minutes.
For each trader address, all bids within a 950-block window were grouped as a single strategy execution. Profit was calculated as the guaranteed minimum payout across all possible outcomes minus total cost.
Execution Success Rate
Of the detected arbitrage opportunities:
Single condition arbitrage: 41% of conditions had opportunities, most were exploited
Market rebalancing: 42% of multi-condition markets had opportunities
Combinatorial arbitrage: 13 valid pairs identified, 5 showed execution
The gap between detection and execution is execution risk.
Latency Layers: The Speed Hierarchy
Retail trader execution:
Polymarket API call: ~50ms Matching engine: ~100ms Polygon block time: ~2,000ms Block propagation: ~500ms Total: ~2,650ms
Sophisticated arbitrage system:
WebSocket price feed: <5ms (real-time push) Decision computation: <10ms (pre-calculated) Direct RPC submission: ~15ms (bypass API) Parallel execution: ~10ms (all legs at once) Polygon block inclusion: ~2,000ms (unavoidable) Total: ~2,040ms
The 20-30ms you see on-chain is decision-to-mempool time. Fast wallets submit all positions within 30ms, eliminating sequential execution risk by confirming everything in the same block.
The compounding advantage:
By the time you see their transaction confirmed on-chain (Block N), they detected the opportunity 2+ seconds earlier (Block N-1), submitted all legs in 30ms, and the market already rebalanced. When you copy at Block N+1, you're 4 seconds behind a sub-second opportunity.
Why Copytrading Fast Wallets Fails
What actually happens:Block N-1: Fast system detects mispricing, submits 4 transactions in 30ms Block N: All transactions confirm, arbitrage captured, you see this Block N+1: You copy their trade, but price is now $0.78 (was $0.30)
You're not arbitraging. You're providing exit liquidity.
Order book depth kills you:
Fast wallet buys 50,000 tokens:
VWAP: $0.322 across multiple price levels
Market moves
You buy 5,000 tokens after:
VWAP: $0.344 (market already shifted)
They paid $0.322, you paid $0.344
Their 10 cent edge became your 2.2 cent loss
The Capital Efficiency Problem
Top arbitrageur operated with $500K+ capital. With $5K capital, the same strategy breaks because:
Slippage eats larger percentage of smaller positions
Cannot diversify across enough opportunities
Single failed execution wipes out days of profit
Fixed costs (gas) consume more of profit margin
Gas fees on 4-leg strategy: ~$0.02
$0.08 profit → 25% goes to gas
$0.03 profit → 67% goes to gas
This is why $0.05 minimum threshold exists.
Real Execution Data
Single condition arbitrage:
Detected: 7,051 conditions
Executed: 87% success rate
Failed due to: liquidity (48%), price movement (31%), competition (21%)
Combinatorial arbitrage:
Detected: 13 pairs
Executed: 45% success rate
Failed due to: insufficient simultaneous liquidity (71%), speed competition (18%)
Key takeaway: Mathematical correctness is necessary but not sufficient. Execution speed, order book depth, and non-atomic fill risk determine actual profitability. The research showed $40 million extracted because sophisticated actors solved execution problems, not just math problems.
Part V: The Complete System (What Actually Got Deployed)
Theory is clean. Production is messy. Here's what a working arbitrage system actually looks like based on the research findings and practical requirements.
The Data Pipeline
Real-time requirements:
WebSocket connection to Polymarket CLOB API └─ Order book updates (price/volume changes) └─ Trade execution feed (fills happening) └─ Market creation/settlement events Historical analysis: Alchemy Polygon node API └─ Query events from contract 0x4D97DCd97eC945f40cF65F87097ACe5EA0476045 └─ OrderFilled events (trades executed) └─ PositionSplit events (new tokens minted) └─ PositionsMerge events (tokens burned)
The research analyzed 86 million transactions. That volume requires infrastructure, not scripts.
The Dependency Detection Layer
For 305 US election markets, there are 46,360 possible pairs to check.
Manual analysis is impossible. The research used DeepSeek-R1-Distill-Qwen-32B with prompt engineering:
Input: Two markets with their condition descriptions Output: JSON of valid outcome combinations Validation checks: 1. Does each market have exactly one TRUE condition per outcome? 2. Are there fewer valid combinations than n × m (dependency exists)? 3. Do dependent subsets satisfy arbitrage conditions? Results on election markets: 40,057 independent pairs (no arbitrage possible) 1,576 dependent pairs (potential arbitrage) 374 satisfied strict combinatorial conditions 13 manually verified as exploitable
81.45% accuracy on complex multi-condition markets. Good enough for filtering. Requires manual verification for execution.
The Optimization Engine
Three-layer arbitrage removal:
Layer 1: Simple LCMM constraints Fast linear programming relaxations. Check basic constraints like "sum of probabilities equals 1" and "if A implies B, then P(A) cannot exceed P(B)."
Runs in milliseconds. Removes obvious mispricing.
Layer 2: Integer programming projection Frank-Wolfe algorithm with Gurobi IP solver.
Parameters from research:
Alpha = 0.9 (extract at least 90% of available arbitrage)
Initial epsilon = 0.1 (10% contraction)
Convergence threshold = 1e-6
Time limit = 30 minutes (reduced as markets shrink)
Typical iterations: 50 to 150. Typical solve time per iteration: 1 to 30 seconds depending on market size.
Layer 3: Execution validation Before submitting orders, simulate fills against current order book.
Check:
Is liquidity sufficient at these prices?
What is expected slippage?
What is guaranteed profit after slippage?
Does profit exceed minimum threshold (research used $0.05)?
Only execute if all checks pass.
Position Sizing Logic
Modified Kelly criterion accounting for execution risk:
f = (b×p - q) / b × sqrt(p)*
Where:
b = arbitrage profit percentage
p = probability of full execution (estimated from order book depth)
q = 1 - p
Cap at 50% of order book depth to avoid moving the market.
The Monitoring Dashboard
Track in real-time:
Opportunities detected per minute Opportunities executed per minute Execution success rate Total profit (running sum) Current drawdown percentage Average latency (detection to submission) Alerts: Drawdown exceeds 15% Execution rate drops below 30% IP solver timeouts increase Order fill failures spike
The research identified the top arbitrageur made 4,049 transactions. That's approximately 11 trades per day over one year. Not high-frequency in the traditional sense, but systematic and consistent.
The Actual Results
Total extracted April 2024 to April 2025:
Single condition arbitrage: Buy both < $1: $5,899,287 Sell both > $1: $4,682,075 Subtotal: $10,581,362 Market rebalancing: Buy all YES < $1: $11,092,286 Sell all YES > $1: $612,189 Buy all NO: $17,307,114 Subtotal: $29,011,589 Combinatorial arbitrage: Cross-market execution: $95,634 Total: $39,688,585
Top 10 extractors took $8,127,849 (20.5% of total).
Top single extractor: $2,009,632 from 4,049 trades.
Average profit per trade for top player: $496.
Not lottery wins. Not lucky timing. Mathematical precision executed systematically.
What Separates Winners from Losers
The research makes it clear:
Retail approach:
Check prices every 30 seconds
See if YES + NO roughly equals $1
Maybe use a spreadsheet
Manual order submission
Hope for the best
Quantitative approach:
Real-time WebSocket feeds
Integer programming for dependency detection
Frank-Wolfe with Bregman projection for optimal trades
Parallel order execution with VWAP estimation
Systematic position sizing under execution constraints
2.65 second latency vs. 30 second polling
One group extracted $40 million. The other group provided the liquidity.
Key takeaway: Production systems require mathematical rigor AND engineering sophistication. Optimization theory, distributed systems, real-time data processing, risk management, execution algorithms. All of it. The math is the foundation. The infrastructure is what makes it profitable.
The Final Reality
While traders were reading "10 tips for prediction markets," quantitative systems were:
Solving integer programs to detect dependencies across 17,218 conditions
Computing Bregman projections to find optimal arbitrage trades
Running Frank-Wolfe algorithms with controlled gradient growth
Executing parallel orders with VWAP-based slippage estimation
Systematically extracting $40 million in guaranteed profits
The difference is not luck. It's mathematical infrastructure. The research paper is public. The algorithms are known. The profits are real.
The question is: can you build it before the next $40 million is extracted?
Resources:
Research paper: "Unravelling the Probabilistic Forest: Arbitrage in Prediction Markets" (arXiv:2508.03474v1)
Theory foundation: "Arbitrage-Free Combinatorial Market Making via Integer Programming" (arXiv:1606.02825v2)
IP solver: Gurobi Optimizer
LLM for dependencies: DeepSeek-R1-Distill-Qwen-32B
Data source: Alchemy Polygon node API
The math works. The infrastructure exists. The only question is execution. Let me know below if you guys want Part 2 on this?