xAI's Grok models trail leaders such as GPT-5.5 Pro (87.8%) on Epoch AI's updated FrontierMath Tiers 1-4 benchmark, which tests research-level math problems and, after recent error fixes, contains 338 items in total. Older public Grok 4 evaluations showed scores around 14%, while competitors rely on heavy compute, Python tooling, and multi-agent setups for gains up to the mid-80s. No major Grok foundation-model release has occurred since the 4.x series, and xAI's June 2026 focus remains on agentic tools like Grok Build 0.1 rather than immediate math fine-tuning. With only 16 days until the June 30 resolution, traders see limited scope for a sudden leap unless an unannounced variant or inference tweak materializes, which keeps odds sensitive to any credible leak about internal progress.
Experimental AI-generated summary referencing Polymarket data. This is not trading advice and plays no role in how this market resolves.
$21,913 Vol.
25%+: 56%
30%+: 51%
40%+: 60%
50%+: 49%
This market will resolve according to Epoch AI's FrontierMath benchmarking leaderboard (https://epoch.ai/frontiermath) for Tier 1-3. Studies that are not included in the leaderboard (e.g. https://x.com/EpochAIResearch/status/1945905796904005720) will not be considered.
The primary resolution source will be information from Epoch AI; however, a consensus of credible reporting may also be used.
Market Opened: Jan 30, 2026, 12:01 AM ET
Resolver
0x65070BE91...



Beware of external links.
Frequently Asked Questions