Google's Gemini 3.1 Pro Preview has surged to the top of Humanity's Last Exam leaderboards with scores of roughly 45-47% in its "thinking high" modes, approaching the 48.4% reported for the earlier Gemini 3 Deep Think in February 2026 and outpacing OpenAI's GPT-5.x variants at 41-44%. This PhD-level benchmark spans 2,500 expert-written questions across math, the sciences, and the humanities, and is designed to test frontier large language model reasoning rather than memorization. Trader sentiment reflects Google's aggressive iteration amid fierce competition from Anthropic's Claude and OpenAI, with implied probabilities favoring further gains by June 30. Key catalyst ahead: Google I/O on May 19, where next-generation Gemini announcements could push scores to 50% or higher, though reported calibration errors point to a continuing risk of model overconfidence.
Experimental AI-generated summary based on Polymarket data. This is not trading advice and does not influence how this market resolves.
$310,437 Vol.
50%+: 52%
55%+: 13%
60%+: 8%
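The prices listed above are for nested thresholds (a score of 60%+ also satisfies 55%+ and 50%+), so the implied chance of landing in each band can be read off by subtraction. The following is a minimal sketch of that arithmetic, assuming each listed price can be treated as the implied probability that the top score meets or exceeds the threshold. The function name bucket_probabilities is hypothetical and not part of any Polymarket API, and because each outcome trades as its own market, real prices need not be perfectly consistent with one another.

```python
# Hypothetical helper for illustration; not part of any Polymarket API.
# Assumption: each price is the implied probability that the top
# leaderboard score is at or above the given threshold.

def bucket_probabilities(cumulative):
    """Convert nested 'at least X%' prices into per-band probabilities.

    cumulative: list of (threshold, price) pairs, price in [0, 1].
    Returns a dict mapping a band label to its implied probability.
    """
    ordered = sorted(cumulative)  # ascending by threshold
    buckets = {}

    # Everything not covered by the lowest threshold.
    first_threshold, first_price = ordered[0]
    buckets[f"below {first_threshold}%"] = round(1.0 - first_price, 4)

    # Each band is the difference between adjacent cumulative prices.
    for (low, p_low), (high, p_high) in zip(ordered, ordered[1:]):
        buckets[f"{low}% to under {high}%"] = round(p_low - p_high, 4)

    # The highest threshold is already an open-ended band.
    last_threshold, last_price = ordered[-1]
    buckets[f"{last_threshold}% or more"] = round(last_price, 4)
    return buckets


# Prices taken from the listing above.
prices = [(50, 0.52), (55, 0.13), (60, 0.08)]
result = bucket_probabilities(prices)
for band, probability in result.items():
    print(f"{band}: {probability:.0%}")
```

Under these assumptions, most of the 52% priced on the 50%+ outcome sits in the 50% to under 55% band, since only 13% is priced on 55%+.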
The resolution source will be the official Humanity’s Last Exam leaderboard https://scale.com/leaderboard/humanitys_last_exam.
Market opened: Jan 29, 2026, 12:50 PM ET
Resolver
0x65070BE91...
Frequently asked questions