Interactive Benchmark Platform  ·  Open to the Community

Interactive Hack

A community-driven hub for LLM evaluation, learning, and building. Contribute datasets, compete in arenas, and help push the frontier of AI benchmarking — together.

Explore Puzzles →
View Leaderboard →
🗳️ 6.2M+ Community Votes
🤖 32 Models Ranked
🧩 50+ Benchmark Datasets
12 Arena Events
👥 100+ Contributors

Platform

Everything in one place

★ Main Focus · Arena

Interactive Puzzles

Community-contributed datasets and puzzle challenges to evaluate the true capabilities of today's leading LLMs. Submit, benchmark, and explore results — all in one open arena.

Reasoning · Multi-step Logic Arena · 248 submissions
Coding · Code Synthesis Challenge · 182 submissions
Knowledge · Domain Knowledge Probe · 317 submissions
Math · Olympiad Math Gauntlet · 94 submissions
Explore all puzzles →
Rankings

Interactive Benchmark Leaderboard

1. GPT-4o (2025-05) · OpenAI · 1312
2. Claude 3.7 Sonnet · Anthropic · 1298
3. Gemini 2.0 Ultra · Google DeepMind · 1287
4. Grok-3 · xAI · 1271
5. Llama 3.3 405B · Meta AI · 1253
Learn

LLM Whiteboard Sessions

Deep-dive technical sessions unpacking the core concepts behind modern LLMs.

Next Session
Mar 14, 2026 · 90 min
Understanding RLHF: From Reward Models to Policy Optimization
Free Registration
4 recordings available →
View all sessions →
Stay Updated

InteractiveAGI Mesh Newsletter

🔬 Research Roundup · Top LLM papers, summarized weekly
🏆 Benchmark Results · Latest arena rankings & insights
🧩 Puzzle Spotlight · Featured community puzzle of the week
Weekly on Fridays · No spam · Free

Community

Join the Arena

100+ contributors building an open, community-driven LLM benchmark.

Discord · GitHub · Twitter / X · Forum
Join us →