SocraticEnv
OpenEnv Hackathon · Meta × PyTorch × Scaler
Model Leaderboard
Compare AI models on Socratic reasoning ability across all 3 tasks. Which model thinks best under pressure?
No entries yet. Seed with baseline scores to populate the leaderboard with known model performance.
Seed Baseline Data
Run a new model evaluation
Run Evaluation
Enter a model name and click Run to benchmark the current model against all 3 tasks.
Models evaluated: 0
Best overall score: -
Hardest task avg: -
Rank | Model | Easy | Medium | Hard | Overall | Progress
No models evaluated yet. Run an evaluation above to add the first entry.