LLM Coding Benchmarks
A Folder from Kurt
Intro
OpenAI
Anthropic
Google
DeepSeek
Meta
Llama 3.1
HumanEval
openai/human-eval: Code for the paper "Evaluating Large Language Models Trained on Code"
HumanEval Benchmark (Code Generation) | Papers With Code
openai/openai_humaneval · Datasets at Hugging Face
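For reference, HumanEval scores are reported with the unbiased pass@k estimator defined in the paper linked above: pass@k = 1 - C(n-c, k) / C(n, k), where n samples are drawn per task and c of them pass the unit tests. A minimal numpy sketch of that formula follows; the sample counts in the example are made up, not real results.

```python
import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k from the HumanEval paper:
    1 - C(n - c, k) / C(n, k), for n samples of which c passed."""
    if n - c < k:
        return 1.0  # every size-k subset contains at least one pass
    # Numerically stable product form of 1 - C(n-c, k) / C(n, k).
    return 1.0 - float(np.prod(1.0 - k / np.arange(n - c + 1, n + 1)))

# Toy numbers: 200 samples per task, 37 passing.
print(pass_at_k(n=200, c=37, k=1))   # ~0.185, i.e. c/n for k=1
print(pass_at_k(n=200, c=37, k=10))
```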
Agents
Debug like a Human
AgentCoder Code Generation
Copilot Workspace
BigCodeBench
bigcode-project/bigcodebench: BigCodeBench: The Next Generation of HumanEval
BigCodeBench: The Next Generation of HumanEval
BigCodeBench Leaderboard - a Hugging Face Space by bigcode
bigcode/bigcodebench · Datasets at Hugging Face
LLM API Price and Perf.xlsx
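To poke at the benchmark tasks locally, the Hugging Face dataset linked above can be loaded with the `datasets` library. A sketch under assumptions: the split name here is a guess at one of the release-versioned splits, so check the dataset card for the current one.

```python
from datasets import load_dataset

# Split names on bigcode/bigcodebench are versioned per release;
# "v0.1.2" is an assumption -- consult the dataset card.
ds = load_dataset("bigcode/bigcodebench", split="v0.1.2")

task = ds[0]
print(sorted(task.keys()))  # inspect which fields this release ships
print(task["task_id"])      # field name per the dataset card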
Legal Protection
DevDay Announcements
Anthropic Terms Dec 2023
Microsoft Copilot Commitment
Generative AI Protection
Meta won't release advanced AI in the EU due to stronger user data protections - UPI.com
DeepSeek User Agreement
Examples
ChatGPT
Claude
DeepSeek
Meta AI
Gemini
tf–idf - Wikipedia
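The tf–idf article above describes several weighting variants. A minimal sketch of one common form (relative term frequency times log inverse document frequency), with a toy corpus made up for illustration:

```python
import math

def tf_idf(term: str, doc: list[str], corpus: list[list[str]]) -> float:
    """Relative term frequency times log inverse document frequency,
    one of the standard variants listed in the Wikipedia article."""
    tf = doc.count(term) / len(doc)
    df = sum(1 for d in corpus if term in d)
    # Assumes the term appears somewhere in the corpus (df > 0).
    return tf * math.log(len(corpus) / df)

docs = [["llm", "coding", "benchmark"],
        ["coding", "agents", "coding"],
        ["benchmark", "leaderboard"]]
print(tf_idf("coding", docs[1], docs))  # 2/3 * ln(3/2) ≈ 0.27
```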