[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-huybery--Awesome-Code-LLM":3,"tool-huybery--Awesome-Code-LLM":64},[4,17,25,39,48,56],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":16},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",138956,2,"2026-04-05T11:33:21",[13,14,15],"开发框架","Agent","语言模型","ready",{"id":18,"name":19,"github_repo":20,"description_zh":21,"stars":22,"difficulty_score":10,"last_commit_at":23,"category_tags":24,"status":16},3704,"NextChat","ChatGPTNextWeb\u002FNextChat","NextChat 是一款轻量且极速的 AI 助手，旨在为用户提供流畅、跨平台的大模型交互体验。它完美解决了用户在多设备间切换时难以保持对话连续性，以及面对众多 AI 模型不知如何统一管理的痛点。无论是日常办公、学习辅助还是创意激发，NextChat 都能让用户随时随地通过网页、iOS、Android、Windows、MacOS 或 Linux 端无缝接入智能服务。\n\n这款工具非常适合普通用户、学生、职场人士以及需要私有化部署的企业团队使用。对于开发者而言，它也提供了便捷的自托管方案，支持一键部署到 Vercel 或 Zeabur 等平台。\n\nNextChat 的核心亮点在于其广泛的模型兼容性，原生支持 Claude、DeepSeek、GPT-4 及 Gemini Pro 等主流大模型，让用户在一个界面即可自由切换不同 AI 能力。此外，它还率先支持 MCP（Model Context Protocol）协议，增强了上下文处理能力。针对企业用户，NextChat 提供专业版解决方案，具备品牌定制、细粒度权限控制、内部知识库整合及安全审计等功能，满足公司对数据隐私和个性化管理的高标准要求。",87618,"2026-04-05T07:20:52",[13,15],{"id":26,"name":27,"github_repo":28,"description_zh":29,"stars":30,"difficulty_score":10,"last_commit_at":31,"category_tags":32,"status":16},2268,"ML-For-Beginners","microsoft\u002FML-For-Beginners","ML-For-Beginners 是由微软推出的一套系统化机器学习入门课程，旨在帮助零基础用户轻松掌握经典机器学习知识。这套课程将学习路径规划为 12 周，包含 26 节精炼课程和 52 
道配套测验，内容涵盖从基础概念到实际应用的完整流程，有效解决了初学者面对庞大知识体系时无从下手、缺乏结构化指导的痛点。\n\n无论是希望转型的开发者、需要补充算法背景的研究人员，还是对人工智能充满好奇的普通爱好者，都能从中受益。课程不仅提供了清晰的理论讲解，还强调动手实践，让用户在循序渐进中建立扎实的技能基础。其独特的亮点在于强大的多语言支持，通过自动化机制提供了包括简体中文在内的 50 多种语言版本，极大地降低了全球不同背景用户的学习门槛。此外，项目采用开源协作模式，社区活跃且内容持续更新，确保学习者能获取前沿且准确的技术资讯。如果你正寻找一条清晰、友好且专业的机器学习入门之路，ML-For-Beginners 将是理想的起点。",84991,"2026-04-05T10:45:23",[33,34,35,36,14,37,15,13,38],"图像","数据工具","视频","插件","其他","音频",{"id":40,"name":41,"github_repo":42,"description_zh":43,"stars":44,"difficulty_score":45,"last_commit_at":46,"category_tags":47,"status":16},3128,"ragflow","infiniflow\u002Fragflow","RAGFlow 是一款领先的开源检索增强生成（RAG）引擎，旨在为大语言模型构建更精准、可靠的上下文层。它巧妙地将前沿的 RAG 技术与智能体（Agent）能力相结合，不仅支持从各类文档中高效提取知识，还能让模型基于这些知识进行逻辑推理和任务执行。\n\n在大模型应用中，幻觉问题和知识滞后是常见痛点。RAGFlow 通过深度解析复杂文档结构（如表格、图表及混合排版），显著提升了信息检索的准确度，从而有效减少模型“胡编乱造”的现象，确保回答既有据可依又具备时效性。其内置的智能体机制更进一步，使系统不仅能回答问题，还能自主规划步骤解决复杂问题。\n\n这款工具特别适合开发者、企业技术团队以及 AI 研究人员使用。无论是希望快速搭建私有知识库问答系统，还是致力于探索大模型在垂直领域落地的创新者，都能从中受益。RAGFlow 提供了可视化的工作流编排界面和灵活的 API 接口，既降低了非算法背景用户的上手门槛，也满足了专业开发者对系统深度定制的需求。作为基于 Apache 2.0 协议开源的项目，它正成为连接通用大模型与行业专有知识之间的重要桥梁。",77062,3,"2026-04-04T04:44:48",[14,33,13,15,37],{"id":49,"name":50,"github_repo":51,"description_zh":52,"stars":53,"difficulty_score":45,"last_commit_at":54,"category_tags":55,"status":16},519,"PaddleOCR","PaddlePaddle\u002FPaddleOCR","PaddleOCR 是一款基于百度飞桨框架开发的高性能开源光学字符识别工具包。它的核心能力是将图片、PDF 等文档中的文字提取出来，转换成计算机可读取的结构化数据，让机器真正“看懂”图文内容。\n\n面对海量纸质或电子文档，PaddleOCR 解决了人工录入效率低、数字化成本高的问题。尤其在人工智能领域，它扮演着连接图像与大型语言模型（LLM）的桥梁角色，能将视觉信息直接转化为文本输入，助力智能问答、文档分析等应用场景落地。\n\nPaddleOCR 适合开发者、算法研究人员以及有文档自动化需求的普通用户。其技术优势十分明显：不仅支持全球 100 多种语言的识别，还能在 Windows、Linux、macOS 等多个系统上运行，并灵活适配 CPU、GPU、NPU 等各类硬件。作为一个轻量级且社区活跃的开源项目，PaddleOCR 既能满足快速集成的需求，也能支撑前沿的视觉语言研究，是处理文字识别任务的理想选择。",74913,"2026-04-05T10:44:17",[15,33,13,37],{"id":57,"name":58,"github_repo":59,"description_zh":60,"stars":61,"difficulty_score":45,"last_commit_at":62,"category_tags":63,"status":16},2181,"OpenHands","OpenHands\u002FOpenHands","OpenHands 是一个专注于 AI 
驱动开发的开源平台，旨在让智能体（Agent）像人类开发者一样理解、编写和调试代码。它解决了传统编程中重复性劳动多、环境配置复杂以及人机协作效率低等痛点，通过自动化流程显著提升开发速度。\n\n无论是希望提升编码效率的软件工程师、探索智能体技术的研究人员，还是需要快速原型验证的技术团队，都能从中受益。OpenHands 提供了灵活多样的使用方式：既可以通过命令行（CLI）或本地图形界面在个人电脑上轻松上手，体验类似 Devin 的流畅交互；也能利用其强大的 Python SDK 自定义智能体逻辑，甚至在云端大规模部署上千个智能体并行工作。\n\n其核心技术亮点在于模块化的软件智能体 SDK，这不仅构成了平台的引擎，还支持高度可组合的开发模式。此外，OpenHands 在 SWE-bench 基准测试中取得了 77.6% 的优异成绩，证明了其解决真实世界软件工程问题的能力。平台还具备完善的企业级功能，支持与 Slack、Jira 等工具集成，并提供细粒度的权限管理，适合从个人开发者到大型企业的各类用户场景。",70612,"2026-04-05T11:12:22",[15,14,13,36],{"id":65,"github_repo":66,"name":67,"description_en":68,"description_zh":69,"ai_summary_zh":69,"readme_en":70,"readme_zh":71,"quickstart_zh":72,"use_case_zh":73,"hero_image_url":74,"owner_login":75,"owner_name":76,"owner_avatar_url":77,"owner_bio":78,"owner_company":79,"owner_location":80,"owner_email":81,"owner_twitter":75,"owner_website":82,"owner_url":83,"languages":84,"stars":65,"forks":85,"last_commit_at":86,"license":87,"difficulty_score":10,"env_os":88,"env_gpu":88,"env_ram":88,"env_deps":89,"category_tags":92,"github_topics":93,"view_count":10,"oss_zip_url":84,"oss_zip_packed_at":84,"status":16,"created_at":97,"updated_at":98,"faqs":99,"releases":130},1285,"huybery\u002FAwesome-Code-LLM","Awesome-Code-LLM","👨‍💻 An awesome and curated list of best code-LLM for research.","Awesome-Code-LLM 是一个精心整理的开源代码大语言模型（Code LLM）资源列表，旨在为研究人员和开发者提供最前沿的代码生成模型、相关论文、评估工具以及实用资源。它帮助用户快速了解当前最先进的代码模型，并根据不同的需求选择合适的工具。\n\n这个项目解决了代码生成领域信息分散、难以追踪最新进展的问题，通过集中整理各类模型、研究和评测结果，降低了查找和比较的难度。无论是想寻找高性能的代码模型进行研究，还是希望在实际开发中使用优秀的代码生成工具，都可以在这里找到有价值的资源。\n\nAwesome-Code-LLM 适合研究人员、AI 开发者以及对代码生成技术感兴趣的开发者使用。它不仅提供了当前排名靠前的模型列表，还包含丰富的论文和评估工具，便于深入学习和实验。此外，项目支持社区贡献，用户可以提交 Pull Request 添加新的资源或改进内容，共同推动这一领域的发展。\n\n其独特亮点包括详细的模型排行榜、分类整理的论文资源以及活跃的社区参与机制，使用户能够全面掌握代码大语言模型的最新动态与研究成果。","\u003Cdiv align=\"center\">\n  \u003Ch1>👨‍💻 Awesome Code LLM\u003C\u002Fh1>\n  \u003Ca href=\"https:\u002F\u002Fawesome.re\">\n    \u003Cimg src=\"https:\u002F\u002Fawesome.re\u002Fbadge.svg\" alt=\"Awesome\">\n  \u003C\u002Fa>\n  \u003Ca 
href=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FPRs-Welcome-red\">\n    \u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FPRs-Welcome-red\" alt=\"PRs Welcome\">\n  \u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Flast-commit\u002Fhuybery\u002FAwesome-Code-LLM?color=green\">\n    \u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Flast-commit\u002Fhuybery\u002FAwesome-Code-LLM?color=green\" alt=\"Last Commit\">\n  \u003C\u002Fa>\n\u003C\u002Fdiv>\n\n![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fhuybery_Awesome-Code-LLM_readme_0210539050ff.png)\n\n&nbsp;\n\n## 🔆 How to Contribute\n\nContributions are welcome!\nIf you have any resources, tools, papers, or insights related to Code LLMs, feel free to submit a pull request.\nLet's work together to make this project better!\n\n&nbsp;\n\n## News\n\n- 🔥🔥🔥 **[2024-11-12]** [**Qwen2.5-Coder series**](https:\u002F\u002Fhuggingface.co\u002Fcollections\u002FQwen\u002Fqwen25-66e81a666513e518adb90d9e) are released, offering six model sizes (0.5B, 1.5B, 3B, 7B, 14B, 32B), with Qwen2.5-Coder-32B-Instruct now the most powerful open-source code model.\n- 🔥🔥 **[2024-11-08]** [OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models](https:\u002F\u002Farxiv.org\u002Fabs\u002F2411.04905) is released.\n\n&nbsp;\n\n## 🧵 Table of Contents\n\n- [🧵 Table of Contents](#-table-of-contents)\n- [🚀 Top Code LLMs](#-top-code-llms)\n- [💡 Evaluation Toolkit](#-evaluation-toolkit)\n- [🚀 Awesome Code LLMs Leaderboard](#-awesome-code-llms-leaderboard)\n- [📚 Awesome Code LLMs Papers](#-awesome-code-llms-papers)\n  - [🌊 Awesome Code Pre-Training Papers](#-awesome-code-pre-training-papers)\n  - [🐳 Awesome Code Instruction-Tuning Papers](#-awesome-code-instruction-tuning-papers)\n  - [🐬 Awesome Code Alignment Papers](#-awesome-code-alignment-papers)\n  - [🐋 Awesome Code Prompting Papers](#-awesome-code-prompting-papers)\n  - [🐙 Awesome Code Benchmark \\& 
Evaluation Papers](#-awesome-code-benchmark--evaluation-papers)\n- [🙌 Contributors](#-contributors)\n- [Cite as](#cite-as)\n- [Acknowledgement](#acknowledgement)\n- [Star History](#star-history)\n\n&nbsp;\n\n## 🚀 Top Code LLMs\n###### Sort by HumanEval Pass@1\n\n| Rank | Model                                                                                           | Params  | HumanEval | MBPP | Source                                                     |\n|------|-------------------------------------------------------------------------------------------------|---------|-----------|------|------------------------------------------------------------|\n| 1    | o1-mini-2024-09-12                                                                              | -       | 97.6      | 93.9 | [paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2409.12186)                  |\n| 2    | o1-preview-2024-09-12                                                                           | -       | 95.1      | 93.4 | [paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2409.12186)                  |\n| 3    | [Qwen2.5-Coder-32B-Instruct](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen2.5-Coder-32B-Instruct)            | 32B     | 92.7      | 90.2 | [github](https:\u002F\u002Fgithub.com\u002FQwenLM\u002FQwen2.5-Coder)          |\n| 4    | Claude-3.5-Sonnet-20241022                                                                      | -       | 92.1      | 91.0 | [paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2409.12186)                  |\n| 5    | GPT-4o-2024-08-06                                                                               | -       | 92.1      | 86.8 | [paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2409.12186)                  |\n| 6    | [Qwen2.5-Coder-14B-Instruct](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen2.5-Coder-14B-Instruct)            | 14B     | 89.6      | 86.2 | [github](https:\u002F\u002Fgithub.com\u002FQwenLM\u002FQwen2.5-Coder)      
    |\n| 7    | Claude-3.5-Sonnet-20240620                                                                      | -       | 89.0      | 87.6 | [paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2409.12186)                  |\n| 8    | GPT-4o-mini-2024-07-18                                                                          | -       | 87.8      | 86.0 | [paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2409.12186)                  |\n| 9    | [Qwen2.5-Coder-7B-Instruct](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen2.5-Coder-7B-Instruct)              | 7B      | 88.4      | 83.5 | [github](https:\u002F\u002Fgithub.com\u002FQwenLM\u002FQwen2.5-Coder)          |\n| 10   | [DS-Coder-V2-Instruct](https:\u002F\u002Fhuggingface.co\u002Fdeepseek-ai\u002FDeepSeek-Coder-V2-Instruct)           | 21\u002F236B | 85.4      | 89.4 | [github](https:\u002F\u002Fgithub.com\u002Fdeepseek-ai\u002FDeepSeek-Coder-V2) |\n| 11   | [Qwen2.5-Coder-3B-Instruct](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen2.5-Coder-3B-Instruct)              | 3B      | 84.1      | 73.6 | [github](https:\u002F\u002Fgithub.com\u002FQwenLM\u002FQwen2.5-Coder)          |\n| 12   | [DS-Coder-V2-Lite-Instruct](https:\u002F\u002Fhuggingface.co\u002Fdeepseek-ai\u002FDeepSeek-Coder-V2-Lite-Instruct) | 2.4\u002F16B | 81.1      | 82.8 | [github](https:\u002F\u002Fgithub.com\u002Fdeepseek-ai\u002FDeepSeek-Coder-V2) |\n| 13   | [CodeQwen1.5-7B-Chat](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FCodeQwen1.5-7B-Chat)                          | 7B      | 83.5      | 70.6 | [github](https:\u002F\u002Fgithub.com\u002FQwenLM\u002FCodeQwen1.5)            |\n| 14   | [DeepSeek-Coder-33B-Instruct](https:\u002F\u002Fhf.co\u002Fdeepseek-ai\u002Fdeepseek-coder-33b-instruct)            | 33B     | 79.3      | 70.0 | [github](https:\u002F\u002Fgithub.com\u002Fdeepseek-ai\u002FDeepSeek-Coder)    |\n| 15   | [DeepSeek-Coder-6.7B-Instruct](https:\u002F\u002Fhf.co\u002Fdeepseek-ai\u002Fdeepseek-coder-6.7b-instruct)    
      | 6.7B    | 78.6      | 65.4 | [github](https:\u002F\u002Fgithub.com\u002Fdeepseek-ai\u002FDeepSeek-Coder)    |\n| 16   | GPT-3.5-Turbo                                                                                   | -       | 76.2      | 70.8 | [github](https:\u002F\u002Fgithub.com\u002Fdeepseek-ai\u002FDeepSeek-Coder)    |\n| 17   | [CodeLlama-70B-Instruct](https:\u002F\u002Fhuggingface.co\u002Fmeta-llama\u002FCodeLlama-70b-Instruct-hf)           | 70B     | 72.0      | 77.8 | [paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2308.12950)                  |\n| 18   | [Qwen2.5-Coder-1.5B-Instruct](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen2.5-Coder-1.5B-Instruct)          | 1.5B    | 70.7      | 69.2 | [github](https:\u002F\u002Fgithub.com\u002FQwenLM\u002FQwen2.5-Coder)          |\n| 19   | [StarCoder2-15B-Instruct-v0.1](https:\u002F\u002Fhuggingface.co\u002Fbigcode\u002Fstarcoder2-15b-instruct-v0.1)     | 15B     | 67.7      | 78.0 | [paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2305.06161)                  |\n| 20   | [Qwen2.5-Coder-0.5B-Instruct](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen2.5-Coder-0.5B-Instruct)          | 0.5B    | 61.6      | 52.4 | [github](https:\u002F\u002Fgithub.com\u002FQwenLM\u002FQwen2.5-Coder)          |\n| 21   | Pangu-Coder2                                                                                    | 15B     | 61.6      | -    | [paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2307.14936)                  |\n| 22   | [WizardCoder-15B](https:\u002F\u002Fhf.co\u002FWizardLM\u002FWizardCoder-15B-V1.0)                                  | 15B     | 57.3      | 51.8 | [paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2306.08568)                  |\n| 23   | [CodeQwen1.5-7B](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FCodeQwen1.5-7B)                                    | 7B      | 51.8      | 61.8 | [github](https:\u002F\u002Fgithub.com\u002FQwenLM\u002FCodeQwen1.5)            |\n| 24   | 
[CodeLlama-34B-Instruct](https:\u002F\u002Fhuggingface.co\u002Fmeta-llama\u002FCodeLlama-34b-Instruct-hf)           | 34B     | 48.2      | 61.1 | [paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2308.12950)                  |\n| 25   | Code-Davinci-002                                                                                | -       | 47.0      | -    | [paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2107.03374)                  |\n\n&nbsp;\n\n## 💡 Evaluation Toolkit:\n\n- [bigcode-evaluation-harness](https:\u002F\u002Fgithub.com\u002Fbigcode-project\u002Fbigcode-evaluation-harness): A framework for the evaluation of autoregressive code generation language models.\n- [code-eval](https:\u002F\u002Fgithub.com\u002Fabacaj\u002Fcode-eval): A framework for the evaluation of autoregressive code generation language models on HumanEval.\n- [SandboxFusion](https:\u002F\u002Fbytedance.github.io\u002FSandboxFusion): A secure sandbox for running and judging code generated by LLMs.\n\n&nbsp;\n\n## 🚀 Awesome Code LLMs Leaderboard\n| Leaderboard                                                                                                   | Description                                                                                                                                                                                                                                                      |\n|:--------------------------------------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|\n| [Evalperf Leaderboard](https:\u002F\u002Fevalplus.github.io\u002Fevalperf.html)                                              | Evaluating LLMs for Efficient Code Generation.                                 
                                                                                                                                                                                  |\n| [Aider Code Editing Leaderboard](https:\u002F\u002Faider.chat\u002Fdocs\u002Fleaderboards\u002F)                                       | Measuring the LLM’s coding ability, and whether it can write new code that integrates into existing code.                                                                                                                                                        |\n| [BigCodeBench Leaderboard](https:\u002F\u002Fbigcode-bench.github.io)                                                   | BigCodeBench evaluates LLMs with practical and challenging programming tasks.                                                                                                                                                                                    |\n| [LiveCodeBench Leaderboard](https:\u002F\u002Flivecodebench.github.io\u002Fleaderboard.html)                                 | Holistic and Contamination Free Evaluation of Large Language Models for Code.                                                                                                                                                                                    |\n| [Big Code Models Leaderboard](https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fbigcode\u002Fbigcode-models-leaderboard)               | Compare performance of base multilingual code generation models on HumanEval benchmark and MultiPL-E.                                                                                                                                                            |\n| [BIRD Leaderboard](https:\u002F\u002Fbird-bench.github.io)                                                              | BIRD contains over 12,751 unique question-SQL pairs, 95 big databases with a total size of 33.4 GB. 
It also covers more than 37 professional domains, such as blockchain, hockey, healthcare and education, etc.                                                 |\n| [CanAiCode Leaderboard](https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fmike-ravkine\u002Fcan-ai-code-results)                       | CanAiCode Leaderboard                                                                                                                                                                                                                                            |\n| [Coding LLMs Leaderboard](https:\u002F\u002Fleaderboard.tabbyml.com)                                                    | Coding LLMs Leaderboard                                                                                                                                                                                                                                          |\n| [CRUXEval Leaderboard](https:\u002F\u002Fcrux-eval.github.io\u002Fleaderboard.html)                                          | CRUXEval is a benchmark complementary to HumanEval and MBPP measuring code reasoning, understanding, and execution capabilities!                                                                                                                                 |\n| [EvalPlus Leaderboard](https:\u002F\u002Fevalplus.github.io\u002Fleaderboard.html)                                           | EvalPlus evaluates AI Coders with rigorous tests.                                                                                                                                                                                                                
|\n| [InfiBench Leaderboard](https:\u002F\u002Finfi-coder.github.io\u002Finfibench\u002F)                                              | InfiBench is a comprehensive benchmark for code large language models evaluating model ability on answering freeform real-world questions in the code domain.                                                                                                    |\n| [InterCode Leaderboard](https:\u002F\u002Fintercode-benchmark.github.io)                                                | InterCode is a benchmark for evaluating language models on the interactive coding task. Given a natural language request, an agent is asked to interact with a software system (e.g., database, terminal) with code to resolve the issue.                        |\n| [Program Synthesis Models Leaderboard](https:\u002F\u002Faccubits.com\u002Fopen-source-program-synthesis-models-leaderboard) | They created this leaderboard to help researchers easily identify the best open-source model with an intuitive leadership quadrant graph. They evaluate the performance of open-source code models to rank them based on their capabilities and market adoption. |\n| [Spider Leaderboard](https:\u002F\u002Fyale-lily.github.io\u002Fspider)                                                      | Spider is a large-scale complex and cross-domain semantic parsing and text-to-SQL dataset annotated by 11 Yale students. The goal of the Spider challenge is to develop natural language interfaces to cross-domain databases.                                   
|\n\n&nbsp;\n\n\n## 📚 Awesome Code LLMs Papers\n\n### 🌊 Awesome Code Pre-Training Papers\n| Title                                                                                                                                                                                                                                                  | Venue      | Date      | Code                                                       | Resources                                                                         |\n|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------|-----------|------------------------------------------------------------|-----------------------------------------------------------------------------------|\n| ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FOpenCoder-llm\u002FOpenCoder-llm.svg?style=social&label=Star) \u003Cbr> [**OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2411.04905) \u003Cbr>                            | `Preprint` | `2024.11` | [Github](https:\u002F\u002Fgithub.com\u002FOpenCoder-llm\u002FOpenCoder-llm)   | [HF](https:\u002F\u002Fhuggingface.co\u002Finfly\u002FOpenCoder-8B-Instruct)                          |\n| ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FQwenLM\u002FQwen2.5-Coder.svg?style=social&label=Star) \u003Cbr> [**Qwen2.5-Coder Technical Report**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2409.12186) \u003Cbr>                                                                         | `Preprint` | `2024.09` | [Github](https:\u002F\u002Fgithub.com\u002FQwenLM\u002FQwen2.5-Coder)          | [HF](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen2.5-Coder-32B-Instruct)                      |\n| 
![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fdeepseek-ai\u002FDeepSeek-Coder-V2.svg?style=social&label=Star) \u003Cbr> [**DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2406.11931) \u003Cbr>          | `Preprint` | `2024.06` | [Github](https:\u002F\u002Fgithub.com\u002Fdeepseek-ai\u002FDeepSeek-Coder-V2) | [HF](https:\u002F\u002Fhuggingface.co\u002Fdeepseek-ai\u002FDeepSeek-Coder-V2-Instruct)               |\n| ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fbigcode-project\u002Fstarcoder2.svg?style=social&label=Star) \u003Cbr> [**StarCoder 2 and The Stack v2: The Next Generation**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2402.19173) \u003Cbr>                                                | `Preprint` | `2024.02` | [Github](https:\u002F\u002Fgithub.com\u002Fbigcode-project\u002Fstarcoder2)    | [HF](https:\u002F\u002Fhuggingface.co\u002Fbigcode)                                              |\n| ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fdeepseek-ai\u002FDeepSeek-Coder.svg?style=social&label=Star) \u003Cbr> [**DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2401.14196) \u003Cbr> | `Preprint` | `2024.01` | [Github](https:\u002F\u002Fgithub.com\u002Fdeepseek-ai\u002FDeepSeek-Coder)    | [HF](https:\u002F\u002Fhuggingface.co\u002Fdeepseek-ai\u002Fdeepseek-coder-33b-instruct)              |\n| ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fmeta-llama\u002Fcodellama.svg?style=social&label=Star) \u003Cbr> [**Code Llama: Open Foundation Models for Code**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2308.12950) \u003Cbr>                                                            | `Preprint` | `2023.08` | [Github](https:\u002F\u002Fgithub.com\u002Fmeta-llama\u002Fcodellama)          | 
[HF](https:\u002F\u002Fhuggingface.co\u002Fmeta-llama\u002FCodeLlama-7b-hf)                           |\n| [**Textbooks Are All You Need**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2306.11644) \u003Cbr>                                                                                                                                                                                | `Preprint` | `2023.06` | -                                                          | [HF](https:\u002F\u002Fhuggingface.co\u002Fmicrosoft\u002Fphi-1)                                      |\n| ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fsalesforce\u002FCodeT5.svg?style=social&label=Star) \u003Cbr> [**CodeT5+: Open Code Large Language Models for Code Understanding and Generation**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2305.07922) \u003Cbr>                            | `Preprint` | `2023.05` | [Github](https:\u002F\u002Fgithub.com\u002Fsalesforce\u002FCodeT5)             | [HF](https:\u002F\u002Fhuggingface.co\u002FSalesforce\u002Fcodet5p-16b)                               |\n| ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fbigcode-project\u002Fstarcoder.svg?style=social&label=Star) \u003Cbr> [**StarCoder: may the source be with you!**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2305.06161) \u003Cbr>                                                            | `Preprint` | `2023.05` | [Github](https:\u002F\u002Fgithub.com\u002Fbigcode-project\u002Fstarcoder)     | [HF](https:\u002F\u002Fhuggingface.co\u002Fbigcode\u002Fstarcoder)                                    |\n| ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fsalesforce\u002FCodeGen.svg?style=social&label=Star) \u003Cbr> [**CodeGen2: Lessons for Training LLMs on Programming and Natural Languages**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2305.02309) \u003Cbr>                                 | `ICLR23`   | `2023.05` | 
[Github](https:\u002F\u002Fgithub.com\u002Fsalesforce\u002FCodeGen)            | [HF](https:\u002F\u002Fhuggingface.co\u002FSalesforce\u002Fcodegen25-7b-multi_P)                      |\n| ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FTHUDM\u002FCodeGeeX.svg?style=social&label=Star) \u003Cbr> [**CodeGeeX: A Pre-Trained Model for Code Generation with Multilingual Evaluations on HumanEval-X**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2303.17568) \u003Cbr>               | `Preprint` | `2023.03` | [Github](https:\u002F\u002Fgithub.com\u002FTHUDM\u002FCodeGeeX)                | [HF](https:\u002F\u002Fhuggingface.co\u002Fcollections\u002FTHUDM\u002Fcodegeex4-6694e777e98246f00632fcf1) |\n| [**SantaCoder: don't reach for the stars!**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2301.03988) \u003Cbr>                                                                                                                                                                    | `Preprint` | `2023.01` | -                                                          | [HF](https:\u002F\u002Fhuggingface.co\u002Fbigcode\u002Fsantacoder)                                   |\n| ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fsalesforce\u002FCodeGen.svg?style=social&label=Star) \u003Cbr> [**CodeGen: An Open Large Language Model for Code with Multi-Turn Program Synthesis**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2203.13474) \u003Cbr>                         | `ICLR'23`  | `2022.03` | [Github](https:\u002F\u002Fgithub.com\u002Fsalesforce\u002FCodeGen)            | [HF](https:\u002F\u002Fhuggingface.co\u002FSalesforce\u002Fcodegen25-7b-multi_P)                      |\n| ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fopenai\u002Fhuman-eval.svg?style=social&label=Star) \u003Cbr> [**Evaluating Large Language Models Trained on Code**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2107.03374) \u003Cbr>                                               
           | `Preprint` | `2021.07` | [Github](https:\u002F\u002Fgithub.com\u002Fopenai\u002Fhuman-eval)             | -                                                                                 |\n\n&nbsp;\n\n### 🐳 Awesome Code Instruction-Tuning Papers\n| Title                                                                                                                                                                                                                                                | Venue      | Date      | Code                                                  | Resources                                                      |\n|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------|-----------|-------------------------------------------------------|----------------------------------------------------------------|\n| ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fise-uiuc\u002Fmagicoder.svg?style=social&label=Star) \u003Cbr> [**Magicoder: Source Code Is All You Need**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2312.02120) \u003Cbr>                                                                 | `ICML'24`  | `2023.12` | [Github](https:\u002F\u002Fgithub.com\u002Fise-uiuc\u002Fmagicoder)       | [HF](https:\u002F\u002Fhuggingface.co\u002Fise-uiuc\u002FMagicoder-DS-6.7B)        |\n| ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fbigcode-project\u002Foctopack.svg?style=social&label=Star) \u003Cbr> [**OctoPack: Instruction Tuning Code Large Language Models**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2308.07124) \u003Cbr>                                          | `ICLR'24`  | `2023.08` | [Github](https:\u002F\u002Fgithub.com\u002Fbigcode-project\u002Foctopack) | 
[HF](https:\u002F\u002Fhuggingface.co\u002Fbigcode\u002Foctocoder)                 |\n| ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fnlpxucan\u002FWizardLM.svg?style=social&label=Star) \u003Cbr> [**WizardCoder: Empowering Code Large Language Models with Evol-Instruct**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2306.08568) \u003Cbr>                                   | `Preprint` | `2023.07` | [Github](https:\u002F\u002Fgithub.com\u002Fnlpxucan\u002FWizardLM)        | [HF](https:\u002F\u002Fhuggingface.co\u002FWizardLMTeam\u002FWizardCoder-15B-V1.0) |\n| ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fsahil280114\u002Fcodealpaca.svg?style=social&label=Star) \u003Cbr> [**Code Alpaca: An Instruction-following LLaMA Model trained on code generation instructions**](https:\u002F\u002Fgithub.com\u002Fsahil280114\u002Fcodealpaca) \u003Cbr> | `Preprint` | `2023.xx` | [Github](https:\u002F\u002Fgithub.com\u002Fsahil280114\u002Fcodealpaca)   | [HF](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fsahil2801\u002FCodeAlpaca-20k) |\n\n&nbsp;\n\n\n### 🐬 Awesome Code Alignment Papers\n| Title                                                                                                                                                                                                                                    | Venue        | Date      | Code                                                          | Resources |\n|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------|-----------|---------------------------------------------------------------|-----------|\n| [**ProSec: Fortifying Code LLMs with Proactive Security Alignment**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2411.12882) \u003Cbr>                                                  
                                                                            | `Preprint`   | `2024.11` | -                                                             | -         |\n| [**PLUM: Preference Learning Plus Test Cases Yields Better Code Language Models**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2406.06887) \u003Cbr>                                                                                                                | `Preprint`   | `2024.06` | -                                                             | -         |\n| [**PanGu-Coder2: Boosting Large Language Models for Code with Ranking Feedback**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2307.14936) \u003Cbr>                                                                                                                 | `Preprint`   | `2023.07` | -                                                             | -         |\n| ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FZyq-scut\u002FRLTF.svg?style=social&label=Star) \u003Cbr> [**RLTF: Reinforcement Learning from Unit Test Feedback**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2307.04349) \u003Cbr>                                            | `Preprint`   | `2023.07` | [Github](https:\u002F\u002Fgithub.com\u002FZyq-scut\u002FRLTF)                    | -         |\n| ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Freddy-lab-code-research\u002FPPOCoder.svg?style=social&label=Star) \u003Cbr> [**Execution-based Code Generation using Deep Reinforcement Learning**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2301.13816) \u003Cbr>            | `TMLR'23`    | `2023.01` | [Github](https:\u002F\u002Fgithub.com\u002Freddy-lab-code-research\u002FPPOCoder) | -         |\n| ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fsalesforce\u002FCodeRL.svg?style=social&label=Star) \u003Cbr> [**CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement 
Learning**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2207.01780) \u003Cbr> | `NeurIPS'22` | `2022.07` | [Github](https:\u002F\u002Fgithub.com\u002Fsalesforce\u002FCodeRL)                | -         |\n                                                    \n&nbsp;\n\n### 🐋 Awesome Code Prompting Papers\n| Title                                                                                                                                                                                                                                                         | Venue      | Date      | Code                                                                   | Resources |\n|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------|-----------|------------------------------------------------------------------------|-----------|\n| ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FYerbaPage\u002FMGDebugger.svg?style=social&label=Star) \u003Cbr> [**From Code to Correctness: Closing the Last Mile of Code Generation with Hierarchical Debugging**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2410.01215) \u003Cbr>                | `Preprint` | `2024.10` | [Github](https:\u002F\u002Fgithub.com\u002FYerbaPage\u002FMGDebugger)                      | -         |\n| ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FHambaobao\u002FHCP-Coder.svg?style=social&label=Star) \u003Cbr> [**Hierarchical Context Pruning: Optimizing Real-World Code Completion with Repository-Level Pretrained Code LLMs**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2406.18294) \u003Cbr> | `AAAI'25`  | `2024.06` | [Github](https:\u002F\u002Fgithub.com\u002FHambaobao\u002FHCP-Coder)                       | -         |\n| 
![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FFloridSleeves\u002FLLMDebugger.svg?style=social&label=Star) \u003Cbr> [**Debug like a Human: A Large Language Model Debugger via Verifying Runtime Execution Step-by-step**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2402.16906) \u003Cbr>         | `ACL'24`   | `2024.02` | [Github](https:\u002F\u002Fgithub.com\u002FFloridSleeves\u002FLLMDebugger)                 | -         |\n| [**SelfEvolve: A Code Evolution Framework via Large Language Models**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2306.02907) \u003Cbr>                                                                                                                                                 | `Preprint` | `2023.06` | -                                                                      | -         |\n| ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Ftheoxo\u002Fself-repair.svg?style=social&label=Star) \u003Cbr> [**Demystifying GPT Self-Repair for Code Generation**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2306.09896) \u003Cbr>                                                                | `ICLR'24`  | `2023.06` | [Github](https:\u002F\u002Fgithub.com\u002Ftheoxo\u002Fself-repair)                        | -         |\n| [**Teaching Large Language Models to Self-Debug**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2304.05128) \u003Cbr>                                                                                                                                                                     | `ICLR'24`  | `2023.06` | -                                                                      | -         |\n| ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fniansong1996\u002Flever.svg?style=social&label=Star) \u003Cbr> [**LEVER: Learning to Verify Language-to-Code Generation with Execution**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2302.08468) \u003Cbr>                                            
| `ICML'23`  | `2023.02` | [Github](https:\u002F\u002Fgithub.com\u002Fniansong1996\u002Flever)                        | -         |\n| ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Ffacebookresearch\u002Fcoder_reviewer_reranking.svg?style=social&label=Star) \u003Cbr> [**Coder Reviewer Reranking for Code Generation**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2211.16490) \u003Cbr>                                             | `ICML'23`  | `2022.11` | [Github](https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Fcoder_reviewer_reranking) | -         |\n| ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fmicrosoft\u002FCodeT.svg?style=social&label=Star) \u003Cbr> [**CodeT: Code Generation with Generated Tests**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2207.10397) \u003Cbr>                                                                        | `ICLR'23`  | `2022.07` | [Github](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FCodeT)                           | -         |\n\n&nbsp;\n\n### 🐙 Awesome Code Benchmark & Evaluation Papers\n| Dataset           | Title                                                                                                                                                                                                                                                           | Venue        | Date      | Code                                                                                          | Resources                                                                                                           
|\n|:------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------|-----------|-----------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------|\n| `CodeArena`       | ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FQwenLM\u002FQwen2.5-Coder.svg?style=social&label=Star) \u003Cbr> [**Evaluating and Aligning CodeLLMs on Human Preference**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2412.05210) \u003Cbr>                                                            | `Preprint`   | `2024.12` | [Github](https:\u002F\u002Fgithub.com\u002FQwenLM\u002FQwen2.5-Coder\u002Ftree\u002Fmain\u002Fqwencoder-eval\u002Finstruct\u002FCodeArena) | [HF](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FCSJianYang\u002FCodeArena)                                                          |\n| `FullStack Bench` | ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fbytedance\u002FFullStackBench.svg?style=social&label=Star) \u003Cbr> [**FullStack Bench: Evaluating LLMs as Full Stack Coders**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2412.00535) \u003Cbr>                                                       | `Preprint`   | `2024.12` | [Github](https:\u002F\u002Fgithub.com\u002Fbytedance\u002FFullStackBench)                                         | [HF](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FByteDance\u002FFullStackBench) [Github](https:\u002F\u002Fgithub.com\u002Fbytedance\u002FSandboxFusion) |\n| `GitChameleon`    | ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FNizarIslah\u002FGitChameleon.svg?style=social&label=Star) \u003Cbr> [**GitChameleon: 
Unmasking the Version-Switching Capabilities of Code Generation Models**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2411.05830) \u003Cbr>                         | `Preprint`   | `2024.11` | [Github](https:\u002F\u002Fgithub.com\u002FNizarIslah\u002FGitChameleon)                                          | -                                                                                                                   |\n| `Evalperf`        | ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fevalplus\u002Fevalplus.svg?style=social&label=Star) \u003Cbr> [**Evaluating Language Models for Efficient Code Generation**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2408.06450) \u003Cbr>                                                           | `COLM'24`    | `2024.08` | [Github](https:\u002F\u002Fgithub.com\u002Fevalplus\u002Fevalplus)                                                | [HF](https:\u002F\u002Fhuggingface.co\u002Fevalplus)                                                                               |\n| `LiveCodeBench`   | ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FLiveCodeBench\u002FLiveCodeBench.svg?style=social&label=Star) \u003Cbr> [**LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2403.07974) \u003Cbr>              | `Preprint`   | `2024.03` | [Github](https:\u002F\u002Fgithub.com\u002FLiveCodeBench\u002FLiveCodeBench)                                      | [HF](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Flivecodebench\u002Fcode_generation_lite)                                            |\n| `DevBench`        | ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fopen-compass\u002FDevBench.svg?style=social&label=Star) \u003Cbr> [**DevBench: A Comprehensive Benchmark for Software Development**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2403.08604) \u003Cbr>                                           
        | `Preprint`   | `2024.03` | [Github](https:\u002F\u002Fgithub.com\u002Fopen-compass\u002FDevBench)                                            | -                                                                                                                   |\n| `SWE-bench`       | ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fprinceton-nlp\u002FSWE-bench.svg?style=social&label=Star) \u003Cbr> [**SWE-bench: Can Language Models Resolve Real-World GitHub Issues?**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2310.06770) \u003Cbr>                                             | `ICLR'24`    | `2024.03` | [Github](https:\u002F\u002Fgithub.com\u002Fprinceton-nlp\u002FSWE-bench)                                          | [HF](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fprinceton-nlp\u002FSWE-bench)                                                       |\n| `CrossCodeEval`   | ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Famazon-science\u002Fcceval.svg?style=social&label=Star) \u003Cbr> [**CrossCodeEval: A Diverse and Multilingual Benchmark for Cross-File Code Completion**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2310.11248) \u003Cbr>                             | `NeurIPS'23` | `2023.11` | [Github](https:\u002F\u002Fgithub.com\u002Famazon-science\u002Fcceval)                                            | -                                                                                                                   |\n| `RepoCoder`       | ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fmicrosoft\u002FCodeT.svg?style=social&label=Star) \u003Cbr> [**Repository-Level Code Completion Through Iterative Retrieval and Generation**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2303.12570) \u003Cbr>                                          | `EMNLP'23`   | `2023.10` | [Github](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FCodeT\u002Ftree\u002Fmain\u002FRepoCoder)                              | 
-                                                                                                                   |\n| `LongCoder`       | ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fmicrosoft\u002FCodeBERT.svg?style=social&label=Star) \u003Cbr> [**LongCoder: A Long-Range Pre-trained Language Model for Code Completion**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2306.14893) \u003Cbr>                                            | `ICML'23`    | `2023.10` | [Github](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FCodeBERT)                                               | -                                                                                                                   |\n| -                 | [**Can ChatGPT replace StackOverflow? A Study on Robustness and Reliability of Large Language Model Code Generation**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2308.10335) \u003Cbr>                                                                                                   | `Preprint`   | `2023.08` | -                                                                                             | -                                                                                                                   |\n| `BioCoder`        | ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fgersteinlab\u002FBioCoder.svg?style=social&label=Star) \u003Cbr> [**BioCoder: A Benchmark for Bioinformatics Code Generation with Large Language Models**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2308.16458) \u003Cbr>                             | `ISMB'24`    | `2023.08` | [Github](https:\u002F\u002Fgithub.com\u002Fgersteinlab\u002FBioCoder)                                             | -                                                                                                                   |\n| `RepoBench`       | 
![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FLeolty\u002Frepobench.svg?style=social&label=Star) \u003Cbr> [**RepoBench: Benchmarking Repository-Level Code Auto-Completion Systems**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2306.03091) \u003Cbr>                                               | `ICLR'24`    | `2023.06` | [Github](https:\u002F\u002Fgithub.com\u002FLeolty\u002Frepobench)                                                 | [HF](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Ftianyang\u002Frepobench_python_v1.1)                                                |\n| `Evalplus`        | ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fevalplus\u002Fevalplus.svg?style=social&label=Star) \u003Cbr> [**Is Your Code Generated by ChatGPT Really Correct? Rigorous Evaluation of Large Language Models for Code Generation**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2305.01210) \u003Cbr> | `NeurIPS'23` | `2023.05` | [Github](https:\u002F\u002Fgithub.com\u002Fevalplus\u002Fevalplus)                                                | [HF](https:\u002F\u002Fhuggingface.co\u002Fevalplus)                                                                               |\n| `Coeditor`        | ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FMrVPlusOne\u002FCoeditor.svg?style=social&label=Star) \u003Cbr> [**Coeditor: Leveraging Contextual Changes for Multi-round Code Auto-editing**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2305.18584) \u003Cbr>                                        | `ICLR'24`    | `2023.05` | [Github](https:\u002F\u002Fgithub.com\u002FMrVPlusOne\u002FCoeditor)                                              | -                                                                                                                   |\n| `DS-1000`         | ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fxlang-ai\u002FDS-1000.svg?style=social&label=Star) \u003Cbr> [**DS-1000: A Natural 
and Reliable Benchmark for Data Science Code Generation**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2211.11501) \u003Cbr>                                          | `ICML'23`    | `2022.11` | [Github](https:\u002F\u002Fgithub.com\u002Fxlang-ai\u002FDS-1000)                                                 | [HF](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fxlangai\u002FDS-1000)                                                               |\n| `MultiPL-E`       | ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fnuprl\u002FMultiPL-E.svg?style=social&label=Star) \u003Cbr> [**MultiPL-E: A Scalable and Extensible Approach to Benchmarking Neural Code Generation**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2208.08227) \u003Cbr>                                 | `Preprint`   | `2022.08` | [Github](https:\u002F\u002Fgithub.com\u002Fnuprl\u002FMultiPL-E)                                                  | [HF](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fnuprl\u002FMultiPL-E)                                                               |\n| `MBPP`            | ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fgoogle-research\u002Fgoogle-research.svg?style=social&label=Star) \u003Cbr> [**Program Synthesis with Large Language Models**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2108.07732) \u003Cbr>                                                         | `Preprint`   | `2021.08` | [Github](https:\u002F\u002Fgithub.com\u002Fgoogle-research\u002Fgoogle-research\u002Fblob\u002Fmaster\u002Fmbpp\u002FREADME.md)       | [HF](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fgoogle-research-datasets\u002Fmbpp)                                                 |\n| `APPS`            | ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fhendrycks\u002Fapps.svg?style=social&label=Star) \u003Cbr> [**Measuring Coding Challenge Competence With APPS**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2105.09938) \u003Cbr>      
                                                                 | `NeurIPS'21` | `2021.05` | [Github](https:\u002F\u002Fgithub.com\u002Fhendrycks\u002Fapps)                                                   | [HF](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fcodeparrot\u002Fapps)                                                               |\n\n&nbsp;\n\n## 🙌 Contributors\n\n\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fhuybery\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fhuybery_Awesome-Code-LLM_readme_8568104e56ba.png\"  width=\"50\" \u002F>\u003C\u002Fa>\n\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FYangjiaxi\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fhuybery_Awesome-Code-LLM_readme_2ddbd05723be.png\"  width=\"50\" \u002F>\u003C\u002Fa>\n\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FGanjinZero\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fhuybery_Awesome-Code-LLM_readme_f2184245403a.png\"  width=\"50\" \u002F>\u003C\u002Fa>\n\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FTyDunn\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fhuybery_Awesome-Code-LLM_readme_4894a50f39e2.png\"  width=\"50\" \u002F>\u003C\u002Fa>\n\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FHambaobao\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fhuybery_Awesome-Code-LLM_readme_d95a4fa5c430.png\"  width=\"50\" \u002F>\u003C\u002Fa>\n\nThis is an active repository and your contributions are always welcome! 
If you have any questions about this opinionated list, do not hesitate to contact me at `huybery@gmail.com`.\n\n&nbsp;\n\n## Cite as\n\n```\n@software{awesome-code-llm,\n  author = {Binyuan Hui and Lei Zhang},\n  title = {An awesome and curated list of best code-LLM for research},\n  howpublished = {\\url{https:\u002F\u002Fgithub.com\u002Fhuybery\u002FAwesome-Code-LLM}},\n  year = 2023,\n}\n```\n\n&nbsp;\n\n## Acknowledgement\n\nThis project is inspired by [Awesome-LLM](https:\u002F\u002Fgithub.com\u002FHannibal046\u002FAwesome-LLM).\n\n&nbsp;\n\n## Star History\n\n[![Star History Chart](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fhuybery_Awesome-Code-LLM_readme_1d1147fe0443.png)](https:\u002F\u002Fstar-history.com\u002F#huybery\u002FAwesome-Code-LLM&Date)\n\n\n**[⬆ Back to ToC](#-table-of-contents)**\n","\u003Cdiv align=\"center\">\n  \u003Ch1>👨‍💻 Awesome Code LLM\u003C\u002Fh1>\n  \u003Ca href=\"https:\u002F\u002Fawesome.re\">\n    \u003Cimg src=\"https:\u002F\u002Fawesome.re\u002Fbadge.svg\" alt=\"Awesome\">\n  \u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FPRs-Welcome-red\">\n    \u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FPRs-Welcome-red\" alt=\"PRs Welcome\">\n  \u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Flast-commit\u002Fhuybery\u002FAwesome-Code-LLM?color=green\">\n    \u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Flast-commit\u002Fhuybery\u002FAwesome-Code-LLM?color=green\" alt=\"Last Commit\">\n  \u003C\u002Fa>\n\u003C\u002Fdiv>\n\n![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fhuybery_Awesome-Code-LLM_readme_0210539050ff.png)\n\n&nbsp;\n\n## 🔆 How to Contribute\n\nContributions are welcome!\nIf you have any resources, tools, papers, or insights related to Code LLMs, feel free to submit a pull request.\nLet's work together to make this project better!\n\n&nbsp;\n\n## News\n\n- 🔥🔥🔥 **[2024-11-12]** The [**Qwen2.5-Coder series**](https:\u002F\u002Fhuggingface.co\u002Fcollections\u002FQwen\u002Fqwen25-66e81a666513e518adb90d9e) is released, offering six model sizes (0.5B, 1.5B, 3B, 7B, 14B, 32B), with Qwen2.5-Coder-32B-Instruct 
becoming the most powerful open-source code model at the time of its release.\n- 🔥🔥 **[2024-11-08]** [OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models](https:\u002F\u002Farxiv.org\u002Fabs\u002F2411.04905) is released.\n\n&nbsp;\n\n## 🧵 Table of Contents\n\n- [🧵 Table of Contents](#-table-of-contents)\n- [🚀 Top Code LLMs](#-top-code-llms)\n- [💡 Evaluation Toolkit](#-evaluation-toolkit)\n- [🚀 Awesome Code LLMs Leaderboard](#-awesome-code-llms-leaderboard)\n- [📚 Awesome Code LLMs Papers](#-awesome-code-llms-papers)\n  - [🌊 Awesome Code Pre-Training Papers](#-awesome-code-pre-training-papers)\n  - [🐳 Awesome Code Instruction-Tuning Papers](#-awesome-code-instruction-tuning-papers)\n  - [🐬 Awesome Code Alignment Papers](#-awesome-code-alignment-papers)\n  - [🐋 Awesome Code Prompting Papers](#-awesome-code-prompting-papers)\n  - [🐙 Awesome Code Benchmark & Evaluation Papers](#-awesome-code-benchmark--evaluation-papers)\n- [🙌 Contributors](#-contributors)\n- [Cite as](#cite-as)\n- [Acknowledgement](#acknowledgement)\n- [Star History](#star-history)\n\n&nbsp;\n\n## 🚀 Top Code LLMs\n###### Sorted by HumanEval pass rate\n\n| Rank | Model                                                                                           | Params  | HumanEval | MBPP | Source                                                     |\n|------|-------------------------------------------------------------------------------------------------|---------|-----------|------|------------------------------------------------------------|\n| 1    | o1-mini-2024-09-12                                                                              | -       | 97.6      | 93.9 | [paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2409.12186)                  |\n| 2    | o1-preview-2024-09-12                                                                           | -       | 95.1      | 93.4 | [paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2409.12186)                  |\n| 3    | [Qwen2.5-Coder-32B-Instruct](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen2.5-Coder-32B-Instruct)            | 32B     | 92.7      | 90.2 | [github](https:\u002F\u002Fgithub.com\u002FQwenLM\u002FQwen2.5-Coder)          |\n| 4    | Claude-3.5-Sonnet-20241022                                                                      | -       | 92.1      | 91.0 | 
[paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2409.12186)                  |\n| 5    | GPT-4o-2024-08-06                                                                               | -       | 92.1      | 86.8 | [paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2409.12186)                  |\n| 6    | [Qwen2.5-Coder-14B-Instruct](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen2.5-Coder-14B-Instruct)            | 14B     | 89.6      | 86.2 | [github](https:\u002F\u002Fgithub.com\u002FQwenLM\u002FQwen2.5-Coder)          |\n| 7    | Claude-3.5-Sonnet-20240620                                                                      | -       | 89.0      | 87.6 | [paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2409.12186)                  |\n| 8    | GPT-4o-mini-2024-07-18                                                                          | -       | 87.8      | 86.0 | [paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2409.12186)                  |\n| 9    | [Qwen2.5-Coder-7B-Instruct](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen2.5-Coder-7B-Instruct)              | 7B      | 88.4      | 83.5 | [github](https:\u002F\u002Fgithub.com\u002FQwenLM\u002FQwen2.5-Coder)          |\n| 10   | [DS-Coder-V2-Instruct](https:\u002F\u002Fhuggingface.co\u002Fdeepseek-ai\u002FDeepSeek-Coder-V2-Instruct)           | 21\u002F236B | 85.4      | 89.4 | [github](https:\u002F\u002Fgithub.com\u002Fdeepseek-ai\u002FDeepSeek-Coder-V2) |\n| 11   | [Qwen2.5-Coder-3B-Instruct](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen2.5-Coder-3B-Instruct)              | 3B      | 84.1      | 73.6 | [github](https:\u002F\u002Fgithub.com\u002FQwenLM\u002FQwen2.5-Coder)          |\n| 12   | [DS-Coder-V2-Lite-Instruct](https:\u002F\u002Fhuggingface.co\u002Fdeepseek-ai\u002FDeepSeek-Coder-V2-Lite-Instruct) | 2.4\u002F16B | 81.1      | 82.8 | [github](https:\u002F\u002Fgithub.com\u002Fdeepseek-ai\u002FDeepSeek-Coder-V2) |\n| 13   | 
[CodeQwen1.5-7B-Chat](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FCodeQwen1.5-7B-Chat)                          | 7B      | 83.5      | 70.6 | [github](https:\u002F\u002Fgithub.com\u002FQwenLM\u002FCodeQwen1.5)            |\n| 14   | [DeepSeek-Coder-33B-Instruct](https:\u002F\u002Fhf.co\u002Fdeepseek-ai\u002Fdeepseek-coder-33b-instruct)            | 33B     | 79.3      | 70.0 | [github](https:\u002F\u002Fgithub.com\u002Fdeepseek-ai\u002FDeepSeek-Coder)    |\n| 15   | [DeepSeek-Coder-6.7B-Instruct](https:\u002F\u002Fhf.co\u002Fdeepseek-ai\u002Fdeepseek-coder-6.7b-instruct)          | 6.7B    | 78.6      | 65.4 | [github](https:\u002F\u002Fgithub.com\u002Fdeepseek-ai\u002FDeepSeek-Coder)    |\n| 16   | GPT-3.5-Turbo                                                                                   | -       | 76.2      | 70.8 | [github](https:\u002F\u002Fgithub.com\u002Fdeepseek-ai\u002FDeepSeek-Coder)    |\n| 17   | [CodeLlama-70B-Instruct](https:\u002F\u002Fhuggingface.co\u002Fmeta-llama\u002FCodeLlama-70b-Instruct-hf)           | 70B     | 72.0      | 77.8 | [paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2308.12950)                  |\n| 18   | [Qwen2.5-Coder-1.5B-Instruct](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen2.5-Coder-1.5B-Instruct)          | 1.5B    | 70.7      | 69.2 | [github](https:\u002F\u002Fgithub.com\u002FQwenLM\u002FQwen2.5-Coder)          |\n| 19   | [StarCoder2-15B-Instruct-v0.1](https:\u002F\u002Fhuggingface.co\u002Fbigcode\u002Fstarcoder2-15b-instruct-v0.1)     | 15B     | 67.7      | 78.0 | [paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2305.06161)                  |\n| 20   | [Qwen2.5-Coder-0.5B-Instruct](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen2.5-Coder-0.5B-Instruct)          | 0.5B    | 61.6      | 52.4 | [github](https:\u002F\u002Fgithub.com\u002FQwenLM\u002FQwen2.5-Coder)          |\n| 21   | Pangu-Coder2                                                                                    | 15B     | 61.6      | - 
   | [paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2307.14936)                  |\n| 22   | [WizardCoder-15B](https:\u002F\u002Fhf.co\u002FWizardLM\u002FWizardCoder-15B-V1.0)                                  | 15B     | 57.3      | 51.8 | [paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2306.08568)                  |\n| 23   | [CodeQwen1.5-7B](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FCodeQwen1.5-7B)                                    | 7B      | 51.8      | 61.8 | [github](https:\u002F\u002Fgithub.com\u002FQwenLM\u002FCodeQwen1.5)            |\n| 24   | [CodeLlama-34B-Instruct](https:\u002F\u002Fhuggingface.co\u002Fmeta-llama\u002FCodeLlama-34b-Instruct-hf)           | 34B     | 48.2      | 61.1 | [paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2308.12950)                  |\n| 25   | Code-Davinci-002                                                                                | -       | 47.0      | -    | [paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2107.03374)                  |\n\n&nbsp;\n\n## 💡 Evaluation Toolkit:\n\n- [bigcode-evaluation-harness](https:\u002F\u002Fgithub.com\u002Fbigcode-project\u002Fbigcode-evaluation-harness): A framework for evaluating autoregressive code generation language models.\n- [code-eval](https:\u002F\u002Fgithub.com\u002Fabacaj\u002Fcode-eval): A framework for evaluating autoregressive code generation language models on HumanEval.\n- [SandboxFusion](https:\u002F\u002Fbytedance.github.io\u002FSandboxFusion): A secure sandbox for running and judging code generated by LLMs.\n\n&nbsp;\n\n## 🚀 Awesome Code LLMs Leaderboard\n| Leaderboard                                                                                                   | Description                                                                                                                                                                                                                                                      
 |\n|:--------------------------------------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|\n| [Evalperf Leaderboard](https:\u002F\u002Fevalplus.github.io\u002Fevalperf.html)                                         | Evaluates LLMs on efficient code generation.                                                                                                                                                                                                                     |\n| [Aider Code Editing Leaderboard](https:\u002F\u002Faider.chat\u002Fdocs\u002Fleaderboards\u002F)                             | Measures an LLM's coding ability and whether it can write new code that integrates seamlessly into existing code.                                                                                                                                                |\n| [BigCodeBench Leaderboard](https:\u002F\u002Fbigcode-bench.github.io)                                              | BigCodeBench evaluates LLMs with practical and challenging programming tasks.                                                                                                                                                                                    |\n| [LiveCodeBench Leaderboard](https:\u002F\u002Flivecodebench.github.io\u002Fleaderboard.html)                            | Holistic and contamination-free evaluation of large language models for code.                                                                                                                                                                                    |\n| [Big Code Models Leaderboard](https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fbigcode\u002Fbigcode-models-leaderboard)          | Compares the performance of base multilingual code generation models on the HumanEval benchmark and MultiPL-E.                                                                   
                                                             |\n| [BIRD 排行榜](https:\u002F\u002Fbird-bench.github.io)                                                              | BIRD 包含超过 12,751 对独特的“问题-SQL”组合，涵盖 95 个大型数据库，总数据量达 33.4 GB。此外，还覆盖了区块链、冰球、医疗保健、教育等超过 37 个专业领域。                                                 |\n| [CanAiCode 排行榜](https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fmike-ravkine\u002Fcan-ai-code-results)                       | CanAiCode 排行榜                                                                                                                                                                                                                                            |\n| [Coding LLMs 排行榜](https:\u002F\u002Fleaderboard.tabbyml.com)                                                    | Coding LLMs 排行榜                                                                                                                                                                                                                                          |\n| [CRUXEval 排行榜](https:\u002F\u002Fcrux-eval.github.io\u002Fleaderboard.html)                                          | CRUXEval 是一个与 HumanEval 和 MBPP 相辅相成的基准，用于衡量代码推理、理解和执行能力！                                                                                                                                                                |\n| [EvalPlus 排行榜](https:\u002F\u002Fevalplus.github.io\u002Fleaderboard.html)                                           | EvalPlus 通过严格的测试来评估 AI 编码器。                                                                                                                                                                                                                  |\n| [InfiBench 排行榜](https:\u002F\u002Finfi-coder.github.io\u002Finfibench\u002F)                                              | InfiBench 是一个全面的代码领域大型语言模型基准，用于评估模型在回答代码领域的自由形式现实世界问题方面的能力。                                      
                                                              |\n| [InterCode 排行榜](https:\u002F\u002Fintercode-benchmark.github.io)                                                | InterCode 是一个用于评估语言模型在交互式编码任务上表现的基准。给定一个自然语言请求，要求智能体通过代码与软件系统（如数据库、终端）进行交互以解决问题。                        |\n| [程序合成模型排行榜](https:\u002F\u002Faccubits.com\u002Fopen-source-program-synthesis-models-leaderboard)             | 该排行榜旨在通过直观的领导者象限图，帮助研究人员轻松找出最佳的开源模型；它评估开源代码模型的表现，并依据能力和市场采用情况进行排名。 |\n| [Spider 排行榜](https:\u002F\u002Fyale-lily.github.io\u002Fspider)                                                      | Spider 是一个由 11 名耶鲁大学学生标注的大规模、复杂且跨领域的语义解析与文本转 SQL 数据集。Spider 挑战的目标是开发跨领域数据库的自然语言接口。                                   |\n\n&nbsp;\n\n\n## 📚 优秀代码大模型论文\n\n### 🌊 优秀的代码预训练论文\n| 标题                                                                                                                                                                                                                                                  | 会议\u002F期刊      | 发表日期      | 代码                                                       | 资源                                                                         |\n|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------|-----------|------------------------------------------------------------|-----------------------------------------------------------------------------------|\n| ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FOpenCoder-llm\u002FOpenCoder-llm.svg?style=social&label=Star) \u003Cbr> [**OpenCoder：顶级代码大语言模型的开源 Cookbook**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2411.04905) \u003Cbr>                            | `预印本` | `2024.11` | [Github](https:\u002F\u002Fgithub.com\u002FOpenCoder-llm\u002FOpenCoder-llm)   | 
[HF](https:\u002F\u002Fhuggingface.co\u002Finfly\u002FOpenCoder-8B-Instruct)                          |\n| ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FQwenLM\u002FQwen2.5-Coder.svg?style=social&label=Star) \u003Cbr> [**Qwen2.5-Coder 技术报告**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2409.12186) \u003Cbr>                                                                         | `预印本` | `2024.09` | [Github](https:\u002F\u002Fgithub.com\u002FQwenLM\u002FQwen2.5-Coder)          | [HF](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen2.5-Coder-32B-Instruct)                      |\n| ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fdeepseek-ai\u002FDeepSeek-Coder-V2.svg?style=social&label=Star) \u003Cbr> [**DeepSeek-Coder-V2：突破代码智能领域闭源模型的壁垒**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2406.11931) \u003Cbr>          | `预印本` | `2024.06` | [Github](https:\u002F\u002Fgithub.com\u002Fdeepseek-ai\u002FDeepSeek-Coder-V2) | [HF](https:\u002F\u002Fhuggingface.co\u002Fdeepseek-ai\u002FDeepSeek-Coder-V2-Instruct)               |\n| ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fbigcode-project\u002Fstarcoder2.svg?style=social&label=Star) \u003Cbr> [**StarCoder 2 和 Stack v2：下一代**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2402.19173) \u003Cbr>                                                | `预印本` | `2024.02` | [Github](https:\u002F\u002Fgithub.com\u002Fbigcode-project\u002Fstarcoder2)    | [HF](https:\u002F\u002Fhuggingface.co\u002Fbigcode)                                              |\n| ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fdeepseek-ai\u002FDeepSeek-Coder.svg?style=social&label=Star) \u003Cbr> [**DeepSeek-Coder：当大语言模型遇上编程——代码智能的崛起**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2401.14196) \u003Cbr> | `预印本` | `2024.01` | [Github](https:\u002F\u002Fgithub.com\u002Fdeepseek-ai\u002FDeepSeek-Coder)    | 
[HF](https:\u002F\u002Fhuggingface.co\u002Fdeepseek-ai\u002Fdeepseek-coder-33b-instruct)              |\n| ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fmeta-llama\u002Fcodellama.svg?style=social&label=Star) \u003Cbr> [**Code Llama：用于代码的开放基础模型**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2308.12950) \u003Cbr>                                                            | `预印本` | `2023.08` | [Github](https:\u002F\u002Fgithub.com\u002Fmeta-llama\u002Fcodellama)          | [HF](https:\u002F\u002Fhuggingface.co\u002Fmeta-llama\u002FCodeLlama-7b-hf)                           |\n| [**只需教科书就够了**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2306.11644) \u003Cbr>                                                                                                                                                                                | `预印本` | `2023.06` | -                                                          | [HF](https:\u002F\u002Fhuggingface.co\u002Fmicrosoft\u002Fphi-1)                                      |\n| ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fsalesforce\u002FCodeT5.svg?style=social&label=Star) \u003Cbr> [**CodeT5+：用于代码理解与生成的开放代码大语言模型**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2305.07922) \u003Cbr>                            | `预印本` | `2023.05` | [Github](https:\u002F\u002Fgithub.com\u002Fsalesforce\u002FCodeT5)             | [HF](https:\u002F\u002Fhuggingface.co\u002FSalesforce\u002Fcodet5p-16b)                               |\n| ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fbigcode-project\u002Fstarcoder.svg?style=social&label=Star) \u003Cbr> [**StarCoder：愿源代码与你同在！**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2305.06161) \u003Cbr>                                                            | `预印本` | `2023.05` | [Github](https:\u002F\u002Fgithub.com\u002Fbigcode-project\u002Fstarcoder)     | [HF](https:\u002F\u002Fhuggingface.co\u002Fbigcode\u002Fstarcoder)                              
      |\n| ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fsalesforce\u002FCodeGen.svg?style=social&label=Star) \u003Cbr> [**CodeGen2：训练大语言模型处理编程与自然语言的经验教训**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2305.02309) \u003Cbr>                                 | `ICLR'23`  | `2023.05` | [Github](https:\u002F\u002Fgithub.com\u002Fsalesforce\u002FCodeGen)            | [HF](https:\u002F\u002Fhuggingface.co\u002FSalesforce\u002Fcodegen25-7b-multi_P)                      |\n| ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FTHUDM\u002FCodeGeeX.svg?style=social&label=Star) \u003Cbr> [**CodeGeeX：一款经过预训练的代码生成模型，并在 HumanEval-X 上进行了多语言评估**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2303.17568) \u003Cbr>               | `预印本` | `2023.03` | [Github](https:\u002F\u002Fgithub.com\u002FTHUDM\u002FCodeGeeX)                | [HF](https:\u002F\u002Fhuggingface.co\u002Fcollections\u002FTHUDM\u002Fcodegeex4-6694e777e98246f00632fcf1) |\n| [**SantaCoder：别去摘星星！**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2301.03988) \u003Cbr>                                                                                                                                                                    | `预印本` | `2023.01` | -                                                          | [HF](https:\u002F\u002Fhuggingface.co\u002Fbigcode\u002Fsantacoder)                                   |\n| ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fsalesforce\u002FCodeGen.svg?style=social&label=Star) \u003Cbr> [**CodeGen：一款用于代码的开放大语言模型，支持多轮程序合成**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2203.13474) \u003Cbr>                         | `ICLR'23`  | `2022.03` | [Github](https:\u002F\u002Fgithub.com\u002Fsalesforce\u002FCodeGen)            | [HF](https:\u002F\u002Fhuggingface.co\u002FSalesforce\u002Fcodegen25-7b-multi_P)                      |\n| 
![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fopenai\u002Fhuman-eval.svg?style=social&label=Star) \u003Cbr> [**评估在代码上训练的大语言模型**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2107.03374) \u003Cbr>                                                          | `预印本` | `2021.07` | [Github](https:\u002F\u002Fgithub.com\u002Fopenai\u002Fhuman-eval)             | -                                                                                 |\n\n&nbsp;\n\n### 🐳 优秀的代码指令微调论文\n| 标题                                                                                                                                                                                                                                                | 会议      | 日期      | 代码                                                  | 资源                                                      |\n|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------|-----------|-------------------------------------------------------|----------------------------------------------------------------|\n| ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fise-uiuc\u002Fmagicoder.svg?style=social&label=Star) \u003Cbr> [**Magicoder：只需源代码即可**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2312.02120) \u003Cbr>                                                                 | `ICML'24`  | `2023.12` | [Github](https:\u002F\u002Fgithub.com\u002Fise-uiuc\u002Fmagicoder)       | [HF](https:\u002F\u002Fhuggingface.co\u002Fise-uiuc\u002FMagicoder-DS-6.7B)        |\n| ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fbigcode-project\u002Foctopack.svg?style=social&label=Star) \u003Cbr> [**OctoPack：指令微调代码大型语言模型**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2308.07124) \u003Cbr>                           
               | `ICLR'24`  | `2023.08` | [Github](https:\u002F\u002Fgithub.com\u002Fbigcode-project\u002Foctopack) | [HF](https:\u002F\u002Fhuggingface.co\u002Fbigcode\u002Foctocoder)                 |\n| ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fnlpxucan\u002FWizardLM.svg?style=social&label=Star) \u003Cbr> [**WizardCoder：通过 Evol-Instruct 赋能代码大型语言模型**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2306.08568) \u003Cbr>                                   | `预印本`   | `2023.07` | [Github](https:\u002F\u002Fgithub.com\u002Fnlpxucan\u002FWizardLM)        | [HF](https:\u002F\u002Fhuggingface.co\u002FWizardLMTeam\u002FWizardCoder-15B-V1.0) |\n| ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fsahil280114\u002Fcodealpaca.svg?style=social&label=Star) \u003Cbr> [**Code Alpaca：基于代码生成指令训练的指令遵循 LLaMA 模型**](https:\u002F\u002Fgithub.com\u002Fsahil280114\u002Fcodealpaca) \u003Cbr> | `预印本`   | `2023.xx` | [Github](https:\u002F\u002Fgithub.com\u002Fsahil280114\u002Fcodealpaca)   | [HF](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fsahil2801\u002FCodeAlpaca-20k) |\n\n&nbsp;\n\n\n### 🐬 优秀的代码对齐论文\n| 标题                                                                                                                                                                                                                                    | 会议        | 日期      | 代码                                                          | 资源 |\n|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------|-----------|---------------------------------------------------------------|-----------|\n| [**ProSec：通过主动安全对齐强化代码 LLM**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2411.12882) \u003Cbr>                                                                                                                              | `预印本`     | `2024.11` | -    
                                                          | -         |\n| [**PLUM：偏好学习加测试用例带来更优的代码语言模型**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2406.06887) \u003Cbr>                                                                                                                | `预印本`     | `2024.06` | -                                                             | -         |\n| [**PanGu-Coder2：通过排序反馈提升代码大型语言模型**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2307.14936) \u003Cbr>                                                                                                                 | `预印本`     | `2023.07` | -                                                             | -         |\n| ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FZyq-scut\u002FRLTF.svg?style=social&label=Star) \u003Cbr> [**RLTF：基于单元测试反馈的强化学习**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2307.04349) \u003Cbr>                                            | `预印本`     | `2023.07` | [Github](https:\u002F\u002Fgithub.com\u002FZyq-scut\u002FRLTF)                    | -         |\n| ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Freddy-lab-code-research\u002FPPOCoder.svg?style=social&label=Star) \u003Cbr> [**基于执行的代码生成：深度强化学习应用**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2301.13816) \u003Cbr>            | `TMLR'23`    | `2023.01` | [Github](https:\u002F\u002Fgithub.com\u002Freddy-lab-code-research\u002FPPOCoder) | -         |\n| ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fsalesforce\u002FCodeRL.svg?style=social&label=Star) \u003Cbr> [**CodeRL：通过预训练模型与深度强化学习掌握代码生成**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2207.01780) \u003Cbr> | `NeurIPS'22` | `2022.07` | [Github](https:\u002F\u002Fgithub.com\u002Fsalesforce\u002FCodeRL)                | -         |\n                                                    \n&nbsp;\n\n### 🐋 优秀的代码提示论文\n| 标题                                                                                          
                                                                                                                                                               | 会议名称      | 发表日期      | 代码                                                                   | 资源 |\n|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------|-----------|------------------------------------------------------------------------|-----------|\n| ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FYerbaPage\u002FMGDebugger.svg?style=social&label=Star) \u003Cbr> [**从代码到正确性：通过分层调试弥合代码生成的最后一公里**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2410.01215) \u003Cbr>                | `预印本` | `2024.10` | [Github](https:\u002F\u002Fgithub.com\u002FYerbaPage\u002FMGDebugger)                      | -         |\n| ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FHambaobao\u002FHCP-Coder.svg?style=social&label=Star) \u003Cbr> [**分层上下文剪枝：利用仓库级预训练代码大模型优化现实世界中的代码补全**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2406.18294) \u003Cbr> | `AAAI'25`  | `2024.06` | [Github](https:\u002F\u002Fgithub.com\u002FHambaobao\u002FHCP-Coder)                       | -         |\n| ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FFloridSleeves\u002FLLMDebugger.svg?style=social&label=Star) \u003Cbr> [**像人类一样调试：通过逐步验证运行时执行的大语言模型调试器**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2402.16906) \u003Cbr>         | `ACL'24`   | `2024.02` | [Github](https:\u002F\u002Fgithub.com\u002FFloridSleeves\u002FLLMDebugger)                 | -         |\n| [**SelfEvolve：基于大语言模型的代码演化框架**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2306.02907) \u003Cbr>                                                                                                                                               
  | `预印本` | `2023.06` | -                                                                      | -         |\n| ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Ftheoxo\u002Fself-repair.svg?style=social&label=Star) \u003Cbr> [**揭秘 GPT 的代码生成自修复机制**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2306.09896) \u003Cbr>                                                                | `ICLR'24`  | `2023.06` | [Github](https:\u002F\u002Fgithub.com\u002Ftheoxo\u002Fself-repair)                        | -         |\n| [**教导大语言模型进行自我调试**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2304.05128) \u003Cbr>                                                                                                                                                                     | `ICLR'24`  | `2023.06` | -                                                                      | -         |\n| ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fniansong1996\u002Flever.svg?style=social&label=Star) \u003Cbr> [**LEVER：通过执行学习验证语言到代码的生成**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2302.08468) \u003Cbr>                                            | `ICML'23`  | `2023.02` | [Github](https:\u002F\u002Fgithub.com\u002Fniansong1996\u002Flever)                        | -         |\n| ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Ffacebookresearch\u002Fcoder_reviewer_reranking.svg?style=social&label=Star) \u003Cbr> [**用于代码生成的 Coder-Reviewer 重排序**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2211.16490) \u003Cbr>                                             | `ICML'23`  | `2022.11` | [Github](https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Fcoder_reviewer_reranking) | -         |\n| ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fmicrosoft\u002FCodeT.svg?style=social&label=Star) \u003Cbr> [**CodeT：借助生成的测试进行代码生成**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2207.10397) \u003Cbr>                                                                  
      | `ICLR'23`  | `2022.07` | [Github](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FCodeT)                           | -         |\n\n&nbsp;\n\n### 🐙 优秀的代码基准与评估论文\n| 数据集           | 标题                                                                                                                                                                                                                                                           | 会议\u002F期刊        | 发表日期      | 代码                                                                                          | 资源                                                                                                           |\n|:------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------|-----------|-----------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------|\n| `CodeArena`       | ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FQwenLM\u002FQwen2.5-Coder.svg?style=social&label=Star) \u003Cbr> [**在人类偏好上评估与对齐代码LLM**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2412.05210) \u003Cbr>                                                            | `预印本`   | `2024.12` | [Github](https:\u002F\u002Fgithub.com\u002FQwenLM\u002FQwen2.5-Coder\u002Ftree\u002Fmain\u002Fqwencoder-eval\u002Finstruct\u002FCodeArena) | [HF](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FCSJianYang\u002FCodeArena)                                                          |\n| `FullStack Bench` | ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fbytedance\u002FFullStackBench.svg?style=social&label=Star) \u003Cbr> [**FullStack 
Bench：评估LLM作为全栈编码器**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2412.00535) \u003Cbr>                                                       | `预印本`   | `2024.12` | [Github](https:\u002F\u002Fgithub.com\u002Fbytedance\u002FFullStackBench)                                         | [HF](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FByteDance\u002FFullStackBench) [Github](https:\u002F\u002Fgithub.com\u002Fbytedance\u002FSandboxFusion) |\n| `GitChameleon`    | ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FNizarIslah\u002FGitChameleon.svg?style=social&label=Star) \u003Cbr> [**GitChameleon：揭示代码生成模型的版本切换能力**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2411.05830) \u003Cbr>                         | `预印本`   | `2024.11` | [Github](https:\u002F\u002Fgithub.com\u002FNizarIslah\u002FGitChameleon)                                          | -                                                                                                                   |\n| `Evalperf`        | ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fevalplus\u002Fevalplus.svg?style=social&label=Star) \u003Cbr> [**评估语言模型以实现高效代码生成**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2408.06450) \u003Cbr>                                                           | `COLM'24`    | `2024.08` | [Github](https:\u002F\u002Fgithub.com\u002Fevalplus\u002Fevalplus)                                                | [HF](https:\u002F\u002Fhuggingface.co\u002Fevalplus)                                                                               |\n| `LiveCodeBench`   | ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FLiveCodeBench\u002FLiveCodeBench.svg?style=social&label=Star) \u003Cbr> [**LiveCodeBench：对大型语言模型进行代码生成的全面且无污染评估**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2403.07974) \u003Cbr>              | `预印本`   | `2024.03` | [Github](https:\u002F\u002Fgithub.com\u002FLiveCodeBench\u002FLiveCodeBench)                                      | 
[HF](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Flivecodebench\u002Fcode_generation_lite)                                            |\n| `DevBench`        | ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fopen-compass\u002FDevBench.svg?style=social&label=Star) \u003Cbr> [**DevBench：软件开发的综合基准**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2403.08604) \u003Cbr>                                                   | `预印本`   | `2024.03` | [Github](https:\u002F\u002Fgithub.com\u002Fopen-compass\u002FDevBench)                                            | -                                                                                                                   |\n| `SWE-bench`       | ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fprinceton-nlp\u002FSWE-bench.svg?style=social&label=Star) \u003Cbr> [**SWE-bench：语言模型能否解决真实的GitHub问题？**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2310.06770) \u003Cbr>                                             | `ICLR'24`    | `2024.03` | [Github](https:\u002F\u002Fgithub.com\u002Fprinceton-nlp\u002FSWE-bench)                                          | [HF](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fprinceton-nlp\u002FSWE-bench)                                                       |\n| `CrossCodeEval`   | ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Famazon-science\u002Fcceval.svg?style=social&label=Star) \u003Cbr> [**CrossCodeEval：跨文件代码补全的多样化与多语言基准**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2310.11248) \u003Cbr>                             | `NeurIPS'23` | `2023.11` | [Github](https:\u002F\u002Fgithub.com\u002Famazon-science\u002Fcceval)                                            | -                                                                                                                   |\n| `RepoCoder`       | ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fmicrosoft\u002FCodeT.svg?style=social&label=Star) \u003Cbr> 
[**通过迭代检索与生成实现仓库级代码补全**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2303.12570) \u003Cbr>                                          | `EMNLP'23`   | `2023.10` | [Github](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FCodeT\u002Ftree\u002Fmain\u002FRepoCoder)                              | -                                                                                                                   |\n| `LongCoder`       | ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fmicrosoft\u002FCodeBERT.svg?style=social&label=Star) \u003Cbr> [**LongCoder：用于代码补全的长距离预训练语言模型**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2306.14893) \u003Cbr>                                            | `ICML'23`    | `2023.10` | [Github](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FCodeBERT)                                               | -                                                                                                                   |\n| -                 | [**ChatGPT能否取代StackOverflow？大型语言模型代码生成的鲁棒性与可靠性研究**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2308.10335) \u003Cbr>                                                                                                   | `预印本`   | `2023.08` | -                                                                                             | -                                                                                                                   |\n| `BioCoder`        | ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fgersteinlab\u002FBioCoder.svg?style=social&label=Star) \u003Cbr> [**BioCoder：利用大型语言模型进行生物信息学代码生成的基准**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2308.16458) \u003Cbr>                             | `ISMB'24`    | `2023.08` | [Github](https:\u002F\u002Fgithub.com\u002Fgersteinlab\u002FBioCoder)                                             | -                                                                                                                   |\n| 
`RepoBench`       | ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FLeolty\u002Frepobench.svg?style=social&label=Star) \u003Cbr> [**RepoBench：对仓库级代码自动补全系统的基准测试**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2306.03091) \u003Cbr>                                               | `ICLR'24`    | `2023.06` | [Github](https:\u002F\u002Fgithub.com\u002FLeolty\u002Frepobench)                                                 | [HF](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Ftianyang\u002Frepobench_python_v1.1)                                                |\n| `Evalplus`        | ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fevalplus\u002Fevalplus.svg?style=social&label=Star) \u003Cbr> [**ChatGPT 生成的代码真的正确吗？对大型语言模型代码生成的严格评估**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2305.01210) \u003Cbr> | `NeurIPS'23` | `2023.05` | [Github](https:\u002F\u002Fgithub.com\u002Fevalplus\u002Fevalplus)                                                | [HF](https:\u002F\u002Fhuggingface.co\u002Fevalplus)                                                                               |\n| `Coeditor`        | ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FMrVPlusOne\u002FCoeditor.svg?style=social&label=Star) \u003Cbr> [**Coeditor：利用上下文变化进行多轮代码自动编辑**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2305.18584) \u003Cbr>                                        | `ICLR'24`    | `2023.05` | [Github](https:\u002F\u002Fgithub.com\u002FMrVPlusOne\u002FCoeditor)                                              | -                                                                                                                   |\n| `DS-1000`         | ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fxlang-ai\u002FDS-1000.svg?style=social&label=Star) \u003Cbr> [**DS-1000：自然且可靠的数据科学代码生成基准**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2211.11501) \u003Cbr>                                          | `ICML'23`    | `2022.11` | 
[Github](https:\u002F\u002Fgithub.com\u002Fxlang-ai\u002FDS-1000)                                                 | [HF](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fxlangai\u002FDS-1000)                                                               |\n| `MultiPL-E`       | ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fnuprl\u002FMultiPL-E.svg?style=social&label=Star) \u003Cbr> [**MultiPL-E：可扩展且可延展的神经网络代码生成基准方法**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2208.08227) \u003Cbr>                                 | `预印本`   | `2022.08` | [Github](https:\u002F\u002Fgithub.com\u002Fnuprl\u002FMultiPL-E)                                                  | [HF](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fnuprl\u002FMultiPL-E)                                                               |\n| `MBPP`            | ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fgoogle-research\u002Fgoogle-research.svg?style=social&label=Star) \u003Cbr> [**利用大型语言模型进行程序合成**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2108.07732) \u003Cbr>                                                         | `预印本`   | `2021.08` | [Github](https:\u002F\u002Fgithub.com\u002Fgoogle-research\u002Fgoogle-research\u002Fblob\u002Fmaster\u002Fmbpp\u002FREADME.md)       | [HF](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fgoogle-research-datasets\u002Fmbpp)                                                 |\n| `APPS`            | ![Star](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fhendrycks\u002Fapps.svg?style=social&label=Star) \u003Cbr> [**用APPS衡量编码挑战能力**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2105.09938) \u003Cbr>                                                                       | `NeurIPS'21` | `2021.05` | [Github](https:\u002F\u002Fgithub.com\u002Fhendrycks\u002Fapps)                                                   | [HF](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fcodeparrot\u002Fapps)                                 
                              |\n\n&nbsp;\n\n\n\n## 🙌 贡献者\n\n\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fhuybery\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fhuybery_Awesome-Code-LLM_readme_8568104e56ba.png\"  width=\"50\" \u002F>\u003C\u002Fa>\n\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FYangjiaxi\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fhuybery_Awesome-Code-LLM_readme_2ddbd05723be.png\"  width=\"50\" \u002F>\u003C\u002Fa>\n\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FGanjinZero\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fhuybery_Awesome-Code-LLM_readme_f2184245403a.png\"  width=\"50\" \u002F>\u003C\u002Fa>\n\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FTyDunn\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fhuybery_Awesome-Code-LLM_readme_4894a50f39e2.png\"  width=\"50\" \u002F>\u003C\u002Fa>\n\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FHambaobao\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fhuybery_Awesome-Code-LLM_readme_d95a4fa5c430.png\"  width=\"50\" \u002F>\u003C\u002Fa>\n\n这是一个活跃的仓库，我们始终欢迎您的贡献！如果您对这份带有鲜明观点的列表有任何疑问，请随时通过 `huybery@gmail.com` 与我联系。\n\n&nbsp;\n\n## 引用格式\n\n```\n@software{awesome-code-llm,\n  author = {Binyuan Hui and Lei Zhang},\n  title = {一份为研究精心整理的优秀代码LLM清单},\n  howpublished = {\\url{https:\u002F\u002Fgithub.com\u002Fhuybery\u002FAwesome-Code-LLM}},\n  year = 2023,\n}\n```\n\n&nbsp;\n\n## 致谢\n\n本项目灵感来源于 [Awesome-LLM](https:\u002F\u002Fgithub.com\u002FHannibal046\u002FAwesome-LLM)。\n\n&nbsp;\n\n## 星标历史\n\n[![星标历史图表](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fhuybery_Awesome-Code-LLM_readme_1d1147fe0443.png)](https:\u002F\u002Fstar-history.com\u002F#huybery\u002FAwesome-Code-LLM&Date)\n\n\n**[⬆ 返回目录](#-table-of-contents)**","# Awesome-Code-LLM 快速上手指南\n\n## 环境准备\n\n### 系统要求\n- 操作系统：Linux \u002F macOS \u002F Windows（推荐使用 Linux 或 macOS）\n- Python 版本：3.8 及以上\n- GPU（可选）：推荐使用 NVIDIA GPU 
以加速模型推理，支持 CUDA 的显卡效果更佳\n\n### 前置依赖\n确保已安装以下工具：\n- `git`\n- `pip`\n- Python 3.8+\n\n建议使用国内镜像源安装依赖包，例如使用清华源：\n\n```bash\npip install -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple \u003Cpackage_name>\n```\n\n---\n\n## 安装步骤\n\n1. 克隆仓库到本地：\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002Fhuybery\u002FAwesome-Code-LLM.git\ncd Awesome-Code-LLM\n```\n\n2. 安装项目依赖：\n\n```bash\npip install -r requirements.txt\n```\n\n> 如果遇到依赖安装缓慢问题，可以使用国内镜像源替换默认源：\n\n```bash\npip install -r requirements.txt -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple\n```\n\n---\n\n## 基本使用\n\nAwesome-Code-LLM 是一个整理了代码大语言模型（Code LLMs）资源的开源项目，主要用于展示和比较当前主流的代码生成模型。它本身不提供模型训练或推理功能，而是作为资源索引和评估工具。\n\n### 示例：查看顶级代码大语言模型列表\n\n运行以下命令，查看当前排名靠前的代码大语言模型信息：\n\n```bash\npython scripts\u002Flist_top_code_llms.py\n```\n\n该脚本会输出类似如下内容（部分示例）：\n\n```\nRank | Model                              | Params | HumanEval | MBPP | Source\n-----|------------------------------------|--------|-----------|------|-------\n1    | o1-mini-2024-09-12                 | -      | 97.6      | 93.9 | [paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2409.12186)\n...\n```\n\n### 示例：获取最新论文与工具链接\n\n运行以下命令，获取最新的论文、工具和评估框架信息：\n\n```bash\npython scripts\u002Flist_papers_and_tools.py\n```\n\n这将列出包括代码预训练、指令微调、对齐、提示方法以及基准测试等领域的相关研究和工具。\n\n---\n\n如需进一步了解模型细节或参与贡献，请参考项目的完整 README 文件。","某软件开发团队正在为一个大型企业客户开发定制化的数据分析平台，需要集成多个第三方 API 并实现复杂的算法逻辑。团队成员希望快速找到性能优异、经过验证的代码大模型（Code LLM）来提升开发效率和代码质量。\n\n### 没有 Awesome-Code-LLM 时\n- 团队需要手动搜索大量论文、GitHub 仓库和社区讨论，难以快速筛选出高质量的代码大模型。\n- 缺乏对不同模型在实际任务中的表现（如 HumanEval 和 MBPP 分数）的直观对比，导致选型困难。\n- 对模型的参数规模、训练方法和适用场景缺乏系统性了解，容易选择不适合项目需求的模型。\n- 难以跟踪最新的研究成果和开源模型发布动态，信息更新滞后。\n- 无法直接获取模型的官方链接或文档，增加了试用和部署的难度。\n\n### 使用 Awesome-Code-LLM 后\n- 团队可以直接访问一个结构清晰、分类明确的代码大模型列表，快速定位到适合项目的模型。\n- 通过排行榜和评估指标（如 HumanEval Pass@1 和 MBPP 分数），能够直观比较不同模型的性能，辅助决策。\n- 可以根据模型的参数量、训练方式和应用场景，精准匹配项目需求，提高开发效率。\n- 实时获取最新发布的模型和研究论文，确保使用前沿技术。\n- 直接从列表中获取模型的官方链接和文档，简化了模型的试用和部署流程。\n\nAwesome-Code-LLM 
提供了一个高效、系统的代码大模型资源库，帮助开发团队快速找到最适合其需求的工具，显著提升了研发效率与代码质量。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fhuybery_Awesome-Code-LLM_e0356535.png","huybery","Binyuan Hui","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Fhuybery_8568104e.png","🐚 Code Less, Make More |\r\nQwen Team | Alibaba Group","Formerly @ Qwen","Singapore","huybery@gmail.com","http:\u002F\u002Fhuybery.github.io","https:\u002F\u002Fgithub.com\u002Fhuybery",null,70,"2026-04-05T07:27:13","MIT","未说明",{"notes":90,"python":88,"dependencies":91},"该项目是一个代码大语言模型的资源汇总列表，不包含具体运行环境需求。用户可根据所选模型的文档查看其具体运行要求。",[88],[15],[94,95,96],"awesome","code-generation","large-language-models","2026-03-27T02:49:30.150509","2026-04-06T05:35:29.585719",[100,105,110,115,120,125],{"id":101,"question_zh":102,"answer_zh":103,"source_url":104},5876,"如何添加 DeepSeek Coder 到 Awesome-Code-LLM 项目中？","维护者已确认将 DeepSeek Coder 添加到项目中，但需要用户提醒后才会完成更新。","https:\u002F\u002Fgithub.com\u002Fhuybery\u002FAwesome-Code-LLM\u002Fissues\u002F10",{"id":106,"question_zh":107,"answer_zh":108,"source_url":109},5875,"如何查找基于 RLHF 训练的代码模型？","目前公开的信息有限，除了 Codellama 外，其他基于 RLHF 训练的代码模型较少。Codellama 的 RLHF 训练信息主要来自其 README 中提到的安全性优化措施，以及在面对有害请求时会拒绝执行的行为。","https:\u002F\u002Fgithub.com\u002Fhuybery\u002FAwesome-Code-LLM\u002Fissues\u002F4",{"id":111,"question_zh":112,"answer_zh":113,"source_url":114},5877,"如何提交新的模型或更新现有模型信息？","可以通过提交 Pull Request（PR）的方式添加新模型或更新现有模型信息，例如 MetaGPT 或 Magicoder。","https:\u002F\u002Fgithub.com\u002Fhuybery\u002FAwesome-Code-LLM\u002Fissues\u002F23",{"id":116,"question_zh":117,"answer_zh":118,"source_url":119},5878,"如何更新 leaderboard 部分以包含更多模型？","可以建议将 leaderboard 改为链接列表形式，以便用户根据需求选择参考，类似其他项目的实现方式。","https:\u002F\u002Fgithub.com\u002Fhuybery\u002FAwesome-Code-LLM\u002Fissues\u002F8",{"id":121,"question_zh":122,"answer_zh":123,"source_url":124},5879,"Leaderboard 中的分数是否可靠且实时？","Leaderboard 中的分数可能与第三方平台（如 
paperswithcode）存在差异，具体数据来源需进一步核实。","https:\u002F\u002Fgithub.com\u002Fhuybery\u002FAwesome-Code-LLM\u002Fissues\u002F7",{"id":126,"question_zh":127,"answer_zh":128,"source_url":129},5880,"如何修正 README 中的错误链接？","发现 README 中的链接错误后，可以直接提交 Pull Request 进行修正。","https:\u002F\u002Fgithub.com\u002Fhuybery\u002FAwesome-Code-LLM\u002Fissues\u002F16",[]]