[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-Codium-ai--AlphaCodium":3,"tool-Codium-ai--AlphaCodium":65},[4,17,27,35,48,57],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":16},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",146793,2,"2026-04-08T23:32:35",[13,14,15],"开发框架","Agent","语言模型","ready",{"id":18,"name":19,"github_repo":20,"description_zh":21,"stars":22,"difficulty_score":23,"last_commit_at":24,"category_tags":25,"status":16},4487,"LLMs-from-scratch","rasbt\u002FLLMs-from-scratch","LLMs-from-scratch 是一个基于 PyTorch 的开源教育项目，旨在引导用户从零开始一步步构建一个类似 ChatGPT 的大型语言模型（LLM）。它不仅是同名技术著作的官方代码库，更提供了一套完整的实践方案，涵盖模型开发、预训练及微调的全过程。\n\n该项目主要解决了大模型领域“黑盒化”的学习痛点。许多开发者虽能调用现成模型，却难以深入理解其内部架构与训练机制。通过亲手编写每一行核心代码，用户能够透彻掌握 Transformer 架构、注意力机制等关键原理，从而真正理解大模型是如何“思考”的。此外，项目还包含了加载大型预训练权重进行微调的代码，帮助用户将理论知识延伸至实际应用。\n\nLLMs-from-scratch 特别适合希望深入底层原理的 AI 开发者、研究人员以及计算机专业的学生。对于不满足于仅使用 API，而是渴望探究模型构建细节的技术人员而言，这是极佳的学习资源。其独特的技术亮点在于“循序渐进”的教学设计：将复杂的系统工程拆解为清晰的步骤，配合详细的图表与示例，让构建一个虽小但功能完备的大模型变得触手可及。无论你是想夯实理论基础，还是为未来研发更大规模的模型做准备",90106,3,"2026-04-06T11:19:32",[15,26,14,13],"图像",{"id":28,"name":29,"github_repo":30,"description_zh":31,"stars":32,"difficulty_score":10,"last_commit_at":33,"category_tags":34,"status":16},3704,"NextChat","ChatGPTNextWeb\u002FNextChat","NextChat 是一款轻量且极速的 AI 助手，旨在为用户提供流畅、跨平台的大模型交互体验。它完美解决了用户在多设备间切换时难以保持对话连续性，以及面对众多 AI 模型不知如何统一管理的痛点。无论是日常办公、学习辅助还是创意激发，NextChat 
都能让用户随时随地通过网页、iOS、Android、Windows、MacOS 或 Linux 端无缝接入智能服务。\n\n这款工具非常适合普通用户、学生、职场人士以及需要私有化部署的企业团队使用。对于开发者而言，它也提供了便捷的自托管方案，支持一键部署到 Vercel 或 Zeabur 等平台。\n\nNextChat 的核心亮点在于其广泛的模型兼容性，原生支持 Claude、DeepSeek、GPT-4 及 Gemini Pro 等主流大模型，让用户在一个界面即可自由切换不同 AI 能力。此外，它还率先支持 MCP（Model Context Protocol）协议，增强了上下文处理能力。针对企业用户，NextChat 提供专业版解决方案，具备品牌定制、细粒度权限控制、内部知识库整合及安全审计等功能，满足公司对数据隐私和个性化管理的高标准要求。",87618,"2026-04-05T07:20:52",[13,15],{"id":36,"name":37,"github_repo":38,"description_zh":39,"stars":40,"difficulty_score":10,"last_commit_at":41,"category_tags":42,"status":16},2268,"ML-For-Beginners","microsoft\u002FML-For-Beginners","ML-For-Beginners 是由微软推出的一套系统化机器学习入门课程，旨在帮助零基础用户轻松掌握经典机器学习知识。这套课程将学习路径规划为 12 周，包含 26 节精炼课程和 52 道配套测验，内容涵盖从基础概念到实际应用的完整流程，有效解决了初学者面对庞大知识体系时无从下手、缺乏结构化指导的痛点。\n\n无论是希望转型的开发者、需要补充算法背景的研究人员，还是对人工智能充满好奇的普通爱好者，都能从中受益。课程不仅提供了清晰的理论讲解，还强调动手实践，让用户在循序渐进中建立扎实的技能基础。其独特的亮点在于强大的多语言支持，通过自动化机制提供了包括简体中文在内的 50 多种语言版本，极大地降低了全球不同背景用户的学习门槛。此外，项目采用开源协作模式，社区活跃且内容持续更新，确保学习者能获取前沿且准确的技术资讯。如果你正寻找一条清晰、友好且专业的机器学习入门之路，ML-For-Beginners 将是理想的起点。",85052,"2026-04-08T11:03:08",[26,43,44,45,14,46,15,13,47],"数据工具","视频","插件","其他","音频",{"id":49,"name":50,"github_repo":51,"description_zh":52,"stars":53,"difficulty_score":54,"last_commit_at":55,"category_tags":56,"status":16},5784,"funNLP","fighting41love\u002FfunNLP","funNLP 是一个专为中文自然语言处理（NLP）打造的超级资源库，被誉为\"NLP 民工的乐园”。它并非单一的软件工具，而是一个汇集了海量开源项目、数据集、预训练模型和实用代码的综合性平台。\n\n面对中文 NLP 领域资源分散、入门门槛高以及特定场景数据匮乏的痛点，funNLP 提供了“一站式”解决方案。这里不仅涵盖了分词、命名实体识别、情感分析、文本摘要等基础任务的标准工具，还独特地收录了丰富的垂直领域资源，如法律、医疗、金融行业的专用词库与数据集，甚至包含古诗词生成、歌词创作等趣味应用。其核心亮点在于极高的全面性与实用性，从基础的字典词典到前沿的 BERT、GPT-2 模型代码，再到高质量的标注数据和竞赛方案，应有尽有。\n\n无论是刚刚踏入 NLP 领域的学生、需要快速验证想法的算法工程师，还是从事人工智能研究的学者，都能在这里找到急需的“武器弹药”。对于开发者而言，它能大幅减少寻找数据和复现模型的时间；对于研究者，它提供了丰富的基准测试资源和前沿技术参考。funNLP 以开放共享的精神，极大地降低了中文自然语言处理的开发与研究成本，是中文 AI 
社区不可或缺的宝藏仓库。",79857,1,"2026-04-08T20:11:31",[15,43,46],{"id":58,"name":59,"github_repo":60,"description_zh":61,"stars":62,"difficulty_score":23,"last_commit_at":63,"category_tags":64,"status":16},3128,"ragflow","infiniflow\u002Fragflow","RAGFlow 是一款领先的开源检索增强生成（RAG）引擎，旨在为大语言模型构建更精准、可靠的上下文层。它巧妙地将前沿的 RAG 技术与智能体（Agent）能力相结合，不仅支持从各类文档中高效提取知识，还能让模型基于这些知识进行逻辑推理和任务执行。\n\n在大模型应用中，幻觉问题和知识滞后是常见痛点。RAGFlow 通过深度解析复杂文档结构（如表格、图表及混合排版），显著提升了信息检索的准确度，从而有效减少模型“胡编乱造”的现象，确保回答既有据可依又具备时效性。其内置的智能体机制更进一步，使系统不仅能回答问题，还能自主规划步骤解决复杂问题。\n\n这款工具特别适合开发者、企业技术团队以及 AI 研究人员使用。无论是希望快速搭建私有知识库问答系统，还是致力于探索大模型在垂直领域落地的创新者，都能从中受益。RAGFlow 提供了可视化的工作流编排界面和灵活的 API 接口，既降低了非算法背景用户的上手门槛，也满足了专业开发者对系统深度定制的需求。作为基于 Apache 2.0 协议开源的项目，它正成为连接通用大模型与行业专有知识之间的重要桥梁。",77062,"2026-04-04T04:44:48",[14,26,13,15,46],{"id":66,"github_repo":67,"name":68,"description_en":69,"description_zh":70,"ai_summary_zh":71,"readme_en":72,"readme_zh":73,"quickstart_zh":74,"use_case_zh":75,"hero_image_url":76,"owner_login":77,"owner_name":78,"owner_avatar_url":79,"owner_bio":80,"owner_company":81,"owner_location":81,"owner_email":82,"owner_twitter":83,"owner_website":84,"owner_url":85,"languages":86,"stars":95,"forks":96,"last_commit_at":97,"license":98,"difficulty_score":10,"env_os":99,"env_gpu":100,"env_ram":101,"env_deps":102,"category_tags":107,"github_topics":108,"view_count":10,"oss_zip_url":81,"oss_zip_packed_at":81,"status":16,"created_at":114,"updated_at":115,"faqs":116,"releases":147},5686,"Codium-ai\u002FAlphaCodium","AlphaCodium","Official implementation for the paper: \"Code Generation with AlphaCodium: From Prompt Engineering to Flow Engineering\"\"","AlphaCodium 是由 CodiumAI 推出的开源项目，旨在提升大语言模型在代码生成任务中的表现。与传统自然语言生成不同，编写代码需要严格遵循语法、覆盖边界情况并处理大量细节，直接套用通用的提示工程技巧往往效果有限。AlphaCodium 通过引入“流工程”（Flow Engineering）理念，构建了一套基于测试的多阶段迭代流程，有效解决了这一痛点。\n\n该工具的核心亮点在于其独特的执行机制：它不依赖单次提示直接生成代码，而是引导模型经历问题分析、自我反思、生成多组候选方案、自动编写测试用例以及多轮代码修正等步骤。这种类似人类程序员“思考 - 尝试 - 验证”的闭环流程，显著提高了代码的通过率。在极具挑战性的 CodeContests 竞赛数据集上，AlphaCodium 将 GPT-4 的解题准确率从 19% 大幅提升至 
44%，甚至超越了部分专用模型。\n\nAlphaCodium 非常适合 AI 研究人员、开发者以及对高质量代码生成有需求的技术团队使用。无论是用于攻克复杂的算法竞赛题目，还是优化日常开发中的自动编码流程，它都提供了一套经过验证的高效方法论。通过该项目，用户不","AlphaCodium 是由 CodiumAI 推出的开源项目，旨在提升大语言模型在代码生成任务中的表现。与传统自然语言生成不同，编写代码需要严格遵循语法、覆盖边界情况并处理大量细节，直接套用通用的提示工程技巧往往效果有限。AlphaCodium 通过引入“流工程”（Flow Engineering）理念，构建了一套基于测试的多阶段迭代流程，有效解决了这一痛点。\n\n该工具的核心亮点在于其独特的执行机制：它不依赖单次提示直接生成代码，而是引导模型经历问题分析、自我反思、生成多组候选方案、自动编写测试用例以及多轮代码修正等步骤。这种类似人类程序员“思考 - 尝试 - 验证”的闭环流程，显著提高了代码的通过率。在极具挑战性的 CodeContests 竞赛数据集上，AlphaCodium 将 GPT-4 的解题准确率从 19% 大幅提升至 44%，甚至超越了部分专用模型。\n\nAlphaCodium 非常适合 AI 研究人员、开发者以及对高质量代码生成有需求的技术团队使用。无论是用于攻克复杂的算法竞赛题目，还是优化日常开发中的自动编码流程，它都提供了一套经过验证的高效方法论。通过该项目，用户不仅能获得更强的代码生成能力，还能深入理解如何将大模型的应用从简单的“提示词优化”进阶为系统化的“流程设计”。","# Code Generation with AlphaCodium: From Prompt Engineering to Flow Engineering\n\n[Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2401.08500) |\n[Dataset](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Ftalrid\u002FCodeContests_valid_and_test_AlphaCodium\u002Fblob\u002Fmain\u002Fcodecontests_valid_and_test_processed_alpha_codium.zip)\n\nOfficial Implementation\n> Tal Ridnik, Dedy Kredo, Itamar Friedman \u003Cbr\u002F> CodiumAI\n\n## News 2024-17-05\n\nUpdated AlphaCodium leaderboard with scores of new GPT models, and Claude3 Opus. 
\"GPT-4o\" Is currently the leading model on AlphaCodium.\n\n![image](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FCodium-ai_AlphaCodium_readme_954138ed9426.png)\n\n\n\n\n## Table of Contents\n- [Abstract](#abstract)\n- [Installation](#installation)\n- [How to run](#how-to-run)\n- [Technical Q&A](#technical-qa)\n- [Broader Applicability](#broader-applicability)\n- [Example Problem](#example-problem)\n- [Acknowledgments](#acknowledgments)\n- [Citation](#citation)\n\n## Abstract\n\nCode generation problems differ from common natural language problems - they require matching the exact syntax of the target language, identifying happy paths and edge cases, paying attention to numerous small details in the problem spec, and addressing other code-specific issues and requirements. Hence, many of the optimizations and tricks that have been successful in natural language generation may not be effective for code tasks.\n\nIn this work, we propose a new approach to code generation by LLMs, which we call AlphaCodium - a test-based, multi-stage, code-oriented iterative flow, that improves the performances of LLMs on code problems.\n\nWe tested AlphaCodium on a challenging code generation dataset called CodeContests, which includes competitive programming problems from platforms such as Codeforces. The proposed flow consistently and significantly improves results.\nOn the validation set, for example, GPT-4 accuracy (pass@5) increased from 19% with a single well-designed direct prompt to 44% with the AlphaCodium flow. 
\n\nMany of the principles and best practices we acquired in this work, we believe, are broadly applicable to general code generation tasks.\n\n\u003Cp>\n \u003Ctable class=\"tg\">\n  \u003Ctr>\n    \u003Ctd class=\"tg-c3ow\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FCodium-ai_AlphaCodium_readme_94c7688309af.png\" align=\"center\" width=\"600\" >\u003C\u002Ftd>\n\u003Ctr>\n    \u003Ctd class=\"tg-c3ow\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FCodium-ai_AlphaCodium_readme_30800e2271f7.png\" align=\"center\" width=\"600\" >\u003C\u002Ftd>\n\n  \u003C\u002Ftr>\n \u003C\u002Ftable>\n\u003C\u002Fp>\n\n\n## Installation\n\n(1) Set up a virtual environment:\n```bash\npython3 -m venv venv\nsource .\u002Fvenv\u002Fbin\u002Factivate\n```\nand run: `pip install -r requirements.txt`.\n\n(2) Duplicate the file `alpha_codium\u002Fsettings\u002F.secrets_template.toml`, rename the copy to `alpha_codium\u002Fsettings\u002F.secrets.toml`, and fill in your OpenAI API key:\n```\n[openai]\nkey = \"...\"\n```\n\n(3) Download the processed CodeContests validation and test dataset from [Hugging Face](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Ftalrid\u002FCodeContests_valid_and_test_AlphaCodium\u002Fblob\u002Fmain\u002Fcodecontests_valid_and_test_processed_alpha_codium.zip), extract the zip file, and place the extracted folder in the root of the project.\n\n## How to run\n\n### Configuration\nThe file `alpha_codium\u002Fsettings\u002Fconfiguration.toml` contains the configuration for the project.\nIn the `config` section you can choose the model you want to use (\"gpt-4\", \"gpt-3.5-turbo-16k\", or others).\n\n### Solving a specific problem from CodeContest\nTo solve a specific problem with AlphaCodium, from the root folder run:\n```\npython -m alpha_codium.solve_problem \\\n--dataset_name \u002Fpath\u002Fto\u002Fdataset \\\n--split_name test \\\n--problem_number 0\n```\n- The `dataset_name` is the path to the dataset folder you 
downloaded in the installation step.\n- Note that the validation set contains 117 problems and the test set contains 165 problems, so the zero-based `problem_number` parameter must be in the range 0-116 for `valid` or 0-164 for `test`.\n- The `split_name` can be either `valid` or `test`.\n- The following sections in the configuration file: \n`solve`, `self_reflection`, `possible_solutions`, `generate_ai_tests`, `initial_code_generation`, `public_tests`, `ai_tests`  \nlet you adjust the configuration of the different stages of the flow.\n- Each run logs the results to a file named `alpha_codium\u002Fexample.log`. Reviewing the log file is a good way to understand what is going on in each stage of the flow.\n\nExample problem (test set, problem number 12):\n\u003Cp align=\"center\">\n \u003Ctable class=\"tg\">\n  \u003Ctr>\n    \u003Ctd class=\"tg-c3ow\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FCodium-ai_AlphaCodium_readme_40f98e0ce2af.png\" align=\"center\" width=\"600\">\u003C\u002Ftd>\n    \u003C\u002Ftr>\n\u003C\u002Ftable>\n\u003C\u002Fp>\n\n### Solving an entire CodeContest dataset split\nTo solve an entire dataset split with AlphaCodium, from the root folder run:\n```\npython -m alpha_codium.solve_dataset \\\n--dataset_name \u002Fpath\u002Fto\u002Fdataset \\\n--split_name test \\\n--database_solution_path \u002Fpath\u002Fto\u002Foutput\u002Fdir\u002Fdataset_output.json\n```\n\n- The `split_name` can be either `valid` or `test`.\n- `database_solution_path` is the path to the file where the solutions will be saved (`dataset_output.json` in the example above).\n- The `dataset` section in the configuration file contains the configuration for running and evaluating a dataset.\n- Note that this is a long process, and it may take a few days to complete with large models (e.g. GPT-4) and several iterations per problem. \n- `dataset.num_iterations` defines the number of iterations for each problem (pass@K). 
For a large number of iterations, it is recommended to introduce some randomness and different options for each iteration to achieve top results.\n\n### Running the evaluation\n\nOnce you have generated solutions for an entire dataset split (valid or test), you can evaluate them by running:\n```\npython -m alpha_codium.evaluate_dataset \\\n--dataset_name \u002Fpath\u002Fto\u002Fdataset \\\n--split_name test \\\n--database_solution_path \u002Fpath\u002Fto\u002Foutput\u002Fdir\u002Fdataset_output.json\n```\n\n### Solving a new problem (CodeContest format)\nTo solve a custom problem with AlphaCodium, first create a JSON file that includes the CodeContest problem fields, and then from the root folder run:\n```\npython -m alpha_codium.solve_my_problem \\\n--my_problem_json_file \u002Fpath\u002Fto\u002Fmy_problem.json\n```\n- The `my_problem_json_file` is the path to the custom problem JSON file.\n\nSee `my_problem_example.json` for an example of a custom problem. The JSON file should include the following fields:\n- `name` is the name of the problem.\n- `description` is a description of the problem.\n- (optional) `public_tests` with the following fields:\n  - `input` is a list of strings that represent the input.\n  - `output` is a list of strings that represent the output.\n- (optional) `private_tests`, which follows the same structure as `public_tests`\n- (optional) `generated_tests`, which follows the same structure as `public_tests`\n\n\n## Technical Q&A\nBelow are some technical questions we received about this project:\n___\n**Q: How much time did you spend on \"prompt engineering\" compared to \"flow engineering\"?**\u003Cbr>\u003Cbr>\n**A:** Structured output almost completely eliminates the need for simple prompt engineering.\nWe estimate that ~95% of our time went into higher-level design, reasoning, and injecting data at the correct places, ..., a.k.a. 
\"flow engineering\".\n___\n\n**Q: How do you know that there wasn't a data leakage?** \u003Cbr>\u003Cbr>\n**A:** The test set of CodeContests dataset comprises problems published after September 2021, while the GPT-4 model variant we used (gpt-4-0613) has a data cutoff of September 2021. Hence, there is no data leakage for GPT4, on the test set.\nFor other models like DeepSeek, we cannot be sure. However, note that our [main result](.\u002Fpics\u002Fcomparison.png) is a comparison of \"direct prompt\" vs. \"AlphaCodium flow\". Data leakage would help both approaches, so the relative improvement of AlphaCodium flow is still valid.\n___\n\n**Q: Is this project relevant only to specific programming languages?**\u003Cbr>\u003Cbr>\n**A:** No. The proposed flow is language agnostic. We generated solutions in Python, but the flow can be applied to any language.\n___\n\n**Q: How did you manage the context window?** \u003Cbr>\u003Cbr>\n**A:** We used models with a context window of 8192 tokens, and we did not encounter cases where it did not suffice.\nHowever, we clearly observed that as the context we used in practice grows larger (let's say, above 4000 tokens), the model starts to \"ignore\" some of the information in the context. Hence, there is a clear tradeoff:\n- Injecting the results of previous stages into the context, may help the model to generate better code.\n- However, it may also cause the model to ignore specific details and nuances from the problem description.\n___\n\n**Q: Is this work \"realistic\" in terms of the number of LLM calls?** \u003Cbr>\u003Cbr>\n**A:** In comparison to AlphaCode, we do four orders of magnitude (!) fewer [calls](.\u002Fpics\u002Fcomputational_effort.png) (per solution AlphaCodium does 15-20 calls).\nYet we acknowledge that for some applications, this may still be too much, and more optimizations are needed. 
However, we believe that many of the ideas and principles we acquired in this work are broadly applicable, even when the number of calls is further limited.\n___\n**Q: Why do you iterate only on the generated code, and not on the AI-generated tests?** \u003Cbr>\u003Cbr>\n**A:** For code problems in CodeContests, the tests are a list of input-output pairs. Hence, you don't really learn anything new when you \"fix\" a test - you just change its output to the prediction of the generated code. Instead of fixing tests, we preferred to always try to fix the code, while using \"test anchors\" (see the [paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2401.08500) for more details).\nHowever, for other code generation tasks, where the tests are more complex and contain runnable code, iterating on the tests, in addition to iterating on the generated code, may be beneficial.\n\n\n## Broader Applicability\nWhile this work presents results on the CodeContests dataset, we believe that it has broader applicability.\n\nFirst and foremost, we feel that the proposed AlphaCodium [flow](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FCodium-ai_AlphaCodium_readme_94c7688309af.png), with reasonable adjustments, can be used as a more general framework for other code generation tasks.\n\nSecondly, many of the design concepts, principles, and tricks we acquired in this work are broadly applicable as-is to general code generation tasks. 
For example:\n- **YAML Structured output**: asking the model to generate an output in YAML format, equivalent to a given Pydantic class\n- **Semantic reasoning via bullet points analysis**: Bullet points analysis encourages an in-depth understanding of the problem, and forces the model to divide the output into logical semantic sections, leading to improved results\n- **LLMs do better when generating modular code**: when asking the model to: `divide the generated code into small sub-functions, with meaningful names and functionality`, we observe better code, with fewer bugs, and higher success rates for the iterative fixing stages.\n- **Soft decisions with double validation**: with a double validation process, we add an extra step where, given the generated output, the model is asked to re-generate the same output, but correct it if needed\n- **Leave room for exploration**: since the model can be wrong, it’s better to avoid irreversible decisions, and leave room for exploration and code iterations with different possible solutions\n\nThe list above is partial. See the [paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2401.08500) for more details. The code provided [in this repo](.\u002Falpha_codium\u002Fsettings) can be used as a reference for better understanding the proposed concepts, and for applying them to other code generation tasks.\n\n\n## Example Problem\nIn this section, we present an example of a full problem from the CodeContests dataset (test-set, problem 1), in order to demonstrate the complexity of the problems in the dataset, and the challenges they pose to LLMs.\n\n```\nproblem name: '1575_B. Building an Amusement Park'\n\nproblem description:\nMr. Chanek lives in a city represented as a plane. He wants to build an amusement park in the shape of a circle of radius r. \nThe circle must touch the origin (point (0, 0)).\nThere are n bird habitats that can be a photo spot for the tourists in the park. 
The i-th bird habitat is at point p_i = (x_i, y_i). \n\nFind the minimum radius r of a park with at least k bird habitats inside. \n\nA point is considered to be inside the park if and only if the distance between p_i and the center of the park is less than or equal \nto the radius of the park.\nNote that the center and the radius of the park do not need to be integers.\n\nIn this problem, it is guaranteed that the given input always has a solution with r ≤ 2 ⋅ 10^5.\n\nInput\n\nThe first line contains two integers n and k (1 ≤ n ≤ 10^5, 1 ≤ k ≤ n) — the number of bird habitats in the city and the number of bird \nhabitats required to be inside the park.\nThe i-th of the next n lines contains two integers x_i and y_i (0 ≤ |x_i|, |y_i| ≤ 10^5) — the position of the i-th bird habitat.\n\nOutput\n\nOutput a single real number r denoting the minimum radius of a park with at least k bird habitats inside. It is guaranteed that the given \ninput always has a solution with r ≤ 2 ⋅ 10^5.\nYour answer is considered correct if its absolute or relative error does not exceed 10^{-4}.\nFormally, let your answer be a, and the jury's answer be b. Your answer is accepted if and only if \\frac{|a - b|}{max{(1, |b|)}} ≤ 10^{-4}.\n\nExamples\n\nInput\n\n8 4\n-3 1\n-4 4\n1 5\n2 2\n2 -2\n-2 -4\n-1 -1\n-6 0\n\nOutput\n\n3.1622776589\n\n\nInput\n\n1 1\n0 0\n\n\nOutput\n\n0.0000000000\n\nNote\n\nIn the first example, Mr. Chanek can put the center of the park at (-3, -1) with radius √{10} ≈ 3.162. 
It can be proven this is the minimum r.\n```\n\n\n## Acknowledgments\nOur process CodeContests dataset is based on the original [CodeContests](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fdeepmind\u002Fcode_contests) dataset.\nWe removed the train set (which is not relevant to our work) and did some post-processing and cleaning to the validation and test sets.\n\n\n## Citation\n```\n@misc{ridnik2024code,\n      title={Code Generation with AlphaCodium: From Prompt Engineering to Flow Engineering}, \n      author={Tal Ridnik and Dedy Kredo and Itamar Friedman},\n      year={2024},\n      eprint={2401.08500},\n      archivePrefix={arXiv},\n      primaryClass={cs.LG}\n}\n```\n","# 使用 AlphaCodium 进行代码生成：从提示工程到流程工程\n\n[论文](https:\u002F\u002Farxiv.org\u002Fabs\u002F2401.08500) |\n[数据集](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Ftalrid\u002FCodeContests_valid_and_test_AlphaCodium\u002Fblob\u002Fmain\u002Fcodecontests_valid_and_test_processed_alpha_codium.zip)\n\n官方实现\n> Tal Ridnik, Dedy Kredo, Itamar Friedman \u003Cbr\u002F> CodiumAI\n\n## 新闻 2024-17-05\n\n更新了 AlphaCodium 排行榜，加入了新 GPT 模型和 Claude3 Opus 的得分。“GPT-4o” 目前是 AlphaCodium 上表现最佳的模型。\n\n![image](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FCodium-ai_AlphaCodium_readme_954138ed9426.png)\n\n\n\n\n## 目录\n- [摘要](#abstract)\n- [安装](#installation)\n- [如何运行](#how-to-run)\n- [技术问答](#technical-qa)\n- [更广泛的应用性](#broader-applicability)\n- [示例问题](#example-problem)\n- [致谢](#acknowledgments)\n- [引用](#citation)\n\n## 摘要\n\n代码生成问题不同于常见的自然语言问题——它们要求匹配目标语言的精确语法、识别正常流程和边界情况、关注问题规范中的众多细节，并解决其他与代码相关的特定问题和需求。因此，在自然语言生成中取得成功的许多优化方法和技巧，可能并不适用于代码任务。\n\n在本工作中，我们提出了一种由大语言模型进行代码生成的新方法，称为 AlphaCodium——一种基于测试的多阶段、面向代码的迭代流程，能够提升大语言模型在代码问题上的性能。\n\n我们在一个名为 CodeContests 的具有挑战性的代码生成数据集上测试了 AlphaCodium，该数据集包含来自 Codeforces 等平台的竞赛编程题目。所提出的流程始终如一地显著提升了结果。例如，在验证集上，使用单一精心设计的直接提示时，GPT-4 的准确率（pass@5）为 19%，而采用 AlphaCodium 流程后则提升至 44%。\n\n我们认为，在这项工作中获得的许多原则和最佳实践，可广泛应用于一般的代码生成任务。\n\n\u003Cp>\n \u003Ctable class=\"tg\">\n  \u003Ctr>\n    
\u003Ctd class=\"tg-c3ow\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FCodium-ai_AlphaCodium_readme_94c7688309af.png\" align=\"center\" width=\"600\" >\u003C\u002Ftd>\n\u003Ctr>\n    \u003Ctd class=\"tg-c3ow\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FCodium-ai_AlphaCodium_readme_30800e2271f7.png\" align=\"center\" width=\"600\" >\u003C\u002Ftd>\n\n  \u003C\u002Ftr>\n \u003C\u002Ftable>\n\u003C\u002Fp>\n\n\n## 安装\n\n(1) 设置虚拟环境：\n```bash\npython3 -m venv venv\nsource .\u002Fvenv\u002Fbin\u002Factivate\n```\n然后运行：`pip install -r requirements.txt`。\n\n(2) 复制文件 `alpha_codium\u002Fsettings\u002F.secrets_template.toml`，将其重命名为 `alpha_codium\u002Fsettings\u002F.secrets.toml`，并填写您的 OpenAI API 密钥：\n```\n[openai]\nkey = \"...\"\n```\n\n(3) 从 [hugging face](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Ftalrid\u002FCodeContests_valid_and_test_AlphaCodium\u002Fblob\u002Fmain\u002Fcodecontests_valid_and_test_processed_alpha_codium.zip) 下载处理后的 CodeContest 验证集和测试集数据，解压 ZIP 文件，并将解压后的文件夹放置在项目根目录下。\n\n## 如何运行\n\n### 配置\n文件：`alpha_codium\u002Fsettings\u002Fconfiguration.toml` 包含项目的配置信息。\n在 `config` 部分，您可以选择要使用的模型（“gpt-4”、“gpt-3.5-turbo-16k”或其他）。\n\n### 解决 CodeContest 中的特定问题\n要使用 AlphaCodium 解决特定问题，从根目录运行：\n```\npython -m alpha_codium.solve_problem \\\n--dataset_name \u002Fpath\u002Fto\u002Fdataset \\\n--split_name test \\\n--problem_number 0\n```\n- `dataset_name` 是您在安装步骤中下载的数据集文件夹路径。\n- 请注意，验证集包含 117 个问题，测试集包含 165 个问题，因此 `problem_number` 参数应相应设置（从 0 开始）。\n- `split_name` 可以是 `valid` 或 `test`。\n- 配置文件中的以下部分：\n`solve`, `self_reflection`,`possible_solutions`,`generate_ai_tests`,`initial_code_generation`,`public_tests`, `ai_tests`  \n允许您调整流程不同阶段的配置选项。\n- 每次运行都会将结果记录到名为 `alpha_codium\u002Fexample.log` 的文件中。查看日志文件是了解流程各阶段发生情况的好方法。\n\n示例问题（测试集，第 12 题）：\n\u003Cp align=\"center\">\n \u003Ctable class=\"tg\">\n  \u003Ctr>\n    \u003Ctd class=\"tg-c3ow\">\u003Cimg 
src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FCodium-ai_AlphaCodium_readme_40f98e0ce2af.png\" align=\"center\" width=\"600\"\">\u003C\u002Ftd>\n    \u003C\u002Ftr>\n\u003C\u002Ftable>\n\u003C\u002Fp>\n\n### 解决整个 CodeContest 数据集的一个子集\n要使用 AlphaCodium 解决整个数据集，从根目录运行：\n```\npython -m alpha_codium.solve_dataset \\\n--dataset_name \u002Fpath\u002Fto\u002Fdataset \\\n--split_name test \\\n--database_solution_path \u002Fpath\u002Fto\u002Foutput\u002Fdir\u002Fdataset_output.json\n```\n\n- `split_name` 可以是 `valid` 或 `test`。\n- `database_solution_path` 是保存解决方案的目录路径。\n- 配置文件中的 `dataset` 部分包含运行和评估整个数据集的配置信息。\n- 请注意，这是一个耗时的过程，使用大型模型（如 GPT-4）且每个问题需要多次迭代时，可能需要几天时间才能完成。\n- `dataset.num_iterations` 定义每个问题的迭代次数（pass@K）。对于大量迭代，建议每次迭代引入一些随机性和不同的选项，以获得最佳效果。\n\n### 运行评估\n一旦您为整个数据集（验证集或测试集）生成了解决方案，可以通过运行以下命令对其进行评估：\n```\npython -m alpha_codium.evaluate_dataset \\\n--dataset_name \u002Fpath\u002Fto\u002Fdataset \\\n--split_name test \\\n--database_solution_path \u002Fpath\u002Fto\u002Foutput\u002Fdir\u002Fdataset_output.json\n```\n\n### 解决新问题（CodeContest 格式）\n要使用 AlphaCodium 解决自定义问题，首先创建一个包含 CodeContest 问题字段的 JSON 文件，然后从根目录运行：\n```\npython -m alpha_codium.solve_my_problem \\\n--my_problem_json_file \u002Fpath\u002Fto\u002Fmy_problem.json\n```\n- `my_problem_json_file` 是自定义问题 JSON 文件的路径。\n\n请参阅 `my_problem_example.json` 以查看自定义问题的示例。JSON 文件应包含以下字段：\n- `name` 是问题名称。\n- `description` 是问题描述。\n- （可选）`public_tests`，包含以下字段：\n  - `input` 是表示输入的字符串列表。\n  - `output` 是表示输出的字符串列表。\n- （可选）`private_tests`，其结构与 `public_tests` 相同。\n- （可选）`generated_tests`，其结构与 `public_tests` 相同。\n\n## 技术问答\n汇总了一些我们收到的关于该项目的技术问题：\n___\n**问：与“流程工程”相比，你们在“提示工程”上花了多少时间？**\u003Cbr>\u003Cbr>\n**答：** 结构化输出几乎完全消除了对简单提示工程的需求。\n我们估计大约95%的时间都花在了更高层次的设计、推理以及将数据注入到正确的位置等工作上，也就是所谓的“流程工程”。\n___\n\n**问：你们如何确保没有发生数据泄露？** \u003Cbr>\u003Cbr>\n**答：** CodeContests 数据集的测试集包含的是2021年9月之后发布的题目，而我们使用的 GPT-4 模型版本（gpt-4-0613）的数据截止日期为2021年9月。因此，在该测试集上，GPT-4 并不存在数据泄露问题。\n对于 DeepSeek 
等其他模型，我们无法完全确定。不过需要注意的是，我们的[主要结果](.\u002Fpics\u002Fcomparison.png)是比较“直接提示”与“AlphaCodium 流程”的效果。如果存在数据泄露，它会对两种方法都有帮助，因此 AlphaCodium 流程的相对提升仍然有效。\n___\n\n**问：这个项目是否只适用于特定的编程语言？**\u003Cbr>\u003Cbr>\n**答：** 不是。所提出的流程与语言无关。我们用 Python 生成了解决方案，但该流程可以应用于任何语言。\n___\n\n**问：你们是如何管理上下文窗口的？** \u003Cbr>\u003Cbr>\n**答：** 我们使用了上下文窗口为8192个标记的模型，并未遇到上下文不足的情况。\n然而，我们清楚地观察到，当实际使用的上下文规模增大时（比如超过4000个标记），模型会开始“忽略”部分上下文信息。因此，这里存在一个明显的权衡：\n- 将前一阶段的结果注入到上下文中，可能有助于模型生成更好的代码。\n- 但这也可能导致模型忽略问题描述中的具体细节和细微之处。\n___\n\n**问：从大语言模型调用次数的角度来看，这项工作是否“现实”？** \u003Cbr>\u003Cbr>\n**答：** 与 AlphaCode 相比，我们的调用次数少了四个数量级（！）（每个解决方案，AlphaCodium 需要进行15–20次调用）。\n尽管如此，我们也承认，对于某些应用场景来说，这仍然可能过多，还需要进一步优化。不过，我们认为，我们在这一工作中获得的许多理念和原则具有广泛的适用性，即使在进一步限制调用次数的情况下也是如此。\n___\n**问：为什么你们只迭代生成的代码，而不迭代由 AI 生成的测试用例？** \u003Cbr>\u003Cbr>\n**答：** 在 CodeContests 的代码问题中，测试用例是一组输入-输出对。因此，当你“修复”一个测试用例时，并不会学到任何新东西——你只是将其输出改为生成代码的预测结果。与其修复测试用例，我们更倾向于始终尝试修复代码，同时利用“测试锚点”。（更多详情请参阅[论文](https:\u002F\u002Farxiv.org\u002Fabs\u002F2401.08500)）\n然而，对于其他代码生成任务，如果测试用例更为复杂且包含可运行的代码，那么在迭代生成代码的同时也迭代测试用例，可能会更有益处。\n \n\n## 更广泛的适用性\n虽然这项工作是在 CodeContests 数据集上展示的结果，但我们相信它具有更广泛的适用性。\n\n首先，我们认为，经过合理调整后的 AlphaCodium [流程](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FCodium-ai_AlphaCodium_readme_94c7688309af.png)，可以作为其他代码生成任务的通用框架。\n\n其次，我们在本工作中获得的许多设计概念、原则和技巧，可以直接广泛应用于任何一般的代码生成任务。例如：\n- **YAML 结构化输出**：要求模型以 YAML 格式生成与给定 Pydantic 类等效的输出。\n- **通过项目符号分析进行语义推理**：项目符号分析能够促进对问题的深入理解，并迫使模型将输出划分为逻辑清晰的语义部分，从而带来更好的结果。\n- **大语言模型生成模块化代码效果更好**：当我们要求模型将生成的代码拆分为多个小函数，并赋予它们有意义的名称和功能时，我们会观察到生成的代码质量更高、错误更少，并且在迭代修复阶段的成功率也更高。\n- **双重验证下的软决策**：通过双重验证流程，我们增加了一个额外的步骤，即在生成输出后，再次要求模型重新生成相同的输出，必要时进行修正。\n- **留出探索空间**：由于模型可能会出错，最好避免做出不可逆的决定，而是留出一定的探索空间，以便尝试不同的解决方案并进行代码迭代。\n\n以上列举的内容仅为部分示例。更多详细信息请参阅[论文](https:\u002F\u002Farxiv.org\u002Fabs\u002F2401.08500)。本仓库提供的代码（[链接](.\u002Falpha_codium\u002Fsettings)）可作为参考，帮助更好地理解这些概念，并将其应用于其他代码生成任务。\n\n\n## 示例问题\n在这一部分，我们展示了一个来自 CodeContests 数据集的完整问题示例（测试集，第1题），以说明该数据集中问题的复杂性，以及它们对大语言模型带来的挑战。\n\n```\n题目名称：‘1575_B. 
建造游乐园’\n\n题目描述：\nChanek 先生居住在一个可以用平面表示的城市里。他想建造一个半径为 r 的圆形游乐园，且该圆必须与原点（点 (0, 0)）相切。\n城市中有 n 个鸟类栖息地，这些地方可以作为游客在游乐园内的拍照点。第 i 个鸟类栖息地位于点 p_i = (x_i, y_i)。\n\n请找出至少包含 k 个鸟类栖息地的游乐园的最小半径 r。\n\n只有当点 p_i 到游乐园中心的距离小于或等于游乐园的半径时，才被认为位于游乐园内部。\n请注意，游乐园的中心位置和半径不必是整数。\n\n本题保证给定的输入总是存在满足 r ≤ 2 × 10^5 的解。\n\n输入\n\n第一行包含两个整数 n 和 k (1 ≤ n ≤ 10^5, 1 ≤ k ≤ n)，分别表示城市中的鸟类栖息地数量以及游乐园内需要包含的鸟类栖息地数量。\n接下来的 n 行中，第 i 行包含两个整数 x_i 和 y_i (0 ≤ |x_i|, |y_i| ≤ 10^5)，表示第 i 个鸟类栖息地的位置。\n\n输出\n\n输出一个实数 r，表示至少包含 k 个鸟类栖息地的游乐园的最小半径。本题保证给定的输入总是存在满足 r ≤ 2 × 10^5 的解。\n你的答案被认为是正确的，当且仅当其绝对误差或相对误差不超过 10^{-4}。\n形式上讲，设你的答案为 a，裁判的答案为 b。你的答案被接受，当且仅当 \\frac{|a - b|}{max{(1, |b|)}} ≤ 10^{-4}。\n\n示例\n\n输入\n\n8 4\n-3 1\n-4 4\n1 5\n2 2\n2 -2\n-2 -4\n-1 -1\n-6 0\n\n输出\n\n3.1622776589\n\n\n输入\n\n1 1\n0 0\n\n\n输出\n\n0.0000000000\n\n注释\n\n在第一个示例中，Chanek 先生可以将游乐园的中心设置在 (-3, -1)，此时半径为 √{10} ≈ 3.162。可以证明这是最小的 r。\n```\n\n## 致谢\n我们的数据集 CodeContests 是基于原始的 [CodeContests](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fdeepmind\u002Fcode_contests) 数据集构建的。\n我们移除了与本工作无关的训练集，并对验证集和测试集进行了一些后处理和清洗。\n\n\n## 引用\n```\n@misc{ridnik2024code,\n      title={使用 AlphaCodium 进行代码生成：从提示工程到流程工程}, \n      author={Tal Ridnik、Dedy Kredo 和 Itamar Friedman},\n      year={2024},\n      eprint={2401.08500},\n      archivePrefix={arXiv},\n      primaryClass={cs.LG}\n}\n```","# AlphaCodium 快速上手指南\n\nAlphaCodium 是一个基于“流工程（Flow Engineering）”的代码生成框架。它通过多阶段、基于测试的迭代流程，显著提升了大语言模型（LLM）在解决复杂编程问题（如竞争性编程）上的准确率。\n\n## 环境准备\n\n- **系统要求**：Linux 或 macOS（Windows 用户建议使用 WSL2）\n- **Python 版本**：Python 3.8 或更高版本\n- **前置依赖**：\n  - Git\n  - pip\n  - OpenAI API Key（或其他兼容模型的 API Key）\n\n## 安装步骤\n\n### 1. 克隆项目并设置虚拟环境\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002FCodium-ai\u002FAlphaCodium.git\ncd AlphaCodium\npython3 -m venv venv\nsource .\u002Fvenv\u002Fbin\u002Factivate\npip install -r requirements.txt\n```\n\n> **提示**：国内用户若下载依赖较慢，可指定清华源加速安装：\n> `pip install -r requirements.txt -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple`\n\n### 2. 
配置 API Key\n\n复制配置文件模板并填入你的 API Key：\n\n```bash\ncp alpha_codium\u002Fsettings\u002F.secrets_template.toml alpha_codium\u002Fsettings\u002F.secrets.toml\n```\n\n编辑 `alpha_codium\u002Fsettings\u002F.secrets.toml` 文件，填入密钥：\n\n```toml\n[openai]\nkey = \"sk-...\"\n```\n\n### 3. 下载数据集\n\n从 Hugging Face 下载处理好的 CodeContests 验证集和测试集，解压后放入项目根目录。\n\n- **下载地址**：[CodeContests Valid & Test Dataset](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Ftalrid\u002FCodeContests_valid_and_test_AlphaCodium\u002Fblob\u002Fmain\u002Fcodecontests_valid_and_test_processed_alpha_codium.zip)\n- **操作**：解压后的文件夹应直接位于项目根目录下。\n\n## 基本使用\n\n### 1. 配置模型\n\n编辑 `alpha_codium\u002Fsettings\u002Fconfiguration.toml` 文件，在 `[config]` 部分选择你要使用的模型（例如 `\"gpt-4\"`, `\"gpt-4o\"`, `\"gpt-3.5-turbo-16k\"` 等）。\n\n### 2. 解决单个问题\n\n运行以下命令来解决数据集中的特定问题（以测试集第 0 题为例）：\n\n```bash\npython -m alpha_codium.solve_problem \\\n--dataset_name \u002Fpath\u002Fto\u002Fdataset \\\n--split_name test \\\n--problem_number 0\n```\n\n**参数说明：**\n- `--dataset_name`: 步骤 3 中下载并解压的数据集文件夹路径。\n- `--split_name`: 数据集划分，可选 `valid` (验证集，117 题) 或 `test` (测试集，165 题)。\n- `--problem_number`: 题目编号（从 0 开始）。\n\n运行结果和详细日志将保存在 `alpha_codium\u002Fexample.log`，可通过查看该文件了解各阶段的执行细节。\n\n### 3. 解决自定义问题\n\n如果你有自己的编程问题（符合 CodeContest 格式），可以创建一个 JSON 文件（参考 `my_problem_example.json`），包含 `name`, `description`, 以及可选的 `public_tests` 等字段，然后运行：\n\n```bash\npython -m alpha_codium.solve_my_problem \\\n--my_problem_json_file \u002Fpath\u002Fto\u002Fmy_problem.json\n```\n\n### 4. 
批量运行与评估（可选）\n\n若要跑完整个数据集（耗时较长，需数天），可使用：\n\n```bash\npython -m alpha_codium.solve_dataset \\\n--dataset_name \u002Fpath\u002Fto\u002Fdataset \\\n--split_name test \\\n--database_solution_path \u002Fpath\u002Fto\u002Foutput\u002Fdir\u002Fdataset_output.json\n```\n\n生成解决方案后，运行评估：\n\n```bash\npython -m alpha_codium.evaluate_dataset \\\n--dataset_name \u002Fpath\u002Fto\u002Fdataset \\\n--split_name test \\\n--database_solution_path \u002Fpath\u002Fto\u002Foutput\u002Fdir\u002Fdataset_output.json\n```","某算法竞赛团队正在备战 Codeforces 高难度场次，需要快速验证复杂逻辑题目的解题思路并生成可提交代码。\n\n### 没有 AlphaCodium 时\n- **提示词依赖过重**：工程师需反复手动调整 Prompt 以覆盖边界情况，单次尝试成功率极低（GPT-4 直接生成通过率仅约 19%）。\n- **缺乏自我纠错**：模型生成的代码往往忽略隐蔽的语法错误或极端测试用例，导致在在线判题系统中直接失败。\n- **调试效率低下**：面对生成错误的代码，开发者必须人工分析日志、构造反例并重新输入指令，迭代周期漫长且消耗大量精力。\n- **逻辑覆盖不全**：难以确保模型同时兼顾“快乐路径”与所有边缘场景，常出现逻辑漏洞导致部分测试点不通过。\n\n### 使用 AlphaCodium 后\n- **流程化自动迭代**：AlphaCodium 内置多阶段工作流，自动执行问题分析、方案构思及代码生成，将 GPT-4 的通过率提升至 44%。\n- **内建测试驱动**：工具自动生成 AI 测试用例并进行自我反思，在提交前即可发现并修复语法错误和逻辑缺陷。\n- **闭环优化机制**：基于测试反馈自动进行多轮代码修正，无需人工干预即可完成从“初稿”到“可运行代码”的进化。\n- **全面场景覆盖**：通过结构化的流工程，强制模型系统性地处理题目细节与边缘情况，显著减少因疏忽导致的判题失败。\n\nAlphaCodium 通过将“提示词工程”升级为“流工程”，让大模型在解决高难度代码问题时具备了类似人类专家的自我反思与迭代能力。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FCodium-ai_AlphaCodium_954138ed.png","Codium-ai","Qodo","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002FCodium-ai_4b0dadce.png","",null,"support@qodo.ai","QodoAI","www.qodo.ai","https:\u002F\u002Fgithub.com\u002FCodium-ai",[87,91],{"name":88,"color":89,"percentage":90},"Python","#3572A5",99.6,{"name":92,"color":93,"percentage":94},"Dockerfile","#384d54",0.4,3925,299,"2026-04-08T16:54:09","AGPL-3.0","Linux, macOS, Windows","未说明 (基于 LLM API 调用，本地无需 GPU)","未说明",{"notes":103,"python":104,"dependencies":105},"1. 该工具主要依赖外部大模型 API（如 OpenAI GPT-4），需配置 API Key，本地运行不需要高性能 GPU 或大量显存。\n2. 需要手动下载 CodeContests 数据集并解压到项目根目录。\n3. 
运行完整数据集评估可能耗时数天。","3.x (通过 python3 -m venv 推断)",[106],"requirements.txt 中定义的依赖 (具体列表未在 README 中展示)",[15],[109,110,111,112,113],"code-generation","flow-engineering","paper-implementations","state-of-the-art","broader-impacts","2026-03-27T02:49:30.150509","2026-04-09T09:38:48.142085",[117,122,127,132,137,142],{"id":118,"question_zh":119,"answer_zh":120,"source_url":121},25804,"如何配置 AlphaCodium 以使用 DeepSeek 模型？","LiteLLM 默认可能不支持直接调用 DeepSeek。您可以尝试修改 `alpha_codium\u002Fllm\u002Fai_handler.py` 文件，注释掉 `api_base` 参数，并将模型名称更改为 HuggingFace 格式（例如 `huggingface\u002Fdeepseek-ai\u002Fdeepseek-coder-33b-instruct`）。\n\n修改后的代码示例：\n```python\nresponse = await acompletion(\n    model=\"huggingface\u002Fdeepseek-ai\u002Fdeepseek-coder-33b-instruct\",\n    messages=[\n        {\"role\": \"system\", \"content\": system},\n        {\"role\": \"user\", \"content\": user},\n    ],\n    # api_base=get_settings().get(\"config.model\"), # 注释掉此行\n    temperature=temperature,\n    repetition_penalty=frequency_penalty+1,\n    force_timeout=get_settings().config.ai_timeout,\n    max_tokens=2000,\n    stop=['\u003C|EOT|>'],\n)\n```\n注意：如果模型过大，可能会遇到其他资源限制错误。","https:\u002F\u002Fgithub.com\u002FCodium-ai\u002FAlphaCodium\u002Fissues\u002F34",{"id":123,"question_zh":124,"answer_zh":125,"source_url":126},25805,"AlphaCodium 支持 Claude 3 模型吗？表现如何？","AlphaCodium 已添加对 Claude 3 Opus 的支持。但根据维护者的测试反馈，Claude 3 在代码生成任务上的表现不如 GPT-4，且速度更慢、成本更高。维护者认为 Anthropic 目前在代码模型竞争力方面尚未超越 GPT-4。您可以在项目的 README 文件中查看具体的模型对比数据和排行榜。","https:\u002F\u002Fgithub.com\u002FCodium-ai\u002FAlphaCodium\u002Fissues\u002F42",{"id":128,"question_zh":129,"answer_zh":130,"source_url":131},25806,"为什么生成的代码在运行时因超时被标记为失败，即使逻辑正确？","这是 Codeforces 等竞赛平台的特定要求。有效的解决方案必须在极短时间（通常远少于 3 秒）内完成执行。平台故意设置严格的超时限制，以排除那些虽然逻辑正确但未优化的暴力解法（Brute Force）。如果生成的代码运行时间过长，说明其效率未达到竞赛标准，因此被判定为失败是预期的行为，旨在促使模型生成更高效的算法。","https:\u002F\u002Fgithub.com\u002FCodium-ai\u002FAlphaCodium\u002Fissues\u002F33",{"id":133,"question_zh":134,"answer_zh":135,"source_url":136},25807,"在 
macOS 或 Python 3.12 环境下安装时遇到 PyYAML 或 duckdb 版本错误怎么办？","在安装过程中如果遇到 `PyYAML` 构建失败或 `duckdb` 版本找不到（如 `duckdb==0.9.3.dev3077`）的错误，请尝试以下解决方法：\n1. 确保拉取了最新的代码提交，维护者已修复部分依赖问题。\n2. 手动修改 `requirements.txt` 中的 `duckdb` 版本号。将不存在的开发版版本（如 `0.9.3.dev3077`）更改为 PyPI 上已有的最新开发版版本（例如 `0.9.3.dev3715` 或更高），然后重新运行 `pip install -r requirements.txt`。","https:\u002F\u002Fgithub.com\u002FCodium-ai\u002FAlphaCodium\u002Fissues\u002F27",{"id":138,"question_zh":139,"answer_zh":140,"source_url":141},25808,"使用 gpt-3.5-turbo-1106 时出现 YAML 解析错误（'found character '`' that cannot start any token'）如何解决？","这是因为不同版本的 GPT 模型在输出格式上存在细微差异，某些版本会在回答开头添加 ```yaml 标记，导致解析失败。维护者已通过添加后处理步骤修复了此问题（见 commit b55f41d）。\n如果您使用的是旧版本代码，建议更新到最新版。如果无法更新，需要在代码中添加逻辑，在解析 YAML 之前去除回答字符串开头的 ```yaml 或 ``` 标记。","https:\u002F\u002Fgithub.com\u002FCodium-ai\u002FAlphaCodium\u002Fissues\u002F20",{"id":143,"question_zh":144,"answer_zh":145,"source_url":146},25809,"如何让 AlphaCodium 解决数据集中未包含的自定义问题？","可以。AlphaCodium 的工作流需要输入一个问题和对应的公开测试用例（输入 - 输出格式）。\n您需要按照项目 README 中“示例问题”部分的格式准备您的自定义问题数据。请注意，如果您的自定义问题与标准的 CodeContest 问题格式差异较大，可能需要对代码进行额外的调整以适应新的输入结构。","https:\u002F\u002Fgithub.com\u002FCodium-ai\u002FAlphaCodium\u002Fissues\u002F14",[]]