[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-facebookresearch--coconut":3,"tool-facebookresearch--coconut":61},[4,18,26,36,44,53],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":17},4358,"openclaw","openclaw\u002Fopenclaw","OpenClaw 是一款专为个人打造的本地化 AI 助手，旨在让你在自己的设备上拥有完全可控的智能伙伴。它打破了传统 AI 助手局限于特定网页或应用的束缚，能够直接接入你日常使用的各类通讯渠道，包括微信、WhatsApp、Telegram、Discord、iMessage 等数十种平台。无论你在哪个聊天软件中发送消息，OpenClaw 都能即时响应，甚至支持在 macOS、iOS 和 Android 设备上进行语音交互，并提供实时的画布渲染功能供你操控。\n\n这款工具主要解决了用户对数据隐私、响应速度以及“始终在线”体验的需求。通过将 AI 部署在本地，用户无需依赖云端服务即可享受快速、私密的智能辅助，真正实现了“你的数据，你做主”。其独特的技术亮点在于强大的网关架构，将控制平面与核心助手分离，确保跨平台通信的流畅性与扩展性。\n\nOpenClaw 非常适合希望构建个性化工作流的技术爱好者、开发者，以及注重隐私保护且不愿被单一生态绑定的普通用户。只要具备基础的终端操作能力（支持 macOS、Linux 及 Windows WSL2），即可通过简单的命令行引导完成部署。如果你渴望拥有一个懂你",349277,3,"2026-04-06T06:32:30",[13,14,15,16],"Agent","开发框架","图像","数据工具","ready",{"id":19,"name":20,"github_repo":21,"description_zh":22,"stars":23,"difficulty_score":10,"last_commit_at":24,"category_tags":25,"status":17},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,"2026-04-05T11:01:52",[14,15,13],{"id":27,"name":28,"github_repo":29,"description_zh":30,"stars":31,"difficulty_score":32,"last_commit_at":33,"category_tags":34,"status":17},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 
代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",159267,2,"2026-04-17T11:29:14",[14,13,35],"语言模型",{"id":37,"name":38,"github_repo":39,"description_zh":40,"stars":41,"difficulty_score":32,"last_commit_at":42,"category_tags":43,"status":17},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",108322,"2026-04-10T11:39:34",[14,15,13],{"id":45,"name":46,"github_repo":47,"description_zh":48,"stars":49,"difficulty_score":32,"last_commit_at":50,"category_tags":51,"status":17},6121,"gemini-cli","google-gemini\u002Fgemini-cli","gemini-cli 是一款由谷歌推出的开源 AI 命令行工具，它将强大的 Gemini 大模型能力直接集成到用户的终端环境中。对于习惯在命令行工作的开发者而言，它提供了一条从输入提示词到获取模型响应的最短路径，无需切换窗口即可享受智能辅助。\n\n这款工具主要解决了开发过程中频繁上下文切换的痛点，让用户能在熟悉的终端界面内直接完成代码理解、生成、调试以及自动化运维任务。无论是查询大型代码库、根据草图生成应用，还是执行复杂的 Git 操作，gemini-cli 都能通过自然语言指令高效处理。\n\n它特别适合广大软件工程师、DevOps 人员及技术研究人员使用。其核心亮点包括支持高达 100 万 token 的超长上下文窗口，具备出色的逻辑推理能力；内置 Google 搜索、文件操作及 Shell 命令执行等实用工具；更独特的是，它支持 MCP（模型上下文协议），允许用户灵活扩展自定义集成，连接如图像生成等外部能力。此外，个人谷歌账号即可享受免费的额度支持，且项目基于 Apache 2.0 
协议完全开源，是提升终端工作效率的理想助手。",100752,"2026-04-10T01:20:03",[52,13,15,14],"插件",{"id":54,"name":55,"github_repo":56,"description_zh":57,"stars":58,"difficulty_score":32,"last_commit_at":59,"category_tags":60,"status":17},4721,"markitdown","microsoft\u002Fmarkitdown","MarkItDown 是一款由微软 AutoGen 团队打造的轻量级 Python 工具，专为将各类文件高效转换为 Markdown 格式而设计。它支持 PDF、Word、Excel、PPT、图片（含 OCR）、音频（含语音转录）、HTML 乃至 YouTube 链接等多种格式的解析，能够精准提取文档中的标题、列表、表格和链接等关键结构信息。\n\n在人工智能应用日益普及的今天，大语言模型（LLM）虽擅长处理文本，却难以直接读取复杂的二进制办公文档。MarkItDown 恰好解决了这一痛点，它将非结构化或半结构化的文件转化为模型“原生理解”且 Token 效率极高的 Markdown 格式，成为连接本地文件与 AI 分析 pipeline 的理想桥梁。此外，它还提供了 MCP（模型上下文协议）服务器，可无缝集成到 Claude Desktop 等 LLM 应用中。\n\n这款工具特别适合开发者、数据科学家及 AI 研究人员使用，尤其是那些需要构建文档检索增强生成（RAG）系统、进行批量文本分析或希望让 AI 助手直接“阅读”本地文件的用户。虽然生成的内容也具备一定可读性，但其核心优势在于为机器",93400,"2026-04-06T19:52:38",[52,14],{"id":62,"github_repo":63,"name":64,"description_en":65,"description_zh":66,"ai_summary_zh":66,"readme_en":67,"readme_zh":68,"quickstart_zh":69,"use_case_zh":70,"hero_image_url":71,"owner_login":72,"owner_name":73,"owner_avatar_url":74,"owner_bio":75,"owner_company":76,"owner_location":76,"owner_email":76,"owner_twitter":76,"owner_website":77,"owner_url":78,"languages":79,"stars":88,"forks":89,"last_commit_at":90,"license":91,"difficulty_score":10,"env_os":92,"env_gpu":93,"env_ram":92,"env_deps":94,"category_tags":103,"github_topics":76,"view_count":32,"oss_zip_url":76,"oss_zip_packed_at":76,"status":17,"created_at":104,"updated_at":105,"faqs":106,"releases":142},8561,"facebookresearch\u002Fcoconut","coconut","Training Large Language Model to Reason in a Continuous Latent Space","Coconut 是一个由 Meta 开源的研究项目，旨在训练大型语言模型在“连续潜在空间”中进行推理。传统的大模型通常通过生成离散的文本步骤（即思维链）来解决问题，而 Coconut 创新地让模型学习生成连续的向量表示作为思考过程。这种方法试图突破纯文本推理的局限，探索更高效、更深层的逻辑推演能力，从而提升模型在处理复杂数学问题或逻辑任务时的表现。\n\n该工具主要面向 AI 研究人员和高级开发者，特别是那些对大模型内部机制、推理算法优化以及前沿学术实验感兴趣的人群。用户可以通过提供的代码复现论文实验，基于 GSM8K 等数据集进行训练和评估。Coconut 的独特亮点在于其分阶段训练策略和对“连续思维”步数的灵活配置，允许研究者深入探究非文本形式的推理路径如何影响模型性能。虽然目前它更偏向于学术探索而非直接的商业应用，但为未来构建更具直觉和深度的 AI 系统提供了重要的技术参考。","# 
Coconut\n\nThe code base is the official implementation of [Training Large Language Models to Reason in a Continuous Latent Space](https:\u002F\u002Farxiv.org\u002Fabs\u002F2412.06769).\n\n![coconut](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Ffacebookresearch_coconut_readme_d59981bc6594.png)\n\n## Getting Started\nClone repo:\n```\ngit clone git@github.com:facebookresearch\u002Fcoconut.git\ncd coconut\n```\n\nSetup environment:\n```\nconda create --name coconut python=3.12\nconda activate coconut\npip install -r requirements.txt\n```\n\nThe code relies on [wandb](https:\u002F\u002Fwandb.ai\u002Fsite\u002F) for logging. Please log in to your wandb account following this [document](https:\u002F\u002Fdocs.wandb.ai\u002Fref\u002Fcli\u002Fwandb-login\u002F) before running any experiments.\n\n## Data\n\nThe data for training and evaluation should be presented as a json file like below:\n\n```python\n[\n  {\n    \"question\": \"...\",\n    \"answer\": \"...\",\n    \"steps\": [\"...\", \"...\", ...]\n  },\n  ...\n]\n```\n\nThe file should contain a list of data points. Each data point is composed of a question (str), an answer (str), and a list of steps, where each step is a string.\n\nFor example, you can download and process the [GSM8K](https:\u002F\u002Farxiv.org\u002Fabs\u002F2110.14168) dataset (with [augmented training and validation sets](https:\u002F\u002Fgithub.com\u002Fda03\u002FInternalize_CoT_Step_by_Step\u002Ftree\u002Fe06a32ee5e4cd117171daeb4755d2a97ece62761\u002Fdata\u002Fgsm8k)) by running:\n\n```bash\nbash preprocessing\u002Fgsm_icot.bash\n```\n\n## Arguments\n\nThe configuration of a run should be specified in a yaml file (an example can be found [here](args\u002Fgsm_coconut.yaml)).\n\n- **General settings**\n\n  - **project**: Project name for wandb\n  - **save_path**: Your path to store the checkpoints\n  - **only_eval**: If true, only load a model and test on the data from `val_path` (must be used along with `load_model_path`). 
Otherwise, train the model on `train_path` and test on `val_path` after every epoch.\n\n- **Method**\n  - **coconut**: Train coconut model\n  - **cot**: Train cot model\n  - **no_thoughts**: Train coconut (w\u002Fo thought) model\n  - **no_cot**: Train no-cot model\n\n- **Training settings**\n\n  - **c_thought**: Number of continuous thoughts for each reasoning step\n  - **epochs_per_stage**: Number of epochs for every training stage\n  - **max_latent_stage**: The maximum number of training stages (in addition to the initial stage)\n  - **pad_latent_to_max**: If the number of reasoning steps is fewer than the index of the current training stage, pad the number of continuous thoughts.\n  - **save_only_improve**: Save the model only when the best validation accuracy is updated. Recommended to set `False` for Coconut model training, because otherwise the checkpoints in the last stage might not get saved.\n  - **uniform_prob**: The probability to mix data from other stages. 0 for standard experiment, 0.3 for analysis experiment.\n  - **model_id**: Huggingface model id to load as the initialization, e.g., `openai-community\u002Fgpt2`\n  - **load_model_path**: The path to a checkpoint to load. Used in two cases: (1) for evaluation (2) to initialize coconut from a CoT-tuned model.\n  - **seed**: Random seed.\n  - **resume**: The epoch to resume from. Can be used when we want to skip the initial training stages.\n  - **bf16**: Whether to use bf16 training.\n  - **train_path**: Path to the training set.\n  - **val_path**: Path to the validation or test set (depending on `only_eval`)\n  - **reset_optimizer**: Whether to reset the optimizer when switching training stages.\n  - **batch_size_training**: Batch size to train the model per GPU.\n  - **debug**: If true, there is no wandb and model saving. 
A subset of data will be used.\n  - **gradient_accumulation_steps**: Gradient accumulation steps\n  - **num_epochs**: Maximum training epochs.\n  - **lr**: Learning rate\n  - **weight_decay**: Weight decay\n\n\n## Training\n\nRun the following commands (replacing `N_GPUS` and `PATH_TO_ARGS`):\n\n```\ntorchrun --nnodes 1 --nproc_per_node N_GPUS run.py PATH_TO_ARGS\n```\n\n## Reproducing Experiments\n\nHere we provide instructions to reproduce our experiments in the paper.\n\nAll the commands below assume 4 * A100 (80GB) GPUs. You may change the corresponding arguments in the config file (`batch_size_training`, `gradient_accumulation_steps`) and `nproc_per_node` when launching the run, to adapt to your resources.\n\n\n### GSM8K\n\nPreprocessing data:\n\n```bash\nbash preprocessing\u002Fgsm_icot.bash\n```\n\nFirst train the model with CoT (as the stage 0 training)\n\n```bash\ntorchrun --nnodes 1 --nproc_per_node 4 run.py args\u002Fgsm_cot.yaml\n```\n\nSelect a checkpoint as the initialization of Coconut (the validation accuracy is expected to be around 40%). Replace the `load_model_path` in the [args\u002Fgsm_coconut.yaml](args\u002Fgsm_coconut.yaml) with your selected checkpoint, and run:\n\n```bash\ntorchrun --nnodes 1 --nproc_per_node 4 run.py args\u002Fgsm_coconut.yaml\n```\n\nFind the checkpoint with the best validation accuracy, and put the path as `load_model_path` in [args\u002Fgsm_coconut_eval.yaml](args\u002Fgsm_coconut_eval.yaml). 
To evaluate:\n\n```bash\ntorchrun --nnodes 1 --nproc_per_node 4 run.py args\u002Fgsm_coconut_eval.yaml\n```\n\n### ProntoQA\n\nPlease clone the official [github repo](https:\u002F\u002Fgithub.com\u002Fasaparov\u002Fprontoqa\u002Ftree\u002Ff0145b867b3c106285ec9ea1941a3f6eb7c6162d) of [ProntoQA](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2210.01240) and generate a raw dataset with:\n\n```bash\ncd prontoqa\npython run_experiment.py --model-name json --model-size dummy --ordering random --num-trials 10000 --few-shot-examples 0 --ontology fictional --min-hops 5 --max-hops 5 --hops-skip 1\n```\n\nThen copy the generated `5hop_0shot_random.json` file to `data` directory, and preprocess the dataset with:\n\n```bash\npython preprocessing\u002Fprontoqa.py\n```\n\n\nThen run the following to train the model:\n```bash\ntorchrun --nnodes 1 --nproc_per_node 4 run.py args\u002Fprontoqa_coconut.yaml\n```\n\nFind the checkpoint with best validation accuracy, and put the path as `load_model_path` in [args\u002Fprosqa_coconut_eval.yaml](args\u002Fprosqa_coconut_eval.yaml). To evaluate:\n\n```bash\ntorchrun --nnodes 1 --nproc_per_node 4 run.py args\u002Fprosqa_coconut_eval.yaml\n```\n\n\n### ProsQA\n\nThe ProsQA dataset is at [data\u002Fprosqa_*.json](data).\n\nThen run the following to train the model:\n```bash\ntorchrun --nnodes 1 --nproc_per_node 4 run.py args\u002Fprosqa_coconut.yaml\n```\n\nFind the checkpoint with best validation accuracy, and put the path as `load_model_path` in [args\u002Fprosqa_coconut_eval.yaml](args\u002Fprosqa_coconut_eval.yaml). 
To evaluate:\n\n```bash\ntorchrun --nnodes 1 --nproc_per_node 4 run.py args\u002Fprosqa_coconut_eval.yaml\n```\n\n\n\n\n## Citation\nIf you use this code base in your research, please cite our paper with the following BibTex entry:\n```bibtex\n@article{hao2024training,\n  title={Training Large Language Models to Reason in a Continuous Latent Space},\n  author={Hao, Shibo and Sukhbaatar, Sainbayar and Su, DiJia and Li, Xian and Hu, Zhiting and Weston, Jason and Tian, Yuandong},\n  journal={arXiv preprint arXiv:2412.06769},\n  year={2024}\n}\n```\n\n## License\nThis code is released under the MIT license (see [LICENSE](LICENSE)).","# 椰子\n\n该代码库是 [在连续潜在空间中训练大型语言模型进行推理](https:\u002F\u002Farxiv.org\u002Fabs\u002F2412.06769) 的官方实现。\n\n![coconut](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Ffacebookresearch_coconut_readme_d59981bc6594.png)\n\n## 快速入门\n克隆仓库：\n```\ngit clone git@github.com:facebookresearch\u002Fcoconut.git\ncd coconut\n```\n\n设置环境：\n```\nconda create --name coconut python=3.12\nconda activate coconut\npip install -r requirements.txt\n```\n\n代码依赖 [wandb](https:\u002F\u002Fwandb.ai\u002Fsite\u002F) 进行日志记录。请在运行任何实验之前，按照此 [文档](https:\u002F\u002Fdocs.wandb.ai\u002Fref\u002Fcli\u002Fwandb-login\u002F) 登录您的 wandb 账户。\n\n## 数据\n\n用于训练和评估的数据应以如下所示的 JSON 文件形式呈现：\n\n```python\n[\n  {\n    \"question\": \"...\",\n    \"answer\": \"...\",\n    \"steps\": [\"...\", \"...\", ...]\n  },\n  ...\n]\n```\n\n该文件应包含一个数据点列表。每个数据点由一个问题（字符串）、一个答案（字符串）和步骤列表（字符串）组成，其中每一项都是字符串。\n\n例如，您可以通过运行以下命令下载并处理 [GSM8K](https:\u002F\u002Farxiv.org\u002Fabs\u002F2110.14168) 数据集（带有 [增强的训练和验证集](https:\u002F\u002Fgithub.com\u002Fda03\u002FInternalize_CoT_Step_by_Step\u002Ftree\u002Fe06a32ee5e4cd117171daeb4755d2a97ece62761\u002Fdata\u002Fgsm8k)）：\n\n```bash\nbash preprocessing\u002Fgsm_icot.bash\n```\n\n## 参数\n\n一次运行的配置应在 YAML 文件中指定（示例可参见 [这里](args\u002Fgsm_coconut.yaml)）。\n\n- **通用设置**\n\n  - **project**: wandb 的项目名称\n  - **save_path**: 您存储检查点的路径\n  - **only_eval**: 如果为真，则仅加载模型并在 `val_path` 
中的数据上进行测试（必须与 `load_model_path` 一起使用）。否则，在 `train_path` 上训练模型，并在每个 epoch 结束后在 `val_path` 上进行测试。\n\n- **方法**\n  - **coconut**: 训练椰子模型\n  - **cot**: 训练 CoT 模型\n  - **no_thoughts**: 训练无思维的椰子模型\n  - **no_cot**: 训练无 CoT 模型\n\n- **训练设置**\n\n  - **c_thought**: 每个推理步骤中的连续思维数量\n  - **epochs_per_stage**: 每个训练阶段的 epoch 数\n  - **max_latent_stage**: 最大训练阶段数（除初始阶段外）\n  - **pad_latent_to_max**: 如果推理步骤的数量少于当前训练阶段的索引，则填充连续思维的数量。\n  - **save_only_improve**: 仅当最佳验证准确率更新时才保存模型。建议在椰子模型训练中将其设置为 `False`，否则最后阶段的检查点可能不会被保存。\n  - **uniform_prob**: 混合其他阶段数据的概率。标准实验为 0，分析实验为 0.3。\n  - **model_id**: 用于初始化的 Huggingface 模型 ID，例如 `openai-community\u002Fgpt2`\n  - **load_model_path**: 要加载的检查点路径。用于两种情况：(1) 用于评估 (2) 从 CoT 微调模型初始化椰子模型。\n  - **seed**: 随机种子。\n  - **resume**: 继续训练的 epoch。可用于跳过初始训练阶段。\n  - **bf16**: 是否使用 bf16 训练。\n  - **train_path**: 训练集路径。\n  - **val_path**: 验证或测试集路径（取决于 `only_eval`)\n  - **reset_optimizer**: 切换训练阶段时是否重置优化器。\n  - **batch_size_training**: 每块 GPU 上用于训练模型的批量大小。\n  - **debug**: 如果为真，则不进行 wandb 日志记录和模型保存。将使用数据的一个子集。\n  - **gradient_accumulation_steps**: 梯度累积步数\n  - **num_epochs**: 最大训练 epoch 数。\n  - **lr**: 学习率\n  - **weight_decay**: 权重衰减\n\n\n## 训练\n\n运行以下命令（替换 `N_GPUS` 和 `PATH_TO_ARGS`）：\n\n```\ntorchrun --nnodes 1 --nproc_per_node N_GPUS run.py PATH_TO_ARGS\n```\n\n## 复现实验\n\n在此我们提供复现论文中实验的说明。\n\n以下所有命令均假设使用 4 块 A100（80GB）GPU。您可以根据自己的资源情况，在启动运行时更改配置文件中的相应参数（`batch_size_training`、`gradient_accumulation_steps`）以及 `nproc_per_node`。\n\n\n### GSM8K\n\n预处理数据：\n\n```bash\nbash preprocessing\u002Fgsm_icot.bash\n```\n\n首先以 CoT 方式训练模型（作为第 0 阶段训练）\n\n```bash\ntorchrun --nnodes 1 --nproc_per_node 4 run.py args\u002Fgsm_cot.yaml\n```\n\n选择一个检查点作为椰子模型的初始化（预计验证准确率约为 40%）。将 [args\u002Fgsm_coconut.yaml](args\u002Fgsm_coconut.yaml) 中的 `load_model_path` 替换为您选择的检查点，然后运行：\n\n```bash\ntorchrun --nnodes 1 --nproc_per_node 4 run.py args\u002Fgsm_coconut.yaml\n```\n\n找到验证准确率最高的检查点，并将其路径放入 [args\u002Fgsm_coconut_eval.yaml](args\u002Fgsm_coconut_eval.yaml) 中的 
`load_model_path`。进行评估：\n\n```bash\ntorchrun --nnodes 1 --nproc_per_node 4 run.py args\u002Fgsm_coconut_eval.yaml\n```\n\n### ProntoQA\n\n请克隆 [ProntoQA](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2210.01240) 的官方 [GitHub 仓库](https:\u002F\u002Fgithub.com\u002Fasaparov\u002Fprontoqa\u002Ftree\u002Ff0145b867b3c106285ec9ea1941a3f6eb7c6162d)，并使用以下命令生成原始数据集：\n\n```bash\ncd prontoqa\npython run_experiment.py --model-name json --model-size dummy --ordering random --num-trials 10000 --few-shot-examples 0 --ontology fictional --min-hops 5 --max-hops 5 --hops-skip 1\n```\n\n然后将生成的 `5hop_0shot_random.json` 文件复制到 `data` 目录，并使用以下命令预处理数据集：\n\n```bash\npython preprocessing\u002Fprontoqa.py\n```\n\n\n随后运行以下命令训练模型：\n```bash\ntorchrun --nnodes 1 --nproc_per_node 4 run.py args\u002Fprontoqa_coconut.yaml\n```\n\n找到验证准确率最高的检查点，并将其路径放入 [args\u002Fprosqa_coconut_eval.yaml](args\u002Fprosqa_coconut_eval.yaml) 中的 `load_model_path`。进行评估：\n\n```bash\ntorchrun --nnodes 1 --nproc_per_node 4 run.py args\u002Fprosqa_coconut_eval.yaml\n```\n\n\n### ProsQA\n\nProsQA 数据集位于 [data\u002Fprosqa_*.json](data)。\n\n随后运行以下命令训练模型：\n```bash\ntorchrun --nnodes 1 --nproc_per_node 4 run.py args\u002Fprosqa_coconut.yaml\n```\n\n找到验证准确率最高的检查点，并将其路径放入 [args\u002Fprosqa_coconut_eval.yaml](args\u002Fprosqa_coconut_eval.yaml) 中的 `load_model_path`。进行评估：\n\n```bash\ntorchrun --nnodes 1 --nproc_per_node 4 run.py args\u002Fprosqa_coconut_eval.yaml\n```\n\n\n\n\n## 引用\n如果您在研究中使用此代码库，请使用以下 BibTex 条目引用我们的论文：\n```bibtex\n@article{hao2024training,\n  title={Training Large Language Models to Reason in a Continuous Latent Space},\n  author={Hao, Shibo and Sukhbaatar, Sainbayar and Su, DiJia and Li, Xian and Hu, Zhiting and Weston, Jason and Tian, Yuandong},\n  journal={arXiv preprint arXiv:2412.06769},\n  year={2024}\n}\n```\n\n## 许可证\n此代码以 MIT 许可证发布（参见 [LICENSE](LICENSE)）。","# Coconut 快速上手指南\n\nCoconut 是论文《Training Large Language Models to Reason in a Continuous Latent Space》的官方实现，旨在训练大语言模型在连续潜在空间中进行推理。\n\n## 环境准备\n\n*   
**操作系统**: Linux (推荐)\n*   **Python**: 3.12\n*   **GPU**: 支持 CUDA 的 NVIDIA GPU (复现实验建议配备多卡，如 4x A100)\n*   **依赖管理**: Conda\n*   **外部服务**: 需注册 [Weights & Biases (wandb)](https:\u002F\u002Fwandb.ai\u002F) 账号用于实验日志记录\n\n## 安装步骤\n\n### 1. 克隆代码库\n```bash\ngit clone git@github.com:facebookresearch\u002Fcoconut.git\ncd coconut\n```\n\n### 2. 创建并激活虚拟环境\n```bash\nconda create --name coconut python=3.12\nconda activate coconut\n```\n\n### 3. 安装依赖\n建议使用国内镜像源加速安装（如清华源）：\n```bash\npip install -r requirements.txt -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple\n```\n\n### 4. 配置 Wandb\n在运行任何实验前，请登录您的 wandb 账号：\n```bash\nwandb login\n```\n*(按提示输入 API Key，若无法访问官网，可在国内网络环境下操作或使用代理)*\n\n## 基本使用\n\n### 1. 数据准备\n训练数据需为 JSON 格式，包含 `question`、`answer` 和 `steps` 字段。\n以 GSM8K 数据集为例，运行以下脚本进行预处理：\n```bash\nbash preprocessing\u002Fgsm_icot.bash\n```\n\n### 2. 配置文件\n修改 `args\u002F` 目录下的 YAML 配置文件（如 `args\u002Fgsm_coconut.yaml`）。\n关键参数说明：\n*   `train_path`: 训练集路径\n*   `val_path`: 验证集路径\n*   `save_path`: 模型检查点保存路径\n*   `model_id`: 初始化的 HuggingFace 模型 ID (如 `openai-community\u002Fgpt2`)\n*   `load_model_path`: 若从 CoT 微调后的模型继续训练，在此填入检查点路径\n\n### 3. 启动训练\n使用 `torchrun` 启动训练。以下命令示例使用 4 张 GPU：\n\n```bash\ntorchrun --nnodes 1 --nproc_per_node 4 run.py args\u002Fgsm_coconut.yaml\n```\n\n**典型两阶段训练流程（参考 GSM8K 复现）：**\n\n1.  **阶段一：训练 CoT 模型**\n    ```bash\n    torchrun --nnodes 1 --nproc_per_node 4 run.py args\u002Fgsm_cot.yaml\n    ```\n2.  **阶段二：训练 Coconut 模型**\n    *   选择阶段一中验证准确率较高（约 40%）的检查点。\n    *   在 `args\u002Fgsm_coconut.yaml` 中设置 `load_model_path` 为该检查点路径。\n    *   运行：\n    ```bash\n    torchrun --nnodes 1 --nproc_per_node 4 run.py args\u002Fgsm_coconut.yaml\n    ```\n\n### 4. 
模型评估\n找到验证集表现最佳的检查点，在评估配置文件（如 `args\u002Fgsm_coconut_eval.yaml`）中设置 `load_model_path`，并将 `only_eval` 设为 `true`，然后运行：\n\n```bash\ntorchrun --nnodes 1 --nproc_per_node 4 run.py args\u002Fgsm_coconut_eval.yaml\n```","某教育科技公司的算法团队正在开发一款能够逐步讲解复杂数学题的 AI 辅导助手，旨在提升学生对解题逻辑的理解。\n\n### 没有 coconut 时\n- **推理过程僵硬**：模型只能生成离散的文本步骤，一旦中间某步出错，后续逻辑极易崩塌，难以自我修正。\n- **训练效率低下**：为了让模型学会多步推理，需要耗费大量算力进行传统的思维链（CoT）微调，且收敛速度慢。\n- **泛化能力受限**：面对未见过的题型变体，模型往往死记硬背训练数据中的固定话术，缺乏真正的逻辑迁移能力。\n- **调试困难**：开发者无法干预或观察模型内部的思考“状态”，只能被动接受最终输出的文本结果。\n\n### 使用 coconut 后\n- **连续空间推理**：coconut 让模型在连续潜在空间中进行“思考”，使推理过程更加平滑流畅，显著提升了多步推导的稳定性。\n- **分阶段高效训练**：利用 coconut 的分阶段训练机制，团队能用更少的 epoch 让模型掌握复杂逻辑，大幅降低了 GPU 资源消耗。\n- **逻辑泛化增强**：通过在潜在空间中学习通用的推理模式，coconut 帮助模型轻松应对各类变形数学题，不再依赖死记硬背。\n- **可控性提升**：开发者可以通过调整连续思维的数量（c_thought）等参数，精细控制模型的推理深度和复杂度。\n\ncoconut 通过将离散的语言推理转化为连续的潜在空间运算，从根本上提升了大模型解决复杂逻辑问题的效率与鲁棒性。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Ffacebookresearch_coconut_6ebee5c1.png","facebookresearch","Meta Research","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Ffacebookresearch_449342bd.png","",null,"https:\u002F\u002Fopensource.fb.com","https:\u002F\u002Fgithub.com\u002Ffacebookresearch",[80,84],{"name":81,"color":82,"percentage":83},"Python","#3572A5",98.1,{"name":85,"color":86,"percentage":87},"Shell","#89e051",1.9,1572,172,"2026-04-15T21:33:15","MIT","未说明","必需 NVIDIA GPU。官方实验基于 4 张 A100 (80GB) GPU 进行，支持多卡分布式训练 (torchrun)，具体显存需求取决于模型大小和批次设置。",{"notes":95,"python":96,"dependencies":97},"1. 必须使用 conda 创建 Python 3.12 环境并安装 requirements.txt 中的依赖。\n2. 运行前需登录 Weights & Biases (wandb) 账号用于日志记录。\n3. 训练启动需使用 torchrun 命令，官方示例配置为 4 卡并行，用户需根据实际显卡数量调整 nproc_per_node 及配置文件中的 batch_size 和 gradient_accumulation_steps。\n4. 
数据需预处理为特定的 JSON 格式（包含 question, answer, steps 字段）。","3.12",[98,99,100,101,102],"torch","transformers","wandb","accelerate","datasets",[35,14],"2026-03-27T02:49:30.150509","2026-04-18T00:45:52.442925",[107,112,117,122,127,132,137],{"id":108,"question_zh":109,"answer_zh":110,"source_url":111},38351,"代码中的训练轮数（25 epochs）与论文中描述的（6+3+3 等）不一致，应该使用哪个配置？","配置文件中的 25 是训练的**最大**轮数上限。您应该根据验证集准确率来选择最佳检查点（checkpoint）。在论文报告的实验中，实际选用的是训练了 6 个 epoch 的检查点。请阅读 README 中的参数描述和论文以获取更多细节。","https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Fcoconut\u002Fissues\u002F5",{"id":113,"question_zh":114,"answer_zh":115,"source_url":116},38352,"如何复现论文中提到的置信区间（例如 34.1 ± 1.5）？运行了多少次实验？","每个设置运行了 3 次。置信区间是通过计算均值和标准差，结合 95% 置信水平的 Z 分数得出的。具体计算逻辑如下：\n1. 计算均值 (mean) 和标准差 (stddev)。\n2. 设定置信水平为 0.95，计算临界值 (critical_value = stats.norm.ppf((1 + 0.95) \u002F 2))。\n3. 计算误差范围 (margin_of_error = critical_value * stddev \u002F sqrt(3))。\n4. 置信区间为 [mean - margin_of_error, mean + margin_of_error]。","https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Fcoconut\u002Fissues\u002F19",{"id":118,"question_zh":119,"answer_zh":120,"source_url":121},38353,"Coconut 的推理过程在代码实现中与论文描述似乎不一致（论文说直接使用隐藏状态，代码却生成了 token），这是为什么？","这是一个误解。潜在的推理过程（latent reasoning）实际上是在 `forward` 函数中完成的（参考代码第 218 行）。您在 `generate()` 函数中看到的生成 `next_token` 并获取嵌入的代码，是在潜在推理完成后用于生成最终答案的部分（对应论文图 1），这部分是在 token 空间中进行的，与标准语言模型一致。","https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Fcoconut\u002Fissues\u002F7",{"id":123,"question_zh":124,"answer_zh":125,"source_url":126},38354,"GSM8K 实验中使用的基础模型和具体超参数是什么？是否有针对 LLaMA 的训练计划？","当前开源代码和实验使用的是 GPT-2 Small 模型（HuggingFace ID: openai-community\u002Fgpt2）。关于 LLaMA 模型的训练超参数（如学习率、模型大小、各阶段轮数），目前尚未提供具体的配置文件或发布计划，代码注释仅提到已在 LLaMA 上测试过。","https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Fcoconut\u002Fissues\u002F11",{"id":128,"question_zh":129,"answer_zh":130,"source_url":131},38355,"在 GSM8K 上无法复现论文结果（准确率偏低），可能的原因有哪些？","请注意以下几点以确保复现成功：\n1. 
代码中将 Coconut 的初始阶段训练替换为了 CoT SFT，两者本质等价，仅 `\u003Cbot>` 和 `\u003Ceot>` 标记不同，不影响性能。\n2. 必须根据验证集准确率选择最佳检查点，而不是直接使用最后一个 epoch 的模型。\n3. 某些配置（如 `resume: 3`）是为了跳过初始阶段训练，因为直接加载了 CoT SFT 检查点。\n4. 确保使用上传的确切配置文件，通常无需额外调整即可复现结果。","https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Fcoconut\u002Fissues\u002F38",{"id":133,"question_zh":134,"answer_zh":135,"source_url":136},38356,"代码中 `run.py` 第 153 行是否存在逻辑错误（将值写回自身）？","是的，这是一个代码错误。该行代码确实没有实际作用。正确的写法应该是从 embeddings 权重中获取目标 ID 对应的数据，即修改为：`target_embedding = embeddings.weight.data[target_id]`。","https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Fcoconut\u002Fissues\u002F18",{"id":138,"question_zh":139,"answer_zh":140,"source_url":141},38357,"是否有计划发布预训练好的模型权重以便直接进行推理？","截至该 Issue 讨论时，维护者尚未明确发布预训练模型权重的具体时间表。用户主要被引导使用提供的代码自行训练。社区建议在等待期间可以参考相关论文（如 Seq-VCR）或尝试不同的层策略来优化效果。","https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Fcoconut\u002Fissues\u002F3",[]]