[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-OptimalScale--LMFlow":3,"tool-OptimalScale--LMFlow":64},[4,17,27,35,43,56],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":16},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,3,"2026-04-05T11:01:52",[13,14,15],"开发框架","图像","Agent","ready",{"id":18,"name":19,"github_repo":20,"description_zh":21,"stars":22,"difficulty_score":23,"last_commit_at":24,"category_tags":25,"status":16},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",138956,2,"2026-04-05T11:33:21",[13,15,26],"语言模型",{"id":28,"name":29,"github_repo":30,"description_zh":31,"stars":32,"difficulty_score":23,"last_commit_at":33,"category_tags":34,"status":16},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",107662,"2026-04-03T11:11:01",[13,14,15],{"id":36,"name":37,"github_repo":38,"description_zh":39,"stars":40,"difficulty_score":23,"last_commit_at":41,"category_tags":42,"status":16},3704,"NextChat","ChatGPTNextWeb\u002FNextChat","NextChat 是一款轻量且极速的 AI 助手，旨在为用户提供流畅、跨平台的大模型交互体验。它完美解决了用户在多设备间切换时难以保持对话连续性，以及面对众多 AI 模型不知如何统一管理的痛点。无论是日常办公、学习辅助还是创意激发，NextChat 都能让用户随时随地通过网页、iOS、Android、Windows、MacOS 或 Linux 端无缝接入智能服务。\n\n这款工具非常适合普通用户、学生、职场人士以及需要私有化部署的企业团队使用。对于开发者而言，它也提供了便捷的自托管方案，支持一键部署到 Vercel 或 Zeabur 等平台。\n\nNextChat 的核心亮点在于其广泛的模型兼容性，原生支持 Claude、DeepSeek、GPT-4 及 Gemini Pro 等主流大模型，让用户在一个界面即可自由切换不同 AI 能力。此外，它还率先支持 MCP（Model Context Protocol）协议，增强了上下文处理能力。针对企业用户，NextChat 提供专业版解决方案，具备品牌定制、细粒度权限控制、内部知识库整合及安全审计等功能，满足公司对数据隐私和个性化管理的高标准要求。",87618,"2026-04-05T07:20:52",[13,26],{"id":44,"name":45,"github_repo":46,"description_zh":47,"stars":48,"difficulty_score":23,"last_commit_at":49,"category_tags":50,"status":16},2268,"ML-For-Beginners","microsoft\u002FML-For-Beginners","ML-For-Beginners 是由微软推出的一套系统化机器学习入门课程，旨在帮助零基础用户轻松掌握经典机器学习知识。这套课程将学习路径规划为 12 周，包含 26 节精炼课程和 52 道配套测验，内容涵盖从基础概念到实际应用的完整流程，有效解决了初学者面对庞大知识体系时无从下手、缺乏结构化指导的痛点。\n\n无论是希望转型的开发者、需要补充算法背景的研究人员，还是对人工智能充满好奇的普通爱好者，都能从中受益。课程不仅提供了清晰的理论讲解，还强调动手实践，让用户在循序渐进中建立扎实的技能基础。其独特的亮点在于强大的多语言支持，通过自动化机制提供了包括简体中文在内的 50 
多种语言版本，极大地降低了全球不同背景用户的学习门槛。此外，项目采用开源协作模式，社区活跃且内容持续更新，确保学习者能获取前沿且准确的技术资讯。如果你正寻找一条清晰、友好且专业的机器学习入门之路，ML-For-Beginners 将是理想的起点。",84991,"2026-04-05T10:45:23",[14,51,52,53,15,54,26,13,55],"数据工具","视频","插件","其他","音频",{"id":57,"name":58,"github_repo":59,"description_zh":60,"stars":61,"difficulty_score":10,"last_commit_at":62,"category_tags":63,"status":16},3128,"ragflow","infiniflow\u002Fragflow","RAGFlow 是一款领先的开源检索增强生成（RAG）引擎，旨在为大语言模型构建更精准、可靠的上下文层。它巧妙地将前沿的 RAG 技术与智能体（Agent）能力相结合，不仅支持从各类文档中高效提取知识，还能让模型基于这些知识进行逻辑推理和任务执行。\n\n在大模型应用中，幻觉问题和知识滞后是常见痛点。RAGFlow 通过深度解析复杂文档结构（如表格、图表及混合排版），显著提升了信息检索的准确度，从而有效减少模型“胡编乱造”的现象，确保回答既有据可依又具备时效性。其内置的智能体机制更进一步，使系统不仅能回答问题，还能自主规划步骤解决复杂问题。\n\n这款工具特别适合开发者、企业技术团队以及 AI 研究人员使用。无论是希望快速搭建私有知识库问答系统，还是致力于探索大模型在垂直领域落地的创新者，都能从中受益。RAGFlow 提供了可视化的工作流编排界面和灵活的 API 接口，既降低了非算法背景用户的上手门槛，也满足了专业开发者对系统深度定制的需求。作为基于 Apache 2.0 协议开源的项目，它正成为连接通用大模型与行业专有知识之间的重要桥梁。",77062,"2026-04-04T04:44:48",[15,14,13,26,54],{"id":65,"github_repo":66,"name":67,"description_en":68,"description_zh":69,"ai_summary_zh":69,"readme_en":70,"readme_zh":71,"quickstart_zh":72,"use_case_zh":73,"hero_image_url":74,"owner_login":75,"owner_name":75,"owner_avatar_url":76,"owner_bio":77,"owner_company":78,"owner_location":78,"owner_email":78,"owner_twitter":75,"owner_website":79,"owner_url":80,"languages":81,"stars":90,"forks":91,"last_commit_at":92,"license":93,"difficulty_score":10,"env_os":94,"env_gpu":95,"env_ram":96,"env_deps":97,"category_tags":107,"github_topics":108,"view_count":116,"oss_zip_url":78,"oss_zip_packed_at":78,"status":16,"created_at":117,"updated_at":118,"faqs":119,"releases":140},552,"OptimalScale\u002FLMFlow","LMFlow","An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.","LMFlow 是一个专为大型基础模型设计的可扩展工具箱，核心功能涵盖模型的微调和推理。LMFlow 致力于降低大模型的使用门槛，解决了传统开发流程繁琐、硬件适配复杂以及资源消耗过大的痛点。通过提供统一且高效的接口，LMFlow 让模型训练和部署变得更加便捷可靠，真正践行“大模型人人可用”的理念。\n\nLMFlow 非常适合人工智能开发者、研究人员以及对大模型应用感兴趣的用户。无论是想快速验证想法的工程师，还是需要优化训练策略的研究者，都能轻松上手。LMFlow 的技术亮点十分突出，不仅支持多种优化器自定义和推测性解码加速推理，还内置了 Llama-3 等最新对话模板。特别值得一提的是其低显存训练方案，例如 LISA 技术能让 7B 模型在有限资源下运行。近期版本还加强了对 Accelerate 的支持，进一步简化了配置流程。如果你正在寻找一个既专业又易用的大模型开发框架，LMFlow 绝对值得关注。","\u003Cp align=\"center\" width=\"50%\">\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOptimalScale_LMFlow_readme_53549a4159de.png\" alt=\"LMFlow\" style=\"width: 50%; min-width: 200px; display: block; margin: auto; background-color: transparent;\">\n\u003C\u002Fp>\n\n# LMFlow\n\n\u003Ch4 align=\"center\">\n    \u003Cp>\n        \u003Cb>English\u003C\u002Fb> |\n        \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fblob\u002Fmain\u002Fdocs\u002Freadme\u002FREADME_zh-hans.md\">简体中文\u003C\u002Fa> |\n        \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fblob\u002Fmain\u002Fdocs\u002Freadme\u002FREADME_es.md\">Español\u003C\u002Fa> |\n        \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fblob\u002Fmain\u002Fdocs\u002Freadme\u002FREADME_jp.md\">日本語\u003C\u002Fa> |\n        \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fblob\u002Fmain\u002Fdocs\u002Freadme\u002FREADME_ko.md\">한국어\u003C\u002Fa> |\n        \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fblob\u002Fmain\u002Fdocs\u002Freadme\u002FREADME_hindi.md\">हिंदी\u003C\u002Fa>\n    \u003Cp>\n\u003C\u002Fh4>\n\n[![Website](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FWebsite-Demo-20B2AA.svg)](https:\u002F\u002Flmflow.com)\n[![Code 
License](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FCode%20License-Apache_2.0-green.svg)](https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fblob\u002Fmain\u002FLICENSE)\n[![Python 3.9+](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FPython-3.9+-blue.svg)](https:\u002F\u002Fwww.python.org\u002Fdownloads\u002Frelease\u002Fpython-390\u002F)\n[![Doc](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FWebsite-Doc-ff69b4.svg)](https:\u002F\u002Foptimalscale.github.io\u002FLMFlow\u002F)\n[![Embark](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FDiscord-LMFlow-%237289da.svg?logo=discord)](https:\u002F\u002Fdiscord.gg\u002Fu9VJNpzhvA)\n[![slack badge](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FSlack-Join-blueviolet?logo=slack&amp)](https:\u002F\u002Fjoin.slack.com\u002Ft\u002Flmflow\u002Fshared_invite\u002Fzt-1wju9nicy-woXbNtS~5MavHSAtiMxmxQ)\n[![WeChat badge](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FWeChat-Join-brightgreen?logo=wechat&amp)](https:\u002F\u002Fibb.co\u002FZhM4hhn)\n\nAn extensible, convenient, and efficient toolbox for finetuning large machine learning models, designed to be user-friendly, speedy and reliable, and accessible to the entire community.\n\n\u003Cp align=\"center\" width=\"100%\">\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOptimalScale_LMFlow_readme_47e31b5050ad.png\" alt=\"LMFlow-features\" style=\"width: 100%; min-width: 300px; display: block; margin: auto;\">\n\u003C\u002Fp>\n\n## Latest News\n> [!IMPORTANT]\n> * :exclamation: [2025-07-09] We have a major update to LMFlow with full Accelerate support and extensive streamlining. If you're looking for the previous version, please use `git checkout v0.0.10`, or check out the [v0.0.10 branch](https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Ftree\u002Fv0.0.10). View all releases [here](https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Ftags).\n\n* [2024-12-02] Support [Hymba](https:\u002F\u002Fgithub.com\u002FNVlabs\u002Fhymba), a new family of small language models featuring a hybrid-head parallel architecture. Check out [Post-training Hymba](https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Ftree\u002Fmain\u002Fexperimental\u002FHymba) for more details.\n* [2024-07-01] 🏆 LMFlow receives the [**Best Demo Paper Award**](https:\u002F\u002Fdocs.google.com\u002Fpresentation\u002Fd\u002F1TVDooAZqkNObz5ysVhDFtqnnVHR-u8wqYvgix-gzPMs\u002Fedit#slide=id.g2e55907bbcc_0_70) at **NAACL 2024**! 🎉\n* [2024-06-30] Expanding Optimization Options! We now support custom optimizer training with a variety of optimizers. Dive into the details and try out the new features with our updated script at [custom_optimizers](https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fblob\u002Fmain\u002Fscripts\u002Frun_finetune_with_custom_optim.sh).\n* [2024-04-25] :rocket: Support conversation templates! We've preset the latest [Llama-3](https:\u002F\u002Fhuggingface.co\u002Fmeta-llama\u002FMeta-Llama-3-70B) and [Phi-3](https:\u002F\u002Fhuggingface.co\u002Fmicrosoft\u002FPhi-3-mini-128k-instruct) conversation templates as well as some frequently used templates such as `chatml` (see all templates [here](https:\u002F\u002Foptimalscale.github.io\u002FLMFlow\u002Fexamples\u002FDATASETS.html#conversation-template)), and we are working on adding more preset templates. Add the corresponding `--conversation_template` in the shell script and you are all set! 
:rocket:\n\n\u003Cdetails> \u003Csummary>More news...\u003C\u002Fsummary>\n\n* [2024-03-27] Support [LISA](https:\u002F\u002Farxiv.org\u002Fabs\u002F2403.17919), enabling 7B training in 24G memory without offloading!\n* [2023-09-11] Support [speculative decoding](https:\u002F\u002Farxiv.org\u002Fabs\u002F2211.17192). Check out [speculative_decoding](https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fblob\u002Fmain\u002Fscripts\u002Fspeculative_decoding\u002FREADME.md) for the usage and acceleration details.\n* [2023-08-14] Support long context inference with position interpolation (Linear & NTK scaling) for LLaMA models. Check out [position_interpolation](https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fblob\u002Fmain\u002Freadme\u002FPosition_Interpolation.md) for more details.\n* [2023-08-07] Support [Flash Attention-2](https:\u002F\u002Fcrfm.stanford.edu\u002F2023\u002F07\u002F17\u002Fflash2.html). Check out [flash_attention](https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fblob\u002Fmain\u002Freadme\u002Fflash_attn2.md) for more details.\n* [2023-08-02] Support [Llama2](https:\u002F\u002Fai.meta.com\u002Fllama\u002F), [ChatGLM2](https:\u002F\u002Fhuggingface.co\u002FTHUDM\u002Fchatglm2-6b), and [Baichuan](https:\u002F\u002Fhuggingface.co\u002Fbaichuan-inc\u002FBaichuan-7B) models.\n* [2023-07-23] The [LMFlow multimodal chatbot](https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fblob\u002Fmain\u002Fscripts\u002Frun_vis_chatbot_gradio_minigpt4.sh) is now available! It supports multimodal inputs of images and texts. An [Online Demo](http:\u002F\u002Fmultimodal.lmflow.online) is also provided. (We host the service on a single GPU, so you may occasionally see \"queuing\" or \"application busy\" messages when multiple users access it at the same time; please wait and try again later.)![image](https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fblob\u002Frpan-vision-encoder\u002Fdocs\u002Fassets\u002Fmultimodal-chatbot-demo.gif)\n* [2023-06-22] The [LMFlow paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2306.12420) is out! Check out our implementation details at https:\u002F\u002Farxiv.org\u002Fabs\u002F2306.12420\n* [2023-06-16] Our finetuned Robin-33B-V2 scored an impressive 64.1 on the Huggingface LLM leaderboard in our offline evaluation, outperforming major open-source LLMs! All checkpoints (7B, 13B, 33B, and 65B) are [released](https:\u002F\u002Fhuggingface.co\u002FOptimalScale)! Check out the performance [here](https:\u002F\u002Fmedium.com\u002F@hkust.ml\u002Frobin-v2-launches-achieves-unparalleled-performance-on-openllm-4f6886e822c1).\n* [2023-06-07] LMFlow is now officially available on PyPI! Install it with `pip install lmflow-finetune`!\n* [2023-05-30] Release [Robin-13B-v2](https:\u002F\u002Fhuggingface.co\u002FOptimalScale\u002Frobin-13b-v2-delta) and [Robin-33B-v2](https:\u002F\u002Fhuggingface.co\u002FOptimalScale\u002Frobin-33b-v2-delta)!\n* [2023-05-15] Release [LMFlow-data](http:\u002F\u002Flmflow.org:5000\u002Flmflow_data.tar.gz), the training dataset of Robin-7B-v2. A new [test dataset](http:\u002F\u002Flmflow.org:5000\u002Flmflow_chat_en_dialog_multiturn_single_nll_text2text.tar.gz) is also released.\n* [2023-05-09] Release [Robin-7B-v2](http:\u002F\u002Flmflow.org:5000\u002Frobin-7b-v2-delta.tar.gz), achieving competitive performance on chitchat, commonsense reasoning and instruction-following tasks. 
Refer to our [comprehensive study](https:\u002F\u002Fmedium.com\u002F@hkust.ml\u002Flmflow-benchmark-an-automatic-evaluation-framework-for-open-source-llms-ef5c6f142418).\n* [2023-05-08] Release [LMFlow Benchmark](https:\u002F\u002Fmedium.com\u002F@hkust.ml\u002Flmflow-benchmark-an-automatic-evaluation-framework-for-open-source-llms-ef5c6f142418), an automatic evaluation framework for open-source chat-style LLMs. [Benchmark results](https:\u002F\u002Fdocs.google.com\u002Fspreadsheets\u002Fd\u002F1JYh4_pxNzmNA9I0YM2epgRA7VXBIeIGS64gPJBg5NHA\u002Fedit#gid=0) on 31 popular models are reported. [Participate in LMFlow Benchmark](https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow#33-lmflow-benchmark).\n* [2023-04-21] Release [Robin-7B](http:\u002F\u002Flmflow.org:5000\u002Frobin-7b.tar.gz) (based on LLaMA-7B), and two models for commercial use: Parakeets-2.7B (based on GPT-NEO-2.7B) and Cokatoo-7B (based on StableLM-7B). [Download here](https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Ftree\u002Fmain#model-zoo)\n* [2023-04-15] Inference: Support streaming output and ChatGLM.\n* [2023-04-10] We propose a new alignment algorithm: [Reward rAnked FineTuning (RAFT)](https:\u002F\u002Foptimalscale.github.io\u002FLMFlow\u002Fexamples\u002Fraft.html), which is more efficient than conventional (PPO-based) RLHF. [[Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2304.06767)]\n* [2023-04-02] [Web service](https:\u002F\u002Flmflow.com\u002F) is online!\n* [2023-04-01] Release three instruction-tuned checkpoints and three medical checkpoints in the [model zoo](https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow#model-zoo): LLaMA-7B-tuned, LLaMA-13B-tuned, LLaMA-33B-tuned, LLaMA-7B-medical, LLaMA-13B-medical, and LLaMA-33B-medical.\n* [2023-03-27] Support full tuning and LoRA tuning for all decoder models.\n* [2023-03-27] [Task-tuned model beats ChatGPT in the medical domain](https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow#model-performance).\n* [2023-03-27] Release code and checkpoints - [version 0.0.1](https:\u002F\u002Foptimalscale.github.io\u002FLMFlow\u002F)! [Our task-tuned model beats ChatGPT in the medical domain](https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow#model-performance).\n\n\u003C\u002Fdetails>\n\n## Table of Contents\n\n- [LMFlow](#lmflow)\n  - [Latest News](#latest-news)\n  - [Table of Contents](#table-of-contents)\n  - [Quick Start](#quick-start)\n    - [Setup](#setup)\n    - [Prepare Dataset](#prepare-dataset)\n    - [Finetuning](#finetuning)\n      - [Estimated Hardware Requirement](#estimated-hardware-requirement)\n      - [Full Finetuning](#full-finetuning)\n      - [LISA](#lisa)\n      - [LoRA](#lora)\n    - [Inference](#inference)\n    - [Deployment](#deployment)\n    - [Evaluation](#evaluation)\n  - [Supported Features](#supported-features)\n  - [Support](#support)\n  - [License](#license)\n  - [Citation](#citation)\n\n\n## Quick Start\n\n### Setup\n\nOur package has been tested on Linux (Ubuntu 20.04). Other OS platforms (MacOS, Windows) are not fully tested, so you may encounter unexpected errors. If you are using LMFlow for the first time, we recommend trying it on a Linux machine or Google Colab.\n\n```bash\ngit clone -b v1.0.0 https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow.git\ncd LMFlow\nconda create -n lmflow python=3.9 -y\nconda activate lmflow\nconda install mpi4py\npip install -e .\n```\n\n\u003Cdetails>\u003Csummary> Looking for a previous version? 
\u003C\u002Fsummary>\n\n```bash\ngit clone -b v0.0.10 https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow.git\ncd LMFlow\nconda create -n lmflow python=3.9 -y\nconda activate lmflow\nconda install mpi4py\npip install -e .\n```\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\u003Csummary> For CUDA versions 10.3-11.7 \u003C\u002Fsummary>\n\n```bash\ngit clone -b v0.0.5 https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow.git\ncd LMFlow\nconda create -n lmflow python=3.9 -y\nconda activate lmflow\nconda install mpi4py\npip install -e .\n```\n\n\u003C\u002Fdetails>\n\n> [!TIP]\n> We use WandB to track and visualize the training process by default. Before running the training scripts, users may need to log in to WandB using the command: \n>\n>```bash\n>wandb login\n>```\n>\n> For detailed instructions, refer to the [WandB Quickstart Guide](https:\u002F\u002Fdocs.wandb.ai\u002Fquickstart\u002F). Step 1 (registration) and Step 2 (login using your WandB API key) should be sufficient to set up your environment.\n>\n> \u003Cdetails>\u003Csummary>Disabling wandb\u003C\u002Fsummary>  \n>\n> One can disable wandb by either:  \n>\n> 1. Adding an environment variable before running the training command:\n>\n>```bash\n>export WANDB_MODE=disabled\n>```\n>\n> 2. Or specifying the integrations to report the results and logs to. In the training script, add:\n>\n>```bash\n>--report_to none \\\n>```\n>\n> \u003C\u002Fdetails>\n\n### Prepare Dataset\n\nPlease refer to our [doc](https:\u002F\u002Foptimalscale.github.io\u002FLMFlow\u002Fexamples\u002FDATASETS.html).\n
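\nAs a quick orientation (the doc above is authoritative), an LMFlow dataset is a JSON file with a `type` field and a list of `instances`. Below is a minimal sketch of the simplest `text_only` layout, written out with a shell heredoc; the conversation datasets used in the examples below follow the richer `conversation` schema described in the doc:\n\n```bash\n# Sketch only: see the dataset doc for the authoritative schema,\n# including the conversation type used by the finetuning examples below.\nmkdir -p data\u002Fmy_text_dataset\ncat > data\u002Fmy_text_dataset\u002Ftrain.json \u003C\u003C'EOF'\n{\n  \"type\": \"text_only\",\n  \"instances\": [\n    { \"text\": \"Example training text.\" },\n    { \"text\": \"Another sample.\" }\n  ]\n}\nEOF\n```\n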
\n### Finetuning\n\n#### Estimated Hardware Requirement\n\n| Method                 | 0.5B |  3B  |  7B  |  14B  |  30B  |  70B  |  `x`B   |\n| ---------------------- | ---- | ---- | ---- | ----- | ----- | ----- | ------- |\n| Full `bf16`\u002F`fp16`     |  9GB | 55GB | 120GB | 240GB | 600GB | 1200GB | `18x`GB |\n| LoRA                   |  1GB | 6GB  | 16GB |  32GB |  64GB | 160GB |  `2x`GB |\n| QLoRA `quant_bit=8`    | 0.7GB| 3GB  | 10GB |  20GB |  40GB |  80GB |  `x`GB  |\n| QLoRA `quant_bit=4`    | 0.4GB| 1.5GB|  6GB |  12GB |  24GB |  48GB | `x\u002F2`GB |\n\n\n#### Full Finetuning\n\nFull finetuning updates all the parameters of the language model.\nHere is an example of finetuning a GPT-2 base model.\n\n```sh\ncd data && .\u002Fdownload.sh alpaca && cd -\n\nbash .\u002Fscripts\u002Frun_finetune.sh \\\n  --model_name_or_path gpt2 \\\n  --dataset_path data\u002Falpaca\u002Ftrain_conversation \\\n  --output_model_path output_models\u002Ffinetuned_gpt2\n```\n\n> [!TIP]\n> For conversation datasets, specify a conversation template for better performance by adding `--conversation_template` to the command.\n>\n> \u003Cdetails>\u003Csummary>Llama-3-8B conversation dataset example\u003C\u002Fsummary>  \n>\n>```bash\n>cd data && .\u002Fdownload.sh alpaca && cd -\n>\n>bash .\u002Fscripts\u002Frun_finetune.sh \\\n>  --model_name_or_path meta-llama\u002FMeta-Llama-3-8B \\\n>  --dataset_path data\u002Falpaca\u002Ftrain_conversation \\\n>  --conversation_template llama3 \\\n>  --output_model_path output_models\u002Ffinetuned_llama3_8b\n>```\n>\n> \u003C\u002Fdetails>\n\n#### LISA\n\n[LISA](https:\u002F\u002Farxiv.org\u002Fabs\u002F2403.17919) is a memory-efficient finetuning algorithm that allows a tradeoff between memory usage and the number of randomly unfrozen layers. This script has currently only been tested on single GPUs. Please stay tuned for our latest updates :smile:\n\n```sh\ncd data && .\u002Fdownload.sh alpaca && cd -\n\nbash .\u002Fscripts\u002Frun_finetune_with_lisa.sh \\\n  --model_name_or_path meta-llama\u002FLlama-2-7b-hf \\\n  --dataset_path data\u002Falpaca\u002Ftrain_conversation \\\n  --output_model_path output_models\u002Ffinetuned_llama2_7b \\\n  --lisa_activated_layers 1 \\\n  --lisa_interval_steps 20\n```\n\n> [!TIP]\n> \u003Cdetails>\u003Csummary>Llama-2-7B conversation dataset example\u003C\u002Fsummary>  \n>\n>```bash\n>cd data && .\u002Fdownload.sh alpaca && cd -\n>\n>bash .\u002Fscripts\u002Frun_finetune_with_lisa.sh \\\n>  --model_name_or_path meta-llama\u002FLlama-2-7b-hf \\\n>  --dataset_path data\u002Falpaca\u002Ftrain_conversation \\\n>  --conversation_template llama2 \\\n>  --output_model_path output_models\u002Ffinetuned_llama2_7b_lisa \\\n>  --lisa_activated_layers 1 \\\n>  --lisa_interval_steps 20\n>```\n>\n> \u003C\u002Fdetails>\n\n#### LoRA\n\nLoRA is a parameter-efficient finetuning algorithm and is more efficient than full finetuning.\n\n```sh\ncd data && .\u002Fdownload.sh alpaca && cd -\n\nbash .\u002Fscripts\u002Frun_finetune_with_lora.sh \\\n  --model_name_or_path facebook\u002Fgalactica-1.3b \\\n  --dataset_path data\u002Falpaca\u002Ftrain_conversation \\\n  --output_lora_path output_models\u002Ffinetuned_galactica_lora\n```\n\n> [!TIP]\n> \u003Cdetails>\u003Csummary>Llama-2-7B conversation dataset example\u003C\u002Fsummary>  \n>\n>```bash\n>cd data && .\u002Fdownload.sh alpaca && cd -\n>\n>bash .\u002Fscripts\u002Frun_finetune_with_lora.sh \\\n>  --model_name_or_path meta-llama\u002FLlama-2-7b-hf \\\n>  --dataset_path data\u002Falpaca\u002Ftrain_conversation \\\n>  --conversation_template llama2 \\\n>  --output_model_path output_models\u002Ffinetuned_llama2_7b_lora\n>```\n>\n> \u003C\u002Fdetails>\n>\n> \u003Cdetails>\u003Csummary>Merge LoRA Weight\u003C\u002Fsummary>\n>\n>Merge the LoRA weights and the base model into one using:  \n>\n>```sh\n>bash .\u002Fscripts\u002Frun_merge_lora.sh \\\n>  --model_name_or_path Qwen\u002FQwen1.5-1.8B \\\n>  --lora_model_path output_models\u002Flora \\\n>  --output_model_path output_models\u002Flora_merged\n>```\n>\n>\u003C\u002Fdetails>\n\n### Inference\n\nAfter finetuning, you can run the following command to chat with the model.\n\n```sh\nbash .\u002Fscripts\u002Frun_chatbot.sh output_models\u002Ffinetuned_gpt2\n```\n\n> [!TIP]\n> We recommend using SGLang for faster batch inference.\n>\n> \u003Cdetails>\u003Csummary>Faster inference using SGLang\u003C\u002Fsummary>  \n>\n>```bash\n>bash .\u002Fscripts\u002Frun_sglang_inference.sh\n>```\n> Note: If you encounter the error `ModuleNotFoundError: No module named 'common_ops'` when using SGLang, try running `apt-get update` and then `apt install numactl`. \n> \u003C\u002Fdetails>\n\n### Deployment\n\nIf you want to deploy your own model locally, we provide a Gradio-based UI for building chatbots. \nRunning the following command will launch the demo for robin-7b:\n\n```sh\npip install gradio\npython .\u002Fexamples\u002Fchatbot_gradio.py --deepspeed configs\u002Fds_config_chatbot.json --model_name_or_path YOUR-LLAMA --lora_model_path .\u002Frobin-7b --prompt_structure \"A chat between a curious human and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the human's questions.###Human: {input_text}###Assistant:\" --end_string \"#\" --max_new_tokens 200\n```\n\n### Evaluation\n\nWe recommend using [LM Evaluation Harness](https:\u002F\u002Fgithub.com\u002FEleutherAI\u002Flm-evaluation-harness) for most evaluation purposes.\n
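\nFor instance, a minimal sketch of evaluating the finetuned Quick Start model with the harness CLI (assuming a recent `lm-eval` release; the task names here are only examples):\n\n```bash\npip install lm-eval\n\n# Evaluate the finetuned model from the Quick Start on a couple of example tasks.\nlm_eval --model hf \\\n  --model_args pretrained=output_models\u002Ffinetuned_gpt2 \\\n  --tasks hellaswag,arc_easy \\\n  --batch_size 8\n```\n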
\n## Supported Features\n\n\u003Cdetails> \u003Csummary>Finetune Acceleration & Memory Optimization\u003C\u002Fsummary>\n\n* LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning\n  \n  LISA is a novel and memory-efficient training strategy for large language models that outperforms existing methods like LoRA by selectively freezing layers during optimization. Check out [LISA](https:\u002F\u002Farxiv.org\u002Fabs\u002F2403.17919) for more details.  \n  In LMFlow, activate LISA using `--use_lisa 1` in your training command. Control the number of activated layers with `--lisa_activated_layers 2`, and adjust the interval between layer switches using `--lisa_interval_steps 20`. \n\n* LoRA\n  \n  LoRA is a parameter-efficient finetuning algorithm and is more efficient than full finetuning. Check out [finetuning-lora](#finetuning-lora) for more details.\n\n* FlashAttention\n\n  LMFlow supports both FlashAttention-1 and the latest FlashAttention-2. Check out [flash_attention](https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fblob\u002Fmain\u002Freadme\u002Fflash_attn2.md) for more details.\n\n* Gradient Checkpointing\n  \n  [Gradient checkpointing](https:\u002F\u002Fgithub.com\u002Fcybertronai\u002Fgradient-checkpointing) is a memory optimization technique that trades compute for memory.\n  It is useful when the model is too large to fit into GPU memory. \n  Use it by just adding `--gradient_checkpointing` to your training command.\n\n* Deepspeed Zero3\n  \n  LMFlow supports [Deepspeed Zero-3 Offload](https:\u002F\u002Fwww.deepspeed.ai\u002F2021\u002F03\u002F07\u002Fzero3-offload.html). \n  We provide an example [deepspeed config](https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fblob\u002Fmain\u002Fconfigs\u002Fds_config_zero3.json), and you can directly use it.\n\n\u003C\u002Fdetails>\n\n\u003Cdetails> \u003Csummary>Inference Acceleration\u003C\u002Fsummary>\n\n* LLaMA Inference on CPU\n\n  Thanks to the great efforts of [llama.cpp](https:\u002F\u002Fgithub.com\u002Fggerganov\u002Fllama.cpp), it is possible for everyone to run their LLaMA models on a CPU with 4-bit quantization. We provide a script to convert LLaMA LoRA weights to `.pt` files. You only need to use `convert-pth-to-ggml.py` in llama.cpp to perform quantization.\n\n* FlashAttention\n\n  LMFlow supports both FlashAttention-1 and the latest FlashAttention-2. Check out [flash_attention](https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fblob\u002Fmain\u002Freadme\u002Fflash_attn2.md) for more details.\n\n* vLLM\n\n  Try vLLM for fast and easy-to-use LLM inference and serving. Thanks for the [great work](https:\u002F\u002Fgithub.com\u002Fvllm-project\u002Fvllm)!\n\n\u003C\u002Fdetails>\n\n\u003Cdetails> \u003Csummary>Long Context\u003C\u002Fsummary>\n\n* Position Interpolation for LLaMA Models\n\n  Now LMFlow supports the latest Linear & NTK (Neural Tangent Kernel) scaling techniques for LLaMA models. Check out [position_interpolation](https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fblob\u002Fmain\u002Freadme\u002FPosition_Interpolation.md) for more details.\n\n\u003C\u002Fdetails>\n\n\u003Cdetails> \u003Csummary>Model Customization\u003C\u002Fsummary>\n\n* Vocabulary Extension\n\n  Now you can train your own SentencePiece tokenizer and merge it with the model's original HF tokenizer. Check out [vocab_extension](https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fblob\u002Fmain\u002Fscripts\u002Fvocab_extension) for more details.\n
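\n  As a rough sketch of the first step, training the new tokenizer itself uses the standard `spm_train` CLI from the SentencePiece project (the merge step is handled by the scripts linked above; `domain_corpus.txt` is a placeholder for your own corpus):\n\n  ```bash\n  # Train a new SentencePiece model on a domain corpus (sketch; these are\n  # standard spm_train options, not LMFlow-specific flags).\n  spm_train --input=domain_corpus.txt \\\n    --model_prefix=domain_sp \\\n    --vocab_size=8000 \\\n    --model_type=bpe\n  ```\n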
\n\u003C\u002Fdetails>\n\n\u003Cdetails> \u003Csummary>Multimodal\u003C\u002Fsummary>\n\n* Multimodal Chatbot\n\n  LMFlow supports multimodal inputs of images and texts. Check out our [LMFlow multimodal chatbot](https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fblob\u002Fmain\u002Fscripts\u002Frun_vis_chatbot_gradio_minigpt4.sh).\n\n\u003C\u002Fdetails>\n\n\u003Cdetails> \u003Csummary>Custom Optimization\u003C\u002Fsummary>\n\n* Custom Optimization\n\n  LMFlow now supports custom optimizer training with a variety of optimizers. Elevate your model's performance with tailored optimization strategies. Dive into the details and try out the new features with our updated script at [custom_optimizers](https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fblob\u002Fmain\u002Fscripts\u002Frun_finetune_with_custom_optim.sh).\n\n  The following table evaluates the performance of custom optimizers in the fine-tuning process of GPT-2 on the Alpaca dataset, emphasizing their individual impacts on the training loss. The specific hyperparameter settings use the default configurations, which can be customized and adjusted at [custom_optimizers](https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fblob\u002Fmain\u002Fscripts\u002Frun_finetune_with_custom_optim.sh). 
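As an illustration, here is a minimal invocation sketch (`--optimizer_name` is an assumed flag name for illustration only; consult the script itself for the actual argument names and defaults):\n\n  ```bash\n  cd data && .\u002Fdownload.sh alpaca && cd -\n\n  # Sketch only: --optimizer_name is an assumption; check\n  # run_finetune_with_custom_optim.sh for the real arguments.\n  bash .\u002Fscripts\u002Frun_finetune_with_custom_optim.sh \\\n    --model_name_or_path gpt2 \\\n    --dataset_path data\u002Falpaca\u002Ftrain_conversation \\\n    --output_model_path output_models\u002Ffinetuned_gpt2_optim \\\n    --optimizer_name adamw\n  ```\n\n  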
It is important to note that the evaluations were conducted over a duration of 0.1 epochs to provide a preliminary insight into the optimizers' effectiveness.\n\n  | Optimizer Name | Train Loss |\n  |----------------|------------|\n  | RMSprop        | 2.4016     |\n  | LION-32bit     | 2.4041     |\n  | Adam           | 2.4292     |\n  | AdamP          | 2.4295     |\n  | AdamW          | 2.4469     |\n  | AdaFactor      | 2.4543     |\n  | AdaBound       | 2.4547     |\n  | AdamWScheduleFree       | 2.4677     |\n  | Adan           | 2.5063     |\n  | NAdam          | 2.5569     |\n  | AdaBelief      | 2.5857     |\n  | AdaMax         | 2.5924     |\n  | RAdam          | 2.6104     |\n  | AdaDelta       | 2.6298     |\n  | AdaGrad        | 2.8657     |\n  | Yogi           | 2.9314     |\n  | NovoGrad       | 3.1071     |\n  | Sophia         | 3.1517     |\n  | LAMB           | 3.2350     |\n  | LARS           | 3.3329     |\n  | SGDScheduleFree        | 3.3541     |\n  | SGDP           | 3.3567     |\n  | SGD            | 3.3734     |\n\n\u003C\u002Fdetails>\n\n## Support\n\nIf you need any help, please submit a Github issue.\n\n## License\n\nThe code included in this project is licensed under the [Apache 2.0 license](https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fblob\u002Fmain\u002FLICENSE).\nIf you wish to use the codes and models included in this project for commercial purposes, please sign this [document](https:\u002F\u002Fdocs.google.com\u002Fforms\u002Fd\u002Fe\u002F1FAIpQLSfJYcci6cbgpIvx_Fh1xDL6pNkzsjGDH1QIcm4cYk88K2tqkw\u002Fviewform?usp=pp_url) to obtain authorization.\n\n## Citation\n\nIf you find this repository useful, please consider giving ⭐ and citing our [paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2306.12420):\n\n```citation\n@article{diao2023lmflow,\n  title={Lmflow: An extensible toolkit for finetuning and inference of large foundation models},\n  author={Diao, Shizhe and Pan, Rui and Dong, Hanze and Shum, Ka Shun and Zhang, Jipeng and Xiong, Wei and Zhang, Tong},\n  journal={arXiv preprint arXiv:2306.12420},\n  year={2023}\n}\n```\n\n```citation\n@article{dong2023raft,\n  title={Raft: Reward ranked finetuning for generative foundation model alignment},\n  author={Dong, Hanze and Xiong, Wei and Goyal, Deepanshu and Pan, Rui and Diao, Shizhe and Zhang, Jipeng and Shum, Kashun and Zhang, Tong},\n  journal={arXiv preprint arXiv:2304.06767},\n  year={2023}\n}\n```\n\n```citation\n@article{pan2024lisa,\n  title={LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning}, \n  author={Pan, Rui and Liu, Xiang and Diao, Shizhe and Pi, Renjie and Zhang, Jipeng and Han, Chi and Zhang, Tong},\n  journal={arXiv preprint arXiv:2403.17919},\n  year={2024}\n}\n```\n","\u003Cp align=\"center\" width=\"50%\">\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOptimalScale_LMFlow_readme_53549a4159de.png\" alt=\"LMFlow\" style=\"width: 50%; min-width: 200px; display: block; margin: auto; background-color: transparent;\">\n\u003C\u002Fp>\n\n# LMFlow\n\n\u003Ch4 align=\"center\">\n    \u003Cp>\n        \u003Cb>简体中文\u003C\u002Fb> |\n        \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fblob\u002Fmain\u002FREADME.md\">English\u003C\u002Fa> |\n        \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fblob\u002Fmain\u002Fdocs\u002Freadme\u002FREADME_es.md\">西班牙语\u003C\u002Fa> |\n        \u003Ca 
href=\"https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fblob\u002Fmain\u002Fdocs\u002Freadme\u002FREADME_jp.md\">日语\u003C\u002Fa> |\n        \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fblob\u002Fmain\u002Fdocs\u002Freadme\u002FREADME_ko.md\">韩语\u003C\u002Fa> |\n        \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fblob\u002Fmain\u002Fdocs\u002Freadme\u002FREADME_hindi.md\">印地语\u003C\u002Fa>\n    \u003Cp>\n\u003C\u002Fh4>\n\n[![Website](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FWebsite-Demo-20B2AA.svg)](https:\u002F\u002Flmflow.com)\n[![Code License](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FCode%20License-Apache_2.0-green.svg)](https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fblob\u002Fmain\u002FLICENSE)\n[![Python 3.9+](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FPython-3.9+-blue.svg)](https:\u002F\u002Fwww.python.org\u002Fdownloads\u002Frelease\u002Fpython-390\u002F)\n[![Doc](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FWebsite-Doc-ff69b4.svg)](https:\u002F\u002Foptimalscale.github.io\u002FLMFlow\u002F)\n[![Embark](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FDiscord-LMFlow-%237289da.svg?logo=discord)](https:\u002F\u002Fdiscord.gg\u002Fu9VJNpzhvA)\n[![slack badge](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FSlack-Join-blueviolet?logo=slack&amp)](https:\u002F\u002Fjoin.slack.com\u002Ft\u002Flmflow\u002Fshared_invite\u002Fzt-1wju9nicy-woXbNtS~5MavHSAtiMxmxQ)\n[![WeChat badge](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FWeChat-Join-brightgreen?logo=wechat&amp)](https:\u002F\u002Fibb.co\u002FZhM4hhn)\n\n一个可扩展、便捷且高效的工具箱，用于微调（finetuning）大型机器学习（Machine Learning）模型，旨在用户友好、快速可靠，并面向整个社区开放。\n\n\u003Cp align=\"center\" width=\"100%\">\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOptimalScale_LMFlow_readme_47e31b5050ad.png\" alt=\"LMFlow-features\" style=\"width: 100%; min-width: 300px; display: block; margin: auto;\">\n\u003C\u002Fp>\n\n## 最新动态\n> [!IMPORTANT]\n> * :exclamation: [2025-07-09] 我们对 LMFlow 进行了重大更新，完全支持 Accelerate 并进行了大幅简化。如果您正在寻找之前的版本，请使用 `git checkout v0.0.10`，或者查看 [v0.0.10 分支](https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Ftree\u002Fv0.0.10)。查看所有发布版本 [在此处](https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Ftags)。\n\n* [2024-12-02] 支持 [Hymba](https:\u002F\u002Fgithub.com\u002FNVlabs\u002Fhymba)，这是一种具有混合头并行架构的新型小语言模型系列。有关更多详细信息，请查看 [Post-training Hymba](https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Ftree\u002Fmain\u002Fexperimental\u002FHymba)。\n* [2024-07-01] 🏆 LMFlow 在 **NAACL 2024** 上荣获 [**最佳演示论文奖**](https:\u002F\u002Fdocs.google.com\u002Fpresentation\u002Fd\u002F1TVDooAZqkNObz5ysVhDFtqnnVHR-u8wqYvgix-gzPMs\u002Fedit#slide=id.g2e55907bbcc_0_70)! 
🎉\n* [2024-06-30] 扩展优化选项！我们现在支持使用多种优化器进行自定义优化器训练。深入了解详情并通过我们更新的脚本尝试新功能：[custom_optimizers](https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fblob\u002Fmain\u002Fscripts\u002Frun_finetune_with_custom_optim.sh)。\n* [2024-04-25] :rocket: 支持对话模板！我们已预设最新的 [Llama-3](https:\u002F\u002Fhuggingface.co\u002Fmeta-llama\u002FMeta-Llama-3-70B) 和 [Phi-3](https:\u002F\u002Fhuggingface.co\u002Fmicrosoft\u002FPhi-3-mini-128k-instruct) 对话模板，以及一些常用模板，例如 `chatml`（查看所有模板 [在此处](https:\u002F\u002Foptimalscale.github.io\u002FLMFlow\u002Fexamples\u002FDATASETS.html#conversation-template)），我们正在努力添加更多预设模板。在 shell 脚本中添加相应的 `--conversation_template` 参数即可！:rocket:\n\n\u003Cdetails> \u003Csummary>更多新闻...\u003C\u002Fsummary>\n\n* [2024-03-27] 支持 [LISA](https:\u002F\u002Farxiv.org\u002Fabs\u002F2403.17919)，无需卸载即可在 24G 内存中训练 7B 模型！ \n* [2023-09-11] 支持 [推测解码](https:\u002F\u002Farxiv.org\u002Fabs\u002F2211.17192)。有关用法和加速详情，请查看 [speculative_decoding](https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fblob\u002Fmain\u002Fscripts\u002Fspeculative_decoding\u002FREADME.md)。\n* [2023-08-14] 支持 LLaMA 模型的长上下文推理，采用位置插值（线性及 NTK 缩放）。有关更多详情，请查看 [postion_interpolation](https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fblob\u002Fmain\u002Freadme\u002FPosition_Interpolation.md)。\n* [2023-08-07] 支持 [Flash Attention-2](https:\u002F\u002Fcrfm.stanford.edu\u002F2023\u002F07\u002F17\u002Fflash2.html)。有关更多详情，请查看 [flash_attention](https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fblob\u002Fmain\u002Freadme\u002Fflash_attn2.md)。\n* [2023-08-02] 支持 [Llama2](https:\u002F\u002Fai.meta.com\u002Fllama\u002F)、[ChatGLM2](https:\u002F\u002Fhuggingface.co\u002FTHUDM\u002Fchatglm2-6b) 和 [Baichuan](https:\u002F\u002Fhuggingface.co\u002Fbaichuan-inc\u002FBaichuan-7B) 模型。\n* [2023-07-23] [LMFlow 多模态聊天机器人](https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fblob\u002Fmain\u002Fscripts\u002Frun_vis_chatbot_gradio_minigpt4.sh) 现已上线！支持图像和文本的多模态输入。同时提供 [在线演示](http:\u002F\u002Fmultimodal.lmflow.online)（我们在单 GPU 上托管服务，因此当多个用户同时访问时，可能会遇到“排队”或“应用程序繁忙”，如遇此类情况请耐心等待稍后重试）![image](https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fblob\u002Frpan-vision-encoder\u002Fdocs\u002Fassets\u002Fmultimodal-chatbot-demo.gif)\n* [2023-06-22] [LMFlow 论文](https:\u002F\u002Farxiv.org\u002Fabs\u002F2306.12420) 已发布！查看我们的实现细节：https:\u002F\u002Farxiv.org\u002Fabs\u002F2306.12420\n* [2023-06-16] 在我们的离线评估中，我们微调的 Robin-33B-V2 在 Huggingface LLM 排行榜上取得了令人印象深刻的 64.1 分，优于主要的开源 LLM！所有检查点（7B、13B、33B 和 65B）均已 [发布](https:\u002F\u002Fhuggingface.co\u002FOptimalScale)! 
性能查看 [此处](https:\u002F\u002Fmedium.com\u002F@hkust.ml\u002Frobin-v2-launches-achieves-unparalleled-performance-on-openllm-4f6886e822c1)。\n* [2023-06-07] LMFlow 现已正式在 PyPI 上可用！使用 `pip install lmflow-finetune` 安装它！\n* [2023-05-30] 发布 [Robin-13B-v2](https:\u002F\u002Fhuggingface.co\u002FOptimalScale\u002Frobin-13b-v2-delta) 和 [Robin-33B-v2](https:\u002F\u002Fhuggingface.co\u002FOptimalScale\u002Frobin-33b-v2-delta)!\n\n* [2023-05-15] 发布 [LMFlow-data](http:\u002F\u002Flmflow.org:5000\u002Flmflow_data.tar.gz)，即 Robin-7B-v2 的训练数据集。还发布了新的 [测试数据](http:\u002F\u002Flmflow.org:5000\u002Flmflow_chat_en_dialog_multiturn_single_nll_text2text.tar.gz)。\n* [2023-05-09] 发布 [Robin-7B-v2](http:\u002F\u002Flmflow.org:5000\u002Frobin-7b-v2-delta.tar.gz)，在闲聊、常识推理和指令遵循任务上实现了有竞争力的性能。参考我们的 [综合研究](https:\u002F\u002Fmedium.com\u002F@hkust.ml\u002Flmflow-benchmark-an-automatic-evaluation-framework-for-open-source-llms-ef5c6f142418)。\n* [2023-05-08] 发布 [LMFlow Benchmark](https:\u002F\u002Fmedium.com\u002F@hkust.ml\u002Flmflow-benchmark-an-automatic-evaluation-framework-for-open-source-llms-ef5c6f142418)，这是一个用于开源聊天风格 LLM 的自动评估框架。报告了 31 个流行模型的 [基准测试结果](https:\u002F\u002Fdocs.google.com\u002Fspreadsheets\u002Fd\u002F1JYh4_pxNzmNA9I0YM2epgRA7VXBIeIGS64gPJBg5NHA\u002Fedit#gid=0)。[参与 LMFlow Benchmark](https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow#33-lmflow-benchmark)。\n* [2023-04-21] 发布 [Robin-7B](http:\u002F\u002Flmflow.org:5000\u002Frobin-7b.tar.gz)（基于 LLaMA-7B），以及两个商用模型：Parakeets-2.7B（基于 GPT-NEO-2.7B）和 Cokatoo-7B（基于 StableLM-7B）[点击下载](https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Ftree\u002Fmain#model-zoo)\n* [2023-04-15] 推理：支持流式输出和 ChatGLM。\n* [2023-04-10] 我们提出了一种新的对齐算法：[奖励排名微调 (RAFT)](https:\u002F\u002Foptimalscale.github.io\u002FLMFlow\u002Fexamples\u002Fraft.html)，其效率高于传统的（基于 PPO 的）RLHF。[[论文](https:\u002F\u002Farxiv.org\u002Fabs\u002F2304.06767)]\n* [2023-04-02] [Web 服务](https:\u002F\u002Flmflow.com\u002F) 已上线！\n* [2023-04-01] 在 [模型库](https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow#model-zoo) 中发布三个指令微调检查点和三个医疗检查点：LLaMA-7B-tuned, LLaMA-13B-tuned, LLaMA-33B-tuned, LLaMA-7B-medical, LLaMA-13B-medical, 和 LLaMA-33B-medical。\n* [2023-03-27] 支持所有解码器模型的全量调优和 LoRA 调优。\n* [2023-03-27] [任务微调模型在医疗领域超越 ChatGPT](https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow#model-performance)。\n* [2023-03-27] 发布代码和检查点 - [版本 0.0.1](https:\u002F\u002Foptimalscale.github.io\u002FLMFlow\u002F)! 
[我们的任务微调模型在医疗领域超越 ChatGPT](https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow#model-performance)。\n\n\u003C\u002Fdetails>\n\n## 目录\n\n- [LMFlow](#lmflow)\n  - [最新动态](#latest-news)\n  - [目录](#table-of-contents)\n  - [快速开始](#quick-start)\n    - [环境搭建](#setup)\n    - [准备数据集](#prepare-dataset)\n    - [微调](#finetuning)\n      - [预估硬件需求](#estimated-hardware-requirement)\n      - [全量微调](#full-finetuning)\n      - [LISA](#lisa)\n      - [LoRA](#lora)\n    - [推理](#inference)\n    - [部署](#deployment)\n    - [评估](#evaluation)\n  - [支持的功能](#supported-features)\n  - [支持](#support)\n  - [许可证](#license)\n  - [引用](#citation)\n\n\n## 快速开始\n\n### 环境搭建\n\n我们的软件包已在 Linux 操作系统（Ubuntu 20.04）上测试过。其他操作系统平台（MacOS, Windows）尚未完全测试，您可能会遇到意外错误。如果您首次使用 LMFlow，我们建议您在 Linux 机器或 Google Colab 上尝试。\n\n```bash\ngit clone -b v1.0.0 https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow.git\ncd LMFlow\nconda create -n lmflow python=3.9 -y\nconda activate lmflow\nconda install mpi4py\npip install -e .\n```\n\n\u003Cdetails>\u003Csummary> 寻找旧版本？ \u003C\u002Fsummary>\n\n```bash\ngit clone -b v0.0.10 https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow.git\ncd LMFlow\nconda create -n lmflow python=3.9 -y\nconda activate lmflow\nconda install mpi4py\npip install -e .\n```\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\u003Csummary> 适用于 CUDA 版本 10.3-11.7 \u003C\u002Fsummary>\n\n```bash\ngit clone -b v0.0.5 https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow.git\ncd LMFlow\nconda create -n lmflow python=3.9 -y\nconda activate lmflow\nconda install mpi4py\npip install -e .\n```\n\n\u003C\u002Fdetails>\n\n> [!TIP]\n> 默认情况下，我们使用 WandB (Weights & Biases，一种实验跟踪工具) 来跟踪和可视化训练过程。在运行训练脚本之前，用户可能需要使用以下命令登录 WandB： \n>\n>```bash\n>wandb login\n>```\n>\n> 详细说明请参考 [WandB 快速入门指南](https:\u002F\u002Fdocs.wandb.ai\u002Fquickstart\u002F)。步骤 1（注册）和步骤 2（使用您的 WandB API 密钥登录）足以设置您的环境。\n>\n> \u003Cdetails>\u003Csummary>禁用 wandb\u003C\u002Fsummary>  \n>\n> 可以通过以下方式禁用 wandb：  \n>\n> 1. 在运行训练命令之前添加环境变量。\n>\n>```bash\n>export WANDB_MODE=disabled\n>```\n>\n> 2. 
或者，指定报告结果和日志的集成。在训练脚本中添加：\n>\n>```bash\n>--report_to none \\\n>```\n>\n> \u003C\u002Fdetails>\n\n### 准备数据集\n\n请参阅我们的 [文档](https:\u002F\u002Foptimalscale.github.io\u002FLMFlow\u002Fexamples\u002FDATASETS.html)。\n\n### 微调\n\n#### 预估硬件需求\n\n| 方法                 | 0.5B |  3B  |  7B  |  14B  |  30B  |  70B  |  `x`B   |\n| ---------------------- | ---- | ---- | ---- | ----- | ----- | ----- | ------- |\n| 全量 `bf16`\u002F`fp16`     |  9GB | 55GB |120GB | 240GB | 600GB | 1200GB| `18x`GB |\n| LoRA                   |  1GB | 6GB  | 16GB |  32GB |  64GB | 160GB |  `2x`GB |\n| QLoRA `quant_bit=8`    | 0.7GB| 3GB  | 10GB |  20GB |  40GB |   80GB|  `x`GB  |\n| QLoRA `quant_bit=4`    | 0.4GB| 1.5GB|  6GB |  12GB |  24GB |   48GB| `x\u002F2`GB |\n\n\n#### 全量微调\n\n全量微调 (Full Finetuning) 会更新所有参数以微调语言模型。\n以下是一个微调 GPT-2 基础模型的示例。\n\n```sh\ncd data && .\u002Fdownload.sh alpaca && cd -\n\nbash .\u002Fscripts\u002Frun_finetune.sh \\\n  --model_name_or_path gpt2 \\\n  --dataset_path data\u002Falpaca\u002Ftrain_conversation \\\n  --output_model_path output_models\u002Ffinetuned_gpt2\n```\n\n> [!TIP]\n> 对于对话数据集，通过添加 `--conversation_template` 到命令中指定对话模板以获得更好的性能。\n>\n> \u003Cdetails>\u003Csummary>Llama-3-8B 对话数据集示例\u003C\u002Fsummary>  \n>\n>```bash\n>cd data && .\u002Fdownload.sh alpaca && cd -\n>\n>bash .\u002Fscripts\u002Frun_finetune.sh \\\n>  --model_name_or_path meta-llama\u002FMeta-Llama-3-8B \\\n>  --dataset_path data\u002Falpaca\u002Ftrain_conversation \\\n>  --conversation_template llama3 \\\n>  --output_model_path output_models\u002Ffinetuned_llama3_8b\n>```\n>\n> \u003C\u002Fdetails>\n\n#### LISA\n\n[LISA](https:\u002F\u002Farxiv.org\u002Fabs\u002F2403.17919) 是一种内存高效的微调算法，允许在内存和随机未冻结层数之间进行权衡。此脚本目前仅在单 GPU 上测试过。请密切关注我们的最新更新 :smile:\n\n```sh\ncd data && .\u002Fdownload.sh alpaca && cd -\n\nbash .\u002Fscripts\u002Frun_finetune_with_lisa.sh \\\n  --model_name_or_path meta-llama\u002FLlama-2-7b-hf \\\n  --dataset_path data\u002Falpaca\u002Ftrain_conversation \\\n  --output_model_path output_models\u002Ffinetuned_llama2_7b \\\n  --lisa_activated_layers 1 \\\n  --lisa_interval_steps 20\n```\n\n> [!TIP]\n> \u003Cdetails>\u003Csummary>Llama-2-7B 对话数据集示例\u003C\u002Fsummary>  \n>\n>```bash\n>cd data && .\u002Fdownload.sh alpaca && cd -\n>\n>bash .\u002Fscripts\u002Frun_finetune_with_lisa.sh \\\n>  --model_name_or_path meta-llama\u002FLlama-2-7b-hf \\\n>  --dataset_path data\u002Falpaca\u002Ftrain_conversation \\\n>  --conversation_template llama2 \\\n>  --output_model_path output_models\u002Ffinetuned_llama2_7b_lisa \\\n>  --lisa_activated_layers 1 \\\n>  --lisa_interval_steps 20\n>```\n>\n> \u003C\u002Fdetails>\n\n#### LoRA\n\nLoRA (Low-Rank Adaptation，低秩自适应) 是一种参数高效的微调算法，比全量微调更高效。\n\n```sh\ncd data && .\u002Fdownload.sh alpaca && cd -\n\nbash .\u002Fscripts\u002Frun_finetune_with_lora.sh \\\n  --model_name_or_path facebook\u002Fgalactica-1.3b \\\n  --dataset_path data\u002Falpaca\u002Ftrain_conversation \\\n  --output_lora_path output_models\u002Ffinetuned_galactica_lora\n```\n\n> [!TIP]\n> \u003Cdetails>\u003Csummary>Llama-2-7B 对话数据集示例\u003C\u002Fsummary>  \n>\n>```bash\n>cd data && .\u002Fdownload.sh alpaca && cd -\n>\n>bash .\u002Fscripts\u002Frun_finetune_with_lora.sh \\\n>  --model_name_or_path meta-llama\u002FLlama-2-7b-hf \\\n>  --dataset_path data\u002Falpaca\u002Ftrain_conversation \\\n>  --conversation_template llama2 \\\n>  --output_model_path output_models\u002Ffinetuned_llama2_7b_lora \\\n>```\n>\n> \u003C\u002Fdetails>\n>\n> \u003Cdetails>\u003Csummary>合并 LoRA 权重\u003C\u002Fsummary>\n>\n>使用以下命令将 LoRA 
权重与基础模型合并为一个：  \n>\n>```sh\n>bash .\u002Fscripts\u002Frun_merge_lora.sh \\\n>  --model_name_or_path Qwen\u002FQwen1.5-1.8B \\\n>  --lora_model_path output_models\u002Flora \\\n>  --output_model_path output_models\u002Flora_merged\n>```\n>\n>\u003C\u002Fdetails>\n\n### 推理\n\n微调后，您可以运行以下命令与模型聊天。\n\n```sh\nbash .\u002Fscripts\u002Frun_chatbot.sh output_models\u002Ffinetuned_gpt2\n```\n\n> [!TIP]\n> 我们推荐使用 SGLang 进行更快的批量推理。\n>\n> \u003Cdetails>\u003Csummary>使用 SGLang 进行更快推理\u003C\u002Fsummary>  \n>\n>```bash\n>bash .\u002Fscripts\u002Frun_sglang_inference.sh\n>```\n> 注意：如果使用 SGLang 时遇到错误 `ModuleNotFoundError: No module named 'common_ops'`，请尝试运行 `apt-get update` 然后 `apt install numactl`。 \n> \u003C\u002Fdetails>\n\n### 部署\n\n如果您想在本机部署自己的模型，我们提供了一个基于 Gradio 的用户界面来构建聊天机器人。运行以下命令将启动 robin-7b 的演示：\n\n```sh\npip install gradio\npython .\u002Fexamples\u002Fchatbot_gradio.py --deepspeed configs\u002Fds_config_chatbot.json --model_name_or_path YOUR-LLAMA --lora_model_path .\u002Frobin-7b --prompt_structure \"A chat between a curious human and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the human's questions.###Human: {input_text}###Assistant:\" --end_string \"#\" --max_new_tokens 200\n```\n\n### 评估\n\n对于大多数评估目的，我们推荐使用 [LM Evaluation Harness](https:\u002F\u002Fgithub.com\u002FEleutherAI\u002Flm-evaluation-harness)。\n\n## 支持的功能\n\n\u003Cdetails> \u003Csummary>微调加速与内存优化\u003C\u002Fsummary>\n\n* LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning\n  \n  LISA 是一种新颖且节省内存的大语言模型 (Large Language Model) 训练策略，通过在优化过程中选择性冻结层，其性能优于现有的方法如 LoRA（一种参数高效的微调算法）。更多详情请查看 [LISA](https:\u002F\u002Farxiv.org\u002Fabs\u002F2403.17919)。  \n  在 LMFlow 中，使用 `--use_lisa 1` 激活 LISA。使用 `--lisa_activated_layers 2` 控制激活层数，使用 `--lisa_interval_steps 20` 调整层切换的间隔步数。 \n\n* LoRA\n  \n  LoRA 是一种参数高效的微调算法，比全量微调更高效。更多详情请查看 [finetuning-lora](#finetuning-lora)。\n\n* FlashAttention\n\n  LMFlow 支持 FlashAttention-1 和最新的 FlashAttention-2。更多详情请查看 [flash_attention](https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fblob\u002Fmain\u002Freadme\u002Fflash_attn2.md)。\n\n* 梯度检查点 (Gradient Checkpointing)\n  \n  [梯度检查点 (Gradient checkpointing)](https:\u002F\u002Fgithub.com\u002Fcybertronai\u002Fgradient-checkpointing) 是一种以计算换取内存的内存优化技术。\n  当模型过大无法放入 GPU 显存时非常有用。 \n  只需在训练命令中添加 `--gradient_checkpointing` 即可使用它。\n\n* Deepspeed Zero3\n  \n  LMFlow 支持 [Deepspeed Zero-3 Offload](https:\u002F\u002Fwww.deepspeed.ai\u002F2021\u002F03\u002F07\u002Fzero3-offload.html)。 \n  我们提供了一个示例 [deepspeed config](https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fblob\u002Fmain\u002Fconfigs\u002Fds_config_zero3.json)，您可以直接使用。\n\n\u003C\u002Fdetails>\n\n\u003Cdetails> \u003Csummary>推理加速\u003C\u002Fsummary>\n\n* CPU 上的 LLaMA 推理\n\n  感谢 [llama.cpp](https:\u002F\u002Fgithub.com\u002Fggerganov\u002Fllama.cpp) 的巨大努力，通过 4-bit 量化，每个人都可以在 CPU 上运行自己的 LLaMA 模型。我们提供了一个脚本将 LLaMA LoRA 权重转换为 `.pt` 文件。您只需要使用 llama.cpp 中的 `convert-pth-to-ggml.py` 进行量化。\n\n* FlashAttention\n\n  LMFlow 支持 FlashAttention-1 和最新的 FlashAttention-2。更多详情请查看 [flash_attention](https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fblob\u002Fmain\u002Freadme\u002Fflash_attn2.md)。\n\n* vLLM\n\n  尝试 vLLM 以实现快速且易于使用的 LLM（大型语言模型）推理和服务。感谢这项[出色的工作](https:\u002F\u002Fgithub.com\u002Fvllm-project\u002Fvllm)！\n\n\u003C\u002Fdetails>\n\n\u003Cdetails> \u003Csummary>长上下文\u003C\u002Fsummary>\n\n* LLaMA 模型的位置插值 (Position Interpolation)\n\n  现在 LMFlow 支持用于 LLaMA 模型的最新线性及 NTK（神经正切核）缩放技术。更多详情请查看 
[postion_interpolation](https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fblob\u002Fmain\u002Freadme\u002FPosition_Interpolation.md)。\n\n\u003C\u002Fdetails>\n\n\u003Cdetails> \u003Csummary>模型定制\u003C\u002Fsummary>\n\n* 词汇扩展 (Vocabulary Extension)\n\n  现在您可以训练自己的 sentencepiece 分词器 (tokenizer) 并将其与模型的原始 hf tokenizer 合并。更多详情请查看 [vocab_extension](https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fblob\u002Fmain\u002Fscripts\u002Fvocab_extension)。\n\n\u003C\u002Fdetails>\n\n\u003Cdetails> \u003Csummary>多模态\u003C\u002Fsummary>\n\n* 多模态聊天机器人 (Multimodal Chatbot)\n\n  LMFlow 支持图像和文本的多模态输入。查看我们的 [LMFlow multimodal chatbot](https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fblob\u002Fmain\u002Fscripts\u002Frun_vis_chatbot_gradio_minigpt4.sh)。\n\n\u003C\u002Fdetails>\n\n\u003Cdetails> \u003Csummary>自定义优化\u003C\u002Fsummary>\n\n* 自定义优化 (Custom Optimization)\n\n  LMFlow 现在支持使用多种优化器进行自定义优化器训练。通过量身定制的优化策略提升模型性能。深入了解细节并通过我们在 [custom_optimizers](https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fblob\u002Fmain\u002Fscripts\u002Frun_finetune_with_custom_optim.sh) 更新的脚本尝试新功能。\n\n  下表评估了 GPT-2 在 Alpaca 数据集微调过程中自定义优化器的性能，强调了它们对训练损失的各自影响。具体的超参数设置使用默认配置，可以在 [custom_optimizers](https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fblob\u002Fmain\u002Fscripts\u002Frun_finetune_with_custom_optim.sh) 处自定义和调整。值得注意的是，评估是在 0.1 个 epoch (轮次) 的持续时间内进行的，以提供关于优化器有效性的初步见解。\n\n  | 优化器名称 | 训练损失 |\n  |----------------|------------|\n  | RMSprop        | 2.4016     |\n  | LION-32bit     | 2.4041     |\n  | Adam           | 2.4292     |\n  | AdamP          | 2.4295     |\n  | AdamW          | 2.4469     |\n  | AdaFactor      | 2.4543     |\n  | AdaBound       | 2.4547     |\n  | AdamWScheduleFree       | 2.4677     |\n  | Adan           | 2.5063     |\n  | NAdam          | 2.5569     |\n  | AdaBelief      | 2.5857     |\n  | AdaMax         | 2.5924     |\n  | RAdam          | 2.6104     |\n  | AdaDelta       | 2.6298     |\n  | AdaGrad        | 2.8657     |\n  | Yogi           | 2.9314     |\n  | NovoGrad       | 3.1071     |\n  | Sophia         | 3.1517     |\n  | LAMB           | 3.2350     |\n  | LARS           | 3.3329     |\n  | SGDScheduleFree        | 3.3541     |\n  | SGDP           | 3.3567     |\n  | SGD            | 3.3734     |\n\n\u003C\u002Fdetails>\n\n## 支持\n\n如果您需要任何帮助，请提交一个 Github Issue。\n\n## 许可\n\n本项目包含的代码采用 [Apache 2.0 许可证](https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fblob\u002Fmain\u002FLICENSE) 授权。\n如果您希望出于商业目的使用本项目中包含的代码和模型，请签署此 [文档](https:\u002F\u002Fdocs.google.com\u002Fforms\u002Fd\u002Fe\u002F1FAIpQLSfJYcci6cbgpIvx_Fh1xDL6pNkzsjGDH1QIcm4cYk88K2tqkw\u002Fviewform?usp=pp_url) 以获得授权。\n\n## 引用\n\n如果您觉得本仓库有用，请考虑给予⭐并引用我们的 [论文](https:\u002F\u002Farxiv.org\u002Fabs\u002F2306.12420)：\n\n```citation\n@article{diao2023lmflow,\n  title={Lmflow: An extensible toolkit for finetuning and inference of large foundation models},\n  author={Diao, Shizhe and Pan, Rui and Dong, Hanze and Shum, Ka Shun and Zhang, Jipeng and Xiong, Wei and Zhang, Tong},\n  journal={arXiv preprint arXiv:2306.12420},\n  year={2023}\n}\n```\n\n```citation\n@article{dong2023raft,\n  title={Raft: Reward ranked finetuning for generative foundation model alignment},\n  author={Dong, Hanze and Xiong, Wei and Goyal, Deepanshu and Pan, Rui and Diao, Shizhe and Zhang, Jipeng and Shum, Kashun and Zhang, Tong},\n  journal={arXiv preprint arXiv:2304.06767},\n  year={2023}\n}\n```\n\n```citation\n@article{pan2024lisa,\n  title={LISA: Layerwise Importance Sampling for 
Memory-Efficient Large Language Model Fine-Tuning}, \n  author={Pan, Rui and Liu, Xiang and Diao, Shizhe and Pi, Renjie and Zhang, Jipeng and Han, Chi and Zhang, Tong},\n  journal={arXiv preprint arXiv:2403.17919},\n  year={2024}\n}\n```","# LMFlow 快速上手指南\n\nLMFlow 是一个可扩展、便捷且高效的微调大型机器学习模型的工具箱，旨在为用户提供友好、快速且可靠的体验。\n\n## 1. 环境准备\n\n*   **操作系统**: 推荐 Linux (Ubuntu 20.04)。MacOS 和 Windows 尚未完全测试，可能会遇到意外错误。首次使用建议尝试 Linux 机器或 Google Colab。\n*   **Python 版本**: 3.9+\n*   **依赖管理**: 推荐使用 Conda 管理虚拟环境。\n*   **其他**: 默认使用 WandB 跟踪训练过程，建议提前登录。\n\n## 2. 安装步骤\n\n### 克隆代码与创建环境\n\n```bash\ngit clone -b v1.0.0 https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow.git\ncd LMFlow\nconda create -n lmflow python=3.9 -y\nconda activate lmflow\nconda install mpi4py\npip install -e .\n```\n\n> [!TIP]\n> **WandB 配置**: 运行训练脚本前，建议先登录 WandB 以可视化训练过程：\n> ```bash\n> wandb login\n> ```\n> 若不需要 WandB，可在命令中添加 `--report_to none` 或在环境变量中设置 `export WANDB_MODE=disabled`。\n\n## 3. 基本使用\n\n### 准备数据集\n\n参考官方文档获取数据，或使用提供的下载脚本（示例）：\n\n```bash\ncd data && .\u002Fdownload.sh alpaca && cd -\n```\n\n### 执行微调 (Full Finetuning)\n\n以下示例展示如何微调 GPT-2 基础模型：\n\n```sh\nbash .\u002Fscripts\u002Frun_finetune.sh \\\n  --model_name_or_path gpt2 \\\n  --dataset_path data\u002Falpaca\u002Ftrain_conversation \\\n  --output_model_path output_models\u002Ffinetuned_gpt2\n```\n\n### 硬件资源预估\n\n根据微调方法不同，显存需求差异较大。以下为部分参考：\n\n| 方法 | 7B 模型显存需求 |\n| :--- | :--- |\n| Full `bf16`\u002F`fp16` | ~120GB |\n| LoRA | ~16GB |\n| QLoRA (`quant_bit=4`) | ~6GB |\n\n更多详细功能（如 LoRA、推理、部署等）请参考项目文档。","某电商客服团队希望基于开源大模型定制专属问答系统，但面临技术门槛高和算力资源有限的挑战。\n\n### 没有 LMFlow 时\n- 环境配置极其复杂，依赖冲突导致训练脚本频繁报错，耗费大量调试时间。\n- 不同模型架构适配困难，需手动修改底层代码才能支持新模型结构。\n- 显存占用过高，普通消费级显卡无法运行 7B 以上模型，被迫租用昂贵云资源。\n- 对话模板需自行编写，容易因格式错误导致模型输出混乱，影响用户体验。\n\n### 使用 LMFlow 后\n- LMFlow 提供统一接口，一键完成环境部署与模型加载，大幅降低入门门槛。\n- 内置多种主流模型支持，无需重复造轮子适配架构，直接调用即可微调。\n- 利用 LISA 等技术优化显存，在 24G 显卡上即可高效训练 7B 模型，节省成本。\n- 预设 Llama-3 等对话模板，开箱即用且格式规范，确保对话流畅自然。\n\nLMFlow 通过标准化流程显著降低了大模型微调难度，让企业能更专注于业务逻辑而非基础设施，真正实现大模型技术的普惠落地。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FOptimalScale_LMFlow_53549a41.png","OptimalScale","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002FOptimalScale_93839c89.png","",null,"https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow","https:\u002F\u002Fgithub.com\u002FOptimalScale",[82,86],{"name":83,"color":84,"percentage":85},"Python","#3572A5",93.4,{"name":87,"color":88,"percentage":89},"Shell","#89e051",6.6,8486,831,"2026-04-04T19:47:34","Apache-2.0","Linux","需要 NVIDIA GPU，显存根据模型大小和微调方法不同（0.4GB-1200GB+），部分旧版本分支支持 CUDA 10.3-11.7","未说明",{"notes":98,"python":99,"dependencies":100},"推荐使用 Ubuntu 20.04 Linux 系统；需使用 conda 创建独立环境；默认启用 WandB 记录训练日志，首次运行需执行 wandb login；不同 CUDA 版本可能需要切换特定 git 分支；支持多种微调方法及多模态功能。","3.9+",[101,102,103,104,105,106],"mpi4py","accelerate","wandb","torch","transformers","datasets",[13,26],[109,110,111,112,113,114,115],"chatgpt","deep-learning","instruction-following","language-model","pretrained-models","pytorch","transformer",8,"2026-03-27T02:49:30.150509","2026-04-06T06:54:53.177517",[120,125,130,135],{"id":121,"question_zh":122,"answer_zh":123,"source_url":124},2240,"运行 chatbot 脚本时出现 `RuntimeError: Tensors must be contiguous` 错误如何解决？","该错误通常由模型加载所需内存超过 16GB 且 GPU 显存不足导致，开启 RAM 优化加载后若显存不够会引发张量分裂。建议方案：1. 增加服务器 RAM 并关闭 `--use_ram_optimized_load False` 选项；2. 使用显存大于 16G 的 GPU；3. 
使用大内存 CPU 服务器运行 `.\u002Fscripts\u002Frun_chatbot_cpu.sh`（速度较慢）。推荐使用 Google Colab 环境进行此类实验。","https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fissues\u002F144",{"id":126,"question_zh":127,"answer_zh":128,"source_url":129},2241,"在 Docker 中微调大模型时遇到 `OSError: [Errno 28] No space left on device` 怎么办？","若在 Docker 中使用多卡，通常是共享内存不足。请在启动容器时添加 `--shm-size=128g` 参数。具体可参考 StackOverflow 关于 Docker 共享内存限制的解决方案。同时请检查日志确认是否为设备空间问题。","https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fissues\u002F160",{"id":131,"question_zh":132,"answer_zh":133,"source_url":134},2242,"聊天机器人生成内容出现重复（Repeated output）该如何排查？","请检查训练阶段和推理阶段使用的 `prompt_structure`（提示词结构）与 `end_string`（结束字符串）是否完全一致。配置不一致是导致模型无法正确停止生成并产生重复内容的常见原因。","https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fissues\u002F459",{"id":136,"question_zh":137,"answer_zh":138,"source_url":139},2243,"微调后的聊天机器人无法根据知识库内容正确回答问题怎么办？","LoRA 训练对超参数敏感。建议尝试多组不同的超参数设置进行训练，观察模型表现，直到找到合适的参数组合使模型能够正确学习并复现知识库内容。","https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fissues\u002F420",[141,146,151,155,159],{"id":142,"version":143,"summary_zh":144,"released_at":145},101777,"v1.0.0","## Description\r\n1. LMFlow now defaults to using Accelerate (i.e, run scripts using `accelerate launch ... finetune.py ...`). If you prefer to use deepspeed (`deepspeed ... finetune.py ...`) or accelerate + deepspeed backend, please install using `pip install -e '.[deepspeed]'`\r\n2. Removed\u002Farchived some less frequently used docs\u002Fscripts\u002Fmodules.\r\n\t- `docker`\r\n\t- `scripts\u002Fdata_preprocess`, `scripts\u002Fspeculative_decoding`, `scripts\u002Ftools`, `scripts\u002Fvocab_extension`\r\n\t- `service`\r\n\t- `utils`\r\n\r\n## What's Changed\r\n* [Feature] vllm inferencer and memory safe vllm inferencer by @wheresmyhair in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F860\r\n* [Feature] Add vllm inference example by @wheresmyhair in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F863\r\n* Add customized optimizer support by @research4pan in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F865\r\n* Expanding Optimization Options by @tianshijing in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F871\r\n* [Feature] reward model inferencer and dpov2 aligner by @wheresmyhair in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F867\r\n* Added more custom optimizers by @tianshijing in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F876\r\n* [ADD] T2I finetuning with SD1 and SD2 to contrib by @Aziily in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F877\r\n* update naacl best paper award by @shizhediao in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F881\r\n* [usability] make dpo eval dataset optional by @wheresmyhair in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F889\r\n* function-call-finetune by @HALIS-sh in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F884\r\n* Modify the formatter of function and observation by @HALIS-sh in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F892\r\n* [Feature] Iterative DPO by @wheresmyhair in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F883\r\n* [usability] remove numpy version requirement by @wheresmyhair in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F901\r\n* 
Fix load from LoRA weight & empty tools and system by @HALIS-sh in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F902\r\n* [fix] merge lora fix by @wheresmyhair in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F909\r\n* [usability] temporarily change default version to 0.0.8 by @wheresmyhair in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F911\r\n* [doc] Add wandb setup guide by @wheresmyhair in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F912\r\n* [temp] temporarily restrict transformers version by @wheresmyhair in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F914\r\n* [usability] deps streamlining by @wheresmyhair in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F905\r\n* [doc] update default branch by @wheresmyhair in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F915\r\n* Add Hymba and DoRA support by @Dominic789654 in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F919\r\n* announce support of hymba by @shizhediao in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F921\r\n* Hymba support announcement by @shizhediao in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F922\r\n* [usability] Add hymba lora target by @wheresmyhair in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F924\r\n* [usability] Change dataset check method by @wheresmyhair in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F930\r\n* [usability] support qwen2.5 and deepseek by @wheresmyhair in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F931\r\n* [fix] fix jinja template issue caused by empty system prompt by @wheresmyhair in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F935\r\n* [fix] fix get dataset type function by @wheresmyhair in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F937\r\n* [doc] readme update by @wheresmyhair in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F938\r\n* Added Muon Optimizer by @tianshijing in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F939\r\n\r\n## New Contributors\r\n* @tianshijing made their first contribution in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F871\r\n* @Aziily made their first contribution in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F877\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fcompare\u002Fv0.0.8...v1.0.0","2025-07-11T02:03:22",{"id":147,"version":148,"summary_zh":149,"released_at":150},101778,"v0.0.8","Major new features since v0.0.4\r\n\r\n- Support conversation templates\r\n- Support new optimization algorithms, e.g. 
LISA\r\n- Update requirements to support latest models\r\n- Fix bugs in qlora\u002Flora scripts\r\n- Fix tokenization parallelism bug\r\n- Improve script interfaces\r\n\r\n## What's Changed\r\n* README refactor by @shizhediao in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F607\r\n* Improve interface of finetuning scripts by @research4pan in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F611\r\n* resize banner by @shizhediao in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F612\r\n* Doc Reformat by @shizhediao in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F614\r\n* Dev update transformers by @yaoguany in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F616\r\n* Added QLoRA support for Decoder transformers with tune_strategy \"Normal\" by @TensorBlast in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F613\r\n* announce long context support by @shizhediao in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F621\r\n* fix deepspeed zero3 config bugs by @yaoguany in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F622\r\n* Update version.py by @hendrydong in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F624\r\n* update qr code by @shizhediao in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F625\r\n* FIX BUG: trust_remote_code flag didn't take effect by @conght in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F633\r\n* Add explanations about supported CUDA versions by @research4pan in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F634\r\n* Added citation for RAFT by @shizhediao in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F635\r\n* Update qrcode by @shizhediao in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F636\r\n* [Features] Support multi_modal training by @lianqing11 in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F628\r\n* Add scripts to convert raw file to text-only json by @research4pan in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F638\r\n* speculative decoding by @wheresmyhair in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F630\r\n* [Feature] Speculative Inference by @wheresmyhair in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F640\r\n* add readme for speculative decoding by @wheresmyhair in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F641\r\n* update news about speculative decoding by @shizhediao in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F642\r\n* update llama flash attention by @yaoguany in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F646\r\n* update qrcode by @shizhediao in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F647\r\n* [FIX] Fix multi-modal training by @lianqing11 in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F648\r\n* Fix: `--disable_group_texts 1` keep short samples by @research4pan in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F649\r\n* Support all types with `--disable_group_texts 1` by @research4pan in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F650\r\n* Fix model downloading 
for CPU-only servers by @research4pan in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F651\r\n* add block size to fingerprint by @RolandMinrui in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F653\r\n* update qrcode by @shizhediao in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F654\r\n* Update QR code for wechat by @research4pan in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F656\r\n* update qrcode by @shizhediao in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F657\r\n* update qr code for wechat by @research4pan in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F667\r\n* Update version of `datasets` dependency by @research4pan in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F668\r\n* Add flash attention install for A6000 by @research4pan in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F669\r\n* Update hf_decoder_model.py by @yaoguany in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F670\r\n* fix bugs in llama flash attention by @yaoguany in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F681\r\n* code exection class and test cases by @Bob17293729 in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F674\r\n* Update README to reflect changes in v0.0.6 by @research4pan in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F696\r\n* fix bugs in requirements.txt since previous one can cause errors by @xu1868 in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F697\r\n* fix merge lora bug by @Dominic789654 in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F698\r\n* Upgrade `transformers` deps to support mistral by @research4pan in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F700\r\n* add lisa code and lisa args by @Dominic789654 in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F701\r\n* add GPU memory check script by @Dominic789654 in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F702\r\n* Support multi-gpu inference by @research4pan in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F699\r\n* src\u002Flmflow\u002Fargs.py typo fix by @wheresmyhair in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F703\r\n* add more info when fail to import flash attn by @wheresmyhair in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F704\r\n* Add script to finetune llama-2 with lisa by @research4pan in https:\u002F\u002Fgithub.com\u002FOptimalScale\u002FLMFlow\u002Fpull\u002F705\r\n* add","2024-06-19T03:43:24",{"id":152,"version":153,"summary_zh":78,"released_at":154},101779,"v0.0.4","2023-08-09T03:57:48",{"id":156,"version":157,"summary_zh":78,"released_at":158},101780,"v0.0.3","2023-07-21T15:35:32",{"id":160,"version":161,"summary_zh":162,"released_at":163},101781,"v0.0.1","initial release","2023-03-27T17:42:59"]
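A minimal launch sketch for the Docker shared-memory answer above. The image name, mount path, and script are placeholders, not taken from the LMFlow docs; the one load-bearing flag is `--shm-size`, since Docker caps `/dev/shm` at 64 MB by default and multi-GPU communication through shared memory exhausts that long before the disk fills, which is why the error reads "No space left on device".

```bash
# Hedged sketch: hypothetical image and paths; only --shm-size is the fix.
docker run --gpus all --shm-size=128g \
  -v /data/lmflow:/workspace \
  hypothetical/lmflow-image \
  bash /workspace/scripts/run_finetune.sh
```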
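For the repeated-output FAQ entry, a sketch of an inference call whose template matches training. The `--prompt_structure` and `--end_string` flag names follow older LMFlow chatbot examples and should be treated as assumptions, as should the template text; the invariant is that both values equal what the model saw during fine-tuning, otherwise generation never matches the stop string and the model keeps going.

```bash
# Hedged sketch: flag names and template are assumptions; match them to
# whatever the fine-tuning run actually used.
PROMPT='Input: {input_text}\nOutput:'  # must equal the training-time template
STOP='\n'                              # must equal the training-time end_string

deepspeed examples/chatbot.py \
  --deepspeed configs/ds_config_chatbot.json \
  --model_name_or_path output_models/finetuned_model \
  --prompt_structure "$PROMPT" \
  --end_string "$STOP"
```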
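For the knowledge-base FAQ entry, a sweep sketch over LoRA hyperparameters. The script name mirrors LMFlow's historical `run_finetune_with_lora.sh`, and the model, dataset path, and flag names are assumptions to adapt to the version you run; the point is to train one adapter per (learning rate, rank) pair and compare which reproduces the knowledge base.

```bash
# Hedged sketch: script and flags are assumptions; one output dir per config
# so the resulting adapters can be compared side by side.
for lr in 1e-4 3e-4 1e-3; do
  for r in 8 16 32; do
    ./scripts/run_finetune_with_lora.sh \
      --model_name_or_path facebook/galactica-1.3b \
      --dataset_path data/knowledge_base \
      --learning_rate "$lr" \
      --lora_r "$r" \
      --output_dir "output_models/lora_lr${lr}_r${r}"
  done
done
```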
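Finally, the two launch styles named in the v1.0.0 description above. The launcher choice and the `pip install -e '.[deepspeed]'` step come from the release note itself; the model, dataset, output, and deepspeed-config paths below are placeholder assumptions.

```bash
# Default since v1.0.0: Accelerate as the launcher (per the release note).
accelerate launch examples/finetune.py \
  --model_name_or_path gpt2 \
  --dataset_path data/alpaca \
  --output_dir output_models/finetuned

# To keep using DeepSpeed: install the extras, then use the deepspeed runner.
pip install -e '.[deepspeed]'
deepspeed examples/finetune.py \
  --deepspeed configs/ds_config_zero3.json \
  --model_name_or_path gpt2 \
  --dataset_path data/alpaca \
  --output_dir output_models/finetuned
```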