[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-Beomi--KoAlpaca":3,"tool-Beomi--KoAlpaca":64},[4,17,27,35,43,56],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":16},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,3,"2026-04-05T11:01:52",[13,14,15],"开发框架","图像","Agent","ready",{"id":18,"name":19,"github_repo":20,"description_zh":21,"stars":22,"difficulty_score":23,"last_commit_at":24,"category_tags":25,"status":16},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",138956,2,"2026-04-05T11:33:21",[13,15,26],"语言模型",{"id":28,"name":29,"github_repo":30,"description_zh":31,"stars":32,"difficulty_score":23,"last_commit_at":33,"category_tags":34,"status":16},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 
绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",107662,"2026-04-03T11:11:01",[13,14,15],{"id":36,"name":37,"github_repo":38,"description_zh":39,"stars":40,"difficulty_score":23,"last_commit_at":41,"category_tags":42,"status":16},3704,"NextChat","ChatGPTNextWeb\u002FNextChat","NextChat 是一款轻量且极速的 AI 助手，旨在为用户提供流畅、跨平台的大模型交互体验。它完美解决了用户在多设备间切换时难以保持对话连续性，以及面对众多 AI 模型不知如何统一管理的痛点。无论是日常办公、学习辅助还是创意激发，NextChat 都能让用户随时随地通过网页、iOS、Android、Windows、MacOS 或 Linux 端无缝接入智能服务。\n\n这款工具非常适合普通用户、学生、职场人士以及需要私有化部署的企业团队使用。对于开发者而言，它也提供了便捷的自托管方案，支持一键部署到 Vercel 或 Zeabur 等平台。\n\nNextChat 的核心亮点在于其广泛的模型兼容性，原生支持 Claude、DeepSeek、GPT-4 及 Gemini Pro 等主流大模型，让用户在一个界面即可自由切换不同 AI 能力。此外，它还率先支持 MCP（Model Context Protocol）协议，增强了上下文处理能力。针对企业用户，NextChat 提供专业版解决方案，具备品牌定制、细粒度权限控制、内部知识库整合及安全审计等功能，满足公司对数据隐私和个性化管理的高标准要求。",87618,"2026-04-05T07:20:52",[13,26],{"id":44,"name":45,"github_repo":46,"description_zh":47,"stars":48,"difficulty_score":23,"last_commit_at":49,"category_tags":50,"status":16},2268,"ML-For-Beginners","microsoft\u002FML-For-Beginners","ML-For-Beginners 是由微软推出的一套系统化机器学习入门课程，旨在帮助零基础用户轻松掌握经典机器学习知识。这套课程将学习路径规划为 12 周，包含 26 节精炼课程和 52 道配套测验，内容涵盖从基础概念到实际应用的完整流程，有效解决了初学者面对庞大知识体系时无从下手、缺乏结构化指导的痛点。\n\n无论是希望转型的开发者、需要补充算法背景的研究人员，还是对人工智能充满好奇的普通爱好者，都能从中受益。课程不仅提供了清晰的理论讲解，还强调动手实践，让用户在循序渐进中建立扎实的技能基础。其独特的亮点在于强大的多语言支持，通过自动化机制提供了包括简体中文在内的 50 多种语言版本，极大地降低了全球不同背景用户的学习门槛。此外，项目采用开源协作模式，社区活跃且内容持续更新，确保学习者能获取前沿且准确的技术资讯。如果你正寻找一条清晰、友好且专业的机器学习入门之路，ML-For-Beginners 将是理想的起点。",84991,"2026-04-05T10:45:23",[14,51,52,53,15,54,26,13,55],"数据工具","视频","插件","其他","音频",{"id":57,"name":58,"github_repo":59,"description_zh":60,"stars":61,"difficulty_score":10,"last_commit_at":62,"category_tags":63,"status":16},3128,"ragflow","infiniflow\u002Fragflow","RAGFlow 
是一款领先的开源检索增强生成（RAG）引擎，旨在为大语言模型构建更精准、可靠的上下文层。它巧妙地将前沿的 RAG 技术与智能体（Agent）能力相结合，不仅支持从各类文档中高效提取知识，还能让模型基于这些知识进行逻辑推理和任务执行。\n\n在大模型应用中，幻觉问题和知识滞后是常见痛点。RAGFlow 通过深度解析复杂文档结构（如表格、图表及混合排版），显著提升了信息检索的准确度，从而有效减少模型“胡编乱造”的现象，确保回答既有据可依又具备时效性。其内置的智能体机制更进一步，使系统不仅能回答问题，还能自主规划步骤解决复杂问题。\n\n这款工具特别适合开发者、企业技术团队以及 AI 研究人员使用。无论是希望快速搭建私有知识库问答系统，还是致力于探索大模型在垂直领域落地的创新者，都能从中受益。RAGFlow 提供了可视化的工作流编排界面和灵活的 API 接口，既降低了非算法背景用户的上手门槛，也满足了专业开发者对系统深度定制的需求。作为基于 Apache 2.0 协议开源的项目，它正成为连接通用大模型与行业专有知识之间的重要桥梁。",77062,"2026-04-04T04:44:48",[15,14,13,26,54],{"id":65,"github_repo":66,"name":67,"description_en":68,"description_zh":69,"ai_summary_zh":69,"readme_en":70,"readme_zh":71,"quickstart_zh":72,"use_case_zh":73,"hero_image_url":74,"owner_login":75,"owner_name":76,"owner_avatar_url":77,"owner_bio":78,"owner_company":79,"owner_location":80,"owner_email":81,"owner_twitter":82,"owner_website":83,"owner_url":84,"languages":85,"stars":98,"forks":99,"last_commit_at":100,"license":101,"difficulty_score":10,"env_os":102,"env_gpu":103,"env_ram":104,"env_deps":105,"category_tags":114,"github_topics":115,"view_count":23,"oss_zip_url":122,"oss_zip_packed_at":122,"status":16,"created_at":123,"updated_at":124,"faqs":125,"releases":126},2919,"Beomi\u002FKoAlpaca","KoAlpaca","KoAlpaca: 한국어 명령어를 이해하는 오픈소스 언어모델 (KoAlpaca: An open-source language model to understand Korean instructions)","KoAlpaca 是一款专为韩语优化的开源大语言模型，旨在让 AI 精准理解并执行韩语指令。它基于斯坦福 Alpaca 的训练方法构建，核心解决了通用大模型在韩语语境下表现不佳、难以处理本土化任务的问题，填补了高质量韩语指令微调模型的空白。\n\n该项目非常适合自然语言处理研究人员、AI 开发者以及希望探索韩语大模型应用的普通用户。对于开发者而言，KoAlpaca 提供了极高的灵活性与实用性：它不仅发布了基于 Polyglot-ko 和 LLaMA 等多个基座模型的版本，还详细公开了使用 QLoRA 技术在消费级显卡（如 RTX 3090\u002F4090）甚至免费 Colab 环境中进行高效微调的代码与教程。这种低资源门槛的训练方案，使得个人研究者也能轻松参与韩语大模型的定制与开发。此外，项目曾提供网页聊天演示及机器人接口，方便用户直观体验其对话能力。作为韩语 AI 生态的重要基础设施，KoAlpaca 以开放共享的精神，推动了韩语自然语言处理技术的普及与创新。","\n\u003Cp align=\"center\" width=\"100%\">\n\u003Cimg src=\"assets\u002Fnew_koalpaca_final.svg\" alt=\"KoAlpaca icon\" style=\"width: 200px; height:200px; display: block; margin: auto; border-radius: 
50%;\">\n\u003C\u002Fp>\n\n## Update Logs\n\n- 2024.05\n  - GPU 서버 지원 종료로 웹앱 데모를 종료합니다.\n- 2023.07.28\n  - [RTX3090(4090\u002FA5000) 4대 사용한 TensorParallel 기반 학습 예제 코드](https:\u002F\u002Fgithub.com\u002FBeomi\u002FKoAlpaca\u002Ftree\u002Fmain\u002Ftrain_v1.1b#tensor-parallel-polyglot-ko-58b-%EB%AA%A8%EB%8D%B8-%ED%95%99%EC%8A%B5-w-4x-rtx-3090-gpu) 추가\n- 2023.06.12\n    - Kakao\u002FTelegram 봇 안내 삭제\n- 2023.06.08\n    - [Colab: QLoRA로 학습한 adapter weight 로드해서 Inference하기](https:\u002F\u002Fgithub.com\u002FBeomi\u002FKoAlpaca\u002Fblob\u002Fmain\u002F2023_06_08_PEFT%EB%A1%9CLoRA%EB%A1%9C%EB%93%9C.ipynb) 추가\n        - Colab Free(T4 GPU)에서 로드 가능합니다.\n        - 예제로 [🤗beomi\u002Fqlora-koalpaca-polyglot-12.8b-50step](https:\u002F\u002Fhuggingface.co\u002Fbeomi\u002Fqlora-koalpaca-polyglot-12.8b-50step) 레포(50step 학습한 예제)를 사용합니다.\n        \u003Ca style='display:inline' target=\"_blank\" href=\"https:\u002F\u002Fcolab.research.google.com\u002Fgithub\u002FBeomi\u002FKoAlpaca\u002Fblob\u002Fmain\u002F2023_06_08_PEFT%EB%A1%9CLoRA%EB%A1%9C%EB%93%9C.ipynb\">\n          \u003Cimg src=\"https:\u002F\u002Fcolab.research.google.com\u002Fassets\u002Fcolab-badge.svg\" alt=\"Open In Colab\"\u002F>\n        \u003C\u002Fa>\n        \n- 2023.06.02\n    - [🐧KoRWKV-6B🤗](https:\u002F\u002Fhuggingface.co\u002Fbeomi\u002FKoRWKV-6B) 28B tokens ckpt 기반 [🐧KoAlpaca-KoRWKV-6B🤗](https:\u002F\u002Fhuggingface.co\u002Fbeomi\u002FKoAlpaca-KoRWKV-6B) v1.1a버전 학습 및 모델 공개\n        > [🐧KoAlpaca-KoRWKV-6B 허깅페이스 레포🤗](https:\u002F\u002Fhuggingface.co\u002Fbeomi\u002FKoAlpaca-KoRWKV-6B)에 몇가지 생성 결과 샘플이 있습니다.\n    - [RTX 3090\u002F4090\u002FA5000으로 Polyglot-ko 5.8b full finetune](https:\u002F\u002Fgithub.com\u002FBeomi\u002FKoAlpaca\u002Fblob\u002Fmain\u002Ftrain_v1.1b\u002FREADME.md#polyglot-ko-58b-%EB%AA%A8%EB%8D%B8-%ED%95%99%EC%8A%B5-w-3x-rtx-3090-gpu-%ED%98%B9%EC%9D%80-rtx-4090-a5000) 가이드 추가\n\n- 2023.05.30\n    - [무료 Colab(T4 GPU)에서 동작하는 학습 
예제](https:\u002F\u002Fcolab.research.google.com\u002Fgist\u002FBeomi\u002Fa3032e4eaa33b86fdf8de1f47f15a647\u002F2023_05_26_bnb_4bit_koalpaca_v1_1a_on_polyglot_ko_12_8b.ipynb) 추가: QLoRA로 Polyglot-ko 12.8B 모델 KoAlpaca v1.1a 학습 with 4bit \n        - Colab 무료 - T4에서는 50sample에 약 4분, 1epoch에 약 15시간(최대 사용시간 초과)\n        - Colab Pro - A100에서는 50sample에 약 44초, 1epochs에 약 2-3시간\n        \u003Ca style='display:inline' target=\"_blank\" href=\"https:\u002F\u002Fcolab.research.google.com\u002Fgist\u002FBeomi\u002Fa3032e4eaa33b86fdf8de1f47f15a647\u002F2023_05_26_bnb_4bit_koalpaca_v1_1a_on_polyglot_ko_12_8b.ipynb\">\n          \u003Cimg src=\"https:\u002F\u002Fcolab.research.google.com\u002Fassets\u002Fcolab-badge.svg\" alt=\"Open In Colab\"\u002F>\n        \u003C\u002Fa>\n\n- 2023.05.03\n    - [🤗Polyglot-ko 5.8B 기반 KoAlpaca-Polyglot-5.8B-v1.1b fp16 모델](https:\u002F\u002Fhuggingface.co\u002Fbeomi\u002FKoAlpaca-Polyglot-5.8B) 및 [🤗8bit 모델](https:\u002F\u002Fhuggingface.co\u002Fbeomi\u002FKoAlpaca-Polyglot-5.8B\u002Ftree\u002F8bit) 공개\n- 2023.04.28\n    - Logo 업데이트 🎉\n- 2023.04.26\n    - 웹UI에 [Hugging Face ChatUI](https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fhuggingchat\u002Fchat-ui\u002Ftree\u002Fmain) 적용\n      - 해당 ChatKoAlpaca GPU서버는 **모두의연구소 K-디지털 플랫폼의 지원을 받아 서비스되었습니다.(2024.05 종료)** \n- 2023.04.20\n    - [KoAlpaca v1.1 데이터셋](KoAlpaca_v1.1.jsonl) 공개\n    - [🤗Polyglot-ko 12.8B 기반 KoAlpaca-Polyglot-12.8B-v1.1b fp16 모델](https:\u002F\u002Fhuggingface.co\u002Fbeomi\u002FKoAlpaca-Polyglot-12.8B) 및 [🤗8bit 모델](https:\u002F\u002Fhuggingface.co\u002Fbeomi\u002FKoAlpaca-Polyglot-12.8B\u002Ftree\u002F8bit) 공개\n    - [ChatKoAlpaca Web UI(v1.1b)](https:\u002F\u002Fchat.koalpaca.com\u002F) 공개\n\n- 2023.03.26: [🤗LLAMA 30B 기반 KoAlpaca 모델](https:\u002F\u002Fhuggingface.co\u002Fbeomi\u002FKoAlpaca-30B-LoRA)을 공개합니다. (LoRA로 학습)\n  - LLAMA 30B 학습은 GIST [Sundong Kim](https:\u002F\u002Fsundong.kim) 교수님의 A100 지원으로 학습되었습니다. 
감사합니다 😄\n\n- 2023.03.24: [🤗LLAMA 13B 기반 KoAlpaca 모델](https:\u002F\u002Fhuggingface.co\u002Fbeomi\u002FKoAlpaca-13B-LoRA)을 공개합니다. (LoRA로 학습) \n\n- 2023.03.23: [🤗LLAMA 65B 기반 KoAlpaca 모델](https:\u002F\u002Fhuggingface.co\u002Fbeomi\u002FKoAlpaca-65B-LoRA)을 공개합니다. (LoRA로 학습) \n\n- 2023.03.22: 카카오톡에 포팅된 [KoAlpaca 봇](http:\u002F\u002Fpf.kakao.com\u002F_wdRxcxj)이 추가되었습니다. \n\n\n---\n\n# KoAlpaca: Korean Alpaca Model based on Stanford Alpaca (feat. LLAMA and Polyglot-ko)\n\nStanford Alpaca 모델을 학습한 방식과 동일한 방식으로 학습을 진행한, 한국어를 이해하는 Alpaca 모델입니다.\n\n## 바로 써보기: 채팅형 웹 페이지(v1.1b-chat) 혹은 카카오톡 봇, Telegram Bot으로 만나보세요!\n\n### ChatKoAlpaca: 채팅형 KoAlpaca (Polyglot-ko-12.8b-v1.1b)\n\n- 해당 데모는 [모두연 KoAlpaca Lab](https:\u002F\u002Fmodulabs.co.kr\u002Fproduct\u002Flab-11500-2023-05-10-074332\u002F)에서 모두의연구소 K-디지털 플랫폼의 지원 받아 진행되었습니다. 감사합니다 😍\n- 2024년 5월 기준 서버 지원 종료로 서비스를 종료합니다.\n\n![ChatKoAlpaca Web page with chat input](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FBeomi_KoAlpaca_readme_c369a0895323.jpeg)\n\n- 설치 없이 Web으로 곧바로 만날 수 있습니다.\n- 이전 대화의 맥락을 기억해 답변을 생성합니다.\n- 최대 512 토큰의 응답을 생성하며, 이전 맥락은 (신규 생성 답변을 포함해) 최대 1024 토큰을 기억합니다.\n- 데모에서는 A5000 24G GPU 2대로 Tensor Parallel을 통해 서빙중이며, fp16 모델로 서비스중입니다.\n\n\u003C!-- 위 서빙에 사용한 Gradio 코드는 [webui 폴더](https:\u002F\u002Fgithub.com\u002FBeomi\u002FKoAlpaca\u002Ftree\u002Fmain\u002Fwebui)를 참고해주세요. 
-->\n\n\u003C!--\n\n### 카카오톡 봇: KoAlpaca-Polyglot-ko-5.8b-v1.1b\n\n- 응답 5초 제한으로 인해 200자 제한\n- 맥락 기억하지 않습니다.\n\n링크: 아래 이미지 클릭 or [http:\u002F\u002Fpf.kakao.com\u002F_wdRxcxj](http:\u002F\u002Fpf.kakao.com\u002F_wdRxcxj)\n\n\u003Cp align=\"center\" width=\"100%\">\n\u003Ca href=\"http:\u002F\u002Fpf.kakao.com\u002F_wdRxcxj\">\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FBeomi_KoAlpaca_readme_a870f43ef81a.png\" alt=\"KoAlpaca KakaoTalk Example\" style=\"width: 50%; max-width: 300px; display: block; margin: auto;\">\n\u003C\u002Fa>\n\u003C\u002Fp>\n\n\n### 텔레그램 봇: 한국어 5.8B v1.1b 모델 및 LLAMA 7B v1.0 모델 사용가능, 맥락 기억 X\n\n- 최대 입력 1024자, 최대 출력 512토큰\n- 맥락 기억하지 않습니다.\n\n아래 QR코드를 찍거나, 혹은 [https:\u002F\u002Ft.me\u002FKoAlpacaBot](https:\u002F\u002Ft.me\u002FKoAlpacaBot)에서 만나보세요!\n\n\u003Cp align=\"center\" width=\"100%\">\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FBeomi_KoAlpaca_readme_7f98a3e26ff4.jpg\" alt=\"KoAlpaca Telegram Bot QR Code\" style=\"width: 50%; max-width: 300px; display: block; margin: auto;\">\n\u003C\u002Fp>\n\n한국어 기반 모델과 LLAMA 기반 모델 모두 변경해가며 체험하실 수 있습니다!\n\n\u003Cp align=\"center\" width=\"100%\">\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FBeomi_KoAlpaca_readme_02e8ec58347b.png\" alt=\"KoAlpaca Telegram Bot QR Code\" style=\"width: 45%; max-width: 300px; display: inline-block; margin: auto;\">\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FBeomi_KoAlpaca_readme_64788014bf8b.png\" alt=\"KoAlpaca Telegram Bot QR Code\" style=\"width: 45%; max-width: 300px; display: inline-block; margin: auto;\">\n\u003C\u002Fp>\n\n> 📣 각 모델별 한 대의 GPU로 서빙되고 있어서, 생성에 시간이 걸릴 수 있습니다. 속도를 위해 생성 최대 토큰은 요청당 128토큰으로 제한되어있습니다.\n\n-->\n\n## 완성된 모델: 한국어 기반 모델(Polyglot-ko) & 영한 모델(LLAMA)\n\nKoAlpaca는 백본 모델로 한국어 모델은 Polyglot-ko(5.8B)모델을, 영문+한국어 기반 모델은 LLAMA를 사용하였습니다.\n\n1. 
**Polyglot-ko 12.8B 기반 [Full Finetune v1.1]** -> [🤗 beomi\u002FKoAlpaca-Polyglot-12.8B](https:\u002F\u002Fhuggingface.co\u002Fbeomi\u002FKoAlpaca-Polyglot-12.8B)\n  - 해당 모델은 [ChatKoAlpaca](https:\u002F\u002Fchat.koalpaca.com\u002F)에서 사용되었습니다.\n  - 해당 모델은 [🤗 8Bit quantized된 모델](https:\u002F\u002Fhuggingface.co\u002Fbeomi\u002FKoAlpaca-Polyglot-12.8B\u002Ftree\u002F8bit)도 제공합니다.\n2. Polyglot-ko 5.8B 기반 [Full Finetune] -> [🤗 beomi\u002FKoAlpaca-Polyglot](https:\u002F\u002Fhuggingface.co\u002Fbeomi\u002FKoAlpaca-Polyglot)\n3. Meta LLAMA 7B 기반 [Full Finetune] -> [🤗 beomi\u002FKoAlpaca](https:\u002F\u002Fhuggingface.co\u002Fbeomi\u002FKoAlpaca)\n4. Meta LLAMA 13B 기반 [LoRA] -> [🤗 KoAlpaca-13B-LoRA](https:\u002F\u002Fhuggingface.co\u002Fbeomi\u002FKoAlpaca-13B-LoRA)\n5. Meta LLAMA 30B 기반 [LoRA] -> [🤗 KoAlpaca-30B-LoRA](https:\u002F\u002Fhuggingface.co\u002Fbeomi\u002FKoAlpaca-30B-LoRA)\n6. Meta LLAMA 65B 기반 [LoRA] -> [🤗 KoAlpaca-65B-LoRA](https:\u002F\u002Fhuggingface.co\u002Fbeomi\u002FKoAlpaca-65B-LoRA)\n\n> 참고: LLAMA 13B로는, 찬성님([@deep-diver](https:\u002F\u002Fgithub.com\u002Fdeep-diver))이 학습하신 [🤗chansung\u002Fkoalpaca-lora-13b](https:\u002F\u002Fhuggingface.co\u002Fchansung\u002Fkoalpaca-lora-13b)가 추가적으로 공개되어있습니다.)\n\nMeta의 LLAMA 모델은 한국어 데이터셋을 충분히 학습하지 않아서, 실제 Inference를 돌려보았을 때 한국어 성능이 낮게 나오는 이슈가 있습니다.\n\n보다 한국어에 나은 퀄리티를 얻어보고자 Polyglot-ko 5.8B, 12.8B 모델을 백본으로 사용한 모델도 학습시켰습니다.\n\n이후 v1.1 데이터와 함께, 이전 대화의 맥락과 사용자의 추가 입력의 맥락을 이해하는 모델인 KoAlpaca v1.1을 공개합니다.\n\n\n## KoAlpaca 모델 실행 예시 코드\n\n### Huggingface Pipeline으로 실행\n\n\u003Ca target=\"_blank\" href=\"https:\u002F\u002Fcolab.research.google.com\u002Fdrive\u002F1a5xVG4VSh5TSRfSVchyCqk9onadhIxTs?usp=sharing\">\n  \u003Cimg src=\"https:\u002F\u002Fcolab.research.google.com\u002Fassets\u002Fcolab-badge.svg\" alt=\"Open In Colab\"\u002F>\n\u003C\u002Fa>\n\n- 최신버전 torch \u002F HF 라이브러리 설치\n\n```bash\npip install -U torch transformers tokenizers accelerate safetensors\n```\n\n아래 예제 코드로 실행해볼 수 있습니다.\n\n```python\nimport 
torch\nfrom transformers import pipeline, AutoModelForCausalLM\n\nMODEL = 'beomi\u002FKoAlpaca-Polyglot-5.8B'\n\nmodel = AutoModelForCausalLM.from_pretrained(\n    MODEL,\n    torch_dtype=torch.float16,\n    low_cpu_mem_usage=True,\n).to(device=f\"cuda\", non_blocking=True)\nmodel.eval()\n\npipe = pipeline(\n    'text-generation', \n    model=model,\n    tokenizer=MODEL,\n    device=0\n)\n\ndef ask(x, context='', is_input_full=False):\n    ans = pipe(\n        f\"### 질문: {x}\\n\\n### 맥락: {context}\\n\\n### 답변:\" if context else f\"### 질문: {x}\\n\\n### 답변:\", \n        do_sample=True, \n        max_new_tokens=512,\n        temperature=0.7,\n        top_p=0.9,\n        return_full_text=False,\n        eos_token_id=2,\n    )\n    print(ans[0]['generated_text'])\n\nask(\"딥러닝이 뭐야?\")\n# 딥러닝은 인공신경망을 통해 입력과 출력 사이의 복잡한 관계를 학습하는 머신러닝의 한 분야입니다. 이 기술은 컴퓨터가 인간의 학습 능력과 유사한 방식으로 패턴을 학습하도록 하며, 인간의 개입 없이도 데이터를 처리할 수 있는 기술입니다. 최근에는 딥러닝을 활용한 인공지능 애플리케이션이 많이 개발되고 있습니다. 예를 들어, 의료 진단 애플리케이션에서는 딥러닝 기술을 활용하여 환자의 특징을 파악하고, 이를 통해 빠르고 정확한 진단을 내리는 데 사용됩니다. 또한, 금융 분야에서는 딥러닝 기술을 활용하여 주가 예측 모형을 학습하는 데 사용되기도 합니다. \n```\n\n### Gradio 샘플 - Fullfinetune & 8bit 🌟 추천 🌟\n\n이 레포의 [webui 폴더](https:\u002F\u002Fgithub.com\u002FBeomi\u002FKoAlpaca\u002Ftree\u002Fmain\u002Fwebui)를 참고해주세요.\n\n### Gradio 샘플 - LoRA\n\n찬성님([@deep-diver](https:\u002F\u002Fgithub.com\u002Fdeep-diver))의 [deep-diver\u002FAlpaca-LoRA-Serve](https:\u002F\u002Fgithub.com\u002Fdeep-diver\u002FAlpaca-LoRA-Serve)레포를 참고해 실행해보세요.\n\n```bash\n# 위 Repo의 가이드대로 설치 후..\nBASE_URL=decapoda-research\u002Fllama-13b-hf  # 30B모델은 llama-30b-hf 65B모델은 llama-65b-hf 주의: 65B 모델은 기본 용량이 100GB 이상입니다.\nFINETUNED_CKPT_URL=beomi\u002FKoAlpaca-13B-LoRA  # 30B모델은 KoAlpaca-30B-LoRA 65B모델은 KoAlpaca-65B-LoRA 주의: 65B 모델은 A100 80G등 Vram이 아주 큰 GPU에서만 돌아갑니다.\n\npython app.py --base_url $BASE_URL --ft_ckpt_url $FINETUNED_CKPT_URL --port 6006\n```\n\n## 데이터셋 v1.1 제작 방법\n\nv1.1 데이터셋은 기존 Alpaca모델이 대답을 짧게 하는 경향, 그리고 맥락을 이해하지 못하는 경향을 개선하기 위해 제작되었습니다.\n\n1. 
데이터셋 선정 및 수집\n\nv1.1 데이터셋은 네이버 지식인을 크롤링하여 만들었습니다.\n\n[네이버 지식인 베스트](https:\u002F\u002Fkin.naver.com\u002Fbest\u002Flistaha.naver) 전체 질문을 수집하였습니다.\n\n2. 데이터 가공\n\n실제 생성 데이터는 질문과 질문 본문, 그리고 채택된 답변의 본문을 기반으로 새로 생성되었습니다. \n\n데이터를 ChatGPT를 통해 생성하였고, 이때 Batch Decoding을 통해 생성하였습니다.\n\n> 참고: 기존 지식인 질문과 답변은 ChatGPT가 참조한 시드로 사용되었습니다\n\n## 데이터셋 v1.0 제작 방법\n\n데이터셋은 기본적으로 [Stanford Alpaca에서 제공한 5만2천개 데이터셋](.\u002Fen_alpaca_data.json)을 기반으로 합니다.\n\n1. 데이터셋 번역\n\nAlpaca 데이터셋은 다음과 같이 Instruct 부분과 Input, 그리고 Output 부분으로 구성되어있습니다.\n\n```json\n...\n{\n    \"instruction\": \"Describe a time when you had to make a difficult decision.\",\n    \"input\": \"\",\n    \"output\": \"I had to make a difficult decision when I was working as a project manager at a construction company. I was in charge of a project that needed to be completed by a certain date in order to meet the client\\u2019s expectations. However, due to unexpected delays, we were not able to meet the deadline and so I had to make a difficult decision. I decided to extend the deadline, but I had to stretch the team\\u2019s resources even further and increase the budget. Although it was a risky decision, I ultimately decided to go ahead with it to ensure that the project was completed on time and that the client\\u2019s expectations were met. The project was eventually successfully completed and this was seen as a testament to my leadership and decision-making abilities.\"\n},\n{\n    \"instruction\": \"Identify the odd one out.\",\n    \"input\": \"Twitter, Instagram, Telegram\",\n    \"output\": \"Telegram\"\n},\n...\n```\n\nOutput의 경우 OpenAI의 `text-davinci-003` 모델의 생성 결과이기 때문에, 해당 부분은 번역하지 않았습니다.\n\n따라서 Instruct와 Input 컬럼을 번역하고, 해당 Input부분에 에러가 있는 일부를 제외하였습니다.\n\n번역을 하기 위해 DeepL API 서비스를 사용하였고, 번역된 데이터는 다음과 같은 형태가 됩니다.\n\n```json\n...\n{\n    \"instruction\": \"홀수 중 하나를 밝히세요.\",\n    \"input\": \"트위터, 인스타그램, 텔레그램\"\n}\n...\n```\n\n2. 
Output 데이터 생성\n\n이후 해당 Instruct와 Input은 다음과 같이 Batch Decoding 기법을 통해서 OpenAI ChatGPT API(`gpt-3.5-turbo`)를 통해 데이터를 생성합니다. \n\nAlpaca 연구와 동일하게, 한 Instruct\u002FInput 셋에 대해서 단일 응답만을 생성해 데이터셋을 구성하였습니다.\n\n기존 연구에서는 20개의 데이터셋을 한번에 생성했지만, 한국어로 생성을 진행해보았을 때 10개를 동시에 생성할 때 안정적으로 생성이 진행되어 10개씩 생성하였습니다.\n\n아래는 답변을 생성하는데 사용한 Prompt입니다.\n\n```python\nPROMPT = \"\"\"\\\n다양한 작업에 대한 답변을 생성해주세요. 이러한 작업 지침은 ChatGPT 모델에 주어지며, ChatGPT 모델이 지침을 완료하는지 평가합니다.\n\n요구 사항은 다음과 같습니다:\n1. 다양성을 극대화하기 위해 각 지시에 대해 동사를 반복하지 않도록 하세요.\n2. 지시에 사용되는 언어도 다양해야 합니다. 예를 들어, 질문과 명령형 지시를 결합해야 합니다.\n3. 지시 사항의 유형이 다양해야 합니다. 목록에는 개방형 생성, 분류, 편집 등과 같은 다양한 유형의 작업이 포함되어야 합니다.\n2. GPT 언어 모델은 지시를 완료할 수 있어야 합니다. 예를 들어 어시스턴트에게 시각적 또는 오디오 출력을 생성하도록 요청하지 마세요. 또 다른 예로, 어시스턴트가 어떤 작업도 수행할 수 없으므로 오후 5시에 깨우거나 미리 알림을 설정하도록 요청하지 마세요.\n3. 답변은 한국어로 작성해야 합니다.\n4. 답변을 1~2문장으로 작성하세요. 명령문이나 질문도 허용됩니다.\n5. 지시 사항에 대한 적절한 입력을 생성해야 합니다. 입력 필드에는 지시에 대한 구체적인 예가 포함되어야 합니다. 실제 데이터를 포함해야 하며 단순한 자리 표시자를 포함해서는 안 됩니다. 입력은 지시 사항을 어렵게 만들 수 있는 상당한 내용을 제공해야 하지만 100단어를 넘지 않는 것이 이상적입니다.\n6. 일부 지시사항은 추가 입력이 있고, 일부 지시에는 입력 필드가 비어있습니다. 예를 들어 \"세계에서 가장 높은 봉우리는 무엇인가?\"라는 일반적인 정보를 묻는 지시의 경우 구체적인 맥락을 제공할 필요가 없어, 입력 필드가 비어있을 수 있습니다.\n7. 출력은 명령어와 입력에 대한 적절한 응답이어야 합니다.\n\n아래에 10개의 명령어와 입력(옵션)에 따라 적절한 응답을 생성하세요. \n응답은 아래와 같은 형식으로 10가지를 0번 부터 9번 까지, 번호에 따라 해당 번호의 명령어와 입력에 알맞게 작성하세요.\n\n각 응답 사이는 ### 으로 내용을 분리해주세요.\n\n응답0: 첫 번째 응답내용###\n응답1: 두 번째 응답내용###\n...\n응답9: 마지막 응답내용\"\"\"\n```\n\n추가적으로, 아래와 같이 ChatGPT API의 system prompt를 추가하였습니다.\n\n```python\n\"content\": \"아래는 작업을 설명하는 명령어입니다. 입력이 없으면 입력을 제외하고, 명령어에 따른 요청을 적절히 완료하는 응답을 작성하세요. 
추가적 입력이 있다면 작업을 설명하는 명령어와 추가 컨텍스트를 제공하는 입력에 따라 요청을 적절히 완료하는 응답을 작성하세요.\",\n```\n\n한국어로 생성이 완료된 데이터셋은 `ko_alpaca_data.json`에 저장되어 있습니다.\n\n## Fine tune 방식 모델 학습 방법\n\n### QLoRA 4bit 학습방법 Google Colab 예제\n\n- QLoRA로 Polyglot-ko 12.8B 모델 KoAlpaca v1.1a 학습 with 4bit \n\n\u003Ca style='display:inline' target=\"_blank\" href=\"https:\u002F\u002Fcolab.research.google.com\u002Fgist\u002FBeomi\u002Ff163a6c04a869d18ee1a025b6d33e6d8\u002F2023_05_26_bnb_4bit_koalpaca_v1_1a_on_polyglot_ko_12_8b.ipynb\">\n  \u003Cimg src=\"https:\u002F\u002Fcolab.research.google.com\u002Fassets\u002Fcolab-badge.svg\" alt=\"Open In Colab\"\u002F>\n\u003C\u002Fa>\n\n- 1000개 샘플로만 학습한 모델 생성결과 예시 \n\n```\n### 질문: 풋옵션과 콜옵션의 차이, 그리고 일반 개미 투자자가 선택해야 할 포지션은?\n\n### 답변: 풋이란 주식을 사는 것(구매)을 의미하고, 콜이란 주식을 파는(판매) 것입니다. 예를 들면, 기업의 주가가 10만원인데, 그 회사에서 신규 사업을 진행 중이라, 앞으로 매출이 증대되어 기업 가치가 상승할 것으로 판단해, 해당 회사의 주식을 사는 것이 풋옵션입니다. 이는 주식을 사는 시점과 해당 회사의 주가가 하락할 경우에는 기업의 주가와 함께 주가가 하락하는 리스크가 있습니다. 반면, 콜옵션은 주식을 파는 것입니다. 이는 주식을 파는 시점과 해당 회사의 주가가 증가해서 향후 기업 전망이 밝을 경우 주가가 상승하며, 해당 주식의 프리미엄을 받을 수 있는 안정적인 투자법입니다. 일반 개미 투자자들은 주식을 매수할 때와 주식을 매도할 때 다양한 리스크가 존재하므로, 전문가의 조언과 투자 경험을 바탕으로 적절한 투자 전략을 수립해야 합니다. 투자에 대한 자세한 사항은 아래에서 무료로 얻을 수 있습니다.\n```\n\n```\n### 질문: 마진콜이 발생하는 이유가 뭐야? 그리고 어떻게 해야 마진콜을 막을 수 있어?\n\n### 답변: 마진콜이란, 선물\u002F옵션 매매 시 증거금이 부족해서 추가증거금 예치를 요구받는 것을 말합니다. 추가증거금은 선물\u002F옵션 거래 계약시 미리 정해진 약정증거금에서 추가로 부담해야 하는 증거금을 의미합니다. 통상적으로 매수 계약 시에는 계약 금액의 10%를 증거금으로 예치하고, 매도 계약 시에는 해당 금액의 3%를 예치합니다. 따라서, 매수 계약 시에는 총 20%의 증거금이 필요하고, 매도 계약 시에는 총 9%의 증거금이 필요합니다. 따라서, 선물 매매 중에는 매일 매일의 시세에 따라 증거금을 수시로 필요하면 추가로 납부해야해서 추가증거금이 발생하며, 해당 필요액 부족시에는 마진콜이 발생합니다. 따라서, 선물 거래를 하기 전에는 꼭 추가증거금을 예치하는 것이 중요합니다. 추가증거금은 '선물\u002F옵션 -> 증거금 -> 추가예치\u002F청산' 방법으로 조회 가능하며, 인터넷이나 전화로도 조회할 수 있습니다. 또, '주문\u002F증거금 -> 추가증거금'을 통해 추가증거금을 입금할 수도 있습니다. 
추가증거금을 예치\n```\n\n### ChatKoAlpaca(KoAlpaca Polyglot-ko-12.8B-v1.1b) 모델 학습\n\nChatKoAlpaca(KoAlpaca Polyglot-ko-12.8B-v1.1b) 모델은 A100 80GB 4대로 학습을 진행하였고,\nDeepSpeed ZeRO-3을 통해 Full Sharding으로 학습을 진행하였습니다.\n\n해당 코드는 현재 레포의 `train_v1.1b`폴더를 참고해주세요.\n\n### Polyglot-ko 5.8B 모델 학습\n\n모델 학습은 A100 80GB 1대로 학습을 진행하였습니다. \n\n### LLAMA 7B 모델 학습\n\n![KoAlpaca LLAMA Train Loss Graph](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FBeomi_KoAlpaca_readme_f1b9764df0a5.jpeg)\n\nLLAMA기반 모델 학습은 A100 80GB 4대로 학습을 진행하였고, FSDP를 통해 학습했습니다. \n\n학습에 사용한 코드는 `train.py`에 저장되어 있고, 학습에 사용한 스크립트는 `train.sh`에 저장되어 있습니다.\n\n#### FSDP 모델 학습시 유의사항\n\nLLAMA 모델이 Huggingface Transformers 4.28.0 이상 버전에 포함되었습니다. 그 이하 버전을 사용하면 `llama`가 없다는 에러가 발생합니다. FSDP 사용시 감싸줄 레이어로는 `LlamaDecoderLayer`를 사용합니다.\n\n```bash\n# train.sh\n--fsdp_transformer_layer_cls_to_wrap 'LlamaDecoderLayer'\n```\n\n## LoRA 방식 모델 학습 방법\n\n### LLAMA 13B 모델 학습\n\n🤗 Huggingface Repo: [https:\u002F\u002Fhuggingface.co\u002Fbeomi\u002FKoAlpaca-13B-LoRA](https:\u002F\u002Fhuggingface.co\u002Fbeomi\u002FKoAlpaca-13B-LoRA)\n\n모델 학습은 RTX 3090 24GB 4대로 진행했습니다. 학습에 사용한 코드는 https:\u002F\u002Fgithub.com\u002Ftloen\u002Falpaca-lora 을 기반으로 사용하였습니다.\n\n데이터셋은 해당 레포의 `alpaca_data_cleaned.json`(영문)과 현재 레포의 `ko_alpaca_data.json`를 합쳐 학습에 사용하였습니다.\n\n### LLAMA 30B 모델 학습\n\n🤗 Huggingface Repo: [https:\u002F\u002Fhuggingface.co\u002Fbeomi\u002FKoAlpaca-30B-LoRA](https:\u002F\u002Fhuggingface.co\u002Fbeomi\u002FKoAlpaca-30B-LoRA)\n\n모델 학습은 A100 40GB 2대로 진행했습니다. 학습에 사용한 코드는 https:\u002F\u002Fgithub.com\u002Ftloen\u002Falpaca-lora 을 기반으로 사용하였습니다.\n\n- LLAMA 30B 학습은 GIST [Sundong Kim](https:\u002F\u002Fsundong.kim) 교수님의 A100 지원으로 학습되었습니다.\n\n데이터셋은 해당 레포의 `alpaca_data_cleaned.json`(영문)과 현재 레포의 `ko_alpaca_data.json`를 합쳐 학습에 사용하였습니다.\n\n### LLAMA 65B 모델 학습\n\n🤗 Huggingface Repo: [https:\u002F\u002Fhuggingface.co\u002Fbeomi\u002FKoAlpaca-65B-LoRA](https:\u002F\u002Fhuggingface.co\u002Fbeomi\u002FKoAlpaca-65B-LoRA)\n\n모델 학습은 A100 80GB 8대로 진행했습니다. 
학습에 사용한 코드는 https:\u002F\u002Fgithub.com\u002Ftloen\u002Falpaca-lora 을 기반으로 사용하였습니다.\n\n데이터셋은 해당 레포의 `alpaca_data_cleaned.json`(영문)과 현재 레포의 `ko_alpaca_data.json`를 합쳐 학습에 사용하였습니다.\n\n\n# Benchmarks\n\n## NSMC Benchmark Test\n\n\u003Ca target=\"_blank\" href=\"https:\u002F\u002Fcolab.research.google.com\u002Fgithub\u002FBeomi\u002FKoAlpaca\u002Fblob\u002Fmaster\u002Fnsmc_polyglot_and_koalpaca_by_few_shot.ipynb\">\n  \u003Cimg src=\"https:\u002F\u002Fcolab.research.google.com\u002Fassets\u002Fcolab-badge.svg\" alt=\"Open In Colab\"\u002F>\n\u003C\u002Fa>\n\n### Benchmark Test 방식\n\n- `polyglot-ko 5.8b`와 `KoAlpaca-polyglot`과 Benchmark 성능 비교\n- Few-shot Learning으로 2 가지 프롬프트 유형으로 구분하여 테스트 진행\n\n#### 1. Few-shot Learning 구성\n\n- few-shot 구성\n  - `k=10` 설정\n  - 텍스트 토큰 길이가 25 이하인 문장만 구성\n  - 데이터 토큰 길이의 약 75 퍼센타일에 해당하는 데이터만 포함\n\n- NSMC 데이터 토큰 길이 분포\n\n```\nFew shot 케이스 토큰 평균 길이:  20.229\nFew shot 케이스 토큰 최대 길이:  280\nFew shot 케이스 토큰 길이 표준편차:  16.488\nFew shot 케이스 토큰 길이 80 퍼센타일:  27.0\n```\n\n#### 2. 프롬프트 구성\n\n- 프롬프트 유형 1: 매우 단순한 프롬프트 구성. 분석 Task에 대한 내용을 명시적으로 하지 않음\n```python\ndef build_prompt_text(sent):\n    return \"문장: \" + sent + '\\n감정:'\n```\n\n- 프롬프트 유형 2: 프롬프트 유형 1에 비해 Task 의도 내용을 포함. 분석 Task에 대한 내용을 의문형으로 표현\n```python\ndef build_prompt_text(sent):\n    return '다음 문장은 긍정일까요 부정일까요?\\n' + sent + '\\n정답:'\n```\n\n#### 3. Benchmark Test 결과\n\n acc. 
| 프롬프트 유형 1 | 프롬프트 유형 2\n-- | -- | --\npolyglot-ko | 0.5752 | 0.7223\n**koalpaca-polyglot** | **0.6939** | **0.7683**\n\n---\n\n## Stanford Alpaca: An Instruction-following LLaMA Model \n[![License](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLicense-Apache_2.0-green.svg)](https:\u002F\u002Fgithub.com\u002Ftatsu-lab\u002Fstanford_alpaca\u002Fblob\u002Fmain\u002FLICENSE) \n[![Python 3.9+](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fpython-3.9+-blue.svg)](https:\u002F\u002Fwww.python.org\u002Fdownloads\u002Frelease\u002Fpython-390\u002F) \n[![Code style: black](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fcode%20style-black-000000.svg)](https:\u002F\u002Fgithub.com\u002Fpsf\u002Fblack) \n\nThis is the repo for the Stanford Alpaca project, which aims to build and share an instruction-following LLaMA model. The repo contains:\n- A [**web demo**](https:\u002F\u002Fcrfm.stanford.edu\u002Falpaca\u002F) to interact with our Alpaca model\n- The [52K data](#data-release) used for fine-tuning the model\n- The code for [generating the data](#data-generation-process)\n- The code for [fine-tuning the model](#fine-tuning)\n\n## Overview\n\nThe current Alpaca model is fine-tuned from a 7B LLaMA model [1] on 52K instruction-following data generated by the techniques in the Self-Instruct [2] paper, with some modifications that we discuss in the next section.\nIn a preliminary human evaluation, we found that the Alpaca 7B model behaves similarly to the `text-davinci-003` model on the Self-Instruct instruction-following evaluation suite [2].\n\nAlpaca is still under development, and there are many limitations that have to be addressed.\nImportantly, we have not yet fine-tuned the Alpaca model to be safe and harmless.\nWe thus encourage users to be cautious when interacting with Alpaca, and to report any concerning behavior to help improve the safety and ethical considerations of the model.\n\nOur initial release contains the data generation procedure, dataset, and 
training recipe. We intend to release the model weights if we are given permission to do so by the creators of LLaMA. For now, we have chosen to host a live demo to help readers better understand the capabilities and limits of Alpaca, as well as a way to help us better evaluate Alpaca's performance on a broader audience.\n\n**Please read our release [blog post](https:\u002F\u002Fcrfm.stanford.edu\u002F2023\u002F03\u002F13\u002Falpaca.html) for more details about the model, our discussion of the potential harm and limitations of Alpaca models, and our thought process for releasing a reproducible model.**\n\n\n[1]: LLaMA: Open and Efficient Foundation Language Models. Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothée Lacroix, Baptiste Rozière, Naman Goyal, Eric Hambro, Faisal Azhar, Aurelien Rodriguez, Armand Joulin, Edouard Grave, Guillaume Lample. https:\u002F\u002Farxiv.org\u002Fabs\u002F2302.13971v1\n\n[2]: Self-Instruct: Aligning Language Model with Self Generated Instructions. Yizhong Wang, Yeganeh Kordi, Swaroop Mishra, Alisa Liu, Noah A. Smith, Daniel Khashabi, Hannaneh Hajishirzi. https:\u002F\u002Farxiv.org\u002Fabs\u002F2212.10560\n\n\n## Data Release\n[`alpaca_data.json`](.\u002Falpaca_data.json) contains 52K instruction-following data we used for fine-tuning the Alpaca model.\nThis JSON file is a list of dictionaries, each dictionary contains the following fields:\n- `instruction`: `str`, describes the task the model should perform. Each of the 52K instructions is unique.\n- `input`: `str`, optional context or input for the task. For example, when the instruction is \"Summarize the following article\", the input is the article. 
Around 40% of the examples have an input.\n- `output`: `str`, the answer to the instruction as generated by `text-davinci-003`.\n\nWe used the following prompts for fine-tuning the Alpaca model:\n- for examples with a non-empty input field:\n ```\n Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.\n \n ### Instruction:\n {instruction}\n \n ### Input:\n {input}\n \n ### Response:\n ```\n- for examples with an empty input field:\n ```\n Below is an instruction that describes a task. Write a response that appropriately completes the request.\n \n ### Instruction:\n {instruction}\n \n ### Response:\n ```\n \n During inference (eg for the web demo), we use the user instruction with an empty input field (second option).\n\n## Data Generation Process\n\n\u003Cdetails>\n\u003Csummary> \u003Cstrong> Running the code \u003C\u002Fstrong> \u003C\u002Fsummary>\n\n1. Set environment variables `OPENAI_API_KEY` to your OpenAI API key.\n2. Install the dependencies with `pip install -r requirements.txt`.\n3. Run `python -m generate_instruction generate_instruction_following_data` to generate the data.\n\n\u003C\u002Fdetails>\n\nWe built on the data generation pipeline from [self-instruct](https:\u002F\u002Fgithub.com\u002Fyizhongw\u002Fself-instruct) and made the following modifications:\n- We used `text-davinci-003` to generate the instruction data instead of `davinci`.\n- We wrote a new prompt (`prompt.txt`) that explicitly gave the requirement of instruction generation to `text-davinci-003`. 
Note: there is a slight error in the prompt we used, and future users should incorporate the edit in https:\u002F\u002Fgithub.com\u002Ftatsu-lab\u002Fstanford_alpaca\u002Fpull\u002F24\n- We adopted much more aggressive batch decoding, i.e., generating 20 instructions at once, which significantly reduced the cost of data generation.\n- We simplified the data generation pipeline by discarding the difference between classification and non-classification instructions.\n- We only generated a single instance for each instruction, instead of 2 to 3 instances as in [2].\n\nThis produced an instruction-following dataset with 52K examples obtained at a much lower cost (less than $500). \nIn a preliminary study, we also find our 52K generated data to be much more diverse than the data released by [self-instruct](https:\u002F\u002Fgithub.com\u002Fyizhongw\u002Fself-instruct\u002Fblob\u002Fmain\u002Fdata\u002Fseed_tasks.jsonl).\nWe plot the below figure (in the style of Figure 2 in the [self-instruct paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2212.10560)) to demonstrate the diversity of our data.\nThe inner circle of the plot represents the root verb of the instructions, and the outer circle represents the direct objects.\n\n[\u002F\u002F]: # (![parse_analysis]&#40;assert\u002Fparse_analysis.png | width=100&#41;)\n[\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FBeomi_KoAlpaca_readme_6c012dea633c.png\" width=\"750\" \u002F>](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FBeomi_KoAlpaca_readme_6c012dea633c.png)\n\n## Fine-tuning\nWe fine-tune our models using standard Hugging Face training code with the following hyperparameters:\n\n| Hyperparameter | Value |\n|----------------|-------|\n| Batch size     | 128   |\n| Learning rate  | 2e-5  |\n| Epochs         | 3     |\n| Max length     | 512   |\n| Weight decay   | 0     |\n\nGiven Hugging Face hasn't officially supported the LLaMA models, we fine-tuned LLaMA with Hugging Face's 
transformers library by installing it from a particular fork (i.e., this [PR](https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Ftransformers\u002Fpull\u002F21955), which is yet to be merged).\nThe hash of the specific commit we installed was `68d640f7c368bcaaaecfc678f11908ebbd3d6176`.\n\nTo reproduce our fine-tuning runs for LLaMA, first install the requirements:\n```bash\npip install -r requirements.txt\n```\nThen, install the particular fork of Hugging Face's transformers library.\n\nBelow is a command that fine-tunes LLaMA-7B with our dataset on a machine with 4 A100 80G GPUs in FSDP `full_shard` mode. \nWe were able to reproduce a model of similar quality to the one we hosted in our demo with the following command using **Python 3.10**.\nReplace `\u003Cyour_random_port>` with a port of your own, `\u003Cyour_path_to_hf_converted_llama_ckpt_and_tokenizer>` with the \npath to your converted checkpoint and tokenizer (following instructions in the PR), and `\u003Cyour_output_dir>` with the directory where you want to store your outputs.\n\n```bash\ntorchrun --nproc_per_node=4 --master_port=\u003Cyour_random_port> train.py \\\n    --model_name_or_path \u003Cyour_path_to_hf_converted_llama_ckpt_and_tokenizer> \\\n    --data_path .\u002Falpaca_data.json \\\n    --bf16 True \\\n    --output_dir \u003Cyour_output_dir> \\\n    --num_train_epochs 3 \\\n    --per_device_train_batch_size 4 \\\n    --per_device_eval_batch_size 4 \\\n    --gradient_accumulation_steps 8 \\\n    --evaluation_strategy \"no\" \\\n    --save_strategy \"steps\" \\\n    --save_steps 2000 \\\n    --save_total_limit 1 \\\n    --learning_rate 2e-5 \\\n    --weight_decay 0. \\\n    --warmup_ratio 0.03 \\\n    --lr_scheduler_type \"cosine\" \\\n    --logging_steps 1 \\\n    --fsdp \"full_shard auto_wrap\" \\\n    --fsdp_transformer_layer_cls_to_wrap 'LLaMADecoderLayer' \\\n    --tf32 True\n```\n\n### Warning\n`fsdp_transformer_layer_cls_to_wrap` must be set to the name of the specific decoder layer. 
\nThe LLaMA Hugging Face PR is not stable. \nEarlier commits used the name `LLaMADecoderLayer` for their decoder layer (the commit our code is based on uses this name). \nMore recent commits use `LlamaDecoderLayer` (note the difference in capitalization).\nNot setting `fsdp_transformer_layer_cls_to_wrap` to the correct name will lead to drastic slowdowns in training.\n\n### Side notes\n\nThe same script also works for OPT fine-tuning. Here's an example for fine-tuning OPT-6.7B:\n\n```bash\ntorchrun --nproc_per_node=4 --master_port=\u003Cyour_random_port> train.py \\\n    --model_name_or_path \"facebook\u002Fopt-6.7b\" \\\n    --data_path .\u002Falpaca_data.json \\\n    --bf16 True \\\n    --output_dir \u003Cyour_output_dir> \\\n    --num_train_epochs 3 \\\n    --per_device_train_batch_size 4 \\\n    --per_device_eval_batch_size 4 \\\n    --gradient_accumulation_steps 8 \\\n    --evaluation_strategy \"no\" \\\n    --save_strategy \"steps\" \\\n    --save_steps 2000 \\\n    --save_total_limit 1 \\\n    --learning_rate 2e-5 \\\n    --weight_decay 0. \\\n    --warmup_ratio 0.03 \\\n    --lr_scheduler_type \"cosine\" \\\n    --logging_steps 1 \\\n    --fsdp \"full_shard auto_wrap\" \\\n    --fsdp_transformer_layer_cls_to_wrap 'OPTDecoderLayer' \\\n    --tf32 True\n```\n\nNote that the given training script is meant to be simple and easy to use, and is not particularly optimized.\nTo run on more GPUs, you may prefer to turn down `gradient_accumulation_steps` to keep a global batch size of 128. Global batch size has not been tested for optimality.\n\n### Authors\nAll grad students below contributed equally and the order is determined by random draw.\n\n- [Rohan Taori](https:\u002F\u002Fwww.rohantaori.com\u002F)\n- [Ishaan Gulrajani](https:\u002F\u002Fishaan.io\u002F)\n- [Tianyi Zhang](https:\u002F\u002Ftiiiger.github.io\u002F)\n- [Yann Dubois](https:\u002F\u002Fyanndubs.github.io\u002F)\n- [Xuechen Li](https:\u002F\u002Fwww.lxuechen.com\u002F)\n\nAll advised by [Tatsunori B. 
Hashimoto](https:\u002F\u002Fthashim.github.io\u002F). Yann is also advised by [Percy Liang](https:\u002F\u002Fcs.stanford.edu\u002F~pliang\u002F) and Xuechen is also advised by [Carlos Guestrin](https:\u002F\u002Fguestrin.su.domains\u002F).\n\n### Citation\n\nPlease cite the repo if you use the data or code in this repo.\n```\n@misc{alpaca,\n  author = {Rohan Taori and Ishaan Gulrajani and Tianyi Zhang and Yann Dubois and Xuechen Li and Carlos Guestrin and Percy Liang and Tatsunori B. Hashimoto },\n  title = {Stanford Alpaca: An Instruction-following LLaMA model},\n  year = {2023},\n  publisher = {GitHub},\n  journal = {GitHub repository},\n  howpublished = {\\url{https:\u002F\u002Fgithub.com\u002Ftatsu-lab\u002Fstanford_alpaca}},\n}\n```\n\nNaturally, you should also cite the original LLaMA paper [1] and the Self-Instruct paper [2].\n\n### Acknowledgements\n\nWe thank Yizhong Wang for his help in explaining the data generation pipeline in Self-Instruct and providing the code for the parse analysis plot.\nWe thank Yifan Mai for helpful support, and members of the Stanford NLP Group as well as the Center for Research on Foundation Models (CRFM) for their helpful feedback.\n","\u003Cp align=\"center\" width=\"100%\">\n\u003Cimg src=\"assets\u002Fnew_koalpaca_final.svg\" alt=\"KoAlpaca图标\" style=\"width: 200px; height:200px; display: block; margin: auto; border-radius: 50%;\">\n\u003C\u002Fp>\n\n## 更新日志\n\n- 2024.05\n  - 因GPU服务器支持终止，Web应用演示将停止服务。\n- 2023.07.28\n  - 增加了基于TensorParallel的训练示例代码，使用4台RTX3090（或4090\u002FA5000）GPU进行训练：[链接](https:\u002F\u002Fgithub.com\u002FBeomi\u002FKoAlpaca\u002Ftree\u002Fmain\u002Ftrain_v1.1b#tensor-parallel-polyglot-ko-58b-%EB%AA%A8%EB%8D%B8-%ED%95%99%EC%8A%B5-w-4x-rtx-3090-gpu)\n- 2023.06.12\n    - 删除了Kakao\u002FTelegram机器人相关说明\n- 2023.06.08\n    - 增加了[Colab：加载QLoRA训练的adapter权重进行推理](https:\u002F\u002Fgithub.com\u002FBeomi\u002FKoAlpaca\u002Fblob\u002Fmain\u002F2023_06_08_PEFT%EB%A1%9CLoRA%EB%A1%9C%EB%93%9C.ipynb)教程\n        - 可在Colab 
Free（T4 GPU）上运行。\n        - 示例中使用了[🤗beomi\u002Fqlora-koalpaca-polyglot-12.8b-50step](https:\u002F\u002Fhuggingface.co\u002Fbeomi\u002Fqlora-koalpaca-polyglot-12.8b-50step)仓库中的模型（经过50步训练的示例）。\n        \u003Ca style='display:inline' target=\"_blank\" href=\"https:\u002F\u002Fcolab.research.google.com\u002Fgithub\u002FBeomi\u002FKoAlpaca\u002Fblob\u002Fmain\u002F2023_06_08_PEFT%EB%A1%9CLoRA%EB%A1%9C%EB%93%9C.ipynb\">\n          \u003Cimg src=\"https:\u002F\u002Fcolab.research.google.com\u002Fassets\u002Fcolab-badge.svg\" alt=\"在Colab中打开\"\u002F>\n        \u003C\u002Fa>\n        \n- 2023.06.02\n    - 基于28B tokens检查点的[🐧KoRWKV-6B🤗](https:\u002F\u002Fhuggingface.co\u002Fbeomi\u002FKoRWKV-6B)，训练并发布了v1.1a版本的[🐧KoAlpaca-KoRWKV-6B🤗](https:\u002F\u002Fhuggingface.co\u002Fbeomi\u002FKoAlpaca-KoRWKV-6B)\n        > [🐧KoAlpaca-KoRWKV-6B Hugging Face仓库🤗](https:\u002F\u002Fhuggingface.co\u002Fbeomi\u002FKoAlpaca-KoRWKV-6B)中包含了一些生成结果示例。\n    - 增加了使用RTX 3090\u002F4090\u002FA5000对Polyglot-ko 5.8B进行全量微调的指南：[链接](https:\u002F\u002Fgithub.com\u002FBeomi\u002FKoAlpaca\u002Fblob\u002Fmain\u002Ftrain_v1.1b\u002FREADME.md#polyglot-ko-58b-%EB%AA%A8%EB%8D%B8-%ED%95%99%EC%8A%B5-w-3x-rtx-3090-gpu-%ED%98%B9%EC%9D%80-rtx-4090-a5000)\n\n- 2023.05.30\n    - 增加了可在免费Colab（T4 GPU）上运行的训练示例：[链接](https:\u002F\u002Fcolab.research.google.com\u002Fgist\u002FBeomi\u002Fa3032e4eaa33b86fdf8de1f47f15a647\u002F2023_05_26_bnb_4bit_koalpaca_v1_1a_on_polyglot_ko_12_8b.ipynb)。使用QLoRA对Polyglot-ko 12.8B模型进行KoAlpaca v1.1a版本的训练，并采用4bit量化。\n        - 在Colab免费版（T4 GPU）上，处理50个样本大约需要4分钟，完成一个epoch大约需要15小时（达到最大使用时长限制）。\n        - 在Colab Pro版（A100 GPU）上，处理50个样本大约需要44秒，完成一个epoch大约需要2–3小时。\n        \u003Ca style='display:inline' target=\"_blank\" href=\"https:\u002F\u002Fcolab.research.google.com\u002Fgist\u002FBeomi\u002Fa3032e4eaa33b86fdf8de1f47f15a647\u002F2023_05_26_bnb_4bit_koalpaca_v1_1a_on_polyglot_ko_12_8b.ipynb\">\n          \u003Cimg src=\"https:\u002F\u002Fcolab.research.google.com\u002Fassets\u002Fcolab-badge.svg\" 
alt=\"在Colab中打开\"\u002F>\n        \u003C\u002Fa>\n\n- 2023.05.03\n    - 公开了基于[🤗Polyglot-ko 5.8B的KoAlpaca-Polyglot-5.8B-v1.1b fp16模型](https:\u002F\u002Fhuggingface.co\u002Fbeomi\u002FKoAlpaca-Polyglot-5.8B)以及[🤗8bit模型](https:\u002F\u002Fhuggingface.co\u002Fbeomi\u002FKoAlpaca-Polyglot-5.8B\u002Ftree\u002F8bit)。\n- 2023.04.28\n    - Logo更新🎉\n- 2023.04.26\n    - WebUI中应用了[Hugging Face ChatUI](https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fhuggingchat\u002Fchat-ui\u002Ftree\u002Fmain)。\n      - 该ChatKoAlpaca GPU服务器**在“大家研究所K-数字平台”的支持下提供服务。（已于2024年5月结束服务）**\n- 2023.04.20\n    - 公开了[KoAlpaca v1.1数据集](KoAlpaca_v1.1.jsonl)。\n    - 公开了基于[🤗Polyglot-ko 12.8B的KoAlpaca-Polyglot-12.8B-v1.1b fp16模型](https:\u002F\u002Fhuggingface.co\u002Fbeomi\u002FKoAlpaca-Polyglot-12.8B)以及[🤗8bit模型](https:\u002F\u002Fhuggingface.co\u002Fbeomi\u002FKoAlpaca-Polyglot-12.8B\u002Ftree\u002F8bit)。\n    - 公开了[ChatKoAlpaca Web UI(v1.1b)](https:\u002F\u002Fchat.koalpaca.com\u002F)。\n\n- 2023.03.26：公开了基于[🤗LLAMA 30B的KoAlpaca模型](https:\u002F\u002Fhuggingface.co\u002Fbeomi\u002FKoAlpaca-30B-LoRA)。（通过LoRA进行训练）\n  - LLAMA 30B的训练得到了GIST[Sundong Kim](https:\u002F\u002Fsundong.kim)教授提供的A100 GPU支持，非常感谢 😄\n\n- 2023.03.24：公开了基于[🤗LLAMA 13B的KoAlpaca模型](https:\u002F\u002Fhuggingface.co\u002Fbeomi\u002FKoAlpaca-13B-LoRA)。（通过LoRA进行训练）\n\n- 2023.03.23：公开了基于[🤗LLAMA 65B的KoAlpaca模型](https:\u002F\u002Fhuggingface.co\u002Fbeomi\u002FKoAlpaca-65B-LoRA)。（通过LoRA进行训练）\n\n- 2023.03.22：新增了移植到KakaoTalk的[KoAlpaca机器人](http:\u002F\u002Fpf.kakao.com\u002F_wdRxcxj)。\n\n\n---\n\n# KoAlpaca：基于斯坦福Alpaca的韩语Alpaca模型（结合LLAMA和Polyglot-ko）\n\n这是一款能够理解韩语的Alpaca模型，其训练方式与斯坦福Alpaca模型完全相同。\n\n## 立即体验：通过聊天型网页(v1.1b-chat)、KakaoTalk机器人或Telegram机器人与它互动吧！\n\n### ChatKoAlpaca：聊天型KoAlpaca（Polyglot-ko-12.8b-v1.1b）\n\n- 该演示由[모두연 KoAlpaca Lab](https:\u002F\u002Fmodulabs.co.kr\u002Fproduct\u002Flab-11500-2023-05-10-074332\u002F)在“大家研究所K-数字平台”的支持下完成。非常感谢 😍\n- 
截至2024年5月，由于服务器支持终止，服务已停止。\n\n![带有聊天输入框的ChatKoAlpaca网页](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FBeomi_KoAlpaca_readme_c369a0895323.jpeg)\n\n- 无需安装，直接通过网页即可使用。\n- 它会记住之前的对话上下文来生成回复。\n- 最多可生成512个token的响应，同时会记住最多1024个token的上下文（包括新生成的回复）。\n- 演示中使用了两台配备A5000 24G GPU的机器，通过Tensor Parallel进行服务，并以fp16模型提供支持。\n\n\u003C!-- 上述服务所使用的Gradio代码，请参考[webui文件夹](https:\u002F\u002Fgithub.com\u002FBeomi\u002FKoAlpaca\u002Ftree\u002Fmain\u002Fwebui)。 -->\n\n\u003C!--\n\n### KakaoTalk机器人：KoAlpaca-Polyglot-ko-5.8b-v1.1b\n\n- 由于响应时间限制为5秒，因此字数限制为200字。\n- 不会记住对话上下文。\n\n链接：点击下方图片或访问[http:\u002F\u002Fpf.kakao.com\u002F_wdRxcxj](http:\u002F\u002Fpf.kakao.com\u002F_wdRxcxj)\n\n\u003Cp align=\"center\" width=\"100%\">\n\u003Ca href=\"http:\u002F\u002Fpf.kakao.com\u002F_wdRxcxj\">\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FBeomi_KoAlpaca_readme_a870f43ef81a.png\" alt=\"KoAlpaca KakaoTalk示例\" style=\"width: 50%; max-width: 300px; display: block; margin: auto;\">\n\u003C\u002Fa>\n\u003C\u002Fp>\n\n\n### Telegram机器人：可使用韩语5.8B v1.1b模型及LLAMA 7B v1.0模型，不记忆上下文\n\n- 最大输入1024个字符，最大输出512个token。\n- 不会记住对话上下文。\n\n请扫描下方二维码，或访问[https:\u002F\u002Ft.me\u002FKoAlpacaBot](https:\u002F\u002Ft.me\u002FKoAlpacaBot)与之互动！\n\n\u003Cp align=\"center\" width=\"100%\">\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FBeomi_KoAlpaca_readme_7f98a3e26ff4.jpg\" alt=\"KoAlpaca Telegram机器人二维码\" style=\"width: 50%; max-width: 300px; display: block; margin: auto;\">\n\u003C\u002Fp>\n\n您可以随时切换韩语基础模型和LLAMA基础模型来体验不同功能！\n\n\u003Cp align=\"center\" width=\"100%\">\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FBeomi_KoAlpaca_readme_02e8ec58347b.png\" alt=\"KoAlpaca Telegram机器人二维码\" style=\"width: 45%; max-width: 300px; display: inline-block; margin: auto;\">\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FBeomi_KoAlpaca_readme_64788014bf8b.png\" alt=\"KoAlpaca Telegram机器人二维码\" style=\"width: 45%; max-width: 300px; display: 
inline-block; margin: auto;\">\n\u003C\u002Fp>\n\n> 📣 每个模型仅由一台GPU提供服务，因此生成内容可能需要一些时间。为了提高速度，每次请求的最大生成token数量被限制为128个。 \n\n-->\n\n## 完成的模型：韩语基础模型(Polyglot-ko) & 英韩模型(LLAMA)\n\nKoAlpaca 以韩语模型 Polyglot-ko（5.8B）和英韩混合模型 LLAMA 作为骨干模型。\n\n1. **基于 Polyglot-ko 12.8B 的 [全量微调 v1.1]** -> [🤗 beomi\u002FKoAlpaca-Polyglot-12.8B](https:\u002F\u002Fhuggingface.co\u002Fbeomi\u002FKoAlpaca-Polyglot-12.8B)\n  - 该模型曾用于 [ChatKoAlpaca](https:\u002F\u002Fchat.koalpaca.com\u002F)。\n  - 同时也提供了 [🤗 8位量化版本](https:\u002F\u002Fhuggingface.co\u002Fbeomi\u002FKoAlpaca-Polyglot-12.8B\u002Ftree\u002F8bit)。\n2. 基于 Polyglot-ko 5.8B 的 [全量微调] -> [🤗 beomi\u002FKoAlpaca-Polyglot](https:\u002F\u002Fhuggingface.co\u002Fbeomi\u002FKoAlpaca-Polyglot)\n3. 基于 Meta LLAMA 7B 的 [全量微调] -> [🤗 beomi\u002FKoAlpaca](https:\u002F\u002Fhuggingface.co\u002Fbeomi\u002FKoAlpaca)\n4. 基于 Meta LLAMA 13B 的 [LoRA 微调] -> [🤗 KoAlpaca-13B-LoRA](https:\u002F\u002Fhuggingface.co\u002Fbeomi\u002FKoAlpaca-13B-LoRA)\n5. 基于 Meta LLAMA 30B 的 [LoRA 微调] -> [🤗 KoAlpaca-30B-LoRA](https:\u002F\u002Fhuggingface.co\u002Fbeomi\u002FKoAlpaca-30B-LoRA)\n6. 
基于 Meta LLAMA 65B 的 [LoRA 微调] -> [🤗 KoAlpaca-65B-LoRA](https:\u002F\u002Fhuggingface.co\u002Fbeomi\u002FKoAlpaca-65B-LoRA)\n\n> 注：对于 LLAMA 13B，@deep-diver 训练的 [🤗chansung\u002Fkoalpaca-lora-13b](https:\u002F\u002Fhuggingface.co\u002Fchansung\u002Fkoalpaca-lora-13b) 也已公开。\n\nMeta 的 LLAMA 模型并未充分学习韩语数据集，因此在实际推理时，其韩语性能较低。\n\n为了获得更好的韩语质量，我们还使用了 Polyglot-ko 5.8B 和 12.8B 模型作为骨干，进行了进一步训练。\n\n随后，结合 v1.1 数据集，我们发布了能够理解先前对话上下文及用户额外输入上下文的 KoAlpaca v1.1 模型。\n\n\n## KoAlpaca 模型运行示例代码\n\n### 使用 Hugging Face Pipeline 运行\n\n\u003Ca target=\"_blank\" href=\"https:\u002F\u002Fcolab.research.google.com\u002Fdrive\u002F1a5xVG4VSh5TSRfSVchyCqk9onadhIxTs?usp=sharing\">\n  \u003Cimg src=\"https:\u002F\u002Fcolab.research.google.com\u002Fassets\u002Fcolab-badge.svg\" alt=\"在 Colab 中打开\"\u002F>\n\u003C\u002Fa>\n\n- 安装最新版 PyTorch 和 Hugging Face 库\n\n```bash\npip install -U torch transformers tokenizers accelerate safetensors\n```\n\n可以使用以下示例代码进行运行：\n\n```python\nimport torch\nfrom transformers import pipeline, AutoModelForCausalLM\n\nMODEL = 'beomi\u002FKoAlpaca-Polyglot-5.8B'\n\nmodel = AutoModelForCausalLM.from_pretrained(\n    MODEL,\n    torch_dtype=torch.float16,\n    low_cpu_mem_usage=True,\n).to(device=\"cuda\", non_blocking=True)\nmodel.eval()\n\npipe = pipeline(\n    'text-generation', \n    model=model,\n    tokenizer=MODEL,\n    device=0\n)\n\ndef ask(x, context='', is_input_full=False):\n    # 提示词标记必须保持韩语原文（질문 = 问题，맥락 = 上下文，답변 = 回答），与模型训练时的格式一致\n    ans = pipe(\n        f\"### 질문: {x}\\n\\n### 맥락: {context}\\n\\n### 답변:\" if context else f\"### 질문: {x}\\n\\n### 답변:\", \n        do_sample=True, \n        max_new_tokens=512,\n        temperature=0.7,\n        top_p=0.9,\n        return_full_text=False,\n        eos_token_id=2,\n    )\n    print(ans[0]['generated_text'])\n\nask(\"딥러닝이 뭐야?\")  # “深度学习是什么？”\n# 输出示例（模型以韩语作答，以下为中文译文）：深度学习是机器学习的一个分支，它通过人工神经网络来学习输入与输出之间的复杂关系。这项技术使计算机能够以类似于人类学习的方式识别模式，并且无需人工干预即可处理数据。近年来，利用深度学习的人工智能应用得到了广泛发展。例如，在医疗诊断领域，深度学习技术被用来分析患者的特征，从而实现快速而准确的诊断。此外，在金融领域，深度学习也被用于训练股票价格预测模型。\n```\n\n### Gradio 示例 - 全量微调 & 8位量化 🌟 推荐 🌟\n\n请参考本仓库的 [webui 
文件夹](https:\u002F\u002Fgithub.com\u002FBeomi\u002FKoAlpaca\u002Ftree\u002Fmain\u002Fwebui)。\n\n### Gradio 示例 - LoRA\n\n请参考 @deep-diver 的 [deep-diver\u002FAlpaca-LoRA-Serve](https:\u002F\u002Fgithub.com\u002Fdeep-diver\u002FAlpaca-LoRA-Serve) 仓库并运行。\n\n```bash\n# 按照上述仓库的指南安装后..\nBASE_URL=decapoda-research\u002Fllama-13b-hf  # 30B模型为 llama-30b-hf，65B模型为 llama-65b-hf 注意：65B模型默认占用内存超过100GB。\nFINETUNED_CKPT_URL=beomi\u002FKoAlpaca-13B-LoRA  # 30B模型为 KoAlpaca-30B-LoRA，65B模型为 KoAlpaca-65B-LoRA 注意：65B模型仅能在配备 A100 80G 等超大显存的 GPU 上运行。\n\npython app.py --base_url $BASE_URL --ft_ckpt_url $FINETUNED_CKPT_URL --port 6006\n```\n\n## 数据集 v1.1 制作方法\n\nv1.1 数据集旨在改善原有 Alpaca 模型回答简短、无法理解上下文的问题而制作。\n\n1. 数据集选择与收集\n\nv1.1 数据集通过爬取 Naver 知识人内容生成。\n\n我们收集了 [Naver 知识人精华区](https:\u002F\u002Fkin.naver.com\u002Fbest\u002Flistaha.naver) 的所有问题。\n\n2. 数据加工\n\n实际生成的数据基于问题及其正文，以及被采纳的答案正文重新构建。\n\n数据通过 ChatGPT 生成，并采用批量解码方式完成。\n\n> 注：原有的知识人问题和答案被用作 ChatGPT 的参考种子。\n\n## 数据集 v1.0 制作方法\n\n数据集基本上以 [斯坦福 Alpaca 提供的 5.2 万条数据集](.\u002Fen_alpaca_data.json) 为基础。\n\n1. 数据集翻译\n\nAlpaca 数据集由 Instruct 部分、Input 部分以及 Output 部分组成，如下所示：\n\n```json\n...\n{\n    \"instruction\": \"描述一次你必须做出艰难决定的经历。\",\n    \"input\": \"\",\n    \"output\": \"我在一家建筑公司担任项目经理时，曾面临过一次艰难的抉择。当时我负责的一个项目必须在特定日期前完工，才能满足客户期望。然而由于意外延误，我们无法按时完成任务，因此不得不做出一个艰难的决定。最终我决定延长工期，但这意味着要进一步调动团队资源并增加预算。尽管这一决定风险很大，但我还是决定执行，以确保项目能够如期完成，并满足客户的期望。最后，项目顺利竣工，这也证明了我的领导能力和决策水平。\"\n},\n{\n    \"instruction\": \"找出不同的一项。\",\n    \"input\": \"推特、Instagram、Telegram\",\n    \"output\": \"Telegram\"\n},\n...\n```\n\n其中 Output 部分是由 OpenAI 的 `text-davinci-003` 模型生成的，因此这部分没有进行翻译。\n\n所以，我们只翻译了 Instruct 和 Input 两列，并剔除了 Input 部分存在错误的一些条目。\n\n为了进行翻译，我们使用了 DeepL API 服务，翻译后的数据格式如下：\n\n```json\n...\n{\n    \"instruction\": \"找出不同的一项。\",\n    \"input\": \"推特、Instagram、Telegram\"\n}\n...\n```\n\n2. 
Output 数据生成\n\n随后，我们将这些 Instruct 和 Input 通过批量解码技术，利用 OpenAI ChatGPT API（`gpt-3.5-turbo`）生成数据。\n\n与 Alpaca 研究相同，我们对每一对 Instruct\u002FInput 组合仅生成一个回复，从而构建数据集。\n\n在原有研究中，每次会同时生成 20 条数据，但我们在用韩语生成时发现，每次同时生成 10 条数据时效果更稳定，因此我们改为每次生成 10 条。\n\n以下是用于生成回答的提示词：\n\n```python\nPROMPT = \"\"\"\\\n请为各种任务生成回答。这些任务指令将提供给 ChatGPT 模型，ChatGPT 模型将根据指令完成任务并进行评估。\n\n要求如下：\n1. 为最大化多样性，请确保每个指令中的动词不重复。\n2. 指令所使用的语言也应多样化，例如可以结合提问和命令式指示。\n3. 指令类型应多样化，包括开放式生成、分类、编辑等多种类型的任务。\n4. GPT 语言模型应能够完成这些指令。例如，不要要求助手生成视觉或音频输出；也不要请求助手设置下午 5 点的闹钟或提醒，因为助手无法执行此类操作。\n5. 回答必须用韩语撰写。\n6. 回答长度控制在 1–2 句话以内，可以是陈述句或疑问句。\n7. 必须为指令生成合适的输入内容。输入字段应包含针对指令的具体示例，需使用真实数据，不得仅使用占位符。输入内容应具有一定的复杂性，使指令更具挑战性，但理想情况下不超过 100 字。\n8. 部分指令不需要额外输入，而另一些指令则需要提供输入内容。例如，“世界上最高的山峰是什么？”这类一般性问题无需提供具体背景信息，因此输入字段可以留空。\n9. 输出必须是对指令及输入的恰当回应。\n\n请根据以下 10 条指令及其可选输入，生成相应的回答。回答应按照编号从 0 到 9 的顺序排列，每个编号对应一条指令和其对应的输入内容。请用 ### 分隔每条回答。\n\n回答0：第一条回答内容###\n回答1：第二条回答内容###\n...\n回答9：最后一条回答内容\"\"\"\n```\n\n此外，我们还添加了如下 ChatGPT API 的系统提示词：\n\n```python\n\"content\": \"以下是一些任务指令。如果没有输入，请忽略输入部分，直接根据指令给出适当的回答。如果有额外输入，则需结合指令和输入内容给出恰当的回答。\"\n```\n\n最终生成的韩语数据集保存在 `ko_alpaca_data.json` 文件中。\n\n## 微调方式的模型训练方法\n\n### QLoRA 4bit 训练方法 Google Colab 示例\n\n- 使用 QLoRA 对 Polyglot-ko 12.8B 模型进行 KoAlpaca v1.1a 微调，采用 4bit 精度\n\n\u003Ca style='display:inline' target=\"_blank\" href=\"https:\u002F\u002Fcolab.research.google.com\u002Fgist\u002FBeomi\u002Ff163a6c04a869d18ee1a025b6d33e6d8\u002F2023_05_26_bnb_4bit_koalpaca_v1_1a_on_polyglot_ko_12_8b.ipynb\">\n  \u003Cimg src=\"https:\u002F\u002Fcolab.research.google.com\u002Fassets\u002Fcolab-badge.svg\" alt=\"在 Colab 中打开\"\u002F>\n\u003C\u002Fa>\n\n- 仅用 1000 个样本训练的模型生成结果示例\n\n```\n### 问题：看跌期权和看涨期权有什么区别？普通散户投资者应该选择哪种仓位？\n\n### 回答：看跌期权是指买入股票的行为，而看涨期权则是指卖出股票的行为。举例来说，如果某公司的股价为 10 万元，且该公司正在开展新业务，预计未来收入会增加，企业价值也将随之提升，那么买入该公司的股票就是看跌期权。不过，在买入股票时，如果公司股价下跌，你的投资也会面临损失的风险。相反，看涨期权是指卖出股票。当公司股价上涨、前景看好时，你可以获得更高的收益，这是一种相对稳健的投资方式。对于普通散户投资者而言，在买卖股票时会面临多种风险，因此建议在专业人士的指导下，结合自身的投资经验制定合适的投资策略。有关投资的更多详细信息，您可以通过下方链接免费获取。\"\n```\n\n```\n### 
问题：什么是保证金催缴？如何避免发生保证金催缴？\n\n### 回答：保证金催缴是指在期货\u002F期权交易中，由于保证金不足，交易所要求追加缴纳保证金的情况。追加保证金是在初始约定保证金的基础上，因市场波动而需要额外支付的保证金。通常情况下，买入合约时需缴纳合约金额的 10% 作为保证金，而卖出合约则需缴纳 3%。因此，买入合约总共需要 20% 的保证金，卖出合约则需要 9% 的保证金。在期货交易过程中，每日根据市场价格变化，可能需要随时追加保证金，一旦保证金不足，就会触发保证金催缴。所以在进行期货交易之前，务必确保有足够的保证金。可以通过“期货\u002F期权 → 保证金 → 追加缴纳\u002F平仓”等方式查询保证金情况，也可以通过网络或电话查询。此外，还可以通过“订单\u002F保证金 → 追加保证金”来补充保证金。补充保证金\n```\n\n### ChatKoAlpaca（KoAlpaca Polyglot-ko-12.8B-v1.1b）模型训练\n\nChatKoAlpaca（KoAlpaca Polyglot-ko-12.8B-v1.1b）模型是在 4 台 A100 80GB 显卡上进行训练的，并通过 DeepSpeed ZeRO-3 实现全分片训练。\n\n相关代码请参考当前仓库的 `train_1.1b` 文件夹。\n\n### Polyglot-ko 5.8B 模型训练\n\n该模型是在一台 A100 80GB 显卡上进行训练的。\n\n### LLAMA 7B 模型训练\n\n![KoAlpaca LLAMA 训练损失曲线](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FBeomi_KoAlpaca_readme_f1b9764df0a5.jpeg)\n\n基于 LLAMA 的模型训练是在 4 台 A100 80GB 显卡上进行的，并采用了 FSDP 进行训练。\n\n用于训练的代码保存在 `train.py` 文件中，用于运行训练的脚本保存在 `train.sh` 文件中。\n\n#### 使用 FSDP 训练模型时的注意事项\n\nLLAMA 模型已包含在 Huggingface Transformers 4.28.0 或更高版本中。如果使用低于该版本的库，则会出现找不到 `llama` 的错误。在使用 FSDP 时，需要包裹的层是 `LlamaDecoderLayer`。\n\n```bash\n# train.sh\n--fsdp_transformer_layer_cls_to_wrap 'LlamaDecoderLayer'\n```\n\n## LoRA 方式模型训练方法\n\n### LLAMA 13B 模型训练\n\n🤗 Huggingface 仓库：[https:\u002F\u002Fhuggingface.co\u002Fbeomi\u002FKoAlpaca-13B-LoRA](https:\u002F\u002Fhuggingface.co\u002Fbeomi\u002FKoAlpaca-13B-LoRA)\n\n该模型是在 4 张 RTX 3090 24GB 显卡上训练的。训练代码基于 https:\u002F\u002Fgithub.com\u002Ftloen\u002Falpaca-lora 开发。\n\n训练数据集由该仓库的 `alpaca_data_cleaned.json`（英文版）和当前仓库的 `ko_alpaca_data.json` 合并而成。\n\n### LLAMA 30B 模型训练\n\n🤗 Huggingface 仓库：[https:\u002F\u002Fhuggingface.co\u002Fbeomi\u002FKoAlpaca-30B-LoRA](https:\u002F\u002Fhuggingface.co\u002Fbeomi\u002FKoAlpaca-30B-LoRA)\n\n该模型是在 2 台 A100 40GB 显卡上训练的。训练代码同样基于 https:\u002F\u002Fgithub.com\u002Ftloen\u002Falpaca-lora 开发。\n\n- LLAMA 30B 的训练得到了 Sundong Kim 教授的 A100 设备支持。\n\n训练数据集由该仓库的 `alpaca_data_cleaned.json`（英文版）和当前仓库的 `ko_alpaca_data.json` 合并而成。\n\n### LLAMA 65B 模型训练\n\n🤗 Huggingface 
仓库：[https:\u002F\u002Fhuggingface.co\u002Fbeomi\u002FKoAlpaca-65B-LoRA](https:\u002F\u002Fhuggingface.co\u002Fbeomi\u002FKoAlpaca-65B-LoRA)\n\n该模型是在 8 台 A100 80GB 显卡上训练的。训练代码同样基于 https:\u002F\u002Fgithub.com\u002Ftloen\u002Falpaca-lora 开发。\n\n训练数据集由该仓库的 `alpaca_data_cleaned.json`（英文版）和当前仓库的 `ko_alpaca_data.json` 合并而成。\n\n\n# 基准测试\n\n## NSMC 基准测试\n\n\u003Ca target=\"_blank\" href=\"https:\u002F\u002Fcolab.research.google.com\u002Fgithub\u002FBeomi\u002FKoAlpaca\u002Fblob\u002Fmaster\u002Fnsmc_polyglot_and_koalpaca_by_few_shot.ipynb\">\n  \u003Cimg src=\"https:\u002F\u002Fcolab.research.google.com\u002Fassets\u002Fcolab-badge.svg\" alt=\"在 Colab 中打开\"\u002F>\n\u003C\u002Fa>\n\n### 基准测试方法\n\n- 对 `polyglot-ko 5.8b` 和 `KoAlpaca-polyglot` 进行基准性能对比\n- 采用少样本学习方式，分为两种提示类型进行测试\n\n#### 1. 少样本学习配置\n\n- 少样本配置\n  - 设置 `k=10`\n  - 仅包含文本标记长度不超过25的句子\n  - 只选取数据标记长度约75百分位的数据\n\n- NSMC 数据标记长度分布\n\n```\n少样本案例标记平均长度：20.229\n少样本案例标记最大长度：280\n少样本案例标记长度标准差：16.488\n少样本案例标记长度80百分位数：27.0\n```\n\n#### 2. 提示词构成\n\n- 提示词类型1：非常简单的提示词构成，未明确说明分析任务内容\n```python\ndef build_prompt_text(sent):\n    return \"句子: \" + sent + '\\n情感:'\n```\n\n- 提示词类型2：相比提示词类型1，包含了任务意图内容，并以疑问句形式表达分析任务\n```python\ndef build_prompt_text(sent):\n    return '以下句子是正面还是负面？\\n' + sent + '\\n答案:'\n```\n\n#### 3. 
基准测试结果\n\n准确率 | 提示词类型1 | 提示词类型2\n-- | -- | --\npolyglot-ko | 0.5752 | 0.7223\n**koalpaca-polyglot** | **0.6939** | **0.7683**\n\n---\n\n## 斯坦福 Alpaca：一个遵循指令的 LLaMA 模型 \n[![许可证](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLicense-Apache_2.0-green.svg)](https:\u002F\u002Fgithub.com\u002Ftatsu-lab\u002Fstanford_alpaca\u002Fblob\u002Fmain\u002FLICENSE) \n[![Python 3.9+](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fpython-3.9+-blue.svg)](https:\u002F\u002Fwww.python.org\u002Fdownloads\u002Frelease\u002Fpython-390\u002F) \n[![代码风格：black](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fcode%20style-black-000000.svg)](https:\u002F\u002Fgithub.com\u002Fpsf\u002Fblack) \n\n这是斯坦福 Alpaca 项目的仓库，旨在构建并分享一个遵循指令的 LLaMA 模型。该仓库包含：\n- 一个 [**网页演示**](https:\u002F\u002Fcrfm.stanford.edu\u002Falpaca\u002F)，用于与我们的 Alpaca 模型互动\n- 用于微调模型的 [52K 数据](#data-release)\n- 用于 [生成数据](#data-generation-process) 的代码\n- 用于 [微调模型](#fine-tuning) 的代码\n\n## 概述\n\n当前的 Alpaca 模型是在 7B LLaMA 模型 [1] 的基础上，使用来自 Self-Instruct [2] 论文技术生成的 52K 条遵循指令的数据进行微调的，其中我们对这些技术做了一些修改，具体将在下一节中讨论。\n在初步的人工评估中，我们发现 Alpaca 7B 模型在 Self-Instruct 遵循指令的评估套件 [2] 上的表现与 `text-davinci-003` 模型相似。\n\nAlpaca 仍处于开发阶段，存在许多需要解决的局限性。\n重要的是，我们尚未对 Alpaca 模型进行安全无害的微调。\n因此，我们鼓励用户在与 Alpaca 互动时保持谨慎，并报告任何令人担忧的行为，以帮助提高模型的安全性和伦理考量。\n\n我们最初的发布内容包括数据生成流程、数据集和训练配方。如果获得 LLaMA 创作者的许可，我们计划发布模型权重。目前，我们选择托管一个在线演示，以帮助读者更好地理解 Alpaca 的能力和局限性，同时也为我们提供了一个在更广泛受众中评估 Alpaca 性能的方式。\n\n**请阅读我们的发布 [博客文章](https:\u002F\u002Fcrfm.stanford.edu\u002F2023\u002F03\u002F13\u002Falpaca.html) ，了解更多关于模型的信息、我们对 Alpaca 模型潜在危害和限制的讨论，以及我们发布可复现模型的思考过程。**\n\n\n[1]: LLaMA：开放且高效的底层语言模型。Hugo Touvron、Thibaut Lavril、Gautier Izacard、Xavier Martinet、Marie-Anne Lachaux、Timothée Lacroix、Baptiste Rozière、Naman Goyal、Eric Hambro、Faisal Azhar、Aurelien Rodriguez、Armand Joulin、Edouard Grave、Guillaume Lample。https:\u002F\u002Farxiv.org\u002Fabs\u002F2302.13971v1\n\n[2]: Self-Instruct：通过自我生成的指令对齐语言模型。Yizhong Wang、Yeganeh Kordi、Swaroop Mishra、Alisa Liu、Noah A. 
Smith、Daniel Khashabi、Hannaneh Hajishirzi。https:\u002F\u002Farxiv.org\u002Fabs\u002F2212.10560\n\n\n## 数据发布\n[`alpaca_data.json`](.\u002Falpaca_data.json) 包含我们用于微调 Alpaca 模型的 52K 条遵循指令的数据。\n此 JSON 文件是一个字典列表，每个字典包含以下字段：\n- `instruction`：`str`，描述模型应执行的任务。这 52K 条指令均不重复。\n- `input`：`str`，任务的可选上下文或输入。例如，当指令为“总结以下文章”时，输入就是这篇文章。大约 40% 的例子都有输入。\n- `output`：`str`，由 `text-davinci-003` 生成的指令回答。\n\n我们使用以下提示语来微调 Alpaca 模型：\n- 对于 `input` 字段非空的例子：\n ```\n 下面是一条描述任务的指令，配有一个提供更多上下文的输入。请撰写一条恰当完成请求的回复。\n \n ### 指令：\n {instruction}\n \n ### 输入：\n {input}\n \n ### 回复：\n ```\n- 对于 `input` 字段为空的例子：\n ```\n 下面是一条描述任务的指令。请撰写一条恰当完成请求的回复。\n \n ### 指令：\n {instruction}\n \n ### 回复：\n ```\n \n 在推理过程中（例如在网页演示中），我们使用用户提供的指令，且 `input` 字段为空（第二种选项）。\n\n## 数据生成流程\n\n\u003Cdetails>\n\u003Csummary> \u003Cstrong> 运行代码 \u003C\u002Fstrong> \u003C\u002Fsummary>\n\n1. 将环境变量 `OPENAI_API_KEY` 设置为您的 OpenAI API 密钥。\n2. 使用 `pip install -r requirements.txt` 安装依赖项。\n3. 运行 `python -m generate_instruction generate_instruction_following_data` 以生成数据。\n\n\u003C\u002Fdetails>\n\n我们基于 [self-instruct](https:\u002F\u002Fgithub.com\u002Fyizhongw\u002Fself-instruct) 的数据生成流程进行了改进，具体如下：\n- 我们使用 `text-davinci-003` 生成指令数据，而不是 `davinci`。\n- 我们编写了一个新的提示语文件 (`prompt.txt`)，明确要求 `text-davinci-003` 生成指令。请注意：我们使用的提示语中存在一个小错误，未来用户应参考 https:\u002F\u002Fgithub.com\u002Ftatsu-lab\u002Fstanford_alpaca\u002Fpull\u002F24 中的修正。\n- 我们采用了更为激进的批量解码方式，即一次生成 20 条指令，从而大大降低了数据生成成本。\n- 我们简化了数据生成流程，不再区分分类和非分类指令。\n- 我们只针对每条指令生成一个实例，而不是像 [1] 中那样生成 2 到 3 个实例。\n\n这样，我们就以更低的成本（不到 500 美元）得到了包含 52K 个样例的遵循指令数据集。初步研究还表明，我们生成的 52K 条数据比 [self-instruct](https:\u002F\u002Fgithub.com\u002Fyizhongw\u002Fself-instruct\u002Fblob\u002Fmain\u002Fdata\u002Fseed_tasks.jsonl) 发布的数据更加多样化。\n我们绘制了下图（风格类似于 [self-instruct 论文](https:\u002F\u002Farxiv.org\u002Fabs\u002F2212.10560) 中的图 2），以展示我们数据的多样性。图表中的内圈代表指令的动词根，外圈则代表直接宾语。\n\n[\u002F\u002F]: # (![parse_analysis]（assert\u002Fparse_analysis.png | width=100））\n[\u003Cimg 
src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FBeomi_KoAlpaca_readme_6c012dea633c.png\" width=\"750\" \u002F>](.\u002Fhttps:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FBeomi_KoAlpaca_readme_6c012dea633c.png)\n\n## 微调\n我们使用标准的 Hugging Face 训练代码，并采用以下超参数对模型进行微调：\n\n| 超参数         | 值     |\n|----------------|--------|\n| 批量大小       | 128    |\n| 学习率         | 2e-5   |\n| Epoch 数       | 3      |\n| 最大长度       | 512    |\n| 权重衰减       | 0      |\n\n由于 Hugging Face 尚未正式支持 LLaMA 模型，我们通过从特定分支安装 Hugging Face 的 transformers 库来对 LLaMA 进行微调（即等待此 [PR](https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Ftransformers\u002Fpull\u002F21955) 合并）。\n\n我们安装的具体提交的哈希值为 `68d640f7c368bcaaaecfc678f11908ebbd3d6176`。\n\n要复现我们的 LLaMA 微调实验，首先安装依赖：\n```bash\npip install -r requirements.txt\n```\n然后安装 Hugging Face 的特定分支 transformers 库。\n\n以下命令将在配备 4 张 A100 80G 显卡的机器上，以 FSDP `full_shard` 模式对 LLaMA-7B 进行微调。我们使用 **Python 3.10** 和以下命令成功复现了与我们在演示中使用的模型质量相近的模型。请将 `\u003Cyour_random_port>` 替换为你选择的端口，将 `\u003Cyour_path_to_hf_converted_llama_ckpt_and_tokenizer>` 替换为你转换后的检查点和分词器的路径（按照 PR 中的说明操作），并将 `\u003Cyour_output_dir>` 替换为你希望存储输出的目录。\n\n```bash\ntorchrun --nproc_per_node=4 --master_port=\u003Cyour_random_port> train.py \\\n    --model_name_or_path \u003Cyour_path_to_hf_converted_llama_ckpt_and_tokenizer> \\\n    --data_path .\u002Falpaca_data.json \\\n    --bf16 True \\\n    --output_dir \u003Cyour_output_dir> \\\n    --num_train_epochs 3 \\\n    --per_device_train_batch_size 4 \\\n    --per_device_eval_batch_size 4 \\\n    --gradient_accumulation_steps 8 \\\n    --evaluation_strategy \"no\" \\\n    --save_strategy \"steps\" \\\n    --save_steps 2000 \\\n    --save_total_limit 1 \\\n    --learning_rate 2e-5 \\\n    --weight_decay 0. 
\\\n    --warmup_ratio 0.03 \\\n    --lr_scheduler_type \"cosine\" \\\n    --logging_steps 1 \\\n    --fsdp \"full_shard auto_wrap\" \\\n    --fsdp_transformer_layer_cls_to_wrap 'LLaMADecoderLayer' \\\n    --tf32 True\n```\n\n### 注意事项\n`fsdp_transformer_layer_cls_to_wrap` 必须设置为具体的解码器层名称。目前 Hugging Face 的 LLaMA PR 尚不稳定。早期的提交使用 `LLaMADecoderLayer` 作为解码器层名称（这也是我们代码所基于的提交）。而较新的提交则使用 `LlamaDecoderLayer`（注意大小写差异）。如果未将 `fsdp_transformer_layer_cls_to_wrap` 设置为正确的名称，训练速度将会大幅下降。\n\n### 补充说明\n\n同样的脚本也适用于 OPT 模型的微调。以下是微调 OPT-6.7B 的示例：\n\n```bash\ntorchrun --nproc_per_node=4 --master_port=\u003Cyour_random_port> train.py \\\n    --model_name_or_path \"facebook\u002Fopt-6.7b\" \\\n    --data_path .\u002Falpaca_data.json \\\n    --bf16 True \\\n    --output_dir \u003Cyour_output_dir> \\\n    --num_train_epochs 3 \\\n    --per_device_train_batch_size 4 \\\n    --per_device_eval_batch_size 4 \\\n    --gradient_accumulation_steps 8 \\\n    --evaluation_strategy \"no\" \\\n    --save_strategy \"steps\" \\\n    --save_steps 2000 \\\n    --save_total_limit 1 \\\n    --learning_rate 2e-5 \\\n    --weight_decay 0. \\\n    --warmup_ratio 0.03 \\\n    --lr_scheduler_type \"cosine\" \\\n    --logging_steps 1 \\\n    --fsdp \"full_shard auto_wrap\" \\\n    --fsdp_transformer_layer_cls_to_wrap 'OPTDecoderLayer' \\\n    --tf32 True\n```\n\n请注意，提供的训练脚本旨在简单易用，并未经过特别优化。若要在更多 GPU 上运行，建议适当降低 `gradient_accumulation_steps` 以保持全局批量大小为 128。全局批量大小尚未经过最优性测试。\n\n### 作者\n以下所有研究生贡献均等，顺序由随机抽取决定：\n\n- [Rohan Taori](https:\u002F\u002Fwww.rohantaori.com\u002F)\n- [Ishaan Gulrajani](https:\u002F\u002Fishaan.io\u002F)\n- [Tianyi Zhang](https:\u002F\u002Ftiiiger.github.io\u002F)\n- [Yann Dubois](https:\u002F\u002Fyanndubs.github.io\u002F)\n- [Xuechen Li](https:\u002F\u002Fwww.lxuechen.com\u002F)\n\n以上人员均由 [Tatsunori B. 
Hashimoto](https:\u002F\u002Fthashim.github.io\u002F) 指导。其中，Yann 还受到 [Percy Liang](https:\u002F\u002Fcs.stanford.edu\u002F~pliang\u002F) 的指导，Xuechen 则受到 [Carlos Guestrin](https:\u002F\u002Fguestrin.su.domains\u002F) 的指导。\n\n### 引用\n如果你使用了本仓库中的数据或代码，请引用该仓库：\n```\n@misc{alpaca,\n  author = {Rohan Taori and Ishaan Gulrajani and Tianyi Zhang and Yann Dubois and Xuechen Li and Carlos Guestrin and Percy Liang and Tatsunori B. Hashimoto },\n  title = {Stanford Alpaca: An Instruction-following LLaMA model},\n  year = {2023},\n  publisher = {GitHub},\n  journal = {GitHub repository},\n  howpublished = {\\url{https:\u002F\u002Fgithub.com\u002Ftatsu-lab\u002Fstanford_alpaca}},\n}\n```\n\n此外，你也应同时引用原始的 LLaMA 论文 [1] 和 Self-Instruct 论文 [2]。\n\n### 致谢\n\n我们感谢 Yizhong Wang 在解释 Self-Instruct 数据生成流程以及提供解析分析图代码方面所提供的帮助。我们也感谢 Yifan Mai 的支持，以及斯坦福 NLP 小组和基础模型研究中心 (CRFM) 成员们提出的宝贵意见。","# KoAlpaca 快速上手指南\n\nKoAlpaca 是一个基于 Stanford Alpaca 训练方法构建的韩语大语言模型，支持 Polyglot-ko（纯韩语）和 LLaMA（英韩双语）作为基座模型。本指南将帮助您快速在本地运行该模型。\n\n## 环境准备\n\n### 系统要求\n- **操作系统**: Linux (推荐) 或 macOS\n- **GPU**: NVIDIA GPU (推荐显存 16GB 以上以运行 5.8B\u002F7B 模型，更大模型需多卡或量化)\n- **Python**: 3.8 或更高版本\n\n### 前置依赖\n请确保已安装 CUDA 驱动及对应的 PyTorch 版本。\n\n## 安装步骤\n\n### 1. 
安装核心库\n使用 pip 安装最新的 PyTorch、Transformers 及相关加速库：\n\n```bash\npip install -U torch transformers tokenizers accelerate safetensors\n```\n\n> **提示**: 如果在中国大陆地区下载较慢，建议使用国内镜像源：\n> ```bash\n> pip install -U torch transformers tokenizers accelerate safetensors -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple\n> ```\n\n## 基本使用\n\n以下是最简单的使用示例，通过 Hugging Face Pipeline 加载 **KoAlpaca-Polyglot-5.8B** 模型并进行对话。\n\n### Python 代码示例\n\n创建一个名为 `run_koalpaca.py` 的文件，填入以下代码：\n\n```python\nimport torch\nfrom transformers import pipeline, AutoModelForCausalLM\n\n# 模型名称 (也可替换为 beomi\u002FKoAlpaca-Polyglot-12.8B 或其他版本)\nMODEL = 'beomi\u002FKoAlpaca-Polyglot-5.8B'\n\n# 加载模型\nmodel = AutoModelForCausalLM.from_pretrained(\n    MODEL,\n    torch_dtype=torch.float16,  # 使用半精度节省显存\n    low_cpu_mem_usage=True,\n).to(device=f\"cuda\", non_blocking=True)\nmodel.eval()\n\n# 创建管道\npipe = pipeline(\n    'text-generation', \n    model=model,\n    tokenizer=MODEL,\n    device=0\n)\n\ndef ask(question, context=''):\n    \"\"\"\n    构建 Prompt 并生成回答\n    格式：### 질문: {问题}\\n\\n### 맥락: {上下文}\\n\\n### 답변:\n    \"\"\"\n    if context:\n        prompt = f\"### 질문: {question}\\n\\n### 맥락: {context}\\n\\n### 답변:\"\n    else:\n        prompt = f\"### 질문: {question}\\n\\n### 답변:\"\n        \n    ans = pipe(\n        prompt, \n        do_sample=True, \n        max_new_tokens=512,\n        temperature=0.7,\n        top_p=0.9,\n        return_full_text=False, # 只返回生成的回答部分\n        eos_token_id=2,\n    )\n    print(ans[0]['generated_text'])\n\n# 测试运行\nask(\"딥러닝이 뭐야?\") \n# 输出示例：딥러닝은 인공신경망을 통해 입력과 출력 사이의 복잡한 관계를 학습하는 머신러닝의 한 분야입니다...\n```\n\n### 运行脚本\n\n```bash\npython run_koalpaca.py\n```\n\n### 进阶选项说明\n- **显存优化**: 如果显存不足，可尝试加载 [8bit 量化版本](https:\u002F\u002Fhuggingface.co\u002Fbeomi\u002FKoAlpaca-Polyglot-12.8B\u002Ftree\u002F8bit) 或使用 `bitsandbytes` 库进行 4bit 量化加载。\n- **LoRA 模型**: 如果使用 LLaMA 基座的 LoRA 微调模型（如 `KoAlpaca-13B-LoRA`），需要配合基座模型（如 `decapoda-research\u002Fllama-13b-hf`）一起加载，推荐使用专门的推理仓库（如 
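`Alpaca-LoRA-Serve`).","A development team at a Korean startup is building an intelligent customer-service system for local users and needs a model that understands Korean precisely and generates natural Korean replies.\n\n### Without KoAlpaca\n- **Large comprehension gaps**: Calling a general-purpose English LLM, or a model not fine-tuned on Korean, leads to misreading colloquial Korean instructions and the complex honorific system; replies come out stiff and read like machine translation.\n- **High barrier to local deployment**: Better quality would require training a very large model, but the team has no cluster of high-end GPUs (such as A100s) and cannot afford the compute and time of full fine-tuning.\n- **Long custom development cycle**: With no ready-made Korean instruction-following dataset or adapted code, collecting and cleaning data and building a training pipeline from scratch takes weeks and badly delays launch.\n- **High inference cost**: Running an unquantized full-precision model on a resource-constrained server uses too much GPU memory, so concurrency is very low and real traffic is hard to sustain.\n\n### With KoAlpaca\n- **Natural Korean interaction**: KoAlpaca is fine-tuned for Korean on Polyglot-ko and LLaMA, so it recognizes Korean honorific levels and cultural context and produces idiomatic, human-sounding replies.\n- **Efficient low-resource training**: Using the QLoRA and Tensor Parallel example code that KoAlpaca provides, the team can adapt the 12.8B model on a single consumer GPU (such as an RTX 3090) or on a free Colab T4.\n- **Ready-to-use ecosystem**: Reusing the officially released v1.1 dataset and pretrained weights, together with Hugging Face ChatUI for a quick demo environment, cuts the development cycle from weeks to days.\n- **Flexible, cheaper deployment**: Loading a 4-bit or 8-bit quantized KoAlpaca model significantly reduces GPU memory requirements, making high-concurrency inference on low-cost servers feasible.\n\nBy providing open-source models optimized for Korean and low-resource training recipes, KoAlpaca lets small and mid-sized teams build AI applications with native-level Korean interaction at very low cost.","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FBeomi_KoAlpaca_c369a089.jpg","Beomi","Junbum Lee","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002FBeomi_f9017aec.png","Researcher @lablup , AI\u002FML GDE @ml-gde.\r\nKorean AI\u002FNLP Researcher and creator of multiple Korean PLMs. Focused on advancing Open LLMs.","Lablup Inc.","Seoul, South Korea","jun@beomi.net","__Beomi__","https:\u002F\u002Fjunbuml.ee","https:\u002F\u002Fgithub.com\u002FBeomi",[86,90,94],{"name":87,"color":88,"percentage":89},"Jupyter Notebook","#DA5B0B",87.2,{"name":91,"color":92,"percentage":93},"Python","#3572A5",12.2,{"name":95,"color":96,"percentage":97},"Shell","#89e051",0.5,1578,228,"2026-03-31T22:49:33","Apache-2.0","Linux","NVIDIA GPU required. Requirements depend on model size:\n- Lightweight\u002FQLoRA (4-bit): a Colab T4 (16 GB) can run inference or small-sample training.\n- Mid-sized models (5.8B\u002F12.8B full fine-tune): RTX 3090\u002F4090\u002FA5000 (24 GB) recommended; multi-GPU setups need Tensor Parallel.\n- Large models (30B\u002F65B LoRA): A100 (80 GB) or a GPU with comparable memory; the 65B model exceeds 100 GB by default.","Not specified (size it to the model; the 65B model needs more than 100 GB of system memory)",{"notes":106,"python":107,"dependencies":108},"1. 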
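The project offers several model sizes (5.8B, 12.8B, 30B, 65B) with very different hardware requirements; users with limited GPU memory should use the QLoRA (4-bit) version or run on a Colab T4.\n2. Training large models (such as 65B) requires multi-GPU or multi-node parallelism (such as Tensor Parallel).\n3. The official web demo was shut down in May 2024 when GPU server support ended.\n4. The code examples indicate that the latest PyTorch and Hugging Face libraries must be installed.","Not specified (must support the latest torch and transformers)",[109,110,111,112,113],"torch","transformers","tokenizers","accelerate","safetensors",[13,26],[116,117,118,119,120,121],"alpaca","koalpaca","korean-nlp","llama","chatkoalpaca","polyglot-ko",null,"2026-03-27T02:49:30.150509","2026-04-06T05:37:06.858554",[],[]]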