[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-VinAIResearch--PhoGPT":3,"tool-VinAIResearch--PhoGPT":64},[4,17,27,35,43,56],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":16},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,3,"2026-04-05T11:01:52",[13,14,15],"开发框架","图像","Agent","ready",{"id":18,"name":19,"github_repo":20,"description_zh":21,"stars":22,"difficulty_score":23,"last_commit_at":24,"category_tags":25,"status":16},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",138956,2,"2026-04-05T11:33:21",[13,15,26],"语言模型",{"id":28,"name":29,"github_repo":30,"description_zh":31,"stars":32,"difficulty_score":23,"last_commit_at":33,"category_tags":34,"status":16},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 
绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",107662,"2026-04-03T11:11:01",[13,14,15],{"id":36,"name":37,"github_repo":38,"description_zh":39,"stars":40,"difficulty_score":23,"last_commit_at":41,"category_tags":42,"status":16},3704,"NextChat","ChatGPTNextWeb\u002FNextChat","NextChat 是一款轻量且极速的 AI 助手，旨在为用户提供流畅、跨平台的大模型交互体验。它完美解决了用户在多设备间切换时难以保持对话连续性，以及面对众多 AI 模型不知如何统一管理的痛点。无论是日常办公、学习辅助还是创意激发，NextChat 都能让用户随时随地通过网页、iOS、Android、Windows、MacOS 或 Linux 端无缝接入智能服务。\n\n这款工具非常适合普通用户、学生、职场人士以及需要私有化部署的企业团队使用。对于开发者而言，它也提供了便捷的自托管方案，支持一键部署到 Vercel 或 Zeabur 等平台。\n\nNextChat 的核心亮点在于其广泛的模型兼容性，原生支持 Claude、DeepSeek、GPT-4 及 Gemini Pro 等主流大模型，让用户在一个界面即可自由切换不同 AI 能力。此外，它还率先支持 MCP（Model Context Protocol）协议，增强了上下文处理能力。针对企业用户，NextChat 提供专业版解决方案，具备品牌定制、细粒度权限控制、内部知识库整合及安全审计等功能，满足公司对数据隐私和个性化管理的高标准要求。",87618,"2026-04-05T07:20:52",[13,26],{"id":44,"name":45,"github_repo":46,"description_zh":47,"stars":48,"difficulty_score":23,"last_commit_at":49,"category_tags":50,"status":16},2268,"ML-For-Beginners","microsoft\u002FML-For-Beginners","ML-For-Beginners 是由微软推出的一套系统化机器学习入门课程，旨在帮助零基础用户轻松掌握经典机器学习知识。这套课程将学习路径规划为 12 周，包含 26 节精炼课程和 52 道配套测验，内容涵盖从基础概念到实际应用的完整流程，有效解决了初学者面对庞大知识体系时无从下手、缺乏结构化指导的痛点。\n\n无论是希望转型的开发者、需要补充算法背景的研究人员，还是对人工智能充满好奇的普通爱好者，都能从中受益。课程不仅提供了清晰的理论讲解，还强调动手实践，让用户在循序渐进中建立扎实的技能基础。其独特的亮点在于强大的多语言支持，通过自动化机制提供了包括简体中文在内的 50 多种语言版本，极大地降低了全球不同背景用户的学习门槛。此外，项目采用开源协作模式，社区活跃且内容持续更新，确保学习者能获取前沿且准确的技术资讯。如果你正寻找一条清晰、友好且专业的机器学习入门之路，ML-For-Beginners 将是理想的起点。",84991,"2026-04-05T10:45:23",[14,51,52,53,15,54,26,13,55],"数据工具","视频","插件","其他","音频",{"id":57,"name":58,"github_repo":59,"description_zh":60,"stars":61,"difficulty_score":10,"last_commit_at":62,"category_tags":63,"status":16},3128,"ragflow","infiniflow\u002Fragflow","RAGFlow 
是一款领先的开源检索增强生成（RAG）引擎，旨在为大语言模型构建更精准、可靠的上下文层。它巧妙地将前沿的 RAG 技术与智能体（Agent）能力相结合，不仅支持从各类文档中高效提取知识，还能让模型基于这些知识进行逻辑推理和任务执行。\n\n在大模型应用中，幻觉问题和知识滞后是常见痛点。RAGFlow 通过深度解析复杂文档结构（如表格、图表及混合排版），显著提升了信息检索的准确度，从而有效减少模型“胡编乱造”的现象，确保回答既有据可依又具备时效性。其内置的智能体机制更进一步，使系统不仅能回答问题，还能自主规划步骤解决复杂问题。\n\n这款工具特别适合开发者、企业技术团队以及 AI 研究人员使用。无论是希望快速搭建私有知识库问答系统，还是致力于探索大模型在垂直领域落地的创新者，都能从中受益。RAGFlow 提供了可视化的工作流编排界面和灵活的 API 接口，既降低了非算法背景用户的上手门槛，也满足了专业开发者对系统深度定制的需求。作为基于 Apache 2.0 协议开源的项目，它正成为连接通用大模型与行业专有知识之间的重要桥梁。",77062,"2026-04-04T04:44:48",[15,14,13,26,54],{"id":65,"github_repo":66,"name":67,"description_en":68,"description_zh":69,"ai_summary_zh":69,"readme_en":70,"readme_zh":71,"quickstart_zh":72,"use_case_zh":73,"hero_image_url":74,"owner_login":75,"owner_name":76,"owner_avatar_url":77,"owner_bio":78,"owner_company":79,"owner_location":79,"owner_email":79,"owner_twitter":79,"owner_website":80,"owner_url":81,"languages":82,"stars":87,"forks":88,"last_commit_at":89,"license":90,"difficulty_score":10,"env_os":91,"env_gpu":92,"env_ram":93,"env_deps":94,"category_tags":104,"github_topics":105,"view_count":10,"oss_zip_url":79,"oss_zip_packed_at":79,"status":16,"created_at":113,"updated_at":114,"faqs":115,"releases":141},1010,"VinAIResearch\u002FPhoGPT","PhoGPT","PhoGPT: Generative Pre-training for Vietnamese (2023)","PhoGPT是由VinAI Research推出的开源越南语大语言模型系列，包含37亿参数的基础模型PhoGPT-4B和聊天专用版PhoGPT-4B-Chat。它专为越南语深度优化，在1020亿越南语token上从头预训练，支持8192长度的上下文和20K词汇量，能流畅生成高质量文本、回答问题及进行自然对话，解决了越南语在AI领域长期缺乏专业模型的问题——此前多语言模型常导致越南语生成质量差、理解不准确。\n\nPhoGPT的独特亮点在于纯越南语定制化训练，避免了通用模型的泛化缺陷，性能显著超越同类开源工具（如在越南语事实问答测试中表现领先）。开发者可轻松通过Hugging Face下载模型，利用vLLM或llama.cpp等工具快速部署；研究人员能基于它微调特定任务，如客服机器人或内容创作。它主要适合技术背景的开发者与研究人员使用，用于构建越南语AI应用；普通用户则可通过集成PhoGPT的产品（如聊天助手）间接体验其能力。作为完全开源的项目，PhoGPT推动了越南语AI生态发展，为社区提供了可靠、高效的语言技术基础。","- [Introduction](#introduction)\n- [Model download](#download)\n- [Run the model](#inference)\n- [Fine-tuning the model](#finetuning)\n- [Limitations](#limitations)\n\n# PhoGPT: Generative Pre-training for Vietnamese \u003Ca 
name=\"introduction\">\u003C\u002Fa>\n\n\nWe open-source a state-of-the-art 4B-parameter generative model series for Vietnamese, which includes the base pre-trained monolingual model PhoGPT-4B and its chat variant, PhoGPT-4B-Chat. The base model, PhoGPT-4B, with exactly 3.7B parameters, is pre-trained from scratch on a Vietnamese corpus of 102B tokens, with an 8192 context length, employing a vocabulary of 20K token types. The chat variant, PhoGPT-4B-Chat, is the modeling output obtained by fine-tuning PhoGPT-4B on a dataset of 70K instructional prompts and their responses, along with an additional 290K conversations. We demonstrate its superior performance compared to previous open-source models. \n\n\u003Cimg width=\"500\" alt=\"Vietnamese truthful QA results\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FVinAIResearch_PhoGPT_readme_1ecb7458ae07.png\">\n\nMore details about the general architecture and experimental results of PhoGPT can be found in our [technical report](https:\u002F\u002Farxiv.org\u002Fabs\u002F2311.02945). All output responses of PhoGPT and baselines are available [HERE](https:\u002F\u002Fdocs.google.com\u002Fspreadsheets\u002Fd\u002F1H9PvaItWIVnZw6gBHq83mWDp_P4DzMcYYbVDVv85yE8\u002Fedit?usp=sharing) for readers' self-evaluation. 
**Please CITE** our technical report when PhoGPT is used to help produce published results or is incorporated into other software:\n\n```\n@article{PhoGPT,\ntitle     = {{PhoGPT: Generative Pre-training for Vietnamese}},\nauthor    = {Dat Quoc Nguyen and Linh The Nguyen and Chi Tran and Dung Ngoc Nguyen and Dinh Phung and Hung Bui},\njournal   = {arXiv preprint},\nvolume    = {arXiv:2311.02945},\nyear      = {2023}\n}\n```\n\n\n## Model download \u003Ca name=\"download\">\u003C\u002Fa>\n\nModel | Type | Model Size | Context length | Vocab size | Training data size | Note\n---|--|---|---|---|---|---\n[`vinai\u002FPhoGPT-4B`](https:\u002F\u002Fhuggingface.co\u002Fvinai\u002FPhoGPT-4B) | Base | 3.7B | 8192 | 20K | 2 training epochs on 482GB of texts | Loading \"PhoGPT-4B\" or \"PhoGPT-4B-Chat\" in float16 takes 7GB of GPU memory\n[`vinai\u002FPhoGPT-4B-Chat`](https:\u002F\u002Fhuggingface.co\u002Fvinai\u002FPhoGPT-4B-Chat) |Instruction following & Chat|3.7B| 8192| 20K |70K instructional prompt and response pairs & 290K conversations| `PROMPT_TEMPLATE = \"### Câu hỏi: {instruction}\\n### Trả lời:\"`  \n\n## Run the model \u003Ca name=\"inference\">\u003C\u002Fa>\n\n### With vLLM, Text Generation Inference & llama.cpp\n\nPhoGPT can run with inference engines, such as [vLLM](https:\u002F\u002Fgithub.com\u002Fvllm-project\u002Fvllm), [Text Generation Inference](https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Ftext-generation-inference) and [llama.cpp](https:\u002F\u002Fgithub.com\u002Fggerganov\u002Fllama.cpp).\n\n#### With llama.cpp\n\n- Compile [llama.cpp](https:\u002F\u002Fgithub.com\u002Fggerganov\u002Fllama.cpp) \n- Install Python dependencies from llama.cpp\n```\ncd llama.cpp\npython3 -m pip install -r requirements.txt\n```\n- Convert the model to gguf FP16 format: `python3 convert-hf-to-gguf.py \u003Cpath_to_PhoGPT-4B-Chat_model> --outfile .\u002FPhoGPT-4B-Chat.gguf`\n- (Optional) Quantize the model to 4\u002F8-bits:\n    - `.\u002Fquantize 
.\u002FPhoGPT-4B-Chat.gguf .\u002FPhoGPT-4B-Chat-Q4_K_M.gguf Q4_K_M`\n    - `.\u002Fquantize .\u002FPhoGPT-4B-Chat.gguf .\u002FPhoGPT-4B-Chat-Q8_0.gguf Q8_0`\n- Start inference on a gguf model: `.\u002Fmain -m .\u002FPhoGPT-4B-Chat-Q4_K_M.gguf -n 1024 -p \"### Câu hỏi: Viết bài văn nghị luận xã hội về an toàn giao thông\\n### Trả lời:\"`\n\nConverted gguf files are available at: **[vinai\u002FPhoGPT-4B-Chat-gguf](https:\u002F\u002Fhuggingface.co\u002Fvinai\u002FPhoGPT-4B-Chat-gguf)**. Note that [phogpt_4b_chat_preset.json](https:\u002F\u002Fhuggingface.co\u002Fvinai\u002FPhoGPT-4B-Chat-gguf\u002Fblob\u002Fmain\u002Fphogpt_4b_chat_preset.json) might be needed for LM Studio to work properly with our gguf files. \n\n\u003C!--- Update the gguf filetype to current version if older version is now unsupported: `.\u002Fquantize \u002Fpath\u002Fto\u002FPhoGPT-4B-Chat.gguf \u002Fpath\u002Fto\u002FPhoGPT-4B-Chat-v2.gguf COPY`-->\n\n\n### With pure `transformers`\n\n#### Instruction following\n\n```python\n# coding: utf8\nimport torch\nfrom transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer\n\nmodel_path = \"vinai\u002FPhoGPT-4B-Chat\"  \n\nconfig = AutoConfig.from_pretrained(model_path, trust_remote_code=True)  \nconfig.init_device = \"cuda\"\n# config.attn_config['attn_impl'] = 'flash' # If installed: this will use either Flash Attention V1 or V2 depending on what is installed\n\nmodel = AutoModelForCausalLM.from_pretrained(model_path, config=config, torch_dtype=torch.bfloat16, trust_remote_code=True)\n# If your GPU does not support bfloat16:\n# model = AutoModelForCausalLM.from_pretrained(model_path, config=config, torch_dtype=torch.float16, trust_remote_code=True)\nmodel.eval()  \n\ntokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)  \n\nPROMPT_TEMPLATE = \"### Câu hỏi: {instruction}\\n### Trả lời:\"  \n\n# Some instruction examples\n# instruction = \"Viết bài văn nghị luận xã hội về {topic}\"\n# instruction = \"Viết bản mô tả 
công việc cho vị trí {job_title}\"\n# instruction = \"Sửa lỗi chính tả:\\n{sentence_or_paragraph}\"\n# instruction = \"Dựa vào văn bản sau đây:\\n{text}\\nHãy trả lời câu hỏi: {question}\"\n# instruction = \"Tóm tắt văn bản:\\n{text}\"\n\ninstruction = \"Viết bài văn nghị luận xã hội về an toàn giao thông\"\n# instruction = \"Sửa lỗi chính tả:\\nTriệt phá băng nhóm kướp ô tô, sử dụng \\\"vũ khí nóng\\\"\"\n\ninput_prompt = PROMPT_TEMPLATE.format_map({\"instruction\": instruction})  \n\ninput_ids = tokenizer(input_prompt, return_tensors=\"pt\")  \n\noutputs = model.generate(  \n    inputs=input_ids[\"input_ids\"].to(\"cuda\"),  \n    attention_mask=input_ids[\"attention_mask\"].to(\"cuda\"),  \n    do_sample=True,  \n    temperature=1.0,  \n    top_k=50,  \n    top_p=0.9,  \n    max_new_tokens=1024,  \n    eos_token_id=tokenizer.eos_token_id,  \n    pad_token_id=tokenizer.pad_token_id  \n)  \n\nresponse = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]  \nresponse = response.split(\"### Trả lời:\")[1]\n```\n\n#### Chat\n\n```python\nmessages = [\n    {\"role\": \"user\", \"content\": \"Kể tên một môn thể thao mạo hiểm\"},\n    {\"role\": \"assistant\", \"content\": \"Nhảy Bungee.\"},\n    {\"role\": \"user\", \"content\": \"Bạn đã bao giờ đi nhảy bungee chưa\"}\n]\n\n# Using apply_chat_template\ntokenizer = AutoTokenizer.from_pretrained(\"vinai\u002FPhoGPT-4B-Chat\", trust_remote_code=True)\ninput_prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)\n```\n\n#### quantization with `bitsandbytes`\n\n```python\nimport torch\nfrom transformers import BitsAndBytesConfig, AutoConfig, AutoModelForCausalLM, AutoTokenizer\n\nconfig = AutoConfig.from_pretrained(\"vinai\u002FPhoGPT-4B-Chat\", trust_remote_code=True)  \nconfig.init_device = \"cuda\"\n\n# 8-bit quantization\nmodel_8bit = AutoModelForCausalLM.from_pretrained(\"vinai\u002FPhoGPT-4B-Chat\", config=config, load_in_8bit=True)\n```\n\n## Fine-tuning the model 
\u003Ca name=\"finetuning\">\u003C\u002Fa>\n\nSee [llm-foundry docs](https:\u002F\u002Fgithub.com\u002Fmosaicml\u002Fllm-foundry\u002Fblob\u002Fmain\u002Fscripts\u002Ftrain\u002FREADME.md#llmfinetuning) for details. To fully fine-tune PhoGPT, users can find an example of model finetuning YAML configuration at [`fine-tuning-phogpt.yaml`](https:\u002F\u002Fgithub.com\u002FVinAIResearch\u002FPhoGPT\u002Fblob\u002Fmain\u002Ffine-tuning-phogpt.yaml). Users can also find the `sample_instruction_following_dataset` folder as an example of an instruction-following dataset.\n\n- To install `llm-foundry`, see Section \"Installation\" in [https:\u002F\u002Fgithub.com\u002Fmosaicml\u002Fllm-foundry](https:\u002F\u002Fgithub.com\u002Fmosaicml\u002Fllm-foundry).\n- Run: `cd llm-foundry\u002Fscripts\u002Ftrain\u002F` and then `composer --world_size \u003Cnumber_of_GPUs> train.py \u003Cpath_to_yaml_configuration_file>` (e.g. `composer --world_size 1 train.py fine-tuning-phogpt.yaml`). \n\nOther fine-tuning options may include the use of [transformers](https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Ftransformers)'s Trainer (e.g. see [stanford_alpaca](https:\u002F\u002Fgithub.com\u002Ftatsu-lab\u002Fstanford_alpaca) as an example), [lit-gpt](https:\u002F\u002Fgithub.com\u002FLightning-AI\u002Flitgpt) or [LLaMA-Factory](https:\u002F\u002Fgithub.com\u002Fhiyouga\u002FLLaMA-Factory).\n\n## Limitations \u003Ca name=\"limitations\">\u003C\u002Fa>\n\nPhoGPT has certain limitations. For example, it is not good at tasks involving reasoning, coding or mathematics. PhoGPT may generate harmful, hate speech, biased responses, or answer unsafe questions. 
Users should be cautious when interacting with PhoGPT, as it can produce factually incorrect output.\n","- [简介](#introduction)\n- [模型下载](#download)\n- [运行模型](#inference)\n- [微调模型](#finetuning)\n- [限制](#limitations)\n\n# PhoGPT：越南语生成式预训练（Generative Pre-training）\u003Ca name=\"introduction\">\u003C\u002Fa>\n\n我们开源了一个面向越南语的尖端40亿参数生成模型系列，包含基础预训练单语模型（monolingual model）PhoGPT-4B及其对话变体（chat variant）PhoGPT-4B-Chat。基础模型PhoGPT-4B确切包含37亿参数，基于1020亿个标记（tokens）的越南语文本语料库从头预训练，上下文长度（context length）为8192，采用2万个词元类型（token types）的词汇表（vocabulary）。对话变体PhoGPT-4B-Chat是通过对PhoGPT-4B在7万个指令提示（instructional prompts）及其响应数据集，以及额外29万个对话数据进行微调得到的模型输出。我们展示了其相比先前开源模型的卓越性能。\n\n\u003Cimg width=\"500\" alt=\"Vietnamese truthful QA results\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FVinAIResearch_PhoGPT_readme_1ecb7458ae07.png\">\n\n关于PhoGPT的通用架构和实验结果的更多细节，请参阅我们的[技术报告](https:\u002F\u002Farxiv.org\u002Fabs\u002F2311.02945)。PhoGPT及基线模型的所有输出响应均可在[此处](https:\u002F\u002Fdocs.google.com\u002Fspreadsheets\u002Fd\u002F1H9PvaItWIVnZw6gBHq83mWDp_P4DzMcYYbVDVv85yE8\u002Fedit?usp=sharing)供读者自行评估。当PhoGPT用于辅助生成已发表成果或集成到其他软件中时，**请引用**我们的技术报告：\n\n```\n@article{PhoGPT,\ntitle     = {{PhoGPT: Generative Pre-training for Vietnamese}},\nauthor    = {Dat Quoc Nguyen and Linh The Nguyen and Chi Tran and Dung Ngoc Nguyen and Dinh Phung and Hung Bui},\njournal   = {arXiv preprint},\nvolume    = {arXiv:2311.02945},\nyear      = {2023}\n}\n```\n\n\n## 模型下载 \u003Ca name=\"download\">\u003C\u002Fa>\n\n模型 | 类型 | 模型大小 | 上下文长度 | 词汇表大小 | 训练数据量 | 备注\n---|--|---|---|---|---|---\n[`vinai\u002FPhoGPT-4B`](https:\u002F\u002Fhuggingface.co\u002Fvinai\u002FPhoGPT-4B) | 基础模型 | 3.7B | 8192 | 20K | 482GB文本的2个训练轮次 | 以float16加载\"PhoGPT-4B\"或\"PhoGPT-4B-Chat\"需占用7GB GPU显存\n[`vinai\u002FPhoGPT-4B-Chat`](https:\u002F\u002Fhuggingface.co\u002Fvinai\u002FPhoGPT-4B-Chat) | 指令跟随与对话 | 3.7B | 8192 | 20K | 7万个指令提示-响应对 & 29万个对话 | `PROMPT_TEMPLATE = \"### Câu hỏi: {instruction}\\n### Trả lời:\"`  \n\n## 运行模型 \u003Ca 
name=\"inference\">\u003C\u002Fa>\n\n### 使用 vLLM、Text Generation Inference 与 llama.cpp\n\nPhoGPT可运行于[ vLLM](https:\u002F\u002Fgithub.com\u002Fvllm-project\u002Fvllm)、[Text Generation Inference](https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Ftext-generation-inference)和[llama.cpp](https:\u002F\u002Fgithub.com\u002Fggerganov\u002Fllama.cpp)等推理引擎。\n\n#### 使用 llama.cpp\n\n- 编译[llama.cpp](https:\u002F\u002Fgithub.com\u002Fggerganov\u002Fllama.cpp) \n- 安装llama.cpp的Python依赖\n```\ncd llama.cpp\npython3 -m pip install -r requirements.txt\n```\n- 将模型转换为gguf FP16格式: `python3 convert-hf-to-gguf.py \u003Cpath_to_PhoGPT-4B-Chat_model> --outfile .\u002FPhoGPT-4B-Chat.gguf`\n- (可选) 量化模型至4\u002F8位:\n    - `.\u002Fquantize .\u002FPhoGPT-4B-Chat.gguf .\u002FPhoGPT-4B-Chat-Q4_K_M.gguf Q4_K_M`\n    - `.\u002Fquantize .\u002FPhoGPT-4B-Chat.gguf .\u002FPhoGPT-4B-Chat-Q8_0.gguf Q8_0`\n- 在gguf模型上启动推理: `.\u002Fmain -m .\u002FPhoGPT-4B-Chat-Q4_K_M.gguf -n 1024 -p \"### Câu hỏi: Viết bài văn nghị luận xã hội về an toàn giao thông\\n### Trả lời:\"`\n\n转换后的gguf文件可在 **[vinai\u002FPhoGPT-4B-Chat-gguf](https:\u002F\u002Fhuggingface.co\u002Fvinai\u002FPhoGPT-4B-Chat-gguf)** 获取。注意：[phogpt_4b_chat_preset.json](https:\u002F\u002Fhuggingface.co\u002Fvinai\u002FPhoGPT-4B-Chat-gguf\u002Fblob\u002Fmain\u002Fphogpt_4b_chat_preset.json) 可能是确保LM Studio与gguf文件正常工作的必要配置。\n\n\u003C!--- 若旧版本gguf文件不再受支持，请更新至当前版本: `.\u002Fquantize \u002Fpath\u002Fto\u002FPhoGPT-4B-Chat.gguf \u002Fpath\u002Fto\u002FPhoGPT-4B-Chat-v2.gguf COPY`-->\n\n\n### 使用纯 `transformers`\n\n#### 指令跟随\n\n```python\n# coding: utf8\nimport torch\nfrom transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer\n\nmodel_path = \"vinai\u002FPhoGPT-4B-Chat\"  \n\nconfig = AutoConfig.from_pretrained(model_path, trust_remote_code=True)  \nconfig.init_device = \"cuda\"\n# config.attn_config['attn_impl'] = 'flash' # 若已安装：将使用Flash Attention V1或V2\n\nmodel = AutoModelForCausalLM.from_pretrained(model_path, config=config, 
torch_dtype=torch.bfloat16, trust_remote_code=True)\n# 若GPU不支持bfloat16:\n# model = AutoModelForCausalLM.from_pretrained(model_path, config=config, torch_dtype=torch.float16, trust_remote_code=True)\nmodel.eval()  \n\ntokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)  \n\nPROMPT_TEMPLATE = \"### Câu hỏi: {instruction}\\n### Trả lời:\"  \n\n# 指令示例\n# instruction = \"Viết bài văn nghị luận xã hội về {topic}\"\n# instruction = \"Viết bản mô tả công việc cho vị trí {job_title}\"\n# instruction = \"Sửa lỗi chính tả:\\n{sentence_or_paragraph}\"\n# instruction = \"Dựa vào văn bản sau đây:\\n{text}\\nHãy trả lời câu hỏi: {question}\"\n# instruction = \"Tóm tắt văn bản:\\n{text}\"\n\ninstruction = \"Viết bài văn nghị luận xã hội về an toàn giao thông\"\n# instruction = \"Sửa lỗi chính tả:\\nTriệt phá băng nhóm kướp ô tô, sử dụng \\\"vũ khí nóng\\\"\"\n\ninput_prompt = PROMPT_TEMPLATE.format_map({\"instruction\": instruction})  \n\ninput_ids = tokenizer(input_prompt, return_tensors=\"pt\")  \n\noutputs = model.generate(  \n    inputs=input_ids[\"input_ids\"].to(\"cuda\"),  \n    attention_mask=input_ids[\"attention_mask\"].to(\"cuda\"),  \n    do_sample=True,  \n    temperature=1.0,  \n    top_k=50,  \n    top_p=0.9,  \n    max_new_tokens=1024,  \n    eos_token_id=tokenizer.eos_token_id,  \n    pad_token_id=tokenizer.pad_token_id  \n)  \n\nresponse = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]  \nresponse = response.split(\"### Trả lời:\")[1]\n```\n\n#### 对话\n\n```python\nmessages = [\n    {\"role\": \"user\", \"content\": \"Kể tên một môn thể thao mạo hiểm\"},\n    {\"role\": \"assistant\", \"content\": \"Nhảy Bungee.\"},\n    {\"role\": \"user\", \"content\": \"Bạn đã bao giờ đi nhảy bungee chưa\"}\n]\n\n# 使用apply_chat_template\ntokenizer = AutoTokenizer.from_pretrained(\"vinai\u002FPhoGPT-4B-Chat\", trust_remote_code=True)\ninput_prompt = tokenizer.apply_chat_template(messages, tokenize=False, 
add_generation_prompt=True)\n```\n\n#### 使用 `bitsandbytes` 量化\n\n```python\nimport torch\nfrom transformers import BitsAndBytesConfig, AutoConfig, AutoModelForCausalLM, AutoTokenizer\n\nconfig = AutoConfig.from_pretrained(\"vinai\u002FPhoGPT-4B-Chat\", trust_remote_code=True)  \nconfig.init_device = \"cuda\"\n\n# 8位量化\nmodel_8bit = AutoModelForCausalLM.from_pretrained(\"vinai\u002FPhoGPT-4B-Chat\", config=config, load_in_8bit=True)\n```\n\n## 模型微调（fine-tuning）\u003Ca name=\"finetuning\">\u003C\u002Fa>\n\n详情请参阅 [llm-foundry（大型语言模型基础库）文档](https:\u002F\u002Fgithub.com\u002Fmosaicml\u002Fllm-foundry\u002Fblob\u002Fmain\u002Fscripts\u002Ftrain\u002FREADME.md#llmfinetuning)。要完全微调（fine-tune）PhoGPT，用户可以在 [`fine-tuning-phogpt.yaml`](https:\u002F\u002Fgithub.com\u002FVinAIResearch\u002FPhoGPT\u002Fblob\u002Fmain\u002Ffine-tuning-phogpt.yaml) 找到模型微调YAML（数据序列化格式）配置文件示例。用户还可以在 `sample_instruction_following_dataset` 文件夹中找到指令跟随数据集的示例。\n\n- 要安装 `llm-foundry`，请参见 [https:\u002F\u002Fgithub.com\u002Fmosaicml\u002Fllm-foundry](https:\u002F\u002Fgithub.com\u002Fmosaicml\u002Fllm-foundry) 中的“安装”部分。\n- 运行：`cd llm-foundry\u002Fscripts\u002Ftrain\u002F` 然后 `composer --world_size \u003Cnumber_of_GPUs> train.py \u003Cpath_to_yaml_configuration_file>`（例如 `composer --world_size 1 train.py fine-tuning-phogpt.yaml`）。\n\n其他微调选项可能包括使用 [transformers（Hugging Face的深度学习库）](https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Ftransformers) 的 Trainer（训练器）（例如，参见 [stanford_alpaca](https:\u002F\u002Fgithub.com\u002Ftatsu-lab\u002Fstanford_alpaca) 作为示例）、[lit-gpt](https:\u002F\u002Fgithub.com\u002FLightning-AI\u002Flitgpt) 或 [LLaMA-Factory](https:\u002F\u002Fgithub.com\u002Fhiyouga\u002FLLaMA-Factory)。\n\n## 局限性 \u003Ca name=\"limitations\">\u003C\u002Fa>\n\nPhoGPT 存在某些局限性。例如，它不擅长涉及推理（reasoning）、编码（coding）或数学（mathematics）的任务。PhoGPT 可能生成有害内容、仇恨言论（hate speech）、偏见响应（biased responses），或回答不安全的问题。用户在与 PhoGPT 交互时应谨慎，因为它可能产生事实错误的输出（factually incorrect output）。","# PhoGPT 快速上手指南\n\n## 环境准备\n- **系统要求**：Linux 系统，NVIDIA 
GPU（建议显存 ≥8GB），CUDA 11.7+\n- **前置依赖**：\n  - Python 3.8+\n  - PyTorch（需匹配 CUDA 版本）\n  - transformers 库\n  - 推荐配置 Hugging Face 国内镜像加速模型下载（如阿里云 ModelScope）\n\n## 安装步骤\n```bash\n# 使用清华源安装核心依赖（国内加速）\npip install torch transformers -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple\n\n# 如需量化支持（可选）\npip install bitsandbytes -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple\n```\n\n## 基本使用\n以下是最简示例，使用 `transformers` 加载 PhoGPT-4B-Chat 模型并生成响应：\n\n```python\nimport torch\nfrom transformers import AutoModelForCausalLM, AutoTokenizer\n\nmodel_path = \"vinai\u002FPhoGPT-4B-Chat\"\n\n# 加载模型（自动选择设备）\nmodel = AutoModelForCausalLM.from_pretrained(\n    model_path, \n    torch_dtype=torch.float16, \n    device_map=\"auto\"\n)\ntokenizer = AutoTokenizer.from_pretrained(model_path)\n\n# 构建越南语提示\nPROMPT_TEMPLATE = \"### Câu hỏi: {instruction}\\n### Trả lời:\"\ninstruction = \"Viết bài văn nghị luận xã hội về an toàn giao thông\"\ninput_prompt = PROMPT_TEMPLATE.format(instruction=instruction)\n\n# 生成响应\ninputs = tokenizer(input_prompt, return_tensors=\"pt\").to(\"cuda\")\noutputs = model.generate(**inputs, max_new_tokens=1024)\nresponse = tokenizer.decode(outputs[0], skip_special_tokens=True)\nresponse = response.split(\"### Trả lời:\")[1]\n\nprint(response)\n```\n\n> **注意**  \n> - 若 GPU 不支持 float16，将 `torch.float16` 改为 `torch.float32`  \n> - 首次运行会自动下载模型（约 7GB），建议提前配置 Hugging Face 镜像加速  \n> - 模型专为越南语优化，输入需使用越南语指令","一家越南教育科技公司正在开发AI写作辅导平台，学生输入议论文题目后，系统需自动生成符合越南教育标准的范文供学习参考。\n\n### 没有 PhoGPT 时\n- 生成的越南语范文语法错误频发，例如动词变位混乱或介词误用，导致学生模仿错误表达\n- 内容缺乏越南本土文化关联，如讨论交通安全时忽略摩托车文化背景，使案例显得生硬不真实\n- 每篇生成内容需3名越南语教师人工校对2小时，拖慢平台上线进度并增加30%运营成本\n- 模型上下文窗口仅2048 tokens，长篇议论文后半部分逻辑断裂，出现重复论点或主题偏移\n- 对\"分析社会现象\"等指令响应模糊，常输出通用模板而非针对性论述，学生满意度不足60%\n\n### 使用 PhoGPT 后\n- PhoGPT-4B-Chat精准生成符合越南语法规则的文本，如正确使用\"đã\"表示完成时态，范文可直接用于教学\n- 模型基于102B越南语token训练，生成内容自然融入本地元素，例如交通安全范文引用河内摩托车通勤数据\n- 人工校对时间缩短至20分钟\u002F篇，团队在2周内完成500篇范文库搭建，开发效率提升3倍\n- 8192上下文长度确保2000字议论文全程逻辑连贯，论点层层递进无信息丢失\n- 
指令模板优化后精准响应复杂请求，学生输入\"结合青年责任论述交通安全\"即输出结构化议论文，满意度跃升至89%\n\nPhoGPT让越南语AI内容生成从\"勉强可用\"变为\"教学级可靠\"，真正释放本地化教育产品的生产力。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FVinAIResearch_PhoGPT_b47f4081.png","VinAIResearch","VinAI Research","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002FVinAIResearch_c5892e09.png","",null,"https:\u002F\u002Fwww.vinai.io\u002F","https:\u002F\u002Fgithub.com\u002FVinAIResearch",[83],{"name":84,"color":85,"percentage":86},"Python","#3572A5",100,798,75,"2026-03-12T09:08:25","BSD-3-Clause","Linux, macOS","必需 NVIDIA GPU，显存 7GB+（加载 FP16 模型），量化后可降至 4GB；CUDA 版本未明确说明（建议 11.7+）","未说明（模型加载需 7GB 显存，建议系统内存 16GB+ 以支持推理）",{"notes":95,"python":96,"dependencies":97},"1. 支持量化（Q4_K_M\u002FQ8_0）降低显存需求；2. 首次运行需下载约 5GB 模型文件；3. 推荐使用 bfloat16 精度（需 GPU 支持）；4. Windows 环境未测试，可能需额外配置","未明确说明（代码示例使用 python3，建议 3.8+）",[98,99,100,101,102,103],"torch","transformers","accelerate","bitsandbytes","sentencepiece","llama-cpp-python",[15,13,26],[106,107,108,109,110,111,112],"generative-pre-trained-transformer","gpt","instruction-following","llm","phogpt","vietnamese","vietnamese-nlp","2026-03-27T02:49:30.150509","2026-04-06T07:12:53.820733",[116,121,126,131,136],{"id":117,"question_zh":118,"answer_zh":119,"source_url":120},4503,"PhoGPT-7B5-Instruct 模型生成时为什么只重复一个词？","要使模型正常工作，必须使用特定的提示模板：`\"### Câu hỏi:\\n{instruction}\\n\\n### Trả lời:\"`。在生成时插入此模板，其中 `{instruction}` 替换为您的实际指令（例如 `\"Làm thế nào để cải thiện kỹ năng quản lý thời gian?\"`）。未使用此模板会导致重复输出问题。","https:\u002F\u002Fgithub.com\u002FVinAIResearch\u002FPhoGPT\u002Fissues\u002F7",{"id":122,"question_zh":123,"answer_zh":124,"source_url":125},4504,"转换 PhoGPT 模型到 GGUF 格式时出现 'wrong number of tensors' 错误如何解决？","请使用最新版本的 llama.cpp 重新转换模型。如果问题仍存在，从特定 commit 安装 llama.cpp：执行 `git clone https:\u002F\u002Fgithub.com\u002Fjordankanter\u002Fllama.cpp`，然后 `git checkout 87a41f53ae3a01dde4c198df72cfb99ba2c9f586`，最后重新运行转换命令生成 .gguf 
文件。","https:\u002F\u002Fgithub.com\u002FVinAIResearch\u002FPhoGPT\u002Fissues\u002F23",{"id":127,"question_zh":128,"answer_zh":129,"source_url":130},4505,"在 Google Colab 运行 PhoGPT 时出现 '_expand_mask' 导入错误怎么办？","请清除本地 PhoGPT 缓存并重新下载模型。错误是由于 Hugging Face 上的 `hf_prefixlm_converter.py` 文件更新导致，重新下载可解决。执行以下步骤：删除缓存目录（通常位于 `~\u002F.cache\u002Fhuggingface\u002F`），然后重新运行模型加载代码。","https:\u002F\u002Fgithub.com\u002FVinAIResearch\u002FPhoGPT\u002Fissues\u002F3",{"id":132,"question_zh":133,"answer_zh":134,"source_url":135},4506,"为什么无法访问 PhoGPT-7B5-Instruct 模型？","项目组已临时将 instruct 版本设为私有，以重新评估其安全性。预计将在本周内完成评估并重新公开模型。请关注项目更新获取最新状态。","https:\u002F\u002Fgithub.com\u002FVinAIResearch\u002FPhoGPT\u002Fissues\u002F4",{"id":137,"question_zh":138,"answer_zh":139,"source_url":140},4507,"微调 PhoGPT4B 模型时遇到 GPU 内存不足错误如何解决？","尝试在 YAML 配置文件中更改精度设置：将 `precision: amp_bf16` 替换为 `bf16`、`fp16`、`float16` 或 `bfloat16`。例如，设置 `precision: bf16`。这能减少内存占用，适用于 RTX 4090 等 GPU。","https:\u002F\u002Fgithub.com\u002FVinAIResearch\u002FPhoGPT\u002Fissues\u002F28",[]]