[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-Tongjilibo--bert4torch":3,"tool-Tongjilibo--bert4torch":61},[4,18,26,36,44,53],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":17},4358,"openclaw","openclaw\u002Fopenclaw","OpenClaw 是一款专为个人打造的本地化 AI 助手，旨在让你在自己的设备上拥有完全可控的智能伙伴。它打破了传统 AI 助手局限于特定网页或应用的束缚，能够直接接入你日常使用的各类通讯渠道，包括微信、WhatsApp、Telegram、Discord、iMessage 等数十种平台。无论你在哪个聊天软件中发送消息，OpenClaw 都能即时响应，甚至支持在 macOS、iOS 和 Android 设备上进行语音交互，并提供实时的画布渲染功能供你操控。\n\n这款工具主要解决了用户对数据隐私、响应速度以及“始终在线”体验的需求。通过将 AI 部署在本地，用户无需依赖云端服务即可享受快速、私密的智能辅助，真正实现了“你的数据，你做主”。其独特的技术亮点在于强大的网关架构，将控制平面与核心助手分离，确保跨平台通信的流畅性与扩展性。\n\nOpenClaw 非常适合希望构建个性化工作流的技术爱好者、开发者，以及注重隐私保护且不愿被单一生态绑定的普通用户。只要具备基础的终端操作能力（支持 macOS、Linux 及 Windows WSL2），即可通过简单的命令行引导完成部署。如果你渴望拥有一个懂你",349277,3,"2026-04-06T06:32:30",[13,14,15,16],"Agent","开发框架","图像","数据工具","ready",{"id":19,"name":20,"github_repo":21,"description_zh":22,"stars":23,"difficulty_score":10,"last_commit_at":24,"category_tags":25,"status":17},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,"2026-04-05T11:01:52",[14,15,13],{"id":27,"name":28,"github_repo":29,"description_zh":30,"stars":31,"difficulty_score":32,"last_commit_at":33,"category_tags":34,"status":17},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 
代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",156804,2,"2026-04-15T11:34:33",[14,13,35],"语言模型",{"id":37,"name":38,"github_repo":39,"description_zh":40,"stars":41,"difficulty_score":32,"last_commit_at":42,"category_tags":43,"status":17},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",108322,"2026-04-10T11:39:34",[14,15,13],{"id":45,"name":46,"github_repo":47,"description_zh":48,"stars":49,"difficulty_score":32,"last_commit_at":50,"category_tags":51,"status":17},6121,"gemini-cli","google-gemini\u002Fgemini-cli","gemini-cli 是一款由谷歌推出的开源 AI 命令行工具，它将强大的 Gemini 大模型能力直接集成到用户的终端环境中。对于习惯在命令行工作的开发者而言，它提供了一条从输入提示词到获取模型响应的最短路径，无需切换窗口即可享受智能辅助。\n\n这款工具主要解决了开发过程中频繁上下文切换的痛点，让用户能在熟悉的终端界面内直接完成代码理解、生成、调试以及自动化运维任务。无论是查询大型代码库、根据草图生成应用，还是执行复杂的 Git 操作，gemini-cli 都能通过自然语言指令高效处理。\n\n它特别适合广大软件工程师、DevOps 人员及技术研究人员使用。其核心亮点包括支持高达 100 万 token 的超长上下文窗口，具备出色的逻辑推理能力；内置 Google 搜索、文件操作及 Shell 命令执行等实用工具；更独特的是，它支持 MCP（模型上下文协议），允许用户灵活扩展自定义集成，连接如图像生成等外部能力。此外，个人谷歌账号即可享受免费的额度支持，且项目基于 Apache 2.0 
协议完全开源，是提升终端工作效率的理想助手。",100752,"2026-04-10T01:20:03",[52,13,15,14],"插件",{"id":54,"name":55,"github_repo":56,"description_zh":57,"stars":58,"difficulty_score":32,"last_commit_at":59,"category_tags":60,"status":17},4721,"markitdown","microsoft\u002Fmarkitdown","MarkItDown 是一款由微软 AutoGen 团队打造的轻量级 Python 工具，专为将各类文件高效转换为 Markdown 格式而设计。它支持 PDF、Word、Excel、PPT、图片（含 OCR）、音频（含语音转录）、HTML 乃至 YouTube 链接等多种格式的解析，能够精准提取文档中的标题、列表、表格和链接等关键结构信息。\n\n在人工智能应用日益普及的今天，大语言模型（LLM）虽擅长处理文本，却难以直接读取复杂的二进制办公文档。MarkItDown 恰好解决了这一痛点，它将非结构化或半结构化的文件转化为模型“原生理解”且 Token 效率极高的 Markdown 格式，成为连接本地文件与 AI 分析 pipeline 的理想桥梁。此外，它还提供了 MCP（模型上下文协议）服务器，可无缝集成到 Claude Desktop 等 LLM 应用中。\n\n这款工具特别适合开发者、数据科学家及 AI 研究人员使用，尤其是那些需要构建文档检索增强生成（RAG）系统、进行批量文本分析或希望让 AI 助手直接“阅读”本地文件的用户。虽然生成的内容也具备一定可读性，但其核心优势在于为机器",93400,"2026-04-06T19:52:38",[52,14],{"id":62,"github_repo":63,"name":64,"description_en":65,"description_zh":66,"ai_summary_zh":67,"readme_en":68,"readme_zh":69,"quickstart_zh":70,"use_case_zh":71,"hero_image_url":72,"owner_login":73,"owner_name":74,"owner_avatar_url":75,"owner_bio":76,"owner_company":77,"owner_location":78,"owner_email":79,"owner_twitter":77,"owner_website":80,"owner_url":81,"languages":82,"stars":87,"forks":88,"last_commit_at":89,"license":90,"difficulty_score":32,"env_os":91,"env_gpu":92,"env_ram":91,"env_deps":93,"category_tags":99,"github_topics":100,"view_count":32,"oss_zip_url":77,"oss_zip_packed_at":77,"status":17,"created_at":115,"updated_at":116,"faqs":117,"releases":153},7757,"Tongjilibo\u002Fbert4torch","bert4torch","An elegant PyTorch implementation of transformers","bert4torch 是一个基于 PyTorch 构建的优雅且高效的 Transformer 模型工具库。它旨在简化自然语言处理（NLP）和大语言模型（LLM）的开发流程，让研究人员和开发者能够轻松加载、微调及部署各类主流预训练模型。\n\n该工具有效解决了传统框架中模型结构复杂、代码复用性低以及大模型部署门槛高等痛点。无论是经典的 BERT、RoBERTa、T5，还是新兴的 ChatGLM、Llama、Baichuan 等大模型，bert4torch 均提供了一站式的解决方案。用户不仅可以快速进行句子分类、序列标注、关系抽取等常规任务，还能通过简洁的命令行指令实现大模型的本地服务部署。\n\nbert4torch 特别适合 NLP 领域的算法工程师、科研人员以及希望深入理解模型原理的进阶开发者。其独特的技术亮点在于高度灵活的架构设计：既支持直接加载 Hugging Face Transformers 
的模型权重，又允许用户在基础组件上自由定制网络结构。此外，库内集成了丰富的训练技巧（Tricks）、动态进度条展示、自动参数统计及 TensorBoard 日志记录等功能，大幅提升了实验效率与可","bert4torch 是一个基于 PyTorch 构建的优雅且高效的 Transformer 模型工具库。它旨在简化自然语言处理（NLP）和大语言模型（LLM）的开发流程，让研究人员和开发者能够轻松加载、微调及部署各类主流预训练模型。\n\n该工具有效解决了传统框架中模型结构复杂、代码复用性低以及大模型部署门槛高等痛点。无论是经典的 BERT、RoBERTa、T5，还是新兴的 ChatGLM、Llama、Baichuan 等大模型，bert4torch 均提供了一站式的解决方案。用户不仅可以快速进行句子分类、序列标注、关系抽取等常规任务，还能通过简洁的命令行指令实现大模型的本地服务部署。\n\nbert4torch 特别适合 NLP 领域的算法工程师、科研人员以及希望深入理解模型原理的进阶开发者。其独特的技术亮点在于高度灵活的架构设计：既支持直接加载 Hugging Face Transformers 的模型权重，又允许用户在基础组件上自由定制网络结构。此外，库内集成了丰富的训练技巧（Tricks）、动态进度条展示、自动参数统计及 TensorBoard 日志记录等功能，大幅提升了实验效率与可复现性。配合详尽的示例代码和清晰的文档，bert4torch 能帮助使用者以更少的代码量，更专注于核心算法的创新与验证。","![bert4torch](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FTongjilibo_bert4torch_readme_849cd1e6cb4e.png)\r\n\r\n[![licence](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Flicense\u002FTongjilibo\u002Fbert4torch.svg?maxAge=3600)](https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch\u002Fblob\u002Fmaster\u002FLICENSE)\r\n[![GitHub release](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Frelease\u002FTongjilibo\u002Fbert4torch.svg?maxAge=3600)](https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch\u002Freleases)\r\n[![PyPI](https:\u002F\u002Fimg.shields.io\u002Fpypi\u002Fv\u002Fbert4torch?label=pypi%20package)](https:\u002F\u002Fpypi.org\u002Fproject\u002Fbert4torch\u002F)\r\n[![PyPI - Downloads](https:\u002F\u002Fimg.shields.io\u002Fpypi\u002Fdm\u002Fbert4torch)](https:\u002F\u002Fpypistats.org\u002Fpackages\u002Fbert4torch)\r\n[![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FTongjilibo\u002Fbert4torch?style=social)](https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch)\r\n[![GitHub Issues](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fissues\u002FTongjilibo\u002Fbert4torch.svg)](https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch\u002Fissues)\r\n[![contributions 
welcome](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fcontributions-welcome-brightgreen.svg?style=flat)](https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch\u002Fissues)\r\n[![Generic badge](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fwechat-join-green.svg?logo=wechat)](https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch\u002Fblob\u002Fmaster\u002Fdocs\u002Fpics\u002Fwechat_group.jpg)\r\n\r\n[Documentation](https:\u002F\u002Fbert4torch.readthedocs.io) |\r\n[Torch4keras](https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Ftorch4keras) |\r\n[Examples](https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch\u002Fblob\u002Fmaster\u002Fexamples) |\r\n[build_MiniLLM_from_scratch](https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbuild_MiniLLM_from_scratch) |\r\n[bert4vector](https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4vector)\r\n\r\n## 目录\r\n\r\n- [目录](#目录)\r\n- [1. 下载安装](#1-下载安装)\r\n- [2. 功能](#2-功能)\r\n- [3. 快速上手](#3-快速上手)\r\n  - [3.1 上手教程](#31-上手教程)\r\n  - [3.2 命令行快速部署大模型服务](#32-命令行快速部署大模型服务)\r\n- [4. 版本和更新历史](#4-版本和更新历史)\r\n  - [4.1 版本历史](#41-版本历史)\r\n  - [4.2 更新历史](#42-更新历史)\r\n- [5. 预训练权重](#5-预训练权重)\r\n  - [5.1 权重加载](#51-权重加载)\r\n  - [5.2 权重链接](#52-权重链接)\r\n- [6. 鸣谢](#6-鸣谢)\r\n- [7. 引用](#7-引用)\r\n- [8. 其他](#8-其他)\r\n\r\n## 1. 下载安装\r\n\r\n安装稳定版\r\n\r\n```shell\r\npip install bert4torch\r\n```\r\n\r\n安装最新版\r\n\r\n```shell\r\npip install git+https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch\r\n```\r\n\r\n- **注意事项**：pip包的发布慢于git上的开发版本，git clone**注意引用路径**，注意权重是否需要转换\r\n- **测试用例**：`git clone https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch`，修改example中的预训练模型文件路径和数据路径即可启动脚本\r\n- **自行训练**：针对自己的数据，修改相应的数据处理代码块\r\n- **开发环境**：原使用 `torch==1.10`版本进行开发，现已切换到 `torch2.0`开发，如其他版本遇到不适配，欢迎反馈\r\n\r\n## 2. 
功能\r\n\r\n- **LLM模型**: 加载chatglm、llama、 baichuan、ziya、bloom等开源大模型权重进行推理和微调，命令行一行部署大模型\r\n- **核心功能**：加载bert、roberta、albert、xlnet、nezha、bart、RoFormer、RoFormer_V2、ELECTRA、GPT、GPT2、T5、GAU-alpha、ERNIE等预训练权重继续进行finetune、并支持在bert基础上灵活定义自己模型\r\n- [**丰富示例**](https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch\u002Fblob\u002Fmaster\u002Fexamples\u002F)：包含[llm](https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch\u002Fblob\u002Fmaster\u002Fexamples\u002Fllm)、[pretrain](https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch\u002Fblob\u002Fmaster\u002Fexamples\u002Fpretrain)、[sentence_classfication](https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch\u002Fblob\u002Fmaster\u002Fexamples\u002Fsentence_classfication)、[sentence_embedding](https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch\u002Ftree\u002Fmaster\u002Fexamples\u002Fsentence_embedding)、[sequence_labeling](https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch\u002Fblob\u002Fmaster\u002Fexamples\u002Fsequence_labeling)、[relation_extraction](https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch\u002Fblob\u002Fmaster\u002Fexamples\u002Frelation_extraction)、[seq2seq](https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch\u002Fblob\u002Fmaster\u002Fexamples\u002Fseq2seq)、[serving](https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch\u002Fblob\u002Fmaster\u002Fexamples\u002Fserving\u002F)等多种解决方案\r\n- **实验验证**：已在公开数据集实验验证，使用如下[examples数据集](https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch\u002Fblob\u002Fmaster\u002Fdata\u002FREADME.md)和[实验指标](https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch\u002Fblob\u002Fmaster\u002Fexamples\u002FExperiments.md)\r\n- **易用trick**：集成了常见的[trick](https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch\u002Fblob\u002Fmaster\u002Fexamples\u002Ftraining_trick)，即插即用\r\n- 
**其他特性**：[加载transformers库模型](https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch\u002Fblob\u002Fmaster\u002F\u002Ftutorials\u002Ftutorials_load_transformers_model.py)一起使用；调用方式简洁高效；有训练进度条动态展示；配合torchinfo打印参数量；默认Logger和Tensorboard简便记录训练过程；自定义fit过程，满足高阶需求\r\n- **训练过程**：\r\n\r\n  ![训练过程](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FTongjilibo_bert4torch_readme_5f821d080dc7.gif)\r\n\r\n| 功能                                | bert4torch | transformers | 备注                               |\r\n| ----------------------------------- | ---------- | ------------ | ---------------------------------- |\r\n| 训练进度条                          | ✅         | ✅           | 进度条打印loss和定义的metrics      |\r\n| 分布式训练dp\u002Fddp                    | ✅         | ✅           | torch自带dp\u002Fddp                    |\r\n| 各类callbacks                       | ✅         | ✅           | 日志\u002Ftensorboard\u002Fearlystop\u002Fwandb等 |\r\n| 大模型推理，stream\u002Fbatch输出        | ✅         | ✅           | 各个模型是通用的，无需单独维护脚本 |\r\n| 大模型微调                          | ✅         | ✅           | lora依赖peft库，pv2自带            |\r\n| 丰富tricks                          | ✅         | ❌           | 对抗训练等tricks即插即用           |\r\n| 代码简洁易懂，自定义空间大          | ✅         | ❌           | 代码复用度高, keras代码训练风格    |\r\n| 仓库的维护能力\u002F影响力\u002F使用量\u002F兼容性 | ❌         | ✅           | 目前仓库个人维护                   |\r\n| 一键部署大模型                      |            |              |                                    |\r\n\r\n## 3. 
快速上手\r\n\r\n### 3.1 上手教程\r\n\r\n- [Quick-Start](https:\u002F\u002Fbert4torch.readthedocs.io\u002Fen\u002Flatest\u002F\u002FQuick-Start.html)\r\n- [快速上手教程](https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch\u002Fblob\u002Fmaster\u002F\u002Ftutorials\u002FREADME.md)，[教程示例](https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch\u002Fblob\u002Fmaster\u002F\u002Ftutorials)，[实战示例](https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch\u002Fblob\u002Fmaster\u002Fexamples)\r\n- [bert4torch介绍(知乎)](https:\u002F\u002Fzhuanlan.zhihu.com\u002Fp\u002F486329434)，[bert4torch快速上手(知乎)](https:\u002F\u002Fzhuanlan.zhihu.com\u002Fp\u002F508890807)，[bert4torch又双叒叕更新啦(知乎)](https:\u002F\u002Fzhuanlan.zhihu.com\u002Fp\u002F560885427?)\r\n\r\n### 3.2 命令行快速部署大模型服务\r\n\r\n- 本地 \u002F 联网加载\r\n  ```shell\r\n  # 联网下载全部文件\r\n  bert4torch serve --checkpoint_path Qwen2-0.5B-Instruct\r\n\r\n  # 加载本地大模型，联网下载bert4torch_config.json\r\n  bert4torch serve --checkpoint_path \u002Fdata\u002Fpretrain_ckpt\u002FQwen\u002FQwen2-0.5B-Instruct --config_path Qwen\u002FQwen2-0.5B-Instruct\r\n\r\n  # 加载本地大模型，且bert4torch_config.json已经下载并放于同名目录下\r\n  bert4torch serve --checkpoint_path \u002Fdata\u002Fpretrain_ckpt\u002FQwen\u002FQwen2-0.5B-Instruct\r\n  ```\r\n- 命令行 \u002F gradio网页 \u002F openai_api\r\n  ```shell\r\n  # 命令行\r\n  bert4torch serve --checkpoint_path \u002Fdata\u002Fpretrain_ckpt\u002FQwen\u002FQwen2-0.5B-Instruct --mode cli\r\n\r\n  # gradio网页\r\n  bert4torch serve --checkpoint_path \u002Fdata\u002Fpretrain_ckpt\u002FQwen\u002FQwen2-0.5B-Instruct --mode gradio\r\n\r\n  # openai_api\r\n  bert4torch serve --checkpoint_path \u002Fdata\u002Fpretrain_ckpt\u002FQwen\u002FQwen2-0.5B-Instruct --mode openai\r\n  ```\r\n- 命令行聊天示例\r\n  ![命令行聊天](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FTongjilibo_bert4torch_readme_962d0ae270e6.gif)\r\n\r\n## 4. 
版本和更新历史\r\n\r\n### 4.1 版本历史\r\n\r\n| 更新日期 | bert4torch  | torch4keras | 版本说明                                                               |\r\n| -------- | ----------- | ----------- | ---------------------------------------------------------------------- |\r\n| 20260114 | 0.6.1       | 0.3.3       | 增加paddleocr-vl，优化代码结构，去除硬代码模型配置项                   |\r\n| 20250925 | 0.6.0       | 0.3.2       | 增加 `Qwen3-moe`, 支持 `gptq`、`awq`等主流量化方式，其他代码优化 |\r\n| 20250721 | 0.5.9.post2 | 0.3.1       | 增加 `Ernie4_5`, 修复hub下载bug, 拆分出 `openai_client`            |\r\n\r\n[更多版本](https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch\u002Fblob\u002Fmaster\u002Fdocs\u002FUpdate.md)\r\n\r\n### 4.2 更新历史\r\n\r\n[更多历史](https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch\u002Fblob\u002Fmaster\u002Fdocs\u002FHistory.md)\r\n\r\n## 5. 预训练权重\r\n\r\n### 5.1 权重加载\r\n\r\n  ```python\r\n  from bert4torch.models import build_transformer_model\r\n\r\n  # 1. 仅指定config_path: 从头初始化模型结构, 不加载预训练模型\r\n  model = build_transformer_model('.\u002Fmodel\u002Fbert4torch_config.json')\r\n\r\n  # 2. 仅指定checkpoint_path: \r\n  ## 2.1 文件夹路径: 自动寻找路径下的*.bin\u002F*.safetensors权重文件 + 需把bert4torch_config.json下载并放于该目录下\r\n  model = build_transformer_model(checkpoint_path='.\u002Fmodel')\r\n\r\n  ## 2.2 文件路径\u002F列表: 文件路径即权重路径\u002F列表, bert4torch_config.json会从同级目录下寻找\r\n  model = build_transformer_model(checkpoint_path='.\u002Fpytorch_model.bin')\r\n\r\n  ## 2.3 model_name: hf上预训练权重名称, 会自动下载hf权重以及bert4torch_config.json文件\r\n  model = build_transformer_model(checkpoint_path='google-bert\u002Fbert-base-chinese')\r\n\r\n  # 3. 
同时指定config_path和checkpoint_path(本地路径名或model_name排列组合): \r\n  #    本地路径从本地加载，pretrained_model_name会联网下载\r\n  config_path = '.\u002Fmodel\u002Fbert4torch_config.json'  # 或'google-bert\u002Fbert-base-chinese'\r\n  checkpoint_path = '.\u002Fmodel\u002Fpytorch_model.bin'  # 或'google-bert\u002Fbert-base-chinese'\r\n  model = build_transformer_model(config_path, checkpoint_path)\r\n  ```\r\n\r\n### 5.2 权重链接\r\n\r\n\r\n|模型分类|模型名称|权重来源|checkpoint_path|config_path|\r\n|------------|------------|------------|------------|------------|\r\n|bert|bert-base-chinese|google-bert|`google-bert\u002Fbert-base-chinese` [🤗](https:\u002F\u002Fhuggingface.co\u002Fgoogle-bert\u002Fbert-base-chinese)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fgoogle-bert\u002Fbert-base-chinese\u002Fbert4torch_config.json)|\r\n||[chinese_L-12_H-768_A-12](https:\u002F\u002Fgithub.com\u002Fgoogle-research\u002Fbert)|谷歌|[tf权重](https:\u002F\u002Fstorage.googleapis.com\u002Fbert_models\u002F2018_11_03\u002Fchinese_L-12_H-768_A-12.zip)\u003Cbr\u002F>`Tongjilibo\u002Fbert-chinese_L-12_H-768_A-12` [🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert-chinese_L-12_H-768_A-12)||\r\n||[chinese-bert-wwm-ext](https:\u002F\u002Fgithub.com\u002Fymcui\u002FChinese-BERT-wwm)|HFL|`hfl\u002Fchinese-bert-wwm-ext` [🤗](https:\u002F\u002Fhuggingface.co\u002Fhfl\u002Fchinese-bert-wwm-ext)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fhfl\u002Fchinese-bert-wwm-ext\u002Fbert4torch_config.json)|\r\n||bert-base-multilingual-cased|google-bert|`google-bert\u002Fbert-base-multilingual-cased` [🤗](https:\u002F\u002Fhuggingface.co\u002Fgoogle-bert\u002Fbert-base-multilingual-cased)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fgoogle-bert\u002Fbert-base-multilingual-cased\u002Fbert4torch_config.json)|\r\n||bert-base-cased|google-bert|`google-bert\u002Fbert-base-cased` 
[🤗](https:\u002F\u002Fhuggingface.co\u002Fgoogle-bert\u002Fbert-base-cased)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fgoogle-bert\u002Fbert-base-cased\u002Fbert4torch_config.json)|\r\n||bert-base-uncased|google-bert|`google-bert\u002Fbert-base-uncased` [🤗](https:\u002F\u002Fhuggingface.co\u002Fgoogle-bert\u002Fbert-base-uncased)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fgoogle-bert\u002Fbert-base-uncased\u002Fbert4torch_config.json)|\r\n||[MacBERT](https:\u002F\u002Fgithub.com\u002Fymcui\u002FMacBERT)|HFL|`hfl\u002Fchinese-macbert-base` [🤗](https:\u002F\u002Fhuggingface.co\u002Fhfl\u002Fchinese-macbert-base)\u003Cbr\u002F>`hfl\u002Fchinese-macbert-large` [🤗](https:\u002F\u002Fhuggingface.co\u002Fhfl\u002Fchinese-macbert-large)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fhfl\u002Fchinese-macbert-base\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fhfl\u002Fchinese-macbert-large\u002Fbert4torch_config.json)|\r\n||[WoBERT](https:\u002F\u002Fgithub.com\u002FZhuiyiTechnology\u002FWoBERT)|追一科技|`junnyu\u002Fwobert_chinese_base` [🤗](https:\u002F\u002Fhuggingface.co\u002Fjunnyu\u002Fwobert_chinese_base)\u003Cbr\u002F>`junnyu\u002Fwobert_chinese_plus_base` [🤗](https:\u002F\u002Fhuggingface.co\u002Fjunnyu\u002Fwobert_chinese_plus_base)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fjunnyu\u002Fwobert_chinese_base\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fjunnyu\u002Fwobert_chinese_plus_base\u002Fbert4torch_config.json)|\r\n|roberta|[chinese-roberta-wwm-ext](https:\u002F\u002Fgithub.com\u002Fymcui\u002FChinese-BERT-wwm)|HFL|`hfl\u002Fchinese-roberta-wwm-ext` 
[🤗](https:\u002F\u002Fhuggingface.co\u002Fhfl\u002Fchinese-roberta-wwm-ext)\u003Cbr\u002F>`hfl\u002Fchinese-roberta-wwm-ext-large` [🤗](https:\u002F\u002Fhuggingface.co\u002Fhfl\u002Fchinese-roberta-wwm-ext-large)\u003Cbr\u002F>(large的mlm权重是随机初始化)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fhfl\u002Fchinese-roberta-wwm-ext\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fhfl\u002Fchinese-roberta-wwm-ext-large\u002Fbert4torch_config.json)|\r\n||[roberta-small\u002Ftiny](https:\u002F\u002Fgithub.com\u002FZhuiyiTechnology\u002Fpretrained-models)|追一科技|`Tongjilibo\u002Fchinese_roberta_L-4_H-312_A-12` [🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fchinese_roberta_L-4_H-312_A-12)\u003Cbr\u002F>`Tongjilibo\u002Fchinese_roberta_L-6_H-384_A-12` [🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fchinese_roberta_L-6_H-384_A-12)||\r\n||[roberta-base](https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Ffairseq\u002Ftree\u002Fmain\u002Fexamples\u002Froberta)|FacebookAI|`FacebookAI\u002Froberta-base` [🤗](https:\u002F\u002Fhuggingface.co\u002FFacebookAI\u002Froberta-base)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FFacebookAI\u002Froberta-base\u002Fbert4torch_config.json)|\r\n||[guwenbert](https:\u002F\u002Fgithub.com\u002FEthan-yt\u002Fguwenbert)|ethanyt|`ethanyt\u002Fguwenbert-base` [🤗](https:\u002F\u002Fhuggingface.co\u002Fethanyt\u002Fguwenbert-base)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fethanyt\u002Fguwenbert-base\u002Fbert4torch_config.json)|\r\n|albert|[albert_zh](https:\u002F\u002Fgithub.com\u002Fbrightmart\u002Falbert_zh)\u003Cbr\u002F>[albert_pytorch](https:\u002F\u002Fgithub.com\u002FlonePatient\u002Falbert_pytorch)|brightmart|`voidful\u002Falbert_chinese_tiny` 
[🤗](https:\u002F\u002Fhuggingface.co\u002Fvoidful\u002Falbert_chinese_tiny)\u003Cbr\u002F>`voidful\u002Falbert_chinese_small` [🤗](https:\u002F\u002Fhuggingface.co\u002Fvoidful\u002Falbert_chinese_small)\u003Cbr\u002F>`voidful\u002Falbert_chinese_base` [🤗](https:\u002F\u002Fhuggingface.co\u002Fvoidful\u002Falbert_chinese_base)\u003Cbr\u002F>`voidful\u002Falbert_chinese_large` [🤗](https:\u002F\u002Fhuggingface.co\u002Fvoidful\u002Falbert_chinese_large)\u003Cbr\u002F>`voidful\u002Falbert_chinese_xlarge` [🤗](https:\u002F\u002Fhuggingface.co\u002Fvoidful\u002Falbert_chinese_xlarge)\u003Cbr\u002F>`voidful\u002Falbert_chinese_xxlarge` [🤗](https:\u002F\u002Fhuggingface.co\u002Fvoidful\u002Falbert_chinese_xxlarge)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fvoidful\u002Falbert_chinese_tiny\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fvoidful\u002Falbert_chinese_small\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fvoidful\u002Falbert_chinese_base\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fvoidful\u002Falbert_chinese_large\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fvoidful\u002Falbert_chinese_xlarge\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fvoidful\u002Falbert_chinese_xxlarge\u002Fbert4torch_config.json)|\r\n|nezha|[NEZHA](https:\u002F\u002Fgithub.com\u002Fhuawei-noah\u002FPretrained-Language-Model\u002Ftree\u002Fmaster\u002FNEZHA-PyTorch)\u003Cbr\u002F>[NeZha_Chinese_PyTorch](https:\u002F\u002Fgithub.com\u002FlonePatient\u002FNeZ
ha_Chinese_PyTorch)|huawei_noah|`sijunhe\u002Fnezha-cn-base` [🤗](https:\u002F\u002Fhuggingface.co\u002Fsijunhe\u002Fnezha-cn-base)\u003Cbr\u002F>`sijunhe\u002Fnezha-cn-large` [🤗](https:\u002F\u002Fhuggingface.co\u002Fsijunhe\u002Fnezha-cn-large)\u003Cbr\u002F>`sijunhe\u002Fnezha-base-wwm` [🤗](https:\u002F\u002Fhuggingface.co\u002Fsijunhe\u002Fnezha-base-wwm)\u003Cbr\u002F>`sijunhe\u002Fnezha-large-wwm` [🤗](https:\u002F\u002Fhuggingface.co\u002Fsijunhe\u002Fnezha-large-wwm)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fsijunhe\u002Fnezha-cn-base\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fsijunhe\u002Fnezha-cn-large\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fsijunhe\u002Fnezha-base-wwm\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fsijunhe\u002Fnezha-large-wwm\u002Fbert4torch_config.json)|\r\n||[nezha_gpt_dialog](https:\u002F\u002Fgithub.com\u002Fbojone\u002Fnezha_gpt_dialog)|bojone|`Tongjilibo\u002Fnezha_gpt_dialog` [🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fnezha_gpt_dialog)||\r\n|xlnet|[Chinese-XLNet](https:\u002F\u002Fgithub.com\u002Fymcui\u002FChinese-XLNet)|HFL|`hfl\u002Fchinese-xlnet-base` [🤗](https:\u002F\u002Fhuggingface.co\u002Fhfl\u002Fchinese-xlnet-base)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fhfl\u002Fchinese-xlnet-base\u002Fbert4torch_config.json)|\r\n||[transformer_xl](https:\u002F\u002Fgithub.com\u002Fkimiyoung\u002Ftransformer-xl)|huggingface|`transfo-xl\u002Ftransfo-xl-wt103` 
[🤗](https:\u002F\u002Fhuggingface.co\u002Ftransfo-xl\u002Ftransfo-xl-wt103)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Ftransfo-xl\u002Ftransfo-xl-wt103\u002Fbert4torch_config.json)|\r\n|deberta|[Erlangshen-DeBERTa-v2](https:\u002F\u002Fgithub.com\u002FIDEA-CCNL\u002FFengshenbang-LM)|IDEA|`IDEA-CCNL\u002FErlangshen-DeBERTa-v2-97M-Chinese` [🤗](https:\u002F\u002Fhuggingface.co\u002FIDEA-CCNL\u002FErlangshen-DeBERTa-v2-97M-Chinese)\u003Cbr\u002F>`IDEA-CCNL\u002FErlangshen-DeBERTa-v2-320M-Chinese` [🤗](https:\u002F\u002Fhuggingface.co\u002FIDEA-CCNL\u002FErlangshen-DeBERTa-v2-320M-Chinese)\u003Cbr\u002F>`IDEA-CCNL\u002FErlangshen-DeBERTa-v2-710M-Chinese` [🤗](https:\u002F\u002Fhuggingface.co\u002FIDEA-CCNL\u002FErlangshen-DeBERTa-v2-710M-Chinese)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FIDEA-CCNL\u002FErlangshen-DeBERTa-v2-97M-Chinese\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FIDEA-CCNL\u002FErlangshen-DeBERTa-v2-320M-Chinese\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FIDEA-CCNL\u002FErlangshen-DeBERTa-v2-710M-Chinese\u002Fbert4torch_config.json)|\r\n|electra|[Chinese-ELECTRA](https:\u002F\u002Fgithub.com\u002Fymcui\u002FChinese-ELECTRA)|HFL|`hfl\u002Fchinese-electra-base-discriminator` [🤗](https:\u002F\u002Fhuggingface.co\u002Fhfl\u002Fchinese-electra-base-discriminator)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fhfl\u002Fchinese-electra-base-discriminator\u002Fbert4torch_config.json)|\r\n|ernie|[ernie](https:\u002F\u002Fgithub.com\u002FPaddlePaddle\u002FERNIE)|百度文心|`nghuyong\u002Fernie-1.0-base-zh` 
[🤗](https:\u002F\u002Fhuggingface.co\u002Fnghuyong\u002Fernie-1.0-base-zh)\u003Cbr\u002F>`nghuyong\u002Fernie-3.0-base-zh` [🤗](https:\u002F\u002Fhuggingface.co\u002Fnghuyong\u002Fernie-3.0-base-zh)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fnghuyong\u002Fernie-1.0-base-zh\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fnghuyong\u002Fernie-3.0-base-zh\u002Fbert4torch_config.json)|\r\n|roformer|[roformer](https:\u002F\u002Fgithub.com\u002FZhuiyiTechnology\u002Froformer)|追一科技|`junnyu\u002Froformer_chinese_base` [🤗](https:\u002F\u002Fhuggingface.co\u002Fjunnyu\u002Froformer_chinese_base)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fjunnyu\u002Froformer_chinese_base\u002Fbert4torch_config.json)|\r\n||[roformer_v2](https:\u002F\u002Fgithub.com\u002FZhuiyiTechnology\u002Froformer-v2)|追一科技|`junnyu\u002Froformer_v2_chinese_char_base` [🤗](https:\u002F\u002Fhuggingface.co\u002Fjunnyu\u002Froformer_v2_chinese_char_base)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fjunnyu\u002Froformer_v2_chinese_char_base\u002Fbert4torch_config.json)|\r\n|simbert|[simbert](https:\u002F\u002Fgithub.com\u002FZhuiyiTechnology\u002Fsimbert)|追一科技|`Tongjilibo\u002Fsimbert-chinese-base` [🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fsimbert-chinese-base)\u003Cbr\u002F>`Tongjilibo\u002Fsimbert-chinese-small` [🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fsimbert-chinese-small)\u003Cbr\u002F>`Tongjilibo\u002Fsimbert-chinese-tiny` [🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fsimbert-chinese-tiny)||\r\n||[simbert_v2\u002Froformer-sim](https:\u002F\u002Fgithub.com\u002FZhuiyiTechnology\u002Froformer-sim)|追一科技|`junnyu\u002Froformer_chinese_sim_char_base` 
[🤗](https:\u002F\u002Fhuggingface.co\u002Fjunnyu\u002Froformer_chinese_sim_char_base)\u003Cbr\u002F>`junnyu\u002Froformer_chinese_sim_char_ft_base` [🤗](https:\u002F\u002Fhuggingface.co\u002Fjunnyu\u002Froformer_chinese_sim_char_ft_base)\u003Cbr\u002F>`junnyu\u002Froformer_chinese_sim_char_small` [🤗](https:\u002F\u002Fhuggingface.co\u002Fjunnyu\u002Froformer_chinese_sim_char_small)\u003Cbr\u002F>`junnyu\u002Froformer_chinese_sim_char_ft_small` [🤗](https:\u002F\u002Fhuggingface.co\u002Fjunnyu\u002Froformer_chinese_sim_char_ft_small)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fjunnyu\u002Froformer_chinese_sim_char_base\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fjunnyu\u002Froformer_chinese_sim_char_ft_base\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fjunnyu\u002Froformer_chinese_sim_char_small\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fjunnyu\u002Froformer_chinese_sim_char_ft_small\u002Fbert4torch_config.json)|\r\n|gau|[GAU-alpha](https:\u002F\u002Fgithub.com\u002FZhuiyiTechnology\u002FGAU-alpha)|追一科技|`Tongjilibo\u002Fchinese_GAU-alpha-char_L-24_H-768` [🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fchinese_GAU-alpha-char_L-24_H-768)||\r\n|ModernBERT|[ModernBERT](https:\u002F\u002Fhuggingface.co\u002Fcollections\u002Fanswerdotai\u002Fmodernbert-67627ad707a4acbf33c41deb)|answerdotai|`answerdotai\u002FModernBERT-base` [🤗](https:\u002F\u002Fhuggingface.co\u002Fanswerdotai\u002FModernBERT-base)\u003Cbr\u002F>`answerdotai\u002FModernBERT-large` 
[🤗](https:\u002F\u002Fhuggingface.co\u002Fanswerdotai\u002FModernBERT-large)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fanswerdotai\u002FModernBERT-base\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fanswerdotai\u002FModernBERT-large\u002Fbert4torch_config.json)|\r\n|uie|[uie](https:\u002F\u002Fgithub.com\u002Funiversal-ie\u002FUIE)\u003Cbr\u002F>[uie_pytorch](https:\u002F\u002Fgithub.com\u002FHUSTAI\u002Fuie_pytorch)|百度|`Tongjilibo\u002Fuie-base` [🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fuie-base)||\r\n|gpt|[CDial-GPT](https:\u002F\u002Fgithub.com\u002Fthu-coai\u002FCDial-GPT)|thu-coai|`thu-coai\u002FCDial-GPT_LCCC-base` [🤗](https:\u002F\u002Fhuggingface.co\u002Fthu-coai\u002FCDial-GPT_LCCC-base)\u003Cbr\u002F>`thu-coai\u002FCDial-GPT_LCCC-large` [🤗](https:\u002F\u002Fhuggingface.co\u002Fthu-coai\u002FCDial-GPT_LCCC-large)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fthu-coai\u002FCDial-GPT_LCCC-base\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fthu-coai\u002FCDial-GPT_LCCC-large\u002Fbert4torch_config.json)|\r\n||[cpm_lm(26亿)](https:\u002F\u002Fgithub.com\u002FTsinghuaAI\u002FCPM-1-Generate)|清华|`TsinghuaAI\u002FCPM-Generate` [🤗](https:\u002F\u002Fhuggingface.co\u002FTsinghuaAI\u002FCPM-Generate)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FTsinghuaAI\u002FCPM-Generate\u002Fbert4torch_config.json)|\r\n||[nezha_gen](https:\u002F\u002Fgithub.com\u002Fhuawei-noah\u002FPretrained-Language-Model\u002Ftree\u002Fmaster\u002FNEZHA-Gen-TensorFlow)|huawei_noah|`Tongjilibo\u002Fchinese_nezha_gpt_L-12_H-768_A-12` 
[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fchinese_nezha_gpt_L-12_H-768_A-12)||\r\n||[gpt2-chinese-cluecorpussmall](https:\u002F\u002Fgithub.com\u002Fdbiir\u002FUER-py\u002Fwiki\u002FModelzoo)|UER|`uer\u002Fgpt2-chinese-cluecorpussmall` [🤗](https:\u002F\u002Fhuggingface.co\u002Fuer\u002Fgpt2-chinese-cluecorpussmall)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fuer\u002Fgpt2-chinese-cluecorpussmall\u002Fbert4torch_config.json)|\r\n||[gpt2-ml](https:\u002F\u002Fgithub.com\u002Fimcaspar\u002Fgpt2-ml)|imcaspar|`Tongjilibo\u002Fgpt2-ml_15g_corpus` [🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fgpt2-ml_15g_corpus)\u003Cbr\u002F>`Tongjilibo\u002Fgpt2-ml_30g_corpus` [🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fgpt2-ml_30g_corpus)\u003Cbr\u002F>[torch](https:\u002F\u002Fgithub.com\u002Fghosthamlet\u002Fgpt2-ml-torch),[BaiduYun(84dh)](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F16tL4Bmoh6jPy0cOND0YyeA)||\r\n|bart|[bart_base_chinese](https:\u002F\u002Fgithub.com\u002Ffastnlp\u002FCPT)|复旦fnlp|`fnlp\u002Fbart-base-chinese` [🤗](https:\u002F\u002Fhuggingface.co\u002Ffnlp\u002Fbart-base-chinese)\u003Cbr\u002F>[fnlp\u002Fbart-base-chinese-v1.0](https:\u002F\u002Fhuggingface.co\u002Ffnlp\u002Fbart-base-chinese\u002Ftree\u002Fv1.0)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Ffnlp\u002Fbart-base-chinese\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Ffnlp\u002Fbart-base-chinese-v1.0\u002Fbert4torch_config.json)|\r\n|t5|[t5](https:\u002F\u002Fgithub.com\u002Fdbiir\u002FUER-py\u002Fwiki\u002FModelzoo)|UER|`uer\u002Ft5-small-chinese-cluecorpussmall` [🤗](https:\u002F\u002Fhuggingface.co\u002Fuer\u002Ft5-small-chinese-cluecorpussmall)\u003Cbr\u002F>`uer\u002Ft5-base-chinese-cluecorpussmall` 
[🤗](https:\u002F\u002Fhuggingface.co\u002Fuer\u002Ft5-base-chinese-cluecorpussmall)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fuer\u002Ft5-base-chinese-cluecorpussmall\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fuer\u002Ft5-small-chinese-cluecorpussmall\u002Fbert4torch_config.json)|\r\n||mt5|谷歌|`google\u002Fmt5-base` [🤗](https:\u002F\u002Fhuggingface.co\u002Fgoogle\u002Fmt5-base)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fgoogle\u002Fmt5-base\u002Fbert4torch_config.json)|\r\n||[t5_pegasus](https:\u002F\u002Fgithub.com\u002FZhuiyiTechnology\u002Ft5-pegasus)|追一科技|`Tongjilibo\u002Fchinese_t5_pegasus_small` [🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fchinese_t5_pegasus_small)\u003Cbr\u002F>`Tongjilibo\u002Fchinese_t5_pegasus_base` [🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fchinese_t5_pegasus_base)||\r\n||[chatyuan](https:\u002F\u002Fgithub.com\u002Fclue-ai\u002FChatYuan)|clue-ai|`ClueAI\u002FChatYuan-large-v1` [🤗](https:\u002F\u002Fhuggingface.co\u002FClueAI\u002FChatYuan-large-v1)\u003Cbr\u002F>`ClueAI\u002FChatYuan-large-v2` [🤗](https:\u002F\u002Fhuggingface.co\u002FClueAI\u002FChatYuan-large-v2)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FClueAI\u002FChatYuan-large-v1\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FClueAI\u002FChatYuan-large-v2\u002Fbert4torch_config.json)|\r\n||[PromptCLUE](https:\u002F\u002Fgithub.com\u002Fclue-ai\u002FPromptCLUE)|clue-ai|`ClueAI\u002FPromptCLUE-base` 
[🤗](https:\u002F\u002Fhuggingface.co\u002FClueAI\u002FPromptCLUE-base)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FClueAI\u002FPromptCLUE-base\u002Fbert4torch_config.json)|\r\n|chatglm|[ChatGLM-6B](https:\u002F\u002Fgithub.com\u002Fzai-org\u002FChatGLM-6B)|zai-org|`zai-org\u002Fchatglm-6b` [🤗](https:\u002F\u002Fhuggingface.co\u002Fzai-org\u002Fchatglm-6b)\u003Cbr\u002F>`zai-org\u002Fchatglm-6b-int8` [🤗](https:\u002F\u002Fhuggingface.co\u002Fzai-org\u002Fchatglm-6b-int8)\u003Cbr\u002F>`zai-org\u002Fchatglm-6b-int4` [🤗](https:\u002F\u002Fhuggingface.co\u002Fzai-org\u002Fchatglm-6b-int4)\u003Cbr\u002F>`zai-org\u002Fchatglm-6b-v0.1.0`[🤗](https:\u002F\u002Fhuggingface.co\u002Fzai-org\u002Fchatglm-6b\u002Ftree\u002Fv0.1.0)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fzai-org\u002Fchatglm-6b\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fzai-org\u002Fchatglm-6b-int8\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fzai-org\u002Fchatglm-6b-int4\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fzai-org\u002Fchatglm-6b-v0.1.0\u002Fbert4torch_config.json)|\r\n||[ChatGLM2-6B](https:\u002F\u002Fgithub.com\u002Fzai-org\u002FChatGLM2-6B)|zai-org|`zai-org\u002Fchatglm2-6b` [🤗](https:\u002F\u002Fhuggingface.co\u002Fzai-org\u002Fchatglm2-6b)\u003Cbr\u002F>`zai-org\u002Fchatglm2-6b-int4` [🤗](https:\u002F\u002Fhuggingface.co\u002Fzai-org\u002Fchatglm2-6b-int4)\u003Cbr\u002F>`zai-org\u002Fchatglm2-6b-32k` 
[🤗](https:\u002F\u002Fhuggingface.co\u002Fzai-org\u002Fchatglm2-6b-32k)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fzai-org\u002Fchatglm2-6b\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fzai-org\u002Fchatglm2-6b-int4\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fzai-org\u002Fchatglm2-6b-32k\u002Fbert4torch_config.json)|\r\n||[ChatGLM3](https:\u002F\u002Fgithub.com\u002Fzai-org\u002FChatGLM3)|zai-org|`zai-org\u002Fchatglm3-6b` [🤗](https:\u002F\u002Fhuggingface.co\u002Fzai-org\u002Fchatglm3-6b)\u003Cbr\u002F>`zai-org\u002Fchatglm3-6b-32k` [🤗](https:\u002F\u002Fhuggingface.co\u002Fzai-org\u002Fchatglm3-6b-32k)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fzai-org\u002Fchatglm3-6b\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fzai-org\u002Fchatglm3-6b-32k\u002Fbert4torch_config.json)|\r\n||[GLM-4](https:\u002F\u002Fgithub.com\u002Fzai-org\u002FGLM-4)|zai-org|`zai-org\u002Fglm-4-9b` [🤗](https:\u002F\u002Fhuggingface.co\u002Fzai-org\u002Fglm-4-9b)\u003Cbr\u002F>`zai-org\u002Fglm-4-9b-chat` [🤗](https:\u002F\u002Fhuggingface.co\u002Fzai-org\u002Fglm-4-9b-chat)\u003Cbr\u002F>`zai-org\u002Fglm-4-9b-chat-1m` [🤗](https:\u002F\u002Fhuggingface.co\u002Fzai-org\u002Fglm-4-9b-chat-1m)\u003Cbr\u002F>`zai-org\u002Fglm-4v-9b` [🤗](https:\u002F\u002Fhuggingface.co\u002Fzai-org\u002Fglm-4v-9b)\u003Cbr\u002F>`zai-org\u002FGLM-4-9B-0414` [🤗](https:\u002F\u002Fhuggingface.co\u002Fzai-org\u002FGLM-4-9B-0414)\u003Cbr\u002F>`zai-org\u002FGLM-Z1-9B-0414` 
[🤗](https:\u002F\u002Fhuggingface.co\u002Fzai-org\u002FGLM-Z1-9B-0414)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fzai-org\u002Fglm-4-9b\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fzai-org\u002Fglm-4-9b-chat\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fzai-org\u002Fglm-4-9b-chat-1m\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fzai-org\u002Fglm-4v-9b\u002Fbert4torch_config.json)\u003Cbr\u002F>\u003Cbr\u002F>\u003Cbr\u002F>|\r\n|llama|[llama](https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Fllama)|meta|`meta-llama\u002Fllama-7b`\u003Cbr\u002F>`meta-llama\u002Fllama-13b`|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fmeta-llama\u002Fllama-7b\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fmeta-llama\u002Fllama-13b\u002Fbert4torch_config.json)|\r\n||[llama-2](https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Fllama)|meta|`meta-llama\u002FLlama-2-7b-hf`[🤗](https:\u002F\u002Fhuggingface.co\u002Fmeta-llama\u002FLlama-2-7b-hf)\u003Cbr\u002F>`meta-llama\u002FLlama-2-7b-chat-hf`[🤗](https:\u002F\u002Fhuggingface.co\u002Fmeta-llama\u002FLlama-2-7b-chat-hf)\u003Cbr\u002F>`meta-llama\u002FLlama-2-13b-hf`[🤗](https:\u002F\u002Fhuggingface.co\u002Fmeta-llama\u002FLlama-2-13b-hf)\u003Cbr\u002F>`meta-llama\u002FLlama-2-13b-chat-hf`[🤗](https:\u002F\u002Fhuggingface.co\u002Fmeta-llama\u002FLlama-2-13b-chat-hf)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fmeta-llama\u002FLlama-2-7b-hf\u002Fbert4torch_config.json)\u003Cbr\u002F
>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fmeta-llama\u002FLlama-2-7b-chat-hf\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fmeta-llama\u002FLlama-2-13b-hf\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fmeta-llama\u002FLlama-2-13b-chat-hf\u002Fbert4torch_config.json)|\r\n||[llama-3](https:\u002F\u002Fgithub.com\u002Fmeta-llama\u002Fllama3)|meta|`meta-llama\u002FMeta-Llama-3-8B` [🤗](https:\u002F\u002Fhuggingface.co\u002Fmeta-llama\u002FMeta-Llama-3-8B)\u003Cbr\u002F>`meta-llama\u002FMeta-Llama-3-8B-Instruct` [🤗](https:\u002F\u002Fhuggingface.co\u002Fmeta-llama\u002FMeta-Llama-3-8B-Instruct)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fmeta-llama\u002FMeta-Llama-3-8B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fmeta-llama\u002FMeta-Llama-3-8B-Instruct\u002Fbert4torch_config.json)|\r\n||[llama-3.1](https:\u002F\u002Fgithub.com\u002Fmeta-llama\u002Fllama-models)|meta|`meta-llama\u002FMeta-Llama-3.1-8B` [🤗](https:\u002F\u002Fhuggingface.co\u002Fmeta-llama\u002FMeta-Llama-3.1-8B)\u003Cbr\u002F>`meta-llama\u002FMeta-Llama-3.1-8B-Instruct` 
[🤗](https:\u002F\u002Fhuggingface.co\u002Fmeta-llama\u002FMeta-Llama-3.1-8B-Instruct)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fmeta-llama\u002FMeta-Llama-3.1-8B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fmeta-llama\u002FMeta-Llama-3.1-8B-Instruct\u002Fbert4torch_config.json)|\r\n||[llama-3.2](https:\u002F\u002Fgithub.com\u002Fmeta-llama\u002Fllama-models)|meta|`meta-llama\u002FLlama-3.2-1B` [🤗](https:\u002F\u002Fhuggingface.co\u002Fmeta-llama\u002FLlama-3.2-1B)\u003Cbr\u002F>`meta-llama\u002FLlama-3.2-1B-Instruct` [🤗](https:\u002F\u002Fhuggingface.co\u002Fmeta-llama\u002FLlama-3.2-1B-Instruct)\u003Cbr\u002F>`meta-llama\u002FLlama-3.2-3B` [🤗](https:\u002F\u002Fhuggingface.co\u002Fmeta-llama\u002FLlama-3.2-3B)\u003Cbr\u002F>`meta-llama\u002FLlama-3.2-3B-Instruct` [🤗](https:\u002F\u002Fhuggingface.co\u002Fmeta-llama\u002FLlama-3.2-3B-Instruct)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fmeta-llama\u002FLlama-3.2-1B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fmeta-llama\u002FLlama-3.2-1B-Instruct\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fmeta-llama\u002FLlama-3.2-3B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fmeta-llama\u002FLlama-3.2-3B-Instruct\u002Fbert4torch_config.json)|\r\n||[llama-3.2-vision](https:\u002F\u002Fgithub.com\u002Fmeta-llama\u002Fllama-models)|meta|`meta-llama\u002FLlama-3.2-11B-Vision` [🤗](https:\u002F\u002Fhuggingface.co\u002Fmeta-llama\u002FLlama-3.2-11B-Vision)\u003Cbr\u002F>`meta-llama\u002FLlama-3.2-11B-Vision-Instruct` 
[🤗](https:\u002F\u002Fhuggingface.co\u002Fmeta-llama\u002FLlama-3.2-11B-Vision-Instruct)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fmeta-llama\u002FLlama-3.2-11B-Vision\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fmeta-llama\u002FLlama-3.2-11B-Vision-Instruct\u002Fbert4torch_config.json)|\r\n|llama-series|[Chinese-LLaMA-Alpaca](https:\u002F\u002Fgithub.com\u002Fymcui\u002FChinese-LLaMA-Alpaca)|HFL|`hfl\u002Fchinese-alpaca-plus-lora-7b` [🤗](https:\u002F\u002Fhuggingface.co\u002Fhfl\u002Fchinese-alpaca-plus-lora-7b)\u003Cbr\u002F>`hfl\u002Fchinese-llama-plus-lora-7b` [🤗](https:\u002F\u002Fhuggingface.co\u002Fhfl\u002Fchinese-llama-plus-lora-7b)\u003Cbr\u002F>(LoRA weights must be merged before use)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fhfl\u002Fchinese-alpaca-plus-7b\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fhfl\u002Fchinese-llama-plus-7b\u002Fbert4torch_config.json)\u003Cbr\u002F>\u003Cbr\u002F>|\r\n||[Chinese-LLaMA-Alpaca-2](https:\u002F\u002Fgithub.com\u002Fymcui\u002FChinese-LLaMA-Alpaca-2)|HFL||To be added|\r\n||[Chinese-LLaMA-Alpaca-3](https:\u002F\u002Fgithub.com\u002Fymcui\u002FChinese-LLaMA-Alpaca-3)|HFL||To be added|\r\n||[Belle_llama](https:\u002F\u002Fgithub.com\u002FLianjiaTech\u002FBELLE)|LianjiaTech|`BelleGroup\u002FBELLE-LLaMA-7B-2M-enc`[🤗](https:\u002F\u002Fhuggingface.co\u002FBelleGroup\u002FBELLE-LLaMA-7B-2M-enc)|[Merge instructions](https:\u002F\u002Fgithub.com\u002FLianjiaTech\u002FBELLE\u002Ftree\u002Fmain\u002Fmodels), [🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FBelleGroup\u002FBELLE-LLaMA-7B-2M-enc)|\r\n||[Ziya](https:\u002F\u002Fgithub.com\u002FIDEA-CCNL\u002FFengshenbang-LM)|IDEA-CCNL|`IDEA-CCNL\u002FZiya-LLaMA-13B-v1`[🤗](htt
ps:\u002F\u002Fhuggingface.co\u002FIDEA-CCNL\u002FZiya-LLaMA-13B-v1)\u003Cbr\u002F>`IDEA-CCNL\u002FZiya-LLaMA-13B-v1.1`[🤗](https:\u002F\u002Fhuggingface.co\u002FIDEA-CCNL\u002FZiya-LLaMA-13B-v1.1)\u003Cbr\u002F>`IDEA-CCNL\u002FZiya-LLaMA-13B-Pretrain-v1`[🤗](https:\u002F\u002Fhuggingface.co\u002FIDEA-CCNL\u002FZiya-LLaMA-13B-Pretrain-v1)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FIDEA-CCNL\u002FZiya-LLaMA-13B-v1\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FIDEA-CCNL\u002FZiya-LLaMA-13B-v1.1\u002Fbert4torch_config.json)\u003Cbr\u002F>\u003Cbr\u002F>|\r\n||[vicuna](https:\u002F\u002Fgithub.com\u002Flm-sys\u002FFastChat)|lmsys|`lmsys\u002Fvicuna-7b-v1.5` [🤗](https:\u002F\u002Fhuggingface.co\u002Flmsys\u002Fvicuna-7b-v1.5)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Flmsys\u002Fvicuna-7b-v1.5\u002Fbert4torch_config.json)|\r\n|Baichuan|[Baichuan](https:\u002F\u002Fgithub.com\u002Fbaichuan-inc\u002FBaichuan)|baichuan-inc|`baichuan-inc\u002FBaichuan-7B` [🤗](https:\u002F\u002Fhuggingface.co\u002Fbaichuan-inc\u002FBaichuan-7B)\u003Cbr\u002F>`baichuan-inc\u002FBaichuan-13B-Base` [🤗](https:\u002F\u002Fhuggingface.co\u002Fbaichuan-inc\u002FBaichuan-13B-Base)\u003Cbr\u002F>`baichuan-inc\u002FBaichuan-13B-Chat` 
[🤗](https:\u002F\u002Fhuggingface.co\u002Fbaichuan-inc\u002FBaichuan-13B-Chat)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fbaichuan-inc\u002FBaichuan-7B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fbaichuan-inc\u002FBaichuan-13B-Base\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fbaichuan-inc\u002FBaichuan-13B-Chat\u002Fbert4torch_config.json)|\r\n||[Baichuan2](https:\u002F\u002Fgithub.com\u002Fbaichuan-inc\u002FBaichuan2)|baichuan-inc|`baichuan-inc\u002FBaichuan2-7B-Base` [🤗](https:\u002F\u002Fhuggingface.co\u002Fbaichuan-inc\u002FBaichuan2-7B-Base)\u003Cbr\u002F>`baichuan-inc\u002FBaichuan2-7B-Chat` [🤗](https:\u002F\u002Fhuggingface.co\u002Fbaichuan-inc\u002FBaichuan2-7B-Chat)\u003Cbr\u002F>`baichuan-inc\u002FBaichuan2-13B-Base` [🤗](https:\u002F\u002Fhuggingface.co\u002Fbaichuan-inc\u002FBaichuan2-13B-Base)\u003Cbr\u002F>`baichuan-inc\u002FBaichuan2-13B-Chat` 
[🤗](https:\u002F\u002Fhuggingface.co\u002Fbaichuan-inc\u002FBaichuan2-13B-Chat)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fbaichuan-inc\u002FBaichuan2-7B-Base\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fbaichuan-inc\u002FBaichuan2-7B-Chat\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fbaichuan-inc\u002FBaichuan2-13B-Base\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fbaichuan-inc\u002FBaichuan2-13B-Chat\u002Fbert4torch_config.json)|\r\n|Yi|[Yi](https:\u002F\u002Fgithub.com\u002F01-ai\u002FYi)|01-ai|`01-ai\u002FYi-6B` [🤗](https:\u002F\u002Fhuggingface.co\u002F01-ai\u002FYi-6B)\u003Cbr\u002F>`01-ai\u002FYi-6B-200K` [🤗](https:\u002F\u002Fhuggingface.co\u002F01-ai\u002FYi-6B-200K)\u003Cbr\u002F>`01-ai\u002FYi-9B` [🤗](https:\u002F\u002Fhuggingface.co\u002F01-ai\u002FYi-9B)\u003Cbr\u002F>`01-ai\u002FYi-9B-200K` [🤗](https:\u002F\u002Fhuggingface.co\u002F01-ai\u002FYi-9B-200K)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002F01-ai\u002FYi-6B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002F01-ai\u002FYi-6B-200K\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002F01-ai\u002FYi-9B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002F01-ai\u002FYi-9B-200K\u002Fbert4torch_config.json)|\r\n||[Yi-1.5](https:\u002F\u002Fgithub.com\u002F01-ai\u002FYi-1.5)|01-ai|`01-ai\u002FYi-1.5-6B` 
[🤗](https:\u002F\u002Fhuggingface.co\u002F01-ai\u002FYi-1.5-6B)\u003Cbr\u002F>`01-ai\u002FYi-1.5-6B-Chat` [🤗](https:\u002F\u002Fhuggingface.co\u002F01-ai\u002FYi-1.5-6B-Chat)\u003Cbr\u002F>`01-ai\u002FYi-1.5-9B` [🤗](https:\u002F\u002Fhuggingface.co\u002F01-ai\u002FYi-1.5-9B)\u003Cbr\u002F>`01-ai\u002FYi-1.5-9B-32K` [🤗](https:\u002F\u002Fhuggingface.co\u002F01-ai\u002FYi-1.5-9B-32K)\u003Cbr\u002F>`01-ai\u002FYi-1.5-9B-Chat` [🤗](https:\u002F\u002Fhuggingface.co\u002F01-ai\u002FYi-1.5-9B-Chat)\u003Cbr\u002F>`01-ai\u002FYi-1.5-9B-Chat-16K` [🤗](https:\u002F\u002Fhuggingface.co\u002F01-ai\u002FYi-1.5-9B-Chat-16K)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002F01-ai\u002FYi-1.5-6B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002F01-ai\u002FYi-1.5-6B-Chat\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002F01-ai\u002FYi-1.5-9B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002F01-ai\u002FYi-1.5-9B-32K\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002F01-ai\u002FYi-1.5-9B-Chat)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002F01-ai\u002FYi-1.5-9B-Chat-16K\u002Fbert4torch_config.json)|\r\n|bloom|[bloom](https:\u002F\u002Fgithub.com\u002Fbigscience-workshop\u002Fxmtf)|bigscience|`bigscience\u002Fbloom-560m` [🤗](https:\u002F\u002Fhuggingface.co\u002Fbigscience\u002Fbloom-560m)\u003Cbr\u002F>`bigscience\u002Fbloomz-560m` 
[🤗](https:\u002F\u002Fhuggingface.co\u002Fbigscience\u002Fbloomz-560m)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fbigscience\u002Fbloom-560m\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fbigscience\u002Fbloomz-560m\u002Fbert4torch_config.json)|\r\n|Qwen|[Qwen](https:\u002F\u002Fgithub.com\u002FQwenLM\u002FQwen)|阿里云|`Qwen\u002FQwen-1_8B` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen-1_8B)\u003Cbr\u002F>`Qwen\u002FQwen-1_8B-Chat` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen-1_8B-Chat)\u003Cbr\u002F>`Qwen\u002FQwen-7B` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen-7B)\u003Cbr\u002F>`Qwen\u002FQwen-7B-Chat` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen-7B-Chat)\u003Cbr\u002F>`Qwen\u002FQwen-14B` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen-14B)\u003Cbr\u002F>`Qwen\u002FQwen-14B-Chat` 
[🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen-14B-Chat)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen-1_8B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen-1_8B-Chat\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen-7B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen-7B-Chat\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen-14B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen-14B-Chat\u002Fbert4torch_config.json)|\r\n||[Qwen1.5](https:\u002F\u002Fgithub.com\u002FQwenLM\u002FQwen1.5)|阿里云|`Qwen\u002FQwen1.5-0.5B` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen1.5-0.5B)\u003Cbr\u002F>`Qwen\u002FQwen1.5-0.5B-Chat` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen1.5-0.5B-Chat)\u003Cbr\u002F>`Qwen\u002FQwen1.5-1.8B` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen1.5-1.8B)\u003Cbr\u002F>`Qwen\u002FQwen1.5-1.8B-Chat` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen1.5-1.8B-Chat)\u003Cbr\u002F>`Qwen\u002FQwen1.5-7B` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen1.5-7B)\u003Cbr\u002F>`Qwen\u002FQwen1.5-7B-Chat` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen1.5-7B-Chat)\u003Cbr\u002F>`Qwen\u002FQwen1.5-14B` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen1.5-14B)\u003Cbr\u002F>`Qwen\u002FQwen1.5-14B-Chat` 
[🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen1.5-14B-Chat)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen1.5-0.5B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen1.5-0.5B-Chat\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen1.5-1.8B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen1.5-1.8B-Chat\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen1.5-7B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen1.5-7B-Chat\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen1.5-14B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen1.5-14B-Chat\u002Fbert4torch_config.json)|\r\n||[Qwen2](https:\u002F\u002Fgithub.com\u002FQwenLM\u002FQwen2)|阿里云|`Qwen\u002FQwen2-0.5B` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen2-0.5B)\u003Cbr\u002F>`Qwen\u002FQwen2-0.5B-Instruct` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen2-0.5B-Instruct)\u003Cbr\u002F>`Qwen\u002FQwen2-1.5B` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen2-1.5B)\u003Cbr\u002F>`Qwen\u002FQwen2-1.5B-Instruct` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen2-1.5B-Instruct)\u003Cbr\u002F>`Qwen\u002FQwen2-7B` 
[🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen2-7B)\u003Cbr\u002F>`Qwen\u002FQwen2-7B-Instruct` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen2-7B-Instruct)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen2-0.5B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen2-0.5B-Instruct\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen2-1.5B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen2-1.5B-Instruct\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen2-7B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen2-7B-Instruct\u002Fbert4torch_config.json)|\r\n||[Qwen2-VL](https:\u002F\u002Fgithub.com\u002FQwenLM\u002FQwen2-VL)|阿里云|`Qwen\u002FQwen2-VL-2B-Instruct` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen2-VL-2B-Instruct)\u003Cbr\u002F>`Qwen\u002FQwen2-VL-7B-Instruct` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen2-VL-7B-Instruct)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen2-VL-2B-Instruct\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen2-VL-7B-Instruct\u002Fbert4torch_config.json)|\r\n||[Qwen2.5](https:\u002F\u002Fgithub.com\u002FQwenLM\u002FQwen2.5)|阿里云|`Qwen\u002FQwen2.5-0.5B` 
[🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen2.5-0.5B)\u003Cbr\u002F>`Qwen\u002FQwen2.5-0.5B-Instruct` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen2.5-0.5B-Instruct)\u003Cbr\u002F>`Qwen\u002FQwen2.5-1.5B` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen2.5-1.5B)\u003Cbr\u002F>`Qwen\u002FQwen2.5-1.5B-Instruct` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen2.5-1.5B-Instruct)\u003Cbr\u002F>`Qwen\u002FQwen2.5-3B` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen2.5-3B)\u003Cbr\u002F>`Qwen\u002FQwen2.5-3B-Instruct` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen2.5-3B-Instruct)\u003Cbr\u002F>`Qwen\u002FQwen2.5-7B` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen2.5-7B)\u003Cbr\u002F>`Qwen\u002FQwen2.5-7B-Instruct` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen2.5-7B-Instruct)\u003Cbr\u002F>`Qwen\u002FQwen2.5-14B` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen2.5-14B)\u003Cbr\u002F>`Qwen\u002FQwen2.5-14B-Instruct` 
[🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen2.5-14B-Instruct)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen2.5-0.5B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen2.5-0.5B-Instruct\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen2.5-1.5B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen2.5-1.5B-Instruct\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen2.5-3B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen2.5-3B-Instruct\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen2.5-7B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen2.5-7B-Instruct\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen2.5-14B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen2.5-14B-Instruct\u002Fbert4torch_config.json)|\r\n||[Qwen2.5-VL](https:\u002F\u002Fgithub.com\u002FQwenLM\u002FQwen2.5-VL)|阿里云|`Qwen\u002FQwen2.5-VL-3B-Instruct` 
[🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen2.5-VL-3B-Instruct)\u003Cbr\u002F>`Qwen\u002FQwen2.5-VL-7B-Instruct` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen2.5-VL-7B-Instruct)\u003Cbr\u002F>`Qwen\u002FQwen2.5-VL-32B-Instruct` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen2.5-VL-32B-Instruct)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen2.5-VL-3B-Instruct\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen2.5-VL-7B-Instruct\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen2.5-VL-32B-Instruct\u002Fbert4torch_config.json)|\r\n||[Qwen3](https:\u002F\u002Fgithub.com\u002FQwenLM\u002FQwen3)|阿里云|`Qwen\u002FQwen3-0.6B-Base` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-0.6B-Base)\u003Cbr\u002F>`Qwen\u002FQwen3-0.6B` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-0.6B)\u003Cbr\u002F>`Qwen\u002FQwen3-0.6B-GPTQ-Int8` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-0.6B-GPTQ-Int8)\u003Cbr\u002F>`Qwen\u002FQwen3-1.7B-Base` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-1.7B-Base)\u003Cbr\u002F>`Qwen\u002FQwen3-1.7B` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-1.7B)\u003Cbr\u002F>`Qwen\u002FQwen3-4B-Base` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-4B-Base)\u003Cbr\u002F>`Qwen\u002FQwen3-4B` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-4B)\u003Cbr\u002F>`Qwen\u002FQwen3-4B-AWQ` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-4B-AWQ)\u003Cbr\u002F>`Qwen\u002FQwen3-8B-Base` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-8B-Base)\u003Cbr\u002F>`Qwen\u002FQwen3-8B` 
[🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-8B)\u003Cbr\u002F>`Qwen\u002FQwen3-14B-Base` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-14B-Base)\u003Cbr\u002F>`Qwen\u002FQwen3-14B` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-14B)\u003Cbr\u002F>`Qwen\u002FQwen3-32B` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-32B)\u003Cbr\u002F>`Qwen\u002FQwen3-4B-Instruct-2507` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-4B-Instruct-2507)\u003Cbr\u002F>`Qwen\u002FQwen3-4B-Thinking-2507` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-4B-Thinking-2507)\u003Cbr\u002F>`Qwen\u002FQwen3-30B-A3B-Instruct-2507` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-30B-A3B-Instruct-2507)\u003Cbr\u002F>`Qwen\u002FQwen3-30B-A3B-Thinking-2507` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-30B-A3B-Thinking-2507)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen3-0.6B-Base\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen3-0.6B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen3-0.6B-GPTQ-Int8\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen3-1.7B-Base\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen3-1.7B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen3-4B-Base\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u0
02FQwen\u002FQwen3-4B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen3-4B-AWQ\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen3-8B-Base\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen3-8B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen3-14B-Base\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen3-14B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen3-32B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen3-4B-Instruct-2507\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen3-4B-Thinking-2507\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen3-30B-A3B-Instruct-2507\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen3-30B-A3B-Thinking-2507\u002Fbert4torch_config.json)|\r\n||[Qwen3-VL](https:\u002F\u002Fhuggingface.co\u002Fcollections\u002FQwen\u002Fqwen3-vl)|阿里云|`Qwen\u002FQwen3-VL-2B-Instruct` 
[🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-VL-2B-Instruct)\u003Cbr\u002F>`Qwen\u002FQwen3-VL-2B-Thinking` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-VL-2B-Thinking)\u003Cbr\u002F>`Qwen\u002FQwen3-VL-4B-Instruct` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-VL-4B-Instruct)\u003Cbr\u002F>`Qwen\u002FQwen3-VL-4B-Thinking` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-VL-4B-Thinking)\u003Cbr\u002F>`Qwen\u002FQwen3-VL-8B-Instruct` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-VL-8B-Instruct)\u003Cbr\u002F>`Qwen\u002FQwen3-VL-8B-Thinking` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-VL-8B-Thinking)\u003Cbr\u002F>`Qwen\u002FQwen3-VL-30B-A3B-Instruct` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-VL-30B-A3B-Instruct)\u003Cbr\u002F>`Qwen\u002FQwen3-VL-30B-A3B-Thinking` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-VL-30B-A3B-Thinking)\u003Cbr\u002F>`Qwen\u002FQwen3-VL-32B-Instruct` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-VL-32B-Instruct)\u003Cbr\u002F>`Qwen\u002FQwen3-VL-32B-Thinking` 
[🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-VL-32B-Thinking)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen3-VL-2B-Instruct\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen3-VL-2B-Thinking\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen3-VL-4B-Instruct\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen3-VL-4B-Thinking\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen3-VL-8B-Instruct\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen3-VL-8B-Thinking\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen3-VL-30B-A3B-Instruct\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen3-VL-30B-A3B-Thinking\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen3-VL-32B-Instruct\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen3-VL-32B-Thinking\u002Fbert4torch_config.json)|\r\n||[Qwen3-Embedding](https:\u002F\u002Fgithub.com\u002FQwenLM\u002FQwen3)|阿里云|`Qwen\u002FQwen3-Embedding-0.6B` 
[🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-Embedding-0.6B)\u003Cbr\u002F>`Qwen\u002FQwen3-Embedding-4B` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-Embedding-4B)\u003Cbr\u002F>`Qwen\u002FQwen3-Embedding-8B` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-Embedding-8B)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen3-Embedding-0.6B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen3-Embedding-4B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen3-Embedding-8B\u002Fbert4torch_config.json)|\r\n||[Qwen3-Reranker](https:\u002F\u002Fgithub.com\u002FQwenLM\u002FQwen3)|阿里云|`Qwen\u002FQwen3-Reranker-0.6B` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-Reranker-0.6B)\u003Cbr\u002F>`Qwen\u002FQwen3-Reranker-4B` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-Reranker-4B)\u003Cbr\u002F>`Qwen\u002FQwen3-Reranker-8B` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-Reranker-8B)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Ftree\u002Fmain\u002FQwen\u002FQwen3-Reranker-0.6B)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Ftree\u002Fmain\u002FQwen\u002FQwen3-Reranker-4B)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Ftree\u002Fmain\u002FQwen\u002FQwen3-Reranker-8B)|\r\n|Intern|[InternLM](https:\u002F\u002Fgithub.com\u002FInternLM\u002FInternLM)|上海人工智能实验室|`internlm\u002Finternlm-7b` [🤗](https:\u002F\u002Fhuggingface.co\u002Finternlm\u002Finternlm-7b)\u003Cbr\u002F>`internlm\u002Finternlm-chat-7b` 
[🤗](https:\u002F\u002Fhuggingface.co\u002Finternlm\u002Finternlm-chat-7b)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Finternlm\u002Finternlm-7b\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Finternlm\u002Finternlm-chat-7b\u002Fbert4torch_config.json)|\r\n||[InternLM2](https:\u002F\u002Fhuggingface.co\u002Fcollections\u002Finternlm\u002Finternlm2-65b0ce04970888799707893c)|上海人工智能实验室|`internlm\u002Finternlm2-1_8b` [🤗](https:\u002F\u002Fhuggingface.co\u002Finternlm\u002Finternlm2-1_8b)\u003Cbr\u002F>`internlm\u002Finternlm2-chat-1_8b` [🤗](https:\u002F\u002Fhuggingface.co\u002Finternlm\u002Finternlm2-chat-1_8b)\u003Cbr\u002F>`internlm\u002Finternlm2-7b` [🤗](https:\u002F\u002Fhuggingface.co\u002Finternlm\u002Finternlm2-7b)\u003Cbr\u002F>`internlm\u002Finternlm2-chat-7b` [🤗](https:\u002F\u002Fhuggingface.co\u002Finternlm\u002Finternlm2-chat-7b)\u003Cbr\u002F>`internlm\u002Finternlm2-20b` [🤗](https:\u002F\u002Fhuggingface.co\u002Finternlm\u002Finternlm2-20b)\u003Cbr\u002F>`internlm\u002Finternlm2-chat-20b` 
[🤗](https:\u002F\u002Fhuggingface.co\u002Finternlm\u002Finternlm2-chat-20b)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Finternlm\u002Finternlm2-1_8b\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Finternlm\u002Finternlm2-chat-1_8b\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Finternlm\u002Finternlm2-7b\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Finternlm\u002Finternlm2-chat-7b\u002Fbert4torch_config.json)\u003Cbr\u002F>\u003Cbr\u002F>\u003Cbr\u002F>|\r\n||[InternLM2.5](https:\u002F\u002Fhuggingface.co\u002Fcollections\u002Finternlm\u002Finternlm25-66853f32717072d17581bc13)|上海人工智能实验室|`internlm\u002Finternlm2_5-7b` [🤗](https:\u002F\u002Fhuggingface.co\u002Finternlm\u002Finternlm2_5-7b)\u003Cbr\u002F>`internlm\u002Finternlm2_5-7b-chat` [🤗](https:\u002F\u002Fhuggingface.co\u002Finternlm\u002Finternlm2_5-7b-chat)\u003Cbr\u002F>`internlm\u002Finternlm2_5-7b-chat-1m` [🤗](https:\u002F\u002Fhuggingface.co\u002Finternlm\u002Finternlm2_5-7b-chat-1m)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Finternlm\u002Finternlm2_5-7b\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Finternlm\u002Finternlm2_5-7b-chat\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Finternlm\u002Finternlm2_5-7b-chat-1m\u002Fbert4torch_config.json)|\r\n||[InternLM3](https:\u002F\u002Fhuggingface.co\u002Fcollections\u002Finternlm\u002Finternlm3-67875827c377690c01a9131d)|上海人工智能实验室|`internlm\u002Finternlm3-8b-instruct` 
[🤗](https:\u002F\u002Fhuggingface.co\u002Finternlm\u002Finternlm3-8b-instruct)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Finternlm\u002Finternlm3-8b-instruct\u002Fbert4torch_config.json)|\r\n||[InternVL1.0-1.5](https:\u002F\u002Fgithub.com\u002FOpenGVLab\u002FInternVL)|上海人工智能实验室|`OpenGVLab\u002FMini-InternVL-Chat-4B-V1-5` [🤗](https:\u002F\u002Fhuggingface.co\u002FOpenGVLab\u002FMini-InternVL-Chat-4B-V1-5)\u003Cbr\u002F>`OpenGVLab\u002FMini-InternVL-Chat-2B-V1-5` [🤗](https:\u002F\u002Fhuggingface.co\u002FOpenGVLab\u002FMini-InternVL-Chat-2B-V1-5)|待添加|\r\n||[InternVL2.0](https:\u002F\u002Fgithub.com\u002FOpenGVLab\u002FInternVL)|上海人工智能实验室|`OpenGVLab\u002FInternVL2-1B` [🤗](https:\u002F\u002Fhuggingface.co\u002FOpenGVLab\u002FInternVL2-1B)\u003Cbr\u002F>`OpenGVLab\u002FInternVL2-2B` [🤗](https:\u002F\u002Fhuggingface.co\u002FOpenGVLab\u002FInternVL2-2B)\u003Cbr\u002F>`OpenGVLab\u002FInternVL2-4B` [🤗](https:\u002F\u002Fhuggingface.co\u002FOpenGVLab\u002FInternVL2-4B)\u003Cbr\u002F>`OpenGVLab\u002FInternVL2-8B` [🤗](https:\u002F\u002Fhuggingface.co\u002FOpenGVLab\u002FInternVL2-8B)|待添加|\r\n||[InternVL2.5](https:\u002F\u002Fgithub.com\u002FOpenGVLab\u002FInternVL)|上海人工智能实验室|`OpenGVLab\u002FInternVL2_5-1B` [🤗](https:\u002F\u002Fhuggingface.co\u002FOpenGVLab\u002FInternVL2_5-1B)\u003Cbr\u002F>`OpenGVLab\u002FInternVL2_5-2B` [🤗](https:\u002F\u002Fhuggingface.co\u002FOpenGVLab\u002FInternVL2_5-2B)\u003Cbr\u002F>`OpenGVLab\u002FInternVL2_5-4B` [🤗](https:\u002F\u002Fhuggingface.co\u002FOpenGVLab\u002FInternVL2_5-4B)\u003Cbr\u002F>`OpenGVLab\u002FInternVL2_5-8B` 
[🤗](https:\u002F\u002Fhuggingface.co\u002FOpenGVLab\u002FInternVL2_5-8B)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FOpenGVLab\u002FInternVL2_5-1B\u002Fbert4torch_config.json)\u003Cbr\u002F>待添加\u003Cbr\u002F>待添加\u003Cbr\u002F>待添加|\r\n|Falcon|[Falcon](https:\u002F\u002Fhuggingface.co\u002Ftiiuae)|tiiuae|`tiiuae\u002Ffalcon-rw-1b` [🤗](https:\u002F\u002Fhuggingface.co\u002Ftiiuae\u002Ffalcon-rw-1b)\u003Cbr\u002F>`tiiuae\u002Ffalcon-7b` [🤗](https:\u002F\u002Fhuggingface.co\u002Ftiiuae\u002Ffalcon-7b)\u003Cbr\u002F>`tiiuae\u002Ffalcon-7b-instruct` [🤗](https:\u002F\u002Fhuggingface.co\u002Ftiiuae\u002Ffalcon-7b-instruct)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Ftiiuae\u002Ffalcon-rw-1b\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Ftiiuae\u002Ffalcon-7b\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Ftiiuae\u002Ffalcon-7b-instruct\u002Fbert4torch_config.json)|\r\n|DeepSeek|[DeepSeek-MoE](https:\u002F\u002Fgithub.com\u002Fdeepseek-ai\u002FDeepSeek-MoE)|深度求索|`deepseek-ai\u002Fdeepseek-moe-16b-base` [🤗](https:\u002F\u002Fhuggingface.co\u002Fdeepseek-ai\u002Fdeepseek-moe-16b-base)\u003Cbr\u002F>`deepseek-ai\u002Fdeepseek-moe-16b-chat` 
[🤗](https:\u002F\u002Fhuggingface.co\u002Fdeepseek-ai\u002Fdeepseek-moe-16b-chat)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fdeepseek-ai\u002Fdeepseek-moe-16b-base\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fdeepseek-ai\u002Fdeepseek-moe-16b-chat\u002Fbert4torch_config.json)|\r\n||[DeepSeek-LLM](https:\u002F\u002Fgithub.com\u002Fdeepseek-ai\u002FDeepSeek-LLM)|深度求索|`deepseek-ai\u002Fdeepseek-llm-7b-base` [🤗](https:\u002F\u002Fhuggingface.co\u002Fdeepseek-ai\u002Fdeepseek-llm-7b-base)\u003Cbr\u002F>`deepseek-ai\u002Fdeepseek-llm-7b-chat` [🤗](https:\u002F\u002Fhuggingface.co\u002Fdeepseek-ai\u002Fdeepseek-llm-7b-chat)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fdeepseek-ai\u002Fdeepseek-llm-7b-base\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fdeepseek-ai\u002Fdeepseek-llm-7b-chat\u002Fbert4torch_config.json)|\r\n||[DeepSeek-V2](https:\u002F\u002Fgithub.com\u002Fdeepseek-ai\u002FDeepSeek-V2)|深度求索|`deepseek-ai\u002FDeepSeek-V2-Lite` [🤗](https:\u002F\u002Fhuggingface.co\u002Fdeepseek-ai\u002FDeepSeek-V2-Lite)\u003Cbr\u002F>`deepseek-ai\u002FDeepSeek-V2-Lite-Chat` [🤗](https:\u002F\u002Fhuggingface.co\u002Fdeepseek-ai\u002FDeepSeek-V2-Lite-Chat)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fdeepseek-ai\u002FDeepSeek-V2-Lite\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fdeepseek-ai\u002FDeepSeek-V2-Lite-Chat\u002Fbert4torch_config.json)|\r\n||[DeepSeek-Coder](https:\u002F\u002Fgithub.com\u002Fdeepseek-ai\u002FDeepSeek-Coder)|深度求索|`deepseek-ai\u002Fdeepseek-coder-1.3b-base` 
[🤗](https:\u002F\u002Fhuggingface.co\u002Fdeepseek-ai\u002Fdeepseek-coder-1.3b-base)\u003Cbr\u002F>`deepseek-ai\u002Fdeepseek-coder-1.3b-instruct` [🤗](https:\u002F\u002Fhuggingface.co\u002Fdeepseek-ai\u002Fdeepseek-coder-1.3b-instruct)\u003Cbr\u002F>`deepseek-ai\u002Fdeepseek-coder-6.7b-base` [🤗](https:\u002F\u002Fhuggingface.co\u002Fdeepseek-ai\u002Fdeepseek-coder-6.7b-base)\u003Cbr\u002F>`deepseek-ai\u002Fdeepseek-coder-6.7b-instruct` [🤗](https:\u002F\u002Fhuggingface.co\u002Fdeepseek-ai\u002Fdeepseek-coder-6.7b-instruct)\u003Cbr\u002F>`deepseek-ai\u002Fdeepseek-coder-7b-base-v1.5` [🤗](https:\u002F\u002Fhuggingface.co\u002Fdeepseek-ai\u002Fdeepseek-coder-7b-base-v1.5)\u003Cbr\u002F>`deepseek-ai\u002Fdeepseek-coder-7b-instruct-v1.5` [🤗](https:\u002F\u002Fhuggingface.co\u002Fdeepseek-ai\u002Fdeepseek-coder-7b-instruct-v1.5)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fdeepseek-ai\u002Fdeepseek-coder-1.3b-base\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fdeepseek-ai\u002Fdeepseek-coder-1.3b-instruct\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fdeepseek-ai\u002Fdeepseek-coder-6.7b-base\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fdeepseek-ai\u002Fdeepseek-coder-6.7b-instruct\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fdeepseek-ai\u002Fdeepseek-coder-7b-base-v1.5\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fdeepseek-ai\u002Fdeepseek-coder-7b-instruct-v1.5\u002Fbert4torch_config.json)|\r\n||[DeepSeek-Coder-V2](https:\u002F\u002F
github.com\u002Fdeepseek-ai\u002FDeepSeek-Coder-V2)|深度求索|`deepseek-ai\u002FDeepSeek-Coder-V2-Lite-Base` [🤗](https:\u002F\u002Fhuggingface.co\u002Fdeepseek-ai\u002FDeepSeek-Coder-V2-Lite-Base)\u003Cbr\u002F>`deepseek-ai\u002FDeepSeek-Coder-V2-Lite-Instruct` [🤗](https:\u002F\u002Fhuggingface.co\u002Fdeepseek-ai\u002FDeepSeek-Coder-V2-Lite-Instruct)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fdeepseek-ai\u002FDeepSeek-Coder-V2-Lite-Base\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fdeepseek-ai\u002FDeepSeek-Coder-V2-Lite-Instruct\u002Fbert4torch_config.json)|\r\n||[DeepSeek-Math](https:\u002F\u002Fgithub.com\u002Fdeepseek-ai\u002FDeepSeek-Math)|深度求索|`deepseek-ai\u002Fdeepseek-math-7b-base` [🤗](https:\u002F\u002Fhuggingface.co\u002Fdeepseek-ai\u002Fdeepseek-math-7b-base)\u003Cbr\u002F>`deepseek-ai\u002Fdeepseek-math-7b-instruct` [🤗](https:\u002F\u002Fhuggingface.co\u002Fdeepseek-ai\u002Fdeepseek-math-7b-instruct)\u003Cbr\u002F>`deepseek-ai\u002Fdeepseek-math-7b-rl` [🤗](https:\u002F\u002Fhuggingface.co\u002Fdeepseek-ai\u002Fdeepseek-math-7b-rl)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fdeepseek-ai\u002Fdeepseek-math-7b-base\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fdeepseek-ai\u002Fdeepseek-math-7b-instruct\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fdeepseek-ai\u002Fdeepseek-math-7b-rl\u002Fbert4torch_config.json)|\r\n||[DeepSeek-R1](https:\u002F\u002Fhuggingface.co\u002Fcollections\u002Fdeepseek-ai\u002Fdeepseek-r1-678e1e131c0169c0bc89728d)|深度求索|`deepseek-ai\u002FDeepSeek-R1-Distill-Qwen-1.5B` 
[🤗](https:\u002F\u002Fhuggingface.co\u002Fdeepseek-ai\u002FDeepSeek-R1-Distill-Qwen-1.5B)\u003Cbr\u002F>`deepseek-ai\u002FDeepSeek-R1-Distill-Qwen-7B` [🤗](https:\u002F\u002Fhuggingface.co\u002Fdeepseek-ai\u002FDeepSeek-R1-Distill-Qwen-7B)\u003Cbr\u002F>`deepseek-ai\u002FDeepSeek-R1-Distill-Llama-8B` [🤗](https:\u002F\u002Fhuggingface.co\u002Fdeepseek-ai\u002FDeepSeek-R1-Distill-Llama-8B)\u003Cbr\u002F>`deepseek-ai\u002FDeepSeek-R1-Distill-Qwen-14B` [🤗](https:\u002F\u002Fhuggingface.co\u002Fdeepseek-ai\u002FDeepSeek-R1-Distill-Qwen-14B)\u003Cbr\u002F>`deepseek-ai\u002FDeepSeek-R1-Distill-Qwen-32B` [🤗](https:\u002F\u002Fhuggingface.co\u002Fdeepseek-ai\u002FDeepSeek-R1-Distill-Qwen-32B)\u003Cbr\u002F>`deepseek-ai\u002FDeepSeek-R1-0528-Qwen3-8B` [🤗](https:\u002F\u002Fhuggingface.co\u002Fdeepseek-ai\u002FDeepSeek-R1-0528-Qwen3-8B)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fdeepseek-ai\u002FDeepSeek-R1-Distill-Qwen-1.5B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fdeepseek-ai\u002FDeepSeek-R1-Distill-Qwen-7B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fdeepseek-ai\u002FDeepSeek-R1-Distill-Llama-8B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fdeepseek-ai\u002FDeepSeek-R1-Distill-Qwen-14B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fdeepseek-ai\u002FDeepSeek-R1-Distill-Qwen-32B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fdeepseek-ai\u002FDeepSeek-R1-0528-Qwen3-8B\u002Fbert4torch_config.json)|\r\n|Seed-OSS|[Seed-OSS](https:\u002F\u0
02Fhuggingface.co\u002Fcollections\u002FByteDance-Seed\u002Fseed-oss-68a609f4201e788db05b5dcd)|ByteDance|`ByteDance-Seed\u002FSeed-OSS-36B-Instruct` [🤗](https:\u002F\u002Fhuggingface.co\u002FByteDance-Seed\u002FSeed-OSS-36B-Instruct)\u003Cbr\u002F>`ByteDance-Seed\u002FSeed-OSS-36B-Base` [🤗](https:\u002F\u002Fhuggingface.co\u002FByteDance-Seed\u002FSeed-OSS-36B-Base)\u003Cbr\u002F>`ByteDance-Seed\u002FSeed-OSS-36B-Base-woSyn` [🤗](https:\u002F\u002Fhuggingface.co\u002FByteDance-Seed\u002FSeed-OSS-36B-Base-woSyn)||\r\n|Ernie4_5|[Ernie4_5](https:\u002F\u002Fhuggingface.co\u002Fcollections\u002Fbaidu\u002Fernie-45-6861cd4c9be84540645f35c9)|百度|`baidu\u002FERNIE-4.5-0.3B-Base-PT` [🤗](https:\u002F\u002Fhuggingface.co\u002Fbaidu\u002FERNIE-4.5-0.3B-Base-PT)\u003Cbr\u002F>`baidu\u002FERNIE-4.5-0.3B-PT` [🤗](https:\u002F\u002Fhuggingface.co\u002Fbaidu\u002FERNIE-4.5-0.3B-PT)\u003Cbr\u002F>`baidu\u002FERNIE-4.5-21B-A3B-Base-PT` [🤗](https:\u002F\u002Fhuggingface.co\u002Fbaidu\u002FERNIE-4.5-21B-A3B-Base-PT)\u003Cbr\u002F>`baidu\u002FERNIE-4.5-21B-A3B-PT` [🤗](https:\u002F\u002Fhuggingface.co\u002Fbaidu\u002FERNIE-4.5-21B-A3B-PT)\u003Cbr\u002F>`baidu\u002FERNIE-4.5-VL-28B-A3B-Base-PT` [🤗](https:\u002F\u002Fhuggingface.co\u002Fbaidu\u002FERNIE-4.5-VL-28B-A3B-Base-PT)\u003Cbr\u002F>`baidu\u002FERNIE-4.5-VL-28B-A3B-PT` [🤗](https:\u002F\u002Fhuggingface.co\u002Fbaidu\u002FERNIE-4.5-VL-28B-A3B-PT)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fbaidu\u002FERNIE-4.5-0.3B-Base-PT\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fbaidu\u002FERNIE-4.5-0.3B-PT\u002Fbert4torch_config.json)|\r\n|PaddleOCR|[PaddleOCR-VL](https:\u002F\u002Fhuggingface.co\u002FPaddlePaddle\u002FPaddleOCR-VL)|百度|`PaddlePaddle\u002FPaddleOCR-VL` 
[🤗](https:\u002F\u002Fhuggingface.co\u002FPaddlePaddle\u002FPaddleOCR-VL)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FPaddlePaddle\u002FPaddleOCR-VL\u002Fbert4torch_config.json)|\r\n||[PaddleOCR-VL-1.5](https:\u002F\u002Fhuggingface.co\u002FPaddlePaddle\u002FPaddleOCR-VL-1.5)|百度|`PaddlePaddle\u002FPaddleOCR-VL-1.5` [🤗](https:\u002F\u002Fhuggingface.co\u002FPaddlePaddle\u002FPaddleOCR-VL-1.5)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FPaddlePaddle\u002FPaddleOCR-VL-1.5\u002Fbert4torch_config.json)|\r\n|MiniCPM|[MiniCPM](https:\u002F\u002Fgithub.com\u002FOpenBMB\u002FMiniCPM)|OpenBMB|`openbmb\u002FMiniCPM-2B-sft-bf16` [🤗](https:\u002F\u002Fhuggingface.co\u002Fopenbmb\u002FMiniCPM-2B-sft-bf16)\u003Cbr\u002F>`openbmb\u002FMiniCPM-2B-dpo-bf16` [🤗](https:\u002F\u002Fhuggingface.co\u002Fopenbmb\u002FMiniCPM-2B-dpo-bf16)\u003Cbr\u002F>`openbmb\u002FMiniCPM-2B-128k` [🤗](https:\u002F\u002Fhuggingface.co\u002Fopenbmb\u002FMiniCPM-2B-128k)\u003Cbr\u002F>`openbmb\u002FMiniCPM-1B-sft-bf16` [🤗](https:\u002F\u002Fhuggingface.co\u002Fopenbmb\u002FMiniCPM-1B-sft-bf16)\u003Cbr\u002F>`openbmb\u002FMiniCPM3-4B` [🤗](https:\u002F\u002Fhuggingface.co\u002Fopenbmb\u002FMiniCPM3-4B)\u003Cbr\u002F>`openbmb\u002FMiniCPM4-0.5B` [🤗](https:\u002F\u002Fhuggingface.co\u002Fopenbmb\u002FMiniCPM4-0.5B)\u003Cbr\u002F>`openbmb\u002FMiniCPM4-8B` 
[🤗](https:\u002F\u002Fhuggingface.co\u002Fopenbmb\u002FMiniCPM4-8B)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fopenbmb\u002FMiniCPM-2B-sft-bf16\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fopenbmb\u002FMiniCPM-2B-dpo-bf16\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fopenbmb\u002FMiniCPM-2B-128k\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fopenbmb\u002FMiniCPM-1B-sft-bf16\u002Fbert4torch_config.json)\u003Cbr\u002F>待添加\u003Cbr\u002F>待添加\u003Cbr\u002F>待添加|\r\n||[MiniCPM-o](https:\u002F\u002Fgithub.com\u002FOpenBMB\u002FMiniCPM-o)|OpenBMB|`openbmb\u002FMiniCPM-Llama3-V-2_5` [🤗](https:\u002F\u002Fhuggingface.co\u002Fopenbmb\u002FMiniCPM-Llama3-V-2_5)\u003Cbr\u002F>`openbmb\u002FMiniCPM-V-2_6` [🤗](https:\u002F\u002Fhuggingface.co\u002Fopenbmb\u002FMiniCPM-V-2_6)\u003Cbr\u002F>`openbmb\u002FMiniCPM-o-2_6` [🤗](https:\u002F\u002Fhuggingface.co\u002Fopenbmb\u002FMiniCPM-o-2_6)\u003Cbr\u002F>`openbmb\u002FMiniCPM-V-4` [🤗](https:\u002F\u002Fhuggingface.co\u002Fopenbmb\u002FMiniCPM-V-4)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fopenbmb\u002FMiniCPM-Llama3-V-2_5\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fopenbmb\u002FMiniCPM-V-2_6\u002Fbert4torch_config.json)\u003Cbr\u002F>待添加\u003Cbr\u002F>待添加|\r\n|embedding|[text2vec-base-chinese](https:\u002F\u002Fgithub.com\u002Fshibing624\u002Ftext2vec)|shibing624|`shibing624\u002Ftext2vec-base-chinese` 
[🤗](https:\u002F\u002Fhuggingface.co\u002Fshibing624\u002Ftext2vec-base-chinese)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fshibing624\u002Ftext2vec-base-chinese\u002Fbert4torch_config.json)|\r\n||[m3e](https:\u002F\u002Fgithub.com\u002Fwangyuxinwhy\u002Funiem)|moka-ai|`moka-ai\u002Fm3e-base` [🤗](https:\u002F\u002Fhuggingface.co\u002Fmoka-ai\u002Fm3e-base)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fmoka-ai\u002Fm3e-base\u002Fbert4torch_config.json)|\r\n||bge|BAAI|`BAAI\u002Fbge-large-en-v1.5` [🤗](https:\u002F\u002Fhuggingface.co\u002FBAAI\u002Fbge-large-en-v1.5)\u003Cbr\u002F>`BAAI\u002Fbge-large-zh-v1.5` [🤗](https:\u002F\u002Fhuggingface.co\u002FBAAI\u002Fbge-large-zh-v1.5)\u003Cbr\u002F>`BAAI\u002Fbge-base-en-v1.5` [🤗](https:\u002F\u002Fhuggingface.co\u002FBAAI\u002Fbge-base-en-v1.5)\u003Cbr\u002F>`BAAI\u002Fbge-base-zh-v1.5` [🤗](https:\u002F\u002Fhuggingface.co\u002FBAAI\u002Fbge-base-zh-v1.5)\u003Cbr\u002F>`BAAI\u002Fbge-small-en-v1.5` [🤗](https:\u002F\u002Fhuggingface.co\u002FBAAI\u002Fbge-small-en-v1.5)\u003Cbr\u002F>`BAAI\u002Fbge-small-zh-v1.5` 
[🤗](https:\u002F\u002Fhuggingface.co\u002FBAAI\u002Fbge-small-zh-v1.5)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FBAAI\u002Fbge-large-en-v1.5\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FBAAI\u002Fbge-large-zh-v1.5\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FBAAI\u002Fbge-base-en-v1.5\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FBAAI\u002Fbge-base-zh-v1.5\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FBAAI\u002Fbge-small-en-v1.5\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FBAAI\u002Fbge-small-zh-v1.5\u002Fbert4torch_config.json)|\r\n||gte|thenlper|`thenlper\u002Fgte-large-zh` [🤗](https:\u002F\u002Fhuggingface.co\u002Fthenlper\u002Fgte-large-zh)\u003Cbr\u002F>`thenlper\u002Fgte-base-zh` [🤗](https:\u002F\u002Fhuggingface.co\u002Fthenlper\u002Fgte-base-zh)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fthenlper\u002Fgte-large-zh\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fthenlper\u002Fgte-base-zh\u002Fbert4torch_config.json)|\r\n\r\n*Notes:\r\n\r\n1. Model names shown in `code format` (e.g. `bert-base-chinese`) can be downloaded directly over the network by `build_transformer_model()`\r\n2. 
Speed up downloads through a mirror site in mainland China\r\n\r\n   - `HF_ENDPOINT=https:\u002F\u002Fhf-mirror.com python your_script.py`\r\n   - Run `export HF_ENDPOINT=https:\u002F\u002Fhf-mirror.com`, then run the Python code\r\n   - Set it at the top of the Python code as follows\r\n\r\n   ```python\r\n   import os\r\n   os.environ['HF_ENDPOINT'] = \"https:\u002F\u002Fhf-mirror.com\"\r\n   ```\r\n\r\n## 6. 鸣谢\r\n\r\n- Thanks to Su Jianlin (苏神) for implementing [bert4keras](https:\u002F\u002Fgithub.com\u002Fbojone\u002Fbert4keras); this implementation draws on the bert4keras source code in many places, and I sincerely thank him for his selfless contribution;\r\n- Thanks also to the [bert4pytorch](https:\u002F\u002Fgithub.com\u002FMuQiuJun-AI\u002Fbert4pytorch) project, which gave me the idea and the approach for reimplementing bert4keras in PyTorch.\r\n\r\n## 7. 引用\r\n\r\n```\r\n@misc{bert4torch,\r\n  title={bert4torch},\r\n  author={Bo Li},\r\n  year={2022},\r\n  howpublished={\\url{https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch}},\r\n}\r\n```\r\n\r\n## 8. 其他\r\n\r\n- WeChat & Star History Chart\r\n- The WeChat group has more than 200 members (invitations are restricted); add my personal WeChat to be invited to the group, with the note: bert4torch-name-company\r\n\r\n\u003Ctable border=\"0\">\r\n  \u003Ctbody>\r\n    \u003Ctr align=\"center\" >\r\n      \u003Ctd>\r\n         \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FTongjilibo\">\u003Cimg width=\"200\" height=\"250\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FTongjilibo_bert4torch_readme_334f6a1ee93b.jpg\" alt=\"pic\">\u003C\u002Fa>\u003Cbr \u002F>\r\n         \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FTongjilibo\">WeChat ID\u003C\u002Fa> \r\n      \u003C\u002Ftd>\r\n      \u003Ctd>\r\n         \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FTongjilibo\">\u003Cimg width=\"190\" height=\"250\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FTongjilibo_bert4torch_readme_3c4c8e02bdd1.jpg\" alt=\"pic\">\u003C\u002Fa>\u003Cbr \u002F>\r\n         \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FTongjilibo\">WeChat group\u003C\u002Fa> \r\n      \u003C\u002Ftd>\r\n      \u003Ctd>\r\n         \u003Ca href=\"https:\u002F\u002Fstar-history.com\u002F#Tongjilibo\u002Fbert4torch&Date\">\u003Cimg width=\"400\" height=\"250\" 
src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FTongjilibo_bert4torch_readme_dbfdc271a63b.png\" alt=\"pic\">\u003C\u002Fa>\u003Cbr \u002F>\r\n         \u003Ca href=\"https:\u002F\u002Fstar-history.com\u002F#Tongjilibo\u002Fbert4torch&Date\">Star History Chart\u003C\u002Fa> \r\n      \u003C\u002Ftd>  \r\n      \u003C\u002Ftr>\r\n  \u003C\u002Ftbody>\r\n\u003C\u002Ftable>\r\n","![bert4torch](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FTongjilibo_bert4torch_readme_849cd1e6cb4e.png)\n\n[![licence](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Flicense\u002FTongjilibo\u002Fbert4torch.svg?maxAge=3600)](https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch\u002Fblob\u002Fmaster\u002FLICENSE)\n[![GitHub release](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Frelease\u002FTongjilibo\u002Fbert4torch.svg?maxAge=3600)](https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch\u002Freleases)\n[![PyPI](https:\u002F\u002Fimg.shields.io\u002Fpypi\u002Fv\u002Fbert4torch?label=pypi%20package)](https:\u002F\u002Fpypi.org\u002Fproject\u002Fbert4torch\u002F)\n[![PyPI - Downloads](https:\u002F\u002Fimg.shields.io\u002Fpypi\u002Fdm\u002Fbert4torch)](https:\u002F\u002Fpypistats.org\u002Fpackages\u002Fbert4torch)\n[![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FTongjilibo\u002Fbert4torch?style=social)](https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch)\n[![GitHub Issues](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fissues\u002FTongjilibo\u002Fbert4torch.svg)](https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch\u002Fissues)\n[![contributions welcome](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fcontributions-welcome-brightgreen.svg?style=flat)](https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch\u002Fissues)\n[![Generic 
badge](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fwechat-join-green.svg?logo=wechat)](https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch\u002Fblob\u002Fmaster\u002Fdocs\u002Fpics\u002Fwechat_group.jpg)\n\n[Documentation](https:\u002F\u002Fbert4torch.readthedocs.io) |\n[Torch4keras](https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Ftorch4keras) |\n[Examples](https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch\u002Fblob\u002Fmaster\u002Fexamples) |\n[build_MiniLLM_from_scratch](https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbuild_MiniLLM_from_scratch) |\n[bert4vector](https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4vector)\n\n## 目录\n\n- [目录](#目录)\n- [1. 下载安装](#1-下载安装)\n- [2. 功能](#2-功能)\n- [3. 快速上手](#3-快速上手)\n  - [3.1 上手教程](#31-上手教程)\n  - [3.2 命令行快速部署大模型服务](#32-命令行快速部署大模型服务)\n- [4. 版本和更新历史](#4-版本和更新历史)\n  - [4.1 版本历史](#41-版本历史)\n  - [4.2 更新历史](#42-更新历史)\n- [5. 预训练权重](#5-预训练权重)\n  - [5.1 权重加载](#51-权重加载)\n  - [5.2 权重链接](#52-权重链接)\n- [6. 鸣谢](#6-鸣谢)\n- [7. 引用](#7-引用)\n- [8. 其他](#8-其他)\n\n## 1. 下载安装\n\nInstall the stable release\n\n```shell\npip install bert4torch\n```\n\nInstall the latest development version\n\n```shell\npip install git+https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch\n```\n\n- **Notes**: the pip release lags behind the development version on git; when using git clone, **mind the import path**, and check whether the weights need converting\n- **Running the examples**: `git clone https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch`, then update the pretrained model path and the data path in the example to launch the script\n- **Training on your own data**: modify the corresponding data-processing code to fit your data\n- **Development environment**: originally developed on `torch==1.10`, now developed on `torch2.0`; if you hit incompatibilities with other versions, feedback is welcome\n\n## 2. 
功能\n\n- **LLMs**: load open-source LLM weights such as chatglm, llama, baichuan, ziya and bloom for inference and fine-tuning; deploy an LLM with a single command\n- **Core features**: load pretrained weights such as bert, roberta, albert, xlnet, nezha, bart, RoFormer, RoFormer_V2, ELECTRA, GPT, GPT2, T5, GAU-alpha and ERNIE for further fine-tuning, and flexibly define your own model on top of bert\n- [**Rich examples**](https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch\u002Fblob\u002Fmaster\u002Fexamples\u002F): covers [llm](https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch\u002Fblob\u002Fmaster\u002Fexamples\u002Fllm), [pretrain](https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch\u002Fblob\u002Fmaster\u002Fexamples\u002Fpretrain), [sentence_classfication](https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch\u002Fblob\u002Fmaster\u002Fexamples\u002Fsentence_classfication), [sentence_embedding](https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch\u002Ftree\u002Fmaster\u002Fexamples\u002Fsentence_embedding), [sequence_labeling](https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch\u002Fblob\u002Fmaster\u002Fexamples\u002Fsequence_labeling), [relation_extraction](https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch\u002Fblob\u002Fmaster\u002Fexamples\u002Frelation_extraction), [seq2seq](https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch\u002Fblob\u002Fmaster\u002Fexamples\u002Fseq2seq), [serving](https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch\u002Fblob\u002Fmaster\u002Fexamples\u002Fserving\u002F) and other solutions\n- **Experimental validation**: validated on public datasets; see the [examples datasets](https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch\u002Fblob\u002Fmaster\u002Fdata\u002FREADME.md) and the [experiment metrics](https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch\u002Fblob\u002Fmaster\u002Fexamples\u002FExperiments.md)\n- **Easy-to-use tricks**: common [tricks](https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch\u002Fblob\u002Fmaster\u002Fexamples\u002Ftraining_trick) are integrated and plug-and-play\n- 
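**Other features**: works together with [models loaded from the transformers library](https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch\u002Fblob\u002Fmaster\u002F\u002Ftutorials\u002Ftutorials_load_transformers_model.py); concise and efficient calling style; dynamic training progress bar; parameter counts printed via torchinfo; built-in Logger and Tensorboard for easily recording the training process; customizable fit process for advanced needs\n- **Training process**:\n\n  ![Training process](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FTongjilibo_bert4torch_readme_5f821d080dc7.gif)\n\n| Feature | bert4torch | transformers | Notes |\n| ----------------------------------- | ---------- | ------------ | ---------------------------------- |\n| Training progress bar | ✅ | ✅ | The progress bar prints the loss and the user-defined metrics |\n| Distributed training dp\u002Fddp | ✅ | ✅ | torch's built-in dp\u002Fddp |\n| Various callbacks | ✅ | ✅ | Logging\u002Ftensorboard\u002Fearlystop\u002Fwandb, etc. |\n| LLM inference, stream\u002Fbatch output | ✅ | ✅ | Shared across models; no per-model scripts to maintain |\n| LLM fine-tuning | ✅ | ✅ | lora depends on the peft library; pv2 is built in |\n| Rich tricks | ✅ | ❌ | Adversarial training and other tricks are plug-and-play |\n| Concise, readable code with ample room for customization | ✅ | ❌ | High code reuse, keras-style training code |\n| Repository maintenance\u002Finfluence\u002Fusage\u002Fcompatibility | ❌ | ✅ | The repository is currently maintained by one individual |\n| One-command LLM deployment |            |              |                                    |\n\n## 3. 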
快速上手\n\n### 3.1 上手教程\n\n- [Quick-Start](https:\u002F\u002Fbert4torch.readthedocs.io\u002Fen\u002Flatest\u002F\u002FQuick-Start.html)\n- [Quick-start tutorial](https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch\u002Fblob\u002Fmaster\u002F\u002Ftutorials\u002FREADME.md), [tutorial examples](https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch\u002Fblob\u002Fmaster\u002F\u002Ftutorials), [practical examples](https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch\u002Fblob\u002Fmaster\u002Fexamples)\n- [Introduction to bert4torch (Zhihu)](https:\u002F\u002Fzhuanlan.zhihu.com\u002Fp\u002F486329434), [bert4torch quick start (Zhihu)](https:\u002F\u002Fzhuanlan.zhihu.com\u002Fp\u002F508890807), [bert4torch has been updated yet again (Zhihu)](https:\u002F\u002Fzhuanlan.zhihu.com\u002Fp\u002F560885427?)\n\n### 3.2 命令行快速部署大模型服务\n\n- Local \u002F online loading\n  ```shell\n  # Download all files from the network\n  bert4torch serve --checkpoint_path Qwen2-0.5B-Instruct\n\n  # Load a local model; download bert4torch_config.json from the network\n  bert4torch serve --checkpoint_path \u002Fdata\u002Fpretrain_ckpt\u002FQwen\u002FQwen2-0.5B-Instruct --config_path Qwen\u002FQwen2-0.5B-Instruct\n\n  # Load a local model whose bert4torch_config.json is already downloaded into the same directory\n  bert4torch serve --checkpoint_path \u002Fdata\u002Fpretrain_ckpt\u002FQwen\u002FQwen2-0.5B-Instruct\n  ```\n- Command line \u002F gradio web page \u002F openai_api\n  ```shell\n  # Command line\n  bert4torch serve --checkpoint_path \u002Fdata\u002Fpretrain_ckpt\u002FQwen\u002FQwen2-0.5B-Instruct --mode cli\n\n  # gradio web page\n  bert4torch serve --checkpoint_path \u002Fdata\u002Fpretrain_ckpt\u002FQwen\u002FQwen2-0.5B-Instruct --mode gradio\n\n  # openai_api\n  bert4torch serve --checkpoint_path \u002Fdata\u002Fpretrain_ckpt\u002FQwen\u002FQwen2-0.5B-Instruct --mode openai\n  ```\n- Command-line chat example\n  ![Command-line chat](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FTongjilibo_bert4torch_readme_962d0ae270e6.gif)\n\n## 4. 
版本和更新历史\n\n### 4.1 版本历史\n\n| Release date | bert4torch  | torch4keras | Release notes |\n| -------- | ----------- | ----------- | ---------------------------------------------------------------------- |\n| 20260114 | 0.6.1       | 0.3.3       | Added paddleocr-vl, improved the code structure, removed hard-coded model configuration items |\n| 20250925 | 0.6.0       | 0.3.2       | Added `Qwen3-moe`, support for mainstream quantization methods such as `gptq` and `awq`, other code optimizations |\n| 20250721 | 0.5.9.post2 | 0.3.1       | Added `Ernie4_5`, fixed a hub download bug, split out `openai_client` |\n\n[More versions](https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch\u002Fblob\u002Fmaster\u002Fdocs\u002FUpdate.md)\n\n### 4.2 更新历史\n\n[More history](https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch\u002Fblob\u002Fmaster\u002Fdocs\u002FHistory.md)\n\n## 5. 预训练权重\n\n### 5.1 权重加载\n\n  ```python\n  from bert4torch.models import build_transformer_model\n\n  # 1. config_path only: initialize the model architecture from scratch, without loading pretrained weights\n  model = build_transformer_model('.\u002Fmodel\u002Fbert4torch_config.json')\n\n  # 2. checkpoint_path only: \n  ## 2.1 Directory path: the *.bin\u002F*.safetensors weight files under it are found automatically; bert4torch_config.json must be downloaded and placed in that directory\n  model = build_transformer_model(checkpoint_path='.\u002Fmodel')\n\n  ## 2.2 File path or list: the path(s) point directly at the weight file(s); bert4torch_config.json is looked up in the same directory\n  model = build_transformer_model(checkpoint_path='.\u002Fpytorch_model.bin')\n\n  ## 2.3 model_name: name of a pretrained model on hf; the hf weights and bert4torch_config.json are downloaded automatically\n  model = build_transformer_model(checkpoint_path='google-bert\u002Fbert-base-chinese')\n\n  # 3. 
Both config_path and checkpoint_path (any combination of local paths and model_name): \n  #    local paths are loaded locally; a pretrained_model_name is downloaded from the network\n  config_path = '.\u002Fmodel\u002Fbert4torch_config.json'  # or 'google-bert\u002Fbert-base-chinese'\n  checkpoint_path = '.\u002Fmodel\u002Fpytorch_model.bin'  # or 'google-bert\u002Fbert-base-chinese'\n  model = build_transformer_model(config_path, checkpoint_path)\n  ```\n\n### 5.2 权重链接\r\n\r\n\r\n|Model family|Model name|Weight source|checkpoint_path|config_path|\r\n|------------|------------|------------|------------|------------|\r\n|bert|bert-base-chinese|google-bert|`google-bert\u002Fbert-base-chinese` [🤗](https:\u002F\u002Fhuggingface.co\u002Fgoogle-bert\u002Fbert-base-chinese)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fgoogle-bert\u002Fbert-base-chinese\u002Fbert4torch_config.json)|\r\n||[chinese_L-12_H-768_A-12](https:\u002F\u002Fgithub.com\u002Fgoogle-research\u002Fbert)|谷歌|[tf weights](https:\u002F\u002Fstorage.googleapis.com\u002Fbert_models\u002F2018_11_03\u002Fchinese_L-12_H-768_A-12.zip)\u003Cbr\u002F>`Tongjilibo\u002Fbert-chinese_L-12_H-768_A-12` [🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert-chinese_L-12_H-768_A-12)||\r\n||[chinese-bert-wwm-ext](https:\u002F\u002Fgithub.com\u002Fymcui\u002FChinese-BERT-wwm)|HFL|`hfl\u002Fchinese-bert-wwm-ext` [🤗](https:\u002F\u002Fhuggingface.co\u002Fhfl\u002Fchinese-bert-wwm-ext)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fhfl\u002Fchinese-bert-wwm-ext\u002Fbert4torch_config.json)|\r\n||bert-base-multilingual-cased|google-bert|`google-bert\u002Fbert-base-multilingual-cased` [🤗](https:\u002F\u002Fhuggingface.co\u002Fgoogle-bert\u002Fbert-base-multilingual-cased)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fgoogle-bert\u002Fbert-base-multilingual-cased\u002Fbert4torch_config.json)|\r\n||bert-base-cased|google-bert|`google-bert\u002Fbert-base-cased` 
[🤗](https:\u002F\u002Fhuggingface.co\u002Fgoogle-bert\u002Fbert-base-cased)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fgoogle-bert\u002Fbert-base-cased\u002Fbert4torch_config.json)|\r\n||bert-base-uncased|google-bert|`google-bert\u002Fbert-base-uncased` [🤗](https:\u002F\u002Fhuggingface.co\u002Fgoogle-bert\u002Fbert-base-uncased)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fgoogle-bert\u002Fbert-base-uncased\u002Fbert4torch_config.json)|\r\n||[MacBERT](https:\u002F\u002Fgithub.com\u002Fymcui\u002FMacBERT)|HFL|`hfl\u002Fchinese-macbert-base` [🤗](https:\u002F\u002Fhuggingface.co\u002Fhfl\u002Fchinese-macbert-base)\u003Cbr\u002F>`hfl\u002Fchinese-macbert-large` [🤗](https:\u002F\u002Fhuggingface.co\u002Fhfl\u002Fchinese-macbert-large)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fhfl\u002Fchinese-macbert-base\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fhfl\u002Fchinese-macbert-large\u002Fbert4torch_config.json)|\r\n||[WoBERT](https:\u002F\u002Fgithub.com\u002FZhuiyiTechnology\u002FWoBERT)|追一科技|`junnyu\u002Fwobert_chinese_base` [🤗](https:\u002F\u002Fhuggingface.co\u002Fjunnyu\u002Fwobert_chinese_base)\u003Cbr\u002F>`junnyu\u002Fwobert_chinese_plus_base` [🤗](https:\u002F\u002Fhuggingface.co\u002Fjunnyu\u002Fwobert_chinese_plus_base)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fjunnyu\u002Fwobert_chinese_base\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fjunnyu\u002Fwobert_chinese_plus_base\u002Fbert4torch_config.json)|\r\n|roberta|[chinese-roberta-wwm-ext](https:\u002F\u002Fgithub.com\u002Fymcui\u002FChinese-BERT-wwm)|HFL|`hfl\u002Fchinese-roberta-wwm-ext` 
[🤗](https:\u002F\u002Fhuggingface.co\u002Fhfl\u002Fchinese-roberta-wwm-ext)\u003Cbr\u002F>`hfl\u002Fchinese-roberta-wwm-ext-large` [🤗](https:\u002F\u002Fhuggingface.co\u002Fhfl\u002Fchinese-roberta-wwm-ext-large)\u003Cbr\u002F>(the mlm weights of the large model are randomly initialized)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fhfl\u002Fchinese-roberta-wwm-ext\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fhfl\u002Fchinese-roberta-wwm-ext-large\u002Fbert4torch_config.json)|\r\n||[roberta-small\u002Ftiny](https:\u002F\u002Fgithub.com\u002FZhuiyiTechnology\u002Fpretrained-models)|追一科技|`Tongjilibo\u002Fchinese_roberta_L-4_H-312_A-12` [🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fchinese_roberta_L-4_H-312_A-12)\u003Cbr\u002F>`Tongjilibo\u002Fchinese_roberta_L-6_H-384_A-12` [🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fchinese_roberta_L-6_H-384_A-12)||\r\n||[roberta-base](https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Ffairseq\u002Ftree\u002Fmain\u002Fexamples\u002Froberta)|FacebookAI|`FacebookAI\u002Froberta-base` [🤗](https:\u002F\u002Fhuggingface.co\u002FFacebookAI\u002Froberta-base)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FFacebookAI\u002Froberta-base\u002Fbert4torch_config.json)|\r\n||[guwenbert](https:\u002F\u002Fgithub.com\u002FEthan-yt\u002Fguwenbert)|ethanyt|`ethanyt\u002Fguwenbert-base` [🤗](https:\u002F\u002Fhuggingface.co\u002Fethanyt\u002Fguwenbert-base)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fethanyt\u002Fguwenbert-base\u002Fbert4torch_config.json)|\r\n|albert|[albert_zh](https:\u002F\u002Fgithub.com\u002Fbrightmart\u002Falbert_zh)\u003Cbr\u002F>[albert_pytorch](https:\u002F\u002Fgithub.com\u002FlonePatient\u002Falbert_pytorch)|brightmart|`voidful\u002Falbert_chinese_tiny` 
[🤗](https:\u002F\u002Fhuggingface.co\u002Fvoidful\u002Falbert_chinese_tiny)\u003Cbr\u002F>`voidful\u002Falbert_chinese_small` [🤗](https:\u002F\u002Fhuggingface.co\u002Fvoidful\u002Falbert_chinese_small)\u003Cbr\u002F>`voidful\u002Falbert_chinese_base` [🤗](https:\u002F\u002Fhuggingface.co\u002Fvoidful\u002Falbert_chinese_base)\u003Cbr\u002F>`voidful\u002Falbert_chinese_large` [🤗](https:\u002F\u002Fhuggingface.co\u002Fvoidful\u002Falbert_chinese_large)\u003Cbr\u002F>`voidful\u002Falbert_chinese_xlarge` [🤗](https:\u002F\u002Fhuggingface.co\u002Fvoidful\u002Falbert_chinese_xlarge)\u003Cbr\u002F>`voidful\u002Falbert_chinese_xxlarge` [🤗](https:\u002F\u002Fhuggingface.co\u002Fvoidful\u002Falbert_chinese_xxlarge)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fvoidful\u002Falbert_chinese_tiny\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fvoidful\u002Falbert_chinese_small\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fvoidful\u002Falbert_chinese_base\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fvoidful\u002Falbert_chinese_large\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fvoidful\u002Falbert_chinese_xlarge\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fvoidful\u002Falbert_chinese_xxlarge\u002Fbert4torch_config.json)|\r\n|nezha|[NEZHA](https:\u002F\u002Fgithub.com\u002Fhuawei-noah\u002FPretrained-Language-Model\u002Ftree\u002Fmaster\u002FNEZHA-PyTorch)\u003Cbr\u002F>[NeZha_Chinese_PyTorch](https:\u002F\u002Fgithub.com\u002FlonePatient\u002FNeZ
ha_Chinese_PyTorch)|huawei_noah|`sijunhe\u002Fnezha-cn-base` [🤗](https:\u002F\u002Fhuggingface.co\u002Fsijunhe\u002Fnezha-cn-base)\u003Cbr\u002F>`sijunhe\u002Fnezha-cn-large` [🤗](https:\u002F\u002Fhuggingface.co\u002Fsijunhe\u002Fnezha-cn-large)\u003Cbr\u002F>`sijunhe\u002Fnezha-base-wwm` [🤗](https:\u002F\u002Fhuggingface.co\u002Fsijunhe\u002Fnezha-base-wwm)\u003Cbr\u002F>`sijunhe\u002Fnezha-large-wwm` [🤗](https:\u002F\u002Fhuggingface.co\u002Fsijunhe\u002Fnezha-large-wwm)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fsijunhe\u002Fnezha-cn-base\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fsijunhe\u002Fnezha-cn-large\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fsijunhe\u002Fnezha-base-wwm\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fsijunhe\u002Fnezha-large-wwm\u002Fbert4torch_config.json)|\r\n||[nezha_gpt_dialog](https:\u002F\u002Fgithub.com\u002Fbojone\u002Fnezha_gpt_dialog)|bojone|`Tongjilibo\u002Fnezha_gpt_dialog` [🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fnezha_gpt_dialog)||\r\n|xlnet|[Chinese-XLNet](https:\u002F\u002Fgithub.com\u002Fymcui\u002FChinese-XLNet)|HFL|`hfl\u002Fchinese-xlnet-base` [🤗](https:\u002F\u002Fhuggingface.co\u002Fhfl\u002Fchinese-xlnet-base)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fhfl\u002Fchinese-xlnet-base\u002Fbert4torch_config.json)|\r\n||[transformer_xl](https:\u002F\u002Fgithub.com\u002Fkimiyoung\u002Ftransformer-xl)|huggingface|`transfo-xl\u002Ftransfo-xl-wt103` 
[🤗](https:\u002F\u002Fhuggingface.co\u002Ftransfo-xl\u002Ftransfo-xl-wt103)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Ftransfo-xl\u002Ftransfo-xl-wt103\u002Fbert4torch_config.json)|\r\n|deberta|[Erlangshen-DeBERTa-v2](https:\u002F\u002Fgithub.com\u002FIDEA-CCNL\u002FFengshenbang-LM)|IDEA|`IDEA-CCNL\u002FErlangshen-DeBERTa-v2-97M-Chinese` [🤗](https:\u002F\u002Fhuggingface.co\u002FIDEA-CCNL\u002FErlangshen-DeBERTa-v2-97M-Chinese)\u003Cbr\u002F>`IDEA-CCNL\u002FErlangshen-DeBERTa-v2-320M-Chinese` [🤗](https:\u002F\u002Fhuggingface.co\u002FIDEA-CCNL\u002FErlangshen-DeBERTa-v2-320M-Chinese)\u003Cbr\u002F>`IDEA-CCNL\u002FErlangshen-DeBERTa-v2-710M-Chinese` [🤗](https:\u002F\u002Fhuggingface.co\u002FIDEA-CCNL\u002FErlangshen-DeBERTa-v2-710M-Chinese)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FIDEA-CCNL\u002FErlangshen-DeBERTa-v2-97M-Chinese\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FIDEA-CCNL\u002FErlangshen-DeBERTa-v2-320M-Chinese\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FIDEA-CCNL\u002FErlangshen-DeBERTa-v2-710M-Chinese\u002Fbert4torch_config.json)|\r\n|electra|[Chinese-ELECTRA](https:\u002F\u002Fgithub.com\u002Fymcui\u002FChinese-ELECTRA)|HFL|`hfl\u002Fchinese-electra-base-discriminator` [🤗](https:\u002F\u002Fhuggingface.co\u002Fhfl\u002Fchinese-electra-base-discriminator)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fhfl\u002Fchinese-electra-base-discriminator\u002Fbert4torch_config.json)|\r\n|ernie|[ernie](https:\u002F\u002Fgithub.com\u002FPaddlePaddle\u002FERNIE)|百度文心|`nghuyong\u002Fernie-1.0-base-zh` 
[🤗](https:\u002F\u002Fhuggingface.co\u002Fnghuyong\u002Fernie-1.0-base-zh)\u003Cbr\u002F>`nghuyong\u002Fernie-3.0-base-zh` [🤗](https:\u002F\u002Fhuggingface.co\u002Fnghuyong\u002Fernie-3.0-base-zh)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fnghuyong\u002Fernie-1.0-base-zh\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fnghuyong\u002Fernie-3.0-base-zh\u002Fbert4torch_config.json)|\r\n|roformer|[roformer](https:\u002F\u002Fgithub.com\u002FZhuiyiTechnology\u002Froformer)|追一科技|`junnyu\u002Froformer_chinese_base` [🤗](https:\u002F\u002Fhuggingface.co\u002Fjunnyu\u002Froformer_chinese_base)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fjunnyu\u002Froformer_chinese_base\u002Fbert4torch_config.json)|\r\n||[roformer_v2](https:\u002F\u002Fgithub.com\u002FZhuiyiTechnology\u002Froformer-v2)|追一科技|`junnyu\u002Froformer_v2_chinese_char_base` [🤗](https:\u002F\u002Fhuggingface.co\u002Fjunnyu\u002Froformer_v2_chinese_char_base)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fjunnyu\u002Froformer_v2_chinese_char_base\u002Fbert4torch_config.json)|\r\n|simbert|[simbert](https:\u002F\u002Fgithub.com\u002FZhuiyiTechnology\u002Fsimbert)|追一科技|`Tongjilibo\u002Fsimbert-chinese-base` [🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fsimbert-chinese-base)\u003Cbr\u002F>`Tongjilibo\u002Fsimbert-chinese-small` [🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fsimbert-chinese-small)\u003Cbr\u002F>`Tongjilibo\u002Fsimbert-chinese-tiny` [🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fsimbert-chinese-tiny)||\r\n||[simbert_v2\u002Froformer-sim](https:\u002F\u002Fgithub.com\u002FZhuiyiTechnology\u002Froformer-sim)|追一科技|`junnyu\u002Froformer_chinese_sim_char_base` 
[🤗](https:\u002F\u002Fhuggingface.co\u002Fjunnyu\u002Froformer_chinese_sim_char_base)\u003Cbr\u002F>`junnyu\u002Froformer_chinese_sim_char_ft_base` [🤗](https:\u002F\u002Fhuggingface.co\u002Fjunnyu\u002Froformer_chinese_sim_char_ft_base)\u003Cbr\u002F>`junnyu\u002Froformer_chinese_sim_char_small` [🤗](https:\u002F\u002Fhuggingface.co\u002Fjunnyu\u002Froformer_chinese_sim_char_small)\u003Cbr\u002F>`junnyu\u002Froformer_chinese_sim_char_ft_small` [🤗](https:\u002F\u002Fhuggingface.co\u002Fjunnyu\u002Froformer_chinese_sim_char_ft_small)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fjunnyu\u002Froformer_chinese_sim_char_base\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fjunnyu\u002Froformer_chinese_sim_char_ft_base\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fjunnyu\u002Froformer_chinese_sim_char_small\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fjunnyu\u002Froformer_chinese_sim_char_ft_small\u002Fbert4torch_config.json)|\r\n|gau|[GAU-alpha](https:\u002F\u002Fgithub.com\u002FZhuiyiTechnology\u002FGAU-alpha)|追一科技|`Tongjilibo\u002Fchinese_GAU-alpha-char_L-24_H-768` [🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fchinese_GAU-alpha-char_L-24_H-768)||\r\n|ModernBERT|[ModernBERT](https:\u002F\u002Fhuggingface.co\u002Fcollections\u002Fanswerdotai\u002Fmodernbert-67627ad707a4acbf33c41deb)|answerdotai|`answerdotai\u002FModernBERT-base` [🤗](https:\u002F\u002Fhuggingface.co\u002Fanswerdotai\u002FModernBERT-base)\u003Cbr\u002F>`answerdotai\u002FModernBERT-large` 
[🤗](https:\u002F\u002Fhuggingface.co\u002Fanswerdotai\u002FModernBERT-large)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fanswerdotai\u002FModernBERT-base\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fanswerdotai\u002FModernBERT-large\u002Fbert4torch_config.json)|\r\n|uie|[uie](https:\u002F\u002Fgithub.com\u002Funiversal-ie\u002FUIE)\u003Cbr\u002F>[uie_pytorch](https:\u002F\u002Fgithub.com\u002FHUSTAI\u002Fuie_pytorch)|百度|`Tongjilibo\u002Fuie-base` [🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fuie-base)||\r\n|gpt|[CDial-GPT](https:\u002F\u002Fgithub.com\u002Fthu-coai\u002FCDial-GPT)|thu-coai|`thu-coai\u002FCDial-GPT_LCCC-base` [🤗](https:\u002F\u002Fhuggingface.co\u002Fthu-coai\u002FCDial-GPT_LCCC-base)\u003Cbr\u002F>`thu-coai\u002FCDial-GPT_LCCC-large` [🤗](https:\u002F\u002Fhuggingface.co\u002Fthu-coai\u002FCDial-GPT_LCCC-large)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fthu-coai\u002FCDial-GPT_LCCC-base\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fthu-coai\u002FCDial-GPT_LCCC-large\u002Fbert4torch_config.json)|\r\n||[cpm_lm (2.6B)](https:\u002F\u002Fgithub.com\u002FTsinghuaAI\u002FCPM-1-Generate)|清华|`TsinghuaAI\u002FCPM-Generate` [🤗](https:\u002F\u002Fhuggingface.co\u002FTsinghuaAI\u002FCPM-Generate)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FTsinghuaAI\u002FCPM-Generate\u002Fbert4torch_config.json)|\r\n||[nezha_gen](https:\u002F\u002Fgithub.com\u002Fhuawei-noah\u002FPretrained-Language-Model\u002Ftree\u002Fmaster\u002FNEZHA-Gen-TensorFlow)|huawei_noah|`Tongjilibo\u002Fchinese_nezha_gpt_L-12_H-768_A-12` 
[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fchinese_nezha_gpt_L-12_H-768_A-12)||\r\n||[gpt2-chinese-cluecorpussmall](https:\u002F\u002Fgithub.com\u002Fdbiir\u002FUER-py\u002Fwiki\u002FModelzoo)|UER|`uer\u002Fgpt2-chinese-cluecorpussmall` [🤗](https:\u002F\u002Fhuggingface.co\u002Fuer\u002Fgpt2-chinese-cluecorpussmall)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fuer\u002Fgpt2-chinese-cluecorpussmall\u002Fbert4torch_config.json)|\r\n||[gpt2-ml](https:\u002F\u002Fgithub.com\u002Fimcaspar\u002Fgpt2-ml)|imcaspar|`Tongjilibo\u002Fgpt2-ml_15g_corpus` [🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fgpt2-ml_15g_corpus)\u003Cbr\u002F>`Tongjilibo\u002Fgpt2-ml_30g_corpus` [🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fgpt2-ml_30g_corpus)\u003Cbr\u002F>[torch](https:\u002F\u002Fgithub.com\u002Fghosthamlet\u002Fgpt2-ml-torch),[BaiduYun(84dh)](https:\u002F\u002Fpan.baidu.com\u002Fs\u002F16tL4Bmoh6jPy0cOND0YyeA)||\r\n|bart|[bart_base_chinese](https:\u002F\u002Fgithub.com\u002Ffastnlp\u002FCPT)|复旦fnlp|`fnlp\u002Fbart-base-chinese` [🤗](https:\u002F\u002Fhuggingface.co\u002Ffnlp\u002Fbart-base-chinese)\u003Cbr\u002F>[fnlp\u002Fbart-base-chinese-v1.0](https:\u002F\u002Fhuggingface.co\u002Ffnlp\u002Fbart-base-chinese\u002Ftree\u002Fv1.0)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Ffnlp\u002Fbart-base-chinese\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Ffnlp\u002Fbart-base-chinese-v1.0\u002Fbert4torch_config.json)|\r\n|t5|[t5](https:\u002F\u002Fgithub.com\u002Fdbiir\u002FUER-py\u002Fwiki\u002FModelzoo)|UER|`uer\u002Ft5-small-chinese-cluecorpussmall` [🤗](https:\u002F\u002Fhuggingface.co\u002Fuer\u002Ft5-small-chinese-cluecorpussmall)\u003Cbr\u002F>`uer\u002Ft5-base-chinese-cluecorpussmall` 
[🤗](https:\u002F\u002Fhuggingface.co\u002Fuer\u002Ft5-base-chinese-cluecorpussmall)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fuer\u002Ft5-small-chinese-cluecorpussmall\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fuer\u002Ft5-base-chinese-cluecorpussmall\u002Fbert4torch_config.json)|\r\n||mt5|谷歌|`google\u002Fmt5-base` [🤗](https:\u002F\u002Fhuggingface.co\u002Fgoogle\u002Fmt5-base)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fgoogle\u002Fmt5-base\u002Fbert4torch_config.json)|\r\n||[t5_pegasus](https:\u002F\u002Fgithub.com\u002FZhuiyiTechnology\u002Ft5-pegasus)|追一科技|`Tongjilibo\u002Fchinese_t5_pegasus_small` [🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fchinese_t5_pegasus_small)\u003Cbr\u002F>`Tongjilibo\u002Fchinese_t5_pegasus_base` [🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fchinese_t5_pegasus_base)||\r\n||[chatyuan](https:\u002F\u002Fgithub.com\u002Fclue-ai\u002FChatYuan)|clue-ai|`ClueAI\u002FChatYuan-large-v1` [🤗](https:\u002F\u002Fhuggingface.co\u002FClueAI\u002FChatYuan-large-v1)\u003Cbr\u002F>`ClueAI\u002FChatYuan-large-v2` [🤗](https:\u002F\u002Fhuggingface.co\u002FClueAI\u002FChatYuan-large-v2)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FClueAI\u002FChatYuan-large-v1\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FClueAI\u002FChatYuan-large-v2\u002Fbert4torch_config.json)|\r\n||[PromptCLUE](https:\u002F\u002Fgithub.com\u002Fclue-ai\u002FPromptCLUE)|clue-ai|`ClueAI\u002FPromptCLUE-base` 
[🤗](https:\u002F\u002Fhuggingface.co\u002FClueAI\u002FPromptCLUE-base)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FClueAI\u002FPromptCLUE-base\u002Fbert4torch_config.json)|\r\n|chatglm|[ChatGLM-6B](https:\u002F\u002Fgithub.com\u002Fzai-org\u002FChatGLM-6B)|zai-org|`zai-org\u002Fchatglm-6b` [🤗](https:\u002F\u002Fhuggingface.co\u002Fzai-org\u002Fchatglm-6b)\u003Cbr\u002F>`zai-org\u002Fchatglm-6b-int8` [🤗](https:\u002F\u002Fhuggingface.co\u002Fzai-org\u002Fchatglm-6b-int8)\u003Cbr\u002F>`zai-org\u002Fchatglm-6b-int4` [🤗](https:\u002F\u002Fhuggingface.co\u002Fzai-org\u002Fchatglm-6b-int4)\u003Cbr\u002F>`zai-org\u002Fchatglm-6b-v0.1.0`[🤗](https:\u002F\u002Fhuggingface.co\u002Fzai-org\u002Fchatglm-6b\u002Ftree\u002Fv0.1.0)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fzai-org\u002Fchatglm-6b\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fzai-org\u002Fchatglm-6b-int8\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fzai-org\u002Fchatglm-6b-int4\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fzai-org\u002Fchatglm-6b-v0.1.0\u002Fbert4torch_config.json)|\r\n||[ChatGLM2-6B](https:\u002F\u002Fgithub.com\u002Fzai-org\u002FChatGLM2-6B)|zai-org|`zai-org\u002Fchatglm2-6b` [🤗](https:\u002F\u002Fhuggingface.co\u002Fzai-org\u002Fchatglm2-6b)\u003Cbr\u002F>`zai-org\u002Fchatglm2-6b-int4` [🤗](https:\u002F\u002Fhuggingface.co\u002Fzai-org\u002Fchatglm2-6b-int4)\u003Cbr\u002F>`zai-org\u002Fchatglm2-6b-32k` 
[🤗](https:\u002F\u002Fhuggingface.co\u002Fzai-org\u002Fchatglm2-6b-32k)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fzai-org\u002Fchatglm2-6b\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fzai-org\u002Fchatglm2-6b-int4\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fzai-org\u002Fchatglm2-6b-32k\u002Fbert4torch_config.json)|\r\n||[ChatGLM3](https:\u002F\u002Fgithub.com\u002Fzai-org\u002FChatGLM3)|zai-org|`zai-org\u002Fchatglm3-6b` [🤗](https:\u002F\u002Fhuggingface.co\u002Fzai-org\u002Fchatglm3-6b)\u003Cbr\u002F>`zai-org\u002Fchatglm3-6b-32k` [🤗](https:\u002F\u002Fhuggingface.co\u002Fzai-org\u002Fchatglm3-6b-32k)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fzai-org\u002Fchatglm3-6b\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fzai-org\u002Fchatglm3-6b-32k\u002Fbert4torch_config.json)|\r\n||[GLM-4](https:\u002F\u002Fgithub.com\u002Fzai-org\u002FGLM-4)|zai-org|`zai-org\u002Fglm-4-9b` [🤗](https:\u002F\u002Fhuggingface.co\u002Fzai-org\u002Fglm-4-9b)\u003Cbr\u002F>`zai-org\u002Fglm-4-9b-chat` [🤗](https:\u002F\u002Fhuggingface.co\u002Fzai-org\u002Fglm-4-9b-chat)\u003Cbr\u002F>`zai-org\u002Fglm-4-9b-chat-1m` [🤗](https:\u002F\u002Fhuggingface.co\u002Fzai-org\u002Fglm-4-9b-chat-1m)\u003Cbr\u002F>`zai-org\u002Fglm-4v-9b` [🤗](https:\u002F\u002Fhuggingface.co\u002Fzai-org\u002Fglm-4v-9b)\u003Cbr\u002F>`zai-org\u002FGLM-4-9B-0414` [🤗](https:\u002F\u002Fhuggingface.co\u002Fzai-org\u002FGLM-4-9B-0414)\u003Cbr\u002F>`zai-org\u002FGLM-Z1-9B-0414` 
[🤗](https:\u002F\u002Fhuggingface.co\u002Fzai-org\u002FGLM-Z1-9B-0414)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fzai-org\u002Fglm-4-9b\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fzai-org\u002Fglm-4-9b-chat\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fzai-org\u002Fglm-4-9b-chat-1m\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fzai-org\u002Fglm-4v-9b\u002Fbert4torch_config.json)\u003Cbr\u002F>\u003Cbr\u002F>\u003Cbr\u002F>|\r\n|llama|[llama](https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Fllama)|meta|`meta-llama\u002Fllama-7b`\u003Cbr\u002F>`meta-llama\u002Fllama-13b`|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fmeta-llama\u002Fllama-7b\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fmeta-llama\u002Fllama-13b\u002Fbert4torch_config.json)|\r\n||[llama-2](https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Fllama)|meta|`meta-llama\u002FLlama-2-7b-hf`[🤗](https:\u002F\u002Fhuggingface.co\u002Fmeta-llama\u002FLlama-2-7b-hf)\u003Cbr\u002F>`meta-llama\u002FLlama-2-7b-chat-hf`[🤗](https:\u002F\u002Fhuggingface.co\u002Fmeta-llama\u002FLlama-2-7b-chat-hf)\u003Cbr\u002F>`meta-llama\u002FLlama-2-13b-hf`[🤗](https:\u002F\u002Fhuggingface.co\u002Fmeta-llama\u002FLlama-2-13b-hf)\u003Cbr\u002F>`meta-llama\u002FLlama-2-13b-chat-hf`[🤗](https:\u002F\u002Fhuggingface.co\u002Fmeta-llama\u002FLlama-2-13b-chat-hf)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fmeta-llama\u002FLlama-2-7b-hf\u002Fbert4torch_config.json)\u003Cbr\u002F
>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fmeta-llama\u002FLlama-2-7b-chat-hf\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fmeta-llama\u002FLlama-2-13b-hf\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fmeta-llama\u002FLlama-2-13b-chat-hf\u002Fbert4torch_config.json)|\r\n||[llama-3](https:\u002F\u002Fgithub.com\u002Fmeta-llama\u002Fllama3)|meta|`meta-llama\u002FMeta-Llama-3-8B` [🤗](https:\u002F\u002Fhuggingface.co\u002Fmeta-llama\u002FMeta-Llama-3-8B)\u003Cbr\u002F>`meta-llama\u002FMeta-Llama-3-8B-Instruct` [🤗](https:\u002F\u002Fhuggingface.co\u002Fmeta-llama\u002FMeta-Llama-3-8B-Instruct)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fmeta-llama\u002FMeta-Llama-3-8B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fmeta-llama\u002FMeta-Llama-3-8B-Instruct\u002Fbert4torch_config.json)|\r\n||[llama-3.1](https:\u002F\u002Fgithub.com\u002Fmeta-llama\u002Fllama-models)|meta|`meta-llama\u002FMeta-Llama-3.1-8B` [🤗](https:\u002F\u002Fhuggingface.co\u002Fmeta-llama\u002FMeta-Llama-3.1-8B)\u003Cbr\u002F>`meta-llama\u002FMeta-Llama-3.1-8B-Instruct` 
[🤗](https:\u002F\u002Fhuggingface.co\u002Fmeta-llama\u002FMeta-Llama-3.1-8B-Instruct)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fmeta-llama\u002FMeta-Llama-3.1-8B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fmeta-llama\u002FMeta-Llama-3.1-8B-Instruct\u002Fbert4torch_config.json)|\r\n||[llama-3.2](https:\u002F\u002Fgithub.com\u002Fmeta-llama\u002Fllama-models)|meta|`meta-llama\u002FLlama-3.2-1B` [🤗](https:\u002F\u002Fhuggingface.co\u002Fmeta-llama\u002FLlama-3.2-1B)\u003Cbr\u002F>`meta-llama\u002FLlama-3.2-1B-Instruct` [🤗](https:\u002F\u002Fhuggingface.co\u002Fmeta-llama\u002FLlama-3.2-1B-Instruct)\u003Cbr\u002F>`meta-llama\u002FLlama-3.2-3B` [🤗](https:\u002F\u002Fhuggingface.co\u002Fmeta-llama\u002FLlama-3.2-3B)\u003Cbr\u002F>`meta-llama\u002FLlama-3.2-3B-Instruct` [🤗](https:\u002F\u002Fhuggingface.co\u002Fmeta-llama\u002FLlama-3.2-3B-Instruct)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fmeta-llama\u002FLlama-3.2-1B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fmeta-llama\u002FLlama-3.2-1B-Instruct\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fmeta-llama\u002FLlama-3.2-3B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fmeta-llama\u002FLlama-3.2-3B-Instruct\u002Fbert4torch_config.json)|\r\n||[llama-3.2-vision](https:\u002F\u002Fgithub.com\u002Fmeta-llama\u002Fllama-models)|meta|`meta-llama\u002FLlama-3.2-11B-Vision` [🤗](https:\u002F\u002Fhuggingface.co\u002Fmeta-llama\u002FLlama-3.2-11B-Vision)\u003Cbr\u002F>`meta-llama\u002FLlama-3.2-11B-Vision-Instruct` 
[🤗](https:\u002F\u002Fhuggingface.co\u002Fmeta-llama\u002FLlama-3.2-11B-Vision-Instruct)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fmeta-llama\u002FLlama-3.2-11B-Vision\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fmeta-llama\u002FLlama-3.2-11B-Vision-Instruct\u002Fbert4torch_config.json)|\r\n|llama-series|[Chinese-LLaMA-Alpaca](https:\u002F\u002Fgithub.com\u002Fymcui\u002FChinese-LLaMA-Alpaca)|HFL|`hfl\u002Fchinese-alpaca-plus-lora-7b` [🤗](https:\u002F\u002Fhuggingface.co\u002Fhfl\u002Fchinese-alpaca-plus-lora-7b)\u003Cbr\u002F>`hfl\u002Fchinese-llama-plus-lora-7b` [🤗](https:\u002F\u002Fhuggingface.co\u002Fhfl\u002Fchinese-llama-plus-lora-7b)\u003Cbr\u002F>(LoRA weights must be merged before use)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fhfl\u002Fchinese-alpaca-plus-7b\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fhfl\u002Fchinese-llama-plus-7b\u002Fbert4torch_config.json)\u003Cbr\u002F>\u003Cbr\u002F>|\r\n||[Chinese-LLaMA-Alpaca-2](https:\u002F\u002Fgithub.com\u002Fymcui\u002FChinese-LLaMA-Alpaca-2)|HFL||To be added|\r\n||[Chinese-LLaMA-Alpaca-3](https:\u002F\u002Fgithub.com\u002Fymcui\u002FChinese-LLaMA-Alpaca-3)|HFL||To be added|\r\n||[Belle_llama](https:\u002F\u002Fgithub.com\u002FLianjiaTech\u002FBELLE)|LianjiaTech|`BelleGroup\u002FBELLE-LLaMA-7B-2M-enc`[🤗](https:\u002F\u002Fhuggingface.co\u002FBelleGroup\u002FBELLE-LLaMA-7B-2M-enc)|[Merge instructions](https:\u002F\u002Fgithub.com\u002FLianjiaTech\u002FBELLE\u002Ftree\u002Fmain\u002Fmodels), [🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FBelleGroup\u002FBELLE-LLaMA-7B-2M-enc)|\r\n||[Ziya](https:\u002F\u002Fgithub.com\u002FIDEA-CCNL\u002FFengshenbang-LM)|IDEA-CCNL|`IDEA-CCNL\u002FZiya-LLaMA-13B-v1`[🤗](htt
ps:\u002F\u002Fhuggingface.co\u002FIDEA-CCNL\u002FZiya-LLaMA-13B-v1)\u003Cbr\u002F>`IDEA-CCNL\u002FZiya-LLaMA-13B-v1.1`[🤗](https:\u002F\u002Fhuggingface.co\u002FIDEA-CCNL\u002FZiya-LLaMA-13B-v1.1)\u003Cbr\u002F>`IDEA-CCNL\u002FZiya-LLaMA-13B-Pretrain-v1`[🤗](https:\u002F\u002Fhuggingface.co\u002FIDEA-CCNL\u002FZiya-LLaMA-13B-Pretrain-v1)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FIDEA-CCNL\u002FZiya-LLaMA-13B-v1\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FIDEA-CCNL\u002FZiya-LLaMA-13B-v1.1\u002Fbert4torch_config.json)\u003Cbr\u002F>\u003Cbr\u002F>|\r\n||[vicuna](https:\u002F\u002Fgithub.com\u002Flm-sys\u002FFastChat)|lmsys|`lmsys\u002Fvicuna-7b-v1.5` [🤗](https:\u002F\u002Fhuggingface.co\u002Flmsys\u002Fvicuna-7b-v1.5)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Flmsys\u002Fvicuna-7b-v1.5\u002Fbert4torch_config.json)|\r\n|Baichuan|[Baichuan](https:\u002F\u002Fgithub.com\u002Fbaichuan-inc\u002FBaichuan)|baichuan-inc|`baichuan-inc\u002FBaichuan-7B` [🤗](https:\u002F\u002Fhuggingface.co\u002Fbaichuan-inc\u002FBaichuan-7B)\u003Cbr\u002F>`baichuan-inc\u002FBaichuan-13B-Base` [🤗](https:\u002F\u002Fhuggingface.co\u002Fbaichuan-inc\u002FBaichuan-13B-Base)\u003Cbr\u002F>`baichuan-inc\u002FBaichuan-13B-Chat` 
[🤗](https:\u002F\u002Fhuggingface.co\u002Fbaichuan-inc\u002FBaichuan-13B-Chat)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fbaichuan-inc\u002FBaichuan-7B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fbaichuan-inc\u002FBaichuan-13B-Base\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fbaichuan-inc\u002FBaichuan-13B-Chat\u002Fbert4torch_config.json)|\r\n||[Baichuan2](https:\u002F\u002Fgithub.com\u002Fbaichuan-inc\u002FBaichuan2)|baichuan-inc|`baichuan-inc\u002FBaichuan2-7B-Base` [🤗](https:\u002F\u002Fhuggingface.co\u002Fbaichuan-inc\u002FBaichuan2-7B-Base)\u003Cbr\u002F>`baichuan-inc\u002FBaichuan2-7B-Chat` [🤗](https:\u002F\u002Fhuggingface.co\u002Fbaichuan-inc\u002FBaichuan2-7B-Chat)\u003Cbr\u002F>`baichuan-inc\u002FBaichuan2-13B-Base` [🤗](https:\u002F\u002Fhuggingface.co\u002Fbaichuan-inc\u002FBaichuan2-13B-Base)\u003Cbr\u002F>`baichuan-inc\u002FBaichuan2-13B-Chat` 
[🤗](https:\u002F\u002Fhuggingface.co\u002Fbaichuan-inc\u002FBaichuan2-13B-Chat)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fbaichuan-inc\u002FBaichuan2-7B-Base\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fbaichuan-inc\u002FBaichuan2-7B-Chat\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fbaichuan-inc\u002FBaichuan2-13B-Base\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fbaichuan-inc\u002FBaichuan2-13B-Chat\u002Fbert4torch_config.json)|\r\n|Yi|[Yi](https:\u002F\u002Fgithub.com\u002F01-ai\u002FYi)|01-ai|`01-ai\u002FYi-6B` [🤗](https:\u002F\u002Fhuggingface.co\u002F01-ai\u002FYi-6B)\u003Cbr\u002F>`01-ai\u002FYi-6B-200K` [🤗](https:\u002F\u002Fhuggingface.co\u002F01-ai\u002FYi-6B-200K)\u003Cbr\u002F>`01-ai\u002FYi-9B` [🤗](https:\u002F\u002Fhuggingface.co\u002F01-ai\u002FYi-9B)\u003Cbr\u002F>`01-ai\u002FYi-9B-200K` [🤗](https:\u002F\u002Fhuggingface.co\u002F01-ai\u002FYi-9B-200K)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002F01-ai\u002FYi-6B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002F01-ai\u002FYi-6B-200K\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002F01-ai\u002FYi-9B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002F01-ai\u002FYi-9B-200K\u002Fbert4torch_config.json)|\r\n||[Yi-1.5](https:\u002F\u002Fgithub.com\u002F01-ai\u002FYi-1.5)|01-ai|`01-ai\u002FYi-1.5-6B` 
[🤗](https:\u002F\u002Fhuggingface.co\u002F01-ai\u002FYi-1.5-6B)\u003Cbr\u002F>`01-ai\u002FYi-1.5-6B-Chat` [🤗](https:\u002F\u002Fhuggingface.co\u002F01-ai\u002FYi-1.5-6B-Chat)\u003Cbr\u002F>`01-ai\u002FYi-1.5-9B` [🤗](https:\u002F\u002Fhuggingface.co\u002F01-ai\u002FYi-1.5-9B)\u003Cbr\u002F>`01-ai\u002FYi-1.5-9B-32K` [🤗](https:\u002F\u002Fhuggingface.co\u002F01-ai\u002FYi-1.5-9B-32K)\u003Cbr\u002F>`01-ai\u002FYi-1.5-9B-Chat` [🤗](https:\u002F\u002Fhuggingface.co\u002F01-ai\u002FYi-1.5-9B-Chat)\u003Cbr\u002F>`01-ai\u002FYi-1.5-9B-Chat-16K` [🤗](https:\u002F\u002Fhuggingface.co\u002F01-ai\u002FYi-1.5-9B-Chat-16K)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002F01-ai\u002FYi-1.5-6B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002F01-ai\u002FYi-1.5-6B-Chat\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002F01-ai\u002FYi-1.5-9B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002F01-ai\u002FYi-1.5-9B-32K\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002F01-ai\u002FYi-1.5-9B-Chat)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002F01-ai\u002FYi-1.5-9B-Chat-16K\u002Fbert4torch_config.json)|\r\n|bloom|[bloom](https:\u002F\u002Fgithub.com\u002Fbigscience-workshop\u002Fxmtf)|bigscience|`bigscience\u002Fbloom-560m` [🤗](https:\u002F\u002Fhuggingface.co\u002Fbigscience\u002Fbloom-560m)\u003Cbr\u002F>`bigscience\u002Fbloomz-560m` 
[🤗](https:\u002F\u002Fhuggingface.co\u002Fbigscience\u002Fbloomz-560m)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fbigscience\u002Fbloom-560m\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fbigscience\u002Fbloomz-560m\u002Fbert4torch_config.json)|\r\n|Qwen|[Qwen](https:\u002F\u002Fgithub.com\u002FQwenLM\u002FQwen)|阿里云|`Qwen\u002FQwen-1_8B` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen-1_8B)\u003Cbr\u002F>`Qwen\u002FQwen-1_8B-Chat` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen-1_8B-Chat)\u003Cbr\u002F>`Qwen\u002FQwen-7B` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen-7B)\u003Cbr\u002F>`Qwen\u002FQwen-7B-Chat` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen-7B-Chat)\u003Cbr\u002F>`Qwen\u002FQwen-14B` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen-14B)\u003Cbr\u002F>`Qwen\u002FQwen-14B-Chat` 
[🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen-14B-Chat)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen-1_8B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen-1_8B-Chat\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen-7B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen-7B-Chat\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen-14B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen-14B-Chat\u002Fbert4torch_config.json)|\r\n||[Qwen1.5](https:\u002F\u002Fgithub.com\u002FQwenLM\u002FQwen1.5)|阿里云|`Qwen\u002FQwen1.5-0.5B` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen1.5-0.5B)\u003Cbr\u002F>`Qwen\u002FQwen1.5-0.5B-Chat` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen1.5-0.5B-Chat)\u003Cbr\u002F>`Qwen\u002FQwen1.5-1.8B` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen1.5-1.8B)\u003Cbr\u002F>`Qwen\u002FQwen1.5-1.8B-Chat` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen1.5-1.8B-Chat)\u003Cbr\u002F>`Qwen\u002FQwen1.5-7B` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen1.5-7B)\u003Cbr\u002F>`Qwen\u002FQwen1.5-7B-Chat` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen1.5-7B-Chat)\u003Cbr\u002F>`Qwen\u002FQwen1.5-14B` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen1.5-14B)\u003Cbr\u002F>`Qwen\u002FQwen1.5-14B-Chat` 
[🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen1.5-14B-Chat)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen1.5-0.5B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen1.5-0.5B-Chat\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen1.5-1.8B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen1.5-1.8B-Chat\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen1.5-7B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen1.5-7B-Chat\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen1.5-14B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen1.5-14B-Chat\u002Fbert4torch_config.json)|\r\n||[Qwen2](https:\u002F\u002Fgithub.com\u002FQwenLM\u002FQwen2)|阿里云|`Qwen\u002FQwen2-0.5B` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen2-0.5B)\u003Cbr\u002F>`Qwen\u002FQwen2-0.5B-Instruct` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen2-0.5B-Instruct)\u003Cbr\u002F>`Qwen\u002FQwen2-1.5B` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen2-1.5B)\u003Cbr\u002F>`Qwen\u002FQwen2-1.5B-Instruct` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen2-1.5B-Instruct)\u003Cbr\u002F>`Qwen\u002FQwen2-7B` 
[🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen2-7B)\u003Cbr\u002F>`Qwen\u002FQwen2-7B-Instruct` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen2-7B-Instruct)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen2-0.5B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen2-0.5B-Instruct\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen2-1.5B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen2-1.5B-Instruct\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen2-7B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen2-7B-Instruct\u002Fbert4torch_config.json)|\r\n||[Qwen2-VL](https:\u002F\u002Fgithub.com\u002FQwenLM\u002FQwen2-VL)|阿里云|`Qwen\u002FQwen2-VL-2B-Instruct` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen2-VL-2B-Instruct)\u003Cbr\u002F>`Qwen\u002FQwen2-VL-7B-Instruct` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen2-VL-7B-Instruct)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen2-VL-2B-Instruct\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen2-VL-7B-Instruct\u002Fbert4torch_config.json)|\r\n||[Qwen2.5](https:\u002F\u002Fgithub.com\u002FQwenLM\u002FQwen2.5)|阿里云|`Qwen\u002FQwen2.5-0.5B` 
[🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen2.5-0.5B)\u003Cbr\u002F>`Qwen\u002FQwen2.5-0.5B-Instruct` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen2.5-0.5B-Instruct)\u003Cbr\u002F>`Qwen\u002FQwen2.5-1.5B` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen2.5-1.5B)\u003Cbr\u002F>`Qwen\u002FQwen2.5-1.5B-Instruct` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen2.5-1.5B-Instruct)\u003Cbr\u002F>`Qwen\u002FQwen2.5-3B` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen2.5-3B)\u003Cbr\u002F>`Qwen\u002FQwen2.5-3B-Instruct` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen2.5-3B-Instruct)\u003Cbr\u002F>`Qwen\u002FQwen2.5-7B` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen2.5-7B)\u003Cbr\u002F>`Qwen\u002FQwen2.5-7B-Instruct` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen2.5-7B-Instruct)\u003Cbr\u002F>`Qwen\u002FQwen2.5-14B` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen2.5-14B)\u003Cbr\u002F>`Qwen\u002FQwen2.5-14B-Instruct` 
[🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen2.5-14B-Instruct)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen2.5-0.5B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen2.5-0.5B-Instruct\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen2.5-1.5B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen2.5-1.5B-Instruct\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen2.5-3B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen2.5-3B-Instruct\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen2.5-7B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen2.5-7B-Instruct\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen2.5-14B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen2.5-14B-Instruct\u002Fbert4torch_config.json)|\r\n||[Qwen2.5-VL](https:\u002F\u002Fgithub.com\u002FQwenLM\u002FQwen2.5-VL)|阿里云|`Qwen\u002FQwen2.5-VL-3B-Instruct` 
[🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen2.5-VL-3B-Instruct)\u003Cbr\u002F>`Qwen\u002FQwen2.5-VL-7B-Instruct` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen2.5-VL-7B-Instruct)\u003Cbr\u002F>`Qwen\u002FQwen2.5-VL-32B-Instruct` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen2.5-VL-32B-Instruct)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen2.5-VL-3B-Instruct\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen2.5-VL-7B-Instruct\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen2.5-VL-32B-Instruct\u002Fbert4torch_config.json)|\r\n||[Qwen3](https:\u002F\u002Fgithub.com\u002FQwenLM\u002FQwen3)|阿里云|`Qwen\u002FQwen3-0.6B-Base` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-0.6B-Base)\u003Cbr\u002F>`Qwen\u002FQwen3-0.6B` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-0.6B)\u003Cbr\u002F>`Qwen\u002FQwen3-0.6B-GPTQ-Int8` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-0.6B-GPTQ-Int8)\u003Cbr\u002F>`Qwen\u002FQwen3-1.7B-Base` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-1.7B-Base)\u003Cbr\u002F>`Qwen\u002FQwen3-1.7B` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-1.7B)\u003Cbr\u002F>`Qwen\u002FQwen3-4B-Base` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-4B-Base)\u003Cbr\u002F>`Qwen\u002FQwen3-4B` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-4B)\u003Cbr\u002F>`Qwen\u002FQwen3-4B-AWQ` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-4B-AWQ)\u003Cbr\u002F>`Qwen\u002FQwen3-8B-Base` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-8B-Base)\u003Cbr\u002F>`Qwen\u002FQwen3-8B` 
[🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-8B)\u003Cbr\u002F>`Qwen\u002FQwen3-14B-Base` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-14B-Base)\u003Cbr\u002F>`Qwen\u002FQwen3-14B` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-14B)\u003Cbr\u002F>`Qwen\u002FQwen3-32B` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-32B)\u003Cbr\u002F>`Qwen\u002FQwen3-4B-Instruct-2507` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-4B-Instruct-2507)\u003Cbr\u002F>`Qwen\u002FQwen3-4B-Thinking-2507` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-4B-Thinking-2507)\u003Cbr\u002F>`Qwen\u002FQwen3-30B-A3B-Instruct-2507` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-30B-A3B-Instruct-2507)\u003Cbr\u002F>`Qwen\u002FQwen3-30B-A3B-Thinking-2507` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-30B-A3B-Thinking-2507)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen3-0.6B-Base\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen3-0.6B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen3-0.6B-GPTQ-Int8\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen3-1.7B-Base\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen3-1.7B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen3-4B-Base\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u0
02FQwen\u002FQwen3-4B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen3-4B-AWQ\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen3-8B-Base\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen3-8B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen3-14B-Base\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen3-14B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen3-32B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen3-4B-Instruct-2507\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen3-4B-Thinking-2507\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen3-30B-A3B-Instruct-2507\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen3-30B-A3B-Thinking-2507\u002Fbert4torch_config.json)|\r\n||[Qwen3-VL](https:\u002F\u002Fhuggingface.co\u002Fcollections\u002FQwen\u002Fqwen3-vl)|阿里云|`Qwen\u002FQwen3-VL-2B-Instruct` 
[🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-VL-2B-Instruct)\u003Cbr\u002F>`Qwen\u002FQwen3-VL-2B-Thinking` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-VL-2B-Thinking)\u003Cbr\u002F>`Qwen\u002FQwen3-VL-4B-Instruct` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-VL-4B-Instruct)\u003Cbr\u002F>`Qwen\u002FQwen3-VL-4B-Thinking` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-VL-4B-Thinking)\u003Cbr\u002F>`Qwen\u002FQwen3-VL-8B-Instruct` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-VL-8B-Instruct)\u003Cbr\u002F>`Qwen\u002FQwen3-VL-8B-Thinking` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-VL-8B-Thinking)\u003Cbr\u002F>`Qwen\u002FQwen3-VL-30B-A3B-Instruct` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-VL-30B-A3B-Instruct)\u003Cbr\u002F>`Qwen\u002FQwen3-VL-30B-A3B-Thinking` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-VL-30B-A3B-Thinking)\u003Cbr\u002F>`Qwen\u002FQwen3-VL-32B-Instruct` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-VL-32B-Instruct)\u003Cbr\u002F>`Qwen\u002FQwen3-VL-32B-Thinking` 
[🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-VL-32B-Thinking)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen3-VL-2B-Instruct\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen3-VL-2B-Thinking\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen3-VL-4B-Instruct\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen3-VL-4B-Thinking\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen3-VL-8B-Instruct\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen3-VL-8B-Thinking\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen3-VL-30B-A3B-Instruct\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen3-VL-30B-A3B-Thinking\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen3-VL-32B-Instruct\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen3-VL-32B-Thinking\u002Fbert4torch_config.json)|\r\n||[Qwen3-Embedding](https:\u002F\u002Fgithub.com\u002FQwenLM\u002FQwen3)|阿里云|`Qwen\u002FQwen3-Embedding-0.6B` 
[🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-Embedding-0.6B)\u003Cbr\u002F>`Qwen\u002FQwen3-Embedding-4B` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-Embedding-4B)\u003Cbr\u002F>`Qwen\u002FQwen3-Embedding-8B` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-Embedding-8B)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen3-Embedding-0.6B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen3-Embedding-4B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FQwen\u002FQwen3-Embedding-8B\u002Fbert4torch_config.json)|\r\n||[Qwen3-Reranker](https:\u002F\u002Fgithub.com\u002FQwenLM\u002FQwen3)|阿里云|`Qwen\u002FQwen3-Reranker-0.6B` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-Reranker-0.6B)\u003Cbr\u002F>`Qwen\u002FQwen3-Reranker-4B` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-Reranker-4B)\u003Cbr\u002F>`Qwen\u002FQwen3-Reranker-8B` [🤗](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen3-Reranker-8B)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Ftree\u002Fmain\u002FQwen\u002FQwen3-Reranker-0.6B)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Ftree\u002Fmain\u002FQwen\u002FQwen3-Reranker-4B)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Ftree\u002Fmain\u002FQwen\u002FQwen3-Reranker-8B)|\r\n|Intern|[InternLM](https:\u002F\u002Fgithub.com\u002FInternLM\u002FInternLM)|上海人工智能实验室|`internlm\u002Finternlm-7b` [🤗](https:\u002F\u002Fhuggingface.co\u002Finternlm\u002Finternlm-7b)\u003Cbr\u002F>`internlm\u002Finternlm-chat-7b` 
[🤗](https:\u002F\u002Fhuggingface.co\u002Finternlm\u002Finternlm-chat-7b)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Finternlm\u002Finternlm-7b\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Finternlm\u002Finternlm-chat-7b\u002Fbert4torch_config.json)|\r\n||[InternLM2](https:\u002F\u002Fhuggingface.co\u002Fcollections\u002Finternlm\u002Finternlm2-65b0ce04970888799707893c)|上海人工智能实验室|`internlm\u002Finternlm2-1_8b` [🤗](https:\u002F\u002Fhuggingface.co\u002Finternlm\u002Finternlm2-1_8b)\u003Cbr\u002F>`internlm\u002Finternlm2-chat-1_8b` [🤗](https:\u002F\u002Fhuggingface.co\u002Finternlm\u002Finternlm2-chat-1_8b)\u003Cbr\u002F>`internlm\u002Finternlm2-7b` [🤗](https:\u002F\u002Fhuggingface.co\u002Finternlm\u002Finternlm2-7b)\u003Cbr\u002F>`internlm\u002Finternlm2-chat-7b` [🤗](https:\u002F\u002Fhuggingface.co\u002Finternlm\u002Finternlm2-chat-7b)\u003Cbr\u002F>`internlm\u002Finternlm2-20b` [🤗](https:\u002F\u002Fhuggingface.co\u002Finternlm\u002Finternlm2-20b)\u003Cbr\u002F>`internlm\u002Finternlm2-chat-20b` 
[🤗](https:\u002F\u002Fhuggingface.co\u002Finternlm\u002Finternlm2-chat-20b)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Finternlm\u002Finternlm2-1_8b\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Finternlm\u002Finternlm2-chat-1_8b\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Finternlm\u002Finternlm2-7b\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Finternlm\u002Finternlm2-chat-7b\u002Fbert4torch_config.json)\u003Cbr\u002F>\u003Cbr\u002F>\u003Cbr\u002F>|\r\n||[InternLM2.5](https:\u002F\u002Fhuggingface.co\u002Fcollections\u002Finternlm\u002Finternlm25-66853f32717072d17581bc13)|上海人工智能实验室|`internlm\u002Finternlm2_5-7b` [🤗](https:\u002F\u002Fhuggingface.co\u002Finternlm\u002Finternlm2_5-7b)\u003Cbr\u002F>`internlm\u002Finternlm2_5-7b-chat` [🤗](https:\u002F\u002Fhuggingface.co\u002Finternlm\u002Finternlm2_5-7b-chat)\u003Cbr\u002F>`internlm\u002Finternlm2_5-7b-chat-1m` [🤗](https:\u002F\u002Fhuggingface.co\u002Finternlm\u002Finternlm2_5-7b-chat-1m)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Finternlm\u002Finternlm2_5-7b\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Finternlm\u002Finternlm2_5-7b-chat\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Finternlm\u002Finternlm2_5-7b-chat-1m\u002Fbert4torch_config.json)|\r\n||[InternLM3](https:\u002F\u002Fhuggingface.co\u002Fcollections\u002Finternlm\u002Finternlm3-67875827c377690c01a9131d)|上海人工智能实验室|`internlm\u002Finternlm3-8b-instruct` 
[🤗](https:\u002F\u002Fhuggingface.co\u002Finternlm\u002Finternlm3-8b-instruct)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Finternlm\u002Finternlm3-8b-instruct\u002Fbert4torch_config.json)|\r\n||[InternVL1.0-1.5](https:\u002F\u002Fgithub.com\u002FOpenGVLab\u002FInternVL)|上海人工智能实验室|`OpenGVLab\u002FMini-InternVL-Chat-4B-V1-5` [🤗](https:\u002F\u002Fhuggingface.co\u002FOpenGVLab\u002FMini-InternVL-Chat-4B-V1-5)\u003Cbr\u002F>`OpenGVLab\u002FMini-InternVL-Chat-2B-V1-5` [🤗](https:\u002F\u002Fhuggingface.co\u002FOpenGVLab\u002FMini-InternVL-Chat-2B-V1-5)|待添加|\r\n||[InternVL2.0](https:\u002F\u002Fgithub.com\u002FOpenGVLab\u002FInternVL)|上海人工智能实验室|`OpenGVLab\u002FInternVL2-1B` [🤗](https:\u002F\u002Fhuggingface.co\u002FOpenGVLab\u002FInternVL2-1B)\u003Cbr\u002F>`OpenGVLab\u002FInternVL2-2B` [🤗](https:\u002F\u002Fhuggingface.co\u002FOpenGVLab\u002FInternVL2-2B)\u003Cbr\u002F>`OpenGVLab\u002FInternVL2-4B` [🤗](https:\u002F\u002Fhuggingface.co\u002FOpenGVLab\u002FInternVL2-4B)\u003Cbr\u002F>`OpenGVLab\u002FInternVL2-8B` [🤗](https:\u002F\u002Fhuggingface.co\u002FOpenGVLab\u002FInternVL2-8B)|待添加|\r\n||[InternVL2.5](https:\u002F\u002Fgithub.com\u002FOpenGVLab\u002FInternVL)|上海人工智能实验室|`OpenGVLab\u002FInternVL2_5-1B` [🤗](https:\u002F\u002Fhuggingface.co\u002FOpenGVLab\u002FInternVL2_5-1B)\u003Cbr\u002F>`OpenGVLab\u002FInternVL2_5-2B` [🤗](https:\u002F\u002Fhuggingface.co\u002FOpenGVLab\u002FInternVL2_5-2B)\u003Cbr\u002F>`OpenGVLab\u002FInternVL2_5-4B` [🤗](https:\u002F\u002Fhuggingface.co\u002FOpenGVLab\u002FInternVL2_5-4B)\u003Cbr\u002F>`OpenGVLab\u002FInternVL2_5-8B` 
[🤗](https:\u002F\u002Fhuggingface.co\u002FOpenGVLab\u002FInternVL2_5-8B)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FOpenGVLab\u002FInternVL2_5-1B\u002Fbert4torch_config.json)\u003Cbr\u002F>待添加\u003Cbr\u002F>待添加\u003Cbr\u002F>待添加|\r\n|Falcon|[Falcon](https:\u002F\u002Fhuggingface.co\u002Ftiiuae)|tiiuae|`tiiuae\u002Ffalcon-rw-1b` [🤗](https:\u002F\u002Fhuggingface.co\u002Ftiiuae\u002Ffalcon-rw-1b)\u003Cbr\u002F>`tiiuae\u002Ffalcon-7b` [🤗](https:\u002F\u002Fhuggingface.co\u002Ftiiuae\u002Ffalcon-7b)\u003Cbr\u002F>`tiiuae\u002Ffalcon-7b-instruct` [🤗](https:\u002F\u002Fhuggingface.co\u002Ftiiuae\u002Ffalcon-7b-instruct)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Ftiiuae\u002Ffalcon-rw-1b\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Ftiiuae\u002Ffalcon-7b\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Ftiiuae\u002Ffalcon-7b-instruct\u002Fbert4torch_config.json)|\r\n|DeepSeek|[DeepSeek-MoE](https:\u002F\u002Fgithub.com\u002Fdeepseek-ai\u002FDeepSeek-MoE)|深度求索|`deepseek-ai\u002Fdeepseek-moe-16b-base` [🤗](https:\u002F\u002Fhuggingface.co\u002Fdeepseek-ai\u002Fdeepseek-moe-16b-base)\u003Cbr\u002F>`deepseek-ai\u002Fdeepseek-moe-16b-chat` 
[🤗](https:\u002F\u002Fhuggingface.co\u002Fdeepseek-ai\u002Fdeepseek-moe-16b-chat)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fdeepseek-ai\u002Fdeepseek-moe-16b-base\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fdeepseek-ai\u002Fdeepseek-moe-16b-chat\u002Fbert4torch_config.json)|\r\n||[DeepSeek-LLM](https:\u002F\u002Fgithub.com\u002Fdeepseek-ai\u002FDeepSeek-LLM)|深度求索|`deepseek-ai\u002Fdeepseek-llm-7b-base` [🤗](https:\u002F\u002Fhuggingface.co\u002Fdeepseek-ai\u002Fdeepseek-llm-7b-base)\u003Cbr\u002F>`deepseek-ai\u002Fdeepseek-llm-7b-chat` [🤗](https:\u002F\u002Fhuggingface.co\u002Fdeepseek-ai\u002Fdeepseek-llm-7b-chat)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fdeepseek-ai\u002Fdeepseek-llm-7b-base\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fdeepseek-ai\u002Fdeepseek-llm-7b-chat\u002Fbert4torch_config.json)|\r\n||[DeepSeek-V2](https:\u002F\u002Fgithub.com\u002Fdeepseek-ai\u002FDeepSeek-V2)|深度求索|`deepseek-ai\u002FDeepSeek-V2-Lite` [🤗](https:\u002F\u002Fhuggingface.co\u002Fdeepseek-ai\u002FDeepSeek-V2-Lite)\u003Cbr\u002F>`deepseek-ai\u002FDeepSeek-V2-Lite-Chat` [🤗](https:\u002F\u002Fhuggingface.co\u002Fdeepseek-ai\u002FDeepSeek-V2-Lite-Chat)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fdeepseek-ai\u002FDeepSeek-V2-Lite\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fdeepseek-ai\u002FDeepSeek-V2-Lite-Chat\u002Fbert4torch_config.json)|\r\n||[DeepSeek-Coder](https:\u002F\u002Fgithub.com\u002Fdeepseek-ai\u002FDeepSeek-Coder)|深度求索|`deepseek-ai\u002Fdeepseek-coder-1.3b-base` 
[🤗](https:\u002F\u002Fhuggingface.co\u002Fdeepseek-ai\u002Fdeepseek-coder-1.3b-base)\u003Cbr\u002F>`deepseek-ai\u002Fdeepseek-coder-1.3b-instruct` [🤗](https:\u002F\u002Fhuggingface.co\u002Fdeepseek-ai\u002Fdeepseek-coder-1.3b-instruct)\u003Cbr\u002F>`deepseek-ai\u002Fdeepseek-coder-6.7b-base` [🤗](https:\u002F\u002Fhuggingface.co\u002Fdeepseek-ai\u002Fdeepseek-coder-6.7b-base)\u003Cbr\u002F>`deepseek-ai\u002Fdeepseek-coder-6.7b-instruct` [🤗](https:\u002F\u002Fhuggingface.co\u002Fdeepseek-ai\u002Fdeepseek-coder-6.7b-instruct)\u003Cbr\u002F>`deepseek-ai\u002Fdeepseek-coder-7b-base-v1.5` [🤗](https:\u002F\u002Fhuggingface.co\u002Fdeepseek-ai\u002Fdeepseek-coder-7b-base-v1.5)\u003Cbr\u002F>`deepseek-ai\u002Fdeepseek-coder-7b-instruct-v1.5` [🤗](https:\u002F\u002Fhuggingface.co\u002Fdeepseek-ai\u002Fdeepseek-coder-7b-instruct-v1.5)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fdeepseek-ai\u002Fdeepseek-coder-1.3b-base\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fdeepseek-ai\u002Fdeepseek-coder-1.3b-instruct\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fdeepseek-ai\u002Fdeepseek-coder-6.7b-base\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fdeepseek-ai\u002Fdeepseek-coder-6.7b-instruct\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fdeepseek-ai\u002Fdeepseek-coder-7b-base-v1.5\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fdeepseek-ai\u002Fdeepseek-coder-7b-instruct-v1.5\u002Fbert4torch_config.json)|\r\n||[DeepSeek-Coder-V2](https:\u002F\u002F
github.com\u002Fdeepseek-ai\u002FDeepSeek-Coder-V2)|深度求索|`deepseek-ai\u002FDeepSeek-Coder-V2-Lite-Base` [🤗](https:\u002F\u002Fhuggingface.co\u002Fdeepseek-ai\u002FDeepSeek-Coder-V2-Lite-Base)\u003Cbr\u002F>`deepseek-ai\u002FDeepSeek-Coder-V2-Lite-Instruct` [🤗](https:\u002F\u002Fhuggingface.co\u002Fdeepseek-ai\u002FDeepSeek-Coder-V2-Lite-Instruct)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fdeepseek-ai\u002FDeepSeek-Coder-V2-Lite-Base\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fdeepseek-ai\u002FDeepSeek-Coder-V2-Lite-Instruct\u002Fbert4torch_config.json)|\r\n||[DeepSeek-Math](https:\u002F\u002Fgithub.com\u002Fdeepseek-ai\u002FDeepSeek-Math)|深度求索|`deepseek-ai\u002Fdeepseek-math-7b-base` [🤗](https:\u002F\u002Fhuggingface.co\u002Fdeepseek-ai\u002Fdeepseek-math-7b-base)\u003Cbr\u002F>`deepseek-ai\u002Fdeepseek-math-7b-instruct` [🤗](https:\u002F\u002Fhuggingface.co\u002Fdeepseek-ai\u002Fdeepseek-math-7b-instruct)\u003Cbr\u002F>`deepseek-ai\u002Fdeepseek-math-7b-rl` [🤗](https:\u002F\u002Fhuggingface.co\u002Fdeepseek-ai\u002Fdeepseek-math-7b-rl)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fdeepseek-ai\u002Fdeepseek-math-7b-base\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fdeepseek-ai\u002Fdeepseek-math-7b-instruct\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fdeepseek-ai\u002Fdeepseek-math-7b-rl\u002Fbert4torch_config.json)|\r\n||[DeepSeek-R1](https:\u002F\u002Fhuggingface.co\u002Fcollections\u002Fdeepseek-ai\u002Fdeepseek-r1-678e1e131c0169c0bc89728d)|深度求索|`deepseek-ai\u002FDeepSeek-R1-Distill-Qwen-1.5B` 
[🤗](https:\u002F\u002Fhuggingface.co\u002Fdeepseek-ai\u002FDeepSeek-R1-Distill-Qwen-1.5B)\u003Cbr\u002F>`deepseek-ai\u002FDeepSeek-R1-Distill-Qwen-7B` [🤗](https:\u002F\u002Fhuggingface.co\u002Fdeepseek-ai\u002FDeepSeek-R1-Distill-Qwen-7B)\u003Cbr\u002F>`deepseek-ai\u002FDeepSeek-R1-Distill-Llama-8B` [🤗](https:\u002F\u002Fhuggingface.co\u002Fdeepseek-ai\u002FDeepSeek-R1-Distill-Llama-8B)\u003Cbr\u002F>`deepseek-ai\u002FDeepSeek-R1-Distill-Qwen-14B` [🤗](https:\u002F\u002Fhuggingface.co\u002Fdeepseek-ai\u002FDeepSeek-R1-Distill-Qwen-14B)\u003Cbr\u002F>`deepseek-ai\u002FDeepSeek-R1-Distill-Qwen-32B` [🤗](https:\u002F\u002Fhuggingface.co\u002Fdeepseek-ai\u002FDeepSeek-R1-Distill-Qwen-32B)\u003Cbr\u002F>`deepseek-ai\u002FDeepSeek-R1-0528-Qwen3-8B` [🤗](https:\u002F\u002Fhuggingface.co\u002Fdeepseek-ai\u002FDeepSeek-R1-0528-Qwen3-8B)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fdeepseek-ai\u002FDeepSeek-R1-Distill-Qwen-1.5B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fdeepseek-ai\u002FDeepSeek-R1-Distill-Qwen-7B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fdeepseek-ai\u002FDeepSeek-R1-Distill-Llama-8B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fdeepseek-ai\u002FDeepSeek-R1-Distill-Qwen-14B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fdeepseek-ai\u002FDeepSeek-R1-Distill-Qwen-32B\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fdeepseek-ai\u002FDeepSeek-R1-0528-Qwen3-8B\u002Fbert4torch_config.json)|\r\n|Seed-OSS|[Seed-OSS](https:\u002F\u0
02Fhuggingface.co\u002Fcollections\u002FByteDance-Seed\u002Fseed-oss-68a609f4201e788db05b5dcd)|ByteDance|`ByteDance-Seed\u002FSeed-OSS-36B-Instruct` [🤗](https:\u002F\u002Fhuggingface.co\u002FByteDance-Seed\u002FSeed-OSS-36B-Instruct)\u003Cbr\u002F>`ByteDance-Seed\u002FSeed-OSS-36B-Base` [🤗](https:\u002F\u002Fhuggingface.co\u002FByteDance-Seed\u002FSeed-OSS-36B-Base)\u003Cbr\u002F>`ByteDance-Seed\u002FSeed-OSS-36B-Base-woSyn` [🤗](https:\u002F\u002Fhuggingface.co\u002FByteDance-Seed\u002FSeed-OSS-36B-Base-woSyn)||\r\n|Ernie4_5|[Ernie4_5](https:\u002F\u002Fhuggingface.co\u002Fcollections\u002Fbaidu\u002Fernie-45-6861cd4c9be84540645f35c9)|百度|`baidu\u002FERNIE-4.5-0.3B-Base-PT` [🤗](https:\u002F\u002Fhuggingface.co\u002Fbaidu\u002FERNIE-4.5-0.3B-Base-PT)\u003Cbr\u002F>`baidu\u002FERNIE-4.5-0.3B-PT` [🤗](https:\u002F\u002Fhuggingface.co\u002Fbaidu\u002FERNIE-4.5-0.3B-PT)\u003Cbr\u002F>`baidu\u002FERNIE-4.5-21B-A3B-Base-PT` [🤗](https:\u002F\u002Fhuggingface.co\u002Fbaidu\u002FERNIE-4.5-21B-A3B-Base-PT)\u003Cbr\u002F>`baidu\u002FERNIE-4.5-21B-A3B-PT` [🤗](https:\u002F\u002Fhuggingface.co\u002Fbaidu\u002FERNIE-4.5-21B-A3B-PT)\u003Cbr\u002F>`baidu\u002FERNIE-4.5-VL-28B-A3B-Base-PT` [🤗](https:\u002F\u002Fhuggingface.co\u002Fbaidu\u002FERNIE-4.5-VL-28B-A3B-Base-PT)\u003Cbr\u002F>`baidu\u002FERNIE-4.5-VL-28B-A3B-PT` [🤗](https:\u002F\u002Fhuggingface.co\u002Fbaidu\u002FERNIE-4.5-VL-28B-A3B-PT)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fbaidu\u002FERNIE-4.5-0.3B-Base-PT\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fbaidu\u002FERNIE-4.5-0.3B-PT\u002Fbert4torch_config.json)|\r\n|PaddleOCR|[PaddleOCR-VL](https:\u002F\u002Fhuggingface.co\u002FPaddlePaddle\u002FPaddleOCR-VL)|百度|`PaddlePaddle\u002FPaddleOCR-VL` 
[🤗](https:\u002F\u002Fhuggingface.co\u002FPaddlePaddle\u002FPaddleOCR-VL)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FPaddlePaddle\u002FPaddleOCR-VL\u002Fbert4torch_config.json)|\r\n||[PaddleOCR-VL-1.5](https:\u002F\u002Fhuggingface.co\u002FPaddlePaddle\u002FPaddleOCR-VL-1.5)|百度|`PaddlePaddle\u002FPaddleOCR-VL-1.5` [🤗](https:\u002F\u002Fhuggingface.co\u002FPaddlePaddle\u002FPaddleOCR-VL-1.5)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FPaddlePaddle\u002FPaddleOCR-VL-1.5\u002Fbert4torch_config.json)|\r\n|MiniCPM|[MiniCPM](https:\u002F\u002Fgithub.com\u002FOpenBMB\u002FMiniCPM)|OpenBMB|`openbmb\u002FMiniCPM-2B-sft-bf16` [🤗](https:\u002F\u002Fhuggingface.co\u002Fopenbmb\u002FMiniCPM-2B-sft-bf16)\u003Cbr\u002F>`openbmb\u002FMiniCPM-2B-dpo-bf16` [🤗](https:\u002F\u002Fhuggingface.co\u002Fopenbmb\u002FMiniCPM-2B-dpo-bf16)\u003Cbr\u002F>`openbmb\u002FMiniCPM-2B-128k` [🤗](https:\u002F\u002Fhuggingface.co\u002Fopenbmb\u002FMiniCPM-2B-128k)\u003Cbr\u002F>`openbmb\u002FMiniCPM-1B-sft-bf16` [🤗](https:\u002F\u002Fhuggingface.co\u002Fopenbmb\u002FMiniCPM-1B-sft-bf16)\u003Cbr\u002F>`openbmb\u002FMiniCPM3-4B` [🤗](https:\u002F\u002Fhuggingface.co\u002Fopenbmb\u002FMiniCPM3-4B)\u003Cbr\u002F>`openbmb\u002FMiniCPM4-0.5B` [🤗](https:\u002F\u002Fhuggingface.co\u002Fopenbmb\u002FMiniCPM4-0.5B)\u003Cbr\u002F>`openbmb\u002FMiniCPM4-8B` 
[🤗](https:\u002F\u002Fhuggingface.co\u002Fopenbmb\u002FMiniCPM4-8B)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fopenbmb\u002FMiniCPM-2B-sft-bf16\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fopenbmb\u002FMiniCPM-2B-dpo-bf16\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fopenbmb\u002FMiniCPM-2B-128k\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fopenbmb\u002FMiniCPM-1B-sft-bf16\u002Fbert4torch_config.json)\u003Cbr\u002F>待添加\u003Cbr\u002F>待添加\u003Cbr\u002F>待添加|\r\n||[MiniCPM-o](https:\u002F\u002Fgithub.com\u002FOpenBMB\u002FMiniCPM-o)|OpenBMB|`openbmb\u002FMiniCPM-Llama3-V-2_5` [🤗](https:\u002F\u002Fhuggingface.co\u002Fopenbmb\u002FMiniCPM-Llama3-V-2_5)\u003Cbr\u002F>`openbmb\u002FMiniCPM-V-2_6` [🤗](https:\u002F\u002Fhuggingface.co\u002Fopenbmb\u002FMiniCPM-V-2_6)\u003Cbr\u002F>`openbmb\u002FMiniCPM-o-2_6` [🤗](https:\u002F\u002Fhuggingface.co\u002Fopenbmb\u002FMiniCPM-o-2_6)\u003Cbr\u002F>`openbmb\u002FMiniCPM-V-4` [🤗](https:\u002F\u002Fhuggingface.co\u002Fopenbmb\u002FMiniCPM-V-4)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fopenbmb\u002FMiniCPM-Llama3-V-2_5\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fopenbmb\u002FMiniCPM-V-2_6\u002Fbert4torch_config.json)\u003Cbr\u002F>待添加\u003Cbr\u002F>待添加|\r\n|embedding|[text2vec-base-chinese](https:\u002F\u002Fgithub.com\u002Fshibing624\u002Ftext2vec)|shibing624|`shibing624\u002Ftext2vec-base-chinese` 
[🤗](https:\u002F\u002Fhuggingface.co\u002Fshibing624\u002Ftext2vec-base-chinese)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fshibing624\u002Ftext2vec-base-chinese\u002Fbert4torch_config.json)|\r\n||[m3e](https:\u002F\u002Fgithub.com\u002Fwangyuxinwhy\u002Funiem)|moka-ai|`moka-ai\u002Fm3e-base` [🤗](https:\u002F\u002Fhuggingface.co\u002Fmoka-ai\u002Fm3e-base)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fmoka-ai\u002Fm3e-base\u002Fbert4torch_config.json)|\r\n||bge|BAAI|`BAAI\u002Fbge-large-en-v1.5` [🤗](https:\u002F\u002Fhuggingface.co\u002FBAAI\u002Fbge-large-en-v1.5)\u003Cbr\u002F>`BAAI\u002Fbge-large-zh-v1.5` [🤗](https:\u002F\u002Fhuggingface.co\u002FBAAI\u002Fbge-large-zh-v1.5)\u003Cbr\u002F>`BAAI\u002Fbge-base-en-v1.5` [🤗](https:\u002F\u002Fhuggingface.co\u002FBAAI\u002Fbge-base-en-v1.5)\u003Cbr\u002F>`BAAI\u002Fbge-base-zh-v1.5` [🤗](https:\u002F\u002Fhuggingface.co\u002FBAAI\u002Fbge-base-zh-v1.5)\u003Cbr\u002F>`BAAI\u002Fbge-small-en-v1.5` [🤗](https:\u002F\u002Fhuggingface.co\u002FBAAI\u002Fbge-small-en-v1.5)\u003Cbr\u002F>`BAAI\u002Fbge-small-zh-v1.5` 
[🤗](https:\u002F\u002Fhuggingface.co\u002FBAAI\u002Fbge-small-zh-v1.5)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FBAAI\u002Fbge-large-en-v1.5\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FBAAI\u002Fbge-large-zh-v1.5\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FBAAI\u002Fbge-base-en-v1.5\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FBAAI\u002Fbge-base-zh-v1.5\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FBAAI\u002Fbge-small-en-v1.5\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002FBAAI\u002Fbge-small-zh-v1.5\u002Fbert4torch_config.json)|\r\n||gte|thenlper|`thenlper\u002Fgte-large-zh` [🤗](https:\u002F\u002Fhuggingface.co\u002Fthenlper\u002Fgte-large-zh)\u003Cbr\u002F>`thenlper\u002Fgte-base-zh` [🤗](https:\u002F\u002Fhuggingface.co\u002Fthenlper\u002Fgte-base-zh)|[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fthenlper\u002Fgte-large-zh\u002Fbert4torch_config.json)\u003Cbr\u002F>[🤗](https:\u002F\u002Fhuggingface.co\u002FTongjilibo\u002Fbert4torch_config\u002Fblob\u002Fmain\u002Fthenlper\u002Fgte-base-zh\u002Fbert4torch_config.json)|\r\n\r\n*Notes:\r\n\r\n1. Names shown in `code format` (e.g. `bert-base-chinese`) can be passed directly to `build_transformer_model()`, which downloads them over the network\r\n2. 
Speed up downloads with a mirror site in mainland China\r\n\r\n   - `HF_ENDPOINT=https:\u002F\u002Fhf-mirror.com python your_script.py`\r\n   - Run `export HF_ENDPOINT=https:\u002F\u002Fhf-mirror.com` and then execute the Python code\r\n   - Set it at the top of the Python code as follows\r\n\r\n   ```python\r\n   import os\r\n   os.environ['HF_ENDPOINT'] = \"https:\u002F\u002Fhf-mirror.com\"\r\n   ```\n\n## 6. Acknowledgments\n\n- Thanks to Su Jianlin (苏神) for implementing [bert4keras](https:\u002F\u002Fgithub.com\u002Fbojone\u002Fbert4keras); this implementation draws on the bert4keras source code in many places, and I sincerely thank him for his generous contribution;\n- Thanks also to the [bert4pytorch](https:\u002F\u002Fgithub.com\u002FMuQiuJun-AI\u002Fbert4pytorch) project, which gave me the idea and approach of reproducing bert4keras in PyTorch.\n\n## 7. Citation\n\n```\n@misc{bert4torch,\n  title={bert4torch},\n  author={Bo Li},\n  year={2022},\n  howpublished={\\url{https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch}},\n}\n```\n\n## 8. Other\n\n- WeChat and star history chart\n- The WeChat group has more than 200 members, so joining requires an invitation; add my personal WeChat to be invited, with the note: bert4torch-name-company\n\n\u003Ctable border=\"0\">\n  \u003Ctbody>\n    \u003Ctr align=\"center\" >\n      \u003Ctd>\n         \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FTongjilibo\">\u003Cimg width=\"200\" height=\"250\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FTongjilibo_bert4torch_readme_334f6a1ee93b.jpg\" alt=\"pic\">\u003C\u002Fa>\u003Cbr \u002F>\n         \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FTongjilibo\">WeChat ID\u003C\u002Fa> \n      \u003C\u002Ftd>\n      \u003Ctd>\n         \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FTongjilibo\">\u003Cimg width=\"190\" height=\"250\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FTongjilibo_bert4torch_readme_3c4c8e02bdd1.jpg\" alt=\"pic\">\u003C\u002Fa>\u003Cbr \u002F>\n         \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FTongjilibo\">WeChat group\u003C\u002Fa> \n      \u003C\u002Ftd>\n      \u003Ctd>\n         \u003Ca href=\"https:\u002F\u002Fstar-history.com\u002F#Tongjilibo\u002Fbert4torch&Date\">\u003Cimg width=\"400\" height=\"250\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FTongjilibo_bert4torch_readme_dbfdc271a63b.png\" 
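alt=\"pic\">\u003C\u002Fa>\u003Cbr \u002F>\n         \u003Ca href=\"https:\u002F\u002Fstar-history.com\u002F#Tongjilibo\u002Fbert4torch&Date\">Star history chart\u003C\u002Fa> \n      \u003C\u002Ftd>  \n      \u003C\u002Ftr>\n  \u003C\u002Ftbody>\n\u003C\u002Ftable>","# bert4torch Quick Start Guide\n\nbert4torch is a lightweight NLP framework built on PyTorch with a Keras-style training workflow. It can load many pretrained models, including BERT, RoBERTa, LLaMA, and ChatGLM, and covers the path from fine-tuning to large-model inference and deployment.\n\n## 1. Environment Setup\n\nBefore you start, make sure your development environment meets the following requirements:\n\n*   **Operating system**: Linux, macOS, or Windows\n*   **Python version**: Python 3.8+ recommended\n*   **Core dependencies**:\n    *   `torch` >= 2.0 (recommended; torch 2.0 is supported, and the project was originally developed on 1.10)\n    *   `transformers` (optional, for loading HuggingFace models)\n*   **Hardware**: for large-model inference or fine-tuning, an NVIDIA GPU with CUDA support is recommended.\n\n## 2. Installation\n\nYou can install the stable release with pip, or install the latest development version from GitHub.\n\n### Install the stable release\n```shell\npip install bert4torch\n```\n\n### Install the latest version (recommended)\nBecause the PyPI package may lag behind, installing directly from GitHub is recommended to get support for the latest models (such as Qwen3 and Ernie4_5):\n```shell\npip install git+https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch\n```\n\n> **Note**: if you develop from a `git clone`, check your import paths and whether the weight files need format conversion.\n\n## 3. 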
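Basic Usage\n\nThe core strengths of bert4torch are a unified model-loading interface and a concise training workflow.\n\n### 3.1 Loading a Pretrained Model\n\n`build_transformer_model` loads all supported models, either from a local path or by downloading automatically from HuggingFace.\n\n```python\nfrom bert4torch.models import build_transformer_model\n\n# Case 1: download from HuggingFace automatically and load (e.g. Chinese BERT)\n# The weight files and the bert4torch_config.json config are downloaded automatically\nmodel = build_transformer_model(checkpoint_path='google-bert\u002Fbert-base-chinese')\n\n# Case 2: load a local model folder\n# The folder must contain *.bin\u002F*.safetensors weight files and bert4torch_config.json\nmodel = build_transformer_model(checkpoint_path='.\u002Fmodel\u002Fmy_bert_ckpt')\n\n# Case 3: initialize the model structure only (no weights loaded; for training from scratch)\nmodel = build_transformer_model('.\u002Fmodel\u002Fbert4torch_config.json')\n```\n\n### 3.2 One-Command Deployment of Large Models\n\nNo extra code is needed: a large-model service can be started directly from the command line (ChatGLM, LLaMA, Qwen, and others are supported).\n\n**Start a Gradio web UI for chat:**\n```shell\n# Download the model automatically and start the web UI\nbert4torch serve --checkpoint_path Qwen2-0.5B-Instruct --mode gradio\n\n# Or use the path of a locally downloaded model\nbert4torch serve --checkpoint_path \u002Fdata\u002Fpretrain_ckpt\u002FQwen\u002FQwen2-0.5B-Instruct --mode gradio\n```\n\n**Start an OpenAI-compatible API service:**\n```shell\nbert4torch serve --checkpoint_path Qwen2-0.5B-Instruct --mode openai\n```\n\n**Interactive mode in the terminal:**\n```shell\nbert4torch serve --checkpoint_path Qwen2-0.5B-Instruct --mode cli\n```\n\n### 3.3 Fine-Tuning in Brief\n\nThe framework trains with a Keras-like `fit` call and has built-in support for a progress bar, Logger, and Tensorboard.\n\n```python\n# Sketch only: define model -> prepare data -> compile -> train\nimport torch.nn as nn\nimport torch.optim as optim\nfrom bert4torch.models import build_transformer_model\n\n# config_path and checkpoint_path are placeholders; add_trainer=True is assumed to be what exposes compile() and fit()\nmodel = build_transformer_model(config_path, checkpoint_path, add_trainer=True)\n# ... 
data preprocessing: build train_dataloader (a torch DataLoader) ...\nmodel.compile(loss=nn.CrossEntropyLoss(), optimizer=optim.Adam(model.parameters(), lr=2e-5))\nmodel.fit(train_dataloader, steps_per_epoch=100, epochs=3)\n```\n\nFor more detailed examples (sequence labeling, text classification, LoRA fine-tuning, and so on), see the official [examples](https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch\u002Fblob\u002Fmaster\u002Fexamples) directory.","An algorithm team at an e-commerce company urgently needs to fine-tune a large language model (such as ChatGLM or LLaMA) on user review data to build an intelligent customer service system, but faces the burden of developing the whole pipeline, from loading weights to deploying the service.\n\n### Without bert4torch\n- **Tedious setup**: complex code must be written by hand to convert pretrained weights from different sources, and loading often fails because of version incompatibilities.\n- **Reinventing the wheel**: with no built-in training tricks, the team has to implement optimizations such as mixed-precision training and gradient accumulation itself, which is time-consuming and error-prone.\n- **No monitoring**: training lacks an intuitive progress bar and automatic TensorBoard logging, so convergence is hard to follow in real time.\n- **High deployment barrier**: after training, a lot of extra inference-interface code is needed before the model can be served, which lengthens the release cycle.\n\n### With bert4torch\n- **One-step weight loading**: a concise API loads the weights of mainstream large models such as ChatGLM and LLaMA and handles format conversion automatically.\n- **Built-in best practices**: common training tricks are integrated, so speedups such as mixed precision only need to be switched on in the configuration, with no code to rewrite.\n- **Visible training flow**: a live progress bar and Logger\u002FTensorBoard support are provided by default, so loss and parameter state can be monitored in real time, which speeds up debugging.\n- **Fast command-line deployment**: a single command deploys the fine-tuned model as an online service, greatly simplifying the path from experiment to production.\n\nThrough its highly encapsulated design, bert4torch streamlines the full large-model workflow from fine-tuning to deployment, letting algorithm engineers focus on business logic rather than underlying infrastructure.","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FTongjilibo_bert4torch_849cd1e6.png","Tongjilibo","Bo仔很忙","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002FTongjilibo_fd60229d.png","LLM & NLP & ML",null,"Shanghai,China","tongjilibo@163.com","https:\u002F\u002Fwww.zhihu.com\u002Fpeople\u002Fli-bo-53-72","https:\u002F\u002Fgithub.com\u002FTongjilibo",[83],{"name":84,"color":85,"percentage":86},"Python","#3572A5",100,1334,168,"2026-04-09T08:21:49","MIT","Not specified","Not specified (supports large-model inference and fine-tuning; the GPU memory needed depends on the chosen model size; quantization methods such as gptq and awq are supported to reduce memory use)",{"notes":94,"python":91,"dependencies":95},"The development environment was originally based on torch==1.10 and has since moved to torch==2.0. Weights of many large models can be loaded, including ChatGLM, Llama, Baichuan, and Qwen. Large-model services can be deployed with a single command (CLI\u002FGradio\u002FOpenAI API modes). Some features (such as LoRA) depend on the peft library, while algorithms such as PPO come with their own implementation. Note that the git version and the pip release may differ, and the data-processing code must be modified when training on your own data.",[96,97,98],"torch>=2.0","torch4keras>=0.3.3","peft (optional, for LoRA 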
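fine-tuning)",[35,14],[101,102,103,104,105,106,107,108,109,64,110,111,112,113,114],"bert","nlp","pytorch","bert4keras","named-entity-recognition","relation-extraction","seq2seq","text-classification","transformers","belle","chatglm","llama","llm","large-language-models","2026-03-27T02:49:30.150509","2026-04-16T01:45:06.553460",[118,123,128,133,138,143,148],{"id":119,"question_zh":120,"answer_zh":121,"source_url":122},34733,"An error is raised when DataLoader uses multiple worker processes (num_workers > 0). How can it be fixed?","On Linux, add `torch.multiprocessing.set_start_method('spawn')` at the entry point of the main program. For example:\n```python\nimport torch\n\nif __name__ == '__main__':\n    torch.multiprocessing.set_start_method('spawn')\n    # training code follows\n```\nIn addition, the problem is fixed in bert4torch v0.2.4 and later, so upgrading the library directly is recommended: `pip install -U bert4torch`.","https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch\u002Fissues\u002F54",{"id":124,"question_zh":125,"answer_zh":126,"source_url":127},34734,"What should I do about the error 'AttributeError: can't set attribute' when initializing BaseModelDDP on a single machine with multiple GPUs?","This is a bug in older versions that was fixed in v0.3.3 as released on pip. Upgrade with the following command to resolve it:\n`pip install bert4torch==0.3.3`","https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch\u002Fissues\u002F146",{"id":129,"question_zh":130,"answer_zh":131,"source_url":132},34735,"After loading a model with AutoModel.from_pretrained, the backbone parameters are not updated during training (requires_grad=True but the weights do not change), while the linear layer updates normally. What is the cause?","This is usually related to how a specific model architecture (such as DeBERTa v2) is supported in the framework, or to its gradient computation graph, rather than to a general problem in the framework itself. Try the following steps to narrow it down:\n1. Switch to a standard BERT model; if that works, the problem probably lies in the adaptation of the specific model.\n2. Check whether the loss decreases. If it does, some parameters must be updating; print the sum or norm of each layer's weights before and after training to identify exactly which layers are not updating.\n3. Make sure no layers have been frozen by accident.","https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch\u002Fissues\u002F140",{"id":134,"question_zh":135,"answer_zh":136,"source_url":137},34736,"Calling model.compile() raises 'unexpected keyword argument grad_accumulation_steps'. How should this be handled?","In newer versions the `grad_accumulation_steps` parameter may have been removed or moved. To resolve it:\n1. Try removing `grad_accumulation_steps` from the argument list of `model.compile()`.\n2. 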
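If the problem persists, upgrade bert4torch to the latest version (for example from v0.2.2 to v0.2.3 or later); newer versions usually fix compatibility problems caused by API changes of this kind.","https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch\u002Fissues\u002F63",{"id":139,"question_zh":140,"answer_zh":141,"source_url":142},34737,"After adding virtual adversarial training (VAT), the F1 score stays at 0 during training. How can this be debugged?","The code logic itself may well be correct; compare it against the official example code. The maintainer has added a VAT implementation to the examples at `examples\u002Frelation_extraction\u002Ftask_relation_extraction_CasRel_VAT.py`. Run that example for a few epochs and check whether F1 behaves normally. If the example works but your code does not, carefully compare the differences in data preprocessing, loss computation and optimizer steps.","https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch\u002Fissues\u002F42",{"id":144,"question_zh":145,"answer_zh":146,"source_url":147},34738,"When computing sentence similarity on a custom corpus or exporting the model (ONNX\u002FTensorRT), predictions are abnormal or acceleration fails. What should I watch out for?","1. **Prediction mode**: for inference or export, the model must be put in evaluation mode and gradient computation must be disabled. Add the following to your code:\n```python\nimport torch\n\nmodel.eval()\nwith torch.no_grad():\n    output = model(input_ids, segment_ids)\n```\nWithout `eval()`, training-time layers such as Dropout stay active and distort the results.\n2. **ONNX\u002FTensorRT**: data preprocessing (such as the tokenizer) is usually not part of the model's computation graph, so make sure the input tensors have the correct format when exporting. For a concrete deployment example, see the official Triton Serving example directory.","https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch\u002Fissues\u002F96",{"id":149,"question_zh":150,"answer_zh":151,"source_url":152},34739,"When running the TPLinker Plus named entity recognition example, F1 is 0 and the predictions are all empty or abnormal. How can this be fixed?","First confirm that the official example runs correctly with its default configuration. If the official example works but your run yields 0, try the following:\n1. Test with a different pretrained model (for example, switch from the original model to a pretrained model from HFL at Harbin Institute of Technology) to rule out corrupted or mismatched weight files.\n2. Check the data preprocessing, in particular whether `mapping` and the label-alignment logic are compatible with the tokenization used by the current model.\n3. If the problem still cannot be resolved, reduce the code to a minimal reproduction and contact the maintainer for further investigation.","https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Fbert4torch\u002Fissues\u002F73",[154,159,164,169,174,179,184,189,194,199,204,209,214,219,224,229,234,239,244,249],{"id":155,"version":156,"summary_zh":157,"released_at":158},272110,"v0.6.1","1. Add PaddleOCR-VL  \n2. Improve the code structure  \n3. Remove hard-coded model configuration entries","2026-01-14T06:31:18",{"id":160,"version":161,"summary_zh":162,"released_at":163},272111,"v0.6.0","1. Add `Qwen3-moe`  \n2. Support mainstream quantization methods such as `gptq` and `awq`  \n3. Other code improvements","2025-09-25T09:35:29",{"id":165,"version":166,"summary_zh":167,"released_at":168},272112,"v0.5.9.post2","1. Add `Ernie4_5`  \n2. Fix the `hub` download bug  \n3. 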
Split out `openai_client`","2025-07-21T04:11:23",{"id":170,"version":171,"summary_zh":172,"released_at":173},272113,"v0.5.8","1. Add `Qwen3-Embedding` and `Qwen3-Reranker`; allow `temperature` to be set to 0  \n2. Fix bugs in `sdpa` and `global_point`  \n3. Split out `attention_utils`","2025-06-20T16:37:07",{"id":175,"version":176,"summary_zh":177,"released_at":178},272114,"v0.5.7","1. Change the command-line invocation to `bert4torch serve`  \n2. Add `Qwen3`","2025-05-11T11:51:50",{"id":180,"version":181,"summary_zh":182,"released_at":183},272115,"v0.5.6","- Support image input on the command line\r\n- Fix rope bugs in batch inference and in handling very long texts","2025-04-01T15:55:58",{"id":185,"version":186,"summary_zh":187,"released_at":188},272116,"v0.5.5","Add DeepSeek-R1, InternVL, InternLM3, GLM4V, ModernBert, MLLaMA, Qwen2-VL and Qwen-VL","2025-02-15T15:01:43",{"id":190,"version":191,"summary_zh":192,"released_at":193},272117,"v0.5.4","[New features] Add the DeepSeek series, MiniCPM, MiniCPMV, Llama 3.2 and Qwen 2.5; support device_map=auto  \n[Fixes] Fix bugs in batch_generate and when n>1","2024-09-28T10:24:57",{"id":195,"version":196,"summary_zh":197,"released_at":198},272118,"v0.5.3","## [New features]\n- Add llama3.1\u002FYi1.5\n- Automatically choose to download from hfmirror\n- Support the `bert4torch-llm-server` command","2024-08-14T09:33:58",{"id":200,"version":201,"summary_zh":202,"released_at":203},272119,"v0.5.2","## New features\n- Function Call support for the ChatGLM\u002FQwen series\n- Add the InternLM2 series\n\n## Minor improvements\n- Simplify how the Chat Demo is invoked in Pipeline\n- Allow the stop-token element of Generate to be a list\n- Unify the Rope Scaling parameter names and add Rope subclasses\n\n## Bug fixes\n- Inference bug in FlashAttention2\n- Fix the earlier tie_word_embedding bug in BART and T5","2024-08-01T09:34:03",{"id":205,"version":206,"summary_zh":207,"released_at":208},272120,"v0.5.1","## Added\r\n- Add Qwen1.5, Qwen2 and glm4\r\n- Add SWA\u002Fconvert_lm_logits_dtype\r\n## Bug fixes\r\n- Adjust the trainers (mainly DPOTrainer)\r\n- segment_ids in generation\r\n- repetition_penalty must include the query\r\n- dtype-conversion bug in RMSNorm","2024-06-19T02:23:01",{"id":210,"version":211,"summary_zh":212,"released_at":213},272121,"v0.5.0","## Bug fixes\r\n- Fix the chatglm3 bug\r\n- Fix the multi-file bug in save_pretrained\r\n- Fix the Text2Vec bug\r\n\r\n## New minor features\r\n- Add CausalLMLoss\r\n- Change how arguments are passed to deepspeed\r\n- Improve the openai client\r\n- 
Add get_weight_decay_optim_groups","2024-04-18T16:00:08",{"id":215,"version":216,"summary_zh":217,"released_at":218},272122,"v0.4.9.post2","## Bug fixes\r\n- Fix the repetition_penalty bug\r\n- Fix the config_path bug\r\n\r\n## Improvements\r\n- Allow is_causal in attention\r\n- Separate baichuan from llama\r\n\r\n## New features\r\n- Add the get_weight_decay_optim_groups function\r\n- Allow the num_key_value_heads parameter\r\n- Features updated in [torch4keras-v0.2.1](https:\u002F\u002Fgithub.com\u002FTongjilibo\u002Ftorch4keras\u002Freleases\u002Ftag\u002Fv0.2.1)","2024-03-16T07:50:00",{"id":220,"version":221,"summary_zh":222,"released_at":223},272123,"v0.4.8","1. 🔥 build_transformer_model can download from hf\r\n2. Services published with fastapi can offload to the cpu when idle\r\n3. Add a FillMask pipeline\r\n4. Add SequenceClassificationTrainer","2024-02-21T15:57:37",{"id":225,"version":226,"summary_zh":227,"released_at":228},272124,"v0.4.7","1. Change `save_pretrained` so that it saves a folder\r\n2. Add GenerateSpeed to measure token generation speed\r\n3. Fix the t5 error when use_states=True\r\n4. Fix the hierarchical encoding bug\r\n5. Add the deepseek_moe model\r\n6. Fix the concurrency error in generation and reduce large-model latency","2024-02-04T10:00:14",{"id":230,"version":231,"summary_zh":232,"released_at":233},272125,"v0.4.6","- Bug fixes\r\n- Add `save_pretrained` for saving weights in `transformer` format\r\n- Add several `embedding` models","2024-01-16T15:51:33",{"id":235,"version":236,"summary_zh":237,"released_at":238},272126,"v0.4.5","- Do not generate `past_key_values` during `training`\r\n- Add a `streamlit` example\r\n- Fix a bug in sentence embeddings with `max` pooling\r\n- Merge `batch_generate` into `generate`\r\n- Change the default parameter names of `generation` (the old names remain compatible)\r\n- `past_key_values` can be kept across multi-turn conversations\r\n- Move the `mask` padding logic from `attention` into `apply_embedding`\r\n- Add a `pipeline` for `uie`\r\n- Add `PtuningV2Trainer`","2024-01-10T16:33:41",{"id":240,"version":241,"summary_zh":242,"released_at":243},272127,"v0.4.4","1. Add the pipelines module and move chat into it\r\n2. Add the Text2Vec module for generating embeddings\r\n3. Add snapshot_download for downloading hf models","2023-12-28T14:40:51",{"id":245,"version":246,"summary_zh":247,"released_at":248},272128,"v0.4.3","Add common chat models to `chat` and simplify the code needed to call large models","2023-12-24T06:16:35",{"id":250,"version":251,"summary_zh":252,"released_at":253},272129,"v0.4.2","1. The checkpoint_path parameter accepts a folder path\r\n2. 
Add a chat module for quickly publishing a demo\u002Fapi\r\n3. Support loading .safetensors\r\n4. Error message for the meta device","2023-12-19T15:25:22"]