[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-noamgat--lm-format-enforcer":3,"tool-noamgat--lm-format-enforcer":61},[4,18,26,36,44,53],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":17},4358,"openclaw","openclaw\u002Fopenclaw","OpenClaw 是一款专为个人打造的本地化 AI 助手，旨在让你在自己的设备上拥有完全可控的智能伙伴。它打破了传统 AI 助手局限于特定网页或应用的束缚，能够直接接入你日常使用的各类通讯渠道，包括微信、WhatsApp、Telegram、Discord、iMessage 等数十种平台。无论你在哪个聊天软件中发送消息，OpenClaw 都能即时响应，甚至支持在 macOS、iOS 和 Android 设备上进行语音交互，并提供实时的画布渲染功能供你操控。\n\n这款工具主要解决了用户对数据隐私、响应速度以及“始终在线”体验的需求。通过将 AI 部署在本地，用户无需依赖云端服务即可享受快速、私密的智能辅助，真正实现了“你的数据，你做主”。其独特的技术亮点在于强大的网关架构，将控制平面与核心助手分离，确保跨平台通信的流畅性与扩展性。\n\nOpenClaw 非常适合希望构建个性化工作流的技术爱好者、开发者，以及注重隐私保护且不愿被单一生态绑定的普通用户。只要具备基础的终端操作能力（支持 macOS、Linux 及 Windows WSL2），即可通过简单的命令行引导完成部署。如果你渴望拥有一个懂你",349277,3,"2026-04-06T06:32:30",[13,14,15,16],"Agent","开发框架","图像","数据工具","ready",{"id":19,"name":20,"github_repo":21,"description_zh":22,"stars":23,"difficulty_score":10,"last_commit_at":24,"category_tags":25,"status":17},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,"2026-04-05T11:01:52",[14,15,13],{"id":27,"name":28,"github_repo":29,"description_zh":30,"stars":31,"difficulty_score":32,"last_commit_at":33,"category_tags":34,"status":17},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 
代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",159636,2,"2026-04-17T23:33:34",[14,13,35],"语言模型",{"id":37,"name":38,"github_repo":39,"description_zh":40,"stars":41,"difficulty_score":32,"last_commit_at":42,"category_tags":43,"status":17},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",108322,"2026-04-10T11:39:34",[14,15,13],{"id":45,"name":46,"github_repo":47,"description_zh":48,"stars":49,"difficulty_score":32,"last_commit_at":50,"category_tags":51,"status":17},6121,"gemini-cli","google-gemini\u002Fgemini-cli","gemini-cli 是一款由谷歌推出的开源 AI 命令行工具，它将强大的 Gemini 大模型能力直接集成到用户的终端环境中。对于习惯在命令行工作的开发者而言，它提供了一条从输入提示词到获取模型响应的最短路径，无需切换窗口即可享受智能辅助。\n\n这款工具主要解决了开发过程中频繁上下文切换的痛点，让用户能在熟悉的终端界面内直接完成代码理解、生成、调试以及自动化运维任务。无论是查询大型代码库、根据草图生成应用，还是执行复杂的 Git 操作，gemini-cli 都能通过自然语言指令高效处理。\n\n它特别适合广大软件工程师、DevOps 人员及技术研究人员使用。其核心亮点包括支持高达 100 万 token 的超长上下文窗口，具备出色的逻辑推理能力；内置 Google 搜索、文件操作及 Shell 命令执行等实用工具；更独特的是，它支持 MCP（模型上下文协议），允许用户灵活扩展自定义集成，连接如图像生成等外部能力。此外，个人谷歌账号即可享受免费的额度支持，且项目基于 Apache 2.0 
协议完全开源，是提升终端工作效率的理想助手。",100752,"2026-04-10T01:20:03",[52,13,15,14],"插件",{"id":54,"name":55,"github_repo":56,"description_zh":57,"stars":58,"difficulty_score":32,"last_commit_at":59,"category_tags":60,"status":17},4721,"markitdown","microsoft\u002Fmarkitdown","MarkItDown 是一款由微软 AutoGen 团队打造的轻量级 Python 工具，专为将各类文件高效转换为 Markdown 格式而设计。它支持 PDF、Word、Excel、PPT、图片（含 OCR）、音频（含语音转录）、HTML 乃至 YouTube 链接等多种格式的解析，能够精准提取文档中的标题、列表、表格和链接等关键结构信息。\n\n在人工智能应用日益普及的今天，大语言模型（LLM）虽擅长处理文本，却难以直接读取复杂的二进制办公文档。MarkItDown 恰好解决了这一痛点，它将非结构化或半结构化的文件转化为模型“原生理解”且 Token 效率极高的 Markdown 格式，成为连接本地文件与 AI 分析 pipeline 的理想桥梁。此外，它还提供了 MCP（模型上下文协议）服务器，可无缝集成到 Claude Desktop 等 LLM 应用中。\n\n这款工具特别适合开发者、数据科学家及 AI 研究人员使用，尤其是那些需要构建文档检索增强生成（RAG）系统、进行批量文本分析或希望让 AI 助手直接“阅读”本地文件的用户。虽然生成的内容也具备一定可读性，但其核心优势在于为机器",93400,"2026-04-06T19:52:38",[52,14],{"id":62,"github_repo":63,"name":64,"description_en":65,"description_zh":66,"ai_summary_zh":67,"readme_en":68,"readme_zh":69,"quickstart_zh":70,"use_case_zh":71,"hero_image_url":72,"owner_login":73,"owner_name":74,"owner_avatar_url":75,"owner_bio":76,"owner_company":76,"owner_location":76,"owner_email":76,"owner_twitter":73,"owner_website":76,"owner_url":77,"languages":78,"stars":83,"forks":84,"last_commit_at":85,"license":86,"difficulty_score":32,"env_os":87,"env_gpu":88,"env_ram":87,"env_deps":89,"category_tags":98,"github_topics":76,"view_count":32,"oss_zip_url":76,"oss_zip_packed_at":76,"status":17,"created_at":99,"updated_at":100,"faqs":101,"releases":130},8746,"noamgat\u002Flm-format-enforcer","lm-format-enforcer","Enforce the output format (JSON Schema, Regex etc) of a language model","lm-format-enforcer 是一款专为大语言模型设计的输出格式控制工具。它能在模型生成文本的每一步动态筛选允许的字符，确保最终输出严格符合预设的 JSON Schema、正则表达式等格式要求。\n\n在大模型应用中，即使经过精心提示工程，模型仍常出现格式错误（如 JSON 缺括号、字段类型不对），导致后续程序解析失败。lm-format-enforcer 通过底层干预生成过程，从根本上解决了这一痛点，既保证了格式的绝对合规，又最大限度保留了模型的语言表达能力，无需牺牲智能性来换取稳定性。\n\n该工具非常适合开发者、AI 研究人员及工程师使用，尤其是那些需要将大模型集成到自动化工作流、API 服务或数据分析管道中的技术团队。无论您使用的是 Hugging Face Transformers、vLLM、LangChain 还是 
llama.cpp，都能轻松接入。\n\n其独特亮点在于支持复杂的嵌套结构、可选字段及数组处理，并兼容批量生成与束搜索（beam search）场景。更贴心的是，它在强制格式的同时，允许模型自由控制空格与换行，使输出既机器可读又保持自然文本的灵活性。安装简单，一行命","lm-format-enforcer 是一款专为大语言模型设计的输出格式控制工具。它能在模型生成文本的每一步动态筛选允许的字符，确保最终输出严格符合预设的 JSON Schema、正则表达式等格式要求。\n\n在大模型应用中，即使经过精心提示工程，模型仍常出现格式错误（如 JSON 缺括号、字段类型不对），导致后续程序解析失败。lm-format-enforcer 通过底层干预生成过程，从根本上解决了这一痛点，既保证了格式的绝对合规，又最大限度保留了模型的语言表达能力，无需牺牲智能性来换取稳定性。\n\n该工具非常适合开发者、AI 研究人员及工程师使用，尤其是那些需要将大模型集成到自动化工作流、API 服务或数据分析管道中的技术团队。无论您使用的是 Hugging Face Transformers、vLLM、LangChain 还是 llama.cpp，都能轻松接入。\n\n其独特亮点在于支持复杂的嵌套结构、可选字段及数组处理，并兼容批量生成与束搜索（beam search）场景。更贴心的是，它在强制格式的同时，允许模型自由控制空格与换行，使输出既机器可读又保持自然文本的灵活性。安装简单，一行命令即可开启更可靠的模型交互体验。","# lm-format-enforcer\n\n![LMFE Logo](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fnoamgat_lm-format-enforcer_readme_8abac9d22864.png)\n\n**Enforce the output format (JSON Schema, Regex etc) of a language model**\n\n\u003Ca target=\"_blank\" href=\"https:\u002F\u002Fcolab.research.google.com\u002Fgithub\u002Fnoamgat\u002Flm-format-enforcer\u002Fblob\u002Fmain\u002Fsamples\u002Fcolab_llama2_enforcer.ipynb\">\n  \u003Cimg src=\"https:\u002F\u002Fcolab.research.google.com\u002Fassets\u002Fcolab-badge.svg\" alt=\"Open In Colab\"\u002F>\n\u003C\u002Fa>\n\n[![Code Coverage](https:\u002F\u002Fcodecov.io\u002Fgh\u002Fnoamgat\u002Flm-format-enforcer\u002Fgraph\u002Fbadge.svg?token=63U3S58VWS)](https:\u002F\u002Fcodecov.io\u002Fgh\u002Fnoamgat\u002Flm-format-enforcer)\n![Tests](https:\u002F\u002Fgithub.com\u002Fnoamgat\u002Flm-format-enforcer\u002Factions\u002Fworkflows\u002Frun_tests.yml\u002Fbadge.svg)\n\n\n![Solution at a glance](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fnoamgat_lm-format-enforcer_readme_fcb8cc293a2c.webp)\n\n\nLanguage models are able to generate text, but when requiring a precise output format, they do not always perform as instructed.\nVarious prompt engineering techniques have been introduced to improve the robustness of the generated text, but they are not always sufficient.\nThis 
project solves the issues by filtering the tokens that the language model is allowed to generate at every timestep, thus ensuring that the output format is respected, while minimizing the limitations on the language model.\n\n## Installation\n```pip install lm-format-enforcer```\n\n## Basic Tutorial\n```python\n# Requirements if running from Google Colab with a T4 GPU. \n!pip install transformers torch lm-format-enforcer huggingface_hub optimum\n!pip install auto-gptq --extra-index-url https:\u002F\u002Fhuggingface.github.io\u002Fautogptq-index\u002Fwhl\u002Fcu118\u002F \n\nfrom pydantic import BaseModel\nfrom lmformatenforcer import JsonSchemaParser\nfrom lmformatenforcer.integrations.transformers import build_transformers_prefix_allowed_tokens_fn\nfrom transformers import pipeline\n\nclass AnswerFormat(BaseModel):\n    first_name: str\n    last_name: str\n    year_of_birth: int\n    num_seasons_in_nba: int\n\n# Create a transformers pipeline\nhf_pipeline = pipeline('text-generation', model='TheBloke\u002FLlama-2-7b-Chat-GPTQ', device_map='auto')\nprompt = f'Here is information about Michael Jordan in the following json schema: {AnswerFormat.schema_json()} :\\n'\n\n# Create a character level parser and build a transformers prefix function from it\nparser = JsonSchemaParser(AnswerFormat.schema())\nprefix_function = build_transformers_prefix_allowed_tokens_fn(hf_pipeline.tokenizer, parser)\n\n# Call the pipeline with the prefix function\noutput_dict = hf_pipeline(prompt, prefix_allowed_tokens_fn=prefix_function)\n\n# Extract the results\nresult = output_dict[0]['generated_text'][len(prompt):]\nprint(result)\n# {'first_name': 'Michael', 'last_name': 'Jordan', 'year_of_birth': 1963, 'num_seasons_in_nba': 15}\n```\n\n## Capabilities \u002F Advantages\n\n- Works with any Python language model and tokenizer. 
Already supports [transformers](https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Ftransformers), [LangChain](https:\u002F\u002Fpython.langchain.com\u002Fdocs\u002Fintegrations\u002Fllms\u002Flmformatenforcer_experimental), [LlamaIndex](https:\u002F\u002Fdocs.llamaindex.ai\u002Fen\u002Flatest\u002Fcommunity\u002Fintegrations\u002Flmformatenforcer.html), [llama.cpp](https:\u002F\u002Fgithub.com\u002Fnoamgat\u002Flm-format-enforcer\u002Fblob\u002Fmain\u002Fsamples\u002Fcolab_llamacpppython_integration.ipynb), [vLLM](https:\u002F\u002Fgithub.com\u002Fnoamgat\u002Flm-format-enforcer\u002Fblob\u002Fmain\u002Fsamples\u002Fcolab_vllm_integration.ipynb), [Haystack](https:\u002F\u002Fhaystack.deepset.ai\u002Fintegrations\u002Flmformatenforcer), [NVIDIA TensorRT-LLM](https:\u002F\u002Fgithub.com\u002Fnoamgat\u002Flm-format-enforcer\u002Fblob\u002Fmain\u002Fsamples\u002Fcolab_trtllm_integration.ipynb) and [ExLlamaV2](https:\u002F\u002Fgithub.com\u002Fnoamgat\u002Flm-format-enforcer\u002Fblob\u002Fmain\u002Fsamples\u002Fcolab_exllamav2_integration.ipynb).\n- Supports batched generation and beam searches - each input \u002F beam can have different tokens filtered at every timestep\n- Supports JSON Schema, JSON Mode (schemaless) and Regular Expression formats\n- Supports both required and optional fields in JSON schemas\n- Supports nested fields, arrays and dictionaries in JSON schemas\n- Gives the language model freedom to control whitespacing and field ordering in JSON schemas, reducing hallucinations.\n- Does not modify the high-level loop of the transformers API, so it can be used in any scenario.\n\n\n## Comparison to other libraries\n\nCapability | LM Format Enforcer | [Guidance](https:\u002F\u002Fgithub.com\u002Fguidance-ai\u002Fguidance) | [Jsonformer](https:\u002F\u002Fgithub.com\u002F1rgs\u002Fjsonformer) | [Outlines](https:\u002F\u002Fgithub.com\u002Foutlines-dev\u002Foutlines)\n:------------ | :-------------| :-------------| :------------- | :----\nRegular Expressions | ✅ |
 ✅ | ❌ | ✅\nJSON Schema | ✅ |  🟡 ([Partial conversion is possible](https:\u002F\u002Fgithub.com\u002Fguidance-ai\u002Fguidance\u002Fblob\u002Fmain\u002Fnotebooks\u002Fapplications\u002Fjsonformer.ipynb)) | ✅ | ✅\nBatched Generation | ✅ |  ❌ | ❌ | ✅\nBeam Search | ✅ |  ❌ | ❌ | ✅\nIntegrates into existing pipelines | ✅ | ❌ | ❌ | ✅\nOptional JSON Fields | ✅ |  ❌ | ❌ | ❌\nLLM Controls JSON field ordering and whitespace | ✅ | ❌ | ❌ | ❌\nJSON Schema with recursive classes | ✅ | ❌ | ✅ | ❌\nVisual model support | [✅](https:\u002F\u002Fgithub.com\u002Fnoamgat\u002Flm-format-enforcer\u002Fblob\u002Fmain\u002Fsamples\u002Fcolab_llama32_vision_enforcer.ipynb) |  ✅ | ❌ | ❌\n\nSpotted a mistake? Library updated with new capabilities? [Open an issue!](https:\u002F\u002Fgithub.com\u002Fnoamgat\u002Flm-format-enforcer\u002Fissues)\n\n## Detailed example\n\nWe created a Google Colab Notebook which contains a full example of how to use this library to enforce the output format of llama2, including interpreting the intermediate results. The notebook can run on a free GPU-backed runtime in Colab.\n\n\u003Ca target=\"_blank\" href=\"https:\u002F\u002Fcolab.research.google.com\u002Fgithub\u002Fnoamgat\u002Flm-format-enforcer\u002Fblob\u002Fmain\u002Fsamples\u002Fcolab_llama2_enforcer.ipynb\">\n  \u003Cimg src=\"https:\u002F\u002Fcolab.research.google.com\u002Fassets\u002Fcolab-badge.svg\" alt=\"Open In Colab\"\u002F>\n\u003C\u002Fa>\n\nYou can also [view the notebook in GitHub](https:\u002F\u002Fgithub.com\u002Fnoamgat\u002Flm-format-enforcer\u002Fblob\u002Fmain\u002Fsamples\u002Fcolab_llama2_enforcer.ipynb).\n\nFor the different ways to integrate with huggingface transformers, see the [unit tests](https:\u002F\u002Fgithub.com\u002Fnoamgat\u002Flm-format-enforcer\u002Fblob\u002Fmain\u002Ftests\u002Ftest_transformerenforcer.py).\n\n## vLLM Server Integration\n\nLM Format Enforcer is integrated into the [vLLM](https:\u002F\u002Fgithub.com\u002Fvllm-project\u002Fvllm) inference server. 
vLLM includes an [OpenAI compatible server](https:\u002F\u002Fdocs.vllm.ai\u002Fen\u002Flatest\u002Fserving\u002Fopenai_compatible_server.html) with added capabilities that allow using LM Format Enforcer without writing custom inference code.\n\nUse LM Format Enforcer with the vLLM OpenAI Server either by adding the [vLLM command line parameter](https:\u002F\u002Fdocs.vllm.ai\u002Fen\u002Flatest\u002Fserving\u002Fopenai_compatible_server.html#command-line-arguments-for-the-server):\n\n```\npython -m vllm.entrypoints.openai.api_server \\\n  --model mistralai\u002FMistral-7B-Instruct-v0.2 \\\n  --guided-decoding-backend lm-format-enforcer\n```\n\nOr on a per-request basis, by adding the `guided_decoding_backend` parameter to the request together with the guided decoding parameters:\n\n```\ncompletion = client.chat.completions.create(\n  model=\"mistralai\u002FMistral-7B-Instruct-v0.2\",\n  messages=[\n    {\"role\": \"user\", \"content\": \"Classify this sentiment: LMFE is wonderful!\"}\n  ],\n  extra_body={\n    \"guided_regex\": \"[Pp]ositive|[Nn]egative\",\n    \"guided_decoding_backend\": \"lm-format-enforcer\"\n  }\n)\n```\nJSON schema and choice decoding are also supported via the `guided_json` and `guided_choice` [extra parameters](https:\u002F\u002Fdocs.vllm.ai\u002Fen\u002Flatest\u002Fserving\u002Fopenai_compatible_server.html#extra-parameters-for-chat-api).\n\n## How does it work?\n\nThe library works by combining a character level parser and a tokenizer prefix tree into a smart token filtering mechanism.\n\n![An example of the character level parser and tokenizer prefix tree in a certain timestep](https:\u002F\u002Fraw.githubusercontent.com\u002Fnoamgat\u002Flm-format-enforcer\u002Fmain\u002Fdocs\u002FTrees.drawio.svg?sanitize=true)\n\n### Character Level Parser\n\nParsing a string into any kind of format can be looked at as an implicit tree structure - at any moment in the parsing process, there is a set of allowed next characters, and if any of them are
selected, there is a new set of allowed next characters, and so on.\n\n```CharacterLevelParser``` is an interface for parsing according to this implicit structure. ```add_character()``` and ```get_allowed_characters()``` can be seen as tree traversal methods.\n\nThere are several implementations of this interface:\n- ```JsonSchemaParser``` - parses according to a json schema (or pure json output - `JsonSchemaParser(None)` will result in any json object being allowed).\n- ```StringParser``` - forces an exact string (used mainly for diagnostics)\n- ```RegexParser``` - parses according to a regular expression. Note that this cannot use the built-in Python regex engine and uses a manually implemented one (via the [interegular](https:\u002F\u002Fpypi.org\u002Fproject\u002Finteregular\u002F) library), so it doesn't cover 100% of the regex standard.\n\n### Tokenizer Prefix Tree\n\nGiven a tokenizer used by a certain language model, we can build a prefix tree of all the tokens that the language model can generate. This is done by generating all possible sequences of tokens, and adding them to the tree.\nSee ```TokenizerPrefixTree```.\n\n### Combining the two\n\nGiven a character level parser and a tokenizer prefix tree, we can elegantly and efficiently filter the tokens that the language model is allowed to generate at the next timestep:\nWe only traverse the characters that are in BOTH the character level parsing node and the tokenizer prefix tree node. This allows us to find all of the tokens (including complex subword tokens such as ```\",\"``` which are critical in JSON parsing).\nWe do this recursively on both trees and return all of the allowed tokens. When the language model generates a token, we advance the character level parser according to the new characters, ready to filter the next timestep.\n\n### How is this approach different? Why is it good?\n\nThis is not the first library to enforce the output format of a language model.
However, other similar libraries (such as Guidance, JsonFormer and Outlines) enforce an exact output format. This means that the language model is not allowed to control whitespacing, field optionality and field ordering (in the JSON use case). While this seems inconsequential to humans, it means that the language model may not be generating the JSON formats that it \"wants to\" generate, and could put its internal state into a suboptimal value, reducing the quality of the output in later timesteps.\n\nThis forces language model users to know the details of the language model they are using (for example - were JSONs minified before pretraining?) and modify the libraries to generate the precise format.\n\nWe avoid this problem by scanning potential next tokens and allowing any token sequence that will be parsed into the output format. This means that the language model can control all of these aspects, and output the token sequence that matches its style in the most natural way, without requiring the developer to know the details.\n\n\n## Diagnostics - Will I always get good results?\n\nUsing this library guarantees that the output will match the format, but it does not guarantee that the output will be semantically correct. Forcing the language model to conform to a certain output may lead to increased hallucinations. Guiding the model via prompt engineering is still likely to improve results.\n\nIn order to help you understand the aggressiveness caused by the format enforcement, if you pass ```output_scores=True``` and ```return_dict_in_generate=True``` in the ```kwargs``` to ```generate_enforced()``` (these are existing optional parameters in the ```transformers``` library), you will also get a token-by-token dataframe showing which token was selected, its score, and which token would have been chosen if the format enforcement was not applied.
If you see that the format enforcer forced the language model to select tokens with very low weights, it is a likely contributor to the poor results. Try modifying the prompt so that the format enforcer does not have to be so aggressive.\n\nExample using the regular expression format ``` Michael Jordan was Born in (\\d)+.```\n\nidx | generated_token | generated_token_idx | generated_score | leading_token | leading_token_idx | leading_score\n:------------ | :-------------| :-------------| :------------- | :------------ | :-------------| :-------------\n0 | ▁ | 29871 | 1.000000 | ▁ | 29871 | 1.000000\n1 | Michael | 24083 | 0.000027 | ▁Sure | 18585 | 0.959473\n2 | ▁Jordan | 18284 | 1.000000 | ▁Jordan | 18284 | 1.000000\n3 | ▁was | 471 | 1.000000 | ▁was | 471 | 1.000000\n4 | ▁Born | 19298 | 0.000008 | ▁born | 6345 | 1.000000\n5 | ▁in | 297 | 0.994629 | ▁in | 297 | 0.994629\n6 | ▁ | 29871 | 0.982422 | ▁ | 29871 | 0.982422\n7 | 1 | 29896 | 1.000000 | 1 | 29896 | 1.000000\n8 | 9 | 29929 | 1.000000 | 9 | 29929 | 1.000000\n9 | 6 | 29953 | 1.000000 | 6 | 29953 | 1.000000\n10 | 3 | 29941 | 1.000000 | 3 | 29941 | 1.000000\n11 | . | 29889 | 0.999512 | . | 29889 | 0.999512\n12 | ```\u003C\u002Fs>``` | 2 | 0.981445 | ```\u003C\u002Fs>``` | 2 | 0.981445\n\n\nYou can see that the model \"wanted\" to start the answer using ```Sure```, but the format enforcer forced it to use ```Michael``` - there was a big gap in token 1. Afterwards, almost all of the leading scores are within the allowed token set, meaning the model likely did not hallucinate due to the token forcing. The only exception was timestep 4 - \" Born\" was forced while the LLM wanted to choose \"born\".
This is a hint for the prompt engineer, to change the prompt to use a lowercase b instead.\n\n\n## Configuration options\n\nLM Format Enforcer makes use of several heuristics to avoid edge cases that may happen with LLMs generating structured outputs.\nThere are two ways to control these heuristics:\n\n### Option 1: via Environment Variables\n\nThere are several environment variables that can be set, which affect the operation of the library. This method is useful when you don't want to modify the code, for example when using the library through the vLLM OpenAI server.\n\n- `LMFE_MAX_CONSECUTIVE_WHITESPACES` - How many consecutive whitespaces are allowed when parsing JsonSchemaObjects. Default: 12.\n- `LMFE_STRICT_JSON_FIELD_ORDER` - Should the JsonSchemaParser force the properties to appear in the same order as they appear in the 'required' list of the JsonSchema? (Note: this is consistent with the order of declaration in Pydantic models). Default: False.\n- `LMFE_MAX_JSON_ARRAY_LENGTH` - What is the maximal JSON array length, if not specified by the schema. Helps the LLM avoid infinite loops. Default: 20.\n- `LMFE_DEFAULT_ALPHABET` - What alphabet is used by default for allowed characters? See [consts.py](https:\u002F\u002Fgithub.com\u002Fnoamgat\u002Flm-format-enforcer\u002Fblob\u002Fmain\u002Flmformatenforcer\u002Fconsts.py#L1) for the default. Can be overridden and extended to include language-specific characters. It is required if you want these characters to appear as JSON keys or enum values in JsonSchemaParser.\n\n### Option 2: via the CharacterLevelParserConfig class\nWhen using the library through code, any `CharacterLevelParser` (`JsonSchemaParser`, `RegexParser` etc) constructor receives an optional `CharacterLevelParserConfig` object.
\n\nTherefore, to configure the heuristics of a single parser, instantiate a `CharacterLevelParserConfig` object, modify its values and pass it to the `CharacterLevelParser`'s constructor.\n\n\n\n## Known issues and limitations\n\n- LM Format Enforcer requires a Python API to process the output logits of the language model. This means that until the APIs are extended, it cannot be used with OpenAI ChatGPT and similar API-based solutions.\n- Regular expression syntax is not 100% supported. See [interegular](https:\u002F\u002Fpypi.org\u002Fproject\u002Finteregular\u002F) for more details.\n- LM Format Enforcer Regex Parser can only generate characters that exist in the tokenizer vocabulary. This may be solved in a later version, see [the issue on GitHub](https:\u002F\u002Fgithub.com\u002Fnoamgat\u002Flm-format-enforcer\u002Fissues\u002F13).\n\n\n## Contributors and contributing\n\nSee [CONTRIBUTORS.md](https:\u002F\u002Fgithub.com\u002Fnoamgat\u002Flm-format-enforcer\u002Fblob\u002Fmain\u002FCONTRIBUTORS.md) for a list of contributors.\n","# 语言模型格式强制器\n\n![LMFE Logo](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fnoamgat_lm-format-enforcer_readme_8abac9d22864.png)\n\n**强制语言模型的输出格式（JSON Schema、正则表达式等）**\n\n\u003Ca target=\"_blank\" href=\"https:\u002F\u002Fcolab.research.google.com\u002Fgithub\u002Fnoamgat\u002Flm-format-enforcer\u002Fblob\u002Fmain\u002Fsamples\u002Fcolab_llama2_enforcer.ipynb\">\n  \u003Cimg src=\"https:\u002F\u002Fcolab.research.google.com\u002Fassets\u002Fcolab-badge.svg\" 
alt=\"在Colab中打开\"\u002F>\n\u003C\u002Fa>\n\n[![代码覆盖率](https:\u002F\u002Fcodecov.io\u002Fgh\u002Fnoamgat\u002Flm-format-enforcer\u002Fgraph\u002Fbadge.svg?token=63U3S58VWS)](https:\u002F\u002Fcodecov.io\u002Fgh\u002Fnoamgat\u002Flm-format-enforcer)\n![测试](https:\u002F\u002Fgithub.com\u002Fnoamgat\u002Flm-format-enforcer\u002Factions\u002Fworkflows\u002Frun_tests.yml\u002Fbadge.svg)\n\n\n![解决方案概览](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fnoamgat_lm-format-enforcer_readme_fcb8cc293a2c.webp)\n\n\n语言模型能够生成文本，但在需要精确输出格式时，它们并不总是按指示执行。虽然已经提出了多种提示工程技巧来提高生成文本的鲁棒性，但这些方法并不总是足够有效。本项目通过在每个时间步过滤语言模型允许生成的标记，从而确保输出格式得到遵守，同时将对语言模型的限制降至最低。\n\n## 安装\n```pip install lm-format-enforcer```\n\n## 基础教程\n```python\n# 如果在配备T4 GPU的Google Colab上运行，需安装以下依赖。\n!pip install transformers torch lm-format-enforcer huggingface_hub optimum\n!pip install auto-gptq --extra-index-url https:\u002F\u002Fhuggingface.github.io\u002Fautogptq-index\u002Fwhl\u002Fcu118\u002F \n\nfrom pydantic import BaseModel\nfrom lmformatenforcer import JsonSchemaParser\nfrom lmformatenforcer.integrations.transformers import build_transformers_prefix_allowed_tokens_fn\nfrom transformers import pipeline\n\nclass AnswerFormat(BaseModel):\n    first_name: str\n    last_name: str\n    year_of_birth: int\n    num_seasons_in_nba: int\n\n# 创建一个transformers管道\nhf_pipeline = pipeline('text-generation', model='TheBloke\u002FLlama-2-7b-Chat-GPTQ', device_map='auto')\nprompt = f'以下是关于迈克尔·乔丹的信息，采用以下JSON模式：{AnswerFormat.schema_json()} :\\n'\n\n# 创建一个字符级解析器，并基于它构建transformers前缀允许标记函数\nparser = JsonSchemaParser(AnswerFormat.schema())\nprefix_function = build_transformers_prefix_allowed_tokens_fn(hf_pipeline.tokenizer, parser)\n\n# 使用前缀函数调用管道\noutput_dict = hf_pipeline(prompt, prefix_allowed_tokens_fn=prefix_function)\n\n# 提取结果\nresult = output_dict[0]['generated_text'][len(prompt):]\nprint(result)\n# {'first_name': 'Michael', 'last_name': 'Jordan', 'year_of_birth': 1963, 'num_seasons_in_nba': 15}\n```\n\n## 功能与优势\n\n- 
适用于任何Python语言模型和分词器。目前已支持[transformers](https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Ftransformers)、[LangChain](https:\u002F\u002Fpython.langchain.com\u002Fdocs\u002Fintegrations\u002Fllms\u002Flmformatenforcer_experimental)、[LlamaIndex](https:\u002F\u002Fdocs.llamaindex.ai\u002Fen\u002Flatest\u002Fcommunity\u002Fintegrations\u002Flmformatenforcer.html)、[llama.cpp](https:\u002F\u002Fgithub.com\u002Fnoamgat\u002Flm-format-enforcer\u002Fblob\u002Fmain\u002Fsamples\u002Fcolab_llamacpppython_integration.ipynb)、[vLLM](https:\u002F\u002Fgithub.com\u002Fnoamgat\u002Flm-format-enforcer\u002Fblob\u002Fmain\u002Fsamples\u002Fcolab_vllm_integration.ipynb)、[Haystack](https:\u002F\u002Fhaystack.deepset.ai\u002Fintegrations\u002Flmformatenforcer)、[NVIDIA TensorRT-LLM](https:\u002F\u002Fgithub.com\u002Fnoamgat\u002Flm-format-enforcer\u002Fblob\u002Fmain\u002Fsamples\u002Fcolab_trtllm_integration.ipynb)以及[ExLlamaV2](https:\u002F\u002Fgithub.com\u002Fnoamgat\u002Flm-format-enforcer\u002Fblob\u002Fmain\u002Fsamples\u002Fcolab_exllamav2_integration.ipynb)。\n- 支持批量生成和束搜索——每个输入或束在每个时间步都可以有不同的标记被过滤。\n- 支持JSON Schema、JSON Mode（无模式）和正则表达式格式。\n- 支持JSON模式中的必填字段和可选字段。\n- 支持JSON模式中的嵌套字段、数组和字典。\n- 允许语言模型自由控制JSON模式中的空格和字段顺序，从而减少幻觉现象。\n- 不修改transformers API的高层循环，因此可在任何场景下使用。\n\n## 与其他库的比较\n\n功能 | 语言模型格式强制器 | [Guidance](https:\u002F\u002Fgithub.com\u002Fguidance-ai\u002Fguidance) | [Jsonformer](https:\u002F\u002Fgithub.com\u002F1rgs\u002Fjsonformer) | [Outlines](https:\u002F\u002Fgithub.com\u002Foutlines-dev\u002Foutlines)\n:------------ | :-------------| :-------------| :------------- | :----\n正则表达式 | ✅ |  ✅ | ❌ | ✅\nJSON Schema | ✅ |  🟡（可通过部分转换实现） | ✅ | ✅\n批量生成 | ✅ |  ❌ | ❌ | ✅\n束搜索 | ✅ |  ❌ | ❌ | ✅\n集成到现有管道 | ✅ | ❌ | ❌ | ✅\n可选JSON字段 | ✅ |  ❌ | ❌ | ❌\nLLM控制JSON字段顺序和空格 | ✅ | ❌ | ❌ | ❌\n包含递归类的JSON Schema | ✅ | ❌ | ✅ | ❌\n视觉模型支持 | [✅](https:\u002F\u002Fgithub.com\u002Fnoamgat\u002Flm-format-enforcer\u002Fblob\u002Fmain\u002Fsamples\u002Fcolab_llama32_vision_enforcer.ipynb) |  ✅ | ❌ | 
❌\n\n发现错误？或者该库已更新新功能？[请提交问题！](https:\u002F\u002Fgithub.com\u002Fnoamgat\u002Flm-format-enforcer\u002Fissues)\n\n## 详细示例\n\n我们创建了一个Google Colab笔记本，其中包含如何使用此库强制llama2输出格式的完整示例，还包括对中间结果的解释。该笔记本可以在Colab的免费GPU运行环境中运行。\n\n\u003Ca target=\"_blank\" href=\"https:\u002F\u002Fcolab.research.google.com\u002Fgithub\u002Fnoamgat\u002Flm-format-enforcer\u002Fblob\u002Fmain\u002Fsamples\u002Fcolab_llama2_enforcer.ipynb\">\n  \u003Cimg src=\"https:\u002F\u002Fcolab.research.google.com\u002Fassets\u002Fcolab-badge.svg\" alt=\"在Colab中打开\"\u002F>\n\u003C\u002Fa>\n\n你也可以在[GitHub上查看该笔记本](https:\u002F\u002Fgithub.com\u002Fnoamgat\u002Flm-format-enforcer\u002Fblob\u002Fmain\u002Fsamples\u002Fcolab_llama2_enforcer.ipynb)。\n\n有关与Hugging Face Transformers集成的不同方式，请参阅[单元测试](https:\u002F\u002Fgithub.com\u002Fnoamgat\u002Flm-format-enforcer\u002Fblob\u002Fmain\u002Ftests\u002Ftest_transformerenforcer.py)。\n\n## vLLM 服务器集成\n\nLM Format Enforcer 已集成到 [vLLM](https:\u002F\u002Fgithub.com\u002Fvllm-project\u002Fvllm) 推理服务器中。vLLM 包含一个与 OpenAI 兼容的服务器，该服务器具备额外的功能，允许在无需编写自定义推理代码的情况下使用 LM Format Enforcer。\n\n您可以通过添加 [vLLM 命令行参数](https:\u002F\u002Fdocs.vllm.ai\u002Fen\u002Flatest\u002Fserving\u002Fopenai_compatible_server.html#command-line-arguments-for-the-server) 来在 vLLM 的 OpenAI 服务器中使用 LM Format Enforcer：\n\n```\npython -m vllm.entrypoints.openai.api_server \\\n  --model mistralai\u002FMistral-7B-Instruct-v0.2 \\\n  --guided-decoding-backend lm-format-enforcer\n```\n\n或者，您也可以在每次请求时通过将 `guided_decoding_backend` 参数与引导解码参数一起添加到请求中来实现：\n\n```\ncompletion = client.chat.completions.create(\n  model=\"mistralai\u002FMistral-7B-Instruct-v0.2\",\n  messages=[\n    {\"role\": \"user\", \"content\": \"请分类以下情感：LMFE 非常棒！\"}\n  ],\n  extra_body={\n    \"guided_regex\": \"[Pp]ositive|[Nn]egative\",\n    \"guided_decoding_backend\": \"lm-format-enforcer\"\n  }\n)\n```\n\n此外，JSON 模式和选项解码也支持通过 `guided_json` 和 `guided_choice` 
[额外参数](https:\u002F\u002Fdocs.vllm.ai\u002Fen\u002Flatest\u002Fserving\u002Fopenai_compatible_server.html#extra-parameters-for-chat-api) 实现。\n\n## 它是如何工作的？\n\n该库通过将字符级解析器和分词器前缀树结合，形成一种智能的标记过滤机制。\n\n![某个时间步中字符级解析器和分词器前缀树示例](https:\u002F\u002Fraw.githubusercontent.com\u002Fnoamgat\u002Flm-format-enforcer\u002Fmain\u002Fdocs\u002FTrees.drawio.svg?sanitize=true)\n\n### 字符级解析器\n\n将字符串解析为任何格式都可以被视为一种隐式的树结构——在解析过程中的任何时刻，都有一组允许的下一个字符；一旦选择了其中任何一个字符，就会产生一组新的允许的下一个字符，以此类推。\n\n`CharacterLevelParser` 是一个用于按照这种隐式结构进行解析的接口。`add_character()` 和 `get_allowed_characters()` 可以看作是树的遍历方法。\n\n该接口有多种实现：\n- `JsonSchemaParser` —— 根据 JSON 模式进行解析（或纯 JSON 输出——`JsonSchemaParser(None)` 将允许任何 JSON 对象）。\n- `StringParser` —— 强制生成精确的字符串（主要用于诊断）。\n- `RegexParser` —— 根据正则表达式进行解析。需要注意的是，它不能使用 Python 内置的正则表达式引擎，而是使用手动实现的版本（通过 [interegular](https:\u002F\u002Fpypi.org\u002Fproject\u002Finteregular\u002F) 库），因此无法完全覆盖所有的正则表达式标准。\n\n### 分词器前缀树\n\n给定某种语言模型所使用的分词器，我们可以构建该语言模型能够生成的所有标记的前缀树。具体做法是生成所有可能的标记序列，并将其添加到树中。\n详情参见 `TokenizerPrefixTree`。\n\n### 两者的结合\n\n有了字符级解析器和分词器前缀树，我们就可以优雅且高效地过滤出语言模型在下一个时间步中被允许生成的标记：\n我们只遍历同时存在于字符级解析节点和分词器前缀树节点中的字符。这样就能找到所有的标记（包括复杂的子词标记，例如 `\",\"`，这在 JSON 解析中至关重要）。我们在两棵树上递归执行此操作，并返回所有允许的标记。当语言模型生成一个标记时，我们会根据新字符推进字符级解析器，以便为下一个时间步的过滤做好准备。\n\n### 这种方法有何不同？为什么它更好？\n\n这并不是第一个用于强制语言模型输出格式的库。然而，其他类似的库（如 Guidance、JsonFormer 和 Outlines）会强制要求严格的输出格式。这意味着语言模型无法控制空格、字段的可选性以及字段的顺序（以 JSON 为例）。虽然这对人类来说似乎无关紧要，但这意味着语言模型可能不会生成它“想要”的 JSON 格式，从而导致其内部状态处于次优值，进而降低后续时间步的输出质量。\n\n这就迫使语言模型的使用者必须了解他们所使用的语言模型的具体细节（例如，JSON 是否在预训练之前就已经被压缩过？），并修改相关库以生成精确的格式。\n\n而我们则通过扫描潜在的下一个标记，并允许任何能够被解析为所需输出格式的标记序列来避免这一问题。这意味着语言模型可以自主控制这些方面，以最自然的方式输出与其风格相符的标记序列，而无需开发者深入了解底层细节。\n\n## 诊断 - 我是否总能获得良好的结果？\n\n使用该库可以确保输出符合指定格式，但并不能保证语义上的正确性。强制语言模型遵循特定的输出格式可能会导致幻觉现象增多。通过提示工程引导模型仍然更有可能改善结果。\n\n为了帮助您理解格式强制带来的激进性，如果您在调用 `generate_enforced()` 时，在 `kwargs` 中传递 `output_scores=True` 和 `return_dict_in_generate=True`（这些都是 `transformers` 库中已有的可选参数），您还将得到一个逐 token 的数据框，显示每个被选择的 token、其得分，以及如果不应用格式强制的话原本会被选择的 token 是什么。如果发现格式强制器迫使语言模型选择了权重非常低的 
token，这很可能是导致结果不佳的原因之一。您可以尝试修改提示词，引导语言模型的自然输出贴近目标格式，从而减少格式强制器的干预。\n\n以下是一个使用正则表达式格式 `Michael Jordan was Born in (\\d)+.` 的示例：\n\n| idx | generated_token | generated_token_idx | generated_score | leading_token | leading_token_idx | leading_score |\n| :------------ | :-------------| :-------------| :------------- | :------------ | :-------------| :-------------|\n| 0 | ▁ | 29871 | 1.000000 | ▁ | 29871 | 1.000000 |\n| 1 | Michael | 24083 | 0.000027 | ▁Sure | 18585 | 0.959473 |\n| 2 | ▁Jordan | 18284 | 1.000000 | ▁Jordan | 18284 | 1.000000 |\n| 3 | ▁was | 471 | 1.000000 | ▁was | 471 | 1.000000 |\n| 4 | ▁Born | 19298 | 0.000008 | ▁born | 6345 | 1.000000 |\n| 5 | ▁in | 297 | 0.994629 | ▁in | 297 | 0.994629 |\n| 6 | ▁ | 29871 | 0.982422 | ▁ | 29871 | 0.982422 |\n| 7 | 1 | 29896 | 1.000000 | 1 | 29896 | 1.000000 |\n| 8 | 9 | 29929 | 1.000000 | 9 | 29929 | 1.000000 |\n| 9 | 6 | 29953 | 1.000000 | 6 | 29953 | 1.000000 |\n| 10 | 3 | 29941 | 1.000000 | 3 | 29941 | 1.000000 |\n| 11 | . | 29889 | 0.999512 | . | 29889 | 0.999512 |\n| 12 | ```\u003C\u002Fs>``` | 2 | 0.981445 | ```\u003C\u002Fs>``` | 2 | 0.981445 |\n\n从表中可以看出，模型“想要”用 `Sure` 开头，但格式强制器迫使其生成了 `Michael`——第 1 个 token 处的得分差距很大。此后，几乎所有时间步中原本得分最高（leading）的 token 都落在允许的 token 集合内，说明模型大概率没有因 token 强制而产生幻觉。唯一的例外是第 4 个时间步——“Born”被强制生成，而模型原本想选择的是“born”。这给提示工程师一个启示：可以把正则表达式中的 “Born” 改为小写的 “born”。\n\n## 配置选项\n\nLM Format Enforcer 使用多种启发式方法来规避语言模型生成结构化输出时可能出现的边缘情况。\n有两种方式可以控制这些启发式方法：\n\n### 选项 1：通过环境变量\n\n有几个环境变量可以设置，它们会影响库的行为。这种方法在不方便修改代码时很有用，例如通过 vLLM OpenAI 服务器使用该库时。\n\n- `LMFE_MAX_CONSECUTIVE_WHITESPACES` - 解析 JsonSchema 对象时允许的最大连续空白字符数。默认值：12。\n- `LMFE_STRICT_JSON_FIELD_ORDER` - JsonSchemaParser 是否应强制属性按照 JsonSchema 的 `required` 列表中的顺序出现？（注意：这与 Pydantic 模型中的声明顺序一致）。默认值：False。\n- `LMFE_MAX_JSON_ARRAY_LENGTH` - 当模式未指定时，JSON 数组允许的最大长度。有助于防止 LLM 进入无限循环。默认值：20。\n- `LMFE_DEFAULT_ALPHABET` - 默认允许的字符集。请参阅 [consts.py](https:\u002F\u002Fgithub.com\u002Fnoamgat\u002Flm-format-enforcer\u002Fblob\u002Fmain\u002Flmformatenforcer\u002Fconsts.py#L1) 
了解默认值。可以覆盖并扩展以包含特定语言的字符。如果您希望这些字符作为 JsonSchemaParser 中的 JSON 键或枚举值出现，则需要此设置。\n\n### 选项 2：通过 CharacterLevelParserConfig 类\n当通过代码使用该库时，任何 `CharacterLevelParser`（如 `JsonSchemaParser`、`RegexParser` 等）的构造函数都会接收一个可选的 `CharacterLevelParserConfig` 对象。\n\n因此，要配置单个解析器的启发式方法，只需实例化一个 `CharacterLevelParserConfig` 对象，修改其值，并将其传递给 `CharacterLevelParser` 的构造函数。\n\n\n\n## 已知问题和限制\n\n- LM Format Enforcer 需要 Python API 来处理语言模型的输出 logits。这意味着在 API 尚未扩展之前，它无法与 OpenAI ChatGPT 及类似基于 API 的解决方案一起使用。\n- 正则表达式语法并非 100% 支持。有关详细信息，请参阅 [interegular](https:\u002F\u002Fpypi.org\u002Fproject\u002Finteregular\u002F)。\n- LM Format Enforcer 的正则表达式解析器只能生成存在于分词器词汇表中的字符。这一点可能会在后续版本中得到解决，详情请参见 [GitHub 上的议题](https:\u002F\u002Fgithub.com\u002Fnoamgat\u002Flm-format-enforcer\u002Fissues\u002F13)。\n\n\n## 贡献者及贡献方式\n\n有关贡献者列表，请参阅 [CONTRIBUTORS.md](https:\u002F\u002Fgithub.com\u002Fnoamgat\u002Flm-format-enforcer\u002Fblob\u002Fmain\u002FCONTRIBUTORS.md)。","# lm-format-enforcer 快速上手指南\n\n`lm-format-enforcer` 是一个用于强制语言模型输出特定格式（如 JSON Schema、正则表达式等）的开源工具。它通过在生成过程的每一步过滤允许的 Token，确保输出严格符合指定格式，同时最大程度保留模型的生成自由度。\n\n## 环境准备\n\n*   **系统要求**：支持 Linux、macOS 和 Windows。若在 Google Colab 或本地使用 GPU 加速，建议配备 NVIDIA GPU 并安装对应的 CUDA 驱动。\n*   **前置依赖**：\n    *   Python 3.8+\n    *   主流深度学习框架（如 `transformers`, `torch`）\n    *   目标语言模型（本示例以 Llama-2 为例）\n\n> **国内加速建议**：\n> 在中国大陆地区，建议使用国内镜像源加速依赖下载：\n> *   PyPI 镜像：`https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple`\n> *   Hugging Face 镜像：设置环境变量 `HF_ENDPOINT=https:\u002F\u002Fhf-mirror.com`\n\n## 安装步骤\n\n使用 pip 安装核心库。如果使用量化模型（如 GPTQ），还需安装相关依赖。\n\n```bash\n# 设置国内 Hugging Face 镜像 (可选，推荐国内用户)\nexport HF_ENDPOINT=https:\u002F\u002Fhf-mirror.com\n\n# 安装核心库\npip install lm-format-enforcer -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple\n\n# 若需在 Colab 或本地运行完整示例 (包含 transformers, torch 等)\npip install transformers torch huggingface_hub optimum -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple\n\n# 若使用 GPTQ 量化模型 (如 Llama-2-GPTQ)，需额外安装 auto-gptq\npip install auto-gptq 
--extra-index-url https:\u002F\u002Fhuggingface.github.io\u002Fautogptq-index\u002Fwhl\u002Fcu118\u002F\n```\n\n## 基本使用\n\n以下示例演示如何使用 `lm-format-enforcer` 强制 Llama-2 模型输出符合特定 Pydantic 模型的 JSON 数据。\n\n### 代码示例\n\n```python\nfrom pydantic import BaseModel\nfrom lmformatenforcer import JsonSchemaParser\nfrom lmformatenforcer.integrations.transformers import build_transformers_prefix_allowed_tokens_fn\nfrom transformers import pipeline\n\n# 1. 定义期望的输出格式 (Pydantic Model)\nclass AnswerFormat(BaseModel):\n    first_name: str\n    last_name: str\n    year_of_birth: int\n    num_seasons_in_nba: int\n\n# 2. 创建 transformers 管道 (加载模型)\n# 注意：首次运行会自动下载模型，国内用户请确保已设置 HF_ENDPOINT\nhf_pipeline = pipeline('text-generation', model='TheBloke\u002FLlama-2-7b-Chat-GPTQ', device_map='auto')\n\n# 构造提示词，包含 JSON Schema 信息\nprompt = f'Here is information about Michael Jordan in the following json schema: {AnswerFormat.schema_json()} :\\n'\n\n# 3. 创建解析器并构建 transformers 的前缀过滤函数\nparser = JsonSchemaParser(AnswerFormat.schema())\nprefix_function = build_transformers_prefix_allowed_tokens_fn(hf_pipeline.tokenizer, parser)\n\n# 4. 调用管道生成内容 (传入 prefix_allowed_tokens_fn)\noutput_dict = hf_pipeline(prompt, prefix_allowed_tokens_fn=prefix_function)\n\n# 5. 
提取结果\nresult = output_dict[0]['generated_text'][len(prompt):]\nprint(result)\n# 预期输出示例: {\"first_name\": \"Michael\", \"last_name\": \"Jordan\", \"year_of_birth\": 1963, \"num_seasons_in_nba\": 15}\n```\n\n### 关键点说明\n*   **JsonSchemaParser**：将 Pydantic 模型或 JSON Schema 转换为字符级解析器。\n*   **build_transformers_prefix_allowed_tokens_fn**：将解析器与模型的 Tokenizer 结合，生成一个过滤函数。\n*   **prefix_allowed_tokens_fn**：在 `pipeline` 或 `model.generate` 中传入此参数，即可在生成过程中实时拦截不符合格式的 Token。\n\n该工具还支持正则表达式 (`RegexParser`)、无模式 JSON (`JsonSchemaParser(None)`) 以及 LangChain、LlamaIndex、vLLM 等多种框架的集成。","某电商数据团队需要利用大模型从海量非结构化的用户评论中批量提取商品属性（如颜色、尺寸、评分），并将其自动存入数据库。\n\n### 没有 lm-format-enforcer 时\n- **格式解析频繁失败**：模型偶尔会输出多余的开场白或不合法的 JSON 符号（如末尾多逗号），导致后端代码解析报错，需编写复杂的正则清洗逻辑。\n- **字段类型不可控**：模型常将“评分”生成为文字描述（如“五星”）而非整数，或遗漏必填字段，迫使开发人员增加额外的校验和重试机制。\n- **处理效率低下**：为了保证格式正确，不得不设计繁琐的提示词（Prompt Engineering）并进行多次尝试，显著增加了推理延迟和 Token 消耗。\n- **批处理难以稳定**：在并发处理大量评论时，个别格式错误的样本会导致整个批次任务中断，维护成本极高。\n\n### 使用 lm-format-enforcer 后\n- **输出严格合规**：通过预定义 JSON Schema，lm-format-enforcer 在生成每个 token 时实时过滤，确保输出 100% 符合标准 JSON 格式，无需任何后处理清洗。\n- **数据类型精准锁定**：强制模型将“评分”仅生成为整数，并保证所有必填字段完整存在，直接对接数据库写入接口。\n- **推理流程简化**：移除了复杂的提示词技巧和重试逻辑，单次调用即可得到可用结果，大幅降低了系统延迟和算力成本。\n- **大规模稳定运行**：支持批量生成且互不干扰，即使在高并发场景下也能保证每条数据的格式一致性，彻底消除了因格式错误导致的任务中断。\n\nlm-format-enforcer 通过在生成阶段实时约束 token 选择，将大模型从“不可控的文本生成器”转变为“高可靠的结构化数据引擎”。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fnoamgat_lm-format-enforcer_8abac9d2.png","noamgat","Noam Gat","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Fnoamgat_0edc48c6.png",null,"https:\u002F\u002Fgithub.com\u002Fnoamgat",[79],{"name":80,"color":81,"percentage":82},"Python","#3572A5",100,2008,89,"2026-04-17T02:28:18","MIT","未说明","非必需（库本身为纯 Python），但运行示例需 GPU。Colab 示例使用 T4；auto-gptq 安装示例指定 CUDA 11.8 (cu118)。",{"notes":90,"python":87,"dependencies":91},"该库核心为纯 Python 实现，通过 pip 直接安装即可。GPU 需求取决于所选用的后端模型（如 Llama-2-GPTQ）及是否使用量化库（如 auto-gptq）。README 中的 Colab 示例展示了在 T4 GPU 上运行量化模型的配置，并指定了 CUDA 11.8 的 wheel 源。支持多种集成后端（vLLM, 
llama.cpp, TensorRT-LLM 等），具体硬件需求视所选后端而定。",[92,93,94,95,96,97],"transformers","torch","huggingface_hub","optimum","pydantic","interegular",[35,14],"2026-03-27T02:49:30.150509","2026-04-18T09:19:37.413012",[102,107,112,117,122,126],{"id":103,"question_zh":104,"answer_zh":105,"source_url":106},39207,"在使用 vLLM\u002FAphrodite 引擎配合 JSON Schema 时，为什么 lm-format-enforcer 会生成大量错误的 `\":\"` 作为 JSON 属性名？","该问题已在 v0.10.1 版本中修复。请升级您的 lm-format-enforcer 到 v0.10.1 或更高版本。升级后，您还可以查阅项目的“配置选项”（Configuration Options）部分，确认是否需要调整特定参数以优化解析效果。","https:\u002F\u002Fgithub.com\u002Fnoamgat\u002Flm-format-enforcer\u002Fissues\u002F94",{"id":108,"question_zh":109,"answer_zh":110,"source_url":111},39208,"为什么模型输出经常提前结束并返回无效的 JSON（例如缺少闭合括号）？","这通常是因为生成的 token 数量达到了默认限制。解决方法包括：\n1. 增加 `max_tokens` 参数的值（在较新版本中默认值可能已降低，需手动调大）。\n2. 尝试升级底层推理引擎，例如将 llama.cpp 升级到 0.2.37 或更高版本，该版本修复了相关的提前停止问题。","https:\u002F\u002Fgithub.com\u002Fnoamgat\u002Flm-format-enforcer\u002Fissues\u002F34",{"id":113,"question_zh":114,"answer_zh":115,"source_url":116},39209,"当 JSON Schema 中包含 `\"additionalProperties\": true` 或使用复杂的 `oneOf` 定义时，为什么会报错 `AttributeError: 'bool' object has no attribute 'get'` 或 `Unknown LMFormatEnforcer Problem`？","这是一个已知的解析器缺陷，发生在处理布尔类型的 `additionalProperties` 或复杂嵌套类型时。该问题已在主分支（master）中修复。请拉取最新的代码或使用包含此修复的最新发布版本，即可正常支持 `\"additionalProperties\": true` 及复杂的 `oneOf` 架构定义。","https:\u002F\u002Fgithub.com\u002Fnoamgat\u002Flm-format-enforcer\u002Fissues\u002F129",{"id":118,"question_zh":119,"answer_zh":120,"source_url":121},39210,"初始化过程非常缓慢（特别是对于 Qwen 等大词表模型），需要超过一分钟，如何优化？","初始化慢的问题已在 v0.9.6 版本中通过优化得到解决。主要改进包括：\n1. ExLlamaV2 集成现在直接从 `ExLlamaV2Tokenizer` 读取词表，避免了对每个 token 调用昂贵的 `decode()` 方法。\n2. 
优化了 `JsonFreetextTokenCache` 的构建逻辑，使用整数集合交集运算。\n请确保您已升级到 v0.9.6 或更高版本以获得性能提升。","https:\u002F\u002Fgithub.com\u002Fnoamgat\u002Flm-format-enforcer\u002Fissues\u002F75",{"id":123,"question_zh":124,"answer_zh":125,"source_url":121},39211,"如何在 ExLlamaV2 中集成并使用 lm-format-enforcer 进行 JSON Schema 约束生成？","可以使用 `ExLlamaV2TokenEnforcerFilter` 类进行集成。基本步骤如下：\n1. 导入必要的模块：`from lmformatenforcer.integrations.exllamav2 import ExLlamaV2TokenEnforcerFilter` 和 `from lmformatenforcer import JsonSchemaParser`。\n2. 定义您的 Pydantic 模型或 JSON Schema。\n3. 创建解析器：`parser = JsonSchemaParser(your_schema)`。\n4. 创建过滤器：`filter = ExLlamaV2TokenEnforcerFilter(parser, tokenizer)`。\n5. 在生成器设置中将此过滤器应用到采样器（Sampler）中。具体代码示例可参考相关 Issue 中的讨论片段。",{"id":127,"question_zh":128,"answer_zh":129,"source_url":116},39212,"如果遇到与 `exclusiveMaximum` 或 `exclusiveMinimum` 相关的验证错误，应该如何处理？","这类错误通常是由于 JSON Schema 中这些属性的格式不符合解析器预期（例如期望对象却传入了布尔值或数值）导致的内部验证异常。建议检查您的 Schema 定义，确保 `exclusiveMaximum` 和 `exclusiveMinimum` 的值格式正确。如果使用的是自动生成的 Schema 且无法修改，请尝试升级到最新版本，因为维护者已在后续更新中增强了对各类 Schema 边缘情况的兼容性处理。",[131,136,141,146,151,156,161,166,171,176,181,186,191,196,201,206,211,216,221,226],{"id":132,"version":133,"summary_zh":134,"released_at":135},315131,"v0.11.2","- 针对 vLLM V1 集成的 minor 修复","2025-08-09T05:26:21",{"id":137,"version":138,"summary_zh":139,"released_at":140},315132,"v0.11.1","向 TokenEnforcerTokenizerData 添加了 use_bitmask 标志，使允许的标记数据基于 PyTorch 张量位掩码，以实现与 vLLM V1 的顺畅集成。如果你直接使用 TokenEnforcer.get_allowed_tokens() 函数，这将是一个破坏性变更，因为其返回类型已更改。","2025-08-08T16:53:52",{"id":142,"version":143,"summary_zh":144,"released_at":145},315133,"v0.10.12","- 需要对 vLLM V1 引擎支持进行小幅改进","2025-08-04T21:12:55",{"id":147,"version":148,"summary_zh":149,"released_at":150},315134,"v0.10.11","- 修复了对更新版本的 Transformers 库的支持","2025-02-26T22:17:44",{"id":152,"version":153,"summary_zh":154,"released_at":155},315135,"v0.10.10","- 在 JSON Schema 中为字段添加了多类型支持（由 [Guillaume Calmettes](https:\u002F\u002Fgithub.com\u002Fgcalmettes) 贡献）\n- 允许通过环境变量配置字母表（解决 
https:\u002F\u002Fgithub.com\u002Fnoamgat\u002Flm-format-enforcer\u002Fissues\u002F151）\n- 修复了 CI 构建问题","2025-02-15T12:00:59",{"id":157,"version":158,"summary_zh":159,"released_at":160},315136,"v0.10.9","- 新增示例：[从视觉模型中提取结构化数据](https:\u002F\u002Fgithub.com\u002Fnoamgat\u002Flm-format-enforcer\u002Fblob\u002Fmain\u002Fsamples\u002Fcolab_llama32_vision_enforcer.ipynb)","2024-10-16T14:43:45",{"id":162,"version":163,"summary_zh":164,"released_at":165},315137,"v0.10.8","- 修复了对词汇表规模大于模型的分词器的支持问题。现已支持诸如 Llama3.2 等视觉模型。\n- JsonSchemaParser：修复了 additionalProperties 和 oneOf 的边缘情况。","2024-10-16T12:19:57",{"id":167,"version":168,"summary_zh":169,"released_at":170},315138,"v0.10.7","## v0.10.7\n- [135] 更新了 Haystack V2 集成，使用最新 API\n","2024-09-07T10:32:36",{"id":172,"version":173,"summary_zh":174,"released_at":175},315139,"v0.10.6","# v0.10.6\n- 优化了序列化，以更好地支持多进程","2024-08-05T18:31:53",{"id":177,"version":178,"summary_zh":179,"released_at":180},315140,"v0.10.5","## v0.10.5\n- 优化了 SequenceParser 的性能\n- JsonSchemaParser：数字解析支持指数表示法\n- 支持具有多个 EOS token ID 的分词器","2024-07-27T05:11:39",{"id":182,"version":183,"summary_zh":184,"released_at":185},315141,"v0.10.4","## v0.10.4\r\n- Added default max Json array length to help LLMs avoid infinite loops. 
See README for details.\r\n- Updated EXLlamaV2 example to updated API","2024-07-15T19:10:26",{"id":187,"version":188,"summary_zh":189,"released_at":190},315142,"v0.10.3","## v0.10.3\r\n- [#113] TRTLLM Support: Fixing type incompatibility in certain cases \u002F library versions\r\n","2024-06-20T18:16:30",{"id":192,"version":193,"summary_zh":194,"released_at":195},315143,"v0.10.2","- [#100] JsonSchemaParser: Added allOf support\r\n- [#99] JsonSchemaParser: Fixed edge case that would allow leading comma in JSON Array \r\n- [#102] JsonSchemaParser: Fixed Array of Enums not producing multiple values\r\n","2024-05-17T05:57:49",{"id":197,"version":198,"summary_zh":199,"released_at":200},315144,"v0.10.1","- Added ability to config parsing heuristics via environment variables. See [Configuration Options in README](https:\u002F\u002Fgithub.com\u002Fnoamgat\u002Flm-format-enforcer?tab=readme-ov-file#configuration-options)\r\n- JsonSchemaParser: Added support for strict json field order and max consecutive whitespace configuration overrides.","2024-05-04T14:42:50",{"id":202,"version":203,"summary_zh":204,"released_at":205},315145,"v0.9.10","- [#95] Added anyOf support to JsonSchemaParser, making function calls possible.\r\n","2024-05-03T08:58:53",{"id":207,"version":208,"summary_zh":209,"released_at":210},315146,"v0.9.9","- Updated README with vLLM OpenAI Server Inference integration","2024-04-24T18:26:09",{"id":212,"version":213,"summary_zh":214,"released_at":215},315147,"v0.9.8","- [#80] JSONSchemaParser List would allow opening comma before first element if there was a whitespace before it\r\n","2024-04-20T06:58:26",{"id":217,"version":218,"summary_zh":219,"released_at":220},315148,"v0.9.7","## v0.9.7\r\n- [#93] Improved JSONSchemaParser performance, unit tests run twice as fast! Joint effort with [Ari Weinstein](https:\u002F\u002Fgithub.com\u002FAriX). Thanks! 
","2024-04-20T06:28:29",{"id":222,"version":223,"summary_zh":224,"released_at":225},315149,"v0.9.6","## v0.9.6\r\n- [#88] ExllamaV2 optimizations\r\n- Bugfix in ExllamaV2 sample notebook that generated garbage data after the response.\r\n","2024-04-19T07:43:16",{"id":227,"version":228,"summary_zh":76,"released_at":229},315150,"v0.9.3","2024-03-13T20:42:25"]