[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-zhudotexe--kani":3,"tool-zhudotexe--kani":61},[4,18,26,36,44,53],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":17},4358,"openclaw","openclaw\u002Fopenclaw","OpenClaw 是一款专为个人打造的本地化 AI 助手，旨在让你在自己的设备上拥有完全可控的智能伙伴。它打破了传统 AI 助手局限于特定网页或应用的束缚，能够直接接入你日常使用的各类通讯渠道，包括微信、WhatsApp、Telegram、Discord、iMessage 等数十种平台。无论你在哪个聊天软件中发送消息，OpenClaw 都能即时响应，甚至支持在 macOS、iOS 和 Android 设备上进行语音交互，并提供实时的画布渲染功能供你操控。\n\n这款工具主要解决了用户对数据隐私、响应速度以及“始终在线”体验的需求。通过将 AI 部署在本地，用户无需依赖云端服务即可享受快速、私密的智能辅助，真正实现了“你的数据，你做主”。其独特的技术亮点在于强大的网关架构，将控制平面与核心助手分离，确保跨平台通信的流畅性与扩展性。\n\nOpenClaw 非常适合希望构建个性化工作流的技术爱好者、开发者，以及注重隐私保护且不愿被单一生态绑定的普通用户。只要具备基础的终端操作能力（支持 macOS、Linux 及 Windows WSL2），即可通过简单的命令行引导完成部署。如果你渴望拥有一个懂你",349277,3,"2026-04-06T06:32:30",[13,14,15,16],"Agent","开发框架","图像","数据工具","ready",{"id":19,"name":20,"github_repo":21,"description_zh":22,"stars":23,"difficulty_score":10,"last_commit_at":24,"category_tags":25,"status":17},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,"2026-04-05T11:01:52",[14,15,13],{"id":27,"name":28,"github_repo":29,"description_zh":30,"stars":31,"difficulty_score":32,"last_commit_at":33,"category_tags":34,"status":17},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 
代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",150720,2,"2026-04-11T11:33:10",[14,13,35],"语言模型",{"id":37,"name":38,"github_repo":39,"description_zh":40,"stars":41,"difficulty_score":32,"last_commit_at":42,"category_tags":43,"status":17},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",108322,"2026-04-10T11:39:34",[14,15,13],{"id":45,"name":46,"github_repo":47,"description_zh":48,"stars":49,"difficulty_score":32,"last_commit_at":50,"category_tags":51,"status":17},6121,"gemini-cli","google-gemini\u002Fgemini-cli","gemini-cli 是一款由谷歌推出的开源 AI 命令行工具，它将强大的 Gemini 大模型能力直接集成到用户的终端环境中。对于习惯在命令行工作的开发者而言，它提供了一条从输入提示词到获取模型响应的最短路径，无需切换窗口即可享受智能辅助。\n\n这款工具主要解决了开发过程中频繁上下文切换的痛点，让用户能在熟悉的终端界面内直接完成代码理解、生成、调试以及自动化运维任务。无论是查询大型代码库、根据草图生成应用，还是执行复杂的 Git 操作，gemini-cli 都能通过自然语言指令高效处理。\n\n它特别适合广大软件工程师、DevOps 人员及技术研究人员使用。其核心亮点包括支持高达 100 万 token 的超长上下文窗口，具备出色的逻辑推理能力；内置 Google 搜索、文件操作及 Shell 命令执行等实用工具；更独特的是，它支持 MCP（模型上下文协议），允许用户灵活扩展自定义集成，连接如图像生成等外部能力。此外，个人谷歌账号即可享受免费的额度支持，且项目基于 Apache 2.0 
协议完全开源，是提升终端工作效率的理想助手。",100752,"2026-04-10T01:20:03",[52,13,15,14],"插件",{"id":54,"name":55,"github_repo":56,"description_zh":57,"stars":58,"difficulty_score":32,"last_commit_at":59,"category_tags":60,"status":17},4721,"markitdown","microsoft\u002Fmarkitdown","MarkItDown 是一款由微软 AutoGen 团队打造的轻量级 Python 工具，专为将各类文件高效转换为 Markdown 格式而设计。它支持 PDF、Word、Excel、PPT、图片（含 OCR）、音频（含语音转录）、HTML 乃至 YouTube 链接等多种格式的解析，能够精准提取文档中的标题、列表、表格和链接等关键结构信息。\n\n在人工智能应用日益普及的今天，大语言模型（LLM）虽擅长处理文本，却难以直接读取复杂的二进制办公文档。MarkItDown 恰好解决了这一痛点，它将非结构化或半结构化的文件转化为模型“原生理解”且 Token 效率极高的 Markdown 格式，成为连接本地文件与 AI 分析 pipeline 的理想桥梁。此外，它还提供了 MCP（模型上下文协议）服务器，可无缝集成到 Claude Desktop 等 LLM 应用中。\n\n这款工具特别适合开发者、数据科学家及 AI 研究人员使用，尤其是那些需要构建文档检索增强生成（RAG）系统、进行批量文本分析或希望让 AI 助手直接“阅读”本地文件的用户。虽然生成的内容也具备一定可读性，但其核心优势在于为机器",93400,"2026-04-06T19:52:38",[52,14],{"id":62,"github_repo":63,"name":64,"description_en":65,"description_zh":66,"ai_summary_zh":66,"readme_en":67,"readme_zh":68,"quickstart_zh":69,"use_case_zh":70,"hero_image_url":71,"owner_login":72,"owner_name":73,"owner_avatar_url":74,"owner_bio":75,"owner_company":76,"owner_location":77,"owner_email":78,"owner_twitter":79,"owner_website":80,"owner_url":81,"languages":82,"stars":95,"forks":96,"last_commit_at":97,"license":98,"difficulty_score":32,"env_os":99,"env_gpu":100,"env_ram":101,"env_deps":102,"category_tags":114,"github_topics":115,"view_count":32,"oss_zip_url":79,"oss_zip_packed_at":79,"status":17,"created_at":126,"updated_at":127,"faqs":128,"releases":159},6701,"zhudotexe\u002Fkani","kani","kani (カニ) is a highly hackable microframework for tool-calling language models. 
(NLP-OSS @ EMNLP 2023)","kani 是一款专为聊天型语言模型设计的轻量级微框架，核心聚焦于“工具调用”与“函数执行”能力。它旨在解决现有大模型框架往往过于臃肿、定制灵活性不足的问题，让开发者能够以更细粒度掌控对话流程中的关键环节，从而轻松构建具备复杂交互能力的智能应用。\n\n无论是自然语言处理领域的研究人员、热衷探索的开发者，还是希望深度定制模型行为的极客，kani 都是理想选择。其最大亮点在于高度的“可黑客性”（hackable）与模型无关架构：不仅原生支持 OpenAI、Anthropic、Google、Hugging Face、llama.cpp 及 vLLM 等主流模型后端，还允许用户通过社区扩展无缝接入更多引擎。这种设计既降低了多模型切换的成本，又为实验性研究提供了极大自由。\n\nkani 摒弃了过度封装的“黑盒”逻辑，鼓励用户根据实际需求灵活调整控制流，特别适合需要快速原型验证或深入理解工具调用机制的场景。配合清晰的文档、丰富的示例代码以及活跃的社区支持，kani 让构建智能对话系统变得简单而高效。","\u003Cp align=\"center\">\n  \u003Cimg width=\"256\" height=\"256\" alt=\"kani\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fzhudotexe_kani_readme_2f06342ab8c6.png\">\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fzhudotexe\u002Fkani\u002Factions\u002Fworkflows\u002Fpytest.yml\">\n    \u003Cimg alt=\"Test Package\" src=\"https:\u002F\u002Fgithub.com\u002Fzhudotexe\u002Fkani\u002Factions\u002Fworkflows\u002Fpytest.yml\u002Fbadge.svg\">\n  \u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Fkani.readthedocs.io\u002Fen\u002Flatest\u002F?badge=latest\">\n    \u003Cimg alt=\"Documentation Status\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fzhudotexe_kani_readme_13d664e1afd7.png\">\n  \u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Fpypi.org\u002Fproject\u002Fkani\u002F\">\n    \u003Cimg alt=\"PyPI\" src=\"https:\u002F\u002Fimg.shields.io\u002Fpypi\u002Fv\u002Fkani\">\n  \u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Fcolab.research.google.com\u002Fgithub\u002Fzhudotexe\u002Fkani\u002Fblob\u002Fmain\u002Fexamples\u002Fcolab_examples.ipynb\">\n    \u003Cimg alt=\"Quickstart in Colab\" src=\"https:\u002F\u002Fcolab.research.google.com\u002Fassets\u002Fcolab-badge.svg\">\n  \u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Fdiscord.gg\u002FeTepTNDxYT\">\n    \u003Cimg alt=\"Discord\" src=\"https:\u002F\u002Fimg.shields.io\u002Fdiscord\u002F1150902904773935214?color=5865F2&label=discord&logo=discord&logoColor=white\">\n 
 \u003C\u002Fa>\n  \u003Cbr\u002F>\n  \u003Ca href=\"examples\u002F4_engines_zoo.py\">\n    \u003Cimg alt=\"Model zoo\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fexamples-model_zoo-blue\">\n  \u003C\u002Fa>\n  \u003Ca href=\"examples\u002F5_advanced_retrieval.py\">\n    \u003Cimg alt=\"Retrieval example\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fexamples-retrieval-blue\">\n  \u003C\u002Fa>\n\u003C\u002Fp>\n\n# kani (カニ)\n\nkani (カニ) is a lightweight and highly hackable framework for chat-based language models with **tool usage\u002Ffunction\ncalling**.\n\nCompared to other LM frameworks, kani is less opinionated and offers more fine-grained customizability\nover the parts of the control flow that matter, making it the perfect choice for NLP researchers, hobbyists, and\ndevelopers alike.\n\nkani comes with support for the following models out of the box, with a **model-agnostic** framework to add support for\nmany more:\n\n- OpenAI Models (`pip install \"kani[openai]\"`)\n- Anthropic Models (`pip install \"kani[anthropic]\"`)\n- Google AI Models (`pip install \"kani[google]\"`)\n- Hugging Face transformers (`pip install \"kani[huggingface]\"`)\n- llama.cpp (`pip install \"kani[cpp]\"`)\n- vLLM (`pip install \"kani[vllm]\"`)\n- and more with [community extensions](https:\u002F\u002Fkani.readthedocs.io\u002Fen\u002Flatest\u002Fcommunity\u002Fextensions.html)!\n\n**Check out the [Model Zoo](examples\u002F4_engines_zoo.py) for code examples of loading popular models in Kani!**\n\nInterested in contributing? Check out our [guide](https:\u002F\u002Fkani.readthedocs.io\u002Fen\u002Flatest\u002Fcommunity\u002Fcontributing.html).\n\n[Read the docs on ReadTheDocs!](http:\u002F\u002Fkani.readthedocs.io\u002F)\n\n[Read our paper on arXiv!](https:\u002F\u002Farxiv.org\u002Fabs\u002F2309.05542)\n\n## Installation\n\nkani requires Python 3.10 or above. 
To install model-specific dependencies, kani uses various extras (brackets after\nthe library name in `pip install`). To determine which extra(s) to install, see\nthe [model table](https:\u002F\u002Fkani.readthedocs.io\u002Fen\u002Flatest\u002Fengines.html), or use the `[all]` extra to install everything.\n\n```shell\n# for OpenAI models\n$ pip install \"kani[openai]\"\n# for Hugging Face models\n$ pip install \"kani[huggingface]\" torch\n# for multimodal inputs\n$ pip install \"kani[multimodal]\"\n# or install everything:\n$ pip install \"kani[all]\"\n```\n\nFor the most up-to-date changes and new models, you can also install the development version from Git's `main` branch:\n\n```shell\n$ pip install \"kani[all] @ git+https:\u002F\u002Fgithub.com\u002Fzhudotexe\u002Fkani.git@main\"\n```\n\n## Quickstart\n\n\u003Ca href=\"https:\u002F\u002Fcolab.research.google.com\u002Fgithub\u002Fzhudotexe\u002Fkani\u002Fblob\u002Fmain\u002Fexamples\u002Fcolab_examples.ipynb\">\n  \u003Cimg alt=\"Quickstart in Colab\" src=\"https:\u002F\u002Fcolab.research.google.com\u002Fassets\u002Fcolab-badge.svg\">\n\u003C\u002Fa>\n\nkani requires Python 3.10 or above.\n\nFirst, install the library. In this quickstart, we'll use the OpenAI engine, though kani\nis [model-agnostic](https:\u002F\u002Fkani.readthedocs.io\u002Fen\u002Flatest\u002Fengines.html).\n\n```shell\n$ pip install \"kani[openai]\"\n```\n\nThen, let's use kani to create a simple chatbot using ChatGPT as a backend.\n\n```python\n# import the library\nimport asyncio\nfrom kani import Kani, chat_in_terminal\nfrom kani.engines.openai import OpenAIEngine\n\n# Replace this with your OpenAI API key: https:\u002F\u002Fplatform.openai.com\u002Faccount\u002Fapi-keys\napi_key = \"sk-...\"\n\n# kani uses an Engine to interact with the language model. 
You can specify other model \n# parameters here, like temperature=0.7.\nengine = OpenAIEngine(api_key, model=\"gpt-5-nano\")\n\n# The kani manages the chat state, prompting, and function calling. Here, we only give \n# it the engine to call ChatGPT, but you can specify other parameters like \n# system_prompt=\"You are...\" here.\nai = Kani(engine)\n\n# kani comes with a utility to interact with a kani through your terminal...\nchat_in_terminal(ai)\n\n\n# or you can use kani programmatically in an async function!\nasync def main():\n    resp = await ai.chat_round(\"What is the airspeed velocity of an unladen swallow?\")\n    print(resp.text)\n\n\nasyncio.run(main())\n```\n\nkani makes the time to set up a working chat model short, while offering the programmer deep customizability over\nevery prompt, function call, and even the underlying language model.\n\n## Function Calling\n\nFunction calling gives language models the ability to choose when to call a function you provide based off its\ndocumentation.\n\n> [!NOTE]\n> Looking for MCP support? Kani supports local and remote MCP servers, too. 
Check out the MCP docs at\n> https:\u002F\u002Fkani.readthedocs.io\u002Fen\u002Flatest\u002Ffunction_calling.html#mcp-tools.\n\nWith kani, you can write functions in Python and expose them to the model with just one line of code: the `@ai_function`\ndecorator.\n\n```python\n# import the library\nimport asyncio\nfrom typing import Annotated\nfrom kani import AIParam, Kani, ai_function, chat_in_terminal, ChatRole\nfrom kani.engines.openai import OpenAIEngine\n\n# set up the engine as above\napi_key = \"sk-...\"\nengine = OpenAIEngine(api_key, model=\"gpt-4o-mini\")\n\n\n# subclass Kani to add AI functions\nclass MyKani(Kani):\n    # Adding the annotation to a method exposes it to the AI\n    @ai_function()\n    def get_weather(\n        self,\n        # and you can provide extra documentation about specific parameters\n        location: Annotated[str, AIParam(desc=\"The city and state, e.g. San Francisco, CA\")],\n    ):\n        \"\"\"Get the current weather in a given location.\"\"\"\n        # In this example, we mock the return, but you could call a real weather API\n        return f\"Weather in {location}: Sunny, 72 degrees fahrenheit.\"\n\n\nai = MyKani(engine)\n\n# the terminal utility allows you to test function calls...\nchat_in_terminal(ai)\n\n\n# and you can track multiple rounds programmatically.\nasync def main():\n    async for msg in ai.full_round(\"What's the weather in Tokyo?\"):\n        print(msg.role, msg.text)\n\n\nasyncio.run(main())\n```\n\nkani guarantees that function calls are valid by the time they reach your methods while allowing you to focus on\nwriting code. For more information, check\nout [the function calling docs](https:\u002F\u002Fkani.readthedocs.io\u002Fen\u002Flatest\u002Ffunction_calling.html).\n\n## Streaming\n\nkani supports streaming responses from the underlying language model token-by-token, even in the presence of function\ncalls. 
Streaming is designed to be a drop-in superset of the ``chat_round`` and ``full_round`` methods, allowing you to\ngradually refactor your code without ever leaving it in a broken state.\n\n```python\nasync def stream_chat():\n    stream = ai.chat_round_stream(\"What does kani mean?\")\n    async for token in stream:\n        print(token, end=\"\")\n    print()\n    msg = await stream.message()  # or `await stream`\n\n\nasync def stream_with_function_calling():\n    async for stream in ai.full_round_stream(\"What's the weather in Tokyo?\"):\n        async for token in stream:\n            print(token, end=\"\")\n        print()\n        msg = await stream.message()\n```\n\n## Multimodal Inputs\n\nkani optionally supports multimodal inputs (images, audio, video) for various language models. To use multimodal inputs,\ninstall the `kani-multimodal-core` extension package or use `pip install \"kani[multimodal]\"`. See the\nkani-multimodal-core documentation for more info. \n\n[Read the kani-multimodal-core docs!](https:\u002F\u002Fkani-multimodal-core.readthedocs.io)\n\n```python\nfrom kani import Kani\nfrom kani.engines.openai import OpenAIEngine\nfrom kani.ext.multimodal_core import ImagePart\n\nengine = OpenAIEngine(model=\"gpt-4.1-nano\")\nai = Kani(engine)\n\n# notice how the arg is a list of parts rather than a single str!\nmsg = await ai.chat_round_str([\n    \"Please describe these images:\",\n    ImagePart.from_file(\"path\u002Fto\u002Fimage.png\"),\n    await ImagePart.from_url(\n        \"https:\u002F\u002Fupload.wikimedia.org\u002Fwikipedia\u002Fcommons\u002Fthumb\u002F5\u002F53\u002FWhitehead%27s_Trogon_0A2A6014.jpg\u002F1024px-Whitehead%27s_Trogon_0A2A6014.jpg\"\n    ),\n])\nprint(msg)\n\n```\n\nMultimodal handling is deeply integrated with the rest of the kani ecosystem, so you get all the benefits of kani's\nfluent tool usage and automatic context management with minimal development cost!\n\n## `kani` CLI\n\nkani comes with a CLI for you to chat with a 
model in your terminal with zero setup.\n\nThe `kani` CLI takes the form of `$ kani \u003Cprovider>:\u003Cmodel-id>`. Use `kani --help` for more information.\n\nExamples:\n```shell\n$ kani openai:gpt-4.1-nano\n$ kani huggingface:meta-llama\u002FMeta-Llama-3-8B-Instruct\n$ kani anthropic:claude-sonnet-4-0\n$ kani google:gemini-2.5-flash\n```\n\nThis CLI helper automatically creates an Engine and Kani instance, and calls `chat_in_terminal()` so you can test LLMs\nfaster. When `kani-multimodal-core` is installed, you can provide multimodal media on your disk or on the internet \nto the model by prepending a path or URL with an @ symbol:\n\n```\nUSER: Please describe this image: @path\u002Fto\u002Fimage.png and also this one: @https:\u002F\u002Fexample.com\u002Fimage.png\n```\n\n## Why kani?\n\n- **Lightweight and high-level** - kani implements common boilerplate to interface with language models without forcing\n  you to use opinionated prompt frameworks or complex library-specific tooling.\n- **Model agnostic** - kani provides a simple interface to implement: token counting and completion generation.\n  kani lets developers switch which language model runs on the backend without major code refactors.\n- **Automatic chat memory management** - Allow chat sessions to flow without worrying about managing the number of\n  tokens in the history - kani takes care of it.\n- **Function calling with model feedback and retry** - Give models access to functions in just one line of code.\n  kani elegantly provides feedback about hallucinated parameters and errors and allows the model to retry calls.\n- **You control the prompts** - There are no hidden prompt hacks. 
We will never decide for you how to format your own\n  data, unlike other popular language model libraries.\n- **Fast to iterate and intuitive to learn** - With kani, you only write Python - we handle the rest.\n- **Asynchronous design from the start** - kani can scale to run multiple chat sessions in parallel easily, without\n  having to manage multiple processes or programs.\n\nExisting frameworks for language models like LangChain and simpleaichat are opinionated and\u002For heavyweight - they edit\ndevelopers' prompts under the hood, are challenging to learn, and are difficult to customize without adding a lot of\nhigh-maintenance bloat to your codebase.\n\n\u003Cp align=\"center\">\n  \u003Cimg style=\"max-width: 800px;\" alt=\"kani\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fzhudotexe_kani_readme_d3ccc8e3a307.png\">\n\u003C\u002Fp>\n\nWe built kani as a more flexible, simple, and robust alternative. A good analogy between frameworks would be to say that\nkani is to LangChain as Flask (or FastAPI) is to Django.\n\nkani is appropriate for everyone from academic researchers to industry professionals to hobbyists to use without\nworrying about under-the-hood hacks.\n\n## Docs\n\nTo learn more about how\nto [customize kani with your own prompt wrappers](https:\u002F\u002Fkani.readthedocs.io\u002Fen\u002Flatest\u002Fcustomization.html),\n[function calling](https:\u002F\u002Fkani.readthedocs.io\u002Fen\u002Flatest\u002Ffunction_calling.html), and\nmore, [read the docs!](http:\u002F\u002Fkani.readthedocs.io\u002F)\n\nOr take a look at the hands-on examples [in this repo](https:\u002F\u002Fgithub.com\u002Fzhudotexe\u002Fkani\u002Ftree\u002Fmain\u002Fexamples).\n\n## Demo\n\nWant to see kani in action? 
We run a small language model as part of our test suite\nright on GitHub Actions:\n\nhttps:\u002F\u002Fgithub.com\u002Fzhudotexe\u002Fkani\u002Factions\u002Fworkflows\u002Fpytest.yml?query=branch%3Amain+is%3Asuccess\n\nSimply click on the latest build to see the model's output!\n\n## Who we are\n\n\u003Cimg alt=\"University of Pennsylvania Logo\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fzhudotexe_kani_readme_9c6f81cf11d3.jpg\" width=\"300\">\n\nThe core development team is made of three PhD students in the Department of Computer and Information Science at the\nUniversity of Pennsylvania. We're all members of\n[Prof. Chris Callison-Burch's](https:\u002F\u002Fwww.cis.upenn.edu\u002F~ccb\u002F) lab, working towards advancing the future of NLP.\n\n- [**Andrew Zhu**](https:\u002F\u002Fzhu.codes\u002F) started in Fall 2022. His research interests include natural language processing,\n  programming languages, distributed systems, and more. He's also a full-stack software engineer, proficient in all\n  manner of backend, devops, database, and frontend engineering. Andrew strives to make idiomatic, clean, performant,\n  and low-maintenance code — philosophies that are often rare in academia. His research is supported by the NSF Graduate\n  Research Fellowship.\n- [**Liam Dugan**](https:\u002F\u002Fliamdugan.com\u002F) started in Fall 2021. His research focuses primarily on large language models\n  and how humans interact with them. In particular, he is interested in human detection of generated text and whether we\n  can apply those insights to automatic detection systems. He is also interested in the practical application of large\n  language models to education.\n- [**Alyssa Hwang**](https:\u002F\u002Falyssahwang.com\u002F) started in Fall 2020 and is advised by Chris Callison-Burch and Andrew\n  Head. 
Her research focuses on AI assistants that effectively communicate complex information, like voice assistants\n  guiding users through instructions or audiobooks allowing users to seamlessly navigate through spoken text. Beyond\n  research, Alyssa chairs the Penn CIS Doctoral Association, founded the CIS PhD Mentorship Program, and was supported\n  by the NSF Graduate Research Fellowship Program.\n\nWe use kani actively in our research, and aim to keep it up-to-date with modern NLP practices.\n\n## Citation\n\nIf you use Kani, please cite us as:\n\n```\n@inproceedings{zhu-etal-2023-kani,\n    title = \"Kani: A Lightweight and Highly Hackable Framework for Building Language Model Applications\",\n    author = \"Zhu, Andrew  and\n      Dugan, Liam  and\n      Hwang, Alyssa  and\n      Callison-Burch, Chris\",\n    editor = \"Tan, Liling  and\n      Milajevs, Dmitrijs  and\n      Chauhan, Geeticka  and\n      Gwinnup, Jeremy  and\n      Rippeth, Elijah\",\n    booktitle = \"Proceedings of the 3rd Workshop for Natural Language Processing Open Source Software (NLP-OSS 2023)\",\n    month = dec,\n    year = \"2023\",\n    address = \"Singapore\",\n    publisher = \"Association for Computational Linguistics\",\n    url = \"https:\u002F\u002Faclanthology.org\u002F2023.nlposs-1.8\",\n    doi = \"10.18653\u002Fv1\u002F2023.nlposs-1.8\",\n    pages = \"65--77\",\n}\n```\n\n### Acknowledgements\n\nWe would like to thank the members of the lab of Chris Callison-Burch for their testing and detailed feedback on the\ncontents of both our paper and the Kani repository. In addition, we’d like to thank Henry Zhu (no relation to the first\nauthor) for his early and enthusiastic support of the project.\n\nThis research is based upon work supported in part by the Air Force Research Laboratory (contract FA8750-23-C-0507), the\nIARPA HIATUS Program (contract 2022-22072200005), and the NSF (Award 1928631). Approved for Public Release, Distribution\nUnlimited. 
The views and conclusions contained herein are those of the authors and should not be interpreted as\nnecessarily representing the official policies, either expressed or implied, of IARPA, NSF, or the U.S. Government.\n","\u003Cp align=\"center\">\n  \u003Cimg width=\"256\" height=\"256\" alt=\"kani\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fzhudotexe_kani_readme_2f06342ab8c6.png\">\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fzhudotexe\u002Fkani\u002Factions\u002Fworkflows\u002Fpytest.yml\">\n    \u003Cimg alt=\"测试包\" src=\"https:\u002F\u002Fgithub.com\u002Fzhudotexe\u002Fkani\u002Factions\u002Fworkflows\u002Fpytest.yml\u002Fbadge.svg\">\n  \u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Fkani.readthedocs.io\u002Fen\u002Flatest\u002F?badge=latest\">\n    \u003Cimg alt=\"文档状态\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fzhudotexe_kani_readme_13d664e1afd7.png\">\n  \u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Fpypi.org\u002Fproject\u002Fkani\u002F\">\n    \u003Cimg alt=\"PyPI\" src=\"https:\u002F\u002Fimg.shields.io\u002Fpypi\u002Fv\u002Fkani\">\n  \u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Fcolab.research.google.com\u002Fgithub\u002Fzhudotexe\u002Fkani\u002Fblob\u002Fmain\u002Fexamples\u002Fcolab_examples.ipynb\">\n    \u003Cimg alt=\"Colab快速入门\" src=\"https:\u002F\u002Fcolab.research.google.com\u002Fassets\u002Fcolab-badge.svg\">\n  \u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Fdiscord.gg\u002FeTepTNDxYT\">\n    \u003Cimg alt=\"Discord\" src=\"https:\u002F\u002Fimg.shields.io\u002Fdiscord\u002F1150902904773935214?color=5865F2&label=discord&logo=discord&logoColor=white\">\n  \u003C\u002Fa>\n  \u003Cbr\u002F>\n  \u003Ca href=\"examples\u002F4_engines_zoo.py\">\n    \u003Cimg alt=\"模型库\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fexamples-model_zoo-blue\">\n  \u003C\u002Fa>\n  \u003Ca href=\"examples\u002F5_advanced_retrieval.py\">\n    \u003Cimg 
 alt=\"检索示例\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fexamples-retrieval-blue\">\n  \u003C\u002Fa>\n\u003C\u002Fp>\n\n# kani (卡尼)\n\nkani (卡尼) 是一个轻量级且高度可扩展的框架，专为支持**工具使用\u002F函数调用**的聊天型语言模型而设计。\n\n与其他语言模型框架相比，kani 的立场更加中立，并提供了对关键控制流程部分更精细的自定义能力，因此无论是 NLP 研究人员、爱好者还是开发者，它都是理想的选择。\n\nkani 开箱即用地支持以下模型，并提供了一个**与模型无关**的框架，以便轻松添加更多模型的支持：\n\n- OpenAI 模型（`pip install \"kani[openai]\"`）\n- Anthropic 模型（`pip install \"kani[anthropic]\"`）\n- Google AI 模型（`pip install \"kani[google]\"`）\n- Hugging Face Transformers（`pip install \"kani[huggingface]\"`）\n- llama.cpp（`pip install \"kani[cpp]\"`）\n- vLLM（`pip install \"kani[vllm]\"`）\n- 以及更多通过[社区扩展](https:\u002F\u002Fkani.readthedocs.io\u002Fen\u002Flatest\u002Fcommunity\u002Fextensions.html)实现！\n\n**请查看[模型库](examples\u002F4_engines_zoo.py)，了解在 kani 中加载流行模型的代码示例！**\n\n有兴趣贡献代码吗？请参阅我们的[贡献指南](https:\u002F\u002Fkani.readthedocs.io\u002Fen\u002Flatest\u002Fcommunity\u002Fcontributing.html)。\n\n[在 ReadTheDocs 上阅读文档！](http:\u002F\u002Fkani.readthedocs.io\u002F)\n\n[在 arXiv 上阅读我们的论文！](https:\u002F\u002Farxiv.org\u002Fabs\u002F2309.05542)\n\n## 安装\n\nkani 需要 Python 3.10 或更高版本。为了安装特定于模型的依赖项，kani 使用了不同的额外组件（在 `pip install` 命令中库名后的方括号内）。要确定需要安装哪些额外组件，请参阅[kani 的模型表格](https:\u002F\u002Fkani.readthedocs.io\u002Fen\u002Flatest\u002Fengines.html)，或者使用 `[all]` 组件一次性安装所有内容。\n\n```shell\n# 对于 OpenAI 模型\n$ pip install \"kani[openai]\"\n# 对于 Hugging Face 模型\n$ pip install \"kani[huggingface]\" torch\n# 对于多模态输入\n$ pip install \"kani[multimodal]\"\n# 或者安装所有内容：\n$ pip install \"kani[all]\"\n```\n\n若需获取最新更改和新增模型，您也可以从 Git 的 `main` 分支安装开发版本：\n\n```shell\n$ pip install \"kani[all] @ git+https:\u002F\u002Fgithub.com\u002Fzhudotexe\u002Fkani.git@main\"\n```\n\n## 快速入门\n\n\u003Ca href=\"https:\u002F\u002Fcolab.research.google.com\u002Fgithub\u002Fzhudotexe\u002Fkani\u002Fblob\u002Fmain\u002Fexamples\u002Fcolab_examples.ipynb\">\n  \u003Cimg alt=\"Colab快速入门\" 
src=\"https:\u002F\u002Fcolab.research.google.com\u002Fassets\u002Fcolab-badge.svg\">\n\u003C\u002Fa>\n\nkani 需要 Python 3.10 或更高版本。\n\n首先，安装该库。在本快速入门中，我们将使用 OpenAI 引擎，尽管 kani 是[与模型无关的](https:\u002F\u002Fkani.readthedocs.io\u002Fen\u002Flatest\u002Fengines.html)。\n\n```shell\n$ pip install \"kani[openai]\"\n```\n\n接下来，让我们使用 kani 创建一个简单的聊天机器人，后端采用 ChatGPT。\n\n```python\n# 导入库\nimport asyncio\nfrom kani import Kani, chat_in_terminal\nfrom kani.engines.openai import OpenAIEngine\n\n# 请在此处替换为您的 OpenAI API 密钥：https:\u002F\u002Fplatform.openai.com\u002Faccount\u002Fapi-keys\napi_key = \"sk-...\"\n\n# kani 使用引擎与语言模型进行交互。您可以在此处指定其他模型参数，例如 temperature=0.7。\nengine = OpenAIEngine(api_key, model=\"gpt-5-nano\")\n\n# kani 负责管理聊天状态、提示和函数调用。在这里，我们仅为其提供调用 ChatGPT 的引擎，但您也可以在此处指定其他参数，如 system_prompt=\"你是一个...\"。\nai = Kani(engine)\n\n# kani 提供了一个实用程序，可通过终端与 kani 交互...\nchat_in_terminal(ai)\n\n\n# 或者您也可以在异步函数中以编程方式使用 kani！\nasync def main():\n    resp = await ai.chat_round(\"一只未载重的燕子的飞行速度是多少？\")\n    print(resp.text)\n\n\nasyncio.run(main())\n```\n\nkani 缩短了搭建可用聊天模型所需的时间，同时为程序员提供了对每个提示、函数调用，甚至底层语言模型的深度自定义能力。\n\n## 函数调用\n\n函数调用使语言模型能够根据其文档，自主决定何时调用您提供的函数。\n\n> [!NOTE]\n> 想要支持 MCP 吗？kani 也支持本地和远程 MCP 服务器。请参阅 MCP 文档：https:\u002F\u002Fkani.readthedocs.io\u002Fen\u002Flatest\u002Ffunction_calling.html#mcp-tools。\n\n借助 kani，您只需使用一行代码——`@ai_function` 装饰器——即可用 Python 编写函数并将其暴露给模型。\n\n```python\n# 导入库\nimport asyncio\nfrom typing import Annotated\nfrom kani import AIParam, Kani, ai_function, chat_in_terminal, ChatRole\nfrom kani.engines.openai import OpenAIEngine\n\n# 按照上述方法设置引擎\napi_key = \"sk-...\"\nengine = OpenAIEngine(api_key, model=\"gpt-4o-mini\")\n\n\n# 继承 Kani 类以添加 AI 函数\nclass MyKani(Kani):\n    # 将注解添加到方法即可将其暴露给 AI\n    @ai_function()\n    def get_weather(\n        self,\n        # 您还可以为特定参数提供额外的说明\n        location: Annotated[str, AIParam(desc=\"城市和州，例如旧金山，加州\")],\n    ):\n        \"\"\"获取指定地点的当前天气状况\"\"\"\n        # 在此示例中，我们模拟返回值，但您也可以调用真实的天气 API\n        return f\"{location} 的天气：晴朗，华氏 
72 度。\"\n\n\nai = MyKani(engine)\n\n# 终端实用程序允许您测试函数调用...\nchat_in_terminal(ai)\n\n\n# 您也可以以编程方式跟踪多个回合。\nasync def main():\n    async for msg in ai.full_round(\"东京的天气如何？\"):\n        print(msg.role, msg.text)\n\n\nasyncio.run(main())\n```\n\nkani 确保函数调用在到达您的方法之前始终有效，同时让您专注于编写代码。如需更多信息，请参阅[函数调用文档](https:\u002F\u002Fkani.readthedocs.io\u002Fen\u002Flatest\u002Ffunction_calling.html)。\n\n## 流式传输\n\nkani 支持从底层语言模型逐 token 流式响应，即使在存在函数调用的情况下也是如此。流式传输被设计为 `chat_round` 和 `full_round` 方法的直接替代和扩展，使您能够逐步重构代码，而不会使其处于不完整状态。\n\n```python\nasync def stream_chat():\n    stream = ai.chat_round_stream(\"Kani 是什么意思？\")\n    async for token in stream:\n        print(token, end=\"\")\n    print()\n    msg = await stream.message()  # 或者 `await stream`\n\n\nasync def stream_with_function_calling():\n    async for stream in ai.full_round_stream(\"东京的天气如何？\"):\n        async for token in stream:\n            print(token, end=\"\")\n        print()\n        msg = await stream.message()\n```\n\n## 多模态输入\n\nkani 可选地支持多种语言模型的多模态输入（图像、音频、视频）。要使用多模态输入，请安装 `kani-multimodal-core` 扩展包，或使用 `pip install \"kani[multimodal]\"`。更多信息请参阅 kani-multimodal-core 的文档。\n\n[阅读 kani-multimodal-core 文档！](https:\u002F\u002Fkani-multimodal-core.readthedocs.io)\n\n```python\nfrom kani import Kani\nfrom kani.engines.openai import OpenAIEngine\nfrom kani.ext.multimodal_core import ImagePart\n\nengine = OpenAIEngine(model=\"gpt-4.1-nano\")\nai = Kani(engine)\n\n# 注意参数是一个部件列表，而不是单个字符串！\nmsg = await ai.chat_round_str([\n    \"请描述这些图片：\",\n    ImagePart.from_file(\"path\u002Fto\u002Fimage.png\"),\n    await ImagePart.from_url(\n        \"https:\u002F\u002Fupload.wikimedia.org\u002Fwikipedia\u002Fcommons\u002Fthumb\u002F5\u002F53\u002FWhitehead%27s_Trogon_0A2A6014.jpg\u002F1024px-Whitehead%27s_Trogon_0A2A6014.jpg\"\n    ),\n])\nprint(msg)\n```\n\n多模态处理与 kani 生态系统的其他部分深度集成，因此您可以以最小的开发成本获得 kani 流畅工具使用和自动上下文管理的所有优势！\n\n## `kani` CLI\n\nkani 自带一个命令行界面，让您无需任何设置即可在终端中与模型对话。\n\n`kani` CLI 的形式为 `$ kani 
\u003Cprovider>:\u003Cmodel-id>`。使用 `kani --help` 获取更多信息。\n\n示例：\n```shell\n$ kani openai:gpt-4.1-nano\n$ kani huggingface:meta-llama\u002FMeta-Llama-3-8B-Instruct\n$ kani anthropic:claude-sonnet-4-0\n$ kani google:gemini-2.5-flash\n```\n\n此 CLI 辅助工具会自动创建 Engine 和 Kani 实例，并调用 `chat_in_terminal()`，以便您更快地测试大语言模型。当安装了 `kani-multimodal-core` 时，您可以将磁盘上或互联网上的多模态媒体提供给模型，只需在路径或 URL 前面加上 @ 符号：\n\n```\nUSER: 请描述这张图片：@path\u002Fto\u002Fimage.png，还有这张：@https:\u002F\u002Fexample.com\u002Fimage.png\n```\n\n## 为什么选择 kani？\n\n- **轻量级且高级** - kani 实现了与语言模型交互的常见样板代码，而不会强制您使用带有特定观点的提示框架或复杂的库专用工具。\n- **模型无关** - kani 提供了一个简单的接口来实现：token 计数和完成生成。kani 允许开发者在不进行大规模代码重构的情况下切换后端运行的语言模型。\n- **自动聊天记忆管理** - 让聊天会话流畅进行，无需担心管理历史记录中的 token 数量——kani 会为您处理。\n- **带有模型反馈和重试功能的函数调用** - 只需一行代码即可让模型访问函数。kani 巧妙地提供关于幻觉参数和错误的反馈，并允许模型重试调用。\n- **您控制提示** - 没有隐藏的提示技巧。与其他流行的语言模型库不同，我们绝不会替您决定如何格式化自己的数据。\n- **快速迭代且易于学习** - 使用 kani，您只需编写 Python 代码——剩下的交给我们处理。\n- **从一开始就采用异步设计** - kani 可以轻松扩展到并行运行多个聊天会话，而无需管理多个进程或程序。\n\n现有的语言模型框架，如 LangChain 和 simpleaichat，往往带有特定的观点或过于臃肿——它们会在后台修改开发者的提示，学习起来颇具挑战性，而且如果不向代码库中添加大量难以维护的冗余代码，就很难进行自定义。\n\n\u003Cp align=\"center\">\n  \u003Cimg style=\"max-width: 800px;\" alt=\"kani\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fzhudotexe_kani_readme_d3ccc8e3a307.png\">\n\u003C\u002Fp>\n\n我们构建 kani 是为了提供一种更灵活、简单且健壮的替代方案。如果用框架之间的类比来说明，可以说 kani 之于 LangChain 就如同 Flask（或 FastAPI）之于 Django。\n\nkani 适合从学术研究人员到行业专业人士，再到业余爱好者等各类用户使用，无需担心底层的黑箱操作。\n\n## 文档\n\n要了解更多关于如何 [使用您自己的提示包装器来自定义 kani](https:\u002F\u002Fkani.readthedocs.io\u002Fen\u002Flatest\u002Fcustomization.html)、[函数调用](https:\u002F\u002Fkani.readthedocs.io\u002Fen\u002Flatest\u002Ffunction_calling.html) 等内容，请 [阅读文档！](http:\u002F\u002Fkani.readthedocs.io\u002F)\n\n或者查看本仓库中的实践示例：[https:\u002F\u002Fgithub.com\u002Fzhudotexe\u002Fkani\u002Ftree\u002Fmain\u002Fexamples](https:\u002F\u002Fgithub.com\u002Fzhudotexe\u002Fkani\u002Ftree\u002Fmain\u002Fexamples)。\n\n## 演示\n\n想亲眼看看 kani 的实际效果吗？我们在 GitHub Actions 
的测试套件中运行了一个小型语言模型：\n\nhttps:\u002F\u002Fgithub.com\u002Fzhudotexe\u002Fkani\u002Factions\u002Fworkflows\u002Fpytest.yml?query=branch%3Amain+is%3Asuccess\n\n只需点击最新的构建，即可查看模型的输出！\n\n## 我们是谁\n\n\u003Cimg alt=\"宾夕法尼亚大学标志\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fzhudotexe_kani_readme_9c6f81cf11d3.jpg\" width=\"300\">\n\n核心开发团队由宾夕法尼亚大学计算机与信息科学系的三名博士生组成。我们都是[Chris Callison-Burch教授](https:\u002F\u002Fwww.cis.upenn.edu\u002F~ccb\u002F)实验室的成员，致力于推动自然语言处理领域的未来发展。\n\n- [**Andrew Zhu**](https:\u002F\u002Fzhu.codes\u002F) 于2022年秋季加入。他的研究兴趣包括自然语言处理、编程语言、分布式系统等。他同时也是一名全栈软件工程师，精通后端、DevOps、数据库和前端开发等多个领域。Andrew 致力于编写符合语言习惯、简洁、高效且易于维护的代码——这些理念在学术界并不常见。他的研究得到了美国国家科学基金会研究生研究奖学金的支持。\n  \n- [**Liam Dugan**](https:\u002F\u002Fliamdugan.com\u002F) 于2021年秋季加入。他的研究主要集中在大型语言模型及其与人类的交互上。特别是，他对人类识别生成文本的能力以及如何将这些洞察应用于自动检测系统非常感兴趣。此外，他还关注大型语言模型在教育领域的实际应用。\n  \n- [**Alyssa Hwang**](https:\u002F\u002Falyssahwang.com\u002F) 于2020年秋季加入，由Chris Callison-Burch和Andrew Head共同指导。她的研究专注于能够有效传达复杂信息的人工智能助手，例如通过语音助手引导用户完成操作，或通过有声读物帮助用户流畅地浏览文本内容。除了科研工作外，Alyssa 还担任宾夕法尼亚大学计算机与信息科学系博士生协会主席，创立了该系的博士生导师计划，并获得了美国国家科学基金会研究生研究奖学金的支持。\n\n我们在研究中积极使用Kani，并努力使其与现代自然语言处理实践保持同步。\n\n## 引用\n\n如果您使用Kani，请按以下方式引用我们：\n\n```\n@inproceedings{zhu-etal-2023-kani,\n    title = \"Kani: A Lightweight and Highly Hackable Framework for Building Language Model Applications\",\n    author = \"Zhu, Andrew  and\n      Dugan, Liam  and\n      Hwang, Alyssa  and\n      Callison-Burch, Chris\",\n    editor = \"Tan, Liling  and\n      Milajevs, Dmitrijs  and\n      Chauhan, Geeticka  and\n      Gwinnup, Jeremy  and\n      Rippeth, Elijah\",\n    booktitle = \"Proceedings of the 3rd Workshop for Natural Language Processing Open Source Software (NLP-OSS 2023)\",\n    month = dec,\n    year = \"2023\",\n    address = \"Singapore\",\n    publisher = \"Association for Computational Linguistics\",\n    url = \"https:\u002F\u002Faclanthology.org\u002F2023.nlposs-1.8\",\n    doi = \"10.18653\u002Fv1\u002F2023.nlposs-1.8\",\n    pages = \"65--77\",\n}\n```\n\n### 致谢\n\n我们感谢Chris Callison-Burch教授实验室的各位成员，他们对我们的论文和Kani代码库的内容进行了测试并提供了详尽的反馈。此外，我们还要感谢Henry 
Zhu（与第一作者无亲属关系），他在项目早期给予了热情的支持。\n\n本研究部分得到了美国空军研究实验室（合同编号FA8750-23-C-0507）、IARPA HIATUS计划（合同编号2022-22072200005）以及美国国家科学基金会（资助号1928631）的支持。经批准公开发布，可自由传播。文中所表达的观点和结论均属作者个人意见，不应被视为代表IARPA、NSF或美国政府的官方政策，无论明示或暗示。","# Kani 快速上手指南\n\nKani (カニ) 是一个轻量级、高可定制化的基于聊天的语言模型框架，原生支持**工具调用\u002F函数调用 (Function Calling)**。与 LangChain 等重型框架不同，Kani 不强制使用特定的提示词模板，让开发者完全掌控对话流程和 Prompt 设计，非常适合 NLP 研究人员和需要精细控制的开发者。\n\n## 环境准备\n\n*   **操作系统**：Linux, macOS, Windows\n*   **Python 版本**：3.10 或更高版本\n*   **前置依赖**：无特殊系统级依赖，仅需标准的 Python 环境。\n\n## 安装步骤\n\nKani 采用模块化安装方式，根据你需要使用的模型后端安装对应的扩展包。\n\n### 1. 基础安装（推荐国内用户配置镜像源）\n\n建议使用国内镜像源（如清华源）加速安装：\n\n```bash\n# 设置临时镜像源并安装核心库及 OpenAI 支持\npip install -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple \"kani[openai]\"\n```\n\n### 2. 其他模型后端安装\n\n根据你的需求选择对应的额外依赖：\n\n```bash\n# Hugging Face Transformers 支持 (需额外安装 torch)\npip install -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple \"kani[huggingface]\" torch\n\n# Anthropic (Claude) 支持\npip install -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple \"kani[anthropic]\"\n\n# Google AI (Gemini) 支持\npip install -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple \"kani[google]\"\n\n# llama.cpp 支持\npip install -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple \"kani[cpp]\"\n\n# vLLM 支持\npip install -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple \"kani[vllm]\"\n\n# 安装所有支持的引擎（体积较大）\npip install -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple \"kani[all]\"\n```\n\n> **提示**：若想体验最新功能，可从 GitHub 安装开发版：\n> `pip install \"kani[all] @ git+https:\u002F\u002Fgithub.com\u002Fzhudotexe\u002Fkani.git@main\"`\n\n## 基本使用\n\n### 1. 
最简单的聊天机器人\n\n以下示例展示如何使用 OpenAI 模型创建一个简单的聊天机器人。\n\n```python\nimport asyncio\nfrom kani import Kani, chat_in_terminal\nfrom kani.engines.openai import OpenAIEngine\n\n# 替换为你的 OpenAI API Key\napi_key = \"sk-...\"\n\n# 初始化引擎，可在此指定 model, temperature 等参数\nengine = OpenAIEngine(api_key, model=\"gpt-4o-mini\")\n\n# 初始化 Kani 实例，管理对话状态和上下文\nai = Kani(engine)\n\n# 方式一：直接在终端交互（与方式二二选一，取消下行注释即可；chat_in_terminal 会阻塞直至退出）\n# chat_in_terminal(ai)\n\n# 方式二：在代码中异步调用\nasync def main():\n    resp = await ai.chat_round(\"What is the airspeed velocity of an unladen swallow?\")\n    print(resp.text)\n\nasyncio.run(main())\n```\n\n### 2. 使用命令行工具 (CLI)\n\nKani 内置了 CLI 工具，无需编写代码即可在终端直接与模型对话。\n\n```bash\n# 格式：kani \u003Cprovider>:\u003Cmodel-id>\nkani openai:gpt-4o-mini\nkani huggingface:meta-llama\u002FMeta-Llama-3-8B-Instruct\nkani anthropic:claude-sonnet-4-0\n```\n\n**多模态支持**：若安装了 `kani[multimodal]`，可在 CLI 中通过 `@` 前缀发送图片：\n```text\nUSER: Please describe this image: @path\u002Fto\u002Fimage.png\n```\n\n### 3. 实现函数调用 (Function Calling)\n\nKani 的核心优势在于极简的函数调用支持。只需使用 `@ai_function` 装饰器，即可让模型自动调用你的 Python 函数。\n\n```python\nimport asyncio\nfrom typing import Annotated\nfrom kani import AIParam, Kani, ai_function\nfrom kani.engines.openai import OpenAIEngine\n\napi_key = \"sk-...\"\nengine = OpenAIEngine(api_key, model=\"gpt-4o-mini\")\n\nclass MyKani(Kani):\n    # 使用装饰器暴露函数给 AI\n    @ai_function()\n    def get_weather(\n        self,\n        # 为参数添加描述，帮助模型理解\n        location: Annotated[str, AIParam(desc=\"The city and state, e.g. San Francisco, CA\")],\n    ):\n        \"\"\"Get the current weather in a given location.\"\"\"\n        # 此处模拟返回，实际可调用真实 API\n        return f\"Weather in {location}: Sunny, 72 degrees fahrenheit.\"\n\nai = MyKani(engine)\n\nasync def main():\n    # full_round 会自动处理函数调用循环，直到模型给出最终回答\n    async for msg in ai.full_round(\"What's the weather in Tokyo?\"):\n        print(msg.role, msg.text)\n\nasyncio.run(main())\n```\n\n### 4. 
流式输出 (Streaming)\n\nKani 原生支持流式响应，即使包含函数调用也能逐字输出。\n\n```python\nasync def stream_chat():\n    # chat_round_stream 返回一个异步生成器\n    stream = ai.chat_round_stream(\"What does kani mean?\")\n    async for token in stream:\n        print(token, end=\"\", flush=True)\n    print()\n    \n    # 获取完整消息对象\n    msg = await stream.message()\n```\n\nKani 的设计理念是“轻量且可控”，它自动处理上下文记忆管理和函数调用的重试逻辑，让你专注于业务逻辑和 Prompt 优化。更多高级用法请参考官方文档。","某 NLP 研究员正在构建一个需要动态调用本地私有模型与云端 API 混合部署的智能数据分析助手，以处理敏感的企业财务文档。\n\n### 没有 kani 时\n- **框架绑定严重**：若要切换从 OpenAI 到本地 Llama 模型，需重写大量底层推理代码，无法实现“一次编写，多处运行”。\n- **工具调用定制难**：现有的重型框架对函数调用（Function Calling）流程封装过死，难以针对特定财务公式计算逻辑进行细粒度干预。\n- **实验迭代缓慢**：每次尝试新的控制流策略或修改消息处理机制，都需要深入框架源码“魔改”，极大拖慢了论文复现和原型验证速度。\n- **依赖臃肿**：引入完整的大模型应用框架往往带来不必要的依赖负担，导致在轻量级服务器或边缘设备上部署困难。\n\n### 使用 kani 后\n- **模型无缝切换**：利用 kani 的模型无关架构，仅需几行配置即可在 OpenAI、Hugging Face 或 vLLM 后端间自由切换，无需改动核心业务逻辑。\n- **极致可控的流程**：kani 提供高度可黑客式修改（hackable）的工具调用接口，研究员能轻松插入自定义钩子，精准控制财务数据的解析与计算步骤。\n- **快速原型开发**：凭借微框架的轻量化特性，研究者能专注于算法创新而非框架适配，显著缩短了从想法到可运行 Demo 的周期。\n- **灵活部署组合**：支持按需安装特定引擎依赖（如 `kani[openai]` 或 `kani[huggingface]`），完美适配混合云架构，降低了资源消耗。\n\nkani 通过赋予开发者对控制流的精细掌控权和模型兼容性，成为了连接学术创新与工程落地的理想桥梁。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fzhudotexe_kani_d3ccc8e3.png","zhudotexe","Andrew Zhu","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Fzhudotexe_1dfe5f83.jpg","PhD @ UPenn || there once was a girl from purdue \u002F who kept a young cat in a pew \u002F she taught it to speak \u002F alphabetical Greek \u002F but it never got farther than μ","University of Pennsylvania","Philadelphia, PA","andrz@seas.upenn.edu",null,"https:\u002F\u002Fzhu.codes","https:\u002F\u002Fgithub.com\u002Fzhudotexe",[83,87,91],{"name":84,"color":85,"percentage":86},"Python","#3572A5",94.4,{"name":88,"color":89,"percentage":90},"Jupyter Notebook","#DA5B0B",5.4,{"name":92,"color":93,"percentage":94},"Shell","#89e051",0.2,599,30,"2026-04-01T17:16:58","MIT","","未说明（取决于所选后端引擎，如使用 Hugging Face 或 vLLM 通常需要 GPU，使用 OpenAI\u002FAnthropic 
API 则不需要）","未说明",{"notes":103,"python":104,"dependencies":105},"该工具是一个模型无关的框架，核心库本身轻量且无特定硬件要求。具体的 GPU、内存及依赖需求完全取决于用户选择的后端引擎（例如：调用 OpenAI API 仅需网络；本地运行 Hugging Face 或 vLLM 模型则需相应的 GPU 和显存）。支持通过 pip extras 按需安装特定模型的依赖包。","3.10+",[64,106,107,108,109,110,111,112,113],"torch (可选，用于 Hugging Face)","openai (可选，通过 kani[openai] 安装)","anthropic (可选，通过 kani[anthropic] 安装)","google-generativeai (可选，通过 kani[google] 安装)","transformers (可选，通过 kani[huggingface] 安装)","llama-cpp-python (可选，通过 kani[cpp] 安装)","vllm (可选，通过 kani[vllm] 安装)","kani-multimodal-core (可选，用于多模态支持)",[14,13,35],[116,117,118,119,120,121,122,123,124,125],"framework","function-calling","large-language-models","llama","openai","gpt-4","llms","microframework","tool-use","chatgpt","2026-03-27T02:49:30.150509","2026-04-12T05:24:57.951278",[129,134,139,144,149,154],{"id":130,"question_zh":131,"answer_zh":132,"source_url":133},30245,"安装 PyPI 版本后运行示例代码报错（如 AttributeError 或 traceback），如何解决？","这通常是因为仓库刚刚进行了重构（例如 token 计数逻辑），但新版本尚未发布到 PyPI。解决方法是直接使用 GitHub 上的最新代码而非 pip 安装的版本。可以通过以下命令安装最新版：\n`pip install git+https:\u002F\u002Fgithub.com\u002Fzhudotexe\u002Fkani.git`\n或者克隆仓库后在本地运行示例。维护者通常会在几天内发布新版本到 PyPI。","https:\u002F\u002Fgithub.com\u002Fzhudotexe\u002Fkani\u002Fissues\u002F67",{"id":135,"question_zh":136,"answer_zh":137,"source_url":138},30246,"使用 AnthropicEngine 时遇到 'AsyncHTTPTransport.__init__() got an unexpected keyword argument socket_options' 错误怎么办？","这通常是由于 `httpx` 库的版本不兼容导致的。请尝试升级 `httpx` 到最新版本（例如 0.28.1 或更高）：\n`pip install --upgrade httpx`\n如果问题依然存在，检查 `anthropic` 和 `httpcore` 的版本是否匹配。该错误并非 Kani 本身的 bug，而是底层依赖库的版本冲突。","https:\u002F\u002Fgithub.com\u002Fzhudotexe\u002Fkani\u002Fissues\u002F50",{"id":140,"question_zh":141,"answer_zh":142,"source_url":143},30247,"如何在 Kani 中使用 Hugging Face 模型？有示例吗？","Kani 已经内置了对 LLaMA v2、Vicuna 等模型的支持。如果你使用这些模型，只需在引擎中传入 `model_id` 即可。\n如果你想为 Hugging Face 上的其他聊天模型实现自定义引擎，可以参考 Vicuna 引擎的实现代码，重点实现 `build_prompt` 和 `message_len` 
方法。\n参考代码：https:\u002F\u002Fgithub.com\u002Fzhudotexe\u002Fkani\u002Fblob\u002Fmain\u002Fkani\u002Fengines\u002Fhuggingface\u002Fvicuna.py\n更多文档请参阅：https:\u002F\u002Fkani.readthedocs.io\u002Fen\u002Flatest\u002Fengines.html#huggingface","https:\u002F\u002Fgithub.com\u002Fzhudotexe\u002Fkani\u002Fissues\u002F20",{"id":145,"question_zh":146,"answer_zh":147,"source_url":148},30248,"我初始化了 OpenAIEngine 指定 model=\"gpt-4\"，但模型回答说它是 GPT-3，这是配置错误吗？","这不是配置错误，Kani 确实将 `model=\"gpt-4\"` 参数正确传递给了 OpenAI API。出现这种情况是因为 GPT-4 模型本身在训练数据截止前不知道自己的存在，或者出于安全对齐原因倾向于声称自己是旧版本（GPT-3）。\n你可以通过查看代码确认参数已传递：https:\u002F\u002Fgithub.com\u002Fzhudotexe\u002Fkani\u002Fblob\u002Fmain\u002Fkani\u002Fengines\u002Fopenai\u002Fengine.py#L117\n只要代码中没有报错，实际调用的就是 GPT-4。","https:\u002F\u002Fgithub.com\u002Fzhudotexe\u002Fkani\u002Fissues\u002F15",{"id":150,"question_zh":151,"answer_zh":152,"source_url":153},30249,"在使用 OpenAI 引擎流式输出（streaming）调用工具（tool_calls）时，出现 Pydantic ValidationError (name should be a valid string) 错误","这是一个已知问题，有时 OpenAI 返回的流式数据块中，Function 对象的 name 或 arguments 字段可能为空（None），导致 Pydantic 验证失败。维护者已经在代码库中添加了修复程序来处理这种空值情况。\n请确保你使用的是最新的 Kani 版本（从 GitHub 安装或等待下一次 PyPI 更新），该修复会自动处理这些边缘情况，避免抛出验证错误。","https:\u002F\u002Fgithub.com\u002Fzhudotexe\u002Fkani\u002Fissues\u002F70",{"id":155,"question_zh":156,"answer_zh":157,"source_url":158},30250,"如何加载本地存储的 llama-cpp-python 模型文件（.gguf）？","`llama-cpp-python` 支持通过 `model_path` 参数直接加载本地模型文件。在 Kani 中使用时，你需要配置相应的引擎以指向本地路径。\n基本用法如下（基于 llama-cpp-python 的高层 API）：\n```python\nfrom llama_cpp import Llama\nllm = Llama(\n    model_path=\".\u002Fmodels\u002F7B\u002Fllama-model.gguf\",\n    # n_gpu_layers=-1, # 取消注释以启用 GPU 加速\n    # n_ctx=2048,      # 上下文长度\n)\n```\n在 Kani 的 LlamaCppEngine 中，通常也有对应的参数透传机制，请确保传入正确的本地绝对路径或相对路径。","https:\u002F\u002Fgithub.com\u002Fzhudotexe\u002Fkani\u002Fissues\u002F51",[160,165,170,175,180,185,190,195,200,205,210,215,220,225,230,235,240,245,250,255],{"id":161,"version":162,"summary_zh":163,"released_at":164},214601,"v1.9.1","- 
HF：新增对 Qwen-3.5 的支持\n- HF：修复了当模型将工具参数以字符串而非 JSON 对象形式生成时，工具解析可能失败的问题\n- HF：修复了使用某些多模态模型时，`mm_token_type_ids` 被错误地传递给 `generate` 方法的问题\n- HF：升级依赖版本，以支持 `transformers` v5\n- CLI：改进了流式输出时对空白字符的处理\n- CLI (HF)：新增选项，在定义模型特定的解析器时，可在流式输出中显示模型的推理标记\n","2026-03-17T04:51:06",{"id":166,"version":167,"summary_zh":168,"released_at":169},214602,"v1.9.0","## v1.9.0 - OpenAI Responses API\n\n### 新特性\n\n- OpenAI：新增支持通过在初始化 `OpenAIEngine` 时设置 `api_type=\"responses\"` 来使用 Responses API。对于“深度研究”类型的推理模型，`OpenAIEngine` 将自动默认使用 Responses API。\n\n### 修复及其他\n\n- HF：修复了在使用通义千问3系列模型进行多轮工具调用时的一个问题，即之前各轮工具调用中的思考内容不会传递到同一轮的后续步骤中。\n- OpenAI：对 Kani → OpenAI 的转换逻辑进行了小幅优化，以便更轻松地为自定义引擎（例如 vLLM 的 OpenAI 兼容 API）进行覆盖配置。","2026-03-04T03:35:12",{"id":171,"version":172,"summary_zh":173,"released_at":174},214603,"v1.8.0","模型上下文协议（MCP）是一种用于定义工具并使其可通过互联网或本地通信供大型语言模型使用的标准。Kani 现在支持使用本地或远程的 MCP 工具，因此您可以结合 Kani 的灵活性来利用广泛的 MCP 生态系统。\n\n要使用 MCP 工具，首先需要指定要连接的 MCP 服务器列表。然后，使用 `tools_from_mcp_servers` 上下文管理器连接到这些服务器，并获取可用工具的列表。您可以像使用普通的 Kani AIFunction 一样传递这些工具。\n\n更多相关信息，请参阅 MCP 工具文档：https:\u002F\u002Fkani.readthedocs.io\u002Fen\u002Flatest\u002Ffunction_calling.html#mcp-tools！\n\n### 新特性\n\n- MCP：新增 `tools_from_mcp_servers` 上下文管理器，用于从远程获取 MCP 工具，并将其作为 Kani AIFunction 提供给模型。\n- 改进了 AIFunction 返回值的自动序列化：如果 AIFunction 返回 Pydantic 模型或 `dict`\u002F`list`，将自动将其转换为 JSON 格式，而不是简单地调用 `str()`。\n- 允许 AIFunction 直接返回 `ChatMessage` 或 `list[MessagePart]`，以支持多模态函数返回。\n- 扩展包（即使用 `kani.ext.*` 命名空间的包）现在可以定义 `CLI_PROVIDERS` 列表，以便与 `kani` CLI 配合使用。\n- 改进了上下文长度计数失败时的错误日志记录。\n\n### 修复\n\n- HF：修复了在聊天模板限制下，某些提示构建不够优化的问题。\n- Qwen 3 (HF)：确保对所有思考类模型应用正确的思考解析器。\n- OpenAI：修复了在使用 `Kani.save()` 保存包含来自 OpenAI 的 `extra` 元数据的消息时出现的问题。\n- OpenAI：修复了在使用工具进行流式处理时，工具调用增量为空的问题。","2026-01-15T22:19:35",{"id":176,"version":177,"summary_zh":178,"released_at":179},214604,"v1.7.0","## 令牌计数重构\n\n在底层实现上，kani 现在使用完整的提示（即消息列表加上函数）来计算令牌数量，而不是分别对每条消息的令牌数进行累加。这一改进使得对于那些不公开其分词器的模型（如 Claude 和 Gemini）以及具有严格聊天模板的模型（如 HF Transformers 和 llama.cpp），令牌计数更加可靠。\n\n如果您没有通过 
`Kani.message_token_len`、`BaseEngine.message_len`、`BaseEngine.token_reserve` 或 `BaseEngine.function_token_reserve` 手动计算令牌数量，则无需进行任何更改。\n\n如果您曾自定义过引擎，上述方法现已弃用。要实现令牌计数功能，请将 `BaseEngine.message_len`、`.token_reserve` 和 `.function_token_reserve` 替换为新的方法 `.prompt_len(messages, functions)`。该方法可以是异步的。\n\n此次变更旨在简化和标准化新引擎的实现流程，因为基于提示的令牌计数可以复用推理过程中大部分相同的代码。\n\n### 中断性变更\n\n- 弃用 `Kani.message_token_len` —— 请改用 `await Kani.prompt_token_len`\n- 弃用 `BaseEngine.message_len`、`.token_reserve` 和 `.function_token_reserve` —— 请改用 `BaseEngine.prompt_len`\n- `AIFunction.auto_truncate` 现在会按指定的**字符数**截断，而非按令牌数\n\n### 新特性\n\n- **新增 `BaseEngine.prompt_len` 和 `Kani.prompt_token_len`**\n- **HuggingEngine 原生支持多模态**\n- 为丰富附加内容新增了 `TextPart` 消息部分\n- 为某些基于 API 的引擎（例如用于服务器端工具调用）添加了低层级可扩展性覆盖的文档\n\n### 修复\n\n- 修复了在多个位置指定解码参数时可能产生冲突并导致错误的情况\n- 修复了在构造 Kani 实例时，系统会在查找 AIFunction 时调用其属性 getter 的问题\n- 修复了当 HuggingEngine.stream 调用中 `PreTrainedModel.generate` 报错时，程序会无限期挂起的问题\n- 修复了 LlamaCppEngine 在关闭后无法立即释放资源的问题\n- 修复了解析 Mistral 风格的无参数函数调用时出现的问题\n- 修复了尝试自动识别量化变体的基础模型时，某些基础模型无法被找到的问题\n- 移除了 HTTPEngine 被移除后的一些未使用的异常类型\n- GoogleAIEngine：当 Gemini API 返回意外的空响应时，会抛出更清晰的警告","2025-10-30T19:26:22",{"id":181,"version":182,"summary_zh":183,"released_at":184},214605,"v1.6.1","- 移除了已弃用的 HTTPClient（自 v1.0.0 起）\n- 将默认的 `desired_response_tokens` 提升至模型最大上下文长度的 10% 或 8192 个 token，取两者中的较小值\n- Anthropic：推理结果现在以 ReasoningPart 的形式返回，而非 AnthropicUnknownPart\n- Anthropic：将默认的 `max_tokens` 提升至 2048\n- Anthropic：修复了工具调用结果未能正确传递给模型的问题\n- OpenAI：当未找到模型 ID 时，会更清晰地提示所使用的分词器\n- Google AI：推理结果现在以 ReasoningPart 的形式返回，而非字符串，并且在多轮函数调用中能够正确传递","2025-09-18T20:40:36",{"id":186,"version":187,"summary_zh":188,"released_at":189},214606,"v1.6.0","# 新特性：多模态输入\n\n`kani-multimodal-core` 应与核心 `kani` 安装一起使用，只需添加一个额外的依赖项：\n\n```shell\n$ pip install \"kani[multimodal]\"\n```\n\n不过，你也可以显式指定版本并单独安装核心包：\n\n```shell\n$ pip install kani-multimodal-core\n```\n\n## 功能\n\n该包提供了引擎实现可以使用的多模态扩展核心功能——它本身并不提供任何引擎实现。\n\n该包新增了对以下内容的支持：\n\n- 
图像（`kani.ext.multimodal_core.ImagePart`）\n- 音频（`kani.ext.multimodal_core.AudioPart`）\n- 视频（`kani.ext.multimodal_core.VideoPart`）\n- 其他二进制文件，例如 PDF（`kani.ext.multimodal_core.BinaryFilePart`）\n\n安装后，以下核心 Kani 引擎将自动使用多模态部分：\n\n- OpenAIEngine\n- AnthropicEngine\n- GoogleAIEngine\n\n此外，Kani 的核心 `chat_in_terminal` 方法将支持通过 `@\u002Fpath\u002Fto\u002Fmedia` 或 `@https:\u002F\u002Fexample.com\u002Fmedia` 从本地磁盘或互联网附加多模态数据。\n\n### 消息部分\n\n你需要熟悉的主要功能是 `MessagePart`，这是向引擎发送消息的核心方式。为此，在调用 Kani 的回合方法时（即 `Kani.chat_round`、`Kani.full_round` 或其字符串变体），请传递一个多模态部分的 *列表*，而不是一个字符串：\n\n```python\nfrom kani import Kani\nfrom kani.engines.openai import OpenAIEngine\nfrom kani.ext.multimodal_core import ImagePart\n\nengine = OpenAIEngine(model=\"gpt-4.1-nano\")\nai = Kani(engine)\n\n# 注意这里的参数是一个部分列表，而不是单个字符串！\nmsg = await ai.chat_round_str([\n    \"请描述这张图片：\",\n    ImagePart.from_file(\"path\u002Fto\u002Fimage.png\")\n])\nprint(msg)\n```\n\n有关提供的消息部分的更多信息，请参阅文档（https:\u002F\u002Fkani-multimodal-core.readthedocs.io）。\n\n### 终端工具\n\n安装后，kani-multimodal-core 会增强 Kani 提供的 `chat_in_terminal` 工具。\n\n此工具允许你通过在文件或 URL 前加上 `@` 符号，直接在终端中提供本地或网络上的多模态媒体：\n\n```pycon\n>>> from kani import chat_in_terminal\n>>> chat_in_terminal(ai)\nUSER: 请描述这张图片：@path\u002Fto\u002Fimage.png，还有这张：@https:\u002F\u002Fexample.com\u002Fimage.png\n```\n\n- 原生支持多模态（图像、视频、音频）模型，借助 `kani-multimodal-core` 包（https:\u002F\u002Fgithub.com\u002Fzhudotexe\u002Fkani-multimodal-core）！\n\t- AnthropicEngine、OpenAIEngine 和 GoogleAIEngine 在安装 `kani-multimodal-core` 后将自动支持多模态输入\n\n# 新特性：原生 Google Gemini 支持\n\n```shell\n$ pip install \"kani[google]\"\n```\n\n```python\nfrom kani import Kani\nfrom kani.engines.google import GoogleAIEngine\n\nengine = GoogleAIEngine(model=\"gemini-2.5-flash\")\n```\n\n该引擎通过 Google AI Studio API 支持所有 Google AI 模型。\n\n详情请参阅 https:\u002F\u002Fai.","2025-08-28T18:42:46",{"id":191,"version":192,"summary_zh":193,"released_at":194},214607,"v1.5.1","- 修复了在使用不提供工具的 GPT-OSS 
时，推理过程与最终输出无法分离的问题。","2025-08-08T22:16:23",{"id":196,"version":197,"summary_zh":198,"released_at":199},214608,"v1.5.0","# GPT-OSS 和 GPT-5\n\nkani>=1.5.0 现已支持 GPT-OSS 和 GPT-5！您可以使用以下代码开始体验完整的函数调用与推理能力：\n```python\nfrom kani import Kani, chat_in_terminal\nfrom kani.engines.huggingface import HuggingEngine\nfrom kani.model_specific.gpt_oss import GPTOSSParser\n# 此方法对 20B 和 120B 版本均适用，只需替换模型 ID 即可！\nmodel = HuggingEngine(\n    model_id=\"openai\u002Fgpt-oss-20b\",\n    chat_template_kwargs=dict(reasoning_effort=\"low\"),  # 可设置为 \"low\"、\"medium\" 或 \"high\"\n    eos_token_id=[200002, 199999, 200012],              # 确保模型在工具调用时正确停止\n    temperature=1.0,                                    # 建议的解码参数\n    top_k=None,                                         # 确保不使用 top_k（transformers 默认值为 50）\n)\nengine = GPTOSSParser(model, show_reasoning_in_stream=True)\nai = Kani(engine)\nchat_in_terminal(ai)\n```\n\n# 完整发布说明\n\n- 新增对 **GPT-OSS** 的支持，并配备了特定于模型的解析器\u002F流水线。\n- 在 OpenAIEngine 中新增对 **GPT-5** 的支持。\n- 添加了自动的手写模型流水线：当使用 HuggingEngine 且其模型需要比提供的聊天模板更复杂的逻辑时，Kani 会自动选择正确的手写提示流水线（位于 `kani.model_specific`）。\n- 修复了基于 HF 聊天模板的流水线未能以正确模式发送工具的问题。\n- 修复了某些参数名称无法传递给 `ToolCall.from_function` 的问题。\n- 修复了在 `openai-python>=1.99.2` 上导入 OpenAIEngine 时出现的问题。\n- 使 HuggingEngine 能够返回非 EOS 的特殊标记。\n- 优化了 HuggingEngine 的吞吐量和内存使用。\n- 破坏性变更：将现有的手写模型流水线从 `prompts\u002Fimpl` 移至 `model_specific`。\n- 破坏性变更：将现有的工具解析器从 `tool_parsers` 移至 `model_specific`。\n- 破坏性变更：移除了 Vicuna 1.3：该模型是 Llama v1 的一个非常早期的微调版本，此次移除旨在减轻库的维护负担。","2025-08-07T22:14:38",{"id":201,"version":202,"summary_zh":203,"released_at":204},214609,"v1.4.3","- Llama.cpp：添加 `model_path` 关键字参数，以支持加载本地 GGUF 模型（感谢 @lawrenceakka！）\r\n\r\n> [!NOTE]  \r\n> 从技术上讲，这属于轻微的破坏性变更，因为参数的位置发生了变化。建议在加载任何模型时使用关键字参数。\r\n\r\n- Hugging Face：如果已设置 `max_new_tokens`，则不设置 `max_length` 生成参数，以避免显示冗长的警告信息。\r\n- OpenAI：为 o 系列模型和 GPT-4.1 添加默认上下文长度；对于没有默认上下文长度的模型，添加警告提示。","2025-06-09T17:50:55",{"id":206,"version":207,"summary_zh":208,"released_at":209},214610,"v1.4.2","- 向 
`HuggingEngine` 添加 `model_cls` 参数，以允许指定除 `AutoModelForCausalLM` 之外的替代类（例如用于 Qwen-2.5-omni）。","2025-04-09T22:26:16",{"id":211,"version":212,"summary_zh":213,"released_at":214},214611,"v1.4.1","- Added better options for controlling the JSON Schema generated by an AIFunction\r\n- Generated JSON Schema now includes a function's docstring by default as the top-level `description` key\r\n- Generated JSON Schema's top-level `title` key is now a function's name instead of `_FunctionSpec` by default\r\n- Generated JSON Schema's fields only include a `title` key if a `title` kwarg is explicitly passed to `AIParam` (fixing a regression introduced some time ago)\r\n\r\nThese changes should have no effect on OpenAI function calling; these changes are made to improve compatibility with open models that use raw JSON Schema to define functions (e.g., Step-Audio).\r\n","2025-04-05T01:46:51",{"id":216,"version":217,"summary_zh":218,"released_at":219},214612,"v1.4.0","Mainly improvements to the llama.cpp engine in this release.\r\n\r\n## Improvements\r\n- Update the `LlamaCppEngine` to not use the Llama 2 prompt pipeline by default. Prompt pipelines must now be explicitly passed.\r\n- The `LlamaCppEngine` will now automatically download additional GGUF shards when a sharded model is given.\r\n- Added `ChatTemplatePromptPipeline.from_pretrained` to create a prompt pipeline from the chat template of any model on the HF Hub, by ID.\r\n- Added examples and documentation for using DeepSeek-R1 (quantized).\r\n\r\n## Fixes\r\n- `chat_in_terminal_async` no longer blocks the asyncio event loop when waiting for input from the terminal.\r\n- Fixed the `LlamaCppEngine` not passing functions to the provided prompt pipeline. ","2025-02-21T16:09:19",{"id":221,"version":222,"summary_zh":223,"released_at":224},214613,"v1.3.0","## Enhancements\r\n- Added `ToolCallParser`s -- these classes are wrappers around Kani `Engine`s that parse raw text generated by a model, and return Kani-format tool calls. 
This is an easy way to enable tool calling on open-source models!\r\n\r\nExample:\r\n```python\r\nfrom kani.engines.huggingface import HuggingEngine\r\nfrom kani.prompts.impl.mistral import MISTRAL_V3_PIPELINE\r\nfrom kani.tool_parsers.mistral import MistralToolCallParser\r\nmodel = HuggingEngine(model_id=\"mistralai\u002FMistral-Small-Instruct-2409\", prompt_pipeline=MISTRAL_V3_PIPELINE)\r\nengine = MistralToolCallParser(model)\r\n```\r\n\r\n- Added `NaiveJSONToolCallParser` (e.g., Llama 3)\r\n- Added `MistralToolCallParser`\r\n- Added `DeepseekR1ToolCallParser`\r\n\r\n## Bug Fixes et al.\r\n- Fix compatibility issues with Pydantic 2.10\r\n- Update documentation to better reflect supported HF models","2025-02-03T20:50:12",{"id":226,"version":227,"summary_zh":228,"released_at":229},214614,"v1.2.4","- Pin the Pydantic dependency to `pydantic\u003C2.10.0` as this version breaks JSON schema generation and MessagePart serialization","2024-12-09T18:08:15",{"id":231,"version":232,"summary_zh":233,"released_at":234},214615,"v1.2.3","- Fixes Anthropic tool calling being broken with anthropic-sdk>0.26.0\r\n- Fixes an issue where Anthropic prompts were over-eagerly trimming prompts that did not start with a user message\r\n- Added support for tool calling while streaming with Anthropic models\r\n","2024-11-14T18:36:01",{"id":236,"version":237,"summary_zh":238,"released_at":239},214616,"v1.2.2","- fix(mistral): ensure prompt and completion tokens are passed through in the MistralFunctionCallingAdapter when streaming\r\n- fix(streaming): don't emit text in DummyStream if it is None\r\n- feat: add standalone width formatters\r\n- docs: gpt-3.5-turbo -> gpt-4o-mini defaults\r\n- fix(streaming): potential line len miscount in format_stream","2024-10-25T15:42:15",{"id":241,"version":242,"summary_zh":243,"released_at":244},214617,"v1.2.1","- Fixes various issues in the `MistralFunctionCallingAdapter` wrapper engine for Mistral-Large and Mistral-Small function calling models.\r\n- 
Fixes an issue in `PromptPipeline.explain()` where manual examples would not be explained.\r\n- Fixes an issue in `PromptPipeline.ensure_bound_function_calls()` where passing an ID translator would mutate the ID of the underlying messages","2024-10-06T21:36:52",{"id":246,"version":247,"summary_zh":248,"released_at":249},214618,"v1.2.0","## New Features\r\n- Hugging Face: Models loaded through the `HuggingEngine` now use [chat templates](https:\u002F\u002Fhuggingface.co\u002Fdocs\u002Ftransformers\u002Fmain\u002Fchat_templating) for conversational prompting and tool usage if available by default. This should make it much easier to get started with a Hugging Face model in Kani.\r\n- Added the ability to supply a custom tokenizer to the `OpenAIEngine` (e.g., for using OpenAI-compatible APIs)\\\r\n\r\n## Fixes\u002FImprovements\r\n- Fixed a missing dependency in the `llama` extra\r\n- The `HuggingEngine` will now automatically set `device_map=\"auto\"` if the `accelerate` library is installed","2024-09-24T20:26:47",{"id":251,"version":252,"summary_zh":253,"released_at":254},214619,"v1.1.1","- Fixes an issue where `PromptPipeline.ensure_bound_function_calls()` could still let unbound function calls through in cases of particularly long prompts with prefixing system prompts","2024-07-30T00:34:45",{"id":256,"version":257,"summary_zh":258,"released_at":259},214620,"v1.1.0","- Added `max_function_rounds` to `Kani.full_round`, `Kani.full_round_str`, and `Kani.full_round_stream`:\r\n  > The maximum number of function calling rounds to perform in this round. If this number is reached, the model is allowed to generate a final response without any functions defined.\r\n  > Default unlimited (continues until model's response does not contain a function call).\r\n- Added `__repr__` to engines\r\n- Fixed an issue where Kani could underestimate the token usage for certain OpenAI models using parallel function calling","2024-07-01T22:01:52"]