[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-Lightning-AI--LitServe":3,"tool-Lightning-AI--LitServe":64},[4,17,27,35,43,56],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":16},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,3,"2026-04-05T11:01:52",[13,14,15],"开发框架","图像","Agent","ready",{"id":18,"name":19,"github_repo":20,"description_zh":21,"stars":22,"difficulty_score":23,"last_commit_at":24,"category_tags":25,"status":16},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",140436,2,"2026-04-05T23:32:43",[13,15,26],"语言模型",{"id":28,"name":29,"github_repo":30,"description_zh":31,"stars":32,"difficulty_score":23,"last_commit_at":33,"category_tags":34,"status":16},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",107662,"2026-04-03T11:11:01",[13,14,15],{"id":36,"name":37,"github_repo":38,"description_zh":39,"stars":40,"difficulty_score":23,"last_commit_at":41,"category_tags":42,"status":16},3704,"NextChat","ChatGPTNextWeb\u002FNextChat","NextChat 是一款轻量且极速的 AI 助手，旨在为用户提供流畅、跨平台的大模型交互体验。它完美解决了用户在多设备间切换时难以保持对话连续性，以及面对众多 AI 模型不知如何统一管理的痛点。无论是日常办公、学习辅助还是创意激发，NextChat 都能让用户随时随地通过网页、iOS、Android、Windows、MacOS 或 Linux 端无缝接入智能服务。\n\n这款工具非常适合普通用户、学生、职场人士以及需要私有化部署的企业团队使用。对于开发者而言，它也提供了便捷的自托管方案，支持一键部署到 Vercel 或 Zeabur 等平台。\n\nNextChat 的核心亮点在于其广泛的模型兼容性，原生支持 Claude、DeepSeek、GPT-4 及 Gemini Pro 等主流大模型，让用户在一个界面即可自由切换不同 AI 能力。此外，它还率先支持 MCP（Model Context Protocol）协议，增强了上下文处理能力。针对企业用户，NextChat 提供专业版解决方案，具备品牌定制、细粒度权限控制、内部知识库整合及安全审计等功能，满足公司对数据隐私和个性化管理的高标准要求。",87618,"2026-04-05T07:20:52",[13,26],{"id":44,"name":45,"github_repo":46,"description_zh":47,"stars":48,"difficulty_score":23,"last_commit_at":49,"category_tags":50,"status":16},2268,"ML-For-Beginners","microsoft\u002FML-For-Beginners","ML-For-Beginners 是由微软推出的一套系统化机器学习入门课程，旨在帮助零基础用户轻松掌握经典机器学习知识。这套课程将学习路径规划为 12 周，包含 26 节精炼课程和 52 道配套测验，内容涵盖从基础概念到实际应用的完整流程，有效解决了初学者面对庞大知识体系时无从下手、缺乏结构化指导的痛点。\n\n无论是希望转型的开发者、需要补充算法背景的研究人员，还是对人工智能充满好奇的普通爱好者，都能从中受益。课程不仅提供了清晰的理论讲解，还强调动手实践，让用户在循序渐进中建立扎实的技能基础。其独特的亮点在于强大的多语言支持，通过自动化机制提供了包括简体中文在内的 
50 多种语言版本，极大地降低了全球不同背景用户的学习门槛。此外，项目采用开源协作模式，社区活跃且内容持续更新，确保学习者能获取前沿且准确的技术资讯。如果你正寻找一条清晰、友好且专业的机器学习入门之路，ML-For-Beginners 将是理想的起点。",84991,"2026-04-05T10:45:23",[14,51,52,53,15,54,26,13,55],"数据工具","视频","插件","其他","音频",{"id":57,"name":58,"github_repo":59,"description_zh":60,"stars":61,"difficulty_score":10,"last_commit_at":62,"category_tags":63,"status":16},3128,"ragflow","infiniflow\u002Fragflow","RAGFlow 是一款领先的开源检索增强生成（RAG）引擎，旨在为大语言模型构建更精准、可靠的上下文层。它巧妙地将前沿的 RAG 技术与智能体（Agent）能力相结合，不仅支持从各类文档中高效提取知识，还能让模型基于这些知识进行逻辑推理和任务执行。\n\n在大模型应用中，幻觉问题和知识滞后是常见痛点。RAGFlow 通过深度解析复杂文档结构（如表格、图表及混合排版），显著提升了信息检索的准确度，从而有效减少模型“胡编乱造”的现象，确保回答既有据可依又具备时效性。其内置的智能体机制更进一步，使系统不仅能回答问题，还能自主规划步骤解决复杂问题。\n\n这款工具特别适合开发者、企业技术团队以及 AI 研究人员使用。无论是希望快速搭建私有知识库问答系统，还是致力于探索大模型在垂直领域落地的创新者，都能从中受益。RAGFlow 提供了可视化的工作流编排界面和灵活的 API 接口，既降低了非算法背景用户的上手门槛，也满足了专业开发者对系统深度定制的需求。作为基于 Apache 2.0 协议开源的项目，它正成为连接通用大模型与行业专有知识之间的重要桥梁。",77062,"2026-04-04T04:44:48",[15,14,13,26,54],{"id":65,"github_repo":66,"name":67,"description_en":68,"description_zh":69,"ai_summary_zh":69,"readme_en":70,"readme_zh":71,"quickstart_zh":72,"use_case_zh":73,"hero_image_url":74,"owner_login":75,"owner_name":76,"owner_avatar_url":77,"owner_bio":78,"owner_company":79,"owner_location":79,"owner_email":79,"owner_twitter":80,"owner_website":81,"owner_url":82,"languages":83,"stars":92,"forks":93,"last_commit_at":94,"license":95,"difficulty_score":23,"env_os":96,"env_gpu":97,"env_ram":98,"env_deps":99,"category_tags":110,"github_topics":111,"view_count":23,"oss_zip_url":79,"oss_zip_packed_at":79,"status":16,"created_at":120,"updated_at":121,"faqs":122,"releases":152},1989,"Lightning-AI\u002FLitServe","LitServe","A minimal Python framework for building custom AI inference servers with full control over logic, batching, and scaling.","LitServe 是一个轻量级的 Python 框架，帮助开发者快速构建自定义的 AI 推理服务。它允许你用纯 Python 代码完全控制模型的推理逻辑、批量处理、流式输出和多模型调度，无需依赖复杂的 MLOps 配置或黑盒服务。传统推理框架往往只支持单一模型类型，难以扩展到多模型、智能体或 RAG 等复杂场景，而 LitServe 让你自由定义流程，同时自动处理并发、扩展和部署。它特别适合需要灵活推理逻辑的 AI 开发者和研究人员，比如构建个性化聊天机器人、多模型流水线或定制化 RAG 系统。支持任意 PyTorch 模型，兼容 vLLM，可本地运行，也可一键部署到云端。其性能比 FastAPI 快近两倍，且无需编写额外的网络或服务胶水代码，真正实现“写逻辑，交给你，其余我来管”。","\u003Cdiv align='center'>\n\n\u003Ch1>\n  Build custom inference servers in pure Python\n  \u003Cbr\u002F>\n\u003C\u002Fh1> \n\u003Ch4>\n  Define exactly how inference works for models, agents, RAG, or pipelines. 
\n  \u003Cbr\u002F>\n  Control batching, routing, streaming, and orchestration without MLOps glue or config files.\n\u003C\u002Fh4> \n\n\u003Cimg alt=\"Lightning\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FLightning-AI_LitServe_readme_a82fc7954f3f.png\" width=\"800px\" style=\"max-width: 100%;\">\n\n&nbsp; \n\u003C\u002Fdiv>\n\n\u003Cdiv align='center'>\n  \n\u003Cpre>\n✅ Custom inference logic  ✅ 2× faster than FastAPI     ✅ Agents, RAG, pipelines, more\n✅ Custom logic + control  ✅ Any PyTorch model          ✅ Self-host or managed        \n✅ Multi-GPU autoscaling   ✅ Batching + streaming       ✅ BYO model or vLLM           \n✅ No MLOps glue code      ✅ Easy setup in Python       ✅ Serverless support          \n\n\u003C\u002Fpre>\n\n\u003Cdiv align='center'>\n\n[![PyPI Downloads](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FLightning-AI_LitServe_readme_8990a26a8b9d.png)](https:\u002F\u002Fpepy.tech\u002Fprojects\u002Flitserve)\n[![Discord](https:\u002F\u002Fimg.shields.io\u002Fdiscord\u002F1077906959069626439?label=Get%20help%20on%20Discord)](https:\u002F\u002Fdiscord.gg\u002FWajDThKAur)\n![cpu-tests](https:\u002F\u002Fgithub.com\u002FLightning-AI\u002Flitserve\u002Factions\u002Fworkflows\u002Fci-testing.yml\u002Fbadge.svg)\n[![codecov](https:\u002F\u002Fcodecov.io\u002Fgh\u002FLightning-AI\u002Flitserve\u002Fgraph\u002Fbadge.svg?token=SmzX8mnKlA)](https:\u002F\u002Fcodecov.io\u002Fgh\u002FLightning-AI\u002Flitserve)\n[![license](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLicense-Apache%202.0-blue.svg)](https:\u002F\u002Fgithub.com\u002FLightning-AI\u002Flitserve\u002Fblob\u002Fmain\u002FLICENSE)\n\n\u003C\u002Fdiv>\n\u003C\u002Fdiv>\n\u003Cdiv align=\"center\">\n  \u003Cdiv style=\"text-align: center;\">\n    \u003Ca target=\"_blank\" href=\"#quick-start\" style=\"margin: 0 10px;\">Quick start\u003C\u002Fa> •\n    \u003Ca target=\"_blank\" href=\"#featured-examples\" style=\"margin: 0 10px;\">Examples\u003C\u002Fa> •\n    \u003Ca target=\"_blank\" href=\"#features\" style=\"margin: 0 10px;\">Features\u003C\u002Fa> •\n    \u003Ca target=\"_blank\" href=\"#performance\" style=\"margin: 0 10px;\">Performance\u003C\u002Fa> •\n    \u003Ca target=\"_blank\" href=\"#host-anywhere\" style=\"margin: 0 10px;\">Hosting\u003C\u002Fa> •\n    \u003Ca target=\"_blank\" href=\"https:\u002F\u002Flightning.ai\u002Fdocs\u002Flitserve?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\" style=\"margin: 0 10px;\">Docs\u003C\u002Fa>\n  \u003C\u002Fdiv>\n\u003C\u002Fdiv>\n\n&nbsp;\n\n\u003Cdiv align=\"center\">\n\u003Ca target=\"_blank\" href=\"https:\u002F\u002Flightning.ai\u002Fdocs\u002Flitserve\u002Fhome\u002Fget-started?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\">\n  \u003Cimg src=\"https:\u002F\u002Fpl-bolts-doc-images.s3.us-east-2.amazonaws.com\u002Fapp-2\u002Fget-started-badge.svg\" height=\"36px\" alt=\"Get started\"\u002F>\n\u003C\u002Fa>\n\u003C\u002Fdiv>\n\n&nbsp; \n\n# Why LitServe?\nMost serving tools (vLLM, etc..) are built for a single model type and enforce rigid abstractions. They work well until you need custom logic, multiple models, agents, or non standard pipelines. LitServe lets you write your own inference engine in Python. You define how requests are handled, how models are loaded, how batching and routing work, and how outputs are produced. LitServe handles performance, concurrency, scaling, and deployment. 
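\n\nAs a hedged sketch of what \"define how requests are handled\" means in code (the `decode_request`\u002F`encode_response` hook names follow the LitAPI docs linked below; the toy model is illustrative, not canonical):\n\n```python\nimport litserve as ls\n\nclass CustomEngine(ls.LitAPI):\n    def setup(self, device):\n        # runs once per worker: load models, DBs, or clients here\n        self.model = lambda x: x * 2\n\n    def decode_request(self, request):\n        # you define how the raw request is parsed\n        return request[\"input\"]\n\n    def predict(self, x):\n        # you define the inference logic: one model, a pipeline, an agent, ...\n        return self.model(x)\n\n    def encode_response(self, output):\n        # you define the response shape\n        return {\"output\": output}\n```\n\n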
Use LitServe to build inference APIs, agents, chatbots, RAG systems, MCP servers, or multi model pipelines. \n\nRun it locally, self host anywhere, or deploy with one click on [Lightning AI](https:\u002F\u002Flightning.ai\u002Flitserve?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme).\n\n&nbsp;\n\n# Want the easiest way to host inference?\nOver 380,000 developers use [Lightning Cloud](https:\u002F\u002Flightning.ai\u002F?utm_source=ptl_readme&utm_medium=referral&utm_campaign=ptl_readme), the simplest way to run LitServe without managing infrastructure. Deploy with one command, get autoscaling GPUs, monitoring, and a free tier. No cloud setup required. Or self host anywhere.\n\n# Quick start\n\nInstall LitServe via pip ([more options](https:\u002F\u002Flightning.ai\u002Fdocs\u002Flitserve\u002Fhome\u002Finstall)):\n\n```bash\npip install litserve\n```\n\n[Example 1](#inference-engine-example): Toy inference pipeline with multiple models.   \n[Example 2](#agent-example): Minimal agent to fetch the news (with OpenAI API).    \n([Advanced examples](#featured-examples)):    \n\n### Inference engine example   \n\n```python\nimport litserve as ls\n\n# define the api to include any number of models, dbs, etc...\nclass InferenceEngine(ls.LitAPI):\n    def setup(self, device):\n        self.text_model = lambda x: x**2\n        self.vision_model = lambda x: x**3\n\n    def predict(self, request):\n        x = request[\"input\"]    \n        # perform calculations using both models\n        a = self.text_model(x)\n        b = self.vision_model(x)\n        c = a + b\n        return {\"output\": c}\n\nif __name__ == \"__main__\":\n    # 12+ features like batching, streaming, etc...\n    server = ls.LitServer(InferenceEngine(max_batch_size=1), accelerator=\"auto\")\n    server.run(port=8000)\n```\n\nDeploy for free to [Lightning cloud](#hosting-options) (or self host anywhere):\n\n```bash\n# Deploy for free with autoscaling, monitoring, etc...\nlightning deploy server.py --cloud\n\n# Or run locally (self host anywhere)\nlightning deploy server.py\n# python server.py\n```\n\nTest the server: Simulate an http request (run this on any terminal):\n```bash\ncurl -X POST http:\u002F\u002F127.0.0.1:8000\u002Fpredict -H \"Content-Type: application\u002Fjson\" -d '{\"input\": 4.0}'\n```\n\n### Agent example\n\n```python\nimport re, requests, openai\nimport litserve as ls\n\nclass NewsAgent(ls.LitAPI):\n    def setup(self, device):\n        self.openai_client = openai.OpenAI(api_key=\"OPENAI_API_KEY\")\n\n    def predict(self, request):\n        website_url = request.get(\"website_url\", \"https:\u002F\u002Ftext.npr.org\u002F\")\n        website_text = re.sub(r'\u003C[^>]+>', ' ', requests.get(website_url).text)\n\n        # ask the LLM to tell you about the news\n        llm_response = self.openai_client.chat.completions.create(\n           model=\"gpt-3.5-turbo\", \n           messages=[{\"role\": \"user\", \"content\": f\"Based on this, what is the latest: {website_text}\"}],\n        )\n        output = llm_response.choices[0].message.content.strip()\n        return {\"output\": output}\n\nif __name__ == \"__main__\":\n    server = ls.LitServer(NewsAgent())\n    server.run(port=8000)\n```\nTest it:\n```bash\ncurl -X POST http:\u002F\u002F127.0.0.1:8000\u002Fpredict -H \"Content-Type: application\u002Fjson\" -d '{\"website_url\": \"https:\u002F\u002Ftext.npr.org\u002F\"}'\n```\n\n&nbsp;\n\n# Key benefits   \n\nA few key benefits:\n\n- **Deploy any pipeline or model**: Agents, 
pipelines, RAG, chatbots, image models, video, speech, text, etc...\n- **No MLOps glue:** LitAPI lets you build full AI systems (multi-model, agent, RAG) in one place ([more](https:\u002F\u002Flightning.ai\u002Fdocs\u002Flitserve\u002Fapi-reference\u002Flitapi?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme)).   \n- **Instant setup:** Connect models, DBs, and data in a few lines with `setup()` ([more](https:\u002F\u002Flightning.ai\u002Fdocs\u002Flitserve\u002Fapi-reference\u002Flitapi?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme#setup)).    \n- **Optimized:** autoscaling, GPU support, and fast inference included ([more](https:\u002F\u002Flightning.ai\u002Fdocs\u002Flitserve\u002Fapi-reference\u002Flitserver?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme)).    \n- **Deploy anywhere:** self-host or one-click deploy with Lightning ([more](https:\u002F\u002Flightning.ai\u002Fdocs\u002Flitserve\u002Ffeatures\u002Fdeploy-on-cloud?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme)).\n- **FastAPI for AI:** Built on FastAPI but optimized for AI - 2× faster with AI-specific multi-worker handling ([more](#performance)).   \n- **Expert-friendly:** Use vLLM, or build your own with full control over batching, caching, and logic ([more](https:\u002F\u002Flightning.ai\u002Flightning-ai\u002Fstudios\u002Fdeploy-a-private-llama-3-2-rag-api?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme)).    \n\n> ⚠️ Not a vLLM or Ollama alternative out of the box. LitServe gives you lower-level flexibility to build what they do (and more) if you need it.\n\n&nbsp;\n\n# Featured examples    \nHere are examples of inference pipelines for common model types and use cases.      
\n  \n\u003Cpre>\n\u003Cstrong>Toy model:\u003C\u002Fstrong>      \u003Ca target=\"_blank\" href=\"#define-a-server\">Hello world\u003C\u002Fa>\n\u003Cstrong>LLMs:\u003C\u002Fstrong>           \u003Ca target=\"_blank\" href=\"https:\u002F\u002Flightning.ai\u002Flightning-ai\u002Fstudios\u002Fdeploy-llama-3-2-vision-with-litserve?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\">Llama 3.2\u003C\u002Fa>, \u003Ca target=\"_blank\" href=\"https:\u002F\u002Flightning.ai\u002Flightning-ai\u002Fstudios\u002Fopenai-fault-tolerant-proxy-server?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\">LLM Proxy server\u003C\u002Fa>, \u003Ca target=\"_blank\" href=\"https:\u002F\u002Flightning.ai\u002Flightning-ai\u002Fstudios\u002Fdeploy-ai-agent-with-tool-use?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\">Agent with tool use\u003C\u002Fa>\n\u003Cstrong>RAG:\u003C\u002Fstrong>            \u003Ca target=\"_blank\" href=\"https:\u002F\u002Flightning.ai\u002Flightning-ai\u002Fstudios\u002Fdeploy-a-private-llama-3-2-rag-api?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\">vLLM RAG (Llama 3.2)\u003C\u002Fa>, \u003Ca target=\"_blank\" href=\"https:\u002F\u002Flightning.ai\u002Flightning-ai\u002Fstudios\u002Fdeploy-a-private-llama-3-1-rag-api?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\">RAG API (LlamaIndex)\u003C\u002Fa>\n\u003Cstrong>NLP:\u003C\u002Fstrong>            \u003Ca target=\"_blank\" href=\"https:\u002F\u002Flightning.ai\u002Flightning-ai\u002Fstudios\u002Fdeploy-any-hugging-face-model-instantly?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\">Hugging face\u003C\u002Fa>, \u003Ca target=\"_blank\" href=\"https:\u002F\u002Flightning.ai\u002Flightning-ai\u002Fstudios\u002Fdeploy-a-hugging-face-bert-model?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\">BERT\u003C\u002Fa>, \u003Ca target=\"_blank\" href=\"https:\u002F\u002Flightning.ai\u002Flightning-ai\u002Fstudios\u002Fdeploy-text-embedding-api-with-litserve?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\">Text embedding API\u003C\u002Fa>\n\u003Cstrong>Multimodal:\u003C\u002Fstrong>     \u003Ca target=\"_blank\" href=\"https:\u002F\u002Flightning.ai\u002Flightning-ai\u002Fstudios\u002Fdeploy-open-ai-clip-with-litserve?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\">OpenAI Clip\u003C\u002Fa>, \u003Ca target=\"_blank\" href=\"https:\u002F\u002Flightning.ai\u002Flightning-ai\u002Fstudios\u002Fdeploy-a-multi-modal-llm-with-minicpm?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\">MiniCPM\u003C\u002Fa>, \u003Ca target=\"_blank\" href=\"https:\u002F\u002Flightning.ai\u002Flightning-ai\u002Fstudios\u002Fdeploy-phi3-5-vision-api-with-litserve?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\">Phi-3.5 Vision Instruct\u003C\u002Fa>, \u003Ca target=\"_blank\" href=\"https:\u002F\u002Flightning.ai\u002Fbhimrajyadav\u002Fstudios\u002Fdeploy-and-chat-with-qwen2-vl-using-litserve?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\">Qwen2-VL\u003C\u002Fa>, \u003Ca target=\"_blank\" href=\"https:\u002F\u002Flightning.ai\u002Flightning-ai\u002Fstudios\u002Fdeploy-a-multi-modal-llm-with-pixtral?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\">Pixtral\u003C\u002Fa>\n\u003Cstrong>Audio:\u003C\u002Fstrong>    
      \u003Ca target=\"_blank\" href=\"https:\u002F\u002Flightning.ai\u002Flightning-ai\u002Fstudios\u002Fdeploy-open-ai-s-whisper-model?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\">Whisper\u003C\u002Fa>, \u003Ca target=\"_blank\" href=\"https:\u002F\u002Flightning.ai\u002Flightning-ai\u002Fstudios\u002Fdeploy-an-music-generation-api-with-meta-s-audio-craft?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\">AudioCraft\u003C\u002Fa>, \u003Ca target=\"_blank\" href=\"https:\u002F\u002Flightning.ai\u002Flightning-ai\u002Fstudios\u002Fdeploy-an-audio-generation-api?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\">StableAudio\u003C\u002Fa>, \u003Ca target=\"_blank\" href=\"https:\u002F\u002Flightning.ai\u002Flightning-ai\u002Fstudios\u002Fdeploy-a-noise-cancellation-api-with-deepfilternet?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\">Noise cancellation (DeepFilterNet)\u003C\u002Fa>\n\u003Cstrong>Vision:\u003C\u002Fstrong>         \u003Ca target=\"_blank\" href=\"https:\u002F\u002Flightning.ai\u002Flightning-ai\u002Fstudios\u002Fdeploy-a-private-api-for-stable-diffusion-2?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\">Stable diffusion 2\u003C\u002Fa>, \u003Ca target=\"_blank\" href=\"https:\u002F\u002Flightning.ai\u002Flightning-ai\u002Fstudios\u002Fdeploy-an-image-generation-api-with-auraflow?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\">AuraFlow\u003C\u002Fa>, \u003Ca target=\"_blank\" href=\"https:\u002F\u002Flightning.ai\u002Flightning-ai\u002Fstudios\u002Fdeploy-an-image-generation-api-with-flux?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\">Flux\u003C\u002Fa>, \u003Ca target=\"_blank\" href=\"https:\u002F\u002Flightning.ai\u002Flightning-ai\u002Fstudios\u002Fdeploy-a-super-resolution-image-api-with-aura-sr?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\">Image Super Resolution (Aura SR)\u003C\u002Fa>,\n                \u003Ca target=\"_blank\" href=\"https:\u002F\u002Flightning.ai\u002Fbhimrajyadav\u002Fstudios\u002Fdeploy-background-removal-api-with-litserve?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\">Background Removal\u003C\u002Fa>, \u003Ca target=\"_blank\" href=\"https:\u002F\u002Flightning.ai\u002Flightning-ai\u002Fstudios\u002Fdeploy-a-controlled-image-generation-api-controlnet?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\">Control Stable Diffusion (ControlNet)\u003C\u002Fa>\n\u003Cstrong>Speech:\u003C\u002Fstrong>         \u003Ca target=\"_blank\" href=\"https:\u002F\u002Flightning.ai\u002Flightning-ai\u002Fstudios\u002Fdeploy-a-voice-clone-api-coqui-xtts-v2-model?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\">Text-speech (XTTS V2)\u003C\u002Fa>, \u003Ca target=\"_blank\" href=\"https:\u002F\u002Flightning.ai\u002Fbhimrajyadav\u002Fstudios\u002Fdeploy-a-speech-generation-api-using-parler-tts-powered-by-litserve?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\">Parler-TTS\u003C\u002Fa>\n\u003Cstrong>Classical ML:\u003C\u002Fstrong>   \u003Ca target=\"_blank\" href=\"https:\u002F\u002Flightning.ai\u002Flightning-ai\u002Fstudios\u002Fdeploy-random-forest-with-litserve?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\">Random forest\u003C\u002Fa>, \u003Ca target=\"_blank\" 
href=\"https:\u002F\u002Flightning.ai\u002Flightning-ai\u002Fstudios\u002Fdeploy-xgboost-with-litserve?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\">XGBoost\u003C\u002Fa>\n\u003Cstrong>Miscellaneous:\u003C\u002Fstrong>  \u003Ca target=\"_blank\" href=\"https:\u002F\u002Flightning.ai\u002Flightning-ai\u002Fstudios\u002Fdeploy-an-media-conversion-api-with-ffmpeg?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\">Media conversion API (ffmpeg)\u003C\u002Fa>, \u003Ca target=\"_blank\" href=\"https:\u002F\u002Flightning.ai\u002Flightning-ai\u002Fstudios\u002Fdeploy-both-pytorch-and-tensorflow-in-a-single-api?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\">PyTorch + TensorFlow in one API\u003C\u002Fa>, \u003Ca target=\"_blank\" href=\"https:\u002F\u002Flightning.ai\u002Flightning-ai\u002Fstudios\u002Fopenai-fault-tolerant-proxy-server?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\">LLM proxy server\u003C\u002Fa>\n\u003C\u002Fpre>\n\u003C\u002Fpre>\n\n[Browse 100+ community-built templates](https:\u002F\u002Flightning.ai\u002Fstudios?section=serving&utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme)\n\n&nbsp;\n\n# Host anywhere\n\nSelf-host with full control, or deploy with [Lightning AI](https:\u002F\u002Flightning.ai\u002F?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme) in seconds with autoscaling, security, and 99.995% uptime.  \n**Free tier included. No setup required. Run on your cloud**   \n\n```bash\nlightning deploy server.py --cloud\n```\n\nhttps:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002Fff83dab9-0c9f-4453-8dcb-fb9526726344\n\n&nbsp;\n\n# Features\n\n\u003Cdiv align='center'>\n\n| [Feature](https:\u002F\u002Flightning.ai\u002Fdocs\u002Flitserve\u002Ffeatures?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme)               | Self Managed                      | [Fully Managed on Lightning](https:\u002F\u002Flightning.ai\u002Fdeploy?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme)         |\n|----------------------------------------------------------------------|-----------------------------------|------------------------------------|\n| Docker-first deployment          | ✅ DIY                             | ✅ One-click deploy                |\n| Cost                             | ✅ Free (DIY)                      | ✅ Generous [free tier](https:\u002F\u002Flightning.ai\u002Fpricing?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme) with pay as you go                |\n| Full control                     | ✅                                 | ✅                                 |\n| Use any engine (vLLM, etc.)      | ✅                                 | ✅ vLLM, Ollama, LitServe, etc.    
|\n| Own VPC                          | ✅ (manual setup)                  | ✅ Connect your own VPC            |\n| [(2x)+ faster than plain FastAPI](#performance)                                               | ✅       | ✅                                 |\n| [Bring your own model](https:\u002F\u002Flightning.ai\u002Fdocs\u002Flitserve\u002Ffeatures\u002Ffull-control?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme)              | ✅       | ✅                                 |\n| [Build compound systems (1+ models)](https:\u002F\u002Flightning.ai\u002Fdocs\u002Flitserve\u002Fhome?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme)                 | ✅       | ✅                                 |\n| [GPU autoscaling](https:\u002F\u002Flightning.ai\u002Fdocs\u002Flitserve\u002Ffeatures\u002Fgpu-inference?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme)                  | ✅       | ✅                                 |\n| [Batching](https:\u002F\u002Flightning.ai\u002Fdocs\u002Flitserve\u002Ffeatures\u002Fbatching?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme)                              | ✅       | ✅                                 |\n| [Streaming](https:\u002F\u002Flightning.ai\u002Fdocs\u002Flitserve\u002Ffeatures\u002Fstreaming?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme)                            | ✅       | ✅                                 |\n| [Worker autoscaling](https:\u002F\u002Flightning.ai\u002Fdocs\u002Flitserve\u002Ffeatures\u002Fautoscaling?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme)                 | ✅       | ✅                                 |\n| [Serve all models: (LLMs, vision, etc.)](https:\u002F\u002Flightning.ai\u002Fdocs\u002Flitserve\u002Fexamples?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme)         | ✅       | ✅                                 |\n| [Supports PyTorch, JAX, TF, etc...](https:\u002F\u002Flightning.ai\u002Fdocs\u002Flitserve\u002Ffeatures\u002Ffull-control?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme) | ✅       | ✅                                 |\n| [OpenAPI compliant](https:\u002F\u002Fwww.openapis.org\u002F)                                                | ✅       | ✅                                 |\n| [Open AI compatibility](https:\u002F\u002Flightning.ai\u002Fdocs\u002Flitserve\u002Ffeatures\u002Fopen-ai-spec?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme)             | ✅       | ✅                                 |\n| [MCP server support](https:\u002F\u002Flightning.ai\u002Fdocs\u002Flitserve\u002Ffeatures\u002Fmcp?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme)                         | ✅       | ✅                                 |\n| [Asynchronous](https:\u002F\u002Flightning.ai\u002Fdocs\u002Flitserve\u002Ffeatures\u002Fasync-concurrency?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme)                 | ✅       | ✅                                 |\n| [Authentication](https:\u002F\u002Flightning.ai\u002Fdocs\u002Flitserve\u002Ffeatures\u002Fauthentication?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme)                  | ❌ DIY   | ✅ Token, password, custom         |\n| GPUs                             | ❌ DIY                             | ✅ 8+ GPU types, H100s from $1.75  |\n| Load balancing         
          | ❌                                 | ✅ Built-in                        |\n| Scale to zero (serverless)       | ❌                                 | ✅ No machine runs when idle       |\n| Autoscale up on demand           | ❌                                 | ✅ Auto scale up\u002Fdown              |\n| Multi-node inference             | ❌                                 | ✅ Distribute across nodes         |\n| Use AWS\u002FGCP credits              | ❌                                 | ✅ Use existing cloud commits      |\n| Versioning                       | ❌                                 | ✅ Make and roll back releases     |\n| Enterprise-grade uptime (99.95%) | ❌                                 | ✅ SLA-backed                      |\n| SOC2 \u002F HIPAA compliance          | ❌                                 | ✅ Certified & secure              |\n| Observability                    | ❌                                 | ✅ Built-in, connect 3rd party tools|\n| CI\u002FCD ready                      | ❌                                 | ✅ Lightning SDK                   |\n| 24\u002F7 enterprise support          | ❌                                 | ✅ Dedicated support               |\n| Cost controls & audit logs       | ❌                                 | ✅ Budgets, breakdowns, logs       |\n| Debug on GPUs                    | ❌                                 | ✅ Studio integration              |\n| [20+ features](https:\u002F\u002Flightning.ai\u002Fdocs\u002Flitserve\u002Ffeatures?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme)                    | -                                 | -                                  |\n\n\u003C\u002Fdiv>\n\n&nbsp;\n\n# Performance  \nLitServe is designed for AI workloads. Specialized multi-worker handling delivers a minimum **2x speedup over FastAPI**.    \n\nAdditional features like batching and GPU autoscaling can drive performance well beyond 2x, scaling efficiently to handle more simultaneous requests than FastAPI and TorchServe.\n    \nReproduce the full benchmarks [here](https:\u002F\u002Flightning.ai\u002Fdocs\u002Flitserve\u002Fhome\u002Fbenchmarks?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme) (higher is better).  \n\n\u003Cdiv align=\"center\">\n  \u003Cimg alt=\"LitServe\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FLightning-AI_LitServe_readme_c7e3ff3c2ccf.png\" width=\"1000px\" style=\"max-width: 100%;\">\n\u003C\u002Fdiv> \n\nThese results are for image and text classification ML tasks. The performance relationships hold for other ML tasks (embedding, LLM serving, audio, segmentation, object detection, summarization etc...).   \n    \n***💡 Note on LLM serving:*** For high-performance LLM serving (like Ollama\u002FvLLM), integrate [vLLM with LitServe](https:\u002F\u002Flightning.ai\u002Flightning-ai\u002Fstudios\u002Fdeploy-a-private-llama-3-2-rag-api?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme), use [LitGPT](https:\u002F\u002Fgithub.com\u002FLightning-AI\u002Flitgpt?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme#deploy-an-llm), or build your custom vLLM-like server with LitServe. 
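\n\nAs a rough illustration of the vLLM-inside-LitServe pattern (a hedged sketch: the model id and sampling settings are placeholders, and the `LLM`\u002F`SamplingParams` calls come from vLLM's public API, not from LitServe itself):\n\n```python\nimport litserve as ls\nfrom vllm import LLM, SamplingParams  # assumes vllm is installed\n\nclass VLLMEngine(ls.LitAPI):\n    def setup(self, device):\n        # vLLM manages kv-caching and GPU batching internally\n        self.llm = LLM(model=\"meta-llama\u002FLlama-3.2-1B-Instruct\")  # placeholder model id\n        self.params = SamplingParams(max_tokens=256)\n\n    def predict(self, request):\n        outputs = self.llm.generate([request[\"prompt\"]], self.params)\n        return {\"output\": outputs[0].outputs[0].text}\n\nif __name__ == \"__main__\":\n    server = ls.LitServer(VLLMEngine(), accelerator=\"auto\")\n    server.run(port=8000)\n```\n\n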
Optimizations like kv-caching, which can be done with LitServe, are needed to maximize LLM performance.\n\n&nbsp;\n\n\n# Community\nLitServe is a [community project accepting contributions](https:\u002F\u002Flightning.ai\u002Fdocs\u002Flitserve\u002Fcommunity?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme) - Let's make the world's most advanced AI inference engine.\n\n💬 [Get help on Discord](https:\u002F\u002Fdiscord.com\u002Finvite\u002FXncpTy7DSt)    \n📋 [License: Apache 2.0](https:\u002F\u002Fgithub.com\u002FLightning-AI\u002Flitserve\u002Fblob\u002Fmain\u002FLICENSE)    \n","\u003Cdiv align='center'>\n\n\u003Ch1>\n  使用纯 Python 构建自定义推理服务器\n  \u003Cbr\u002F>\n\u003C\u002Fh1> \n\u003Ch4>\n  精确定义模型、智能体、RAG 或流水线的推理工作方式。 \n  \u003Cbr\u002F>\n  无需 MLOps 桥接代码或配置文件，即可控制批处理、路由、流式传输和编排。\n\u003C\u002Fh4> \n\n\u003Cimg alt=\"Lightning\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FLightning-AI_LitServe_readme_a82fc7954f3f.png\" width=\"800px\" style=\"max-width: 100%;\">\n\n&nbsp; \n\u003C\u002Fdiv>\n\n\u003Cdiv align='center'>\n  \n\u003Cpre>\n✅ 自定义推理逻辑  ✅ 比 FastAPI 快 2 倍     ✅ 智能体、RAG、流水线等\n✅ 自定义逻辑 + 控制  ✅ 任意 PyTorch 模型          ✅ 自行托管或托管服务        \n✅ 多 GPU 自动扩展   ✅ 批处理 + 流式传输       ✅ 自备模型或 vLLM           \n✅ 无需 MLOps 桥接代码      ✅ Python 中轻松设置       ✅ 无服务器支持          \n\n\u003C\u002Fpre>\n\n\u003Cdiv align='center'>\n\n[![PyPI 下载量](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FLightning-AI_LitServe_readme_8990a26a8b9d.png)](https:\u002F\u002Fpepy.tech\u002Fprojects\u002Flitserve)\n[![Discord](https:\u002F\u002Fimg.shields.io\u002Fdiscord\u002F1077906959069626439?label=在 Discord 上获取帮助)](https:\u002F\u002Fdiscord.gg\u002FWajDThKAur)\n![cpu-tests](https:\u002F\u002Fgithub.com\u002FLightning-AI\u002Flitserve\u002Factions\u002Fworkflows\u002Fci-testing.yml\u002Fbadge.svg)\n[![codecov](https:\u002F\u002Fcodecov.io\u002Fgh\u002FLightning-AI\u002Flitserve\u002Fgraph\u002Fbadge.svg?token=SmzX8mnKlA)](https:\u002F\u002Fcodecov.io\u002Fgh\u002FLightning-AI\u002Flitserve)\n[![license](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F许可证-Apache%202.0-blue.svg)](https:\u002F\u002Fgithub.com\u002FLightning-AI\u002Flitserve\u002Fblob\u002Fmain\u002FLICENSE)\n\n\u003C\u002Fdiv>\n\u003C\u002Fdiv>\n\u003Cdiv align=\"center\">\n  \u003Cdiv style=\"text-align: center;\">\n    \u003Ca target=\"_blank\" href=\"#quick-start\" style=\"margin: 0 10px;\">快速入门\u003C\u002Fa> •\n    \u003Ca target=\"_blank\" href=\"#featured-examples\" style=\"margin: 0 10px;\">示例\u003C\u002Fa> •\n    \u003Ca target=\"_blank\" href=\"#features\" style=\"margin: 0 10px;\">功能\u003C\u002Fa> •\n    \u003Ca target=\"_blank\" href=\"#performance\" style=\"margin: 0 10px;\">性能\u003C\u002Fa> •\n    \u003Ca target=\"_blank\" href=\"#host-anywhere\" style=\"margin: 0 10px;\">托管\u003C\u002Fa> •\n    \u003Ca target=\"_blank\" href=\"https:\u002F\u002Flightning.ai\u002Fdocs\u002Flitserve?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\" style=\"margin: 0 10px;\">文档\u003C\u002Fa>\n  \u003C\u002Fdiv>\n\u003C\u002Fdiv>\n\n&nbsp;\n\n\u003Cdiv align=\"center\">\n\u003Ca target=\"_blank\" href=\"https:\u002F\u002Flightning.ai\u002Fdocs\u002Flitserve\u002Fhome\u002Fget-started?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\">\n  \u003Cimg src=\"https:\u002F\u002Fpl-bolts-doc-images.s3.us-east-2.amazonaws.com\u002Fapp-2\u002Fget-started-badge.svg\" height=\"36px\" alt=\"快速入门\"\u002F>\n\u003C\u002Fa>\n\u003C\u002Fdiv>\n\n&nbsp; \n\n# 为什么选择 
LitServe？\n大多数推理工具（如 vLLM 等）专为单一模型类型设计，并强制使用严格的抽象。它们在你需要自定义逻辑、多个模型、智能体或非标准流水线时就显得力不从心。LitServe 让你用 Python 编写自己的推理引擎。你可以定义请求如何处理、模型如何加载、批处理和路由如何运作，以及输出如何生成。LitServe 负责性能、并发、扩展和部署。用 LitServe 构建推理 API、智能体、聊天机器人、RAG 系统、MCP 服务器或多模型流水线。\n\n本地运行、自行托管或一键部署到 [Lightning AI](https:\u002F\u002Flightning.ai\u002Flitserve?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme)。\n\n&nbsp;\n\n# 想要最简单的推理托管方式？\n超过 38 万开发者使用 [Lightning Cloud](https:\u002F\u002Flightning.ai\u002F?utm_source=ptl_readme&utm_medium=referral&utm_campaign=ptl_readme)，这是运行 LitServe 最简单的方式，无需管理基础设施。只需一条命令即可部署，获得自动扩展 GPU、监控和免费 tier。无需云环境搭建。或者自行托管。\n\n# 快速入门\n\n通过 pip 安装 LitServe（[更多选项](https:\u002F\u002Flightning.ai\u002Fdocs\u002Flitserve\u002Fhome\u002Finstall))：\n\n```bash\npip install litserve\n```\n\n[示例 1](#inference-engine-example)：包含多个模型的玩具推理流水线。   \n[示例 2](#agent-example)：使用 OpenAI API 的最小智能体，用于获取新闻。    \n([高级示例](#featured-examples))：    \n\n### 推理引擎示例   \n\n```python\nimport litserve as ls\n\n# 定义 API，可包含任意数量的模型、数据库等...\nclass InferenceEngine(ls.LitAPI):\n    def setup(self, device):\n        self.text_model = lambda x: x**2\n        self.vision_model = lambda x: x**3\n\n    def predict(self, request):\n        x = request[\"input\"]    \n        # 使用两个模型进行计算\n        a = self.text_model(x)\n        b = self.vision_model(x)\n        c = a + b\n        return {\"output\": c}\n\nif __name__ == \"__main__\":\n    # 12+ 功能，如批处理、流式传输等...\n    server = ls.LitServer(InferenceEngine(max_batch_size=1), accelerator=\"auto\")\n    server.run(port=8000)\n```\n\n免费部署到 [Lightning Cloud](#hosting-options)（或自行托管）：\n\n```bash\n# 免费部署，带自动扩展、监控等...\nlightning deploy server.py --cloud\n\n# 或者本地运行（自行托管）\nlightning deploy server.py\n# python server.py\n```\n\n测试服务器：模拟 HTTP 请求（在任何终端运行）：\n```bash\ncurl -X POST http:\u002F\u002F127.0.0.1:8000\u002Fpredict -H \"Content-Type: application\u002Fjson\" -d '{\"input\": 4.0}'\n```\n\n### 智能体示例\n\n```python\nimport re, requests, openai\nimport litserve as ls\n\nclass NewsAgent(ls.LitAPI):\n    def setup(self, device):\n        self.openai_client = openai.OpenAI(api_key=\"OPENAI_API_KEY\")\n\n    def predict(self, request):\n        website_url = request.get(\"website_url\", \"https:\u002F\u002Ftext.npr.org\u002F\")\n        website_text = re.sub(r'\u003C[^>]+>', ' ', requests.get(website_url).text)\n\n        # 请求 LLM 告诉你最新新闻\n        llm_response = self.openai_client.chat.completions.create(\n           model=\"gpt-3.5-turbo\", \n           messages=[{\"role\": \"user\", \"content\": f\"根据这段内容，最新的消息是：{website_text}\"}],\n        )\n        output = llm_response.choices[0].message.content.strip()\n        return {\"output\": output}\n\nif __name__ == \"__main__\":\n    server = ls.LitServer(NewsAgent())\n    server.run(port=8000)\n```\n\n测试它：\n```bash\ncurl -X POST http:\u002F\u002F127.0.0.1:8000\u002Fpredict -H \"Content-Type: application\u002Fjson\" -d '{\"website_url\": \"https:\u002F\u002Ftext.npr.org\u002F\"}'\n```\n\n&nbsp;\n\n# 主要优势   \n\n一些主要优势：\n\n- **部署任意管道或模型**：代理、管道、RAG、聊天机器人、图像模型、视频、语音、文本等……\n- **无需MLOps胶水**：LitAPI让您在一个地方即可构建完整的AI系统（多模型、代理、RAG）([更多](https:\u002F\u002Flightning.ai\u002Fdocs\u002Flitserve\u002Fapi-reference\u002Flitapi?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme))。   \n- **即时搭建**：只需几行代码，通过`setup()`即可连接模型、数据库和数据([更多](https:\u002F\u002Flightning.ai\u002Fdocs\u002Flitserve\u002Fapi-reference\u002Flitapi?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme#setup))。    \n- 
**优化完善**：内置自动扩展、GPU支持及快速推理功能([更多](https:\u002F\u002Flightning.ai\u002Fdocs\u002Flitserve\u002Fapi-reference\u002Flitserver?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme))。    \n- **随处部署**：可自托管，也可通过Lightning一键部署([更多](https:\u002F\u002Flightning.ai\u002Fdocs\u002Flitserve\u002Ffeatures\u002Fdeploy-on-cloud?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme))。\n- **面向AI的FastAPI**：基于FastAPI构建，但针对AI进行了优化——在AI专用多工作线程处理下速度提升2倍([更多](#性能))。   \n- **适合专家使用**：可使用vLLM，也可自行构建，全面掌控批处理、缓存和逻辑([更多](https:\u002F\u002Flightning.ai\u002Flightning-ai\u002Fstudios\u002Fdeploy-a-private-llama-3-2-rag-api?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme))。    \n\n> ⚠️ 这并非开箱即用的vLLM或Ollama替代方案。如果您需要，LitServe为您提供更底层的灵活性，以构建它们所能做的事情（甚至更多）。  \n\n&nbsp;\n\n# 精选示例    \n以下是常见模型类型和用例的推理管道示例。      \n      \n\u003Cpre>\n\u003Cstrong>玩具模型：\u003C\u002Fstrong>      \u003Ca target=\"_blank\" href=\"#define-a-server\">Hello world\u003C\u002Fa>\n\u003Cstrong>大语言模型：\u003C\u002Fstrong>           \u003Ca target=\"_blank\" href=\"https:\u002F\u002Flightning.ai\u002Flightning-ai\u002Fstudios\u002Fdeploy-llama-3-2-vision-with-litserve?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\">Llama 3.2\u003C\u002Fa>, \u003Ca target=\"_blank\" href=\"https:\u002F\u002Flightning.ai\u002Flightning-ai\u002Fstudios\u002Fopenai-fault-tolerant-proxy-server?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\">LLM代理服务器\u003C\u002Fa>, \u003Ca target=\"_blank\" href=\"https:\u002F\u002Flightning.ai\u002Flightning-ai\u002Fstudios\u002Fdeploy-ai-agent-with-tool-use?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\">带工具使用的智能体\u003C\u002Fa>\n\u003Cstrong>RAG：\u003C\u002Fstrong>            \u003Ca target=\"_blank\" href=\"https:\u002F\u002Flightning.ai\u002Flightning-ai\u002Fstudios\u002Fdeploy-a-private-llama-3-2-rag-api?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\">vLLM RAG（Llama 3.2）\u003C\u002Fa>, \u003Ca target=\"_blank\" href=\"https:\u002F\u002Flightning.ai\u002Flightning-ai\u002Fstudios\u002Fdeploy-a-private-llama-3-1-rag-api?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\">RAG API（LlamaIndex）\u003C\u002Fa>\n\u003Cstrong>NLP：\u003C\u002Fstrong>            \u003Ca target=\"_blank\" href=\"https:\u002F\u002Flightning.ai\u002Flightning-ai\u002Fstudios\u002Fdeploy-any-hugging-face-model-instantly?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\">Hugging face\u003C\u002Fa>, \u003Ca target=\"_blank\" href=\"https:\u002F\u002Flightning.ai\u002Flightning-ai\u002Fstudios\u002Fdeploy-a-hugging-face-bert-model?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\">BERT\u003C\u002Fa>, \u003Ca target=\"_blank\" href=\"https:\u002F\u002Flightning.ai\u002Flightning-ai\u002Fstudios\u002Fdeploy-text-embedding-api-with-litserve?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\">文本嵌入API\u003C\u002Fa>\n\u003Cstrong>多模态：\u003C\u002Fstrong>     \u003Ca target=\"_blank\" href=\"https:\u002F\u002Flightning.ai\u002Flightning-ai\u002Fstudios\u002Fdeploy-open-ai-clip-with-litserve?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\">OpenAI Clip\u003C\u002Fa>, \u003Ca target=\"_blank\" 
href=\"https:\u002F\u002Flightning.ai\u002Flightning-ai\u002Fstudios\u002Fdeploy-a-multi-modal-llm-with-minicpm?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\">MiniCPM\u003C\u002Fa>, \u003Ca target=\"_blank\" href=\"https:\u002F\u002Flightning.ai\u002Flightning-ai\u002Fstudios\u002Fdeploy-phi3-5-vision-api-with-litserve?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\">Phi-3.5 Vision Instruct\u003C\u002Fa>, \u003Ca target=\"_blank\" href=\"https:\u002F\u002Flightning.ai\u002Fbhimrajyadav\u002Fstudios\u002Fdeploy-and-chat-with-qwen2-vl-using-litserve?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\">Qwen2-VL\u003C\u002Fa>, \u003Ca target=\"_blank\" href=\"https:\u002F\u002Flightning.ai\u002Flightning-ai\u002Fstudios\u002Fdeploy-a-multi-modal-llm-with-pixtral?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\">Pixtral\u003C\u002Fa>\n\u003Cstrong>音频：\u003C\u002Fstrong>          \u003Ca target=\"_blank\" href=\"https:\u002F\u002Flightning.ai\u002Flightning-ai\u002Fstudios\u002Fdeploy-open-ai-s-whisper-model?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\">Whisper\u003C\u002Fa>, \u003Ca target=\"_blank\" href=\"https:\u002F\u002Flightning.ai\u002Flightning-ai\u002Fstudios\u002Fdeploy-an-music-generation-api-with-meta-s-audio-craft?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\">AudioCraft\u003C\u002Fa>, \u003Ca target=\"_blank\" href=\"https:\u002F\u002Flightning.ai\u002Flightning-ai\u002Fstudios\u002Fdeploy-an-audio-generation-api?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\">StableAudio\u003C\u002Fa>, \u003Ca target=\"_blank\" href=\"https:\u002F\u002Flightning.ai\u002Flightning-ai\u002Fstudios\u002Fdeploy-a-noise-cancellation-api-with-deepfilternet?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\">降噪（DeepFilterNet）\u003C\u002Fa>\n\u003Cstrong>视觉：\u003C\u002Fstrong>         \u003Ca target=\"_blank\" href=\"https:\u002F\u002Flightning.ai\u002Flightning-ai\u002Fstudios\u002Fdeploy-a-private-api-for-stable-diffusion-2?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\">Stable diffusion 2\u003C\u002Fa>, \u003Ca target=\"_blank\" href=\"https:\u002F\u002Flightning.ai\u002Flightning-ai\u002Fstudios\u002Fdeploy-an-image-generation-api-with-auraflow?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\">AuraFlow\u003C\u002Fa>, \u003Ca target=\"_blank\" href=\"https:\u002F\u002Flightning.ai\u002Flightning-ai\u002Fstudios\u002Fdeploy-an-image-generation-api-with-flux?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\">Flux\u003C\u002Fa>, \u003Ca target=\"_blank\" href=\"https:\u002F\u002Flightning.ai\u002Flightning-ai\u002Fstudios\u002Fdeploy-a-super-resolution-image-api-with-aura-sr?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\">图像超分辨率（Aura SR）\u003C\u002Fa>,\n                \u003Ca target=\"_blank\" href=\"https:\u002F\u002Flightning.ai\u002Fbhimrajyadav\u002Fstudios\u002Fdeploy-background-removal-api-with-litserve?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\">背景移除\u003C\u002Fa>, \u003Ca target=\"_blank\" 
href=\"https:\u002F\u002Flightning.ai\u002Flightning-ai\u002Fstudios\u002Fdeploy-a-controlled-image-generation-api-controlnet?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\">控制稳定扩散（ControlNet）\u003C\u002Fa>\n\u003Cstrong>语音：\u003C\u002Fstrong>         \u003Ca target=\"_blank\" href=\"https:\u002F\u002Flightning.ai\u002Flightning-ai\u002Fstudios\u002Fdeploy-a-voice-clone-api-coqui-xtts-v2-model?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\">文本转语音（XTTS V2）\u003C\u002Fa>, \u003Ca target=\"_blank\" href=\"https:\u002F\u002Flightning.ai\u002Fbhimrajyadav\u002Fstudios\u002Fdeploy-a-speech-generation-api-using-parler-tts-powered-by-litserve?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\">Parler-TTS\u003C\u002Fa>\n\u003Cstrong>经典机器学习：\u003C\u002Fstrong>   \u003Ca target=\"_blank\" href=\"https:\u002F\u002Flightning.ai\u002Flightning-ai\u002Fstudios\u002Fdeploy-random-forest-with-litserve?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\">随机森林\u003C\u002Fa>, \u003Ca target=\"_blank\" href=\"https:\u002F\u002Flightning.ai\u002Flightning-ai\u002Fstudios\u002Fdeploy-xgboost-with-litserve?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\">XGBoost\u003C\u002Fa>\n\u003Cstrong>其他：\u003C\u002Fstrong>  \u003Ca target=\"_blank\" href=\"https:\u002F\u002Flightning.ai\u002Flightning-ai\u002Fstudios\u002Fdeploy-an-media-conversion-api-with-ffmpeg?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\">媒体转换API（ffmpeg）\u003C\u002Fa>, \u003Ca target=\"_blank\" href=\"https:\u002F\u002Flightning.ai\u002Flightning-ai\u002Fstudios\u002Fdeploy-both-pytorch-and-tensorflow-in-a-single-api?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\">PyTorch + TensorFlow合并在一个API中\u003C\u002Fa>, \u003Ca target=\"_blank\" href=\"https:\u002F\u002Flightning.ai\u002Flightning-ai\u002Fstudios\u002Fopenai-fault-tolerant-proxy-server?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme\">LLM代理服务器\u003C\u002Fa>\n\u003C\u002Fpre>\n\u003C\u002Fpre>\n\n[浏览100多个社区构建的模板](https:\u002F\u002Flightning.ai\u002Fstudios?section=serving&utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme)\n\n&nbsp;\n\n# 随处托管\n\n自行托管，尽享完全掌控；或借助[Lightning AI](https:\u002F\u002Flightning.ai\u002F?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme)，在几秒钟内完成部署，支持自动扩展、安全保障以及99.995%的正常运行时间。  \n**包含免费 tier，无需任何设置，直接在您的云端运行**  \n\n```bash\nlightning deploy server.py --cloud\n```\n\nhttps:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002Fff83dab9-0c9f-4453-8dcb-fb9526726344\n\n&nbsp;\n\n# 功能特性\n\n\u003Cdiv align='center'>\n\n| [功能](https:\u002F\u002Flightning.ai\u002Fdocs\u002Flitserve\u002Ffeatures?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme)               | 自行管理                      | [由 Lightning 全面托管](https:\u002F\u002Flightning.ai\u002Fdeploy?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme)         |\n|----------------------------------------------------------------------|-----------------------------------|------------------------------------|\n| 以 Docker 为先的部署          | ✅ DIY                             | ✅ 一键式部署                |\n| 成本                             | ✅ 免费（DIY）                      | ✅ 宽裕的[免费 tier](https:\u002F\u002Flightning.ai\u002Fpricing?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme)，按需付费     
           |\n| 完全控制                     | ✅                                 | ✅                                 |\n| 使用任意引擎（如 vLLM 等）      | ✅                                 | ✅ vLLM、Ollama、LitServe 等    |\n| 拥有专属 VPC                          | ✅（手动设置）                  | ✅ 连接您自己的 VPC            |\n| [比普通 FastAPI 快 2 倍以上](#性能)                                               | ✅       | ✅                                 |\n| [自带模型](https:\u002F\u002Flightning.ai\u002Fdocs\u002Flitserve\u002Ffeatures\u002Ffull-control?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme)              | ✅       | ✅                                 |\n| [构建复合系统（多个模型）](https:\u002F\u002Flightning.ai\u002Fdocs\u002Flitserve\u002Fhome?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme)                 | ✅       | ✅                                 |\n| [GPU 自动扩展](https:\u002F\u002Flightning.ai\u002Fdocs\u002Flitserve\u002Ffeatures\u002Fgpu-inference?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme)                  | ✅       | ✅                                 |\n| [批处理](https:\u002F\u002Flightning.ai\u002Fdocs\u002Flitserve\u002Ffeatures\u002Fbatching?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme)                              | ✅       | ✅                                 |\n| [流式传输](https:\u002F\u002Flightning.ai\u002Fdocs\u002Flitserve\u002Ffeatures\u002Fstreaming?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme)                            | ✅       | ✅                                 |\n| [工作节点自动扩展](https:\u002F\u002Flightning.ai\u002Fdocs\u002Flitserve\u002Ffeatures\u002Fautoscaling?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme)                 | ✅       | ✅                                 |\n| [服务所有模型：（大语言模型、视觉模型等）](https:\u002F\u002Flightning.ai\u002Fdocs\u002Flitserve\u002Fexamples?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme)         | ✅       | ✅                                 |\n| [支持 PyTorch、JAX、TF 等...](https:\u002F\u002Flightning.ai\u002Fdocs\u002Flitserve\u002Ffeatures\u002Ffull-control?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme) | ✅       | ✅                                 |\n| [符合 OpenAPI 规范](https:\u002F\u002Fwww.openapis.org\u002F)                                                | ✅       | ✅                                 |\n| [与 OpenAI 兼容](https:\u002F\u002Flightning.ai\u002Fdocs\u002Flitserve\u002Ffeatures\u002Fopen-ai-spec?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme)             | ✅       | ✅                                 |\n| [MCP 服务器支持](https:\u002F\u002Flightning.ai\u002Fdocs\u002Flitserve\u002Ffeatures\u002Fmcp?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme)                         | ✅       | ✅                                 |\n| [异步](https:\u002F\u002Flightning.ai\u002Fdocs\u002Flitserve\u002Ffeatures\u002Fasync-concurrency?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme)                 | ✅       | ✅                                 |\n| [身份验证](https:\u002F\u002Flightning.ai\u002Fdocs\u002Flitserve\u002Ffeatures\u002Fauthentication?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme)                  | ❌ DIY   | ✅ Token、密码、自定义         |\n| GPU                             | ❌ DIY                             | ✅ 支持 8 种以上 GPU 类型，H100 从 1.75 美元起 |\n| 负载均衡 
          | ❌                                 | ✅ 内置负载均衡                        |\n| 缩放至零（无服务器）       | ❌                                 | ✅ 空闲时无机器运行       |\n| 按需自动扩展           | ❌                                 | ✅ 自动扩缩容              |\n| 多节点推理             | ❌                                 | ✅ 分布式跨节点         |\n| 使用 AWS\u002FGCP 积分              | ❌                                 | ✅ 使用现有云积分      |\n| 版本控制                       | ❌                                 | ✅ 发布版本与回滚        |\n| 企业级正常运行时间（99.95%） | ❌                                 | ✅ SLA 保障                      |\n| SOC2 \u002F HIPAA 合规性          | ❌                                 | ✅ 认证且安全              |\n| 可观测性                    | ❌                                 | ✅ 内置，可对接第三方工具|\n| CI\u002FCD 就绪                      | ❌                                 | ✅ Lightning SDK                   |\n| 24\u002F7 企业级支持          | ❌                                 | ✅ 专属支持               |\n| 成本控制与审计日志       | ❌                                 | ✅ 预算、明细、日志       |\n| 在 GPU 上调试                    | ❌                                 | ✅ Studio 集成              |\n| [20+ 功能](https:\u002F\u002Flightning.ai\u002Fdocs\u002Flitserve\u002Ffeatures?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme)                    | -                                 | -                                  |\n\n\u003C\u002Fdiv>\n\n&nbsp;\n\n# 性能  \nLitServe 专为 AI 工作负载而设计。其专门的多 worker 处理方式可实现比 FastAPI 至少 **2 倍的提速**。  \n\n此外，批处理和 GPU 自动扩展等附加功能可将性能提升至 2 倍以上，高效扩展以支持比 FastAPI 和 TorchServe 更多的并发请求。  \n\n欲复现完整基准测试结果，请点击[此处](https:\u002F\u002Flightning.ai\u002Fdocs\u002Flitserve\u002Fhome\u002Fbenchmarks?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme)（数值越高越好）。  \n\n\u003Cdiv align=\"center\">\n  \u003Cimg alt=\"LitServe\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FLightning-AI_LitServe_readme_c7e3ff3c2ccf.png\" width=\"1000px\" style=\"max-width: 100%;\">\n\u003C\u002Fdiv>  \n\n这些结果针对的是图像和文本分类 ML 任务。对于其他 ML 任务（如嵌入、大语言模型推理、音频、分割、目标检测、摘要等），性能关系同样适用。  \n\n***💡 关于大语言模型推理的提示：*** 对于高性能的大语言模型推理（例如 Ollama\u002FvLLM），可将 [vLLM 与 LitServe 集成](https:\u002F\u002Flightning.ai\u002Flightning-ai\u002Fstudios\u002Fdeploy-a-private-llama-3-2-rag-api?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme)，使用 [LitGPT](https:\u002F\u002Fgithub.com\u002FLightning-AI\u002Flitgpt?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme#deploy-an-llm)，或借助 LitServe 打造您自己的 vLLM 类似服务器。要最大化大语言模型性能，还需采用诸如 kv 缓存之类的优化手段，而这些优化正是 LitServe 可提供的功能。\n\n&nbsp;\n\n\n# 社区  \nLitServe 是一个[接受贡献的社区项目](https:\u002F\u002Flightning.ai\u002Fdocs\u002Flitserve\u002Fcommunity?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme)——让我们共同打造全球最先进的 AI 推理引擎！\n\n💬 [在 Discord 上获取帮助](https:\u002F\u002Fdiscord.com\u002Finvite\u002FXncpTy7DSt)    \n📋 [许可证：Apache 2.0](https:\u002F\u002Fgithub.com\u002FLightning-AI\u002Flitserve\u002Fblob\u002Fmain\u002FLICENSE)","# LitServe 中文快速上手指南\n\n## 环境准备\n\n- **系统要求**：Linux \u002F macOS \u002F Windows（推荐 Linux）\n- **Python 版本**：3.8+\n- **前置依赖**：\n  - PyTorch（按 CUDA 版本选择对应的安装源）\n  - 可选：CUDA（如需 GPU 加速）\n\n通过 PyTorch 官方索引安装（示例为 cu118，对应 CUDA 11.8，可按需调整）：\n```bash\npip install torch torchvision torchaudio --index-url https:\u002F\u002Fdownload.pytorch.org\u002Fwhl\u002Fcu118\n```\n\n## 安装步骤\n\n使用 pip 安装 LitServe（推荐使用国内镜像加速）：\n```bash\npip install litserve -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple\n```\n\n## 基本使用\n\n创建 `server.py` 文件，定义最简推理服务：\n\n```python\nimport 
litserve as ls\n\nclass InferenceEngine(ls.LitAPI):\n    def setup(self, device):\n        self.model = lambda x: x ** 2  # 示例模型：平方运算\n\n    def predict(self, request):\n        x = request[\"input\"]\n        return {\"output\": self.model(x)}\n\nif __name__ == \"__main__\":\n    server = ls.LitServer(InferenceEngine(), accelerator=\"auto\")\n    server.run(port=8000)\n```\n\n启动服务：\n```bash\npython server.py\n```\n\n测试请求：\n```bash\ncurl -X POST http:\u002F\u002F127.0.0.1:8000\u002Fpredict -H \"Content-Type: application\u002Fjson\" -d '{\"input\": 4.0}'\n```\n\n响应示例：\n```json\n{\"output\": 16.0}\n```\n\n> 支持一键部署至 [Lightning AI](https:\u002F\u002Flightning.ai\u002Flitserve)：  \n> ```bash\n> lightning deploy server.py --cloud\n> ```","某AI创业公司正在开发一款智能客服系统，需同时调用多个模型：一个用于理解用户意图的分类模型、一个用于检索知识库的RAG模型、一个用于生成自然回复的LLM，并支持流式输出和动态批处理，以应对高峰时段的并发请求。\n\n### 没有 LitServe 时\n- 需要手动用 FastAPI 搭建多个端点，分别管理三个模型的加载与调用，代码冗长且耦合严重。\n- 批处理逻辑靠自己实现，无法自动合并相似请求，导致GPU利用率低，响应延迟高达800ms。\n- 流式输出需要额外编写异步生成器和HTTP流控制，调试困难，常出现断流或乱序。\n- 部署时需配置Nginx、Docker、Prometheus等MLOps组件，团队无专职运维，上线周期长达两周。\n- 想加入新模型或调整推理顺序时，必须重写整个服务架构，迭代成本极高。\n\n### 使用 LitServe 后\n- 用纯Python定义统一的 `LitAPI` 类，直接在单个文件中串联三个模型的调用逻辑，代码清晰可维护。\n- 自动启用动态批处理，系统在高并发下将10个请求合并为1批处理，GPU利用率从30%提升至85%，平均延迟降至320ms。\n- 仅需返回生成器即可启用流式响应，用户能实时看到回复逐字出现，体验接近人工对话。\n- 一键部署到Lightning AI，无需配置任何基础设施，自动获得GPU扩容和监控看板，上线时间缩短至2小时。\n- 新增一个情感分析模型只需在 `predict()` 方法中追加一行调用，无需重构服务或修改部署流程。\n\nLitServe 让AI工程师能像写函数一样构建生产级推理服务，把精力从工程杂务中解放出来，专注模型与用户体验的优化。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FLightning-AI_LitServe_a82fc795.png","Lightning-AI","⚡️ Lightning AI ","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002FLightning-AI_e518c84b.png","Turn ideas into AI, Lightning fast. Creators of PyTorch Lightning, Lightning AI Studio, TorchMetrics, Fabric, Lit-GPT, Lit-LLaMA",null,"LightningAI","https:\u002F\u002Flightning.ai\u002F","https:\u002F\u002Fgithub.com\u002FLightning-AI",[84,88],{"name":85,"color":86,"percentage":87},"Python","#3572A5",99.8,{"name":89,"color":90,"percentage":91},"Shell","#89e051",0.2,3859,278,"2026-04-05T18:40:06","Apache-2.0","Linux, macOS, Windows","需要 NVIDIA GPU，显存 8GB+，CUDA 11.7+","16GB+",{"notes":100,"python":101,"dependencies":102},"建议使用 conda 管理环境，首次运行可能需下载模型文件（大小依模型而定，可达数 GB）；支持自定义模型与 vLLM 集成，部署时可选择本地自托管或 Lightning AI 云平台","3.8+",[103,104,105,106,107,108,109],"torch","fastapi","uvicorn","pydantic","huggingface-hub","accelerate","openai",[13,53,51,14,15],[112,113,114,115,116,117,104,118,119],"ai","api","serving","artificial-intelligence","deep-learning","developer-tools","rest-api","web","2026-03-27T02:49:30.150509","2026-04-06T08:17:43.574590",[123,128,133,138,143,148],{"id":124,"question_zh":125,"answer_zh":126,"source_url":127},8990,"如何在单个 LitServe 服务器上支持多个端点（如 \u002Fembedding、\u002Fvlm\u002Fpredict）？","LitServe 已支持多端点功能，可通过定义不同的路由路径实现。每个端点可绑定独立的模型和处理逻辑，无需启动多个服务实例。具体使用方法请参考官方文档：https:\u002F\u002Flightning.ai\u002Fdocs\u002Flitserve\u002Ffeatures\u002Fmultiple-endpoints#multiple-routes-or-endpoint-paths。该功能已合并至主分支，并在 litserve==0.2.11a2 版本中可用。","https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fissues\u002F271",{"id":129,"question_zh":130,"answer_zh":131,"source_url":132},8991,"如何使用 LitServe 部署符合 OpenAI Embedding API 格式的自定义嵌入模型？","虽然 LitServe 当前仅原生支持 OpenAI Chat Completion API，但可通过 Pydantic 输入格式自定义实现 OpenAI 兼容的嵌入接口。你需要在 encode 和 decode 方法中手动构造符合 OpenAI Embedding API 的请求与响应结构（如 input、embedding 字段），并可自定义 decode_method 
","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FLightning-AI_LitServe_a82fc795.png","Lightning-AI","⚡️ Lightning AI ","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002FLightning-AI_e518c84b.png","Turn ideas into AI, Lightning fast. Creators of PyTorch Lightning, Lightning AI Studio, TorchMetrics, Fabric, Lit-GPT, Lit-LLaMA",null,"LightningAI","https:\u002F\u002Flightning.ai\u002F","https:\u002F\u002Fgithub.com\u002FLightning-AI",[84,88],{"name":85,"color":86,"percentage":87},"Python","#3572A5",99.8,{"name":89,"color":90,"percentage":91},"Shell","#89e051",0.2,3859,278,"2026-04-05T18:40:06","Apache-2.0","Linux, macOS, Windows","需要 NVIDIA GPU，显存 8GB+，CUDA 11.7+","16GB+",{"notes":100,"python":101,"dependencies":102},"建议使用 conda 管理环境，首次运行可能需下载模型文件（大小依模型而定，可达数 GB）；支持自定义模型与 vLLM 集成，部署时可选择本地自托管或 Lightning AI 云平台","3.8+",[103,104,105,106,107,108,109],"torch","fastapi","uvicorn","pydantic","huggingface-hub","accelerate","openai",[13,53,51,14,15],[112,113,114,115,116,117,104,118,119],"ai","api","serving","artificial-intelligence","deep-learning","developer-tools","rest-api","web","2026-03-27T02:49:30.150509","2026-04-06T08:17:43.574590",[123,128,133,138,143,148],{"id":124,"question_zh":125,"answer_zh":126,"source_url":127},8990,"如何在单个 LitServe 服务器上支持多个端点（如 \u002Fembedding、\u002Fvlm\u002Fpredict）？","LitServe 已支持多端点功能，可通过定义不同的路由路径实现。每个端点可绑定独立的模型和处理逻辑，无需启动多个服务实例。具体使用方法请参考官方文档：https:\u002F\u002Flightning.ai\u002Fdocs\u002Flitserve\u002Ffeatures\u002Fmultiple-endpoints#multiple-routes-or-endpoint-paths。该功能已合并至主分支，并在 litserve==0.2.11a2 版本中可用。","https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fissues\u002F271",{"id":129,"question_zh":130,"answer_zh":131,"source_url":132},8991,"如何使用 LitServe 部署符合 OpenAI Embedding API 格式的自定义嵌入模型？","虽然 LitServe 早期仅原生支持 OpenAI Chat Completion API，但可通过 Pydantic 输入格式自定义实现 OpenAI 兼容的嵌入接口。你需要在 encode 和 decode 方法中手动构造符合 OpenAI Embedding API 的请求与响应结构（如 input、embedding 字段），并可自定义 decode_method 来支持引导选择等高级功能。详细实现请参考：https:\u002F\u002Flightning.ai\u002Fdocs\u002Flitserve\u002Ffeatures\u002Frequest-response-format#pydantic-input 和 https:\u002F\u002Flightning.ai\u002Fdocs\u002Flitserve\u002Ffeatures\u002Fopen-ai-spec#override-decoderequest。（另据下方更新日志，较新版本已内置 OpenAIEmbeddingSpec，可优先考虑直接使用。）","https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fissues\u002F305",{"id":134,"question_zh":135,"answer_zh":136,"source_url":137},8992,"LitServe 是否支持模型空闲时自动卸载以节省 GPU 内存？","LitServe 本身暂未内置模型自动卸载功能，但 Lightning Studio 已提供 'scale to zero' 功能可实现类似效果。对于本地部署，建议通过外部工具（如容器编排 + 反向代理）管理模型生命周期，或在应用层实现请求触发的加载\u002F卸载逻辑。官方建议在资源受限环境下使用外部编排方案，而非在 LitServe 内部增加复杂性。","https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fissues\u002F304",{"id":139,"question_zh":140,"answer_zh":141,"source_url":142},8993,"当 num_api_servers > 1 时，请求计数为何不准确？","当 num_api_servers > 1 时，每个 Uvicorn 服务器实例都会重复添加 RequestCountMiddleware，导致请求被重复计数（第 N 个服务器会将请求计数 N 次）。此问题已在代码中修复，修复方案是确保每个服务器仅添加一次中间件。建议升级到最新版本以获得修复，或手动检查 server.py 中中间件的初始化逻辑，避免在循环中重复添加。","https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fissues\u002F602",{"id":144,"question_zh":145,"answer_zh":146,"source_url":147},8994,"能否自定义 LitServe 的默认 API 路径（如从 \u002Fpredict 改为 \u002Fapi\u002Fv1\u002Fpredict）？","早期 LitServe 不直接支持通过参数（如 endpoint_path）自定义全局 API 路径，但可通过多端点功能实现类似效果。例如，为模型绑定自定义路径如 '\u002Fapi\u002Fv1\u002Fpredict'，并利用 OpenAI Spec 或自定义路由配置实现路径重写。官方建议在有明确需求时再暴露该功能，当前推荐使用多端点机制作为替代方案。（另据 v0.2.11 更新日志，endpoint path 已移至 LitAPI 初始化参数，新版本可直接为每个 LitAPI 指定路径。）","https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fissues\u002F90",{"id":149,"question_zh":150,"answer_zh":151,"source_url":127},8995,"如何在 LitServe 中集成中间件（如身份验证、日志记录）？","LitServe 支持通过 FastAPI\u002FStarlette 的 ASGI 中间件机制集成自定义中间件。你可以在创建 LitServer 实例时，通过传入 middlewares 参数（注意为复数）注入兼容的中间件类，例如：server = LitServer(..., middlewares=[YourAuthMiddleware, YourLoggingMiddleware])，传入的应是中间件类而非实例。官方文档提供了中间件集成示例，适用于认证、跨域、请求日志等场景。
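\n\n下面补充一个极简示意（本文添加的草图：RequestLogMiddleware、EchoAPI 均为假设的示例名，中间件基于 Starlette 的 BaseHTTPMiddleware 实现，具体签名以官方文档为准）：\n\n```python\nimport litserve as ls\nfrom starlette.middleware.base import BaseHTTPMiddleware\n\nclass RequestLogMiddleware(BaseHTTPMiddleware):\n    # 假设的日志中间件：打印每个请求的路径\n    async def dispatch(self, request, call_next):\n        print(f\"request path: {request.url.path}\")\n        return await call_next(request)\n\nclass EchoAPI(ls.LitAPI):\n    def setup(self, device):\n        self.model = lambda x: x\n\n    def predict(self, request):\n        return {\"output\": self.model(request[\"input\"])}\n\nif __name__ == \"__main__\":\n    # 注意 middlewares 传入的是中间件类而非实例\n    server = ls.LitServer(EchoAPI(), middlewares=[RequestLogMiddleware])\n    server.run(port=8000)\n```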
",[153,158,163,168,173,178,183,188,193,198,203,208,213,218,223,228,233,238,243,248],{"id":154,"version":155,"summary_zh":156,"released_at":157},116103,"v0.2.17","[Lightning AI](https:\u002F\u002Flightning.ai\u002F) ⚡ is excited to announce the release of **LitServe v0.2.17**\r\n\r\n## Highlights\r\n\r\n### Automatic Worker Restart\r\n\r\nLitServe now supports automatic restarting of inference workers when they die, ensuring high availability and resilience in production environments. This prevents server shutdown due to isolated worker failures and maintains service continuity.\r\n\r\n```python\r\nimport litserve as ls\r\n\r\nserver = ls.LitServer(\r\n    MyAPI(),\r\n    restart_workers=True,  # Automatically restart failed workers\r\n    workers_per_device=4\r\n)\r\nserver.run()\r\n```\r\n\r\nWhen a worker terminates unexpectedly, the server automatically spawns a replacement, keeping requests flowing without interruption.\r\n\r\n## Changes\r\n\r\n\u003Cdetails open>\r\n\u003Csummary>Added\u003C\u002Fsummary>\r\n\r\n* Add support for restarting the inference worker when they die by @tchaton in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F624\r\n\r\n\u003C\u002Fdetails>\r\n\r\n\u003Cdetails open>\r\n\u003Csummary>Changed\u003C\u002Fsummary>\r\n\r\n* Update README to reflect inference engines terminology by @williamFalcon in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F625\r\n* chore: drop support for Python 3.9 by @bhimrazy in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F641\r\n\r\n\u003C\u002Fdetails>\r\n\r\n\u003Cdetails open>\r\n\u003Csummary>Fixed\u003C\u002Fsummary>\r\n\r\n* Add warning for dict\u002Fset outputs in batched predict to catch edge cases by @Copilot in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F612\r\n* fix(sdk): Reduce the quantity of warning emitted by @tchaton in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F631\r\n* fix(litServe): Use asyncio.sleep instead of time.sleep by @tchaton in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F633\r\n* fix(cli): `lightning-sdk` installation process with `uv` by @bhimrazy in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F640\r\n* Fix `on_request callback` not triggering for API specs by @bhimrazy in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F642\r\n\r\n\u003C\u002Fdetails>\r\n\r\n\u003Cdetails>\r\n\u003Csummary>Chores\u003C\u002Fsummary>\r\n\r\n* Bump the gha-updates group with 2 updates by @dependabot[bot] in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F626\r\n* [pre-commit.ci] pre-commit suggestions by @pre-commit-ci[bot] in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F629\r\n* feat(litServe): Bump version 0.2.17 by @tchaton in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F632\r\n* Bump actions\u002Fcheckout from 5 to 6 in the gha-updates group by @dependabot[bot] in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F636\r\n* [pre-commit.ci] pre-commit suggestions by @pre-commit-ci[bot] in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F637\r\n* Bump mypy from 1.18.2 to 1.19.0 by @dependabot[bot] in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F639\r\n\r\n\u003C\u002Fdetails>\r\n\r\n## 🧑‍💻 Contributors\r\n\r\nThank you ❤️ to all contributors for making LitServe better!\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fcompare\u002Fv0.2.16...v0.2.17","2025-12-23T19:39:23",{"id":159,"version":160,"summary_zh":161,"released_at":162},116104,"v0.2.16","## What's Changed\r\n* bump linting to min python version py3.9 by @Borda in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F597\r\n* enable testing with minimal requirements by 
@Borda in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F596\r\n* Fix duplicate\u002Fmultiple middleware initialization by @geeksambhu in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F601\r\n* Support async `LitAPI.health()` and await it in `\u002Fhealth` by @KAVYANSHTYAGI in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F604\r\n* fix: Swagger UI message print when `disable_openapi_url=False` by @bhimrazy in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F620\r\n* fix\u002Freq-middleware-duplication by @bhimrazy in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F618\r\n\r\n\r\n## New Contributors\r\n* @geeksambhu made their first contribution in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F601\r\n* @KAVYANSHTYAGI made their first contribution in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F604\r\n* @Abdul-0x4A made their first contribution in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F621\r\n* @dmitsf made their first contribution in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F623\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fcompare\u002Fv0.2.15...v0.2.16","2025-10-14T17:02:44",{"id":164,"version":165,"summary_zh":166,"released_at":167},116105,"v0.2.15","## What's Changed\r\n* ci: add testing cron by @Borda in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F585\r\n* fix(ci): handle sentinel input in request_queue gracefully  by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F589\r\n* pytest uses just one config by @Borda in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F586\r\n* handle invalid operation in zmq transport by @emmanuel-ferdman in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F591\r\n* feat(litserve): Add support for loading TLS certificates for the user by @tchaton in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F592\r\n* Release 0.2.15 by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F593\r\n\r\n## New Contributors\r\n* @emmanuel-ferdman made their first contribution in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F591\r\n* @tchaton made their first contribution in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F592\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fcompare\u002Fv0.2.14...v0.2.15","2025-07-31T11:46:23",{"id":169,"version":170,"summary_zh":171,"released_at":172},116106,"v0.2.14","## What's Changed\r\n* Bump mypy from 1.16.0 to 1.16.1 by @dependabot[bot] in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F565\r\n* support `mcp` package  less than v1.10.0 by @rongfengliang in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F567\r\n* fix: OpenAIEmbeddingSpec setup check for multi endpoint by @rongfengliang in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F568\r\n* [pre-commit.ci] pre-commit suggestions by @pre-commit-ci[bot] in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F571\r\n* Fix: pre-commit errors on `main` by @bhimrazy in 
https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F574\r\n* refactor:  validation logic to pre_setup method in Embed Spec, where there is access to correct api instance by @bhimrazy in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F573\r\n* uv for CI - faster CI by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F563\r\n* feat: openapi url by @lorenzomassimiani in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F578\r\n* Feat\u002Foverride-spec-api-path by @bhimrazy in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F577\r\n* Pre-release 0.2.14a0 by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F579\r\n* migrate arguments from LitServe to LitAPI in tests by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F576\r\n* chore: Add warnings for custom API paths in OpenAI Chat and Embed specs by @bhimrazy in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F581\r\n* 🚀 Feature: warning for heavy __init__ method in LitAPI by @SN4KEBYTE in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F582\r\n* Release 0.2.14 by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F583\r\n\r\n## New Contributors\r\n* @rongfengliang made their first contribution in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F567\r\n* @SN4KEBYTE made their first contribution in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F582\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fcompare\u002Fv0.2.13...v0.2.14","2025-07-22T10:04:13",{"id":174,"version":175,"summary_zh":176,"released_at":177},116107,"v0.2.13","## What's Changed\r\n* Add mcp support in README by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F540\r\n* Add Dependabot for Pip & GitHub Actions by @Borda in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F541\r\n* Comprehensive docstrings by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F545\r\n* Update numpy requirement from \u003C2.0 to \u003C3.0 by @dependabot in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F543\r\n* Bump mypy from 1.11.2 to 1.16.0 by @dependabot in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F544\r\n* Bump the gha-updates group with 2 updates by @dependabot in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F542\r\n* Add reasoning effort parameter to OpenAI Spec ChatCompletionRequest  by @bhimrazy in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F548\r\n* Fix async streaming with OpenAISpec by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F552\r\n* add test for async-sync function invocation handler by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F553\r\n* add pytest marker for unit, integration and e2e tests by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F554\r\n* Update LitServer initialization parameters for type safety by @aniketmaurya in 
https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F555\r\n* Send blocking CPU operations to thread for async conversion.  by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F556\r\n* Release 0.2.13rc1 by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F557\r\n* return unmodified request for OpenAI chatcompletion decode_request by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F558\r\n* MCP package dependency check by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F561\r\n* Improve error handling and logging for streaming by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F562\r\n* Release 0.2.13 by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F564\r\n\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fcompare\u002Fv0.2.12...v0.2.13","2025-07-01T19:44:02",{"id":179,"version":180,"summary_zh":181,"released_at":182},116108,"v0.2.13rc1","## What's Changed\r\n* Add mcp support in README by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F540\r\n* Add Dependabot for Pip & GitHub Actions by @Borda in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F541\r\n* Comprehensive docstrings by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F545\r\n* Update numpy requirement from \u003C2.0 to \u003C3.0 by @dependabot in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F543\r\n* Bump mypy from 1.11.2 to 1.16.0 by @dependabot in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F544\r\n* Bump the gha-updates group with 2 updates by @dependabot in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F542\r\n* Add reasoning effort parameter to OpenAI Spec ChatCompletionRequest  by @bhimrazy in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F548\r\n* Fix async streaming with OpenAISpec by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F552\r\n* add test for async-sync function invocation handler by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F553\r\n* add pytest marker for unit, integration and e2e tests by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F554\r\n* Update LitServer initialization parameters for type safety by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F555\r\n* Send blocking CPU operations to thread for async conversion.  
by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F556\r\n* Release 0.2.13rc1 by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F557\r\n\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fcompare\u002Fv0.2.12...v0.2.13rc1","2025-06-18T17:00:15",{"id":184,"version":185,"summary_zh":186,"released_at":187},116109,"v0.2.12","## What's Changed\r\n* Docs: Address text-davinci-003 deprecation and new API structure in NewsAgent example by @kumarrah2002 in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F521\r\n* Add tests for async streaming loops by @bhimrazy in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F522\r\n* [pre-commit.ci] pre-commit suggestions by @pre-commit-ci in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F523\r\n* ci: Increase CI test timeout to 15 minutes by @bhimrazy in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F526\r\n* Enhance process and thread naming in LitServer by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F528\r\n* Fix health and info endpoints when multiple LitAPIs are specified by @vrdn-23 in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F529\r\n* Pre-release 0.2.12.dev0 by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F531\r\n* created shutdown endpoint with API key security and custom passed tests by @kumarrah2002 in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F525\r\n* chore: Update CODEOWNERS by @andyland in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F533\r\n* Shutdown server when workers crash by @andyland in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F532\r\n* Improve perf test connection pool by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F537\r\n* make dependency installation check as utility function by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F535\r\n* input schema extraction for MCP server by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F536\r\n* Enable MCP server by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F534\r\n* Release 0.2.12 by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F539\r\n\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fcompare\u002F0.2.11...v0.2.12","2025-06-11T11:43:06",{"id":189,"version":190,"summary_zh":191,"released_at":192},116110,"v0.2.12.dev0","## What's Changed\r\n* Docs: Address text-davinci-003 deprecation and new API structure in NewsAgent example by @kumarrah2002 in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F521\r\n* Add tests for async streaming loops by @bhimrazy in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F522\r\n* [pre-commit.ci] pre-commit suggestions by @pre-commit-ci in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F523\r\n* ci: Increase CI test timeout to 15 minutes by @bhimrazy in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F526\r\n* Enhance process and 
thread naming in LitServer by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F528\r\n* Fix health and info endpoints when multiple LitAPIs are specified by @vrdn-23 in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F529\r\n* Pre-release 0.2.12.dev0 by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F531\r\n* created shutdown endpoint with API key security and custom passed tests by @kumarrah2002 in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F525\r\n\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fcompare\u002F0.2.11...v0.2.12.dev0","2025-06-05T14:17:51",{"id":194,"version":195,"summary_zh":196,"released_at":197},116111,"0.2.11","## What's Changed\r\n* Remove un used imports and use enum by @mo7amed-3bdalla7 in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F493\r\n* moving max_batch_size in README to inside SimpleLitAPI() by @kumarrah2002 in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F495\r\n* rename deploy command by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F497\r\n* missing fstring typo by @mathematicalmichael in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F498\r\n* fix: OpenAI Spec validations to work with async LitAPI by @bhimrazy in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F499\r\n* chore: Update CODEOWNERS to include additional reviewers by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F503\r\n* fix OpenAI embedding spec for batching by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F500\r\n* Release 0.2.11a0 by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F504\r\n* add shutdown endpoint w\u002F test by @kumarrah2002 in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F507\r\n* Revert \"add shutdown endpoint w\u002F test\" by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F509\r\n* add fpdb for multiprocess debugging using pdb by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F508\r\n* Enhance logging configuration to support optional Rich logging by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F510\r\n* Release 0.2.11a1 by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F511\r\n* move stream, endpoint path, loop to LitAPI initialization  by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F512\r\n* Support multiple LitAPIs for inference process and endpoints by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F513\r\n* Release 0.2.11a2 by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F516\r\n* remove decode and encode methods from README by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F515\r\n* decouple request handler and add test by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F517\r\n* Support stream with non-stream LitAPIs by @aniketmaurya in 
https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F518\r\n* Improve developer experience by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F519\r\n\r\n## New Contributors\r\n* @mo7amed-3bdalla7 made their first contribution in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F493\r\n* @kumarrah2002 made their first contribution in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F495\r\n* @mathematicalmichael made their first contribution in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F498\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fcompare\u002Fv0.2.10...0.2.11","2025-05-29T16:03:05",{"id":199,"version":200,"summary_zh":201,"released_at":202},116112,"v0.2.11.a2","## What's Changed\r\n* move stream, endpoint path, loop to LitAPI initialization  by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F512\r\n* Support multiple LitAPIs for inference process and endpoints by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F513\r\n* Release 0.2.11a2 by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F516\r\n* remove decode and encode methods from README by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F515\r\n\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fcompare\u002Fv0.2.11a1...v0.2.11.a2","2025-05-27T19:16:04",{"id":204,"version":205,"summary_zh":206,"released_at":207},116113,"v0.2.11a1","## What's Changed\r\n* add shutdown endpoint w\u002F test by @kumarrah2002 in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F507\r\n* Revert \"add shutdown endpoint w\u002F test\" by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F509\r\n* add fpdb for multiprocess debugging using pdb by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F508\r\n* Enhance logging configuration to support optional Rich logging by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F510\r\n* Release 0.2.11a1 by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F511\r\n\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fcompare\u002F0.2.11a0...v0.2.11a1","2025-05-23T21:57:04",{"id":209,"version":210,"summary_zh":211,"released_at":212},116114,"0.2.11a0","## What's Changed\r\n* Remove un used imports and use enum by @mo7amed-3bdalla7 in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F493\r\n* moving max_batch_size in README to inside SimpleLitAPI() by @kumarrah2002 in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F495\r\n* rename deploy command by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F497\r\n* missing fstring typo by @mathematicalmichael in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F498\r\n* fix: OpenAI Spec validations to work with async LitAPI by @bhimrazy in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F499\r\n* chore: Update CODEOWNERS to include additional reviewers by @aniketmaurya in 
https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F503\r\n* fix OpenAI embedding spec for batching by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F500\r\n* Release 0.2.11a0 by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F504\r\n\r\n## New Contributors\r\n* @mo7amed-3bdalla7 made their first contribution in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F493\r\n* @kumarrah2002 made their first contribution in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F495\r\n* @mathematicalmichael made their first contribution in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F498\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fcompare\u002Fv0.2.10...0.2.11a0","2025-05-19T11:36:37",{"id":214,"version":215,"summary_zh":216,"released_at":217},116115,"v0.2.10","## What's Changed\r\n* Add Exception handling test for Async LitAPI Loops by @bhimrazy in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F483\r\n* Save references for async tasks to prevent tasks disappearing mid execution by @bhimrazy in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F482\r\n* Add tests for async loop processing by @bhimrazy in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F485\r\n* fix inference process termination by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F486\r\n* Fix: CLI entry point to use lightning_sdk directly by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F487\r\n* Remove asyncio.sleep and run in threadpool by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F489\r\n* Enable true concurrency in async streaming loop and add tests by @bhimrazy in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F488\r\n\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fcompare\u002Fv0.2.9...v0.2.10","2025-05-13T11:46:07",{"id":219,"version":220,"summary_zh":221,"released_at":222},116116,"v0.2.9","## What's Changed\r\n* ci: take back testing with minimal package versions by @Borda in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F472\r\n* Move batch size to LitAPI by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F468\r\n* feat: add metadata to the ChatCompletionRequest by @Danidapena in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F473\r\n* chore: fix link by @lianakoleva in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F474\r\n* remove request_timeout from LitAPI.pre_setup(...) 
by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F475\r\n* Fix Windows Threading Issues  by @FrsECM in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F385\r\n* [pre-commit.ci] pre-commit suggestions by @pre-commit-ci in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F476\r\n* enable asynchronous processing in LitAPI by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F477\r\n* async support for streaming loop by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F478\r\n* Enable true concurrency in async loop by @bhimrazy in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F479\r\n* Release 0.2.9 by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F480\r\n\r\n## New Contributors\r\n* @Danidapena made their first contribution in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F473\r\n* @lianakoleva made their first contribution in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F474\r\n* @FrsECM made their first contribution in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F385\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fcompare\u002Fv0.2.8...v0.2.9","2025-05-08T13:37:36",{"id":224,"version":225,"summary_zh":226,"released_at":227},116117,"v0.2.9.dev0","## What's Changed\r\n* ci: take back testing with minimal package versions by @Borda in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F472\r\n* Move batch size to LitAPI by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F468\r\n* feat: add metadata to the ChatCompletionRequest by @Danidapena in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F473\r\n* chore: fix link by @lianakoleva in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F474\r\n* remove request_timeout from LitAPI.pre_setup(...) 
by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F475\r\n* Fix Windows Threading Issues  by @FrsECM in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F385\r\n* [pre-commit.ci] pre-commit suggestions by @pre-commit-ci in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F476\r\n* enable asynchronous processing in LitAPI by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F477\r\n\r\n## New Contributors\r\n* @Danidapena made their first contribution in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F473\r\n* @lianakoleva made their first contribution in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F474\r\n* @FrsECM made their first contribution in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F385\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fcompare\u002Fv0.2.8...v0.2.9.dev0","2025-05-07T06:02:47",{"id":229,"version":230,"summary_zh":231,"released_at":232},116118,"v0.2.8","## What's Changed\r\n* Encapsulate multiprocessing communication by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F419\r\n* fix context to have new object for each request by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F451\r\n* fix: Starlette dependency issue by @deependujha in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F456\r\n* Document server hosting by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F458\r\n* Update README.md by @williamFalcon in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F459\r\n* Update lightning serve CLI by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F460\r\n* Add --local flag to lightning serve command in minimal_run.py by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F461\r\n* nitpick: minor refactoring by @deependujha in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F457\r\n* remove starlette from the main dependency by @bhimrazy in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F464\r\n* [pre-commit.ci] pre-commit suggestions by @pre-commit-ci in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F465\r\n* Remove warning for inactive request counter in `active_requests` method by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F466\r\n* keep \"minimal dependency CI\" minimal by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F469\r\n* Retire ubuntu 20.04 by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F470\r\n* Release 0.2.8 by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F471\r\n\r\n## New Contributors\r\n* @deependujha made their first contribution in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F456\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fcompare\u002Fv0.2.7...v0.2.8","2025-04-22T22:12:52",{"id":234,"version":235,"summary_zh":236,"released_at":237},116119,"v0.2.8.dev0","## What's 
Changed\r\n* Encapsulate multiprocessing communication by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F419\r\n* fix context to have new object for each request by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F451\r\n* fix: Starlette dependency issue by @deependujha in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F456\r\n* Document server hosting by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F458\r\n* Update README.md by @williamFalcon in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F459\r\n* Update lightning serve CLI by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F460\r\n\r\n## New Contributors\r\n* @deependujha made their first contribution in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F456\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fcompare\u002Fv0.2.7...v0.2.8.dev0","2025-04-01T13:27:09",{"id":239,"version":240,"summary_zh":241,"released_at":242},116120,"v0.2.7","## What's Changed\r\n* improve tests by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F418\r\n* [pre-commit.ci] pre-commit suggestions by @pre-commit-ci in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F421\r\n* git: Handle None values in ChatMessage content by @Lucaz0619 in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F422\r\n* fix custom exceptions by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F425\r\n* fix: reponse format JSONSchema key fix by @Lucaz0619 in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F427\r\n* fix: ensure proper cleanup of processes in `wrap_litserve_start` by @bhimrazy in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F432\r\n* fix async continuous batching  by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F429\r\n* [fix] continuous batching - fix prefill by @ali-alshaar7 in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F433\r\n* Feat: add custom health check logic by @bhimrazy in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F430\r\n* fix: Replace deprecated `dict` method with `model_dump` in OpenAI Spec decode step   by @bhimrazy in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F434\r\n* fix: update type hints for _default_unbatch and _spec attributes in LitAPI class  by @bhimrazy in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F435\r\n* Feature: 🚀 Add Audio Content Support to OpenAISpec Request by @bhimrazy in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F439\r\n* hotfix: pin starlette dependency by @bhimrazy in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F445\r\n* [pre-commit.ci] pre-commit suggestions by @pre-commit-ci in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F442\r\n* feat: add lightning cli dynamically  by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F446\r\n* Release 0.2.7 by @aniketmaurya in 
https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F447\r\n\r\n## New Contributors\r\n* @Lucaz0619 made their first contribution in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F422\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fcompare\u002Fv0.2.6...v0.2.7","2025-03-07T14:18:05",{"id":244,"version":245,"summary_zh":246,"released_at":247},116121,"v0.2.7.dev0","## What's Changed\r\n* improve test timeouts by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F418\r\n* [pre-commit.ci] pre-commit suggestions by @pre-commit-ci in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F421\r\n* git: Handle None values in ChatMessage content by @Lucaz0619 in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F422\r\n* fix custom exceptions by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F425\r\n* fix: reponse format JSONSchema key fix by @Lucaz0619 in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F427\r\n* fix: ensure proper cleanup of processes in `wrap_litserve_start` by @bhimrazy in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F432\r\n* fix async continuous batching  by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F429\r\n* [fix] continuous batching - fix prefill by @ali-alshaar7 in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F433\r\n* Feat: add custom health check logic by @bhimrazy in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F430\r\n\r\n## New Contributors\r\n* @Lucaz0619 made their first contribution in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F422\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fcompare\u002Fv0.2.6...v0.2.7.dev0","2025-02-20T12:38:02",{"id":249,"version":250,"summary_zh":251,"released_at":252},116122,"v0.2.6","## What's Changed\r\n* feat: info route by @lorenzomassimiani in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F368\r\n* Fix CI: async tests with ASGITransport by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F376\r\n* Fix: Replace Deprecated `max_tokens` with `max_completion_tokens` in OpenAI Spec by @rittik9 in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F375\r\n* feat: Customizable Loops 1\u002Fn by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F374\r\n* customizable loop - wire up Loops to LitServer 2\u002Fn by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F378\r\n* Improve CI: retry flaky tests by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F379\r\n* [pre-commit.ci] pre-commit suggestions by @pre-commit-ci in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F377\r\n* check device format while initialising litserver by @ali-alshaar7 in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F380\r\n* Release 0.2.6.dev0 by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F383\r\n* Update PR template  by @rittik9 in 
https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F381\r\n* Include user field and `base64` literal for encoding_format  by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F388\r\n* Improve error handling and debugging experience by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F389\r\n* improved logging with sensible defaults by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F391\r\n* add continuous batching loop 1\u002Fn by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F387\r\n* Add `loop.pre_setup` to allow fine-grained LitAPI validation based on inference loop by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F393\r\n* Make `LitAPI.predict` optional and validate API implementation by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F394\r\n* Fix OpenAISpec with continuous batching loop by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F395\r\n* add tests for continuous batching and Default loops by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F396\r\n* Set LitServer.stream using LitSpec.stream by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F398\r\n* fix openai usage info for non-streaming response by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F399\r\n* [pre-commit.ci] pre-commit suggestions by @pre-commit-ci in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F400\r\n* Async continuous batching loop by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F401\r\n* add validation for `stream=False` with `yield` usage by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F402\r\n* fix callback runner to execute after predict by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F406\r\n* integrate zmq by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F403\r\n* warn users when predict\u002Funbatch output length is not same as  #requests by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F408\r\n* move built in loops inside classes by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F409\r\n* add justus and thomas as codeowners by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F410\r\n* enable multiple workers for ZMQ by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F411\r\n* Fix: Add Callback Events and Align Hooks in Streaming Loop by @bhimrazy in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F407\r\n* bump: `Lightning-AI\u002Futilities` used `main` by @Borda in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F415\r\n* Release v0.2.6 by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F413\r\n* fix: don't start zmq when fast_queue=false by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F417\r\n* fix release ci 
by @aniketmaurya in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F416\r\n\r\n## New Contributors\r\n* @rittik9 made their first contribution in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F375\r\n* @ali-alshaar7 made their first contribution in https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fpull\u002F380\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002FLightning-AI\u002FLitServe\u002Fcompare\u002Fv0.2.5...v0.2.6","2025-01-16T18:34:41"]