[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-SeldonIO--MLServer":3,"tool-SeldonIO--MLServer":64},[4,17,27,35,43,56],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":16},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,3,"2026-04-05T11:01:52",[13,14,15],"开发框架","图像","Agent","ready",{"id":18,"name":19,"github_repo":20,"description_zh":21,"stars":22,"difficulty_score":23,"last_commit_at":24,"category_tags":25,"status":16},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",138956,2,"2026-04-05T11:33:21",[13,15,26],"语言模型",{"id":28,"name":29,"github_repo":30,"description_zh":31,"stars":32,"difficulty_score":23,"last_commit_at":33,"category_tags":34,"status":16},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",107662,"2026-04-03T11:11:01",[13,14,15],{"id":36,"name":37,"github_repo":38,"description_zh":39,"stars":40,"difficulty_score":23,"last_commit_at":41,"category_tags":42,"status":16},3704,"NextChat","ChatGPTNextWeb\u002FNextChat","NextChat 是一款轻量且极速的 AI 助手，旨在为用户提供流畅、跨平台的大模型交互体验。它完美解决了用户在多设备间切换时难以保持对话连续性，以及面对众多 AI 模型不知如何统一管理的痛点。无论是日常办公、学习辅助还是创意激发，NextChat 都能让用户随时随地通过网页、iOS、Android、Windows、MacOS 或 Linux 端无缝接入智能服务。\n\n这款工具非常适合普通用户、学生、职场人士以及需要私有化部署的企业团队使用。对于开发者而言，它也提供了便捷的自托管方案，支持一键部署到 Vercel 或 Zeabur 等平台。\n\nNextChat 的核心亮点在于其广泛的模型兼容性，原生支持 Claude、DeepSeek、GPT-4 及 Gemini Pro 等主流大模型，让用户在一个界面即可自由切换不同 AI 能力。此外，它还率先支持 MCP（Model Context Protocol）协议，增强了上下文处理能力。针对企业用户，NextChat 提供专业版解决方案，具备品牌定制、细粒度权限控制、内部知识库整合及安全审计等功能，满足公司对数据隐私和个性化管理的高标准要求。",87618,"2026-04-05T07:20:52",[13,26],{"id":44,"name":45,"github_repo":46,"description_zh":47,"stars":48,"difficulty_score":23,"last_commit_at":49,"category_tags":50,"status":16},2268,"ML-For-Beginners","microsoft\u002FML-For-Beginners","ML-For-Beginners 是由微软推出的一套系统化机器学习入门课程，旨在帮助零基础用户轻松掌握经典机器学习知识。这套课程将学习路径规划为 12 周，包含 26 节精炼课程和 52 道配套测验，内容涵盖从基础概念到实际应用的完整流程，有效解决了初学者面对庞大知识体系时无从下手、缺乏结构化指导的痛点。\n\n无论是希望转型的开发者、需要补充算法背景的研究人员，还是对人工智能充满好奇的普通爱好者，都能从中受益。课程不仅提供了清晰的理论讲解，还强调动手实践，让用户在循序渐进中建立扎实的技能基础。其独特的亮点在于强大的多语言支持，通过自动化机制提供了包括简体中文在内的 50 
多种语言版本，极大地降低了全球不同背景用户的学习门槛。此外，项目采用开源协作模式，社区活跃且内容持续更新，确保学习者能获取前沿且准确的技术资讯。如果你正寻找一条清晰、友好且专业的机器学习入门之路，ML-For-Beginners 将是理想的起点。",84991,"2026-04-05T10:45:23",[14,51,52,53,15,54,26,13,55],"数据工具","视频","插件","其他","音频",{"id":57,"name":58,"github_repo":59,"description_zh":60,"stars":61,"difficulty_score":10,"last_commit_at":62,"category_tags":63,"status":16},3128,"ragflow","infiniflow\u002Fragflow","RAGFlow 是一款领先的开源检索增强生成（RAG）引擎，旨在为大语言模型构建更精准、可靠的上下文层。它巧妙地将前沿的 RAG 技术与智能体（Agent）能力相结合，不仅支持从各类文档中高效提取知识，还能让模型基于这些知识进行逻辑推理和任务执行。\n\n在大模型应用中，幻觉问题和知识滞后是常见痛点。RAGFlow 通过深度解析复杂文档结构（如表格、图表及混合排版），显著提升了信息检索的准确度，从而有效减少模型“胡编乱造”的现象，确保回答既有据可依又具备时效性。其内置的智能体机制更进一步，使系统不仅能回答问题，还能自主规划步骤解决复杂问题。\n\n这款工具特别适合开发者、企业技术团队以及 AI 研究人员使用。无论是希望快速搭建私有知识库问答系统，还是致力于探索大模型在垂直领域落地的创新者，都能从中受益。RAGFlow 提供了可视化的工作流编排界面和灵活的 API 接口，既降低了非算法背景用户的上手门槛，也满足了专业开发者对系统深度定制的需求。作为基于 Apache 2.0 协议开源的项目，它正成为连接通用大模型与行业专有知识之间的重要桥梁。",77062,"2026-04-04T04:44:48",[15,14,13,26,54],{"id":65,"github_repo":66,"name":67,"description_en":68,"description_zh":69,"ai_summary_zh":69,"readme_en":70,"readme_zh":71,"quickstart_zh":72,"use_case_zh":73,"hero_image_url":74,"owner_login":75,"owner_name":76,"owner_avatar_url":77,"owner_bio":78,"owner_company":79,"owner_location":79,"owner_email":80,"owner_twitter":79,"owner_website":81,"owner_url":82,"languages":83,"stars":110,"forks":111,"last_commit_at":112,"license":113,"difficulty_score":23,"env_os":114,"env_gpu":115,"env_ram":115,"env_deps":116,"category_tags":130,"github_topics":131,"view_count":23,"oss_zip_url":79,"oss_zip_packed_at":79,"status":16,"created_at":139,"updated_at":140,"faqs":141,"releases":172},2211,"SeldonIO\u002FMLServer","MLServer","An inference server for your machine learning models, including support for multiple frameworks, multi-model serving and more","MLServer 是一款开源的机器学习模型推理服务器，旨在帮助开发者轻松地将训练好的模型部署为生产级服务。它通过提供标准的 REST 和 gRPC 接口，屏蔽了底层框架的差异，让用户无需编写复杂的包装代码即可快速上线模型。\n\n对于需要同时管理多个模型或应对高并发请求的团队，MLServer 解决了传统部署方式中资源利用率低、扩展困难的问题。其核心亮点在于支持“多模型同进程服务”，允许在单个进程中运行多个模型以节省资源；具备自适应批处理能力，能动态合并请求以提升吞吐量；并原生支持垂直扩展的并行推理。此外，它完全兼容 KFServing V2 数据平面协议，可无缝集成到 Kubernetes、Seldon Core 及 KServe 等云原生环境中。\n\n目前，MLServer 开箱即支持 Scikit-Learn、XGBoost、LightGBM、MLflow 等多种主流框架，同时也允许用户自定义运行时以适配特殊需求。这款工具特别适合机器学习工程师、后端开发人员以及负责 MLOps 基础设施的研究人员使用，是构建高效、可扩展模型服务管道的理想选择。","# MLServer\n\nAn open source inference server for your machine learning models.\n\n[![video_play_icon](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FSeldonIO_MLServer_readme_5ca89a8cb770.png)](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=aZHe3z-8C_w)\n\n## Overview\n\nMLServer aims to provide an easy way to start serving your machine learning\nmodels through a REST and gRPC interface, fully compliant with [KFServing's V2\nDataplane](https:\u002F\u002Fdocs.seldon.io\u002Fprojects\u002Fseldon-core\u002Fen\u002Flatest\u002Freference\u002Fapis\u002Fv2-protocol.html)\nspec. 
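For illustration, a V2-compliant REST inference call from Python could look like the following sketch (the model name `my-model` and the input tensor are placeholders, not anything shipped with this repo):\n\n```python\nimport requests\n\n# Hypothetical V2 inference request against a locally running MLServer\n# (the REST server listens on port 8080 by default).\ninference_request = {\n    \"inputs\": [\n        {\n            \"name\": \"input-0\",\n            \"shape\": [1, 4],\n            \"datatype\": \"FP32\",\n            \"data\": [5.1, 3.5, 1.4, 0.2],\n        }\n    ]\n}\n\nresponse = requests.post(\n    \"http:\u002F\u002Flocalhost:8080\u002Fv2\u002Fmodels\u002Fmy-model\u002Finfer\",\n    json=inference_request,\n)\nprint(response.json())\n```\n\n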
Watch a quick video introducing the project [here](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=aZHe3z-8C_w).\n\n- Multi-model serving, letting users run multiple models within the same\n  process.\n- Ability to run [inference in parallel for vertical\n  scaling](https:\u002F\u002Fmlserver.readthedocs.io\u002Fen\u002Flatest\u002Fuser-guide\u002Fparallel-inference.html)\n  across multiple models through a pool of inference workers.\n- Support for [adaptive\n  batching](https:\u002F\u002Fmlserver.readthedocs.io\u002Fen\u002Flatest\u002Fuser-guide\u002Fadaptive-batching.html),\n  to group inference requests together on the fly.\n- Scalability with deployment in Kubernetes native frameworks, including\n  [Seldon Core](https:\u002F\u002Fdocs.seldon.io\u002Fprojects\u002Fseldon-core\u002Fen\u002Flatest\u002Fgraph\u002Fprotocols.html#v2-kfserving-protocol) and\n  [KServe (formerly known as KFServing)](https:\u002F\u002Fkserve.github.io\u002Fwebsite\u002Fmodelserving\u002Fv1beta1\u002Fsklearn\u002Fv2\u002F), where\n  MLServer is the core Python inference server used to serve machine learning\n  models.\n- Support for the standard [V2 Inference Protocol](https:\u002F\u002Fdocs.seldon.io\u002Fprojects\u002Fseldon-core\u002Fen\u002Flatest\u002Freference\u002Fapis\u002Fv2-protocol.html) on\n  both the gRPC and REST flavours, which has been standardised and adopted by\n  various model serving frameworks.\n\nYou can read more about the goals of this project on the [initial design\ndocument](https:\u002F\u002Fdocs.google.com\u002Fdocument\u002Fd\u002F1C2uf4SaAtwLTlBCciOhvdiKQ2Eay4U72VxAD4bXe7iU\u002Fedit?usp=sharing).\n\n## Usage\n\nYou can install the `mlserver` package running:\n\n```bash\npip install mlserver\n```\n\nNote that to use any of the optional [inference runtimes](#inference-runtimes),\nyou'll need to install the relevant package.\nFor example, to serve a `scikit-learn` model, you would need to install the\n`mlserver-sklearn` package:\n\n```bash\npip install mlserver-sklearn\n```\n\nFor further information on how to use MLServer, you can check any of the\n[available examples](#examples).\n\n## Inference Runtimes\n\nInference runtimes allow you to define how your model should be used within\nMLServer.\nYou can think of them as the **backend glue** between MLServer and your machine\nlearning framework of choice.\nYou can read more about [inference runtimes in their documentation\npage](.\u002Fdocs\u002Fruntimes\u002Findex.md).\n\nOut of the box, MLServer comes with a set of pre-packaged runtimes which let\nyou interact with a subset of common frameworks.\nThis allows you to start serving models saved in these frameworks straight\naway.\nHowever, it's also possible to **[write custom\nruntimes](.\u002Fdocs\u002Fruntimes\u002Fcustom.md)**.\n\nOut of the box, MLServer provides support for:\n\n| Framework     | Supported | Documentation                                                    |\n| ------------- | --------- | ---------------------------------------------------------------- |\n| Scikit-Learn  | ✅        | [MLServer SKLearn](.\u002Fruntimes\u002Fsklearn)                           |\n| XGBoost       | ✅        | [MLServer XGBoost](.\u002Fruntimes\u002Fxgboost)                           |\n| Spark MLlib   | ✅        | [MLServer MLlib](.\u002Fruntimes\u002Fmllib)                               |\n| LightGBM      | ✅        | [MLServer LightGBM](.\u002Fruntimes\u002Flightgbm)                         |\n| CatBoost      | ✅        | [MLServer CatBoost](.\u002Fruntimes\u002Fcatboost)             
            |\n| Tempo         | ✅        | [`github.com\u002FSeldonIO\u002Ftempo`](https:\u002F\u002Fgithub.com\u002FSeldonIO\u002Ftempo) |\n| MLflow        | ✅        | [MLServer MLflow](.\u002Fruntimes\u002Fmlflow)                             |\n| Alibi-Detect  | ✅        | [MLServer Alibi Detect](.\u002Fruntimes\u002Falibi-detect)                 |\n| Alibi-Explain | ✅        | [MLServer Alibi Explain](.\u002Fruntimes\u002Falibi-explain)               |\n| HuggingFace   | ✅        | [MLServer HuggingFace](.\u002Fruntimes\u002Fhuggingface)                   |\n\nMLServer is licensed under the Apache License, Version 2.0. However, please note that software used in conjunction with, or alongside, MLServer may be licensed under different terms. For example, Alibi Detect and Alibi Explain are both licensed under the Business Source License 1.1. For more information about the legal terms of products that are used in conjunction with or alongside MLServer, please refer to their respective documentation.\n\n## Supported Python Versions\n\n🔴 Unsupported\n\n🟠 Deprecated: To be removed in a future version\n\n🟢 Supported\n\n🔵 Untested\n\n| Python Version | Status |\n| -------------- | ------ |\n| 3.7            | 🔴     |\n| 3.8            | 🔴     |\n| 3.9            | 🟢     |\n| 3.10           | 🟢     |\n| 3.11           | 🟢     |\n| 3.12           | 🟢     |\n| 3.13           | 🔴     |\n\n## Examples\n\nTo see MLServer in action, check out [our full list of\nexamples](.\u002Fdocs\u002Fexamples\u002Findex.md).\nYou can find below a few selected examples showcasing how you can leverage\nMLServer to start serving your machine learning models.\n\n- [Serving a `scikit-learn` model](.\u002Fdocs\u002Fexamples\u002Fsklearn\u002FREADME.md)\n- [Serving a `xgboost` model](.\u002Fdocs\u002Fexamples\u002Fxgboost\u002FREADME.md)\n- [Serving a `lightgbm` model](.\u002Fdocs\u002Fexamples\u002Flightgbm\u002FREADME.md)\n- [Serving a `catboost` model](.\u002Fdocs\u002Fexamples\u002Fcatboost\u002FREADME.md)\n- [Serving a `tempo` pipeline](.\u002Fdocs\u002Fexamples\u002Ftempo\u002FREADME.md)\n- [Serving a custom model](.\u002Fdocs\u002Fexamples\u002Fcustom\u002FREADME.md)\n- [Serving an `alibi-detect` model](.\u002Fdocs\u002Fexamples\u002Falibi-detect\u002FREADME.md)\n- [Serving a `HuggingFace` model](.\u002Fdocs\u002Fexamples\u002Fhuggingface\u002FREADME.md)\n- [Multi-Model Serving with multiple frameworks](.\u002Fdocs\u002Fexamples\u002Fmms\u002FREADME.md)\n- [Loading \u002F unloading models from a model repository](.\u002Fdocs\u002Fexamples\u002Fmodel-repository\u002FREADME.md)\n\n## Developer Guide\n\n### Versioning\n\nBoth the main `mlserver` package and the [inference runtimes\npackages](.\u002Fdocs\u002Fruntimes\u002Findex.md) try to follow the same versioning schema.\nTo bump the version across all of them, you can use the\n[`.\u002Fhack\u002Fupdate-version.sh`](.\u002Fhack\u002Fupdate-version.sh) script.\n\nWe generally keep the version as a placeholder for an upcoming version.\n\nFor example:\n\n```bash\n.\u002Fhack\u002Fupdate-version.sh 0.2.0.dev1\n```\n\n### Testing\n\nTo run all of the tests for MLServer and the runtimes, use:\n\n```bash\nmake test\n```\n\nTo run tests for a single file, use something like:\n\n```bash\ntox -e py3 -- tests\u002Fbatch_processing\u002Ftest_rest.py\n```\n","# 
MLServer\n\n一个用于部署机器学习模型的开源推理服务器。\n\n[![video_play_icon](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FSeldonIO_MLServer_readme_5ca89a8cb770.png)](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=aZHe3z-8C_w)\n\n## 概述\n\nMLServer 旨在通过 REST 和 gRPC 接口，以一种简单的方式启动您的机器学习模型服务，并完全符合 [KFServing 的 V2 数据平面](https:\u002F\u002Fdocs.seldon.io\u002Fprojects\u002Fseldon-core\u002Fen\u002Flatest\u002Freference\u002Fapis\u002Fv2-protocol.html) 规范。您可以在[这里](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=aZHe3z-8C_w)观看一段简短的项目介绍视频。\n\n- 多模型服务：允许用户在同一进程中运行多个模型。\n- 能够通过推理工作线程池，在多个模型之间并行执行推理，实现垂直扩展。\n- 支持[自适应批处理](https:\u002F\u002Fmlserver.readthedocs.io\u002Fen\u002Flatest\u002Fuser-guide\u002Fadaptive-batching.html)，以便动态地将推理请求分组在一起。\n- 可在 Kubernetes 原生框架中部署，具有良好的可扩展性，包括 [Seldon Core](https:\u002F\u002Fdocs.seldon.io\u002Fprojects\u002Fseldon-core\u002Fen\u002Flatest\u002Fgraph\u002Fprotocols.html#v2-kfserving-protocol) 和 [KServe（原名 KFServing）](https:\u002F\u002Fkserve.github.io\u002Fwebsite\u002Fmodelserving\u002Fv1beta1\u002Fsklearn\u002Fv2\u002F)。MLServer 是用于服务机器学习模型的核心 Python 推理服务器。\n- 同时支持 gRPC 和 REST 风格的标准 [V2 推理协议](https:\u002F\u002Fdocs.seldon.io\u002Fprojects\u002Fseldon-core\u002Fen\u002Flatest\u002Freference\u002Fapis\u002Fv2-protocol.html)，该协议已被标准化并被多种模型服务框架采用。\n\n您可以在[初始设计文档](https:\u002F\u002Fdocs.google.com\u002Fdocument\u002Fd\u002F1C2uf4SaAtwLTlBCciOhvdiKQ2Eay4U72VxAD4bXe7iU\u002Fedit?usp=sharing)中了解更多关于该项目目标的信息。\n\n## 使用方法\n\n您可以运行以下命令来安装 `mlserver` 包：\n\n```bash\npip install mlserver\n```\n\n请注意，要使用任何可选的[推理运行时](#inference-runtimes)，您需要安装相应的包。例如，要服务一个 `scikit-learn` 模型，您需要安装 `mlserver-sklearn` 包：\n\n```bash\npip install mlserver-sklearn\n```\n\n有关如何使用 MLServer 的更多信息，您可以查看任何[可用示例](#examples)。\n\n## 推理运行时\n\n推理运行时允许您定义模型在 MLServer 中的使用方式。您可以将其视为 MLServer 与您选择的机器学习框架之间的**后端桥梁**。您可以在[推理运行时的文档页面](.\u002Fdocs\u002Fruntimes\u002Findex.md)中了解更多信息。\n\n开箱即用时，MLServer 提供了一组预打包的运行时，使您能够与一些常见的框架进行交互。这使得您可以立即开始服务这些框架中保存的模型。然而，您也可以**[编写自定义运行时](.\u002Fdocs\u002Fruntimes\u002Fcustom.md)**。\n\n开箱即用时，MLServer 提供以下框架的支持：\n\n| 框架         | 支持状态 | 文档                                                    |\n| ------------- | --------- | ---------------------------------------------------------------- |\n| Scikit-Learn  | ✅        | [MLServer SKLearn](.\u002Fruntimes\u002Fsklearn)                           |\n| XGBoost       | ✅        | [MLServer XGBoost](.\u002Fruntimes\u002Fxgboost)                           |\n| Spark MLlib   | ✅        | [MLServer MLlib](.\u002Fruntimes\u002Fmllib)                               |\n| LightGBM      | ✅        | [MLServer LightGBM](.\u002Fruntimes\u002Flightgbm)                         |\n| CatBoost      | ✅        | [MLServer CatBoost](.\u002Fruntimes\u002Fcatboost)                         |\n| Tempo         | ✅        | [`github.com\u002FSeldonIO\u002Ftempo`](https:\u002F\u002Fgithub.com\u002FSeldonIO\u002Ftempo) |\n| MLflow        | ✅        | [MLServer MLflow](.\u002Fruntimes\u002Fmlflow)                             |\n| Alibi-Detect  | ✅        | [MLServer Alibi Detect](.\u002Fruntimes\u002Falibi-detect)                 |\n| Alibi-Explain | ✅        | [MLServer Alibi Explain](.\u002Fruntimes\u002Falibi-explain)               |\n| HuggingFace   | ✅        | [MLServer HuggingFace](.\u002Fruntimes\u002Fhuggingface)                   |\n\nMLServer 采用 Apache License, Version 2.0 许可证。但请注意，与 MLServer 结合或同时使用的软件可能采用不同的许可证条款。例如，Alibi Detect 和 Alibi Explain 均采用 Business Source License 1.1 许可证。有关与 MLServer 结合或同时使用的其他产品的法律条款，请参阅其各自的文档。\n\n## 支持的 Python 版本\n\n🔴 不支持\n\n🟠 
已弃用：将在未来版本中移除\n\n🟢 支持\n\n🔵 未测试\n\n| Python 版本 | 状态 |\n| -------------- | ------ |\n| 3.7            | 🔴     |\n| 3.8            | 🔴     |\n| 3.9            | 🟢     |\n| 3.10           | 🟢     |\n| 3.11           | 🟢     |\n| 3.12           | 🟢     |\n| 3.13           | 🔴     |\n\n## 示例\n\n要查看 MLServer 的实际应用，请参阅[我们的完整示例列表](.\u002Fdocs\u002Fexamples\u002Findex.md)。以下是一些精选示例，展示了如何利用 MLServer 开始服务您的机器学习模型。\n\n- [服务一个 `scikit-learn` 模型](.\u002Fdocs\u002Fexamples\u002Fsklearn\u002FREADME.md)\n- [服务一个 `xgboost` 模型](.\u002Fdocs\u002Fexamples\u002Fxgboost\u002FREADME.md)\n- [服务一个 `lightgbm` 模型](.\u002Fdocs\u002Fexamples\u002Flightgbm\u002FREADME.md)\n- [服务一个 `catboost` 模型](.\u002Fdocs\u002Fexamples\u002Fcatboost\u002FREADME.md)\n- [服务一个 `tempo` 流水线](.\u002Fdocs\u002Fexamples\u002Ftempo\u002FREADME.md)\n- [服务一个自定义模型](.\u002Fdocs\u002Fexamples\u002Fcustom\u002FREADME.md)\n- [服务一个 `alibi-detect` 模型](.\u002Fdocs\u002Fexamples\u002Falibi-detect\u002FREADME.md)\n- [服务一个 `HuggingFace` 模型](.\u002Fdocs\u002Fexamples\u002Fhuggingface\u002FREADME.md)\n- [多框架多模型服务](.\u002Fdocs\u002Fexamples\u002Fmms\u002FREADME.md)\n- [从模型仓库加载\u002F卸载模型](.\u002Fdocs\u002Fexamples\u002Fmodel-repository\u002FREADME.md)\n\n## 开发者指南\n\n### 版本管理\n\n主 `mlserver` 包和[推理运行时包](.\u002Fdocs\u002Fruntimes\u002Findex.md)都遵循相同的版本号方案。要为所有包同步更新版本，可以使用[`.\u002Fhack\u002Fupdate-version.sh`](.\u002Fhack\u002Fupdate-version.sh)脚本。\n\n我们通常将版本号保留为即将发布的版本占位符。\n\n例如：\n\n```bash\n.\u002Fhack\u002Fupdate-version.sh 0.2.0.dev1\n```\n\n### 测试\n\n要运行 MLServer 和所有运行时的全部测试，可以使用：\n\n```bash\nmake test\n```\n\n要单独运行某个文件的测试，可以使用类似以下的命令：\n\n```bash\ntox -e py3 -- tests\u002Fbatch_processing\u002Ftest_rest.py\n```","# MLServer 快速上手指南\n\nMLServer 是一个开源的机器学习模型推理服务器，支持通过 REST 和 gRPC 接口提供服务，完全兼容 KFServing V2 数据平面协议。它支持多模型服务、并行推理、自适应批处理，并可无缝集成到 Kubernetes 生态（如 Seldon Core 和 KServe）。\n\n## 环境准备\n\n- **操作系统**：Linux \u002F macOS \u002F Windows（推荐 Linux 生产环境）\n- **Python 版本**：3.9、3.10、3.11 或 3.12（3.7\u002F3.8 已不再支持，3.13 尚未支持）\n- **包管理工具**：`pip`（建议使用虚拟环境）\n- **可选依赖**：根据你要服务的模型框架安装对应的运行时包（如 `mlserver-sklearn`、`mlserver-xgboost` 等）\n\n> 💡 国内用户可使用清华或阿里云镜像加速 pip 安装：\n> ```bash\n> pip install -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple \u003Cpackage-name>\n> ```\n\n## 安装步骤\n\n1. 安装核心包：\n```bash\npip install mlserver\n```\n\n2. 根据模型类型安装对应运行时（以 Scikit-Learn 为例）：\n```bash\npip install mlserver-sklearn\n```\n\n其他常见运行时：\n```bash\npip install mlserver-xgboost\npip install mlserver-lightgbm\npip install mlserver-huggingface\npip install mlserver-mlflow\n```\n\n## 基本使用\n\n以下以部署一个 Scikit-Learn 模型为例：\n\n### 1. 准备模型文件\n\n假设你已有一个训练好的 `model.pkl` 文件，存放在当前目录。\n\n### 2. 创建模型配置文件 `model-settings.json`\n\n```json\n{\n  \"name\": \"sklearn-model\",\n  \"implementation\": \"mlserver_sklearn.SKLearnModel\",\n  \"parameters\": {\n    \"uri\": \".\u002Fmodel.pkl\",\n    \"version\": \"v1\"\n  }\n}\n```\n\n### 3. 启动 MLServer\n\n```bash\nmlserver start .\n```\n\n服务器默认在 `http:\u002F\u002Flocalhost:8080` 提供 REST 接口，在 `grpc:\u002F\u002Flocalhost:8081` 提供 gRPC 接口。\n\n### 4. 
发送推理请求（REST 示例）\n\n```bash\ncurl -X POST http:\u002F\u002Flocalhost:8080\u002Fv2\u002Fmodels\u002Fsklearn-model\u002Finfer \\\n  -H \"Content-Type: application\u002Fjson\" \\\n  -d '{\n    \"inputs\": [\n      {\n        \"name\": \"input-0\",\n        \"shape\": [1, 4],\n        \"datatype\": \"FP32\",\n        \"data\": [[5.1, 3.5, 1.4, 0.2]]\n      }\n    ]\n  }'\n```\n\n你将收到模型的预测结果。\n\n---\n\n✅ 现在你已成功使用 MLServer 部署了一个机器学习模型！  \n更多框架示例（XGBoost、HuggingFace、自定义模型等）请参考官方文档中的 [Examples](.\u002Fdocs\u002Fexamples\u002Findex.md) 章节。","某电商数据团队需要将基于 Scikit-Learn 的用户流失预测模型和基于 XGBoost 的商品推荐模型同时部署到生产环境，以支持实时营销决策。\n\n### 没有 MLServer 时\n- **资源浪费严重**：每个模型需独立启动一个 Flask\u002FFastAPI 服务进程，导致服务器内存和 CPU 被大量空闲进程占用，无法在同一节点高效运行多个模型。\n- **接口标准混乱**：不同框架编写的服务接口定义不一，上游业务方调用时需适配多种协议，增加了集成复杂度和维护成本。\n- **高并发处理能力弱**：缺乏原生的自适应批处理（Adaptive Batching）机制，面对突发流量时无法自动合并请求，导致推理延迟飙升或服务崩溃。\n- **扩展运维困难**：难以直接对接 Kubernetes 原生架构（如 KServe），手动编写扩容脚本繁琐且容易出错，无法实现弹性伸缩。\n\n### 使用 MLServer 后\n- **多模型同进程运行**：利用多模型服务特性，将流失预测和商品推荐模型部署在同一个进程中，显著降低资源开销，提升服务器利用率。\n- **统一标准化接口**：通过内置符合 KFServing V2 协议的 REST\u002FgRPC 接口，屏蔽底层框架差异，让业务方只需对接一套标准 API 即可调用所有模型。\n- **智能流量削峰**：开启自适应批处理功能，MLServer 自动将并发的推理请求动态打包处理，在保证低延迟的同时大幅提升吞吐量。\n- **云原生无缝集成**：直接作为 KServe 或 Seldon Core 的核心推理后端部署，天然支持 Kubernetes 的弹性伸缩策略，运维团队无需额外开发即可实现自动化扩缩容。\n\nMLServer 通过统一标准化的推理服务和高效的资源调度，让多模型生产部署从“手工定制”转变为“开箱即用”，极大降低了 MLOps 的落地门槛。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FSeldonIO_MLServer_5ca89a8c.png","SeldonIO","Seldon","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002FSeldonIO_17a4841a.png","Machine Learning Deployment for Kubernetes",null,"hello@seldon.io","https:\u002F\u002Fseldon.io","https:\u002F\u002Fgithub.com\u002FSeldonIO",[84,88,92,96,99,103,107],{"name":85,"color":86,"percentage":87},"Python","#3572A5",97.6,{"name":89,"color":90,"percentage":91},"Shell","#89e051",0.7,{"name":93,"color":94,"percentage":95},"JavaScript","#f1e05a",0.5,{"name":97,"color":98,"percentage":95},"Dockerfile","#384d54",{"name":100,"color":101,"percentage":102},"Makefile","#427819",0.3,{"name":104,"color":105,"percentage":106},"Jinja","#a52a22",0.2,{"name":108,"color":109,"percentage":106},"TeX","#3D6117",882,229,"2026-04-02T20:40:00","Apache-2.0","","未说明",{"notes":117,"python":118,"dependencies":119},"该工具是一个通用的推理服务器，支持多种机器学习框架（如 Scikit-Learn, XGBoost, HuggingFace 等）。具体的硬件资源（GPU\u002F内存）需求取决于所选用的推理运行时（Runtime）及加载的模型大小，README 中未给出统一的硬件指标。Python 3.7、3.8 已不再支持，3.13 尚未支持。支持通过 Kubernetes (Seldon Core, KServe) 进行原生部署。","3.9, 3.10, 3.11, 3.12",[120,121,122,123,124,125,126,127,128,129],"mlserver","mlserver-sklearn","mlserver-xgboost","mlserver-mllib","mlserver-lightgbm","mlserver-catboost","mlserver-mlflow","mlserver-alibi-detect","mlserver-alibi-explain","mlserver-huggingface",[13],[132,133,134,135,136,137,138],"machine-learning","scikit-learn","xgboost","lightgbm","mlflow","seldon-core","kfserving","2026-03-27T02:49:30.150509","2026-04-06T06:56:30.635605",[142,147,152,157,162,167],{"id":143,"question_zh":144,"answer_zh":145,"source_url":146},10170,"在多模型服务（Multi-model serving）中遇到 'IsADirectoryError: [Errno 21] Is a directory' 错误，如何解决？","该错误通常是因为 `model-settings.json` 文件中的 `parameters.uri` 字段配置不正确。`uri` 路径应该是相对于 `model-settings.json` 文件所在目录的相对路径。\n\n例如，如果目录结构如下：\n```\n├── IrisModel\n│   ├── model-settings.json\n│   └── model.joblib\n```\n那么在 `IrisModel\u002Fmodel-settings.json` 中，`uri` 应该只设置为 `model.joblib`，而不是 `IrisModel\u002Fmodel.joblib`。\n\n修正后的配置示例：\n```json\n{\n  \"name\": \"IrisModel\",\n  \"implementation\": \"mlserver_sklearn.SKLearnModel\",\n  \"parameters\": {\n    
\"uri\": \"model.joblib\"\n  }\n}\n```","https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fissues\u002F689",{"id":148,"question_zh":149,"answer_zh":150,"source_url":151},10171,"卸载模型（model unload）后，为什么内存（特别是 GPU 显存）没有被释放？","这通常是因为 PyTorch 默认的 GPU 内存分配缓存机制导致的。即使模型已卸载，PyTorch 仍可能保留显存以备后用。\n\n解决方法是设置环境变量 `PYTORCH_NO_CUDA_MEMORY_CACHING=1` 来禁用 PyTorch 的 GPU“分配缓存”。这将确保在卸载模型时显存被完全释放，其效果等同于调用 `torch.cuda.empty_cache()`。\n\n启动命令示例：\n```bash\nexport PYTORCH_NO_CUDA_MEMORY_CACHING=1\nmlserver start .\n```","https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fissues\u002F986",{"id":153,"question_zh":154,"answer_zh":155,"source_url":156},10172,"MLServer 是否支持 Python 3.12？","是的，MLServer 现已支持 Python 3.12。相关的支持工作已通过 PR #1951 合并。目前 MLServer 支持的 Python 版本范围为 3.9 到 3.12。此更新依赖于 `alibi` 和 `alibi-detect` 库的升级以兼容新版本。","https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fissues\u002F1926",{"id":158,"question_zh":159,"answer_zh":160,"source_url":161},10173,"启用了自适应批处理（Adaptive Batching），但在使用推理流（inference streaming）时收到警告或不生效，原因是什么？","自适应批处理目前不支持推理流模式，如果同时启用两者，系统会自动回退到非批处理的流式推理。\n\n此外，如果您的模型未正确实现批处理逻辑，也会导致问题。要使自适应批处理正常工作，模型必须能够处理批量输入并返回对应数量的输出。例如，如果模型接收形状为 `[b, n]` 的输入（其中 `b` 是批次大小），它必须能够输出形状为 `[b, 1]`（假设每个输入对应一个输出）的结果。如果模型输出的元素数量少于批次大小 `b`，剩余的请求将获得空响应。","https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fissues\u002F1911",{"id":163,"question_zh":164,"answer_zh":165,"source_url":166},10174,"如何获取推理队列（request queue）中的元素数量指标以进行性能调优？","MLServer 已经添加了相关功能来暴露队列指标。您可以通过查看服务器的指标端点（metrics endpoint）来获取批处理队列（batch queue）和请求队列（request queue）中的当前元素数量。\n\n这些指标可用于监控负载，帮助您在 OpenShift 等环境中调整并行工作线程数（parallel workers）或 Pod 副本数。该功能已在相关更新（如 PR #860）中实现并合并。","https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fissues\u002F728",{"id":168,"question_zh":169,"answer_zh":170,"source_url":171},10175,"MLServer 是否支持 Pydantic V2？","是的，MLServer 已添加对 Pydantic V2 的支持。这使得用户可以将依赖 Pydantic 的服务升级到最新版本，同时保持与 MLServer 的兼容性。如果您之前因仅支持 V1 而受阻，现在可以安全地迁移到 Pydantic V2。","https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fissues\u002F1419",[173,178,183,188,193,198,203,208,213,218,223,228,233,238,243,248,253,258,262],{"id":174,"version":175,"summary_zh":176,"released_at":177},107438,"1.7.1","\u003C!-- Release notes generated using configuration in .github\u002Frelease.yml at 1.7.1 -->\r\n\r\n## Fixes\r\n* Set a lower bound for `mlflow` in `mlserver-mlflow` by @crispin-ki in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F2114\r\n* Set lower bounds for `protobuf`, `grpcio`, and `grpcio-tools` by @RobertSamoilescu in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F2175\r\n* Added support for bytes encoding in `PandasCodec` by @RobertSamoilescu in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F2117\r\n\r\n## What's Changed\r\n* Included more docker labels by @RobertSamoilescu in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F2106\r\n* Fix too loose mlflow dependency constraint in mlserver-mlflow by @crispin-ki in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F2114\r\n* Fixed byte encoding in the PandasCodec by @RobertSamoilescu in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F2117\r\n* Update CHANGELOG by @github-actions in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F2108\r\n* Re-generate License Info by @github-actions in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F2174\r\n* Re-generate License Info by 
@github-actions in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F2176\r\n* Fix protobuf bounds by @RobertSamoilescu in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F2175\r\n* Revert \"build(deps-dev): bump transformers from 4.41.2 to 4.52.4 (#2170)\" by @RobertSamoilescu in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F2177\r\n* ci: Merge change for release 1.7.1 by @RobertSamoilescu in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F2178\r\n* Re-generate License Info by @github-actions in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F2180\r\n* ci: Merge change for release 1.7.1 [2]  (#2180) by @RobertSamoilescu in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F2182\r\n\r\n## New Contributors\r\n* @crispin-ki made their first contribution in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F2114\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fcompare\u002F1.7.0...1.7.1","2025-06-06T13:01:20",{"id":179,"version":180,"summary_zh":181,"released_at":182},107439,"1.7.0","\u003C!-- Release notes generated using configuration in .github\u002Frelease.yml at 1.7.0 -->\r\n\r\n## Overview\r\n\r\n### Features\r\n* MLServer has now support for Python 3.11 and 3.12 by @shivakrishnaah in (#1951)\r\n* MLServer now supports enabling assignment of models to dedicated inference pool groups to avoid risk of starvation by @RobertSamoilescu in (##2040)\r\n* MLServer now includes compatibility with additional column types available in the MLflow runtime such as: [Array](https:\u002F\u002Fmlflow.org\u002Fdocs\u002Flatest\u002Fapi_reference\u002Fpython_api\u002Fmlflow.types.html#mlflow.types.schema.Array), [Map](https:\u002F\u002Fmlflow.org\u002Fdocs\u002Flatest\u002Fapi_reference\u002Fpython_api\u002Fmlflow.types.html#mlflow.types.schema.Map), [Object](https:\u002F\u002Fmlflow.org\u002Fdocs\u002Flatest\u002Fapi_reference\u002Fpython_api\u002Fmlflow.types.html#mlflow.types.schema.Object), [Any](https:\u002F\u002Fmlflow.org\u002Fdocs\u002Flatest\u002Fapi_reference\u002Fpython_api\u002Fmlflow.types.html#mlflow.types.schema.Map) by @RobertSamoilescu in (#2080)\r\n\r\n### Fixes\r\n* Relaxing Pydantic dependencies by @lemonhead94 in (#1928)\r\n* Adjusted the version range for FastAPI  to ensure compatibility with future releases by @sergioave  in (#1954)\r\n* Forward rest parameters to model @idlefella in (#1921)\r\n* Force clean up env fix by @sakoush in (#2029)\r\n* PandasCodec improperly encoding columns of numeric lists fix by @RobertSamoilescu in (#2080)\r\n* Opentelemetry dependency mismatch fix by @lawrence-c in (#2088)\r\n* AdaptiveBatcher timeout calculation fix by @hanlaur in (#2093)\r\n\r\n## What's Changed\r\n* Update CHANGELOG by @github-actions in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1905\r\n* docs: add docs for gitbook by @sakoush in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1919\r\n* Relaxing Pydantic dependencies by @lemonhead94 in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1928\r\n* build(deps): Upgrade fastapi and starlette by @sakoush in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1934\r\n* Re-generate License Info by @github-actions in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1935\r\n* Update FastAPI version constraint by @sergioave in 
https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1954\r\n* Forward rest parameters to model by @idlefella in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1921\r\n* Revert \"build(deps): bump mlflow from 2.18.0 to 2.19.0 in \u002Fruntimes\u002Fmlflow\" by @sakoush in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1988\r\n* Added dependency upgrades for python3.12 support by @shivakrishnaah in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1951\r\n* Re-generate License Info by @github-actions in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1991\r\n* Further CI fixes for py312 support by @sakoush in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1992\r\n* Revert \"build(deps): bump python-multipart from 0.0.9 to 0.0.18 in \u002Fruntimes\u002Falibi-detect\" by @sakoush in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1994\r\n* Re-generate License Info by @github-actions in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F2027\r\n* Force clean up env (for py 3.12) by @sakoush in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F2029\r\n* Pinned preflight to latest version by @RobertSamoilescu in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F2041\r\n* Bump gevent to 24.11.1 by @RobertSamoilescu in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F2042\r\n* Bumped python-multipart to 0.0.20 by @RobertSamoilescu in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F2043\r\n* Bumped python-multipart-0.0.20 on alibi-explain runtime by @RobertSamoilescu in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F2044\r\n* Included separate inference pool by @RobertSamoilescu in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F2040\r\n* Wrote docs for inference_pool_gid by @RobertSamoilescu in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F2045\r\n* Update lightgbm in alibi runtime to 4.6 by @sakoush in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F2081\r\n* Fix pandas codec by @RobertSamoilescu in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F2080\r\n* Fix interceptors insert tuple -> list by @lawrence-c in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F2088\r\n* Fix AdaptiveBatcher timeout calculation by @hanlaur in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F2093\r\n* Fix onnxruntime version by @RobertSamoilescu in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F2100\r\n* Included labels for preflight checks by @RobertSamoilescu in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F2101\r\n* Bumped poetry to 2.1.1 by @RobertSamoilescu in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F2103\r\n* Add installation for poetry export plugin by @RobertSamoilescu in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F2104\r\n* ci: Merge change for release 1.7.0 [4] by @RobertSamoilescu in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F2107\r\n\r\n## New Contributors\r\n* @lemonhead94 made their first contribution in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1928\r\n* @sergioave made their first contribution in 
https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1954\r\n* @shivakrishnaah made their first contribution in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1951\r\n* @lawrence-c made their first contribution in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F2088\r\n* @hanlaur made their fir","2025-04-11T15:42:41",{"id":184,"version":185,"summary_zh":186,"released_at":187},107440,"1.6.1","\u003C!-- Release notes generated using configuration in .github\u002Frelease.yml at 1.6.1 -->\r\n\r\n## Overview\r\n\r\n### Features\r\nMLServer now offers an option to use pre-existing Python environments by specifying a path to the environment to be used - by @idlefella in (#1891)\r\n\r\n### Releases\r\nMLServer released catboost runtime which allows serving [catboost](https:\u002F\u002Fcatboost.ai\u002F) models with MLServer - by @sakoush in (#1839)\r\n\r\n### Fixes\r\n* Kafka json byte encoding fix to match rest server by @DerTiedemann and @sakoush in (#1622)\r\n* Prometheus interceptor fix for gRPC streaming by @RobertSamoilescu in (#1858)\r\n\r\n\r\n## What's Changed\r\n* Re-generate License Info by @github-actions in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1812\r\n* Update CHANGELOG by @github-actions in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1830\r\n* Update release.yml to include catboost by @sakoush in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1839\r\n* Fix kafka json byte encoding to match rest server by @DerTiedemann in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1622\r\n* Included Prometheus interceptor support for gRPC streaming by @RobertSamoilescu in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1858\r\n* Run gRPC test serially by @RobertSamoilescu in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1872\r\n* Re-generate License Info by @github-actions in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1886\r\n* Feature\u002Fsupport existing environments by @idlefella in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1891\r\n* Fix tensorflow upperbound macos by @RobertSamoilescu in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1901\r\n* ci: Merge change for release 1.6.1  by @RobertSamoilescu in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1902\r\n* Bump preflight to 1.10.0 by @RobertSamoilescu in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1903\r\n* ci: Merge change for release 1.6.1 [2] by @RobertSamoilescu in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1904\r\n\r\n## New Contributors\r\n* @DerTiedemann made their first contribution in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1622\r\n* @idlefella made their first contribution in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1891\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fcompare\u002F1.6.0...1.6.1","2024-09-10T16:06:23",{"id":189,"version":190,"summary_zh":191,"released_at":192},107441,"1.6.0"," ## Overview\r\n\r\n\r\n### Upgrades\r\n MLServer supports Pydantic V2. \r\n\r\n### Features\r\n MLServer supports streaming data to and from your models. 
\r\n\r\n Streaming support is available for both the REST and gRPC servers: \r\n * for the REST server is limited only to server streaming. This means that the client sends a single request to the server, and the server responds with a stream of data. \r\n * for the gRPC server is available for both client and server streaming. This means that the client sends a stream of data to the server, and the server responds with a stream of data.\r\n\r\n See our [docs](https:\u002F\u002Fmlserver.readthedocs.io\u002Fen\u002F1.6.0\u002Fuser-guide\u002Fstreaming.html) and [example](https:\u002F\u002Fmlserver.readthedocs.io\u002Fen\u002F1.6.0\u002Fexamples\u002Fstreaming\u002FREADME.html) for more details.\r\n\r\n## What's Changed\r\n* fix(ci): fix typo in CI name by @sakoush in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1623\r\n* Update CHANGELOG by @github-actions in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1624\r\n* Re-generate License Info by @github-actions in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1634\r\n* Fix mlserver_huggingface settings device type by @geodavic in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1486\r\n* fix: Adjust HF tests post-merge of PR #1486 by @sakoush in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1635\r\n* Update README.md w licensing clarification by @paulb-seldon in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1636\r\n* Re-generate License Info by @github-actions in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1642\r\n* fix(ci): optimise disk space for GH workers by @sakoush in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1644\r\n* build: Update maintainers by @jesse-c in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1659\r\n* fix: Missing f-string directives by @jesse-c in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1677\r\n* build: Add Catboost runtime to Dependabot by @jesse-c in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1689\r\n* Fix JSON input shapes by @ReveStobinson in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1679\r\n* build(deps): bump alibi-detect from 0.11.5 to 0.12.0 by @jesse-c in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1702\r\n* build(deps): bump alibi from 0.9.5 to 0.9.6 by @jesse-c in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1704\r\n* Docs correction - Updated README.md in mlflow to match column names order by @vivekk0903 in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1703\r\n* fix(runtimes): Remove unused Pydantic dependencies by @jesse-c in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1725\r\n* test: Detect generate failures by @jesse-c in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1729\r\n* build: Add granularity in types generation by @jesse-c in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1749\r\n* Migrate to Pydantic v2 by @jesse-c in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1748\r\n* Re-generate License Info by @github-actions in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1753\r\n* Revert \"build(deps): bump uvicorn from 0.28.0 to 0.29.0\" by @jesse-c in 
https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1758\r\n* refactor(pydantic): Remaining migrations for deprecated functions by @jesse-c in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1757\r\n* Fixed openapi dataplane.yaml by @RobertSamoilescu in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1752\r\n* fix(pandas): Use Pydantic v2 compatible type by @jesse-c in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1760\r\n* Fix Pandas codec decoding from numpy arrays by @lhnwrk in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1751\r\n* build: Bump versions for Read the Docs by @jesse-c in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1761\r\n* docs: Remove quotes around local TOC by @jesse-c in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1764\r\n* Spawn worker in custom environment by @lhnwrk in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1739\r\n* Re-generate License Info by @github-actions in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1767\r\n* basic contributing guide on contributing and opening a PR by @bohemia420 in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1773\r\n* Inference streaming support by @RobertSamoilescu in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1750\r\n* Re-generate License Info by @github-actions in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1779\r\n* build: Lock GitHub runners' OS by @jesse-c in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1765\r\n* Removed text-model form benchmarking by @RobertSamoilescu in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1790\r\n* Bumped mlflow to 2.13.1 and gunicorn to 22.0.0 by @RobertSamoilescu in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1791\r\n* Build(deps): Update to poetry version 1.8.3 in docker build by @sakoush in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1792\r\n* Bumped werkzeug to 3.0.3 by @RobertSamoilescu in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1793\r\n* Docs streaming by @RobertSamoilescu in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1789\r\n* Bump uvicorn 0.30.1 by @RobertSamoilescu in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1795\r\n* Fixes for all-runtimes by @RobertSamoilescu in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1","2024-06-26T14:07:27",{"id":194,"version":195,"summary_zh":196,"released_at":197},107442,"1.5.0","## What's Changed\r\n\r\n* Update CHANGELOG by @github-actions in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1592\r\n* build: Migrate away from Node v16 actions by @jesse-c in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1596\r\n* build: Bump version and improve release doc by @jesse-c in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1602\r\n* build: Upgrade stale packages (fastapi, starlette, tensorflow, torch) by @sakoush in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1603\r\n* fix(ci): tests and security workflow fixes by @sakoush in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1608\r\n* Re-generate License Info by 
@github-actions in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1612\r\n* fix(ci): Missing quote in CI test for all_runtimes by @sakoush in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1617\r\n* build(docker): Bump dependencies by @jesse-c in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1618\r\n* docs: List supported Python versions  by @jesse-c in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1591\r\n* fix(ci): Have separate smaller tasks for release by @sakoush in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1619\r\n\r\n\r\n## Notes\r\n* We remove support for python 3.8, check https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1603 for more info. Docker images for mlserver are already using python 3.10.\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fcompare\u002F1.4.0...1.5.0","2024-03-05T14:40:26",{"id":199,"version":200,"summary_zh":201,"released_at":202},107443,"1.4.0","\u003C!-- Release notes generated using configuration in .github\u002Frelease.yml at 1.4.0 -->\r\n\r\n## What's Changed\r\n* Free up some space for GH actions by @adriangonz in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1282\r\n* Introduce tracing with OpenTelemetry by @vtaskow in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1281\r\n* Update release CI to use Poetry by @adriangonz in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1283\r\n* Re-generate License Info by @github-actions in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1284\r\n* Add support for white-box explainers to alibi-explain runtime by @ascillitoe in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1279\r\n* Update CHANGELOG by @github-actions in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1294\r\n* Fix build-wheels.sh error when copying to output path by @lc525 in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1286\r\n* Fix typo by @strickvl in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1289\r\n* feat(logging): Distinguish logs from different models by @vtaskow in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1302\r\n* Make sure we use our Response class by @adriangonz in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1314\r\n* Adding Quick-Start Guide to docs by @ramonpzg in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1315\r\n* feat(logging): Provide JSON-formatted structured logging as option by @vtaskow in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1308\r\n* Bump in conda version and mamba solver  by @dtpryce in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1298\r\n* feat(huggingface): Merge model settings by @jesse-c in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1337\r\n* feat(huggingface): Load local artefacts in HuggingFace runtime by @vtaskow in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1319\r\n* Document and test behaviour around NaN by @adriangonz in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1346\r\n* Address flakiness on 'mlserver build' tests by @adriangonz in 
https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1363\r\n* Bump Poetry and lockfiles by @adriangonz in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1369\r\n* Bump Miniforge3 to 23.3.1 by @adriangonz in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1372\r\n* Re-generate License Info by @github-actions in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1373\r\n* Improved huggingface batch logic by @ajsalow in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1336\r\n* Add inference params support to MLFlow's custom invocation endpoint (… by @M4nouel in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1375\r\n* Increase build space for runtime builds by @adriangonz in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1385\r\n* Fix minor typo in `sklearn` README by @krishanbhasin-gc in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1402\r\n* Add catboost classifier support by @krishanbhasin-gc in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1403\r\n* added model_kwargs to huggingface model by @nanbo-liu in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1417\r\n* Re-generate License Info by @github-actions in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1456\r\n* Local response cache implementation by @SachinVarghese in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1440\r\n* fix link to custom runtimes by @kretes in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1467\r\n* Improve typing on `Environment` class by @krishanbhasin-gc in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1469\r\n* build(dependabot): Change reviewers by @jesse-c in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1548\r\n* MLServer changes from internal fork - deps and CI updates by @sakoush in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1588\r\n\r\n## New Contributors\r\n* @vtaskow made their first contribution in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1281\r\n* @lc525 made their first contribution in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1286\r\n* @strickvl made their first contribution in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1289\r\n* @ramonpzg made their first contribution in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1315\r\n* @jesse-c made their first contribution in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1337\r\n* @ajsalow made their first contribution in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1336\r\n* @M4nouel made their first contribution in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1375\r\n* @nanbo-liu made their first contribution in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1417\r\n* @kretes made their first contribution in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1467\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fcompare\u002F1.3.5...1.4.0","2024-02-28T15:39:40",{"id":204,"version":205,"summary_zh":206,"released_at":207},107444,"1.3.5","\u003C!-- Release notes generated using configuration in 
.github\u002Frelease.yml at 1.3.5 -->\r\n\r\n### What's Changed\r\n\r\n* Rename HF codec to `hf` by @adriangonz  in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1268\r\n* Publish is_drift metric to Prom by @joshsgoldstein  in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1263\r\n\r\n### New Contributors\r\n* @joshsgoldstein made their first contribution in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1263\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fcompare\u002F1.3.4...1.3.5","2023-07-10T10:28:59",{"id":209,"version":210,"summary_zh":211,"released_at":212},107445,"1.3.4","\u003C!-- Release notes generated using configuration in .github\u002Frelease.yml at 1.3.4 -->\r\n\r\n### What's Changed\r\n\r\n* Silent logging by @dtpryce in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1230\r\n* Fix `mlserver infer` with `BYTES` by @RafalSkolasinski in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1213\r\n\r\n### New Contributors\r\n* @dtpryce made their first contribution in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1230\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fcompare\u002F1.3.3...1.3.4","2023-06-21T16:21:07",{"id":214,"version":215,"summary_zh":216,"released_at":217},107446,"1.3.3","\u003C!-- Release notes generated using configuration in .github\u002Frelease.yml at 1.3.3 -->\r\n\r\n### What's Changed\r\n\r\n* Add default LD_LIBRARY_PATH env var by @adriangonz in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1120\r\n* Adding cassava tutorial (mlserver + seldon core) by @edshee in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1156\r\n* Add docs around converting to \u002F from JSON by @adriangonz in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1165\r\n* Document SKLearn available outputs by @adriangonz in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1167 \r\n* Fix minor typo in `alibi-explain` tests by @ascillitoe in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1170\r\n* Add support for `.ubj` models and improve XGBoost docs by @adriangonz in  https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1168\r\n* Fix content type annotations for pandas codecs by @adriangonz in  https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1162\r\n* Added option to configure the grpc histogram by @cristiancl25 in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1143\r\n* Add OS classifiers to project's metadata by @adriangonz in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1171\r\n* Don't use `qsize` for parallel worker queue by @adriangonz in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1169\r\n* Fix small typo in Python API docs by @krishanbhasin-gc  in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1174\r\n* Fix star import in `mlserver.codecs.*` by @adriangonz in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1172\r\n\r\n### New Contributors\r\n* @cristiancl25 made their first contribution in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1143\r\n* @krishanbhasin-gc made their first contribution in 
https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1174\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fcompare\u002F1.3.2...1.3.3","2023-06-05T10:05:30",{"id":219,"version":220,"summary_zh":221,"released_at":222},107447,"1.3.2","\u003C!-- Release notes generated using configuration in .github\u002Frelease.yml at 1.4.0.dev2 -->\r\n\r\n### What's Changed\r\n* Use default initialiser if not using a custom env by @adriangonz in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1104\r\n* Add support for online drift detectors by @ascillitoe in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1108\r\n* added intera and inter op parallelism parameters to the hugggingface … by @saeid93 in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1081\r\n* Fix settings reference in runtime docs by @adriangonz in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1109\r\n* Bump Alibi libs requirements by @adriangonz in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1121\r\n* Add default LD_LIBRARY_PATH env var by @adriangonz in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1120\r\n* Ignore both .metrics and .envs folders by @adriangonz in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1132\r\n\r\n### New Contributors\r\n* @ascillitoe made their first contribution in https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fpull\u002F1108\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002FSeldonIO\u002FMLServer\u002Fcompare\u002F1.3.1...1.3.2","2023-05-10T13:47:20",{"id":224,"version":225,"summary_zh":226,"released_at":227},107448,"1.3.1","### What's Changed\r\n\r\n- Move OpenAPI schemas into Python package (#1095)","2023-04-27T10:54:25",{"id":229,"version":230,"summary_zh":231,"released_at":232},107449,"1.3.0","> WARNING :warning: : The `1.3.0` has been yanked from PyPi due to a packaging issue. This should have been now resolved in `>= 1.3.1`. \r\n\r\n### What's Changed\r\n\r\n#### Custom Model Environments\r\n\r\nMore often that not, your custom runtimes will depend on external 3rd party dependencies which are not included within the main MLServer package - or different versions of the same package (e.g. `scikit-learn==1.1.0` vs `scikit-learn==1.2.0`). In these cases, to load your custom runtime, MLServer will need access to these dependencies.\r\n\r\nIn MLServer `1.3.0`, it is now [possible to load this custom set of dependencies by providing them](https:\u002F\u002Fmlserver.readthedocs.io\u002Fen\u002Flatest\u002Fuser-guide\u002Fcustom.html#loading-a-custom-python-environment), through an [environment tarball](https:\u002F\u002Fmlserver.readthedocs.io\u002Fen\u002Flatest\u002Fexamples\u002Fconda\u002FREADME.html), whose path can be specified within your `model-settings.json` file. 
This custom environment will get provisioned on the fly after loading a model, alongside the default environment and any other custom environments.

Under the hood, each of these environments will run its own separate pool of workers.

![image](https://user-images.githubusercontent.com/1577620/234797983-aa52c353-2d2f-4261-a078-06bfe62cae87.png)

#### Custom Metrics

The MLServer framework now includes a simple interface that allows you to register and keep track of any [custom metrics](https://mlserver.readthedocs.io/en/latest/user-guide/metrics.html#custom-metrics):

- [`mlserver.register()`](https://mlserver.readthedocs.io/en/latest/reference/api/metrics.html#mlserver.register): Register a new metric.
- [`mlserver.log()`](https://mlserver.readthedocs.io/en/latest/reference/api/metrics.html#mlserver.log): Log a new set of metric / value pairs.

Custom metrics will generally be registered in the [`load()`](https://mlserver.readthedocs.io/en/latest/reference/api/model.html#mlserver.MLModel.load) method and then used in the [`predict()`](https://mlserver.readthedocs.io/en/latest/reference/api/model.html#mlserver.MLModel.predict) method of your [custom runtime](https://mlserver.readthedocs.io/en/latest/user-guide/custom.html). These metrics can then be polled and queried via [Prometheus](https://mlserver.readthedocs.io/en/latest/user-guide/metrics.html#settings).

![image](https://user-images.githubusercontent.com/1577620/234798211-9e538439-4914-4aa6-9c3f-539a66e3ce54.png)
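A minimal sketch of that pattern, using the `register()` / `log()` calls linked above (the metric name `my_predictions_total` and the runtime body are illustrative, not taken from the release):

```python
import mlserver
from mlserver import MLModel
from mlserver.types import InferenceRequest, InferenceResponse


class MyCustomRuntime(MLModel):
    async def load(self) -> bool:
        # Register the metric once, when the model is loaded
        mlserver.register("my_predictions_total", "Number of predictions served")
        self.ready = True
        return self.ready

    async def predict(self, payload: InferenceRequest) -> InferenceResponse:
        # Log a new value for the metric on every request
        mlserver.log(my_predictions_total=1)
        # Placeholder response; real logic would decode the payload and
        # return encoded outputs here
        return InferenceResponse(model_name=self.name, outputs=[])
```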
#### OpenAPI

MLServer `1.3.0` now includes an autogenerated Swagger UI which can be used to interact dynamically with the Open Inference Protocol.

The autogenerated Swagger UI can be accessed under the `/v2/docs` endpoint.

![https://mlserver.readthedocs.io/en/latest/_images/swagger-ui.png](https://mlserver.readthedocs.io/en/latest/_images/swagger-ui.png)

Alongside the [general API documentation](https://mlserver.readthedocs.io/en/latest/user-guide/openapi.html#Swagger-UI), MLServer now also exposes a set of API docs tailored to individual models, showing the specific endpoints available for each one.

The model-specific autogenerated Swagger UI can be accessed under the following endpoints:

- `/v2/models/{model_name}/docs`
- `/v2/models/{model_name}/versions/{model_version}/docs`

#### HuggingFace Improvements

MLServer now includes improved codec support for all the main types that can be returned by HuggingFace models, ensuring that the values returned via the Open Inference Protocol are more semantic and meaningful.

Massive thanks to @pepesi for taking the lead on improving the HuggingFace runtime!

#### Support for Custom Model Repositories

Internally, MLServer leverages a Model Repository implementation which is used to discover and find the different models (and their versions) available to load. The latest version of MLServer will now allow you to swap this for your own model repository implementation, letting you integrate against your own model repository workflows.

This is exposed via the [model_repository_implementation](https://mlserver.readthedocs.io/en/latest/reference/settings.html#mlserver.settings.Settings.model_repository_implementation) flag of your `settings.json` configuration file.
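An illustrative `settings.json` fragment could look like the following, where `my_package.repository.MyModelRepository` is a placeholder for whatever class implements your repository logic:

```json
{
  "model_repository_implementation": "my_package.repository.MyModelRepository"
}
```

MLServer would then discover and resolve models through that class instead of its default folder-based repository.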
Thanks to @jgallardorama (aka @jgallardorama-itx) for his effort contributing this feature!

#### Batch and Worker Queue Metrics

MLServer `1.3.0` introduces a [new set of metrics](https://mlserver.readthedocs.io/en/latest/user-guide/metrics.html#default-metrics) to increase visibility around two of its internal queues:

- [Adaptive batching](https://mlserver.readthedocs.io/en/latest/user-guide/adaptive-batching.html) queue: used to accumulate request batches on the fly.
- [Parallel inference](https://mlserver.readthedocs.io/en/latest/user-guide/parallel-inference.html) queue: used to send requests over to the inference worker pool.

Many thanks to @alvarorsant for taking the time to implement this highly requested feature!

#### Image Size Optimisations

The latest …

## 1.2.4 (2023-03-10)

**Full Changelog**: https://github.com/SeldonIO/MLServer/compare/1.2.3...1.2.4

## 1.2.3 (2023-01-16)

**Full Changelog**: https://github.com/SeldonIO/MLServer/compare/1.2.2...1.2.3

## 1.2.2 (2023-01-16)

**Full Changelog**: https://github.com/SeldonIO/MLServer/compare/1.2.1...1.2.2

## 1.2.1 (2022-12-19)

**Full Changelog**: https://github.com/SeldonIO/MLServer/compare/1.2.0...1.2.1

## 1.2.0 (2022-11-25)

### What's Changed

#### Simplified Interface for Custom Runtimes

MLServer now exposes an alternative [_"simplified"_ interface](https://mlserver.readthedocs.io/en/latest/user-guide/custom.html#simplified-interface) which can be used to write custom runtimes. This interface can be enabled by decorating your `predict()` method with the `mlserver.codecs.decode_args` decorator; it lets you specify in the method signature both how you want your request payload to be decoded and how to encode the response back.

Based on the information provided in the method signature, MLServer will automatically decode the request payload into the different inputs specified as keyword arguments. Under the hood, this is implemented through [MLServer's codecs and content types system](https://mlserver.readthedocs.io/en/latest/user-guide/content-type.html).

```python
from typing import List

import numpy as np

from mlserver import MLModel
from mlserver.codecs import decode_args


class MyCustomRuntime(MLModel):

    async def load(self) -> bool:
        # TODO: Replace with custom logic to load a model artifact
        self._model = load_my_custom_model()
        self.ready = True
        return self.ready

    @decode_args
    async def predict(self, questions: List[str], context: List[str]) -> np.ndarray:
        # TODO: Replace with custom logic to run inference
        return self._model.predict(questions, context)
```

#### Built-in Templates for Custom Runtimes

To make it easier to write your own custom runtimes, MLServer now ships with an `mlserver init` command that will generate a templated project. This project will include a skeleton with folders, unit tests, Dockerfiles, etc. for you to fill in.

![image1](https://user-images.githubusercontent.com/1577620/203810614-f4daa32e-8b1d-4bea-9b02-959b1d054596.gif)

#### Dynamic Loading of Custom Runtimes

MLServer now lets you [load custom runtimes dynamically](https://mlserver.readthedocs.io/en/latest/user-guide/custom.html#loading-a-custom-mlserver-runtime) into a running instance of MLServer. Once you have your custom runtime ready, all you need to do is move it into your model folder, next to your `model-settings.json` configuration file.

For example, if we assume a flat model repository where each folder represents a model, you would end up with a folder structure like the one below:

```
.
├── models
│   └── sum-model
│       ├── model-settings.json
│       ├── models.py
```

#### Batch Inference Client

This release of MLServer introduces a new [`mlserver infer`](https://mlserver.readthedocs.io/en/latest/reference/cli.html#mlserver-infer) command, which lets you run inference over a large batch of input data on the client side. Under the hood, this command will stream a large set of inference requests from a specified input file, arrange them in microbatches, orchestrate the request / response lifecycle, and finally write the obtained responses back into an output file.
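For a feel of the workflow, an invocation along these lines reads newline-delimited inference requests from a local file and writes the responses back out. The option spellings below are an assumption rather than a guarantee; check `mlserver infer --help` or the linked CLI reference for the authoritative flags:

```bash
# Hypothetical invocation -- flag names are assumptions, not confirmed by the release notes.
# Reads one V2 inference request per line from input.txt, fans them out in
# microbatches against the sum-model served at localhost:8080, and writes
# the responses to output.txt.
mlserver infer -u localhost:8080 -m sum-model -i input.txt -o output.txt --workers 10
```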
#### Parallel Inference Improvements

The `1.2.0` release of MLServer includes a number of fixes around the parallel inference pool, focused on improving the architecture to optimise memory usage and reduce latency. These changes include (but are not limited to):

- The main MLServer process won't load an extra replica of the model anymore. Instead, all computing will occur on the parallel inference pool.
- The worker pool will now ensure that all requests are executed on each worker's AsyncIO loop, thus optimising compute time vs IO time.
- Several improvements around logging from the inference workers.

#### Dropped Support for Python 3.7

MLServer has now dropped support for Python `3.7`. Going forward, only `3.8`, `3.9` and `3.10` will be supported (with `3.8` being used in our official set of images).

#### Move to UBI Base Images

The official set of MLServer images has now moved to use [UBI 9](https://www.redhat.com/en/blog/introducing-red-hat-universal-base-image) as a base image. This ensures support for running MLServer in OpenShift clusters, as well as a well-maintained baseline for our images.

#### Support for MLflow 2.0

In line with MLServer's close relationship with the MLflow team, this release of MLServer introduces support for the recently released MLflow 2.0. This updates the drop-in MLflow "scoring protocol" support in the MLflow runtime for MLServer, to ensure it's aligned with MLflow 2.0.

MLServer is also shipped as a dependency of MLflow, so you can try it out today by installing MLflow as:

```bash
$ pip install mlflow[extras]
```

To learn more about how to use MLServer directly from the MLflow CLI, check out the [MLflow docs](https://www.mlflow.org/docs/latest/models.html#serving-with-mlserver).

### New Contributors
* @johnpaulett made their first contribution in https://github.com/SeldonIO/MLServer/pull/633
* @saeid93 made their first contribution in htt…

## 1.2.0.dev1 (2022-08-01)

## 1.1.0 (2022-08-01)