[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-openvinotoolkit--openvino":3,"tool-openvinotoolkit--openvino":64},[4,17,27,35,43,56],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":16},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,3,"2026-04-05T11:01:52",[13,14,15],"开发框架","图像","Agent","ready",{"id":18,"name":19,"github_repo":20,"description_zh":21,"stars":22,"difficulty_score":23,"last_commit_at":24,"category_tags":25,"status":16},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",140436,2,"2026-04-05T23:32:43",[13,15,26],"语言模型",{"id":28,"name":29,"github_repo":30,"description_zh":31,"stars":32,"difficulty_score":23,"last_commit_at":33,"category_tags":34,"status":16},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",107662,"2026-04-03T11:11:01",[13,14,15],{"id":36,"name":37,"github_repo":38,"description_zh":39,"stars":40,"difficulty_score":23,"last_commit_at":41,"category_tags":42,"status":16},3704,"NextChat","ChatGPTNextWeb\u002FNextChat","NextChat 是一款轻量且极速的 AI 助手，旨在为用户提供流畅、跨平台的大模型交互体验。它完美解决了用户在多设备间切换时难以保持对话连续性，以及面对众多 AI 模型不知如何统一管理的痛点。无论是日常办公、学习辅助还是创意激发，NextChat 都能让用户随时随地通过网页、iOS、Android、Windows、MacOS 或 Linux 端无缝接入智能服务。\n\n这款工具非常适合普通用户、学生、职场人士以及需要私有化部署的企业团队使用。对于开发者而言，它也提供了便捷的自托管方案，支持一键部署到 Vercel 或 Zeabur 等平台。\n\nNextChat 的核心亮点在于其广泛的模型兼容性，原生支持 Claude、DeepSeek、GPT-4 及 Gemini Pro 等主流大模型，让用户在一个界面即可自由切换不同 AI 能力。此外，它还率先支持 MCP（Model Context Protocol）协议，增强了上下文处理能力。针对企业用户，NextChat 提供专业版解决方案，具备品牌定制、细粒度权限控制、内部知识库整合及安全审计等功能，满足公司对数据隐私和个性化管理的高标准要求。",87618,"2026-04-05T07:20:52",[13,26],{"id":44,"name":45,"github_repo":46,"description_zh":47,"stars":48,"difficulty_score":23,"last_commit_at":49,"category_tags":50,"status":16},2268,"ML-For-Beginners","microsoft\u002FML-For-Beginners","ML-For-Beginners 是由微软推出的一套系统化机器学习入门课程，旨在帮助零基础用户轻松掌握经典机器学习知识。这套课程将学习路径规划为 12 周，包含 26 节精炼课程和 52 道配套测验，内容涵盖从基础概念到实际应用的完整流程，有效解决了初学者面对庞大知识体系时无从下手、缺乏结构化指导的痛点。\n\n无论是希望转型的开发者、需要补充算法背景的研究人员，还是对人工智能充满好奇的普通爱好者，都能从中受益。课程不仅提供了清晰的理论讲解，还强调动手实践，让用户在循序渐进中建立扎实的技能基础。其独特的亮点在于强大的多语言支持，通过自动化机制提供了包括简体中文在内的 50 多种语言版本，极大地降低了全球不同背景用户的学习门槛。此外，项目采用开源协作模式，社区活跃且内容持续更新，确保学习者能获取前沿且准确的技术资讯。如果你正寻找一条清晰、友好且专业的机器学习入门之路，ML-For-Beginners 将是理想的起点。",84991,"2026-04-05T10:45:23",[14,51,52,53,15,54,26,13,55],"数据工具","视频","插件","其他","音频",{"id":57,"name":58,"github_repo":59,"description_zh":60,"stars":61,"difficulty_score":10,"last_commit_at":62,"category_tags":63,"status":16},3128,"ragflow","infiniflow\u002Fragflow","RAGFlow 是一款领先的开源检索增强生成（RAG）引擎，旨在为大语言模型构建更精准、可靠的上下文层。它巧妙地将前沿的 RAG 技术与智能体（Agent）能力相结合，不仅支持从各类文档中高效提取知识，还能让模型基于这些知识进行逻辑推理和任务执行。\n\n在大模型应用中，幻觉问题和知识滞后是常见痛点。RAGFlow 通过深度解析复杂文档结构（如表格、图表及混合排版），显著提升了信息检索的准确度，从而有效减少模型“胡编乱造”的现象，确保回答既有据可依又具备时效性。其内置的智能体机制更进一步，使系统不仅能回答问题，还能自主规划步骤解决复杂问题。\n\n这款工具特别适合开发者、企业技术团队以及 AI 研究人员使用。无论是希望快速搭建私有知识库问答系统，还是致力于探索大模型在垂直领域落地的创新者，都能从中受益。RAGFlow 提供了可视化的工作流编排界面和灵活的 API 接口，既降低了非算法背景用户的上手门槛，也满足了专业开发者对系统深度定制的需求。作为基于 Apache 2.0 协议开源的项目，它正成为连接通用大模型与行业专有知识之间的重要桥梁。",77062,"2026-04-04T04:44:48",[15,14,13,26,54],{"id":65,"github_repo":66,"name":67,"description_en":68,"description_zh":69,"ai_summary_zh":70,"readme_en":71,"readme_zh":72,"quickstart_zh":73,"use_case_zh":74,"hero_image_url":75,"owner_login":76,"owner_name":77,"owner_avatar_url":78,"owner_bio":79,"owner_company":80,"owner_location":80,"owner_email":80,"owner_twitter":80,"owner_website":81,"owner_url":82,"languages":83,"stars":120,"forks":121,"last_commit_at":122,"license":123,"difficulty_score":23,"env_os":124,"env_gpu":125,"env_ram":126,"env_deps":127,"category_tags":135,"github_topics":136,"view_count":23,"oss_zip_url":80,"oss_zip_packed_at":80,"status":16,"created_at":155,"updated_at":156,"faqs":157,"releases":187},4070,"openvinotoolkit\u002Fopenvino","openvino","OpenVINO™ is an open source toolkit for optimizing and deploying AI inference","OpenVINO 是一款由英特尔推出的开源工具包，专为优化和部署深度学习模型而设计。它的核心使命是解决 AI 模型在实际应用中推理速度慢、资源消耗大以及跨平台部署困难的问题。无论是计算机视觉、语音识别，还是当下火热的大语言模型与生成式 AI 应用，OpenVINO 都能显著提升其运行效率。\n\n这款工具特别适合 AI 开发者、算法工程师及研究人员使用。它最大的亮点在于极强的兼容性：支持直接导入 PyTorch、TensorFlow、ONNX、PaddlePaddle 等主流框架训练的模型，甚至能无缝对接 Hugging Face 上的热门模型，无需保留原始训练环境即可高效部署。此外，OpenVINO 打破了硬件壁垒，不仅能在 x86 和 ARM 架构的 CPU 上运行，还能充分利用英特尔集成显卡、独立显卡以及专用的 NPU 加速器，实现从边缘设备到云端服务器的灵活部署。\n\n通过简单的命令即可完成安装，配合丰富的官方教程和示例笔记，用户可以快速将复杂的 AI 模型转化为高性能的实际应用。如果你希望在不更换硬件的前提下挖掘现有设备的 AI 算力潜能，或寻求一种通用的模型加速方案，OpenVI","OpenVINO 是一款由英特尔推出的开源工具包，专为优化和部署深度学习模型而设计。它的核心使命是解决 AI 模型在实际应用中推理速度慢、资源消耗大以及跨平台部署困难的问题。无论是计算机视觉、语音识别，还是当下火热的大语言模型与生成式 AI 应用，OpenVINO 都能显著提升其运行效率。\n\n这款工具特别适合 AI 开发者、算法工程师及研究人员使用。它最大的亮点在于极强的兼容性：支持直接导入 PyTorch、TensorFlow、ONNX、PaddlePaddle 等主流框架训练的模型，甚至能无缝对接 Hugging Face 上的热门模型，无需保留原始训练环境即可高效部署。此外，OpenVINO 打破了硬件壁垒，不仅能在 x86 和 ARM 架构的 CPU 上运行，还能充分利用英特尔集成显卡、独立显卡以及专用的 NPU 加速器，实现从边缘设备到云端服务器的灵活部署。\n\n通过简单的命令即可完成安装，配合丰富的官方教程和示例笔记，用户可以快速将复杂的 AI 模型转化为高性能的实际应用。如果你希望在不更换硬件的前提下挖掘现有设备的 AI 算力潜能，或寻求一种通用的模型加速方案，OpenVINO 都是一个值得尝试的专业选择。","\u003Cdiv align=\"center\">\n\u003Cimg src=\"docs\u002Fdev\u002Fassets\u002Fopenvino-logo-purple-black.svg\" width=\"400px\">\n\n\u003Ch3 align=\"center\">\nOpen-source software toolkit for optimizing and deploying deep learning models.\n\u003C\u002Fh3>\n\n\u003Cp align=\"center\">\n \u003Ca href=\"https:\u002F\u002Fdocs.openvino.ai\u002F2026\u002Findex.html\">\u003Cb>Documentation\u003C\u002Fb>\u003C\u002Fa> • \u003Ca href=\"https:\u002F\u002Fblog.openvino.ai\">\u003Cb>Blog\u003C\u002Fb>\u003C\u002Fa> • \u003Ca href=\"https:\u002F\u002Fdocs.openvino.ai\u002F2026\u002Fabout-openvino\u002Fkey-features.html\">\u003Cb>Key Features\u003C\u002Fb>\u003C\u002Fa> • \u003Ca href=\"https:\u002F\u002Fdocs.openvino.ai\u002F2026\u002Fget-started\u002Flearn-openvino.html\">\u003Cb>Tutorials\u003C\u002Fb>\u003C\u002Fa> • \u003Ca href=\"https:\u002F\u002Fdocs.openvino.ai\u002F2026\u002Fdocumentation\u002Fopenvino-ecosystem.html\">\u003Cb>Integrations\u003C\u002Fb>\u003C\u002Fa> • \u003Ca href=\"https:\u002F\u002Fwww.intel.com\u002Fcontent\u002Fwww\u002Fus\u002Fen\u002Fdeveloper\u002Ftools\u002Fopenvino-toolkit\u002Fmodel-hub.html\">\u003Cb>Benchmarks\u003C\u002Fb>\u003C\u002Fa> • \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fopenvinotoolkit\u002Fopenvino.genai\">\u003Cb>Generative AI\u003C\u002Fb>\u003C\u002Fa>\n\u003C\u002Fp>\n\n[![PyPI Status](https:\u002F\u002Fbadge.fury.io\u002Fpy\u002Fopenvino.svg)](https:\u002F\u002Fbadge.fury.io\u002Fpy\u002Fopenvino)\n[![Anaconda Status](https:\u002F\u002Fanaconda.org\u002Fconda-forge\u002Fopenvino\u002Fbadges\u002Fversion.svg)](https:\u002F\u002Fanaconda.org\u002Fconda-forge\u002Fopenvino)\n[![brew Status](https:\u002F\u002Fimg.shields.io\u002Fhomebrew\u002Fv\u002Fopenvino)](https:\u002F\u002Fformulae.brew.sh\u002Fformula\u002Fopenvino)\n\n[![PyPI Downloads](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fopenvinotoolkit_openvino_readme_43070ac0be38.png)](https:\u002F\u002Fpepy.tech\u002Fproject\u002Fopenvino)\n[![Anaconda Downloads](https:\u002F\u002Fanaconda.org\u002Fconda-forge\u002Flibopenvino\u002Fbadges\u002Fdownloads.svg)](https:\u002F\u002Fanaconda.org\u002Fconda-forge\u002Fopenvino\u002Ffiles)\n[![brew Downloads](https:\u002F\u002Fimg.shields.io\u002Fhomebrew\u002Finstalls\u002Fdy\u002Fopenvino)](https:\u002F\u002Fformulae.brew.sh\u002Fformula\u002Fopenvino)\n \u003C\u002Fdiv>\n\n\n- **Inference Optimization**: Boost deep learning performance in computer vision, automatic speech recognition, generative AI, natural language processing with large and small language models, and many other common tasks.\n- **Flexible Model Support**: Use models trained with popular frameworks such as PyTorch, TensorFlow, ONNX, Keras, PaddlePaddle, and JAX\u002FFlax. Directly integrate models built with transformers and diffusers from the Hugging Face Hub using Optimum Intel. Convert and deploy models without original frameworks.\n- **Broad Platform Compatibility**: Reduce resource demands and efficiently deploy on a range of platforms from edge to cloud. OpenVINO™ supports inference on CPU (x86, ARM), GPU (Intel integrated & discrete GPU) and AI accelerators (Intel NPU).\n- **Community and Ecosystem**: Join an active community contributing to the enhancement of deep learning performance across various domains.\n\nCheck out the [OpenVINO Cheat Sheet](https:\u002F\u002Fdocs.openvino.ai\u002F2026\u002F_static\u002Fdownload\u002FOpenVINO_Quick_Start_Guide.pdf) and [Key Features](https:\u002F\u002Fdocs.openvino.ai\u002F2026\u002Fabout-openvino\u002Fkey-features.html) for a quick reference.\n\n\n## Installation\n\n[Get your preferred distribution of OpenVINO](https:\u002F\u002Fdocs.openvino.ai\u002F2026\u002Fget-started\u002Finstall-openvino.html) or use this command for quick installation:\n\n```sh\npip install -U openvino\n```\n\nCheck [system requirements](https:\u002F\u002Fdocs.openvino.ai\u002F2026\u002Fabout-openvino\u002Frelease-notes-openvino\u002Fsystem-requirements.html) and [supported devices](https:\u002F\u002Fdocs.openvino.ai\u002F2026\u002Fdocumentation\u002Fcompatibility-and-support\u002Fsupported-devices.html) for detailed information.\n\n## Tutorials and Examples\n\n[OpenVINO Quickstart example](https:\u002F\u002Fdocs.openvino.ai\u002F2026\u002Fget-started.html) will walk you through the basics of deploying your first model.\n\nLearn how to optimize and deploy popular models with the [OpenVINO Notebooks](https:\u002F\u002Fgithub.com\u002Fopenvinotoolkit\u002Fopenvino_notebooks)📚:\n- [Create an LLM-powered Chatbot using OpenVINO](https:\u002F\u002Fgithub.com\u002Fopenvinotoolkit\u002Fopenvino_notebooks\u002Fblob\u002Flatest\u002Fnotebooks\u002Fllm-chatbot\u002Fllm-chatbot-generate-api.ipynb)\n- [YOLOv11 Optimization](https:\u002F\u002Fgithub.com\u002Fopenvinotoolkit\u002Fopenvino_notebooks\u002Fblob\u002Flatest\u002Fnotebooks\u002Fyolov11-optimization\u002Fyolov11-object-detection.ipynb)\n- [Text-to-Image Generation](https:\u002F\u002Fgithub.com\u002Fopenvinotoolkit\u002Fopenvino_notebooks\u002Fblob\u002Flatest\u002Fnotebooks\u002Ftext-to-image-genai\u002Ftext-to-image-genai.ipynb)\n- [Multimodal assistant with LLaVa and OpenVINO](https:\u002F\u002Fgithub.com\u002Fopenvinotoolkit\u002Fopenvino_notebooks\u002Fblob\u002Flatest\u002Fnotebooks\u002Fllava-multimodal-chatbot\u002Fllava-multimodal-chatbot-genai.ipynb)\n- [Automatic speech recognition using Whisper and OpenVINO](https:\u002F\u002Fgithub.com\u002Fopenvinotoolkit\u002Fopenvino_notebooks\u002Fblob\u002Flatest\u002Fnotebooks\u002Fwhisper-asr-genai\u002Fwhisper-asr-genai.ipynb)\n\nDiscover more examples in the [OpenVINO Samples (Python & C++)](https:\u002F\u002Fdocs.openvino.ai\u002F2026\u002Fget-started\u002Flearn-openvino\u002Fopenvino-samples.html) and [Notebooks (Python)](https:\u002F\u002Fdocs.openvino.ai\u002F2026\u002Fget-started\u002Flearn-openvino\u002Finteractive-tutorials-python.html).\n\nHere are easy-to-follow code examples demonstrating how to run PyTorch and TensorFlow model inference using OpenVINO:\n\n**PyTorch Model**\n\n```python\nimport openvino as ov\nimport torch\nimport torchvision\n\n# load PyTorch model into memory\nmodel = torch.hub.load(\"pytorch\u002Fvision\", \"shufflenet_v2_x1_0\", weights=\"DEFAULT\")\n\n# convert the model into OpenVINO model\nexample = torch.randn(1, 3, 224, 224)\nov_model = ov.convert_model(model, example_input=(example,))\n\n# compile the model for CPU device\ncore = ov.Core()\ncompiled_model = core.compile_model(ov_model, 'CPU')\n\n# infer the model on random data\noutput = compiled_model({0: example.numpy()})\n```\n\n**TensorFlow Model**\n\n```python\nimport numpy as np\nimport openvino as ov\nimport tensorflow as tf\n\n# load TensorFlow model into memory\nmodel = tf.keras.applications.MobileNetV2(weights='imagenet')\n\n# convert the model into OpenVINO model\nov_model = ov.convert_model(model)\n\n# compile the model for CPU device\ncore = ov.Core()\ncompiled_model = core.compile_model(ov_model, 'CPU')\n\n# infer the model on random data\ndata = np.random.rand(1, 224, 224, 3)\noutput = compiled_model({0: data})\n```\n\nOpenVINO supports the CPU, GPU, and NPU [devices](https:\u002F\u002Fdocs.openvino.ai\u002F2026\u002Fopenvino-workflow\u002Frunning-inference\u002Finference-devices-and-modes.html) and works with models from PyTorch, TensorFlow, ONNX, TensorFlow Lite, PaddlePaddle, and JAX\u002FFlax [frameworks](https:\u002F\u002Fdocs.openvino.ai\u002F2026\u002Fopenvino-workflow\u002Fmodel-preparation.html). It includes [APIs](https:\u002F\u002Fdocs.openvino.ai\u002F2026\u002Fapi\u002Fapi_reference.html) in C++, Python, C, NodeJS, and offers the GenAI API for optimized model pipelines and performance.\n\n## Generative AI with OpenVINO\n\nGet started with the OpenVINO GenAI [installation](https:\u002F\u002Fdocs.openvino.ai\u002F2026\u002Fget-started\u002Finstall-openvino\u002Finstall-openvino-genai.html) and refer to the [detailed guide](https:\u002F\u002Fdocs.openvino.ai\u002F2026\u002Fopenvino-workflow-generative\u002Fgenerative-inference.html) to explore the capabilities of Generative AI using OpenVINO.\n\nLearn how to run LLMs and GenAI with [Samples](https:\u002F\u002Fgithub.com\u002Fopenvinotoolkit\u002Fopenvino.genai\u002Ftree\u002Fmaster\u002Fsamples) in the [OpenVINO™ GenAI repo](https:\u002F\u002Fgithub.com\u002Fopenvinotoolkit\u002Fopenvino.genai). See GenAI in action with Jupyter notebooks: [LLM-powered Chatbot](https:\u002F\u002Fgithub.com\u002Fopenvinotoolkit\u002Fopenvino_notebooks\u002Ftree\u002Flatest\u002Fnotebooks\u002Fllm-chatbot) and [LLM Instruction-following pipeline](https:\u002F\u002Fgithub.com\u002Fopenvinotoolkit\u002Fopenvino_notebooks\u002Ftree\u002Flatest\u002Fnotebooks\u002Fllm-question-answering).\n\n## Documentation\n\n[User documentation](https:\u002F\u002Fdocs.openvino.ai\u002F) contains detailed information about OpenVINO and guides you from installation through optimizing and deploying models for your AI applications.\n\n[Developer documentation](.\u002Fdocs\u002Fdev\u002Findex.md) focuses on the OpenVINO architecture and describes [building](.\u002Fdocs\u002Fdev\u002Fbuild.md)  and [contributing](.\u002FCONTRIBUTING.md) processes.\n\n## OpenVINO Ecosystem\n\n### OpenVINO Tools\n\n-   [Neural Network Compression Framework (NNCF)](https:\u002F\u002Fgithub.com\u002Fopenvinotoolkit\u002Fnncf) - advanced model optimization techniques including quantization, and sparsity.\n-   [GenAI Repository](https:\u002F\u002Fgithub.com\u002Fopenvinotoolkit\u002Fopenvino.genai) and [OpenVINO Tokenizers](https:\u002F\u002Fgithub.com\u002Fopenvinotoolkit\u002Fopenvino_tokenizers) - resources and tools for developing and optimizing Generative AI applications.\n-   [OpenVINO™ Model Server (OVMS)](https:\u002F\u002Fgithub.com\u002Fopenvinotoolkit\u002Fmodel_server) - a scalable, high-performance solution for serving models optimized for Intel architectures.\n-   [Intel® Geti™](https:\u002F\u002Fgeti.intel.com\u002F) - an interactive video and image annotation tool for computer vision use cases.\n\n### Integrations\n\n-   [🤗Optimum Intel](https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel) - grab and use models leveraging OpenVINO within the Hugging Face API.\n-   [Torch.compile](https:\u002F\u002Fdocs.openvino.ai\u002F2026\u002Fopenvino-workflow\u002Ftorch-compile.html) - use OpenVINO for Python-native applications by JIT-compiling code into optimized kernels.\n-   [ExecuTorch](https:\u002F\u002Fgithub.com\u002Fpytorch\u002Fexecutorch\u002Fblob\u002Fmain\u002Fbackends\u002Fopenvino\u002FREADME.md) - use ExecuTorch with OpenVINO to optimize and run AI models efficiently.\n-   [OpenVINO LLMs inference and serving with vLLM​](https:\u002F\u002Fgithub.com\u002Fvllm-project\u002Fvllm-openvino) - enhance vLLM's fast and easy model serving with the OpenVINO backend.\n-   [OpenVINO Execution Provider for ONNX Runtime](https:\u002F\u002Fonnxruntime.ai\u002Fdocs\u002Fexecution-providers\u002FOpenVINO-ExecutionProvider.html) - use OpenVINO as a backend with your existing ONNX Runtime code.\n-   [LlamaIndex](https:\u002F\u002Fdocs.llamaindex.ai\u002Fen\u002Fstable\u002Fexamples\u002Fllm\u002Fopenvino\u002F) - build context-augmented GenAI applications with the LlamaIndex framework and enhance runtime performance with OpenVINO.\n-   [LangChain](https:\u002F\u002Fpython.langchain.com\u002Fdocs\u002Fintegrations\u002Fllms\u002Fopenvino\u002F) - integrate OpenVINO with the LangChain framework to enhance runtime performance for GenAI applications.\n-   [Keras 3](https:\u002F\u002Fgithub.com\u002Fkeras-team\u002Fkeras) - Keras 3 is a multi-backend deep learning framework. Users can switch model inference to the OpenVINO backend using the Keras API.\n\nCheck out the [Awesome OpenVINO](https:\u002F\u002Fgithub.com\u002Fopenvinotoolkit\u002Fawesome-openvino) repository to discover a collection of community-made AI projects based on OpenVINO!\n\n## Performance\n\nExplore [OpenVINO Performance Benchmarks](https:\u002F\u002Fdocs.openvino.ai\u002F2026\u002Fabout-openvino\u002Fperformance-benchmarks.html) to discover the optimal hardware configurations and plan your AI deployment based on verified data.\n\n## Contribution and Support\n\nCheck out [Contribution Guidelines](.\u002FCONTRIBUTING.md) for more details.\nRead the [Good First Issues section](.\u002FCONTRIBUTING.md#3-start-working-on-your-good-first-issue), if you're looking for a place to start contributing. We welcome contributions of all kinds!\n\nYou can ask questions and get support on:\n\n* [GitHub Issues](https:\u002F\u002Fgithub.com\u002Fopenvinotoolkit\u002Fopenvino\u002Fissues).\n* OpenVINO channels on the [Intel DevHub Discord server](https:\u002F\u002Fdiscord.gg\u002F7pVRxUwdWG).\n* The [`openvino`](https:\u002F\u002Fstackoverflow.com\u002Fquestions\u002Ftagged\u002Fopenvino) tag on Stack Overflow\\*.\n\n\n## Resources\n\n* [Release Notes](https:\u002F\u002Fdocs.openvino.ai\u002F2026\u002Fabout-openvino\u002Frelease-notes-openvino.html)\n* [OpenVINO Blog](https:\u002F\u002Fblog.openvino.ai\u002F)\n* [OpenVINO™ toolkit on Medium](https:\u002F\u002Fmedium.com\u002F@openvino)\n\n\n## Telemetry\n\nOpenVINO™ collects software performance and usage data for the purpose of improving OpenVINO™ tools.\nThis data is collected directly by OpenVINO™ or through the use of Google Analytics 4.\nYou can opt-out at any time by running the command:\n\n``` bash\nopt_in_out --opt_out\n```\n\nMore Information is available at [OpenVINO™ Telemetry](https:\u002F\u002Fdocs.openvino.ai\u002F2026\u002Fabout-openvino\u002Fadditional-resources\u002Ftelemetry.html).\n\n## License\n\nOpenVINO™ Toolkit is licensed under [Apache License Version 2.0](LICENSE).\nBy contributing to the project, you agree to the license and copyright terms therein and release your contribution under these terms.\n\n---\n\\* Other names and brands may be claimed as the property of others.\n","\u003Cdiv align=\"center\">\n\u003Cimg src=\"docs\u002Fdev\u002Fassets\u002Fopenvino-logo-purple-black.svg\" width=\"400px\">\n\n\u003Ch3 align=\"center\">\n面向深度学习模型优化与部署的开源软件工具包。\n\u003C\u002Fh3>\n\n\u003Cp align=\"center\">\n \u003Ca href=\"https:\u002F\u002Fdocs.openvino.ai\u002F2026\u002Findex.html\">\u003Cb>文档\u003C\u002Fb>\u003C\u002Fa> • \u003Ca href=\"https:\u002F\u002Fblog.openvino.ai\">\u003Cb>博客\u003C\u002Fb>\u003C\u002Fa> • \u003Ca href=\"https:\u002F\u002Fdocs.openvino.ai\u002F2026\u002Fabout-openvino\u002Fkey-features.html\">\u003Cb>核心特性\u003C\u002Fb>\u003C\u002Fa> • \u003Ca href=\"https:\u002F\u002Fdocs.openvino.ai\u002F2026\u002Fget-started\u002Flearn-openvino.html\">\u003Cb>教程\u003C\u002Fb>\u003C\u002Fa> • \u003Ca href=\"https:\u002F\u002Fdocs.openvino.ai\u002F2026\u002Fdocumentation\u002Fopenvino-ecosystem.html\">\u003Cb>集成\u003C\u002Fb>\u003C\u002Fa> • \u003Ca href=\"https:\u002F\u002Fwww.intel.com\u002Fcontent\u002Fwww\u002Fus\u002Fen\u002Fdeveloper\u002Ftools\u002Fopenvino-toolkit\u002Fmodel-hub.html\">\u003Cb>基准测试\u003C\u002Fb>\u003C\u002Fa> • \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fopenvinotoolkit\u002Fopenvino.genai\">\u003Cb>生成式AI\u003C\u002Fb>\u003C\u002Fa>\n\u003C\u002Fp>\n\n[![PyPI状态](https:\u002F\u002Fbadge.fury.io\u002Fpy\u002Fopenvino.svg)](https:\u002F\u002Fbadge.fury.io\u002Fpy\u002Fopenvino)\n[![Anaconda状态](https:\u002F\u002Fanaconda.org\u002Fconda-forge\u002Fopenvino\u002Fbadges\u002Fversion.svg)](https:\u002F\u002Fanaconda.org\u002Fconda-forge\u002Fopenvino)\n[![brew状态](https:\u002F\u002Fimg.shields.io\u002Fhomebrew\u002Fv\u002Fopenvino)](https:\u002F\u002Fformulae.brew.sh\u002Fformula\u002Fopenvino)\n\n[![PyPI下载量](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fopenvinotoolkit_openvino_readme_43070ac0be38.png)](https:\u002F\u002Fpepy.tech\u002Fproject\u002Fopenvino)\n[![Anaconda下载量](https:\u002F\u002Fanaconda.org\u002Fconda-forge\u002Flibopenvino\u002Fbadges\u002Fdownloads.svg)](https:\u002F\u002Fanaconda.org\u002Fconda-forge\u002Fopenvino\u002Ffiles)\n[![brew下载量](https:\u002F\u002Fimg.shields.io\u002Fhomebrew\u002Finstalls\u002Fdy\u002Fopenvino)](https:\u002F\u002Fformulae.brew.sh\u002Fformula\u002Fopenvino)\n \u003C\u002Fdiv>\n\n\n- **推理优化**：在计算机视觉、自动语音识别、生成式AI、自然语言处理（包括大型和小型语言模型）以及其他常见任务中，大幅提升深度学习性能。\n- **灵活的模型支持**：支持使用 PyTorch、TensorFlow、ONNX、Keras、PaddlePaddle 和 JAX\u002FFlax 等主流框架训练的模型。通过 Optimum Intel 直接集成 Hugging Face Hub 中基于 Transformer 和 Diffusers 构建的模型；也可对无原始框架的模型进行转换并部署。\n- **广泛的平台兼容性**：降低资源需求，在从边缘到云端的多种平台上高效部署。OpenVINO™ 支持在 CPU（x86、ARM）、GPU（英特尔集成及独立 GPU）以及 AI 加速器（英特尔 NPU）上进行推理。\n- **社区与生态**：加入活跃的社区，共同推动各领域深度学习性能的提升。\n\n请查看 [OpenVINO 备忘录](https:\u002F\u002Fdocs.openvino.ai\u002F2026\u002F_static\u002Fdownload\u002FOpenVINO_Quick_Start_Guide.pdf) 和 [核心特性](https:\u002F\u002Fdocs.openvino.ai\u002F2026\u002Fabout-openvino\u002Fkey-features.html)，以快速了解相关信息。\n\n\n## 安装\n\n[获取您偏好的 OpenVINO 发行版](https:\u002F\u002Fdocs.openvino.ai\u002F2026\u002Fget-started\u002Finstall-openvino.html)，或使用以下命令快速安装：\n\n```sh\npip install -U openvino\n```\n\n有关详细信息，请查阅 [系统要求](https:\u002F\u002Fdocs.openvino.ai\u002F2026\u002Fabout-openvino\u002Frelease-notes-openvino\u002Fsystem-requirements.html) 和 [支持的设备](https:\u002F\u002Fdocs.openvino.ai\u002F2026\u002Fdocumentation\u002Fcompatibility-and-support\u002Fsupported-devices.html)。\n\n## 教程与示例\n\n[OpenVINO 快速入门示例](https:\u002F\u002Fdocs.openvino.ai\u002F2026\u002Fget-started.html) 将引导您完成首次模型部署的基础操作。\n\n通过 [OpenVINO 笔记本](https:\u002F\u002Fgithub.com\u002Fopenvinotoolkit\u002Fopenvino_notebooks)📚 学习如何优化并部署热门模型：\n- [使用 OpenVINO 构建 LLM 驱动的聊天机器人](https:\u002F\u002Fgithub.com\u002Fopenvinotoolkit\u002Fopenvino_notebooks\u002Fblob\u002Flatest\u002Fnotebooks\u002Fllm-chatbot\u002Fllm-chatbot-generate-api.ipynb)\n- [YOLOv11 优化](https:\u002F\u002Fgithub.com\u002Fopenvinotoolkit\u002Fopenvino_notebooks\u002Fblob\u002Flatest\u002Fnotebooks\u002Fyolov11-optimization\u002Fyolov11-object-detection.ipynb)\n- [文本到图像生成](https:\u002F\u002Fgithub.com\u002Fopenvinotoolkit\u002Fopenvino_notebooks\u002Fblob\u002Flatest\u002Fnotebooks\u002Ftext-to-image-genai\u002Ftext-to-image-genai.ipynb)\n- [结合 LLaVa 与 OpenVINO 的多模态助手](https:\u002F\u002Fgithub.com\u002Fopenvinotoolkit\u002Fopenvino_notebooks\u002Fblob\u002Flatest\u002Fnotebooks\u002Fllava-multimodal-chatbot\u002Fllava-multimodal-chatbot-genai.ipynb)\n- [使用 Whisper 和 OpenVINO 进行自动语音识别](https:\u002F\u002Fgithub.com\u002Fopenvinotoolkit\u002Fopenvino_notebooks\u002Fblob\u002Flatest\u002Fnotebooks\u002Fwhisper-asr-genai\u002Fwhisper-asr-genai.ipynb)\n\n更多示例请参阅 [OpenVINO 示例（Python & C++）](https:\u002F\u002Fdocs.openvino.ai\u002F2026\u002Fget-started\u002Flearn-openvino\u002Fopenvino-samples.html) 和 [笔记本（Python）](https:\u002F\u002Fdocs.openvino.ai\u002F2026\u002Fget-started\u002Flearn-openvino\u002Finteractive-tutorials-python.html)。\n\n以下是使用 OpenVINO 运行 PyTorch 和 TensorFlow 模型推理的简单代码示例：\n\n**PyTorch 模型**\n\n```python\nimport openvino as ov\nimport torch\nimport torchvision\n\n# 将 PyTorch 模型加载到内存中\nmodel = torch.hub.load(\"pytorch\u002Fvision\", \"shufflenet_v2_x1_0\", weights=\"DEFAULT\")\n\n# 将模型转换为 OpenVINO 格式\nexample = torch.randn(1, 3, 224, 224)\nov_model = ov.convert_model(model, example_input=(example,))\n\n# 在 CPU 设备上编译模型\ncore = ov.Core()\ncompiled_model = core.compile_model(ov_model, 'CPU')\n\n# 对随机数据进行推理\noutput = compiled_model({0: example.numpy()})\n```\n\n**TensorFlow 模型**\n\n```python\nimport numpy as np\nimport openvino as ov\nimport tensorflow as tf\n\n# 将 TensorFlow 模型加载到内存中\nmodel = tf.keras.applications.MobileNetV2(weights='imagenet')\n\n# 将模型转换为 OpenVINO 格式\nov_model = ov.convert_model(model)\n\n# 在 CPU 设备上编译模型\ncore = ov.Core()\ncompiled_model = core.compile_model(ov_model, 'CPU')\n\n# 对随机数据进行推理\ndata = np.random.rand(1, 224, 224, 3)\noutput = compiled_model({0: data})\n```\n\nOpenVINO 支持 CPU、GPU 和 NPU [设备](https:\u002F\u002Fdocs.openvino.ai\u002F2026\u002Fopenvino-workflow\u002Frunning-inference\u002Finference-devices-and-modes.html)，并可与 PyTorch、TensorFlow、ONNX、TensorFlow Lite、PaddlePaddle 和 JAX\u002FFlax [框架](https:\u002F\u002Fdocs.openvino.ai\u002F2026\u002Fopenvino-workflow\u002Fmodel-preparation.html) 兼容。它提供 [API](https:\u002F\u002Fdocs.openvino.ai\u002F2026\u002Fapi\u002Fapi_reference.html) 接口，涵盖 C++、Python、C、NodeJS，并推出 GenAI API，用于优化模型流水线和性能。\n\n## 使用 OpenVINO 的生成式 AI\n\n开始使用 OpenVINO GenAI 的[安装指南](https:\u002F\u002Fdocs.openvino.ai\u002F2026\u002Fget-started\u002Finstall-openvino\u002Finstall-openvino-genai.html)，并参考[详细指南](https:\u002F\u002Fdocs.openvino.ai\u002F2026\u002Fopenvino-workflow-generative\u002Fgenerative-inference.html)，以探索利用 OpenVINO 实现生成式 AI 的能力。\n\n通过 [OpenVINO™ GenAI 仓库](https:\u002F\u002Fgithub.com\u002Fopenvinotoolkit\u002Fopenvino.genai) 中的[示例](https:\u002F\u002Fgithub.com\u002Fopenvinotoolkit\u002Fopenvino.genai\u002Ftree\u002Fmaster\u002Fsamples)，了解如何运行 LLM 和生成式 AI。您还可以通过 Jupyter 笔记本查看生成式 AI 的实际应用：[基于 LLM 的聊天机器人](https:\u002F\u002Fgithub.com\u002Fopenvinotoolkit\u002Fopenvino_notebooks\u002Ftree\u002Flatest\u002Fnotebooks\u002Fllm-chatbot)和[LLM 指令遵循流水线](https:\u002F\u002Fgithub.com\u002Fopenvinotoolkit\u002Fopenvino_notebooks\u002Ftree\u002Flatest\u002Fnotebooks\u002Fllm-question-answering)。\n\n## 文档\n\n[用户文档](https:\u002F\u002Fdocs.openvino.ai\u002F) 提供了关于 OpenVINO 的详细信息，指导您完成从安装到优化及部署模型以用于 AI 应用的全过程。\n\n[开发者文档](.\u002Fdocs\u002Fdev\u002Findex.md) 专注于 OpenVINO 的架构，并介绍了[构建](.\u002Fdocs\u002Fdev\u002Fbuild.md)和[贡献](.\u002FCONTRIBUTING.md)流程。\n\n## OpenVINO 生态系统\n\n### OpenVINO 工具\n\n-   [神经网络压缩框架 (NNCF)](https:\u002F\u002Fgithub.com\u002Fopenvinotoolkit\u002Fnncf) - 包含量化和稀疏化等高级模型优化技术。\n-   [GenAI 仓库](https:\u002F\u002Fgithub.com\u002Fopenvinotoolkit\u002Fopenvino.genai) 和 [OpenVINO 分词器](https:\u002F\u002Fgithub.com\u002Fopenvinotoolkit\u002Fopenvino_tokenizers) - 用于开发和优化生成式 AI 应用的资源与工具。\n-   [OpenVINO™ 模型服务器 (OVMS)](https:\u002F\u002Fgithub.com\u002Fopenvinotoolkit\u002Fmodel_server) - 一种可扩展、高性能的解决方案，用于服务针对 Intel 架构优化的模型。\n-   [Intel® Geti™](https:\u002F\u002Fgeti.intel.com\u002F) - 一款面向计算机视觉应用场景的交互式视频和图像标注工具。\n\n### 集成\n\n-   [🤗Optimum Intel](https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel) - 在 Hugging Face API 中获取并使用利用 OpenVINO 的模型。\n-   [Torch.compile](https:\u002F\u002Fdocs.openvino.ai\u002F2026\u002Fopenvino-workflow\u002Ftorch-compile.html) - 通过将代码即时编译为优化后的内核，使 Python 原生应用能够使用 OpenVINO。\n-   [ExecuTorch](https:\u002F\u002Fgithub.com\u002Fpytorch\u002Fexecutorch\u002Fblob\u002Fmain\u002Fbackends\u002Fopenvino\u002FREADME.md) - 结合 ExecuTorch 与 OpenVINO，高效优化并运行 AI 模型。\n-   [OpenVINO LLM 推理与服务结合 vLLM​](https:\u002F\u002Fgithub.com\u002Fvllm-project\u002Fvllm-openvino) - 利用 OpenVINO 后端增强 vLLM 快速便捷的模型服务功能。\n-   [OpenVINO 执行提供者用于 ONNX Runtime](https:\u002F\u002Fonnxruntime.ai\u002Fdocs\u002Fexecution-providers\u002FOpenVINO-ExecutionProvider.html) - 将 OpenVINO 作为后端与您现有的 ONNX Runtime 代码结合使用。\n-   [LlamaIndex](https:\u002F\u002Fdocs.llamaindex.ai\u002Fen\u002Fstable\u002Fexamples\u002Fllm\u002Fopenvino\u002F) - 使用 LlamaIndex 框架构建上下文增强的生成式 AI 应用，并借助 OpenVINO 提升运行时性能。\n-   [LangChain](https:\u002F\u002Fpython.langchain.com\u002Fdocs\u002Fintegrations\u002Fllms\u002Fopenvino\u002F) - 将 OpenVINO 与 LangChain 框架集成，以提升生成式 AI 应用的运行时性能。\n-   [Keras 3](https:\u002F\u002Fgithub.com\u002Fkeras-team\u002Fkeras) - Keras 3 是一个支持多种后端的深度学习框架。用户可以通过 Keras API 将模型推理切换至 OpenVINO 后端。\n\n请查看 [Awesome OpenVINO](https:\u002F\u002Fgithub.com\u002Fopenvinotoolkit\u002Fawesome-openvino) 仓库，发现由社区基于 OpenVINO 构建的一系列 AI 项目！\n\n## 性能\n\n浏览 [OpenVINO 性能基准测试](https:\u002F\u002Fdocs.openvino.ai\u002F2026\u002Fabout-openvino\u002Fperformance-benchmarks.html)，了解最佳硬件配置，并根据经过验证的数据规划您的 AI 部署。\n\n## 贡献与支持\n\n有关更多详情，请参阅[贡献指南](.\u002FCONTRIBUTING.md)。\n如果您正在寻找入门贡献的机会，请阅读[首次贡献问题部分](.\u002FCONTRIBUTING.md#3-start-working-on-your-good-first-issue)。我们欢迎各种形式的贡献！\n\n您可以在以下渠道提问并获得支持：\n\n* [GitHub Issues](https:\u002F\u002Fgithub.com\u002Fopenvinotoolkit\u002Fopenvino\u002Fissues)。\n* Intel DevHub Discord 服务器上的 OpenVINO 频道 ([discord.gg\u002F7pVRxUwdWG](https:\u002F\u002Fdiscord.gg\u002F7pVRxUwdWG))。\n* Stack Overflow 上的 [`openvino`](https:\u002F\u002Fstackoverflow.com\u002Fquestions\u002Ftagged\u002Fopenvino) 标签\\*。\n\n## 资源\n\n* [发行说明](https:\u002F\u002Fdocs.openvino.ai\u002F2026\u002Fabout-openvino\u002Frelease-notes-openvino.html)\n* [OpenVINO 博客](https:\u002F\u002Fblog.openvino.ai\u002F)\n* [Medium 上的 OpenVINO™ 工具包](https:\u002F\u002Fmedium.com\u002F@openvino)\n\n## 遥测\n\nOpenVINO™ 会收集软件性能和使用数据，以改进 OpenVINO™ 工具。这些数据由 OpenVINO™ 直接收集，或通过 Google Analytics 4 收集。您可以随时通过运行以下命令选择退出：\n\n``` bash\nopt_in_out --opt_out\n```\n\n更多信息请参阅 [OpenVINO™ 遥测](https:\u002F\u002Fdocs.openvino.ai\u002F2026\u002Fabout-openvino\u002Fadditional-resources\u002Ftelemetry.html)。\n\n## 许可证\n\nOpenVINO™ 工具包采用 [Apache License Version 2.0](LICENSE) 许可。通过参与该项目，您同意其中的许可和版权条款，并在此基础上发布您的贡献。\n\n---\n\\* 其他名称和品牌可能属于其各自所有者。","# OpenVINO 快速上手指南\n\nOpenVINO™ 是一个开源软件工具包，用于优化和部署深度学习模型。它支持在 CPU、GPU（Intel 集成\u002F独立显卡）和 NPU 等多种硬件上高效运行，兼容 PyTorch、TensorFlow、ONNX 等主流框架。\n\n## 1. 环境准备\n\n### 系统要求\n- **操作系统**：Linux (Ubuntu\u002FCentOS), Windows, macOS\n- **处理器**：\n  - CPU: x86_64 或 ARM64\n  - GPU: Intel® Integrated Graphics 或 Discrete Graphics (需安装对应驱动)\n  - NPU: Intel® AI Boost (适用于最新酷睿 Ultra 处理器)\n- **Python 版本**：3.9 - 3.12\n\n### 前置依赖\n确保已安装 `pip` 和构建工具。对于 Linux 用户，建议先更新系统包：\n```bash\nsudo apt update && sudo apt install -y python3-pip python3-dev\n```\n*(Windows\u002FmacOS 用户通常无需额外安装系统级依赖，只需确保 Python 环境正常)*\n\n> **国内加速提示**：如果遇到网络问题，建议使用国内镜像源安装（见下文安装步骤）。\n\n## 2. 安装步骤\n\n### 方式一：使用 pip 安装（推荐）\n\n**通用命令：**\n```sh\npip install -U openvino\n```\n\n**使用国内镜像源加速（推荐中国开发者）：**\n```sh\npip install -U openvino -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple\n```\n\n### 方式二：使用 Conda 安装\n```sh\nconda install -c conda-forge openvino\n```\n*(国内用户可配置 conda 清华源以加速下载)*\n\n### 验证安装\n安装完成后，运行以下命令检查版本：\n```sh\npython -c \"import openvino as ov; print(ov.__version__)\"\n```\n\n## 3. 基本使用\n\nOpenVINO 的核心工作流分为三步：**加载模型** -> **转换模型** -> **编译并推理**。\n\n以下是最简单的代码示例，展示如何将 PyTorch 和 TensorFlow 模型转换为 OpenVINO 格式并运行推理。\n\n### 示例 A：运行 PyTorch 模型\n\n```python\nimport openvino as ov\nimport torch\nimport torchvision\n\n# 1. 加载预训练的 PyTorch 模型\nmodel = torch.hub.load(\"pytorch\u002Fvision\", \"shufflenet_v2_x1_0\", weights=\"DEFAULT\")\n\n# 2. 准备示例输入并转换模型为 OpenVINO 格式\nexample = torch.randn(1, 3, 224, 224)\nov_model = ov.convert_model(model, example_input=(example,))\n\n# 3. 初始化核心并编译模型到 CPU 设备\ncore = ov.Core()\ncompiled_model = core.compile_model(ov_model, 'CPU')\n\n# 4. 执行推理\noutput = compiled_model({0: example.numpy()})\nprint(\"Inference successful!\")\n```\n\n### 示例 B：运行 TensorFlow 模型\n\n```python\nimport numpy as np\nimport openvino as ov\nimport tensorflow as tf\n\n# 1. 加载预训练的 TensorFlow 模型\nmodel = tf.keras.applications.MobileNetV2(weights='imagenet')\n\n# 2. 转换模型为 OpenVINO 格式\nov_model = ov.convert_model(model)\n\n# 3. 初始化核心并编译模型到 CPU 设备\ncore = ov.Core()\ncompiled_model = core.compile_model(ov_model, 'CPU')\n\n# 4. 准备随机数据并执行推理\ndata = np.random.rand(1, 224, 224, 3).astype(np.float32)\noutput = compiled_model({0: data})\nprint(\"Inference successful!\")\n```\n\n### 进阶提示\n- **更换设备**：将 `compile_model` 中的 `'CPU'` 替换为 `'GPU'` 或 `'NPU'` 即可利用相应硬件加速（需确保硬件驱动已正确安装）。\n- **模型来源**：除了直接转换，还可以通过 Hugging Face (`optimum-intel`) 或直接加载 ONNX\u002FPaddlePaddle 模型。\n- **生成式 AI**：如需运行 LLM 或 Stable Diffusion，请额外安装 `openvino-genai` 包并参考官方 GenAI 教程。","一家智能零售初创公司正试图在普通的 Intel 笔记本上部署 YOLOv11 模型，用于实时分析店内监控视频以统计客流和识别货架空缺。\n\n### 没有 openvino 时\n- **推理延迟高**：直接使用 PyTorch 原生推理，帧率仅维持在 15 FPS，无法流畅处理高清视频流，导致关键动作捕捉丢失。\n- **硬件资源浪费**：模型仅能利用 CPU 的部分通用算力，无法调用 Intel 集成显卡或 NPU 进行加速，风扇狂转但计算效率低下。\n- **部署门槛高**：生产环境必须安装庞大的 PyTorch 运行时依赖，导致容器镜像体积超过 2GB，启动缓慢且难以在边缘设备移植。\n- **能耗过高**：由于缺乏底层算子优化，持续运行导致笔记本电池在两小时内耗尽，无法满足移动巡检需求。\n\n### 使用 openvino 后\n- **性能显著提升**：通过 openvino 将模型量化并编译为 IR 格式，推理速度飙升至 60+ FPS，实现了丝滑的实时视频分析。\n- **异构计算激活**：openvino 自动调度任务至 Intel GPU 和 NPU，CPU 占用率降低 70%，系统整体响应更加敏捷。\n- **轻量化部署**：移除重型深度学习框架依赖，仅需 openvino 运行时即可运行，容器体积缩小至 300MB 以内，秒级启动。\n- **能效比优化**：得益于针对性的指令集优化，同等任务下功耗降低 50%，设备续航时间延长一倍，支持全天候移动作业。\n\nopenvino 通过将算法模型与硬件底层深度解耦并重新编排，让普通商用电脑也能变身高效的 AI 推理终端。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fopenvinotoolkit_openvino_b705117c.png","openvinotoolkit","OpenVINO™ Toolkit","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Fopenvinotoolkit_3a5e7b58.png","",null,"https:\u002F\u002Fdocs.openvino.ai\u002F","https:\u002F\u002Fgithub.com\u002Fopenvinotoolkit",[84,88,92,96,100,104,108,111,114,117],{"name":85,"color":86,"percentage":87},"C++","#f34b7d",87.3,{"name":89,"color":90,"percentage":91},"Python","#3572A5",8.1,{"name":93,"color":94,"percentage":95},"C","#555555",3.5,{"name":97,"color":98,"percentage":99},"CMake","#DA3434",0.8,{"name":101,"color":102,"percentage":103},"JavaScript","#f1e05a",0.1,{"name":105,"color":106,"percentage":107},"Shell","#89e051",0,{"name":109,"color":110,"percentage":107},"HTML","#e34c26",{"name":112,"color":113,"percentage":107},"TypeScript","#3178c6",{"name":115,"color":116,"percentage":107},"CSS","#663399",{"name":118,"color":119,"percentage":107},"PowerShell","#012456",10007,3171,"2026-04-05T20:03:02","Apache-2.0","Linux, macOS, Windows","非必需。支持 Intel 集成显卡、独立显卡及 NPU；未提及对 NVIDIA GPU 或 CUDA 版本的依赖。","未说明",{"notes":128,"python":129,"dependencies":130},"OpenVINO 主要优化运行于 Intel 硬件（CPU x86\u002FARM, Intel GPU, Intel NPU）。支持多种框架（PyTorch, TensorFlow, ONNX 等）的模型转换与推理。详细系统要求和设备兼容性需参考官方文档链接。生成式 AI 功能需额外安装 openvino-genai。","未说明 (可通过 PyPI\u002FConda 安装)",[131,132,133,134],"torch (可选，用于模型转换)","tensorflow (可选，用于模型转换)","onnx (可选)","optimum-intel (可选，用于 Hugging Face 集成)",[26,15,13,14,55],[137,138,67,139,140,141,142,143,144,145,146,147,148,149,150,151,152,153,154],"inference","deep-learning","ai","computer-vision","diffusion-models","generative-ai","llm-inference","natural-language-processing","nlp","performance-boost","speech-recognition","stable-diffusion","deploy-ai","optimize-ai","transformers","yolo","recommendation-system","good-first-issue","2026-03-27T02:49:30.150509","2026-04-06T09:24:04.937990",[158,163,168,173,178,183],{"id":159,"question_zh":160,"answer_zh":161,"source_url":162},18537,"如何在 ARM64\u002FAARCH64 架构（如 Raspberry Pi 或 O-Droid）上构建和运行 OpenVINO？","虽然官方支持有限，但社区已成功在 O-Droid XU4 等设备上运行推理引擎。对于 ARM64 架构，可以使用 Schroot 环境进行构建。有一个专门的仓库分享了在 ARM64 上使用 Schroot 设置 OpenVINO R5 的详细步骤，包括所需的 cmake 标志和 schroot 配置。参考项目：https:\u002F\u002Fgithub.com\u002Fskhameneh\u002FOpenVINO-ARM64","https:\u002F\u002Fgithub.com\u002Fopenvinotoolkit\u002Fopenvino\u002Fissues\u002F3",{"id":164,"question_zh":165,"answer_zh":166,"source_url":167},18538,"在 Android 上使用 NCS2 (MYRIAD) 设备时遇到 libusb1.0.so 缺失或崩溃问题如何解决？","在 Android 上运行 NCS2 需要 libusb 支持。如果遇到找不到 libusb1.0.so 的错误，需要使用 NDK 自行编译 libusb 并将其放入应用中。此外，崩溃可能与 SELinux 权限有关（如日志中显示的 avc: denied { read } for name=\"usb\"），尝试将 SELinux 设置为 Permissive 模式可能有助于排查，但根本解决通常需要确保应用拥有访问 USB 设备的正确权限以及正确的 libusb 库路径。","https:\u002F\u002Fgithub.com\u002Fopenvinotoolkit\u002Fopenvino\u002Fissues\u002F4925",{"id":169,"question_zh":170,"answer_zh":171,"source_url":172},18539,"转换 TensorFlow Faster R-CNN 自定义模型时报错 'ObjectDetectionAPIProposalReplacement didn't work' 怎么办？","该错误通常发生在转换自定义训练的 TensorFlow Faster R-CNN 模型（如 faster_rcnn_resnet101 或 faster_rcnn_inception_v2）时，表明模型优化器中的替换逻辑未能找到预期的节点（如 'swapped_proposals' 或 'crop_proposals'）。这通常是因为自定义训练修改了图结构，导致默认的 Object Detection API 替换规则失效。建议检查模型结构是否符合标准 TF OD API 格式，或在论坛中提供详细日志以获取针对特定模型结构的修复方案。","https:\u002F\u002Fgithub.com\u002Fopenvinotoolkit\u002Fopenvino\u002Fissues\u002F28",{"id":174,"question_zh":175,"answer_zh":176,"source_url":177},18540,"在 C++ 中编译模型时出现内存溢出 (OOM) 如何处理？","当使用 `core.compile_model()` 编译大型模型或特定形状（如长序列音频模型 HiFiGAN）时，可能会消耗大量内存导致 OOM。解决方案包括：1. 避免在内存受限的环境中直接编译大型 ONNX 模型；2. 考虑预先将模型转换为 IR 格式并在内存更充足的环境中编译；3. 如果输入维度变化大（如 mel length 从 100 到 1300），尝试使用动态形状而非固定大形状进行编译，或者分阶段处理以减少峰值内存占用。","https:\u002F\u002Fgithub.com\u002Fopenvinotoolkit\u002Fopenvino\u002Fissues\u002F13601",{"id":179,"question_zh":180,"answer_zh":181,"source_url":182},18541,"如何在 PyTorch Frontend 中添加对新算子（Operation）的支持？","OpenVINO 欢迎社区贡献以支持更多的 PyTorch 算子。对于 Good First Issue 列表中的算子，开发者可以认领子任务。实现示例包括：1. `aten::pop`：使用 Gather 提取元素并用 Squeeze 移除；2. `aten::clear`：直接返回空列表；3. `aten::items`：使用 Split 分离键值对；4. `prim::isinstance`：使用 Equal 加逻辑运算实现类型检查；5. `prim::data`：通过 Identity 操作直接返回输入。需注意部分算子实现较为复杂，可能需要探索性开发。","https:\u002F\u002Fgithub.com\u002Fopenvinotoolkit\u002Fopenvino\u002Fissues\u002F28584",{"id":184,"question_zh":185,"answer_zh":186,"source_url":167},18542,"在 Android 上加载 MYRIAD 网络时遇到 'FrontEnd::removeConstLayers expects CNNNetworkImpl' 错误如何解决？","此错误表明在尝试加载网络到 VPU (MYRIAD) 时，内部前端处理期望的是 CNNNetworkImpl 对象但收到了其他类型。这通常发生在模型加载或预处理阶段。确保在调用 `LoadNetwork` 之前，模型已经正确读取并且没有经过不兼容的转换。检查是否使用了正确的插件配置文件 (plugins.xml)，并确认模型格式与目标设备兼容。如果是自定义构建，需检查 OpenVINO 版本一致性。",[188,193,198,203,208,213,218,223,228,233,238,243,248,253,258,263,268,273,278,283],{"id":189,"version":190,"summary_zh":191,"released_at":192},109104,"2026.0.0","### 主要功能与改进概览   \n\n* #### 更广泛的 GenAI 覆盖与框架集成，以最大限度减少代码变更\n   * CPU 和 GPU 上支持的新模型：GPT-OSS-20B、MiniCPM-V-4_5-8B 和 MiniCPM-o-2.6​\n   * NPU 上支持的新模型：MiniCPM-o-2.6。此外，NPU 支持现已扩展至 Qwen2.5-1.5B-Instruct、Qwen3-Embedding-0.6B 和 Qwen-2.5-coder-0.5B。​\n   * OpenVINO™ GenAI 现在为 CPU、GPU 和 NPU 上的 Whisper 流水线新增了词级时间戳功能，从而实现更准确的转录和字幕生成，其效果与 OpenAI 和 FasterWhisper 的实现一致。​\n   * Phi-3-mini FastDraft 模型现已在 Hugging Face 上发布，用于加速 NPU 上的 LLM 推理。FastDraft 通过优化 LLM 的推测解码来提升性能。\n* #### 更广泛的 LLM 模型支持与更多模型压缩技术\n   * 借助针对 3D MatMul 的新型 int4 数据感知权重压缩技术，神经网络压缩框架使 MoE LLM 在内存占用、带宽需求方面显著降低，并且相比无数据方案在精度上有所提升——从而在资源受限的设备上实现更快、更高效的部署。​\n   * 预览：神经网络压缩框架现支持 FP8-4BLUT 量化中的逐层和逐组查找表（LUT）。这一功能可实现细粒度的基于码本的压缩，在减小模型尺寸和带宽的同时，提升 LLM 及 Transformer 工作负载的推理速度和精度。\n* #### 更强的可移植性与性能，助力 AI 在边缘、云端或本地运行。\n   * 预览：OpenVINO™ GenAI 新增 VLM 流水线支持，以增强与代理式 AI 框架的集成。​\n   * OpenVINO GenAI 现已支持 NPU 上的推测解码，通过一个小型草稿模型定期由全尺寸模型验证，从而提升性能并实现高效文本生成。​\n   * 预览：NPU 编译器与 NPU 插件的集成实现了提前编译和设备端编译，无需依赖 OEM 驱动更新。开发者只需启用此功能，即可获得一个开箱即用的完整软件包，从而降低集成复杂度并加快价值实现。​\n   * OpenVINO™ 模型服务器新增对音频端点的增强支持，同时引入代理式连续批处理和并发执行功能，以提升 Intel CPU 和 GPU 上代理式工作流中 LLM 的性能。\n\n### 支持变更与弃用通知\n* 自 2026.0 版起停止支持：\n  * 已移除废弃的 `openvino.runtime` 命名空间，请直接使用 `openvino` 命名空间。\n  * 已移除废弃的 `openvino.Type.undefined`，请改用 `openvino.Type.dynamic`。\n  * 为提升易用性，已更新 PostponedConstant 构造函数签名：\n    * 旧版（已移除）：Callable[[Tensor], None]\n    * 新版：Callable[[], Tensor]\n  * 已移除废弃的 OpenVINO™ GenAI 预定义生成配置。\n  * 已移除 OpenVINO GenAI 对 Whisper 无状态解码器模型的支持。请改用有状态的 m","2026-02-23T17:50:26",{"id":194,"version":195,"summary_zh":196,"released_at":197},109105,"2026.0.1","* 预览：NPU 编译器与 NPU 插件的集成实现了提前编译和设备端编译，无需依赖 OEM 驱动程序更新。此功能在本发行包中默认启用。\n* 已知问题\n  * 组件：optimum；编号：179936\n    描述：使用 optimum-cli 将 phi-4-multimodal 指令模型以通道维度分组的方式（–group-size -1）转换为 OpenVINO 2026.0 格式时，该模型无法正常工作。建议使用 OpenVINO 2025.4 或 2025.4.1 版本来进行转换。\n  * 组件：GenAI；编号：179754\n    描述：在对模型进行重塑或以 guidance_scale > 1.0 的参数编译后，调用 Text2VideoPipeline 的 generate(guidance_scale=1.0) 方法时会引发 RuntimeError 并导致程序崩溃；在修复方案推出之前，可暂时使用 guidance_scale >= 1.0001 作为 workaround。\n  * 组件：运行时；编号：180693\n    描述：使用较新版本的 transformers 库转换的 Qwen3-30B-A3B 模型无法正常工作，建议使用 transformers 4.55.4 版本来进行模型转换，经验证该版本可以正常使用。\n  * 组件：GenAI；编号：179973\n    描述：由于模型转换层面的内部问题，Qwen2-vl、Qwen-2.5VL 和 Qwen3-VL 的密集模型可能无法通过 GenAI API 在 GPU 上正常运行。\n  * 组件：运行时；编号：180696\n    描述：Qwen3-MOE 系列模型在第二次及后续执行时会出现延迟显著增加的情况，并且由于内存消耗过高及潜在的图结构损坏，可能导致无法将模型适配到集成显卡上。该问题仅影响使用 OpenVINO 2026.0 生成的 IR 文件；而使用 OpenVINO 2025.4 生成的旧版 IR 文件则可正常运行。\n  * 组件：运行时；编号：179009\n    描述：启用了 HybridCRT 的静态构建版本存在内存泄漏问题，且该问题仅影响 Windows 系统。\n\n\n您可以在以下链接找到 OpenVINO™ 工具套件 2026.0.1 版本：\n* [下载存档*](https:\u002F\u002Fstorage.openvinotoolkit.org\u002Frepositories\u002Fopenvino\u002Fpackages\u002F2026.0.1\u002F) 与 OpenVINO™","2026-02-23T16:46:54",{"id":199,"version":200,"summary_zh":201,"released_at":202},109106,"2025.4.1","> **_注意:_** 请继续使用 OpenVINO 2025.4 版本，除非您需要 2025.4.1 版本中修复的特定问题。\n\n# 改进摘要   \n* 预览：针对 CPU 和 GPU 优化的专家混合（MoE）模型，已针对 GPT-OSS 20B 模型进行验证。\n   模型转换方法：optimum-cli export openvino -m \"openai\u002Fgpt-oss-20b\" out_dir --weight-format int4\n* 修复问题 ID 174531：在所有设备上使用 OpenVINO GenAI 执行 Mistral-7b-instruct-v0.2 和 Mistral-7b-instruct-v0.3 时，精度出现下降。作为临时解决方案，请使用通过 OpenVINO 2025.3 转换的 IR 格式。\n* 修复问题 ID 176777：在 Text2ImagePipeline、Image2ImagePipeline 和 InpaintingPipeline 中，使用 Python API 的 generate() 方法并传递 callback 参数时，可能会导致进程挂起。作为临时解决方案，请勿使用 callback 参数。C++ 实现不受此影响。\n* 解决了 NPU 插件中的一个问题：Level Zero (L0) 上下文被实现为静态全局对象，即使调用了 unload_plugin()，也仅在 DLL 卸载时才会销毁。这种行为会阻止驱动程序创建某些优化和功能所需的线程。\n\nOpenVINO™ 工具套件 2025.4.1 版本可在此下载：\n* [下载存档*](https:\u002F\u002Fstorage.openvinotoolkit.org\u002Frepositories\u002Fopenvino\u002Fpackages\u002F2025.4.1\u002F) 包含 OpenVINO™\n* Python 版 OpenVINO™：`pip install openvino==2025.4.1`","2025-12-18T09:09:49",{"id":204,"version":205,"summary_zh":206,"released_at":207},109107,"2025.4.0","### 主要特性与改进概览   \n\n* #### 更广泛的 GenAI 覆盖与框架集成，以最大限度减少代码变更\n   * **新增支持的模型：​**\n      * 在 CPU 和 GPU 上：**Qwen3-Embedding-0.6B、Qwen3-Reranker-0.6B、Mistral-Small-24B-Instruct-2501。**\n      * 在 NPU 上：**Gemma-3-4b-it 和 Qwen2.5-VL-3B-Instruct。**\n   * 预览：针对 CPU 和 GPU 优化的专家混合（MoE）模型，已针对 Qwen3-30B-A3B 进行验证。\n   * **GenAI 流水线集成：Qwen3-Embedding-0.6B 和 Qwen3-Reranker-0.6B 用于增强检索\u002F排序功能，Qwen2.5VL-7B 则适用于视频流水线。**\n* #### 更全面的 LLM 模型支持与更多模型压缩技术\n   * **对 Windows ML\\* 的金牌级支持** 使开发者能够轻松地在搭载 Intel® Core™ Ultra 处理器的 AI PC 上，跨 CPU、GPU 和 NPU 部署 AI 模型及应用。\n   * **神经网络压缩框架（NNCF）ONNX 后端现支持 INT8 静态后训练量化（PTQ）以及仅权重的 INT8\u002FINT4 压缩**，以确保与 OpenVINO IR 格式模型在精度上保持一致。同时新增了 SmoothQuant 算法支持，用于 INT8 量化。\n   * **针对 GenAI 的多令牌生成加速，利用优化的 GPU 内核实现更快的推理速度、更智能的 KV 缓存复用，以及可扩展的 LLM 性能。**\n   * **GPU 插件更新包括：通过前缀缓存提升聊天历史场景下的性能，并借助对 INT8 动态量化的支持进一步提高 LLM 的准确性。**\n* #### 更强的可移植性与性能，助力 AI 在边缘、云端或本地运行。\n   * **宣布支持 Intel® Core™ Ultra 处理器第 3 代系列。**\n   * **OpenVINO™ GenAI 新增对加密 Blob 格式的支持，以实现安全的模型部署。** 模型权重和工件将以加密格式存储和传输，从而降低部署过程中知识产权被盗的风险。开发者可使用 OpenVINO GenAI 流水线，在几乎无需修改代码的情况下完成部署。\n   * **OpenVINO™ Model Server 和 OpenVINO™ GenAI 现已通过新增输出解析和改进的聊天模板等功能，将支持范围扩展至代理式 AI 场景**，以实现可靠的多轮交互；同时为 Qwen3-30B-A3B 模型提供预览功能。OVMS 还推出了音频端点的预览功能。\n   * **NPU 部署现已简化为批量支持**，可在 Intel® Core™ Ultra 处理器上无缝执行模型，同时消除对驱动程序的依赖。模型将在编译前被重塑为 batch_size=1。\n   * **改进后的 NVIDIA Triton Server\\* 与 OpenVINO 后端集成** 现已使开发者能够利用 Intel GPU 或 NPU 进行部署。\n\n### 支持变更与弃用通知\n* 将于 2025 年停止支持：\n  * #### 运行时组件：\n    * OpenVINO Affinity API 属性已不再可用，取而代之的是 CPU 绑定配置（ov::hint::enable_cpu_pinning）。\n    * Python API 的 runtime 命名空间已被标记为弃用，并计划于 2026.0 版本中移除。Th","2025-12-01T13:26:01",{"id":209,"version":210,"summary_zh":211,"released_at":212},109108,"2025.3.0","### 主要功能与改进概览   \n\n* #### 更广泛的 GenAI 覆盖与框架集成，以最大限度减少代码变更\n   * **新增支持的模型：Phi-4-mini-reasoning、AFM-4.5B、Gemma-3-1B-it、Gemma-3-4B-it 和 Gemma-3-12B。**\n   * **为 Qwen3-1.7B、Qwen3-4B 和 Qwen3-8B 添加了 NPU 支持。**\n   * **现已在 [OpenVINO Hugging Face 专区](https:\u002F\u002Fhuggingface.co\u002Fcollections\u002FOpenVINO\u002Fllms-optimized-for-npu-686e7f0bf7bc184bd71f8ba0) 上提供针对 NPU 优化的 LLM。**\n   * 预览：**Intel® Core™ Ultra 处理器及基于 Windows 的 AI PC 现已可利用 OpenVINO™ Execution Provider for Windows\\* ML，实现高性能、开箱即用的 Windows\\* 平台体验。**\n* #### 更广泛的 LLM 模型支持与更多模型压缩技术\n   * **NPU 插件新增对最长 8K 个标记的上下文、动态提示以及动态 LoRA 的支持，从而提升 LLM 性能。**\n   * **NPU 插件现支持动态批量大小，可通过将模型重塑为批量大小 1 来同时管理多个推理请求，进而提升性能并优化内存利用率。**\n   * **通过实施按通道的关键缓存压缩技术，在内置与独立显卡上均实现了 GenAI 模型准确性的提升**，该技术补充了现有的按标记 KV 缓存压缩方法。\n   * **OpenVINO™ GenAI 推出 TextRerankPipeline，用于提高检索相关性与 RAG 流水线的准确性**，同时引入 Structured Output，以增强响应可靠性并支持函数调用，同时确保符合预定义格式。\n* #### 更高的可移植性与性能，可在边缘、云端或本地运行 AI。\n   * **宣布支持 Intel® Arc™ Pro B 系列（B50 和 B60）。**\n   * 预览：**适用于 OpenVINO GenAI 的 GGUF 启用 Hugging Face 模型，现已由 OpenVINO™ Model Server 支持 DeepSeek Distill、Qwen2、Qwen2.5 和 Llama 3 等主流 LLM 模型架构。** 此功能可降低内存占用，并简化 GenAI 工作负载的集成。\n   * **凭借更高的可靠性和工具调用准确性，OpenVINO™ Model Server 加强了对 AI PC 上代理式 AI 场景的支持**，同时提升了在 Intel CPU、内置 GPU 和 NPU 上的性能。\n   * **现已在神经网络压缩框架（NNCF）中支持针对 ONNX 模型的 int4 数据感知权重压缩**，可在保持精度的同时降低内存占用，从而实现在资源受限环境中的高效部署。\n\n### 支持变更与弃用通知\n* 将于 2025 年停止支持：\n  * #### 运行时组件：\n    * OpenVINO 中的 Affinity API 属性已不再可用，现已被 CPU 绑定配置（ov::hint::enable_cpu_pinning）所取代。\n    * openvino-nightly PyPI 模块已停止维护，终端用户应改用 Simple PyPI nightly 仓库。更多信息请参阅 [发布政策](https:\u002F\u002Fdocs.openvino.ai\u002F202","2025-09-03T14:48:01",{"id":214,"version":215,"summary_zh":216,"released_at":217},109109,"2025.2.0","### 主要功能与改进概览   \n\n* #### 更广泛的 GenAI 支持与框架集成，以最大限度减少代码变更\n   * **CPU 和 GPU 上新增支持的模型：Phi-4、Mistral-7B-Instruct-v0.3、SD-XL Inpainting 0.1、Stable Diffusion 3.5 Large Turbo、Phi-4-reasoning、Qwen3 以及 Qwen2.5-VL-3B-Instruct。Mistral 7B Instruct v0.3 还可在 NPU 上运行。​**\n   * **预览：OpenVINO™ GenAI 为 SpeechT5 TTS 模型引入了文本到语音流水线；同时，全新的 RAG 后端为开发者提供简化的 API，可降低内存占用并提升性能。​**\n   * **预览：OpenVINO™ GenAI 提供 GGUF Reader，用于无缝集成基于 llama.cpp 的 LLM**，配套的 Python 和 C++ 流水线可加载 GGUF 模型、构建 OpenVINO 图并实时进行 GPU 推理。已验证的主流模型包括：DeepSeek-R1-Distill-Qwen（1.5B、7B）、Qwen2.5 Instruct（1.5B、3B、7B）以及 llama-3.2 Instruct（1B、3B、8B）。\n* #### 更全面的 LLM 模型支持与更多模型压缩技术\n   * **在 OpenVINO GenAI 中进一步优化 LoRA 适配器，以提升内置 GPU 上 LLM、VLM 和文生图模型的性能。** 开发者可利用 LoRA 适配器快速定制模型，以满足特定任务需求。​\n   * **针对 CPU 的 KV 缓存压缩现默认启用 INT8 格式**，可在保持与 FP16 相当精度的同时显著降低内存占用。此外，对于支持 INT4 的 LLM，相比 INT8 可实现更大幅度的内存节省。​\n   * **针对 Intel® Core™ Ultra 处理器系列 2 的内置 GPU 以及 Intel® Arc™ B 系列显卡，并借助 Intel® XMX 矩阵计算平台**，优化 VLM 模型和混合量化图像生成模型的性能，同时通过动态量化降低 LLM 的首个 token 延迟。\n* #### 更强的可移植性与性能，支持在边缘、云端或本地运行 AI。\n   * **增强对 Linux\\* 的支持，配备适用于 Intel® Core™ Ultra 处理器系列 2（原代号 Arrow Lake H）内置 GPU 的最新 GPU 驱动程序。**\n   * **OpenVINO™ Model Server 现推出面向 Windows 的精简 C++ 版本，并通过前缀缓存机制提升长上下文模型的性能**，同时提供更小的 Windows 安装包，移除了对 Python 的依赖。现已支持 Hugging Face 模型。​\n   * **Neural Network Compression Framework（NNCF）中实现了针对 ONNX 模型的 INT4 无数据权重压缩功能。​**\n   * **Intel® Core™ 200V 系列处理器上的 NPU 现支持 FP16-NF4 精度，适用于参数量不超过 8B 的模型，采用对称量化和逐通道量化方法**，可在保持性能效率的同时提升精度。\n\n### 支持变更与弃用通知\n* 将于 2025 年停止支持：\n  * #### 运行时组件：\n    * OpenVINO Affinity API 属性已不再可用，取而代之的是 CPU 绑定配置（ov::hint::enable_cpu_pinning）。\n    * openvino","2025-06-18T12:30:41",{"id":219,"version":220,"summary_zh":221,"released_at":222},109110,"2025.1.0","### 主要功能与改进概览   \n\n* #### 更广泛的 GenAI 支持及框架集成，以最大限度减少代码改动\n   * **新增支持的模型：Phi-4 Mini、Jina CLIP v1 和 Bce Embedding Base v1。**\n   * **OpenVINO™ Model Server 现已支持 VLM 模型，包括 Qwen2-VL、Phi-3.5-Vision 和 InternVL2。**\n   * **OpenVINO GenAI 现在为基于 Transformer 的流水线（如 Flux.1 和 Stable Diffusion 3 模型）增加了图像到图像生成及修复功能**，从而进一步提升其生成更逼真内容的能力。\n   * **预览：[AI Playground](https:\u002F\u002Fgame.intel.com\u002Fus\u002Fstories\u002Fintroducing-ai-playground\u002F) 现已采用 OpenVINO Gen AI 后端**，以在 AI PC 上实现高度优化的推理性能。\n* #### 更全面的 LLM 模型支持及更多模型压缩技术\n   * **通过优化 CPU 插件并移除 GEMM 内核，显著减小了二进制文件大小。**\n   * **针对 GPU 插件的新内核优化大幅提升了长短期记忆（LSTM）模型的性能**，这类模型广泛应用于语音识别、语言建模和时间序列预测等领域。\n   * **预览：OpenVINO GenAI 中实现了 Token 驱逐功能**，通过清除不重要的 token 来降低 KV 缓存的内存占用。该 Token 驱逐功能尤其适用于需要生成长序列的任务，例如聊天机器人和代码生成。\n   * **OpenVINO™ Runtime 和 OpenVINO™ Model Server 现已支持 NPU 加速文本生成**，从而助力在低并发场景下，以高能效方式将 VLM 模型部署于 NPU 上，满足 AI PC 的使用需求。\n* #### 更强的可移植性与性能，可在边缘、云端或本地运行 AI。\n   * **支持最新 Intel® Core™ 处理器（第 2 代，原代号 Bartlett Lake）、Intel® Core™ 3 处理器 N 系列以及 Intel® 处理器 N 系列（原代号 Twin Lake），并在 Windows 系统上运行。**\n   * **针对 Intel® Core™ Ultra 200H 系列处理器进一步优化了 LLM 性能**，以改善 Windows 和 Linux 平台上的第二个 token 延迟。\n   * **通过在 GPU 插件中默认启用分页注意力机制和连续批处理，显著提升了性能并实现了更高效的资源利用。**\n   * **预览：面向 Executorch 的全新 OpenVINO 后端**将加速推理过程，并在包括 CPU、GPU 和 NPU 在内的 Intel 硬件平台上带来更优的性能。\n\n### 支持变更与弃用通知\n* 将于 2025 年停止支持：\n  * #### 运行时组件：\n    * OpenVINO Affinity API 属性已不再可用，现已被 CPU 绑定配置（ov::hint::enable_cpu_pinning）所取代。\n  * #### 工具：\n     * OpenVINO™ 开发工具包（pip install openvino-dev）将不再适用于 2025 年发布的 OpenVINO 版本。\n     * Model Optimizer 已停止提供，请考虑使用 [新的转换方法](https:\u002F\u002Fdocs.openvino.ai\u002F2025\u002Fopenvino-workflow\u002Fmodel-preparation\u002Fconv","2025-04-10T08:57:47",{"id":224,"version":225,"summary_zh":226,"released_at":227},109111,"2025.0.0","### 主要特性与改进概览   \n\n* #### 更广泛的 GenAI 支持及框架集成，以最大限度减少代码改动\n   * **新增支持的模型：Qwen 2.5、Deepseek-R1-Distill-Llama-8B、DeepSeek-R1-Distill-Qwen-7B 和 DeepSeek-R1-Distill-Qwen-1.5B、FLUX.1 Schnell 以及 FLUX.1 Dev**\n   * **Whisper 模型：通过 GenAI API，在 CPU、集成 GPU 和独立 GPU 上均实现了性能提升。**\n   * **预览：引入对 NPU 的 torch.compile 支持**，使开发者能够使用 OpenVINO 后端在 NPU 上运行 PyTorch API。现已支持来自 TorchVision、Timm 和 TorchBench 仓库的 300 多个深度学习模型。\n* #### 更全面的大语言模型（LLM）支持与更多模型压缩技术。\n   * **预览：GenAI API 新增提示查找功能，可有效利用与预期用例匹配的预定义提示，从而改善 LLM 的第二个 token 延迟。**\n   * **预览：GenAI API 现已提供图像到图像的修复功能。** 该功能允许模型通过修复指定区域并将其与原始图像无缝融合，生成逼真的内容。\n   * **INT8 格式的非对称 KV 缓存压缩现已支持在 CPU 上启用**，可在处理需要大量内存的长提示时降低内存占用并改善第二个 token 的延迟。此选项需由用户显式指定。\n* #### 更强的可移植性与性能，助力 AI 在边缘、云端或本地运行。\n   * **支持最新的 Intel® Core™ Ultra 200H 系列处理器（原代号为 Arrow Lake-H）**\n   * **OpenVINO™ 后端与 Triton 推理服务器的集成**，使开发者能够在基于 Intel CPU 的部署中利用 Triton 服务器来提升模型服务性能。\n   * **预览：新的 OpenVINO™ 后端集成允许开发者直接在 Keras 3 工作流中利用 OpenVINO 的性能优化**，从而在 CPU、集成 GPU、独立 GPU 以及 NPU 上实现更快的 AI 推理。此功能随最新版 Keras 3.8 提供。\n   * **OpenVINO 模型服务器现支持原生 Windows Server 部署**，开发者可通过消除容器开销并简化 GPU 部署，进一步提升性能。\n\n### 支持变更与弃用通知\n* 已弃用：\n   * OpenVINO 归档文件名中的旧前缀 _l__、_w__ 和 _m__ 已被移除。\n   * Python API 的 runtime 命名空间已被标记为弃用，并计划于 2026.0 版本中移除。新的命名空间结构已发布，可立即进行迁移。相关细节将通过警告信息和文档传达。\n   * NNCF 的 `create_compressed_model()` 方法已被弃用。目前推荐使用 `nncf.quantize()` 方法，用于 PyTorch 和 TensorFlow 模型的量化感知训练。\n\n\n您可在此处获取 OpenVINO™ 工具套件 2025.0 版本：\n* [下载归档*](https:\u002F\u002Fstorage.","2025-02-06T12:28:58",{"id":229,"version":230,"summary_zh":231,"released_at":232},109112,"2024.6.0","### 主要特性与改进摘要   \n\n* #### OpenVINO 2024.6 版本包含多项更新，旨在提升稳定性并优化 LLM 性能。\n* #### 新增对 **Intel® Arc™ B 系列显卡（原名 Battlemage）** 的支持。\n* #### 针对 NPU 实现了多项优化，显著缩短推理时间并提升 LLM 性能。\n* #### 通过 GenAI API 的优化和缺陷修复，进一步提升了 LLM 性能。\n\n\n### 支持变更与弃用通知\n* 不建议继续使用已弃用的功能和组件。这些内容仅用于支持向新解决方案的平稳过渡，未来将被逐步淘汰。如需继续使用已弃用的功能，您必须回退到最后一个支持该功能的长期支持版 OpenVINO。更多详情，请参阅 [OpenVINO 旧版功能与组件](http:\u002F\u002Fdocs.openvino.ai\u002F2024\u002Fdocumentation\u002Flegacy-features.html) 页面。\n* 2024.0 版本中已停止支持：\n   * 运行时组件：  \n      * [Intel® 高斯与神经加速器（Intel® GNA）。](https:\u002F\u002Fdocs.openvino.ai\u002F2023.3\u002Fopenvino_docs_OV_UG_supported_plugins_GNA.html) 对于 Intel® Core™ Ultra 或第 14 代及更高版本等低功耗系统，建议改用神经处理单元（NPU）。\n      * OpenVINO C++\u002FC\u002FPython 1.0 API（参考 [2023.3 API 迁移指南](https:\u002F\u002Fdocs.openvino.ai\u002F2023.3\u002Fopenvino_2_0_transition_guide.html)）。\n      * 所有 ONNX 前端旧版 API（即 ONNX_IMPORTER_API）。\n      * OpenVINO Python API 中的 ‘PerfomanceMode.UNDEFINED’ 属性。\n   * 工具：  \n      * 部署管理器。请参阅 [安装指南](https:\u002F\u002Fdocs.openvino.ai\u002F2024\u002Fget-started\u002Finstall-openvino.html) 和 [部署指南](https:\u002F\u002Fdocs.openvino.ai\u002F2024\u002Fopenvino-workflow\u002Fdeployment-locally.html)，了解当前的分发选项。\n      * [精度检查器](https:\u002F\u002Fdocs.openvino.ai\u002F2023.3\u002Fomz_tools_accuracy_checker.html)。\n      * [后训练优化工具（POT）](https:\u002F\u002Fdocs.openvino.ai\u002F2023.3\u002Fpot_introduction.html)。建议改用神经网络压缩框架（NNCF）。\n      * 用于 NNCF 与 [huggingface\u002Ftransformers](https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Ftransformers) 集成的 [Git 补丁](https:\u002F\u002Fgithub.com\u002Fopenvinotoolkit\u002Fnncf\u002Ftree\u002Fdevelop\u002Fthird_party_integration\u002Fhuggingface_transformers)。推荐方法是使用 [huggingface\u002Foptimum-intel](https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel)，在 Hugging Face 模型的基础上应用 NNCF 优化。\n      * 对 Apache MXNet、Caffe 和 Kaldi 模型格式的支持。可考虑转换为 ONNX 格式作为替代方案。\n* 已弃用且未来将被移除的内容：\n   * 自 OpenVINO 2024.5 版本起，将不再提供 macOS x86_64 调试版本二进制文件。\n   * 自 OpenVINO 2024.5 版本起，不再支持 Python 3.8。\n      * 由于 MxNet 官方 PyPI 项目 [mxnet](https:\u002F\u002Fpypi.org\u002Fproject\u002Fmxnet\u002F) 明确表示其最高支持 Python 3.8 版本，因此 OpenVINO 也将不再支持 MxNet。\n   * Disc","2024-12-19T12:30:47",{"id":234,"version":235,"summary_zh":236,"released_at":237},109113,"2024.5.0","### 主要功能与改进概览\n\n* #### 更广泛的生成式 AI 覆盖与框架集成，以最大限度减少代码改动\n   * **新增支持的模型：Llama 3.2（1B 和 3B）、Gemma 2（2B 和 9B）以及 YOLO11。**\n   * **NPU 上的 LLM 支持：Llama 3 8B、Llama 2 7B、Mistral-v0.2-7B、Qwen2-7B-Instruct 和 Phi-3。**\n   * **新增值得关注的 Notebook：Sam2、Llama3.2、Llama3.2 - Vision、Wav2Lip、Whisper 以及 Llava。**\n   * **预览：支持 Flax，一个基于 JAX 的高性能 Python 神经网络库。** 其模块化设计便于自定义，并可在 GPU 上实现加速推理。\n* #### 更广泛的大型语言模型（LLM）支持及更多模型压缩技术。\n   * **针对 Intel® Core™ Ultra 处理器（第 1 系列）和 Intel® Arc™ 显卡内置 GPU 的优化** 包括用于降低内存占用的 KV 缓存压缩，同时提升了易用性；此外还对模型加载时间进行了优化，以改善 LLM 的首个 token 延迟。\n   * **在不降低精度的前提下，为 Intel® Core™ Ultra 处理器（第 1 系列）上的内置 Intel® GPU 启用了动态量化，以进一步缩短 LLM 的首个 token 延迟。** 对于大批次推理场景，第二个 token 的延迟也将得到改善。\n   * **神经网络压缩框架（NNCF）中实现了新的合成文本数据生成方法。** 这将使 LLM 能够在无需数据集的情况下，利用数据感知型方法更精确地进行压缩。敬请期待：该功能即将通过 Hugging Face 上的 Optimum Intel 提供。\n* #### 更高的可移植性与性能，支持在边缘、云端或本地运行 AI。\n   * **支持 [Intel® Xeon® 6 处理器（P 核）](https:\u002F\u002Fark.intel.com\u002Fcontent\u002Fwww\u002Fus\u002Fen\u002Fark\u002Fproducts\u002Fcodename\u002F128428\u002Fproducts-formerly-granite-rapids.html)（前代代号 Granite Rapids）以及 [Intel® Core™ Ultra 200V 系列处理器](https:\u002F\u002Fark.intel.com\u002Fcontent\u002Fwww\u002Fus\u002Fen\u002Fark\u002Fproducts\u002Fcodename\u002F225837\u002Fproducts-formerly-arrow-lake.html)（前代代号 Arrow Lake-S）。**\n   * **预览：GenAI API 支持多模态 AI 部署，提供多模态流水线**，以提升上下文感知能力；同时还提供转录流水线，方便音频到文本的转换；以及图像生成流水线，简化文本到视觉内容的转换流程。\n   * **GenAI API 新增推测解码功能**，通过使用小型草稿模型并定期由全尺寸模型校正，从而提升性能并高效生成文本。\n   * **预览：GenAI API 现已支持 LoRA 适配器**，使开发者能够快速高效地定制图像和文本生成模型，以满足特定任务需求。\n   * **GenAI API 现在也支持 NPU 上的 LLM，** 开发者可以指定 NPU 作为目标设备，尤其适用于 WhisperPipeline（支持 whisper-base、whisper-medium 和 whisper-small）以及 LLMPipeline（支持 Llama 3 8B、Llama 2 7B、Mistral-v0.2-7B、Qwen2-7B-Instruct 和 Phi-3 Mini-instruct）。建议使用驱动程序版本 32.0.100.3104 或更高版本，以获得最佳性能。","2024-11-20T13:12:14",{"id":239,"version":240,"summary_zh":241,"released_at":242},109114,"2024.4.0","### Summary of major features and improvements   \r\n\r\n* #### More Gen AI coverage and framework integrations to minimize code changes \r\n   * **Support for GLM-4-9B Chat, MiniCPM-1B, Llama 3 and 3.1, Phi-3-Mini, Phi-3-Medium and YOLOX-s models.**\r\n   * **Noteworthy notebooks added:** Florence-2, NuExtract-tiny Structure Extraction, Flux.1 Image Generation, PixArt-α: Photorealistic Text-to-Image Synthesis, and Phi-3-Vision Visual Language Assistant.\r\n* #### Broader Large Language Model (LLM) support and more model compression techniques.\r\n   * **OpenVINO™ runtime optimized for Intel® Xe Matrix Extensions (Intel® XMX) systolic arrays** on built-in GPUs for efficient matrix multiplication resulting in significant LLM performance boost with improved 1st and 2nd token latency, as well as a smaller memory footprint on Intel® Core™ Ultra Processors (Series 2).\r\n   * **Memory sharing enabled for NPUs on Intel® Core™ Ultra Processors (Series 2)** for efficient pipeline integration without memory copy overhead.\r\n   * **Addition of the PagedAttention feature for discrete GPUs\\* enables a significant boost** in throughput for parallel inferencing when serving LLMs on Intel® Arc™ Graphics or Intel® Data Center GPU Flex Series.\r\n* #### More portability and performance to run AI at the edge, in the cloud, or locally.\r\n   * **Support for Intel® Core Ultra Processors Series 2 (formerly codenamed Lunar Lake) on Windows.**\r\n   * **OpenVINO™ Model Server now comes with production-quality support for OpenAI-compatible API** which enables significantly higher throughput for parallel inferencing on Intel® Xeon® processors when serving LLMs to many concurrent users.\r\n   * **Improved performance and memory consumption with prefix caching, KV cache compression, and other optimizations for serving LLMs using OpenVINO™ Model Server.**\r\n   * **Support for Python 3.12.**\r\n   * **Support for Red Hat Enterprise Linux (RHEL) version 9**\r\n\r\n### Support Change and Deprecation Notices\r\n* Using deprecated features and components is not advised. They are available to enable a smooth transition to new solutions and will be discontinued in the future. To keep using discontinued features, you will have to revert to the last LTS OpenVINO version supporting them. For more details, refer to the [OpenVINO Legacy Features and Components](http:\u002F\u002Fdocs.openvino.ai\u002F2024\u002Fdocumentation\u002Flegacy-features.html) page.\r\n* Discontinued in 2024.0:\r\n   * Runtime components:  \r\n      * [Intel® Gaussian & Neural Accelerator (Intel® GNA).](https:\u002F\u002Fdocs.openvino.ai\u002F2023.3\u002Fopenvino_docs_OV_UG_supported_plugins_GNA.html).Consider using the Neural Processing Unit (NPU) for low-powered systems like Intel® Core™ Ultra or 14th generation and beyond.\r\n      * OpenVINO C++\u002FC\u002FPython 1.0 APIs (see [2023.3 API transition guide](https:\u002F\u002Fdocs.openvino.ai\u002F2023.3\u002Fopenvino_2_0_transition_guide.html) for reference). \r\n      * All ONNX Frontend legacy API (known as ONNX_IMPORTER_API)  \r\n      * 'PerfomanceMode.UNDEFINED' property as part of the OpenVINO Python API  \r\n   * Tools:  \r\n      * Deployment Manager. See [installation](https:\u002F\u002Fdocs.openvino.ai\u002F2024\u002Fget-started\u002Finstall-openvino.html) and [deployment guides](https:\u002F\u002Fdocs.openvino.ai\u002F2024\u002Fopenvino-workflow\u002Fdeployment-locally.html) for current distribution options. \r\n      * [Accuracy Checker](https:\u002F\u002Fdocs.openvino.ai\u002F2023.3\u002Fomz_tools_accuracy_checker.html).\r\n      * [Post-Training Optimization Tool (POT)](https:\u002F\u002Fdocs.openvino.ai\u002F2023.3\u002Fpot_introduction.html). Neural Network Compression Framework (NNCF) should be used instead.\r\n      * [A Git patch](https:\u002F\u002Fgithub.com\u002Fopenvinotoolkit\u002Fnncf\u002Ftree\u002Fdevelop\u002Fthird_party_integration\u002Fhuggingface_transformers) for NNCF integration with [huggingface\u002Ftransformers](https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Ftransformers). The recommended approach is to use [huggingface\u002Foptimum-intel](https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel) for applying NNCF optimization on top of models from Hugging Face.\r\n      * Support for Apache MXNet, Caffe, and Kaldi model formats. Conversion to ONNX may be used as a solution. \r\n* Deprecated and to be removed in the future:\r\n   * The macOS x86_64 debug bins will no longer be provided with the OpenVINO toolkit, starting with OpenVINO 2024.5.\r\n   * Python 3.8 is now considered deprecated, and it will not be available beyond the 2024.4 OpenVINO version.\r\n   * dKMB support is now considered deprecated and will be fully removed with OpenVINO 2024.5\r\n   * Intel® Streaming SIMD Extensions (Intel® SSE) will be supported in source code form, but not enabled in the binary package by default, starting with OpenVINO 2025.0\r\n   * The openvino-nightly PyPI module will soon be discontinued. End-users should proceed with the Simple PyPI nightly repo instead. More information in [Release Policy](https:\u002F\u002Fdocs.openvino.ai\u002F2024\u002Fabout-openvino\u002Frelease-notes-openvino\u002Frelease-policy.html#nightly-releases).\r\n   * The OpenVINO™ Development Tools package (`pip install openvino-dev`) will be removed from in","2024-09-19T12:25:39",{"id":244,"version":245,"summary_zh":246,"released_at":247},109115,"2024.3.0","### Summary of major features and improvements   \r\n\r\n* #### More Gen AI coverage and framework integrations to minimize code changes \r\n   * **OpenVINO pre-optimized models are now available in Hugging Face making it easier for developers to get started with these models.**\r\n* #### Broader Large Language Model (LLM) support and more model compression techniques.\r\n   * **Significant improvement in LLM performance on Intel discrete GPUs** with the addition of Multi-Head Attention (MHA) and OneDNN enhancements.\r\n* #### More portability and performance to run AI at the edge, in the cloud, or locally.\r\n   * **Improved CPU performance when serving LLMs with the inclusion of vLLM and continuous batching in the OpenVINO Model Server (OVMS).** vLLM is an easy-to-use open-source library that supports efficient LLM inferencing and model serving.\r\n   * **Ubuntu 24.04 long-term support (LTS), 64-bit (Kernel 6.8+) (preview support)**\r\n\r\n### Support Change and Deprecation Notices\r\n* Using deprecated features and components is not advised. They are available to enable a smooth transition to new solutions and will be discontinued in the future. To keep using discontinued features, you will have to revert to the last LTS OpenVINO version supporting them. For more details, refer to the [OpenVINO Legacy Features and Components](http:\u002F\u002Fdocs.openvino.ai\u002F2024\u002Fdocumentation\u002Flegacy-features.html) page.\r\n* Discontinued in 2024.0:\r\n   * Runtime components:  \r\n      * [Intel® Gaussian & Neural Accelerator (Intel® GNA).](https:\u002F\u002Fdocs.openvino.ai\u002F2023.3\u002Fopenvino_docs_OV_UG_supported_plugins_GNA.html).Consider using the Neural Processing Unit (NPU) for low-powered systems like Intel® Core™ Ultra or 14th generation and beyond.\r\n      * OpenVINO C++\u002FC\u002FPython 1.0 APIs (see [2023.3 API transition guide](https:\u002F\u002Fdocs.openvino.ai\u002F2023.3\u002Fopenvino_2_0_transition_guide.html) for reference). \r\n      * All ONNX Frontend legacy API (known as ONNX_IMPORTER_API)  \r\n      * 'PerfomanceMode.UNDEFINED' property as part of the OpenVINO Python API  \r\n   * Tools:  \r\n      * Deployment Manager. See [installation](https:\u002F\u002Fdocs.openvino.ai\u002F2024\u002Fget-started\u002Finstall-openvino.html) and [deployment guides](https:\u002F\u002Fdocs.openvino.ai\u002F2024\u002Fopenvino-workflow\u002Fdeployment-locally.html) for current distribution options. \r\n      * [Accuracy Checker](https:\u002F\u002Fdocs.openvino.ai\u002F2023.3\u002Fomz_tools_accuracy_checker.html).\r\n      * [Post-Training Optimization Tool (POT)](https:\u002F\u002Fdocs.openvino.ai\u002F2023.3\u002Fpot_introduction.html). Neural Network Compression Framework (NNCF) should be used instead.\r\n      * [A Git patch](https:\u002F\u002Fgithub.com\u002Fopenvinotoolkit\u002Fnncf\u002Ftree\u002Fdevelop\u002Fthird_party_integration\u002Fhuggingface_transformers) for NNCF integration with [huggingface\u002Ftransformers](https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Ftransformers). The recommended approach is to use [huggingface\u002Foptimum-intel](https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel) for applying NNCF optimization on top of models from Hugging Face.\r\n      * Support for Apache MXNet, Caffe, and Kaldi model formats. Conversion to ONNX may be used as a solution. \r\n* Deprecated and to be removed in the future:\r\n   * The OpenVINO™ Development Tools package (`pip install openvino-dev`) will be removed from installation options and distribution channels beginning with OpenVINO 2025.0. \r\n   * Model Optimizer will be discontinued with OpenVINO 2025.0. Consider using the [new conversion methods](https:\u002F\u002Fdocs.openvino.ai\u002F2024\u002Fopenvino-workflow\u002Fmodel-preparation\u002Fconvert-model-to-ir.html) instead. For more details, see the [model conversion transition guide](https:\u002F\u002Fdocs.openvino.ai\u002F2024\u002Fdocumentation\u002Flegacy-features\u002Ftransition-legacy-conversion-api.html).\r\n   * OpenVINO property Affinity API will be discontinued with OpenVINO 2025.0. It will be replaced with CPU binding configurations (_ov::hint::enable_cpu_pinning)_. \r\n   * OpenVINO Model Server components:\r\n      * “auto shape” and “auto batch size” (reshaping a model in runtime) will be removed in the future. OpenVINO’s dynamic shape models are recommended instead.\r\n   * A number of notebooks have been deprecated. For an up-to-date listing of available notebooks, refer to the [OpenVINO™ Notebook index (openvinotoolkit.github.io)](https:\u002F\u002Fopenvinotoolkit.github.io\u002Fopenvino_notebooks\u002F).\r\n\r\n\r\nYou can find OpenVINO™ toolkit 2024.3 release here:\r\n* [Download archives*](https:\u002F\u002Fstorage.openvinotoolkit.org\u002Frepositories\u002Fopenvino\u002Fpackages\u002F2024.3\u002F) with OpenVINO™\r\n* [Install it via Conda](https:\u002F\u002Fanaconda.org\u002Fconda-forge\u002Fopenvino): `conda install -c conda-forge openvino=2024.3.0`\r\n* [OpenVINO™](https:\u002F\u002Fpypi.org\u002Fproject\u002Fopenvino\u002F2024.3.0\u002F) for Python: `pip install openvino==2024.3.0`\r\n\r\n**Acknowledgements**\r\n\r\nThanks for contributions from the OpenVINO developer community:\r\n@rghvsh\r\n@PRATHAM-SPS\r\n@duydl\r\n@awayzjj\r\n@jvr0123\r\n@inbasperu\r\n@DannyVlasenko\r\n@amkarn258\r\n@kcin96\r\n@Vladislav-Denisov\r\n\r\n\r\nRelease documentation is available here: [https:\u002F\u002Fdocs.openvino.ai\u002F2024](https:\u002F\u002Fdocs.openvino.ai\u002F2024)\r\nRelease Notes are","2024-07-31T14:33:13",{"id":249,"version":250,"summary_zh":251,"released_at":252},109116,"2024.2.0","### Summary of major features and improvements   \r\n\r\n* #### More Gen AI coverage and framework integrations to minimize code changes \r\n   * **Llama 3 optimizations for CPUs, built-in GPUs, and discrete GPUs for improved performance and efficient memory usage.**\r\n   * **Support for Phi-3-mini**, a family of AI models that leverages the power of small language models for faster, more accurate and cost-effective text processing.\r\n   *  **Python Custom Operation is now enabled in OpenVINO** making it easier for Python developers to code their custom operations instead of using C++ custom operations (also supported). Python Custom Operation empowers users to implement their own specialized operations into any model.\r\n   * **Notebooks expansion to ensure better coverage for new models.** Noteworthy notebooks added: DynamiCrafter, YOLOv10, Chatbot notebook with Phi-3, and QWEN2.\r\n\r\n* #### Broader Large Language Model (LLM) support and more model compression techniques.\r\n   * **GPTQ method for 4-bit weight compression added to NNCF** for more efficient inference and improved performance of compressed LLMs.\r\n   * **Significant LLM performance improvements and reduced latency for both built-in GPUs and discrete GPUs.**\r\n   * **Significant improvement in 2nd token latency and memory footprint of FP16 weight LLMs on AVX2 (13th Gen Intel® Core™ processors) and AVX512 (3rd Gen Intel® Xeon® Scalable Processors) based CPU platforms, particularly for small batch sizes.**\r\n\r\n* #### More portability and performance to run AI at the edge, in the cloud, or locally.\r\n   * **Model Serving Enhancements:**\r\n       * Preview: OpenVINO Model Server (OVMS) now supports OpenAI-compatible API along with Continuous Batching and PagedAttention, enabling significantly higher throughput for parallel inferencing, especially on Intel® Xeon® processors, when serving LLMs to many concurrent users.\r\n       * OpenVINO backend for Triton Server now supports built-in GPUs and discrete GPUs, in addition to dynamic shapes support.\r\n       * Integration of TorchServe through torch.compile OpenVINO backend for easy model deployment, provisioning to multiple instances, model versioning, and maintenance.\r\n   * Preview: addition of the Generate API, a simplified API for text generation using large language models with only a few lines of code. The API is available through the newly launched OpenVINO GenAI package.\r\n   * **Support for Intel Atom® Processor X Series**. For more details, see [System Requirements](https:\u002F\u002Fdocs.openvino.ai\u002F2024\u002Fabout-openvino\u002Frelease-notes-openvino\u002Fsystem-requirements.html).\r\n   * Preview: Support for Intel® Xeon® 6 processor.\r\n\r\n### Support Change and Deprecation Notices\r\n* Using deprecated features and components is not advised. They are available to enable a smooth transition to new solutions and will be discontinued in the future. To keep using discontinued features, you will have to revert to the last LTS OpenVINO version supporting them. For more details, refer to the [OpenVINO Legacy Features and Components](http:\u002F\u002Fdocs.openvino.ai\u002F2024\u002Fdocumentation\u002Flegacy-features.html) page.\r\n* Discontinued in 2024.0:\r\n   * Runtime components:  \r\n      * [Intel® Gaussian & Neural Accelerator (Intel® GNA).](https:\u002F\u002Fdocs.openvino.ai\u002F2023.3\u002Fopenvino_docs_OV_UG_supported_plugins_GNA.html).Consider using the Neural Processing Unit (NPU) for low-powered systems like Intel® Core™ Ultra or 14th generation and beyond.\r\n      * OpenVINO C++\u002FC\u002FPython 1.0 APIs (see [2023.3 API transition guide](https:\u002F\u002Fdocs.openvino.ai\u002F2023.3\u002Fopenvino_2_0_transition_guide.html) for reference). \r\n      * All ONNX Frontend legacy API (known as ONNX_IMPORTER_API)  \r\n      * 'PerfomanceMode.UNDEFINED' property as part of the OpenVINO Python API  \r\n   * Tools:  \r\n      * Deployment Manager. See [installation](https:\u002F\u002Fdocs.openvino.ai\u002F2024\u002Fget-started\u002Finstall-openvino.html) and [deployment guides](https:\u002F\u002Fdocs.openvino.ai\u002F2024\u002Fopenvino-workflow\u002Fdeployment-locally.html) for current distribution options. \r\n      * [Accuracy Checker](https:\u002F\u002Fdocs.openvino.ai\u002F2023.3\u002Fomz_tools_accuracy_checker.html).\r\n      * [Post-Training Optimization Tool (POT)](https:\u002F\u002Fdocs.openvino.ai\u002F2023.3\u002Fpot_introduction.html). Neural Network Compression Framework (NNCF) should be used instead.\r\n      * [A Git patch](https:\u002F\u002Fgithub.com\u002Fopenvinotoolkit\u002Fnncf\u002Ftree\u002Fdevelop\u002Fthird_party_integration\u002Fhuggingface_transformers) for NNCF integration with [huggingface\u002Ftransformers](https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Ftransformers). The recommended approach is to use [huggingface\u002Foptimum-intel](https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel) for applying NNCF optimization on top of models from Hugging Face.\r\n      * Support for Apache MXNet, Caffe, and Kaldi model formats. Conversion to ONNX may be used as a solution. \r\n* Deprecated and to be removed in the future:\r\n   * The OpenVINO™ Development Tools package (`pip install openvino-dev`) will be removed from installation options and distribution channels beginning w","2024-06-17T17:21:28",{"id":254,"version":255,"summary_zh":256,"released_at":257},109117,"2022.3.2","#### Major Features and Improvements Summary\r\nThis is a Long-Term Support (LTS) release. LTS versions are released every year and supported for two years (one year for bug fixes, and two years for security patches). Read [Intel® Distribution of OpenVINO™ toolkit Long-Term Support (LTS) Policy](https:\u002F\u002Fdocs.openvino.ai\u002F2024\u002Fabout-openvino\u002Frelease-notes-openvino\u002Frelease-policy.html)  v.2 for more details.\r\n* **This 2022.3.2 LTS release provides functional and security bug fixes for the previous 2022.3.1 Long-Term Support (LTS) release, enabling developers to deploy applications powered by Intel® Distribution of OpenVINO™ toolkit more efficiently.**\r\n* **Intel® Movidius™ VPU-based products are supported in this release.**\r\n\r\nYou can find OpenVINO™ toolkit 2022.3.2 release here:\r\n* [Download archives*](https:\u002F\u002Fstorage.openvinotoolkit.org\u002Frepositories\u002Fopenvino\u002Fpackages\u002F2022.3.2\u002F) with OpenVINO™ Runtime for C\u002FC++\r\n* [OpenVINO™ Runtime](https:\u002F\u002Fpypi.org\u002Fproject\u002Fopenvino\u002F2022.3.2\u002F) for Python: `pip install openvino==2022.3.2`\r\n* [OpenVINO™ Development tools](https:\u002F\u002Fpypi.org\u002Fproject\u002Fopenvino-dev\u002F2022.3.2\u002F): `pip install openvino-dev==2022.3.2`\r\n\r\nRelease documentation is available here: https:\u002F\u002Fdocs.openvino.ai\u002F2022.3\u002F\r\n\r\nRelease Notes are available here: https:\u002F\u002Fwww.intel.com\u002Fcontent\u002Fwww\u002Fus\u002Fen\u002Fdeveloper\u002Farticles\u002Frelease-notes\u002Fopenvino-lts\u002F2022-3.html","2024-05-06T15:14:39",{"id":259,"version":260,"summary_zh":261,"released_at":262},109118,"2024.1.0","### Summary of major features and improvements   \r\n\r\n* #### More Generative AI coverage and framework integrations to minimize code changes. \r\n   * **Mixtral and URLNet models optimized for performance improvements on Intel® Xeon® processors.**\r\n   * **Stable Diffusion 1.5, ChatGLM3-6B, and Qwen-7B models optimized for improved inference speed** on Intel® Core™ Ultra processors with integrated GPU.\r\n   * **Support for Falcon-7B-Instruct, a GenAI Large Language Model (LLM) ready-to-use chat\u002Finstruct model with superior performance metrics.**\r\n   * **New Jupyter Notebooks added**: YOLO V9, YOLO V8 Oriented Bounding Boxes Detection (OOB), Stable Diffusion in Keras, MobileCLIP, RMBG-v1.4 Background Removal, Magika, TripoSR, AnimateAnyone, LLaVA-Next, and RAG system with OpenVINO and LangChain.\r\n\r\n* #### Broader Large Language Model (LLM) support and more model compression techniques.\r\n   * **LLM compilation time reduced** through additional optimizations with compressed embedding. Improved 1st token performance of LLMs on 4th and 5th generations of Intel® Xeon® processors with Intel® Advanced Matrix Extensions (Intel® AMX).\r\n   * **Better LLM compression and improved performance with oneDNN, INT4, and INT8 support for Intel® Arc™ GPUs.**\r\n   * **Significant memory reduction for select smaller GenAI models** on Intel® Core™ Ultra processors with integrated GPU.\r\n\r\n* #### More portability and performance to run AI at the edge, in the cloud, or locally.\r\n   * **The preview NPU plugin for Intel® Core™ Ultra processors is now available** in the OpenVINO open-source GitHub repository, in addition to the main OpenVINO package on PyPI.\r\n   * **The JavaScript API is now more easily accessible through the npm repository**, enabling JavaScript developers’ seamless access to the OpenVINO API.\r\n   * **FP16 inference on ARM processors now enabled for the Convolutional Neural Network (CNN) by default.**\r\n\r\n\r\n### Support Change and Deprecation Notices\r\n* Using deprecated features and components is not advised. They are available to enable a smooth transition to new solutions and will be discontinued in the future. To keep using Discontinued features, you will have to revert to the last LTS OpenVINO version supporting them.\r\nFor more details, refer to the [OpenVINO Legacy Features and Components page](https:\u002F\u002Fdocs.openvino.ai\u002F2024\u002Fdocumentation\u002Flegacy-features.html).\r\n* Discontinued in 2024.0:\r\n   * Runtime components:  \r\n      * [Intel® Gaussian & Neural Accelerator (Intel® GNA).](https:\u002F\u002Fdocs.openvino.ai\u002F2023.3\u002Fopenvino_docs_OV_UG_supported_plugins_GNA.html) Consider using the Neural Processing Unit (NPU) for low-powered systems like Intel® Core™ Ultra or 14th generation and beyond.\r\n      * OpenVINO C++\u002FC\u002FPython 1.0 APIs (see [2023.3 API transition guide](https:\u002F\u002Fdocs.openvino.ai\u002F2023.3\u002Fopenvino_2_0_transition_guide.html) for reference). \r\n      * All ONNX Frontend legacy API (known as ONNX_IMPORTER_API)  \r\n      * 'PerfomanceMode.UNDEFINED' property as part of the OpenVINO Python API  \r\n   * Tools:  \r\n      * Deployment Manager. See [installation](https:\u002F\u002Fdocs.openvino.ai\u002F2024\u002Fget-started\u002Finstall-openvino.html) and [deployment guides](https:\u002F\u002Fdocs.openvino.ai\u002F2024\u002Fopenvino-workflow\u002Fdeployment-locally.html) for current distribution options. \r\n      * [Accuracy Checker](https:\u002F\u002Fdocs.openvino.ai\u002F2023.3\u002Fomz_tools_accuracy_checker.html).\r\n      * [Post-Training Optimization Tool (POT)](https:\u002F\u002Fdocs.openvino.ai\u002F2023.3\u002Fpot_introduction.html). Neural Network Compression Framework (NNCF) should be used instead.\r\n      * [A Git patch](https:\u002F\u002Fgithub.com\u002Fopenvinotoolkit\u002Fnncf\u002Ftree\u002Fdevelop\u002Fthird_party_integration\u002Fhuggingface_transformers) for NNCF integration with [huggingface\u002Ftransformers](https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Ftransformers). The recommended approach is to use [huggingface\u002Foptimum-intel](https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel) for applying NNCF optimization on top of models from Hugging Face.\r\n      * Support for Apache MXNet, Caffe, and Kaldi model formats. Conversion to ONNX may be used as a solution. \r\n* Deprecated and to be removed in the future:\r\n   * The OpenVINO™ Development Tools package (`pip install openvino-dev`) will be removed from installation options and distribution channels beginning with OpenVINO 2025.0. \r\n   * Model Optimizer will be discontinued with OpenVINO 2025.0. Consider using the [new conversion methods](https:\u002F\u002Fdocs.openvino.ai\u002F2024\u002Fopenvino-workflow\u002Fmodel-preparation\u002Fconvert-model-to-ir.html) instead. For more details, see the [model conversion transition guide](https:\u002F\u002Fdocs.openvino.ai\u002F2024\u002Fdocumentation\u002Flegacy-features\u002Ftransition-legacy-conversion-api.html).\r\n   * OpenVINO property Affinity API will be discontinued with OpenVINO 2025.0. It will be replaced with CPU binding configurations (_ov::hint::enable_cpu_pinning)_. \r\n   * OpenVINO Model Server components:\r\n      * “auto shape” and “auto batch size” (reshaping a model in runtime) will be removed in the future. OpenVINO’s dynamic shape m","2024-04-25T14:45:07",{"id":264,"version":265,"summary_zh":266,"released_at":267},109119,"2024.0.0","### Summary of major features and improvements   \r\n\r\n* #### More Generative AI coverage and framework integrations to minimize code changes. \r\n   * **Improved out-of-the-box experience for TensorFlow\\* sentence encoding models** through the installation of OpenVINO™ toolkit Tokenizers. \r\n   * **OpenVINO™ toolkit now supports Mixture of Experts (MoE)**, a new architecture that helps process more efficient generative models through the pipeline. \r\n   * **JavaScript developers now have seamless access to OpenVINO API.** This new binding enables a smooth integration with JavaScript API. \r\n   * **New and noteworthy models validated**: Mistral, StableLM-tuned-alpha-3b, and StableLM-Epoch-3B. \r\n   \r\n* #### Broader Large Language Model (LLM) support and more model compression techniques.\r\n   * **Improved quality on INT4 weight compression for LLMs** by adding the popular technique, Activation-aware Weight Quantization, to the Neural Network Compression Framework (NNCF).  This addition reduces memory requirements and helps speed up token generation. \r\n   * **Experience enhanced LLM performance on Intel® CPUs**, with internal memory state enhancement, and INT8 precision for KV-cache. Specifically tailored for multi-query LLMs like ChatGLM.    ​\r\n   * **Easier optimization and conversion of Hugging Face models** – compress LLM models to INT8 and INT4 with Hugging Face Optimum command line interface and export models to OpenVINO format.  Note this is part of [Optimum-Intel](https:\u002F\u002Fhuggingface.co\u002Fdocs\u002Foptimum\u002Fintel\u002Findex) which needs to be installed separately.\r\n   * **The OpenVINO™ 2024.0 release makes it easier for developers, by integrating more OpenVINO™ features with the Hugging Face\\* ecosystem.** Store quantization configurations for popular models directly in Hugging Face to compress models into INT4 format while preserving accuracy and performance. \r\n\r\n* #### More portability and performance to run AI at the edge, in the cloud, or locally.\r\n   * **A preview plugin architecture of the integrated Neural Processor Unit (NPU) as part of Intel® Core™ Ultra processor is now included in the main OpenVINO™ package on PyPI.** \r\n   * **Improved performance on ARM*** by enabling the ARM threading library. In addition, we now support multi-core ARM platforms and enabled FP16 precision by default on MacOS*.\r\n   * **Improved performance on ARM platforms using throughput hint**, which increases efficiency in utilization of CPU cores and memory bandwidth.​\r\n   * **New and improved LLM serving samples from OpenVINO™ Model Server** for multi-batch inputs and Retrieval Augmented Generation (RAG). \r\n   \r\n\r\n### Support Change and Deprecation Notices\r\n* Using deprecated features and components is not advised. They are available to enable a smooth transition to new solutions and will be discontinued in the future. To keep using Discontinued features, you will have to revert to the last LTS OpenVINO version supporting them.\r\nFor more details, refer to the [OpenVINO Legacy Features and Components page](https:\u002F\u002Fdocs.openvino.ai\u002F2023.3\u002Fopenvino_legacy_features.html).\r\n* Discontinued in 2024.0:\r\n   * Runtime components:  \r\n      * [Intel® Gaussian & Neural Accelerator (Intel® GNA).](https:\u002F\u002Fdocs.openvino.ai\u002F2023.3\u002Fopenvino_docs_OV_UG_supported_plugins_GNA.html) Consider using the Neural Processing Unit (NPU) for low-powered systems like Intel® Core™ Ultra or 14th generation and beyond.\r\n      * OpenVINO C++\u002FC\u002FPython 1.0 APIs (see 2023.3 API transition guide for reference). \r\n      * All ONNX Frontend legacy API (known as ONNX_IMPORTER_API)  \r\n      * 'PerfomanceMode.UNDEFINED' property as part of the OpenVINO Python API  \r\n   * Tools:  \r\n      * Deployment Manager. See installation and deployment guides for current distribution options. \r\n      * [Accuracy Checker](https:\u002F\u002Fdocs.openvino.ai\u002F2023.3\u002Fomz_tools_accuracy_checker.html).\r\n      * [Post-Training Optimization Tool (POT)](https:\u002F\u002Fdocs.openvino.ai\u002F2023.3\u002Fpot_introduction.html). Neural Network Compression Framework (NNCF) should be used instead.\r\n      * [a git patch](https:\u002F\u002Fgithub.com\u002Fopenvinotoolkit\u002Fnncf\u002Ftree\u002Fdevelop\u002Fthird_party_integration\u002Fhuggingface_transformers) for NNCF integration with [huggingface\u002Ftransformers](https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Ftransformers). The recommended approach is to use [huggingface\u002Foptimum-intel](https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel) for applying NNCF optimization on top of models from Hugging Face.\r\n      * Support for Apache MXNet, Caffe, and Kaldi model formats. Conversion to ONNX may be used as a solution. \r\n* Deprecated and to be removed in the future:\r\n   * The OpenVINO™ Development Tools package (`pip install openvino-dev`) will be removed from installation options and distribution channels beginning with OpenVINO 2025.0. \r\n   * Model Optimizer will be discontinued with OpenVINO 2025.0. Consider using OpenVINO [Model Converter](https:\u002F\u002Fdocs.openvino.ai\u002F2024\u002Fopenvino-workflow\u002Fmodel-preparation\u002Fconvert-model-to-ir.html) (API call: OVC) ins","2024-03-06T14:21:11",{"id":269,"version":270,"summary_zh":271,"released_at":272},109120,"2023.3.0","### Summary of major features and improvements   \r\n\r\n* #### More Generative AI coverage and framework integrations to minimize code changes.\r\n   * **Introducing [OpenVINO Gen AI repository ](https:\u002F\u002Fgithub.com\u002Fopenvinotoolkit\u002Fopenvino.genai)on GitHub** that demonstrates native C and C++ pipeline samples for Large Language Models (LLMs).  String tensors are now supported as inputs and tokenizers natively to reduce overhead and ease production. ​\r\n   * **New and noteworthy models validated**; Mistral, Zephyr, Qwen, ChatGLM3, and Baichuan.  \r\n   * **New Jupyter Notebooks** for [Latent Consistency Models (LCM)](https:\u002F\u002Fgithub.com\u002Fopenvinotoolkit\u002Fopenvino_notebooks\u002Ftree\u002Fmain\u002Fnotebooks\u002F263-latent-consistency-models-image-generation) and  [Distil-Whisper](https:\u002F\u002Fgithub.com\u002Fopenvinotoolkit\u002Fopenvino_notebooks\u002Ftree\u002Fmain\u002Fnotebooks\u002F267-distil-whisper-asr). Updated [LLM Chatbot notebook](https:\u002F\u002Fgithub.com\u002Fopenvinotoolkit\u002Fopenvino_notebooks\u002Ftree\u002Fmain\u002Fnotebooks\u002F254-llm-chatbot) to include LangChain, Neural Chat, TinyLlama, ChatGLM3, Qwen, Notus, and Youri models.\r\n   * **Torch.compile is now fully integrated with OpenVINO**, which now includes a hardware 'options' parameter allowing for seamless inference hardware selection by leveraging the plugin architecture in OpenVINO. ​\r\n   \r\n* #### Broader Large Language Model (LLM) support and more model compression techniques.\r\n   * **As part of the Neural Network Compression Framework (NNCF),  INT4 weight compression model formats are now fully supported on Intel® Xeon® CPUs  in addition to Intel® Core™ and iGPU**, adding more performance, lower memory usage, and accuracy opportunity when using LLMs.​\r\n   * **Improved performance of transformer-based LLM on CPU and GPU** using stateful model technique to increase memory efficiency where internal states are shared among multiple iterations of inference.  ​\r\n   * **Easier optimization and conversion of Hugging Face models** – compress LLM models to INT8 and INT4 with Hugging Face Optimum command line interface and export models to OpenVINO format.  Note this is part of [Optimum-Intel](https:\u002F\u002Fhuggingface.co\u002Fdocs\u002Foptimum\u002Fintel\u002Findex) which needs to be installed separately.\r\n   * **Tokenizer and TorchVision transform support is now available in the OpenVINO runtime** (via new API) requiring less preprocessing code and enhancing performance by automatically handling this model setup.  More details on Tokenizers support in the Ecosystem section.\r\n\r\n* #### More portability and performance to run AI at the edge, in the cloud, or locally.\r\n   * **Full support for 5th Gen Intel® Xeon® Scalable processors (codename Emerald Rapids)**\r\n   * **Further optimized performance on Intel® Core™ Ultra (codename Meteor Lake) CPU with latency hint**, by leveraging both P-core and E-cores.​\r\n   * **Improved performance on ARM platforms using throughput hint**, which increases efficiency in utilization of CPU cores and memory bandwidth.​\r\n   * **Preview JavaScript API** to enable node JS development to access JavaScript binding via source code.​ See details below.\r\n   * **Improved [model serving of LLMs](https:\u002F\u002Fgithub.com\u002Fopenvinotoolkit\u002Fmodel_server\u002Ftree\u002Fmain\u002Fdemos\u002Fpython_demos\u002Fllm_text_generation) through OpenVINO Model Server**.  This not only enables LLM serving over KServe v2 gRPC and REST APIs for more flexibility but also improves throughput by running processing like tokenization on the server side.​ More details in the Ecosystem section.\r\n   \r\n\r\n### Support Change and Deprecation Notices\r\n* The OpenVINO™ Development Tools package (`pip install openvino-dev`) is deprecated and will be removed from installation options and distribution channels beginning with the 2025.0 release. For more details, refer to the [OpenVINO Legacy Features and Components page](https:\u002F\u002Fdocs.openvino.ai\u002F2023.3\u002Fopenvino_legacy_features.html).  \r\n* Ubuntu 18.04 support is discontinued in the 2023.3 LTS release.  The recommended version of Ubuntu is 22.04.\r\n* Starting with 2023.3 OpenVINO longer supports Python 3.7 due to the Python community discontinuing support. Update to a newer version (currently 3.8-3.11) to avoid interruptions.  \r\n* All ONNX Frontend legacy API (known as ONNX_IMPORTER_API) will no longer be available in the 2024.0 release.\r\n'PerfomanceMode.UNDEFINED' property as part of the OpenVINO Python API will be discontinued in the 2024.0 release.\r\n* Tools: \r\n   * [Deployment Manager](https:\u002F\u002Fdocs.openvino.ai\u002F2023.3\u002Fopenvino_docs_install_guides_deployment_manager_tool.html) is deprecated and will be supported for two years according to the LTS policy. Visit the [selector tool](https:\u002F\u002Fwww.intel.com\u002Fcontent\u002Fwww\u002Fus\u002Fen\u002Fdeveloper\u002Ftools\u002Fopenvino-toolkit\u002Fdownload.html?VERSION=v_2023_2_0&OP_SYSTEM=MACOS&DISTRIBUTION=ARCHIVE) to see package distribution options or the [deployment guide](https:\u002F\u002Fdocs.openvino.ai\u002F2023.3\u002Fopenvino_deployment_guide.html) documentation.\r\n   * [Accuracy Checker](https:\u002F\u002Fdocs.openvino.ai\u002F2023.3\u002Fomz_tools_accuracy_checker.html) is deprecate","2024-01-24T13:10:29",{"id":274,"version":275,"summary_zh":276,"released_at":277},109121,"2023.2.0","### Summary of major features and improvements   \r\n \r\n* #### More Generative AI coverage and framework integrations to minimize code changes.\r\n   * **Expanded model support for direct PyTorch model conversion** – automatically convert additional models directly from PyTorch or execute via torch.compile with OpenVINO as the backend. \r\n   * **New and noteworthy models supported** – we have enabled models used for chatbots, instruction following, code generation, and many more, including prominent models like LLaVA, chatGLM, Bark (text to audio), and LCM (Latent Consistency Models, an optimized version of Stable Diffusion). \r\n   * **Easier optimization and conversion of Hugging Face models** – compress LLM models to Int8 with the Hugging Face Optimum command line interface and export models to the OpenVINO IR format. \r\n   * **OpenVINO is now available on Conan** – a package manager which enables more seamless package management for large-scale projects for C and  C++ developers.\r\n \r\n* #### Broader Large Language Model (LLM) support and more model compression techniques.\r\n   * **Accelerate inference for LLM models on Intel® Core™  CPU and iGPU with the use of Int8 model weight compression.**  \r\n   * **Expanded model support for dynamic shapes for improved performance on GPU.** \r\n   * **Preview support for Int4 model format is now included.** Int4 optimized model weights are now available to try on Intel® Core™ CPU and iGPU, to accelerate models like Llama 2 and chatGLM2. \r\n   * **The following Int4 model compression formats are supported for inference in runtime:**  \r\n      * Generative Pre-training Transformer Quantization (GPTQ); with GPTQ-compressed models, you can access them through the Hugging Face repositories.\r\n      * Native Int4 compression through Neural Network Compression Framework (NNCF).\r\n* #### More portability and performance to run AI at the edge, in the cloud, or locally.\r\n   * **In 2023.1 we announced full support for ARM** architecture, now we have improved performance by enabling FP16 model formats for LLMs and integrating additional acceleration libraries to improve latency. \r\n\r\n### Support Change and Deprecation Notices\r\n* The OpenVINO™ Development Tools package (`pip install openvino-dev`) is deprecated and will be removed from installation options and distribution channels with 2025.0. To learn more, refer to the OpenVINO [Legacy Features and Components](https:\u002F\u002Fdocs.openvino.ai\u002F2023.2\u002Fopenvino_legacy_features.html) page. To ensure optimal performance, install the OpenVINO package (pip install openvino), which includes essential components such as OpenVINO Runtime, OpenVINO Converter, and Benchmark Tool.  \r\n* Tools:  \r\n   * [Deployment Manager](https:\u002F\u002Fdocs.openvino.ai\u002F2023.2\u002Fopenvino_docs_install_guides_deployment_manager_tool.html) is deprecated and will be removed in the 2024.0 release.  \r\n   * [Accuracy Checker](https:\u002F\u002Fdocs.openvino.ai\u002F2023.2\u002Fomz_tools_accuracy_checker.html) is deprecated and will be discontinued with 2024.0.    \r\n   * [Post-Training Optimization Tool (POT) ](https:\u002F\u002Fdocs.openvino.ai\u002F2023.2\u002Fpot_introduction.html) is deprecated and will be discontinued with 2024.0.  \r\n   * [Model Optimizer](https:\u002F\u002Fdocs.openvino.ai\u002F2023.2\u002Fopenvino_docs_OV_Converter_UG_prepare_model_convert_model_MO_OVC_transition.html) is deprecated and will be fully supported up until the 2025.0 release. Model conversion to the OpenVINO IR format should be performed through OpenVINO Model Converter which is part of the PyPI package. Follow the Model Optimizer to OpenVINO Model Converter [transition guide](https:\u002F\u002Fdocs.openvino.ai\u002F2023.2\u002Fopenvino_docs_OV_Converter_UG_prepare_model_convert_model_MO_OVC_transition.html) for smoother transition. Known limitations are TensorFlow model with TF1 Control flow and object detection models. These limitations relate to the gap in TensorFlow direct conversion capabilities which will be addressed in upcoming releases. \r\n   * PyTorch 1.13 support is deprecated in Neural Network Compression Framework (NNCF).\r\n* Runtime:  \r\n   * [Intel® Gaussian & Neural Accelerator (Intel® GNA)](https:\u002F\u002Fdocs.openvino.ai\u002F2023.2\u002Fopenvino_docs_OV_UG_supported_plugins_GNA.html) will be deprecated in a future release. We encourage developers to use the Neural Processing Unit (NPU) for low powered systems like Intel® Core™ Ultra or 14th generation and beyond.  \r\n   * OpenVINO C++\u002FC\u002FPython 1.0 APIs will be discontinued with 2024.0.  \r\n   * PyTorch 1.13 support is deprecated in Neural Network Compression Framework (NNCF).\r\n\r\nYou can find OpenVINO™ toolkit 2023.2 release here:\r\n* [Download archives*](https:\u002F\u002Fstorage.openvinotoolkit.org\u002Frepositories\u002Fopenvino\u002Fpackages\u002F2023.2\u002F) with OpenVINO™\r\n* [Install it via Conda](https:\u002F\u002Fanaconda.org\u002Fconda-forge\u002Fopenvino): `conda install -c conda-forge openvino=2023.2.0`\r\n* [OpenVINO™](https:\u002F\u002Fpypi.org\u002Fproject\u002Fopenvino\u002F2023.2.0\u002F) for Python: `pip install openvino==2023.2.0`\r\n\r\n**Acknowledgements**\r\n\r\nThanks for contributions from the OpenVINO developer ","2023-11-16T15:21:42",{"id":279,"version":280,"summary_zh":281,"released_at":282},109122,"2023.2.0.dev20230922","_NOTE: This version is pre-release software and has not undergone full release validation or qualification. No support is offered on pre-release software and APIs\u002Fbehavior are subject to change.  It should NOT be incorporated into any production software\u002Fsolution and instead should be used only for early testing and integration while awaiting a final release version of this software._\r\n\r\nOpenVINO™ toolkit pre-release definition:\r\n* It is introduced to get early feedback from the community. \r\n* The scope and functionality of the pre-release version is subject to change in the future. \r\n* Using the pre-release in production is strongly discouraged.\r\n\r\nYou can find OpenVINO™ toolkit 2023.2.0.dev20230922 pre-release version here:\r\n* [Download archives*](https:\u002F\u002Fstorage.openvinotoolkit.org\u002Frepositories\u002Fopenvino\u002Fpackages\u002Fmaster\u002F2023.2.0.dev20230922\u002F) with OpenVINO™\r\n* [Install it via Conda](https:\u002F\u002Fanaconda.org\u002Fconda-forge\u002Fopenvino): `conda install -c \"conda-forge\u002Flabel\u002Fopenvino_dev\" openvino=2023.2.0.dev20230922`\r\n* [OpenVINO™](https:\u002F\u002Fpypi.org\u002Fproject\u002Fopenvino\u002F2023.2.0.dev20230922\u002F) for Python: `pip install --pre openvino` or `pip install openvino==2023.2.0.dev20230922`\r\n\r\nRelease notes are available here: https:\u002F\u002Fdocs.openvino.ai\u002Fnightly\u002Fprerelease_information.html\r\nRelease documentation is available here: https:\u002F\u002Fdocs.openvino.ai\u002Fnightly\u002F\r\n\r\n**What's Changed**\r\n* CPU runtime:\r\n    * Optimized Yolov8n and YoloV8s models on BF16\u002FFP32.\r\n    * Optimized Falcon model on 4th Generation Intel® Xeon® Scalable Processors.\r\n* GPU runtime:\r\n    * int8 weight compression further improves LLM performance. PR #19548\r\n    * Optimization for gemm & fc in iGPU. PR #19780\r\n* TensorFlow FE:\r\n    * Added support for Selu operation. PR #19528\r\n    * Added support for XlaConvV2 operation. PR #19466\r\n    * Added support for TensorListLength and TensorListResize operations. PR #19390\r\n* PyTorch FE:\r\n    * New operations supported\r\n        * aten::minimum aten::maximum. PR #19996\r\n        * aten::broadcast_tensors. PR #19994\r\n        * added support aten::logical_and, aten::logical_or, aten::logical_not, aten::logical_xor. PR #19981\r\n        * aten::scatter_reduce and extend aten::scatter. PR #19980\r\n        * prim::TupleIndex operation. PR #19978\r\n        * mixed precision in aten::min\u002Fmax. PR #19936\r\n        * aten::tile op PR #19645\r\n        * aten::one_hot PR #19779\r\n        * PReLU. PR #19515\r\n        * aten::swapaxes. PR #19483\r\n        * non-boolean inputs for __or__ and __and__ operations. PR #19268\r\n* Torchvision NMS can accept negative scores. PR #19826\r\n\r\n**New openvino_notebooks:**\r\n* Visual Question Answering and Image Captioning using BLIP\r\n\r\n**Fixed GitHub issues**\r\n* Fixed #19784 “[Bug]: Cannot install libprotobuf-dev along with libopenvino-2023.0.2 on Ubuntu 22.04” with PR #19788\r\n* Fixed #19617 “Add a clear error message when creating an empty Constant” with PR #19674\r\n* Fixed #19616 “Align openvino.compile_model and openvino.Core.compile_model functions” with PR #19778\r\n* Fixed #19469 “[Feature Request]: Add SeLu activation in the OpenVino IR (TensorFlow Conversion)” with PR #19528\r\n* Fixed #19019 “[Bug]: Low performance of the TF quantized model.” With PR #19735\r\n* Fixed #19018 “[Feature Request]: Support aarch64 python wheel for Linux” with PR #19594\r\n* Fixed #18831 “Question: openvino support for Nvidia Jetson Xavier ?” with PR #19594\r\n* Fixed #18786 “OpenVINO Wheel does not install Debug libraries when CMAKE_BUILD_TYPE is Debug #18786” with PR #19197\r\n* Fixed #18731 “[Bug] Wrong output shapes of MaxPool” with PR #18965\r\n* Fixed #18091 “[Bug] 2023.0 Version crashes on Jetson Nano - L4T - Ubuntu 18.04” with PR #19717\r\n* Fixed #7194 “Conan for simplifying dependency management” with PR #17580\r\n\r\n**Acknowledgements**\r\n\r\nThanks for contributions from the OpenVINO developer community:\r\n@siddhant-0707,\r\n@PRATHAM-SPS,\r\n@okhovan\r\n\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fopenvinotoolkit\u002Fopenvino\u002Fcompare\u002F2023.1.0.dev20230811...2023.2.0.dev20230922","2023-09-27T12:40:03",{"id":284,"version":285,"summary_zh":286,"released_at":287},109123,"2023.1.0","#### Summary of major features and improvements \r\n* More Generative AI options with Hugging Face and improved PyTorch model support.\r\n   * NEW: Your PyTorch solutions are now even further enhanced with OpenVINO. You’ve got more options and you no longer need to convert to ONNX for deployment. Developers can now use their API of choice - PyTorch or OpenVINO for added performance benefits. Additionally, users can automatically import and convert PyTorch models for quicker deployment. You can continue to make the most of OpenVINO tools for advanced model compression and deployment advantages, ensuring flexibility and a range of options. \r\n   * torch.compile (preview) – OpenVINO is now available as a backend through PyTorch torch.compile, empowering developers to utilize OpenVINO toolkit through PyTorch APIs. This feature has also been integrated into the Automatic1111 Stable Diffusion Web UI, helping developers achieve accelerated performance for Stable Diffusion 1.5 and 2.1 on Intel CPUs and GPUs in both Native Linux and Windows OS platforms.\r\n   * Optimum Intel – Hugging Face and Intel continue to enhance top generative AI models by optimizing execution, making your models run faster and more efficiently on both CPU and GPU. OpenVINO serves as a runtime for inferencing execution. New PyTorch auto import and conversion capabilities have been enabled, along with support for weights compression to achieve further performance gains.\r\n* Broader LLM model support and more model compression techniques\r\n   * Enhanced performance and accessibility for Generative AI:   Runtime performance and memory usage have been significantly optimized, especially for Large Language models (LLMs). Models used for chatbots, instruction following, code generation, and many more, including prominent models like BLOOM, Dolly, Llama 2, GPT-J, GPTNeoX, ChatGLM, and Open-Llama  have been enabled.\r\n   * Improved LLMs on GPU – Model coverage for dynamic shapes support has been expanded, further helping the performance of generative AI workloads on both integrated and discrete GPUs. Furthermore, memory reuse and weight memory consumption for dynamic shapes have been improved.  \r\n   * Neural Network Compression Framework (NNCF) now includes an 8-bit weights compression method, making it easier to compress and optimize LLM models.  SmoothQuant method has been added for more accurate and efficient post-training quantization for Transformer-based models.\r\n* More portability and performance to run AI at the edge, in the cloud or locally.\r\n   * NEW: Support for Intel(R) Core(TM) Ultra (codename Meteor Lake). This new generation of Intel CPUs is tailored to excel in AI workloads with a built-in inference accelerators.\r\n   * Integration with MediaPipe – Developers now have direct access to this framework for building multipurpose AI pipelines. Easily integrate with OpenVINO Runtime and OpenVINO Model Server to enhance performance for faster AI model execution. You also benefit from seamless model management and version control, as well as custom logic integration with additional calculators and graphs for tailored AI solutions. Lastly, you can scale faster by delegating deployment to remote hosts via gRPC\u002FREST interfaces for distributed processing.\r\n\r\n#### Support Change and Deprecation Notices\r\n   * OpenVINO™ Development Tools package (pip install openvino-dev) is currently being deprecated and will be removed from installation options and distribution channels with 2025.0. For more info, see [the documentation for Legacy Features](https:\u002F\u002Fdocs.openvino.ai\u002F2023.1\u002Fopenvino_legacy_features.html).\r\n   * Tools: \r\n      * Accuracy Checker is deprecated and will be discontinued with 2024.0.\r\n      * Post-Training Optimization Tool (POT)  has been deprecated and will be discontinued with 2024.0.\r\n   * Runtime: \r\n      * Intel® Gaussian & Neural Accelerator (Intel® GNA) is being deprecated, the GNA plugin will be discontinued with 2024.0.\r\n      * OpenVINO C++\u002FC\u002FPython 1.0 APIs will be discontinued with 2024.0.\r\n      * Python 3.7 will be discontinued with 2023.2 LTS release.\r\n\r\nYou can find OpenVINO™ toolkit 2023.1 release here:\r\n* [Download archives*](https:\u002F\u002Fstorage.openvinotoolkit.org\u002Frepositories\u002Fopenvino\u002Fpackages\u002F2023.1\u002F) with OpenVINO™\r\n* [Install it via Conda](https:\u002F\u002Fanaconda.org\u002Fconda-forge\u002Fopenvino): `conda install -c conda-forge openvino=2023.1.0`\r\n* [OpenVINO™](https:\u002F\u002Fpypi.org\u002Fproject\u002Fopenvino\u002F2023.1.0\u002F) for Python: `pip install openvino==2023.1.0`\r\n\r\nRelease documentation is available here: https:\u002F\u002Fdocs.openvino.ai\u002F2023.1\r\nRelease Notes are available here: https:\u002F\u002Fwww.intel.com\u002Fcontent\u002Fwww\u002Fus\u002Fen\u002Fdeveloper\u002Farticles\u002Frelease-notes\u002Fopenvino\u002F2023-1.html","2023-09-18T09:20:43"]