[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-huggingface--optimum-intel":3,"tool-huggingface--optimum-intel":64},[4,17,27,35,43,56],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":16},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,3,"2026-04-05T11:01:52",[13,14,15],"开发框架","图像","Agent","ready",{"id":18,"name":19,"github_repo":20,"description_zh":21,"stars":22,"difficulty_score":23,"last_commit_at":24,"category_tags":25,"status":16},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",140436,2,"2026-04-05T23:32:43",[13,15,26],"语言模型",{"id":28,"name":29,"github_repo":30,"description_zh":31,"stars":32,"difficulty_score":23,"last_commit_at":33,"category_tags":34,"status":16},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 
绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",107662,"2026-04-03T11:11:01",[13,14,15],{"id":36,"name":37,"github_repo":38,"description_zh":39,"stars":40,"difficulty_score":23,"last_commit_at":41,"category_tags":42,"status":16},3704,"NextChat","ChatGPTNextWeb\u002FNextChat","NextChat 是一款轻量且极速的 AI 助手，旨在为用户提供流畅、跨平台的大模型交互体验。它完美解决了用户在多设备间切换时难以保持对话连续性，以及面对众多 AI 模型不知如何统一管理的痛点。无论是日常办公、学习辅助还是创意激发，NextChat 都能让用户随时随地通过网页、iOS、Android、Windows、MacOS 或 Linux 端无缝接入智能服务。\n\n这款工具非常适合普通用户、学生、职场人士以及需要私有化部署的企业团队使用。对于开发者而言，它也提供了便捷的自托管方案，支持一键部署到 Vercel 或 Zeabur 等平台。\n\nNextChat 的核心亮点在于其广泛的模型兼容性，原生支持 Claude、DeepSeek、GPT-4 及 Gemini Pro 等主流大模型，让用户在一个界面即可自由切换不同 AI 能力。此外，它还率先支持 MCP（Model Context Protocol）协议，增强了上下文处理能力。针对企业用户，NextChat 提供专业版解决方案，具备品牌定制、细粒度权限控制、内部知识库整合及安全审计等功能，满足公司对数据隐私和个性化管理的高标准要求。",87618,"2026-04-05T07:20:52",[13,26],{"id":44,"name":45,"github_repo":46,"description_zh":47,"stars":48,"difficulty_score":23,"last_commit_at":49,"category_tags":50,"status":16},2268,"ML-For-Beginners","microsoft\u002FML-For-Beginners","ML-For-Beginners 是由微软推出的一套系统化机器学习入门课程，旨在帮助零基础用户轻松掌握经典机器学习知识。这套课程将学习路径规划为 12 周，包含 26 节精炼课程和 52 道配套测验，内容涵盖从基础概念到实际应用的完整流程，有效解决了初学者面对庞大知识体系时无从下手、缺乏结构化指导的痛点。\n\n无论是希望转型的开发者、需要补充算法背景的研究人员，还是对人工智能充满好奇的普通爱好者，都能从中受益。课程不仅提供了清晰的理论讲解，还强调动手实践，让用户在循序渐进中建立扎实的技能基础。其独特的亮点在于强大的多语言支持，通过自动化机制提供了包括简体中文在内的 50 多种语言版本，极大地降低了全球不同背景用户的学习门槛。此外，项目采用开源协作模式，社区活跃且内容持续更新，确保学习者能获取前沿且准确的技术资讯。如果你正寻找一条清晰、友好且专业的机器学习入门之路，ML-For-Beginners 将是理想的起点。",84991,"2026-04-05T10:45:23",[14,51,52,53,15,54,26,13,55],"数据工具","视频","插件","其他","音频",{"id":57,"name":58,"github_repo":59,"description_zh":60,"stars":61,"difficulty_score":10,"last_commit_at":62,"category_tags":63,"status":16},3128,"ragflow","infiniflow\u002Fragflow","RAGFlow 
是一款领先的开源检索增强生成（RAG）引擎，旨在为大语言模型构建更精准、可靠的上下文层。它巧妙地将前沿的 RAG 技术与智能体（Agent）能力相结合，不仅支持从各类文档中高效提取知识，还能让模型基于这些知识进行逻辑推理和任务执行。\n\n在大模型应用中，幻觉问题和知识滞后是常见痛点。RAGFlow 通过深度解析复杂文档结构（如表格、图表及混合排版），显著提升了信息检索的准确度，从而有效减少模型“胡编乱造”的现象，确保回答既有据可依又具备时效性。其内置的智能体机制更进一步，使系统不仅能回答问题，还能自主规划步骤解决复杂问题。\n\n这款工具特别适合开发者、企业技术团队以及 AI 研究人员使用。无论是希望快速搭建私有知识库问答系统，还是致力于探索大模型在垂直领域落地的创新者，都能从中受益。RAGFlow 提供了可视化的工作流编排界面和灵活的 API 接口，既降低了非算法背景用户的上手门槛，也满足了专业开发者对系统深度定制的需求。作为基于 Apache 2.0 协议开源的项目，它正成为连接通用大模型与行业专有知识之间的重要桥梁。",77062,"2026-04-04T04:44:48",[15,14,13,26,54],{"id":65,"github_repo":66,"name":67,"description_en":68,"description_zh":69,"ai_summary_zh":69,"readme_en":70,"readme_zh":71,"quickstart_zh":72,"use_case_zh":73,"hero_image_url":74,"owner_login":75,"owner_name":76,"owner_avatar_url":77,"owner_bio":78,"owner_company":79,"owner_location":79,"owner_email":79,"owner_twitter":75,"owner_website":80,"owner_url":81,"languages":82,"stars":95,"forks":96,"last_commit_at":97,"license":98,"difficulty_score":23,"env_os":99,"env_gpu":100,"env_ram":99,"env_deps":101,"category_tags":108,"github_topics":109,"view_count":10,"oss_zip_url":79,"oss_zip_packed_at":79,"status":16,"created_at":117,"updated_at":118,"faqs":119,"releases":148},825,"huggingface\u002Foptimum-intel","optimum-intel","🤗 Optimum Intel: Accelerate inference with Intel optimization tools","optimum-intel 是 Hugging Face Optimum 生态中专门用于 Intel 硬件加速的核心组件。它作为桥梁，无缝连接了 🤗 Transformers 和 Diffusers 库与 Intel OpenVINO 工具，帮助开发者在 Intel CPU、GPU 及专用推理加速器上实现高效推理。\n\n为解决模型在特定硬件上运行缓慢的问题，optimum-intel 提供了一套简洁的接口，让用户能够轻松优化模型。它支持量化、剪枝和知识蒸馏等压缩技术，并能将模型转换为 OpenVINO IR 格式以便快速部署。无论是文本生成任务还是图像扩散模型，都能显著提升执行效率。\n\noptimum-intel 非常适合 AI 开发者、算法研究人员以及需要将模型落地到生产环境的工程团队。通过简单的命令行操作或 Python API，即可实现从模型导出到推理的全流程加速。此外，它还兼容最新的 Intel 设备列表，确保性能最大化。如果你正在寻找提升 Intel 架构下 AI 应用速度的方案，optimum-intel 是一个值得信赖的选择。","\u003Cp align=\"center\">\n    \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fhuggingface_optimum-intel_readme_204c310c7e97.png\" \u002F>\n\u003C\u002Fp>\n\n# Optimum Intel\n\n🤗 
[Optimum Intel](https:\u002F\u002Fhuggingface.co\u002Fdocs\u002Foptimum-intel\u002Fen\u002Findex) is the interface between the 🤗 Transformers and Diffusers libraries and the different tools and libraries provided by [OpenVINO](https:\u002F\u002Fdocs.openvino.ai) to accelerate end-to-end pipelines on Intel architectures.\n\n[OpenVINO](https:\u002F\u002Fdocs.openvino.ai) is an open-source toolkit that enables high performance inference capabilities for Intel CPUs, GPUs, and special DL inference accelerators ([see](https:\u002F\u002Fdocs.openvino.ai\u002F2024\u002Fabout-openvino\u002Fcompatibility-and-support\u002Fsupported-devices.html) the full list of supported devices). It is supplied with a set of tools to optimize your models with compression techniques such as quantization, pruning and knowledge distillation. Optimum Intel provides a simple interface to optimize your Transformers and Diffusers models, convert them to the OpenVINO Intermediate Representation (IR) format and run inference using OpenVINO Runtime.\n\n\n## Installation\n\nTo install the latest release of 🤗 Optimum Intel with the corresponding required dependencies, you can use `pip` as follows:\n\n```bash\npython -m pip install -U \"optimum-intel[openvino]\"\n```\n\nOptimum Intel is a fast-moving project with regular additions of new model support, so you may want to install from source with the following command:\n\n```bash\npython -m pip install \"optimum-intel\"@git+https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel.git\n```\n\n**Deprecation Notice:** The `extras` for `openvino` (e.g., `pip install optimum-intel[openvino,nncf]`), `nncf`, `neural-compressor`, `ipex` are **deprecated** and will be **removed in a future release**.  
\n\n\n## Export:\n\nTo export your model to [OpenVINO IR](https:\u002F\u002Fdocs.openvino.ai\u002F2025\u002Fdocumentation\u002Fopenvino-ir-format.html) format, use the optimum-cli tool.\nBelow is an example of exporting [TinyLlama\u002FTinyLlama_v1.1](https:\u002F\u002Fhuggingface.co\u002FTinyLlama\u002FTinyLlama_v1.1) model:\n\n```sh\noptimum-cli export openvino --model TinyLlama\u002FTinyLlama_v1.1 ov_TinyLlama_v1_1\n```\n\nAdditional information on exporting models is available in the [documentation](https:\u002F\u002Fhuggingface.co\u002Fdocs\u002Foptimum-intel\u002Fen\u002Fopenvino\u002Fexport).\n\n## Inference:\n\nTo load an exported model and run inference using Optimum Intel, use the corresponding `OVModelForXxx` class instead of `AutoModelForXxx`:\n\n```python\nfrom optimum.intel import OVModelForCausalLM\nfrom transformers import AutoTokenizer, pipeline\n\nmodel_id = \"ov_TinyLlama_v1_1\"\nmodel = OVModelForCausalLM.from_pretrained(model_id)\ntokenizer = AutoTokenizer.from_pretrained(model_id)\npipe = pipeline(\"text-generation\", model=model, tokenizer=tokenizer)\nresults = pipe(\"Hey, how are you doing today?\", max_new_tokens=100)\n```\n\nFor more details on Optimum Intel inference, refer to the [documentation](https:\u002F\u002Fhuggingface.co\u002Fdocs\u002Foptimum-intel\u002Fen\u002Fopenvino\u002Finference).\n\n**Note:** Alternatively, an exported model can also be inferred using [OpenVINO GenAI](https:\u002F\u002Fgithub.com\u002Fopenvinotoolkit\u002Fopenvino.genai) framework,\nthat provides optimized execution methods for highly performant Generative AI.\n\n## Quantization:\n\nPost-training static quantization can also be applied. 
Here is an example on how to apply static quantization on a Whisper model using the [LibriSpeech](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fopenslr\u002Flibrispeech_asr) dataset for the calibration step.\n\n```python\nfrom optimum.intel import OVModelForSpeechSeq2Seq, OVQuantizationConfig\n\nmodel_id = \"openai\u002Fwhisper-tiny\"\nq_config = OVQuantizationConfig(dtype=\"int8\", dataset=\"librispeech\", num_samples=50)\nq_model = OVModelForSpeechSeq2Seq.from_pretrained(model_id, quantization_config=q_config)\n\n# The directory where the quantized model will be saved\nsave_dir = \"nncf_results\"\nq_model.save_pretrained(save_dir)\n```\n\nYou can find more information in the [documentation](https:\u002F\u002Fhuggingface.co\u002Fdocs\u002Foptimum-intel\u002Fen\u002Fopenvino\u002Foptimization).\n\n## Running the examples\n\nCheck out the [`notebooks`](https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Ftree\u002Fmain\u002Fnotebooks) directory to see how 🤗 Optimum Intel can be used to optimize models and accelerate inference.\n\nDo not forget to install requirements for every example:\n\n```sh\ncd \u003Cexample-folder>\npip install -r requirements.txt\n```\n","\u003Cp align=\"center\">\n    \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fhuggingface_optimum-intel_readme_204c310c7e97.png\" \u002F>\n\u003C\u002Fp>\n\n# Optimum Intel\n\n🤗 [Optimum Intel](https:\u002F\u002Fhuggingface.co\u002Fdocs\u002Foptimum-intel\u002Fen\u002Findex) 是 🤗 Transformers 和 Diffusers 库与 [OpenVINO](https:\u002F\u002Fdocs.openvino.ai) 提供的各种工具和库之间的接口，旨在加速 Intel 架构上的端到端流水线 (pipelines)。\n\n[OpenVINO](https:\u002F\u002Fdocs.openvino.ai) 是一个开源工具包，支持 Intel CPU、GPU 和专用深度学习 (DL) 推理加速器的高性能推理能力（[参见](https:\u002F\u002Fdocs.openvino.ai\u002F2024\u002Fabout-openvino\u002Fcompatibility-and-support\u002Fsupported-devices.html) 完整的支持设备列表）。它附带了一组工具，可使用量化 (quantization)、剪枝 (pruning) 和知识蒸馏 (knowledge distillation) 等压缩技术来优化您的模型。Optimum Intel 提供了一个简单的接口来优化您的 
Transformers 和 Diffusers 模型，将它们转换为 OpenVINO 中间表示 (IR) 格式，并使用 OpenVINO 运行时 (Runtime) 运行推理。\n\n## Installation\n\n要安装带有相应所需依赖项的 🤗 Optimum Intel 最新发行版，您可以使用 `pip` 如下：\n\n```bash\npython -m pip install -U \"optimum-intel[openvino]\"\n```\n\nOptimum Intel 是一个快速迭代的项目，会定期添加对新模型的支持，因此您可能希望使用以下命令从源码安装：\n\n```bash\npython -m pip install \"optimum-intel\"@git+https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel.git\n```\n\n**弃用通知：** `openvino` 的 `extras`（可选依赖）（例如 `pip install optimum-intel[openvino,nncf]`），`nncf`, `neural-compressor`, `ipex` 已**弃用**，并将在**未来的版本中移除**。  \n\n## Export:\n\nTo export your model to [OpenVINO IR](https:\u002F\u002Fdocs.openvino.ai\u002F2025\u002Fdocumentation\u002Fopenvino-ir-format.html) format, use the optimum-cli tool.\nBelow is an example of exporting [TinyLlama\u002FTinyLlama_v1.1](https:\u002F\u002Fhuggingface.co\u002FTinyLlama\u002FTinyLlama_v1.1) model:\n\n```sh\noptimum-cli export openvino --model TinyLlama\u002FTinyLlama_v1.1 ov_TinyLlama_v1_1\n```\n\nAdditional information on exporting models is available in the [documentation](https:\u002F\u002Fhuggingface.co\u002Fdocs\u002Foptimum-intel\u002Fen\u002Fopenvino\u002Fexport).\n\n## Inference:\n\nTo load an exported model and run inference using Optimum Intel, use the corresponding `OVModelForXxx` class instead of `AutoModelForXxx`:\n\n```python\nfrom optimum.intel import OVModelForCausalLM\nfrom transformers import AutoTokenizer, pipeline\n\nmodel_id = \"ov_TinyLlama_v1_1\"\nmodel = OVModelForCausalLM.from_pretrained(model_id)\ntokenizer = AutoTokenizer.from_pretrained(model_id)\npipe = pipeline(\"text-generation\", model=model, tokenizer=tokenizer)\nresults = pipe(\"Hey, how are you doing today?\", max_new_tokens=100)\n```\n\nFor more details on Optimum Intel inference, refer to the [documentation](https:\u002F\u002Fhuggingface.co\u002Fdocs\u002Foptimum-intel\u002Fen\u002Fopenvino\u002Finference).\n\n**Note:** Alternatively, an exported model can also be inferred using [OpenVINO 
GenAI](https:\u002F\u002Fgithub.com\u002Fopenvinotoolkit\u002Fopenvino.genai) framework,\nthat provides optimized execution methods for highly performant Generative AI.\n\n## Quantization:\n\nPost-training static quantization can also be applied. Here is an example on how to apply static quantization on a Whisper model using the [LibriSpeech](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fopenslr\u002Flibrispeech_asr) dataset for the calibration step.\n\n```python\nfrom optimum.intel import OVModelForSpeechSeq2Seq, OVQuantizationConfig\n\nmodel_id = \"openai\u002Fwhisper-tiny\"\nq_config = OVQuantizationConfig(dtype=\"int8\", dataset=\"librispeech\", num_samples=50)\nq_model = OVModelForSpeechSeq2Seq.from_pretrained(model_id, quantization_config=q_config)\n\n# The directory where the quantized model will be saved\nsave_dir = \"nncf_results\"\nq_model.save_pretrained(save_dir)\n```\n\nYou can find more information in the [documentation](https:\u002F\u002Fhuggingface.co\u002Fdocs\u002Foptimum-intel\u002Fen\u002Fopenvino\u002Foptimization).\n\n## Running the examples\n\nCheck out the [`notebooks`](https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Ftree\u002Fmain\u002Fnotebooks) directory to see how 🤗 Optimum Intel can be used to optimize models and accelerate inference.\n\nDo not forget to install requirements for every example:\n\n```sh\ncd \u003Cexample-folder>\npip install -r requirements.txt\n```","# Optimum Intel 快速上手指南\n\n🤗 Optimum Intel 是连接 Hugging Face Transformers\u002FDiffusers 库与 [OpenVINO](https:\u002F\u002Fdocs.openvino.ai) 的接口，旨在利用 Intel CPU、GPU 及专用 DL 推理加速器加速端到端流水线。\n\n## 环境准备\n\n*   **操作系统**：支持 Linux, Windows, macOS。\n*   **硬件要求**：Intel 处理器（CPU）、Intel Arc 显卡或其他支持的 Intel DL 加速器。\n*   **软件依赖**：Python 3.8+，确保已安装 `pip`。\n\n## 安装步骤\n\n推荐使用 `pip` 安装最新稳定版及其所需依赖：\n\n```bash\npython -m pip install -U \"optimum-intel[openvino]\"\n```\n\n> **注意**：旧版的 `extras`（如 `nncf`, `neural-compressor`, `ipex`）已被弃用，未来版本将移除，请勿在 `pip install` 
命令中混合使用这些额外包。\n\n如需获取最新开发功能，可从源码安装：\n\n```bash\npython -m pip install \"optimum-intel\"@git+https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel.git\n```\n\n## 基本使用\n\n### 1. 模型导出\n\n首先将 Hugging Face 模型转换为 OpenVINO IR 格式。以下示例导出 TinyLlama 模型：\n\n```sh\noptimum-cli export openvino --model TinyLlama\u002FTinyLlama_v1.1 ov_TinyLlama_v1_1\n```\n\n### 2. 推理运行\n\n加载导出的模型并使用 `OVModelForXxx` 类进行推理（替代标准的 `AutoModelForXxx`）：\n\n```python\nfrom optimum.intel import OVModelForCausalLM\nfrom transformers import AutoTokenizer, pipeline\n\nmodel_id = \"ov_TinyLlama_v1_1\"\nmodel = OVModelForCausalLM.from_pretrained(model_id)\ntokenizer = AutoTokenizer.from_pretrained(model_id)\npipe = pipeline(\"text-generation\", model=model, tokenizer=tokenizer)\nresults = pipe(\"Hey, how are you doing today?\", max_new_tokens=100)\n```\n\n更多高级功能（如量化优化）请参考官方文档。","某电商客服团队需在自有 Intel 服务器上部署基于 TinyLlama 的对话机器人，以快速响应用户咨询。面对日益增长的用户流量，他们急需优化后端服务性能。\n\n### 没有 optimum-intel 时\n- 原生 PyTorch 推理在纯 CPU 环境下速度极慢，平均响应延迟高达数秒，严重影响用户体验。\n- 需要手动安装 OpenVINO 并编写复杂的转换脚本，代码迁移成本高且容易与环境冲突。\n- 模型未进行量化，显存占用过大，无法在低成本硬件上支撑高并发访问请求。\n- 调试困难，缺乏统一接口验证硬件性能，排查问题耗时耗力。\n\n### 使用 optimum-intel 后\n- 利用命令行一键导出 OpenVINO 格式，CPU 推理速度提升 3 倍以上，延迟降至毫秒级。\n- 保持 HuggingFace API 完全兼容，仅需将 AutoModel 替换为 OVModel 即可无缝运行。\n- 内置 INT8 量化配置，显著降低模型体积与内存占用，大幅提升服务器吞吐量。\n- 简化部署流程，无需额外编写底层优化代码，直接复用现有训练好的模型权重。\n\noptimum-intel 让开发者能在 Intel 硬件上零成本实现高性能 AI 推理落地。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fhuggingface_optimum-intel_204c310c.png","huggingface","Hugging Face","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Fhuggingface_90da21a4.png","The AI community building the future.",null,"https:\u002F\u002Fhuggingface.co\u002F","https:\u002F\u002Fgithub.com\u002Fhuggingface",[83,87,91],{"name":84,"color":85,"percentage":86},"Jupyter 
Notebook","#DA5B0B",61.9,{"name":88,"color":89,"percentage":90},"Python","#3572A5",38,{"name":92,"color":93,"percentage":94},"Makefile","#427819",0,560,213,"2026-04-03T07:28:13","Apache-2.0","未说明","支持 Intel CPU\u002FGPU，无需 NVIDIA GPU 及 CUDA",{"notes":102,"python":99,"dependencies":103},"专为 Intel 架构优化，需安装 OpenVINO 运行时；模型需导出为 OpenVINO IR 格式；部分依赖包（如 nncf, ipex）已废弃；示例需单独安装 requirements.txt",[67,104,105,106,107],"openvino","transformers","diffusers","torch",[26,13],[106,110,111,112,113,104,114,115,116,105],"distillation","inference","intel","onnx","optimization","pruning","quantization","2026-03-27T02:49:30.150509","2026-04-06T09:44:28.883638",[120,125,130,134,138,143],{"id":121,"question_zh":122,"answer_zh":123,"source_url":124},3553,"Qwen3 MoE 模型在 CPU 上的性能为何不如全精度模型？","早期版本可能存在性能问题，但 MOE 已通过 OpenVINO PR #32450 进行了优化。实测显示使用 INT4_ASYM 格式在相同机器上可获得约 12.7 tokens\u002Fs 的吞吐量，效果显著。建议参考相关优化后的构建脚本进行测试。","https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fissues\u002F1275",{"id":126,"question_zh":127,"answer_zh":128,"source_url":129},3554,"转换 VLM 模型时遇到 Flash Attention 2.0 不支持 torch.float32 的错误怎么办？","Flash Attention 2.0 仅支持 torch.float16 和 torch.bfloat16。请在加载模型时指定 `torch_dtype` 参数，例如：`model = AutoModel.from_pretrained(\"openai\u002Fwhisper-tiny\", attn_implementation=\"flash_attention_2\", torch_dtype=torch.float16)`，或使用自动混合精度上下文管理器。","https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fissues\u002F1073",{"id":131,"question_zh":132,"answer_zh":133,"source_url":129},3555,"导出模型时提示 BetterTransformer 类已弃用且无法应用，如何处理？","BetterTransformer 已在未来版本中移除。建议更改注意力实现方式（InternVL 代码会强制使用 flash_attn）。若需继续使用，可忽略此警告，或向 HuggingFace optimum 仓库提交 Issue 请求支持该模型类型。",{"id":135,"question_zh":136,"answer_zh":137,"source_url":129},3556,"为什么 optimum-cli 工具会出现依赖包冲突问题？","Optimum 采用灵活配置和延迟初始化设计，无法预测所有模型所需的额外包。因此不能将所有依赖内置。建议用户根据具体模型需求手动解决环境冲突，避免不必要的包安装导致 UX 问题。",{"id":139,"question_zh":140,"answer_zh":141,"source_url":142},3557,"llava-v1.6-mistral-7b-hf 模型导出时报错 AttributeError: 
'LlavaNextModel' object has no attribute 'layers' 如何解决？","这是 transformers 库版本兼容性问题。当前报错发生在 4.52.3 或 4.53.3 版本，建议在导出时将 transformers 降级至 4.51.3 版本即可正常工作。","https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fissues\u002F1398",{"id":144,"question_zh":145,"answer_zh":146,"source_url":147},3558,"使用 AWQ 模式转换 Qwen3-32B 大模型失败需要注意什么？","确保系统拥有足够内存（建议 128GB RAM）。使用命令：`optimum-cli export openvino --model Qwen\u002FQwen3-32B --trust-remote-code --quant-mode int8 --weight-format int4 --dataset wikitext2 --awq --scale-estimation --group-size 128 \u003C输出路径>`。同时注意检查 OVMS 脚本是否适用。","https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fissues\u002F1397",[149,154,159,164,169,174,179,184,189,194,199,204,209,214,219,224,229,234,239,244],{"id":150,"version":151,"summary_zh":152,"released_at":153},112856,"v1.27.0","### ️🏗 ​New architectures support \r\n\r\n* Add OpenVINO support for Zamba2 by @rkazants in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1354\r\n* Add OpenVINO support for BitNet by @rkazants in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1518\r\n* Add OpenVINO support for LFM2 by @popovaan in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1515\r\n* Add OpenVINO support for EXAONE 4.0 by @zhaohb in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1491\r\n* Add OpenVINO support for Granite-4.0 family by @rkazants in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1514\r\n\r\n\r\n### 🧹 Deprecations \r\n* Removed `nf4_fp8` quantization modes by @ljaljushkin in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1493\r\n* Add depreciation warnings for INC and IPEX  by @echarlaix in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1568\r\n\r\n\r\n### 🔧 Enhancements & Fixes\r\n* [OpenVINO] Transformers 4.56\u002F4.57 support by 
@IlyasMoutawwakil in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1541\r\n* [OpenVINO] Fix OpenVINO model inference not being affected by static quantization by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1461\r\n* [IPEX] Fix IPEX models for transformers v4.55 by @kaixuanliu in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1485\r\n* [OpenVINO]Add default config for Qwen3-30B-A3B by @ljaljushkin in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1506\r\n* [OpenVINO] Fix `preprocess_inputs` method for Gemma3 by @rkazants in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1507\r\n* [IPEX] Fix IPEX models `can_compile` method by @jiqing-feng in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1511\r\n* [OpenVINO] Fix TasksManager._TRANSFORMERS_TASKS_TO_MODEL_LOADERS by @echarlaix in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1501\r\n* [OpenVINO] Add default int4 config for inceptionai\u002Fjais-13b by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1519\r\n* [OpenVINO] Add `cache_position` input inside `prepare_inputs` method for Mamba by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1517\r\n* [OpenVINO] Refactor from_pretrained quantization by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1520\r\n* [OpenVINO] fix bug for attention_mask when model is not patched by @kaixuanliu in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1526\r\n* [OpenVINO] Add custom int4 config for SmolVLM2-256M-Video by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1532\r\n* [OpenVINO] Fix 
whisper inference for models exported without pkv  by @echarlaix in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1534\r\n* [OpenVINO] Optimize IR for Mamba models by @rkazants in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1538\r\n* [OpenVINO] NNCF 2.19 update by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1522\r\n* [OpenVINO] Update optimum-intel to OV 2025.4 release by @rkazants in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1544\r\n* [OpenVINO] Update InferRequestWrapper to collect samples depending on stateful models state by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1505\r\n* [OpenVINO] Add gsm8k as a dataset option for CausalLM quantization by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1547\r\n* [OpenVINO] Take into account that `pillow` may be not installed by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1546\r\n* [OpenVINO] Refactor CLI quantization by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1525\r\n* [OpenVINO] Streamline opevino-genai base pipelines testing by @IlyasMoutawwakil in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1545\r\n* [OpenVINO] Fix VLM mixed quantization by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1553\r\n* [OpenVINO] Remove using nncf.torch.patch_torch_operators by @AlexanderDokuchaev in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1555\r\n* [OpenVINO] Deprecate providing `trust_remote_code` to quantization configs by @nikita-savelyevv in 
https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1558\r\n* [OpenVINO] Add model-specific quantization ignored scopes by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1556\r\n* [OpenVINO] Save models immediately after quantization via CLI by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1559\r\n\r\n\r\n## New Contributors\r\n* @ml0mbardi made their first contribution in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1508\r\n* @almilosz made their first contribution in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1533\r\n* @zhaohb made their first contribution in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1491\r\n\r\n\r\n## What's Changed\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fcompare\u002Fv1.26.1...v1.27.0\r\n\r\nCompatible with transformers>=v4.45,\u003Cv5","2025-12-23T16:08:17",{"id":155,"version":156,"summary_zh":157,"released_at":158},112857,"v1.26.1","- Fix `can_compile` (#1511) by @jiqing-feng \r\n- Fix `preprocess_inputs` for Gemma3 (#1507) by @rkazants \r\n- Default config for Qwen3-30B-A3B (#1506) by @ljaljushkin \r\n- Fix `TasksManager._TRANSFORMERS_TASKS_TO_MODEL_LOADERS` (#1501) by @echarlaix \r\n- Fix some minor bugs after upgrading transformers to 4.55 for IPEX (#1485) by @kaixuanliu \r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fcompare\u002Fv1.26.0...v1.26.1\r\n\r\nCompatible with transformers>=4.45,\u003C4.56","2025-11-10T22:38:01",{"id":160,"version":161,"summary_zh":162,"released_at":163},112858,"v1.26.0","### ️🏗 ​New architectures support \r\n\r\n* Add OpenVINO support for gpt-oss by @echarlaix in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1428\r\n* Add OpenVINO 
support for MiniCPM-o by @rkazants in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1454\r\n\r\n### 🚀 New Features\r\n* Add feature-extraction and text-classification support for Qwen3 export by @openvino-dev-samples in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1415\r\n* VLM Vision Encoder full quantization by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1394\r\n* Add support transformers v4.54 v4.55 by @echarlaix in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1406\r\n* New pipelines by @IlyasMoutawwakil in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1462\r\n* IPEX transformers upgrade to 4.55 by @kaixuanliu in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1467\r\n* Adopt new NNCF mxfp4 quantization logic by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1465\r\n\r\n### 🔧 Enhancements & Fixes\r\n* Fix for diffusers 0.35 (and fix and speedup documentation build with uv) by @IlyasMoutawwakil in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1426\r\n* Fix regexp error with searching onnx model by @sbalandi in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1442\r\n* Fix high memory consumption during vision encoder NNCF quantization by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1440\r\n* Fix disk by distributing tests by @IlyasMoutawwakil in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1438\r\n* Fix CI rate limiting by @IlyasMoutawwakil in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1449\r\n* Add custom task inferring logic for mistralai\u002FMistral-7B-Instruct-v0.3 by @nikita-savelyevv in 
https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1413\r\n* Fix editable mode and cleanup\u002Fadapt for optimum v2 by @IlyasMoutawwakil in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1457\r\n* Faster cli startup \u002F helper by @IlyasMoutawwakil in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1460\r\n* Fix OpenVINO VLM in-place static quantization by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1464\r\n* Remove datasets as required dependency by @echarlaix in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1466\r\n* Fix OpenVINO data-free pipeline quantization by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1477\r\n* Skip concat_qkv creation for TP mode by @kaixuanliu in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1481\r\n* Introduce OPENVINO_TEST_DEVICE by @IlyasMoutawwakil in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1479\r\n* Add workaround logic for OpenVINO default int4 quantization of gpt-oss models by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1490\r\n* Apply chat_template for MiniCPM-o-2_6 by @Wovchena in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1484\r\n\r\n### 🧹 Deprecations \r\n* Deprecate Quantizer support of nn.Module by @echarlaix in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1421\r\n\r\n## New Contributors\r\n* @TharescdqTuA made their first contribution in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1436\r\n\r\n## What's Changed\r\n\r\n\u003Cdetails>\r\n* Fix for diffusers 0.35 (and fix and speedup documentation build with uv) by @IlyasMoutawwakil in 
https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1426\r\n* [OpenVINO]to support qwen3 embedding\u002Frerank by @openvino-dev-samples in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1415\r\n* Remove reference to docker based doc building in Makefile by @IlyasMoutawwakil in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1430\r\n* VLM Vision Encoder full quantization by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1394\r\n* Fix main documentation build\u002Fpush by @IlyasMoutawwakil in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1432\r\n* [Docs] Export whisper as stateless during quantization by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1435\r\n* chore: Fix Misspellings by @TharescdqTuA in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1436\r\n* Skip marian tests on OpenVINO 2025.3 by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1437\r\n* [OV] Remove local imports from quantization.py by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1433\r\n* Update references after OpenVINO 2025.3 release by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1441\r\n* Fix regexp error with searching onnx model by @sbalandi in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1442\r\n* [OV] Fix high memory consumption during vision encoder quantization by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1440\r\n* Fix disk by distributing tests by @IlyasMoutawwakil in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1438\r\n* chore: Fix 
Misspellings by @TharescdqTuA in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1443\r\n* Benchmarks up","2025-10-24T15:49:44",{"id":165,"version":166,"summary_zh":167,"released_at":168},112859,"v1.25.2","* Fix tokenizer conversion #1414 by @nikita-savelyevv  \r\n* Fix and test stateless encoder decoders #1423 by @IlyasMoutawwakil \r\n*  Use eager mask all the time #1424 by @IlyasMoutawwakil \r\n\r\n\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fcompare\u002Fv1.25.1...v1.25.2\r\n\r\nCompatible with transformers>=4.36,\u003C=4.53","2025-08-13T09:35:47",{"id":170,"version":171,"summary_zh":172,"released_at":173},112860,"v1.25.1","* Fix gemma3 for older transformers versions and llava next with mistral decoder #1408 by @IlyasMoutawwakil \r\n* Handle deprecation of forced_decoder_ids in transformers generation_config #1402 by @aleksandr-mokrov and @echarlaix \r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fcompare\u002Fv1.25.0...v1.25.1\r\n\r\nCompatible with transformers>=4.36,\u003C=4.53","2025-08-07T06:16:00",{"id":175,"version":176,"summary_zh":177,"released_at":178},112861,"v1.25.0","## :rocket: New Features & Enhancements\r\n\r\n* Add quantization for text2text-generation models by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1359\r\n* Add OpenVINO support for Mamba and Falcon-mamba by @rkazants in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1360\r\n* Add quantization for SegmentAnything model by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1384\r\n* Add support for cb4_f8e4m3 quantization mode by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1378\r\n* Add quantization statistics path argument by @nikita-savelyevv in 
https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1392\r\n* Add Transformers 4.53 support by @IlyasMoutawwakil in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1377\r\n\r\n\r\n## New Contributors\r\n* @mitruska made their first contribution in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1375\r\n* @ezelanza made their first contribution in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1385\r\n\r\n## What's Changed\r\n\r\n\u003Cdetails>\r\n\r\n* Add OpenVINO weight compression tests for llama4 by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1369\r\n* Fix IPEX model loading for sentence-transformers v5 by @echarlaix in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1370\r\n* Update OpenVINO documentation with newly supported tasks by @rkazants in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1371\r\n* [Docs] Optimization table on click feedback logic by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1372\r\n* Fix attr name typo in model_configs for llava-next compatibility with transformers 4.51.3 by @mitruska in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1375\r\n* [OV] Add quantization for text2text-generation models by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1359\r\n* free up disk for slow\u002Ffull ci by @IlyasMoutawwakil in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1376\r\n* Original model types by @IlyasMoutawwakil in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1329\r\n* Add openvino VLM quantization notebook by @echarlaix in 
https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1382\r\n* Remove notebook redundant quantization configs by @echarlaix in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1383\r\n* [OV] Prepare quantization dataset collection logic to transition to datasets v4.0 by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1381\r\n* [OpenVINO] Add support for Mamba and Falcon-mamba by @rkazants in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1360\r\n* Improve VLM quantization notebook structure by @ezelanza in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1385\r\n* [OV] Add quantization for SegmentAnything model by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1384\r\n* [OV] Update the reference number of int8 nodes for SANA model by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1386\r\n* Add notebook quantization config paragraph by @echarlaix in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1390\r\n* [TTS] Fix second generation for Speech T5 TTS by @rkazants in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1389\r\n* fix auto_model_class for OVModelForVisualCausalLM by @echarlaix in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1391\r\n* Add support for cb4_f8e4m3 quantization mode. 
by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1378\r\n* Add quantization statistics path argument by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1392\r\n* Transformers 4.53 support by @IlyasMoutawwakil in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1377\r\n\u003C\u002Fdetails>\r\n\r\n\r\nCompatible with transformers>=4.36,\u003C=4.53\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fcompare\u002Fv1.24.0...v1.25.0\r\n\r\n\r\n","2025-08-04T16:42:06",{"id":180,"version":181,"summary_zh":182,"released_at":183},112862,"v1.24.0","## :rocket: New Features & Enhancements\r\n\r\nOptimum 1.26 compatibility by @IlyasMoutawwakil in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1352\r\n\r\n#### OpenVINO\r\n* Introduce default full quantization configs for clip models by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1302\r\n* Introduce OVPipelineQuantizationConfig by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1310\r\n* Add int8 PTQ configs for some fill-mask models by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1331\r\n* Add transformers v4.52 compatibility by @eaidova in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1319\r\n* Add compression config for Qwen\u002FQwen2.5-Coder-3B-Instruct by @MaximProshin in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1355\r\n* [OV] Add support for data-free AWQ by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1349\r\n* Convert dataclasses to dicts in quantization config before saving by @nikita-savelyevv in 
https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1362\r\n* Remove reshaping for stateful decoders by @echarlaix in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1333\r\n\r\n#### IPEX\r\n* Add transformers v4.52 compatibility by @jiqing-feng in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1317\r\n\r\n## 🔧 Key Fixes & Optimizations\r\n* Raise if converted subcomponent not found by @echarlaix in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1303\r\n* Keep Hybrid Quantization only for diffusion pipelines by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1313\r\n* Fix whisper with auto language detection by @eaidova in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1314\r\n* Fix vision embeddings export for maira by @eaidova in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1320\r\n* Fix VLM calibration dataset collection by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1321\r\n* Resize large images during VLM calibration data collection by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1322\r\n* Resolve logger warnings by @emmanuel-ferdman in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1324\r\n* Fix progress bar during calibration dataset collection by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1323\r\n* Fix ESM models export and add it to supported by @eaidova in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1328\r\n* Allow skip trace check for sentence transformers by @eaidova in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1332\r\n* Fix int value 
recompile by @jiqing-feng in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1335\r\n* Fix TP tensor dimension mismatch for IPEX models by @kaixuanliu in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1340\r\n* Updated Qwen3-8b compression config by @MaximProshin in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1341\r\n\r\n\r\n## New Contributors\r\n* @kilavvy made their first contribution in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1345\r\n* @maximevtush made their first contribution in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1347\r\n* @leopardracer made their first contribution in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1351\r\n\r\n## What's Changed\r\n\r\n\u003Cdetails>\r\n\r\n* Dev version by @echarlaix in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1309\r\n* Update number of int8 nodes for Segment Anything model by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1311\r\n* [OV][Docs] Keep Hybrid Quantization only for diffusion pipelines by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1313\r\n* raise if converted subcomponent not found by @echarlaix in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1303\r\n* [OV] Introduce default full quantization configs for clip models by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1302\r\n* fix whisper with auto language detection by @eaidova in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1314\r\n* fix vision embeddings export for maira by @eaidova in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1320\r\n* [OV] 
Fix VLM calibration dataset collection by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1321\r\n* [OV] Resize large images during VLM calibration data collection by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1322\r\n* Resolve logger warnings by @emmanuel-ferdman in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1324\r\n* [OV] Fix progress bar during calibration dataset collection by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1323\r\n* Limit INC version to fix CI. by @changwangss in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1325\r\n* [OV] Update AWQ test to pass on NNCF develop by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1326\r\n* Fix ESM models export and add it to supported by @eaidova in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-int","2025-07-01T12:47:32",{"id":185,"version":186,"summary_zh":187,"released_at":188},112863,"v1.23.1","**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fcompare\u002Fv1.23.0...v1.23.1","2025-06-13T11:30:41",{"id":190,"version":191,"summary_zh":192,"released_at":193},112864,"v1.23.0","## :rocket: New Features & Enhancements\r\n\r\n\r\n#### OpenVINO\r\n\r\n* Add MAIRA-2 support by @eaidova in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1145\r\n* Add support for `nf4_f8e4m3` quantization mode by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1148\r\n* Add DeepSeek support by @eaidova in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1155\r\n* Add Qwen2.5-VL support by @eaidova in 
https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1163\r\n* Add LLaVA-Next-Video support by @eaidova in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1183\r\n* Add GOT-OCR2 support by @eaidova in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1202\r\n* Add Gemma 3 support by @eaidova in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1198\r\n* Add SmolVLM and Idefics3 support by @eaidova in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1210\r\n* Add Phi-3-MoE support by @eaidova in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1215\r\n* Add OVSamModel for inference by @eaidova in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1229\r\n* Add Phi-4-multimodal support by @eaidova in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1201\r\n* Add Llama 4 support by @eaidova in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1226\r\n* Add zero-shot-image-classification support by @eaidova in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1273\r\n* Add PTQ support for OVModelForZeroShotImageClassification by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1283\r\n* Add diffusers full int8 quantization support by @l-bat in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1193\r\n* Add SANA-Sprint support by @eaidova in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1245\r\n* Add PTQ support for OVModelForMaskedLM by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1268\r\n* Add LTX-Video support by @eaidova in 
https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1264\r\n* Add Qwen3 and Qwen3-MOE support by @openvino-dev-samples in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1214\r\n* Add SpeechT5 text-to-speech support by OpenVINO by @rkazants in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1230\r\n* Add GLM4 support by @openvino-dev-samples in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1249\r\n* PTQ support for OVModelForFeatureExtraction and OVSentenceTransformer by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1257\r\n* Introduce OVCalibrationDatasetBuilder by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1232\r\n\r\n#### IPEX\r\n\r\n* Add Qwen2 support by @jiqing-feng in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1107\r\n* Enable quantization model support by @jiqing-feng in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1074\r\n* Add support for flash decoding on xpu by @kaixuanliu in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1118\r\n* Add Phi support by @jiqing-feng in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1175\r\n* Enable compilation for patched model with paged attention by @jiqing-feng in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1253\r\n* Add Mistral modeling optimization support for ipex by @kaixuanliu in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1269\r\n\r\n#### Transformers compatibility\r\n* Add compatibility with transformers v4.49 by @echarlaix in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1172\r\n* Add compatibility with transformers v4.50 and v4.51 by 
@IlyasMoutawwakil in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1242\r\n\r\n## 🔧 Key Fixes & Optimizations\r\n\r\n* Fix misplaced configs saving by @eaidova in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1159\r\n* Check if nncf is installed before running quantization from optimum-cli by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1154\r\n* Fix automatic-speech-recognition-with-past quantization from CLI by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1180\r\n* Propagate OV QuantizationConfig kwargs to nncf calls by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1179\r\n* Fix model field names for OVBaseModelForSeq2SeqLM by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1184\r\n* Align loading dtype logic for diffusers with other models by @eaidova in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1187\r\n* Fix generation for statically reshaped diffusion pipeline by @eaidova in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1199\r\n* Add `ov_submodels` property to `OVBaseModel` by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1177\r\n* Fix flux and sana export with diffusers 0.33+ by @eaidova in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1236\r\n* Update pkv precision at save_pretrained call by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1235\r\n* Remove ONNX fallback when converting to OpenVINO by @eaidova in 
https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1272\r","2025-05-15T13:31:16",{"id":195,"version":196,"summary_zh":197,"released_at":198},112865,"v1.22.0","\r\n## OpenVINO\r\n\r\n* Add quantization of Whisper pipeline by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1040\r\n* Add Qwen2-VL support by @eaidova in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1042\r\n* Add AWQ models support by @mvafin in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1049\r\n* Update default OV configuration by @KodiaqQ in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1057\r\n* Introduce `--quant-mode` cli argument enabling full quantization via optimum-cli by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1061\r\n* Merge decoder and decoder with past to stateful for seq2seq models by @eaidova in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1078\r\n* Add transformers 4.47 support by @IlyasMoutawwakil in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1088\r\n* Add GLM-Edge models support by @eaidova in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1089\r\n* Add Granite and GraniteMoe models support by @eaidova in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1099\r\n* Add fp8 implementation by @KodiaqQ in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1100\r\n* Add Flux Fill inpainting pipeline support by @eaidova in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1095\r\n* Add Sana support by @eaidova in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1106\r\n* Add v4.48 transformers support by @IlyasMoutawwakil in 
https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1136\r\n\r\n\r\n\r\n\r\n\r\n\r\n\r\n## IPEX\r\n\r\n* Add support to sentence transformers models  by @echarlaix in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1034\r\n\r\n```python\r\nfrom optimum.intel import IPEXSentenceTransformer\r\n\r\nmodel = IPEXSentenceTransformer.from_pretrained(model_id)\r\n```\r\n* Add support to text-to-text task by @jiqing-feng in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1054\r\n\r\n```python\r\nfrom optimum.intel import IPEXModelForSeq2SeqLM\r\n\r\nmodel = IPEXModelForSeq2SeqLM.from_pretrained(model_id)\r\n```\r\n* Enable Flash Attention by @jiqing-feng in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1065\r\n\r\n\r\n\r\nCompatible with transformers>=4.36,\u003C=4.48\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fcompare\u002Fv1.21.0...v1.22.0","2025-02-06T23:49:43",{"id":200,"version":201,"summary_zh":202,"released_at":203},112866,"v1.21.0","## What's Changed\r\n\r\n### OpenVINO\r\n\r\n#### Diffusers\r\n* SD3 and Flux pipelines support by @eaidova in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F916\r\n\r\n#### VLMs Modeling\r\n* MiniCPMv support by @eaidova in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F972\r\n* NanoLlava support by @eaidova in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F969\r\n* Phi3v support by @eaidova in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F977\r\n\r\n### NNCF\r\n* Quantization support for CausalVisualLMs by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F951\r\n* NF4 data type support for OV weight compression by @l-bat in 
https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F988\r\n* NNCF 2.14 new features support by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F997\r\n\r\n### IPEX\r\n* Unified XPU\u002FCPU modeling with custom PagedAttention cache for LLMs by @sywangyi in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1009\r\n\r\n### INC\r\n* Layer-wise quantization support by @changwangss in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1040\r\n\r\n\r\n## New Contributors\r\n* @emmanuel-ferdman made their first contribution in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F974\r\n* @mvafin made their first contribution in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F1033\r\n\r\n\r\nCompatible with transformers>=4.36,\u003C=4.46\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fcompare\u002Fv1.20.0...v1.21.0","2024-12-06T12:53:10",{"id":205,"version":206,"summary_zh":207,"released_at":208},112867,"v1.20.1","* Fix lora unscaling in diffusion pipelines by @eaidova in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F937\r\n* Fix compatibility with diffusers \u003C 0.25.0 by @eaidova in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F952\r\n* Allow to use SDPA in clip models by @eaidova in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F941\r\n* Updated OVPipelinePart to have separate ov_config by @e-ddykim in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F957\r\n* Symbol use in optimum: fix misprint by @jane-intel in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F948\r\n* Fix temporary directory saving by @eaidova in 
https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F959\r\n* Disable warning about tokenizers version for ov tokenizers >= 2024.5 by @eaidova in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F962\r\n* Restore original model_index.json after save_pretrained call by @eaidova in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F961\r\n* Add v4.46 transformers support by @echarlaix in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F960\r\n","2024-10-30T14:08:11",{"id":210,"version":211,"summary_zh":212,"released_at":213},112868,"v1.20.0","###  OpenVINO\r\n\r\n#### Multi-modal models support\r\n\r\nAdding `OVModelForVisionCausalLM` by @eaidova  in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F883 \r\n\r\n#### OpenCLIP models support\r\n\r\nAdding OpenCLIP models support by @sbalandi  in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F857 \r\n\r\n```python\r\nfrom optimum.intel import OVModelCLIPVisual, OVModelCLIPText\r\n\r\nvisual_model = OVModelCLIPVisual.from_pretrained(model_name_or_path)\r\ntext_model  = OVModelCLIPText.from_pretrained(model_name_or_path)\r\nimage = processor(image).unsqueeze(0)\r\ntext = tokenizer([\"a diagram\", \"a dog\", \"a cat\"])\r\nimage_features = visual_model(image).image_features\r\ntext_features = text_model(text).text_features\r\n```\r\n\r\n#### Diffusion pipeline\r\n\r\nAdding `OVDiffusionPipeline` to simplify diffusers model loading by @IlyasMoutawwakil in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F889 \r\n\r\n```diff\r\n  model_id = \"stabilityai\u002Fstable-diffusion-xl-base-1.0\"\r\n- pipeline = OVStableDiffusionXLPipeline.from_pretrained(model_id)\r\n+ pipeline = OVDiffusionPipeline.from_pretrained(model_id)\r\n  image = pipeline(\"sailing ship in storm by Leonardo da Vinci\").images[0]\r\n```\r\n#### NNCF 
GPTQ support\r\n\r\nGPTQ support by @nikita-savelyevv in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F912\r\n\r\n### Transformers v4.45\r\n\r\nTransformers v4.45 support by @echarlaix  in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F902\r\n\r\n### Subfolder\r\n\r\nRemove the restriction for the model's config to be in the model's subfolder by @tomaarsen in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F933\r\n\r\n## New Contributors\r\n* @jane-intel made their first contribution in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F696\r\n* @andreyanufr made their first contribution in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F903\r\n* @MaximProshin made their first contribution in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F905\r\n* @tomaarsen made their first contribution in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F931\r\n\r\nCompatible with transformers>=4.36,\u003C=4.45\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fcompare\u002Fv1.19.0...v1.20.0\r\n","2024-10-10T17:01:26",{"id":215,"version":216,"summary_zh":217,"released_at":218},112869,"v1.19.0","*  Support SentenceTransformers models inference  by @aleksandr-mokrov in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F865\r\n\r\n\r\n\r\n```python\r\nfrom optimum.intel import OVSentenceTransformer\r\n\r\nmodel_id = \"sentence-transformers\u002Fall-mpnet-base-v2\"\r\nmodel = OVSentenceTransformer.from_pretrained(model_id, export=True)\r\nsentences = [\"This is an example sentence\", \"Each sentence is converted\"]\r\nembeddings = model.encode(sentences)\r\n```\r\n\r\n\r\n* Infer if the model needs to be exported or not by @echarlaix in 
https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F825\r\n\r\n```diff\r\n  from optimum.intel import OVModelForCausalLM\r\n\r\n- model = OVModelForCausalLM.from_pretrained(\"gpt2\", export=True)\r\n+ model = OVModelForCausalLM.from_pretrained(\"gpt2\")\r\n```\r\n\r\nCompatible with transformers>=4.36,\u003C=4.44\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fcompare\u002Fv1.18.0...v1.19.0","2024-09-10T21:57:59",{"id":220,"version":221,"summary_zh":222,"released_at":223},112870,"v1.18.3","**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fcompare\u002Fv1.18.2...v1.18.3","2024-08-19T09:16:16",{"id":225,"version":226,"summary_zh":227,"released_at":228},112871,"v1.18.2","- Fix model patching for internlm2 by @eaidova in #814 \r\n- Fix loading models from cache by @eaidova in #820 \r\n- Disable tpp for un-verified models by @jiqing-feng in #822 \r\n- Update default NNCF configurations by @KodiaqQ in #824 \r\n- Fix update causal mask for transformers 4.42 by @eaidova in #852 \r\n- Fix bf16 inference accuracy for mistral, phi3, dbrx by @eaidova in #833 \r\n- Revert rotary embedding patching for recovering gpu accuracy by @eaidova in #855 \r\n- Support transformers 4.43 by @IlyasMoutawwakil in #856\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fcompare\u002Fv1.18.1...v1.18.2","2024-08-06T16:13:10",{"id":230,"version":231,"summary_zh":232,"released_at":233},112872,"v1.18.1","\r\n* OV configurations alignment by @KodiaqQ in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F787\r\n* Enable transformers v4.42.0 by @echarlaix in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F793\r\n* Deprecate onnx\u002Fort model export and quantization by @IlyasMoutawwakil in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F795\r\n* 
Free memory after model export  by @eaidova  in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F800\r\n* Update config import path for neural-compressor v2.6 by @changwangss  in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F801\r\n* Pin library name to transformers for feature extraction by @IlyasMoutawwakil in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F804\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fcompare\u002Fv1.18.0...v1.18.1\r\n","2024-07-09T16:13:27",{"id":235,"version":236,"summary_zh":237,"released_at":238},112873,"v1.18.0","## OpenVINO\r\n\r\n\r\n* Enable Arctic, Jais export by @eaidova in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F726\r\n* Enable GLM-4 export by @eaidova  in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F776\r\n* Move data-driven quantization after model export for text-generation models by @nikita-savelyevv in #721 \r\n* Create default token_type_ids when needed for inference by @echarlaix #757 \r\n* Resolve default int4 config for local models by @eaidova in #760 \r\n* Update to NNCF 2.11 by @nikita-savelyevv in #763 \r\n* Fix quantization config by @echarlaix in #773 \r\n* Expose trust remote code argument when generating calibration dataset for datasets >= v2.20.0 by @echarlaix  #767 \r\n* Add pipelines by @echarlaix in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F740\r\n\r\n```python\r\nfrom optimum.intel.pipelines import pipeline\r\n\r\n# Load openvino model\r\nov_pipe = pipeline(\"text-generation\", \"helenai\u002Fgpt2-ov\", accelerator=\"openvino\")\r\n# Load pytorch model and convert it to openvino before inference\r\npipe = pipeline(\"text-generation\", \"gpt2\", accelerator=\"openvino\")\r\n```\r\n\r\n## IPEX\r\n\r\n* Enable IPEX patching for llama for >= v2.3 by 
@jiqing-feng in #725 \r\n* Refactor llama modeling for IPEX patching by @faaany in #728 \r\n* Refactor model loading by @jiqing-feng in #752 ","2024-06-26T23:21:57",{"id":240,"version":241,"summary_zh":242,"released_at":243},112874,"v1.17.2","* Fix compatibility with transformers \u003C v4.39.0 by @echarlaix in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F754","2024-06-07T19:14:18",{"id":245,"version":246,"summary_zh":247,"released_at":248},112875,"v1.17.1","\r\n\r\n* Add setuptools to fix issue with Python 3.12 by @helena-intel in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F747\r\n* Disable warnings by @helena-intel in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F748\r\n* Fix Windows TemporaryDirectory issue  by @helena-intel in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F749\r\n* Fix generation config loading and saving by @eaidova  in https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Foptimum-intel\u002Fpull\u002F750","2024-06-06T15:40:10"]
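Nearly every release above pins a supported transformers window (e.g. `Compatible with transformers>=4.36,<=4.53` for the v1.25.x line). A minimal sketch of checking an installed version against such a window before loading a model; the helper names below are illustrative only and not part of optimum-intel:

```python
# Hypothetical helpers (illustrative, not part of optimum-intel): check an
# installed transformers version against a release's supported window,
# e.g. the ">=4.36,<=4.53" range stated for v1.25.x.

def parse_version(version: str) -> tuple:
    """'4.53.1' -> (4, 53, 1); non-numeric parts like 'dev0' are ignored."""
    return tuple(int(part) for part in version.split(".") if part.isdigit())

def in_supported_range(installed: str, low: str, high: str) -> bool:
    """True when low <= installed <= high, comparing only as many
    components as the upper bound specifies (so 4.53.1 satisfies <=4.53)."""
    ver = parse_version(installed)
    lo, hi = parse_version(low), parse_version(high)
    return lo <= ver and ver[: len(hi)] <= hi

print(in_supported_range("4.45.2", "4.36", "4.53"))  # True: inside the window
print(in_supported_range("4.54.0", "4.36", "4.53"))  # False: above the upper bound
```

Truncating the installed version to the upper bound's precision keeps patch releases such as 4.53.1 inside a `<=4.53` window, which matches how these changelog ranges read.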