[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-ARM-software--armnn":3,"tool-ARM-software--armnn":61},[4,18,26,36,44,53],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":17},4358,"openclaw","openclaw\u002Fopenclaw","OpenClaw 是一款专为个人打造的本地化 AI 助手，旨在让你在自己的设备上拥有完全可控的智能伙伴。它打破了传统 AI 助手局限于特定网页或应用的束缚，能够直接接入你日常使用的各类通讯渠道，包括微信、WhatsApp、Telegram、Discord、iMessage 等数十种平台。无论你在哪个聊天软件中发送消息，OpenClaw 都能即时响应，甚至支持在 macOS、iOS 和 Android 设备上进行语音交互，并提供实时的画布渲染功能供你操控。\n\n这款工具主要解决了用户对数据隐私、响应速度以及“始终在线”体验的需求。通过将 AI 部署在本地，用户无需依赖云端服务即可享受快速、私密的智能辅助，真正实现了“你的数据，你做主”。其独特的技术亮点在于强大的网关架构，将控制平面与核心助手分离，确保跨平台通信的流畅性与扩展性。\n\nOpenClaw 非常适合希望构建个性化工作流的技术爱好者、开发者，以及注重隐私保护且不愿被单一生态绑定的普通用户。只要具备基础的终端操作能力（支持 macOS、Linux 及 Windows WSL2），即可通过简单的命令行引导完成部署。如果你渴望拥有一个懂你",349277,3,"2026-04-06T06:32:30",[13,14,15,16],"Agent","开发框架","图像","数据工具","ready",{"id":19,"name":20,"github_repo":21,"description_zh":22,"stars":23,"difficulty_score":10,"last_commit_at":24,"category_tags":25,"status":17},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,"2026-04-05T11:01:52",[14,15,13],{"id":27,"name":28,"github_repo":29,"description_zh":30,"stars":31,"difficulty_score":32,"last_commit_at":33,"category_tags":34,"status":17},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 
代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",142651,2,"2026-04-06T23:34:12",[14,13,35],"语言模型",{"id":37,"name":38,"github_repo":39,"description_zh":40,"stars":41,"difficulty_score":32,"last_commit_at":42,"category_tags":43,"status":17},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",107888,"2026-04-06T11:32:50",[14,15,13],{"id":45,"name":46,"github_repo":47,"description_zh":48,"stars":49,"difficulty_score":32,"last_commit_at":50,"category_tags":51,"status":17},4721,"markitdown","microsoft\u002Fmarkitdown","MarkItDown 是一款由微软 AutoGen 团队打造的轻量级 Python 工具，专为将各类文件高效转换为 Markdown 格式而设计。它支持 PDF、Word、Excel、PPT、图片（含 OCR）、音频（含语音转录）、HTML 乃至 YouTube 链接等多种格式的解析，能够精准提取文档中的标题、列表、表格和链接等关键结构信息。\n\n在人工智能应用日益普及的今天，大语言模型（LLM）虽擅长处理文本，却难以直接读取复杂的二进制办公文档。MarkItDown 恰好解决了这一痛点，它将非结构化或半结构化的文件转化为模型“原生理解”且 Token 效率极高的 Markdown 格式，成为连接本地文件与 AI 分析 pipeline 的理想桥梁。此外，它还提供了 MCP（模型上下文协议）服务器，可无缝集成到 Claude Desktop 等 LLM 应用中。\n\n这款工具特别适合开发者、数据科学家及 AI 研究人员使用，尤其是那些需要构建文档检索增强生成（RAG）系统、进行批量文本分析或希望让 AI 
助手直接“阅读”本地文件的用户。虽然生成的内容也具备一定可读性，但其核心优势在于为机器",93400,"2026-04-06T19:52:38",[52,14],"插件",{"id":54,"name":55,"github_repo":56,"description_zh":57,"stars":58,"difficulty_score":10,"last_commit_at":59,"category_tags":60,"status":17},4487,"LLMs-from-scratch","rasbt\u002FLLMs-from-scratch","LLMs-from-scratch 是一个基于 PyTorch 的开源教育项目，旨在引导用户从零开始一步步构建一个类似 ChatGPT 的大型语言模型（LLM）。它不仅是同名技术著作的官方代码库，更提供了一套完整的实践方案，涵盖模型开发、预训练及微调的全过程。\n\n该项目主要解决了大模型领域“黑盒化”的学习痛点。许多开发者虽能调用现成模型，却难以深入理解其内部架构与训练机制。通过亲手编写每一行核心代码，用户能够透彻掌握 Transformer 架构、注意力机制等关键原理，从而真正理解大模型是如何“思考”的。此外，项目还包含了加载大型预训练权重进行微调的代码，帮助用户将理论知识延伸至实际应用。\n\nLLMs-from-scratch 特别适合希望深入底层原理的 AI 开发者、研究人员以及计算机专业的学生。对于不满足于仅使用 API，而是渴望探究模型构建细节的技术人员而言，这是极佳的学习资源。其独特的技术亮点在于“循序渐进”的教学设计：将复杂的系统工程拆解为清晰的步骤，配合详细的图表与示例，让构建一个虽小但功能完备的大模型变得触手可及。无论你是想夯实理论基础，还是为未来研发更大规模的模型做准备",90106,"2026-04-06T11:19:32",[35,15,13,14],{"id":62,"github_repo":63,"name":64,"description_en":65,"description_zh":66,"ai_summary_zh":67,"readme_en":68,"readme_zh":69,"quickstart_zh":70,"use_case_zh":71,"hero_image_url":72,"owner_login":73,"owner_name":74,"owner_avatar_url":75,"owner_bio":76,"owner_company":77,"owner_location":77,"owner_email":77,"owner_twitter":77,"owner_website":78,"owner_url":79,"languages":80,"stars":108,"forks":109,"last_commit_at":110,"license":111,"difficulty_score":112,"env_os":113,"env_gpu":114,"env_ram":115,"env_deps":116,"category_tags":125,"github_topics":126,"view_count":32,"oss_zip_url":77,"oss_zip_packed_at":77,"status":17,"created_at":130,"updated_at":131,"faqs":132,"releases":161},4754,"ARM-software\u002Farmnn","armnn","Arm NN ML Software.","Arm NN 是一款专为 Android 和 Linux 系统打造的高性能机器学习推理引擎，旨在加速 Arm Cortex-A CPU 和 Mali GPU 上的模型运行。它充当了主流神经网络框架与低功耗 Arm 硬件之间的桥梁，帮助开发者将训练好的模型高效部署到移动端和嵌入式设备上，解决了通用机器学习库在 Arm 架构上执行效率不足的痛点。\n\n需要注意的是，Arm NN 目前已进入维护末期（Legacy 项目），官方不再提供功能更新或安全补丁，因此建议仅在受信任的环境中使用。对于追求极致性能的开发者而言，Arm NN 依然具有独特价值：它基于 Arm Compute Library 进行了深度的架构特定优化（如支持 SVE2 指令集），并支持通过 TF Lite Delegate 让 Python 开发者轻松调用加速能力。此外，它还提供了针对 Ethos-N NPU 的驱动支持。\n\n这款工具主要适合需要在 Arm 
平台上进行模型部署优化的软件工程师、嵌入式开发人员以及算法研究人员。如果您正在寻找一种能够充分利用 Arm 硬件算力、且对 C++17 环境友好的开源方案，Arm NN 仍","Arm NN 是一款专为 Android 和 Linux 系统打造的高性能机器学习推理引擎，旨在加速 Arm Cortex-A CPU 和 Mali GPU 上的模型运行。它充当了主流神经网络框架与低功耗 Arm 硬件之间的桥梁，帮助开发者将训练好的模型高效部署到移动端和嵌入式设备上，解决了通用机器学习库在 Arm 架构上执行效率不足的痛点。\n\n需要注意的是，Arm NN 目前已进入维护末期（Legacy 项目），官方不再提供功能更新或安全补丁，因此建议仅在受信任的环境中使用。对于追求极致性能的开发者而言，Arm NN 依然具有独特价值：它基于 Arm Compute Library 进行了深度的架构特定优化（如支持 SVE2 指令集），并支持通过 TF Lite Delegate 让 Python 开发者轻松调用加速能力。此外，它还提供了针对 Ethos-N NPU 的驱动支持。\n\n这款工具主要适合需要在 Arm 平台上进行模型部署优化的软件工程师、嵌入式开发人员以及算法研究人员。如果您正在寻找一种能够充分利用 Arm 硬件算力、且对 C++17 环境友好的开源方案，Arm NN 仍是一个值得参考的技术选择，尤其是在构建定制化推理流程时，其灵活的编译选项能帮助您精确控制所需组件。","\u003Cbr>\n\u003Cdiv align=\"center\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FARM-software_armnn_readme_9ccf2c60e9d3.png\" alt=\"Arm NN Logo\" width=\"300\"\u002F>\n\u003C\u002Fdiv>\n\n* [Quick Start Guides](#quick-start-guides)\n* [Pre-Built Binaries](#pre-built-binaries)\n* [Software Overview](#software-overview)\n* [Get Involved](#get-involved)\n* [Contributions](#contributions)\n* [Disclaimer](#disclaimer)\n* [License](#license)\n* [Inclusive language commitment](#inclusive-language-commitment)\n* [Third-party](#third-party)\n* [Build Flags](#build-flags)\n\n# Arm NN\n\n**Arm NN**  is now a legacy project and is no longer actively maintained by Arm. As a result, no further updates,\nincluding security patches and functional improvements, should be expected, and users should be aware that the project\nmay not be secure against untrusted inputs or hostile environments.\nAccordingly, it is strongly recommended that **Arm NN** only be used in trusted contexts.\n\n**Arm NN** is the **most performant** machine learning (ML) inference engine for Android and Linux, accelerating ML\non **Arm Cortex-A CPUs and Arm Mali GPUs**. 
This ML inference engine is an open source SDK which bridges the gap\nbetween existing neural network frameworks and power-efficient Arm IP.\n\nArm NN outperforms generic ML libraries due to **Arm architecture-specific optimizations** (e.g. SVE2) by utilizing\n**[Arm Compute Library (ACL)](https:\u002F\u002Fgithub.com\u002FARM-software\u002FComputeLibrary\u002F)**. To target Arm Ethos-N NPUs, Arm NN\nutilizes the [Ethos-N NPU Driver](https:\u002F\u002Fgithub.com\u002FARM-software\u002Fethos-n-driver-stack). For Arm Cortex-M acceleration,\nplease see [CMSIS-NN](https:\u002F\u002Fgithub.com\u002FARM-software\u002FCMSIS_5).\n\nArm NN is written using portable **C++17** and built using [CMake](https:\u002F\u002Fcmake.org\u002F) - enabling builds for a wide\nvariety of target platforms, from a wide variety of host environments. **Python** developers can interface with Arm NN\nthrough the use of our **Arm NN TF Lite Delegate**.\n\n\n## Quick Start Guides\n**The Arm NN TF Lite Delegate provides the widest ML operator support in Arm NN** and is an easy way to accelerate\nyour ML model. To start using the TF Lite Delegate, first download the **[Pre-Built Binaries](#pre-built-binaries)** for\nthe latest release of Arm NN. Using a Python interpreter, you can load your TF Lite model into the Arm NN TF Lite\nDelegate and run accelerated inference. Please see this\n**[Quick Start Guide](delegate\u002FDelegateQuickStartGuide.md)** on GitHub or this more comprehensive\n**[Arm Developer Guide](https:\u002F\u002Fdeveloper.arm.com\u002Fdocumentation\u002F102561\u002Flatest\u002F)** for information on how to accelerate\nyour TF Lite model using the Arm NN TF Lite Delegate.\n\nWe provide Debian packages for Arm NN, which are a quick way to start using Arm NN and the TF Lite Parser\n(albeit with less ML operator support than the TF Lite Delegate). 
There is an installation guide available\n[here](InstallationViaAptRepository.md) which provides instructions on how to install the Arm NN Core and the TF Lite\nParser for Ubuntu 20.04.\n\nTo build Arm NN from scratch, we provide the **[Arm NN Build Tool](build-tool\u002FREADME.md)**. This tool consists of\n**parameterized bash scripts** accompanied by a **Dockerfile** for building Arm NN and its dependencies, including\n**[Arm Compute Library (ACL)](https:\u002F\u002Fgithub.com\u002FARM-software\u002FComputeLibrary\u002F)**. This tool replaces\u002Fsupersedes the\nmajority of the existing Arm NN build guides as a user-friendly way to build Arm NN. The main benefit of building\nArm NN from scratch is the ability to **exactly choose which components to build, targeted for your ML project**.\u003Cbr>\n\n\n## Pre-Built Binaries\n\n| Operating System                              | Architecture-specific Release Archive (Download)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       
|\n|-----------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|\n| Android 11 \"R\u002FRed Velvet Cake\" (API level 30) | [![](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fdownload-arm64--v82a-orange)](https:\u002F\u002Fgithub.com\u002FARM-software\u002Farmnn\u002Freleases\u002Fdownload\u002Fv26.01\u002FArmNN-android-30-arm64-v8.2-a.tar.gz) [![](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fdownload-arm64--v8a-orange)](https:\u002F\u002Fgithub.com\u002FARM-software\u002Farmnn\u002Freleases\u002Fdownload\u002Fv26.01\u002FArmNN-android-30-arm64-v8a.tar.gz)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  |\n| Android 12 \"S\u002FSnow Cone\" (API level 31)       | 
[![](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fdownload-arm64--v82a-yellow)](https:\u002F\u002Fgithub.com\u002FARM-software\u002Farmnn\u002Freleases\u002Fdownload\u002Fv26.01\u002FArmNN-android-31-arm64-v8.2-a.tar.gz) [![](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fdownload-arm64--v8a-yellow)](https:\u002F\u002Fgithub.com\u002FARM-software\u002Farmnn\u002Freleases\u002Fdownload\u002Fv26.01\u002FArmNN-android-31-arm64-v8a.tar.gz)  [![](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fdownload-arm64--v86a-yellow)](https:\u002F\u002Fgithub.com\u002FARM-software\u002Farmnn\u002Freleases\u002Fdownload\u002Fv26.01\u002FArmNN-android-31-arm64-v8.6-a.tar.gz)  [![](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fdownload-arm64--v86asve-yellow)](https:\u002F\u002Fgithub.com\u002FARM-software\u002Farmnn\u002Freleases\u002Fdownload\u002Fv26.01\u002FArmNN-android-31-arm64-v8.6-a-sve.tar.gz) [![](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fdownload-arm64--v86asve2-yellow)](https:\u002F\u002Fgithub.com\u002FARM-software\u002Farmnn\u002Freleases\u002Fdownload\u002Fv26.01\u002FArmNN-android-31-arm64-v8.6-a-sve2.tar.gz) |\n| Android 13 \"T\u002FTiramisu\" (API level 33)        | [![](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fdownload-arm64--v82a-purple)](https:\u002F\u002Fgithub.com\u002FARM-software\u002Farmnn\u002Freleases\u002Fdownload\u002Fv26.01\u002FArmNN-android-33-arm64-v8.2-a.tar.gz) [![](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fdownload-arm64--v86a-purple)](https:\u002F\u002Fgithub.com\u002FARM-software\u002Farmnn\u002Freleases\u002Fdownload\u002Fv26.01\u002FArmNN-android-33-arm64-v8.6-a.tar.gz) [![](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fdownload-arm64--v86asve-purple)](https:\u002F\u002Fgithub.com\u002FARM-software\u002Farmnn\u002Freleases\u002Fdownload\u002Fv26.01\u002FArmNN-android-33-arm64-v8.6-a-sve.tar.gz) 
[![](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fdownload-arm64--v86asve2-purple)](https:\u002F\u002Fgithub.com\u002FARM-software\u002Farmnn\u002Freleases\u002Fdownload\u002Fv26.01\u002FArmNN-android-33-arm64-v8.6-a-sve2.tar.gz)                                                                                                                                                                    |\n| Android 14 \"U\u002FUpside Down Cake\" (API level 34)| [![](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fdownload-arm64--v82a-blue)](https:\u002F\u002Fgithub.com\u002FARM-software\u002Farmnn\u002Freleases\u002Fdownload\u002Fv26.01\u002FArmNN-android-34-arm64-v8.2-a.tar.gz) [![](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fdownload-arm64--v86a-blue)](https:\u002F\u002Fgithub.com\u002FARM-software\u002Farmnn\u002Freleases\u002Fdownload\u002Fv26.01\u002FArmNN-android-34-arm64-v8.6-a.tar.gz) [![](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fdownload-arm64--v86asve-blue)](https:\u002F\u002Fgithub.com\u002FARM-software\u002Farmnn\u002Freleases\u002Fdownload\u002Fv26.01\u002FArmNN-android-34-arm64-v8.6-a-sve.tar.gz) [![](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fdownload-arm64--v86asve2-blue)](https:\u002F\u002Fgithub.com\u002FARM-software\u002Farmnn\u002Freleases\u002Fdownload\u002Fv26.01\u002FArmNN-android-34-arm64-v8.6-a-sve2.tar.gz)                                                                                                                                                                            |\n\nArm NN also provides pre-built multi-isa binaries for Android. The v8a binary includes support from basic v8a architecture and upwards. \nThe v8.2a binary includes support from v8.2a and upwards. 
These include support for SVE, SVE2, FP16 and some dot product kernels.\nThese kernels need appropriate hardware to work on.\n\n\n| Multi ISA Architecture | Release Archive (Download)                                                                                                                                                              |\n|------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|\n| Linux Arm v8a          | [![](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fdownload-arm64--v8a-pink)](https:\u002F\u002Fgithub.com\u002FARM-software\u002Farmnn\u002Freleases\u002Fdownload\u002Fv26.01\u002FMULTI_ISA-GCC11-ArmNN+ACL-linux-armv8a.tar.gz)              |\n| Linux Arm v8.2a        | [![](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fdownload-arm64--v82a-violet)](https:\u002F\u002Fgithub.com\u002FARM-software\u002Farmnn\u002Freleases\u002Fdownload\u002Fv26.01\u002FMULTI_ISA-GCC11-ArmNN+ACL-linux-armv8.2-a.tar.gz)        |\n| Android 31 v8a         | [![](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fdownload-android--v8a-tan)](https:\u002F\u002Fgithub.com\u002FARM-software\u002Farmnn\u002Freleases\u002Fdownload\u002Fv26.01\u002FMULTI_ISA-ArmNN+ACL+SL-android-31-arm64-v8a.tar.gz)        |\n| Android 31 v8.2a       | [![](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fdownload-android--v82a-indigo)](https:\u002F\u002Fgithub.com\u002FARM-software\u002Farmnn\u002Freleases\u002Fdownload\u002Fv26.01\u002FMULTI_ISA-ArmNN+ACL+SL-android-31-arm64-v8.2-a.tar.gz) |\n\n\n\n## Software Overview\nThe Arm NN SDK supports ML models in **TensorFlow Lite** (TF Lite) and **ONNX** formats.\n\n**Arm NN's TF Lite Delegate** accelerates TF Lite models through **Python or C++ APIs**. 
Supported TF Lite operators\nare accelerated by Arm NN and any unsupported operators are delegated (fallback) to the reference TF Lite runtime -\nensuring extensive ML operator support. **The recommended way to use Arm NN is to\n[convert your model to TF Lite format](https:\u002F\u002Fwww.tensorflow.org\u002Flite\u002Fconvert) and use the TF Lite Delegate.** Please\nrefer to the [Quick Start Guides](#quick-start-guides) for more information on how to use the TF Lite Delegate.\n\nArm NN also provides **TF Lite and ONNX parsers** which are C++ libraries for integrating TF Lite or ONNX models\ninto your ML application. Please note that these parsers do not provide extensive ML operator coverage as compared\nto the Arm NN TF Lite Delegate.\n\n**Android** ML application developers have a number of options for using Arm NN:\n* Download and use our [Pre-Built Binaries](#pre-built-binaries) for the Android platform\n* Build Arm NN from scratch with the Android NDK using this [GitHub guide](BuildGuideAndroidNDK.md)\n\nArm also provides an [Android-NN-Driver](https:\u002F\u002Fgithub.com\u002FARM-software\u002Fandroid-nn-driver) which implements a\nhardware abstraction layer (HAL) for the Android NNAPI. When the Android NN Driver is integrated on an Android device,\nML models used in Android applications will automatically be accelerated by Arm NN.\n\n**For more information about the Arm NN components, please refer to our\n[documentation](https:\u002F\u002Fgithub.com\u002FARM-software\u002Farmnn\u002Fwiki\u002FDocumentation).**\n\nFor FAQs and troubleshooting advice, see the [FAQ](docs\u002FFAQ.md) or take a look at previous\n[GitHub Issues](https:\u002F\u002Fgithub.com\u002FARM-software\u002Farmnn\u002Fissues).\n\n\n## Get Involved\nThe best way to get involved is by using our software. If you need help or encounter an issue, please raise it as a\n[GitHub Issue](https:\u002F\u002Fgithub.com\u002FARM-software\u002Farmnn\u002Fissues). 
Feel free to have a look at any of our open issues too.\nWe also welcome feedback on our documentation.\n\nFeature requests without a volunteer to implement them are closed, but have the 'Help wanted' label, these can be\nfound [here](https:\u002F\u002Fgithub.com\u002FARM-software\u002Farmnn\u002Fissues?q=is%3Aissue+label%3A%22Help+wanted%22+).\nOnce you find a suitable Issue, feel free to re-open it and add a comment, so that Arm NN engineers know you are\nworking on it and can help.\n\nWhen the feature is implemented the 'Help wanted' label will be removed.\n\n\n## Contributions\nThe Arm NN project welcomes contributions. For more details on contributing to Arm NN please see the [Contributor Guide](CONTRIBUTING.md).\n\nParticularly if you'd like to implement your own backend next to our CPU, GPU and NPU backends there are guides for\nbackend development: [Backend development guide](src\u002Fbackends\u002FREADME.md),\n[Dynamic backend development guide](src\u002Fdynamic\u002FREADME.md).\n\n\n## Disclaimer\nThe armnn\u002Ftests directory contains tests used during Arm NN development. Many of them depend on third-party IP, model\nprotobufs and image files not distributed with Arm NN. The dependencies for some tests are available freely on\nthe Internet, for those who wish to experiment, but they won't run out of the box.\n\n\n## License\nArm NN is provided under the [MIT](https:\u002F\u002Fspdx.org\u002Flicenses\u002FMIT.html) license.\nSee [LICENSE](LICENSE) for more information. 
Contributions to this project are accepted under the same license.\n\nIndividual files contain the following tag instead of the full license text.\n\n    SPDX-License-Identifier: MIT\n\nThis enables machine processing of license information based on the SPDX License Identifiers that are available\nhere: http:\u002F\u002Fspdx.org\u002Flicenses\u002F\n\n\n## Inclusive language commitment\nArm NN conforms to Arm's inclusive language policy and, to the best of our knowledge, does not contain any non-inclusive language.\n\nIf you find something that concerns you, please email terms@arm.com\n\n\n## Third-party\nThird party tools used by Arm NN:\n\n| Tool           | License (SPDX ID) | Description                    | Version | Provenience                          |\n|----------------|-------------------|------------------------------------------------------------------|---------|--------------------------------------|\n| cxxopts        | MIT               | A lightweight C++ option parser library | 3.1.1   | https:\u002F\u002Fgithub.com\u002Fjarro2783\u002Fcxxopts |\n| doctest        | MIT               | Header-only C++ testing framework | 2.4.6   | https:\u002F\u002Fgithub.com\u002Fonqtam\u002Fdoctest    |\n| fmt            | MIT               | {fmt} is an open-source formatting library providing a fast and safe alternative to C stdio and C++ iostreams. 
| 8.30    | https:\u002F\u002Fgithub.com\u002Ffmtlib\u002Ffmt  |\n| ghc            | MIT               | A header-only single-file std::filesystem compatible helper library | 1.3.2   | https:\u002F\u002Fgithub.com\u002Fgulrak\u002Ffilesystem |\n| half           | MIT               | IEEE 754 conformant 16-bit half-precision floating point library | 1.12.0  | http:\u002F\u002Fhalf.sourceforge.net          |\n| mapbox\u002Fvariant | BSD               | A header-only alternative to 'boost::variant' | 1.1.3   | https:\u002F\u002Fgithub.com\u002Fmapbox\u002Fvariant    |\n| stb            | MIT               | Image loader, resize and writer | 2.16    | https:\u002F\u002Fgithub.com\u002Fnothings\u002Fstb      |\n\n\n## Build Flags\nArm NN uses the following security related build flags in their code:\n\n| Build flags\t      |\n|---------------------|\n| -Wall\t              |\n| -Wextra             |\n| -Wold-style-cast    |\n| -Wno-missing-braces |\n| -Wconversion        |\n| -Wsign-conversion   |\n| -Werror             |\n","\u003Cbr>\n\u003Cdiv align=\"center\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FARM-software_armnn_readme_9ccf2c60e9d3.png\" alt=\"Arm NN Logo\" width=\"300\"\u002F>\n\u003C\u002Fdiv>\n\n* [快速入门指南](#quick-start-guides)\n* [预编译二进制文件](#pre-built-binaries)\n* [软件概述](#software-overview)\n* [参与贡献](#get-involved)\n* [贡献](#contributions)\n* [免责声明](#disclaimer)\n* [许可证](#license)\n* [包容性语言承诺](#inclusive-language-commitment)\n* [第三方](#third-party)\n* [构建标志](#build-flags)\n\n# Arm NN\n\n**Arm NN** 现已成为一个遗留项目，不再由 Arm 积极维护。因此，预计不会再有任何更新，包括安全补丁和功能改进；用户应意识到，该项目可能无法抵御不受信任的输入或恶意环境。相应地，强烈建议仅在可信环境中使用 **Arm NN**。\n\n**Arm NN** 是适用于 Android 和 Linux 的 **性能最优** 的机器学习（ML）推理引擎，可在 **Arm Cortex-A CPU 和 Arm Mali GPU** 上加速 ML 推理。该 ML 推理引擎是一个开源 SDK，旨在弥合现有神经网络框架与高效能 Arm IP 之间的鸿沟。\n\nArm NN 凭借 **针对 Arm 架构的优化**（例如 SVE2），通过利用 **[Arm 计算库 (ACL)](https:\u002F\u002Fgithub.com\u002FARM-software\u002FComputeLibrary\u002F)**，其性能优于通用 ML 库。为了支持 Arm Ethos-N 
NPU，Arm NN 使用了 [Ethos-N NPU 驱动程序](https:\u002F\u002Fgithub.com\u002FARM-software\u002Fethos-n-driver-stack)。对于 Arm Cortex-M 的加速，请参阅 [CMSIS-NN](https:\u002F\u002Fgithub.com\u002FARM-software\u002FCMSIS_5)。\n\nArm NN 使用可移植的 **C++17** 编写，并采用 [CMake](https:\u002F\u002Fcmake.org\u002F) 进行构建，从而能够在多种主机环境中为广泛的目标平台进行构建。**Python** 开发人员可以通过我们的 **Arm NN TF Lite 委托** 与 Arm NN 进行交互。\n\n\n## 快速入门指南\n**Arm NN TF Lite 委托提供了 Arm NN 中最广泛的 ML 操作符支持**，是加速 ML 模型的简便方法。要开始使用 TF Lite 委托，首先请下载最新版本 Arm NN 的 **[预编译二进制文件](#pre-built-binaries)**。然后，使用 Python 解释器将您的 TF Lite 模型加载到 Arm NN TF Lite 委托中，并运行加速推理。有关如何使用 Arm NN TF Lite 委托加速 TF Lite 模型的信息，请参阅 GitHub 上的 **[快速入门指南](delegate\u002FDelegateQuickStartGuide.md)**，或更全面的 **[Arm 开发者指南](https:\u002F\u002Fdeveloper.arm.com\u002Fdocumentation\u002F102561\u002Flatest\u002F)**。\n\n我们为 Arm NN 提供 Debian 软件包，这是一种快速开始使用 Arm NN 和 TF Lite 解析器的方式（尽管其 ML 操作符支持不如 TF Lite 委托广泛）。此处提供了一份安装指南 [here](InstallationViaAptRepository.md)，其中介绍了如何为 Ubuntu 20.04 安装 Arm NN 核心组件和 TF Lite 解析器。\n\n若要从头开始构建 Arm NN，我们提供了 **[Arm NN 构建工具](build-tool\u002FREADME.md)**。该工具由 **参数化 Bash 脚本** 和一个用于构建 Arm NN 及其依赖项（包括 **[Arm 计算库 (ACL)](https:\u002F\u002Fgithub.com\u002FARM-software\u002FComputeLibrary\u002F)**）的 **Dockerfile** 组成。作为一种用户友好的构建方式，该工具取代了现有的大多数 Arm NN 构建指南。从头构建 Arm NN 的主要优势在于，您可以 **精确选择要构建的组件，以满足您的 ML 项目需求**。\u003Cbr>\n\n## 预编译二进制文件\n\n| 操作系统                              | 架构特定发布归档（下载）                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
                                                                                                                                                                                           |\n|-----------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|\n| Android 11 “R\u002F红丝绒蛋糕”（API 级别 30） | [![](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fdownload-arm64--v82a-orange)](https:\u002F\u002Fgithub.com\u002FARM-software\u002Farmnn\u002Freleases\u002Fdownload\u002Fv26.01\u002FArmNN-android-30-arm64-v8.2-a.tar.gz) [![](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fdownload-arm64--v8a-orange)](https:\u002F\u002Fgithub.com\u002FARM-software\u002Farmnn\u002Freleases\u002Fdownload\u002Fv26.01\u002FArmNN-android-30-arm64-v8a.tar.gz)                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                  |\n| Android 12 “S\u002F雪糕筒”（API 级别 31）       | [![](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fdownload-arm64--v82a-yellow)](https:\u002F\u002Fgithub.com\u002FARM-software\u002Farmnn\u002Freleases\u002Fdownload\u002Fv26.01\u002FArmNN-android-31-arm64-v8.2-a.tar.gz) [![](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fdownload-arm64--v8a-yellow)](https:\u002F\u002Fgithub.com\u002FARM-software\u002Farmnn\u002Freleases\u002Fdownload\u002Fv26.01\u002FArmNN-android-31-arm64-v8a.tar.gz)  [![](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fdownload-arm64--v86a-yellow)](https:\u002F\u002Fgithub.com\u002FARM-software\u002Farmnn\u002Freleases\u002Fdownload\u002Fv26.01\u002FArmNN-android-31-arm64-v8.6-a.tar.gz)  [![](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fdownload-arm64--v86asve-yellow)](https:\u002F\u002Fgithub.com\u002FARM-software\u002Farmnn\u002Freleases\u002Fdownload\u002Fv26.01\u002FArmNN-android-31-arm64-v8.6-a-sve.tar.gz) [![](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fdownload-arm64--v86asve2-yellow)](https:\u002F\u002Fgithub.com\u002FARM-software\u002Farmnn\u002Freleases\u002Fdownload\u002Fv26.01\u002FArmNN-android-31-arm64-v8.6-a-sve2.tar.gz) |\n| Android 13 “T\u002F提拉米苏”（API 级别 33）        | [![](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fdownload-arm64--v82a-purple)](https:\u002F\u002Fgithub.com\u002FARM-software\u002Farmnn\u002Freleases\u002Fdownload\u002Fv26.01\u002FArmNN-android-33-arm64-v8.2-a.tar.gz) [![](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fdownload-arm64--v86a-purple)](https:\u002F\u002Fgithub.com\u002FARM-software\u002Farmnn\u002Freleases\u002Fdownload\u002Fv26.01\u002FArmNN-android-33-arm64-v8.6-a.tar.gz) 
[![](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fdownload-arm64--v86asve-purple)](https:\u002F\u002Fgithub.com\u002FARM-software\u002Farmnn\u002Freleases\u002Fdownload\u002Fv26.01\u002FArmNN-android-33-arm64-v8.6-a-sve.tar.gz) [![](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fdownload-arm64--v86asve2-purple)](https:\u002F\u002Fgithub.com\u002FARM-software\u002Farmnn\u002Freleases\u002Fdownload\u002Fv26.01\u002FArmNN-android-33-arm64-v8.6-a-sve2.tar.gz)                                                                                                                                                                    |\n| Android 14 “U\u002F倒扣蛋糕”（API 级别 34）| [![](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fdownload-arm64--v82a-blue)](https:\u002F\u002Fgithub.com\u002FARM-software\u002Farmnn\u002Freleases\u002Fdownload\u002Fv26.01\u002FArmNN-android-34-arm64-v8.2-a.tar.gz) [![](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fdownload-arm64--v86a-blue)](https:\u002F\u002Fgithub.com\u002FARM-software\u002Farmnn\u002Freleases\u002Fdownload\u002Fv26.01\u002FArmNN-android-34-arm64-v8.6-a.tar.gz) [![](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fdownload-arm64--v86asve-blue)](https:\u002F\u002Fgithub.com\u002FARM-software\u002Farmnn\u002Freleases\u002Fdownload\u002Fv26.01\u002FArmNN-android-34-arm64-v8.6-a-sve.tar.gz) [![](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fdownload-arm64--v86asve2-blue)](https:\u002F\u002Fgithub.com\u002FARM-software\u002Farmnn\u002Freleases\u002Fdownload\u002Fv26.01\u002FArmNN-android-34-arm64-v8.6-a-sve2.tar.gz)                                                                                                                                                                            |\n\nArm NN 还为 Android 提供了预编译的多 ISA 二进制文件。v8a 二进制文件支持从基础 v8a 架构及其以上版本。\nv8.2a 二进制文件则支持从 v8.2a 及其以上版本。这些二进制文件包含了对 SVE、SVE2、FP16 以及部分点积内核的支持。\n这些内核需要相应的硬件才能正常运行。\n\n\n| 多 ISA 架构 | 发布归档（下载）                                                                
                                                                                              |\n|------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|\n| Linux Arm v8a          | [![](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fdownload-arm64--v8a-pink)](https:\u002F\u002Fgithub.com\u002FARM-software\u002Farmnn\u002Freleases\u002Fdownload\u002Fv26.01\u002FMULTI_ISA-GCC11-ArmNN+ACL-linux-armv8a.tar.gz)              |\n| Linux Arm v8.2a        | [![](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fdownload-arm64--v82a-violet)](https:\u002F\u002Fgithub.com\u002FARM-software\u002Farmnn\u002Freleases\u002Fdownload\u002Fv26.01\u002FMULTI_ISA-GCC11-ArmNN+ACL-linux-armv8.2-a.tar.gz)        |\n| Android 31 v8a         | [![](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fdownload-android--v8a-tan)](https:\u002F\u002Fgithub.com\u002FARM-software\u002Farmnn\u002Freleases\u002Fdownload\u002Fv26.01\u002FMULTI_ISA-ArmNN+ACL+SL-android-31-arm64-v8a.tar.gz)        |\n| Android 31 v8.2a       | [![](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fdownload-android--v82a-indigo)](https:\u002F\u002Fgithub.com\u002FARM-software\u002Farmnn\u002Freleases\u002Fdownload\u002Fv26.01\u002FMULTI_ISA-ArmNN+ACL+SL-android-31-arm64-v8.2-a.tar.gz) |\n\n## 软件概述\nArm NN SDK 支持 **TensorFlow Lite** (TF Lite) 和 **ONNX** 格式的机器学习模型。\n\n**Arm NN 的 TF Lite Delegate** 通过 **Python 或 C++ API** 加速 TF Lite 模型。支持的 TF Lite 运算符由 Arm NN 加速，而不支持的运算符则会委托（回退）到参考的 TF Lite 运行时——从而确保对大多数机器学习运算符的支持。**使用 Arm NN 的推荐方式是将您的模型转换为 TF Lite 格式**，并使用 TF Lite Delegate。有关如何使用 TF Lite Delegate 的更多信息，请参阅[快速入门指南](#quick-start-guides)。\n\nArm NN 还提供了 **TF Lite 和 ONNX 解析器**，它们是用于将 TF Lite 或 ONNX 模型集成到您的机器学习应用程序中的 C++ 库。请注意，与 Arm NN 的 TF Lite Delegate 相比，这些解析器并不提供广泛的机器学习运算符支持。\n\n**Android** 平台的机器学习应用开发者有多种方式可以使用 Arm NN：\n* 下载并使用我们为 Android 
平台提供的[预编译二进制文件](#pre-built-binaries)\n* 使用此 [GitHub 指南](BuildGuideAndroidNDK.md) 通过 Android NDK 从头构建 Arm NN\n\nArm 还提供了一个 [Android-NN-Driver](https:\u002F\u002Fgithub.com\u002FARM-software\u002Fandroid-nn-driver)，它实现了 Android NNAPI 的硬件抽象层 (HAL)。当 Android NN Driver 集成到 Android 设备上时，Android 应用程序中使用的机器学习模型将自动由 Arm NN 加速。\n\n**有关 Arm NN 组件的更多信息，请参阅我们的[文档](https:\u002F\u002Fgithub.com\u002FARM-software\u002Farmnn\u002Fwiki\u002FDocumentation)。**\n\n有关常见问题解答和故障排除建议，请参阅[FAQ](docs\u002FFAQ.md)，或查看之前的[GitHub 问题](https:\u002F\u002Fgithub.com\u002FARM-software\u002Farmnn\u002Fissues)。\n\n\n## 参与贡献\n参与的最佳方式就是使用我们的软件。如果您需要帮助或遇到问题，请在 [GitHub 上提交一个问题](https:\u002F\u002Fgithub.com\u002FARM-software\u002Farmnn\u002Fissues)。您也可以随意浏览我们现有的任何问题。我们也欢迎对文档提出反馈意见。\n\n没有志愿者愿意实现的功能请求会被关闭，但会打上“寻求帮助”的标签，这些请求可以在[这里](https:\u002F\u002Fgithub.com\u002FARM-software\u002Farmnn\u002Fissues?q=is%3Aissue+label%3A%22Help+wanted%22+)找到。一旦您找到合适的问题，可以重新打开它并添加评论，以便 Arm NN 的工程师知道您正在处理该问题，并能为您提供帮助。\n\n功能实现后，“寻求帮助”标签将会被移除。\n\n\n## 贡献\nArm NN 项目欢迎各方贡献。有关如何为 Arm NN 做出贡献的详细信息，请参阅[贡献者指南](CONTRIBUTING.md)。\n\n特别是如果您希望在我们的 CPU、GPU 和 NPU 后端之外实现自己的后端，我们提供了后端开发指南：[后端开发指南](src\u002Fbackends\u002FREADME.md) 和 [动态后端开发指南](src\u002Fdynamic\u002FREADME.md)。\n\n\n## 免责声明\narmnn\u002Ftests 目录包含 Arm NN 开发过程中使用的测试。其中许多测试依赖于第三方 IP、模型 protobuf 文件以及未随 Arm NN 一起分发的图像文件。部分测试的依赖项可在互联网上免费获取，供有兴趣进行实验的人使用，但这些测试无法直接运行。\n\n\n## 许可证\nArm NN 采用 [MIT](https:\u002F\u002Fspdx.org\u002Flicenses\u002FMIT.html) 许可证进行授权。更多相关信息请参阅 [LICENSE](LICENSE)。本项目的贡献也接受相同的许可证。\n\n个别文件中并未包含完整的许可证文本，而是使用以下标签：\n\n    SPDX-License-Identifier: MIT\n\n这使得可以根据此处提供的 SPDX 许可证标识符对许可证信息进行机器处理：http:\u002F\u002Fspdx.org\u002Flicenses\u002F\n\n\n## 包容性语言承诺\nArm NN 遵循 Arm 的包容性语言政策，据我们所知，不包含任何不包容性的语言。\n\n如果您发现任何令您担忧的内容，请发送电子邮件至 terms@arm.com\n\n\n## 第三方工具\nArm NN 使用的第三方工具如下：\n\n| 工具           | 许可证 (SPDX ID) | 描述                    | 版本 | 来源                          
|\n|----------------|-------------------|------------------------------------------------------------------|---------|--------------------------------------|\n| cxxopts        | MIT               | 一个轻量级的 C++ 选项解析库 | 3.1.1   | https:\u002F\u002Fgithub.com\u002Fjarro2783\u002Fcxxopts |\n| doctest        | MIT               | 一个仅包含头文件的 C++ 测试框架 | 2.4.6   | https:\u002F\u002Fgithub.com\u002Fonqtam\u002Fdoctest    |\n| fmt            | MIT               | {fmt} 是一个开源格式化库，为 C stdio 和 C++ iostreams 提供了一种快速且安全的替代方案。 | 8.30    | https:\u002F\u002Fgithub.com\u002Ffmtlib\u002Ffmt  |\n| ghc            | MIT               | 一个仅包含头文件、兼容 std::filesystem 的辅助库 | 1.3.2   | https:\u002F\u002Fgithub.com\u002Fgulrak\u002Ffilesystem |\n| half           | MIT               | 符合 IEEE 754 标准的 16 位半精度浮点数库 | 1.12.0  | http:\u002F\u002Fhalf.sourceforge.net          |\n| mapbox\u002Fvariant | BSD               | 一个仅包含头文件的 'boost::variant' 替代品 | 1.1.3   | https:\u002F\u002Fgithub.com\u002Fmapbox\u002Fvariant    |\n| stb            | MIT               | 图像加载、缩放和写入工具 | 2.16    | https:\u002F\u002Fgithub.com\u002Fnothings\u002Fstb      |\n\n\n## 构建标志\nArm NN 在其代码中使用了以下与安全性相关的构建标志：\n\n| 构建标志\t      |\n|---------------------|\n| -Wall\t              |\n| -Wextra             |\n| -Wold-style-cast    |\n| -Wno-missing-braces |\n| -Wconversion        |\n| -Wsign-conversion   |\n| -Werror             |","# Arm NN 快速上手指南\n\n> **⚠️ 重要提示**：Arm NN 目前已被标记为**遗留项目（Legacy Project）**，Arm 官方不再对其进行主动维护（包括安全补丁和功能更新）。建议仅在**受信任的环境**中使用，生产环境请谨慎评估安全风险。\n\nArm NN 是专为 Android 和 Linux 设计的高性能机器学习推理引擎，通过利用 Arm Cortex-A CPU、Mali GPU 及 Ethos-N NPU 的架构特性（如 SVE2），加速 TensorFlow Lite (TF Lite) 和 ONNX 模型的推理。\n\n## 1. 
环境准备\n\n### 系统要求\n*   **操作系统**：\n    *   Linux (推荐 Ubuntu 20.04)\n    *   Android (API Level 30+, 即 Android 11+)\n*   **硬件架构**：Arm64 (AArch64)，支持 v8a, v8.2a, v8.6a 等架构。\n*   **编译工具链**（如需源码编译）：\n    *   CMake\n    *   GCC \u002F Clang (支持 C++17)\n    *   Python 3 (用于使用 TF Lite Delegate)\n\n### 前置依赖\n若选择源码编译，需确保已安装以下基础依赖：\n```bash\nsudo apt-get update\nsudo apt-get install -y build-essential cmake git python3 python3-pip libboost-all-dev libprotobuf-dev protobuf-compiler\n```\n\n## 2. 安装步骤\n\n根据您的需求，可选择以下三种方式之一：\n\n### 方案 A：使用预编译二进制包（推荐，最快）\n适用于快速验证或集成到现有项目。\n\n1.  **下载二进制包**：\n    访问 [GitHub Releases](https:\u002F\u002Fgithub.com\u002FARM-software\u002Farmnn\u002Freleases) 下载对应平台和架构的压缩包。\n    *   *Linux 示例 (v8.2a)*:\n        ```bash\n        wget https:\u002F\u002Fgithub.com\u002FARM-software\u002Farmnn\u002Freleases\u002Fdownload\u002Fv26.01\u002FMULTI_ISA-GCC11-ArmNN+ACL-linux-armv8.2-a.tar.gz\n        tar -xzf MULTI_ISA-GCC11-ArmNN+ACL-linux-armv8.2-a.tar.gz\n        ```\n    *   *Android*: 下载对应的 `.tar.gz` 文件并解压至项目 `jniLibs` 目录。\n\n2.  **配置环境变量**（可选）：\n    ```bash\n    export LD_LIBRARY_PATH=$PWD\u002Flib:$LD_LIBRARY_PATH\n    ```\n\n### 方案 B：通过 APT 仓库安装 (仅限 Ubuntu 20.04)\n适用于 Linux 桌面或服务器环境，可快速安装核心库和 TF Lite Parser。\n\n```bash\n# 添加 Arm NN 仓库源（参考官方安装指南）\n# 注意：由于是遗留项目，官方源可能不再更新，请确认源可用性\nsudo add-apt-repository ppa:armnn-team\u002Fppa \nsudo apt-get update\nsudo apt-get install armnn armnn-tflite-parser\n```\n\n### 方案 C：源码编译（高度定制）\n适用于需要特定算子支持或针对特定硬件优化的场景。推荐使用官方提供的 **Arm NN Build Tool**。\n\n1.  **获取构建工具**：\n    ```bash\n    git clone https:\u002F\u002Fgithub.com\u002FARM-software\u002Farmnn.git\n    cd armnn\u002Fbuild-tool\n    ```\n\n2.  **执行构建脚本**：\n    该工具包含参数化的 Bash 脚本和 Dockerfile，可自动构建 Arm NN 及其依赖（如 Arm Compute Library）。\n    ```bash\n    # 示例：构建针对 Linux aarch64 的版本\n    .\u002Fbuild_tool.sh --target linux-aarch64 --build-type Release\n    ```\n    *构建完成后，产物位于 `build` 目录下。*\n\n## 3. 
基本使用\n\nArm NN 最推荐的使用方式是通过 **Arm NN TF Lite Delegate** 加速现有的 TensorFlow Lite 模型。这种方式支持最广泛的算子，且不支持的算子会自动回退到原生 TF Lite 运行时。\n\n### Python 使用示例\n\n确保已安装 `tflite-runtime` 或 `tensorflow`，并将 Arm NN 的动态库路径加入 `LD_LIBRARY_PATH`。\n\n```python\nimport tensorflow as tf\nimport numpy as np\n\n# 1. 加载 TFLite 模型\ninterpreter = tf.lite.Interpreter(\n    model_path=\"your_model.tflite\",\n    # 2. 加载 Arm NN Delegate\n    experimental_delegates=[\n        tf.lite.load_delegate(\"libarmnnDelegate.so\") \n        # Linux 下通常为 libarmnnDelegate.so\n        # Android 下通常为 libarmnnDelegate.so (在 jniLibs 中)\n    ]\n)\n\ninterpreter.allocate_tensors()\n\n# 3. 获取输入输出细节\ninput_details = interpreter.get_input_details()\noutput_details = interpreter.get_output_details()\n\n# 4. 准备测试数据\ninput_data = np.array(np.random.random_sample(input_details[0]['shape']), dtype=np.float32)\n\n# 5. 设置输入并运行推理\ninterpreter.set_tensor(input_details[0]['index'], input_data)\ninterpreter.invoke()\n\n# 6. 获取结果\noutput_data = interpreter.get_tensor(output_details[0]['index'])\nprint(\"Inference output shape:\", output_data.shape)\n```\n\n### C++ 使用简述\n若使用 C++，需链接 `armnn` 和 `armnnTfLiteParser` 库，通过 `ITfLiteParser` 加载模型并创建网络进行推理。具体代码结构请参考官方 `samples` 目录下的示例。\n\n---\n*注：本指南基于 Arm NN v26.01 版本整理。由于项目已停止维护，如遇兼容性问题，建议检查底层依赖（如 ACL）的版本匹配情况。*","某边缘计算团队正在为基于 Arm Cortex-A 处理器的智能安防摄像头部署实时人脸检测模型，需在资源受限的 Linux 环境下保证高帧率运行。\n\n### 没有 armnn 时\n- 通用机器学习库无法利用 Arm 架构特有的 SVE2 指令集，导致 CPU 算力浪费，推理延迟高达 200 毫秒。\n- 模型在 Mali GPU 上运行时缺乏专用优化，功耗激增导致设备发热严重，不得不降低摄像头采集频率。\n- 从 TensorFlow 或 PyTorch 迁移模型到嵌入式环境需重写大量底层算子代码，开发周期延长数周。\n- 面对复杂的网络结构，现有方案算子支持不全，被迫裁剪模型精度，致使漏检率上升。\n\n### 使用 armnn 后\n- armnn 通过 Arm Compute Library 深度调用 SVE2 指令集，将单帧推理时间压缩至 40 毫秒以内，实现流畅的 25 FPS 实时检测。\n- 借助对 Mali GPU 的硬件感知调度，同等算力下功耗降低 40%，设备可长时间稳定运行而不过热。\n- 利用 armnn 提供的 TF Lite Delegate，开发者仅需几行 Python 代码即可加载并加速现有模型，无需修改原始网络结构。\n- armnn 覆盖了广泛的 ML 算子，完整保留模型精度，显著提升了复杂光照条件下的识别准确率。\n\narmnn 通过架构级优化打通了算法框架与 Arm 
硬件间的性能壁垒，让边缘设备也能拥有云端般的推理效率。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FARM-software_armnn_9ccf2c60.png","ARM-software","Arm Software","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002FARM-software_8286f64a.png","",null,"www.arm.com","https:\u002F\u002Fgithub.com\u002FARM-software",[81,85,89,93,97,101,105],{"name":82,"color":83,"percentage":84},"C++","#f34b7d",97.5,{"name":86,"color":87,"percentage":88},"CMake","#DA3434",1.5,{"name":90,"color":91,"percentage":92},"Shell","#89e051",0.6,{"name":94,"color":95,"percentage":96},"Makefile","#427819",0.3,{"name":98,"color":99,"percentage":100},"Python","#3572A5",0.1,{"name":102,"color":103,"percentage":104},"Dockerfile","#384d54",0,{"name":106,"color":107,"percentage":104},"Assembly","#6E4C13",1301,327,"2026-03-31T12:04:25","MIT",4,"Linux, Android","非必需。支持 Arm Mali GPU（通过 Arm Compute Library 加速），不支持 NVIDIA CUDA。针对 Arm Ethos-N NPU 需额外驱动。","未说明",{"notes":117,"python":118,"dependencies":119},"1. 该项目已停止维护（Legacy project），不再提供安全补丁或功能更新，建议仅在受信任的环境中使用。\n2. 核心架构为 ARM（Cortex-A CPU, Mali GPU），不兼容 x86 架构。\n3. 推荐使用 TF Lite Delegate 以获得最广泛的算子支持。\n4. 提供预编译二进制包（Android API 30-34, Linux Ubuntu 20.04）及基于 Docker 的构建工具。\n5. 代码基于 C++17 编写。","未说明具体版本，但支持通过 Python 接口使用 Arm NN TF Lite Delegate",[120,86,121,122,123,124],"Arm Compute Library (ACL)","TensorFlow Lite","ONNX","Ethos-N NPU Driver (可选，针对 NPU)","Android NDK (仅 Android 构建需要)",[14],[127,128,129],"machine-learning","neural-network","neural-networks","2026-03-27T02:49:30.150509","2026-04-07T09:48:00.856969",[133,138,143,147,152,157],{"id":134,"question_zh":135,"answer_zh":136,"source_url":137},21601,"运行 ArmNN 单元测试（UnitTests）时出现文件不存在或打开句柄失败的错误，如何解决？","这个问题通常是因为编译环境和运行时环境不一致导致的。动态后端测试依赖于构建时创建的一组测试文件和目录。默认情况下，测试会在相对于 UnitTests 可执行文件构建位置的 `src\u002Fbackends\u002FbackendsCommon\u002Ftest\u002F` 目录中查找这些文件。\n\n解决方案有两种：\n1. 将这些文件和目录复制到新的单元测试执行环境中。\n2. 
在运行 UnitTests 可执行文件时添加命令行参数来指定新的根路径：\n   `UnitTests -- --dynamic-backend-build-dir \"新路径\"`\n\n如果移动了整个构建目录，也需要使用上述参数重新指定路径。","https:\u002F\u002Fgithub.com\u002FARM-software\u002Farmnn\u002Fissues\u002F432",{"id":139,"question_zh":140,"answer_zh":141,"source_url":142},21602,"在 Android NDK 环境下构建 ArmNN 时遇到 'basic_streambuf' 私有成员错误或链接错误，原因是什么？","这通常是由于 ArmNN 代码版本与使用的 Android NDK 版本（如 r17b）不兼容，或者 C++ 标准库配置问题导致的。\n\n如果是构建 Compute Library 时的疑问：\n1. **架构选择**：默认是 Arm v7-A，不需要额外指定 `arch` 参数。如果需要 Arm v8-A，则需添加 `arch=arm64-v8a`。\n2. **OpenCL 支持**：只要拥有支持相应 OpenCL 级别的 Mali GPU，Mali OpenCL 代码即可工作，这与 CPU 是 Arm-v7 还是 Arm-v8 无关。启用 OpenCL 需添加参数 `opencl=1 embed_kernels=1`。\n3. **构建命令示例**：\n   - Armv7-A: `scons extra_cxx_flags=\"-fPIC\" benchmark_tests=0 validation_tests=0`\n   - Armv8-A: `scons arch=arm64-v8a extra_cxx_flags=\"-fPIC\" benchmark_tests=0 validation_tests=0`\n\n如果遇到未定义引用（undefined reference）错误，请检查是否正确启用了 opencl 选项以及是否链接了正确的库。","https:\u002F\u002Fgithub.com\u002FARM-software\u002Farmnn\u002Fissues\u002F411",{"id":144,"question_zh":145,"answer_zh":146,"source_url":142},21603,"如何在 Intel PC (Ubuntu) 或树莓派上构建 Arm Compute Library？","你可以在任何安装了合适编译器的地方进行构建，包括 Intel Ubuntu PC 或树莓派。\n\n- **默认架构**：默认为 Arm v7-A，无需额外指定 `arch` 参数。\n- **OpenCL 支持**：如果你的设备拥有支持相应 OpenCL 级别的 Mali GPU，可以添加 `opencl=1 embed_kernels=1` 参数。这与 CPU 架构（Arm-v7 或 Arm-v8）无关，主要取决于 GPU 支持情况。\n- **构建命令**：\n  使用 SCons 工具，基本命令如下：\n  `scons extra_cxx_flags=\"-fPIC\" benchmark_tests=0 validation_tests=0`\n  若需指定 Armv8 架构：\n  `scons arch=arm64-v8a extra_cxx_flags=\"-fPIC\" benchmark_tests=0 validation_tests=0`",{"id":148,"question_zh":149,"answer_zh":150,"source_url":151},21604,"在 Android 上使用 ArmNN 解析 TFLite 模型时遇到不支持的操作符（Operator not supported）或崩溃怎么办？","这通常是因为底层 Compute Library 中某些算子（如 NEScale）存在 Bug 或缺少特定支持（如 dilation 膨胀操作）。\n\n- **修复崩溃**：维护者已修复了 NEScale 导致的崩溃问题，相关补丁已合并到 master 分支。请确保你使用的是最新版本的 Compute Library 和 ArmNN。\n- **新增支持**：社区贡献者已添加了 dilation（膨胀）支持。\n- **建议**：如果遇到此类问题，请先尝试更新到最新的 master 分支代码重新构建。如果问题依旧，可能是该算子尚未被 ArmNN 
完全支持，需要关注官方更新或提交特性请求。","https:\u002F\u002Fgithub.com\u002FARM-software\u002Farmnn\u002Fissues\u002F133",{"id":153,"question_zh":154,"answer_zh":155,"source_url":156},21605,"构建 ArmNN for Android 时出现 'variable set but not used' 导致编译失败（-Werror），如何处理？","这是一个编译警告被当作错误处理的情况（`-Werror` 标志）。具体错误为变量 'numOutputElements' 被设置但未使用。\n\n虽然这是 ArmNN 代码层面的问题，但在某些特定的编译器配置（如 Termux 或特定 NDK 版本）下会触发。建议尝试以下方法：\n1. 更新 ArmNN 到最新版本，看该问题是否已在后续提交中修复。\n2. 如果是自行修改源码构建，可以尝试在编译选项中暂时禁用将该特定警告视为错误，或者注释掉相关未使用的变量代码（不推荐用于生产环境）。\n3. 检查是否使用了过旧或不兼容的 NDK 版本，尝试更换 NDK 版本。","https:\u002F\u002Fgithub.com\u002FARM-software\u002Farmnn\u002Fissues\u002F634",{"id":158,"question_zh":159,"answer_zh":160,"source_url":137},21606,"ArmNN 动态后端测试依赖的文件在运行时找不到，除了复制文件外还有其他配置方法吗？","是的，除了手动复制文件外，更灵活的方法是通过命令行参数动态指定路径。\n\n在运行 `UnitTests` 可执行文件时，添加以下参数：\n`--dynamic-backend-build-dir \"新路径\"`\n\n例如：\n`.\u002FUnitTests -- --dynamic-backend-build-dir \"\u002Fdata\u002Flocal\u002Ftmp\u002Fdynamic\u002Fsample\"`\n\n这样测试程序就会去指定的新路径下寻找所需的测试文件和目录，而无需关心可执行文件原本构建时的相对路径。这在交叉编译或将构建产物部署到不同目录结构的目标设备（如 Android 手机）时非常有用。",[162,167,172,177,182,187,192,197,202,207,212,217,222,227,232,237,242,247,252,257],{"id":163,"version":164,"summary_zh":165,"released_at":166},127627,"v26.01","\u003C!-- x-tinymce\u002Fhtml -->\u003Ch2>\u003Cstrong>Arm NN SDK\u003C\u002Fstrong>\u003C\u002Fh2>\u003Ch4>特性与改进\u003C\u002Fh4>\u003Cul>\u003Cli>更新了 Compute Library 的版本号为 v52.7.0。\u003C\u002Fli>\u003C\u002Ful>\u003Ch4>错误修复\u003C\u002Fh4>\u003Cul>\u003Cli>修复了 build_android_ndk_guide.sh 中的错误处理。\u003C\u002Fli>\u003Cli>在 script\u002Fget_compute_library.sh 脚本中获取 Compute Library 的正确引用。\u003C\u002Fli>\u003C\u002Ful>\u003Ch2>\u003Cstrong>ABI\u002FAPI 变更：\u003C\u002Fstrong>\u003C\u002Fh2>\u003Cp>没有 API\u002FABI 变更。\u003C\u002Fp>\u003Ch2>\u003Cstrong>构建依赖\u003C\u002Fstrong>\u003C\u002Fh2>\n工具 | 支持版本\n-- | --\nGit | 2.17.1 或更高版本\nSCons | 2.4.1（Ubuntu）和 2.5.1（Debian）\nCMake | 3.22.1\nTensorFlow | 2.19.0\nONNX | 1.6.0\nFlatBuffers | 24.3.25\nProtocol Buffers | 3.19.4\nAndroid NDK | 
r26b\ncxxopts | 3.1.1\ndoctest | 2.4.6\nfmt | 8.30\nghc | 1.3.2\nhalf | 1.12.0\nmapbox\u002Fvariant | 1.1.3\nSTB | 2.16\nGemmlowp | 16e8662c34917be0065110bfcd9cc27d30f52fdf\n\n","2026-01-23T10:50:37",{"id":168,"version":169,"summary_zh":170,"released_at":171},127628,"v25.11","# Arm NN SDK\n### 功能与改进：\n- 更新了 v25.11 版本的 ABI 版本。\n- 将 ArmNN 解析器与 LiteRT FlatBuffer 模式集成。\n- 在解析器中增加了对缓冲区指针的验证，以提高内存操作的安全性。\n- 增加了 PRELU 运算符的支持，并为 GREATER_EQUAL 添加了 BOOL 类型。\n- 在委托中为 TRANSPOSE_CONV 运算符添加了偏置支持。\n- 更新了 TensorFlow 依赖项和构建脚本，以适应 TensorFlow 2.19 的迁移。\n- 添加了性能分析头文件和跟踪点设置。\n- 增加了 QNX 平台移植。\n- 改进了裸机目标的兼容性以及 NEReorderLayer 的向后兼容性。\n- 增加了针对 libtensorflow-lite.so 构建的支持。\n- 更新了 build-tools 中的 protobuf 依赖。\n- 为 Rescale、DepthwiseConv2d、Shift\u002FMultiplier 等运算符添加了单元测试。\n- 重构了 reorder 内核和层。\n### 错误修复：\n- 修复了缺少 \u003Ccstdint> 头文件导致的 GCC 15 构建错误。\n- 修复了 armv7 上的 GCC 编译问题。\n- 修复了 ArmNN TFLite 解析器中的堆缓冲区溢出问题（SpaceToBatchND 输入类型验证）。\n- 修复了 Shift 操作中的范围验证问题（添加了 Op_Minimum）。\n- 修复了 a64_hgemm_8x24 中过早读取操作数的问题。\n- 修复了 CpuGemmAssembly bf16 测试中的问题。\n- 修复了 SME softmax FP32 内核在处理大输入时的问题。\n- 修复了 SME2 内核中 INT8 Softmax 预留寄存器的问题。\n- 修复了 SUB、MUL 和 ADD 运算符的输出，使其更接近 TFLite 参考输出。\n### 文档：\n- 更新了文档以反映 GitHub 的迁移。\n- 更新了 README 文件，注明项目已进入遗留状态。\n- 更新了贡献指南中的版权年份。\n# ABI\u002FAPI 变更\n在实现 v25.11 的过程中，发生了一些前端 API 变更，用户在升级前应注意这些变更。\n因此，我们按照[语义化版本控制](https:\u002F\u002Fsemver.org\u002F)的规范，将 ARMNN_VERSION 提升至 36.0.0。\nmarkdown\n| 功能 | SHA  | Gerrit 审查 | 导致的 ABI\u002FAPI 变更 |\n|------|:-:|:-:|--|\n| 不透明委托中的子图    | ac9607f401dc30003aa97bd179a06d6b8a32139f | https:\u002F\u002Freview.mlplatform.org\u002Fc\u002Fml\u002Farmnn\u002F+\u002F9389 | ArmnnSubgraph::VisitNode 方法已更新。tfLiteRegistration 参数由 TfLiteRegistrationExternal* 类型更新为 TfLiteOperator* |\n# 构建依赖\n| 工具 | 支持版本 |\n|--|:-|\nGit | 2.17.1 或更高版本\nSCons | 2.4.1（Ubuntu）和 2.5.1（Debian）\nCMake | 3.22.1\nTensorflow | 2.19.0\nOnnx | 1.6.0\nFlatbuffer | 24.3.25\nProtobuf | 3.19.4\nAndroid NDK | r26b\ncxxopts | 3.1.1\ndoctest | 2.4.6\nfmt | 8.30\nghc | 1.3.2\nhalf | 
1.12.0\nmapbox\u002Fvariant | 1.1.3\nstb | 2.16\nGemmlowp | 16e8662c34917be0065110bfcd9cc27d30f52fdf\n\n\n\n\n\n\n","2025-11-10T11:01:32",{"id":173,"version":174,"summary_zh":175,"released_at":176},127629,"v25.02","\u003Ch2>\u003Cstrong>Arm NN SDK\u003C\u002Fstrong>\u003C\u002Fh2>\n\u003Ch4>\u003Cstrong>错误修复：\u003C\u002Fstrong>\u003C\u002Fh4>\n\u003Cul>\n    \u003Cli>TosaRef 映射中针对算子\u003Cstrong>LeakyRelu\u003C\u002Fstrong>、\u003Cstrong>Quantize\u003C\u002Fstrong>、\u003Cstrong>Stack\u003C\u002Fstrong>、\u003Cstrong>Dequantize\u003C\u002Fstrong> 的错误修复\u003C\u002Fli>\n    \u003Cli>TosaRef 针对一系列不同算子的重构及错误修复。\u003C\u002Fli>\n    \u003Cli>TosaRef 中的步进切片错误。\u003C\u002Fli>\n    \u003Cli>TfLite Turbo 模型检测修复。\u003C\u002Fli>\n    \u003Cli>在 Neon 和 CL 后端进行融合之前，添加了检查以确保激活位于当前子图内。\u003C\u002Fli>\n\u003C\u002Ful>\n\u003Ch4>\u003Cstrong>移除的功能：\u003C\u002Fstrong>\u003C\u002Fh4>\n\u003Cul>\n    \u003Cli>移除了异步 API。\u003C\u002Fli>\n    \u003Cli>移除了 PyArmNN。\u003C\u002Fli>\n    \u003Cli>移除了 Shim 及支持库。\u003C\u002Fli>\n    \u003Cli>移除了 RangeTracker 类。\u003C\u002Fli>\n\u003C\u002Ful>\n\u003Ch2>\u003Cstrong>ABI\u002FAPI 变更：\u003C\u002Fstrong>\u003C\u002Fh2>\n\u003Cp>在 25.02 版本的实现过程中，发生了以下\u003Cstrong>前端\u003C\u002Fstrong>API 变更，用户在升级前应注意。\u003C\u002Fp>\n\u003Cp>鉴于这些变更，我们已按照\u003Ca class=\"external-link\" style=\"text-decoration: none;\" href=\"https:\u002F\u002Fsemver.org\u002F\" rel=\"nofollow\">\u003Cspan>语义化版本控制\u003C\u002Fspan>\u003C\u002Fa>的指导原则，将 ARMNN_VERSION 提升至 35.0.0。\u003C\u002Fp>\n\u003Ctable class=\"relative-table wrapped\" style=\"width: 85.3178%;\">\n    \u003Cthead>\n        \u003Ctr>\n            \u003Cth style=\"text-align: left;\">\n                \u003Cp>功能\u003C\u002Fp>\n            \u003C\u002Fth>\n            \u003Cth style=\"text-align: left;\">\n                \u003Cp>SHA\u003C\u002Fp>\n            \u003C\u002Fth>\n            \u003Cth style=\"text-align: left;\">\n                \u003Cp>Gerrit 审核\u003C\u002Fp>\n            \u003C\u002Fth>\n            \u003Cth 
style=\"text-align: left;\">\n                \u003Cp>由此产生的 ABI\u002FAPI 变更\u003C\u002Fp>\n            \u003C\u002Fth>\n        \u003C\u002Ftr>\n        \u003Ctr>\n            \u003Ctd>移除异步 API\u003C\u002Ftd>\n            \u003Ctd>4483b24d9316ca895cc794e414aeeb3cba86790c\u003C\u002Ftd>\n            \u003Ctd>\u003Ca href=\"https:\u002F\u002Freview.mlplatform.org\u002Fc\u002Fml\u002Farmnn\u002F+\u002F12979\">https:\u002F\u002Freview.mlplatform.org\u002Fc\u002Fml\u002Farmnn\u002F+\u002F12979\u003C\u002Fa>\u003C\u002Ftd>\n            \u003Ctd>\n                \u003Cp>\u003Cspan class=\"section\" style=\"color: rgb(0,0,0);\">\u003Cspan class=\"sym_p\">已移除\u003Cstrong>IWorkingMemHandle\u003C\u002Fstrong> 类。\u003C\u002Fspan>\u003C\u002Fspan>\u003C\u002Fp>\n                \u003Cp>\u003Cspan class=\"section\" style=\"color: rgb(0,0,0);\">\u003Cspan class=\"sym_p\">已移除\u003Cstrong>IAsyncExecutionCallback\u003C\u002Fstrong> 类。\u003C\u002Fspan>\u003C\u002Fspan>\u003C\u002Fp>\n                \u003Cp>\u003Cbr \u002F>\u003C\u002Fp>\n                \u003Cp>\u003Cstrong>INetworkProperties\u003C\u002Fstrong> 结构体中移除了以下字段：\u003C\u002Fp>\n                \u003Cul>\n                    \u003Cli>\u003Cstrong>m_AsyncEnabled\u003C\u002Fstrong>\u003C\u002Fli>\n                \u003C\u002Ful>\u003Cbr \u002F>\n                \u003Cp>从\u003Cstrong>IRuntime\u003C\u002Fstrong> 类中移除了 4 个函数：\u003C\u002Fp>\n                \u003Cul>\n                    \u003Cli>\n                        \u003Ch6>\u003Cspan class=\"section\">IRuntime::ClearImportedInputs \u003Cspan class=\"sym_p\">( NetworkId \u003Cspan class=\"color_p\">networkId\u003C\u002Fspan>, std::vector&lt;unsigned int&gt;const \u003Cspan class=\"color_p\">inputIds\u003C\u002Fspan> )\u003C\u002Fspan>\u003C\u002Fspan>\u003Cspan class=\"section\">\u003Cbr \u002F>\u003C\u002Fspa","2025-02-19T18:03:42",{"id":178,"version":179,"summary_zh":180,"released_at":181},127630,"v24.11","\u003C!-- x-tinymce\u002Fhtml -->\u003Ch2>\u003Cstrong>Arm NN 
SDK\u003C\u002Fstrong>\u003C\u002Fh2>\u003Ch4>\u003Cstrong>新特性：\u003C\u002Fstrong>\u003C\u002Fh4>\u003Cul>\u003Cli>\u003Cspan style=\"color: rgb(23,43,77);\">\u003Cspan>在分配后端中实现了“全有或全无”逻辑。\u003C\u002Fspan>\u003C\u002Fspan>\u003C\u002Fli>\u003Cli>\u003Cspan style=\"color: rgb(23,43,77);\">\u003Cspan>\u003Cspan class=\"headerSubject\">为 Constant 和 Tile 工作负载添加了 Signed64 支持。\u003C\u002Fspan>\u003C\u002Fspan>\u003C\u002Fspan>\u003C\u002Fli>\u003Cli>\u003Cspan style=\"color: rgb(23,43,77);\">\u003Cspan>\u003Cspan class=\"headerSubject\">为 LogSoftMax 添加了 Int8 和 Uint8 支持，使其能够在 CpuAcc 和 GpuAcc 后端上运行。\u003C\u002Fspan>\u003C\u002Fspan>\u003C\u002Fspan>\u003C\u002Fli>\u003Cli>\u003Cspan style=\"color: rgb(23,43,77);\">\u003Cspan>\u003Cspan class=\"headerSubject\">为 ExecuteNetwork 添加了 GPU 的自动后端选择功能。\u003C\u002Fspan>\u003C\u002Fspan>\u003C\u002Fspan>\u003C\u002Fli>\u003Cli>\u003Cspan style=\"color: rgb(23,43,77);\">\u003Cspan>\u003Cspan class=\"headerSubject\">增加了对 TfLite Turbo 模型的识别，并启用了 Turbo 模式。\u003C\u002Fspan>\u003C\u002Fspan>\u003C\u002Fspan>\u003Cspan style=\"color: rgb(23,43,77);\">\u003Cspan>\u003Cspan class=\"headerSubject\">\u003Cbr>\u003C\u002Fspan>\u003C\u002Fspan>\u003C\u002Fspan>\u003C\u002Fli>\u003C\u002Ful>\u003Ch4>\u003Cspan style=\"color: rgb(23,43,77);\">\u003Cspan>\u003Cspan class=\"headerSubject\">\u003Cstrong>TosaCommon 和 TosaRef：\u003C\u002Fstrong>\u003C\u002Fspan>\u003C\u002Fspan>\u003C\u002Fspan>\u003C\u002Fh4>\u003Cul>\u003Cli>\u003Cspan style=\"color: rgb(23,43,77);\">\u003Cspan class=\"headerSubject\">增加了对“\u003Cspan style=\"color: rgb(23,43,77);\">\u003Cspan>Convolution3d\u003C\u002Fspan>\u003C\u002Fspan>”的支持。\u003Cbr>\u003C\u002Fspan>\u003C\u002Fspan>\u003C\u002Fli>\u003Cli>\u003Cspan style=\"color: rgb(23,43,77);\">\u003Cspan class=\"headerSubject\">\u003Cspan style=\"color: rgb(23,43,77);\">\u003Cspan>增加了对激活函数“Sigmoid”和“TanH”的支持。\u003C\u002Fspan>\u003C\u002Fspan>\u003C\u002Fspan>\u003C\u002Fli>\u003Cli>\u003Cspan style=\"color: 
">
rgb(23,43,77);\">\u003Cspan class=\"headerSubject\">\u003Cspan style=\"color: rgb(23,43,77);\">\u003Cspan>增加了对激活函数“HardSwish”的支持。\u003C\u002Fspan>\u003C\u002Fspan>\u003C\u002Fspan>\u003C\u002Fli>\u003Cli>\u003Cspan style=\"color: rgb(23,43,77);\">\u003Cspan class=\"headerSubject\">增加了对“StridedSlice”的支持。\u003C\u002Fspan>\u003C\u002Fspan>\u003C\u002Fli>\u003Cli>\u003Cspan style=\"color: rgb(23,43,77);\">\u003Cspan class=\"headerSubject\">\u003Cspan style=\"color: rgb(23,43,77);\">\u003Cspan>增加了对“ElementwiseBinary:SqDiff”的支持。\u003Cbr>\u003C\u002Fspan>\u003C\u002Fspan>\u003C\u002Fspan>\u003C\u002Fli>\u003Cli>\u003Cspan style=\"color: rgb(23,43,77);\">\u003Cspan class=\"headerSubject\">\u003Cspan style=\"color: rgb(23,43,77);\">\u003Cspan>增加了对“Stack”的支持。\u003C\u002Fspan>\u003C\u002Fspan>\u003C\u002Fspan>\u003C\u002Fli>\u003Cli>\u003Cspan style=\"color: rgb(23,43,77);\">\u003Cspan class=\"headerSubject\">\u003Cspan style=\"color: rgb(23,43,77);\">\u003Cspan>增加了对“Dequantize”的支持。\u003C\u002Fspan>\u003C\u002Fspan>\u003C\u002Fspan>\u003C\u002Fli>\u003Cli>\u003Cspan style=\"color: rgb(23,43,77);\">\u003Cspan class=\"headerSubject\">\u003Cspan style=\"color: rgb(23,43,77);\">\u003Cspan>增加了对“DepthToSpace”的支持。\u003C\u002Fspan>\u003C\u002Fspan>\u003C\u002Fspan>\u003C\u002Fli>\u003Cli>\u003Cspan style=\"color: rgb(23,43,77);\">\u003Cspan class=\"headerSubject\">\u003Cspan style=\"color: rgb(23,43,77);\">\u003Cspan>增加了对“Gather”的支持。\u003C\u002Fspan>\u003C\u002Fspan>\u003C\u002Fspan>\u003C\u002Fli>\u003C\u002Ful>\u003Ch4>\u003Cstrong>错误修复：\u003C\u002Fstrong>\u003C\u002Fh4>\u003Cul>\u003Cli>修复了 ReduceProdOp 在委托测试套件中使用 \u003Cspan>Int8\u003C\u002Fspan> 时，\u003Cspan style=\"color: rgb(23,43,77);\">\u003Cspan>\u003Cspan class=\"headerSubject\">CpuAcc 和 GpuAcc\u003C\u002Fspan>\u003C\u002Fspan>\u003C\u002Fspan> 后端出现的故障。\u003C\u002Fli>\u003Cli>修复了 TosaCommon 和 TosaRef 中的 Mean 
运算符。\u003C\u002Fli>\u003Cli>修复了…","2024-11-29T16:33:02",{"id":183,"version":184,"summary_zh":185,"released_at":186},127631,"v24.08","# 摘要\n\n\n#### 新特性\n\n \n* 在 TosaCommon 和 TosaRef 中实现了 Softmax。\n* 在 TosaCommon 和 TosaRef 中实现了 MEAN。\n* 在 TosaCommon 和 TosaRef 中实现了 REDUCE_SUM。\n* 在 TosaRef 中实现了 Activation: Gelu。\n* 在 TosaRef 中实现了 ElementwiseUnary: Log。\n* 在 TosaCommon 和 TosaRef 中实现了 Pad。\n* 在 TosaRef 中实现了 ElementwiseUnary: Exp。\n* 在 TosaCommon 和 TosaRef 中实现了 BatchMatMul。\n* 在 TosaCommon 和 TosaRef 中实现了 FullyConnected。\n* 在 TosaCommon 和 TosaRef 中实现了 Activation: BoundedReLu。\n* 在 TosaCommon 和 TosaRef 中实现了 Activation: ReLu。\n* 在 TosaCommon 和 TosaRef 中实现了 DepthwiseConvolution2d。\n* 在 TosaCommon 和 TosaRef 中实现了量化版的 ElementwiseBinary Add、Max、Mul 和 Sub 支持。\n\n\n#### Bug 修复\n\n* 修复了 PerAxisIterator 中的浮点异常。\n* 修复了 TFLite 解析器和不透明委托在执行网络时错误卸载运行时的问题。\n* 修复了 StridedSliceOp 的越界错误。\n* 修复了经典委托和不透明委托中未指定维度的错误。\n* 修复了使用 GCC-14.1.0 构建 ArmNN 委托时的警告。\n* 修复了 ReshapeOp 的 DTS 测试失败问题。\n* 修复了 ConstFloat 的 DTS 测试失败问题。\n* 修复了 Broadcast 的 DTS 测试失败问题。\n* 修复了 BatchMatMul 的 DTS 测试失败问题。\n    \n#### 其他变更\n\n* 更新了 Arm NN 24.08 版本的文档。\n* 审查并更新了 24.08 版本的相关文档。\n* 为 evaluate_network.sh 添加了 Android 支持。\n* 添加了 Gemmlowp，用于处理小数值的定点运算。\n* 将 Arm NN 代码库迁移至使用 CMake 3.22。\n* 为 Execute Network 添加了 Numpy 支持。\n\n\n#### ABI\u002FAPI 变更\n\nArmNN Core（libarmnn.so）未发生 ABI 破坏性变更，因此主版本号未变，仅次要版本号有所提升（33.1.0 → 33.2.0）。\n\n在 24.08 版本的实现过程中，未发生任何破坏性的 API 后端变更。\n\n\n#### 构建依赖\n| 工具 | 支持版本 |\n|-------|-------------------|\n| Git   | 2.17.1 或更高   |\n| SCons | 2.4.1（Ubuntu） 2.5.1（Debian）|\n| Cmake | 3.22.1|\n| Tensorflow | 2.15.0 |\n| Onnx | 1.6.0 |\n| Flatbuffer | 23.5.26 |\n| Protobuf | 3.12.0 |\n| Android NDK | r26b |\n| mapbox\u002Fvariant | 1.2.0 |\n| cxxopts | 3.1.1 |\n| doctest | 2.4.6 |\n| fmt | 7.0.1 |\n| ghc | 1.3.2 |\n| half | 1.12.0 |\n| mapbox\u002Fvariant | 1.1.0 |\n| stb | 2.16 |\n| Gemmlowp | 16e8662c34917be0065110bfcd9cc27d30f52fdf 
|\n\n\n\n","2024-08-28T09:58:24",{"id":188,"version":189,"summary_zh":190,"released_at":191},127632,"v24.05","# 摘要\n\n\n#### 新特性\n\n \n* ScatterNd 算子实现。\n  * 增加了对委托和不透明委托的支持。\n  * 增加了对序列化器和反序列化器的支持。\n  * 增加了对 TFLite 解析器的支持。\n  * 添加了端到端测试。\n  * 增加了对 CpuRef 和 GpuAcc 的支持。\n* 在 ExecuteNetwork 中添加了序列化网络的选项。\n* 添加了一个构建选项，用于在 ACL 中启用 OpenMP 调度器，并将其设置为 ACL 构建的默认调度器。\n* 向 Debug 层支持中添加了布尔数据类型。\n* 更新 TOSA Common 和 TosaRef，使其使用 TOSA v0.80。\n* 更新构建工具的 README 文件，以包含对 macOS 的支持。\n\n\n#### 错误修复\n\n* 修复了 ExecuteNetwork 在推理后崩溃的问题。\n* 修复了 CTS Float16 测试失败的问题。\n* 仅当 ARMNN_SERIALIZER 开启时才允许序列化为 armnn。\n* TosaCommon 后端\n  * 在 TosaCommon 中，修改了输入唯一名称的生成方式。\n  * 修改了 CreateRescaleTosaOperator() 函数。\n  * 将 ComputeSplitAxis() 移动到 backendsCommon\u002FWorkloadUtils 中。\n  * 对于 LeakyRelu，添加了 TosaRefEndToEndTests，并在 TOSA 映射中启用了 FP16。\n  * 修复了量化 Conv2d 的 TOSA 映射问题。\n* Comparison 层的广播处理不一致。\n* 移除了量化中零缩放值的限制。\n* 修复了 fsrcnn 测试失败的问题。\n* 修复了委托 README 中的损坏链接。\n* 修复了委托和 Arm NN 执行器中的运行时内存管理问题。\n* 移除了 std::clamp 的使用。\n* 修改了语法，以允许在较旧的编译器上进行构建。\n* 进行了断言审计并移除了相关断言。 \n    \n#### 其他变更\n\n* 针对 24.08 版本中将被移除项的弃用通知。\n* 审查并更新了 24.05 版本中新增算子的文档。\n* 更新了 24.05 版本的 Arm NN 文档。\n* 更新了 Python pillow 的版本。\n* 从 Docker README 中移除了对 22.08 版本的引用。\n* 对 ExecuteNetwork 中的打印输出进行了小幅修改。\n* 在构建工具中启用了 execute network 的构建。\n* 更新了 Arm NN 构建工具脚本，以包含委托头文件和 so 文件。\n\n\n#### ABI\u002FAPI 变更\n\n在 24.05 版本的实现过程中，未发生任何破坏前端 API 的变更。\n\n在 24.05 版本的实现过程中，未发生任何破坏后端 API 的变更。\n\n#### 构建依赖\n| 工具 | 支持版本 |\n|-------|-------------------|\n| Git   | 2.17.1 或更高版本   |\n| SCons | 2.4.1（Ubuntu） 2.5.1（Debian）|\n| Cmake | 3.19.0（Ubuntu）和 3.19.0（Debian）|\n| Tensorflow | 2.15.0 |\n| Onnx | 1.6.0 |\n| Flatbuffer | 23.5.26 |\n| Protobuf | 3.12.0 |\n| Android NDK | r26b |\n| mapbox\u002Fvariant | 1.2.0 |\n| cxxopts | 3.1.1 |\n| doctest | 2.4.6 |\n| fmt | 8.3.0 |\n| ghc | 1.3.2 |\n| half | 1.12.0 |\n| mapbox\u002Fvariant | 1.2.0 |\n| stb | 2.16 |\n| xxd | 1.10 |","2024-05-30T15:20:30",{"id":193,"version":194,"summary_zh":195,"released_at":196},127633,"v24.02","# 
摘要\n\n\n#### 新特性\n\n \n* ArmNN 至 TOSA 后端：\n    * 增加了 LeakyRelu 激活函数支持\n    * 增加了量化支持\n    * 增加了最大值操作支持\n    * 增加了拆分操作支持\n    * 增加了最近邻插值缩放操作支持\n* GpuFsa 后端（动态融合）：\n    * 增加了 RESIZE\u002FSCALE 操作支持\n    * 增加了 CAST 操作支持\n    * 增加了 2D 池化操作支持\n    * 增加了 SUB 操作支持\n    * 增加了 ADD 操作支持\n    * 增加了深度可分离 2D 卷积操作支持\n    * 增加了 2D 卷积操作支持\n* 更新至 Android NDK r26b。\n* 更新至 TensorFlow 2.15。\n* 为 CL、Neon 和 Ref 后端添加了优化，可在可能的情况下移除 reshape 算子。\n\n\n#### Bug 修复\n\n* 移除了可能导致编译错误的隐式符号转换。\n* 修复了仅在性能分析时出现的内存泄漏问题，该问题与 Resize 工作负载的 align corners 参数设置为 true 有关。\n* 修复了使用 C++14 编译器时的构建失败问题。\n* 修复了针对 Android 目标平台构建时的构建工具错误。\n\n\n#### 其他变更\n\n* Delegate 单元测试现在仅针对正在构建的后端进行编译。\n* 增加了两层和三层 MaxPool2d 的端到端测试。\n* 在 ExecuteNetwork 中增加了对 Arm NN Delegate 的 dot 图序列化支持。\n\n\n#### ABI\u002FAPI 变更\n\n在 24.02 版本的实现过程中，未发生任何破坏前端 API 的变更。\n\n在 24.02 版本的实现过程中，未发生任何破坏后端 API 的变更。\n\n注：Arm NN AAR 文件支持的最低 API 级别为 27。\n\n\n#### 构建依赖项\n| 工具 | 支持版本 |\n|-------|-------------------|\n| Git   | 2.17.1 或更高   |\n| SCons | 2.4.1（Ubuntu） 2.5.1（Debian）|\n| Cmake | 3.19.0（Ubuntu）和 3.19.0（Debian）|\n| Tensorflow | 2.15.0 |\n| Onnx | 1.6.0 |\n| Flatbuffer | 23.5.26 |\n| Protobuf | 3.12.0 |\n| Android NDK | r26b |\n| mapbox\u002Fvariant | 1.2.0 |\n| cxxopts | 3.1.1 |\n| doctest | 2.4.6 |\n| fmt | 8.3.0 |\n| ghc | 1.3.2 |\n| half | 1.12.0 |\n| mapbox\u002Fvariant | 1.2.0 |\n| stb | 2.16 |\n| xxd | 1.10 |\n\n","2024-02-22T13:27:55",{"id":198,"version":199,"summary_zh":200,"released_at":201},127634,"v23.11","# 摘要\n\n\n#### 新特性\n\n* 在 CpuRef 中添加对 BROADCAST_TO 层的支持，并在其后紧跟 ElementWise 层时将其移除。\n* 在 CpuAcc 中新增一种优化，可将 Add+Mul+Add+（可选 Relu）层进行融合。\n* 在 CpuRef、CpuAcc 和 GpuAcc 中添加对 GELU 激活层的支持。\n* 将 Arm NN 升级至 Tensorflow 2.14 版本。\n* 添加对 Signed64 的支持。\n* 在 Cast 层中添加对 Signed64 数据类型的支持。\n* 添加用于评估网络性能的脚本。\n* 添加 ReverseV2 CL 和 Neon 工作负载。\n\n\n#### TfLite 解析器\n\n* 添加对 BROADCAST_TO 层的支持。\n* 添加对 GELU 激活层的支持。\n* 更新 TfLite 解析器，使其忽略 VALIDATION: subgraphs。\n\n\n#### Arm NN 序列化\u002F反序列化器\n\n* 添加对 GELU 激活层的支持。\n\n\n#### Bug 修复\n\n* 修复 
UnidirectionalSequenceLstm。\n* 修复在 Support Library 中进行转换时的权重检查问题。\n* 修复 Armnn 中不安全的 Memcpy 使用。\n* 修复 gcc9 中 Profiling 测试的 -Wno-sign-conversion 警告。\n* 修复 NeonBackend 激活融合优化中缺少 ElementwiseBinary 的问题。\n* 修复 Reshape 和 concat 导致结果无效的问题。\n* 移除量化中的不必要的 Prelu 限制。\n* 移除量化中的不必要的 Square Difference 限制。\n\n\n#### 其他变更\n\n* 更新 Arm NN Execute Network 应用程序的 --help 说明。\n* 在 ArmNN 中引入 clang-format 脚本。\n* 移除 ConstTensorAsInputs 层的 Profiling 详细信息。\n* 安装缺失的 Profiling 头文件。\n* 从反序列化代码中移除 ASSERT。\n* 从 armnnUtils 代码中移除 ASSERT。\n* 从 shim 代码中移除 ASSERT。\n* 更新文档，将 C++ 版本更正为 C++ 17。\n* 移除 NEON CONV2D 中对非常量偏置的显式限制，允许 Arm Compute Library 自行处理。\n\n\n#### ABI\u002FAPI 变更\n\n在 23.11 版本的实现过程中，发生了一些前端 API 变更，用户在升级前应注意。由于这些变更，我们已按照语义版本控制规范，将 ARMNN_VERSION 提升至 33.1.0，OPAQUE_DELEGATE_VERSION 提升至 2.0.0。\n\n| 功能                    | SHA                                     | Gerrit 审核 | 相应的 ABI\u002FAPI 变更 |\n|----------------------------|-----------------------------------------|---------------|---------------------------|\n| 向 Opaque Delegate 添加 ArmNNSettings| 3e4b60897bde2ad7ab5b730c7c5d727e41cc0eef|[https:\u002F\u002Freview.mlplatform.org\u002Fc\u002Fml\u002Farmnn\u002F+\u002F10493]((https:\u002F\u002Freview.mlplatform.org\u002Fc\u002Fml\u002Farmnn\u002F+\u002F10493))| 发生了两项变更：\u003Cbr> \u003Cul>\u003Cli>**TfLiteArmnnOpaqueDelegateCreate** 函数的签名发生了变化：之前为：TfLiteOpaqueDelegate* TfLiteArmnnOpaqueDelegateCreate(const void* settings); 现在为：TfLiteOpaqueDelegate* TfLiteArmnnOpaqueDelegateCreate(armnnDelegate::DelegateOptions options); \u003C\u002Fli>\u003Cbr> \u003Cli>结构体 **ArmnnDelegatePlugin** 的大小有所增加，因为新增了一个私有成员：armnnDelegate::DelegateOptions m_delegateOptions；\u003C\u002Fli>\u003C\u002Ful>|\n\n在 23.11 版本的实现过程中，未发生任何破坏后端 API 的变更。","2023-11-23T12:08:16",{"id":203,"version":204,"summary_zh":205,"released_at":206},127635,"v23.08","# 摘要\n\n\n#### 新特性\n\n \n* 在 `CpuRef`、`CpuAcc` 和 `GpuAcc` 中新增对 `tile` 运算的支持。\n* 在 `CpuRef` 中新增对 `reverse_v2` 运算的支持。\n* 在 `CpuRef`、`CpuAcc` 和 `GpuAcc` 中新增 `pow` 和 `squared_difference` 
作为 `ElementWiseBinary` 层。\n* 向 `TypeUtils.hpp` 中添加了 `squared_difference`、`power` 和 `ceil`。\n* 为以下层启用了动态\u002F非常量偏置：\n     * `CpuAcc` 和 `GpuAcc` 中的全连接层\n     * `CpuAcc` 和 `GpuAcc` 中的三维卷积层\n     * `GpuAcc` 中的深度可分离卷积层\n* 为常量层的 `.dot` 文件添加了 `DataType`。\n* 向 `.dot` 文件中添加了 BinaryElementwiseOperation。\n* 向 `ExecuteNetwork` 中添加了一个 `FileComparisonExecutor`。\n* 向 `InputSlot` 添加了一个可选的 `TensorInfo`。\n* 为 `CpuAcc` 和 `GpuAcc` 的 `batch_to_space` 和 `space_to_batch` 添加了三维张量支持。\n* 增加了对半精度浮点无穷大值的检查，并提供了后端支持（FP16）。\n* 增加了后端优化，以在可能的情况下移除 `reshape` 层。\n* 在 `NeonStridedSliceWorkload` 中为张量添加了数据布局。\n* 为工作负载添加了名称。\n* 在所有后端中启用了 `slice` 的端到端测试，并在 `CpuRef` 中启用了 `Signed32`。\n* 向 `ViewsDescriptor` 中添加了 `axis`。\n* 重构了 ElementBinaryOps，以使用 ElementBinaryLayer。\n\n\n#### TfLite 解析器\n\n* 向 TFLite 解析器中添加了 `reverse_v2` 支持。\n* 向 TFLite 解析器中添加了 `tile`。\n* 在 TFLite 解析器中将 `square` 实现为 `mul`。\n* 在向 TFLite 解析器中添加融合激活之前，先检查 `options != null`。\n* 修复了 TFLite 解析器中某些模型的段错误。\n\n\n#### Arm NN 序列化\u002F反序列化器：\n\n* 向序列化\u002F反序列化器中添加了 `tile`。\n* 向序列化\u002F反序列化器中添加了 `reverse_v2`。\n\n\n#### 支持库\n* 向支持库中添加了 `reverse_v2`。\n* 向支持库中添加了 `tile`。\n* 向支持库中添加了缓存大小检查。\n\n\n#### 错误修复\n\n* 修复了 `unidirectional_sequence_lstm` 在 `CL` 和 `Neon` 上的验证错误。\n* 修复了使用 TFLite 执行器运行时 `ExecuteNetwork` 的问题。\n* 将 `Gather` 参考工作负载中的断言替换为异常。\n* 引入修复，明确指出应包含的正确头文件（继之前的弃用警告之后）。\n* 修复了 Arm NN Doxygen 中的 XML 解析错误。\n* 修复了 `-Werror=unused-result` 错误。\n* 对 `ExecuteNetwork` 进行了修复，解决了在使用 `-T delegate` 标志时 `--output-network-details-only` 不起作用的问题。\n* 引入修复，解决了交叉编译构建中的重复定义问题。\n* 修复了支持库中 `Concat` 的排列参数错误。\n* 移除了某些模型的不必要的警告。\n* 引入修复，使 `SplitterLayer` 能够正确使用覆盖的 `TensorInfos`。\n* 对一些因使用子张量而导致错误的情况进行了修复。\n* 修复了由于缺少 `printf` 参数而引起的读取内存访问问题。\n* 引入修复，解决了动态后端构建失败的问题。\n* 修复了维度特异性与维度数量不匹配的问题。","2023-08-24T14:03:55",{"id":208,"version":209,"summary_zh":210,"released_at":211},127636,"v23.05","# 摘要\n\n\n#### 新特性\n\n \n* 在 CpuAcc 和 GpuAcc 中为 FullyConnected 工作负载添加了对动态权重的支持。\n* 在 CpuAcc 和 GpuAcc 的 BatchToSpaceND 工作负载中添加了对裁剪操作的支持。\n* 在 CpuAcc 和 GpuAcc 的 Batch MatMul 工作负载中增加了对 int8 
类型的支持，并将用于 Fp32 计算的 Compute Library 内核进行了替换。\n* 添加了 Opaque TfLite Delegate，其算子覆盖范围与现有的经典 TfLite Delegate 相同。更多信息请参阅下方的 TfLite Delegate 部分。\n\n\n#### TfLite 解析器\n\n* 增加了对 CEIL 和 SPACE_TO_DEPTH 算子的支持。\n* 修复了 ParseSqueeze 中未正确记录计算输出形状的 bug。\n* 修复了 ParseTransposeConv2d 在 output_shape 不是常量时出现的段错误。\n* 修复了 ParseMean 中负轴被错误读取的 bug。\n* 如果指定了输出形状，则使用该形状计算转置卷积的显式填充。\n\n\n#### ONNX 解析器\n\n* 增加了对维度大于 2 的 MatMul\u002FFullyConnected 的支持。\n\n\n#### Arm NN 序列化\u002F反序列化器：\n\n* 增加了对 CEIL 的支持。\n\n\n#### Bug 修复\n\n* 修复了 ExecuteNetwork 中 compare-output 输出功能的问题。\n* 修复了 gcc 13 编译器报错问题。\n\n#### 其他变更\n\n* 添加了 ElementwiseBinaryLayer，以替代 Add、Div、Sub、Maximum、Mul 和 Minimum 层。\n* 更新了 Android NDK 构建指南（BuildGuideAndroidNDK.md）。\n* 将默认量化参数 scale 设置为 1.0，而非 0.0。\n* Fp16ToFp32 和 Fp32ToFp16 转换工作负载在 CpuAcc 后端可用时，现采用 arm_compute::NECast 实现，通常速度会更快。\n* 在构建工具中新增了 Opaque TfLite Delegate 构建选项。\n\n#### 已知问题\n\n* GPU 上 Dma Buf 内存导入存在间歇性问题。此问题已在 Mali 驱动程序 r30p0 中修复。\n* 使用 int8 数据类型在 Arm Mali-G77 GPU 上运行 Inception v3 时，可能存在相对于 v20.08 的性能退化。目前正在调查中。\n\n#### ABI\u002FAPI 变更\n\n在 23.05 版本的实现过程中，发生了一些前端 API 变更，用户在升级前应注意。请注意：ArmNN Core（libarmnn.so）并未发生 ABI 破坏性变更，因此主版本号未变，仅次要版本号有所提升（32.0.0 → 32.1.0）。\n\n| 功能                    | SHA                                     | Gerrit 审查 | 结果 ABI\u002FAPI 变更 |\n|----------------------------|-----------------------------------------|---------------|---------------------------|\n| 为 Delegate Options 实现 Pimpl 设计模式| 1bae865fecf99f25cd2d58390e0cf08467a22b4f|[https:\u002F\u002Freview.mlplatform.org\u002Fc\u002Fml\u002Farmnn\u002F+\u002F9358]((https:\u002F\u002Freview.mlplatform.org\u002Fc\u002Fml\u002Farmnn\u002F+\u002F9358))| **DelegateOptions** 类的大小已由 488 字节变为 8 字节。多个函数的参数栈布局已改变，因此位于栈较高位置的参数可能会被应用程序错误初始化。**Delegate** 类的大小已由 552 字节变为 72 字节。**m_Options** 成员的大小已由…","2023-05-22T09:27:24",{"id":213,"version":214,"summary_zh":215,"released_at":216},127637,"v23.02","#### New Features\r\n \r\n* Arm NN TOSA Backend\r\n  * Added Concatenation support to TOSA Reference Backend.\r\n  * 
Added Constant layer support to TOSA Reference Backend.\r\n  * Added Convolution 2D support to TOSA Reference Backend.\r\n  * Added Pooling2d support to TOSA Reference Backend.\r\n  * Added Reshape support to TOSA Reference Backend.\r\n  * Added RSqrt support to TOSA Reference Backend.\r\n  * Added Slice support to TOSA Reference Backend.\r\n  * Added Transpose Convolution 2D support to TOSA Reference Backend.\r\n  * Added Subtraction and Multiplication support to TOSA Reference Backend.\r\n* Added support for GpuAcc BatchMatMul with FP32.\r\n* Extended BatchMatMul support for 4D tensors in GpuAcc.\r\n\r\n\r\n#### ONNX Parser\r\n* Provided a CreateNetworkFromBinary method for the ONNX parser.\r\n\r\n\r\n#### TfLite Parser:\r\n* Fixed issue in ParseReshape where the targetShape wasn't always calculated correctly.\r\n* Fixed issue in ParseFullyConnected where the wrong name was used for the ReshapeLayer.\r\n* Added an ExpandDims to the FullyConnected to ensure that we reshape the output correctly.\r\n\r\n\r\n\r\n#### Bug Fixes\r\n* Fixed bug in ExecuteNetwork where input files were not used when input names were not given.\r\n* Fixed bug in delegate profiling in ExecuteNetwork with multiple iterations.\r\n* Fixed bug for CpuAcc and GpuAcc. 
The BuildArmComputePermutationVector() function needed to be rewritten to account for all possible permutation vectors.\r\n* Fixed an ExecuteNetwork unhandled exception when using option --import-inputs-if-aligned.\r\n* Fixed Arm NNAPI Support Library to fail gracefully if the device is unavailable.\r\n* Fixed edge cases where some permute vectors for Arm Compute were not converted correctly.\r\n* Fixed bug where GPU backend options were not being correctly passed by our delegate.\r\n* Fixed bug when converting Constants with Per-Axis Quantization.\r\n* Fixed bug where a call to SubstituteSubgraph on a working copy of a subgraph in Optimize failed.\r\n* Fixed segfault in ExecuteNetwork when no operator is supported by Arm NN.\r\n* Fixed bug for slot replacement during UpdateSubgraphViewSlotPointers.\r\n* Fixed bug for ExecuteNetwork using delegate when the output is a boolean from a comparison layer.\r\n\r\n\r\n#### Other Changes\r\n* Disabled BF16-Turbo-Mode and removed conversion layers.\r\n* Added Arm NN include directory into build-tool output.\r\n* Code improvement through removal of unused includes.\r\n* Optimization of IsLayerSupported to reduce calls to it.\r\n* Removed deprecated code due to be removed in 23.02.\r\n* Changed Arm NN Support Library to use static libraries instead of object libraries.\r\n* Added option for a static build of Execute Network.\r\n* Improved error handling when ExecuteNetwork creates a directory when the -F option is used.\r\n* Changed ArmNNExecutors to now share a single IRuntime, which allows ExecuteNetwork to create and run multiple Executors instead of one.\r\n* Added documentation relating to multithreading.\r\n\r\n\r\n#### ABI\u002FAPI Changes\r\nThe following **front-end** API changes have occurred during the implementation of 23.02 that users should be aware of before upgrading.\r\n\r\n| Feature                    | SHA                                     | Gerrit Review | Resultant ABI\u002FAPI changes 
|\r\n|----------------------------|-----------------------------------------|---------------|---------------------------|\r\n| Optimize the calling of IsLayerSupported().  | 5383767a7a759c867235ab66bd71f88281e3bd06 | [https:\u002F\u002Freview.mlplatform.org\u002Fc\u002Fml\u002Farmnn\u002F+\u002F8742]((https:\u002F\u002Freview.mlplatform.org\u002Fc\u002Fml\u002Farmnn\u002F+\u002F8742)) | In class IConnectableLayer: Pure virtual method **SetBackendId (BackendId const&)** has been added to this class. Applications will not provide the implementation for this pure virtual method and therefore cause a crash in the library trying to call this method. The layout of v-table has been changed. Call of any virtual method at higher position in this class or its subclasses may result in crash or incorrect behavior of applications. |\r\n| When creating multiple Executors only the last one works fine | 5446a4d6d02002515fc58fafe33d74ae6dca5787 | [https:\u002F\u002Freview.mlplatform.org\u002Fc\u002Fml\u002Farmnn\u002F+\u002F8997]((https:\u002F\u002Freview.mlplatform.org\u002Fc\u002Fml\u002Farmnn\u002F+\u002F8997)) | In class Delegate: Size of this type has been changed from 688 bytes to 680 bytes. The fields or parameters of such data type may be incorrectly initialized or accessed by old client applications. Type of field m_Runtime has been changed from armnn::IRuntimePtr (16 bytes) to armnn::IRuntime* (8 bytes). Size of the inclusive type has been changed |\r\n| Fix incorrect last layer in Types.hpp | 6701daf754efbadcf95c969eee1ba57320763d84 | [https:\u002F\u002Freview.mlplatform.org\u002Fc\u002Fml\u002Farmnn\u002F+\u002F8944]((https:\u002F\u002Freview.mlplatform.org\u002Fc\u002Fml\u002Farmnn\u002F+\u002F8944)) | In enum LayerType: Value of member LastLayer has been changed from 66 to 71. Applications may execute a wrong branch of code in the library and therefore change the behavior. 
|\r\n| Change to MemorySource to keep it usable as a bit mask | 1cebf4978bf7723aaf0501de5fb80a6ef77066bf | [https:\u002F\u002Freview.mlp","2023-03-09T17:13:56",{"id":218,"version":219,"summary_zh":220,"released_at":221},127638,"v22.11.01","## Summary\r\n\r\nThis is a patch release to fix an issue in the Arm NN Support Library encountered on Android phones where the OpenCL libraries could not be detected.\r\n\r\nIn this case the 22.11 release was detecting the issue and throwing an exception, but the TensorFlow Lite runtime was expecting an error code, so fallback to the runtime was failing.\r\n\r\nIn this release an error code is returned when a misconfigured\u002Fmissing OpenCL installation is encountered, and the TensorFlow Lite runtime takes over execution of the graph as expected.\r\n\r\nThis 22.11.01 release contains all the features of the Arm NN 22.11 release. Please find the release notes for 22.11 here: https:\u002F\u002Fgithub.com\u002FARM-software\u002Farmnn\u002Freleases\u002Ftag\u002Fv22.11.","2023-01-23T14:58:05",{"id":223,"version":224,"summary_zh":225,"released_at":226},127639,"v22.11","## Summary\r\n\r\n#### New Features\r\n \r\n* ArmNN to TOSA backend:\r\n  * Added TOSA Mappings backbone structure with support for Addition operator (Float32).\r\n  * Implemented simple TOSA Reference Backend skeleton.\r\n  * Implemented TosaRefBackend::OptimizeSubgraphView.\r\n  * Integrated TOSA Serialization Library into Arm NN.\r\n  * Integrated TOSA Reference Model into Arm NN.\r\n* BATCH_MATMUL:\r\n  * Added adjoint and transpose parameters to BATCH_MATMUL layer and CpuRef workload.\r\n  * Added support for BATCH_MATMUL to Arm NN Support Library.\r\n  * Added support for BATCH_MATMUL FP32 to CpuAcc.\r\n  * Added BATCH_MATMUL end-to-end tests.\r\n* Updated to Android NDK r25.\r\n* Updated to TensorFlow 2.10 and Flatbuffers 2.0.6.\r\n\r\n\r\n#### TfLite Parser\r\n* Added BATCH_MATMUL to TFLite Parser.\r\n* Fixed bug in TFLite Parser failing to prepare model due to 
unspecified size buffer data for SLICE operator.\r\n* In the TFLite Parser we observed that in the BATCH_MATMUL layer, when the adjoint parameter was true, the mathematical calculation performed was a transpose. So we linked adjoint from TFLite to transpose in ArmNN.\r\n* Added support for RESHAPE when output 'shape_signature' parameter contains a value of -1 in TFLite Parser.\r\n\r\n\r\n#### ArmNN Serializer\u002FDeserializer\r\n* Added support for BATCH_MATMUL to Serializer\u002FDeserializer.\r\n\r\n#### Bug Fixes\r\n* Fixed bug in SubgraphView::SubstituteSubgraph where IOutputSlots were incorrectly overridden.\r\n* Fixed bug in ExecuteNetwork when the number of iterations and input files do not match.\r\n* Updated SubgraphView Selector to give deterministic results.\r\n* Fixed bug in ArmNNExecutor where errors from LoadNetwork were being ignored.\r\n* Fixed bug with debug mode not working correctly with Constant Tensors as Inputs.\r\n* Fixed incorrect kernel measurement in profiling output.\r\n* Fixed ExecuteNetwork for multiple outputs.\r\n* Made the AllowExpandedDims option work.\r\n* Fixed output format issue for int8 when using -w in ExecuteNetwork.\r\n\r\n\r\n#### Other Changes\r\n* Added runtime options to Doxygen.\r\n* Added message deprecating the use of the master branch. The main branch is now used.\r\n* Removed deprecated code due to be removed in 22.08, as we could not do this in 22.08.\r\n* Removed deprecated code due to be removed in 22.11.\r\n* Delayed the removal of deprecated weights and bias by one release.\r\n* Generalized get_compute_library.sh usage.\r\n* Used ARMNN_VERSION for Support Library version string.\r\n* Removed aarch32 build from build-tool.\r\n* Forward declared ILocalPacketHandlerSharedPtr in IRuntime.hpp.\r\n* Used stricter file extension check in CreateParser.\r\n\r\n**Note**: Following the upgrades to TensorFlow 2.10 and Flatbuffers 2.0.6, a compiler that supports C++17 is now required. This will prevent compilation on some older operating systems, e.g. 
Debian 9.\r\n\r\n#### ABI\u002FAPI Changes\r\nThe following **front-end** API changes have occurred during the implementation of 22.11 that users should be aware of before upgrading.\r\n.\r\n| Feature                    | SHA                                     | Gerrit Review | Resultant ABI\u002FAPI changes |\r\n|----------------------------|-----------------------------------------|---------------|---------------------------|\r\n| Remove deprecated code 22.08 | 48f9d5db00a245d08317130b10171337df0c1142 | [https:\u002F\u002Freview.mlplatform.org\u002Fc\u002Fml\u002Farmnn\u002F+\u002F8167]((https:\u002F\u002Freview.mlplatform.org\u002Fc\u002Fml\u002Farmnn\u002F+\u002F8167)) | **Removed Symbols**: INetwork::AddConvolution2dLayer ( struct Convolution2dDescriptor const& convolution2dDescriptor, ConstTensor const& weights, Optional\u003CConstTensor>const& biases, char const* name ). INetwork::AddDepthwiseConvolution2dLayer ( struct DepthwiseConvolution2dDescriptor const& convolution2dDescriptor, ConstTensor const& weights, Optional\u003CConstTensor>const& biases, char const* name ) |\r\n| Implement simple TOSA Reference Backend skeleton| ae8a6f528151a9e88236a92877be1e99aea69658 | [https:\u002F\u002Freview.mlplatform.org\u002Fc\u002Fml\u002Farmnn\u002F+\u002F8082]((https:\u002F\u002Freview.mlplatform.org\u002Fc\u002Fml\u002Farmnn\u002F+\u002F8082)) | In class MockWorkloadFactory the following has changed: \u003Cli>The relative position of virtual method CreateInput ( InputQueueDescriptor const&, struct WorkloadInfo const& ) const has been changed from 5 to 8. 
\u003C\u002Fli>\u003Cli> The relative position of virtual method CreateWorkload ( enum LayerType, struct QueueDescriptor const&, struct WorkloadInfo const& ) const has been changed from 8 to 7.\u003C\u002Fli> \u003Cli> The relative position of virtual method CreateTensorHandle ( TensorInfo const&, enum DataLayout, bool const ) const has been changed from 7 to 6.\u003C\u002Fli> \u003Cli> The relative position of virtual method CreateTensorHandle ( TensorInfo const&, bool const ) const has been changed from 6 to 5.\u003C\u002Fli>\u003Cli> The layout of v-table has been changed. Call of these virtual methods may result in crash or incorrect behavior of applications.\u003C\u002Fli> |\r\n| Fix AllowExpandedDims option | 16c76d5db629d3ef7e4cb143bfa7e1d717e1d492 | [https:\u002F\u002Freview.mlplatform.org\u002Fc\u002Fml\u002Farmnn\u002F+\u002F8419](https:\u002F\u002Freview.mlplatform.org\u002Fc\u002Fml\u002Farmnn\u002F+\u002F8419) | **Added Symbols:** \u003Cli> INetwork::Create ( NetworkOp","2022-11-28T16:08:04",{"id":228,"version":229,"summary_zh":230,"released_at":231},127640,"v22.08","# Summary\r\n\r\n\r\n\r\n#### New Features\r\n\r\n \r\n* Added Arm NN Support Library.\r\n  * The Arm NN Support Library for Android NNAPI is a shared library which has all the functionality of existing HAL drivers for Android NNAPI.\r\n  * It is available from Android S.\r\n  * It focuses on update-ability of ML operators.\r\n  * A guide on how to build the Arm NN Support Library is available at armnn\u002Fshim\u002FBuildGuideShimSupportLibrary.md.\r\n  * SLTS (Support Library Test Suite) compliance.\r\n* Added support for Batch MatMul in CpuRef.\r\n\r\n\r\n#### TfLite Parser\r\n\r\n* Added support for LOG.\r\n* Added support for SIN.\r\n\r\n#### ExecuteNetwork App Changes:\r\n\r\n* Refactor of ExecuteNetwork. 
Now input name, input type, output name, output type and model type are read from the model.\r\n\r\n#### Arm NN Build Tool:\r\n\r\n* Introduced the Arm NN Build Tool, which consists of an official Arm NN Dockerfile for building Arm NN and Arm Compute Library (ACL).\r\n* This tool replaces the majority of our existing build guides as a user-friendly way to build Arm NN (and its dependencies) from scratch.\r\n* Tested on x86_64 (Intel) and aarch64 (Arm) build hosts for the Ubuntu platform.\r\n* Currently supports targeting Linux devices (from Ubuntu 18.04 onwards) on x86_64, aarch32 and aarch64 architectures.\r\n\r\n#### Bug Fixes\r\n\r\n* Models in the .armnn format (serialized models) were failing in 22.05; this problem has been solved by adding the constant layers before the operator layers.\r\n* Fixed Neon quantization bug when folding padding into average pool 2D.\r\n* Fixed segmentation fault when running --bf16-turbo-mode on FPGA.\r\n\r\n#### Other Changes\r\n\r\n* General documentation refactor and updates.\r\n* Added LICENSE.spdx for Arm NN.\r\n* Delayed backend deprecation from 22.11 to 23.08.\r\n\r\n\r\n#### ABI\u002FAPI Changes\r\n\r\nThe following **front-end** API changes have occurred during the implementation of 22.08 that users should be aware of before upgrading.\r\n\r\n| Feature                    | SHA                                     | Gerrit Review | Resultant ABI\u002FAPI changes |\r\n|----------------------------|-----------------------------------------|---------------|---------------------------|\r\n| Import inputs but don't export outputs fails | 626bd90378670eb5fd76f94526395430b752ad9e | [https:\u002F\u002Freview.mlplatform.org\u002Fc\u002Fml\u002Farmnn\u002F+\u002F7661](https:\u002F\u002Freview.mlplatform.org\u002Fc\u002Fml\u002Farmnn\u002F+\u002F7661) | Field **m_ExportEnabled** has been added to type **OptimizerOptions**. This field will not be initialized by old clients that have not been recompiled.  
|\r\n| Get non-const IConnectableLayer from I\u002FO slots| 09fa24d2f4b0177d55800bd01ec52c337701ef0a | [https:\u002F\u002Freview.mlplatform.org\u002Fc\u002Fml\u002Farmnn\u002F+\u002F7835]((https:\u002F\u002Freview.mlplatform.org\u002Fc\u002Fml\u002Farmnn\u002F+\u002F7835)) | Pure virtual method **GetOwningIConnectableLayer ( )** has been added to classes **IOutputSlot** and **IInputSlot**. \u003Cli> Applications will not provide the implementation for this pure virtual method and therefore cause a crash in the library trying to call this method. \u003C\u002Fli> \u003Cli> The layout of v-table has been changed. Call of any virtual method at higher position in this class or its subclasses may result in crash or incorrect behavior of applications. \u003C\u002Fli> |\r\n| Remove deprecated code 22.05 | 4d2eec0436f75d526c2ec25623ad73c8d1ee9ac3 | [https:\u002F\u002Freview.mlplatform.org\u002Fc\u002Fml\u002Farmnn\u002F+\u002F7712]((https:\u002F\u002Freview.mlplatform.org\u002Fc\u002Fml\u002Farmnn\u002F+\u002F7712)) | **Removed Symbols:** \u003Cli> IsCapabilitySupported ( BackendId const& backend, enum BackendCapability capability ) FullyConnectedDescriptor::GetNumViews ( ) const INetwork::Accept ( ILayerVisitor& visitor ) const \u003C\u002Fli> \u003Cli> Pure virtual method Accept ( ILayerVisitor& ) const has been removed from class IConnectableLayer. \u003C\u002Fli> \u003Cli> The layout of v-table has been changed. Call of this virtual method or any virtual method at higher position in this class or its subclasses may result in crash or incorrect behavior of applications.\u003C\u002Fli>|\r\n| Modified SubgraphView returned by GetWorkingCopy() | cea3d49619a87ffb81422c7e9383368baa93a408 | [https:\u002F\u002Freview.mlplatform.org\u002Fc\u002Fml\u002Farmnn\u002F+\u002F7852]((https:\u002F\u002Freview.mlplatform.org\u002Fc\u002Fml\u002Farmnn\u002F+\u002F7852)) | Pure virtual method GetSlotIndex ( ) const has been added to class IInputSlot. 
\u003Cli>  Applications will not provide the implementation for this pure virtual method and therefore cause a crash in the library trying to call this method. \u003C\u002Fli> \u003Cli> The layout of v-table has been changed. Call of any virtual method at higher position in this class or its subclasses may result in crash or incorrect behavior of applications.\u003C\u002Fli> |\r\n| Update the async api to use ExecutionData | 21a6a1a5b72907573eade6d232bfaf45a4c14c52 | [https:\u002F\u002Freview.mlplatform.org\u002Fc\u002Fml\u002Farmnn\u002F+\u002F7878]((https:\u002F\u002Freview.mlplatform.org\u002Fc\u002Fml\u002Farmnn\u002F+\u002F7878)) | **experimental::IWorkingMemHandle** Pure virtual method GetExecutionDataAt ( unsigned int ) has been added to this class. \u003Cli> Applications will not provide the implementation for this pure virtual method and therefore cause a crash in the library trying to call this method. \u003C\u002Fli> \u003Cli> The layout of v-table has been change","2022-08-25T15:43:10",{"id":233,"version":234,"summary_zh":235,"released_at":236},127641,"v22.05.01","# Summary\r\n\r\n#### New Features\r\n\r\nThis is a patch release of 22.05 where we have implemented Pooling3d custom operator for ArmNN TfLite Delegate. 
This feature is available in the 22.05 release branch itself (branches\u002Farmnn_22_05) and in the tag created for patch release v22.05.01.","2022-06-20T08:17:52",{"id":238,"version":239,"summary_zh":240,"released_at":241},127642,"v22.05","# Summary\r\n\r\n\r\n\r\n#### New Features\r\n\r\n \r\n* ArmnnTestUtils is now versioned and under ABI compliance checker\r\n* Added support for Int32 CONCATENATION layer for CpuRef\r\n* Added support for Float32 Unidirectional Sequence LSTM layer for CpuAcc and GpuAcc\r\n* Added support for GatherNd for CpuRef,  CpuAcc and GpuAcc\r\n* Added support for SQRT for CpuAcc and GpuAcc\r\n* Added support for Depthwise Convolution2d ConstTensorsAsInput for CpuRef,  CpuAcc and GpuAcc\r\n* Added support for Conv2d ConstTensorsAsInput for CpuRef,  CpuAcc and GpuAcc\r\n* Added support for Fully Connected ConstTensorsAsInput for CpuAcc and GpuAcc\r\n* Added support for MaxPool3D and AveragePool3D for CpuAcc and GpuAcc\r\n* Added support for L2Pooling3D for GpuAcc\r\n* Added support for UnidirectionalLSTM for CpuAcc\r\n* ConstTensorsAsInput: Optimizer Fix - FuseBatchNorm\r\n* ConstTensorsAsInput: Optimizer Fix - FoldPadIntoConvolution2d\r\n* ConstTensorsAsInput: Optimizer Fix - Fp32ToBf16 optimization\r\n\r\n#### TfLite Parser\r\n\r\n* Added support for GatherNd\r\n* Added support for FloorDiv\r\n* Added support for UnidirectionalLSTM\r\n* Do not create Floor for FloorDiv layer when the data type is int32\r\n\r\n\r\n#### ArmNN Serializer\u002FDeserializer\r\n\r\n* Added support for GatherNd\r\n\r\n#### ExecuteNetwork App Changes:\r\n\r\n* Added Reuse IO Buffers mode\r\n* Profiling details weights and bias JSON keys deprecated. 
Will be removed in 22.08.\r\n\r\n\r\n#### Bug Fixes\r\n\r\n* Fixed crash in profiling.\r\n* Fixed the issue with running the SimpleSample app on Raspberry Pi.\r\n* Removed MockBackend.hpp from armnn\u002Fsrc\u002Fbackends\u002FbackendsCommon\u002Ftest\u002F to solve problems when using Visual Studio on Windows.\r\n* Fixed segfault in RefDepthwiseConvolution2d workload.\r\n\r\n#### Other Changes\r\n\r\n* ArmNN Baremetal\r\n  * Changed the namespace from armnn::profiling to arm::pipe.\r\n\r\n\r\n\r\n#### ABI\u002FAPI Changes\r\n\r\nThe following **front-end** API changes have occurred during the implementation of 22.05 that users should be aware of before upgrading.\r\n\r\n| Feature                    | SHA                                     | Gerrit Review | Resultant ABI\u002FAPI changes |\r\n|----------------------------|-----------------------------------------|---------------|---------------------------|\r\n| Change the namespace from armnn::profiling to arm::pipe| 5aa9fd7ac6bf8dad576fa4a0a32aa3dae98d11ab | [https:\u002F\u002Freview.mlplatform.org\u002Fc\u002Fml\u002Farmnn\u002F+\u002F7222](https:\u002F\u002Freview.mlplatform.org\u002Fc\u002Fml\u002Farmnn\u002F+\u002F7222) | \u003Cli> Pure virtual method GetOwningIConnectableLayer( ) const has been added to class IOutputSlot. Applications will not provide the implementation for this pure virtual method and therefore cause a crash in the library trying to call this method. \u003C\u002Fli>\u003Cli>The layout of v-table has been changed. 
Call of any virtual method at higher position in this class or its subclasses may result in crash or incorrect behavior of applications.\u003C\u002Fli>  \u003Cli> The following functions have had a change in signature, meaning they will not be recognized by old applications: **BackendRegistry::SetProfilingService**\u003Cbr> **IRuntime::RegisterDebugCallback** \u003C\u002Fli> \u003Cli>Type of field m_LocalPacketHandlers has been changed from std::vector\u003Cstd::shared_ptr\u003Cprofiling::ILocalPacketHandler> > to std::vector\u003Cstd::shared_ptr\u003Carm::pipe::ILocalPacketHandler> > in Runtime::CreateOptions::ExternalProfilingOptions\u003C\u002Fli> \u003Cli>Type of return value has been changed from profiling::ProfilingGuid to arm::pipe::ProfilingGuid in OptimizedNetwork::GetGuid \u003C\u002Fli> |\r\n| Replace ProfilingService includes with IProfilingService.| af947729dc2aa7cdb6d4a716e2edf307710a8155 | [https:\u002F\u002Freview.mlplatform.org\u002Fc\u002Fml\u002Farmnn\u002F+\u002F7240](https:\u002F\u002Freview.mlplatform.org\u002Fc\u002Fml\u002Farmnn\u002F+\u002F7240) | The following function has had a change in signature, meaning it will not be recognized by old applications.\u003Cbr>**BackendRegistry::SetProfilingService** \u003C\u002Fli>|\r\n| Remove dependency on armnn::Exception classes from the Profiling code | f9db3efe5ce2b989b59c47056e1b84b32d2f1100 | [https:\u002F\u002Freview.mlplatform.org\u002Fc\u002Fml\u002Farmnn\u002F+\u002F7280](https:\u002F\u002Freview.mlplatform.org\u002Fc\u002Fml\u002Farmnn\u002F+\u002F7280) | Class armnn::BackendProfilingException has been moved to namespace arm::pipe; this will result in older applications not being able to find it.\u003C\u002Fli> |\r\n| Replace armnn::Optional with arm::pipe::Optional in profiling code | decd08b89565b18067d229c8c25b6f3a3333c653 | 
[https:\u002F\u002Freview.mlplatform.org\u002Fc\u002Fml\u002Farmnn\u002F+\u002F7295](https:\u002F\u002Freview.mlplatform.org\u002Fc\u002Fml\u002Farmnn\u002F+\u002F7295) | Class armnn::TimeoutException has been moved to namespace arm::pipe; this will result in older applications not being able to find it. \u003C\u002Fli> |\r\n| Add Unidirectional Sequence Lstm support to TFLite | 5880b911bf4b7fd8308c93e299d77ac78f282c19 | [https:\u002F\u002Freview.mlplatform.org\u002Fc\u002Fml\u002Farmnn\u002F+\u002F7023](https:\u002F\u002Freview.mlplatform.org\u002Fc\u002Fml\u002Farmnn\u002F+\u002F7023) | Following fields have been added to struct LstmDescriptor: \u003Cbr>**m_CellIntermediateScale**\u003Cbr>**m_ForgetIntermediateScale**\u003Cbr>**m_HiddenStateScale**\u003Cbr>**m_HiddenStateZeroPoint**\u003Cbr>**m_InputIntermediateScale**\u003Cbr>**m_OutputIntermediateScale**\u003Cbr>As a result of this size of the st","2022-05-26T10:32:36",{"id":243,"version":244,"summary_zh":245,"released_at":246},127643,"v22.02","# Summary\r\n\r\n\r\n\r\n#### New Features\r\n\r\n \r\n* Added mirror padding support on Pad layer for CpuAcc and GpuAcc.\r\n* Added support for Pool3d FrontEnd and Reference implementation.\r\n\r\n#### TfLite Parser\r\n\r\n* Added missing support for reshape operator when the target shape is dynamic and batch size is unknown.\r\n* Added PadV2 support.\r\n* Changed asserts to CHECK in ParserFlatbuffersFixture.hpp.\r\n\r\n\r\n#### ArmNN Serializer\u002FDeserializer\r\n\r\n* Added support for Pool3d.\r\n\r\n#### Bug Fixes\r\n\r\n* Added bounds checking when indexing PermutationVector elements, and its corresponding unit tests.\r\n* Fixed output bindings in ExecuteNetwork when using delegate with models with multiple outputs.\r\n* Fixed build issues in x86 Dockerfile.\r\n* Fixed issue where ExecuteNetwork printed the inference time twice.\r\n* Fixed thread safety issues in TimelineDecoder and associated unit tests.\r\n* Fixed some Thread Sanitizer warnings.\r\n* Added check for existing event to fix issue 
on OpenCL Timer.\r\n* Fixed logging bug where blank messages were being sent.\r\n* Fixed issues in the Logging API.\r\n* Fixed async execute test on 32-bit Raspberry Pi.\r\n\r\n#### Other Changes\r\n\r\n* Removed references to blacklist from Model Accuracy tool.\r\n* Removed deprecated code.\r\n* Added ModelOptions and additional timing to ARMNN_LOG.\r\n* Added get_tensorflow.sh script.\r\n* Updated build guides.\r\n* Updated error messages from the flatbuffers parser.\r\n* Added the C++ KWS example.\r\n* Handled optional biases better in Neon\u002FCl FullyConnected workloads.\r\n* Stabilised the Backend API:\r\n  *  Backend developers should now be able to limit includes to headers in include\u002Farmnn\u002Fbackends\u002F\r\n  *  Moved CompatibleTypes.hpp to the armnnUtils library. \r\n  *  Added forwarding header for src\u002Farmnn\u002FCompatibleTypes.hpp.\r\n  *  Moved the ArmNN Test Utils code to a physically separate directory.\r\n  *  Added new method AddPrecompiledLayer() to INetwork.\r\n  *  Promoted backend headers in backendCommon to armnn\u002Fbackends.\r\n  *  Used INetwork rather than Graph for holding layers for OptimizationViews.\r\n  *  Used IConnectableLayer in SubgraphView rather than Layer in its m_Layers.\r\n  *  Stabilised the IWorkloadFactory interface with unified strategy.\r\n  *  Stabilised the ILayerSupport interface with unified strategy.\r\n  *  Moved SubgraphView to backends include folder.\r\n  *  Added GetParameters to IConnectableLayer.\r\n  *  Exposed a new MockWorkloadFactory and MockMemManager.\r\n  *  Allowed accessing ConstTensors from IConnectableLayer.\r\n  *  Added method of returning a GetSubgraphWorkingCopy (SubgraphView).\r\n  *  Moved MemCopyTestImpl from acl to armnnTestUtils.  
* Support Import of Aligned Host Memory in NNAPI:
  * Added CanBeImported to ITensorHandle.
  * Implemented the CanBeImported function in RefTensorHandle.
  * Implemented the CanBeImported function in NeonTensorHandle.
  * Implemented the CanBeImported function in ClTensorHandle.
  * Added functionality for CopyAndImportFactoryPair to TensorHandleFactoryRegistry.
  * Registered CopyAndImportFactoryPairs to RefBackend and unit tests.
  * Registered CopyAndImportFactoryPairs to NeonBackend and unit tests.
  * Registered CopyAndImportFactoryPairs to ClBackend and unit tests.
  * Added ReplaceTensorHandle functions to IWorkload and BaseWorkload.
  * Added ClBaseWorkload and NeonBaseWorkload.
  * Modified workloads to extend Neon/Cl BaseWorkload.
  * Added ReplaceTensorHandle functions to Neon/CL BaseWorkloads.
  * Implemented ICLTensorProxy.
  * Added input and output workload slot pairs to LoadedNetwork.
  * Added support for aligned host memory.
  * Added Forced Import EndToEnd tests to Ref, Neon, and CL.
  * Called Cl sync after EnqueueWorkload.
  * Added EndToEnd tests on the reference backend to ensure allocated data can be reused.

#### ABI/API Changes

The following **front-end** API changes have occurred during the implementation of 22.02 that users should be aware of before upgrading.

| Feature                    | SHA                                     | Gerrit Review | Resultant ABI/API changes |
|----------------------------|-----------------------------------------|---------------|---------------------------|
| SubgraphView uses IConnectableLayer rather than Layer in its m_Layers| 56ccf68c7858560f2ba00f19076b3cb112970881 | [https://review.mlplatform.org/c/ml/armnn/+/6807](https://review.mlplatform.org/c/ml/armnn/+/6807) | Pure virtual method GetOwningIConnectableLayer() const has been added to class IOutputSlot:<br /> <li>Applications will not provide an implementation for this pure virtual method, causing a crash in the library when it tries to call this method.</li> <li>The v-table layout has changed. Calling any virtual method at a higher position in this class or its subclasses may result in a crash or incorrect behaviour of applications.</li> |
| Stabilize the ILayerSupport interface with unified strategy.| 34b429c2215bab7fd12b761dd5c200414c1b4a5b| [https://review.mlplatform.org/c/ml/armnn/+/6903](https://review.mlplatform.org/c/ml/armnn/

*Released: 2022-03-03*

# v21.11

Arm NN 21.11 was focused on providing new capabilities and improving performance:

#### New Features

* Added support for Reduce Prod.
* Added support for Channel Shuffle.
* Added support for Conv3d.
* Added support for Symmetric and Reflect Padding on the CpuRef backend.
* Added support for statically linking the Arm NN TfLite Delegate against TensorFlow Lite.
* Added Import Input/Output functions to the async API, allowing imported I/O buffers to be used by multiple network executions.
* Added an external memory manager that allows customization of network memory management (note: currently only fully supported on the CpuRef backend).

#### TfLite Parser

* Added support for Reduce Prod.
* Added support for Conv3d.
* Added support for MirrorPad.
* Added support for a size of -1 for Slice.

#### ONNX Parser

* Added support for Concat.
* Added support for Gather.
* Added support for Gemm.
  * The parser supports a constant bias or a non-constant bias where the bias dimension = 1.
* Added support for Shape.
* Added support for Unsqueeze.
* Added support for min/max as attributes for Clip.

#### ArmNN Serializer/Deserializer

* Added support for Reduce Prod.
* Added support for Channel Shuffle.
* Added support for Conv3d.
* Added support for Symmetric and Reflect Padding.

#### ExecuteNetwork App Changes

* Added a 'do-not-print-output' option to ExecuteNetwork.

#### Bug Fixes

* Using output-network-details or output-network-details-only during ExecuteNetwork profiling created an invalid JSON format. This has since been fixed.
* Fixed an undefined reinterpret_cast in BFloat16.hpp, which fixes gcc builds with version 8 or above.
* Fixed the format of the delegate JSON output.
* Fixed a bug related to the constant tensor flag.
* Fixed pyarmnn py35 unit tests.

#### Other Changes

* Added a sample app for asynchronous execution.
* Printed new Optimize and LoadedNetwork profiling points.
* Added support for the new serialized model in Netron.
* Made it possible for backends to add include paths in Android.
* Changed the order of the Doxygen tree.

#### ABI/API Changes

The following **front-end** API changes have occurred during the implementation of 21.11 that users should be aware of before upgrading. Due to these changes we have bumped our ARMNN_VERSION to 27.0.0 and the Delegate to 25.0.0, and are also bumping our Parsers to 24.3.0, following [Semantic Versioning](https://semver.org/) guidelines.

| Feature                    | SHA                                     | Gerrit Review | Resultant ABI/API changes |
|----------------------------|-----------------------------------------|---------------|---------------------------|
| Remove deprecated code| 1b2654fb799c3d25ffcef4d31b5d026d359e2f8f | [https://review.mlplatform.org/c/ml/armnn/+/6254](https://review.mlplatform.org/c/ml/armnn/+/6254) | Removed symbols:<br /> <li>INetwork::AddAbsLayer ( char const* name )</li> <li>INetwork::AddDepthwiseConvolution2dLayer ( struct DepthwiseConvolution2dDescriptor const& convolution2dDescriptor, ConstTensor const& weights, ConstTensor const& biases, char const* name )</li> <li>INetwork::AddDepthwiseConvolution2dLayer ( struct DepthwiseConvolution2dDescriptor const& convolution2dDescriptor, ConstTensor const& weights, char const* name )</li> <li>INetwork::AddEqualLayer ( char const* name )</li> <li>INetwork::AddGatherLayer ( char const* name )</li> <li>INetwork::AddGreaterLayer ( char const* name )</li> <li>INetwork::AddMergerLayer ( MergerDescriptor const& mergerDescriptor, char const* name )</li> <li>INetwork::AddResizeBilinearLayer ( struct ResizeBilinearDescriptor const& descriptor, char const* name )</li> <li>INetwork::AddRsqrtLayer ( char const* name )</li> <li>LayerSupport::IsMergerSupported ( BackendId const& backend, std::vector<TensorInfo const*> inputs, TensorInfo const& output, struct OriginsDescriptor const& descriptor, char* reasonIfUnsupported, size_t reasonIfUnsupportedMaxLength )</li> <li>LayerSupport::IsResizeBilinearSupported ( BackendId const& backend, TensorInfo const& input, TensorInfo const& output, char* reasonIfUnsupported, size_t reasonIfUnsupportedMaxLength )</li> <li>LayerSupport::IsRsqrtSupported ( BackendId const& backend, TensorInfo const& input, TensorInfo const& output, char* reasonIfUnsupported, size_t reasonIfUnsupportedMaxLength )</li> <li>LayerSupport::IsSplitterSupported ( BackendId const& backend, TensorInfo const& input, struct ViewsDescriptor const& descriptor, char* reasonIfUnsupported, size_t reasonIfUnsupportedMaxLength )</li> <br /> Removed pure virtual methods, resulting in a change to the v-table layout: <br /> <li>ILayerVisitor::VisitAbsLayer</li> <li>ILayerVisitor::VisitEqualLayer</li> <li>ILayerVisitor::VisitGatherLayer</li> <li>ILayerVisitor::VisitGreaterLayer</li> <li>ILayerVisitor::VisitMergerLayer</li> <li>ILayerVisitor::VisitResizeBilinearLayer</li> <li>ILayerVisitor::VisitRsqrtLayer</li> <br /> Removed DataTypes: <li>DataType::**QuantisedAsymm8**</li> <li>DataType::*

*Released: 2021-11-22*

# v21.08

# Summary

Arm NN 21.08 was focused on providing new capabilities and improving performance:

* Added the ability to import protected DMA buffers and allow Arm NN to run inferences that are in protected GPU memory, as well as providing a Custom Memory Allocator which supports importing malloc, dma_buf and protected DMA buffers.
* Users with multi-core NPUs have been given the ability to pin inferences to selected cores, giving them the ability to balance parallel workloads across the NPU and increase throughput.
* Boost has been completely removed from the code base, making Arm NN easier to integrate into other software stacks.
* Added support for non-constant weights and biases on FullyConnected, laying the groundwork for supporting more models.
* More operators supported on Arm NN, the TfLite Parser, the TfLite Delegate and the Android NNAPI driver.

#### New Features

* Moved unit tests from Boost to doctest.
* UNIDIRECTIONAL_SEQUENCE_LSTM Operator support added on the CpuRef backend.
* Changed the weights layout for the Depthwise Convolution Operator from [M,I,H,W] to [1,H,W,I*M].
* The Reduce Operator can now support multiple axes.
* Added an optimisation to fuse the PAD Operator into the Depthwise Convolution Operator.
* Added SIN and LOG support to the ElementWiseUnary Operator on the CpuRef, CpuAcc (only LOG is supported) and GpuAcc backends.
* Added SHAPE Operator support on the CpuRef backend.
* Moved useful test utilities to a new static library (libarmnnTestUtils.a).
* Added the ability to create multiple LoadedNetworks from one OptimizedNetwork.
* Arm NN TfLite Delegate Image Classification sample application added to the samples directory.
* Added a fully comprehensive Arm NN Operator list page to Doxygen.
* Added support to allow Arm NN to run inferences that are in Protected GPU Memory.
   * Creation of Protected Memory is handled via a Custom Memory Allocator which supports importing malloc, Dma_buf and protected DMA buffers.

#### TfLite Parser

* EXPAND_DIMS Operator support added.
* PRELU Operator support added.
* SHAPE Operator support added.
* Comparison Operator support added (EQUAL, GREATER, GREATER_EQUAL, LESS, LESS_EQUAL and NOT_EQUAL).
* Changed the weights layout for the Depthwise Convolution Operator from [M,I,H,W] to [1,H,W,I*M].
* Added support for shape_signature, which will now be the preferred way to detect dynamic tensors.
    * If creating an instance of the ITfLiteParser and the model used is dynamic, please ensure that m_InferAndValidate is set in the TfLiteParserOptions and m_shapeInferenceMethod is set to InferAndValidate in the OptimizerOptions.

#### ArmNN Serializer/Deserializer

* Changed the weights layout for the Depthwise Convolution Operator from [M,I,H,W] to [1,H,W,I*M].
* Added SIN and LOG support to the ElementWiseUnary Operator.
* UNIDIRECTIONAL_SEQUENCE_LSTM Operator support added.

#### ExecuteNetwork App Changes

* Added an option to specify what size of Arm NN thread pool to use when running inferences asynchronously.
* Added support for qasymms8 (int8) and added qasymmu8 (uint8) as an alias for qasymm8.
* Added an option to specify different input data for every iteration of ExecuteNetwork.
* Added an option to print additional information such as the TensorInfo, Descriptor and Convolution method when profiling is enabled.

NOTE: To run dynamic models through ExecuteNetwork the --infer-output-shape flag should be set.

#### Bug Fixes

* Removed a duplicate check for the Dequantize input type when checking if the operator is supported.
* Fixed undefined behaviour in PolymorphicDowncast.
* Fixed binding of a reference to a null pointer in RefFullyConnectedWorkload.
* Fixed PermutationVector.end() to cope with dimensions < 5 in the PermutationVector class.
* Fixed the cl_ext.h include path in the CL backend.
* Fixed bugs in PreCompiledLayer, e.g. a new shared_ptr was being created instead of allowing std::move to convert the unique_ptr into a shared_ptr.
* Fixed a gcc 9.3.0 compiler warning in TfLiteParser.
* Fixed an issue so that the BackendRegistry is cleaned up correctly following negative tests.

#### Other Changes

* Print Elementwise and Comparison Operator descriptors in a dot graph.
* Added an IsConstant flag to TensorInfo. This should be set if using the new AddFullyConnectedLayer Graph API when weights and bias are constant. An example of this can be found in samples/SimpleSample.cpp.
* Added support for qasymms8 (int8) and added qasymmu8 (uint8) as an alias for qasymm8 to ImageTensorGenerator.

#### ABI/API Changes

The following **front-end** API changes have occurred during the implementation of 21.08 that users should be aware of before upgrading. Due to these changes we have bumped our ARMNN_VERSION to 26.0.0 while also bumping our Parsers and Delegate to 24.2.0 following Semantic Versioning guidelines.

| Feature                    | SHA                                     | Gerrit Review | Resultant ABI/API changes |
|----------------------------|-------------------

*Released: 2021-08-26*

# v21.05

# Summary

The 21.05 release of Arm NN was focused on providing new capabilities to allow users to attain higher performance by:

* Making the Arm NN Core thread safe, opening the possibility of running multiple inferences on the same model in parallel software threads.
* Allowing graphs on the GPU backend to import their input and output buffers either from correctly aligned main memory or from kernel memory exposed as a dma_buf, thus reducing memory usage and saving the time involved in copying data into and out of the GPU memory space.

In addition to this, support was added to allow the MobileBERT network to be parsed and run.

Finally, three deprecated components: the Tensorflow Parser, the Caffe Parser and the Arm NN Quantizer tool, were removed.

#### New Features

* CAST Operator support added on the CpuRef, CpuAcc and GpuAcc backends.
* Non-const weights support added on the FULLY_CONNECTED layer for the CpuRef backend.
* Enabled Input and Output Memory Import on GPU (Malloc and DmaBuf).
* Asynchronous Network Execution for the CpuRef backend.
* Added an optimisation to fuse PAD into Pooling2d where possible.
* ASR sample application added to the samples directory.

#### TfLite Parser

* ABS Operator support added.
* ARG_MIN Operator support added.
* CAST Operator support added.
* LOGICAL_NOT Operator support added.
* RSQRT Operator support added.
* Non-const weights support added on the FULLY_CONNECTED layer.
* Turned off biases when the data location is -1 (added to support MobileBERT).

#### ArmNN Serializer/Deserializer

* Added Signed64 support to the Serializer and Deserializer.
* Added QAsymmS8 support to the Serializer.
* Added the L2 Pooling algorithm to the Deserializer.

#### ExecuteNetwork App Changes

* Asynchronous Network Execution support (currently for the CpuRef backend).
* Re-enabled GPU profiling in ExecuteNetwork.

#### Deprecated features

* Deprecated the Caffe Parser.
* Deprecated the Tensorflow Parser.
* Deprecated the Arm NN Quantizer tool.
* Deprecated m_Output_Type from the ArgMinMaxDescriptor: the output type is solely determined by the data type of the output tensor.

#### Bug Fixes

* Fixed the CheckProfilingObjectUids test failing on Ubuntu 21.04.
* Fixed the Serializer to handle situations where a shape has some unspecified dimensions.
* Fixed the AddBroadcastReshapeLayer optimisation to prevent modification of constant layers with multiple connections.
* Used the CMake value ${CMAKE_THREAD_LIBS_INIT} throughout instead of 'pthread'.
* Handled negative axes correctly in the ARG_MAX (TfLiteParser) and SPLIT (TfLiteParser & TfLiteDelegate) Operators.
* Fixed TfLiteDelegate Normalization & Softmax for Android if the NDK is less than r21.
* Fixed a Deserializer issue where layer bindings were incorrectly assigning the tensor info of one output to all 4 outputs.
* Fixed the x86_64 ArmNN DockerFile.
* Fixed TuningLevel enumeration values to be consistent.
* Fixed the YoloV3 test application's incorrect use of std::abs.
* Improved performance on SqueezeNet v1.1.

#### Other Changes

* Removed cross-wiring in DepthwiseConvolution2d. The permutation of the full tensor info is now performed in armnnUtils::Permuted.
* Moved the doctest third-party library from the delegate to armnn.
* Updated the TfLiteDelegate Python Integration guide with new links, and added information about the TFLite Model Benchmark Tool.
* Updated the Cross Compiling Guide.
* Improved Graph memory usage.

#### Known Issues

* Intermittent issue with Dma Buf memory import on GPU. This is fixed in Mali Driver r30p0.
* There might be performance regressions against v20.08 in Inception v3 using int8 data types on Arm Mali-G77 GPUs. Currently under investigation.

#### ABI/API Changes

The following **front-end** API changes have occurred during the implementation of 21.05 that users should be aware of before upgrading. Due to these changes we have bumped our ARMNN_VERSION to 25.0.0 while also bumping our Parsers and Delegate to 24.1.0 following Semantic Versioning guidelines.

| Feature                    | SHA                                     | Gerrit Review | Resultant ABI/API changes |
|----------------------------|-----------------------------------------|---------------|---------------------------|
| Add Async Queue to IRuntime| e813d67f86df41a238ff79b5c554ef5027f56576| [https://review.mlplatform.org/c/ml/armnn/+/5493](https://review.mlplatform.org/c/ml/armnn/+/5493) | <ul><li> For struct INetworkProperties the member variable size_t m_NumThreads has been added, resulting in a change of size of the inclusive type. </li></ul>|
| Add front-end support for CAST + Add TfLiteParser support for CAST| b392e9845b7f40ab0c389f29f13f6ec84dd814d1| [https://review.mlplatform.org/c/ml/armnn/+/5374](https://review.mlplatform.org/c/ml/armnn/+/5374) | <ul><li>For enum class LayerType a new enum for Cast has been added, which changes the class member LastLayer to equate to Cast rather than the previous

*Released: 2021-05-20*