[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-Neutone--neutone_sdk":3,"tool-Neutone--neutone_sdk":64},[4,17,27,35,43,56],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":16},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,3,"2026-04-05T11:01:52",[13,14,15],"开发框架","图像","Agent","ready",{"id":18,"name":19,"github_repo":20,"description_zh":21,"stars":22,"difficulty_score":23,"last_commit_at":24,"category_tags":25,"status":16},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",140436,2,"2026-04-05T23:32:43",[13,15,26],"语言模型",{"id":28,"name":29,"github_repo":30,"description_zh":31,"stars":32,"difficulty_score":23,"last_commit_at":33,"category_tags":34,"status":16},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",107662,"2026-04-03T11:11:01",[13,14,15],{"id":36,"name":37,"github_repo":38,"description_zh":39,"stars":40,"difficulty_score":23,"last_commit_at":41,"category_tags":42,"status":16},3704,"NextChat","ChatGPTNextWeb\u002FNextChat","NextChat 是一款轻量且极速的 AI 助手，旨在为用户提供流畅、跨平台的大模型交互体验。它完美解决了用户在多设备间切换时难以保持对话连续性，以及面对众多 AI 模型不知如何统一管理的痛点。无论是日常办公、学习辅助还是创意激发，NextChat 都能让用户随时随地通过网页、iOS、Android、Windows、MacOS 或 Linux 端无缝接入智能服务。\n\n这款工具非常适合普通用户、学生、职场人士以及需要私有化部署的企业团队使用。对于开发者而言，它也提供了便捷的自托管方案，支持一键部署到 Vercel 或 Zeabur 等平台。\n\nNextChat 的核心亮点在于其广泛的模型兼容性，原生支持 Claude、DeepSeek、GPT-4 及 Gemini Pro 等主流大模型，让用户在一个界面即可自由切换不同 AI 能力。此外，它还率先支持 MCP（Model Context Protocol）协议，增强了上下文处理能力。针对企业用户，NextChat 提供专业版解决方案，具备品牌定制、细粒度权限控制、内部知识库整合及安全审计等功能，满足公司对数据隐私和个性化管理的高标准要求。",87618,"2026-04-05T07:20:52",[13,26],{"id":44,"name":45,"github_repo":46,"description_zh":47,"stars":48,"difficulty_score":23,"last_commit_at":49,"category_tags":50,"status":16},2268,"ML-For-Beginners","microsoft\u002FML-For-Beginners","ML-For-Beginners 是由微软推出的一套系统化机器学习入门课程，旨在帮助零基础用户轻松掌握经典机器学习知识。这套课程将学习路径规划为 12 周，包含 26 节精炼课程和 52 道配套测验，内容涵盖从基础概念到实际应用的完整流程，有效解决了初学者面对庞大知识体系时无从下手、缺乏结构化指导的痛点。\n\n无论是希望转型的开发者、需要补充算法背景的研究人员，还是对人工智能充满好奇的普通爱好者，都能从中受益。课程不仅提供了清晰的理论讲解，还强调动手实践，让用户在循序渐进中建立扎实的技能基础。其独特的亮点在于强大的多语言支持，通过自动化机制提供了包括简体中文在内的 50 
多种语言版本，极大地降低了全球不同背景用户的学习门槛。此外，项目采用开源协作模式，社区活跃且内容持续更新，确保学习者能获取前沿且准确的技术资讯。如果你正寻找一条清晰、友好且专业的机器学习入门之路，ML-For-Beginners 将是理想的起点。",84991,"2026-04-05T10:45:23",[14,51,52,53,15,54,26,13,55],"数据工具","视频","插件","其他","音频",{"id":57,"name":58,"github_repo":59,"description_zh":60,"stars":61,"difficulty_score":10,"last_commit_at":62,"category_tags":63,"status":16},3128,"ragflow","infiniflow\u002Fragflow","RAGFlow 是一款领先的开源检索增强生成（RAG）引擎，旨在为大语言模型构建更精准、可靠的上下文层。它巧妙地将前沿的 RAG 技术与智能体（Agent）能力相结合，不仅支持从各类文档中高效提取知识，还能让模型基于这些知识进行逻辑推理和任务执行。\n\n在大模型应用中，幻觉问题和知识滞后是常见痛点。RAGFlow 通过深度解析复杂文档结构（如表格、图表及混合排版），显著提升了信息检索的准确度，从而有效减少模型“胡编乱造”的现象，确保回答既有据可依又具备时效性。其内置的智能体机制更进一步，使系统不仅能回答问题，还能自主规划步骤解决复杂问题。\n\n这款工具特别适合开发者、企业技术团队以及 AI 研究人员使用。无论是希望快速搭建私有知识库问答系统，还是致力于探索大模型在垂直领域落地的创新者，都能从中受益。RAGFlow 提供了可视化的工作流编排界面和灵活的 API 接口，既降低了非算法背景用户的上手门槛，也满足了专业开发者对系统深度定制的需求。作为基于 Apache 2.0 协议开源的项目，它正成为连接通用大模型与行业专有知识之间的重要桥梁。",77062,"2026-04-04T04:44:48",[15,14,13,26,54],{"id":65,"github_repo":66,"name":67,"description_en":68,"description_zh":69,"ai_summary_zh":70,"readme_en":71,"readme_zh":72,"quickstart_zh":73,"use_case_zh":74,"hero_image_url":75,"owner_login":76,"owner_name":76,"owner_avatar_url":77,"owner_bio":78,"owner_company":79,"owner_location":79,"owner_email":80,"owner_twitter":81,"owner_website":82,"owner_url":83,"languages":84,"stars":89,"forks":90,"last_commit_at":91,"license":92,"difficulty_score":23,"env_os":93,"env_gpu":94,"env_ram":94,"env_deps":95,"category_tags":99,"github_topics":100,"view_count":23,"oss_zip_url":79,"oss_zip_packed_at":79,"status":16,"created_at":112,"updated_at":113,"faqs":114,"releases":143},2907,"Neutone\u002Fneutone_sdk","neutone_sdk","Join the community on Discord for more discussions around Neutone! https:\u002F\u002Fdiscord.gg\u002FVHSMzb8Wqp","Neutone SDK 是一个开源框架，旨在简化基于 PyTorch 的神经音频模型部署，使其能轻松应用于实时或离线场景。它主要解决了音频 AI 研究人员面临的痛点：传统音频插件开发依赖 C++ 和复杂的 JUCE 框架，门槛高且耗时。借助 Neutone SDK，用户无需编写任何 C++ 代码，仅需少量 Python 脚本即可将自定义模型封装，并直接在数字音频工作站（DAW）中通过免费的宿主插件（Neutone FX 用于实时效果，Neutone Gen 用于非实时生成）运行。\n\n该工具特别适合音频算法研究员、教育工作者以及希望快速验证模型的开发者，同时也为艺术家提供了尝试前沿 AI 音效的途径。其核心技术亮点在于统一的模型无关接口，自动处理了可变缓冲区大小、采样率转换、延迟补偿及控制参数映射等复杂工程问题。这意味着即使模型原本只能在固定条件下运行，也能在 DAW 的各种采样率和缓冲设置下无缝工作。此外，SDK 还内置了性能基准测试和分析工具，帮助用户高效调试和优化模型。通过将底层工程复杂性抽象化，Neutone SDK 让创作者能专注于算法创新，通常在一天内即可完成从模型到 DAW","Neutone SDK 是一个开源框架，旨在简化基于 PyTorch 的神经音频模型部署，使其能轻松应用于实时或离线场景。它主要解决了音频 AI 研究人员面临的痛点：传统音频插件开发依赖 C++ 和复杂的 JUCE 框架，门槛高且耗时。借助 Neutone SDK，用户无需编写任何 C++ 代码，仅需少量 Python 脚本即可将自定义模型封装，并直接在数字音频工作站（DAW）中通过免费的宿主插件（Neutone FX 用于实时效果，Neutone Gen 用于非实时生成）运行。\n\n该工具特别适合音频算法研究员、教育工作者以及希望快速验证模型的开发者，同时也为艺术家提供了尝试前沿 AI 音效的途径。其核心技术亮点在于统一的模型无关接口，自动处理了可变缓冲区大小、采样率转换、延迟补偿及控制参数映射等复杂工程问题。这意味着即使模型原本只能在固定条件下运行，也能在 DAW 的各种采样率和缓冲设置下无缝工作。此外，SDK 还内置了性能基准测试和分析工具，帮助用户高效调试和优化模型。通过将底层工程复杂性抽象化，Neutone SDK 让创作者能专注于算法创新，通常在一天内即可完成从模型到 DAW 插件的完整流程。","# Neutone SDK\n\n[![Release](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FRelease-v1.5.2-green)](https:\u002F\u002Fgithub.com\u002FNeutone\u002Fneutone_sdk\u002Freleases\u002Ftag\u002Fv1.5.2)\n[![arXiv](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2508.09126-b31b1b.svg)](https:\u002F\u002Farxiv.org\u002Fabs\u002F2508.09126)\n[![Plugin](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FRealtime%20Plugin-Neutone%20FX-orange)](https:\u002F\u002Fneutone.ai\u002Ffx)\n[![Plugin](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FNon--realtime%20Plugin-Neutone%20Gen-orange)](https:\u002F\u002Fneutone.ai\u002Fgen)\n[![ADC 
2022](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FADC%202022-blue?logo=youtube&labelColor=555)](https:\u002F\u002Fyoutu.be\u002FhhbvjQ2v8Hk?t=1177)\n[![License](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLicense-LGPL--2.1-blue)](https:\u002F\u002Fwww.gnu.org\u002Flicenses\u002Fold-licenses\u002Flgpl-2.1.en.html)\n\nThe Neutone SDK is an open source framework that streamlines the deployment of PyTorch-based neural audio models for both real-time and offline applications.\nIt enables researchers to wrap their own PyTorch audio models and run them in the DAW using our free host plugins [Neutone FX](https:\u002F\u002Fneutone.ai\u002Ffx) and [Neutone Gen](https:\u002F\u002Fneutone.ai\u002Fgen).\nWe offer functionality for both loading models locally and contributing them to the library of models that are available to anyone running the plugin.\nBy encapsulating common challenges such as variable buffer sizes, sample rate conversion, delay compensation, and control parameter handling within a unified, model-agnostic interface, our framework enables seamless interoperability between neural models and host plugins while allowing users to work entirely in Python. \nTo date, the SDK has powered various different applications such as audio effect emulation, timbre transfer, and sample generation, as well as seen adoption by researchers, educators, companies, and artists alike.\n\nCurrently we provide two different free host plugins:\n- Neutone FX for realtime models\n- Neutone Gen for non-realtime models, currently in beta.\n\n## Why use the Neutone SDK\n\n[JUCE](https:\u002F\u002Fgithub.com\u002Fjuce-framework\u002FJUCE) is the industry standard for building audio plugins. Because of this, knowledge of C++ is needed to be able to build even very simple audio plugins. However, it is rare for AI audio researchers to have extensive experience with C++ and be able to build such a plugin. Moreover, it is a serious time investment that could be spent developing better algorithms. Neutone makes it possible to build models using familiar tools such as PyTorch and with a minimal amount of Python code wrap these models such that they can be executed by the Neutone Plugin. Getting a model up and running inside a DAW can be done in less than a day without any need for C++ code or knowledge.\n\nThe SDK provides support for automatic buffering of inputs and outputs to your model and on-the-fly sample rate and stereo-mono conversion. It enables a model that can only be executed with a predefined number of samples to be used in the DAW at any sampling rate and any buffer size seamlessly. Additionally, within the SDK tools for benchmarking and profiling are readily available so you can easily debug and test the performance of your models.\n\n## Citation\n\nAccepted to the AES International Conference on Artificial Intelligence and Machine Learning for Audio held in London, UK on Sep. 
<pre><code>@inproceedings{mitcheltree2025neutone,
    title={Neutone {SDK}: An Open Source Framework for Neural Audio Processing},
    author={Christopher Mitcheltree and Bogdan Teleaga and Andrew Fyfe and Naotake Masuda and Matthias Schäfer and Alfie Bradic and Nao Tokui},
    booktitle={AES International Conference on Artificial Intelligence and Machine Learning for Audio},
    year={2025},
    url={https://doi.org/10.48550/arXiv.2508.09126}
}
</code></pre>

## Table of Contents
- [Installing the SDK](#install)
- [Downloading the Neutone Plugin](#download)
- [Examples](#examples)
- [SDK Description](#description)
- [SDK Usage](#usage)
- [Benchmarking and Profiling](#benchmark)
- [Known issues](#issues)
- [Contributing to the SDK](#contributing)
- [Credits](#credits)

---

## Installing the SDK

<a name="install"/>

You can install `neutone_sdk` using pip:

```
pip install neutone_sdk
```

<a name="download"/>

## Downloading the Plugin

### FX

The Neutone FX plugin is available at [https://neutone.ai/fx](https://neutone.ai/fx). We currently offer VST3 and AU plugins that can load the models created with this SDK. Please visit the website for more information.

### Gen

The Neutone Gen plugin is still in beta and can be downloaded directly from the links below:
- [MacOS ARM](https://dev.neutone.ai/neutone-gen/neutone-gen-arm64-0.2.0.dmg)
- [Windows](https://dev.neutone.ai/neutone-gen/neutone-gen-0.2.0-installer.exe)

Currently the Gen plugin does not provide store functionality similar to the FX plugin. During development, simply drop your exported `.nm` model in the `models` folder that can be accessed from the UI.

We provide the following pre-wrapped models:
- [LoopGAN by Nao Tokui trained on a proprietary dataset](https://github.com/naotokui/LoopGAN) - [Download .nm model](https://dev.neutone.ai/neutone-gen/loopgan_nao_tokui.nm)
- [LoopGAN trained on the WaivOps EDM-HSE dataset](https://github.com/patchbanks/WaivOps-EDM-HSE) - [Download .nm model](https://dev.neutone.ai/neutone-gen/loopgan_4bars_edm_hse.nm)
- [Stable Audio Open Small](https://huggingface.co/stabilityai/stable-audio-open-1.0) - [Download .nm model](https://dev.neutone.ai/neutone-gen/stable_audio_open_small.nm)
- [HT Demucs](https://github.com/facebookresearch/demucs) - [Download .nm model](https://dev.neutone.ai/neutone-gen/ht_demucs_4stems.nm)

<a name="examples"/>

## Examples and Notebooks

If you just want to wrap a model without going through a detailed description of everything, we have prepared these examples for you.

### FX
- The clipper example shows how to wrap a very simple PyTorch module that does not contain any AI model. Check it out for a high-level overview of what is needed to wrap a model. It is available at [examples/neutone_fx/example_clipper.py](examples/neutone_fx/example_clipper.py).
- An example with a simple convolutional model based on [Randomized Overdrive Neural Networks](https://csteinmetz1.github.io/ronn/) can be found at [examples/neutone_fx/example_overdrive-random.py](examples/neutone_fx/example_overdrive-random.py).
- We also have notebooks for more complicated models, showing the entire workflow from training to exporting them using Neutone:
    - [TCN FX Emulation](https://colab.research.google.com/drive/1apINsljr6jXGP3Nh3_ISbeT2jgeKn5nP?usp=sharing)
    - [DDSP Timbre Transfer](https://colab.research.google.com/drive/1IUuxJ_DhhLHVvMcvbaBPVLRQK53yMOvd)
    - [RAVE Timbre Transfer](https://colab.research.google.com/drive/1AQOrXtiIFWj_Qh-Br3qfmUKmpzFQgsqj)
    - [NoiseBandNet Audio Reconstruction](https://colab.research.google.com/drive/1KJij2CqhLf7ac6aljMckFL71WJrCNg66?usp=sharing)
    - [GCN FX Emulation](https://github.com/francescopapaleo/neural-audio-spring-reverb/blob/main/notebooks/neutone_GCN_demo.ipynb)

### Gen
- We provide the same simple clipper example as above, but using the non-realtime Gen wrappers, at [examples/neutone_gen/example_clipper.py](examples/neutone_gen/example_clipper.py).
- A more intricate example of wrapping a text-to-audio model is available at [examples/neutone_gen/example_musicgen_load.py](examples/neutone_gen/example_musicgen_load.py). It showcases how a tokenizer can be used and requires a traced or scripted MusicGen model.

<a name="description"/>

## SDK Overview

The SDK provides functionality for wrapping existing PyTorch models so that they can be executed within the VST plugin. At its core, the plugin sends chunks of audio samples at a certain sample rate as input and expects the same number of samples at the output. The user of the SDK can specify the sample rate(s) and buffer size(s) at which their model performs optimally. The SDK then guarantees that the forward pass of the model will receive audio at one of these (sample_rate, buffer_size) combinations. Four knobs are available that allow users of the plugin to feed additional parameters to the model at runtime. They can be enabled or disabled as needed via the SDK.

Using the included export function, a series of tests is automatically run to ensure the models behave as expected and are ready to be loaded by the plugin.

Benchmarking and profiling CLI tools are available for further debugging and testing of wrapped models. It is possible to benchmark the speed and latency of a model on a range of simulated common DAW (sample_rate, buffer_size) combinations, as well as profile its memory and CPU usage.

<a name="usage"/>

## SDK Usage

### General Usage

We provide several models in the [examples](https://github.com/QosmoInc/neutone-sdk/blob/main/examples) directory. We will go through one of the simplest, a distortion model, to illustrate.

Assume we have the following PyTorch model. Parameters will be covered later on; for now we focus on the inputs and outputs. Assume this model receives a Tensor of shape `(2, buffer_size)` as input, where `buffer_size` is a parameter that can be specified.

```python
class ClipperModel(nn.Module):
    def forward(self, x: Tensor, min_val: float, max_val: float, gain: float) -> Tensor:
        return torch.clip(x, min=min_val * gain, max=max_val * gain)
```

To run this inside the VST, the simplest wrapper we can write subclasses the `WaveformToWaveformBase` base class.
```python
class ClipperModelWrapper(WaveformToWaveformBase):
    @torch.jit.export
    def is_input_mono(self) -> bool:
        return False

    @torch.jit.export
    def is_output_mono(self) -> bool:
        return False

    @torch.jit.export
    def get_native_sample_rates(self) -> List[int]:
        return []  # Supports all sample rates

    @torch.jit.export
    def get_native_buffer_sizes(self) -> List[int]:
        return []  # Supports all buffer sizes

    def do_forward_pass(self, x: Tensor, params: Dict[str, Tensor]) -> Tensor:
        # ... Parameter unwrap logic
        x = self.model.forward(x, min_val, max_val, gain)
        return x
```

The method that does most of the work is `do_forward_pass`. In this case it is just a simple passthrough, but we will use it to handle parameters later on.

By default the VST runs stereo-in, stereo-out, but when mono is desired for the model we can use `is_input_mono` and `is_output_mono` to inform the SDK and have the inputs and outputs converted automatically. If `is_input_mono` is toggled, an averaged `(1, buffer_size)` shaped Tensor is passed as input instead of `(2, buffer_size)`. If `is_output_mono` is toggled, `do_forward_pass` is expected to return a mono Tensor (shape `(1, buffer_size)`) that is then duplicated across both channels at the output of the VST. This is done within the SDK to avoid unnecessary memory allocations during each pass.

`get_native_sample_rates` and `get_native_buffer_sizes` can be used to specify preferred sample rates or buffer sizes. In most cases these are expected to contain only one element, but extra flexibility is provided for more complex models. When multiple options are provided, the SDK tries to find the best one for the current setting of the DAW. Whenever the sample rate or buffer size differs from the DAW's, a wrapper is automatically triggered that converts to the correct sampling rate, implements a FIFO queue for the requested buffer size, or both. This incurs a small performance penalty and adds some amount of delay. If a model is compatible with any sample rate and/or buffer size, these lists can be left empty.

This means that the tensor `x` in the `do_forward_pass` method is guaranteed to have shape `(1 if is_input_mono else 2, buffer_size)`, where `buffer_size` is chosen at runtime from the list provided in the `get_native_buffer_sizes` method. The tensor `x` will also be at one of the sampling rates from the list provided in the `get_native_sample_rates` method.

### Exporting models and loading in the plugin

We provide a `save_neutone_model` helper function that saves models to disk. By default this converts the model to TorchScript and runs it through a series of checks to ensure it can be loaded by the plugin. The resulting `model.nm` file can be loaded within the plugin using the `load your own` button. Read below for how to submit models to the default collection visible to everyone using the plugin.
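As a minimal sketch of that export step (the `neutone_sdk.utils` import path, the wrapper constructor taking the wrapped module, and the output directory are assumptions based on the repository's example scripts, not guarantees):

```python
from pathlib import Path

from neutone_sdk.utils import save_neutone_model

# Wrap the plain PyTorch module and export it. save_neutone_model
# converts the wrapper to TorchScript, runs the SDK's validation
# checks, and writes a model.nm file into the given directory.
model = ClipperModel()
wrapper = ClipperModelWrapper(model)
save_neutone_model(wrapper, Path("exports/clipper"))
```

The resulting `.nm` file can then be loaded in the plugin via the `load your own` button.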
### Parameters

For models that can use conditioning signals, we currently provide four configurable knob parameters. Within the `ClipperModelWrapper` defined above we can include the following:
```python
class ClipperModelWrapper(WaveformToWaveformBase):
    ...

    def get_neutone_parameters(self) -> List[NeutoneParameter]:
        return [NeutoneParameter(name="min", description="min clip threshold", default_value=0.5),
                NeutoneParameter(name="max", description="max clip threshold", default_value=1.0),
                NeutoneParameter(name="gain", description="scale clip threshold", default_value=1.0)]

    def do_forward_pass(self, x: Tensor, params: Dict[str, Tensor]) -> Tensor:
        min_val, max_val, gain = params["min"], params["max"], params["gain"]
        x = self.model.forward(x, min_val, max_val, gain)
        return x
```

During the forward pass the `params` variable will be a dictionary like the following:
```python
{
    "min": torch.Tensor([0.5] * buffer_size),
    "max": torch.Tensor([1.0] * buffer_size),
    "gain": torch.Tensor([1.0] * buffer_size)
}
```
The keys of the dictionary are the parameter names specified in the `get_neutone_parameters` method.

The parameters always take values between 0 and 1, and the `do_forward_pass` function can do any necessary rescaling before running the internal forward method of the model.

Moreover, the parameters sent by the plugin come in at sample-level granularity. By default we take the mean of each buffer and return a single float (as a Tensor), but the `aggregate_param` method can be used to override the aggregation. See the full clipper export file for an example of preserving this granularity.
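As an illustration of the rescaling described above, a minimal sketch of a `do_forward_pass` (the target ranges are arbitrary choices for this example, not SDK defaults):

```python
def do_forward_pass(self, x: Tensor, params: Dict[str, Tensor]) -> Tensor:
    # Each knob arrives as a value in [0, 1] (by default already
    # aggregated to a single-element Tensor via the buffer mean).
    # Map the raw knob values onto the ranges the model expects.
    min_val = -1.0 + params["min"]     # [0, 1] -> [-1, 0]
    max_val = params["max"]            # [0, 1] -> [0, 1], used as-is
    gain = 0.5 + 0.5 * params["gain"]  # [0, 1] -> [0.5, 1.0]
    return self.model.forward(x, min_val, max_val, gain)
```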
<a name="delay"/>

### Reporting delay

Some audio models delay the audio by a certain number of samples, depending on the architecture of the particular model. For the wet and dry signals going through the plugin to stay aligned, users are required to report how many samples of delay their model induces. The `calc_model_delay_samples` method can be used to specify this number. RAVE models on average have one buffer of delay (2048 samples), which is communicated statically in the `calc_model_delay_samples` method and can be seen in the examples. Models implemented with overlap-add will have a delay equal to the number of samples used for crossfading, as seen in the [Demucs model wrapper](https://neutone.ai/blog/implementing-models-with-overlap-add-in-neutone/) or the [spectral filter example](examples/example_spectral_filter.py).

Calculating the delay your model adds can be difficult, especially since there can be multiple different sources of delay that need to be combined (e.g. crossfading delay, filter delay, lookahead buffer delay, and/or neural networks trained on unaligned dry and wet audio). It's worth spending some extra time testing the model in your DAW to make sure the delay is being reported correctly.
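A minimal sketch of reporting a static delay, following the RAVE figure quoted above (the class name is illustrative, and the `@torch.jit.export` decorator is an assumption mirroring the other wrapper methods):

```python
class RAVEModelWrapper(WaveformToWaveformBase):
    ...

    @torch.jit.export
    def calc_model_delay_samples(self) -> int:
        # RAVE models average one buffer of latency (2048 samples);
        # reporting it lets the plugin re-align the wet and dry signals.
        return 2048
```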
### Lookbehind Buffers

Adding a lookbehind buffer can be useful for models that require a certain amount of additional context to output useful results. A lookbehind buffer can be enabled by indicating how many samples of lookbehind you need in the `get_look_behind_samples` method. When this method returns a number greater than zero, the `do_forward_pass` method will always receive a tensor of shape `(in_n_ch, look_behind_samples + buffer_size)`, but must still return a tensor of shape `(out_n_ch, buffer_size)` containing the latest samples.

We recommend avoiding a lookbehind buffer when possible, since it makes your model less efficient and can result in wasted calculations during each forward pass. If using a purely convolutional model, try switching all the convolutions to cached convolutions instead.
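A sketch of that shape contract (the 4096-sample context figure and the class name are arbitrary illustrations):

```python
class ContextModelWrapper(WaveformToWaveformBase):
    ...

    @torch.jit.export
    def get_look_behind_samples(self) -> int:
        # Ask the SDK to prepend 4096 samples of past audio to each buffer.
        return 4096

    def do_forward_pass(self, x: Tensor, params: Dict[str, Tensor]) -> Tensor:
        # x arrives with shape (in_n_ch, look_behind_samples + buffer_size).
        buffer_size = x.size(1) - self.get_look_behind_samples()
        y = self.model.forward(x)
        # Return only the newest buffer_size samples: (out_n_ch, buffer_size).
        return y[:, -buffer_size:]
```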
### Filters

It is common for AI models to act in unexpected ways when presented with inputs outside their training distribution. We provide a series of common filters (low pass, high pass, band pass, band stop) in the [neutone_sdk/filters.py](neutone_sdk/filters.py) file. These can be used during the forward pass to restrict the domain of the inputs going into the model. Some of them can induce a small amount of delay; check out the [examples/example_clipper_prefilter.py](examples/example_clipper_prefilter.py) file for a simple example of how to set up a filter.
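To illustrate the idea of restricting the input domain, a hedged sketch using torchaudio's generic biquad high-pass rather than the SDK's own filters in `neutone_sdk/filters.py`, and assuming a 48 kHz native sample rate declared via `get_native_sample_rates`:

```python
import torchaudio.functional as taf

class FilteredModelWrapper(WaveformToWaveformBase):
    ...

    def do_forward_pass(self, x: Tensor, params: Dict[str, Tensor]) -> Tensor:
        # Cut content below ~30 Hz before it reaches the model so the
        # input stays closer to the training distribution.
        x = taf.highpass_biquad(x, sample_rate=48000.0, cutoff_freq=30.0)
        return self.model.forward(x)
```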
### Submitting models

The plugin contains a default list of models aimed at creators who want to use them in their creative process. We encourage users to submit their models once they are happy with the results, so they can be used by the community at large. For submission we require some additional metadata that is used to display information about the model, aimed at both creators and other researchers. This is displayed on both the [Neutone website](https://neutone.space) and inside the plugin.

Moving beyond the simple clipper model, here is a more realistic example based on a random TCN overdrive model inspired by [micro-tcn](https://github.com/csteinmetz1/micro-tcn).

```python
class OverdriveModelWrapper(WaveformToWaveformBase):
    def get_model_name(self) -> str:
        return "conv1d-overdrive.random"

    def get_model_authors(self) -> List[str]:
        return ["Nao Tokui"]

    def get_model_short_description(self) -> str:
        return "Neural distortion/overdrive effect"

    def get_model_long_description(self) -> str:
        return "Neural distortion/overdrive effect through randomly initialized Convolutional Neural Network"

    def get_technical_description(self) -> str:
        return "Random distortion/overdrive effect through randomly initialized Temporal-1D-convolution layers. You'll get different types of distortion by re-initializing the weight or changing the activation function. Based on the idea proposed by Steinmetz et al."

    def get_tags(self) -> List[str]:
        return ["distortion", "overdrive"]

    def get_model_version(self) -> str:
        return "1.0.0"

    def is_experimental(self) -> bool:
        return False

    def get_technical_links(self) -> Dict[str, str]:
        return {
            "Paper": "https://arxiv.org/abs/2010.04237",
            "Code": "https://github.com/csteinmetz1/ronn"
        }

    def get_citation(self) -> str:
        return "Steinmetz, C. J., & Reiss, J. D. (2020). Randomized overdrive neural networks. arXiv preprint arXiv:2010.04237."
```

Check out the documentation of the methods inside [core.py](neutone_sdk/core.py), as well as the random overdrive model on the [website](https://neutone.ai/fx/models) and in the plugin, to understand where each field is displayed.

To submit a model, please [open an issue on the GitHub repository](https://github.com/QosmoInc/neutone_sdk/issues/new?assignees=bogdanteleaga%2C+christhetree&labels=enhancement&template=request-add-model.md&title=%5BMODEL%5D+%3CNAME%3E). We currently need the following:
- A short description of what the model does and how it can contribute to the community
- A link to the `model.nm` file outputted by the `save_neutone_model` helper function

<a name="benchmark"/>

## Benchmarking and Profiling

The SDK provides three CLI tools that can be used to debug and test wrapped models.

### Benchmarking Speed

Example:
```
$ python -m neutone_sdk.benchmark benchmark-speed --model_file model.nm
INFO:__main__:Running benchmark for buffer sizes (128, 256, 512, 1024, 2048) and sample rates (48000,). Outliers will be removed from the calculation of mean and std and displayed separately if existing.
INFO:__main__:Sample rate:  48000 | Buffer size:    128 | duration:  0.014±0.002 | 1/RTF:  5.520 | Outliers: [0.008]
INFO:__main__:Sample rate:  48000 | Buffer size:    256 | duration:  0.028±0.003 | 1/RTF:  5.817 | Outliers: []
INFO:__main__:Sample rate:  48000 | Buffer size:    512 | duration:  0.053±0.003 | 1/RTF:  6.024 | Outliers: []
INFO:__main__:Sample rate:  48000 | Buffer size:   1024 | duration:  0.106±0.000 | 1/RTF:  6.056 | Outliers: []
INFO:__main__:Sample rate:  48000 | Buffer size:   2048 | duration:  0.212±0.000 | 1/RTF:  6.035 | Outliers: [0.213]
```

Running the speed benchmark automatically runs random inputs through the model at a sample rate of 48000 and buffer sizes of (128, 256, 512, 1024, 2048), and reports the average time taken to execute inference for one buffer. From this the `1/RTF` is calculated, which represents how much faster than realtime the model runs. The higher this number, the fewer resources the model uses within the DAW. It must be greater than 1 for the model to be executed in realtime on the machine the benchmark is run on.

The sample rates and buffer sizes being tested, the number of times the benchmark is repeated internally to calculate the averages, and the number of threads used for the computation are all available as parameters. Run `python -m neutone_sdk.benchmark benchmark-speed --help` for more information. When specifying custom sample rates or buffer sizes, each one needs to be passed to the CLI separately, for example: `--sample_rate 48000 --sample_rate 44100 --buffer_size 32 --buffer_size 64`.

While the speed benchmark should normally finish quickly, since models are generally required to run in realtime, it is possible to get stuck if the model is too slow. Make sure you choose an appropriate number of sample rates and buffer sizes to test.

<a name="latency"/>

### Benchmarking Latency

Example:
```bash
$ python -m neutone_sdk.benchmark benchmark-latency model.nm
INFO:__main__:Native buffer sizes: [2048], Native sample rates: [48000]
INFO:__main__:Model exports/ravemodel/model.nm has the following delays for each sample rate / buffer size combination (lowest delay first):
INFO:__main__:Sample rate:  48000 | Buffer size:   2048 | Total delay:      0 | (Buffering delay:      0 | Model delay:      0)
INFO:__main__:Sample rate:  48000 | Buffer size:   1024 | Total delay:   1024 | (Buffering delay:   1024 | Model delay:      0)
INFO:__main__:Sample rate:  48000 | Buffer size:    512 | Total delay:   1536 | (Buffering delay:   1536 | Model delay:      0)
INFO:__main__:Sample rate:  48000 | Buffer size:    256 | Total delay:   1792 | (Buffering delay:   1792 | Model delay:      0)
INFO:__main__:Sample rate:  44100 | Buffer size:    128 | Total delay:   1920 | (Buffering delay:   1920 | Model delay:      0)
INFO:__main__:Sample rate:  48000 | Buffer size:    128 | Total delay:   1920 | (Buffering delay:   1920 | Model delay:      0)
INFO:__main__:Sample rate:  44100 | Buffer size:    256 | Total delay:   2048 | (Buffering delay:   2048 | Model delay:      0)
INFO:__main__:Sample rate:  44100 | Buffer size:    512 | Total delay:   2048 | (Buffering delay:   2048 | Model delay:      0)
INFO:__main__:Sample rate:  44100 | Buffer size:   1024 | Total delay:   2048 | (Buffering delay:   2048 | Model delay:      0)
INFO:__main__:Sample rate:  44100 | Buffer size:   2048 | Total delay:   2048 | (Buffering delay:   2048 | Model delay:      0)
```

Running the latency benchmark automatically computes the latency of the model at combinations of `sample_rate=(44100, 48000)` and `buffer_size=(128, 256, 512, 1024, 2048)`. This gives a general overview of what happens for common DAW settings. The total delay is split into buffering delay and model delay. The model delay is reported by the creator of the model in the model wrapper, as explained [above](#delay). The buffering delay is computed automatically by the SDK, taking into consideration the native `(sample_rate, buffer_size)` combinations specified by the wrapper and the one specified by the DAW at runtime. Running the model at its native `(sample_rate, buffer_size)` combination(s) incurs minimum delay.

Similar to the speed benchmark above, the tested combinations of `(sample_rate, buffer_size)` can be specified from the CLI. Run `python -m neutone_sdk.benchmark benchmark-latency --help` for more info.

### Profiling
```bash
$ python -m neutone_sdk.benchmark profile --model_file exports/ravemodel/model.nm
INFO:__main__:Profiling model exports/ravemodel/model.nm at sample rate 48000 and buffer size 128
STAGE:2023-09-28 14:34:53 96328:4714960 ActivityProfilerController.cpp:311] Completed Stage: Warm Up
30it [00:00, 37.32it/s]
STAGE:2023-09-28 14:34:54 96328:4714960 ActivityProfilerController.cpp:317] Completed Stage: Collection
STAGE:2023-09-28 14:34:54 96328:4714960 ActivityProfilerController.cpp:321] Completed Stage: Post Processing
INFO:__main__:Displaying Total CPU Time
INFO:__main__:--------------------------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------
                            Name    Self CPU %      Self CPU   CPU total %     CPU total  CPU time avg       CPU Mem  Self CPU Mem    # of Calls
--------------------------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------
                         forward        98.54%     799.982ms       102.06%     828.603ms      26.729ms           0 b    -918.17 Kb            31
               aten::convolution         0.12%     963.000us         0.95%       7.739ms     175.886us     530.62 Kb    -143.50 Kb            44
...
...
Full output removed from GitHub.

```

The profiling tool runs the model at a sample rate of 48000 and a buffer size of 128 under the PyTorch profiler and outputs a series of insights, such as total CPU time, total CPU memory usage (per function), and grouped CPU memory usage (per group of function calls). This can be used to identify bottlenecks in your model code (even within the model call inside `do_forward_pass`).

Similar to benchmarking, it can be run at different combinations of sample rates and buffer sizes, as well as different numbers of threads. Run `python -m neutone_sdk.benchmark profile --help` for more info.

<a name="issues"/>

## Known issues

- Freezing models on save can cause instabilities, so freezing is disabled by default. We recommend trying to save models both with and without freezing.
- Lookahead buffers are currently not included in the SDK (only lookbehind buffers), but can be implemented with additional code. An example is available in [this file](neutone_sdk/realtime_stft.py).
- M1 acceleration is currently not supported.
- Wrapping more complicated models can result in obscure TorchScript errors.

<a name="contributing"/>

## Contributing to the SDK

We welcome any contributions to the SDK. Please add types wherever possible and use the `black` formatter for readability.

The current roadmap is:
- Looking into alternatives for TorchScript (ExecuTorch?)

<a name="credits"/>

## Credits

The audacitorch project was a major inspiration for the development of the SDK. [Check it out here](https://github.com/hugofloresgarcia/audacitorch)
---

# Neutone SDK Quickstart Guide

Neutone SDK is an open-source framework that streamlines the deployment of PyTorch-based neural audio models. It lets researchers wrap a model using Python alone and run it directly in a digital audio workstation (DAW) through the free host plugins (Neutone FX or Neutone Gen), with no C++ code required.

## 1. Prerequisites

Before starting, make sure your development environment meets the following requirements:

*   **Operating system**: Windows, macOS, or Linux.
*   **Python**: 3.8 or later recommended.
*   **Core dependencies**:
    *   `PyTorch`: for building and running the neural network model.
    *   `neutone_sdk`: the core wrapping library.
*   **Host software (DAW)**: any DAW that supports VST3 or AU (macOS) plugins, such as Ableton Live, FL Studio, or Logic Pro.
*   **Host plugin**:
    *   **Neutone FX**: for real-time audio effect models (recommended for most users).
    *   **Neutone Gen**: for non-real-time generative models (currently in beta).

## 2. Installation

### 2.1 Install the Python SDK

Install the Neutone SDK with pip. Users in mainland China may want to use a mirror such as Tsinghua's or Alibaba's to speed up the download:

```bash
pip install neutone_sdk -i https://pypi.tuna.tsinghua.edu.cn/simple
```

*Note: if PyTorch is not installed yet, first visit [pytorch.org](https://pytorch.org) and install the build that matches your setup.*

### 2.2 Download the host plugin

Download the plugin and place it where your DAW scans for plugins.

*   **Neutone FX (real-time models)**:
    *   Download the VST3 or AU build from the website: [https://neutone.ai/fx](https://neutone.ai/fx)
*   **Neutone Gen (non-real-time models, beta)**:
    *   **macOS (ARM)**: [download link](https://dev.neutone.ai/neutone-gen/neutone-gen-arm64-0.2.0.dmg)
    *   **Windows**: [download link](https://dev.neutone.ai/neutone-gen/neutone-gen-0.2.0-installer.exe)
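A quick way to verify the installation before moving on (this uses only the standard library's package metadata, so it makes no assumptions about the SDK's public attributes):

```python
from importlib.metadata import version

import torch
import neutone_sdk  # noqa: F401  # confirm the package imports cleanly

print("torch", torch.__version__)
print("neutone_sdk", version("neutone_sdk"))
```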
## 3. Basic usage

The minimal workflow for wrapping a simple PyTorch model and running it in a DAW.

### 3.1 Define and wrap the model

Create a Python file (e.g. `my_model.py`) and subclass the `WaveformToWaveformBase` base class. The example below wraps a simple clipper model; the constructor and export call follow the `save_neutone_model` helper documented in the README, though exact signatures may vary between SDK versions:

```python
import torch
from torch import Tensor, nn
from typing import List, Dict
from pathlib import Path
from neutone_sdk import WaveformToWaveformBase
from neutone_sdk.utils import save_neutone_model

# 1. Define the raw PyTorch model logic
class ClipperModel(nn.Module):
    def forward(self, x: Tensor, min_val: float, max_val: float, gain: float) -> Tensor:
        return torch.clip(x, min=min_val * gain, max=max_val * gain)

# 2. Create the SDK wrapper class
class ClipperModelWrapper(WaveformToWaveformBase):
    @torch.jit.export
    def is_input_mono(self) -> bool:
        # True tells the SDK the model wants mono input (converted automatically)
        return False

    @torch.jit.export
    def is_output_mono(self) -> bool:
        # True tells the SDK the model outputs mono (duplicated to stereo)
        return False

    @torch.jit.export
    def get_native_sample_rates(self) -> List[int]:
        # Sample rates the model prefers; an empty list means all are supported
        return []

    @torch.jit.export
    def get_native_buffer_sizes(self) -> List[int]:
        # Buffer sizes the model prefers; an empty list means all are supported
        return []

    def do_forward_pass(self, x: Tensor, params: Dict[str, Tensor]) -> Tensor:
        # Run inference here; x is guaranteed to have shape (channels, buffer_size).
        # For simplicity the control values are hardcoded; in real use they come
        # in through the SDK's knob-parameter system (the params dict).
        min_val = -1.0
        max_val = 1.0
        gain = 1.0
        return self.model.forward(x, min_val, max_val, gain)

if __name__ == "__main__":
    # 3. Export the model. This runs the SDK's validation tests and writes a
    # model.nm file that can be dropped straight into the Neutone plugin.
    # Note: a full export also requires the metadata methods shown in the
    # README (get_model_name, get_model_authors, and so on).
    wrapper = ClipperModelWrapper(ClipperModel())
    save_neutone_model(wrapper, Path("export"))
    print("Model exported to export/model.nm")
```

### 3.2 Use it in the DAW

1.  Run the script above to generate the `model.nm` file.
2.  Open your DAW and load the **Neutone FX** plugin.
3.  In the plugin UI, click the load button and select the generated `.nm` file (or search the built-in store if the model has been published).
4.  Your PyTorch model now runs as a native audio effect inside the DAW.

### 3.3 Going further

*   **Parameter control**: the SDK supports up to 4 knob parameters, received in `do_forward_pass` via the `params` dict and automatable from the DAW.
*   **Sample-rate handling**: if your model was only trained at a specific sample rate, declare it in `get_native_sample_rates` and the SDK handles resampling automatically.
*   **More examples**: see the `examples` directory of the official repository for complete implementations of more complex models such as overdrive effects and timbre transfer.

---

## Example use case

An audio researcher has developed an innovative PyTorch-based voice timbre-transfer model and urgently needs to get it into music producers' digital audio workstations (DAWs) for real-time testing and creative work.

### Without neutone_sdk
- **High technology barrier**: the researcher must learn C++ and the JUCE framework just to write a plugin shell, a steep and time-consuming hurdle for a Python-focused AI developer.
- **Complex signal handling**: code for the DAW's varying buffer sizes, sample-rate conversion, and delay compensation must be written by hand, and mistakes easily cause audio crackling or misalignment.
- **Slow iteration**: every parameter tweak triggers another tedious compile, package, and install cycle, making a same-day algorithm-to-DAW validation loop impossible.
- **Hard debugging**: with no profiling tools specialized for neural audio processing, stutters and latency bottlenecks in real-time inference are hard to pin down.

### With neutone_sdk
- **Pure-Python development**: a small amount of Python wraps the PyTorch model into a plugin-compatible artifact; no C++ knowledge needed.
- **Automatic signal adaptation**: built-in buffering, sample-rate conversion, and stereo/mono switching let the model run seamlessly under any DAW settings.
- **Rapid deployment**: the whole path from trained model to live testing in the Neutone FX plugin takes hours, greatly accelerating real-world adoption.
- **Built-in profiling**: the SDK's benchmarking tools monitor model latency and resource usage, making real-time performance easy to optimize.

By shielding researchers from low-level audio-engineering complexity, neutone_sdk lets them focus on algorithmic innovation and turns lab models into professional music-production tools on the same day.

---

## Project metadata

- **Owner**: [Neutone](https://github.com/Neutone), "Next Generation AI tools for Musicians and Artists" (contact@neutone.ai, Twitter @neutone_ai, https://neutone.ai/)
- **Language**: Python (100%)
- **Stars / forks**: 602 / 31
- **Last commit**: 2026-04-01
- **License**: LGPL-2.1
- **Supported OS**: macOS, Windows (host plugins); GPU/RAM requirements: not specified
- **Dependencies**: torch, neutone_sdk
- **Site categories**: Other, Audio, Development Frameworks, Image, Agent
- **GitHub topics**: ai, audio, audio-plugin, deep-learning, realtime-audio, sdk, vst, music, python, machine-learning, pytorch
- **Environment notes**: the tool's main job is wrapping PyTorch audio models as VST3 or AU plugins. The host plugins (Neutone FX/Gen) support macOS (ARM) and Windows. The SDK itself installs via pip; the exact Python version and GPU needs depend on the underlying PyTorch model being wrapped, and the README sets no hard hardware requirements for the SDK itself.
## FAQ

**Q: What should I do about the error `attribute lookup is not defined on python value of type 'type'` when exporting a model?**

This error usually comes from inheriting directly from the SQW (`SampleQueueWrapper`) class, a use case the SDK was not built or tested for. Workarounds: 1. switch to composition, i.e. have your class hold an SQW instance rather than inherit from it; 2. modify the SQW source directly instead of subclassing; 3. try different versions of torch and libtorch, since inheritance has long been problematic in libtorch and is improving. Do not try to work around it by adding static methods; that usually does not help. ([source](https://github.com/Neutone/neutone_sdk/issues/45))

**Q: How do I fix the `Format not recognised` error when exporting with `save_neutone_model` on Windows?**

This is usually caused by the soundfile library failing to recognize the default MP3 audio format inside a BytesIO object on Windows. Reinstall the latest soundfile (0.12.1 or newer), e.g. `pip install --upgrade soundfile`. After updating, mp3 files should be read correctly and the export error should disappear. ([source](https://github.com/Neutone/neutone_sdk/issues/67))

**Q: Why does the output sound low-pass filtered or otherwise degraded when the DAW's sample rate does not match the model's?**

This was caused by the bandwidth setting of the low-pass filter the Neutone plugin uses during sample-rate conversion. It has been fixed in later SDK updates: upgrade to Neutone FX 1.5 or later and the latest neutone-sdk for improved sample-rate conversion and consistent audio quality. ([source](https://github.com/Neutone/neutone_sdk/issues/55))

**Q: How do I submit my own model to the official Neutone plugin and website?**

Open a GitHub issue whose body contains: 1. a brief description of the model and how it can contribute to the community; 2. the confirmation checklist (the model has been tested locally in the Neutone plugin, with a public download link to the `.nm` file); 3. the full metadata JSON (model name, authors, version, descriptions, technical details, paper/code links, tags, citation, and parameter definitions). After maintainer review, the model is integrated into the plugin and website and may be promoted on social media. ([source](https://github.com/Neutone/neutone_sdk/issues/60))

**Q: Do I need to call `.eval()` and `.to("cpu")` manually when writing Neutone model code?**

No. If you are running PyTorch without GPU acceleration (Neutone's standard runtime environment), explicit `.eval()` and `.to("cpu")` calls are redundant. The SDK handles evaluation mode and device placement internally, so removing these calls simplifies the code. ([source](https://github.com/Neutone/neutone_sdk/issues/55))

**Q: Can the original TorchScript (`.ts`) file be extracted from a published `.nm` model file?**

Generally not directly; it also depends on what the model creator did before wrapping the model. Creators can, however, choose to publish the corresponding `.ts` file for the community. ([source](https://github.com/Neutone/neutone_sdk/issues/60))

## Releases

- **v1.5.2** (2025-11-13): relaxed the torch and numpy version requirements for more flexible environment setup.
- **v1.5.0** (2025-10-24): adds support for non-realtime models, together with the new Neutone Gen 0.2.0 plugin (still in beta). Developers can now wrap audio-to-audio models that run slower than realtime, as well as non-causal and/or non-streaming models. The non-realtime SDK also supports multiple input and output audio files, plus text input. See the [examples section of the README](https://github.com/Neutone/neutone_sdk?tab=readme-ov-file#gen-1) for how to use the new SDK features, and the [download section](https://github.com/Neutone/neutone_sdk?tab=readme-ov-file#gen) for the Gen 0.2.0 plugin and several pre-wrapped models.
- **v1.4.0** (2023-10-05): two major updates:
  - CLI benchmarking tools now ship with the package: run `python -m neutone_sdk.benchmark benchmark-speed --model_file <path_to_model.nm>` and `python -m neutone_sdk.benchmark benchmark-latency --model_file <path_to_model.nm>` for speed and latency, and `python -m neutone_sdk.benchmark profile --model_file <path_to_model.nm>` for profiling; see the [dedicated README section](https://github.com/QosmoInc/neutone_sdk#benchmarking-and-profiling) for details. A default set of benchmarks also runs automatically when a model is exported.
  - The PyTorch version constraint has been relaxed to 2.2.0, allowing the recently released 2.1.0 to be used.