[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-Zymrael--awesome-neural-ode":3,"tool-Zymrael--awesome-neural-ode":61},[4,18,26,36,44,53],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":17},4358,"openclaw","openclaw\u002Fopenclaw","OpenClaw 是一款专为个人打造的本地化 AI 助手，旨在让你在自己的设备上拥有完全可控的智能伙伴。它打破了传统 AI 助手局限于特定网页或应用的束缚，能够直接接入你日常使用的各类通讯渠道，包括微信、WhatsApp、Telegram、Discord、iMessage 等数十种平台。无论你在哪个聊天软件中发送消息，OpenClaw 都能即时响应，甚至支持在 macOS、iOS 和 Android 设备上进行语音交互，并提供实时的画布渲染功能供你操控。\n\n这款工具主要解决了用户对数据隐私、响应速度以及“始终在线”体验的需求。通过将 AI 部署在本地，用户无需依赖云端服务即可享受快速、私密的智能辅助，真正实现了“你的数据，你做主”。其独特的技术亮点在于强大的网关架构，将控制平面与核心助手分离，确保跨平台通信的流畅性与扩展性。\n\nOpenClaw 非常适合希望构建个性化工作流的技术爱好者、开发者，以及注重隐私保护且不愿被单一生态绑定的普通用户。只要具备基础的终端操作能力（支持 macOS、Linux 及 Windows WSL2），即可通过简单的命令行引导完成部署。如果你渴望拥有一个懂你",349277,3,"2026-04-06T06:32:30",[13,14,15,16],"Agent","开发框架","图像","数据工具","ready",{"id":19,"name":20,"github_repo":21,"description_zh":22,"stars":23,"difficulty_score":10,"last_commit_at":24,"category_tags":25,"status":17},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,"2026-04-05T11:01:52",[14,15,13],{"id":27,"name":28,"github_repo":29,"description_zh":30,"stars":31,"difficulty_score":32,"last_commit_at":33,"category_tags":34,"status":17},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",150037,2,"2026-04-10T23:33:47",[14,13,35],"语言模型",{"id":37,"name":38,"github_repo":39,"description_zh":40,"stars":41,"difficulty_score":32,"last_commit_at":42,"category_tags":43,"status":17},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",108322,"2026-04-10T11:39:34",[14,15,13],{"id":45,"name":46,"github_repo":47,"description_zh":48,"stars":49,"difficulty_score":32,"last_commit_at":50,"category_tags":51,"status":17},6121,"gemini-cli","google-gemini\u002Fgemini-cli","gemini-cli 是一款由谷歌推出的开源 AI 命令行工具，它将强大的 Gemini 大模型能力直接集成到用户的终端环境中。对于习惯在命令行工作的开发者而言，它提供了一条从输入提示词到获取模型响应的最短路径，无需切换窗口即可享受智能辅助。\n\n这款工具主要解决了开发过程中频繁上下文切换的痛点，让用户能在熟悉的终端界面内直接完成代码理解、生成、调试以及自动化运维任务。无论是查询大型代码库、根据草图生成应用，还是执行复杂的 Git 操作，gemini-cli 都能通过自然语言指令高效处理。\n\n它特别适合广大软件工程师、DevOps 
人员及技术研究人员使用。其核心亮点包括支持高达 100 万 token 的超长上下文窗口，具备出色的逻辑推理能力；内置 Google 搜索、文件操作及 Shell 命令执行等实用工具；更独特的是，它支持 MCP（模型上下文协议），允许用户灵活扩展自定义集成，连接如图像生成等外部能力。此外，个人谷歌账号即可享受免费的额度支持，且项目基于 Apache 2.0 协议完全开源，是提升终端工作效率的理想助手。",100752,"2026-04-10T01:20:03",[52,13,15,14],"插件",{"id":54,"name":55,"github_repo":56,"description_zh":57,"stars":58,"difficulty_score":32,"last_commit_at":59,"category_tags":60,"status":17},4721,"markitdown","microsoft\u002Fmarkitdown","MarkItDown 是一款由微软 AutoGen 团队打造的轻量级 Python 工具，专为将各类文件高效转换为 Markdown 格式而设计。它支持 PDF、Word、Excel、PPT、图片（含 OCR）、音频（含语音转录）、HTML 乃至 YouTube 链接等多种格式的解析，能够精准提取文档中的标题、列表、表格和链接等关键结构信息。\n\n在人工智能应用日益普及的今天，大语言模型（LLM）虽擅长处理文本，却难以直接读取复杂的二进制办公文档。MarkItDown 恰好解决了这一痛点，它将非结构化或半结构化的文件转化为模型“原生理解”且 Token 效率极高的 Markdown 格式，成为连接本地文件与 AI 分析 pipeline 的理想桥梁。此外，它还提供了 MCP（模型上下文协议）服务器，可无缝集成到 Claude Desktop 等 LLM 应用中。\n\n这款工具特别适合开发者、数据科学家及 AI 研究人员使用，尤其是那些需要构建文档检索增强生成（RAG）系统、进行批量文本分析或希望让 AI 助手直接“阅读”本地文件的用户。虽然生成的内容也具备一定可读性，但其核心优势在于为机器",93400,"2026-04-06T19:52:38",[52,14],{"id":62,"github_repo":63,"name":64,"description_en":65,"description_zh":66,"ai_summary_zh":67,"readme_en":68,"readme_zh":69,"quickstart_zh":70,"use_case_zh":71,"hero_image_url":72,"owner_login":73,"owner_name":74,"owner_avatar_url":75,"owner_bio":76,"owner_company":77,"owner_location":78,"owner_email":78,"owner_twitter":79,"owner_website":80,"owner_url":81,"languages":78,"stars":82,"forks":83,"last_commit_at":84,"license":85,"difficulty_score":86,"env_os":87,"env_gpu":88,"env_ram":88,"env_deps":89,"category_tags":92,"github_topics":93,"view_count":32,"oss_zip_url":78,"oss_zip_packed_at":78,"status":17,"created_at":103,"updated_at":104,"faqs":105,"releases":142},6580,"Zymrael\u002Fawesome-neural-ode","awesome-neural-ode","A collection of resources regarding the interplay between differential equations, deep learning, dynamical systems, control and numerical methods.","awesome-neural-ode 是一个专注于微分方程与深度学习交叉领域的开源资源合集。它系统性地整理了关于神经微分方程（Neural ODEs）、动力系统、控制理论及数值计算方法的前沿论文、代码库和技术博客。\n\n在传统深度学习中，模型通常由离散的层堆叠而成，难以高效处理连续时间数据或不规则采样序列。awesome-neural-ode 通过汇集将神经网络视为连续动力系统的研究成果，帮助开发者利用微分方程建模来突破这一限制。它不仅涵盖了神经 ODE、SDE（随机微分方程）和 CDE（控制微分方程）等核心架构的训练与加速技巧，还包含了科学机器学习（Scientific ML）中利用深度学习求解微分方程及发现物理模型的方法。\n\n该资源库特别适合人工智能研究人员、算法工程师以及对数学原理有浓厚兴趣的开发者使用。其独特亮点在于提供了细致的主题标签分类（如图像、序列、系统理论等），让用户能快速定位到生成模型、时间序列预测或优化理论等具体方向的相关资料。无论是希望深入理解连续深度学习的理论基础，还是寻找解决实际工程问题的代码实现，awesome-neural-ode 都是一份极具价值的导航指南","awesome-neural-ode 是一个专注于微分方程与深度学习交叉领域的开源资源合集。它系统性地整理了关于神经微分方程（Neural ODEs）、动力系统、控制理论及数值计算方法的前沿论文、代码库和技术博客。\n\n在传统深度学习中，模型通常由离散的层堆叠而成，难以高效处理连续时间数据或不规则采样序列。awesome-neural-ode 通过汇集将神经网络视为连续动力系统的研究成果，帮助开发者利用微分方程建模来突破这一限制。它不仅涵盖了神经 ODE、SDE（随机微分方程）和 CDE（控制微分方程）等核心架构的训练与加速技巧，还包含了科学机器学习（Scientific ML）中利用深度学习求解微分方程及发现物理模型的方法。\n\n该资源库特别适合人工智能研究人员、算法工程师以及对数学原理有浓厚兴趣的开发者使用。其独特亮点在于提供了细致的主题标签分类（如图像、序列、系统理论等），让用户能快速定位到生成模型、时间序列预测或优化理论等具体方向的相关资料。无论是希望深入理解连续深度学习的理论基础，还是寻找解决实际工程问题的代码实现，awesome-neural-ode 都是一份极具价值的导航指南。","\u003Cdiv align=\"center\">\n    \u003Ch1>Awesome Neural ODE\u003C\u002Fh1>\n    \u003Ca href=\"https:\u002F\u002Fawesome.re\">\u003Cimg src=\"https:\u002F\u002Fawesome.re\u002Fbadge.svg\"\u002F>\u003C\u002Fa>\n\u003C\u002Fdiv>\n\nA collection of resources regarding the interplay between differential equations, dynamical systems, deep learning, control, numerical methods and scientific machine learning.\n\n**NOTE:** Feel free to suggest additions via `Issues` or `Pull Requests`.\n\nThe repo further introduces a (rough) categorization by assigning topic labels to each work. 
These are not supposed to be comprehensive or precise, and should only provide a rough idea of the contents.\n\n![IC](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fimages-blue.svg?logo=Google%20Classroom)\n![TS](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fsequences-purple.svg?logo=Altium%20Designer)\n![DS](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fsystems-red.svg?logo=Graphcool)\n![T](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Ftheory-black.svg?logo=MusicBrainz)\n![NM](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fnumerics-green.svg?logo=CodeFactor)\n\n# Table of Contents\n\n* **Differential Equations in Deep Learning**\n\n\t* [General Architectures](#general-architectures)\n\t\n\t* [Neural Operators](#neural-operators)\n\t\n\t* [Neural ODEs](#neural-odes)\n\t\n\t\t* [Training of Neural ODEs](#training-of-neural-odes)\n\t\t\n\t\t* [Speeding up continuous models](#speeding-up-continuous-models)\n\t\t\n\t\t* [Control with Neural ODEs](#control-with-neural-odes)\n\t\n\t* [Neural GDEs](#neural-gdes)\n\t\n\t* [Neural SDEs](#neural-sdes)\n\t\n\t* [Neural CDEs](#neural-cdes)\n\t\n\t* [Generative Models](#generative-models)\n\t\n\t\t* [Normalizing Flows](#normalizing-flows)\n\t\t\n\t\t* [Score-Matching SDEs](#score-matching-sdes) \t\n\t\n\t* [Applications](#applications)\n\t\n* **Deep Learning Methods for Differential Equations (Scientific ML)**\n\n\t* [Solving Differential Equations](#solving-differential-equations)\n\t\n\t* [Model Discovery](#model-discovery)\n\t\n* **Dynamical System View of Deep Learning**\n\n\t* [Recurrent Neural Networks](#recurrent-neural-networks)\n\t\n\t* [Theory and Perspectives](#theory-and-perspectives)\n\t\n\t* [Optimization](#optimization)\n\t\n* [Software and Libraries](#software-and-libraries)\n\n* [Websites and Blogs](#websites-and-blogs)\n\n## Differential Equations in Deep Learning\n\n### General Architectures\n\n* Recurrent Neural Networks for Multivariate Time Series with Missing Values: [Scientific Reports18](https:\u002F\u002Farxiv.org\u002Fabs\u002F1606.01865)\n\n![TS](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fsequences-purple.svg?logo=Altium%20Designer)\n\n> Multivariate time series data in practical applications, such as health care, geoscience, and biology, are characterized by a variety of missing values. We propose a GRU-based model called GRU-D, in which a decay mechanism is designed for the input variables and the hidden states to capture the aforementioned properties. We introduce decay rates in the model to control the decay mechanism by considering the following important factors.\n\n* Learning unknown ODE models with Gaussian processes: [arXiv18](https:\u002F\u002Farxiv.org\u002Fabs\u002F1803.04303), [code](https:\u002F\u002Fgithub.com\u002Fcagatayyildiz\u002Fnpde\u002F)\n\n![DS](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fsystems-red.svg?logo=Graphcool)\n\n> However, for many complex systems it is practically impossible to determine the equations or\ninteractions governing the underlying dynamics. In these settings, parametric ODE model cannot be formulated. Here, we overcome this issue by introducing a novel paradigm of nonparametric ODE modeling that can learn the underlying dynamics of arbitrary continuous-time systems without prior knowledge. 
We propose to learn non-linear, unknown differential functions from state observations using Gaussian process vector fields within the exact ODE formalism.\n\n* Deep Equilibrium Models: [NeurIPS19](https:\u002F\u002Farxiv.org\u002Fabs\u002F1909.01377)\n\n![DS](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fsystems-red.svg?logo=Graphcool) ![TS](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fsequences-purple.svg?logo=Altium%20Designer)\n\n> We present a new approach to modeling sequential data: the deep equilibrium model (DEQ). Motivated by an observation that the hidden layers of many existing deep sequence models converge towards some fixed point, we propose the DEQ approach that directly finds these equilibrium points via root-finding.\n\n* Fast and Deep Graph Neural Networks: [AAAI20](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1911.08941.pdf)\n\n![DS](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fsystems-red.svg?logo=Graphcool)\n\n> We address the efficiency issue for the construction of a deep graph neural network (GNN). The approach exploits the idea of representing each input graph as a fixed point of a dynamical system (implemented through a recurrent neural network), and leverages a deep architectural organization of the recurrent units. Efficiency is gained by many aspects, including the use of small and very sparse networks, where the weights of the recurrent units are left untrained under the stability condition introduced in this work.\n\n* Hamiltonian Neural Networks: [NeurIPS19](https:\u002F\u002Farxiv.org\u002Fabs\u002F1906.01563)\n\n![DS](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fsystems-red.svg?logo=Graphcool)\n\n> In this paper, we draw inspiration from Hamiltonian mechanics to train models that learn and respect exact conservation laws in an unsupervised manner.\n\n* Deep Lagrangian Networks: Using Physics as Model Prior for Deep Learning: [ICLR19](https:\u002F\u002Farxiv.org\u002Fabs\u002F1907.04490)\n\n![DS](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fsystems-red.svg?logo=Graphcool)\n\n> We propose Deep Lagrangian Networks (DeLaN) as a deep network structure upon which Lagrangian Mechanics have been imposed. DeLaN can learn the equations of motion of a mechanical system (i.e., system dynamics) with a deep network efficiently while ensuring physical plausibility. The resulting DeLaN network performs very well at robot tracking control.\n\n* Lagrangian Neural Networks: [ICLR20 DeepDiffEq](https:\u002F\u002Farxiv.org\u002Fabs\u002F2003.04630)\n\n![DS](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fsystems-red.svg?logo=Graphcool)\n\n> We propose Lagrangian Neural Networks (LNNs), which can parameterize arbitrary Lagrangians using neural networks. In contrast to models that learn Hamiltonians, LNNs do not require canonical coordinates, and thus perform well in situations where canonical momenta are unknown or difficult to compute.\n\n* Simplifying Hamiltonian and Lagrangian Neural Networks via Explicit Constraints: [NeurIPS20](https:\u002F\u002Farxiv.org\u002Fabs\u002F2010.13581), [code](https:\u002F\u002Fgithub.com\u002Fmfinzi\u002Fconstrained-hamiltonian-neural-networks)\n\n![DS](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fsystems-red.svg?logo=Graphcool)\n\n> Reasoning about the physical world requires models that are endowed with the right inductive biases to learn the underlying dynamics. 
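Recent works improve generalization for predicting trajectories by learning the Hamiltonian or Lagrangian of a system rather than the differential equations directly. While these methods encode the constraints of the systems using generalized coordinates, we show that embedding the system into Cartesian coordinates and enforcing the constraints explicitly with Lagrange multipliers dramatically simplifies the learning problem.\n\nAs a concrete taste of the Hamiltonian-prior idea shared by the entries above, a minimal sketch (illustrative only, not code from any cited paper; the class name and layer sizes are assumptions) derives the dynamics from a learned scalar H(q, p) via autograd, so the flow respects the learned conservation structure by construction:\n\n```python\nimport torch\nimport torch.nn as nn\n\nclass HNN(nn.Module):\n    # Learn a scalar Hamiltonian H(q, p); dynamics follow from its gradients.\n    def __init__(self, dim=1):\n        super().__init__()\n        self.H = nn.Sequential(nn.Linear(2 * dim, 64), nn.Tanh(), nn.Linear(64, 1))\n\n    def forward(self, x):\n        # x = (q, p); returns (dq\u002Fdt, dp\u002Fdt) = (dH\u002Fdp, -dH\u002Fdq)\n        x = x.requires_grad_(True)\n        grad = torch.autograd.grad(self.H(x).sum(), x, create_graph=True)[0]\n        dHdq, dHdp = grad.chunk(2, dim=-1)\n        return torch.cat([dHdp, -dHdq], dim=-1)\n\n# Train by regressing predicted derivatives onto observed ones:\nmodel = HNN()\nx = torch.randn(128, 2)          # observed states (q, p); placeholder data\ndx_true = torch.randn(128, 2)    # observed time derivatives; placeholder data\nloss = ((model(x) - dx_true) ** 2).mean()\nloss.backward()\n```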
\n### Neural Operators \n\n* Neural Operator: Learning Maps Between Function Spaces: [arXiv21](https:\u002F\u002Farxiv.org\u002Fabs\u002F2108.08481)\n\n> We propose a generalization of neural networks to learn operators that map between infinite dimensional function spaces. We formulate the approximation of operators by composition of a class of linear integral operators and nonlinear activation functions, so that the composed operator can approximate complex nonlinear operators. We prove a universal approximation theorem for our construction. Furthermore, we introduce four classes of operator parameterizations: graph-based operators, low-rank operators, multipole graph-based operators, and Fourier operators and describe efficient algorithms for computing with each one.\n\n* Fourier Neural Operator for Parametric Partial Differential Equations: [ICLR 2021](https:\u002F\u002Farxiv.org\u002Fabs\u002F2010.08895)\n\n> We formulate a new neural operator by parameterizing the integral kernel directly in Fourier space, allowing for an expressive and efficient architecture.\n\n* FourCastNet: A Global Data-driven High-resolution Weather Model using Adaptive Fourier Neural Operators\n\n> FourCastNet, short for Fourier Forecasting Neural Network, is a global data-driven weather forecasting model that provides accurate short to medium-range global predictions at 0.25° resolution. FourCastNet accurately forecasts high-resolution, fast-timescale variables such as the surface wind speed, precipitation, and atmospheric water vapor.\n\n* Transform Once: Efficient Operator Learning in Frequency Domain\n\n> This work introduces a blueprint for frequency domain learning through a single transform: transform once (T1). To enable efficient, direct learning in the frequency domain we develop a variance preserving weight initialization scheme and address the open problem of choosing a transform. Our results noticeably streamline the design process of frequency-domain models, pruning redundant transforms, and leading to speedups of 3x to 10x that increase with data resolution and model size. We perform extensive experiments on learning to solve partial differential equations, including incompressible Navier-Stokes, turbulent flows around airfoils, and high-resolution video of smoke dynamics. T1 models improve on the test performance of SOTA FDMs while requiring significantly less computation, with over 20% reduction in predictive error across tasks.\n\n### Neural ODEs\n\n* Neural Ordinary Differential Equations (best paper award): [NeurIPS18](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1806.07366.pdf) \n\n![T](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Ftheory-black.svg?logo=MusicBrainz) ![TS](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fsequences-purple.svg?logo=Altium%20Designer)\n
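The core idea, in code: a residual block's discrete update h ← h + f(h) becomes integration of dh\u002Fdt = f(h, t) by a black-box solver. A minimal sketch assuming `torchdiffeq` (an illustration, not the authors' reference implementation):\n\n```python\nimport torch\nimport torch.nn as nn\nfrom torchdiffeq import odeint\n\nclass ODEBlock(nn.Module):\n    # Continuous-depth replacement for a stack of residual layers.\n    def __init__(self, dim):\n        super().__init__()\n        self.f = nn.Sequential(nn.Linear(dim, dim), nn.Tanh(), nn.Linear(dim, dim))\n\n    def forward(self, h0):\n        t = torch.tensor([0.0, 1.0])   # \"depth\" is now an integration interval\n        h = odeint(lambda t, h: self.f(h), h0, t, rtol=1e-5, atol=1e-7)\n        return h[-1]                   # hidden state at t = 1\n\nblock = ODEBlock(dim=8)\nout = block(torch.randn(32, 8))        # same shape in, same shape out\n```\n\n> We introduce a new family of deep neural network models. Instead of specifying a discrete sequence of hidden layers, we parameterize the derivative of the hidden state using a neural network. 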
We also construct continuous normalizing flows, a generative model that can train by maximum likelihood, without partitioning or ordering the data dimensions\n\n* Dissecting Neural ODEs (oral): [NeurIPS20](https:\u002F\u002Farxiv.org\u002Fabs\u002F2002.08071) \n\n![T](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Ftheory-black.svg?logo=MusicBrainz) ![DS](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fsystems-red.svg?logo=Graphcool) ![IC](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fimages-blue.svg?logo=Google%20Classroom) ![TS](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fsequences-purple.svg?logo=Altium%20Designer)\n\n> Continuous deep learning architectures have recently re-emerged as *Neural Ordinary Differential Equations* (Neural ODEs). This infinite--depth approach theoretically bridges the gap between deep learning and dynamical systems, offering a novel perspective. However, deciphering the inner working of these models is still an open challenge, as most applications apply them as generic *black--box* modules. In this work we \"open the box\", further developing the continuous-depth formulation with the aim of clarifying the influence of several design choices on the underlying dynamics. \n\n* Differentiable Multiple Shooting Layers: [NeurIPS21](https:\u002F\u002Farxiv.org\u002Fabs\u002F2106.03885)\n\n> We detail a novel class of implicit neural models. Leveraging time-parallel methods for differential equations, Multiple Shooting Layers (MSLs) seek solutions of initial value problems via parallelizable root-finding algorithms. MSLs broadly serve as drop-in replacements for neural ordinary differential equations (Neural ODEs) with improved efficiency in number of function evaluations (NFEs) and wall-clock inference time.\n\n* Augmented Neural ODEs: [NeurIPS19](https:\u002F\u002Farxiv.org\u002Fabs\u002F1904.01681) \n\n![IC](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fimages-blue.svg?logo=Google%20Classroom)\n\n> We show that Neural Ordinary Differential Equations (ODEs) learn representations that preserve the topology of the input space and prove that this implies the existence of functions Neural ODEs cannot represent. 
To address these limitations, we introduce Augmented Neural ODEs which, in addition to being more expressive models, are empirically more stable, generalize better and have a lower computational cost than Neural ODEs.\n\n* Latent ODEs for Irregularly-Sampled Time Series: [NeurIPS19](https:\u002F\u002Farxiv.org\u002Fabs\u002F1907.03907)\n\n![TS](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fsequences-purple.svg?logo=Altium%20Designer)\n\n* ODE2VAE: Deep generative second order ODEs with Bayesian neural networks: [NeurIPS19](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1905.10994.pdf)\n\n![TS](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fsequences-purple.svg?logo=Altium%20Designer)\n\n* Symplectic ODE-Net: Learning Hamiltonian Dynamics with Control: [arXiv19](https:\u002F\u002Farxiv.org\u002Fabs\u002F1909.12077)\n\n* Stable Neural Flows: [arXiv20](https:\u002F\u002Farxiv.org\u002Fabs\u002F2003.08063) \n\n![DS](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fsystems-red.svg?logo=Graphcool)\n\n* On Second Order Behaviour in Augmented Neural ODEs [NeurIPS20](https:\u002F\u002Farxiv.org\u002Fabs\u002F2006.07220)\n\n![TS](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fsequences-purple.svg?logo=Altium%20Designer)\n\n* Neural Hybrid Automata: Learning Dynamics with Multiple Modes and Stochastic Transitions: [NeurIPS21](https:\u002F\u002Farxiv.org\u002Fabs\u002F2106.04165)\n\n> Effective control and prediction of dynamical systems often require appropriate handling of continuous-time and discrete, event-triggered processes. Stochastic hybrid systems (SHSs), common across engineering domains, provide a formalism for dynamical systems subject to discrete, possibly stochastic, state jumps and multi-modal continuous-time flows. Despite the versatility and importance of SHSs across applications, a general procedure for the explicit learning of both discrete events and multi-mode continuous dynamics remains an open problem. This work introduces Neural Hybrid Automata (NHAs), a recipe for learning SHS dynamics without a priori knowledge on the number of modes and inter-modal transition dynamics. NHAs provide a systematic inference method based on normalizing flows, neural differential equations and self-supervision.\n\n#### Training of Neural ODEs\n\n* Accelerating Neural ODEs with Spectral Elements: [arXiv19](https:\u002F\u002Farxiv.org\u002Fabs\u002F1906.07038) \n\n![NM](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fnumerics-green.svg?logo=CodeFactor)\n\n* Adaptive Checkpoint Adjoint Method for Gradient Estimation in Neural ODE: [ICML20](https:\u002F\u002Farxiv.org\u002Fabs\u002F2006.02493) \n\n![NM](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fnumerics-green.svg?logo=CodeFactor) ![IC](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fimages-blue.svg?logo=Google%20Classroom)\n\n* MALI: A memory efficient and reverse accurate integrator for Neural ODEs: [ICLR21](https:\u002F\u002Fopenreview.net\u002Fpdf?id=blfSjHeFM_e) \n\n![T](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Ftheory-black.svg?logo=MusicBrainz) ![NM](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fnumerics-green.svg?logo=CodeFactor) ![IC](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fimages-blue.svg?logo=Google%20Classroom)\n\n> Existing implementations of the adjoint method suffer from inaccuracy in reverse-time trajectory, while the naive method and the adaptive checkpoint adjoint method (ACA) have a memory cost that grows with integration time. 
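In this project, based on the asynchronous leapfrog (ALF) solver, we propose the Memory-efficient ALF Integrator (MALI), which has a constant memory cost w.r.t number of solver steps in integration similar to the adjoint method, and guarantees accuracy in reverse-time trajectory (hence accuracy in gradient estimation).\n\nThe trade-off these training papers negotiate shows up directly in `torchdiffeq`, which exposes both backpropagation through the solver (`odeint`) and the O(1)-memory adjoint (`odeint_adjoint`). A minimal sketch; shapes and tolerances are illustrative:\n\n```python\nimport torch\nimport torch.nn as nn\nfrom torchdiffeq import odeint, odeint_adjoint\n\nclass F(nn.Module):\n    # The vector field must be an nn.Module for the adjoint call.\n    def __init__(self):\n        super().__init__()\n        self.net = nn.Sequential(nn.Linear(4, 32), nn.Tanh(), nn.Linear(32, 4))\n\n    def forward(self, t, y):\n        return self.net(y)\n\nf, y0 = F(), torch.randn(16, 4)\nt = torch.linspace(0., 1., 20)\n\ny_direct = odeint(f, y0, t)           # stores solver states: exact gradients,\n                                      # but memory grows with solver steps\ny_adjoint = odeint_adjoint(f, y0, t)  # O(1) memory: recovers states by solving\n                                      # the ODE backwards, as analyzed above\n```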
\n#### Speeding up continuous models\n\n* How to Train Your Neural ODE: [ICML20](https:\u002F\u002Farxiv.org\u002Fabs\u002F2002.02798)\n\n![IC](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fimages-blue.svg?logo=Google%20Classroom)\n\n* Learning Differential Equations that are Easy to Solve: [NeurIPS20](https:\u002F\u002Farxiv.org\u002Fabs\u002F2007.04504) \n\n![NM](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fnumerics-green.svg?logo=CodeFactor)\n\n* Hypersolvers: Toward Fast Continuous-Depth Models: [NeurIPS20](https:\u002F\u002Farxiv.org\u002Fabs\u002F2007.09601) \n\n![NM](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fnumerics-green.svg?logo=CodeFactor)\n\n* \"Hey, that's not an ODE\": Faster ODE Adjoints with 12 Lines of Code: [arXiv20](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2009.09457.pdf) \n\n![NM](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fnumerics-green.svg?logo=CodeFactor)\n\n> Neural differential equations may be trained by backpropagating gradients via the adjoint method. Here, we demonstrate that the particular structure of the adjoint equations makes the usual choices of norm (such as L2) unnecessarily stringent. By replacing it with a more appropriate (semi)norm, fewer steps are unnecessarily rejected and the backpropagation is made faster.\n\n* Interpolation Technique to Speed Up Gradients Propagation in Neural ODEs: [NeurIPS20](https:\u002F\u002Fpapers.nips.cc\u002Fpaper\u002F2020\u002Ffile\u002Fc24c65259d90ed4a19ab37b6fd6fe716-Paper.pdf)\n\n![NM](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fnumerics-green.svg?logo=CodeFactor) ![IC](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fimages-blue.svg?logo=Google%20Classroom)\n\n> We propose a simple interpolation-based method for the efficient approximation of gradients in neural ODE models. We compare it with the reverse dynamic method (known in the literature as “adjoint method”) to train neural ODEs on classification, density estimation, and inference approximation tasks.\n\n* Opening the Blackbox: Accelerating Neural Differential Equations by Regularizing Internal Solver Heuristics: [ICML21](https:\u002F\u002Farxiv.org\u002Fabs\u002F2105.03918)\n\n![NM](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fnumerics-green.svg?logo=CodeFactor)\n\n> Can we force the NDE to learn the version with the least steps while not increasing the training cost? Current strategies to overcome slow prediction require high order automatic differentiation, leading to significantly higher training time. We describe a novel regularization method that uses the internal cost heuristics of adaptive differential equation solvers combined with discrete adjoint sensitivities
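\nThe seminorm trick from the entry above ships with `torchdiffeq` as an adjoint option; the keyword below follows that library's documentation for recent releases, so treat it as an assumption and check your installed version:\n\n```python\nimport torch\nimport torch.nn as nn\nfrom torchdiffeq import odeint_adjoint\n\nclass F(nn.Module):\n    def __init__(self):\n        super().__init__()\n        self.net = nn.Sequential(nn.Linear(4, 32), nn.Tanh(), nn.Linear(32, 4))\n\n    def forward(self, t, y):\n        return self.net(y)\n\ny0 = torch.randn(16, 4, requires_grad=True)\nt = torch.linspace(0., 1., 20)\n\n# Measure the adjoint state with a seminorm so that fewer backward-pass\n# steps are rejected (the \"12 lines of code\" of the paper above).\nys = odeint_adjoint(F(), y0, t, adjoint_options=dict(norm=\"seminorm\"))\nys[-1].sum().backward()\n```\n\n#### Control with Neural ODEs\n\n* Model-based Reinforcement Learning for Semi-Markov Decision Processes with Neural ODEs: [NeurIPS20](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2006.16210.pdf)\n\n> In this paper, we take a model-based approach to continuous-time RL, modeling the dynamics via neural ordinary differential equations (ODEs). 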
Not only is this more sample efficient than model-free approaches, but it allows us to efficiently adapt policies learned using one schedule of interactions with the environment for another.\n\n* Optimal Energy Shaping via Neural Approximators: [arXiv20](https:\u002F\u002Farxiv.org\u002Fabs\u002F2101.05537)\n\n> We introduce optimal energy shaping as an enhancement of classical passivity-based control methods. A promising feature of passivity theory, alongside stability, has traditionally been claimed to be intuitive performance tuning along the execution of a given task. However, a systematic approach to adjust performance within a passive control framework has yet to be developed, as each method relies on few and problem-specific practical insights. Here, we cast the classic energy-shaping control design process in an optimal control framework; once a task-dependent performance metric is defined, an optimal solution is systematically obtained through an iterative procedure relying on neural networks and gradient-based optimization.\n\n### Neural GDEs\n\n* Graph Neural Ordinary Differential Equations (spotlight): [AAAI DLGMA20](https:\u002F\u002Farxiv.org\u002Fabs\u002F1911.07532)\n\n![DS](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fsystems-red.svg?logo=Graphcool) ![TS](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fsequences-purple.svg?logo=Altium%20Designer)\n\n> We introduce the framework of continuous–depth graph neural networks (GNNs). Neural graph ordinary differential equations (Neural GDEs) are formalized as the counterpart to GNNs where the input–output relationship is determined by a continuum of GNN layers, blending discrete topological structures and differential equations. We further introduce general Hybrid Neural GDE models as a hybrid dynamical systems. \n\n* Continuous–Depth Neural Models for Dynamic Graph Prediction: [arXiv21](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2106.11581.pdf), extended version of \"Graph Neural Ordinary Differential Equations\"\n\n![DS](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fsystems-red.svg?logo=Graphcool) ![TS](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fsequences-purple.svg?logo=Altium%20Designer)\n\n> Additional Neural GDE variants are developed to tackle the spatio–temporal setting of dynamic graphs. 
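The evaluation protocol for Neural GDEs spans several application domains, including traffic forecasting and prediction in biological networks.\n\nThe common thread in this section is a vector field that mixes node states through the graph structure. A minimal Neural-GDE-style sketch, with a fixed adjacency and `torchdiffeq` assumed (illustrative, not from the cited papers):\n\n```python\nimport torch\nimport torch.nn as nn\nfrom torchdiffeq import odeint\n\nclass GDEFunc(nn.Module):\n    # dX\u002Fdt = tanh(A_hat X W): a GNN layer used as a continuous vector field.\n    def __init__(self, A, dim):\n        super().__init__()\n        deg = A.sum(1, keepdim=True).clamp(min=1.0)\n        self.A_hat = A * deg.reciprocal()   # row-normalized adjacency\n        self.W = nn.Linear(dim, dim)\n\n    def forward(self, t, X):\n        return torch.tanh(self.A_hat @ self.W(X))\n\nA = (torch.rand(10, 10) < 0.3).float()      # toy graph on 10 nodes\nX0 = torch.randn(10, 8)                     # initial node features\nX1 = odeint(GDEFunc(A, 8), X0, torch.tensor([0.0, 1.0]))[-1]\n```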
\n* GRAND: Graph Neural Diffusion: [arXiv21](https:\u002F\u002Farxiv.org\u002Fabs\u002F2106.10934)\n\n> We present Graph Neural Diffusion (GRAND) that approaches deep learning on graphs as a continuous diffusion process and treats Graph Neural Networks (GNNs) as discretisations of an underlying PDE.\n\n### Neural SDEs\n\n* Neural SDE: Stabilizing Neural ODE Networks with Stochastic Noise: [arXiv19](https:\u002F\u002Farxiv.org\u002Fabs\u002F1906.02355)\n\n* Neural Jump Stochastic Differential Equations: [arXiv19](https:\u002F\u002Farxiv.org\u002Fabs\u002F1905.10403)\n\n![TS](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fsequences-purple.svg?logo=Altium%20Designer)\n\n* Towards Robust and Stable Deep Learning Algorithms for Forward Backward Stochastic Differential Equations: [arXiv19](https:\u002F\u002Farxiv.org\u002Fabs\u002F1910.11623)\n\n![T](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Ftheory-black.svg?logo=MusicBrainz)\n\n* Scalable Gradients and Variational Inference for Stochastic Differential Equations: [AISTATS20](https:\u002F\u002Farxiv.org\u002Fabs\u002F2001.01328)\n\n* Score-Based Generative Modeling through Stochastic Differential Equations (oral): [ICLR21](https:\u002F\u002Fopenreview.net\u002Fpdf?id=PxTIG12RRHS)\n\n![IC](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fimages-blue.svg?logo=Google%20Classroom)\n\n> We present a stochastic differential equation (SDE) that smoothly transforms a complex data distribution to a known prior distribution by slowly injecting noise, and a corresponding reverse-time SDE that transforms the prior distribution back into the data distribution by slowly removing the noise.\n\n* Efficient and Accurate Gradients for Neural SDEs: [NeurIPS21](https:\u002F\u002Farxiv.org\u002Fabs\u002F2105.13493)\n\n> we introduce the reversible Heun method. This is a new SDE solver that is algebraically reversible: eliminating numerical gradient errors, and the first such solver of which we are aware. Moreover it requires half as many function evaluations as comparable solvers, giving up to a 1.98× speedup. Second, we introduce the Brownian Interval: a new, fast, memory efficient, and exact way of sampling *and reconstructing* Brownian motion.\n\n### Neural CDEs\n\n* Neural Controlled Differential Equations for Irregular Time Series (spotlight): [NeurIPS20](https:\u002F\u002Farxiv.org\u002Fabs\u002F2005.08926)\n\n![T](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Ftheory-black.svg?logo=MusicBrainz) ![TS](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fsequences-purple.svg?logo=Altium%20Designer)\n\n> We demonstrate how controlled differential equations may extend the Neural ODE model, which we refer to as the neural controlled differential equation (Neural CDE) model. 
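Just as Neural ODEs are the continuous analogue of a ResNet, the Neural CDE is the continuous analogue of an RNN.\n\nIn code, the construction amounts to interpolating the observations into a continuous path X(t) and driving the hidden state by dz = f(z) dX. A minimal sketch following the `torchcde` README pattern (the function names are that library's, as of recent versions; verify against your install):\n\n```python\nimport torch\nimport torch.nn as nn\nimport torchcde\n\nclass CDEFunc(nn.Module):\n    # Returns a (hidden_dim x input_channels) matrix, contracted against dX\u002Fdt.\n    def __init__(self, input_channels, hidden_dim):\n        super().__init__()\n        self.net = nn.Sequential(nn.Linear(hidden_dim, 64), nn.Tanh(),\n                                 nn.Linear(64, hidden_dim * input_channels))\n        self.shape = (hidden_dim, input_channels)\n\n    def forward(self, t, z):\n        return self.net(z).view(z.shape[0], *self.shape)\n\nx = torch.randn(32, 50, 3)   # (batch, time, channels), e.g. irregular samples\ncoeffs = torchcde.hermite_cubic_coefficients_with_backward_differences(x)\nX = torchcde.CubicSpline(coeffs)            # continuous path X(t)\nz0 = torch.zeros(32, 16)\nz = torchcde.cdeint(X=X, func=CDEFunc(3, 16), z0=z0, t=X.interval)\n# terminal hidden state: z[:, -1]\n```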
\n* Neural CDEs for Long Time Series via the Log-ODE Method: [arXiv20](https:\u002F\u002Farxiv.org\u002Fabs\u002F2009.08295)\n\n![T](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Ftheory-black.svg?logo=MusicBrainz) ![TS](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fsequences-purple.svg?logo=Altium%20Designer)\n\n* Neural Controlled Differential Equations for Online Prediction Tasks: [arXiv21](https:\u002F\u002Farxiv.org\u002Fabs\u002F2106.11028)\n\n![T](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Ftheory-black.svg?logo=MusicBrainz) ![TS](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fsequences-purple.svg?logo=Altium%20Designer)\n\n> We identify several theoretical conditions that interpolation schemes for Neural CDEs should satisfy, such as boundedness and uniqueness. Second, we use these to motivate the introduction of new schemes that address these conditions, offering in particular measurability (for online prediction), and smoothness (for speed).\n\n### Generative Models\n\n#### Normalizing Flows\n\n* Monge-Ampère Flow for Generative Modeling: [arXiv18](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1809.10188.pdf)\n\n![IC](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fimages-blue.svg?logo=Google%20Classroom)\n\n* FFJORD: Free-form Continuous Dynamics for Scalable Reversible Generative Models: [ICLR19](https:\u002F\u002Farxiv.org\u002Fabs\u002F1810.01367)\n\n![IC](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fimages-blue.svg?logo=Google%20Classroom)\n\n* Equivariant Flows: sampling configurations for multi-body systems with symmetric energies: [arXiv19](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1910.00753.pdf)\n\n* Flows for simultaneous manifold learning and density estimation: [NeurIPS20](https:\u002F\u002Farxiv.org\u002Fabs\u002F2003.13913)\n\n![T](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Ftheory-black.svg?logo=MusicBrainz)\n\n> We introduce manifold-learning flows (M-flows), a new class of generative models that simultaneously learn the data manifold as well as a tractable probability density on that manifold. We argue why such models should not be trained by maximum likelihood alone and present a new training algorithm that separates manifold and density updates.\n\n* TrajectoryNet: A Dynamic Optimal Transport Network for Modeling Cellular Dynamics: [arXiv20](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2002.04461.pdf)\n\n* Convex Potential Flows: Universal Probability Distributions with Optimal Transport and Convex Optimization: [arXiv20](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2012.05942.pdf)\n\n![IC](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fimages-blue.svg?logo=Google%20Classroom)\n\n> CP-Flows are the gradient map of a strongly convex neural potential function. The convexity implies invertibility and allows us to resort to convex optimization to solve the convex conjugate for efficient inversion.\n\n#### Diffusion Models\n\n* Score-Based Generative Modeling through Stochastic Differential Equations (best paper award): [ICLR21](https:\u002F\u002Fopenreview.net\u002Fpdf?id=PxTIG12RRHS)\n\n![IC](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fimages-blue.svg?logo=Google%20Classroom)\n\n> Creating noise from data is easy; creating data from noise is generative modeling. 
We present a stochastic differential equation (SDE) that smoothly transforms a complex data distribution to a known prior distribution by slowly injecting noise, and a corresponding reverse-time SDE that transforms the prior distribution back into the data distribution by slowly removing the noise. \n\n* Denoising Diffusion Implicit Models\n\n> Denoising diffusion probabilistic models (DDPMs) have achieved high quality image generation without adversarial training, yet they require simulating a Markov chain for many steps to produce a sample. To accelerate sampling, we present denoising diffusion implicit models (DDIMs), a more efficient class of iterative implicit probabilistic models with the same training procedure as DDPMs. In DDPMs, the generative process is defined as the reverse of a Markovian diffusion process.\n\n### Applications \n\n* Learning Dynamics of Attention: Human Prior for Interpretable Machine Reasoning: [NeurIPS19](https:\u002F\u002Farxiv.org\u002Fabs\u002F1905.11666)\n\n## Deep Learning Methods for Differential Equations\n\n### Solving Differential Equations\n\n* PDE-Net: Learning PDEs From Data: [ICML18](https:\u002F\u002Farxiv.org\u002Fabs\u002F1710.09668)\n\n### Model Discovery\n\n* Universal Differential Equations for Scientific Machine Learning: [arXiv20](https:\u002F\u002Farxiv.org\u002Fabs\u002F2001.04385)\n\n![NM](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fnumerics-green.svg?logo=CodeFactor)\n\n## Dynamical System View of Deep Learning\n\n### Recurrent Neural Networks\n\n* A Comprehensive Review of Stability Analysis of Continuous-Time Recurrent Neural Networks: [IEEE Transactions on Neural Networks 2014](https:\u002F\u002Fieeexplore.ieee.org\u002Fabstract\u002Fdocument\u002F6814892)\n\n![T](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Ftheory-black.svg?logo=MusicBrainz)\n\n* AntisymmetricRNN: A Dynamical System View on Recurrent Neural Networks: [ICLR19](https:\u002F\u002Fopenreview.net\u002Fpdf?id=ryxepo0cFX)\n\n* Recurrent Neural Networks in the Eye of Differential Equations: [arXiv19](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1904.12933.pdf)\n\n![T](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Ftheory-black.svg?logo=MusicBrainz)\n\n* Visualizing memorization in RNNs: [distill19](https:\u002F\u002Fdistill.pub\u002F2019\u002Fmemorization-in-rnns\u002F)\n\n* One step back, two steps forward: interference and learning in recurrent neural networks: [arXiv18](https:\u002F\u002Farxiv.org\u002Fabs\u002F1805.09603)\n\n* Reverse engineering recurrent networks for sentiment classification reveals line attractor dynamics: [arXiv19](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1906.10720.pdf)\n\n* System Identification with Time-Aware Neural Sequence Models: [AAAI20](https:\u002F\u002Farxiv.org\u002Fabs\u002F1911.09431)\n\n* Universality and Individuality in recurrent networks: [NeurIPS19](https:\u002F\u002Farxiv.org\u002Fabs\u002F1907.08549)
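\nA recurring recipe in this subsection (and in \"Stable Architectures\" just below) is to make the hidden dynamics well-behaved by construction, e.g. with an antisymmetric recurrent matrix. A minimal sketch of one such forward-Euler step (hyperparameters illustrative; see the AntisymmetricRNN paper for the actual formulation):\n\n```python\nimport torch\nimport torch.nn as nn\n\nclass AntisymmetricRNNCell(nn.Module):\n    # h' = h + eps * tanh((W - W^T - gamma I) h + V x):\n    # the antisymmetric part has purely imaginary eigenvalues, so the\n    # linearized dynamics neither explode nor contract too quickly;\n    # gamma adds a small stabilizing damping term.\n    def __init__(self, input_dim, hidden_dim, eps=0.01, gamma=0.01):\n        super().__init__()\n        self.W = nn.Parameter(torch.randn(hidden_dim, hidden_dim) * 0.1)\n        self.V = nn.Linear(input_dim, hidden_dim)\n        self.eps, self.gamma = eps, gamma\n\n    def forward(self, x, h):\n        A = self.W - self.W.T - self.gamma * torch.eye(h.shape[-1])\n        return h + self.eps * torch.tanh(h @ A.T + self.V(x))\n\ncell = AntisymmetricRNNCell(4, 32)\nh = torch.zeros(8, 32)\nfor x_t in torch.randn(20, 8, 4):   # unroll over 20 time steps\n    h = cell(x_t, h)\n```\n\n### Theory and Perspectives\n\n* A Proposal on Machine Learning via Dynamical Systems: [Communications in Mathematics and Statistics 2017](https:\u002F\u002Flink.springer.com\u002Fcontent\u002Fpdf\u002F10.1007\u002Fs40304-017-0103-z.pdf)\n\n![T](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Ftheory-black.svg?logo=MusicBrainz)\n\n* Deep Learning Theory Review: An Optimal Control and Dynamical Systems Perspective: [arXiv19](https:\u002F\u002Farxiv.org\u002Fabs\u002F1908.10920)\n\n![T](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Ftheory-black.svg?logo=MusicBrainz)\n\n* Stable Architectures for Deep Neural Networks: 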
[IP17](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1705.03341.pdf)\n\n![T](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Ftheory-black.svg?logo=MusicBrainz)\n\n* Beyond Finite Layer Neural Networks: Bridging Deep Architectures and Numerical Differential Equations: [ICML18](https:\u002F\u002Farxiv.org\u002Fabs\u002F1710.10121)\n\n* Review: Ordinary Differential Equations For Deep Learning: [arXiv19](https:\u002F\u002Farxiv.org\u002Fabs\u002F1911.00502)\n\n### Optimization\n\n* Gradient and Hamiltonian Dynamics Applied to Learning in Neural Networks: [NIPS96](https:\u002F\u002Fpapers.nips.cc\u002Fpaper\u002F1033-gradient-and-hamiltonian-dynamics-applied-to-learning-in-neural-networks.pdf)\n\n* Maximum Principle Based Algorithms for Deep Learning: [JMLR17](https:\u002F\u002Farxiv.org\u002Fabs\u002F1710.09513)\n\n* Hamiltonian Descent Methods: [arXiv18](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1809.05042.pdf)\n\n![T](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Ftheory-black.svg?logo=MusicBrainz)\n\n* Port-Hamiltonian Approach to Neural Network Training: [CDC19](https:\u002F\u002Farxiv.org\u002Fabs\u002F1909.02702), [code](https:\u002F\u002Fgithub.com\u002FZymrael\u002FPortHamiltonianNN)\n\n![T](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Ftheory-black.svg?logo=MusicBrainz)\n\n* An Optimal Control Approach to Deep Learning and Applications to Discrete-Weight Neural Networks: [arXiv19](https:\u002F\u002Farxiv.org\u002Fabs\u002F1803.01299)\n\n* Optimizing Millions of Hyperparameters by Implicit Differentiation: [arXiv19](https:\u002F\u002Farxiv.org\u002Fabs\u002F1911.02590)\n\n* Shadowing Properties of Optimization Algorithms: [NeurIPS19](https:\u002F\u002Fpapers.nips.cc\u002Fpaper\u002F9431-shadowing-properties-of-optimization-algorithms)\n\n## Software and Libraries\n\n### Python\n\n* **torchdyn**: PyTorch library for all things neural differential equations. [repo](https:\u002F\u002Fgithub.com\u002Fdiffeqml\u002Ftorchdyn), [docs](https:\u002F\u002Ftorchdyn.readthedocs.io\u002F)\n* **torchdiffeq**: Differentiable ODE solvers with full GPU support and O(1)-memory backpropagation: [repo](https:\u002F\u002Fgithub.com\u002Frtqichen\u002Ftorchdiffeq)\n* **torchsde**: Stochastic differential equation (SDE) solvers with GPU support and efficient sensitivity analysis: [repo](https:\u002F\u002Fgithub.com\u002Fgoogle-research\u002Ftorchsde)\n* **torchcde**: GPU-capable solvers for controlled differential equations (CDEs): [repo](https:\u002F\u002Fgithub.com\u002Fpatrick-kidger\u002Ftorchcde)\n* **torchSODE**: PyTorch Block-Diagonal ODE solver: [repo](https:\u002F\u002Fgithub.com\u002FZymrael\u002FtorchSODE)\n* **neurodiffeq**: A light-weight & flexible library for solving differential equations using neural networks based on PyTorch: [repo](https:\u002F\u002Fgithub.com\u002FNeuroDiffGym\u002Fneurodiffeq)
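\nThe PyTorch libraries above share a similar calling convention; as one concrete taste, a minimal Neural SDE with `torchsde` (the `noise_type`\u002F`sde_type` attributes and the `f`\u002F`g` methods follow that library's documented interface; the networks are illustrative):\n\n```python\nimport torch\nimport torch.nn as nn\nimport torchsde\n\nclass NeuralSDE(nn.Module):\n    noise_type = \"diagonal\"   # attributes required by torchsde solvers\n    sde_type = \"ito\"\n\n    def __init__(self, dim):\n        super().__init__()\n        self.drift = nn.Sequential(nn.Linear(dim, 32), nn.Tanh(), nn.Linear(32, dim))\n        self.sigma = nn.Parameter(torch.full((dim,), 0.1))\n\n    def f(self, t, y):        # drift term\n        return self.drift(y)\n\n    def g(self, t, y):        # diagonal diffusion term, same shape as y\n        return self.sigma.expand_as(y)\n\nsde, y0 = NeuralSDE(4), torch.randn(16, 4)\nts = torch.linspace(0., 1., 10)\nys = torchsde.sdeint(sde, y0, ts)   # (len(ts), batch, dim), differentiable\n```\n\n### Julia\n\n* **DiffEqFlux**: [repo](https:\u002F\u002Fgithub.com\u002FJuliaDiffEq\u002FDiffEqFlux.jl)\n\n> Neural differential equation solvers with O(1) backprop, GPUs, and stiff+non-stiff DE solvers. Supports stiff and non-stiff neural ordinary differential equations (neural ODEs), neural stochastic differential equations (neural SDEs), neural delay differential equations (neural DDEs), neural partial differential equations (neural PDEs), and neural jump stochastic differential equations (neural jump diffusions). All of these can be solved with high order methods with adaptive time-stepping and automatic stiffness detection to switch between methods. 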
\n\n* **NeuralNetDiffEq**: Implementations of ODE, SDE, and PDE solvers via deep neural networks: [repo](https:\u002F\u002Fgithub.com\u002FJuliaDiffEq\u002FNeuralNetDiffEq.jl)\n\n## Websites and Blogs\n\n* Scientific ML Blog (Chris Rackauckas and SciML): [link](http:\u002F\u002Fwww.stochasticlifestyle.com\u002F)\n","\u003Cdiv align=\"center\">\n    \u003Ch1>精彩的神经ODE\u003C\u002Fh1>\n    \u003Ca href=\"https:\u002F\u002Fawesome.re\">\u003Cimg src=\"https:\u002F\u002Fawesome.re\u002Fbadge.svg\"\u002F>\u003C\u002Fa>\n\u003C\u002Fdiv>\n\n这是一份关于微分方程、动力系统、深度学习、控制理论、数值方法以及科学机器学习之间相互作用的资源合集。\n\n**注意：** 欢迎通过 `Issues` 或 `Pull Requests` 提出补充建议。\n\n该仓库还为每项工作分配了主题标签，以进行（粗略的）分类。这些标签并不全面或精确，仅用于大致了解内容。\n\n![IC](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fimages-blue.svg?logo=Google%20Classroom)\n![TS](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fsequences-purple.svg?logo=Altium%20Designer)\n![DS](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fsystems-red.svg?logo=Graphcool)\n![T](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Ftheory-black.svg?logo=MusicBrainz)\n![NM](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fnumerics-green.svg?logo=CodeFactor)\n\n# 目录\n\n* **深度学习中的微分方程**\n\n\t* [通用架构](#general-architectures)\n\t\n\t* [神经算子](#neural-operators)\n\t\n\t* [神经ODE](#neural-odes)\n\t\n\t\t* [神经ODE的训练](#training-of-neural-odes)\n\t\t\n\t\t* [加速连续模型](#speeding-up-continuous-models)\n\t\t\n\t\t* [基于神经ODE的控制](#control-with-neural-odes)\n\t\n\t* [神经GDE](#neural-gdes)\n\t\n\t* [神经SDE](#neural-sdes)\n\t\n\t* [神经CDE](#neural-cdes)\n\t\n\t* [生成模型](#generative-models)\n\t\n\t\t* [归一化流](#normalizing-flows)\n\t\t\n\t\t* [分数匹配SDE](#score-matching-sdes) \t\n\t\n\t* [应用](#applications)\n\t\n* **用于微分方程的深度学习方法（科学机器学习）**\n\n\t* [求解微分方程](#solving-differential-equations)\n\t\n\t* [模型发现](#model-discovery)\n\t\n* **深度学习的动力系统视角**\n\n\t* [循环神经网络](#recurrent-neural-networks)\n\t\n\t* [理论与观点](#theory-and-perspectives)\n\t\n\t* [优化](#optimization)\n\t\n* [软件与库](#software-and-libraries)\n\n* [网站与博客](#websites-and-blogs)\n\n## 深度学习中的微分方程\n\n### 通用架构\n\n* 面向具有缺失值的多变量时间序列的循环神经网络：[Scientific Reports18](https:\u002F\u002Farxiv.org\u002Fabs\u002F1606.01865)\n\n![TS](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fsequences-purple.svg?logo=Altium%20Designer)\n\n> 在实际应用中，多变量时间序列数据（如医疗保健、地球科学和生物学领域）往往存在各种缺失值。我们提出了一种基于GRU的模型GRU-D，其中为输入变量和隐藏状态设计了一个衰减机制，以捕捉上述特性。我们在模型中引入衰减率，通过考虑以下重要因素来控制这一衰减机制。\n\n* 使用高斯过程学习未知的ODE模型：[arXiv18](https:\u002F\u002Farxiv.org\u002Fabs\u002F1803.04303)，[代码](https:\u002F\u002Fgithub.com\u002Fcagatayyildiz\u002Fnpde\u002F)\n\n![DS](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fsystems-red.svg?logo=Graphcool)\n\n> 然而，对于许多复杂系统而言，确定支配其内在动力学的方程或相互作用在实践中几乎是不可能的。在这种情况下，无法构建参数化的ODE模型。为此，我们提出了一种新颖的非参数化ODE建模范式，能够在无需先验知识的情况下学习任意连续时间系统的底层动力学。我们建议使用高斯过程向量场，在精确的ODE形式框架内，从状态观测中学习非线性、未知的微分函数。\n\n* 深度均衡模型：[NeurIPS19](https:\u002F\u002Farxiv.org\u002Fabs\u002F1909.01377)\n\n![DS](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fsystems-red.svg?logo=Graphcool) ![TS](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fsequences-purple.svg?logo=Altium%20Designer)\n\n> 我们提出了一种新的序列数据建模方法：深度均衡模型（DEQ）。受许多现有深度序列模型的隐藏层会收敛到某个固定点这一现象的启发，我们提出了DEQ方法，直接通过求根算法找到这些平衡点。\n\n* 快速且深层的图神经网络：[AAAI20](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1911.08941.pdf)\n\n![DS](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fsystems-red.svg?logo=Graphcool)\n\n> 我们针对深层图神经网络（GNN）构建效率低下的问题提出了解决方案。该方法利用将每个输入图表示为动力系统固定点（通过循环神经网络实现）的思想，并采用深层的循环单元架构。通过多种方式提升了效率，包括使用小型且非常稀疏的网络，同时在本文提出的稳定性条件下，循环单元的权重无需训练。\n\n* 
哈密顿神经网络：[NeurIPS19](https:\u002F\u002Farxiv.org\u002Fabs\u002F1906.01563)\n\n![DS](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fsystems-red.svg?logo=Graphcool)\n\n> 在本文中，我们从哈密顿力学中获得灵感，训练能够以无监督方式学习并遵守精确守恒定律的模型。\n\n* 深度拉格朗日网络：利用物理学作为深度学习的先验模型：[ICLR19](https:\u002F\u002Farxiv.org\u002Fabs\u002F1907.04490)\n\n![DS](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fsystems-red.svg?logo=Graphcool)\n\n> 我们提出深度拉格朗日网络（DeLaN），这是一种在其基础上施加了拉格朗日力学约束的深度网络结构。DeLaN能够高效地通过深度网络学习机械系统的运动方程（即系统动力学），同时确保物理上的合理性。由此产生的DeLaN网络在机器人跟踪控制方面表现出色。\n\n* 拉格朗日神经网络：[ICLR20 DeepDiffEq](https:\u002F\u002Farxiv.org\u002Fabs\u002F2003.04630)\n\n![DS](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fsystems-red.svg?logo=Graphcool)\n\n> 我们提出了拉格朗日神经网络（LNN），它能够使用神经网络对任意拉格朗日量进行参数化。与学习哈密顿量的模型不同，LNN不需要规范坐标，因此在规范动量未知或难以计算的情况下表现尤为出色。\n\n* 通过显式约束简化哈密顿和拉格朗日神经网络：[NeurIPS20](https:\u002F\u002Farxiv.org\u002Fabs\u002F2010.13581)，[代码](https:\u002F\u002Fgithub.com\u002Fmfinzi\u002Fconstrained-hamiltonian-neural-networks)\n\n![DS](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fsystems-red.svg?logo=Graphcool)\n\n> 对物理世界的推理需要具备正确归纳偏置的模型，以便学习其内在动力学。近期的研究表明，通过学习系统的哈密顿量或拉格朗日量而非直接学习微分方程，可以提高预测轨迹的泛化能力。尽管这些方法使用广义坐标来编码系统的约束条件，但我们证明，将系统嵌入笛卡尔坐标系，并借助拉格朗日乘子显式施加约束，能够显著简化学习问题。\n\n### 神经算子\n\n* 神经算子：学习函数空间之间的映射：[arXiv21](https:\u002F\u002Farxiv.org\u002Fabs\u002F2108.08481)\n\n> 我们提出了一种神经网络的泛化方法，用于学习在无限维函数空间之间进行映射的算子。我们将算子的近似表示为一类线性积分算子与非线性激活函数的复合形式，从而使复合算子能够逼近复杂的非线性算子。我们证明了该构造的通用逼近定理。此外，我们介绍了四类算子参数化方式：基于图的算子、低秩算子、多极点基于图的算子以及傅里叶算子，并描述了每种方式的高效计算算法。\n\n* 用于参数化偏微分方程的傅里叶神经算子：[ICLR 2021](https:\u002F\u002Farxiv.org\u002Fabs\u002F2010.08895)\n\n> 我们通过在傅里叶空间中直接对积分核进行参数化，提出了一种新的神经算子架构，该架构具有强大的表达能力和高效的计算性能。\n\n* FourCastNet：基于自适应傅里叶神经算子的全球数据驱动高分辨率天气模型\n\n> FourCastNet，即傅里叶预报神经网络，是一个全球数据驱动的天气预报模型，能够在0.25°分辨率下提供准确的短期至中期全球预报。FourCastNet能够精确预测诸如地表风速、降水和大气水汽等高分辨率、快速变化的气象变量。\n\n* 变换一次：频域中的高效算子学习\n\n> 本工作通过一次变换提出了频域学习的蓝图：变换一次（T1）。为了实现频域中的高效直接学习，我们开发了一种保持方差的权重初始化方案，并解决了如何选择合适变换这一开放问题。我们的研究显著简化了频域模型的设计流程，去除了冗余的变换操作，使计算速度提升了3到10倍，且随着数据分辨率和模型规模的增加，加速效果更加明显。我们在求解偏微分方程方面进行了大量实验，包括不可压缩的纳维-斯托克斯方程、机翼周围的湍流以及高分辨率烟雾动力学视频等。T1模型在测试性能上优于当前最先进的频域模型，同时所需的计算量大幅减少，在各项任务中的预测误差降低了20%以上。\n\n### 神经ODEs\n\n* 神经常微分方程（最佳论文奖）：[NeurIPS18](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1806.07366.pdf) \n\n![T](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Ftheory-black.svg?logo=MusicBrainz) ![TS](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fsequences-purple.svg?logo=Altium%20Designer)\n\n> 我们引入了一种新型的深度神经网络模型。不同于传统模型中离散的隐藏层序列，我们使用神经网络来参数化隐藏状态的导数。此外，我们还构建了连续归一化流，这是一种可以通过最大似然估计进行训练的生成模型，无需对数据维度进行划分或排序。\n\n* 解析神经ODEs（口头报告）：[NeurIPS20](https:\u002F\u002Farxiv.org\u002Fabs\u002F2002.08071) \n\n![T](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Ftheory-black.svg?logo=MusicBrainz) ![DS](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fsystems-red.svg?logo=Graphcool) ![IC](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fimages-blue.svg?logo=Google%20Classroom) ![TS](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fsequences-purple.svg?logo=Altium%20Designer)\n\n> 近年来，连续深度学习架构以“神经常微分方程”（Neural ODEs）的形式重新兴起。这种无限深度的方法在理论上弥合了深度学习与动力系统之间的鸿沟，提供了一种全新的视角。然而，如何理解这些模型的内部工作机制仍然是一个未解难题，因为大多数应用都将其当作通用的黑盒模块来使用。在本工作中，我们“打开盒子”，进一步发展了连续深度的理论框架，旨在阐明若干设计选择对底层动力学的影响。\n\n* 可微多重射击层：[NeurIPS21](https:\u002F\u002Farxiv.org\u002Fabs\u002F2106.03885)\n\n> 我们详细介绍了新型的隐式神经网络模型。利用微分方程的时间并行方法，多重射击层（MSLs）通过可并行化的根查找算法来求解初值问题。MSLs可以广泛用作神经常微分方程（Neural ODEs）的替代品，其在函数评估次数（NFEs）和推理耗时方面均有所改进。\n\n* 增广神经ODEs：[NeurIPS19](https:\u002F\u002Farxiv.org\u002Fabs\u002F1904.01681) 
\n\n![IC](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fimages-blue.svg?logo=Google%20Classroom)\n\n> 我们证明了神经常微分方程（ODEs）能够学习保留输入空间拓扑结构的表示，并进一步指出这表明存在一些函数是神经ODEs无法表示的。为了解决这些局限性，我们提出了增广神经ODEs，它不仅更具表达能力，而且在实践中表现得更为稳定、泛化能力更强，计算成本也低于传统的神经ODEs。\n\n* 用于不规则采样时间序列的潜在ODEs：[NeurIPS19](https:\u002F\u002Farxiv.org\u002Fabs\u002F1907.03907)\n\n![TS](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fsequences-purple.svg?logo=Altium%20Designer)\n\n* ODE2VAE：基于贝叶斯神经网络的二阶深度生成ODEs：[NeurIPS19](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1905.10994.pdf)\n\n![TS](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fsequences-purple.svg?logo=Altium%20Designer)\n\n* 辛几何ODE-Net：带控制的哈密顿动力学学习：[arXiv19](https:\u002F\u002Farxiv.org\u002Fabs\u002F1909.12077)\n\n* 稳定的神经流：[arXiv20](https:\u002F\u002Farxiv.org\u002Fabs\u002F2003.08063) \n\n![DS](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fsystems-red.svg?logo=Graphcool)\n\n* 关于增广神经ODEs中的二阶行为 [NeurIPS20](https:\u002F\u002Farxiv.org\u002Fabs\u002F2006.07220)\n\n![TS](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fsequences-purple.svg?logo=Altium%20Designer)\n\n* 神经混合自动机：学习具有多模式和随机转换的动力系统：[NeurIPS21](https:\u002F\u002Farxiv.org\u002Fabs\u002F2106.04165)\n\n> 对动力系统的有效控制和预测通常需要妥善处理连续时间和离散事件触发的过程。随机混合系统（SHSs）广泛应用于工程领域，为那些可能经历离散随机状态跳变以及多模态连续动态的系统提供了形式化的建模框架。尽管SHSs在各种应用中具有广泛的适用性和重要性，但如何显式地学习离散事件和多模态连续动态的通用方法仍然是一个悬而未决的问题。本研究提出了神经混合自动机（NHAs），这是一种无需事先了解模式数量和模态间转换动态即可学习SHS动力学的方法。NHAs基于归一化流、神经微分方程和自监督学习，提供了一种系统的推断方法。\n\n#### 神经ODEs的训练\n\n* 利用谱元法加速神经ODE：[arXiv19](https:\u002F\u002Farxiv.org\u002Fabs\u002F1906.07038)\n\n![NM](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fnumerics-green.svg?logo=CodeFactor)\n\n* 用于神经ODE梯度估计的自适应检查点伴随方法：[ICML20](https:\u002F\u002Farxiv.org\u002Fabs\u002F2006.02493)\n\n![NM](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fnumerics-green.svg?logo=CodeFactor) ![IC](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fimages-blue.svg?logo=Google%20Classroom)\n\n* MALI：一种内存高效且反向精确的神经ODE积分器：[ICLR21](https:\u002F\u002Fopenreview.net\u002Fpdf?id=blfSjHeFM_e)\n\n![T](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Ftheory-black.svg?logo=MusicBrainz) ![NM](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fnumerics-green.svg?logo=CodeFactor) ![IC](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fimages-blue.svg?logo=Google%20Classroom)\n\n> 现有的伴随方法在反向时间轨迹上存在不准确性，而朴素方法和自适应检查点伴随方法（ACA）则会随着积分时间的增长导致内存开销增加。在本项目中，我们基于异步跳跃法（ALF）求解器，提出了一种内存高效的ALF积分器（MALI），它与伴随方法类似，在积分过程中内存开销与求解步骤数无关，并且能够保证反向时间轨迹的准确性（从而确保梯度估计的准确性）。\n\n#### 加速连续模型\n\n* 如何训练你的神经ODE：[ICML20](https:\u002F\u002Farxiv.org\u002Fabs\u002F2002.02798)\n\n![IC](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fimages-blue.svg?logo=Google%20Classroom)\n\n* 学习易于求解的微分方程：[NeurIPS20](https:\u002F\u002Farxiv.org\u002Fabs\u002F2007.04504)\n\n![NM](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fnumerics-green.svg?logo=CodeFactor)\n\n* 超级求解器：迈向快速的连续深度模型：[NeurIPS20](https:\u002F\u002Farxiv.org\u002Fabs\u002F2007.09601)\n\n![NM](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fnumerics-green.svg?logo=CodeFactor)\n\n* “嘿，那可不是ODE”：用12行代码实现更快的ODE伴随方法：[arXiV20](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2009.09457.pdf)\n\n![NM](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fnumerics-green.svg?logo=CodeFactor)\n\n> 神经微分方程可以通过伴随方法进行梯度反向传播来训练。在这里，我们证明了伴随方程的特殊结构使得通常使用的范数（如L2范数）变得不必要地严格。通过将其替换为更合适的（半）范数，可以减少不必要的步骤被拒绝，从而使反向传播过程更加迅速。\n\n* 
用于加速神经ODE中梯度传播的插值技术：[NeurIPS20](https:\u002F\u002Fpapers.nips.cc\u002Fpaper\u002F2020\u002Ffile\u002Fc24c65259d90ed4a19ab37b6fd6fe716-Paper.pdf)\n\n![NM](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fnumerics-green.svg?logo=CodeFactor) ![IC](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fimages-blue.svg?logo=Google%20Classroom)\n\n> 我们提出了一种基于插值的简单方法，用于高效地近似神经ODE模型中的梯度。我们将该方法与反向动力学方法（文献中称为“伴随方法”）进行了比较，以在分类、密度估计和推理近似任务上训练神经ODE。\n\n* 打开黑箱：通过正则化内部求解器启发式来加速神经微分方程：[ICML21](https:\u002F\u002Farxiv.org\u002Fabs\u002F2105.03918)\n\n![NM](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fnumerics-green.svg?logo=CodeFactor)\n\n> 我们能否在不增加训练成本的情况下，迫使NDE学习使用最少步骤的版本？目前克服预测缓慢的策略需要高阶自动微分，这会导致显著增加的训练时间。我们描述了一种新颖的正则化方法，该方法结合了自适应微分方程求解器的内部成本启发式与离散伴随敏感性。\n\n#### 基于神经ODE的控制\n\n* 基于模型的强化学习应用于具有神经ODE的半马尔可夫决策过程：[NeurIPS20](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2006.16210.pdf)\n\n> 在本文中，我们采用基于模型的方法进行连续时间强化学习，通过神经常微分方程（ODEs）对动态进行建模。这种方法不仅比无模型方法更节省样本，还允许我们有效地将基于一种交互方案学到的策略调整到另一种交互方案上。\n\n* 基于神经逼近器的最优能量整形：[arXiv20](https:\u002F\u002Farxiv.org\u002Fabs\u002F2101.05537)\n\n> 我们引入了最优能量整形作为对经典无源控制方法的增强。长期以来，无源控制理论的一个重要特性——除了稳定性之外——被认为是可以在执行特定任务时直观地调整性能。然而，迄今为止，尚未开发出一种系统化的框架来在无源控制范围内调整性能，因为每种方法都依赖于少量且针对特定问题的实践经验。在此，我们将经典的能量整形控制设计过程置于最优控制框架下；一旦定义了与任务相关的性能指标，便可通过迭代程序，借助神经网络和基于梯度的优化方法，系统地获得最优解。\n\n\n\n### 神经GDEs\n\n* 图神经常微分方程（亮点论文）：[AAAI DLGMA20](https:\u002F\u002Farxiv.org\u002Fabs\u002F1911.07532)\n\n![DS](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fsystems-red.svg?logo=Graphcool) ![TS](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fsequences-purple.svg?logo=Altium%20Designer)\n\n> 我们提出了连续深度图神经网络（GNNs）的框架。神经图常微分方程（Neural GDEs）被形式化为GNNs的对应物，其中输入-输出关系由一系列GNN层决定，融合了离散拓扑结构和微分方程。我们进一步引入了通用混合神经GDE模型，作为一种混合动力系统。\n\n* 用于动态图预测的连续深度神经模型：[arXiv21](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2106.11581.pdf)，“图神经常微分方程”的扩展版\n\n![DS](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fsystems-red.svg?logo=Graphcool) ![TS](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fsequences-purple.svg?logo=Altium%20Designer)\n\n> 针对动态图的时空场景，开发了额外的神经GDE变体。神经GDE的评估协议涵盖了多个应用领域，包括交通预测和生物网络中的预测。\n\n* GRAND：图神经扩散：[arXiv21](https:\u002F\u002Farxiv.org\u002Fabs\u002F2106.10934)\n\n> 我们提出了图神经扩散（GRAND），它将图上的深度学习视为一个连续的扩散过程，并将图神经网络（GNNs）视为潜在偏微分方程的离散化表示。\n\n### 神经随机微分方程\n\n* 神经随机微分方程：通过随机噪声稳定神经ODE网络：[arXiv19](https:\u002F\u002Farxiv.org\u002Fabs\u002F1906.02355)\n\n* 神经跳跃随机微分方程：[arXiv19](https:\u002F\u002Farxiv.org\u002Fabs\u002F1905.10403)\n\n![TS](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fsequences-purple.svg?logo=Altium%20Designer)\n\n* 面向正反向随机微分方程的鲁棒且稳定的深度学习算法：[arXiv19](https:\u002F\u002Farxiv.org\u002Fabs\u002F1910.11623)\n\n![T](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Ftheory-black.svg?logo=MusicBrainz)\n\n* 随机微分方程的可扩展梯度与变分推断：[AISTATS20](https:\u002F\u002Farxiv.org\u002Fabs\u002F2001.01328)\n\n* 基于分数的生成模型通过随机微分方程（口头报告）：[ICLR20](https:\u002F\u002Fopenreview.net\u002Fpdf?id=PxTIG12RRHS)\n\n![IC](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fimages-blue.svg?logo=Google%20Classroom)\n\n> 我们提出了一种随机微分方程，它通过缓慢注入噪声将复杂的数据分布平滑地转换为已知的先验分布；同时，还提出了一种对应的逆时间随机微分方程，通过缓慢去除噪声将先验分布重新转换回数据分布。\n\n* 神经随机微分方程的高效精确梯度：[NeurIPS21](https:\u002F\u002Farxiv.org\u002Fabs\u002F2105.13493)\n\n> 我们引入了可逆Heun方法。这是一种新的SDE求解器，具有代数意义上的可逆性，能够消除数值梯度误差，是我们所知的第一个此类求解器。此外，它所需的函数评估次数仅为同类求解器的一半，从而实现最高1.98倍的速度提升。其次，我们提出了布朗区间：一种新型、快速、内存高效且精确的采样及重建布朗运动的方法。\n\n\n### 神经控制微分方程\n\n* 用于不规则时间序列的神经控制微分方程（亮点论文）：[NeurIPS20](https:\u002F\u002Farxiv.org\u002Fabs\u002F2005.08926)\n\n![T](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Ftheory-black.svg?logo=MusicBrainz) 
![TS](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fsequences-purple.svg?logo=Altium%20Designer)\n\n> 我们展示了控制微分方程如何扩展神经ODE模型，即所谓的神经控制微分方程（Neural CDE）模型。正如神经ODE是ResNet的连续版本一样，神经CDE则是RNN的连续版本。\n\n* 基于对数ODE方法的长时序神经控制微分方程：[arXiv20](https:\u002F\u002Farxiv.org\u002Fabs\u002F2009.08295)\n\n![T](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Ftheory-black.svg?logo=MusicBrainz) ![TS](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fsequences-purple.svg?logo=Altium%20Designer)\n\n* 用于在线预测任务的神经控制微分方程：[arXiv21](https:\u002F\u002Farxiv.org\u002Fabs\u002F2106.11028)\n\n![T](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Ftheory-black.svg?logo=MusicBrainz) ![TS](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fsequences-purple.svg?logo=Altium%20Designer)\n\n> 我们确定了神经CDE插值方案应满足的若干理论条件，例如有界性和唯一性。其次，我们基于这些条件提出了新的插值方案，以解决上述问题，特别是提供了可测量性（适用于在线预测）和光滑性（有助于提高速度）。\n\n\n### 生成模型\n\n#### 归一化流\n\n* 蒙日-安培流用于生成建模：[arXiv18](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1809.10188.pdf)\n\n![IC](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fimages-blue.svg?logo=Google%20Classroom)\n\n* FFJORD：用于可扩展可逆生成模型的自由形式连续动力学：[ICLR19](https:\u002F\u002Farxiv.org\u002Fabs\u002F1810.01367)\n\n![IC](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fimages-blue.svg?logo=Google%20Classroom)\n\n* 等变流：用于具有对称能量的多体系统配置采样：[arXiv18](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1910.00753.pdf)\n\n* 同时进行流形学习与密度估计的流：[NeurIPS20](https:\u002F\u002Farxiv.org\u002Fabs\u002F2003.13913)\n\n![T](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Ftheory-black.svg?logo=MusicBrainz)\n\n> 我们提出了流形学习流（M流），这是一类新型生成模型，能够同时学习数据流形以及该流形上的可处理概率密度。我们论证了为何此类模型不应仅依靠最大似然进行训练，并提出了一种新的训练算法，将流形更新与密度更新分开进行。\n\n* TrajectoryNet：用于建模细胞动态的动态最优传输网络 [arXiv20](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2002.04461.pdf)\n\n* 凸势流：结合最优传输与凸优化的通用概率分布：[arXiv20](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2012.05942.pdf)\n\n![IC](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fimages-blue.svg?logo=Google%20Classroom)\n\n> CP流是由强凸神经势函数的梯度映射构成。由于其凸性，CP流具有可逆性，因此我们可以借助凸优化来求解凸共轭，从而实现高效的逆变换。\n\n#### 扩散模型\n\n* 基于分数的生成模型通过随机微分方程（最佳论文奖）：[ICLR21](https:\u002F\u002Fopenreview.net\u002Fpdf?id=PxTIG12RRHS)\n\n![IC](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fimages-blue.svg?logo=Google%20Classroom)\n\n> 从数据中制造噪声很容易；而从噪声中生成数据才是生成建模的核心。我们提出了一种随机微分方程，它通过缓慢注入噪声将复杂的数据分布平滑地转换为已知的先验分布；同时，还提出了一种对应的逆时间随机微分方程，通过缓慢去除噪声将先验分布重新转换回数据分布。\n\n* 去噪扩散隐式模型\n\n> 去噪扩散概率模型（DDPMs）在无需对抗训练的情况下实现了高质量的图像生成，然而它们需要模拟多步马尔可夫链才能生成一个样本。为了加速采样，我们提出了去噪扩散隐式模型（DDIMs），这是一类更高效的迭代隐式概率模型，其训练过程与DDPMs相同。在DDPMs中，生成过程被定义为马尔可夫扩散过程的逆过程。\n\n### 应用\n\n* 注意力动态的学习：人类先验知识在可解释机器推理中的作用：[NeurIPS19](https:\u002F\u002Farxiv.org\u002Fabs\u002F1905.11666)\n\n## 用于微分方程的深度学习方法\n\n### 求解微分方程\n\n* PDE-Net：从数据中学习偏微分方程：[ICML18](https:\u002F\u002Farxiv.org\u002Fabs\u002F1710.09668)\n\n### 模型发现\n\n* 用于科学机器学习的通用微分方程：[arXiv20](https:\u002F\u002Farxiv.org\u002Fabs\u002F2001.04385)\n\n![NM](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fnumerics-green.svg?logo=CodeFactor)\n\n## 深度学习的动力系统视角\n\n### 循环神经网络\n\n* 连续时间循环神经网络稳定性分析的全面综述：[IEEE 神经网络汇刊 2006](https:\u002F\u002Fieeexplore.ieee.org\u002Fabstract\u002Fdocument\u002F6814892)\n\n![T](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Ftheory-black.svg?logo=MusicBrainz)\n\n* 反对称RNN：从动力系统视角看循环神经网络：[ICLR19](https:\u002F\u002Fopenreview.net\u002Fpdf?id=ryxepo0cFX)\n\n* 循环神经网络在微分方程的视角下：[arXiv19](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1904.12933.pdf)\n\n![T](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Ftheory-black.svg?logo=MusicBrainz)\n\n* 
\n### 生成模型\n\n#### 归一化流\n\n* 蒙日-安培流用于生成建模：[arXiv18](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1809.10188.pdf)\n\n![IC](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fimages-blue.svg?logo=Google%20Classroom)\n\n* FFJORD：用于可扩展可逆生成模型的自由形式连续动力学：[ICLR19](https:\u002F\u002Farxiv.org\u002Fabs\u002F1810.01367)\n\n![IC](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fimages-blue.svg?logo=Google%20Classroom)\n\n* 等变流：用于具有对称能量的多体系统配置采样：[arXiv19](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1910.00753.pdf)\n\n* 同时进行流形学习与密度估计的流：[NeurIPS20](https:\u002F\u002Farxiv.org\u002Fabs\u002F2003.13913)\n\n![T](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Ftheory-black.svg?logo=MusicBrainz)\n\n> 我们提出了流形学习流（M流），这是一类新型生成模型，能够同时学习数据流形以及该流形上的可处理概率密度。我们论证了为何此类模型不应仅依靠最大似然进行训练，并提出了一种将流形更新与密度更新分开进行的新训练算法。\n\n* TrajectoryNet：用于建模细胞动态的动态最优传输网络：[arXiv20](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2002.04461.pdf)\n\n* 凸势流：结合最优传输与凸优化的通用概率分布：[arXiv20](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2012.05942.pdf)\n\n![IC](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fimages-blue.svg?logo=Google%20Classroom)\n\n> CP流由强凸神经势函数的梯度映射构成。由于其凸性，CP流具有可逆性，因此我们可以借助凸优化求解凸共轭，从而实现高效的逆变换。\n\n#### 扩散模型\n\n* 通过随机微分方程进行基于分数的生成建模（最佳论文奖）：[ICLR21](https:\u002F\u002Fopenreview.net\u002Fpdf?id=PxTIG12RRHS)\n\n![IC](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fimages-blue.svg?logo=Google%20Classroom)\n\n> 从数据中制造噪声很容易；从噪声中生成数据才是生成建模的核心。我们提出了一种随机微分方程，它通过缓慢注入噪声将复杂的数据分布平滑地转换为已知的先验分布；同时提出对应的逆时间随机微分方程，通过缓慢去除噪声将先验分布重新转换回数据分布。\n\n* 去噪扩散隐式模型：[arXiv20](https:\u002F\u002Farxiv.org\u002Fabs\u002F2010.02502)\n\n> 去噪扩散概率模型（DDPMs）在无需对抗训练的情况下实现了高质量的图像生成，然而它们需要模拟多步马尔可夫链才能生成一个样本。为了加速采样，我们提出了去噪扩散隐式模型（DDIMs），这是一类更高效的迭代隐式概率模型，其训练过程与DDPMs相同。在DDPMs中，生成过程被定义为马尔可夫扩散过程的逆过程。\n\n### 应用\n\n* 注意力动态的学习：人类先验知识在可解释机器推理中的作用：[NeurIPS19](https:\u002F\u002Farxiv.org\u002Fabs\u002F1905.11666)\n\n## 用于微分方程的深度学习方法\n\n### 求解微分方程\n\n* PDE-Net：从数据中学习偏微分方程：[ICML18](https:\u002F\u002Farxiv.org\u002Fabs\u002F1710.09668)\n\n### 模型发现\n\n* 用于科学机器学习的通用微分方程：[arXiv20](https:\u002F\u002Farxiv.org\u002Fabs\u002F2001.04385)\n\n![NM](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fnumerics-green.svg?logo=CodeFactor)\n\n## 深度学习的动力系统视角\n\n### 循环神经网络\n\n* 连续时间循环神经网络稳定性分析的全面综述：[IEEE 神经网络汇刊 2014](https:\u002F\u002Fieeexplore.ieee.org\u002Fabstract\u002Fdocument\u002F6814892)\n\n![T](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Ftheory-black.svg?logo=MusicBrainz)\n\n* 反对称RNN：从动力系统视角看循环神经网络：[ICLR19](https:\u002F\u002Fopenreview.net\u002Fpdf?id=ryxepo0cFX)\n\n* 从微分方程的视角看循环神经网络：[arXiv19](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1904.12933.pdf)\n\n![T](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Ftheory-black.svg?logo=MusicBrainz)\n\n* 可视化RNN中的记忆现象：[distill19](https:\u002F\u002Fdistill.pub\u002F2019\u002Fmemorization-in-rnns\u002F)\n\n* 后退一步，前进两步：循环神经网络中的干扰与学习：[arXiv18](https:\u002F\u002Farxiv.org\u002Fabs\u002F1805.09603)\n\n* 针对情感分类的循环网络逆向工程揭示了线性吸引子动力学：[arXiv19](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1906.10720.pdf)\n\n* 基于时间感知的神经序列模型的系统辨识：[AAAI20](https:\u002F\u002Farxiv.org\u002Fabs\u002F1911.09431)\n\n* 循环网络中的普适性与个体性：[NeurIPS19](https:\u002F\u002Farxiv.org\u002Fabs\u002F1907.08549)\n\n### 理论与观点\n\n* 基于动力系统进行机器学习的建议：[数学与统计通讯 2017](https:\u002F\u002Flink.springer.com\u002Fcontent\u002Fpdf\u002F10.1007\u002Fs40304-017-0103-z.pdf)\n\n![T](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Ftheory-black.svg?logo=MusicBrainz)\n\n* 深度学习理论综述：最优控制与动力系统视角：[arXiv19](https:\u002F\u002Farxiv.org\u002Fabs\u002F1908.10920)\n\n![T](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Ftheory-black.svg?logo=MusicBrainz)\n\n* 深度神经网络的稳定架构：[IP17](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1705.03341.pdf)\n\n![T](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Ftheory-black.svg?logo=MusicBrainz)\n\n* 超越有限层神经网络：连接深度架构与数值微分方程：[ICML18](https:\u002F\u002Farxiv.org\u002Fabs\u002F1710.10121)\n\n* 综述：用于深度学习的常微分方程：[arXiv19](https:\u002F\u002Farxiv.org\u002Fabs\u002F1911.00502)\n\n### 优化\n\n* 梯度与哈密顿动力学在神经网络学习中的应用：[NIPS96](https:\u002F\u002Fpapers.nips.cc\u002Fpaper\u002F1033-gradient-and-hamiltonian-dynamics-applied-to-learning-in-neural-networks.pdf)\n\n* 基于极大值原理的深度学习算法：[JMLR17](https:\u002F\u002Farxiv.org\u002Fabs\u002F1710.09513)\n\n* 哈密顿下降法：[arXiv18](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1809.05042.pdf)\n\n![T](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Ftheory-black.svg?logo=MusicBrainz)\n\n* 基于端口哈密顿方法的神经网络训练：[CDC19](https:\u002F\u002Farxiv.org\u002Fabs\u002F1909.02702)，[代码](https:\u002F\u002Fgithub.com\u002FZymrael\u002FPortHamiltonianNN)\n\n![T](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Ftheory-black.svg?logo=MusicBrainz)\n\n* 深度学习的最优控制方法及其在离散权重神经网络中的应用：[arXiv19](https:\u002F\u002Farxiv.org\u002Fabs\u002F1803.01299)\n\n* 通过隐式微分优化数百万个超参数：[arXiv19](https:\u002F\u002Farxiv.org\u002Fabs\u002F1911.02590)\n\n* 优化算法的影子特性：[NeurIPS19](https:\u002F\u002Fpapers.nips.cc\u002Fpaper\u002F9431-shadowing-properties-of-optimization-algorithms)\n\n## 软件与库\n\n### Python\n\n* **torchdyn**：PyTorch 中用于神经微分方程相关任务的库。[仓库](https:\u002F\u002Fgithub.com\u002Fdiffeqml\u002Ftorchdyn)，[文档](https:\u002F\u002Ftorchdyn.readthedocs.io\u002F)\n* **torchdiffeq**：具有完整 GPU 支持和 O(1) 内存反向传播的可微分常微分方程求解器（伴随法的用法示意见下方代码）：[仓库](https:\u002F\u002Fgithub.com\u002Frtqichen\u002Ftorchdiffeq)\n* **torchsde**：支持 GPU 的随机微分方程（SDE）求解器，并提供高效的灵敏度分析：[仓库](https:\u002F\u002Fgithub.com\u002Fgoogle-research\u002Ftorchsde)\n* **torchcde**：具备 GPU 功能的受控微分方程（CDE）求解器：[仓库](https:\u002F\u002Fgithub.com\u002Fpatrick-kidger\u002Ftorchcde)\n* **torchSODE**：PyTorch 中的块对角常微分方程求解器：[仓库](https:\u002F\u002Fgithub.com\u002FZymrael\u002FtorchSODE)\n* **neurodiffeq**：基于 PyTorch 的轻量级、灵活的库，用于利用神经网络求解微分方程：[仓库](https:\u002F\u002Fgithub.com\u002FNeuroDiffGym\u002Fneurodiffeq)\n
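\n其中 torchdiffeq 的伴随接口正对应其条目中所说的 O(1) 内存反向传播。以下是一个极简示意（基于其公开 API；`ODEF` 等命名为本示例假设）：\n\n```python\nimport torch\nfrom torchdiffeq import odeint_adjoint\n\nclass ODEF(torch.nn.Module):\n    def __init__(self):\n        super().__init__()\n        self.net = torch.nn.Sequential(\n            torch.nn.Linear(2, 32), torch.nn.Tanh(), torch.nn.Linear(32, 2))\n\n    def forward(self, t, x):\n        return self.net(x)\n\nfunc = ODEF()  # 伴随法要求 func 是 nn.Module，以便收集待求导的参数\nx0 = torch.randn(8, 2)\nt = torch.linspace(0., 1., 10)\nxt = odeint_adjoint(func, x0, t, method='dopri5')  # 形状 (len(t), batch, 2)\nloss = xt[-1].pow(2).mean()\nloss.backward()  # 通过反向积分伴随方程计算梯度，无需存储整条前向轨迹\n```\n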
\n### Julia\n\n* **DiffEqFlux**：[仓库](https:\u002F\u002Fgithub.com\u002FJuliaDiffEq\u002FDiffEqFlux.jl)\n\n> 具有 O(1) 反向传播、GPU 支持以及刚性与非刚性求解器的神经微分方程库。支持神经常微分方程（neural ODE）、神经随机微分方程（neural SDE）、神经时滞微分方程（neural DDE）、神经偏微分方程（neural PDE）以及神经跳跃扩散（neural jump diffusions）。所有这些都可以使用高阶方法求解，并具备自适应时间步长和自动刚性检测功能，可在不同方法之间切换。\n\n* **NeuralNetDiffEq**：通过深度神经网络实现的 ODE、SDE 和 PDE 求解器：[仓库](https:\u002F\u002Fgithub.com\u002FJuliaDiffEq\u002FNeuralNetDiffEq.jl)\n\n## 网站与博客\n\n* 科学机器学习博客（Chris Rackauckas 和 SciML）：[链接](http:\u002F\u002Fwww.stochasticlifestyle.com\u002F)","# Awesome Neural ODE 快速上手指南\n\n`awesome-neural-ode` 并非一个单一的 Python 包或可执行工具，而是一个**精选资源列表（Awesome List）**，汇集了关于微分方程、动力系统与深度学习交叉领域的论文、代码库、软件库及博客文章。\n\n本指南将帮助你快速定位核心代码库，搭建开发环境，并运行一个基础的 Neural ODE 示例。\n\n## 1. 环境准备\n\n由于该列表涵盖多种实现（主要基于 PyTorch 和 Julia），以下以目前生态最完善的 **PyTorch** 环境为例。\n\n*   **操作系统**: Linux, macOS 或 Windows (推荐 Linux)\n*   **Python 版本**: >= 3.8\n*   **核心依赖**:\n    *   `torch`: 深度学习框架\n    *   `torchdiffeq`: 由 Neural ODE 原作者维护的可微分 ODE 求解器（最常用）\n    *   `numpy`, `matplotlib`: 数据处理与可视化\n\n> **国内加速建议**:\n> 建议使用清华源或阿里源安装 Python 包，以提升下载速度。\n> ```bash\n> pip install -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple \u003Cpackage_name>\n> ```\n\n## 2. 安装步骤\n\n### 步骤 1: 创建虚拟环境（推荐）\n```bash\npython -m venv neural_ode_env\nsource neural_ode_env\u002Fbin\u002Factivate  # Windows 用户请使用: neural_ode_env\\Scripts\\activate\n```\n\n### 步骤 2: 安装 PyTorch\n请根据你的 CUDA 版本选择对应的安装命令。若无 GPU，可使用 CPU 版本。\n```bash\n# 示例：安装 CPU 版本 (使用清华源加速)\npip install -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple torch torchvision torchaudio\n```\n\n### 步骤 3: 安装核心求解器库\n列表中大多数 Neural ODE 项目依赖 `torchdiffeq`。\n```bash\npip install -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple torchdiffeq\n```\n\n### 步骤 4: 获取参考代码\n你可以克隆列表中提到的官方示例仓库，或直接参考下文的基本使用示例；安装完成后，也可先运行下方的自检脚本确认依赖可用。\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002Frtqichen\u002Ftorchdiffeq.git\ncd torchdiffeq\u002Fexamples\n```\n
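\n以下自检脚本仅为示意（检查项可按需增减），用于确认上述安装步骤的核心依赖均可正常导入：\n\n```python\n# 安装自检：确认核心依赖可导入，并查看是否检测到 GPU\nimport torch\nimport torchdiffeq  # noqa: F401\n\nprint('torch 版本:', torch.__version__)\nprint('CUDA 可用:', torch.cuda.is_available())\nprint('torchdiffeq 导入成功')\n```\n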
\n## 3. 基本使用\n\n以下是一个最简单的 **Neural ODE** 训练示例，用于拟合一条简单的圆形轨迹数据。此代码展示了如何定义 ODE 函数、设置求解器并进行反向传播。\n\n创建一个名为 `simple_neural_ode.py` 的文件：\n\n```python\nimport torch\nimport torch.nn as nn\nfrom torchdiffeq import odeint\nimport matplotlib.pyplot as plt\n\n# 1. 定义神经网络作为微分方程 f(t, x)\nclass ODEFunc(nn.Module):\n    def __init__(self):\n        super(ODEFunc, self).__init__()\n        self.net = nn.Sequential(\n            nn.Linear(2, 50),\n            nn.Tanh(),\n            nn.Linear(50, 2),\n        )\n\n    def forward(self, t, x):\n        # 注意：torchdiffeq 要求 forward(t, x)，t 是标量或时间向量\n        return self.net(x)\n\n# 2. 生成模拟数据 (圆形轨迹)\ntrue_y0 = torch.tensor([[2., 0.]])\nt = torch.linspace(0., 25., 1000)\nwith torch.no_grad():\n    # 这里仅为了演示生成数据，实际训练中这是你的 Ground Truth\n    # 真实动力学为匀速旋转：x(t) = (2cos t, 2sin t)\n    # 形状为 (len(t), batch, 2)，与 odeint 的输出对齐\n    true_y = torch.stack([2 * torch.cos(t), 2 * torch.sin(t)], dim=-1).unsqueeze(1)\n\n# 3. 初始化模型和优化器\nfunc = ODEFunc()\noptimizer = torch.optim.Adam(func.parameters(), lr=1e-3)\n\n# 4. 训练循环\nprint(\"开始训练 Neural ODE...\")\nfor itr in range(100):\n    optimizer.zero_grad()\n    \n    # 求解 ODE: dx\u002Fdt = func(t, x)，输出形状 (len(t), batch, 2)\n    pred_y = odeint(func, true_y0, t)\n    \n    # 计算损失 (MSE)\n    loss = torch.mean((pred_y - true_y) ** 2)\n    loss.backward()\n    optimizer.step()\n    \n    if itr % 10 == 0:\n        print(f\"Iter {itr}, Loss: {loss.item():.4f}\")\n\n# 5. 可视化结果\nplt.figure(figsize=(6, 6))\nplt.plot(true_y[:, 0, 0].numpy(), true_y[:, 0, 1].numpy(), 'b-', label='True Dynamics')\nplt.plot(pred_y.detach().numpy()[:, 0, 0], pred_y.detach().numpy()[:, 0, 1], 'r--', label='Neural ODE')\nplt.legend()\nplt.title('Neural ODE Fitting Result')\nplt.show()\n```\n\n### 运行示例\n在终端执行：\n```bash\npython simple_neural_ode.py\n```\n\n### 下一步探索\n根据 `awesome-neural-ode` 列表，你可以进一步探索以下方向：\n*   **Neural SDEs\u002FCDEs**: 处理随机性或连续时间离散观测数据（查看列表中 `Neural SDEs` 部分）。\n*   **Hamiltonian\u002FLagrangian NN**: 用于物理系统建模，保证能量守恒（查看 `General Architectures` 部分）。\n*   **Scientific ML**: 使用深度学习求解偏微分方程（PDE），参考 `Fourier Neural Operator` 等相关项目。","某医疗 AI 团队正在开发一套基于重症监护（ICU）患者生命体征数据的病情演化预测系统，需要处理大量不规则采样且含有缺失值的连续时间序列数据。\n\n### 没有 awesome-neural-ode 时\n- **技术选型迷茫**：面对微分方程与深度学习结合的庞大理论体系，团队难以快速定位适合处理“缺失值时间序列”或“未知动力学系统”的具体架构（如 GRU-D 或 Neural ODE）。\n- **复现成本高昂**：缺乏统一的代码库和论文索引，研究人员需花费数周时间在分散的仓库中寻找可复现的基准模型，甚至重复造轮子实现基础数值求解器。\n- **理论落地困难**：在尝试将控制理论或随机微分方程（SDE）引入模型时，因缺少清晰的分类指引和科学机器学习（Scientific ML）资源，导致算法收敛慢且物理意义解释性差。\n- **工具链割裂**：找不到经过验证的软件库来加速连续模型训练，导致原型开发周期被无限拉长，无法及时响应临床需求。\n\n### 使用 awesome-neural-ode 后\n- **精准架构匹配**：通过目录中\"Differential Equations in Deep Learning\"分类，团队迅速锁定了专门处理缺失值的 GRU-D 模型及学习未知 ODE 的高斯过程方法，直接复用成熟思路。\n- **资源一站获取**：利用仓库整理的论文链接与对应代码实现，团队在两天内完成了基线模型搭建，将原本数周的文献调研与代码搜索时间压缩至小时级。\n- **理论深度整合**：借助\"Neural SDEs\"和\"Control with Neural ODEs\"等专题资源，成功引入随机微分方程刻画病情不确定性，显著提升了模型在复杂动态系统中的鲁棒性与可解释性。\n- **高效工具赋能**：直接采用推荐的数值计算库优化了连续模型的训练速度，解决了梯度反向传播中的稳定性问题，大幅缩短了模型迭代周期。\n\nawesome-neural-ode 通过系统化梳理微分方程与深度学习的交叉资源，将原本碎片化的前沿研究转化为可立即落地的工程生产力，极大降低了科学机器学习的入门与开发门槛。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FZymrael_awesome-neural-ode_5b05b96e.png","Zymrael","Michael Poli","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002FZymrael_1d59dcec.png","Numerics, model architecture, scaling Co-founder @RadicalNumerics ","Radical Numerics",null,"MichaelPoli6","https:\u002F\u002Fzymrael.github.io\u002F","https:\u002F\u002Fgithub.com\u002FZymrael",1527,156,"2026-04-09T03:01:27","MIT",1,"","未说明",{"notes":90,"python":88,"dependencies":91},"该仓库是一个资源列表（Awesome List），收集了关于微分方程、动力系统与深度学习交叉领域的论文、代码库和博客链接，本身不是一个可直接运行的软件工具或框架，因此没有特定的运行环境、依赖库或硬件需求。用户需根据列表中引用的具体项目（如 Neural ODEs, Fourier Neural Operators 等）查阅其各自的文档以获取环境配置信息。",[],[52,35,14],[94,95,96,97,98,99,100,101,102],"deep-learning","ordinary-differential-equations","dynamical-systems","dynamical-modeling","ode-solver","ode","hamiltonian-dynamics","implicit-models","root-finding","2026-03-27T02:49:30.150509","2026-04-11T17:41:30.197496",[106,111,116,120,124,128,132,137],{"id":107,"question_zh":108,"answer_zh":109,"source_url":110},29709,"为什么在 torchsde 中计算批量向量-雅可比积（batch-VJPs）会遇到困难？","这是因为 PyTorch 的自动微分机制目前不支持批量 VJP（batch-vector-Jacobian products）的直接计算，相比之下 JAX 在这方面表现更好。这是在开发 `torchsde` 时遇到的主要限制之一。","https:\u002F\u002Fgithub.com\u002FZymrael\u002Fawesome-neural-ode\u002Fissues\u002F3",{"id":112,"question_zh":113,"answer_zh":114,"source_url":115},29704,"“微分方程 -> 深度学习”和“深度学习 -> 微分方程”分别代表什么含义？","“微分方程 -> 深度学习”指的是在深度学习中应用微分方程（即受微分方程启发或基于微分方程的方法\u002F算法，如 Neural ODE）；而反方向“深度学习 -> 微分方程”则指利用深度学习解决“传统”微分方程问题，例如科学机器学习（Scientific ML）或使用神经网络求解微分方程。","https:\u002F\u002Fgithub.com\u002FZymrael\u002Fawesome-neural-ode\u002Fissues\u002F4",{"id":117,"question_zh":118,"answer_zh":119,"source_url":110},29705,"如何在集群上对 SDE（随机微分方程）进行任务级并行计算？","可以使用 `EnsembleDistributed` 在集群上运行任务级并行。具体教程参考：https:\u002F\u002Fdiffeqflux.sciml.ai\u002Fdev\u002Fexamples\u002Foptimization_sde\u002F。底层使用的是 https:\u002F\u002Fdiffeq.sciml.ai\u002Fstable\u002Ffeatures\u002Fensemble\u002F 功能。除了 `EnsembleGPU` 
配合伴随模式（adjoints）外，所有导数选项均可正常工作（`EnsembleGPU` + adjoints 需要特殊的伴随处理，因为 GPU 上的任务未完全分离）。`EnsembleGPU` 支持进程级并行，可用于在集群的多 GPU 上同时处理大量轨迹（例如 50,000 条）。",{"id":121,"question_zh":122,"answer_zh":123,"source_url":110},29706,"如何在多线程环境下为每个线程分配一个 GPU 进行并行计算？","利用任务级并行（task-based parallelism），您可以为每个线程分配一个 GPU，通常可以“直接工作”（just work）。具体配置和使用方法可参考 CUDA.jl 的多 GPU 使用指南：https:\u002F\u002Fjuliagpu.gitlab.io\u002FCUDA.jl\u002Fusage\u002Fmultigpu\u002F。",{"id":125,"question_zh":126,"answer_zh":127,"source_url":110},29707,"PyTorch 是否支持类似 Julia DifferentialEquations.jl 中“每个批次元素使用独立求解器”的任务级并行？","是的，与 JAX 不同，PyTorch 支持通过 `torch.jit.fork` 实现任务级并行。您可以利用此功能让每个批次元素使用专用的求解器，而不是让所有元素以最慢的速度同步进行积分步骤。这有助于提高批处理效率。",{"id":129,"question_zh":130,"answer_zh":131,"source_url":110},29708,"DiffEqGPU.jl 是否支持每个线程一个专用求解器？其自动微分支持情况如何？","DiffEqGPU.jl 确实设计了每个线程一个专用求解器的机制。但是，其自动微分（autodiff）功能当时仍在完善中（参考 PR: https:\u002F\u002Fgithub.com\u002FSciML\u002FDiffEqGPU.jl\u002Fpull\u002F72），特别是在处理伴随模式时可能遇到限制，因为 GPU 上的任务组合方式导致反向传播时的分离代码失败。",{"id":133,"question_zh":134,"answer_zh":135,"source_url":136},29710,"有哪些关于使用神经算子（Neural Operators）求解 PDE 和 ODE 的推荐论文？","推荐以下论文：\n1. 求解 PDE：《Fourier Neural Operator for Parametric Partial Differential Equations》和《Neural Operator: Learning Maps Between Function Spaces》。\n2. 求解 ODE：《Neural Flows: Efficient Alternative to Neural ODEs》。这些论文已被收录到相关资源列表中。","https:\u002F\u002Fgithub.com\u002FZymrael\u002Fawesome-neural-ode\u002Fissues\u002F9",{"id":138,"question_zh":139,"answer_zh":140,"source_url":141},29711,"是否有将注意力机制（Attention）建模为神经 ODE 的相关研究？","有的。参考论文：Kim, Wonjae, and Yoonho Lee. \"Learning Dynamics of Attention: Human Prior for Interpretable Machine Reasoning.\" arXiv preprint arXiv:1905.11666 (2019)，该研究发表于 NeurIPS 2019。作者将注意力 logits 的更新过程建模为神经 ODE，以构建更具可解释性的视觉问答（VQA）模型。","https:\u002F\u002Fgithub.com\u002FZymrael\u002Fawesome-neural-ode\u002Fissues\u002F2",[]]