[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-locuslab--deq":3,"tool-locuslab--deq":64},[4,17,27,35,43,56],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":16},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,3,"2026-04-05T11:01:52",[13,14,15],"开发框架","图像","Agent","ready",{"id":18,"name":19,"github_repo":20,"description_zh":21,"stars":22,"difficulty_score":23,"last_commit_at":24,"category_tags":25,"status":16},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",138956,2,"2026-04-05T11:33:21",[13,15,26],"语言模型",{"id":28,"name":29,"github_repo":30,"description_zh":31,"stars":32,"difficulty_score":23,"last_commit_at":33,"category_tags":34,"status":16},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",107662,"2026-04-03T11:11:01",[13,14,15],{"id":36,"name":37,"github_repo":38,"description_zh":39,"stars":40,"difficulty_score":23,"last_commit_at":41,"category_tags":42,"status":16},3704,"NextChat","ChatGPTNextWeb\u002FNextChat","NextChat 是一款轻量且极速的 AI 助手，旨在为用户提供流畅、跨平台的大模型交互体验。它完美解决了用户在多设备间切换时难以保持对话连续性，以及面对众多 AI 模型不知如何统一管理的痛点。无论是日常办公、学习辅助还是创意激发，NextChat 都能让用户随时随地通过网页、iOS、Android、Windows、MacOS 或 Linux 端无缝接入智能服务。\n\n这款工具非常适合普通用户、学生、职场人士以及需要私有化部署的企业团队使用。对于开发者而言，它也提供了便捷的自托管方案，支持一键部署到 Vercel 或 Zeabur 等平台。\n\nNextChat 的核心亮点在于其广泛的模型兼容性，原生支持 Claude、DeepSeek、GPT-4 及 Gemini Pro 等主流大模型，让用户在一个界面即可自由切换不同 AI 能力。此外，它还率先支持 MCP（Model Context Protocol）协议，增强了上下文处理能力。针对企业用户，NextChat 提供专业版解决方案，具备品牌定制、细粒度权限控制、内部知识库整合及安全审计等功能，满足公司对数据隐私和个性化管理的高标准要求。",87618,"2026-04-05T07:20:52",[13,26],{"id":44,"name":45,"github_repo":46,"description_zh":47,"stars":48,"difficulty_score":23,"last_commit_at":49,"category_tags":50,"status":16},2268,"ML-For-Beginners","microsoft\u002FML-For-Beginners","ML-For-Beginners 是由微软推出的一套系统化机器学习入门课程，旨在帮助零基础用户轻松掌握经典机器学习知识。这套课程将学习路径规划为 12 周，包含 26 节精炼课程和 52 道配套测验，内容涵盖从基础概念到实际应用的完整流程，有效解决了初学者面对庞大知识体系时无从下手、缺乏结构化指导的痛点。\n\n无论是希望转型的开发者、需要补充算法背景的研究人员，还是对人工智能充满好奇的普通爱好者，都能从中受益。课程不仅提供了清晰的理论讲解，还强调动手实践，让用户在循序渐进中建立扎实的技能基础。其独特的亮点在于强大的多语言支持，通过自动化机制提供了包括简体中文在内的 50 
多种语言版本，极大地降低了全球不同背景用户的学习门槛。此外，项目采用开源协作模式，社区活跃且内容持续更新，确保学习者能获取前沿且准确的技术资讯。如果你正寻找一条清晰、友好且专业的机器学习入门之路，ML-For-Beginners 将是理想的起点。",84991,"2026-04-05T10:45:23",[14,51,52,53,15,54,26,13,55],"数据工具","视频","插件","其他","音频",{"id":57,"name":58,"github_repo":59,"description_zh":60,"stars":61,"difficulty_score":10,"last_commit_at":62,"category_tags":63,"status":16},3128,"ragflow","infiniflow\u002Fragflow","RAGFlow 是一款领先的开源检索增强生成（RAG）引擎，旨在为大语言模型构建更精准、可靠的上下文层。它巧妙地将前沿的 RAG 技术与智能体（Agent）能力相结合，不仅支持从各类文档中高效提取知识，还能让模型基于这些知识进行逻辑推理和任务执行。\n\n在大模型应用中，幻觉问题和知识滞后是常见痛点。RAGFlow 通过深度解析复杂文档结构（如表格、图表及混合排版），显著提升了信息检索的准确度，从而有效减少模型“胡编乱造”的现象，确保回答既有据可依又具备时效性。其内置的智能体机制更进一步，使系统不仅能回答问题，还能自主规划步骤解决复杂问题。\n\n这款工具特别适合开发者、企业技术团队以及 AI 研究人员使用。无论是希望快速搭建私有知识库问答系统，还是致力于探索大模型在垂直领域落地的创新者，都能从中受益。RAGFlow 提供了可视化的工作流编排界面和灵活的 API 接口，既降低了非算法背景用户的上手门槛，也满足了专业开发者对系统深度定制的需求。作为基于 Apache 2.0 协议开源的项目，它正成为连接通用大模型与行业专有知识之间的重要桥梁。",77062,"2026-04-04T04:44:48",[15,14,13,26,54],{"id":65,"github_repo":66,"name":67,"description_en":68,"description_zh":69,"ai_summary_zh":69,"readme_en":70,"readme_zh":71,"quickstart_zh":72,"use_case_zh":73,"hero_image_url":74,"owner_login":75,"owner_name":76,"owner_avatar_url":77,"owner_bio":78,"owner_company":79,"owner_location":79,"owner_email":79,"owner_twitter":79,"owner_website":80,"owner_url":81,"languages":82,"stars":91,"forks":92,"last_commit_at":93,"license":94,"difficulty_score":10,"env_os":95,"env_gpu":96,"env_ram":97,"env_deps":98,"category_tags":103,"github_topics":79,"view_count":10,"oss_zip_url":79,"oss_zip_packed_at":79,"status":16,"created_at":104,"updated_at":105,"faqs":106,"releases":136},143,"locuslab\u002Fdeq","deq","[NeurIPS'19] Deep Equilibrium Models","deq 是一个实现“深度均衡模型”（Deep Equilibrium Models）的开源工具，源自 NeurIPS 2019 的研究。它通过直接求解神经网络的不动点（即平衡状态），模拟一个“无限深”的网络，而无需实际堆叠多层结构。这种方法显著降低了内存占用（理论上为 O(1)），同时在自然语言处理和计算机视觉任务中达到与主流深度模型相当的性能。\n\ndeq 主要解决了传统深度网络训练时内存消耗大、计算成本高的问题，特别适合希望探索高效、隐式深度架构的研究人员和开发者。项目提供了针对序列建模（如 Transformer）和视觉任务（如图像分类、分割）的两个分支，并集成了先进的不动点求解器（如 Broyden 方法、Anderson 加速）、Jacobian 正则化等稳定性增强技术。\n\n如果你对隐式深度学习、微分方程启发的神经网络或高效 Transformer 架构感兴趣，deq 提供了模块化且易于扩展的代码基础，配合官方教程和 Colab 示例，能快速上手实验。推荐具备 PyTorch 基础并拥有 GPU 资源的用户使用。","# Deep Equilibrium Models\n\n> (Version 2.0 released now! :grinning:)\n\n## News\n\n:boom:**2021\u002F6: Repo updated with the multiscale DEQ (MDEQ) code, Jacobian-related analysis & regularization support, and the new, faster and simpler implicit differentiation implementation through PyTorch's backward hook! (See [here](https:\u002F\u002Fgithub.com\u002Flocuslab\u002Fdeq#how-to-buildtrain-a-deq-model).)**\n\n- For those who would like to start with a toy version of the DEQ, the NeurIPS 2020 tutorial on \"Deep Implicit Layers\" has a detailed step-by-step introduction: [tutorial video & colab notebooks here](http:\u002F\u002Fimplicit-layers-tutorial.org\u002F).\n\n- A [JAX](https:\u002F\u002Fgithub.com\u002Fgoogle\u002Fjax) version of the DEQ, including a JAX implementation of Broyden's method, etc., is available [here](https:\u002F\u002Fgithub.com\u002Fakbir\u002Fdeq-jax).\n\n---\n\nThis repository contains the code for the deep equilibrium (DEQ) model, an implicit-depth architecture that directly solves for and backpropagates through the (fixed-point) equilibrium state of an (effectively) infinitely deep network. 
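(As a one-sentence recap, in our notation: the layer's output is the solution z* of z* = f_\\theta(z*; x), and by the implicit function theorem the gradient can be written as dl\u002Fd(.) = dl\u002Fdz* (I - J_f)^{-1} df_\\theta\u002Fd(.), where J_f is the Jacobian of f_\\theta at z*, so no intermediate activations need to be stored.) 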
Importantly, compared to prior implicit-depth approaches (e.g., ODE-based methods), in this work we also demonstrate the potential power and compatibility of this implicit model with modern, structured layers like Transformers, which enable the DEQ networks to achieve results on par with the SOTA deep networks (in NLP and vision) *without* using a \"deep\" stacking (and thus O(1) memory). Moreover, we also provide tools for regularizing the stability of these implicit models.\n\nSpecifically, this repo contains the code from the following papers (see `bibtex` at the end of this README):\n  - [Deep Equilibrium Models](https:\u002F\u002Farxiv.org\u002Fabs\u002F1909.01377)\n  - [Multiscale Deep Equilibrium Models](https:\u002F\u002Farxiv.org\u002Fabs\u002F2006.08656)\n  - [Stabilizing Equilibrium Models by Jacobian Regularization](https:\u002F\u002Farxiv.org\u002Fabs\u002F2106.14342).\n\n## Prerequisites\n\nPython >= 3.6 and PyTorch >= 1.10. 4 GPUs are strongly recommended for computational efficiency.\n\n## Data\n\nWe provide more detailed instructions for downloading\u002Fprocessing the datasets (WikiText-103, ImageNet, Cityscapes, etc.) in the `DEQ-Sequence\u002F` and `MDEQ-Vision\u002F` subfolders.\n\n## How to build\u002Ftrain a DEQ model?\n\nStarting in 2021\u002F6, we partitioned the repo into two sections, containing the sequence-model DEQ (i.e., `DEQ-Sequence\u002F`) and the vision-model DEQ (i.e., `MDEQ-Vision\u002F`) networks, respectively. As these two tasks require different input processing and loss objectives, they do not directly share the training framework. \n\nHowever, both frameworks share the same utility code, such as:\n  - `lib\u002Fsolvers.py`: Advanced fixed-point solvers (e.g., Anderson acceleration and Broyden's method)\n  - `lib\u002Fjacobian.py`: Jacobian-related estimations (e.g., the Hutchinson estimator and the power method)\n  - `lib\u002Foptimization.py`: Regularizations (e.g., weight normalization and variational dropout)\n  - `lib\u002Flayer_utils.py`: Layer utilities\n\nMoreover, the repo is significantly simplified from the previous version so that users can more easily extend it. In particular, \n\n>**Theorem 2 (Universality of \"single-layer\" DEQs, very informal)**: Stacking multiple DEQs \n> (with potentially _different_ classes of transformations) does not create extra representational\n> power over a single DEQ.\n\n(See the paper for a formal statement.) By the theorem above, designing a better DEQ model boils down to designing a better stable transformation f_\\theta. Creating and playing with a DEQ is **easy**, and we recommend following the 3 steps below (which we adopt in this repo):\n\n### Step 1: Defining a layer `f=f_\\theta` that we'd like to iterate until equilibrium.\n\nTypically, this is just like any deep network layer and should be a subclass of `torch.nn.Module`. Evaluating this layer requires the hidden unit `z` and the input injection `x`; e.g.:\n```python\nclass Layer(nn.Module):\n    def __init__(self, ...):\n        ...\n    def forward(self, z, x, **kwargs):\n        return new_z\n```\n\n### Step 2: Prepare the fixed-point solver to use for the DEQ model.\n\nSince a DEQ model can use any *black-box* root solver, we provide PyTorch fixed-point solver implementations `anderson(...)` and `broyden(...)` in `lib\u002Fsolvers.py` that output a dictionary containing the basic information of the optimization process. 
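As a minimal sketch of the solver interface (the toy map `g`, the shapes, and the threshold below are ours; the exact tensor layout and trace format follow `lib\u002Fsolvers.py`):\n```python\nimport torch\nfrom lib.solvers import anderson\n\n# Toy contraction g(z) = 0.5*z + 1, whose unique fixed point is z* = 2.\ndef g(z):\n    return 0.5 * z + 1\n\nz0 = torch.zeros(1, 8, 1)                   # initial estimate, in a (bsz, d, seq) layout\nout = anderson(g, z0, threshold=30)         # run at most 30 solver iterations\nz_star = out['result']                      # closest estimate to the fixed point\nprint(out['nstep'], out['rel_trace'][-1])   # steps taken, final relative residual\n```\n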
By default, we use the *relative residual difference* (i.e., |f(z)-z|\u002F|z|) as the criterion for stopping the iterative process.\n\nThe forward pass can then be reduced to 2 lines:\n```python\nwith torch.no_grad():\n    # x is the input injection; z0 is the initial estimate of the fixed point.\n    z_star = self.solver(lambda z: f(z, x, *args), z0, threshold=f_thres)['result']\n```\nwhere we note that the forward pass does not need to store **any** intermediate state, so we put it in a `torch.no_grad()` block.\n\n### Step 3: Engage with the autodiff tape to use implicit differentiation\n\nFinally, we need to ensure there is a way to compute the backward pass of a DEQ, which relies on the implicit function theorem. To do this, we can use the `register_hook` function in PyTorch, which registers a backward hook function to be executed in the backward pass. As we noted in the paper, the backward pass is simply solving for the fixed point of a *linear system* involving the Jacobian at the equilibrium:\n```python\nnew_z_star = self.f(z_star.requires_grad_(), x, *args)\n\ndef backward_hook(grad):\n    if self.hook is not None:\n        self.hook.remove()\n        torch.cuda.synchronize()   # To avoid infinite recursion\n    # Compute the fixed point of yJ + grad, where J=J_f is the Jacobian of f at z_star\n    new_grad = self.solver(lambda y: autograd.grad(new_z_star, z_star, y, retain_graph=True)[0] + grad, \\\n                           torch.zeros_like(grad), threshold=b_thres)['result']\n    return new_grad\n\nself.hook = new_z_star.register_hook(backward_hook)\n```\n\n### (Optional) Additional Step: Jacobian Regularization.\n\nThe fixed-point formulation of DEQ models means their stability is directly characterized by the Jacobian matrix `J_f` at the equilibrium point. Therefore, we provide code for analyzing and regularizing the Jacobian properties (based on the ICML'21 paper [Stabilizing Equilibrium Models by Jacobian Regularization](https:\u002F\u002Farxiv.org\u002Fabs\u002F2106.14342)). Specifically, we added the following flags to the training script:\n\n  - `jac_loss_weight`: The strength of Jacobian regularization, where we regularize `||J_f||_F`.\n  - `jac_loss_freq`: The frequency `p` of the stochastic Jacobian regularization (i.e., we only apply this loss with probability `p` during training).\n  - `jac_incremental`: If >0, then we increase the `jac_loss_weight` by 0.1 after every `jac_incremental` training steps.\n  - `spectral_radius_mode`: If `True`, estimate the DEQ models' spectral radius when evaluating on the validation set.\n\nA full DEQ model implementation is therefore as simple as follows:\n```python\nfrom lib.solvers import anderson, broyden\nfrom lib.jacobian import jac_loss_estimate\n\nclass DEQModel(nn.Module):\n    def __init__(self, ...):\n        ...\n        self.f = Layer(...)\n        self.solver = broyden\n        ...\n    \n    def forward(self, x, ..., **kwargs):\n        z0 = torch.zeros(...)\n\n        # Forward pass\n        with torch.no_grad():\n            z_star = self.solver(lambda z: self.f(z, x, *args), z0, threshold=f_thres)['result']   # See step 2 above\n            new_z_star = z_star\n\n        # (Prepare for) Backward pass, see step 3 above\n        if self.training:\n            new_z_star = self.f(z_star.requires_grad_(), x, *args)\n            \n            # Jacobian-related computations, see additional step above. 
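`jac_loss_estimate` returns a stochastic (Hutchinson) estimate of the Jacobian norm ||J_f||_F; weight it by `jac_loss_weight` when adding it to the training loss. 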
For instance:\n            jac_loss = jac_loss_estimate(new_z_star, z_star, vecs=1)\n\n            def backward_hook(grad):\n                if self.hook is not None:\n                    self.hook.remove()\n                    torch.cuda.synchronize()   # To avoid infinite recursion\n                # Compute the fixed point of yJ + grad, where J=J_f is the Jacobian of f at z_star\n                new_grad = self.solver(lambda y: autograd.grad(new_z_star, z_star, y, retain_graph=True)[0] + grad, \\\n                                       torch.zeros_like(grad), threshold=b_thres)['result']\n                return new_grad\n\n            self.hook = new_z_star.register_hook(backward_hook)\n        return new_z_star, ...\n```\n\n## Fixed-point Solvers\n\nWe provide PyTorch implementations of two generic solvers, `broyden(...)` (based on Broyden's method) and `anderson(...)` (based on Anderson acceleration), in `lib\u002Fsolvers.py`. Both functions take in the transformation `f` whose fixed point we would like to solve for, and return a dictionary of the following format:\n```\n{\n \"result\": ... (The closest estimate to the fixed point),\n \"nstep\": ... (The step that gives us this closest estimate),\n \"abs_trace\": ... (Absolute residuals along the trajectory),\n \"rel_trace\": ... (Relative residuals along the trajectory),\n ...\n}\n```\n\n## Pretrained Models\n\nSee the `DEQ-Sequence\u002F` and `MDEQ-Vision\u002F` sub-directories for the links.\n\n## Credits\n\n- The transformer implementation and the extra modules (e.g., adaptive embeddings) were based on the [Transformer-XL](https:\u002F\u002Fgithub.com\u002Fkimiyoung\u002Ftransformer-xl) repo.\n\n- Some utility code (e.g., model summary and YAML processing) in this repo was modified from the [HRNet](https:\u002F\u002Fgithub.com\u002FHRNet\u002FHRNet-Semantic-Segmentation) repo.\n\n- We also added the RAdam optimizer as an option for training (but didn't set it as the default). The RAdam implementation is from the [RAdam](https:\u002F\u002Fgithub.com\u002FLiyuanLucasLiu\u002FRAdam) repo.\n\n## Bibtex\n\nIf you find this repository useful for your research, please consider citing our work(s):\n\n1. Deep Equilibrium Models\n```\n@inproceedings{bai2019deep,\n  author    = {Shaojie Bai and J. Zico Kolter and Vladlen Koltun},\n  title     = {Deep Equilibrium Models},\n  booktitle = {Advances in Neural Information Processing Systems (NeurIPS)},\n  year      = {2019},\n}\n```\n\n2. Multiscale Deep Equilibrium Models\n```\n@inproceedings{bai2020multiscale,\n  author    = {Shaojie Bai and Vladlen Koltun and J. Zico Kolter},\n  title     = {Multiscale Deep Equilibrium Models},\n  booktitle = {Advances in Neural Information Processing Systems (NeurIPS)},\n  year      = {2020},\n}\n```\n\n3. Stabilizing Equilibrium Models by Jacobian Regularization\n```\n@inproceedings{bai2021stabilizing,\n  title     = {Stabilizing Equilibrium Models by Jacobian Regularization},\n  author    = {Shaojie Bai and Vladlen Koltun and J. 
Zico Kolter},\n  booktitle = {International Conference on Machine Learning (ICML)},\n  year      = {2021}\n}\n```\n\n\n","# 深度均衡模型（Deep Equilibrium Models）\n\n> （2.0 版本现已发布！:grinning:）\n\n## 新闻\n\n:boom:**2021\u002F6：仓库已更新，包含多尺度 DEQ（MDEQ）代码、Jacobian（雅可比矩阵）相关分析与正则化支持，以及通过 PyTorch 的 backward hook 实现的全新、更快且更简洁的隐式微分（implicit differentiation）方法！（参见 [此处](https:\u002F\u002Fgithub.com\u002Flocuslab\u002Fdeq#how-to-buildtrain-a-deq-model)。）**\n\n- 如果你想从一个简化版的 DEQ 入手，NeurIPS 2020 关于“深度隐式层（Deep Implicit Layers）”的教程提供了详细的逐步介绍：[教程视频与 Colab 笔记本在此](http:\u002F\u002Fimplicit-layers-tutorial.org\u002F)。\n\n- 一个基于 [JAX](https:\u002F\u002Fgithub.com\u002Fgoogle\u002Fjax) 的 DEQ 实现（包括 Broyden 方法等的 JAX 版本）可在 [此处](https:\u002F\u002Fgithub.com\u002Fakbir\u002Fdeq-jax) 获取。\n\n---\n\n本仓库包含了深度均衡（Deep Equilibrium, DEQ）模型的代码。DEQ 是一种隐式深度（implicit-depth）架构，它直接求解并反向传播通过（有效意义上）无限深网络的（不动点）均衡状态。重要的是，与先前的隐式深度方法（例如基于 ODE 的方法）相比，本工作还展示了这种隐式模型与现代结构化层（如 Transformer）的强大潜力和兼容性，使得 DEQ 网络在不使用“深度”堆叠的情况下（因此内存复杂度为 O(1)），即可在 NLP 和视觉任务上达到与当前最先进（SOTA）深度网络相当的结果。此外，我们还提供了用于正则化这些隐式模型稳定性的工具。\n\n具体而言，本仓库包含以下论文的代码（参考本文末尾的 `bibtex`）：\n  - [Deep Equilibrium Models](https:\u002F\u002Farxiv.org\u002Fabs\u002F1909.01377)\n  - [Multiscale Deep Equilibrium Models](https:\u002F\u002Farxiv.org\u002Fabs\u002F2006.08656)\n  - [Stabilizing Equilibrium Models by Jacobian Regularization](https:\u002F\u002Farxiv.org\u002Fabs\u002F2106.14342).\n\n## 前置要求\n\nPython >= 3.6 且 PyTorch >= 1.10。强烈建议使用 4 块 GPU 以获得计算效率。\n\n## 数据\n\n我们在 `DEQ-Sequence\u002F` 和 `MDEQ-Vision\u002F` 子文件夹中提供了更详细的数据集（如 WikiText-103、ImageNet、Cityscapes 等）下载与处理说明。\n\n## 如何构建\u002F训练一个 DEQ 模型？\n\n自 2021 年 6 月起，我们将仓库划分为两个部分，分别包含序列模型 DEQ（即 `DEQ-Sequence\u002F`）和视觉模型 DEQ（即 `MDEQ-Vision\u002F`）网络。由于这两类任务需要不同的输入处理方式和损失目标，它们并不直接共享训练框架。\n\n然而，两个框架共享相同的工具代码，例如：\n  - `lib\u002Fsolvers.py`：高级不动点求解器（例如 Anderson 加速和 Broyden 方法）\n  - `lib\u002Fjacobian.py`：Jacobian 相关估计（例如 Hutchinson 估计器和幂迭代法（Power method））\n  - `lib\u002Foptimization.py`：正则化方法（例如权重归一化和变分 Dropout）\n  - `lib\u002Flayer_utils.py`：层工具函数\n\n此外，与旧版本相比，本仓库已大幅简化，便于用户进行扩展。特别地，\n\n>**定理 2（“单层”DEQ 的普适性，非正式表述）**：堆叠多个 DEQ  \n>（即使使用**不同**类型的变换）并不会比单个 DEQ 提供更强的表示能力。\n\n（详见论文中的正式表述。）根据上述定理，设计更好的 DEQ 模型归结为设计更优且稳定的变换函数 f_\\theta。创建并尝试 DEQ **非常简单**，我们推荐遵循以下 3 个步骤（本仓库也采用此流程）：\n\n### 步骤 1：定义一个希望迭代至均衡状态的层 `f=f_\\theta`\n\n通常，这与任何深度网络层类似，应为 `torch.nn.Module` 的子类。该层的前向计算需要隐藏单元 `z` 和输入注入 `x`；例如：\n```python\nclass Layer(nn.Module):\n    def __init__(self, ...):\n\t...\n    def forward(self, z, x, **kwargs):\n        return new_z\n```\n\n### 步骤 2：准备用于 DEQ 模型的不动点求解器\n\n由于 DEQ 模型可以使用任意*黑盒*根求解器，我们在 `lib\u002Fsolvers.py` 中提供了 PyTorch 实现的不动点求解器 `anderson(...)` 和 `broyden(...)`，它们会返回一个包含优化过程基本信息的字典。默认情况下，我们使用*相对残差差值*（即 |f(z)-z|\u002F|z|）作为迭代停止的判据。\n\n前向传播可简化为两行代码：\n```python\nwith torch.no_grad():\n    # x 是输入注入；z0 是不动点的初始估计。\n    z_star = self.solver(lambda z: f(z, x, *args), z0, threshold=f_thres)['result']\n```\n注意，前向传播无需存储**任何**中间状态，因此我们将其置于 `torch.no_grad()` 块中。\n\n### 步骤 3：通过自动微分机制实现隐式微分\n\n最后，我们需要确保能够计算 DEQ 的反向传播，这依赖于隐函数定理（implicit function theorem）。为此，我们可以使用 PyTorch 中的 `register_hook` 函数，在反向传播时注册一个后向钩子（backward hook）函数。如论文所述，反向传播本质上是求解一个涉及均衡点处 Jacobian 矩阵的*线性系统*的不动点：\n```python\nnew_z_star = self.f(z_star.requires_grad_(), x, *args)\n\ndef backward_hook(grad):\n    if self.hook is not None:\n        self.hook.remove()\n        torch.cuda.synchronize()   # 避免无限递归\n    # 求解 yJ + grad 的不动点，其中 J=J_f 是 f 在 z_star 处的 Jacobian 矩阵\n    new_grad = self.solver(lambda y: autograd.grad(new_z_star, z_star, y, retain_graph=True)[0] + grad, \\\n                           
torch.zeros_like(grad), threshold=b_thres)['result']\n    return new_grad\n\nself.hook = new_z_star.register_hook(backward_hook)\n```\n\n### （可选）附加步骤：Jacobian 正则化（雅可比正则化）\n\nDEQ 模型的不动点形式意味着其稳定性直接由平衡点处的 Jacobian 矩阵 `J_f` 所刻画。因此，我们提供了用于分析和正则化 Jacobian 性质的代码（基于 ICML'21 论文 [Stabilizing Equilibrium Models by Jacobian Regularization](https:\u002F\u002Farxiv.org\u002Fabs\u002F2106.14342)）。具体而言，我们在训练脚本中添加了以下参数：\n\n  - `jac_loss_weight`: Jacobian 正则化的强度，此处我们对 `||J_f||_F`（Frobenius 范数）进行正则化。\n  - `jac_loss_freq`: 随机 Jacobian 正则化的频率 `p`（即在训练过程中以概率 `p` 应用该损失）。\n  - `jac_incremental`: 若大于 0，则每经过 `jac_incremental` 个训练步后，将 `jac_loss_weight` 增加 0.1。\n  - `spectral_radius_mode`: 若为 `True`，则在验证集评估时估计 DEQ 模型的谱半径（spectral radius）。\n\n因此，一个完整的 DEQ 模型实现如下所示：\n```python\nfrom lib.solvers import anderson, broyden\nfrom lib.jacobian import jac_loss_estimate\n\nclass DEQModel(nn.Module):\n    def __init__(self, ...):\n        ...\n        self.f = Layer(...)\n        self.solver = broyden\n        ...\n    \n    def forward(self, x, ..., **kwargs):\n        z0 = torch.zeros(...)\n\n        # 前向传播\n        with torch.no_grad():\n            z_star = self.solver(lambda z: self.f(z, x, *args), z0, threshold=f_thres)['result']   # 见上文第 2 步\n            new_z_star = z_star\n\n        # （准备）反向传播，见上文第 3 步\n        if self.training:\n            new_z_star = self.f(z_star.requires_grad_(), x, *args)\n            \n            # Jacobian 相关计算，见上述附加步骤。例如：\n            jac_loss = jac_loss_estimate(new_z_star, z_star, vecs=1)\n\n            def backward_hook(grad):\n                if self.hook is not None:\n                    self.hook.remove()\n                    torch.cuda.synchronize()   # 避免无限递归\n                # 计算 yJ + grad 的不动点，其中 J=J_f 是 f 在 z_star 处的 Jacobian\n                new_grad = self.solver(lambda y: autograd.grad(new_z_star, z_star, y, retain_graph=True)[0] + grad, \\\n                                       torch.zeros_like(grad), threshold=b_thres)['result']\n                return new_grad\n\n            self.hook = new_z_star.register_hook(backward_hook)\n        return new_z_star, ...\n```\n\n## 不动点求解器（Fixed-point Solvers）\n\n我们在 `lib\u002Fsolvers.py` 中提供了两种通用求解器的 PyTorch 实现：`broyden(...)`（基于 Broyden 方法）和 `anderson(...)`（基于 Anderson 加速）。这两个函数接收一个变换函数 `f`（我们希望求解其不动点），并返回如下格式的字典：\n```\n{\n \"result\": ... (最接近不动点的估计值),\n \"nstep\": ... (给出该估计值的迭代步数),\n \"abs_trace\": ... (轨迹上的绝对残差),\n \"rel_trace\": ... (轨迹上的相对残差),\n ...\n}\n```\n\n## 预训练模型\n\n请参见 `DEQ-Sequence\u002F` 和 `MDEQ-Vision\u002F` 子目录中的链接。\n\n## 致谢\n\n- Transformer 的实现以及额外模块（例如自适应嵌入）基于 [Transformer-XL](https:\u002F\u002Fgithub.com\u002Fkimiyoung\u002Ftransformer-xl) 仓库。\n\n- 本仓库的部分工具代码（例如模型摘要和 YAML 处理）修改自 [HRNet](https:\u002F\u002Fgithub.com\u002FHRNet\u002FHRNet-Semantic-Segmentation) 仓库。\n\n- 我们还增加了 RAdam 优化器作为训练选项（但未设为默认）。RAdam 的实现来自 [RAdam](https:\u002F\u002Fgithub.com\u002FLiyuanLucasLiu\u002FRAdam) 仓库。\n\n## 引用（Bibtex）\n\n如果您在研究中使用了本仓库，请考虑引用我们的工作：\n\n1. Deep Equilibrium Models\n```\n@inproceedings{bai2019deep,\n  author    = {Shaojie Bai and J. Zico Kolter and Vladlen Koltun},\n  title     = {Deep Equilibrium Models},\n  booktitle = {Advances in Neural Information Processing Systems (NeurIPS)},\n  year      = {2019},\n}\n```\n\n2. Multiscale Deep Equilibrium Models\n```\n@inproceedings{bai2020multiscale,\n  author    = {Shaojie Bai and Vladlen Koltun and J. Zico Kolter},\n  title     = {Multiscale Deep Equilibrium Models},\n  booktitle = {Advances in Neural Information Processing Systems (NeurIPS)},\n  year      = {2020},\n}\n```\n\n3. 
Stabilizing Equilibrium Models by Jacobian Regularization\n```\n@inproceedings{bai2021stabilizing,\n  title     = {Stabilizing Equilibrium Models by Jacobian Regularization},\n  author    = {Shaojie Bai and Vladlen Koltun and J. Zico Kolter},\n  booktitle = {International Conference on Machine Learning (ICML)},\n  year      = {2021}\n}\n```","# DEQ（Deep Equilibrium Models）快速上手指南\n\n## 环境准备\n\n- **Python 版本**：≥ 3.6  \n- **PyTorch 版本**：≥ 1.10  \n- **硬件建议**：强烈推荐使用 4 块 GPU 以获得最佳训练效率  \n- **网络环境**：若在国内，建议配置 PyPI 镜像源（如清华源）加速依赖安装\n\n## 安装步骤\n\n1. 克隆仓库：\n   ```bash\n   git clone https:\u002F\u002Fgithub.com\u002Flocuslab\u002Fdeq.git\n   cd deq\n   ```\n\n2. 安装依赖（推荐使用虚拟环境）：\n   ```bash\n   pip install \"torch>=1.10\" torchvision\n   # 注意：版本约束需加引号，避免 > 被 shell 解析为输出重定向\n   # 如需处理特定任务数据（如 NLP 或视觉），请参考对应子目录中的 requirements\n   ```\n\n> 💡 国内用户可使用清华源加速安装：\n> ```bash\n> pip install \"torch>=1.10\" torchvision -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple\n> ```\n\n## 基本使用\n\nDEQ 模型的核心思想是通过求解隐式不动点方程 $ z = f_\theta(z, x) $ 来替代传统深度网络的逐层前向传播。构建一个 DEQ 模型只需三步：\n\n### 步骤 1：定义变换函数 `f_\theta`\n\n创建一个继承自 `torch.nn.Module` 的模块，其 `forward` 方法接受当前状态 `z` 和输入 `x`：\n\n```python\nimport torch.nn as nn\n\nclass Layer(nn.Module):\n    def __init__(self, ...):\n        ...\n    def forward(self, z, x, **kwargs):\n        return new_z  # 返回下一个状态\n```\n\n### 步骤 2：前向传播 —— 求解不动点\n\n使用内置求解器（如 `broyden` 或 `anderson`）在 `torch.no_grad()` 中求解平衡点：\n\n```python\nfrom lib.solvers import broyden\n\n# 在模型 forward 中\nwith torch.no_grad():\n    z_star = self.solver(lambda z: self.f(z, x, *args), z0, threshold=f_thres)['result']\n```\n\n### 步骤 3：反向传播 —— 隐式微分\n\n通过 PyTorch 的 `register_hook` 实现基于隐函数定理的梯度计算：\n\n```python\nnew_z_star = self.f(z_star.requires_grad_(), x, *args)\n\ndef backward_hook(grad):\n    if self.hook is not None:\n        self.hook.remove()\n        torch.cuda.synchronize()\n    new_grad = self.solver(\n        lambda y: torch.autograd.grad(new_z_star, z_star, y, retain_graph=True)[0] + grad,\n        torch.zeros_like(grad),\n        threshold=b_thres\n    )['result']\n    return new_grad\n\nself.hook = new_z_star.register_hook(backward_hook)\n```\n\n### 完整最小示例\n\n```python\nfrom lib.solvers import broyden\nfrom lib.jacobian import jac_loss_estimate\nimport torch\nimport torch.nn as nn\n\nclass DEQModel(nn.Module):\n    def __init__(self, ...):\n        super().__init__()\n        self.f = Layer(...)\n        self.solver = broyden\n\n    def forward(self, x, f_thres=30, b_thres=30):\n        z0 = torch.zeros_like(x)  # 初始猜测\n\n        # 前向：求不动点\n        with torch.no_grad():\n            z_star = self.solver(lambda z: self.f(z, x), z0, threshold=f_thres)['result']\n\n        if self.training:\n            # 为反向传播准备计算图\n            new_z_star = self.f(z_star.requires_grad_(), x)\n            \n            # 可选：Jacobian 正则化\n            jac_loss = jac_loss_estimate(new_z_star, z_star, vecs=1)\n\n            # 注册反向钩子\n            def backward_hook(grad):\n                if hasattr(self, 'hook') and self.hook is not None:\n                    self.hook.remove()\n                    torch.cuda.synchronize()\n                new_grad = self.solver(\n                    lambda y: torch.autograd.grad(new_z_star, z_star, y, retain_graph=True)[0] + grad,\n                    torch.zeros_like(grad),\n                    threshold=b_thres\n                )['result']\n                return new_grad\n\n            self.hook = new_z_star.register_hook(backward_hook)\n            return new_z_star, jac_loss\n        else:\n            return z_star\n```\n\n> 📌 
提示：完整训练代码和数据预处理请参考 `DEQ-Sequence\u002F`（序列任务）或 `MDEQ-Vision\u002F`（视觉任务）子目录。","某AI初创公司正在开发一个实时视频语义分割系统，用于自动驾驶车辆的环境感知模块，需在有限车载GPU内存下实现高精度、低延迟的图像理解。\n\n### 没有 deq 时\n- 采用传统深度卷积网络（如DeepLabv3+）堆叠数十层，模型推理时显存占用高达8GB以上，难以部署到车规级嵌入式GPU。\n- 训练过程中因反向传播需保存所有中间激活值，batch size被迫设为1，训练效率低下且梯度不稳定。\n- 为压缩模型不得不进行剪枝或量化，导致mIoU指标下降3-5个百分点，影响分割精度。\n- 多尺度特征融合依赖复杂跳跃连接，代码结构臃肿，调试和迭代成本高。\n- 长序列上下文建模能力弱，在处理连续帧时难以维持时空一致性。\n\n### 使用 deq 后\n- 借助MDEQ（多尺度DEQ）架构，以单层隐式循环替代深层堆叠，显存占用降至2GB以内，顺利部署到Jetson AGX平台。\n- 利用隐式微分和Broyden求解器，反向传播无需存储中间层，训练batch size提升至4，收敛速度加快约40%。\n- 在ImageNet预训练基础上微调Cityscapes，mIoU达到78.2%，媲美SOTA显式深度模型，无需牺牲精度换资源。\n- 多尺度特征通过同一平衡点联合优化，结构简洁，代码复用率高，新模块集成周期缩短一半。\n- 隐式平衡状态天然具备长程依赖建模能力，连续帧分割结果更平滑，减少后处理负担。\n\ndeq以“无限深度、常数内存”的特性，在资源受限场景下实现了精度与效率的双赢。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Flocuslab_deq_58f60728.png","locuslab","CMU Locus Lab","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Flocuslab_147f21ae.png","Zico Kolter's Research Group",null,"http:\u002F\u002Fwww.zicokolter.com\u002F","https:\u002F\u002Fgithub.com\u002Flocuslab",[83,87],{"name":84,"color":85,"percentage":86},"Python","#3572A5",98.3,{"name":88,"color":89,"percentage":90},"Shell","#89e051",1.7,797,87,"2026-04-02T04:37:57","MIT","Linux, macOS, Windows","强烈推荐使用4块NVIDIA GPU，具体型号、显存大小和CUDA版本未说明","未说明",{"notes":99,"python":100,"dependencies":101},"项目分为序列模型（DEQ-Sequence）和视觉模型（MDEQ-Vision）两部分，训练不同任务需分别参考对应子目录；支持Jacobian正则化以提升模型稳定性；提供Anderson加速和Broyden方法等不动点求解器；反向传播通过PyTorch的backward hook实现隐式微分。",">=3.6",[102],"torch>=1.10",[26,14,13,54],"2026-03-27T02:49:30.150509","2026-04-06T05:16:36.477558",[107,112,117,122,127,132],{"id":108,"question_zh":109,"answer_zh":110,"source_url":111},219,"Anderson 加速方法有时会报错 'singular U'，如何解决？","该错误通常是因为在求解过程中矩阵奇异导致的。建议采用以下稳定化技术：1）使用雅可比正则化（Jacobian regularization）；2）在固定点求解过程中添加辅助损失项，例如除了最终输出 z* 的损失 L(z*, y) 外，还可以额外加上 0.2 * L(z^{10}, y)，以促使模型更早收敛。注意，由于隐函数定理（IFT）的原因，这种辅助损失会增加内存开销，可以考虑对其使用 JFB（Jacobian-Free Backpropagation）方法来缓解。","https:\u002F\u002Fgithub.com\u002Flocuslab\u002Fdeq\u002Fissues\u002F20",{"id":113,"question_zh":114,"answer_zh":115,"source_url":116},220,"加载 ImageNet 预训练模型时出现权重尺寸不匹配错误怎么办？","这是因为预训练模型中包含了一些训练时用到但推理时不需要的冗余参数（如 copy 和 deq 相关的键）。解决方法有两种：1）重新下载项目 README 中提供的最新预训练模型；2）手动清理模型文件，运行以下 Python 脚本：\n```python\nimport torch\npd = torch.load('pretrained_models\u002Fmdeq_XL.pth')\nnew_pd = {}\nfor k in pd:\n    if \"copy\" not in k and \"deq\" not in k:\n        new_pd[k] = pd[k].clone().detach().cpu()\ntorch.save(new_pd, 'pretrained_models\u002Fmdeq_XL_new.pth')\n```\n然后使用生成的新模型文件进行测试。","https:\u002F\u002Fgithub.com\u002Flocuslab\u002Fdeq\u002Fissues\u002F14",{"id":118,"question_zh":119,"answer_zh":120,"source_url":121},221,"训练 MDEQ_LARGE 模型时在 loss.backward() 处出现段错误（Segmentation Fault）如何处理？","该问题与 PyTorch 的反向钩子（backward hook）机制有关。在某些 PyTorch 版本（如 1.7.1+cu110）和 CUDA 环境下，钩子移除操作可能引发段错误。临时解决方案是避免在 backward_hook 函数内部调用 self.hook.remove()，而是将其移出。虽然这会导致每次前向传播都保留钩子，但 Python 的垃圾回收机制会在引用计数归零后自动清理，不会造成内存泄漏，只是会略微增加计算和内存开销。","https:\u002F\u002Fgithub.com\u002Flocuslab\u002Fdeq\u002Fissues\u002F12",{"id":123,"question_zh":124,"answer_zh":125,"source_url":126},222,"为什么 MDEQ 代码中需要检查并移除已存在的钩子（hook）？教程中没有这样做。","MDEQ 实现中可能存在多次前向传播或梯度计算的情况（例如在使用雅可比正则化时），这会导致同一个模块被多次注册钩子。如果不检查并移除已有钩子，就会重复注册，引发错误或异常行为。而基础教程中的示例通常只进行单次前向-反向过程，因此不需要此检查。如果你的自定义损失只需要部分求解过程可微，可以考虑将求解器拆分为 torch.no_grad() 和 torch.enable_grad() 两部分，以更精细地控制梯度计算。","https:\u002F\u002Fgithub.com\u002Flocuslab\u002Fdeq\u002Fissues\u002F24",{"id":128,"question_zh":129,"answer_zh":130,"source_url":131},223,"MDEQ 在不同 batch size 下推理结果是否一致？","MDEQ 的 Broyden 
求解器是按 batch 并行处理的，不同样本在同一 batch 内理论上不会相互干扰。但如果观察到不同 batch size 下输出不一致，可能是由于数值精度、收敛阈值或随机性（如 dropout）导致的。建议确保推理时关闭所有随机操作（如设置 model.eval()），并检查求解器的收敛容差（tolerance）是否足够小以保证稳定性。","https:\u002F\u002Fgithub.com\u002Flocuslab\u002Fdeq\u002Fissues\u002F21",{"id":133,"question_zh":134,"answer_zh":135,"source_url":111},224,"Broyden's method 中为何选择 'bad' 版本而不是 'good' 版本？","这是基于经验选择的结果。项目作者在初期实验中同时尝试了 Broyden good 和 bad 两种变体，发现 'bad' 版本表现略好，因此采用了它。但这并不意味着 'good' 版本无效——实际上 Anderson 加速和带雅可比正则化的朴素不动点迭代也都能很好地工作。选择哪种方法可根据具体任务效果进行调整。",[]]