[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-amdegroot--ssd.pytorch":3,"tool-amdegroot--ssd.pytorch":64},[4,17,27,35,43,56],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":16},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,3,"2026-04-05T11:01:52",[13,14,15],"开发框架","图像","Agent","ready",{"id":18,"name":19,"github_repo":20,"description_zh":21,"stars":22,"difficulty_score":23,"last_commit_at":24,"category_tags":25,"status":16},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",140436,2,"2026-04-05T23:32:43",[13,15,26],"语言模型",{"id":28,"name":29,"github_repo":30,"description_zh":31,"stars":32,"difficulty_score":23,"last_commit_at":33,"category_tags":34,"status":16},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 
绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",107662,"2026-04-03T11:11:01",[13,14,15],{"id":36,"name":37,"github_repo":38,"description_zh":39,"stars":40,"difficulty_score":23,"last_commit_at":41,"category_tags":42,"status":16},3704,"NextChat","ChatGPTNextWeb\u002FNextChat","NextChat 是一款轻量且极速的 AI 助手，旨在为用户提供流畅、跨平台的大模型交互体验。它完美解决了用户在多设备间切换时难以保持对话连续性，以及面对众多 AI 模型不知如何统一管理的痛点。无论是日常办公、学习辅助还是创意激发，NextChat 都能让用户随时随地通过网页、iOS、Android、Windows、MacOS 或 Linux 端无缝接入智能服务。\n\n这款工具非常适合普通用户、学生、职场人士以及需要私有化部署的企业团队使用。对于开发者而言，它也提供了便捷的自托管方案，支持一键部署到 Vercel 或 Zeabur 等平台。\n\nNextChat 的核心亮点在于其广泛的模型兼容性，原生支持 Claude、DeepSeek、GPT-4 及 Gemini Pro 等主流大模型，让用户在一个界面即可自由切换不同 AI 能力。此外，它还率先支持 MCP（Model Context Protocol）协议，增强了上下文处理能力。针对企业用户，NextChat 提供专业版解决方案，具备品牌定制、细粒度权限控制、内部知识库整合及安全审计等功能，满足公司对数据隐私和个性化管理的高标准要求。",87618,"2026-04-05T07:20:52",[13,26],{"id":44,"name":45,"github_repo":46,"description_zh":47,"stars":48,"difficulty_score":23,"last_commit_at":49,"category_tags":50,"status":16},2268,"ML-For-Beginners","microsoft\u002FML-For-Beginners","ML-For-Beginners 是由微软推出的一套系统化机器学习入门课程，旨在帮助零基础用户轻松掌握经典机器学习知识。这套课程将学习路径规划为 12 周，包含 26 节精炼课程和 52 道配套测验，内容涵盖从基础概念到实际应用的完整流程，有效解决了初学者面对庞大知识体系时无从下手、缺乏结构化指导的痛点。\n\n无论是希望转型的开发者、需要补充算法背景的研究人员，还是对人工智能充满好奇的普通爱好者，都能从中受益。课程不仅提供了清晰的理论讲解，还强调动手实践，让用户在循序渐进中建立扎实的技能基础。其独特的亮点在于强大的多语言支持，通过自动化机制提供了包括简体中文在内的 50 多种语言版本，极大地降低了全球不同背景用户的学习门槛。此外，项目采用开源协作模式，社区活跃且内容持续更新，确保学习者能获取前沿且准确的技术资讯。如果你正寻找一条清晰、友好且专业的机器学习入门之路，ML-For-Beginners 将是理想的起点。",84991,"2026-04-05T10:45:23",[14,51,52,53,15,54,26,13,55],"数据工具","视频","插件","其他","音频",{"id":57,"name":58,"github_repo":59,"description_zh":60,"stars":61,"difficulty_score":10,"last_commit_at":62,"category_tags":63,"status":16},3128,"ragflow","infiniflow\u002Fragflow","RAGFlow 
是一款领先的开源检索增强生成（RAG）引擎，旨在为大语言模型构建更精准、可靠的上下文层。它巧妙地将前沿的 RAG 技术与智能体（Agent）能力相结合，不仅支持从各类文档中高效提取知识，还能让模型基于这些知识进行逻辑推理和任务执行。\n\n在大模型应用中，幻觉问题和知识滞后是常见痛点。RAGFlow 通过深度解析复杂文档结构（如表格、图表及混合排版），显著提升了信息检索的准确度，从而有效减少模型“胡编乱造”的现象，确保回答既有据可依又具备时效性。其内置的智能体机制更进一步，使系统不仅能回答问题，还能自主规划步骤解决复杂问题。\n\n这款工具特别适合开发者、企业技术团队以及 AI 研究人员使用。无论是希望快速搭建私有知识库问答系统，还是致力于探索大模型在垂直领域落地的创新者，都能从中受益。RAGFlow 提供了可视化的工作流编排界面和灵活的 API 接口，既降低了非算法背景用户的上手门槛，也满足了专业开发者对系统深度定制的需求。作为基于 Apache 2.0 协议开源的项目，它正成为连接通用大模型与行业专有知识之间的重要桥梁。",77062,"2026-04-04T04:44:48",[15,14,13,26,54],{"id":65,"github_repo":66,"name":67,"description_en":68,"description_zh":69,"ai_summary_zh":69,"readme_en":70,"readme_zh":71,"quickstart_zh":72,"use_case_zh":73,"hero_image_url":74,"owner_login":75,"owner_name":76,"owner_avatar_url":77,"owner_bio":78,"owner_company":79,"owner_location":80,"owner_email":81,"owner_twitter":82,"owner_website":83,"owner_url":84,"languages":85,"stars":94,"forks":95,"last_commit_at":96,"license":97,"difficulty_score":10,"env_os":98,"env_gpu":99,"env_ram":98,"env_deps":100,"category_tags":109,"github_topics":110,"view_count":23,"oss_zip_url":79,"oss_zip_packed_at":79,"status":16,"created_at":119,"updated_at":120,"faqs":121,"releases":151},2155,"amdegroot\u002Fssd.pytorch","ssd.pytorch","A PyTorch Implementation of Single Shot MultiBox Detector","ssd.pytorch 是经典目标检测算法 SSD（Single Shot MultiBox Detector）的 PyTorch 版本实现。它旨在解决图像中多类物体的快速定位与识别问题，能够在单次前向传播中同时完成物体分类与边界框回归，兼顾了检测速度与精度。\n\n该项目复现了 2016 年提出的 SSD 论文核心逻辑，为习惯使用 PyTorch 框架的开发者提供了可靠的基准代码。相比于原始 Caffe 版本，ssd.pytorch 更易于在现代深度学习环境中进行调试、修改和二次开发。工具内置了对 PASCAL VOC 和 MS COCO 等主流数据集的下载脚本与加载器，并支持利用 Visdom 实时可视化训练过程中的损失变化，极大地简化了从环境配置到模型训练、评估的全流程。\n\nssd.pytorch 特别适合计算机视觉领域的研究人员、算法工程师以及高校学生使用。对于希望深入理解单阶段检测器原理，或需要基于 SSD 架构开展新实验的用户来说，这是一个结构清晰、功能完备的开源起点。虽然项目主要面向具备一定编程基础的技术人员，但其详细的文档和模块化设计也降低了复现前沿算法的门槛。","# SSD: Single Shot MultiBox Object Detector, in PyTorch\nA [PyTorch](http:\u002F\u002Fpytorch.org\u002F) implementation of [Single Shot MultiBox 
Detector](http:\u002F\u002Farxiv.org\u002Fabs\u002F1512.02325) from the 2016 paper by Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott Reed, Cheng-Yang Fu, and Alexander C. Berg.  The official and original Caffe code can be found [here](https:\u002F\u002Fgithub.com\u002Fweiliu89\u002Fcaffe\u002Ftree\u002Fssd).\n\n\n\u003Cimg align=\"right\" src= \"https:\u002F\u002Fgithub.com\u002Famdegroot\u002Fssd.pytorch\u002Fblob\u002Fmaster\u002Fdoc\u002Fssd.png\" height = 400\u002F>\n\n### Table of Contents\n- \u003Ca href='#installation'>Installation\u003C\u002Fa>\n- \u003Ca href='#datasets'>Datasets\u003C\u002Fa>\n- \u003Ca href='#training-ssd'>Train\u003C\u002Fa>\n- \u003Ca href='#evaluation'>Evaluate\u003C\u002Fa>\n- \u003Ca href='#performance'>Performance\u003C\u002Fa>\n- \u003Ca href='#demos'>Demos\u003C\u002Fa>\n- \u003Ca href='#todo'>Future Work\u003C\u002Fa>\n- \u003Ca href='#references'>Reference\u003C\u002Fa>\n\n&nbsp;\n&nbsp;\n&nbsp;\n&nbsp;\n\n## Installation\n- Install [PyTorch](http:\u002F\u002Fpytorch.org\u002F) by selecting your environment on the website and running the appropriate command.\n- Clone this repository.\n  * Note: We currently only support Python 3+.\n- Then download the dataset by following the [instructions](#datasets) below.\n- We now support [Visdom](https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Fvisdom) for real-time loss visualization during training!\n  * To use Visdom in the browser:\n  ```Shell\n  # First install Python server and client\n  pip install visdom\n  # Start the server (probably in a screen or tmux)\n  python -m visdom.server\n  ```\n  * Then (during training) navigate to http:\u002F\u002Flocalhost:8097\u002F (see the Train section below for training details).\n- Note: For training, we currently support [VOC](http:\u002F\u002Fhost.robots.ox.ac.uk\u002Fpascal\u002FVOC\u002F) and [COCO](http:\u002F\u002Fmscoco.org\u002F), and aim to add [ImageNet](http:\u002F\u002Fwww.image-net.org\u002F) support 
soon.\n\n## Datasets\nTo make things easy, we provide bash scripts to handle the dataset downloads and setup for you.  We also provide simple dataset loaders that inherit `torch.utils.data.Dataset`, making them fully compatible with the `torchvision.datasets` [API](http:\u002F\u002Fpytorch.org\u002Fdocs\u002Ftorchvision\u002Fdatasets.html).\n\n\n### COCO\nMicrosoft COCO: Common Objects in Context\n\n##### Download COCO 2014\n```Shell\n# specify a directory for dataset to be downloaded into, else default is ~\u002Fdata\u002F\nsh data\u002Fscripts\u002FCOCO2014.sh\n```\n\n### VOC Dataset\nPASCAL VOC: Visual Object Classes\n\n##### Download VOC2007 trainval & test\n```Shell\n# specify a directory for dataset to be downloaded into, else default is ~\u002Fdata\u002F\nsh data\u002Fscripts\u002FVOC2007.sh # \u003Cdirectory>\n```\n\n##### Download VOC2012 trainval\n```Shell\n# specify a directory for dataset to be downloaded into, else default is ~\u002Fdata\u002F\nsh data\u002Fscripts\u002FVOC2012.sh # \u003Cdirectory>\n```\n\n## Training SSD\n- First download the fc-reduced [VGG-16](https:\u002F\u002Farxiv.org\u002Fabs\u002F1409.1556) PyTorch base network weights at:              https:\u002F\u002Fs3.amazonaws.com\u002Famdegroot-models\u002Fvgg16_reducedfc.pth\n- By default, we assume you have downloaded the file in the `ssd.pytorch\u002Fweights` dir:\n\n```Shell\nmkdir weights\ncd weights\nwget https:\u002F\u002Fs3.amazonaws.com\u002Famdegroot-models\u002Fvgg16_reducedfc.pth\n```\n\n- To train SSD using the train script simply specify the parameters listed in `train.py` as a flag or manually change them.\n\n```Shell\npython train.py\n```\n\n- Note:\n  * For training, an NVIDIA GPU is strongly recommended for speed.\n  * For instructions on Visdom usage\u002Finstallation, see the \u003Ca href='#installation'>Installation\u003C\u002Fa> section.\n  * You can pick-up training from a checkpoint by specifying the path as one of the training parameters (again, see `train.py` 
for options)\n\n## Evaluation\nTo evaluate a trained network:\n\n```Shell\npython eval.py\n```\n\nYou can specify the parameters listed in the `eval.py` file by flagging them or manually changing them.  \n\n\n\u003Cimg align=\"left\" src= \"https:\u002F\u002Fgithub.com\u002Famdegroot\u002Fssd.pytorch\u002Fblob\u002Fmaster\u002Fdoc\u002Fdetection_examples.png\">\n\n## Performance\n\n#### VOC2007 Test\n\n##### mAP\n\n| Original | Converted weiliu89 weights | From scratch w\u002Fo data aug | From scratch w\u002F data aug |\n|:-:|:-:|:-:|:-:|\n| 77.2 % | 77.26 % | 58.12% | 77.43 % |\n\n##### FPS\n**GTX 1060:** ~45.45 FPS\n\n## Demos\n\n### Use a pre-trained SSD network for detection\n\n#### Download a pre-trained network\n- We are trying to provide PyTorch `state_dicts` (dict of weight tensors) of the latest SSD model definitions trained on different datasets.  \n- Currently, we provide the following PyTorch models:\n    * SSD300 trained on VOC0712 (newest PyTorch weights)\n      - https:\u002F\u002Fs3.amazonaws.com\u002Famdegroot-models\u002Fssd300_mAP_77.43_v2.pth\n    * SSD300 trained on VOC0712 (original Caffe weights)\n      - https:\u002F\u002Fs3.amazonaws.com\u002Famdegroot-models\u002Fssd_300_VOC0712.pth\n- Our goal is to reproduce this table from the [original paper](http:\u002F\u002Farxiv.org\u002Fabs\u002F1512.02325)\n\u003Cp align=\"left\">\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Famdegroot_ssd.pytorch_readme_b044cffb28ea.png\" alt=\"SSD results on multiple datasets\" width=\"800px\">\u003C\u002Fp>\n\n### Try the demo notebook\n- Make sure you have [jupyter notebook](http:\u002F\u002Fjupyter.readthedocs.io\u002Fen\u002Flatest\u002Finstall.html) installed.\n- Two alternatives for installing jupyter notebook:\n    1. If you installed PyTorch with [conda](https:\u002F\u002Fwww.continuum.io\u002Fdownloads) (recommended), then you should already have it.  
(Just navigate to the ssd.pytorch cloned repo and run):\n    `jupyter notebook`\n\n    2. If using [pip](https:\u002F\u002Fpypi.python.org\u002Fpypi\u002Fpip):\n\n```Shell\n# make sure pip is upgraded\npip3 install --upgrade pip\n# install jupyter notebook\npip install jupyter\n# Run this inside ssd.pytorch\njupyter notebook\n```\n\n- Now navigate to `demo\u002Fdemo.ipynb` at http:\u002F\u002Flocalhost:8888 (by default) and have at it!\n\n### Try the webcam demo\n- Works on CPU (may have to tweak `cv2.waitKey` for optimal fps) or on an NVIDIA GPU\n- This demo currently requires opencv2+ w\u002F python bindings and an onboard webcam\n  * You can change the default webcam in `demo\u002Flive.py`\n- Install the [imutils](https:\u002F\u002Fgithub.com\u002Fjrosebr1\u002Fimutils) package to leverage multi-threading on CPU:\n  * `pip install imutils`\n- Running `python -m demo.live` opens the webcam and begins detecting!\n\n## TODO\nWe have accumulated the following to-do list, which we hope to complete in the near future\n- Still to come:\n  * [x] Support for the MS COCO dataset\n  * [ ] Support for SSD512 training and testing\n  * [ ] Support for training on custom datasets\n\n## Authors\n\n* [**Max deGroot**](https:\u002F\u002Fgithub.com\u002Famdegroot)\n* [**Ellis Brown**](http:\u002F\u002Fgithub.com\u002Fellisbrown)\n\n***Note:*** Unfortunately, this is just a hobby of ours and not a full-time job, so we'll do our best to keep things up to date, but no guarantees.  That being said, thanks to everyone for your continued help and feedback as it is really appreciated. We will try to address everything as soon as possible.\n\n## References\n- Wei Liu, et al. 
\"SSD: Single Shot MultiBox Detector.\" [ECCV2016](http:\u002F\u002Farxiv.org\u002Fabs\u002F1512.02325).\n- [Original Implementation (CAFFE)](https:\u002F\u002Fgithub.com\u002Fweiliu89\u002Fcaffe\u002Ftree\u002Fssd)\n- A huge thank you to [Alex Koltun](https:\u002F\u002Fgithub.com\u002Falexkoltun) and his team at [Webyclip](http:\u002F\u002Fwww.webyclip.com) for their help in finishing the data augmentation portion.\n- A list of other great SSD ports that were sources of inspiration (especially the Chainer repo):\n  * [Chainer](https:\u002F\u002Fgithub.com\u002FHakuyume\u002Fchainer-ssd), [Keras](https:\u002F\u002Fgithub.com\u002Frykov8\u002Fssd_keras), [MXNet](https:\u002F\u002Fgithub.com\u002Fzhreshold\u002Fmxnet-ssd), [TensorFlow](https:\u002F\u002Fgithub.com\u002Fbalancap\u002FSSD-Tensorflow)\n","# SSD：单次多框目标检测器，基于 PyTorch\n这是 Wei Liu、Dragomir Anguelov、Dumitru Erhan、Christian Szegedy、Scott Reed、Cheng-Yang Fu 和 Alexander C. Berg 在 2016 年发表的论文中提出的 [Single Shot MultiBox Detector](http:\u002F\u002Farxiv.org\u002Fabs\u002F1512.02325) 的 [PyTorch](http:\u002F\u002Fpytorch.org\u002F) 实现。官方原始的 Caffe 代码可以在 [这里](https:\u002F\u002Fgithub.com\u002Fweiliu89\u002Fcaffe\u002Ftree\u002Fssd)找到。\n\n\n\u003Cimg align=\"right\" src= \"https:\u002F\u002Fgithub.com\u002Famdegroot\u002Fssd.pytorch\u002Fblob\u002Fmaster\u002Fdoc\u002Fssd.png\" height = 400\u002F>\n\n### 目录\n- \u003Ca href='#installation'>安装\u003C\u002Fa>\n- \u003Ca href='#datasets'>数据集\u003C\u002Fa>\n- \u003Ca href='#training-ssd'>训练\u003C\u002Fa>\n- \u003Ca href='#evaluation'>评估\u003C\u002Fa>\n- \u003Ca href='#performance'>性能\u003C\u002Fa>\n- \u003Ca href='#demos'>演示\u003C\u002Fa>\n- \u003Ca href='#todo'>未来工作\u003C\u002Fa>\n- \u003Ca href='#references'>参考文献\u003C\u002Fa>\n\n&nbsp;\n&nbsp;\n&nbsp;\n&nbsp;\n\n## 安装\n- 根据你的环境在 [PyTorch](http:\u002F\u002Fpytorch.org\u002F) 官网选择并运行相应的命令来安装 PyTorch。\n- 克隆本仓库。\n  * 注意：我们目前仅支持 Python 3 及以上版本。\n- 然后按照下面的[说明](#datasets)下载数据集。\n- 我们现在支持使用 
[Visdom](https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Fvisdom) 在训练过程中进行实时损失可视化！\n  * 要在浏览器中使用 Visdom：\n  ```Shell\n  # 首先安装 Python 服务器和客户端\n  pip install visdom\n  # 启动服务器（最好在 screen 或 tmux 中）\n  python -m visdom.server\n  ```\n  * 然后（在训练期间）访问 http:\u002F\u002Flocalhost:8097\u002F（训练细节请参见下方的“训练”部分）。\n- 注意：对于训练，我们目前支持 [VOC](http:\u002F\u002Fhost.robots.ox.ac.uk\u002Fpascal\u002FVOC\u002F) 和 [COCO](http:\u002F\u002Fmscoco.org\u002F) 数据集，并计划尽快添加对 [ImageNet](http:\u002F\u002Fwww.image-net.org\u002F) 的支持。\n\n## 数据集\n为了方便起见，我们提供了 bash 脚本来帮你处理数据集的下载和设置。我们还提供了简单的数据集加载器，它们继承自 `torch.utils.data.Dataset`，因此与 `torchvision.datasets` 的 [API](http:\u002F\u002Fpytorch.org\u002Fdocs\u002Ftorchvision\u002Fdatasets.html) 完全兼容。\n\n\n### COCO\nMicrosoft COCO：上下文中的常见物体\n\n##### 下载 COCO 2014\n```Shell\n# 指定数据集下载的目标目录，否则默认为 ~\u002Fdata\u002F\nsh data\u002Fscripts\u002FCOCO2014.sh\n```\n\n### VOC 数据集\nPASCAL VOC：视觉对象类别\n\n##### 下载 VOC2007 trainval 和 test\n```Shell\n# 指定数据集下载的目标目录，否则默认为 ~\u002Fdata\u002F\nsh data\u002Fscripts\u002FVOC2007.sh # \u003Cdirectory>\n```\n\n##### 下载 VOC2012 trainval\n```Shell\n# 指定数据集下载的目标目录，否则默认为 ~\u002Fdata\u002F\nsh data\u002Fscripts\u002FVOC2012.sh # \u003Cdirectory>\n```\n\n## 训练 SSD\n- 首先下载 fc-reduced [VGG-16](https:\u002F\u002Farxiv.org\u002Fabs\u002F1409.1556) 的 PyTorch 基础网络权重，地址为： https:\u002F\u002Fs3.amazonaws.com\u002Famdegroot-models\u002Fvgg16_reducedfc.pth\n- 默认情况下，我们认为你已将该文件下载到 `ssd.pytorch\u002Fweights` 目录：\n\n```Shell\nmkdir weights\ncd weights\nwget https:\u002F\u002Fs3.amazonaws.com\u002Famdegroot-models\u002Fvgg16_reducedfc.pth\n```\n\n- 要使用训练脚本训练 SSD，只需通过命令行标志指定 `train.py` 中列出的参数，或手动修改它们即可。\n\n```Shell\npython train.py\n```\n\n- 注意：\n  * 对于训练，强烈建议使用 NVIDIA GPU 以提高速度。\n  * 关于 Visdom 的使用和安装说明，请参阅\u003Ca href='#installation'>安装\u003C\u002Fa>部分。\n  * 你可以通过指定检查点路径从检查点继续训练（同样，请参阅 `train.py` 中的选项）。\n\n## 评估\n要评估训练好的网络：\n\n```Shell\npython eval.py\n```\n\n你可以通过命令行标志指定 `eval.py` 文件中列出的参数，或手动修改它们。  \n\n\n\u003Cimg align=\"left\" src= 
\"https:\u002F\u002Fgithub.com\u002Famdegroot\u002Fssd.pytorch\u002Fblob\u002Fmaster\u002Fdoc\u002Fdetection_examples.png\">\n\n## 性能\n\n#### VOC2007 测试\n\n##### mAP\n\n| 原始 | 转换后的 weiliu89 权重 | 从零开始无数据增强 | 从零开始有数据增强 |\n|:-:|:-:|:-:|:-:|\n| 77.2 % | 77.26 % | 58.12% | 77.43 % |\n\n##### FPS\n**GTX 1060:** ~45.45 FPS\n\n## 演示\n\n### 使用预训练的 SSD 网络进行检测\n\n#### 下载预训练的网络\n- 我们正努力提供在不同数据集上训练的最新 SSD 模型定义的 PyTorch `state_dicts`（权重张量字典）。\n- 当前，我们提供以下 PyTorch 模型：\n    * 在 VOC0712 上训练的 SSD300（最新的 PyTorch 权重）\n      - https:\u002F\u002Fs3.amazonaws.com\u002Famdegroot-models\u002Fssd300_mAP_77.43_v2.pth\n    * 在 VOC0712 上训练的 SSD300（原始 Caffe 权重）\n      - https:\u002F\u002Fs3.amazonaws.com\u002Famdegroot-models\u002Fssd_300_VOC0712.pth\n- 我们的目标是重现[原始论文](http:\u002F\u002Farxiv.org\u002Fabs\u002F1512.02325)中的这张表\n\u003Cp align=\"left\">\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Famdegroot_ssd.pytorch_readme_b044cffb28ea.png\" alt=\"SSD 在多个数据集上的结果\" width=\"800px\">\u003C\u002Fp>\n\n### 尝试演示笔记本\n- 确保你已安装 [jupyter notebook](http:\u002F\u002Fjupyter.readthedocs.io\u002Fen\u002Flatest\u002Finstall.html)。\n- 安装 jupyter notebook 的两种方法：\n    1. 如果你使用 [conda](https:\u002F\u002Fwww.continuum.io\u002Fdownloads) 安装了 PyTorch（推荐），那么你应该已经拥有它了。只需导航到克隆的 ssd.pytorch 仓库并运行：\n    `jupyter notebook`\n\n    2. 
如果使用 [pip](https:\u002F\u002Fpypi.python.org\u002Fpypi\u002Fpip)：\n\n```Shell\n# 确保 pip 已升级\npip3 install --upgrade pip\n# 安装 jupyter notebook\npip install jupyter\n# 在 ssd.pytorch 内运行\njupyter notebook\n```\n\n- 现在导航到 http:\u002F\u002Flocalhost:8888（默认）的 `demo\u002Fdemo.ipynb`，尽情体验吧！\n\n### 尝试摄像头演示\n- 可以在 CPU 上运行（可能需要调整 `cv2.waitKey` 以获得最佳帧率）或在 NVIDIA GPU 上运行\n- 此演示目前需要 opencv2+ 和 Python 绑定以及内置摄像头\n  * 你可以在 `demo\u002Flive.py` 中更改默认摄像头\n- 安装 [imutils](https:\u002F\u002Fgithub.com\u002Fjrosebr1\u002Fimutils) 包以利用 CPU 上的多线程功能：\n  * `pip install imutils`\n- 运行 `python -m demo.live` 即可打开摄像头并开始检测！\n\n## TODO\n我们整理了一份待办事项清单，希望在不久的将来完成：\n- 接下来还有：\n  * [x] 支持 MS COCO 数据集\n  * [ ] 支持 SSD512 的训练和测试\n  * [ ] 支持在自定义数据集上进行训练\n\n## 作者\n\n* [**Max deGroot**](https:\u002F\u002Fgithub.com\u002Famdegroot)\n* [**Ellis Brown**](http:\u002F\u002Fgithub.com\u002Fellisbrown)\n\n***注：*** 很遗憾，这仅仅是我们的一项业余爱好，而不是全职工作，因此我们会尽力保持内容的更新，但无法保证。尽管如此，感谢大家一直以来的帮助和反馈，我们非常感激。我们会尽快处理所有问题。\n\n## 参考文献\n- Wei Liu 等人. “SSD: Single Shot MultiBox Detector.” [ECCV2016](http:\u002F\u002Farxiv.org\u002Fabs\u002F1512.02325).\n- [原始实现（CAFFE）](https:\u002F\u002Fgithub.com\u002Fweiliu89\u002Fcaffe\u002Ftree\u002Fssd)\n- 衷心感谢 [Alex Koltun](https:\u002F\u002Fgithub.com\u002Falexkoltun) 及其在 [Webyclip](http:\u002F\u002Fwww.webyclip.com) 的团队，在完成数据增强部分时给予的帮助。\n- 其他优秀的 SSD 移植项目列表，这些项目曾为我们提供了灵感（尤其是 Chainer 仓库）：\n  * [Chainer](https:\u002F\u002Fgithub.com\u002FHakuyume\u002Fchainer-ssd), [Keras](https:\u002F\u002Fgithub.com\u002Frykov8\u002Fssd_keras), [MXNet](https:\u002F\u002Fgithub.com\u002Fzhreshold\u002Fmxnet-ssd), [TensorFlow](https:\u002F\u002Fgithub.com\u002Fbalancap\u002FSSD-Tensorflow)","# SSD.pytorch 快速上手指南\n\nSSD (Single Shot MultiBox Detector) 是一个基于 PyTorch 实现的单阶段目标检测算法。本指南将帮助你快速搭建环境并运行模型。\n\n## 1. 
环境准备\n\n在开始之前，请确保你的系统满足以下要求：\n\n*   **操作系统**: Linux 或 macOS (Windows 支持需自行配置)\n*   **Python**: 3.0 及以上版本\n*   **深度学习框架**: PyTorch (建议安装最新稳定版)\n*   **硬件**: 训练过程强烈推荐使用 NVIDIA GPU；推理可在 CPU 上进行（速度较慢）\n*   **可选依赖**:\n    *   `visdom`: 用于训练过程中的实时损失可视化\n    *   `opencv-python`: 用于演示和视频流检测\n    *   `jupyter`: 用于运行示例 Notebook\n\n**安装基础依赖命令：**\n\n```Shell\n# 安装 PyTorch (请访问 pytorch.org 获取适合你环境的命令，以下为示例)\npip install torch torchvision\n\n# 安装可视化工具和图像处理库\npip install visdom opencv-python imutils jupyter\n```\n\n> **提示**：国内用户建议使用清华源或阿里源加速 pip 安装：\n> `pip install -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple \u003Cpackage_name>`\n\n## 2. 安装步骤\n\n### 2.1 克隆项目\n首先将代码仓库克隆到本地：\n\n```Shell\ngit clone https:\u002F\u002Fgithub.com\u002Famdegroot\u002Fssd.pytorch.git\ncd ssd.pytorch\n```\n\n### 2.2 下载预训练骨干网络\n训练前需要下载经过裁剪的 VGG-16 预训练权重作为骨干网络。执行以下命令自动下载至 `weights` 目录：\n\n```Shell\nmkdir weights\ncd weights\nwget https:\u002F\u002Fs3.amazonaws.com\u002Famdegroot-models\u002Fvgg16_reducedfc.pth\ncd ..\n```\n\n> **注意**：如果 `wget` 下载速度慢，可手动在浏览器下载该文件并放入 `ssd.pytorch\u002Fweights\u002F` 目录下。\n\n### 2.3 数据集准备 (可选)\n如果你打算重新训练模型，需要下载数据集。项目提供了自动化脚本（默认下载到 `~\u002Fdata\u002F` 目录）：\n\n*   **COCO 2014**:\n    ```Shell\n    sh data\u002Fscripts\u002FCOCO2014.sh\n    ```\n*   **PASCAL VOC 2007 & 2012**:\n    ```Shell\n    sh data\u002Fscripts\u002FVOC2007.sh\n    sh data\u002Fscripts\u002FVOC2012.sh\n    ```\n\n## 3. 基本使用\n\n### 3.1 运行演示 (最快体验)\n无需训练，直接下载预训练模型进行物体检测演示。\n\n1.  **下载预训练模型** (以 VOC 数据集训练的 SSD300 为例)：\n    ```Shell\n    mkdir weights\n    cd weights\n    wget https:\u002F\u002Fs3.amazonaws.com\u002Famdegroot-models\u002Fssd300_mAP_77.43_v2.pth\n    cd ..\n    ```\n\n2.  **启动 Jupyter Notebook 查看示例**：\n    ```Shell\n    jupyter notebook\n    ```\n    在浏览器中打开 `demo\u002Fdemo.ipynb`，按照单元格顺序运行即可看到检测效果。\n\n3.  
**或者运行摄像头实时检测** (需连接摄像头)：\n    ```Shell\n    python -m demo.live\n    ```\n\n### 3.2 训练模型\n使用默认参数开始训练（需先准备好数据集和 VGG 权重）：\n\n```Shell\npython train.py\n```\n\n*   **可视化监控**：在另一个终端启动 Visdom 服务，然后在浏览器访问 `http:\u002F\u002Flocalhost:8097\u002F` 查看实时训练曲线。\n    ```Shell\n    python -m visdom.server\n    ```\n\n### 3.3 评估模型\n对训练好的网络进行评估：\n\n```Shell\npython eval.py\n```\n\n你可以通过修改 `train.py` 或 `eval.py` 中的参数，或在命令行中添加标志位来调整超参数和数据集路径。","某智慧物流团队的算法工程师正致力于开发一套自动分拣系统，需要让机器人实时识别传送带上不同尺寸的包裹并定位其坐标。\n\n### 没有 ssd.pytorch 时\n- **框架迁移成本高昂**：团队虽熟悉 PyTorch 生态，但 SSD 原始代码基于 Caffe 编写，强行复用需耗费数周进行复杂的框架重写与调试。\n- **训练过程不透明**：缺乏可视化工具，工程师只能盯着枯燥的终端日志猜测模型收敛情况，难以及时调整超参数。\n- **数据预处理繁琐**：面对 VOC 或 COCO 等标准数据集，需手动编写大量脚本处理下载、解压及格式转换，极易出错且占用开发时间。\n- **复现基准困难**：由于缺少开箱即用的预训练权重（如 VGG-16）和标准化评估脚本，难以快速验证算法是否达到论文所述的 77.2 mAP 性能基准。\n\n### 使用 ssd.pytorch 后\n- **原生 PyTorch 集成**：直接利用 ssd.pytorch 提供的原生实现，无缝对接现有 PyTorch 工作流，将环境搭建与代码适配时间从数周缩短至几小时。\n- **实时可视化监控**：借助集成的 Visdom 功能，在浏览器中实时观察损失曲线变化，能够直观地判断训练状态并迅速优化模型。\n- **一键数据集就绪**：调用项目自带的 Bash 脚本，自动完成 VOC 和 COCO 数据集的下载与配置，配合兼容 torchvision API 的加载器，立即开始训练。\n- **权威性能复现**：直接加载官方提供的缩减版 VGG-16 预训练权重，运行评估脚本即可快速验证模型在测试集上的精度，确保项目起点可靠。\n\nssd.pytorch 通过提供标准化的 PyTorch 实现与自动化流程，消除了跨框架移植壁垒，让团队能专注于核心业务逻辑而非底层工程琐事。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Famdegroot_ssd.pytorch_7ae0fcb7.png","amdegroot","Max deGroot","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Famdegroot_7b0b102a.jpg","Researching ai and its effects on mental health.",null,"Seattle, WA","m@xdegroot.net","maxdgroot","https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Falexander-degroot\u002F","https:\u002F\u002Fgithub.com\u002Famdegroot",[86,90],{"name":87,"color":88,"percentage":89},"Python","#3572A5",95.9,{"name":91,"color":92,"percentage":93},"Shell","#89e051",4.1,5234,1738,"2026-04-03T10:44:35","MIT","未说明","训练强烈建议使用 NVIDIA GPU（演示支持 CPU 或 NVIDIA GPU），具体型号和显存大小未说明，CUDA 版本未说明",{"notes":101,"python":102,"dependencies":103},"该工具是 SSD 目标检测算法的 PyTorch 实现。训练前需手动下载 VGG-16 预训练权重文件。支持 VOC 和 COCO 
数据集，并提供脚本自动下载。演示功能中，摄像头实时检测在 CPU 上运行时可能需要调整参数以优化帧率。","3+",[104,105,106,107,108],"PyTorch","Visdom","OpenCV2+ (with python bindings)","Jupyter Notebook","imutils",[14,13],[111,112,113,114,115,116,117,118],"pytorch","deep-learning","ssd","object-detection","computer-vision","machine-learning","image-recognition","webcam","2026-03-27T02:49:30.150509","2026-04-06T08:17:38.665648",[122,127,132,137,142,147],{"id":123,"question_zh":124,"answer_zh":125,"source_url":126},9946,"运行 demo\u002Flive.py 时出现 'ValueError: not enough values to unpack (expected 2, got 0)' 错误怎么办？","该错误通常发生在 PyTorch 版本更新后，维度检查方法不兼容。请修改 `layers\u002Ffunctions\u002Fdetection.py` 文件：\n1. 将 `if scores.dim() == 0:` 改为 `if scores.size(0) == 0:`。\n2. 将 `if dets.dim() == 1:` 改为 `if dets.size(0) == 1:`。\n修改后重启程序即可解决检测不到目标或崩溃的问题。","https:\u002F\u002Fgithub.com\u002Famdegroot\u002Fssd.pytorch\u002Fissues\u002F154",{"id":128,"question_zh":129,"answer_zh":130,"source_url":131},9947,"在 Python 2.7 环境下训练时遇到 'div_ only supports scalar multiplication' 或损失值为 NaN\u002FInf 如何解决？","这是因为代码主要针对 Python 3 编写，Python 2.7 存在整数除法兼容性问题。解决方法是在以下文件中添加 `from __future__ import division`：\n- `box_utils.py`\n- `prior_box.py`\n- `detection.py`\n此外，对于 `modules\u002Fl2norm.py` 中的报错，建议将 `x\u002F=norm.expand_as(x)` 修改为 `x = x.div(norm.expand_as(x))` 或重写 forward 函数以显式处理除法运算。","https:\u002F\u002Fgithub.com\u002Famdegroot\u002Fssd.pytorch\u002Fissues\u002F1",{"id":133,"question_zh":134,"answer_zh":135,"source_url":136},9948,"训练过程中出现 'StopIteration' 错误导致中断怎么办？","这是由于数据加载器迭代器耗尽导致的。请修改 `train.py` 文件中的数据获取逻辑，使用 try-except 块重置迭代器：\n将：\n`images, targets = next(batch_iterator)`\n替换为：\n```python\ntry:\n    images, targets = next(batch_iterator)\nexcept StopIteration:\n    batch_iterator = iter(data_loader)\n    images, targets = next(batch_iterator)\n```\n这样可以确保在一个 epoch 结束后自动重新开始下一轮迭代。","https:\u002F\u002Fgithub.com\u002Famdegroot\u002Fssd.pytorch\u002Fissues\u002F214",{"id":138,"question_zh":139,"answer_zh":140,"source_url":141},9949,"自定义数据集训练时出现 'NaN 
values at Multibox encoding' 或定位损失为 NaN 的原因是什么？","这通常是因为在 Python 2 中进行坐标归一化时发生了整数除法，导致边界框参数变为 0。请检查数据预处理代码（如 `voc0712.py` 或相关加载脚本），将除法运算强制转换为浮点数运算。\n例如，将：\n`cur_pt = cur_pt \u002F width`\n修改为：\n`cur_pt = 1.0 * cur_pt \u002F width`\n确保所有坐标计算都使用浮点数，避免产生零值导致的对数运算错误。","https:\u002F\u002Fgithub.com\u002Famdegroot\u002Fssd.pytorch\u002Fissues\u002F22",{"id":143,"question_zh":144,"answer_zh":145,"source_url":146},9950,"更换骨干网络（如从 VGG16 换为 ResNet101）后测试速度显著变慢怎么办？","这是已知现象，ResNet 等深层网络推理耗时通常高于 VGG16。如果评估单张图片耗时过长（如达到 2 秒），请检查以下几点：\n1. 确认是否意外启用了训练模式（model.train()），应确保使用 `net.eval()` 进行推理。\n2. 检查输入图像的尺寸是否过大，SSD300 应调整为 300x300。\n3. 确认 CUDA 和 cuDNN 是否正确安装并启用。\n若需保持高帧率（如 45fps），建议暂时使用原始的 VGG16 骨干网络。","https:\u002F\u002Fgithub.com\u002Famdegroot\u002Fssd.pytorch\u002Fissues\u002F4",{"id":148,"question_zh":149,"answer_zh":150,"source_url":126},9951,"修复维度错误后仍然无法检测到视频中的物体怎么办？","如果修复了 `ValueError` 但仍无检测结果，请检查 `BaseTransform` 的初始化参数是否与模型预训练时的设置一致。默认配置应为：\n`transform = BaseTransform(net.size, (104\u002F256.0, 117\u002F256.0, 123\u002F256.0))`\n同时确保使用的 PyTorch 版本与代码兼容（推荐 0.4.1 及以上），并检查权重文件路径是否正确加载。",[]]