[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-zhreshold--mxnet-ssd":3,"tool-zhreshold--mxnet-ssd":64},[4,17,26,40,48,56],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":16},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,3,"2026-04-05T11:01:52",[13,14,15],"开发框架","图像","Agent","ready",{"id":18,"name":19,"github_repo":20,"description_zh":21,"stars":22,"difficulty_score":23,"last_commit_at":24,"category_tags":25,"status":16},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",107662,2,"2026-04-03T11:11:01",[13,14,15],{"id":27,"name":28,"github_repo":29,"description_zh":30,"stars":31,"difficulty_score":23,"last_commit_at":32,"category_tags":33,"status":16},2268,"ML-For-Beginners","microsoft\u002FML-For-Beginners","ML-For-Beginners 是由微软推出的一套系统化机器学习入门课程，旨在帮助零基础用户轻松掌握经典机器学习知识。这套课程将学习路径规划为 12 周，包含 26 节精炼课程和 52 
道配套测验，内容涵盖从基础概念到实际应用的完整流程，有效解决了初学者面对庞大知识体系时无从下手、缺乏结构化指导的痛点。\n\n无论是希望转型的开发者、需要补充算法背景的研究人员，还是对人工智能充满好奇的普通爱好者，都能从中受益。课程不仅提供了清晰的理论讲解，还强调动手实践，让用户在循序渐进中建立扎实的技能基础。其独特的亮点在于强大的多语言支持，通过自动化机制提供了包括简体中文在内的 50 多种语言版本，极大地降低了全球不同背景用户的学习门槛。此外，项目采用开源协作模式，社区活跃且内容持续更新，确保学习者能获取前沿且准确的技术资讯。如果你正寻找一条清晰、友好且专业的机器学习入门之路，ML-For-Beginners 将是理想的起点。",84991,"2026-04-05T10:45:23",[14,34,35,36,15,37,38,13,39],"数据工具","视频","插件","其他","语言模型","音频",{"id":41,"name":42,"github_repo":43,"description_zh":44,"stars":45,"difficulty_score":10,"last_commit_at":46,"category_tags":47,"status":16},3128,"ragflow","infiniflow\u002Fragflow","RAGFlow 是一款领先的开源检索增强生成（RAG）引擎，旨在为大语言模型构建更精准、可靠的上下文层。它巧妙地将前沿的 RAG 技术与智能体（Agent）能力相结合，不仅支持从各类文档中高效提取知识，还能让模型基于这些知识进行逻辑推理和任务执行。\n\n在大模型应用中，幻觉问题和知识滞后是常见痛点。RAGFlow 通过深度解析复杂文档结构（如表格、图表及混合排版），显著提升了信息检索的准确度，从而有效减少模型“胡编乱造”的现象，确保回答既有据可依又具备时效性。其内置的智能体机制更进一步，使系统不仅能回答问题，还能自主规划步骤解决复杂问题。\n\n这款工具特别适合开发者、企业技术团队以及 AI 研究人员使用。无论是希望快速搭建私有知识库问答系统，还是致力于探索大模型在垂直领域落地的创新者，都能从中受益。RAGFlow 提供了可视化的工作流编排界面和灵活的 API 接口，既降低了非算法背景用户的上手门槛，也满足了专业开发者对系统深度定制的需求。作为基于 Apache 2.0 协议开源的项目，它正成为连接通用大模型与行业专有知识之间的重要桥梁。",77062,"2026-04-04T04:44:48",[15,14,13,38,37],{"id":49,"name":50,"github_repo":51,"description_zh":52,"stars":53,"difficulty_score":10,"last_commit_at":54,"category_tags":55,"status":16},519,"PaddleOCR","PaddlePaddle\u002FPaddleOCR","PaddleOCR 是一款基于百度飞桨框架开发的高性能开源光学字符识别工具包。它的核心能力是将图片、PDF 等文档中的文字提取出来，转换成计算机可读取的结构化数据，让机器真正“看懂”图文内容。\n\n面对海量纸质或电子文档，PaddleOCR 解决了人工录入效率低、数字化成本高的问题。尤其在人工智能领域，它扮演着连接图像与大型语言模型（LLM）的桥梁角色，能将视觉信息直接转化为文本输入，助力智能问答、文档分析等应用场景落地。\n\nPaddleOCR 适合开发者、算法研究人员以及有文档自动化需求的普通用户。其技术优势十分明显：不仅支持全球 100 多种语言的识别，还能在 Windows、Linux、macOS 等多个系统上运行，并灵活适配 CPU、GPU、NPU 等各类硬件。作为一个轻量级且社区活跃的开源项目，PaddleOCR 既能满足快速集成的需求，也能支撑前沿的视觉语言研究，是处理文字识别任务的理想选择。",74939,"2026-04-05T23:16:38",[38,14,13,37],{"id":57,"name":58,"github_repo":59,"description_zh":60,"stars":61,"difficulty_score":23,"last_commit_at":62,"category_tags":63,"status":16},2471,"tesseract","tesseract-ocr\u002Ftesseract","Tesseract 
是一款历史悠久且备受推崇的开源光学字符识别（OCR）引擎，最初由惠普实验室开发，后由 Google 维护，目前由全球社区共同贡献。它的核心功能是将图片中的文字转化为可编辑、可搜索的文本数据，有效解决了从扫描件、照片或 PDF 文档中提取文字信息的难题，是数字化归档和信息自动化的重要基础工具。\n\n在技术层面，Tesseract 展现了强大的适应能力。从版本 4 开始，它引入了基于长短期记忆网络（LSTM）的神经网络 OCR 引擎，显著提升了行识别的准确率；同时，为了兼顾旧有需求，它依然支持传统的字符模式识别引擎。Tesseract 原生支持 UTF-8 编码，开箱即用即可识别超过 100 种语言，并兼容 PNG、JPEG、TIFF 等多种常见图像格式。输出方面，它灵活支持纯文本、hOCR、PDF、TSV 等多种格式，方便后续数据处理。\n\nTesseract 主要面向开发者、研究人员以及需要构建文档处理流程的企业用户。由于它本身是一个命令行工具和库（libtesseract），不包含图形用户界面（GUI），因此最适合具备一定编程能力的技术人员集成到自动化脚本或应用程序中",73286,"2026-04-03T01:56:45",[13,14],{"id":65,"github_repo":66,"name":67,"description_en":68,"description_zh":69,"ai_summary_zh":70,"readme_en":71,"readme_zh":72,"quickstart_zh":73,"use_case_zh":74,"hero_image_url":75,"owner_login":76,"owner_name":77,"owner_avatar_url":78,"owner_bio":79,"owner_company":79,"owner_location":80,"owner_email":79,"owner_twitter":79,"owner_website":81,"owner_url":82,"languages":83,"stars":100,"forks":101,"last_commit_at":102,"license":103,"difficulty_score":10,"env_os":104,"env_gpu":105,"env_ram":106,"env_deps":107,"category_tags":115,"github_topics":116,"view_count":23,"oss_zip_url":79,"oss_zip_packed_at":79,"status":16,"created_at":118,"updated_at":119,"faqs":120,"releases":156},2339,"zhreshold\u002Fmxnet-ssd","mxnet-ssd","MXNet port of SSD: Single Shot MultiBox Object Detector.  
Reimplementation of https:\u002F\u002Fgithub.com\u002Fweiliu89\u002Fcaffe\u002Ftree\u002Fssd","mxnet-ssd 是一个基于 MXNet 深度学习框架实现的单发多框目标检测（SSD）开源项目。它旨在通过单一神经网络高效地完成物体的定位与分类任务，解决了传统检测方法速度慢或精度不足的痛点，实现了速度与准确性的良好平衡。\n\n该项目非常适合计算机视觉领域的研究人员和开发者使用，特别是那些希望复现经典 SSD 算法、进行模型训练评估，或需要在 MXNet 生态中部署目标检测应用的技术人员。虽然官方已建议新用户迁移至功能更丰富的 Gluon-CV 库，但 mxnet-ssd 作为经典的复现版本，依然具有重要的参考价值和实用性。\n\n其技术亮点在于完美兼容原版 Caffe 模型，并提供了便捷的模型转换工具，确保检测结果与原始版本几乎一致。此外，项目针对 MXNet 特性进行了深度优化：支持从 ResNet、Inception 等主流分类网络快速构建检测模型；利用后端多线程引擎显著提升了多 GPU 环境下的训练与推理速度；还集成了 MobileNet 预训练模型以满足移动端对“超快”速度的需求。项目同时提供 Docker 支持和 TensorBoard 可视化功能，进一步降低了环境配置与实验监","mxnet-ssd 是一个基于 MXNet 深度学习框架实现的单发多框目标检测（SSD）开源项目。它旨在通过单一神经网络高效地完成物体的定位与分类任务，解决了传统检测方法速度慢或精度不足的痛点，实现了速度与准确性的良好平衡。\n\n该项目非常适合计算机视觉领域的研究人员和开发者使用，特别是那些希望复现经典 SSD 算法、进行模型训练评估，或需要在 MXNet 生态中部署目标检测应用的技术人员。虽然官方已建议新用户迁移至功能更丰富的 Gluon-CV 库，但 mxnet-ssd 作为经典的复现版本，依然具有重要的参考价值和实用性。\n\n其技术亮点在于完美兼容原版 Caffe 模型，并提供了便捷的模型转换工具，确保检测结果与原始版本几乎一致。此外，项目针对 MXNet 特性进行了深度优化：支持从 ResNet、Inception 等主流分类网络快速构建检测模型；利用后端多线程引擎显著提升了多 GPU 环境下的训练与推理速度；还集成了 MobileNet 预训练模型以满足移动端对“超快”速度的需求。项目同时提供 Docker 支持和 TensorBoard 可视化功能，进一步降低了环境配置与实验监控的门槛。","# SSD: Single Shot MultiBox Object Detector\n\nSSD is an unified framework for object detection with a single network.\n\nYou can use the code to train\u002Fevaluate\u002Ftest for object detection task.\n\n### Disclaimer\nThis is a re-implementation of original SSD which is based on caffe. The official\nrepository is available [here](https:\u002F\u002Fgithub.com\u002Fweiliu89\u002Fcaffe\u002Ftree\u002Fssd).\nThe arXiv paper is available [here](http:\u002F\u002Farxiv.org\u002Fabs\u002F1512.02325).\n\nThis example is intended for reproducing the nice detector while fully utilize the\nremarkable traits of MXNet.\n* The model is fully compatible with caffe version.\n* Model [converter](#convert-caffemodel) from caffe is available now!\n* The result is almost identical to the original version. 
However, due to different implementation details, the results might differ slightly.\n\n### What's new\n* **This repo is now deprecated; I am migrating to the latest [Gluon-CV](https:\u002F\u002Fgithub.com\u002Fdmlc\u002Fgluon-cv) which is more user friendly and has a lot more algorithms in development. This repo will not receive active development, however, you can continue to use it with mxnet 1.1.0 (probably 1.2.0).**\n* Now this repo is internally synchronized up to date with the official mxnet backend. `pip install mxnet` will work for this repo as well in most cases.\n* MobileNet pretrained model now provided.\n* Added multiple trained models.\n* Added a much simpler way to compose network from mainstream classification networks (resnet, inception...) and [Guide](symbol\u002FREADME.md).\n* Update to the latest version according to caffe version, with 5% mAP increase.\n* Use C++ record iterator based on back-end multi-thread engine to achieve huge speed up on multi-gpu environments.\n* Monitor validation mAP during training.\n* More network symbols under development and test.\n* Extra operators are now in `mxnet\u002Fsrc\u002Foperator\u002Fcontrib`, symbols are modified. Please use [Release-v0.2-beta](https:\u002F\u002Fgithub.com\u002Fzhreshold\u002Fmxnet-ssd\u002Freleases\u002Ftag\u002Fv0.2-beta) for old models.\n* added Docker support for this repository, prebuilt & including all packages and dependencies. (linux only)\n* added tensorboard support, allowing a more convenient way of research. 
(linux only)\n\n### Demo results\n![demo1](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fzhreshold_mxnet-ssd_readme_de733397d925.png)\n![demo2](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fzhreshold_mxnet-ssd_readme_c2de5d22774a.png)\n![demo3](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fzhreshold_mxnet-ssd_readme_5440e9874a53.png)\n\n### mAP\n|        Model          | Training data    | Test data |  mAP | Note |\n|:-----------------:|:----------------:|:---------:|:----:|:-----|\n| [VGG16_reduced 300x300](https:\u002F\u002Fgithub.com\u002Fzhreshold\u002Fmxnet-ssd\u002Freleases\u002Fdownload\u002Fv0.5-beta\u002Fvgg16_ssd_300_voc0712_trainval.zip) | VOC07+12 trainval| VOC07 test| 77.8| fast |\n| [VGG16_reduced 512x512](https:\u002F\u002Fgithub.com\u002Fzhreshold\u002Fmxnet-ssd\u002Freleases\u002Fdownload\u002Fv0.5-beta\u002Fvgg16_ssd_512_voc0712_trainval.zip) | VOC07+12 trainval | VOC07 test| 79.9| slow |\n| [Inception-v3 512x512](https:\u002F\u002Fgithub.com\u002Fzhreshold\u002Fmxnet-ssd\u002Freleases\u002Fdownload\u002Fv0.7-alpha\u002Fssd_inceptionv3_512_voc0712trainval.zip) | VOC07+12 trainval| VOC07 test| 78.9 | fastest |\n| [Resnet-50 512x512](https:\u002F\u002Fgithub.com\u002Fzhreshold\u002Fmxnet-ssd\u002Freleases\u002Fdownload\u002Fv0.7-alpha\u002Fssd_resnet50_512_voc0712trainval.zip) | VOC07+12 trainval| VOC07 test| 79.1 | fast |\n| [MobileNet 512x512](https:\u002F\u002Fgithub.com\u002Fzhreshold\u002Fmxnet-ssd\u002Freleases\u002Fdownload\u002Fv0.7-alpha\u002Fmobilenet-ssd-512.zip) | VOC07+12 trainval| VOC07 test| 72.5 | super fast |\n| [MobileNet 608x608](https:\u002F\u002Fgithub.com\u002Fzhreshold\u002Fmxnet-ssd\u002Freleases\u002Fdownload\u002Fv0.7-alpha\u002Fmobilenet-ssd-608.zip) | VOC07+12 trainval| VOC07 test| 74.7 | super fast |\n\n\n*More to be added*\n\n### Speed\n|         Model         |   GPU            | CUDNN | Batch-size | FPS* |\n|:---------------------:|:----------------:|:-----:|:----------:|:----:|\n| 
VGG16_reduced 300x300 | TITAN X(Maxwell) | v5.1  |     16     | 95   |\n| VGG16_reduced 300x300 | TITAN X(Maxwell) | v5.1  |     8      | 95   |\n| VGG16_reduced 300x300 | TITAN X(Maxwell) | v5.1  |     1      | 64   |\n| VGG16_reduced 300x300 | TITAN X(Maxwell) |  N\u002FA  |     8      | 36   |\n| VGG16_reduced 300x300 | TITAN X(Maxwell) |  N\u002FA  |     1      | 28   |\n\n*Forward time only, data loading and drawing excluded.*\n\n### Getting started\n* Option #1 - install using 'Docker'. If you are not familiar with this technology, there is a 'Docker' section below.\nYou can get the latest image:\n```\ndocker pull daviddocker78\u002Fmxnet-ssd:gpu_0.12.0_cuda9\n```\n* You will need python modules: `cv2`, `matplotlib` and `numpy`.\nIf you use mxnet-python api, you probably have already got them.\nYou can install them via pip or package managers, such as `apt-get`:\n```\nsudo apt-get install python-opencv python-matplotlib python-numpy\n```\n* Clone this repo:\n```\n# if you don't have git, install it via apt or homebrew\u002Fyum based on your system\nsudo apt-get install git\n# cd where you would like to clone this repo\ncd ~\ngit clone --recursive https:\u002F\u002Fgithub.com\u002Fzhreshold\u002Fmxnet-ssd.git\n# make sure you clone this with --recursive\n# if not done correctly or you are using downloaded repo, pull them all via:\n# git submodule update --recursive --init\ncd mxnet-ssd\u002Fmxnet\n```\n* (Skip this step if you have official MXNet installed.) Build MXNet: `cd \u002Fpath\u002Fto\u002Fmxnet-ssd\u002Fmxnet`. Follow the official instructions [here](http:\u002F\u002Fmxnet.io\u002Fget_started\u002Finstall.html).\n```\n# for Ubuntu\u002FDebian\ncp make\u002Fconfig.mk .\u002Fconfig.mk\n# modify it if necessary\n```\nRemember to enable CUDA if you want to be able to train, since CPU training is\ninsanely slow. 
Using CUDNN is optional, but highly recommended.\n\n### Try the demo\n* Download the pretrained model: [`ssd_resnet50_0712.zip`](https:\u002F\u002Fgithub.com\u002Fzhreshold\u002Fmxnet-ssd\u002Freleases\u002Fdownload\u002Fv0.6\u002Fresnet50_ssd_512_voc0712_trainval.zip), and extract to `model\u002F` directory.\n* Run\n```\n# cd \u002Fpath\u002Fto\u002Fmxnet-ssd\npython demo.py --gpu 0\n# play with examples:\npython demo.py --epoch 0 --images .\u002Fdata\u002Fdemo\u002Fdog.jpg --thresh 0.5\npython demo.py --cpu --network resnet50 --data-shape 512\n# wait for library to load for the first time\n```\n* Check `python demo.py --help` for more options.\n\n### Train the model\nThis example only covers training on the Pascal VOC dataset. Other datasets should\nbe easily supported by adding a subclass derived from class `Imdb` in `dataset\u002Fimdb.py`.\nSee example of `dataset\u002Fpascal_voc.py` for details.\n* Download the converted pretrained `vgg16_reduced` model [here](https:\u002F\u002Fgithub.com\u002Fzhreshold\u002Fmxnet-ssd\u002Freleases\u002Fdownload\u002Fv0.2-beta\u002Fvgg16_reduced.zip), unzip `.param` and `.json` files\ninto `model\u002F` directory by default.\n* Download the PASCAL VOC dataset, skip this step if you already have one.\n```\ncd \u002Fpath\u002Fto\u002Fwhere_you_store_datasets\u002F\nwget http:\u002F\u002Fhost.robots.ox.ac.uk\u002Fpascal\u002FVOC\u002Fvoc2012\u002FVOCtrainval_11-May-2012.tar\nwget http:\u002F\u002Fhost.robots.ox.ac.uk\u002Fpascal\u002FVOC\u002Fvoc2007\u002FVOCtrainval_06-Nov-2007.tar\nwget http:\u002F\u002Fhost.robots.ox.ac.uk\u002Fpascal\u002FVOC\u002Fvoc2007\u002FVOCtest_06-Nov-2007.tar\n# Extract the data.\ntar -xvf VOCtrainval_11-May-2012.tar\ntar -xvf VOCtrainval_06-Nov-2007.tar\ntar -xvf VOCtest_06-Nov-2007.tar\n```\n* We are going to use `trainval` set in VOC2007\u002F2012 as a common strategy.\nThe suggested directory structure is to store `VOC2007` and `VOC2012` directories\nin the same `VOCdevkit` folder.\n* Then link 
`VOCdevkit` folder to `data\u002FVOCdevkit` by default:\n```\nln -s \u002Fpath\u002Fto\u002FVOCdevkit \u002Fpath\u002Fto\u002Fthis_example\u002Fdata\u002FVOCdevkit\n```\nUsing a symbolic link instead of copying saves some disk space.\n* Create packed binary file for faster training:\n```\n# cd \u002Fpath\u002Fto\u002Fmxnet-ssd\nbash tools\u002Fprepare_pascal.sh\n# or if you are using windows\npython tools\u002Fprepare_dataset.py --dataset pascal --year 2007,2012 --set trainval --target .\u002Fdata\u002Ftrain.lst\npython tools\u002Fprepare_dataset.py --dataset pascal --year 2007 --set test --target .\u002Fdata\u002Fval.lst --shuffle False\n```\n* Start training:\n```\npython train.py\n```\n* By default, this example will use `batch-size=32` and `learning_rate=0.004`.\nYou might need to change the parameters a bit if you have different configurations.\nCheck `python train.py --help` for more training options. For example, if you have 4 GPUs, use:\n```\n# note that a perfect training parameter set is yet to be discovered for multi-gpu\npython train.py --gpus 0,1,2,3 --batch-size 128 --lr 0.001\n```\n* Memory usage: MXNet is very memory efficient, training on `VGG16_reduced` model with `batch-size` 32 takes around 4684MB without CUDNN(conv1_x and conv2_x fixed).\n\n### Evaluate trained model\nUse:\n```\n# cd \u002Fpath\u002Fto\u002Fmxnet-ssd\npython evaluate.py --gpus 0,1 --batch-size 128 --epoch 0\n```\n### Convert model to deploy mode\nThis simply removes all loss layers, and attaches a layer for merging results and non-maximum suppression.\nUseful when loading python symbol is not available.\n```\n# cd \u002Fpath\u002Fto\u002Fmxnet-ssd\npython deploy.py --num-class 20\n# then you can run demo with new model without loading python symbol\npython demo.py --prefix model\u002Fssd_300_deploy --epoch 0 --deploy\n```\n\n### Convert caffemodel\nConverter from caffe is available at `\u002Fpath\u002Fto\u002Fmxnet-ssd\u002Ftools\u002Fcaffe_converter`\n\nThis is specifically 
modified to handle custom layers in caffe-ssd. Usage:\n```\ncd \u002Fpath\u002Fto\u002Fmxnet-ssd\u002Ftools\u002Fcaffe_converter\nmake\npython convert_model.py deploy.prototxt name_of_pretrained_caffe_model.caffemodel ssd_converted\n# you will use this model in deploy mode without loading from python symbol\npython demo.py --prefix ssd_converted --epoch 1 --deploy\n```\nThere is no guarantee that conversion will always work, but at least it's good for now.\n\n### Legacy models\nSince the new interface for composing network is introduced, the old models have inconsistent names for weights.\nYou can still load the previous model by renaming the symbol to `legacy_xxx.py`\nand calling `python train\u002Fdemo.py --network legacy_xxx `\nFor example:\n```\npython demo.py --network 'legacy_vgg16_ssd_300.py' --prefix model\u002Fssd_300 --epoch 0\n```\n\n### Docker \nFirst make sure [docker](https:\u002F\u002Fdocs.docker.com\u002Fengine\u002Finstallation\u002F) is\ninstalled. The docker plugin\n[nvidia-docker](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Fnvidia-docker) is required to run on\nNvidia GPUs.\n\n* Pre-built docker images are available at https:\u002F\u002Fhub.docker.com\u002Fr\u002Fdaviddocker78\u002Fmxnet-ssd\u002F\nTo download a pre-built image, run:\n```\ndocker pull daviddocker78\u002Fmxnet-ssd:gpu_0.12.0_cuda9\n```\nOtherwise, if you wish to build it yourself, you have the Dockerfiles available in this repo, under the 'docker' folder.\n* To run a container instance:\n```\nnvidia-docker run -it --rm myImageName:tag\n```\nNow you can execute commands the same way you would if you had installed mxnet on your own computer.\nFor more information, see the [Guide](docker\u002FREADME.md).\n\n### Tensorboard\n* There has been some great effort to bring tensorboard to mxnet.\nIf you chose to work with Docker, you have it installed in the pre-built image you've downloaded. 
otherwise, follow [here](https:\u002F\u002Fgithub.com\u002Fdmlc\u002Ftensorboard) for installation steps.\n* To save training loss graphs, validation AP per class, and validation ROC graphs to tensorboard while training, you can specify:\n```\npython train.py --gpus 0,1,2,3 --batch-size 128 --lr 0.001 --tensorboard True\n```\n* To save also the distributions of layers (actually, the variance of them), you can specify:\n```\npython train.py --gpus 0,1,2,3 --batch-size 128 --lr 0.001 --tensorboard True --monitor 40\n```\n* Visualization with Docker: the UI of tensorboard has changed over time. to get the best experience, download the new tensorflow docker-image:\n```\n# download the built image from Dockerhub\ndocker pull tensorflow\u002Ftensorflow:1.4.0-devel-gpu\n# run a container and open a port using '-p' flag. \n# attach a volume from where you stored your logs, to a directory inside the container\nnvidia-docker run -it --rm -p 0.0.0.0:6006:6006 -v \u002Fmy\u002Ffull\u002Fexperiment\u002Fpath:\u002Fres tensorflow\u002Ftensorflow:1.4.0-devel-gpu\ncd \u002Fres\ntensorboard --logdir=.\n```\nTo launch tensorboard without docker, simply run the last command\nNow tensorboard is loading the tensorEvents of your experiment. 
open your browser under '0.0.0.0:6006' and you will have tensorboard!\n\n### Tensorboard visualizations\n![loss](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fzhreshold_mxnet-ssd_readme_04b15ce94364.png)\n![AP](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fzhreshold_mxnet-ssd_readme_9a100b0b6ed0.png)\n![ROC](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fzhreshold_mxnet-ssd_readme_69b4f8f1224d.png)\n\n","# SSD：单次多框目标检测器\n\nSSD 是一种使用单一网络进行目标检测的统一框架。\n\n您可以使用该代码来训练、评估和测试目标检测任务。\n\n### 免责声明\n这是对基于 Caffe 的原始 SSD 的重新实现。官方仓库可在 [这里](https:\u002F\u002Fgithub.com\u002Fweiliu89\u002Fcaffe\u002Ftree\u002Fssd) 找到。arXiv 论文可在此查阅 [这里](http:\u002F\u002Farxiv.org\u002Fabs\u002F1512.02325)。\n\n本示例旨在复现这一优秀的检测器，同时充分利用 MXNet 的卓越特性。\n* 该模型与 Caffe 版本完全兼容。\n* 现已提供从 Caffe 转换模型的工具！\n* 结果几乎与原始版本相同。然而，由于实现细节的不同，结果可能会略有差异。\n\n### 新内容\n* **此仓库现已弃用，我正在迁移到最新的 [Gluon-CV](https:\u002F\u002Fgithub.com\u002Fdmlc\u002Fgluon-cv)，它更加用户友好，并且正在开发更多算法。此仓库将不再积极维护，但您仍可继续使用 MXNet 1.1.0（可能为 1.2.0）版本。**\n* 现在，此仓库内部已与官方 MXNet 后端的数据同步。大多数情况下，`pip install mxnet` 也可用于此仓库。\n* 现提供 MobileNet 预训练模型。\n* 增加了多个训练好的模型。\n* 提供了一种更简单的方式，可以从主流分类网络（ResNet、Inception 等）中构建网络，并附有[指南](symbol\u002FREADME.md)。\n* 根据 Caffe 版本更新至最新版本，mAP 提升了 5%。\n* 使用基于后端多线程引擎的 C++ 记录迭代器，在多 GPU 环境下实现了巨大的速度提升。\n* 在训练过程中监控验证 mAP。\n* 更多网络符号正在开发和测试中。\n* 额外的算子现在位于 `mxnet\u002Fsrc\u002Foperator\u002Fcontrib` 中，符号也已修改。请使用 [Release-v0.2-beta](https:\u002F\u002Fgithub.com\u002Fzhreshold\u002Fmxnet-ssd\u002Freleases\u002Ftag\u002Fv0.2-beta) 来加载旧模型。\n* 为该仓库增加了 Docker 支持，预构建并包含所有软件包和依赖项。（仅限 Linux）\n* 增加了 TensorBoard 支持，使研究更加便捷。（仅限 Linux）\n\n### 演示结果\n![demo1](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fzhreshold_mxnet-ssd_readme_de733397d925.png)\n![demo2](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fzhreshold_mxnet-ssd_readme_c2de5d22774a.png)\n![demo3](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fzhreshold_mxnet-ssd_readme_5440e9874a53.png)\n\n### mAP\n|        模型          | 训练数据    | 测试数据 |  mAP | 备注 
|\n|:-----------------:|:----------------:|:---------:|:----:|:-----|\n| [VGG16_reduced 300x300](https:\u002F\u002Fgithub.com\u002Fzhreshold\u002Fmxnet-ssd\u002Freleases\u002Fdownload\u002Fv0.5-beta\u002Fvgg16_ssd_300_voc0712_trainval.zip) | VOC07+12 trainval| VOC07 test| 77.8| 快速 |\n| [VGG16_reduced 512x512](https:\u002F\u002Fgithub.com\u002Fzhreshold\u002Fmxnet-ssd\u002Freleases\u002Fdownload\u002Fv0.5-beta\u002Fvgg16_ssd_512_voc0712_trainval.zip) | VOC07+12 trainval | VOC07 test| 79.9| 慢速 |\n| [Inception-v3 512x512](https:\u002F\u002Fgithub.com\u002Fzhreshold\u002Fmxnet-ssd\u002Freleases\u002Fdownload\u002Fv0.7-alpha\u002Fssd_inceptionv3_512_voc0712trainval.zip) | VOC07+12 trainval| VOC07 test| 78.9 | 最快 |\n| [Resnet-50 512x512](https:\u002F\u002Fgithub.com\u002Fzhreshold\u002Fmxnet-ssd\u002Freleases\u002Fdownload\u002Fv0.7-alpha\u002Fssd_resnet50_512_voc0712trainval.zip) | VOC07+12 trainval| VOC07 test| 79.1 | 快速 |\n| [MobileNet 512x512](https:\u002F\u002Fgithub.com\u002Fzhreshold\u002Fmxnet-ssd\u002Freleases\u002Fdownload\u002Fv0.7-alpha\u002Fmobilenet-ssd-512.zip) | VOC07+12 trainval| VOC07 test| 72.5 | 超快速 |\n| [MobileNet 608x608](https:\u002F\u002Fgithub.com\u002Fzhreshold\u002Fmxnet-ssd\u002Freleases\u002Fdownload\u002Fv0.7-alpha\u002Fmobilenet-ssd-608.zip) | VOC07+12 trainval| VOC07 test| 74.7 | 超快速 |\n\n\n* 更多内容即将添加\n\n### 速度\n|         模型         |   GPU            | CUDNN | 批量大小 | FPS* |\n|:---------------------:|:----------------:|:-----:|:----------:|:----:|\n| VGG16_reduced 300x300 | TITAN X(Maxwell) | v5.1  |     16     | 95   |\n| VGG16_reduced 300x300 | TITAN X(Maxwell) | v5.1  |     8      | 95   |\n| VGG16_reduced 300x300 | TITAN X(Maxwell) | v5.1  |     1      | 64   |\n| VGG16_reduced 300x300 | TITAN X(Maxwell) |  N\u002FA  |     8      | 36   |\n| VGG16_reduced 300x300 | TITAN X(Maxwell) |  N\u002FA  |     1      | 28   |\n\n*仅计算前向传播时间，不包括数据加载和绘图。\n\n### 开始使用\n* 选项 #1 - 使用 'Docker' 安装。如果您不熟悉这项技术，下方有一个 'Docker' 部分。\n您可以获取最新镜像：\n```\ndocker 
pull daviddocker78\u002Fmxnet-ssd:gpu_0.12.0_cuda9\n```\n* 您需要 Python 模块：`cv2`、`matplotlib` 和 `numpy`。\n如果您使用 MXNet 的 Python API，您可能已经安装了这些模块。\n您可以通过 pip 或包管理器（如 `apt-get`）安装它们：\n```\nsudo apt-get install python-opencv python-matplotlib python-numpy\n```\n* 克隆此仓库：\n```\n# 如果您没有 git，请根据您的系统通过 apt 或 homebrew\u002Fyum 安装\nsudo apt-get install git\n# 切换到您希望克隆此仓库的目录\ncd ~\ngit clone --recursive https:\u002F\u002Fgithub.com\u002Fzhreshold\u002Fmxnet-ssd.git\n# 确保使用 --recursive 进行克隆\n# 如果未正确完成或您使用的是下载的仓库，请通过以下命令拉取所有子模块：\n# git submodule update --recursive --init\ncd mxnet-ssd\u002Fmxnet\n```\n* （如果您已安装官方 MXNet，则跳过此步骤。）编译 MXNet：`cd \u002Fpath\u002Fto\u002Fmxnet-ssd\u002Fmxnet`。按照官方说明 [这里](http:\u002F\u002Fmxnet.io\u002Fget_started\u002Finstall.html) 进行操作。\n```\n# 对于 Ubuntu\u002FDebian\ncp make\u002Fconfig.mk .\u002Fconfig.mk\n# 如有必要进行修改\n```\n请记住，如果想进行训练，务必启用 CUDA，因为 CPU 训练速度极其缓慢。使用 CUDNN 是可选的，但强烈推荐。\n\n### 尝试演示\n* 下载预训练模型：[`ssd_resnet50_0712.zip`](https:\u002F\u002Fgithub.com\u002Fzhreshold\u002Fmxnet-ssd\u002Freleases\u002Fdownload\u002Fv0.6\u002Fresnet50_ssd_512_voc0712_trainval.zip)，并解压到 `model\u002F` 目录。\n* 运行\n```\n# 切换到 \u002Fpath\u002Fto\u002Fmxnet-ssd\npython demo.py --gpu 0\n# 尝试示例：\npython demo.py --epoch 0 --images .\u002Fdata\u002Fdemo\u002Fdog.jpg --thresh 0.5\npython demo.py --cpu --network resnet50 --data-shape 512\n# 第一次运行时需等待库加载\n```\n* 查看 `python demo.py --help` 以获取更多选项。\n\n### 训练模型\n本示例仅涵盖 Pascal VOC 数据集上的训练。其他数据集应可通过在 `dataset\u002Fimdb.py` 中创建 `Imdb` 类的子类来轻松支持。\n有关详细信息，请参阅 `dataset\u002Fpascal_voc.py` 示例。\n* 下载转换后的预训练 `vgg16_reduced` 模型 [此处](https:\u002F\u002Fgithub.com\u002Fzhreshold\u002Fmxnet-ssd\u002Freleases\u002Fdownload\u002Fv0.2-beta\u002Fvgg16_reduced.zip)，并将 `.param` 和 `.json` 文件默认解压到 `model\u002F` 目录。\n* 下载 PASCAL VOC 数据集，如果您已有则可跳过此步骤。\n```\ncd \u002Fpath\u002Fto\u002Fwhere_you_store_datasets\u002F\nwget http:\u002F\u002Fhost.robots.ox.ac.uk\u002Fpascal\u002FVOC\u002Fvoc2012\u002FVOCtrainval_11-May-2012.tar\nwget 
http:\u002F\u002Fhost.robots.ox.ac.uk\u002Fpascal\u002FVOC\u002Fvoc2007\u002FVOCtrainval_06-Nov-2007.tar\nwget http:\u002F\u002Fhost.robots.ox.ac.uk\u002Fpascal\u002FVOC\u002Fvoc2007\u002FVOCtest_06-Nov-2007.tar\n\n# 解压数据。\ntar -xvf VOCtrainval_11-May-2012.tar\ntar -xvf VOCtrainval_06-Nov-2007.tar\ntar -xvf VOCtest_06-Nov-2007.tar\n```\n* 我们将使用 VOC2007\u002F2012 中的 `trainval` 数据集作为通用策略。\n建议的目录结构是将 `VOC2007` 和 `VOC2012` 目录存储在同一个 `VOCdevkit` 文件夹中。\n* 然后默认将 `VOCdevkit` 文件夹链接到 `data\u002FVOCdevkit`：\n```\nln -s \u002Fpath\u002Fto\u002FVOCdevkit \u002Fpath\u002Fto\u002Fthis_example\u002Fdata\u002FVOCdevkit\n```\n使用硬链接而不是复制可以节省一些磁盘空间。\n* 创建打包的二进制文件以加快训练速度：\n```\n# cd \u002Fpath\u002Fto\u002Fmxnet-ssd\nbash tools\u002Fprepare_pascal.sh\n# 或者如果你使用的是 Windows\npython tools\u002Fprepare_dataset.py --dataset pascal --year 2007,2012 --set trainval --target .\u002Fdata\u002Ftrain.lst\npython tools\u002Fprepare_dataset.py --dataset pascal --year 2007 --set test --target .\u002Fdata\u002Fval.lst --shuffle False\n```\n* 开始训练：\n```\npython train.py\n```\n* 默认情况下，此示例将使用 `batch-size=32` 和 `learning_rate=0.004`。如果你有不同的配置，可能需要稍微调整这些参数。可以运行 `python train.py --help` 查看更多训练选项。例如，如果你有 4 张 GPU 卡，可以这样设置：\n```\n# 注意：多 GPU 的最佳训练参数组合尚未完全确定\npython train.py --gpus 0,1,2,3 --batch-size 128 --lr 0.001\n```\n* 内存占用：MXNet 非常节省内存，在 `VGG16_reduced` 模型上使用 `batch-size` 32 进行训练时，不使用 CUDNN（conv1_x 和 conv2_x 固定）的情况下，大约占用 4684MB。\n\n### 评估训练好的模型\n使用以下命令：\n```\n# cd \u002Fpath\u002Fto\u002Fmxnet-ssd\npython evaluate.py --gpus 0,1 --batch-size 128 --epoch 0\n```\n\n### 将模型转换为部署模式\n这一步骤会移除所有损失层，并添加一个用于合并结果和非极大值抑制的层。当无法加载 Python 符号文件时非常有用。\n```\n# cd \u002Fpath\u002Fto\u002Fmxnet-ssd\npython deploy.py --num-class 20\n# 然后你可以使用新模型运行演示，而无需加载 Python 符号文件\npython demo.py --prefix model\u002Fssd_300_deploy --epoch 0 --deploy\n```\n\n### 转换 Caffe 模型\nCaffe 到 MXNet 的转换工具位于 `\u002Fpath\u002Fto\u002Fmxnet-ssd\u002Ftools\u002Fcaffe_converter`。\n\n该工具经过特别修改，以处理 Caffe-SSD 中的自定义层。使用方法如下：\n```\ncd 
\u002Fpath\u002Fto\u002Fmxnet-ssd\u002Ftools\u002Fcaffe_converter\nmake\npython convert_model.py deploy.prototxt name_of_pretrained_caffe_model.caffemodel ssd_converted\n# 你可以使用这个转换后的模型以部署模式运行，而无需加载 Python 符号文件\npython demo.py --prefix ssd_converted --epoch 1 --deploy\n```\n\n虽然不能保证每次转换都能成功，但至少目前效果不错。\n\n### 旧版模型\n由于引入了新的网络构建接口，旧版模型的权重名称存在不一致的问题。你仍然可以通过将符号文件重命名为 `legacy_xxx.py` 并使用 `python train\u002Fdemo.py --network legacy_xxx` 来加载之前的模型。例如：\n```\npython demo.py --network 'legacy_vgg16_ssd_300.py' --prefix model\u002Fssd_300 --epoch 0\n```\n\n### Docker\n首先确保已安装 [docker](https:\u002F\u002Fdocs.docker.com\u002Fengine\u002Finstallation\u002F)。要在 Nvidia GPU 上运行，还需要安装 [nvidia-docker](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002Fnvidia-docker) 插件。\n\n* 预构建的 Docker 镜像可在 https:\u002F\u002Fhub.docker.com\u002Fr\u002Fdaviddocker78\u002Fmxnet-ssd\u002F 获取。要下载预构建的镜像，运行以下命令：\n```\ndocker pull daviddocker78\u002Fmxnet-ssd:gpu_0.12.0_cuda9\n```\n如果你希望自行构建镜像，可以在本仓库的 `docker` 文件夹中找到 Dockerfile。\n\n* 运行容器实例：\n```\nnvidia-docker run -it --rm myImageName:tag\n```\n现在你可以在容器中执行与本地安装 MXNet 相同的操作。更多信息请参阅 [指南](docker\u002FREADME.md)。\n\n### TensorBoard\n* 社区为将 TensorBoard 集成到 MXNet 中做出了很大努力。如果你选择使用 Docker，预构建的镜像中已经包含了 TensorBoard。否则，请按照 [此处](https:\u002F\u002Fgithub.com\u002Fdmlc\u002Ftensorboard) 的说明进行安装。\n* 在训练过程中，若想将训练损失曲线、各类别验证 AP 以及验证 ROC 曲线保存到 TensorBoard，可以指定：\n```\npython train.py --gpus 0,1,2,3 --batch-size 128 --lr 0.001 --tensorboard True\n```\n* 如果还想保存各层的分布情况（实际上是它们的方差），可以指定：\n```\npython train.py --gpus 0,1,2,3 --batch-size 128 --lr 0.001 --tensorboard True --monitor 40\n```\n* 使用 Docker 进行可视化：TensorBoard 的界面随着时间不断变化。为了获得最佳体验，建议下载最新的 TensorFlow Docker 镜像：\n```\n# 从 Docker Hub 下载预构建的镜像\ndocker pull tensorflow\u002Ftensorflow:1.4.0-devel-gpu\n# 运行容器并使用 `-p` 标志打开端口。将存储日志的卷挂载到容器内的目录。\nnvidia-docker run -it --rm -p 0.0.0.0:6006:6006 -v \u002Fmy\u002Ffull\u002Fexperiment\u002Fpath:\u002Fres tensorflow\u002Ftensorflow:1.4.0-devel-gpu\ncd \u002Fres\ntensorboard 
--logdir=.\n```\n如果不想使用 Docker，可以直接运行最后一行命令。现在 TensorBoard 正在加载你的实验事件数据。在浏览器中访问 `0.0.0.0:6006`，你就能看到 TensorBoard 界面了！\n\n### TensorBoard 可视化\n![loss](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fzhreshold_mxnet-ssd_readme_04b15ce94364.png)\n![AP](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fzhreshold_mxnet-ssd_readme_9a100b0b6ed0.png)\n![ROC](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fzhreshold_mxnet-ssd_readme_69b4f8f1224d.png)","# MXNet-SSD 快速上手指南\n\nMXNet-SSD 是基于 MXNet 框架实现的单发多框（Single Shot MultiBox）目标检测算法。本指南将帮助你快速搭建环境并运行演示。\n\n> **注意**：该仓库目前已停止活跃开发，作者推荐迁移至更现代且功能更全的 [Gluon-CV](https:\u002F\u002Fgithub.com\u002Fdmlc\u002Fgluon-cv)。但本项目仍可用于复现经典 SSD 结果或与 Caffe 版本兼容的场景。\n\n## 1. 环境准备\n\n### 系统要求\n*   **操作系统**: Linux (推荐 Ubuntu\u002FDebian)，Windows 需额外配置。\n*   **GPU**: 推荐使用 NVIDIA GPU 进行训练和推理（CPU 训练速度极慢）。\n*   **依赖软件**:\n    *   Git\n    *   Python (2.7 或 3.x)\n    *   CUDA & cuDNN (如需 GPU 加速)\n\n### 前置依赖安装\n确保已安装以下 Python 模块：`cv2`, `matplotlib`, `numpy`。\n\n```bash\n# Ubuntu\u002FDebian 系统安装示例\nsudo apt-get update\nsudo apt-get install -y git python-opencv python-matplotlib python-numpy\n\n# 若使用 pip 安装\npip install opencv-python matplotlib numpy\n```\n\n## 2. 安装步骤\n\n你可以选择 **Docker 一键部署**（推荐）或 **源码编译安装**。\n\n### 方案 A：使用 Docker (推荐)\nDocker 镜像已预装所有依赖及 MXNet，适合快速体验。\n\n1.  **拉取镜像** (支持 GPU):\n    ```bash\n    docker pull daviddocker78\u002Fmxnet-ssd:gpu_0.12.0_cuda9\n    ```\n    *(注：国内用户若拉取缓慢，可配置 Docker 国内加速器)*\n\n2.  **运行容器**:\n    ```bash\n    nvidia-docker run -it --rm daviddocker78\u002Fmxnet-ssd:gpu_0.12.0_cuda9\n    ```\n\n### 方案 B：源码编译安装\n\n1.  **克隆仓库** (务必使用 `--recursive` 参数以获取子模块):\n    ```bash\n    git clone --recursive https:\u002F\u002Fgithub.com\u002Fzhreshold\u002Fmxnet-ssd.git\n    cd mxnet-ssd\u002Fmxnet\n    ```\n    *(若忘记加参数，可执行 `git submodule update --recursive --init` 补全)*\n\n2.  
**编译 MXNet**:\n    ```bash\n    # 复制配置文件\n    cp make\u002Fconfig.mk .\u002Fconfig.mk\n    \n    # 编辑 config.mk，启用 CUDA (USE_CUDA=1) 和 cuDNN (USE_CUDNN=1) 以加速\n    # vim config.mk \n    \n    # 开始编译 (根据 CPU 核心数调整 -j 参数)\n    make -j$(nproc)\n    ```\n\n3.  **安装 Python 接口**:\n    ```bash\n    cd ..\u002Fpython\n    python setup.py install\n    # 或者添加环境变量\n    export PYTHONPATH=\u002Fpath\u002Fto\u002Fmxnet-ssd\u002Fmxnet\u002Fpython:$PYTHONPATH\n    ```\n\n## 3. 基本使用\n\n### 运行演示 (Demo)\n\n1.  **下载预训练模型**:\n    下载 ResNet50 版本的模型并解压到 `model\u002F` 目录。\n    *   模型地址: [resnet50_ssd_512_voc0712_trainval.zip](https:\u002F\u002Fgithub.com\u002Fzhreshold\u002Fmxnet-ssd\u002Freleases\u002Fdownload\u002Fv0.6\u002Fresnet50_ssd_512_voc0712_trainval.zip)\n    \n    ```bash\n    mkdir -p model\n    cd model\n    wget https:\u002F\u002Fgithub.com\u002Fzhreshold\u002Fmxnet-ssd\u002Freleases\u002Fdownload\u002Fv0.6\u002Fresnet50_ssd_512_voc0712_trainval.zip\n    unzip resnet50_ssd_512_voc0712_trainval.zip\n    cd ..\n    ```\n\n2.  **执行检测**:\n    使用默认图片进行测试：\n    ```bash\n    python demo.py --gpu 0\n    ```\n\n    指定图片和阈值进行测试：\n    ```bash\n    python demo.py --epoch 0 --images .\u002Fdata\u002Fdemo\u002Fdog.jpg --thresh 0.5\n    ```\n\n    若无 GPU，可使用 CPU 模式（速度较慢）：\n    ```bash\n    python demo.py --cpu --network resnet50 --data-shape 512\n    ```\n\n### 简要训练流程 (Pascal VOC)\n\n若需重新训练模型，请按以下步骤操作：\n\n1.  **准备数据集**:\n    下载 Pascal VOC 2007 和 2012 数据集，并将其链接到项目目录：\n    ```bash\n    ln -s \u002Fpath\u002Fto\u002FVOCdevkit \u002Fpath\u002Fto\u002Fmxnet-ssd\u002Fdata\u002FVOCdevkit\n    ```\n\n2.  **生成二进制记录文件**:\n    ```bash\n    bash tools\u002Fprepare_pascal.sh\n    ```\n\n3.  
**开始训练**:\n    默认使用 VGG16_reduced 模型。\n    ```bash\n    python train.py\n    ```\n    \n    多 GPU 训练示例 (4 卡):\n    ```bash\n    python train.py --gpus 0,1,2,3 --batch-size 128 --lr 0.001\n    ```\n\n### 模型部署转换\n\n将训练好的模型转换为部署模式（移除损失层，添加 NMS 层），以便在不加载 Python 符号的情况下使用：\n\n```bash\npython deploy.py --num-class 20\npython demo.py --prefix model\u002Fssd_300_deploy --epoch 0 --deploy\n```","某电商物流团队正致力于升级其自动化分拣系统，需要实时识别传送带上不同尺寸和角度的包裹及标签。\n\n### 没有 mxnet-ssd 时\n- **检测流程繁琐低效**：团队需依赖多阶段检测框架（如 Faster R-CNN），先提取候选区域再分类，导致单张图片处理耗时过长，无法满足高速流水线的实时性要求。\n- **多尺度目标漏检严重**：面对传送带上远近不一、大小差异巨大的包裹，传统单尺度模型难以兼顾，经常漏检远处的小件货物或截断近处的大件。\n- **部署与迁移成本高**：原有基于 Caffe 的 SSD 模型虽精度尚可，但难以利用现代多 GPU 集群进行加速，且缺乏灵活的预训练模型（如 MobileNet），在边缘设备上部署极其困难。\n- **训练监控盲区**：缺乏直观的验证集 mAP 实时监控手段，开发人员只能在训练结束后评估结果，调优周期漫长且盲目。\n\n### 使用 mxnet-ssd 后\n- **实现端到端极速检测**：利用 mxnet-ssd 的单次前向传播特性，结合其 C++ 记录迭代器和多线程后端，在 TITAN X 显卡上实现了高达 95 FPS 的处理速度，完美匹配流水线节拍。\n- **全尺寸目标精准覆盖**：通过内置的多尺度特征图机制，模型能同时精准定位从细小标签到大型箱体等各类目标，显著降低了漏检率。\n- **灵活部署与性能跃升**：团队直接调用提供的 MobileNet 预训练模型，在保证 72.5% mAP 的前提下实现了“超快”推理，轻松将算法移植到低算力边缘网关；同时利用现成的 Caffe 模型转换器，无缝迁移了原有资产。\n- **可视化训练调优**：借助集成的 TensorBoard 支持和训练过程中的 mAP 实时监控，工程师能即时调整超参数，将模型收敛时间缩短了 40%。\n\nmxnet-ssd 通过统一的单次检测框架和高效的 MXNet 后端，成功将复杂的工业级物体检测任务转化为高精度、低延迟的实时生产力。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fzhreshold_mxnet-ssd_de733397.png","zhreshold","Joshua Z. 
Zhang","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Fzhreshold_44520929.jpg",null,"Palo Alto, California","https:\u002F\u002Fzhreshold.github.io","https:\u002F\u002Fgithub.com\u002Fzhreshold",[84,88,92,96],{"name":85,"color":86,"percentage":87},"Python","#3572A5",98.1,{"name":89,"color":90,"percentage":91},"Shell","#89e051",1.8,{"name":93,"color":94,"percentage":95},"Makefile","#427819",0.1,{"name":97,"color":98,"percentage":99},"Batchfile","#C1F12E",0,763,335,"2026-02-18T06:13:54","MIT","Linux, Windows","训练强烈建议使用 NVIDIA GPU (如 TITAN X)，需安装 CUDA (文中示例为 v5.1 或 CUDA 9) 和可选的 cuDNN；CPU 训练速度极慢不推荐。","未说明 (文中提及 VGG16 模型 batch-size 32 训练约占用 4.6GB 显存)",{"notes":108,"python":109,"dependencies":110},"该项目已弃用，作者建议迁移至 Gluon-CV。官方提供基于 Docker 的预构建镜像（仅支持 Linux），包含所有依赖及 TensorBoard 支持。若不使用 Docker，需手动编译 MXNet 并开启 CUDA 支持。Windows 用户需注意部分功能（如 Docker、TensorBoard）可能仅支持 Linux。","未说明 (需支持 mxnet-python api)",[111,112,113,114],"mxnet>=1.1.0","opencv-python (cv2)","matplotlib","numpy",[14],[117],"object-detection","2026-03-27T02:49:30.150509","2026-04-06T09:46:09.363641",[121,126,131,136,141,146,151],{"id":122,"question_zh":123,"answer_zh":124,"source_url":125},10743,"运行 demo.py 时遇到 'too many resources requested for launch' CUDA 错误怎么办？","这通常是由于资源请求过多导致的。解决方案包括：1. 重新编译 OpenCV 并禁用 CUDA 支持；2. 
使用 CMAKE_BUILD_TYPE=Release 重新编译 MXNet。此外，也可以参考 MXNet 官方仓库的相关 issue (dmlc\u002Fmxnet#5170) 获取更多细节。","https:\u002F\u002Fgithub.com\u002Fzhreshold\u002Fmxnet-ssd\u002Fissues\u002F13",{"id":127,"question_zh":128,"answer_zh":129,"source_url":130},10744,"在 Windows 上运行 demo.py 报错 'Cannot find argument pooling_convention' 如何解决？","该问题通常源于 MXNet 版本不匹配或构建配置问题。建议安装 MXNet 的预构建 pip 包以避免复杂的编译过程。如果必须自行编译，请遵循官方指南使用 Visual Studio 进行构建。维护者提供了一个仅包含 CPU 版本的 libmxnet.dll 参考项目：https:\u002F\u002Fgithub.com\u002Fzhreshold\u002Fmxnet-ssd.cpp\u002Ftree\u002Fwin。","https:\u002F\u002Fgithub.com\u002Fzhreshold\u002Fmxnet-ssd\u002Fissues\u002F19",{"id":132,"question_zh":133,"answer_zh":134,"source_url":135},10745,"训练最新版本的 SSD 时出现 'TypeError: int object is not iterable' 错误怎么办？","此错误通常是由于 MXNet 版本过旧导致的预测返回值类型不匹配。解决方法是升级 MXNet 到最新版本。此外，有用户反馈将学习率调整为 0.002 也能帮助解决训练不稳定或指标计算异常的问题。","https:\u002F\u002Fgithub.com\u002Fzhreshold\u002Fmxnet-ssd\u002Fissues\u002F84",{"id":137,"question_zh":138,"answer_zh":139,"source_url":140},10746,"如何使用 Inception v3 或 ResNet 作为 SSD 的特征提取网络？","代码库中已经包含了 Inception 和 ResNet 的实现。用户可以参考社区成员实现的分支（如 https:\u002F\u002Fgithub.com\u002FedmBernard\u002Fmxnet\u002Ftree\u002Fmaster\u002Fexample\u002Fssd）来获取具体配置。需要注意的是，不同主干网络的训练效果可能存在差异，可能需要调整超参数以获得最佳结果。","https:\u002F\u002Fgithub.com\u002Fzhreshold\u002Fmxnet-ssd\u002Fissues\u002F58",{"id":142,"question_zh":143,"answer_zh":144,"source_url":145},10747,"如何修改模型以训练超过 20 类（例如 22 类）的自定义数据集？","在使用预训练模型（如 ResNet50 或 Inceptionv3）微调时，如果类别数量与预训练权重不匹配（例如预期 84 维但得到 92 维），会引发形状不兼容错误。虽然 VGG16 可能表现不同，但对于其他网络，通常建议检查是否正确加载了适配新类别数的头部结构。如果遇到此类维度不匹配，可能需要从头训练或确保微调脚本正确处理了类别数变化导致的输出层维度改变。","https:\u002F\u002Fgithub.com\u002Fzhreshold\u002Fmxnet-ssd\u002Fissues\u002F149",{"id":147,"question_zh":148,"answer_zh":149,"source_url":150},10748,"训练单类别数据集后，运行 demo.py 出现 'operands shape mismatch' 错误如何解决？","当训练数据只有一个类别时，模型输出的形状可能与默认演示脚本预期的形状不匹配，导致 ndarray 形状检查失败。这通常是因为演示代码硬编码了多类别的输出处理逻辑。解决方法是修改 demo.py 
或符号定义，使其适应单类别输出的形状，或者在训练配置中确保输出层维度与单类别任务一致。","https:\u002F\u002Fgithub.com\u002Fzhreshold\u002Fmxnet-ssd\u002Fissues\u002F33",{"id":152,"question_zh":153,"answer_zh":154,"source_url":155},10749,"L2Normalization 实现是否有问题导致训练失败？","根据社区验证，L2Normalization 的实现本身没有问题。如果训练出现问题，原因通常位于其他部分。建议检查数据加载、标签格式或其他网络组件的配置。","https:\u002F\u002Fgithub.com\u002Fzhreshold\u002Fmxnet-ssd\u002Fissues\u002F48",[157,162,167,172,177],{"id":158,"version":159,"summary_zh":160,"released_at":161},71360,"v0.7-alpha","- Updated mxnet to 0.12.0rc1\r\n- Include pretrained mobilenet models\r\n\r\nReleased models:\r\n\r\n- mobilenet-608: mAP ~74.7, ~0.56s\u002Fframe on cpu\r\n- mobilenet-512: mAP ~72.5,  ~0.39s\u002Fframe on cpu\r\n- resnet50-512: mAP ~79.1, ~1.0s\u002Fframe on cpu\r\n\r\nTested on *i7-5557U CPU @ 3.10GHz*, apple blas","2017-10-31T05:13:44",{"id":163,"version":164,"summary_zh":165,"released_at":166},71361,"v0.6","* Improved symbol composing, you can easily create own network based on popular classification networks.\r\n* Optimized for multi-gpu performances.\r\n* New models provided.","2017-06-26T23:28:55",{"id":168,"version":169,"summary_zh":170,"released_at":171},71362,"v0.5-beta","### Release Note\r\nNow with new iterator for faster training, together with multiple modifications to improve mAP from 72% to 78% for SSD-300 model, and 74% to 79%+ for 512 models.\r\n\r\n### Trained models:\r\n|        Model          | Training data    | Test data |  mAP |\r\n|:-----------------:|:----------------:|:---------:|:----:|\r\n| VGG16_reduced 300x300 | VOC07+12 trainval| VOC07 test| 77.8|\r\n| VGG16_reduced 512x512 | VOC07+12 trainval | VOC07 test| 79.9|","2017-03-28T22:25:33",{"id":173,"version":174,"summary_zh":175,"released_at":176},71363,"v0.2-alpha","### For official mxnet\u002Fexample\u002Fssd(https:\u002F\u002Fgithub.com\u002Fdmlc\u002Fmxnet\u002Fcommit\u002F17b6aac0d7656b44764b75b12c22074ed81038b3), do not use for this repo!\r\n1. 
Pretrained detection model: `ssd_300_vgg16_reduced_voc0712_trainval.zip` with 71.5% mAP on PASCAL VOC 2007","2017-03-23T16:00:39",{"id":178,"version":179,"summary_zh":180,"released_at":181},71364,"v0.2-beta","First usable release before rewriting structures\r\n\r\nProvided models:\r\n1. Pretrained classification model: `vgg16_reduced.zip`\r\n2. Pretrained detection model: `ssd_300_voc0712.zip` with 71.5% mAP on PASCAL VOC 2007","2017-01-18T22:49:54"]