[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-pierluigiferrari--ssd_keras":3,"tool-pierluigiferrari--ssd_keras":64},[4,17,27,35,43,56],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":16},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,3,"2026-04-05T11:01:52",[13,14,15],"开发框架","图像","Agent","ready",{"id":18,"name":19,"github_repo":20,"description_zh":21,"stars":22,"difficulty_score":23,"last_commit_at":24,"category_tags":25,"status":16},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",140436,2,"2026-04-05T23:32:43",[13,15,26],"语言模型",{"id":28,"name":29,"github_repo":30,"description_zh":31,"stars":32,"difficulty_score":23,"last_commit_at":33,"category_tags":34,"status":16},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 
绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",107662,"2026-04-03T11:11:01",[13,14,15],{"id":36,"name":37,"github_repo":38,"description_zh":39,"stars":40,"difficulty_score":23,"last_commit_at":41,"category_tags":42,"status":16},3704,"NextChat","ChatGPTNextWeb\u002FNextChat","NextChat 是一款轻量且极速的 AI 助手，旨在为用户提供流畅、跨平台的大模型交互体验。它完美解决了用户在多设备间切换时难以保持对话连续性，以及面对众多 AI 模型不知如何统一管理的痛点。无论是日常办公、学习辅助还是创意激发，NextChat 都能让用户随时随地通过网页、iOS、Android、Windows、MacOS 或 Linux 端无缝接入智能服务。\n\n这款工具非常适合普通用户、学生、职场人士以及需要私有化部署的企业团队使用。对于开发者而言，它也提供了便捷的自托管方案，支持一键部署到 Vercel 或 Zeabur 等平台。\n\nNextChat 的核心亮点在于其广泛的模型兼容性，原生支持 Claude、DeepSeek、GPT-4 及 Gemini Pro 等主流大模型，让用户在一个界面即可自由切换不同 AI 能力。此外，它还率先支持 MCP（Model Context Protocol）协议，增强了上下文处理能力。针对企业用户，NextChat 提供专业版解决方案，具备品牌定制、细粒度权限控制、内部知识库整合及安全审计等功能，满足公司对数据隐私和个性化管理的高标准要求。",87618,"2026-04-05T07:20:52",[13,26],{"id":44,"name":45,"github_repo":46,"description_zh":47,"stars":48,"difficulty_score":23,"last_commit_at":49,"category_tags":50,"status":16},2268,"ML-For-Beginners","microsoft\u002FML-For-Beginners","ML-For-Beginners 是由微软推出的一套系统化机器学习入门课程，旨在帮助零基础用户轻松掌握经典机器学习知识。这套课程将学习路径规划为 12 周，包含 26 节精炼课程和 52 道配套测验，内容涵盖从基础概念到实际应用的完整流程，有效解决了初学者面对庞大知识体系时无从下手、缺乏结构化指导的痛点。\n\n无论是希望转型的开发者、需要补充算法背景的研究人员，还是对人工智能充满好奇的普通爱好者，都能从中受益。课程不仅提供了清晰的理论讲解，还强调动手实践，让用户在循序渐进中建立扎实的技能基础。其独特的亮点在于强大的多语言支持，通过自动化机制提供了包括简体中文在内的 50 多种语言版本，极大地降低了全球不同背景用户的学习门槛。此外，项目采用开源协作模式，社区活跃且内容持续更新，确保学习者能获取前沿且准确的技术资讯。如果你正寻找一条清晰、友好且专业的机器学习入门之路，ML-For-Beginners 将是理想的起点。",84991,"2026-04-05T10:45:23",[14,51,52,53,15,54,26,13,55],"数据工具","视频","插件","其他","音频",{"id":57,"name":58,"github_repo":59,"description_zh":60,"stars":61,"difficulty_score":10,"last_commit_at":62,"category_tags":63,"status":16},3128,"ragflow","infiniflow\u002Fragflow","RAGFlow 
是一款领先的开源检索增强生成（RAG）引擎，旨在为大语言模型构建更精准、可靠的上下文层。它巧妙地将前沿的 RAG 技术与智能体（Agent）能力相结合，不仅支持从各类文档中高效提取知识，还能让模型基于这些知识进行逻辑推理和任务执行。\n\n在大模型应用中，幻觉问题和知识滞后是常见痛点。RAGFlow 通过深度解析复杂文档结构（如表格、图表及混合排版），显著提升了信息检索的准确度，从而有效减少模型“胡编乱造”的现象，确保回答既有据可依又具备时效性。其内置的智能体机制更进一步，使系统不仅能回答问题，还能自主规划步骤解决复杂问题。\n\n这款工具特别适合开发者、企业技术团队以及 AI 研究人员使用。无论是希望快速搭建私有知识库问答系统，还是致力于探索大模型在垂直领域落地的创新者，都能从中受益。RAGFlow 提供了可视化的工作流编排界面和灵活的 API 接口，既降低了非算法背景用户的上手门槛，也满足了专业开发者对系统深度定制的需求。作为基于 Apache 2.0 协议开源的项目，它正成为连接通用大模型与行业专有知识之间的重要桥梁。",77062,"2026-04-04T04:44:48",[15,14,13,26,54],{"id":65,"github_repo":66,"name":67,"description_en":68,"description_zh":69,"ai_summary_zh":69,"readme_en":70,"readme_zh":71,"quickstart_zh":72,"use_case_zh":73,"hero_image_url":74,"owner_login":75,"owner_name":76,"owner_avatar_url":77,"owner_bio":78,"owner_company":78,"owner_location":79,"owner_email":80,"owner_twitter":78,"owner_website":78,"owner_url":81,"languages":82,"stars":87,"forks":88,"last_commit_at":89,"license":90,"difficulty_score":10,"env_os":91,"env_gpu":92,"env_ram":93,"env_deps":94,"category_tags":103,"github_topics":104,"view_count":23,"oss_zip_url":78,"oss_zip_packed_at":78,"status":16,"created_at":115,"updated_at":116,"faqs":117,"releases":147},1288,"pierluigiferrari\u002Fssd_keras","ssd_keras","A Keras port of Single Shot MultiBox Detector","ssd_keras 是一个基于 Keras 实现的单次多框检测器（SSD）模型，源自 Wei Liu 等人提出的经典目标检测算法。它能够快速、准确地在图像中识别并定位多个物体，适用于如行人检测、车辆识别等常见场景。\n\n这个工具解决了传统目标检测方法计算复杂、速度慢的问题，通过一次前向传播即可完成检测任务，显著提升了检测效率。同时，它提供了多种网络结构（如 SSD300、SSD512 和轻量级的 SSD7），满足不同精度与性能需求，并支持在自定义数据集上进行微调，便于实际应用中的迁移学习。\n\nssd_keras 适合有一定深度学习基础的开发者和研究人员使用，尤其适合希望深入理解 SSD 模型原理或需要在项目中集成目标检测功能的用户。其代码注释详尽、文档完善，有助于用户快速上手和二次开发。\n\n值得一提的是，该实现与原始 Caffe 版本在性能上高度一致，甚至部分指标略有提升，且支持从零训练模型，具备良好的可扩展性与实用性。","## SSD: Single-Shot MultiBox Detector implementation in Keras\n---\n### Contents\n\n1. [Overview](#overview)\n2. [Performance](#performance)\n3. [Examples](#examples)\n4. [Dependencies](#dependencies)\n5. [How to use it](#how-to-use-it)\n6. 
[Download the convolutionalized VGG-16 weights](#download-the-convolutionalized-vgg-16-weights)\n7. [Download the original trained model weights](#download-the-original-trained-model-weights)\n8. [How to fine-tune one of the trained models on your own dataset](#how-to-fine-tune-one-of-the-trained-models-on-your-own-dataset)\n9. [ToDo](#todo)\n10. [Important notes](#important-notes)\n11. [Terminology](#terminology)\n\n### Overview\n\nThis is a Keras port of the SSD model architecture introduced by Wei Liu et al. in the paper [SSD: Single Shot MultiBox Detector](https:\u002F\u002Farxiv.org\u002Fabs\u002F1512.02325).\n\nPorts of the trained weights of all the original models are provided below. This implementation is accurate, meaning that both the ported weights and models trained from scratch produce the same mAP values as the respective models of the original Caffe implementation (see performance section below).\n\nThe main goal of this project is to create an SSD implementation that is well documented for those who are interested in a low-level understanding of the model. The provided tutorials, documentation and detailed comments hopefully make it a bit easier to dig into the code and adapt or build upon the model than with most other implementations out there (Keras or otherwise) that provide little to no documentation and comments.\n\nThe repository currently provides the following network architectures:\n* SSD300: [`keras_ssd300.py`](models\u002Fkeras_ssd300.py)\n* SSD512: [`keras_ssd512.py`](models\u002Fkeras_ssd512.py)\n* SSD7: [`keras_ssd7.py`](models\u002Fkeras_ssd7.py) - a smaller 7-layer version that can be trained from scratch relatively quickly even on a mid-tier GPU, yet is capable enough for less complex object detection tasks and testing. You're obviously not going to get state-of-the-art results with that one, but it's fast.\n\nIf you would like to use one of the provided trained models for transfer learning (i.e. 
fine-tune one of the trained models on your own dataset), there is a [Jupyter notebook tutorial](weight_sampling_tutorial.ipynb) that helps you sub-sample the trained weights so that they are compatible with your dataset, see further below.\n\nIf you would like to build an SSD with your own base network architecture, you can use [`keras_ssd7.py`](models\u002Fkeras_ssd7.py) as a template, it provides documentation and comments to help you.\n\n### Performance\n\nHere are the mAP evaluation results of the ported weights and below that the evaluation results of a model trained from scratch using this implementation. All models were evaluated using the official Pascal VOC test server (for 2012 `test`) or the official Pascal VOC Matlab evaluation script (for 2007 `test`). In all cases the results match (or slightly surpass) those of the original Caffe models. Download links to all ported weights are available further below.\n\n\u003Ctable width=\"70%\">\n  \u003Ctr>\n    \u003Ctd>\u003C\u002Ftd>\n    \u003Ctd colspan=3 align=center>Mean Average Precision\u003C\u002Ftd>\n  \u003C\u002Ftr>\n  \u003Ctr>\n    \u003Ctd>evaluated on\u003C\u002Ftd>\n    \u003Ctd colspan=2 align=center>VOC2007 test\u003C\u002Ftd>\n    \u003Ctd align=center>VOC2012 test\u003C\u002Ftd>\n  \u003C\u002Ftr>\n  \u003Ctr>\n    \u003Ctd>trained on\u003Cbr>IoU rule\u003C\u002Ftd>\n    \u003Ctd align=center width=\"25%\">07+12\u003Cbr>0.5\u003C\u002Ftd>\n    \u003Ctd align=center width=\"25%\">07+12+COCO\u003Cbr>0.5\u003C\u002Ftd>\n    \u003Ctd align=center width=\"25%\">07++12+COCO\u003Cbr>0.5\u003C\u002Ftd>\n  \u003C\u002Ftr>\n  \u003Ctr>\n    \u003Ctd>\u003Cb>SSD300\u003C\u002Ftd>\n    \u003Ctd align=center>\u003Cb>77.5\u003C\u002Ftd>\n    \u003Ctd align=center>\u003Cb>81.2\u003C\u002Ftd>\n    \u003Ctd align=center>\u003Cb>79.4\u003C\u002Ftd>\n  \u003C\u002Ftr>\n  \u003Ctr>\n    \u003Ctd>\u003Cb>SSD512\u003C\u002Ftd>\n    \u003Ctd align=center>\u003Cb>79.8\u003C\u002Ftd>\n    \u003Ctd 
align=center>\u003Cb>83.2\u003C\u002Ftd>\n    \u003Ctd align=center>\u003Cb>82.3\u003C\u002Ftd>\n  \u003C\u002Ftr>\n\u003C\u002Ftable>\n\nTraining an SSD300 from scratch to convergence on Pascal VOC 2007 `trainval` and 2012 `trainval` produces the same mAP on Pascal VOC 2007 `test` as the original Caffe SSD300 \"07+12\" model. You can find a summary of the training [here](training_summaries\u002Fssd300_pascal_07+12_training_summary.md).\n\n\u003Ctable width=\"95%\">\n  \u003Ctr>\n    \u003Ctd>\u003C\u002Ftd>\n    \u003Ctd colspan=3 align=center>Mean Average Precision\u003C\u002Ftd>\n  \u003C\u002Ftr>\n  \u003Ctr>\n    \u003Ctd>\u003C\u002Ftd>\n    \u003Ctd align=center>Original Caffe Model\u003C\u002Ftd>\n    \u003Ctd align=center>Ported Weights\u003C\u002Ftd>\n    \u003Ctd align=center>Trained from Scratch\u003C\u002Ftd>\n  \u003C\u002Ftr>\n  \u003Ctr>\n    \u003Ctd>\u003Cb>SSD300 \"07+12\"\u003C\u002Ftd>\n    \u003Ctd align=center width=\"26%\">\u003Cb>0.772\u003C\u002Ftd>\n    \u003Ctd align=center width=\"26%\">\u003Cb>0.775\u003C\u002Ftd>\n    \u003Ctd align=center width=\"26%\">\u003Cb>\u003Ca href=\"https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1-MYYaZbIHNPtI2zzklgVBAjssbP06BeA\u002Fview\">0.771\u003C\u002Fa>\u003C\u002Ftd>\n  \u003C\u002Ftr>\n\u003C\u002Ftable>\n\nThe models achieve the following average number of frames per second (FPS) on Pascal VOC on an NVIDIA GeForce GTX 1070 mobile (i.e. the laptop version) and cuDNN v6. There are two things to note here. First, note that the benchmark prediction speeds of the original Caffe implementation were achieved using a TitanX GPU and cuDNN v4. Second, the paper says they measured the prediction speed at batch size 8, which I think isn't a meaningful way of measuring the speed. 
The whole point of measuring the speed of a detection model is to know how many individual sequential images the model can process per second, therefore measuring the prediction speed on batches of images and then deducing the time spent on each individual image in the batch defeats the purpose. For the sake of comparability, below you find the prediction speed for the original Caffe SSD implementation and the prediction speed for this implementation under the same conditions, i.e. at batch size 8. In addition you find the prediction speed for this implementation at batch size 1, which in my opinion is the more meaningful number.\n\n\u003Ctable width>\n  \u003Ctr>\n    \u003Ctd>\u003C\u002Ftd>\n    \u003Ctd colspan=3 align=center>Frames per Second\u003C\u002Ftd>\n  \u003C\u002Ftr>\n  \u003Ctr>\n    \u003Ctd>\u003C\u002Ftd>\n    \u003Ctd align=center>Original Caffe Implementation\u003C\u002Ftd>\n    \u003Ctd colspan=2 align=center>This Implementation\u003C\u002Ftd>\n  \u003C\u002Ftr>\n  \u003Ctr>\n    \u003Ctd width=\"14%\">Batch Size\u003C\u002Ftd>\n    \u003Ctd width=\"27%\" align=center>8\u003C\u002Ftd>\n    \u003Ctd width=\"27%\" align=center>8\u003C\u002Ftd>\n    \u003Ctd width=\"27%\" align=center>1\u003C\u002Ftd>\n  \u003C\u002Ftr>\n  \u003Ctr>\n    \u003Ctd>\u003Cb>SSD300\u003C\u002Ftd>\n    \u003Ctd align=center>\u003Cb>46\u003C\u002Ftd>\n    \u003Ctd align=center>\u003Cb>49\u003C\u002Ftd>\n    \u003Ctd align=center>\u003Cb>39\u003C\u002Ftd>\n  \u003C\u002Ftr>\n  \u003Ctr>\n    \u003Ctd>\u003Cb>SSD512\u003C\u002Ftd>\n    \u003Ctd align=center>\u003Cb>19\u003C\u002Ftd>\n    \u003Ctd align=center>\u003Cb>25\u003C\u002Ftd>\n    \u003Ctd align=center>\u003Cb>20\u003C\u002Ftd>\n  \u003C\u002Ftr>\n  \u003Ctr>\n    \u003Ctd>\u003Cb>SSD7\u003C\u002Ftd>\n    \u003Ctd align=center>\u003Cb>\u003C\u002Ftd>\n    \u003Ctd align=center>\u003Cb>216\u003C\u002Ftd>\n    \u003Ctd align=center>\u003Cb>127\u003C\u002Ftd>\n  \u003C\u002Ftr>\n\u003C\u002Ftable>\n\n### 
Examples\n\nBelow are some prediction examples of the fully trained original SSD300 \"07+12\" model (i.e. trained on Pascal VOC2007 `trainval` and VOC2012 `trainval`). The predictions were made on Pascal VOC2007 `test`.\n\n| | |\n|---|---|\n| ![img01](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fpierluigiferrari_ssd_keras_readme_915c6345c47a.png) | ![img01](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fpierluigiferrari_ssd_keras_readme_6b6c7d8dc20b.png) |\n| ![img01](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fpierluigiferrari_ssd_keras_readme_37af5aa7f08a.png) | ![img01](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fpierluigiferrari_ssd_keras_readme_063597976953.png) |\n\nHere are some prediction examples of an SSD7 (i.e. the small 7-layer version) partially trained on two road traffic datasets released by [Udacity](https:\u002F\u002Fgithub.com\u002Fudacity\u002Fself-driving-car\u002Ftree\u002Fmaster\u002Fannotations) with roughly 20,000 images in total and 5 object categories (more info in [`ssd7_training.ipynb`](ssd7_training.ipynb)). The predictions you see below were made after 10,000 training steps at batch size 32. 
Admittedly, cars are comparatively easy objects to detect and I picked a few of the better examples, but it is nonetheless remarkable what such a small model can do after only 10,000 training iterations.\n\n| | |\n|---|---|\n| ![img01](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fpierluigiferrari_ssd_keras_readme_6b55c7e74bbc.png) | ![img01](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fpierluigiferrari_ssd_keras_readme_f78cf33d19e9.png) |\n| ![img01](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fpierluigiferrari_ssd_keras_readme_076a9300011f.png) | ![img01](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fpierluigiferrari_ssd_keras_readme_7e6ffad9f911.png) |\n\n### Dependencies\n\n* Python 3.x\n* Numpy\n* TensorFlow 1.x\n* Keras 2.x\n* OpenCV\n* Beautiful Soup 4.x\n\nThe Theano and CNTK backends are currently not supported.\n\nPython 2 compatibility: This implementation seems to work with Python 2.7, but I don't provide any support for it. It's 2018 and nobody should be using Python 2 anymore.\n\n### How to use it\n\nThis repository provides Jupyter notebook tutorials that explain training, inference and evaluation, and there are a bunch of explanations in the subsequent sections that complement the notebooks.\n\nHow to use a trained model for inference:\n* [`ssd300_inference.ipynb`](ssd300_inference.ipynb)\n* [`ssd512_inference.ipynb`](ssd512_inference.ipynb)\n\nHow to train a model:\n* [`ssd300_training.ipynb`](ssd300_training.ipynb)\n* [`ssd7_training.ipynb`](ssd7_training.ipynb)\n\nHow to use one of the provided trained models for transfer learning on your own dataset:\n* [Read below](#how-to-fine-tune-one-of-the-trained-models-on-your-own-dataset)\n\nHow to evaluate a trained model:\n* In general: [`ssd300_evaluation.ipynb`](ssd300_evaluation.ipynb)\n* On MS COCO: [`ssd300_evaluation_COCO.ipynb`](ssd300_evaluation_COCO.ipynb)\n\nHow to use the data generator:\n* The data generator used here has its own repository with a detailed 
tutorial [here](https:\u002F\u002Fgithub.com\u002Fpierluigiferrari\u002Fdata_generator_object_detection_2d)\n\n#### Training details\n\nThe general training setup is laid out and explained in [`ssd7_training.ipynb`](ssd7_training.ipynb) and in [`ssd300_training.ipynb`](ssd300_training.ipynb). The setup and explanations are similar in both notebooks for the most part, so it doesn't matter which one you look at to understand the general training setup, but the parameters in [`ssd300_training.ipynb`](ssd300_training.ipynb) are preset to copy the setup of the original Caffe implementation for training on Pascal VOC, while the parameters in [`ssd7_training.ipynb`](ssd7_training.ipynb) are preset to train on the [Udacity traffic datasets](https:\u002F\u002Fgithub.com\u002Fudacity\u002Fself-driving-car\u002Ftree\u002Fmaster\u002Fannotations).\n\nTo train the original SSD300 model on Pascal VOC:\n\n1. Download the datasets:\n  ```bash\n  wget http:\u002F\u002Fhost.robots.ox.ac.uk\u002Fpascal\u002FVOC\u002Fvoc2012\u002FVOCtrainval_11-May-2012.tar\n  wget http:\u002F\u002Fhost.robots.ox.ac.uk\u002Fpascal\u002FVOC\u002Fvoc2007\u002FVOCtrainval_06-Nov-2007.tar\n  wget http:\u002F\u002Fhost.robots.ox.ac.uk\u002Fpascal\u002FVOC\u002Fvoc2007\u002FVOCtest_06-Nov-2007.tar\n  ```\n2. Download the weights for the convolutionalized VGG-16 or for one of the trained original models provided below.\n3. Set the file paths for the datasets and model weights accordingly in [`ssd300_training.ipynb`](ssd300_training.ipynb) and execute the cells.\n\nThe procedure for training SSD512 is the same of course. It is imperative that you load the pre-trained VGG-16 weights when attempting to train an SSD300 or SSD512 from scratch, otherwise the training will probably fail. 
Here is a summary of a full training of the SSD300 \"07+12\" model for comparison with your own training:\n\n* [SSD300 Pascal VOC \"07+12\" training summary](training_summaries\u002Fssd300_pascal_07+12_training_summary.md)\n\n#### Encoding and decoding boxes\n\nThe [`ssd_encoder_decoder`](ssd_encoder_decoder) sub-package contains all functions and classes related to encoding and decoding boxes. Encoding boxes means converting ground truth labels into the target format that the loss function needs during training. It is this encoding process in which the matching of ground truth boxes to anchor boxes (the paper calls them default boxes and in the original C++ code they are called priors - all the same thing) happens. Decoding boxes means converting raw model output back to the input label format, which entails various conversion and filtering processes such as non-maximum suppression (NMS).\n\nIn order to train the model, you need to create an instance of `SSDInputEncoder` that needs to be passed to the data generator. The data generator does the rest, so you don't usually need to call any of `SSDInputEncoder`'s methods manually.\n\nModels can be created in 'training' or 'inference' mode. In 'training' mode, the model outputs the raw prediction tensor that still needs to be post-processed with coordinate conversion, confidence thresholding, non-maximum suppression, etc. The functions `decode_detections()` and `decode_detections_fast()` are responsible for that. The former follows the original Caffe implementation, which entails performing NMS per object class, while the latter performs NMS globally across all object classes and is thus more efficient, but also behaves slightly differently. Read the documentation for details about both functions. If a model is created in 'inference' mode, its last layer is the `DecodeDetections` layer, which performs all the post-processing that `decode_detections()` does, but in TensorFlow. 
That means the output of the model is already the post-processed output. In order to be trainable, a model must be created in 'training' mode. The trained weights can then later be loaded into a model that was created in 'inference' mode.\n\nA note on the anchor box offset coordinates used internally by the model: This may or may not be obvious to you, but it is important to understand that it is not possible for the model to predict absolute coordinates for the predicted bounding boxes. In order to be able to predict absolute box coordinates, the convolutional layers responsible for localization would need to produce different output values for the same object instance at different locations within the input image. This isn't possible of course: For a given input to the filter of a convolutional layer, the filter will produce the same output regardless of the spatial position within the image because of the shared weights. This is the reason why the model predicts offsets to anchor boxes instead of absolute coordinates, and why during training, absolute ground truth coordinates are converted to anchor box offsets in the encoding process. The fact that the model predicts offsets to anchor box coordinates is in turn the reason why the model contains anchor box layers that do nothing but output the anchor box coordinates so that the model's output tensor can include those. If the model's output tensor did not contain the anchor box coordinates, the information to convert the predicted offsets back to absolute coordinates would be missing in the model output.\n\n#### Using a different base network architecture\n\nIf you want to build a different base network architecture, you could use [`keras_ssd7.py`](models\u002Fkeras_ssd7.py) as a template. It provides documentation and comments to help you turn it into a different base network. 
Put together the base network you want and add a predictor layer on top of each network layer from which you would like to make predictions. Create two predictor heads for each, one for localization, one for classification. Create an anchor box layer for each predictor layer and set the respective localization head's output as the input for the anchor box layer. The structure of all tensor reshaping and concatenation operations remains the same, you just have to make sure to include all of your predictor and anchor box layers of course.\n\n### Download the convolutionalized VGG-16 weights\n\nIn order to train an SSD300 or SSD512 from scratch, download the weights of the fully convolutionalized VGG-16 model trained to convergence on ImageNet classification here:\n\n[`VGG_ILSVRC_16_layers_fc_reduced.h5`](https:\u002F\u002Fdrive.google.com\u002Fopen?id=1sBmajn6vOE7qJ8GnxUJt4fGPuffVUZox).\n\nAs with all other weights files below, this is a direct port of the corresponding `.caffemodel` file that is provided in the repository of the original Caffe implementation.\n\n### Download the original trained model weights\n\nHere are the ported weights for all the original trained models. The filenames correspond to their respective `.caffemodel` counterparts. The asterisks and footnotes refer to those in the README of the [original Caffe implementation](https:\u002F\u002Fgithub.com\u002Fweiliu89\u002Fcaffe\u002Ftree\u002Fssd#models).\n\n1. 
PASCAL VOC models:\n\n    * 07+12: [SSD300*](https:\u002F\u002Fdrive.google.com\u002Fopen?id=121-kCXaOHOkJE_Kf5lKcJvC_5q1fYb_q), [SSD512*](https:\u002F\u002Fdrive.google.com\u002Fopen?id=19NIa0baRCFYT3iRxQkOKCD7CpN6BFO8p)\n    * 07++12: [SSD300*](https:\u002F\u002Fdrive.google.com\u002Fopen?id=1M99knPZ4DpY9tI60iZqxXsAxX2bYWDvZ), [SSD512*](https:\u002F\u002Fdrive.google.com\u002Fopen?id=18nFnqv9fG5Rh_fx6vUtOoQHOLySt4fEx)\n    * COCO[1]: [SSD300*](https:\u002F\u002Fdrive.google.com\u002Fopen?id=17G1J4zEpFwiOzgBmq886ci4P3YaIz8bY), [SSD512*](https:\u002F\u002Fdrive.google.com\u002Fopen?id=1wGc368WyXSHZOv4iow2tri9LnB0vm9X-)\n    * 07+12+COCO: [SSD300*](https:\u002F\u002Fdrive.google.com\u002Fopen?id=1vtNI6kSnv7fkozl7WxyhGyReB6JvDM41), [SSD512*](https:\u002F\u002Fdrive.google.com\u002Fopen?id=14mELuzm0OvXnwjb0mzAiG-Ake9_NP_LQ)\n    * 07++12+COCO: [SSD300*](https:\u002F\u002Fdrive.google.com\u002Fopen?id=1fyDDUcIOSjeiP08vl1WCndcFdtboFXua), [SSD512*](https:\u002F\u002Fdrive.google.com\u002Fopen?id=1a-64b6y6xsQr5puUsHX_wxI1orQDercM)\n\n\n2. COCO models:\n\n    * trainval35k: [SSD300*](https:\u002F\u002Fdrive.google.com\u002Fopen?id=1vmEF7FUsWfHquXyCqO17UaXOPpRbwsdj), [SSD512*](https:\u002F\u002Fdrive.google.com\u002Fopen?id=1IJWZKmjkcFMlvaz2gYukzFx4d6mH3py5)\n\n\n3. ILSVRC models:\n\n    * trainval1: [SSD300*](https:\u002F\u002Fdrive.google.com\u002Fopen?id=1VWkj1oQS2RUhyJXckx3OaDYs5fx2mMCq), [SSD500](https:\u002F\u002Fdrive.google.com\u002Fopen?id=1LcBPsd9CJbuBw4KiSuE1o1fMA-Pz2Zvw)\n\n### How to fine-tune one of the trained models on your own dataset\n\nIf you want to fine-tune one of the provided trained models on your own dataset, chances are your dataset doesn't have the same number of classes as the trained model. The following tutorial explains how to deal with this problem:\n\n[`weight_sampling_tutorial.ipynb`](weight_sampling_tutorial.ipynb)\n\n### ToDo\n\nThe following things are on the to-do list, ranked by priority. 
Contributions are welcome, but please read the [contributing guidelines](CONTRIBUTING.md).\n\n1. Add model definitions and trained weights for SSDs based on other base networks such as MobileNet, InceptionResNetV2, or DenseNet.\n2. Add support for the Theano and CNTK backends. Requires porting the custom layers and the loss function from TensorFlow to the abstract Keras backend.\n\nCurrently in the works:\n\n* A new [Focal Loss](https:\u002F\u002Farxiv.org\u002Fabs\u002F1708.02002) loss function.\n\n### Important notes\n\n* All trained models that were trained on MS COCO use the smaller anchor box scaling factors provided in all of the Jupyter notebooks. In particular, note that the '07+12+COCO' and '07++12+COCO' models use the smaller scaling factors.\n\n### Terminology\n\n* \"Anchor boxes\": The paper calls them \"default boxes\", in the original C++ code they are called \"prior boxes\" or \"priors\", and the Faster R-CNN paper calls them \"anchor boxes\". All terms mean the same thing, but I slightly prefer the name \"anchor boxes\" because I find it to be the most descriptive of these names. I call them \"prior boxes\" or \"priors\" in `keras_ssd300.py` and `keras_ssd512.py` to stay consistent with the original Caffe implementation, but everywhere else I use the name \"anchor boxes\" or \"anchors\".\n* \"Labels\": For the purpose of this project, datasets consist of \"images\" and \"labels\". Everything that belongs to the annotations of a given image is the \"labels\" of that image: Not just object category labels, but also bounding box coordinates. \"Labels\" is just shorter than \"annotations\". I also use the terms \"labels\" and \"targets\" more or less interchangeably throughout the documentation, although \"targets\" means labels specifically in the context of training.\n* \"Predictor layer\": The \"predictor layers\" or \"predictors\" are all the last convolution layers of the network, i.e. 
all convolution layers that do not feed into any subsequent convolution layers.\n","## SSD：Keras中的单次多框检测器实现\n---\n### 目录\n\n1. [概述](#overview)\n2. [性能](#performance)\n3. [示例](#examples)\n4. [依赖项](#dependencies)\n5. [使用方法](#how-to-use-it)\n6. [下载卷积化的VGG-16权重](#download-the-convolutionalized-vgg-16-weights)\n7. [下载原始训练好的模型权重](#download-the-original-trained-model-weights)\n8. [如何在自己的数据集上微调其中一个训练好的模型](#how-to-fine-tune-one-of-the-trained-models-on-your-own-dataset)\n9. [待办事项](#todo)\n10. [重要说明](#important-notes)\n11. [术语](#terminology)\n\n### 概述\n\n这是Wei Liu等人在论文[SSD：单次多框检测器](https:\u002F\u002Farxiv.org\u002Fabs\u002F1512.02325)中提出的SSD模型架构的Keras移植版本。\n\n以下提供了所有原始模型的训练权重的移植版本。该实现是准确的，也就是说，无论是移植后的权重还是从头训练的模型，其mAP值都与原始Caffe实现的相应模型完全一致（详见下文的性能部分）。\n\n本项目的首要目标是为那些希望对模型进行底层理解的人提供一个文档详尽的SSD实现。所提供的教程、文档和详细注释，希望能使深入研究代码以及基于此模型进行改编或扩展，比大多数其他实现（无论是否基于Keras）更容易一些——因为那些实现往往几乎没有任何文档和注释。\n\n目前该仓库提供了以下几种网络架构：\n* SSD300：[`keras_ssd300.py`](models\u002Fkeras_ssd300.py)\n* SSD512：[`keras_ssd512.py`](models\u002Fkeras_ssd512.py)\n* SSD7：[`keras_ssd7.py`](models\u002Fkeras_ssd7.py)——这是一个较小的7层版本，即使在中端GPU上也能相对快速地从头训练，同时对于较简单的目标检测任务和测试也足够胜任。当然，用它不可能获得最先进的结果，但它的速度很快。\n\n如果您想使用提供的其中一个训练好的模型进行迁移学习（即在自己的数据集上微调其中一个训练好的模型），有一个[Jupyter笔记本教程](weight_sampling_tutorial.ipynb)，可以帮助您对训练好的权重进行子采样，使其与您的数据集兼容，详情见下文。\n\n如果您想使用自己的基础网络架构构建SSD，可以将[`keras_ssd7.py`](models\u002Fkeras_ssd7.py)作为模板，其中提供了文档和注释以帮助您。\n\n### 性能\n\n以下是移植权重的mAP评估结果，下方则是使用本实现从头训练的模型的评估结果。所有模型均使用官方Pascal VOC测试服务器（针对2012年的`test`）或官方Pascal VOC Matlab评估脚本（针对2007年的`test`）进行评估。在所有情况下，结果都与原始Caffe模型一致（或略胜一筹）。所有移植权重的下载链接见下文。\n\n\u003Ctable width=\"70%\">\n  \u003Ctr>\n    \u003Ctd>\u003C\u002Ftd>\n    \u003Ctd colspan=3 align=center>平均精度均值\u003C\u002Ftd>\n  \u003C\u002Ftr>\n  \u003Ctr>\n    \u003Ctd>评估对象\u003C\u002Ftd>\n    \u003Ctd colspan=2 align=center>VOC2007 test\u003C\u002Ftd>\n    \u003Ctd align=center>VOC2012 test\u003C\u002Ftd>\n  \u003C\u002Ftr>\n  \u003Ctr>\n    \u003Ctd>训练时\u003Cbr>IoU规则\u003C\u002Ftd>\n    \u003Ctd align=center 
width=\"25%\">07+12\u003Cbr>0.5\u003C\u002Ftd>\n    \u003Ctd align=center width=\"25%\">07+12+COCO\u003Cbr>0.5\u003C\u002Ftd>\n    \u003Ctd align=center width=\"25%\">07++12+COCO\u003Cbr>0.5\u003C\u002Ftd>\n  \u003C\u002Ftr>\n  \u003Ctr>\n    \u003Ctd>\u003Cb>SSD300\u003C\u002Ftd>\n    \u003Ctd align=center>\u003Cb>77.5\u003C\u002Ftd>\n    \u003Ctd align=center>\u003Cb>81.2\u003C\u002Ftd>\n    \u003Ctd align=center>\u003Cb>79.4\u003C\u002Ftd>\n  \u003C\u002Ftr>\n  \u003Ctr>\n    \u003Ctd>\u003Cb>SSD512\u003C\u002Ftd>\n    \u003Ctd align=center>\u003Cb>79.8\u003C\u002Ftd>\n    \u003Ctd align=center>\u003Cb>83.2\u003C\u002Ftd>\n    \u003Ctd align=center>\u003Cb>82.3\u003C\u002Ftd>\n  \u003C\u002Ftr>\n\u003C\u002Ftable>\n\n在Pascal VOC 2007 `trainval`和2012 `trainval`上从头训练SSD300直至收敛，在Pascal VOC 2007 `test`上的mAP与原始Caffe SSD300“07+12”模型相同。您可以在[这里](training_summaries\u002Fssd300_pascal_07+12_training_summary.md)找到训练总结。\n\n\u003Ctable width=\"95%\">\n  \u003Ctr>\n    \u003Ctd>\u003C\u002Ftd>\n    \u003Ctd colspan=3 align=center>平均精度均值\u003C\u002Ftd>\n  \u003C\u002Ftr>\n  \u003Ctr>\n    \u003Ctd>\u003C\u002Ftd>\n    \u003Ctd align=center>原始Caffe模型\u003C\u002Ftd>\n    \u003Ctd align=center>移植权重\u003C\u002Ftd>\n    \u003Ctd align=center>从头训练\u003C\u002Ftd>\n  \u003C\u002Ftr>\n  \u003Ctr>\n    \u003Ctd>\u003Cb>SSD300 “07+12”\u003C\u002Ftd>\n    \u003Ctd align=center width=\"26%\">\u003Cb>0.772\u003C\u002Ftd>\n    \u003Ctd align=center width=\"26%\">\u003Cb>0.775\u003C\u002Ftd>\n    \u003Ctd align=center width=\"26%\">\u003Cb>\u003Ca href=\"https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1-MYYaZbIHNPtI2zzklgVBAjssbP06BeA\u002Fview\">0.771\u003C\u002Fa>\u003C\u002Ftd>\n  \u003C\u002Ftr>\n\u003C\u002Ftable>\n\n这些模型在NVIDIA GeForce GTX 1070移动版（即笔记本电脑版本）和cuDNN v6环境下，在Pascal VOC上的平均每秒帧数（FPS）如下。这里需要注意两点。首先，原始Caffe实现的基准预测速度是在TitanX GPU和cuDNN 
v4环境下测得的。其次，论文中提到他们是在批量大小为8的情况下测量预测速度，我认为这种测量方式并不具有实际意义。衡量检测模型速度的真正目的，是了解模型每秒能够处理多少张单独的连续图像，因此如果只以整批图像为单位来测量预测速度，再推算出每张图像所花费的时间，就违背了这一初衷。为了便于比较，下面列出了原始Caffe SSD实现的预测速度，以及本实现在此相同条件下的预测速度，即批量大小为8时的情况。此外，还列出了本实现批量大小为1时的预测速度，我认为这才是更有意义的数值。\n\n\u003Ctable width>\n  \u003Ctr>\n    \u003Ctd>\u003C\u002Ftd>\n    \u003Ctd colspan=3 align=center>每秒帧数\u003C\u002Ftd>\n  \u003C\u002Ftr>\n  \u003Ctr>\n    \u003Ctd>\u003C\u002Ftd>\n    \u003Ctd align=center>原始Caffe实现\u003C\u002Ftd>\n    \u003Ctd colspan=2 align=center>本实现\u003C\u002Ftd>\n  \u003C\u002Ftr>\n  \u003Ctr>\n    \u003Ctd width=\"14%\">批量大小\u003C\u002Ftd>\n    \u003Ctd width=\"27%\" align=center>8\u003C\u002Ftd>\n    \u003Ctd width=\"27%\" align=center>8\u003C\u002Ftd>\n    \u003Ctd width=\"27%\" align=center>1\u003C\u002Ftd>\n  \u003C\u002Ftr>\n  \u003Ctr>\n    \u003Ctd>\u003Cb>SSD300\u003C\u002Ftd>\n    \u003Ctd align=center>\u003Cb>46\u003C\u002Ftd>\n    \u003Ctd align=center>\u003Cb>49\u003C\u002Ftd>\n    \u003Ctd align=center>\u003Cb>39\u003C\u002Ftd>\n  \u003C\u002Ftr>\n  \u003Ctr>\n    \u003Ctd>\u003Cb>SSD512\u003C\u002Ftd>\n    \u003Ctd align=center>\u003Cb>19\u003C\u002Ftd>\n    \u003Ctd align=center>\u003Cb>25\u003C\u002Ftd>\n    \u003Ctd align=center>\u003Cb>20\u003C\u002Ftd>\n  \u003C\u002Ftr>\n  \u003Ctr>\n    \u003Ctd>\u003Cb>SSD7\u003C\u002Ftd>\n    \u003Ctd align=center>\u003Cb>\u003C\u002Ftd>\n    \u003Ctd align=center>\u003Cb>216\u003C\u002Ftd>\n    \u003Ctd align=center>\u003Cb>127\u003C\u002Ftd>\n  \u003C\u002Ftr>\n\u003C\u002Ftable>\n\n### 示例\n\n以下是经过完整训练的原始 SSD300 “07+12” 模型的一些预测示例（即在 Pascal VOC2007 的 `trainval` 数据集和 VOC2012 的 `trainval` 数据集上进行训练）。这些预测是在 Pascal VOC2007 的 `test` 数据集上生成的。\n\n| | |\n|---|---|\n| ![img01](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fpierluigiferrari_ssd_keras_readme_915c6345c47a.png) | ![img01](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fpierluigiferrari_ssd_keras_readme_6b6c7d8dc20b.png) |\n| 
![img01](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fpierluigiferrari_ssd_keras_readme_37af5aa7f08a.png) | ![img01](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fpierluigiferrari_ssd_keras_readme_063597976953.png) |\n\n以下是一些 SSD7（即小型的 7 层版本）的预测示例，该模型部分地在由 [Udacity](https:\u002F\u002Fgithub.com\u002Fudacity\u002Fself-driving-car\u002Ftree\u002Fmaster\u002Fannotations) 发布的两个道路交通数据集上进行了训练，这两个数据集共计约 2 万张图像，包含 5 类目标（更多信息请参见 [`ssd7_training.ipynb`](ssd7_training.ipynb)）。下面展示的预测是在以批量大小 32 进行 1 万次训练后生成的。诚然，汽车是比较容易检测的目标，我挑选了其中几个较好的例子，但即便如此，这样一个小型模型仅经过 1 万次训练迭代就能取得这样的效果，仍然令人印象深刻。\n\n| | |\n|---|---|\n| ![img01](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fpierluigiferrari_ssd_keras_readme_6b55c7e74bbc.png) | ![img01](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fpierluigiferrari_ssd_keras_readme_f78cf33d19e9.png) |\n| ![img01](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fpierluigiferrari_ssd_keras_readme_076a9300011f.png) | ![img01](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fpierluigiferrari_ssd_keras_readme_7e6ffad9f911.png) |\n\n### 依赖项\n\n* Python 3.x\n* Numpy\n* TensorFlow 1.x\n* Keras 2.x\n* OpenCV\n* Beautiful Soup 4.x\n\n目前尚不支持 Theano 和 CNTK 后端。\n\nPython 2 兼容性：本实现似乎可以在 Python 2.7 上运行，但我并不提供任何相关支持。如今已是 2018 年，已经没有人应该再使用 Python 2 了。\n\n### 如何使用\n\n本仓库提供了 Jupyter Notebook 教程，用于讲解模型的训练、推理与评估；后续章节中也包含大量补充说明，以配合这些笔记本。\n\n如何使用已训练好的模型进行推理：\n* [`ssd300_inference.ipynb`](ssd300_inference.ipynb)\n* [`ssd512_inference.ipynb`](ssd512_inference.ipynb)\n\n如何训练模型：\n* [`ssd300_training.ipynb`](ssd300_training.ipynb)\n* [`ssd7_training.ipynb`](ssd7_training.ipynb)\n\n如何将提供的已训练模型之一用于在您自己的数据集上进行迁移学习：\n* 请参阅下文（#如何在您自己的数据集上微调已训练模型之一）\n\n如何评估已训练好的模型：\n* 一般情况：[`ssd300_evaluation.ipynb`](ssd300_evaluation.ipynb)\n* 在 MS COCO 数据集上：[`ssd300_evaluation_COCO.ipynb`](ssd300_evaluation_COCO.ipynb)\n\n如何使用数据生成器：\n* 此处使用的数据生成器拥有独立的仓库，并附有详细教程 [此处](https:\u002F\u002Fgithub.com\u002Fpierluigiferrari\u002Fdata_generator_object_detection_2d)\n\n#### 
训练细节\n\n通用的训练设置已在 [`ssd7_training.ipynb`](ssd7_training.ipynb) 和 [`ssd300_training.ipynb`](ssd300_training.ipynb) 中进行了阐述与说明。两份笔记本中的设置与解释在大部分情况下相似，因此无论查看哪一份都能理解通用的训练设置；不过，[`ssd300_training.ipynb`](ssd300_training.ipynb) 中的参数预设为复制原始 Caffe 实现用于在 Pascal VOC 上训练的设置，而 `ssd7_training.ipynb` 中的参数则预设为在 [Udacity 交通数据集](https:\u002F\u002Fgithub.com\u002Fudacity\u002Fself-driving-car\u002Ftree\u002Fmaster\u002Fannotations) 上进行训练。\n\n要对原始 SSD300 模型在 Pascal VOC 上进行训练：\n\n1. 下载数据集：\n   ```bash\n   wget http:\u002F\u002Fhost.robots.ox.ac.uk\u002Fpascal\u002FVOC\u002Fvoc2012\u002FVOCtrainval_11-May-2012.tar\n   wget http:\u002F\u002Fhost.robots.ox.ac.uk\u002Fpascal\u002FVOC\u002Fvoc2007\u002FVOCtrainval_06-Nov-2007.tar\n   wget http:\u002F\u002Fhost.robots.ox.ac.uk\u002Fpascal\u002FVOC\u002Fvoc2007\u002FVOCtest_06-Nov-2007.tar\n   ```\n2. 下载卷积化 VGG-16 的权重，或下载下方提供的其中一个已训练的原始模型的权重。\n3. 在 `ssd300_training.ipynb` 中相应地设置数据集与模型权重的文件路径，并执行各单元格。\n\n当然，SSD512 的训练流程也完全相同。如果尝试从零开始训练 SSD300 或 SSD512，务必加载预训练的 VGG-16 权重，否则训练很可能会失败。以下是 SSD300 “07+12” 模型完整训练的总结，供您与自己的训练结果对比参考：\n\n* [SSD300 Pascal VOC “07+12” 训练总结](training_summaries\u002Fssd300_pascal_07+12_training_summary.md)\n\n#### 边界框的编码与解码\n\n[`ssd_encoder_decoder`](ssd_encoder_decoder) 子包包含了所有与边界框编码和解码相关的函数与类。所谓“编码边界框”，是指将真实标签转换为损失函数在训练时所需的输出格式。正是在这个编码过程中，会发生真实框与锚框的匹配——论文中称其为“默认框”，而在原始 C++ 代码中则称为“先验框”——其实指的都是同一回事。所谓“解码边界框”，则是将模型的原始输出转换回输入标签的格式，这涉及多种转换与过滤操作，例如非极大值抑制（NMS）。\n\n为了训练模型，您需要创建一个 `SSDInputEncoder` 的实例，并将其传递给数据生成器。其余工作由数据生成器完成，因此通常无需手动调用 `SSDInputEncoder` 的任何方法。\n\n模型可以以“训练”或“推理”模式创建。在“训练”模式下，模型会输出原始预测张量，该张量仍需经过坐标转换、置信度阈值筛选、非极大值抑制等后处理。`decode_detections()` 和 `decode_detections_fast()` 这两个函数负责完成这些后处理。前者遵循原始 Caffe 实现，即按对象类别分别执行 NMS；后者则在整个对象类别范围内全局执行 NMS，因此效率更高，但行为也略有不同。有关这两个函数的详细信息，请参阅文档。如果模型以“推理”模式创建，则其最后一层为 `DecodeDetections` 层，该层会在 TensorFlow 中完成 `decode_detections()` 
所做的全部后处理，这意味着模型的输出已经是经过后处理的结果。因此，若要使模型可训练，必须以“训练”模式创建；随后，可将已训练的权重加载到以“推理”模式创建的模型中。\n\n关于模型内部使用的锚框偏移坐标的一点说明：这可能并不显而易见，但重要的是要明白，模型无法直接预测所预测边界框的绝对坐标。若要能够预测绝对框坐标，负责定位的卷积层就必须针对同一目标实例，在输入图像的不同位置产生不同的输出值。然而，这显然是不可能的：对于卷积层滤波器的某一特定输入，由于权重共享，无论滤波器在图像中的空间位置如何，都会产生相同的输出。这就是为什么模型预测的是锚框的偏移值，而非绝对坐标；同时，这也是为什么在训练过程中，绝对的真实坐标会在编码阶段被转换为锚框偏移值。而模型之所以预测锚框偏移值，反过来又导致模型中包含专门用于输出锚框坐标的锚框层，以便模型的输出张量能够包含这些信息。如果模型的输出张量不包含锚框坐标，那么将预测的偏移值转换回绝对坐标的必要信息就会缺失于模型输出之中。\n\n#### 使用不同的基础网络架构\n\n如果您希望构建不同的基础网络架构，可以将 [`keras_ssd7.py`](models\u002Fkeras_ssd7.py) 作为模板。该文件提供了文档与注释，帮助您将其改造成不同的基础网络。根据您的需求搭建好基础网络，并在每个您希望进行预测的网络层之上添加一个预测头。为每个预测头创建两个子预测分支：一个用于定位，另一个用于分类。为每个预测头都创建一个锚框层，并将相应定位头的输出作为锚框层的输入。所有张量重塑与拼接操作的结构保持不变，您只需确保包含所有的预测层与锚框层即可。\n\n### 下载卷积化的 VGG-16 权重\n\n为了从零开始训练 SSD300 或 SSD512，您可以在此处下载在 ImageNet 分类任务上训练至收敛的全卷积化 VGG-16 模型的权重：\n\n[`VGG_ILSVRC_16_layers_fc_reduced.h5`](https:\u002F\u002Fdrive.google.com\u002Fopen?id=1sBmajn6vOE7qJ8GnxUJt4fGPuffVUZox)。\n\n与下方所有其他权重文件一样，此文件是原始 Caffe 实现仓库中提供的相应 `.caffemodel` 文件的直接移植版本。\n\n### 下载原始训练好的模型权重\n\n以下是所有原始训练模型的移植权重。文件名与其对应的 `.caffemodel` 文件相对应。星号和脚注参考了 [原始 Caffe 实现](https:\u002F\u002Fgithub.com\u002Fweiliu89\u002Fcaffe\u002Ftree\u002Fssd#models) 的 README 中的相关说明。\n\n1. 
PASCAL VOC 模型：\n\n    * 07+12：[SSD300*](https:\u002F\u002Fdrive.google.com\u002Fopen?id=121-kCXaOHOkJE_Kf5lKcJvC_5q1fYb_q)，[SSD512*](https:\u002F\u002Fdrive.google.com\u002Fopen?id=19NIa0baRCFYT3iRxQkOKCD7CpN6BFO8p)\n    * 07++12：[SSD300*](https:\u002F\u002Fdrive.google.com\u002Fopen?id=1M99knPZ4DpY9tI60iZqxXsAxX2bYWDvZ)，[SSD512*](https:\u002F\u002Fdrive.google.com\u002Fopen?id=18nFnqv9fG5Rh_fx6vUtOoQHOLySt4fEx)\n    * COCO[1]：[SSD300*](https:\u002F\u002Fdrive.google.com\u002Fopen?id=17G1J4zEpFwiOzgBmq886ci4P3YaIz8bY)，[SSD512*](https:\u002F\u002Fdrive.google.com\u002Fopen?id=1wGc368WyXSHZOv4iow2tri9LnB0vm9X-)\n    * 07+12+COCO：[SSD300*](https:\u002F\u002Fdrive.google.com\u002Fopen?id=1vtNI6kSnv7fkozl7WxyhGyReB6JvDM41)，[SSD512*](https:\u002F\u002Fdrive.google.com\u002Fopen?id=14mELuzm0OvXnwjb0mzAiG-Ake9_NP_LQ)\n    * 07++12+COCO：[SSD300*](https:\u002F\u002Fdrive.google.com\u002Fopen?id=1fyDDUcIOSjeiP08vl1WCndcFdtboFXua)，[SSD512*](https:\u002F\u002Fdrive.google.com\u002Fopen?id=1a-64b6y6xsQr5puUsHX_wxI1orQDercM)\n\n\n2. COCO 模型：\n\n    * trainval35k：[SSD300*](https:\u002F\u002Fdrive.google.com\u002Fopen?id=1vmEF7FUsWfHquXyCqO17UaXOPpRbwsdj)，[SSD512*](https:\u002F\u002Fdrive.google.com\u002Fopen?id=1IJWZKmjkcFMlvaz2gYukzFx4d6mH3py5)\n\n\n3. ILSVRC 模型：\n\n    * trainval1：[SSD300*](https:\u002F\u002Fdrive.google.com\u002Fopen?id=1VWkj1oQS2RUhyJXckx3OaDYs5fx2mMCq)，[SSD500](https:\u002F\u002Fdrive.google.com\u002Fopen?id=1LcBPsd9CJbuBw4KiSuE1o1fMA-Pz2Zvw)\n\n### 如何在自己的数据集上微调其中一个训练好的模型\n\n如果您想在自己的数据集上微调其中一个提供的训练好的模型，那么您的数据集很可能与该训练模型所使用的类别数量不同。以下教程将解释如何解决这个问题：\n\n[`weight_sampling_tutorial.ipynb`](weight_sampling_tutorial.ipynb)\n\n### 待办事项\n\n以下事项按优先级排序，列在待办事项清单上。欢迎贡献，但请先阅读 [贡献指南](CONTRIBUTING.md)。\n\n1. 为基于其他基础网络（如 MobileNet、InceptionResNetV2 或 DenseNet）的 SSD 添加模型定义和训练好的权重。\n2. 
添加对 Theano 和 CNTK 后端的支持。这需要将自定义层和损失函数从 TensorFlow 移植到抽象的 Keras 后端。\n\n目前正在开发的内容：\n\n* 一种新的 [Focal Loss](https:\u002F\u002Farxiv.org\u002Fabs\u002F1708.02002) 损失函数。\n\n### 重要说明\n\n* 所有在 MS COCO 上训练的模型均使用所有 Jupyter 笔记本中提供的较小锚框缩放因子。特别是，“07+12+COCO”和“07++12+COCO”模型使用的是较小的缩放因子。\n\n### 术语\n\n* “锚框”：论文中称其为“默认框”，在原始 C++ 代码中称为“先验框”或“先验”，而 Faster R-CNN 论文中则称之为“锚框”。这些术语指的都是同一事物，但我更倾向于使用“锚框”这一名称，因为它最能准确描述这一概念。在 `keras_ssd300.py` 和 `keras_ssd512.py` 中，我仍沿用“先验框”或“先验”的称呼，以保持与原始 Caffe 实现的一致性；但在其他地方，我一律使用“锚框”或“锚点”这一名称。\n* “标签”：就本项目而言，数据集由“图像”和“标签”组成。属于某张图像标注的所有内容都称为该图像的“标签”：不仅包括目标类别标签，还包括边界框坐标。“标签”只是“标注”的简称。此外，在整个文档中，我也基本可以互换使用“标签”和“目标”这两个术语，尽管在训练语境下，“目标”特指标签。\n* “预测层”：所谓“预测层”或“预测器”，是指网络中的所有最后一层卷积层，即所有不向后续卷积层提供输入的卷积层。","# ssd_keras 快速上手指南\n\n## 环境准备\n\n### 系统要求\n- 操作系统：支持 Linux、macOS 或 Windows（推荐使用 Linux）\n- Python 版本：3.x（不支持 Python 2）\n\n### 前置依赖\n安装以下依赖库：\n\n```bash\npip install numpy tensorflow==1.15 keras==2.2.4 opencv-python beautifulsoup4\n```\n\n> 注意：TensorFlow 1.x 与 Keras 2.x 必须相互兼容；上面固定的 `tensorflow==1.15` 与 `keras==2.2.4` 是一对兼容组合，如使用其他版本请自行确认兼容性。\n\n---\n\n## 安装步骤\n\n1. 克隆项目仓库：\n   ```bash\n   git clone https:\u002F\u002Fgithub.com\u002Fpierluigiferrari\u002Fssd_keras.git\n   cd ssd_keras\n   ```\n\n2. 下载预训练模型权重（可选）：\n   - [卷积化 VGG-16 权重](#download-the-convolutionalized-vgg-16-weights)\n   - [原始训练模型权重](#download-the-original-trained-model-weights)\n\n3. 准备数据集（如 Pascal VOC）并设置路径（详见 `ssd300_training.ipynb` 中的说明）。\n\n---\n\n## 基本使用\n\n### 示例一：使用预训练模型进行推理\n\n1. 打开 Jupyter Notebook：\n   ```bash\n   jupyter notebook\n   ```\n\n2. 进入以下文件：\n   - 使用 SSD300 模型：`ssd300_inference.ipynb`\n   - 使用 SSD512 模型：`ssd512_inference.ipynb`\n\n3. 在 Notebook 中加载预训练模型权重，并运行推理代码。\n\n### 示例二：从零开始训练模型（以 SSD300 为例）\n\n1. 
下载 Pascal VOC 数据集：\n   ```bash\n   wget http:\u002F\u002Fhost.robots.ox.ac.uk\u002Fpascal\u002FVOC\u002Fvoc2012\u002FVOCtrainval_11-May-2012.tar\n   wget http:\u002F\u002Fhost.robots.ox.ac.uk\u002Fpascal\u002FVOC\u002Fvoc2007\u002FVOCtrainval_06-Nov-2007.tar\n   wget http:\u002F\u002Fhost.robots.ox.ac.uk\u002Fpascal\u002FVOC\u002Fvoc2007\u002FVOCtest_06-Nov-2007.tar\n   ```\n\n2. 解压并设置数据路径（在 `ssd300_training.ipynb` 中配置）。\n\n3. 加载预训练的 VGG-16 权重（或原始模型权重）。\n\n4. 运行 `ssd300_training.ipynb` 中的训练流程。\n\n---\n\n## 国内加速方案（可选）\n\n如需加速下载依赖包或数据集，可以使用国内镜像源：\n\n- pip 安装依赖时使用清华源：\n  ```bash\n  pip install -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple numpy tensorflow==1.15 keras opencv-python beautifulsoup4\n  ```\n\n- 下载 Pascal VOC 数据集时，可尝试使用国内镜像站点（如阿里云、清华大学开源镜像站等）。\n\n---\n\n## 小结\n\n- 该项目提供了多种 SSD 模型实现（SSD300、SSD512、SSD7），适用于不同场景。\n- 推荐使用 Jupyter Notebook 进行快速实验和调试。\n- 若需自定义模型结构，可参考 `keras_ssd7.py` 模板进行扩展。","某智能安防公司正在开发一款用于城市道路监控的实时视频分析系统，需要快速实现对车辆、行人等目标的检测功能。\n\n### 没有 ssd_keras 时\n\n- 需要从零开始研究 SSD 模型架构，理解其复杂的多尺度特征融合机制和默认框生成逻辑，耗费大量时间。\n- 缺乏现成的 Keras 实现，导致模型训练和部署流程复杂，难以与现有基于 Keras 的深度学习框架集成。\n- 训练模型时无法直接使用预训练权重，需手动调整网络结构并重新训练，效率低下且容易出错。\n- 对于不同分辨率的输入图像，需要自行设计适配方案，增加了开发难度和调试成本。\n- 模型性能评估缺乏明确指标参考，难以判断是否达到预期效果。\n\n### 使用 ssd_keras 后\n\n- 提供了清晰的 Keras 实现代码和详细注释，便于快速理解模型结构和训练流程，节省了大量研究时间。\n- 直接支持多种 SSD 网络架构（如 SSD300、SSD512），可无缝集成到现有 Keras 工程中，简化了开发流程。\n- 提供了预训练权重下载链接，用户可直接加载并进行微调，显著缩短了模型训练周期。\n- 支持多种输入分辨率，适应不同场景需求，减少了额外适配工作的负担。\n- 提供了 Pascal VOC 数据集上的 mAP 性能指标，便于评估模型效果并优化参数配置。\n\nssd_keras 为快速构建高精度、高性能的目标检测系统提供了成熟可靠的解决方案。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fpierluigiferrari_ssd_keras_915c6345.png","pierluigiferrari","Pierluigi Ferrari","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Fpierluigiferrari_051f871c.png",null,"Berlin","pierluigi.ferrari@gmx.com","https:\u002F\u002Fgithub.com\u002Fpierluigiferrari",[83],{"name":84,"color":85,"percentage":86},"Python","#3572A5",100,1873,925,"2026-02-06T08:03:51","Apache-2.0","Linux, macOS, Windows","需要 
NVIDIA GPU，显存 8GB+，CUDA 6.0+（使用 cuDNN v6）","16GB+",{"notes":95,"python":96,"dependencies":97},"不支持 Theano 和 CNTK 后端；Python 2 已不再支持；首次运行需下载约 5GB 的模型权重文件","3.x",[98,99,100,101,102],"Numpy","TensorFlow 1.x","Keras 2.x","OpenCV","Beautiful Soup 4.x",[13,14],[105,106,107,108,109,110,111,112,113,114],"ssd","keras","ssd-model","object-detection","computer-vision","deep-learning","fcn","fully-convolutional-networks","keras-models","single-shot-multibox-detector","2026-03-27T02:49:30.150509","2026-04-06T08:46:24.161961",[118,123,128,133,138,143],{"id":119,"question_zh":120,"answer_zh":121,"source_url":122},5887,"如何在自定义数据集上训练 SSD 模型时避免过拟合？","为了避免过拟合，可以尝试增加数据增强（data augmentation）的强度，使用更小的 batch size，或者引入正则化技术如 dropout。此外，确保训练和验证数据集的分布一致，并且验证数据集足够大以反映真实场景。","https:\u002F\u002Fgithub.com\u002Fpierluigiferrari\u002Fssd_keras\u002Fissues\u002F59",{"id":124,"question_zh":125,"answer_zh":126,"source_url":127},5888,"如何解决在自定义数据集上训练 SSD 模型时预测结果不准确的问题？","预测结果不准确可能与锚框（anchor boxes）的尺度（scales）和宽高比（aspect ratios）设置不当有关。建议根据目标物体的大小和形状调整这些参数。此外，检查数据标注是否正确，以及模型是否充分训练。","https:\u002F\u002Fgithub.com\u002Fpierluigiferrari\u002Fssd_keras\u002Fissues\u002F77",{"id":129,"question_zh":130,"answer_zh":131,"source_url":132},5889,"如何处理在自定义数据集上训练 SSD 模型时出现的维度错误问题？","维度错误通常是因为输入数据的形状不符合模型预期。请确保图像尺寸、通道数等参数与模型配置一致。例如，如果模型期望输入为 (1080, 1920, 3)，而实际输入为 (16, 1)，则需要调整输入数据的格式。","https:\u002F\u002Fgithub.com\u002Fpierluigiferrari\u002Fssd_keras\u002Fissues\u002F227",{"id":134,"question_zh":135,"answer_zh":136,"source_url":137},5890,"如何提高 SSD 模型在 GPU 上的推理速度？","为了提高推理速度，可以使用更高版本的 CUDA 和 cuDNN，同时确保使用了正确的模型架构（如 SSD512）。此外，增大 batch size 可以提升 GPU 的利用率。例如，在 GTX 1080Ti 上，SSD512 模型在 batch size 为 8 时可达到约 44.8 FPS。","https:\u002F\u002Fgithub.com\u002Fpierluigiferrari\u002Fssd_keras\u002Fissues\u002F71",{"id":139,"question_zh":140,"answer_zh":141,"source_url":142},5891,"如何解决在加载预训练权重时出现的维度不匹配错误？","维度不匹配错误通常发生在权重文件与当前模型结构不兼容时。请确认使用的权重文件是否与模型架构（如 SSD300 或 
SSD512）匹配。此外，检查代码中是否正确导入了所有必要的层和模块。","https:\u002F\u002Fgithub.com\u002Fpierluigiferrari\u002Fssd_keras\u002Fissues\u002F65",{"id":144,"question_zh":145,"answer_zh":146,"source_url":127},5892,"如何在自定义数据集上成功训练 SSD 模型？","在自定义数据集上训练 SSD 模型时，确保数据标注格式正确，图像尺寸统一，并合理设置锚框参数。此外，使用合适的数据增强策略，如随机裁剪、翻转等，有助于提高模型泛化能力。最后，监控训练过程中的损失曲线，防止过拟合。",[148,153,158,163,168],{"id":149,"version":150,"summary_zh":151,"released_at":152},115217,"v0.9.0","## Release 0.9.0\r\n\r\n### Breaking Changes\r\n\r\n- None\r\n\r\n### Major Features and Improvements\r\n\r\n- Added a new, flexible `Evaluator` class that computes average precision scores. Among other things, it can compute average precisions according to both the Pascal VOC pre-2010 and post-2010 algorithms.\r\n- Added two new features to `DataGenerator`:\r\n    1. Convert the dataset into an HDF5 file: This stores the images of a dataset as uncompressed arrays in a contiguous block of memory within an HDF5 file, which requires a lot of disk space but reduces the image loading times during the batch generation.\r\n    2. Load the entire dataset into memory: This loads all images of a dataset into memory, thereby eliminating image loading times altogether. Requires enough memory to hold the entire dataset.\r\n\r\nFor several minor other improvements please refer to the commits since the last release (v0.8.0).\r\n\r\n### Bug Fixes and Other Changes\r\n\r\n- Fixed a bug in `DataGenerator.parse_xml()`: Before the fix there were cases in which the XML parser would parse the wrong bounding boxes for some objects. The only known situation in which this bug occurred is for the 'person' class of the Pascal VOC datasets, where the ground truth provides not only a bounding box for the person itself, but also additional bounding boxes for various body parts such. Depending on the order of these ground truth boxes within the XML files, the parser would sometimes parse the bounding box of a body part instead of the bounding box of the person. 
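A direct-children-only lookup avoids this order dependence entirely. The following standard-library sketch illustrates the idea (illustrative only; this is not the repository's actual `DataGenerator.parse_xml()` implementation):

```python
import xml.etree.ElementTree as ET

# A trimmed Pascal-VOC-style annotation: the <part> box for the head
# deliberately appears *before* the person's own <bndbox>.
VOC_XML = """<annotation>
  <object>
    <name>person</name>
    <part>
      <name>head</name>
      <bndbox><xmin>10</xmin><ymin>5</ymin><xmax>20</xmax><ymax>15</ymax></bndbox>
    </part>
    <bndbox><xmin>0</xmin><ymin>0</ymin><xmax>50</xmax><ymax>100</ymax></bndbox>
  </object>
</annotation>"""

def parse_boxes(xml_string):
    """Return one (class_name, box) tuple per top-level <object>."""
    root = ET.fromstring(xml_string)
    boxes = []
    for obj in root.findall('object'):
        # find() searches direct children only, so a <bndbox> nested inside
        # a <part> element can never shadow the object's own box, regardless
        # of the order of elements within the file.
        bb = obj.find('bndbox')
        box = tuple(int(bb.find(tag).text) for tag in ('xmin', 'ymin', 'xmax', 'ymax'))
        boxes.append((obj.find('name').text, box))
    return boxes

print(parse_boxes(VOC_XML))  # → [('person', (0, 0, 50, 100))]
```

A recursive search such as `obj.iter('bndbox')` would have returned the head box first for this file, which is exactly the order-dependent behavior described above.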
The parser now loads the correct bounding boxes in these cases.\r\n- Provided a better training\u002Fvalidation split for the Udacity traffic dataset. The new split is much more balanced than the old one.\r\n\r\n### API Changes\r\n\r\n- None\r\n\r\n### Known Issues\r\n\r\n- None","2018-05-06T16:47:40",{"id":154,"version":155,"summary_zh":156,"released_at":157},115218,"v0.8.0","## Release 0.8.0\r\n\r\n### Breaking Changes\r\n\r\n- None\r\n\r\n### Major Features and Improvements\r\n\r\n- Improved the matching algorithm. While the previous version had a few flaws, the new version is identical to the matching in the original Caffe implementation. Training a model with this new version reproduces the mAP results of the original Caffe SSD models exactly.\r\n- Added two new data augmentation chains: One for variable-size input images that produces effects similar to the original SSD data augmentation chain, but is a lot faster, and a second one for bird's eye-view datasets.\r\n\r\n### API Changes\r\n\r\n- None\r\n\r\n### Known Issues\r\n\r\n- None","2018-04-19T23:18:57",{"id":159,"version":160,"summary_zh":161,"released_at":162},115219,"v0.7.0","## Release 0.7.0\r\n\r\n### Breaking Changes\r\n\r\n- Introduced a new data generator.\r\n\r\n### Major Features and Improvements\r\n\r\n- Introduced a new data generator that has several advantages over the old data generator:\r\n    - It can replicate the data augmentation pipeline of the original Caffe SSD implementation.\r\n    - It's very flexible: Image transformations are no longer hard-coded into the generator itself. Instead, the generator takes a list of transformation objects that it applies to the data. This allows you to realize arbitrary image processing chains. In particular, you can now put transformations in any order or even have multiple parallel transformation chains from which one chain is randomly chosen. The generator comes with a number of useful image transformation classes that can be used out of the box. 
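The "list of transformation objects" design can be sketched in a few lines (a minimal sketch; `RandomHorizontalFlip` and `apply_chain` are illustrative stand-ins, not the generator's actual class and function names):

```python
import numpy as np

class RandomHorizontalFlip:
    """Illustrative transform object: flips an image and its boxes.

    Boxes are rows of (xmin, ymin, xmax, ymax) in absolute pixels.
    """
    def __init__(self, prob=0.5):
        self.prob = prob

    def __call__(self, image, labels):
        if np.random.rand() < self.prob:
            image = image[:, ::-1]                         # mirror horizontally
            width = image.shape[1]
            labels = labels.copy()
            labels[:, [0, 2]] = width - labels[:, [2, 0]]  # flip and swap x-coords
        return image, labels

def apply_chain(transformations, image, labels):
    """Apply each transform object in list order, as the generator does."""
    for transform in transformations:
        image, labels = transform(image, labels)
    return image, labels

# Because the chain is just a Python list, transforms can be reordered,
# dropped, or composed freely without touching the generator itself.
chain = [RandomHorizontalFlip(prob=1.0)]
image = np.zeros((2, 4, 3))
labels = np.array([[0, 0, 1, 1]])
image, labels = apply_chain(chain, image, labels)
print(labels.tolist())  # → [[3, 0, 4, 1]]
```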
Among them are most common photometric and geometric transformations, and, in particular, many useful patch sampling transformations.\r\n\r\n### API Changes\r\n\r\nThe API of the new data generator is not compatible with the old data generator.\r\n\r\n### Known Issues\r\n\r\nNone","2018-03-26T00:24:00",{"id":164,"version":165,"summary_zh":166,"released_at":167},115220,"v0.6.0","## Release 0.6.0\r\n\r\n### Breaking Changes\r\n\r\n- Changed the repository structure: Modules are now arranged in packages.\r\n\r\n### Major Features and Improvements\r\n\r\n- Introduced a new `DecodeDetections` layer type that corresponds to the `DetectionOutput` layer type of the original Caffe implementation. It performs the decoding and filtering (confidence thresholding, NMS, etc.) of the raw model output and follows the exact procedure of the `decode_y()` function. The point is to move the computationally expensive decoding and filtering process from the CPU (`decode_y()`) to the GPU for faster prediction. Along with `DecodeDetections`, a second version `DecodeDetections2` has been added. It follows the exact procedure of `decode_y2()` and is significantly faster than `DecodeDetections`, but potentially at the cost of lower prediction accuracy - this has not been tested extensively. The introduction of this new layer type also means that the API of the model builder functions has been expanded: Models can now be built in one of three modes:\r\n    1. `training`: The default mode. Produces the same models as before, where the model outputs the raw predictions that need to be decoded by `decode_y()` or `decode_y2()`.\r\n    2. `inference`: Adds a `DecodeDetections` layer to the model as its final layer. The resulting model outputs predictions that are already decoded and filtered. 
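Since the decoded tensor keeps a fixed number of rows per batch item, stripping it down to the real detections takes one confidence cut per item; a generic NumPy sketch (the 6-column row layout shown here is illustrative):

```python
import numpy as np

# Decoded output of shape (batch, top_k, 6); unused slots are zero-padded.
# Row layout (assumed for illustration): [class_id, confidence, xmin, ymin, xmax, ymax]
decoded = np.array([[[12., 0.90, 10., 10., 50., 80.],
                     [15., 0.35, 20., 30., 60., 90.],
                     [ 0., 0.00,  0.,  0.,  0.,  0.]]])  # dummy entry

confidence_thresh = 0.5
# Keep, per batch item, only the rows whose confidence clears the threshold.
filtered = [item[item[:, 1] > confidence_thresh] for item in decoded]
print(filtered[0].tolist())  # → [[12.0, 0.9, 10.0, 10.0, 50.0, 80.0]]
```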
However, since tensors are homogeneous in size along all axes, there will always be `top_k` predictions for each batch item, regardless of how many objects actually are in it, so the output still needs to be confidence-thresholded to remove the dummy entries among the predictions. The inference tutorials show how to do this.\r\n    3. `inference_fast`: Same as `inference`, but using a `DecodeDetections2` layer as the model's last layer.\r\n\r\n### Bug Fixes and Other Changes\r\n\r\n- Changed the repository structure: Modules are now arranged in packages.\r\n\r\n### API Changes\r\n\r\n- With the introduction of the new `DecodeDetections` layer type, the API of all model builder functions has changed to include a new `mode` parameter and `confidence_thresh`, `iou_threshold`, `top_k`, and `nms_max_output_size` parameters, all of which assume default values. `mode` defaults to `training`, in which case the resulting model is the same as before, so this is not a breaking change. `mode` can also be set to `inference` or `inference_fast` upon creation of the model though, in which case the resulting model has the `DecodeDetections` or `DecodeDetections2` layer as its last layer.\r\n\r\n### Known Issues\r\n\r\nNone","2018-03-05T15:41:06",{"id":169,"version":170,"summary_zh":171,"released_at":172},115221,"v0.5.0","## Release 0.5.0\r\n\r\n### Breaking Changes\r\n\r\nNone\r\n\r\n### Major Features and Improvements\r\n\r\n- Ports of the weights of all trained original models\r\n- Evaluation results on Pascal VOC\r\n- Tools for evaluation on Pascal VOC and MS COCO\r\n- Tutorials for training, inference, evaluation, and weight sub-sampling\r\n\r\n### Bug Fixes and Other Changes\r\n\r\n- Fixed random sampling in the weight sub-sampling procedure\r\n\r\n### API Changes\r\n\r\nNone\r\n\r\n### Known Issues\r\n\r\nNone","2018-03-04T17:17:59"]