[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-wizyoung--YOLOv3_TensorFlow":3,"tool-wizyoung--YOLOv3_TensorFlow":64},[4,17,27,35,43,56],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":16},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,3,"2026-04-05T11:01:52",[13,14,15],"开发框架","图像","Agent","ready",{"id":18,"name":19,"github_repo":20,"description_zh":21,"stars":22,"difficulty_score":23,"last_commit_at":24,"category_tags":25,"status":16},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",140436,2,"2026-04-05T23:32:43",[13,15,26],"语言模型",{"id":28,"name":29,"github_repo":30,"description_zh":31,"stars":32,"difficulty_score":23,"last_commit_at":33,"category_tags":34,"status":16},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 
绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",107662,"2026-04-03T11:11:01",[13,14,15],{"id":36,"name":37,"github_repo":38,"description_zh":39,"stars":40,"difficulty_score":23,"last_commit_at":41,"category_tags":42,"status":16},3704,"NextChat","ChatGPTNextWeb\u002FNextChat","NextChat 是一款轻量且极速的 AI 助手，旨在为用户提供流畅、跨平台的大模型交互体验。它完美解决了用户在多设备间切换时难以保持对话连续性，以及面对众多 AI 模型不知如何统一管理的痛点。无论是日常办公、学习辅助还是创意激发，NextChat 都能让用户随时随地通过网页、iOS、Android、Windows、MacOS 或 Linux 端无缝接入智能服务。\n\n这款工具非常适合普通用户、学生、职场人士以及需要私有化部署的企业团队使用。对于开发者而言，它也提供了便捷的自托管方案，支持一键部署到 Vercel 或 Zeabur 等平台。\n\nNextChat 的核心亮点在于其广泛的模型兼容性，原生支持 Claude、DeepSeek、GPT-4 及 Gemini Pro 等主流大模型，让用户在一个界面即可自由切换不同 AI 能力。此外，它还率先支持 MCP（Model Context Protocol）协议，增强了上下文处理能力。针对企业用户，NextChat 提供专业版解决方案，具备品牌定制、细粒度权限控制、内部知识库整合及安全审计等功能，满足公司对数据隐私和个性化管理的高标准要求。",87618,"2026-04-05T07:20:52",[13,26],{"id":44,"name":45,"github_repo":46,"description_zh":47,"stars":48,"difficulty_score":23,"last_commit_at":49,"category_tags":50,"status":16},2268,"ML-For-Beginners","microsoft\u002FML-For-Beginners","ML-For-Beginners 是由微软推出的一套系统化机器学习入门课程，旨在帮助零基础用户轻松掌握经典机器学习知识。这套课程将学习路径规划为 12 周，包含 26 节精炼课程和 52 道配套测验，内容涵盖从基础概念到实际应用的完整流程，有效解决了初学者面对庞大知识体系时无从下手、缺乏结构化指导的痛点。\n\n无论是希望转型的开发者、需要补充算法背景的研究人员，还是对人工智能充满好奇的普通爱好者，都能从中受益。课程不仅提供了清晰的理论讲解，还强调动手实践，让用户在循序渐进中建立扎实的技能基础。其独特的亮点在于强大的多语言支持，通过自动化机制提供了包括简体中文在内的 50 多种语言版本，极大地降低了全球不同背景用户的学习门槛。此外，项目采用开源协作模式，社区活跃且内容持续更新，确保学习者能获取前沿且准确的技术资讯。如果你正寻找一条清晰、友好且专业的机器学习入门之路，ML-For-Beginners 将是理想的起点。",84991,"2026-04-05T10:45:23",[14,51,52,53,15,54,26,13,55],"数据工具","视频","插件","其他","音频",{"id":57,"name":58,"github_repo":59,"description_zh":60,"stars":61,"difficulty_score":10,"last_commit_at":62,"category_tags":63,"status":16},3128,"ragflow","infiniflow\u002Fragflow","RAGFlow 
是一款领先的开源检索增强生成（RAG）引擎，旨在为大语言模型构建更精准、可靠的上下文层。它巧妙地将前沿的 RAG 技术与智能体（Agent）能力相结合，不仅支持从各类文档中高效提取知识，还能让模型基于这些知识进行逻辑推理和任务执行。\n\n在大模型应用中，幻觉问题和知识滞后是常见痛点。RAGFlow 通过深度解析复杂文档结构（如表格、图表及混合排版），显著提升了信息检索的准确度，从而有效减少模型“胡编乱造”的现象，确保回答既有据可依又具备时效性。其内置的智能体机制更进一步，使系统不仅能回答问题，还能自主规划步骤解决复杂问题。\n\n这款工具特别适合开发者、企业技术团队以及 AI 研究人员使用。无论是希望快速搭建私有知识库问答系统，还是致力于探索大模型在垂直领域落地的创新者，都能从中受益。RAGFlow 提供了可视化的工作流编排界面和灵活的 API 接口，既降低了非算法背景用户的上手门槛，也满足了专业开发者对系统深度定制的需求。作为基于 Apache 2.0 协议开源的项目，它正成为连接通用大模型与行业专有知识之间的重要桥梁。",77062,"2026-04-04T04:44:48",[15,14,13,26,54],{"id":65,"github_repo":66,"name":67,"description_en":68,"description_zh":69,"ai_summary_zh":70,"readme_en":71,"readme_zh":72,"quickstart_zh":73,"use_case_zh":74,"hero_image_url":75,"owner_login":76,"owner_name":77,"owner_avatar_url":78,"owner_bio":79,"owner_company":79,"owner_location":80,"owner_email":81,"owner_twitter":79,"owner_website":82,"owner_url":83,"languages":84,"stars":89,"forks":90,"last_commit_at":91,"license":92,"difficulty_score":93,"env_os":94,"env_gpu":95,"env_ram":94,"env_deps":96,"category_tags":103,"github_topics":104,"view_count":23,"oss_zip_url":79,"oss_zip_packed_at":79,"status":16,"created_at":110,"updated_at":111,"faqs":112,"releases":148},3605,"wizyoung\u002FYOLOv3_TensorFlow","YOLOv3_TensorFlow","Complete YOLO v3 TensorFlow implementation. 
Support training on your own dataset.","YOLOv3_TensorFlow 是一个基于纯 TensorFlow 框架实现的 YOLOv3 目标检测开源项目。它旨在帮助开发者在 TensorFlow 生态中复现经典的 YOLOv3 算法，提供从预训练权重转换、自定义数据集训练到模型评估与推理的完整流水线。\n\n对于希望在不依赖 PyTorch 或其他框架的前提下，深入理解 YOLOv3 架构或在旧版 TensorFlow 环境中部署高性能检测模型的用户来说，这是一个极具价值的参考实现。它特别适合熟悉 TensorFlow 的 AI 工程师、研究人员以及需要快速验证算法效果的学习者。\n\n该项目拥有多项技术亮点：内置高效的 tf.data 数据管道以加速训练；支持将 Darknet 预训练权重无缝转换为 TensorFlow 格式；实现了极速的 GPU 非极大值抑制（NMS）算法；并提供 K-means 聚类工具以自动优化先验锚框。尽管作者已转向 PyTorch 并不再维护此仓库，但其代码结构清晰、功能完备，且在 Titan XP 显卡上对 416x416 图像的推理速度可达约 23 毫秒，性能表现优异。无论是用于教学演示、算法研究","YOLOv3_TensorFlow 是一个基于纯 TensorFlow 框架实现的 YOLOv3 目标检测开源项目。它旨在帮助开发者在 TensorFlow 生态中复现经典的 YOLOv3 算法，提供从预训练权重转换、自定义数据集训练到模型评估与推理的完整流水线。\n\n对于希望在不依赖 PyTorch 或其他框架的前提下，深入理解 YOLOv3 架构或在旧版 TensorFlow 环境中部署高性能检测模型的用户来说，这是一个极具价值的参考实现。它特别适合熟悉 TensorFlow 的 AI 工程师、研究人员以及需要快速验证算法效果的学习者。\n\n该项目拥有多项技术亮点：内置高效的 tf.data 数据管道以加速训练；支持将 Darknet 预训练权重无缝转换为 TensorFlow 格式；实现了极速的 GPU 非极大值抑制（NMS）算法；并提供 K-means 聚类工具以自动优化先验锚框。尽管作者已转向 PyTorch 并不再维护此仓库，但其代码结构清晰、功能完备，且在 Titan XP 显卡上对 416x416 图像的推理速度可达约 23 毫秒，性能表现优异。无论是用于教学演示、算法研究还是工程落地前的原型验证，YOLOv3_TensorFlow 都是一个扎实可靠的选择。","# YOLOv3_TensorFlow\n\n**NOTE:** This repo is no longer maintained (in fact, I dropped support a long time ago), as I have been using PyTorch for a year now. Life is short, I use PyTorch.\n\n\n--------\n\n### 1. Introduction\n\nThis is my implementation of [YOLOv3](https:\u002F\u002Fpjreddie.com\u002Fmedia\u002Ffiles\u002Fpapers\u002FYOLOv3.pdf) in pure TensorFlow. It contains the full pipeline of training and evaluation on your own dataset. The key features of this repo are:\n\n- Efficient tf.data pipeline\n- Weights converter (converts darknet weights pretrained on the COCO dataset to a TensorFlow checkpoint)\n- Extremely fast GPU non-maximum suppression\n- Full training and evaluation pipeline\n- K-means algorithm to select prior anchor boxes\n\n### 2. 
Requirements\n\nPython version: 2 or 3\n\nPackages:\n\n- tensorflow >= 1.8.0 (in theory, any version that supports tf.data should work)\n- opencv-python\n- tqdm\n\n### 3. Weights conversion\n\nThe pretrained darknet weights file can be downloaded [here](https:\u002F\u002Fpjreddie.com\u002Fmedia\u002Ffiles\u002Fyolov3.weights). Place this weights file under the `.\u002Fdata\u002Fdarknet_weights\u002F` directory and then run:\n\n```shell\npython convert_weight.py\n```\n\nThe converted TensorFlow checkpoint file will then be saved to the `.\u002Fdata\u002Fdarknet_weights\u002F` directory.\n\nYou can also download my converted TensorFlow checkpoint file via [[Google Drive link](https:\u002F\u002Fdrive.google.com\u002Fdrive\u002Ffolders\u002F1mXbNgNxyXPi7JNsnBaxEv1-nWr7SVoQt?usp=sharing)] or [[Github Release](https:\u002F\u002Fgithub.com\u002Fwizyoung\u002FYOLOv3_TensorFlow\u002Freleases\u002F)] and then place it in the same directory.\n\n### 4. Running demos\n\nThere are some demo images and videos under `.\u002Fdata\u002Fdemo_data\u002F`. You can run the demos as follows:\n\nSingle image test demo:\n\n```shell\npython test_single_image.py .\u002Fdata\u002Fdemo_data\u002Fmessi.jpg\n```\n\nVideo test demo:\n\n```shell\npython video_test.py .\u002Fdata\u002Fdemo_data\u002Fvideo.mp4\n```\n\nSome results:\n\n![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fwizyoung_YOLOv3_TensorFlow_readme_292b1363b37b.jpg)\n\n![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fwizyoung_YOLOv3_TensorFlow_readme_a860c66e676b.jpg)\n\n![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fwizyoung_YOLOv3_TensorFlow_readme_a83ad6ec0bce.jpg)\n\nCompare the kite detection results with the result from TensorFlow's official API [here](https:\u002F\u002Fgithub.com\u002Ftensorflow\u002Fmodels\u002Fblob\u002Fmaster\u002Fresearch\u002Fobject_detection\u002Fg3doc\u002Fimg\u002Fkites_detections_output.jpg).\n\n(The kite detection result was obtained at an input image resolution of 1344x896.)\n\n### 5. 
Inference speed\n\nHow fast is the inference speed? With images scaled to 416*416:\n\n\n| Backbone              |   GPU    | Time(ms) |\n| :-------------------- | :------: | :------: |\n| Darknet-53 (paper)    | Titan X  |    29    |\n| Darknet-53 (my impl.) | Titan XP |   ~23    |\n\nWhy is it so fast? Check the ImageNet classification result comparison from the paper:\n\n![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fwizyoung_YOLOv3_TensorFlow_readme_05d6dc870cd7.png)\n\n### 6. Model architecture\n\nFor a better understanding of the model architecture, you can refer to the following picture. Many thanks to [Levio](https:\u002F\u002Fblog.csdn.net\u002Fleviopku\u002Farticle\u002Fdetails\u002F82660381) for the excellent work!\n\n![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fwizyoung_YOLOv3_TensorFlow_readme_c7b5960a0b3f.png)\n\n### 7. Training\n\n#### 7.1 Data preparation \n\n(1) annotation file\n\nGenerate `train.txt\u002Fval.txt\u002Ftest.txt` files under the `.\u002Fdata\u002Fmy_data\u002F` directory. Each line describes one image, in the format `image_index image_absolute_path img_width img_height box_1 box_2 ... box_n`. Box_x format: `label_index x_min y_min x_max y_max`. (The origin of coordinates is at the top-left corner; top-left => (xmin, ymin), bottom-right => (xmax, ymax).) `image_index` is the line index, which starts from zero. `label_index` is in the range [0, class_num - 1].\n\nFor example:\n\n```\n0 xxx\u002Fxxx\u002Fa.jpg 1920 1080 0 453 369 473 391 1 588 245 608 268\n1 xxx\u002Fxxx\u002Fb.jpg 1920 1080 1 466 403 485 422 2 793 300 809 320\n...\n```\n\nSince many users report using tools like LabelImg to generate XML-format annotations, I added a demo script for the VOC dataset to do the conversion. Check the `misc\u002Fparse_voc_xml.py` file for more details.\n\n(2) class_names file:\n\nGenerate the `data.names` file under the `.\u002Fdata\u002Fmy_data\u002F` directory. 
Each line represents a class name.\n\nFor example:\n\n```\nbird\nperson\nbike\n...\n```\n\nThe COCO dataset class names file is placed at `.\u002Fdata\u002Fcoco.names`.\n\n(3) prior anchor file:\n\nUse the k-means algorithm to get the prior anchors:\n\n```\npython get_kmeans.py\n```\n\nYou will then get 9 anchors and the average IoU. Save the anchors to a txt file.\n\nThe COCO dataset anchors provided by YOLO's author are placed at `.\u002Fdata\u002Fyolo_anchors.txt`; you can use those too.\n\nThe anchors computed by the k-means script are on the resized image scale. The default resize method is letterbox resize, i.e., the original aspect ratio is kept in the resized image.\n\n#### 7.2 Training\n\nUse `train.py`. The hyper-parameters and their explanatory comments can be found in `args.py`:\n\n```shell\nCUDA_VISIBLE_DEVICES=GPU_ID python train.py\n```\n\nCheck `args.py` for more details. You should set the parameters yourself for your own specific task.\n\n### 8. Evaluation\n\nUse `eval.py` to evaluate the validation or test dataset. The parameters are as follows:\n\n```shell\n$ python eval.py -h\nusage: eval.py [-h] [--eval_file EVAL_FILE] \n               [--restore_path RESTORE_PATH]\n               [--anchor_path ANCHOR_PATH] \n               [--class_name_path CLASS_NAME_PATH]\n               [--batch_size BATCH_SIZE]\n               [--img_size [IMG_SIZE [IMG_SIZE ...]]]\n               [--num_threads NUM_THREADS]\n               [--prefetech_buffer PREFETECH_BUFFER]\n               [--nms_threshold NMS_THRESHOLD]\n               [--score_threshold SCORE_THRESHOLD] \n               [--nms_topk NMS_TOPK]\n```\n\nCheck `eval.py` for more details. You should set the parameters yourself. \n\nYou will get the loss, recall, precision, average precision and mAP metrics.\n\nFor higher mAP, you should set `score_threshold` to a small number.\n\n### 9. 
Some tricks\n\nHere are some training tricks from my experiments:\n\n(1) Apply the two-stage training strategy or the one-stage training strategy:\n\nTwo-stage training:\n\nFirst stage: Restore the `darknet53_body` weights from the COCO checkpoint, then train the `yolov3_head` with a large learning rate such as 1e-3 until the loss reaches a low level.\n\nSecond stage: Restore the weights from the first stage, then train the whole model with a small learning rate such as 1e-4 or smaller. At this stage, remember to restore the optimizer parameters if you use an optimizer like Adam.\n\nOne-stage training:\n\nJust restore the whole weight file except the last three convolution layers (Conv_6, Conv_14, Conv_22). In this case, watch out for possible NaN loss values.\n\n(2) I've included many useful training strategies in `args.py`:\n\n- Cosine decay of lr (SGDR)\n- Multi-scale training\n- Label smoothing\n- Mixup data augmentation\n- Focal loss\n\nThese are all good strategies, but that does **not** mean they will definitely improve performance. You should choose the appropriate strategies for your own task.\n\nThis [paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F1902.04103) from gluon-cv has shown that data augmentation is critical to YOLO v3, which is completely consistent with my own experiments. Some data augmentation strategies that seem reasonable may lead to poor performance. For example, after introducing random color jittering, the mAP on my own dataset dropped heavily. Thus I hope you pay extra attention to data augmentation.\n\n(4) Loss is NaN? Set a larger warm_up_epoch number or a smaller learning rate and try several more times. If you fine-tune the whole model, using Adam may sometimes cause NaN values. You can try the momentum optimizer instead.\n\n### 10. Fine-tune on VOC dataset\n\nI did a quick training run on the VOC dataset. The parameters I used in my experiments are included under the `misc\u002Fexperiments_on_voc\u002F` folder for your reference. 
The training dataset is the VOC 2007 + 2012 trainval set, and the test dataset is the VOC 2007 test set.\n\nIn the end, with 416\\*416 input images, I got an 87.54% test mAP (not using the 07 metric), without any heavy fine-tuning. You should get similar or better results.\n\nMy weights pretrained on the VOC dataset can be downloaded [here](https:\u002F\u002Fdrive.google.com\u002Fdrive\u002Ffolders\u002F1ICKcJPozQOVRQnE1_vMn90nr7dejg0yW?usp=sharing).\n\n### 11. TODO\n\n[ ] Multi-GPU training with sync batch norm. \n\n[ ] Maybe TF 2.0?\n\n-------\n\n### Credits:\n\nI referred to many fantastic repos during the implementation:\n\n[YunYang1994\u002Ftensorflow-yolov3](https:\u002F\u002Fgithub.com\u002FYunYang1994\u002Ftensorflow-yolov3)\n\n[qqwweee\u002Fkeras-yolo3](https:\u002F\u002Fgithub.com\u002Fqqwweee\u002Fkeras-yolo3)\n\n[eriklindernoren\u002FPyTorch-YOLOv3](https:\u002F\u002Fgithub.com\u002Feriklindernoren\u002FPyTorch-YOLOv3)\n\n[pjreddie\u002Fdarknet](https:\u002F\u002Fgithub.com\u002Fpjreddie\u002Fdarknet)\n\n[dmlc\u002Fgluon-cv](https:\u002F\u002Fgithub.com\u002Fdmlc\u002Fgluon-cv\u002Ftree\u002Fmaster\u002Fscripts\u002Fdetection\u002Fyolo)\n\n","# YOLOv3_TensorFlow\n\n**注意：** 本仓库已不再维护（实际上我很久以前就停止了支持），因为我已经转用 PyTorch 一年了。人生苦短，我选择 PyTorch。\n\n\n--------\n\n### 1. 简介\n\n这是我用纯 TensorFlow 实现的 [YOLOv3](https:\u002F\u002Fpjreddie.com\u002Fmedia\u002Ffiles\u002Fpapers\u002FYOLOv3.pdf)。它包含了在您自己的数据集上进行训练和评估的完整流程。该仓库的主要特点包括：\n\n- 高效的 tf.data 数据管道\n- 权重转换器（将 COCO 数据集上的预训练 Darknet 权重转换为 TensorFlow 检查点）\n- 极速 GPU 非极大值抑制\n- 完整的训练和评估流程\n- 使用 K-means 算法选择先验锚框\n\n### 2. 要求\n\nPython 版本：2 或 3\n\n依赖包：\n\n- tensorflow >= 1.8.0（理论上任何支持 tf.data 的版本都可以）\n- opencv-python\n- tqdm\n\n### 3. 
权重转换\n\n预训练的 Darknet 权重文件可以从 [这里](https:\u002F\u002Fpjreddie.com\u002Fmedia\u002Ffiles\u002Fyolov3.weights) 下载。将该权重文件放置在 `.\u002Fdata\u002Fdarknet_weights\u002F` 目录下，然后运行：\n\n```shell\npython convert_weight.py\n```\n\n转换后的 TensorFlow 检查点文件将保存到 `.\u002Fdata\u002Fdarknet_weights\u002F` 目录中。\n\n您也可以通过 [[Google Drive 链接](https:\u002F\u002Fdrive.google.com\u002Fdrive\u002Ffolders\u002F1mXbNgNxyXPi7JNsnBaxEv1-nWr7SVoQt?usp=sharing)] 或 [[Github 发布页面](https:\u002F\u002Fgithub.com\u002Fwizyoung\u002FYOLOv3_TensorFlow\u002Freleases\u002F)] 下载我转换好的 TensorFlow 检查点文件，并将其放置在同一目录下。\n\n### 4. 运行示例\n\n`.\u002Fdata\u002Fdemo_data\u002F` 目录下包含了一些示例图片和视频。您可以按以下方式运行示例：\n\n单张图片测试示例：\n\n```shell\npython test_single_image.py .\u002Fdata\u002Fdemo_data\u002Fmessi.jpg\n```\n\n视频测试示例：\n\n```shell\npython video_test.py .\u002Fdata\u002Fdemo_data\u002Fvideo.mp4\n```\n\n部分结果如下：\n\n![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fwizyoung_YOLOv3_TensorFlow_readme_292b1363b37b.jpg)\n\n![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fwizyoung_YOLOv3_TensorFlow_readme_a860c66e676b.jpg)\n\n![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fwizyoung_YOLOv3_TensorFlow_readme_a83ad6ec0bce.jpg)\n\n将风筝检测结果与 TensorFlow 官方 API 的结果进行比较 [这里](https:\u002F\u002Fgithub.com\u002Ftensorflow\u002Fmodels\u002Fblob\u002Fmaster\u002Fresearch\u002Fobject_detection\u002Fg3doc\u002Fimg\u002Fkites_detections_output.jpg)。\n\n（风筝检测结果是在输入图像分辨率为 1344x896 的情况下得到的）\n\n### 5. 推理速度\n\n推理速度有多快？当图像缩放到 416*416 时：\n\n| 主干网络              |   GPU    | 时间(ms) |\n| :-------------------- | :------: | :------: |\n| Darknet-53（论文）    | Titan X  |    29    |\n| Darknet-53（我的实现） | Titan XP |   ~23    |\n\n为什么这么快？请查看论文中 ImageNet 分类结果的对比图：\n\n![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fwizyoung_YOLOv3_TensorFlow_readme_05d6dc870cd7.png)\n\n### 6. 
模型架构\n\n为了更好地理解模型架构，您可以参考下图。特别感谢 [Levio](https:\u002F\u002Fblog.csdn.net\u002Fleviopku\u002Farticle\u002Fdetails\u002F82660381) 的优秀工作！\n\n![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fwizyoung_YOLOv3_TensorFlow_readme_c7b5960a0b3f.png)\n\n### 7. 训练\n\n#### 7.1 数据准备 \n\n(1) 注释文件\n\n在 `.\u002Fdata\u002Fmy_data\u002F` 目录下生成 `train.txt\u002Fval.txt\u002Ftest.txt` 文件。每行对应一张图片，格式为 `image_index image_absolute_path img_width img_height box_1 box_2 ... box_n`。其中，box_x 的格式为：`label_index x_min y_min x_max y_max`。（坐标原点位于左上角，左上角为 (xmin, ymin)，右下角为 (xmax, ymax)。）`image_index` 是从零开始的行索引，`label_index` 的范围是 [0, class_num - 1]。\n\n例如：\n\n```\n0 xxx\u002Fxxx\u002Fa.jpg 1920 1080 0 453 369 473 391 1 588 245 608 268\n1 xxx\u002Fxxx\u002Fb.jpg 1920 1080 1 466 403 485 422 2 793 300 809 320\n...\n```\n\n由于许多用户反馈使用 LabelImg 等工具生成 XML 格式的标注文件，我添加了一个 VOC 数据集的示例脚本用于进行转换。更多详情请参阅 `misc\u002Fparse_voc_xml.py` 文件。\n\n(2) 类别名称文件：\n\n在 `.\u002Fdata\u002Fmy_data\u002F` 目录下生成 `data.names` 文件。每行代表一个类别名称。\n\n例如：\n\n```\n鸟\n人\n自行车\n...\n```\n\nCOCO 数据集的类别名称文件位于 `.\u002Fdata\u002Fcoco.names`。\n\n(3) 先验锚框文件：\n\n使用 K-means 算法获取先验锚框：\n\n```shell\npython get_kmeans.py\n```\n\n运行后会得到 9 个锚框及平均 IoU 值。将这些锚框保存到一个文本文件中。\n\nYOLO 作者提供的 COCO 数据集锚框位于 `.\u002Fdata\u002Fyolo_anchors.txt`，您也可以直接使用。\n\nK-means 脚本计算出的 YOLO 锚框是在调整大小后的图像尺度上得到的。默认的调整大小方法是 Letterbox 调整，即在调整大小后的图像中保持原始宽高比不变。\n\n#### 7.2 训练\n\n使用 `train.py` 进行训练。超参数及相关注释可在 `args.py` 中找到：\n\n```shell\nCUDA_VISIBLE_DEVICES=GPU_ID python train.py\n```\n\n更多细节请参阅 `args.py`。您需要根据自己的具体任务自行设置参数。\n\n### 8. 
评估\n\n使用 `eval.py` 对验证集或测试集进行评估。参数如下：\n\n```shell\n$ python eval.py -h\nusage: eval.py [-h] [--eval_file EVAL_FILE] \n               [--restore_path RESTORE_PATH]\n               [--anchor_path ANCHOR_PATH] \n               [--class_name_path CLASS_NAME_PATH]\n               [--batch_size BATCH_SIZE]\n               [--img_size [IMG_SIZE [IMG_SIZE ...]]]\n               [--num_threads NUM_THREADS]\n               [--prefetech_buffer PREFETECH_BUFFER]\n               [--nms_threshold NMS_THRESHOLD]\n               [--score_threshold SCORE_THRESHOLD] \n               [--nms_topk NMS_TOPK]\n```\n\n更多细节请参阅 `eval.py`。您需要根据实际情况自行设置参数。\n\n评估结果将包括损失、召回率、精确率、平均精度以及 mAP 等指标。\n\n若希望获得更高的 mAP，应将 score_threshold 设置为较小的值。\n\n### 9. 一些技巧\n\n以下是我实验中使用的一些训练技巧：\n\n(1) 应用两阶段训练策略或单阶段训练策略：\n\n**两阶段训练：**\n\n第一阶段：从 COCO 检查点中加载 `darknet53_body` 部分的权重，以较大的学习率（如 1e-3）训练 `yolov3_head`，直到损失降到较低水平。\n\n第二阶段：从第一阶段的检查点中恢复权重，然后以较小的学习率（如 1e-4 或更小）训练整个模型。在此阶段，如果使用 Adam 等优化器，请记得同时恢复优化器的状态参数。\n\n**单阶段训练：**\n\n只需加载除最后三个卷积层（Conv_6、Conv_14、Conv_22）之外的完整权重文件。在这种情况下，需注意可能出现损失值为 NaN 的情况。\n\n(2) 我在 `args.py` 中加入了许多有用的训练策略：\n\n- 学习率余弦退火（SGDR）\n- 多尺度训练\n- 标签平滑\n- Mixup 数据增强\n- Focal Loss\n\n这些策略都很有效，但并不意味着它们一定会提升模型性能。你需要根据自己的任务选择合适的策略。\n\n来自 gluon-cv 的这篇 [论文](https:\u002F\u002Farxiv.org\u002Fabs\u002F1902.04103) 证明了数据增强对 YOLO v3 至关重要，这与我的实验结果完全一致。然而，有些看似合理的数据增强策略反而可能导致性能下降。例如，在引入随机颜色抖动后，我在自己的数据集上的 mAP 明显降低。因此，我建议你在进行数据增强时格外谨慎。\n\n(4) 如果出现损失值为 NaN？可以尝试增大 warm-up epoch 的数量或减小学习率，多试几次。如果你对整个模型进行微调，使用 Adam 优化器有时可能会导致 NaN 值。此时可以尝试选择带有动量的优化器。\n\n### 10. 在 VOC 数据集上的微调\n\n我在 VOC 数据集上做了一个快速的训练实验。我在实验中使用的参数已放在 `misc\u002Fexperiments_on_voc\u002F` 文件夹下，供你参考。训练数据集为 VOC 2007 和 2012 的 trainval 集合，测试数据集为 VOC 2007 的测试集。\n\n最终，使用 416×416 输入图像，我得到了 87.54% 的测试 mAP（未使用 07 年的评估标准）。这还是未经深入调优的结果，你应该能够获得相似甚至更好的效果。\n\n我在 VOC 数据集上的预训练权重可在此下载：[这里](https:\u002F\u002Fdrive.google.com\u002Fdrive\u002Ffolders\u002F1ICKcJPozQOVRQnE1_vMn90nr7dejg0yW?usp=sharing)。\n\n### 11. 
待办事项\n\n[ ] 使用同步批归一化实现多 GPU 训练。\n\n[ ] 或许可以尝试 TensorFlow 2.0？\n\n-------\n\n### 致谢：\n\n在实现过程中，我参考了许多优秀的开源项目：\n\n[YunYang1994\u002Ftensorflow-yolov3](https:\u002F\u002Fgithub.com\u002FYunYang1994\u002Ftensorflow-yolov3)\n\n[qqwweee\u002Fkeras-yolo3](https:\u002F\u002Fgithub.com\u002Fqqwweee\u002Fkeras-yolo3)\n\n[eriklindernoren\u002FPyTorch-YOLOv3](https:\u002F\u002Fgithub.com\u002Feriklindernoren\u002FPyTorch-YOLOv3)\n\n[pjreddie\u002Fdarknet](https:\u002F\u002Fgithub.com\u002Fpjreddie\u002Fdarknet)\n\n[dmlc\u002Fgluon-cv](https:\u002F\u002Fgithub.com\u002Fdmlc\u002Fgluon-cv\u002Ftree\u002Fmaster\u002Fscripts\u002Fdetection\u002Fyolo)","# YOLOv3_TensorFlow 快速上手指南\n\n> **注意**：本仓库已停止维护，作者已转向 PyTorch。本文档仅用于参考学习纯 TensorFlow 实现的 YOLOv3 流程。\n\n## 1. 环境准备\n\n在开始之前，请确保您的开发环境满足以下要求：\n\n*   **操作系统**：Linux \u002F macOS \u002F Windows\n*   **Python 版本**：Python 2 或 Python 3\n*   **核心依赖**：\n    *   `tensorflow` >= 1.8.0 (需支持 `tf.data`)\n    *   `opencv-python`\n    *   `tqdm`\n\n**安装依赖命令：**\n\n```bash\npip install tensorflow>=1.8.0 opencv-python tqdm\n```\n\n> **提示**：国内用户建议使用清华源或阿里源加速安装，例如：\n> `pip install -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple tensorflow>=1.8.0 opencv-python tqdm`\n\n## 2. 安装与权重转换\n\n本项目需要将 Darknet 格式的预训练权重转换为 TensorFlow 格式。您可以选择手动转换或直接下载已转换好的文件。\n\n### 方案 A：手动转换（推荐）\n\n1.  下载 Darknet 预训练权重文件 (`yolov3.weights`)：\n    *   官方地址：[https:\u002F\u002Fpjreddie.com\u002Fmedia\u002Ffiles\u002Fyolov3.weights](https:\u002F\u002Fpjreddie.com\u002Fmedia\u002Ffiles\u002Fyolov3.weights)\n    *   *国内加速建议*：若下载缓慢，可搜索国内镜像站或使用代理。\n\n2.  将下载的 `yolov3.weights` 文件放置于 `.\u002Fdata\u002Fdarknet_weights\u002F` 目录下。\n\n3.  
运行转换脚本：\n\n```shell\npython convert_weight.py\n```\n\n转换完成后，TensorFlow checkpoint 文件将保存在同一目录中。\n\n### 方案 B：直接下载已转换权重\n\n您可以直接从以下地址下载作者提供的已转换文件，并放入 `.\u002Fdata\u002Fdarknet_weights\u002F` 目录：\n*   [Google Drive](https:\u002F\u002Fdrive.google.com\u002Fdrive\u002Ffolders\u002F1mXbNgNxyXPi7JNsnBaxEv1-nWr7SVoQt?usp=sharing)\n*   [Github Release](https:\u002F\u002Fgithub.com\u002Fwizyoung\u002FYOLOv3_TensorFlow\u002Freleases\u002F)\n\n## 3. 基本使用\n\n项目自带了演示图片和视频，位于 `.\u002Fdata\u002Fdemo_data\u002F` 目录。\n\n### 单张图片检测\n\n运行以下命令对单张图片进行目标检测：\n\n```shell\npython test_single_image.py .\u002Fdata\u002Fdemo_data\u002Fmessi.jpg\n```\n\n### 视频检测\n\n运行以下命令对视频文件进行目标检测：\n\n```shell\npython video_test.py .\u002Fdata\u002Fdemo_data\u002Fvideo.mp4\n```\n\n检测结果的图片将默认保存，您可以在终端查看处理进度及结果预览。\n\n---\n*如需训练自己的数据集或调整超参数，请参考 `args.py` 配置文件及原文档中的“训练”与“数据准备”章节。*","某智慧城市交通部门需要在现有 TensorFlow 架构下，快速部署一套能识别多种违规车辆（如未戴头盔摩托车、违停货车）的实时监控系统。\n\n### 没有 YOLOv3_TensorFlow 时\n- **框架迁移成本高**：团队熟悉 TensorFlow 生态，但主流 YOLOv3 实现多基于 PyTorch 或 Darknet，重新学习框架或搭建混合环境耗时费力。\n- **自定义训练流程缺失**：缺乏从数据标注格式转换、锚框（Anchor Boxes）自动聚类到模型评估的完整流水线，需手动编写大量底层代码。\n- **推理速度不达标**：自研的后处理算法在 GPU 上运行非极大值抑制（NMS）效率低下，导致视频流分析帧率不足，无法满足实时预警需求。\n- **预训练权重复用难**：难以将官方 COCO 数据集上的 Darknet 预训练权重直接转换为 TensorFlow 格式，从零训练收敛慢且精度低。\n\n### 使用 YOLOv3_TensorFlow 后\n- **原生无缝集成**：利用纯 TensorFlow 实现的完整管道，团队无需切换技术栈，直接复用现有基础设施进行开发。\n- **一站式训练支持**：通过内置的 K-means 算法自动优化先验锚框，并配合高效的数据管道，快速完成针对本地交通数据的微调训练。\n- **极致推理性能**：借助极快的 GPU 非极大值抑制实现，在 Titan XP 显卡上单图推理仅需约 23 毫秒，轻松支撑多路高清视频实时分析。\n- **权重平滑迁移**：使用自带的权重转换脚本，一键将 Darknet 预训练模型转为 TensorFlow Checkpoint，大幅缩短模型冷启动时间。\n\nYOLOv3_TensorFlow 
让团队在无需重构技术栈的前提下，以最低成本实现了高精度、实时的定制化目标检测落地。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fwizyoung_YOLOv3_TensorFlow_b6ea78fb.png","wizyoung","Wizyoung","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Fwizyoung_1e63930e.png",null,"Shanghai","happyyanghehe@gmail.com","https:\u002F\u002Fwizyoung.dogcraft.xyz","https:\u002F\u002Fgithub.com\u002Fwizyoung",[85],{"name":86,"color":87,"percentage":88},"Python","#3572A5",100,1553,574,"2026-04-02T08:38:22","MIT",4,"未说明","需要 NVIDIA GPU（演示中使用了 Titan X\u002FTitan XP），需支持 CUDA 的 TensorFlow 版本",{"notes":97,"python":98,"dependencies":99},"该项目已不再维护，作者建议转向使用 PyTorch。需要将 Darknet 预训练权重转换为 TensorFlow checkpoint 格式才能运行。训练时若出现 Loss 为 NaN，建议增加 warm_up_epoch 数量、减小学习率或尝试使用 Momentum 优化器代替 Adam。","2 或 3",[100,101,102],"tensorflow>=1.8.0","opencv-python","tqdm",[13,14],[105,106,107,108,109],"yolov3","tensorflow","object-detection","real-time","tensorflow-yolo","2026-03-27T02:49:30.150509","2026-04-06T08:40:47.176931",[113,118,123,128,133,138,143],{"id":114,"question_zh":115,"answer_zh":116,"source_url":117},16524,"如何准备和转换训练数据的标注格式？","项目已提供脚本从 VOC XML 文件生成所需的标注文件。请参考 misc\u002Fparse_voc_xml.py 脚本。此外，建议使用 labelimg 工具进行标注，具体的标注格式更新已在 README 文件中说明。之前的损失函数存在 bug 现已修复，请确保使用最新代码。","https:\u002F\u002Fgithub.com\u002Fwizyoung\u002FYOLOv3_TensorFlow\u002Fissues\u002F28",{"id":119,"question_zh":120,"answer_zh":121,"source_url":122},16525,"在 COCO 或 VOC 数据集上训练的预期 mAP 性能是多少？","在 VOC 数据集上的测试表现正常：(1) 加载 Darknet 权重时，416x416 输入可获得约 77 mAP；(2) 加载除最后三层卷积外的所有层权重并进行微调，可获得约 84 mAP。如果仅训练 YOLO-head 部分 20 个 epoch，mAP 可达 73.4%。如果在 COCO 上性能较低，请检查代码版本，相关问题已修复。","https:\u002F\u002Fgithub.com\u002Fwizyoung\u002FYOLOv3_TensorFlow\u002Fissues\u002F84",{"id":124,"question_zh":125,"answer_zh":126,"source_url":127},16526,"遇到 'IndexError: index out of bounds' 或数组越界错误怎么办？","这通常是因为数据不干净导致的。请检查你的标注 txt 
文件，确保每一行至少包含一个目标对象。同时检查标注的坐标是否出错，错误的坐标会导致计算特征图索引时发生数组越界。","https:\u002F\u002Fgithub.com\u002Fwizyoung\u002FYOLOv3_TensorFlow\u002Fissues\u002F64",{"id":129,"question_zh":130,"answer_zh":131,"source_url":132},16527,"训练过程中 Loss 变为 NaN 是什么原因？","Loss 变为 NaN 的问题通常与学习率预热（warm up）策略有关。该问题已在新的 warm up 代码中修复。如果你使用的是 Adam 优化器，请尝试将 --lr_type 参数设置为 \"fixed\"。","https:\u002F\u002Fgithub.com\u002Fwizyoung\u002FYOLOv3_TensorFlow\u002Fissues\u002F34",{"id":134,"question_zh":135,"answer_zh":136,"source_url":137},16528,"使用自定义数据集训练时，对输入图像尺寸有什么要求？","输入图像的尺寸必须能被 32 整除（例如 416x416, 320x320 等）。如果需要修改图像尺寸，请检查 args.py 配置文件中的相关设置。","https:\u002F\u002Fgithub.com\u002Fwizyoung\u002FYOLOv3_TensorFlow\u002Fissues\u002F133",{"id":139,"question_zh":140,"answer_zh":141,"source_url":142},16529,"数据加载速度很慢是什么原因？","旧版本代码在 train 和 val 切换时可能存在因 feedable iterator 导致的性能问题，该问题在新版本中已修复。如果当前版本仍然很慢，且排除了数据集本身的问题，可能是代码中存在其他 Bug 或配置不当。建议对比官方示例中的 tf.data 用法，或检查是否仅在单线程中加载了整个 batch 的图片。","https:\u002F\u002Fgithub.com\u002Fwizyoung\u002FYOLOv3_TensorFlow\u002Fissues\u002F65",{"id":144,"question_zh":145,"answer_zh":146,"source_url":147},16530,"遇到 'ValueError: all the input arrays must have same number of dimensions' 错误如何解决？","该错误发生在数据解析阶段（data_utils.py），通常是因为标注数据格式不正确，导致生成的 boxes 数组维度不一致。请检查你的标注文件，确保没有空行或格式错误的行，保证所有输入数组的维度相同。","https:\u002F\u002Fgithub.com\u002Fwizyoung\u002FYOLOv3_TensorFlow\u002Fissues\u002F58",[149],{"id":150,"version":151,"summary_zh":79,"released_at":152},98847,"v1.0","2019-01-20T05:31:25"]