[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-lucasjinreal--alfred":3,"tool-lucasjinreal--alfred":61},[4,18,26,36,44,53],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":17},4358,"openclaw","openclaw\u002Fopenclaw","OpenClaw 是一款专为个人打造的本地化 AI 助手，旨在让你在自己的设备上拥有完全可控的智能伙伴。它打破了传统 AI 助手局限于特定网页或应用的束缚，能够直接接入你日常使用的各类通讯渠道，包括微信、WhatsApp、Telegram、Discord、iMessage 等数十种平台。无论你在哪个聊天软件中发送消息，OpenClaw 都能即时响应，甚至支持在 macOS、iOS 和 Android 设备上进行语音交互，并提供实时的画布渲染功能供你操控。\n\n这款工具主要解决了用户对数据隐私、响应速度以及“始终在线”体验的需求。通过将 AI 部署在本地，用户无需依赖云端服务即可享受快速、私密的智能辅助，真正实现了“你的数据，你做主”。其独特的技术亮点在于强大的网关架构，将控制平面与核心助手分离，确保跨平台通信的流畅性与扩展性。\n\nOpenClaw 非常适合希望构建个性化工作流的技术爱好者、开发者，以及注重隐私保护且不愿被单一生态绑定的普通用户。只要具备基础的终端操作能力（支持 macOS、Linux 及 Windows WSL2），即可通过简单的命令行引导完成部署。如果你渴望拥有一个懂你",349277,3,"2026-04-06T06:32:30",[13,14,15,16],"Agent","开发框架","图像","数据工具","ready",{"id":19,"name":20,"github_repo":21,"description_zh":22,"stars":23,"difficulty_score":10,"last_commit_at":24,"category_tags":25,"status":17},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,"2026-04-05T11:01:52",[14,15,13],{"id":27,"name":28,"github_repo":29,"description_zh":30,"stars":31,"difficulty_score":32,"last_commit_at":33,"category_tags":34,"status":17},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",150037,2,"2026-04-10T23:33:47",[14,13,35],"语言模型",{"id":37,"name":38,"github_repo":39,"description_zh":40,"stars":41,"difficulty_score":32,"last_commit_at":42,"category_tags":43,"status":17},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",108322,"2026-04-10T11:39:34",[14,15,13],{"id":45,"name":46,"github_repo":47,"description_zh":48,"stars":49,"difficulty_score":32,"last_commit_at":50,"category_tags":51,"status":17},6121,"gemini-cli","google-gemini\u002Fgemini-cli","gemini-cli 是一款由谷歌推出的开源 AI 命令行工具，它将强大的 Gemini 大模型能力直接集成到用户的终端环境中。对于习惯在命令行工作的开发者而言，它提供了一条从输入提示词到获取模型响应的最短路径，无需切换窗口即可享受智能辅助。\n\n这款工具主要解决了开发过程中频繁上下文切换的痛点，让用户能在熟悉的终端界面内直接完成代码理解、生成、调试以及自动化运维任务。无论是查询大型代码库、根据草图生成应用，还是执行复杂的 Git 操作，gemini-cli 都能通过自然语言指令高效处理。\n\n它特别适合广大软件工程师、DevOps 
# alfred

alfred-py: a deep learning utility library for **humans**. More detail about the usage of the lib: https://zhuanlan.zhihu.com/p/341446046

alfred is a utility library and command-line assistant built for deep-learning developers, designed to take the tedium out of everyday research and engineering work. It is both a Python library you can `import alfred` from and a tool you can run directly in the terminal, addressing common pain points such as hard-to-visualize annotation data, fiddly format conversions, and drawn-out model-deployment workflows.

Whether you are a researcher or an algorithm engineer, alfred lets you quickly preview and evaluate annotation data in multiple formats (such as YOLO, VOC, and COCO) and produce high-quality visualizations with bounding boxes, masks, and even 3D point clouds, without writing any plotting code yourself. It also bundles frequently needed modules for video/image conversion, dataset splitting and statistics, logging, and TensorRT model deployment, improving efficiency across the entire pipeline from data preprocessing to model delivery.

Its distinguishing strength is wrapping complex visual rendering and data-processing logic behind minimal APIs and command-line calls, so you can focus on core algorithm work instead of reinventing wheels. If you work in computer vision, alfred can become a reliable efficiency boost.

<div align="center">

<img src="https://oss.gittoolsai.com/images/lucasjinreal_alfred_readme_5352224766dc.png">

<h1>alfred-py: Born For Deeplearning</h1>

[![PyPI downloads](https://oss.gittoolsai.com/images/lucasjinreal_alfred_readme_62eb696b1abc.png)](https://pepy.tech/project/alfred-py)
[![Github downloads](https://img.shields.io/github/downloads/jinfagang/alfred/total?color=blue&label=Downloads&logo=github&logoColor=lightgrey)](https://img.shields.io/github/downloads/zhiqwang/yolov5-rt-stack/total?color=blue&label=Downloads&logo=github&logoColor=lightgrey)

[![CI testing](https://github.com/zhiqwang/yolov5-rt-stack/actions/workflows/ci-test.yml/badge.svg)](https://github.com/zhiqwang/yolov5-rt-stack/actions/workflows/ci-test.yml)
[![Build & deploy docs](https://github.com/zhiqwang/yolov5-rt-stack/actions/workflows/gh-pages.yml/badge.svg)](https://github.com/zhiqwang/yolov5-rt-stack/actions/workflows/gh-pages.yml)
[![pre-commit.ci status](https://results.pre-commit.ci/badge/github/zhiqwang/yolov5-rt-stack/main.svg)](https://results.pre-commit.ci/latest/github/zhiqwang/yolov5-rt-stack/main)

[![license](https://img.shields.io/github/license/zhiqwang/yolov5-rt-stack?color=dfd)](LICENSE)
[![Slack](https://img.shields.io/badge/slack-chat-aff.svg?logo=slack)](https://join.slack.com/t/yolort/shared_invite/zt-mqwc7235-940aAh8IaKYeWclrJx10SA)
[![PRs Welcome](https://img.shields.io/badge/PRs-welcome-pink.svg)](https://github.com/jinfagang/alfred/issues?q=is%3Aopen+is%3Aissue+label%3A%22help+wanted%22)

</div>

*alfred-py* can be called from the terminal via `alfred` as a tool for deep-learning work. It also provides a large set of utility APIs to boost your daily efficiency: if you want to draw a box with a score and label, add logging to your Python application, or convert your model to a TensorRT engine, just `import alfred` and you can get whatever you want. More usage is covered in the instructions below.
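For instance, here is a minimal sketch of the logging convenience. The module path `alfred.utils.log` is an assumption not stated in this README, so verify it against your installed version:

```python
# A minimal sketch of the `import alfred` convenience for logging.
# ASSUMPTION: the logger is exposed as `alfred.utils.log.logger`; this
# path is not documented in the README above, so check it locally.
from alfred.utils.log import logger


def train_one_epoch(epoch: int) -> None:
    # The logger is assumed to behave like a preconfigured stdlib
    # logging.Logger, so the usual .info/.warning calls apply.
    logger.info('epoch %d started', epoch)
    logger.warning('loss did not decrease this epoch')


if __name__ == '__main__':
    train_one_epoch(0)
```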
## Functions Summary

Since many new users of alfred may not be familiar with it yet, here is a brief summary of its functions; see the updates below for details:

- Visualization: drawing boxes, masks, and keypoints is very simple, and even **3D** boxes on point clouds are supported;
- Command-line tools, such as viewing your annotation data in any format (YOLO, VOC, or COCO);
- Deployment: you can use alfred to deploy your TensorRT models;
- Common deep-learning utilities, such as `torch.device()` helpers;
- Renderers for your 3D models.

A picture visualized with alfred:

![alfred vis segmentation annotation in coco format](https://oss.gittoolsai.com/images/lucasjinreal_alfred_readme_2fae4f25f55a.png)

## Install

Installing **alfred** is very simple.

Requirements:

```
lxml [optional]
pycocotools [optional]
opencv-python [optional]
```

Then:

```shell
sudo pip3 install alfred-py
```

**alfred is both a lib and a tool; you can import its APIs, or call it directly inside your terminal.**

A glance at alfred: after installing the package above, you will have the `alfred` command:

- **`data`** module:

  ```shell
  # show VOC annotations
  alfred data vocview -i JPEGImages/ -l Annotations/
  # show COCO annotations
  alfred data cocoview -j annotations/instance_2017.json -i images/
  # show YOLO annotations
  alfred data yoloview -i images -l labels
  # show detection labels in txt format
  alfred data txtview -i images/ -l txts/
  # show more data commands
  alfred data -h

  # eval tools
  alfred data evalvoc -h
  ```

- **`cab`** module:

  ```shell
  # count files of a given type
  alfred cab count -d ./images -t jpg
  # split a txt file into train and test sets
  alfred cab split -f all.txt -r 0.9,0.1 -n train,val
  ```

- **`vision`** module:

  ```shell
  # extract video frames to images
  alfred vision extract -v video.mp4
  # combine images into a video
  alfred vision 2video -d images/
  ```

- **`-h`** to see more:

  ```shell
  usage: alfred [-h] [--version] {vision,text,scrap,cab,data} ...

  positional arguments:
    {vision,text,scrap,cab,data}
      vision              vision related commands.
      text                text related commands.
      scrap               scrap related commands.
      cab                 cabinet related commands.
      data                data related commands.

  optional arguments:
    -h, --help            show this help message and exit
    --version, -v         show version info.
  ```

  **Inside every child module you can call `-h` as well: `alfred text -h`.**

> If you are on Windows, you can install pycocotools via `pip install "git+https://github.com/philferriere/cocoapi.git#egg=pycocotools&subdirectory=PythonAPI"`. We have made pycocotools a dependency since we need the pycoco API.

## Updates

`alfred-py` has been updated for 3 years, and it will keep going!

- **2050-xxx**: *to be continued*;
- **2023.04.28**: Updated the 3D keypoints visualizer; you can now visualize Human3.6M keypoints in real time:
  ![](https://oss.gittoolsai.com/images/lucasjinreal_alfred_readme_2d74f3af26c3.gif)
  For details see `examples/demo_o3d_server.py`. The result shown is generated from MotionBERT.
- **2022.01.18**: alfred now ships a Mesh3D visualizer server based on Open3D:
  ```python
  from alfred.vis.mesh3d.o3dsocket import VisOpen3DSocket

  def main():
      server = VisOpen3DSocket()
      while True:
          server.update()


  if __name__ == "__main__":
      main()
  ```
  Then you just need to set up a client that sends keypoints3d to the server, and they will be visualized automatically. Here is what it looks like:
  ![](https://oss.gittoolsai.com/images/lucasjinreal_alfred_readme_e2a323ae1cf4.gif)

- **2021.12.22**: alfred now supports keypoint visualization; almost all datasets supported in mmpose are also supported by alfred:
  ```python
  from alfred.vis.image.pose import vis_pose_result

  # preds are poses, shaped (Bs, 17, 3) for COCO body
  vis_pose_result(ori_image, preds, radius=5, thickness=2, show=True)
  ```

- **2021.12.05**: You can now use `alfred.deploy.tensorrt` for TensorRT inference:
  ```python
  import time
  import numpy as np
  from alfred.deploy.tensorrt.common import do_inference_v2, allocate_buffers_v2, build_engine_onnx_v3

  def engine_infer(engine, context, inputs, outputs, bindings, stream, test_image):
      # preprocess_img is your own preprocessing function
      image_input, img_raw, _ = preprocess_img(test_image)
      print('input shape: ', image_input.shape)
      inputs[0].host = image_input.astype(np.float32).ravel()

      start = time.time()
      dets, labels, masks = do_inference_v2(context, bindings=bindings, inputs=inputs,
                                            outputs=outputs, stream=stream, input_tensor=image_input)
      return dets, labels, masks, img_raw

  img_f = 'demo/demo.jpg'
  onnx_f = 'model.onnx'  # path to your ONNX model
  with build_engine_onnx_v3(onnx_file_path=onnx_f) as engine:
      inputs, outputs, bindings, stream = allocate_buffers_v2(engine)
      # Contexts are used to perform inference.
      with engine.create_execution_context() as context:
          print(engine.get_binding_shape(0))
          print(engine.get_binding_shape(1))
          print(engine.get_binding_shape(2))
          INPUT_SHAPE = engine.get_binding_shape(0)[-2:]

          print(context.get_binding_shape(0))
          print(context.get_binding_shape(1))
          dets, labels, masks, img_raw = engine_infer(
              engine, context, inputs, outputs, bindings, stream, img_f)
  ```

- **2021.11.13**: Siren SDK support added!
  ```python
  from alfred.siren.handler import SirenClient
  from alfred.siren.models import ChatMessage, InvitationMessage

  siren = SirenClient('daybreak_account', 'password')


  @siren.on_received_invitation
  def on_received_invitation(msg: InvitationMessage):
      print('received invitation: ', msg.invitation)
      # directly agree to this invitation for robots


  @siren.on_received_chat_message
  def on_received_chat_msg(msg: ChatMessage):
      print('got new msg: ', msg.text)
      siren.publish_txt_msg('I got your message O(∩_∩)O哈哈~', msg.roomId)


  if __name__ == '__main__':
      siren.loop()
  ```
  With this Siren client you can easily set up a chatbot.

- **2021.06.24**: Added a useful command-line tool to **change your PyPI source easily!**
  ```
  alfred cab changesource
  ```
  After that, pip will use the Aliyun mirror by default!
- **2021.05.07**: Upgraded Open3D instructions:
  Open3D > 0.9.0 is no longer compatible with earlier alfred-py. Please upgrade Open3D; you can build Open3D from source:
  ```
  git clone --recursive https://github.com/intel-isl/Open3D.git
  cd Open3D && mkdir build && cd build
  sudo apt install libc++abi-8-dev
  sudo apt install libc++-8-dev
  cmake .. -DPYTHON_EXECUTABLE=/usr/bin/python3
  ```
  **On Ubuntu 16.04 and below, all my attempts to build from source failed**, so please use open3d==0.9.0 with alfred-py there.
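  Before pinning a version, it may help to check which Open3D you already have. A tiny check like the following uses only the standard `open3d` package attribute and is shown as an illustrative sketch:

  ```python
  # Print the installed Open3D version so you know whether you are on
  # the 0.9.0 line that older alfred-py expects, or a newer release.
  import open3d

  print(open3d.__version__)
  ```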
- **2021.04.01**: A unified evaluator has been added. For many users, writing evaluation code ends up deeply coupled with the project. With alfred's help, you can run an evaluation in any project by writing about 8 lines of code. For example, if your dataset format is YOLO, do this:
  ```python
  import cv2
  import numpy as np
  # the quickstart below imports the evaluator from alfred.eval.yolo
  from alfred.eval.yolo import YoloEvaluator

  def infer_func(img_f):
      image = cv2.imread(img_f)
      results = config_dict['model'].predict_for_single_image(
          image, aug_pipeline=simple_widerface_val_pipeline,
          classification_threshold=0.89, nms_threshold=0.6, class_agnostic=True)
      if len(results) > 0:
          results = np.array(results)[:, [2, 3, 4, 5, 0, 1]]
          # xywh to xyxy
          results[:, 2] += results[:, 0]
          results[:, 3] += results[:, 1]
      return results

  if __name__ == '__main__':
      conf_thr = 0.4
      iou_thr = 0.5

      imgs_root = 'data/hand/images'
      labels_root = 'data/hand/labels'

      yolo_parser = YoloEvaluator(imgs_root=imgs_root, labels_root=labels_root, infer_func=infer_func)
      yolo_parser.eval_precisely()
  ```
  Then your evaluation results are produced automatically: recall, precision, and mAP are all printed out. More dataset formats are on the way.
- **2021.03.10**:
  Added the `ImageSourceIter` class. When you write a demo that must handle any kind of input, such as an image file, a folder, or a video file, you can use `ImageSourceIter`:

  ```python
  import cv2
  from alfred.utils.file_io import ImageSourceIter

  # data_f can be an image file, an image folder, or a video
  src_iter = ImageSourceIter(ops.test_path)
  while True:
      itm = next(src_iter)
      # itm can be a cv array or a file path
      if isinstance(itm, str):
          itm = cv2.imread(itm)
      res = detect_for_pose(itm, det_model)
      cv2.imshow('res', itm)
      if src_iter.video_mode:
          cv2.waitKey(1)
      else:
          cv2.waitKey(0)
  ```
  This saves you from writing any file-glob handling or OpenCV video-reading code yourself. *Note that the returned item can be either a cv array or a file path.*
- **2021.01.25**:
  **alfred** now supports self-defined visualization of COCO-format annotations (without pycocotools):

  ![image-20210125194313093](https://oss.gittoolsai.com/images/lucasjinreal_alfred_readme_b0893cef79bf.png)

  If your dataset is in COCO format but visualizes incorrectly, please file an issue. Thank you!
- **2020.09.27**:
  YOLO and VOC can now be converted to each other, so with alfred you can:
  - convert yolo2voc;
  - convert voc2yolo;
  - convert voc2coco;
  - convert coco2voc.

  With these, you can convert between any of the labeling formats.
- **2020.09.08**: After a long quiet spell, **alfred** got some updates:
  It now provides a `coco2yolo` capability. Run this command to convert your data to YOLO format:

  ```
  alfred data coco2yolo -i images/ -j annotations/val_split_2020.json
  ```

  All you need to provide is your image root path and your JSON file. The results are generated into a `yolo` folder under the images directory or in its parent directory.

  After that (once you have your `yolo` folder), you can visualize the conversion result to check that it is correct:

  ```
  alfred data yoloview -i images/ -l labels/
  ```

  ![image-20200908164952171](https://oss.gittoolsai.com/images/lucasjinreal_alfred_readme_3ebf37b12509.png)

- **2020.07.27**: After a long quiet spell, **alfred** finally got some updates:

  ![image-20200727163938094](https://oss.gittoolsai.com/images/lucasjinreal_alfred_readme_eb9cef424346.png)

  You can now use alfred to draw Chinese characters on images without broken-encoding glyphs.

  ```python
  from alfred.utils.cv_wrapper import put_cn_txt_on_img

  img = put_cn_txt_on_img(img, spt[-1], [points[0][0], points[0][1]-25], 1.0, (255, 255, 255))
  ```

  Also, you can now **merge** two VOC datasets!
  This is helpful when you have two datasets and want to merge them into a single one:

  ```
  alfred data mergevoc -h
  ```

  You can see more prompts there.

- **2020.03.08**: Several new modules added in **alfred**:

  ```
  alfred.utils.file_io: file I/O utils for common purposes
  alfred.dl.torch.env: seed and env setup in PyTorch (same API as detectron2)
  alfred.dl.torch.distribute: utils for distributed training with PyTorch
  ```

- **2020.03.04**: Added some **evaluation tools** to calculate mAP for object-detection model performance; they are useful and can visualize results:
  ![](https://oss.gittoolsai.com/images/lucasjinreal_alfred_readme_605cb8216b7a.png)
  ![](https://oss.gittoolsai.com/images/lucasjinreal_alfred_readme_f2a8153139df.png)

  Usage is also quite simple:

  ```
  alfred data evalvoc -g ground-truth -d detection-results -im images
  ```

  where `-g` is your ground-truth dir (containing xmls or txts), `-d` is your detection-results dir, and `-im` is your images folder. You only need to save all your detection results into txts, one txt per image, formatted like this:

  ```shell
  bottle 0.14981 80 1 295 500
  bus 0.12601 36 13 404 316
  horse 0.12526 430 117 500 307
  pottedplant 0.14585 212 78 292 118
  tvmonitor 0.070565 388 89 500 196
  ```

- **2020.02.27**: A `license` module was added to alfred. Say you want to apply a license to your project or update it, simply:

  ```shell
  alfred cab license -o 'MANA' -n 'YoloV3' -u 'manaai.cn'
  ```
  You can find more detailed usage with `alfred cab license -h`.

- **2020-02-11**: Open3D changed its API. alfred has been updated for the new Open3D, so you can simply use the latest Open3D and run `python3 examples/draw_3d_pointcloud.py` to see this:

  ![](https://oss.gittoolsai.com/images/lucasjinreal_alfred_readme_e5a5a665ee76.png)

- **2020-02-10**: **alfred** now supports Windows (experimental);
- **2020-02-01**: **Stay strong, Wuhan!** *alfred* fixed a Windows pip-install problem related to the 'gbk' encoding;
- **2020-01-14**: Added the cabinet module, plus some utils under the data module;
- **2019-07-18**: The 1000-class ImageNet labelmap was added. Call it like this:

  ```python
  from alfred.vis.image.get_dataset_label_map import imagenet_labelmap

  # coco, voc, and cityscapes labelmaps were all added as well
  from alfred.vis.image.get_dataset_label_map import coco_labelmap
  from alfred.vis.image.get_dataset_label_map import voc_labelmap
  from alfred.vis.image.get_dataset_label_map import cityscapes_labelmap
  ```
- **2019-07-13**: Added a VOC check module to the command line; you can now visualize your VOC-format detection data like this:

  ```
  alfred data voc_view -i ./images -l labels/
  ```
- **2019-05-17**: Added **open3d** as a library to visualize 3D point clouds in Python. With some simple preparation you can now draw 3D boxes right on lidar points and show them just like with OpenCV!

  ![](https://oss.gittoolsai.com/images/lucasjinreal_alfred_readme_b4d917a83548.png)

  You can achieve this using only **alfred-py** and **open3d**!

  Example code lives under `examples/draw_3d_pointcloud.py`. **The code is updated to the latest Open3D API!**

- **2019-05-10**: A minor but **really useful** update we call **mute_tf**. Want to silence TensorFlow's noisy logging? Simply do this:

  ```python
  from alfred.dl.tf.common import mute_tf
  mute_tf()
  import tensorflow as tf
  ```
  Then the logging messages are gone...

- **2019-05-07**: Added some protos; you can now parse a TensorFlow COCO labelmap using alfred:
  ```python
  from alfred.protos.labelmap_pb2 import LabelMap
  from google.protobuf import text_format

  with open('coco.prototxt', 'r') as f:
      lm = LabelMap()
      lm = text_format.Merge(str(f.read()), lm)
      names_list = [i.display_name for i in lm.item]
      print(names_list)
  ```

- **2019-04-25**: Added KITTI fusion; you can now project 3D labels onto images. More fusion utils, such as for the *nuScenes* dataset, will follow.

  The KITTI fusion utilities convert `camera link 3d points` to image pixels, and convert `lidar link 3d points` to image pixels. Roughly, the APIs look like this:

  ```python
  import cv2
  import numpy as np

  # convert lidar predictions to image pixels
  from alfred.fusion.kitti_fusion import LidarCamCalibData, \
      load_pc_from_file, lidar_pts_to_cam0_frame, lidar_pt_to_cam0_frame
  from alfred.fusion.common import draw_3d_box, compute_3d_box_lidar_coords

  # lidar predictions, each row is x,y,z,h,w,l,rotation_y
  res = [[4.481686, 5.147319, -1.0229858, 1.5728549, 3.646751, 1.5121397, 1.5486346],
         [-2.5172017, 5.0262384, -1.0679419, 1.6241353, 4.0445814, 1.4938312, 1.620804],
         [1.1783253, -2.9209857, -0.9852259, 1.5852798, 3.7360613, 1.4671413, 1.5811548]]

  for p in res:
      xyz = np.array([p[:3]])
      c2d = lidar_pt_to_cam0_frame(xyz, frame_calib)
      if c2d is not None:
          cv2.circle(img, (int(c2d[0]), int(c2d[1])), 3, (0, 255, 255), -1)
      hwl = np.array([p[3:6]])
      r_y = [p[6]]
      pts3d = compute_3d_box_lidar_coords(xyz, hwl, angles=r_y, origin=(0.5, 0.5, 0.5), axis=2)

      pts2d = []
      for pt in pts3d[0]:
          coords = lidar_pt_to_cam0_frame(pt, frame_calib)
          if coords is not None:
              pts2d.append(coords[:2])
      pts2d = np.array(pts2d)
      draw_3d_box(pts2d, img)
  ```

  And you can see something like this:

  **Note**: `compute_3d_box_lidar_coords` is for lidar predictions, `compute_3d_box_cam_coords` is for KITTI labels, **because KITTI labels are based on camera coordinates!**

  <p align="center">
  <img src="https://oss.gittoolsai.com/images/lucasjinreal_alfred_readme_8dad73b98558.png" />
  </p>

  **Since many users asked how to reproduce this result, check the demo file under `examples/draw_3d_box.py`.**
- **2019-01-25**: A network visualization tool for **PyTorch** has been added! What does it look like? It simply prints out *every layer of the network with its output shape*, which is really helpful for visualizing your models!

  ```
  ➜  mask_yolo3 git:(master) ✗ python3 tests.py
  ----------------------------------------------------------------
          Layer (type)               Output Shape         Param #
  ================================================================
              Conv2d-1         [-1, 64, 224, 224]           1,792
                ReLU-2         [-1, 64, 224, 224]               0
                .........
             Linear-35                 [-1, 4096]      16,781,312
               ReLU-36                 [-1, 4096]               0
            Dropout-37                 [-1, 4096]               0
             Linear-38                 [-1, 1000]       4,097,000
  ================================================================
  Total params: 138,357,544
  Trainable params: 138,357,544
  Non-trainable params: 0
  ----------------------------------------------------------------
  Input size (MB): 0.19
  Forward/backward pass size (MB): 218.59
  Params size (MB): 527.79
  Estimated Total Size (MB): 746.57
  ----------------------------------------------------------------
  ```

  All you need to do is:

  ```python
  from alfred.dl.torch.model_summary import summary
  from alfred.dl.torch.common import device

  from torchvision.models import vgg16

  vgg = vgg16(pretrained=True)
  vgg.to(device)
  summary(vgg, input_size=[224, 224])
  ```

  Suppose you feed in a (224, 224) image: you will get the output above, and you can change to any other size to see how the output changes.
  (1-channel images are not supported yet.)

- **2018-12-7**: Added an extensible class for quickly writing an image detection or segmentation demo.

  If you want a demo that **runs inference on an image, a video, or straight from a webcam**, you can now do it the standard alfred way (this snippet assumes the usual torch/torchvision/PIL/cv2/numpy imports plus alfred's `ImageInferEngine`, `device`, and `label_to_color_image` are in scope):

  ```python
  class ENetDemo(ImageInferEngine):

      def __init__(self, f, model_path):
          super(ENetDemo, self).__init__(f=f)

          self.target_size = (512, 1024)
          self.model_path = model_path
          self.num_classes = 20

          self.image_transform = transforms.Compose(
              [transforms.Resize(self.target_size),
               transforms.ToTensor()])

          self._init_model()

      def _init_model(self):
          self.model = ENet(self.num_classes).to(device)
          checkpoint = torch.load(self.model_path)
          self.model.load_state_dict(checkpoint['state_dict'])
          print('Model loaded!')

      def solve_a_image(self, img):
          images = Variable(self.image_transform(Image.fromarray(img)).to(device).unsqueeze(0))
          predictions = self.model(images)
          _, predictions = torch.max(predictions.data, 1)
          prediction = predictions.cpu().numpy()[0] - 1
          return prediction

      def vis_result(self, img, net_out):
          mask_color = np.asarray(label_to_color_image(net_out, 'cityscapes'), dtype=np.uint8)
          frame = cv2.resize(img, (self.target_size[1], self.target_size[0]))
          # mask_color = cv2.resize(mask_color, (frame.shape[1], frame.shape[0]))
          res = cv2.addWeighted(frame, 0.5, mask_color, 0.7, 1)
          return res


  if __name__ == '__main__':
      v_f = ''
      enet_seg = ENetDemo(f=v_f, model_path='save/ENet_cityscapes_mine.pth')
      enet_seg.run()
  ```

  After that, you can run inference directly on a video. This usage can be found in the repo that uses **alfred**: http://github.com/jinfagang/pt_enet

  <p align="center"><img src="https://oss.gittoolsai.com/images/lucasjinreal_alfred_readme_149cee012295.gif"/></p>

- **2018-11-6**: I am glad to announce that alfred 2.0 is released! 😄⛽️👏👏 A quick look at what has been updated:

  ```
  # 2 new modules, fusion and vis
  from alfred.fusion import fusion_utils
  ```

  The `fusion` module contains many useful sensor-fusion helpers, such as projecting a lidar point cloud onto an image.

- **2018-08-01**: Fixed the video-combine function not handling frame order correctly; an ordering algorithm now keeps the video sequence right. Some draw-bbox functions were also added to the package; a usage sketch follows below.
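  A minimal sketch of drawing detection boxes with alfred. The exact helper and its signature are not documented in this README, so treat the module path, `visualize_det_cv2`, and the detection-row layout `[class_id, score, x1, y1, x2, y2]` as assumptions to verify against your installed version:

  ```python
  import cv2
  import numpy as np

  # ASSUMED module path and signature -- check your alfred version
  from alfred.vis.image.det import visualize_det_cv2

  img = cv2.imread('demo/demo.jpg')
  # one detection row: [class_id, score, x1, y1, x2, y2] (assumed layout)
  dets = np.array([[0, 0.92, 40, 60, 220, 300]])
  classes = ['person']

  # draws each box with its class label and score onto the image
  img = visualize_det_cv2(img, dets, classes=classes, thresh=0.5)
  cv2.imshow('dets', img)
  cv2.waitKey(0)
  ```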
- **2018-03-16**: Slight update: **alfred** can now combine an image sequence back into the original video!
  Simply do:

  ```shell
  # alfred binary executable program
  alfred vision 2video -d ./video_images
  ```

## Capable

**alfred** is both a library and a command-line tool. It can do things like:

```shell
# extract images from a video
alfred vision extract -v video.mp4
# combine image sequences into a video
alfred vision 2video -d /path/to/images
# get faces from images
alfred vision getface -d /path/contains/images/
```

Just try it out!

## Copyright

**alfred** was built by *Lucas Jin* with ❤️. Stars and PRs are welcome. If you have any questions, you can reach me via WeChat: `jintianiloveu`. This code is released under the GPL-3 license.
# Alfred-py Quickstart Guide

Alfred-py is a command-line tool and Python library built for deep learning. It offers powerful visualization (2D/3D boxes, masks, keypoints), dataset format conversion (YOLO/VOC/COCO), TensorRT deployment helpers, and everyday development utilities.

## Environment

*   **OS**: Linux (Ubuntu) recommended; Windows users should mind how some dependencies are installed.
*   **Python**: Python 3.x
*   **Prerequisites** (optional, install as needed):
    *   `lxml`
    *   `pycocotools` (for COCO-format handling)
    *   `opencv-python` (for image and video processing)
    *   `Open3D` (for 3D visualization; version >= 0.9.0 recommended)

> **Note**: On Windows, installing `pycocotools` may require:
> ```shell
> pip install "git+https://github.com/philferriere/cocoapi.git#egg=pycocotools&subdirectory=PythonAPI"
> ```

## Installation

A China-based mirror such as Aliyun can speed up installation.

```shell
# install directly with pip
sudo pip3 install alfred-py -i https://mirrors.aliyun.com/pypi/simple/

# or configure a permanent mirror first, then install
alfred cab changesource
sudo pip3 install alfred-py
```

After installation you can use the `alfred` command directly in the terminal, or `import alfred` in Python code.

## Basic Usage

Alfred works both as a CLI tool and as an importable library.

### 1. Command-Line Tool (CLI)

See the help for all available modules:
```shell
alfred -h
```

**Common scenarios:**

*   **View annotation data** (VOC, COCO, YOLO, etc.):
    ```shell
    # view VOC-format annotations
    alfred data vocview -i JPEGImages/ -l Annotations/

    # view COCO-format annotations
    alfred data cocoview -j annotations/instance_2017.json -i images/

    # view YOLO-format annotations
    alfred data yoloview -i images -l labels
    ```

*   **Dataset tools**:
    ```shell
    # count files of a given type
    alfred cab count -d ./images -t jpg

    # split a txt list into train and val sets (9:1)
    alfred cab split -f all.txt -r 0.9,0.1 -n train,val
    ```

*   **Video processing**:
    ```shell
    # extract frames from a video
    alfred vision extract -v video.mp4

    # combine images into a video
    alfred vision 2video -d images/
    ```
### 2. Python API Calls

**Keypoint (pose) visualization:**
```python
from alfred.vis.image.pose import vis_pose_result

# preds are pose predictions, e.g. COCO format (Bs, 17, 3)
vis_pose_result(ori_image, preds, radius=5, thickness=2, show=True)
```

**Unified evaluator:**
```python
from alfred.eval.yolo import YoloEvaluator

def infer_func(img_f):
    # your model's inference logic goes here; return detection results
    # ...
    return results

if __name__ == '__main__':
    imgs_root = 'data/hand/images'
    labels_root = 'data/hand/labels'

    # initialize the evaluator and run
    yolo_parser = YoloEvaluator(imgs_root=imgs_root, labels_root=labels_root, infer_func=infer_func)
    yolo_parser.eval_precisely()
```

**3D visualization server (based on Open3D):**
```python
from alfred.vis.mesh3d.o3dsocket import VisOpen3DSocket

def main():
    server = VisOpen3DSocket()
    while True:
        server.update()

if __name__ == "__main__":
    main()
```
*(After starting the script above, a client can send 3D keypoint data for real-time rendering.)*

# Use Case

A computer-vision engineer is working on a traffic-scene dataset with mixed VOC, COCO, and YOLO annotations, and urgently needs data cleaning, visual verification, and a train/val split.

### Without alfred
- **Reinventing visualization code**: every look at boxes or masks in a different format means writing throwaway OpenCV drawing code, and 3D point-cloud boxes are hard to render at all.
- **Painful format conversion and inspection**: with mixed annotation files and no unified CLI preview, you must parse XML or JSON by hand to check data quality, which is very slow.
- **Fragmented data pipeline**: video frame extraction, image-to-video assembly, and ratio-based train/val splitting each require hunting for separate scripts or heavyweight dependencies.
- **High deployment barrier**: converting a trained PyTorch model into a TensorRT engine lacks standardized helpers, forcing error-prone fiddling with low-level APIs.

### With alfred
- **One command for rich visualization**: `alfred data vocview` or `cocoview` instantly renders detection boxes with labels and scores, and even 3D point-cloud boxes, with zero plotting code.
- **A unified interface over formats**: alfred's data module switches seamlessly among YOLO, VOC, and COCO annotations, quickly surfacing bad samples and cutting verification time from hours to minutes.
- **An integrated data pipeline**: `alfred vision extract` pulls frames in one step, and `alfred cab split` generates train lists by ratio, consolidating scattered chores into one smooth CLI workflow.
- **Smooth hand-off to deployment**: alfred's deployment toolchain eases converting models into TensorRT engines, lowering the engineering cost of inference acceleration.

Through its dual form of CLI tool plus library, alfred standardizes the fiddly data-processing, visualization, and deployment steps of deep-learning development, letting engineers focus on core algorithms rather than repetitive plumbing; the sketch below strings the commands above into one script.
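A minimal sketch of that CLI pipeline, driven from Python via `subprocess`. The alfred commands are the ones shown in this document; the file names (`traffic.mp4`, `all.txt`) are illustrative placeholders:

```python
import subprocess

# Chain the alfred CLI steps described above. Commands come straight
# from this document; file names are illustrative placeholders.
steps = [
    # 1. extract frames from the raw footage
    ['alfred', 'vision', 'extract', '-v', 'traffic.mp4'],
    # 2. split the file list into train/val at a 9:1 ratio
    ['alfred', 'cab', 'split', '-f', 'all.txt', '-r', '0.9,0.1', '-n', 'train,val'],
    # 3. eyeball the VOC annotations before training
    ['alfred', 'data', 'vocview', '-i', 'JPEGImages/', '-l', 'Annotations/'],
]

for cmd in steps:
    print('running:', ' '.join(cmd))
    subprocess.run(cmd, check=True)
```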
work at @tencent","Google","Sanfancisco","nicholasjela@gmail.com",null,"https:\u002F\u002Fgithub.com\u002Flucasjinreal",[82,86,90],{"name":83,"color":84,"percentage":85},"Python","#3572A5",99.8,{"name":87,"color":88,"percentage":89},"Shell","#89e051",0.1,{"name":91,"color":92,"percentage":93},"Batchfile","#C1F12E",0,911,136,"2026-03-25T06:52:21","GPL-3.0",1,"Linux, Windows","未说明 (工具包含 TensorRT 部署功能，隐含需要 NVIDIA GPU，但 README 未指定具体型号、显存或 CUDA 版本)","未说明",{"notes":103,"python":104,"dependencies":105},"该工具既是库也是命令行工具。在 Windows 上安装 pycocotools 需要使用特定的 git 地址。对于 3D 可视化功能 (Open3D)，在 Ubuntu 16.04 及以下版本从源码构建可能会失败，建议使用 Open3D==0.9.0 或更高版本系统。支持多种数据格式 (YOLO, VOC, COCO) 的查看与转换，以及 TensorRT 模型部署。","3.x (README 示例中使用 python3，且提到 Ubuntu 16.04 构建失败，隐含需要较新的 Python 3 环境)",[106,107,108,109,110,111],"lxml (可选)","pycocotools (可选\u002F核心依赖)","opencv-python (可选)","Open3D (推荐 0.9.0，用于 3D 可视化)","numpy","torch (隐含，用于深度学习工具)",[113,14,15,114],"其他","视频",[116,117,118,119,120,121,122,64,123,124],"deeplearning","video-combiner","pytorch","segmentation","network","3d-object-detection","sensor-fusion","pycocotools","alfred-py","2026-03-27T02:49:30.150509","2026-04-11T15:14:45.207967",[128,133,138,143,148,153,158,163],{"id":129,"question_zh":130,"answer_zh":131,"source_url":132},29495,"运行 alfred 时遇到 ImportError: cannot import name 'device' 错误怎么办？","请确保已卸载旧版本的 alfred-py，然后尝试升级安装最新版本。执行命令：`pip install -U alfred-py`。如果问题依旧，可以从 GitHub 安装主分支版本。安装后可通过 `alfred -v` 验证版本。","https:\u002F\u002Fgithub.com\u002Flucasjinreal\u002Falfred\u002Fissues\u002F27",{"id":134,"question_zh":135,"answer_zh":136,"source_url":137},29496,"如何从源代码安装 alfred（例如因代理问题无法使用 pip）？","可以直接使用 setup.py 进行构建和开发模式安装。在项目根目录下执行命令：`python setup.py build develop`。","https:\u002F\u002Fgithub.com\u002Flucasjinreal\u002Falfred\u002Fissues\u002F26",{"id":139,"question_zh":140,"answer_zh":141,"source_url":142},29497,"运行 demo_o3d_server.py 时提示找不到 default_viscfg.yml 文件如何解决？","该问题通常是由于缺少 assets 资源文件导致的。建议检查安装是否完整，或者参考示例代码 `examples\u002Fdraw_3d_pointcloud.py` 了解用法。维护者提到后续会提供专门的依赖和示例代码，当前可尝试手动将缺失的配置文件复制到 site-packages 对应目录，或直接从 GitHub 源码运行示例。","https:\u002F\u002Fgithub.com\u002Flucasjinreal\u002Falfred\u002Fissues\u002F25",{"id":144,"question_zh":145,"answer_zh":146,"source_url":147},29498,"如何退出 alfred 数据查看器（data viewer）？","目前可以通过在终端按下 `Ctrl+C` 中断进程，然后在图像窗口按任意键退出。维护者表示后续会添加专门的退出命令捕获功能。","https:\u002F\u002Fgithub.com\u002Flucasjinreal\u002Falfred\u002Fissues\u002F14",{"id":149,"question_zh":150,"answer_zh":151,"source_url":152},29499,"如何在自己的点云上可视化边界框（bounding boxes），需要哪些数据格式？","可以使用 `draw_3d_pointcloud.py` 脚本。对于边界框，通常需要提供中心坐标 (x, y, z)、尺寸 (h, w, l) 以及绕 Y 轴的旋转角度 (r_y)。建议阅读 `examples\u002Fdraw_3d_pointcloud.py` 下的代码以获取具体的数据格式要求和实现细节。","https:\u002F\u002Fgithub.com\u002Flucasjinreal\u002Falfred\u002Fissues\u002F6",{"id":154,"question_zh":155,"answer_zh":156,"source_url":157},29500,"可视化点云时出现 GLX 错误（GLXBadFBConfig 或 GLX version 1.3 is required）怎么办？","这通常是因为显示服务器配置不正确或 GLX 版本过低。Open3D 要求 GLX 版本至少为 1.3。请检查并升级系统的 GLX 版本（可通过 `glxinfo` 查看当前版本），并确保正确设置了显示服务器环境变量。同时，建议参考 Open3D 官方主页的版本要求说明。","https:\u002F\u002Fgithub.com\u002Flucasjinreal\u002Falfred\u002Fissues\u002F3",{"id":159,"question_zh":160,"answer_zh":161,"source_url":162},29501,"验证 COCO 数据集时发现大部分 mask 形状异常（从 labelme 转换而来），是什么原因？","这通常是数据转换代码的问题，而非工具本身。建议检查转换过程中的点坐标顺序，可以尝试将点单独提取并排序，或者直接使用 OpenCV 的 polygon 绘制函数来验证点的正确性。确认是转换逻辑导致后，修正转换脚本即可解决。","https:\u002F\u002Fgithub.com\u002Flucasjinreal\u002Falfred\u002Fissues\u002F19",{"id":164,"question_zh":165,"answer_zh":166,"source_url":167},29502,"pip 安装时遇到版本号不一致（inconsistent 
**Q: I get a GLX error (GLXBadFBConfig, or "GLX version 1.3 is required") when visualizing point clouds. What now?**

A: This usually means the display server is misconfigured or the GLX version is too old; Open3D requires GLX 1.3 or later. Check and upgrade your system's GLX version (inspect the current one with `glxinfo`) and make sure the display-server environment variables are set correctly. Also consult the version requirements on the Open3D homepage. (Source: https://github.com/lucasjinreal/alfred/issues/3)

**Q: When verifying a COCO dataset converted from labelme, most masks look misshapen. Why?**

A: This is usually a bug in the conversion code rather than in the tool. Check the point-coordinate ordering during conversion; try extracting and sorting the points separately, or draw them directly with OpenCV's polygon functions to verify them. Once you confirm the conversion logic is at fault, fixing the conversion script resolves it. (Source: https://github.com/lucasjinreal/alfred/issues/19)

**Q: pip install fails with an "inconsistent version" error. How do I handle it?**

A: The package metadata version on PyPI does not match the version in the file name. The maintainer usually fixes this quickly in a new release; wait for the updated version (2.11.1 or later) and retry `pip install`. (Source: https://github.com/lucasjinreal/alfred/issues/29)

# Releases

**latest** (2022-01-05): New features: inference via TensorRT; keypoint visualization.