learnopencv

AI summary (generated automatically by AI, for reference only)

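learnopencv is an open-source code repository focused on computer vision, deep learning, and artificial intelligence. It turns complex technical theory into runnable C++ and Python examples. It accompanies the technical articles on the LearnOpenCV.com blog and provides code implementations ranging from basic concepts to cutting-edge applications.

AI techniques iterate quickly and papers are hard to reproduce. learnopencv offers validated, working implementations that help users bridge the gap between understanding the principles and writing the code. Demo projects cover real-time object detection (such as the recent YOLO26 and RF-DETR), multi-object tracking, face privacy protection, large-model deployment (such as Jetson edge computing and vLLM serving), 3D reconstruction (SAM 3D, Gaussian splatting), and retrieval-augmented generation (RAG).

The resource is particularly suited to developers, algorithm researchers, and AI learners. For programmers who want to improve their engineering skills, it provides production-grade reference architectures; for researchers, it is a testbed for quickly validating new ideas; for beginners, it is a step-by-step best-practice guide to OpenCV and deep learning frameworks. With content spanning traditional image processing through large-model applications, learnopencv aims to make AI development approachable for everyone.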

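Use case

A smart-retail team is building a real-time customer behavior analysis system. It must accurately track the movement paths of multiple people on edge devices and automatically blur faces to comply with privacy regulations.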

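Without learnopencv

  • Developers have to reproduce complex algorithms such as YOLO26 or RF-DETR from scratch and spend weeks debugging compatibility between the instance segmentation and real-time detection code.
  • For multi-object tracking, there are no mature integration examples for Roboflow trackers, so person identities switch frequently and data accuracy is very low.
  • To meet privacy requirements, face-blurring logic based on YuNet has to be written by hand, which is slow and makes low-latency inference on edge devices such as Jetson hard to achieve.
  • When deployment bottlenecks appear (such as time spent in NMS post-processing), there is no officially optimized NMS-free inference approach to follow, and the system frame rate falls short of real time.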

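With learnopencv

  • The team runs the validated YOLO26 and RF-DETR demo code from the repository directly and gets pixel-level instance segmentation working within half a day, which shortens the development cycle considerably.
  • The team reuses the ready-made Roboflow tracker integration scripts to obtain stable multi-object trajectory tracking without having to deal with the underlying algorithm implementation.
  • The team uses the included OpenCV YuNet face-blur module to deploy real-time privacy protection quickly, so that footage is anonymized at the moment of capture.
  • Drawing on the best practices for YOLO26 NMS-free inference and LLM deployment on Jetson, the team reduces system latency by 40% and runs smoothly on edge devices.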
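The NMS post-processing step named as a bottleneck above can be made concrete with a small sketch. The following is a minimal pure-Python illustration of classic greedy non-maximum suppression, not code taken from the repository. The function names `iou` and `nms`, the `(x1, y1, x2, y2)` box format, and the default threshold of 0.5 are assumptions chosen for illustration.

```python
from typing import List, Sequence, Tuple

# Assumed box format for this sketch: (x1, y1, x2, y2) with x2 > x1 and y2 > y1.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: Sequence[Box], scores: Sequence[float],
        iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    demo_boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    demo_scores = [0.9, 0.8, 0.7]
    print(nms(demo_boxes, demo_scores))
```

In the worst case this loop compares every candidate against every kept box, so its cost grows quadratically with the number of detections. That is one reason an approach that avoids the step altogether can help latency on edge devices.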

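By providing production-grade code examples and practical guides to recent algorithms, learnopencv frees developers from reinventing the wheel so they can focus on innovating and delivering business logic.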
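The face anonymization step in the scenario above can be illustrated without any detector. The sketch below shows only the pixelation operation, on a grayscale image stored as nested Python lists, and assumes the face box has already been found (for example by a detector such as YuNet). The function name `pixelate_region`, the `(x, y, w, h)` box format, and the block size are hypothetical choices for this illustration and are not the repository's implementation.

```python
from typing import List, Tuple

Image = List[List[int]]          # grayscale image as rows of pixel values
Rect = Tuple[int, int, int, int]  # assumed box format: (x, y, width, height)


def pixelate_region(image: Image, box: Rect, block: int = 2) -> Image:
    """Return a copy of `image` with `box` replaced by block-averaged pixels.

    Assumes `box` lies fully inside the image and `block` >= 1.
    """
    x, y, w, h = box
    out = [row[:] for row in image]  # copy so the input is left untouched
    for by in range(y, y + h, block):
        for bx in range(x, x + w, block):
            rows = range(by, min(by + block, y + h))
            cols = range(bx, min(bx + block, x + w))
            values = [image[r][c] for r in rows for c in cols]
            mean = sum(values) // len(values)
            # Overwrite every pixel of the block with the block average.
            for r in rows:
                for c in cols:
                    out[r][c] = mean
    return out


if __name__ == "__main__":
    img = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
    for row in pixelate_region(img, (0, 0, 2, 2), block=2):
        print(row)
```

A larger `block` value removes more detail. In a real pipeline the same idea is usually applied per frame to each detected face box, typically with array operations rather than Python loops.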

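Runtime requirements

Operating system
  • Not specified
GPU
  • Some projects (such as LLM serving, VLMs, and 3D reconstruction) require an NVIDIA GPU. VRAM requirements depend on the model (8GB+ is generally recommended); the CUDA version is not specified.
  • Some projects support edge devices (such as Jetson Nano/Orin) or Arduino.
Memory
  • Not specified (16GB+ is generally recommended for training or inference with large models)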

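Dependencies
Notes: This repository is a collection of many independent tutorials and demos rather than a single tool, so environment requirements differ widely between subdirectories (such as YOLO26, vLLM deployment, and SAM-3). Some projects are designed specifically for edge devices (NVIDIA Jetson) or microcontrollers (Arduino). Before running a specific project, check the requirements in its subdirectory or the associated blog post.
Python: not specified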
opencv-python
torch
transformers
ultralytics
vllm
langgraph
accelerate
roboflow
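Because requirements differ per subdirectory, it can help to check which of the packages listed above are present in the current environment before running a demo. The following is a minimal sketch using only the standard library. The mapping from package names to import names (for example `opencv-python` to `cv2`) and the function name `check_dependencies` are assumptions made for this illustration; the repository does not ship this script.

```python
import importlib.util
from typing import Dict, Mapping

# Assumed mapping from the package names listed above to their import names.
IMPORT_NAMES: Dict[str, str] = {
    "opencv-python": "cv2",
    "torch": "torch",
    "transformers": "transformers",
    "ultralytics": "ultralytics",
    "vllm": "vllm",
    "langgraph": "langgraph",
    "accelerate": "accelerate",
    "roboflow": "roboflow",
}


def check_dependencies(packages: Mapping[str, str] = IMPORT_NAMES) -> Dict[str, bool]:
    """Report whether each package's top-level module can be located.

    Uses find_spec, so nothing is actually imported.
    """
    return {
        package: importlib.util.find_spec(module) is not None
        for package, module in packages.items()
    }


if __name__ == "__main__":
    for package, available in check_dependencies().items():
        status = "found" if available else "missing"
        print(f"{package}: {status}")
```

A given demo usually needs only a subset of these packages, so a "missing" entry matters only if the chosen subdirectory depends on it.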

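Quick start

LearnOpenCV

This repository contains code for the computer vision, deep learning, and AI research articles shared on our blog, LearnOpenCV.com.

Want to become an expert in AI? AI Courses by OpenCV is a great place to start.

List of Blog Posts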

Blog Post Code
RF-DETR Segmentation: Real-Time Detection & Instance Segmentation Guide Code
YOLO26 Instance Segmentation: Pixel-Perfect AI at Real-Time Speed Code
Multi-Object Tracking with Roboflow Trackers and OpenCV Code
Real-Time Face Blur and Pixelation with OpenCV YuNet Code
Breaking the Bottleneck: Achieving Native NMS-Free Inference with YOLO26 Code
YOLOv26: An Object Detector Built for Real-Time Deployment Code
Beyond Transformers: A Deep Dive into HOPE
Serving SGLang: Launch a Production-Style Server
Deployment on Edge: LLM Serving on Jetson using vLLM Code
Nested Learning: Is Deep Learning Architecture an Illusion?
How to Build a GitHub Code-Analyser Agent for Developer Productivity Code
The Existential Problems in LLM Serving
SAM 3D: Foundation Model for Single-Image 3D Reconstruction
SAM-3: What’s New, How It Works, and Why It Matters Code
Image-GS: Adaptive Image Reconstruction using 2D Gaussians Code
Ultimate Guide to Vector Databases and RAG Pipeline Code
What Makes DeepSeek OCR So Powerful Code
2D Gaussian Splatting: Geometrically Accurate Radiance Field Reconstruction Code
TRM: Tiny Recursive Models Code
Deploying ML Models on Arduino: From Blink to Think Code
VideoRAG: Redefining Long-Context Video Comprehension
AI Agent in Action: Automating Desktop Tasks with VLMs Code
Top VLM Evaluation Metrics for Optimal Performance Analysis Code
Getting Started with VLM on Jetson Nano Code
VLM on Edge: Worth the Hype or Just a Novelty? Code
AnomalyCLIP : Harnessing CLIP for Weakly-Supervised Video Anomaly Recognition Code
AI_for_Video_Understanding_From_Content_Moderation_to_Summarization Code
Video-RAG: Training-Free Retrieval for Long-Video LVLMs Code
Object Detection and Spatial Understanding with VLMs ft. Qwen2.5-VL Code
LangGraph: Building Self-Correcting RAG Agent for Code Generation Code
Inside Sinusoidal Position Embeddings: A Sense of Order Code
Inside RoPE: Rotary Magic into Position Embeddings Code
SimLingo-Vision-Language-Action-Model-for-Autonomous-Driving Code
FineTuning Gemma 3n for Medical VQA on ROCOv2 Code
SmolLM3 Blueprint: SOTA 3B-Parameter LLM
LangGraph-A-Visual-Automation-and-Summarization-Pipeline Code
Fine-Tuning AnomalyCLIP: Class-Agnostic Zero-Shot Anomaly Detection Code
SigLIP 2: DeepMind’s Multilingual Vision-Language Model
MedGemma: Google’s Medico VLM for Clinical QA, Imaging, and More Code
Nanonets-OCR-s: Enabling Rich, Structured Markdown for Document Understanding
Optimizing VJEPA-2: Tackling Latency & Context in Real-Time Video Classification Scripts Code
V-JEPA 2: Meta’s Breakthrough in AI for the Physical World Code
NVIDIA Cosmos Reason1: Video Understanding Code
GR00T N1.5 Explained
LLaVA Code
SmolVLA: Affordable & Efficient VLA Robotics on Consumer GPUs Code
Fine-Tuning Grounding DINO: Open-Vocabulary Object Detection Code
Getting Started with Qwen3 – The Thinking Expert Code
Inside the GPU: A Comprehensive Guide to Modern Graphics Architecture
Distributed Parallel Training: PyTorch Code
MONAI: The Definitive Framework for Medical Imaging Powered by PyTorch
SANA-Sprint: The One-Step Revolution in High-Quality AI Image Synthesis
FramePack-Video-Diffusion-but-feels-like-Image-Diffusion Code
Model Weights File Formats in Machine Learning
Unsloth: A Guide from Basics to Fine-Tuning Vision Models Code
Iterative Closest Point (ICP) Algorithm Explained Code
MedSAM2 Explained: One Prompt to Segment Anything in Medical Imaging Code
Batch Normalization and Dropout as Regularizers
DINOv2_by_Meta_A_Self-Supervised_foundational_vision_model Code
Beginner's Guide to Embedding Models
MASt3R-SLAM: Real-Time Dense SLAM with 3D Reconstruction Priors Code
Google's A2A Protocol
Nvidia SANA : Faster Image Generation
Fine-tuning RF-DETR Code
Qwen2.5-Omni: A Real-Time Multimodal AI
Vision Language Action Models: Robotic Control Code
Fine-Tuning Gemma 3 VLM using QLoRA for LaTeX-OCR Dataset Code
ComfyUI Code
Gemma-3: A Comprehensive Introduction
YOLO11 on Raspberry Pi: Optimizing Object Detection for Edge Devices Code
VGGT: Visual Geometry Grounded Transformer – For Dense 3D Reconstruction Code
DDIM: The Faster, Improved Version of DDPM for Efficient AI Image Generation Code
Introduction to Model Context Protocol (MCP)
MASt3R and MASt3R-SfM Explanation: Image Matching and 3D Reconstruction Code
MatAnyone Explained: Consistent Memory for Better Video Matting Code
GraphRAG: For Medical Document Analysis Code
OmniParser: Vision Based GUI Agent
Fine-Tuning-YOLOv12-Comparison-With-YOLOv11-And-YOLOv7-Based-Darknet Code
FineTuning RetinaNet for Wildlife Detection with PyTorch: A Step-by-Step Tutorial Code
DUSt3R: Geometric 3D Vision Made Easy : Explanation and Results Code
YOLOv12: Attention Meets Speed Code
Video Generation: A Diffusion based approach Code
Agentic AI: A Comprehensive Introduction Code
Finetuning SAM2 for Leaf Disease Segmentation Code
Object Insertion in Gaussian Splatting: Paper Explained and Training Code for MCMC and Bilateral Grid Code
Depth Pro: Sharp Monocular Metric Depth Code
Fine-tuning-Stable-Diffusion-3_5-UI-images Code
SimSiam: Streamlining SSL with Stop-Gradient Mechanism Code
Image Captioning using ResNet and LSTM Code
Molmo VLM: Paper Explanation and Demo Code
3D Gaussian Splatting Paper Explanation: Training Custom Datasets with NeRF-Studio Gsplats Code
FLUX Image Generation: Experimenting with the Parameters Code
Contrastive-Learning-SimCLR-and-BYOL(With Code Example) Code
The Annotated NeRF : Training on Custom Dataset from Scratch in Pytorch Code
Stable Diffusion 3 and 3.5: Paper Explanation and Inference Code
LightRAG - Legal Document Analysis Code
NVIDIA AI Summit 2024 – India Overview
Introduction to Speech to Speech: Most Efficient Form of NLP Code
Training 3D U-Net for Brain Tumor Segmentation (BraTS-GLI) Code
DETR: Overview and Inference Code
YOLO11: Faster Than You Can Imagine! Code
Exploring DINO: Self-Supervised Transformers for Road Segmentation with ResNet50 and U-Net Code
Sapiens: Foundation for Human Vision Models by Meta Code
Multimodal RAG with ColPali and Gemini Code
Building Autonomous Vehicle in Carla: Path Following with PID Control & ROS 2 Code
Handwritten Text Recognition using OCR Code
Training CLIP from Scratch for Image Retrieval Code
Introduction to LiDAR SLAM: LOAM and LeGO-LOAM Paper and Code Explanation with ROS 2 Implementation Code
Recommendation System using Vector Search Code
Fine Tuning Whisper on Custom Dataset Code
SAM 2 – Promptable Segmentation for Images and Videos Code
Introduction to Feature Matching Using Neural Networks Code
Introduction to ROS2 (Robot Operating System 2): Tutorial on ROS2 Working, DDS, ROS1 RMW, Topics, Nodes, Publisher, Subscriber in Python Code
CVPR 2024 Research Papers - Part- 2 Code
CVPR 2024: An Overview and Key Papers Code
Object Detection on Edge Device - OAK-D-Lite Code
Fine-Tuning YOLOv10 Models on Custom Dataset Code
ROS2 and Carla Setup Guide for Ubuntu 22.04
Understanding Visual SLAM for Robotics Perception: Building Monocular SLAM from Scratch in Python Code
Enhancing Image Segmentation using U2-Net: An Approach to Efficient Background Removal Code
YOLOv10: The Dual-Head OG of YOLO Series Code
Fine-tuning Faster R-CNN on Sea Rescue Dataset Code
Mastering Recommendation System: A Complete Guide
Automatic Speech Recognition with Diarization : Speech-to-Text Code
Building MobileViT Image Classification Model from Scratch In Keras 3 Code
SDXL Inpainting: Fusing Image Inpainting with Stable Diffusion Code
YOLOv9 Instance Segmentation on Medical Dataset Code
A Comprehensive Guide to Robotics
Integrating Gradio with OpenCV DNN Code
Fine-Tuning YOLOv9 on Custom Dataset Code
Dreambooth using Diffusers Code
Introduction to Hugging Face Diffusers Code
Introduction to Ultralytics Explorer API Code
YOLOv9: Advancing the YOLO Legacy Code
Fine-Tuning LLMs using PEFT Code
Depth Anything: Accelerating Monocular Depth Perception Code
Deciphering LLMs: From Transformers to Quantization Code
YOLO Loss Function Part 2: GFL and VFL Loss Code
YOLOv8-Object-Tracking-and-Counting-with-OpenCV Code
Stereo Vision in ADAS: Pioneering Depth Perception Beyond LiDAR Code
YOLO Loss Function Part 1: SIoU and Focal Loss Code
Moving Object Detection with OpenCV Code
Integrating ADAS with Keypoint Feature Pyramid Network for 3D LiDAR Object Detection Code
Mastering All YOLO Models from YOLOv1 to YOLO-NAS: Papers Explained (2024)
GradCAM: Enhancing Neural Network Interpretability in the Realm of Explainable AI Code
Text Summarization using T5: Fine-Tuning and Building Gradio App Code
3D LiDAR Visualization using Open3D: A Case Study on 2D KITTI Depth Frames for Autonomous Driving Code
Fine Tuning T5: Text2Text Transfer Transformer for Building a Stack Overflow Tag Generator Code
SegFormer 🤗 : Fine-Tuning for Improved Lane Detection in Autonomous Vehicles Code
Fine-Tuning BERT using Hugging Face Transformers Code
YOLO-NAS Pose Code
BERT: Bidirectional Encoder Representations from Transformers Code
Comparing KerasCV YOLOv8 Models on the Global Wheat Data 2020 Code
Top 5 AI papers of September 2023
Empowering Drivers: The Rise and Role of Advanced Driver Assistance Systems
Semantic Segmentation using KerasCV DeepLabv3+ Code
Object Detection using KerasCV YOLOv8 Code
Fine-tuning YOLOv8 Pose Models for Animal Pose Estimation Code
Top 5 AI papers of August 2023
Fine Tuning TrOCR - Training TrOCR to Recognize Curved Text Code
TrOCR - Getting Started with Transformer Based OCR Code
Facial Emotion Recognition Code
Object Keypoint Similarity in Keypoint Detection Code
Real Time Deep SORT with Torchvision Detectors Code
Top 5 AI papers of July 2023
Medical Image Segmentation Code
Weighted Boxes Fusion in Object Detection: A Comparison with Non-Maximum Suppression Code
Medical Multi-label Classification with PyTorch & Lightning Code
Getting Started with PaddlePaddle: Exploring Object Detection, Segmentation, and Keypoints Code
Drone Programming With Computer Vision A Beginners Guide Code
How to Build a Pip Installable Package & Upload to PyPi
IoU Loss Functions for Faster & More Accurate Object Detection
Exploring Slicing Aided Hyper Inference for Small Object Detection Code
Advancements in Face Recognition Models, Toolkit and Datasets
Train YOLO NAS on Custom Dataset Code
Train YOLOv8 Instance Segmentation on Custom Data Code
YOLO-NAS: New Object Detection Model Beats YOLOv6 & YOLOv8 Code
Segment Anything – A Foundation Model for Image Segmentation Code
Build a Video to Slides Converter Application using the Power of Background Estimation and Frame Differencing in OpenCV Code
A Closer Look at CVAT: Perfecting Your Annotations YouTube
ControlNet - Achieving Superior Image Generation Results Code
InstructPix2Pix - Edit Images With Prompts Code
NVIDIA Spring GTC 2023 Day 4: Ending on a High Note with Top Moments from the Finale!
NVIDIA Spring GTC 2023 Day 3: Digging deeper into Deep Learning, Semiconductors & more!
NVIDIA Spring GTC 2023 Day 2: Jensen’s keynote & the iPhone moment of AI is here!
NVIDIA Spring GTC 2023 Day 1: Welcome to the future!
NVIDIA GTC Spring 2023 Curtain Raiser
Stable Diffusion - A New Paradigm in Generative AI Code
OpenCV Face Recognition – Does Face Recognition Work on AI-Generated Images?
An In-Depth Guide to Denoising Diffusion Probabilistic Models – From Theory to Implementation Code
From Pixels to Paintings: The Rise of Midjourney AI Art
Mastering DALL·E 2: A Breakthrough in AI Art Generation
Top 10 AI Art Generation Tools using Diffusion Models
The Future of Image Recognition is Here: PyTorch Vision Transformer Code
Understanding Attention Mechanism in Transformer Neural Networks Code
Deploying a Deep Learning Model using Hugging Face Spaces and Gradio Code
Train YOLOv8 on Custom Dataset – A Complete Tutorial Code
Introduction to Diffusion Models for Image Generation Code
Building An Automated Image Annotation Tool: PyOpenAnnotate Code
Ultralytics YOLOv8: State-of-the-Art YOLO Models Code
Getting Started with YOLOv5 Instance Segmentation Code
The Ultimate Guide To DeepLabv3 - With PyTorch Inference Code
AI Fitness Trainer using MediaPipe: Squats Analysis Code
YoloR - Paper Explanation & Inference -An In-Depth Analysis Code
Roadmap To an Automated Image Annotation Tool Using Python Code
Performance Comparison of YOLO Object Detection Models – An Intensive Study
FCOS - Anchor Free Object Detection Explained Code
YOLOv6 Custom Dataset Training – Underwater Trash Detection Code
What is EXIF Data in Images? Code
t-SNE: T-Distributed Stochastic Neighbor Embedding Explained Code
CenterNet: Objects as Points – Anchor-free Object Detection Explained Code
YOLOv7 Pose vs MediaPipe in Human Pose Estimation Code
YOLOv6 Object Detection – Paper Explanation and Inference Code
YOLOX Object Detector Paper Explanation and Custom Training Code
Driver Drowsiness Detection Using Mediapipe In Python Code
GTC 2022 Big Bang AI announcements: Everything you need to know
NVIDIA GTC 2022 : The most important AI event this Fall
Object Tracking and Reidentification with FairMOT Code
What is Face Detection? – The Ultimate Guide for 2022 Code
Document Scanner: Custom Semantic Segmentation using PyTorch-DeepLabV3 Code
Fine Tuning YOLOv7 on Custom Dataset Code
Center Stage for Zoom Calls using MediaPipe Code
Mean Average Precision (mAP) in Object Detection
YOLOv7 Object Detection Paper Explanation and Inference Code
Pothole Detection using YOLOv4 and Darknet Code
Automatic Document Scanner using OpenCV Code
Demystifying GPU architectures for deep learning: Part 2 Code
Demystifying GPU Architectures For Deep Learning Code
Intersection-over-Union(IoU)-in-Object-Detection-and-Segmentation Code
Understanding Multiple Object Tracking using DeepSORT Code
Optical Character Recognition using PaddleOCR Code
Gesture Control in Zoom Call using Mediapipe Code
A Deep Dive into Tensorflow Model Optimization Code
DepthAI Pipeline Overview: Creating a Complex Pipeline Code
TensorFlow Lite Model Maker: Create Models for On-Device Machine Learning Code
TensorFlow Lite: Model Optimization for On Device Machine Learning Code
Object detection with depth measurement using pre-trained models with OAK-D Code
Custom Object Detection Training using YOLOv5 Code
Object Detection using Yolov5 and OpenCV DNN (C++/Python) Code
Create Snapchat/Instagram filters using Mediapipe Code
AUTOSAR C++ compliant deep learning inference with TensorRT Code
NVIDIA GTC 2022 Day 4 Highlights: Meet the new Jetson Orin
NVIDIA GTC 2022 Day 3 Highlights: Deep Dive into Hopper architecture
NVIDIA GTC 2022 Day 2 Highlights: Jensen’s Keynote
NVIDIA GTC 2022 Day 1 Highlights: Brilliant Start
Automatic License Plate Recognition using Python Code
Building a Poor Body Posture Detection and Alert System using MediaPipe Code
Introduction to MediaPipe Code
Disparity Estimation using Deep Learning Code
How to build Chrome Dino game bot using OpenCV Feature Matching Code
Top 10 Sources to Find Computer Vision and AI Models
Multi-Attribute and Graph-based Object Detection
Plastic Waste Detection with Deep Learning Code
Ensemble Deep Learning-based Defect Classification and Detection in SEM Images
Building Industrial embedded deep learning inference pipelines with TensorRT Code
Transfer Learning for Medical Images
Stereo Vision and Depth Estimation using OpenCV AI Kit Code
Introduction to OpenCV AI Kit and DepthAI Code
WeChat QR Code Scanner in OpenCV Code
AI behind the Diwali 2021 ‘Not just a Cadbury ad’
Model Selection and Benchmarking with Modelplace.AI Model Zoo
Real-time style transfer in a zoom meeting Code
Introduction to OpenVino Deep Learning Workbench Code
Running OpenVino Models on Intel Integrated GPU Code
Post Training Quantization with OpenVino Toolkit Code
Introduction to Intel OpenVINO Toolkit
Human Action Recognition using Detectron2 and LSTM Code
Pix2Pix:Image-to-Image Translation in PyTorch & TensorFlow Code
Conditional GAN (cGAN) in PyTorch and TensorFlow Code
Deep Convolutional GAN in PyTorch and TensorFlow Code
Introduction to Generative Adversarial Networks (GANs) Code
Human Pose Estimation using Keypoint RCNN in PyTorch Code
Non Maximum Suppression: Theory and Implementation in PyTorch Code
MRNet – The Multi-Task Approach Code
Generative and Discriminative Models
Playing Chrome's T-Rex Game with Facial Gestures Code
Variational Autoencoder in TensorFlow Code
Autoencoder in TensorFlow 2: Beginner’s Guide Code
Deep Learning with OpenCV DNN Module: A Definitive Guide Code
Depth perception using stereo camera (Python/C++) Code
Contour Detection using OpenCV (Python/C++) Code
Super Resolution in OpenCV Code
Improving Illumination in Night Time Images Code
Video Classification and Human Activity Recognition Code
How to use OpenCV DNN Module with Nvidia GPU on Windows Code
How to use OpenCV DNN Module with NVIDIA GPUs Code
Code OpenCV in Visual Studio
Install OpenCV on Windows – C++ / Python Code
Face Recognition with ArcFace Code
Background Subtraction with OpenCV and BGS Libraries Code
RAFT: Optical Flow estimation using Deep Learning Code
Making A Low-Cost Stereo Camera Using OpenCV Code
Optical Flow in OpenCV (C++/Python) Code
Introduction to Epipolar Geometry and Stereo Vision Code
Classification With Localization: Convert any keras Classifier to a Detector Code
Photoshop Filters in OpenCV Code
Tetris Game using OpenCV Python Code
Image Classification with OpenCV for Android Code
Image Classification with OpenCV Java Code
PyTorch to Tensorflow Model Conversion Code
Snake Game with OpenCV Python Code
Stanford MRNet Challenge: Classifying Knee MRIs Code
Experiment Logging with TensorBoard and wandb Code
Understanding Lens Distortion Code
Image Matting with state-of-the-art Method “F, B, Alpha Matting” Code
Bag Of Tricks For Image Classification - Let's check if it is working or not Code
Getting Started with OpenCV CUDA Module Code
Training a Custom Object Detector with DLIB & Making Gesture Controlled Applications Code
How To Run Inference Using TensorRT C++ API Code
Using Facial Landmarks for Overlaying Faces with Medical Masks Code
Tensorboard with PyTorch Lightning Code
Otsu's Thresholding with OpenCV Code
PyTorch-to-CoreML-model-conversion Code
Playing Rock, Paper, Scissors with AI Code
CNN Receptive Field Computation Using Backprop with TensorFlow Code
CNN Fully Convolutional Image Classification with TensorFlow Code
How to convert a model from PyTorch to TensorRT and speed up inference Code
Efficient image loading Code
Graph Convolutional Networks: Model Relations In Data Code
Getting Started with Federated Learning with PyTorch and PySyft Code
Creating a Virtual Pen & Eraser Code
Getting Started with PyTorch Lightning Code
Multi-Label Image Classification with PyTorch: Image Tagging Code
Funny Mirrors Using OpenCV Code
t-SNE for ResNet feature visualization Code
Multi-Label Image Classification with Pytorch Code
CNN Receptive Field Computation Using Backprop Code
CNN Receptive Field Computation Using Backprop with TensorFlow Code
Augmented Reality using AruCo Markers in OpenCV(C++ and Python) Code
Fully Convolutional Image Classification on Arbitrary Sized Image Code
Camera Calibration using OpenCV Code
Geometry of Image Formation
Ensuring Training Reproducibility in Pytorch
Gaze Tracking
Simple Background Estimation in Videos Using OpenCV Code
Applications of Foreground-Background separation with Semantic Segmentation Code
EfficientNet: Theory + Code Code
PyTorch for Beginners: Mask R-CNN Instance Segmentation with PyTorch Code
PyTorch for Beginners: Faster R-CNN Object Detection with PyTorch Code
PyTorch for Beginners: Semantic Segmentation using torchvision Code
PyTorch for Beginners: Comparison of pre-trained models for Image Classification Code
PyTorch for Beginners: Basics Code
PyTorch Model Inference using ONNX and Caffe2 Code
Image Classification Using Transfer Learning in PyTorch Code
Hangman: Creating games in OpenCV Code
Image Inpainting with OpenCV (C++/Python) Code
Hough Transform with OpenCV (C++/Python) Code
Xeus-Cling: Run C++ code in Jupyter Notebook Code
Gender & Age Classification using OpenCV Deep Learning ( C++/Python ) Code
Invisibility Cloak using Color Detection and Segmentation with OpenCV Code
Fast Image Downloader for Open Images V4 (Python) Code
Deep Learning based Text Detection Using OpenCV (C++/Python) Code
Video Stabilization Using Point Feature Matching in OpenCV Code
Training YOLOv3 : Deep Learning based Custom Object Detector Code
Using OpenVINO with OpenCV Code
Duplicate Search on Quora Dataset Code
Shape Matching using Hu Moments (C++/Python) Code
Install OpenCV 4 on CentOS (C++ and Python) Code
Install OpenCV 3.4.4 on CentOS (C++ and Python) Code
Install OpenCV 3.4.4 on Red Hat (C++ and Python) Code
Install OpenCV 4 on Red Hat (C++ and Python) Code
Install OpenCV 4 on macOS (C++ and Python) Code
Install OpenCV 3.4.4 on Raspberry Pi Code
Install OpenCV 3.4.4 on macOS (C++ and Python) Code
OpenCV QR Code Scanner (C++ and Python) Code
Install OpenCV 3.4.4 on Windows (C++ and Python) Code
Install OpenCV 3.4.4 on Ubuntu 16.04 (C++ and Python) Code
Install OpenCV 3.4.4 on Ubuntu 18.04 (C++ and Python) Code
Universal Sentence Encoder Code
Install OpenCV 4 on Raspberry Pi Code
Install OpenCV 4 on Windows (C++ and Python) Code
Face Detection – Dlib, OpenCV, and Deep Learning ( C++ / Python ) Code
Hand Keypoint Detection using Deep Learning and OpenCV Code
Deep learning based Object Detection and Instance Segmentation using Mask R-CNN in OpenCV (Python / C++) Code
Install OpenCV 4 on Ubuntu 18.04 (C++ and Python) Code
Install OpenCV 4 on Ubuntu 16.04 (C++ and Python) Code
Multi-Person Pose Estimation in OpenCV using OpenPose Code
Heatmap for Logo Detection using OpenCV (Python) Code
Deep Learning based Object Detection using YOLOv3 with OpenCV ( Python / C++ ) Code
Convex Hull using OpenCV in Python and C++ Code
MultiTracker : Multiple Object Tracking using OpenCV (C++/Python) Code
Convolutional Neural Network based Image Colorization using OpenCV Code
SVM using scikit-learn Code
GOTURN: Deep Learning based Object Tracking Code
Find the Center of a Blob (Centroid) using OpenCV (C++/Python) Code
Support Vector Machines (SVM) Code
Batch Normalization in Deep Networks Code
Deep Learning based Character Classification using Synthetic Dataset Code
Image Quality Assessment : BRISQUE Code
Understanding AlexNet
Deep Learning based Text Recognition (OCR) using Tesseract and OpenCV Code
Deep Learning based Human Pose Estimation using OpenCV ( C++ / Python ) Code
Number of Parameters and Tensor Sizes in a Convolutional Neural Network (CNN)
How to convert your OpenCV C++ code into a Python module Code
CV4Faces : Best Project Award 2018
Facemark : Facial Landmark Detection using OpenCV Code
Image Alignment (Feature Based) using OpenCV (C++/Python) Code
Barcode and QR code Scanner using ZBar and OpenCV Code
Keras Tutorial : Fine-tuning using pre-trained models Code
OpenCV Transparent API
Face Reconstruction using EigenFaces (C++/Python) Code
Eigenface using OpenCV (C++/Python) Code
Principal Component Analysis
Keras Tutorial : Transfer Learning using pre-trained models Code
Keras Tutorial : Using pre-trained Imagenet models Code
Technical Aspects of a Digital SLR
Using Harry Potter interactive wand with OpenCV to create magic
Install OpenCV 3 and Dlib on Windows ( Python only )
Image Classification using Convolutional Neural Networks in Keras Code
Understanding Autoencoders using Tensorflow (Python) Code
Best Project Award : Computer Vision for Faces
Understanding Activation Functions in Deep Learning
Image Classification using Feedforward Neural Network in Keras Code
Exposure Fusion using OpenCV (C++/Python) Code
Understanding Feedforward Neural Networks
High Dynamic Range (HDR) Imaging using OpenCV (C++/Python) Code
Deep learning using Keras – The Basics Code
Selective Search for Object Detection (C++ / Python) Code
Installing Deep Learning Frameworks on Ubuntu with CUDA support
Parallel Pixel Access in OpenCV using forEach Code
cvui: A GUI lib built on top of OpenCV drawing primitives Code
Install Dlib on Windows
Install Dlib on Ubuntu
Install OpenCV3 on Ubuntu
Read, Write and Display a video using OpenCV ( C++/ Python ) Code
Install Dlib on MacOS
Install OpenCV 3 on MacOS
Install OpenCV 3 on Windows
Get OpenCV Build Information ( getBuildInformation )
Color spaces in OpenCV (C++ / Python) Code
Neural Networks : A 30,000 Feet View for Beginners
Alpha Blending using OpenCV (C++ / Python) Code
User stories : How readers of this blog are applying their knowledge to build applications
How to select a bounding box ( ROI ) in OpenCV (C++/Python) ?
Automatic Red Eye Remover using OpenCV (C++ / Python) Code
Bias-Variance Tradeoff in Machine Learning
Embedded Computer Vision: Which device should you choose?
Object Tracking using OpenCV (C++/Python) Code
Handwritten Digits Classification : An OpenCV ( C++ / Python ) Tutorial Code
Training a better Haar and LBP cascade based Eye Detector using OpenCV
Deep Learning Book Gift Recipients
Minified OpenCV Haar and LBP Cascades Code
Deep Learning Book Gift
Histogram of Oriented Gradients
Image Recognition and Object Detection : Part 1
Head Pose Estimation using OpenCV and Dlib Code
Live CV : A Computer Vision Coding Application
Approximate Focal Length for Webcams and Cell Phone Cameras
Configuring Qt for OpenCV on OSX Code
Rotation Matrix To Euler Angles Code
Speeding up Dlib’s Facial Landmark Detector
Warp one triangle to another using OpenCV ( C++ / Python ) Code
Average Face : OpenCV ( C++ / Python ) Tutorial Code
Face Swap using OpenCV ( C++ / Python ) Code
Face Morph Using OpenCV — C++ / Python Code
Deep Learning Example using NVIDIA DIGITS 3 on EC2
NVIDIA DIGITS 3 on EC2
Homography Examples using OpenCV ( Python / C ++ ) Code
Filling holes in an image using OpenCV ( Python / C++ ) Code
How to find frame rate or frames per second (fps) in OpenCV ( Python / C++ ) ? Code
Delaunay Triangulation and Voronoi Diagram using OpenCV ( C++ / Python) Code
OpenCV (C++ vs Python) vs MATLAB for Computer Vision
Facial Landmark Detection
Why does OpenCV use BGR color format ?
Computer Vision for Predicting Facial Attractiveness Code
applyColorMap for pseudocoloring in OpenCV ( C++ / Python ) Code
Image Alignment (ECC) in OpenCV ( C++ / Python ) Code
How to find OpenCV version in Python and C++ ?
Baidu banned from ILSVRC 2015
OpenCV Transparent API
How Computer Vision Solved the Greatest Soccer Mystery of All Time
Embedded Vision Summit 2015
Read an Image in OpenCV ( Python, C++ ) Code
Non-Photorealistic Rendering using OpenCV ( Python, C++ ) Code
Seamless Cloning using OpenCV ( Python , C++ ) Code
OpenCV Threshold ( Python , C++ ) Code
Blob Detection Using OpenCV ( Python, C++ ) Code
Turn your OpenCV Code into a Web API in under 10 minutes — Part 1
How to compile OpenCV sample Code ?
Install OpenCV 3 on Yosemite ( OSX 10.10.x )

Version history

RF_DETR_Segmentation: 2026/04/07
YOLO26_Keypoint_Estimation: 2026/04/16
Colorization: 2026/03/18
Roboflow_Trackers: 2026/03/17
