[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-Confusezius--Deep-Metric-Learning-Baselines":3,"tool-Confusezius--Deep-Metric-Learning-Baselines":64},[4,17,27,35,43,56],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":16},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,3,"2026-04-05T11:01:52",[13,14,15],"开发框架","图像","Agent","ready",{"id":18,"name":19,"github_repo":20,"description_zh":21,"stars":22,"difficulty_score":23,"last_commit_at":24,"category_tags":25,"status":16},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",140436,2,"2026-04-05T23:32:43",[13,15,26],"语言模型",{"id":28,"name":29,"github_repo":30,"description_zh":31,"stars":32,"difficulty_score":23,"last_commit_at":33,"category_tags":34,"status":16},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",107662,"2026-04-03T11:11:01",[13,14,15],{"id":36,"name":37,"github_repo":38,"description_zh":39,"stars":40,"difficulty_score":23,"last_commit_at":41,"category_tags":42,"status":16},3704,"NextChat","ChatGPTNextWeb\u002FNextChat","NextChat 是一款轻量且极速的 AI 助手，旨在为用户提供流畅、跨平台的大模型交互体验。它完美解决了用户在多设备间切换时难以保持对话连续性，以及面对众多 AI 模型不知如何统一管理的痛点。无论是日常办公、学习辅助还是创意激发，NextChat 都能让用户随时随地通过网页、iOS、Android、Windows、MacOS 或 Linux 端无缝接入智能服务。\n\n这款工具非常适合普通用户、学生、职场人士以及需要私有化部署的企业团队使用。对于开发者而言，它也提供了便捷的自托管方案，支持一键部署到 Vercel 或 Zeabur 等平台。\n\nNextChat 的核心亮点在于其广泛的模型兼容性，原生支持 Claude、DeepSeek、GPT-4 及 Gemini Pro 等主流大模型，让用户在一个界面即可自由切换不同 AI 能力。此外，它还率先支持 MCP（Model Context Protocol）协议，增强了上下文处理能力。针对企业用户，NextChat 提供专业版解决方案，具备品牌定制、细粒度权限控制、内部知识库整合及安全审计等功能，满足公司对数据隐私和个性化管理的高标准要求。",87618,"2026-04-05T07:20:52",[13,26],{"id":44,"name":45,"github_repo":46,"description_zh":47,"stars":48,"difficulty_score":23,"last_commit_at":49,"category_tags":50,"status":16},2268,"ML-For-Beginners","microsoft\u002FML-For-Beginners","ML-For-Beginners 是由微软推出的一套系统化机器学习入门课程，旨在帮助零基础用户轻松掌握经典机器学习知识。这套课程将学习路径规划为 12 周，包含 26 节精炼课程和 52 
道配套测验，内容涵盖从基础概念到实际应用的完整流程，有效解决了初学者面对庞大知识体系时无从下手、缺乏结构化指导的痛点。\n\n无论是希望转型的开发者、需要补充算法背景的研究人员，还是对人工智能充满好奇的普通爱好者，都能从中受益。课程不仅提供了清晰的理论讲解，还强调动手实践，让用户在循序渐进中建立扎实的技能基础。其独特的亮点在于强大的多语言支持，通过自动化机制提供了包括简体中文在内的 50 多种语言版本，极大地降低了全球不同背景用户的学习门槛。此外，项目采用开源协作模式，社区活跃且内容持续更新，确保学习者能获取前沿且准确的技术资讯。如果你正寻找一条清晰、友好且专业的机器学习入门之路，ML-For-Beginners 将是理想的起点。",84991,"2026-04-05T10:45:23",[14,51,52,53,15,54,26,13,55],"数据工具","视频","插件","其他","音频",{"id":57,"name":58,"github_repo":59,"description_zh":60,"stars":61,"difficulty_score":10,"last_commit_at":62,"category_tags":63,"status":16},3128,"ragflow","infiniflow\u002Fragflow","RAGFlow 是一款领先的开源检索增强生成（RAG）引擎，旨在为大语言模型构建更精准、可靠的上下文层。它巧妙地将前沿的 RAG 技术与智能体（Agent）能力相结合，不仅支持从各类文档中高效提取知识，还能让模型基于这些知识进行逻辑推理和任务执行。\n\n在大模型应用中，幻觉问题和知识滞后是常见痛点。RAGFlow 通过深度解析复杂文档结构（如表格、图表及混合排版），显著提升了信息检索的准确度，从而有效减少模型“胡编乱造”的现象，确保回答既有据可依又具备时效性。其内置的智能体机制更进一步，使系统不仅能回答问题，还能自主规划步骤解决复杂问题。\n\n这款工具特别适合开发者、企业技术团队以及 AI 研究人员使用。无论是希望快速搭建私有知识库问答系统，还是致力于探索大模型在垂直领域落地的创新者，都能从中受益。RAGFlow 提供了可视化的工作流编排界面和灵活的 API 接口，既降低了非算法背景用户的上手门槛，也满足了专业开发者对系统深度定制的需求。作为基于 Apache 2.0 协议开源的项目，它正成为连接通用大模型与行业专有知识之间的重要桥梁。",77062,"2026-04-04T04:44:48",[15,14,13,26,54],{"id":65,"github_repo":66,"name":67,"description_en":68,"description_zh":69,"ai_summary_zh":69,"readme_en":70,"readme_zh":71,"quickstart_zh":72,"use_case_zh":73,"hero_image_url":74,"owner_login":75,"owner_name":76,"owner_avatar_url":77,"owner_bio":78,"owner_company":79,"owner_location":80,"owner_email":81,"owner_twitter":82,"owner_website":83,"owner_url":84,"languages":85,"stars":94,"forks":95,"last_commit_at":96,"license":97,"difficulty_score":10,"env_os":98,"env_gpu":99,"env_ram":98,"env_deps":100,"category_tags":108,"github_topics":109,"view_count":10,"oss_zip_url":121,"oss_zip_packed_at":121,"status":16,"created_at":122,"updated_at":123,"faqs":124,"releases":170},1030,"Confusezius\u002FDeep-Metric-Learning-Baselines","Deep-Metric-Learning-Baselines","PyTorch Implementation for Deep Metric Learning Pipelines","Deep-Metric-Learning-Baselines 是一个基于 PyTorch 的深度度量学习框架，为研究人员提供了完整且易于扩展的训练与评估流水线。它集成了多种主流损失函数（如 Triplet Loss、Margin Loss、ProxyNCA）、采样策略（包括随机采样、半硬采样等）以及标准数据集（CUB200、CARS196、Stanford Online Products 等），帮助用户快速验证和对比不同方法在图像检索、人脸识别等任务上的表现。\n\n这个框架最大的特点是模块化设计，让你可以轻松添加新的损失函数、采样器或网络架构，无需从零搭建基础组件，从而专注于核心算法创新。特别适合计算机视觉领域的研究者和开发者使用，能显著降低实验门槛，提升研究效率。\n\n如果你正在从事度量学习相关研究，这个工具能帮你节省大量重复性工作，快速建立实验基准。","# Easily Extendable Basic Deep Metric Learning Pipeline\n#### Karsten Roth (karsten.rh1@gmail.com), Biagio Brattoli (biagio.brattoli@gmail.com)\n\n*When using this repo in any academic work, please provide a reference to*\n\n```\n@misc{roth2020revisiting,\n    title={Revisiting Training Strategies and Generalization Performance in Deep Metric Learning},\n    author={Karsten Roth and Timo Milbich and Samarth Sinha and Prateek Gupta and Björn Ommer and Joseph Paul Cohen},\n    year={2020},\n    eprint={2002.08473},\n    archivePrefix={arXiv},\n    primaryClass={cs.CV}\n}\n```\n\n---\n\n#### Based on an extended version of this repo, we have created a thorough comparison and evaluation of Deep Metric Learning: \nhttps:\u002F\u002Farxiv.org\u002Fabs\u002F2002.08473\n\nThe newly released code can be found here: https:\u002F\u002Fgithub.com\u002FConfusezius\u002FRevisiting_Deep_Metric_Learning_PyTorch\n\nIt contains more criteria, miners, metrics and logging options!\n\n---\n#### For usage, go to section 3; for results, see section 4\n\n## 1. Overview\nThis repository contains a full, easily extendable pipeline to test and implement current and new deep metric learning methods. 
For referencing and testing, this repo contains implementations\u002Fdataloaders for:\n\n__Loss Functions__\n* Triplet Loss (https:\u002F\u002Farxiv.org\u002Fabs\u002F1412.6622)\n* Margin Loss (https:\u002F\u002Farxiv.org\u002Fabs\u002F1706.07567)\n* ProxyNCA (https:\u002F\u002Farxiv.org\u002Fabs\u002F1703.07464)\n* N-Pair Loss (https:\u002F\u002Fpapers.nips.cc\u002Fpaper\u002F6200-improved-deep-metric-learning-with-multi-class-n-pair-loss-objective.pdf)\n\n__Sampling Methods__\n* Random Sampling\n* Softhard Sampling (Soft Version of hard tuple sampling)\n* Semihard Sampling (https:\u002F\u002Farxiv.org\u002Fabs\u002F1503.03832)\n* Distance Sampling (https:\u002F\u002Farxiv.org\u002Fabs\u002F1706.07567)\n* N-Pair Sampling (https:\u002F\u002Fpapers.nips.cc\u002Fpaper\u002F6200-improved-deep-metric-learning-with-multi-class-n-pair-loss-objective.pdf)\n\n__Datasets__\n* CUB200-2011 (http:\u002F\u002Fwww.vision.caltech.edu\u002Fvisipedia\u002FCUB-200.html)\n* CARS196 (https:\u002F\u002Fai.stanford.edu\u002F~jkrause\u002Fcars\u002Fcar_dataset.html)\n* Stanford Online Products (http:\u002F\u002Fcvgl.stanford.edu\u002Fprojects\u002Flifted_struct\u002F)\n* In-Shop Clothes (http:\u002F\u002Fmmlab.ie.cuhk.edu.hk\u002Fprojects\u002FDeepFashion\u002FInShopRetrieval.html, download from https:\u002F\u002Fdrive.google.com\u002Fdrive\u002Ffolders\u002F0B7EVK8r0v71pVDZFQXRsMDZCX1E. Thanks to KunHe for providing the link!)\n* (optional) PKU Vehicle-ID (https:\u002F\u002Fwww.pkuml.org\u002Fresources\u002Fpku-vds.html)\n\n__Architectures__\n* ResNet50 (https:\u002F\u002Farxiv.org\u002Fabs\u002F1512.03385)\n* GoogLeNet (https:\u002F\u002Farxiv.org\u002Fabs\u002F1409.4842)    \n  [Note: This version follows the official torchvision implementation, which differs from the original variant.]\n\n\n__NOTE__: PKU Vehicle-ID is _(optional)_ because there is no direct way to download the dataset, as it requires special licensing. However, if this dataset becomes available (in the structure shown in part 2.2), it can be used directly.\n\n---\n### 1.1 Related Repos:\n* [Metric Learning with Mined Interclass Characteristics](https:\u002F\u002Fgithub.com\u002FConfusezius\u002Fmetric-learning-mining-interclass-characteristics)\n* [Metric Learning by dividing the embedding space](https:\u002F\u002Fgithub.com\u002FCompVis\u002Fmetric-learning-divide-and-conquer)\n* [Deep Metric Learning to Rank](https:\u002F\u002Fgithub.com\u002Fkunhe\u002FFastAP-metric-learning)\n---\n\n## 2. Repo & Dataset Structure\n### 2.1 Repo Structure\n```\nRepository\n│   ### General Files\n│   README.md\n│   requirements.txt    \n│   installer.sh\n|\n|   ### Main Scripts\n|   Standard_Training.py     (main training script)\n|   losses.py   (collection of loss and sampling impl.)\n│   datasets.py (dataloaders for all datasets)\n│   \n│   ### Utility scripts\n|   auxiliaries.py  (set of useful utilities)\n|   evaluate.py     (set of evaluation functions)\n│   \n│   ### Network Scripts\n|   netlib.py       (contains impl. for ResNet50)\n|   googlenet.py    (contains impl. for GoogLeNet)\n│   \n│   \n└───Training Results (generated during Training)\n|    │   e.g. cub200\u002FTraining_Run_Name\n|    │   e.g. 
cars196\u002FTraining_Run_Name\n|\n│   \n└───Datasets (should be added, if one does not want to set paths)\n|    │   cub200\n|    │   cars196\n|    │   online_products\n|    │   in-shop\n|    │   vehicle_id\n```\n\n### 2.2 Dataset Structures\n__CUB200-2011\u002FCARS196__\n```\ncub200\u002Fcars196\n└───images\n|    └───001.Black_footed_Albatross\n|           │   Black_Footed_Albatross_0001_796111\n|           │   ...\n|    ...\n```\n\n__Online Products__\n```\nonline_products\n└───images\n|    └───bicycle_final\n|           │   111085122871_0.jpg\n|    ...\n|\n└───Info_Files\n|    │   bicycle.txt\n|    │   ...\n```\n\n__In-Shop Clothes__\n```\nin-shop\n└─img\n|    └─MEN\n|         └─Denim\n|               └─id_00000080\n|                  │   01_1_front.jpg\n|                  │   ...\n|               ...\n|         ...\n|    ...\n|\n└─Eval\n|  │   list_eval_partition.txt\n```\n\n\n__PKU Vehicle ID__\n```\nvehicle_id\n└───image\n|     │   \u003Cimg>.jpg\n|     |   ...\n|     \n└───train_test_split\n|     |   test_list_800.txt\n|     |   ...\n```\n\n---\n\n## 3. Using the Pipeline\n\n### [1.] Requirements\nThe pipeline is built around `Python3` (e.g. by installing Miniconda: https:\u002F\u002Fconda.io\u002Fminiconda.html) and `Pytorch 1.0.0\u002F1`. It has been tested with `cuda 8` and `cuda 9`.\n\nTo install the required libraries, either directly check `requirements.txt` or create a conda environment:\n```\nconda create -n \u003CEnv_Name> python=3.6\n```\n\nActivate it\n```\nconda activate \u003CEnv_Name>\n```\nand run\n```\nbash installer.sh\n```\n\nNote that for kMeans- and Nearest-Neighbour computation, the library `faiss` is used, which can optionally move these computations to the GPU if speed is desired. However, in most cases, `faiss` is fast enough that the computation of evaluation metrics is not a bottleneck.  \n**NOTE:** If one wishes not to use `faiss` but standard `sklearn`, simply use `auxiliaries_nofaiss.py` to replace `auxiliaries.py` when importing the libraries.\n\n\n\n### [2.] Exemplary Runs\nThe main script is `Standard_Training.py`. If run without input arguments, it trains ResNet50 on CUB200-2011 with Marginloss and Distance-sampling.  \nOtherwise, the following flags suffice to train with different losses, sampling methods, architectures and datasets:\n```\npython Standard_Training.py --dataset \u003Cdataset> --loss \u003Closs> --sampling \u003Csampling> --arch \u003Carch> --k_vals \u003Ck_vals> --embed_dim \u003Cembed_dim>\n```\nThe following flags are available:\n* `\u003Cdataset> \u003C- cub200, cars196, online_products, in-shop, vehicle_id`\n* `\u003Closs> \u003C- marginloss, triplet, npair, proxynca`\n* `\u003Csampling> \u003C- distance, semihard, random, npair`\n* `\u003Carch> \u003C- resnet50, googlenet`\n* `\u003Ck_vals> \u003C- List of Recall @ k values to evaluate on, e.g. 1 2 4 8`\n* `\u003Cembed_dim> \u003C- Network embedding dimension. Default: 128 for ResNet50, 512 for GoogLeNet.`\n\nFor all other training-specific arguments (e.g. batch size, num. training epochs, ...), simply refer to the input arguments in `Standard_Training.py`.\n\n__NOTE__: If one wishes to use a different learning rate for the final linear embedding layer, the flag `--fc_lr_mul` needs to be set to a value other than zero (i.e. 
`10` as is done in various implementations).\n\nFinally, to choose the GPU to use and the name of the training folder in which network weights, sample recoveries and metrics are stored, set:\n```\npython Standard_Training.py --gpu \u003Cgpu_id> --savename \u003Cname_of_training_run>\n```\nIf `--savename` is not set, a default name based on the starting date will be chosen.\n\nIf one wishes to simply use standard parameters and get close to literature results (more or less; this depends on seeds and the overall training schedule), refer to `sample_training_runs.sh`, which contains a list of executable one-liners.\n\n\n### [3.] Implementation Notes regarding Extendability:\n\nTo extend or test other sampling or loss methods, simply do the following:\n\n__For Batch-based Sampling:__  \nIn `losses.py`, add the sampling method, which should act on a batch (and the resp. set of labels), e.g.:\n```\ndef new_sampling(self, batch, label, **additional_parameters): ...\n```\nIf it needs to run with existing losses, this function should return a list of tuples containing indices with respect to the batch, e.g. for sampling methods returning triplets:\n```\nreturn [(anchor_idx, positive_idx, negative_idx) for anchor_idx, positive_idx, negative_idx in zip(anchor_idxs, positive_idxs, negative_idxs)]\n```\nAlso, don't forget to add a handle in `Sampler.__init__()`.\n\n__For Data-specific Sampling:__  \nTo influence the data samples used to generate the batches, edit `BaseTripletDataset` in `datasets.py`.\n\n\n__For New Loss Functions:__  \nSimply add a new class inheriting from `torch.nn.Module`. Refer to the other loss variants to see how to do so. In general, include an instance of the `Sampler`-class, which will provide sampled data tuples during a `forward()`-pass, by calling `self.sampler_instance.give(batch, labels, **additional_parameters)`.  \nFinally, include the loss function in the `loss_select()`-function. Parameters can be passed through the dictionary-notation (see other examples), and if learnable parameters are added, include them in the `to_optim`-list.\n\n\n### [4.] Stored Data:\nBy default, the following files are saved:\n```\nName_of_Training_Run\n|  checkpoint.pth.tar   -> Contains network state-dict.\n|  hypa.pkl             -> Contains all network parameters as pickle.\n|                          Can be used directly to recreate the network.\n| log_train_Base.csv    -> Logged training data as CSV.\n| log_val_Base.csv      -> Logged test metrics as CSV.\n| Parameter_Info.txt    -> All parameters stored as a readable text file.\n| InfoPlot_Base.svg     -> Graphical summary of training\u002Ftesting metrics progression.\n| sample_recoveries.png -> Sample recoveries for best validation weights.\n|                          Acts as a sanity test.\n```\n\n![Sample Recoveries](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FConfusezius_Deep-Metric-Learning-Baselines_readme_ef33f5f99f14.png)\n__Note:__ _Red denotes query images, while green shows the resp. nearest neighbours._\n\n![Sample Recoveries](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FConfusezius_Deep-Metric-Learning-Baselines_readme_37a0ab42c8f4.png)\n__Note:__ _The header in the summary plot shows the best testing metrics over the whole run._\n\n### [5.] 
Additional Notes:\nTo finalize, several flags might be of interest when examining the respective runs:\n```\n--dist_measure: If set, the ratio of mean intraclass-distances over mean interclass distances\n                (by measure of center-of-mass distances) is computed after each epoch and stored\u002Fplotted.\n--grad_measure: If set, the average (absolute) gradients from the embedding layer to the last\n                conv. layer are stored in a Pickle-File. This can be used to examine the change of features during each iteration.\n```\nFor more details, refer to the respective classes in `auxiliaries.py`.\n\n---\n\n## 4. Results\nThese results are supposed to be performance estimates achieved by running the respective commands in `sample_training_runs.sh`. Note that the learning rate scheduling might not be fully optimised, so these values should only serve as reference\u002Fexpectation, not what can be ultimately achieved with more tweaking.\n\n_Note also that there is a not insignificant dependency on the used seed._\n\n\n\n__CUB200__\n\nArchitecture | Loss\u002FSampling       |   NMI  |  F1  | Recall @ 1 -- 2 -- 4 -- 8\n-------------|---------------      |--------|------|-----------------\nResNet50     |  Margin\u002FDistance    | __68.2__   | __38.7__ | 63.4 -- 74.9 --  __86.0__ --  90.4    \nResNet50     |  Triplet\u002FSofthard   | 66.2   | 35.5 | 61.2 --  73.2 --  82.4 --  89.5    \nResNet50     |  NPair\u002FNone         | 65.4   | 33.8 | 59.0 --  71.3 --  81.1 --  88.8    \nResNet50     |  ProxyNCA\u002FNone      | 68.1   | 38.1 | __64.0__ --  __75.4__ --  84.2 --  __90.5__    \n\n\n__Cars196__\n\nArchitecture | Loss\u002FSampling       |   NMI  |  F1  | Recall @ 1 -- 2 -- 4 -- 8\n-------------|---------------      |--------|------|-----------------\nResNet50     |  Margin\u002FDistance    | __67.2__   | __37.6__ | 79.3 -- 87.1 -- __92.1 -- 95.4__    \nResNet50     |  Triplet\u002FSofthard   | 64.4   | 32.4 | 75.4 -- 84.2 -- 90.1 -- 94.1\nResNet50     |  NPair\u002FNone         | 62.3   | 30.1 | 69.5 -- 80.2 -- 87.3 -- 92.1\nResNet50     |  ProxyNCA\u002FNone      | 66.3   | 35.8 | __80.0 -- 87.2__ -- 91.8 -- 95.1\n\n__Online Products__\n\nArchitecture | Loss\u002FSampling       |   NMI  |  F1  | Recall @ 1 -- 10 -- 100 -- 1000\n-------------|---------------      |--------|------|-----------------\nResNet50     |  Margin\u002FDistance    | __89.6__   | __34.9__ | __76.1 -- 88.7 -- 95.1__ -- 98.3\nResNet50     |  Triplet\u002FSofthard   | 89.1   | 33.7 | 74.3 -- 87.6 -- 94.9 -- __98.5__\nResNet50     |  NPair\u002FNone         | 88.8   | 31.1 | 70.9 -- 85.2 -- 93.8 -- 98.2\n\n\n__In-Shop Clothes__\n\nArchitecture | Loss\u002FSampling       |   NMI  |  F1  | Recall @ 1 -- 10 -- 20 -- 30 -- 50\n-------------|---------------      |--------|------|-----------------\nResNet50     |  Margin\u002FDistance    | 88.2   | 27.7 | __84.5__ -- 96.1 -- 97.4 -- 97.9 -- 98.5\nResNet50     |  Triplet\u002FSemihard   | __89.0__   | __30.8__ | 83.9 -- __96.3 -- 97.6 -- 98.4 -- 98.8__\nResNet50     |  NPair\u002FNone         | 88.0   | 27.6 | 80.9 -- 95.0 -- 96.6 -- 97.5 -- 98.2\n\n\n__NOTE:__\n 1. Regarding __Vehicle-ID__: Due to the number of test sets, size of the training set and little public accessibility, results are not included for the time being.\n 2. 
Regarding ProxyNCA for __Online Products__ and __In-Shop Clothes__: Due to the high number of classes, the number of proxies required is too high for useful training (>10000 proxies).\n\n---\n\n## ToDO:\n- [x] Fix Version in `requirements.txt`  \n- [x] Add Results for Implementations\n- [x] Finalize Comments  \n- [ ] Add Inception-BN  \n- [ ] Add Lifted Structure Loss\n","# 易于扩展的基础深度度量学习流水线\n#### Karsten Roth (karsten.rh1@gmail.com), Biagio Brattoli (biagio.brattoli@gmail.com)\n\n*在学术工作中使用此仓库时，请引用*\n\n```\n@misc{roth2020revisiting,\n    title={Revisiting Training Strategies and Generalization Performance in Deep Metric Learning},\n    author={Karsten Roth and Timo Milbich and Samarth Sinha and Prateek Gupta and Björn Ommer and Joseph Paul Cohen},\n    year={2020},\n    eprint={2002.08473},\n    archivePrefix={arXiv},\n    primaryClass={cs.CV}\n}\n```\n\n---\n\n#### 基于该仓库的扩展版本，我们进行了深度度量学习的全面比较与评估：\nhttps:\u002F\u002Farxiv.org\u002Fabs\u002F2002.08473\n\n新发布的代码可在此处找到：https:\u002F\u002Fgithub.com\u002FConfusezius\u002FRevisiting_Deep_Metric_Learning_PyTorch\n\n其中包含更多的损失函数、采样器、评估指标和日志记录选项！\n\n---\n#### 使用方法请参见第3节，实验结果请参见第4节\n\n## 1. 概述\n本仓库包含一个完整且易于扩展的流水线，用于测试和实现当前及新的深度度量学习方法。作为参考和测试，本仓库实现了以下组件：\n\n__损失函数__\n* Triplet Loss（三元组损失）(https:\u002F\u002Farxiv.org\u002Fabs\u002F1412.6622)\n* Margin Loss（间隔损失）(https:\u002F\u002Farxiv.org\u002Fabs\u002F1706.07567)\n* ProxyNCA（代理NCA损失）(https:\u002F\u002Farxiv.org\u002Fabs\u002F1703.07464)\n* N-Pair Loss（N对损失）(https:\u002F\u002Fpapers.nips.cc\u002Fpaper\u002F6200-improved-deep-metric-learning-with-multi-class-n-pair-loss-objective.pdf)\n\n__采样方法__\n* Random Sampling（随机采样）\n* Softhard Sampling（软难采样，困难元组采样的软版本）\n* Semihard Sampling（半难采样）(https:\u002F\u002Farxiv.org\u002Fabs\u002F1503.03832)\n* Distance Sampling（距离采样）(https:\u002F\u002Farxiv.org\u002Fabs\u002F1706.07567)\n* N-Pair Sampling（N对采样）(https:\u002F\u002Fpapers.nips.cc\u002Fpaper\u002F6200-improved-deep-metric-learning-with-multi-class-n-pair-loss-objective.pdf)\n\n__数据集__\n* CUB200-2011 (http:\u002F\u002Fwww.vision.caltech.edu\u002Fvisipedia\u002FCUB-200.html)\n* CARS196 (https:\u002F\u002Fai.stanford.edu\u002F~jkrause\u002Fcars\u002Fcar_dataset.html)\n* Stanford Online Products (http:\u002F\u002Fcvgl.stanford.edu\u002Fprojects\u002Flifted_struct\u002F)\n* In-Shop Clothes (http:\u002F\u002Fmmlab.ie.cuhk.edu.hk\u002Fprojects\u002FDeepFashion\u002FInShopRetrieval.html, 从 https:\u002F\u002Fdrive.google.com\u002Fdrive\u002Ffolders\u002F0B7EVK8r0v71pVDZFQXRsMDZCX1E 下载。感谢 KunHe 提供链接！)\n* （可选）PKU Vehicle-ID (https:\u002F\u002Fwww.pkuml.org\u002Fresources\u002Fpku-vds.html)\n\n__网络架构__\n* ResNet50 (https:\u002F\u002Farxiv.org\u002Fabs\u002F1512.03385)\n* GoogLeNet (https:\u002F\u002Farxiv.org\u002Fabs\u002F1409.4842)    \n  [注：此版本遵循官方torchvision实现，与原始版本有所不同。]\n\n\n__注意__: PKU Vehicle-ID 是_（可选的）_，因为没有直接下载该数据集的方式，它需要特殊许可。但是，如果该数据集可用（结构如2.2节所示），则可以直接使用。\n\n---\n### 1.1 相关仓库：\n* [Metric Learning with Mined Interclass Characteristics](https:\u002F\u002Fgithub.com\u002FConfusezius\u002Fmetric-learning-mining-interclass-characteristics)\n* [Metric Learning by dividing the embedding space](https:\u002F\u002Fgithub.com\u002FCompVis\u002Fmetric-learning-divide-and-conquer)\n* [Deep Metric Learning to Rank](https:\u002F\u002Fgithub.com\u002Fkunhe\u002FFastAP-metric-learning)\n---\n\n## 2. 
仓库与数据集结构\n### 2.1 仓库结构\n```\nRepository\n│   ### 通用文件\n│   README.md\n│   requirements.txt    \n│   installer.sh\n|\n|   ### 主脚本\n|   Standard_Training.py     (主训练脚本)\n|   losses.py   (损失函数和采样实现集合)\n│   datasets.py (所有数据集的数据加载器)\n│   \n│   ### 工具脚本\n|   auxiliaries.py  (实用工具集合)\n|   evaluate.py     (评估函数集合)\n│   \n│   ### 网络脚本\n|   netlib.py       (ResNet50实现)\n|   googlenet.py    (GoogLeNet实现)\n│   \n│   \n└───Training Results (训练过程中生成)\n|    │   e.g. cub200\u002FTraining_Run_Name\n|    │   e.g. cars196\u002FTraining_Run_Name\n|\n│   \n└───Datasets (应添加到此，如果不想设置路径)\n|    │   cub200\n|    │   cars196\n|    │   online_products\n|    │   in-shop\n|    │   vehicle_id\n```\n\n### 2.2 数据集结构\n__CUB200-2011\u002FCARS196__\n```\ncub200\u002Fcars196\n└───images\n|    └───001.Black_footed_Albatross\n|           │   Black_Footed_Albatross_0001_796111\n|           │   ...\n|    ...\n```\n\n__Online Products__\n```\nonline_products\n└───images\n|    └───bicycle_final\n|           │   111085122871_0.jpg\n|    ...\n|\n└───Info_Files\n|    │   bicycle.txt\n|    │   ...\n```\n\n__In-Shop Clothes__\n```\nin-shop\n└─img\n|    └─MEN\n|         └─Denim\n|               └─id_00000080\n|                  │   01_1_front.jpg\n|                  │   ...\n|               ...\n|         ...\n|    ...\n|\n└─Eval\n|  │   list_eval_partition.txt\n```\n\n\n__PKU Vehicle ID__\n```\nvehicle_id\n└───image\n|     │   \u003Cimg>.jpg\n|     |   ...\n|     \n└───train_test_split\n|     |   test_list_800.txt\n|     |   ...\n```\n\n---\n\n## 3. 使用流水线\n\n### [1.] 环境要求\n该流水线基于 `Python3`（即通过安装 Miniconda https:\u002F\u002Fconda.io\u002Fminiconda.html）和 `Pytorch 1.0.0\u002F1` 构建。已在 `cuda 8` 和 `cuda 9` 环境下测试。\n\n要安装所需的库，可以直接查看 `requirements.txt` 或创建 conda 环境：\n```\nconda create -n \u003CEnv_Name> python=3.6\n```\n\n激活环境\n```\nconda activate \u003CEnv_Name>\n```\n然后运行\n```\nbash installer.sh\n```\n\n注意，对于 k均值聚类（kMeans）和最近邻（Nearest Neighbour）计算，我们使用了 `faiss` 库，如果需要速度，可以将这些计算移动到GPU上。然而，在大多数情况下，`faiss` 已经足够快，使得评估指标的计算不会成为瓶颈。  \n**注意：** 如果不想使用 `faiss` 而想使用标准的 `sklearn`，只需在导入库时使用 `auxiliaries_nofaiss.py` 替换 `auxiliaries.py`。\n\n### [2.] 示例运行\n主脚本是 `Standard_Training.py`。如果不带输入参数运行，将执行在 CUB200-2011 数据集上使用 Marginloss（边际损失）和 Distance-sampling（距离采样）对 ResNet50 进行训练。\n\n否则，可以使用以下标志来使用不同的损失函数（loss）、采样方法（sampling）、架构（arch）和数据集（dataset）进行训练：\n```\npython Standard_Training.py --dataset \u003Cdataset> --loss \u003Closs> --sampling \u003Csampling> --arch \u003Carch> --k_vals \u003Ck_vals> --embed_dim \u003Cembed_dim>\n```\n\n以下标志可用：\n* `\u003Cdataset> \u003C- cub200, cars196, online_products, in-shop, vehicle_id`\n* `\u003Closs> \u003C- marginloss, triplet, npair, proxynca`\n* `\u003Csampling> \u003C- distance, semihard, random, npair`\n* `\u003Carch> \u003C- resnet50, googlenet`\n* `\u003Ck_vals> \u003C- 要评估的 Recall @ k 值列表，例如 1 2 4 8`\n* `\u003Cembed_dim> \u003C- 网络嵌入维度。默认值：ResNet50 为 128，GoogLeNet 为 512。`\n\n对于所有其他训练相关的参数（例如 batch-size（批大小）、num. training epochs（训练轮数）等），只需参考 `Standard_Training.py` 中的输入参数。\n\n__注意__：如果希望为最终的线性嵌入层（linear embedding layer）使用不同的学习率（learning rate），需要将标志 `--fc_lr_mul` 设置为非零值（即 `10`，如各种实现中所做的那样）。\n\n最后，要决定使用哪个 GPU（图形处理器）以及存储网络权重、样本恢复（sample recoveries）和指标（metrics）的训练文件夹名称，请设置：\n```\npython Standard_Training.py --gpu \u003Cgpu_id> --savename \u003Cname_of_training_run>\n```\n如果未设置 `--savename`，将基于开始日期选择一个默认名称。\n\n如果希望简单地使用标准参数并获得接近文献结果的效果（或多或少取决于随机种子和整体训练调度），请参考 `sample_training_runs.sh`，其中包含一系列可执行的单行命令。\n\n### [3.] 
关于可扩展性的实现说明：\n\n要扩展或测试其他采样或损失方法，只需执行：\n\n__对于基于批次的采样（Batch-based Sampling）：__  \n在 `losses.py` 中，添加采样方法，该方法应作用于一个批次（batch）以及相应的标签集，例如：\n```\ndef new_sampling(self, batch, label, **additional_parameters): ...\n```\n如果需要与现有损失函数一起运行，此函数应返回一个包含相对于批次的索引的元组列表，例如对于返回三元组的采样方法：\n```\nreturn [(anchor_idx, positive_idx, negative_idx) for anchor_idx, positive_idx, negative_idx in zip(anchor_idxs, positive_idxs, negative_idxs)]\n```\n同时，别忘了在 `Sampler.__init__()` 中添加一个句柄。\n\n__对于特定数据的采样（Data-specific Sampling）：__  \n要影响用于生成批次的数据样本，请在 `datasets.py` 中编辑 `BaseTripletDataset`。\n\n__对于新的损失函数（Loss Functions）：__  \n只需添加一个继承自 `torch.nn.Module`（PyTorch 模块基类）的新类。参考其他损失变体以了解如何实现。通常，需要包含一个 `Sampler` 类的实例，该实例将在 `forward()`（前向传播）期间通过调用 `self.sampler_instance.give(batch, labels, **additional_parameters)` 提供采样的数据元组。最后，将损失函数包含在 `loss_select()` 函数中。参数可以通过字典表示法传递（参见其他示例），如果添加了可学习的参数，请将它们包含在 `to_optim` 列表中。\n\n### [4.] 存储的数据：\n\n默认情况下，会保存以下文件：\n```\nName_of_Training_Run\n|  checkpoint.pth.tar   -> 包含网络 state-dict（状态字典）\n|  hypa.pkl             -> 包含所有网络参数的 pickle（Python 序列化格式）文件\n|                          可直接用于重新创建网络\n| log_train_Base.csv    -> 记录的训练数据（CSV 格式）                     \n| log_val_Base.csv      -> 记录的测试指标（CSV 格式）                    \n| Parameter_Info.txt    -> 所有参数以可读文本文件形式存储\n| InfoPlot_Base.svg     -> 训练\u002F测试指标进展的图形化总结\n| sample_recoveries.png -> 最佳验证权重下的样本恢复结果\n|                          作为合理性测试\n```\n\n![Sample Recoveries](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FConfusezius_Deep-Metric-Learning-Baselines_readme_ef33f5f99f14.png)\n__注意：__ _红色表示查询图像，绿色显示相应的最近邻。_\n\n![Sample Recoveries](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FConfusezius_Deep-Metric-Learning-Baselines_readme_37a0ab42c8f4.png)\n__注意：__ _摘要图中的标题显示整个运行过程中的最佳测试指标。_\n\n### [5.] 附加说明：\n\n最后，在检查相应运行时，以下几个标志可能值得关注：\n```\n--dist_measure: 如果设置，将计算每轮迭代后的平均类内距离（intraclass-distances）与平均类间距离（interclass distances）之比\n                （通过质心距离（center-of-mass distances）度量），并存储\u002F绘制该值\n--grad_measure: 如果设置，将从嵌入层到最后一层卷积层（conv. layer）的平均（绝对）梯度存储在 Pickle-File（Pickle 文件）中。这可用于检查每次迭代期间特征的变化\n```\n\n更多详情，请参考 `auxiliaries.py` 中的相应类。\n\n---\n\n## 4. 
结果\n这些结果是通过运行 `sample_training_runs.sh` 中的相应命令获得的性能估计。请注意，学习率调度（learning rate scheduling）可能未完全优化，因此这些值仅应作为参考\u002F预期，而非通过更多调整所能最终达到的性能。\n\n_另请注意，结果对所使用的随机种子（seed）有不可忽视的依赖性。_\n\n__CUB200__\n\n架构 | 损失\u002F采样 | NMI | F1 | Recall @ 1 -- 2 -- 4 -- 8\n---|---|---|---|---\nResNet50     |  Margin\u002FDistance    | __68.2__   | __38.7__ | 63.4 -- 74.9 --  __86.0__ --  90.4    \nResNet50     |  Triplet\u002FSofthard   | 66.2   | 35.5 | 61.2 --  73.2 --  82.4 --  89.5    \nResNet50     |  NPair\u002FNone         | 65.4   | 33.8 | 59.0 --  71.3 --  81.1 --  88.8    \nResNet50     |  ProxyNCA\u002FNone      | 68.1   | 38.1 | __64.0__ --  __75.4__ --  84.2 --  __90.5__\n\n__Cars196__\n\n架构 | 损失\u002F采样 | NMI | F1 | Recall @ 1 -- 2 -- 4 -- 8\n---|---|---|---|---\nResNet50     |  Margin\u002FDistance    | __67.2__   | __37.6__ | 79.3 -- 87.1 -- __92.1 -- 95.4__    \nResNet50     |  Triplet\u002FSofthard   | 64.4   | 32.4 | 75.4 -- 84.2 -- 90.1 -- 94.1\nResNet50     |  NPair\u002FNone         | 62.3   | 30.1 | 69.5 -- 80.2 -- 87.3 -- 92.1\nResNet50     |  ProxyNCA\u002FNone      | 66.3   | 35.8 | __80.0 -- 87.2__ -- 91.8 -- 95.1\n\n__Online Products__\n\n架构 | 损失\u002F采样 | NMI | F1 | Recall @ 1 -- 10 -- 100 -- 1000\n---|---|---|---|---\nResNet50     |  Margin\u002FDistance    | __89.6__   | __34.9__ | __76.1 -- 88.7 -- 95.1__ -- 98.3\nResNet50     |  Triplet\u002FSofthard   | 89.1   | 33.7 | 74.3 -- 87.6 -- 94.9 -- __98.5__\nResNet50     |  NPair\u002FNone         | 88.8   | 31.1 | 70.9 -- 85.2 -- 93.8 -- 98.2\n\n__In-Shop Clothes__\n\n架构 | 损失\u002F采样 | NMI | F1 | Recall @ 1 -- 10 -- 20 -- 30 -- 50\n---|---|---|---|---\nResNet50     |  Margin\u002FDistance    | 88.2   | 27.7 | __84.5__ -- 96.1 -- 97.4 -- 97.9 -- 98.5\nResNet50     |  Triplet\u002FSemihard   | __89.0__   | __30.8__ | 83.9 -- __96.3 -- 97.6 -- 98.4 -- 98.8__\nResNet50     |  NPair\u002FNone         | 88.0   | 27.6 | 80.9 -- 95.0 -- 96.6 -- 97.5 -- 98.2\n\n__注意：__\n 1. 关于 __Vehicle-ID__：由于测试集数量、训练集规模以及公开可访问性较低，暂不提供结果。\n 2. 关于 __Online Products__ 和 __In-Shop Clothes__ 的 ProxyNCA：由于类别数量过多，所需的代理数量对于有效训练而言过高（>10000 个代理）。\n\n---\n\n## 待办事项：\n- [x] 修复 `requirements.txt` 中的版本  \n- [x] 添加实现的结果\n- [x] 完善注释  \n- [ ] 添加 Inception-BN  \n- [ ] 添加 Lifted Structure Loss","# Deep-Metric-Learning-Baselines 快速上手指南\n\n## 环境准备\n\n- **Python**: 3.6+ (推荐使用 Miniconda)\n- **PyTorch**: 1.0.0\u002F1.0.1\n- **CUDA**: 8.0 或 9.0\n- **GPU**: 建议使用 NVIDIA GPU\n\n## 安装步骤\n\n1. **创建 Conda 环境并安装依赖**\n```bash\nconda create -n dml python=3.6\nconda activate dml\nbash installer.sh\n```\n\n2. 
**准备数据集**\n将数据集下载并解压到 `Datasets` 目录，保持原始结构：\n```\nDatasets\u002F\n├── cub200\u002F\n├── cars196\u002F\n├── online_products\u002F\n├── in-shop\u002F\n└── vehicle_id\u002F  # 可选\n```\n\n## 基本使用\n\n### 启动训练\n\n**默认配置** (ResNet50 + Margin Loss + Distance Sampling on CUB200-2011):\n```bash\npython Standard_Training.py\n```\n\n**自定义配置**:\n```bash\npython Standard_Training.py \\\n  --dataset \u003Cdataset> \\\n  --loss \u003Closs> \\\n  --sampling \u003Csampling> \\\n  --arch \u003Carch> \\\n  --gpu \u003Cgpu_id> \\\n  --savename \u003Crun_name>\n```\n\n### 核心参数说明\n\n| 参数 | 可选值 | 说明 |\n|------|--------|------|\n| `--dataset` | `cub200`, `cars196`, `online_products`, `in-shop`, `vehicle_id` | 数据集 |\n| `--loss` | `marginloss`, `triplet`, `npair`, `proxynca` | 损失函数 |\n| `--sampling` | `distance`, `semihard`, `random`, `npair` | 采样方法 |\n| `--arch` | `resnet50`, `googlenet` | 网络架构 |\n| `--embed_dim` | 整数 | 嵌入维度 (默认: ResNet50=128, GoogLeNet=512) |\n| `--fc_lr_mul` | 浮点数 | 嵌入层学习率倍数 (建议设为 10) |\n| `--k_vals` | 如 `1 2 4 8` | 评估 Recall@k 的 k 值 |\n\n### 常用示例\n\n**CARS196 + Triplet Loss**:\n```bash\npython Standard_Training.py --dataset cars196 --loss triplet --sampling semihard --gpu 0 --savename triplet_cars\n```\n\n**Online Products + N-Pair Loss**:\n```bash\npython Standard_Training.py --dataset online_products --loss npair --sampling npair --embed_dim 512 --gpu 0\n```\n\n**批量执行示例**:\n```bash\nbash sample_training_runs.sh\n```\n\n## 训练输出\n\n结果保存在 `Training Results\u002F\u003Cdataset>\u002F\u003Crun_name>\u002F`:\n\n- `checkpoint.pth.tar`: 模型权重\n- `log_train_Base.csv` \u002F `log_val_Base.csv`: 训练\u002F验证日志\n- `InfoPlot_Base.svg`: 指标可视化\n- `sample_recoveries.png`: 样本检索效果图\n\n## 性能参考 (CUB200-2011)\n\n| 配置 | Recall@1 | Recall@2 | Recall@4 | Recall@8 |\n|------|----------|----------|----------|----------|\n| Margin\u002FDistance | 63.4 | 74.9 | 86.0 | 90.4 |\n| Triplet\u002FSofthard | 61.2 | 73.2 | 82.4 | 89.5 |\n| NPair | 59.0 | 71.3 | 81.1 | 88.8 |\n\n*注：结果可能因随机种子和调度策略略有差异。*","某电商平台算法团队正在开发\"以图搜款\"功能，需要训练一个模型让用户上传街拍照片就能找到相似商品。团队5名成员需要在2周内验证多种深度度量学习方案，数据包含15万张服饰图片，覆盖800多个品类。\n\n### 没有 Deep-Metric-Learning-Baselines 时\n- **重复造轮子耗时费力**：工程师小张需要手写Triplet Loss、ProxyNCA等损失函数，光是调试Margin Loss的采样策略就花了3天，代码还频繁出现维度不匹配错误\n- **算法对比不公平**：不同论文的实现细节差异大，小王复现的N-Pair Loss与原始论文效果差距明显，无法确定是参数问题还是实现bug，团队内部争论不休\n- **组合实验效率低**：想测试\"ResNet50+Distance Sampling+Margin Loss\"组合，需要手动修改数据加载、采样逻辑和训练循环，每次尝试都要重写大量胶水代码\n- **评估指标实现困难**：召回率@K、NMI等指标计算复杂，小李花了整整一周才实现完整的评估流程，结果与论文数据仍有偏差\n\n### 使用 Deep-Metric-Learning-Baselines 后\n- **一键启动标准化训练**：团队直接调用`Standard_Training.py`，15分钟就跑通第一个Triplet Loss基线，所有接口统一，无需担心底层实现细节\n- **公平对比即插即用**：通过修改配置文件即可切换ProxyNCA、N-Pair等损失函数，内置的采样策略确保实验条件一致，2天内完成5种算法的横向评测\n- **模块化组合灵活**：在`losses.py`中自由组合Distance Sampling与Margin Loss，通过参数配置快速尝试20多种方案，代码复用率达90%以上\n- **开箱即用的评估体系**：直接调用`evaluate.py`获取标准指标，与论文结果对齐准确，团队将精力集中在模型优化而非工程实现\n\n核心价值：Deep-Metric-Learning-Baselines 将算法验证周期从3周压缩至3天，让团队专注创新而非重复劳动，最终准时上线功能并提升搜索准确率12%。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FConfusezius_Deep-Metric-Learning-Baselines_37a0ab42.png","Confusezius","Karsten Roth","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002FConfusezius_c0e3515b.jpg","All things representation.\r\n\r\nPhD (IMPRS-IS, ELLIS) EML Tuebingen |  prev. 
@VectorInstitute, @mila-iqia, @aws, @facebookresearch","University of Tuebingen","Tuebingen","karsten.rh1@gmail.com","confusezius","karroth.com","https:\u002F\u002Fgithub.com\u002FConfusezius",[86,90],{"name":87,"color":88,"percentage":89},"Python","#3572A5",94.9,{"name":91,"color":92,"percentage":93},"Shell","#89e051",5.1,576,91,"2026-02-01T18:53:11","Apache-2.0","未说明","需要 NVIDIA GPU，支持 CUDA 8 或 CUDA 9（未指定具体显卡型号和显存大小）",{"notes":101,"python":102,"dependencies":103},"1. 代码基于较旧的 PyTorch 1.0.0\u002F1 和 Python 3.6 开发，测试环境为 CUDA 8 和 CUDA 9；2. 所有数据集需手动下载并按指定目录结构组织；3. 默认使用 faiss 库加速评估计算，若不想使用可替换为 sklearn 版本（需用 auxiliaries_nofaiss.py 替换 auxiliaries.py）；4. 作者已发布功能更完善的新版本（Revisiting_Deep_Metric_Learning_PyTorch），建议关注；5. PKU Vehicle-ID 数据集需要特殊授权才能使用；6. 提供完整的训练和评估流水线，易于扩展新的损失函数、采样方法和网络架构。","3.6",[104,105,106,107],"pytorch==1.0.0\u002F1","faiss","torchvision","scikit-learn",[13,14],[110,111,112,113,114,115,116,117,118,119,120],"pytorch","deep-metric-learning","deep-learning","metric-learning","neural-networks","computer-vision","pku-vehicle","distance-sampling","shop-clothes","cars196","cub200",null,"2026-03-27T02:49:30.150509","2026-04-06T08:17:46.860271",[125,130,135,140,145,150,155,160,165],{"id":126,"question_zh":127,"answer_zh":128,"source_url":129},4588,"训练后如何保存和评估模型？","使用 torch.save() 和 torch.load() 保存加载模型后效果不佳，通常与批大小（batch size）设置有关。对于 triplet 采样方法（如 semihard 或 distance sampling），批大小过小会限制可构建的 triplet 数量，导致性能显著下降。建议：1) 批大小至少设置为 100；2) 如果显存不足，可通过调整 `--n_samples_per_class` 参数（设为 2 或 3 而非默认的 4）来增加负样本多样性。对于 ProxyNCA 方法，批大小影响较小。","https:\u002F\u002Fgithub.com\u002FConfusezius\u002FDeep-Metric-Learning-Baselines\u002Fissues\u002F4",{"id":131,"question_zh":132,"answer_zh":133,"source_url":134},4589,"为什么 Proxy-NCA 的实现效果比原论文更好？","性能提升主要来自于超参数调整。关键改进是在训练期间将 proxy 与 anchor 的距离乘以 3 倍（代码第 494、495 行），这显著提升了收敛速度和性能。如果移除这个因子，性能会明显下降。这个技巧参考自 https:\u002F\u002Fgithub.com\u002Fdichotomies\u002Fproxy-nca 的实现。","https:\u002F\u002Fgithub.com\u002FConfusezius\u002FDeep-Metric-Learning-Baselines\u002Fissues\u002F17",{"id":136,"question_zh":137,"answer_zh":138,"source_url":139},4590,"main.py 文件在哪里？","README 中提到的 `main.py` 实际应为 `Standard_Training.py`。该问题已在文档中修正，请直接运行 `Standard_Training.py` 文件。","https:\u002F\u002Fgithub.com\u002FConfusezius\u002FDeep-Metric-Learning-Baselines\u002Fissues\u002F2",{"id":141,"question_zh":142,"answer_zh":143,"source_url":144},4591,"找不到 'pretrainedmodels' 模块怎么办？","`pretrainedmodels` 是外部依赖库，需要手动安装。运行以下命令：\n```bash\npip install pretrainedmodels\n```","https:\u002F\u002Fgithub.com\u002FConfusezius\u002FDeep-Metric-Learning-Baselines\u002Fissues\u002F22",{"id":146,"question_zh":147,"answer_zh":148,"source_url":149},4592,"出现 CUDA 显存不足错误怎么办？","降低批大小（batch size）即可解决。对于 8GB 显存的 GPU（如 RTX 2070 Super），默认配置可能过高。尝试将批大小调小后重新运行。此外，如果在 `auxiliaries_nofaiss.py` 中遇到内存错误（如无法分配 125 GiB 数组），同样建议减小批大小。","https:\u002F\u002Fgithub.com\u002FConfusezius\u002FDeep-Metric-Learning-Baselines\u002Fissues\u002F18",{"id":151,"question_zh":152,"answer_zh":153,"source_url":154},4593,"Windows 系统不支持 faiss 怎么办？","项目提供了不使用 faiss 的替代方案。将代码中导入的 `auxiliaries.py` 替换为 `auxiliaries_nofaiss.py` 即可在 Windows 上运行。注意：`auxiliaries_nofaiss.py` 未经充分测试，可能存在兼容性问题。也可尝试第三方 Windows 移植版 faiss：https:\u002F\u002Fgithub.com\u002Fbitsun\u002Ffaiss-windows","https:\u002F\u002Fgithub.com\u002FConfusezius\u002FDeep-Metric-Learning-Baselines\u002Fissues\u002F16",{"id":156,"question_zh":157,"answer_zh":158,"source_url":159},4594,"'pretrainedmodels.py' 文件缺失？","`pretrainedmodels` 不是项目内的文件，而是需要通过 pip 安装的第三方库。查看 `Install.sh` 文件可知依赖列表。解决方法：\n```bash\npip 
install pretrainedmodels\n```","https:\u002F\u002Fgithub.com\u002FConfusezius\u002FDeep-Metric-Learning-Baselines\u002Fissues\u002F13",{"id":161,"question_zh":162,"answer_zh":163,"source_url":164},4595,"加载 In-Shop 数据集时出现 TypeError 错误怎么办？","此错误通常由 PyTorch\u002FTorchvision 版本过旧导致，特别是 `transforms.RandomHorizontalFlip(0.5)` 的参数格式问题。建议升级 PyTorch 和 Torchvision 到最新版本。参考：https:\u002F\u002Fdiscuss.pytorch.org\u002Ft\u002Ferror-on-transforms-randomhorizontalflip-p-0-5\u002F14724","https:\u002F\u002Fgithub.com\u002FConfusezius\u002FDeep-Metric-Learning-Baselines\u002Fissues\u002F11",{"id":166,"question_zh":167,"answer_zh":168,"source_url":169},4596,"'googlenet.py' 中的模型是 GoogLeNet 吗？","该文件实际实现的是 Inception-BN 架构，而非原论文《Going Deeper with Convolutions》中的 GoogLeNet。代码来自 PyTorch Torchvision 仓库，但命名存在误导性。建议引用时注明实际架构类型。","https:\u002F\u002Fgithub.com\u002FConfusezius\u002FDeep-Metric-Learning-Baselines\u002Fissues\u002F19",[]]