[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-ZhiningLiu1998--awesome-imbalanced-learning":3,"tool-ZhiningLiu1998--awesome-imbalanced-learning":64},[4,17,27,35,43,56],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":16},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,3,"2026-04-05T11:01:52",[13,14,15],"开发框架","图像","Agent","ready",{"id":18,"name":19,"github_repo":20,"description_zh":21,"stars":22,"difficulty_score":23,"last_commit_at":24,"category_tags":25,"status":16},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",138956,2,"2026-04-05T11:33:21",[13,15,26],"语言模型",{"id":28,"name":29,"github_repo":30,"description_zh":31,"stars":32,"difficulty_score":23,"last_commit_at":33,"category_tags":34,"status":16},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 
图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",107662,"2026-04-03T11:11:01",[13,14,15],{"id":36,"name":37,"github_repo":38,"description_zh":39,"stars":40,"difficulty_score":23,"last_commit_at":41,"category_tags":42,"status":16},3704,"NextChat","ChatGPTNextWeb\u002FNextChat","NextChat 是一款轻量且极速的 AI 助手，旨在为用户提供流畅、跨平台的大模型交互体验。它完美解决了用户在多设备间切换时难以保持对话连续性，以及面对众多 AI 模型不知如何统一管理的痛点。无论是日常办公、学习辅助还是创意激发，NextChat 都能让用户随时随地通过网页、iOS、Android、Windows、MacOS 或 Linux 端无缝接入智能服务。\n\n这款工具非常适合普通用户、学生、职场人士以及需要私有化部署的企业团队使用。对于开发者而言，它也提供了便捷的自托管方案，支持一键部署到 Vercel 或 Zeabur 等平台。\n\nNextChat 的核心亮点在于其广泛的模型兼容性，原生支持 Claude、DeepSeek、GPT-4 及 Gemini Pro 等主流大模型，让用户在一个界面即可自由切换不同 AI 能力。此外，它还率先支持 MCP（Model Context Protocol）协议，增强了上下文处理能力。针对企业用户，NextChat 提供专业版解决方案，具备品牌定制、细粒度权限控制、内部知识库整合及安全审计等功能，满足公司对数据隐私和个性化管理的高标准要求。",87618,"2026-04-05T07:20:52",[13,26],{"id":44,"name":45,"github_repo":46,"description_zh":47,"stars":48,"difficulty_score":23,"last_commit_at":49,"category_tags":50,"status":16},2268,"ML-For-Beginners","microsoft\u002FML-For-Beginners","ML-For-Beginners 是由微软推出的一套系统化机器学习入门课程，旨在帮助零基础用户轻松掌握经典机器学习知识。这套课程将学习路径规划为 12 周，包含 26 节精炼课程和 52 道配套测验，内容涵盖从基础概念到实际应用的完整流程，有效解决了初学者面对庞大知识体系时无从下手、缺乏结构化指导的痛点。\n\n无论是希望转型的开发者、需要补充算法背景的研究人员，还是对人工智能充满好奇的普通爱好者，都能从中受益。课程不仅提供了清晰的理论讲解，还强调动手实践，让用户在循序渐进中建立扎实的技能基础。其独特的亮点在于强大的多语言支持，通过自动化机制提供了包括简体中文在内的 50 多种语言版本，极大地降低了全球不同背景用户的学习门槛。此外，项目采用开源协作模式，社区活跃且内容持续更新，确保学习者能获取前沿且准确的技术资讯。如果你正寻找一条清晰、友好且专业的机器学习入门之路，ML-For-Beginners 
将是理想的起点。",84991,"2026-04-05T10:45:23",[14,51,52,53,15,54,26,13,55],"数据工具","视频","插件","其他","音频",{"id":57,"name":58,"github_repo":59,"description_zh":60,"stars":61,"difficulty_score":10,"last_commit_at":62,"category_tags":63,"status":16},3128,"ragflow","infiniflow\u002Fragflow","RAGFlow 是一款领先的开源检索增强生成（RAG）引擎，旨在为大语言模型构建更精准、可靠的上下文层。它巧妙地将前沿的 RAG 技术与智能体（Agent）能力相结合，不仅支持从各类文档中高效提取知识，还能让模型基于这些知识进行逻辑推理和任务执行。\n\n在大模型应用中，幻觉问题和知识滞后是常见痛点。RAGFlow 通过深度解析复杂文档结构（如表格、图表及混合排版），显著提升了信息检索的准确度，从而有效减少模型“胡编乱造”的现象，确保回答既有据可依又具备时效性。其内置的智能体机制更进一步，使系统不仅能回答问题，还能自主规划步骤解决复杂问题。\n\n这款工具特别适合开发者、企业技术团队以及 AI 研究人员使用。无论是希望快速搭建私有知识库问答系统，还是致力于探索大模型在垂直领域落地的创新者，都能从中受益。RAGFlow 提供了可视化的工作流编排界面和灵活的 API 接口，既降低了非算法背景用户的上手门槛，也满足了专业开发者对系统深度定制的需求。作为基于 Apache 2.0 协议开源的项目，它正成为连接通用大模型与行业专有知识之间的重要桥梁。",77062,"2026-04-04T04:44:48",[15,14,13,26,54],{"id":65,"github_repo":66,"name":67,"description_en":68,"description_zh":69,"ai_summary_zh":69,"readme_en":70,"readme_zh":71,"quickstart_zh":72,"use_case_zh":73,"hero_image_url":74,"owner_login":75,"owner_name":76,"owner_avatar_url":77,"owner_bio":78,"owner_company":79,"owner_location":80,"owner_email":81,"owner_twitter":82,"owner_website":83,"owner_url":84,"languages":85,"stars":86,"forks":87,"last_commit_at":88,"license":89,"difficulty_score":90,"env_os":91,"env_gpu":92,"env_ram":92,"env_deps":93,"category_tags":99,"github_topics":100,"view_count":23,"oss_zip_url":85,"oss_zip_packed_at":85,"status":16,"created_at":114,"updated_at":115,"faqs":116,"releases":147},3058,"ZhiningLiu1998\u002Fawesome-imbalanced-learning","awesome-imbalanced-learning","😎 Everything about class-imbalanced\u002Flong-tail learning: papers, codes, frameworks, and libraries | 有关类别不平衡\u002F长尾学习的一切：论文、代码、框架与库","awesome-imbalanced-learning 是一个专注于解决机器学习中“类别不平衡”或“长尾分布”问题的精选资源库。在现实世界的分类任务中，数据往往分布不均，例如欺诈检测、罕见病预测等场景，少数类样本极少而多数类样本极多。若直接训练模型，往往会导致预测偏差和性能下降。该项目旨在帮助开发者和研究人员从这些不平衡数据中学习出更公正、准确的模型。\n\n它系统地整理了该领域的高质量学术论文、开源代码、主流框架及工具库。内容按编程语言和研究方向进行了清晰分类，并严格筛选那些具有高影响力或发表于顶级会议期刊的成果。除了提供文献指引，项目还特别推荐了如 imbalanced-ensemble 
等实用的 Python 工具箱，方便用户快速上手实践。\n\n无论是正在攻克长尾难题的算法研究员，还是需要处理非均衡数据的工程开发者，都能在这里找到前沿的理论支持和现成的解决方案。作为一个持续更新的社区驱动项目，awesome-imbalanced-learning 致力于成为连接理论与实践的桥梁，让处理复杂数据分布变得更加高效简单。","\u003C!-- \u003Ch1 align=\"center\"> Awesome Imbalanced Learning \u003C\u002Fh1> -->\n\n\u003C!-- ![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FZhiningLiu1998_awesome-imbalanced-learning_readme_83410963e56f.png) -->\n\n![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FZhiningLiu1998_awesome-imbalanced-learning_readme_049aaf2cf98d.png)\n\n\u003Ch2 align=\"center\"> Curated imbalanced learning papers, codes, and libraries \u003C\u002Fh2>\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"https:\u002F\u002Fawesome.re\u002Fbadge.svg\">\n  \u003C!-- \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fawesome-imbalanced-learning\">\n    \u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FImbalanced-Learning-orange\">\n  \u003C\u002Fa> -->\n  \u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FZhiningLiu1998\u002Fawesome-imbalanced-learning\">\n  \u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fforks\u002FZhiningLiu1998\u002Fawesome-imbalanced-learning\">\n  \u003C!-- \u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fissues\u002FZhiningLiu1998\u002Fawesome-imbalanced-learning\"> -->\n  \u003C!-- ALL-CONTRIBUTORS-BADGE:START - Do not remove or modify this section -->\n\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fawesome-imbalanced-learning#contributors-\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fall_contributors-4-orange.svg\">\u003C\u002Fa>\n\u003C!-- ALL-CONTRIBUTORS-BADGE:END -->\n  \u003C!-- \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fawesome-imbalance d-learning\u002Fgraphs\u002Ftraffic\">\n    \u003Cimg 
src=\"https:\u002F\u002Fvisitor-badge.glitch.me\u002Fbadge?page_id=ZhiningLiu1998.awesome-imbalanced-learning&left_text=Hi!%20visitors\">\n  \u003C\u002Fa> -->\n  \u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Flicense\u002FZhiningLiu1998\u002Fawesome-imbalanced-learning\">\n  \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fimbalanced-ensemble\">\n    \u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FPython Toolbox-IMBENS-blueviolet\">\n  \u003C\u002Fa>\n\u003C\u002Fp>\n\n\u003Ch3 align=\"center\">\u003Cb>\n  Language: [\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fawesome-imbalanced-learning\">English\u003C\u002Fa>] [\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fawesome-imbalanced-learning\u002Fblob\u002Fmaster\u002FREADME_CN.md\">中文\u003C\u002Fa>]\n\u003C\u002Fb>\u003C\u002Fh3>\n\n\u003C!-- **A curated list of imbalanced learning papers, codes, frameworks and libraries.** -->\n\n**Class-imbalance (also known as the long-tail problem)** is the fact that the classes are not represented equally in a classification problem, which is quite common in practice. For instance, fraud detection, prediction of rare adverse drug reactions, and prediction of gene families. Failure to account for class imbalance often causes inaccurate predictions and decreased performance of many classification algorithms. **Imbalanced learning aims to tackle the class imbalance problem to learn an unbiased model from imbalanced data.**\n\n**Inspired by [awesome-machine-learning](https:\u002F\u002Fgithub.com\u002Fjosephmisiti\u002Fawesome-machine-learning). 
In this repository:**\n\n- **Frameworks** and **libraries** are grouped by *programming language*.\n- **Research papers** are grouped by *research field*.\n\n**Note:**\n\n- ⭐ **Please leave a \u003Cfont color='orange'>STAR\u003C\u002Ffont> if you like this project!** ⭐\n- Contribute and you will appear in the [contributors✨](#contributors-)!\n- There are numerous papers in this field of research, so this list is not intended to be exhaustive.\n- We aim to keep only the \"awesome\" works that either *have a good impact* or have been *published in **reputed top conferences\u002Fjournals***.\n\n\u003Ch3>\n\u003Cfont color='red'>What's new: \u003C\u002Ffont>\n\u003C\u002Fh3>\n\n- Updated section [*Graph Learning*](#graph-learning).\n- Add a package [imbalanced-ensemble](https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fimbalanced-ensemble) [[Github](https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fimbalanced-ensemble)][[Documentation](https:\u002F\u002Fimbalanced-ensemble.readthedocs.io\u002F)].\n\n\u003C!-- **Disclosure:** Zhining Liu is an author on the following works: **[imbalanced-ensemble](https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fimbalanced-ensemble), [Self-paced Ensemble](https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fself-paced-ensemble), [MESA](https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fmesa)**.  
-->\n\n**Check out [Zhining](https:\u002F\u002Fgithub.com\u002FZhiningLiu1998)'s other open-source projects!**\n\n\u003Ctable style=\"font-size:15px;\">\n  \u003Ctr>\n    \u003C!-- \u003Ctd align=\"center\">\u003Ca href=\"http:\u002F\u002Fzhiningliu.com\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FZhiningLiu1998_awesome-imbalanced-learning_readme_c45cf0a4b7eb.png\" width=\"100px;\" alt=\"\"\u002F>\u003Cbr \u002F>\u003Csub>\u003Cb>Zhining Liu\u003C\u002Fb>\u003C\u002Fsub>\u003C\u002Fa>\u003C\u002Ftd> -->\n    \u003Ctd align=\"center\">\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fimbalanced-ensemble\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FZhiningLiu1998_awesome-imbalanced-learning_readme_0197a95696dd.png\" height=\"80px\" alt=\"\"\u002F>\u003Cbr \u002F>\u003Csub>\u003Cb>Imbalanced-Ensemble [PythonLib]\u003C\u002Fb>\u003C\u002Fsub>\u003C\u002Fa>\u003Cbr \u002F>\n      \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fimbalanced-ensemble\u002Fstargazers\">\n      \u003Cimg alt=\"GitHub stars\" src=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FZhiningLiu1998\u002Fimbalanced-ensemble?style=social\">\n      \u003C\u002Fa>\n    \u003C\u002Ftd>\n    \u003Ctd align=\"center\">\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fawesome-awesome-machine-learning\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FZhiningLiu1998_awesome-imbalanced-learning_readme_d9c9ff6342c5.png\" height=\"80px\" alt=\"\"\u002F>\u003Cbr \u002F>\u003Csub>\u003Cb>Machine Learning [Awesome]\u003C\u002Fb>\u003C\u002Fsub>\u003C\u002Fa>\u003Cbr \u002F>\n      \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fawesome-awesome-machine-learning\u002Fstargazers\">\n      \u003Cimg alt=\"GitHub stars\" 
src=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FZhiningLiu1998\u002Fawesome-awesome-machine-learning?style=social\">\n      \u003C\u002Fa>\n    \u003C\u002Ftd>\n    \u003Ctd align=\"center\">\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fself-paced-ensemble\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FZhiningLiu1998_awesome-imbalanced-learning_readme_39a92df0db43.png\" height=\"80px\" alt=\"\"\u002F>\u003Cbr \u002F>\u003Csub>\u003Cb>Self-paced Ensemble [ICDE]\u003C\u002Fb>\u003C\u002Fsub>\u003C\u002Fa>\u003Cbr \u002F>\n      \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fself-paced-ensemble\u002Fstargazers\">\n      \u003Cimg alt=\"GitHub stars\" src=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FZhiningLiu1998\u002Fself-paced-ensemble?style=social\">\n      \u003C\u002Fa>\n    \u003C\u002Ftd>\n    \u003Ctd align=\"center\">\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fmesa\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FZhiningLiu1998_awesome-imbalanced-learning_readme_fbf27861f543.png\" height=\"80px\" alt=\"\"\u002F>\u003Cbr \u002F>\u003Csub>\u003Cb>Meta-Sampler [NeurIPS]\u003C\u002Fb>\u003C\u002Fsub>\u003C\u002Fa>\u003Cbr \u002F>\n      \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fmesa\u002Fstargazers\">\n      \u003Cimg alt=\"GitHub stars\" src=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FZhiningLiu1998\u002Fmesa?style=social\">\n      \u003C\u002Fa>\n    \u003C\u002Ftd>\n  \u003C\u002Ftr>\n\u003C\u002Ftable>\n\n# Table of Contents\n\n- [Table of Contents](#table-of-contents)\n- [1. Frameworks and Libraries](#1-frameworks-and-libraries)\n    - [1.1 Python](#11-python)\n    - [1.2 R](#12-r)\n    - [1.3 Java](#13-java)\n    - [1.4 Scala](#14-scala)\n    - [1.5 Julia](#15-julia)\n- [2. 
Research Papers](#2-research-papers)\n  - [2.1 Surveys](#21-surveys)\n  - [2.2 Ensemble Learning](#22-ensemble-learning)\n      - [2.2.1 *General ensemble*](#221-general-ensemble)\n      - [2.2.2 *Boosting-based*](#222-boosting-based)\n      - [2.2.3 *Bagging-based*](#223-bagging-based)\n      - [2.2.4 *Cost-sensitive ensemble*](#224-cost-sensitive-ensemble)\n  - [2.3 Data resampling](#23-data-resampling)\n      - [2.3.1 *Over-sampling*](#231-over-sampling)\n      - [2.3.2 *Under-sampling*](#232-under-sampling)\n      - [2.3.3 *Hybrid-sampling*](#233-hybrid-sampling)\n  - [2.4 Cost-sensitive Learning](#24-cost-sensitive-learning)\n  - [2.5 Deep Learning](#25-deep-learning)\n      - [2.5.1 *Surveys*](#251-surveys)\n      - [2.5.2 *Graph Data Mining*](#252-graph-data-mining)\n      - [2.5.3 *Hard example mining*](#253-hard-example-mining)\n      - [2.5.4 *Loss function engineering*](#254-loss-function-engineering)\n      - [2.5.5 *Meta-learning*](#255-meta-learning)\n      - [2.5.6 *Representation Learning*](#256-representation-learning)\n      - [2.5.7 *Posterior Recalibration*](#257-posterior-recalibration)\n      - [2.5.8 *Semi\u002FSelf-supervised Learning*](#258-semiself-supervised-learning)\n      - [2.5.9 *Curriculum Learning*](#259-curriculum-learning)\n      - [2.5.10 *Two-phase Training*](#2510-two-phase-training)\n      - [2.5.11 *Network Architecture*](#2511-network-architecture)\n      - [2.5.12 *Deep Generative Model*](#2512-deep-generative-model)\n      - [2.5.13 *Imbalanced Regression*](#2513-imbalanced-regression)\n      - [2.5.14 *Data Augmentation*](#2514-data-augmentation)\n- [3. Miscellaneous](#3-miscellaneous)\n  - [3.1 Datasets](#31-datasets)\n  - [3.2 Github Repositories](#32-github-repositories)\n    - [3.2.1 *Algorithms \\& Utilities \\& Jupyter Notebooks*](#321-algorithms--utilities--jupyter-notebooks)\n    - [3.2.2 *Paper list*](#322-paper-list)\n    - [3.2.3 *Slides*](#323-slides)\n- [Contributors ✨](#contributors-)\n\n# 1. 
Frameworks and Libraries\n\n### 1.1 Python\n\n- [**imbalanced-ensemble**](https:\u002F\u002Fimbalanced-ensemble.readthedocs.io\u002F) [[**Github**](https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fimbalanced-ensemble)][[**Documentation**](https:\u002F\u002Fimbalanced-ensemble.readthedocs.io\u002F)][[**Gallery**](https:\u002F\u002Fimbalanced-ensemble.readthedocs.io\u002Fen\u002Flatest\u002Fauto_examples\u002Findex.html#)][[**Paper**](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2111.12776.pdf)]\n\n  > **NOTE:** written in Python, easy to use.\n  >\n\n  - `imbalanced-ensemble` is a Python toolbox for quickly implementing and deploying ***ensemble learning algorithms*** on class-imbalanced data. It features:\n    - (i) Unified, easy-to-use APIs, detailed [documentation](https:\u002F\u002Fimbalanced-ensemble.readthedocs.io\u002F) and [examples](https:\u002F\u002Fimbalanced-ensemble.readthedocs.io\u002Fen\u002Flatest\u002Fauto_examples\u002Findex.html#).\n    - (ii) Capable of multi-class imbalanced learning out of the box.\n    - (iii) Optimized performance with parallelization when possible using [joblib](https:\u002F\u002Fgithub.com\u002Fjoblib\u002Fjoblib).\n    - (iv) Powerful, customizable, interactive training logging and visualizer.\n    - (v) Full compatibility with other popular packages like [scikit-learn](https:\u002F\u002Fscikit-learn.org\u002Fstable\u002F) and [imbalanced-learn](https:\u002F\u002Fimbalanced-learn.org\u002Fstable\u002F).\n  - Currently (v0.1.4), it includes more than 15 ensemble algorithms based on ***re-sampling*** and ***cost-sensitive learning*** (e.g., *SMOTEBoost\u002FBagging, RUSBoost\u002FBagging, AdaCost, EasyEnsemble, BalanceCascade, SelfPacedEnsemble*, ...).\n- [**imbalanced-learn**](https:\u002F\u002Fimbalanced-learn.org\u002Fstable\u002F) 
[[**Github**](https:\u002F\u002Fgithub.com\u002Fscikit-learn-contrib\u002Fimbalanced-learn)][[**Documentation**](https:\u002F\u002Fimbalanced-learn.org\u002Fstable\u002F)][[**Paper**](https:\u002F\u002Fwww.jmlr.org\u002Fpapers\u002Fvolume18\u002F16-365\u002F16-365.pdf)]\n\n  > **NOTE:** written in Python, easy to use.\n  >\n\n  - `imbalanced-learn` is a Python package offering a number of ***re-sampling*** techniques commonly used in datasets showing strong between-class imbalance. It is compatible with [scikit-learn](https:\u002F\u002Fscikit-learn.org\u002Fstable\u002F) and is part of [scikit-learn-contrib](https:\u002F\u002Fgithub.com\u002Fscikit-learn-contrib) projects.\n  - Currently (v0.8.0), it includes 21 different re-sampling techniques, including over-sampling, under-sampling and hybrid ones (e.g., *SMOTE, ADASYN, TomekLinks, NearMiss, OneSideSelection, SMOTETomek*, ...).\n  - This package also provides many utilities, e.g., *Batch generator for Keras\u002FTensorFlow*, see [API reference](https:\u002F\u002Fimbalanced-learn.org\u002Fstable\u002Freferences\u002Findex.html#api).\n- [**smote_variants**](https:\u002F\u002Fsmote-variants.readthedocs.io\u002Fen\u002Flatest\u002F) [[**Documentation**](https:\u002F\u002Fsmote-variants.readthedocs.io\u002Fen\u002Flatest\u002F)][[**Github**](https:\u002F\u002Fgithub.com\u002Fanalyticalmindsltd\u002Fsmote_variants)] - A collection of 85 minority ***over-sampling*** techniques for imbalanced learning with multi-class oversampling and model selection features (All written in Python, also supporting R and Julia).\n\n### 1.2 R\n\n- [**smote_variants**](https:\u002F\u002Fsmote-variants.readthedocs.io\u002Fen\u002Flatest\u002F) [[**Documentation**](https:\u002F\u002Fsmote-variants.readthedocs.io\u002Fen\u002Flatest\u002F)][[**Github**](https:\u002F\u002Fgithub.com\u002Fanalyticalmindsltd\u002Fsmote_variants)] - A collection of 85 minority ***over-sampling*** techniques for imbalanced learning with multi-class oversampling and model selection features (All written in Python, also supporting R and Julia).\n- [**caret**](https:\u002F\u002Fcran.r-project.org\u002Fweb\u002Fpackages\u002Fcaret\u002Findex.html) [[**Documentation**](http:\u002F\u002Ftopepo.github.io\u002Fcaret\u002Findex.html)][[**Github**](https:\u002F\u002Fgithub.com\u002Ftopepo\u002Fcaret)] - Contains the implementation of Random under\u002Fover-sampling.\n- [**ROSE**](https:\u002F\u002Fcran.r-project.org\u002Fweb\u002Fpackages\u002FROSE\u002Findex.html) [[**Documentation**](https:\u002F\u002Fwww.rdocumentation.org\u002Fpackages\u002FROSE\u002Fversions\u002F0.0-3)] - Contains the implementation of [ROSE](https:\u002F\u002Fjournal.r-project.org\u002Farchive\u002F2014-1\u002Fmenardi-lunardon-torelli.pdf) (Random Over-Sampling Examples).\n- [**DMwR**](https:\u002F\u002Fcran.r-project.org\u002Fweb\u002Fpackages\u002FDMwR\u002Findex.html) [[**Documentation**](https:\u002F\u002Fwww.rdocumentation.org\u002Fpackages\u002FDMwR\u002Fversions\u002F0.4.1)] - Contains the implementation of [SMOTE](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1106.1813.pdf) (Synthetic Minority Over-sampling TEchnique).\n\n### 1.3 Java\n\n- [**KEEL**](https:\u002F\u002Fsci2s.ugr.es\u002Fkeel\u002Fdescription.php) [[**Github**](https:\u002F\u002Fgithub.com\u002FSCI2SUGR\u002FKEEL)][[**Paper**](https:\u002F\u002Fsci2s.ugr.es\u002Fsites\u002Fdefault\u002Ffiles\u002FficherosPublicaciones\u002F0758_Alcalaetal-SoftComputing-Keel1.0.pdf)] - KEEL provides a simple ***GUI based*** on data flow to design experiments with different datasets and computational intelligence algorithms (***paying special attention to evolutionary algorithms***) in order to assess the behavior of the algorithms. 
This tool includes many widely used imbalanced learning techniques such as (evolutionary) over\u002Funder-resampling, cost-sensitive learning, algorithm modification, and ensemble learning methods.\n\n  > **NOTE:** wide variety of classical classification, regression, and preprocessing algorithms included.\n  >\n\n### 1.4 Scala\n\n- [**undersampling**](https:\u002F\u002Fgithub.com\u002FNestorRV\u002Fundersampling) [[**Documentation**](https:\u002F\u002Fnestorrv.github.io\u002F)][[**Github**](https:\u002F\u002Fgithub.com\u002FNestorRV\u002Fundersampling)] - A Scala library for ***under-sampling methods and their ensemble variants*** in imbalanced classification.\n\n### 1.5 Julia\n\n- [**smote_variants**](https:\u002F\u002Fsmote-variants.readthedocs.io\u002Fen\u002Flatest\u002F) [[**Documentation**](https:\u002F\u002Fsmote-variants.readthedocs.io\u002Fen\u002Flatest\u002F)][[**Github**](https:\u002F\u002Fgithub.com\u002Fanalyticalmindsltd\u002Fsmote_variants)] - A collection of 85 minority ***over-sampling*** techniques for imbalanced learning with multi-class oversampling and model selection features (All written in Python, also supporting R and Julia).\n\n# 2. Research Papers\n\n## 2.1 Surveys\n\n- **Learning from imbalanced data (IEEE TKDE, 2009, 6000+ citations) [[**Paper**](https:\u002F\u002Fwww.sci-hub.shop\u002F10.1109\u002Ftkde.2008.239)]**\n\n  - Highly cited, classic survey paper. 
It systematically reviewed the popular solutions, evaluation metrics, and challenging problems in future research in this area (as of 2009).\n- **Learning from imbalanced data: open challenges and future directions (2016, 900+ citations) [[**Paper**](https:\u002F\u002Fwww.researchgate.net\u002Fpublication\u002F301596547_Learning_from_imbalanced_data_Open_challenges_and_future_directions)]**\n\n  - This paper concentrates on the open issues and challenges in imbalanced learning, i.e., extreme class imbalance, imbalance in online\u002Fstream learning, multi-class imbalanced learning, and semi\u002Fun-supervised imbalanced learning.\n- **Learning from class-imbalanced data: Review of methods and applications (2017, 900+ citations) [[**Paper**](https:\u002F\u002Fwww.researchgate.net\u002Fpublication\u002F311977198_Learning_from_class-imbalanced_data_Review_of_methods_and_applications)]**\n\n  - A recent exhaustive survey of imbalanced learning methods and applications, a total of 527 papers were included in this study. 
It provides several detailed taxonomies of existing methods and also the recent trend of this research area.\n\n## 2.2 Ensemble Learning\n\n#### 2.2.1 *General ensemble*\n\n\u003C!-- - **General ensemble** -->\n\n- **Self-paced Ensemble (ICDE 2020, 20+ citations) [[**Paper**](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1909.03500v3.pdf)][[**Code**](https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fself-paced-ensemble)][[**Slides**](https:\u002F\u002Fzhiningliu.com\u002Ffiles\u002FICDE_2020_self_paced_ensemble_slides.pdf)][[**Zhihu\u002F知乎**](https:\u002F\u002Fzhuanlan.zhihu.com\u002Fp\u002F86891438)][[**PyPI**](https:\u002F\u002Fpypi.org\u002Fproject\u002Fself-paced-ensemble\u002F)]**\n\n  > **NOTE:** versatile solution with outstanding performance and computational efficiency.\n  >\n- **MESA: Boost Ensemble Imbalanced Learning with MEta-SAmpler (NeurIPS 2020) [[**Paper**](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2010.08830.pdf)][[**Code**](https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fmesa)][[**Video**](https:\u002F\u002Fstudio.slideslive.com\u002Fweb_recorder\u002Fshare\u002F20201020T134559Z__NeurIPS_posters__17343__mesa-effective-ensemble-imbal?s=d3745afc-cfcf-4d60-9f34-63d3d811b55f)][[**Zhihu\u002F知乎**](https:\u002F\u002Fzhuanlan.zhihu.com\u002Fp\u002F268539195)]**\n\n  > **NOTE:** learning an optimal sampling policy directly from data.\n  >\n- **Exploratory Undersampling for Class-Imbalance Learning (IEEE Trans. 
on SMC, 2008, 1300+ citations) [[**Paper**](https:\u002F\u002Fsci2s.ugr.es\u002Fkeel\u002Fpdf\u002Falgorithm\u002Farticulo\u002F2009-IEEE%20TSMCpartB%20Exploratory%20Undersampling%20for%20Class%20Imbalance%20Learning.pdf)]**\n\n  > **NOTE:** simple but effective solution.\n  >\n\n  - EasyEnsemble [[**Code**](https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fimbalanced-ensemble\u002Fblob\u002Fmain\u002Fimbalanced_ensemble\u002Fensemble\u002Funder_sampling\u002Feasy_ensemble.py)]\n  - BalanceCascade [[**Code**](https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fimbalanced-ensemble\u002Fblob\u002Fmain\u002Fimbalanced_ensemble\u002Fensemble\u002Funder_sampling\u002Fbalance_cascade.py)]\n\n- **The Effects of Ensembling on Long-Tailed Data (NeurIPS 2023 Heavy Tails Workshop) [[**Paper**](https:\u002F\u002Fopenreview.net\u002Fpdf?id=l4GYs60kre)]**\n\n  > **NOTE:** \n  > Adding more (>10) ensemble members continues to improve performance on imbalanced datasets.\n  > There are differences between logit and probability ensembling on imbalanced datasets, depending on the ensemble diversity and dependency. \n\n  - Logit vs. 
Probability Ensembles [[**Code**](https:\u002F\u002Fgithub.com\u002Fekellbuch\u002Flongtail_ensembles)]\n\n#### 2.2.2 *Boosting-based*\n\n\u003C!-- - **Boosting-based** -->\n\n- **AdaBoost (1995, 18700+ citations) [[**Paper**](https:\u002F\u002Fsci2s.ugr.es\u002Fkeel\u002Fpdf\u002Falgorithm\u002Farticulo\u002F1997-JCSS-Schapire-A%20Decision-Theoretic%20Generalization%20of%20On-Line%20Learning%20(AdaBoost).pdf)][[**Code**](https:\u002F\u002Fgithub.com\u002Fscikit-learn\u002Fscikit-learn\u002Fblob\u002F95d4f0841\u002Fsklearn\u002Fensemble\u002F_weight_boosting.py#L285)]** - Adaptive Boosting with C4.5\n- **DataBoost (2004, 570+ citations) [[**Paper**](https:\u002F\u002Fsci2s.ugr.es\u002Fkeel\u002Fpdf\u002Falgorithm\u002Farticulo\u002F2004-SIGKDD-GuoViktor.pdf)]** - Boosting with Data Generation for Imbalanced Data\n- **SMOTEBoost (2003, 1100+ citations) [[**Paper**]](https:\u002F\u002Fsci2s.ugr.es\u002Fkeel\u002Fpdf\u002Falgorithm\u002Fcongreso\u002F2003-PKDD-SMOTEBoost-ChawlaLazarevicHallBowyer.pdf)[[**Code**](https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fimbalanced-ensemble\u002Fblob\u002Fmain\u002Fimbalanced_ensemble\u002Fensemble\u002Fover_sampling\u002Fsmote_bagging.py)]** - Synthetic Minority Over-sampling TEchnique Boosting\n- **MSMOTEBoost (2011, 1300+ citations) [[**Paper**](https:\u002F\u002Fsci2s.ugr.es\u002Fkeel\u002Fpdf\u002Falgorithm\u002Farticulo\u002F2011-IEEE%20TSMC%20partC-%20GalarFdezBarrenecheaBustinceHerrera.pdf)]** - Modified Synthetic Minority Over-sampling TEchnique Boosting\n- **RAMOBoost (2010, 140+ citations) [[**Paper**](https:\u002F\u002Fwww.ele.uri.edu\u002Ffaculty\u002Fhe\u002FPDFfiles\u002Framoboost.pdf)] [[**Code**](https:\u002F\u002Fgithub.com\u002Fdialnd\u002Fimbalanced-algorithms\u002Fblob\u002Fmaster\u002Framo.py#L133)]** - Ranked Minority Over-sampling in Boosting\n- **RUSBoost (2009, 850+ citations) 
[[**Paper**](https:\u002F\u002Fsci2s.ugr.es\u002Fkeel\u002Fpdf\u002Falgorithm\u002Farticulo\u002F2010-IEEE%20TSMCpartA-RUSBoost%20A%20Hybrid%20Approach%20to%20Alleviating%20Class%20Imbalance.pdf)] [[**Code**](https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fimbalanced-ensemble\u002Fblob\u002Fmain\u002Fimbalanced_ensemble\u002Fensemble\u002Funder_sampling\u002Frus_boost.py)]** - Random Under-Sampling Boosting\n- **AdaBoostNC (2012, 350+ citations) [[**Paper**](https:\u002F\u002Fsci2s.ugr.es\u002Fkeel\u002Fpdf\u002Falgorithm\u002Farticulo\u002F2012-wang-IEEE_SMC_B.pdf)]** - Adaptive Boosting with Negative Correlation Learning\n- **EUSBoost (2013, 210+ citations) [[**Paper**](https:\u002F\u002Fsci2s.ugr.es\u002Fkeel\u002Fpdf\u002Falgorithm\u002Farticulo\u002F2013-galar-PR.pdf)]** - Evolutionary Under-sampling in Boosting\n\n#### 2.2.3 *Bagging-based*\n\n\u003C!-- - **Bagging-based** -->\n\n- **Bagging (1996, 20000+ citations) [[**Paper**](https:\u002F\u002Fsci2s.ugr.es\u002Fkeel\u002Fpdf\u002Falgorithm\u002Farticulo\u002F1996-ML-Breiman-Bagging%20Predictors.pdf)][[**Code**](https:\u002F\u002Fgithub.com\u002Fscikit-learn\u002Fscikit-learn\u002Fblob\u002F95d4f0841\u002Fsklearn\u002Fensemble\u002F_bagging.py#L433)]** - Bagging predictor\n- **Diversity Analysis on Imbalanced Data Sets by Using Ensemble Models (2009, 400+ citations) [[**Paper**](https:\u002F\u002Fsci2s.ugr.es\u002Fkeel\u002Fpdf\u002Falgorithm\u002Fcongreso\u002F2009-IEEE%20CIDM-WangYao.pdf)]**\n\n  - **UnderBagging** [[**Code**](https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fimbalanced-ensemble\u002Fblob\u002Fmain\u002Fimbalanced_ensemble\u002Fensemble\u002Funder_sampling\u002Funder_bagging.py)]\n  - **OverBagging** [[**Code**](https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fimbalanced-ensemble\u002Fblob\u002Fmain\u002Fimbalanced_ensemble\u002Fensemble\u002Fover_sampling\u002Fover_bagging.py)]\n  - **SMOTEBagging** 
[[**Code**](https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fimbalanced-ensemble\u002Fblob\u002Fmain\u002Fimbalanced_ensemble\u002Fensemble\u002Fover_sampling\u002Fsmote_bagging.py)]\n\n#### 2.2.4 *Cost-sensitive ensemble*\n\n\u003C!-- - **Cost-sensitive ensemble** -->\n\n- **AdaCost (ICML 1999, 800+ citations) [[**Paper**](https:\u002F\u002Fwww.researchgate.net\u002Fprofile\u002FSalvatore-Stolfo\u002Fpublication\u002F2628569_AdaCost_Misclassification_Cost-sensitive_Boosting\u002Flinks\u002F0fcfd50ca581d7016f000000\u002FAdaCost-Misclassification-Cost-sensitive-Boosting.pdf)][[**Code**](https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fimbalanced-ensemble\u002Fblob\u002Fmain\u002Fimbalanced_ensemble\u002Fensemble\u002Freweighting\u002Fadacost.py)]** - Misclassification Cost-sensitive boosting\n- **AdaUBoost (NIPS 1999, 100+ citations) [[**Paper**](https:\u002F\u002Fproceedings.neurips.cc\u002Fpaper\u002F1998\u002Ffile\u002Fdf12ecd077efc8c23881028604dbb8cc-Paper.pdf)][[**Code**](https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fimbalanced-ensemble\u002Fblob\u002Fmain\u002Fimbalanced_ensemble\u002Fensemble\u002Freweighting\u002Fadauboost.py)]** - AdaBoost with Unequal loss functions\n- **AsymBoost (NIPS 2001, 700+ citations) [[**Paper**](https:\u002F\u002Fwww.researchgate.net\u002Fprofile\u002FMichael-Jones-66\u002Fpublication\u002F2539888_Fast_and_Robust_Classification_using_Asymmetric_AdaBoost_and_a_Detector_Cascade\u002Flinks\u002F540731780cf23d9765a83ec1\u002FFast-and-Robust-Classification-using-Asymmetric-AdaBoost-and-a-Detector-Cascade.pdf)][[**Code**](https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fimbalanced-ensemble\u002Fblob\u002Fmain\u002Fimbalanced_ensemble\u002Fensemble\u002Freweighting\u002Fasymmetric_boost.py)]** - Asymmetric AdaBoost and detector cascade\n\n## 2.3 Data resampling\n\n#### 2.3.1 *Over-sampling*\n\n\u003C!-- - **Over-sampling** -->\n\n- **ROS 
## 2.3 Data resampling

#### 2.3.1 *Over-sampling*

<!-- - **Over-sampling** -->

- **ROS [[**Code**](https://github.com/scikit-learn-contrib/imbalanced-learn/blob/master/imblearn/over_sampling/_random_over_sampler.py)]** - Random Over-sampling
- **SMOTE (2002, 9800+ citations) [[**Paper**](https://arxiv.org/pdf/1106.1813.pdf)][[**Code**](https://github.com/scikit-learn-contrib/imbalanced-learn/blob/master/imblearn/over_sampling/_smote.py#L36)]** - Synthetic Minority Over-sampling TEchnique
- **Borderline-SMOTE (2005, 1400+ citations) [[**Paper**](https://sci2s.ugr.es/keel/keel-dataset/pdfs/2005-Han-LNCS.pdf)][[**Code**](https://github.com/scikit-learn-contrib/imbalanced-learn/blob/master/imblearn/over_sampling/_smote.py#L220)]** - Borderline-Synthetic Minority Over-sampling TEchnique
- **ADASYN (2008, 1100+ citations) [[**Paper**](https://sci2s.ugr.es/keel/pdf/algorithm/congreso/2008-He-ieee.pdf)][[**Code**](https://github.com/scikit-learn-contrib/imbalanced-learn/blob/master/imblearn/over_sampling/_adasyn.py)]** - ADAptive SYNthetic Sampling
- **SPIDER (2008, 150+ citations) [[**Paper**](https://sci2s.ugr.es/keel/pdf/algorithm/congreso/stefanowski_selective_2008.pdf)][[**Code(Java)**](https://github.com/SCI2SUGR/KEEL/blob/master/src/keel/Algorithms/ImbalancedClassification/Resampling/SPIDER/SPIDER.java#L57)]** - Selective Preprocessing of Imbalanced Data
- **Safe-Level-SMOTE (2009, 370+ citations) [[**Paper**](http://150.214.190.154/keel/keel-dataset/pdfs/2009-Bunkhumpornpat-LNCS.pdf)][[**Code(Java)**](https://github.com/SCI2SUGR/KEEL/blob/master/src/keel/Algorithms/ImbalancedClassification/Resampling/Safe_Level_SMOTE/Safe_Level_SMOTE.java#L58)]** - Safe Level Synthetic Minority Over-sampling TEchnique
- **SVM-SMOTE (2009, 120+ citations) [[**Paper**](https://ousar.lib.okayama-u.ac.jp/files/public/1/19617/20160528004522391723/IWCIA2009_A1005.pdf)][[**Code**](https://github.com/scikit-learn-contrib/imbalanced-learn/blob/master/imblearn/over_sampling/_smote.py#L417)]** - SMOTE based on Support Vectors of SVM
- **MDO (2015, 150+ citations) [[**Paper**](https://ieeexplore.ieee.org/abstract/document/7163639)][[**Code**](https://github.com/analyticalmindsltd/smote_variants/blob/dedbc3d00b266954fedac0ae87775e1643bc920a/smote_variants/_smote_variants.py#L14513)]** - Mahalanobis Distance-based Over-sampling for *Multi-Class* imbalanced problems.

> **NOTE:** See more over-sampling methods at [**smote-variants**](https://github.com/analyticalmindsltd/smote_variants#references).
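SMOTE, which most of the variants above extend, generates synthetic minority samples by linear interpolation between a minority point and one of its k nearest minority neighbors: `x_new = x + u * (x_nn - x)` with `u ~ Uniform(0, 1)`. A minimal stdlib-only sketch of that interpolation step (illustrative only; see the linked `imbalanced-learn` source for a full implementation):

```python
import math
import random

def smote(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic minority samples: pick a seed point,
    pick one of its k nearest minority neighbors, and interpolate
    uniformly at random along the segment between them."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        neighbors = sorted((p for p in minority if p is not x),
                           key=lambda p: math.dist(x, p))[:k]
        nn = rng.choice(neighbors)
        u = rng.random()
        synthetic.append(tuple(xi + u * (ni - xi) for xi, ni in zip(x, nn)))
    return synthetic

# Four 2-D minority points; generate six synthetic ones nearby.
minority = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.3), (1.1, 1.1)]
new = smote(minority, n_new=6)
print(len(new))
```

Each synthetic point is a convex combination of two real minority points, so all generated samples stay inside the minority neighborhood rather than being copied verbatim as in random over-sampling.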
#### 2.3.2 *Under-sampling*

<!-- - **Under-sampling** -->

- **RUS [[**Code**](https://github.com/scikit-learn-contrib/imbalanced-learn/blob/master/imblearn/under_sampling/_prototype_selection/_random_under_sampler.py)]** - Random Under-sampling
- **CNN (1968, 2100+ citations) [[**Paper**](https://pdfs.semanticscholar.org/7c37/71fd6829630cf450af853df728ecd8da4ab2.pdf)][[**Code**](https://github.com/scikit-learn-contrib/imbalanced-learn/blob/master/imblearn/under_sampling/_prototype_selection/_condensed_nearest_neighbour.py)]** - Condensed Nearest Neighbor
- **ENN (1972, 1500+ citations) [[**Paper**](https://sci2s.ugr.es/keel/dataset/includes/catImbFiles/1972-Wilson-IEEETSMC.pdf)] [[**Code**](https://github.com/scikit-learn-contrib/imbalanced-learn/blob/master/imblearn/under_sampling/_prototype_selection/_edited_nearest_neighbours.py)]** - Edited Nearest Neighbor
- **TomekLink (1976, 870+ citations) [[**Paper**](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=4309452)][[**Code**](https://github.com/scikit-learn-contrib/imbalanced-learn/blob/master/imblearn/under_sampling/_prototype_selection/_tomek_links.py)]** - Tomek's modification of Condensed Nearest Neighbor
- **NCR (2001, 500+ citations) [[**Paper**](https://sci2s.ugr.es/keel/pdf/algorithm/congreso/2001-Laurikkala-LNCS.pdf)][[**Code**](https://github.com/scikit-learn-contrib/imbalanced-learn/blob/master/imblearn/under_sampling/_prototype_selection/_neighbourhood_cleaning_rule.py)]** - Neighborhood Cleaning Rule
- **NearMiss-1 & 2 & 3 (2003, 420+ citations) [[**Paper**](https://sci2s.ugr.es/keel/pdf/specific/congreso/jzhang.pdf)][[**Code**](https://github.com/scikit-learn-contrib/imbalanced-learn/blob/master/imblearn/under_sampling/_prototype_selection/_nearmiss.py)]** - Several kNN approaches to unbalanced data distributions.
- **CNN with TomekLink (2004, 2000+ citations) [[**Paper**](https://storm.cis.fordham.edu/~gweiss/selected-papers/batista-study-balancing-training-data.pdf)][[**Code(Java)**](https://github.com/SCI2SUGR/KEEL/blob/master/src/keel/Algorithms/ImbalancedClassification/Resampling/CNN_TomekLinks/CNN_TomekLinks.java#L58)]** - Condensed Nearest Neighbor + TomekLink
- **OSS (2007, 2100+ citations) [[**Paper**](https://sci2s.ugr.es/keel/pdf/specific/congreso/kubat97addressing.pdf)][[**Code**](https://github.com/scikit-learn-contrib/imbalanced-learn/blob/master/imblearn/under_sampling/_prototype_selection/_one_sided_selection.py)]** - One-Sided Selection
- **EUS (2009, 290+ citations) [[**Paper**](https://www.mitpressjournals.org/doi/pdfplus/10.1162/evco.2009.17.3.275)]** - Evolutionary Under-sampling
- **IHT (2014, 130+ citations) [[**Paper**](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.649.8727&rep=rep1&type=pdf)][[**Code**](https://github.com/scikit-learn-contrib/imbalanced-learn/blob/master/imblearn/under_sampling/_prototype_selection/_instance_hardness_threshold.py)]** - Instance Hardness Threshold
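Several of the methods above (TomekLink, OSS, and the hybrid SMOTE-Tomek below) rely on detecting Tomek links: pairs of opposite-class points that are each other's nearest neighbor, which typically sit on the class boundary or are label noise. A stdlib-only sketch of the detection step (illustrative, not the `imbalanced-learn` implementation):

```python
import math

def tomek_links(X, y):
    """Return pairs (i, j) with different labels where each point is
    the other's nearest neighbor. Removing the majority member of each
    link (or both members) cleans the class boundary."""
    def nearest(i):
        return min((j for j in range(len(X)) if j != i),
                   key=lambda j: math.dist(X[i], X[j]))
    links = set()
    for i in range(len(X)):
        j = nearest(i)
        if y[i] != y[j] and nearest(j) == i:
            links.add(tuple(sorted((i, j))))
    return sorted(links)

# Two well-separated clusters plus one overlapping opposite-class pair.
X = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (2.0, 2.0), (2.05, 2.0)]
y = [0, 0, 1, 1, 0, 1]
print(tomek_links(X, y))  # only the overlapping pair forms a link
```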
#### 2.3.3 *Hybrid-sampling*

<!-- - **Hybrid-sampling** -->

- **A Study of the Behavior of Several Methods for Balancing Training Data (2004, 2000+ citations) [[**Paper**](https://www.researchgate.net/profile/Ronaldo-Prati/publication/220520041_A_Study_of_the_Behavior_of_Several_Methods_for_Balancing_machine_Learning_Training_Data/links/0d22cd91c989507054a2cf3b/A-Study-of-the-Behavior-of-Several-Methods-for-Balancing-machine-Learning-Training-Data.pdf)]**

  > **NOTE:** Extensive experimental evaluation involving 10 different over-/under-sampling methods.

  - **SMOTE-Tomek [[**Code**](https://github.com/scikit-learn-contrib/imbalanced-learn/blob/master/imblearn/combine/_smote_tomek.py)]**
  - **SMOTE-ENN [[**Code**](https://github.com/scikit-learn-contrib/imbalanced-learn/blob/master/imblearn/combine/_smote_enn.py)]**
- **SMOTE-RSB (2012, 210+ citations) [[**Paper**](https://sci2s.ugr.es/sites/default/files/ficherosPublicaciones/1434_2012-Ramentol-KAIS.pdf)][[**Code**](https://smote-variants.readthedocs.io/en/latest/_modules/smote_variants/_smote_variants.html#SMOTE_RSB)]** - Hybrid Preprocessing using SMOTE and Rough Sets Theory
- **SMOTE-IPF (2015, 180+ citations) [[**Paper**](https://sci2s.ugr.es/sites/default/files/ficherosPublicaciones/1824_2015-INS-Saez.pdf)][[**Code**](https://smote-variants.readthedocs.io/en/latest/_modules/smote_variants/_smote_variants.html#SMOTE_IPF)]** - SMOTE with Iterative-Partitioning Filter

## 2.4 Cost-sensitive Learning

- **CSC4.5 (2002, 420+ citations) [[**Paper**](https://www.sci-hub.shop/10.1109/tkde.2002.1000348)][[**Code(Java)**](https://github.com/SCI2SUGR/KEEL/blob/master/src/keel/Algorithms/ImbalancedClassification/CSMethods/C45CS/C45CS.java#L48)]** - An instance-weighting method to induce cost-sensitive trees
- **CSSVM (2008, 710+ citations) [[**Paper**](https://sci2s.ugr.es/keel/pdf/algorithm/articulo/2009-Chawla-IEEE_TSMCB-svm-imbalance.pdf)][[**Code(Java)**](https://github.com/SCI2SUGR/KEEL/blob/master/src/keel/Algorithms/ImbalancedClassification/CSMethods/C_SVMCost/svmClassifierCost.java#L60)]** - Cost-sensitive SVMs for highly imbalanced classification
- **CSNN (2005, 950+ citations) [[**Paper**](https://sci2s.ugr.es/keel/pdf/algorithm/articulo/2006%20-%20IEEE_TKDE%20-%20Zhou_Liu.pdf)][[**Code(Java)**](https://github.com/SCI2SUGR/KEEL/blob/master/src/keel/Algorithms/ImbalancedClassification/CSMethods/MLPerceptronBackpropCS/MLPerceptronBackpropCS.java#L49)]** - Training cost-sensitive neural networks with methods addressing the class imbalance problem.
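Cost-sensitive learning can also be applied purely at decision time: with false-positive cost C_FP and false-negative cost C_FN, the cost-minimizing rule predicts the positive (minority) class whenever p(y=1|x) · C_FN > (1 − p) · C_FP, i.e. when p > C_FP / (C_FP + C_FN). A small sketch of this threshold-shifting view (illustrative; the methods above instead bake costs into training):

```python
def cost_sensitive_threshold(c_fp, c_fn):
    """Bayes-optimal decision threshold under asymmetric costs: predict
    positive when p(y=1|x) > C_FP / (C_FP + C_FN). Symmetric costs
    recover the usual 0.5 cutoff."""
    return c_fp / (c_fp + c_fn)

def decide(prob_pos, c_fp=1.0, c_fn=1.0):
    return int(prob_pos > cost_sensitive_threshold(c_fp, c_fn))

# A 0.2-probability positive is rejected under symmetric costs, but
# accepted once missing a positive is 9x as costly as a false alarm.
print(decide(0.2))
print(decide(0.2, c_fp=1, c_fn=9))
```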
## 2.5 Deep Learning

#### 2.5.1 *Surveys*

<!-- - **Surveys** -->

- A systematic study of the class imbalance problem in convolutional neural networks (2018, 330+ citations) [[**Paper**](https://arxiv.org/pdf/1710.05381.pdf)]
- Survey on deep learning with class imbalance (2019, 50+ citations) [[**Paper**](https://www.researchgate.net/publication/332165523_Survey_on_deep_learning_with_class_imbalance)]

  > **NOTE:** A recent comprehensive survey of the class imbalance problem in deep learning.

#### 2.5.2 *Graph Data Mining*

<!-- - **Graph Neural Networks** -->

- Semi-Supervised Graph Imbalanced Regression (KDD 2023) [[**Paper**](https://arxiv.org/abs/2305.12087)] [[**Code**](https://github.com/liugangcode/SGIR)]
- TAM: Topology-Aware Margin Loss for Class-Imbalanced Node Classification (ICML 2022) [[**Paper**](https://proceedings.mlr.press/v162/song22a/song22a.pdf)][[**Code**](https://github.com/Jaeyun-Song/TAM)]
- GraphSMOTE: Imbalanced Node Classification on Graphs with Graph Neural Networks (WSDM 2021) [[**Paper**](https://dl.acm.org/doi/pdf/10.1145/3437963.3441720)][[**Code**](https://github.com/TianxiangZhao/GraphSmote)]
- Topology-Imbalance Learning for Semi-Supervised Node Classification (NeurIPS 2021) [[**Paper**](https://proceedings.neurips.cc/paper/2021/file/fa7cdfad1a5aaf8370ebeda47a1ff1c3-Paper.pdf)][[**Code**](https://github.com/victorchen96/renode)]
- GraphENS: Neighbor-Aware Ego Network Synthesis for Class-Imbalanced Node Classification (ICLR 2022) [[**Paper**](https://openreview.net/pdf?id=MXEl7i-iru)][[**Code**](https://github.com/JoonHyung-Park/GraphENS)]
- LTE4G: Long-Tail Experts for Graph Neural Networks (CIKM 2022) [[**Paper**](https://arxiv.org/pdf/2208.10205.pdf)][[**Code**](https://github.com/SukwonYun/LTE4G)]
- Multi-Class Imbalanced Graph Convolutional Network Learning (IJCAI 2020) [[**Paper**](https://par.nsf.gov/servlets/purl/10199469)]

#### 2.5.3 *Hard example mining*

<!-- - **Hard example mining** -->

- Training region-based object detectors with online hard example mining (CVPR 2016, 840+ citations) [[**Paper**](https://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Shrivastava_Training_Region-Based_Object_CVPR_2016_paper.pdf)][[**Code**](https://github.com/abhi2610/ohem)] - In the later phase of NN training, only back-propagate gradients for "hard examples" (i.e., those with large loss values)

#### 2.5.4 *Loss function engineering*

<!-- - **Loss function engineering** -->

- Focal loss for dense object detection (ICCV 2017, 2600+ citations) [[**Paper**](https://openaccess.thecvf.com/content_ICCV_2017/papers/Lin_Focal_Loss_for_ICCV_2017_paper.pdf)][[**Code (detectron2)**](https://github.com/facebookresearch/detectron2)][[**Code (unofficial)**](https://github.com/clcarwin/focal_loss_pytorch)] - A uniform loss function that focuses training on a sparse set of hard examples to prevent the vast number of easy negatives from overwhelming the detector during training.

  > **NOTE:** Elegant solution, high influence.

- Training deep neural networks on imbalanced data sets (IJCNN 2016, 110+ citations) [[**Paper**](https://www.researchgate.net/publication/309778930_Training_deep_neural_networks_on_imbalanced_data_sets)] - Mean (squared) false error that can equally capture classification errors from both the majority class and the minority class.
- Deep imbalanced attribute classification using visual attention aggregation (ECCV 2018, 30+ citations) [[**Paper**](https://openaccess.thecvf.com/content_ECCV_2018/papers/Nikolaos_Sarafianos_Deep_Imbalanced_Attribute_ECCV_2018_paper.pdf)][[**Code**](https://github.com/cvcode18/imbalanced_learning)]
- Imbalanced deep learning by minority class incremental rectification (TPAMI 2018, 60+ citations) [[**Paper**](https://arxiv.org/pdf/1804.10851.pdf)] - Class Rectification Loss for minimizing the dominant effect of majority classes by discovering sparsely sampled boundaries of minority classes in an iterative batch-wise learning process.
- Learning Imbalanced Datasets with Label-Distribution-Aware Margin Loss (NIPS 2019, 10+ citations) [[**Paper**](https://papers.nips.cc/paper/8435-learning-imbalanced-datasets-with-label-distribution-aware-margin-loss.pdf)][[**Code**](https://github.com/kaidic/LDAM-DRW)] - A theoretically-principled label-distribution-aware margin (LDAM) loss motivated by minimizing a margin-based generalization bound.
- Gradient harmonized single-stage detector (AAAI 2019, 40+ citations) [[**Paper**](https://arxiv.org/pdf/1811.05181.pdf)][[**Code**](https://github.com/libuyu/GHM_Detection)] - Compared to Focal Loss, which only down-weights "easy" negative examples, GHM also down-weights "very hard" examples, as they are likely to be outliers.
- Class-Balanced Loss Based on Effective Number of Samples (CVPR 2019, 70+ citations) [[**Paper**](https://openaccess.thecvf.com/content_CVPR_2019/papers/Cui_Class-Balanced_Loss_Based_on_Effective_Number_of_Samples_CVPR_2019_paper.pdf)][[**Code**](https://github.com/richardaecn/class-balanced-loss)] - A simple and generic class-reweighting mechanism based on the Effective Number of Samples.
- Influence-Balanced Loss for Imbalanced Visual Classification (ICCV 2021) [[**Paper**](https://arxiv.org/pdf/2110.02444.pdf)][[**Code**](https://github.com/pseulki/IB-Loss)]
- AutoBalance: Optimized Loss Functions for Imbalanced Data (NeurIPS 2021) [[**Paper**](https://openreview.net/pdf?id=ebQXflQre5a)]
- Label-Imbalanced and Group-Sensitive Classification under Overparameterization (NeurIPS 2021) [[**Paper**](https://proceedings.neurips.cc/paper/2021/file/9dfcf16f0adbc5e2a55ef02db36bac7f-Paper.pdf)][[**Code**](https://github.com/orparask/VS-Loss)]
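The focal loss above is compact enough to state directly: FL(p_t) = −α_t (1 − p_t)^γ log(p_t), where p_t is the predicted probability of the true class and the (1 − p_t)^γ factor down-weights easy, well-classified examples. A stdlib-only sketch of the binary case with the paper's defaults γ = 2, α = 0.25:

```python
import math

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss (Lin et al., ICCV 2017):
        FL(p_t) = -alpha_t * (1 - p_t)**gamma * log(p_t)
    with p_t = p for positives and 1 - p for negatives."""
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

# An easy negative (p = 0.01) contributes almost nothing, while a hard
# positive (p = 0.1) keeps a large loss -- so the many easy negatives
# cannot drown out the rare hard examples.
print(focal_loss(0.01, y=0), focal_loss(0.10, y=1))
```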
#### 2.5.5 *Meta-learning*

<!-- - **Meta-learning** -->

- Learning to model the tail (NIPS 2017, 70+ citations) [[**Paper**](https://papers.nips.cc/paper/7278-learning-to-model-the-tail.pdf)] - Transfer meta-knowledge from the data-rich classes in the head of the distribution to the data-poor classes in the tail.
- Learning to reweight examples for robust deep learning (ICML 2018, 150+ citations) [[**Paper**](https://proceedings.mlr.press/v80/ren18a/ren18a.pdf)][[**Code**](https://github.com/uber-research/learning-to-reweight-examples)] - Implicitly learn a weight function to reweight the samples in gradient updates of DNN.

  > **NOTE:** Representative work on solving the class imbalance problem through meta-learning.

- Meta-Weight-Net: Learning an explicit mapping for sample weighting (NIPS 2019) [[**Paper**](https://papers.nips.cc/paper/8467-meta-weight-net-learning-an-explicit-mapping-for-sample-weighting.pdf)][[**Code**](https://github.com/xjtushujun/meta-weight-net)] - Explicitly learn a weight function (with an MLP as the function approximator) to reweight the samples in gradient updates of DNN.
- Learning Data Manipulation for Augmentation and Weighting (NIPS 2019) [[**Paper**](https://proceedings.neurips.cc/paper/2019/file/671f0311e2754fcdd37f70a8550379bc-Paper.pdf)][[**Code**](https://github.com/tanyuqian/learning-data-manipulation)]
- Learning to Balance: Bayesian Meta-Learning for Imbalanced and Out-of-distribution Tasks (ICLR 2020) [[**Paper**](https://openreview.net/attachment?id=rkeZIJBYvr&name=original_pdf)][[**Code**](https://github.com/haebeom-lee/l2b)]
- MESA: Boost Ensemble Imbalanced Learning with MEta-SAmpler (NeurIPS 2020) [[**Paper**](https://arxiv.org/pdf/2010.08830.pdf)][[**Code**](https://github.com/ZhiningLiu1998/mesa)][[**Video**](https://studio.slideslive.com/web_recorder/share/20201020T134559Z__NeurIPS_posters__17343__mesa-effective-ensemble-imbal?s=d3745afc-cfcf-4d60-9f34-63d3d811b55f)]

  > **NOTE:** Meta-learning-powered ensemble learning.

#### 2.5.6 *Representation Learning*

<!-- - **Representation Learning** -->

- Learning deep representation for imbalanced classification (CVPR 2016, 220+ citations) [[**Paper**](https://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Huang_Learning_Deep_Representation_CVPR_2016_paper.pdf)]
- Supervised Class Distribution Learning for GANs-Based Imbalanced Classification (ICDM 2019) [[**Paper**](https://ieeexplore.ieee.org/abstract/document/8970900)]
- Decoupling Representation and Classifier for Long-tailed 
Recognition (ICLR 2020) [[**Paper**](https://arxiv.org/pdf/1910.09217.pdf)][[**Code**](https://github.com/facebookresearch/classifier-balancing)]

  > **NOTE:** Interesting findings on representation learning and classifier learning.

- Supercharging Imbalanced Data Learning With Energy-based Contrastive Representation Transfer (NeurIPS 2021) [[**Paper**](https://proceedings.neurips.cc/paper/2021/file/b151ce4935a3c2807e1dd9963eda16d8-Paper.pdf)]
- Tailoring Self-Supervision for Supervised Learning (ECCV 2022) [[**Paper**](https://arxiv.org/abs/2207.10023)][[**Code**](https://github.com/wjun0830/Localizable-Rotation)]

#### 2.5.7 *Posterior Recalibration*

<!-- - **Posterior Recalibration** -->

- Posterior Re-calibration for Imbalanced Datasets (NeurIPS 2020) [[**Paper**](https://arxiv.org/pdf/2010.11820.pdf)][[**Code**](https://github.com/GT-RIPL/UNO-IC)]
- Long-tail learning via logit adjustment (ICLR 2021) [[**Paper**](https://arxiv.org/pdf/2007.07314v1.pdf)][[**Code**](https://github.com/google-research/google-research/tree/master/logit_adjustment)]
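Post-hoc logit adjustment needs nothing beyond the training class priors: subtract τ · log(prior) from each class logit before the softmax, which counteracts the trained model's bias toward frequent classes. A stdlib-only sketch (illustrative; the linked repository is the reference implementation):

```python
import math

def logit_adjusted_probs(logits, class_priors, tau=1.0):
    """Post-hoc logit adjustment: subtract tau * log(prior) from each
    class logit, then apply the softmax. tau = 0 recovers the
    unadjusted prediction."""
    adjusted = [z - tau * math.log(p) for z, p in zip(logits, class_priors)]
    exps = [math.exp(z) for z in adjusted]
    total = sum(exps)
    return [e / total for e in exps]

# A model trained on a 99:1 class distribution emits logits mildly in
# favor of the head class; adjustment flips the prediction to the tail.
logits = [2.0, 0.0]   # [head, tail]
priors = [0.99, 0.01]
probs = logit_adjusted_probs(logits, priors)
print(probs.index(max(probs)))  # predicted class index after adjustment
```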
#### 2.5.8 *Semi/Self-supervised Learning*

<!-- - **Semi/Self-supervised Learning** -->

- Rethinking the Value of Labels for Improving Class-Imbalanced Learning (NeurIPS 2020) [[**Paper**](https://arxiv.org/pdf/2006.07529.pdf)][[**Code**](https://github.com/YyzHarry/imbalanced-semi-self)][[**Video**](https://www.youtube.com/watch?v=XltXZ3OZvyI&feature=youtu.be)]

  > **NOTE:** Semi-supervised training / self-supervised pre-training helps imbalanced learning.

- Distribution Aligning Refinery of Pseudo-label for Imbalanced Semi-supervised Learning (NeurIPS 2020) [[**Paper**](https://arxiv.org/pdf/2007.08844.pdf)][[**Code**](https://github.com/bbuing9/DARP)]
- ABC: Auxiliary Balanced Classifier for Class-imbalanced Semi-supervised Learning (NeurIPS 2021) [[**Paper**](https://proceedings.neurips.cc/paper/2021/file/3953630da28e5181cffca1278517e3cf-Paper.pdf)][[**Code**](https://github.com/leehyuck/abc)]
- Improving Contrastive Learning on Imbalanced Data via Open-World Sampling (NeurIPS 2021) [[**Paper**](https://proceedings.neurips.cc/paper/2021/file/2f37d10131f2a483a8dd005b3d14b0d9-Paper.pdf)]
- DASO: Distribution-Aware Semantics-Oriented Pseudo-label for Imbalanced Semi-Supervised Learning (CVPR 2022) [[**Paper**](https://arxiv.org/pdf/2106.05682)][[**Code**](https://github.com/ytaek-oh/daso)]

#### 2.5.9 *Curriculum Learning*

<!-- - **Curriculum Learning** -->

- Dynamic Curriculum Learning for Imbalanced Data Classification (ICCV 2019) [[**Paper**](https://openaccess.thecvf.com/content_ICCV_2019/papers/Wang_Dynamic_Curriculum_Learning_for_Imbalanced_Data_Classification_ICCV_2019_paper.pdf)]

#### 2.5.10 *Two-phase Training*

<!-- - **Two-phase Training** -->

- Brain tumor segmentation with deep neural networks (2017, 1200+ citations) [[**Paper**](https://arxiv.org/pdf/1505.03540.pdf)][[**Code (unofficial)**](https://github.com/naldeborgh7575/brain_segmentation)]

  > Pre-train on a balanced dataset, then fine-tune the last output layer (before the softmax) on the original, imbalanced data.

#### 2.5.11 *Network Architecture*

<!-- - **Network Architecture** -->

- BBN: Bilateral-Branch Network with Cumulative Learning for Long-Tailed Visual Recognition (CVPR 2020) 
[[**Paper**](https://arxiv.org/pdf/1912.02413.pdf)][[**Code**](https://github.com/Megvii-Nanjing/BBN)]
- Class-Imbalanced Deep Learning via a Class-Balanced Ensemble (TNNLS 2021) [[**Paper**](https://ieeexplore.ieee.org/abstract/document/9416240)]

#### 2.5.12 *Deep Generative Model*

<!-- - **Deep Generative Model** -->

- Deep Generative Model for Robust Imbalance Classification (CVPR 2020) [[**Paper**](https://openaccess.thecvf.com/content_CVPR_2020/papers/Wang_Deep_Generative_Model_for_Robust_Imbalance_Classification_CVPR_2020_paper.pdf)]

#### 2.5.13 *Imbalanced Regression*

<!-- - **Imbalanced Regression** -->

- Semi-Supervised Graph Imbalanced Regression (KDD 2023) [[**Paper**](https://arxiv.org/abs/2305.12087)] [[**Code**](https://github.com/liugangcode/SGIR)]
- RankSim: Ranking Similarity Regularization for Deep Imbalanced Regression (ICML 2022) [[**Paper**](https://arxiv.org/abs/2205.15236)] [[**Code**](https://github.com/BorealisAI/ranksim-imbalanced-regression)]
- Balanced MSE for Imbalanced Visual Regression (CVPR 2022) [[**Paper**](https://arxiv.org/abs/2203.16427)] [[**Code**](https://github.com/jiawei-ren/BalancedMSE)]
- Delving into Deep Imbalanced Regression (ICML 2021) [[**Paper**](https://arxiv.org/pdf/2102.09554.pdf)][[**Code**](https://github.com/YyzHarry/imbalanced-regression)][[**Video**](https://www.youtube.com/watch?v=grJGixofQRU)]
- Density-based weighting for imbalanced regression (Machine Learning [J], 2021) [[**Paper**](https://link.springer.com/article/10.1007/s10994-021-06023-5)][[**Code**](https://github.com/SteiMi/density-based-weighting-for-imbalanced-regression)]
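The density-based weighting idea above can be sketched with a simple histogram standing in for the paper's kernel density estimate: weight each sample inversely to the frequency of its target-value bin, so rare target ranges count more in the regression loss (a simplified illustration, not the authors' exact method):

```python
from collections import Counter

def density_weights(targets, n_bins=5):
    """Inverse-density weights for imbalanced regression: bin the
    continuous targets, weight each sample by 1 / bin frequency,
    and normalize so the weights average to 1."""
    lo, hi = min(targets), max(targets)
    width = (hi - lo) / n_bins or 1.0
    bins = [min(int((t - lo) / width), n_bins - 1) for t in targets]
    freq = Counter(bins)
    raw = [1.0 / freq[b] for b in bins]
    mean = sum(raw) / len(raw)
    return [w / mean for w in raw]

# Targets crowd around 1.0 with a single rare extreme value.
targets = [1.0, 1.1, 0.9, 1.05, 0.95, 5.0]
weights = density_weights(targets)
print(weights)  # the rare target receives the largest weight
```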
#### 2.5.14 *Data Augmentation*

<!-- - **Augmentation** -->

- Minority-Oriented Vicinity Expansion with Attentive Aggregation for Video Long-Tailed Recognition (AAAI 2023) [[**Paper**](https://arxiv.org/abs/2211.13471)][[**Code**](https://github.com/wjun0830/MOVE)]

<!-- ## 2.6 Anomaly Detection

#### 2.6.1 **Surveys**

  - Anomaly detection: A survey (ACM computing surveys, 2009, 9000+ citations) [[**Paper**](cinslab.com/wp-content/uploads/2019/03/xiaorong.pdf)]
  
  - A survey of network anomaly detection techniques (2017, 700+ citations) [[**Paper**](https://www.gta.ufrj.br/~alvarenga/files/CPE826/Ahmed2016-Survey.pdf)]

#### 2.6.2 **Classification-based**

  - One-class SVMs for document classification (JMLR, 2001, 1300+ citations) [[**Paper**](www.jmlr.org/papers/volume2/manevitz01a/manevitz01a.pdf)]
  
  - One-class Collaborative Filtering (ICDM 2008, 1000+ citations) [[**Paper**](https://cseweb.ucsd.edu/classes/fa17/cse291-b/reading/04781145.pdf)]
  
  - Isolation Forest (ICDM 2008, 1000+ citations) [[**Paper**](https://cs.nju.edu.cn/zhouzh/zhouzh.files/publication/icdm08b.pdf?q=isolation-forest)]
  
  - Anomaly Detection using One-Class Neural Networks (2018, 200+ citations) [[**Paper**](https://arxiv.org/pdf/1802.06360.pdf)]
  
  - Anomaly Detection with Robust Deep Autoencoders (KDD 2017, 170+ citations) [[**Paper**](https://pdfs.semanticscholar.org/c112/b06d3dac590b4cc111e5ec9c805d0b086c6e.pdf)] -->
# 3. Miscellaneous

## 3.1 Datasets

- **`imbalanced-learn` datasets**

  > This collection of datasets is from [`imblearn.datasets.fetch_datasets`](https://imbalanced-learn.org/stable/references/generated/imblearn.datasets.fetch_datasets.html).

  | ID  | Name           | Repository & Target           | Ratio | #S      | #F  |
  | --- | -------------- | ----------------------------- | ----- | ------- | --- |
  | 1   | ecoli          | UCI, target: imU              | 8.6:1 | 336     | 7   |
  | 2   | optical_digits | UCI, target: 8                | 9.1:1 | 5,620   | 64  |
  | 3   | satimage       | UCI, target: 4                | 9.3:1 | 6,435   | 36  |
  | 4   | pen_digits     | UCI, target: 5                | 9.4:1 | 10,992  | 16  |
  | 5   | abalone        | UCI, target: 7                | 9.7:1 | 4,177   | 10  |
  | 6   | sick_euthyroid | UCI, target: sick euthyroid   | 9.8:1 | 3,163   | 42  |
  | 7   | spectrometer   | UCI, target: >=44             | 11:1  | 531     | 93  |
  | 8   | car_eval_34    | UCI, target: good, v good     | 12:1  | 1,728   | 21  |
  | 9   | isolet         | UCI, target: A, B             | 12:1  | 7,797   | 617 |
  | 10  | us_crime       | UCI, target: >0.65            | 12:1  | 1,994   | 100 |
  | 11  | yeast_ml8      | LIBSVM, target: 8             | 13:1  | 2,417   | 103 |
  | 12  | scene          | LIBSVM, target: >one label    | 13:1  | 2,407   | 294 |
  | 13  | libras_move    | UCI, target: 1                | 14:1  | 360     | 90  |
  | 14  | thyroid_sick   | UCI, target: sick             | 15:1  | 3,772   | 52  |
  | 15  | coil_2000      | KDD, CoIL, target: minority   | 16:1  | 9,822   | 85  |
  | 16  | arrhythmia     | UCI, target: 06               | 17:1  | 452     | 278 |
  | 17  | solar_flare_m0 | UCI, target: M->0             | 19:1  | 1,389   | 32  |
  | 18  | oil            | UCI, target: minority         | 22:1  | 937     | 49  |
  | 19  | car_eval_4     | UCI, target: vgood            | 26:1  | 1,728   | 21  |
  | 20  | wine_quality   | UCI, wine, target: <=4        | 26:1  | 4,898   | 11  |
  | 21  | letter_img     | UCI, target: Z                | 26:1  | 20,000  | 16  |
  | 22  | yeast_me2      | UCI, target: ME2              | 28:1  | 1,484   | 8   |
  | 23  | webpage        | LIBSVM, w7a, target: minority | 33:1  | 34,780  | 300 |
  | 24  | ozone_level    | UCI, ozone, data              | 34:1  | 2,536   | 72  |
  | 25  | mammography    | UCI, target: minority         | 42:1  | 11,183  | 6   |
  | 26  | protein_homo   | KDD CUP 2004, minority        | 111:1 | 145,751 | 74  |
  | 27  | abalone_19     | UCI, target: 19               | 130:1 | 4,177   | 10  |

- **Imbalanced Databases**

  Link: https://github.com/gykovacs/mldb

## 3.2 Github Repositories

### 3.2.1 *Algorithms & Utilities & Jupyter Notebooks*

- [imbalanced-algorithms](https://github.com/dialnd/imbalanced-algorithms) - Python-based implementations of algorithms for learning on imbalanced data.
- [imbalanced-dataset-sampler](https://github.com/ufoym/imbalanced-dataset-sampler) - A (PyTorch) imbalanced dataset sampler for oversampling low-frequency classes and undersampling high-frequency ones.
- [class_imbalance](https://github.com/wangz10/class_imbalance) - Jupyter notebook presentation on class imbalance in binary classification.
- [Multi-class-with-imbalanced-dataset-classification](https://github.com/javaidnabi31/Multi-class-with-imbalanced-dataset-classification) - Multi-class classification on the imbalanced 20-news-group dataset.
- [Advanced Machine Learning with scikit-learn: Imbalanced classification and text data](https://github.com/amueller/ml-workshop-4-of-4) - Different approaches to feature selection, and resampling methods for imbalanced data.

### 3.2.2 *Paper list*

- [Anomaly Detection 
Learning Resources](https://github.com/yzhao062/anomaly-detection-resources) by [yzhao062](https://github.com/yzhao062) - Anomaly detection related books, papers, videos, and toolboxes.
- [Paper-list-on-Imbalanced-Time-series-Classification-with-Deep-Learning](https://github.com/danielgy/Paper-list-on-Imbalanced-Time-series-Classification-with-Deep-Learning) - Imbalanced time-series classification

### 3.2.3 *Slides*

- [acm_imbalanced_learning](https://github.com/timgasser/acm_imbalanced_learning) - Slides and code for the ACM Imbalanced Learning talk on 27th April 2016 in Austin, TX.

# Contributors ✨

Thanks goes to these wonderful people ([emoji key](https://allcontributors.org/docs/en/emoji-key)):

<!-- ALL-CONTRIBUTORS-LIST:START - Do not remove or modify this section -->
<!-- prettier-ignore-start -->
<!-- markdownlint-disable -->
<table>
  <tbody>
    <tr>
      <td align="center" valign="top" width="14.28%"><a href="http://zhiningliu.com"><img src="https://oss.gittoolsai.com/images/ZhiningLiu1998_awesome-imbalanced-learning_readme_c45cf0a4b7eb.png" width="100px;" alt="Zhining Liu"/><br /><sub><b>Zhining Liu</b></sub></a><br /><a href="https://github.com/ZhiningLiu1998/awesome-imbalanced-learning/commits?author=ZhiningLiu1998" title="Code">💻</a> <a href="#maintenance-ZhiningLiu1998" title="Maintenance">🚧</a> <a href="#translation-ZhiningLiu1998" title="Translation">🌍</a></td>
      <td align="center" valign="top" width="14.28%"><a href="https://github.com/AshinZeng"><img 
src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FZhiningLiu1998_awesome-imbalanced-learning_readme_7d6ec2f76359.png\" width=\"100px;\" alt=\"曾阿信\"\u002F>\u003Cbr \u002F>\u003Csub>\u003Cb>曾阿信\u003C\u002Fb>\u003C\u002Fsub>\u003C\u002Fa>\u003Cbr \u002F>\u003Ca href=\"#maintenance-AshinZeng\" title=\"Maintenance\">🚧\u003C\u002Fa>\u003C\u002Ftd>\n      \u003Ctd align=\"center\" valign=\"top\" width=\"14.28%\">\u003Ca href=\"https:\u002F\u002Fwjun0830.github.io\u002F\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FZhiningLiu1998_awesome-imbalanced-learning_readme_294bc5598a5b.png\" width=\"100px;\" alt=\"WonJun Moon\"\u002F>\u003Cbr \u002F>\u003Csub>\u003Cb>WonJun Moon\u003C\u002Fb>\u003C\u002Fsub>\u003C\u002Fa>\u003Cbr \u002F>\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fawesome-imbalanced-learning\u002Fcommits?author=wjun0830\" title=\"Code\">💻\u003C\u002Fa>\u003C\u002Ftd>\n      \u003Ctd align=\"center\" valign=\"top\" width=\"14.28%\">\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fliugangcode\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FZhiningLiu1998_awesome-imbalanced-learning_readme_623faafa8cd5.png\" width=\"100px;\" alt=\"Gang Liu\"\u002F>\u003Cbr \u002F>\u003Csub>\u003Cb>Gang Liu\u003C\u002Fb>\u003C\u002Fsub>\u003C\u002Fa>\u003Cbr \u002F>\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fawesome-imbalanced-learning\u002Fcommits?author=liugangcode\" title=\"Code\">💻\u003C\u002Fa>\u003C\u002Ftd>\n    \u003C\u002Ftr>\n  \u003C\u002Ftbody>\n\u003C\u002Ftable>\n\n\u003C!-- markdownlint-restore -->\n\u003C!-- prettier-ignore-end -->\n\n\u003C!-- ALL-CONTRIBUTORS-LIST:END -->\n\nThis project follows the [all-contributors](https:\u002F\u002Fgithub.com\u002Fall-contributors\u002Fall-contributors) specification. 
Contributions of any kind welcome!\n","\u003C!-- \u003Ch1 align=\"center\"> 令人惊叹的不平衡学习 \u003C\u002Fh1> -->\n\n\u003C!-- ![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FZhiningLiu1998_awesome-imbalanced-learning_readme_83410963e56f.png) -->\n\n![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FZhiningLiu1998_awesome-imbalanced-learning_readme_049aaf2cf98d.png)\n\n\u003Ch2 align=\"center\"> 精选的不平衡学习论文、代码和库 \u003C\u002Fh2>\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"https:\u002F\u002Fawesome.re\u002Fbadge.svg\">\n  \u003C!-- \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fawesome-imbalanced-learning\">\n    \u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FImbalanced-Learning-orange\">\n  \u003C\u002Fa> -->\n  \u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FZhiningLiu1998\u002Fawesome-imbalanced-learning\">\n  \u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fforks\u002FZhiningLiu1998\u002Fawesome-imbalanced-learning\">\n  \u003C!-- \u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fissues\u002FZhiningLiu1998\u002Fawesome-imbalanced-learning\"> -->\n  \u003C!-- ALL-CONTRIBUTORS-BADGE:START - Do not remove or modify this section -->\n\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fawesome-imbalanced-learning#contributors-\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fall_contributors-4-orange.svg\">\u003C\u002Fa>\n\u003C!-- ALL-CONTRIBUTORS-BADGE:END -->\n  \u003C!-- \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fawesome-imbalance d-learning\u002Fgraphs\u002Ftraffic\">\n    \u003Cimg src=\"https:\u002F\u002Fvisitor-badge.glitch.me\u002Fbadge?page_id=ZhiningLiu1998.awesome-imbalanced-learning&left_text=Hi!%20visitors\">\n  \u003C\u002Fa> -->\n  \u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Flicense\u002FZhiningLiu1998\u002Fawesome-imbalanced-learning\">\n  \u003Ca 
href=\"https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fimbalanced-ensemble\">\n    \u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FPython Toolbox-IMBENS-blueviolet\">\n  \u003C\u002Fa>\n\u003C\u002Fp>\n\n\u003Ch3 align=\"center\">\u003Cb>\n  语言: [\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fawesome-imbalanced-learning\">英语\u003C\u002Fa>] [\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fawesome-imbalanced-learning\u002Fblob\u002Fmaster\u002FREADME_CN.md\">中文\u003C\u002Fa>]\n\u003C\u002Fb>\u003C\u002Fh3>\n\n\u003C!-- **一份精选的不平衡学习论文、代码、框架和库列表。** -->\n\n**类别不平衡（也称为长尾问题）**是指在分类问题中，各类别样本数量不均衡的现象，这在实际应用中非常常见。例如，欺诈检测、罕见药物不良反应预测以及基因家族预测等场景。如果未能有效处理类别不平衡问题，许多分类算法的预测性能往往会显著下降。**不平衡学习旨在解决类别不平衡问题，从而从不平衡数据中学习到一个无偏的模型。**\n\n**受[awesome-machine-learning](https:\u002F\u002Fgithub.com\u002Fjosephmisiti\u002Fawesome-machine-learning)的启发，在本仓库中：**\n\n- **框架**和**库**按*编程语言*分类。\n- **研究论文**按*研究领域*分类。\n\n**注意：**\n\n- ⭐ **如果您喜欢这个项目，请留下一个\u003Cfont color='orange'>星标\u003C\u002Ffont>！** ⭐\n- 贡献者将出现在[贡献者✨](#contributors-)名单中！\n- 该领域的研究论文众多，因此本列表并不打算涵盖所有内容。\n- 我们的目标是仅保留那些具有*良好影响力*或已在*知名顶级会议\u002F期刊*上发表的“优秀”作品。\n\n\u003Ch3>\n\u003Cfont color='red'>最新动态： \u003C\u002Ffont>\n\u003C\u002Fh3>\n\n- 更新了[*图学习*](#graph-learning)部分。\n- 新增了一个包[imbalanced-ensemble](https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fimbalanced-ensemble) [[Github](https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fimbalanced-ensemble)][[文档](https:\u002F\u002Fimbalanced-ensemble.readthedocs.io\u002F)]。\n\n\u003C!-- **披露：** Zhining Liu 是以下作品的作者：**[imbalanced-ensemble](https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fimbalanced-ensemble), [Self-paced Ensemble](https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fself-paced-ensemble), [MESA](https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fmesa)**。  -->\n\n**查看[Zhining](https:\u002F\u002Fgithub.com\u002FZhiningLiu1998)的其他开源项目！**\n\n\u003Ctable style=\"font-size:15px;\">\n  \u003Ctr>\n    
\u003C!-- \u003Ctd align=\"center\">\u003Ca href=\"http:\u002F\u002Fzhiningliu.com\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FZhiningLiu1998_awesome-imbalanced-learning_readme_c45cf0a4b7eb.png\" width=\"100px;\" alt=\"\"\u002F>\u003Cbr \u002F>\u003Csub>\u003Cb>Zhining Liu\u003C\u002Fb>\u003C\u002Fsub>\u003C\u002Fa>\u003C\u002Ftd> -->\n    \u003Ctd align=\"center\">\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fimbalanced-ensemble\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FZhiningLiu1998_awesome-imbalanced-learning_readme_0197a95696dd.png\" height=\"80px\" alt=\"\"\u002F>\u003Cbr \u002F>\u003Csub>\u003Cb>不平衡集成 [Python库]\u003C\u002Fb>\u003C\u002Fsub>\u003C\u002Fa>\u003Cbr \u002F>\n      \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fimbalanced-ensemble\u002Fstargazers\">\n      \u003Cimg alt=\"GitHub 星标\" src=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FZhiningLiu1998\u002Fimbalanced-ensemble?style=social\">\n      \u003C\u002Fa>\n    \u003C\u002Ftd>\n    \u003Ctd align=\"center\">\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fawesome-awesome-machine-learning\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FZhiningLiu1998_awesome-imbalanced-learning_readme_d9c9ff6342c5.png\" height=\"80px\" alt=\"\"\u002F>\u003Cbr \u002F>\u003Csub>\u003Cb>机器学习 [Awesome]\u003C\u002Fb>\u003C\u002Fsub>\u003C\u002Fa>\u003Cbr \u002F>\n      \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fawesome-awesome-machine-learning\u002Fstargazers\">\n      \u003Cimg alt=\"GitHub 星标\" src=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FZhiningLiu1998\u002Fawesome-awesome-machine-learning?style=social\">\n      \u003C\u002Fa>\n    \u003C\u002Ftd>\n    \u003Ctd align=\"center\">\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fself-paced-ensemble\">\u003Cimg 
src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FZhiningLiu1998_awesome-imbalanced-learning_readme_39a92df0db43.png\" height=\"80px\" alt=\"\"\u002F>\u003Cbr \u002F>\u003Csub>\u003Cb>自步集成 [ICDE]\u003C\u002Fb>\u003C\u002Fsub>\u003C\u002Fa>\u003Cbr \u002F>\n      \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fself-paced-ensemble\u002Fstargazers\">\n      \u003Cimg alt=\"GitHub 星标\" src=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FZhiningLiu1998\u002Fself-paced-ensemble?style=social\">\n      \u003C\u002Fa>\n    \u003C\u002Ftd>\n    \u003Ctd align=\"center\">\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fmesa\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FZhiningLiu1998_awesome-imbalanced-learning_readme_fbf27861f543.png\" height=\"80px\" alt=\"\"\u002F>\u003Cbr \u002F>\u003Csub>\u003Cb>元采样器 [NeurIPS]\u003C\u002Fb>\u003C\u002Fsub>\u003C\u002Fa>\u003Cbr \u002F>\n      \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fmesa\u002Fstargazers\">\n      \u003Cimg alt=\"GitHub 星标\" src=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FZhiningLiu1998\u002Fmesa?style=social\">\n      \u003C\u002Fa>\n    \u003C\u002Ftd>\n  \u003C\u002Ftr>\n\u003C\u002Ftable>\n\n# 目录\n\n- [目录](#目录)\n- [1. 框架与库](#1-框架与库)\n    - [1.1 Python](#11-python)\n    - [1.2 R](#12-r)\n    - [1.3 Java](#13-java)\n    - [1.4 Scala](#14-scala)\n    - [1.5 Julia](#15-julia)\n- [2. 
研究论文](#2-research-papers)\n  - [2.1 综述](#21-surveys)\n  - [2.2 集成学习](#22-ensemble-learning)\n      - [2.2.1 *通用集成*](#221-general-ensemble)\n      - [2.2.2 *基于提升的方法*](#222-boosting-based)\n      - [2.2.3 *基于自助法的方法*](#223-bagging-based)\n      - [2.2.4 *代价敏感集成*](#224-cost-sensitive-ensemble)\n  - [2.3 数据重采样](#23-data-resampling)\n      - [2.3.1 *过采样*](#231-over-sampling)\n      - [2.3.2 *欠采样*](#232-under-sampling)\n      - [2.3.3 *混合采样*](#233-hybrid-sampling)\n  - [2.4 代价敏感学习](#24-cost-sensitive-learning)\n  - [2.5 深度学习](#25-deep-learning)\n      - [2.5.1 *综述*](#251-surveys)\n      - [2.5.2 *图数据挖掘*](#252-graph-data-mining)\n      - [2.5.3 *难例挖掘*](#253-hard-example-mining)\n      - [2.5.4 *损失函数工程*](#254-loss-function-engineering)\n      - [2.5.5 *元学习*](#255-meta-learning)\n      - [2.5.6 *表示学习*](#256-representation-learning)\n      - [2.5.7 *后验校准*](#257-posterior-recalibration)\n      - [2.5.8 *半\u002F自监督学习*](#258-semiself-supervised-learning)\n      - [2.5.9 *课程学习*](#259-curriculum-learning)\n      - [2.5.10 *两阶段训练*](#2510-two-phase-training)\n      - [2.5.11 *网络架构*](#2511-network-architecture)\n      - [2.5.12 *深度生成模型*](#2512-deep-generative-model)\n      - [2.5.13 *不平衡回归*](#2513-imbalanced-regression)\n      - [2.5.14 *数据增强*](#2514-data-augmentation)\n- [3. 杂项](#3-miscellaneous)\n  - [3.1 数据集](#31-datasets)\n  - [3.2 Github 仓库](#32-github-repositories)\n    - [3.2.1 *算法、工具及 Jupyter 笔记本*](#321-algorithms--utilities--jupyter-notebooks)\n    - [3.2.2 *论文列表*](#322-paper-list)\n    - [3.2.3 *幻灯片*](#323-slides)\n- [贡献者 ✨](#contributors-)\n\n# 1. 
框架与库\n\n### 1.1 Python\n\n- [**imbalanced-ensemble**](https:\u002F\u002Fimbalanced-ensemble.readthedocs.io\u002F) [[**Github**](https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fimbalanced-ensemble)][[**文档**](https:\u002F\u002Fimbalanced-ensemble.readthedocs.io\u002F)][[**图库**](https:\u002F\u002Fimbalanced-ensemble.readthedocs.io\u002Fen\u002Flatest\u002Fauto_examples\u002Findex.html#)][[**论文**](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2111.12776.pdf)]\n\n  > **注意:** 使用 Python 编写，易于使用。\n  >\n\n  - `imbalanced-ensemble` 是一个用于在类别不平衡数据上快速实现和部署 ***集成学习算法*** 的 Python 工具箱。其特点包括：\n    - (i) 统一且易于使用的 API、详细的 [文档](https:\u002F\u002Fimbalanced-ensemble.readthedocs.io\u002F) 和 [示例](https:\u002F\u002Fimbalanced-ensemble.readthedocs.io\u002Fen\u002Flatest\u002Fauto_examples\u002Findex.html#)。\n    - (ii) 开箱即用，支持多分类不平衡学习。\n    - (iii) 在可能的情况下，通过 [joblib](https:\u002F\u002Fgithub.com\u002Fjoblib\u002Fjoblib) 进行并行化以优化性能。\n    - (iv) 强大的可定制交互式训练日志记录和可视化工具。\n    - (v) 与其他流行包（如 [scikit-learn](https:\u002F\u002Fscikit-learn.org\u002Fstable\u002F) 和 [imbalanced-learn](https:\u002F\u002Fimbalanced-learn.org\u002Fstable\u002F)）完全兼容。\n  - 目前（v0.1.4），它包含了超过 15 种基于 ***重采样*** 和 ***代价敏感学习*** 的集成算法（例如，*SMOTEBoost\u002FBagging、RUSBoost\u002FBagging、AdaCost、EasyEnsemble、BalanceCascade、SelfPacedEnsemble* 等）。\n- [**imbalanced-learn**](https:\u002F\u002Fimbalanced-learn.org\u002Fstable\u002F) [[**Github**](https:\u002F\u002Fgithub.com\u002Fscikit-learn-contrib\u002Fimbalanced-learn)][[**文档**](https:\u002F\u002Fimbalanced-learn.org\u002Fstable\u002F)][[**论文**](https:\u002F\u002Fwww.jmlr.org\u002Fpapers\u002Fvolume18\u002F16-365\u002F16-365.pdf)]\n\n  > **注意:** 使用 Python 编写，易于使用。\n  >\n\n  - `imbalanced-learn` 是一个提供多种 ***重采样*** 技术的 Python 包，这些技术常用于类间不平衡严重的数据集中。它与 [scikit-learn](https:\u002F\u002Fscikit-learn.org\u002Fstable\u002F) 兼容，并且是 [scikit-learn-contrib](https:\u002F\u002Fgithub.com\u002Fscikit-learn-contrib) 项目的一部分。\n  - 当前（v0.8.0），它包含 21 
种不同的重采样技术，包括过采样、欠采样以及混合方法（例如，*SMOTE、ADASYN、TomekLinks、NearMiss、OneSidedSelection*、SMOTETomek 等）。\n  - 该包还提供了许多实用工具，例如用于 Keras\u002FTensorFlow 的 *批处理生成器*，详情请参阅 [API 参考](https:\u002F\u002Fimbalanced-learn.org\u002Fstable\u002Freferences\u002Findex.html#api)。\n- [**smote_variants**](https:\u002F\u002Fsmote-variants.readthedocs.io\u002Fen\u002Flatest\u002F) [[**文档**](https:\u002F\u002Fsmote-variants.readthedocs.io\u002Fen\u002Flatest\u002F)][[**Github**](https:\u002F\u002Fgithub.com\u002Fanalyticalmindsltd\u002Fsmote_variants)] - 一个包含 85 种少数类 ***过采样*** 技术的集合，适用于多分类过采样和模型选择功能（全部用 Python 编写，也支持 R 和 Julia）。\n\n### 1.2 R\n\n- [**smote_variants**](https:\u002F\u002Fsmote-variants.readthedocs.io\u002Fen\u002Flatest\u002F) [[**文档**](https:\u002F\u002Fsmote-variants.readthedocs.io\u002Fen\u002Flatest\u002F)][[**Github**](https:\u002F\u002Fgithub.com\u002Fanalyticalmindsltd\u002Fsmote_variants)] - 一个集合了 85 种少数类 ***过采样*** 技术的库，适用于不平衡学习中的多分类过采样和模型选择功能（全部用 Python 编写，同时也支持 R 和 Julia）。\n- [**caret**](https:\u002F\u002Fcran.r-project.org\u002Fweb\u002Fpackages\u002Fcaret\u002Findex.html) [[**文档**](http:\u002F\u002Ftopepo.github.io\u002Fcaret\u002Findex.html)][[**Github**](https:\u002F\u002Fgithub.com\u002Ftopepo\u002Fcaret)] - 包含随机欠采样\u002F过采样的实现。\n- [**ROSE**](https:\u002F\u002Fcran.r-project.org\u002Fweb\u002Fpackages\u002FROSE\u002Findex.html) [[**文档**](https:\u002F\u002Fwww.rdocumentation.org\u002Fpackages\u002FROSE\u002Fversions\u002F0.0-3)] - 包含 [ROSE](https:\u002F\u002Fjournal.r-project.org\u002Farchive\u002F2014-1\u002Fmenardi-lunardon-torelli.pdf)（随机过采样示例）的实现。\n- [**DMwR**](https:\u002F\u002Fcran.r-project.org\u002Fweb\u002Fpackages\u002FDMwR\u002Findex.html) [[**文档**](https:\u002F\u002Fwww.rdocumentation.org\u002Fpackages\u002FDMwR\u002Fversions\u002F0.4.1)] - 包含 [SMOTE](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1106.1813.pdf)（合成少数类过采样技术）的实现。\n\n### 1.3 Java\n\n- [**KEEL**](https:\u002F\u002Fsci2s.ugr.es\u002Fkeel\u002Fdescription.php) 
[[**Github**](https:\u002F\u002Fgithub.com\u002FSCI2SUGR\u002FKEEL)][[**论文**](https:\u002F\u002Fsci2s.ugr.es\u002Fsites\u002Fdefault\u002Ffiles\u002FficherosPublicaciones\u002F0758_Alcalaetal-SoftComputing-Keel1.0.pdf)] - KEEL 提供一个基于数据流的简单 ***GUI***，用于设计包含不同数据集和计算智能算法的实验（***特别关注进化算法***），以评估算法的行为。该工具包含了多种广泛使用的不平衡学习技术，如（进化）过采样\u002F欠采样、代价敏感学习、算法改进以及集成学习方法。\n\n  > **注意：** 内置了种类繁多的经典分类、回归和预处理算法。\n  >\n\n### 1.4 Scala\n\n- [**undersampling**](https:\u002F\u002Fgithub.com\u002FNestorRV\u002Fundersampling) [[**文档**](https:\u002F\u002Fnestorrv.github.io\u002F)][[**Github**](https:\u002F\u002Fgithub.com\u002FNestorRV\u002Fundersampling)] - 一个用于不平衡分类中***欠采样及其集成变体***的 Scala 库。\n\n### 1.5 Julia\n\n- [**smote_variants**](https:\u002F\u002Fsmote-variants.readthedocs.io\u002Fen\u002Flatest\u002F) [[**文档**](https:\u002F\u002Fsmote-variants.readthedocs.io\u002Fen\u002Flatest\u002F)][[**Github**](https:\u002F\u002Fgithub.com\u002Fanalyticalmindsltd\u002Fsmote_variants)] - 一个包含 85 种少数类***过采样***技术的集合，适用于多分类过采样，并具备模型选择功能（全部用 Python 编写，同时也支持 R 和 Julia）。\n\n# 2. 
研究论文\n\n## 2.1 综述\n\n- **从不平衡数据中学习（IEEE TKDE，2009 年，6000+ 引用）[[**论文**](https:\u002F\u002Fwww.sci-hub.shop\u002F10.1109\u002Ftkde.2008.239)]**\n\n  - 高被引的经典综述论文。系统地回顾了该领域在 2009 年时流行的解决方案、评估指标以及未来研究中的挑战性问题。\n- **从不平衡数据中学习：开放性挑战与未来方向（2016 年，900+ 引用）[[**论文**](https:\u002F\u002Fwww.researchgate.net\u002Fpublication\u002F301596547_Learning_from_imbalanced_data_Open_challenges_and_future_directions)]**\n\n  - 本文重点关注不平衡学习中的开放性问题与挑战，例如极端类别不平衡、在线\u002F流式学习中的不平衡、多分类不平衡学习以及半监督\u002F无监督不平衡学习。\n- **从类别不平衡数据中学习：方法与应用的综述（2017 年，900+ 引用）[[**论文**](https:\u002F\u002Fwww.researchgate.net\u002Fpublication\u002F311977198_Learning_from_class-imbalanced_data_Review_of_methods_and_applications)]**\n\n  - 这是一篇关于不平衡学习方法与应用的最新且详尽的综述，共纳入了 527 篇文献。文中提供了对现有方法的多个详细分类体系，并探讨了该研究领域的最新趋势。\n\n## 2.2 集成学习\n\n#### 2.2.1 *通用集成*\n\n\u003C!-- - **通用集成** -->\n\n- **自步集成（ICDE 2020，20+次引用）[[**论文**](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1909.03500v3.pdf)][[**代码**](https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fself-paced-ensemble)][[**幻灯片**](https:\u002F\u002Fzhiningliu.com\u002Ffiles\u002FICDE_2020_self_paced_ensemble_slides.pdf)][[**知乎**](https:\u002F\u002Fzhuanlan.zhihu.com\u002Fp\u002F86891438)][[**PyPI**](https:\u002F\u002Fpypi.org\u002Fproject\u002Fself-paced-ensemble\u002F)]**\n\n  > **注：** 一种性能卓越且计算效率高的多功能解决方案。\n  >\n- **MESA：利用元采样器提升不平衡数据集上的集成学习（NeurIPS 2020）[[**论文**](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2010.08830.pdf)][[**代码**](https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fmesa)][[**视频**](https:\u002F\u002Fstudio.slideslive.com\u002Fweb_recorder\u002Fshare\u002F20201020T134559Z__NeurIPS_posters__17343__mesa-effective-ensemble-imbal?s=d3745afc-cfcf-4d60-9f34-63d3d811b55f)][[**知乎**](https:\u002F\u002Fzhuanlan.zhihu.com\u002Fp\u002F268539195)]**\n\n  > **注：** 直接从数据中学习最优的采样策略。\n  >\n- **面向类别不平衡学习的探索性欠采样（IEEE Trans. 
on SMC, 2008，1300+次引用）[[**论文**](https:\u002F\u002Fsci2s.ugr.es\u002Fkeel\u002Fpdf\u002Falgorithm\u002Farticulo\u002F2009-IEEE%20TSMCpartB%20Exploratory%20Undersampling%20for%20Class%20Imbalance%20Learning.pdf)]**\n\n  > **注：** 简单但有效的解决方案。\n  >\n\n  - EasyEnsemble [[**代码**](https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fimbalanced-ensemble\u002Fblob\u002Fmain\u002Fimbalanced_ensemble\u002Fensemble\u002Funder_sampling\u002Feasy_ensemble.py)]\n  - BalanceCascade [[**代码**](https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fimbalanced-ensemble\u002Fblob\u002Fmain\u002Fimbalanced_ensemble\u002Fensemble\u002Funder_sampling\u002Fbalance_cascade.py)]\n\n- **集成对长尾数据的影响（Neurips 2023 Heavy Tails Workshop）[[**论文**](https:\u002F\u002Fopenreview.net\u002Fpdf?id=l4GYs60kre)]**\n\n  > **注：**\n  > 在不平衡数据集上，增加更多的（>10个）集成成员会持续提升性能。\n  > 根据集成的多样性和依赖关系，logit和概率集成在不平衡数据集上存在差异。\n\n  - Logit与概率集成 [[**代码**](https:\u002F\u002Fgithub.com\u002Fekellbuch\u002Flongtail_ensembles)]\n\n#### 2.2.2 *基于Boosting*\n\n\u003C!-- - **基于Boosting** -->\n\n- **AdaBoost（1995年，18700+次引用）[[**论文**](https:\u002F\u002Fsci2s.ugr.es\u002Fkeel\u002Fpdf\u002Falgorithm\u002Farticulo\u002F1997-JCSS-Schapire-A%20Decision-Theoretic%20Generalization%20of%20On-Line%20Learning%20(AdaBoost).pdf)][[**代码**](https:\u002F\u002Fgithub.com\u002Fscikit-learn\u002Fscikit-learn\u002Fblob\u002F95d4f0841\u002Fsklearn\u002Fensemble\u002F_weight_boosting.py#L285)]** - 基于C4.5的自适应提升\n- **DataBoost（2004年，570+次引用）[[**论文**](https:\u002F\u002Fsci2s.ugr.es\u002Fkeel\u002Fpdf\u002Falgorithm\u002Farticulo\u002F2004-SIGKDD-GuoViktor.pdf)]** - 针对不平衡数据的数据生成提升\n- **SMOTEBoost（2003年，1100+次引用）[[**论文**]](https:\u002F\u002Fsci2s.ugr.es\u002Fkeel\u002Fpdf\u002Falgorithm\u002Fcongreso\u002F2003-PKDD-SMOTEBoost-ChawlaLazarevicHallBowyer.pdf)[[**代码**](https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fimbalanced-ensemble\u002Fblob\u002Fmain\u002Fimbalanced_ensemble\u002Fensemble\u002Fover_sampling\u002Fsmote_bagging.py)]** - 合成少数类过采样技术提升\n- 
**MSMOTEBoost（2011年，1300+次引用）[[**论文**](https:\u002F\u002Fsci2s.ugr.es\u002Fkeel\u002Fpdf\u002Falgorithm\u002Farticulo\u002F2011-IEEE%20TSMC%20partC-%20GalarFdezBarrenecheaBustinceHerrera.pdf)]** - 改进的合成少数类过采样技术提升\n- **RAMOBoost（2010年，140+次引用）[[**论文**](https:\u002F\u002Fwww.ele.uri.edu\u002Ffaculty\u002Fhe\u002FPDFfiles\u002Framoboost.pdf)] [[**代码**](https:\u002F\u002Fgithub.com\u002Fdialnd\u002Fimbalanced-algorithms\u002Fblob\u002Fmaster\u002Framo.py#L133)]** - 提升中的排序少数类过采样\n- **RUSBoost（2009年，850+次引用）[[**论文**](https:\u002F\u002Fsci2s.ugr.es\u002Fkeel\u002Fpdf\u002Falgorithm\u002Farticulo\u002F2010-IEEE%20TSMCpartA-RUSBoost%20A%20Hybrid%20Approach%20to%20Alleviating%20Class%20Imbalance.pdf)] [[**代码**](https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fimbalanced-ensemble\u002Fblob\u002Fmain\u002Fimbalanced_ensemble\u002Fensemble\u002Funder_sampling\u002Frus_boost.py)]** - 随机欠采样提升\n- **AdaBoostNC（2012年，350+次引用）[[**论文**](https:\u002F\u002Fsci2s.ugr.es\u002Fkeel\u002Fpdf\u002Falgorithm\u002Farticulo\u002F2012-wang-IEEE_SMC_B.pdf)]** - 带有负相关学习的自适应提升\n- **EUSBoost（2013年，210+次引用）[[**论文**](https:\u002F\u002Fsci2s.ugr.es\u002Fkeel\u002Fpdf\u002Falgorithm\u002Farticulo\u002F2013-galar-PR.pdf)]** - 提升中的进化式欠采样\n\n#### 2.2.3 *基于Bagging*\n\n\u003C!-- - **基于Bagging** -->\n\n- **Bagging（1996年，20000+次引用）[[**论文**](https:\u002F\u002Fsci2s.ugr.es\u002Fkeel\u002Fpdf\u002Falgorithm\u002Farticulo\u002F1996-ML-Breiman-Bagging%20Predictors.pdf)][[**代码**](https:\u002F\u002Fgithub.com\u002Fscikit-learn\u002Fscikit-learn\u002Fblob\u002F95d4f0841\u002Fsklearn\u002Fensemble\u002F_bagging.py#L433)]** - 袋装预测器\n- **利用集成模型对不平衡数据集进行多样性分析（2009年，400+次引用）[[**论文**](https:\u002F\u002Fsci2s.ugr.es\u002Fkeel\u002Fpdf\u002Falgorithm\u002Fcongreso\u002F2009-IEEE%20CIDM-WangYao.pdf)]**\n\n  - **UnderBagging** [[**代码**](https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fimbalanced-ensemble\u002Fblob\u002Fmain\u002Fimbalanced_ensemble\u002Fensemble\u002Funder_sampling\u002Funder_bagging.py)]\n  - 
**OverBagging** [[**代码**](https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fimbalanced-ensemble\u002Fblob\u002Fmain\u002Fimbalanced_ensemble\u002Fensemble\u002Fover_sampling\u002Fover_bagging.py)]\n  - **SMOTEBagging** [[**代码**](https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fimbalanced-ensemble\u002Fblob\u002Fmain\u002Fimbalanced_ensemble\u002Fensemble\u002Fover_sampling\u002Fsmote_bagging.py)]\n\n#### 2.2.4 *代价敏感集成*\n\n\u003C!-- - **代价敏感集成** -->\n\n- **AdaCost（ICML 1999，800+次引用）[[**论文**](https:\u002F\u002Fwww.researchgate.net\u002Fprofile\u002FSalvatore-Stolfo\u002Fpublication\u002F2628569_AdaCost_Misclassification_Cost-sensitive_Boosting\u002Flinks\u002F0fcfd50ca581d7016f000000\u002FAdaCost-Misclassification-Cost-sensitive-Boosting.pdf)][[**代码**](https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fimbalanced-ensemble\u002Fblob\u002Fmain\u002Fimbalanced_ensemble\u002Fensemble\u002Freweighting\u002Fadacost.py)]** - 基于误分类代价的提升\n- **AdaUBoost（NIPS 1999，100+次引用）[[**论文**](https:\u002F\u002Fproceedings.neurips.cc\u002Fpaper\u002F1998\u002Ffile\u002Fdf12ecd077efc8c23881028604dbb8cc-Paper.pdf)][[**代码**](https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fimbalanced-ensemble\u002Fblob\u002Fmain\u002Fimbalanced_ensemble\u002Fensemble\u002Freweighting\u002Fadauboost.py)]** - 使用不等损失函数的AdaBoost\n- **AsymBoost（NIPS 2001，700+次引用）[[**论文**](https:\u002F\u002Fwww.researchgate.net\u002Fprofile\u002FMichael-Jones-66\u002Fpublication\u002F2539888_Fast_and_Robust_Classification_using_Asymmetric_AdaBoost_and_a_Detector_Cascade\u002Flinks\u002F540731780cf23d9765a83ec1\u002FFast-and-Robust-Classification-using-Asymmetric-AdaBoost-and-a-Detector-Cascade.pdf)][[**代码**](https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fimbalanced-ensemble\u002Fblob\u002Fmain\u002Fimbalanced_ensemble\u002Fensemble\u002Freweighting\u002Fasymmetric_boost.py)]** - 非对称AdaBoost与检测器级联\n\n## 2.3 数据重采样\n\n#### 2.3.1 *过采样*\n\n\u003C!-- - **过采样** -->\n\n- **ROS 
[[**代码**](https:\u002F\u002Fgithub.com\u002Fscikit-learn-contrib\u002Fimbalanced-learn\u002Fblob\u002Fmaster\u002Fimblearn\u002Fover_sampling\u002F_random_over_sampler.py)]** - 随机过采样\n- **SMOTE（2002年，9800+次引用）[[**论文**](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1106.1813.pdf)][[**代码**](https:\u002F\u002Fgithub.com\u002Fscikit-learn-contrib\u002Fimbalanced-learn\u002Fblob\u002Fmaster\u002Fimblearn\u002Fover_sampling\u002F_smote.py#L36)]** - 合成少数类过采样技术\n- **Borderline-SMOTE（2005年，1400+次引用）[[**论文**](https:\u002F\u002Fsci2s.ugr.es\u002Fkeel\u002Fkeel-dataset\u002Fpdfs\u002F2005-Han-LNCS.pdf)][[**代码**](https:\u002F\u002Fgithub.com\u002Fscikit-learn-contrib\u002Fimbalanced-learn\u002Fblob\u002Fmaster\u002Fimblearn\u002Fover_sampling\u002F_smote.py#L220)]** - 边界合成少数类过采样技术\n- **ADASYN（2008年，1100+次引用）[[**论文**](https:\u002F\u002Fsci2s.ugr.es\u002Fkeel\u002Fpdf\u002Falgorithm\u002Fcongreso\u002F2008-He-ieee.pdf)][[**代码**](https:\u002F\u002Fgithub.com\u002Fscikit-learn-contrib\u002Fimbalanced-learn\u002Fblob\u002Fmaster\u002Fimblearn\u002Fover_sampling\u002F_adasyn.py)]** - 自适应合成采样\n- **SPIDER（2008年，150+次引用）[[**论文**](https:\u002F\u002Fsci2s.ugr.es\u002Fkeel\u002Fpdf\u002Falgorithm\u002Fcongreso\u002Fstefanowski_selective_2008.pdf)][[**代码（Java）**](https:\u002F\u002Fgithub.com\u002FSCI2SUGR\u002FKEEL\u002Fblob\u002Fmaster\u002Fsrc\u002Fkeel\u002FAlgorithms\u002FImbalancedClassification\u002FResampling\u002FSPIDER\u002FSPIDER.java#L57)]** - 不平衡数据的有选择性预处理\n- **Safe-Level-SMOTE（2009年，370+次引用）[[**论文**](150.214.190.154\u002Fkeel\u002Fkeel-dataset\u002Fpdfs\u002F2009-Bunkhumpornpat-LNCS.pdf)][[**代码（Java）**](https:\u002F\u002Fgithub.com\u002FSCI2SUGR\u002FKEEL\u002Fblob\u002Fmaster\u002Fsrc\u002Fkeel\u002FAlgorithms\u002FImbalancedClassification\u002FResampling\u002FSafe_Level_SMOTE\u002FSafe_Level_SMOTE.java#L58)]** - 安全级别合成少数类过采样技术\n- 
**SVM-SMOTE（2009年，120+次引用）[[**论文**](ousar.lib.okayama-u.ac.jp\u002Ffiles\u002Fpublic\u002F1\u002F19617\u002F20160528004522391723\u002FIWCIA2009_A1005.pdf)][[**代码**](https:\u002F\u002Fgithub.com\u002Fscikit-learn-contrib\u002Fimbalanced-learn\u002Fblob\u002Fmaster\u002Fimblearn\u002Fover_sampling\u002F_smote.py#L417)]** - 基于支持向量机支持向量的SMOTE\n- **MDO（2015年，150+次引用）[[**论文**](https:\u002F\u002Fieeexplore.ieee.org\u002Fabstract\u002Fdocument\u002F7163639)][[**代码**](https:\u002F\u002Fgithub.com\u002Fanalyticalmindsltd\u002Fsmote_variants\u002Fblob\u002Fdedbc3d00b266954fedac0ae87775e1643bc920a\u002Fsmote_variants\u002F_smote_variants.py#L14513)]** - 基于马氏距离的*多分类*不平衡问题过采样。\n\n> **注：** 更多过采样方法请参阅 [**smote-variants**](https:\u002F\u002Fgithub.com\u002Fanalyticalmindsltd\u002Fsmote_variants#references)。\n\n#### 2.3.2 *欠采样*\n\n\u003C!-- - **欠采样** -->\n\n- **RUS [[**代码**](https:\u002F\u002Fgithub.com\u002Fscikit-learn-contrib\u002Fimbalanced-learn\u002Fblob\u002Fmaster\u002Fimblearn\u002Funder_sampling\u002F_prototype_selection\u002F_random_under_sampler.py)]** - 随机欠采样\n- **CNN（1968年，2100+次引用）[[**论文**](https:\u002F\u002Fpdfs.semanticscholar.org\u002F7c37\u002F71fd6829630cf450af853df728ecd8da4ab2.pdf?_ga=2.137274553.882046879.1583413150-1712662047.1583413150)][[**代码**](https:\u002F\u002Fgithub.com\u002Fscikit-learn-contrib\u002Fimbalanced-learn\u002Fblob\u002Fmaster\u002Fimblearn\u002Funder_sampling\u002F_prototype_selection\u002F_condensed_nearest_neighbour.py)]** - 凝聚最近邻\n- **ENN（1972年，1500+次引用）[[**论文**](https:\u002F\u002Fsci2s.ugr.es\u002Fkeel\u002Fdataset\u002Fincludes\u002FcatImbFiles\u002F1972-Wilson-IEEETSMC.pdf)][[**代码**](https:\u002F\u002Fgithub.com\u002Fscikit-learn-contrib\u002Fimbalanced-learn\u002Fblob\u002Fmaster\u002Fimblearn\u002Funder_sampling\u002F_prototype_selection\u002F_edited_nearest_neighbours.py)]** - 编辑最近邻\n- 
**TomekLink（1976年，870+次引用）[[**论文**](https:\u002F\u002Fieeexplore.ieee.org\u002Fstamp\u002Fstamp.jsp?tp=&arnumber=4309452)][[**代码**](https:\u002F\u002Fgithub.com\u002Fscikit-learn-contrib\u002Fimbalanced-learn\u002Fblob\u002Fmaster\u002Fimblearn\u002Funder_sampling\u002F_prototype_selection\u002F_tomek_links.py)]** - Tomek对凝聚最近邻的改进\n- **NCR（2001年，500+次引用）[[**论文**](https:\u002F\u002Fsci2s.ugr.es\u002Fkeel\u002Fpdf\u002Falgorithm\u002Fcongreso\u002F2001-Laurikkala-LNCS.pdf)][[**代码**](https:\u002F\u002Fgithub.com\u002Fscikit-learn-contrib\u002Fimbalanced-learn\u002Fblob\u002Fmaster\u002Fimblearn\u002Funder_sampling\u002F_prototype_selection\u002F_neighbourhood_cleaning_rule.py)]** - 邻域清理规则\n- **NearMiss-1、2和3（2003年，420+次引用）[[**论文**](https:\u002F\u002Fsci2s.ugr.es\u002Fkeel\u002Fpdf\u002Fspecific\u002Fcongreso\u002Fjzhang.pdf)][[**代码**](https:\u002F\u002Fgithub.com\u002Fscikit-learn-contrib\u002Fimbalanced-learn\u002Fblob\u002Fmaster\u002Fimblearn\u002Funder_sampling\u002F_prototype_selection\u002F_nearmiss.py)]** - 多种基于kNN的不平衡数据分布处理方法。\n- **带有TomekLink的CNN（2004年，2000+次引用）[[**论文**](https:\u002F\u002Fstorm.cis.fordham.edu\u002F~gweiss\u002Fselected-papers\u002Fbatista-study-balancing-training-data.pdf)][[**代码（Java）**](https:\u002F\u002Fgithub.com\u002FSCI2SUGR\u002FKEEL\u002Fblob\u002Fmaster\u002Fsrc\u002Fkeel\u002FAlgorithms\u002FImbalancedClassification\u002FResampling\u002FCNN_TomekLinks\u002FCNN_TomekLinks.java#L58)]** - 凝聚最近邻 + TomekLink\n- **OSS（2007年，2100+次引用）[[**论文**](https:\u002F\u002Fsci2s.ugr.es\u002Fkeel\u002Fpdf\u002Fspecific\u002Fcongreso\u002Fkubat97addressing.pdf)][[**代码**](https:\u002F\u002Fgithub.com\u002Fscikit-learn-contrib\u002Fimbalanced-learn\u002Fblob\u002Fmaster\u002Fimblearn\u002Funder_sampling\u002F_prototype_selection\u002F_one_sided_selection.py)]** - 单侧选择\n- **EUS（2009年，290+次引用）[[**论文**](https:\u002F\u002Fwww.mitpressjournals.org\u002Fdoi\u002Fpdfplus\u002F10.1162\u002Fevco.2009.17.3.275)]** - 进化欠采样\n- 
**IHT（2014年，130+次引用）[[**论文**](http:\u002F\u002Fciteseerx.ist.psu.edu\u002Fviewdoc\u002Fdownload?doi=10.1.1.649.8727&rep=rep1&type=pdf)][[**代码**](https:\u002F\u002Fgithub.com\u002Fscikit-learn-contrib\u002Fimbalanced-learn\u002Fblob\u002Fmaster\u002Fimblearn\u002Funder_sampling\u002F_prototype_selection\u002F_instance_hardness_threshold.py)]** - 实例难度阈值\n\n#### 2.3.3 *混合采样*\n\n\u003C!-- - **混合采样** -->\n\n- **关于几种平衡训练数据方法行为的研究（2004年，2000+次引用）[[**论文**](https:\u002F\u002Fwww.researchgate.net\u002Fprofile\u002FRonaldo-Prati\u002Fpublication\u002F220520041_A_Study_of_the_Behavior_of_Several_Methods_for_Balancing_machine_Learning_Training_Data\u002Flinks\u002F0d22cd91c989507054a2cf3b\u002FA-Study-of-the-Behavior-of-Several-Methods-for-Balancing-machine-Learning-Training-Data.pdf)]**\n\n  > **注：** 涉及10种不同过\u002F欠采样方法的大规模实验评估。\n  >\n\n  - **SMOTE-Tomek [[**代码**](https:\u002F\u002Fgithub.com\u002Fscikit-learn-contrib\u002Fimbalanced-learn\u002Fblob\u002Fmaster\u002Fimblearn\u002Fcombine\u002F_smote_tomek.py)]**\n  - **SMOTE-ENN [[**代码**](https:\u002F\u002Fgithub.com\u002Fscikit-learn-contrib\u002Fimbalanced-learn\u002Fblob\u002Fmaster\u002Fimblearn\u002Fcombine\u002F_smote_enn.py)]**\n- **SMOTE-RSB（2012年，210+次引用）[[**论文**](https:\u002F\u002Fsci2s.ugr.es\u002Fsites\u002Fdefault\u002Ffiles\u002FficherosPublicaciones\u002F1434_2012-Ramentol-KAIS.pdf)][[**代码**](https:\u002F\u002Fsmote-variants.readthedocs.io\u002Fen\u002Flatest\u002F_modules\u002Fsmote_variants\u002F_smote_variants.html#SMOTE_RSB)]** - 使用SMOTE和粗糙集理论的混合预处理\n- **SMOTE-IPF（2015年，180+次引用）[[**论文**](https:\u002F\u002Fsci2s.ugr.es\u002Fsites\u002Fdefault\u002Ffiles\u002FficherosPublicaciones\u002F1824_2015-INS-Saez.pdf)][[**代码**](https:\u002F\u002Fsmote-variants.readthedocs.io\u002Fen\u002Flatest\u002F_modules\u002Fsmote_variants\u002F_smote_variants.html#SMOTE_IPF)]** - 带有迭代分割滤波器的SMOTE\n\n## 2.4 代价敏感学习\n\n- 
**CSC4.5（2002年，420+次引用）[[**论文**](https:\u002F\u002Fwww.sci-hub.shop\u002F10.1109\u002Ftkde.2002.1000348)][[**代码（Java）**](https:\u002F\u002Fgithub.com\u002FSCI2SUGR\u002FKEEL\u002Fblob\u002Fmaster\u002Fsrc\u002Fkeel\u002FAlgorithms\u002FImbalancedClassification\u002FCSMethods\u002FC45CS\u002FC45CS.java#L48)]** - 一种基于实例加权的方法，用于构建代价敏感决策树\n- **CSSVM（2008年，710+次引用）[[**论文**](https:\u002F\u002Fsci2s.ugr.es\u002Fkeel\u002Fpdf\u002Falgorithm\u002Farticulo\u002F2009-Chawla-IEEE_TSMCB-svm-imbalance.pdf)][[**代码（Java）**](https:\u002F\u002Fgithub.com\u002FSCI2SUGR\u002FKEEL\u002Fblob\u002Fmaster\u002Fsrc\u002Fkeel\u002FAlgorithms\u002FImbalancedClassification\u002FCSMethods\u002FC_SVMCost\u002FsvmClassifierCost.java#L60)]** - 面向高度不平衡分类问题的代价敏感支持向量机\n- **CSNN（2005年，950+次引用）[[**论文**](https:\u002F\u002Fsci2s.ugr.es\u002Fkeel\u002Fpdf\u002Falgorithm\u002Farticulo\u002F2006%20-%20IEEE_TKDE%20-%20Zhou_Liu.pdf)][[**代码（Java）**](https:\u002F\u002Fgithub.com\u002FSCI2SUGR\u002FKEEL\u002Fblob\u002Fmaster\u002Fsrc\u002Fkeel\u002FAlgorithms\u002FImbalancedClassification\u002FCSMethods\u002FMLPerceptronBackpropCS\u002FMLPerceptronBackpropCS.java#L49)]** - 采用应对类别不平衡问题的方法来训练代价敏感神经网络。\n\n## 2.5 深度学习\n\n#### 2.5.1 *综述*\n\n\u003C!-- - **综述** -->\n\n- 卷积神经网络中类别不平衡问题的系统研究（2018年，330+次引用）[[**论文**](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1710.05381.pdf)]\n- 关于深度学习中类别不平衡问题的综述（2019年，50+次引用）[[**论文**](https:\u002F\u002Fwww.researchgate.net\u002Fpublication\u002F332165523_Survey_on_deep_learning_with_class_imbalance)]\n\n  > **注：** 最近一篇关于深度学习中类别不平衡问题的全面综述。\n  >\n\n#### 2.5.2 *图数据挖掘*\n\n\u003C!-- - **图神经网络** -->\n\n- 半监督图不平衡回归（KDD 2023）[[**论文**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2305.12087)] [[**代码**](https:\u002F\u002Fgithub.com\u002Fliugangcode\u002FSGIR)]\n- TAM：面向类别不平衡节点分类的拓扑感知边距损失（ICML 2022）[[**论文**](https:\u002F\u002Fproceedings.mlr.press\u002Fv162\u002Fsong22a\u002Fsong22a.pdf)][[**代码**](https:\u002F\u002Fgithub.com\u002FJaeyun-Song\u002FTAM)]\n- GraphSMOTE：利用图神经网络进行图上的不平衡节点分类（WSDM 
2021）[[**论文**](https:\u002F\u002Fdl.acm.org\u002Fdoi\u002Fpdf\u002F10.1145\u002F3437963.3441720)][[**代码**](https:\u002F\u002Fgithub.com\u002FTianxiangZhao\u002FGraphSmote)]\n- 拓扑不平衡学习用于半监督节点分类（NeurIPS 2021）[[**论文**](https:\u002F\u002Fproceedings.neurips.cc\u002F\u002Fpaper\u002F2021\u002Ffile\u002Ffa7cdfad1a5aaf8370ebeda47a1ff1c3-Paper.pdf)][[**代码**](https:\u002F\u002Fgithub.com\u002Fvictorchen96\u002Frenode)]\n- GraphENS：面向类别不平衡节点分类的邻域感知自我网络合成（ICLR 2022）[[**论文**](https:\u002F\u002Fopenreview.net\u002Fpdf?id=MXEl7i-iru)][[**代码**](https:\u002F\u002Fgithub.com\u002FJoonHyung-Park\u002FGraphENS)]\n- LTE4G：面向图神经网络的长尾专家模型（CIKM 2022）[[**论文**](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2208.10205.pdf)][[**代码**](https:\u002F\u002Fgithub.com\u002FSukwonYun\u002FLTE4G)]\n- 多类别不平衡图卷积网络学习（IJCAI 2020）[[**论文**](https:\u002F\u002Fpar.nsf.gov\u002Fservlets\u002Fpurl\u002F10199469)]\n\n#### 2.5.3 *困难样本挖掘*\n\n\u003C!-- - **困难样本挖掘** -->\n\n- 基于区域的目标检测器在线困难样本挖掘训练（CVPR 2016，840+次引用）[[**论文**](https:\u002F\u002Fwww.cv-foundation.org\u002Fopenaccess\u002Fcontent_cvpr_2016\u002Fpapers\u002FShrivastava_Training_Region-Based_Object_CVPR_2016_paper.pdf)][[**代码**](https:\u002F\u002Fgithub.com\u002Fabhi2610\u002Fohemh)] - 在神经网络训练的后期阶段，仅对“困难样本”（即损失值较大的样本）进行梯度反向传播。\n\n#### 2.5.4 *损失函数工程*\n\n\u003C!-- - **损失函数工程** -->\n\n- 密集目标检测中的焦点损失（ICCV 2017，2600+次引用）[[**论文**](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ICCV_2017\u002Fpapers\u002FLin_Focal_Loss_for_ICCV_2017_paper.pdf)][[**代码（detectron2）**](https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Fdetectron2)][[**代码（非官方）**](https:\u002F\u002Fgithub.com\u002Fclcarwin\u002Ffocal_loss_pytorch)] - 一种统一的损失函数，专注于训练稀疏的困难样本，以防止大量容易的负样本在训练过程中压倒检测器。\n\n  > **注：** 解决方案优雅，影响力巨大。\n  >\n- 不平衡数据集上深度神经网络的训练（IJCNN 2016，110+次引用）[[**论文**](https:\u002F\u002Fwww.researchgate.net\u002Fpublication\u002F309778930_Training_deep_neural_networks_on_imbalanced_data_sets)] - 平均（平方）错误，能够均衡地捕捉多数类和少数类的分类错误。\n- 利用视觉注意力聚合进行深度不平衡属性分类（ECCV 
2018，30+次引用）[[**论文**](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ECCV_2018\u002Fpapers\u002FNikolaos_Sarafianos_Deep_Imbalanced_Attribute_ECCV_2018_paper.pdf)][[**代码**](https:\u002F\u002Fgithub.com\u002Fcvcode18\u002Fimbalanced_learning)]\n- 通过少数类增量校正实现不平衡深度学习（TPAMI 2018，60+次引用）[[**论文**](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1804.10851.pdf)] - 类别校正损失，旨在通过迭代式分批学习过程发现少数类稀疏采样的边界，从而最小化多数类的主导效应。\n- 基于标签分布感知边距损失的学习不平衡数据集（NIPS 2019，10+次引用）[[**论文**](https:\u002F\u002Fpapers.nips.cc\u002Fpaper\u002F8435-learning-imbalanced-datasets-with-label-distribution-aware-margin-loss.pdf)][[**代码**](https:\u002F\u002Fgithub.com\u002Fkaidic\u002FLDAM-DRW)] - 一种基于理论原则的标签分布感知边距（LDAM）损失，其动机是通过最小化基于边距的泛化界来优化模型性能。\n- 梯度协调单阶段检测器（AAAI 2019，40+次引用）[[**论文**](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1811.05181.pdf)][[**代码**](https:\u002F\u002Fgithub.com\u002Flibuyu\u002FGHM_Detection)] - 相较于仅对“容易”的负样本进行降权处理的焦点损失，GHM还对“非常困难”的样本进行降权，因为这些样本很可能是异常值。\n- 基于有效样本数的类别平衡损失（CVPR 2019，70+次引用）[[**论文**](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2019\u002Fpapers\u002FCui_Class-Balanced_Loss_Based_on_Effective_Number_of_Samples_CVPR_2019_paper.pdf)][[**代码**](https:\u002F\u002Fgithub.com\u002Frichardaecn\u002Fclass-balanced-loss)] - 一种简单且通用的基于有效样本数的类别重加权机制。\n- 影响力平衡损失用于不平衡视觉分类（ICCV 2021）[[**论文**](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2110.02444.pdf)][[**代码**](https:\u002F\u002Fgithub.com\u002Fpseulki\u002FIB-Loss)]\n- AutoBalance：针对不平衡数据的优化损失函数（NeurIPS 2021）[[**论文**](https:\u002F\u002Fopenreview.net\u002Fpdf?id=ebQXflQre5a)]\n- 过参数化下的标签不平衡与群体敏感分类（NeurIPS 2021）[[**论文**](https:\u002F\u002Fproceedings.neurips.cc\u002F\u002Fpaper\u002F2021\u002Ffile\u002F9dfcf16f0adbc5e2a55ef02db36bac7f-Paper.pdf)][[**代码**](https:\u002F\u002Fgithub.com\u002Forparask\u002FVS-Loss)]\n\n#### 2.5.5 *元学习*\n\n\u003C!-- - **元学习** -->\n\n- 学习建模尾部类别（NIPS 2017，70+次引用）[[**论文**](papers.nips.cc\u002Fpaper\u002F7278-learning-to-model-the-tail.pdf)] - 将分布头部数据丰富的类别中的元知识迁移到尾部数据稀少的类别中。\n- 学习重加权样本以实现鲁棒深度学习（ICML 
2018，150+次引用）[[**论文**](https:\u002F\u002Fproceedings.mlr.press\u002Fv80\u002Fren18a\u002Fren18a.pdf)][[**代码**](https:\u002F\u002Fgithub.com\u002Fuber-research\u002Flearning-to-reweight-examples)] - 隐式学习一个权重函数，在DNN的梯度更新中对样本进行重加权。\n  \n  > **注：** 通过元学习解决类别不平衡问题的代表性工作。\n  >\n- Meta-weight-net：学习显式的样本权重映射（NIPS 2019）[[**论文**](https:\u002F\u002Fpapers.nips.cc\u002Fpaper\u002F8467-meta-weight-net-learning-an-explicit-mapping-for-sample-weighting.pdf)][[**代码**](https:\u002F\u002Fgithub.com\u002Fxjtushujun\u002Fmeta-weight-net)] - 显式地学习一个权重函数（使用MLP作为函数近似器），在DNN的梯度更新中对样本进行重加权。\n- 学习数据操作用于增强和加权（NIPS 2019）[[**论文**](https:\u002F\u002Fproceedings.neurips.cc\u002Fpaper\u002F2019\u002Ffile\u002F671f0311e2754fcdd37f70a8550379bc-Paper.pdf)][[**代码**](https:\u002F\u002Fgithub.com\u002Ftanyuqian\u002Flearning-data-manipulation)]\n- 学习平衡：面向不平衡及分布外任务的贝叶斯元学习（ICLR 2020）[[**论文**](https:\u002F\u002Fopenreview.net\u002Fattachment?id=rkeZIJBYvr&name=original_pdf)][[**代码**](https:\u002F\u002Fgithub.com\u002Fhaebeom-lee\u002Fl2b)]\n- MESA：利用MEta-SAmpler提升集成不平衡学习（NeurIPS 2020）[[**论文**](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2010.08830.pdf)][[**代码**](https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fmesa)][[**视频**](https:\u002F\u002Fstudio.slideslive.com\u002Fweb_recorder\u002Fshare\u002F20201020T134559Z__NeurIPS_posters__17343__mesa-effective-ensemble-imbal?s=d3745afc-cfcf-4d60-9f34-63d3d811b55f)]\n\n  > **注：** 元学习驱动的集成学习\n  >\n\n#### 2.5.6 *表示学习*\n\n\u003C!-- - **表示学习** -->\n\n- 为不平衡分类学习深度表示（CVPR 2016，220+次引用）[[**论文**](https:\u002F\u002Fwww.cv-foundation.org\u002Fopenaccess\u002Fcontent_cvpr_2016\u002Fpapers\u002FHuang_Learning_Deep_Representation_CVPR_2016_paper.pdf)]\n- 基于GAN的不平衡分类的监督类分布学习（ICDM 2019）[[**论文**](https:\u002F\u002Fieeexplore.ieee.xilesou.top\u002Fabstract\u002Fdocument\u002F8970900)]\n- 解耦表示与分类器以进行长尾识别（ICLR 2020）[[**论文**](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1910.09217.pdf)][[**代码**](https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Fclassifier-balancing)]\n\n  > 
**注：** 关于表示学习和分类器学习的有趣发现\n  >\n- 利用基于能量的对比表示迁移加速不平衡数据学习（NeurIPS 2021）[[**论文**](https:\u002F\u002Fproceedings.neurips.cc\u002F\u002Fpaper\u002F2021\u002Ffile\u002Fb151ce4935a3c2807e1dd9963eda16d8-Paper.pdf)]\n- 为监督学习量身定制自监督（ECCV 2022）[[**论文**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2207.10023)][[**代码**](https:\u002F\u002Fgithub.com\u002Fwjun0830\u002FLocalizable-Rotation)]\n\n#### 2.5.7 *后验校准*\n\n\u003C!-- - **后验校准** -->\n\n- 面向不平衡数据的后验重新校准（NeurIPS 2020）[[**论文**](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2010.11820.pdf)][[**代码**](https:\u002F\u002Fgithub.com\u002FGT-RIPL\u002FUNO-IC)]\n- 通过logit调整进行长尾学习（ICLR 2021）[[**论文**](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2007.07314v1.pdf)][[**代码**](https:\u002F\u002Fgithub.com\u002Fgoogle-research\u002Fgoogle-research\u002Ftree\u002Fmaster\u002Flogit_adjustment)]\n\n#### 2.5.8 *半\u002F自监督学习*\n\n\u003C!-- - **半\u002F自监督学习** -->\n\n- 重新思考标签在改善类别不平衡学习中的价值（NeurIPS 2020）[[**论文**](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2006.07529.pdf)][[**代码**](https:\u002F\u002Fgithub.com\u002FYyzHarry\u002Fimbalanced-semi-self)][[**视频**](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=XltXZ3OZvyI&feature=youtu.be)]\n\n  > **注：** 半监督训练\u002F自监督预训练有助于不平衡学习\n  >\n- 不平衡半监督学习中伪标签的分布对齐精炼器（NeurIPS 2020）[[**论文**](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2007.08844.pdf)][[**代码**](https:\u002F\u002Fgithub.com\u002Fbbuing9\u002FDARP)]\n- ABC：面向类别不平衡半监督学习的辅助平衡分类器（NeurIPS 2021）[[**论文**](https:\u002F\u002Fproceedings.neurips.cc\u002F\u002Fpaper\u002F2021\u002Ffile\u002F3953630da28e5181cffca1278517e3cf-Paper.pdf)][[**代码**](https:\u002F\u002Fgithub.com\u002Fleehyuck\u002Fabc)]\n- 通过开放世界采样改进不平衡数据上的对比学习（NeurIPS 2021）[[**论文**](https:\u002F\u002Fproceedings.neurips.cc\u002F\u002Fpaper\u002F2021\u002Ffile\u002F2f37d10131f2a483a8dd005b3d14b0d9-Paper.pdf)]\n- DASO：面向不平衡半监督学习的分布感知语义导向伪标签（CVPR 2022）[[**论文**](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2106.05682)][[**代码**](https:\u002F\u002Fgithub.com\u002Fytaek-oh\u002Fdaso)]\n\n#### 2.5.9 *课程学习*\n\n\u003C!-- - 
**课程学习** -->\n\n- 面向不平衡数据分类的动态课程学习（ICCV 2019）[[**论文**](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ICCV_2019\u002Fpapers\u002FWang_Dynamic_Curriculum_Learning_for_Imbalanced_Data_Classification_ICCV_2019_paper.pdf)]\n\n#### 2.5.10 *两阶段训练*\n\n\u003C!-- - **两阶段训练** -->\n\n- 使用深度神经网络进行脑肿瘤分割（2017年，1200+次引用）[[**论文**](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1505.03540.pdf)][[**代码（非官方）**](https:\u002F\u002Fgithub.com\u002Fnaldeborgh7575\u002Fbrain_segmentation)]\n\n  > 先在平衡数据集上进行预训练，然后在原始的不平衡数据上对最后一个输出层进行微调，再接softmax。\n  >\n\n#### 2.5.11 *网络架构*\n\n\u003C!-- - **网络架构** -->\n\n- BBN：具有累积学习的双分支网络，用于长尾视觉识别（CVPR 2020）[[**论文**](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1912.02413.pdf)][[**代码**](https:\u002F\u002Fgithub.com\u002FMegvii-Nanjing\u002FBBN)]\n- 通过类别平衡集成实现类别不平衡深度学习（TNNLS 2021）[[**论文**](https:\u002F\u002Fieeexplore.ieee.org\u002Fabstract\u002Fdocument\u002F9416240)]\n\n#### 2.5.12 *深度生成模型*\n\n\u003C!-- - **深度生成模型** -->\n\n- 用于鲁棒不平衡分类的深度生成模型（CVPR 2020）[[**论文**](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2020\u002Fpapers\u002FWang_Deep_Generative_Model_for_Robust_Imbalance_Classification_CVPR_2020_paper.pdf)]\n\n#### 2.5.13 *不平衡回归*\n\n\u003C!-- - **不平衡回归** -->\n\n- 半监督图不平衡回归（KDD 2023）[[**论文**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2305.12087)] [[**代码**](https:\u002F\u002Fgithub.com\u002Fliugangcode\u002FSGIR)]\n- RankSim：用于深度不平衡回归的排序相似性正则化（ICML 2022）[[**论文**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2205.15236)] [[**代码**](https:\u002F\u002Fgithub.com\u002FBorealisAI\u002Franksim-imbalanced-regression)]\n- 不平衡视觉回归中的平衡均方误差（CVPR 2022）[[**论文**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2203.16427)] [[**代码**](https:\u002F\u002Fgithub.com\u002Fjiawei-ren\u002FBalancedMSE)]\n- 深入研究深度不平衡回归（ICML 2021）[[**论文**](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2102.09554.pdf)][[**代码**](https:\u002F\u002Fgithub.com\u002FYyzHarry\u002Fimbalanced-regression)][[**视频**](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=grJGixofQRU)]\n- 
基于密度的不平衡回归加权方法（机器学习[J]，2021年）[[**论文**](https:\u002F\u002Flink.springer.com\u002Farticle\u002F10.1007\u002Fs10994-021-06023-5)][[**代码**](https:\u002F\u002Fgithub.com\u002FSteiMi\u002Fdensity-based-weighting-for-imbalanced-regression)]\n\n#### 2.5.14 *数据增强*\n\n\u003C!-- - **Augmentation** -->\n\n- 面向少数类的邻域扩展与注意力聚合在视频长尾识别中的应用（AAAI 2023）[[**论文**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2211.13471)][[**代码**](https:\u002F\u002Fgithub.com\u002Fwjun0830\u002FMOVE)]\n\n\u003C!-- ## 2.6 异常检测\n\n#### 2.6.1 **综述**\n\n  - 异常检测：综述（ACM计算综述，2009年，9000+引用）[[**论文**](cinslab.com\u002Fwp-content\u002Fuploads\u002F2019\u002F03\u002Fxiaorong.pdf)]\n  \n  - 网络异常检测技术综述（2017年，700+引用）[[**论文**](https:\u002F\u002Fwww.gta.ufrj.br\u002F~alvarenga\u002Ffiles\u002FCPE826\u002FAhmed2016-Survey.pdf)]\n\n#### 2.6.2 **基于分类的方法**\n\n  - 用于文档分类的一类支持向量机（JMLR，2001年，1300+引用）[[**论文**](www.jmlr.org\u002Fpapers\u002Fvolume2\u002Fmanevitz01a\u002Fmanevitz01a.pdf)]\n  \n  - 一类协同过滤（ICDM 2008年，1000+引用）[[**论文**](https:\u002F\u002Fcseweb.ucsd.edu\u002Fclasses\u002Ffa17\u002Fcse291-b\u002Freading\u002F04781145.pdf)]\n  \n  - 孤立森林（ICDM 2008年，1000+引用）[[**论文**](https:\u002F\u002Fcs.nju.edu.cn\u002Fzhouzh\u002Fzhouzh.files\u002Fpublication\u002Ficdm08b.pdf?q=isolation-forest)]\n  \n  - 使用一类神经网络进行异常检测（2018年，200+引用）[[**论文**](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1802.06360.pdf)]\n  \n  - 基于鲁棒深度自编码器的异常检测（KDD 2017年，170+引用）[[**论文**](https:\u002F\u002Fpdfs.semanticscholar.org\u002Fc112\u002Fb06d3dac590b4cc111e5ec9c805d0b086c6e.pdf)] -->\n\n\n\n# 3. 
杂项\n\n## 3.1 数据集\n\n- **`imbalanced-learn` 数据集**\n\n  > 该数据集集合来自 [`imblearn.datasets.fetch_datasets`](https:\u002F\u002Fimbalanced-learn.org\u002Fstable\u002Freferences\u002Fgenerated\u002Fimblearn.datasets.fetch_datasets.html)。\n  >\n\n  | ID  | 名称           | 数据源及目标           | 比例 | 样本数      | 特征数  |\n  | --- | -------------- | ----------------------------- | ----- | ------- | --- |\n  | 1   | ecoli          | UCI, 目标: imU              | 8.6:1 | 336     | 7   |\n  | 2   | optical_digits | UCI, 目标: 8                | 9.1:1 | 5,620   | 64  |\n  | 3   | satimage       | UCI, 目标: 4                | 9.3:1 | 6,435   | 36  |\n  | 4   | pen_digits     | UCI, 目标: 5                | 9.4:1 | 10,992  | 16  |\n  | 5   | abalone        | UCI, 目标: 7                | 9.7:1 | 4,177   | 10  |\n  | 6   | sick_euthyroid | UCI, 目标: 患有甲状腺功能减退症 | 9.8:1 | 3,163   | 42  |\n  | 7   | spectrometer   | UCI, 目标: >=44             | 11:1  | 531     | 93  |\n  | 8   | car_eval_34    | UCI, 目标: 良好、非常好     | 12:1  | 1,728   | 21  |\n  | 9   | isolet         | UCI, 目标: A、B             | 12:1  | 7,797   | 617 |\n  | 10  | us_crime       | UCI, 目标: >0.65            | 12:1  | 1,994   | 100 |\n  | 11  | yeast_ml8      | LIBSVM, 目标: 8             | 13:1  | 2,417   | 103 |\n  | 12  | scene          | LIBSVM, 目标: 多于一个标签  | 13:1  | 2,407   | 294 |\n  | 13  | libras_move    | UCI, 目标: 1                | 14:1  | 360     | 90  |\n  | 14  | thyroid_sick   | UCI, 目标: 患病             | 15:1  | 3,772   | 52  |\n  | 15  | coil_2000      | KDD, CoIL, 目标: 少数类     | 16:1  | 9,822   | 85  |\n  | 16  | arrhythmia     | UCI, 目标: 06               | 17:1  | 452     | 278 |\n  | 17  | solar_flare_m0 | UCI, 目标: M->0             | 19:1  | 1,389   | 32  |\n  | 18  | oil            | UCI, 目标: 少数类           | 22:1  | 937     | 49  |\n  | 19  | car_eval_4     | UCI, 目标: vgood            | 26:1  | 1,728   | 21  |\n  | 20  | wine_quality   | UCI, 葡萄酒，目标: \u003C=4     | 26:1  | 4,898   | 11  |\n  | 21  | letter_img     | UCI, 目标: Z 
               | 26:1  | 20,000  | 16  |\n  | 22  | yeast_me2      | UCI, 目标: ME2              | 28:1  | 1,484   | 8   |\n  | 23  | webpage        | LIBSVM，w7a，目标: 少数类   | 33:1  | 34,780  | 300 |\n  | 24  | ozone_level    | UCI，臭氧数据               | 34:1  | 2,536   | 72  |\n  | 25  | mammography    | UCI，目标: 少数类           | 42:1  | 11,183  | 6   |\n  | 26  | protein_homo   | KDD CUP 2004，少数类        | 111:1 | 145,751 | 74  |\n  | 27  | abalone_19     | UCI，目标: 19               | 130:1 | 4,177   | 10  |\n- **不平衡数据库**\n\n  链接：https:\u002F\u002Fgithub.com\u002Fgykovacs\u002Fmldb\n\n## 3.2 GitHub 仓库\n\n### 3.2.1 *算法、工具及 Jupyter 笔记本*\n\n- [imbalanced-algorithms](https:\u002F\u002Fgithub.com\u002Fdialnd\u002Fimbalanced-algorithms) - 基于 Python 的不平衡数据学习算法实现。\n- [imbalanced-dataset-sampler](https:\u002F\u002Fgithub.com\u002Fufoym\u002Fimbalanced-dataset-sampler) - （PyTorch）不平衡数据采样器，用于对低频类进行过采样，对高频类进行欠采样。\n- [class_imbalance](https:\u002F\u002Fgithub.com\u002Fwangz10\u002Fclass_imbalance) - 关于二分类中类别不平衡问题的 Jupyter Notebook 演示。\n- [Multi-class-with-imbalanced-dataset-classification](https:\u002F\u002Fgithub.com\u002Fjavaidnabi31\u002FMulti-class-with-imbalanced-dataset-classification) - 在不平衡的 20 新闻组数据集上执行多分类任务。\n- [使用 scikit-learn 进行高级机器学习：不平衡分类与文本数据](https:\u002F\u002Fgithub.com\u002Famueller\u002Fml-workshop-4-of-4) - 不同的特征选择方法以及针对不平衡数据的重采样方法。\n\n### 3.2.2 *论文列表*\n\n- [异常检测学习资源](https:\u002F\u002Fgithub.com\u002Fyzhao062\u002Fanomaly-detection-resources) 由 [yzhao062](https:\u002F\u002Fgithub.com\u002Fyzhao062) 整理 - 包括异常检测相关的书籍、论文、视频和工具箱。\n- [基于深度学习的不平衡时间序列分类论文列表](https:\u002F\u002Fgithub.com\u002Fdanielgy\u002FPaper-list-on-Imbalanced-Time-series-Classification-with-Deep-Learning) - 不平衡时间序列分类\n\n### 3.2.3 *幻灯片*\n\n- [acm_imbalanced_learning](https:\u002F\u002Fgithub.com\u002Ftimgasser\u002Facm_imbalanced_learning) - 2016年4月27日在德克萨斯州奥斯汀举行的 ACM 不平衡学习讲座的幻灯片和代码。\n\n# 贡献者 ✨\n\n感谢以下各位优秀的朋友（[emoji 
key](https:\u002F\u002Fallcontributors.org\u002Fdocs\u002Fen\u002Femoji-key)）：\n\n\u003C!-- ALL-CONTRIBUTORS-LIST:START - 请勿删除或修改此部分 -->\n\u003C!-- prettier-ignore-start -->\n\u003C!-- markdownlint-disable -->\n\u003Ctable>\n  \u003Ctbody>\n    \u003Ctr>\n      \u003Ctd align=\"center\" valign=\"top\" width=\"14.28%\">\u003Ca href=\"http:\u002F\u002Fzhiningliu.com\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FZhiningLiu1998_awesome-imbalanced-learning_readme_c45cf0a4b7eb.png\" width=\"100px;\" alt=\"Zhining Liu\"\u002F>\u003Cbr \u002F>\u003Csub>\u003Cb>Zhining Liu\u003C\u002Fb>\u003C\u002Fsub>\u003C\u002Fa>\u003Cbr \u002F>\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fawesome-imbalanced-learning\u002Fcommits?author=ZhiningLiu1998\" title=\"代码\">💻\u003C\u002Fa> \u003Ca href=\"#maintenance-ZhiningLiu1998\" title=\"维护\">🚧\u003C\u002Fa> \u003Ca href=\"#translation-ZhiningLiu1998\" title=\"翻译\">🌍\u003C\u002Fa>\u003C\u002Ftd>\n      \u003Ctd align=\"center\" valign=\"top\" width=\"14.28%\">\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FAshinZeng\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FZhiningLiu1998_awesome-imbalanced-learning_readme_7d6ec2f76359.png\" width=\"100px;\" alt=\"曾阿信\"\u002F>\u003Cbr \u002F>\u003Csub>\u003Cb>曾阿信\u003C\u002Fb>\u003C\u002Fsub>\u003C\u002Fa>\u003Cbr \u002F>\u003Ca href=\"#maintenance-AshinZeng\" title=\"维护\">🚧\u003C\u002Fa>\u003C\u002Ftd>\n      \u003Ctd align=\"center\" valign=\"top\" width=\"14.28%\">\u003Ca href=\"https:\u002F\u002Fwjun0830.github.io\u002F\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FZhiningLiu1998_awesome-imbalanced-learning_readme_294bc5598a5b.png\" width=\"100px;\" alt=\"WonJun Moon\"\u002F>\u003Cbr \u002F>\u003Csub>\u003Cb>WonJun Moon\u003C\u002Fb>\u003C\u002Fsub>\u003C\u002Fa>\u003Cbr \u002F>\u003Ca 
href=\"https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fawesome-imbalanced-learning\u002Fcommits?author=wjun0830\" title=\"代码\">💻\u003C\u002Fa>\u003C\u002Ftd>\n      \u003Ctd align=\"center\" valign=\"top\" width=\"14.28%\">\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fliugangcode\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FZhiningLiu1998_awesome-imbalanced-learning_readme_623faafa8cd5.png\" width=\"100px;\" alt=\"Gang Liu\"\u002F>\u003Cbr \u002F>\u003Csub>\u003Cb>Gang Liu\u003C\u002Fb>\u003C\u002Fsub>\u003C\u002Fa>\u003Cbr \u002F>\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fawesome-imbalanced-learning\u002Fcommits?author=liugangcode\" title=\"代码\">💻\u003C\u002Fa>\u003C\u002Ftd>\n    \u003C\u002Ftr>\n  \u003C\u002Ftbody>\n\u003C\u002Ftable>\n\n\u003C!-- markdownlint-restore -->\n\u003C!-- prettier-ignore-end -->\n\n\u003C!-- ALL-CONTRIBUTORS-LIST:END -->\n\n本项目遵循 [all-contributors](https:\u002F\u002Fgithub.com\u002Fall-contributors\u002Fall-contributors) 规范。欢迎任何形式的贡献！","# awesome-imbalanced-learning 快速上手指南\n\n`awesome-imbalanced-learning` 是一个精选的不平衡学习（Imbalanced Learning）资源列表，汇集了相关的论文、代码框架和库。它本身不是一个单一的软件包，而是一个资源导航项目。对于开发者而言，最直接的“上手”方式是使用其中推荐的顶级 Python 工具库，特别是 **`imbalanced-ensemble`** 和 **`imbalanced-learn`**。\n\n本指南将重点介绍如何在中国网络环境下安装并使用这两个核心库来解决类别不平衡问题。\n\n## 1. 环境准备\n\n在开始之前，请确保您的开发环境满足以下要求：\n\n*   **操作系统**: Windows, macOS 或 Linux\n*   **Python 版本**: Python 3.7 或更高版本 (推荐 3.8+)\n*   **前置依赖**:\n    *   `pip` (Python 包管理工具)\n    *   `scikit-learn` (机器学习基础库)\n    *   `numpy`, `pandas`, `matplotlib` (数据处理与可视化)\n\n**建议**：使用虚拟环境（如 `venv` 或 `conda`）以避免依赖冲突。\n\n```bash\n# 创建并激活虚拟环境 (可选但推荐)\npython -m venv imbal_env\n# Windows:\nimbal_env\\Scripts\\activate\n# macOS\u002FLinux:\nsource imbal_env\u002Fbin\u002Factivate\n```\n\n## 2. 
安装步骤\n\n由于网络原因，建议国内开发者使用国内镜像源进行安装，以获得更快的下载速度。\n\n### 方案 A：安装 `imbalanced-learn` (侧重重采样技术)\n这是最经典的不平衡学习库，提供 SMOTE、欠采样等基础算法。\n\n```bash\npip install imbalanced-learn -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple\n```\n\n### 方案 B：安装 `imbalanced-ensemble` (侧重集成学习)\n这是一个更现代的工具箱，内置多种专为不平衡数据设计的集成算法（如 SelfPacedEnsemble、BalanceCascade），原生支持多分类，性能优化也更完善。\n\n```bash\npip install imbalanced-ensemble -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple\n```\n\n### 方案 C：安装常用依赖\n确保已安装基础数据科学栈：\n\n```bash\npip install scikit-learn numpy pandas matplotlib -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple\n```\n\n## 3. 基本使用\n\n以下示例展示如何使用上述两个库处理一个简单的二分类不平衡数据集。\n\n### 场景 1：使用 `imbalanced-learn` 进行 SMOTE 过采样\n\n此方法通过生成少数类样本来平衡数据集。\n\n```python\nfrom sklearn.datasets import make_classification\nfrom sklearn.model_selection import train_test_split\nfrom sklearn.linear_model import LogisticRegression\nfrom sklearn.metrics import classification_report\nfrom imblearn.over_sampling import SMOTE\nfrom imblearn.pipeline import Pipeline as ImbPipeline\n\n# 1. 生成模拟的不平衡数据 (少数类占比约 1%)\nX, y = make_classification(n_samples=1000, n_features=20, n_classes=2, \n                           n_informative=2, n_redundant=0, \n                           weights=[0.99, 0.01], random_state=42)\n\n# 2. 划分训练集和测试集\nX_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)\n\n# 3. 构建包含 SMOTE 和分类器的管道\n# 注意：必须在训练集上拟合 SMOTE，严禁在测试集上过采样\npipeline = ImbPipeline([\n    ('smote', SMOTE(random_state=42)),\n    ('classifier', LogisticRegression())\n])\n\n# 4. 训练模型\npipeline.fit(X_train, y_train)\n\n# 5. 
评估\ny_pred = pipeline.predict(X_test)\nprint(classification_report(y_test, y_pred))\n```\n\n### 场景 2：使用 `imbalanced-ensemble` 进行集成学习\n\n此方法使用专门设计的集成算法（如 `SelfPacedEnsemble`）直接在不平衡数据上训练，无需显式重采样。\n\n```python\nfrom sklearn.datasets import make_classification\nfrom sklearn.model_selection import train_test_split\nfrom sklearn.metrics import classification_report\nfrom imbens.ensemble import SelfPacedEnsembleClassifier  # 来自 imbalanced-ensemble 库（导入名为 imbens）\n\n# 1. 生成模拟的不平衡数据\nX, y = make_classification(n_samples=1000, n_features=20, n_classes=2, \n                           n_informative=2, n_redundant=0, \n                           weights=[0.99, 0.01], random_state=42)\n\n# 2. 划分数据集\nX_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)\n\n# 3. 初始化自步集成分类器\n# n_estimators: 基估计器数量；基模型默认为决策树，可通过构造参数替换\nspe_clf = SelfPacedEnsembleClassifier(n_estimators=10, random_state=42)\n\n# 4. 直接在不平衡数据上训练\nspe_clf.fit(X_train, y_train)\n\n# 5. 评估\ny_pred = spe_clf.predict(X_test)\nprint(classification_report(y_test, y_pred))\n\n# 可选：imbalanced-ensemble 另提供 ImbalancedEnsembleVisualizer，可用于可视化训练过程（详见其官方文档）\n```\n\n### 核心提示\n*   **API 兼容性**: `imbalanced-learn` 和 `imbalanced-ensemble` 的 API 设计高度兼容 `scikit-learn`，你可以像使用 `RandomForestClassifier` 一样使用它们。\n*   **管道整合**: 强烈建议使用 `imblearn` 提供的 `Pipeline` 封装重采样步骤（如场景 1）以防止数据泄露；`scikit-learn` 原生 `Pipeline` 不支持采样器。\n*   **多分类支持**: `imbalanced-ensemble` 原生支持多分类不平衡问题，无需额外配置。","某金融科技公司风控团队正致力于构建信用卡欺诈检测模型，面对的是典型的极度不平衡数据场景（欺诈交易仅占万分之五）。\n\n### 没有 awesome-imbalanced-learning 时\n- **盲目试错成本高**：工程师需在海量论文中手动筛选适合长尾分布的算法，耗时数周仍难以确定最优技术路线。\n- **复现代码困难**：找到的开源代码往往依赖混乱或缺乏文档，导致复现经典不平衡学习算法（如重采样、代价敏感学习）失败率高。\n- **模型性能瓶颈**：直接套用常规分类器导致模型严重偏向多数类，欺诈召回率极低，大量风险交易被漏判。\n- **缺乏系统框架**：团队只能零散拼凑数据处理与训练脚本，无法系统化对比不同不平衡学习策略的效果。\n\n### 使用 awesome-imbalanced-learning 后\n- **精准定位方案**：直接查阅按领域整理的顶会论文列表，快速锁定适用于金融欺诈场景的 SOTA 算法（如集成学习方法）。\n- **高效落地实践**：利用收录的高质量代码库及官方推荐的 `imbalanced-ensemble` 工具包，几天内即可完成复杂算法的部署与调优。\n- **显著提升指标**：应用成熟的不平衡学习策略后，模型在保持低误报率的同时，将欺诈交易召回率提升了 40%。\n- 
**体系化研发流程**：基于清晰的框架分类，团队建立了从数据重采样到损失函数优化的标准化实验流水线，加速迭代。\n\nawesome-imbalanced-learning 将原本需要数月摸索的长尾学习难题，转化为可快速复用的高效工程实践，让模型在极端不平衡数据下依然精准可靠。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FZhiningLiu1998_awesome-imbalanced-learning_83410963.png","ZhiningLiu1998","Zhining Liu","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002FZhiningLiu1998_c45cf0a4.jpg","Ph.D. Candidate\r\n@iDEA-iSAIL-Lab-UIUC","University of Illinois Urbana-Champaign","Palo Alto, CA","liu326@illinois.edu","zhining_liu","zhiningliu.com","https:\u002F\u002Fgithub.com\u002FZhiningLiu1998",null,1516,230,"2026-04-01T01:04:42","CC0-1.0",1,"","未说明",{"notes":94,"python":92,"dependencies":95},"该仓库是一个非平衡学习相关的论文、代码和库的精选列表，本身不是一个单一的可执行工具。其中列出的主要 Python 库（如 imbalanced-ensemble, imbalanced-learn）通常兼容主流操作系统，具体版本依赖需参考各子项目的文档。部分算法支持并行加速（通过 joblib），但未明确提及 GPU 需求。",[96,97,98],"scikit-learn","imbalanced-learn","joblib",[13,51],[101,102,103,104,105,106,107,108,109,110,111,112,113],"awesome","awesome-list","imbalanced-learning","imbalanced-classification","class-imbalance","machine-learning","ensemble-learning","imbalanced-data","deep-learning","imbalanced-classes","skewed-data","fairness-ml","fair-ml","2026-03-27T02:49:30.150509","2026-04-06T07:13:31.519340",[117,122,127,132,137,142],{"id":118,"question_zh":119,"answer_zh":120,"source_url":121},14081,"项目是否有计划提供类别不平衡的数据集？","维护者已经在项目中添加了一些类别不平衡的数据集资源，用户可以直接在仓库中查找和使用这些数据集。","https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fawesome-imbalanced-learning\u002Fissues\u002F1",{"id":123,"question_zh":124,"answer_zh":125,"source_url":126},14080,"在评估不平衡学习模型时，为什么简单的线性回归在 average_precision_score 指标上得分最高？","使用 average_precision_score 时必须确保所有对比模型输出的都是概率值（probability），拿输出二元标签（binary label）的模型与输出概率的模型对比是完全错误的。如果确认不是这个问题，请检查集成学习中基树（base tree）的数量：太少的基树会导致输出的概率非常离散（极端情况下，只有一个 max_depth=1 的树时，输出概率等价于二元标签）。如果排除了上述问题，虽然在特定数据集上线性回归可能表现更好，但在真实任务中，线性回归通常很难在各项指标上击败大规模的 GBDT 模型（如 XGBoost\u002FLightGBM）。此外，建议多阅读官方文档或维基百科，避免仅依赖 AI 
生成的未经验证的答案来提出问题。","https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fawesome-imbalanced-learning\u002Fissues\u002F18",{"id":128,"question_zh":129,"answer_zh":130,"source_url":131},14082,"如何向该项目贡献代码或成为贡献者？","用户可以通过提交 Pull Request (PR) 来贡献内容。对于希望被记录为贡献者的用户，可以在评论中使用 `@all-contributors please add @用户名 for 贡献类型`（例如翻译 translation 或维护 maintenance）的格式请求维护者添加，机器人会自动创建相应的 PR 来更新贡献者列表。","https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fawesome-imbalanced-learning\u002Fissues\u002F5",{"id":133,"question_zh":134,"answer_zh":135,"source_url":136},14083,"能否将某些特定的应用论文（如 IoT 领域的自动机器学习系统）添加到不平衡学习列表中？","如果论文的核心内容主要是设计针对特定应用（如 IoT）的自动机器学习系统，而未深入探讨类别不平衡学习的核心方法，则不适合添加到此列表中。建议此类论文参考更垂直领域的列表（如 awesome-iot）。只有当论文明确解决了类别不平衡问题时，才适合被收录。","https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fawesome-imbalanced-learning\u002Fissues\u002F9",{"id":138,"question_zh":139,"answer_zh":140,"source_url":141},14084,"如何在语义分割任务中处理像素级的类别不平衡问题？","该列表主要关注通用的不平衡学习算法和资源，不直接提供针对语义分割像素统计的具体代码修改方案。用户需要自行编写代码对像素进行统计并计算比值，或者查阅专门针对语义分割和不平衡学习的结合方案。如果在提问时无法清晰描述具体的技术卡点（如具体的统计方法或代码报错），建议先明确具体问题后再发起新的讨论。","https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fawesome-imbalanced-learning\u002Fissues\u002F2",{"id":143,"question_zh":144,"answer_zh":145,"source_url":146},14085,"如何推荐 NeurIPS 2020 等会议上的最新不平衡学习相关工作？","用户可以在此项目的 Issue 中提交最新的相关论文链接（包括知乎解读或 GitHub 仓库）。维护者在确认论文内容与类别不平衡或长尾分布学习相关后，会将其更新到列表中，并感谢提议者的贡献（甚至将其添加为维护者或贡献者）。","https:\u002F\u002Fgithub.com\u002FZhiningLiu1998\u002Fawesome-imbalanced-learning\u002Fissues\u002F3",[]]