## Similar tools

**openclaw** (openclaw/openclaw) · 349,277 stars · tags: Agent, development framework, image, data tools · last commit 2026-04-06

OpenClaw is a local-first AI assistant built for individuals, designed to give you a fully controllable intelligent companion on your own devices. It breaks the usual limitation of AI assistants being tied to a specific web page or app, plugging directly into the communication channels you already use, including WeChat, WhatsApp, Telegram, Discord, iMessage, and dozens of other platforms. Whichever chat app you message from, OpenClaw responds immediately; it also supports voice interaction on macOS, iOS, and Android and offers a live canvas rendering surface you can manipulate.

The tool targets users' needs for data privacy, responsiveness, and an "always-on" experience. By running the AI locally, you get fast, private assistance without depending on cloud services, keeping your data under your own control. Its distinctive technical highlight is a gateway architecture that separates the control plane from the core assistant, keeping cross-platform communication smooth and extensible.

OpenClaw suits tech enthusiasts and developers who want to build personalized workflows, as well as everyday users who value privacy and do not want to be locked into a single ecosystem. Basic terminal skills are enough (macOS, Linux, and Windows WSL2 are supported) to deploy it through a simple command-line setup. If you want an assistant that understands you…

**stable-diffusion-webui** (AUTOMATIC1111/stable-diffusion-webui) · 162,132 stars · tags: development framework, image, Agent · last commit 2026-04-05

stable-diffusion-webui is a web interface built on Gradio that makes it easy to run and use the powerful Stable Diffusion image-generation models locally. It addresses the pain points of the original models being command-line driven, hard to approach, and scattered across features, consolidating the complex AI image-generation workflow into an intuitive graphical platform.

Casual creators who want to get going quickly, designers who need fine control over image details, and developers or researchers who want to explore the models in depth can all benefit. Its core strength is an extremely rich feature set: beyond basic text-to-image, image-to-image, inpainting, and outpainting, it adds advanced features such as attention adjustment, prompt matrices, negative prompts, and "highres fix". It also bundles face-restoration tools such as GFPGAN and CodeFormer, supports multiple neural-network upscalers, and can be extended without limit through its plugin system. Even on devices with limited VRAM, stable-diffusion-webui provides suitable optimization options, putting high-quality AI art creation within easy reach.

**everything-claude-code** (affaan-m/everything-claude-code) · 147,882 stars · tags: development framework, Agent, language model · last commit 2026-04-09

everything-claude-code is a performance-oriented optimization system built for AI coding assistants such as Claude Code, Codex, and Cursor. It is not just a set of configuration files but a complete, battle-tested framework designed to address the core pain points AI agents face in real development: low efficiency, lost memory, security risks, and a lack of continual learning.

By introducing modular skills, intuition augmentation, persistent memory, and built-in security scanning, everything-claude-code markedly improves how AI performs on complex tasks and helps developers build more stable, production-grade AI agents. Its "research-first" development philosophy and token-usage optimizations make model responses faster and cheaper while defending against potential attack vectors.

The toolkit is aimed at software developers, AI researchers, and teams that want to customize AI workflows in depth. Whether you are building a large codebase or need AI help with security audits and automated testing, everything-claude-code provides strong foundations. An open-source project that won an Anthropic hackathon award, it combines multi-language support with a rich set of practical hooks, letting the AI grow into an assistant that understands…

**ComfyUI** (Comfy-Org/ComfyUI) · 108,111 stars · tags: development framework, image, Agent · last commit 2026-04-08

ComfyUI is a powerful, highly modular visual AI engine built for designing and executing complex Stable Diffusion image-generation pipelines. It replaces the traditional code-first approach with an intuitive node-graph interface, letting users build personalized generation pipelines by connecting functional modules.

This design solves the pain points of advanced AI image workflows being complicated to configure and inflexible. Without a programming background, users can freely combine models, tune parameters, and preview results in real time, covering everything from basic text-to-image generation to multi-step high-resolution refinement. ComfyUI is highly compatible: it runs on Windows, macOS, and Linux, supports NVIDIA, AMD, Intel, and Apple Silicon hardware, and was among the first tools to support frontier models such as SDXL, Flux, and SD3.

Researchers and developers probing the algorithms' potential, as well as designers and experienced AI-art enthusiasts chasing maximum creative freedom, are all well served. Its modular architecture lets the community keep extending it, making it one of the most flexible and ecosystem-rich open-source diffusion-model tools available and helping users turn ideas into results efficiently.

**markitdown** (microsoft/markitdown) · 93,400 stars · tags: plugin, development framework · last commit 2026-04-06

MarkItDown is a lightweight Python tool from Microsoft's AutoGen team, designed to convert a wide range of files into Markdown efficiently. It can parse PDF, Word, Excel, PowerPoint, images (with OCR), audio (with speech transcription), HTML, and even YouTube links, accurately extracting key structure such as headings, lists, tables, and links.

As AI applications spread, large language models (LLMs) handle text well but struggle to read complex binary office documents directly. MarkItDown solves exactly this pain point: it turns unstructured or semi-structured files into Markdown that models understand "natively" and that is highly token-efficient, making it an ideal bridge between local files and AI analysis pipelines. It also provides an MCP (Model Context Protocol) server that integrates with LLM applications such as Claude Desktop. The tool is especially useful for developers, data scientists, and AI researchers, particularly those building document retrieval-augmented generation (RAG) systems, running batch text analysis, or wanting an AI assistant to "read" local files directly. The output is reasonably human-readable, but its core strength is serving machines…
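To make the conversion workflow described above concrete, here is a minimal sketch of the markitdown package's basic Python usage; the class and method names follow the project's documented entry point, but the file name is hypothetical and constructor options may vary between versions.

```python
# Minimal sketch of markitdown usage (file name is hypothetical; API details may vary by version).
from markitdown import MarkItDown

md = MarkItDown()                    # converter instance
result = md.convert("report.xlsx")   # also handles PDF, DOCX, PPTX, images, audio, HTML, ...
print(result.text_content)           # Markdown text ready for an LLM or RAG pipeline
```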
**LLMs-from-scratch** (rasbt/LLMs-from-scratch) · 90,106 stars · tags: language model, image, Agent, development framework · last commit 2026-04-06

LLMs-from-scratch is an open-source educational project based on PyTorch that walks you through building a ChatGPT-like large language model (LLM) from scratch, step by step. It is the official code repository for the book of the same name and provides a complete hands-on path covering model development, pretraining, and fine-tuning.

The project tackles the "black box" problem in learning about large models: many developers can call ready-made models but cannot explain their internal architecture or training mechanics. By writing every line of the core code yourself, you gain a thorough grasp of key principles such as the Transformer architecture and the attention mechanism, and so truly understand how large models "think". The project also includes code for loading large pretrained weights for fine-tuning, extending the theory into practical use.

LLMs-from-scratch is ideal for AI developers, researchers, and computer-science students who want to go below the API level and dig into how models are built. Its distinctive strength is a step-by-step teaching design: a complex systems project is broken into clear stages with detailed diagrams and examples, so that building a small but fully functional model becomes achievable, whether you want to solidify your theoretical foundations or prepare to develop larger models…

## Efficient-ML/Awesome-Model-Quantization

**Description:** A list of papers, docs, and code about model quantization. The repo aims to provide information for model quantization research and is continuously improved. PRs adding works (papers, repositories) missed by the repo are welcome.

**Summary:** Awesome-Model-Quantization is an open-source resource hub focused on model quantization, offering researchers and developers one-stop academic and engineering support. As large language models (LLMs) and multimodal models keep growing, reducing compute and storage costs while preserving performance has become a key challenge, and model quantization is the core technique for solving it. The repository systematically collects and organizes quantization-related papers, technical documents, benchmarks (such as BiBench and the LLaMA3 and Qwen3 quantization empirical studies), and open-source implementations, from early classics to the latest work in 2026.

Through a continuously updated, categorized catalog, Awesome-Model-Quantization helps users quickly locate work from a particular year or a specific experimental tool, greatly lowering the barrier to literature review and algorithm reproduction. It covers theoretical surveys as well as hands-on benchmark analyses of mainstream models (such as the LLaMA and Qwen families), providing a solid reference for evaluating different quantization strategies. Researchers exploring the frontier of quantization algorithms and engineers deploying large models to edge devices alike can find useful inspiration and practical tools here. The project encourages community collaboration and welcomes submissions of missed works to advance efficient AI together.
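As background for the collection below, the following self-contained sketch illustrates the basic operation that these papers refine in many directions: per-tensor uniform affine (asymmetric) quantization of a weight matrix to int8 with a single scale and zero-point. It is a generic illustration, not code from this repository or from any particular paper; the tensor shape is arbitrary.

```python
# Generic illustration of per-tensor uniform affine int8 quantization (not from this repo).
import torch

def quantize_int8(w: torch.Tensor):
    qmin, qmax = -128, 127
    scale = (w.max() - w.min()) / (qmax - qmin)        # step size of the int8 grid
    zero_point = qmin - torch.round(w.min() / scale)   # integer code that represents 0.0
    q = torch.clamp(torch.round(w / scale) + zero_point, qmin, qmax).to(torch.int8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    return (q.float() - zero_point) * scale

w = torch.randn(4096, 4096)            # stand-in for a weight matrix
q, s, z = quantize_int8(w)
print("mean abs error:", (w - dequantize(q, s, z)).abs().mean().item())
```

Post-training quantization (PTQ) methods choose the scale and zero-point (often per channel or per group) from calibration data to minimize this reconstruction error without retraining, while quantization-aware training (QAT) keeps the rounding inside the training loop so the network can adapt to it.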
# Awesome Model Quantization [![Awesome](https://awesome.re/badge.svg)](https://awesome.re)

This repo collects papers, documents, and code about model quantization for anyone who wants to research it. We are continuously improving the project. Contributions are welcome: please open a PR for any works (papers, repositories) the list misses.

- [Benchmarks](#benchmarks)
- [Survey Papers](#survey-papers)
- [Papers](#papers)
  - [2026](#2026)
  - [2025](#2025)
  - [2024](#2024)
  - [2023](#2023)
  - [2022–2015](#2022-2015)
- [Related Repositories](#related-repositories)

## Benchmarks

**1. BiBench: Benchmarking and Analyzing Network Binarization** [[Paper](https://proceedings.mlr.press/v202/qin23a.html)] [[Code](https://github.com/htqin/BiBench)] [![GitHub stars](https://img.shields.io/github/stars/htqin/BiBench?style=social)](https://github.com/htqin/BiBench)

**Venue:** ICML 2023

**Authors:** Haotong Qin, Mingyuan Zhang, Yifu Ding, Aoyu Li, Zhongang Cai, Ziwei Liu, Fisher Yu, Xianglong Liu.

![survey](https://oss.gittoolsai.com/images/Efficient-ML_Awesome-Model-Quantization_readme_05e461b390a9.png)

<details><summary>Bibtex</summary><pre><code>@inproceedings{qin2023bibench,
  title={BiBench: Benchmarking and Analyzing Network Binarization},
  author={Qin, Haotong and Zhang, Mingyuan and Ding, Yifu and Li, Aoyu and Cai, Zhongang and Liu, Ziwei and Yu, Fisher and Liu, Xianglong},
  booktitle={International Conference on Machine Learning (ICML)},
  year={2023}
}</code></pre></details>

**2. An empirical study of LLaMA3 quantization: from LLMs to MLLMs** [[Paper](https://link.springer.com/article/10.1007/s44267-024-00070-x)] [[Code](https://github.com/Macaronlin/LLaMA3-Quantization)] [![GitHub stars](https://img.shields.io/github/stars/Macaronlin/LLaMA3-Quantization?style=social)](https://github.com/Macaronlin/LLaMA3-Quantization)

**Venue:** Visual Intelligence 2024

**Authors:** Wei Huang, Xingyu Zheng, Xudong Ma, Haotong Qin, Chengtao Lv, Hong Chen, Jie Luo, Xiaojuan Qi, Xianglong Liu, Michele Magno.

![LLaMA3 Quantization Benchmark](https://oss.gittoolsai.com/images/Efficient-ML_Awesome-Model-Quantization_readme_901b7e5ec9b6.png)

<details><summary>Bibtex</summary><pre><code>@article{huang2024empirical,
  title={An empirical study of llama3 quantization: From llms to mllms},
  author={Huang, Wei and Zheng, Xingyu and Ma, Xudong and Qin, Haotong and Lv, Chengtao and Chen, Hong and Luo, Jie and Qi, Xiaojuan and Liu, Xianglong and Magno, Michele},
  journal={Visual Intelligence},
  volume={2},
  number={1},
  pages={36},
  year={2024},
  publisher={Springer}
}</code></pre></details>

**3. An Empirical Study of Qwen3 Quantization** [[Paper](https://arxiv.org/abs/2505.02214)] [[Code](https://github.com/Efficient-ML/Qwen3-Quantization)] [![GitHub stars](https://img.shields.io/github/stars/Efficient-ML/Qwen3-Quantization?style=social)](https://github.com/Efficient-ML/Qwen3-Quantization)

**Venue:** Visual Intelligence 2026

**Authors:** Xingyu Zheng, Yuye Li, Haoran Chu, Yue Feng, Xudong Ma, Jie Luo, Jinyang Guo, Haotong Qin, Michele Magno, Xianglong Liu.

![qwen3](https://oss.gittoolsai.com/images/Efficient-ML_Awesome-Model-Quantization_readme_ac5c15d3b2df.png)

<details><summary>Bibtex</summary><pre><code>@article{zheng2025empirical,
  title={An empirical study of qwen3 quantization},
  author={Zheng, Xingyu and Li, Yuye and Chu, Haoran and Feng, Yue and Ma, Xudong and Luo, Jie and Guo, Jinyang and Qin, Haotong and Magno, Michele and Liu, Xianglong},
  journal={arXiv preprint arXiv:2505.02214},
  year={2025}
}</code></pre></details>

**4. LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit** [[Paper](https://aclanthology.org/2024.emnlp-industry.12/)] [[Code](https://github.com/ModelTC/LightCompress)] [![GitHub stars](https://img.shields.io/github/stars/ModelTC/LightCompress?style=social)](https://github.com/ModelTC/LightCompress)

**Venue:** EMNLP 2024 Industry Track

**Authors:** Ruihao Gong, Yang Yong, Shiqiao Gu, Yushi Huang, Chengtao Lv, Yunchen Zhang, Xianglong Liu, Dacheng Tao.

![llmc](https://oss.gittoolsai.com/images/Efficient-ML_Awesome-Model-Quantization_readme_19bda43d7430.png)

<details><summary>Bibtex</summary><pre><code>@inproceedings{gong2024llmc,
  title={Llmc: Benchmarking large language model quantization with a versatile compression toolkit},
  author={Gong, Ruihao and Yong, Yang and Gu, Shiqiao and Huang, Yushi and Lv, Chengtao and Zhang, Yunchen and Tao, Dacheng and Liu, Xianglong},
  booktitle={Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: Industry Track},
  pages={132--152},
  year={2024}
}</code></pre></details>

**5. RobustMQ: Benchmarking Robustness of Quantized Models** [[Paper](https://link.springer.com/article/10.1007/s44267-023-00031-w)]

**Venue:** Visual Intelligence 2023

**Authors:** Yisong Xiao, Aishan Liu, Tianyuan Zhang, Haotong Qin, Jinyang Guo, Xianglong Liu.

![robustmq](https://oss.gittoolsai.com/images/Efficient-ML_Awesome-Model-Quantization_readme_59c0d2511cab.png)

<details><summary>Bibtex</summary><pre><code>@article{xiao2023robustmq,
  title={Robustmq: benchmarking robustness of quantized models},
  author={Xiao, Yisong and Liu, Aishan and Zhang, Tianyuan and Qin, Haotong and Guo, Jinyang and Liu, Xianglong},
  journal={Visual Intelligence},
  volume={1},
  number={1},
  pages={30},
  year={2023},
  publisher={Springer}
}</code></pre></details>

## Survey Papers

**1. Binary Neural Networks: A Survey** [[Paper](https://www.sciencedirect.com/science/article/abs/pii/S0031320320300856)] [[Blog](https://mp.weixin.qq.com/s/QGva6fow9tad_daZ_G2p0Q)]

**Venue:** Pattern Recognition 2020

**Authors:** Haotong Qin, Ruihao Gong, Xianglong Liu, Xiao Bai, Jingkuan Song, Nicu Sebe.

![survey](https://oss.gittoolsai.com/images/Efficient-ML_Awesome-Model-Quantization_readme_4e2933bc83c0.png)

<details><summary>Bibtex</summary><pre><code>@article{Qin:pr20_bnn_survey,
    title = "Binary neural networks: A survey",
    author = "Haotong Qin and Ruihao Gong and Xianglong Liu and Xiao Bai and Jingkuan Song and Nicu Sebe",
    journal = "Pattern Recognition",
    volume = "105",
    pages = "107281",
    year = "2020"
}</code></pre></details>

**2. A Survey of Low-bit Large Language Models: Basics, Systems, and Algorithms** [[Paper](https://www.sciencedirect.com/science/article/pii/S0893608025007361)]

**Venue:** Neural Networks 2025

**Authors:** Ruihao Gong, Yifu Ding, Zining Wang, Chengtao Lv, Xingyu Zheng, Jinyang Du, Yang Yong, Shiqiao Gu, Haotong Qin, Jinyang Guo, Dahua Lin, Michele Magno, Xianglong Liu.

![A Survey of Low-bit Large Language Models](https://oss.gittoolsai.com/images/Efficient-ML_Awesome-Model-Quantization_readme_2117f5830dc5.png)

<details><summary>Bibtex</summary><pre><code>@article{gong2025survey,
  title={A survey of low-bit large language models: Basics, systems, and algorithms},
  author={Gong, Ruihao and Ding, Yifu and Wang, Zining and Lv, Chengtao and Zheng, Xingyu and Du, Jinyang and Yong, Yang and Gu, Shiqiao and Qin, Haotong and Guo, Jinyang and others},
  journal={Neural networks},
  pages={107856},
  year={2025},
  publisher={Elsevier}
}</code></pre></details>

**3. Low-bit Model Quantization for Deep Neural Networks: A Survey** [[Paper](https://arxiv.org/abs/2505.05530)]

**Venue:** arXiv 2025

**Authors:** Kai Liu, Qian Zheng, Kaiwen Tao, Zhiteng Li, Haotong Qin, Wenbo Li, Yong Guo, Xianglong Liu, Linghe Kong, Guihai Chen, Yulun Zhang, Xiaokang Yang.

![quant-survey](https://oss.gittoolsai.com/images/Efficient-ML_Awesome-Model-Quantization_readme_fb1df41ea7ae.png)

<details><summary>Bibtex</summary><pre><code>@article{liu2025low,
  title={Low-bit model quantization for deep neural networks: A survey},
  author={Liu, Kai and Zheng, Qian and Tao, Kaiwen and Li, Zhiteng and Qin, Haotong and Li, Wenbo and Guo, Yong and Liu, Xianglong and Kong, Linghe and Chen, Guihai and others},
  journal={arXiv preprint arXiv:2505.05530},
  year={2025}
}</code></pre></details>

## Papers

### 2026

- [[ICLR](https://openreview.net/forum?id=7QZanjCD6M)] PT²-LLM: Post-Training Ternarization for Large Language Models [[code](https://github.com/XIANGLONGYAN/PT2-LLM)] [![GitHub stars](https://img.shields.io/github/stars/XIANGLONGYAN/PT2-LLM?style=social)](https://github.com/XIANGLONGYAN/PT2-LLM)
- [[ICLR](https://openreview.net/forum?id=HD7tuVakmR)] Quant-dLLM: Post-Training Extreme Low-Bit Quantization for Diffusion Large Language Models
- [[ICLR](https://openreview.net/forum?id=3AnRMvlVDw)] DVD-Quant: Data-free Video Diffusion Transformers Quantization
- [[ICLR](https://openreview.net/forum?id=AH7hbA7Zkk)] Q&C: When Quantization Meets Cache in Efficient Generation
- [[CVPR Findings](https://arxiv.org/abs/2503.21970)] Q-MambaIR: Accurate Quantized Mamba for Efficient Image Restoration
- [[ICLR](https://arxiv.org/abs/2509.21302)] Quantized Visual Geometry Grounded Transformer
- [[ICLR](https://openreview.net/forum?id=XAXT7A8EWh)] Post-Training Quantization for Video Matting
- [[ICLR](https://openreview.net/forum?id=XJXZXuTj11)] QVGen: Pushing the Limit of Quantized Video Generative Models
- [[ICLR](https://openreview.net/forum?id=4TAG3aQljJ)] QuantSparse: Comprehensively Compressing Video Diffusion Transformer with Model Quantization and Attention Sparsification [[code](https://github.com/wlfeng0509/QuantSparse)] [![GitHub stars](https://img.shields.io/github/stars/wlfeng0509/QuantSparse?style=social)](https://github.com/wlfeng0509/QuantSparse)
- [[AAAI](https://ojs.aaai.org/index.php/AAAI/article/view/40123)] First-Order Error Matters: Accurate Compensation for Quantized Large Language Models [[code](https://github.com/Xingyu-Zheng/FOEM)] [![GitHub stars](https://img.shields.io/github/stars/Xingyu-Zheng/FOEM?style=social)](https://github.com/Xingyu-Zheng/FOEM)
- [[AAAI](https://arxiv.org/abs/2503.06564)] TR-DQ: Time-Rotation Diffusion Quantization
- [[ICLR](https://openreview.net/forum?id=tO3ASKZlok)] TurboQuant: Online Vector Quantization with Near-optimal Distortion Rate
- [[ICLR](https://openreview.net/forum?id=VQIvBpL5ag)] Optimal Brain Restoration for Joint Quantization and Sparsification of LLMs [[code](https://github.com/csguoh/OBR)] [![GitHub stars](https://img.shields.io/github/stars/csguoh/OBR?style=social)](https://github.com/csguoh/OBR)
- [[ICLR](https://openreview.net/forum?id=XPIEkFdEDi)] AnyBCQ: Hardware Efficient Flexible Binary-Coded Quantization for Multi-Precision LLMs [[code](https://github.com/naver-aics/anybcq)] [![GitHub stars](https://img.shields.io/github/stars/naver-aics/anybcq?style=social)](https://github.com/naver-aics/anybcq)
- [[ICLR](https://openreview.net/forum?id=9CZzD5LWdy)] Tequila: Deadzone-free Ternary Quantization for Large Language Models
- [[ICLR](https://openreview.net/forum?id=V85HbymBLW)] LogART: Pushing the Limit of Efficient Logarithmic Post-Training Quantization [[code](https://github.com/logart-lab/logart)] [![GitHub stars](https://img.shields.io/github/stars/logart-lab/logart?style=social)](https://github.com/logart-lab/logart)
- [[ICLR](https://openreview.net/forum?id=1USeVjsKau)] ParoQuant: Pairwise Rotation Quantization for Efficient Reasoning LLM Inference [[code](https://github.com/z-lab/paroquant)] [![GitHub stars](https://img.shields.io/github/stars/z-lab/paroquant?style=social)](https://github.com/z-lab/paroquant)
- [[ICLR](https://openreview.net/forum?id=VpZ8YYdBmT)] Improving Block-Wise LLM Quantization by 4-bit Generalized Normal Float Formats
- [[arXiv](https://arxiv.org/abs/2602.16018)] D²Quant: Accurate Low-bit Post-Training Weight Quantization for LLMs
- [[arXiv](https://arxiv.org/abs/2601.03170)] QuantLRM: Quantization of Large Reasoning Models via Fine-Tuning Signals
- [[arXiv](https://arxiv.org/abs/2602.15391)] SliderQuant: Accurate Post-Training Quantization for LLMs
- [[arXiv](https://arxiv.org/abs/2602.04719)] What Makes Low-Bit Quantization-Aware Training Work for Reasoning LLMs? A Systematic Study
- [[ICLR](https://openreview.net/forum?id=yjr2jX41qO)] Channel-Aware Mixed-Precision Quantization for Efficient Long-Context Inference
- [[ICLR](https://openreview.net/forum?id=ATpchFiBQi)] CodeQuant: Unified Clustering and Quantization for Enhanced Outlier Smoothing in Low-Precision Mixture-of-Experts
- [[ICLR](https://arxiv.org/abs/2510.11696)] QeRL: Beyond Efficiency - Quantization-enhanced Reinforcement Learning for LLMs [[code](https://github.com/NVlabs/QeRL)] [![GitHub stars](https://img.shields.io/github/stars/NVlabs/QeRL?style=social)](https://github.com/NVlabs/QeRL)
- [[ICLR](https://arxiv.org/abs/2602.03782)] AutoQVLA: Not All Channels Are Equal in Vision-Language-Action Model's Quantization
- [[ICLR](https://openreview.net/forum?id=g2l9bg9DWx)] Achieving low-bit Muon through subspace preservation and grid quantization
- [[ICLR](https://openreview.net/forum?id=DAZvMAlZRp)] Shift-and-Sum Quantization for Visual Autoregressive Models
- [[ICLR](https://arxiv.org/abs/2602.03472)] Inlier-Centric Post-Training Quantization for Object Detection Models
- [[ICLR](https://openreview.net/forum?id=yiMlVBAoQi)] Efficient Quantization of Mixture-of-Experts with Theoretical Generalization Guarantees
- [[ICLR](https://openreview.net/forum?id=tY9yPAT3PU)] BBQ: Boosting Quantization Entropy with Bell Box Quantization
- [[ICLR](https://arxiv.org/abs/2505.06653)] Improving Block-Wise LLM Quantization by 4-bit Block-Wise Optimal Float (BOF4): Analysis and Variations [[code](https://github.com/ifnspaml/bof4)] [![GitHub stars](https://img.shields.io/github/stars/ifnspaml/bof4?style=social)](https://github.com/ifnspaml/bof4)
- [[ICLR](https://arxiv.org/abs/2510.18259)] Learning under Quantization for High-Dimensional Linear Regression
- [[ICLR](https://arxiv.org/abs/2509.25214)] On-the-Fly Adaptation to Quantization: Configuration-Aware LoRA for Efficient Fine-Tuning of Quantized LLMs
- [[ICLR](https://arxiv.org/abs/2509.23202)] Bridging the Gap Between Promise and Performance for FP4 Quantization [[code](https://github.com/IST-DASLab/FP-Quant)] [![GitHub stars](https://img.shields.io/github/stars/IST-DASLab/FP-Quant?style=social)](https://github.com/IST-DASLab/FP-Quant)
- [[ICLR](https://arxiv.org/abs/2602.11184)] KBVQ-MoE: KLT-guided SVD with Bias-Corrected Vector Quantization for MoE Large Language Models [[code](https://github.com/xuzukang/kbvq_moe)] [![GitHub stars](https://img.shields.io/github/stars/xuzukang/kbvq_moe?style=social)](https://github.com/xuzukang/kbvq_moe)
- [[ICLR](https://arxiv.org/abs/2512.03383)] UniQL: Unified Quantization and Low-rank Compression for Adaptive Edge LLMs [[code](https://github.com/enyac-group/UniQL)] [![GitHub stars](https://img.shields.io/github/stars/enyac-group/UniQL?style=social)](https://github.com/enyac-group/UniQL)
- [[ICLR](https://arxiv.org/abs/2508.01077)] The Lattice Geometry of Neural Network Quantization: A Short Equivalence Proof of GPTQ and Babai's algorithm
- [[ICLR](https://arxiv.org/abs/2509.03472)] DPQuant: Efficient and Private Model Training via Dynamic Quantization Scheduling
- [[ICLR](https://openreview.net/pdf/ee0ea14cd2283b1fee1902a6811796b443849c5c.pdf)] Towards Quantization-Aware Training for Ultra-Low-Bit Reasoning LLMs
- [[ICLR](https://arxiv.org/abs/2510.21314)] A Convergence Analysis of Adaptive Optimizers under Floating-point Quantization
- [[ICLR](https://arxiv.org/abs/2510.06213)] Training Dynamics Impact Post-Training Quantization Robustness [[code](https://github.com/aldakata/TrainingDynamicsQuantizationRobustness)] [![GitHub stars](https://img.shields.io/github/stars/aldakata/TrainingDynamicsQuantizationRobustness?style=social)](https://github.com/aldakata/TrainingDynamicsQuantizationRobustness)
- [[ICLR](https://openreview.net/forum?id=pjMDZJd4rT)] SSDi8: Accurate and Efficient 8-bit Quantization for State Space Duality
- [[ICLR](https://arxiv.org/abs/2507.18553)] The Geometry of LLM Quantization: GPTQ as Babai's Nearest Plane Algorithm
- [[ICLR](https://arxiv.org/abs/2601.21238)] PTQ4ARVG: Post-Training Quantization for AutoRegressive Visual Generation Models [[code](https://github.com/BienLuky/PTQ4ARVG)] [![GitHub stars](https://img.shields.io/github/stars/BienLuky/PTQ4ARVG?style=social)](https://github.com/BienLuky/PTQ4ARVG)
- [[ICLR](https://arxiv.org/abs/2509.17428)] QWHA: Quantization-Aware Walsh-Hadamard Adaptation for Parameter-Efficient Fine-Tuning on Large Language Models [[code](https://github.com/vantaa89/qwha)] [![GitHub stars](https://img.shields.io/github/stars/vantaa89/qwha?style=social)](https://github.com/vantaa89/qwha)
- [[ICLR](https://arxiv.org/abs/2602.01289)] Gradient-Aligned Calibration for Post-Training Quantization of Diffusion Models
- [[ICLR](https://openreview.net/forum?id=nFjj8NEBqv)] SERQ: Saliency-Aware Low-Rank Error Reconstruction for LLM Quantization
- [[ICLR](https://arxiv.org/abs/2509.22935)] Compute-Optimal Quantization-Aware Training
- [[ICLR](https://arxiv.org/abs/2505.18610)] PM-KVQ: Progressive Mixed-precision KV Cache Quantization for Long-CoT LLMs [[code](https://github.com/thu-nics/PM-KVQ)] [![GitHub stars](https://img.shields.io/github/stars/thu-nics/PM-KVQ?style=social)](https://github.com/thu-nics/PM-KVQ)
- [[ICLR](https://arxiv.org/abs/2509.23500)] Beyond Outliers: A Study of Optimizers Under Quantization
- [[ICLR](https://arxiv.org/abs/2505.11695)] Qronos: Correcting the Past by Shaping the Future... in Post-Training Quantization
- [[ICLR](https://arxiv.org/abs/2508.02343)] MicroMix: Efficient Mixed-Precision Quantization with Microscaling Formats for Large Language Models [[code](https://github.com/lwy2020/MicroMix)] [![GitHub stars](https://img.shields.io/github/stars/lwy2020/MicroMix?style=social)](https://github.com/lwy2020/MicroMix)
- [[ICLR](https://arxiv.org/abs/2602.04929)] TurboBoA: Faster and Exact Attention-aware Quantization without Backpropagation
- [[ICLR](https://openreview.net/forum?id=FDdOD3qwS7)] Beyond Uniformity: Sample and Frequency Meta Weighting for Post-Training Quantization of Diffusion Models
- [[ICLR](https://openreview.net/forum?id=LWYZ1nNkJl)] Rethinking Residual Errors in Compensation-based LLM Quantization
- [[ICLR](https://openreview.net/forum?id=8tDIzHFOx6)] SPR²Q: Static Priority-based Rectifier Routing Quantization for Image Super-Resolution [[code](https://github.com/momo5-a11/SPR2Q)] [![GitHub stars](https://img.shields.io/github/stars/momo5-a11/SPR2Q?style=social)](https://github.com/momo5-a11/SPR2Q)
- [[ICLR](https://arxiv.org/abs/2510.26771)] STaMP: Sequence Transformation and Mixed Precision for Low-Precision Activation Quantization

### 2025

- [[ICML](https://icml.cc/virtual/2025/poster/45429)] Q-VDiT: Towards Accurate Quantization and Distillation of Video-Generation Diffusion Transformers [[code](https://github.com/cantbebetter2/Q-VDiT)] [![GitHub stars](https://img.shields.io/github/stars/cantbebetter2/Q-VDiT?style=social)](https://github.com/cantbebetter2/Q-VDiT)
- [[AAAI](https://ojs.aaai.org/index.php/AAAI/article/view/33823)] MPQ-DM: Mixed Precision Quantization for Extremely Low Bit Diffusion Models
- [[ICML](https://icml.cc/virtual/2025/poster/45388)] SliM-LLM: Salience-Driven Mixed-Precision Quantization for Large Language Models [[code](https://github.com/Aaronhuang-778/SliM-LLM)] [![GitHub stars](https://img.shields.io/github/stars/Aaronhuang-778/SliM-LLM?style=social)](https://github.com/Aaronhuang-778/SliM-LLM)
- [[TPAMI](https://www.computer.org/csdl/journal/tp/2025/10/11060852/281Hxm5TK2Q)] BiVM: Accurate Binarized Neural Network for Efficient Video Matting
- [[NeurIPS](https://openreview.net/forum?id=e8pm93koQU)] S²Q-VDiT: Accurate Quantized Video Diffusion Transformer with Salient Data and Sparse Token Distillation [[code](https://github.com/wlfeng0509/S2Q-VDiT)] [![GitHub stars](https://img.shields.io/github/stars/wlfeng0509/S2Q-VDiT?style=social)](https://github.com/wlfeng0509/S2Q-VDiT)
- [[CVPR](https://openaccess.thecvf.com/content/CVPR2025/papers/Zhu_PassionSR_Post-Training_Quantization_with_Adaptive_Scale_in_One-Step_Diffusion_based_CVPR_2025_paper.pdf)] PassionSR: Post-Training Quantization with Adaptive Scale in One-Step Diffusion based Image Super-Resolution [[code](https://github.com/libozhu03/PassionSR)] [![GitHub stars](https://img.shields.io/github/stars/libozhu03/PassionSR?style=social)](https://github.com/libozhu03/PassionSR)
- [[ICLR](https://openreview.net/forum?id=ZU8OdDLTts)] ARB-LLM: Alternating Refined Binarizations for Large Language Models [[code](https://github.com/ZHITENGLI/ARB-LLM)] [![GitHub stars](https://img.shields.io/github/stars/ZHITENGLI/ARB-LLM?style=social)](https://github.com/ZHITENGLI/ARB-LLM)
- [[ICLR](https://openreview.net/forum?id=cCE46s1obO)] BinaryDM: Accurate Weight Binarization for Efficient Diffusion Models [[code](https://github.com/Xingyu-Zheng/BinaryDM)] [![GitHub stars](https://img.shields.io/github/stars/Xingyu-Zheng/BinaryDM?style=social)](https://github.com/Xingyu-Zheng/BinaryDM)
- [[ICML](https://proceedings.mlr.press/v267/sun25l.html)] FlatQuant: Flatness Matters for LLM Quantization [[code](https://github.com/ruikangliu/FlatQuant)] [![GitHub stars](https://img.shields.io/github/stars/ruikangliu/FlatQuant?style=social)](https://github.com/ruikangliu/FlatQuant)
- [[ICML](https://icml.cc/virtual/2025/poster/44438)] RoSTE: An Efficient Quantization-Aware Supervised Fine-Tuning Approach for Large Language Models [[code](https://github.com/OptimAI-Lab/RoSTE)] [![GitHub stars](https://img.shields.io/github/stars/OptimAI-Lab/RoSTE?style=social)](https://github.com/OptimAI-Lab/RoSTE)
- [[ICML](https://icml.cc/virtual/2025/poster/43984)] GANQ: GPU-Adaptive Non-Uniform Quantization for Large Language Models
- [[ICML](https://icml.cc/virtual/2025/poster/43551)] Modulated Diffusion: Accelerating Generative Modeling with Modulated Quantization [[code](https://github.com/WeizhiGao/MoDiff)] [![GitHub stars](https://img.shields.io/github/stars/WeizhiGao/MoDiff?style=social)](https://github.com/WeizhiGao/MoDiff)
- [[NeurIPS](https://neurips.cc/virtual/2025/poster/118539)] DartQuant: Efficient Rotational Distribution Calibration for LLM Quantization [[code](https://github.com/CAS-CLab/DartQuant)] [![GitHub stars](https://img.shields.io/github/stars/CAS-CLab/DartQuant?style=social)](https://github.com/CAS-CLab/DartQuant)
- [[AAAI](https://ojs.aaai.org/index.php/AAAI/article/view/35415)] JAQ: Joint Efficient Architecture Design and Low-Bit Quantization
- [[AAAI](https://ojs.aaai.org/index.php/AAAI/article/view/33807/35962)] OAC: Output-adaptive Calibration for Accurate Post-Training Quantization of LLMs
- [[AAAI](https://ojs.aaai.org/index.php/AAAI/article/view/34039)] Optimizing Quantized Diffusion Models via Distillation with Decay Timestep-Aware Loss
- [[AAAI](https://ojs.aaai.org/index.php/AAAI/article/view/32658/40071)] Quantifiable Quantization Sensitivity of Diffusion Models
- [[AAAI](https://ojs.aaai.org/index.php/AAAI/article/view/33913/36068)] TCAQ-DM: Timestep-Channel Adaptive Quantization for Diffusion Models
- [[ACL](https://aclanthology.org/2025.acl-long.498/)] EfficientQAT: Efficient Quantization-Aware Training for Large Language Models [[code](https://github.com/OpenGVLab/EfficientQAT)] [![GitHub stars](https://img.shields.io/github/stars/OpenGVLab/EfficientQAT?style=social)](https://github.com/OpenGVLab/EfficientQAT)
- [[ACL](https://aclanthology.org/2025.acl-long.99/)] L4Q: Parameter Efficient Quantization-Aware Fine-Tuning on Large Language Models
- [[ACL](https://aclanthology.org/2025.acl-long.531/)] MoQAE: Mixed-Precision Quantization for Long-Context LLM Inference via Mixture of Quantization-Aware Experts
- [[ACL](https://aclanthology.org/2025.acl-long.618/)] Outlier-Safe Pre-Training for Robust 4-Bit Quantization of Large Language Models
- [[ACL](https://aclanthology.org/2025.acl-long.225/)] PTQ1.61: Push the Real Limit of Extremely Low-Bit Post-Training Quantization Methods for Large Language Models [[code](https://github.com/zjq0455/PTQ1.61)] [![GitHub stars](https://img.shields.io/github/stars/zjq0455/PTQ1.61?style=social)](https://github.com/zjq0455/PTQ1.61)
- [[ACL](https://aclanthology.org/2025.acl-long.1382/)] Unifying Uniform and Binary-coding Quantization for Accurate Compression of Large Language Models
- [[ACL](https://aclanthology.org/2025.acl-long.1304/)] “Give Me BF16 or Give Me Death”? Accuracy-Performance Trade-Offs in LLM Quantization
- [[ACM MM](https://acmmm2025.org/accepted-regular-papers/)] DilateQuant: Accurate and Efficient Quantization-Aware Training for Diffusion Models via Weight Dilation
- [[ACM MM](https://dl.acm.org/doi/10.1145/3744239)] Learning Binarized Representations with Pseudo-positive Distillation
- [[ACM MM](https://dl.acm.org/doi/10.1145/3746027.3755433)] MQuant: Unleashing the Inference Potential of Multimodal Large Language Models with Post-Training Quantization
- [[ACM MM](https://dl.acm.org/doi/10.1145/3746027.3755213)] Pushing the Limit of Binarized Neural Network for Image Super Resolution with Smooth Information Transmission
- [[ACM MM](https://acmmm2025.org/accepted-regular-papers/)] Quantization Meets OOD: Generalizable Quantization-aware Training from a Flatness Perspective
- [[EMNLP](https://aclanthology.org/2025.emnlp-main.1799/)] AMQ: Enabling AutoML for Mixed-precision Weight-Only Quantization of Large Language Models
- [[EMNLP](https://aclanthology.org/2025.emnlp-main.479/)] Does quantization affect models' performance on long-input and long-output tasks?
- [[ICLR](https://iclr.cc/virtual/2025/poster/28924)] CBQ: Cross-Block Quantization for Large Language Models
- [[ICLR](https://iclr.cc/virtual/2025/poster/29192)] DGQ: Distribution-Aware Group Quantization for Text-to-Image Diffusion Models
- [[ICLR](https://iclr.cc/virtual/2025/poster/30168)] LeanQuant: Accurate and Scalable Large Language Model Quantization with Loss-error-aware Grid
- [[ICLR](https://openreview.net/forum?id=rAcgDBdKnP)] OSTQuant: Refining Large Language Model Quantization with Orthogonal and Scaling Transformations for Better Distribution Fitting [[code](https://github.com/BrotherHappy/OSTQuant)] [![GitHub stars](https://img.shields.io/github/stars/BrotherHappy/OSTQuant?style=social)](https://github.com/BrotherHappy/OSTQuant)
- [[ICLR](https://openreview.net/forum?id=LB5cKhgOTu)] QERA: an Analytical Framework for Quantization Error Reconstruction [[code](https://github.com/ChengZhang-98/QERA)] [![GitHub stars](https://img.shields.io/github/stars/ChengZhang-98/QERA?style=social)](https://github.com/ChengZhang-98/QERA)
- [[ICLR](https://iclr.cc/virtual/2025/poster/28338)] SpinQuant: LLM Quantization with Learned Rotations
- [[ICLR](https://iclr.cc/virtual/2025/poster/27906)] SVDQuant: Absorbing Outliers by Low-Rank Component for 4-Bit Diffusion Models
- [[ICLR](https://iclr.cc/virtual/2025/poster/30429)] ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation
- [[ICML](https://openreview.net/forum?id=ZawsPjlIGu&noteId=x0z6YCJM6S)] GuidedQuant: Large Language Model Quantization via Exploiting End Loss Guidance [[code](https://github.com/snu-mllab/GuidedQuant)] [![GitHub stars](https://img.shields.io/github/stars/snu-mllab/GuidedQuant?style=social)](https://github.com/snu-mllab/GuidedQuant)
- [[ICML](https://openreview.net/forum?id=4qIP1sXcR1)] ResQ: Mixed-Precision Quantization of Large Language Models with Low-Rank Residuals [[code](https://github.com/utkarsh-dmx/project-resq)] [![GitHub stars](https://img.shields.io/github/stars/utkarsh-dmx/project-resq?style=social)](https://github.com/utkarsh-dmx/project-resq)
- [[NeurIPS](https://neurips.cc/virtual/2025/poster/117148)] A Double Normalization Approach for Calibration-Free Low-Bit KV Cache Quantization
- [[NeurIPS](https://neurips.cc/virtual/2025/poster/119877)] Binary Quadratic Quantization: Beyond First-Order Quantization for Real-Valued Matrix Compression
- [[NeurIPS](https://neurips.cc/virtual/2025/poster/117396)] Learning Grouped Lattice Vector Quantizers for Low-Bit Large Language Models
- [[NeurIPS](https://neurips.cc/virtual/2025/poster/115061)] LittleBit: Ultra Low-Bit Quantization via Latent Factorization
- [[NeurIPS](https://neurips.cc/virtual/2025/poster/118224)] ParetoQ: Improving Scaling Laws in Extremely Low-bit LLM Quantization
- [[NeurIPS](https://neurips.cc/virtual/2025/poster/116315)] Q-Palette: Fractional-Bit Quantizers Toward Optimal Weight-Only Post-Training Quantization
- [[NeurIPS](https://neurips.cc/virtual/2025/poster/120052)] Wavelet-Enhanced High-Fidelity 1-Bit Quantization for LLMs
- [[ACL Findings](https://aclanthology.org/2025.findings-acl.459/)] Achieving Binary Weight and Activation for LLMs using Post-Training Quantization
- [[EMNLP Findings](https://aclanthology.org/2025.findings-emnlp.943/)] KurTail: Kurtosis-based LLM Quantization
- [[SIGMOD](https://dl.acm.org/doi/10.1145/3725413)] Practical and Asymptotically Optimal Quantization of High-Dimensional Vectors in Euclidean Space for Approximate Nearest Neighbor Search [[code](https://github.com/VectorDB-NTU/Extended-RaBitQ)] [![GitHub stars](https://img.shields.io/github/stars/VectorDB-NTU/Extended-RaBitQ?style=social)](https://github.com/VectorDB-NTU/Extended-RaBitQ)
- [[NeurIPS](https://neurips.cc/virtual/2025/poster/119764)] QBasicVSR: Temporal Awareness Adaptation Quantization for Video Super-Resolution
- [[NeurIPS](https://arxiv.org/abs/2504.09629)] Quantization Error Propagation: Revisiting Layer-Wise Post-Training Quantization
- [[NeurIPS](https://neurips.cc/virtual/2025/poster/115665)] Point4Bit: Post Training 4-bit Quantization for Point Cloud 3D Detection
- [[NeurIPS](https://arxiv.org/abs/2505.12266)] PMQ-VE: Progressive Multi-Frame Quantization for Video Enhancement [[code](https://github.com/xiaoBIGfeng/PMQ-VE)] [![GitHub stars](https://img.shields.io/github/stars/xiaoBIGfeng/PMQ-VE?style=social)](https://github.com/xiaoBIGfeng/PMQ-VE)
- [[NeurIPS](https://neurips.cc/virtual/2025/poster/115090)] VETA-DiT: Variance-Equalized and Temporally Adaptive Quantization for Efficient 4-bit Diffusion Transformers
- [[NeurIPS](https://arxiv.org/abs/2505.18724)] LoTA-QAF: Lossless Ternary Adaptation for Quantization-Aware Fine-Tuning [[code](https://github.com/KingdalfGoodman/LoTA-QAF/blob/main/README.md)] [![GitHub stars](https://img.shields.io/github/stars/KingdalfGoodman/LoTA-QAF?style=social)](https://github.com/KingdalfGoodman/LoTA-QAF)
- [[NeurIPS](https://neurips.cc/virtual/2025/poster/117708)] Efficient Multi-bit Quantization Network Training via Weight Bias Correction and Bit-wise Coreset Sampling
- [[NeurIPS](https://neurips.cc/virtual/2025/poster/119554)] Efficient and Generalizable Mixed-Precision Quantization via Topological Entropy
- [[NeurIPS](https://neurips.cc/virtual/2025/poster/119301)] QSCA: Quantization with Self-Compensating Auxiliary for Monocular Depth Estimation
- [[ICCV](https://arxiv.org/abs/2404.19248)] Scheduling Weight Transitions for Quantization-Aware Training [[code](https://github.com/cvlab-yonsei/TRS)] [![GitHub stars](https://img.shields.io/github/stars/cvlab-yonsei/TRS?style=social)](https://github.com/cvlab-yonsei/TRS)
- [[ICCV](https://arxiv.org/abs/2507.16782)] Task-Specific Zero-shot Quantization-Aware Training for Object Detection [[code](https://github.com/DFQ-Dojo/dfq-toolkit)] [![GitHub stars](https://img.shields.io/github/stars/DFQ-Dojo/dfq-toolkit?style=social)](https://github.com/DFQ-Dojo/dfq-toolkit)
- [[ICCV](https://arxiv.org/abs/2503.10959)] OuroMamba: A Data-Free Quantization Framework for Vision Mamba
- [[ICCV](https://arxiv.org/abs/2506.23516)] FedWSQ: Efficient Federated Learning with Weight Standardization and Distribution-Aware Non-Uniform Quantization [[code](https://github.com/Seongyeol-kim/FedWSQ)] [![GitHub stars](https://img.shields.io/github/stars/Seongyeol-kim/FedWSQ?style=social)](https://github.com/Seongyeol-kim/FedWSQ)
- [[ICCV](https://arxiv.org/abs/2412.16553)] Semantic Alignment and Reinforcement for Data-Free Quantization of Vision Transformers [[code](https://github.com/zysxmu/SARDFQ)] [![GitHub stars](https://img.shields.io/github/stars/zysxmu/SARDFQ?style=social)](https://github.com/zysxmu/SARDFQ)
- [[ICCV](https://arxiv.org/abs/2503.06545)] QuantCache: Adaptive Importance-Guided Quantization with Hierarchical Latent and Layer Caching for Video Generation [[code](https://github.com/JunyiWuCode/QuantCache)] [![GitHub stars](https://img.shields.io/github/stars/JunyiWuCode/QuantCache?style=social)](https://github.com/JunyiWuCode/QuantCache)
- [[ICCV](https://arxiv.org/abs/2507.19131)] MixA-Q: Revisiting Activation Sparsity for Vision Transformers from a Mixed-Precision Quantization Perspective
- [[ICCV](https://arxiv.org/abs/2507.12933)] DMQ: Dissecting Outliers of Diffusion Models for Post-Training Quantization [[code](https://github.com/LeeDongYeun/dmq)] [![GitHub stars](https://img.shields.io/github/stars/LeeDongYeun/dmq?style=social)](https://github.com/LeeDongYeun/dmq)
- [[ICCV](https://arxiv.org/abs/2503.03088)] AHCPTQ: Accurate and Hardware-Compatible Post-Training Quantization for Segment Anything Model
- [[ICCV](https://arxiv.org/abs/2507.22349)] MSQ: Memory-Efficient Bit Sparsification Quantization
- [[ICCV](https://arxiv.org/abs/2402.03666)] QuEST: Low-bit Diffusion Model Quantization via Efficient Selective Finetuning [[code](https://github.com/hatchetProject/QuEST)] [![GitHub stars](https://img.shields.io/github/stars/hatchetProject/QuEST?style=social)](https://github.com/hatchetProject/QuEST)
- [[ICML](https://arxiv.org/abs/2505.05799)] MxMoE: Mixed-precision Quantization for MoE with Accuracy and Performance Co-Design [[code](https://github.com/cat538/MxMoE)] [![GitHub stars](https://img.shields.io/github/stars/cat538/MxMoE?style=social)](https://github.com/cat538/MxMoE)
- [[ICML](https://arxiv.org/abs/2505.04877)] Learning from Loss Landscape: Generalizable Mixed-Precision Quantization via Adaptive Sharpness-Aware Gradient Aligning
- [[ICML](https://arxiv.org/abs/2503.15748)] PARQ: Piecewise-Affine Regularized Quantization [[code](https://github.com/facebookresearch/parq)] [![GitHub stars](https://img.shields.io/github/stars/facebookresearch/parq?style=social)](https://github.com/facebookresearch/parq)
- [[ICML](https://arxiv.org/abs/2503.22879)] Quamba2: A Robust and Scalable Post-training Quantization Framework for Selective State Space Models [[code](https://github.com/enyac-group/Quamba)] [![GitHub stars](https://img.shields.io/github/stars/enyac-group/Quamba?style=social)](https://github.com/enyac-group/Quamba)
- [[ICML](https://openreview.net/forum?id=G6DmP9wxeB)] LRA-QViT: Integrating Low-Rank Approximation and Quantization for Robust and Efficient Vision Transformers
- [[ICML](https://arxiv.org/abs/2406.13474)] BoA: Attention-aware Post-training Quantization without Backpropagation
- [[ICML](https://arxiv.org/abs/2505.03804)] MoEQuant: Enhancing Quantization for Mixture-of-Experts Large Language Models via Expert-Balanced Sampling and Affinity Guidance [[code](https://github.com/chenzx921020/MoEQuant)] [![GitHub stars](https://img.shields.io/github/stars/chenzx921020/MoEQuant?style=social)](https://github.com/chenzx921020/MoEQuant)
- [[ICML](https://arxiv.org/abs/2502.09720)] NestQuant: nested lattice quantization for matrix products and LLMs
- [[ICML](https://arxiv.org/abs/2506.20251)] Q-resafe: Assessing Safety Risks and Quantization-aware Safety Patching for Quantized Large Language Models [[code](https://github.com/Thecommonirin/Qresafe)] [![GitHub stars](https://img.shields.io/github/stars/Thecommonirin/Qresafe?style=social)](https://github.com/Thecommonirin/Qresafe)
- [[ICML](https://arxiv.org/abs/2410.09615)] SLiM: One-shot Quantization and Sparsity with Low-rank Approximation for LLM Weight Compression [[code](https://github.com/Paramathic/slim)] [![GitHub stars](https://img.shields.io/github/stars/Paramathic/slim?style=social)](https://github.com/Paramathic/slim)
- [[ICML](https://arxiv.org/abs/2410.06020)] QT-DoG: Quantization-Aware Training for Domain Generalization [[code](https://github.com/saqibjaved1/QT-DoG)] [![GitHub stars](https://img.shields.io/github/stars/saqibjaved1/QT-DoG?style=social)](https://github.com/saqibjaved1/QT-DoG)
- [[ICML](https://arxiv.org/abs/2502.06786)] Matryoshka Quantization
- [[ICML](https://arxiv.org/abs/2505.23651)] Merge-Friendly Post-Training Quantization for Multi-Target Domain Adaptation [[code](https://github.com/ewsn1593/HDRQ)] [![GitHub stars](https://img.shields.io/github/stars/ewsn1593/HDRQ?style=social)](https://github.com/ewsn1593/HDRQ)
- [[ICML](https://arxiv.org/abs/2505.14371)] Layer-wise Quantization for Quantized Optimistic Dual Averaging
- [[ICML](https://openreview.net/forum?id=w5fONAEwra)] Outlier-Aware Post-Training Quantization for Discrete Graph Diffusion Models
- [[ICML](https://arxiv.org/abs/2501.01144)] BlockDialect: Block-wise Fine-grained Mixed Format Quantization for Energy-Efficient LLM Inference
- [[ICML](https://arxiv.org/abs/2504.02692)] GPTAQ: Efficient Finetuning-Free Quantization with Asymmetric Calibration [[code](https://github.com/Intelligent-Computing-Lab-Panda/GPTAQ)] [![GitHub stars](https://img.shields.io/github/stars/Intelligent-Computing-Lab-Panda/GPTAQ?style=social)](https://github.com/Intelligent-Computing-Lab-Panda/GPTAQ)
- [[ICML](https://arxiv.org/abs/2501.17116)] Optimizing Large Language Model Training Using FP4 Quantization
- [[ICML](https://arxiv.org/abs/2412.04180)] SKIM: Any-bit Quantization Pushing The Limits of Post-Training Quantization
- [[ICML](https://arxiv.org/abs/2411.10958)] SageAttention2: Efficient Attention with Thorough Outlier Smoothing and Per-thread INT4 Quantization [[code](https://github.com/thu-ml/SageAttention)] [![GitHub stars](https://img.shields.io/github/stars/thu-ml/SageAttention?style=social)](https://github.com/thu-ml/SageAttention)
- [[AAAI](https://arxiv.org/abs/2409.14330)] Thinking in Granularity: Dynamic Quantization for Image Super-Resolution by Intriguing Multi-Granularity Clues [[code](https://github.com/MmmingS/Granular-DQ)] [![GitHub stars](https://img.shields.io/github/stars/MmmingS/Granular-DQ?style=social)](https://github.com/MmmingS/Granular-DQ)
- [[AAAI](https://arxiv.org/abs/2501.08180)] D2-DPM: Dual Denoising for Quantized Diffusion Probabilistic Models [[code](https://github.com/TaylorJocelyn/D2-DPM)] [![GitHub stars](https://img.shields.io/github/stars/TaylorJocelyn/D2-DPM?style=social)](https://github.com/TaylorJocelyn/D2-DPM)
- [[CVPR](https://arxiv.org/abs/2411.13918)] Quantization without Tears
- [[CVPR](https://arxiv.org/abs/2504.02508)] APHQ-ViT: Post-Training Quantization with Average Perturbation Hessian Based Reconstruction for Vision Transformer [[code](https://github.com/GoatWu/APHQ-ViT)] [![GitHub stars](https://img.shields.io/github/stars/GoatWu/APHQ-ViT?style=social)](https://github.com/GoatWu/APHQ-ViT)
- [[ICLR](https://openreview.net/forum?id=2rnOgyFQgb)] SynQ: Accurate Zero-shot Quantization by Synthesis-aware Fine-tuning [[code](https://github.com/snudm-starlab/SynQ)] [![GitHub stars](https://img.shields.io/github/stars/snudm-starlab/SynQ?style=social)](https://github.com/snudm-starlab/SynQ)

### 2024

- [[ICML](https://openreview.net/forum?id=qOl2WWOqFg)] BiLLM: Pushing the Limit of Post-Training Quantization for LLMs [[code](https://github.com/Aaronhuang-778/BiLLM)] [![GitHub stars](https://img.shields.io/github/stars/Aaronhuang-778/BiLLM?style=social)](https://github.com/Aaronhuang-778/BiLLM)
- [[ICML](https://openreview.net/forum?id=sCGRhnuMUJ)] Compressing Large Language Models by Joint Sparsification and Quantization
- [[NeurIPS](https://nips.cc/virtual/2024/poster/93620)] BiDM: Pushing the Limit of Quantization for Diffusion Models
- [[ACL Findings](https://aclanthology.org/2024.findings-acl.516/)] DB-LLM: Accurate Dual-Binarization for Efficient LLMs
- [[NeurIPS](https://neurips.cc/virtual/2024/poster/93008)] Binarized Diffusion Model for Image Super-Resolution [[code](https://github.com/zhengchen1999/BI-DiffSR)] [![GitHub stars](https://img.shields.io/github/stars/zhengchen1999/BI-DiffSR?style=social)](https://github.com/zhengchen1999/BI-DiffSR)
- [[NeurIPS](https://openreview.net/forum?id=ADJASE9uQ2)] 2DQuant: Low-bit Post-Training Quantization for Image Super-Resolution [[code](https://github.com/Kai-Liu001/2DQuant)] [![GitHub stars](https://img.shields.io/github/stars/Kai-Liu001/2DQuant?style=social)](https://github.com/Kai-Liu001/2DQuant)
- [[ICML](https://proceedings.mlr.press/v235/qin24b.html)] Accurate LoRA-Finetuning Quantization of LLMs via Information Retention [[code](https://github.com/htqin/IR-QLoRA)] [![GitHub stars](https://img.shields.io/github/stars/htqin/IR-QLoRA?style=social)](https://github.com/htqin/IR-QLoRA)
- [[ICML](https://proceedings.mlr.press/v235/zhang24bb.html)] Flexible Residual Binarization for Image Super-Resolution
- [[AAAI](https://ojs.aaai.org/index.php/AAAI/article/view/29860)] Agile-Quant: Activation-Guided Quantization for Faster Inference of LLMs on the Edge
- [[AAAI](https://ojs.aaai.org/index.php/AAAI/article/view/29487)] AQ-DETR: Low-Bit Quantized Detection Transformer with Auxiliary Queries
- [[AAAI](https://ojs.aaai.org/index.php/AAAI/article/view/28109)] Bi-ViT: Pushing the Limit of Vision Transformer Quantization
- [[AAAI](https://ojs.aaai.org/index.php/AAAI/article/view/29908)] Exploring Post-training Quantization in LLMs from Comprehensive Study to Low Rank Compensation
- [[AAAI](https://ojs.aaai.org/index.php/AAAI/article/view/29045)] Make RepVGG Greater Again: A Quantization-Aware Approach
- [[AAAI](https://ojs.aaai.org/index.php/AAAI/article/view/29212)] MetaMix: Meta-State Precision Searcher for Mixed-Precision Activation Quantization
- [[AAAI](https://ojs.aaai.org/index.php/AAAI/article/view/29815)] Norm Tweaking: High-Performance Low-Bit Quantization of Large Language Models
- [[AAAI](https://ojs.aaai.org/index.php/AAAI/article/view/29237)] OWQ: Outlier-Aware Weight Quantization for Efficient Fine-Tuning and Inference of Large Language Models
- [[AAAI](https://ojs.aaai.org/index.php/AAAI/article/view/29553)] PTMQ: Post-training Multi-Bit Quantization of Neural Networks
- [[AAAI](https://ojs.aaai.org/index.php/AAAI/article/view/28972)] Robustness-Guided Image Synthesis for Data-Free Quantization
- [[AAAI](https://ojs.aaai.org/index.php/AAAI/article/view/29765)] What Makes Quantization for Large Language Model Hard? An Empirical Study from the Lens of Perturbation
- [[ACL](https://aclanthology.org/2024.acl-long.612/)] Improving Conversational Abilities of Quantized Large Language Models via Direct Preference Alignment
- [[ACM MM](https://dl.acm.org/doi/abs/10.1145/3664647.3680838)] Advancing Multimodal Large Language Models with Quantization-Aware Scale Learning Based on Warmup
- [[CVPR](https://openaccess.thecvf.com/content/CVPR2024/html/Fan_Data-Free_Quantization_via_Pseudo-label_Filtering_CVPR_2024_paper.html)] Data-Free Quantization via Pseudo-label Filtering
- [[CVPR](https://openaccess.thecvf.com/content/CVPR2024/html/Shang_Enhancing_Post-training_Quantization_Calibration_through_Contrastive_Learning_CVPR_2024_paper.html)] Enhancing Post-training Quantization Calibration through Contrastive Learning
- [[CVPR](https://openaccess.thecvf.com/content/CVPR2024/html/Moon_Instance-Aware_Group_Quantization_for_Vision_Transformers_CVPR_2024_paper.html)] Instance-Aware Group Quantization for Vision Transformers
- [[CVPR](https://openaccess.thecvf.com/content/CVPR2024/html/Chen_Mixed-Precision_Quantization_for_Federated_Learning_on_Resource-Constrained_Heterogeneous_Devices_CVPR_2024_paper.html)] Mixed-Precision Quantization for Federated Learning on Resource-Constrained Heterogeneous Devices
- [[CVPR](https://openaccess.thecvf.com/content/CVPR2024/html/Lv_PTQ4SAM_Post-Training_Quantization_for_Segment_Anything_CVPR_2024_paper.html)] PTQ4SAM: Post-Training Quantization for Segment Anything
- [[CVPR](https://openaccess.thecvf.com/content/CVPR2024/html/Ding_Reg-PTQ_Regression-specialized_Post-training_Quantization_for_Fully_Quantized_Object_Detector_CVPR_2024_paper.html)] Reg-PTQ: Regression-specialized Post-training Quantization for Fully Quantized Object Detector
- [[CVPR](https://openaccess.thecvf.com/content/CVPR2024/html/Tang_Retraining-Free_Model_Quantization_via_One-Shot_Weight-Coupling_Learning_CVPR_2024_paper.html)] Retraining-Free Model Quantization via One-Shot Weight-Coupling Learning
- [[CVPR](https://openaccess.thecvf.com/content/CVPR2024/html/Huang_TFMQ-DM_Temporal_Feature_Maintenance_Quantization_for_Diffusion_Models_CVPR_2024_paper.html)] TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models
- [[CVPR](https://openaccess.thecvf.com/content/CVPR2024/html/Wang_Towards_Accurate_Post-training_Quantization_for_Diffusion_Models_CVPR_2024_paper.html)] Towards Accurate Post-training Quantization for Diffusion Models
- [[ECCV](https://www.ecva.net/papers/eccv_2024/papers_ECCV/html/3969_ECCV_2024_paper.php)] AdaLog: Post-Training Quantization for Vision Transformers with Adaptive Logarithm Quantizer
- [[ECCV](https://www.ecva.net/papers/eccv_2024/papers_ECCV/html/8434_ECCV_2024_paper.php)] CLAMP-ViT: Contrastive Data-Free Learning for Adaptive Post-Training Quantization of ViTs
- [[ECCV](https://www.ecva.net/papers/eccv_2024/papers_ECCV/html/2494_ECCV_2024_paper.php)] Memory-Efficient Fine-Tuning for Quantized Diffusion Model
- [[ECCV](https://www.ecva.net/papers/eccv_2024/papers_ECCV/html/3914_ECCV_2024_paper.php)] MetaAug: Meta-Data Augmentation for Post-Training Quantization
- [[ECCV](https://www.ecva.net/papers/eccv_2024/papers_ECCV/html/2212_ECCV_2024_paper.php)] MixDQ: Memory-Efficient Few-Step Text-to-Image Diffusion Models with Metric-Decoupled Mixed Precision Quantization
- [[ECCV](https://www.ecva.net/papers/eccv_2024/papers_ECCV/html/2121_ECCV_2024_paper.php)] Overcoming Distribution Mismatch in Quantizing Image Super-Resolution Networks
- [[ECCV](https://www.ecva.net/papers/eccv_2024/papers_ECCV/html/7353_ECCV_2024_paper.php)] Post-training Quantization with Progressive Calibration and Activation Relaxing for Text-to-Image Diffusion Models
- [[ECCV](https://www.ecva.net/papers/eccv_2024/papers_ECCV/html/1627_ECCV_2024_paper.php)] PQ-SAM: Post-training Quantization for Segment Anything Model
- [[ECCV](https://www.ecva.net/papers/eccv_2024/papers_ECCV/html/8312_ECCV_2024_paper.php)] Timestep-Aware Correction for Quantized Diffusion Models
- [[ECCV](https://www.ecva.net/papers/eccv_2024/papers_ECCV/html/9567_ECCV_2024_paper.php)] Towards Robust Full Low-bit Quantization of Super Resolution Networks
- [[EMNLP](https://aclanthology.org/2024.emnlp-main.1168/)] ApiQ: Finetuning of 2-Bit Quantized Large Language Model
- [[EMNLP](https://aclanthology.org/2024.emnlp-main.134/)] Prefixing Attention Sinks can Mitigate Activation Outliers for Large Language Model Quantization
- [[EMNLP](https://aclanthology.org/2024.emnlp-main.467/)] VPTQ: Extreme Low-bit Vector Post-Training Quantization for Large Language Models
- [[ICLR](https://openreview.net/forum?id=of2rhALq8l)] AffineQuant: Affine Transformation Quantization for Large Language Models [[code](https://github.com/bytedance/AffineQuant)] [![GitHub stars](https://img.shields.io/github/stars/bytedance/AffineQuant?style=social)](https://github.com/bytedance/AffineQuant)
- [[ICLR](https://openreview.net/forum?id=UmMa3UNDAz)] EfficientDM: Efficient Quantization-Aware Fine-Tuning of Low-Bit Diffusion Models
- [[ICLR](https://openreview.net/forum?id=0d1gQI114C)] LiDAR-PTQ: Post-Training Quantization for Point Cloud 3D Object Detection
- [[ICLR](https://openreview.net/forum?id=LzPWWPAdY4)] LoftQ: LoRA-Fine-Tuning-aware Quantization for Large Language Models [[code](https://github.com/yxli2123/LoftQ)] [![GitHub stars](https://img.shields.io/github/stars/yxli2123/LoftQ?style=social)](https://github.com/yxli2123/LoftQ)
- [[ICLR](https://openreview.net/forum?id=gLARhFLE0F)] LUT-GEMM: Quantized Matrix Multiplication based on LUTs for Efficient Inference in Large-Scale Generative Language Models
- [[ICLR](https://openreview.net/forum?id=8Wuvhh0LYW)] OmniQuant: Omnidirectionally Calibrated Quantization for Large Language Models [[code](https://github.com/OpenGVLab/OmniQuant)] [![GitHub stars](https://img.shields.io/github/stars/OpenGVLab/OmniQuant?style=social)](https://github.com/OpenGVLab/OmniQuant)
- [[ICLR](https://openreview.net/forum?id=BifeBRhikU)] PB-LLM: Partially Binarized Large Language Models [[code](https://github.com/hahnyuan/PB-LLM)] [![GitHub stars](https://img.shields.io/github/stars/hahnyuan/PB-LLM?style=social)](https://github.com/hahnyuan/PB-LLM)
- [[ICLR](https://openreview.net/forum?id=WvFoJccpo8)] QA-LoRA: Quantization-Aware Low-Rank Adaptation of Large Language Models [[code](https://github.com/yuhuixu1993/qa-lora)] [![GitHub stars](https://img.shields.io/github/stars/yuhuixu1993/qa-lora?style=social)](https://github.com/yuhuixu1993/qa-lora)
- [[ICLR](https://openreview.net/forum?id=FIplmUWdm3)] QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Models
- [[ICLR](https://openreview.net/forum?id=JzG7kSpjJk)] Rethinking Channel Dimensions to Isolate Outliers for Low-bit Weight Quantization of Large Language Models
- [[ICLR](https://openreview.net/forum?id=Q1u25ahSuy)] SpQR: A Sparse-Quantized Representation for Near-Lossless LLM Weight Compression [[code](https://github.com/Vahe1994/SpQR)] [![GitHub stars](https://img.shields.io/github/stars/Vahe1994/SpQR?style=social)](https://github.com/Vahe1994/SpQR)
- [[ICML](https://openreview.net/forum?id=mbx2pLK5Eq)] A2Q+: Improving Accumulator-Aware Weight Quantization
- [[ICML](https://openreview.net/forum?id=DbyHDYslM7)] BiE: Bi-Exponent Block Floating-Point for Large Language Models Quantization
- [[ICML](https://openreview.net/forum?id=jKUWlgra9b)] ERQ: Error Reduction for Post-Training Quantization of Vision Transformers
- [[ICML](https://openreview.net/forum?id=DKKg5EFAFr)] Evaluating Quantized Large Language Models
[[ICML](https:\u002F\u002Fopenreview.net\u002Fforum?id=5mCaITRTmO)] Extreme Compression of Large Language Models via Additive Quantization\n- [[ICML](https:\u002F\u002Fopenreview.net\u002Fforum?id=xPypr0kufs)] FrameQuant: Flexible Low-Bit Quantization for Transformers\n- [[ICML](https:\u002F\u002Fopenreview.net\u002Fforum?id=L057s2Rq8O)] KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache [[code](https:\u002F\u002Fgithub.com\u002Fjy-yuan\u002FKIVI)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fjy-yuan\u002FKIVI?style=social)](https:\u002F\u002Fgithub.com\u002Fjy-yuan\u002FKIVI)\n- [[ICML](https:\u002F\u002Fopenreview.net\u002Fforum?id=dh8k41g775)] LQER: Low-Rank Quantization Error Reconstruction for LLMs\n- [[ICML](https:\u002F\u002Fopenreview.net\u002Fforum?id=Uh5XN9d2J4)] Outlier-aware Slicing for Post-Training Quantization in Vision Transformer\n- [[ICML](https:\u002F\u002Fopenreview.net\u002Fforum?id=8mKXMnhnFW)] Sharpness-Aware Data Generation for Zero-shot Quantization\n- [[ICML](https:\u002F\u002Fopenreview.net\u002Fforum?id=0jpbpFia8m)] SqueezeLLM: Dense-and-Sparse Quantization [[code](https:\u002F\u002Fgithub.com\u002FSqueezeAILab\u002FSqueezeLLM)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FSqueezeAILab\u002FSqueezeLLM?style=social)](https:\u002F\u002Fgithub.com\u002FSqueezeAILab\u002FSqueezeLLM)\n- [[MLSys](https:\u002F\u002Fproceedings.mlsys.org\u002Fpaper_files\u002Fpaper\u002F2024\u002Fhash\u002F42a452cbafa9dd64e9ba4aa95cc1ef21-Abstract-Conference.html)] AWQ: Activation-aware Weight Quantization for On-Device LLM Compression and Acceleration [[code](https:\u002F\u002Fgithub.com\u002Fmit-han-lab\u002Fllm-awq)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fmit-han-lab\u002Fllm-awq?style=social)](https:\u002F\u002Fgithub.com\u002Fmit-han-lab\u002Fllm-awq)\n- [[NeurIPS](https:\u002F\u002Fnips.cc\u002Fvirtual\u002F2024\u002Fposter\u002F96909)] BitsFusion: 1.99 bits Weight Quantization of Diffusion Model\n- [[NeurIPS](https:\u002F\u002Fnips.cc\u002Fvirtual\u002F2024\u002Fposter\u002F93727)] DuQuant: Distributing Outliers via Dual Transformation Makes Stronger Quantized LLMs\n- [[NeurIPS](https:\u002F\u002Fnips.cc\u002Fvirtual\u002F2024\u002Fposter\u002F93558)] KV Cache is 1 Bit Per Channel: Efficient Large Language Model Inference with Coupled Quantization\n- [[NeurIPS](https:\u002F\u002Fnips.cc\u002Fvirtual\u002F2024\u002Fposter\u002F96936)] KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization\n- [[NeurIPS](https:\u002F\u002Fnips.cc\u002Fvirtual\u002F2024\u002Fposter\u002F95445)] PTQ4DiT: Post-training Quantization for Diffusion Transformers\n- [[NeurIPS](https:\u002F\u002Fnips.cc\u002Fvirtual\u002F2024\u002Fposter\u002F94107)] Q-VLM: Post-training Quantization for Large Vision-Language Models\n- [[NeurIPS](https:\u002F\u002Fnips.cc\u002Fvirtual\u002F2024\u002Fposter\u002F95634)] QBB: Quantization with Binary Bases for LLMs\n- [[NeurIPS](https:\u002F\u002Fnips.cc\u002Fvirtual\u002F2024\u002Fposter\u002F96563)] ZipCache: Accurate and Efficient KV Cache Quantization with Salient Token Identification\n- [[ACL Findings](https:\u002F\u002Faclanthology.org\u002F2024.findings-acl.726\u002F)] A Comprehensive Evaluation of Quantization Strategies for Large Language Models\n- [[ACL Findings](https:\u002F\u002Faclanthology.org\u002F2024.findings-acl.3\u002F)] AFPQ: Asymmetric Floating Point Quantization for LLMs 
[[code](https:\u002F\u002Fgithub.com\u002Fzhangsichengsjtu\u002FAFPQ)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fzhangsichengsjtu\u002FAFPQ?style=social)](https:\u002F\u002Fgithub.com\u002Fzhangsichengsjtu\u002FAFPQ)\n- [[ACL Findings](https:\u002F\u002Faclanthology.org\u002F2024.findings-acl.26\u002F)] LLM-QAT: Data-Free Quantization Aware Training for Large Language Models\n- [[EMNLP Findings](https:\u002F\u002Faclanthology.org\u002F2024.findings-emnlp.1001\u002F)] ATQ: Activation Transformation for Weight-Activation Quantization of LLMs\n- [[EMNLP Findings](https:\u002F\u002Faclanthology.org\u002F2024.findings-emnlp.444\u002F)] Fine-tuning Rotated Outlier-free LLMs for Effective Weight-Activation Quantization\n- [[EMNLP Findings](https:\u002F\u002Faclanthology.org\u002F2024.findings-emnlp.935\u002F)] How Does Quantization Affect Multilingual LLMs?\n- [[EMNLP Findings](https:\u002F\u002Faclanthology.org\u002F2024.findings-emnlp.570\u002F)] MobileQuant: Mobile-friendly Quantization for On-device Language Models\n- [[EMNLP Findings](https:\u002F\u002Faclanthology.org\u002F2024.findings-emnlp.811\u002F)] QEFT: Quantization for Efficient Fine-Tuning of LLMs\n- [[EMNLP Industry](https:\u002F\u002Faclanthology.org\u002F2024.emnlp-industry.12\u002F)] LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F2402.14866)] APTQ: Attention-aware Post-Training Mixed-Precision Quantization for Large Language Models\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F2403.02775)] EasyQuant: An Efficient Data-free Quantization Algorithm for LLMs\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F2402.10787)] EdgeQAT: Entropy and Distribution Guided Quantization-Aware Training for the Acceleration of Lightweight LLMs on the Edge [[code](https:\u002F\u002Fgithub.com\u002Fshawnricecake\u002FEdgeQAT)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fshawnricecake\u002FEdgeQAT?style=social)](https:\u002F\u002Fgithub.com\u002Fshawnricecake\u002FEdgeQAT)\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F2402.17985)] FlattenQuant: Breaking Through the Inference Compute-bound for Large Language Models with Per-tensor Quantization\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F2402.15319)] GPTVQ: The Blessing of Dimensionality for LLM Quantization [[code](https:\u002F\u002Fgithub.com\u002Fqualcomm-ai-research\u002Fgptvq)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fqualcomm-ai-research\u002Fgptvq?style=social)](https:\u002F\u002Fgithub.com\u002Fqualcomm-ai-research\u002Fgptvq)\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F2403.01241)] IntactKV: Improving Large Language Model Quantization by Keeping Pivot Tokens Intact\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F2402.11295)] OneBit: Towards Extremely Low-bit Large Language Models\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F2404.00456)] QuaRot: Outlier-Free 4-Bit Inference in Rotated LLMs [[code](https:\u002F\u002Fgithub.com\u002Fspcl\u002FQuaRot)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fspcl\u002FQuaRot?style=social)](https:\u002F\u002Fgithub.com\u002Fspcl\u002FQuaRot)\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F2402.05628)] RepQuant: Towards Accurate Post-Training Quantization of Large Transformer Models via Scale Reparameterization\n- 
[[SIGMOD](https:\u002F\u002Fdl.acm.org\u002Fdoi\u002F10.1145\u002F3654970)] RaBitQ: Quantizing High-Dimensional Vectors with a Theoretical Error Bound for Approximate Nearest Neighbor Search [[code](https:\u002F\u002Fgithub.com\u002Fgaoj0017\u002FRaBitQ)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fgaoj0017\u002FRaBitQ?style=social)](https:\u002F\u002Fgithub.com\u002Fgaoj0017\u002FRaBitQ)\n- [[AAAI](https:\u002F\u002Farxiv.org\u002Fabs\u002F2401.16760)] One-Step Forward and Backtrack: Overcoming Zig-Zagging in Loss-Aware Quantization Training\n- [[ICML](https:\u002F\u002Farxiv.org\u002Fabs\u002F2405.03103)] Learning from students: Applying t-distributions to explore accurate and efficient formats for llms [[code](https:\u002F\u002Fgithub.com\u002Fcornell-zhang\u002Fllm-datatypes)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fcornell-zhang\u002Fllm-datatypes?style=social)](https:\u002F\u002Fgithub.com\u002Fcornell-zhang\u002Fllm-datatypes)\n- [[ICML](https:\u002F\u002Farxiv.org\u002Fabs\u002F2403.12422)] Jetfire: Efficient and Accurate Transformer Pretraining with INT8 Data Flow and Per-Block Quantization\n- [[ICML](https:\u002F\u002Fopenreview.net\u002Fforum?id=fM9xTkpAdu)] Reshape and Adapt for Output Quantization (RAOQ): Quantization-aware Training for In-memory Computing Systems\n- [[ICML](https:\u002F\u002Farxiv.org\u002Fabs\u002F2402.04396)] QuIP#: Even Better LLM Quantization with Hadamard Incoherence and Lattice Codebooks [[code](https:\u002F\u002Fgithub.com\u002FCornell-RelaxML\u002Fquip-sharp)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FCornell-RelaxML\u002Fquip-sharp?style=social)](https:\u002F\u002Fgithub.com\u002FCornell-RelaxML\u002Fquip-sharp)\n- [[NeurIPS](https:\u002F\u002Farxiv.org\u002Fabs\u002F2402.08958)] Towards Next-Level Post-Training Quantization of Hyper-Scale Transformers [[code](https:\u002F\u002Fgithub.com\u002FSamsungLabs\u002Faespa)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FSamsungLabs\u002Faespa?style=social)](https:\u002F\u002Fgithub.com\u002FSamsungLabs\u002Faespa)\n- [[NeurIPS](https:\u002F\u002Farxiv.org\u002Fabs\u002F2406.00800)] MagR: Weight Magnitude Reduction for Enhancing Post-Training Quantization [[code](https:\u002F\u002Fgithub.com\u002Faozhongzhang\u002Fmagr)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Faozhongzhang\u002Fmagr?style=social)](https:\u002F\u002Fgithub.com\u002Faozhongzhang\u002Fmagr)\n- [[NeurIPS](https:\u002F\u002Farxiv.org\u002Fabs\u002F2405.18137)] Exploiting LLM Quantization\n- [[NeurIPS](https:\u002F\u002Fopenreview.net\u002Fforum?id=HfpV6u0kbX)] Efficient Multi-task LLM Quantization and Serving for Multiple LoRA Adapters\n- [[NeurIPS](https:\u002F\u002Farxiv.org\u002Fabs\u002F2406.11235)] QTIP: Quantization with Trellises and Incoherence Processing [[code](https:\u002F\u002Fgithub.com\u002FCornell-RelaxML\u002Fqtip)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FCornell-RelaxML\u002Fqtip?style=social)](https:\u002F\u002Fgithub.com\u002FCornell-RelaxML\u002Fqtip)\n- [[NeurIPS](https:\u002F\u002Fopenreview.net\u002Fforum?id=dYIqAZXQNV)] Generalizing CNNs to graphs with learnable neighborhood quantization [[code](https:\u002F\u002Fgithub.com\u002FGrosenick-Lab-Cornell\u002FQuantNets)] [![GitHub 
stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FGrosenick-Lab-Cornell\u002FQuantNets?style=social)](https:\u002F\u002Fgithub.com\u002FGrosenick-Lab-Cornell\u002FQuantNets)\n- [[NeurIPS](https:\u002F\u002Farxiv.org\u002Fabs\u002F2410.15526)] SDP4Bit: Toward 4-bit Communication Quantization in Sharded Data Parallelism for LLM Training [[code](https:\u002F\u002Fgithub.com\u002FByteDance-Seed\u002FSDP4Bit)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FByteDance-Seed\u002FSDP4Bit?style=social)](https:\u002F\u002Fgithub.com\u002FByteDance-Seed\u002FSDP4Bit)\n- [[NeurIPS](https:\u002F\u002Fpapers.nips.cc\u002Fpaper_files\u002Fpaper\u002F2024\u002Ffile\u002Fab6a2c6ee757afe43882121281f6065c-Paper-Conference.pdf)] Optimal and Approximate Adaptive Stochastic Quantization [[code](https:\u002F\u002Fgithub.com\u002Franbenbasat\u002FQUIVER)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Franbenbasat\u002FQUIVER?style=social)](https:\u002F\u002Fgithub.com\u002Franbenbasat\u002FQUIVER)\n- [[NeurIPS](https:\u002F\u002Farxiv.org\u002Fabs\u002F2404.02837)] Cherry on Top: Parameter Heterogeneity and Quantization in Large Language Models\n- [[NeurIPS](https:\u002F\u002Fopenreview.net\u002Fforum?id=cEtExbAKYV)] StepbaQ: Stepping backward as Correction for Quantized Diffusion Models\n\n### 2023\n\n- [[ICML](https:\u002F\u002Fproceedings.mlr.press\u002Fv202\u002Fqin23a.html)] BiBench: Benchmarking and Analyzing Network Binarization [[code](https:\u002F\u002Fgithub.com\u002Fhtqin\u002FBiBench)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fhtqin\u002FBiBench?style=social)](https:\u002F\u002Fgithub.com\u002Fhtqin\u002FBiBench)\n- [[IJCV](https:\u002F\u002Farxiv.org\u002Fabs\u002F2109.12338)] Distribution-sensitive Information Retention for Accurate Binary Neural Network\n- [[NeurIPS](https:\u002F\u002Fneurips.cc\u002Fvirtual\u002F2023\u002Fposter\u002F71287)] BiMatting: Efficient Video Matting via Binarization [[code](https:\u002F\u002Fgithub.com\u002Fhtqin\u002FBiMatting)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fhtqin\u002FBiMatting?style=social)](https:\u002F\u002Fgithub.com\u002Fhtqin\u002FBiMatting)\n- [[NeurIPS](https:\u002F\u002Fneurips.cc\u002Fvirtual\u002F2023\u002Fposter\u002F72890)] QuantSR: Accurate Low-bit Quantization for Efficient Image Super-Resolution [[code](https:\u002F\u002Fgithub.com\u002Fhtqin\u002FQuantSR)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fhtqin\u002FQuantSR?style=social)](https:\u002F\u002Fgithub.com\u002Fhtqin\u002FQuantSR)\n- [[TPAMI](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F10146917)] Diverse Sample Generation: Pushing the Limit of Generative Data-Free Quantization [[code](https:\u002F\u002Fgithub.com\u002Fhtqin\u002FDSG)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fhtqin\u002FDSG?style=social)](https:\u002F\u002Fgithub.com\u002Fhtqin\u002FDSG)\n- [[TNNLS](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F10049753)] BiFSMNv2: Pushing Binary Neural Networks for Keyword Spotting to Real-Network Performance [[code](https:\u002F\u002Fgithub.com\u002Fhtqin\u002FBiFSMNv2)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fhtqin\u002FBiFSMNv2?style=social)](https:\u002F\u002Fgithub.com\u002Fhtqin\u002FBiFSMNv2)\n- [[AAAI](https:\u002F\u002Fojs.aaai.org\u002Findex.php\u002FAAAI\u002Farticle\u002Fview\u002F26268)] Fast 
and Accurate Binary Neural Networks Based on Depth-Width Reshaping\n- [[AAAI](https:\u002F\u002Fojs.aaai.org\u002Findex.php\u002FAAAI\u002Farticle\u002Fview\u002F26084)] OMPQ: Orthogonal Mixed Precision Quantization\n- [[AAAI](https:\u002F\u002Fojs.aaai.org\u002Findex.php\u002FAAAI\u002Farticle\u002Fview\u002F26354)] Quantized Feature Distillation for Network Quantization\n- [[AAAI](https:\u002F\u002Fojs.aaai.org\u002Findex.php\u002FAAAI\u002Farticle\u002Fview\u002F26261)] Resilient Binary Neural Network\n- [[AAAI](https:\u002F\u002Fojs.aaai.org\u002Findex.php\u002FAAAI\u002Farticle\u002Fview\u002F26136)] Rethinking Data-Free Quantization as a Zero-Sum Game\n- [[ACL](https:\u002F\u002Faclanthology.org\u002F2023.findings-acl.15\u002F)] Boost Transformer-based Language Models with GPU-Friendly Sparsity and Quantization\n- [[ACL](https:\u002F\u002Farxiv.org\u002Fabs\u002F2306.00014)] PreQuant: A Task-agnostic Quantization Approach for Pre-trained Language Models\n- [[CVPR](https:\u002F\u002Fipl.dgist.ac.kr\u002FABCD_cvpr23.pdf)] ABCD : Arbitrary Bitwise Coefficient for De-quantization\n- [[CVPR](https:\u002F\u002Farxiv.org\u002Fabs\u002F2303.06869)] Adaptive Data-Free Quantization\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2023\u002Fpapers\u002FLin_Bit-Shrinking_Limiting_Instantaneous_Sharpness_for_Improving_Post-Training_Quantization_CVPR_2023_paper.pdf)] Bit-shrinking: Limiting Instantaneous Sharpness for Improving Post-training Quantization\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2023\u002Fpapers\u002FYu_Boost_Vision_Transformer_With_GPU-Friendly_Sparsity_and_Quantization_CVPR_2023_paper.pdf)] Boost Vision Transformer with GPU-Friendly Sparsity and Quantization\n- [[CVPR](https:\u002F\u002Farxiv.org\u002Fabs\u002F2212.04780)] GENIE: Show Me the Data for Quantization [[code](https:\u002F\u002Fgithub.com\u002FSamsungLabs\u002FGenie)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FSamsungLabs\u002FGenie?style=social)](https:\u002F\u002Fgithub.com\u002FSamsungLabs\u002FGenie)\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2023\u002Fhtml\u002FLi_Hard_Sample_Matters_a_Lot_in_Zero-Shot_Quantization_CVPR_2023_paper.html)] Hard Sample Matters a Lot in Zero-Shot Quantization\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2023\u002Fpapers\u002FShin_NIPQ_Noise_Proxy-Based_Integrated_Pseudo-Quantization_CVPR_2023_paper.pdf)] NIPQ: Noise proxy-based Integrated Pseudo-Quantization\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2023\u002Fpapers\u002FLiu_NoisyQuant_Noisy_Bias-Enhanced_Post-Training_Activation_Quantization_for_Vision_Transformers_CVPR_2023_paper.pdf)] NoisyQuant: Noisy Bias-Enhanced Post-Training Activation Quantization for Vision Transformers\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2023\u002Fpapers\u002FKoryakovskiy_One-Shot_Model_for_Mixed-Precision_Quantization_CVPR_2023_paper.pdf)] One-Shot Model for Mixed-Precision Quantization\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2023\u002Fhtml\u002FLiu_PD-Quant_Post-Training_Quantization_Based_on_Prediction_Difference_Metric_CVPR_2023_paper.html)] PD-Quant: Post-Training Quantization Based on Prediction Difference Metric [[code](https:\u002F\u002Fgithub.com\u002Fhustvl\u002FPD-Quant)] [![GitHub 
stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fhustvl\u002FPD-Quant?style=social)](https:\u002F\u002Fgithub.com\u002Fhustvl\u002FPD-Quant)\n- [[CVPR](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2023\u002Fhtml\u002FShang_Post-Training_Quantization_on_Diffusion_Models_CVPR_2023_paper.html)] Post-training Quantization on Diffusion Models [[code](https:\u002F\u002Fgithub.com\u002F42Shawn\u002FPTQ4DM)]\n- [[CVPR](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2023\u002Fhtml\u002FXu_Q-DETR_An_Efficient_Low-Bit_Quantized_Detection_Transformer_CVPR_2023_paper.html)] Q-DETR: An Efficient Low-Bit Quantized Detection Transformer [[code](https:\u002F\u002Fgithub.com\u002FSteveTsui\u002FQ-DETR)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FSteveTsui\u002FQ-DETR?style=social)](https:\u002F\u002Fgithub.com\u002FSteveTsui\u002FQ-DETR)\n- [[CVPR](https:\u002F\u002Farxiv.org\u002Fabs\u002F2303.06424)] Regularized Vector Quantization for Tokenized Image Synthesis\n- [[CVPR](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2303.11906.pdf)] Solving Oscillation Problem in Post-Training Quantization Through a Theoretical Perspective [[code](https:\u002F\u002Fgithub.com\u002Fbytedance\u002Fmrecg)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fbytedance\u002Fmrecg?style=social)](https:\u002F\u002Fgithub.com\u002Fbytedance\u002Fmrecg)\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2023\u002Fpapers\u002FTu_Toward_Accurate_Post-Training_Quantization_for_Image_Super_Resolution_CVPR_2023_paper.pdf)] Toward Accurate Post-Training Quantization for Image Super Resolution\n- [[EMNLP](https:\u002F\u002Farxiv.org\u002Fabs\u002F2310.16836)] LLM-FP4: 4-Bit Floating-Point Quantized Transformers [[code](https:\u002F\u002Fgithub.com\u002Fnbasyl\u002FLLM-FP4)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fnbasyl\u002FLLM-FP4?style=social)](https:\u002F\u002Fgithub.com\u002Fnbasyl\u002FLLM-FP4)\n- [[EMNLP](https:\u002F\u002Farxiv.org\u002Fabs\u002F2304.09145)] Outlier Suppression+: Accurate quantization of large language models by equivalent and optimal shifting and scaling\n- [[EMNLP](https:\u002F\u002Farxiv.org\u002Fabs\u002F2310.05079)] Revisiting Block-based Quantisation: What is Important for Sub-8-bit LLM Inference?\n- [[EMNLP](https:\u002F\u002Farxiv.org\u002Fabs\u002F2310.11237)] Watermarking LLMs with Weight Quantization [[code](https:\u002F\u002Fgithub.com\u002FTwilight92z\u002FQuantize-Watermark)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FTwilight92z\u002FQuantize-Watermark?style=social)](https:\u002F\u002Fgithub.com\u002FTwilight92z\u002FQuantize-Watermark)\n- [[EMNLP](https:\u002F\u002Farxiv.org\u002Fabs\u002F2310.13315)] Zero-Shot Sharpness-Aware Quantization for Pre-trained Language Models\n- [[ICCV](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FICCV2023\u002Fpapers\u002FColbert_A2Q_Accumulator-Aware_Quantization_with_Guaranteed_Overflow_Avoidance_ICCV_2023_paper.pdf)] A2Q: Accumulator-Aware Quantization with Guaranteed Overflow Avoidance\n- [[ICCV](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FICCV2023\u002Fpapers\u002FHe_BiViT_Extremely_Compressed_Binary_Vision_Transformers_ICCV_2023_paper.pdf)] BiViT: Extremely Compressed Binary Vision Transformers\n- 
[[ICCV](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FICCV2023\u002Fpapers\u002FShang_Causal-DFQ_Causality_Guided_Data-Free_Network_Quantization_ICCV_2023_paper.pdf)] Causal-DFQ: Causality Guided Data-Free Network Quantization [[code](https:\u002F\u002Fgithub.com\u002F42Shawn\u002FCausal-DFQ)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002F42Shawn\u002FCausal-DFQ?style=social)](https:\u002F\u002Fgithub.com\u002F42Shawn\u002FCausal-DFQ)\n- [[ICCV](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FICCV2023\u002Fpapers\u002FLi_DenseShift_Towards_Accurate_and_Efficient_Low-Bit_Power-of-Two_Quantization_ICCV_2023_paper.pdf)] DenseShift: Towards Accurate and Efficient Low-Bit Power-of-Two Quantization\n- [[ICCV](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FICCV2023\u002Fpapers\u002FDong_EMQ_Evolving_Training-free_Proxies_for_Automated_Mixed_Precision_Quantization_ICCV_2023_paper.pdf)] EMQ: Evolving Training-free Proxies for Automated Mixed Precision Quantization\n- [[ICCV](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FICCV2023\u002Fpapers\u002FXu_EQ-Net_Elastic_Quantization_Neural_Networks_ICCV_2023_paper.pdf)] EQ-Net: Elastic Quantization Neural Networks [[code](https:\u002F\u002Fgithub.com\u002Fxuke225\u002FEQ-Net)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fxuke225\u002FEQ-Net?style=social)](https:\u002F\u002Fgithub.com\u002Fxuke225\u002FEQ-Net)\n- [[ICCV](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FICCV2023\u002Fhtml\u002FWu_Estimator_Meets_Equilibrium_Perspective_A_Rectified_Straight_Through_Estimator_for_ICCV_2023_paper.html)] Estimator Meets Equilibrium Perspective: A Rectified Straight Through Estimator for Binary Neural Networks Training [[code](https:\u002F\u002Fgithub.com\u002FDravenALG\u002FReSTE)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FDravenALG\u002FReSTE?style=social)](https:\u002F\u002Fgithub.com\u002FDravenALG\u002FReSTE)\n- [[ICCV](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FICCV2023\u002Fpapers\u002FLi_I-ViT_Integer-only_Quantization_for_Efficient_Vision_Transformer_Inference_ICCV_2023_paper.pdf)] I-ViT: Integer-only Quantization for Efficient Vision Transformer Inference [[code](https:\u002F\u002Fgithub.com\u002Fzkkli\u002FI-ViT)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fzkkli\u002FI-ViT?style=social)](https:\u002F\u002Fgithub.com\u002Fzkkli\u002FI-ViT)\n- [[ICCV](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FICCV2023\u002Fpapers\u002FFrumkin_Jumping_through_Local_Minima_Quantization_in_the_Loss_Landscape_of_ICCV_2023_paper.pdf)] Jumping through Local Minima: Quantization in the Loss Landscape of Vision Transformers\n- [[ICCV](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FICCV2023\u002Fpapers\u002FChen_Overcoming_Forgetting_Catastrophe_in_Quantization-Aware_Training_ICCV_2023_paper.pdf)] Overcoming Forgetting Catastrophe in Quantization-Aware Training\n- [[ICCV](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FICCV2023\u002Fpapers\u002FLi_Q-Diffusion_Quantizing_Diffusion_Models_ICCV_2023_paper.pdf)] Q-diffusion: Quantizing Diffusion Models [[code](https:\u002F\u002Fgithub.com\u002FXiuyu-Li\u002Fq-diffusion)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FXiuyu-Li\u002Fq-diffusion?style=social)](https:\u002F\u002Fgithub.com\u002FXiuyu-Li\u002Fq-diffusion)\n- 
[[ICCV](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FICCV2023\u002Fpapers\u002FZhang_QD-BEV__Quantization-aware_View-guided_Distillation_for_Multi-view_3D_Object_Detection_ICCV_2023_paper.pdf)] QD-BEV: Quantization-aware View-guided Distillation for Multi-view 3D Object Detection\n- [[ICCV](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FICCV2023\u002Fpapers\u002FLi_RepQ-ViT_Scale_Reparameterization_for_Post-Training_Quantization_of_Vision_Transformers_ICCV_2023_paper.pdf)] RepQ-ViT: Scale Reparameterization for Post-Training Quantization of Vision Transformers [[code](https:\u002F\u002Fgithub.com\u002Fzkkli\u002FRepQ-ViT)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fzkkli\u002FRepQ-ViT?style=social)](https:\u002F\u002Fgithub.com\u002Fzkkli\u002FRepQ-ViT)\n- [[ICCV](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FICCV2023\u002Fpapers\u002FBai_Unified_Data-Free_Compression_Pruning_and_Quantization_without_Fine-Tuning_ICCV_2023_paper.pdf)] Unified Data-Free Compression: Pruning and Quantization without Fine-Tuning\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=3itjR9QxFw)] Analog Bits: Generating Discrete Data using Diffusion Models with Self-Conditioning\n- [[ICLR](https:\u002F\u002Farxiv.org\u002Fabs\u002F2210.17323)] GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers [[code](https:\u002F\u002Fgithub.com\u002FIST-DASLab\u002Fgptq)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FIST-DASLab\u002Fgptq?style=social)](https:\u002F\u002Fgithub.com\u002FIST-DASLab\u002Fgptq)\n- [[ICML](https:\u002F\u002Fopenreview.net\u002Fforum?id=m2S96Qf2R3)] Few-bit Backward: Quantized Gradients of Activation Functions for Memory Footprint Reduction [[code](https:\u002F\u002Fgithub.com\u002FSkoltechAI\u002Ffewbit)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FSkoltechAI\u002Ffewbit?style=social)](https:\u002F\u002Fgithub.com\u002FSkoltechAI\u002Ffewbit)\n- [[ICML](https:\u002F\u002Fopenreview.net\u002Fforum?id=EPnzNJTYsb)] FlexRound: Learnable Rounding based on Element-wise Division for Post-Training Quantization [[code](https:\u002F\u002Fopenreview.net\u002Fattachment?id=-tYCaP0phY_&name=supplementary_material)]\n- [[ICML](https:\u002F\u002Ficml.cc\u002Fvirtual\u002F2023\u002F28295)] GPT-Zip: Deep Compression of Finetuned Large Language Models\n- [[ICML](https:\u002F\u002Fopenreview.net\u002Fforum?id=DihXH24AdY)] Oscillation-free Quantization for Low-bit Vision Transformers [[code](https:\u002F\u002Fgithub.com\u002Fnbasyl\u002FOFQ)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fnbasyl\u002FOFQ?style=social)](https:\u002F\u002Fgithub.com\u002Fnbasyl\u002FOFQ)\n- [[ICML](https:\u002F\u002Farxiv.org\u002Fabs\u002F2307.03738)] QIGen: Generating Efficient Kernels for Quantized Inference on Large Language Models [[code](https:\u002F\u002Fgithub.com\u002FIST-DASLab\u002FQIGen)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FIST-DASLab\u002FQIGen?style=social)](https:\u002F\u002Fgithub.com\u002FIST-DASLab\u002FQIGen)\n- [[ICML](https:\u002F\u002Fopenreview.net\u002Fforum?id=Nqp8A5IDzq)] Quantized Distributed Training of Large Models with Convergence Guarantees\n- [[ICML](https:\u002F\u002Farxiv.org\u002Fabs\u002F2211.10438)] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models [[code](https:\u002F\u002Fgithub.com\u002Fmit-han-lab\u002Fsmoothquant)] 
[![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fmit-han-lab\u002Fsmoothquant?style=social)](https:\u002F\u002Fgithub.com\u002Fmit-han-lab\u002Fsmoothquant)\n- [[ICML](https:\u002F\u002Fopenreview.net\u002Fforum?id=i8tGb1ab1j)] The case for 4-bit precision: k-bit Inference Scaling Laws\n- [[ICML](https:\u002F\u002Fopenreview.net\u002Fforum?id=q1WGm3hItW)] Understanding Int4 Quantization for Language Models: Latency Speedup, Composability, and Failure Cases\n- [[ICML](https:\u002F\u002Farxiv.org\u002Fabs\u002F2301.12017)] Understanding INT4 Quantization for Transformer Models: Latency Speedup, Composability, and Failure Cases [[code](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FDeepSpeed)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fmicrosoft\u002FDeepSpeed?style=social)](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FDeepSpeed)\n- [[NeurIPS](https:\u002F\u002Farxiv.org\u002Fabs\u002F2305.10299)] Binarized Spectral Compressive Imaging [[code](https:\u002F\u002Fgithub.com\u002Fcaiyuanhao1998\u002FBiSCI)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fcaiyuanhao1998\u002FBiSCI?style=social)](https:\u002F\u002Fgithub.com\u002Fcaiyuanhao1998\u002FBiSCI)\n- [[NeurIPS](https:\u002F\u002Fneurips.cc\u002Fvirtual\u002F2023\u002Fposter\u002F72931)] Memory-Efficient Fine-Tuning of Compressed Large Language Models via sub-4-bit Integer Quantization\n- [[NeurIPS](https:\u002F\u002Fneurips.cc\u002Fvirtual\u002F2023\u002Fposter\u002F71880)] PackQViT: Faster Sub-8-bit Vision Transformers via Full and Packed Quantization on the Mobile\n- [[NeurIPS](https:\u002F\u002Fneurips.cc\u002Fvirtual\u002F2023\u002Fposter\u002F71314)] PTQD: Accurate Post-Training Quantization for Diffusion Models [[code](https:\u002F\u002Fgithub.com\u002Fziplab\u002FPTQD)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fziplab\u002FPTQD?style=social)](https:\u002F\u002Fgithub.com\u002Fziplab\u002FPTQD)\n- [[NeurIPS](https:\u002F\u002Fneurips.cc\u002Fvirtual\u002F2023\u002Fposter\u002F70279)] Q-DM: An Efficient Low-bit Quantized Diffusion Model\n- [[NeurIPS](https:\u002F\u002Fneurips.cc\u002Fvirtual\u002F2023\u002Fposter\u002F71815)] QLoRA: Efficient Finetuning of Quantized LLMs [[code](https:\u002F\u002Fgithub.com\u002Fartidoro\u002Fqlora)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fartidoro\u002Fqlora?style=social)](https:\u002F\u002Fgithub.com\u002Fartidoro\u002Fqlora)\n- [[NeurIPS](https:\u002F\u002Fneurips.cc\u002Fvirtual\u002F2023\u002Fposter\u002F69982)] QuIP: 2-Bit Quantization of Large Language Models With Guarantees [[code](https:\u002F\u002Fgithub.com\u002Fjerry-chee\u002FQuIP)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fjerry-chee\u002FQuIP?style=social)](https:\u002F\u002Fgithub.com\u002Fjerry-chee\u002FQuIP)\n- [[NeurIPS](https:\u002F\u002Fneurips.cc\u002Fvirtual\u002F2023\u002Fposter\u002F72396)] Temporal Dynamic Quantization for Diffusion Models\n- [[NeurIPS](https:\u002F\u002Fneurips.cc\u002Fvirtual\u002F2023\u002Fposter\u002F70325)] TexQ: Zero-shot Network Quantization with Texture Feature Distribution Calibration\n- [[NeurIPS](https:\u002F\u002Fneurips.cc\u002Fvirtual\u002F2023\u002Fposter\u002F71526)] Understanding Neural Network Binarization with Forward and Backward Proximal Quantizers\n- [[TIP](https:\u002F\u002Fieeexplore.ieee.org\u002Fabstract\u002Fdocument\u002F10107717)] MBFQuant: A Multiplier-Bitwidth-Fixed, 
Mixed-Precision Quantization Method for Mobile CNN-Based Applications\n- [[TPAMI](https:\u002F\u002Fieeexplore.ieee.org\u002Fabstract\u002Fdocument\u002F9735379)] Optimization-Based Post-Training Quantization With Bit-Split and Stitching\n- [[TPAMI](https:\u002F\u002Fieeexplore.ieee.org\u002Fabstract\u002Fdocument\u002F10122994)] Single-path Bit Sharing for Automatic Loss-aware Model Compression\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F2310.19102)] Atom: Low-bit Quantization for Efficient and Accurate LLM Serving [[code](https:\u002F\u002Fgithub.com\u002Fefeslab\u002FAtom)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fefeslab\u002FAtom?style=social)](https:\u002F\u002Fgithub.com\u002Fefeslab\u002FAtom)\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F2309.14592)] Efficient Post-training Quantization with FP8 Formats [[code](https:\u002F\u002Fgithub.com\u002Fintel\u002Fneural-compressor)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fintel\u002Fneural-compressor?style=social)](https:\u002F\u002Fgithub.com\u002Fintel\u002Fneural-compressor)\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F2310.07147)] QFT: Quantized Full-parameter Tuning of LLMs with Affordable Resources\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F2310.16795)] QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F2304.01089)] RPTQ: Reorder-based Post-training Quantization for Large Language Models [[code](https:\u002F\u002Fgithub.com\u002Fhahnyuan\u002FRPTQ4LLM)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fhahnyuan\u002FRPTQ4LLM?style=social)](https:\u002F\u002Fgithub.com\u002Fhahnyuan\u002FRPTQ4LLM)\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F2310.17723)] ZeroQuant-HERO: Hardware-Enhanced Robust Optimized Post-Training Quantization Framework for W8A8 Transformers\n- [[AAAI](https:\u002F\u002Farxiv.org\u002Fabs\u002F2211.16187)] Quantization-Aware Interval Bound Propagation for Training Certifiably Robust Quantized Neural Networks [[code](https:\u002F\u002Fgithub.com\u002Fmlech26l\u002Fquantization_aware_ibp)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fmlech26l\u002Fquantization_aware_ibp?style=social)](https:\u002F\u002Fgithub.com\u002Fmlech26l\u002Fquantization_aware_ibp)\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=s1KljJpAukm)] PowerQuant: Automorphism Search For Non-Uniform Quantization\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=VWm4o4l3V9e)] Block and Subword-Scaling Floating-Point (BSFP): An Efficient Non-Uniform Quantization For Low Precision Inference\n- [[NeurIPS](https:\u002F\u002Farxiv.org\u002Fabs\u002F2203.14645)] REx: Data-Free Residual Quantization Error Expansion\n- [[NeurIPS](https:\u002F\u002Farxiv.org\u002Fabs\u002F2305.19268)] Intriguing Properties of Quantization at Scale\n- [[NeurIPS](https:\u002F\u002Farxiv.org\u002Fabs\u002F2306.11987)] Training Transformers with 4-bit Integers [[code](https:\u002F\u002Fgithub.com\u002Fxijiu9\u002FTrain_Transformers_with_INT4)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fxijiu9\u002FTrain_Transformers_with_INT4?style=social)](https:\u002F\u002Fgithub.com\u002Fxijiu9\u002FTrain_Transformers_with_INT4)\n- [[NeurIPS](https:\u002F\u002Fpapers.nips.cc\u002Fpaper_files\u002Fpaper\u002F2023\u002Ffile\u002F400a2e6a82520b690810b97fd67fcc4e-Paper-Conference.pdf)] Towards 
Efficient and Accurate Winograd Convolution via Full Quantization\n- [[NeurIPS](https:\u002F\u002Fpapers.nips.cc\u002Fpaper_files\u002Fpaper\u002F2023\u002Ffile\u002Fc48bc80aa5d3cbbdd712d1cc107b8319-Paper-Conference.pdf)] Pruning vs Quantization: Which is Better? [[code](https:\u002F\u002Fgithub.com\u002FQualcomm-AI-research\u002Fpruning-vs-quantization)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FQualcomm-AI-research\u002Fpruning-vs-quantization?style=social)](https:\u002F\u002Fgithub.com\u002FQualcomm-AI-research\u002Fpruning-vs-quantization)\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=7L2mgi0TNEP)] A^2Q: Aggregation-Aware Quantization for Graph Neural Networks\n\n### 2022\n\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=5xEgrl_5FAJ)] BiBERT: Accurate Fully Binarized BERT [[code](https:\u002F\u002Fgithub.com\u002Fhtqin\u002FBiBERT)]\n- [[IJCAI](https:\u002F\u002Farxiv.org\u002Fabs\u002F2202.06483)] BiFSMN: Binary Neural Network for Keyword Spotting [[code](https:\u002F\u002Fgithub.com\u002Fhtqin\u002FBiFSMN)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fhtqin\u002FBiFSMN?style=social)](https:\u002F\u002Fgithub.com\u002Fhtqin\u002FBiFSMN)\n- [[ACM MM](https:\u002F\u002Farxiv.org\u002Fabs\u002F2303.14341)] Towards Accurate Post-Training Quantization for Vision Transformer\n- [[ACL](https:\u002F\u002Faclanthology.org\u002F2022.acl-long.331)] Compression of Generative Pre-trained Language Models via Quantization\n- [[ACM Trans. Des. Autom. Electron. Syst.](https:\u002F\u002Fweb.archive.org\u002Fweb\u002F20220722092230id_\u002Fhttps:\u002F\u002Fdl.acm.org\u002Fdoi\u002Fpdf\u002F10.1145\u002F3549535)] Structured Dynamic Precision for Deep Neural Networks Quantization\n- [[ASE](https:\u002F\u002Fdl.acm.org\u002Fdoi\u002Fabs\u002F10.1145\u002F3551349.3556916)] QVIP: An ILP-based Formal Verification Approach for Quantized Neural Networks\n- [[Applied Soft Computing](https:\u002F\u002Fwww.sciencedirect.com\u002Fscience\u002Farticle\u002Fpii\u002FS1568494622005038)] A neural network compression method based on knowledge-distillation and parameter quantization for the bearing fault diagnosis\n- [[CCF Transactions on High Performance Computing](https:\u002F\u002Flink.springer.com\u002Farticle\u002F10.1007\u002Fs42514-022-00121-z)] An efficient segmented quantization for graph neural networks\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2022W\u002FECV\u002Fpapers\u002FJiang_A_Low_Memory_Footprint_Quantized_Neural_Network_for_Depth_Completion_CVPRW_2022_paper.pdf)] A Low Memory Footprint Quantized Neural Network for Depth Completion of Very Sparse Time-of-Flight Depth Maps\n- [[CVPR](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F9879477\u002F)] BppAttack: Stealthy and Efficient Trojan Attacks against Deep Neural Networks via Image Quantization and Contrastive Adversarial Learning [[code](https:\u002F\u002Fgithub.com\u002FRU-System-Software-and-Security\u002FBppAttack)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FRU-System-Software-and-Security\u002FBppAttack?style=social)](https:\u002F\u002Fgithub.com\u002FRU-System-Software-and-Security\u002FBppAttack)\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2022\u002Fpapers\u002FChikin_Data-Free_Network_Compression_via_Parametric_Non-Uniform_Mixed_Precision_Quantization_CVPR_2022_paper.pdf)] Data-Free Network Compression via Parametric Non-uniform Mixed Precision 
Quantization\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2022\u002Fpapers\u002FLiu_Instance-Aware_Dynamic_Neural_Network_Quantization_CVPR_2022_paper.pdf)] Instance-Aware Dynamic Neural Network Quantization\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2022\u002Fhtml\u002FZhong_IntraQ_Learning_Synthetic_Images_With_Intra-Class_Heterogeneity_for_Zero-Shot_Network_CVPR_2022_paper.html)] IntraQ: Learning Synthetic Images With Intra-Class Heterogeneity for Zero-Shot Network Quantization [[code](https:\u002F\u002Fgithub.com\u002Fzysxmu\u002FIntraQ)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fzysxmu\u002FIntraQ?style=social)](https:\u002F\u002Fgithub.com\u002Fzysxmu\u002FIntraQ)\n- [[CVPR](https:\u002F\u002Farxiv.org\u002Fabs\u002F2203.17008)] It's All In the Teacher: Zero-Shot Quantization Brought Closer to the Teacher [[code](https:\u002F\u002Fgithub.com\u002Fiamkanghyunchoi\u002Fait)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fiamkanghyunchoi\u002Fait?style=social)](https:\u002F\u002Fgithub.com\u002Fiamkanghyunchoi\u002Fait)\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2022\u002Fpapers\u002FWang_Learnable_Lookup_Table_for_Neural_Network_Quantization_CVPR_2022_paper.pdf)] Learnable Lookup Table for Neural Network Quantization [[code](https:\u002F\u002Fgithub.com\u002FThe-Learning-And-Vision-Atelier-LAVA\u002FLLT)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FThe-Learning-And-Vision-Atelier-LAVA\u002FLLT?style=social)](https:\u002F\u002Fgithub.com\u002FThe-Learning-And-Vision-Atelier-LAVA\u002FLLT)\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2022\u002Fpapers\u002FJeon_Mr.BiQ_Post-Training_Non-Uniform_Quantization_Based_on_Minimizing_the_Reconstruction_Error_CVPR_2022_paper.pdf)] Mr.BiQ: Post-Training Non-Uniform Quantization based on Minimizing the Reconstruction Error\n- [[CVPR](https:\u002F\u002Farxiv.org\u002Fabs\u002F2111.14826)] Nonuniform-to-Uniform Quantization: Towards Accurate Quantization via Generalized Straight-Through Estimation [[code](https:\u002F\u002Fgithub.com\u002Fliuzechun\u002FNonuniform-to-Uniform-Quantization)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fliuzechun\u002FNonuniform-to-Uniform-Quantization?style=social)](https:\u002F\u002Fgithub.com\u002Fliuzechun\u002FNonuniform-to-Uniform-Quantization)\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2022\u002Fhtml\u002FGuo_RecDis-SNN_Rectifying_Membrane_Potential_Distribution_for_Directly_Training_Spiking_Neural_CVPR_2022_paper.html)] RecDis-SNN: Rectifying Membrane Potential Distribution for Directly Training Spiking Neural Networks\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2022W\u002FECV\u002Fpapers\u002Fvan_Baalen_Simulated_Quantization_Real_Power_Savings_CVPRW_2022_paper.pdf)] Simulated Quantization, Real Power Savings\n- [[EANN](https:\u002F\u002Flink.springer.com\u002Fchapter\u002F10.1007\u002F978-3-031-08223-8_35)] A Robust, Quantization-Aware Training Method for Photonic Neural Networks\n- [[ECCV](https:\u002F\u002Fwww.ecva.net\u002Fpapers\u002Feccv_2022\u002Fpapers_ECCV\u002Fpapers\u002F136720017.pdf)] BASQ: Branch-wise Activation-clipping Search Quantization for Sub-4-bit Neural Networks [[code](https:\u002F\u002Fgithub.com\u002FHanByulKim\u002FBASQ)] [![GitHub 
stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FHanByulKim\u002FBASQ?style=social)](https:\u002F\u002Fgithub.com\u002FHanByulKim\u002FBASQ)\n- [[ECCV](https:\u002F\u002Farxiv.org\u002Fabs\u002F2203.08368)] Mixed-Precision Neural Network Quantization via Learned Layer-Wise Importance [[code](https:\u002F\u002Fgithub.com\u002F1hunters\u002FLIMPQ)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002F1hunters\u002FLIMPQ?style=social)](https:\u002F\u002Fgithub.com\u002F1hunters\u002FLIMPQ)\n- [[ECCV](https:\u002F\u002Flink.springer.com\u002Fchapter\u002F10.1007\u002F978-3-031-20071-7_37)] Neuromorphic Data Augmentation for Training Spiking Neural Networks. [[code]](https:\u002F\u002Fgithub.com\u002FIntelligent-Computing-Lab-Yale\u002FNDA_SNN)\n- [[ECCV](https:\u002F\u002Fwww.ecva.net\u002Fpapers\u002Feccv_2022\u002Fpapers_ECCV\u002Fpapers\u002F136710657.pdf)] Non-Uniform Step Size Quantization for Accurate Post-Training Quantization\n- [[ECCV](https:\u002F\u002Fwww.ecva.net\u002Fpapers\u002Feccv_2022\u002Fpapers_ECCV\u002Fpapers\u002F136710154.pdf)] Patch Similarity Aware Data-Free Quantization for Vision Transformers [[code](https:\u002F\u002Fgithub.com\u002Fzkkli\u002Fpsaq-vit)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fzkkli\u002Fpsaq-vit?style=social)](https:\u002F\u002Fgithub.com\u002Fzkkli\u002Fpsaq-vit)\n- [[ECCV](https:\u002F\u002Fwww.ecva.net\u002Fpapers\u002Feccv_2022\u002Fpapers_ECCV\u002Fpapers\u002F136720190.pdf)] PTQ4ViT: Post-Training Quantization for Vision Transformers with Twin Uniform Quantization [[code](https:\u002F\u002Fgithub.com\u002Fhahnyuan\u002Fptq4vit)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fhahnyuan\u002Fptq4vit?style=social)](https:\u002F\u002Fgithub.com\u002Fhahnyuan\u002Fptq4vit)\n- [[ECCV](https:\u002F\u002Fwww.ecva.net\u002Fpapers\u002Feccv_2022\u002Fpapers_ECCV\u002Fpapers\u002F136720156.pdf)] RDO-Q: Extremely Fine-Grained Channel-Wise Quantization via Rate-Distortion Optimization\n- [[ECCV](https:\u002F\u002Fwww.ecva.net\u002Fpapers\u002Feccv_2022\u002Fpapers_ECCV\u002Fpapers\u002F136710207.pdf)] Symmetry Regularization and Saturating Nonlinearity for Robust Quantization\n- [[ECCV](https:\u002F\u002Fwww.ecva.net\u002Fpapers\u002Feccv_2022\u002Fpapers_ECCV\u002Fpapers\u002F136710726.pdf)] Towards Accurate Network Quantization with Equivalent Smooth Regularizer\n- [[ECCV](https:\u002F\u002Fwww.ecva.net\u002Fpapers\u002Feccv_2022\u002Fpapers_ECCV\u002Fpapers\u002F136710416.pdf)] Weight Fixing Networks. 
[[code]](https:\u002F\u002Fgithub.com\u002Fsubiawaud\u002FWeight_Fix_Networks)\n- [[ESE](https:\u002F\u002Flink.springer.com\u002Farticle\u002F10.1007\u002Fs10664-022-10202-w)] DiverGet: a Search-Based Software Testing approach for Deep Neural Network Quantization assessment\n- [[Electronics](https:\u002F\u002Fwww.mdpi.com\u002F2079-9292\u002F11\u002F6\u002F945)] A Survey on Efficient Convolutional Neural Networks and Hardware Acceleration\n- [[FPGA](https:\u002F\u002Fdl.acm.org\u002Fdoi\u002Fabs\u002F10.1145\u002F3490422.3502364)] FILM-QNN: Efficient FPGA Acceleration of Deep Neural Networks with Intra-Layer, Mixed-Precision Quantization\n- [[ICCRD](https:\u002F\u002Fieeexplore.ieee.org\u002Fabstract\u002Fdocument\u002F9730411\u002Fauthors)] Post Training Quantization after Neural Network\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=shpkpVXzo3h)] 8-bit Optimizers via Block-wise Quantization [[code](https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Fbitsandbytes)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Ffacebookresearch\u002Fbitsandbytes?style=social)](https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Fbitsandbytes)\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=_CfpJazzXT2)] F8Net: Fixed-Point 8-bit Only Multiplication for Network Quantization\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=kF9DZQQrU0w)] Information Bottleneck: Exact Analysis of (Quantized) Neural Networks [[code](https:\u002F\u002Fgithub.com\u002FStephanLorenzen\u002FExactIBAnalysisInQNNs)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FStephanLorenzen\u002FExactIBAnalysisInQNNs?style=social)](https:\u002F\u002Fgithub.com\u002FStephanLorenzen\u002FExactIBAnalysisInQNNs)\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=ySQH0oDyp7)] Optimal ANN-SNN Conversion for High-accuracy and Ultra-low-latency Spiking Neural Networks\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=ySQH0oDyp7)] QDrop: Randomly Dropping Quantization for Extremely Low-bit Post-Training Quantization [[code](https:\u002F\u002Fgithub.com\u002Fwimh966\u002FQDrop)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fwimh966\u002FQDrop?style=social)](https:\u002F\u002Fgithub.com\u002Fwimh966\u002FQDrop)\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=JXhROKNZzOc)] SQuant: On-the-Fly Data-Free Quantization via Diagonal Hessian Approximation. 
[[code](https:\u002F\u002Fgithub.com\u002Fclevercool\u002FSQuant)]\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=3HJOA-1hb0e)] Toward Efficient Low-Precision Training: Data Format Optimization and Hysteresis Quantization\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=7udZAsEzd60)] VC dimension of partially quantized neural networks in the overparametrized regime\n- [[ICML](https:\u002F\u002Fproceedings.mlr.press\u002Fv162\u002Fdong22a.html)] Finding the Task-Optimal Low-Bit Sub-Distribution in Deep Neural Networks [[code](https:\u002F\u002Fgithub.com\u002FRunpeiDong\u002FDGMS)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FRunpeiDong\u002FDGMS?style=social)](https:\u002F\u002Fgithub.com\u002FRunpeiDong\u002FDGMS)\n- [[ICML](https:\u002F\u002Fproceedings.mlr.press\u002Fv162\u002Fliu22v.html)] GACT: Activation Compressed Training for Generic Network Architectures [[code](https:\u002F\u002Fgithub.com\u002FLiuXiaoxuanPKU\u002FGACT-ICML)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FLiuXiaoxuanPKU\u002FGACT-ICML?style=social)](https:\u002F\u002Fgithub.com\u002FLiuXiaoxuanPKU\u002FGACT-ICML)\n- [[ICML](https:\u002F\u002Fproceedings.mlr.press\u002Fv162\u002Fnagel22a\u002Fnagel22a.pdf)] Overcoming Oscillations in Quantization-Aware Training [[code](https:\u002F\u002Fgithub.com\u002Fqualcomm-ai-research\u002Foscillations-qat)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fqualcomm-ai-research\u002Foscillations-qat?style=social)](https:\u002F\u002Fgithub.com\u002Fqualcomm-ai-research\u002Foscillations-qat)\n- [[ICML](https:\u002F\u002Fproceedings.mlr.press\u002Fv162\u002Fhuang22h.html)] SDQ: Stochastic Differentiable Quantization with Mixed Precision\n- [[ICPR](https:\u002F\u002Fieeexplore.ieee.org\u002Fabstract\u002Fdocument\u002F9956237)] Layer-Wise Data-Free CNN Compression\n- [[IEEE Internet of Things Journal](https:\u002F\u002Fieeexplore.ieee.org\u002Fabstract\u002Fdocument\u002F9915794)] FedQNN: A Computation–Communication-Efficient Federated Learning Framework for IoT With Low-Bitwidth Neural Network Quantization\n- [[IJCAI](https:\u002F\u002Farxiv.org\u002Fabs\u002F2111.13824)] FQ-ViT: Post-Training Quantization for Fully Quantized Vision Transformer [[code](https:\u002F\u002Fgithub.com\u002Fmegvii-research\u002FFQ-ViT)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fmegvii-research\u002FFQ-ViT?style=social)](https:\u002F\u002Fgithub.com\u002Fmegvii-research\u002FFQ-ViT)\n- [[IJCAI](https:\u002F\u002Fwww.ijcai.org\u002Fproceedings\u002F2022\u002F504)] MultiQuant: Training Once for Multi-bit Quantization of Neural Networks\n- [[IJCAI](https:\u002F\u002Fwww.ijcai.org\u002Fproceedings\u002F2022\u002F219)] RAPQ: Rescuing Accuracy for Power-of-Two Low-bit Post-training Quantization [[code](https:\u002F\u002Fgithub.com\u002Fbillamihom\u002Frapq)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fbillamihom\u002Frapq?style=social)](https:\u002F\u002Fgithub.com\u002Fbillamihom\u002Frapq)\n- [[IJCNN](https:\u002F\u002Fieeexplore.ieee.org\u002Fabstract\u002Fdocument\u002F9892671)] Accuracy Evaluation of Transposed Convolution-Based Quantized Neural Networks\n- [[IJNS](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2209.15317.pdf)] Convolutional Neural Networks Quantization with 
Attention\n- [[ITSM](https:\u002F\u002Fieeexplore.ieee.org\u002Fabstract\u002Fdocument\u002F9827546)] Edge–Artificial Intelligence-Powered Parking Surveillance With Quantized Neural Networks\n- [[Intelligent Automation & Soft Computing](https:\u002F\u002Fweb.p.ebscohost.com\u002Fabstract?direct=true&profile=ehost&scope=site&authtype=crawler&jrnl=10798587&AN=155230773&h=buFz%2f8gWWhfyGU%2btyHURhybWlmqZvGCIyITNuefG%2bIwBHoSqNwo4CVrCT7hsuZbtZ%2brDTVnLfGgNR6EX8e6%2fGg%3d%3d&crl=c&resultNs=AdminWebAuth&resultLocal=ErrCrlNotAuth&crlhashurl=login.aspx%3fdirect%3dtrue%26profile%3dehost%26scope%3dsite%26authtype%3dcrawler%26jrnl%3d10798587%26AN%3d155230773)] A Resource-Efficient Convolutional Neural Network Accelerator Using Fine-Grained Logarithmic Quantization\n- [[LNAI](https:\u002F\u002Flink.springer.com\u002Fchapter\u002F10.1007\u002F978-3-031-04083-2_14)] ECQ$^x$: Explainability-Driven Quantization for Low-Bit and Sparse DNNs\n- [[MICRO](https:\u002F\u002Fieeexplore.ieee.org\u002Fabstract\u002Fdocument\u002F9923832)] ANT: Exploiting Adaptive Numerical Data Type for Low-bit Deep Neural Network Quantization\n- [[NeurIPS](https:\u002F\u002Fproceedings.neurips.cc\u002Fpaper_files\u002Fpaper\u002F2022\u002Fhash\u002F20f94998511f25bb6378cae0e098bc46-Abstract-Conference.html)] BiMLP: Compact Binary Architectures for Vision Multi-Layer Perceptrons [[code](https:\u002F\u002Fgitee.com\u002Fmindspore\u002Fmodels\u002Ftree\u002Fmaster\u002Fresearch\u002Fcv\u002FBiMLP)]\n- [[NeurIPS](https:\u002F\u002Fnips.cc\u002FConferences\u002F2022\u002FSchedule?showEvent=55032)] BiT: Robustly Binarized Multi-distilled Transformer [[code](https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Fbit)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Ffacebookresearch\u002Fbit?style=social)](https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Fbit)\n- [[NeurIPS](https:\u002F\u002Fnips.cc\u002FConferences\u002F2022\u002FSchedule?showEvent=55162)] ClimbQ: Class Imbalanced Quantization Enabling Robustness on Efficient Inferences\n- [[NeurIPS](https:\u002F\u002Fnips.cc\u002FConferences\u002F2022\u002FSchedule?showEvent=54104)] Entropy-Driven Mixed-Precision Quantization for Deep Network Design\n- [[NeurIPS](https:\u002F\u002Fnips.cc\u002FConferences\u002F2022\u002FSchedule?showEvent=53073)] FP8 Quantization: The Power of the Exponent [[code](https:\u002F\u002Fgithub.com\u002Fqualcomm-ai-research\u002Ffp8-quantization)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fqualcomm-ai-research\u002Ffp8-quantization?style=social)](https:\u002F\u002Fgithub.com\u002Fqualcomm-ai-research\u002Ffp8-quantization)\n- [[NeurIPS](https:\u002F\u002Fnips.cc\u002FConferences\u002F2022\u002FSchedule?showEvent=54389)] Leveraging Inter-Layer Dependency for Post -Training Quantization\n- [[NeurIPS](https:\u002F\u002Farxiv.org\u002Fabs\u002F2208.07339)] LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale [[code](https:\u002F\u002Fgithub.com\u002Ftimdettmers\u002Fbitsandbytes)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Ftimdettmers\u002Fbitsandbytes?style=social)](https:\u002F\u002Fgithub.com\u002Ftimdettmers\u002Fbitsandbytes)\n- [[NeurIPS](https:\u002F\u002Fnips.cc\u002FConferences\u002F2022\u002FSchedule?showEvent=53412)] Optimal Brain Compression: A Framework for Accurate Post-Training Quantization and Pruning [[code](https:\u002F\u002Fgithub.com\u002Fist-daslab\u002Fobc)] [![GitHub 
stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fist-daslab\u002Fobc?style=social)](https:\u002F\u002Fgithub.com\u002Fist-daslab\u002Fobc)\n- [[NeurIPS](https:\u002F\u002Fopenreview.net\u002Fforum?id=fU-m9kQe0ke)] Q-ViT: Accurate and Fully Quantized Low-bit Vision Transformer [[code](https:\u002F\u002Fgithub.com\u002Fyanjingli0202\u002Fq-vit)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fyanjingli0202\u002Fq-vit?style=social)](https:\u002F\u002Fgithub.com\u002Fyanjingli0202\u002Fq-vit)\n- [[NeurIPS](https:\u002F\u002Fnips.cc\u002FConferences\u002F2022\u002FSchedule?showEvent=54812)] Redistribution of Weights and Activations for AdderNet Quantization\n- [[NeurIPS](https:\u002F\u002Fnips.cc\u002FConferences\u002F2022\u002FSchedule?showEvent=53476)] Theoretically Better and Numerically Faster Distributed Optimization with Smoothness-Aware Quantization Techniques\n- [[NeurIPS](https:\u002F\u002Fnips.cc\u002FConferences\u002F2022\u002FSchedule?showEvent=53407)] Towards Efficient Post-training Quantization of Pre-trained Language Models\n- [[NeurIPS](https:\u002F\u002Fnips.cc\u002FConferences\u002F2022\u002FSchedule?showEvent=54407)] ZeroQuant: Efficient and Affordable Post-Training Quantization for Large-Scale Transformers [[code](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FDeepSpeed)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fmicrosoft\u002FDeepSpeed?style=social)](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FDeepSpeed)\n- [[Neural Networks](https:\u002F\u002Fwww.sciencedirect.com\u002Fscience\u002Farticle\u002Fpii\u002FS0893608022003598)] Quantization-aware training for low precision photonic neural networks\n- [[Neurocomputing](https:\u002F\u002Fwww.sciencedirect.com\u002Fscience\u002Farticle\u002Fpii\u002FS0925231222008293)] EPQuant: A Graph Neural Network compression approach based on product quantization\n- [[Ocean Engineering](https:\u002F\u002Fwww.sciencedirect.com\u002Fscience\u002Farticle\u002Fpii\u002FS0029801822017887)] Neural network based adaptive sliding mode tracking control of autonomous surface vehicles with input quantization and saturation\n- [[PPoPP](https:\u002F\u002Fdl.acm.org\u002Fdoi\u002Fabs\u002F10.1145\u002F3503221.3508408)] QGTC: accelerating quantized graph neural networks via GPU tensor core\n- [[TCCN](https:\u002F\u002Fieeexplore.ieee.org\u002Fabstract\u002Fdocument\u002F9703679)] Low-Bitwidth Convolutional Neural Networks for Wireless Interference Identification\n- [[TCSVT](https:\u002F\u002Fieeexplore.ieee.org\u002Fabstract\u002Fdocument\u002F9849674)] An Efficient Implementation of Convolutional Neural Network With CLIP-Q Quantization on FPGA\n- [[TGARS](https:\u002F\u002Fieeexplore.ieee.org\u002Fabstract\u002Fdocument\u002F9362309)] Accelerating Convolutional Neural Network-Based Hyperspectral Image Classification by Step Activation Quantization\n- [[TODAES](https:\u002F\u002Fdl.acm.org\u002Fdoi\u002F10.1145\u002F3498328)] Dynamic Quantization Range Control for Analog-in-Memory Neural Networks Acceleration\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2206.07741.pdf)] Edge Inference with Fully Differentiable Quantized Mixed Precision Neural Networks\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2201.08442)] Neural network quantization with ai model efficiency toolkit (aimet)\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2201.07703.pdf)] Q-ViT: Fully Differentiable Quantization for Vision Transformer\n- 
[[arXiv](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2206.07527.pdf)] QONNX: Representing Arbitrary-Precision Quantized Neural Networks\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2202.05048.pdf)] Quantune: Post-training Quantization of Convolutional Neural Networks using Extreme Gradient Boosting for Fast Deployment\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2211.10438.pdf)] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models [[code](https:\u002F\u002Fgithub.com\u002Fmit-han-lab\u002Fsmoothquant)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fmit-han-lab\u002Fsmoothquant?style=social)](https:\u002F\u002Fgithub.com\u002Fmit-han-lab\u002Fsmoothquant)\n- [[arXiv](http:\u002F\u002Farxiv.org\u002Fabs\u002F2206.15408)] Sub-8-Bit Quantization Aware Training for 8-Bit Neural Network Accelerator with On-Device Speech Recognition\n- [[tinyML Research Symposium](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2203.05025.pdf)] Power-of-Two Quantization for Low Bitwidth and Hardware Compliant Neural Networks\n- [[ECCV](https:\u002F\u002Farxiv.org\u002Fabs\u002F2207.10345)] CADyQ: Content-Aware Dynamic Quantization for Image Super-Resolution\n- [[ECCV](https:\u002F\u002Farxiv.org\u002Fabs\u002F2207.10188)] Bitwidth-Adaptive Quantization-Aware Neural Network Training: A Meta-Learning Approach [[code](https:\u002F\u002Fgithub.com\u002Fjsjs0369\u002FMEBQAT)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fjsjs0369\u002FMEBQAT?style=social)](https:\u002F\u002Fgithub.com\u002Fjsjs0369\u002FMEBQAT)\n- [[ECCV](https:\u002F\u002Farxiv.org\u002Fabs\u002F2007.07743)] Fine-grained Data Distribution Alignment for Post-Training Quantization [[code](https:\u002F\u002Fgithub.com\u002Fzysxmu\u002FFDDA)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fzysxmu\u002FFDDA?style=social)](https:\u002F\u002Fgithub.com\u002Fzysxmu\u002FFDDA)\n- [[ICML](https:\u002F\u002Farxiv.org\u002Fabs\u002F2206.06501)] Optimal Clipping and Magnitude-aware Differentiation for Improved Quantization-aware Training\n\n### 2021\n\n- [[CVPR](https:\u002F\u002Farxiv.org\u002Fabs\u002F2103.01049)] Diversifying Sample Generation for Accurate Data-Free Quantization\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=9QLRCVysdlO)] BiPointNet: Binary Neural Network for Point Clouds [[code](https:\u002F\u002Fgithub.com\u002Fhtqin\u002FBiPointNet)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fhtqin\u002FBiPointNet?style=social)](https:\u002F\u002Fgithub.com\u002Fhtqin\u002FBiPointNet)\n- [[ICML](http:\u002F\u002Fproceedings.mlr.press\u002Fv139\u002Fliu21t\u002Fliu21t.pdf)] How Do Adam and Training Strategies Help BNNs Optimization? 
[[code](https:\u002F\u002Fgithub.com\u002Fliuzechun\u002FAdamBNN)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fliuzechun\u002FAdamBNN?style=social)](https:\u002F\u002Fgithub.com\u002Fliuzechun\u002FAdamBNN)\n- [[AAAI](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2010.02778)] Compressing Deep Convolutional Neural Networks by Stacking Low-Dimensional Binary Convolution Filters\n- [[AAAI](https:\u002F\u002Fwww.aaai.org\u002FAAAI21Papers\u002FAAAI-7144.ZhaoK.pdf)] Distribution Adaptive INT8 Quantization for Training CNNs\n- [[AAAI](https:\u002F\u002Fwww.semanticscholar.org\u002Fpaper\u002FFracBits%3A-Mixed-Precision-Quantization-via-Yang-Jin\u002Fcb219432863778fa173925d51fbf02af1d17ad98)] FracBits: Mixed Precision Quantization via Fractional Bit-Widths\n- [[AAAI](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2010.02577)] Memory and Computation-Efficient Kernel SVM via Binary Embedding and Ternary Coefficients\n- [[AAAI](https:\u002F\u002Fwww.aaai.org\u002FAAAI21Papers\u002FAAAI-1054.HuP.pdf)] OPQ: Compressing Deep Neural Networks with One-shot Pruning-Quantization\n- [[AAAI](https:\u002F\u002Fojs.aaai.org\u002Findex.php\u002FAAAI\u002Farticle\u002Fview\u002F16474\u002F16281)] Optimizing Information Theory Based Bitwise Bottlenecks for Efficient Mixed-Precision Activation Quantization\n- [[AAAI](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2002.09049)] Post-training Quantization with Multiple Points: Mixed Precision without Mixed Precision\n- [[AAAI](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2012.08185)] Scalable Verification of Quantized Neural Networks [[code](https:\u002F\u002Fgithub.com\u002Fmlech26l\u002Fqnn_robustness_benchmarks)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fmlech26l\u002Fqnn_robustness_benchmarks?style=social)](https:\u002F\u002Fgithub.com\u002Fmlech26l\u002Fqnn_robustness_benchmarks)\n- [[AAAI](https:\u002F\u002Farxiv.org\u002Fabs\u002F2009.14502)] Stochastic Precision Ensemble: Self-Knowledge Distillation for Quantized Deep Neural Networks\n- [[AAAI](https:\u002F\u002Fwww.aaai.org\u002FAAAI21Papers\u002FAAAI-4473.LiY.pdf)] TRQ: Ternary Neural Networks with Residual Quantization\n- [[AAAI](https:\u002F\u002Fojs.aaai.org\u002Findex.php\u002FAAAI\u002Farticle\u002Fview\u002F17434\u002F17241)] Uncertainty Quantification in CNN through the Bootstrap of Convex Neural Networks\n- [[AAAI](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1907.05911)] Vector Quantized Bayesian Neural Network Inference for Data Streams\n- [[ACL](https:\u002F\u002Faclanthology.org\u002F2021.findings-acl.363)] On the Distribution, Sparsity, and Inference-time Quantization of Attention Values in Transformers\n- [[ACM MM](https:\u002F\u002Farxiv.org\u002Fabs\u002F2011.14265)] Fully Quantized Image Super-Resolution Networks [[code](https:\u002F\u002Fgithub.com\u002Fbillhhh\u002FFQSR)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fbillhhh\u002FFQSR?style=social)](https:\u002F\u002Fgithub.com\u002Fbillhhh\u002FFQSR)\n- [[ACM MM](https:\u002F\u002Fdl.acm.org\u002Fdoi\u002F10.1145\u002F3474085.3475224)] VQMG: Hierarchical Vector Quantised and Multi-hops Graph Reasoning for Explicit 
Representation Learning\n- [[CVPR](https:\u002F\u002Farxiv.org\u002Fabs\u002F2012.15823)] Binary Graph Neural Networks [[code](https:\u002F\u002Fgithub.com\u002Fmbahri\u002Fbinary_gnn)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fmbahri\u002Fbinary_gnn?style=social)](https:\u002F\u002Fgithub.com\u002Fmbahri\u002Fbinary_gnn)\n- [[CVPR](https:\u002F\u002Farxiv.org\u002Fabs\u002F2103.07156)] Learnable Companding Quantization for Accurate Low-bit Neural Networks\n- [[CVPR](https:\u002F\u002Farxiv.org\u002Fabs\u002F2104.00903)] Network Quantization with Element-wise Gradient Scaling [[code](https:\u002F\u002Fgithub.com\u002Fcvlab-yonsei\u002FEWGS)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fcvlab-yonsei\u002FEWGS?style=social)](https:\u002F\u002Fgithub.com\u002Fcvlab-yonsei\u002FEWGS)\n- [[CVPR](https:\u002F\u002Farxiv.org\u002Fabs\u002F2010.15703)] Permute, Quantize, and Fine-tune: Efficient Compression of Neural Networks [[code](https:\u002F\u002Fgithub.com\u002Fuber-research\u002Fpermute-quantize-finetune)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fuber-research\u002Fpermute-quantize-finetune?style=social)](https:\u002F\u002Fgithub.com\u002Fuber-research\u002Fpermute-quantize-finetune)\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2022\u002Fpapers\u002FZhang_PokeBNN_A_Binary_Pursuit_of_Lightweight_Accuracy_CVPR_2022_paper.pdf)] PokeBNN: A Binary Pursuit of Lightweight Accuracy [[code](https:\u002F\u002Fgithub.com\u002Fgoogle\u002Faqt)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fgoogle\u002Faqt?style=social)](https:\u002F\u002Fgithub.com\u002Fgoogle\u002Faqt)\n- [[CVPR](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2021\u002Fhtml\u002FShen_S2-BNN_Bridging_the_Gap_Between_Self-Supervised_Real_and_1-Bit_Neural_CVPR_2021_paper.html)] S2-bnn: Bridging the gap between self-supervised real and 1-bit neural networks via guided distribution calibration [[code](https:\u002F\u002Fgithub.com\u002Fszq0214\u002FS2-BNN)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fszq0214\u002FS2-BNN?style=social)](https:\u002F\u002Fgithub.com\u002Fszq0214\u002FS2-BNN)\n- [[CVPR](https:\u002F\u002Farxiv.org\u002Fabs\u002F2103.15263)] Zero-shot Adversarial Quantization [[code](https:\u002F\u002Fgithub.com\u002FFLHonker\u002FZAQ-code)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FFLHonker\u002FZAQ-code?style=social)](https:\u002F\u002Fgithub.com\u002FFLHonker\u002FZAQ-code)\n- [[ECCV](https:\u002F\u002Fwww.ecva.net\u002Fpapers\u002Feccv_2020\u002Fpapers_ECCV\u002Fpapers\u002F123700562.pdf)] PAMS: Quantized Super-Resolution via Parameterized Max Scale [[code](https:\u002F\u002Fgithub.com\u002Fcolorjam\u002FPAMS)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fcolorjam\u002FPAMS?style=social)](https:\u002F\u002Fgithub.com\u002Fcolorjam\u002FPAMS)\n- [[ICCV](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FICCV2021\u002Fhtml\u002FLi_MixMix_All_You_Need_for_Data-Free_Compression_Are_Feature_and_ICCV_2021_paper.html)] MixMix: All You Need for Data-Free Compression Are Feature and Data Mixing\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=POWv6hDd9XH)] BRECQ: Pushing the Limit of Post-Training Quantization by Block Reconstruction [[code](https:\u002F\u002Fgithub.com\u002Fyhhhli\u002FBRECQ)] [![GitHub 
stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fyhhhli\u002FBRECQ?style=social)](https:\u002F\u002Fgithub.com\u002Fyhhhli\u002FBRECQ)\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=TiXl51SCNw8)] BSQ: Exploring Bit-Level Sparsity for Mixed-Precision Neural Network Quantization [[code](https:\u002F\u002Fgithub.com\u002Fyanghr\u002FBSQ)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fyanghr\u002FBSQ?style=social)](https:\u002F\u002Fgithub.com\u002Fyanghr\u002FBSQ)\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=NSBrFgJAHg)] Degree-Quant: Quantization-Aware Training for Graph Neural Networks\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=MxaY4FzOTa)] High-Capacity Expert Binary Networks [[code](https:\u002F\u002Fgithub.com\u002F1adrianb\u002Fexpert-binary-networks)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002F1adrianb\u002Fexpert-binary-networks?style=social)](https:\u002F\u002Fgithub.com\u002F1adrianb\u002Fexpert-binary-networks)\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=3SV-ZePhnZM)] Incremental few-shot learning via vector quantization in deep embedded space\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=U_mat0b9iv)] Multi-Prize Lottery Ticket Hypothesis: Finding Accurate Binary Neural Networks by Pruning A Randomly Weighted Network [[code](https:\u002F\u002Fgithub.com\u002Fchrundle\u002Fbiprop)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fchrundle\u002Fbiprop?style=social)](https:\u002F\u002Fgithub.com\u002Fchrundle\u002Fbiprop)\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=EoFNy62JGd)] Neural gradients are near-lognormal: improved quantized and sparse training\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=sTeoJiB4uR)] Reducing the Computational Cost of Deep Generative Models with Binary Neural Networks\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=Qr0aRliE_Hb)] Simple Augmentation Goes a Long Way: ADRL for DNN Quantization\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=pBqLS-7KYAF)] Sparse Quantized Spectral Clustering\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=dV19Yyi1fS3)] Training with Quantization Noise for Extreme Model Compression [[code](https:\u002F\u002Fgithub.com\u002Fpytorch\u002Ffairseq\u002Ftree\u002Fmaster\u002Fexamples\u002Fquant_noise)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fpytorch\u002Ffairseq?style=social)](https:\u002F\u002Fgithub.com\u002Fpytorch\u002Ffairseq)\n- [[ICLR](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2007.13242.pdf)] WrapNet: Neural Net Inference with Ultra-Low-Resolution Arithmetic\n- [[ICML](https:\u002F\u002Fproceedings.mlr.press\u002Fv139\u002Fchen21z.html)] ActNN: Reducing Training Memory Footprint via 2-Bit Activation Compressed Training [[code](https:\u002F\u002Fgithub.com\u002Fucbrise\u002Factnn)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fucbrise\u002Factnn?style=social)](https:\u002F\u002Fgithub.com\u002Fucbrise\u002Factnn)\n- [[ICML](https:\u002F\u002Fproceedings.mlr.press\u002Fv139\u002Ffu21d.html)] Auto-NBA: Efficient and Effective Search Over the Joint Space of Networks, Bitwidths, and Accelerators [[code](https:\u002F\u002Fgithub.com\u002FRICE-EIC\u002FAuto-NBA)] [![GitHub 
stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FRICE-EIC\u002FAuto-NBA?style=social)](https:\u002F\u002Fgithub.com\u002FRICE-EIC\u002FAuto-NBA)\n- [[ICML](https:\u002F\u002Fproceedings.mlr.press\u002Fv139\u002Fzhang21r.html)] Differentiable Dynamic Quantization with Mixed Precision and Adaptive Resolution\n- [[ICML](https:\u002F\u002Fproceedings.mlr.press\u002Fv139\u002Fyao21a.html)] HAWQ-V3: Dyadic Neural Network Quantization [[code](https:\u002F\u002Fgithub.com\u002FZhen-Dong\u002FHAWQ)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FZhen-Dong\u002FHAWQ?style=social)](https:\u002F\u002Fgithub.com\u002FZhen-Dong\u002FHAWQ)\n- [[ICML](https:\u002F\u002Fproceedings.mlr.press\u002Fv139\u002Fkim21d.html)] I-BERT: Integer-only BERT Quantization [[code](https:\u002F\u002Fgithub.com\u002Fkssteven418\u002FI-BERT)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fkssteven418\u002FI-BERT?style=social)](https:\u002F\u002Fgithub.com\u002Fkssteven418\u002FI-BERT)\n- [[NeurIPS](https:\u002F\u002Fopenreview.net\u002Fforum?id=YygA0yppTR)] A Winning Hand: Compressing Deep Networks Can Improve Out-of-Distribution Robustness [[code](https:\u002F\u002Fgithub.com\u002Fchrundle\u002Fbiprop)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fchrundle\u002Fbiprop?style=social)](https:\u002F\u002Fgithub.com\u002Fchrundle\u002Fbiprop)\n- [[NeurIPS](https:\u002F\u002Fopenreview.net\u002Fforum?id=Z_J5bCb4Rra)] Divergence Frontiers for Generative Models: Sample Complexity, Quantization Effects, and Frontier Integrals\n- [[NeurIPS](https:\u002F\u002Fopenreview.net\u002Fforum?id=9TX5OsKJvm)] Post-Training Quantization for Vision Transformer\n- [[NeurIPS](https:\u002F\u002Fopenreview.net\u002Fforum?id=qe9z54E_cqE)] Post-Training Sparsity-Aware Quantization [[code](https:\u002F\u002Fgithub.com\u002Fgilshm\u002Fsparq)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fgilshm\u002Fsparq?style=social)](https:\u002F\u002Fgithub.com\u002Fgilshm\u002Fsparq)\n- [[NeurIPS](https:\u002F\u002Fopenreview.net\u002Fforum?id=ejo1_Weiart)] Qimera: Data-free Quantization with Synthetic Boundary Supporting Samples [[code](https:\u002F\u002Fgithub.com\u002Fiamkanghyunchoi\u002Fqimera)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fiamkanghyunchoi\u002Fqimera?style=social)](https:\u002F\u002Fgithub.com\u002Fiamkanghyunchoi\u002Fqimera)\n- [[NeurIPS](https:\u002F\u002Fopenreview.net\u002Fforum?id=0kCxbBQknN)] Qu-ANTI-zation: Exploiting Quantization Artifacts for Achieving Adversarial Outcomes \n- [[NeurIPS](https:\u002F\u002Fopenreview.net\u002Fforum?id=EO-CQzgcIxd)] VQ-GNN: A Universal Framework to Scale up Graph Neural Networks using Vector Quantization\n- [[arXiv](http:\u002F\u002Farxiv.org\u002Fabs\u002F2103.13630)] A Survey of Quantization Methods for Efficient Neural Network Inference\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2106.08295.pdf)] A White Paper on Neural Network Quantization\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F1911.07346)] Any-Precision Deep Neural Networks [[code](https:\u002F\u002Fgithub.com\u002FSHI-Labs\u002FAny-Precision-DNNs)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FSHI-Labs\u002FAny-Precision-DNNs?style=social)](https:\u002F\u002Fgithub.com\u002FSHI-Labs\u002FAny-Precision-DNNs)\n- [[arXiv](http:\u002F\u002Farxiv.org\u002Fabs\u002F2103.12369)] ReCU: Reviving the Dead Weights in Binary 
Neural Networks [[code](https:\u002F\u002Fgithub.com\u002Fz-hXu\u002FReCU)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fz-hXu\u002FReCU?style=social)](https:\u002F\u002Fgithub.com\u002Fz-hXu\u002FReCU)\n- [[AAAI](https:\u002F\u002Fcdn.aaai.org\u002Fojs\u002F16263\u002F16263-13-19757-1-2-20210518.pdf)] Training Binary Neural Network without Batch Normalization for Image Super-Resolution\n- [[AAAI](https:\u002F\u002Fojs.aaai.org\u002Findex.php\u002FAAAI\u002Farticle\u002Fview\u002F16306)] SA-BNN: State-Aware Binary Neural Network\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2021\u002Fpapers\u002FOh_Automated_Log-Scale_Quantization_for_Low-Cost_Deep_Neural_Networks_CVPR_2021_paper.pdf)] Automated Log-Scale Quantization for Low-Cost Deep Neural Networks\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2021\u002Fpapers\u002FKryzhanovskiy_QPP_Real-Time_Quantization_Parameter_Prediction_for_Deep_Neural_Networks_CVPR_2021_paper.pdf)] QPP: Real-Time Quantization Parameter Prediction for Deep Neural Networks\n- [[ICLR](https:\u002F\u002Farxiv.org\u002Fabs\u002F2006.10518)] Improving Post Training Neural Quantization: Layer-wise Calibration and Integer Programming [[code](https:\u002F\u002Fgithub.com\u002Fitayhubara\u002FCalibTIP)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fitayhubara\u002FCalibTIP?style=social)](https:\u002F\u002Fgithub.com\u002Fitayhubara\u002FCalibTIP)\n- [[ICML](https:\u002F\u002Fproceedings.mlr.press\u002Fv139\u002Fhubara21a\u002Fhubara21a.pdf)] Accurate Post Training Quantization With Small Calibration Sets\n- [[NeurIPS](https:\u002F\u002Farxiv.org\u002Fabs\u002F2105.08952)] BatchQuant: Quantized-for-all Architecture Search with Robust Quantizer\n\n### 2020\n\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2020\u002Fpapers\u002FQin_Forward_and_Backward_Information_Retention_for_Accurate_Binary_Neural_Networks_CVPR_2020_paper.pdf)] Forward and Backward Information Retention for Accurate Binary Neural Networks [[code](https:\u002F\u002Fgithub.com\u002Fhtqin\u002FIR-Net)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fhtqin\u002FIR-Net?style=social)](https:\u002F\u002Fgithub.com\u002Fhtqin\u002FIR-Net)\n- [[PR](https:\u002F\u002Farxiv.org\u002Fabs\u002F2004.03333)] Binary neural networks: A survey\n- [[AAAI](https:\u002F\u002Faaai.org\u002Fojs\u002Findex.php\u002FAAAI\u002Farticle\u002Fview\u002F6035)] HLHLp: Quantized Neural Networks Training for Reaching Flat Minima in Loss Surface\n- [[AAAI](https:\u002F\u002Farxiv.org\u002Fabs\u002F1909.05840)] Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT\n- [[AAAI](https:\u002F\u002Faaai.org\u002Fojs\u002Findex.php\u002FAAAI\u002Farticle\u002Fview\u002F6900)] Sparsity-Inducing Binarized Neural Networks\n- [[AAAI](https:\u002F\u002Faaai.org\u002Fojs\u002Findex.php\u002FAAAI\u002Farticle\u002Fview\u002F6134)] Towards Accurate Low Bit-Width Quantization with Multiple Phase Adaptations\n- [[ACL](https:\u002F\u002Fwww.aclweb.org\u002Fanthology\u002F2020.sustainlp-1.4.pdf)] End to End Binarized Neural Networks for Text Classification\n- [[COOL CHIPS](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F9097642\u002F)] A Novel In-DRAM Accelerator Architecture for Binary Neural Network\n- 
[[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2020\u002Fpapers\u002FWang_APQ_Joint_Search_for_Network_Architecture_Pruning_and_Quantization_Policy_CVPR_2020_paper.pdf)] APQ: Joint Search for Network Architecture, Pruning and Quantization Policy [[code](https:\u002F\u002Fgithub.com\u002Fmit-han-lab\u002Fapq)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fmit-han-lab\u002Fapq?style=social)](https:\u002F\u002Fgithub.com\u002Fmit-han-lab\u002Fapq)\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2020\u002Fpapers\u002FWang_BiDet_An_Efficient_Binarized_Object_Detector_CVPR_2020_paper.pdf)] BiDet: An Efficient Binarized Object Detector. [[code](https:\u002F\u002Fgithub.com\u002FZiweiWangTHU\u002FBiDet)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FZiweiWangTHU\u002FBiDet?style=social)](https:\u002F\u002Fgithub.com\u002FZiweiWangTHU\u002FBiDet)\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2020\u002Fpapers\u002FZhang_Fixed-Point_Back-Propagation_Training_CVPR_2020_paper.pdf)] Fixed-Point Back-Propagation Training\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2020\u002Fpapers\u002FHan_GhostNet_More_Features_From_Cheap_Operations_CVPR_2020_paper.pdf)] GhostNet: More Features from Cheap Operations\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPRW_2020\u002Fpapers\u002Fw40\u002FYu_Low-Bit_Quantization_Needs_Good_Distribution_CVPRW_2020_paper.pdf)] Low-Bit Quantization Needs Good Distribution\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2020\u002Fpapers\u002FWu_Rotation_Consistent_Margin_Loss_for_Efficient_Low-Bit_Face_Recognition_CVPR_2020_paper.pdf)] Rotation Consistent Margin Loss for Efficient Low-Bit Face Recognition\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2002.10778.pdf)] Training Binary Neural Networks using the Bayesian Learning Rule\n- [[DATE](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F9116220)] BNNsplit: Binarized Neural Networks for embedded distributed FPGA-based computing systems\n- [[DATE](https:\u002F\u002Fieeexplore.ieee.org\u002Fabstract\u002Fdocument\u002F9116308)] OrthrusPE: Runtime Reconfigurable Processing Elements for Binary Neural Networks\n- [[DATE](https:\u002F\u002Farxiv.org\u002Fabs\u002F1912.04050)] PhoneBit: Efficient GPU-Accelerated Binary Neural Network Inference Engine for Mobile Phones\n- [[ECCV](http:\u002F\u002Farxiv.org\u002Fabs\u002F2003.01711)] BATS: Binary ArchitecTure Search\n- [[ECCV](https:\u002F\u002Farxiv.org\u002Fabs\u002F2007.10463)] Differentiable Joint Pruning and Quantization for Hardware Efficiency\n- [[ECCV](https:\u002F\u002Farxiv.org\u002Fabs\u002F2003.03603)] Generative Low-bitwidth Data Free Quantization [[code](https:\u002F\u002Fgithub.com\u002Fxushoukai\u002FGDFQ)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fxushoukai\u002FGDFQ?style=social)](https:\u002F\u002Fgithub.com\u002Fxushoukai\u002FGDFQ)\n- [[ECCV](https:\u002F\u002Farxiv.org\u002Fabs\u002F2002.06963)] Learning Architectures for Binary Networks [[code](https:\u002F\u002Fgithub.com\u002Fgistvision\u002Fbnas)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fgistvision\u002Fbnas?style=social)](https:\u002F\u002Fgithub.com\u002Fgistvision\u002Fbnas)\n- [[ECCV](https:\u002F\u002Fwww.ecva.net\u002Fpapers\u002Feccv_2020\u002Fpapers_ECCV\u002Fpapers\u002F123510426.pdf)] PROFIT: A Novel Training Method for 
sub-4-bit MobileNet Models\n- [[ECCV](https:\u002F\u002Fwww.ecva.net\u002Fpapers\u002Feccv_2020\u002Fpapers_ECCV\u002Fpapers\u002F123480222.pdf)] ProxyBNN: Learning Binarized Neural Networks via Proxy Matrices\n- [[ECCV](https:\u002F\u002Fwww.ecva.net\u002Fpapers\u002Feccv_2020\u002Fpapers_ECCV\u002Fpapers\u002F123590137.pdf)] ReActNet: Towards Precise Binary Neural Network with Generalized Activation Functions [[code](https:\u002F\u002Fgithub.com\u002Fliuzechun\u002FReActNet)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fliuzechun\u002FReActNet?style=social)](https:\u002F\u002Fgithub.com\u002Fliuzechun\u002FReActNet)\n- [[EMNLP](https:\u002F\u002Farxiv.org\u002Fabs\u002F1910.10485)] Fully Quantized Transformer for Machine Translation\n- [[EMNLP](https:\u002F\u002Farxiv.org\u002Fabs\u002F2009.12812)] TernaryBERT: Distillation-aware Ultra-low Bit BERT [[code](https:\u002F\u002Fgithub.com\u002Fhuawei-noah\u002FPretrained-Language-Model)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fhuawei-noah\u002FPretrained-Language-Model?style=social)](https:\u002F\u002Fgithub.com\u002Fhuawei-noah\u002FPretrained-Language-Model)\n- [[ICASSP](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F9054599)] Balanced Binary Neural Networks with Gated Residual\n- [[ICET](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F9119704)] An Energy-Efficient Bagged Binary Neural Network Accelerator\n- [[ICLR](https:\u002F\u002Farxiv.org\u002Fabs\u002F2002.06517)] BinaryDuo: Reducing Gradient Mismatch in Binary Activation Network by Coupling Binary Activations [[code](https:\u002F\u002Fgithub.com\u002FHyungjun-K1m\u002FBinaryDuo)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FHyungjun-K1m\u002FBinaryDuo?style=social)](https:\u002F\u002Fgithub.com\u002FHyungjun-K1m\u002FBinaryDuo)\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fpdf?id=XKeyCSUWusK)] DMS: Differentiable Dimension Search for Binary Neural Networks\n- [[ICLR](https:\u002F\u002Farxiv.org\u002Fabs\u002F1902.08153)] Learned Step Size Quantization\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=Hyx0slrFvH)] Mixed Precision DNNs: All You Need is a Good Parametrization [[code](https:\u002F\u002Fgithub.com\u002Fsony\u002Fai-research-code\u002Ftree\u002Fmaster\u002Fmixed-precision-dnns)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fsony\u002Fai-research-code?style=social)](https:\u002F\u002Fgithub.com\u002Fsony\u002Fai-research-code)\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fpdf?id=BJg4NgBKvH)] Training Binary Neural Networks with Real-to-Binary Convolutions\n- [[ICML](https:\u002F\u002Farxiv.org\u002Fabs\u002F1908.10396)] Accelerating Large-Scale Inference with Anisotropic Vector Quantization\n- [[ICML](https:\u002F\u002Farxiv.org\u002Fabs\u002F2004.09576)] LSQ+: Improving low-bit quantization through learnable offsets and better initialization\n- [[ICML](https:\u002F\u002Fproceedings.icml.cc\u002Fstatic\u002Fpaper_files\u002Ficml\u002F2020\u002F181-Paper.pdf)] Training Binary Neural Networks through Learning with Noisy Supervision\n- [[ICML](https:\u002F\u002Farxiv.org\u002Fabs\u002F2004.10568)] Up or Down? 
Adaptive Rounding for Post-Training Quantization\n- [[IEEE Access](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F9091590\u002F)] An Energy-Efficient and High Throughput in-Memory Computing Bit-Cell With Excellent Robustness Under Process Variations for Binary Neural Network\n- [[IEEE TCS.I](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2003.12558.pdf)] IMAC: In-Memory Multi-Bit Multiplication and ACcumulation in 6T SRAM Array\n- [[IEEE TCS.II](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F9144282\u002F)] A Resource-Efficient Inference Accelerator for Binary Convolutional Neural Networks\n- [[IEEE Trans. Electron Devices](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F9112690)] Design of High Robustness BNN Inference Accelerator Based on Binary Memristors\n- [[IEEE Trans. Magn](https:\u002F\u002Farxiv.org\u002Fabs\u002F2003.05132)] SIMBA: A Skyrmionic In-Memory Binary Neural Network Accelerator\n- [[IJCAI](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2005.00057.pdf)] CP-NAS: Child-Parent Neural Architecture Search for Binary Neural Networks\n- [[IJCAI](https:\u002F\u002Fwww.ijcai.org\u002Fproceedings\u002F2020\u002F292)] Direct Quantization for Training Highly Accurate Low Bit-width Deep Neural Networks\n- [[IJCAI](https:\u002F\u002Fwww.ijcai.org\u002Fproceedings\u002F2020\u002F288)] Fully Nested Neural Network for Adaptive Compression and Quantization\n- [[IJCAI](https:\u002F\u002Fwww.ijcai.org\u002Fproceedings\u002F2020\u002F121)] Overflow Aware Quantization: Accelerating Neural Network Inference by Low-bit Multiply-Accumulate Operations\n- [[IJCAI](https:\u002F\u002Fwww.ijcai.org\u002Fproceedings\u002F2020\u002F318)] Soft Threshold Ternary Networks\n- [[IJCAI](https:\u002F\u002Fwww.ijcai.org\u002FProceedings\u002F2020\u002F0520.pdf)] Towards Fully 8-bit Integer Inference for the Transformer Model\n- [[IJCV](https:\u002F\u002Farxiv.org\u002Fabs\u002F2009.04247)] Binarized Neural Architecture Search for Efficient Object Recognition\n- [[ISCAS](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2004.08914.pdf)] MuBiNN: Multi-Level Binarized Recurrent Neural Network for EEG Signal Classification\n- [[ISQED](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F9136977)] BNN Pruning: Pruning Binary Neural Network Guided by Weight Flipping Frequency [[code](https:\u002F\u002Fgithub.com\u002FPSCLab-ASU\u002FBNNPruning)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FPSCLab-ASU\u002FBNNPruning?style=social)](https:\u002F\u002Fgithub.com\u002FPSCLab-ASU\u002FBNNPruning)\n- [[MICRO](http:\u002F\u002Farxiv.org\u002Fabs\u002F2005.03842)] GOBO: Quantizing Attention-Based NLP Models for Low Latency and Energy Efficient Inference\n- [[MLST](https:\u002F\u002Farxiv.org\u002Fabs\u002F2003.06308)] Compressing deep neural networks on FPGAs to binary and ternary precision with HLS4ML\n- [[NN](https:\u002F\u002Fwww.sciencedirect.com\u002Fscience\u002Farticle\u002Fabs\u002Fpii\u002FS0893608019304290?via%3Dihub)] Training high-performance and large-scale deep neural networks with full 8-bit integers\n- [[NeurIPS](https:\u002F\u002Fproceedings.neurips.cc\u002Fpaper\u002F2020\u002Fhash\u002F20b5e1cf8694af7a3c1ba4a87f073021-Abstract.html)] Adaptive Gradient Quantization for Data-Parallel SGD [[code](https:\u002F\u002Fgithub.com\u002Ftabrizian\u002Flearning-to-quantize)] [![GitHub 
stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Ftabrizian\u002Flearning-to-quantize?style=social)](https:\u002F\u002Fgithub.com\u002Ftabrizian\u002Flearning-to-quantize)\n- [[NeurIPS](https:\u002F\u002Fproceedings.neurips.cc\u002Fpaper\u002F2020\u002Fhash\u002F3f13cf4ddf6fc50c0d39a1d5aeb57dd8-Abstract.html)] Bayesian Bits: Unifying Quantization and Pruning\n- [[NeurIPS](https:\u002F\u002Fproceedings.neurips.cc\u002Fpaper\u002F2020\u002Fhash\u002F26ed695e9b7b9f6463ef4bc1fd74fc87-Abstract.html)] Closing the Dequantization Gap: PixelCNN as a Single-Layer Flow [[code](https:\u002F\u002Fgithub.com\u002Fdidriknielsen\u002Fpixelcnn_flow)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fdidriknielsen\u002Fpixelcnn_flow?style=social)](https:\u002F\u002Fgithub.com\u002Fdidriknielsen\u002Fpixelcnn_flow)\n- [[NeurIPS](https:\u002F\u002Fproceedings.neurips.cc\u002Fpaper\u002F2020\u002Fhash\u002F1385974ed5904a438616ff7bdb3f7439-Abstract.html)] Efficient Exact Verification of Binarized Neural Networks [[code](https:\u002F\u002Fgithub.com\u002Fjia-kai\u002Feevbnn)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fjia-kai\u002Feevbnn?style=social)](https:\u002F\u002Fgithub.com\u002Fjia-kai\u002Feevbnn)\n- [[NeurIPS](https:\u002F\u002Fproceedings.neurips.cc\u002Fpaper\u002F2020\u002Fhash\u002F0e230b1a582d76526b7ad7fc62ae937d-Abstract.html)] FleXOR: Trainable Fractional Quantization\n- [[NeurIPS](https:\u002F\u002Fproceedings.neurips.cc\u002Fpaper\u002F2020\u002Fhash\u002Fd77c703536718b95308130ff2e5cf9ee-Abstract.html)] HAWQ-V2: Hessian Aware trace-Weighted Quantization of Neural Networks\n- [[NeurIPS](https:\u002F\u002Fproceedings.neurips.cc\u002Fpaper\u002F2020\u002Fhash\u002F96fca94df72984fc97ee5095410d4dec-Abstract.html)] Path Sample-Analytic Gradient Estimators for Stochastic Binary Networks [[code](https:\u002F\u002Fgithub.com\u002Fshekhovt\u002FPSA-Neurips2020)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fshekhovt\u002FPSA-Neurips2020?style=social)](https:\u002F\u002Fgithub.com\u002Fshekhovt\u002FPSA-Neurips2020)\n- [[NeurIPS](http:\u002F\u002Farxiv.org\u002Fabs\u002F2005.11035)] Position-based Scaled Gradient for Model Quantization and Pruning [[code](https:\u002F\u002Fgithub.com\u002FJangho-Kim\u002FPSG-pytorch)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FJangho-Kim\u002FPSG-pytorch?style=social)](https:\u002F\u002Fgithub.com\u002FJangho-Kim\u002FPSG-pytorch)\n- [[NeurIPS](https:\u002F\u002Fproceedings.neurips.cc\u002Fpaper\u002F2020\u002Fhash\u002F3948ead63a9f2944218de038d8934305-Abstract.html)] Robust Quantization: One Model to Rule Them All\n- [[NeurIPS](https:\u002F\u002Fpapers.nips.cc\u002Fpaper\u002F2020\u002Ffile\u002F53c5b2affa12eed84dfec9bfd83550b1-Paper.pdf)] Rotated Binary Neural Network [[code](https:\u002F\u002Fgithub.com\u002Flmbxmu\u002FRBNN)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Flmbxmu\u002FRBNN?style=social)](https:\u002F\u002Fgithub.com\u002Flmbxmu\u002FRBNN)\n- [[NeurIPS](https:\u002F\u002Fproceedings.neurips.cc\u002Fpaper\u002F2020\u002Ffile\u002F2a084e55c87b1ebcdaad1f62fdbbac8e-Paper.pdf)] Searching for Low-Bit Weights in Quantized Neural Networks [[code](https:\u002F\u002Fgithub.com\u002Fzhaohui-yang\u002FBinary-Neural-Networks\u002Ftree\u002Fmain\u002FSLB)] [![GitHub 
stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fzhaohui-yang\u002FBinary-Neural-Networks?style=social)](https:\u002F\u002Fgithub.com\u002Fzhaohui-yang\u002FBinary-Neural-Networks)\n- [[NeurIPS](https:\u002F\u002Fproceedings.neurips.cc\u002Fpaper\u002F2020\u002Fhash\u002F92049debbe566ca5782a3045cf300a3c-Abstract.html)] Universally Quantized Neural Compression\n- [[Neurocomputing](https:\u002F\u002Fwww.sciencedirect.com\u002Fscience\u002Farticle\u002Fabs\u002Fpii\u002FS0925231219314274)] Eye localization based on weight binarization cascade convolution neural network\n- [[PR Letters](https:\u002F\u002Farxiv.org\u002Fabs\u002F2008.01438)] Controlling information capacity of binary neural network\n- [[SysML](https:\u002F\u002Fubicomplab.cs.washington.edu\u002Fpdfs\u002Friptide.pdf)] Riptide: Fast End-to-End Binarized Neural Networks [[code](https:\u002F\u002Fgithub.com\u002Fjwfromm\u002FRiptide)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fjwfromm\u002FRiptide?style=social)](https:\u002F\u002Fgithub.com\u002Fjwfromm\u002FRiptide)\n- [[TPAMI](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F8573867\u002F)] Deep Neural Network Compression by In-Parallel Pruning-Quantization\n- [[TPAMI](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F8444745\u002F)] Hierarchical Binary CNNs for Landmark Localization with Limited Resources [[code](https:\u002F\u002Fwww.adrianbulat.com\u002Fbinary-cnn-landmarks)]\n- [[TPAMI](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F8674614\u002F)] Towards Efficient U-Nets: A Coupled and Quantized Approach\n- [[TVLSI](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2003.02628.pdf)] Phoenix: A Low-Precision Floating-Point Quantization Oriented Architecture for Convolutional Neural Networks\n- [[WACV](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_WACV_2020\u002Fpapers\u002FPhan_MoBiNet_A_Mobile_Binary_Network_for_Image_Classification_WACV_2020_paper.pdf)] MoBiNet: A Mobile Binary Network for Image Classification\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F2006.16578)] Accelerating Binarized Neural Networks via Bit-Tensor-Cores in Turing GPUs [[code](https:\u002F\u002Fgithub.com\u002Fpnnl\u002FTCBNN)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fpnnl\u002FTCBNN?style=social)](https:\u002F\u002Fgithub.com\u002Fpnnl\u002FTCBNN)\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2004.11147.pdf)] Binarized Graph Neural Network\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F2012.15701)] BinaryBERT: Pushing the Limit of BERT Quantization [[code](https:\u002F\u002Fgithub.com\u002Fhuawei-noah\u002FPretrained-Language-Model)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fhuawei-noah\u002FPretrained-Language-Model?style=social)](https:\u002F\u002Fgithub.com\u002Fhuawei-noah\u002FPretrained-Language-Model)\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2007.05223.pdf)] Distillation Guided Residual Learning for Binary Convolutional Neural Networks\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1909.09139.pdf)] How Does Batch Normalization Help Binary Training?\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F2001.05936)] MeliusNet: Can Binary Neural Networks Achieve MobileNet-level Accuracy? 
[[code](https:\u002F\u002Fgithub.com\u002Fhpi-xnor\u002FBMXNet-v2)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fhpi-xnor\u002FBMXNet-v2?style=social)](https:\u002F\u002Fgithub.com\u002Fhpi-xnor\u002FBMXNet-v2)\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2001.01091.pdf)] RPR: Random Partition Relaxation for Training Binary and Ternary Weight Neural Networks\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F2004.07320)] Training with Quantization Noise for Extreme Model Compression [[code](https:\u002F\u002Fgithub.com\u002Fpytorch\u002Ffairseq\u002Ftree\u002Fmaster\u002Fexamples\u002Fquant_noise)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fpytorch\u002Ffairseq?style=social)](https:\u002F\u002Fgithub.com\u002Fpytorch\u002Ffairseq)\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F2006.07522)] Understanding Learning Dynamics of Binary Neural Networks via Information Bottleneck\n- [[paper](https:\u002F\u002Fwww.researchgate.net\u002Fpublication\u002F343568789_Towards_Lossless_Binary_Convolutional_Neural_Networks_Using_Piecewise_Approximation)] Towards Lossless Binary Convolutional Neural Networks Using Piecewise Approximation\n- [[CVPR](https:\u002F\u002Farxiv.org\u002Fabs\u002F2001.00281)] ZeroQ: A Novel Zero Shot Quantization Framework [[code](https:\u002F\u002Fgithub.com\u002Famirgholami\u002FZeroQ)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Famirgholami\u002FZeroQ?style=social)](https:\u002F\u002Fgithub.com\u002Famirgholami\u002FZeroQ)\n- [[CVPR](https:\u002F\u002Farxiv.org\u002Fabs\u002F1912.09666)] AdaBits: Neural Network Quantization With Adaptive Bit-Widths [[code](https:\u002F\u002Fgithub.com\u002FdeJQK\u002FAdaBits)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FdeJQK\u002FAdaBits?style=social)](https:\u002F\u002Fgithub.com\u002FdeJQK\u002FAdaBits)\n- [[CVPR](https:\u002F\u002Farxiv.org\u002Fabs\u002F1912.08883)] Adaptive Loss-aware Quantization for Multi-bit Networks [[code](https:\u002F\u002Fgithub.com\u002Fzqu1992\u002FALQ)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fzqu1992\u002FALQ?style=social)](https:\u002F\u002Fgithub.com\u002Fzqu1992\u002FALQ)\n- [[ECCV](https:\u002F\u002Farxiv.org\u002Fabs\u002F2007.09952)] HMQ: Hardware Friendly Mixed Precision Quantization Block for CNNs [[code](https:\u002F\u002Fgithub.com\u002Fsony-si\u002Fai-research)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fsony-si\u002Fai-research?style=social)](https:\u002F\u002Fgithub.com\u002Fsony-si\u002Fai-research)\n\n### 2019\n\n- [[AAAI](https:\u002F\u002Fwww.aaai.org\u002Fojs\u002Findex.php\u002FAAAI\u002Farticle\u002Fview\u002F4273\u002F4151)] Efficient Quantization for Neural Networks with Binary Weights and Low Bitwidth Activations\n- [[AAAI](https:\u002F\u002Fwww.aaai.org\u002Fojs\u002Findex.php\u002FAAAI\u002Farticle\u002Fview\u002F4848\u002F4721)] Projection Convolutional Neural Networks for 1-bit CNNs via Discrete Back Propagation\n- [[APCCAS](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F8953134\u002F)] Using Neuroevolved Binary Neural Networks to solve reinforcement learning environments [[code](https:\u002F\u002Fgithub.com\u002Frval735\u002FBiSUNA)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Frval735\u002FBiSUNA?style=social)](https:\u002F\u002Fgithub.com\u002Frval735\u002FBiSUNA)\n- 
[[BMVC](https:\u002F\u002Farxiv.org\u002Fabs\u002F1909.11366)] Accurate and Compact Convolutional Neural Networks with Trained Binarization\n- [[BMVC](https:\u002F\u002Farxiv.org\u002Fabs\u002F1909.13863)] XNOR-Net++: Improved Binary Neural Networks\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2019\u002Fpapers\u002FXu_A_MainSubsidiary_Network_Framework_for_Simplifying_Binary_Neural_Networks_CVPR_2019_paper.pdf)] A Main\u002FSubsidiary Network Framework for Simplifying Binary Neural Network\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2019\u002Fpapers\u002FZhu_Binary_Ensemble_Neural_Network_More_Bits_per_Network_or_More_CVPR_2019_paper.pdf)] Binary Ensemble Neural Network: More Bits per Network or More Networks per Bit?\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2019\u002Fpapers\u002FLiu_Circulant_Binary_Convolutional_Networks_Enhancing_the_Performance_of_1-Bit_DCNNs_CVPR_2019_paper.pdf)] Circulant Binary Convolutional Networks: Enhancing the Performance of 1-bit DCNNs with Circulant Back Propagation\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2019\u002Fpapers\u002FLi_Fully_Quantized_Network_for_Object_Detection_CVPR_2019_paper.pdf)] Fully Quantized Network for Object Detection\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2019\u002Fpapers\u002FWang_HAQ_Hardware-Aware_Automated_Quantization_With_Mixed_Precision_CVPR_2019_paper.pdf)] HAQ: Hardware-Aware Automated Quantization with Mixed Precision [[code](https:\u002F\u002Fgithub.com\u002Fmit-han-lab\u002Fhaq)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fmit-han-lab\u002Fhaq?style=social)](https:\u002F\u002Fgithub.com\u002Fmit-han-lab\u002Fhaq)\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2019\u002Fpapers\u002FWang_Learning_Channel-Wise_Interactions_for_Binary_Convolutional_Neural_Networks_CVPR_2019_paper.pdf)] Learning Channel-Wise Interactions for Binary Convolutional Neural Networks\n- [[CVPR](https:\u002F\u002Farxiv.org\u002Fabs\u002F1808.05779)] Learning to Quantize Deep Networks by Optimizing Quantization Intervals with Task Loss\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2019\u002Fpapers\u002FYang_Quantization_Networks_CVPR_2019_paper.pdf)] Quantization Networks [[code](https:\u002F\u002Fgithub.com\u002Faliyun\u002Falibabacloud-quantization-networks)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Faliyun\u002Falibabacloud-quantization-networks?style=social)](https:\u002F\u002Fgithub.com\u002Faliyun\u002Falibabacloud-quantization-networks)\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2019\u002Fpapers\u002FDing_Regularizing_Activation_Distribution_for_Training_Binarized_Deep_Networks_CVPR_2019_paper.pdf)] Regularizing Activation Distribution for Training Binarized Deep Networks\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2019\u002Fpapers\u002FCao_SeerNet_Predicting_Convolutional_Neural_Network_Feature-Map_Sparsity_Through_Low-Bit_Quantization_CVPR_2019_paper.pdf)] SeerNet: Predicting Convolutional Neural Network Feature-Map Sparsity Through Low-Bit Quantization\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2019\u002Fpapers\u002FZhuang_Structured_Binary_Neural_Networks_for_Accurate_Image_Classification_and_Semantic_CVPR_2019_paper.pdf)] Structured Binary Neural Networks for Accurate Image Classification and Semantic Segmentation\n- 
[[arXiv](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1906.08637.pdf)] Back to Simplicity: How to Train Accurate BNNs from Scratch? [[code](https:\u002F\u002Fgithub.com\u002Fhpi-xnor\u002FBMXNet-v2)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fhpi-xnor\u002FBMXNet-v2?style=social)](https:\u002F\u002Fgithub.com\u002Fhpi-xnor\u002FBMXNet-v2)\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F1911.10862)] Binarized Neural Architecture Search\n- [[arXiv](http:\u002F\u002Farxiv.org\u002Fabs\u002F1904.05868)] Improved training of binary networks for human pose estimation and image recognition\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1904.07852.pdf)] Matrix and tensor decompositions for training binary neural networks\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F1908.07748)] RBCN: Rectified Binary Convolutional Networks for Enhancing the Performance of 1-bit DCNNs\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F1912.10103)] TentacleNet: A Pseudo-Ensemble Template for Accurate Binary Convolutional Neural Networks\n- [[FPGA](https:\u002F\u002Farxiv.org\u002Fabs\u002F1810.02068)] Towards Fast and Energy-Efficient Binarized Neural Network Inference on FPGA\n- [[GLSVLSI](https:\u002F\u002Fdl.acm.org\u002Fdoi\u002Fpdf\u002F10.1145\u002F3299874.3318034)] Binarized Depthwise Separable Neural Network for Object Tracking in FPGA\n- [[ICCV](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1908.06314.pdf)] Bayesian optimized 1-bit cnns\n- [[ICCV](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ICCV_2019\u002Fhtml\u002FNagel_Data-Free_Quantization_Through_Weight_Equalization_and_Bias_Correction_ICCV_2019_paper.html)] Data-Free Quantization Through Weight Equalization and Bias Correction [[code](https:\u002F\u002Fgithub.com\u002Fjakc4103\u002FDFQ)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fjakc4103\u002FDFQ?style=social)](https:\u002F\u002Fgithub.com\u002Fjakc4103\u002FDFQ)\n- [[ICCV](https:\u002F\u002Farxiv.org\u002Fabs\u002F1908.05033)] Differentiable Soft Quantization: Bridging Full-Precision and Low-Bit Neural Networks\n- [[ICCV](https:\u002F\u002Farxiv.org\u002Fabs\u002F1901.01928)] DSConv: Efficient Convolution Operator\n- [[ICCV](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ICCV_2019\u002Fhtml\u002FDong_HAWQ_Hessian_AWare_Quantization_of_Neural_Networks_With_Mixed-Precision_ICCV_2019_paper.html)] HAWQ: Hessian AWare Quantization of Neural Networks With Mixed-Precision\n- [[ICCV](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ICCVW_2019\u002Fpapers\u002FNeurArch\u002FShen_Searching_for_Accurate_Binary_Neural_Architectures_ICCVW_2019_paper.pdf)] Searching for Accurate Binary Neural Architectures\n- [[ICIP](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F8802610)] Training Accurate Binary Neural Networks from Scratch [[code](https:\u002F\u002Fgithub.com\u002Fhpi-xnor\u002FBMXNet-v2)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fhpi-xnor\u002FBMXNet-v2?style=social)](https:\u002F\u002Fgithub.com\u002Fhpi-xnor\u002FBMXNet-v2)\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fpdf?id=rJfUCoR5KX)] An Empirical study of Binary Neural Networks' Optimisation\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fpdf?id=HyzMyhCcK7)] ProxQuant: Quantized Neural Networks via Proximal Operators [[code](https:\u002F\u002Fgithub.com\u002Fallenbai01\u002FProxQuant)] [![GitHub 
stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fallenbai01\u002FProxQuant?style=social)](https:\u002F\u002Fgithub.com\u002Fallenbai01\u002FProxQuant)\n- [[ICML](https:\u002F\u002Farxiv.org\u002Fabs\u002F1906.00532v2)] Efficient 8-Bit Quantization of Transformer Neural Machine Language Translation Model\n- [[ICUS](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F8996039\u002F)] Balanced Circulant Binary Convolutional Networks\n- [[IEEE J. Emerg. Sel. Topics Circuits Syst.](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F8668446\u002F)] Hyperdrive: A Multi-Chip Systolically Scalable Binary-Weight CNN Inference Engine\n- [[IEEE J. Solid-State Circuits](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F8581485)] An Energy-Efficient Reconfigurable Processor for Binary-and Ternary-Weight Neural Networks With Flexible Data Bit Width\n- [[IEEE JETC](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1807.07928.pdf)] Eyeriss v2: A Flexible Accelerator for Emerging Deep Neural Networks on Mobile Devices\n- [[IEEE TCS.I](https:\u002F\u002Fieeexplore.ieee.org\u002Fabstract\u002Fdocument\u002F8643565)] Recursive Binary Neural Network Training Model for Efficient Usage of On-Chip Memory\n- [[IEEE TCS.I](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1807.00343.pdf)] Xcel-RAM: Accelerating Binary Neural Networks in High-Throughput SRAM Compute Arrays\n- [[IJCAI](https:\u002F\u002Fwww.ijcai.org\u002FProceedings\u002F2019\u002F0667.pdf)] Binarized Collaborative Filtering with Distilling Graph Convolutional Network\n- [[IJCAI](https:\u002F\u002Fsee.xidian.edu.cn\u002Ffaculty\u002Fchdeng\u002FWelcome%20to%20Cheng%20Deng's%20Homepage_files\u002FPapers\u002FConference\u002FIJCAI2019_Feng.pdf)] Binarized Neural Networks for Resource-Efficient Hashing with Minimizing Quantization Loss\n- [[ISOCC](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F9027649)] Dual Path Binary Neural Network\n- [[MDPI Electronics](https:\u002F\u002Fdoi.org\u002F10.3390\u002Felectronics8060661)] A Review of Binarized Neural Networks\n- [[NeurIPS](https:\u002F\u002Fwww.emc2-ai.org\u002Fassets\u002Fdocs\u002Fneurips-19\u002Femc2-neurips19-paper-36.pdf)] Fully Quantized Transformer for Improved Translation\n- [[NeurIPS](https:\u002F\u002Fpapers.nips.cc\u002Fpaper\u002F2019\u002Ffile\u002F9ca8c9b0996bbf05ae7753d34667a6fd-Paper.pdf)] Latent Weights Do Not Exist: Rethinking Binarized Neural Network Optimization [[code](https:\u002F\u002Fgithub.com\u002Fplumerai\u002Frethinking-bnn-optimization)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fplumerai\u002Frethinking-bnn-optimization?style=social)](https:\u002F\u002Fgithub.com\u002Fplumerai\u002Frethinking-bnn-optimization)\n- [[NeurIPS](https:\u002F\u002Fcsyhhu.github.io\u002Fdata\u002FMetaQuant.pdf)] MetaQuant: Learning to Quantize by Learning to Penetrate Non-differentiable Quantization [[code](https:\u002F\u002Fgithub.com\u002Fcsyhhu\u002FMetaQuant)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fcsyhhu\u002FMetaQuant?style=social)](https:\u002F\u002Fgithub.com\u002Fcsyhhu\u002FMetaQuant)\n- [[NeurIPS](https:\u002F\u002Farxiv.org\u002Fabs\u002F1902.03538)] Model Compression with Adversarial Robustness: A Unified Optimization Framework\n- [[NeurIPS](https:\u002F\u002Fopenreview.net\u002Fpdf?id=rJgB34rx8r)] Normalization Helps Training of Quantized LSTM\n- [[NeurIPS](https:\u002F\u002Fwww.emc2-ai.org\u002Fassets\u002Fdocs\u002Fneurips-19\u002Femc2-neurips19-paper-31.pdf)] Q8BERT: Quantized 
8Bit BERT\n- [[NeurIPS](http:\u002F\u002Farxiv.org\u002Fabs\u002F1812.11800)] Regularized Binary Network Training\n- [[RoEduNet](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F8909493\u002F)] PXNOR: Perturbative Binary Neural Network [[code](https:\u002F\u002Fgithub.com\u002FApfelin\u002FPXNOR)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FApfelin\u002FPXNOR?style=social)](https:\u002F\u002Fgithub.com\u002FApfelin\u002FPXNOR)\n- [[SiPS](https:\u002F\u002Farxiv.org\u002Fabs\u002F1909.01688)] Knowledge distillation for optimization of quantized deep neural networks\n- [[TMM](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1712.02956.pdf)] Compact Hash Code Learning With Binary Deep Neural Network\n- [[TMM](https:\u002F\u002Farxiv.org\u002Fabs\u002F1708.05127)] Deep Binary Reconstruction for Cross-Modal Hashing\n- [[VLSI-SoC](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F8920343\u002F)] A Product Engine for Energy-Efficient Execution of Binary Neural Networks Using Resistive Memories\n- [[arXiv](http:\u002F\u002Farxiv.org\u002Fabs\u002F1908.05858)] daBNN: A Super Fast Inference Framework for Binary Neural Networks on ARM devices [[code](https:\u002F\u002Fgithub.com\u002FJDAI-CV\u002Fdabnn)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FJDAI-CV\u002Fdabnn?style=social)](https:\u002F\u002Fgithub.com\u002FJDAI-CV\u002Fdabnn)\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F1812.00090)] Mixed Precision Quantization of ConvNets via Differentiable Neural Architecture Search\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F1911.12491)] QKD: Quantization-aware Knowledge Distillation\n- [[arXiv](http:\u002F\u002Farxiv.org\u002Fabs\u002F1902.00730)] Self-Binarizing Networks\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F1912.12607)] Towards Unified INT8 Training for Convolutional Neural Network\n- [[paper](https:\u002F\u002Fopenreview.net\u002Fpdf?id=SJfHg2A5tQ)] BNN+: Improved Binary Network Training\n\n### 2018\n\n- [[AAAI](https:\u002F\u002Faaai.org\u002Focs\u002Findex.php\u002FAAAI\u002FAAAI18\u002Fpaper\u002FviewPDFInterstitial\u002F16767\u002F16728)] Extremely Low Bit Neural Network: Squeeze the Last Bit Out with ADMM [[code](https:\u002F\u002Fweb.stanford.edu\u002F~boyd\u002Fadmm.html)]\n- [[AAAI](https:\u002F\u002Farxiv.org\u002Fabs\u002F1802.02733)] From Hashing to CNNs: Training BinaryWeight Networks via Hashing\n- [[CAAI](https:\u002F\u002Fieeexplore.ieee.org\u002Fstamp\u002Fstamp.jsp?arnumber=8603080)] Fast object detection based on binary deep convolution neural networks\n- [[CVPR](https:\u002F\u002Farxiv.org\u002Fabs\u002F1908.04680)] Effective Training of Convolutional Neural Networks with Low-bitwidth Weights and Activations\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_cvpr_2018\u002Fhtml\u002FZhou_Explicit_Loss-Error-Aware_Quantization_CVPR_2018_paper.html)] Explicit loss-error-aware quantization for low-bit deep neural networks\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_cvpr_2018\u002Fpapers\u002FWang_Modulated_Convolutional_Networks_CVPR_2018_paper.pdf)] Modulated convolutional networks\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_cvpr_2018\u002Fpapers\u002FJacob_Quantization_and_Training_CVPR_2018_paper.pdf)] Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference\n- 
[[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_cvpr_2018\u002Fpapers\u002FFaraone_SYQ_Learning_Symmetric_CVPR_2018_paper.pdf)] SYQ: Learning Symmetric Quantization For Efficient Deep Neural Networks [[code](https:\u002F\u002Fwww.github.com\u002Fjulianfaraone\u002FSYQ)]\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_cvpr_2018\u002Fpapers\u002FZhuang_Towards_Effective_Low-Bitwidth_CVPR_2018_paper.pdf)] Towards Effective Low-bitwidth Convolutional Neural Networks\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_cvpr_2018\u002Fpapers\u002FWang_Two-Step_Quantization_for_CVPR_2018_paper.pdf)] Two-Step Quantization for Low-bit Neural Networks\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1801.06313.pdf)] BinaryRelax: A Relaxation Approach For Training Deep Neural Networks With Quantized Weights\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F1802.02178)] LightNN: Filling the Gap between Conventional Deep Neural Networks and Binarized Networks\n- [[ECCV](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ECCV_2018\u002Fpapers\u002Fzechun_liu_Bi-Real_Net_Enhancing_ECCV_2018_paper.pdf)] Bi-Real Net: Enhancing the Performance of 1-bit CNNs With Improved Representational Capability and Advanced Training Algorithm [[code](https:\u002F\u002Fgithub.com\u002Fliuzechun\u002FBi-Real-net)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fliuzechun\u002FBi-Real-net?style=social)](https:\u002F\u002Fgithub.com\u002Fliuzechun\u002FBi-Real-net)\n- [[ECCV](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ECCV_2018\u002Fpapers\u002FDongqing_Zhang_Optimized_Quantization_for_ECCV_2018_paper.pdf)] LQ-Nets: Learned Quantization for Highly Accurate and Compact Deep Neural Networks [[code](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FLQ-Nets)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fmicrosoft\u002FLQ-Nets?style=social)](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FLQ-Nets)\n- [[ECCV](https:\u002F\u002Fyan-junjie.github.io\u002Fpublication\u002Fdblp-confeccv-wei-pqoy-18\u002Fdblp-confeccv-wei-pqoy-18.pdf)] Quantization Mimic: Towards Very Tiny CNN for Object Detection\n- [[ECCV](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ECCV_2018\u002Fpapers\u002FDiwen_Wan_TBN_Convolutional_Neural_ECCV_2018_paper.pdf)] TBN: Convolutional Neural Network with Ternary Inputs and Binary Weights [[code](https:\u002F\u002Fgithub.com\u002Fdnvtmf\u002FTBN)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fdnvtmf\u002FTBN?style=social)](https:\u002F\u002Fgithub.com\u002Fdnvtmf\u002FTBN)\n- [[ECCV](https:\u002F\u002Fwww.ecva.net\u002Fpapers\u002Feccv_2018\u002Fpapers_ECCV\u002Fpapers\u002FQinghao_Hu_Training_Binary_Weight_ECCV_2018_paper.pdf)] Training Binary Weight Networks via Semi-Binary Decomposition\n- [[FCCM](http:\u002F\u002Faceslab.org\u002Fsites\u002Fdefault\u002Ffiles\u002FFCCM_2018_resbinnet.pdf)] ReBNet: Residual Binarized Neural Network [[code](https:\u002F\u002Fgithub.com\u002Fmohaghasemzadeh\u002FReBNet)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fmohaghasemzadeh\u002FReBNet?style=social)](https:\u002F\u002Fgithub.com\u002Fmohaghasemzadeh\u002FReBNet)\n- [[FPL](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F8532584\u002F)] FBNA: A Fully Binarized Neural Network Accelerator\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fpdf?id=ryM_IoAqYX)] Analysis of Quantized Models\n- 
[[ICLR](https:\u002F\u002Fopenreview.net\u002Fpdf?id=B1ae1lZRb)] Apprentice: Using Knowledge Distillation Techniques To Improve Low-Precision Network Accuracy\n- [[ICLR](https:\u002F\u002Farxiv.org\u002Fabs\u002F1802.08635)] Loss-aware Weight Quantization of Deep Networks [[code](https:\u002F\u002Fgithub.com\u002Fhoulu369\u002FLoss-aware-weight-quantization)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fhoulu369\u002FLoss-aware-weight-quantization?style=social)](https:\u002F\u002Fgithub.com\u002Fhoulu369\u002FLoss-aware-weight-quantization)\n- [[ICLR](https:\u002F\u002Fresearch-explorer.app.ist.ac.at\u002Fdownload\u002F7812\u002F7894\u002F2018_ICLR_Polino.pdf)] Model compression via distillation and quantization [[code](https:\u002F\u002Fgithub.com\u002Fantspy\u002Fquantized_distillation)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fantspy\u002Fquantized_distillation?style=social)](https:\u002F\u002Fgithub.com\u002Fantspy\u002Fquantized_distillation)\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fpdf?id=By5ugjyCb)] PACT: Parameterized Clipping Activation for Quantized Neural Networks\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fpdf?id=B1ZvaaeAZ)] WRPN: Wide Reduced-Precision Networks\n- [[IEEE J. Solid-State Circuits](http:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F8226999\u002F)] BRein Memory: A Single-Chip Binary\u002FTernary Reconfigurable in-Memory Deep Neural Network Accelerator Achieving 1.4 TOPS at 0.6 W\n- [[IJCAI](https:\u002F\u002Fwww.ijcai.org\u002FProceedings\u002F2018\u002F0380.pdf)] Deterministic Binary Filters for Convolutional Neural Networks\n- [[IJCAI](https:\u002F\u002Fwww.ijcai.org\u002FProceedings\u002F2018\u002F0669.pdf)] Planning in Factored State and Action Spaces with Learned Binarized Neural Network Transition Models\n- [[IJCNN](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F8489259)] Analysis and Implementation of Simple Dynamic Binary Neural Networks\n- [[IPDPS](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F8425178)] BitFlow: Exploiting Vector Parallelism for Binary Neural Networks on CPU\n- [[MM](https:\u002F\u002Fdl.acm.org\u002Fdoi\u002F10.1145\u002F3240508.3240673)] BitStream: Efficient Computing Architecture for Real-Time Low-Power Inference of Binary Neural Networks on CPUs\n- [[NCA](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1712.08934.pdf)] A survey of FPGA-based accelerators for convolutional neural networks\n- [[NeurIPS](https:\u002F\u002Fpapers.nips.cc\u002Fpaper\u002F2018\u002Ffile\u002Fe82c4b19b8151ddc25d4d93baf7b908f-Paper.pdf)] Scalable methods for 8-bit training of neural networks [[code](https:\u002F\u002Fgithub.com\u002Feladhoffer\u002Fquantized.pytorch)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Feladhoffer\u002Fquantized.pytorch?style=social)](https:\u002F\u002Fgithub.com\u002Feladhoffer\u002Fquantized.pytorch)\n- [[NeurIPS](https:\u002F\u002Fpapers.nips.cc\u002Fpaper\u002F2018\u002Ffile\u002F335d3d1cd7ef05ec77714a215134914c-Paper.pdf)] Training Deep Neural Networks with 8-bit Floating Point Numbers\n- [[Res Math Sci](https:\u002F\u002Farxiv.org\u002Fabs\u002F1808.05240)] Blended coarse gradient descent for full quantization of deep neural networks\n- [[TCAD](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F8412533\u002F)] XNOR Neural Engine: A Hardware Accelerator IP for 21.6-fJ\u002Fop Binary Neural Network Inference\n- [[TRETS](http:\u002F\u002Farxiv.org\u002Fabs\u002F1809.04570)] FINN-R: An 
End-to-End Deep-Learning Framework for Fast Exploration of Quantized Neural Networks\n- [[TVLSI](http:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F8103902\u002F)] An Energy-Efficient Architecture for Binary Weight Convolutional Neural Networks\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F1811.09426)] Joint Neural Architecture Search and Quantization [[code](https:\u002F\u002Fgithub.com\u002Fyukang2017\u002FNAS-quantization)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fyukang2017\u002FNAS-quantization?style=social)](https:\u002F\u002Fgithub.com\u002Fyukang2017\u002FNAS-quantization)\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F1812.01965)] Training Competitive Binary Neural Networks from Scratch [[code](https:\u002F\u002Fgithub.com\u002Fhpi-xnor\u002FBMXNet-v2)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fhpi-xnor\u002FBMXNet-v2?style=social)](https:\u002F\u002Fgithub.com\u002Fhpi-xnor\u002FBMXNet-v2)\n\n### 2017\n\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_cvpr_2017\u002Fpapers\u002FCai_Deep_Learning_With_CVPR_2017_paper.pdf)] Deep Learning with Low Precision by Half-wave Gaussian Quantization [[code](https:\u002F\u002Fgithub.com\u002Fzhaoweicai\u002Fhwgq)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fzhaoweicai\u002Fhwgq?style=social)](https:\u002F\u002Fgithub.com\u002Fzhaoweicai\u002Fhwgq)\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_cvpr_2017\u002Fpapers\u002FJuefei-Xu_Local_Binary_Convolutional_CVPR_2017_paper.pdf)] Local Binary Convolutional Neural Networks [[code](https:\u002F\u002Fgithub.com\u002Fjuefeix\u002Flbcnn.torch)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fjuefeix\u002Flbcnn.torch?style=social)](https:\u002F\u002Fgithub.com\u002Fjuefeix\u002Flbcnn.torch)\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1705.09864.pdf)] BMXNet: An Open-Source Binary Neural Network Implementation Based on MXNet [[code](https:\u002F\u002Fgithub.com\u002Fhpi-xnor)]\n- [[FPGA](https:\u002F\u002Farxiv.org\u002Fabs\u002F1612.07119)] FINN: A Framework for Fast, Scalable Binarized Neural Network Inference\n- [[ICASSP](https:\u002F\u002Farxiv.org\u002Fabs\u002F1702.08171)] Fixed-point optimization of deep neural networks with adaptive step size retraining\n- [[ICCV](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ICCV_2017\u002Fpapers\u002FBulat_Binarized_Convolutional_Landmark_ICCV_2017_paper.pdf)] Binarized Convolutional Landmark Localizers for Human Pose Estimation and Face Alignment with Limited Resources [[code](https:\u002F\u002Fwww.adrianbulat.com\u002Fbinary-cnn-landmarks)]\n- [[ICCV](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ICCV_2017\u002Fpapers\u002FLi_Performance_Guaranteed_Network_ICCV_2017_paper.pdf)] Performance Guaranteed Network Acceleration via High-Order Residual Quantization\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fpdf?id=HyQJ-mclg)] Incremental Network Quantization: Towards Lossless CNNs with Low-Precision Weights [[code](https:\u002F\u002Fgithub.com\u002FMxbonn\u002FINQ-pytorch)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FMxbonn\u002FINQ-pytorch?style=social)](https:\u002F\u002Fgithub.com\u002FMxbonn\u002FINQ-pytorch)\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fpdf?id=S1oWlN9ll)] Loss-aware Binarization of Deep Networks [[code](https:\u002F\u002Fgithub.com\u002Fhoulu369\u002FLoss-aware-Binarization)] [![GitHub 
stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fhoulu369\u002FLoss-aware-Binarization?style=social)](https:\u002F\u002Fgithub.com\u002Fhoulu369\u002FLoss-aware-Binarization)\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fpdf?id=HJGwcKclx)] Soft Weight-Sharing for Neural Network Compression\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fpdf?id=S1_pAu9xl)] Trained Ternary Quantization [[code](https:\u002F\u002Fgithub.com\u002FTropComplique\u002Ftrained-ternary-quantization)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FTropComplique\u002Ftrained-ternary-quantization?style=social)](https:\u002F\u002Fgithub.com\u002FTropComplique\u002Ftrained-ternary-quantization)\n- [[IPDPSW](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F7965031)] On-Chip Memory Based Binarized Convolutional Deep Neural Network Applying Batch Normalization Free Technique on an FPGA\n- [[InterSpeech](https:\u002F\u002Fwww.isca-speech.org\u002Farchive\u002FInterspeech_2017\u002Fpdfs\u002F1343.PDF)] Binary Deep Neural Networks for Speech Recognition\n- [[JETC](https:\u002F\u002Farxiv.org\u002Fabs\u002F1702.06392)] A GPU-Outperforming FPGA Accelerator Architecture for Binary Convolutional Neural Networks\n- [[MWSCAS](http:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F8052915\u002F)] Deep learning binary neural network on an FPGA\n- [[NeurIPS](https:\u002F\u002Farxiv.org\u002Fabs\u002F1711.11294)] Towards Accurate Binary Convolutional Neural Network [[code](https:\u002F\u002Fgithub.com\u002Flayog\u002FAccurate-Binary-Convolution-Network)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Flayog\u002FAccurate-Binary-Convolution-Network?style=social)](https:\u002F\u002Fgithub.com\u002Flayog\u002FAccurate-Binary-Convolution-Network)\n- [[Neurocomputing](http:\u002F\u002Fwww.doc.ic.ac.uk\u002F~wl\u002Fpapers\u002F17\u002Fneuro17sl0.pdf)] FP-BNN: Binarized neural network on FPGA\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F1706.02393)] ShiftCNN: Generalized Low-Precision Architecture for Inference of Convolutional Neural Networks [[code](https:\u002F\u002Fgithub.com\u002Fgudovskiy\u002FShiftCNN)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fgudovskiy\u002FShiftCNN?style=social)](https:\u002F\u002Fgithub.com\u002Fgudovskiy\u002FShiftCNN)\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1705.01462.pdf)] Ternary Neural Networks with Fine-Grained Quantization\n\n### 2016\n\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_cvpr_2016\u002Fhtml\u002FWu_Quantized_Convolutional_Neural_CVPR_2016_paper.html)] Quantized convolutional neural networks for mobile devices. 
[code](https:\u002F\u002Fgithub.com\u002Fjiaxiang-wu\u002Fquantized-cnn)\n- [[arXiv](http:\u002F\u002Farxiv.org\u002Fabs\u002F1606.06160)] DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients [[code](https:\u002F\u002Fgithub.com\u002Ftensorpack\u002Ftensorpack\u002Ftree\u002Fmaster\u002Fexamples\u002FDoReFa-Net)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Ftensorpack\u002Ftensorpack?style=social)](https:\u002F\u002Fgithub.com\u002Ftensorpack\u002Ftensorpack)\n- [[ECCV](https:\u002F\u002Farxiv.org\u002Fabs\u002F1603.05279)] XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks [[code](https:\u002F\u002Fgithub.com\u002Fallenai\u002FXNOR-Net)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fallenai\u002FXNOR-Net?style=social)](https:\u002F\u002Fgithub.com\u002Fallenai\u002FXNOR-Net)\n- [[ICASSP](https:\u002F\u002Farxiv.org\u002Fabs\u002F1512.01322)] Fixed-point Performance Analysis of Recurrent Neural Networks\n- [[NeurIPS](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1602.02830)] Binarized Neural Networks: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1 [[code](https:\u002F\u002Fgithub.com\u002Fitayhubara\u002FBinaryNet)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fitayhubara\u002FBinaryNet?style=social)](https:\u002F\u002Fgithub.com\u002Fitayhubara\u002FBinaryNet)\n- [[NeurIPS](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1605.04711.pdf)] Ternary weight networks [[code](https:\u002F\u002Fgithub.com\u002Ffengfu-chris\u002Fcaffe-twns)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Ffengfu-chris\u002Fcaffe-twns?style=social)](https:\u002F\u002Fgithub.com\u002Ffengfu-chris\u002Fcaffe-twns)\n\n### 2015\n\n- [[ICML](https:\u002F\u002Farxiv.org\u002Fabs\u002F1601.06071)] Bitwise Neural Networks\n- [[NeurIPS](https:\u002F\u002Farxiv.org\u002Fabs\u002F1511.00363)] BinaryConnect: Training Deep Neural Networks with binary weights during propagations [[code](https:\u002F\u002Fgithub.com\u002FMatthieuCourbariaux\u002FBinaryConnect)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FMatthieuCourbariaux\u002FBinaryConnect?style=social)](https:\u002F\u002Fgithub.com\u002FMatthieuCourbariaux\u002FBinaryConnect)\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F1511.06488)] Resiliency of Deep Neural Networks under quantizations\n\n## Related Repositories\n\n- [Awesome Efficient LLM & Diffusion](https:\u002F\u002Fgithub.com\u002Fefficient-ml\u002Fawesome-efficient-llm-diffusion)\n- [Awesome Quantization Papers](https:\u002F\u002Fgithub.com\u002FZhen-Dong\u002FAwesome-Quantization-Papers)\n\n## Star History\n\n\u003Ca href=\"https:\u002F\u002Fwww.star-history.com\u002F?repos=Efficient-ML%2FAwesome-Model-Quantization&type=date&legend=top-left\">\n \u003Cpicture>\n   \u003Csource media=\"(prefers-color-scheme: dark)\" srcset=\"https:\u002F\u002Fapi.star-history.com\u002Fchart?repos=Efficient-ML\u002FAwesome-Model-Quantization&type=date&theme=dark&legend=top-left\" \u002F>\n   \u003Csource media=\"(prefers-color-scheme: light)\" srcset=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FEfficient-ML_Awesome-Model-Quantization_readme_055a8a66d58e.png\" \u002F>\n   \u003Cimg alt=\"Star History Chart\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FEfficient-ML_Awesome-Model-Quantization_readme_055a8a66d58e.png\" \u002F>\n 
\u003C\u002Fpicture>\n\u003C\u002Fa>\n","# 令人惊叹的模型量化 [![Awesome](https:\u002F\u002Fawesome.re\u002Fbadge.svg)](https:\u002F\u002Fawesome.re)\n\n本仓库收集了关于模型量化的论文、文档和代码，供所有希望研究这一领域的人员参考。我们正在持续改进该项目。欢迎提交该仓库尚未收录的相关工作（论文、代码库）。\n\n- [基准测试](#benchmarks)\n- [综述论文](#survey-papers)\n- [论文](#papers)\n  - [2026年](#2026)\n  - [2025年](#2025)\n  - [2024年](#2024)\n  - [2023年](#2023)\n  - [2022–2015年](#2022-2015)\n- [相关仓库](#related-repositories)\n\n## 基准测试\n\n**1. BiBench：网络二值化的基准测试与分析** [[论文](https:\u002F\u002Fproceedings.mlr.press\u002Fv202\u002Fqin23a.html)] [[代码](https:\u002F\u002Fgithub.com\u002Fhtqin\u002FBiBench)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fhtqin\u002FBiBench?style=social)](https:\u002F\u002Fgithub.com\u002Fhtqin\u002FBiBench)\n\n**会议:** ICML 2023\n\n**作者:** 秦浩桐、张明远、丁一夫、李傲宇、蔡中刚、刘子威、费舍尔·余、刘向龙。\n\n![survey](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FEfficient-ML_Awesome-Model-Quantization_readme_05e461b390a9.png)\n\n\u003Cdetails>\u003Csummary>Bibtex\u003C\u002Fsummary>\u003Cpre>\u003Ccode>@inproceedings{qin2023bibench,\n  title={BiBench: Benchmarking and Analyzing Network Binarization},\n  author={Qin, Haotong and Zhang, Mingyuan and Ding, Yifu and Li, Aoyu and Cai, Zhongang and Liu, Ziwei and Yu, Fisher and Liu, Xianglong},\n  booktitle={International Conference on Machine Learning (ICML)},\n  year={2023}\n}\u003C\u002Fcode>\u003C\u002Fpre>\u003C\u002Fdetails>\n\n**2. LLaMA3 量化实证研究：从 LLM 到 MLLM** [[论文](https:\u002F\u002Flink.springer.com\u002Farticle\u002F10.1007\u002Fs44267-024-00070-x)] [[代码](https:\u002F\u002Fgithub.com\u002FMacaronlin\u002FLLaMA3-Quantization)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FMacaronlin\u002FLLaMA3-Quantization?style=social)](https:\u002F\u002Fgithub.com\u002FMacaronlin\u002FLLaMA3-Quantization)\n\n**会议:** Visual Intelligence 2024\n\n**作者:** 黄伟、郑星宇、马旭东、秦浩桐、吕成涛、陈宏、罗杰、齐晓娟、刘向龙、米凯莱·马尼奥。\n\n![LLaMA3 量化基准测试](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FEfficient-ML_Awesome-Model-Quantization_readme_901b7e5ec9b6.png)\n\n\u003Cdetails>\u003Csummary>Bibtex\u003C\u002Fsummary>\u003Cpre>\u003Ccode>@article{huang2024empirical,\n  title={An empirical study of llama3 quantization: From llms to mllms},\n  author={Huang, Wei and Zheng, Xingyu and Ma, Xudong and Qin, Haotong and Lv, Chengtao and Chen, Hong and Luo, Jie and Qi, Xiaojuan and Liu, Xianglong and Magno, Michele},\n  journal={Visual Intelligence},\n  volume={2},\n  number={1},\n  pages={36},\n  year={2024},\n  publisher={Springer}\n}\u003C\u002Fcode>\u003C\u002Fpre>\u003C\u002Fdetails>\n\n**3. 
Qwen3 量化实证研究** [[论文](https:\u002F\u002Farxiv.org\u002Fabs\u002F2505.02214)] [[代码](https:\u002F\u002Fgithub.com\u002FEfficient-ML\u002FQwen3-Quantization)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FEfficient-ML\u002FQwen3-Quantization?style=social)](https:\u002F\u002Fgithub.com\u002FEfficient-ML\u002FQwen3-Quantization)\n\n**会议:** Visual Intelligence 2026\n\n**作者:** 郑星宇、李雨叶、楚浩然、冯岳、马旭东、罗杰、郭金阳、秦浩桐、米凯莱·马尼奥、刘向龙。\n\n![qwen3](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FEfficient-ML_Awesome-Model-Quantization_readme_ac5c15d3b2df.png)\n\n\u003Cdetails>\u003Csummary>Bibtex\u003C\u002Fsummary>\u003Cpre>\u003Ccode>@article{zheng2025empirical,\n  title={An empirical study of qwen3 quantization},\n  author={Zheng, Xingyu and Li, Yuye and Chu, Haoran and Feng, Yue and Ma, Xudong and Luo, Jie and Guo, Jinyang and Qin, Haotong and Magno, Michele and Liu, Xianglong},\n  journal={arXiv preprint arXiv:2505.02214},\n  year={2025}\n}\u003C\u002Fcode>\u003C\u002Fpre>\u003C\u002Fdetails>\n\n**4. LLMC：使用多功能压缩工具包对大型语言模型量化进行基准测试** [[论文](https:\u002F\u002Faclanthology.org\u002F2024.emnlp-industry.12\u002F)] [[代码](https:\u002F\u002Fgithub.com\u002FModelTC\u002FLightCompress)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FModelTC\u002FLightCompress?style=social)](https:\u002F\u002Fgithub.com\u002FModelTC\u002FLightCompress)\n\n**会议:** EMNLP 2024 行业专场\n\n**作者:** 龚瑞豪、杨勇、顾世桥、黄宇诗、吕成涛、张云晨、刘向龙、陶大成。\n\n![llmc](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FEfficient-ML_Awesome-Model-Quantization_readme_19bda43d7430.png)\n\n\u003Cdetails>\u003Csummary>Bibtex\u003C\u002Fsummary>\u003Cpre>\u003Ccode>@inproceedings{gong2024llmc,\n  title={Llmc: Benchmarking large language model quantization with a versatile compression toolkit},\n  author={Gong, Ruihao and Yong, Yang and Gu, Shiqiao and Huang, Yushi and Lv, Chengtao and Zhang, Yunchen and Tao, Dacheng and Liu, Xianglong},\n  booktitle={Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: Industry Track},\n  pages={132--152},\n  year={2024}\n}\u003C\u002Fcode>\u003C\u002Fpre>\u003C\u002Fdetails>\n\n**5. RobustMQ：量化模型鲁棒性基准测试** [[论文](https:\u002F\u002Flink.springer.com\u002Farticle\u002F10.1007\u002Fs44267-023-00031-w)]\n\n**会议:** Visual Intelligence 2023\n\n**作者:** 肖义松、刘爱珊、张天元、秦浩桐、郭金阳、刘向龙。\n\n![robustmq](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FEfficient-ML_Awesome-Model-Quantization_readme_59c0d2511cab.png)\n\n\u003Cdetails>\u003Csummary>Bibtex\u003C\u002Fsummary>\u003Cpre>\u003Ccode>@article{xiao2023robustmq,\n  title={Robustmq: benchmarking robustness of quantized models},\n  author={Xiao, Yisong and Liu, Aishan and Zhang, Tianyuan and Qin, Haotong and Guo, Jinyang and Liu, Xianglong},\n  journal={Visual Intelligence},\n  volume={1},\n  number={1},\n  pages={30},\n  year={2023},\n  publisher={Springer}\n}\u003C\u002Fcode>\u003C\u002Fpre>\u003C\u002Fdetails>\n\n## 综述论文\n\n**1. 
二值神经网络：综述** [[论文](https:\u002F\u002Fwww.sciencedirect.com\u002Fscience\u002Farticle\u002Fabs\u002Fpii\u002FS0031320320300856)] [[博客](https:\u002F\u002Fmp.weixin.qq.com\u002Fs\u002FQGva6fow9tad_daZ_G2p0Q)]\n\n**会议\u002F期刊**：模式识别 2020\n\n**作者**：Haotong Qin、Ruihao Gong、Xianglong Liu、Xiao Bai、Jingkuan Song、Nicu Sebe。\n\n\n![survey](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FEfficient-ML_Awesome-Model-Quantization_readme_4e2933bc83c0.png)\n\n\u003Cdetails>\u003Csummary>Bibtex\u003C\u002Fsummary>\u003Cpre>\u003Ccode>@article{Qin:pr20_bnn_survey,\n    title = \"Binary neural networks: A survey\",\n    author = \"Haotong Qin and Ruihao Gong and Xianglong Liu and Xiao Bai and Jingkuan Song and Nicu Sebe\",\n    journal = \"Pattern Recognition\",\n    volume = \"105\",\n    pages = \"107281\",\n    year = \"2020\"\n}\u003C\u002Fcode>\u003C\u002Fpre>\u003C\u002Fdetails>\n\n**2. 低比特大语言模型综述：基础、系统与算法** [[论文](https:\u002F\u002Fwww.sciencedirect.com\u002Fscience\u002Farticle\u002Fpii\u002FS0893608025007361)]\n\n**会议\u002F期刊**：神经网络 2025\n\n**作者**：Ruihao Gong、Yifu Ding、Zining Wang、Chengtao Lv、Xingyu Zheng、Jinyang Du、Yang Yong、Shiqiao Gu、Haotong Qin、Jinyang Guo、Dahua Lin、Michele Magno、Xianglong Liu。\n\n![低比特大语言模型综述](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FEfficient-ML_Awesome-Model-Quantization_readme_2117f5830dc5.png)\n\n\u003Cdetails>\u003Csummary>Bibtex\u003C\u002Fsummary>\u003Cpre>\u003Ccode>@article{gong2025survey,\n  title={A survey of low-bit large language models: Basics, systems, and algorithms},\n  author={Gong, Ruihao and Ding, Yifu and Wang, Zining and Lv, Chengtao and Zheng, Xingyu and Du, Jinyang and Yong, Yang and Gu, Shiqiao and Qin, Haotong and Guo, Jinyang and others},\n  journal={Neural networks},\n  pages={107856},\n  year={2025},\n  publisher={Elsevier}\n}\u003C\u002Fcode>\u003C\u002Fpre>\u003C\u002Fdetails>\n\n**3. 
深度神经网络的低比特模型量化：综述** [[论文](https:\u002F\u002Farxiv.org\u002Fabs\u002F2505.05530)]\n\n**会议\u002F期刊**：arXiv 2025\n\n**作者**：Kai Liu、Qian Zheng、Kaiwen Tao、Zhiteng Li、Haotong Qin、Wenbo Li、Yong Guo、Xianglong Liu、Linghe Kong、Guihai Chen、Yulun Zhang、Xiaokang Yang。\n\n![quant-survey](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FEfficient-ML_Awesome-Model-Quantization_readme_fb1df41ea7ae.png)\n\n\u003Cdetails>\u003Csummary>Bibtex\u003C\u002Fsummary>\u003Cpre>\u003Ccode>@article{liu2025low,\n  title={Low-bit model quantization for deep neural networks: A survey},\n  author={Liu, Kai and Zheng, Qian and Tao, Kaiwen and Li, Zhiteng and Qin, Haotong and Li, Wenbo and Guo, Yong and Liu, Xianglong and Kong, Linghe and Chen, Guihai and others},\n  journal={arXiv preprint arXiv:2505.05530},\n  year={2025}\n}\u003C\u002Fcode>\u003C\u002Fpre>\u003C\u002Fdetails>\n\n## 论文\n\n### 2026\n\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=7QZanjCD6M)] PT²-LLM：面向大型语言模型的后训练三值化 [[代码](https:\u002F\u002Fgithub.com\u002FXIANGLONGYAN\u002FPT2-LLM)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FXIANGLONGYAN\u002FPT2-LLM?style=social)](https:\u002F\u002Fgithub.com\u002FXIANGLONGYAN\u002FPT2-LLM)\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=HD7tuVakmR)] Quant-dLLM：面向扩散型大型语言模型的后训练极低比特量化\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=3AnRMvlVDw)] DVD-Quant：无数据视频扩散Transformer量化\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=AH7hbA7Zkk)] Q&C：高效生成中量化与缓存的结合\n- [[CVPR Findings](https:\u002F\u002Farxiv.org\u002Fabs\u002F2503.21970)] Q-MambaIR：用于高效图像修复的高精度量化Mamba模型\n- [[ICLR](https:\u002F\u002Farxiv.org\u002Fabs\u002F2509.21302)] 量化视觉几何基础Transformer\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=XAXT7A8EWh)] 视频抠图的后训练量化\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=XJXZXuTj11)] QVGen：推动量化视频生成模型的极限\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=4TAG3aQljJ)] QuantSparse：通过模型量化和注意力稀疏化全面压缩视频扩散Transformer [[代码](https:\u002F\u002Fgithub.com\u002Fwlfeng0509\u002FQuantSparse)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fwlfeng0509\u002FQuantSparse?style=social)](https:\u002F\u002Fgithub.com\u002Fwlfeng0509\u002FQuantSparse)\n- [[AAAI](https:\u002F\u002Fojs.aaai.org\u002Findex.php\u002FAAAI\u002Farticle\u002Fview\u002F40123)] 一阶误差很重要：对量化大型语言模型的精确补偿 [[代码](https:\u002F\u002Fgithub.com\u002FXingyu-Zheng\u002FFOEM)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FXingyu-Zheng\u002FFOEM?style=social)](https:\u002F\u002Fgithub.com\u002FXingyu-Zheng\u002FFOEM)\n- [[AAAI](https:\u002F\u002Farxiv.org\u002Fabs\u002F2503.06564)] TR-DQ：时间-旋转扩散量化\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=tO3ASKZlok)] TurboQuant：近似最优失真率的在线向量量化\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=VQIvBpL5ag)] 面向LLM联合量化与稀疏化的最优大脑恢复 [[代码](https:\u002F\u002Fgithub.com\u002Fcsguoh\u002FOBR)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fcsguoh\u002FOBR?style=social)](https:\u002F\u002Fgithub.com\u002Fcsguoh\u002FOBR)\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=XPIEkFdEDi)] AnyBCQ：面向多精度LLM的硬件高效灵活二进制编码量化 [[代码](https:\u002F\u002Fgithub.com\u002Fnaver-aics\u002Fanybcq)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fnaver-aics\u002Fanybcq?style=social)](https:\u002F\u002Fgithub.com\u002Fnaver-aics\u002Fanybcq)\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=9CZzD5LWdy)] Tequila：无死区的大型语言模型三值量化\n- 
[[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=V85HbymBLW)] LogART：推动高效对数后训练量化的极限 [[代码](https:\u002F\u002Fgithub.com\u002Flogart-lab\u002Flogart)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Flogart-lab\u002Flogart?style=social)](https:\u002F\u002Fgithub.com\u002Flogart-lab\u002Flogart)\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=1USeVjsKau)] ParoQuant：面向推理LLM高效推理的成对旋转量化 [[代码](https:\u002F\u002Fgithub.com\u002Fz-lab\u002Fparoquant)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fz-lab\u002Fparoquant?style=social)](https:\u002F\u002Fgithub.com\u002Fz-lab\u002Fparoquant)\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=VpZ8YYdBmT)] 通过4位广义正态浮点格式提升分块式LLM量化\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F2602.16018)] D²Quant：LLM的高精度低比特后训练权重量化\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F2601.03170)] QuantLRM：通过微调信号对大型推理模型进行量化\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F2602.15391)] SliderQuant：LLM的高精度后训练量化\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F2602.04719)] 什么使低比特量化感知训练在推理LLM中有效？一项系统性研究\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=yjr2jX41qO)] 面向高效长上下文推理的通道感知混合精度量化\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=ATpchFiBQi)] CodeQuant：统一聚类与量化，以增强低精度专家混合模型中的异常值平滑\n- [[ICLR](https:\u002F\u002Farxiv.org\u002Fabs\u002F2510.11696)] QeRL：超越效率——面向LLM的量化增强强化学习 [[代码](https:\u002F\u002Fgithub.com\u002FNVlabs\u002FQeRL)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FNVlabs\u002FQeRL?style=social)](https:\u002F\u002Fgithub.com\u002FNVlabs\u002FQeRL)\n- [[ICLR](https:\u002F\u002Farxiv.org\u002Fabs\u002F2602.03782)] AutoQVLA：视觉-语言-行动模型量化中并非所有通道都同等重要\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=g2l9bg9DWx)] 通过子空间保持和网格量化实现低比特Muon\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=DAZvMAlZRp)] 面向视觉自回归模型的移位求和量化\n- [[ICLR](https:\u002F\u002Farxiv.org\u002Fabs\u002F2602.03472)] 面向目标检测模型的内点中心后训练量化\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=yiMlVBAoQi)] 在理论泛化保证下高效量化专家混合模型\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=tY9yPAT3PU)] BBQ：通过贝尔盒量化提升量化熵\n- [[ICLR](https:\u002F\u002Farxiv.org\u002Fabs\u002F2505.06653)] 通过4位块优化浮点（BOF4）提升分块式LLM量化：分析与变体 [[代码](https:\u002F\u002Fgithub.com\u002Fifnspaml\u002Fbof4)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fifnspaml\u002Fbof4?style=social)](https:\u002F\u002Fgithub.com\u002Fifnspaml\u002Fbof4)\n- [[ICLR](https:\u002F\u002Farxiv.org\u002Fabs\u002F2510.18259)] 高维线性回归下的量化学习\n- [[ICLR](https:\u002F\u002Farxiv.org\u002Fabs\u002F2509.25214)] 即时适应量化：面向高效微调量化LLM的配置感知LoRA\n- [[ICLR](https:\u002F\u002Farxiv.org\u002Fabs\u002F2509.23202)] 拉开FP4量化承诺与实际性能之间的差距 [[代码](https:\u002F\u002Fgithub.com\u002FIST-DASLab\u002FFP-Quant)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FIST-DASLab\u002FFP-Quant?style=social)](https:\u002F\u002Fgithub.com\u002FIST-DASLab\u002FFP-Quant)\n- [[ICLR](https:\u002F\u002Farxiv.org\u002Fabs\u002F2602.11184)] KBVQ-MoE：面向MoE大型语言模型的KLT引导SVD与偏置校正向量量化 [[代码](https:\u002F\u002Fgithub.com\u002Fxuzukang\u002Fkbvq_moe)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fxuzukang\u002Fkbvq_moe?style=social)](https:\u002F\u002Fgithub.com\u002Fxuzukang\u002Fkbvq_moe)\n- [[ICLR](https:\u002F\u002Farxiv.org\u002Fabs\u002F2512.03383)] UniQL：面向自适应边缘LLM的统一量化与低秩压缩 [[代码](https:\u002F\u002Fgithub.com\u002Fenyac-group\u002FUniQL)] [![GitHub 
星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fenyac-group\u002FUniQL?style=social)](https:\u002F\u002Fgithub.com\u002Fenyac-group\u002FUniQL)\n- [[ICLR](https:\u002F\u002Farxiv.org\u002Fabs\u002F2508.01077)] 神经网络量化中的格子几何：GPTQ与Babai算法等价性的简短证明\n- [[ICLR](https:\u002F\u002Farxiv.org\u002Fabs\u002F2509.03472)] DPQuant：通过动态量化调度实现高效且私密的模型训练\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fpdf\u002Fee0ea14cd2283b1fee1902a6811796b443849c5c.pdf)] 朝着超低比特推理LLM的量化感知训练迈进\n- [[ICLR](https:\u002F\u002Farxiv.org\u002Fabs\u002F2510.21314)] 浮点量化下自适应优化器的收敛性分析\n- [[ICLR](https:\u002F\u002Farxiv.org\u002Fabs\u002F2510.06213)] 训练动态影响后训练量化鲁棒性 [[代码](https:\u002F\u002Fgithub.com\u002Faldakata\u002FTrainingDynamicsQuantizationRobustness)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Faldakata\u002FTrainingDynamicsQuantizationRobustness?style=social)](https:\u002F\u002Fgithub.com\u002Faldakata\u002FTrainingDynamicsQuantizationRobustness)\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=pjMDZJd4rT)] SSDi8：面向状态空间对偶的准确高效8比特量化\n- [[ICLR](https:\u002F\u002Farxiv.org\u002Fabs\u002F2507.18553)] LLM量化的几何：GPTQ即Babai最近平面算法\n- [[ICLR](https:\u002F\u002Farxiv.org\u002Fabs\u002F2601.21238)] PTQ4ARVG：面向自回归视觉生成模型的后训练量化 [[代码](https:\u002F\u002Fgithub.com\u002FBienLuky\u002FPTQ4ARVG)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FBienLuky\u002FPTQ4ARVG?style=social)](https:\u002F\u002Fgithub.com\u002FBienLuky\u002FPTQ4ARVG)\n- [[ICLR](https:\u002F\u002Farxiv.org\u002Fabs\u002F2509.17428)] QWHA：面向大型语言模型参数高效微调的量化感知沃尔什-哈达玛适配 [[代码](https:\u002F\u002Fgithub.com\u002Fvantaa89\u002Fqwha)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fvantaa89\u002Fqwha?style=social)](https:\u002F\u002Fgithub.com\u002Fvantaa89\u002Fqwha)\n- [[ICLR](https:\u002F\u002Farxiv.org\u002Fabs\u002F2602.01289)] 扩散模型后训练量化的梯度对齐校准\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=nFjj8NEBqv)] SERQ：面向LLM量化的显著性感知低秩误差重构\n- [[ICLR](https:\u002F\u002Farxiv.org\u002Fabs\u002F2509.22935)] 计算最优的量化感知训练\n- [[ICLR](https:\u002F\u002Farxiv.org\u002Fabs\u002F2505.18610)] PM-KVQ：面向长上下文LLM的渐进式混合精度KV缓存量化 [[代码](https:\u002F\u002Fgithub.com\u002Fthu-nics\u002FPM-KVQ)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fthu-nics\u002FPM-KVQ?style=social)](https:\u002F\u002Fgithub.com\u002Fthu-nics\u002FPM-KVQ)\n- [[ICLR](https:\u002F\u002Farxiv.org\u002Fabs\u002F2509.23500)] 超越异常值：量化下优化器的研究\n- [[ICLR](https:\u002F\u002Farxiv.org\u002Fabs\u002F2505.11695)] Qronos：通过塑造未来来修正过去……在后训练量化中\n- [[ICLR](https:\u002F\u002Farxiv.org\u002Fabs\u002F2508.02343)] MicroMix：面向大型语言模型的高效混合精度量化，采用微缩格式 [[代码](https:\u002F\u002Fgithub.com\u002Flwy2020\u002FMicroMix)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Flwy2020\u002FMicroMix?style=social)](https:\u002F\u002Fgithub.com\u002Flwy2020\u002FMicroMix)\n- [[ICLR](https:\u002F\u002Farxiv.org\u002Fabs\u002F2602.04929)] TurboBoA：无需反向传播即可实现更快、更精确的注意力感知量化\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=FDdOD3qwS7)] 超越均匀性：扩散模型后训练量化的样本与频率元加权\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=LWYZ1nNkJl)] 重新思考基于补偿的LLM量化的残差误差\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=8tDIzHFOx6)] SPR²Q：面向图像超分辨率的静态优先级整流路由量化 [[代码](https:\u002F\u002Fgithub.com\u002Fmomo5-a11\u002FSPR2Q)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fmomo5-a11\u002FSPR2Q?style=social)](https:\u002F\u002Fgithub.com\u002Fmomo5-a11\u002FSPR2Q)\n- 
[[ICLR](https:\u002F\u002Farxiv.org\u002Fabs\u002F2510.26771)] STaMP：序列转换与混合精度，用于低精度激活量化\n\n### 2025年\n\n- [[ICML](https:\u002F\u002Ficml.cc\u002Fvirtual\u002F2025\u002Fposter\u002F45429)] Q-VDiT：迈向视频生成扩散Transformer的精确量化与蒸馏 [[代码](https:\u002F\u002Fgithub.com\u002Fcantbebetter2\u002FQ-VDiT)] [![GitHub星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fcantbebetter2\u002FQ-VDiT?style=social)](https:\u002F\u002Fgithub.com\u002Fcantbebetter2\u002FQ-VDiT)\n- [[AAAI](https:\u002F\u002Fojs.aaai.org\u002Findex.php\u002FAAAI\u002Farticle\u002Fview\u002F33823)] MPQ-DM：面向极低比特扩散模型的混合精度量化\n- [[ICML](https:\u002F\u002Ficml.cc\u002Fvirtual\u002F2025\u002Fposter\u002F45388)] SliM-LLM：基于显著性驱动的大型语言模型混合精度量化 [[代码](https:\u002F\u002Fgithub.com\u002FAaronhuang-778\u002FSliM-LLM)] [![GitHub星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FAaronhuang-778\u002FSliM-LLM?style=social)](https:\u002F\u002Fgithub.com\u002FAaronhuang-778\u002FSliM-LLM)\n- [[TPAMI](https:\u002F\u002Fwww.computer.org\u002Fcsdl\u002Fjournal\u002Ftp\u002F2025\u002F10\u002F11060852\u002F281Hxm5TK2Q)] BiVM：用于高效视频抠图的精确二值化神经网络\n- [[NeurIPS](https:\u002F\u002Fopenreview.net\u002Fforum?id=e8pm93koQU)] S²Q-VDiT：基于显著性数据与稀疏令牌蒸馏的精确量化视频扩散Transformer [[代码](https:\u002F\u002Fgithub.com\u002Fwlfeng0509\u002FS2Q-VDiT)] [![GitHub星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fwlfeng0509\u002FS2Q-VDiT?style=social)](https:\u002F\u002Fgithub.com\u002Fwlfeng0509\u002FS2Q-VDiT)\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2025\u002Fpapers\u002FZhu_PassionSR_Post-Training_Quantization_with_Adaptive_Scale_in_One-Step_Diffusion_based_CVPR_2025_paper.pdf)] PassionSR：基于单步扩散的图像超分辨率中的自适应尺度后训练量化 [[代码](https:\u002F\u002Fgithub.com\u002Flibozhu03\u002FPassionSR)] [![GitHub星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Flibozhu03\u002FPassionSR?style=social)](https:\u002F\u002Fgithub.com\u002Flibozhu03\u002FPassionSR)\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=ZU8OdDLTts)] ARB-LLM：大型语言模型的交替精炼二值化 [[代码](https:\u002F\u002Fgithub.com\u002FZHITENGLI\u002FARB-LLM)] [![GitHub星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FZHITENGLI\u002FARB-LLM?style=social)](https:\u002F\u002Fgithub.com\u002FZHITENGLI\u002FARB-LLM)\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=cCE46s1obO)] BinaryDM：面向高效扩散模型的精确权重二值化 [[代码](https:\u002F\u002Fgithub.com\u002FXingyu-Zheng\u002FBinaryDM)] [![GitHub星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FXingyu-Zheng\u002FBinaryDM?style=social)](https:\u002F\u002Fgithub.com\u002FXingyu-Zheng\u002FBinaryDM)\n- [[ICML](https:\u002F\u002Fproceedings.mlr.press\u002Fv267\u002Fsun25l.html)] FlatQuant：对于LLM量化而言，平坦性至关重要 [[代码](https:\u002F\u002Fgithub.com\u002Fruikangliu\u002FFlatQuant)] [![GitHub星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fruikangliu\u002FFlatQuant?style=social)](https:\u002F\u002Fgithub.com\u002Fruikangliu\u002FFlatQuant)\n- [[ICML](https:\u002F\u002Ficml.cc\u002Fvirtual\u002F2025\u002Fposter\u002F44438)] RoSTE：一种高效的量化感知监督微调方法，适用于大型语言模型 [[代码](https:\u002F\u002Fgithub.com\u002FOptimAI-Lab\u002FRoSTE)] [![GitHub星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FOptimAI-Lab\u002FRoSTE?style=social)](https:\u002F\u002Fgithub.com\u002FOptimAI-Lab\u002FRoSTE)\n- [[ICML](https:\u002F\u002Ficml.cc\u002Fvirtual\u002F2025\u002Fposter\u002F43984)] GANQ：面向大型语言模型的GPU自适应非均匀量化\n- [[ICML](https:\u002F\u002Ficml.cc\u002Fvirtual\u002F2025\u002Fposter\u002F43551)] 调制扩散：通过调制量化加速生成式建模 
[[代码](https:\u002F\u002Fgithub.com\u002FWeizhiGao\u002FMoDiff)] [![GitHub星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FWeizhiGao\u002FMoDiff?style=social)](https:\u002F\u002Fgithub.com\u002FWeizhiGao\u002FMoDiff)\n- [[NeurIPS](https:\u002F\u002Fneurips.cc\u002Fvirtual\u002F2025\u002Fposter\u002F118539)] DartQuant：用于LLM量化的高效旋转分布校准 [[代码](https:\u002F\u002Fgithub.com\u002FCAS-CLab\u002FDartQuant)] [![GitHub星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FCAS-CLab\u002FDartQuant?style=social)](https:\u002F\u002Fgithub.com\u002FCAS-CLab\u002FDartQuant)\n- [[AAAI](https:\u002F\u002Fojs.aaai.org\u002Findex.php\u002FAAAI\u002Farticle\u002Fview\u002F35415)] JAQ：高效架构设计与低比特量化相结合\n- [[AAAI](https:\u002F\u002Fojs.aaai.org\u002Findex.php\u002FAAAI\u002Farticle\u002Fview\u002F33807\u002F35962)] OAC：输出自适应校准，实现LLM后训练量化的精准性\n- [[AAAI](https:\u002F\u002Fojs.aaai.org\u002Findex.php\u002FAAAI\u002Farticle\u002Fview\u002F34039)] 通过衰减时间步长感知损失进行蒸馏，优化量化扩散模型\n- [[AAAI](https:\u002F\u002Fojs.aaai.org\u002Findex.php\u002FAAAI\u002Farticle\u002Fview\u002F32658\u002F40071)] 扩散模型的可量化敏感度\n- [[AAAI](https:\u002F\u002Fojs.aaai.org\u002Findex.php\u002FAAAI\u002Farticle\u002Fview\u002F33913\u002F36068)] TCAQ-DM：面向扩散模型的时间步-通道自适应量化\n- [[ACL](https:\u002F\u002Faclanthology.org\u002F2025.acl-long.498\u002F)] EfficientQAT：面向大型语言模型的高效量化感知训练 [[代码](https:\u002F\u002Fgithub.com\u002FOpenGVLab\u002FEfficientQAT)] [![GitHub星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FOpenGVLab\u002FEfficientQAT?style=social)](https:\u002F\u002Fgithub.com\u002FOpenGVLab\u002FEfficientQAT)\n- [[ACL](https:\u002F\u002Faclanthology.org\u002F2025.acl-long.99\u002F)] L4Q：在大型语言模型上进行参数高效的量化感知微调\n- [[ACL](https:\u002F\u002Faclanthology.org\u002F2025.acl-long.531\u002F)] MoQAE：通过量化感知专家混合实现长上下文LLM推理的混合精度量化\n- [[ACL](https:\u002F\u002Faclanthology.org\u002F2025.acl-long.618\u002F)] 面向大型语言模型稳健4比特量化之异常值安全预训练\n- [[ACL](https:\u002F\u002Faclanthology.org\u002F2025.acl-long.225\u002F)] PTQ1.61：推动极低比特后训练量化方法的实际极限 [[代码](https:\u002F\u002Fgithub.com\u002Fzjq0455\u002FPTQ1.61)] [![GitHub星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fzjq0455\u002FPTQ1.61?style=social)](https:\u002F\u002Fgithub.com\u002Fzjq0455\u002FPTQ1.61)\n- [[ACL](https:\u002F\u002Faclanthology.org\u002F2025.acl-long.1382\u002F)] 统一均匀与二进制编码量化，实现大型语言模型的精准压缩\n- [[ACL](https:\u002F\u002Faclanthology.org\u002F2025.acl-long.1304\u002F)] “给我BF16，否则就让我死”？LLM量化中的准确率与性能权衡\n- [[ACM MM](https:\u002F\u002Facmmm2025.org\u002Faccepted-regular-papers\u002F)] DilateQuant：通过权重扩张实现扩散模型的精准高效量化感知训练\n- [[ACM MM](https:\u002F\u002Fdl.acm.org\u002Fdoi\u002F10.1145\u002F3744239)] 利用伪正向蒸馏学习二值化表示\n- [[ACM MM](https:\u002F\u002Fdl.acm.org\u002Fdoi\u002F10.1145\u002F3746027.3755433)] MQuant：通过后训练量化释放多模态大型语言模型的推理潜力\n- [[ACM MM](https:\u002F\u002Fdl.acm.org\u002Fdoi\u002F10.1145\u002F3746027.3755213)] 推动二值化神经网络在图像超分辨率中的极限，实现信息的平滑传输\n- [[ACM MM](https:\u002F\u002Facmmm2025.org\u002Faccepted-regular-papers\u002F)] 量化遇见OOD：从平坦性视角看可泛化的量化感知训练\n- [[EMNLP](https:\u002F\u002Faclanthology.org\u002F2025.emnlp-main.1799\u002F)] AMQ：为大型语言模型的混合精度仅权重量化提供AutoML支持\n- [[EMNLP](https:\u002F\u002Faclanthology.org\u002F2025.emnlp-main.479\u002F)] 量化是否会影响模型在长输入和长输出任务上的表现？\n- [[ICLR](https:\u002F\u002Ficlr.cc\u002Fvirtual\u002F2025\u002Fposter\u002F28924)] CBQ：面向大型语言模型的跨块量化\n- [[ICLR](https:\u002F\u002Ficlr.cc\u002Fvirtual\u002F2025\u002Fposter\u002F29192)] DGQ：面向文本到图像扩散模型的分布感知分组量化\n- [[ICLR](https:\u002F\u002Ficlr.cc\u002Fvirtual\u002F2025\u002Fposter\u002F30168)] 
LeanQuant：具有损失误差感知网格的精准且可扩展大型语言模型量化\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=rAcgDBdKnP)] OSTQuant：通过正交与缩放变换优化大型语言模型量化，以更好地拟合分布 [[代码](https:\u002F\u002Fgithub.com\u002FBrotherHappy\u002FOSTQuant)] [![GitHub星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FBrotherHappy\u002FOSTQuant?style=social)](https:\u002F\u002Fgithub.com\u002FBrotherHappy\u002FOSTQuant)\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=LB5cKhgOTu)] QERA：一种用于量化误差重建的分析框架 [[代码](https:\u002F\u002Fgithub.com\u002FChengZhang-98\u002FQERA)] [![GitHub星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FChengZhang-98\u002FQERA?style=social)](https:\u002F\u002Fgithub.com\u002FChengZhang-98\u002FQERA)\n- [[ICLR](https:\u002F\u002Ficlr.cc\u002Fvirtual\u002F2025\u002Fposter\u002F28338)] SpinQuant：利用学习到的旋转进行LLM量化\n- [[ICLR](https:\u002F\u002Ficlr.cc\u002Fvirtual\u002F2025\u002Fposter\u002F27906)] SVDQuant：通过低秩成分吸收异常值，适用于4比特扩散模型\n- [[ICLR](https:\u002F\u002Ficlr.cc\u002Fvirtual\u002F2025\u002Fposter\u002F30429)] ViDiT-Q：面向图像和视频生成的扩散Transformer的高效精准量化\n- [[ICML](https:\u002F\u002Fopenreview.net\u002Fforum?id=ZawsPjlIGu&noteId=x0z6YCJM6S)] GuidedQuant：通过利用末端损失指导实现大型语言模型量化 [[代码](https:\u002F\u002Fgithub.com\u002Fsnu-mllab\u002FGuidedQuant)] [![GitHub星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fsnu-mllab\u002FGuidedQuant?style=social)](https:\u002F\u002Fgithub.com\u002Fsnu-mllab\u002FGuidedQuant)\n- [[ICML](https:\u002F\u002Fopenreview.net\u002Fforum?id=4qIP1sXcR1)] ResQ：采用低秩残差的大型语言模型混合精度量化 [[代码](https:\u002F\u002Fgithub.com\u002Futkarsh-dmx\u002Fproject-resq)] [![GitHub星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Futkarsh-dmx\u002Fproject-resq?style=social)](https:\u002F\u002Fgithub.com\u002Futkarsh-dmx\u002Fproject-resq)\n- [[NeurIPS](https:\u002F\u002Fneurips.cc\u002Fvirtual\u002F2025\u002Fposter\u002F117148)] 一种无需校准的低比特KV缓存量化双归一化方法\n- [[NeurIPS](https:\u002F\u002Fneurips.cc\u002Fvirtual\u002F2025\u002Fposter\u002F119877)] 二元二次量化：超越一阶量化，用于实值矩阵压缩\n- [[NeurIPS](https:\u002F\u002Fneurips.cc\u002Fvirtual\u002F2025\u002Fposter\u002F117396)] 学习分组格子向量量化器，用于低比特大型语言模型\n- [[NeurIPS](https:\u002F\u002Fneurips.cc\u002Fvirtual\u002F2025\u002Fposter\u002F115061)] LittleBit：通过潜在因子分解实现超低比特量化\n- [[NeurIPS](https:\u002F\u002Fneurips.cc\u002Fvirtual\u002F2025\u002Fposter\u002F118224)] ParetoQ：在极低比特LLM量化中改进规模法则\n- [[NeurIPS](https:\u002F\u002Fneurips.cc\u002Fvirtual\u002F2025\u002Fposter\u002F116315)] Q-Palette：面向最优仅权重后训练量化的分数比特量化器\n- [[NeurIPS](https:\u002F\u002Fneurips.cc\u002Fvirtual\u002F2025\u002Fposter\u002F120052)] 为LLM增强小波的高保真1比特量化\n- [[ACL Findings](https:\u002F\u002Faclanthology.org\u002F2025.findings-acl.459\u002F)] 通过后训练量化实现LLM的二值权重与激活\n- [[EMNLP Findings](https:\u002F\u002Faclanthology.org\u002F2025.findings-emnlp.943\u002F)] KurTail：基于峰度的LLM量化\n- [[SIGMOD](https:\u002F\u002Fdl.acm.org\u002Fdoi\u002F10.1145\u002F3725413)] 面向近似最近邻搜索的欧几里得空间中高维向量的实用且渐近最优量化 [[代码](https:\u002F\u002Fgithub.com\u002FVectorDB-NTU\u002FExtended-RaBitQ)] [![GitHub星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FVectorDB-NTU\u002FExtended-RaBitQ?style=social)](https:\u002F\u002Fgithub.com\u002FVectorDB-NTU\u002FExtended-RaBitQ)\n- [[NeurIPS](https:\u002F\u002Fneurips.cc\u002Fvirtual\u002F2025\u002Fposter\u002F119764)] QBasicVSR：面向视频超分辨率的时间感知适应量化\n- [[NeurIPS](https:\u002F\u002Farxiv.org\u002Fabs\u002F2504.09629)] 量化误差传播：重新审视逐层后训练量化\n- [[NeurIPS](https:\u002F\u002Fneurips.cc\u002Fvirtual\u002F2025\u002Fposter\u002F115665)] Point4Bit：点云3D检测的后训练4比特量化\n- 
[[NeurIPS](https:\u002F\u002Farxiv.org\u002Fabs\u002F2505.12266)] PMQ-VE：面向视频增强的渐进式多帧量化 [[代码](https:\u002F\u002Fgithub.com\u002FxiaoBIGfeng\u002FPMQ-VE)] [![GitHub星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FxiaoBIGfeng\u002FPMQ-VE?style=social)](https:\u002F\u002Fgithub.com\u002FxiaoBIGfeng\u002FPMQ-VE)\n- [[NeurIPS](https:\u002F\u002Fneurips.cc\u002Fvirtual\u002F2025\u002Fposter\u002F115090)] VETA-DiT：方差均衡且时间自适应的量化，用于高效的4比特扩散Transformer\n- [[NeurIPS](https:\u002F\u002Farxiv.org\u002Fabs\u002F2505.18724)] LoTA-QAF：无损三值适应，用于量化感知微调 [[代码](https:\u002F\u002Fgithub.com\u002FKingdalfGoodman\u002FLoTA-QAF\u002Fblob\u002Fmain\u002FREADME.md)] [![GitHub星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FKingdalfGoodman\u002FLoTA-QAF?style=social)](https:\u002F\u002Fgithub.com\u002FKingdalfGoodman\u002FLoTA-QAF)\n- [[NeurIPS](https:\u002F\u002Fneurips.cc\u002Fvirtual\u002F2025\u002Fposter\u002F117708)] 通过权重偏置校正和位级核心集采样，实现高效多比特量化网络训练\n- [[NeurIPS](https:\u002F\u002Fneurips.cc\u002Fvirtual\u002F2025\u002Fposter\u002F119554)] 通过拓扑熵实现高效且可泛化的混合精度量化\n- [[NeurIPS](https:\u002F\u002Fneurips.cc\u002Fvirtual\u002F2025\u002Fposter\u002F119301)] QSCA：用于单目深度估计的自补偿辅助量化\n- [[ICCV](https:\u002F\u002Farxiv.org\u002Fabs\u002F2404.19248)] 为量化感知训练安排权重过渡 [[代码](https:\u002F\u002Fgithub.com\u002Fcvlab-yonsei\u002FTRS)] [![GitHub星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fcvlab-yonsei\u002FTRS?style=social)](https:\u002F\u002Fgithub.com\u002Fcvlab-yonsei\u002FTRS)\n- [[ICCV](https:\u002F\u002Farxiv.org\u002Fabs\u002F2507.16782)] 面向目标检测的任务特定零样本量化感知训练 [[代码](https:\u002F\u002Fgithub.com\u002FDFQ-Dojo\u002Fdfq-toolkit)] [![GitHub星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FDFQ-Dojo\u002Fdfq-toolkit?style=social)](https:\u002F\u002Fgithub.com\u002FDFQ-Dojo\u002Fdfq-toolkit)\n- [[ICCV](https:\u002F\u002Farxiv.org\u002Fabs\u002F2503.10959)] OuroMamba：面向Vision Mamba的无数据量化框架\n- [[ICCV](https:\u002F\u002Farxiv.org\u002Fabs\u002F2506.23516)] FedWSQ：通过权重标准化和分布感知的非均匀量化实现高效联邦学习 [[代码](https:\u002F\u002Fgithub.com\u002FSeongyeol-kim\u002FFedWSQ)] [![GitHub星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FSeongyeol-kim\u002FFedWSQ?style=social)](https:\u002F\u002Fgithub.com\u002FSeongyeol-kim\u002FFedWSQ)\n- [[ICCV](https:\u002F\u002Farxiv.org\u002Fabs\u002F2412.16553)] 视觉Transformer无数据量化的语义对齐与强化 [[代码](https:\u002F\u002Fgithub.com\u002Fzysxmu\u002FSARDFQ)] [![GitHub星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fzysxmu\u002FSARDFQ?style=social)](https:\u002F\u002Fgithub.com\u002Fzysxmu\u002FSARDFQ)\n- [[ICCV](https:\u002F\u002Farxiv.org\u002Fabs\u002F2503.06545)] QuantCache：面向视频生成的自适应重要性引导量化，结合层次化潜伏层与层缓存 [[代码](https:\u002F\u002Fgithub.com\u002FJunyiWuCode\u002FQuantCache)] [![GitHub星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FJunyiWuCode\u002FQuantCache?style=social)](https:\u002F\u002Fgithub.com\u002FJunyiWuCode\u002FQuantCache)\n- [[ICCV](https:\u002F\u002Farxiv.org\u002Fabs\u002F2507.19131)] MixA-Q：从混合精度量化视角重新审视视觉Transformer的激活稀疏性\n- [[ICCV](https:\u002F\u002Farxiv.org\u002Fabs\u002F2507.12933)] DMQ：剖析扩散模型的异常值，用于后训练量化 [[代码](https:\u002F\u002Fgithub.com\u002FLeeDongYeun\u002Fdmq)] [![GitHub星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FLeeDongYeun\u002Fdmq?style=social)](https:\u002F\u002Fgithub.com\u002FLeeDongYeun\u002Fdmq)\n- [[ICCV](https:\u002F\u002Farxiv.org\u002Fabs\u002F2503.03088)] AHCPTQ：面向Segment Anything Model的精准且硬件兼容后训练量化\n- 
[[ICCV](https:\u002F\u002Farxiv.org\u002Fabs\u002F2507.22349)] MSQ：内存高效的比特稀疏化量化\n- [[ICCV](https:\u002F\u002Farxiv.org\u002Fabs\u002F2402.03666)] QuEST：通过高效的选择性微调实现低比特扩散模型量化 [[代码](https:\u002F\u002Fgithub.com\u002FhatchetProject\u002FQuEST)] [![GitHub星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FhatchetProject\u002FQuEST?style=social)](https:\u002F\u002Fgithub.com\u002FhatchetProject\u002FQuEST)\n- [[ICML](https:\u002F\u002Farxiv.org\u002Fabs\u002F2505.05799)] MxMoE：兼具准确性和性能的MoE混合精度量化 [[代码](https:\u002F\u002Fgithub.com\u002Fcat538\u002FMxMoE)] [![GitHub星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fcat538\u002FMxMoE?style=social)](https:\u002F\u002Fgithub.com\u002Fcat538\u002FMxMoE)\n- [[ICML](https:\u002F\u002Farxiv.org\u002Fabs\u002F2505.04877)] 从损失景观中学习：通过自适应尖锐度感知梯度对齐实现可泛化的混合精度量化\n- [[ICML](https:\u002F\u002Farxiv.org\u002Fabs\u002F2503.15748)] PARQ：分段仿射正则化量化 [[代码](https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Fparq)] [![GitHub星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Ffacebookresearch\u002Fparq?style=social)](https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Fparq)\n- [[ICML](https:\u002F\u002Farxiv.org\u002Fabs\u002F2503.22879)] Quamba2：面向选择性状态空间模型的稳健且可扩展后训练量化框架 [[代码](https:\u002F\u002Fgithub.com\u002Fenyac-group\u002FQuamba)] [![GitHub星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fenyac-group\u002FQuamba?style=social)](https:\u002F\u002Fgithub.com\u002Fenyac-group\u002FQuamba)\n- [[ICML](https:\u002F\u002Fopenreview.net\u002Fforum?id=G6DmP9wxeB)] LRA-QViT：将低秩近似与量化相结合，打造稳健高效的视觉Transformer\n- [[ICML](https:\u002F\u002Farxiv.org\u002Fabs\u002F2406.13474)] BoA：注意力感知后训练量化，无需反向传播\n- [[ICML](https:\u002F\u002Farxiv.org\u002Fabs\u002F2505.03804)] MoEQuant：通过专家平衡采样和亲和力引导，提升混合专家大型语言模型的量化效果 [[代码](https:\u002F\u002Fgithub.com\u002Fchenzx921020\u002FMoEQuant)] [![GitHub星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fchenzx921020\u002FMoEQuant?style=social)](https:\u002F\u002Fgithub.com\u002Fchenzx921020\u002FMoEQuant)\n- [[ICML](https:\u002F\u002Farxiv.org\u002Fabs\u002F2502.09720)] NestQuant：嵌套格子量化，适用于矩阵乘积和LLM\n- [[ICML](https:\u002F\u002Farxiv.org\u002Fabs\u002F2506.20251)] Q-resafe：评估量化大型语言模型的安全风险及量化感知安全补丁 [[代码](https:\u002F\u002Fgithub.com\u002FThecommonirin\u002FQresafe)] [![GitHub星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FThecommonirin\u002FQresafe?style=social)](https:\u002F\u002Fgithub.com\u002FThecommonirin\u002FQresafe)\n- [[ICML](https:\u002F\u002Farxiv.org\u002Fabs\u002F2410.09615)] SLiM：一次性的量化与稀疏化，结合低秩近似压缩LLM权重 [[代码](https:\u002F\u002Fgithub.com\u002FParamathic\u002Fslim)] [![GitHub星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FParamathic\u002Fslim?style=social)](https:\u002F\u002Fgithub.com\u002FParamathic\u002Fslim)\n- [[ICML](https:\u002F\u002Farxiv.org\u002Fabs\u002F2410.06020)] QT-DoG：面向领域泛化的量化感知训练 [[代码](https:\u002F\u002Fgithub.com\u002Fsaqibjaved1\u002FQT-DoG)] [![GitHub星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fsaqibjaved1\u002FQT-DoG?style=social)](https:\u002F\u002Fgithub.com\u002Fsaqibjaved1\u002FQT-DoG)\n- [[ICML](https:\u002F\u002Farxiv.org\u002Fabs\u002F2502.06786)] 套娃量化\n- [[ICML](https:\u002F\u002Farxiv.org\u002Fabs\u002F2505.23651)] 适合合并的多目标领域适应后训练量化 [[代码](https:\u002F\u002Fgithub.com\u002Fewsn1593\u002FHDRQ)] [![GitHub星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fewsn1593\u002FHDRQ?style=social)](https:\u002F\u002Fgithub.com\u002Fewsn1593\u002FHDRQ)\n- 
[[ICML](https:\u002F\u002Farxiv.org\u002Fabs\u002F2505.14371)] 面向量化乐观对偶平均的逐层量化\n- [[ICML](https:\u002F\u002Fopenreview.net\u002Fforum?id=w5fONAEwra)] 面向离散图扩散模型的异常值感知后训练量化\n- [[ICML](https:\u002F\u002Farxiv.org\u002Fabs\u002F2501.01144)] BlockDialect：面向节能LLM推理的按块细粒度混合格式量化\n- [[ICML](https:\u002F\u002Farxiv.org\u002Fabs\u002F2504.02692)] GPTAQ：采用非对称校准的高效免微调量化 [[代码](https:\u002F\u002Fgithub.com\u002FIntelligent-Computing-Lab-Panda\u002FGPTAQ)] [![GitHub星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FIntelligent-Computing-Lab-Panda\u002FGPTAQ?style=social)](https:\u002F\u002Fgithub.com\u002FIntelligent-Computing-Lab-Panda\u002FGPTAQ)\n- [[ICML](https:\u002F\u002Farxiv.org\u002Fabs\u002F2501.17116)] 使用FP4量化优化大型语言模型训练\n- [[ICML](https:\u002F\u002Farxiv.org\u002Fabs\u002F2412.04180)] SKIM：任何比特量化，突破后训练量化的极限\n- [[ICML](https:\u002F\u002Farxiv.org\u002Fabs\u002F2411.10958)] SageAttention2：高效注意力机制，彻底平滑异常值并采用线程级INT4量化 [[代码](https:\u002F\u002Fgithub.com\u002Fthu-ml\u002FSageAttention)] [![GitHub星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fthu-ml\u002FSageAttention?style=social)](https:\u002F\u002Fgithub.com\u002Fthu-ml\u002FSageAttention)\n- [[AAAI](https:\u002F\u002Farxiv.org\u002Fabs\u002F2409.14330)] 以粒度思考：通过引人入胜的多粒度线索实现图像超分辨率的动态量化 [[代码](https:\u002F\u002Fgithub.com\u002FMmmingS\u002FGranular-DQ)] [![GitHub星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FMmmingS\u002FGranular-DQ?style=social)](https:\u002F\u002Fgithub.com\u002FMmmingS\u002FGranular-DQ)\n- [[AAAI](https:\u002F\u002Farxiv.org\u002Fabs\u002F2501.08180)] D2-DPM：面向量化扩散概率模型的双重去噪 [[代码](https:\u002F\u002Fgithub.com\u002FTaylorJocelyn\u002FD2-DPM)] [![GitHub星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FTaylorJocelyn\u002FD2-DPM?style=social)](https:\u002F\u002Fgithub.com\u002FTaylorJocelyn\u002FD2-DPM)\n- [[CVPR](https:\u002F\u002Farxiv.org\u002Fabs\u002F2411.13918)] 无痛量化\n- [[CVPR](https:\u002F\u002Farxiv.org\u002Fabs\u002F2504.02508)] APHQ-ViT：基于平均扰动海森矩阵重构的视觉Transformer后训练量化 [[代码](https:\u002F\u002Fgithub.com\u002FGoatWu\u002FAPHQ-ViT)] [![GitHub星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FGoatWu\u002FAPHQ-ViT?style=social)](https:\u002F\u002Fgithub.com\u002FGoatWu\u002FAPHQ-ViT)\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=2rnOgyFQgb)] SynQ：通过合成感知微调实现精准零样本量化 [[代码](https:\u002F\u002Fgithub.com\u002Fsnudm-starlab\u002FSynQ)] [![GitHub星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fsnudm-starlab\u002FSynQ?style=social)](https:\u002F\u002Fgithub.com\u002Fsnudm-starlab\u002FSynQ)\n\n### 2024年\n\n- [[ICML](https:\u002F\u002Fopenreview.net\u002Fforum?id=qOl2WWOqFg)] BiLLM：推动大语言模型后训练量化极限 [[代码](https:\u002F\u002Fgithub.com\u002FAaronhuang-778\u002FBiLLM)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FAaronhuang-778\u002FBiLLM?style=social)](https:\u002F\u002Fgithub.com\u002FAaronhuang-778\u002FBiLLM)\n- [[ICML](https:\u002F\u002Fopenreview.net\u002Fforum?id=sCGRhnuMUJ)] 通过联合稀疏化与量化压缩大型语言模型\n- [[NeurIPS](https:\u002F\u002Fnips.cc\u002Fvirtual\u002F2024\u002Fposter\u002F93620)] BiDM：推动扩散模型量化极限\n- [[ACL Findings](https:\u002F\u002Faclanthology.org\u002F2024.findings-acl.516\u002F)] DB-LLM：高效大语言模型的精确双二值化\n- [[NeurIPS](https:\u002F\u002Fneurips.cc\u002Fvirtual\u002F2024\u002Fposter\u002F93008)] 用于图像超分辨率的二值化扩散模型 [[代码](https:\u002F\u002Fgithub.com\u002Fzhengchen1999\u002FBI-DiffSR)] [![GitHub 
星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fzhengchen1999\u002FBI-DiffSR?style=social)](https:\u002F\u002Fgithub.com\u002Fzhengchen1999\u002FBI-DiffSR)\n- [[NeurIPS](https:\u002F\u002Fopenreview.net\u002Fforum?id=ADJASE9uQ2)] 2DQuant：面向图像超分辨率的低比特后训练量化 [[代码](https:\u002F\u002Fgithub.com\u002FKai-Liu001\u002F2DQuant)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FKai-Liu001\u002F2DQuant?style=social)](https:\u002F\u002Fgithub.com\u002FKai-Liu001\u002F2DQuant)\n- [[ICML](https:\u002F\u002Fproceedings.mlr.press\u002Fv235\u002Fqin24b.html)] 基于信息保留的大语言模型LoRA微调量化 [[代码](https:\u002F\u002Fgithub.com\u002Fhtqin\u002FIR-QLoRA)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fhtqin\u002FIR-QLoRA?style=social)](https:\u002F\u002Fgithub.com\u002Fhtqin\u002FIR-QLoRA)\n- [[ICML](https:\u002F\u002Fproceedings.mlr.press\u002Fv235\u002Fzhang24bb.html)] 面向图像超分辨率的灵活残差二值化\n- [[AAAI](https:\u002F\u002Fojs.aaai.org\u002Findex.php\u002FAAAI\u002Farticle\u002Fview\u002F29860)] Agile-Quant：激活引导量化，加速边缘端大语言模型推理\n- [[AAAI](https:\u002F\u002Fojs.aaai.org\u002Findex.php\u002FAAAI\u002Farticle\u002Fview\u002F29487)] AQ-DETR：具有辅助查询的低比特量化检测Transformer\n- [[AAAI](https:\u002F\u002Fojs.aaai.org\u002Findex.php\u002FAAAI\u002Farticle\u002Fview\u002F28109)] Bi-ViT：推动视觉Transformer量化极限\n- [[AAAI](https:\u002F\u002Fojs.aaai.org\u002Findex.php\u002FAAAI\u002Farticle\u002Fview\u002F29908)] 从全面研究到低秩补偿：探索大语言模型的后训练量化\n- [[AAAI](https:\u002F\u002Fojs.aaai.org\u002Findex.php\u002FAAAI\u002Farticle\u002Fview\u002F29045)] 再次提升RepVGG性能：一种量化感知方法\n- [[AAAI](https:\u002F\u002Fojs.aaai.org\u002Findex.php\u002FAAAI\u002Farticle\u002Fview\u002F29212)] MetaMix：面向混合精度激活量化的元状态精度搜索器\n- [[AAAI](https:\u002F\u002Fojs.aaai.org\u002Findex.php\u002FAAAI\u002Farticle\u002Fview\u002F29815)] 范数调整：高性能低比特量化大型语言模型\n- [[AAAI](https:\u002F\u002Fojs.aaai.org\u002Findex.php\u002FAAAI\u002Farticle\u002Fview\u002F29237)] OWQ：面向高效微调与推理的异常值感知权重量化大型语言模型\n- [[AAAI](https:\u002F\u002Fojs.aaai.org\u002Findex.php\u002FAAAI\u002Farticle\u002Fview\u002F29553)] PTMQ：神经网络的后训练多比特量化\n- [[AAAI](https:\u002F\u002Fojs.aaai.org\u002Findex.php\u002FAAAI\u002Farticle\u002Fview\u002F28972)] 鲁棒性引导的数据无依赖量化图像合成\n- [[AAAI](https:\u002F\u002Fojs.aaai.org\u002Findex.php\u002FAAAI\u002Farticle\u002Fview\u002F29765)] 大型语言模型量化为何困难？基于扰动视角的实证研究\n- [[ACL](https:\u002F\u002Faclanthology.org\u002F2024.acl-long.612\u002F)] 通过直接偏好对齐提升量化大语言模型的对话能力\n- [[ACM MM](https:\u002F\u002Fdl.acm.org\u002Fdoi\u002Fabs\u002F10.1145\u002F3664647.3680838)] 基于预热的量化感知尺度学习推进多模态大语言模型\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2024\u002Fhtml\u002FFan_Data-Free_Quantization_via_Pseudo-label_Filtering_CVPR_2024_paper.html)] 通过伪标签过滤实现数据无依赖量化\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2024\u002Fhtml\u002FShang_Enhancing_Post-training_Quantization_Calibration_through_Contrastive_Learning_CVPR_2024_paper.html)] 通过对比学习提升后训练量化校准\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2024\u002Fhtml\u002FMoon_Instance-Aware_Group_Quantization_for_Vision_Transformers_CVPR_2024_paper.html)] 视觉Transformer的实例感知分组量化\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2024\u002Fhtml\u002FChen_Mixed-Precision_Quantization_for_Federated_Learning_on_Resource-Constrained_Heterogeneous_Devices_CVPR_2024_paper.html)] 面向资源受限异构设备的联邦学习混合精度量化\n- 
[[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2024\u002Fhtml\u002FLv_PTQ4SAM_Post-Training_Quantization_for_Segment_Anything_CVPR_2024_paper.html)] PTQ4SAM：Segment Anything的后训练量化\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2024\u002Fhtml\u002FDing_Reg-PTQ_Regression-specialized_Post-training_Quantization_for_Fully_Quantized_Object_Detector_CVPR_2024_paper.html)] Reg-PTQ：面向全量化目标检测器的回归专用后训练量化\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2024\u002Fhtml\u002FTang_Retraining-Free_Model_Quantization_via_One-Shot_Weight-Coupling_Learning_CVPR_2024_paper.html)] 无需重训的模型量化：一次性权重耦合学习\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2024\u002Fhtml\u002FHuang_TFMQ-DM_Temporal_Feature_Maintenance_Quantization_for_Diffusion_Models_CVPR_2024_paper.html)] TFMQ-DM：扩散模型的时间特征保持量化\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2024\u002Fhtml\u002FWang_Towards_Accurate_Post-training_Quantization_for_Diffusion_Models_CVPR_2024_paper.html)] 向更准确的扩散模型后训练量化迈进\n- [[ECCV](https:\u002F\u002Fwww.ecva.net\u002Fpapers\u002Feccv_2024\u002Fpapers_ECCV\u002Fhtml\u002F3969_ECCV_2024_paper.php)] AdaLog：带有自适应对数量化器的视觉Transformer后训练量化\n- [[ECCV](https:\u002F\u002Fwww.ecva.net\u002Fpapers\u002Feccv_2024\u002Fpapers_ECCV\u002Fhtml\u002F8434_ECCV_2024_paper.php)] CLAMP-ViT：对比式数据无依赖学习，用于ViT的自适应后训练量化\n- [[ECCV](https:\u002F\u002Fwww.ecva.net\u002Fpapers\u002Feccv_2024\u002Fpapers_ECCV\u002Fhtml\u002F2494_ECCV_2024_paper.php)] 量化扩散模型的内存高效微调\n- [[ECCV](https:\u002F\u002Fwww.ecva.net\u002Fpapers\u002Feccv_2024\u002Fpapers_ECCV\u002Fhtml\u002F3914_ECCV_2024_paper.php)] MetaAug：面向后训练量化的元数据增强\n- [[ECCV](https:\u002F\u002Fwww.ecva.net\u002Fpapers\u002Feccv_2024\u002Fpapers_ECCV\u002Fhtml\u002F2212_ECCV_2024_paper.php)] MixDQ：内存高效的几步文本到图像扩散模型，采用度量解耦的混合精度量化\n- [[ECCV](https:\u002F\u002Fwww.ecva.net\u002Fpapers\u002Feccv_2024\u002Fpapers_ECCV\u002Fhtml\u002F7353_ECCV_2024_paper.php)] 文本到图像扩散模型的渐进式校准与激活放松后训练量化\n- [[ECCV](https:\u002F\u002Fwww.ecva.net\u002Fpapers\u002Feccv_2024\u002Fpapers_ECCV\u002Fhtml\u002F1627_ECCV_2024_paper.php)] PQ-SAM：Segment Anything模型的后训练量化\n- [[ECCV](https:\u002F\u002Fwww.ecva.net\u002Fpapers\u002Feccv_2024\u002Fpapers_ECCV\u002Fhtml\u002F8312_ECCV_2024_paper.php)] 量化扩散模型的时间步长感知修正\n- [[ECCV](https:\u002F\u002Fwww.ecva.net\u002Fpapers\u002Feccv_2024\u002Fpapers_ECCV\u002Fhtml\u002F9567_ECCV_2024_paper.php)] 向稳健的超分辨率网络全低比特量化迈进\n- [[EMNLP](https:\u002F\u002Faclanthology.org\u002F2024.emnlp-main.1168\u002F)] ApiQ：2比特量化大型语言模型的微调\n- [[EMNLP](https:\u002F\u002Faclanthology.org\u002F2024.emnlp-main.134\u002F)] 在注意力层前添加sink可缓解大型语言模型量化中的激活异常值\n- [[EMNLP](https:\u002F\u002Faclanthology.org\u002F2024.emnlp-main.467\u002F)] VPTQ：极端低比特的大型语言模型向量后训练量化\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=of2rhALq8l)] AffineQuant：大型语言模型的仿射变换量化 [[代码](https:\u002F\u002Fgithub.com\u002Fbytedance\u002FAffineQuant)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fbytedance\u002FAffineQuant?style=social)](https:\u002F\u002Fgithub.com\u002Fbytedance\u002FAffineQuant)\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=UmMa3UNDAz)] EfficientDM：低比特扩散模型的高效量化感知微调\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=0d1gQI114C)] LiDAR-PTQ：点云三维目标检测的后训练量化\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=LzPWWPAdY4)] LoftQ：面向大型语言模型的LoRA微调感知量化 [[代码](https:\u002F\u002Fgithub.com\u002Fyxli2123\u002FLoftQ)] [![GitHub 
星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fyxli2123\u002FLoftQ?style=social)](https:\u002F\u002Fgithub.com\u002Fyxli2123\u002FLoftQ)\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=gLARhFLE0F)] LUT-GEMM：基于查找表的量化矩阵乘法，用于大规模生成式语言模型的高效推理\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=8Wuvhh0LYW)] OmniQuant：全方位校准的大型语言模型量化 [[代码](https:\u002F\u002Fgithub.com\u002FOpenGVLab\u002FOmniQuant)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FOpenGVLab\u002FOmniQuant?style=social)](https:\u002F\u002Fgithub.com\u002FOpenGVLab\u002FOmniQuant)\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=BifeBRhikU)] PB-LLM：部分二值化大型语言模型 [[代码](https:\u002F\u002Fgithub.com\u002Fhahnyuan\u002FPB-LLM)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fhahnyuan\u002FPB-LLM?style=social)](https:\u002F\u002Fgithub.com\u002Fhahnyuan\u002FPB-LLM)\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=WvFoJccpo8)] QA-LoRA：面向大型语言模型的量化感知低秩适配 [[代码](https:\u002F\u002Fgithub.com\u002Fyuhuixu1993\u002Fqa-lora)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fyuhuixu1993\u002Fqa-lora?style=social)](https:\u002F\u002Fgithub.com\u002Fyuhuixu1993\u002Fqa-lora)\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=FIplmUWdm3)] QLLM：大型语言模型的精准高效低比特量化\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=JzG7kSpjJk)] 重新思考通道维度以隔离异常值，用于大型语言模型的低比特权重量化\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=Q1u25ahSuy)] SpQR：近无损LLM权重压缩的稀疏量化表示 [[代码](https:\u002F\u002Fgithub.com\u002FVahe1994\u002FSpQR)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FVahe1994\u002FSpQR?style=social)](https:\u002F\u002Fgithub.com\u002FVahe1994\u002FSpQR)\n- [[ICML](https:\u002F\u002Fopenreview.net\u002Fforum?id=mbx2pLK5Eq)] A2Q+：改进累积器感知权重量化\n- [[ICML](https:\u002F\u002Fopenreview.net\u002Fforum?id=DbyHDYslM7)] BiE：大型语言模型量化的双指数块浮点数\n- [[ICML](https:\u002F\u002Fopenreview.net\u002Fforum?id=jKUWlgra9b)] ERQ：视觉Transformer后训练量化的误差降低\n- [[ICML](https:\u002F\u002Fopenreview.net\u002Fforum?id=DKKg5EFAFr)] 量化大型语言模型的评估\n- [[ICML](https:\u002F\u002Fopenreview.net\u002Fforum?id=5mCaITRTmO)] 通过加性量化实现大型语言模型的极致压缩\n- [[ICML](https:\u002F\u002Fopenreview.net\u002Fforum?id=xPypr0kufs)] FrameQuant：面向Transformer的灵活低比特量化\n- [[ICML](https:\u002F\u002Fopenreview.net\u002Fforum?id=L057s2Rq8O)] KIVI：无需调优的KV缓存非对称2比特量化 [[代码](https:\u002F\u002Fgithub.com\u002Fjy-yuan\u002FKIVI)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fjy-yuan\u002FKIVI?style=social)](https:\u002F\u002Fgithub.com\u002Fjy-yuan\u002FKIVI)\n- [[ICML](https:\u002F\u002Fopenreview.net\u002Fforum?id=dh8k41g775)] LQER：面向LLM的低秩量化误差重构\n- [[ICML](https:\u002F\u002Fopenreview.net\u002Fforum?id=Uh5XN9d2J4)] 异常值感知切片用于视觉Transformer的后训练量化\n- [[ICML](https:\u002F\u002Fopenreview.net\u002Fforum?id=8mKXMnhnFW)] 锐度感知数据生成用于零样本量化\n- [[ICML](https:\u002F\u002Fopenreview.net\u002Fforum?id=0jpbpFia8m)] SqueezeLLM：密集与稀疏量化 [[代码](https:\u002F\u002Fgithub.com\u002FSqueezeAILab\u002FSqueezeLLM)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FSqueezeAILab\u002FSqueezeLLM?style=social)](https:\u002F\u002Fgithub.com\u002FSqueezeAILab\u002FSqueezeLLM)\n- [[MLSys](https:\u002F\u002Fproceedings.mlsys.org\u002Fpaper_files\u002Fpaper\u002F2024\u002Fhash\u002F42a452cbafa9dd64e9ba4aa95cc1ef21-Abstract-Conference.html)] AWQ：面向设备端LLM压缩与加速的激活感知权重量化 [[代码](https:\u002F\u002Fgithub.com\u002Fmit-han-lab\u002Fllm-awq)] [![GitHub 
星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fmit-han-lab\u002Fllm-awq?style=social)](https:\u002F\u002Fgithub.com\u002Fmit-han-lab\u002Fllm-awq)\n- [[NeurIPS](https:\u002F\u002Fnips.cc\u002Fvirtual\u002F2024\u002Fposter\u002F96909)] BitsFusion：扩散模型的1.99比特权重量化\n- [[NeurIPS](https:\u002F\u002Fnips.cc\u002Fvirtual\u002F2024\u002Fposter\u002F93727)] DuQuant：通过双重变换分散异常值，打造更强的量化LLM\n- [[NeurIPS](https:\u002F\u002Fnips.cc\u002Fvirtual\u002F2024\u002Fposter\u002F93558)] KV缓存每通道1比特：耦合量化实现高效大语言模型推理\n- [[NeurIPS](https:\u002F\u002Fnips.cc\u002Fvirtual\u002F2024\u002Fposter\u002F96936)] KVQuant：迈向KV缓存量化下1000万上下文长度的LLM推理\n- [[NeurIPS](https:\u002F\u002Fnips.cc\u002Fvirtual\u002F2024\u002Fposter\u002F95445)] PTQ4DiT：扩散Transformer的后训练量化\n- [[NeurIPS](https:\u002F\u002Fnips.cc\u002Fvirtual\u002F2024\u002Fposter\u002F94107)] Q-VLM：大型视觉-语言模型的后训练量化\n- [[NeurIPS](https:\u002F\u002Fnips.cc\u002Fvirtual\u002F2024\u002Fposter\u002F95634)] QBB：面向LLM的二进制基量化\n- [[NeurIPS](https:\u002F\u002Fnips.cc\u002Fvirtual\u002F2024\u002Fposter\u002F96563)] ZipCache：精准高效的KV缓存量化，结合显著token识别\n- [[ACL Findings](https:\u002F\u002Faclanthology.org\u002F2024.findings-acl.726\u002F)] 大型语言模型量化策略的全面评估\n- [[ACL Findings](https:\u002F\u002Faclanthology.org\u002F2024.findings-acl.3\u002F)] AFPQ：面向LLM的非对称浮点量化 [[代码](https:\u002F\u002Fgithub.com\u002Fzhangsichengsjtu\u002FAFPQ)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fzhangsichengsjtu\u002FAFPQ?style=social)](https:\u002F\u002Fgithub.com\u002Fzhangsichengsjtu\u002FAFPQ)\n- [[ACL Findings](https:\u002F\u002Faclanthology.org\u002F2024.findings-acl.26\u002F)] LLM-QAT：面向大型语言模型的数据无依赖量化感知训练\n- [[EMNLP Findings](https:\u002F\u002Faclanthology.org\u002F2024.findings-emnlp.1001\u002F)] ATQ：激活转换用于LLM的权重-激活量化\n- [[EMNLP Findings](https:\u002F\u002Faclanthology.org\u002F2024.findings-emnlp.444\u002F)] 微调旋转后的无异常值LLM，以实现有效的权重-激活量化\n- [[EMNLP Findings](https:\u002F\u002Faclanthology.org\u002F2024.findings-emnlp.935\u002F)] 量化如何影响多语言LLM？\n- [[EMNLP Findings](https:\u002F\u002Faclanthology.org\u002F2024.findings-emnlp.570\u002F)] MobileQuant：面向设备端语言模型的移动友好量化\n- [[EMNLP Findings](https:\u002F\u002Faclanthology.org\u002F2024.findings-emnlp.811\u002F)] QEFT：面向LLM高效微调的量化\n- [[EMNLP Industry](https:\u002F\u002Faclanthology.org\u002F2024.emnlp-industry.12\u002F)] LLMC：使用多功能压缩工具包基准测试大型语言模型量化\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F2402.14866)] APTQ：面向大型语言模型的注意力感知后训练混合精度量化\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F2403.02775)] EasyQuant：一种高效的LLM数据无依赖量化算法\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F2402.10787)] EdgeQAT：熵与分布引导的量化感知训练，用于加速边缘端轻量级LLM [[代码](https:\u002F\u002Fgithub.com\u002Fshawnricecake\u002FEdgeQAT)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fshawnricecake\u002FEdgeQAT?style=social)](https:\u002F\u002Fgithub.com\u002Fshawnricecake\u002FEdgeQAT)\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F2402.17985)] FlattenQuant：通过逐张量量化突破大型语言模型的推理计算瓶颈\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F2402.15319)] GPTVQ：维度优势助力LLM量化 [[代码](https:\u002F\u002Fgithub.com\u002Fqualcomm-ai-research\u002Fgptvq)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fqualcomm-ai-research\u002Fgptvq?style=social)](https:\u002F\u002Fgithub.com\u002Fqualcomm-ai-research\u002Fgptvq)\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F2403.01241)] IntactKV：通过保留枢轴token提升大型语言模型量化\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F2402.11295)] OneBit：迈向极低比特大型语言模型\n- 
[[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F2404.00456)] QuaRot：旋转后的LLM实现无异常值的4比特推理 [[代码](https:\u002F\u002Fgithub.com\u002Fspcl\u002FQuaRot)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fspcl\u002FQuaRot?style=social)](https:\u002F\u002Fgithub.com\u002Fspcl\u002FQuaRot)\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F2402.05628)] RepQuant：通过规模再参数化迈向大型Transformer模型的精准后训练量化\n- [[SIGMOD](https:\u002F\u002Fdl.acm.org\u002Fdoi\u002F10.1145\u002F3654970)] RaBitQ：为近似最近邻搜索提供理论误差上限的高维向量量化 [[代码](https:\u002F\u002Fgithub.com\u002Fgaoj0017\u002FRaBitQ)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fgaoj0017\u002FRaBitQ?style=social)](https:\u002F\u002Fgithub.com\u002Fgaoj0017\u002FRaBitQ)\n- [[AAAI](https:\u002F\u002Farxiv.org\u002Fabs\u002F2401.16760)] 一步向前，一步回溯：克服损失感知量化训练中的锯齿状波动\n- [[ICML](https:\u002F\u002Farxiv.org\u002Fabs\u002F2405.03103)] 向学生学习：应用t分布探索LLM的精准高效格式 [[代码](https:\u002F\u002Fgithub.com\u002Fcornell-zhang\u002Fllm-datatypes)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fcornell-zhang\u002Fllm-datatypes?style=social)](https:\u002F\u002Fgithub.com\u002Fcornell-zhang\u002Fllm-datatypes)\n- [[ICML](https:\u002F\u002Farxiv.org\u002Fabs\u002F2403.12422)] Jetfire：采用INT8数据流和逐块量化，实现高效精准的Transformer预训练\n- [[ICML](https:\u002F\u002Fopenreview.net\u002Fforum?id=fM9xTkpAdu)] RAOQ：重塑与适配用于输出量化，面向内存计算系统的量化感知训练\n- [[ICML](https:\u002F\u002Farxiv.org\u002Fabs\u002F2402.04396)] QuIP#：借助哈达玛不相干性和格码本，实现更优秀的LLM量化 [[代码](https:\u002F\u002Fgithub.com\u002FCornell-RelaxML\u002Fquip-sharp)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FCornell-RelaxML\u002Fquip-sharp?style=social)](https:\u002F\u002Fgithub.com\u002FCornell-RelaxML\u002Fquip-sharp)\n- [[NeurIPS](https:\u002F\u002Farxiv.org\u002Fabs\u002F2402.08958)] 向超大规模Transformer的下一代后训练量化迈进 [[代码](https:\u002F\u002Fgithub.com\u002FSamsungLabs\u002Faespa)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FSamsungLabs\u002Faespa?style=social)](https:\u002F\u002Fgithub.com\u002FSamsungLabs\u002Faespa)\n- [[NeurIPS](https:\u002F\u002Farxiv.org\u002Fabs\u002F2406.00800)] MagR：通过减少权重幅度提升后训练量化效果 [[代码](https:\u002F\u002Fgithub.com\u002Faozhongzhang\u002Fmagr)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Faozhongzhang\u002Fmagr?style=social)](https:\u002F\u002Fgithub.com\u002Faozhongzhang\u002Fmagr)\n- [[NeurIPS](https:\u002F\u002Farxiv.org\u002Fabs\u002F2405.18137)] 探索LLM量化\n- [[NeurIPS](https:\u002F\u002Fopenreview.net\u002Fforum?id=HfpV6u0kbX)] 面向多个LoRA适配器的高效多任务LLM量化与服务\n- [[NeurIPS](https:\u002F\u002Farxiv.org\u002Fabs\u002F2406.11235)] QTIP：基于网格与不相干性处理的量化 [[代码](https:\u002F\u002Fgithub.com\u002FCornell-RelaxML\u002Fqtip)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FCornell-RelaxML\u002Fqtip?style=social)](https:\u002F\u002Fgithub.com\u002FCornell-RelaxML\u002Fqtip)\n- [[NeurIPS](https:\u002F\u002Fopenreview.net\u002Fforum?id=dYIqAZXQNV)] 将CNN泛化至图结构，采用可学习的邻域量化 [[代码](https:\u002F\u002Fgithub.com\u002FGrosenick-Lab-Cornell\u002FQuantNets)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FGrosenick-Lab-Cornell\u002FQuantNets?style=social)](https:\u002F\u002Fgithub.com\u002FGrosenick-Lab-Cornell\u002FQuantNets)\n- [[NeurIPS](https:\u002F\u002Farxiv.org\u002Fabs\u002F2410.15526)] SDP4Bit：迈向LLM训练中分片数据并行下的4比特通信量化 [[代码](https:\u002F\u002Fgithub.com\u002FByteDance-Seed\u002FSDP4Bit)] [![GitHub 
星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FByteDance-Seed\u002FSDP4Bit?style=social)](https:\u002F\u002Fgithub.com\u002FByteDance-Seed\u002FSDP4Bit)\n- [[NeurIPS](https:\u002F\u002Fpapers.nips.cc\u002Fpaper_files\u002Fpaper\u002F2024\u002Ffile\u002Fab6a2c6ee757afe43882121281f6065c-Paper-Conference.pdf)] 最优与近似的自适应随机量化 [[代码](https:\u002F\u002Fgithub.com\u002Franbenbasat\u002FQUIVER)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Franbenbasat\u002FQUIVER?style=social)](https:\u002F\u002Fgithub.com\u002Franbenbasat\u002FQUIVER)\n- [[NeurIPS](https:\u002F\u002Farxiv.org\u002Fabs\u002F2404.02837)] 点睛之笔：大型语言模型中的参数异质性与量化\n- [[NeurIPS](https:\u002F\u002Fopenreview.net\u002Fforum?id=cEtExbAKYV)] StepbaQ：作为修正措施，量化扩散模型的退步操作\n\n### 2023年\n\n- [[ICML](https:\u002F\u002Fproceedings.mlr.press\u002Fv202\u002Fqin23a.html)] BiBench：网络二值化的基准测试与分析 [[代码](https:\u002F\u002Fgithub.com\u002Fhtqin\u002FBiBench)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fhtqin\u002FBiBench?style=social)](https:\u002F\u002Fgithub.com\u002Fhtqin\u002FBiBench)\n- [[IJCV](https:\u002F\u002Farxiv.org\u002Fabs\u002F2109.12338)] 面向准确二值神经网络的分布敏感信息保留\n- [[NeurIPS](https:\u002F\u002Fneurips.cc\u002Fvirtual\u002F2023\u002Fposter\u002F71287)] BiMatting：通过二值化实现高效的视频抠图 [[代码](https:\u002F\u002Fgithub.com\u002Fhtqin\u002FBiMatting)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fhtqin\u002FBiMatting?style=social)](https:\u002F\u002Fgithub.com\u002Fhtqin\u002FBiMatting)\n- [[NeurIPS](https:\u002F\u002Fneurips.cc\u002Fvirtual\u002F2023\u002Fposter\u002F72890)] QuantSR：用于高效图像超分辨率的高精度低比特量化 [[代码](https:\u002F\u002Fgithub.com\u002Fhtqin\u002FQuantSR)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fhtqin\u002FQuantSR?style=social)](https:\u002F\u002Fgithub.com\u002Fhtqin\u002FQuantSR)\n- [[TPAMI](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F10146917)] 多样化样本生成：突破无数据量化生成能力的极限 [[代码](https:\u002F\u002Fgithub.com\u002Fhtqin\u002FDSG)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fhtqin\u002FDSG?style=social)](https:\u002F\u002Fgithub.com\u002Fhtqin\u002FDSG)\n- [[TNNLS](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F10049753)] BiFSMNv2：将用于关键词检测的二值神经网络性能提升至真实网络水平 [[代码](https:\u002F\u002Fgithub.com\u002Fhtqin\u002FBiFSMNv2)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fhtqin\u002FBiFSMNv2?style=social)](https:\u002F\u002Fgithub.com\u002Fhtqin\u002FBiFSMNv2)\n- [[AAAI](https:\u002F\u002Fojs.aaai.org\u002Findex.php\u002FAAAI\u002Farticle\u002Fview\u002F26268)] 基于深度-宽度重塑的快速且精确的二值神经网络\n- [[AAAI](https:\u002F\u002Fojs.aaai.org\u002Findex.php\u002FAAAI\u002Farticle\u002Fview\u002F26084)] OMPQ：正交混合精度量化\n- [[AAAI](https:\u002F\u002Fojs.aaai.org\u002Findex.php\u002FAAAI\u002Farticle\u002Fview\u002F26354)] 用于网络量化的量化特征蒸馏\n- [[AAAI](https:\u002F\u002Fojs.aaai.org\u002Findex.php\u002FAAAI\u002Farticle\u002Fview\u002F26261)] 鲁棒二值神经网络\n- [[AAAI](https:\u002F\u002Fojs.aaai.org\u002Findex.php\u002FAAAI\u002Farticle\u002Fview\u002F26136)] 将无数据量化重新思考为零和博弈\n- [[ACL](https:\u002F\u002Faclanthology.org\u002F2023.findings-acl.15\u002F)] 利用适合GPU的稀疏性和量化技术提升基于Transformer的语言模型\n- [[ACL](https:\u002F\u002Farxiv.org\u002Fabs\u002F2306.00014)] PreQuant：一种适用于预训练语言模型的任务无关量化方法\n- [[CVPR](https:\u002F\u002Fipl.dgist.ac.kr\u002FABCD_cvpr23.pdf)] ABCD：任意位系数去量化\n- [[CVPR](https:\u002F\u002Farxiv.org\u002Fabs\u002F2303.06869)] 自适应无数据量化\n- 
[[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2023\u002Fpapers\u002FLin_Bit-Shrinking_Limiting_Instantaneous_Sharpness_for_Improving_Post-Training_Quantization_CVPR_2023_paper.pdf)] 减比特：限制瞬时锐度以改进训练后量化\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2023\u002Fpapers\u002FYu_Boost_Vision_Transformer_With_GPU-Friendly_Sparsity_and_Quantization_CVPR_2023_paper.pdf)] 利用适合GPU的稀疏性和量化技术提升视觉Transformer\n- [[CVPR](https:\u002F\u002Farxiv.org\u002Fabs\u002F2212.04780)] GENIE：为量化提供数据 [[代码](https:\u002F\u002Fgithub.com\u002FSamsungLabs\u002FGenie)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FSamsungLabs\u002FGenie?style=social)](https:\u002F\u002Fgithub.com\u002FSamsungLabs\u002FGenie)\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2023\u002Fhtml\u002FLi_Hard_Sample_Matters_a_Lot_in_Zero-Shot_Quantization_CVPR_2023_paper.html)] 硬样本在零样本量化中至关重要\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2023\u002Fpapers\u002FShin_NIPQ_Noise_Proxy-Based_Integrated_Pseudo-Quantization_CVPR_2023_paper.pdf)] NIPQ：基于噪声代理的集成伪量化\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2023\u002Fpapers\u002FLiu_NoisyQuant_Noisy_Bias-Enhanced_Post-Training_Activation_Quantization_for_Vision_Transformers_CVPR_2023_paper.pdf)] NoisyQuant：面向视觉Transformer的噪声增强型训练后激活量化\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2023\u002Fpapers\u002FKoryakovskiy_One-Shot_Model_for_Mixed-Precision_Quantization_CVPR_2023_paper.pdf)] 混合精度量化的单次模型\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2023\u002Fhtml\u002FLiu_PD-Quant_Post-Training_Quantization_Based_on_Prediction_Difference_Metric_CVPR_2023_paper.html)] PD-Quant：基于预测差异指标的训练后量化 [[代码](https:\u002F\u002Fgithub.com\u002Fhustvl\u002FPD-Quant)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fhustvl\u002FPD-Quant?style=social)](https:\u002F\u002Fgithub.com\u002Fhustvl\u002FPD-Quant)\n- [[CVPR](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2023\u002Fhtml\u002FShang_Post-Training_Quantization_on_Diffusion_Models_CVPR_2023_paper.html)] 扩散模型上的训练后量化 [[代码](https:\u002F\u002Fgithub.com\u002F42Shawn\u002FPTQ4DM)]\n- [[CVPR](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2023\u002Fhtml\u002FXu_Q-DETR_An_Efficient_Low-Bit_Quantized_Detection_Transformer_CVPR_2023_paper.html)] Q-DETR：一种高效的低比特量化检测Transformer [[代码](https:\u002F\u002Fgithub.com\u002FSteveTsui\u002FQ-DETR)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FSteveTsui\u002FQ-DETR?style=social)](https:\u002F\u002Fgithub.com\u002FSteveTsui\u002FQ-DETR)\n- [[CVPR](https:\u002F\u002Farxiv.org\u002Fabs\u002F2303.06424)] 正则化向量量化用于标记化图像合成\n- [[CVPR](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2303.11906.pdf)] 从理论角度解决训练后量化中的振荡问题 [[代码](https:\u002F\u002Fgithub.com\u002Fbytedance\u002Fmrecg)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fbytedance\u002Fmrecg?style=social)](https:\u002F\u002Fgithub.com\u002Fbytedance\u002Fmrecg)\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2023\u002Fpapers\u002FTu_Toward_Accurate_Post-Training_Quantization_for_Image_Super_Resolution_CVPR_2023_paper.pdf)] 朝着图像超分辨率的准确训练后量化迈进\n- [[EMNLP](https:\u002F\u002Farxiv.org\u002Fabs\u002F2310.16836)] LLM-FP4：4位浮点数量化Transformer [[代码](https:\u002F\u002Fgithub.com\u002Fnbasyl\u002FLLM-FP4)] [![GitHub 
星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fnbasyl\u002FLLM-FP4?style=social)](https:\u002F\u002Fgithub.com\u002Fnbasyl\u002FLLM-FP4)\n- [[EMNLP](https:\u002F\u002Farxiv.org\u002Fabs\u002F2304.09145)] 异常值抑制+：通过等效且最优的移位和缩放实现大型语言模型的准确量化\n- [[EMNLP](https:\u002F\u002Farxiv.org\u002Fabs\u002F2310.05079)] 重新审视基于块的量化：对于低于8比特的LLM推理，什么才是关键？\n- [[EMNLP](https:\u002F\u002Farxiv.org\u002Fabs\u002F2310.11237)] 利用量化解码为LLM添加水印 [[代码](https:\u002F\u002Fgithub.com\u002FTwilight92z\u002FQuantize-Watermark)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FTwilight92z\u002FQuantize-Watermark?style=social)](https:\u002F\u002Fgithub.com\u002FTwilight92z\u002FQuantize-Watermark)\n- [[EMNLP](https:\u002F\u002Farxiv.org\u002Fabs\u002F2310.13315)] 预训练语言模型的零样本锐度感知量化\n- [[ICCV](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FICCV2023\u002Fpapers\u002FColbert_A2Q_Accumulator-Aware_Quantization_with_Guaranteed_Overflow_Avoidance_ICCV_2023_paper.pdf)] A2Q：具有保证溢出避免功能的累加器感知量化\n- [[ICCV](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FICCV2023\u002Fpapers\u002FHe_BiViT_Extremely_Compressed_Binary_Vision_Transformers_ICCV_2023_paper.pdf)] BiViT：极度压缩的二值视觉Transformer\n- [[ICCV](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FICCV2023\u002Fpapers\u002FShang_Causal-DFQ_Causality_Guided_Data-Free_Network_Quantization_ICCV_2023_paper.pdf)] Causal-DFQ：因果导向的无数据网络量化 [[代码](https:\u002F\u002Fgithub.com\u002F42Shawn\u002FCausal-DFQ)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002F42Shawn\u002FCausal-DFQ?style=social)](https:\u002F\u002Fgithub.com\u002F42Shawn\u002FCausal-DFQ)\n- [[ICCV](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FICCV2023\u002Fpapers\u002FLi_DenseShift_Towards_Accurate_and_Efficient_Low-Bit_Power-of-Two_Quantization_ICCV_2023_paper.pdf)] DenseShift：迈向准确且高效的低比特二的幂次量化\n- [[ICCV](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FICCV2023\u002Fpapers\u002FDong_EMQ_Evolving_Training-free_Proxies_for_Automated_Mixed_Precision_Quantization_ICCV_2023_paper.pdf)] EMQ：用于自动混合精度量化的不断进化训练无依赖代理\n- [[ICCV](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FICCV2023\u002Fpapers\u002FXu_EQ-Net_Elastic_Quantization_Neural_Networks_ICCV_2023_paper.pdf)] EQ-Net：弹性量化神经网络 [[代码](https:\u002F\u002Fgithub.com\u002Fxuke225\u002FEQ-Net)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fxuke225\u002FEQ-Net?style=social)](https:\u002F\u002Fgithub.com\u002Fxuke225\u002FEQ-Net)\n- [[ICCV](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FICCV2023\u002Fhtml\u002FWu_Estimator_Meets_Equilibrium_Perspective_A_Rectified_Straight_Through_Estimator_for_ICCV_2023_paper.html)] 估计器与均衡视角相遇：用于二值神经网络训练的修正直通估计器 [[代码](https:\u002F\u002Fgithub.com\u002FDravenALG\u002FReSTE)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FDravenALG\u002FReSTE?style=social)](https:\u002F\u002Fgithub.com\u002FDravenALG\u002FReSTE)\n- [[ICCV](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FICCV2023\u002Fpapers\u002FLi_I-ViT_Integer-only_Quantization_for_Efficient_Vision_Transformer_Inference_ICCV_2023_paper.pdf)] I-ViT：仅整数量化以实现高效的视觉Transformer推理 [[代码](https:\u002F\u002Fgithub.com\u002Fzkkli\u002FI-ViT)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fzkkli\u002FI-ViT?style=social)](https:\u002F\u002Fgithub.com\u002Fzkkli\u002FI-ViT)\n- 
[[ICCV](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FICCV2023\u002Fpapers\u002FFrumkin_Jumping_through_Local_Minima_Quantization_in_the_Loss_Landscape_of_ICCV_2023_paper.pdf)] 跨越局部极小值：视觉Transformer损失景观中的量化\n- [[ICCV](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FICCV2023\u002Fpapers\u002FChen_Overcoming_Forgetting_Catastrophe_in_Quantization-Aware_Training_ICCV_2023_paper.pdf)] 克服量化感知训练中的遗忘灾难\n- [[ICCV](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FICCV2023\u002Fpapers\u002FLi_Q-Diffusion_Quantizing_Diffusion_Models_ICCV_2023_paper.pdf)] Q-diffusion：量化扩散模型 [[代码](https:\u002F\u002Fgithub.com\u002FXiuyu-Li\u002Fq-diffusion)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FXiuyu-Li\u002Fq-diffusion?style=social)](https:\u002F\u002Fgithub.com\u002FXiuyu-Li\u002Fq-diffusion)\n- [[ICCV](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FICCV2023\u002Fpapers\u002FZhang_QD-BEV__Quantization-aware_View-guided_Distillation_for_Multi-view_3D_Object_Detection_ICCV_2023_paper.pdf)] QD-BEV：面向多视角3D目标检测的视图引导式量化感知蒸馏\n- [[ICCV](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FICCV2023\u002Fpapers\u002FLi_RepQ-ViT_Scale_Reparameterization_for_Post-Training_Quantization_of_Vision_Transformers_ICCV_2023_paper.pdf)] RepQ-ViT：用于视觉Transformer训练后量化的尺度重参数化 [[代码](https:\u002F\u002Fgithub.com\u002Fzkkli\u002FRepQ-ViT)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fzkkli\u002FRepQ-ViT?style=social)](https:\u002F\u002Fgithub.com\u002Fzkkli\u002FRepQ-ViT)\n- [[ICCV](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FICCV2023\u002Fpapers\u002FBai_Unified_Data-Free_Compression_Pruning_and_Quantization_without_Fine-Tuning_ICCV_2023_paper.pdf)] 统一的无数据压缩：无需微调的剪枝和量化\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=3itjR9QxFw)] 模拟比特：利用自条件扩散模型生成离散数据\n- [[ICLR](https:\u002F\u002Farxiv.org\u002Fabs\u002F2210.17323)] GPTQ：面向生成式预训练Transformer的准确训练后量化 [[代码](https:\u002F\u002Fgithub.com\u002FIST-DASLab\u002Fgptq)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FIST-DASLab\u002Fgptq?style=social)](https:\u002F\u002Fgithub.com\u002FIST-DASLab\u002Fgptq)\n- [[ICML](https:\u002F\u002Fopenreview.net\u002Fforum?id=m2S96Qf2R3)] 少比特反向：用于减少内存占用的激活函数量化梯度 [[代码](https:\u002F\u002Fgithub.com\u002FSkoltechAI\u002Ffewbit)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FSkoltechAI\u002Ffewbit?style=social)](https:\u002F\u002Fgithub.com\u002FSkoltechAI\u002Ffewbit)\n- [[ICML](https:\u002F\u002Fopenreview.net\u002Fforum?id=EPnzNJTYsb)] FlexRound：基于逐元素除法的可学习四舍五入，用于训练后量化 [[代码](https:\u002F\u002Fopenreview.net\u002Fattachment?id=-tYCaP0phY_&name=supplementary_material)]\n- [[ICML](https:\u002F\u002Ficml.cc\u002Fvirtual\u002F2023\u002F28295)] GPT-Zip：对微调后的大型语言模型进行深度压缩\n- [[ICML](https:\u002F\u002Fopenreview.net\u002Fforum?id=DihXH24AdY)] 无振荡量化：面向低比特视觉Transformer\n- [[ICML](https:\u002F\u002Farxiv.org\u002Fabs\u002F2307.03738)] QIGen：为大型语言模型的量化推理生成高效内核 [[代码](https:\u002F\u002Fgithub.com\u002FIST-DASLab\u002FQIGen)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FIST-DASLab\u002FQIGen?style=social)](https:\u002F\u002Fgithub.com\u002FIST-DASLab\u002FQIGen)\n- [[ICML](https:\u002F\u002Fopenreview.net\u002Fforum?id=Nqp8A5IDzq)] 具有收敛保证的大规模模型量化分布式训练\n- [[ICML](https:\u002F\u002Farxiv.org\u002Fabs\u002F2211.10438)] SmoothQuant：面向大型语言模型的准确且高效的训练后量化 [[代码](https:\u002F\u002Fgithub.com\u002Fmit-han-lab\u002Fsmoothquant)] [![GitHub 
星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fmit-han-lab\u002Fsmoothquant?style=social)](https:\u002F\u002Fgithub.com\u002Fmit-han-lab\u002Fsmoothquant)\n- [[ICML](https:\u002F\u002Fopenreview.net\u002Fforum?id=i8tGb1ab1j)] 支持4位精度的理由：k位推理缩放法则\n- [[ICML](https:\u002F\u002Fopenreview.net\u002Fforum?id=q1WGm3hItW)] 理解面向语言模型的INT4量化：延迟加速、可组合性及失败案例\n- [[ICML](https:\u002F\u002Farxiv.org\u002Fabs\u002F2301.12017)] 理解面向Transformer模型的INT4量化：延迟加速、可组合性及失败案例 [[代码](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FDeepSpeed)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fmicrosoft\u002FDeepSpeed?style=social)](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FDeepSpeed)\n- [[NeurIPS](https:\u002F\u002Farxiv.org\u002Fabs\u002F2305.10299)] 二值化光谱压缩成像 [[代码](https:\u002F\u002Fgithub.com\u002Fcaiyuanhao1998\u002FBiSCI)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fcaiyuanhao1998\u002FBiSCI?style=social)](https:\u002F\u002Fgithub.com\u002Fcaiyuanhao1998\u002FBiSCI)\n- [[NeurIPS](https:\u002F\u002Fneurips.cc\u002Fvirtual\u002F2023\u002Fposter\u002F72931)] 通过低于4比特的整数量化实现压缩大型语言模型的内存高效微调\n- [[NeurIPS](https:\u002F\u002Fneurips.cc\u002Fvirtual\u002F2023\u002Fposter\u002F71880)] PackQViT：通过移动端上的完整和打包量化，加速低于8比特的视觉Transformer\n- [[NeurIPS](https:\u002F\u002Fneurips.cc\u002Fvirtual\u002F2023\u002Fposter\u002F71314)] PTQD：面向扩散模型的准确训练后量化 [[代码](https:\u002F\u002Fgithub.com\u002Fziplab\u002FPTQD)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fziplab\u002FPTQD?style=social)](https:\u002F\u002Fgithub.com\u002Fziplab\u002FPTQD)\n- [[NeurIPS](https:\u002F\u002Fneurips.cc\u002Fvirtual\u002F2023\u002Fposter\u002F70279)] Q-DM：一种高效的低比特量化扩散模型\n- [[NeurIPS](https:\u002F\u002Fneurips.cc\u002Fvirtual\u002F2023\u002Fposter\u002F71815)] QLoRA：面向量化LLM的高效微调 [[代码](https:\u002F\u002Fgithub.com\u002Fartidoro\u002Fqlora)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fartidoro\u002Fqlora?style=social)](https:\u002F\u002Fgithub.com\u002Fartidoro\u002Fqlora)\n- [[NeurIPS](https:\u002F\u002Fneurips.cc\u002Fvirtual\u002F2023\u002Fposter\u002F69982)] QuIP：带有保证的大型语言模型2比特量化 [[代码](https:\u002F\u002Fgithub.com\u002Fjerry-chee\u002FQuIP)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fjerry-chee\u002FQuIP?style=social)](https:\u002F\u002Fgithub.com\u002Fjerry-chee\u002FQuIP)\n- [[NeurIPS](https:\u002F\u002Fneurips.cc\u002Fvirtual\u002F2023\u002Fposter\u002F72396)] 面向扩散模型的时序动态量化\n- [[NeurIPS](https:\u002F\u002Fneurips.cc\u002Fvirtual\u002F2023\u002Fposter\u002F70325)] TexQ：通过纹理特征分布校准实现零样本网络量化\n- [[NeurIPS](https:\u002F\u002Fneurips.cc\u002Fvirtual\u002F2023\u002Fposter\u002F71526)] 使用前向和后向近似量化器理解神经网络二值化\n- [[TIP](https:\u002F\u002Fieeexplore.ieee.org\u002Fabstract\u002Fdocument\u002F10107717)] MBFQuant：一种针对移动CNN应用的乘法位宽固定、混合精度量化方法\n- [[TPAMI](https:\u002F\u002Fieeexplore.ieee.org\u002Fabstract\u002Fdocument\u002F9735379)] 基于优化的训练后量化：采用比特拆分与拼接\n- [[TPAMI](https:\u002F\u002Fieeexplore.ieee.org\u002Fabstract\u002Fdocument\u002F10122994)] 单路径比特共享：自动实现损失感知的模型压缩\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F2310.19102)] Atom：面向高效且准确LLM服务的低比特量化 [[代码](https:\u002F\u002Fgithub.com\u002Fefeslab\u002FAtom)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fefeslab\u002FAtom?style=social)](https:\u002F\u002Fgithub.com\u002Fefeslab\u002FAtom)\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F2309.14592)] 使用FP8格式的高效训练后量化 
[[代码](https:\u002F\u002Fgithub.com\u002Fintel\u002Fneural-compressor)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fintel\u002Fneural-compressor?style=social)](https:\u002F\u002Fgithub.com\u002Fintel\u002Fneural-compressor)\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F2310.07147)] QFT：以经济实惠的资源实现LLM的全参数量化\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F2310.16795)] QMoE：面向万亿参数模型的实用低于1比特压缩\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F2304.01089)] RPTQ：基于重新排序的大型语言模型训练后量化 [[代码](https:\u002F\u002Fgithub.com\u002Fhahnyuan\u002FRPTQ4LLM)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fhahnyuan\u002FRPTQ4LLM?style=social)](https:\u002F\u002Fgithub.com\u002Fhahnyuan\u002FRPTQ4LLM)\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F2310.17723)] ZeroQuant-HERO：面向W8A8 Transformer的硬件增强型鲁棒优化训练后量化框架\n- [[AAAI](https:\u002F\u002Farxiv.org\u002Fabs\u002F2211.16187)] 量化感知区间传播：用于训练可认证鲁棒量化神经网络 [[代码](https:\u002F\u002Fgithub.com\u002Fmlech26l\u002Fquantization_aware_ibp)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fmlech26l\u002Fquantization_aware_ibp?style=social)](https:\u002F\u002Fgithub.com\u002Fmlech26l\u002Fquantization_aware_ibp)\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=s1KljJpAukm)] PowerQuant：非均匀量化中的自同构搜索\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=VWm4o4l3V9e)] 块与子词缩放浮点数（BSFP）：一种面向低精度推理的高效非均匀量化\n- [[NeurIPS](https:\u002F\u002Farxiv.org\u002Fabs\u002F2203.14645)] REx：无数据残差量化误差扩展\n- [[NeurIPS](https:\u002F\u002Farxiv.org\u002Fabs\u002F2305.19268)] 大规模量化中的有趣特性\n- [[NeurIPS](https:\u002F\u002Farxiv.org\u002Fabs\u002F2306.11987)] 用4位整数训练Transformer [[代码](https:\u002F\u002Fgithub.com\u002Fxijiu9\u002FTrain_Transformers_with_INT4)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fxijiu9\u002FTrain_Transformers_with_INT4?style=social)](https:\u002F\u002Fgithub.com\u002Fxijiu9\u002FTrain_Transformers_with_INT4)\n- [[NeurIPS](https:\u002F\u002Fpapers.nips.cc\u002Fpaper_files\u002Fpaper\u002F2023\u002Ffile\u002F400a2e6a82520b690810b97fd67fcc4e-Paper-Conference.pdf)] 通过完全量化迈向高效且准确的Winograd卷积\n- [[NeurIPS](https:\u002F\u002Fpapers.nips.cc\u002Fpaper_files\u002Fpaper\u002F2023\u002Ffile\u002Fc48bc80aa5d3cbbdd712d1cc107b8319-Paper-Conference.pdf)] 剪枝 vs 量化：哪个更好？ [[代码](https:\u002F\u002Fgithub.com\u002FQualcomm-AI-research\u002Fpruning-vs-quantization)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FQualcomm-AI-research\u002Fpruning-vs-quantization?style=social)](https:\u002F\u002Fgithub.com\u002FQualcomm-AI-research\u002Fpruning-vs-quantization)\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=7L2mgi0TNEP)] A^2Q：面向图神经网络的聚合感知量化\n\n### 2022年\n\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=5xEgrl_5FAJ)] BiBERT: 精确的全二值化BERT。[[代码](https:\u002F\u002Fgithub.com\u002Fhtqin\u002FBiBERT)]\n- [[IJCAI](https:\u002F\u002Farxiv.org\u002Fabs\u002F2202.06483)] BiFSMN: 用于关键词检测的二值神经网络 [[代码](https:\u002F\u002Fgithub.com\u002Fhtqin\u002FBiFSMN)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fhtqin\u002FBiFSMN?style=social)](https:\u002F\u002Fgithub.com\u002Fhtqin\u002FBiFSMN)\n- [[ACM MM](https:\u002F\u002Farxiv.org\u002Fabs\u002F2303.14341)] 面向视觉Transformer的高精度训练后量化\n- [[ACL](https:\u002F\u002Faclanthology.org\u002F2022.acl-long.331)] 通过量化压缩生成式预训练语言模型\n- [[ACM Trans. Des. Autom. Electron. 
Syst.](https:\u002F\u002Fweb.archive.org\u002Fweb\u002F20220722092230id_\u002Fhttps:\u002F\u002Fdl.acm.org\u002Fdoi\u002Fpdf\u002F10.1145\u002F3549535)] 针对深度神经网络的结构化动态精度量化\n- [[ASE](https:\u002F\u002Fdl.acm.org\u002Fdoi\u002Fabs\u002F10.1145\u002F3551349.3556916)] QVIP: 一种基于整数线性规划的量化神经网络形式化验证方法\n- [[Applied Soft Computing](https:\u002F\u002Fwww.sciencedirect.com\u002Fscience\u002Farticle\u002Fpii\u002FS1568494622005038)] 基于知识蒸馏和参数量化的一种轴承故障诊断神经网络压缩方法\n- [[CCF Transactions on High Performance Computing](https:\u002F\u002Flink.springer.com\u002Farticle\u002F10.1007\u002Fs42514-022-00121-z)] 面向图神经网络的高效分段量化\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2022W\u002FECV\u002Fpapers\u002FJiang_A_Low_Memory_Footprint_Quantized_Neural_Network_for_Depth_Completion_CVPRW_2022_paper.pdf)] 一种内存占用极低的量化神经网络，用于超稀疏飞行时间深度图的深度补全\n- [[CVPR](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F9879477\u002F)] BppAttack: 基于图像量化和对比对抗学习的隐蔽高效后门攻击 [[代码](https:\u002F\u002Fgithub.com\u002FRU-System-Software-and-Security\u002FBppAttack)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FRU-System-Software-and-Security\u002FBppAttack?style=social)](https:\u002F\u002Fgithub.com\u002FRU-System-Software-and-Security\u002FBppAttack)\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2022\u002Fpapers\u002FChikin_Data-Free_Network_Compression_via_Parametric_Non-Uniform_Mixed_Precision_Quantization_CVPR_2022_paper.pdf)] 基于参数化非均匀混合精度量化的无数据网络压缩\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2022\u002Fpapers\u002FLiu_Instance-Aware_Dynamic_Neural_Network_Quantization_CVPR_2022_paper.pdf)] 实例感知的动态神经网络量化\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2022\u002Fhtml\u002FZhong_IntraQ_Learning_Synthetic_Images_With_Intra-Class_Heterogeneity_for_Zero-Shot_Network_CVPR_2022_paper.html)] IntraQ: 学习类内异质性的合成图像，用于零样本网络量化 [[代码](https:\u002F\u002Fgithub.com\u002Fzysxmu\u002FIntraQ)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fzysxmu\u002FIntraQ?style=social)](https:\u002F\u002Fgithub.com\u002Fzysxmu\u002FIntraQ)\n- [[CVPR](https:\u002F\u002Farxiv.org\u002Fabs\u002F2203.17008)] 全在教师身上：让零样本量化更接近教师 [[代码](https:\u002F\u002Fgithub.com\u002Fiamkanghyunchoi\u002Fait)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fiamkanghyunchoi\u002Fait?style=social)](https:\u002F\u002Fgithub.com\u002Fiamkanghyunchoi\u002Fait)\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2022\u002Fpapers\u002FWang_Learnable_Lookup_Table_for_Neural_Network_Quantization_CVPR_2022_paper.pdf)] 神经网络量化中的可学习查找表 [[代码](https:\u002F\u002Fgithub.com\u002FThe-Learning-And-Vision-Atelier-LAVA\u002FLLT)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FThe-Learning-And-Vision-Atelier-LAVA\u002FLLT?style=social)](https:\u002F\u002Fgithub.com\u002FThe-Learning-And-Vision-Atelier-LAVA\u002FLLT)\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2022\u002Fpapers\u002FJeon_Mr.BiQ_Post-Training_Non-Uniform_Quantization_Based_on_Minimizing_the_Reconstruction_Error_CVPR_2022_paper.pdf)] Mr.BiQ: 基于最小化重建误差的训练后非均匀量化\n- [[CVPR](https:\u002F\u002Farxiv.org\u002Fabs\u002F2111.14826)] 非均匀到均匀量化：通过广义直通估计实现精确量化 [[代码](https:\u002F\u002Fgithub.com\u002Fliuzechun\u002FNonuniform-to-Uniform-Quantization)] [![GitHub 
stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fliuzechun\u002FNonuniform-to-Uniform-Quantization?style=social)](https:\u002F\u002Fgithub.com\u002Fliuzechun\u002FNonuniform-to-Uniform-Quantization)\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2022\u002Fhtml\u002FGuo_RecDis-SNN_Rectifying_Membrane_Potential_Distribution_for_Directly_Training_Spiking_Neural_CVPR_2022_paper.html)] RecDis-SNN: 修正膜电位分布，以直接训练脉冲神经网络\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2022W\u002FECV\u002Fpapers\u002Fvan_Baalen_Simulated_Quantization_Real_Power_Savings_CVPRW_2022_paper.pdf)] 模拟量化，真正节省功耗\n- [[EANN](https:\u002F\u002Flink.springer.com\u002Fchapter\u002F10.1007\u002F978-3-031-08223-8_35)] 一种鲁棒的光子神经网络量化感知训练方法\n- [[ECCV](https:\u002F\u002Fwww.ecva.net\u002Fpapers\u002Feccv_2022\u002Fpapers_ECCV\u002Fpapers\u002F136720017.pdf)] BASQ: 面向4比特以下神经网络的分支级激活裁剪搜索量化 [[代码](https:\u002F\u002Fgithub.com\u002FHanByulKim\u002FBASQ)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FHanByulKim\u002FBASQ?style=social)](https:\u002F\u002Fgithub.com\u002FHanByulKim\u002FBASQ)\n- [[ECCV](https:\u002F\u002Farxiv.org\u002Fabs\u002F2203.08368)] 基于学习的逐层重要性进行混合精度神经网络量化 [[代码](https:\u002F\u002Fgithub.com\u002F1hunters\u002FLIMPQ)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002F1hunters\u002FLIMPQ?style=social)](https:\u002F\u002Fgithub.com\u002F1hunters\u002FLIMPQ)\n- [[ECCV](https:\u002F\u002Flink.springer.com\u002Fchapter\u002F10.1007\u002F978-3-031-20071-7_37)] 用于训练脉冲神经网络的神经形态数据增强。[[代码]](https:\u002F\u002Fgithub.com\u002FIntelligent-Computing-Lab-Yale\u002FNDA_SNN)\n- [[ECCV](https:\u002F\u002Fwww.ecva.net\u002Fpapers\u002Feccv_2022\u002Fpapers_ECCV\u002Fpapers\u002F136710657.pdf)] 非均匀步长量化，用于精确的训练后量化\n- [[ECCV](https:\u002F\u002Fwww.ecva.net\u002Fpapers\u002Feccv_2022\u002Fpapers_ECCV\u002Fpapers\u002F136710154.pdf)] 面向视觉Transformer的补丁相似性感知无数据量化 [[代码](https:\u002F\u002Fgithub.com\u002Fzkkli\u002Fpsaq-vit)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fzkkli\u002Fpsaq-vit?style=social)](https:\u002F\u002Fgithub.com\u002Fzkkli\u002Fpsaq-vit)\n- [[ECCV](https:\u002F\u002Fwww.ecva.net\u002Fpapers\u002Feccv_2022\u002Fpapers_ECCV\u002Fpapers\u002F136720190.pdf)] PTQ4ViT: 面向视觉Transformer的双均匀量化训练后量化 [[代码](https:\u002F\u002Fgithub.com\u002Fhahnyuan\u002Fptq4vit)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fhahnyuan\u002Fptq4vit?style=social)](https:\u002F\u002Fgithub.com\u002Fhahnyuan\u002Fptq4vit)\n- [[ECCV](https:\u002F\u002Fwww.ecva.net\u002Fpapers\u002Feccv_2022\u002Fpapers_ECCV\u002Fpapers\u002F136720156.pdf)] RDO-Q: 基于率失真优化的极细粒度通道级量化\n- [[ECCV](https:\u002F\u002Fwww.ecva.net\u002Fpapers\u002Feccv_2022\u002Fpapers_ECCV\u002Fpapers\u002F136710207.pdf)] 对称性正则化和饱和非线性，用于鲁棒量化\n- [[ECCV](https:\u002F\u002Fwww.ecva.net\u002Fpapers\u002Feccv_2022\u002Fpapers_ECCV\u002Fpapers\u002F136710726.pdf)] 通过等效平滑正则化实现精确的网络量化\n- [[ECCV](https:\u002F\u002Fwww.ecva.net\u002Fpapers\u002Feccv_2022\u002Fpapers_ECCV\u002Fpapers\u002F136710416.pdf)] 固定权重网络。[[代码]](https:\u002F\u002Fgithub.com\u002Fsubiawaud\u002FWeight_Fix_Networks)\n- [[ESE](https:\u002F\u002Flink.springer.com\u002Farticle\u002F10.1007\u002Fs10664-022-10202-w)] DiverGet: 一种基于搜索的软件测试方法，用于评估深度神经网络量化\n- [[Electronics](https:\u002F\u002Fwww.mdpi.com\u002F2079-9292\u002F11\u002F6\u002F945)] 关于高效卷积神经网络及其硬件加速的综述\n- [[FPGA](https:\u002F\u002Fdl.acm.org\u002Fdoi\u002Fabs\u002F10.1145\u002F3490422.3502364)] 
FILM-QNN: 基于层内混合精度量化的高效FPGA加速深度神经网络\n- [[ICCRD](https:\u002F\u002Fieeexplore.ieee.org\u002Fabstract\u002Fdocument\u002F9730411\u002Fauthors)] 神经网络训练后的量化\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=shpkpVXzo3h)] 基于块状量化的8位优化器 [[代码](https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Fbitsandbytes)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Ffacebookresearch\u002Fbitsandbytes?style=social)](https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Fbitsandbytes)\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=_CfpJazzXT2)] F8Net: 仅使用定点8位乘法进行网络量化\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=kF9DZQQrU0w)] 信息瓶颈：对（量化）神经网络的精确分析 [[代码](https:\u002F\u002Fgithub.com\u002FStephanLorenzen\u002FExactIBAnalysisInQNNs)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FStephanLorenzen\u002FExactIBAnalysisInQNNs?style=social)](https:\u002F\u002Fgithub.com\u002FStephanLorenzen\u002FExactIBAnalysisInQNNs)\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=ySQH0oDyp7)] 高精度、超低延迟脉冲神经网络的最佳ANN-SNN转换\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=ySQH0oDyp7)] QDrop: 随机丢弃量化，实现极低比特的训练后量化 [[代码](https:\u002F\u002Fgithub.com\u002Fwimh966\u002FQDrop)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fwimh966\u002FQDrop?style=social)](https:\u002F\u002Fgithub.com\u002Fwimh966\u002FQDrop)\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=JXhROKNZzOc)] SQuant: 基于对角海森矩阵近似的即时无数据量化。[[代码](https:\u002F\u002Fgithub.com\u002Fclevercool\u002FSQuant)]\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=3HJOA-1hb0e)] 朝着高效的低精度训练迈进：数据格式优化和迟滞量化\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=7udZAsEzd60)] 过参数化条件下部分量化神经网络的VC维\n- [[ICML](https:\u002F\u002Fproceedings.mlr.press\u002Fv162\u002Fdong22a.html)] 在深度神经网络中寻找任务最优的低比特子分布 [[代码](https:\u002F\u002Fgithub.com\u002FRunpeiDong\u002FDGMS)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FRunpeiDong\u002FDGMS?style=social)](https:\u002F\u002Fgithub.com\u002FRunpeiDong\u002FDGMS)\n- [[ICML](https:\u002F\u002Fproceedings.mlr.press\u002Fv162\u002Fliu22v.html)] GACT: 针对通用网络架构的激活压缩训练 [[代码](https:\u002F\u002Fgithub.com\u002FLiuXiaoxuanPKU\u002FGACT-ICML)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FLiuXiaoxuanPKU\u002FGACT-ICML?style=social)](https:\u002F\u002Fgithub.com\u002FLiuXiaoxuanPKU\u002FGACT-ICML)\n- [[ICML](https:\u002F\u002Fproceedings.mlr.press\u002Fv162\u002Fnagel22a\u002Fnagel22a.pdf)] 克服量化感知训练中的振荡 [[代码](https:\u002F\u002Fgithub.com\u002Fqualcomm-ai-research\u002Foscillations-qat)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fqualcomm-ai-research\u002Foscillations-qat?style=social)](https:\u002F\u002Fgithub.com\u002Fqualcomm-ai-research\u002Foscillations-qat)\n- [[ICML](https:\u002F\u002Fproceedings.mlr.press\u002Fv162\u002Fhuang22h.html)] SDQ: 混合精度的随机可微量化\n- [[ICPR](https:\u002F\u002Fieeexplore.ieee.org\u002Fabstract\u002Fdocument\u002F9956237)] 分层无数据CNN压缩\n- [[IEEE Internet of Things Journal](https:\u002F\u002Fieeexplore.ieee.org\u002Fabstract\u002Fdocument\u002F9915794)] FedQNN: 一种面向物联网的计算-通信效率高的联邦学习框架，采用低带宽神经网络量化\n- [[IJCAI](https:\u002F\u002Farxiv.org\u002Fabs\u002F2111.13824)] FQ-ViT: 全量化视觉Transformer的训练后量化 [[代码](https:\u002F\u002Fgithub.com\u002Fmegvii-research\u002FFQ-ViT)] [![GitHub 
stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fmegvii-research\u002FFQ-ViT?style=social)](https:\u002F\u002Fgithub.com\u002Fmegvii-research\u002FFQ-ViT)\n- [[IJCAI](https:\u002F\u002Fwww.ijcai.org\u002Fproceedings\u002F2022\u002F504)] MultiQuant: 一次训练即可实现神经网络的多比特量化\n- [[IJCAI](https:\u002F\u002Fwww.ijcai.org\u002Fproceedings\u002F2022\u002F219)] RAPQ: 挽救二的幂次低比特训练后量化的准确性 [[代码](https:\u002F\u002Fgithub.com\u002Fbillamihom\u002Frapq)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fbillamihom\u002Frapq?style=social)](https:\u002F\u002Fgithub.com\u002Fbillamihom\u002Frapq)\n- [[IJCNN](https:\u002F\u002Fieeexplore.ieee.org\u002Fabstract\u002Fdocument\u002F9892671)] 转置卷积基量化神经网络的准确度评估\n- [[IJCV](https:\u002F\u002Farxiv.org\u002Fabs\u002F2109.12338)] 面向精确二值神经网络的分布敏感信息保留\n- [[IJNS](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2209.15317.pdf)] 带有注意力机制的卷积神经网络量化\n- [[ITSM](https:\u002F\u002Fieeexplore.ieee.org\u002Fabstract\u002Fdocument\u002F9827546)] 边缘—人工智能赋能的停车场监控，采用量化神经网络\n- [[Intelligent Automation & Soft Computing](https:\u002F\u002Fweb.p.ebscohost.com\u002Fabstract?direct=true&profile=ehost&scope=site&authtype=crawler&jrnl=10798587&AN=155230773&h=buFz%2f8gWWhfyGU%2btyHURhybWlmqZvGCIyITNuefG%2bIwBHoSqNwo4CVrCT7hsuZbtZ%2brDTVnLfGgNR6EX8e6%2fGg%3d%3d&crl=c&resultNs=AdminWebAuth&resultLocal=ErrCrlNotAuth&crlhashurl=login.aspx%3fdirect%3dtrue%26profile%3dehost%26scope%3dsite%26authtype%3dcrawler%26jrnl%3d10798587%26AN%3d155230773)] 一种资源高效的卷积神经网络加速器，采用细粒度对数量化\n- [[LNAI](https:\u002F\u002Flink.springer.com\u002Fchapter\u002F10.1007\u002F978-3-031-04083-2_14)] ECQ$^x$: 以解释性为导向的低比特和稀疏DNN量化\n- [[MICRO](https:\u002F\u002Fieeexplore.ieee.org\u002Fabstract\u002Fdocument\u002F9923832)] ANT: 利用自适应数值类型进行低比特深度神经网络量化\n- [[NeurIPS](https:\u002F\u002Fproceedings.neurips.cc\u002Fpaper_files\u002Fpaper\u002F2022\u002Fhash\u002F20f94998511f25bb6378cae0e098bc46-Abstract-Conference.html)] BiMLP: 面向视觉多层感知器的紧凑二值架构 [[代码](https:\u002F\u002Fgitee.com\u002Fmindspore\u002Fmodels\u002Ftree\u002Fmaster\u002Fresearch\u002Fcv\u002FBiMLP)]\n- [[NeurIPS](https:\u002F\u002Fnips.cc\u002FConferences\u002F2022\u002FSchedule?showEvent=55032)] BiT: 鲁棒的二值化多蒸馏Transformer [[代码](https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Fbit)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Ffacebookresearch\u002Fbit?style=social)](https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Fbit)\n- [[NeurIPS](https:\u002F\u002Fnips.cc\u002FConferences\u002F2022\u002FSchedule?showEvent=55162)] ClimbQ: 类别不平衡量化，提升高效推理的鲁棒性\n- [[NeurIPS](https:\u002F\u002Fnips.cc\u002FConferences\u002F2022\u002FSchedule?showEvent=54104)] 熵驱动的混合精度量化，用于深度网络设计\n- [[NeurIPS](https:\u002F\u002Fnips.cc\u002FConferences\u002F2022\u002FSchedule?showEvent=53073)] FP8量化：指数的力量 [[代码](https:\u002F\u002Fgithub.com\u002Fqualcomm-ai-research\u002Ffp8-quantization)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fqualcomm-ai-research\u002Ffp8-quantization?style=social)](https:\u002F\u002Fgithub.com\u002Fqualcomm-ai-research\u002Ffp8-quantization)\n- [[NeurIPS](https:\u002F\u002Fnips.cc\u002FConferences\u002F2022\u002FSchedule?showEvent=54389)] 利用层间依赖关系进行训练后量化\n- [[NeurIPS](https:\u002F\u002Farxiv.org\u002Fabs\u002F2208.07339)] LLM.int8(): 大规模Transformer的8位矩阵乘法 [[代码](https:\u002F\u002Fgithub.com\u002Ftimdettmers\u002Fbitsandbytes)] [![GitHub 
stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Ftimdettmers\u002Fbitsandbytes?style=social)](https:\u002F\u002Fgithub.com\u002Ftimdettmers\u002Fbitsandbytes)\n- [[NeurIPS](https:\u002F\u002Fnips.cc\u002FConferences\u002F2022\u002FSchedule?showEvent=53412)] 最优大脑压缩：一种用于精确训练后量化和剪枝的框架 [[代码](https:\u002F\u002Fgithub.com\u002Fist-daslab\u002Fobc)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fist-daslab\u002Fobc?style=social)](https:\u002F\u002Fgithub.com\u002Fist-daslab\u002Fobc)\n- [[NeurIPS](https:\u002F\u002Fopenreview.net\u002Fforum?id=fU-m9kQe0ke)] Q-ViT: 精确且完全量化的小比特视觉Transformer [[代码](https:\u002F\u002Fgithub.com\u002Fyanjingli0202\u002Fq-vit)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fyanjingli0202\u002Fq-vit?style=social)](https:\u002F\u002Fgithub.com\u002Fyanjingli0202\u002Fq-vit)\n- [[NeurIPS](https:\u002F\u002Fnips.cc\u002FConferences\u002F2022\u002FSchedule?showEvent=54812)] 为AdderNet量化重新分配权重和激活\n- [[NeurIPS](https:\u002F\u002Fnips.cc\u002FConferences\u002F2022\u002FSchedule?showEvent=53476)] 理论上更好、数值上更快的分布式优化，结合平滑感知量化技术\n- [[NeurIPS](https:\u002F\u002Fnips.cc\u002FConferences\u002F2022\u002FSchedule?showEvent=53407)] 朝着高效训练后量化预训练语言模型的方向迈进\n- [[NeurIPS](https:\u002F\u002Fnips.cc\u002FConferences\u002F2022\u002FSchedule?showEvent=54407)] ZeroQuant: 面向大规模Transformer的高效且经济的训练后量化 [[代码](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FDeepSpeed)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fmicrosoft\u002FDeepSpeed?style=social)](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FDeepSpeed)\n- [[Neural Networks](https:\u002F\u002Fwww.sciencedirect.com\u002Fscience\u002Farticle\u002Fpii\u002FS0893608022003598)] 低精度光子神经网络的量化感知训练\n- [[Neurocomputing](https:\u002F\u002Fwww.sciencedirect.com\u002Fscience\u002Farticle\u002Fpii\u002FS0925231222008293)] EPQuant: 一种基于产品量化的图神经网络压缩方法\n- [[Ocean Engineering](https:\u002F\u002Fwww.sciencedirect.com\u002Fscience\u002Farticle\u002Fpii\u002FS0029801822017887)] 基于神经网络的自适应滑模跟踪控制，用于输入量化和饱和的自主水面航行器\n- [[PPoPP](https:\u002F\u002Fdl.acm.org\u002Fdoi\u002Fabs\u002F10.1145\u002F3503221.3508408)] QGTC: 通过GPU张量核心加速量化图神经网络\n- [[TCCN](https:\u002F\u002Fieeexplore.ieee.org\u002Fabstract\u002Fdocument\u002F9703679)] 用于无线干扰识别的低带宽卷积神经网络\n- [[TCSVT](https:\u002F\u002Fieeexplore.ieee.org\u002Fabstract\u002Fdocument\u002F9849674)] 在FPGA上使用CLIP-Q量化实现高效的卷积神经网络\n- [[TGARS](https:\u002F\u002Fieeexplore.ieee.org\u002Fabstract\u002Fdocument\u002F9362309)] 通过逐步激活量化加速基于卷积神经网络的高光谱图像分类\n- [[TODAES](https:\u002F\u002Fdl.acm.org\u002Fdoi\u002F10.1145\u002F3498328)] 模拟内存中神经网络加速的动态量化范围控制\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2206.07741.pdf)] 使用完全可微的量化混合精度神经网络进行边缘推理\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2201.08442.pdf)] 使用AI模型效率工具包（AIMET）进行神经网络量化\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2201.07703.pdf)] Q-ViT: 视觉Transformer的完全可微量化\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2206.07527.pdf)] QONNX: 表示任意精度的量化神经网络\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2202.05048.pdf)] Quantune: 使用极端梯度提升快速部署的卷积神经网络训练后量化\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2211.10438.pdf)] SmoothQuant: 面向大型语言模型的精确高效训练后量化 [[代码](https:\u002F\u002Fgithub.com\u002Fmit-han-lab\u002Fsmoothquant)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fmit-han-lab\u002Fsmoothquant?style=social)](https:\u002F\u002Fgithub.com\u002Fmit-han-lab\u002Fsmoothquant)\n- 
[[arXiv](http:\u002F\u002Farxiv.org\u002Fabs\u002F2206.15408)] 面向8位神经网络加速器的子8比特量化感知训练，支持设备端语音识别\n- [[tinyML Research Symposium](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2203.05025.pdf)] 二的幂次量化，适用于低比特且符合硬件要求的神经网络\n- [[ECCV](https:\u002F\u002Farxiv.org\u002Fabs\u002F2207.10345)] CADyQ: 面向图像超分辨率的内容感知动态量化\n- [[ECCV](https:\u002F\u002Farxiv.org\u002Fabs\u002F2207.10188)] 基于比特宽度自适应的量化感知神经网络训练：一种元学习方法 [[代码](https:\u002F\u002Fgithub.com\u002Fjsjs0369\u002FMEBQAT)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fjsjs0369\u002FMEBQAT?style=social)](https:\u002F\u002Fgithub.com\u002Fjsjs0369\u002FMEBQAT)\n- [[ECCV](https:\u002F\u002Farxiv.org\u002Fabs\u002F2007.07743)] 细粒度数据分布对齐，用于训练后量化 [[代码](https:\u002F\u002Fgithub.com\u002Fzysxmu\u002FFDDA)] [![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fzysxmu\u002FFDDA?style=social)](https:\u002F\u002Fgithub.com\u002Fzysxmu\u002FFDDA)\n- [[ICML](https:\u002F\u002Farxiv.org\u002Fabs\u002F2206.06501)] 最佳裁剪和幅度感知微分，以改善量化感知训练\n\n### 2021年\n\n- [[CVPR](https:\u002F\u002Farxiv.org\u002Fabs\u002F2103.01049)] 用于精确无数据量化的小样本多样性生成\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=9QLRCVysdlO)] BiPointNet：面向点云的二值神经网络 [[代码](https:\u002F\u002Fgithub.com\u002Fhtqin\u002FBiPointNet)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fhtqin\u002FBiPointNet?style=social)](https:\u002F\u002Fgithub.com\u002Fhtqin\u002FBiPointNet)\n- [[ICML](http:\u002F\u002Fproceedings.mlr.press\u002Fv139\u002Fliu21t\u002Fliu21t.pdf)] Adam优化器与训练策略如何助力二值神经网络优化？ [[代码](https:\u002F\u002Fgithub.com\u002Fliuzechun\u002FAdamBNN)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fliuzechun\u002FAdamBNN?style=social)](https:\u002F\u002Fgithub.com\u002Fliuzechun\u002FAdamBNN)\n- [[AAAI](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2010.02778)] 通过堆叠低维二值卷积滤波器压缩深度卷积神经网络\n- [[AAAI](https:\u002F\u002Fwww.google.com\u002Furl?sa=t&rct=j&q=&esrc=s&source=web&cd=&cad=rja&uact=8&ved=2ahUKEwj4-rjuq7nvAhUVPH0KHXlYCUQQFjAFegQIChAD&url=https%3A%2F%2Fwww.aaai.org%2FAAAI21Papers%2FAAAI-7144.ZhaoK.pdf&usg=AOvVaw3dnOXfzKkLIw_qWXj7p7Yc)] 面向CNN训练的分布自适应INT8量化\n- [[AAAI](https:\u002F\u002Fwww.semanticscholar.org\u002Fpaper\u002FFracBits%3A-Mixed-Precision-Quantization-via-Yang-Jin\u002Fcb219432863778fa173925d51fbf02af1d17ad98)] FracBits：基于分数比特宽度的混合精度量化\n- [[AAAI](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2010.02577)] 基于二值嵌入和三值系数的内存与计算高效核SVM\n- [[AAAI](https:\u002F\u002Fwww.google.com\u002Furl?sa=t&rct=j&q=&esrc=s&source=web&cd=&cad=rja&uact=8&ved=2ahUKEwjD6aPrqbnvAhXeIDQIHWNdDCUQFjADegQIAxAD&url=https%3A%2F%2Fwww.aaai.org%2FAAAI21Papers%2FAAAI-1054.HuP.pdf&usg=AOvVaw2R_BcDlKyuuAPHMeO0Q-1c)] OPQ：通过一次性剪枝-量化压缩深度神经网络\n- [[AAAI](https:\u002F\u002Fojs.aaai.org\u002Findex.php\u002FAAAI\u002Farticle\u002Fview\u002F16474\u002F16281)] 针对高效混合精度激活量化的信息论基比特瓶颈优化\n- [[AAAI](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2002.09049)] 多点后训练量化：无需混合精度的混合精度\n- [[AAAI](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2012.08185)] 可扩展的量化神经网络验证 [[代码](https:\u002F\u002Fgithub.com\u002Fmlech26l\u002Fqnn_robustness_benchmarks)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fmlech26l\u002Fqnn_robustness_benchmarks?style=social)](https:\u002F\u002Fgithub.com\u002Fmlech26l\u002Fqnn_robustness_benchmarks)\n- [[AAAI](https:\u002F\u002Farxiv.org\u002Fabs\u002F2009.14502)] 随机精度集成：面向量化深度神经网络的自知识蒸馏\n- [[AAAI](https:\u002F\u002Fwww.aaai.org\u002FAAAI21Papers\u002FAAAI-4473.LiY.pdf)] TRQ：带有残差量化的三值神经网络\n- 
[[AAAI](https:\u002F\u002Fojs.aaai.org\u002Findex.php\u002FAAAI\u002Farticle\u002Fview\u002F17434\u002F17241)] 通过凸神经网络自助法进行CNN中的不确定性量化\n- [[AAAI](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1907.05911)] 面向数据流的向量量化贝叶斯神经网络推断\n- [[ACL](https:\u002F\u002Faclanthology.org\u002F2021.findings-acl.363)] 关于Transformer中注意力值的分布、稀疏性及推理时量化\n- [[ACM MM](https:\u002F\u002Farxiv.org\u002Fabs\u002F2011.14265)] 全量化图像超分辨率网络 [[代码](https:\u002F\u002Fgithub.com\u002Fbillhhh\u002FFQSR)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fbillhhh\u002FFQSR?style=social)](https:\u002F\u002Fgithub.com\u002Fbillhhh\u002FFQSR)\n- [[ACM MM](https:\u002F\u002Fdl.acm.org\u002Fdoi\u002F10.1145\u002F3474085.3475224)] VQMG：用于显式表征学习的分层向量量化与多跳图推理\n- [[CVPR](https:\u002F\u002Farxiv.org\u002Fabs\u002F2012.15823)] 二值图神经网络 [[代码](https:\u002F\u002Fgithub.com\u002Fmbahri\u002Fbinary_gnn)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fmbahri\u002Fbinary_gnn?style=social)](https:\u002F\u002Fgithub.com\u002Fmbahri\u002Fbinary_gnn)\n- [[CVPR](https:\u002F\u002Farxiv.org\u002Fabs\u002F2103.07156)] 用于高精度低比特神经网络的可学习压缩量化\n- [[CVPR](https:\u002F\u002Farxiv.org\u002Fabs\u002F2104.00903)] 基于逐元素梯度缩放的网络量化 [[代码](https:\u002F\u002Fgithub.com\u002Fcvlab-yonsei\u002FEWGS)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fcvlab-yonsei\u002FEWGS?style=social)](https:\u002F\u002Fgithub.com\u002Fcvlab-yonsei\u002FEWGS)\n- [[CVPR](https:\u002F\u002Farxiv.org\u002Fabs\u002F2010.15703)] 置换、量化与微调：高效的神经网络压缩 [[代码](https:\u002F\u002Fgithub.com\u002Fuber-research\u002Fpermute-quantize-finetune)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fuber-research\u002Fpermute-quantize-finetune?style=social)](https:\u002F\u002Fgithub.com\u002Fuber-research\u002Fpermute-quantize-finetune)\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2022\u002Fpapers\u002FZhang_PokeBNN_A_Binary_Pursuit_of_Lightweight_Accuracy_CVPR_2022_paper.pdf)] PokeBNN：追求轻量级精度的二值网络 [[代码](https:\u002F\u002Fgithub.com\u002Fgoogle\u002Faqt)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fgoogle\u002Faqt?style=social)](https:\u002F\u002Fgithub.com\u002Fgoogle\u002Faqt)\n- [[CVPR](http:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2021\u002Fhtml\u002FShen_S2-BNN_Bridging_the_Gap_Between_Self-Supervised_Real_and_1-Bit_Neural_CVPR_2021_paper.html)] S2-bnn：通过引导式分布校准弥合自监督真实网络与1比特网络之间的差距 [[代码](https:\u002F\u002Fgithub.com\u002Fszq0214\u002FS2-BNN)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fszq0214\u002FS2-BNN?style=social)](https:\u002F\u002Fgithub.com\u002Fszq0214\u002FS2-BNN)\n- [[CVPR](https:\u002F\u002Farxiv.org\u002Fabs\u002F2103.15263)] 零样本对抗量化 [[代码](https:\u002F\u002Fgithub.com\u002FFLHonker\u002FZAQ-code)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FFLHonker\u002FZAQ-code?style=social)](https:\u002F\u002Fgithub.com\u002FFLHonker\u002FZAQ-code)\n- [[ECCV](https:\u002F\u002Fwww.ecva.net\u002Fpapers\u002Feccv_2020\u002Fpapers_ECCV\u002Fpapers\u002F123700562.pdf)] PAMS：基于参数化最大尺度的量化超分辨率 [[代码](https:\u002F\u002Fgithub.com\u002Fcolorjam\u002FPAMS)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fcolorjam\u002FPAMS?style=social)](https:\u002F\u002Fgithub.com\u002Fcolorjam\u002FPAMS)\n- 
[[ICCV](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FICCV2021\u002Fhtml\u002FLi_MixMix_All_You_Need_for_Data-Free_Compression_Are_Feature_and_ICCV_2021_paper.html)] MixMix：实现无数据压缩只需特征与数据混合\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=POWv6hDd9XH)] BRECQ：通过块重建突破后训练量化极限 [[代码](https:\u002F\u002Fgithub.com\u002Fyhhhli\u002FBRECQ)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fyhhhli\u002FBRECQ?style=social)](https:\u002F\u002Fgithub.com\u002Fyhhhli\u002FBRECQ)\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=TiXl51SCNw8)] BSQ：探索位级稀疏性以实现混合精度神经网络量化 [[代码](https:\u002F\u002Fgithub.com\u002Fyanghr\u002FBSQ)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fyanghr\u002FBSQ?style=social)](https:\u002F\u002Fgithub.com\u002Fyanghr\u002FBSQ)\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=NSBrFgJAHg)] Degree-Quant：面向图神经网络的量化感知训练\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=MxaY4FzOTa)] 高容量专家二值网络 [[代码](https:\u002F\u002Fgithub.com\u002F1adrianb\u002Fexpert-binary-networks)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002F1adrianb\u002Fexpert-binary-networks?style=social)](https:\u002F\u002Fgithub.com\u002F1adrianb\u002Fexpert-binary-networks)\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=3SV-ZePhnZM)] 基于深度嵌入空间中向量量化的增量式小样本学习\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=U_mat0b9iv)] 多奖彩票假说：通过修剪随机加权网络寻找高精度二值神经网络 [[代码](https:\u002F\u002Fgithub.com\u002Fchrundle\u002Fbiprop)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fchrundle\u002Fbiprop?style=social)](https:\u002F\u002Fgithub.com\u002Fchrundle\u002Fbiprop)\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=EoFNy62JGd)] 神经网络梯度接近对数正态分布：改进量化与稀疏训练\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=sTeoJiB4uR)] 使用二值神经网络降低深度生成模型的计算成本\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=Qr0aRliE_Hb)] 简单增强大有裨益：用于DNN量化的ADRL\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=pBqLS-7KYAF)] 稀疏量化谱聚类\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=dV19Yyi1fS3)] 带有量化噪声的训练以实现极端模型压缩 [[代码](https:\u002F\u002Fgithub.com\u002Fpytorch\u002Ffairseq\u002Ftree\u002Fmaster\u002Fexamples\u002Fquant_noise)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fpytorch\u002Ffairseq?style=social)](https:\u002F\u002Fgithub.com\u002Fpytorch\u002Ffairseq)\n- [[ICLR](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2007.13242.pdf)] WrapNet：采用超低分辨率算术的神经网络推断\n- [[ICML](https:\u002F\u002Fproceedings.mlr.press\u002Fv139\u002Fchen21z.html)] ActNN：通过2比特激活压缩训练降低训练内存占用 [[代码](https:\u002F\u002Fgithub.com\u002Fucbrise\u002Factnn)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fucbrise\u002Factnn?style=social)](https:\u002F\u002Fgithub.com\u002Fucbrise\u002Factnn)\n- [[ICML](https:\u002F\u002Fproceedings.mlr.press\u002Fv139\u002Ffu21d.html)] Auto-NBA：在网络、比特宽度与加速器联合空间中高效且有效地搜索 [[代码](https:\u002F\u002Fgithub.com\u002FRICE-EIC\u002FAuto-NBA)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FRICE-EIC\u002FAuto-NBA?style=social)](https:\u002F\u002Fgithub.com\u002FRICE-EIC\u002FAuto-NBA)\n- [[ICML](https:\u002F\u002Fproceedings.mlr.press\u002Fv139\u002Fzhang21r.html)] 混合精度与自适应分辨率的可微动态量化\n- [[ICML](https:\u002F\u002Fproceedings.mlr.press\u002Fv139\u002Fyao21a.html)] HAWQ-V3：二进分数（Dyadic）神经网络量化 [[代码](https:\u002F\u002Fgithub.com\u002FZhen-Dong\u002FHAWQ)] [![GitHub 
星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FZhen-Dong\u002FHAWQ?style=social)](https:\u002F\u002Fgithub.com\u002FZhen-Dong\u002FHAWQ)\n- [[ICML](https:\u002F\u002Fproceedings.mlr.press\u002Fv139\u002Fkim21d.html)] I-BERT：纯整数BERT量化 [[代码](https:\u002F\u002Fgithub.com\u002Fkssteven418\u002FI-BERT)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fkssteven418\u002FI-BERT?style=social)](https:\u002F\u002Fgithub.com\u002Fkssteven418\u002FI-BERT)\n- [[NeurIPS](https:\u002F\u002Fopenreview.net\u002Fforum?id=YygA0yppTR)] 胜利之手：压缩深度网络可提升分布外鲁棒性 [[代码](https:\u002F\u002Fgithub.com\u002Fchrundle\u002Fbiprop)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fchrundle\u002Fbiprop?style=social)](https:\u002F\u002Fgithub.com\u002Fchrundle\u002Fbiprop)\n- [[NeurIPS](https:\u002F\u002Fopenreview.net\u002Fforum?id=Z_J5bCb4Rra)] 生成模型的散度边界：样本复杂度、量化效应与边界积分\n- [[NeurIPS](https:\u002F\u002Fopenreview.net\u002Fforum?id=9TX5OsKJvm)] 视觉Transformer的后训练量化\n- [[NeurIPS](https:\u002F\u002Fopenreview.net\u002Fforum?id=qe9z54E_cqE)] 后训练稀疏感知量化 [[代码](https:\u002F\u002Fgithub.com\u002Fgilshm\u002Fsparq)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fgilshm\u002Fsparq?style=social)](https:\u002F\u002Fgithub.com\u002Fgilshm\u002Fsparq)\n- [[NeurIPS](https:\u002F\u002Fopenreview.net\u002Fforum?id=ejo1_Weiart)] Qimera：利用合成边界支持样本实现无数据量化 [[代码](https:\u002F\u002Fgithub.com\u002Fiamkanghyunchoi\u002Fqimera)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fiamkanghyunchoi\u002Fqimera?style=social)](https:\u002F\u002Fgithub.com\u002Fiamkanghyunchoi\u002Fqimera)\n- [[NeurIPS](https:\u002F\u002Fopenreview.net\u002Fforum?id=0kCxbBQknN)] Qu-ANTI-zation：利用量化伪影达成对抗性结果\n- [[NeurIPS](https:\u002F\u002Fopenreview.net\u002Fforum?id=EO-CQzgcIxd)] VQ-GNN：一种使用向量量化扩展图神经网络的通用框架\n- [[arXiv](http:\u002F\u002Farxiv.org\u002Fabs\u002F2103.13630)] 面向高效神经网络推断的量化方法综述\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2106.08295.pdf)] 神经网络量化白皮书\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F1911.07346)] 任意精度深度神经网络 [[代码](https:\u002F\u002Fgithub.com\u002FSHI-Labs\u002FAny-Precision-DNNs)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FSHI-Labs\u002FAny-Precision-DNNs?style=social)](https:\u002F\u002Fgithub.com\u002FSHI-Labs\u002FAny-Precision-DNNs)\n- [[arXiv](http:\u002F\u002Farxiv.org\u002Fabs\u002F2103.12369)] ReCU：复活二值神经网络中的死亡权重 [[代码](https:\u002F\u002Fgithub.com\u002Fz-hXu\u002FReCU)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fz-hXu\u002FReCU?style=social)](https:\u002F\u002Fgithub.com\u002Fz-hXu\u002FReCU)\n- [[AAAI](https:\u002F\u002Fcdn.aaai.org\u002Fojs\u002F16263\u002F16263-13-19757-1-2-20210518.pdf)] 不使用批归一化训练二值神经网络用于图像超分辨率\n- [[AAAI](https:\u002F\u002Fojs.aaai.org\u002Findex.php\u002FAAAI\u002Farticle\u002Fview\u002F16306)] SA-BNN：状态感知二值神经网络\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2021\u002Fpapers\u002FOh_Automated_Log-Scale_Quantization_for_Low-Cost_Deep_Neural_Networks_CVPR_2021_paper.pdf)] 低成本深度神经网络的自动化对数尺度量化\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FCVPR2021\u002Fpapers\u002FKryzhanovskiy_QPP_Real-Time_Quantization_Parameter_Prediction_for_Deep_Neural_Networks_CVPR_2021_paper.pdf)] QPP：深度神经网络的实时量化参数预测\n- [[ICLR](https:\u002F\u002Farxiv.org\u002Fabs\u002F2006.10518)] 改善后训练神经网络量化：逐层校准与整数规划 [[代码](https:\u002F\u002Fgithub.com\u002Fitayhubara\u002FCalibTIP)] [![GitHub 
星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fitayhubara\u002FCalibTIP?style=social)](https:\u002F\u002Fgithub.com\u002Fitayhubara\u002FCalibTIP)\n- [[ICML](https:\u002F\u002Fproceedings.mlr.press\u002Fv139\u002Fhubara21a\u002Fhubara21a.pdf)] 使用小型校准集实现精确的后训练量化\n- [[NeurIPS](https:\u002F\u002Farxiv.org\u002Fabs\u002F2105.08952)] BatchQuant：具有稳健量化器的全量化解析架构搜索\n\n### 2020年\n\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2020\u002Fpapers\u002FQin_Forward_and_Backward_Information_Retention_for_Accurate_Binary_Neural_Networks_CVPR_2020_paper.pdf)] 用于精确二值神经网络的前向与后向信息保留 [[代码](https:\u002F\u002Fgithub.com\u002Fhtqin\u002FIR-Net)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fhtqin\u002FIR-Net?style=social)](https:\u002F\u002Fgithub.com\u002Fhtqin\u002FIR-Net)\n- [[PR](https:\u002F\u002Farxiv.org\u002Fabs\u002F2004.03333)] 二值神经网络：综述\n- [[AAAI](https:\u002F\u002Faaai.org\u002Fojs\u002Findex.php\u002FAAAI\u002Farticle\u002Fview\u002F6035)] HLHLp：为在损失曲面上达到平坦极小值而训练的量化神经网络\n- [[AAAI](https:\u002F\u002Farxiv.org\u002Fabs\u002F1909.05840)] Q-BERT：基于海森矩阵的 BERT 超低精度量化\n- [[AAAI](https:\u002F\u002Faaai.org\u002Fojs\u002Findex.php\u002FAAAI\u002Farticle\u002Fview\u002F6900)] 引入稀疏性的二值化神经网络\n- [[AAAI](https:\u002F\u002Faaai.org\u002Fojs\u002Findex.php\u002FAAAI\u002Farticle\u002Fview\u002F6134)] 基于多阶段自适应的高精度低比特量化\n- [[ACL](https:\u002F\u002Fwww.aclweb.org\u002Fanthology\u002F2020.sustainlp-1.4.pdf)] 用于文本分类的端到端二值化神经网络\n- [[COOL CHIPS](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F9097642\u002F)] 面向二值神经网络的新型 DRAM 内加速器架构\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2020\u002Fpapers\u002FWang_APQ_Joint_Search_for_Network_Architecture_Pruning_and_Quantization_Policy_CVPR_2020_paper.pdf)] APQ：网络架构、剪枝与量化策略的联合搜索 [[代码](https:\u002F\u002Fgithub.com\u002Fmit-han-lab\u002Fapq)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fmit-han-lab\u002Fapq?style=social)](https:\u002F\u002Fgithub.com\u002Fmit-han-lab\u002Fapq)\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2020\u002Fpapers\u002FWang_BiDet_An_Efficient_Binarized_Object_Detector_CVPR_2020_paper.pdf)] BiDet：一种高效的二值化目标检测器。[[代码](https:\u002F\u002Fgithub.com\u002FZiweiWangTHU\u002FBiDet)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FZiweiWangTHU\u002FBiDet?style=social)](https:\u002F\u002Fgithub.com\u002FZiweiWangTHU\u002FBiDet)\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2020\u002Fpapers\u002FZhang_Fixed-Point_Back-Propagation_Training_CVPR_2020_paper.pdf)] 定点反向传播训练\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2020\u002Fpapers\u002FHan_GhostNet_More_Features_From_Cheap_Operations_CVPR_2020_paper.pdf)] GhostNet：以低成本操作获得更多特征\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPRW_2020\u002Fpapers\u002Fw40\u002FYu_Low-Bit_Quantization_Needs_Good_Distribution_CVPRW_2020_paper.pdf)] 低比特量化需要良好的分布\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2020\u002Fpapers\u002FWu_Rotation_Consistent_Margin_Loss_for_Efficient_Low-Bit_Face_Recognition_CVPR_2020_paper.pdf)] 旋转一致性间隔损失用于高效低比特人脸识别\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2002.10778.pdf)] 使用贝叶斯学习规则训练二值神经网络\n- [[DATE](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F9116220)] BNNsplit：面向嵌入式分布式 FPGA 计算系统的二值神经网络\n- [[DATE](https:\u002F\u002Fieeexplore.ieee.org\u002Fabstract\u002Fdocument\u002F9116308)] OrthrusPE：面向二值神经网络的运行时可重构处理单元\n- 
[[DATE](https:\u002F\u002Farxiv.org\u002Fabs\u002F1912.04050)] PhoneBit：面向手机的高效 GPU 加速二值神经网络推理引擎\n- [[ECCV](http:\u002F\u002Farxiv.org\u002Fabs\u002F2003.01711)] BATS：二值架构搜索\n- [[ECCV](https:\u002F\u002Farxiv.org\u002Fabs\u002F2007.10463)] 针对硬件效率的可微分联合剪枝与量化\n- [[ECCV](https:\u002F\u002Farxiv.org\u002Fabs\u002F2003.03603)] 生成式无数据低比特量化 [[代码](https:\u002F\u002Fgithub.com\u002Fxushoukai\u002FGDFQ)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fxushoukai\u002FGDFQ?style=social)](https:\u002F\u002Fgithub.com\u002Fxushoukai\u002FGDFQ)\n- [[ECCV](https:\u002F\u002Farxiv.org\u002Fabs\u002F2002.06963)] 用于二值网络的学习架构 [[代码](https:\u002F\u002Fgithub.com\u002Fgistvision\u002Fbnas)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fgistvision\u002Fbnas?style=social)](https:\u002F\u002Fgithub.com\u002Fgistvision\u002Fbnas)\n- [[ECCV](https:\u002F\u002Fwww.ecva.net\u002Fpapers\u002Feccv_2020\u002Fpapers_ECCV\u002Fpapers\u002F123510426.pdf)] PROFIT：一种针对 4 比特以下 MobileNet 模型的新型训练方法\n- [[ECCV](https:\u002F\u002Fwww.ecva.net\u002Fpapers\u002Feccv_2020\u002Fpapers_ECCV\u002Fpapers\u002F123480222.pdf)] ProxyBNN：通过代理矩阵学习二值神经网络\n- [[ECCV](https:\u002F\u002Fwww.ecva.net\u002Fpapers\u002Feccv_2020\u002Fpapers_ECCV\u002Fpapers\u002F123590137.pdf)] ReActNet：具有广义激活函数的高精度二值神经网络 [[代码](https:\u002F\u002Fgithub.com\u002Fliuzechun\u002FReActNet)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fliuzechun\u002FReActNet?style=social)](https:\u002F\u002Fgithub.com\u002Fliuzechun\u002FReActNet)\n- [[EMNLP](https:\u002F\u002Farxiv.org\u002Fabs\u002F1910.10485)] 用于机器翻译的全量化 Transformer\n- [[EMNLP](https:\u002F\u002Farxiv.org\u002Fabs\u002F2009.12812)] TernaryBERT：蒸馏感知的超低比特 BERT [[代码](https:\u002F\u002Fgithub.com\u002Fhuawei-noah\u002FPretrained-Language-Model)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fhuawei-noah\u002FPretrained-Language-Model?style=social)](https:\u002F\u002Fgithub.com\u002Fhuawei-noah\u002FPretrained-Language-Model)\n- [[ICASSP](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F9054599)] 带门控残差的平衡二值神经网络\n- [[ICET](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F9119704)] 一种节能的袋装二值神经网络加速器\n- [[ICLR](https:\u002F\u002Farxiv.org\u002Fabs\u002F2002.06517)] BinaryDuo：通过耦合二值激活减少二值激活网络中的梯度不匹配 [[代码](https:\u002F\u002Fgithub.com\u002FHyungjun-K1m\u002FBinaryDuo)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FHyungjun-K1m\u002FBinaryDuo?style=social)](https:\u002F\u002Fgithub.com\u002FHyungjun-K1m\u002FBinaryDuo)\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fpdf?id=XKeyCSUWusK)] DMS：二值神经网络的可微维度搜索\n- [[ICLR](https:\u002F\u002Farxiv.org\u002Fabs\u002F1902.08153)] 学习步长量化\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fforum?id=Hyx0slrFvH)] 混合精度 DNN：你所需要的只是一个好的参数化 [[代码](https:\u002F\u002Fgithub.com\u002Fsony\u002Fai-research-code\u002Ftree\u002Fmaster\u002Fmixed-precision-dnns)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fsony\u002Fai-research-code?style=social)](https:\u002F\u002Fgithub.com\u002Fsony\u002Fai-research-code)\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fpdf?id=BJg4NgBKvH)] 使用实数到二进制卷积训练二值神经网络\n- [[ICML](https:\u002F\u002Farxiv.org\u002Fabs\u002F1908.10396)] 利用各向异性向量量化加速大规模推理\n- [[ICML](https:\u002F\u002Farxiv.org\u002Fabs\u002F2004.09576)] LSQ+：通过可学习偏移和更好的初始化改进低比特量化\n- [[ICML](https:\u002F\u002Fproceedings.icml.cc\u002Fstatic\u002Fpaper_files\u002Ficml\u002F2020\u002F181-Paper.pdf)] 通过带噪声监督的学习训练二值神经网络\n- 
[[ICML](https:\u002F\u002Farxiv.org\u002Fabs\u002F2004.10568)] 向上还是向下？训练后量化中的自适应四舍五入\n- [[IEEE Access](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F9091590\u002F)] 一种节能且高吞吐的内存内计算位单元，具有出色的工艺变异鲁棒性，适用于二值神经网络\n- [[IEEE TCS.I](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2003.12558.pdf)] IMAC：在 6T SRAM 阵列中实现内存内多比特乘累加\n- [[IEEE TCS.II](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F9144282\u002F)] 一种资源高效的二值卷积神经网络推理加速器\n- [[IEEE Trans. Electron Devices](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F9112690)] 基于二值忆阻器的高鲁棒性 BNN 推理加速器设计\n- [[IEEE Trans. Magn](https:\u002F\u002Farxiv.org\u002Fabs\u002F2003.05132)] SIMBA：一种基于斯格明子的内存内二值神经网络加速器\n- [[IJCAI](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2005.00057.pdf)] CP-NAS：面向二值神经网络的父子架构搜索\n- [[IJCAI](https:\u002F\u002Fwww.ijcai.org\u002Fproceedings\u002F2020\u002F292)] 直接量化用于训练高精度低比特深度神经网络\n- [[IJCAI](https:\u002F\u002Fwww.ijcai.org\u002Fproceedings\u002F2020\u002F288)] 全嵌套神经网络用于自适应压缩与量化\n- [[IJCAI](https:\u002F\u002Fwww.ijcai.org\u002Fproceedings\u002F2020\u002F121)] 溢出感知量化：通过低比特乘累加操作加速神经网络推理\n- [[IJCAI](https:\u002F\u002Fwww.ijcai.org\u002Fproceedings\u002F2020\u002F318)] 软阈值三值网络\n- [[IJCAI](https:\u002F\u002Fwww.ijcai.org\u002FProceedings\u002F2020\u002F0520.pdf)] 朝着 Transformer 模型的完全 8 位整数推理迈进\n- [[IJCV](https:\u002F\u002Farxiv.org\u002Fabs\u002F2009.04247)] 面向高效目标识别的二值化神经架构搜索\n- [[ISCAS](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2004.08914.pdf)] MuBiNN：用于 EEG 信号分类的多级二值递归神经网络\n- [[ISQED](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F9136977)] BNN 剪枝：基于权重翻转频率指导的二值神经网络剪枝 [[代码](https:\u002F\u002Fgithub.com\u002FPSCLab-ASU\u002FBNNPruning)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FPSCLab-ASU\u002FBNNPruning?style=social)](https:\u002F\u002Fgithub.com\u002FPSCLab-ASU\u002FBNNPruning)\n- [[MICRO](http:\u002F\u002Farxiv.org\u002Fabs\u002F2005.03842)] GOBO：为低延迟和节能推理量化基于注意力的 NLP 模型\n- [[MLST](https:\u002F\u002Farxiv.org\u002Fabs\u002F2003.06308)] 使用 HLS4ML 将 FPGA 上的深度神经网络压缩至二值和三值精度\n- [[NN](https:\u002F\u002Fwww.sciencedirect.com\u002Fscience\u002Farticle\u002Fabs\u002Fpii\u002FS0893608019304290?via%3Dihub)] 使用完整的 8 位整数训练高性能和大规模深度神经网络\n- [[NeurIPS](https:\u002F\u002Fproceedings.neurips.cc\u002Fpaper\u002F2020\u002Fhash\u002F20b5e1cf8694af7a3c1ba4a87f073021-Abstract.html)] 数据并行 SGD 的自适应梯度量化 [[代码](https:\u002F\u002Fgithub.com\u002Ftabrizian\u002Flearning-to-quantize)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Ftabrizian\u002Flearning-to-quantize?style=social)](https:\u002F\u002Fgithub.com\u002Ftabrizian\u002Flearning-to-quantize)\n- [[NeurIPS](https:\u002F\u002Fproceedings.neurips.cc\u002Fpaper\u002F2020\u002Fhash\u002F3f13cf4ddf6fc50c0d39a1d5aeb57dd8-Abstract.html)] 贝叶斯比特：统一量化与剪枝\n- [[NeurIPS](https:\u002F\u002Fproceedings.neurips.cc\u002Fpaper\u002F2020\u002Fhash\u002F26ed695e9b7b9f6463ef4bc1fd74fc87-Abstract.html)] 缩小去量化差距：PixelCNN 作为单层流 [[代码](https:\u002F\u002Fgithub.com\u002Fdidriknielsen\u002Fpixelcnn_flow)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fdidriknielsen\u002Fpixelcnn_flow?style=social)](https:\u002F\u002Fgithub.com\u002Fdidriknielsen\u002Fpixelcnn_flow)\n- [[NeurIPS](https:\u002F\u002Fproceedings.neurips.cc\u002Fpaper\u002F2020\u002Fhash\u002F1385974ed5904a438616ff7bdb3f7439-Abstract.html)] 高效精确地验证二值神经网络 [[代码](https:\u002F\u002Fgithub.com\u002Fjia-kai\u002Feevbnn)] [![GitHub 
星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fjia-kai\u002Feevbnn?style=social)](https:\u002F\u002Fgithub.com\u002Fjia-kai\u002Feevbnn)\n- [[NeurIPS](https:\u002F\u002Fproceedings.neurips.cc\u002Fpaper\u002F2020\u002Fhash\u002F0e230b1a582d76526b7ad7fc62ae937d-Abstract.html)] FleXOR：可训练的分数量化\n- [[NeurIPS](https:\u002F\u002Fproceedings.neurips.cc\u002Fpaper\u002F2020\u002Fhash\u002Fd77c703536718b95308130ff2e5cf9ee-Abstract.html)] HAWQ-V2：海森矩阵感知的迹加权神经网络量化\n- [[NeurIPS](https:\u002F\u002Fproceedings.neurips.cc\u002Fpaper\u002F2020\u002Fhash\u002F96fca94df72984fc97ee5095410d4dec-Abstract.html)] 随机二值网络的路径样本分析梯度估计器 [[代码](https:\u002F\u002Fgithub.com\u002Fshekhovt\u002FPSA-Neurips2020)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fshekhovt\u002FPSA-Neurips2020?style=social)](https:\u002F\u002Fgithub.com\u002Fshekhovt\u002FPSA-Neurips2020)\n- [[NeurIPS](http:\u002F\u002Farxiv.org\u002Fabs\u002F2005.11035)] 基于位置的缩放梯度用于模型量化和剪枝 [[代码](https:\u002F\u002Fgithub.com\u002FJangho-Kim\u002FPSG-pytorch)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FJangho-Kim\u002FPSG-pytorch?style=social)](https:\u002F\u002Fgithub.com\u002FJangho-Kim\u002FPSG-pytorch)\n- [[NeurIPS](https:\u002F\u002Fproceedings.neurips.cc\u002Fpaper\u002F2020\u002Fhash\u002F3948ead63a9f2944218de038d8934305-Abstract.html)] 鲁棒量化：一个模型统治一切\n- [[NeurIPS](https:\u002F\u002Fpapers.nips.cc\u002Fpaper\u002F2020\u002Ffile\u002F53c5b2affa12eed84dfec9bfd83550b1-Paper.pdf)] 旋转二值神经网络 [[代码](https:\u002F\u002Fgithub.com\u002Flmbxmu\u002FRBNN)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Flmbxmu\u002FRBNN?style=social)](https:\u002F\u002Fgithub.com\u002Flmbxmu\u002FRBNN)\n- [[NeurIPS](https:\u002F\u002Fproceedings.neurips.cc\u002Fpaper\u002F2020\u002Ffile\u002F2a084e55c87b1ebcdaad1f62fdbbac8e-Paper.pdf)] 在量化神经网络中寻找低比特权重 [[代码](https:\u002F\u002Fgithub.com\u002Fzhaohui-yang\u002FBinary-Neural-Networks\u002Ftree\u002Fmain\u002FSLB)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fzhaohui-yang\u002FBinary-Neural-Networks?style=social)](https:\u002F\u002Fgithub.com\u002Fzhaohui-yang\u002FBinary-Neural-Networks)\n- [[NeurIPS](https:\u002F\u002Fproceedings.neurips.cc\u002Fpaper\u002F2020\u002Fhash\u002F92049debbe566ca5782a3045cf300a3c-Abstract.html)] 普遍量化的神经压缩\n- [[Neurocomputing](https:\u002F\u002Fwww.sciencedirect.com\u002Fscience\u002Farticle\u002Fabs\u002Fpii\u002FS0925231219314274)] 基于权重二值化级联卷积神经网络的眼部定位\n- [[PR Letters](https:\u002F\u002Farxiv.org\u002Fabs\u002F2008.01438)] 控制二值神经网络的信息容量\n- [[SysML](https:\u002F\u002Fubicomplab.cs.washington.edu\u002Fpdfs\u002Friptide.pdf)] Riptide：快速端到端二值神经网络 [[代码](https:\u002F\u002Fgithub.com\u002Fjwfromm\u002FRiptide)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fjwfromm\u002FRiptide?style=social)](https:\u002F\u002Fgithub.com\u002Fjwfromm\u002FRiptide)\n- [[TPAMI](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F8573867\u002F)] 通过并行剪枝-量化压缩深度神经网络\n- [[TPAMI](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F8444745\u002F)] 面向资源有限情况下的地标定位的层次化二值 CNN [[代码](https:\u002F\u002Fwww.adrianbulat.com\u002Fbinary-cnn-landmarks)]\n- [[TPAMI](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F8674614\u002F)] 朝着高效 U-Net 迈进：一种耦合与量化的方法\n- [[TVLSI](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2003.02628.pdf)] Phoenix：一种面向卷积神经网络的低精度浮点量化导向架构\n- 
[[WACV](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_WACV_2020\u002Fpapers\u002FPhan_MoBiNet_A_Mobile_Binary_Network_for_Image_Classification_WACV_2020_paper.pdf)] MoBiNet：一款用于图像分类的移动二值网络\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F2006.16578)] 通过 Turing GPU 中的位张量核心加速二值神经网络 [[代码](https:\u002F\u002Fgithub.com\u002Fpnnl\u002FTCBNN)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fpnnl\u002FTCBNN?style=social)](https:\u002F\u002Fgithub.com\u002Fpnnl\u002FTCBNN)\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2004.11147.pdf)] 二值图神经网络\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F2012.15701)] BinaryBERT：突破 BERT 量化极限 [[代码](https:\u002F\u002Fgithub.com\u002Fhuawei-noah\u002FPretrained-Language-Model)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fhuawei-noah\u002FPretrained-Language-Model?style=social)](https:\u002F\u002Fgithub.com\u002Fhuawei-noah\u002FPretrained-Language-Model)\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2007.05223.pdf)] 蒸馏引导的二值卷积神经网络残差学习\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1909.09139.pdf)] 批量归一化如何帮助二值训练？\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F2001.05936)] MeliusNet：二值神经网络能否达到 MobileNet 级别的精度？[[代码](https:\u002F\u002Fgithub.com\u002Fhpi-xnor\u002FBMXNet-v2)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fhpi-xnor\u002FBMXNet-v2?style=social)](https:\u002F\u002Fgithub.com\u002Fhpi-xnor\u002FBMXNet-v2)\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2001.01091.pdf)] RPR：用于训练的随机分区松弛；二值和三值权重神经网络\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F2004.07320)] 通过量化噪声进行极端模型压缩的训练 [[代码](https:\u002F\u002Fgithub.com\u002Fpytorch\u002Ffairseq\u002Ftree\u002Fmaster\u002Fexamples\u002Fquant_noise)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fpytorch\u002Ffairseq?style=social)](https:\u002F\u002Fgithub.com\u002Fpytorch\u002Ffairseq)\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F2006.07522)] 通过信息瓶颈理解二值神经网络的学习动力学\n- [[论文](https:\u002F\u002Fwww.researchgate.net\u002Fpublication\u002F343568789_Towards_Lossless_Binary_Convolutional_Neural_Networks_Using_Piecewise_Approximation)] 采用分段逼近实现无损二值卷积神经网络\n- [[CVPR](https:\u002F\u002Farxiv.org\u002Fabs\u002F2001.00281)] ZeroQ：一种新型零样本量化框架 [[代码](https:\u002F\u002Fgithub.com\u002Famirgholami\u002FZeroQ)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Famirgholami\u002FZeroQ?style=social)](https:\u002F\u002Fgithub.com\u002Famirgholami\u002FZeroQ)\n- [[CVPR](https:\u002F\u002Farxiv.org\u002Fabs\u002F1912.09666)] AdaBits：具有自适应比特宽度的神经网络量化 [[代码](https:\u002F\u002Fgithub.com\u002FdeJQK\u002FAdaBits)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FdeJQK\u002FAdaBits?style=social)](https:\u002F\u002Fgithub.com\u002FdeJQK\u002FAdaBits)\n- [[CVPR](https:\u002F\u002Farxiv.org\u002Fabs\u002F1912.08883)] 多比特网络的自适应损失感知量化 [[代码](https:\u002F\u002Fgithub.com\u002Fzqu1992\u002FALQ)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fzqu1992\u002FALQ?style=social)](https:\u002F\u002Fgithub.com\u002Fzqu1992\u002FALQ)\n- [[ECCV](https:\u002F\u002Farxiv.org\u002Fabs\u002F2007.09952)] HMQ：面向 CNN 的硬件友好型混合精度量化模块 [[代码](https:\u002F\u002Fgithub.com\u002Fsony-si\u002Fai-research)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fsony-si\u002Fai-research?style=social)](https:\u002F\u002Fgithub.com\u002Fsony-si\u002Fai-research)\n\n### 2019年\n\n- 
[[AAAI](https:\u002F\u002Fwww.aaai.org\u002Fojs\u002Findex.php\u002FAAAI\u002Farticle\u002Fview\u002F4273\u002F4151)] 具有二值权重和低比特激活的神经网络高效量化\n- [[AAAI](https:\u002F\u002Fwww.aaai.org\u002Fojs\u002Findex.php\u002FAAAI\u002Farticle\u002Fview\u002F4848\u002F4721)] 基于离散反向传播的1位CNN的投影卷积神经网络\n- [[APCCAS](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F8953134\u002F)] 利用神经进化二值神经网络求解强化学习环境 [[代码](https:\u002F\u002Fgithub.com\u002Frval735\u002FBiSUNA)] [![GitHub星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Frval735\u002FBiSUNA?style=social)](https:\u002F\u002Fgithub.com\u002Frval735\u002FBiSUNA)\n- [[BMVC](https:\u002F\u002Farxiv.org\u002Fabs\u002F1909.11366)] 带有训练后二值化的精确且紧凑的卷积神经网络\n- [[BMVC](https:\u002F\u002Farxiv.org\u002Fabs\u002F1909.13863)] XNOR-Net++：改进的二值神经网络\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2019\u002Fpapers\u002FXu_A_MainSubsidiary_Network_Framework_for_Simplifying_Binary_Neural_Networks_CVPR_2019_paper.pdf)] 用于简化二值神经网络的主辅网络框架\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2019\u002Fpapers\u002FZhu_Binary_Ensemble_Neural_Network_More_Bits_per_Network_or_More_CVPR_2019_paper.pdf)] 二值集成神经网络：每个网络更多位，还是每比特更多网络？\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2019\u002Fpapers\u002FLiu_Circulant_Binary_Convolutional_Networks_Enhancing_the_Performance_of_1-Bit_DCNNs_CVPR_2019_paper.pdf)] 循环二值卷积网络：通过循环反向传播提升1位DCNN性能\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2019\u002Fpapers\u002FLi_Fully_Quantized_Network_for_Object_Detection_CVPR_2019_paper.pdf)] 用于目标检测的全量化网络\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2019\u002Fpapers\u002FWang_HAQ_Hardware-Aware_Automated_Quantization_With_Mixed_Precision_CVPR_2019_paper.pdf)] HAQ：具有混合精度的硬件感知自动化量化 [[代码](https:\u002F\u002Fgithub.com\u002Fmit-han-lab\u002Fhaq)] [![GitHub星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fmit-han-lab\u002Fhaq?style=social)](https:\u002F\u002Fgithub.com\u002Fmit-han-lab\u002Fhaq)\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2019\u002Fpapers\u002FWang_Learning_Channel-Wise_Interactions_for_Binary_Convolutional_Neural_Networks_CVPR_2019_paper.pdf)] 学习二值卷积神经网络中的通道间交互\n- [[CVPR](https:\u002F\u002Farxiv.org\u002Fabs\u002F1808.05779)] 通过优化量化区间并结合任务损失来学习深度网络的量化方法\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2019\u002Fpapers\u002FYang_Quantization_Networks_CVPR_2019_paper.pdf)] 量化网络 [[代码](https:\u002F\u002Fgithub.com\u002Faliyun\u002Falibabacloud-quantization-networks)] [![GitHub星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Faliyun\u002Falibabacloud-quantization-networks?style=social)](https:\u002F\u002Fgithub.com\u002Faliyun\u002Falibabacloud-quantization-networks)\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2019\u002Fpapers\u002FDing_Regularizing_Activation_Distribution_for_Training_Binarized_Deep_Networks_CVPR_2019_paper.pdf)] 正则化激活分布以训练二值化深度网络\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2019\u002Fpapers\u002FCao_SeerNet_Predicting_Convolutional_Neural_Network_Feature-Map_Sparsity_Through_Low-Bit_Quantization_CVPR_2019_paper.pdf)] SeerNet：通过低比特量化预测卷积神经网络特征图稀疏性\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_CVPR_2019\u002Fpapers\u002FZhuang_Structured_Binary_Neural_Networks_for_Accurate_Image_Classification_and_Semantic_CVPR_2019_paper.pdf)] 用于准确图像分类和语义分割的结构化二值神经网络\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1906.08637.pdf)] 
回归简单：如何从头开始训练精确的BNN？ [[代码](https:\u002F\u002Fgithub.com\u002Fhpi-xnor\u002FBMXNet-v2)] [![GitHub星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fhpi-xnor\u002FBMXNet-v2?style=social)](https:\u002F\u002Fgithub.com\u002Fhpi-xnor\u002FBMXNet-v2)\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F1911.10862)] 二值化神经架构搜索\n- [[arXiv](http:\u002F\u002Farxiv.org\u002Fabs\u002F1904.05868)] 改进的人体姿态估计和图像识别中二值网络的训练方法\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1904.07852.pdf)] 用于训练二值神经网络的矩阵与张量分解\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F1908.07748)] RBCN：整流二值卷积网络，用于提升1位DCNN性能\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F1912.10103)] TentacleNet：用于精确二值卷积神经网络的伪集成模板\n- [[FPGA](https:\u002F\u002Farxiv.org\u002Fabs\u002F1810.02068)] 向FPGA上快速且节能的二值化神经网络推理迈进\n- [[GLSVLSI](https:\u002F\u002Fdl.acm.org\u002Fdoi\u002Fpdf\u002F10.1145\u002F3299874.3318034)] 用于FPGA中目标跟踪的二值化深度可分离神经网络\n- [[ICCV](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1908.06314.pdf)] 贝叶斯优化的1位CNN\n- [[ICCV](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ICCV_2019\u002Fhtml\u002FNagel_Data-Free_Quantization_Through_Weight_Equalization_and_Bias_Correction_ICCV_2019_paper.html)] 无数据量化：通过权重均衡和偏置校正实现 [[代码](https:\u002F\u002Fgithub.com\u002Fjakc4103\u002FDFQ)] [![GitHub星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fjakc4103\u002FDFQ?style=social)](https:\u002F\u002Fgithub.com\u002Fjakc4103\u002FDFQ)\n- [[ICCV](https:\u002F\u002Farxiv.org\u002Fabs\u002F1908.05033)] 可微软量化：连接全精度与低比特神经网络\n- [[ICCV](https:\u002F\u002Farxiv.org\u002Fabs\u002F1901.01928)] DSConv：高效的卷积算子\n- [[ICCV](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ICCV_2019\u002Fhtml\u002FDong_HAWQ_Hessian_AWare_Quantization_of_Neural_Networks_With_Mixed-Precision_ICCV_2019_paper.html)] HAWQ：混合精度下基于海森矩阵的神经网络量化\n- [[ICCV](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ICCVW_2019\u002Fpapers\u002FNeurArch\u002FShen_Searching_for_Accurate_Binary_Neural_Architectures_ICCVW_2019_paper.pdf)] 寻找精确的二值神经架构\n- [[ICIP](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F8802610)] 从零开始训练精确的二值神经网络 [[代码](https:\u002F\u002Fgithub.com\u002Fhpi-xnor\u002FBMXNet-v2)] [![GitHub星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fhpi-xnor\u002FBMXNet-v2?style=social)](https:\u002F\u002Fgithub.com\u002Fhpi-xnor\u002FBMXNet-v2)\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fpdf?id=rJfUCoR5KX)] 二值神经网络优化的实证研究\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fpdf?id=HyzMyhCcK7)] ProxQuant：基于近端算子的量化神经网络 [[代码](https:\u002F\u002Fgithub.com\u002Fallenbai01\u002FProxQuant)] [![GitHub星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fallenbai01\u002FProxQuant?style=social)](https:\u002F\u002Fgithub.com\u002Fallenbai01\u002FProxQuant)\n- [[ICML](https:\u002F\u002Farxiv.org\u002Fabs\u002F1906.00532v2)] Transformer神经机器翻译模型的高效8位量化\n- [[ICUS](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F8996039)] 平衡的循环二值卷积网络\n- [[IEEE J. Emerg. Sel. Topics Circuits Syst.](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F8668446\u002F)] Hyperdrive：多芯片、可流水线扩展的二值权重CNN推理引擎\n- [[IEEE J. 
Solid-State Circuits](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F8581485)] 一种面向二值及三值权重神经网络、具有灵活数据位宽的节能可重构处理器\n- [[IEEE JETC](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1807.07928.pdf)] Eyeriss v2：一款适用于移动设备上新兴深度神经网络的灵活加速器\n- [[IEEE TCS.I](https:\u002F\u002Fieeexplore.ieee.org\u002Fabstract\u002Fdocument\u002F8643565)] 递归式二值神经网络训练模型，以高效利用片上内存\n- [[IEEE TCS.I](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1807.00343.pdf)] Xcel-RAM：在高吞吐SRAM计算阵列中加速二值神经网络\n- [[IJCAI](https:\u002F\u002Fwww.ijcai.org\u002FProceedings\u002F2019\u002F0667.pdf)] 带有蒸馏图卷积网络的二值协同过滤\n- [[IJCAI](https:\u002F\u002Fsee.xidian.edu.cn\u002Ffaculty\u002Fchdeng\u002FWelcome%20to%20Cheng%20Deng's%20Homepage_files\u002FPapers\u002FConference\u002FIJCAI2019_Feng.pdf)] 用于资源高效哈希且最小化量化损失的二值神经网络\n- [[ISOCC](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F9027649)] 双路径二值神经网络\n- [[MDPI Electronics](https:\u002F\u002Fdoi.org\u002F10.3390\u002Felectronics8060661)] 二值神经网络综述\n- [[NeurIPS](https:\u002F\u002Fwww.emc2-ai.org\u002Fassets\u002Fdocs\u002Fneurips-19\u002Femc2-neurips19-paper-36.pdf)] 全量化Transformer以提升翻译效果\n- [[NeurIPS](https:\u002F\u002Fpapers.nips.cc\u002Fpaper\u002F2019\u002Ffile\u002F9ca8c9b0996bbf05ae7753d34667a6fd-Paper.pdf)] 隐含权重不存在：重新思考二值神经网络优化 [[代码](https:\u002F\u002Fgithub.com\u002Fplumerai\u002Frethinking-bnn-optimization)] [![GitHub星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fplumerai\u002Frethinking-bnn-optimization?style=social)](https:\u002F\u002Fgithub.com\u002Fplumerai\u002Frethinking-bnn-optimization)\n- [[NeurIPS](https:\u002F\u002Fcsyhhu.github.io\u002Fdata\u002FMetaQuant.pdf)] MetaQuant：通过学习穿透不可微量化来掌握量化技巧 [[代码](https:\u002F\u002Fgithub.com\u002Fcsyhhu\u002FMetaQuant)] [![GitHub星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fcsyhhu\u002FMetaQuant?style=social)](https:\u002F\u002Fgithub.com\u002Fcsyhhu\u002FMetaQuant)\n- [[NeurIPS](https:\u002F\u002Farxiv.org\u002Fabs\u002F1902.03538)] 带有对抗鲁棒性的模型压缩：统一的优化框架\n- [[NeurIPS](https:\u002F\u002Fopenreview.net\u002Fpdf?id=rJgB34rx8r)] 归一化有助于量化LSTM的训练\n- [[NeurIPS](https:\u002F\u002Fwww.emc2-ai.org\u002Fassets\u002Fdocs\u002Fneurips-19\u002Femc2-neurips19-paper-31.pdf)] Q8BERT：量化后的8位BERT\n- [[NeurIPS](http:\u002F\u002Farxiv.org\u002Fabs\u002F1812.11800)] 正则化二值网络训练\n- [[RoEduNet](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F8909493\u002F)] PXNOR：扰动式二值神经网络 [[代码](https:\u002F\u002Fgithub.com\u002FApfelin\u002FPXNOR)] [![GitHub星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FApfelin\u002FPXNOR?style=social)](https:\u002F\u002Fgithub.com\u002FApfelin\u002FPXNOR)\n- [[SiPS](https:\u002F\u002Farxiv.org\u002Fabs\u002F1909.01688)] 知识蒸馏用于优化量化深度神经网络\n- [[TMM](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1712.02956.pdf)] 使用二值深度神经网络进行紧凑哈希码学习\n- [[TMM](https:\u002F\u002Farxiv.org\u002Fabs\u002F1708.05127)] 深度二值重建用于跨模态哈希\n- [[VLSI-SoC](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F8920343\u002F)] 一种利用电阻式存储器实现二值神经网络节能执行的产品引擎\n- [[arXiv](http:\u002F\u002Farxiv.org\u002Fabs\u002F1908.05858)] daBNN：一款适用于ARM设备的超快速二值神经网络推理框架 [[代码](https:\u002F\u002Fgithub.com\u002FJDAI-CV\u002Fdabnn)] [![GitHub星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FJDAI-CV\u002Fdabnn?style=social)](https:\u002F\u002Fgithub.com\u002FJDAI-CV\u002Fdabnn)\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F1812.00090)] 通过可微神经架构搜索实现混合精度的卷积网络量化\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F1911.12491)] QKD：量化感知的知识蒸馏\n- [[arXiv](http:\u002F\u002Farxiv.org\u002Fabs\u002F1902.00730)] 自二值化网络\n- 
[[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F1912.12607)] 朝着卷积神经网络统一INT8训练的目标迈进\n- [[论文](https:\u002F\u002Fopenreview.net\u002Fpdf?id=SJfHg2A5tQ)] BNN+：改进的二值网络训练\n\n### 2018年\n\n- [[AAAI](https:\u002F\u002Faaai.org\u002Focs\u002Findex.php\u002FAAAI\u002FAAAI18\u002Fpaper\u002FviewPDFInterstitial\u002F16767\u002F16728)] 极低比特神经网络：利用ADMM榨取最后一滴性能 [[代码](https:\u002F\u002Fweb.stanford.edu\u002F~boyd\u002Fadmm.html)]\n- [[AAAI](https:\u002F\u002Farxiv.org\u002Fabs\u002F1802.02733)] 从哈希到CNN：通过哈希训练二值权重网络\n- [[CAAI](https:\u002F\u002Fieeexplore.ieee.org\u002Fstamp\u002Fstamp.jsp?arnumber=8603080)] 基于二值深度卷积神经网络的快速目标检测\n- [[CVPR](https:\u002F\u002Farxiv.org\u002Fabs\u002F1908.04680)] 使用低比特权重和激活的有效卷积神经网络训练\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_cvpr_2018\u002Fhtml\u002FZhou_Explicit_Loss-Error-Aware_Quantization_CVPR_2018_paper.html)] 针对低比特深度神经网络的显式损失误差感知量化\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_cvpr_2018\u002Fpapers\u002FWang_Modulated_Convolutional_Networks_CVPR_2018_paper.pdf)] 调制卷积网络\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_cvpr_2018\u002Fpapers\u002FJacob_Quantization_and_Training_CVPR_2018_paper.pdf)] 用于高效仅整数运算推理的神经网络量化与训练\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_cvpr_2018\u002Fpapers\u002FFaraone_SYQ_Learning_Symmetric_CVPR_2018_paper.pdf)] SYQ：为高效深度神经网络学习对称量化 [[代码](https:\u002F\u002Fwww.github.com\u002Fjulianfaraone\u002FSYQ)]\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_cvpr_2018\u002Fpapers\u002FZhuang_Towards_Effective_Low-Bitwidth_CVPR_2018_paper.pdf)] 向高效的低比特卷积神经网络迈进\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_cvpr_2018\u002Fpapers\u002FWang_Two-Step_Quantization_for_CVPR_2018_paper.pdf)] 低比特神经网络的两步量化\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1801.06313.pdf)] BinaryRelax：一种用于训练量化权重深度神经网络的松弛方法\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F1802.02178)] LightNN：弥合传统深度神经网络与二值化网络之间的差距\n- [[ECCV](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ECCV_2018\u002Fpapers\u002Fzechun_liu_Bi-Real_Net_Enhancing_ECCV_2018_paper.pdf)] Bi-Real Net：通过提升表示能力和先进的训练算法增强1比特CNN性能 [[代码](https:\u002F\u002Fgithub.com\u002Fliuzechun\u002FBi-Real-net)] [![GitHub星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fliuzechun\u002FBi-Real-net?style=social)](https:\u002F\u002Fgithub.com\u002Fliuzechun\u002FBi-Real-net)\n- [[ECCV](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ECCV_2018\u002Fpapers\u002FDongqing_Zhang_Optimized_Quantization_for_ECCV_2018_paper.pdf)] LQ-Nets：用于高精度、紧凑型深度神经网络的可学习量化 [[代码](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FLQ-Nets)] [![GitHub星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fmicrosoft\u002FLQ-Nets?style=social)](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FLQ-Nets)\n- [[ECCV](https:\u002F\u002Fyan-junjie.github.io\u002Fpublication\u002Fdblp-confeccv-wei-pqoy-18\u002Fdblp-confeccv-wei-pqoy-18.pdf)] 量化模拟：迈向用于目标检测的超小型CNN\n- [[ECCV](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ECCV_2018\u002Fpapers\u002FDiwen_Wan_TBN_Convolutional_Neural_ECCV_2018_paper.pdf)] TBN：具有三值输入和二值权重的卷积神经网络 [[代码](https:\u002F\u002Fgithub.com\u002Fdnvtmf\u002FTBN)] [![GitHub星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fdnvtmf\u002FTBN?style=social)](https:\u002F\u002Fgithub.com\u002Fdnvtmf\u002FTBN)\n- [[ECCV](https:\u002F\u002Fwww.ecva.net\u002Fpapers\u002Feccv_2018\u002Fpapers_ECCV\u002Fpapers\u002FQinghao_Hu_Training_Binary_Weight_ECCV_2018_paper.pdf)] 通过半二值分解训练二值权重网络\n- 
[[FCCM](http:\u002F\u002Faceslab.org\u002Fsites\u002Fdefault\u002Ffiles\u002FFCCM_2018_resbinnet.pdf)] ReBNet：残差二值化神经网络 [[代码](https:\u002F\u002Fgithub.com\u002Fmohaghasemzadeh\u002FReBNet)] [![GitHub星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fmohaghasemzadeh\u002FReBNet?style=social)](https:\u002F\u002Fgithub.com\u002Fmohaghasemzadeh\u002FReBNet)\n- [[FPL](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F8532584\u002F)] FBNA：全二值化神经网络加速器\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fpdf?id=ryM_IoAqYX)] 量化模型分析\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fpdf?id=B1ae1lZRb)] Apprentice：利用知识蒸馏技术提升低精度网络精度\n- [[ICLR](https:\u002F\u002Farxiv.org\u002Fabs\u002F1802.08635)] 深度网络的损失感知权重量化 [[代码](https:\u002F\u002Fgithub.com\u002Fhoulu369\u002FLoss-aware-weight-quantization)] [![GitHub星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fhoulu369\u002FLoss-aware-weight-quantization?style=social)](https:\u002F\u002Fgithub.com\u002Fhoulu369\u002FLoss-aware-weight-quantization)\n- [[ICLR](https:\u002F\u002Fresearch-explorer.app.ist.ac.at\u002Fdownload\u002F7812\u002F7894\u002F2018_ICLR_Polino.pdf)] 通过蒸馏和量化进行模型压缩 [[代码](https:\u002F\u002Fgithub.com\u002Fantspy\u002Fquantized_distillation)] [![GitHub星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fantspy\u002Fquantized_distillation?style=social)](https:\u002F\u002Fgithub.com\u002Fantspy\u002Fquantized_distillation)\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fpdf?id=By5ugjyCb)] PACT：量化神经网络的参数化裁剪激活\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fpdf?id=B1ZvaaeAZ)] WRPN：宽广的低精度网络\n- [[IEEE J. Solid-State Circuits](http:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F8226999\u002F)] BRein Memory：单芯片二值\u002F三值可重构片上深度神经网络加速器，在0.6W功耗下实现1.4 TOPS\n- [[IJCAI](https:\u002F\u002Fwww.ijcai.org\u002FProceedings\u002F2018\u002F0380.pdf)] 卷积神经网络的确定性二值滤波器\n- [[IJCAI](https:\u002F\u002Fwww.ijcai.org\u002FProceedings\u002F2018\u002F0669.pdf)] 利用学习到的二值神经网络转移模型在因子化状态和动作空间中进行规划\n- [[IJCNN](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F8489259)] 简单动态二值神经网络的分析与实现\n- [[IPDPS](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F8425178)] BitFlow：在CPU上利用向量并行性加速二值神经网络\n- [[MM](https:\u002F\u002Fdl.acm.org\u002Fdoi\u002F10.1145\u002F3240508.3240673)] BitStream：用于在CPU上实时低功耗推理二值神经网络的高效计算架构\n- [[NCA](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1712.08934.pdf)] 基于FPGA的卷积神经网络加速器综述\n- [[NeurIPS](https:\u002F\u002Fpapers.nips.cc\u002Fpaper\u002F2018\u002Ffile\u002Fe82c4b19b8151ddc25d4d93baf7b908f-Paper.pdf)] 神经网络8位训练的可扩展方法 [[代码](https:\u002F\u002Fgithub.com\u002Feladhoffer\u002Fquantized.pytorch)] [![GitHub星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Feladhoffer\u002Fquantized.pytorch?style=social)](https:\u002F\u002Fgithub.com\u002Feladhoffer\u002Fquantized.pytorch)\n- [[NeurIPS](https:\u002F\u002Fpapers.nips.cc\u002Fpaper\u002F2018\u002Ffile\u002F335d3d1cd7ef05ec77714a215134914c-Paper.pdf)] 使用8位浮点数训练深度神经网络\n- [[Res Math Sci](https:\u002F\u002Farxiv.org\u002Fabs\u002F1808.05240)] 混合粗粒度梯度下降法，用于深度神经网络的完全量化\n- [[TCAD](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F8412533\u002F)] XNOR神经引擎：用于21.6-fJ\u002Fop二值神经网络推理的硬件加速器IP\n- [[TRETS](http:\u002F\u002Farxiv.org\u002Fabs\u002F1809.04570)] FINN-R：一个端到端的深度学习框架，用于快速探索量化神经网络\n- [[TVLSI](http:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F8103902\u002F)] 一种面向二值权重卷积神经网络的节能架构\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F1811.09426)] 联合神经架构搜索与量化 [[代码](https:\u002F\u002Fgithub.com\u002Fyukang2017\u002FNAS-quantization)] 
[![GitHub星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fyukang2017\u002FNAS-quantization?style=social)](https:\u002F\u002Fgithub.com\u002Fyukang2017\u002FNAS-quantization)\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F1812.01965)] 从零开始训练具有竞争力的二值神经网络 [[代码](https:\u002F\u002Fgithub.com\u002Fhpi-xnor\u002FBMXNet-v2)] [![GitHub星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fhpi-xnor\u002FBMXNet-v2?style=social)](https:\u002F\u002Fgithub.com\u002Fhpi-xnor\u002FBMXNet-v2)\n\n### 2017年\n\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_cvpr_2017\u002Fpapers\u002FCai_Deep_Learning_With_CVPR_2017_paper.pdf)] 通过半波高斯量化实现低精度深度学习 [[代码](https:\u002F\u002Fgithub.com\u002Fzhaoweicai\u002Fhwgq)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fzhaoweicai\u002Fhwgq?style=social)](https:\u002F\u002Fgithub.com\u002Fzhaoweicai\u002Fhwgq)\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_cvpr_2017\u002Fpapers\u002FJuefei-Xu_Local_Binary_Convolutional_CVPR_2017_paper.pdf)] 局部二值卷积神经网络 [[代码](https:\u002F\u002Fgithub.com\u002Fjuefeix\u002Flbcnn.torch)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fjuefeix\u002Flbcnn.torch?style=social)](https:\u002F\u002Fgithub.com\u002Fjuefeix\u002Flbcnn.torch)\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1705.09864.pdf)] BMXNet：基于 MXNet 的开源二值神经网络实现 [[代码](https:\u002F\u002Fgithub.com\u002Fhpi-xnor)]\n- [[FPGA](https:\u002F\u002Farxiv.org\u002Fabs\u002F1612.07119)] FINN：一种用于快速、可扩展的二值神经网络推理的框架\n- [[ICASSP](https:\u002F\u002Farxiv.org\u002Fabs\u002F1702.08171)] 具有自适应步长再训练的深度神经网络定点优化\n- [[ICCV](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ICCV_2017\u002Fpapers\u002FBulat_Binarized_Convolutional_Landmark_ICCV_2017_paper.pdf)] 面向资源受限场景的人体姿态估计与人脸对齐的二值卷积地标定位器 [[代码](https:\u002F\u002Fwww.adrianbulat.com\u002Fbinary-cnn-landmarks)]\n- [[ICCV](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_ICCV_2017\u002Fpapers\u002FLi_Performance_Guaranteed_Network_ICCV_2017_paper.pdf)] 基于高阶残差量化的性能保障型网络加速\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fpdf?id=HyQJ-mclg)] 渐进式网络量化：迈向使用低精度权重的无损 CNN [[代码](https:\u002F\u002Fgithub.com\u002FMxbonn\u002FINQ-pytorch)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FMxbonn\u002FINQ-pytorch?style=social)](https:\u002F\u002Fgithub.com\u002FMxbonn\u002FINQ-pytorch)\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fpdf?id=S1oWlN9ll)] 损失感知的深度网络二值化 [[代码](https:\u002F\u002Fgithub.com\u002Fhoulu369\u002FLoss-aware-Binarization)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fhoulu369\u002FLoss-aware-Binarization?style=social)](https:\u002F\u002Fgithub.com\u002Fhoulu369\u002FLoss-aware-Binarization)\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fpdf?id=HJGwcKclx)] 用于神经网络压缩的软权值共享\n- [[ICLR](https:\u002F\u002Fopenreview.net\u002Fpdf?id=S1_pAu9xl)] 训练得到的三值量化 [[代码](https:\u002F\u002Fgithub.com\u002FTropComplique\u002Ftrained-ternary-quantization)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FTropComplique\u002Ftrained-ternary-quantization?style=social)](https:\u002F\u002Fgithub.com\u002FTropComplique\u002Ftrained-ternary-quantization)\n- [[IPDPSW](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F7965031)] 基于片上存储的二值卷积深度神经网络，在 FPGA 上应用无批归一化技术\n- [[InterSpeech](https:\u002F\u002Fwww.isca-speech.org\u002Farchive\u002FInterspeech_2017\u002Fpdfs\u002F1343.PDF)] 用于语音识别的二值深度神经网络\n- [[JETC](https:\u002F\u002Farxiv.org\u002Fabs\u002F1702.06392)] 
一种超越 GPU 性能的 FPGA 加速器架构，适用于二值卷积神经网络\n- [[MWSCAS](http:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F8052915\u002F)] 在 FPGA 上实现的深度学习二值神经网络\n- [[NeurIPS](https:\u002F\u002Farxiv.org\u002Fabs\u002F1711.11294)] 向精确的二值卷积神经网络迈进 [[代码](https:\u002F\u002Fgithub.com\u002Flayog\u002FAccurate-Binary-Convolution-Network)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Flayog\u002FAccurate-Binary-Convolution-Network?style=social)](https:\u002F\u002Fgithub.com\u002Flayog\u002FAccurate-Binary-Convolution-Network)\n- [[Neurocomputing](http:\u002F\u002Fwww.doc.ic.ac.uk\u002F~wl\u002Fpapers\u002F17\u002Fneuro17sl0.pdf)] FP-BNN：在 FPGA 上实现的二值神经网络\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F1706.02393)] ShiftCNN：面向卷积神经网络推理的通用低精度架构 [[代码](https:\u002F\u002Fgithub.com\u002Fgudovskiy\u002FShiftCNN)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fgudovskiy\u002FShiftCNN?style=social)](https:\u002F\u002Fgithub.com\u002Fgudovskiy\u002FShiftCNN)\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1705.01462.pdf)] 具有细粒度量化的三值神经网络\n\n### 2016年\n\n- [[CVPR](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent_cvpr_2016\u002Fhtml\u002FWu_Quantized_Convolutional_Neural_CVPR_2016_paper.html)] 用于移动设备的量化卷积神经网络。[代码](https:\u002F\u002Fgithub.com\u002Fjiaxiang-wu\u002Fquantized-cnn)\n- [[arXiv](http:\u002F\u002Farxiv.org\u002Fabs\u002F1606.06160)] DoReFa-Net：使用低比特梯度训练低比特宽度卷积神经网络 [[代码](https:\u002F\u002Fgithub.com\u002Ftensorpack\u002Ftensorpack\u002Ftree\u002Fmaster\u002Fexamples\u002FDoReFa-Net)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Ftensorpack\u002Ftensorpack?style=social)](https:\u002F\u002Fgithub.com\u002Ftensorpack\u002Ftensorpack)\n- [[ECCV](https:\u002F\u002Farxiv.org\u002Fabs\u002F1603.05279)] XNOR-Net：使用二值卷积神经网络进行 ImageNet 分类 [[代码](https:\u002F\u002Fgithub.com\u002Fallenai\u002FXNOR-Net)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fallenai\u002FXNOR-Net?style=social)](https:\u002F\u002Fgithub.com\u002Fallenai\u002FXNOR-Net)\n- [[ICASSP](https:\u002F\u002Farxiv.org\u002Fabs\u002F1512.01322)] 循环神经网络的定点性能分析\n- [[NeurIPS](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1602.02830)] 二值神经网络：在权重和激活值被约束为 +1 或 -1 的情况下训练深度神经网络 [[代码](https:\u002F\u002Fgithub.com\u002Fitayhubara\u002FBinaryNet)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fitayhubara\u002FBinaryNet?style=social)](https:\u002F\u002Fgithub.com\u002Fitayhubara\u002FBinaryNet)\n- [[NeurIPS](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1605.04711.pdf)] 三值权重网络 [[代码](https:\u002F\u002Fgithub.com\u002Ffengfu-chris\u002Fcaffe-twns)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Ffengfu-chris\u002Fcaffe-twns?style=social)](https:\u002F\u002Fgithub.com\u002Ffengfu-chris\u002Fcaffe-twns)\n\n### 2015年\n\n- [[ICML](https:\u002F\u002Farxiv.org\u002Fabs\u002F1601.06071)] 位运算神经网络\n- [[NeurIPS](https:\u002F\u002Farxiv.org\u002Fabs\u002F1511.00363)] BinaryConnect：在前向传播过程中使用二值权重训练深度神经网络 [[代码](https:\u002F\u002Fgithub.com\u002FMatthieuCourbariaux\u002FBinaryConnect)] [![GitHub 星标](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FMatthieuCourbariaux\u002FBinaryConnect?style=social)](https:\u002F\u002Fgithub.com\u002FMatthieuCourbariaux\u002FBinaryConnect)\n- [[arXiv](https:\u002F\u002Farxiv.org\u002Fabs\u002F1511.06488)] 深度神经网络在量化下的鲁棒性\n\n## 相关仓库\n\n- [高效 LLM 与扩散模型精选](https:\u002F\u002Fgithub.com\u002Fefficient-ml\u002Fawesome-efficient-llm-diffusion)\n- 
[量化论文精选](https:\u002F\u002Fgithub.com\u002FZhen-Dong\u002FAwesome-Quantization-Papers)\n\n## 星标历史\n\n\u003Ca href=\"https:\u002F\u002Fwww.star-history.com\u002F?repos=Efficient-ML%2FAwesome-Model-Quantization&type=date&legend=top-left\">\n \u003Cpicture>\n   \u003Csource media=\"(prefers-color-scheme: dark)\" srcset=\"https:\u002F\u002Fapi.star-history.com\u002Fchart?repos=Efficient-ML\u002FAwesome-Model-Quantization&type=date&theme=dark&legend=top-left\" \u002F>\n   \u003Csource media=\"(prefers-color-scheme: light)\" srcset=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FEfficient-ML_Awesome-Model-Quantization_readme_055a8a66d58e.png\" \u002F>\n   \u003Cimg alt=\"星标历史图表\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FEfficient-ML_Awesome-Model-Quantization_readme_055a8a66d58e.png\" \u002F>\n \u003C\u002Fpicture>\n\u003C\u002Fa>","# Awesome-Model-Quantization 快速上手指南\n\n`Awesome-Model-Quantization` 并非一个可直接安装的单一软件库，而是一个 **curated list（精选列表）**，汇集了模型量化领域的论文、基准测试（Benchmarks）、综述文章及相关开源代码仓库。本指南将帮助你利用该资源快速找到适合的工具并开展研究或工程实践。\n\n## 环境准备\n\n由于该仓库包含针对不同模型（如 LLaMA, Qwen, Mamba 等）和不同任务（量化、基准测试、推理）的多个独立项目，环境需求取决于你具体选择的子项目。通用建议如下：\n\n*   **操作系统**: Linux (推荐 Ubuntu 20.04\u002F22.04) 或 macOS。Windows 用户建议使用 WSL2。\n*   **Python 版本**: 大多数现代量化工具要求 Python 3.9 或更高版本。\n*   **核心依赖**:\n    *   PyTorch (通常需 2.0+)\n    *   CUDA Toolkit (根据显卡驱动版本安装，推荐 11.8 或 12.x)\n    *   Git\n*   **硬件建议**: 进行大模型量化实验建议配备 NVIDIA GPU (显存 16GB 以上为佳)，部分基准测试可在 CPU 运行但速度较慢。\n\n## 安装步骤\n\n由于这是一个资源索引库，你不需要“安装”它本身，而是需要克隆仓库以获取资源列表，然后根据需求安装具体的子项目。\n\n### 1. 克隆资源仓库\n获取最新的论文列表和代码链接：\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002FEfficient-ML\u002FAwesome-Model-Quantization.git\ncd Awesome-Model-Quantization\n```\n\n### 2. 选择并安装具体工具\n浏览仓库中的 `README.md` 或 `Papers` 章节，找到你感兴趣的项目（例如 `LLaMA3-Quantization` 或 `LightCompress`），然后进入其对应的 GitHub 仓库进行安装。\n\n**示例：安装 LLMC (LightCompress) 工具箱**\n假设你选择了列表中推荐的 `LLMC` 工具进行大模型量化基准测试：\n\n```bash\n# 克隆具体工具仓库\ngit clone https:\u002F\u002Fgithub.com\u002FModelTC\u002FLightCompress.git\ncd LightCompress\n\n# (可选) 使用国内镜像源加速 pip 安装\npip install -r requirements.txt -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple\n\n# 安装工具包\npip install -e .\n```\n\n**示例：复现 BiBench 基准测试**\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002Fhtqin\u002FBiBench.git\ncd BiBench\npip install -r requirements.txt -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple\n```\n\n## 基本使用\n\n使用流程通常为：**查阅综述\u002F基准 -> 选定算法\u002F工具 -> 运行示例代码**。\n\n### 1. 查找研究方向\n打开克隆后的 `README.md` 文件，查看以下板块：\n*   **Survey Papers**: 阅读《A Survey of Low-bit Large Language Models》等综述，了解基础理论和最新进展。\n*   **Benchmarks**: 参考 `BiBench` 或 `RobustMQ` 了解当前量化模型的性能上限和鲁棒性标准。\n*   **Papers (by Year)**: 根据年份（如 2024, 2025, 2026）查找最新的 SOTA 算法（如 `PT²-LLM`, `FOEM`）。\n\n### 2. 运行量化示例 (以通用流程为例)\n大多数列出的代码仓库都遵循类似的调用逻辑。以下是一个典型的伪代码示例，展示如何使用找到的量化工具对模型进行处理（具体参数需参考对应子项目的文档）：\n\n```python\nimport torch\nfrom specific_tool import Quantizer, load_model # 替换为具体工具的导入路径\n\n# 1. 加载预训练模型\nmodel = load_model(\"path\u002Fto\u002Fllama3-8b\")\n\n# 2. 初始化量化器 (例如设置为 4-bit)\nquantizer = Quantizer(bits=4, method=\"post-training\")\n\n# 3. 执行量化\nquantized_model = quantizer.apply(model)\n\n# 4. 验证或导出\nprint(quantized_model)\n# quantized_model.save(\"output_path\")\n```\n\n### 3. 复现论文结果\n对于列表中标记有 `[Code]` 的论文：\n1.  点击链接进入对应的 GitHub 仓库。\n2.  查找 `scripts\u002F` 目录或 `README` 中的 \"Quick Start\" 部分。\n3.  
直接运行提供的 Shell 脚本，例如：\n    ```bash\n    bash scripts\u002Frun_quantization.sh --model llama3 --bits 4\n    ```\n\n通过此仓库，你可以系统地追踪从二值化网络（Binary Neural Networks）到最新的大语言模型低比特量化（Low-bit LLM Quantization）的全套技术栈。","某边缘计算团队正试图将最新的 Qwen3 大语言模型部署到资源受限的工业手持终端上，急需通过模型量化技术在保持精度的同时大幅降低显存占用。\n\n### 没有 Awesome-Model-Quantization 时\n- **文献检索如大海捞针**：工程师需手动在 arXiv、GitHub 和各大会议网站分散搜索\"Qwen3 量化”或\"LLM 二值化”相关论文，耗时数天仍可能遗漏关键成果（如 ICML 2023 的 BiBench）。\n- **代码复现门槛极高**：找到论文后，往往找不到官方开源代码，或仓库已归档失效，导致算法验证无法启动，项目进度严重受阻。\n- **缺乏权威基准对比**：团队自行设计的量化方案缺乏统一的评估标准，无法判断其效果是否优于业界现有的 LLaMA3 或 Qwen3 实证研究结果。\n- **技术选型盲目试错**：面对众多量化策略（如权重量化、激活量化），因缺乏系统的综述论文（Survey Papers）指导，只能凭经验盲目尝试，浪费大量算力资源。\n\n### 使用 Awesome-Model-Quantization 后\n- **一站式获取前沿资源**：直接查阅按年份（2022-2026）分类的论文列表，瞬间定位到《An Empirical Study of Qwen3 Quantization》等最新实证研究与对应代码链接。\n- **快速复用成熟方案**：通过收录的\"Related Repositories\"直接获取经过验证的量化工具箱（如 LightCompress），将环境搭建与代码调试时间从周缩短至小时级。\n- **依托标准基准评估**：利用项目中整理的 BiBench 等权威基准测试集，快速量化评估模型性能，确保优化后的模型在精度损失可控范围内。\n- **系统化技术决策**：参考收录的综述论文理清技术脉络，迅速选定最适合工业终端的二值化或低比特量化路径，避免无效探索。\n\nAwesome-Model-Quantization 将原本分散、高门槛的量化 research 转化为结构化的知识资产，让团队能专注于业务落地而非重复造轮子。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FEfficient-ML_Awesome-Model-Quantization_901b7e5e.png","Efficient-ML","Efficient Intelligence and Systems","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002FEfficient-ML_62b858f0.png","Research about Efficient Intelligence and Systems",null,"qin_haotong","https:\u002F\u002Fhtqin.github.io\u002F","https:\u002F\u002Fgithub.com\u002FEfficient-ML",2342,231,"2026-04-09T06:52:28","","未说明",{"notes":86,"python":84,"dependencies":87},"该仓库（Awesome-Model-Quantization）是一个模型量化领域的论文、文档和代码合集列表，本身不是一个可直接运行的单一软件工具。因此，README 中未提供具体的操作系统、硬件配置或依赖库要求。具体的运行环境需求需参考列表中各个子项目（如 BiBench, LLaMA3-Quantization, LightCompress 等）各自的文档。",[],[14],[90,91,92,93,94,95,96,97,98,99],"deep-learning","quantization","awesome","model-compression","binarized-neural-networks","binary-network","efficient-deep-learning","lightweight-neural-network","model-acceleration","model-quantization","2026-03-27T02:49:30.150509","2026-04-10T04:37:26.016494",[103,108,113,118,123,128,133,138],{"id":104,"question_zh":105,"answer_zh":106,"source_url":107},27118,"自 2024 年以来，二值化研究似乎都转向了大语言模型（LLM），研究普通骨干网络（如 CNN\u002FTransformer）还有意义吗？","仍然非常有意义。二值化对低功耗设备（尤其是移动端和边缘硬件的推理）非常友好。虽然 LLM 的二值化研究正在兴起，但其流程（基于大规模预训练）、范围和推理方式与普通骨干网络有很大不同，且相关研究尚处于起步阶段。因此，两者都有各自的研究价值，该仓库也会持续收集这两个领域的论文。","https:\u002F\u002Fgithub.com\u002FEfficient-ML\u002FAwesome-Model-Quantization\u002Fissues\u002F58",{"id":109,"question_zh":110,"answer_zh":111,"source_url":112},27119,"为什么很多量化或蒸馏的论文（包括维护者自己的部分论文）没有开源代码？如何获取相关代码？","部分论文的代码可能未直接附带，但可以通过相关扩展工作获取。例如，IJCV 论文《Distribution-sensitive Information Retention for Accurate Binary Neural Network》中提到的 Dir-Net，其扩展版本是 CVPR 工作 IR-Net。IR-Net 的代码已在仓库 htqin\u002FIR-Net 中开源，使用该代码可以非常容易地实现 Dir-Net。","https:\u002F\u002Fgithub.com\u002FEfficient-ML\u002FAwesome-Model-Quantization\u002Fissues\u002F53",{"id":114,"question_zh":115,"answer_zh":116,"source_url":117},27120,"该量化论文汇总仓库还会继续更新吗？如何提交最新的论文？","是的，仓库会持续更新。维护者最近已经添加了一些关于量化的最新论文。同时也非常欢迎作者们通过 Pull Request (PR) 的方式提交他们自己的出版物或预印本论文，以帮助社区保持同步。","https:\u002F\u002Fgithub.com\u002FEfficient-ML\u002FAwesome-Model-Quantization\u002Fissues\u002F56",{"id":119,"question_zh":120,"answer_zh":121,"source_url":122},27121,"有哪些关于二值神经网络（BNN）在 ARM 设备上快速推理的框架或论文推荐？","推荐参考论文《daBNN: A Super Fast Inference Framework for Binary Neural Networks on ARM devices》，该论文发表于 ACMMM 
2019。论文链接：https:\u002F\u002Farxiv.org\u002Fabs\u002F1908.05858。","https:\u002F\u002Fgithub.com\u002FEfficient-ML\u002FAwesome-Model-Quantization\u002Fissues\u002F9",{"id":124,"question_zh":125,"answer_zh":126,"source_url":127},27122,"有没有关于统一 INT8 训练或混合精度量化的最新研究成果？","有的。关于统一 INT8 训练，可参考 CVPR 2020 论文《Towards Unified INT8 Training for Convolutional Neural Network》。关于混合精度量化，可参考 ICML 2022 论文《SDQ: Stochastic Differentiable Quantization with Mixed Precision》。此外，还有关于非均匀到均匀量化的研究《Nonuniform-to-Uniform Quantization: Towards Accurate Quantization via Generalized Straight-Through Estimation》(CVPR)，其代码已开源：https:\u002F\u002Fgithub.com\u002Fliuzechun\u002FNonuniform-to-Uniform-Quantization。","https:\u002F\u002Fgithub.com\u002FEfficient-ML\u002FAwesome-Model-Quantization\u002Fissues\u002F31",{"id":129,"question_zh":130,"answer_zh":131,"source_url":132},27123,"在哪里可以找到《Mixed-Precision Neural Network Quantization via Learned Layer-Wise Importance》(ECCV) 这篇论文的官方代码？","该论文的代码已发布在 GitHub 上，仓库地址为：https:\u002F\u002Fgithub.com\u002F1hunters\u002FLIMPQ。","https:\u002F\u002Fgithub.com\u002FEfficient-ML\u002FAwesome-Model-Quantization\u002Fissues\u002F36",{"id":134,"question_zh":135,"answer_zh":136,"source_url":137},27124,"有哪些针对 Vision Transformer 或 MLP 架构的二值化模型研究？","有几个相关研究：1. NeurIPS 2022 论文《BiMLP: Compact Binary Architectures for Vision Multi-Layer Perceptrons》，代码地址：https:\u002F\u002Fgitee.com\u002Fmindspore\u002Fmodels\u002Ftree\u002Fmaster\u002Fresearch\u002Fcv\u002FBiMLP；2. CVPRW 论文《BinaryViT: Pushing Binary Vision Transformers Towards Convolutional Models》，代码地址：https:\u002F\u002Fgithub.com\u002Fphuoc-hoan-le\u002Fbinaryvit；3. NeurIPS 2022 论文《BiT: Robustly Binarized Multi-distilled Transformer》，代码地址：https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Fbit。","https:\u002F\u002Fgithub.com\u002FEfficient-ML\u002FAwesome-Model-Quantization\u002Fissues\u002F49",{"id":139,"question_zh":140,"answer_zh":141,"source_url":127},27125,"Adam 优化器和训练策略如何帮助二值神经网络（BNN）的优化？有相关代码吗？","ICML 2021 发表了论文《How Do Adam and Training Strategies Help BNNs Optimization?》，深入探讨了这一问题。该研究的配套代码已开源，可以在以下地址获取：https:\u002F\u002Fgithub.com\u002Fliuzechun\u002FAdamBNN。",[]]