[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-microsoft--LLMLingua":3,"tool-microsoft--LLMLingua":64},[4,17,27,35,43,56],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":16},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,3,"2026-04-05T11:01:52",[13,14,15],"开发框架","图像","Agent","ready",{"id":18,"name":19,"github_repo":20,"description_zh":21,"stars":22,"difficulty_score":23,"last_commit_at":24,"category_tags":25,"status":16},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",138956,2,"2026-04-05T11:33:21",[13,15,26],"语言模型",{"id":28,"name":29,"github_repo":30,"description_zh":31,"stars":32,"difficulty_score":23,"last_commit_at":33,"category_tags":34,"status":16},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",107662,"2026-04-03T11:11:01",[13,14,15],{"id":36,"name":37,"github_repo":38,"description_zh":39,"stars":40,"difficulty_score":23,"last_commit_at":41,"category_tags":42,"status":16},3704,"NextChat","ChatGPTNextWeb\u002FNextChat","NextChat 是一款轻量且极速的 AI 助手，旨在为用户提供流畅、跨平台的大模型交互体验。它完美解决了用户在多设备间切换时难以保持对话连续性，以及面对众多 AI 模型不知如何统一管理的痛点。无论是日常办公、学习辅助还是创意激发，NextChat 都能让用户随时随地通过网页、iOS、Android、Windows、MacOS 或 Linux 端无缝接入智能服务。\n\n这款工具非常适合普通用户、学生、职场人士以及需要私有化部署的企业团队使用。对于开发者而言，它也提供了便捷的自托管方案，支持一键部署到 Vercel 或 Zeabur 等平台。\n\nNextChat 的核心亮点在于其广泛的模型兼容性，原生支持 Claude、DeepSeek、GPT-4 及 Gemini Pro 等主流大模型，让用户在一个界面即可自由切换不同 AI 能力。此外，它还率先支持 MCP（Model Context Protocol）协议，增强了上下文处理能力。针对企业用户，NextChat 提供专业版解决方案，具备品牌定制、细粒度权限控制、内部知识库整合及安全审计等功能，满足公司对数据隐私和个性化管理的高标准要求。",87618,"2026-04-05T07:20:52",[13,26],{"id":44,"name":45,"github_repo":46,"description_zh":47,"stars":48,"difficulty_score":23,"last_commit_at":49,"category_tags":50,"status":16},2268,"ML-For-Beginners","microsoft\u002FML-For-Beginners","ML-For-Beginners 是由微软推出的一套系统化机器学习入门课程，旨在帮助零基础用户轻松掌握经典机器学习知识。这套课程将学习路径规划为 12 周，包含 26 节精炼课程和 52 道配套测验，内容涵盖从基础概念到实际应用的完整流程，有效解决了初学者面对庞大知识体系时无从下手、缺乏结构化指导的痛点。\n\n无论是希望转型的开发者、需要补充算法背景的研究人员，还是对人工智能充满好奇的普通爱好者，都能从中受益。课程不仅提供了清晰的理论讲解，还强调动手实践，让用户在循序渐进中建立扎实的技能基础。其独特的亮点在于强大的多语言支持，通过自动化机制提供了包括简体中文在内的 50 
多种语言版本，极大地降低了全球不同背景用户的学习门槛。此外，项目采用开源协作模式，社区活跃且内容持续更新，确保学习者能获取前沿且准确的技术资讯。如果你正寻找一条清晰、友好且专业的机器学习入门之路，ML-For-Beginners 将是理想的起点。",84991,"2026-04-05T10:45:23",[14,51,52,53,15,54,26,13,55],"数据工具","视频","插件","其他","音频",{"id":57,"name":58,"github_repo":59,"description_zh":60,"stars":61,"difficulty_score":10,"last_commit_at":62,"category_tags":63,"status":16},3128,"ragflow","infiniflow\u002Fragflow","RAGFlow 是一款领先的开源检索增强生成（RAG）引擎，旨在为大语言模型构建更精准、可靠的上下文层。它巧妙地将前沿的 RAG 技术与智能体（Agent）能力相结合，不仅支持从各类文档中高效提取知识，还能让模型基于这些知识进行逻辑推理和任务执行。\n\n在大模型应用中，幻觉问题和知识滞后是常见痛点。RAGFlow 通过深度解析复杂文档结构（如表格、图表及混合排版），显著提升了信息检索的准确度，从而有效减少模型“胡编乱造”的现象，确保回答既有据可依又具备时效性。其内置的智能体机制更进一步，使系统不仅能回答问题，还能自主规划步骤解决复杂问题。\n\n这款工具特别适合开发者、企业技术团队以及 AI 研究人员使用。无论是希望快速搭建私有知识库问答系统，还是致力于探索大模型在垂直领域落地的创新者，都能从中受益。RAGFlow 提供了可视化的工作流编排界面和灵活的 API 接口，既降低了非算法背景用户的上手门槛，也满足了专业开发者对系统深度定制的需求。作为基于 Apache 2.0 协议开源的项目，它正成为连接通用大模型与行业专有知识之间的重要桥梁。",77062,"2026-04-04T04:44:48",[15,14,13,26,54],{"id":65,"github_repo":66,"name":67,"description_en":68,"description_zh":69,"ai_summary_zh":69,"readme_en":70,"readme_zh":71,"quickstart_zh":72,"use_case_zh":73,"hero_image_url":74,"owner_login":75,"owner_name":76,"owner_avatar_url":77,"owner_bio":78,"owner_company":79,"owner_location":79,"owner_email":80,"owner_twitter":81,"owner_website":82,"owner_url":83,"languages":84,"stars":97,"forks":98,"last_commit_at":99,"license":100,"difficulty_score":23,"env_os":101,"env_gpu":102,"env_ram":103,"env_deps":104,"category_tags":112,"github_topics":79,"view_count":10,"oss_zip_url":79,"oss_zip_packed_at":79,"status":16,"created_at":113,"updated_at":114,"faqs":115,"releases":145},1265,"microsoft\u002FLLMLingua","LLMLingua","[EMNLP'23, ACL'24] To speed up LLMs' inference and enhance LLM's perceive of key information, compress the prompt and KV-Cache, which achieves up to 20x compression with minimal performance loss. 
","LLMLingua 是一个用于压缩提示词（prompt）和键值缓存（KV-Cache）的开源工具，旨在提升大语言模型（LLM）的推理速度并增强其对关键信息的理解能力。它通过识别并去除提示中非必要的内容，实现高达 20 倍的压缩率，同时几乎不损失性能。\n\n在处理长文本或复杂任务时，大模型往往需要消耗大量计算资源，导致推理速度变慢、成本上升。LLMLingua 有效解决了这一问题，使模型能更高效地聚焦于核心信息，从而加快响应速度并降低资源消耗。\n\nLLMLingua 适合开发者、研究人员以及使用大模型进行推理优化的工程师。它已被集成到多个主流框架中，如 Prompt Flow、LangChain 和 LlamaIndex，方便用户直接调用。对于需要处理大量文本输入或希望优化推理效率的场景，例如 RAG（检索增强生成）、在线会议、思维链（CoT）和代码生成等，LLMLingua 都能提供显著帮助。\n\n其独特之处在于采用轻量级预训练模型来识别冗余信息，并支持多种变体（如 LongLLMLingua 和 LLMLingua-2），进一步提升了压缩效率与适用范围。","\u003Cdiv style=\"display: flex; align-items: center;\">\n    \u003Cdiv style=\"width: 100px; margin-right: 10px; height:auto;\" align=\"left\">\n        \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fmicrosoft_LLMLingua_readme_755c33dfd614.png\" alt=\"LLMLingua\" width=\"100\" align=\"left\">\n    \u003C\u002Fdiv>\n    \u003Cdiv style=\"flex-grow: 1;\" align=\"center\">\n        \u003Ch2 align=\"center\">LLMLingua Series | Effectively Deliver Information to LLMs via Prompt Compression\u003C\u002Fh2>\n    \u003C\u002Fdiv>\n\u003C\u002Fdiv>\n\n\u003Cp align=\"center\">\n    | \u003Ca href=\"https:\u002F\u002Fllmlingua.com\u002F\">\u003Cb>Project Page\u003C\u002Fb>\u003C\u002Fa> |\n    \u003Ca href=\"https:\u002F\u002Faclanthology.org\u002F2023.emnlp-main.825\u002F\">\u003Cb>LLMLingua\u003C\u002Fb>\u003C\u002Fa> |\n    \u003Ca href=\"https:\u002F\u002Faclanthology.org\u002F2024.acl-long.91\u002F\">\u003Cb>LongLLMLingua\u003C\u002Fb>\u003C\u002Fa> |\n    \u003Ca href=\"https:\u002F\u002Faclanthology.org\u002F2024.findings-acl.57\u002F\">\u003Cb>LLMLingua-2\u003C\u002Fb>\u003C\u002Fa> |\n    \u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fmicrosoft\u002FLLMLingua\">\u003Cb>LLMLingua Demo\u003C\u002Fb>\u003C\u002Fa> |\n    \u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fmicrosoft\u002FLLMLingua-2\">\u003Cb>LLMLingua-2 Demo\u003C\u002Fb>\u003C\u002Fa> |\n\u003C\u002Fp>\n\nhttps:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FLLMLingua\u002Fassets\u002F30883354\u002Feb0ea70d-6d4c-4aa7-8977-61f94bb87438\n\n## News\n- 🍩 [24\u002F12\u002F13] We are excited to announce the release of our KV cache-centric analysis work, [SCBench](https:\u002F\u002Faka.ms\u002FSCBench), which evaluates long-context methods from a KV cache perspective.\n- 👘 [24\u002F09\u002F16] We are pleased to announce the release of our KV cache offloading work, [RetrievalAttention](https:\u002F\u002Faka.ms\u002FRetrievalAttention), which accelerates long-context LLM inference via vector retrieval.\n- 🌀  [24\u002F07\u002F03] We're excited to announce the release of [MInference](https:\u002F\u002Faka.ms\u002FMInference) to speed up Long-context LLMs' inference, reduces inference latency by up to **10X** for pre-filling on an A100 while maintaining accuracy in **1M tokens prompt**! For more information, check out our [paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2407.02490), visit the [project page](https:\u002F\u002Faka.ms\u002FMInference).\n- 🧩 LLMLingua has been integrated into [Prompt flow](https:\u002F\u002Fmicrosoft.github.io\u002Fpromptflow\u002Fintegrations\u002Ftools\u002Fllmlingua-prompt-compression-tool.html), a streamlined tool framework for LLM-based AI applications.\n- 🦚 We're excited to announce the release of **LLMLingua-2**, boasting a 3x-6x speed improvement over LLMLingua! 
For more information, check out our [paper](https:\u002F\u002Faclanthology.org\u002F2024.findings-acl.57\u002F), visit the [project page](https:\u002F\u002Fllmlingua.com\u002Fllmlingua2.html), and explore our [demo](https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fmicrosoft\u002FLLMLingua-2).\n- 👾 LLMLingua has been integrated into [LangChain](https:\u002F\u002Fgithub.com\u002Flangchain-ai\u002Flangchain\u002Fblob\u002Fmaster\u002Fdocs\u002Fdocs\u002Fintegrations\u002Fretrievers\u002Fllmlingua.ipynb) and [LlamaIndex](https:\u002F\u002Fgithub.com\u002Frun-llama\u002Fllama_index\u002Fblob\u002Fmain\u002Fdocs\u002Fexamples\u002Fnode_postprocessor\u002FLongLLMLingua.ipynb), two widely-used RAG frameworks.\n- 🤳 Talk slides are available in [AI Time Jan, 24](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1fzK3wOvy2boF7XzaYuq2bQ3jFeP1WMk3\u002Fview?usp=sharing).\n- 🖥 EMNLP'23 slides are available in [Session 5](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1GxQLAEN8bBB2yiEdQdW4UKoJzZc0es9t\u002Fview) and [BoF-6](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1LJBUfJrKxbpdkwo13SgPOqugk-UjLVIF\u002Fview).\n- 📚 Check out our new [blog post](https:\u002F\u002Fmedium.com\u002F@iofu728\u002Flongllmlingua-bye-bye-to-middle-loss-and-save-on-your-rag-costs-via-prompt-compression-54b559b9ddf7) discussing RAG benefits and cost savings through prompt compression. See the script example [here](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FLLMLingua\u002Fblob\u002Fmain\u002Fexamples\u002FRetrieval.ipynb).\n- 🎈 Visit our [project page](https:\u002F\u002Fllmlingua.com\u002F) for real-world case studies in RAG, Online Meetings, CoT, and Code.\n- 👨‍🦯 Explore our ['.\u002Fexamples'](.\u002Fexamples) directory for practical applications, including [LLMLingua-2](.\u002Fexamples\u002FLLMLingua2.ipynb), [RAG](.\u002Fexamples\u002FRAG.ipynb), [Online Meeting](.\u002Fexamples\u002FOnlineMeeting.ipynb), [CoT](.\u002Fexamples\u002FCoT.ipynb), [Code](.\u002Fexamples\u002FCode.ipynb), and [RAG using LlamaIndex](.\u002Fexamples\u002FRAGLlamaIndex.ipynb).\n\n## TL;DR\n\nLLMLingua utilizes a compact, well-trained language model (e.g., GPT2-small, LLaMA-7B) to identify and remove non-essential tokens in prompts. This approach enables efficient inference with large language models (LLMs), achieving up to 20x compression with minimal performance loss.\n\n- [LLMLingua: Compressing Prompts for Accelerated Inference of Large Language Models](https:\u002F\u002Faclanthology.org\u002F2023.emnlp-main.825\u002F) (EMNLP 2023)\u003Cbr>\n  _Huiqiang Jiang, Qianhui Wu, Chin-Yew Lin, Yuqing Yang and Lili Qiu_\n\nLongLLMLingua mitigates the 'lost in the middle' issue in LLMs, enhancing long-context information processing. It reduces costs and boosts efficiency with prompt compression, improving RAG performance by up to 21.4% using only 1\u002F4 of the tokens.\n\n- [LongLLMLingua: Accelerating and Enhancing LLMs in Long Context Scenarios via Prompt Compression](https:\u002F\u002Faclanthology.org\u002F2024.acl-long.91\u002F) (ACL 2024 and ICLR ME-FoMo 2024)\u003Cbr>\n  _Huiqiang Jiang, Qianhui Wu, Xufang Luo, Dongsheng Li, Chin-Yew Lin, Yuqing Yang and Lili Qiu_\n\nLLMLingua-2, a small-size yet powerful prompt compression method trained via data distillation from GPT-4 for token classification with a BERT-level encoder, excels in task-agnostic compression. 
It surpasses LLMLingua in handling out-of-domain data, offering 3x-6x faster performance.\n\n- [LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression](https:\u002F\u002Faclanthology.org\u002F2024.findings-acl.57\u002F) (ACL 2024 Findings)\u003Cbr>\n  _Zhuoshi Pan, Qianhui Wu, Huiqiang Jiang, Menglin Xia, Xufang Luo, Jue Zhang, Qingwei Lin, Victor Ruhle, Yuqing Yang, Chin-Yew Lin, H. Vicky Zhao, Lili Qiu, Dongmei Zhang_\n\nSecurityLingua is a safety guardrail model that uses security-aware prompt compression to reveal the malicious intent behind jailbreak attacks, enabling LLMs to detect attacks and generate safe responses. Thanks to its highly efficient prompt compression, the defense adds negligible overhead and costs 100x fewer tokens than state-of-the-art LLM guardrail approaches.\n\n- [SecurityLingua: Efficient Defense of LLM Jailbreak Attacks via Security-Aware Prompt Compression](https:\u002F\u002Fopenreview.net\u002Fforum?id=tybbSo6wba) (CoLM 2025)\u003Cbr>\n  _Yucheng Li, Surin Ahn, Huiqiang Jiang, Amir H. Abdi, Yuqing Yang and Lili Qiu_\n\n## 🎥 Overview\n\n![Background](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fmicrosoft_LLMLingua_readme_a5a01cccc87c.png)\n\n- Ever encountered the token limit when asking ChatGPT to summarize lengthy texts?\n- Frustrated with ChatGPT forgetting previous instructions after extensive fine-tuning?\n- Experienced high costs using the GPT-3.5\u002F4 API for experiments despite excellent results?\n\nWhile Large Language Models like ChatGPT and GPT-4 excel in generalization and reasoning, they often face challenges like prompt length limits and prompt-based pricing schemes.\n\n![Motivation for LLMLingua](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fmicrosoft_LLMLingua_readme_7380c05ef528.png)\n\nNow you can use **LLMLingua**, **LongLLMLingua**, and **LLMLingua-2**!\n\nThese tools offer an efficient solution to compress prompts by up to **20x**, enhancing the utility of LLMs.\n\n- 💰 **Cost Savings**: Reduces both prompt and generation lengths with minimal overhead.\n- 📝 **Extended Context Support**: Enhances support for longer contexts, mitigates the \"lost in the middle\" issue, and boosts overall performance.\n- ⚖️ **Robustness**: No additional training needed for LLMs.\n- 🕵️ **Knowledge Retention**: Maintains original prompt information like ICL and reasoning.\n- 📜 **KV-Cache Compression**: Accelerates the inference process.\n- 🪃 **Comprehensive Recovery**: GPT-4 can recover all key information from compressed prompts.\n\n![Framework of LLMLingua](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fmicrosoft_LLMLingua_readme_a92684b6e954.png)\n\n![Framework of LongLLMLingua](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fmicrosoft_LLMLingua_readme_c8fa42431e72.png)\n\n![Framework of LLMLingua-2](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fmicrosoft_LLMLingua_readme_96eab07ede9f.png)\n\nPS: This demo is based on the [alt-gpt](https:\u002F\u002Fgithub.com\u002Ffeedox\u002Falt-gpt) project. 
Special thanks to @Livshitz for their valuable contribution.\n\nIf you find this repo helpful, please cite the following papers:\n\n```bibtex\n@inproceedings{jiang-etal-2023-llmlingua,\n    title = \"{LLML}ingua: Compressing Prompts for Accelerated Inference of Large Language Models\",\n    author = \"Huiqiang Jiang and Qianhui Wu and Chin-Yew Lin and Yuqing Yang and Lili Qiu\",\n    editor = \"Bouamor, Houda  and\n      Pino, Juan  and\n      Bali, Kalika\",\n    booktitle = \"Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing\",\n    month = dec,\n    year = \"2023\",\n    address = \"Singapore\",\n    publisher = \"Association for Computational Linguistics\",\n    url = \"https:\u002F\u002Faclanthology.org\u002F2023.emnlp-main.825\",\n    doi = \"10.18653\u002Fv1\u002F2023.emnlp-main.825\",\n    pages = \"13358--13376\",\n}\n```\n\n```bibtex\n@inproceedings{jiang-etal-2024-longllmlingua,\n    title = \"{L}ong{LLML}ingua: Accelerating and Enhancing {LLM}s in Long Context Scenarios via Prompt Compression\",\n    author = \"Huiqiang Jiang and Qianhui Wu and Xufang Luo and Dongsheng Li and Chin-Yew Lin and Yuqing Yang and Lili Qiu\",\n    editor = \"Ku, Lun-Wei  and\n      Martins, Andre  and\n      Srikumar, Vivek\",\n    booktitle = \"Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)\",\n    month = aug,\n    year = \"2024\",\n    address = \"Bangkok, Thailand\",\n    publisher = \"Association for Computational Linguistics\",\n    url = \"https:\u002F\u002Faclanthology.org\u002F2024.acl-long.91\",\n    pages = \"1658--1677\",\n}\n```\n\n```bibtex\n@inproceedings{pan-etal-2024-llmlingua,\n    title = \"{LLML}ingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression\",\n    author = \"Zhuoshi Pan and Qianhui Wu and Huiqiang Jiang and Menglin Xia and Xufang Luo and Jue Zhang and Qingwei Lin and Victor Ruhle and Yuqing Yang and Chin-Yew Lin and H. Vicky Zhao and Lili Qiu and Dongmei Zhang\",\n    editor = \"Ku, Lun-Wei  and\n      Martins, Andre  and\n      Srikumar, Vivek\",\n    booktitle = \"Findings of the Association for Computational Linguistics ACL 2024\",\n    month = aug,\n    year = \"2024\",\n    address = \"Bangkok, Thailand and virtual meeting\",\n    publisher = \"Association for Computational Linguistics\",\n    url = \"https:\u002F\u002Faclanthology.org\u002F2024.findings-acl.57\",\n    pages = \"963--981\",\n}\n```\n\n```bibtex\n@inproceedings{li2025securitylingua,\n  title={{S}ecurity{L}ingua: Efficient Defense of {LLM} Jailbreak Attacks via Security-Aware Prompt Compression},\n  author={Yucheng Li and Surin Ahn and Huiqiang Jiang and Amir H. Abdi and Yuqing Yang and Lili Qiu},\n  booktitle={Second Conference on Language Modeling},\n  year={2025},\n  url={https:\u002F\u002Fopenreview.net\u002Fforum?id=tybbSo6wba}\n}\n```\n\n## 🎯 Quick Start\n\n#### 1. **Installing LLMLingua:**\n\nTo get started with LLMLingua, simply install it using pip:\n\n```bash\npip install llmlingua\n```\n\n#### 2. **Using LLMLingua Series Methods for Prompt Compression:**\n\nWith **LLMLingua**, you can easily compress your prompts. Here’s how you can do it:\n\n```python\nfrom llmlingua import PromptCompressor\n\nllm_lingua = PromptCompressor()\ncompressed_prompt = llm_lingua.compress_prompt(prompt, instruction=\"\", question=\"\", target_token=200)\n\n# > {'compressed_prompt': 'Question: Sam bought a dozen boxes, each with 30 highlighter pens inside, for $10 each box. He reanged five of boxes into packages of sixlters each and sold them $3 per. He sold the rest theters separately at the of three pens $2. How much did make in total, dollars?\\nLets think step step\\nSam bought 1 boxes x00 oflters.\\nHe bought 12 * 300ters in total\\nSam then took 5 boxes 6ters0ters.\\nHe sold these boxes for 5 *5\\nAfterelling these  boxes there were 3030 highlighters remaining.\\nThese form 330 \u002F 3 = 110 groups of three pens.\\nHe sold each of these groups for $2 each, so made 110 * 2 = $220 from them.\\nIn total, then, he earned $220 + $15 = $235.\\nSince his original cost was $120, he earned $235 - $120 = $115 in profit.\\nThe answer is 115',\n#  'origin_tokens': 2365,\n#  'compressed_tokens': 211,\n#  'ratio': '11.2x',\n#  'saving': ', Saving $0.1 in GPT-4.'}\n\n## Or use the phi-2 model,\nllm_lingua = PromptCompressor(\"microsoft\u002Fphi-2\")\n\n## Or use a quantized model, such as TheBloke\u002FLlama-2-7b-Chat-GPTQ, which needs \u003C8GB of GPU memory.\n## Before that, you need to pip install optimum auto-gptq\nllm_lingua = PromptCompressor(\"TheBloke\u002FLlama-2-7b-Chat-GPTQ\", model_config={\"revision\": \"main\"})\n```\n\nTo try **LongLLMLingua** in your scenarios, you can use:\n\n```python\nfrom llmlingua import PromptCompressor\n\nllm_lingua = PromptCompressor()\ncompressed_prompt = llm_lingua.compress_prompt(\n    prompt_list,\n    question=question,\n    rate=0.55,\n    # Set the special parameters for LongLLMLingua\n    condition_in_question=\"after_condition\",\n    reorder_context=\"sort\",\n    dynamic_context_compression_ratio=0.3, # or 0.4\n    condition_compare=True,\n    context_budget=\"+100\",\n    rank_method=\"longllmlingua\",\n)\n```\n\nTo try **LLMLingua-2** in your scenarios, you can use:\n\n```python\nfrom llmlingua import PromptCompressor\n\nllm_lingua = PromptCompressor(\n    model_name=\"microsoft\u002Fllmlingua-2-xlm-roberta-large-meetingbank\",\n    use_llmlingua2=True, # Whether to use llmlingua-2\n)\ncompressed_prompt = llm_lingua.compress_prompt(prompt, rate=0.33, force_tokens = ['\\n', '?'])\n\n## Or use the LLMLingua-2-small model\nllm_lingua = PromptCompressor(\n    model_name=\"microsoft\u002Fllmlingua-2-bert-base-multilingual-cased-meetingbank\",\n    use_llmlingua2=True, # Whether to use llmlingua-2\n)\n```\n\nTo try **SecurityLingua** in your scenarios, you can use:\n\n```python\nfrom llmlingua import PromptCompressor\n\nsecuritylingua = PromptCompressor(\n    model_name=\"SecurityLingua\u002Fsecuritylingua-xlm-s2s\",\n    use_slingua=True\n)\nintention = securitylingua.compress_prompt(malicious_prompt)\n```\n\nFor more details about SecurityLingua, please refer to the [securitylingua readme](.\u002Fexperiments\u002Fsecuritylingua\u002Freadme.md).\n\n#### 3. **Advanced usage - Structured Prompt Compression:**\n\nSplit the text into sections, then decide whether to compress each section and at what rate. Use `\u003Cllmlingua>\u003C\u002Fllmlingua>` tags for context segmentation, with optional rate and compress parameters.\n\n```python\nstructured_prompt = \"\"\"\u003Cllmlingua, compress=False>Speaker 4:\u003C\u002Fllmlingua>\u003Cllmlingua, rate=0.4> Thank you. And can we do the functions for content? 
Items I believe are 11, three, 14, 16 and 28, I believe.\u003C\u002Fllmlingua>\u003Cllmlingua, compress=False>\nSpeaker 0:\u003C\u002Fllmlingua>\u003Cllmlingua, rate=0.4> Item 11 is a communication from Council on Price recommendation to increase appropriation in the general fund group in the City Manager Department by $200 to provide a contribution to the Friends of the Long Beach Public Library. Item 12 is communication from Councilman Super Now. Recommendation to increase appropriation in the special advertising and promotion fund group and the city manager's department by $10,000 to provide support for the end of summer celebration. Item 13 is a communication from Councilman Austin. Recommendation to increase appropriation in the general fund group in the city manager department by $500 to provide a donation to the Jazz Angels . Item 14 is a communication from Councilman Austin. Recommendation to increase appropriation in the general fund group in the City Manager department by $300 to provide a donation to the Little Lion Foundation. Item 16 is a communication from Councilman Allen recommendation to increase appropriation in the general fund group in the city manager department by $1,020 to provide contribution to Casa Korero, Sew Feria Business Association, Friends of Long Beach Public Library and Dave Van Patten. Item 28 is a communication. Communication from Vice Mayor Richardson and Council Member Muranga. Recommendation to increase appropriation in the general fund group in the City Manager Department by $1,000 to provide a donation to Ron Palmer Summit. Basketball and Academic Camp.\u003C\u002Fllmlingua>\u003Cllmlingua, compress=False>\nSpeaker 4:\u003C\u002Fllmlingua>\u003Cllmlingua, rate=0.6> We have a promotion and a second time as councilman served Councilman Ringa and customers and they have any comments.\u003C\u002Fllmlingua>\"\"\"\ncompressed_prompt = llm_lingua.structured_compress_prompt(structured_prompt, instruction=\"\", question=\"\", rate=0.5)\nprint(compressed_prompt['compressed_prompt'])\n\n# > Speaker 4:. And can we do the functions for content? Items I believe are11,,116 28,.\n# Speaker 0: a from Council on Price to increase the fund group the Manager0 provide a the the1 is Councilman Super Now. the special group the provide the summerman a the Jazzels a communication from Councilman Austin. Recommendation to increase appropriation in the general fund group in the City Manager department by $300 to provide a donation to the Little Lion Foundation. Item 16 is a communication from Councilman Allen recommendation to increase appropriation in the general fund group in the city manager department by $1,020 to provide contribution to Casa Korero, Sew Feria Business Association, Friends of Long Beach Public Library and Dave Van Patten. Item 28 is a communication. Communication from Vice Mayor Richardson and Council Member Muranga. Recommendation to increase appropriation in the general fund group in the City Manager Department by $1,000 to provide a donation to Ron Palmer Summit. Basketball and Academic Camp.\n# Speaker 4: We have a promotion and a second time as councilman served Councilman Ringa and customers and they have any comments.\n```\n\n#### 4. **Learning More:**\n\nTo understand how to apply LLMLingua and LongLLMLingua in real-world scenarios like RAG, Online Meetings, CoT, and Code, please refer to our [**examples**](.\u002Fexamples). 
For detailed guidance, the [**documentation**](.\u002FDOCUMENT.md) provides extensive recommendations on effectively utilizing LLMLingua.\n\n#### 5. **Data collection and model training of LLMLingua-2:**\n\nTo train the compressor on your custom data, please refer to our [**data_collection**](.\u002Fexperiments\u002Fllmlingua2\u002Fdata_collection) and [**model_training**](.\u002Fexperiments\u002Fllmlingua2\u002Fmodel_training).\n\n## Frequently Asked Questions\n\nFor more insights and answers, visit our [FAQ section](.\u002FTransparency_FAQ.md).\n\n## Contributing\n\nThis project welcomes contributions and suggestions. Most contributions require you to agree to a\nContributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us\nthe rights to use your contribution. For details, visit https:\u002F\u002Fcla.opensource.microsoft.com.\n\nWhen you submit a pull request, a CLA bot will automatically determine whether you need to provide\na CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions\nprovided by the bot. You will only need to do this once across all repos using our CLA.\n\nThis project has adopted the [Microsoft Open Source Code of Conduct](https:\u002F\u002Fopensource.microsoft.com\u002Fcodeofconduct\u002F).\nFor more information see the [Code of Conduct FAQ](https:\u002F\u002Fopensource.microsoft.com\u002Fcodeofconduct\u002Ffaq\u002F) or\ncontact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments.\n\n## Trademarks\n\nThis project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft\ntrademarks or logos is subject to and must follow\n[Microsoft's Trademark & Brand Guidelines](https:\u002F\u002Fwww.microsoft.com\u002Fen-us\u002Flegal\u002Fintellectualproperty\u002Ftrademarks\u002Fusage\u002Fgeneral).\nUse of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship.\nAny use of third-party trademarks or logos is subject to those third parties' policies.\n","","# LLMLingua Quick Start Guide\n\n## Environment Setup\n\n- **System requirements**: Python 3.8 or later; Linux or macOS is recommended.\n- **Prerequisites**: A working Python environment with pip correctly configured. A mirror such as the Tsinghua PyPI index can speed up installation in mainland China.\n\n## Installation\n\nInstall LLMLingua with pip:\n\n```bash\npip install llmlingua\n```\n\nTo install through the Tsinghua mirror, add the `-i` flag:\n\n```bash\npip install llmlingua -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple\n```\n\n## Basic Usage\n\nThe simplest example, showing how to compress a prompt with LLMLingua:\n\n```python\nfrom llmlingua import PromptCompressor\n\nllm_lingua = PromptCompressor()\ncompressed_prompt = llm_lingua.compress_prompt(prompt, instruction=\"\", question=\"\", target_token=200)\n```\n\nWhere:\n- `prompt` is the original prompt content you want to compress;\n- `target_token` is the target number of tokens after compression (200 by default);\n- the return value includes the compressed prompt, the original token count, the compressed token count, and the compression ratio, as shown in the sketch after this list.
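\n\nFor instance, a minimal sketch of consuming those returned fields (the key names follow the sample output shown in the README above; the `prompt` string here is a stand-in):\n\n```python\nfrom llmlingua import PromptCompressor\n\nprompt = 'Your long context goes here...'  # stand-in input\n\nllm_lingua = PromptCompressor()\nresult = llm_lingua.compress_prompt(prompt, instruction='', question='', target_token=200)\n\n# The return value is a dict; these keys match the sample output above.\nprint(result['compressed_prompt'])  # the shortened prompt to send to your LLM\nprint(result['origin_tokens'], '->', result['compressed_tokens'])  # e.g. 2365 -> 211\nprint(result['ratio'])  # e.g. '11.2x'\n```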
\n\nYou can also choose another pretrained model, for example phi-2:\n\n```python\nllm_lingua = PromptCompressor(\"microsoft\u002Fphi-2\")\n```\n\nIf you need a quantized model (such as TheBloke\u002FLlama-2-7b-Chat-GPTQ), it runs in under 8GB of GPU memory; install the optimum and auto-gptq dependencies in advance.","A development team at an online customer-service platform is building an LLM-based Q&A system to automatically answer shoppers' questions on an e-commerce site. They must handle a high volume of user queries and combine each question with dialogue history and product information to generate accurate, natural answers.\n\n### Without LLMLingua\n- Before each answer, the system concatenates the user's current question with the dialogue history into one long prompt, routinely exceeding 1000 tokens.\n- Long prompts significantly increase inference time; per-request latency reaches 2-3 seconds, hurting the user experience.\n- Under high concurrency, server load spikes, slowing the system and even causing timeouts.\n- Prompts carry plenty of repeated or redundant information that the model cannot effectively identify and filter out, wasting resources.\n- On complex questions that need long-context support, model performance degrades noticeably, reducing answer accuracy.\n\n### With LLMLingua\n- The system compresses prompts with LLMLingua, cutting average prompt length to about 1\u002F5 of the original and sharply reducing compute overhead.\n- Inference is over 5x faster; per-request response time drops below 0.4 seconds, a major improvement in user experience.\n- Under high concurrency, server load falls and stability improves, comfortably absorbing traffic peaks.\n- Redundant information is filtered efficiently while key information is preserved, improving the model's grasp of the core question.\n- Even in complex multi-turn conversations, the model maintains high answer accuracy, lifting overall service quality.\n\nCore value: Through efficient prompt compression, LLMLingua significantly improves inference efficiency and system stability without sacrificing model performance, offering a practical optimization for large-scale LLM applications.
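\n\nA minimal sketch of the pattern this scenario describes (the history list, the helper name, and rate=0.2, which mirrors the 'about 1\u002F5' figure above, are illustrative assumptions, not part of the project):\n\n```python\nfrom llmlingua import PromptCompressor\n\nllm_lingua = PromptCompressor()\n\ndef build_compressed_prompt(question, history):\n    # Join the dialogue history exactly as the uncompressed pipeline would.\n    context = '\\n'.join(history)\n    # Compress the history to roughly 1\u002F5 of its length, keeping the question intact.\n    result = llm_lingua.compress_prompt(context, question=question, rate=0.2)\n    # Send this to whatever chat-completion API the service uses.\n    return result['compressed_prompt'] + '\\n\\nQuestion: ' + question\n```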
","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fmicrosoft_LLMLingua_c8fa4243.png","microsoft","Microsoft","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Fmicrosoft_4900709c.png","Open source projects and samples from Microsoft",null,"opensource@microsoft.com","OpenAtMicrosoft","https:\u002F\u002Fopensource.microsoft.com","https:\u002F\u002Fgithub.com\u002Fmicrosoft",[85,89,93],{"name":86,"color":87,"percentage":88},"Python","#3572A5",98.1,{"name":90,"color":91,"percentage":92},"Shell","#89e051",1.8,{"name":94,"color":95,"percentage":96},"Makefile","#427819",0.1,5984,358,"2026-04-05T05:26:19","MIT","Linux, macOS, Windows","NVIDIA GPU required, 8GB+ VRAM, CUDA 11.7+","16GB+",{"notes":105,"python":106,"dependencies":107},"Managing the environment with conda is recommended; the first run downloads about 5GB of model files. Quantized models (such as TheBloke\u002FLlama-2-7b-Chat-GPTQ) are supported and need less than 8GB of GPU memory.","3.8+",[108,109,110,111],"torch>=2.0","transformers>=4.30","accelerate","llmlingua",[26,13,51],"2026-03-27T02:49:30.150509","2026-04-06T06:44:21.302971",[116,121,125,130,135,140],{"id":117,"question_zh":118,"answer_zh":119,"source_url":120},5758,"Why isn't LLMLingua integrated with LangChain yet?","You need to wait for the LangChain team to ship a new community-package release; see their published releases: https:\u002F\u002Fgithub.com\u002Flangchain-ai\u002Flangchain\u002Freleases. If you have already updated but it still doesn't work, make sure the latest version of the langchain_community package is installed correctly.","https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FLLMLingua\u002Fissues\u002F71",{"id":122,"question_zh":123,"answer_zh":124,"source_url":120},5759,"How do I use LLMLingua for RAG applications in LangChain?","See the example in LangChain's official docs: https:\u002F\u002Fgithub.com\u002Flangchain-ai\u002Flangchain\u002Fblob\u002Fmaster\u002Fdocs\u002Fdocs\u002Fintegrations\u002Fretrievers\u002Fllmlingua.ipynb. If you hit a module-not-found error, confirm that the langchain_community package is installed and that your import path is correct.",{"id":126,"question_zh":127,"answer_zh":128,"source_url":129},5760,"How should the parameters be set when compressing BBH prompts with LLMLingua-2?","When compressing BBH prompts, set `target_token` slightly higher than the number of tokens you actually want to keep, to compensate for the gap after actual compression. The `use_context_level_filter` parameter is usually set to true. For a task's multiple CoT prompts, pass them as a list, one prompt per item.
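 A minimal sketch of that setup (the prompt list and the token budget are illustrative; the parameter names come from this answer and the README):\n\n```python\nfrom llmlingua import PromptCompressor\n\nllm_lingua = PromptCompressor(\n    model_name='microsoft\u002Fllmlingua-2-xlm-roberta-large-meetingbank',\n    use_llmlingua2=True,\n)\n\n# One CoT prompt per list item (contents assumed).\ncot_prompts = ['Q: ... A: Let us think step by step ...', 'Q: ... A: ...']\nresult = llm_lingua.compress_prompt(\n    cot_prompts,\n    target_token=250,  # slightly above the tokens you actually want to keep (assumed budget)\n    use_context_level_filter=True,\n)\nprint(result['compressed_prompt'])\n```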
显存。","3.8+",[108,109,110,111],"torch>=2.0","transformers>=4.30","accelerate","llmlingua",[26,13,51],"2026-03-27T02:49:30.150509","2026-04-06T06:44:21.302971",[116,121,125,130,135,140],{"id":117,"question_zh":118,"answer_zh":119,"source_url":120},5758,"为什么 LLMLingua 目前还没有与 Langchain 集成？","需要等待 Langchain 团队发布新的社区包版本。可以查看其发布的版本信息：https:\u002F\u002Fgithub.com\u002Flangchain-ai\u002Flangchain\u002Freleases。如果已经更新了版本但仍然无法使用，请确保正确安装了最新版本的 langchain_community 包。","https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FLLMLingua\u002Fissues\u002F71",{"id":122,"question_zh":123,"answer_zh":124,"source_url":120},5759,"如何在 Langchain 中使用 LLMLingua 进行 RAG 应用？","可以参考 Langchain 的官方文档中的示例：https:\u002F\u002Fgithub.com\u002Flangchain-ai\u002Flangchain\u002Fblob\u002Fmaster\u002Fdocs\u002Fdocs\u002Fintegrations\u002Fretrievers\u002Fllmlingua.ipynb。如果遇到模块未找到错误，请确认是否已正确安装 langchain_community 包，并检查导入路径是否正确。",{"id":126,"question_zh":127,"answer_zh":128,"source_url":129},5760,"在使用 LLMLingua-2 压缩 BBH 提示时，参数应如何设置？","压缩 BBH 提示时，建议将 `target_token` 设置为略高于所需保留的实际令牌数，以补偿实际压缩后的差异。此外，`use_context_level_filter` 参数通常设置为 true。对于每个任务的多个 CoT 提示，建议将其作为列表传递，每个提示作为一个独立项。","https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FLLMLingua\u002Fissues\u002F191",{"id":131,"question_zh":132,"answer_zh":133,"source_url":134},5761,"如何理解 `ratio` 和 `iterative_size` 参数之间的关系？","`iterative_size` 参数控制每次迭代处理的段长度，而 `ratio` 控制每段中要压缩的百分比。当 `iterative_size` 较小时，最终压缩比例更接近 `ratio` 的设定值。这是因为较小的 `iterative_size` 可以更精确地应用压缩比例，避免过度压缩。","https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FLLMLingua\u002Fissues\u002F61",{"id":136,"question_zh":137,"answer_zh":138,"source_url":139},5762,"使用 LongLLMLingua 进行重排序时没有看到性能提升，怎么办？","尝试调整 `condition_in_question` 参数，例如将其设为 'after'。此外，确保使用了合适的限制性提示，并考虑使用不同的小型语言模型进行重排序。如果使用了目标 LLM 进行分布对齐，可能需要重新微调小型 LLM 以适应新模型。","https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FLLMLingua\u002Fissues\u002F39",{"id":141,"question_zh":142,"answer_zh":143,"source_url":144},5763,"为什么在计算阈值时使用 `new_token_probs` 而不是 `word_probs`？","使用 `new_token_probs` 是为了确保每个 OpenAI 分词器返回的 token 都有对应的概率，包括特殊字符（如 `▁`）。虽然这会导致单词的概率被重复多次，但实验表明保留这些字符可以提高性能。因此，我们选择不移除这些字符。","https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FLLMLingua\u002Fissues\u002F194",[146,151,156,161,166,171,176,181],{"id":147,"version":148,"summary_zh":149,"released_at":150},105425,"v0.2.2","## What's Changed\r\n* add feature: compress_json by @SiyunZhao in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FLLMLingua\u002Fpull\u002F120\r\n\r\n## Bug fixed\r\n* Prevent duplicate `torch_dtype` kwargs by @yasyf in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FLLMLingua\u002Fpull\u002F115\r\n* Feature(LLMLingua-2): fix the title by @iofu728 in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FLLMLingua\u002Fpull\u002F117\r\n* Fix(LLMLingua-2): fix the chunk max seq by @pzs19 in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FLLMLingua\u002Fpull\u002F122\r\n* Prereleased(LLMLinguia): fix the chuck issue and prepare for v0.2.2 by @pzs19, @qianhuiwu in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FLLMLingua\u002Fpull\u002F130\r\n\r\n## New Contributors\r\n* @yasyf made their first contribution in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FLLMLingua\u002Fpull\u002F115\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FLLMLingua\u002Fcompare\u002Fv0.2.1...v0.2.2","2024-04-09T08:19:16",{"id":152,"version":153,"summary_zh":154,"released_at":155},105426,"v0.2.1","## What's Changed\r\n* Feature(LLMLingua-2): add LLMLingua-2 by @pzs19, @iofu728, @qianhuiwu in 
https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FLLMLingua\u002Fpull\u002F111, https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FLLMLingua\u002Fpull\u002F112\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FLLMLingua\u002Fcompare\u002Fv0.2.0...v0.2.1","2024-03-20T03:41:42",{"id":157,"version":158,"summary_zh":159,"released_at":160},105427,"v0.2.0","## What's Changed\r\n* Feature (LLMLingua): support customized compression spec by @SiyunZhao in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FLLMLingua\u002Fpull\u002F96\r\n* Feature(LLMLingua): add LangChain example by @iofu728 in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FLLMLingua\u002Fpull\u002F97\r\n* Feature(LongLLMLingua): update conference by @iofu728 in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FLLMLingua\u002Fpull\u002F99\r\n* Fix(LLMLingua): fix sentence-filter adding separator bug and add document by @SiyunZhao in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FLLMLingua\u002Fpull\u002F102\r\n* Fix(LLMLingua): fix the release workflows by @iofu728 in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FLLMLingua\u002Fpull\u002F107, https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FLLMLingua\u002Fpull\u002F108\r\n* Release(LLMLingua): release v0.2.0 by @iofu728 in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FLLMLingua\u002Fpull\u002F109\r\n\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FLLMLingua\u002Fcompare\u002Fv0.1.6...v0.2.0","2024-03-13T04:07:27",{"id":162,"version":163,"summary_zh":164,"released_at":165},105428,"v0.1.6","## What's Changed\r\n* Feature(LLMLingua): add slide of AI Time. by @iofu728 in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FLLMLingua\u002Fpull\u002F42\r\n* Feature(LLMLingua): add alt-gpt reference by @iofu728 in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FLLMLingua\u002Fpull\u002F56\r\n* Feature(LLMLingua): support phi-2 by @iofu728 in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FLLMLingua\u002Fpull\u002F67\r\n* Feature (LLMLingua): Add Docstring for PromptCompressor Class by @SiyunZhao in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FLLMLingua\u002Fpull\u002F69\r\n* Feature(LLMLingua): update the FAQ by @iofu728 in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FLLMLingua\u002Fpull\u002F88\r\n\r\n### Bug Fixed\r\n* Hotfix (LLMLingua): fix the out of range by @iofu728 in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FLLMLingua\u002Fpull\u002F87\r\n* Fix (LLMLingua): fix the link of llama_index by @wlsdml1114 in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FLLMLingua\u002Fpull\u002F62\r\n* Fix(LLMLingua): fix the force context ids and condition flag by @iofu728 in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FLLMLingua\u002Fpull\u002F59\r\n* Fix (LLMLingua): Resolved a potential ZeroDivisionError caused by the actual compression ratio. 
by @davidberenstein1957 in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FLLMLingua\u002Fpull\u002F54\r\n* Remove Duplicate Declaration of Loss Function by @Speuce in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FLLMLingua\u002Fpull\u002F38\r\n\r\n## New Contributors\r\n* @Speuce made their first contribution in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FLLMLingua\u002Fpull\u002F38\r\n* @davidberenstein1957 made their first contribution in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FLLMLingua\u002Fpull\u002F54\r\n* @wlsdml1114 made their first contribution in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FLLMLingua\u002Fpull\u002F62\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FLLMLingua\u002Fcompare\u002Fv0.1.5...v0.1.6","2024-02-19T08:47:34",{"id":167,"version":168,"summary_zh":169,"released_at":170},105429,"v0.1.5","## What's Changed\r\n* Fix (LLMLingua): support mps, fix keep_flag out of dimension. by @iofu728 in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FLLMLingua\u002Fpull\u002F27\r\n* Prerelease(LLMLingua): fix the license by @iofu728 in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FLLMLingua\u002Fpull\u002F32\r\n* Fix typos by @bobchao in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FLLMLingua\u002Fpull\u002F26 and @eltociear in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FLLMLingua\u002Fpull\u002F30\r\n\r\n## New Contributors\r\n* @bobchao made their first contribution in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FLLMLingua\u002Fpull\u002F26\r\n* @eltociear made their first contribution in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FLLMLingua\u002Fpull\u002F30\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FLLMLingua\u002Fcompare\u002Fv0.1.4...v0.1.5","2023-12-21T08:59:54",{"id":172,"version":173,"summary_zh":174,"released_at":175},105430,"v0.1.4","## What's Changed\r\n* Fixed(LLMLingua): fix the prefix dimension mismatch. 
by @iofu728 in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FLLMLingua\u002Fpull\u002F16\r\n* Feature (LLMLingua): support GPT-Q by @iofu728 in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FLLMLingua\u002Fpull\u002F20\r\n\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FLLMLingua\u002Fcompare\u002Fv0.1.3...v0.1.4","2023-11-22T14:17:15",{"id":177,"version":178,"summary_zh":179,"released_at":180},105431,"v0.1.3","## Features\r\n- [x] Add project page #9;\r\n- [x] Add examples #11; \r\n- [x] Support more reranker models and retrieval models #13;\r\n\r\n## Bug Fixed\r\n- [x] Fixed the 'context empty' issue #15;\r\n\r\n## What's Changed\r\n* Feature(LLMLingua): release LLMLingua code & demo by @iofu728 in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FLLMLingua\u002Fpull\u002F2\r\n* Feature(LLMLingua): update the news by @iofu728 in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FLLMLingua\u002Fpull\u002F9\r\n* Feature(LLMLingua): add examples by @iofu728 in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FLLMLingua\u002Fpull\u002F11\r\n* Feature(LongLLMLingua): support reranker model by @iofu728 in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FLLMLingua\u002Fpull\u002F13\r\n* Fixed (LLMLingua): Resolved the issue where the context was coming up as empty by @iofu728 in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FLLMLingua\u002Fpull\u002F15\r\n\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FLLMLingua\u002Fcompare\u002Fv0.1.2...v0.1.3","2023-11-15T05:06:24",{"id":182,"version":183,"summary_zh":184,"released_at":185},105432,"v0.1.2","## Features\r\n- [x] LLMLingua Code;\r\n- [x] LongLLMLingua Code; \r\n- [x] LLMLingua Document;\r\n- [x] LLMLingua FAQ;\r\n- [x] LLMLingua HF Demo;\r\n- [x] LLMLingua Demo Video;\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FLLMLingua\u002Fcommits\u002Fv0.1.2","2023-10-09T14:32:58"]