[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-castorini--rank_llm":3,"tool-castorini--rank_llm":64},[4,17,27,35,43,56],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":16},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,3,"2026-04-05T11:01:52",[13,14,15],"开发框架","图像","Agent","ready",{"id":18,"name":19,"github_repo":20,"description_zh":21,"stars":22,"difficulty_score":23,"last_commit_at":24,"category_tags":25,"status":16},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",140436,2,"2026-04-05T23:32:43",[13,15,26],"语言模型",{"id":28,"name":29,"github_repo":30,"description_zh":31,"stars":32,"difficulty_score":23,"last_commit_at":33,"category_tags":34,"status":16},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",107662,"2026-04-03T11:11:01",[13,14,15],{"id":36,"name":37,"github_repo":38,"description_zh":39,"stars":40,"difficulty_score":23,"last_commit_at":41,"category_tags":42,"status":16},3704,"NextChat","ChatGPTNextWeb\u002FNextChat","NextChat 是一款轻量且极速的 AI 助手，旨在为用户提供流畅、跨平台的大模型交互体验。它完美解决了用户在多设备间切换时难以保持对话连续性，以及面对众多 AI 模型不知如何统一管理的痛点。无论是日常办公、学习辅助还是创意激发，NextChat 都能让用户随时随地通过网页、iOS、Android、Windows、MacOS 或 Linux 端无缝接入智能服务。\n\n这款工具非常适合普通用户、学生、职场人士以及需要私有化部署的企业团队使用。对于开发者而言，它也提供了便捷的自托管方案，支持一键部署到 Vercel 或 Zeabur 等平台。\n\nNextChat 的核心亮点在于其广泛的模型兼容性，原生支持 Claude、DeepSeek、GPT-4 及 Gemini Pro 等主流大模型，让用户在一个界面即可自由切换不同 AI 能力。此外，它还率先支持 MCP（Model Context Protocol）协议，增强了上下文处理能力。针对企业用户，NextChat 提供专业版解决方案，具备品牌定制、细粒度权限控制、内部知识库整合及安全审计等功能，满足公司对数据隐私和个性化管理的高标准要求。",87618,"2026-04-05T07:20:52",[13,26],{"id":44,"name":45,"github_repo":46,"description_zh":47,"stars":48,"difficulty_score":23,"last_commit_at":49,"category_tags":50,"status":16},2268,"ML-For-Beginners","microsoft\u002FML-For-Beginners","ML-For-Beginners 是由微软推出的一套系统化机器学习入门课程，旨在帮助零基础用户轻松掌握经典机器学习知识。这套课程将学习路径规划为 12 周，包含 26 节精炼课程和 52 道配套测验，内容涵盖从基础概念到实际应用的完整流程，有效解决了初学者面对庞大知识体系时无从下手、缺乏结构化指导的痛点。\n\n无论是希望转型的开发者、需要补充算法背景的研究人员，还是对人工智能充满好奇的普通爱好者，都能从中受益。课程不仅提供了清晰的理论讲解，还强调动手实践，让用户在循序渐进中建立扎实的技能基础。其独特的亮点在于强大的多语言支持，通过自动化机制提供了包括简体中文在内的 50 
多种语言版本，极大地降低了全球不同背景用户的学习门槛。此外，项目采用开源协作模式，社区活跃且内容持续更新，确保学习者能获取前沿且准确的技术资讯。如果你正寻找一条清晰、友好且专业的机器学习入门之路，ML-For-Beginners 将是理想的起点。",84991,"2026-04-05T10:45:23",[14,51,52,53,15,54,26,13,55],"数据工具","视频","插件","其他","音频",{"id":57,"name":58,"github_repo":59,"description_zh":60,"stars":61,"difficulty_score":10,"last_commit_at":62,"category_tags":63,"status":16},3128,"ragflow","infiniflow\u002Fragflow","RAGFlow 是一款领先的开源检索增强生成（RAG）引擎，旨在为大语言模型构建更精准、可靠的上下文层。它巧妙地将前沿的 RAG 技术与智能体（Agent）能力相结合，不仅支持从各类文档中高效提取知识，还能让模型基于这些知识进行逻辑推理和任务执行。\n\n在大模型应用中，幻觉问题和知识滞后是常见痛点。RAGFlow 通过深度解析复杂文档结构（如表格、图表及混合排版），显著提升了信息检索的准确度，从而有效减少模型“胡编乱造”的现象，确保回答既有据可依又具备时效性。其内置的智能体机制更进一步，使系统不仅能回答问题，还能自主规划步骤解决复杂问题。\n\n这款工具特别适合开发者、企业技术团队以及 AI 研究人员使用。无论是希望快速搭建私有知识库问答系统，还是致力于探索大模型在垂直领域落地的创新者，都能从中受益。RAGFlow 提供了可视化的工作流编排界面和灵活的 API 接口，既降低了非算法背景用户的上手门槛，也满足了专业开发者对系统深度定制的需求。作为基于 Apache 2.0 协议开源的项目，它正成为连接通用大模型与行业专有知识之间的重要桥梁。",77062,"2026-04-04T04:44:48",[15,14,13,26,54],{"id":65,"github_repo":66,"name":67,"description_en":68,"description_zh":69,"ai_summary_zh":69,"readme_en":70,"readme_zh":71,"quickstart_zh":72,"use_case_zh":73,"hero_image_url":74,"owner_login":75,"owner_name":76,"owner_avatar_url":77,"owner_bio":78,"owner_company":79,"owner_location":79,"owner_email":79,"owner_twitter":79,"owner_website":80,"owner_url":81,"languages":82,"stars":95,"forks":96,"last_commit_at":97,"license":98,"difficulty_score":10,"env_os":99,"env_gpu":100,"env_ram":101,"env_deps":102,"category_tags":116,"github_topics":79,"view_count":23,"oss_zip_url":79,"oss_zip_packed_at":79,"status":16,"created_at":117,"updated_at":118,"faqs":119,"releases":159},2814,"castorini\u002Frank_llm","rank_llm","RankLLM is a Python toolkit for reproducible information retrieval research using rerankers, with a focus on listwise reranking.","RankLLM 是一款专为信息检索研究打造的 Python 工具包，核心致力于利用重排序模型（Rerankers）提升搜索结果的准确性，尤其擅长处理“列表式重排序”任务。在搜索引擎或问答系统中，初步检索往往返回大量相关但顺序不够理想的文档，RankLLM 通过引入先进的语言模型对这些结果进行二次精细排序，有效解决了传统方法难以捕捉复杂语义关联、导致关键信息排名靠后的痛点。\n\n这款工具非常适合人工智能研究人员、数据科学家以及需要构建高精度检索系统的开发者使用。它提供了一个丰富的模型库，不仅支持 MonoT5、DuoT5 等经典点对点与成对模型，更重点优化了基于 vLLM、SGLang 等框架的开源大语言模型，同时也兼容 RankGPT 等专有模型。其独特的技术亮点在于支持自定义提示词模板（YAML 配置），让用户能轻松集成自有模型；同时提供仅基于首个令牌逻辑推理的高效模式，显著降低了计算成本。凭借可复现的研究流程和灵活的命令行接口，RankLLM 帮助用户以更低的门槛实现业界领先的检索重排序效果。","# RankLLM\n\n[![PyPI](https:\u002F\u002Fimg.shields.io\u002Fpypi\u002Fv\u002Frank-llm?color=brightgreen)](https:\u002F\u002Fpypi.org\u002Fproject\u002Frank-llm\u002F)\n[![Downloads](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcastorini_rank_llm_readme_a180cff68274.png)](https:\u002F\u002Fpepy.tech\u002Fproject\u002Frank-llm)\n[![Downloads](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcastorini_rank_llm_readme_95813fd56d2d.png)](https:\u002F\u002Fpepy.tech\u002Fproject\u002Frank-llm)\n[![Generic badge](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2309.15088-red.svg)](https:\u002F\u002Farxiv.org\u002Fabs\u002F2309.15088)\n[![LICENSE](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Flicense-Apache-blue.svg?style=flat)](https:\u002F\u002Fwww.apache.org\u002Flicenses\u002FLICENSE-2.0)\n\n## News\n- **[2026.03.26]** RankLLM now supports the new `rank-llm` command-line interface (CLI).\n- **[2025.08.25]** Added support for OpenRouter API - Release [v0.25.7](docs\u002Frelease-notes\u002Frelease-notes-v0.25.7.md)\n- **[2025.07.23]** Added support for custom prompt templates with YAML files - Release [v0.25.0](docs\u002Frelease-notes\u002Frelease-notes-v0.25.0.md). You can now integrate your own prompt and language model with just a few lines of code. 
Check out the [Reasonrank integration](https:\u002F\u002Fgithub.com\u002Fcastorini\u002Frank_llm\u002Fpull\u002F306) as an example.\n- **[2025.05.25]** Our [RankLLM](https:\u002F\u002Fdl.acm.org\u002Fdoi\u002Fpdf\u002F10.1145\u002F3726302.3730331) resource paper is accepted to SIGIR 2025! 🎉🎉🎉\n\n## Overview\nWe offer a suite of rerankers - pointwise models like MonoT5, pairwise models like DuoT5, and listwise models with a focus on open-source LLMs compatible with [vLLM](https:\u002F\u002Fgithub.com\u002Fvllm-project\u002Fvllm), [SGLang](https:\u002F\u002Fgithub.com\u002Fsgl-project\u002Fsglang), or [TensorRT-LLM](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FTensorRT-LLM). We also support RankGPT and RankGemini variants, which are proprietary listwise rerankers. Additionally, we support reranking with the first-token logits only to improve inference efficiency. Some of the code in this repository is borrowed from [RankGPT](https:\u002F\u002Fgithub.com\u002Fsunnweiwei\u002FRankGPT), [PyGaggle](https:\u002F\u002Fgithub.com\u002Fcastorini\u002Fpygaggle), and [LiT5](https:\u002F\u002Fgithub.com\u002Fcastorini\u002FLiT5)!\n\n\u003Cp align=\"center\">\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcastorini_rank_llm_readme_938105b3d4e6.png\" alt=\"RankLLM Overview\" style=\"width:95%;\">\n\u003C\u002Fp>\n\n## Releases\ncurrent_version = \"0.25.7\"\n\n## Content\n1. [Installation](#installation)\n2. [Quick Start](#quick-start)\n3. [End-to-end Run and 2CR](#end-to-end-run-and-2cr)\n4. [Model Zoo](#model-zoo)\n5. [Training](#training)\n6. [Community Contribution](#community-contribution)\n7. [References and Citations](#references)\n8. [Acknowledgments](#acknowledgments)\n\n\u003Ca id=\"installation\">\u003C\u002Fa>\n# 📟 Installation\n\n`uv` is the canonical contributor workflow for this repository. The existing\n`conda` and `pip` paths remain available as fallbacks.\n\n## Install `uv`\n\nInstall `uv` with Astral's official installer:\n\n```bash\ncurl -LsSf https:\u002F\u002Fastral.sh\u002Fuv\u002Finstall.sh | sh\nexport PATH=\"$HOME\u002F.local\u002Fbin:$PATH\"\n```\n\n## Prerequisites\n\n- Install Java 21 only if you plan to use retrieval or evaluation workflows via\n  `rank-llm[pyserini]`. JDK 11 is not supported.\n- Install CUDA-specific PyTorch wheels separately if you want GPU-optimized\n  builds beyond the default Python package resolution.\n\n## Development Installation\n\nFor development or the latest features, create a repo-local virtual environment:\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002Fcastorini\u002Frank_llm.git\ncd rank_llm\nuv python install 3.11\nuv venv --python 3.11\nsource .venv\u002Fbin\u002Factivate\nuv sync --group dev\n```\n\nIf you prefer not to activate the virtual environment, run commands through\n`uv run`, for example `uv run python -m unittest discover test`.\n\n## Optional Extras\n\nInstall only the stacks you need:\n\n```bash\nuv sync --group dev --extra \u003Cextra>\n```\n\nReplace `\u003Cextra>` with one of the extras in the feature matrix below.
You can repeat `--extra` to combine stacks in one environment, for example `uv sync --group dev --extra local --extra openai`.\n\n### Feature Matrix\n\n| Workflow | Extra | Notes |\n| --- | --- | --- |\n| Hosted OpenAI or OpenRouter rerankers | `openai` | Includes `python-dotenv` and `tiktoken` |\n| Hosted Gemini rerankers | `genai` | `gemini` is an alias |\n| All hosted-provider rerankers | `cloud` | Installs `openai` and `genai` |\n| Local Hugging Face and PyTorch rerankers | `local` | Installs `torch` and `transformers` for MonoT5, DuoT5, MonoELECTRA, and related local paths |\n| Pyserini retrieval and evaluation | `pyserini` | Requires Java 21 |\n| Lightweight HTTP API dependencies | `api` | Installs FastAPI, Flask, and Uvicorn without the heavier retrieval or inference stacks |\n| MCP server dependencies | `mcp` | Pulls the packaged `serve mcp` dependency set, including Pyserini and model-serving backends |\n| Listwise reranking with open-source models via vLLM | `vllm` | Builds on `local` and adds the vLLM backend |\n| Batched SGLang inference | `sglang` | Install `flashinfer` separately when needed |\n| Batched TensorRT-LLM inference | `tensorrt-llm` | Install `flash-attn` separately when needed |\n| Full HTTP and MCP server bundle | `server` | Aggregate of the `api` and `mcp` extras |\n| Finetuning and training scripts | `training` | Keeps training-only deps out of base installs |\n| Everything | `all` | Aggregate of all extras |\n\n### PyPI Installation\n\nCreate an isolated virtual environment and install the published package:\n\n```bash\nuv venv --python 3.11\nsource .venv\u002Fbin\u002Factivate\nuv pip install rank-llm\n```\n\n### Fallback `conda` \u002F `pip` Workflow\n\nIf you want to keep using conda:\n\n```bash\nconda create -n rankllm python=3.11 -c conda-forge -y\nconda activate rankllm\npip install -e .\n```\n\nThen install the optional stack you need, for example:\n\n```bash\npip install -e \".[\u003Cextra>]\"\n```\n\nReplace `\u003Cextra>` with one of the extras in the feature matrix above. You can\ncombine extras as needed, for example `pip install -e \".[openai,api]\"`.\n\nRemember to install `flashinfer` for the `sglang` backend and `flash-attn` for\noptimized TensorRT-LLM or training workflows when those stacks require them.\n\n```bash\npip install flashinfer -i https:\u002F\u002Fflashinfer.ai\u002Fwhl\u002Fcu121\u002Ftorch2.4\u002F\npip install flash-attn --no-build-isolation\n```\n\n\u003Ca id=\"quick-start\">\u003C\u002Fa>\n# ⏳ Quick Start\nThe packaged `rank-llm` command is the canonical CLI surface for this repository.\nThe legacy scripts under `src\u002Frank_llm\u002Fscripts\u002F` still work, but they now act as\ncompatibility wrappers over the same CLI.\n\n```bash\nrank-llm rerank --model-path castorini\u002Frank_zephyr_7b_v1_full --dataset dl20 \\\n  --retrieval-method bm25 --top-k-candidates 100\n\nrank-llm prompt list\nrank-llm view demo_outputs\u002Frerank_results.jsonl\nrank-llm evaluate --model-name castorini\u002Frank_zephyr_7b_v1_full\nrank-llm serve http --model-path castorini\u002Frank_zephyr_7b_v1_full --port 8082\nrank-llm serve mcp --transport stdio\n```\n\nThe following code snippet is a minimal walkthrough of retrieval, reranking, evaluation, and invocation analysis of the top 100 retrieved documents for queries from `DL19`. In this example `BM25` is used as the retriever and `RankZephyr` as the reranker.
Additional sample snippets are available to run under the `src\u002Frank_llm\u002Fdemo` directory.\n```python\nfrom pathlib import Path\n\nfrom rank_llm.analysis.response_analysis import ResponseAnalyzer\nfrom rank_llm.data import DataWriter\nfrom rank_llm.evaluation.trec_eval import EvalFunction\nfrom rank_llm.rerank import Reranker, get_openai_api_key\nfrom rank_llm.rerank.listwise import (\n    SafeOpenai,\n    VicunaReranker,\n    ZephyrReranker,\n)\nfrom rank_llm.retrieve.retriever import RetrievalMethod, Retriever\nfrom rank_llm.retrieve.topics_dict import TOPICS\n\n# -------- Retrieval --------\n\n# By default BM25 is used for retrieval of top 100 candidates.\ndataset_name = \"dl19\"\nretrieved_results = Retriever.from_dataset_with_prebuilt_index(dataset_name)\n\n# Users can specify other retrieval methods and number of retrieved candidates.\n# retrieved_results = Retriever.from_dataset_with_prebuilt_index(\n#     dataset_name, RetrievalMethod.SPLADE_P_P_ENSEMBLE_DISTIL, k=50\n# )\n# ---------------------------\n\n# --------- Rerank ----------\n\n# Rank Zephyr model\nreranker = ZephyrReranker()\n\n# Rank Vicuna model\n# reranker = VicunaReranker()\n\n# RankGPT\n# model_coordinator = SafeOpenai(\"gpt-4o-mini\", 4096, keys=get_openai_api_key())\n# reranker = Reranker(model_coordinator)\n\nkwargs = {\"populate_invocations_history\": True}\nrerank_results = reranker.rerank_batch(requests=retrieved_results, **kwargs)\n# ---------------------------\n\n# ------- Evaluation --------\n\n# Evaluate retrieved results.\ntopics = TOPICS[dataset_name]\nndcg_10_retrieved = EvalFunction.from_results(retrieved_results, topics)\nprint(ndcg_10_retrieved)\n\n# Evaluate rerank results.\nndcg_10_rerank = EvalFunction.from_results(rerank_results, topics)\nprint(ndcg_10_rerank)\n\n# By default ndcg@10 is the eval metric; other values can be specified:\n# eval_args = [\"-c\", \"-m\", \"map_cut.100\", \"-l2\"]\n# map_100_rerank = EvalFunction.from_results(rerank_results, topics, eval_args)\n# print(map_100_rerank)\n\n# eval_args = [\"-c\", \"-m\", \"recall.20\"]\n# recall_20_rerank = EvalFunction.from_results(rerank_results, topics, eval_args)\n# print(recall_20_rerank)\n\n# ---------------------------\n\n# --- Analyze invocations ---\nanalyzer = ResponseAnalyzer.from_inline_results(rerank_results)\nerror_counts = analyzer.count_errors(verbose=True)\nprint(error_counts)\n# ---------------------------\n\n# ------ Save results -------\nwriter = DataWriter(rerank_results)\nPath(\"demo_outputs\u002F\").mkdir(parents=True, exist_ok=True)\nwriter.write_in_jsonl_format(\"demo_outputs\u002Frerank_results.jsonl\")\nwriter.write_in_trec_eval_format(\"demo_outputs\u002Frerank_results.txt\")\nwriter.write_inference_invocations_history(\n    \"demo_outputs\u002Finference_invocations_history.json\"\n)\n# ---------------------------\n```\n
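\nThe JSONL output written by `DataWriter` above can be loaded back for quick inspection. A minimal sketch, assuming only that the file is valid JSONL (the per-record schema is RankLLM's own):\n\n```python\nimport json\n\n# Read back the rerank results saved above; each line is one JSON record.\nwith open(\"demo_outputs\u002Frerank_results.jsonl\") as f:\n    results = [json.loads(line) for line in f]\nprint(len(results), \"rerank result records loaded\")\n```\n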
\n# End-to-end Run and 2CR\nIf you are interested in running retrieval and reranking end-to-end or reproducing the results from the [reference papers](#references), `rank-llm rerank` is the canonical command. `run_rank_llm.py` remains available as a compatibility wrapper for older automation.\n\nThe comprehensive list of our two-click reproduction commands is available on the [MS MARCO V1](https:\u002F\u002Fcastorini.github.io\u002Frank_llm\u002Fsrc\u002Frank_llm\u002F2cr\u002Fmsmarco-v1-passage.html) and [MS MARCO V2](https:\u002F\u002Fcastorini.github.io\u002Frank_llm\u002Fsrc\u002Frank_llm\u002F2cr\u002Fmsmarco-v2-passage.html) webpages for the DL19\u002FDL20 and DL21-23 datasets, respectively. Moving forward, we plan to cover more datasets and retrievers in our 2CR pages. The rest of this section provides some sample e2e runs.\n## RankZephyr\n\nWe can run the RankZephyr model with the following command:\n```bash\nrank-llm rerank --model-path castorini\u002Frank_zephyr_7b_v1_full --top-k-candidates 100 --dataset dl20 \\\n--retrieval-method SPLADE++_EnsembleDistil_ONNX --prompt-template-path src\u002Frank_llm\u002Frerank\u002Fprompt_templates\u002Frank_zephyr_template.yaml --context-size 4096 --variable-passages\n```\n\nIncluding the `--sglang_batched` flag will allow you to run the model in batched mode using the `SGLang` library.\n\nIncluding the `--tensorrt_batched` flag will allow you to run the model in batched mode using the `TensorRT-LLM` library.\n\nIf you want to run multiple passes of the model, you can use the `--num_passes` flag.\n\n## RankGPT4-o\n\nWe can run the RankGPT4-o model with the following command:\n```bash\nrank-llm rerank --model-path gpt-4o --top-k-candidates 100 --dataset dl20 \\\n  --retrieval-method bm25 --prompt-template-path src\u002Frank_llm\u002Frerank\u002Fprompt_templates\u002Frank_gpt_apeer_template.yaml --context-size 4096 --use-azure-openai\n```\nNote that `--prompt-template-path` is set to the `rank_gpt_apeer` template to use the LLM-refined prompt from [APEER](https:\u002F\u002Farxiv.org\u002Fabs\u002F2406.14449).\nThis can be changed to the `rank_GPT` template to use the original prompt.\n\n## LiT5\n\nWe can run the LiT5-Distill V2 model (which can rerank 100 documents in a single pass) with the following command:\n\n```bash\npython src\u002Frank_llm\u002Fscripts\u002Frun_rank_llm.py  --model_path=castorini\u002FLiT5-Distill-large-v2 --top_k_candidates=100 --dataset=dl19 \\\n        --retrieval_method=bm25 --prompt_template_path=src\u002Frank_llm\u002Frerank\u002Fprompt_templates\u002Frank_fid_template.yaml  --context_size=150 --batch_size=4 \\\n    --variable_passages --window_size=100\n```\n\nWe can run the LiT5-Distill original model (which works with a window size of 20) with the following command:\n\n```bash\npython src\u002Frank_llm\u002Fscripts\u002Frun_rank_llm.py  --model_path=castorini\u002FLiT5-Distill-large --top_k_candidates=100 --dataset=dl19 \\\n    --retrieval_method=bm25 --prompt_template_path=src\u002Frank_llm\u002Frerank\u002Fprompt_templates\u002Frank_fid_template.yaml  --context_size=150 --batch_size=32 \\\n    --variable_passages\n```\n\nWe can run the LiT5-Score model with the following command:\n\n```bash\npython src\u002Frank_llm\u002Fscripts\u002Frun_rank_llm.py  --model_path=castorini\u002FLiT5-Score-large --top_k_candidates=100 --dataset=dl19 \\\n    --retrieval_method=bm25 --prompt_template_path=src\u002Frank_llm\u002Frerank\u002Fprompt_templates\u002Frank_fid_score_template.yaml --context_size=150 --batch_size=8 \\\n    --window_size=100 --variable_passages\n```\n\n## MonoT5\n\nThe following runs the 3B variant of MonoT5 trained for 10K steps:\n\n```bash\npython src\u002Frank_llm\u002Fscripts\u002Frun_rank_llm.py --model_path=castorini\u002Fmonot5-3b-msmarco-10k --top_k_candidates=1000 --dataset=dl19 \\\n    --retrieval_method=bm25 --prompt_template_path=src\u002Frank_llm\u002Frerank\u002Fprompt_templates\u002Fmonot5_template.yaml --context_size=512\n```\n\nNote that we usually rerank 1K candidates with MonoT5.\n
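\nUnder the hood, a pointwise reranker like MonoT5 scores each query-document pair independently: the model is prompted with the pair and the score is the probability assigned to the token `true` at the first decode step. A minimal standalone sketch of that idea, using `transformers` directly rather than RankLLM's internal API (the checkpoint name comes from the model zoo below; the prompt format follows the monoT5 papers):\n\n```python\nimport torch\nfrom transformers import T5ForConditionalGeneration, T5Tokenizer\n\n# Pointwise monoT5 scoring: P(\"true\") at the first decoder position.\ntokenizer = T5Tokenizer.from_pretrained(\"castorini\u002Fmonot5-base-msmarco-10k\")\nmodel = T5ForConditionalGeneration.from_pretrained(\"castorini\u002Fmonot5-base-msmarco-10k\").eval()\nTRUE_ID = tokenizer.encode(\"true\")[0]\nFALSE_ID = tokenizer.encode(\"false\")[0]\n\ndef monot5_score(query: str, doc: str) -> float:\n    prompt = f\"Query: {query} Document: {doc} Relevant:\"\n    inputs = tokenizer(prompt, return_tensors=\"pt\", truncation=True, max_length=512)\n    start = torch.full((1, 1), model.config.decoder_start_token_id)\n    with torch.no_grad():\n        # Logits of the first generated token decide the relevance score.\n        logits = model(**inputs, decoder_input_ids=start).logits[0, 0]\n    return torch.softmax(logits[[TRUE_ID, FALSE_ID]], dim=0)[0].item()\n\nprint(monot5_score(\"what is reranking?\", \"Reranking reorders retrieved documents.\"))\n```\n\nCandidates are then sorted by this score, which is why the pointwise cost grows only linearly with the number of candidates.\n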
\n## MonoELECTRA\n\nThe following runs the MonoELECTRA model:\n\n```bash\npython src\u002Frank_llm\u002Fscripts\u002Frun_rank_llm.py --model_path=monoelectra --top_k_candidates=1000 --dataset=dl19 \\\n    --retrieval_method=bm25 --context_size=512\n```\n\nOr with the full model path:\n\n```bash\npython src\u002Frank_llm\u002Fscripts\u002Frun_rank_llm.py --model_path=castorini\u002Fmonoelectra-base --top_k_candidates=1000 --dataset=dl19 \\\n    --retrieval_method=bm25 --context_size=512\n```\n\nLike MonoT5, we usually rerank 1K candidates with MonoELECTRA.\n\n## DuoT5\nThe following runs the 3B variant of DuoT5 trained for 10K steps:\n```bash\npython src\u002Frank_llm\u002Fscripts\u002Frun_rank_llm.py --model_path=castorini\u002Fduot5-3b-msmarco-10k --top_k_candidates=50 --dataset=dl19 \\\n    --retrieval_method=bm25 --prompt_template_path=src\u002Frank_llm\u002Frerank\u002Fprompt_templates\u002Fduot5_template.yaml\n```\n\nSince Duo's pairwise comparison has $O(n^2)$ runtime complexity (even the top 50 candidates already mean scoring $50 \\times 49 = 2450$ ordered pairs), we recommend reranking only the top 50 candidates using DuoT5 models.\n\n## FirstMistral\n\nWe can run the FirstMistral model, reranking using the first-token logits only, with the following command:\n\n```bash\npython src\u002Frank_llm\u002Fscripts\u002Frun_rank_llm.py  --model_path=castorini\u002Ffirst_mistral --top_k_candidates=100 --dataset=dl20 --retrieval_method=SPLADE++_EnsembleDistil_ONNX --prompt_template_path=src\u002Frank_llm\u002Frerank\u002Fprompt_templates\u002Frank_zephyr_template.yaml  --context_size=4096 --variable_passages --use_logits --use_alpha --num_gpus 1\n```\n\nOmit `--use_logits` if you wish to perform traditional listwise reranking.\n
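\nThe first-token trick works because a listwise prompt asks the model to emit an ordering such as `[2] > [1] > ...`; the logits at the very first generated position already induce a full ranking over the candidate identifiers, so no further decoding is needed. A minimal conceptual sketch with `transformers`, not RankLLM's implementation; it assumes numeric identifiers that tokenize to single tokens, which must be verified for the tokenizer at hand (the `--use_alpha` flag above suggests alphabetic identifiers are also used):\n\n```python\nimport torch\nfrom transformers import AutoModelForCausalLM, AutoTokenizer\n\ntokenizer = AutoTokenizer.from_pretrained(\"castorini\u002Ffirst_mistral\")\nmodel = AutoModelForCausalLM.from_pretrained(\"castorini\u002Ffirst_mistral\").eval()\n\ndef first_token_ranking(listwise_prompt: str, num_candidates: int) -> list[int]:\n    \"\"\"Sort candidates by the logit their identifier gets at the first decode step.\"\"\"\n    input_ids = tokenizer(listwise_prompt, return_tensors=\"pt\").input_ids\n    with torch.no_grad():\n        next_token_logits = model(input_ids).logits[0, -1]\n    # One (hypothetical) single-token id per candidate identifier \"1\"..\"n\".\n    id_tokens = [\n        tokenizer.encode(str(i + 1), add_special_tokens=False)[0]\n        for i in range(num_candidates)\n    ]\n    order = torch.argsort(next_token_logits[id_tokens], descending=True)\n    return [int(i) + 1 for i in order]  # e.g. [2, 1, 3] for three candidates\n```\n\nA single forward pass replaces the whole autoregressive generation of the ranked list, which is where the inference-efficiency gain comes from.\n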
\n## Gemini Flash 2.0\n\nFirst install the Gemini provider extra:\n\n```bash\nuv sync --group dev --extra genai\n# or: pip install -e \".[genai]\"\n```\n\nThen run the following command:\n\n```bash\npython src\u002Frank_llm\u002Fscripts\u002Frun_rank_llm.py  --model_path=gemini-2.0-flash-001 --top_k_candidates=100 --dataset=dl20 \\\n    --retrieval_method=SPLADE++_EnsembleDistil_ONNX --prompt_template_path=src\u002Frank_llm\u002Frerank\u002Fprompt_templates\u002Frank_gpt_apeer_template.yaml  --context_size=4096\n```\n\n\u003Ca id=\"model-zoo\">\u003C\u002Fa>\n# 🦙🐧 Model Zoo\n\nThe following is a table of the listwise models our repository was primarily built to handle (with the models hosted on HuggingFace):\n\n`vLLM`, `SGLang`, and `TensorRT-LLM` backends are only supported for `RankZephyr` and `RankVicuna` models.\n\n| Model Name        | Hugging Face Identifier\u002FLink                            |\n|-------------------|---------------------------------------------|\n| RankZephyr 7B V1 - Full - BF16      | [castorini\u002Frank_zephyr_7b_v1_full](https:\u002F\u002Fhuggingface.co\u002Fcastorini\u002Frank_zephyr_7b_v1_full)               |\n| RankVicuna 7B - V1      | [castorini\u002Frank_vicuna_7b_v1](https:\u002F\u002Fhuggingface.co\u002Fcastorini\u002Frank_vicuna_7b_v1)               |\n| RankVicuna 7B - V1 - No Data Augmentation    | [castorini\u002Frank_vicuna_7b_v1_noda](https:\u002F\u002Fhuggingface.co\u002Fcastorini\u002Frank_vicuna_7b_v1_noda)               |\n| RankVicuna 7B - V1 - FP16      | [castorini\u002Frank_vicuna_7b_v1_fp16](https:\u002F\u002Fhuggingface.co\u002Fcastorini\u002Frank_vicuna_7b_v1_fp16)               |\n| RankVicuna 7B - V1 - No Data Augmentation - FP16   | [castorini\u002Frank_vicuna_7b_v1_noda_fp16](https:\u002F\u002Fhuggingface.co\u002Fcastorini\u002Frank_vicuna_7b_v1_noda_fp16)               |\n\nWe also officially support the following rerankers built by our group:\n\n## LiT5 Suite\n\nThe following is a table specifically for our LiT5 suite of models hosted on HuggingFace:\n\n| Model Name            | 🤗 Hugging Face Identifier\u002FLink                            |\n|-----------------------|---------------------------------------------|\n| LiT5 Distill base     | [castorini\u002FLiT5-Distill-base](https:\u002F\u002Fhuggingface.co\u002Fcastorini\u002FLiT5-Distill-base)          |\n| LiT5 Distill large    | [castorini\u002FLiT5-Distill-large](https:\u002F\u002Fhuggingface.co\u002Fcastorini\u002FLiT5-Distill-large)        |\n| LiT5 Distill xl       | [castorini\u002FLiT5-Distill-xl](https:\u002F\u002Fhuggingface.co\u002Fcastorini\u002FLiT5-Distill-xl)              |\n| LiT5 Distill base v2  | [castorini\u002FLiT5-Distill-base-v2](https:\u002F\u002Fhuggingface.co\u002Fcastorini\u002FLiT5-Distill-base-v2)    |\n| LiT5 Distill large v2 | [castorini\u002FLiT5-Distill-large-v2](https:\u002F\u002Fhuggingface.co\u002Fcastorini\u002FLiT5-Distill-large-v2)  |\n| LiT5 Distill xl v2    | [castorini\u002FLiT5-Distill-xl-v2](https:\u002F\u002Fhuggingface.co\u002Fcastorini\u002FLiT5-Distill-xl-v2)        |\n| LiT5 Score base       | [castorini\u002FLiT5-Score-base](https:\u002F\u002Fhuggingface.co\u002Fcastorini\u002FLiT5-Score-base)              |\n| LiT5 Score large      | [castorini\u002FLiT5-Score-large](https:\u002F\u002Fhuggingface.co\u002Fcastorini\u002FLiT5-Score-large)            |\n| LiT5 Score xl         | [castorini\u002FLiT5-Score-xl](https:\u002F\u002Fhuggingface.co\u002Fcastorini\u002FLiT5-Score-xl)                  |\n\nNow you can run top-100 reranking with the v2 model in a single pass while maintaining efficiency!\n\n## MonoT5 Suite - Pointwise Rerankers\n\nThe following is a table specifically for our monoT5 suite of models hosted on HuggingFace:\n\n| Model Name                        | 🤗 Hugging Face Identifier\u002FLink                            |\n|-----------------------------------|--------------------------------------------------------|\n| monoT5 Small MSMARCO 10K          | [castorini\u002Fmonot5-small-msmarco-10k](https:\u002F\u002Fhuggingface.co\u002Fcastorini\u002Fmonot5-small-msmarco-10k)       |\n| monoT5 Small MSMARCO 100K         | [castorini\u002Fmonot5-small-msmarco-100k](https:\u002F\u002Fhuggingface.co\u002Fcastorini\u002Fmonot5-small-msmarco-100k)     |\n| monoT5 Base MSMARCO               | [castorini\u002Fmonot5-base-msmarco](https:\u002F\u002Fhuggingface.co\u002Fcastorini\u002Fmonot5-base-msmarco)                 |\n| monoT5 Base MSMARCO 10K           | [castorini\u002Fmonot5-base-msmarco-10k](https:\u002F\u002Fhuggingface.co\u002Fcastorini\u002Fmonot5-base-msmarco-10k)         |\n| monoT5 Large MSMARCO 10K          | [castorini\u002Fmonot5-large-msmarco-10k](https:\u002F\u002Fhuggingface.co\u002Fcastorini\u002Fmonot5-large-msmarco-10k)       |\n| monoT5 Large MSMARCO              | [castorini\u002Fmonot5-large-msmarco](https:\u002F\u002Fhuggingface.co\u002Fcastorini\u002Fmonot5-large-msmarco)               |\n| monoT5 3B MSMARCO 10K             | 
[castorini\u002Fmonot5-3b-msmarco-10k](https:\u002F\u002Fhuggingface.co\u002Fcastorini\u002Fmonot5-3b-msmarco-10k)             |\n| monoT5 3B MSMARCO                 | [castorini\u002Fmonot5-3b-msmarco](https:\u002F\u002Fhuggingface.co\u002Fcastorini\u002Fmonot5-3b-msmarco)                     |\n| monoT5 Base Med MSMARCO           | [castorini\u002Fmonot5-base-med-msmarco](https:\u002F\u002Fhuggingface.co\u002Fcastorini\u002Fmonot5-base-med-msmarco)         |\n| monoT5 3B Med MSMARCO             | [castorini\u002Fmonot5-3b-med-msmarco](https:\u002F\u002Fhuggingface.co\u002Fcastorini\u002Fmonot5-3b-med-msmarco)             |\n\nWe recommend the Med models for biomedical retrieval. We also provide both 10K (generally better OOD effectiveness) and 100K checkpoints (better in-domain).\n# Training\nPlease check the `training` directory for finetuning open-source listwise rerankers.\n# External Integrations\nRankLLM is integrated into many popular toolkits such as LlamaIndex, rerankers, and LangChain. For usage of RankLLM in those toolkits and examples, please check the external integrations [README](docs\u002Fexternal-integrations.md).\n# Community Contribution\nIf you would like to contribute to the project, please refer to the [contribution guidelines](CONTRIBUTING.md).\n\n## 📜️ Release History\n\n+ v0.25.7: August 25, 2025 [[Release Notes](docs\u002Frelease-notes\u002Frelease-notes-v0.25.7.md)]\n+ v0.25.6: August 5, 2025 [[Release Notes](docs\u002Frelease-notes\u002Frelease-notes-v0.25.6.md)]\n+ v0.25.0: July 23, 2025 [[Release Notes](docs\u002Frelease-notes\u002Frelease-notes-v0.25.0.md)]\n\n\u003Ca id=references>\u003C\u002Fa>\n# ✨ References\n\nIf you use RankLLM, please cite the following relevant papers:\n\n[[2505.19284] RankLLM: A Python Package for Reranking with LLMs](https:\u002F\u002Fdl.acm.org\u002Fdoi\u002F10.1145\u002F3726302.3730331)\n\n\u003C!-- {% raw %} -->\n```\n@inproceedings{sharifymoghaddam2025rankllm,\nauthor = {Sharifymoghaddam, Sahel and Pradeep, Ronak and Slavescu, Andre and Nguyen, Ryan and Xu, Andrew and Chen, Zijian and Zhang, Yilin and Chen, Yidi and Xian, Jasper and Lin, Jimmy},\ntitle = {{RankLLM}: A Python Package for Reranking with LLMs},\nyear = {2025},\nisbn = {9798400715921},\npublisher = {Association for Computing Machinery},\naddress = {New York, NY, USA},\nbooktitle = {Proceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval},\npages = {3681–3690},\nnumpages = {10},\nkeywords = {information retrieval, large language models, python, reranking},\nlocation = {Padua, Italy},\nseries = {SIGIR '25}\n}\n```\n\u003C!-- {% endraw %} -->\n\n[[2309.15088] RankVicuna: Zero-Shot Listwise Document Reranking with Open-Source Large Language Models](https:\u002F\u002Farxiv.org\u002Fabs\u002F2309.15088)\n\n\u003C!-- {% raw %} -->\n```\n@ARTICLE{pradeep2023rankvicuna,\n  title   = {{RankVicuna}: Zero-Shot Listwise Document Reranking with Open-Source Large Language Models},\n  author  = {Ronak Pradeep and Sahel Sharifymoghaddam and Jimmy Lin},\n  year    = {2023},\n  journal = {arXiv:2309.15088}\n}\n```\n\u003C!-- {% endraw %} -->\n\n\n[[2312.02724] RankZephyr: Effective and Robust Zero-Shot Listwise Reranking is a Breeze!](https:\u002F\u002Farxiv.org\u002Fabs\u002F2312.02724)\n\n\u003C!-- {% raw %} -->\n```\n@ARTICLE{pradeep2023rankzephyr,\n  title   = {{RankZephyr}: Effective and Robust Zero-Shot Listwise Reranking is a Breeze!},\n  author  = {Ronak Pradeep and Sahel Sharifymoghaddam and Jimmy Lin},\n  year    = 
{2023},\n  journal = {arXiv:2312.02724}\n}\n```\n\u003C!-- {% endraw %} -->\n\nIf you use one of the LiT5 models, please cite the following relevant paper:\n\n[[2312.16098] Scaling Down, LiTting Up: Efficient Zero-Shot Listwise Reranking with Seq2seq Encoder-Decoder Models](https:\u002F\u002Farxiv.org\u002Fabs\u002F2312.16098)\n\n```\n@ARTICLE{tamber2023scaling,\n  title   = {Scaling Down, LiTting Up: Efficient Zero-Shot Listwise Reranking with Seq2seq Encoder-Decoder Models},\n  author  = {Manveer Singh Tamber and Ronak Pradeep and Jimmy Lin},\n  year    = {2023},\n  journal = {arXiv:2312.16098}\n}\n```\n\nIf you use one of the monoT5 models, please cite the following relevant paper:\n\n[[2101.05667] The Expando-Mono-Duo Design Pattern for Text Ranking with Pretrained Sequence-to-Sequence Models](https:\u002F\u002Farxiv.org\u002Fabs\u002F2101.05667)\n\n```\n@ARTICLE{pradeep2021emd,\n  title = {The Expando-Mono-Duo Design Pattern for Text Ranking with Pretrained Sequence-to-Sequence Models},\n  author = {Ronak Pradeep and Rodrigo Nogueira and Jimmy Lin},\n  year = {2021},\n  journal = {arXiv:2101.05667},\n}\n```\n\n\nIf you use the monoELECTRA model, please consider citing:\n\n[Squeezing Water from a Stone: A Bag of Tricks for Further Improving Cross-Encoder Effectiveness for Reranking](https:\u002F\u002Fcs.uwaterloo.ca\u002F~jimmylin\u002Fpublications\u002FPradeep_etal_ECIR2022.pdf)\n```\n@inproceedings{pradeep2022monoelectra,\n  author = {Pradeep, Ronak and Liu, Yuqi and Zhang, Xinyu and Li, Yilin and Yates, Andrew and Lin, Jimmy},\n  title = {Squeezing Water from a Stone: A Bag of Tricks for Further Improving Cross-Encoder Effectiveness for Reranking},\n  year = {2022},\n  publisher = {Springer-Verlag},\n  address = {Berlin, Heidelberg},\n  booktitle = {Advances in Information Retrieval: 44th European Conference on IR Research, ECIR 2022, Stavanger, Norway, April 10–14, 2022, Proceedings, Part I},\n  pages = {655–670},\n  numpages = {16},\n  location = {Stavanger, Norway}\n}\n```\n\nIf you use the FirstMistral model, please consider citing:\n\n[[2411.05508] An Early FIRST Reproduction and Improvements to Single-Token Decoding for Fast Listwise Reranking](https:\u002F\u002Farxiv.org\u002Fabs\u002F2411.05508)\n\n```\n@ARTICLE{chen2024firstrepro,\n  title   = {An Early FIRST Reproduction and Improvements to Single-Token Decoding for Fast Listwise Reranking},\n  author  = {Zijian Chen and Ronak Pradeep and Jimmy Lin},\n  year    = {2024},\n  journal = {arXiv:2411.05508}\n}\n```\n\nIf you would like to cite the FIRST methodology, please consider citing:\n\n[[2406.15657] FIRST: Faster Improved Listwise Reranking with Single Token Decoding](https:\u002F\u002Farxiv.org\u002Fabs\u002F2406.15657)\n\n```\n@ARTICLE{reddy2024first,\n  title   = {FIRST: Faster Improved Listwise Reranking with Single Token Decoding},\n  author  = {Reddy, Revanth Gangi and Doo, JaeHyeok and Xu, Yifei and Sultan, Md Arafat and Swain, Deevya and Sil, Avirup and Ji, Heng},\n  year    = {2024},\n  journal = {arXiv:2406.15657}\n}\n```\n\u003Ca id=acknowledgments>\u003C\u002Fa>\n# 🙏 Acknowledgments\n\nThis research is supported in part by the Natural Sciences and Engineering Research Council (NSERC) of Canada.\n","# 
RankLLM\n\n[![PyPI](https:\u002F\u002Fimg.shields.io\u002Fpypi\u002Fv\u002Frank-llm?color=brightgreen)](https:\u002F\u002Fpypi.org\u002Fproject\u002Frank-llm\u002F)\n[![Downloads](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcastorini_rank_llm_readme_a180cff68274.png)](https:\u002F\u002Fpepy.tech\u002Fproject\u002Frank-llm)\n[![Downloads](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcastorini_rank_llm_readme_95813fd56d2d.png)](https:\u002F\u002Fpepy.tech\u002Fproject\u002Frank-llm)\n[![Generic badge](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2309.15088-red.svg)](https:\u002F\u002Farxiv.org\u002Fabs\u002F2309.15088)\n[![LICENSE](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Flicense-Apache-blue.svg?style=flat)](https:\u002F\u002Fwww.apache.org\u002Flicenses\u002FLICENSE-2.0)\n\n## 新闻\n- **[2026.03.26]** RankLLM 现在支持新的 `rank-llm` 命令行界面 (CLI)。\n- **[2025.08.25]** 添加了对 OpenRouter API 的支持 - 发布 [v0.25.7](docs\u002Frelease-notes\u002Frelease-notes-v0.25.7.md)\n- **[2025.07.23]** 添加了使用 YAML 文件自定义提示模板的支持 - 发布 [v0.25.0](docs\u002Frelease-notes\u002Frelease-notes-v0.25.0.md)。现在只需几行代码即可集成您自己的提示和语言模型。请查看 [Reasonrank 集成](https:\u002F\u002Fgithub.com\u002Fcastorini\u002Frank_llm\u002Fpull\u002F306) 作为示例。\n- **[2025.05.25]** 我们的 [RankLLM](https:\u002F\u002Fdl.acm.org\u002Fdoi\u002Fpdf\u002F10.1145\u002F3726302.3730331) 资源论文已被 SIGIR 2025 接受！🎉🎉🎉\n\n## 概述\n我们提供一系列重排序器——包括单点模型如 MonoT5、成对模型如 DuoT5，以及以开源 LLM 为重点的列表式模型，这些模型与 [vLLM](https:\u002F\u002Fgithub.com\u002Fvllm-project\u002Fvllm)、[SGLang](https:\u002F\u002Fgithub.com\u002Fsgl-project\u002Fsglang) 或 [TensorRT-LLM](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FTensorRT-LLM) 兼容。我们还支持 RankGPT 和 RankGemini 变体，它们是专有的列表式重排序器。此外，我们还支持仅使用第一个 token 的 logits 进行重排序，以提高推理效率。本仓库中的部分代码借鉴自 [RankGPT](https:\u002F\u002Fgithub.com\u002Fsunnweiwei\u002FRankGPT)、[PyGaggle](https:\u002F\u002Fgithub.com\u002Fcastorini\u002Fpygaggle) 和 [LiT5](https:\u002F\u002Fgithub.com\u002Fcastorini\u002FLiT5)！\n\n\u003Cp align=\"center\">\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcastorini_rank_llm_readme_938105b3d4e6.png\" alt=\"RankLLM Overview\" style=\"width:95%;\">\n\u003C\u002Fp>\n\n## 发布版本\ncurrent_version = \"0.25.7\"\n\n## 内容\n1. [安装](#installation)\n2. [快速入门](#quick-start)\n3. [端到端运行与 2CR](#end-to-end-run-and-2cr)\n4. [模型库](#model-zoo)\n5. [训练](#training)\n6. [社区贡献](#community-contribution)\n7. [参考文献与引用](#references)\n8. 
[致谢](#acknowledgments)\n\n\u003Ca id=\"installation\">\u003C\u002Fa>\n# 📟 安装\n\n`uv` 是本仓库的标准贡献工作流。现有的 `conda` 和 `pip` 路径仍然可用作备用。\n\n## 安装 `uv`\n\n使用 Astral 的官方安装程序安装 `uv`：\n\n```bash\ncurl -LsSf https:\u002F\u002Fastral.sh\u002Fuv\u002Finstall.sh | sh\nexport PATH=\"$HOME\u002F.local\u002Fbin:$PATH\"\n```\n\n## 先决条件\n\n- 仅当您计划通过 `rank-llm[pyserini]` 使用检索或评估流程时，才需要安装 Java 21。不支持 JDK 11。\n- 如果您希望获得比默认 Python 包解析更优的 GPU 优化构建版本，则需单独安装 CUDA 特定的 PyTorch wheel。\n\n## 开发环境安装\n\n对于开发或获取最新功能，请创建一个仓库本地的虚拟环境：\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002Fcastorini\u002Frank_llm.git\ncd rank_llm\nuv python install 3.11\nuv venv --python 3.11\nsource .venv\u002Fbin\u002Factivate\nuv sync --group dev\n```\n\n如果您不想激活虚拟环境，可以通过 `uv run` 来执行命令，例如 `uv run python -m unittest discover test`。\n\n## 可选扩展\n\n仅安装您需要的堆栈：\n\n```bash\nuv sync --group dev --extra \u003Cextra>\n```\n\n将 `\u003Cextra>` 替换为下方功能矩阵中的其中一个扩展。您可以重复使用 `--extra` 来在一个环境中组合多个堆栈。\n\n### 功能矩阵\n\n| 工作流程 | 扩展 | 备注 |\n| --- | --- | --- |\n| 托管的 OpenAI 或 OpenRouter 重排序器 | `openai` | 包含 `python-dotenv` 和 `tiktoken` |\n| 托管的 Gemini 重排序器 | `genai` | `gemini` 是别名 |\n| 所有托管提供商的重排序器 | `cloud` | 安装 `openai` 和 `genai` |\n| 本地的 Hugging Face 和 PyTorch 重排序器 | `local` | 安装 `torch` 和 `transformers`，用于 MonoT5、DuoT5、MonoELECTRA 及相关本地路径 |\n| Pyserini 检索和评估 | `pyserini` | 需要 Java 21 |\n| 轻量级 HTTP API 依赖 | `api` | 安装 FastAPI、Flask 和 Uvicorn，而不包含较重的检索或推理堆栈 |\n| MCP 服务器依赖 | `mcp` | 引入打包好的 `serve mcp` 依赖集，包括 Pyserini 和模型服务后端 |\n| 使用 vLLM 的开源模型进行列表式重排序 | `vllm` | 基于 `local` 并添加 vLLM 后端 |\n| 批量 SGLang 推理 | `sglang` | 必要时需单独安装 `flashinfer` |\n| 批量 TensorRT-LLM 推理 | `tensorrt-llm` | 必要时需单独安装 `flash-attn` |\n| 完整的 HTTP 和 MCP 服务器捆绑包 | `server` | `api` 和 `mcp` 扩展的集合 |\n| 微调和训练脚本 | `training` | 将仅用于训练的依赖项排除在基础安装之外 |\n| 全部 | `all` | 所有扩展的集合 |\n\n### PyPI 安装\n\n创建一个隔离的虚拟环境并安装已发布的包：\n\n```bash\nuv venv --python 3.11\nsource .venv\u002Fbin\u002Factivate\nuv pip install rank-llm\n```\n\n### 备用 `conda` \u002F `pip` 流程\n\n如果您想继续使用 conda：\n\n```bash\nconda create -n rankllm python=3.11 -c conda-forge -y\nconda activate rankllm\npip install -e .\n```\n\n然后安装您需要的可选堆栈，例如：\n\n```bash\npip install -e \".[\u003Cextra>]\"\n```\n\n将 `\u003Cextra>` 替换为下方功能矩阵中的其中一个扩展。您可以根据需要组合多个扩展，例如 `pip install -e \".[openai,api]\"`。\n\n请记住，当这些堆栈需要时，务必为 `sglang` 后端安装 `flashinfer`，为优化的 TensorRT-LLM 或训练流程安装 `flash-attn`。\n\n```bash\npip install flashinfer -i https:\u002F\u002Fflashinfer.ai\u002Fwhl\u002Fcu121\u002Ftorch2.4\u002F\npip install flash-attn --no-build-isolation\n```\n\n\u003Ca id=\"quick-start\">\u003C\u002Fa>\n\n# ⏳ 快速入门\n打包好的 `rank-llm` 命令是本仓库的标准命令行界面。\n`src\u002Frank_llm\u002Fscripts\u002F` 目录下的旧版脚本仍然可用，但现在它们只是对同一命令行界面的兼容性封装。\n\n```bash\nrank-llm rerank --model-path castorini\u002Frank_zephyr_7b_v1_full --dataset dl20 \\\n  --retrieval-method bm25 --top-k-candidates 100\n\nrank-llm prompt list\nrank-llm view demo_outputs\u002Frerank_results.jsonl\nrank-llm evaluate --model-name castorini\u002Frank_zephyr_7b_v1_full\nrank-llm serve http --model-path castorini\u002Frank_zephyr_7b_v1_full --port 8082\nrank-llm serve mcp --transport stdio\n```\n\n以下代码片段是对从 `DL19` 查询中检索、重排序、评估以及对前100篇检索文档进行调用分析的最小化演示。在该示例中，使用 `BM25` 作为检索器，`RankZephyr` 作为重排序器。更多示例片段可在 `src\u002Frank_llm\u002Fdemo` 目录下运行。\n```python\nfrom pathlib import Path\n\nfrom rank_llm.analysis.response_analysis import ResponseAnalyzer\nfrom rank_llm.data import DataWriter\nfrom rank_llm.evaluation.trec_eval import EvalFunction\nfrom rank_llm.rerank import Reranker, get_openai_api_key\nfrom rank_llm.rerank.listwise import (\n    SafeOpenai,\n 
   VicunaReranker,\n    ZephyrReranker,\n)\nfrom rank_llm.retrieve.retriever import RetrievalMethod, Retriever\nfrom rank_llm.retrieve.topics_dict import TOPICS\n\n# -------- 检索 --------\n\n# 默认使用 BM25 检索前100个候选文档。\ndataset_name = \"dl19\"\nretrieved_results = Retriever.from_dataset_with_prebuilt_index(dataset_name)\n\n# 用户可以指定其他检索方法和检索的候选文档数量。\n# retrieved_results = Retriever.from_dataset_with_prebuilt_index(\n#     dataset_name, RetrievalMethod.SPLADE_P_P_ENSEMBLE_DISTIL, k=50\n# )\n# ---------------------------\n\n# --------- 重排序 ----------\n\n# Rank Zephyr 模型\nreranker = ZephyrReranker()\n\n# Rank Vicuna 模型\n# reranker = VicunaReranker()\n\n# RankGPT\n# model_coordinator = SafeOpenai(\"gpt-4o-mini\", 4096, keys=get_openai_api_key())\n# reranker = Reranker(model_coordinator)\n\nkwargs = {\"populate_invocations_history\": True}\nrerank_results = reranker.rerank_batch(requests=retrieved_results, **kwargs)\n# ---------------------------\n\n# ------- 评估 --------\n\n# 评估检索结果。\ntopics = TOPICS[dataset_name]\nndcg_10_retrieved = EvalFunction.from_results(retrieved_results, topics)\nprint(ndcg_10_retrieved)\n\n# 评估重排序结果。\nndcg_10_rerank = EvalFunction.from_results(rerank_results, topics)\nprint(ndcg_10_rerank)\n\n# 默认评估指标为 ndcg@10，也可以指定其他值：\n# eval_args = [\"-c\", \"-m\", \"map_cut.100\", \"-l2\"]\n# map_100_rerank = EvalFunction.from_results(rerank_results, topics, eval_args)\n# print(map_100_rerank)\n\n# eval_args = [\"-c\", \"-m\", \"recall.20\"]\n# recall_20_rerank = EvalFunction.from_results(rerank_results, topics, eval_args)\n# print(recall_20_rerank)\n\n# ---------------------------\n\n# --- 分析调用情况 ---\nanalyzer = ResponseAnalyzer.from_inline_results(rerank_results)\nerror_counts = analyzer.count_errors(verbose=True)\nprint(error_counts)\n# ---------------------------\n\n# ------ 保存结果 -------\nwriter = DataWriter(rerank_results)\nPath(f\"demo_outputs\u002F\").mkdir(parents=True, exist_ok=True)\nwriter.write_in_jsonl_format(f\"demo_outputs\u002Frerank_results.jsonl\")\nwriter.write_in_trec_eval_format(f\"demo_outputs\u002Frerank_results.txt\")\nwriter.write_inference_invocations_history(\n    f\"demo_outputs\u002Finference_invocations_history.json\"\n)\n# ---------------------------\n```\n\n# 端到端运行与2CR\n如果您有兴趣端到端地运行检索和重排序，或复现【参考文献】中的结果，`rank-llm rerank` 是标准命令。`run_rank_llm.py` 仍作为旧版自动化工具的兼容性封装而保留。\n\n我们两步即可复现的完整命令列表，分别在 [MS MARCO V1](https:\u002F\u002Fcastorini.github.io\u002Frank_llm\u002Fsrc\u002Frank_llm\u002F2cr\u002Fmsmarco-v1-passage.html) 和 [MS MARCO V2](https:\u002F\u002Fcastorini.github.io\u002Frank_llm\u002Fsrc\u002Frank_llm\u002F2cr\u002Fmsmarco-v2-passage.html) 页面上提供，对应 DL19、DL20 和 DL21-23 数据集。未来，我们计划在 2CR 页面上覆盖更多数据集和检索器。本节剩余内容将提供一些端到端运行的示例。\n## RankZephyr\n\n我们可以使用以下命令运行 RankZephyr 模型：\n```bash\nrank-llm rerank --model-path castorini\u002Frank_zephyr_7b_v1_full --top-k-candidates 100 --dataset dl20 \\\n--retrieval-method SPLADE++_EnsembleDistil_ONNX --prompt-template-path src\u002Frank_llm\u002Frerank\u002Fprompt_templates\u002Frank_zephyr_template.yaml --context-size 4096 --variable-passages\n```\n\n加入 `--sglang_batched` 标志后，您就可以使用 `SGLang` 库以批处理模式运行模型。\n\n加入 `--tensorrt_batched` 标志后，您就可以使用 `TensorRT-LLM` 库以批处理模式运行模型。\n\n如果您希望多次运行模型，可以使用 `--num_passes` 标志。\n\n## RankGPT4-o\n\n我们可以使用以下命令运行 RankGPT4-o 模型：\n```bash\nrank-llm rerank --model-path gpt-4o --top-k-candidates 100 --dataset dl20 \\\n  --retrieval-method bm25 --prompt-template-path src\u002Frank_llm\u002Frerank\u002Fprompt_templates\u002Frank_gpt_apeer_template.yaml --context-size 4096 
--use-azure-openai\n```\n请注意，`--prompt_template_path` 已设置为 `rank_gpt_apeer`，以使用来自 [APEER](https:\u002F\u002Farxiv.org\u002Fabs\u002F2406.14449) 的 LLM 优化提示。此路径也可更改为 `rank_GPT` 以使用原始提示。\n\n## LiT5\n\n我们可以使用以下命令运行 LiT5-Distill V2 模型（该模型可以在一次运行中对100篇文档进行重排序）：\n\n```bash\npython src\u002Frank_llm\u002Fscripts\u002Frun_rank_llm.py  --model_path=castorini\u002FLiT5-Distill-large-v2 --top_k_candidates=100 --dataset=dl19 \\\n        --retrieval_method=bm25 --prompt_template_path=src\u002Frank_llm\u002Frerank\u002Fprompt_templates\u002Frank_fid_template.yaml  --context_size=150 --batch_size=4 \\\n    --variable_passages --window_size=100\n```\n\n我们可以使用以下命令运行 LiT5-Distill 原始模型（该模型的工作窗口大小为20）：\n\n```bash\npython src\u002Frank_llm\u002Fscripts\u002Frun_rank_llm.py  --model_path=castorini\u002FLiT5-Distill-large --top_k_candidates=100 --dataset=dl19 \\\n    --retrieval_method=bm25 --prompt_template_path=src\u002Frank_llm\u002Frerank\u002Fprompt_templates\u002Frank_fid_template.yaml  --context_size=150 --batch_size=32 \\\n    --variable_passages\n```\n\n我们还可以使用以下命令运行 LiT5-Score 模型：\n\n```bash\npython src\u002Frank_llm\u002Fscripts\u002Frun_rank_llm.py  --model_path=castorini\u002FLiT5-Score-large --top_k_candidates=100 --dataset=dl19 \\\n    --retrieval_method=bm25 --prompt_template_path=src\u002Frank_llm\u002Frerank\u002Fprompt_templates\u002Frank_fid_score_template.yaml --context_size=150 --batch_size=8 \\\n    --window_size=100 --variable_passages\n```\n\n## MonoT5\n\n以下命令运行经过1万步训练的MonoT5 3B版本：\n\n```bash\npython src\u002Frank_llm\u002Fscripts\u002Frun_rank_llm.py --model_path=castorini\u002Fmonot5-3b-msmarco-10k --top_k_candidates=1000 --dataset=dl19 \\\n    --retrieval_method=bm25 --prompt_template_path=src\u002Frank_llm\u002Frerank\u002Fprompt_templates\u002Fmonot5_template.yaml --context_size=512\n```\n\n请注意，我们通常使用MonoT5对1000个候选结果进行重排序。\n\n## MonoELECTRA\n\n以下命令运行MonoELECTRA模型：\n\n```bash\npython src\u002Frank_llm\u002Fscripts\u002Frun_rank_llm.py --model_path=monoelectra --top_k_candidates=1000 --dataset=dl19 \\\n    --retrieval_method=bm25 --context_size=512\n```\n\n或者使用完整的模型路径：\n\n```bash\npython src\u002Frank_llm\u002Fscripts\u002Frun_rank_llm.py --model_path=castorini\u002Fmonoelectra-base --top_k_candidates=1000 --dataset=dl19 \\\n    --retrieval_method=bm25 --context_size=512\n```\n\n与MonoT5类似，我们通常使用MonoELECTRA对1000个候选结果进行重排序。\n\n## DuoT5\n以下命令运行经过1万步训练的DuoT5 3B版本：\n```bash\npython src\u002Frank_llm\u002Fscripts\u002Frun_rank_llm.py --model_path=castorini\u002Fduot5-3b-msmarco-10k --top_k_candidates=50 --dataset=dl19 \\\n    --retrieval_method=bm25 --prompt_template_path=src\u002Frank_llm\u002Frerank\u002Fprompt_templates\u002Fduot5_template.yaml\n```\n\n由于Duo的成对比较具有$O(n^2)$的时间复杂度（仅前50名候选就已产生 $50 \\times 49 = 2450$ 个需要打分的有序对），我们建议使用DuoT5模型对前50名候选结果进行重排序。\n\n## FirstMistral\n\n我们可以使用以下命令运行FirstMistral模型，并仅使用第一个token的logits进行重排序：\n\n```bash\npython src\u002Frank_llm\u002Fscripts\u002Frun_rank_llm.py  --model_path=castorini\u002Ffirst_mistral --top_k_candidates=100 --dataset=dl20 --retrieval_method=SPLADE++_EnsembleDistil_ONNX --prompt_template_path=src\u002Frank_llm\u002Frerank\u002Fprompt_templates\u002Frank_zephyr_template.yaml  --context_size=4096 --variable_passages --use_logits --use_alpha --num_gpus 1\n```\n\n如果您希望执行传统的列表式重排序，请省略`--use_logits`。\n\n## Gemini Flash 2.0\n\n首先安装Gemini提供者的额外依赖：\n\n```bash\nuv sync --group dev --extra genai\n# 或：pip install -e \".[genai]\"\n```\n\n然后运行以下命令：\n\n```bash\npython src\u002Frank_llm\u002Fscripts\u002Frun_rank_llm.py  --model_path=gemini-2.0-flash-001 --top_k_candidates=100 --dataset=dl20 \\\n    
--retrieval_method=SPLADE++_EnsembleDistil_ONNX --prompt_template_path=src\u002Frank_llm\u002Frerank\u002Fprompt_templates\u002Frank_gpt_apeer_template.yaml  --context_size=4096\n```\n\n\u003Ca id=\"model-zoo\">\u003C\u002Fa>\n# 🦙🐧 模型库\n\n以下是本仓库主要支持的列表式模型表格（模型托管在Hugging Face上）：\n\n`vLLM`、`SGLang`和`TensorRT-LLM`后端仅支持`RankZephyr`和`RankVicuna`模型。\n\n| 模型名称        | Hugging Face 标识符\u002F链接                            |\n|-------------------|---------------------------------------------|\n| RankZephyr 7B V1 - Full - BF16      | [castorini\u002Frank_zephyr_7b_v1_full](https:\u002F\u002Fhuggingface.co\u002Fcastorini\u002Frank_zephyr_7b_v1_full)               |\n| RankVicuna 7B - V1      | [castorini\u002Frank_vicuna_7b_v1](https:\u002F\u002Fhuggingface.co\u002Fcastorini\u002Frank_vicuna_7b_v1)               |\n| RankVicuna 7B - V1 - 无数据增强    | [castorini\u002Frank_vicuna_7b_v1_noda](https:\u002F\u002Fhuggingface.co\u002Fcastorini\u002Frank_vicuna_7b_v1_noda)               |\n| RankVicuna 7B - V1 - FP16      | [castorini\u002Frank_vicuna_7b_v1_fp16](https:\u002F\u002Fhuggingface.co\u002Fcastorini\u002Frank_vicuna_7b_v1_fp16)               |\n| RankVicuna 7B - V1 - 无数据增强 - FP16   | [castorini\u002Frank_vicuna_7b_v1_noda_fp16](https:\u002F\u002Fhuggingface.co\u002Fcastorini\u002Frank_vicuna_7b_v1_noda_fp16)               |\n\n我们还正式支持由我们团队构建的以下重排序器：\n\n## LiT5 系列\n\n以下是专门针对我们LiT5系列模型的表格，这些模型托管在Hugging Face上：\n\n| 模型名称            | 🤗 Hugging Face 标识符\u002F链接                            |\n|-----------------------|---------------------------------------------|\n| LiT5 Distill base     | [castorini\u002FLiT5-Distill-base](https:\u002F\u002Fhuggingface.co\u002Fcastorini\u002FLiT5-Distill-base)          |\n| LiT5 Distill large    | [castorini\u002FLiT5-Distill-large](https:\u002F\u002Fhuggingface.co\u002Fcastorini\u002FLiT5-Distill-large)        |\n| LiT5 Distill xl       | [castorini\u002FLiT5-Distill-xl](https:\u002F\u002Fhuggingface.co\u002Fcastorini\u002FLiT5-Distill-xl)              |\n| LiT5 Distill base v2  | [castorini\u002FLiT5-Distill-base-v2](https:\u002F\u002Fhuggingface.co\u002Fcastorini\u002FLiT5-Distill-base-v2)    |\n| LiT5 Distill large v2 | [castorini\u002FLiT5-Distill-large-v2](https:\u002F\u002Fhuggingface.co\u002Fcastorini\u002FLiT5-Distill-large-v2)  |\n| LiT5 Distill xl v2    | [castorini\u002FLiT5-Distill-xl-v2](https:\u002F\u002Fhuggingface.co\u002Fcastorini\u002FLiT5-Distill-xl-v2)        |\n| LiT5 Score base       | [castorini\u002FLiT5-Score-base](https:\u002F\u002Fhuggingface.co\u002Fcastorini\u002FLiT5-Score-base)              |\n| LiT5 Score large      | [castorini\u002FLiT5-Score-large](https:\u002F\u002Fhuggingface.co\u002Fcastorini\u002FLiT5-Score-large)            |\n| LiT5 Score xl         | [castorini\u002FLiT5-Score-xl](https:\u002F\u002Fhuggingface.co\u002Fcastorini\u002FLiT5-Score-xl)                  |\n\n现在您可以使用v2版本的模型在单次通过中高效地完成前100名的重排序！\n\n## MonoT5 系列 - 点式重排序器\n\n以下是专门针对我们monoT5系列模型的表格，这些模型托管在Hugging Face上：\n\n| 模型名称                        | 🤗 Hugging Face 标识符\u002F链接                            |\n|-----------------------------------|--------------------------------------------------------|\n| monoT5 Small MSMARCO 10K          | [castorini\u002Fmonot5-small-msmarco-10k](https:\u002F\u002Fhuggingface.co\u002Fcastorini\u002Fmonot5-small-msmarco-10k)       |\n| monoT5 Small MSMARCO 100K         | [castorini\u002Fmonot5-small-msmarco-100k](https:\u002F\u002Fhuggingface.co\u002Fcastorini\u002Fmonot5-small-msmarco-100k)     |\n| monoT5 Base MSMARCO               | 
[castorini\u002Fmonot5-base-msmarco](https:\u002F\u002Fhuggingface.co\u002Fcastorini\u002Fmonot5-base-msmarco)                 |\n| monoT5 Base MSMARCO 10K           | [castorini\u002Fmonot5-base-msmarco-10k](https:\u002F\u002Fhuggingface.co\u002Fcastorini\u002Fmonot5-base-msmarco-10k)         |\n| monoT5 Large MSMARCO 10K          | [castorini\u002Fmonot5-large-msmarco-10k](https:\u002F\u002Fhuggingface.co\u002Fcastorini\u002Fmonot5-large-msmarco-10k)       |\n| monoT5 Large MSMARCO              | [castorini\u002Fmonot5-large-msmarco](https:\u002F\u002Fhuggingface.co\u002Fcastorini\u002Fmonot5-large-msmarco)               |\n| monoT5 3B MSMARCO 10K             | [castorini\u002Fmonot5-3b-msmarco-10k](https:\u002F\u002Fhuggingface.co\u002Fcastorini\u002Fmonot5-3b-msmarco-10k)             |\n| monoT5 3B MSMARCO                 | [castorini\u002Fmonot5-3b-msmarco](https:\u002F\u002Fhuggingface.co\u002Fcastorini\u002Fmonot5-3b-msmarco)                     |\n| monoT5 Base Med MSMARCO           | [castorini\u002Fmonot5-base-med-msmarco](https:\u002F\u002Fhuggingface.co\u002Fcastorini\u002Fmonot5-base-med-msmarco)         |\n| monoT5 3B Med MSMARCO             | [castorini\u002Fmonot5-3b-med-msmarco](https:\u002F\u002Fhuggingface.co\u002Fcastorini\u002Fmonot5-3b-med-msmarco)             |\n\n我们建议在生物医学检索任务中使用Med系列模型。我们同时提供1万步（通常在OOD场景下效果更好）和10万步（在域内表现更佳）的检查点。\n\n# 训练\n请查看 `training` 目录，了解如何微调开源的列表式重排序模型。\n# 外部集成\nRankLLM 已被集成到许多流行的工具包中，如 LlamaIndex、rerankers 和 LangChain。有关在这些工具包中使用 RankLLM 的方法及示例，请参阅外部集成 [README](docs\u002Fexternal-integrations.md)。\n# 社区贡献\n如果您希望为该项目做出贡献，请参阅 [贡献指南](CONTRIBUTING.md)。\n\n## 📜️ 发布历史\n\n+ v0.25.7：2025年8月25日 [[发布说明](docs\u002Frelease-notes\u002Frelease-notes-v0.25.7.md)]\n+ v0.25.6：2025年8月5日 [[发布说明](docs\u002Frelease-notes\u002Frelease-notes-v0.25.6.md)]\n+ v0.25.0：2025年7月23日 [[发布说明](docs\u002Frelease-notes\u002Frelease-notes-v0.25.0.md)]\n\n\u003Ca id=references>\u003C\u002Fa>\n# ✨ 参考文献\n\n如果您使用 RankLLM，请引用以下相关论文：\n\n[[2505.19284] RankLLM：用于基于 LLM 进行重排序的 Python 包](https:\u002F\u002Fdl.acm.org\u002Fdoi\u002F10.1145\u002F3726302.3730331)\n\n\u003C!-- {% raw %} -->\n```\n@inproceedings{sharifymoghaddam2025rankllm,\nauthor = {Sharifymoghaddam, Sahel and Pradeep, Ronak and Slavescu, Andre and Nguyen, Ryan and Xu, Andrew and Chen, Zijian and Zhang, Yilin and Chen, Yidi and Xian, Jasper and Lin, Jimmy},\ntitle = {{RankLLM}: A Python Package for Reranking with LLMs},\nyear = {2025},\nisbn = {9798400715921},\npublisher = {Association for Computing Machinery},\naddress = {New York, NY, USA},\nbooktitle = {Proceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval},\npages = {3681–3690},\nnumpages = {10},\nkeywords = {information retrieval, large language models, python, reranking},\nlocation = {Padua, Italy},\nseries = {SIGIR '25}\n}\n```\n\u003C!-- {% endraw %} -->\n\n[[2309.15088] RankVicuna：使用开源大型语言模型进行零样本列表式文档重排序](https:\u002F\u002Farxiv.org\u002Fabs\u002F2309.15088)\n\n\u003C!-- {% raw %} -->\n```\n@ARTICLE{pradeep2023rankvicuna,\n  title   = {{RankVicuna}: Zero-Shot Listwise Document Reranking with Open-Source Large Language Models},\n  author  = {Ronak Pradeep and Sahel Sharifymoghaddam and Jimmy Lin},\n  year    = {2023},\n  journal = {arXiv:2309.15088}\n}\n```\n\u003C!-- {% endraw %} -->\n\n\n[[2312.02724] RankZephyr：高效且稳健的零样本列表式重排序轻而易举！](https:\u002F\u002Farxiv.org\u002Fabs\u002F2312.02724)\n\n\u003C!-- {% raw %} -->\n```\n@ARTICLE{pradeep2023rankzephyr,\n  title   = {{RankZephyr}: Effective and Robust Zero-Shot 
Listwise Reranking is a Breeze!},\n  author  = {Ronak Pradeep and Sahel Sharifymoghaddam and Jimmy Lin},\n  year    = {2023},\n  journal = {arXiv:2312.02724}\n}\n```\n\u003C!-- {% endraw %} -->\n\n如果您使用 LiT5 模型之一，请引用以下相关论文：\n\n[[2312.16098] 缩小规模，提升性能：使用序列到序列编码器-解码器模型实现高效的零样本列表式重排序](https:\u002F\u002Farxiv.org\u002Fabs\u002F2312.16098)\n\n```\n@ARTICLE{tamber2023scaling,\n  title   = {Scaling Down, LiTting Up: Efficient Zero-Shot Listwise Reranking with Seq2seq Encoder-Decoder Models},\n  author  = {Manveer Singh Tamber and Ronak Pradeep and Jimmy Lin},\n  year    = {2023},\n  journal = {arXiv:2312.16098}\n}\n```\n\n如果您使用 monoT5 模型之一，请引用以下相关论文：\n\n[[2101.05667] 面向文本排名的预训练序列到序列模型的 Expando-Mono-Duo 设计模式](https:\u002F\u002Farxiv.org\u002Fabs\u002F2101.05667)\n\n```\n@ARTICLE{pradeep2021emd,\n  title = {The Expando-Mono-Duo Design Pattern for Text Ranking with Pretrained Sequence-to-Sequence Models},\n  author = {Ronak Pradeep and Rodrigo Nogueira and Jimmy Lin},\n  year = {2021},\n  journal = {arXiv:2101.05667},\n}\n```\n\n\n如果您使用 monoELECTRA 模型，请考虑引用：\n\n[从石头中挤出水来：进一步提升交叉编码器重排序效果的一系列技巧](https:\u002F\u002Fcs.uwaterloo.ca\u002F~jimmylin\u002Fpublications\u002FPradeep_etal_ECIR2022.pdf)\n```\n@inproceedings{pradeep2022monoelectra,\n  author = {Pradeep, Ronak and Liu, Yuqi and Zhang, Xinyu and Li, Yilin and Yates, Andrew and Lin, Jimmy},\n  title = {Squeezing Water from a Stone: A Bag of Tricks for Further Improving Cross-Encoder Effectiveness for Reranking},\n  year = {2022},\n  publisher = {Springer-Verlag},\n  address = {Berlin, Heidelberg},\n  booktitle = {Advances in Information Retrieval: 44th European Conference on IR Research, ECIR 2022, Stavanger, Norway, April 10–14, 2022, Proceedings, Part I},\n  pages = {655–670},\n  numpages = {16},\n  location = {Stavanger, Norway}\n}\n```\n\n如果您使用 FirstMistral 模型，请考虑引用：\n\n[[2411.05508] 早期 FIRST 复现及单标记解码改进，用于快速列表式重排序](https:\u002F\u002Farxiv.org\u002Fabs\u002F2411.05508)\n\n```\n@ARTICLE{chen2024firstrepro,\n  title   = {An Early FIRST Reproduction and Improvements to Single-Token Decoding for Fast Listwise Reranking},\n  author  = {Zijian Chen and Ronak Pradeep and Jimmy Lin},\n  year    = {2024},\n  journal = {arXiv:2411.05508}\n}\n```\n\n如果您想引用 FIRST 方法论，请考虑引用：\n\n[[2406.15657] FIRST：通过单标记解码实现更快更优的列表式重排序](https:\u002F\u002Farxiv.org\u002Fabs\u002F2406.15657)\n\n```\n@ARTICLE{reddy2024first,\n  title   = {FIRST: Faster Improved Listwise Reranking with Single Token Decoding},\n  author  = {Reddy, Revanth Gangi and Doo, JaeHyeok and Xu, Yifei and Sultan, Md Arafat and Swain, Deevya and Sil, Avirup and Ji, Heng},\n  year    = {2024},\n  journal = {arXiv:2406.15657}\n}\n```\n\u003Ca id=acknowledgments>\u003C\u002Fa>\n# 🙏 致谢\n\n本研究部分得到了加拿大自然科学与工程研究委员会（NSERC）的支持。","# RankLLM 快速上手指南\n\nRankLLM 是一个强大的重排序（Rerank）工具套件，支持多种开源大语言模型（如 RankZephyr、MonoT5）及专有模型（如 RankGPT、RankGemini）。它兼容 vLLM、SGLang 和 TensorRT-LLM 后端，旨在提供高效、灵活的文档重排序解决方案。\n\n## 环境准备\n\n在开始之前，请确保您的系统满足以下要求：\n\n*   **操作系统**: Linux (推荐) 或 macOS。\n*   **Python 版本**: 3.11 (官方推荐)。\n*   **Java 环境**: 如果计划使用检索或评估工作流（通过 `pyserini`），必须安装 **Java 21** (不支持 JDK 11)。\n*   **GPU 驱动**: 如需 GPU 加速，请确保已安装正确的 NVIDIA 驱动和 CUDA 工具包。\n\n## 安装步骤\n\n官方推荐使用 `uv` 进行环境管理和安装，同时也支持传统的 `pip` 和 `conda`。\n\n### 方法一：使用 uv 推荐安装（首选）\n\n`uv` 是该项目标准的开发工作流工具，安装速度极快。\n\n1.  **安装 uv**:\n    ```bash\n    curl -LsSf https:\u002F\u002Fastral.sh\u002Fuv\u002Finstall.sh | sh\n    export PATH=\"$HOME\u002F.local\u002Fbin:$PATH\"\n    ```\n\n2.  
\n\n## Basic Usage\n\nRankLLM offers two ways to work: the unified `rank-llm` command-line tool and the Python API.\n\n### 1. Quick reranking from the command line (CLI)\n\nThe following command uses the pretrained `RankZephyr` model to rerank the top 100 candidate documents retrieved by BM25 on the `dl20` dataset:\n\n```bash\nrank-llm rerank --model-path castorini\u002Frank_zephyr_7b_v1_full --dataset dl20 \\\n  --retrieval-method bm25 --top-k-candidates 100\n```\n\n**Other common commands**:\n*   List available prompt templates: `rank-llm prompt list`\n*   View reranking results: `rank-llm view demo_outputs\u002Frerank_results.jsonl`\n*   Start an HTTP service: `rank-llm serve http --model-path castorini\u002Frank_zephyr_7b_v1_full --port 8082`\n\n### 2. Minimal Python API example\n\nThe following code demonstrates the complete pipeline: retrieve -> rerank -> evaluate -> save results.\n\n```python\nfrom pathlib import Path\n\nfrom rank_llm.analysis.response_analysis import ResponseAnalyzer\nfrom rank_llm.data import DataWriter\nfrom rank_llm.evaluation.trec_eval import EvalFunction\nfrom rank_llm.rerank.listwise import ZephyrReranker\nfrom rank_llm.retrieve.retriever import Retriever\nfrom rank_llm.retrieve.topics_dict import TOPICS\n\n# 1. Retrieve (by default, BM25 fetches the top 100 candidate documents)\ndataset_name = \"dl19\"\nretrieved_results = Retriever.from_dataset_with_prebuilt_index(dataset_name)\n\n# 2. Rerank (using the RankZephyr model)\nreranker = ZephyrReranker()\nkwargs = {\"populate_invocations_history\": True}\nrerank_results = reranker.rerank_batch(requests=retrieved_results, **kwargs)\n\n# 3. Evaluate (the default metric is NDCG@10)\ntopics = TOPICS[dataset_name]\nndcg_10_retrieved = EvalFunction.from_results(retrieved_results, topics)\nprint(f\"Retrieval NDCG@10: {ndcg_10_retrieved}\")\n\nndcg_10_rerank = EvalFunction.from_results(rerank_results, topics)\nprint(f\"Rerank NDCG@10: {ndcg_10_rerank}\")\n\n# 4. Analyze invocation errors (optional)\nanalyzer = ResponseAnalyzer.from_inline_results(rerank_results)\nerror_counts = analyzer.count_errors(verbose=True)\nprint(error_counts)\n\n# 5. Save results\nwriter = DataWriter(rerank_results)\nPath(\"demo_outputs\u002F\").mkdir(parents=True, exist_ok=True)\nwriter.write_in_jsonl_format(\"demo_outputs\u002Frerank_results.jsonl\")\nwriter.write_in_trec_eval_format(\"demo_outputs\u002Frerank_results.txt\")\n```
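\n\nThe retriever step above is a convenience, not a requirement: the core library accepts candidates from any source (see the FAQ below on Pyserini decoupling). Here is a minimal sketch of reranking hand-built candidates; the `Request`, `Query`, and `Candidate` constructors and their field names are assumptions about `rank_llm.data` (the module that also provides `DataWriter`), so verify them against the source before relying on this:\n\n```python\n# Hypothetical sketch: rerank hand-built candidates, bypassing Pyserini.\n# The Request, Query, and Candidate names and fields are assumptions to\n# verify against the rank_llm.data module.\nfrom rank_llm.data import Candidate, Query, Request\nfrom rank_llm.rerank.listwise import ZephyrReranker\n\nrequest = Request(\n    query=Query(text=\"how do listwise rerankers work?\", qid=\"q1\"),\n    candidates=[\n        Candidate(docid=\"d1\", score=12.3, doc={\"contents\": \"Listwise rerankers order the whole list at once.\"}),\n        Candidate(docid=\"d2\", score=11.8, doc={\"contents\": \"BM25 is a lexical scoring function.\"}),\n    ],\n)\n\nreranker = ZephyrReranker()\nresults = reranker.rerank_batch(requests=[request])\nprint([c.docid for c in results[0].candidates])\n```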
\n\n### Advanced tips\n*   **Batched acceleration**: When using the CLI, add the `--sglang_batched` or `--tensorrt_batched` flag to enable batched inference on the SGLang or TensorRT-LLM backend, which significantly improves throughput.\n*   **Custom prompts**: Prompt templates can be customized via YAML files to suit different models or tasks.","The search team at an e-commerce platform is optimizing its in-house product retrieval system, aiming to pick out the 10 products that best match user intent from a massive pool of candidates.\n\n### Without rank_llm\n- **Reranking quality plateaus**: Relying only on traditional BM25 or dual-encoder first-stage retrieval leaves many semantically relevant but keyword-mismatched products buried deep in the results, capping precision.\n- **LLM integration is difficult**: The team wants listwise LLM reranking to capture complex context, but without a unified interface they must hand-write tedious prompt-engineering code for each model.\n- **Inefficient inference**: Calling proprietary APIs directly is expensive and slow, while locally deployed open-source models cannot leverage acceleration frameworks such as vLLM or TensorRT-LLM, causing online timeouts.\n- **Costly experiment reproduction**: Every new reranking strategy (say, switching from pointwise to listwise) requires rebuilding large amounts of data-processing logic, making it hard to compare models such as MonoT5 and DuoT5 quickly.\n\n### With rank_llm\n- **Markedly better ranking quality**: Built-in listwise rerankers (such as RankGPT or Llama-based variants) let the model examine the entire candidate list at once, sharply improving semantic matching on long-tail queries.\n- **Seamless multi-backend integration**: A few lines of configuration swap the underlying inference engine among vLLM, SGLang, or the OpenRouter API, keeping the flexibility of open-source models while allowing quick access to proprietary ones.\n- **Efficient inference and customization**: Optimizations such as \"first-token logits\" scoring speed up inference (sketched below), while YAML prompt templates adapt the system to specific business scenarios without modifying core code.\n- **Standardized research workflow**: A complete built-in evaluation pipeline lets team members reproduce the benchmarks from SIGIR papers and compare rerankers efficiently within one framework.\n\nrank_llm turns complex reranking research into standardized engineering practice, helping the team achieve a qualitative leap in search relevance at minimal cost.
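\n\nTo make the \"first-token logits\" idea concrete, here is a conceptual, self-contained sketch of scoring candidates from the logits of a single decoded token, in the spirit of FIRST-style single-token decoding. The model, prompt, and identifier scheme are illustrative stand-ins, not RankLLM's actual implementation:\n\n```python\n# Conceptual sketch of \"first-token logits\" reranking: rather than\n# generating a full ranked list, read the next-token logits over the\n# candidate identifiers. gpt2 is a stand-in model for illustration.\nimport torch\nfrom transformers import AutoModelForCausalLM, AutoTokenizer\n\ntok = AutoTokenizer.from_pretrained(\"gpt2\")\nmodel = AutoModelForCausalLM.from_pretrained(\"gpt2\")\n\nprompt = (\n    \"Query: what is listwise reranking? \"\n    \"[1] BM25 is a lexical scoring function. \"\n    \"[2] Listwise rerankers order the whole candidate list at once. \"\n    \"The best passage is: [\"\n)\ninputs = tok(prompt, return_tensors=\"pt\")\nwith torch.no_grad():\n    logits = model(**inputs).logits[0, -1]  # logits of the first new token\n\n# Score each candidate by the logit of its identifier token.\nscores = {i: logits[tok.encode(str(i))[0]].item() for i in (1, 2)}\nranking = sorted(scores, key=scores.get, reverse=True)\nprint(\"ranking:\", ranking, \"scores:\", scores)\n```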
","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fcastorini_rank_llm_1e443d90.png","castorini","Castorini","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Fcastorini_706a9662.jpg","Jimmy Lin's research group at the University of Waterloo",null,"castorini.io","https:\u002F\u002Fgithub.com\u002Fcastorini",[83,87,91],{"name":84,"color":85,"percentage":86},"Python","#3572A5",88.1,{"name":88,"color":89,"percentage":90},"HTML","#e34c26",10.8,{"name":92,"color":93,"percentage":94},"Shell","#89e051",1.2,586,87,"2026-04-03T09:41:56","Apache-2.0","Linux, macOS","Optional but recommended for local models (vLLM, SGLang, TensorRT-LLM). Requires an NVIDIA GPU; VRAM depends on model size (e.g., 7B models typically need 16GB+). Supports CUDA 12.1 (cu121).","Not specified",{"notes":103,"python":104,"dependencies":105},"1. The project officially recommends 'uv' for environment management and installation, with Conda\u002FPip as alternatives.\n2. If you use the retrieval or evaluation features (pyserini), Java 21 is required (JDK 11 is not supported).\n3. The SGLang backend requires a separate 'flashinfer' install; TensorRT-LLM or training requires a separate 'flash-attn' install.\n4. Multiple install combinations (extras) are supported, such as 'local' (local models), 'cloud' (cloud APIs), 'vllm', and 'sglang'; install as needed.","3.11",[106,107,108,109,110,111,112,113,114,115],"torch","transformers","vllm","sglang","tensorrt-llm","fastapi","flask","uvicorn","pyserini","openai",[26,13],"2026-03-27T02:49:30.150509","2026-04-06T08:45:29.718829",[120,125,130,135,140,144,149,154],{"id":121,"question_zh":122,"answer_zh":123,"source_url":124},13016,"What is RankLLM's design philosophy? Does it require Pyserini?","RankLLM is designed as a lightweight, general-purpose reranking library (Approach 2) and does not require Pyserini. It can process any input candidates that conform to a specific JSON format. Although command guides are provided for obtaining candidates from Pyserini or Anserini, the core library is decoupled, so it can be paired with other retrievers such as LangChain hybrid search or plain BM25, which improves usability.","https:\u002F\u002Fgithub.com\u002Fcastorini\u002Frank_llm\u002Fissues\u002F109",{"id":126,"question_zh":127,"answer_zh":128,"source_url":129},13017,"How is the project's API designed? What are the main methods?","The API design consists of two main methods: `rerank` and `rerank_batch`.\n1. `rerank(query, candidates, k=10)`: takes a single Query object and a Candidates object and returns a new Candidates object (a non-destructive operation). Objects are used instead of raw strings or lists so that metadata (such as execution traces) can be attached.\n2. `rerank_batch(queries, candidates_list, k=10)`: takes a list of queries and a list of candidate lists for batch processing.\nA single `rerank` call is routed internally through `rerank_batch`. The designs of the REST API and the Python library are complete.","https:\u002F\u002Fgithub.com\u002Fcastorini\u002Frank_llm\u002Fissues\u002F110",{"id":131,"question_zh":132,"answer_zh":133,"source_url":134},13018,"How do I configure the output directories (e.g., the folders for retrieval results and reranking results)?","The four folder names `retrieve_results`, `rerank_results`, `prompts_and_responses`, and `token_counts` are no longer hard-coded. They are now configurable input parameters: users can specify these paths at runtime, and the defaults remain the original folder names.","https:\u002F\u002Fgithub.com\u002Fcastorini\u002Frank_llm\u002Fissues\u002F41",{"id":136,"question_zh":137,"answer_zh":138,"source_url":139},13019,"Does the Pyserini retriever support custom index directories, rather than only preset dataset names?","Yes, custom indexes are supported. To keep things simple, users who need a custom index can provide the topics files and index directories directly as parameters, with qrels files as an optional input for evaluation. The code also supports reranking retrieval results (hits\u002Fresults) stored in files, so users can first run Pyserini externally in whatever way they like and then invoke RankLLM for reranking.","https:\u002F\u002Fgithub.com\u002Fcastorini\u002Frank_llm\u002Fissues\u002F38",{"id":141,"question_zh":142,"answer_zh":143,"source_url":129},13020,"How do I generate JSON-format files in Anserini\u002FPyserini suitable as RankLLM input?","In Anserini, the `SearchCollection` command added an `-outputRerankerRequests` option that produces the input format RankLLM expects, and `-outputRerankerRequests.format` selects `json` (pretty-printed, human-readable) or `jsonl` output.\nIn Pyserini, the corresponding command-line options are `--output-reranker-requests` and `--output-reranker-requests-format`, following standard Python naming conventions.",{"id":145,"question_zh":146,"answer_zh":147,"source_url":148},13021,"Are faiss-gpu and accelerate required dependencies?","No. `faiss-gpu` is not required because Pyserini uses `faiss-cpu`. `accelerate` is mainly for training scenarios, and the current version does not support training. These unnecessary dependencies have been removed or marked optional.","https:\u002F\u002Fgithub.com\u002Fcastorini\u002Frank_llm\u002Fissues\u002F66",{"id":150,"question_zh":151,"answer_zh":152,"source_url":153},13022,"Is there a caching mechanism to avoid repeatedly downloading retrieval results or rebuilding indexes?","Yes, caching logic is built into the baseline-reproduction workflow to streamline it. The logic is as follows: first check whether the file exists locally with a matching MD5; if so, use it directly. Otherwise, try to fetch it from `rank_llm_data`. Only if neither is available is the retrieve step executed. There are also plans to add reranking results (rerank_results) to the data cache for verification.","https:\u002F\u002Fgithub.com\u002Fcastorini\u002Frank_llm\u002Fissues\u002F33",{"id":155,"question_zh":156,"answer_zh":157,"source_url":158},13023,"When running the example scripts, how should sys.path import issues be handled so that both running from source and running the installed package work?","The recommended approach is to use absolute imports and rely on `__init__.py` files for clean modularization. Demo scripts can be run as modules (for example: `python -m rank_llm.demo.rerank_demo_docs`) to respect the package structure, provided the user runs from the correct directory (such as `rank_llm\u002Fsrc`). Relative imports can be used internally (similar to what the transformers library does), but manually patching `sys.path` in scripts is considered inelegant; a sound package structure and module invocation are the preferred solution.","https:\u002F\u002Fgithub.com\u002Fcastorini\u002Frank_llm\u002Fissues\u002F53",[]]