[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-qdrant--fastembed":3,"tool-qdrant--fastembed":64},[4,17,27,35,43,56],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":16},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,3,"2026-04-05T11:01:52",[13,14,15],"开发框架","图像","Agent","ready",{"id":18,"name":19,"github_repo":20,"description_zh":21,"stars":22,"difficulty_score":23,"last_commit_at":24,"category_tags":25,"status":16},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",138956,2,"2026-04-05T11:33:21",[13,15,26],"语言模型",{"id":28,"name":29,"github_repo":30,"description_zh":31,"stars":32,"difficulty_score":23,"last_commit_at":33,"category_tags":34,"status":16},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 
绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",107662,"2026-04-03T11:11:01",[13,14,15],{"id":36,"name":37,"github_repo":38,"description_zh":39,"stars":40,"difficulty_score":23,"last_commit_at":41,"category_tags":42,"status":16},3704,"NextChat","ChatGPTNextWeb\u002FNextChat","NextChat 是一款轻量且极速的 AI 助手，旨在为用户提供流畅、跨平台的大模型交互体验。它完美解决了用户在多设备间切换时难以保持对话连续性，以及面对众多 AI 模型不知如何统一管理的痛点。无论是日常办公、学习辅助还是创意激发，NextChat 都能让用户随时随地通过网页、iOS、Android、Windows、MacOS 或 Linux 端无缝接入智能服务。\n\n这款工具非常适合普通用户、学生、职场人士以及需要私有化部署的企业团队使用。对于开发者而言，它也提供了便捷的自托管方案，支持一键部署到 Vercel 或 Zeabur 等平台。\n\nNextChat 的核心亮点在于其广泛的模型兼容性，原生支持 Claude、DeepSeek、GPT-4 及 Gemini Pro 等主流大模型，让用户在一个界面即可自由切换不同 AI 能力。此外，它还率先支持 MCP（Model Context Protocol）协议，增强了上下文处理能力。针对企业用户，NextChat 提供专业版解决方案，具备品牌定制、细粒度权限控制、内部知识库整合及安全审计等功能，满足公司对数据隐私和个性化管理的高标准要求。",87618,"2026-04-05T07:20:52",[13,26],{"id":44,"name":45,"github_repo":46,"description_zh":47,"stars":48,"difficulty_score":23,"last_commit_at":49,"category_tags":50,"status":16},2268,"ML-For-Beginners","microsoft\u002FML-For-Beginners","ML-For-Beginners 是由微软推出的一套系统化机器学习入门课程，旨在帮助零基础用户轻松掌握经典机器学习知识。这套课程将学习路径规划为 12 周，包含 26 节精炼课程和 52 道配套测验，内容涵盖从基础概念到实际应用的完整流程，有效解决了初学者面对庞大知识体系时无从下手、缺乏结构化指导的痛点。\n\n无论是希望转型的开发者、需要补充算法背景的研究人员，还是对人工智能充满好奇的普通爱好者，都能从中受益。课程不仅提供了清晰的理论讲解，还强调动手实践，让用户在循序渐进中建立扎实的技能基础。其独特的亮点在于强大的多语言支持，通过自动化机制提供了包括简体中文在内的 50 多种语言版本，极大地降低了全球不同背景用户的学习门槛。此外，项目采用开源协作模式，社区活跃且内容持续更新，确保学习者能获取前沿且准确的技术资讯。如果你正寻找一条清晰、友好且专业的机器学习入门之路，ML-For-Beginners 将是理想的起点。",84991,"2026-04-05T10:45:23",[14,51,52,53,15,54,26,13,55],"数据工具","视频","插件","其他","音频",{"id":57,"name":58,"github_repo":59,"description_zh":60,"stars":61,"difficulty_score":10,"last_commit_at":62,"category_tags":63,"status":16},3128,"ragflow","infiniflow\u002Fragflow","RAGFlow 
是一款领先的开源检索增强生成（RAG）引擎，旨在为大语言模型构建更精准、可靠的上下文层。它巧妙地将前沿的 RAG 技术与智能体（Agent）能力相结合，不仅支持从各类文档中高效提取知识，还能让模型基于这些知识进行逻辑推理和任务执行。\n\n在大模型应用中，幻觉问题和知识滞后是常见痛点。RAGFlow 通过深度解析复杂文档结构（如表格、图表及混合排版），显著提升了信息检索的准确度，从而有效减少模型“胡编乱造”的现象，确保回答既有据可依又具备时效性。其内置的智能体机制更进一步，使系统不仅能回答问题，还能自主规划步骤解决复杂问题。\n\n这款工具特别适合开发者、企业技术团队以及 AI 研究人员使用。无论是希望快速搭建私有知识库问答系统，还是致力于探索大模型在垂直领域落地的创新者，都能从中受益。RAGFlow 提供了可视化的工作流编排界面和灵活的 API 接口，既降低了非算法背景用户的上手门槛，也满足了专业开发者对系统深度定制的需求。作为基于 Apache 2.0 协议开源的项目，它正成为连接通用大模型与行业专有知识之间的重要桥梁。",77062,"2026-04-04T04:44:48",[15,14,13,26,54],{"id":65,"github_repo":66,"name":67,"description_en":68,"description_zh":69,"ai_summary_zh":69,"readme_en":70,"readme_zh":71,"quickstart_zh":72,"use_case_zh":73,"hero_image_url":74,"owner_login":75,"owner_name":76,"owner_avatar_url":77,"owner_bio":78,"owner_company":79,"owner_location":79,"owner_email":80,"owner_twitter":81,"owner_website":82,"owner_url":83,"languages":84,"stars":93,"forks":94,"last_commit_at":95,"license":96,"difficulty_score":97,"env_os":98,"env_gpu":99,"env_ram":100,"env_deps":101,"category_tags":107,"github_topics":108,"view_count":23,"oss_zip_url":79,"oss_zip_packed_at":79,"status":16,"created_at":115,"updated_at":116,"faqs":117,"releases":146},2402,"qdrant\u002Ffastembed","fastembed","Fast, Accurate, Lightweight Python library to make State of the Art Embedding","fastembed 是一款专为生成高质量文本向量而设计的轻量级 Python 库。它旨在解决传统嵌入模型依赖庞大、运行速度慢以及需要昂贵 GPU 资源的问题，让开发者能在资源受限的环境（如 Serverless 架构或 AWS Lambda）中轻松部署最先进的 AI 模型。\n\n这款工具非常适合后端工程师、数据科学家以及需要构建检索增强生成（RAG）系统或语义搜索功能的开发者使用。无需配置复杂的深度学习环境，fastembed 即可快速上手。\n\n其核心技术亮点在于摒弃了沉重的 PyTorch 依赖，转而采用高效的 ONNX Runtime 进行推理。这不仅大幅降低了安装包体积，还显著提升了处理速度，支持通过数据并行化快速编码大规模数据集。在精度方面，fastembed 默认集成的 Flag Embedding 模型在权威评测中表现优异，甚至超越了 OpenAI 的 Ada-002 模型。此外，它不仅支持稠密向量，还提供稀疏向量（如 SPLADE++）生成能力，并允许用户灵活扩展自定义模型。无论是处理英文还是多语言任务，fastembed 都能以极低的成本提供准确、快速的向量嵌入服务。","# ⚡️ What is FastEmbed?\n\nFastEmbed is a lightweight, fast, Python library built for embedding generation. 
We [support popular text models](https:\u002F\u002Fqdrant.github.io\u002Ffastembed\u002Fexamples\u002FSupported_Models\u002F). Please [open a GitHub issue](https:\u002F\u002Fgithub.com\u002Fqdrant\u002Ffastembed\u002Fissues\u002Fnew) if you want us to add a new model.\n\nThe default text embedding (`TextEmbedding`) model is Flag Embedding, presented in the [MTEB](https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fmteb\u002Fleaderboard) leaderboard. It supports \"query\" and \"passage\" prefixes for the input text. Here is an example for [Retrieval Embedding Generation](https:\u002F\u002Fqdrant.github.io\u002Ffastembed\u002Fqdrant\u002FRetrieval_with_FastEmbed\u002F) and how to use [FastEmbed with Qdrant](https:\u002F\u002Fqdrant.github.io\u002Ffastembed\u002Fqdrant\u002FUsage_With_Qdrant\u002F).\n\n## 📈 Why FastEmbed?\n\n1. Light: FastEmbed is a lightweight library with few external dependencies. We don't require a GPU and don't download GBs of PyTorch dependencies, and instead use the ONNX Runtime. This makes it a great candidate for serverless runtimes like AWS Lambda. \n\n2. Fast: FastEmbed is designed for speed. We use the ONNX Runtime, which is faster than PyTorch. We also use data parallelism for encoding large datasets.\n\n3. Accurate: FastEmbed is better than OpenAI Ada-002. We also [support](https:\u002F\u002Fqdrant.github.io\u002Ffastembed\u002Fexamples\u002FSupported_Models\u002F) an ever-expanding set of models, including a few multilingual models.\n\n## 🚀 Installation\n\nTo install the FastEmbed library, pip works best. You can install it with or without GPU support:\n\n```bash\npip install fastembed\n\n# or with GPU support\n\npip install fastembed-gpu\n```\n\n## 📖 Quickstart\n\n```python\nfrom fastembed import TextEmbedding\n\n\n# Example list of documents\ndocuments: list[str] = [\n    \"This is built to be faster and lighter than other embedding libraries e.g. 
Transformers, Sentence-Transformers, etc.\",\n    \"fastembed is supported by and maintained by Qdrant.\",\n]\n\n# This will trigger the model download and initialization\nembedding_model = TextEmbedding()\nprint(\"The model BAAI\u002Fbge-small-en-v1.5 is ready to use.\")\n\nembeddings_generator = embedding_model.embed(documents)  # reminder this is a generator\nembeddings_list = list(embedding_model.embed(documents))\n  # you can also convert the generator to a list, and that to a numpy array\nlen(embeddings_list[0]) # Vector of 384 dimensions\n```\n\nFastembed supports a variety of models for different tasks and modalities.\nThe list of all the available models can be found [here](https:\u002F\u002Fqdrant.github.io\u002Ffastembed\u002Fexamples\u002FSupported_Models\u002F)\n### 🎒 Dense text embeddings\n\n```python\nfrom fastembed import TextEmbedding\n\nmodel = TextEmbedding(model_name=\"BAAI\u002Fbge-small-en-v1.5\")\nembeddings = list(model.embed(documents))\n\n# [\n#   array([-0.1115,  0.0097,  0.0052,  0.0195, ...], dtype=float32),\n#   array([-0.1019,  0.0635, -0.0332,  0.0522, ...], dtype=float32)\n# ]\n\n```\n\nDense text embedding can also be extended with models which are not in the list of supported models.\n\n```python\nfrom fastembed import TextEmbedding\nfrom fastembed.common.model_description import PoolingType, ModelSource\n\nTextEmbedding.add_custom_model(\n    model=\"intfloat\u002Fmultilingual-e5-small\",\n    pooling=PoolingType.MEAN,\n    normalization=True,\n    sources=ModelSource(hf=\"intfloat\u002Fmultilingual-e5-small\"),  # can be used with an `url` to load files from a private storage\n    dim=384,\n    model_file=\"onnx\u002Fmodel.onnx\",  # can be used to load an already supported model with another optimization or quantization, e.g. 
onnx\u002Fmodel_O4.onnx\n)\nmodel = TextEmbedding(model_name=\"intfloat\u002Fmultilingual-e5-small\")\nembeddings = list(model.embed(documents))\n```\n\n\n### 🔱 Sparse text embeddings\n\n* SPLADE++\n\n```python\nfrom fastembed import SparseTextEmbedding\n\nmodel = SparseTextEmbedding(model_name=\"prithivida\u002FSplade_PP_en_v1\")\nembeddings = list(model.embed(documents))\n\n# [\n#   SparseEmbedding(indices=[ 17, 123, 919, ... ], values=[0.71, 0.22, 0.39, ...]),\n#   SparseEmbedding(indices=[ 38,  12,  91, ... ], values=[0.11, 0.22, 0.39, ...])\n# ]\n```\n\n\u003C!--\n* BM42 - ([link](ToDo))\n\n```\nfrom fastembed import SparseTextEmbedding\n\nmodel = SparseTextEmbedding(model_name=\"Qdrant\u002Fbm42-all-minilm-l6-v2-attentions\")\nembeddings = list(model.embed(documents))\n\n# [\n#   SparseEmbedding(indices=[ 17, 123, 919, ... ], values=[0.71, 0.22, 0.39, ...]),\n#   SparseEmbedding(indices=[ 38,  12,  91, ... ], values=[0.11, 0.22, 0.39, ...])\n# ]\n```\n-->\n\n### 🦥 Late interaction models (aka ColBERT)\n\n\n```python\nfrom fastembed import LateInteractionTextEmbedding\n\nmodel = LateInteractionTextEmbedding(model_name=\"colbert-ir\u002Fcolbertv2.0\")\nembeddings = list(model.embed(documents))\n\n# [\n#   array([\n#       [-0.1115,  0.0097,  0.0052,  0.0195, ...],\n#       [-0.1019,  0.0635, -0.0332,  0.0522, ...],\n#   ]),\n#   array([\n#       [-0.9019,  0.0335, -0.0032,  0.0991, ...],\n#       [-0.2115,  0.8097,  0.1052,  0.0195, ...],\n#   ]),  \n# ]\n```\n\n### 🖼️ Image embeddings\n\n```python\nfrom fastembed import ImageEmbedding\n\nimages = [\n    \".\u002Fpath\u002Fto\u002Fimage1.jpg\",\n    \".\u002Fpath\u002Fto\u002Fimage2.jpg\",\n]\n\nmodel = ImageEmbedding(model_name=\"Qdrant\u002Fclip-ViT-B-32-vision\")\nembeddings = list(model.embed(images))\n\n# [\n#   array([-0.1115,  0.0097,  0.0052,  0.0195, ...], dtype=float32),\n#   array([-0.1019,  0.0635, -0.0332,  0.0522, ...], dtype=float32)\n# ]\n```\n\n### Late interaction multimodal models 
(ColPali)\n\n```python\nfrom fastembed import LateInteractionMultimodalEmbedding\n\ndoc_images = [\n    \".\u002Fpath\u002Fto\u002Fqdrant_pdf_doc_1_screenshot.jpg\",\n    \".\u002Fpath\u002Fto\u002Fcolpali_pdf_doc_2_screenshot.jpg\",\n]\n\nquery = \"What is Qdrant?\"\n\nmodel = LateInteractionMultimodalEmbedding(model_name=\"Qdrant\u002Fcolpali-v1.3-fp16\")\ndoc_images_embeddings = list(model.embed_image(doc_images))\n# shape (2, 1030, 128)\n# [array([[-0.03353882, -0.02090454, ..., -0.15576172, -0.07678223]], dtype=float32)]\nquery_embedding = model.embed_text(query)\n# shape (1, 20, 128)\n# [array([[-0.00218201,  0.14758301, ...,  -0.02207947,  0.16833496]], dtype=float32)]\n```\n\n### 🔄 Rerankers\n```python\nfrom fastembed.rerank.cross_encoder import TextCrossEncoder\n\nquery = \"Who is maintaining Qdrant?\"\ndocuments: list[str] = [\n    \"This is built to be faster and lighter than other embedding libraries e.g. Transformers, Sentence-Transformers, etc.\",\n    \"fastembed is supported by and maintained by Qdrant.\",\n]\nencoder = TextCrossEncoder(model_name=\"Xenova\u002Fms-marco-MiniLM-L-6-v2\")\nscores = list(encoder.rerank(query, documents))\n\n# [-11.48061752319336, 5.472434997558594]\n```\n\nText cross encoders can also be extended with models which are not in the list of supported models.\n\n```python\nfrom fastembed.rerank.cross_encoder import TextCrossEncoder \nfrom fastembed.common.model_description import ModelSource\n\nTextCrossEncoder.add_custom_model(\n    model=\"Xenova\u002Fms-marco-MiniLM-L-4-v2\",\n    model_file=\"onnx\u002Fmodel.onnx\",\n    sources=ModelSource(hf=\"Xenova\u002Fms-marco-MiniLM-L-4-v2\"),\n)\nmodel = TextCrossEncoder(model_name=\"Xenova\u002Fms-marco-MiniLM-L-4-v2\")\nscores = list(model.rerank_pairs(\n    [(\"What is AI?\", \"Artificial intelligence is ...\"), (\"What is ML?\", \"Machine learning is ...\"),]\n))\n```\n\n## ⚡️ FastEmbed on a GPU\n\nFastEmbed supports running on GPU devices.\nIt requires installation of the 
`fastembed-gpu` package.\n\n```bash\npip install fastembed-gpu\n```\n\nCheck our [example](https:\u002F\u002Fqdrant.github.io\u002Ffastembed\u002Fexamples\u002FFastEmbed_GPU\u002F) for detailed instructions, CUDA 12.x support and troubleshooting of the common issues.\n\n```python\nfrom fastembed import TextEmbedding\n\nembedding_model = TextEmbedding(\n    model_name=\"BAAI\u002Fbge-small-en-v1.5\", \n    providers=[\"CUDAExecutionProvider\"]\n)\nprint(\"The model BAAI\u002Fbge-small-en-v1.5 is ready to use on a GPU.\")\n\n```\n\n## Usage with Qdrant\n\nInstallation with Qdrant Client in Python:\n\n```bash\npip install qdrant-client[fastembed]\n```\n\nor \n\n```bash\npip install qdrant-client[fastembed-gpu]\n```\n\nYou might have to use quotes ```pip install 'qdrant-client[fastembed]'``` on zsh.\n\n```python\nfrom qdrant_client import QdrantClient, models\n\n# Initialize the client\nclient = QdrantClient(\"localhost\", port=6333) # For production\n# client = QdrantClient(\":memory:\") # For experimentation\n\nmodel_name = \"sentence-transformers\u002Fall-MiniLM-L6-v2\"\npayload = [\n    {\"document\": \"Qdrant has Langchain integrations\", \"source\": \"Langchain-docs\", },\n    {\"document\": \"Qdrant also has Llama Index integrations\", \"source\": \"LlamaIndex-docs\"},\n]\ndocs = [models.Document(text=data[\"document\"], model=model_name) for data in payload]\nids = [42, 2]\n\nclient.create_collection(\n    \"demo_collection\",\n    vectors_config=models.VectorParams(\n        size=client.get_embedding_size(model_name), distance=models.Distance.COSINE)\n)\n\nclient.upload_collection(\n    collection_name=\"demo_collection\",\n    vectors=docs,\n    ids=ids,\n    payload=payload,\n)\n\nsearch_result = client.query_points(\n    collection_name=\"demo_collection\",\n    query=models.Document(text=\"This is a query document\", model=model_name)\n).points\nprint(search_result)\n```\n","# ⚡️ 什么是 FastEmbed？\n\nFastEmbed 是一个轻量级、快速的 Python 
库，专为生成嵌入向量而设计。我们[支持流行的文本模型](https:\u002F\u002Fqdrant.github.io\u002Ffastembed\u002Fexamples\u002FSupported_Models\u002F)。如果您希望我们添加新的模型，请[在 GitHub 上提交一个问题](https:\u002F\u002Fgithub.com\u002Fqdrant\u002Ffastembed\u002Fissues\u002Fnew)。\n\n默认的文本嵌入（`TextEmbedding`）模型是 Flag Embedding，该模型在 [MTEB](https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fmteb\u002Fleaderboard) 排行榜上有所展示。它支持对输入文本使用“query”和“passage”前缀。以下是关于[检索嵌入生成](https:\u002F\u002Fqdrant.github.io\u002Ffastembed\u002Fqdrant\u002FRetrieval_with_FastEmbed\u002F)以及如何将[FastEmbed 与 Qdrant 配合使用](https:\u002F\u002Fqdrant.github.io\u002Ffastembed\u002Fqdrant\u002FUsage_With_Qdrant\u002F)的示例。\n\n## 📈 为什么选择 FastEmbed？\n\n1. 轻量：FastEmbed 是一个轻量级库，外部依赖较少。我们不需要 GPU，也不需要下载 GB 级别的 PyTorch 依赖项，而是使用 ONNX Runtime。这使得它非常适合无服务器运行时，例如 AWS Lambda。\n\n2. 快速：FastEmbed 专为速度而设计。我们使用 ONNX Runtime，其速度比 PyTorch 更快。此外，我们还利用数据并行性来编码大型数据集。\n\n3. 准确：FastEmbed 的性能优于 OpenAI Ada-002。我们还[支持](https:\u002F\u002Fqdrant.github.io\u002Ffastembed\u002Fexamples\u002FSupported_Models\u002F)不断扩展的模型集合，其中包括一些多语言模型。\n\n## 🚀 安装\n\n安装 FastEmbed 库时，推荐使用 pip。您可以选择安装支持 GPU 的版本或不带 GPU 支持的版本：\n\n```bash\npip install fastembed\n\n# 或者安装支持 GPU 的版本\n\npip install fastembed-gpu\n```\n\n## 📖 快速入门\n\n```python\nfrom fastembed import TextEmbedding\n\n\n# 示例文档列表\ndocuments: list[str] = [\n    \"这个库的设计目标是比其他嵌入库（如 Transformers、Sentence-Transformers 等）更快、更轻。\",\n    \"fastembed 由 Qdrant 提供支持并维护。\",\n]\n\n# 这将触发模型的下载和初始化\nembedding_model = TextEmbedding()\nprint(\"模型 BAAI\u002Fbge-small-en-v1.5 已准备就绪。\")\n\nembeddings_generator = embedding_model.embed(documents)  # 提醒一下，这是一个生成器\nembeddings_list = list(embedding_model.embed(documents))\n  # 您也可以将生成器转换为列表，再将其转换为 numpy 数组\nlen(embeddings_list[0]) # 向量维度为 384\n```\n\nFastembed 支持多种用于不同任务和模态的模型。\n所有可用模型的列表可以在这里找到：[这里](https:\u002F\u002Fqdrant.github.io\u002Ffastembed\u002Fexamples\u002FSupported_Models\u002F)\n### 🎒 密集文本嵌入\n\n```python\nfrom fastembed import TextEmbedding\n\nmodel = 
TextEmbedding(model_name=\"BAAI\u002Fbge-small-en-v1.5\")\nembeddings = list(model.embed(documents))\n\n# [\n#   array([-0.1115,  0.0097,  0.0052,  0.0195, ...], dtype=float32),\n#   array([-0.1019,  0.0635, -0.0332,  0.0522, ...], dtype=float32)\n# ]\n\n```\n\n密集文本嵌入还可以通过不在支持模型列表中的模型进行扩展。\n\n```python\nfrom fastembed import TextEmbedding\nfrom fastembed.common.model_description import PoolingType, ModelSource\n\nTextEmbedding.add_custom_model(\n    model=\"intfloat\u002Fmultilingual-e5-small\",\n    pooling=PoolingType.MEAN,\n    normalization=True,\n    sources=ModelSource(hf=\"intfloat\u002Fmultilingual-e5-small\"),  # 可以使用 `url` 从私有存储加载文件\n    dim=384,\n    model_file=\"onnx\u002Fmodel.onnx\",  # 可以用来加载已经支持的模型，并应用其他优化或量化，例如 onnx\u002Fmodel_O4.onnx\n)\nmodel = TextEmbedding(model_name=\"intfloat\u002Fmultilingual-e5-small\")\nembeddings = list(model.embed(documents))\n```\n\n\n### 🔱 稀疏文本嵌入\n\n* SPLADE++\n\n```python\nfrom fastembed import SparseTextEmbedding\n\nmodel = SparseTextEmbedding(model_name=\"prithivida\u002FSplade_PP_en_v1\")\nembeddings = list(model.embed(documents))\n\n# [\n#   SparseEmbedding(indices=[ 17, 123, 919, ... ], values=[0.71, 0.22, 0.39, ...]),\n#   SparseEmbedding(indices=[ 38,  12,  91, ... ], values=[0.11, 0.22, 0.39, ...])\n# ]\n```\n\n\u003C!--\n* BM42 - ([链接](待办))\n\n```\nfrom fastembed import SparseTextEmbedding\n\nmodel = SparseTextEmbedding(model_name=\"Qdrant\u002Fbm42-all-minilm-l6-v2-attentions\")\nembeddings = list(model.embed(documents))\n\n# [\n#   SparseEmbedding(indices=[ 17, 123, 919, ... ], values=[0.71, 0.22, 0.39, ...]),\n#   SparseEmbedding(indices=[ 38,  12,  91, ... 
], values=[0.11, 0.22, 0.39, ...])\n# ]\n```\n-->\n\n### 🦥 晚期交互模型（又称 ColBERT）\n\n\n```python\nfrom fastembed import LateInteractionTextEmbedding\n\nmodel = LateInteractionTextEmbedding(model_name=\"colbert-ir\u002Fcolbertv2.0\")\nembeddings = list(model.embed(documents))\n\n# [\n#   array([\n#       [-0.1115,  0.0097,  0.0052,  0.0195, ...],\n#       [-0.1019,  0.0635, -0.0332,  0.0522, ...],\n#   ]),\n#   array([\n#       [-0.9019,  0.0335, -0.0032,  0.0991, ...],\n#       [-0.2115,  0.8097,  0.1052,  0.0195, ...],\n#   ]),  \n# ]\n```\n\n### 🖼️ 图像嵌入\n\n```python\nfrom fastembed import ImageEmbedding\n\nimages = [\n    \".\u002Fpath\u002Fto\u002Fimage1.jpg\",\n    \".\u002Fpath\u002Fto\u002Fimage2.jpg\",\n]\n\nmodel = ImageEmbedding(model_name=\"Qdrant\u002Fclip-ViT-B-32-vision\")\nembeddings = list(model.embed(images))\n\n# [\n#   array([-0.1115,  0.0097,  0.0052,  0.0195, ...], dtype=float32),\n#   array([-0.1019,  0.0635, -0.0332,  0.0522, ...], dtype=float32)\n# ]\n```\n\n### 晚期交互多模态模型（ColPali）\n\n```python\nfrom fastembed import LateInteractionMultimodalEmbedding\n\ndoc_images = [\n    \".\u002Fpath\u002Fto\u002Fqdrant_pdf_doc_1_screenshot.jpg\",\n    \".\u002Fpath\u002Fto\u002Fcolpali_pdf_doc_2_screenshot.jpg\",\n]\n\nquery = \"What is Qdrant?\"\n\nmodel = LateInteractionMultimodalEmbedding(model_name=\"Qdrant\u002Fcolpali-v1.3-fp16\")\ndoc_images_embeddings = list(model.embed_image(doc_images))\n# 形状 (2, 1030, 128)\n# [array([[-0.03353882, -0.02090454, ..., -0.15576172, -0.07678223]], dtype=float32)]\nquery_embedding = model.embed_text(query)\n# 形状 (1, 20, 128)\n# [array([[-0.00218201,  0.14758301, ...,  -0.02207947,  0.16833496]], dtype=float32)]\n```\n\n### 🔄 重排序器\n```python\nfrom fastembed.rerank.cross_encoder import TextCrossEncoder\n\nquery = \"Who is maintaining Qdrant?\"\ndocuments: list[str] = [\n    \"This is built to be faster and lighter than other embedding libraries e.g. 
Transformers, Sentence-Transformers, etc.\",\n    \"fastembed is supported by and maintained by Qdrant.\",\n]\nencoder = TextCrossEncoder(model_name=\"Xenova\u002Fms-marco-MiniLM-L-6-v2\")\nscores = list(encoder.rerank(query, documents))\n\n# [-11.48061752319336, 5.472434997558594]\n```\n\n文本交叉编码器也可以通过不在支持模型列表中的模型进行扩展。\n\n```python\nfrom fastembed.rerank.cross_encoder import TextCrossEncoder \nfrom fastembed.common.model_description import ModelSource\n\nTextCrossEncoder.add_custom_model(\n    model=\"Xenova\u002Fms-marco-MiniLM-L-4-v2\",\n    model_file=\"onnx\u002Fmodel.onnx\",\n    sources=ModelSource(hf=\"Xenova\u002Fms-marco-MiniLM-L-4-v2\"),\n)\nmodel = TextCrossEncoder(model_name=\"Xenova\u002Fms-marco-MiniLM-L-4-v2\")\nscores = list(model.rerank_pairs(\n    [(\"什么是AI？\", \"人工智能是……\"), (\"什么是ML？\", \"机器学习是……\"),]\n))\n```\n\n## ⚡️ FastEmbed 在 GPU 上\n\nFastEmbed 支持在 GPU 设备上运行。\n这需要安装 `fastembed-gpu` 包。\n\n```bash\npip install fastembed-gpu\n```\n\n请查看我们的[示例](https:\u002F\u002Fqdrant.github.io\u002Ffastembed\u002Fexamples\u002FFastEmbed_GPU\u002F)以获取详细说明、CUDA 12.x 支持以及常见问题的故障排除方法。\n\n```python\nfrom fastembed import TextEmbedding\n\nembedding_model = TextEmbedding(\n    model_name=\"BAAI\u002Fbge-small-en-v1.5\", \n    providers=[\"CUDAExecutionProvider\"]\n)\nprint(\"模型 BAAI\u002Fbge-small-en-v1.5 已准备好在 GPU 上使用。\")\n```\n\n## 与 Qdrant 的使用\n\n在 Python 中使用 Qdrant 客户端安装：\n\n```bash\npip install qdrant-client[fastembed]\n```\n\n或者\n\n```bash\npip install qdrant-client[fastembed-gpu]\n```\n\n在 zsh 中，你可能需要使用引号：```pip install 'qdrant-client[fastembed]'```。\n\n```python\nfrom qdrant_client import QdrantClient, models\n\n# 初始化客户端\nclient = QdrantClient(\"localhost\", port=6333) # 用于生产环境\n# client = QdrantClient(\":memory:\") # 用于实验\n\nmodel_name = \"sentence-transformers\u002Fall-MiniLM-L6-v2\"\npayload = [\n    {\"document\": \"Qdrant 具有 Langchain 集成\", \"source\": \"Langchain 文档\"},\n    {\"document\": \"Qdrant 同样具有 Llama Index 集成\", \"source\": \"LlamaIndex 
文档\"},\n]\ndocs = [models.Document(text=data[\"document\"], model=model_name) for data in payload]\nids = [42, 2]\n\nclient.create_collection(\n    \"demo_collection\",\n    vectors_config=models.VectorParams(\n        size=client.get_embedding_size(model_name), distance=models.Distance.COSINE)\n)\n\nclient.upload_collection(\n    collection_name=\"demo_collection\",\n    vectors=docs,\n    ids=ids,\n    payload=payload,\n)\n\nsearch_result = client.query_points(\n    collection_name=\"demo_collection\",\n    query=models.Document(text=\"这是一份查询文档\", model=model_name)\n).points\nprint(search_result)\n```","# FastEmbed 快速上手指南\n\nFastEmbed 是一个轻量级、高性能的 Python 嵌入生成库，专为无服务器（Serverless）环境（如 AWS Lambda）和大规模数据处理设计。它基于 ONNX Runtime，无需 GPU 即可运行，且精度优于 OpenAI Ada-002。\n\n## 环境准备\n\n*   **系统要求**：支持 Linux、macOS 和 Windows。\n*   **Python 版本**：建议 Python 3.10 及以上（项目已先后移除对 Python 3.8 和 3.9 的支持）。\n*   **前置依赖**：无需额外安装 PyTorch 或 TensorFlow，库会自动处理 ONNX Runtime 依赖。\n*   **网络要求**：首次运行时会自动下载模型文件（托管于 Hugging Face），请确保网络畅通。若国内访问受限，建议配置 Hugging Face 镜像环境变量：\n    ```bash\n    export HF_ENDPOINT=https:\u002F\u002Fhf-mirror.com\n    ```\n\n## 安装步骤\n\n使用 pip 进行安装。根据是否需要 GPU 加速选择以下命令之一：\n\n**1. CPU 版本（推荐，轻量通用）**\n```bash\npip install fastembed\n```\n\n**2. GPU 版本（需 NVIDIA GPU 及 CUDA 环境）**\n```bash\npip install fastembed-gpu\n```\n\n## 基本使用\n\n### 1. 生成稠密文本向量 (Dense Text Embeddings)\n\n这是最常用的功能，默认使用 `BAAI\u002Fbge-small-en-v1.5` 模型。\n\n```python\nfrom fastembed import TextEmbedding\n\n# 示例文档列表\ndocuments: list[str] = [\n    \"This is built to be faster and lighter than other embedding libraries e.g. Transformers, Sentence-Transformers, etc.\",\n    \"fastembed is supported by and maintained by Qdrant.\",\n]\n\n# 初始化模型（首次运行会自动下载模型）\nembedding_model = TextEmbedding()\n\n# 生成嵌入向量（返回的是生成器，可转为列表）\nembeddings_list = list(embedding_model.embed(documents))\n\n# 查看结果维度\nprint(f\"Vector dimensions: {len(embeddings_list[0])}\") \n# 输出：Vector dimensions: 384\n```\n\n### 2. 
指定其他模型\n\n你可以从支持的模型列表中指定任意模型：\n\n```python\nfrom fastembed import TextEmbedding\n\n# 指定特定模型\nmodel = TextEmbedding(model_name=\"BAAI\u002Fbge-small-en-v1.5\")\nembeddings = list(model.embed(documents))\n```\n\n### 3. 高级功能示例\n\nFastEmbed 还支持稀疏向量、多模态图像向量及重排序模型：\n\n**稀疏文本向量 (Sparse Text Embeddings)**\n```python\nfrom fastembed import SparseTextEmbedding\n\nmodel = SparseTextEmbedding(model_name=\"prithivida\u002FSplade_PP_en_v1\")\nsparse_embeddings = list(model.embed(documents))\n```\n\n**图像向量 (Image Embeddings)**\n```python\nfrom fastembed import ImageEmbedding\n\nimages = [\".\u002Fpath\u002Fto\u002Fimage1.jpg\", \".\u002Fpath\u002Fto\u002Fimage2.jpg\"]\nmodel = ImageEmbedding(model_name=\"Qdrant\u002Fclip-ViT-B-32-vision\")\nimage_embeddings = list(model.embed(images))\n```\n\n**文本重排序 (Rerankers)**\n```python\nfrom fastembed.rerank.cross_encoder import TextCrossEncoder\n\nquery = \"Who is maintaining Qdrant?\"\nencoder = TextCrossEncoder(model_name=\"Xenova\u002Fms-marco-MiniLM-L-6-v2\")\nscores = list(encoder.rerank(query, documents))\n```\n\n### 4. 
GPU 加速使用\n\n如果你安装了 `fastembed-gpu`，可以通过指定 `providers` 启用 GPU：\n\n```python\nfrom fastembed import TextEmbedding\n\nembedding_model = TextEmbedding(\n    model_name=\"BAAI\u002Fbge-small-en-v1.5\", \n    providers=[\"CUDAExecutionProvider\"]\n)\nembeddings = list(embedding_model.embed(documents))\n```","某初创团队正在构建一个基于 AWS Lambda 的无服务器法律文档检索系统，需要实时将用户上传的合同条款转化为向量以进行语义搜索。\n\n### 没有 fastembed 时\n- **部署包过大**：传统依赖 PyTorch 的库体积高达数 GB，远超 Lambda 函数 50MB 的代码包限制，导致无法直接部署。\n- **冷启动缓慢**：加载重型深度学习框架耗时数秒，用户发起搜索请求时需长时间等待模型初始化。\n- **硬件成本高昂**：为保证推理速度被迫配置 GPU 实例，显著增加了云端运行的算力成本。\n- **开发流程繁琐**：需手动处理复杂的算子转换和环境依赖冲突，维护多语言模型更是难上加难。\n\n### 使用 fastembed 后\n- **极致轻量部署**：fastembed 基于 ONNX Runtime 且无重型依赖，生成的部署包仅几兆字节，完美适配 Serverless 环境。\n- **毫秒级响应**：利用数据并行和优化的推理引擎，模型加载与向量生成速度大幅提升，实现近乎实时的搜索体验。\n- **低成本运行**：无需 GPU 即可在普通 CPU 实例上高效运行，且精度优于 OpenAI Ada-002，大幅降低运营开支。\n- **开箱即用**：一行代码即可调用内置的 Flag Embedding 或多语言模型，支持自定义扩展，让开发者专注于业务逻辑。\n\nfastembed 通过轻量化架构与高性能推理，让资源受限的无服务器场景也能轻松拥有业界领先的语义嵌入能力。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fqdrant_fastembed_2ef1cb20.png","qdrant","Qdrant","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Fqdrant_ae834484.png","Creating advanced vector search technology",null,"info@qdrant.com","qdrant_engine","https:\u002F\u002Fqdrant.tech\u002F","https:\u002F\u002Fgithub.com\u002Fqdrant",[85,89],{"name":86,"color":87,"percentage":88},"Python","#3572A5",86.5,{"name":90,"color":91,"percentage":92},"Jupyter Notebook","#DA5B0B",13.5,2831,193,"2026-04-02T15:36:08","Apache-2.0",1,"Linux, macOS, Windows","非必需。如需 GPU 加速，需安装 fastembed-gpu 包并支持 CUDA 12.x（需 NVIDIA GPU）。默认使用 CPU 和 ONNX Runtime。","未说明（轻量级库，设计用于无服务器环境如 AWS Lambda，暗示内存需求较低）",{"notes":102,"python":103,"dependencies":104},"该库主打轻量快速，默认不依赖 PyTorch 或下载数 GB 的深度学习框架，而是基于 ONNX Runtime 运行，适合无服务器（Serverless）场景。支持文本、图像、稀疏向量及重排序模型。若需 GPU 加速，必须单独安装 'fastembed-gpu' 包并配置 CUDA 12.x 环境。首次运行时会自动下载模型文件。","未说明",[105,67,106],"onnxruntime","fastembed-gpu 
(可选)",[54,13],[109,110,111,112,113,114],"embeddings","openai","rag","retrieval","retrieval-augmented-generation","vector-search","2026-03-27T02:49:30.150509","2026-04-06T05:16:05.565094",[118,123,128,133,138,142],{"id":119,"question_zh":120,"answer_zh":121,"source_url":122},11054,"遇到 'Unexpected Response: 400 (Bad Request)' 且提示向量维度不匹配（例如期望 384 却得到 1536）怎么办？","这通常是因为集合（Collection）创建时指定的向量维度与当前使用的嵌入模型生成的向量维度不一致。解决方法是删除现有的集合，然后使用正确的维度重新创建。例如，如果错误提示期望 1536 维，请删除旧集合并以 size=1536 重新运行 recreate_collection。LangChain 用户在调用 from_documents() 或 from_texts() 时，系统会自动以正确维度重建集合。","https:\u002F\u002Fgithub.com\u002Fqdrant\u002Ffastembed\u002Fissues\u002F145",{"id":124,"question_zh":125,"answer_zh":126,"source_url":127},11055,"如何解决 'AttributeError: FlagEmbedding object has no attribute embed_documents' 错误？","这是因为直接实例化的 Embedding 对象不符合 LangChain 的接口规范。应改用 langchain.embeddings.fastembed 包中的 FastEmbedEmbeddings 类。代码示例：from langchain.embeddings.fastembed import FastEmbedEmbeddings; embeddings = FastEmbedEmbeddings(model_name=\"BAAI\u002Fbge-base-en-v1.5\", max_length=512)。此外，如果遇到 ChromaDB 兼容性问题，建议尝试使用 Qdrant 作为向量存储后端。","https:\u002F\u002Fgithub.com\u002Fqdrant\u002Ffastembed\u002Fissues\u002F63",{"id":129,"question_zh":130,"answer_zh":131,"source_url":132},11056,"如何在防火墙后或离线环境下使用 FastEmbed，避免程序因尝试联网检查模型而挂起？","即使本地缓存中存在模型文件，FastEmbed 默认仍会尝试连接 Hugging Face 检查更新，导致在防火墙后超时。解决方法是设置环境变量 HF_HUB_OFFLINE=1。这将禁止所有 HTTP 请求，强制库仅使用本地缓存文件。如果在 Docker 中，可在 Dockerfile 中添加 ENV HF_HUB_OFFLINE=1。","https:\u002F\u002Fgithub.com\u002Fqdrant\u002Ffastembed\u002Fissues\u002F218",{"id":134,"question_zh":135,"answer_zh":136,"source_url":137},11057,"如何手动下载模型并从本地路径加载，而不让 FastEmbed 自动从 Hugging Face 下载？","仅传递文件路径通常无效，因为 FastEmbed 依赖 Hugging Face Hub 的缓存格式而非原始文件目录。最可靠的离线方法是先在有网络的环境中运行一次以生成标准缓存，然后在离线环境中设置环境变量 HF_HUB_OFFLINE=1 来阻止联网检查并强制使用缓存。直接指定 local_files_only=True 
有时仍会触发网络请求，配合环境变量使用更为稳妥。","https:\u002F\u002Fgithub.com\u002Fqdrant\u002Ffastembed\u002Fissues\u002F229",{"id":139,"question_zh":140,"answer_zh":141,"source_url":122},11058,"使用 Gemini 或其他模型时遇到向量维度错误（如期望 1536 却得到 768）该如何修复？","这表明向量数据库中的集合维度配置与当前嵌入模型输出的维度不匹配。必须删除现有的集合（例如名为 'grade_9' 的集合），然后重新运行代码。当使用 LangChain 的 from_documents() 或 from_texts() 方法时，系统会根据当前模型自动检测并创建具有正确维度的新集合。",{"id":143,"question_zh":144,"answer_zh":145,"source_url":127},11059,"在 LangChain 中使用 FastEmbed 时，推荐搭配哪个向量数据库以避免兼容性错误？","虽然可以使用 Chroma，但部分用户报告了 'readonly database' 或其他兼容性错误。维护者和社区成员推荐使用 Qdrant 作为替代方案，其在结合 FastEmbed 使用时表现更稳定。代码示例：from langchain.vectorstores import Qdrant; qdrant = Qdrant.from_documents(texts, embeddings, path=\".\u002Flocal_qdrant\", collection_name=\"db1\")。",[147,152,157,162,167,172,177,182,187,192,197,202,207,212,217,222,227,232,237,242],{"id":148,"version":149,"summary_zh":150,"released_at":151},59862,"v0.8.0","# 更改日志\n\n## 功能 🏎️ \n\n* #537 - 如果可用则使用 CUDA，不再强制要求显式设置 `cuda=True`\n* #588 - 新增 colmodernvbert，由 @kacperlukawski 和 @joein 贡献\n\n## 修复 🔧 \n* #534 - 更新 ColBERT 的描述，由 @joein 完成\n* #611 - 修复 onnxruntime 和 pillow 的版本问题，以更好地支持 Python 3.14，并避免由 pillow 引起的安全问题\n* #614 - 尊重 `HF_HUB_OFFLINE` 环境变量，由 @amasolov 实现\n\n## 已弃用功能\n* #574 - 移除对 Python 3.9 的支持，由 @joein 完成\n\n感谢所有为本次发布做出贡献的开发者：@joein、@amasolov、@kacperlukawski。","2026-03-23T17:14:03",{"id":153,"version":154,"summary_zh":155,"released_at":156},59863,"v0.7.4","# 更改日志\n\n## 功能 🏎️ \n\n* #577 - 如果模型从缓存加载，则不进行任何网络请求  @joein   \n* #578 - 暴露 `enable_cpu_mem_arena` ONNX 会话选项，以由 @joein 处理 ONNX Runtime 的内存分配\n* #582 - 解锁 Hugging Face Hub 和 Pillow 的版本  by @joein \n* #583 - 为文本模型添加新的 `token_count` 方法  by @joein @dancixx   \n\n## 修复 🕸️ \n* #585 - 更细粒度的 NumPy 版本控制\n\n感谢所有为本次发布做出贡献的人：@dancixx、@joein 和 @tbung 的审阅工作。","2025-12-05T12:18:02",{"id":158,"version":159,"summary_zh":160,"released_at":161},59864,"v0.7.2","# 更改日志\n\n## 功能 🏎️ \n\n* #542 - 由 @kacperlukawski 和 @joein 添加 MUVERA 后处理模块\n\n## 修复 🕸️ \n* #547 - 
在晚期交互模型的批处理推理中，避免将填充标记嵌入添加到结果中，由 @generall 完成\n\n感谢所有为本次发布做出贡献的人：@kacperlukawski、@generall 和 @joein。","2025-08-25T15:05:56",{"id":163,"version":164,"summary_zh":165,"released_at":166},59865,"v0.7.1","# 更改日志\n\n## 功能 🏎️ \n\n* #532 - 由 @joein 改进了近期变更模型的警告信息\n* #522 - 由 @joein 在自定义模型中池化操作不正确时抛出异常\n* #521 - 由 @joein 添加了属性 `embedding_size` 和类方法 `get_embedding_size`，以便更方便地访问模型维度\n\n## 修复 🕸️ \n* #524 - 由 @joein 修复了在 `parallel > 1` 时将 `specific_model_path` 和 `local_files_only` 传递到 `embed` 的问题","2025-06-16T09:06:43",{"id":168,"version":169,"summary_zh":170,"released_at":171},59866,"v0.7.0","# 更改日志\n\n## 功能 🏎️ \n\n* #513 - 新的具有语义理解能力的稀疏嵌入模型：MiniCOIL (`Qdrant\u002Fminicoil-v1`)，由 @generall 提供  \n\n## 修复 🕸️ \n* #506 - 修复 BM25 中支持的语言列表，由 @joein 完成","2025-05-13T14:32:19",{"id":173,"version":174,"summary_zh":175,"released_at":176},59867,"v0.6.1","# 更改日志\n\n## 功能 🏎️ \n\n* #490 - 在从自定义 URL 加载模型时弃用旧的归档结构，转而使用 `model_name.tar.gz`，以方便添加自定义模型，由 @joein 实现\n* #492 - 保留嵌入向量的类型设置，使其与模型一致，从而支持更低精度的输出，由 @joein 实现\n* #496 - 支持自定义重排序器，由 @I8dNLo 实现\n\n## 修复 🕸️ \n\n* #499 - 修复 Splade Hugging Face 源，该问题在某些情况下会导致模型下载错误，由 @joein 实现\n","2025-04-10T13:51:36",{"id":178,"version":179,"summary_zh":180,"released_at":181},59868,"v0.6.0","# 更改日志\n\n## 功能 🏎️ \n\n* #427 - 引入 [LateInteractionMultimodalEmbedding](https:\u002F\u002Fgithub.com\u002Fqdrant\u002Ffastembed\u002Fblob\u002F6cda2ce7f06aca19db8b15c62fb3475e1d4ead2d\u002Ffastembed\u002Flate_interaction_multimodal\u002Flate_interaction_multimodal_embedding.py#L14)，配合 @I8dNLo 和 @joein 的 [ColPali-v1.3](https:\u002F\u002Fhuggingface.co\u002Fvidore\u002Fcolpali-v1.3) 使用。\n* #463 - 引入密集文本自定义模型（如果模型遵循相同的预处理流程，即使 fastembed 不直接支持，也可以使用）, 由 @joein 实现。\n* #428 - 添加 jina embedding v3 模型，由 @hh-space-invader 完成。\n* #443 - 支持从 `specific_model_path` 加载模型，绕过 Hugging Face 的文件结构，由 @I8dNLo 实现。\n* #450 - 扩展类型覆盖范围，将 fastembed 标记为类型化包（同时包括 #451、#453、#454、#457、#458、#459、#460、#461、#464、#469、#470、#472），由 @hh-space-invader 和 @joein 共同完成。\n* #440 - 当模型已在缓存中时，隐藏进度条，由 
@hh-space-invader.\n* #455 - Update to pillow\u003C12.0.0, by @I8dNLo.\n* #484 - Update to mmh3\u003C6.0.0, by @joein.\n\n\n## Fixes 🕸️ \n\n* #486 - Fix thenlper\u002Fgte-large to use mean pooling of token embeddings instead of the CLS embedding, by @joein.\n* #436 - Fix paraphrase-multilingual-MiniLM-L12-v2 to skip normalization and use mean pooling of token embeddings instead of the CLS embedding, by @hh-space-invader.\n* #445 - Fix paraphrase-multilingual-mpnet-base-v2 and intfloat\u002Fmultilingual-e5-large in the same way: no normalization, and mean pooling of token embeddings instead of the CLS embedding, by @I8dNLo.\n\nThanks to everyone who contributed to this release!","2025-02-26T13:55:32",{"id":183,"version":184,"summary_zh":185,"released_at":186},59869,"v0.5.1","# Change Log\n\n## Fixes 🐛 \n* #439 - Fix installation on Python 3.13 by making `onnx` an optional dependency while keeping `onnxruntime` as a required dependency, by @joein.","2025-01-20T10:43:31",{"id":188,"version":189,"summary_zh":190,"released_at":191},59870,"v0.5.0","## Features 📖\n#403 - Drop support for Python 3.8, by @joein\n#404 - Add support for Python 3.13, by @joein\n#406 - Improve the model caching progress bar, by @hh-space-invader\n#422 - Add a multi-GPU example, by @hh-space-invader\n#425 - Warn the user when providers and CUDA are specified together, by @hh-space-invader \n___\n## Models 🧠\n#405 - Support Jina Embeddings v2 models, by @hh-space-invader\n#408 - Add the Jina CLIP v1 model, by @hh-space-invader\n#415 - Add support for the thenlper\u002Fgte-base model, by @hh-space-invader \n#419 - Introduce parallel processing and a pairwise API for cross-encoders, by @I8dNLo\n#429 - All models are now Hugging Face (hf) compatible, by @I8dNLo\n___\n## Fixes 🐛\n#413 - Fix a shape mismatch in the ColBERT model, by @hh-space-invader","2024-12-24T19:53:25",{"id":193,"version":194,"summary_zh":195,"released_at":196},59871,"v0.4.2","# Change Log\n\n## Features 📖 \n* #380 - Add a NOTICE file to support models with restrictive licenses, by @hh-space-invader and @joein\n* #362 - Unlock NumPy v2, by @amietn\n* #375 - Add Jina rerankers, by @hh-space-invader\n\n## Models 🧠 \n* #378 - jinaai\u002Fjina-colbert-v2, by @hh-space-invader\n* #375 - jinaai\u002Fjina-reranker-v1-tiny-en, jinaai\u002Fjina-reranker-v1-turbo-en, jinaai\u002Fjina-reranker-v2-base-multilingual, by @hh-space-invader\n\n## Fixes 🐛 \n* #337 - Temporarily pin the onnxruntime version to \u003C1.20 due to recent issues, by @hh-space-invader 
\n","2024-11-13T13:41:59",{"id":198,"version":199,"summary_zh":200,"released_at":201},59872,"v0.4.1","# Change Log\r\n\r\n## Features 📢 \r\n\r\n* #366 - replace pystemmer with py-rust-stemmers by @I8dNLo\r\n\r\n","2024-10-21T20:30:12",{"id":203,"version":204,"summary_zh":205,"released_at":206},59873,"v0.4.0","# Change Log\r\n\r\n## Features 📢 \r\n\r\n* #355 - add rerankers support by @celinehoang177 @joein @I8dNLo \r\n* #358 - add multi-gpu support by @joein @generall @hh-space-invader \r\n* #364 - add license info to models descriptions by @hh-space-invader \r\n\r\n## Models 🧠 \r\n* #355 - rerankers ms-marco-MiniLM-L-6-v2, ms-marco-MiniLM-L-12-v2, bge-reranker-base by @celinehoang177 \r\n\r\n## Fixes 🐛 \r\n\r\n* #337 - lowercasing words in bm25 by @n0x29a \r\n* #339 - fix bm25 preprocessing by @I8dNLo \r\n* #340 - fix hanging of the main process when child processes got unexpectedly killed by @hh-space-invader \r\n\r\nThanks to everyone who contributed to the current release \r\n@celinehoang177 @I8dNLo @generall @hh-space-invader @n0x29a @joein \r\n","2024-10-21T18:19:56",{"id":208,"version":209,"summary_zh":210,"released_at":211},59874,"v0.3.5","# Change Log\r\n\r\n## Features 📢 \r\n\r\n* #315 - accept PIL images as input in image models by @I8dNLo \r\n\r\n## Models 🧠 \r\n* #301 - jinaai\u002Fjina-embeddings-v2-base-code by @Anush008 \r\n* #318 - added a set of new languages to BM25 @I8dNLo \r\n* #330 - answerdotai\u002Fanswerai-colbert-small-v1 by @I8dNLo \r\n\r\n## Fixes 🐛 \r\n\r\n* #319 - fix cutting `model_max_length` down in the case of context windows larger than 512 by @I8dNLo \r\n\r\nThanks to everyone who contributed to the current release\r\n@I8dNLo @Anush008 @mrscoopers ","2024-08-23T18:16:09",{"id":213,"version":214,"summary_zh":215,"released_at":216},59875,"v0.3.4","# 0.3.4\r\n\r\n## Features 🪄 \r\n- https:\u002F\u002Fgithub.com\u002Fqdrant\u002Ffastembed\u002Fpull\u002F291 - Add Unicom models by @I8dNLo \r\n- 
https:\u002F\u002Fgithub.com\u002Fqdrant\u002Ffastembed\u002Fpull\u002F293 - Add retry logic for model downloading by @joein \r\n\r\n## Bug fixes 🦟 \r\n- https:\u002F\u002Fgithub.com\u002Fqdrant\u002Ffastembed\u002Fpull\u002F280 - Fix pooling for nomic models by @I8dNLo \r\n\r\n","2024-07-17T15:31:05",{"id":218,"version":219,"summary_zh":220,"released_at":221},59876,"v0.3.1","## What's Changed\r\n\r\n## Features\r\n* Add support for jinaai\u002Fjina-embeddings-v2-base-de by @deichrenner in https:\u002F\u002Fgithub.com\u002Fqdrant\u002Ffastembed\u002Fpull\u002F270\r\n* Add BM25 by @joein in https:\u002F\u002Fgithub.com\u002Fqdrant\u002Ffastembed\u002Fpull\u002F274\r\n\r\n## Fixes\r\n* Fix None cache directory in parallel mode by @joein in https:\u002F\u002Fgithub.com\u002Fqdrant\u002Ffastembed\u002Fpull\u002F277\r\n* Fix hybrid search example for pydantic v1 by @joein in https:\u002F\u002Fgithub.com\u002Fqdrant\u002Ffastembed\u002Fpull\u002F263\r\n* Fix MiniLM by @I8dNLo in https:\u002F\u002Fgithub.com\u002Fqdrant\u002Ffastembed\u002Fpull\u002F275\r\n* Fix parameter propagation in parallel mode, fix bm42 parallel by @joein in https:\u002F\u002Fgithub.com\u002Fqdrant\u002Ffastembed\u002Fpull\u002F274\r\n* Pin Numpy \u003C2 by @Anush008 in https:\u002F\u002Fgithub.com\u002Fqdrant\u002Ffastembed\u002Fpull\u002F278\r\n\r\n## Docs\r\n\r\n* Replace Data Source by @NirantK in https:\u002F\u002Fgithub.com\u002Fqdrant\u002Ffastembed\u002Fpull\u002F206\r\n* Add examples with supported type of models into readme by @generall in https:\u002F\u002Fgithub.com\u002Fqdrant\u002Ffastembed\u002Fpull\u002F271\r\n\r\n## New Contributors\r\n* @deichrenner made their first contribution in https:\u002F\u002Fgithub.com\u002Fqdrant\u002Ffastembed\u002Fpull\u002F270. 
Thank you.\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fqdrant\u002Ffastembed\u002Fcompare\u002Fv0.3.0...v0.3.1","2024-06-17T18:05:55",{"id":223,"version":224,"summary_zh":225,"released_at":226},59877,"v0.3.0","# Changelog\r\n\r\n## Features 🪄 \r\n- #219 support for ImageEmbedding by @joein \r\n- #219 CLIP image and text embeddings by @joein \r\n- #246 Resnet50 by @I8dNLo \r\n- #248 support for late interaction embeddings (ColBERT) by @joein \r\n- #235 Bm42 (sparse embeddings with attention) by @generall\r\n\r\n---\r\n## Fixes 🪛 \r\n- #250 #244 unlock huggingface-hub, tokenizers and ruff dependencies, allow versions \u003C1.0 by @joein \r\n---\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fqdrant\u002Ffastembed\u002Fcompare\u002Fv0.2.7...v0.3.0\r\n","2024-06-05T17:11:17",{"id":228,"version":229,"summary_zh":230,"released_at":231},59878,"v0.2.7","# Changelog\r\n\r\n## Features 🪄 \r\n- #214 Add onnx providers setter by @joein \r\n- #224 add gpu support by @joein\r\n- #230 update tokenizers by @joein \r\n- #201 speed up model downloading by fine-graining file selection by @Anush008 @joein \r\n\r\n---\r\n## Fixes 🪛 \r\n- #179 fix model cache invalidation if gcs downloading was interrupted by @joein \r\n- #223 allow using fastembed behind a firewall utilising cached on disk models by @Thiru-GVT @joein \r\n---\r\n\r\n## New models 🏆 \r\n- #207 Snowflake models by @Anush008 \r\n- #201 nomic-ai\u002Fnomic-embed-text-v1.5-Q by @Anush008 \r\n---\r\n\r\nVarious documentation, workflow and notebooks improvements by @NirantK @generall @arunppsg \r\n\r\n---\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fqdrant\u002Ffastembed\u002Fcompare\u002Fv0.2.6...v0.2.7\r\n","2024-05-03T19:56:28",{"id":233,"version":234,"summary_zh":235,"released_at":236},59879,"v0.2.6","## What's Changed\r\n* feat: support mixedbread-ai\u002Fmxbai-embed-large-v1 by @yuvraj-wale in https:\u002F\u002Fgithub.com\u002Fqdrant\u002Ffastembed\u002Fpull\u002F158\r\n* fix: 
case-insensitive check model_management.py by @Anush008 in https:\u002F\u002Fgithub.com\u002Fqdrant\u002Ffastembed\u002Fpull\u002F160\r\n* Add misspelled version of SPLADE++ model for English by @NirantK in https:\u002F\u002Fgithub.com\u002Fqdrant\u002Ffastembed\u002Fpull\u002F161\r\n* Hybrid Search Tutorial by @NirantK in https:\u002F\u002Fgithub.com\u002Fqdrant\u002Ffastembed\u002Fpull\u002F165\r\n* fix: unify existing patterns, remove redundant by @joein in https:\u002F\u002Fgithub.com\u002Fqdrant\u002Ffastembed\u002Fpull\u002F168\r\n* Fix spladepp parallelism by @joein in https:\u002F\u002Fgithub.com\u002Fqdrant\u002Ffastembed\u002Fpull\u002F169\r\n* Update ruff by @joein in https:\u002F\u002Fgithub.com\u002Fqdrant\u002Ffastembed\u002Fpull\u002F172\r\n* new: simplify imports by @joein in https:\u002F\u002Fgithub.com\u002Fqdrant\u002Ffastembed\u002Fpull\u002F171\r\n* fix: fix model sizes in supported models lists by @joein in https:\u002F\u002Fgithub.com\u002Fqdrant\u002Ffastembed\u002Fpull\u002F167\r\n* Update size_in_GB for BAAI\u002Fbge-small-en-v1.5 model by @NirantK in https:\u002F\u002Fgithub.com\u002Fqdrant\u002Ffastembed\u002Fpull\u002F176\r\n* refactoring: update imports in notebooks by @joein in https:\u002F\u002Fgithub.com\u002Fqdrant\u002Ffastembed\u002Fpull\u002F173\r\n\r\n## New Contributors\r\n* @yuvraj-wale made their first contribution in https:\u002F\u002Fgithub.com\u002Fqdrant\u002Ffastembed\u002Fpull\u002F158\r\n* @joein made their first contribution in https:\u002F\u002Fgithub.com\u002Fqdrant\u002Ffastembed\u002Fpull\u002F168\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fqdrant\u002Ffastembed\u002Fcompare\u002Fv0.2.5...v0.2.6","2024-04-01T15:30:37",{"id":238,"version":239,"summary_zh":240,"released_at":241},59880,"v0.2.5","## What's Changed\r\n* Make debugging easier: Add import statement for version debugging by @NirantK in https:\u002F\u002Fgithub.com\u002Fqdrant\u002Ffastembed\u002Fpull\u002F151\r\n* Fix model name typo + 
Add SPLADE notebook by @NirantK in https:\u002F\u002Fgithub.com\u002Fqdrant\u002Ffastembed\u002Fpull\u002F155\r\n* Case Insensitive model name checks by @Anush008 in https:\u002F\u002Fgithub.com\u002Fqdrant\u002Ffastembed\u002Fpull\u002F157\r\n\r\n## Contributors\r\n* Add CONTRIBUTING.md file with guidelines for contributing to FastEmbed by @NirantK in https:\u002F\u002Fgithub.com\u002Fqdrant\u002Ffastembed\u002Fpull\u002F150\r\n* Fix Issue Template forms by @NirantK in https:\u002F\u002Fgithub.com\u002Fqdrant\u002Ffastembed\u002Fpull\u002F152\r\n* Move CONTRIBUTING.md + Add Test for Adding New Models by @NirantK in https:\u002F\u002Fgithub.com\u002Fqdrant\u002Ffastembed\u002Fpull\u002F154\r\n \r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fqdrant\u002Ffastembed\u002Fcompare\u002Fv0.2.4...v0.2.5","2024-03-20T13:44:11",{"id":243,"version":244,"summary_zh":245,"released_at":246},59881,"v0.2.4","## 🦠 Bug Fix\r\n* Fix splade for single input by @generall in https:\u002F\u002Fgithub.com\u002Fqdrant\u002Ffastembed\u002Fpull\u002F148\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fqdrant\u002Ffastembed\u002Fcompare\u002Fv0.2.3...v0.2.4","2024-03-13T18:26:53"]
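The dimension-mismatch FAQ above reduces to one invariant: a collection is created for a fixed embedding dimension, vectors of any other size must be rejected, and the fix is to drop and recreate the collection for the current model. The following is a minimal, library-free sketch of that invariant; `Collection`, `DimensionMismatch`, and `recreate_for_model` are hypothetical names for illustration, not fastembed, LangChain, or Qdrant APIs.

```python
# Hypothetical sketch: why inserting 768-d (Gemini-style) vectors into a
# collection created for a 1536-d model fails, and why recreating the
# collection for the current model fixes it.

class DimensionMismatch(Exception):
    pass

class Collection:
    def __init__(self, name: str, dim: int):
        self.name = name
        self.dim = dim                      # fixed at creation time
        self.vectors: list[list[float]] = []

    def insert(self, vector: list[float]) -> None:
        if len(vector) != self.dim:
            raise DimensionMismatch(
                f"collection '{self.name}' expects {self.dim} dims, "
                f"got {len(vector)}; delete and recreate the collection "
                "for the current embedding model"
            )
        self.vectors.append(vector)

def recreate_for_model(store: dict, name: str, model_dim: int) -> Collection:
    # Mirrors what from_documents()/from_texts() do implicitly: build a
    # fresh collection sized for the current embedding model.
    store.pop(name, None)
    store[name] = Collection(name, model_dim)
    return store[name]

store: dict = {"grade_9": Collection("grade_9", 1536)}  # created for a 1536-d model
try:
    store["grade_9"].insert([0.0] * 768)    # 768-d vector: rejected
except DimensionMismatch:
    coll = recreate_for_model(store, "grade_9", 768)
    coll.insert([0.0] * 768)                # now succeeds
```

The point of the sketch is that the dimension lives with the collection, not with the model, which is why switching embedding models without recreating the collection triggers the error described in the FAQ.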