[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-openai--chatgpt-retrieval-plugin":3,"tool-openai--chatgpt-retrieval-plugin":64},[4,17,25,39,48,56],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":16},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",138956,2,"2026-04-05T11:33:21",[13,14,15],"开发框架","Agent","语言模型","ready",{"id":18,"name":19,"github_repo":20,"description_zh":21,"stars":22,"difficulty_score":10,"last_commit_at":23,"category_tags":24,"status":16},3704,"NextChat","ChatGPTNextWeb\u002FNextChat","NextChat 是一款轻量且极速的 AI 助手，旨在为用户提供流畅、跨平台的大模型交互体验。它完美解决了用户在多设备间切换时难以保持对话连续性，以及面对众多 AI 模型不知如何统一管理的痛点。无论是日常办公、学习辅助还是创意激发，NextChat 都能让用户随时随地通过网页、iOS、Android、Windows、MacOS 或 Linux 端无缝接入智能服务。\n\n这款工具非常适合普通用户、学生、职场人士以及需要私有化部署的企业团队使用。对于开发者而言，它也提供了便捷的自托管方案，支持一键部署到 Vercel 或 Zeabur 等平台。\n\nNextChat 的核心亮点在于其广泛的模型兼容性，原生支持 Claude、DeepSeek、GPT-4 及 Gemini Pro 等主流大模型，让用户在一个界面即可自由切换不同 AI 能力。此外，它还率先支持 MCP（Model Context Protocol）协议，增强了上下文处理能力。针对企业用户，NextChat 提供专业版解决方案，具备品牌定制、细粒度权限控制、内部知识库整合及安全审计等功能，满足公司对数据隐私和个性化管理的高标准要求。",87618,"2026-04-05T07:20:52",[13,15],{"id":26,"name":27,"github_repo":28,"description_zh":29,"stars":30,"difficulty_score":10,"last_commit_at":31,"category_tags":32,"status":16},2268,"ML-For-Beginners","microsoft\u002FML-For-Beginners","ML-For-Beginners 是由微软推出的一套系统化机器学习入门课程，旨在帮助零基础用户轻松掌握经典机器学习知识。这套课程将学习路径规划为 12 周，包含 26 节精炼课程和 52 
道配套测验，内容涵盖从基础概念到实际应用的完整流程，有效解决了初学者面对庞大知识体系时无从下手、缺乏结构化指导的痛点。\n\n无论是希望转型的开发者、需要补充算法背景的研究人员，还是对人工智能充满好奇的普通爱好者，都能从中受益。课程不仅提供了清晰的理论讲解，还强调动手实践，让用户在循序渐进中建立扎实的技能基础。其独特的亮点在于强大的多语言支持，通过自动化机制提供了包括简体中文在内的 50 多种语言版本，极大地降低了全球不同背景用户的学习门槛。此外，项目采用开源协作模式，社区活跃且内容持续更新，确保学习者能获取前沿且准确的技术资讯。如果你正寻找一条清晰、友好且专业的机器学习入门之路，ML-For-Beginners 将是理想的起点。",84991,"2026-04-05T10:45:23",[33,34,35,36,14,37,15,13,38],"图像","数据工具","视频","插件","其他","音频",{"id":40,"name":41,"github_repo":42,"description_zh":43,"stars":44,"difficulty_score":45,"last_commit_at":46,"category_tags":47,"status":16},3128,"ragflow","infiniflow\u002Fragflow","RAGFlow 是一款领先的开源检索增强生成（RAG）引擎，旨在为大语言模型构建更精准、可靠的上下文层。它巧妙地将前沿的 RAG 技术与智能体（Agent）能力相结合，不仅支持从各类文档中高效提取知识，还能让模型基于这些知识进行逻辑推理和任务执行。\n\n在大模型应用中，幻觉问题和知识滞后是常见痛点。RAGFlow 通过深度解析复杂文档结构（如表格、图表及混合排版），显著提升了信息检索的准确度，从而有效减少模型“胡编乱造”的现象，确保回答既有据可依又具备时效性。其内置的智能体机制更进一步，使系统不仅能回答问题，还能自主规划步骤解决复杂问题。\n\n这款工具特别适合开发者、企业技术团队以及 AI 研究人员使用。无论是希望快速搭建私有知识库问答系统，还是致力于探索大模型在垂直领域落地的创新者，都能从中受益。RAGFlow 提供了可视化的工作流编排界面和灵活的 API 接口，既降低了非算法背景用户的上手门槛，也满足了专业开发者对系统深度定制的需求。作为基于 Apache 2.0 协议开源的项目，它正成为连接通用大模型与行业专有知识之间的重要桥梁。",77062,3,"2026-04-04T04:44:48",[14,33,13,15,37],{"id":49,"name":50,"github_repo":51,"description_zh":52,"stars":53,"difficulty_score":45,"last_commit_at":54,"category_tags":55,"status":16},519,"PaddleOCR","PaddlePaddle\u002FPaddleOCR","PaddleOCR 是一款基于百度飞桨框架开发的高性能开源光学字符识别工具包。它的核心能力是将图片、PDF 等文档中的文字提取出来，转换成计算机可读取的结构化数据，让机器真正“看懂”图文内容。\n\n面对海量纸质或电子文档，PaddleOCR 解决了人工录入效率低、数字化成本高的问题。尤其在人工智能领域，它扮演着连接图像与大型语言模型（LLM）的桥梁角色，能将视觉信息直接转化为文本输入，助力智能问答、文档分析等应用场景落地。\n\nPaddleOCR 适合开发者、算法研究人员以及有文档自动化需求的普通用户。其技术优势十分明显：不仅支持全球 100 多种语言的识别，还能在 Windows、Linux、macOS 等多个系统上运行，并灵活适配 CPU、GPU、NPU 等各类硬件。作为一个轻量级且社区活跃的开源项目，PaddleOCR 既能满足快速集成的需求，也能支撑前沿的视觉语言研究，是处理文字识别任务的理想选择。",74913,"2026-04-05T10:44:17",[15,33,13,37],{"id":57,"name":58,"github_repo":59,"description_zh":60,"stars":61,"difficulty_score":45,"last_commit_at":62,"category_tags":63,"status":16},2181,"OpenHands","OpenHands\u002FOpenHands","OpenHands 是一个专注于 AI 
驱动开发的开源平台，旨在让智能体（Agent）像人类开发者一样理解、编写和调试代码。它解决了传统编程中重复性劳动多、环境配置复杂以及人机协作效率低等痛点，通过自动化流程显著提升开发速度。\n\n无论是希望提升编码效率的软件工程师、探索智能体技术的研究人员，还是需要快速原型验证的技术团队，都能从中受益。OpenHands 提供了灵活多样的使用方式：既可以通过命令行（CLI）或本地图形界面在个人电脑上轻松上手，体验类似 Devin 的流畅交互；也能利用其强大的 Python SDK 自定义智能体逻辑，甚至在云端大规模部署上千个智能体并行工作。\n\n其核心技术亮点在于模块化的软件智能体 SDK，这不仅构成了平台的引擎，还支持高度可组合的开发模式。此外，OpenHands 在 SWE-bench 基准测试中取得了 77.6% 的优异成绩，证明了其解决真实世界软件工程问题的能力。平台还具备完善的企业级功能，支持与 Slack、Jira 等工具集成，并提供细粒度的权限管理，适合从个人开发者到大型企业的各类用户场景。",70612,"2026-04-05T11:12:22",[15,14,13,36],{"id":65,"github_repo":66,"name":67,"description_en":68,"description_zh":69,"ai_summary_zh":70,"readme_en":71,"readme_zh":72,"quickstart_zh":73,"use_case_zh":74,"hero_image_url":75,"owner_login":76,"owner_name":77,"owner_avatar_url":78,"owner_bio":79,"owner_company":80,"owner_location":80,"owner_email":80,"owner_twitter":80,"owner_website":81,"owner_url":82,"languages":83,"stars":96,"forks":97,"last_commit_at":98,"license":99,"difficulty_score":100,"env_os":101,"env_gpu":102,"env_ram":102,"env_deps":103,"category_tags":111,"github_topics":112,"view_count":10,"oss_zip_url":80,"oss_zip_packed_at":80,"status":16,"created_at":115,"updated_at":116,"faqs":117,"releases":148},3595,"openai\u002Fchatgpt-retrieval-plugin","chatgpt-retrieval-plugin","The ChatGPT Retrieval Plugin lets you easily find personal or work documents by asking questions in natural language.","chatgpt-retrieval-plugin 是一个专为 ChatGPT 打造的检索后端工具，旨在让 AI 能够通过自然语言问答，精准查找并引用您的个人或工作文档。它有效解决了大模型无法直接访问用户私有数据、容易产生“幻觉”或缺乏上下文信息的痛点，将通用 AI 转化为懂您专属知识的智能助手。\n\n这款工具主要面向开发者和技术研究人员。如果您希望构建自定义 GPT，或者需要比原生文件上传更细粒度的控制权（例如自定义文本分块长度、选择特定的嵌入模型或向量数据库），那么它是非常理想的选择。普通用户若仅需求简单的文件问答，直接使用 ChatGPT 原生功能即可；但若您想深度定制检索逻辑，chatgpt-retrieval-plugin 则提供了强大的灵活性。\n\n其技术亮点在于架构的模块化与兼容性。它不仅支持 Pinecone、Elasticsearch 等多种主流向量数据库，还内置了文档分块、元数据提取及隐私信息（PII）检测等实用服务。通过 FastAPI 服务器，它可以轻松集成到 Custom GPTs、Function Calling 或 Assistants API 中，帮助开发者快","chatgpt-retrieval-plugin 是一个专为 ChatGPT 打造的检索后端工具，旨在让 AI 
能够通过自然语言问答，精准查找并引用您的个人或工作文档。它有效解决了大模型无法直接访问用户私有数据、容易产生“幻觉”或缺乏上下文信息的痛点，将通用 AI 转化为懂您专属知识的智能助手。\n\n这款工具主要面向开发者和技术研究人员。如果您希望构建自定义 GPT，或者需要比原生文件上传更细粒度的控制权（例如自定义文本分块长度、选择特定的嵌入模型或向量数据库），那么它是非常理想的选择。普通用户若仅需求简单的文件问答，直接使用 ChatGPT 原生功能即可；但若您想深度定制检索逻辑，chatgpt-retrieval-plugin 则提供了强大的灵活性。\n\n其技术亮点在于架构的模块化与兼容性。它不仅支持 Pinecone、Elasticsearch 等多种主流向量数据库，还内置了文档分块、元数据提取及隐私信息（PII）检测等实用服务。通过 FastAPI 服务器，它可以轻松集成到 Custom GPTs、Function Calling 或 Assistants API 中，帮助开发者快速搭建安全、高效且可高度定制的私有知识库检索系统。","# ChatGPT Retrieval Plugin\n\nBuild Custom GPTs with a Retrieval Plugin backend to give ChatGPT access to personal documents.\n![Example Custom GPT Screenshot](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fopenai_chatgpt-retrieval-plugin_readme_ecce11d9acc7.png)\n\n## Introduction\n\nThe ChatGPT Retrieval Plugin repository provides a flexible solution for semantic search and retrieval of personal or organizational documents using natural language queries. It is a standalone retrieval backend, and can be used with [ChatGPT custom GPTs](https:\u002F\u002Fchat.openai.com\u002Fgpts\u002Fdiscovery), [function calling](https:\u002F\u002Fplatform.openai.com\u002Fdocs\u002Fguides\u002Ffunction-calling) with the [chat completions](https:\u002F\u002Fplatform.openai.com\u002Fdocs\u002Fguides\u002Ftext-generation) or [assistants APIs](https:\u002F\u002Fplatform.openai.com\u002Fdocs\u002Fassistants\u002Foverview), or with the [ChatGPT plugins model (deprecated)](https:\u002F\u002Fchat.openai.com\u002F?model=gpt-4-plugins). ChatGPT and the Assistants API both natively support retrieval from uploaded files, so you should use the Retrieval Plugin as a backend only if you want more granular control of your retrieval system (e.g. 
document text chunk length, embedding model \u002F size, etc.).\n\nThe repository is organized into several directories:\n\n| Directory                       | Description                                                                                                                |\n| ------------------------------- | -------------------------------------------------------------------------------------------------------------------------- |\n| [`datastore`](\u002Fdatastore)       | Contains the core logic for storing and querying document embeddings using various vector database providers.              |\n| [`docs`](\u002Fdocs)                 | Includes documentation for setting up and using each vector database provider, webhooks, and removing unused dependencies. |\n| [`examples`](\u002Fexamples)         | Provides example configurations, authentication methods, and provider-specific examples.                                   |\n| [`local_server`](\u002Flocal_server) | Contains an implementation of the Retrieval Plugin configured for localhost testing.                                       |\n| [`models`](\u002Fmodels)             | Contains the data models used by the plugin, such as document and metadata models.                                         |\n| [`scripts`](\u002Fscripts)           | Offers scripts for processing and uploading documents from different data sources.                                         |\n| [`server`](\u002Fserver)             | Houses the main FastAPI server implementation.                                                                             |\n| [`services`](\u002Fservices)         | Contains utility services for tasks like chunking, metadata extraction, and PII detection.                                 |\n| [`tests`](\u002Ftests)               | Includes integration tests for various vector database providers.                                                          
|\n| [`.well-known`](\u002F.well-known)   | Stores the plugin manifest file and OpenAPI schema, which define the plugin configuration and API specification.           |\n\nThis README provides detailed information on how to set up, develop, and deploy the ChatGPT Retrieval Plugin (stand-alone retrieval backend).\n\n## Table of Contents\n\n- [Quickstart](#quickstart)\n- [About](#about)\n  - [Retrieval Plugin](#retrieval-plugin)\n  - [Retrieval Plugin with custom GPTs](#retrieval-plugin-with-custom-gpts)\n  - [Retrieval Plugin with function calling](#retrieval-plugin-with-function-calling)\n  - [Retrieval Plugin with the plugins model (deprecated)](#chatgpt-plugins-model)\n  - [API Endpoints](#api-endpoints)\n  - [Memory Feature](#memory-feature)\n  - [Security](#security)\n  - [Choosing an Embeddings Model](#choosing-an-embeddings-model)\n- [Development](#development)\n  - [Setup](#setup)\n    - [General Environment Variables](#general-environment-variables)\n  - [Choosing a Vector Database](#choosing-a-vector-database)\n    - [Pinecone](#pinecone)\n    - [Elasticsearch](#elasticsearch)\n    - [MongoDB Atlas](#mongodb-atlas)\n    - [Weaviate](#weaviate)\n    - [Zilliz](#zilliz)\n    - [Milvus](#milvus)\n    - [Qdrant](#qdrant)\n    - [Redis](#redis)\n    - [Llama Index](#llamaindex)\n    - [Chroma](#chroma)\n    - [Azure Cognitive Search](#azure-cognitive-search)\n    - [Azure CosmosDB Mongo vCore](#azure-cosmosdb-mongo-vcore)\n    - [Supabase](#supabase)\n    - [Postgres](#postgres)\n    - [AnalyticDB](#analyticdb)\n  - [Running the API Locally](#running-the-api-locally)\n  - [Personalization](#personalization)\n  - [Authentication Methods](#authentication-methods)\n- [Deployment](#deployment)\n- [Webhooks](#webhooks)\n- [Scripts](#scripts)\n- [Limitations](#limitations)\n- [Contributors](#contributors)\n- [Future Directions](#future-directions)\n\n## Quickstart\n\nFollow these steps to quickly set up and run the ChatGPT Retrieval Plugin:\n\n1. 
Install Python 3.10, if not already installed.\n2. Clone the repository: `git clone https:\u002F\u002Fgithub.com\u002Fopenai\u002Fchatgpt-retrieval-plugin.git`\n3. Navigate to the cloned repository directory: `cd \u002Fpath\u002Fto\u002Fchatgpt-retrieval-plugin`\n4. Install poetry: `pip install poetry`\n5. Create a new virtual environment with Python 3.10: `poetry env use python3.10`\n6. Activate the virtual environment: `poetry shell`\n7. Install app dependencies: `poetry install`\n8. Create a [bearer token](#general-environment-variables)\n9. Set the required environment variables:\n\n   ```\n   export DATASTORE=\u003Cyour_datastore>\n   export BEARER_TOKEN=\u003Cyour_bearer_token>\n   export OPENAI_API_KEY=\u003Cyour_openai_api_key>\n   export EMBEDDING_DIMENSION=256 # edit this value based on the dimension of the embeddings you want to use\n   export EMBEDDING_MODEL=text-embedding-3-large # edit this based on your model preference, e.g. text-embedding-3-small, text-embedding-ada-002\n\n   # Optional environment variables used when running Azure OpenAI\n   export OPENAI_API_BASE=https:\u002F\u002F\u003CAzureOpenAIName>.openai.azure.com\u002F\n   export OPENAI_API_TYPE=azure\n   export OPENAI_EMBEDDINGMODEL_DEPLOYMENTID=\u003CName of embedding model deployment>\n   export OPENAI_METADATA_EXTRACTIONMODEL_DEPLOYMENTID=\u003CName of deployment of model for metadata>\n   export OPENAI_COMPLETIONMODEL_DEPLOYMENTID=\u003CName of general model deployment used for completion>\n   export OPENAI_EMBEDDING_BATCH_SIZE=\u003CBatch size of embedding; for Azure OpenAI, this value needs to be set to 1>\n\n   # Add the environment variables for your chosen vector DB.\n   # Some of these are optional; read the provider's setup docs in \u002Fdocs\u002Fproviders for more information.\n\n   # Pinecone\n   export PINECONE_API_KEY=\u003Cyour_pinecone_api_key>\n   export PINECONE_ENVIRONMENT=\u003Cyour_pinecone_environment>\n   export PINECONE_INDEX=\u003Cyour_pinecone_index>\n\n   # 
Weaviate\n   export WEAVIATE_URL=\u003Cyour_weaviate_instance_url>\n   export WEAVIATE_API_KEY=\u003Cyour_api_key_for_WCS>\n   export WEAVIATE_CLASS=\u003Cyour_optional_weaviate_class>\n\n   # Zilliz\n   export ZILLIZ_COLLECTION=\u003Cyour_zilliz_collection>\n   export ZILLIZ_URI=\u003Cyour_zilliz_uri>\n   export ZILLIZ_USER=\u003Cyour_zilliz_username>\n   export ZILLIZ_PASSWORD=\u003Cyour_zilliz_password>\n\n   # Milvus\n   export MILVUS_COLLECTION=\u003Cyour_milvus_collection>\n   export MILVUS_HOST=\u003Cyour_milvus_host>\n   export MILVUS_PORT=\u003Cyour_milvus_port>\n   export MILVUS_USER=\u003Cyour_milvus_username>\n   export MILVUS_PASSWORD=\u003Cyour_milvus_password>\n\n   # Qdrant\n   export QDRANT_URL=\u003Cyour_qdrant_url>\n   export QDRANT_PORT=\u003Cyour_qdrant_port>\n   export QDRANT_GRPC_PORT=\u003Cyour_qdrant_grpc_port>\n   export QDRANT_API_KEY=\u003Cyour_qdrant_api_key>\n   export QDRANT_COLLECTION=\u003Cyour_qdrant_collection>\n\n   # AnalyticDB\n   export PG_HOST=\u003Cyour_analyticdb_host>\n   export PG_PORT=\u003Cyour_analyticdb_port>\n   export PG_USER=\u003Cyour_analyticdb_username>\n   export PG_PASSWORD=\u003Cyour_analyticdb_password>\n   export PG_DATABASE=\u003Cyour_analyticdb_database>\n   export PG_COLLECTION=\u003Cyour_analyticdb_collection>\n\n\n   # Redis\n   export REDIS_HOST=\u003Cyour_redis_host>\n   export REDIS_PORT=\u003Cyour_redis_port>\n   export REDIS_PASSWORD=\u003Cyour_redis_password>\n   export REDIS_INDEX_NAME=\u003Cyour_redis_index_name>\n   export REDIS_DOC_PREFIX=\u003Cyour_redis_doc_prefix>\n   export REDIS_DISTANCE_METRIC=\u003Cyour_redis_distance_metric>\n   export REDIS_INDEX_TYPE=\u003Cyour_redis_index_type>\n\n   # Llama\n   export LLAMA_INDEX_TYPE=\u003Cgpt_vector_index_type>\n   export LLAMA_INDEX_JSON_PATH=\u003Cpath_to_saved_index_json_file>\n   export LLAMA_QUERY_KWARGS_JSON_PATH=\u003Cpath_to_saved_query_kwargs_json_file>\n   export LLAMA_RESPONSE_MODE=\u003Cresponse_mode_for_query>\n\n   # Chroma\n   
export CHROMA_COLLECTION=\u003Cyour_chroma_collection>\n   export CHROMA_IN_MEMORY=\u003Ctrue_or_false>\n   export CHROMA_PERSISTENCE_DIR=\u003Cyour_chroma_persistence_directory>\n   export CHROMA_HOST=\u003Cyour_chroma_host>\n   export CHROMA_PORT=\u003Cyour_chroma_port>\n\n   # Azure Cognitive Search\n   export AZURESEARCH_SERVICE=\u003Cyour_search_service_name>\n   export AZURESEARCH_INDEX=\u003Cyour_search_index_name>\n   export AZURESEARCH_API_KEY=\u003Cyour_api_key> (optional, uses key-free managed identity if not set)\n\n   # Azure CosmosDB Mongo vCore\n   export AZCOSMOS_API = \u003Cyour azure cosmos db api, for now it only supports mongo>\n   export AZCOSMOS_CONNSTR = \u003Cyour azure cosmos db mongo vcore connection string>\n   export AZCOSMOS_DATABASE_NAME = \u003Cyour mongo database name>\n   export AZCOSMOS_CONTAINER_NAME = \u003Cyour mongo container name>\n\n   # Supabase\n   export SUPABASE_URL=\u003Csupabase_project_url>\n   export SUPABASE_ANON_KEY=\u003Csupabase_project_api_anon_key>\n\n   # Postgres\n   export PG_HOST=\u003Cpostgres_host>\n   export PG_PORT=\u003Cpostgres_port>\n   export PG_USER=\u003Cpostgres_user>\n   export PG_PASSWORD=\u003Cpostgres_password>\n   export PG_DB=\u003Cpostgres_database>\n\n   # Elasticsearch\n   export ELASTICSEARCH_URL=\u003Celasticsearch_host_and_port> (either specify host or cloud_id)\n   export ELASTICSEARCH_CLOUD_ID=\u003Celasticsearch_cloud_id>\n\n   export ELASTICSEARCH_USERNAME=\u003Celasticsearch_username>\n   export ELASTICSEARCH_PASSWORD=\u003Celasticsearch_password>\n   export ELASTICSEARCH_API_KEY=\u003Celasticsearch_api_key>\n\n   export ELASTICSEARCH_INDEX=\u003Celasticsearch_index_name>\n   export ELASTICSEARCH_REPLICAS=\u003Celasticsearch_replicas>\n   export ELASTICSEARCH_SHARDS=\u003Celasticsearch_shards>\n\n   # MongoDB Atlas\n   export MONGODB_URI=\u003Cmongodb_uri>\n   export MONGODB_DATABASE=\u003Cmongodb_database>\n   export MONGODB_COLLECTION=\u003Cmongodb_collection>\n   export 
MONGODB_INDEX=\u003Cmongodb_index>\n   ```\n\n10. Run the API locally: `poetry run start`\n11. Access the API documentation at `http:\u002F\u002F0.0.0.0:8000\u002Fdocs` and test the API endpoints (make sure to add your bearer token).\n\n## About\n\n### Retrieval Plugin\n\nThis is a standalone retrieval backend that can be used with [ChatGPT custom GPTs](https:\u002F\u002Fchat.openai.com\u002Fgpts\u002Fdiscovery), [function calling](https:\u002F\u002Fplatform.openai.com\u002Fdocs\u002Fguides\u002Ffunction-calling) with the [chat completions](https:\u002F\u002Fplatform.openai.com\u002Fdocs\u002Fguides\u002Ftext-generation) or [assistants APIs](https:\u002F\u002Fplatform.openai.com\u002Fdocs\u002Fassistants\u002Foverview), or with the [ChatGPT plugins model (deprecated)](https:\u002F\u002Fchat.openai.com\u002F?model=gpt-4-plugins).\n\nIt enables a model to carry out semantic search and retrieval of personal or organizational documents, and write answers informed by relevant retrieved context (sometimes referred to as \"Retrieval-Augmented Generation\" or \"RAG\"). It allows users to obtain the most relevant document snippets from their data sources, such as files, notes, or emails, by asking questions or expressing needs in natural language. Enterprises can make their internal documents available to their employees through ChatGPT using this plugin.\n\nThe plugin uses OpenAI's embeddings model (`text-embedding-3-large` 256 dimension embeddings by default) to generate embeddings of document chunks, and then stores and queries them using a vector database on the backend. As an open-source and self-hosted solution, developers can deploy their own Retrieval Plugin and register it with ChatGPT. The Retrieval Plugin supports several vector database providers, allowing developers to choose their preferred one from a list.\n\nA FastAPI server exposes the plugin's endpoints for upserting, querying, and deleting documents. 
Users can refine their search results by using metadata filters by source, date, author, or other criteria. The plugin can be hosted on any cloud platform that supports Docker containers, such as Fly.io, Heroku, Render, or Azure Container Apps. To keep the vector database updated with the latest documents, the plugin can process and store documents from various data sources continuously, using incoming webhooks to the upsert and delete endpoints. Tools like [Zapier](https:\u002F\u002Fzapier.com) or [Make](https:\u002F\u002Fwww.make.com) can help configure the webhooks based on events or schedules.\n\n### Retrieval Plugin with Custom GPTs\n\nTo create a custom GPT that can use your Retrieval Plugin for semantic search and retrieval of your documents, and even store new information back to the database, you first need to have deployed a Retrieval Plugin. For detailed instructions on how to do this, please refer to the [Deployment section](#deployment). Once you have your app URL (e.g., `https:\u002F\u002Fyour-app-url.com`), take the following steps:\n\n1. Navigate to the create GPT page at `https:\u002F\u002Fchat.openai.com\u002Fgpts\u002Feditor`.\n2. Follow the standard creation flow to set up your GPT.\n3. Navigate to the \"Configure\" tab. Here, you can manually fill in fields such as name, description, and instructions, or use the smart creator for assistance.\n4. Under the \"Actions\" section, click on \"Create new action\".\n5. Choose an authentication method. The Retrieval Plugin supports None, API key (Basic or Bearer) and OAuth. For more information on these methods, refer to the [Authentication Methods Section](#authentication-methods).\n6. Import the OpenAPI schema. 
You can either:\n   - Import directly from the OpenAPI schema hosted in your app at `https:\u002F\u002Fyour-app-url.com\u002F.well-known\u002Fopenapi.yaml`.\n   - Copy and paste the contents of [this file](\u002F.well-known\u002Fopenapi.yaml) into the Schema input area if you only want to expose the query endpoint to the GPT. Remember to change the URL under the `-servers` section of the OpenAPI schema you paste in.\n7. Optionally, you might want to add a fetch endpoint. This would involve editing the [`\u002Fserver\u002Fmain.py`](\u002Fserver\u002Fmain.py) file to add an endpoint and implement this for your chosen vector database. If you make this change, please consider contributing it back to the project by opening a pull request! Adding the fetch endpoint to the OpenAPI schema would allow the model to fetch more content from a document by ID if some text is cut off in the retrieved result. It might also be useful to pass in a string with the text from the retrieved result and an option to return a fixed length of context before and after the retrieved result.\n8. If you want the GPT to be able to save information back to the vector database, you can give it access to the Retrieval Plugin's `\u002Fupsert` endpoint. To do this, copy the contents of [this file](\u002Fexamples\u002Fmemory\u002Fopenapi.yaml) into the schema area. This allows the GPT to store new information it generates or learns during the conversation. More details on this feature can be found at [Memory Feature](#memory-feature) and [in the docs here](\u002Fexamples\u002Fmemory).\n\nRemember: ChatGPT and custom GPTs natively support retrieval from uploaded files, so you should use the Retrieval Plugin as a backend only if you want more granular control of your retrieval system (e.g. 
self-hosting, embedding chunk length, embedding model \u002F size, etc.).\n\n### Retrieval Plugin with Function Calling\n\nThe Retrieval Plugin can be integrated with function calling in both the [Chat Completions API](https:\u002F\u002Fplatform.openai.com\u002Fdocs\u002Fguides\u002Ffunction-calling) and the [Assistants API](https:\u002F\u002Fplatform.openai.com\u002Fdocs\u002Fassistants\u002Foverview). This allows the model to decide when to use your functions (query, fetch, upsert) based on the conversation context.\n\n#### Function Calling with Chat Completions\n\nIn a call to the chat completions API, you can describe functions and have the model generate a JSON object containing arguments to call one or many functions. The latest models (gpt-3.5-turbo-0125 and gpt-4-turbo-preview) have been trained to detect when a function should be called and to respond with JSON that adheres to the function signature.\n\nYou can define the functions for the Retrieval Plugin endpoints and pass them in as tools when you use the Chat Completions API with one of the latest models. The model will then intelligently call the functions. You can use function calling to write queries to your APIs, call the endpoint on the backend, and return the response as a tool message to the model to continue the conversation. The function definitions\u002Fschemas and an example can be found [here](\u002Fexamples\u002Ffunction-calling\u002F).\n\n#### Function Calling with Assistants API\n\nYou can use the same function definitions with the OpenAI [Assistants API](https:\u002F\u002Fplatform.openai.com\u002Fdocs\u002Fassistants\u002Foverview), specifically the [function calling in tool use](https:\u002F\u002Fplatform.openai.com\u002Fdocs\u002Fassistants\u002Ftools\u002Ffunction-calling). The Assistants API allows you to build AI assistants within your own applications, leveraging models, tools, and knowledge to respond to user queries. 
The function definitions\u002Fschemas and an example can be found [here](\u002Fexamples\u002Ffunction-calling\u002F). The Assistants API natively supports retrieval from uploaded files, so you should use the Retrieval Plugin with function calling only if you want more granular control of your retrieval system (e.g. embedding chunk length, embedding model \u002F size, etc.).\n\nParallel function calling is supported for both the Chat Completions API and the Assistants API. This means you can perform multiple tasks, such as querying something and saving something back to the vector database, in the same message.\n\nRead more about function calling with the Retrieval Plugin [here](\u002Fexamples\u002Ffunction-calling\u002F).\n\n### ChatGPT Plugins Model\n\n(deprecated) We recommend using custom actions with GPTs to make use of the Retrieval Plugin through ChatGPT. Instructions for using retrieval with the deprecated plugins model can be found [here](\u002Fdocs\u002Fdeprecated\u002Fplugins.md).\n\n### API Endpoints\n\nThe Retrieval Plugin is built using FastAPI, a web framework for building APIs with Python. FastAPI allows for easy development, validation, and documentation of API endpoints. Find the FastAPI documentation [here](https:\u002F\u002Ffastapi.tiangolo.com\u002F).\n\nOne of the benefits of using FastAPI is the automatic generation of interactive API documentation with Swagger UI. When the API is running locally, Swagger UI at `\u003Clocal_host_url i.e. http:\u002F\u002F0.0.0.0:8000>\u002Fdocs` can be used to interact with the API endpoints, test their functionality, and view the expected request and response models.\n\nThe plugin exposes the following endpoints for upserting, querying, and deleting documents from the vector database. 
All requests and responses are in JSON format, and require a valid bearer token as an authorization header.\n\n- `\u002Fupsert`: This endpoint allows uploading one or more documents and storing their text and metadata in the vector database. The documents are split into chunks of around 200 tokens, each with a unique ID. The endpoint expects a list of documents in the request body, each with a `text` field, and optional `id` and `metadata` fields. The `metadata` field can contain the following optional subfields: `source`, `source_id`, `url`, `created_at`, and `author`. The endpoint returns a list of the IDs of the inserted documents (an ID is generated if not initially provided).\n\n- `\u002Fupsert-file`: This endpoint allows uploading a single file (PDF, TXT, DOCX, PPTX, or MD) and storing its text and metadata in the vector database. The file is converted to plain text and split into chunks of around 200 tokens, each with a unique ID. The endpoint returns a list containing the generated id of the inserted file.\n\n- `\u002Fquery`: This endpoint allows querying the vector database using one or more natural language queries and optional metadata filters. The endpoint expects a list of queries in the request body, each with a `query` and optional `filter` and `top_k` fields. The `filter` field should contain a subset of the following subfields: `source`, `source_id`, `document_id`, `url`, `created_at`, and `author`. The `top_k` field specifies how many results to return for a given query, and the default value is 3. The endpoint returns a list of objects that each contain a list of the most relevant document chunks for the given query, along with their text, metadata and similarity scores.\n\n- `\u002Fdelete`: This endpoint allows deleting one or more documents from the vector database using their IDs, a metadata filter, or a delete_all flag. The endpoint expects at least one of the following parameters in the request body: `ids`, `filter`, or `delete_all`. 
The `ids` parameter should be a list of document IDs to delete; all document chunks for the documents with these IDs will be deleted. The `filter` parameter should contain a subset of the following subfields: `source`, `source_id`, `document_id`, `url`, `created_at`, and `author`. The `delete_all` parameter should be a boolean indicating whether to delete all documents from the vector database. The endpoint returns a boolean indicating whether the deletion was successful.\n\nThe detailed specifications and examples of the request and response models can be found by running the app locally and navigating to http:\u002F\u002F0.0.0.0:8000\u002Fopenapi.json, or in the OpenAPI schema [here](\u002F.well-known\u002Fopenapi.yaml). Note that the OpenAPI schema only contains the `\u002Fquery` endpoint, because that is the only function that ChatGPT needs to access. This way, ChatGPT can use the plugin only to retrieve relevant documents based on natural language queries or needs. However, if developers want to also give ChatGPT the ability to remember things for later, they can use the `\u002Fupsert` endpoint to save snippets from the conversation to the vector database. An example of a manifest and OpenAPI schema that gives ChatGPT access to the `\u002Fupsert` endpoint can be found [here](\u002Fexamples\u002Fmemory).\n\nTo include custom metadata fields, edit the `DocumentMetadata` and `DocumentMetadataFilter` data models [here](\u002Fmodels\u002Fmodels.py), and update the OpenAPI schema [here](\u002F.well-known\u002Fopenapi.yaml). You can update this easily by running the app locally, copying the JSON found at http:\u002F\u002F0.0.0.0:8000\u002Fsub\u002Fopenapi.json, and converting it to YAML format with [Swagger Editor](https:\u002F\u002Feditor.swagger.io\u002F). Alternatively, you can replace the `openapi.yaml` file with an `openapi.json` file.\n\n### Memory Feature\n\nA notable feature of the Retrieval Plugin is its capacity to provide ChatGPT with memory. 
By using the plugin's upsert endpoint, ChatGPT can save snippets from the conversation to the vector database for later reference (only when prompted to do so by the user). This functionality contributes to a more context-aware chat experience by allowing ChatGPT to remember and retrieve information from previous conversations. Learn how to configure the Retrieval Plugin with memory [here](\u002Fexamples\u002Fmemory).\n\n### Security\n\nThe Retrieval Plugin allows ChatGPT to search a vector database of content, and then add the best results into the ChatGPT session. This means it doesn’t have any external effects, and the main risk consideration is data authorization and privacy. Developers should only add content into their Retrieval Plugin that they have authorization for and that they are fine with appearing in users’ ChatGPT sessions. You can choose from a number of different authentication methods to secure the plugin (more information [here](#authentication-methods)).\n\n### Choosing an Embeddings Model\n\nThe ChatGPT Retrieval Plugin uses OpenAI's embeddings models to generate embeddings of document chunks. The default model for the Retrieval Plugin is `text-embedding-3-large` with 256 dimensions. 
OpenAI offers its two latest embedding models, `text-embedding-3-small` and `text-embedding-3-large`, as well as an older model, `text-embedding-ada-002`.\n\nThe new models support shortening embeddings without significant loss of retrieval accuracy, allowing you to balance retrieval accuracy, cost, and speed.\n\nHere's a comparison of the models:\n\n| Model                  | Embedding Size | Average MTEB Score | Cost per 1k Tokens |\n| ---------------------- | -------------- | ------------------ | ------------------ |\n| text-embedding-3-large | 3072           | 64.6%              | $0.00013           |\n| text-embedding-3-large | 1024           | 64.1%              | $0.00013           |\n| text-embedding-3-large | 256            | 62.0%              | $0.00013           |\n| text-embedding-3-small | 1536           | 62.3%              | $0.00002           |\n| text-embedding-3-small | 512            | 61.6%              | $0.00002           |\n| text-embedding-ada-002 | 1536           | 61.0%              | $0.0001            |\n\nWhen choosing a model, consider:\n\n1. **Retrieval Accuracy vs Cost**: `text-embedding-3-large` offers the highest accuracy but at a higher cost. `text-embedding-3-small` is more cost-effective with competitive accuracy. The older `text-embedding-ada-002` model has the lowest accuracy.\n\n2. **Embedding Size**: Larger embeddings provide better accuracy but consume more storage and could be slower to query. You can adjust the size of the embeddings to balance these factors.\n\nFor example, if your vector database supports up to 1024 dimensions, you can use `text-embedding-3-large` and set the dimensions API parameter to 1024. 
This shortens the embedding from 3072 dimensions, trading off some accuracy for lower storage and query costs.

To change your chosen embeddings model and size, edit the following environment variables:

```
EMBEDDING_DIMENSION=256 # edit this value based on the dimension of the embeddings you want to use
EMBEDDING_MODEL="text-embedding-3-large" # edit this value based on the model you want to use e.g. text-embedding-3-small, text-embedding-ada-002
```

## Development

### Setup

This app uses Python 3.10 and [poetry](https://python-poetry.org/) for dependency management.

Install Python 3.10 on your machine if it isn't already installed. It can be downloaded from the official [Python website](https://www.python.org/downloads/) or with a package manager like `brew` or `apt`, depending on your system.

Clone the repository from GitHub:

```
git clone https://github.com/openai/chatgpt-retrieval-plugin.git
```

Navigate to the cloned repository directory:

```
cd /path/to/chatgpt-retrieval-plugin
```

Install poetry:

```
pip install poetry
```

Create a new virtual environment that uses Python 3.10:

```
poetry env use python3.10
poetry shell
```

Install app dependencies using poetry:

```
poetry install
```

**Note:** If adding dependencies to the `pyproject.toml`, make sure to run `poetry lock` and `poetry install`.

#### General Environment Variables

The API requires the following environment variables to work:

| Name             | Required | Description                                                                                                                                                                                                                                                   |
| ---------------- | -------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `DATASTORE`      | Yes      | This specifies the vector database provider you want to use to store and query embeddings. You can choose from `elasticsearch`, `chroma`, `pinecone`, `weaviate`, `zilliz`, `milvus`, `qdrant`, `redis`, `azuresearch`, `supabase`, `postgres`, `analyticdb`, `mongodb-atlas`. |
| `BEARER_TOKEN`   | Yes      | This is a secret token that you need to authenticate your requests to the API. You can generate one using any tool or method you prefer, such as [jwt.io](https://jwt.io/).                                                                                   |
| `OPENAI_API_KEY` | Yes      | This is your OpenAI API key, which you need to generate embeddings using one of the OpenAI embedding models. You can get an API key by creating an account on [OpenAI](https://openai.com/).                                                                |

### Using the plugin with Azure OpenAI

Azure OpenAI uses URLs that are specific to your resource and references models not by model name but by deployment ID. As a result, you need to set additional environment variables for this case.

In addition to the `OPENAI_API_BASE` (your specific URL) and `OPENAI_API_TYPE` (azure), you should also set `OPENAI_EMBEDDINGMODEL_DEPLOYMENTID`, which specifies the model to use for getting embeddings on upsert and query.
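Concretely, an Azure configuration might look like the following (all values are placeholders for your own resource URL and deployment names):

```shell
# Assumed example values -- replace with your own resource details.
export OPENAI_API_TYPE="azure"
export OPENAI_API_BASE="https://my-resource.openai.azure.com"      # your resource-specific URL
export OPENAI_EMBEDDINGMODEL_DEPLOYMENTID="embeddings-deployment"  # deployment name, not model name
# Only needed if you use the data preparation scripts:
export OPENAI_METADATA_EXTRACTIONMODEL_DEPLOYMENTID="metadata-deployment"
export OPENAI_COMPLETIONMODEL_DEPLOYMENTID="completion-deployment"
```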
For this, we recommend deploying the `text-embedding-ada-002` model and using the deployment name here.

If you wish to use the data preparation scripts, you will also need to set `OPENAI_METADATA_EXTRACTIONMODEL_DEPLOYMENTID`, used for metadata extraction, and `OPENAI_COMPLETIONMODEL_DEPLOYMENTID`, used for PII handling.

### Choosing a Vector Database

The plugin supports several vector database providers, each with different features, performance, and pricing. Depending on which one you choose, you will need to use a different Dockerfile and set different environment variables. The following sections provide brief introductions to each vector database provider.

For more detailed instructions on setting up and using each vector database provider, please refer to the respective documentation in the `/docs/providers/<datastore_name>/setup.md` file ([folders here](/docs/providers)).

#### Pinecone

[Pinecone](https://www.pinecone.io) is a managed vector database designed for speed, scale, and rapid deployment to production. It supports hybrid search and is currently the only datastore to natively support SPLADE sparse vectors. For detailed setup instructions, refer to [`/docs/providers/pinecone/setup.md`](/docs/providers/pinecone/setup.md).

#### Weaviate

[Weaviate](https://weaviate.io/) is an open-source vector search engine built to scale seamlessly into billions of data objects. It supports hybrid search out of the box, making it suitable for users who require efficient keyword searches. Weaviate can be self-hosted or managed, offering flexibility in deployment. For detailed setup instructions, refer to [`/docs/providers/weaviate/setup.md`](/docs/providers/weaviate/setup.md).

#### Zilliz

[Zilliz](https://zilliz.com) is a managed cloud-native vector database designed for billion-scale data.
It offers a wide range of features, including multiple indexing algorithms, distance metrics, scalar filtering, time travel searches, rollback with snapshots, full RBAC, 99.9% uptime, separated storage and compute, and multi-language SDKs. For detailed setup instructions, refer to [`/docs/providers/zilliz/setup.md`](/docs/providers/zilliz/setup.md).

#### Milvus

[Milvus](https://milvus.io/) is an open-source, cloud-native vector database that scales to billions of vectors. It is the open-source version of Zilliz and shares many of its features, such as various indexing algorithms, distance metrics, scalar filtering, time travel searches, rollback with snapshots, multi-language SDKs, storage and compute separation, and cloud scalability. For detailed setup instructions, refer to [`/docs/providers/milvus/setup.md`](/docs/providers/milvus/setup.md).

#### Qdrant

[Qdrant](https://qdrant.tech/) is a vector database capable of storing documents and vector embeddings. It offers both self-hosted and managed [Qdrant Cloud](https://cloud.qdrant.io/) deployment options, providing flexibility for users with different requirements. For detailed setup instructions, refer to [`/docs/providers/qdrant/setup.md`](/docs/providers/qdrant/setup.md).

#### Redis

[Redis](https://redis.com/solutions/use-cases/vector-database/) is a real-time data platform suitable for a variety of use cases, including everyday applications and AI/ML workloads. It can be used as a low-latency vector engine by creating a Redis database with the [Redis Stack docker container](/examples/docker/redis/docker-compose.yml). For a hosted/managed solution, [Redis Cloud](https://app.redislabs.com/#/) is available.
For detailed setup instructions, refer to [`/docs/providers/redis/setup.md`](/docs/providers/redis/setup.md).

#### LlamaIndex

[LlamaIndex](https://github.com/jerryjliu/llama_index) is a central interface to connect your LLMs with external data. It provides a suite of in-memory indices over your unstructured and structured data for use with ChatGPT. Unlike standard vector databases, LlamaIndex supports a wide range of indexing strategies (e.g. tree, keyword table, knowledge graph) optimized for different use cases. It is lightweight, easy to use, and requires no additional deployment. All you need to do is specify a few environment variables (and optionally point to an existing saved index JSON file). Note that metadata filters in queries are not yet supported. For detailed setup instructions, refer to [`/docs/providers/llama/setup.md`](/docs/providers/llama/setup.md).

#### Chroma

[Chroma](https://trychroma.com) is an AI-native open-source embedding database designed to make getting started as easy as possible. Chroma runs in-memory or in a client-server setup. It supports metadata and keyword filtering out of the box. For detailed instructions, refer to [`/docs/providers/chroma/setup.md`](/docs/providers/chroma/setup.md).

#### Azure Cognitive Search

[Azure Cognitive Search](https://azure.microsoft.com/products/search/) is a complete retrieval cloud service that supports vector search, text search, and hybrid search (vectors + text combined to yield the best of the two approaches). It also offers an [optional L2 re-ranking step](https://learn.microsoft.com/azure/search/semantic-search-overview) to further improve results quality.
For detailed setup instructions, refer to [`/docs/providers/azuresearch/setup.md`](/docs/providers/azuresearch/setup.md).

#### Azure CosmosDB Mongo vCore

[Azure CosmosDB Mongo vCore](https://learn.microsoft.com/en-us/azure/cosmos-db/mongodb/vcore/) supports vector search on embeddings, and it can be used to seamlessly integrate your AI-based applications with your data stored in Azure CosmosDB. For detailed instructions, refer to [`/docs/providers/azurecosmosdb/setup.md`](/docs/providers/azurecosmosdb/setup.md).

#### Supabase

[Supabase](https://supabase.com/blog/openai-embeddings-postgres-vector) offers an easy and efficient way to store vectors via the [pgvector](https://github.com/pgvector/pgvector) extension for Postgres. You can use the [Supabase CLI](https://github.com/supabase/cli) to set up a whole Supabase stack locally or in the cloud, or you can use docker-compose, k8s, and other available options. For a hosted/managed solution, try [Supabase.com](https://supabase.com/) and unlock the full power of Postgres with built-in authentication, storage, auto APIs, and Realtime features. For detailed setup instructions, refer to [`/docs/providers/supabase/setup.md`](/docs/providers/supabase/setup.md).

#### Postgres

[Postgres](https://www.postgresql.org) offers an easy and efficient way to store vectors via the [pgvector](https://github.com/pgvector/pgvector) extension. To use pgvector, you will need to set up a PostgreSQL database with the pgvector extension enabled. For example, you can [use docker](https://www.docker.com/blog/how-to-use-the-postgres-docker-official-image/) to run it locally.
For a hosted/managed solution, you can use any of the cloud vendors that support [pgvector](https://github.com/pgvector/pgvector#hosted-postgres). For detailed setup instructions, refer to [`/docs/providers/postgres/setup.md`](/docs/providers/postgres/setup.md).

#### AnalyticDB

[AnalyticDB](https://www.alibabacloud.com/help/en/analyticdb-for-postgresql/latest/product-introduction-overview) is a distributed cloud-native vector database designed for storing documents and vector embeddings. It is fully compatible with PostgreSQL syntax and managed by Alibaba Cloud. AnalyticDB offers a powerful vector compute engine, processing billions of data vectors and providing features such as indexing algorithms, structured and unstructured data capabilities, real-time updates, distance metrics, scalar filtering, and time travel searches. For detailed setup instructions, refer to [`/docs/providers/analyticdb/setup.md`](/docs/providers/analyticdb/setup.md).

#### Elasticsearch

[Elasticsearch](https://www.elastic.co/guide/en/elasticsearch/reference/current/index.html) currently supports storing vectors through the `dense_vector` field type and uses them to calculate document scores. Elasticsearch 8.0 builds on this functionality to support fast, approximate nearest neighbor (ANN) search. This represents a much more scalable approach, allowing vector search to run efficiently on large datasets.
For detailed setup instructions, refer to [`/docs/providers/elasticsearch/setup.md`](/docs/providers/elasticsearch/setup.md).

#### MongoDB Atlas

[MongoDB Atlas](https://www.mongodb.com/docs/atlas/getting-started/) supports vector search via Atlas Vector Search. Currently, the procedure involves generating an Atlas Vector Search index for all collections featuring vector embeddings of 2048 dimensions or fewer in width. This applies to diverse data types coexisting with additional data on your Atlas cluster, and the process is executed through the Atlas UI and the Atlas Administration API. For detailed setup instructions, refer to [`/docs/providers/mongodb_atlas/setup.md`](/docs/providers/mongodb_atlas/setup.md).

### Running the API locally

To run the API locally, you first need to set the requisite environment variables with the `export` command:

```
export DATASTORE=<your_datastore>
export BEARER_TOKEN=<your_bearer_token>
export OPENAI_API_KEY=<your_openai_api_key>
<Add the environment variables for your chosen vector DB here>
```

Start the API with:

```
poetry run start
```

Append `docs` to the URL shown in the terminal and open it in a browser to access the API documentation and try out the endpoints (e.g. http://0.0.0.0:8000/docs). Make sure to enter your bearer token and test the API endpoints.

**Note:** If you add new dependencies to the pyproject.toml file, you need to run `poetry lock` and `poetry install` to update the lock file and install the new dependencies.

### Personalization

You can personalize the Retrieval Plugin for your own use case by doing the following:

- **Replace the logo**: Replace the image in [logo.png](/.well-known/logo.png) with your own logo.

- **Edit the data models**: Edit the `DocumentMetadata` and `DocumentMetadataFilter` data models in [models.py](/models/models.py) to add custom metadata fields.
Update the OpenAPI schema in [openapi.yaml](/.well-known/openapi.yaml) accordingly. To update the OpenAPI schema more easily, you can run the app locally, then navigate to `http://0.0.0.0:8000/sub/openapi.json` and copy the contents of the webpage. Then go to [Swagger Editor](https://editor.swagger.io/) and paste in the JSON to convert it to a YAML format. You could also replace the [openapi.yaml](/.well-known/openapi.yaml) file with an openapi.json file in the [.well-known](/.well-known) folder.

- **Change the plugin name, description, and usage instructions**: Update the plugin name, user-facing description, and usage instructions for the model. You can either edit the descriptions in the [main.py](/server/main.py) file or update the [openapi.yaml](/.well-known/openapi.yaml) file. Follow the same instructions as in the previous step to update the OpenAPI schema.

- **Enable ChatGPT to save information from conversations**: See the instructions in the [memory example folder](/examples/memory).

### Authentication Methods

You can choose from four options for authenticating requests to your plugin:

1. **No Authentication**: Anyone can add your plugin and use its API without any credentials. This option is suitable if you are only exposing documents that are not sensitive or already public. It provides no security for your data. If using this method, copy the contents of this [main.py](/examples/authentication-methods/no-auth/main.py) into the [actual main.py file](/server/main.py). Example manifest [here](/examples/authentication-methods/no-auth/ai-plugin.json).

2. **HTTP Bearer**: You can use a secret token as a header to authorize requests to your plugin.
There are two variants of this option:

   - **User Level** (default for this implementation): Each user who adds your plugin to ChatGPT must provide the bearer token when adding the plugin. You can generate and distribute these tokens using any tool or method you prefer, such as [jwt.io](https://jwt.io/). This method provides better security, as each user has to enter the shared access token. If you require a unique access token for each user, you will need to implement this yourself in the [main.py](/server/main.py) file. Example manifest [here](/examples/authentication-methods/user-http/ai-plugin.json).

   - **Service Level**: Anyone can add your plugin and use its API without credentials, but you must add a bearer token when registering the plugin. When you install your plugin, you need to add your bearer token, and you will then receive a token from ChatGPT that you must include in your hosted manifest file. Your token will be used by ChatGPT to authorize requests to your plugin on behalf of all users who add it. This method is more convenient for users, but it may be less secure, as all users share the same token and do not need to add a token to install the plugin. Example manifest [here](/examples/authentication-methods/service-http/ai-plugin.json).

3. **OAuth**: Users must go through an OAuth flow to add your plugin. You can use an OAuth provider to authenticate users who add your plugin and grant them access to your API. This method offers the highest level of security and control, as users authenticate through a trusted third-party provider. However, you will need to implement the OAuth flow yourself in the [main.py](/server/main.py) file and provide the necessary parameters in your manifest file.
Example manifest [here](/examples/authentication-methods/oauth/ai-plugin.json).

Consider the benefits and drawbacks of each authentication method before choosing the one that best suits your use case and security requirements. If you choose to use a method different from the default (user-level HTTP), make sure to update the manifest file [here](/.well-known/ai-plugin.json).

## Deployment

You can deploy your app to different cloud providers, depending on your preferences and requirements. However, regardless of the provider you choose, you will need to update two files in your app: [openapi.yaml](/.well-known/openapi.yaml) and [ai-plugin.json](/.well-known/ai-plugin.json). As outlined above, these files define the API specification and the AI plugin configuration for your app, respectively. You need to change the `url` field in both files to match the address of your deployed app.

Render has a 1-click deploy option that automatically updates the `url` field in both files:

[<img src="https://render.com/images/deploy-to-render-button.svg" alt="Deploy to Render" />](https://render.com/deploy?repo=https://github.com/render-examples/chatgpt-retrieval-plugin/tree/main)

Before deploying your app, you might want to remove unused dependencies from your [pyproject.toml](/pyproject.toml) file to reduce the size of your app and improve its performance. Depending on the vector database provider you choose, you can remove the packages that are not needed for your specific provider.
Refer to the respective documentation in the [`/docs/deployment/removing-unused-dependencies.md`](/docs/deployment/removing-unused-dependencies.md) file for information on removing unused dependencies for each provider.

Instructions:

- [Deploying to Fly.io](/docs/deployment/flyio.md)
- [Deploying to Heroku](/docs/deployment/heroku.md)
- [Deploying to Render](/docs/deployment/render.md)
- [Other Deployment Options](/docs/deployment/other-options.md) (Azure Container Apps, Google Cloud Run, AWS Elastic Container Service, etc.)

Once you have deployed your app, consider uploading an initial batch of documents using one of [these scripts](/scripts) or by calling the `/upsert` endpoint.

## Webhooks

To keep the documents stored in the vector database up-to-date, consider using tools like [Zapier](https://zapier.com) or [Make](https://www.make.com) to configure incoming webhooks to your plugin's API based on events or schedules. For example, this could allow you to sync new information as you update your notes or receive emails. You can also use a [Zapier Transfer](https://zapier.com/blog/zapier-transfer-guide/) to batch process a collection of existing documents and upload them to the vector database.

If you need to pass custom fields from these tools to your plugin, you might want to create an additional Retrieval Plugin API endpoint that calls the datastore's upsert function, such as `upsert-email`.
This custom endpoint can be designed to accept specific fields from the webhook and process them accordingly.

To set up an incoming webhook, follow these general steps:

- Choose a webhook tool like Zapier or Make and create an account.
- Set up a new webhook or transfer in the tool, and configure it to trigger based on events or schedules.
- Specify the target URL for the webhook, which should be the API endpoint of your Retrieval Plugin (e.g. `https://your-plugin-url.com/upsert`).
- Configure the webhook payload to include the necessary data fields and format them according to your Retrieval Plugin's API requirements.
- Test the webhook to ensure it's working correctly and sending data to your Retrieval Plugin as expected.

After setting up the webhook, consider running a backfill to ensure that any previously missed data is included in the vector database; otherwise, continuously synced data may have gaps.

In addition to using tools like Zapier and Make, you can also build your own custom integrations to sync data with your Retrieval Plugin. This gives you more control over the data flow and lets you tailor the integration to your specific needs and requirements.

## Scripts

The `scripts` folder contains scripts to batch upsert or process text documents from different data sources, such as a zip file, JSON file, or JSONL file. These scripts use the plugin's upsert utility functions to upload the documents and their metadata to the vector database, after converting them to plain text and splitting them into chunks. Each script folder has a README file that explains how to use it and what parameters it requires.
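For one-off uploads, the same `/upsert` endpoint the scripts rely on can be called directly over HTTP. A minimal sketch using only the standard library — the URL, token, and document are placeholders, and the request body assumes the plugin's documented shape of a `documents` list where each item has a required `text` field plus optional `id` and `metadata`:

```python
import json
import urllib.request

PLUGIN_URL = "https://your-plugin-url.com"  # placeholder: your deployed plugin
BEARER_TOKEN = "your-secret-token"          # placeholder: the token you generated

def build_upsert_request(docs: list[dict]) -> urllib.request.Request:
    """Build an authenticated POST request for the /upsert endpoint.
    Each document dict needs `text`, with optional `id` and `metadata`."""
    body = json.dumps({"documents": docs}).encode("utf-8")
    return urllib.request.Request(
        f"{PLUGIN_URL}/upsert",
        data=body,
        headers={
            "Authorization": f"Bearer {BEARER_TOKEN}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_upsert_request(
    [{"id": "note-1", "text": "Meeting notes from Monday.", "metadata": {"source": "email"}}]
)
# urllib.request.urlopen(req)  # uncomment to send against a running instance
```

The plugin chunks and embeds the text server-side, so the caller only supplies plain text and metadata.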
You can also optionally screen the documents for personally identifiable information (PII) using a language model and skip them if detected, with the [`services.pii_detection`](/services/pii_detection.py) module. This can be helpful if you want to avoid unintentionally uploading sensitive or private documents to the vector database. Additionally, you can optionally extract metadata from the document text using a language model, with the [`services.extract_metadata`](/services/extract_metadata.py) module. This can be useful if you want to enrich the document metadata.

The scripts are:

- [`process_json`](scripts/process_json/): This script processes a file dump of documents in JSON format and stores them in the vector database with some metadata. The format of the JSON file should be a list of JSON objects, where each object represents a document. The JSON object should have a `text` field and optionally other fields to populate the metadata. You can provide custom metadata as a JSON string and flags to screen for PII and extract metadata.
- [`process_jsonl`](scripts/process_jsonl/): This script processes a file dump of documents in JSONL format and stores them in the vector database with some metadata. The format of the JSONL file should be a newline-delimited JSON file, where each line is a valid JSON object representing a document. The JSON object should have a `text` field and optionally other fields to populate the metadata. You can provide custom metadata as a JSON string and flags to screen for PII and extract metadata.
- [`process_zip`](scripts/process_zip/): This script processes a file dump of documents in a zip file and stores them in the vector database with some metadata.
The format of the zip file should be a flat zip folder of docx, pdf, txt, md, pptx, or csv files. You can provide custom metadata as a JSON string and flags to screen for PII and extract metadata.

## Pull Request (PR) Checklist

If you'd like to contribute, please follow the checklist below when submitting a PR. This will help us review and merge your changes faster! Thank you for contributing!

1. **Type of PR**: Indicate the type of PR by adding a label in square brackets at the beginning of the title, such as `[Bugfix]`, `[Feature]`, `[Enhancement]`, `[Refactor]`, or `[Documentation]`.

2. **Short Description**: Provide a brief, informative description of the PR that explains the changes made.

3. **Issue(s) Linked**: Mention any related issue(s) by using the keyword `Fixes` or `Closes` followed by the respective issue number(s) (e.g., Fixes #123, Closes #456).

4. **Branch**: Ensure that you have created a new branch for the changes, and that it is based on the latest version of the `main` branch.

5. **Code Changes**: Make sure the code changes are minimal, focused, and relevant to the issue or feature being addressed.

6. **Commit Messages**: Write clear and concise commit messages that explain the purpose of each commit.

7. **Tests**: Include unit tests and/or integration tests for any new code or changes to existing code. Make sure all tests pass before submitting the PR.

8. **Documentation**: Update relevant documentation (e.g., README, inline comments, or external documentation) to reflect any changes made.

9. **Review Requested**: Request a review from at least one other contributor or maintainer of the repository.

10. **Video Submission** (For Complex/Large PRs): If your PR introduces significant changes, complexities, or a large number of lines of code, submit a brief video walkthrough along with the PR.
The video should explain the purpose of the changes, the logic behind them, and how they address the issue or add the proposed feature. This will help reviewers better understand your contribution and expedite the review process.

## Pull Request Naming Convention

Use the following naming convention for your PR branches:

```
<type>/<short-description>-<issue-number>
```

- `<type>`: The type of PR, such as `bugfix`, `feature`, `enhancement`, `refactor`, or `docs`. Multiple types are OK and should appear as `<type>, <type2>`.
- `<short-description>`: A brief description of the changes made, using hyphens to separate words.
- `<issue-number>`: The issue number associated with the changes made (if applicable).

Example:

```
feature/advanced-chunking-strategy-123
```

## Limitations

While the ChatGPT Retrieval Plugin is designed to provide a flexible solution for semantic search and retrieval, it does have some limitations:

- **Keyword search limitations**: The embeddings generated by the chosen OpenAI embedding model may not always be effective at capturing exact keyword matches. As a result, the plugin might not return the most relevant results for queries that rely heavily on specific keywords. Some vector databases, like Elasticsearch, Pinecone, Weaviate, and Azure Cognitive Search, use hybrid search and might perform better for keyword searches.
- **Sensitive data handling**: The plugin does not automatically detect or filter sensitive data. It is the responsibility of developers to ensure that they have the necessary authorization to include content in the Retrieval Plugin and that the content complies with data privacy requirements.
- **Scalability**: The performance of the plugin may vary depending on the chosen vector database provider and the size of the dataset.
Some providers may offer better scalability and performance than others.
- **Metadata extraction**: The optional metadata extraction feature relies on a language model to extract information from the document text. This process may not always be accurate, and the quality of the extracted metadata may vary depending on the document content and structure.
- **PII detection**: The optional PII detection feature is not foolproof and may not catch all instances of personally identifiable information. Use this feature with caution and verify its effectiveness for your specific use case.

## Future Directions

The ChatGPT Retrieval Plugin provides a flexible solution for semantic search and retrieval, but there is always potential for further development. We encourage users to contribute to the project by submitting pull requests for new features or enhancements. Notable contributions may be acknowledged with OpenAI credits.

Some ideas for future directions include:

- **More vector database providers**: If you are interested in integrating another vector database provider with the ChatGPT Retrieval Plugin, feel free to submit an implementation.
- **Additional scripts**: Expanding the range of scripts available for processing and uploading documents from various data sources would make the plugin even more versatile.
- **User Interface**: Developing a user interface for managing documents and interacting with the plugin could improve the user experience.
- **Hybrid search / TF-IDF option**: Enhancing the [datastore's upsert function](/datastore/datastore.py#L18) with an option to use hybrid search or TF-IDF indexing could improve the plugin's performance for keyword-based queries.
- **Advanced chunking strategies and embeddings calculations**: Implementing more sophisticated chunking strategies and embeddings calculations, such as embedding document titles and summaries, performing weighted averaging of document chunks and summaries, or
calculating the average embedding for a document, could lead to better search results.\n- **Custom metadata**: Allowing users to add custom metadata to document chunks, such as titles or other relevant information, might improve the retrieved results in some use cases.\n- **Additional optional services**: Integrating more optional services, such as summarizing documents or pre-processing documents before embedding them, could enhance the plugin's functionality and quality of retrieved results. These services could be implemented using language models and integrated directly into the plugin, rather than just being available in the scripts.\n\nWe welcome contributions from the community to help improve the ChatGPT Retrieval Plugin and expand its capabilities. If you have an idea or feature you'd like to contribute, please submit a pull request to the repository.\n\n## Contributors\n\nWe would like to extend our gratitude to the following contributors for their code \u002F documentation contributions, and support in integrating various vector database providers with the ChatGPT Retrieval Plugin:\n\n- [Pinecone](https:\u002F\u002Fwww.pinecone.io\u002F)\n  - [acatav](https:\u002F\u002Fgithub.com\u002Facatav)\n  - [gkogan](https:\u002F\u002Fgithub.com\u002Fgkogan)\n  - [jamescalam](https:\u002F\u002Fgithub.com\u002Fjamescalam)\n- [Weaviate](https:\u002F\u002Fwww.semi.technology\u002F)\n  - [byronvoorbach](https:\u002F\u002Fgithub.com\u002Fbyronvoorbach)\n  - [hsm207](https:\u002F\u002Fgithub.com\u002Fhsm207)\n  - [sebawita](https:\u002F\u002Fgithub.com\u002Fsebawita)\n- [Zilliz](https:\u002F\u002Fzilliz.com\u002F)\n  - [filip-halt](https:\u002F\u002Fgithub.com\u002Ffilip-halt)\n- [Milvus](https:\u002F\u002Fmilvus.io\u002F)\n  - [filip-halt](https:\u002F\u002Fgithub.com\u002Ffilip-halt)\n- [Qdrant](https:\u002F\u002Fqdrant.tech\u002F)\n  - [kacperlukawski](https:\u002F\u002Fgithub.com\u002Fkacperlukawski)\n- [Redis](https:\u002F\u002Fredis.io\u002F)\n  - 
[spartee](https:\u002F\u002Fgithub.com\u002Fspartee)\n  - [tylerhutcherson](https:\u002F\u002Fgithub.com\u002Ftylerhutcherson)\n- [LlamaIndex](https:\u002F\u002Fgithub.com\u002Fjerryjliu\u002Fllama_index)\n  - [jerryjliu](https:\u002F\u002Fgithub.com\u002Fjerryjliu)\n  - [Disiok](https:\u002F\u002Fgithub.com\u002FDisiok)\n- [Supabase](https:\u002F\u002Fsupabase.com\u002F)\n  - [egor-romanov](https:\u002F\u002Fgithub.com\u002Fegor-romanov)\n- [Postgres](https:\u002F\u002Fwww.postgresql.org\u002F)\n  - [egor-romanov](https:\u002F\u002Fgithub.com\u002Fegor-romanov)\n  - [mmmaia](https:\u002F\u002Fgithub.com\u002Fmmmaia)\n- [Elasticsearch](https:\u002F\u002Fwww.elastic.co\u002F)\n  - [joemcelroy](https:\u002F\u002Fgithub.com\u002Fjoemcelroy)\n","# ChatGPT 检索插件\n\n使用检索插件后端构建自定义 GPT，使 ChatGPT 能够访问个人文档。\n![示例自定义 GPT 截图](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fopenai_chatgpt-retrieval-plugin_readme_ecce11d9acc7.png)\n\n## 简介\n\nChatGPT 检索插件仓库提供了一种灵活的解决方案，用于通过自然语言查询对个人或组织文档进行语义搜索和检索。它是一个独立的检索后端，可以与 [ChatGPT 自定义 GPT](https:\u002F\u002Fchat.openai.com\u002Fgpts\u002Fdiscovery)、使用 [聊天补全](https:\u002F\u002Fplatform.openai.com\u002Fdocs\u002Fguides\u002Ftext-generation) 或 [助手 API](https:\u002F\u002Fplatform.openai.com\u002Fdocs\u002Fassistants\u002Foverview) 的 [函数调用](https:\u002F\u002Fplatform.openai.com\u002Fdocs\u002Fguides\u002Ffunction-calling)，或者与 [ChatGPT 插件模型（已弃用）](https:\u002F\u002Fchat.openai.com\u002F?model=gpt-4-plugins) 一起使用。ChatGPT 和助手 API 都原生支持从上传的文件中进行检索，因此只有在您希望对检索系统有更细粒度的控制时（例如文档文本分块长度、嵌入模型\u002F大小等），才应将检索插件用作后端。\n\n该仓库被组织成多个目录：\n\n| 目录                       | 描述                                                                                                                |\n| ------------------------------- | -------------------------------------------------------------------------------------------------------------------------- |\n| [`datastore`](\u002Fdatastore)       | 包含使用各种向量数据库提供商存储和查询文档嵌入的核心逻辑。              |\n| [`docs`](\u002Fdocs)        
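To make the "weighted averaging of document chunks and summaries" idea from the Future Directions list above concrete, here is a minimal, self-contained sketch. The helper names, the example vectors, and the choice of giving the title/summary embedding double weight are illustrative assumptions for this sketch, not part of the plugin's codebase:

```python
import math

def weighted_average(vectors, weights):
    # Combine per-chunk embeddings into a single document-level embedding
    # by taking a weighted mean across each dimension.
    total = sum(weights)
    dims = len(vectors[0])
    return [
        sum(w * v[d] for v, w in zip(vectors, weights)) / total
        for d in range(dims)
    ]

def cosine_similarity(a, b):
    # Standard cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Hypothetical example: weight a title/summary embedding twice as heavily
# as the body chunks when forming the document embedding.
title_emb = [1.0, 0.0]
chunk_embs = [[0.0, 1.0], [0.5, 0.5]]
doc_emb = weighted_average([title_emb] + chunk_embs, [2.0, 1.0, 1.0])
```

A query embedding could then be compared against `doc_emb` with `cosine_similarity` to rank whole documents before (or instead of) ranking individual chunks.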
**依赖管理工具**：`poetry`\n*   **向量数据库**：需预先准备一个向量数据库实例（如 Pinecone, Weaviate, Qdrant, Milvus, Redis, Chroma, Elasticsearch, MongoDB Atlas 等）并获取相应的连接凭证。\n*   **OpenAI API Key**：有效的 OpenAI API 密钥。\n\n> **提示**：国内开发者若遇到 `pip` 或 `poetry` 下载缓慢，可配置国内镜像源加速：\n> *   Pip: `pip config set global.index-url https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple`\n> *   Poetry: 在项目目录下运行 `poetry source add --priority=primary tuna https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple`（Poetry 没有 `pypi-url` 这一配置项；该命令会将镜像源以 `[[tool.poetry.source]]` 的形式写入 `pyproject.toml`，`--priority` 参数需 Poetry 1.5 及以上版本）\n\n## 安装步骤\n\n请依次执行以下命令完成项目克隆与环境搭建：\n\n1.  **克隆仓库**\n    ```bash\n    git clone https:\u002F\u002Fgithub.com\u002Fopenai\u002Fchatgpt-retrieval-plugin.git\n    cd chatgpt-retrieval-plugin\n    ```\n\n2.  **安装 Poetry**\n    ```bash\n    pip install poetry\n    ```\n\n3.  **创建并激活虚拟环境**\n    指定使用 Python 3.10 创建环境：\n    ```bash\n    poetry env use python3.10\n    poetry shell\n    ```\n\n4.  **安装项目依赖**\n    ```bash\n    poetry install\n    ```\n\n5.  **配置环境变量**\n    您需要设置数据存储类型、认证令牌、OpenAI 密钥以及所选向量数据库的连接信息。\n\n    创建一个 `.env` 文件或直接在终端导出变量（以下为示例，请根据实际情况替换 `\u003C...>` 内容）：\n\n    ```bash\n    # 基础配置\n    export DATASTORE=\u003Cyour_datastore>  # 例如：pinecone, weaviate, qdrant 等\n    export BEARER_TOKEN=\u003Cyour_bearer_token> # 自定义一个用于 API 认证的令牌\n    export OPENAI_API_KEY=\u003Cyour_openai_api_key>\n    \n    # 嵌入模型配置 (默认使用 text-embedding-3-large)\n    export EMBEDDING_DIMENSION=256\n    export EMBEDDING_MODEL=text-embedding-3-large\n\n    # 向量数据库配置 (以 Pinecone 为例，其他数据库请参考 README 中的对应部分)\n    export PINECONE_API_KEY=\u003Cyour_pinecone_api_key>\n    export PINECONE_ENVIRONMENT=\u003Cyour_pinecone_environment>\n    export PINECONE_INDEX=\u003Cyour_pinecone_index>\n    ```\n\n    > **注意**：如果您使用的是 Azure OpenAI 或其他向量数据库（如 Milvus, Qdrant, Redis 等），请参照原 README 中 `Quickstart` 部分的完整列表设置对应的 `export` 变量。\n\n## 基本使用\n\n完成安装和配置后，您可以启动本地服务并进行测试。\n\n1.  **启动 API 服务**\n    在激活的虚拟环境中运行：\n    ```bash\n    poetry run start\n    ```\n    服务默认将在 `http:\u002F\u002F0.0.0.0:8000` 启动。\n\n2.  
**访问文档与测试接口**\n    打开浏览器访问 Swagger UI 界面：\n    ```text\n    http:\u002F\u002F0.0.0.0:8000\u002Fdocs\n    ```\n\n3.  **执行操作**\n    *   在 Swagger 界面中，找到 `\u002Fupsert` 接口上传文档（请求体为包含 `text` 字段及可选元数据的文档列表；如需直接上传文件，请改用 `\u002Fupsert-file` 接口）。\n    *   点击 \"Try it out\"。\n    *   点击页面上的 \"Authorize\" 按钮并输入您的 Bearer Token（即 `BEARER_TOKEN` 环境变量的值；Swagger UI 会自动添加 `Bearer ` 前缀）。\n    *   填入文档内容进行测试。\n    *   随后可以使用 `\u002Fquery` 接口通过自然语言提问，系统将返回基于您上传文档的检索结果。\n\n此时，您的检索后端已运行成功，接下来可将其部署到云端并配置到 ChatGPT Custom GPTs 或通过 Function Calling 集成到您的应用中。","某科技公司的技术文档工程师需要快速从数万页的历史项目文档、会议记录和 API 手册中，为开发团队提取特定的架构决策依据。\n\n### 没有 chatgpt-retrieval-plugin 时\n- **检索效率极低**：面对海量非结构化文档，只能依靠关键词搜索，经常因术语不匹配而漏掉关键信息，人工翻阅耗时数小时。\n- **上下文割裂**：传统搜索仅返回包含关键词的片段，无法理解问题的语义，难以将分散在不同文档中的相关逻辑串联起来。\n- **定制化能力弱**：无法灵活控制文档切片长度或更换嵌入模型，导致对代码片段或长篇幅技术方案的解析精度不足。\n- **数据孤岛严重**：私有部署的内部文档无法安全地接入大模型，团队不得不手动复制粘贴内容到公共聊天窗口，存在泄露风险。\n\n### 使用 chatgpt-retrieval-plugin 后\n- **自然语言直达答案**：工程师直接用“上个季度微服务鉴权方案的变更原因是什么？”提问，系统通过语义检索秒级定位并汇总相关段落。\n- **智能上下文关联**：基于向量数据库的语义理解能力，自动跨文档关联起会议记录中的讨论点与最终 API 手册的实现细节，提供完整逻辑链。\n- **精细可控的后端**：团队可自主调整文本切片策略并选用专用的代码嵌入模型，显著提升了对复杂技术文档的召回准确率。\n- **安全的私有化集成**：作为独立后端部署在公司内网，既保留了数据主权，又通过标准 API 让 Custom GPTs 安全访问内部知识库。\n\nchatgpt-retrieval-plugin 通过将静态文档库转化为可对话的智能知识中枢，彻底解决了企业私有数据与大模型之间的“最后一公里”难题。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fopenai_chatgpt-retrieval-plugin_ecce11d9.png","openai","OpenAI","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Fopenai_1960bbf4.png","",null,"https:\u002F\u002Fopenai.com\u002F","https:\u002F\u002Fgithub.com\u002Fopenai",[84,88,92],{"name":85,"color":86,"percentage":87},"Python","#3572A5",99.7,{"name":89,"color":90,"percentage":91},"Dockerfile","#384d54",0.2,{"name":93,"color":94,"percentage":95},"Makefile","#427819",0.1,21209,3629,"2026-04-04T18:24:30","MIT",4,"Linux, macOS, Windows","未说明",{"notes":104,"python":105,"dependencies":106},"该工具是一个基于 FastAPI 的后端服务，主要用于文档的语义搜索和检索（RAG）。它依赖外部向量数据库（如 Pinecone, Weaviate, Redis 等）存储嵌入向量，自身不运行大型本地模型，因此对 GPU 无强制要求。需安装 Poetry 进行依赖管理，并配置相应的向量数据库环境变量及 OpenAI API 
Key。支持多种向量数据库后端，用户需根据选择的数据库提供商查阅具体文档进行配置。","3.10",[107,108,109,110],"FastAPI","Poetry","OpenAI API","Vector Database Clients (e.g., Pinecone, Weaviate, Qdrant, etc.)",[36,15],[113,114],"chatgpt","chatgpt-plugins","2026-03-27T02:49:30.150509","2026-04-06T05:36:26.823920",[118,123,128,133,138,143],{"id":119,"question_zh":120,"answer_zh":121,"source_url":122},16462,"使用 Redis 查询接口时为什么返回空结果？","这通常是因为文档虽然成功插入（upsert），但索引未正确创建或查询配置有误。请确保：\n1. 使用了正确的 Docker 容器启动 Redis（包含 RediSearch 模块）：在 `examples\u002Fdocker\u002Fredis` 目录下运行 `docker-compose up -d`。\n2. 确认环境变量 `DATASTORE=\"redis\"` 已设置。\n3. 检查是否在插入后立即查询，可能需要短暂等待索引构建。\n4. 尝试使用具体的 `document_id` 进行查询测试。\n如果问题依旧，请拉取最新的主分支代码，因为已有 PR 修复了相关逻辑。","https:\u002F\u002Fgithub.com\u002Fopenai\u002Fchatgpt-retrieval-plugin\u002Fissues\u002F51",{"id":124,"question_zh":125,"answer_zh":126,"source_url":127},16463,"遇到 'unknown command FT.CREATE' 错误如何解决？","该错误表明当前的 Redis 实例没有安装 RediSearch 模块。FT.CREATE 是 RediSearch 的命令。\n解决方案：\n1. 不要使用原生 Redis 镜像，必须使用包含 RediSearch 和 RedisJSON 模块的 Docker 镜像。\n2. 请进入项目的 `examples\u002Fdocker\u002Fredis` 目录。\n3. 运行命令 `docker compose up -d` 来启动配置好的 Redis 服务。\n4. 更新 Docker 配置后重启应用即可解决。","https:\u002F\u002Fgithub.com\u002Fopenai\u002Fchatgpt-retrieval-plugin\u002Fissues\u002F17",{"id":129,"question_zh":130,"answer_zh":131,"source_url":132},16464,"启动时报错 'Invalid rule type: JSON' 怎么办？","此错误是因为 Redis 缺少必要的模块支持。你需要确保同时安装了 `RediSearch` (版本 >= 2.6) 和 `RedisJSON` 模块。\n最简便的解决方法是使用项目提供的 Docker 配置：\n1. 切换到目录：`examples\u002Fdocker\u002Fredis`\n2. 执行命令：`docker compose up -d`\n该 Docker 容器默认包含了所需的模块，启动后即可正常运行 `poetry run start`。","https:\u002F\u002Fgithub.com\u002Fopenai\u002Fchatgpt-retrieval-plugin\u002Fissues\u002F26",{"id":134,"question_zh":135,"answer_zh":136,"source_url":137},16465,"调用 upsert 接口时出现 'AuthenticationError' 或 'RetryError' 是什么原因？","这通常是由于 OpenAI API Key 配置不正确或环境变量未生效导致的。\n排查步骤：\n1. 确认 `OPENAI_API_KEY` 有效且未过期。\n2. 
检查环境变量的设置方式：\n   - Windows 用户请使用：`set OPENAI_API_KEY=\u003C你的密钥>`\n   - Linux\u002FMac 用户请使用：`export OPENAI_API_KEY=\u003C你的密钥>`\n3. 如果你属于多个组织，可能需要设置 `OPENAI_ORGANIZATION`，但对于单组织账户通常是可选的。\n4. 确保在启动应用前已经正确导入了这些环境变量。","https:\u002F\u002Fgithub.com\u002Fopenai\u002Fchatgpt-retrieval-plugin\u002Fissues\u002F107",{"id":139,"question_zh":140,"answer_zh":141,"source_url":142},16466,"如何正确配置和存放环境变量（如 API Key）？","为了安全起见，不建议将敏感信息硬编码在代码中或直接提交到 GitHub。\n推荐做法：\n1. **临时测试**：直接在运行应用的终端 shell 中导出变量。\n   - Linux\u002FMac: `export OPENAI_API_KEY='sk-...'`\n   - Windows: `set OPENAI_API_KEY=sk-...`\n2. **生产环境\u002F持久化**：创建一个 `.env` 文件（确保将其加入 `.gitignore`），并在启动脚本中加载它，或者在 Docker 运行时通过 `-e` 参数传入。\n维护者建议采用“在 shell 中输入环境变量”的方式以避免意外泄露数据库密码或 API 密钥。","https:\u002F\u002Fgithub.com\u002Fopenai\u002Fchatgpt-retrieval-plugin\u002Fissues\u002F310",{"id":144,"question_zh":145,"answer_zh":146,"source_url":147},16467,"连接 Redis 后端时出现 'NewConnectionError' 或连接失败如何处理？","当插件运行在 Docker 容器中而 Redis 运行在宿主机（或另一台机器）时，容易出现网络连接问题。\n解决步骤：\n1. **检查 IP 地址**：确保 `REDIS_HOST` 环境变量设置为宿主机的真实局域网 IP（例如 `192.168.x.x`），而不是 `localhost` 或 `127.0.0.1`，因为容器内的 localhost 指的是容器本身。\n2. **防火墙设置**：确认宿主机防火墙允许来自容器网段的 6379 端口连接。\n3. **连通性测试**：在容器内部使用 `telnet \u003CREDIS_HOST> \u003CREDIS_PORT>` 测试网络是否通畅。\n4. **绑定地址**：确保 Redis 配置文件 (`redis.conf`) 中设置了 `bind 0.0.0.0` 或具体的局域网 IP，允许外部连接。","https:\u002F\u002Fgithub.com\u002Fopenai\u002Fchatgpt-retrieval-plugin\u002Fissues\u002F68",[]]