[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-microsoft--PromptWizard":3,"tool-microsoft--PromptWizard":64},[4,17,27,35,43,56],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":16},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,3,"2026-04-05T11:01:52",[13,14,15],"开发框架","图像","Agent","ready",{"id":18,"name":19,"github_repo":20,"description_zh":21,"stars":22,"difficulty_score":23,"last_commit_at":24,"category_tags":25,"status":16},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",138956,2,"2026-04-05T11:33:21",[13,15,26],"语言模型",{"id":28,"name":29,"github_repo":30,"description_zh":31,"stars":32,"difficulty_score":23,"last_commit_at":33,"category_tags":34,"status":16},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 
绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",107662,"2026-04-03T11:11:01",[13,14,15],{"id":36,"name":37,"github_repo":38,"description_zh":39,"stars":40,"difficulty_score":23,"last_commit_at":41,"category_tags":42,"status":16},3704,"NextChat","ChatGPTNextWeb\u002FNextChat","NextChat 是一款轻量且极速的 AI 助手，旨在为用户提供流畅、跨平台的大模型交互体验。它完美解决了用户在多设备间切换时难以保持对话连续性，以及面对众多 AI 模型不知如何统一管理的痛点。无论是日常办公、学习辅助还是创意激发，NextChat 都能让用户随时随地通过网页、iOS、Android、Windows、MacOS 或 Linux 端无缝接入智能服务。\n\n这款工具非常适合普通用户、学生、职场人士以及需要私有化部署的企业团队使用。对于开发者而言，它也提供了便捷的自托管方案，支持一键部署到 Vercel 或 Zeabur 等平台。\n\nNextChat 的核心亮点在于其广泛的模型兼容性，原生支持 Claude、DeepSeek、GPT-4 及 Gemini Pro 等主流大模型，让用户在一个界面即可自由切换不同 AI 能力。此外，它还率先支持 MCP（Model Context Protocol）协议，增强了上下文处理能力。针对企业用户，NextChat 提供专业版解决方案，具备品牌定制、细粒度权限控制、内部知识库整合及安全审计等功能，满足公司对数据隐私和个性化管理的高标准要求。",87618,"2026-04-05T07:20:52",[13,26],{"id":44,"name":45,"github_repo":46,"description_zh":47,"stars":48,"difficulty_score":23,"last_commit_at":49,"category_tags":50,"status":16},2268,"ML-For-Beginners","microsoft\u002FML-For-Beginners","ML-For-Beginners 是由微软推出的一套系统化机器学习入门课程，旨在帮助零基础用户轻松掌握经典机器学习知识。这套课程将学习路径规划为 12 周，包含 26 节精炼课程和 52 道配套测验，内容涵盖从基础概念到实际应用的完整流程，有效解决了初学者面对庞大知识体系时无从下手、缺乏结构化指导的痛点。\n\n无论是希望转型的开发者、需要补充算法背景的研究人员，还是对人工智能充满好奇的普通爱好者，都能从中受益。课程不仅提供了清晰的理论讲解，还强调动手实践，让用户在循序渐进中建立扎实的技能基础。其独特的亮点在于强大的多语言支持，通过自动化机制提供了包括简体中文在内的 50 多种语言版本，极大地降低了全球不同背景用户的学习门槛。此外，项目采用开源协作模式，社区活跃且内容持续更新，确保学习者能获取前沿且准确的技术资讯。如果你正寻找一条清晰、友好且专业的机器学习入门之路，ML-For-Beginners 将是理想的起点。",84991,"2026-04-05T10:45:23",[14,51,52,53,15,54,26,13,55],"数据工具","视频","插件","其他","音频",{"id":57,"name":58,"github_repo":59,"description_zh":60,"stars":61,"difficulty_score":10,"last_commit_at":62,"category_tags":63,"status":16},3128,"ragflow","infiniflow\u002Fragflow","RAGFlow 
是一款领先的开源检索增强生成（RAG）引擎，旨在为大语言模型构建更精准、可靠的上下文层。它巧妙地将前沿的 RAG 技术与智能体（Agent）能力相结合，不仅支持从各类文档中高效提取知识，还能让模型基于这些知识进行逻辑推理和任务执行。\n\n在大模型应用中，幻觉问题和知识滞后是常见痛点。RAGFlow 通过深度解析复杂文档结构（如表格、图表及混合排版），显著提升了信息检索的准确度，从而有效减少模型“胡编乱造”的现象，确保回答既有据可依又具备时效性。其内置的智能体机制更进一步，使系统不仅能回答问题，还能自主规划步骤解决复杂问题。\n\n这款工具特别适合开发者、企业技术团队以及 AI 研究人员使用。无论是希望快速搭建私有知识库问答系统，还是致力于探索大模型在垂直领域落地的创新者，都能从中受益。RAGFlow 提供了可视化的工作流编排界面和灵活的 API 接口，既降低了非算法背景用户的上手门槛，也满足了专业开发者对系统深度定制的需求。作为基于 Apache 2.0 协议开源的项目，它正成为连接通用大模型与行业专有知识之间的重要桥梁。",77062,"2026-04-04T04:44:48",[15,14,13,26,54],{"id":65,"github_repo":66,"name":67,"description_en":68,"description_zh":69,"ai_summary_zh":69,"readme_en":70,"readme_zh":71,"quickstart_zh":72,"use_case_zh":73,"hero_image_url":74,"owner_login":75,"owner_name":76,"owner_avatar_url":77,"owner_bio":78,"owner_company":79,"owner_location":79,"owner_email":80,"owner_twitter":81,"owner_website":82,"owner_url":83,"languages":84,"stars":93,"forks":94,"last_commit_at":95,"license":96,"difficulty_score":10,"env_os":97,"env_gpu":98,"env_ram":98,"env_deps":99,"category_tags":111,"github_topics":79,"view_count":23,"oss_zip_url":79,"oss_zip_packed_at":79,"status":16,"created_at":112,"updated_at":113,"faqs":114,"releases":143},1282,"microsoft\u002FPromptWizard","PromptWizard","Task-Aware Agent-driven Prompt Optimization Framework","PromptWizard 是一个专注于提升 AI 提示（prompt）效果的开源框架，通过让大语言模型（LLM）自主生成、评估和优化提示内容，实现更高效的任务执行。它采用一种自适应机制，使模型在不断迭代中改进自身提示和示例，从而提高任务完成的质量与准确性。\n\n传统提示工程往往需要人工反复调整和测试，效率较低且难以覆盖所有可能的优化方向。PromptWizard 解决了这一问题，通过自动化的方式持续优化提示内容，同时还能生成多样化的示例，帮助模型更好地理解任务需求。其独特之处在于结合了反馈驱动的优化流程、多样化示例的生成以及自我生成的推理步骤，使得提示优化更加系统和高效。\n\nPromptWizard 适合研究人员和开发者使用，特别是那些希望提升模型性能、探索提示工程方法或进行大规模实验的用户。对于需要定制化提示优化方案的团队来说，它提供了一个灵活且强大的工具选择。","\n# PromptWizard 🧙\n\n\u003Cp align=\"left\">\n  \u003Ca href='https:\u002F\u002Farxiv.org\u002Fabs\u002F2405.18369'>\n    \u003Cimg src=https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2409.10566-b31b1b.svg>\n  \u003C\u002Fa>\n  \u003Ca 
href='https:\u002F\u002Fwww.microsoft.com\u002Fen-us\u002Fresearch\u002Fblog\u002Fpromptwizard-the-future-of-prompt-optimization-through-feedback-driven-self-evolving-prompts\u002F'>\n    \u003Cimg src=images\u002Fmsr_blog.png width=\"16\">\n    Blog Post\n  \u003C\u002Fa>\n  \u003Ca href='https:\u002F\u002Fmicrosoft.github.io\u002FPromptWizard\u002F'>\n    \u003Cimg src=images\u002Fgithub.png width=\"16\">\n    Project Website\n  \u003C\u002Fa>\n\u003C\u002Fp>\n\n\n> **PromptWizard: Task-Aware Prompt Optimization Framework**\u003Cbr>\n> Eshaan Agarwal, Joykirat Singh, Vivek Dani, Raghav Magazine, Tanuja Ganu, Akshay Nambi \u003Cbr>\n\n## Overview 🌟\n\u003Cp align=\"center\">Overview of the PromptWizard framework\u003C\u002Fp>\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fmicrosoft_PromptWizard_readme_fb38fbb9bc72.png\" >\n\nPromptWizard is a discrete prompt optimization framework that employs a self-evolving mechanism where the LLM generates, critiques, and refines its own prompts and examples, continuously improving through iterative feedback and synthesis. This self-adaptive approach ensures holistic optimization by evolving both the instructions and in-context learning examples for better task performance.\n\nThree key components of PromptWizard are the following :\n\n- Feedback-driven Refinement: LLM generates, critiques, and refines its own prompts and examples, continuously improving through iterative feedback and synthesis​\n- Critique and Synthesize diverse examples: Generates synthetic examples that are robust, diverse and task-aware. 
Also it optimizes both prompt and examples in tandem​\n- Self generated Chain of Thought (CoT) steps with combination of positive, negative and synthetic examples\n\n\u003Cp align=\"center\">Stage 1: Iterative optimization of instructions\u003C\u002Fp>\n\u003Cp align=\"center\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fmicrosoft_PromptWizard_readme_865ccd931029.png\" width=\"49.5%\" \u002F>\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">Stage 2: Sequential optimization of instruction and examples\u003C\u002Fp>\n\u003Cp align=\"center\">\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fmicrosoft_PromptWizard_readme_869ef47ee012.png\" width=\"49.5%\" \u002F>\n\u003C\u002Fp>\n\n## Installation ⬇️\n\nFollow these steps to set up the development environment and install the package:\n\n1) Clone the repository\n    ```\n    git clone https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FPromptWizard\n    cd PromptWizard\n    ```\n2) Create and activate a virtual environment\n\n    On Windows\n    ```\n    python -m venv venv\n    venv\\Scripts\\activate\n    ```\n    On macOS\u002FLinux:\n    ```\n    python -m venv venv\n    source venv\u002Fbin\u002Factivate\n    ```\n3) Install the package in development mode:\n    ```\n    pip install -e .\n    ```\n\n\n## Quickstart 🏃\n\nThere are three main ways to use PromptWizard:\n- Scenario 1 : Optimizing prompts without examples\n- Scenario 2 : Generating synthetic examples and using them to optimize prompts\n- Scenario 3 : Optimizing prompts with training data\n\n**NOTE** : Refer this [notebook](demos\u002Fscenarios\u002Fdataset_scenarios_demo.ipynb) to get a detailed understanding of the usage for each of the scenarios. 
**This serves as a starting point to understand the usage of PromptWizard**\n\n#### High-level overview of using PromptWizard\n- Decide your scenario\n- Fix the configuration and environment variables for API calling\n  - Use ```promptopt_config.yaml``` to set configurations. For example, for GSM8k this [file](demos\u002Fgsm8k\u002Fconfigs\u002Fpromptopt_config.yaml) can be used\n  - Use ```.env``` to set environment variables. For GSM8k this [file](demos\u002Fgsm8k\u002F.env) can be used\n  ```\n  USE_OPENAI_API_KEY=\"XXXX\"\n  # Replace with True\u002FFalse based on whether or not to use the OPENAI API key\n\n  # If the first variable is set to True then fill the following two\n  OPENAI_API_KEY=\"XXXX\"\n  OPENAI_MODEL_NAME=\"XXXX\"\n  \n  # If the first variable is set to False then fill the following three\n  AZURE_OPENAI_ENDPOINT=\"XXXXX\" \n  # Replace with your Azure OpenAI Endpoint\n\n  OPENAI_API_VERSION=\"XXXX\"\n  # Replace with the version of your API\n\n  AZURE_OPENAI_CHAT_DEPLOYMENT_NAME=\"XXXXX\"\n  # Create a deployment for the model and place the deployment name here. \n  ```\n- Run the code\n  - To run PromptWizard on your custom dataset please jump [here](#run-on-custom-dataset) \n\n#### Running PromptWizard with training data (Scenario 3)\n- We support the [GSM8k](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fopenai\u002Fgsm8k), [SVAMP](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FChilleD\u002FSVAMP), [AQUARAT](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fdeepmind\u002Faqua_rat) and [Instruction_Induction(BBII)](https:\u002F\u002Fgithub.com\u002Fxqlin98\u002FINSTINCT\u002Ftree\u002Fmain\u002FInduction\u002Fexperiments\u002Fdata\u002Finstruction_induction\u002Fraw) datasets\n- Please note that the time taken for prompt optimization is dependent on the dataset. 
In our experiments for the above-mentioned datasets, it took around 20 - 30 minutes on average.\n\n#### Running on GSM8k (AQUARAT\u002FSVAMP)\n\n- Please note that this code requires access to LLMs via API calling, for which we support AZURE endpoints or OPENAI keys\n- Set the AZURE endpoint configurations in [.env](demos\u002Fgsm8k\u002F.env)\n- Follow the steps in [demo.ipynb](demos\u002Fgsm8k\u002Fdemo.ipynb) to download the data, run the prompt optimization and carry out inference.\n\n#### Running on BBII\n\n- BBII contains many datasets; based on the dataset, set the configs [here](demos\u002Fbbh\u002Fconfigs\u002Fpromptopt_config.yaml)\n- In the configs, ```task_description```, ```base_instruction``` and ```answer_format``` need to be changed for different datasets in BBII; the rest of the configs remain the same\n- A demo is presented in [demo.ipynb](demos\u002Fbbh\u002Fdemo.ipynb)\n\n\n\n## Run on Custom Datasets 🗃️\n\n### Create Custom Dataset\n- Our code expects the dataset to be in ```.jsonl``` file format\n- Both the train and test set follow the same format\n- Every sample in the ```.jsonl``` should have 2 fields :\n  1) ```question``` : It should contain the complete question that is to be asked to the LLM\n  2) ```answer``` : It should contain the ground truth answer, which can be verbose or concise\n\n\n### Run on Custom Dataset\n\nNOTE : Refer to the [demos](demos) folder for examples of folders for four datasets. The ```.ipynb``` in each of the folders shows how to run PromptWizard on that particular dataset. A similar procedure can be followed for a new dataset. 
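A record in the two-field ```.jsonl``` format described above can be produced as follows (a minimal sketch; the sample questions, answers and the file name ```train.jsonl``` are illustrative):

```
import json

# Two illustrative GSM8k-style samples; every record carries only the
# two fields PromptWizard expects: question and answer.
samples = [
    {'question': 'A pen costs 3 dollars. How much do 4 pens cost?',
     'answer': '12'},
    {'question': 'Tom has 5 apples and eats 2. How many remain?',
     'answer': '3'},
]

# Write one JSON object per line, following the .jsonl convention.
with open('train.jsonl', 'w') as f:
    for s in samples:
        print(json.dumps(s), file=f)
```

The same layout applies to ```test.jsonl```; as noted above, the ```answer``` field may be verbose or concise.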
Below is a detailed explanation of each of the components of the ```.ipynb``` and the dataset-specific folder structure\n\n#### Steps to be followed for custom datasets \n\n1) Every new dataset needs to have the following \n    - ```configs``` folder to store files for defining optimization hyperparameters and setup configs \n    - ```data``` folder to store ```train.jsonl``` and ```test.jsonl``` as curated [here](#create-custom-dataset) (this is done in the notebooks)\n    - ```.env``` file for environment variables to be used for API calling\n    - ```.py\u002F.ipynb``` script to run the code\n\n2) Set the hyperparameters like number of mutations, refine steps, in-context examples etc.\n    - Set the following in [promptopt_config.yaml](demos\u002Fgsm8k\u002Fconfigs\u002Fpromptopt_config.yaml) : \n        - ```task_description``` : Description of the task at hand, which will be fed into the prompt\n          - For GSM8k a description like the following can be used\n            ```\n            You are a mathematics expert. You will be given a mathematics problem which you need to solve\n            ```\n        - ```base_instruction``` : Base instruction in line with the dataset\n          - A commonly used base instruction could be\n            ```\n            Let's think step by step.\n            ```\n        - ```answer_format``` : Instruction for specifying the answer format\n          - It is crucial to set the ```answer_format``` properly to ensure correct extraction by ```def extract_final_answer()```\n          - The answer format could be :\n            ```\n            At the end, wrap only your final option between \u003CANS_START> and \u003CANS_END> tags\n            ```\n            Then in ```def extract_final_answer()``` we can simply write code to extract the string between the tags\n          \n        - ```seen_set_size``` : The number of train samples to be used for prompt optimization\n          - In our experiments we set this to be 25. 
In general, any number between 20 and 50 would work \n        - ```few_shot_count``` : The number of in-context examples needed in the prompt\n          - The value can be set to any positive integer based on the requirement\n          - For generating zero-shot prompts, set the value to a small number (i.e. between 2 and 5), and after the final prompt is generated the in-context examples can be removed. We suggest using some in-context examples, as the instructions in the prompt are refined using the in-context examples during the optimization process; hence setting it to a small number will give better zero-shot instructions in the prompt\n        - ```generate_reasoning``` : Whether or not to generate reasoning for the in-context examples\n          - In our experiments we found it to improve the prompt overall, as it provides a step-by-step approach to reach the final answer. However, if there is a constraint on the prompt length or number of prompt tokens, it can be turned off to get smaller prompts\n        - ```generate_expert_identity``` and ```generate_intent_keywords``` : Having these helped improve the prompt, as they help make the prompt relevant to the task\n    - Refer to the ```promptopt_config.yaml``` files in the folders present [here](demos) for the descriptions used for AQUARAT, SVAMP and GSM8k. 
For BBII, refer to [description.py](demos\u002Fbbh\u002Fdescription.py), which has the meta instructions for each of the datasets\n    - Following are the global parameters which can be set based on the availability of the training data\n      - ```run_without_train_examples``` is a global hyperparameter which can be used when there are no training samples and in-context examples are not required in the final prompt \n      - ```generate_synthetic_examples``` is a global hyperparameter which can be used when there are no training samples and we want to generate synthetic data for training \n      - ```use_examples``` is a global hyperparameter which can be used to optimize prompts using training data \n3) Create a dataset-specific class which inherits from ```class DatasetSpecificProcessing```, similar to ```GSM8k(DatasetSpecificProcessing)``` in [demo.ipynb](demos\u002Fgsm8k\u002Fdemo.ipynb), and define the following functions in it\n      1) In ```def extract_answer_from_output()``` : This is a dataset-specific function; given the ```answer``` from the dataset it should extract and return a concise form of the answer. 
Note that, based on the dataset, it can also simply return the ```answer``` as is, as in the case of the SVAMP and AQUARAT datasets\n      2) ```def extract_final_answer()``` : This is an LLM-output-specific function; given the verbose answer from the LLM it should extract and return the concise final answer\n      3) Define ```def access_answer()``` : This function takes the LLM output as input, then does the following:\n         - Extracts the concise answer using ```def extract_final_answer()``` from the LLM output as defined above\n         - Evaluates the extracted answer against the ground truth and returns\n            - Extracted answer from LLM output\n            - Boolean value indicating if the answer is correct or not\n         - The evaluation done here is dataset specific. For datasets like GSM8k, SVAMP and AQUARAT, which have a number as the final answer, we can do a direct match between the numbers generated and the ground truth, while for datasets where the answer is a sentence or paragraph it would be better to do the evaluation with LLM-as-a-judge, to compare the generated and ground truth paragraph\u002Fsentence. An example is available in ```def access_answer()``` in [this](demos\u002Fbbh\u002Fdemo.ipynb) notebook\n\n\n## How PromptWizard Works 🔍\n- Using the problem description and initial prompt instruction, PW generates variations of the instruction by prompting LLMs to mutate it. Based on performance, the best prompt is selected. PW incorporates a critique component that provides feedback, thus guiding and refining the prompt over multiple iterations. \n- PW also optimizes in-context examples. PW selects a diverse set of examples from the training data, identifying positive and negative examples based on their performance with the modified prompt. Negative examples help inform further prompt refinements. \n- Examples and instructions are sequentially optimized, using the critique to generate synthetic examples that address the current prompt’s weaknesses. 
These examples are integrated to further refine the prompt. \n- PW generates detailed reasoning chains via Chain-of-Thought (CoT), enriching the prompt’s capacity for problem-solving. \n- PW aligns prompts with human reasoning by integrating task intent and expert personas, enhancing both model performance and interpretability.\n\n## Configurations ⚙️ \n\nHere we define the various hyperparameters used in the prompt optimization process, found in [promptopt_config.yaml](demos\u002Fgsm8k\u002Fconfigs\u002Fpromptopt_config.yaml)\n\n- ```mutate_refine_iterations```: Number of iterations of mutating the task description followed by refinement of the instructions\n- ```mutation_rounds```: Number of rounds of mutation to be performed when generating different styles\n- ```refine_task_eg_iterations```: Number of iterations for refining the task description and in-context examples \n- ```style_variation```: Number of thinking-style variations to be used in prompt mutation\n- ```questions_batch_size```: Number of questions to be asked to the LLM in a single batch during the training step\n- ```min_correct_count```: Minimum number of question batches to be answered correctly for a prompt to be considered as performing well\n- ```max_eval_batches```: Maximum number of mini-batches on which we should evaluate the prompt\n- ```top_n```: Number of best prompts to be carried over from the scoring stage to the next stage\n- ```seen_set_size```: Number of samples from the train set to be used for training\n- ```few_shot_count```: Number of in-context examples required in the final prompt\n\n## Best Practices 💡\n\nFollowing are some of the best practices we followed during our experiments \n- Regarding the parameters in [promptopt_config.yaml](demos\u002Fgsm8k\u002Fconfigs\u002Fpromptopt_config.yaml)\n    - We found the best-performing values for ```mutate_refine_iterations```, ```mutation_rounds``` and ```refine_task_eg_iterations``` to be 3 or 5\n    - Other parameters have been set to their ideal 
values. ```seen_set_size``` can be increased to 50, and ```few_shot_count``` can be set based on the use case\n- The prompts generated at the end of the training process are usually very detailed; however, user supervision can help tune them further for the task at hand\n- Both configurations, synthetic in-context examples and in-context examples from the train set, can be tried to find the best prompt for the use case. \n\n## Results 📈\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\".\u002Fimages\u002Fcurve.png\" width=\"45%\" \u002F>\n  \u003Cp align=\"center\">PromptWizard consistently outperforms other methods across various thresholds, maintaining the highest p(τ) values, indicating that it consistently performs near the best possible accuracy across all tasks\u003C\u002Fp>\n\u003C\u002Fp>\n\n\n- The figure shows the performance profile curve for the instruction induction tasks. The performance profile curve visualizes how frequently different approaches’ performance is within a given distance of the best performance. In this curve, the x-axis (τ) represents the performance ratio relative to the best-performing method, and the y-axis (p(τ)) reflects the fraction of tasks where a method’s performance is within this ratio. So, for a given method, the curve tells what percentage of the tasks are within τ distance of the best performance. \n\n\n## How to contribute: ✋\nThis project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https:\u002F\u002Fcla.microsoft.com.\nWhen you submit a pull request, a CLA-bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions provided by the bot. 
You will only need to do this once across all repositories using our CLA.\nThis project has adopted the [Microsoft Open Source Code of Conduct](https:\u002F\u002Fopensource.microsoft.com\u002Fcodeofconduct\u002F). For more information see the [Code of Conduct FAQ](https:\u002F\u002Fopensource.microsoft.com\u002Fcodeofconduct\u002Ffaq\u002F) or contact opencode@microsoft.com with any additional questions or comments.\n\n## Citation 📝\n\nIf you make use of our work, please cite our paper:\n\n```\n@misc{agarwal2024promptwizardtaskawarepromptoptimization,\n      title={PromptWizard: Task-Aware Prompt Optimization Framework}, \n      author={Eshaan Agarwal and Joykirat Singh and Vivek Dani and Raghav Magazine and Tanuja Ganu and Akshay Nambi},\n      year={2024},\n      eprint={2405.18369},\n      archivePrefix={arXiv},\n      primaryClass={cs.CL},\n      url={https:\u002F\u002Farxiv.org\u002Fabs\u002F2405.18369}, \n}\n```\n## Responsible AI Considerations \nFor guidelines and best practices related to Responsible AI, please refer to our [Responsible AI Guidelines](RESPONSIBLE_AI.md).\n\n","# PromptWizard 🧙\n\n\u003Cp align=\"left\">\n  \u003Ca href='https:\u002F\u002Farxiv.org\u002Fabs\u002F2405.18369'>\n    \u003Cimg src=https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2409.10566-b31b1b.svg>\n  \u003C\u002Fa>\n  \u003Ca href='https:\u002F\u002Fwww.microsoft.com\u002Fen-us\u002Fresearch\u002Fblog\u002Fpromptwizard-the-future-of-prompt-optimization-through-feedback-driven-self-evolving-prompts\u002F'>\n    \u003Cimg src=images\u002Fmsr_blog.png width=\"16\">\n    博客文章\n  \u003C\u002Fa>\n  \u003Ca href='https:\u002F\u002Fmicrosoft.github.io\u002FPromptWizard\u002F'>\n    \u003Cimg src=images\u002Fgithub.png width=\"16\">\n    项目网站\n  \u003C\u002Fa>\n\u003C\u002Fp>\n\n\n> **PromptWizard：任务感知的提示优化框架**\u003Cbr>\n> Eshaan Agarwal、Joykirat Singh、Vivek Dani、Raghav Magazine、Tanuja Ganu、Akshay Nambi \u003Cbr>\n\n## 概述 🌟\n\u003Cp 
align=\"center\">PromptWizard框架概述\u003C\u002Fp>\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fmicrosoft_PromptWizard_readme_fb38fbb9bc72.png\" >\n\nPromptWizard是一个离散的提示优化框架，采用自进化机制，由大语言模型生成、评估并优化自身的提示与示例，通过迭代反馈与综合不断改进。这种自适应方法通过同时演化指令与上下文学习示例，确保整体优化，从而提升任务表现。\n\nPromptWizard的三大核心组件如下：\n\n- 反馈驱动的优化：大语言模型生成、评估并优化自身的提示与示例，通过迭代反馈与综合不断改进​  \n- 多样化示例的评估与合成：生成鲁棒、多样且任务感知的合成示例，并同步优化提示与示例​  \n- 自主生成思维链（CoT）步骤，结合正面、负面及合成示例  \n\n\u003Cp align=\"center\">阶段1：指令的迭代优化\u003C\u002Fp>\n\u003Cp align=\"center\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fmicrosoft_PromptWizard_readme_865ccd931029.png\" width=\"49.5%\" \u002F>\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">阶段2：指令与示例的顺序优化\u003C\u002Fp>\n\u003Cp align=\"center\">\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fmicrosoft_PromptWizard_readme_869ef47ee012.png\" width=\"49.5%\" \u002F>\n\u003C\u002Fp>\n\n## 安装 ⬇️\n\n按照以下步骤设置开发环境并安装软件包：\n\n1) 克隆仓库\n    ```\n    git clone https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FPromptWizard\n    cd PromptWizard\n    ```\n2) 创建并激活虚拟环境\n\n    在Windows上\n    ```\n    python -m venv venv\n    venv\\Scripts\\activate\n    ```\n    在macOS\u002FLinux上：\n    ```\n    python -m venv venv\n    source venv\u002Fbin\u002Factivate\n    ```\n3) 以开发模式安装软件包：\n    ```\n    pip install -e .\n    ```\n\n\n## 快速入门 🏃\n\n使用PromptWizard主要有三种方式：\n- 场景1：在无示例的情况下优化提示\n- 场景2：生成合成示例并利用其优化提示\n- 场景3：结合训练数据优化提示\n\n**注意**：请参阅此[笔记本](demos\u002Fscenarios\u002Fdataset_scenarios_demo.ipynb)，以详细了解每种场景的具体用法。**这将作为理解PromptWizard使用方法的起点**\n\n#### 使用PromptWizard的高级概览\n- 确定您的使用场景\n- 设置API调用所需的配置与环境变量\n  - 使用```promptopt_config.yaml```文件进行配置。例如，对于GSM8k，可使用此[文件](demos\u002Fgsm8k\u002Fconfigs\u002Fpromptopt_config.yaml)\n  - 使用```.env```文件设置环境变量。对于GSM8k，可使用此[文件](demos\u002Fgsm8k\u002F.env)\n  ```\n  USE_OPENAI_API_KEY=\"XXXX\"\n  # 根据是否使用OPENAI API密钥，将其设置为True或False\n\n  # 如果第一个变量设为True，则填写以下两项\n  OPENAI_API_KEY=\"XXXX\"\n  OPENAI_MODEL_NAME =\"XXXX\"\n  
\n  # 如果第一个变量设为False，则填写以下三项\n  AZURE_OPENAI_ENDPOINT=\"XXXXX\" \n  # 替换为您的Azure OpenAI Endpoint\n\n  OPENAI_API_VERSION=\"XXXX\"\n  # 替换为您的API版本\n\n  AZURE_OPENAI_CHAT_DEPLOYMENT_NAME=\"XXXXX\"\n  # 为模型创建部署，并在此处填写部署名称。 \n  ```\n- 运行代码\n  - 如需在自定义数据集上运行PromptWizard，请跳转至[此处](#run-on-custom-dataset) \n\n#### 使用训练数据运行PromptWizard（场景3）\n- 我们支持[GSM8k](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fopenai\u002Fgsm8k)、[SVAMP](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FChilleD\u002FSVAMP)、[AQUARAT](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fdeepmind\u002Faqua_rat)以及[Instruction_Induction(BBII)](https:\u002F\u002Fgithub.com\u002Fxqlin98\u002FINSTINCT\u002Ftree\u002Fmain\u002FInduction\u002Fexperiments\u002Fdata\u002Finstruction_induction\u002Fraw)数据集\n- 请注意，提示优化所需时间取决于数据集。在我们针对上述数据集的实验中，平均耗时约为20至30分钟。\n\n#### 在GSM8k上运行（AQUARAT\u002FSVAMP）\n\n- 请注意，这段代码需要通过API调用访问大语言模型，而我们支持AZURE端点或OPENAI密钥\n- 在[demos\u002Fgsm8k\u002F.env](demos\u002Fgsm8k\u002F.env)中设置AZURE端点配置\n- 按照[demos\u002Fgsm8k\u002Fdemo.ipynb](demos\u002Fgsm8k\u002Fdemo.ipynb)中的步骤下载数据、运行提示优化并进行推理。\n\n#### 在BBII上运行\n\n- BBII包含众多数据集，根据具体数据集设置[此处](demos\u002Fbbh\u002Fconfigs\u002Fpromptopt_config.yaml)的配置\n- 在配置中，```task_description```、```base_instruction```和```answer_format```需根据不同BBII数据集进行调整，其余配置保持不变\n- 示例演示见[demos\u002Fbbh\u002Fdemo.ipynb](demos\u002Fbbh\u002Fdemo.ipynb)\n\n\n\n## 在自定义数据集上运行 🗃️\n\n### 创建自定义数据集\n- 我们的代码要求数据集以```jsonl```文件格式提供\n- 训练集与测试集均遵循相同格式\n- 每个```jsonl```文件中的样本应包含两个字段：\n  1) ```question```：应包含需向大语言模型提出的完整问题\n  2) ```answer```：应包含真实答案，可以是详尽的或简洁的\n\n### 在自定义数据集上运行\n\n注意：有关四个数据集的文件夹示例，请参阅 [demos](demos) 文件夹。每个文件夹中的 ```.ipynb``` 文件展示了如何在该特定数据集上运行 PromptWizard。对于新数据集，也可以采用类似的流程。下面将详细说明 ```.ipynb``` 文件中的各个组件以及针对特定数据集的文件夹结构。\n\n#### 自定义数据集的操作步骤\n\n1) 每个新数据集都需要包含以下内容：\n   - ```configs``` 文件夹，用于存放定义优化超参数和设置配置的文件；\n   - ```data``` 文件夹，用于存放经过整理的 ```train.jsonl``` 和 ```test.jsonl``` 文件（详见[创建自定义数据集](#create-custom-dataset)），这些文件将在笔记本中生成；\n   - ```.env``` 文件，用于存储调用 API 所需的环境变量；\n   - 
`.py`/`.ipynb` script to run the code.

2) Set the hyperparameters, such as the number of mutations, refinement steps, in-context examples, etc.
   - Set the following in [promptopt_config.yaml](demos/gsm8k/configs/promptopt_config.yaml):
     - `task_description`: a description of the task at hand, which will be fed into the prompt.
       - For GSM8k a description like the following can be used:
         ```
         You are a mathematics expert. You will be given a mathematics problem which you need to solve
         ```
     - `base_instruction`: a base instruction in line with the dataset.
       - A commonly used base instruction could be:
         ```
         Lets think step by step.
         ```
     - `answer_format`: an instruction specifying the format of the answer.
       - Setting `answer_format` correctly is crucial so that `def extract_final_answer()` can extract the answer reliably.
       - The answer format could be:
         ```
         At the end, wrap only your final option between <ANS_START> and <ANS_END> tags
         ```
       - Then in `def extract_final_answer()` we simply write code to extract the string between these two tags.
     - `seen_set_size`: the number of training samples used for prompt optimization.
       - In our experiments we set this to 25. In general, anything between 20 and 50 works.
     - `few_shot_count`: the number of in-context examples needed in the prompt.
       - It can be set to any positive integer based on the requirement.
       - To generate a zero-shot prompt, set this to a small number (i.e., 2 to 5) and remove the in-context examples after the final prompt is generated. We recommend keeping some in-context examples because, during optimization, the instruction in the prompt is refined with their help, so a small value here helps generate better zero-shot instructions.
     - `generate_reasoning`: whether to generate reasoning for the in-context examples.
       - In our experiments we found this improves the prompt overall, as it provides a step-by-step path to the final answer. However, if there are constraints on prompt length or token count, it can be turned off to obtain shorter prompts.
     - `generate_expert_identity` and `generate_intent_keywords`: enabling these improves the relevance of the prompt, tailoring it more closely to the task.
   - Refer to the `promptopt_config.yaml` files in each folder [here](demos) for the descriptions used for AQUARAT, SVAMP and GSM8k. For BBII, refer to [description.py](demos/bbh/description.py), which contains the meta-instructions for each dataset.
   - The following are global parameters which can be set based on the availability of training data:
     - `run_without_train_examples` is a global hyperparameter for when there are no training samples and in-context examples are not desired in the final prompt.
     - `generate_synthetic_examples` is a global hyperparameter for when there are no training samples and synthetic data is to be generated for training.
     - `use_examples` is a global hyperparameter which can be used to optimize prompts using training data.

3) Create a dataset-specific class which inherits `class DatasetSpecificProcessing`, similar to `GSM8k(DatasetSpecificProcessing)` in [demo.ipynb](demos/gsm8k/demo.ipynb), and define the following functions in it:
   1) In `def extract_answer_from_output()`: this is a dataset-specific function which, given the `answer` provided by the dataset, extracts and returns it in a concise form. Note that depending on the dataset it may also simply return the raw `answer`, as for the SVAMP and AQUARAT datasets.
   2) `def extract_final_answer()`: this is a function applied to LLM output which, given the LLM's detailed response, extracts and returns the concise final answer.
   3) Define `def access_answer()`: this function takes the LLM output as input, then:
      - uses `def extract_final_answer()` defined above to extract the concise answer from the LLM output;
      - compares the extracted answer with the ground truth and returns:
        - the extracted answer from the LLM output;
        - a boolean indicating whether the answer is correct.
      - The evaluation here is dataset specific. For datasets such as GSM8k, SVAMP and AQUARAT, where the final answer is a number, the generated number can be compared directly with the ground truth; for datasets where the answer is a sentence or paragraph, an LLM-as-a-judge comparison of the generated paragraph/sentence against the ground truth is more appropriate. An example can be found in `def access_answer()` in [this](demos/bbh/demo.ipynb) notebook.


## How PromptWizard works 🔍
- Using the problem description and an initial prompt instruction, PW prompts the LLM to mutate the instruction into multiple variants. Based on performance, the best prompt is selected. PW also incorporates a critique component that provides feedback, guiding and refining the prompt over multiple iterations.
- PW optimizes the in-context examples as well. It selects diverse examples from the training data and identifies positive and negative examples based on their performance with the modified prompt. Negative examples help inform further prompt refinements.
- Examples and instructions are refined sequentially, using the critique to generate synthetic examples that address the current prompt's weaknesses; these are integrated to refine the prompt further.
- PW generates detailed reasoning chains via Chain-of-Thought (CoT), improving the prompt's problem-solving capacity.
- PW aligns prompts with human reasoning by integrating task intent and expert personas, which enhances both model performance and interpretability.

## Configurations ⚙️

Here we define the various hyperparameters used in the prompt-optimization process, found in [promptopt_config.yaml](demos/gsm8k/configs/promptopt_config.yaml):

- `mutate_refine_iterations`: number of iterations for conducting mutation of the task description followed by refinement of instructions
- `mutation_rounds`: number of rounds of mutation to be performed when generating different styles
- `refine_task_eg_iterations`: number of iterations for refining the task description and in-context examples
- `style_variation`: number of thinking-style variations to be used in prompt mutation
- `questions_batch_size`: number of questions asked to the LLM in a single batch during the training step
- `min_correct_count`: minimum number of batches of questions that must be answered correctly for a prompt to be considered as performing well
- `max_eval_batches`: maximum number of mini-batches on which a prompt should be evaluated
- `top_n`: the top n best-performing prompts from the scoring stage that are carried over to the next stage
- `seen_set_size`: number of samples from the training set used for training
- `few_shot_count`: number of in-context examples required in the final prompt

## Best practices 💡

The following are some best practices we followed during our experiments:
- Regarding the parameters in [promptopt_config.yaml](demos/gsm8k/configs/promptopt_config.yaml):
    - We found the best values for `mutate_refine_iterations`, `mutation_rounds` and `refine_task_eg_iterations` to be 3 or 5.
    - The other parameters are set to ideal values. `seen_set_size` can be increased to 50, and `few_shot_count` can be set based on the specific use case.
- The prompts generated at the end of the training process tend to be very detailed, but user supervision can help tune them further for the task at hand.
- Trying both configurations — synthetic in-context examples and in-context examples drawn from the training set — helps find the best prompt for a given use case.

## Results 📈

<p align="center">
  <img src="./images/curve.png" width="45%" />
  <p align="center">PromptWizard consistently outperforms other methods across various thresholds, maintaining the highest p(τ) values, indicating that it reliably approaches the best achievable accuracy across all tasks.</p>
</p>


- The figure shows the performance profile curve for the instruction-induction tasks. The performance profile curve visualizes how frequently different methods' performance comes within a given distance of the best performance. On this curve, the x-axis (τ) represents the performance ratio relative to the best-performing method, and the y-axis (p(τ)) reflects the fraction of tasks on which a method's performance falls within this ratio. Thus, for a given method, the curve tells what percentage of tasks have performance within a distance τ of the best performance.


## How to contribute: ✋
This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.microsoft.com.
When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repositories using our CLA.
This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/). For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or contact opencode@microsoft.com with any additional questions or comments.

## Citation 📝

If you make use of our work, please cite our paper:

```
@misc{agarwal2024promptwizardtaskawarepromptoptimization,
      title={PromptWizard: Task-Aware Prompt Optimization Framework}, 
      author={Eshaan Agarwal and Joykirat Singh and Vivek Dani and Raghav Magazine and Tanuja Ganu and Akshay Nambi},
      year={2024},
      eprint={2405.18369},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2405.18369}, 
}
```
## Responsible AI considerations
For guidelines and best practices related to Responsible AI, please refer to our [Responsible AI guidelines](RESPONSIBLE_AI.md).

# PromptWizard Quick Start Guide

## Environment Setup

### System requirements
- Python 3.8 or later
- Windows, macOS and Linux are supported

### Prerequisites
- Git (to clone the repository)
- A Python virtual environment (`venv` is recommended)

---

## Installation

1. **Clone the repository**
    ```bash
    git clone https://github.com/microsoft/PromptWizard
    cd PromptWizard
    ```

2. **Create and activate a virtual environment**

   - Windows:
    ```bash
    python -m venv venv
    venv\Scripts\activate
    ```
   - macOS/Linux:
    ```bash
    python -m venv venv
    source venv/bin/activate
    ```

3. **Install in development mode**
    ```bash
    pip install -e .
    ```

> 💡 Tip: to speed up dependency downloads, a PyPI mirror can be used, for example:
> ```bash
> pip install -e . -i https://pypi.tuna.tsinghua.edu.cn/simple
> ```

---

## Basic Usage

### Usage scenarios

PromptWizard supports three main modes of use:

1. **Optimizing prompts without examples**
2. **Generating synthetic examples and using them to optimize prompts**
3. **Optimizing prompts with training data**

The following walks through **Scenario 3 (using training data)**, running PromptWizard on the GSM8k dataset.

### Steps

1. **Configuration**
   - Edit `promptopt_config.yaml` to set the task description, base instruction, answer format and other parameters. Example (GSM8k):
     ```yaml
     task_description: "You are a mathematics expert. You will be given a mathematics problem which you need to solve"
     base_instruction: "Let's think step by step."
     answer_format: "At the end, wrap only your final option between <ANS_START> and <ANS_END> tags"
     seen_set_size: 25
     few_shot_count: 5
     generate_reasoning: true
     ```

   - Configure API access in a `.env` file (depending on whether you use OpenAI or Azure):
     ```env
     USE_OPENAI_API_KEY="True"
     OPENAI_API_KEY="your_openai_api_key"
     OPENAI_MODEL_NAME="gpt-3.5-turbo"
     ```

2. **Run the code**
   - Run PromptWizard with a command such as:
     ```bash
     python run_prompt_optimization.py --config_path demos/gsm8k/configs/promptopt_config.yaml
     ```

   > 📌 Note: refer to the `demo.ipynb` of the corresponding dataset for the exact script name and path.

3. **Inspect the results**
   - After the run completes, the optimized prompt, inference outputs and evaluation metrics can be found in the output directory.

---

### Custom datasets

If you have a custom dataset, prepare the data with the following structure:

- The data format is `.jsonl`, with two fields per sample:
  - `question`: the full question to be fed to the LLM
  - `answer`: the ground-truth answer (in concise or detailed form)

- Example `.jsonl` content:
  ```json
  {"question": "What is 2 + 2?", "answer": "4"}
  ```

- Suggested folder structure:
  ```
  your_dataset/
  ├── configs/
  │   └── promptopt_config.yaml
  ├── data/
  │   ├── train.jsonl
  │   └── test.jsonl
  ├── .env
  └── demo.ipynb
  ```

> ✅ The GSM8k, AQUARAT, SVAMP and BBII examples under the `demos` directory are good templates for setting up your own dataset structure.

## Example scenario: an e-commerce support team

The content operations team of an e-commerce company is designing and optimizing the prompts that its customer-service chatbot uses to parse user questions, aiming to improve intent-recognition accuracy and thereby customer satisfaction and handling efficiency.

### Without PromptWizard
- The team has to write and tune prompts by hand, which is time-consuming and struggles to cover all possible user queries.
- Prompt quality depends on individual judgment; without a systematic optimization method, performance varies widely across tasks.
- It is hard to produce diverse in-context examples, so the model generalizes poorly to new questions.
- After changing a prompt, its effect cannot be assessed quickly; it must be validated by manual testing or large sample runs, which is inefficient.

### With PromptWizard
- An automated iteration loop continuously generates, evaluates and refines prompts, significantly reducing manual intervention and debugging time.
- The framework adapts instructions and examples to the characteristics of the task, keeping the model consistent and accurate across scenarios.
- Diverse, task-aware synthetic examples strengthen the model's handling of complex or edge cases.
- A feedback-driven optimization process evaluates prompt improvements as they are made, speeding up iteration and raising overall performance.

Through its adaptive optimization mechanism, PromptWizard helps the team efficiently improve the chatbot's intent recognition, delivering a more precise and stable user experience.

## Requirements

- Python: 3.8+
- License: MIT
- Platforms: Linux, macOS, Windows
- Dependencies: torch, transformers, accelerate, datasets, tqdm, pyyaml, openai, azure-ai-openai
- Note: an LLM must be reachable via API (e.g., OpenAI or Azure OpenAI), with the corresponding API key and endpoint configured. Installing inside a virtual environment is recommended; see the provided configuration files (`.env` and `promptopt_config.yaml`).

## FAQ (from GitHub issues)

**Q: How do I resolve "access denied" errors with the Azure OpenAI service?** ([#18](https://github.com/microsoft/PromptWizard/issues/18))

Make sure your Azure OpenAI key and endpoint are correct and have permission to access the API. If the problem persists, you can run a local model with the vLLM framework and modify the `call_api` function to connect to it. Example code:
```python
def call_api(messages):
    from openai import OpenAI
    client = OpenAI(
        base_url=os.environ.get("AZURE_OPENAI_ENDPOINT", "http://localhost:8000/v1"),
        api_key=os.environ.get("AZURE_OPENAI_API_KEY", "token-abc123")
    )
    response = client.chat.completions.create(
            model=os.environ["OPENAI_MODEL_NAME"],
            messages=messages,
            temperature=0.0,
        )
    prediction = response.choices[0].message.content
    return prediction
```
Also confirm that the `AZURE_OPENAI_ENDPOINT` and `AZURE_OPENAI_API_KEY` environment variables are set correctly.

**Q: How do I configure the Azure API key?** ([#18](https://github.com/microsoft/PromptWizard/issues/18))

Set the `AZURE_OPENAI_API_KEY` and `AZURE_OPENAI_ENDPOINT` environment variables in the `.env` file, making sure the key and endpoint address are correct. Also confirm that you have access to the Azure OpenAI service.

**Q: Why does the `get_best_prompt` method return an empty `best_prompt`?** ([#24](https://github.com/microsoft/PromptWizard/issues/24))

When `run_without_train_examples=True` and there is neither training data nor synthetic data, `get_best_prompt` may be unable to produce a valid prompt. Generate synthetic data and pass it in to optimize the prompt, e.g., by setting `generate_synthetic_examples` to `True`, or supply real training data.

**Q: How do I fix the error `No module named 'azure'`?** ([#10](https://github.com/microsoft/PromptWizard/issues/10))

This error indicates a missing Python dependency. Install the relevant `azure` libraries via pip:
```
pip install azure
```
The maintainers have also updated the relevant files to avoid this error, so try the latest version of the code.

**Q: Does PromptWizard support API keys from platforms other than OpenAI (e.g., Google, GLM4, DeepSeek)?** ([#14](https://github.com/microsoft/PromptWizard/issues/14))

PromptWizard currently targets the OpenAI API, but other platforms' APIs can be supported by modifying the `call_llm` function in `llm_mgr.py`, for example with a custom function that integrates Google, GLM4 or DeepSeek models.

**Q: How can PromptWizard use a local API endpoint (e.g., Ollama)?** ([#14](https://github.com/microsoft/PromptWizard/issues/14))

Modify the `call_llm` function in `llm_mgr.py` so that it is compatible with a local API endpoint (such as Ollama); adjust the function logic to send requests to and receive responses from the local service.
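## Appendix: illustrative sketches

The dataset-specific class described in step 3 can be sketched as follows. This is a minimal illustration, not the repository's actual implementation: `DatasetSpecificProcessing` here is a stub standing in for PromptWizard's base class, and the tag-extraction logic assumes the `<ANS_START>`/`<ANS_END>` answer format shown above.

```python
import re

ANS_START, ANS_END = "<ANS_START>", "<ANS_END>"


class DatasetSpecificProcessing:
    """Stub standing in for PromptWizard's base class (illustrative only)."""


class GSM8k(DatasetSpecificProcessing):
    def extract_answer_from_output(self, answer: str) -> str:
        # GSM8k ground-truth answers end with "#### <number>"; keep only the number.
        match = re.search(r"####\s*(-?[\d,.]+)", answer)
        return match.group(1).replace(",", "") if match else answer.strip()

    def extract_final_answer(self, llm_output: str) -> str:
        # Extract the string between the ANS_START / ANS_END tags.
        match = re.search(re.escape(ANS_START) + r"(.*?)" + re.escape(ANS_END),
                          llm_output, re.DOTALL)
        return match.group(1).strip() if match else llm_output.strip()

    def access_answer(self, llm_output: str, gt_answer: str):
        # Numeric comparison suffices for GSM8k-style datasets; for free-text
        # answers an LLM-as-a-judge comparison would replace this check.
        predicted = self.extract_final_answer(llm_output)
        try:
            is_correct = float(predicted) == float(gt_answer)
        except ValueError:
            is_correct = predicted == gt_answer
        return predicted, is_correct
```

For a sentence-level dataset, only `access_answer` would change, swapping the numeric check for a judge call.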
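The p(τ) curve in the Results section can be computed from per-task scores. The sketch below follows the Dolan–Moré reading of a performance profile, assuming higher scores are better and taking the ratio for a method on a task to be best_score / method_score (≥ 1); this is an illustrative interpretation of the curve described above, not the paper's exact computation.

```python
def performance_profile(scores, taus):
    """scores: {method: [accuracy per task]} (higher is better).
    Returns {method: [p(tau) for tau in taus]}, where p(tau) is the fraction
    of tasks on which best_accuracy / accuracy <= tau."""
    n_tasks = len(next(iter(scores.values())))
    best = [max(s[t] for s in scores.values()) for t in range(n_tasks)]
    profile = {}
    for method, accs in scores.items():
        ratios = [best[t] / accs[t] if accs[t] > 0 else float("inf")
                  for t in range(n_tasks)]
        profile[method] = [sum(r <= tau for r in ratios) / n_tasks
                           for tau in taus]
    return profile
```

A method whose curve reaches p(τ) = 1 at small τ is close to the best performer on every task.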
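The two-field `.jsonl` format expected for custom datasets can be produced and validated with a few lines of standard-library code. The helper names here are hypothetical, written only to mirror the `question`/`answer` schema described in the custom-dataset section.

```python
import json


def write_jsonl(path, samples):
    """Write samples in the two-field question/answer format, one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for sample in samples:
            assert set(sample) == {"question", "answer"}, \
                "each sample needs exactly the fields 'question' and 'answer'"
            f.write(json.dumps(sample, ensure_ascii=False) + "\n")


def read_jsonl(path):
    """Read a .jsonl file back into a list of dicts, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

Writing `train.jsonl` and `test.jsonl` into the suggested `data/` folder with these helpers matches the layout shown above.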
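The mutate-score-select loop described in "How PromptWizard works" can be sketched as a minimal skeleton. This is a simplified illustration of the idea only, without the critique, example-refinement, or CoT components; `llm` and `evaluate` are caller-supplied callables, and every name here is illustrative rather than PromptWizard's API.

```python
def optimize_prompt(llm, seed_instruction, train_batch, evaluate,
                    mutation_rounds=3, style_variation=5):
    """Minimal mutate-and-select loop in the spirit of the description above.
    llm(prompt) -> str produces an instruction variant; evaluate(instruction,
    batch) -> float scores it on a batch of training questions."""
    best_instruction = seed_instruction
    best_score = evaluate(seed_instruction, train_batch)
    for _ in range(mutation_rounds):
        # Ask the LLM for several stylistic variants of the current best instruction.
        variants = [
            llm("Rewrite this instruction in a different thinking style:\n"
                + best_instruction)
            for _ in range(style_variation)
        ]
        # Keep whichever variant scores best on the training batch.
        for candidate in variants:
            score = evaluate(candidate, train_batch)
            if score > best_score:
                best_instruction, best_score = candidate, score
    return best_instruction, best_score
```

In the full system the critique feedback and synthetic negative examples would steer each mutation round rather than leaving selection to score alone.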