[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-uptrain-ai--uptrain":3,"tool-uptrain-ai--uptrain":61},[4,18,26,36,44,53],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":17},4358,"openclaw","openclaw\u002Fopenclaw","OpenClaw 是一款专为个人打造的本地化 AI 助手，旨在让你在自己的设备上拥有完全可控的智能伙伴。它打破了传统 AI 助手局限于特定网页或应用的束缚，能够直接接入你日常使用的各类通讯渠道，包括微信、WhatsApp、Telegram、Discord、iMessage 等数十种平台。无论你在哪个聊天软件中发送消息，OpenClaw 都能即时响应，甚至支持在 macOS、iOS 和 Android 设备上进行语音交互，并提供实时的画布渲染功能供你操控。\n\n这款工具主要解决了用户对数据隐私、响应速度以及“始终在线”体验的需求。通过将 AI 部署在本地，用户无需依赖云端服务即可享受快速、私密的智能辅助，真正实现了“你的数据，你做主”。其独特的技术亮点在于强大的网关架构，将控制平面与核心助手分离，确保跨平台通信的流畅性与扩展性。\n\nOpenClaw 非常适合希望构建个性化工作流的技术爱好者、开发者，以及注重隐私保护且不愿被单一生态绑定的普通用户。只要具备基础的终端操作能力（支持 macOS、Linux 及 Windows WSL2），即可通过简单的命令行引导完成部署。如果你渴望拥有一个懂你",349277,3,"2026-04-06T06:32:30",[13,14,15,16],"Agent","开发框架","图像","数据工具","ready",{"id":19,"name":20,"github_repo":21,"description_zh":22,"stars":23,"difficulty_score":10,"last_commit_at":24,"category_tags":25,"status":17},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,"2026-04-05T11:01:52",[14,15,13],{"id":27,"name":28,"github_repo":29,"description_zh":30,"stars":31,"difficulty_score":32,"last_commit_at":33,"category_tags":34,"status":17},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 
代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",160015,2,"2026-04-18T11:30:52",[14,13,35],"语言模型",{"id":37,"name":38,"github_repo":39,"description_zh":40,"stars":41,"difficulty_score":32,"last_commit_at":42,"category_tags":43,"status":17},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",109154,"2026-04-18T11:18:24",[14,15,13],{"id":45,"name":46,"github_repo":47,"description_zh":48,"stars":49,"difficulty_score":32,"last_commit_at":50,"category_tags":51,"status":17},6121,"gemini-cli","google-gemini\u002Fgemini-cli","gemini-cli 是一款由谷歌推出的开源 AI 命令行工具，它将强大的 Gemini 大模型能力直接集成到用户的终端环境中。对于习惯在命令行工作的开发者而言，它提供了一条从输入提示词到获取模型响应的最短路径，无需切换窗口即可享受智能辅助。\n\n这款工具主要解决了开发过程中频繁上下文切换的痛点，让用户能在熟悉的终端界面内直接完成代码理解、生成、调试以及自动化运维任务。无论是查询大型代码库、根据草图生成应用，还是执行复杂的 Git 操作，gemini-cli 都能通过自然语言指令高效处理。\n\n它特别适合广大软件工程师、DevOps 人员及技术研究人员使用。其核心亮点包括支持高达 100 万 token 的超长上下文窗口，具备出色的逻辑推理能力；内置 Google 搜索、文件操作及 Shell 命令执行等实用工具；更独特的是，它支持 MCP（模型上下文协议），允许用户灵活扩展自定义集成，连接如图像生成等外部能力。此外，个人谷歌账号即可享受免费的额度支持，且项目基于 Apache 2.0 
协议完全开源，是提升终端工作效率的理想助手。",100752,"2026-04-10T01:20:03",[52,13,15,14],"插件",{"id":54,"name":55,"github_repo":56,"description_zh":57,"stars":58,"difficulty_score":32,"last_commit_at":59,"category_tags":60,"status":17},4721,"markitdown","microsoft\u002Fmarkitdown","MarkItDown 是一款由微软 AutoGen 团队打造的轻量级 Python 工具，专为将各类文件高效转换为 Markdown 格式而设计。它支持 PDF、Word、Excel、PPT、图片（含 OCR）、音频（含语音转录）、HTML 乃至 YouTube 链接等多种格式的解析，能够精准提取文档中的标题、列表、表格和链接等关键结构信息。\n\n在人工智能应用日益普及的今天，大语言模型（LLM）虽擅长处理文本，却难以直接读取复杂的二进制办公文档。MarkItDown 恰好解决了这一痛点，它将非结构化或半结构化的文件转化为模型“原生理解”且 Token 效率极高的 Markdown 格式，成为连接本地文件与 AI 分析 pipeline 的理想桥梁。此外，它还提供了 MCP（模型上下文协议）服务器，可无缝集成到 Claude Desktop 等 LLM 应用中。\n\n这款工具特别适合开发者、数据科学家及 AI 研究人员使用，尤其是那些需要构建文档检索增强生成（RAG）系统、进行批量文本分析或希望让 AI 助手直接“阅读”本地文件的用户。虽然生成的内容也具备一定可读性，但其核心优势在于为机器",93400,"2026-04-06T19:52:38",[52,14],{"id":62,"github_repo":63,"name":64,"description_en":65,"description_zh":66,"ai_summary_zh":66,"readme_en":67,"readme_zh":68,"quickstart_zh":69,"use_case_zh":70,"hero_image_url":71,"owner_login":72,"owner_name":73,"owner_avatar_url":74,"owner_bio":75,"owner_company":76,"owner_location":76,"owner_email":76,"owner_twitter":76,"owner_website":77,"owner_url":78,"languages":79,"stars":100,"forks":101,"last_commit_at":102,"license":103,"difficulty_score":10,"env_os":104,"env_gpu":104,"env_ram":104,"env_deps":105,"category_tags":108,"github_topics":110,"view_count":32,"oss_zip_url":76,"oss_zip_packed_at":76,"status":17,"created_at":125,"updated_at":126,"faqs":127,"releases":163},9318,"uptrain-ai\u002Fuptrain","uptrain","UpTrain is an open-source unified platform to evaluate and improve Generative AI applications. 
We provide grades for 20+ preconfigured checks (covering language, code, embedding use-cases), perform root cause analysis on failure cases and give insights on how to resolve them.","UpTrain 是一个专为生成式 AI 应用打造的开源评估与优化平台。在开发大模型应用时，开发者往往难以量化模型输出的质量，也不易快速定位回答错误的具体原因。UpTrain 正是为了解决这一痛点而生，它提供了一套统一的解决方案，帮助用户系统性地评测应用表现并持续改进。\n\n该平台内置了超过 20 种预配置的检查项，涵盖自然语言处理、代码生成以及向量嵌入等多种应用场景，能够自动为模型输出“打分”。更独特的是，UpTrain 不仅指出问题所在，还能对失败案例进行根本原因分析，并提供具体的修复建议，让优化过程有迹可循。无论是正在构建智能客服的工程师，还是研究大模型性能的科研人员，都能通过 UpTrain 高效地监控模型状态，提升应用可靠性。如果你希望让 generative AI 项目从“能用”变得“好用”，UpTrain 将是一个得力的开源助手。","\u003Ch4 align=\"center\">\n  \u003Ca href=\"https:\u002F\u002Fuptrain.ai\">\n   \u003Cimg alt=\"Logo of UpTrain - an open-source platform to evaluate and improve LLM applications\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuptrain-ai_uptrain_readme_e5c53deebd76.png\"\u002F>\n  \u003C\u002Fa>\n\u003C\u002Fh4>\n\n\u003Cp align=\"center\">\n    \u003Ca href=\"https:\u002F\u002Fdemo.uptrain.ai\u002Fevals_demo\u002F\">\n        \u003Cimg alt=\"Try out Evaluation\" 
src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FTry%20Evaluations-uptrain?logo=data%3Aimage%2Fpng%3Bbase64%2CiVBORw0KGgoAAAANSUhEUgAAAGQAAABkCAYAAABw4pVUAAAACXBIWXMAAAsTAAALEwEAmpwYAAAEl0lEQVR4nO2dS8gWVQCGPzMvZVmWXRZ2VwuxolpIq8BMzDAqyChLQnBjpC76iVbZQvorSgqjsILoBgWFQQbhohICVwVJEbgwMu1C%2FaaRpWZPHP4jfQ1zZuab%2Bc6ZMzPvA7M792eu5za9nhBCCCGEEEIIIYQQHQOYBFwDrAYe7six2tZ5Ui8WgMuBV4BDdJdDwMvA3DpFTAGeAI7V3RoRYdpiFJgcWsaZwKd11z5iPgbOCCVjKrCj7ho3gE%2FMXSSEkGfrrmmDeNq3jKuB4wXuo7uAPQUK%2FAPwOfAX7eQ4cKVPIW%2FnFOALYHZf%2BMXAmEPaqr5wM4FttJO3fMmYnnMmH%2BuX0RdvVUrYpxzp%2F0j7%2BNPUzYeQhTkZ73LEM2d%2FkhtKXoFNZaEPISM5me5xxLskJewtjrDbaCcjPoRsKpDx4pR4j6eE25oSbi5wmHayyYeQzQUyHrPPjJn2yjAy%2FnaEfR2YZz6ggFsLvpU1lc11CRHpSEhkSEhkSEhkSEhkSEiXhZhxkUU6WJQxRhRUyDtDz6yhmLaQkIiQkMiQkMiQkMiQkMiQkMiQkAAA5wArgPuAs3PC6rXXJ8BNwC993xYHgOszwkuIL4AldoJCkq8z4khIYBknmNFYIcAE4KpAfV3XAhNzynOxnUFzXkkZB1zLD6IXAlwKfElYdgPzHSfGi8A%2FNtyR5GyQAjIMj2TUN3ohn1EP3wAnJcpyvyPshgFkPG%2FENlKIY4JcSK4o2FiGN6vKaIKQ02pezHNBojxbKqSVKyN6ITbOq9TDBylluQ446ktGU4ScCjwH%2FEoYDtr1jqmrl4C7MybuVZLRCCExAtxZ8Fa6ZRAZNm0J8SRlYBk2XQnxIKWUDJumhFQBWAp8b0X8YTcDmFAhPQmpiv2CnzWM3RgkJDIkJDIkJADAMtu1Yp4Py3LC6hniE7tvSZIHMsJLiC%2BAJzOW7aWOuUhIeBknOMsRT1fIINge6DXAM8A9wMklZOx2fas0QghwEfBgoB3e1gNzMraU%2BipR9u3AtAFkmN7iGzPqGrcQu9eJ%2BQIOyRFgeUpZNmTsbTWtoIzbctooXiFmCBX4lnoYA05JlOe9jPB7q8pogpDzqZf5ifJsLJlOIRlNEDLZDhjVwdHkmxBwLvBdiXRuH6CN4hVi4zxEPYw6ymP2U9nnQ0YjhNh4dwDv2jca38dWYGXOVJ0iUgaW0RghMQLM6RsHGYoMm66ElMVxpZSWYdOUkCqY%2Bb3AS3a66%2FtmbnDF9CQkJiQkMiQkMiSk40LMnxDS%2BHDomTUU0xaDfKhWzWxdRsdc6blMbYHx6USuD861PjK8GTdLex2H8ckQLpb4%2Bk3F744MTafdrF5HYXyi3d6Mv%2B%2F4%2BW0F8EbGWWAu1%2BVpw6It%2F%2B%2FWXcD%2BjHZ5zXcfUN6il9%2BAnYE6E7fXeOwsMLxgRi8v8yYkY66SSGejVxlWyESzZMxRAPEfH%2BWtmR%2BmlOkt%2FpPBMDAn7OlBZCSulMcKLCfuEoeBR5Nr5EOLuRB4AfiZ7vKTXSD6v6XYtWKvmAXAvfbnL6MtP0ZsXRcEe1YIIYQQQgghhBBCCNGLh38BZfHpULZhNcAAAAAASUVORK5CYII%3D&labelColor=CC766E&color=6A6A6A\">\n    \u003C\u002Fa>\n    \u003Ca 
href=\"https:\u002F\u002Fdocs.uptrain.ai\u002Fgetting-started\u002Fquickstart\">\n        \u003Cimg alt=\"Read Docs\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FRead%20Docs-uptrain?logo=data%3Aimage%2Fpng%3Bbase64%2CiVBORw0KGgoAAAANSUhEUgAAAGQAAABkCAYAAABw4pVUAAAACXBIWXMAAAsTAAALEwEAmpwYAAAC%2FklEQVR4nO3dO2sUURiH8SPipVAhKngJWggKolZ%2BAEGQIFiaL2CICNoLBmJAsBAbKxEREcTAYq%2BFYKGx8orxFkmhEltNY6P4yIEjs5FlJ4k7s%2B975v%2BD6ZJhD0%2FenZnsMhOCiIiIiIhIQwFbgFHgCjAJtPq83QF2hqYBNgNXgZ%2FYMwvsCE0B7E2LtuwjMBga8hb1CR9mso8C3MWXGWB7yBFwAPiNP2%2FjZIfcABfw6z2wLeQEuIdv74CtIRfA85IFv%2BnDdceTJUZ5FU%2FZQw6A6ZLFTvThNQ2zdC%2BBTcG7jIJEL9xHySzI3ygbg1cZBiEdF31GyTRI9AwYCN5kHIR0trYheJJ5kGgKWB%2B8aECQ6HHfogBHgJvpX%2BlPF%2FHzTQgSnat7EYPAAxaaXsTvNSXIRJ0L2APMdXgRClJ3kHgWAXygMwXpQ5Bu%2Fz73GmQAOLjEbbbv6wBWA99zC7IcJtaRzqi6UZCag5ymOwWpOcg43SlIzUEm6E5BCgrSxIO6JiRRkIXfBz5b8TYSSihIAuyjej6OhRbeslAQBelEE5JoQtroLaugCUk0IW00IQVNSALsip%2FdV7y1QgkFMUZBjFEQYxTEGAVJdNrbRqe9BU1IoglpowkpaEISTUgbTUhBE5JoQtpoQgqakIXfvh%2BueBsKJRTEGAUxRkGMURBjFMQYBTFGQYzJKcjrkn20arjO6MX2uWQd416CPKQZTnkJco1mOOwlyDHy9w1Y5SXIykUcR7w7X3mMXgVJ%2Bzlk9IkIvboD9jpXQdK%2BRoBf5OULsLvaChUFSfsbcvSUhDL3a78Nea%2BDpH2uBU6mW5DPOZqaH%2Bn5IjdqOaOqK4j8BwUxRkGMURBjFMQYBTFGQYxREGPi3ZpLrkPmDTw2tbWM7UzwCDhBni4Fj4D95Ol48Co9RS0n864eL%2FGv%2BNdEXsaCd8B18jAFrAnexUWkB9F79iibh0NGwApgFPiKv2PGWLyPfchRmpajwEXgloFrilaH7TZwOX3BrZ7Pu0VERERERIJ9fwBA%2FZMtFhwT0AAAAABJRU5ErkJggg%3D%3D&labelColor=976DA5&color=6A6A6A\">\n    \u003C\u002Fa>\n    \u003C\u002Fa>\n    \u003Ca href=\"https:\u002F\u002Fjoin.slack.com\u002Ft\u002Fuptraincommunity\u002Fshared_invite\u002Fzt-1yih3aojn-CEoR_gAh6PDSknhFmuaJeg\">\n        \u003Cimg alt=\"Slack Community\" 
src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FSlack%20Community-uptrain?logo=data%3Aimage%2Fpng%3Bbase64%2CiVBORw0KGgoAAAANSUhEUgAAAGQAAABkCAYAAABw4pVUAAAACXBIWXMAAAsTAAALEwEAmpwYAAADFklEQVR4nO2cPW4TURSFLaWBhg6xACs9YhEsISiLiIzoUkGNRIV3kDSxhJVVBArKrCBUSCnCTyTc8KGRphomjsfv3vGBdz7JpY%2Bu5iQz9v3kN5kYY4wxxhhjjDHG%2FAXwGDgGPgJfgRXwBfgAHAJ7u8yrCuAI%2BM56LoFnu8irCuA9m3MLPB8zryrav%2BSh3AD7Y%2BRVRXuP%2F8Z2nGfnVUf7wC1hmplXHcBF4QWcZeZVB3BdeAHnmXnVAfwqvIBnnbzfkXnVAVwVXsBFJ4%2FIvOoAli5EiHZ94f8QFZpdUru%2B8C1LBeAp8NPPECGaXVK7vhiKH%2BqJpew36wtciBbAFHjZbmwX97xedd5bSt0fe6PBhWiBC9GC2gvpcdbRrIY48OxCpB39hs46mst1DjyzEGlHP9BZR3N7lwPPKkTa0W%2FprKO56XPgGYVIO%2FpCZx3NeXYh8o4%2BwFlHM00uRNvRBzjraGbJhWg7%2BgBnHc08uRBtRx%2FgwKM56czXfC8o4TTY0Z%2BoO%2FBo3gbP180LnU%2FRgUdzEDxfNy90PkUHHskP4FHgfH15ofMpOvBIXgfP15cXOp%2BiA4%2FiM%2FAwcL7evIz51Bx4BJ%2BAJ4Hz3ZmXNZ%2BSAy%2B9J78BHgTNd29e5nwqDnzo6xR4B7zY9gHZmW9Q3hjzGWNMrWzh1Ed11og5f0WnPpqzRtD5Kzr1UZw1os5f0amnO2uEnb%2BiU0911og7%2F4xCjpWdNeLOP6OQC2VnjbjzzyjkWtlZI%2B78FZ16tLM%2BC54vGnmnHu2sF8HzRSPv1A%2BSC1mihbRTj3bWfYVIO%2F%2BMQpScdV8h0s4%2FqxQFZ73um7%2B0888qZZfOepPdmLTzT2EHznro9lja%2BadR6KzTf6SJuPOXYoxCzABciBgu5D8vhH%2FJgSuyg0K6%2BOx3sUIafPa7WCENPvtdrJAGn%2F1OOYvgW2D1Z7%2BrFTKb1IxgIfNJzST8rnwl7cDVSXD0V5F51ZHg6JeRedWR4OgPI%2FOqI8HR70XmVUmko8%2FIq5IoR5%2BVVyURjj4zr1oocPRj5BljjDHGGGOMMcYYY8xElT9jEHRBpJyhuQAAAABJRU5ErkJggg%3D%3D&labelColor=6565d8&color=6A6A6A\">\n    \u003C\u002Fa>\n    \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fissues\u002Fnew?assignees=&labels=enhancement&template=feature_request.md&title=\">\n        \u003Cimg alt=\"Request New Feature\" 
src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FRequest%20New%20Feature-uptrain?logo=data%3Aimage%2Fpng%3Bbase64%2CiVBORw0KGgoAAAANSUhEUgAAAGQAAABkCAYAAABw4pVUAAAACXBIWXMAAAsTAAALEwEAmpwYAAAG%2BElEQVR4nO2da6gWRRjHN1NLKy1Tu2B5CjEsMiO0wEiqoxhmZQUJ5YfyQ5FhdLEiIiItLQtRIswiMIvKNDOTCosyCzsVGWiXLykVdrESQz1aXn7xcObAsmdmdvc9715md37wftAzszuz%2F3eemX3mmecNAo%2FH4%2FF4PKUF6AmMBxYDG4E1wLii21VXEZYAf9KV%2F4BLim5n3UWIsqboNlcO0osQ5sui218J6J4IYRYX3RdnAY4ELgYWAr%2FRHFqL7pdTkI0IncjI6ll0H%2BsuQpjn8ujIWGAacA9wf0GfmcCwBuaEVnlI3ZwTOtkNvAq8YCkzPishzlQ3%2FoviaAdWADcAxxQswrVAH3UP%2BXc%2B5kp16EngX%2BotwmvAdZ0ihO7VR%2F1dx5Jmi3EC8AHFsAd4ADi%2BbCJE7isjJXtzBfQCPqIYfgHOSTif5S5CpA35mCu1%2BiiCfcDIHEXYk1aE3M0VMAI4GNORA8BmYFuCTsuy8mtgf4KyszXt6VGECMD54hgE%2Bhr%2BLvVzMVfSWBubwktOYAKw0yDaLaFyA4G1luseAk6KtGU48B05jgRgMPBpqK58CS5N8ZyaZ67UMNxr6dgB3fpfHrym7HxNuX7A74Zrf6MpL%2BLnZo7kCwF8a3jIfYswV5NiOrnZUE%2B%2B%2FVG0mzPA64Zrr46UG0K%2Bc4KIscVy7XFFmKs7Yzq8zVDvDE3ZSYayaw3XXh8pd6IyY0mR0TsbGNRAv%2BPEEM7L1VypGz2RoOMTNPXmasq9pSk3XL3s6djZwHymQxYPS4HRTRTjk4Tmqrm%2BK%2BCZBB3eqeaMgWpkzLWsypYBZwP9gasSrMouirTnWLVH3ainoE353Y6yTOBxYsjfB%2BdurlIIkiXvWL7F4lT8qcHr%2FgHMA4ZGxJClu43vgVMSzoGZ%2BK6KFkS4LcaDIH6tDTTGAeUbm5xwZESX4X1zM1clEuQgMD1BW0cBz1vmpO7QRYwE5qq1qoJ08n6SuCZgADAL2EqGYuS6uiqpIJ38oJbjVvd7yMUiAWqHaYwuc0ZCc5VNIENJBenkH%2BXPGpGgH8PUJK5z6aQWQ13zekvdy5suRowg69W3rwyfy0wOP01%2FjgNmJPSHTY65lslc7cgskMEiyPLAYYAjlJirLOZshaV%2BvqurqgsSRoliWt0NLY25qpEgrZaHO8dQJ7%2BXwboJIlje0GU%2BODooenVVQ0FmWEbJtFKYK9cFoSPyZApwl0QlJlh9yTJaR1vhqyvXBaHDrHweafOimDqLLN%2F8MYWuriogyEOGdo%2B11DnLsgRemtZ3JRExWXTMVUHeNrT77ph64i8zbXINSmOu5IU1i465KshCQ7uviaknbngTc9Qefay5Uhtp87PomKuCnK7xW22Mm3SVQ3KrZe8kqbmaKs8ui445KYgAnKZGymrgwRShP%2FeSDp25WlkbQejYar1dPeRRGVx%2FQEwsWhJz1V4LQejYFZT98LC%2F6dYM7mM7dJPEXFEXQT7UtGVvUvd7SuEbNVcr6iTILkN7Rmdwrw3dMFe1EaTNkK5icAb3kmiWtOYqXMddQYALgPtkPpAgOku5KzXLT62LvAltkhCj7Q2aK3cFoeOo2qHIiakWS%2FkLVbiPnFaa2qx2GO71cIPmyk1BZEcOfdjpsqAEACdbwlZt5spZQaYYrr8lKAkqlrhL%2B6LOw4i5claQkYbrd4mULwqJ%2F1LmsdOsfhU9pKSOSrQ7L4hhf7pdzvMFJUO9wZuC5h
7RPCNnBemltlDfBJ6VMyOBQ6gV4r7KCOI6lj2S8gmiDnVKYpjHgKslQC2oGMByJwRRXtkfI%2FVerJooOCSI6XyiNQLENVwS5F1D3TuCCoFDgixIemrXZVwSZIjyS4WR%2FQw%2FhxS4yhJf0KPAS2KqgN5BxcCVEVIX8IKUCy9IyfCClIzSC6I2nU4NagJlFUQlnvkissRtevBB2SizIJ9pyq0MKg5lFEQ5Ek1h%2FJV793BBkH6GbG9y2qhHUGEooyCq7Bu5NKpklFmQ%2FipIeZdK0r8gbeJJFymtIKE6lXIelk2Qp9IKUjcw7%2FvMy%2BJmEubpBbEfgfvV8IxmBs0GuMIL0lDkpTAxaDZqotbFHNXeZAEtlt%2Br2h3NkdJMUXS%2FiVFbQYDewI2Ro3VRXsmyAedqItS1gqjELBI49rJpyKqFwjpHP22WnCjhA0TZRl8CT8cJYlgA3KwpV9Qv9eRF8xMGGIap5Fk0uU56G%2FLlbq%2BZIOty%2B%2BFIlcLoY3XjVZpD%2BjoOaxKA6bzCVeA923G8LHNQPS4nVDVJJXXn8TZprvEz1WK%2FyoPS%2FOw%2FaXKJaP5vYiQ5y9%2BaXzdo6UZS47KxQ%2BUN1ibKLFPSFzlJO92QK32MSmjs6mcWcJM6eFrciPB4PB6Px%2BPxeDwej8fj8XiCYvgfmdGLzXmrOA4AAAAASUVORK5CYII%3D&labelColor=5B92E9&color=6A6A6A\">\n    \u003C\u002Fa>\n    \u003Ca href=\"https:\u002F\u002Fdemo.uptrain.ai\u002Fbenchmark\">\n      \u003Cimg alt=\"Daily Monitoring\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FDaily%20Monitoring-uptrain?logo=data:image\u002Fpng;base64,iVBORw0KGgoAAAANSUhEUgAAAGQAAABkCAYAAABw4pVUAAAACXBIWXMAAAsTAAALEwEAmpwYAAAFUUlEQVR4nO2daahVVRTHt2WDls3RQDYXRUZBkRkRjdAraPjSAJUUKVSE0QuiQZ4FmRj2lMwwqGgitZTq5cvEIipKpGgymogwCrKs7IOkDf5i9ZYorzvsM+99zvrB\u002FXjWXfv+z9ln7zXs65xhGIZhGIZhGIZRCsD2wOXAIuA94P2MH7HxLHBBOSOoEcBewJsUx\u002FPATlWPMxqAQYpnXtXjjALgDMrhb+CwqscbPMAMymNy1eMNHuDxEgXpq3q8wQP0lSjIPVWPN3iAY4HNJkhAAHNMkIAAdgBml\u002FCk2JSVBOAo4HbgYWD+sM9KEyQQZP8ArDVBAgDYDfjUpqwAAEYCy+0dEgjAPHupBwJwq62yAgHo0WCgLXurBjgOWG\u002F7kAAA9gfW2MYwAIDRwKoUYkiq9pdG7NQZSq3eD3wCrANWAwuBOyRXDRyU0\u002FdsB7yQQoxvgf2AH2svCHCO50BFqBXALGAicCKwY8Lvmp5CjN+BcXp9fQXRu3VqxlXOn8DHwFNAL3AusG+b7xMRkyK+9Wxj42vP625zMQHsAyyjOH4AXtW07RXAZcCmFHZuHub3y57XxVMSBEwAviN85rbw\u002FSKP62T1NtLFADBFp5nQWdbuRwWe6HDdRqlucZFEUqWILAZWi79d3n13tdhUfgSMd6EDnAB8RRys9a2pAnbRWq+Lt6zCgge4BthAHPwBnObqCLCzpkNjYTNwlasjwNG6N8jCr\u002Fopi2mujuh8+lvGH2cVcKjaO1A3e1N08\u002FdZAeFyaUsY4eqEhDC0vCbrtDFLSnU8XqbjpX5Wq0be0fBGGt6V6dXVCWCsRkKzsB64NIMPI4DDxYZMP8AS4Jsu9VeDnZa3UQKcD\u002FycUYwPgSML
3P+cDtykT5NMT\u002F3Aea5O6B0phWb\u002FZBRD3gujqx5P1Ghg8LUc1vzXVz2W6AGO0GRNFr4AjneRA+ypwcZrgbOS5mby6l6VeE0WJAs4xsVfVDddA4nb8n2WhUkaRy7JIMSm4bmFWAGe6TBOWdVdGXpvxZoooqAeABd6jFciDHu4ogEeTSGGrPX3djWBoWWzDxNDK7mU8MbdkjtwNQL\u002FGN2MMpw5RJer3ZDKjLNdDQG+9BRkdlkO3dLFkbckIOhqCqEJok5NalHBt0HTmnEk9+skiDq2q1YUTtaA3u6uARCqIE0FEyQsTJDAaIwgDFWWS\u002FX7B5pYegO4ofSAXRcaIYiWlq5rM7CVIe30ay+I5lZ+6jK4V1wgNEEQ3yOTTnEB0ARB3vYc4J0uAJogyOcxDZAGCCLpXB\u002FmuAAwQbZigtTpCWGobUCqIhcAD0pRQko7NmVlEUSDoFLF2IqBpEFREyS7IC\u002FSmUETpKQpCzjT03ZP9FOWtglcJ\u002FsDzYmMDVCQfk\u002Fb86MVRBse723R8y2dtQ8kzRgWLMgiT9uDMQvSn9fdVoIgSzxtL49SEKnH9ax2PzWBTRMkLcB9ed\u002FNJkgGgOc8BVmawKY9IWkBFnsKsiKBTRMkLSaIC+6lbk8IJsgWbNlrT8j\u002FsSdkK\u002FaEtLg7JJeQdyhCTiH1YWaB\u002Fg4U4G8p\u002FSEzPZ15JIFN3zMMbyzwb\u002FIeSmDzpWD+Nk9CIp7OeJ+SAFztYe8v4IAU\u002Fp7s6a\u002F3kXzS0Olhb2O7k1FzR09e6MRAkpN1tNX69S42p2bw97EuthekiHZ3OzSh15UFMAp4uo0ji9P0oOup1kvb9ClOy9KnKH8ULFNom0No5DDLUSlsjtF++1ZPRnliDHPqJA02PqlF0hNymhL75HhWPRT54Hy8\u002Fc\u002F2MXps+Vzt9BqXU\u002FS7V3+HSaVNU4ZhGIZhGIZhGIZhGIbhquVfiiAaCf\u002FKq\u002F8AAAAASUVORK5CYII=&labelColor=35B9D2&color=6A6A6A\">\n    \u003C\u002Fa>\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Freleases\">\n    \u003Cimg alt=\"GitHub Release\" src=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fv\u002Frelease\u002Fuptrain-ai\u002Fuptrain?labelColor=6A6A6A&color=CC766E\">\n  \u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fgraphs\u002Fcommit-activity\">\n    \u003Cimg alt=\"GitHub commit activity\" src=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fcommit-activity\u002Fm\u002Fuptrain-ai\u002Fuptrain?labelColor=6A6A6A&color=976DA5\">\n  \u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fblob\u002Fmain\u002FLICENSE\">\n    \u003Cimg alt=\"GitHub License\" 
src=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Flicense\u002Fuptrain-ai\u002Fuptrain?labelColor=6A6A6A&color=6565d8\">\n  \u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Fpypistats.org\u002Fpackages\u002Fuptrain\">\n    \u003Cimg alt=\"PyPI - Downloads\" src=\"https:\u002F\u002Fimg.shields.io\u002Fpypi\u002Fdm\u002Fuptrain?labelColor=6A6A6A&color=3E93C4\">\n  \u003C\u002Fa>\n\u003C\u002Fp>\n\n\u003Ch4 align=\"center\">\n  \u003Cvideo src=\"https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fassets\u002F43818888\u002F68a3b169-2217-446b-93b2-96b48ec7201d\" alt=\"Demo of UpTrain's LLM evaluations with scores for hallucinations, retrieved-context quality, response tonality for a customer support chatbot\" autoplay>\n\u003C\u002Fh4>\n\n\n**[UpTrain](https:\u002F\u002Fuptrain.ai)** is an open-source unified platform to evaluate and improve Generative AI applications. We provide grades for [20+ preconfigured evaluations](https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain?tab=readme-ov-file#pre-built-evaluations-we-offer-) (covering language, code, embedding use cases), perform [root cause analysis](https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fblob\u002Fmain\u002Fexamples\u002Froot_cause_analysis\u002Frag_with_citation.ipynb) on failure cases and give insights on how to resolve them.    \n\n\u003Cbr \u002F>\n\n# Key Features 🔑\n\n\u003Cimg width=\"1088\" alt=\"Interactive Dashboards\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuptrain-ai_uptrain_readme_5cdf36d1c0a2.png\">\n\nUpTrain Dashboard is a web-based interface that runs on your **local machine**. 
You can use the dashboard to evaluate your LLM applications, view the results, and perform a root cause analysis.\n\n\u003Cimg width=\"1088\" alt=\"20+ Pre-configured Evaluations\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuptrain-ai_uptrain_readme_238aa6ac809c.png\">\n\nSupport for **20+ pre-configured evaluations** such as Response Completeness, Factual Accuracy, Context Conciseness, etc.\n\n\u003Cimg width=\"1088\" alt=\"Data Security\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuptrain-ai_uptrain_readme_8c51a5ae7628.png\">\n\nAll the evaluations and analysis run locally on your system, ensuring that the data never leaves your secure environment (except for LLM calls while using model grading checks).\n\n\u003Cimg width=\"1088\" alt=\"Experimentation\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuptrain-ai_uptrain_readme_1fa86b4cdc29.png\">\n\n**Experiment with different embedding models** like text-embedding-3-large\u002Fsmall, text-embedding-ada-002, baai\u002Fbge-large, etc. UpTrain supports HuggingFace models, Replicate endpoints, or custom models hosted on your endpoint.\n\n\u003Cimg width=\"1088\" alt=\"Root Cause Analysis\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuptrain-ai_uptrain_readme_d01009fcceb8.png\">\n\nYou can **perform root cause analysis** on cases with either negative user feedback or low evaluation scores to understand which part of your LLM pipeline is giving suboptimal results. 
Check out the supported RCA templates.\n\n\u003Cimg width=\"1088\" alt=\"Select from a Variety of Evaluation LLMs\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuptrain-ai_uptrain_readme_b859ab51bfa6.png\">\n\nYou can use OpenAI, Anthropic, Mistral, or Azure OpenAI endpoints, or open-source LLMs hosted on Anyscale, as evaluators.\n\n\u003Cimg width=\"1088\" alt=\"Customize Evaluations\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuptrain-ai_uptrain_readme_1a82aea699cc.png\">\n\nUpTrain provides tons of ways to **customize evaluations**. You can customize the evaluation method (chain of thought vs classify), few-shot examples, and scenario description. You can also create custom evaluators.\n\n### Coming Soon:\n\n1. Collaborate with your team\n2. Embedding visualization via UMAP and Clustering\n3. Pattern recognition among failure cases\n4. Prompt improvement suggestions\n\n\u003Cbr \u002F>\n\n\n# Getting Started 🙌\n\n## Method 1: Using the Locally Hosted Dashboard\n\nThe UpTrain dashboard is a web-based interface that allows you to evaluate your LLM applications. It is a self-hosted dashboard that runs on your local machine.\nYou don't need to write any code to use the dashboard. You can use the dashboard to evaluate your LLM applications, view the results, and perform a root cause analysis.\n\nBefore you start, ensure you have docker installed on your machine. If not, you can install it from [here](https:\u002F\u002Fdocs.docker.com\u002Fget-docker\u002F).\n\nThe following commands will download the UpTrain dashboard and start it on your local machine.\n\n```bash\n# Clone the repository\ngit clone https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\ncd uptrain\n\n# Run UpTrain\nbash run_uptrain.sh\n```\n> **NOTE:**  UpTrain Dashboard is currently in **Beta version**. 
We would love your feedback to improve it.\n\n## Method 2: Using the UpTrain package\n\nIf you are a developer and want to integrate UpTrain evaluations into your application, you can use the UpTrain package. This allows for a more programmatic way to evaluate your LLM applications.\n\n### Install the package through pip:\n```bash\npip install uptrain\n```\n\n### How to use UpTrain:\n\nYou can evaluate your responses via the open-source version by providing your OpenAI API key to run evaluations.\n\n```python\nfrom uptrain import EvalLLM, Evals\nimport json\n\nOPENAI_API_KEY = \"sk-***************\"\n\ndata = [{\n    'question': 'Which is the most popular global sport?',\n    'context': \"The popularity of sports can be measured in various ways, including TV viewership, social media presence, number of participants, and economic impact. Football is undoubtedly the world's most popular sport with major events like the FIFA World Cup and sports personalities like Ronaldo and Messi, drawing a followership of more than 4 billion people. Cricket is particularly popular in countries like India, Pakistan, Australia, and England. The ICC Cricket World Cup and Indian Premier League (IPL) have substantial viewership. The NBA has made basketball popular worldwide, especially in countries like the USA, Canada, China, and the Philippines. Major tennis tournaments like Wimbledon, the US Open, French Open, and Australian Open have large global audiences. Players like Roger Federer, Serena Williams, and Rafael Nadal have boosted the sport's popularity. Field Hockey is very popular in countries like India, Netherlands, and Australia. 
It has a considerable following in many parts of the world.\",\n    'response': 'Football is the most popular sport with around 4 billion followers worldwide'\n}]\n\neval_llm = EvalLLM(openai_api_key=OPENAI_API_KEY)\n\nresults = eval_llm.evaluate(\n    data=data,\n    checks=[Evals.CONTEXT_RELEVANCE, Evals.FACTUAL_ACCURACY, Evals.RESPONSE_COMPLETENESS]\n)\n\nprint(json.dumps(results, indent=3))\n```\nIf you have any questions, please join our [Slack community](https:\u002F\u002Fjoin.slack.com\u002Ft\u002Fuptraincommunity\u002Fshared_invite\u002Fzt-1yih3aojn-CEoR_gAh6PDSknhFmuaJeg)\n\nSpeak directly with the maintainers of UpTrain by [booking a call here](https:\u002F\u002Fcalendly.com\u002Fuptrain-sourabh\u002F30min).\n\n\u003Cbr \u002F>\n\n\n# Pre-built Evaluations We Offer 📝\n\u003Cimg width=\"1088\" alt=\"quality of your responses\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuptrain-ai_uptrain_readme_a4a58e957b6c.png\">\n\n| Eval | Description |\n| ---- | ----------- |\n|[Response Completeness](https:\u002F\u002Fdocs.uptrain.ai\u002Fpredefined-evaluations\u002Fresponse-quality\u002Fresponse-completeness) | Grades whether the response has answered all the aspects of the question specified. |\n|[Response Conciseness](https:\u002F\u002Fdocs.uptrain.ai\u002Fpredefined-evaluations\u002Fresponse-quality\u002Fresponse-conciseness) | Grades how concise the generated response is or if it has any additional irrelevant information for the question asked. |\n|[Response Relevance](https:\u002F\u002Fdocs.uptrain.ai\u002Fpredefined-evaluations\u002Fresponse-quality\u002Fresponse-relevance)| Grades how relevant the generated context was to the question specified.|\n|[Response Validity](https:\u002F\u002Fdocs.uptrain.ai\u002Fpredefined-evaluations\u002Fresponse-quality\u002Fresponse-validity)| Grades if the response generated is valid or not. 
A response is considered to be valid if it contains any information.|\n|[Response Consistency](https:\u002F\u002Fdocs.uptrain.ai\u002Fpredefined-evaluations\u002Fresponse-quality\u002Fresponse-consistency)| Grades how consistent the response is with the question asked as well as with the context provided.|\n\n\u003Cimg width=\"1088\" alt=\"quality of retrieved context and response groundedness\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuptrain-ai_uptrain_readme_bec5d0416c39.png\">\n\n\n| Eval | Description |\n| ---- | ----------- |\n|[Context Relevance](https:\u002F\u002Fdocs.uptrain.ai\u002Fpredefined-evaluations\u002Fcontext-awareness\u002Fcontext-relevance) | Grades how relevant the context was to the question specified. |\n|[Context Utilization](https:\u002F\u002Fdocs.uptrain.ai\u002Fpredefined-evaluations\u002Fcontext-awareness\u002Fcontext-utilization) | Grades how complete the generated response was for the question specified, given the information provided in the context. 
|\n|[Factual Accuracy](https:\u002F\u002Fdocs.uptrain.ai\u002Fpredefined-evaluations\u002Fcontext-awareness\u002Ffactual-accuracy)| Grades whether the response generated is factually correct and grounded by the provided context.|\n|[Context Conciseness](https:\u002F\u002Fdocs.uptrain.ai\u002Fpredefined-evaluations\u002Fcontext-awareness\u002Fcontext-conciseness)| Evaluates the concise context cited from an original context for irrelevant information.\n|[Context Reranking](https:\u002F\u002Fdocs.uptrain.ai\u002Fpredefined-evaluations\u002Fcontext-awareness\u002Fcontext-reranking)| Evaluates how efficient the reranked context is compared to the original context.|\n\n\u003Cimg width=\"1088\" alt=\"language quality of the response\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuptrain-ai_uptrain_readme_8e2bacdd068c.png\">\n\n| Eval | Description |\n| ---- | ----------- |\n|[Language Features](https:\u002F\u002Fdocs.uptrain.ai\u002Fpredefined-evaluations\u002Flanguage-quality\u002Ffluency-and-coherence) | Grades the quality and effectiveness of language in a response, focusing on factors such as clarity, coherence, conciseness, and overall communication. |\n|[Tonality](https:\u002F\u002Fdocs.uptrain.ai\u002Fpredefined-evaluations\u002Fcode-evals\u002Fcode-hallucination) | Grades whether the generated response matches the required persona's tone  |\n\n\u003Cimg width=\"1088\" alt=\"language quality of the response\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuptrain-ai_uptrain_readme_02f492ac61f4.png\">\n\n| Eval | Description |\n| ---- | ----------- |\n|[Code Hallucination](https:\u002F\u002Fdocs.uptrain.ai\u002Fpredefined-evaluations\u002Fcode-evals\u002Fcode-hallucination) | Grades whether the code present in the generated response is grounded by the context. 
|\n\n\u003Cimg width=\"1088\" alt=\"conversation as a whole\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuptrain-ai_uptrain_readme_81617f912cad.png\">\n\n| Eval | Description |\n| ---- | ----------- |\n|[User Satisfaction](https:\u002F\u002Fdocs.uptrain.ai\u002Fpredefined-evaluations\u002Fconversation-evals\u002Fuser-satisfaction) | Grades how well the user's concerns are addressed and assesses their satisfaction based on provided conversation. |\n\n\u003Cimg width=\"1088\" alt=\"custom evaluations and others\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuptrain-ai_uptrain_readme_f16beaa37e92.png\">\n\n| Eval | Description |\n| ---- | ----------- |\n|[Custom Guideline](https:\u002F\u002Fdocs.uptrain.ai\u002Fpredefined-evaluations\u002Fcustom-evals\u002Fcustom-guideline) | Allows you to specify a guideline and grades how well the LLM adheres to the provided guideline when giving a response. |\n|[Custom Prompts](https:\u002F\u002Fdocs.uptrain.ai\u002Fpredefined-evaluations\u002Fcustom-evals\u002Fcustom-prompt-eval) | Allows you to create your own set of evaluations. |\n\n\u003Cimg width=\"1088\" alt=\"compare responses with ground truth\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuptrain-ai_uptrain_readme_1dfcac623eff.png\">\n\n| Eval | Description |\n| ---- | ----------- |\n|[Response Matching](https:\u002F\u002Fdocs.uptrain.ai\u002Fpredefined-evaluations\u002Fground-truth-comparison\u002Fresponse-matching) | Compares and grades how well the response generated by the LLM aligns with the provided ground truth. 
|\n\n\u003Cimg width=\"1088\" alt=\"safeguard system prompts and avoid LLM mis-use\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuptrain-ai_uptrain_readme_02ba8d93b638.png\">\n\n| Eval | Description |\n| ---- | ----------- |\n|[Prompt Injection](https:\u002F\u002Fdocs.uptrain.ai\u002Fpredefined-evaluations\u002Fsafeguarding\u002Fprompt-injection) | Grades whether the user's prompt is an attempt to make the LLM reveal its system prompts. |\n|[Jailbreak Detection](https:\u002F\u002Fdocs.uptrain.ai\u002Fpredefined-evaluations\u002Fsafeguarding\u002Fjailbreak) | Grades whether the user's prompt is an attempt to jailbreak (i.e. generate illegal or harmful responses). |\n\n\u003Cimg width=\"1088\" alt=\"evaluate the clarity of user queries\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuptrain-ai_uptrain_readme_602f818ab511.png\">\n\n| Eval | Description |\n| ---- | ----------- |\n|[Sub-Query Completeness](https:\u002F\u002Fdocs.uptrain.ai\u002Fpredefined-evaluations\u002Fquery-quality\u002Fsub-query-completeness) | Evaluate whether all of the sub-questions generated from a user's query, taken together, cover all aspects of the user's query or not |\n| [Multi-Query Accuracy](https:\u002F\u002Fdocs.uptrain.ai\u002Fpredefined-evaluations\u002Fquery-quality\u002Fmulti-query-accuracy) | Evaluate whether the variants generated accurately represent the original query |\n\n\n\u003Cbr \u002F>\n\n# Integrations 🤝\n\n| Eval Frameworks  | LLM Providers | LLM Packages | Serving frameworks | LLM Observability | Vector DBs |\n| ------------- | ------------- | ------------- | ------------- | ------------- |  ------------- |\n| [OpenAI Evals](https:\u002F\u002Fdocs.uptrain.ai\u002Ftutorials\u002Fopenai-evals) | [OpenAI](https:\u002F\u002Fdocs.uptrain.ai\u002Fllms\u002Fopenai) | [LlamaIndex](https:\u002F\u002Fdocs.uptrain.ai\u002Fintegrations\u002Fframework\u002Fllamaindex-methods\u002Foverview) | 
[Ollama](https:\u002F\u002Fdocs.uptrain.ai\u002Fllms\u002Follama) | [Langfuse](https:\u002F\u002Fdocs.uptrain.ai\u002Fintegrations\u002Fobservation-tools\u002Flangfuse) | [Qdrant](https:\u002F\u002Fdocs.uptrain.ai\u002Fintegrations\u002Fvector_db\u002Fqdrant) |\n| | [Azure](https:\u002F\u002Fdocs.uptrain.ai\u002Fllms\u002Fazure) | |  [Together AI](https:\u002F\u002Fdocs.uptrain.ai\u002Fllms\u002Ftogether_ai) | [Helicone](https:\u002F\u002Fdocs.uptrain.ai\u002Fintegrations\u002Fobservation-tools\u002Fhelicone) | [FAISS](https:\u002F\u002Fdocs.uptrain.ai\u002Fintegrations\u002Fvector_db\u002Ffaiss) |\n| | [Claude](https:\u002F\u002Fdocs.uptrain.ai\u002Fllms\u002Fclaude) | |  [Anyscale](https:\u002F\u002Fdocs.uptrain.ai\u002Fllms\u002Fanyscale) | [Zeno](https:\u002F\u002Fdocs.uptrain.ai\u002Fintegrations\u002Fobservation-tools\u002Fzeno) | [Chroma](https:\u002F\u002Fdocs.uptrain.ai\u002Fintegrations\u002Fvector_db\u002Fchroma) |\n| | [Mistral](https:\u002F\u002Fdocs.uptrain.ai\u002Fllms\u002Fmistral) | | Replicate  | | |\n| |  | |  HuggingFace  | | |\n\nMore integrations are coming soon. If you have a specific integration in mind, please let us know by [creating an issue](https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fissues).\n\n\u003Cbr \u002F>\n\n# Monitoring Prompt Drift in LLMs: Benchmark by UpTrain\n\nMost popular LLMs, such as GPT-4, GPT-3.5-turbo, and Claude-2.1, are closed-source, i.e., exposed via an API with very little visibility into what happens under the hood. There are many reported instances of prompt drift (or GPT-4 becoming lazy), as well as research exploring the degradation in model quality. 
This benchmark is an attempt to track the change in model behaviour by evaluating its responses on a fixed dataset.\n\n\u003Cimg width=\"1316\" alt=\"image\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuptrain-ai_uptrain_readme_7f8dd13c7d55.png\">\n\nYou can find the benchmark [here](https:\u002F\u002Fdemo.uptrain.ai\u002Fbenchmark).\n\n\n# Resources 💡\n\n1. [How to evaluate your LLM application](https:\u002F\u002Fblog.uptrain.ai\u002Fhow-to-evaluate-your-llm-applications)\n1. [How to detect jailbreaks](https:\u002F\u002Fblog.uptrain.ai\u002Fllm-jailbreak\u002F)\n1. [Dealing with hallucinations](https:\u002F\u002Fblog.uptrain.ai\u002Fdealing-with-hallucinations-in-llms-a-deep-dive\u002F)\n\n\u003Cbr \u002F>\n\n# Why we are building UpTrain 🤔\n\nHaving worked with ML and NLP models for the last 8 years, we were continuously frustrated by the numerous hidden failures in our models, which led us to build UpTrain. UpTrain initially started as an ML observability tool with checks to identify regression in accuracy. \n\nHowever, we soon realized that LLM developers face an even bigger problem -- there is no good way to measure the accuracy of their LLM applications, let alone identify regression.\n\nWe also saw the release of [OpenAI evals](https:\u002F\u002Fgithub.com\u002Fopenai\u002Fevals), which proposed using LLMs to grade model responses. Furthermore, we gained the confidence to take this approach after reading [how Anthropic leverages RLAIF](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2212.08073.pdf), and dived right into LLM evaluations research (we are soon releasing a repository of awesome evaluations research). \n\nToday, UpTrain is our attempt to bring order to LLM chaos and contribute back to the community. 
While a majority of developers still rely on intuition and productionise prompt changes by reviewing a couple of cases, we have heard enough regression stories to believe \"evaluations and improvement\" will be a key part of the LLM ecosystem as the space matures.\n\n1. Robust evaluations allow you to systematically experiment with different configurations and prevent regressions by helping you objectively select the best choice.\n\n1. They help you understand where your systems are going wrong, find the root cause(s) and fix them - long before your end users complain and potentially churn.\n\n1. Evaluations like prompt injection and jailbreak detection are essential to maintain the safety and security of your LLM applications.\n\n1. Evaluations help you provide transparency and build trust with your end users - especially relevant if you are selling to enterprises.\n\n\u003Cbr \u002F>\n\n# Why open-source? \n\n1. We understand that there is **no one-size-fits-all solution** when it comes to evaluations. We increasingly see developers wanting to modify the evaluation prompt, the set of choices, the few-shot examples, and so on. We believe the best developer experience lies in open-source, instead of exposing 20 different parameters.\n\n1. **Foster innovation**: The field of LLM evaluations and using LLM-as-a-judge is still pretty nascent. We see a lot of exciting research happening almost on a daily basis, and being open-source gives us and our community the right platform to implement those techniques and innovate faster.\n\n\u003Cbr \u002F>\n\n## How You Can Help 🙏\n\nWe are continuously striving to enhance UpTrain, and there are several ways you can contribute:\n\n1. **Notice any issues or areas for improvement:** If you spot anything wrong or have ideas for enhancements, please [create an issue](https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fissues) on our GitHub repository.\n\n1. 
**Contribute directly:** If you see an issue you can fix or have code improvements to suggest, feel free to contribute directly to the [repository](https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fblob\u002Fmain\u002FCONTRIBUTING.md).\n\n1. **Request custom evaluations:** If your application requires a tailored evaluation, [let us know](https:\u002F\u002Fjoin.slack.com\u002Ft\u002Fuptraincommunity\u002Fshared_invite\u002Fzt-1yih3aojn-CEoR_gAh6PDSknhFmuaJeg), and we'll add it to the repository.\n\n1. **Integrate with your tools:** Need integration with your existing tools? [Reach out](https:\u002F\u002Fjoin.slack.com\u002Ft\u002Fuptraincommunity\u002Fshared_invite\u002Fzt-1yih3aojn-CEoR_gAh6PDSknhFmuaJeg), and we'll work on it.\n\n1. **Assistance with evaluations:** If you need assistance with evaluations, post your query on our [Slack channel](https:\u002F\u002Fjoin.slack.com\u002Ft\u002Fuptraincommunity\u002Fshared_invite\u002Fzt-1yih3aojn-CEoR_gAh6PDSknhFmuaJeg), and we'll resolve it promptly.\n\n1. **Show your support:** Star us ⭐ on GitHub to track our progress.\n\n1. **Spread the word:** If you like what we've built, give us a shoutout on Twitter!\n\nYour contributions and support are greatly appreciated! Thank you for being a part of UpTrain's journey.\n\n\u003Cbr \u002F>\n\n# License 💻\n\nThis repo is published under the Apache 2.0 license, and we are committed to adding more functionality to the UpTrain open-source repo. We also have a managed version if you want a more hands-off experience. Please book a [demo call here](https:\u002F\u002Fcalendly.com\u002Fuptrain-sourabh\u002F30min).\n\n\u003Cbr \u002F>\n\n# Provide feedback (the harsher the better 😉) \n\nWe are building UpTrain in public. 
Help us improve by giving your feedback **[here](https:\u002F\u002Fdocs.google.com\u002Fforms\u002Fd\u002Fe\u002F1FAIpQLSezGUkkC0JoEvx-0gCrRSmGutA-jqyb7kl2lomXv302_C3MnQ\u002Fviewform?usp=sf_link)**.\n\n\u003Cbr \u002F>\n\n# Contributors 🖥️\n\nWe welcome contributions to UpTrain. Please see our [contribution guide](https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fblob\u002Fmain\u002FCONTRIBUTING.md) for details.\n\n\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fgraphs\u002Fcontributors\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuptrain-ai_uptrain_readme_596967de1f71.png\" \u002F>\n\u003C\u002Fa>\n","\u003Ch4 align=\"center\">\n  \u003Ca href=\"https:\u002F\u002Fuptrain.ai\">\n   \u003Cimg alt=\"UpTrain 的标志——一个用于评估和改进大语言模型应用的开源平台\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuptrain-ai_uptrain_readme_e5c53deebd76.png\"\u002F>\n  \u003C\u002Fa>\n\u003C\u002Fh4>\n\n\u003Cp align=\"center\">\n    \u003Ca href=\"https:\u002F\u002Fdemo.uptrain.ai\u002Fevals_demo\u002F\">\n        \u003Cimg alt=\"试用评估\" 
src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FTry%20Evaluations-uptrain?logo=data%3Aimage%2Fpng%3Bbase64%2CiVBORw0KGgoAAAANSUhEUgAAAGQAAABkCAYAAABw4pVUAAAACXBIWXMAAAsTAAALEwEAmpwYAAAEl0lEQVR4nO2dS8gWVwBFvzMvZVmWXRZ2VwuxolpIq8BMzDAqyChLQnBjpC76iVbZQvorSgqjsILoBgWFQQbhohICVwVJEbgwMu1C%2FaaRpWZPHP4jfQ1zZuab%2Bc6ZMzPvA7M792eu5za9nhBCCCGEEEIIIYQQHQOYBFwDrAYe7six2tZ5Ui8WgMuBV4BDdJdDwMvA3DpFTAGeAI7V3RoRYdpiFJgcWsaZwKd11z5iPgbOCCVjKrCj7ho3gE%2FMXSSEkGfrrmmDeNq3jKuB4wXuo72APQUK%2FAPAXzrFOAL4Hld4RfAZ8FcvsCLwJmHpLV9YWbCVttJO3fMmYnnMmH%2F1X0RdvVUrYpxzp%2F0j7%2BNPUzYeQhTkZ73LEM2d%2FkhtKXoFNZaEPISM5me6xxyJJQwtzZC7jdSSjPhCyqUDHiinjPP4knNLkVsKEhkSEhkSEhkSEhkSEhkSEiXhZhxkUU6WJQxRhRUyDtDz6yhmLaQkIiQkMiQkMiQkMiQkMiQkAAA5wArgPuAs3PC6rXXJ8BNwC%2F93xYHgOszwkuIL4Ald2FiSTfnxJHQQJLOMMZjRVQYIAJwVaA%2BrquBJhzSlPOxfYHzHSfGC8CP9jcEeSVkHHAlcC%2FAnYHmukpJOMA1wLfW3RCQkMiQkMiQkMiQkMiQkMiQkAAA5wArgPuAs3PC6rXXJ8BNwC%2F93xYHgOszwkuIL4Ald2FiSTfnxJHQQJLOMMZjRVQYIAJwVaA%2BrquBJhzWM%2FENlKIcAE4Kpiv1XdAMzNKeuFuwOm%2BpJkHsgITwks4wQzGigwAQgKtAdV1zTDHTXAM8A9wMklZOx2fas0QghwEfBgoB3e1gNzMraU%2BipR9u3AtAFkmN7iGzPqGrcQu9eJ%2BQIOyRFgeUpZNmTsbWNRAv%2BPEEM7L1VypGz2RoOMTNPXmasq9pSk3XL3s6djZwHymQxYPS4HRTRTjk4Tmqrm%2BK%2BCZBB3eqeaMgWpkzLWsypYBZwP9gasSrMouirTnWLVH3ainoE353Y6yTOBxYsjfB%2BdurlIIkiXvWL7F4lT8qcHr%2FgHMA4ZGxJClu43vgVMSzoGZ%2BK6KFkS4LcaDIH6tDTTGAeUbm5xwZESX4X1zM1clEuQgMD1BW0cBz1vmpO7QRYwE5qq1qoJ08n6SuCZgADAL2EqGYuS6uiqpIJ38oJbjVvd7yMUiAWqHaYwuc0ZCc5VNIENJBenkH%2BXPGpGgH8PUJK5z6aQWQ13zekvdy5suRowg69W3rwyfy0wOP01%2FjgNmJPSHTY65lslc7cgskMEiyPLAYYAjlJirLOZshaV%2BvqurqgsSRoliWt0NLY25qpEgrZaHO8dQJ7%2BXwboJIlje0GU%2BODooenVVQ0FmWEbJtFKYK9cFoSPyZApwl0QlJlh9yTJaR1vhqyvXBaHDrHweafOimDqLLN%2F8MYWuriogyEOGdo%2B11DnLsgRemtZ3JRExWXTMVUHeNrT77ph64i8zbXINSmOu5IU1i465KshCQ7uviaknbngTc9Qefay5Uhtp87PomKuCnK7xW22Mm3SVQ3KrZe8kqbmaKs8ui445KYgAnKZGymrgwRShP%2FeSDp25WlkbQejYar1dPeRRGVx%2FQEwsWhJz1V4LQejYFZT98LC%2F6dYM7mM7dJPEXFEXQT7UtGVvUvd7SuEbNVcr6iTILkN7Rmdwrw3dMFe1EaTNkK5icAb3kmiWtOYqXMddQYALgPtkPpAgOku5KzXLT62LvAltkhCj
7Q2aK3cFoeOo2qHIiakWS%2FkLVbiPnFaa2qx2GO71cIPmyk1BZEcOfdjpsqAEACdbwlZt5spZQaYYrr8lKAkqlrhL%2B6LOw4i5claQkYbrd4mULwqJ%2F1LmsdOsfhU9pKSOSrQ7L4hhf7pdzvMFJUO9wZuC5h7RPCNnBemltlDfBJ6VMyOBQ6gV4r7KCOI6lj2S8gmiDnVKYpjHgKslQC2oGMByJwRRXtkfI%2FVerJooOCSI6XyiNQLENVwS5F1D3TuCCoFDgixIemrXZVwSZIjyS4WR%2FQw%2FhxS4yhJf0KPAS2KqgN5BxcCVEVIX8IKUCy9IyfCClIzSC6I2nU4NagJlFUQlnvkissRtevBB2SizIJ9pyq0MKg5lFEQ5Ek1h%2FJV793BBkH6GbG9y2qhHUGEooyCq7Bu5NKpklFmQ%2FipIeZdK0r8gbeJJFymtIKE6lXIelk2Qp9IKUjcw7%2FvMy%2BJmEubpBbEfgfvV8IxmBs0GuMIL0lDkpTAxaDZqotbFHNXeZAEtlt%2Br2h3NkdJMUXS%2FiVFbQYDewI2Ro3VRXsmyAedqItS1gqjELBI49rJpyKqFwjpHP22WnCjhA0TZRl8CT8cJYlgA3KwpV9Qv9eRF8xMGGIap5Fk0uU56G%2FLlbq%2BZIOty%2B%2BFIlcLoY3XjVZpD%2BjoOaxKA6bzCVeA923G8LHNQPS4nVDVJJXXn8TZprvEz1WK%2FyoPS%2FOw%2FaXKJaP5vYiQ5y9%2BaXzdo6UZS47KxQ%2BUN1ibKLFPSFzlJO92QK32MSmjs6mcWcJM6eFrciPB4PB6Px%2FPxeDwej8fj8XiCYvgfmdGLzXmrOA4AAAAASUVORK5CYII%3D&labelColor=CC766E&color=6A6A6A\">\n    \u003C\u002Fa>\n    \u003Ca href=\"https:\u002F\u002Fdocs.uptrain.ai\u002Fgetting-started\u002Fquickstart\">\n        \u003Cimg alt=\"阅读文档\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FRead%20Docs-uptrain?logo=data%3Aimage%2Fpng%3Bbase64%2CiVBORw0KGgoAAAANSUhEUgAAAGQAAABkCAYAAABw4pVUAAAACXBIWXMAAAsTAAALEwEAmpwYAAAC%2FklEQVR4nO3dO2sUURiH8SPipVAhKngJWggKolZ%2BAEGQIFiaL2CICNoLBmJAsBAbKxEREcTAYq%2BFYKGx8orxFkmhEltNY6P4yIEjs5FlJ4k7s5v%2B975v%2BD6ZJhD0%2FenZnsMhOCiIiIiIhIQwFbgFHgCjAJtPq83QF2hqYBNgNXgZ%2FYMwvsCE0B7Ehb1YIIYQQQgghhBBCCNGLh38BZfHpULZhNcAAAAAASUVORK5CYII%3D&labelColor=976DA5&color=6A6A6A\">\n    \u003C\u002Fa>\n    \u003C\u002Fa>\n    \u003Ca href=\"https:\u002F\u002Fjoin.slack.com\u002Ft\u002Fuptraincommunity\u002Fshared_invite\u002Fzt-1yih3aojn-CEoR_gAh6PDSknhFmuaJeg\">\n        \u003Cimg alt=\"Slack社区\" 
src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FSlack%20Community-uptrain?logo=data%3Aimage%2Fpng%3Bbase64%2CiVBORw0KGgoAAAANSUhEUgAAAGQAAABkCAYAAABw4pVUAAAACXBIWXMAAAsTAAALEwEAmpwYAAADFklEQVR4nO2cPW4TURSFLaWBhg6xACs9YhEsISiLiIzoUkGNRIV3kDSxhJVVBArKrCBUSCnCTyTc8KGRphomjsfv3vGBdz7JpY%2Bu5iQz9v3kN5kYY4wxxhhjjDHG%2FAXwGDgGPgJfgRXwBfgAHAJ7u8yrCuAIlo1mOOwW5DPOZqaH%2Bn5IjdqOqK4j8BwUxRkGMURBjFMQYBTFGQYxREGPi3ZpLrkPmDTw2tbWM7UzwCDhBni4Fj4D95Ol48Co9RS0n864eL%2FGv%2BNdEXsaCd8RS0n8jryrav%2BSh3AD7zY9gHZmXHcBFdeAHnmXnVAfwqv%2B8C1NBeAp8NPPECG1XVK7vhiKH%2BqJpew36wtciBWiBC9GC2gvpcdbRrIY48OxCpB39hs46mtvXfnVUf7wC1hmplXHcBFdeAHnmXnVAfwqv%2B8C1NBeAp8NPPECG1XVK7vhiKH%2BqJpew36wtciBWiBC9GC2gvpcdbRrIY48OxCpB39hs46mtvXfnVUf7wC1hmplXHcBFdeAHnmXnVAfwqv%2B8C1NBeAp8NPPECG1XVK7vhiKH%2BqJpew36wtciBWiBC9GC2gvpcdbRrIY48OxCpB39hs46mtvXfnVUf7wC1hmplXHcBFdeAHnmXnVAfwqv%2B8C1NBeAp8NPPECG1XVK7vhiKH%2BqJpew36wtciBWiBC9GC2gvpcdbRrIY48OxCpB39hs46mtvXfnVUf7wC1hmplXHcBFdeAHnmXnVAfwqv%2B8C1NBeAp8NPPECG1XVK7vhiKH%2BqJpew36wtciBWiBC9GC2gvpcdbRrIY48OxCpB39hs46mtvXfnVUf7wC1hmplXHcBFdeAHnmXnVAfwqv%2B8C1NBeAp8NPPECG1XVK7vhiKH%2BqJpew36wtciBWiBC9GC2gvpcdbRrIY48OxCpB39hs46mtvXfnVUf7wC1hmplXHcBFdeAHnmXnVAfwqv%2B8C1NBeAp8NPPECG1XVK7vhiKH%2BqJpew36wtciBWiBC9GC2gvpcdbRrIY48OxCpB39hs46mtvXfnVUf7wC1hmplXHcBFdeAHnmXnVAfwqv%2B8C1NBeAp8NPPECG1XVK7vhiKH%2BqJpew36wtciBWiBC9GC2gvpcdbRrIY48OxCpB39hs4......\u003Cp align=\"center\">\n    \u003Ca href=\"https:\u002F\u002Fdemo.uptrain.ai\u002Fevals_demo\u002F\">\n        \u003Cimg alt=\"试用评估\" 
src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FTry%20Evaluations-uptrain?logo=data%3Aimage%2Fpng%3Bbase64%2CiVBORw0KGgoAAAANSUhEUgAAAGQAAABkCAYAAABw4pVUAAAACXBIWXMAAAsTAAALEwEAmpwYAAAEl0lEQVR4nO2dS8gWVwBFvzMvZVmWXRZ2VwuxolpIq8BMzDAqyChLQnBjpC76iVbZQvorSgqjsILoBgWFQQbhohICVwVJEbgwMu1C%2FaaRpWZPHP4jfQ1zZuab%2Bc6ZMzPvA7M792eu5za9nhBCCCGEEEIIIYQQHQOYBFwDrAYe7six2tZ5Ui8WgMuBV4BDdJdDwMvA3DpFTAGeAI7V3RoRYdpiFJgcWsaZwKd11z5iPgbOCCVjKrCj7ho3gE%2FMXSSEkGfrrmmDeNq3jKuB4wXuo72APQUK%2FAPAXzrFOAL4Hld4RfAZ8FdvcIXAmKfN1CRHpSEhkSEhkSEhkSEhkSEhkSEiXhZhxkUU6WJQxRhRUyDtDz6yhmLaQkIiQkMiQkMiQkMiQkMiQkAAA5wArgPuAs3PC6rXXJ8BNwC%2FnvmbvgOszwkuIL4AldgJCkq8z4khIYBknmNFYIcAE4KpAfV1XAkNznF9CN3ohn1EP3wAnJcpyvyPshgFkPG%2FENlKIY4JcSK4o2FiGN6vKaIKQ02pezHNBojxbKqSVKyN6ITbOq9TDBylluQ446ktGU4ScCjwH%2FEoYDtr1jqmrl4C7MyZuVZLRCCExAtxZ8Fa6ZRAZNm0J8SRlYBk2XQnxIKWUDJumhFQBWAp8b0X8YTcDmFAhPQmpiv2CnzWM3RgkJDIkJDIkJADAMtu1Yp4Py3LC6hniE7tvSZIHMsJLiC%2BAJzOW7aWOuUhIeBknOMsRT1fIINge6DXAM8A9wMklZOx2fas0QghwEfBgoB3e1gNzMraU%2BipR9u3AtAFkmN7iGzPqGrcQu9eJ%2BQIOyRFgeUpZNmTsbWNRAv%2BPEEM7L1VypGz2RoOMTNPXmasq9pSk3XL3s6djZwHymQxYPS4HRTRTjk4Tmqrm%2BK%2BCZBB3eqeaMgWpkzLWsypYBZwP9gasSrMouirTnWLVH3ainoE353Y6yTOBxYsjfB%2BdurlIIkiXvWL7F4lT8qcHr%2FgHMA4ZGxJClu43vgVMSzoGZ%2BK6KFkS4LcaDIH6tDTTGAeUbm5xwZESX4X1zM1clEuQgMD1BW0cBz1vmpO7QRYwE5qq1qoJ08n6SuCZgADAL2EqGYuS6uiqpIJ38oJbjVvd7yMUiAWqHaYwuc0ZCc5VNIENJBenkH%2BXPGpGgH8PUJK5z6aQWQ13zekvdy5suRowg69W3rwyfy0wOP01%2FjgNmJPSHTY65lslc7cgskMEiyPLAYYAjlJirLOZshaV%2BvqurqgsSRoliWt0NLY25qpEgrZaHO8dQJ7%2BXwboJIlje0GU%2BODooenVVQ0FmWEbJtFKYK9cFoSPyZApwl0QlJlh9yTJaR1vhqyvXBaHDrHweafOimDqLLN%2F8MYWuriogyEOGdo%2B11DnLsgRemtZ3JRExWXTMVUHeNrT77ph64i8zbXINSmOu5IU1i465KshCQ7uviaknbngTc9Qefay5Uhtp87PomKuCnK7xW22Mm3SVQ3KrZe8kqbmaKs8ui445KYgAnKZGymrgwRShP%2FeSDp25WlkbQejYar1dPeRRGVx%2FQEwsWhJz1V4LQejYFZT98LC%2F6dYM7mM7dJPEXFEXQT7UtGVvUvd7SuEbNVcr6iTILkN7Rmdwrw3dMFe1EaTNkK5icAb3kmiWtOYqXMddQYALgPtkPpAgOku5KzXLT62LvAltkhCj7Q2aK3cFoeOo2qHIiakWS%2FkLVbiPnFaa2qx2GO71cIPmyk1BZEcOfdjpsqAEACdbwlZt5spZ
QaYYrr8lKAkqlrhL%2B6LOw4i5claQkYbrd4mULwqJ%2F1LmsdOsfhU9pKSOSrQ7L4hhf7pdzvMFJUO9wZuC5h7RPCNnBemltlDfBJ6VMyOBQ6gV4r7KCOI6lj2S8gmiDnVKYpjHgKslQC2oGMByJwRRXtkfI%2FVerJooOCSI6XyiNQLENVwS5F1D3TuCCoFDgixIemrXZVwSZIjyS4WR%2FQw%2FhxS4yhJf0KPAS2KqgN5BxcCVEVIX8IKUCy9IyfCClIzSC6I2nU4NagJlFUQlnvkissRtevBB2SizIJ9pyq0MKg5lFEQ5Ek1h%2FJV793BBkH6GbG9y2qhHUGEooyCq7Bu5NKpklFmQ%2FipIeZdK0r8gbeJJFymtIKE6lXIelk2Qp9IKUjcw7%2FvMy%2BJmEubpBbEfgfvV8IxmBs0GuMIL0lDkpTAxaDZqotbFHNXeZAEtlt%2Br2h3NkdJMUXS%2FiVFbQYDewI2Ro3VRXsmyAedqItS1gqjELBI49rJpyKqFwjpHP22WnCjhA0TZRl8CT8cJYlgA3KwpV9Qv9eRF8xMGGIap5Fk0uU56G%2FLlbq%2BZIOty%2B%2BFIlcLoY3XjVZpD%2BjoOaxKA6bzCVeA923G8LHNQPS4nVDVJJXXn8TZprvEz1WK%2FyoPS%2FOw%2FaXKJaP5vYiQ5y9%2BaXzdo6UZS47KxQ%2BUN1ibKLFPSFzlJO92QK32MSmjs6mcWcJM6eFrciPB4PB6Px%2FPxeDwej8fj8XiCYvgfmdGLzXmrOA4AAAAASUVORK5CYII%3D&labelColor=CC766E&color=6A6A6A\">\n    \u003C\u002Fa>\n    \u003Ca href=\"https:\u002F\u002Fdocs.uptrain.ai\u002Fgetting-started\u002Fquickstart\">\n        \u003Cimg alt=\"阅读文档\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FRead%20Docs-uptrain?logo=data%3Aimage%2Fpng%3Bbase64%2CiVBORw0KGgoAAAANSUhEUgAAAGQAAABkCAYAAABw4pVUAAAACXBIWXMAAAsTAAALEwEAmpwYAAAC%2FklEQVR4nO3dO2sUURiH8SPipVAhKngJWggKolZ%2BAEGQIFiaL2CICNoLBmJAsBAbKxEREcTAYq%2BFYKGx8orxFkmhEltNY6P4yIEjs5FlJ4k7s5v%2B975v%2BD6ZJhD0%2FenZnsMhOCiIiIiIwFbgFHgCjAJtPq83QF2hqYBNgNXgZ%2FYMwvsCE0B7Ehb1Yh9wAPgN%2F28jbOd8ANfw6z2wLeQEuIdv74CtIRfA85LeB7cAbaECOLjEbbbv6wBWA923G8LHNQPS4nVDVJJXXn8TZprvEz1WK%2FyoPS%2FOw%2FaXKJaP5vYiQ5y9%2BaXzdo6UZS47KxQ%2BUN1ibKLFPSFzlJO92QK32MSmjs6mcWcJM6eFrciPB4PB6Px2FwPB6Px%2FPxeDwej8fj8XiCYvgfmdGLzXmrOA4AAAAASUVORK5CYII%3D&labelColor=976DA5&color=6A6A6A\">\n    \u003C\u002Fa>\n    \u003C\u002Fa>\n    \u003Ca href=\"https:\u002F\u002Fjoin.slack.com\u002Ft\u002Fuptraincommunity\u002Fshared_invite\u002Fzt-1yih3aojn-CEoR_gAh6PDSknhFmuaJeg\">\n        \u003Cimg alt=\"Slack社区\" 
src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FSlack%20Community-uptrain?logo=data%3Aimage%2Fpng%3Bbase64%2CiVBORw0KGgoAAAANSUhEUgAAAGQAAABkCAYAAABw4pVUAAAACXBIWXMAAAsTAAALEwEAmpwYAAADFklEQVR4nO2cPW4TURSFLaWBhg6xACs9YhEsISiLiIzoUkGNRIV3kDSxhJVVBArKrCBUSCnCTyTc8KGRphomjsfv3vGBdz7JpY%2Bu5iQz9v3kN5kYY4wxxhhjjDHG%2FAXwGDgGPgJfgRXwBfgAHAJ7u8yrCuAI%2BM56LoFnu8irCuA9zTm3g8dGzrqrbv%2BSh3AD7u7iqqve8f23HecZeZVB3BdeAHnmXnVAfwqv%2B8C1LBeAp8NPPECG1XVK7vhiKH%2BqJpew36wtciBWiBC9GC2gvpcdbRrIY48OxCpB39hs46stnGfnVUf7wC180LnU%2FRgUdzEDxfNy90PkUHHskP4FHgfH15ofMpOvBIXgfP15cXOp%2BiA4%2FiM%2FAwcL7evIz51Bx4BJ%2BAJ4Hz3ZmXNZ%2BSAy%2B9J78BHgTNd29e5nwqDnzo6xR4B7zY9gHZmW921og5f0WnPpqzRtD5Kzr1UZw1os5f0amnO2uEnb%2BiU091og7%2F4xCjpWdNeLOP6OQC2VnjbjzzyjkWtlZI%2B7cDVSXD0V5F51ZHg6JeRedWR4OgPI%2FOqI8HR70XmVUmko8%2FIq5IoR5%2BVVyURjj4zr1oocPRj5BljjDHGGGOMMcYYY8xElT9jEHRBVEyk3KDAAAAABJRU5ErkJggg%3D%3D&labelColor=6565d8&color=6A6A6A\">\n    \u003C\u002Fa>\n    \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fissues\u002Fnew?assignees=&labels=enhancement&template=feature_request.md&title=\">\n        \u003Cimg alt=\"请求新功能\" 
src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FRequest%20New%20Feature-uptrain?logo=data%3Aimage%2Fpng%3Bbase64%2CiVBORw0KGgoAAAANSUhEUgAAAGQAAABkCAYAAABw4pVUAAAACXBIWXMAAAsTAAALEwEAmpwYAAAG%2BElEQVR4nO2daahVVRTHt2WDls3RQDYXRUZBkRkRjdAraPjSAJUUKVSE0QuiQZ4FmRj2lMwwqGgitZTq5cvEIipKpGgymogwCrKs7IOkDf5i9ZYorzvsM+99zvrB\u002FXjWXfv+z9ln7zXs65xhGIZhGIZhGIZRCsD2wOXAIuA94P2MH7HxLHBBOSOoEcBewJsUx\u002FPATATtq3qscbPMAMymNy1eMNHuDxEgXpq3q8wQP0lSjIPV8PN3iAY4HNJkhAAHNMkIAAdgBml\u002FCk2JSVBOAo4HbgYWD+sM9KEyQQZP8ArDVBAgHo0WCgLXurBjgOWG\u002F7xMRkyK+9Wxj42vP625zMQHsAyyjOH4AXtq3qscbPMAMymNy1eMNHuDxEgXpq3q8wQP0lSjIPV8PN3iAY4HNJkhAAF9gfW2MYwAIDRwKo9dEgjAPHupBwJwq32yAgHo0GCgLXurBjgOWG\u002F7xMRkyK+9Wxj42vP625zMQHsAyyjOH4AXtq3qscbPMAMymNy1eMNHuDxEgXpq3q8wQP0lSjIPV8PN3iAY4HNJkhAAF9gfW2MYwAIDRwKo9dEgjAPHupBwJwq32yAgHo0GCgLXurBjgOWG\u002F7xMRkyK+9Wxj42vP625zMQHsAyyjOH4AXtq3qscbPMAMymNy1eMNHuDxEgXpq3q8wQP0lSjIPV8PN3iAY4HNJkhAAF9gfW2MYwAIDRwKo9dEgjAPHupBwJwq32yAgHo0GCgLXurBjgOWG\u002F7xMRkyK+9Wxj42vP625zMQHsAyyjOH4AXtq3qscbPMAMymNy1eMNHuDxEgXpq3q8wQP0lSjIPV8PN3iAY4HNJkhAAF9gfW2MYwAIDRwKo9dEgjAPHupBwJwq32yAgHo0GCgLXurBjgOWG\u002F7xMRkyK+9Wxj42vP625zMQHsAyyjOH4AXtq3qscbPMAMymNy1eMNHuDxEgXpq3q8wQP0lSjIPV8PN3iAY4HNJkhAAF9gfW2MYwAIDRwKo9dEgjAPHupBwJwq32yAgHo0GCgLXurBjgOWG\u002F7xMRkyK+9Wxj42vP625zMQHsAyyjOH4AXtq3qscbPMAMymNy1eMNHuDxEgXpq3q8wQP0lSjIPV8PN3iAY4HNJkhAAF9gfW2MYwAIDRwKo9dEgjAPHupBwJwq32yAgHo0GCgLXurBjgOWG\u002F7xMRkyK+9Wxj42vP625zMQHsAyyjOH4AXtq3qscbPMAMymNy1eMNHuDxEgXpq3q8wQP0lSjIPV8PN3iAY4HNJkhAAF9gfW2MYwAIDRwKo9dEgjAPHupBwJwq32yAgHo0GCgLXurBjgOWG\u002F7xMRkyK+9Wxj42vP625zMQHsAyyjOH4AXtq3qscbPMAMymNy1eMNHuDxEgXpq3q8wQP0lSjIPV8PN3iAY4HNJkhAAF9gfW2MYwAIDRwKo9dEgjAPHupBwJwq......\n\n\u003Cp align=\"center\">\n  \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Freleases\">\n    \u003Cimg alt=\"GitHub Release\" src=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fv\u002Frelease\u002Fuptrain-ai\u002Fuptrain?labelColor=6A6A6A&color=CC766E\">\n  
\u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fgraphs\u002Fcommit-activity\">\n    \u003Cimg alt=\"GitHub commit activity\" src=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fcommit-activity\u002Fm\u002Fuptrain-ai\u002Fuptrain?labelColor=6A6A6A&color=976DA5\">\n  \u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fblob\u002Fmain\u002FLICENSE\">\n    \u003Cimg alt=\"GitHub License\" src=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Flicense\u002Fuptrain-ai\u002Fuptrain?labelColor=6A6A6A&color=6565d8\">\n  \u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Fpypistats.org\u002Fpackages\u002Fuptrain\">\n    \u003Cimg alt=\"PyPI - Downloads\" src=\"https:\u002F\u002Fimg.shields.io\u002Fpypi\u002Fdm\u002Fuptrain?labelColor=6A6A6A&color=3E93C4\">\n  \u003C\u002Fa>\n\u003C\u002Fp>\n\n\u003Ch4 align=\"center\">\n  \u003Cvideo src=\"https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fassets\u002F43818888\u002F68a3b169-2217-446b-93b2-96b48ec7201d\" alt=\"UpTrain 的 LLM 评估演示，展示了幻觉、检索上下文质量以及客服聊天机器人回复语气的评分\" autoplay>\n\u003C\u002Fh4>\n\n\n**[UpTrain](https:\u002F\u002Fuptrain.ai)** 是一个开源的统一平台，用于评估和改进生成式 AI 应用。我们提供针对 [20 多种预配置评估](https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain?tab=readme-ov-file#pre-built-evaluations-we-offer-) 的评分（涵盖语言、代码和嵌入用例），并对失败案例进行 [根本原因分析](https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fblob\u002Fmain\u002Fexamples\u002Froot_cause_analysis\u002Frag_with_citation.ipynb)，并给出解决建议。    \n\n\u003Cbr \u002F>\n\n\n\n# 核心功能 🔑\n\n\u003Cimg width=\"1088\" alt=\"交互式仪表板\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuptrain-ai_uptrain_readme_5cdf36d1c0a2.png\">\n\nUpTrain 仪表板是一个基于 Web 的界面，可在您的 **本地机器** 上运行。您可以通过该仪表板评估您的 LLM 应用程序、查看结果并进行根本原因分析。\n\n\u003Cimg width=\"1088\" alt=\"20+ 预配置评估\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuptrain-ai_uptrain_readme_238aa6ac809c.png\">\n\n支持 **20 
多种预配置评估**，例如响应完整性、事实准确性、上下文简洁性等。\n\n\u003Cimg width=\"1088\" alt=\"数据安全\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuptrain-ai_uptrain_readme_8c51a5ae7628.png\">\n\n所有评估和分析都在您的本地系统上运行，确保数据不会离开您的安全环境（使用模型评分检查时除外）。\n\n\u003Cimg width=\"1088\" alt=\"实验\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuptrain-ai_uptrain_readme_1fa86b4cdc29.png\">\n\n**尝试不同的嵌入模型**，如 text-embedding-3-large\u002Fsmall、text-embedding-3-ada、baai\u002Fbge-large 等。UpTrain 支持 HuggingFace 模型、Replicate 端点或您自己端点上托管的自定义模型。\n\n\u003Cimg width=\"1088\" alt=\"根本原因分析\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuptrain-ai_uptrain_readme_d01009fcceb8.png\">\n\n您可以对收到负面用户反馈或评估分数较低的案例进行 **根本原因分析**，以了解您的 LLM 流程中哪一部分导致了不佳的结果。请查看支持的 RCA 模板。\n\n\u003Cimg width=\"1088\" alt=\"选择多种评估 LLM\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuptrain-ai_uptrain_readme_b859ab51bfa6.png\">\n\n我们允许您使用任何 OpenAI、Anthropic、Mistral、Azure 的 OpenAI 端点，或在 Anyscale 上托管的开源 LLM 作为评估者。\n\n\u003Cimg width=\"1088\" alt=\"自定义评估\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuptrain-ai_uptrain_readme_1a82aea699cc.png\">\n\nUpTrain 提供了大量方式来 **自定义评估**。您可以自定义评估方法（思维链 vs 分类）、少样本示例和场景描述。您还可以创建自定义评估器。\n\n### 即将推出：\n\n1. 与团队协作\n2. 通过 UMAP 和聚类进行嵌入可视化\n3. 在失败案例中识别模式\n4. 
提供提示优化建议\n\n\u003Cbr \u002F>\n\n\n# 开始使用 🙌\n\n## 方法 1：使用本地托管的仪表板\n\nUpTrain 仪表板是一个基于 Web 的界面，可让您评估 LLM 应用程序。它是一个自托管的仪表板，可在您的本地机器上运行。\n您无需编写任何代码即可使用该仪表板。您可以使用该仪表板评估 LLM 应用程序、查看结果并进行根本原因分析。\n\n在开始之前，请确保您的机器上已安装 Docker。如果没有，您可以从 [这里](https:\u002F\u002Fdocs.docker.com\u002Fget-docker\u002F) 安装。\n\n以下命令将下载 UpTrain 仪表板并在您的本地机器上启动它。\n\n```bash\n# 克隆仓库\ngit clone https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\ncd uptrain\n\n# 运行 UpTrain\nbash run_uptrain.sh\n```\n> **注意：** UpTrain 仪表板目前处于 **测试版**。我们非常欢迎您的反馈，以帮助我们改进它。\n\n## 方法 2：使用 UpTrain 软件包\n\n如果您是开发者，并希望将 UpTrain 评估集成到您的应用程序中，可以使用 UpTrain 软件包。这提供了一种更程序化的方式来评估您的 LLM 应用程序。\n\n### 通过 pip 安装软件包：\n```bash\npip install uptrain\n```\n\n### 如何使用 UpTrain：\n\n您可以通过提供您的 OpenAI API 密钥来运行评估，从而通过开源版本评估您的响应。\n\n```python\nfrom uptrain import EvalLLM, Evals\nimport json\n\nOPENAI_API_KEY = \"sk-***************\"\n\ndata = [{\n    'question': '世界上最受欢迎的运动是什么？',\n    'context': '体育运动的受欢迎程度可以从多个方面衡量，包括电视收视率、社交媒体影响力、参与人数以及经济影响等。毫无疑问，足球是全球最受欢迎的运动，像 FIFA 世界杯这样的重大赛事以及罗纳尔多、梅西等体育明星，吸引了超过 40 亿的球迷。板球在印度、巴基斯坦、澳大利亚和英国等国家尤为流行。国际板球理事会世界杯和印度超级联赛（IPL）拥有庞大的观众群体。NBA 则使篮球在全球范围内广受欢迎，尤其是在美国、加拿大、中国和菲律宾等地。温布尔登网球锦标赛、美国公开赛、法国公开赛和澳大利亚公开赛等大型网球赛事也拥有庞大的全球观众。罗杰·费德勒、塞雷娜·威廉姆斯和拉斐尔·纳达尔等球员进一步提升了这项运动的知名度。曲棍球在印度、荷兰和澳大利亚等国家非常流行，在世界许多地区也有相当多的追随者。',\n    'response': '足球是世界上最受欢迎的运动，全球约有 40 亿球迷'\n}]\n\neval_llm = EvalLLM(openai_api_key=OPENAI_API_KEY)\n\nresults = eval_llm.evaluate(\n    data=data,\n    checks=[Evals.CONTEXT_RELEVANCE, Evals.FACTUAL_ACCURACY, Evals.RESPONSE_COMPLETENESS]\n)\n\nprint(json.dumps(results, indent=3))\n```\n如果您有任何问题，请加入我们的 [Slack 社区](https:\u002F\u002Fjoin.slack.com\u002Ft\u002Fuptraincommunity\u002Fshared_invite\u002Fzt-1yih3aojn-CEoR_gAh6PDSknhFmuaJeg)。\n\n您也可以通过[在此预约通话](https:\u002F\u002Fcalendly.com\u002Fuptrain-sourabh\u002F30min)直接与 UpTrain 的维护人员交流。\n\n\u003Cbr \u002F>\n\n# 我们提供的预构建评估 📝\n\u003Cimg width=\"1088\" alt=\"quality of your responses\" 
src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuptrain-ai_uptrain_readme_a4a58e957b6c.png\">\n\n| 评估 | 描述 |\n| ---- | ----------- |\n|[响应完整性](https:\u002F\u002Fdocs.uptrain.ai\u002Fpredefined-evaluations\u002Fresponse-quality\u002Fresponse-completeness) | 评分响应是否回答了问题中指定的所有方面。 |\n|[响应简洁性](https:\u002F\u002Fdocs.uptrain.ai\u002Fpredefined-evaluations\u002Fresponse-quality\u002Fresponse-conciseness) | 评分生成的响应是否简洁，或者是否包含与所提问题无关的额外信息。 |\n|[响应相关性](https:\u002F\u002Fdocs.uptrain.ai\u002Fpredefined-evaluations\u002Fresponse-quality\u002Fresponse-relevance)| 评分生成的内容与所提问题的相关程度。|\n|[响应有效性](https:\u002F\u002Fdocs.uptrain.ai\u002Fpredefined-evaluations\u002Fresponse-quality\u002Fresponse-validity)| 评分生成的响应是否有效。如果响应包含任何信息，则被视为有效响应。|\n|[响应一致性](https:\u002F\u002Fdocs.uptrain.ai\u002Fpredefined-evaluations\u002Fresponse-quality\u002Fresponse-consistency)| 评分响应与所提问题以及所提供上下文的一致性。|\n\n\u003Cimg width=\"1088\" alt=\"quality of retrieved context and response groundedness\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuptrain-ai_uptrain_readme_bec5d0416c39.png\">\n\n\n| 评估 | 描述 |\n| ---- | ----------- |\n|[上下文相关性](https:\u002F\u002Fdocs.uptrain.ai\u002Fpredefined-evaluations\u002Fcontext-awareness\u002Fcontext-relevance) | 评分上下文与所提问题的相关程度。 |\n|[上下文利用度](https:\u002F\u002Fdocs.uptrain.ai\u002Fpredefined-evaluations\u002Fcontext-awareness\u002Fcontext-utilization) | 在给定上下文信息的情况下，评分生成的响应对所提问题的完整程度。 |\n|[事实准确性](https:\u002F\u002Fdocs.uptrain.ai\u002Fpredefined-evaluations\u002Fcontext-awareness\u002Ffactual-accuracy)| 评分生成的响应是否基于所提供的上下文且事实正确。|\n|[上下文简洁性](https:\u002F\u002Fdocs.uptrain.ai\u002Fpredefined-evaluations\u002Fcontext-awareness\u002Fcontext-conciseness)| 评估从原始上下文中引用的简洁性，判断是否存在无关信息。\n|[上下文重新排序](https:\u002F\u002Fdocs.uptrain.ai\u002Fpredefined-evaluations\u002Fcontext-awareness\u002Fcontext-reranking)| 评估重新排序后的上下文相比原始上下文的效率如何。|\n\n\u003Cimg width=\"1088\" alt=\"language quality of the response\" 
src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuptrain-ai_uptrain_readme_8e2bacdd068c.png\">\n\n| 评估 | 描述 |\n| ---- | ----------- |\n|[语言特征](https:\u002F\u002Fdocs.uptrain.ai\u002Fpredefined-evaluations\u002Flanguage-quality\u002Ffluency-and-coherence) | 评分响应中语言的质量和效果，重点关注清晰度、连贯性、简洁性及整体沟通能力。 |\n|[语气](https:\u002F\u002Fdocs.uptrain.ai\u002Fpredefined-evaluations\u002Fcode-evals\u002Fcode-hallucination) | 评分生成的响应是否符合所需角色的语气 |\n\n\u003Cimg width=\"1088\" alt=\"language quality of the response\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuptrain-ai_uptrain_readme_02f492ac61f4.png\">\n\n| 评估 | 描述 |\n| ---- | ----------- |\n|[代码幻觉](https:\u002F\u002Fdocs.uptrain.ai\u002Fpredefined-evaluations\u002Fcode-evals\u002Fcode-hallucination) | 评分生成响应中的代码是否基于上下文。 |\n\n\u003Cimg width=\"1088\" alt=\"conversation as a whole\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuptrain-ai_uptrain_readme_81617f912cad.png\">\n\n| 评估 | 描述 |\n| ---- | ----------- |\n|[用户满意度](https:\u002F\u002Fdocs.uptrain.ai\u002Fpredefined-evaluations\u002Fconversation-evals\u002Fuser-satisfaction) | 评分用户的关切是否得到妥善处理，并根据提供的对话内容评估其满意度。 |\n\n\u003Cimg width=\"1088\" alt=\"custom evaluations and others\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuptrain-ai_uptrain_readme_f16beaa37e92.png\">\n\n 评估 | 描述 |\n| ---- | ----------- |\n|[自定义指南](https:\u002F\u002Fdocs.uptrain.ai\u002Fpredefined-evaluations\u002Fcustom-evals\u002Fcustom-guideline) | 允许您指定一条指南，并评分大型语言模型在给出响应时对指南的遵守情况。 |\n|[自定义提示](https:\u002F\u002Fdocs.uptrain.ai\u002Fpredefined-evaluations\u002Fcustom-evals\u002Fcustom-prompt-eval) | 允许您创建自己的评估集。 |\n\n\u003Cimg width=\"1088\" alt=\"compare responses with ground truth\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuptrain-ai_uptrain_readme_1dfcac623eff.png\">\n\n| 评估 | 描述 |\n| ---- | ----------- |\n|[响应匹配](https:\u002F\u002Fdocs.uptrain.ai\u002Fpredefined-evaluations\u002Fground-truth-comparison\u002Fresponse-matching) | 
比较并评分大型语言模型生成的响应与提供的真实答案之间的匹配程度。 |\n\n\u003Cimg width=\"1088\" alt=\"safeguard system prompts and avoid LLM mis-use\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuptrain-ai_uptrain_readme_02ba8d93b638.png\">\n\n| 评估 | 描述 |\n| ---- | ----------- |\n|[提示注入](https:\u002F\u002Fdocs.uptrain.ai\u002Fpredefined-evaluations\u002Fsafeguarding\u002Fprompt-injection) | 评分用户的提示是否试图让大型语言模型泄露其系统提示。 |\n|[越狱检测](https:\u002F\u002Fdocs.uptrain.ai\u002Fpredefined-evaluations\u002Fsafeguarding\u002Fjailbreak) | 评分用户的提示是否试图进行越狱（即生成非法或有害的响应）。 |\n\n\u003Cimg width=\"1088\" alt=\"evaluate the clarity of user queries\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuptrain-ai_uptrain_readme_602f818ab511.png\">\n\n| 评估 | 描述 |\n| ---- | ----------- |\n|[子查询完整性](https:\u002F\u002Fdocs.uptrain.ai\u002Fpredefined-evaluations\u002Fquery-quality\u002Fsub-query-completeness) | 评估从用户查询中生成的所有子查询是否共同覆盖了用户查询的所有方面 |\n| [多查询准确性](https:\u002F\u002Fdocs.uptrain.ai\u002Fpredefined-evaluations\u002Fquery-quality\u002Fmulti-query-accuracy) | 评估生成的变体是否准确地代表了原始查询 |\n\n\n\u003Cbr \u002F>\n\n# 集成 🤝\n\n| 评估框架 | 大模型提供商 | 大模型工具包 | 服务框架 | 大模型可观测性 | 向量数据库 |\n| ------------- | ------------- | ------------- | ------------- | ------------- |  ------------- |\n| [OpenAI Evals](https:\u002F\u002Fdocs.uptrain.ai\u002Ftutorials\u002Fopenai-evals) | [OpenAI](https:\u002F\u002Fdocs.uptrain.ai\u002Fllms\u002Fopenai) | [LlamaIndex](https:\u002F\u002Fdocs.uptrain.ai\u002Fintegrations\u002Fframework\u002Fllamaindex-methods\u002Foverview) | [Ollama](https:\u002F\u002Fdocs.uptrain.ai\u002Fllms\u002Follama) | [Langfuse](https:\u002F\u002Fdocs.uptrain.ai\u002Fintegrations\u002Fobservation-tools\u002Flangfuse) | [Qdrant](https:\u002F\u002Fdocs.uptrain.ai\u002Fintegrations\u002Fvector_db\u002Fqdrant) |\n| | [Azure](https:\u002F\u002Fdocs.uptrain.ai\u002Fllms\u002Fazure) | |  [Together AI](https:\u002F\u002Fdocs.uptrain.ai\u002Fllms\u002Ftogether_ai) | 
[Helicone](https:\u002F\u002Fdocs.uptrain.ai\u002Fintegrations\u002Fobservation-tools\u002Fhelicone) | [FAISS](https:\u002F\u002Fdocs.uptrain.ai\u002Fintegrations\u002Fvector_db\u002Ffaiss) |\n| | [Claude](https:\u002F\u002Fdocs.uptrain.ai\u002Fllms\u002Fclaude) | |  [Anyscale](https:\u002F\u002Fdocs.uptrain.ai\u002Fllms\u002Fanyscale) | [Zeno](https:\u002F\u002Fdocs.uptrain.ai\u002Fintegrations\u002Fobservation-tools\u002Fzeno) | [Chroma](https:\u002F\u002Fdocs.uptrain.ai\u002Fintegrations\u002Fvector_db\u002Fchroma) |\n| | [Mistral](https:\u002F\u002Fdocs.uptrain.ai\u002Fllms\u002Fmistral) | | Replicate  |\n| |  | |  HuggingFace  |\n\n更多集成即将推出。如果您有特定的集成需求，请通过[创建议题](https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fissues)告诉我们。\n\n\u003Cbr \u002F>\n\n# 监控大模型中的提示漂移：UpTrain 基准测试\n\n像 GPT-4、GPT-3.5-turbo、Claude-2.1 等最受欢迎的大模型都是闭源的，即通过 API 对外提供服务，而对其内部运作几乎无法窥探。目前已有许多关于提示漂移（或 GPT-4 变得“懒惰”）的报道，并且也有研究探讨模型质量的下降问题。本次基准测试旨在通过在一个固定数据集上评估模型响应，来追踪模型行为的变化。\n\n\u003Cimg width=\"1316\" alt=\"image\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuptrain-ai_uptrain_readme_7f8dd13c7d55.png\">\n\n您可以在[这里](https:\u002F\u002Fdemo.uptrain.ai\u002Fbenchmark)找到该基准测试。\n\n# 资源 💡\n\n1. [如何评估您的大模型应用](https:\u002F\u002Fblog.uptrain.ai\u002Fhow-to-evaluate-your-llm-applications)\n1. [如何检测越狱攻击](https:\u002F\u002Fblog.uptrain.ai\u002Fllm-jailbreak\u002F)\n1. 
[应对幻觉问题](https:\u002F\u002Fblog.uptrain.ai\u002Fdealing-with-hallucinations-in-llms-a-deep-dive\u002F)\n\n\u003Cbr \u002F>\n\n# 我们为何构建 UpTrain 🤔\n\n在过去八年中，我们一直从事机器学习和自然语言处理模型的工作，但不断被模型中隐藏的各种故障所困扰，这促使我们开发了 UpTrain。UpTrain 最初是一个机器学习可观测性工具，用于检测准确率的退化。\n\n然而，我们很快意识到，大模型开发者面临更大的挑战：他们几乎没有好的方法来衡量其大模型应用的准确率，更不用说识别退化情况了。\n\n随后，我们看到了 [OpenAI evals](https:\u002F\u002Fgithub.com\u002Fopenai\u002Fevals) 的发布，其中他们提出使用大模型自身来评估其他模型的输出。此外，阅读了 [Anthropic 如何利用 RLAIF](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2212.08073.pdf) 后，我们更加确信可以着手进行大模型评估的研究（我们很快将发布一个包含优秀评估研究的仓库）。\n\n因此，如今 UpTrain 是我们试图为大模型领域带来秩序并回馈社区的一次尝试。尽管大多数开发者仍然依赖直觉，并通过审查几个案例来部署提示变更，但我们已经听过太多因退化而导致的问题，因此相信随着这一领域的成熟，“评估与改进”将成为大模型生态系统的重要组成部分。\n\n1. 强大的评估体系可以帮助您系统地尝试不同的配置，客观地选择最佳方案，从而避免任何退化。\n\n1. 它能帮助您了解系统出错的原因，在最终用户抱怨甚至流失之前找到根本问题并及时修复。\n\n1. 像提示注入和越狱检测这样的评估对于维护大模型应用的安全性和可靠性至关重要。\n\n1. 评估有助于提高透明度，建立与最终用户的信任，这在面向企业客户时尤为重要。\n\n\u003Cbr \u002F>\n\n# 为什么选择开源？ \n\n1. 我们明白，在评估方面并没有**一刀切的解决方案**。越来越多的开发者希望修改评估提示、选项集或少样本示例等。我们认为，最好的开发者体验在于开源，而不是暴露 20 个不同的参数。\n\n1. **促进创新**：大模型评估以及使用大模型作为评判者这一领域仍处于起步阶段。我们每天都能看到许多令人兴奋的研究成果，而开源为我们和社区提供了一个合适的平台，以便更快地实现这些技术并推动创新。\n\n\u003Cbr \u002F>\n\n## 您可以如何帮助 🙏\n\n我们一直在努力改进 UpTrain，您可以通过以下几种方式做出贡献：\n\n1. **发现任何问题或改进建议**：如果您发现了任何错误或有改进建议，请在我们的 GitHub 仓库中[创建议题](https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fissues)。\n\n1. **直接贡献代码**：如果您发现某个问题可以解决，或者有代码改进建议，欢迎直接向[仓库](https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fblob\u002Fmain\u002FCONTRIBUTING.md)提交贡献。\n\n1. **请求自定义评估**：如果您的应用需要定制化的评估，请[告知我们](https:\u002F\u002Fjoin.slack.com\u002Ft\u002Fuptraincommunity\u002Fshared_invite\u002Fzt-1yih3aojn-CEoR_gAh6PDSknhFmuaJeg)，我们将把它添加到仓库中。\n\n1. **与您的工具集成**：如果您需要与现有工具集成，请[联系我们](https:\u002F\u002Fjoin.slack.com\u002Ft\u002Fuptraincommunity\u002Fshared_invite\u002Fzt-1yih3aojn-CEoR_gAh6PDSknhFmuaJeg)，我们会协助完成。\n\n1. 
**寻求评估帮助**：如果您在评估方面遇到困难，请在我们的[Slack 频道](https:\u002F\u002Fjoin.slack.com\u002Ft\u002Fuptraincommunity\u002Fshared_invite\u002Fzt-1yih3aojn-CEoR_gAh6PDSknhFmuaJeg)中提问，我们会尽快为您解答。\n\n1. **表达支持**：请在 GitHub 上给我们点个赞 ⭐，以关注我们的进展。\n\n1. **分享给更多人**：如果您喜欢我们的工作，请在 Twitter 上为我们宣传！\n\n您的贡献和支持对我们意义重大！感谢您参与 UpTrain 的旅程。\n\n\u003Cbr \u002F>\n\n# 许可证 💻\n\n本仓库采用 Apache 2.0 许可证发布，我们致力于为 UpTrain 开源仓库添加更多功能。如果您希望获得更省心的服务，我们还提供托管版本。请在此处预约[演示通话](https:\u002F\u002Fcalendly.com\u002Fuptrain-sourabh\u002F30min)。\n\n\u003Cbr \u002F>\n\n# 提供反馈（越尖锐越好 😉） \n\n我们正在公开开发 UpTrain。请通过**[这里](https:\u002F\u002Fdocs.google.com\u002Fforms\u002Fd\u002Fe\u002F1FAIpQLSezGUkkC0JoEvx-0gCrRSmGutA-jqyb7kl2lomXv302_C3MnQ\u002Fviewform?usp=sf_link)**提供您的反馈，帮助我们改进。\n\n\u003Cbr \u002F>\n\n# 贡献者 🖥️\n\n我们欢迎对 UpTrain 的贡献。详情请参阅我们的[贡献指南](https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fblob\u002Fmain\u002FCONTRIBUTING.md)。\n\n\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fgraphs\u002Fcontributors\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuptrain-ai_uptrain_readme_596967de1f71.png\" \u002F>\n\u003C\u002Fa>","# UpTrain 快速上手指南\n\nUpTrain 是一个开源的统一平台，旨在评估和改进生成式 AI（LLM）应用。它提供 20+ 种预配置的评估指标（涵盖文本、代码、嵌入等场景），支持对失败案例进行根因分析，并提供改进建议。\n\n## 环境准备\n\n在开始之前，请确保您的开发环境满足以下要求：\n\n*   **操作系统**: Linux, macOS 或 Windows\n*   **Python 版本**: Python 3.8 或更高版本 (推荐 3.9+)\n*   **前置依赖**:\n    *   `pip` (Python 包管理工具)\n    *   一个有效的 LLM API Key (如 OpenAI API Key，用于驱动评估模型)\n\n## 安装步骤\n\n使用 pip 直接安装 UpTrain 及其核心依赖：\n\n```bash\npip install uptrain\n```\n\n> **提示**：如果您在国内网络环境下安装较慢，可以使用国内镜像源加速：\n> ```bash\n> pip install uptrain -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple\n> ```\n\n## 基本使用\n\n以下是一个最简单的示例，展示如何使用 UpTrain 评估 LLM 生成的回答是否存在“幻觉”（Hallucination）以及上下文的相关性。\n\n### 1. 配置 API Key\n\n在使用前，您需要设置环境变量或在代码中配置您的 LLM 提供商 API Key（以 OpenAI 为例）：\n\n```python\nimport os\nos.environ[\"OPENAI_API_KEY\"] = \"sk-your-api-key-here\"\n```\n\n### 2. 
运行评估\n\n创建一个简单的评估脚本，输入用户问题、检索到的上下文以及 LLM 的回答，UpTrain 将自动打分。\n\n```python\nimport json\nimport os\n\nfrom uptrain import EvalLLM, Evals\n\n# 初始化评估引擎（显式传入 OpenAI API Key）\neval_llm = EvalLLM(openai_api_key=os.environ[\"OPENAI_API_KEY\"])\n\n# 准备测试数据\ndata = [\n    {\n        \"question\": \"UpTrain 支持哪些评估指标？\",\n        \"context\": \"UpTrain 是一个开源平台，支持幻觉检测、上下文相关性、语气分析等 20 多种评估指标。\",\n        \"response\": \"UpTrain 支持幻觉检测和上下文相关性评估。\",\n    }\n]\n\n# 执行评估：FACTUAL_ACCURACY 检查回答是否有上下文依据（用于发现幻觉）\nresults = eval_llm.evaluate(\n    data=data,\n    checks=[Evals.FACTUAL_ACCURACY, Evals.CONTEXT_RELEVANCE],\n)\n\n# 打印结果：每条结果是一个字典，包含原始字段以及各检查项的分数和解释\nprint(json.dumps(results, indent=3, ensure_ascii=False))\n```\n\n### 3. 查看结果\n\n运行上述代码后，输出中的每条结果除原始字段外，还会包含各检查项对应的 `score_*`（0-1 之间的分数）和 `explanation_*`（判断理由）字段。结构大致如下，具体数值和解释取决于所用的评估模型：\n\n```text\n[\n   {\n      \"question\": \"...\",\n      \"context\": \"...\",\n      \"response\": \"...\",\n      \"score_factual_accuracy\": ...,\n      \"explanation_factual_accuracy\": \"...\",\n      \"score_context_relevance\": ...,\n      \"explanation_context_relevance\": \"...\"\n   }\n]\n```\n\n现在您已经成功完成了第一次评估！您可以进一步探索 UpTrain 的仪表盘功能或自定义评估流程。","某电商团队正在开发基于大模型的智能客服系统，用于自动处理用户关于退货政策和订单状态的咨询。\n\n### 没有 uptrain 时\n- 团队依靠人工抽检少量对话来评估回答质量，效率极低且无法覆盖长尾错误案例。\n- 当模型开始胡编乱造（幻觉）或语气生硬时，缺乏自动化手段及时发现，导致不良体验流向线上。\n- 面对回答失败的情况，开发人员只能凭经验猜测原因，难以定位是提示词问题还是知识库缺失。\n- 每次迭代优化后，无法量化对比新旧版本的效果差异，改进工作如同“盲人摸象”。\n\n### 使用 uptrain 后\n- 利用 uptrain 内置的 20+ 预配置检查项（如事实一致性、语气友好度），实现对全量对话的自动化实时评分。\n- 系统自动标记出存在幻觉或逻辑错误的案例，并生成根因分析报告，直接指出是检索内容偏差还是推理逻辑漏洞。\n- 通过 uptrain 的可视化仪表盘，团队清晰看到不同版本模型在特定场景下的得分变化，让优化方向有据可依。\n- 针对嵌入（Embedding）检索效果进行专项评估，快速调整向量库策略，显著提升了相关文档的召回准确率。\n\nuptrain 将原本模糊的黑盒评估转化为可量化、可追溯的工程闭环，帮助团队以数据驱动的方式持续打磨高质量的生成式 AI 应用。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuptrain-ai_uptrain_ff0e9b91.png","uptrain-ai","UpTrain 
AI","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Fuptrain-ai_1e3a5dec.png","",null,"https:\u002F\u002Fuptrain.ai\u002F","https:\u002F\u002Fgithub.com\u002Fuptrain-ai",[80,84,88,92,96],{"name":81,"color":82,"percentage":83},"Python","#3572A5",71.1,{"name":85,"color":86,"percentage":87},"JavaScript","#f1e05a",28.5,{"name":89,"color":90,"percentage":91},"Dockerfile","#384d54",0.2,{"name":93,"color":94,"percentage":95},"CSS","#663399",0.1,{"name":97,"color":98,"percentage":99},"Shell","#89e051",0,2342,204,"2026-04-18T08:01:17","Apache-2.0","未说明",{"notes":106,"python":104,"dependencies":107},"提供的 README 片段主要包含项目介绍、功能概览及徽章链接，未包含具体的运行环境需求（如操作系统、GPU、内存、Python 版本或依赖库列表）。建议查阅官方文档 (docs.uptrain.ai) 或 requirements.txt 文件以获取详细安装要求。",[104],[35,14,109],"其他",[111,112,113,114,115,116,117,118,119,120,121,122,123,124],"machine-learning","experimentation","llm-prompting","llm-test","llmops","monitoring","prompt-engineering","autoevaluation","evaluation","llm-eval","hallucination-detection","jailbreak-detection","openai-evals","root-cause-analysis","2026-03-27T02:49:30.150509","2026-04-19T06:04:18.904937",[128,133,138,143,148,153,158],{"id":129,"question_zh":130,"answer_zh":131,"source_url":132},41821,"如何在本地测试 UpTrain 的最新代码更改或拉取请求（PR）？","你可以通过以下命令直接从 GitHub 的主分支安装最新版本进行测试：\npip install git+https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain.git@main\n\n如果是测试本地的修改，可以运行：\npython setup.py install\n\n运行测试文件时，请进入 tests 文件夹并执行：\npytest .","https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fissues\u002F644",{"id":134,"question_zh":135,"answer_zh":136,"source_url":137},41822,"如何使 UMAP 成为可选依赖，而不是安装 UpTrain 时的强制要求？","UpTrain 现已支持将 UMAP 作为可选依赖。如果你未在配置中定义 UMAP 检查，则无需安装它。如果在代码中需要用到 UMAP 但未安装，建议在相关代码块（如 umap.py）中使用 try\u002Fexcept 结构来处理导入错误：\n\ntry:\n  import umap\nexcept ImportError:\n  print(\"UMAP installation not found. 
For UMAP visualization, please install with `pip install umap-learn`.\")\n\n这样用户只有在需要可视化功能时才被提示安装 UMAP。","https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fissues\u002F128",{"id":139,"question_zh":140,"answer_zh":141,"source_url":142},41823,"在使用 NLTK 计算 BLEU 分数时遇到 `_normalize` 参数报错怎么办？","该问题是因为稳定版 NLTK 中 `fractions.Fraction` 类尚未重载 `_normalize` 参数，而 UpTrain 的开发分支依赖此特性。\n\n解决方案有两种：\n1. 从源码安装开发版 NLTK：\npip install https:\u002F\u002Fgithub.com\u002Fnltk\u002Fnltk\u002Farchive\u002Fdevelop.zip\n\n2. 或者改用 `torchmetrics` 库中的实现：\n使用 `torchmetrics.text.BLEUScore` 来替代 NLTK 的实现，以避免版本兼容性问题。","https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fissues\u002F278",{"id":144,"question_zh":145,"answer_zh":146,"source_url":147},41824,"Response Consistency（响应一致性）评估是否支持不需要解释（explanation）的分类模式？","是的，目前该功能已得到支持。之前响应一致性评估默认强制要求思维链（Chain-of-thought）解释，但现在已更新以支持 `CLASSIFY` 评估类型，允许在不生成解释的情况下直接输出评分。\n\n你可以通过更新到最新代码或使用合并了相关 PR（如 #583）的版本来使用此功能。","https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fissues\u002F528",{"id":149,"question_zh":150,"answer_zh":151,"source_url":152},41825,"如何在后台进程中运行 UpTrain 的监控任务？","UpTrain 支持在后台进程中运行监控。关于具体实现：\n1. 确认可以使用多进程模块实现。\n2. 需确保未使用的线程不会残留并占用计算资源。\n3. 
目前官方尚未提供具体的延迟基准测试数据，这部分仍在进行中。\n\n建议在实现时注意清理闲置线程，并根据实际需求选择进程或线程模型。","https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fissues\u002F11",{"id":154,"question_zh":155,"answer_zh":156,"source_url":157},41826,"UpTrain 是否支持计算机视觉（CV）用例，例如目标检测？","是的，UpTrain 支持计算机视觉用例。社区已经添加了关于目标检测（Object Detection）的示例，涵盖约 80 个类别的模型。你可以参考仓库中的 examples 目录或相关 PR 来获取具体的 CV 用例代码和配置方法。","https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fissues\u002F38",{"id":159,"question_zh":160,"answer_zh":161,"source_url":162},41827,"如何在配置中将日志相关的参数（如 st_logging, log_data）归类管理？","为了优化配置结构，应将日志相关的参数（例如 `st_logging` 和 `log_data`）从主配置类移动到专门的 `LoggingArgs` 类中。这通常在 `config_handler.py` 文件中进行调整。这是一个适合新手的贡献点，旨在提高代码的可维护性。","https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fissues\u002F115",[164,169,174,179,184,189,194,199,204,209,214,219,224,229,234,239,244,248,252,256],{"id":165,"version":166,"summary_zh":167,"released_at":168},333863,"v0.7.1","## 变更内容\n* 文档：修复 `context-conciseness.mdx` 中的错别字，由 @zhaozhiming 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F693 中完成。\n* 更新 `zeno.ipynb` 的位置，由 @emmanuel-ferdman 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F696 中完成。\n* 仪表板：查找共性主题，由 @Dominastorm 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F695 中完成。\n* 新的评估项，由 @ashish-1600 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F699 中完成。\n* 将名称转换为 URL 映射中的 ID，由 @sanchitadev 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F700 中完成。\n* 为 `base.py` 添加文档字符串和注释，由 @Dominastorm 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F701 中完成。\n* 为 `ResponseMatching` 中的 `'exact'` 和 `'rouge'` 方法添加本地评估支持，由 @Dominastorm 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F702 中完成。\n* 允许同时使用多种响应匹配方法，由 @Dominastorm 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F703 中完成。\n* 仪表板重复评估，由 @ashish-1600 在 
https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F704 中完成。\n* 在 `Evalllamaindex` 中支持 `evaluation_name` 和 `project_name`，由 @Dominastorm 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F705 中完成。\n* v0.7.1 版本，由 @Dominastorm 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F706 中发布。\n\n## 新贡献者\n* @zhaozhiming 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F693 中完成了首次贡献。\n* @emmanuel-ferdman 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F696 中完成了首次贡献。\n\n**完整变更日志**：https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fcompare\u002Fv0.7.0...v0.7.1","2024-05-14T09:18:58",{"id":170,"version":171,"summary_zh":172,"released_at":173},333864,"v0.7.0","## 变更内容\n* 将默认评估模型更改为 gpt-3.5-turbo，由 @Dominastorm 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F685 中完成\n* 修复 llamaindex 中的链接问题，由 @shrjain1312 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F687 中完成\n* 前端变更，由 @sanchitadev 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F689 中完成\n* 更新后端新 API，由 @ashish-1600 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F688 中完成\n\n\n**完整变更日志**: https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fcompare\u002Fv0.6.13...v0.7.0","2024-04-19T12:33:29",{"id":175,"version":176,"summary_zh":177,"released_at":178},333865,"v0.6.13","## 变更内容\n* 由 @Dominastorm 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F676 中添加多查询准确性文档，并改进子查询完整性文档。\n* 由 @Dominastorm 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F677 中修复指南遵循性问题。\n* 由 @Dominastorm 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F678 中更新指南遵循性测试。\n* 由 @ashish-1600 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F679 中修改 API 客户端。\n* 由 @ashish-1600 在 
https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F680 中更新前端代码。\n* 由 @Dominastorm 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F584 中提供嵌入算子教程。\n* 由 @ashish-1600 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F681 中添加 question_completeness 评估。\n* 由 @shrjain1312 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F675 中进行助手评估。\n* 由 @Dominastorm 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F683 中改进语言评论算子。\n* 由 @Dominastorm 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F684 中更新响应匹配算子。\n\n\n**完整变更日志**: https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fcompare\u002Fv0.6.12...v0.6.13","2024-04-12T17:36:03",{"id":180,"version":181,"summary_zh":182,"released_at":183},333866,"v0.6.12","## 变更内容\n* @shrjain1312 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F668 中添加了 Milvus 集成指南\n* @shrjain1312 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F669 中更新了文档中的仪表板位置说明\n* @ashish-1600 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F670 中新增了查询改写算子\n* @Dominastorm 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F671 中创建了 MultiQueryAccuracy 算子\n* @ashish-1600 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F672 中为查询改写添加了开源提示\n* @Dominastorm 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F673 中增加了对 vllm 的支持\n* @Dominastorm 发布了 v0.6.12 版本，详情见 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F674\n\n\n**完整变更日志**: https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fcompare\u002Fv0.6.10.post1...v0.6.12","2024-04-04T09:34:59",{"id":185,"version":186,"summary_zh":187,"released_at":188},333867,"v0.6.10.post1","## 变更内容\n* @Dominastorm 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F662 中向文档添加了自定义评估功能。\n* 
@Dominastorm 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F663 中移除了自定义评估中的 TransformOp。\n* @shrjain1312 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F653 中改进了 Promptfoo 视图函数的处理。\n* @Dominastorm 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F664 中向 README 添加了 LlamaIndex 回调处理器文档和每日监控的相关链接。\n* @Dominastorm 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F666 中为 CustomPromptEval 添加了本地评估支持。\n* @Dominastorm 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F665 中移除了自定义评估对 OpenAI API 密钥的要求。\n\n\n**完整变更日志**: https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fcompare\u002Fv0.6.9...v0.6.10.post1","2024-03-22T07:20:14",{"id":190,"version":191,"summary_zh":192,"released_at":193},333868,"v0.6.9","## 变更内容\n* 将 `json.loads` 更改为 `parse_json`，由 @Dominastorm 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F647 中完成\n* 使响应输出的一致性保持一致，由 @Dominastorm 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F648 中完成\n* 更改演示视频，由 @Dominastorm 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F649 中完成\n* 更新 `pyproject.toml` 文件，由 @Dominastorm 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F650 中完成\n* README 变更：按钮样式，由 @shrjain1312 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F651 中完成\n* 自定义 Python 函数，由 @sourabhagr 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F655 中完成\n* 在 EvalLLM 中添加对 `ColumnOp` 的支持，由 @Dominastorm 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F657 中完成\n* 添加 OpenAI API 密钥验证功能，由 @ashish-1600 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F659 中完成\n* 添加 Pandas 数据处理功能，由 @Dominastorm 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F652 中完成\n* 创建自定义评估教程，由 @Dominastorm 在 
https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F658 中完成\n* 在 README 中添加徽章，由 @Dominastorm 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F660 中完成\n* 修复：事实准确性与响应完整性测试，并将 `json5` 添加到依赖项中，由 @Dominastorm 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F661 中完成\n\n\n**完整变更日志**: https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fcompare\u002Fv0.6.8...v0.6.9","2024-03-20T18:22:34",{"id":195,"version":196,"summary_zh":197,"released_at":198},333869,"v0.6.8","## 变更内容\n* @shrjain1312 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F608 中改进了 RCA 教程\n* @shrjain1312 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F623 中增加了对 Ollama 的集成支持，用于评估\n* @Dominastorm 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F622 中修复了文档\n* @shrjain1312 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F618 中添加了关于 Gemma 与 GPT 的实验比较\n* @shrjain1312 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F626 中实现了 Ollama 集成\n* @shrjain1312 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F625 中编写了 Helicone 的使用指南\n* @sanchitadev 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F627 中更新了前端代码\n* @Abhiramkns 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F583 中为响应一致性评估新增了 CLASSIFY 评估类型的支持\n* @Dominastorm 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F629 中更新了依赖项\n* @shrjain1312 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F628 中进一步优化了 RCA 教程\n* @Dominastorm 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F630 中移除了旧的文档页面并修复了损坏的链接\n* @Dominastorm 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F634 中重新整理并更新了集成相关内容\n* @shrjain1312 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F636 中新增了 
Promptfoo 支持\n* @shrjain1312 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F637 中改进了 README 并更新了 RCA 教程\n* @Dominastorm 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F638 中更新了 README.md\n* @Dominastorm 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F639 中重新排列了 README 中的关键特性\n* @shrjain1312 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F640 中更新了 Promptfoo 集成\n* @msalhab96 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F642 中更新了 quickstart.mdx 文件\n* @sanchitadev 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F643 中进行了前端改动\n* @Dominastorm 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F645 中改进了 LLM 输出解析及事实准确性提示\n* @Dominastorm 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F646 中发布了 v0.6.8 版本\n\n## 新贡献者\n* @msalhab96 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F642 中做出了首次贡献\n\n**完整变更日志**: https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fcompare\u002Fv0.6.7.post1...v0.6.8","2024-03-18T10:17:19",{"id":200,"version":201,"summary_zh":202,"released_at":203},333870,"v0.6.7.post1","## 变更内容\n* @shrjain1312 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F596 中添加了使用 Mistral 构建 RAG 的教程\n* @shrjain1312 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F598 中更新了 RCA 相关文档\n* @Dominastorm 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F590 中对比了 Claude-3 与 GPT-4\n* @Dominastorm 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F606 中更新了 CONTRIBUTING.md 文件\n* @Dominastorm 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F609 中更新了 pyproject.toml 文件\n* @ashish-1600 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F588 中实现了 Langfuse 集成\n* @Dominastorm 在 
https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F613 中修复了代码幻觉和响应匹配问题\n* @shrjain1312 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F610 中改进了实验教程，并优化了响应匹配功能（包含上下文）\n* @Dominastorm 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F614 中修复了 Pydantic，替换了已弃用的函数\n* @ashish-1600 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F616 中更新了 pyproject.toml 文件\n* @shrjain1312 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F617 中实现了实验中对数据重复项的处理\n* @shrjain1312 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F587 中实现了 PromptFoo 集成\n* @shrjain1312 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F593 中修复了 README 文件\n* @Dominastorm 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F612 中移除了 v0 版本\n* @Dominastorm 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F611 中清理了文档\n* @Dominastorm 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F619 中删除了 .github\u002Fworkflows\u002Fdeploy-api-docs.yml 文件\n\n\n**完整变更日志**: https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fcompare\u002Fv0.6.6.post3...v0.6.7.post1","2024-03-11T21:17:52",{"id":205,"version":206,"summary_zh":207,"released_at":208},333871,"v0.6.6.post3","## 变更内容\n* 文档：@clemra 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F595 中优化了 LangFuse 的介绍\n* 默认在 Compose 中禁用 Next.js 的遥测功能，由 @TensorTemplar 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F597 中完成\n* 修复 fsspec 导入错误，由 @Dominastorm 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F603 中完成\n\n## 新贡献者\n* @clemra 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F595 中完成了首次贡献\n* @TensorTemplar 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F597 
中完成了首次贡献\n\n**完整变更日志**：https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fcompare\u002Fv0.6.6.post2...v0.6.6.post3","2024-03-07T05:38:19",{"id":210,"version":211,"summary_zh":212,"released_at":213},333872,"v0.6.6.post2","## 变更内容\n* 由 @Dominastorm 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F591 中修复 Pydantic 相关问题\n* 由 @Dominastorm 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F592 中更新 UpTrain 版本\n* 由 @Dominastorm 在 https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F594 中为 Table Operator 添加类型注解\n\n\n**完整变更日志**: https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fcompare\u002Fv0.6.6.post1...v0.6.6.post2","2024-03-06T13:32:42",{"id":215,"version":216,"summary_zh":217,"released_at":218},333873,"v0.6.6.post1","## What's Changed\r\n* Docs SEO Improvements by @Dominastorm in https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F586\r\n* Add dashboard to docs by @shrjain1312 in https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F585\r\n* Bump Pydantic to V2 by @Dominastorm in https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F582\r\n\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fcompare\u002Fv0.6.6...v0.6.6.post1","2024-03-05T11:54:56",{"id":220,"version":221,"summary_zh":222,"released_at":223},333874,"v0.6.6","## What's Changed\r\n* Zeno Integration by @shrjain1312 in https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F576\r\n* Add Valid Question Operator by @ashish-1600 in https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F577\r\n* Supported LLMs doc update by @shrjain1312 in https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F578\r\n* Rca open source by @ashish-1600 in https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F574\r\n* Fix variable name by @anas-rabhi in 
https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F580\r\n* Update llama_index.py by @devanshi00 in https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F564\r\n* Add support for llama-index v0.10+ by @Dominastorm in https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F581\r\n\r\n## New Contributors\r\n* @anas-rabhi made their first contribution in https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F580\r\n* @devanshi00 made their first contribution in https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F564\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fcompare\u002Fv0.6.5.post2...v0.6.6","2024-03-04T10:45:55",{"id":225,"version":226,"summary_zh":227,"released_at":228},333875,"v0.6.5.post2","## What's Changed\r\n* Fix checkset run for managed user by @ashish-1600 in https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F572\r\n* Fix Custom Prompt Eval by @ashish-1600 in https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F573\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fcompare\u002Fv0.6.5.post1...v0.6.5.post2","2024-03-01T11:03:08",{"id":230,"version":231,"summary_zh":232,"released_at":233},333876,"v0.6.5.post1","## What's Changed\r\n* Fix response matching operator by @sourabhagr in https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fpull\u002F570\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fuptrain-ai\u002Fuptrain\u002Fcompare\u002Fv0.6.5...v0.6.5.post1","2024-03-01T09:29:56",{"id":235,"version":236,"summary_zh":237,"released_at":238},333877,"v0.6.5","Updated dependencies","2024-02-29T13:18:17",{"id":240,"version":241,"summary_zh":242,"released_at":243},333878,"v0.6.4","## All New User Interface\r\n* **Dashboard**\r\n  - The new No-Code dashboard allows you to run evaluations without writing any code\r\n  - You 
can visualize these results locally by running the dashboard on your system\r\n\r\n## New Operators\r\n\r\nIntroducing three new operators that fit right into your RAG pipelines. \r\n\r\n### Query Enhancement Operators\r\n* **Sub Query Completeness**\r\n  - See how well the sub-queries cover the main question\r\n\r\n### Contextual Operators\r\n* **Context Reranking**\r\n  - Check whether the reranking improved the order of your documents\r\n* **Context Conciseness**\r\n  - Check if a part of your context is as good as the original context to answer your query\r\n  \r\n## Integrations\r\n\r\n### API Integrations\r\n* **Mistral API Integration**\r\n  - Use the up-and-coming Mistral LLMs to perform your evaluations.\r\n\r\n* **Langfuse Integration**\r\n  - View traces, analyse use cases and user segments while evaluating your LLMs","2024-02-28T15:32:56",{"id":245,"version":246,"summary_zh":76,"released_at":247},333879,"v0.6.3","2024-02-22T05:20:46",{"id":249,"version":250,"summary_zh":76,"released_at":251},333880,"v0.6.2","2024-02-21T15:51:56",{"id":253,"version":254,"summary_zh":237,"released_at":255},333881,"v0.6.1","2024-02-21T15:44:15",{"id":257,"version":258,"summary_zh":259,"released_at":260},333882,"v0.6.0","#### We are thrilled to announce a significant array of enhancements aimed at improving user experience, ease of use, and overall functionality in UpTrain v0.6! \r\n\r\n### New Features:\r\n\r\n1. **Local Evaluation Capability ✨**\r\n   - Users can now run evaluations locally on their systems, providing more flexibility and control over the evaluation process.\r\n\r\n2. **Custom Prompt Evaluation 🎛️**\r\n   - Introducing the ability to create custom evaluations tailored to specific user needs, empowering users with more control over the evaluation process.\r\n\r\n3. 
**Scenario Description Parameter for Operators 📝**\r\n   - Operators can now pass additional context to the LLM through the scenario description parameter, improving the quality of evaluations.\r\n\r\n4. **Modular Prompt Templates 🧩**\r\n   - Release of modular prompt templates with customizable instructions, few-shot examples, scenario descriptions, and output formats, providing users with versatile tools for prompt creation.\r\n\r\n5. **New Integrations 🚀**\r\n   - **Vector DBs Integration 🔍**\r\n     - Integration with vector databases such as Qdrant, ChromaDB, and FAISS for RAG operations, query responses, and evaluation using UpTrain.\r\n   - **Framework Integration 🛠️**\r\n     - Integration with the LlamaIndex framework for streamlined operations.\r\n   - **LLM Providers Integration 💡**\r\n     - Integration with LLMs like Mistral and Llama from platforms such as Anyscale and Together AI for evaluation purposes.\r\n   - **LLM Embeddings Integration 🧠**\r\n     - Integration with Jina for generating embeddings to enhance RAG operations and evaluations.\r\n\r\n6. **Research Integration 📚**\r\n   - UpTrain now incorporates the state-of-the-art Spade framework for auto-generating assertions to identify poor LLM outputs, facilitating seamless evaluation on user datasets.\r\n\r\n7. **Root Cause Analysis 🕵️**\r\n   - UpTrain facilitates root cause analysis for failure issues in RAG pipelines, aiding in the identification and resolution of problems.\r\n\r\n8. **Vector Search Integration 🔍**\r\n   - Enhanced vector search capability allows for comparing different embedding models, enabling users to derive more relevant context from vector databases.\r\n\r\n#### New Evaluations:\r\n\r\n1. **Jailbreak Detection 🚨**\r\n   - Identify attempts to perform illegal activities or misuse of the LLM. Users can specify a model purpose to ensure adherence to intended usage.\r\n\r\n2. 
**Code Hallucination 💻**\r\n   - Determine the grounding of code generated by the LLM based on provided documents\u002Fcontext, ensuring coherence and relevance.","2024-02-20T20:13:20"]