[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-uber--manifold":3,"tool-uber--manifold":61},[4,18,26,36,44,53],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":17},4358,"openclaw","openclaw\u002Fopenclaw","OpenClaw 是一款专为个人打造的本地化 AI 助手，旨在让你在自己的设备上拥有完全可控的智能伙伴。它打破了传统 AI 助手局限于特定网页或应用的束缚，能够直接接入你日常使用的各类通讯渠道，包括微信、WhatsApp、Telegram、Discord、iMessage 等数十种平台。无论你在哪个聊天软件中发送消息，OpenClaw 都能即时响应，甚至支持在 macOS、iOS 和 Android 设备上进行语音交互，并提供实时的画布渲染功能供你操控。\n\n这款工具主要解决了用户对数据隐私、响应速度以及“始终在线”体验的需求。通过将 AI 部署在本地，用户无需依赖云端服务即可享受快速、私密的智能辅助，真正实现了“你的数据，你做主”。其独特的技术亮点在于强大的网关架构，将控制平面与核心助手分离，确保跨平台通信的流畅性与扩展性。\n\nOpenClaw 非常适合希望构建个性化工作流的技术爱好者、开发者，以及注重隐私保护且不愿被单一生态绑定的普通用户。只要具备基础的终端操作能力（支持 macOS、Linux 及 Windows WSL2），即可通过简单的命令行引导完成部署。如果你渴望拥有一个懂你",349277,3,"2026-04-06T06:32:30",[13,14,15,16],"Agent","开发框架","图像","数据工具","ready",{"id":19,"name":20,"github_repo":21,"description_zh":22,"stars":23,"difficulty_score":10,"last_commit_at":24,"category_tags":25,"status":17},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,"2026-04-05T11:01:52",[14,15,13],{"id":27,"name":28,"github_repo":29,"description_zh":30,"stars":31,"difficulty_score":32,"last_commit_at":33,"category_tags":34,"status":17},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 
代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",154349,2,"2026-04-13T23:32:16",[14,13,35],"语言模型",{"id":37,"name":38,"github_repo":39,"description_zh":40,"stars":41,"difficulty_score":32,"last_commit_at":42,"category_tags":43,"status":17},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",108322,"2026-04-10T11:39:34",[14,15,13],{"id":45,"name":46,"github_repo":47,"description_zh":48,"stars":49,"difficulty_score":32,"last_commit_at":50,"category_tags":51,"status":17},6121,"gemini-cli","google-gemini\u002Fgemini-cli","gemini-cli 是一款由谷歌推出的开源 AI 命令行工具，它将强大的 Gemini 大模型能力直接集成到用户的终端环境中。对于习惯在命令行工作的开发者而言，它提供了一条从输入提示词到获取模型响应的最短路径，无需切换窗口即可享受智能辅助。\n\n这款工具主要解决了开发过程中频繁上下文切换的痛点，让用户能在熟悉的终端界面内直接完成代码理解、生成、调试以及自动化运维任务。无论是查询大型代码库、根据草图生成应用，还是执行复杂的 Git 操作，gemini-cli 都能通过自然语言指令高效处理。\n\n它特别适合广大软件工程师、DevOps 人员及技术研究人员使用。其核心亮点包括支持高达 100 万 token 的超长上下文窗口，具备出色的逻辑推理能力；内置 Google 搜索、文件操作及 Shell 命令执行等实用工具；更独特的是，它支持 MCP（模型上下文协议），允许用户灵活扩展自定义集成，连接如图像生成等外部能力。此外，个人谷歌账号即可享受免费的额度支持，且项目基于 Apache 2.0 
协议完全开源，是提升终端工作效率的理想助手。",100752,"2026-04-10T01:20:03",[52,13,15,14],"插件",{"id":54,"name":55,"github_repo":56,"description_zh":57,"stars":58,"difficulty_score":32,"last_commit_at":59,"category_tags":60,"status":17},4721,"markitdown","microsoft\u002Fmarkitdown","MarkItDown 是一款由微软 AutoGen 团队打造的轻量级 Python 工具，专为将各类文件高效转换为 Markdown 格式而设计。它支持 PDF、Word、Excel、PPT、图片（含 OCR）、音频（含语音转录）、HTML 乃至 YouTube 链接等多种格式的解析，能够精准提取文档中的标题、列表、表格和链接等关键结构信息。\n\n在人工智能应用日益普及的今天，大语言模型（LLM）虽擅长处理文本，却难以直接读取复杂的二进制办公文档。MarkItDown 恰好解决了这一痛点，它将非结构化或半结构化的文件转化为模型“原生理解”且 Token 效率极高的 Markdown 格式，成为连接本地文件与 AI 分析 pipeline 的理想桥梁。此外，它还提供了 MCP（模型上下文协议）服务器，可无缝集成到 Claude Desktop 等 LLM 应用中。\n\n这款工具特别适合开发者、数据科学家及 AI 研究人员使用，尤其是那些需要构建文档检索增强生成（RAG）系统、进行批量文本分析或希望让 AI 助手直接“阅读”本地文件的用户。虽然生成的内容也具备一定可读性，但其核心优势在于为机器",93400,"2026-04-06T19:52:38",[52,14],{"id":62,"github_repo":63,"name":64,"description_en":65,"description_zh":66,"ai_summary_zh":66,"readme_en":67,"readme_zh":68,"quickstart_zh":69,"use_case_zh":70,"hero_image_url":71,"owner_login":72,"owner_name":73,"owner_avatar_url":74,"owner_bio":75,"owner_company":76,"owner_location":76,"owner_email":76,"owner_twitter":76,"owner_website":77,"owner_url":78,"languages":79,"stars":106,"forks":107,"last_commit_at":108,"license":109,"difficulty_score":110,"env_os":111,"env_gpu":111,"env_ram":111,"env_deps":112,"category_tags":117,"github_topics":118,"view_count":32,"oss_zip_url":76,"oss_zip_packed_at":76,"status":17,"created_at":122,"updated_at":123,"faqs":124,"releases":154},7318,"uber\u002Fmanifold","manifold","A model-agnostic visual debugging tool for machine learning","Manifold 是一款由 Uber 开源的机器学习可视化调试工具，旨在帮助开发者透过黑箱，直观地理解模型表现。在机器学习中，仅依赖 AUC、RMSE 等整体统计指标往往难以定位模型出错的具体原因。Manifold 解决了这一痛点，它不局限于单一模型或特定算法，而是支持对多个模型进行横向对比，帮助用户快速识别出模型在哪些数据子集上预测不准，并进一步分析导致性能差异的特征分布根源。\n\n这款工具特别适合机器学习工程师、数据科学家及算法研究人员使用。无论是需要优化现有模型的性能，还是希望在多个候选模型中做出最佳选择，Manifold 都能提供有力的视觉支持。其核心技术亮点在于“模型无关”的设计架构，这意味着无论底层使用的是何种算法，用户均可通过统一的界面进行分析。此外，Manifold 
提供了性能对比视图和特征归因视图，不仅能宏观展示不同数据片段上的表现差异，还能微观揭示影响预测结果的关键特征分布变化。通过将抽象的数据关系转化为直观的图表，Manifold 让复杂的模型诊断过程变得更加清晰高效，是提升机器学习可解释性与调试效率的得力助手。","[![Gitpod Ready-to-Code](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FGitpod-Ready--to--Code-blue?logo=gitpod)](https:\u002F\u002Fgitpod.io\u002F#https:\u002F\u002Fgithub.com\u002Fuber\u002Fmanifold) \n[![Build Status](https:\u002F\u002Ftravis-ci.com\u002Fuber\u002Fmanifold.svg?token=SZsMuk4iZZDLKwRXzyxu&branch=master)](https:\u002F\u002Ftravis-ci.com\u002Fuber\u002Fmanifold)\n[![CII Best Practices](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuber_manifold_readme_50ccde67b228.png)](https:\u002F\u002Fbestpractices.coreinfrastructure.org\u002Fprojects\u002F3062)\n\n# Manifold\n\n_This project is stable and being incubated for long-term support._\n\n[\u003Cimg alt=\"Manifold\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuber_manifold_readme_0a4f46f223ce.jpg\" width=\"600\">](https:\u002F\u002Fuber.github.io\u002Fmanifold\u002F)\n\nManifold is a model-agnostic visual debugging tool for machine learning.\n\nUnderstanding ML model performance and behavior is a non-trivial process, given the intrinsic opacity of ML algorithms. Performance summary statistics such as AUC, RMSE, and others are not instructive enough for identifying what went wrong with a model or how to improve it.\n\nAs a visual analytics tool, Manifold allows ML practitioners to look beyond overall summary metrics to detect which subset of data a model is inaccurately predicting. 
Manifold also explains the potential cause of poor model performance by surfacing the feature distribution difference between better and worse-performing subsets of data.\n\n## Table of contents\n\n- [Prepare your data](#prepare-your-data)\n- [Interpret visualizations](#interpret-visualizations)\n- [Using the demo app](#using-the-demo-app)\n- [Using the component](#using-the-component)\n- [Contributing](#contributing)\n- [Versioning](#versioning)\n- [License](#license)\n\n## Prepare Your Data\n\nThere are 2 ways to input data into Manifold:\n\n- [csv upload](#upload-csv-to-demo-app) if you use the Manifold demo app, or\n- [convert data programmatically](#load-and-convert-data) if you use the Manifold component in your own app.\n\nIn either case, data that's directly input into Manifold should follow this format:\n\n```js\nconst data = {\n  x:     [...],         \u002F\u002F feature data\n  yPred: [[...], ...],  \u002F\u002F prediction data\n  yTrue: [...],         \u002F\u002F ground truth data\n};\n```\n\nEach element in these arrays represents one data point in your evaluation dataset, and the order of data instances in `x`, `yPred` and `yTrue` should all match.\nThe recommended instance count for each of these datasets is 10000 - 15000. If you have a larger dataset that you want to analyze, a random subset of your data generally suffices to reveal the important patterns in it.\n\n##### `x`: {Object[]}\n\nA list of instances with features. Example (2 data instances):\n\n```js\n[{feature_0: 21, feature_1: 'B'}, {feature_0: 36, feature_1: 'A'}];\n```\n\n##### `yPred`: {Object[][]}\n\nA list of lists, where each child list is a prediction array from one model for each data instance. 
Example (3 models, 2 data instances, 2 classes `['false', 'true']`):\n\n```js\n[\n  [{false: 0.1, true: 0.9}, {false: 0.8, true: 0.2}],\n  [{false: 0.3, true: 0.7}, {false: 0.9, true: 0.1}],\n  [{false: 0.6, true: 0.4}, {false: 0.4, true: 0.6}],\n];\n```\n\n##### `yTrue`: {Number[] | String[]}\n\nA list, ground truth for each data instance. Values must be numbers for regression models, must be strings that match object keys in `yPred` for classification models. Example (2 data instances, 2 classes ['false', 'true']):\n\n```js\n['true', 'false'];\n```\n\n## Interpret visualizations\n\nThis guide explains how to interpret Manifold visualizations.\n\nManifold consists of:\n\n- [Performance Comparison View](#performance-comparison-view) which compares\n  prediction performance across models, across data subsets\n- [Feature Attribution View](#feature-attribution-view) which visualizes feature\n  distributions of data subsets with various performance levels\n\n### Performance Comparison View\n\nThis visualization is an overview of performance of your model(s) across\ndifferent segments of your data. It helps you identify under-performing data subsets for further inspection.\n\n#### Reading the chart\n\n\u003Cimg alt=\"performance comparison view\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuber_manifold_readme_1fadbf9fcffb.png\" width=\"600\">\n\n1. **X axis:** performance metric. Could be log-loss, squared-error, or raw prediction.\n2. **Segments:** your dataset is automatically divided into segments based on performance similarity between instances, across models.\n3. **Colors:** represent different models.\n\n\u003Cimg alt=\"performance comparison view unit\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuber_manifold_readme_69aea935d681.png\" width=\"600\">\n\n1. **Curve:** performance distribution (of one model, for one segment).\n2. **Y axis:** data count\u002Fdensity.\n3. 
**Cross:** the left end, center line, and right end are the 25th, 50th and 75th percentile of the distribution.\n\n#### Explanation\n\nManifold uses a clustering algorithm (k-Means) to break prediction data into N segments\nbased on performance similarity.\n\nThe input of the k-Means is per-instance performance scores. By default, that is the log-loss value for classification models and the squared-error value for regression models. Models with a lower log-loss\u002Fsquared-error perform better than models with a higher log-loss\u002Fsquared-error.\n\nIf you're analyzing multiple models, all model performance metrics will be included in the input.\n\n#### Usage\n\n- Look for segments of data where the error is higher (plotted to the right). These are areas you should analyze and try to improve.\n\n- If you're comparing models, look for segments where the log-loss is different for each model. If two models perform differently on the same set of data, consider using the better-performing model for that part of the data to boost performance.\n\n- After you notice any performance patterns\u002Fissues in the segments, slice the data to compare feature distribution for the data subset(s) of interest. You can create two segment groups to compare (colored pink and blue), and each group can have 1 or more segments.\n\n**Example**\n\n\u003Cimg alt=\"performance comparison view example\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuber_manifold_readme_88eea925e4eb.png\" width=\"600\">\n\nData in Segment 0 has a lower log-loss prediction error compared to Segments 1 and 2, since curves in Segment 0 are closer to the left side.\n\nIn Segments 1 and 2, the XGBoost model performs better than the DeepLearning model, but DeepLearning outperforms XGBoost in Segment 0.\n\n\u003Cbr\u002F>\n\n### Feature Attribution View\n\nThis visualization shows feature values of your data, aggregated by user-defined segments. 
It helps you identify any input feature distribution that might correlate with inaccurate prediction output.\n\n#### Reading the chart\n\n\u003Cimg alt=\"feature attribution view\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuber_manifold_readme_65f68bb86e4e.png\" width=\"600\">\n\n1. **Histogram \u002F heatmap:** distribution of data from each data slice, shown in the corresponding color.\n2. **Segment groups:** indicates data slices you choose to compare against each other.\n3. **Ranking:** features are ranked by distribution difference between slices.\n\n\u003Cimg alt=\"feature attribution view unit\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuber_manifold_readme_76a1f92faa07.png\" width=\"600\">\n\n1. **X axis:** feature value.\n2. **Y axis:** data count\u002Fdensity.\n3. **Divergence score:** measure of difference in distributions between slices.\n\n#### Explanation\n\nAfter you slice the data to create segment groups, feature distribution histograms\u002Fheatmaps from the two segment groups are shown in this view.\n\nDepending on the feature type, features can be shown as heatmaps on a map for geo features, distribution curve for numerical features, or distribution bar chart for categorical features. (In bar charts, categories on the x-axis are sorted by instance count difference. Look for differences between the two distributions in each feature.)\n\nFeatures are ranked by their KL-Divergence - a measure of _difference_ between the two contrasting distributions. The higher the divergence is, the more likely this feature is correlated with the factor that differentiates the two Segment Groups.\n\n#### Usage\n\n- Look for the differences between the two distributions (pink and blue) in each feature. 
They represent the difference in data from the two segment groups you selected in the Performance Comparison View.\n\n**Example**\n\n\u003Cimg alt=\"feature attribution view example\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuber_manifold_readme_24ee40e61d3c.png\" width=\"600\">\n\nData in Groups 0 and 1 have obvious differences in Features 0, 1, 2 and 3; but they are not so different in Features 4 and 5.\n\nIf Data Groups 0 and 1 correspond to data instances with low and high prediction error respectively, this means that data with higher errors tends to have _lower_ feature values in Features 0 and 1, since the peak of the pink curve is to the left of the blue curve.\n\n\u003Cbr\u002F>\n\n### Geo Feature View\n\nIf there are geospatial features in your dataset, they will be displayed on a map. Lat-lng coordinates and [h3](https:\u002F\u002Fgithub.com\u002Fuber\u002Fh3-js) hexagon ids are currently supported geo feature types.\n\n#### Reading the chart\n\n\u003Cimg alt=\"geo feature view lat-lng\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuber_manifold_readme_72e7aa403027.png\" width=\"600\">\n\n1. **Feature name:** when multiple geo features exist, you can choose which one to display on the map.\n2. **Color-by:** if a lat-lng feature is chosen, datapoints are colored by group ids.\n3. **Map:** Manifold defaults to displaying the location and density of these datapoints using a heatmap.\n\n\u003Cimg alt=\"geo feature view hex id\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuber_manifold_readme_40dd8071f349.png\" width=\"600\">\n\n1. **Feature name:** when choosing a hex-id feature to display, datapoints with the same hex-id are displayed in aggregate.\n2. **Color-by:** you can color the hexagons by: average model performance, percentage of segment group 0, or total count per hexagon.\n3. 
**Map:** all metrics that are used for coloring are also shown in tooltips, on the hexagon level.\n\n#### Usage\n\n- Look for the differences in geo location between the two segment groups (pink and grey). They represent the spatial distribution difference between the two subsets you previously selected.\n\n**Example**\n\nIn the first map above, Group 0 is more noticeably concentrated in the downtown San Francisco area.\n\n\u003C!-- images in this doc are created from https:\u002F\u002Fdocs.google.com\u002Fpresentation\u002Fd\u002F1EqvjMyBLNX7wfEQPFKAoaE39bW0pXbBa8WIznQN49vE\u002Fedit?usp=sharing -->\n\n## Using the Demo App\n\nTo do a one-off evaluation using static outputs of your ML models, use the demo app.\nOtherwise, if you have a system that programmatically generates ML model outputs, you might consider [using the Manifold component](#using-the-component) directly.\n\n### Running Demo App Locally\n\nRun the following commands to set up your environment and run the demo:\n\n```bash\n# install all dependencies in the root directory\nyarn\n# demo app is in examples\u002Fmanifold directory\ncd examples\u002Fmanifold\n# install dependencies for the demo app\nyarn\n# run the app\nyarn start\n```\n\nNow you should see the demo app running at `localhost:8080`.\n\n### Upload CSV to Demo App\n\n\u003Cimg alt=\"csv upload interface\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuber_manifold_readme_6d50deadc327.png\" width=\"500\">\n\nOnce the app starts running, you will see the interface above asking you to upload **\"feature\"**, **\"prediction\"** and **\"ground truth\"** datasets to Manifold.\nThey correspond to `x`, `yPred`, and `yTrue` in the \"[prepare your data](#prepare-your-data)\" section, and you should prepare your CSV files accordingly, illustrated below:\n\n|           Field            |   **`x`** (feature)    | **`yPred`** (prediction)  | **`yTrue`** (ground truth)  |\n| :------------------------: | :--------------------: | 
:-----------------------: | :-------------------------: |\n|      Number of CSVs        |           1            |         multiple          |              1              |\n| Illustration of CSV format | ![][feature csv image] | ![][prediction csv image] | ![][ground truth csv image] |\n\nNote, the index columns should be excluded from the input file(s).\nOnce the datasets are uploaded, you will see visualizations generated by these datasets.\n\n## Using the Component\n\nEmbedding the Manifold component in your app allows you to programmatically generate ML model data and visualize.\nOtherwise, if you have some static output from some models and want to do a one-off evaluation, you might consider [using the demo app](#using-the-demo-app) directly.\n\nHere are the basic steps to import Manifold into your app and load data for visualizing. You can also take a look at the examples folder.\n\n### Install Manifold\n\n```bash\n$ npm install @mlvis\u002Fmanifold styled-components styletron-engine-atomic styletron-react\n```\n\n### Load and Convert Data\n\nIn order to load your data files to Manifold, use the `loadLocalData` action. You could also reshape your data into the required Manifold format using `dataTransformer`.\n\n```js\nimport {loadLocalData} from '@mlvis\u002Fmanifold\u002Factions';\n\n\u002F\u002F create the following action and pass to dispatch\nloadLocalData({\n  fileList,\n  dataTransformer,\n});\n```\n\n##### `fileList`: {Object[]}\n\nOne or more datasets, in CSV format. Could be ones that your backend returns.\n\n##### `dataTransformer`: {Function}\n\nA function that transforms `fileList` into the [Manifold input data format](#prepare-your-data). Default:\n\n```js\nconst defaultDataTransformer = fileList => ({\n  x: [],\n  yPred: [],\n  yTrue: [],\n});\n```\n\n### Mount reducer\n\nManifold uses Redux to manage its internal state. 
You need to register `manifoldReducer` with the main reducer of your app:\n\n```js\nimport manifoldReducer from '@mlvis\u002Fmanifold\u002Freducers';\nimport {combineReducers, createStore} from 'redux';\n\nconst initialState = {};\nconst reducers = combineReducers({\n  \u002F\u002F mount manifold reducer in your app\n  manifold: manifoldReducer,\n\n  \u002F\u002F Your other reducers here\n  app: appReducer,\n});\n\n\u002F\u002F using createStore\nexport default createStore(reducers, initialState);\n```\n\n### Mount Component\n\nIf you mount `manifoldReducer` at a path other than `manifold` in the step above, you need to specify the path to it when you mount the component with the `getState` prop. `width` and `height` are both needed explicitly. If you have geospatial features and need to see them on a map, you also need a [mapbox token](https:\u002F\u002Fdocs.mapbox.com\u002Fhelp\u002Fhow-mapbox-works\u002Faccess-tokens\u002F).\n\n```js\nimport Manifold from '@mlvis\u002Fmanifold';\nconst manifoldGetState = state => state.pathTo.manifold;\nconst yourMapboxToken = ...;\n\nconst Main = props => (\n  \u003CManifold\n    getState={manifoldGetState}\n    width={width}\n    height={height}\n    mapboxToken={yourMapboxToken}\n  \u002F>\n);\n```\n\n### Styling\n\nManifold uses baseui, which uses Styletron as a styling engine. If you don't already use Styletron in other parts of your app, make sure to wrap Manifold with the [styletron provider](https:\u002F\u002Fbaseweb.design\u002Fgetting-started\u002Fsetup\u002F#adding-base-web-to-your-application).\n\nManifold uses the baseui [theming API](https:\u002F\u002Fbaseweb.design\u002Fguides\u002Ftheming\u002F). The default theme used by Manifold is exported as `THEME`. 
You can customize the styling by extending `THEME` and passing it as a `theme` prop of the `Manifold` component.\n\n```js\nimport Manifold, {THEME} from '@mlvis\u002Fmanifold';\nimport {Client as Styletron} from 'styletron-engine-atomic';\nimport {Provider as StyletronProvider} from 'styletron-react';\n\nconst engine = new Styletron();\nconst myTheme = {\n  ...THEME,\n  colors: {\n    ...THEME.colors,\n    primary: '#ff0000',\n  },\n}\n\nconst Main = props => (\n  \u003CStyletronProvider value={engine}>\n    \u003CManifold\n      getState={manifoldGetState}\n      theme={myTheme}\n    \u002F>\n  \u003C\u002FStyletronProvider>\n);\n```\n\n## Built With\n- [TensorFlow.js](https:\u002F\u002Fjs.tensorflow.org\u002F)\n- [React](https:\u002F\u002Freactjs.org\u002F)\n- [Redux](https:\u002F\u002Fredux.js.org\u002F)\n\n## Contributing\nPlease read our [code of conduct](CODE_OF_CONDUCT.md) before you contribute! You can find details for submitting pull requests in the [CONTRIBUTING.md](CONTRIBUTING.md) file. 
Refer to the issue [template](https:\u002F\u002Fhelp.github.com\u002Farticles\u002Fabout-issue-and-pull-request-templates\u002F).\n\n## Versioning\nWe document versions and changes in our changelog - see the [CHANGELOG.md](CHANGELOG.md) file for details.\n\n## License\nApache 2.0 License\n\n[feature csv image]: https:\u002F\u002Fd1a3f4spazzrp4.cloudfront.net\u002Fmanifold\u002Fdocs\u002Fx.png\n[prediction csv image]: https:\u002F\u002Fd1a3f4spazzrp4.cloudfront.net\u002Fmanifold\u002Fdocs\u002FyPred.png\n[ground truth csv image]: https:\u002F\u002Fd1a3f4spazzrp4.cloudfront.net\u002Fmanifold\u002Fdocs\u002FyTrue.png\n\n","[![Gitpod 准备就绪](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FGitpod-Ready--to--Code-blue?logo=gitpod)](https:\u002F\u002Fgitpod.io\u002F#https:\u002F\u002Fgithub.com\u002Fuber\u002Fmanifold) \n[![构建状态](https:\u002F\u002Ftravis-ci.com\u002Fuber\u002Fmanifold.svg?token=SZsMuk4iZZDLKwRXzyxu&branch=master)](https:\u002F\u002Ftravis-ci.com\u002Fuber\u002Fmanifold)\n[![CII 最佳实践](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuber_manifold_readme_50ccde67b228.png)](https:\u002F\u002Fbestpractices.coreinfrastructure.org\u002Fprojects\u002F3062)\n\n# Manifold\n\n_本项目现已稳定，并处于长期支持的孵化阶段。_\n\n[\u003Cimg alt=\"Manifold\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuber_manifold_readme_0a4f46f223ce.jpg\" width=\"600\">](https:\u002F\u002Fuber.github.io\u002Fmanifold\u002F)\n\nManifold 是一款与模型无关的机器学习可视化调试工具。\n\n鉴于机器学习算法本身的不透明性，理解模型性能和行为并非易事。诸如 AUC、RMSE 等性能汇总统计指标往往不足以帮助我们找出模型的问题所在或改进方向。\n\n作为一款可视化分析工具，Manifold 使机器学习从业者能够超越整体汇总指标，定位出模型预测不准确的数据子集。同时，它还能通过比较表现较好与较差数据子集之间的特征分布差异，揭示模型性能不佳的潜在原因。\n\n## 目录\n\n- [准备您的数据](#prepare-your-data)\n- [解读可视化结果](#interpret-visualizations)\n- [使用演示应用](#using-the-demo-app)\n- [使用组件](#using-the-component)\n- [贡献代码](#contributing)\n- [版本管理](#versioning)\n- [许可证](#license)\n\n## 准备您的数据\n\n有两种方式可以将数据输入到 Manifold 中：\n\n- 如果您使用的是 Manifold 演示应用，则可以通过 [上传 CSV 文件](#upload-csv-to-demo-app)；\n- 如果您希望在自己的应用中集成 Manifold 
组件，则可以采用 [程序化转换数据](#load-and-convert-data) 的方式。\n\n无论哪种方式，直接输入到 Manifold 的数据都应遵循以下格式：\n\n```js\nconst data = {\n  x:     [...],         \u002F\u002F 特征数据\n  yPred: [[...], ...],  \u002F\u002F 预测数据\n  yTrue: [...],         \u002F\u002F 真实标签数据\n};\n```\n\n这些数组中的每个元素代表评估数据集中的一个数据点，且 `x`、`yPred` 和 `yTrue` 中数据实例的顺序必须一致。建议每份数据集包含 10,000 至 15,000 个样本。如果您拥有更大的数据集，通常只需抽取其中的一个随机子集，即可揭示数据中的重要模式。\n\n##### `x`: {Object[]}\n\n包含特征的实例列表。示例（2 个数据实例）：\n\n```js\n[{feature_0: 21, feature_1: 'B'}, {feature_0: 36, feature_1: 'A'}];\n```\n\n##### `yPred`: {Object[][]}\n\n由多个列表组成的列表，其中每个子列表对应于某一模型对每个数据实例的预测结果。示例（3 个模型，2 个数据实例，2 个类别 `['false', 'true']`）：\n\n```js\n[\n  [{false: 0.1, true: 0.9}, {false: 0.8, true: 0.2}],\n  [{false: 0.3, true: 0.7}, {false: 0.9, true: 0.1}],\n  [{false: 0.6, true: 0.4}, {false: 0.4, true: 0.6}],\n];\n```\n\n##### `yTrue`: {Number[] | String[]}\n\n真实标签的列表，每个数据实例对应一个标签。对于回归模型，标签必须是数值；而对于分类模型，标签则必须是字符串，且需与 `yPred` 中的对象键相匹配。示例（2 个数据实例，2 个类别 ['false', 'true']）：\n\n```js\n['true', 'false'];\n```\n\n## 解读可视化结果\n\n本指南将介绍如何解读 Manifold 的可视化结果。\n\nManifold 包含以下两部分：\n\n- [性能对比视图](#performance-comparison-view)，用于比较不同模型以及不同数据子集之间的预测性能；\n- [特征归因视图](#feature-attribution-view)，用于可视化不同性能水平数据子集的特征分布。\n\n### 性能对比视图\n\n该可视化展示了您的模型在数据不同分段上的整体性能概况，有助于识别表现欠佳的数据子集以便进一步检查。\n\n#### 图表说明\n\n\u003Cimg alt=\"性能对比视图\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuber_manifold_readme_1fadbf9fcffb.png\" width=\"600\">\n\n1. **X 轴：** 性能指标。可以是对数损失、平方误差或原始预测值。\n2. **分段：** 数据集会根据各实例在不同模型下的性能相似性自动划分为若干分段。\n3. **颜色：** 代表不同的模型。\n\n\u003Cimg alt=\"性能对比视图单元\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuber_manifold_readme_69aea935d681.png\" width=\"600\">\n\n1. **曲线：** 某一模型在某一特定分段上的性能分布。\n2. **Y 轴：** 数据数量\u002F密度。\n3. 
**十字标记：** 左端、中心线和右端分别对应该分布的第 25、50 和 75 百分位数。\n\n#### 原理说明\n\nManifold 使用聚类算法（k-Means）基于性能相似性将预测数据划分为 N 个分段。\n\nk-Means 算法的输入是每个实例的性能得分。默认情况下，对于分类模型，该得分是对数损失值；对于回归模型，则为平方误差值。对数损失或平方误差越低，模型性能越好。\n\n如果您正在分析多个模型，所有模型的性能指标都将被纳入 k-Means 的输入中。\n\n#### 使用方法\n\n- 查找误差较高的数据分段（位于图表右侧）。这些区域需要重点分析并尝试改进。\n- 如果您在比较不同模型，请留意哪些分段中各模型的对数损失存在显著差异。如果两个模型在同一数据集上表现不同，可考虑针对该部分数据选用表现更优的模型以提升整体性能。\n- 在发现分段中的性能模式或问题后，您可以切分数据来比较感兴趣的数据子集的特征分布。您可以创建两组分段进行对比（分别用粉色和蓝色标注），每组可以包含一个或多个分段。\n\n**示例**\n\n\u003Cimg alt=\"性能对比视图示例\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuber_manifold_readme_88eea925e4eb.png\" width=\"600\">\n\n与第 1 和第 2 分段相比，第 0 分段的对数损失预测误差更低，因为其曲线更靠近左侧。\n\n在第 1 和第 2 分段中，XGBoost 模型的表现优于深度学习模型；而在第 0 分段中，深度学习模型却优于 XGBoost。\n\n\u003Cbr\u002F>\n\n### 特征归因视图\n\n此可视化展示了您数据的特征值，并按用户定义的分段进行聚合。它有助于您识别任何可能与不准确预测输出相关的输入特征分布。\n\n#### 阅读图表\n\n\u003Cimg alt=\"feature attribution view\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuber_manifold_readme_65f68bb86e4e.png\" width=\"600\">\n\n1. **直方图\u002F热力图：** 每个数据切片的数据分布，以对应的颜色显示。\n2. **分段组：** 表示您选择相互比较的数据切片。\n3. **排名：** 特征按照各切片之间的分布差异进行排序。\n\n\u003Cimg alt=\"feature attribution view unit\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuber_manifold_readme_76a1f92faa07.png\" width=\"600\">\n\n1. **X轴：** 特征值。\n2. **Y轴：** 数据计数\u002F密度。\n3. 
**差异分数：** 衡量各切片之间分布差异的指标。\n\n#### 解释\n\n在您对数据进行切分以创建分段组后，此视图将显示两个分段组的特征分布直方图\u002F热力图。\n\n根据特征类型，地理特征可以以地图上的热力图形式展示，数值特征以分布曲线形式展示，而分类特征则以分布条形图形式展示。（在条形图中，x轴上的类别按实例数量差异排序。请在每个特征中寻找两种分布之间的差异。）\n\n特征按照其KL散度进行排序——这是一种衡量两种对比分布之间_差异_的指标。差异越大，该特征就越有可能与区分两个分段组的因素相关联。\n\n#### 使用方法\n\n- 查看每个特征中两种分布（粉色和蓝色）之间的差异。它们代表您在“性能比较视图”中所选两个分段组之间的数据差异。\n\n**示例**\n\n\u003Cimg alt=\"feature attribution view example\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuber_manifold_readme_24ee40e61d3c.png\" width=\"600\">\n\n第0组和第1组的数据在特征0、1、2和3上存在明显差异；但在特征4和5上则不太相同。\n\n假设第0组和第1组分别对应于低误差和高误差的数据实例，则这意味着误差较高的数据在特征0和1上的特征值往往_较低_，因为粉色曲线的峰值位于蓝色曲线的左侧。\n\n\u003Cbr\u002F>\n\n### 地理特征视图\n\n如果您的数据集中包含地理空间特征，它们将被显示在地图上。目前支持的地理特征类型包括经纬度坐标和[h3](https:\u002F\u002Fgithub.com\u002Fuber\u002Fh3-js)六边形ID。\n\n#### 阅读图表\n\n\u003Cimg alt=\"geo feature view lat-lng\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuber_manifold_readme_72e7aa403027.png\" width=\"600\">\n\n1. **特征名称：** 当存在多个地理特征时，您可以选择在地图上显示哪一个。\n2. **颜色依据：** 如果选择了经纬度特征，则数据点会根据分组ID进行着色。\n3. **地图：** Manifold默认使用热力图来显示这些数据点的位置和密度。\n\n\u003Cimg alt=\"geo feature view hex id\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuber_manifold_readme_40dd8071f349.png\" width=\"600\">\n\n1. **特征名称：** 选择显示六边形ID特征时，具有相同六边形ID的数据点会以聚合形式显示。\n2. **颜色依据：** 您可以选择按以下方式为六边形着色：平均模型性能、第0分段组的百分比，或每个六边形的总数量。\n3. 
**地图：** 所有用于着色的指标也会在六边形级别的工具提示中显示。\n\n#### 使用方法\n\n- 查找两个分段组（粉色和灰色）之间的地理位置差异。它们代表您先前选择的两个子集之间的空间分布差异。\n\n**示例**\n\n在上面的第一张地图中，第0组更明显地集中在旧金山市中心区域。\n\n\u003C!-- images in this doc are created from https:\u002F\u002Fdocs.google.com\u002Fpresentation\u002Fd\u002F1EqvjMyBLNX7wfEQPFKAoaE39bW0pXbBa8WIznQN49vE\u002Fedit?usp=sharing -->\n\n## 使用演示应用\n\n要使用您的ML模型的静态输出进行一次性评估，请使用演示应用。\n否则，如果您有一个能够程序化生成ML模型输出的系统，您可以考虑直接[使用Manifold组件](#using-the-component)。\n\n### 在本地运行演示应用\n\n运行以下命令以设置环境并运行演示：\n\n```bash\n# 在根目录下安装所有依赖项\nyarn\n# 演示应用位于examples\u002Fmanifold目录\ncd examples\u002Fmanifold\n# 安装演示应用的依赖项\nyarn\n# 运行应用\nyarn start\n```\n\n现在您应该会在`localhost:8080`看到演示应用正在运行。\n\n### 将CSV上传到演示应用\n\n\u003Cimg alt=\"csv upload interface\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuber_manifold_readme_6d50deadc327.png\" width=\"500\">\n\n应用启动后，您将看到上述界面，要求您将“特征”、“预测”和“真实值”数据集上传到Manifold。\n它们分别对应于“准备您的数据”部分中的`x`、`yPred`和`yTrue`，您应相应地准备您的CSV文件，如下所示：\n\n|           字段            |   **`x`** (特征)    | **`yPred`** (预测)  | **`yTrue`** (真实值)  |\n| :------------------------: | :--------------------: | :-----------------------: | :-------------------------: |\n|      CSV文件数量        |           1            |         多个          |              1              |\n| CSV格式示例 | ![][feature csv image] | ![][prediction csv image] | ![][ground truth csv image] |\n\n请注意，输入文件中应排除索引列。\n数据集上传完成后，您将看到由这些数据集生成的可视化效果。\n\n## 使用组件\n\n将Manifold组件嵌入您的应用程序中，可以让您程序化地生成ML模型数据并进行可视化。\n否则，如果您有一些模型的静态输出，并希望进行一次性评估，您可以考虑直接[使用演示应用](#using-the-demo-app)。\n\n以下是将Manifold导入您的应用并加载数据以进行可视化的基本步骤。您也可以查看示例文件夹。\n\n### 安装Manifold\n\n```bash\n$ npm install @mlvis\u002Fmanifold styled-components styletron-engine-atomic styletron-react\n```\n\n### 加载和转换数据\n\n为了将您的数据文件加载到 Manifold 中，可以使用 `loadLocalData` 操作。您也可以使用 `dataTransformer` 将数据重塑为 Manifold 所需的格式。\n\n```js\nimport {loadLocalData} from '@mlvis\u002Fmanifold\u002Factions';\n\n\u002F\u002F 创建以下操作并传递给 dispatch\nloadLocalData({\n  fileList,\n  
dataTransformer,\n});\n```\n\n##### `fileList`: {Object[]}\n\n一个或多个 CSV 格式的数据集。这些数据可以由您的后端返回。\n\n##### `dataTransformer`: {Function}\n\n一个将 `fileList` 转换为 [Manifold 输入数据格式](#prepare-your-data) 的函数。默认值如下：\n\n```js\nconst defaultDataTransformer = fileList => ({\n  x: [],\n  yPred: [],\n  yTrue: [],\n});\n```\n\n### 挂载 reducer\n\nManifold 使用 Redux 来管理其内部状态。您需要将 `manifoldReducer` 注册到应用的主要 reducer 中：\n\n```js\nimport manifoldReducer from '@mlvis\u002Fmanifold\u002Freducers';\nimport {combineReducers, createStore} from 'redux';\n\nconst initialState = {};\nconst reducers = combineReducers({\n  \u002F\u002F 在您的应用中挂载 manifold reducer\n  manifold: manifoldReducer,\n\n  \u002F\u002F 其他 reducer\n  app: appReducer,\n});\n\n\u002F\u002F 使用 createStore\nexport default createStore(reducers, initialState);\n```\n\n### 挂载组件\n\n如果您在上一步中将 `manifoldReducer` 挂载到了除 `manifold` 之外的其他路径，则在使用 `getState` 属性挂载组件时，需要指定该路径。`width` 和 `height` 都需要显式指定。如果您有地理空间特征并且需要在地图上查看它们，还需要一个 [Mapbox 令牌](https:\u002F\u002Fdocs.mapbox.com\u002Fhelp\u002Fhow-mapbox-works\u002Faccess-tokens\u002F)。\n\n```js\nimport Manifold from '@mlvis\u002Fmanifold';\nconst manifoldGetState = state => state.pathTo.manifold;\nconst yourMapboxToken = ...;\n\nconst Main = props => (\n  \u003CManifold\n    getState={manifoldGetState}\n    width={width}\n    height={height}\n    mapboxToken={yourMapboxToken}\n  \u002F>\n);\n```\n\n### 样式\n\nManifold 使用 baseui，而 baseui 则以 Styletron 作为样式引擎。如果您在应用的其他部分尚未使用 Styletron，请确保使用 [styletron 提供者](https:\u002F\u002Fbaseweb.design\u002Fgetting-started\u002Fsetup\u002F#adding-base-web-to-your-application) 包裹 Manifold。\n\nManifold 使用 baseui 的 [主题化 API](https:\u002F\u002Fbaseweb.design\u002Fguides\u002Ftheming\u002F)。Manifold 默认使用的主题被导出为 `THEME`。您可以扩展 `THEME` 并将其作为 `Manifold` 组件的 `theme` 属性来传递，从而自定义样式。\n\n```js\nimport Manifold, {THEME} from '@mlvis\u002Fmanifold';\nimport {Client as Styletron} from 'styletron-engine-atomic';\nimport {Provider as StyletronProvider} from 
'styletron-react';\n\nconst engine = new Styletron();\nconst myTheme = {\n  ...THEME,\n  colors: {\n    ...THEME.colors,\n    primary: '#ff0000',\n  },\n};\n\nconst Main = props => (\n  \u003CStyletronProvider value={engine}>\n    \u003CManifold\n      getState={manifoldGetState}\n      theme={myTheme}\n    \u002F>\n  \u003C\u002FStyletronProvider>\n);\n```\n\n## 构建于\n- [TensorFlow.js](https:\u002F\u002Fjs.tensorflow.org\u002F)\n- [React](https:\u002F\u002Freactjs.org\u002F)\n- [Redux](https:\u002F\u002Fredux.js.org\u002F)\n\n## 贡献\n请在贡献之前阅读我们的[行为准则](CODE_OF_CONDUCT.md)！关于提交拉取请求的详细信息，请参阅 [CONTRIBUTING.md](CONTRIBUTING.md) 文件。请参考 [issue 模板](https:\u002F\u002Fhelp.github.com\u002Farticles\u002Fabout-issue-and-pull-request-templates\u002F)。\n\n## 版本控制\n我们会在变更日志中记录版本和更改内容——详情请参阅 [CHANGELOG.md](CHANGELOG.md) 文件。\n\n## 许可证\nApache 2.0 许可证\n\n[feature csv image]: https:\u002F\u002Fd1a3f4spazzrp4.cloudfront.net\u002Fmanifold\u002Fdocs\u002Fx.png\n[prediction csv image]: https:\u002F\u002Fd1a3f4spazzrp4.cloudfront.net\u002Fmanifold\u002Fdocs\u002FyPred.png\n[ground truth csv image]: https:\u002F\u002Fd1a3f4spazzrp4.cloudfront.net\u002Fmanifold\u002Fdocs\u002FyTrue.png","# Manifold 快速上手指南\n\nManifold 是一个与模型无关的机器学习可视化调试工具。它帮助开发者超越单一的汇总指标（如 AUC、RMSE），深入识别模型在哪些数据子集上表现不佳，并通过特征分布差异分析潜在原因。\n\n## 环境准备\n\n在开始之前，请确保您的开发环境满足以下要求：\n\n*   **操作系统**: Linux, macOS 或 Windows\n*   **Node.js**: 建议安装 LTS 版本 (v14+)\n*   **包管理器**: 推荐使用 `yarn` (原文示例基于 yarn)，也可使用 `npm`\n*   **浏览器**: 现代浏览器 (Chrome, Firefox, Edge 等)\n\n**前置依赖安装：**\n如果您尚未安装 yarn，可以通过 npm 全局安装：\n```bash\nnpm install -g yarn\n```\n\n> **提示**：国内开发者若遇到网络延迟，可配置淘宝镜像源加速依赖下载：\n> ```bash\n> yarn config set registry https:\u002F\u002Fregistry.npmmirror.com\n> ```\n\n## 安装步骤\n\nManifold 提供演示应用（Demo App）供快速体验，也支持作为组件集成到您的项目中。以下是运行本地演示应用的步骤：\n\n1.  **克隆仓库并安装根目录依赖**\n    ```bash\n    git clone https:\u002F\u002Fgithub.com\u002Fuber\u002Fmanifold.git\n    cd manifold\n    yarn\n    ```\n\n2.  
**进入演示应用目录并安装其依赖**\n    ```bash\n    cd examples\u002Fmanifold\n    yarn\n    ```\n\n3.  **启动应用**\n    ```bash\n    yarn start\n    ```\n\n启动成功后，请在浏览器访问 `http:\u002F\u002Flocalhost:8080`。\n\n## 基本使用\n\n### 1. 准备数据\n\nManifold 接收特定格式的数据输入。您可以直接上传 CSV 文件到演示应用，或在代码中构建数据对象。\n\n**数据格式要求：**\n数据需包含三个核心部分，且数组顺序必须一一对应：\n*   `x`: 特征数据 (Feature Data)\n*   `yPred`: 预测结果 (Prediction Data)\n*   `yTrue`: 真实标签 (Ground Truth)\n\n**推荐数据量**：每个数据集建议包含 10,000 - 15,000 条实例。若数据量过大，随机抽样通常足以揭示关键模式。\n\n**JSON 数据结构示例：**\n\n```js\nconst data = {\n  x:     [...],         \u002F\u002F 特征数据列表\n  yPred: [[...], ...],  \u002F\u002F 预测数据列表（支持多模型）\n  yTrue: [...],         \u002F\u002F 真实标签列表\n};\n```\n\n*   **`x` (Object[])**: 包含特征的实例列表。\n    ```js\n    [{feature_0: 21, feature_1: 'B'}, {feature_0: 36, feature_1: 'A'}]\n    ```\n*   **`yPred` (Object[][])**: 嵌套列表。外层列表代表不同模型，内层列表代表每个实例的预测概率\u002F值。\n    ```js\n    \u002F\u002F 示例：2 个模型，2 个实例，分类问题 (classes: 'false', 'true')\n    [\n      [{false: 0.1, true: 0.9}, {false: 0.8, true: 0.2}], \u002F\u002F 模型 1\n      [{false: 0.3, true: 0.7}, {false: 0.9, true: 0.1}]  \u002F\u002F 模型 2\n    ]\n    ```\n*   **`yTrue` (Number[] | String[])**: 真实标签。回归模型为数字，分类模型需与 `yPred` 中的键名匹配。\n    ```js\n    ['true', 'false']\n    ```\n\n### 2. 上传数据 (演示应用)\n\n在 `localhost:8080` 页面中，您将看到文件上传界面。需准备三个 CSV 文件分别对应上述字段：\n\n| 字段 | 对应变量 | 文件数量 | 格式说明 |\n| :--- | :--- | :--- | :--- |\n| **Feature** | `x` | 1 个 | 每行一个样本，列为特征 |\n| **Prediction** | `yPred` | 多个 | 每个模型一个文件，内容为预测概率或值 |\n| **Ground Truth** | `yTrue` | 1 个 | 每行一个样本的真实标签 |\n\n上传后，系统将自动处理并生成可视化图表。\n\n### 3. 解读可视化结果\n\nManifold 主要包含两个核心视图，用于辅助调试：\n\n#### A. 性能对比视图 (Performance Comparison View)\n*   **功能**：概览模型在不同数据分段上的表现。\n*   **如何阅读**：\n    *   **X 轴**：性能指标（如 log-loss, squared-error）。越靠左表现越好。\n    *   **曲线**：代表某个模型在特定数据分段上的误差分布。\n    *   **分段 (Segments)**：系统自动将数据聚类为 N 个分段（基于表现相似度）。\n*   **操作建议**：寻找曲线靠右（误差高）的分段。如果不同模型在同一段落表现差异巨大，可考虑针对该数据子集切换模型。\n\n#### B. 
特征归因视图 (Feature Attribution View)\n*   **功能**：对比不同表现分段（如“高分段”vs“低分段”）的特征分布差异。\n*   **如何阅读**：\n    *   系统会根据您在性能视图中选定的分段组（例如粉色组和蓝色组），展示各特征的分布直方图或热力图。\n    *   **排序**：特征按 KL 散度（分布差异度）排序。差异越大，该特征越可能是导致模型表现不佳的原因。\n*   **操作建议**：观察排名靠前的特征。如果某特征在“低性能组”的分布明显偏向特定值（例如数值偏小或集中在某类别），则表明模型在该特征取值范围内存在偏差。\n\n#### C. 地理特征视图 (Geo Feature View)\n*   若数据包含经纬度或 H3 Hexagon ID，系统将自动在地图上展示数据点的空间分布差异，帮助识别地理位置相关的模型偏差。","某金融风控团队正在优化信用卡欺诈检测模型，试图在保持高召回率的同时降低误报率。\n\n### 没有 manifold 时\n- 团队仅依赖 AUC 或 RMSE 等汇总指标评估模型，无法察觉模型在特定用户群体（如年轻租户）中的严重失效。\n- 面对整体性能瓶颈，开发人员只能盲目调整超参数或尝试不同算法，缺乏针对错误数据子集的定位依据。\n- 难以解释为何模型在某些样本上表现糟糕，无法直观对比“预测准确”与“预测错误”两组数据的特征分布差异。\n- 多模型对比耗时费力，需要手动编写大量代码生成图表才能判断哪个模型在关键细分场景下更优。\n\n### 使用 manifold 后\n- 通过性能对比视图，团队迅速锁定模型在“无固定住所”这一细分数据子集上准确率极低，而不仅仅是看到整体平均分下降。\n- 利用特征归因视图，直接发现该失败子集中“交易地点变动频率”的分布与成功子集存在显著偏差，明确了模型失效的根本原因。\n- 无需编写额外绘图代码，即可在同一界面并排对比三个候选模型在不同数据段的表现，快速选出在高风险群体中表现最稳健的模型。\n- 基于可视化的洞察，团队针对性地增加了相关特征工程，而非盲目重试，显著提升了迭代效率。\n\nManifold 将黑盒模型的调试过程从“猜谜游戏”转变为基于数据分布差异的精准诊断，让开发者能透过汇总指标看清模型在具体场景中的真实行为。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fuber_manifold_0a4f46f2.jpg","uber","Uber Open Source","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Fuber_3a532a8c.png","Open Source at Uber",null,"http:\u002F\u002Fwww.uber.com","https:\u002F\u002Fgithub.com\u002Fuber",[80,84,88,92,96,99,103],{"name":81,"color":82,"percentage":83},"JavaScript","#f1e05a",89.5,{"name":85,"color":86,"percentage":87},"Python","#3572A5",6.8,{"name":89,"color":90,"percentage":91},"Jupyter Notebook","#DA5B0B",3.5,{"name":93,"color":94,"percentage":95},"Makefile","#427819",0.1,{"name":97,"color":98,"percentage":95},"HTML","#e34c26",{"name":100,"color":101,"percentage":102},"Shell","#89e051",0,{"name":104,"color":105,"percentage":102},"CSS","#663399",1671,116,"2026-04-08T12:32:10","Apache-2.0",4,"未说明",{"notes":113,"python":111,"dependencies":114},"该工具是一个基于 Web 的可视化调试组件，主要通过 JavaScript\u002FNode.js 环境运行。安装和运行依赖 yarn 包管理器。数据输入支持 CSV 上传或编程方式加载，推荐数据集大小为 10,000 到 15,000 条实例。若需本地运行演示应用，需在根目录及 
examples\u002Fmanifold 目录下分别执行 yarn install，并通过 yarn start 在 localhost:8080 启动服务。支持地理空间特征（经纬度及 H3 Hexagon ID）的可视化。",[115,116],"yarn","node.js",[14],[119,120,121],"incubation","machine-learning","visualization","2026-03-27T02:49:30.150509","2026-04-14T12:26:55.679031",[125,130,135,140,145,150],{"id":126,"question_zh":127,"answer_zh":128,"source_url":129},32862,"上传 CSV 文件后没有返回任何结果或报错，可能是什么原因？","首先请确保严格按照官方文档的说明格式化 CSV 文件（特征、预测值和真实值）。如果上传多个预测文件时出现类似 \"Error: yPred[1][0] has a different shape than other element in yPred\" 的错误，即使您认为形状相同，也请检查数据是否包含索引列。尝试移除 CSV 中的索引列（no index），或者单独上传每个预测文件以排查问题。","https:\u002F\u002Fgithub.com\u002Fuber\u002Fmanifold\u002Fissues\u002F103",{"id":131,"question_zh":132,"answer_zh":133,"source_url":134},32863,"在 Jupyter Notebook 中安装了扩展但无法显示可视化输出，只显示文本怎么办？","这通常是因为缺少 `jupyter-js-widgets` 依赖。请确保已正确安装并设置 `jupyter-js-widgets`。安装完成后，重新运行 Notebook 单元格即可正常显示可视化界面。","https:\u002F\u002Fgithub.com\u002Fuber\u002Fmanifold\u002Fissues\u002F98",{"id":136,"question_zh":137,"answer_zh":138,"source_url":139},32864,"运行示例项目启动时报错 \"Unknown option: base.rootMode\" 且页面空白，如何解决？","这通常是由于 node_modules 中的依赖版本冲突或缓存问题导致的。解决方法是删除项目各级目录中的 `node_modules`（包括根目录和 examples\u002Fmanifold 下的 node_modules），然后重新运行相应的安装命令（如 `yarn install` 或 `npm install`）来重建依赖。","https:\u002F\u002Fgithub.com\u002Fuber\u002Fmanifold\u002Fissues\u002F97",{"id":141,"question_zh":142,"answer_zh":143,"source_url":144},32865,"如何配置构建工具以仅生成 ESM bundle 从而减小包体积？","可以通过修改构建配置，禁用默认的 es5 和 es6 bundle 生成，仅保留 esm bundle。这样可以启用 Tree-shaking 并支持子文件夹导入，从而显著减小包体积并节省构建时间。具体实现可以参考在 `package.json` 中添加构建脚本覆盖默认行为，或调整 webpack alias 配置以确保子目录导入指向正确的源文件（src）而非构建文件（dist）。","https:\u002F\u002Fgithub.com\u002Fuber\u002Fmanifold\u002Fissues\u002F31",{"id":146,"question_zh":147,"answer_zh":148,"source_url":149},32866,"Manifold 为什么要迁移到 Baseui，这对开发有什么影响？","迁移到 Baseui 主要是为了使用标准的 UI 组件（如带搜索功能的下拉框），这将缩短 UI 开发时间并让团队更专注于可视化核心逻辑。虽然短期内需要重构代码且可能暂时增加包体积（因为同时依赖 Styletron 和 styled-components），但这有助于项目长期扩展，并使熟悉 Uber UI 体系的开发者更容易上手。目前的折中方案是仅在复杂 UI 控件上使用 
Baseui，其余样式保留 styled-components。","https:\u002F\u002Fgithub.com\u002Fuber\u002Fmanifold\u002Fissues\u002F22",{"id":151,"question_zh":152,"answer_zh":153,"source_url":144},32867,"在本地调试时，如何验证子目录导入（subdirectory import）是否正确指向了源代码？","可以通过以下步骤验证：1. 在 `modules\u002Fmanifold\u002Fsrc` 下的相关文件中设置断点；2. 完全删除 `dist\u002F` 目录；3. 运行 `yarn start` 启动网站。如果在调试器中命中了 `src` 目录下的断点，说明 alias 配置正常工作，导入确实指向了源代码而非构建产物。",[155,159],{"id":156,"version":157,"summary_zh":76,"released_at":158},247563,"v1.1.4","2020-01-13T18:36:15",{"id":160,"version":161,"summary_zh":76,"released_at":162},247564,"v1.1.2","2020-01-08T18:28:58"]