[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-xinychen--transdim":3,"tool-xinychen--transdim":65},[4,23,32,40,49,57],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":22},2268,"ML-For-Beginners","microsoft\u002FML-For-Beginners","ML-For-Beginners 是由微软推出的一套系统化机器学习入门课程，旨在帮助零基础用户轻松掌握经典机器学习知识。这套课程将学习路径规划为 12 周，包含 26 节精炼课程和 52 道配套测验，内容涵盖从基础概念到实际应用的完整流程，有效解决了初学者面对庞大知识体系时无从下手、缺乏结构化指导的痛点。\n\n无论是希望转型的开发者、需要补充算法背景的研究人员，还是对人工智能充满好奇的普通爱好者，都能从中受益。课程不仅提供了清晰的理论讲解，还强调动手实践，让用户在循序渐进中建立扎实的技能基础。其独特的亮点在于强大的多语言支持，通过自动化机制提供了包括简体中文在内的 50 多种语言版本，极大地降低了全球不同背景用户的学习门槛。此外，项目采用开源协作模式，社区活跃且内容持续更新，确保学习者能获取前沿且准确的技术资讯。如果你正寻找一条清晰、友好且专业的机器学习入门之路，ML-For-Beginners 将是理想的起点。",84991,2,"2026-04-05T10:45:23",[13,14,15,16,17,18,19,20,21],"图像","数据工具","视频","插件","Agent","其他","语言模型","开发框架","音频","ready",{"id":24,"name":25,"github_repo":26,"description_zh":27,"stars":28,"difficulty_score":29,"last_commit_at":30,"category_tags":31,"status":22},3128,"ragflow","infiniflow\u002Fragflow","RAGFlow 是一款领先的开源检索增强生成（RAG）引擎，旨在为大语言模型构建更精准、可靠的上下文层。它巧妙地将前沿的 RAG 技术与智能体（Agent）能力相结合，不仅支持从各类文档中高效提取知识，还能让模型基于这些知识进行逻辑推理和任务执行。\n\n在大模型应用中，幻觉问题和知识滞后是常见痛点。RAGFlow 通过深度解析复杂文档结构（如表格、图表及混合排版），显著提升了信息检索的准确度，从而有效减少模型“胡编乱造”的现象，确保回答既有据可依又具备时效性。其内置的智能体机制更进一步，使系统不仅能回答问题，还能自主规划步骤解决复杂问题。\n\n这款工具特别适合开发者、企业技术团队以及 AI 研究人员使用。无论是希望快速搭建私有知识库问答系统，还是致力于探索大模型在垂直领域落地的创新者，都能从中受益。RAGFlow 提供了可视化的工作流编排界面和灵活的 API 接口，既降低了非算法背景用户的上手门槛，也满足了专业开发者对系统深度定制的需求。作为基于 Apache 2.0 协议开源的项目，它正成为连接通用大模型与行业专有知识之间的重要桥梁。",77062,3,"2026-04-04T04:44:48",[17,13,20,19,18],{"id":33,"name":34,"github_repo":35,"description_zh":36,"stars":37,"difficulty_score":29,"last_commit_at":38,"category_tags":39,"status":22},519,"PaddleOCR","PaddlePaddle\u002FPaddleOCR","PaddleOCR 是一款基于百度飞桨框架开发的高性能开源光学字符识别工具包。它的核心能力是将图片、PDF 等文档中的文字提取出来，转换成计算机可读取的结构化数据，让机器真正“看懂”图文内容。\n\n面对海量纸质或电子文档，PaddleOCR 解决了人工录入效率低、数字化成本高的问题。尤其在人工智能领域，它扮演着连接图像与大型语言模型（LLM）的桥梁角色，能将视觉信息直接转化为文本输入，助力智能问答、文档分析等应用场景落地。\n\nPaddleOCR 
适合开发者、算法研究人员以及有文档自动化需求的普通用户。其技术优势十分明显：不仅支持全球 100 多种语言的识别，还能在 Windows、Linux、macOS 等多个系统上运行，并灵活适配 CPU、GPU、NPU 等各类硬件。作为一个轻量级且社区活跃的开源项目，PaddleOCR 既能满足快速集成的需求，也能支撑前沿的视觉语言研究，是处理文字识别任务的理想选择。",74913,"2026-04-05T10:44:17",[19,13,20,18],{"id":41,"name":42,"github_repo":43,"description_zh":44,"stars":45,"difficulty_score":46,"last_commit_at":47,"category_tags":48,"status":22},3215,"awesome-machine-learning","josephmisiti\u002Fawesome-machine-learning","awesome-machine-learning 是一份精心整理的机器学习资源清单，汇集了全球优秀的机器学习框架、库和软件工具。面对机器学习领域技术迭代快、资源分散且难以甄选的痛点，这份清单按编程语言（如 Python、C++、Go 等）和应用场景（如计算机视觉、自然语言处理、深度学习等）进行了系统化分类，帮助使用者快速定位高质量项目。\n\n它特别适合开发者、数据科学家及研究人员使用。无论是初学者寻找入门库，还是资深工程师对比不同语言的技术选型，都能从中获得极具价值的参考。此外，清单还延伸提供了免费书籍、在线课程、行业会议、技术博客及线下聚会等丰富资源，构建了从学习到实践的全链路支持体系。\n\n其独特亮点在于严格的维护标准：明确标记已停止维护或长期未更新的项目，确保推荐内容的时效性与可靠性。作为机器学习领域的“导航图”，awesome-machine-learning 以开源协作的方式持续更新，旨在降低技术探索门槛，让每一位从业者都能高效地站在巨人的肩膀上创新。",72149,1,"2026-04-03T21:50:24",[20,18],{"id":50,"name":51,"github_repo":52,"description_zh":53,"stars":54,"difficulty_score":46,"last_commit_at":55,"category_tags":56,"status":22},2234,"scikit-learn","scikit-learn\u002Fscikit-learn","scikit-learn 是一个基于 Python 构建的开源机器学习库，依托于 SciPy、NumPy 等科学计算生态，旨在让机器学习变得简单高效。它提供了一套统一且简洁的接口，涵盖了从数据预处理、特征工程到模型训练、评估及选择的全流程工具，内置了包括线性回归、支持向量机、随机森林、聚类等在内的丰富经典算法。\n\n对于希望快速验证想法或构建原型的数据科学家、研究人员以及 Python 开发者而言，scikit-learn 是不可或缺的基础设施。它有效解决了机器学习入门门槛高、算法实现复杂以及不同模型间调用方式不统一的痛点，让用户无需重复造轮子，只需几行代码即可调用成熟的算法解决分类、回归、聚类等实际问题。\n\n其核心技术亮点在于高度一致的 API 设计风格，所有估算器（Estimator）均遵循相同的调用逻辑，极大地降低了学习成本并提升了代码的可读性与可维护性。此外，它还提供了强大的模型选择与评估工具，如交叉验证和网格搜索，帮助用户系统地优化模型性能。作为一个由全球志愿者共同维护的成熟项目，scikit-learn 以其稳定性、详尽的文档和活跃的社区支持，成为连接理论学习与工业级应用的最",65628,"2026-04-05T10:10:46",[20,18,14],{"id":58,"name":59,"github_repo":60,"description_zh":61,"stars":62,"difficulty_score":10,"last_commit_at":63,"category_tags":64,"status":22},3364,"keras","keras-team\u002Fkeras","Keras 是一个专为人类设计的深度学习框架，旨在让构建和训练神经网络变得简单直观。它解决了开发者在不同深度学习后端之间切换困难、模型开发效率低以及难以兼顾调试便捷性与运行性能的痛点。\n\n无论是刚入门的学生、专注算法的研究人员，还是需要快速落地产品的工程师，都能通过 Keras 
轻松上手。它支持计算机视觉、自然语言处理、音频分析及时间序列预测等多种任务。\n\nKeras 3 的核心亮点在于其独特的“多后端”架构。用户只需编写一套代码，即可灵活选择 TensorFlow、JAX、PyTorch 或 OpenVINO 作为底层运行引擎。这一特性不仅保留了 Keras 一贯的高层易用性，还允许开发者根据需求自由选择：利用 JAX 或 PyTorch 的即时执行模式进行高效调试，或切换至速度最快的后端以获得最高 350% 的性能提升。此外，Keras 具备强大的扩展能力，能无缝从本地笔记本电脑扩展至大规模 GPU 或 TPU 集群，是连接原型开发与生产部署的理想桥梁。",63927,"2026-04-04T15:24:37",[20,14,18],{"id":66,"github_repo":67,"name":68,"description_en":69,"description_zh":70,"ai_summary_zh":70,"readme_en":71,"readme_zh":72,"quickstart_zh":73,"use_case_zh":74,"hero_image_url":75,"owner_login":76,"owner_name":77,"owner_avatar_url":78,"owner_bio":79,"owner_company":80,"owner_location":81,"owner_email":82,"owner_twitter":82,"owner_website":83,"owner_url":84,"languages":85,"stars":90,"forks":91,"last_commit_at":92,"license":93,"difficulty_score":10,"env_os":94,"env_gpu":94,"env_ram":94,"env_deps":95,"category_tags":102,"github_topics":82,"view_count":29,"oss_zip_url":82,"oss_zip_packed_at":82,"status":22,"created_at":103,"updated_at":104,"faqs":105,"releases":126},691,"xinychen\u002Ftransdim","transdim","Machine learning for transportation data imputation and prediction.","transdim 是一款专为交通时空数据设计的机器学习开源工具。面对现实世界中传感器数据常因故障或传输问题而缺失的挑战，transdim 致力于提供精准的数据补全与时空预测解决方案。它不仅能修复随机、非随机及块状等多种模式的缺失数据，还能在数据不完整的情况下依然保持高预测精度。\n\n技术层面，transdim 引入了张量完成框架，核心采用低秩自回归张量完成（LATC）方法，有效捕捉交通流的时间与空间相关性。这使得它在处理复杂的时空任务时表现优异，无论是基础的数据清洗还是高精度的未来交通状态预报，都能胜任。\n\ntransdim 基于 Python 开发并遵循 MIT 协议，代码结构清晰，文档完善。它非常适合交通领域的科研人员、算法工程师以及需要处理大规模时空数据的开发者使用。借助 transdim，团队可以更高效地构建鲁棒的智能交通系统，无需从零开始搭建复杂的数据预处理流程。","# transdim\n\n[![MIT License](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Flicense-MIT-green.svg)](https:\u002F\u002Fopensource.org\u002Flicenses\u002FMIT)\n![Python 3.7](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FPython-3.7-blue.svg)\n[![repo size](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Frepo-size\u002Fxinychen\u002Ftransdim.svg)](https:\u002F\u002Fgithub.com\u002Fxinychen\u002Ftransdim\u002Farchive\u002Fmaster.zip)\n[![GitHub 
stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fxinychen\u002Ftransdim.svg?logo=github&label=Stars&logoColor=white)](https:\u002F\u002Fgithub.com\u002Fxinychen\u002Ftransdim)\n\n\u003Ch6 align=\"center\">Made by Xinyu Chen • :globe_with_meridians: \u003Ca href=\"https:\u002F\u002Fxinychen.github.io\">https:\u002F\u002Fxinychen.github.io\u003C\u002Fa>\u003C\u002Fh6>\n\n![logo](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fxinychen_transdim_readme_93d8f1c81e51.png)\n\n**Trans**portation **d**ata **im**putation (a.k.a., transdim). \n\nMachine learning models make important developments in the field of spatiotemporal data modeling - like how to forecast near-future traffic states of road networks. But what happens when these models are built on incomplete data commonly collected from real-world systems (e.g., transportation system)?\n\n\u003Cbr>\n\nTable of Content\n--------------\n\n- [About this Project](https:\u002F\u002Fgithub.com\u002Fxinychen\u002Ftransdim?tab=readme-ov-file#about-this-project)\n- [Tasks and Challenges](https:\u002F\u002Fgithub.com\u002Fxinychen\u002Ftransdim?tab=readme-ov-file#tasks-and-challenges)\n- [Implementation](https:\u002F\u002Fgithub.com\u002Fxinychen\u002Ftransdim?tab=readme-ov-file#quick-start)\n- [Quick Start](https:\u002F\u002Fgithub.com\u002Fxinychen\u002Ftransdim?tab=readme-ov-file#quick-start)\n- [Documentation](https:\u002F\u002Fgithub.com\u002Fxinychen\u002Ftransdim?tab=readme-ov-file#documentation)\n- [Publications](https:\u002F\u002Fgithub.com\u002Fxinychen\u002Ftransdim?tab=readme-ov-file#publications)\n- [Contributors](https:\u002F\u002Fgithub.com\u002Fxinychen\u002Ftransdim?tab=readme-ov-file#collaborators)\n\n\u003Cbr>\n\nAbout this Project\n--------------\n\nIn the **transdim** project, we develop machine learning models to help address some of the toughest challenges of spatiotemporal data modeling - from missing data imputation to time series prediction. 
The strategic aim of this project is **creating accurate and efficient solutions for spatiotemporal traffic data imputation and prediction tasks**.\n\nIn a hurry? Please check out our contents as follows.\n\n\u003Cbr>\n\nTasks and Challenges\n--------------\n\n> Missing data are there, whether we like them or not. The really interesting question is how to deal with incomplete data.\n\n\u003Cp align=\"center\">\n\u003Cimg align=\"middle\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fxinychen_transdim_readme_b08f12ab5b7e.png\" width=\"800\" \u002F>\n\u003C\u002Fp>\n\n\u003Cp align = \"center\">\n\u003Cb>Figure 1\u003C\u002Fb>: Two classical missing patterns in a spatiotemporal setting.\n\u003C\u002Fp>\n\nWe create three missing data mechanisms on real-world data.\n\n- **Missing data imputation** 🔥\n\n  - Random missing (RM): Each sensor lost observations completely at random. (★★★)\n  - Non-random missing (NM): Each sensor lost observations during several days. (★★★★)\n  - Blockout missing (BM): All sensors lost their observations at several consecutive time points. (★★★★)\n\n\u003Cp align=\"center\">\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fxinychen_transdim_readme_9841e8a82838.png\" alt=\"drawing\" width=\"800\"\u002F>\n\u003C\u002Fp>\n\n\u003Cp align = \"center\">\n\u003Cb>Figure 2\u003C\u002Fb>: Tensor completion framework for spatiotemporal missing traffic data imputation.\n\u003C\u002Fp>\n\n- **Spatiotemporal prediction** 🔥\n  - Forecasting without missing values. (★★★)\n  - Forecasting with incomplete observations. 
(★★★★★)\n\n\u003Cp align=\"center\">\n\u003Cimg align=\"middle\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fxinychen_transdim_readme_ba6d30e2785b.png\" width=\"700\" \u002F>\n\u003C\u002Fp>\n\n\u003Cp align = \"center\">\n\u003Cb>Figure 3\u003C\u002Fb>: Illustration of our proposed Low-Rank Autoregressive Tensor Completion (LATC) imputer\u002Fpredictor with a prediction window τ (green nodes: observed values; white nodes: missing values; red nodes\u002Fpanel: prediction; blue panel: training data to construct the tensor).\n\u003C\u002Fp>\n\n\u003Cbr>\n\nImplementation\n--------------\n\n### Open data\n\nIn this project, we have adapted some publicly available data sets for our experiments. The original links for these data are summarized as follows:\n\n- **Multivariate time series**\n  - [Birmingham parking data set](https:\u002F\u002Farchive.ics.uci.edu\u002Fml\u002Fdatasets\u002FParking+Birmingham)\n  - [California PeMS traffic speed data set](https:\u002F\u002Fdoi.org\u002F10.5281\u002Fzenodo.3939792) (large-scale)\n  - [Guangzhou urban traffic speed data set](https:\u002F\u002Fdoi.org\u002F10.5281\u002Fzenodo.1205228)\n  - [Hangzhou metro passenger flow data set](https:\u002F\u002Fdoi.org\u002F10.5281\u002Fzenodo.3145403)\n  - [London urban movement speed data set](https:\u002F\u002Fmovement.uber.com\u002F) (other cities are also available at [Uber movement project](https:\u002F\u002Fmovement.uber.com\u002F))\n  - [Portland highway traffic data set](https:\u002F\u002Fportal.its.pdx.edu\u002Fhome) (including traffic volume\u002Fspeed\u002Foccupancy, see [data documentation](https:\u002F\u002Fportal.its.pdx.edu\u002Fstatic\u002Ffiles\u002Ffhwa\u002FFreeway%20Data%20Documentation.pdf))\n  - [Seattle freeway traffic speed data set](https:\u002F\u002Fgithub.com\u002Fzhiyongc\u002FSeattle-Loop-Data)\n- **Multidimensional time series**\n  - [New York City (NYC) taxi data 
set](https:\u002F\u002Fwww1.nyc.gov\u002Fsite\u002Ftlc\u002Fabout\u002Ftlc-trip-record-data.page)\n  - [Pacific surface temperature data set](http:\u002F\u002Firidl.ldeo.columbia.edu\u002FSOURCES\u002F.CAC\u002F)\n\nFor example, if you want to view or use these data sets, please download them into the [..\u002Fdatasets\u002F](https:\u002F\u002Fgithub.com\u002Fxinychen\u002Ftransdim\u002Ftree\u002Fmaster\u002Fdatasets) folder in advance, and then run the following code in your Python console:\n\n```python\nimport scipy.io\n\ntensor = scipy.io.loadmat('..\u002Fdatasets\u002FGuangzhou-data-set\u002Ftensor.mat')\ntensor = tensor['tensor']\n```\n\nIn particular, if you are interested in large-scale traffic data, we recommend **PeMS-4W\u002F8W\u002F12W** and [UTD19](https:\u002F\u002Futd19.ethz.ch\u002Findex.html). For PeMS data, you can download the data from [Zenodo](https:\u002F\u002Fdoi.org\u002F10.5281\u002Fzenodo.3939792) and place them in the datasets folder (data path example: `..\u002Fdatasets\u002FCalifornia-data-set\u002Fpems-4w.csv`). Then you can use `Pandas` to open the data:\n\n```python\nimport pandas as pd\n\ndata = pd.read_csv('..\u002Fdatasets\u002FCalifornia-data-set\u002Fpems-4w.csv', header = None)\n```\n\nFor model evaluation, we mask certain entries of the \"observed\" data as missing values and then perform imputation for these \"missing\" values.\n\n### Model implementation\n\nIn our experiments, we implemented some machine learning models mainly with `Numpy` and wrote the Python code in **Jupyter Notebook**. If you want to evaluate these models, please download and run these notebooks directly (prerequisite: **download the data sets** in advance). 
In the following implementation, we have improved the Python code (in Jupyter Notebook) in terms of both readability and efficiency.\n\n> Our proposed models are highlighted in bold.\n\n- **imputer** (imputation models)\n\n| Notebook                                        | Guangzhou | Birmingham | Hangzhou | Seattle | London | NYC | Pacific |\n| :----------------------------------------------------------- | :---: | :---: | :---: | :---: | :---: | :---: | :---: |\n| [BPMF](https:\u002F\u002Fnbviewer.jupyter.org\u002Fgithub\u002Fxinychen\u002Ftransdim\u002Fblob\u002Fmaster\u002Fimputer\u002FBPMF.ipynb) |   ✅   |   ✅   |   ✅   |   ✅   |   ✅   |   🔶   |   🔶   |\n| [TRMF](https:\u002F\u002Fnbviewer.jupyter.org\u002Fgithub\u002Fxinychen\u002Ftransdim\u002Fblob\u002Fmaster\u002Fimputer\u002FTRMF.ipynb) |   ✅   |   🔶   |   ✅   |   ✅   |   ✅   |   🔶   |   🔶   |\n| [BTRMF](https:\u002F\u002Fnbviewer.jupyter.org\u002Fgithub\u002Fxinychen\u002Ftransdim\u002Fblob\u002Fmaster\u002Fimputer\u002FBTRMF.ipynb) |   ✅   |   🔶   |   ✅   |   ✅   |   ✅   |   🔶   |   🔶   |\n| [**BTMF**](https:\u002F\u002Fnbviewer.jupyter.org\u002Fgithub\u002Fxinychen\u002Ftransdim\u002Fblob\u002Fmaster\u002Fimputer\u002FBTMF.ipynb) |   ✅   |   ✅   |   ✅   |   ✅   |   ✅   |   🔶   |   🔶   |\n| [**BGCP**](https:\u002F\u002Fnbviewer.jupyter.org\u002Fgithub\u002Fxinychen\u002Ftransdim\u002Fblob\u002Fmaster\u002Fimputer\u002FBGCP.ipynb) |   ✅   |   ✅   |   ✅   |   ✅   |   ✅   |   ✅   |   ✅   |\n| [**BATF**](https:\u002F\u002Fnbviewer.jupyter.org\u002Fgithub\u002Fxinychen\u002Ftransdim\u002Fblob\u002Fmaster\u002Fimputer\u002FBATF.ipynb) |   ✅   |   ✅   |   ✅   |   ✅   |   ✅   |   ✅   |   ✅   |\n| [**BTTF**](https:\u002F\u002Fnbviewer.jupyter.org\u002Fgithub\u002Fxinychen\u002Ftransdim\u002Fblob\u002Fmaster\u002Fimputer\u002FBTTF.ipynb) |   🔶   |   🔶   |   🔶   |   🔶   |   🔶   |   ✅   |   ✅   |\n| 
[HaLRTC](https:\u002F\u002Fnbviewer.jupyter.org\u002Fgithub\u002Fxinychen\u002Ftransdim\u002Fblob\u002Fmaster\u002Fimputer\u002FHaLRTC.ipynb) |   ✅   |   🔶   |   ✅   |   ✅   |   ✅   |   ✅   |   ✅   |\n| [**LRTC-TNN**](https:\u002F\u002Fnbviewer.org\u002Fgithub\u002Fxinychen\u002Ftransdim\u002Fblob\u002Fmaster\u002Fimputer\u002FLRTC-TNN.ipynb) |   ✅   |   🔶   |   ✅   |   ✅   |   🔶   |   🔶   |   🔶   |\n\n- **predictor** (prediction models)\n\n| Notebook                                        | Guangzhou | Birmingham | Hangzhou | Seattle | London | NYC | Pacific |\n| :----------------------------------------------------------- | :---: | :---: | :---: | :---: | :---: | :---: | :---: |\n| [TRMF](https:\u002F\u002Fnbviewer.jupyter.org\u002Fgithub\u002Fxinychen\u002Ftransdim\u002Fblob\u002Fmaster\u002Fpredictor\u002FTRMF.ipynb) |   ✅   |   🔶   |   ✅   |   ✅   |   ✅   |   🔶   |   🔶   |\n| [BTRMF](https:\u002F\u002Fnbviewer.jupyter.org\u002Fgithub\u002Fxinychen\u002Ftransdim\u002Fblob\u002Fmaster\u002Fpredictor\u002FBTRMF.ipynb) |   ✅   |   🔶   |   ✅   |   ✅   |   ✅   |   🔶   |   🔶   |\n| [BTRTF](https:\u002F\u002Fnbviewer.jupyter.org\u002Fgithub\u002Fxinychen\u002Ftransdim\u002Fblob\u002Fmaster\u002Fpredictor\u002FBTRTF.ipynb) |   🔶   |   🔶   |   🔶   |   🔶   |   🔶   |   ✅   |   ✅   |\n| [**BTMF**](https:\u002F\u002Fnbviewer.jupyter.org\u002Fgithub\u002Fxinychen\u002Ftransdim\u002Fblob\u002Fmaster\u002Fpredictor\u002FBTMF.ipynb) |   ✅   |   🔶   |   ✅   |   ✅   |   ✅   |   ✅   |   ✅   |\n| [**BTTF**](https:\u002F\u002Fnbviewer.jupyter.org\u002Fgithub\u002Fxinychen\u002Ftransdim\u002Fblob\u002Fmaster\u002Fpredictor\u002FBTTF.ipynb) |   🔶   |   🔶   |   🔶   |   🔶   |   🔶   |   ✅   |   ✅   |\n\n* ✅ — Covered\n* 🔶 — Not covered\n* 🚧 — Under development\n\n> For the implementation of these models, we use both `dense_mat` and `sparse_mat` (or `dense_tensor` and `sparse_tensor`) as inputs. 
However, this is not necessary: if you do not need to see the imputation\u002Fprediction performance during the iterative process, you can remove `dense_mat` (or `dense_tensor`) from the inputs of these algorithms.\n\n### Imputation\u002FPrediction performance\n\n- **Imputation example (on Guangzhou data)**\n\n![example](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fxinychen_transdim_readme_7d1cd6cdd3a9.png)\n  *(a) Time series of actual and estimated speed within two weeks from August 1 to 14.*\n\n![example](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fxinychen_transdim_readme_05026e33dcdd.png)\n  *(b) Time series of actual and estimated speed within two weeks from September 12 to 25.*\n\n> The imputation performance of BGCP (CP rank r=15 and missing rate α=30%) under the fiber missing scenario with a third-order tensor representation, where the estimated result of road segment #1 is selected as an example. In both panels, red rectangles represent fiber missing (i.e., speed observations are lost for a whole day).\n\n- **Prediction example**\n\n![example](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fxinychen_transdim_readme_cdcd9b3ac7cb.png)\n\n![example](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fxinychen_transdim_readme_d8b56735134f.png)\n\n![example](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fxinychen_transdim_readme_f5ac9b95454a.png)\n\n\u003Cbr>\n\nQuick Start\n--------------\nThis is an imputation example of Low-Rank Tensor Completion with Truncated Nuclear Norm minimization (LRTC-TNN). 
Notably, unlike the complex equations in our paper, our Python implementation is extremely easy to work with.\n\n- First, import some necessary packages:\n\n```python\nimport numpy as np\n```\n\n- Define the operators of tensor unfolding (`ten2mat`) and matrix folding (`mat2ten`) using `Numpy`:\n\n```python\ndef ten2mat(tensor, mode):\n    return np.reshape(np.moveaxis(tensor, mode, 0), (tensor.shape[mode], -1), order = 'F')\n```\n\n```python\ndef mat2ten(mat, tensor_size, mode):\n    index = list()\n    index.append(mode)\n    for i in range(tensor_size.shape[0]):\n        if i != mode:\n            index.append(i)\n    return np.moveaxis(np.reshape(mat, list(tensor_size[index]), order = 'F'), 0, mode)\n```\n\n- Define Singular Value Thresholding (SVT) for Truncated Nuclear Norm (TNN) minimization:\n\n```python\ndef svt_tnn(mat, tau, theta):\n    [m, n] = mat.shape\n    if 2 * m \u003C n:\n        u, s, v = np.linalg.svd(mat @ mat.T, full_matrices = 0)\n        s = np.sqrt(s)\n        idx = np.sum(s > tau)\n        mid = np.zeros(idx)\n        mid[:theta] = 1\n        mid[theta:idx] = (s[theta:idx] - tau) \u002F s[theta:idx]\n        return (u[:,:idx] @ np.diag(mid)) @ (u[:,:idx].T @ mat)\n    elif m > 2 * n:\n        return svt_tnn(mat.T, tau, theta).T\n    u, s, v = np.linalg.svd(mat, full_matrices = 0)\n    idx = np.sum(s > tau)\n    vec = s[:idx].copy()\n    vec[theta:] = s[theta:idx] - tau\n    return u[:,:idx] @ np.diag(vec) @ v[:idx,:]\n```\n\n- Define performance metrics (i.e., RMSE, MAPE):\n\n```python\ndef compute_rmse(var, var_hat):\n    return np.sqrt(np.sum((var - var_hat) ** 2) \u002F var.shape[0])\n\ndef compute_mape(var, var_hat):\n    return np.sum(np.abs(var - var_hat) \u002F var) \u002F var.shape[0]\n```\n\n- Define LRTC-TNN:\n\n```python\ndef LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter):\n    \"\"\"Low-Rank Tensor Completion with Truncated Nuclear Norm, 
LRTC-TNN.\"\"\"\n    \n    dim = np.array(sparse_tensor.shape)\n    pos_missing = np.where(sparse_tensor == 0)\n    pos_test = np.where((dense_tensor != 0) & (sparse_tensor == 0))\n    dense_test = dense_tensor[pos_test]\n    del dense_tensor\n    \n    X = np.zeros(np.insert(dim, 0, len(dim))) # \\boldsymbol{\\mathcal{X}}\n    T = np.zeros(np.insert(dim, 0, len(dim))) # \\boldsymbol{\\mathcal{T}}\n    Z = sparse_tensor.copy()\n    last_tensor = sparse_tensor.copy()\n    snorm = np.sqrt(np.sum(sparse_tensor ** 2))\n    it = 0\n    while True:\n        rho = min(rho * 1.05, 1e5)\n        for k in range(len(dim)):\n            X[k] = mat2ten(svt_tnn(ten2mat(Z - T[k] \u002F rho, k), alpha[k] \u002F rho, int(np.ceil(theta * dim[k]))), dim, k)\n        Z[pos_missing] = np.mean(X + T \u002F rho, axis = 0)[pos_missing]\n        T = T + rho * (X - np.broadcast_to(Z, np.insert(dim, 0, len(dim))))\n        tensor_hat = np.einsum('k, kmnt -> mnt', alpha, X)\n        tol = np.sqrt(np.sum((tensor_hat - last_tensor) ** 2)) \u002F snorm\n        last_tensor = tensor_hat.copy()\n        it += 1\n        if it % 50 == 0:\n            print('Iter: {}'.format(it))\n            print('MAPE: {:.6}'.format(compute_mape(dense_test, tensor_hat[pos_test])))\n            print('RMSE: {:.6}'.format(compute_rmse(dense_test, tensor_hat[pos_test])))\n            print()\n        if (tol \u003C epsilon) or (it >= maxiter):\n            break\n\n    print('Imputation MAPE: {:.6}'.format(compute_mape(dense_test, tensor_hat[pos_test])))\n    print('Imputation RMSE: {:.6}'.format(compute_rmse(dense_test, tensor_hat[pos_test])))\n    print()\n    \n    return tensor_hat\n```\n\n- Let us try it on the Guangzhou urban traffic speed data set:\n\n```python\nimport scipy.io\nimport numpy as np\nnp.random.seed(1000)\n\ndense_tensor = scipy.io.loadmat('..\u002Fdatasets\u002FGuangzhou-data-set\u002Ftensor.mat')['tensor']\ndim = dense_tensor.shape\nmissing_rate = 0.2 # Random missing 
(RM)\nsparse_tensor = dense_tensor * np.round(np.random.rand(dim[0], dim[1], dim[2]) + 0.5 - missing_rate)\n```\n\n- Run the imputation experiment:\n\n```python\nimport time\nstart = time.time()\nalpha = np.ones(3) \u002F 3\nrho = 1e-5\ntheta = 0.30\nepsilon = 1e-4\nmaxiter = 200\ntensor_hat = LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter)\nend = time.time()\nprint('Running time: %d seconds' % (end - start))\n```\n\n> This example is from [..\u002Fimputer\u002FLRTC-TNN.ipynb](https:\u002F\u002Fnbviewer.org\u002Fgithub\u002Fxinychen\u002Ftransdim\u002Fblob\u002Fmaster\u002Fimputer\u002FLRTC-TNN.ipynb); check out this Jupyter Notebook for details.\n\n\u003Cbr>\n\nDocumentation\n--------------\n\n1. [Intuitive understanding of randomized singular value decomposition](https:\u002F\u002Ftowardsdatascience.com\u002Fintuitive-understanding-of-randomized-singular-value-decomposition-9389e27cb9de). July 1, 2020.\n2. [Generating random numbers and arrays in Matlab and Numpy](https:\u002F\u002Ftowardsdatascience.com\u002Fgenerating-random-numbers-and-arrays-in-matlab-and-numpy-47dcc9997650). October 9, 2021.\n3. [Reduced-rank vector autoregressive model for high-dimensional time series forecasting](https:\u002F\u002Ftowardsdatascience.com\u002Freduced-rank-vector-autoregressive-model-for-high-dimensional-time-series-forecasting-bdd17df6c5ab). October 16, 2021.\n4. [Dynamic mode decomposition for spatiotemporal traffic speed time series in Seattle freeway](https:\u002F\u002Ftowardsdatascience.com\u002Fdynamic-mode-decomposition-for-spatiotemporal-traffic-speed-time-series-in-seattle-freeway-b0ba97e81c2c#ce4e-5f7c3f01d622). October 29, 2021.\n5. [Analyzing missing data problem in Uber movement speed data](https:\u002F\u002Fmedium.com\u002F@xinyu.chen\u002Fanalyzing-missing-data-problem-in-uber-movement-speed-data-208d7a126af5). February 14, 2022.\n6. 
[Using conjugate gradient to solve matrix equations](https:\u002F\u002Fmedium.com\u002Fp\u002F7f16cbae18a3). February 23, 2022.\n7. [Inpainting fluid dynamics with tensor decomposition (NumPy)](https:\u002F\u002Fmedium.com\u002Fp\u002Fd84065fead4d). March 15, 2022.\n8. [Temporal matrix factorization for multivariate time series forecasting](https:\u002F\u002Fmedium.com\u002Fp\u002Fb1c59faf05ea). March 20, 2022.\n9. [Forecasting multivariate time series with nonstationary temporal matrix factorization](https:\u002F\u002Fmedium.com\u002Fp\u002F4705df163fcf). April 25, 2022.\n10. [Implementing Kronecker product decomposition with NumPy](https:\u002F\u002Fmedium.com\u002Fp\u002F13f679f76347). June 20, 2022.\n11. [Tensor autoregression: A multidimensional time series model](https:\u002F\u002Fmedium.com\u002Fp\u002F21681f696d79). September 3, 2022.\n12. [Reproducing dynamic mode decomposition on fluid flow data in Python](https:\u002F\u002Fmedium.com\u002F@xinyu.chen\u002Freproducing-dynamic-mode-decomposition-on-fluid-flow-data-in-python-94b8d7e1f203). September 6, 2022.\n13. [Convolution nuclear norm minimization for time series modeling](https:\u002F\u002Fmedium.com\u002Fp\u002F377c56e49962). October 3, 2022.\n14. [Reinforce matrix factorization for time series modeling: Probabilistic sequential matrix factorization](https:\u002F\u002Fmedium.com\u002Fp\u002F873f4ca344de). October 5, 2022.\n15. [Discrete convolution and fast Fourier transform explained and implemented step by step](https:\u002F\u002Fmedium.com\u002Fp\u002F83ff1809378d). October 19, 2022.\n16. [Matrix factorization for image inpainting in Python](https:\u002F\u002Fmedium.com\u002Fp\u002Fd7300e6afbfd). December 8, 2022.\n17. [Circulant matrix nuclear norm minimization for image inpainting in Python](https:\u002F\u002Fmedium.com\u002Fp\u002Fb98eb94d8e). December 9, 2022.\n18. 
[Low-rank Laplacian convolution model for time series imputation and image inpainting](https:\u002F\u002Fmedium.com\u002Fp\u002Fa46dd88d107e). December 10, 2022.\n19. [Low-rank Laplacian convolution model for color image inpainting](https:\u002F\u002Fmedium.com\u002Fp\u002Fe8c5cdb3cc73). December 17, 2022.\n20. [Intuitive understanding of tensors in machine learning](https:\u002F\u002Fmedium.com\u002F@xinyu.chen\u002Fintuitive-understanding-of-tensors-in-machine-learning-33635c64b596). January 20, 2023.\n21. [Low-rank matrix and tensor factorization for speed field reconstruction](https:\u002F\u002Fmedium.com\u002Fp\u002Fbb4807cb93c5). March 9, 2023.\n22. [Bayesian vector autoregression forecasting](https:\u002F\u002Fnbviewer.jupyter.org\u002Fgithub\u002Fxinychen\u002Ftransdim\u002Fblob\u002Fmaster\u002Ftoy-examples\u002FBayesian-VAR-forecasting.ipynb)\n23. [Structured low-rank matrix completion](https:\u002F\u002Fnbviewer.jupyter.org\u002Fgithub\u002Fxinychen\u002Ftransdim\u002Fblob\u002Fmaster\u002Ftoy-examples\u002FSLRMC.ipynb)\n\n\u003Cbr>\n\nPublications\n--------------\n\n- Xinyu Chen, Zhanhong Cheng, HanQin Cai, Nicolas Saunier, Lijun Sun (2024). **Laplacian convolutional representation for traffic time series imputation**. IEEE Transactions on Knowledge and Data Engineering. 36 (11): 6490-6502. [[DOI](https:\u002F\u002Fdoi.org\u002F10.1109\u002FTKDE.2024.3419698)] [[Slides](https:\u002F\u002Fxinychen.github.io\u002Fslides\u002FLCR24.pdf)] [[Data & Python code](https:\u002F\u002Fgithub.com\u002Fxinychen\u002FLCR)]\n\n- Xinyu Chen, Lijun Sun (2022). **Bayesian temporal factorization for multidimensional time series prediction**. IEEE Transactions on Pattern Analysis and Machine Intelligence, 44 (9): 4659-4673. 
[[Preprint](https:\u002F\u002Farxiv.org\u002Fabs\u002F1910.06366v2)] [[DOI](https:\u002F\u002Fdoi.org\u002F10.1109\u002FTPAMI.2021.3066551)] [[Slides](https:\u002F\u002Fdoi.org\u002F10.5281\u002Fzenodo.4693404)] [[Data & Python code](https:\u002F\u002Fgithub.com\u002Fxinychen\u002Ftransdim)]\n\n- Xinyu Chen, Mengying Lei, Nicolas Saunier, Lijun Sun (2022). **Low-rank autoregressive tensor completion for spatiotemporal traffic data imputation**. IEEE Transactions on Intelligent Transportation Systems, 23 (8): 12301-12310. [[Preprint](https:\u002F\u002Farxiv.org\u002Fabs\u002F2104.14936)] [[DOI](https:\u002F\u002Fdoi.org\u002F10.1109\u002FTITS.2021.3113608)] [[Data & Python code](https:\u002F\u002Fgithub.com\u002Fxinychen\u002Ftransdim)] (Also accepted in part to [MiLeTS Workshop of KDD 2021](https:\u002F\u002Fkdd-milets.github.io\u002Fmilets2021\u002F), see [workshop paper](https:\u002F\u002Fkdd-milets.github.io\u002Fmilets2021\u002Fpapers\u002FMiLeTS2021_paper_23.pdf))\n\n- Xinyu Chen, Yixian Chen, Nicolas Saunier, Lijun Sun (2021). **Scalable low-rank tensor learning for spatiotemporal traffic data imputation**. Transportation Research Part C: Emerging Technologies, 129: 103226. [[Preprint](https:\u002F\u002Farxiv.org\u002Fabs\u002F2008.03194)] [[DOI](https:\u002F\u002Fdoi.org\u002F10.1016\u002Fj.trc.2021.103226)] [[Data](https:\u002F\u002Fdoi.org\u002F10.5281\u002Fzenodo.3939792)] [[Python code](https:\u002F\u002Fgithub.com\u002Fxinychen\u002Ftransdim\u002Ftree\u002Fmaster\u002Flarge-imputer)]\n\n- Xinyu Chen, Jinming Yang, Lijun Sun (2020). **A nonconvex low-rank tensor completion model for spatiotemporal traffic data imputation**. Transportation Research Part C: Emerging Technologies, 117: 102673. 
[[Preprint](https:\u002F\u002Farxiv.org\u002Fabs\u002F2003.10271v2)] [[DOI](https:\u002F\u002Fdoi.org\u002F10.1016\u002Fj.trc.2020.102673)] [[Data & Python code](https:\u002F\u002Fgithub.com\u002Fxinychen\u002Ftransdim)]\n\n- Xinyu Chen, Zhaocheng He, Yixian Chen, Yuhuan Lu, Jiawei Wang (2019). **Missing traffic data imputation and pattern discovery with a Bayesian augmented tensor factorization model**. Transportation Research Part C: Emerging Technologies, 104: 66-77. [[DOI](https:\u002F\u002Fdoi.org\u002F10.1016\u002Fj.trc.2019.03.003)] [[Slides](https:\u002F\u002Fdoi.org\u002F10.5281\u002Fzenodo.2632552)] [[Data](http:\u002F\u002Fdoi.org\u002F10.5281\u002Fzenodo.1205229)] [[Matlab code](https:\u002F\u002Fgithub.com\u002Fsysuits\u002FBATF)] [[Python code](https:\u002F\u002Fgithub.com\u002Fxinychen\u002Ftransdim\u002Fblob\u002Fmaster\u002Fimputer\u002FBATF.ipynb)]\n\n- Xinyu Chen, Zhaocheng He, Lijun Sun (2019). **A Bayesian tensor decomposition approach for spatiotemporal traffic data imputation**. Transportation Research Part C: Emerging Technologies, 98: 73-84. [[Preprint](https:\u002F\u002Fwww.researchgate.net\u002Fpublication\u002F329177786_A_Bayesian_tensor_decomposition_approach_for_spatiotemporal_traffic_data_imputation)] [[DOI](https:\u002F\u002Fdoi.org\u002F10.1016\u002Fj.trc.2018.11.003)] [[Data](http:\u002F\u002Fdoi.org\u002F10.5281\u002Fzenodo.1205229)] [[Matlab code](https:\u002F\u002Fgithub.com\u002Flijunsun\u002Fbgcp_imputation)] [[Python code](https:\u002F\u002Fgithub.com\u002Fxinychen\u002Ftransdim\u002Fblob\u002Fmaster\u002Fexperiments\u002FImputation-BGCP.ipynb)]\n\n- Xinyu Chen, Zhaocheng He, Jiawei Wang (2018). **Spatial-temporal traffic speed patterns discovery and incomplete data recovery via SVD-combined tensor decomposition**. Transportation Research Part C: Emerging Technologies, 86: 59-77. 
[[DOI](http:\u002F\u002Fdoi.org\u002F10.1016\u002Fj.trc.2017.10.023)] [[Data](http:\u002F\u002Fdoi.org\u002F10.5281\u002Fzenodo.1205229)]\n\n  >This project is from the above papers, please cite these papers if they help your research.\n\n\u003Cbr>\n\nCollaborators\n--------------\n\n\u003Ctable>\n  \u003Ctr>\n    \u003Ctd align=\"center\">\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fxinychen\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fxinychen_transdim_readme_03e40d1eda8d.png\" width=\"80px;\" alt=\"Xinyu Chen\"\u002F>\u003Cbr \u002F>\u003Csub>\u003Cb>Xinyu Chen\u003C\u002Fb>\u003C\u002Fsub>\u003C\u002Fa>\u003Cbr \u002F>\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fxinychen\u002Ftransdim\u002Fcommits?author=xinychen\" title=\"Code\">💻\u003C\u002Fa>\u003C\u002Ftd>\n    \u003Ctd align=\"center\">\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fyangjm67\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fxinychen_transdim_readme_6263f9634164.png\" width=\"80px;\" alt=\"Jinming Yang\"\u002F>\u003Cbr \u002F>\u003Csub>\u003Cb>Jinming Yang\u003C\u002Fb>\u003C\u002Fsub>\u003C\u002Fa>\u003Cbr \u002F>\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fxinychen\u002Ftransdim\u002Fcommits?author=yangjm67\" title=\"Code\">💻\u003C\u002Fa>\u003C\u002Ftd>\n    \u003Ctd align=\"center\">\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fyxnchen\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fxinychen_transdim_readme_98eb97d62114.png\" width=\"80px;\" alt=\"Yixian Chen\"\u002F>\u003Cbr \u002F>\u003Csub>\u003Cb>Yixian Chen\u003C\u002Fb>\u003C\u002Fsub>\u003C\u002Fa>\u003Cbr \u002F>\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fxinychen\u002Ftransdim\u002Fcommits?author=yxnchen\" title=\"Code\">💻\u003C\u002Fa>\u003C\u002Ftd>\n    \u003Ctd align=\"center\">\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FMengyingLei\">\u003Cimg 
src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fxinychen_transdim_readme_0c44c91a5445.png\" width=\"80px;\" alt=\"Mengying Lei\"\u002F>\u003Cbr \u002F>\u003Csub>\u003Cb>Mengying Lei\u003C\u002Fb>\u003C\u002Fsub>\u003C\u002Fa>\u003Cbr \u002F>\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fxinychen\u002Ftransdim\u002Fcommits?author=MengyingLei\" title=\"Code\">💻\u003C\u002Fa>\u003C\u002Ftd>\n\u003C!--     \u003Ctd align=\"center\">\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Flijunsun\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fxinychen_transdim_readme_0cbf529ecc19.png\" width=\"80px;\" alt=\"Lijun Sun\"\u002F>\u003Cbr \u002F>\u003Csub>\u003Cb>Lijun Sun\u003C\u002Fb>\u003C\u002Fsub>\u003C\u002Fa>\u003Cbr \u002F>\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fxinychen\u002Ftransdim\u002Fcommits?author=lijunsun\" title=\"Code\">💻\u003C\u002Fa>\u003C\u002Ftd>\n    \u003Ctd align=\"center\">\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FHanTY\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fxinychen_transdim_readme_965df20277e8.png\" width=\"80px;\" alt=\"Tianyang Han\"\u002F>\u003Cbr \u002F>\u003Csub>\u003Cb>Tianyang Han\u003C\u002Fb>\u003C\u002Fsub>\u003C\u002Fa>\u003Cbr \u002F>\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fxinychen\u002Ftransdim\u002Fcommits?author=HanTY\" title=\"Code\">💻\u003C\u002Fa>\u003C\u002Ftd> -->\n\u003C!--   \u003C\u002Ftr>\n  \u003Ctr>\n    \u003Ctd align=\"center\">\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fxxxx\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fxinychen_transdim_readme_252fc7400e00.png\" width=\"100px;\" alt=\"xxxx\"\u002F>\u003Cbr \u002F>\u003Csub>\u003Cb>xxxx\u003C\u002Fb>\u003C\u002Fsub>\u003C\u002Fa>\u003Cbr \u002F>\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fxinychen\u002Ftransdim\u002Fcommits?author=xxxx\" title=\"Code\">💻\u003C\u002Fa>\u003C\u002Ftd> -->\n  \u003C\u002Ftr>\n\u003C\u002Ftable>\n\n- **Advisory 
Board**\n\n\u003Ctable>\n  \u003Ctr>\n    \u003Ctd align=\"center\">\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Flijunsun\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fxinychen_transdim_readme_0cbf529ecc19.png\" width=\"80px;\" alt=\"Lijun Sun\"\u002F>\u003Cbr \u002F>\u003Csub>\u003Cb>Lijun Sun\u003C\u002Fb>\u003C\u002Fsub>\u003C\u002Fa>\u003Cbr \u002F>\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fxinychen\u002Ftransdim\u002Fcommits?author=lijunsun\" title=\"Code\">💻\u003C\u002Fa>\u003C\u002Ftd>\n    \u003Ctd align=\"center\">\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fnsaunier\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fxinychen_transdim_readme_c4f2a8ee2849.png\" width=\"80px;\" alt=\"Nicolas Saunier\"\u002F>\u003Cbr \u002F>\u003Csub>\u003Cb>Nicolas Saunier\u003C\u002Fb>\u003C\u002Fsub>\u003C\u002Fa>\u003Cbr \u002F>\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fxinychen\u002Ftransdim\u002Fcommits?author=nsaunier\" title=\"Code\">💻\u003C\u002Fa>\u003C\u002Ftd>\n  \u003C\u002Ftr>\n\u003C\u002Ftable>\n\n> See the list of [contributors](https:\u002F\u002Fgithub.com\u002Fxinychen\u002Ftransdim\u002Fgraphs\u002Fcontributors) who participated in this project.\n\n\u003Cbr>\n\nSupported by\n--------------\n\n\u003Ca href=\"https:\u002F\u002Fivado.ca\u002Fen\">\n\u003Cimg align=\"middle\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fxinychen_transdim_readme_4e2642f2954b.jpeg\" alt=\"drawing\" height=\"70\" hspace=\"50\">\n\u003C\u002Fa>\n\u003Ca href=\"https:\u002F\u002Fwww.cirrelt.ca\u002F\">\n\u003Cimg align=\"middle\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fxinychen_transdim_readme_372181aa2745.png\" alt=\"drawing\" height=\"50\">\n\u003C\u002Fa>\n\n\u003Cbr>\n\nLicense\n--------------\n\nThis work is released under the MIT license.\n","# transdim\n\n[![MIT 
License](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Flicense-MIT-green.svg)](https:\u002F\u002Fopensource.org\u002Flicenses\u002FMIT)\n![Python 3.7](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FPython-3.7-blue.svg)\n[![repo size](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Frepo-size\u002Fxinychen\u002Ftransdim.svg)](https:\u002F\u002Fgithub.com\u002Fxinychen\u002Ftransdim\u002Farchive\u002Fmaster.zip)\n[![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fxinychen\u002Ftransdim.svg?logo=github&label=Stars&logoColor=white)](https:\u002F\u002Fgithub.com\u002Fxinychen\u002Ftransdim)\n\n\u003Ch6 align=\"center\">由陈新宇制作 • :globe_with_meridians: \u003Ca href=\"https:\u002F\u002Fxinychen.github.io\">https:\u002F\u002Fxinychen.github.io\u003C\u002Fa>\u003C\u002Fh6>\n\n![logo](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fxinychen_transdim_readme_93d8f1c81e51.png)\n\n**Trans**portation **d**ata **im**putation（即 transdim，交通数据补全）。 \n\n机器学习模型在时空数据（spatiotemporal data）建模领域取得了重要进展——例如如何预测路网近期的交通状态。但是，当这些模型建立在从现实世界系统（如交通系统）中普遍收集的不完整数据之上时，会发生什么呢？\n\n\u003Cbr>\n\n目录\n--------------\n\n- [关于本项目](https:\u002F\u002Fgithub.com\u002Fxinychen\u002Ftransdim?tab=readme-ov-file#about-this-project)\n- [任务与挑战](https:\u002F\u002Fgithub.com\u002Fxinychen\u002Ftransdim?tab=readme-ov-file#tasks-and-challenges)\n- [实现](https:\u002F\u002Fgithub.com\u002Fxinychen\u002Ftransdim?tab=readme-ov-file#quick-start)\n- [快速开始](https:\u002F\u002Fgithub.com\u002Fxinychen\u002Ftransdim?tab=readme-ov-file#quick-start)\n- [文档](https:\u002F\u002Fgithub.com\u002Fxinychen\u002Ftransdim?tab=readme-ov-file#documentation)\n- [出版物](https:\u002F\u002Fgithub.com\u002Fxinychen\u002Ftransdim?tab=readme-ov-file#publications)\n- [贡献者](https:\u002F\u002Fgithub.com\u002Fxinychen\u002Ftransdim?tab=readme-ov-file#collaborators)\n\n\u003Cbr>\n\n关于本项目\n--------------\n\n在 **transdim** 项目中，我们开发机器学习模型以帮助解决时空数据建模的一些最严峻挑战——从缺失数据补全（missing data imputation）到时间序列预测（time series 
prediction）。本项目的战略目标是**为时空交通数据补全和预测任务创建准确且高效的解决方案**。\n\n赶时间吗？请查看以下内容。\n\n\u003Cbr>\n\n任务与挑战\n--------------\n\n> 缺失数据就在那里，无论我们是否喜欢它们。真正有趣的问题是如何处理不完整的数据。\n\n\u003Cp align=\"center\">\n\u003Cimg align=\"middle\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fxinychen_transdim_readme_b08f12ab5b7e.png\" width=\"800\" \u002F>\n\u003C\u002Fp>\n\n\u003Cp align = \"center\">\n\u003Cb>图 1\u003C\u002Fb>: 时空设置下的两种经典缺失模式。\n\u003C\u002Fp>\n\n我们在真实世界数据上创建了三种缺失数据机制。\n\n- **缺失数据补全** 🔥\n\n  - 随机缺失 (Random Missing, RM): 每个传感器完全随机地丢失观测值。(★★★)\n  - 非随机缺失 (Non-random Missing, NM): 每个传感器在几天内丢失观测值。(★★★★)\n  - 阻断缺失 (Blockout Missing, BM): 所有传感器在几个连续时间点丢失观测值。(★★★★)\n\n\u003Cp align=\"center\">\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fxinychen_transdim_readme_9841e8a82838.png\" alt=\"drawing\" width=\"800\"\u002F>\n\u003C\u002Fp>\n\n\u003Cp align = \"center\">\n\u003Cb>图 2\u003C\u002Fb>: 用于时空缺失交通数据补全的张量补全（Tensor Completion）框架。\n\u003C\u002Fp>\n\n- **时空预测** 🔥\n  - 无缺失值预测。(★★★)\n  - 基于不完整观测值的预测。(★★★★★)\n\n\u003Cp align=\"center\">\n\u003Cimg align=\"middle\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fxinychen_transdim_readme_ba6d30e2785b.png\" width=\"700\" \u002F>\n\u003C\u002Fp>\n\n\u003Cp align = \"center\">\n\u003Cb>图 3\u003C\u002Fb>: 我们提出的低秩自回归张量补全（Low-Rank Autoregressive Tensor Completion, LATC）补全器\u002F预测器的示意图（绿色节点：观测值；白色节点：缺失值；红色节点\u002F面板：预测；蓝色面板：用于构建张量的训练数据）。\n\u003C\u002Fp>\n\n\u003Cbr>\n\n实现\n--------------\n\n### 公开数据\n\n在本项目中，我们将一些公开可用的数据集适配到了我们的实验中。这些数据的原始链接总结如下，\n\n- **多元时间序列 (Multivariate time series)**\n  - [伯明翰停车数据集](https:\u002F\u002Farchive.ics.uci.edu\u002Fml\u002Fdatasets\u002FParking+Birmingham)\n  - [加州 PeMS 交通速度数据集](https:\u002F\u002Fdoi.org\u002F10.5281\u002Fzenodo.3939792) (大规模)\n  - [广州城市交通速度数据集](https:\u002F\u002Fdoi.org\u002F10.5281\u002Fzenodo.1205228)\n  - [杭州地铁客流数据集](https:\u002F\u002Fdoi.org\u002F10.5281\u002Fzenodo.3145403)\n  - [伦敦城市移动速度数据集](https:\u002F\u002Fmovement.uber.com\u002F) (其他城市的数据也可在 
[Uber Movement 项目](https:\u002F\u002Fmovement.uber.com\u002F) 获取)\n  - [波特兰高速公路交通数据集](https:\u002F\u002Fportal.its.pdx.edu\u002Fhome) (包括交通流量\u002F速度\u002F占有率，参见 [数据文档](https:\u002F\u002Fportal.its.pdx.edu\u002Fstatic\u002Ffiles\u002Ffhwa\u002FFreeway%20Data%20Documentation.pdf))\n  - [西雅图高速公路交通速度数据集](https:\u002F\u002Fgithub.com\u002Fzhiyongc\u002FSeattle-Loop-Data)\n- **多维时间序列 (Multidimensional time series)**\n  - [纽约市 (NYC) 出租车数据集](https:\u002F\u002Fwww1.nyc.gov\u002Fsite\u002Ftlc\u002Fabout\u002Ftlc-trip-record-data.page)\n  - [太平洋表面温度数据集](http:\u002F\u002Firidl.ldeo.columbia.edu\u002FSOURCES\u002F.CAC\u002F)\n\n例如，如果您想查看或使用这些数据集，请提前在 [..\u002Fdatasets\u002F](https:\u002F\u002Fgithub.com\u002Fxinychen\u002Ftransdim\u002Ftree\u002Fmaster\u002Fdatasets) 文件夹中下载它们，然后在您的 Python 控制台中运行以下代码：\n\n```python\nimport scipy.io\n\ntensor = scipy.io.loadmat('..\u002Fdatasets\u002FGuangzhou-data-set\u002Ftensor.mat')\ntensor = tensor['tensor']\n```\n\n特别是，如果您对大规模交通数据感兴趣，我们推荐 **PeMS-4W\u002F8W\u002F12W** 和 [UTD19](https:\u002F\u002Futd19.ethz.ch\u002Findex.html)。对于 PeMS 数据，您可以从 [Zenodo](https:\u002F\u002Fdoi.org\u002F10.5281\u002Fzenodo.3939792) 下载数据并将其放置在 datasets 文件夹中（数据路径示例：`..\u002Fdatasets\u002FCalifornia-data-set\u002Fpems-4w.csv`）。然后您可以使用 `Pandas` 打开数据：\n\n```python\nimport pandas as pd\n\ndata = pd.read_csv('..\u002Fdatasets\u002FCalifornia-data-set\u002Fpems-4w.csv', header = None)\n```\n\n对于模型评估，我们将“观测”数据中的某些条目掩码为缺失值，然后对这些“缺失”值执行补全操作。\n\n### 模型实现\n\n在我们的实验中，我们主要在 `Numpy` 上实现了一些机器学习模型，并使用 **Jupyter Notebook** 编写了这些 Python 代码。如果您想评估这些模型，请直接下载并运行这些笔记本（前提：提前下载**数据集**）。在下方的实现中，我们在可读性和效率方面改进了 Python 代码（在 Jupyter Notebook 中）。\n\n> 我们提出的模型以粗体显示。\n\n- **imputer**（插补模型）\n\n| Notebook                                        | 广州 | 伯明翰 | 杭州 | 西雅图 | 伦敦 | 纽约 | 太平洋 |\n| :----------------------------------------------------------- | :---: | :---: | :---: | :---: | :---: | :---: | :---: |\n| 
[BPMF](https:\u002F\u002Fnbviewer.jupyter.org\u002Fgithub\u002Fxinychen\u002Ftransdim\u002Fblob\u002Fmaster\u002Fimputer\u002FBPMF.ipynb) |   ✅   |   ✅   |   ✅   |   ✅   |   ✅   |   🔶   |   🔶   |\n| [TRMF](https:\u002F\u002Fnbviewer.jupyter.org\u002Fgithub\u002Fxinychen\u002Ftransdim\u002Fblob\u002Fmaster\u002Fimputer\u002FTRMF.ipynb) |   ✅   |   🔶   |   ✅   |   ✅   |   ✅   |   🔶   |   🔶   |\n| [BTRMF](https:\u002F\u002Fnbviewer.jupyter.org\u002Fgithub\u002Fxinychen\u002Ftransdim\u002Fblob\u002Fmaster\u002Fimputer\u002FBTRMF.ipynb) |   ✅   |   🔶   |   ✅   |   ✅   |   ✅   |   🔶   |   🔶   |\n| [**BTMF**](https:\u002F\u002Fnbviewer.jupyter.org\u002Fgithub\u002Fxinychen\u002Ftransdim\u002Fblob\u002Fmaster\u002Fimputer\u002FBTMF.ipynb) |   ✅   |   ✅   |   ✅   |   ✅   |   ✅   |   🔶   |   🔶   |\n| [**BGCP**](https:\u002F\u002Fnbviewer.jupyter.org\u002Fgithub\u002Fxinychen\u002Ftransdim\u002Fblob\u002Fmaster\u002Fimputer\u002FBGCP.ipynb) |   ✅   |   ✅   |   ✅   |   ✅   |   ✅   |   ✅   |   ✅   |\n| [**BATF**](https:\u002F\u002Fnbviewer.jupyter.org\u002Fgithub\u002Fxinychen\u002Ftransdim\u002Fblob\u002Fmaster\u002Fimputer\u002FBATF.ipynb) |   ✅   |   ✅   |   ✅   |   ✅   |   ✅   |   ✅   |   ✅   |\n| [**BTTF**](https:\u002F\u002Fnbviewer.jupyter.org\u002Fgithub\u002Fxinychen\u002Ftransdim\u002Fblob\u002Fmaster\u002Fimputer\u002FBTTF.ipynb) |   🔶   |   🔶   |   🔶   |   🔶   |   🔶   |   ✅   |   ✅   |\n| [HaLRTC](https:\u002F\u002Fnbviewer.jupyter.org\u002Fgithub\u002Fxinychen\u002Ftransdim\u002Fblob\u002Fmaster\u002Fimputer\u002FHaLRTC.ipynb) |   ✅   |   🔶   |   ✅   |   ✅   |   ✅   |   ✅   |   ✅   |\n| [**LRTC-TNN**](https:\u002F\u002Fnbviewer.org\u002Fgithub\u002Fxinychen\u002Ftransdim\u002Fblob\u002Fmaster\u002Fimputer\u002FLRTC-TNN.ipynb) |   ✅   |   🔶   |   ✅   |   ✅   |   🔶   |   🔶   |   🔶   |\n\n- **predictor**（预测模型）\n\n| Notebook                                        | 广州 | 伯明翰 | 杭州 | 西雅图 | 伦敦 | 纽约 | 太平洋 |\n| :----------------------------------------------------------- | 
:---: | :---: | :---: | :---: | :---: | :---: | :---: |\n| [TRMF](https:\u002F\u002Fnbviewer.jupyter.org\u002Fgithub\u002Fxinychen\u002Ftransdim\u002Fblob\u002Fmaster\u002Fpredictor\u002FTRMF.ipynb) |   ✅   |   🔶   |   ✅   |   ✅   |   ✅   |   🔶   |   🔶   |\n| [BTRMF](https:\u002F\u002Fnbviewer.jupyter.org\u002Fgithub\u002Fxinychen\u002Ftransdim\u002Fblob\u002Fmaster\u002Fpredictor\u002FBTRMF.ipynb) |   ✅   |   🔶   |   ✅   |   ✅   |   ✅   |   🔶   |   🔶   |\n| [BTRTF](https:\u002F\u002Fnbviewer.jupyter.org\u002Fgithub\u002Fxinychen\u002Ftransdim\u002Fblob\u002Fmaster\u002Fpredictor\u002FBTRTF.ipynb) |   🔶   |   🔶   |   🔶   |   🔶   |   🔶   |   ✅   |   ✅   |\n| [**BTMF**](https:\u002F\u002Fnbviewer.jupyter.org\u002Fgithub\u002Fxinychen\u002Ftransdim\u002Fblob\u002Fmaster\u002Fpredictor\u002FBTMF.ipynb) |   ✅   |   🔶   |   ✅   |   ✅   |   ✅   |   ✅   |   ✅   |\n| [**BTTF**](https:\u002F\u002Fnbviewer.jupyter.org\u002Fgithub\u002Fxinychen\u002Ftransdim\u002Fblob\u002Fmaster\u002Fpredictor\u002FBTTF.ipynb) |   🔶   |   🔶   |   🔶   |   🔶   |   🔶   |   ✅   |   ✅   |\n\n* ✅ — 覆盖\n* 🔶 — 未覆盖\n* 🚧 — 开发中\n\n> 对于这些模型的实现，我们同时使用 `dense_mat`（稠密矩阵）和 `sparse_mat`（稀疏矩阵）（或 `dense_tensor`（稠密张量）和 `sparse_tensor`（稀疏张量））作为输入。但是，如果您不希望看到迭代过程中的插补\u002F预测性能，则无需如此操作，您可以从这些算法的输入中移除 `dense_mat`（或 `dense_tensor`）。\n\n### 插补\u002F预测性能\n\n- **插补示例（基于广州数据）**\n\n![example](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fxinychen_transdim_readme_7d1cd6cdd3a9.png)\n  *(a) 8 月 1 日至 14 日两周内实际与估计速度的时间序列。*\n\n![example](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fxinychen_transdim_readme_05026e33dcdd.png)\n  *(b) 9 月 12 日至 25 日两周内实际与估计速度的时间序列。*\n\n> BGCP 的插补性能（CP rank r=15 且缺失率α=30%），在三阶张量表示下的纤维缺失场景中，选取路段 #1 的估计结果作为示例。在这两个面板中，红色矩形代表纤维缺失（即全天速度观测值丢失）。\n\n- 
**预测示例**\n\n![example](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fxinychen_transdim_readme_cdcd9b3ac7cb.png)\n\n![example](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fxinychen_transdim_readme_d8b56735134f.png)\n\n![example](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fxinychen_transdim_readme_f5ac9b95454a.png)\n\n\u003Cbr>\n\n快速开始\n--------------\n这是一个低秩张量补全与截断核范数最小化 (LRTC-TNN) 的插补示例。值得注意的是，与我们论文中的复杂公式不同，我们的 Python 实现非常容易使用。\n\n- 首先，导入一些必要的包：\n\n```python\nimport numpy as np\nfrom numpy.linalg import inv as inv\n```\n\n- 使用 `Numpy` 定义张量展开 (`ten2mat`) 和矩阵折叠 (`mat2ten`) 算子：\n\n```python\ndef ten2mat(tensor, mode):\n    return np.reshape(np.moveaxis(tensor, mode, 0), (tensor.shape[mode], -1), order = 'F')\n```\n\n```python\ndef mat2ten(mat, tensor_size, mode):\n    index = list()\n    index.append(mode)\n    for i in range(tensor_size.shape[0]):\n        if i != mode:\n            index.append(i)\n    return np.moveaxis(np.reshape(mat, list(tensor_size[index]), order = 'F'), 0, mode)\n```\n\n- 为截断核范数 (TNN) 最小化定义奇异值阈值处理 (SVT)：\n\n```python\ndef svt_tnn(mat, tau, theta):\n    [m, n] = mat.shape\n    if 2 * m \u003C n:\n        u, s, v = np.linalg.svd(mat @ mat.T, full_matrices = 0)\n        s = np.sqrt(s)\n        idx = np.sum(s > tau)\n        mid = np.zeros(idx)\n        mid[:theta] = 1\n        mid[theta:idx] = (s[theta:idx] - tau) \u002F s[theta:idx]\n        return (u[:,:idx] @ np.diag(mid)) @ (u[:,:idx].T @ mat)\n    elif m > 2 * n:\n        return svt_tnn(mat.T, tau, theta).T\n    u, s, v = np.linalg.svd(mat, full_matrices = 0)\n    idx = np.sum(s > tau)\n    vec = s[:idx].copy()\n    vec[theta:] = s[theta:] - tau\n    return u[:,:idx] @ np.diag(vec) @ v[:idx,:]\n```\n\n- 定义性能指标（即 RMSE, MAPE）：\n\n```python\ndef compute_rmse(var, var_hat):\n    return np.sqrt(np.sum((var - var_hat) ** 2) \u002F var.shape[0])\n\ndef compute_mape(var, var_hat):\n    return np.sum(np.abs(var - var_hat) \u002F var) \u002F var.shape[0]\n```\n\n- 定义 
LRTC-TNN：\n\n```python\ndef LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter):\n    \"\"\"Low-Rank Tensor Completion with Truncated Nuclear Norm, LRTC-TNN.\"\"\"\n    \n    dim = np.array(sparse_tensor.shape)\n    pos_missing = np.where(sparse_tensor == 0)\n    pos_test = np.where((dense_tensor != 0) & (sparse_tensor == 0))\n    dense_test = dense_tensor[pos_test]\n    del dense_tensor\n    \n    X = np.zeros(np.insert(dim, 0, len(dim))) # \boldsymbol{\mathcal{X}}\n    T = np.zeros(np.insert(dim, 0, len(dim))) # \boldsymbol{\mathcal{T}}\n    Z = sparse_tensor.copy()\n    last_tensor = sparse_tensor.copy()\n    snorm = np.sqrt(np.sum(sparse_tensor ** 2))\n    it = 0\n    while True:\n        rho = min(rho * 1.05, 1e5)\n        for k in range(len(dim)):\n            X[k] = mat2ten(svt_tnn(ten2mat(Z - T[k] \u002F rho, k), alpha[k] \u002F rho, int(np.ceil(theta * dim[k]))), dim, k)\n        Z[pos_missing] = np.mean(X + T \u002F rho, axis = 0)[pos_missing]\n        T = T + rho * (X - np.broadcast_to(Z, np.insert(dim, 0, len(dim))))\n        tensor_hat = np.einsum('k, kmnt -> mnt', alpha, X)\n        tol = np.sqrt(np.sum((tensor_hat - last_tensor) ** 2)) \u002F snorm\n        last_tensor = tensor_hat.copy()\n        it += 1\n        if (it + 1) % 50 == 0:\n            print('Iter: {}'.format(it + 1))\n            print('MAPE: {:.6}'.format(compute_mape(dense_test, tensor_hat[pos_test])))\n            print('RMSE: {:.6}'.format(compute_rmse(dense_test, tensor_hat[pos_test])))\n            print()\n        if (tol \u003C epsilon) or (it >= maxiter):\n            break\n\n    print('Imputation MAPE: {:.6}'.format(compute_mape(dense_test, tensor_hat[pos_test])))\n    print('Imputation RMSE: {:.6}'.format(compute_rmse(dense_test, tensor_hat[pos_test])))\n    print()\n    \n    return tensor_hat\n```\n\n- 让我们尝试在广州城市交通速度数据集上运行：\n\n```python\nimport scipy.io\nimport numpy as np\nnp.random.seed(1000)\n\ndense_tensor = 
scipy.io.loadmat('..\u002Fdatasets\u002FGuangzhou-data-set\u002Ftensor.mat')['tensor']\ndim = dense_tensor.shape\nmissing_rate = 0.2 # Random missing (RM)\nsparse_tensor = dense_tensor * np.round(np.random.rand(dim[0], dim[1], dim[2]) + 0.5 - missing_rate)\n```\n\n- 运行补全 (Imputation) 实验：\n\n```python\nimport time\nstart = time.time()\nalpha = np.ones(3) \u002F 3\nrho = 1e-5\ntheta = 0.30\nepsilon = 1e-4\nmaxiter = 200\ntensor_hat = LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter)\nend = time.time()\nprint('Running time: %d seconds'%(end - start))\n```\n\n> 此示例来自 [..\u002Fimputer\u002FLRTC-TNN.ipynb](https:\u002F\u002Fnbviewer.org\u002Fgithub\u002Fxinychen\u002Ftransdim\u002Fblob\u002Fmaster\u002Fimputer\u002FLRTC-TNN.ipynb)，您可以查看该 Jupyter Notebook 以获取详细信息。\n\n\u003Cbr>\n\nDocumentation\n--------------\n\n1. [随机奇异值分解的直观理解 (Intuitive Understanding of Randomized Singular Value Decomposition)](https:\u002F\u002Ftowardsdatascience.com\u002Fintuitive-understanding-of-randomized-singular-value-decomposition-9389e27cb9de). 2020 年 7 月 1 日。\n2. [Matlab 和 Numpy 中生成随机数和数组 (Generating Random Numbers and Arrays in Matlab and Numpy)](https:\u002F\u002Ftowardsdatascience.com\u002Fgenerating-random-numbers-and-arrays-in-matlab-and-numpy-47dcc9997650). 2021 年 10 月 9 日。\n3. [用于高维时间序列预测的低秩向量自回归模型 (Reduced-Rank Vector Autoregressive Model for High-Dimensional Time Series Forecasting)](https:\u002F\u002Ftowardsdatascience.com\u002Freduced-rank-vector-autoregressive-model-for-high-dimensional-time-series-forecasting-bdd17df6c5ab). 2021 年 10 月 16 日。\n4. [西雅图高速公路时空交通速度时间序列的动态模态分解 (Dynamic Mode Decomposition for Spatiotemporal Traffic Speed Time Series in Seattle Freeway)](https:\u002F\u002Ftowardsdatascience.com\u002Fdynamic-mode-decomposition-for-spatiotemporal-traffic-speed-time-series-in-seattle-freeway-b0ba97e81c2c#ce4e-5f7c3f01d622). 2021 年 10 月 29 日。\n5. 
[分析 Uber 移动速度数据中的缺失数据问题 (Analyzing Missing Data Problem in Uber Movement Speed Data)](https:\u002F\u002Fmedium.com\u002F@xinyu.chen\u002Fanalyzing-missing-data-problem-in-uber-movement-speed-data-208d7a126af5). 2022 年 2 月 14 日。\n6. [使用共轭梯度法求解矩阵方程 (Using Conjugate Gradient to Solve Matrix Equations)](https:\u002F\u002Fmedium.com\u002Fp\u002F7f16cbae18a3). 2022 年 2 月 23 日。\n7. [使用张量分解进行流体动力学修复 (NumPy) (Inpainting Fluid Dynamics with Tensor Decomposition (NumPy))](https:\u002F\u002Fmedium.com\u002Fp\u002Fd84065fead4d). 2022 年 3 月 15 日。\n8. [用于多元时间序列预测的时间矩阵分解 (Temporal Matrix Factorization for Multivariate Time Series Forecasting)](https:\u002F\u002Fmedium.com\u002Fp\u002Fb1c59faf05ea). 2022 年 3 月 20 日。\n9. [使用非平稳时间矩阵分解预测多元时间序列 (Forecasting Multivariate Time Series with Nonstationary Temporal Matrix Factorization)](https:\u002F\u002Fmedium.com\u002Fp\u002F4705df163fcf). 2022 年 4 月 25 日。\n10. [使用 NumPy 实现克罗内克积分解 (Implementing Kronecker Product Decomposition with NumPy)](https:\u002F\u002Fmedium.com\u002Fp\u002F13f679f76347). 2022 年 6 月 20 日。\n11. [张量自回归：一种多维时间序列模型 (Tensor Autoregression: A Multidimensional Time Series Model)](https:\u002F\u002Fmedium.com\u002Fp\u002F21681f696d79). 2022 年 9 月 3 日。\n12. [在 Python 中复现流体流动数据的动态模态分解 (Reproducing Dynamic Mode Decomposition on Fluid Flow Data in Python)](https:\u002F\u002Fmedium.com\u002F@xinyu.chen\u002Freproducing-dynamic-mode-decomposition-on-fluid-flow-data-in-python-94b8d7e1f203). 2022 年 9 月 6 日。\n13. [用于时间序列建模的卷积核范数最小化 (Convolution Nuclear Norm Minimization for Time Series Modeling)](https:\u002F\u002Fmedium.com\u002Fp\u002F377c56e49962). 2022 年 10 月 3 日。\n14. [强化矩阵因子化用于时间序列建模：概率顺序矩阵因子化 (Reinforce Matrix Factorization for Time Series Modeling: Probabilistic Sequential Matrix Factorization)](https:\u002F\u002Fmedium.com\u002Fp\u002F873f4ca344de). 2022 年 10 月 5 日。\n15. 
[离散卷积和快速傅里叶变换逐步解释与实现 (Discrete Convolution and Fast Fourier Transform Explained and Implemented Step by Step)](https:\u002F\u002Fmedium.com\u002Fp\u002F83ff1809378d). 2022 年 10 月 19 日。\n16. [Python 中用于图像修复的矩阵因子化 (Matrix Factorization for Image Inpainting in Python)](https:\u002F\u002Fmedium.com\u002Fp\u002Fd7300e6afbfd). 2022 年 12 月 8 日。\n17. [Python 中用于图像修复的循环矩阵核范数最小化 (Circulant Matrix Nuclear Norm Minimization for Image Inpainting in Python)](https:\u002F\u002Fmedium.com\u002Fp\u002Fb98eb94d8e). 2022 年 12 月 9 日。\n18. [用于时间序列补全和图像修复的低秩拉普拉斯卷积模型 (Low-Rank Laplacian Convolution Model for Time Series Imputation and Image Inpainting)](https:\u002F\u002Fmedium.com\u002Fp\u002Fa46dd88d107e). 2022 年 12 月 10 日。\n19. [用于彩色图像修复的低秩拉普拉斯卷积模型 (Low-Rank Laplacian Convolution Model for Color Image Inpainting)](https:\u002F\u002Fmedium.com\u002Fp\u002Fe8c5cdb3cc73). 2022 年 12 月 17 日。\n20. [机器学习张量的直观理解 (Intuitive Understanding of Tensors in Machine Learning)](https:\u002F\u002Fmedium.com\u002F@xinyu.chen\u002Fintuitive-understanding-of-tensors-in-machine-learning-33635c64b596). 2023 年 1 月 20 日。\n21. [用于速度场重建的低秩矩阵和张量因子化 (Low-Rank Matrix and Tensor Factorization for Speed Field Reconstruction)](https:\u002F\u002Fmedium.com\u002Fp\u002Fbb4807cb93c5). 2023 年 3 月 9 日。\n22. [贝叶斯向量自回归预测 (Bayesian Vector Autoregression Forecasting)](https:\u002F\u002Fnbviewer.jupyter.org\u002Fgithub\u002Fxinychen\u002Ftransdim\u002Fblob\u002Fmaster\u002Ftoy-examples\u002FBayesian-VAR-forecasting.ipynb)\n23. [结构化低秩矩阵补全 (Structured Low-Rank Matrix Completion)](https:\u002F\u002Fnbviewer.jupyter.org\u002Fgithub\u002Fxinychen\u002Ftransdim\u002Fblob\u002Fmaster\u002Ftoy-examples\u002FSLRMC.ipynb)\n\n\u003Cbr>\n\nPublications\n--------------\n\n- Xinyu Chen, Zhanhong Cheng, HanQin Cai, Nicolas Saunier, Lijun Sun (2024). **用于交通时间序列补全的拉普拉斯卷积表示 (Laplacian Convolutional Representation for Traffic Time Series Imputation)**. IEEE Transactions on Knowledge and Data Engineering. 36 (11): 6490-6502. 
[[DOI](https:\u002F\u002Fdoi.org\u002F10.1109\u002FTKDE.2024.3419698)] [[幻灯片](https:\u002F\u002Fxinychen.github.io\u002Fslides\u002FLCR24.pdf)] [[数据与 Python 代码](https:\u002F\u002Fgithub.com\u002Fxinychen\u002FLCR)]\n\n- Xinyu Chen, Lijun Sun (2022). **用于多维时间序列预测的贝叶斯时间因子分解**。IEEE 模式分析与机器智能汇刊，44 (9): 4659-4673。[[预印本](https:\u002F\u002Farxiv.org\u002Fabs\u002F1910.06366v2)] [[DOI](https:\u002F\u002Fdoi.org\u002F10.1109\u002FTPAMI.2021.3066551)] [[演示文稿](https:\u002F\u002Fdoi.org\u002F10.5281\u002Fzenodo.4693404)] [[数据与 Python 代码](https:\u002F\u002Fgithub.com\u002Fxinychen\u002Ftransdim)]\n\n- Xinyu Chen, Mengying Lei, Nicolas Saunier, Lijun Sun (2022). **用于时空交通数据填补的低秩自回归张量补全**。IEEE 智能交通系统汇刊，23 (8): 12301-12310。[[预印本](https:\u002F\u002Farxiv.org\u002Fabs\u002F2104.14936)] [[DOI](https:\u002F\u002Fdoi.org\u002F10.1109\u002FTITS.2021.3113608)] [[数据与 Python 代码](https:\u002F\u002Fgithub.com\u002Fxinychen\u002Ftransdim)] (亦被 [KDD 2021 的 MiLeTS 研讨会](https:\u002F\u002Fkdd-milets.github.io\u002Fmilets2021\u002F) 部分接收，参见 [研讨会论文](https:\u002F\u002Fkdd-milets.github.io\u002Fmilets2021\u002Fpapers\u002FMiLeTS2021_paper_23.pdf))\n\n- Xinyu Chen, Yixian Chen, Nicolas Saunier, Lijun Sun (2021). **用于时空交通数据填补的可扩展低秩张量学习**。交通运输研究 C 辑：新兴技术，129: 103226。[[预印本](https:\u002F\u002Farxiv.org\u002Fabs\u002F2008.03194)] [[DOI](https:\u002F\u002Fdoi.org\u002F10.1016\u002Fj.trc.2021.103226)] [[数据](https:\u002F\u002Fdoi.org\u002F10.5281\u002Fzenodo.3939792)] [[Python 代码](https:\u002F\u002Fgithub.com\u002Fxinychen\u002Ftransdim\u002Ftree\u002Fmaster\u002Flarge-imputer)]\n\n- Xinyu Chen, Jinming Yang, Lijun Sun (2020). **用于时空交通数据填补的非凸低秩张量补全模型**。交通运输研究 C 辑：新兴技术，117: 102673。[[预印本](https:\u002F\u002Farxiv.org\u002Fabs\u002F2003.10271v2)] [[DOI](https:\u002F\u002Fdoi.org\u002F10.1016\u002Fj.trc.2020.102673)] [[数据与 Python 代码](https:\u002F\u002Fgithub.com\u002Fxinychen\u002Ftransdim)]\n\n- Xinyu Chen, Zhaocheng He, Yixian Chen, Yuhuan Lu, Jiawei Wang (2019). 
**基于贝叶斯增强张量因子分解模型的缺失交通数据填补与模式发现**。交通运输研究 C 辑：新兴技术，104: 66-77。[[DOI](https:\u002F\u002Fdoi.org\u002F10.1016\u002Fj.trc.2019.03.003)] [[演示文稿](https:\u002F\u002Fdoi.org\u002F10.5281\u002Fzenodo.2632552)] [[数据](http:\u002F\u002Fdoi.org\u002F10.5281\u002Fzenodo.1205229)] [[Matlab 代码](https:\u002F\u002Fgithub.com\u002Fsysuits\u002FBATF)] [[Python 代码](https:\u002F\u002Fgithub.com\u002Fxinychen\u002Ftransdim\u002Fblob\u002Fmaster\u002Fimputer\u002FBATF.ipynb)]\n\n- Xinyu Chen, Zhaocheng He, Lijun Sun (2019). **一种用于时空交通数据填补的贝叶斯张量分解方法**。交通运输研究 C 辑：新兴技术，98: 73-84。[[预印本](https:\u002F\u002Fwww.researchgate.net\u002Fpublication\u002F329177786_A_Bayesian_tensor_decomposition_approach_for_spatiotemporal_traffic_data_imputation)] [[DOI](https:\u002F\u002Fdoi.org\u002F10.1016\u002Fj.trc.2018.11.003)] [[数据](http:\u002F\u002Fdoi.org\u002F10.5281\u002Fzenodo.1205229)] [[Matlab 代码](https:\u002F\u002Fgithub.com\u002Flijunsun\u002Fbgcp_imputation)] [[Python 代码](https:\u002F\u002Fgithub.com\u002Fxinychen\u002Ftransdim\u002Fblob\u002Fmaster\u002Fexperiments\u002FImputation-BGCP.ipynb)]\n\n- Xinyu Chen, Zhaocheng He, Jiawei Wang (2018). 
**通过 SVD 结合张量分解进行时空交通速度模式发现与不完整数据恢复**。交通运输研究 C 辑：新兴技术，86: 59-77。[[DOI](http:\u002F\u002Fdoi.org\u002F10.1016\u002Fj.trc.2017.10.023)] [[数据](http:\u002F\u002Fdoi.org\u002F10.5281\u002Fzenodo.1205229)]\n\n  > 本项目源自上述论文，如果它们对您的研究有帮助，请引用这些论文。\n\n\u003Cbr>\n\n合作者\n--------------\n\n\u003Ctable>\n  \u003Ctr>\n    \u003Ctd align=\"center\">\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fxinychen\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fxinychen_transdim_readme_03e40d1eda8d.png\" width=\"80px;\" alt=\"Xinyu Chen\"\u002F>\u003Cbr \u002F>\u003Csub>\u003Cb>Xinyu Chen\u003C\u002Fb>\u003C\u002Fsub>\u003C\u002Fa>\u003Cbr \u002F>\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fxinychen\u002Ftransdim\u002Fcommits?author=xinychen\" title=\"代码\">💻\u003C\u002Fa>\u003C\u002Ftd>\n    \u003Ctd align=\"center\">\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fyangjm67\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fxinychen_transdim_readme_6263f9634164.png\" width=\"80px;\" alt=\"Jinming Yang\"\u002F>\u003Cbr \u002F>\u003Csub>\u003Cb>Jinming Yang\u003C\u002Fb>\u003C\u002Fsub>\u003C\u002Fa>\u003Cbr \u002F>\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fxinychen\u002Ftransdim\u002Fcommits?author=yangjm67\" title=\"代码\">💻\u003C\u002Fa>\u003C\u002Ftd>\n    \u003Ctd align=\"center\">\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fyxnchen\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fxinychen_transdim_readme_98eb97d62114.png\" width=\"80px;\" alt=\"Yixian Chen\"\u002F>\u003Cbr \u002F>\u003Csub>\u003Cb>Yixian Chen\u003C\u002Fb>\u003C\u002Fsub>\u003C\u002Fa>\u003Cbr \u002F>\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fxinychen\u002Ftransdim\u002Fcommits?author=yxnchen\" title=\"代码\">💻\u003C\u002Fa>\u003C\u002Ftd>\n    \u003Ctd align=\"center\">\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FMengyingLei\">\u003Cimg 
src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fxinychen_transdim_readme_0c44c91a5445.png\" width=\"80px;\" alt=\"Mengying Lei\"\u002F>\u003Cbr \u002F>\u003Csub>\u003Cb>Mengying Lei\u003C\u002Fb>\u003C\u002Fsub>\u003C\u002Fa>\u003Cbr \u002F>\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fxinychen\u002Ftransdim\u002Fcommits?author=MengyingLei\" title=\"代码\">💻\u003C\u002Fa>\u003C\u002Ftd>\n\u003C!--     \u003Ctd align=\"center\">\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Flijunsun\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fxinychen_transdim_readme_0cbf529ecc19.png\" width=\"80px;\" alt=\"Lijun Sun\"\u002F>\u003Cbr \u002F>\u003Csub>\u003Cb>Lijun Sun\u003C\u002Fb>\u003C\u002Fsub>\u003C\u002Fa>\u003Cbr \u002F>\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fxinychen\u002Ftransdim\u002Fcommits?author=lijunsun\" title=\"代码\">💻\u003C\u002Fa>\u003C\u002Ftd>\n    \u003Ctd align=\"center\">\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FHanTY\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fxinychen_transdim_readme_965df20277e8.png\" width=\"80px;\" alt=\"Tianyang Han\"\u002F>\u003Cbr \u002F>\u003Csub>\u003Cb>Tianyang Han\u003C\u002Fb>\u003C\u002Fsub>\u003C\u002Fa>\u003Cbr \u002F>\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fxinychen\u002Ftransdim\u002Fcommits?author=HanTY\" title=\"代码\">💻\u003C\u002Fa>\u003C\u002Ftd> -->\n\u003C!--   \u003C\u002Ftr>\n  \u003Ctr>\n    \u003Ctd align=\"center\">\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fxxxx\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fxinychen_transdim_readme_252fc7400e00.png\" width=\"100px;\" alt=\"xxxx\"\u002F>\u003Cbr \u002F>\u003Csub>\u003Cb>xxxx\u003C\u002Fb>\u003C\u002Fsub>\u003C\u002Fa>\u003Cbr \u002F>\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fxinychen\u002Ftransdim\u002Fcommits?author=xxxx\" title=\"代码\">💻\u003C\u002Fa>\u003C\u002Ftd> -->\n  \u003C\u002Ftr>\n\u003C\u002Ftable>\n\n- 
**指导委员会**\n\n\u003Ctable>\n  \u003Ctr>\n    \u003Ctd align=\"center\">\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Flijunsun\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fxinychen_transdim_readme_0cbf529ecc19.png\" width=\"80px;\" alt=\"Lijun Sun\"\u002F>\u003Cbr \u002F>\u003Csub>\u003Cb>Lijun Sun\u003C\u002Fb>\u003C\u002Fsub>\u003C\u002Fa>\u003Cbr \u002F>\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fxinychen\u002Ftransdim\u002Fcommits?author=lijunsun\" title=\"代码\">💻\u003C\u002Fa>\u003C\u002Ftd>\n    \u003Ctd align=\"center\">\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fnsaunier\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fxinychen_transdim_readme_c4f2a8ee2849.png\" width=\"80px;\" alt=\"Nicolas Saunier\"\u002F>\u003Cbr \u002F>\u003Csub>\u003Cb>Nicolas Saunier\u003C\u002Fb>\u003C\u002Fsub>\u003C\u002Fa>\u003Cbr \u002F>\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fxinychen\u002Ftransdim\u002Fcommits?author=nsaunier\" title=\"代码\">💻\u003C\u002Fa>\u003C\u002Ftd>\n  \u003C\u002Ftr>\n\u003C\u002Ftable>\n\n> 查看参与本项目的 [贡献者](https:\u002F\u002Fgithub.com\u002Fxinychen\u002Ftransdim\u002Fgraphs\u002Fcontributors) 列表。\n\n\u003Cbr>\n\n支持单位\n--------------\n\n\u003Ca href=\"https:\u002F\u002Fivado.ca\u002Fen\">\n\u003Cimg align=\"middle\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fxinychen_transdim_readme_4e2642f2954b.jpeg\" alt=\"drawing\" height=\"70\" hspace=\"50\">\n\u003C\u002Fa>\n\u003Ca href=\"https:\u002F\u002Fwww.cirrelt.ca\u002F\">\n\u003Cimg align=\"middle\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fxinychen_transdim_readme_372181aa2745.png\" alt=\"drawing\" height=\"50\">\n\u003C\u002Fa>\n\n\u003Cbr>\n\n许可证\n--------------\n\n本作品采用 MIT 许可证发布。","# transdim 快速上手指南\n\n**Transdim** (Transportation data imputation) 是一个专注于时空数据建模的机器学习项目，主要用于解决交通数据中的缺失值补全（Imputation）和时间序列预测（Prediction）任务。该项目基于 Numpy 实现，提供多种张量完成模型。\n\n## 环境准备\n\n- **操作系统**: Linux \u002F macOS \u002F 
Windows\n- **Python 版本**: >= 3.7\n- **核心依赖**:\n  - `numpy`\n  - `scipy`\n  - `pandas`\n  - `jupyter` (用于运行实验 Notebook)\n\n> 💡 **提示**: 国内开发者建议使用清华源加速安装：\n> ```bash\n> pip install -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple numpy scipy pandas jupyter\n> ```\n\n## 安装步骤\n\n1. **克隆仓库**\n   ```bash\n   git clone https:\u002F\u002Fgithub.com\u002Fxinychen\u002Ftransdim.git\n   cd transdim\n   ```\n\n2. **安装依赖**\n   确保已安装上述核心依赖包。\n\n3. **准备数据集**\n   本项目需要手动下载公开数据集并放置到 `datasets` 文件夹中。例如：\n   - 广州交通速度数据\n   - Birmingham 停车数据\n   - PeMS 交通流数据等\n   \n   下载后路径示例：`..\u002Fdatasets\u002FGuangzhou-data-set\u002Ftensor.mat`\n\n## 基本使用\n\n### 方式一：运行 Jupyter Notebook（推荐）\n\n项目提供了完整的模型评估 Notebook，适合直接查看结果和复现论文。\n\n1. 启动 Jupyter Notebook：\n   ```bash\n   jupyter notebook\n   ```\n2. 打开对应模型的 Notebook 文件（位于 `imputer\u002F` 或 `predictor\u002F` 目录）。\n3. 确保数据路径正确，点击运行单元格即可开始训练或补全。\n\n### 方式二：核心算法快速演示\n\n项目强调其 Python 实现比论文公式更简单。以下是核心的张量操作与奇异值阈值处理（SVT）代码片段，可直接在 Python 环境中运行：\n\n```python\nimport numpy as np\nfrom numpy.linalg import inv as inv\n\n# 定义张量展开算子\ndef ten2mat(tensor, mode):\n    return np.reshape(np.moveaxis(tensor, mode, 0), (tensor.shape[mode], -1), order = 'F')\n\n# 定义矩阵折叠算子\ndef mat2ten(mat, tensor_size, mode):\n    index = list()\n    index.append(mode)\n    for i in range(tensor_size.shape[0]):\n        if i != mode:\n            index.append(i)\n    return np.moveaxis(np.reshape(mat, list(tensor_size[index]), order = 'F'), 0, mode)\n\n# 定义截断核范数最小化的奇异值阈值处理 (SVT)\ndef svt_tnn(mat, tau, theta):\n    [m, n] = mat.shape\n    if 2 * m \u003C n:\n        u, s, v = np.linalg.svd(mat @ mat.T, full_matrices = 0)\n        s = np.sqrt(s)\n        idx = np.sum(s > tau)\n        mid = np.zeros(idx)\n        mid[: theta] = 1\n        mid[theta : idx] = (s[theta : idx] - tau) \u002F s[theta : idx]\n        return (u[:, : idx] @ np.diag(mid)) @ (u[:, : idx].T @ mat)\n    elif m > 2 * n:\n        # 注意：递归调用时必须传入 theta，否则会出现参数缺失错误\n        return svt_tnn(mat.T, tau, theta).T\n    u, s, v = np.linalg.svd(mat, full_matrices = 0)\n    idx = np.sum(s > tau)\n    vec = s[: idx].copy()\n    vec[theta : idx] = s[theta : idx] - tau\n    return u[:, : idx] @ np.diag(vec) @ v[: idx, :]\n```\n\n> **注意**: 上述代码为 README 中的核心逻辑片段。完整模型（如 BTMF, BGCP, LRTC-TNN 等）请通过 Jupyter Notebook 调用。\n\n### 数据加载示例\n\n```python\nimport scipy.io\n\ntensor = scipy.io.loadmat('..\u002Fdatasets\u002FGuangzhou-data-set\u002Ftensor.mat')\ntensor = tensor['tensor']\n```","某市智慧交通运营中心的数据团队负责监控全市主干道流量，但在设备维护期间常面临传感器数据大面积丢失的问题。\n\n### 没有 transdim 时\n- 
传感器网络不稳定导致时间序列出现大量随机空洞，简单的线性插值无法还原真实的拥堵突变。\n- 遇到连续数小时断网的“块状缺失”时，传统算法直接报错，导致早高峰预测任务完全中断。\n- 因数据不全强行训练模型，使得神经网络过拟合噪声，预测结果与实际路况偏差超过 30%。\n- 运维人员需手动清洗数据，耗费大量人力且难以保证多源异构数据的时空一致性。\n\n### 使用 transdim 后\n- transdim 基于低秩张量补全框架，能自动识别并修复随机、非随机及块状等多种缺失模式。\n- 即使部分路段传感器持续离线，系统也能利用周边路网的空间相关性精准还原缺失时刻的车流状态。\n- 内置的预测器支持在观测值不完整的情况下直接输出未来短时流量趋势，无需先完美补全再预测。\n- 大幅降低了人工干预成本，将数据可用性从不足 70% 提升至接近 100%，保障了信号调控的实时性。\n\ntransdim 通过解决时空数据缺失难题，让残缺的交通监测数据具备了高可用的预测能力。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fxinychen_transdim_955026e6.png","xinychen","Xinyu Chen","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Fxinychen_e1436a1b.jpg","\r\n","University of Montreal","Montreal, Canada",null,"https:\u002F\u002Fxinychen.github.io","https:\u002F\u002Fgithub.com\u002Fxinychen",[86],{"name":87,"color":88,"percentage":89},"Jupyter Notebook","#DA5B0B",100,1286,305,"2026-04-03T07:37:22","MIT","未说明",{"notes":96,"python":97,"dependencies":98},"需提前下载数据集至 datasets 文件夹；代码基于 Jupyter Notebook 编写；主要使用 Numpy 进行张量运算，未提及 GPU 加速需求","3.7+",[99,100,101],"numpy","scipy","pandas",[14,18],"2026-03-27T02:49:30.150509","2026-04-06T05:17:22.011095",[106,111,116,121],{"id":107,"question_zh":108,"answer_zh":109,"source_url":110},2889,"如何将算法应用于二维矩阵数据（如 PEMS-BAY）以避免内存问题？","针对二维数据（N, T），直接增加维度变为（1, N, T）可能导致内存溢出或计算缓慢。建议尝试使用 NoTMF 模型（参考：https:\u002F\u002Fgithub.com\u002Fxinychen\u002Ftracebase\u002Fblob\u002Fmain\u002Fmodels\u002FNoTMF.ipynb），该方法计算效率较高。关于维度扩展方式，若需调整 batch_size 需注意计算量变化。","https:\u002F\u002Fgithub.com\u002Fxinychen\u002Ftransdim\u002Fissues\u002F26",{"id":112,"question_zh":113,"answer_zh":114,"source_url":115},2890,"LRTC-TNN.ipynb 运行时报错如何处理？","该文件存在 svt_tnn 函数参数缺失的 Bug（递归调用时缺少参数）。请前往更新后的 Notebook 地址尝试：https:\u002F\u002Fgithub.com\u002Fxinychen\u002Ftransdim\u002Fblob\u002Fmaster\u002Fimputer\u002FLRTC-TNN.ipynb。","https:\u002F\u002Fgithub.com\u002Fxinychen\u002Ftransdim\u002Fissues\u002F22",{"id":117,"question_zh":118,"answer_zh":119,"source_url":120},2891,"论文中特定公式（F-norm 与内积关系）是如何推导的？","该问题涉及 
F-norm 与内积的关系。请参考书籍《tensor_book.pdf》第 70 页的公式 (7.64) 和 (7.67) 进行理解（该 PDF 会频繁更新，公式编号与页码请以最新版本为准）。","https:\u002F\u002Fgithub.com\u002Fxinychen\u002Ftransdim\u002Fissues\u002F19",{"id":122,"question_zh":123,"answer_zh":124,"source_url":125},2892,"如何在没有完整 dense_tensor 的数据集上使用 LATC 算法？","如果数据集不完整，不要直接使用 sparse_tensor.copy() 代替 dense_tensor，否则 RMSE 和 MAPE 可能为 NaN。正确的做法是掩码（mask）一定数量的观测值作为缺失值进行处理。","https:\u002F\u002Fgithub.com\u002Fxinychen\u002Ftransdim\u002Fissues\u002F20",[]]
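
上面最后一条 FAQ 所说的“掩码一定数量的观测值作为缺失值”的评估流程，可以用下面的代码来理解。这只是一个基于 NumPy 的最小示意：其中 `mask_observations`、`rmse`、`mape` 均为本文为说明而假设的辅助函数名，并非 transdim 仓库提供的接口；示例假设张量以 0 表示缺失值。

```python
import numpy as np

def mask_observations(sparse_tensor, missing_rate=0.2, seed=0):
    """随机掩码一部分已有观测值（非零元素），返回掩码后的张量及被掩码位置的索引。"""
    rng = np.random.default_rng(seed)
    obs = np.nonzero(sparse_tensor)               # 所有观测位置（假设 0 表示缺失）
    num_obs = obs[0].shape[0]
    picked = rng.choice(num_obs, size=int(missing_rate * num_obs), replace=False)
    held_out = tuple(axis[picked] for axis in obs)  # 被人工掩码的位置
    masked = sparse_tensor.copy()
    masked[held_out] = 0                          # 人工制造缺失
    return masked, held_out

def rmse(truth, estimate):
    """均方根误差。"""
    return np.sqrt(np.mean((truth - estimate) ** 2))

def mape(truth, estimate):
    """平均绝对百分比误差（要求 truth 非零）。"""
    return np.mean(np.abs(truth - estimate) / truth)
```

补全完成后，只需在 `held_out` 位置上比较补全结果与掩码前的原始观测值，即可得到有意义的 RMSE / MAPE，而不会出现 FAQ 中提到的 NaN 问题。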