[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-HKUDS--GraphGPT":3,"tool-HKUDS--GraphGPT":65},[4,17,27,35,48,57],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":16},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",158594,2,"2026-04-16T23:34:05",[13,14,15],"开发框架","Agent","语言模型","ready",{"id":18,"name":19,"github_repo":20,"description_zh":21,"stars":22,"difficulty_score":23,"last_commit_at":24,"category_tags":25,"status":16},4487,"LLMs-from-scratch","rasbt\u002FLLMs-from-scratch","LLMs-from-scratch 是一个基于 PyTorch 的开源教育项目，旨在引导用户从零开始一步步构建一个类似 ChatGPT 的大型语言模型（LLM）。它不仅是同名技术著作的官方代码库，更提供了一套完整的实践方案，涵盖模型开发、预训练及微调的全过程。\n\n该项目主要解决了大模型领域“黑盒化”的学习痛点。许多开发者虽能调用现成模型，却难以深入理解其内部架构与训练机制。通过亲手编写每一行核心代码，用户能够透彻掌握 Transformer 架构、注意力机制等关键原理，从而真正理解大模型是如何“思考”的。此外，项目还包含了加载大型预训练权重进行微调的代码，帮助用户将理论知识延伸至实际应用。\n\nLLMs-from-scratch 特别适合希望深入底层原理的 AI 开发者、研究人员以及计算机专业的学生。对于不满足于仅使用 API，而是渴望探究模型构建细节的技术人员而言，这是极佳的学习资源。其独特的技术亮点在于“循序渐进”的教学设计：将复杂的系统工程拆解为清晰的步骤，配合详细的图表与示例，让构建一个虽小但功能完备的大模型变得触手可及。无论你是想夯实理论基础，还是为未来研发更大规模的模型做准备",90106,3,"2026-04-06T11:19:32",[15,26,14,13],"图像",{"id":28,"name":29,"github_repo":30,"description_zh":31,"stars":32,"difficulty_score":10,"last_commit_at":33,"category_tags":34,"status":16},3704,"NextChat","ChatGPTNextWeb\u002FNextChat","NextChat 是一款轻量且极速的 AI 助手，旨在为用户提供流畅、跨平台的大模型交互体验。它完美解决了用户在多设备间切换时难以保持对话连续性，以及面对众多 AI 模型不知如何统一管理的痛点。无论是日常办公、学习辅助还是创意激发，NextChat 
都能让用户随时随地通过网页、iOS、Android、Windows、MacOS 或 Linux 端无缝接入智能服务。\n\n这款工具非常适合普通用户、学生、职场人士以及需要私有化部署的企业团队使用。对于开发者而言，它也提供了便捷的自托管方案，支持一键部署到 Vercel 或 Zeabur 等平台。\n\nNextChat 的核心亮点在于其广泛的模型兼容性，原生支持 Claude、DeepSeek、GPT-4 及 Gemini Pro 等主流大模型，让用户在一个界面即可自由切换不同 AI 能力。此外，它还率先支持 MCP（Model Context Protocol）协议，增强了上下文处理能力。针对企业用户，NextChat 提供专业版解决方案，具备品牌定制、细粒度权限控制、内部知识库整合及安全审计等功能，满足公司对数据隐私和个性化管理的高标准要求。",87618,"2026-04-05T07:20:52",[13,15],{"id":36,"name":37,"github_repo":38,"description_zh":39,"stars":40,"difficulty_score":10,"last_commit_at":41,"category_tags":42,"status":16},2268,"ML-For-Beginners","microsoft\u002FML-For-Beginners","ML-For-Beginners 是由微软推出的一套系统化机器学习入门课程，旨在帮助零基础用户轻松掌握经典机器学习知识。这套课程将学习路径规划为 12 周，包含 26 节精炼课程和 52 道配套测验，内容涵盖从基础概念到实际应用的完整流程，有效解决了初学者面对庞大知识体系时无从下手、缺乏结构化指导的痛点。\n\n无论是希望转型的开发者、需要补充算法背景的研究人员，还是对人工智能充满好奇的普通爱好者，都能从中受益。课程不仅提供了清晰的理论讲解，还强调动手实践，让用户在循序渐进中建立扎实的技能基础。其独特的亮点在于强大的多语言支持，通过自动化机制提供了包括简体中文在内的 50 多种语言版本，极大地降低了全球不同背景用户的学习门槛。此外，项目采用开源协作模式，社区活跃且内容持续更新，确保学习者能获取前沿且准确的技术资讯。如果你正寻找一条清晰、友好且专业的机器学习入门之路，ML-For-Beginners 将是理想的起点。",85092,"2026-04-10T11:13:16",[26,43,44,45,14,46,15,13,47],"数据工具","视频","插件","其他","音频",{"id":49,"name":50,"github_repo":51,"description_zh":52,"stars":53,"difficulty_score":54,"last_commit_at":55,"category_tags":56,"status":16},5784,"funNLP","fighting41love\u002FfunNLP","funNLP 是一个专为中文自然语言处理（NLP）打造的超级资源库，被誉为\"NLP 民工的乐园”。它并非单一的软件工具，而是一个汇集了海量开源项目、数据集、预训练模型和实用代码的综合性平台。\n\n面对中文 NLP 领域资源分散、入门门槛高以及特定场景数据匮乏的痛点，funNLP 提供了“一站式”解决方案。这里不仅涵盖了分词、命名实体识别、情感分析、文本摘要等基础任务的标准工具，还独特地收录了丰富的垂直领域资源，如法律、医疗、金融行业的专用词库与数据集，甚至包含古诗词生成、歌词创作等趣味应用。其核心亮点在于极高的全面性与实用性，从基础的字典词典到前沿的 BERT、GPT-2 模型代码，再到高质量的标注数据和竞赛方案，应有尽有。\n\n无论是刚刚踏入 NLP 领域的学生、需要快速验证想法的算法工程师，还是从事人工智能研究的学者，都能在这里找到急需的“武器弹药”。对于开发者而言，它能大幅减少寻找数据和复现模型的时间；对于研究者，它提供了丰富的基准测试资源和前沿技术参考。funNLP 以开放共享的精神，极大地降低了中文自然语言处理的开发与研究成本，是中文 AI 
社区不可或缺的宝藏仓库。",79857,1,"2026-04-08T20:11:31",[15,43,46],{"id":58,"name":59,"github_repo":60,"description_zh":61,"stars":62,"difficulty_score":54,"last_commit_at":63,"category_tags":64,"status":16},6590,"gpt4all","nomic-ai\u002Fgpt4all","GPT4All 是一款让普通电脑也能轻松运行大型语言模型（LLM）的开源工具。它的核心目标是打破算力壁垒，让用户无需依赖昂贵的显卡（GPU）或云端 API，即可在普通的笔记本电脑和台式机上私密、离线地部署和使用大模型。\n\n对于担心数据隐私、希望完全掌控本地数据的企业用户、研究人员以及技术爱好者来说，GPT4All 提供了理想的解决方案。它解决了传统大模型必须联网调用或需要高端硬件才能运行的痛点，让日常设备也能成为强大的 AI 助手。无论是希望构建本地知识库的开发者，还是单纯想体验私有化 AI 聊天的普通用户，都能从中受益。\n\n技术上，GPT4All 基于高效的 `llama.cpp` 后端，支持多种主流模型架构（包括最新的 DeepSeek R1 蒸馏模型），并采用 GGUF 格式优化推理速度。它不仅提供界面友好的桌面客户端，支持 Windows、macOS 和 Linux 等多平台一键安装，还为开发者提供了便捷的 Python 库，可轻松集成到 LangChain 等生态中。通过简单的下载和配置，用户即可立即开始探索本地大模型的无限可能。",77307,"2026-04-11T06:52:37",[15,13],{"id":66,"github_repo":67,"name":68,"description_en":69,"description_zh":70,"ai_summary_zh":70,"readme_en":71,"readme_zh":72,"quickstart_zh":73,"use_case_zh":74,"hero_image_url":75,"owner_login":76,"owner_name":77,"owner_avatar_url":78,"owner_bio":79,"owner_company":79,"owner_location":79,"owner_email":79,"owner_twitter":79,"owner_website":80,"owner_url":81,"languages":82,"stars":91,"forks":92,"last_commit_at":93,"license":94,"difficulty_score":95,"env_os":96,"env_gpu":97,"env_ram":98,"env_deps":99,"category_tags":113,"github_topics":114,"view_count":10,"oss_zip_url":79,"oss_zip_packed_at":79,"status":16,"created_at":120,"updated_at":121,"faqs":122,"releases":152},8208,"HKUDS\u002FGraphGPT","GraphGPT","[SIGIR'2024] \"GraphGPT: Graph Instruction Tuning for Large Language Models\"","GraphGPT 是一款专为提升大语言模型（LLM）图数据理解能力而设计的开源框架，由香港大学数据智能实验室与百度联合研发，相关成果已入选 SIGIR 2024 全文论文。\n\n传统大模型擅长处理文本，但在面对复杂的图结构数据（如社交网络、知识图谱）时往往表现不佳。GraphGPT 通过创新的“图指令微调”技术，成功搭建了图结构与自然语言之间的桥梁。它让大模型不仅能“读”懂节点和边的关系，还能直接响应关于图数据的复杂查询与分析指令，有效解决了通用大模型在图领域任务中推理能力不足的问题。\n\n该项目特别适合人工智能研究人员、算法工程师以及对图机器学习感兴趣的开发者使用。其核心技术亮点在于提出了两阶段指令微调策略：第一阶段让模型学习图结构基础，第二阶段强化其对特定图任务的指令遵循能力。此外，团队近期更新了轻量级训练代码，显著降低了硬件门槛，用户仅需两张 24GB 显存的 NVIDIA 3090 显卡即可完成完整的微调流程，极大地促进了该技术在学术研究和工业落地中的普及与应用。","# 
\u003Ccenter>\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FHKUDS_GraphGPT_readme_5fb7fb799b38.png\" style=\"width: 5%\"> GraphGPT: Graph Instruction Tuning for Large Language Models\u003C\u002Fcenter>\n\n\u003Cdiv align='center'>\n \u003Ca href='https:\u002F\u002Ftjb-tech.github.io\u002F'>Jiabin Tang\u003C\u002Fa>, \u003Ca href='http:\u002F\u002Fyuh-yang.github.io'>Yuhao Yang\u003C\u002Fa>, \u003Ca href='#'>Wei Wei\u003C\u002Fa>, \u003Ca href='#'>Lei Shi\u003C\u002Fa>, \u003Ca href='#'>Suqi Cheng\u003C\u002Fa>, \u003Ca href='https:\u002F\u002Fwww.yindawei.com\u002F'>Dawei Yin\u003C\u002Fa> and \u003Ca href='https:\u002F\u002Fsites.google.com\u002Fview\u002Fchaoh\u002Fhome'>Chao Huang*\u003C\u002Fa>. (*Correspondence)\n\n \u003Cstrong>\u003Ca href='https:\u002F\u002Fsites.google.com\u002Fview\u002Fchaoh\u002Fhome'>Data Intelligence Lab\u003C\u002Fa>@\u003Ca href='https:\u002F\u002Fwww.hku.hk\u002F'>University of Hong Kong\u003C\u002Fa>\u003C\u002Fstrong>, Baidu Inc.\n\n \u003Ca href='https:\u002F\u002Fgraphgpt.github.io\u002F'>\u003Cimg src='https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FProject-Page-Green'>\u003C\u002Fa>\n \u003Ca href='#'>\u003Cimg src='https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FDemo-Page-purple'>\u003C\u002Fa> \n \u003Ca href='https:\u002F\u002Farxiv.org\u002Fabs\u002F2310.13023'>\u003Cimg src='https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FPaper-PDF-orange'>\u003C\u002Fa> \n[![YouTube](https:\u002F\u002Fbadges.aleen42.com\u002Fsrc\u002Fyoutube.svg)](#)\n\u003Ca href='https:\u002F\u002Fmp.weixin.qq.com\u002Fs\u002FrvKTFdCk719Q6hT09Caglw' target='_blank'>\u003Cimg src='https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F中文-博客-blue'>\u003C\u002Fa>\n\n\u003Cimg src='https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FHKUDS_GraphGPT_readme_a12ea064df20.jpeg' \u002F>\n\n\u003C!--\n[Jiabin Tang](https:\u002F\u002Ftjb-tech.github.io\u002F), [Yuhao Yang](http:\u002F\u002Fyuh-yang.github.io), [Wei Wei](#), [Lei Shi](#), [Lixin 
Su](#), [Suqi Cheng](#), [Dawei Yin](https:\u002F\u002Fwww.yindawei.com\u002F) and [Chao Huang](https:\u002F\u002Fsites.google.com\u002Fview\u002Fchaoh\u002Fhome)*.\n(*Correspondence )\n\n**[Data Intelligence Lab](https:\u002F\u002Fsites.google.com\u002Fview\u002Fchaoh\u002Fhome)@[University of Hong Kong](https:\u002F\u002Fwww.hku.hk\u002F)**, Baidu Inc.\n\n-----\n\n\u003Ca href='https:\u002F\u002Fgraphgpt.github.io\u002F'>\u003Cimg src='https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FProject-Page-Green'>\u003C\u002Fa>\n\u003Ca href='#'>\u003Cimg src='https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FDemo-Page-purple'>\u003C\u002Fa> \n\u003Ca href='https:\u002F\u002Farxiv.org\u002Fabs\u002F2310.13023'>\u003Cimg src='https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FPaper-PDF-orange'>\u003C\u002Fa> \n[![YouTube](https:\u002F\u002Fbadges.aleen42.com\u002Fsrc\u002Fyoutube.svg)](#)\n • 🌐 \u003Ca href=\"https:\u002F\u002Fmp.weixin.qq.com\u002Fs\u002FrvKTFdCk719Q6hT09Caglw\" target=\"_blank\">中文博客\u003C\u002Fa>\n-->\n\n\n\n\nThis repository hosts the code, data and model weight of **GraphGPT** (SIGIR'24 full paper track).\n\n\u003C!--\n-----------\n-->\n\n\u003C\u002Fdiv>\n\n## 🎉 News \n- [x] [2024.03.26]🎯🎯📢📢Our GraphGPT is accepted by SIGIR'24 in the Full paper track (20.1% acceptance rate)! Congrats to all GraphGPT team! 🎉🎉🎉\n- [x] [2023.12.26]🎯🎯📢📢We have updated the efficient and lightweight training code. With the updated script, it is possible to perform two-stage instruction tuning on two Nvidia 3090 GPUs (24 GB each). The specific deployment and fine-tuning methods are as follows: 🎄🎄\n\n#### 0. 
Environment Update: \n\nThe lightweight training requires PyTorch 2.1+, so we need to update the corresponding libraries: \n\n```shell\n# if you have set up the env for GraphGPT earlier\npip uninstall torch\npip uninstall torchvision\npip uninstall torchaudio\n# CUDA 11.8\npip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https:\u002F\u002Fdownload.pytorch.org\u002Fwhl\u002Fcu118\n\n# update pyg for PyTorch 2.1+\npip install torch_geometric\npip install pyg_lib torch_scatter torch_sparse torch_cluster torch_spline_conv -f https:\u002F\u002Fdata.pyg.org\u002Fwhl\u002Ftorch-2.1.0+cu118.html\n\n# install lightning\npip install lightning\n```\n\n#### 1. Update the Graph Data\n\nDue to compatibility issues, if you are using the previously released graph data, we recommend downloading and updating it according to the provided link: [updated graph data](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FJiabin99\u002FAll_pyg_graph_data).\n\n#### 2. Run the Scripts\n\nYou can run the scripts as follows:\n\n**Stage-1:**\n\n```shell\ncd path\u002Fto\u002FGraphGPT\nsh .\u002Fscripts\u002Ftune_script\u002Fgraphgpt_stage1.sh\n```\n\n**Stage-2:**\n\n```shell\ncd path\u002Fto\u002FGraphGPT\nsh .\u002Fscripts\u002Ftune_script\u002Fgraphgpt_stage2.sh\n```\n\n- [x] [2023.12.14]📢📢Thank you for the support from the research community. We have compiled a list of frequently asked questions (FAQs) regarding running and environment issues in the following **FAQ** list. Please take a look. Wishing everyone an early Merry Christmas!🎄🎄\n\n\u003Cdetails>\n\u003Csummary> \u003Cb>FAQ\u003C\u002Fb> \u003C\u002Fsummary>\n\n- For the `'pretrain_graph_model_path' is not defined` error: 
Please refer to issue [#7](https:\u002F\u002Fgithub.com\u002FHKUDS\u002FGraphGPT\u002Fissues\u002F7).\n- If you have trouble using flash attention, just comment out the `replace_llama_attn_with_flash_attn()` call in line 8 of https:\u002F\u002Fgithub.com\u002FHKUDS\u002FGraphGPT\u002Fblob\u002Fmain\u002Fgraphgpt\u002Ftrain\u002Ftrain_mem.py. For more details, please refer to [#17](https:\u002F\u002Fgithub.com\u002FHKUDS\u002FGraphGPT\u002Fissues\u002F17).\n- If you encounter package conflicts or environment setup errors (especially with fastchat), please refer to issue [#9](https:\u002F\u002Fgithub.com\u002FHKUDS\u002FGraphGPT\u002Fissues\u002F9) and issue [#11](https:\u002F\u002Fgithub.com\u002FHKUDS\u002FGraphGPT\u002Fissues\u002F11).\n- If you meet a `No module named 'graphgpt'` error, you can refer to issue [#56](https:\u002F\u002Fgithub.com\u002FHKUDS\u002FGraphGPT\u002Fissues\u002F56).\n\n\u003C\u002Fdetails>\n\n\n🎯🎯📢📢 We have made significant updates to the **models** and **data** used in our GraphGPT on 🤗 **Huggingface**. We highly recommend referring to the table below for further details: \n\n| 🤗 Huggingface Address | 🎯 Description |\n| --- | --- |\n| [huggingface.co\u002FJiabin99\u002FGraphGPT-7B-mix-all](https:\u002F\u002Fhuggingface.co\u002FJiabin99\u002FGraphGPT-7B-mix-all) | The checkpoint of our GraphGPT based on Vicuna-7B-v1.5, tuned on the instruction data [Arxiv-PubMed-mix-NC-LP](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FJiabin99\u002FArxiv-PubMed-mix-NC-LP) |\n| [huggingface.co\u002FJiabin99\u002FArxiv-PubMed-GraphCLIP-GT](https:\u002F\u002Fhuggingface.co\u002FJiabin99\u002FArxiv-PubMed-GraphCLIP-GT) | The checkpoint of the pre-trained graph transformer (GT) trained on Arxiv and PubMed using Text-Graph grounding. 
|\n| [huggingface.co\u002Fdatasets\u002FJiabin99\u002FArxiv-PubMed-mix-NC-LP](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FJiabin99\u002FArxiv-PubMed-mix-NC-LP) | The mixed instruction dataset with node classification (NC) and link prediction (LP) on Arxiv and PubMed. |\n| [huggingface.co\u002Fdatasets\u002FJiabin99\u002FGraphGPT-eval-instruction](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FJiabin99\u002FGraphGPT-eval-instruction) | We release all instruction datasets for our evaluation. |\n| [huggingface.co\u002Fdatasets\u002FJiabin99\u002FAll_pyg_graph_data](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FJiabin99\u002FAll_pyg_graph_data) | We merge all utilized graph data. |\n| [huggingface.co\u002Fdatasets\u002FJiabin99\u002Fgraph-matching](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FJiabin99\u002Fgraph-matching) | The instruction data used in the graph-matching stage. |\n\n\n- [x] [2023.10.28]📢📢For the Chinese version of the explanation, please refer to this [article](https:\u002F\u002Fmp.weixin.qq.com\u002Fs\u002FrvKTFdCk719Q6hT09Caglw).\n\n- [x] [2023.10.26]🔥🔥Release our utilized instruction data.\n\n- [x] [2023.10.26]🔥🔥Release checkpoints of our GraphGPT and the pre-trained graph encoder.\n\n- [x] [2023.10.23] 🚀🚀 The full paper of our GraphGPT is available at [https:\u002F\u002Farxiv.org\u002Fabs\u002F2310.13023](https:\u002F\u002Farxiv.org\u002Fabs\u002F2310.13023). Please check it out and give us more feedback! 
\n\n- [x] [2023.10.15] 🚀🚀 Release the code of GraphGPT.\n\n\n## 👉 TODO \n- [ ] Exploring the potential of our GraphGPT for more graph learning tasks.\n- [ ] ...\n\n-----------\n\n\n\n\n\u003Cspan id='introduction'\u002F>\n\n## Brief Introduction \n\n\nwe present the **GraphGPT** framework that aligns LLMs with graph structural knowledge with a graph instruction tuning paradigm.\n\n\n- **Structural Information Encoding with Text-Graph Grounding.** To enhance the understanding of graph structural information by large language models, our framework emphasizes aligning the encoding of graph structures with the natural language space. This alignment aims to enable language models to effectively comprehend and interpret the structural elements of the graph, leveraging their inherent language understanding capabilities. To achieve this objective, we introduce a text-graph grounding paradigm that generates prompts designed to preserve the graph’s structural context for language models. This paradigm acts as a bridge, connecting the semantic understanding of textual information with the inherent structural relationships found within the graph.\n- **Dual-Stage Graph Instruction Tuning.** The dual-stage graph instruction tuning paradigm proposed in this work builds upon the concept of instruction tuning, which has been recently introduced to enhance the adaptability of language models for specific domains. In this paradigm, we aim to align the language capacity of the model with the nuances of graph learning tasks, enabling the language model to generate more accurate and contextually appropriate responses for graph-structured data.\n- **Chain-of-Thought (CoT) Distillation.** When faced with diverse graph data, language models may encounter new or unfamiliar patterns and structures. This distribution shift can pose challenges in generating accurate and coherent responses, especially when the number of node classes varies across different types of graph data. 
To address this challenge and boost accuracy in the presence of distribution shift, it is essential to equip our GraphGPT with step-by-step reasoning abilities. In this regard, we propose utilizing the Chain-of-Thought (CoT) technique [47], which explicitly models the flow of thoughts and reasoning steps. By incorporating CoT, our language model improves the coherence and consistency of generated text. It enables the model to follow a logical progression of ideas, enhancing its ability to understand and reason about the given graph data.\n\n\nFor more technical details, kindly refer to the [paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2310.13023) and the project [website](https:\u002F\u002Fgraphgpt.github.io\u002F) of our GraphGPT. \n\n\n-----------\n\n\u003Cspan id='Usage'\u002F>\n\n## Getting Started\n\n\u003Cspan id='all_catelogue'\u002F>\n\n### Table of Contents:\n* \u003Ca href='#Code Structure'>1. Code Structure\u003C\u002Fa>\n* \u003Ca href='#Environment Preparation'>2. Environment Preparation \u003C\u002Fa>\n* \u003Ca href='#Training GraphGPT'>3. Training GraphGPT \u003C\u002Fa>\n  * \u003Ca href='#Prepare Pre-trained Checkpoint'>3.1. Prepare Pre-trained Checkpoint\u003C\u002Fa>\n  * \u003Ca href='#Self-Supervised Instruction Tuning'>3.2. Self-Supervised Instruction Tuning\u003C\u002Fa>\n  * \u003Ca href='#Extract the Trained Projector'>3.3. Extract the Trained Projector\u003C\u002Fa>\n  * \u003Ca href='#Task-Specific Instruction Tuning'>3.4. Task-Specific Instruction Tuning\u003C\u002Fa>\n* \u003Ca href='#Evaluating GraphGPT'>4. Evaluating GraphGPT\u003C\u002Fa>\n  * \u003Ca href='#Preparing Checkpoints and Data'>4.1. Preparing Checkpoints and Data\u003C\u002Fa>\n  * \u003Ca href='#Running Evaluation'>4.2. Running Evaluation\u003C\u002Fa>\n\n****\n\n\n\n\u003Cspan id='Code Structure'\u002F>\n\n### 1. 
Code Structure \u003Ca href='#all_catelogue'>[Back to Top]\u003C\u002Fa>\n\n```\n.\n├── README.md\n├── assets\n│   ├── demo_narrow.gif\n│   ├── screenshot_cli.png\n│   ├── screenshot_gui.png\n│   ├── server_arch.png\n│   └── vicuna_logo.jpeg\n├── format.sh\n├── graphgpt\n│   ├── __init__.py\n│   ├── constants.py\n│   ├── conversation.py\n│   ├── eval\n│   │   ├── README.md\n│   │   ├── requirements.txt\n│   │   ├── run_graphgpt.py\n│   │   ├── run_graphgpt_LP.py\n│   │   ├── run_vicuna.py\n│   │   └── script\n│   │       └── run_model_qa.yaml\n│   ├── model\n│   │   ├── GraphLlama.py\n│   │   ├── __init__.py\n│   │   ├── apply_delta.py\n│   │   ├── apply_lora.py\n│   │   ├── builder.py\n│   │   ├── compression.py\n│   │   ├── convert_fp16.py\n│   │   ├── graph_layers\n│   │   │   ├── __init__.py\n│   │   │   ├── bpe_simple_vocab_16e6.txt.gz\n│   │   │   ├── clip_graph.py\n│   │   │   ├── graph_transformer.py\n│   │   │   ├── mpnn.py\n│   │   │   └── simple_tokenizer.py\n│   │   ├── make_delta.py\n│   │   ├── model_adapter.py\n│   │   ├── model_registry.py\n│   │   ├── monkey_patch_non_inplace.py\n│   │   └── utils.py\n│   ├── protocol\n│   │   └── openai_api_protocol.py\n│   ├── serve\n│   │   ├── __init__.py\n│   │   ├── api_provider.py\n│   │   ├── bard_worker.py\n│   │   ├── cacheflow_worker.py\n│   │   ├── cli.py\n│   │   ├── controller.py\n│   │   ├── gateway\n│   │   │   ├── README.md\n│   │   │   └── nginx.conf\n│   │   ├── gradio_block_arena_anony.py\n│   │   ├── gradio_block_arena_named.py\n│   │   ├── gradio_css.py\n│   │   ├── gradio_patch.py\n│   │   ├── gradio_web_server.py\n│   │   ├── gradio_web_server_multi.py\n│   │   ├── huggingface_api.py\n│   │   ├── inference.py\n│   │   ├── model_worker.py\n│   │   ├── monitor\n│   │   │   ├── basic_stats.py\n│   │   │   ├── clean_battle_data.py\n│   │   │   ├── elo_analysis.py\n│   │   │   ├── hf_space_leaderboard_app.py\n│   │   │   └── monitor.py\n│   │   ├── openai_api_server.py\n│   │   ├── 
register_worker.py\n│   │   ├── test_message.py\n│   │   └── test_throughput.py\n│   ├── train\n│   │   ├── graphchat_trainer.py\n│   │   ├── llama_flash_attn_monkey_patch.py\n│   │   ├── train_graph.py\n│   │   ├── train_lora.py\n│   │   └── train_mem.py\n│   └── utils.py\n├── playground\n│   ├── inspect_conv.py\n│   ├── test_embedding\n│   │   ├── README.md\n│   │   ├── test_classification.py\n│   │   ├── test_semantic_search.py\n│   │   └── test_sentence_similarity.py\n│   └── test_openai_api\n│       ├── anthropic_api.py\n│       └── openai_api.py\n├── pyproject.toml\n├── scripts\n│   ├── eval_script\n│   │   └── graphgpt_eval.sh\n│   ├── extract_graph_projector.py\n│   ├── serving\n│   │   ├── controller.yaml\n│   │   └── model_worker.yaml\n│   └── tune_script\n│       ├── extract_projector.sh\n│       ├── graphgpt_stage1.sh\n│       └── graphgpt_stage2.sh\n└── tests\n    ├── test_openai_curl.sh\n    ├── test_openai_langchain.py\n    └── test_openai_sdk.py\n```\n\n\n\u003Cspan id='Environment Preparation'\u002F>\n\n\n### 2. 
Environment Preparation  \u003Ca href='#all_catelogue'>[Back to Top]\u003C\u002Fa>\nPlease first clone the repo and install the required environment, which can be done by running the following commands:\n```shell\nconda create -n graphgpt python=3.8\n\nconda activate graphgpt\n\n# Torch with CUDA 11.7\npip install torch==1.13.0+cu117 torchvision==0.14.0+cu117 torchaudio==0.13.0 --extra-index-url https:\u002F\u002Fdownload.pytorch.org\u002Fwhl\u002Fcu117\n# To support vicuna base model\npip3 install \"fschat[model_worker,webui]\"\n# To install pyg and pyg-relevant packages\npip install torch_geometric\npip install pyg_lib torch_scatter torch_sparse torch_cluster torch_spline_conv -f https:\u002F\u002Fdata.pyg.org\u002Fwhl\u002Ftorch-1.13.0+cu117.html\n# Clone our GraphGPT\ngit clone https:\u002F\u002Fgithub.com\u002FHKUDS\u002FGraphGPT.git\ncd GraphGPT\n# Install required libraries\npip install -r requirements.txt\n```\n\n\u003Cspan id='Training GraphGPT'\u002F>\n\n### 3. Training GraphGPT \u003Ca href='#all_catelogue'>[Back to Top]\u003C\u002Fa>\n\nThe GraphGPT tuning paradigm consists of two stages: (1) self-supervised instruction tuning; (2) task-specific instruction tuning.\n\n\u003Cspan id='Prepare Pre-trained Checkpoint'\u002F>\n\n#### 3.1. Preparing Pre-trained Checkpoint  \u003Ca href='#all_catelogue'>[Back to Top]\u003C\u002Fa>\nGraphGPT is trained based on the following excellent existing models.\nPlease follow the instructions to prepare the checkpoints.\n\n- `Vicuna`:\n  Prepare our base model Vicuna, which is an instruction-tuned chatbot and the base model in our implementation. Please download its weights [here](https:\u002F\u002Fgithub.com\u002Flm-sys\u002FFastChat#model-weights). We generally utilize the v1.1 and v1.5 models with 7B parameters.\n\n- `Graph Encoder`:\n  is used to encode graph structures. 
We employ a text-graph grounding approach to obtain the pre-trained graph transformer model, which you can download from [graph transformer](https:\u002F\u002Fhuggingface.co\u002FJiabin99\u002FArxiv-PubMed-GraphCLIP-GT) and put at [[.\u002FGraphGPT]](.\u002FGraphGPT). We also provide the source code and example Cora data for text-graph grounding at [[.\u002Ftext-graph-grounding]](.\u002Ftext-graph-grounding) for your reference.\n\n- `Graph Data`:\n  is a combination of all utilized pyg graph data that contains node features, edge_index, and so on. You can download it from [all_graph_data.pt](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FJiabin99\u002FAll_pyg_graph_data) and put it at [[.\u002FGraphGPT\u002Fgraph_data]](.\u002FGraphGPT\u002Fgraph_data).\n\n\u003Cspan id='Self-Supervised Instruction Tuning'\u002F>\n\n#### 3.2. Self-Supervised Instruction Tuning  \u003Ca href='#all_catelogue'>[Back to Top]\u003C\u002Fa>\n\n* **Prepare data:** Please download our instruction tuning data [graph_matching.json](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FJiabin99\u002Fgraph-matching) for the graph matching task.\n\n* **Start tuning:** After the aforementioned steps, you can start the first-stage tuning by filling in the blanks in [graphgpt_stage1.sh](scripts\u002Ftune_script\u002Fgraphgpt_stage1.sh). 
Here is an example: \n\n```shell\n# fill in the following paths to run the first stage of our GraphGPT!\nmodel_path=..\u002Fvicuna-7b-v1.5-16k\ninstruct_ds=.\u002Fdata\u002Fstage_1\u002Fgraph_matching.json\ngraph_data_path=.\u002Fgraph_data\u002Fall_graph_data.pt\npretra_gnn=clip_gt_arxiv\noutput_model=.\u002Fcheckpoints\u002Fstage_1\n\nwandb offline\npython -m torch.distributed.run --nnodes=1 --nproc_per_node=4 --master_port=20001 \\\n    graphgpt\u002Ftrain\u002Ftrain_mem.py \\\n    --model_name_or_path ${model_path} \\\n    --version v1 \\\n    --data_path ${instruct_ds} \\\n    --graph_content .\u002Farxiv_ti_ab.json \\\n    --graph_data_path ${graph_data_path} \\\n    --graph_tower ${pretra_gnn} \\\n    --tune_graph_mlp_adapter True \\\n    --graph_select_layer -2 \\\n    --use_graph_start_end \\\n    --bf16 True \\\n    --output_dir ${output_model} \\\n    --num_train_epochs 3 \\\n    --per_device_train_batch_size 2 \\\n    --per_device_eval_batch_size 2 \\\n    --gradient_accumulation_steps 1 \\\n    --evaluation_strategy \"no\" \\\n    --save_strategy \"steps\" \\\n    --save_steps 2400 \\\n    --save_total_limit 1 \\\n    --learning_rate 2e-3 \\\n    --weight_decay 0. \\\n    --warmup_ratio 0.03 \\\n    --lr_scheduler_type \"cosine\" \\\n    --logging_steps 1 \\\n    --tf32 True \\\n    --model_max_length 2048 \\\n    --gradient_checkpointing True \\\n    --lazy_preprocess True \\\n    --report_to wandb\n```\n\n\u003Cspan id='Extract the Trained Projector'\u002F>\n\n#### 3.3. Extract the Trained Projector  \u003Ca href='#all_catelogue'>[Back to Top]\u003C\u002Fa>\n\nWe can extract the projector trained in stage 1 by filling in the blanks in [extract_projector.sh](scripts\u002Ftune_script\u002Fextract_projector.sh). 
Here is an example: \n\n```shell\n# fill in the following paths to extract the projector from the first tuning stage!\nsrc_model=.\u002Fcheckpoints\u002Fstage_1\noutput_proj=.\u002Fcheckpoints\u002Fstage_1_projector\u002Fstage_1_projector.bin\n\npython3.8 .\u002Fscripts\u002Fextract_graph_projector.py \\\n  --model_name_or_path ${src_model} \\\n  --output ${output_proj}\n```\n\n\u003Cspan id='Task-Specific Instruction Tuning'\u002F>\n\n#### 3.4. Task-Specific Instruction Tuning  \u003Ca href='#all_catelogue'>[Back to Top]\u003C\u002Fa>\n\n* **Prepare data:** The task-specific instruction data can be diverse, e.g., standard or CoT (Chain-of-Thought) node classification, link prediction, or mixed data for multitasking. Please refer to [task_specific](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FJiabin99\u002FArxiv-PubMed-mix-NC-LP).\n\n* **Start tuning:** After the aforementioned steps, you can start the second-stage tuning by filling in the blanks in [graphgpt_stage2.sh](scripts\u002Ftune_script\u002Fgraphgpt_stage2.sh). 
Here is an example: \n\n```shell\n# fill in the following paths to run the second stage of our GraphGPT!\nmodel_path=..\u002Fvicuna-7b-v1.5-16k\ninstruct_ds=.\u002Fdata\u002Fstage_2\u002Fdata_all_mix.json\ngraph_data_path=.\u002Fgraph_data\u002Fall_graph_data.pt\npretra_gnn=clip_gt_arxiv\ntuned_proj=.\u002Fcheckpoints\u002Fstage_1_projector\u002Fstage_1_projector.bin\noutput_model=.\u002Fcheckpoints\u002Fstage_2\n\nwandb offline\npython -m torch.distributed.run --nnodes=1 --nproc_per_node=4 --master_port=20001 \\\n    graphgpt\u002Ftrain\u002Ftrain_mem.py \\\n    --model_name_or_path ${model_path} \\\n    --version v1 \\\n    --data_path ${instruct_ds} \\\n    --graph_content .\u002Farxiv_ti_ab.json \\\n    --graph_data_path ${graph_data_path} \\\n    --graph_tower ${pretra_gnn} \\\n    --pretrain_graph_mlp_adapter ${tuned_proj} \\\n    --tune_graph_mlp_adapter True \\\n    --graph_select_layer -2 \\\n    --use_graph_start_end True \\\n    --bf16 True \\\n    --output_dir ${output_model} \\\n    --num_train_epochs 2 \\\n    --per_device_train_batch_size 1 \\\n    --per_device_eval_batch_size 1 \\\n    --gradient_accumulation_steps 1 \\\n    --evaluation_strategy \"no\" \\\n    --save_strategy \"steps\" \\\n    --save_steps 50000 \\\n    --save_total_limit 1 \\\n    --learning_rate 2e-5 \\\n    --weight_decay 0. \\\n    --warmup_ratio 0.03 \\\n    --lr_scheduler_type \"cosine\" \\\n    --logging_steps 1 \\\n    --tf32 True \\\n    --model_max_length 2048 \\\n    --gradient_checkpointing True \\\n    --dataloader_num_workers 4 \\\n    --lazy_preprocess True \\\n    --report_to wandb\n```\n\n\n\n\u003Cspan id='Evaluating GraphGPT'\u002F>\n\n### 4. Evaluating GraphGPT  \u003Ca href='#all_catelogue'>[Back to Top]\u003C\u002Fa>\n\n\u003Cspan id='Preparing Checkpoints and Data'\u002F>\n\n\n#### 4.1. 
Preparing Checkpoints and Data \u003Ca href='#all_catelogue'>[Back to Top]\u003C\u002Fa>\n\n* **Checkpoints:** You can evaluate GraphGPT using your own model or our released checkpoints.\n* **Data:** We split test sets for the different graph datasets and build the instruction data for evaluation. Please refer to [evaluating](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FJiabin99\u002FGraphGPT-eval-instruction).\n\n\u003Cspan id='Running Evaluation'\u002F>\n\n#### 4.2. Running Evaluation \u003Ca href='#all_catelogue'>[Back to Top]\u003C\u002Fa>\n\nYou can run the evaluation by filling in the blanks in [graphgpt_eval.sh](scripts\u002Feval_script\u002Fgraphgpt_eval.sh). Here is an example: \n```shell\n# fill in the following paths to evaluate the second-stage tuned model!\noutput_model=.\u002Fcheckpoints\u002Fstage_2\ndatapath=.\u002Fdata\u002Feval\u002Farxiv_nc.json\ngraph_data_path=.\u002Fgraph_data\u002Fall_graph_data.pt\nres_path=.\u002Foutput_stage_2_arxiv_nc\nstart_id=0\nend_id=20000\nnum_gpus=2\n\npython3.8 .\u002Fgraphgpt\u002Feval\u002Frun_graphgpt.py --model-name ${output_model} --prompting_file ${datapath} --graph_data_path ${graph_data_path} --output_res_path ${res_path} --start_id ${start_id} --end_id ${end_id} --num_gpus ${num_gpus}\n```\n---------\n\n\n## Contact\n\nFor any questions or feedback, feel free to contact [Jiabin Tang](mailto:jiabintang77@gmail.com).\n\n## Misc\n\n\u003Cdiv align=\"center\">\n\n[![Stargazers repo roster for @HKUDS\u002FGraphGPT](https:\u002F\u002Freporoster.com\u002Fstars\u002FHKUDS\u002FGraphGPT)](https:\u002F\u002Fgithub.com\u002FHKUDS\u002FGraphGPT\u002Fstargazers)\n\n\n[![Forkers repo roster for @HKUDS\u002FGraphGPT](https:\u002F\u002Freporoster.com\u002Fforks\u002FHKUDS\u002FGraphGPT)](https:\u002F\u002Fgithub.com\u002FHKUDS\u002FGraphGPT\u002Fnetwork\u002Fmembers)\n\n\n[![Star History 
Chart](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FHKUDS_GraphGPT_readme_565eda28ba39.png)](https:\u002F\u002Fstar-history.com\u002F#HKUDS\u002FGraphGPT&Date)\n\n\u003C\u002Fdiv>\n\n## Citation\n\nIf you find GraphGPT useful in your research or applications, please kindly cite:\n```tex\n@article{tang2023graphgpt,\ntitle={GraphGPT: Graph Instruction Tuning for Large Language Models}, \nauthor={Jiabin Tang and Yuhao Yang and Wei Wei and Lei Shi and Lixin Su and Suqi Cheng and Dawei Yin and Chao Huang},\nyear={2023},\neprint={2310.13023},\narchivePrefix={arXiv},\nprimaryClass={cs.CL}\n}\n```\n\n\n\n## Acknowledgements\nOur framework and code repository build upon the following related work: \n[Vicuna](https:\u002F\u002Fgithub.com\u002Flm-sys\u002FFastChat) and [LLaVA](https:\u002F\u002Fgithub.com\u002Fhaotian-liu\u002FLLaVA). We also partially draw inspiration from [MiniGPT-4](https:\u002F\u002Fgithub.com\u002FVision-CAIR\u002FMiniGPT-4). For the text-graph grounding design, we leverage the implementation from [G2P2](https:\u002F\u002Fgithub.com\u002FWenZhihao666\u002FG2P2). The design of our website and README.md was inspired by [NExT-GPT](https:\u002F\u002Fnext-gpt.github.io\u002F). Thanks for their wonderful work.\n\n\n\n\n\n","# \u003Ccenter>\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FHKUDS_GraphGPT_readme_5fb7fb799b38.png\" style=\"width: 5%\"> GraphGPT：面向大型语言模型的图指令微调\u003C\u002Fcenter>\n\n\u003Cdiv align='center'>\n \u003Ca href='https:\u002F\u002Ftjb-tech.github.io\u002F'>唐嘉斌\u003C\u002Fa>, \u003Ca href='http:\u002F\u002Fyuh-yang.github.io'>杨宇浩\u003C\u002Fa>, \u003Ca href='#'>魏伟\u003C\u002Fa>, \u003Ca href='#'>石磊\u003C\u002Fa>, \u003Ca href='#'>程苏琪\u003C\u002Fa>, \u003Ca href='https:\u002F\u002Fwww.yindawei.com\u002F'>殷大伟\u003C\u002Fa> 和 \u003Ca href='https:\u002F\u002Fsites.google.com\u002Fview\u002Fchaoh\u002Fhome'>黄超*\u003C\u002Fa>. 
(*通讯作者 )\n\n \u003Cstrong>\u003Ca href='https:\u002F\u002Fsites.google.com\u002Fview\u002Fchaoh\u002Fhome'>数据智能实验室\u003C\u002Fa>@\u003Ca href='https:\u002F\u002Fwww.hku.hk\u002F'>香港大学\u003C\u002Fa>\u003C\u002Fstrong>, 百度公司。\n\n \u003Ca href='https:\u002F\u002Fgraphgpt.github.io\u002F'>\u003Cimg src='https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F项目-页面-绿色'>\u003C\u002Fa>\n \u003Ca href='#'>\u003Cimg src='https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F演示-页面-紫色'>\u003C\u002Fa> \n \u003Ca href='https:\u002F\u002Farxiv.org\u002Fabs\u002F2310.13023'>\u003Cimg src='https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F论文-PDF-橙色'>\u003C\u002Fa> \n[![YouTube](https:\u002F\u002Fbadges.aleen42.com\u002Fsrc\u002Fyoutube.svg)](#)\n\u003Ca href='https:\u002F\u002Fmp.weixin.qq.com\u002Fs\u002FrvKTFdCk719Q6hT09Caglw' target='_blank'>\u003Cimg src='https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F中文-博客-蓝色'>\u003C\u002Fa>\n\n\u003Cimg src='https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FHKUDS_GraphGPT_readme_a12ea064df20.jpeg' \u002F>\n\n\u003C!--\n[Jiabin Tang](https:\u002F\u002Ftjb-tech.github.io\u002F), [Yuhao Yang](http:\u002F\u002Fyuh-yang.github.io), [Wei Wei](#), [Lei Shi](#), [Lixin Su](#), [Suqi Cheng](#), [Dawei Yin](https:\u002F\u002Fwww.yindawei.com\u002F) and [Chao Huang](https:\u002F\u002Fsites.google.com\u002Fview\u002Fchaoh\u002Fhome)*.\n(*Correspondence )\n\n**[Data Intelligence Lab](https:\u002F\u002Fsites.google.com\u002Fview\u002Fchaoh\u002Fhome)@[University of Hong Kong](https:\u002F\u002Fwww.hku.hk\u002F)**, Baidu Inc.\n\n-----\n\n\u003Ca href='https:\u002F\u002Fgraphgpt.github.io\u002F'>\u003Cimg src='https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F项目-页面-绿色'>\u003C\u002Fa>\n\u003Ca href='#'>\u003Cimg src='https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F演示-页面-紫色'>\u003C\u002Fa> \n\u003Ca href='https:\u002F\u002Farxiv.org\u002Fabs\u002F2310.13023'>\u003Cimg src='https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F论文-PDF-橙色'>\u003C\u002Fa> 
\n[![YouTube](https:\u002F\u002Fbadges.aleen42.com\u002Fsrc\u002Fyoutube.svg)](#)\n • 🌐 \u003Ca href=\"https:\u002F\u002Fmp.weixin.qq.com\u002Fs\u002FrvKTFdCk719Q6hT09Caglw\" target=\"_blank\">中文博客\u003C\u002Fa>\n-->\n\n\n\n\n此仓库托管了**GraphGPT**（SIGIR'24 全文赛道）的代码、数据及模型权重。\n\n\u003C!--\n-----------\n-->\n\n\u003C\u002Fdiv>\n\n## 🎉 新闻 \n- [x] [2024.03.26]🎯🎯📢📢我们的 GraphGPT 已被 SIGIR'24 全文赛道接收（录取率仅为 20.1%）！祝贺 GraphGPT 团队全体成员！🎉🎉🎉\n- [x] [2023.12.26]🎯🎯📢📢我们已更新高效轻量级训练代码。借助更新后的脚本，仅需两块 NVIDIA 3090 显卡（每块 24 GB）即可完成两阶段指令微调。具体的部署与微调方法如下：🎄🎄\n\n#### 0. 环境更新： \n\n轻量级训练需要 PyTorch 2.1+，因此我们需要更新相关库： \n\n```shell\n# 如果您之前已经为 GraphGPT 搭建过环境\npip uninstall torch\npip uninstall torchvision\npip uninstall torchaudio\n# CUDA 11.8\npip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https:\u002F\u002Fdownload.pytorch.org\u002Fwhl\u002Fcu118\n\n# 更新 pyg 以适配 PyTorch 2.1+\npip install torch_geometric\npip install pyg_lib torch_scatter torch_sparse torch_cluster torch_spline_conv -f https:\u002F\u002Fdata.pyg.org\u002Fwhl\u002Ftorch-2.1.0+cu118.html\n\n# 安装 Lightning\npip install lightning\n```\n\n#### 1. 更新图数据\n\n由于兼容性问题，如果您正在使用之前发布的图数据，我们建议您按照提供的链接下载并更新：[更新后的图数据](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FJiabin99\u002FAll_pyg_graph_data)。\n\n#### 2. 
运行脚本\n\n您可以按以下步骤运行脚本：\n\n**阶段-1：**\n\n```shell\ncd path\u002Fto\u002FGraphGPT\nsh .\u002Fscripts\u002Ftune_script\u002Fgraphgpt_stage1.sh\n```\n\n**阶段-2：**\n\n```shell\ncd path\u002Fto\u002FGraphGPT\nsh .\u002Fscripts\u002Ftune_script\u002Fgraphgpt_stage2.sh\n```\n\n- [x] [2023.12.14]📢📢感谢研究社区的支持。我们在下面的**FAQ**列表中整理了关于运行和环境问题的常见问题解答，请查阅。祝大家提前圣诞快乐！🎄🎄\n\n\u003Cdetails>\n\u003Csummary> \u003Cb>FAQ\u003C\u002Fb> \u003C\u002Fsummary>\n\n- 对于“pretrain_graph_model_path”未定义的问题，请参阅问题[#7](https:\u002F\u002Fgithub.com\u002FHKUDS\u002FGraphGPT\u002Fissues\u002F7)。\n- 如果您在使用 Flash Attention 时遇到问题，只需注释掉 [train_mem.py](https:\u002F\u002Fgithub.com\u002FHKUDS\u002FGraphGPT\u002Fblob\u002Fmain\u002Fgraphgpt\u002Ftrain\u002Ftrain_mem.py) 第 8 行中的`replace_llama_attn_with_flash_attn()`即可。更多详情请参阅问题[#17](https:\u002F\u002Fgithub.com\u002FHKUDS\u002FGraphGPT\u002Fissues\u002F17)。\n- 如果您遇到与包冲突或环境设置相关的问题（尤其是 fastchat），请参阅问题[#9](https:\u002F\u002Fgithub.com\u002FHKUDS\u002FGraphGPT\u002Fissues\u002F9)和问题[#11](https:\u002F\u002Fgithub.com\u002FHKUDS\u002FGraphGPT\u002Fissues\u002F11)。\n- 如果您遇到“没有名为‘graphgpt’的模块”的错误，可以参考问题[#56](https:\u002F\u002Fgithub.com\u002FHKUDS\u002FGraphGPT\u002Fissues\u002F56)。\n\n\u003C\u002Fdetails>\n\n\n🎯🎯📢📢 我们对在🤗 **Huggingface** 上的 GraphGPT 所使用的**模型**和**数据**进行了重大更新。强烈建议您参考下表以获取更多详细信息：\n\n| 🤗 Huggingface 地址                                        | 🎯 描述                                                |\n| ------------------------------------------------------------ | ------------------------------------------------------------ |\n| [huggingface.co\u002FJiabin99\u002FGraphGPT-7B-mix-all](https:\u002F\u002Fhuggingface.co\u002FJiabin99\u002FGraphGPT-7B-mix-all) | 这是我们基于 Vicuna-7B-v1.5 训练的 GraphGPT 检查点，该模型在指令数据集 [Arxiv-PubMed-mix-NC-LP](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FJiabin99\u002FArxiv-PubMed-mix-NC-LP) 上进行了微调。 |\n| [huggingface.co\u002FJiabin99\u002FArxiv-PubMed-GraphCLIP-GT](https:\u002F\u002Fhuggingface.co\u002FJiabin99\u002FArxiv-PubMed-GraphCLIP-GT) | 这是使用文本-图对齐技术，在 Arxiv 和 
PubMed 数据上训练的预训练图变换器（GT）检查点。 |\n| [huggingface.co\u002Fdatasets\u002FJiabin99\u002FArxiv-PubMed-mix-NC-LP](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FJiabin99\u002FArxiv-PubMed-mix-NC-LP) | 这是一个结合节点分类（NC）和链接预测（LP）的混合指令数据集，适用于 Arxiv 和 PubMed。 |\n| [huggingface.co\u002Fdatasets\u002FJiabin99\u002FGraphGPT-eval-instruction](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FJiabin99\u002FGraphGPT-eval-instruction) | 我们发布了所有用于评估的指令数据集。 |\n| [huggingface.co\u002Fdatasets\u002FJiabin99\u002FAll_pyg_graph_data](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FJiabin99\u002FAll_pyg_graph_data) | 我们整合了所有使用的图数据。 |\n| [huggingface.co\u002Fdatasets\u002FJiabin99\u002Fgraph-matching](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FJiabin99\u002Fgraph-matching) | 这是用于图匹配阶段的指令数据集。 |\n\n- [x] [2023.10.28]📢📢 关于中文版说明，请参阅这篇文章：[文章链接](https:\u002F\u002Fmp.weixin.qq.com\u002Fs\u002FrvKTFdCk719Q6hT09Caglw)。\n\n- [x] [2023.10.26]🔥🔥 发布我们使用的指令数据。\n\n- [x] [2023.10.26]🔥🔥 发布我们的 GraphGPT 检查点以及预训练的图编码器。\n\n- [x] [2023.10.23] 🚀🚀 我们的 GraphGPT 全文论文已在 [https:\u002F\u002Farxiv.org\u002Fabs\u002F2310.13023](https:\u002F\u002Farxiv.org\u002Fabs\u002F2310.13023) 上发布。请查阅并给我们提供更多反馈！\n\n- [x] [2023.10.15] 🚀🚀 发布 GraphGPT 的代码。\n\n\n## 👉 TODO \n- [ ] 探索 GraphGPT 在更多图学习任务中的潜力。\n- [ ] ...\n\n-----------\n\n\n\n\n\u003Cspan id='introduction'\u002F>\n\n## 简要介绍 \n\n\n我们提出了 **GraphGPT** 框架，该框架通过图指令微调范式，将大型语言模型与图结构知识对齐。\n\n\n- **利用文本-图对齐进行结构信息编码。** 为了增强大型语言模型对图结构信息的理解，我们的框架强调将图结构的编码与自然语言空间对齐。这种对齐旨在使语言模型能够有效理解和解释图的结构元素，从而充分利用其固有的语言理解能力。为此，我们引入了一种文本-图对齐范式，生成能够为语言模型保留图结构上下文的提示。这一范式充当桥梁，将文本信息的语义理解与图中固有的结构关系联系起来。\n- **双阶段图指令微调。** 本文提出的双阶段图指令微调范式建立在指令微调的基础上，而指令微调最近被引入以提高语言模型在特定领域的适应性。在此范式中，我们旨在将模型的语言能力与图学习任务的细微差别相匹配，从而使语言模型能够针对图结构化数据生成更准确、更符合上下文的响应。\n- **思维链（CoT）蒸馏。** 面对多样化的图数据时，语言模型可能会遇到新的或不熟悉的模式和结构。这种分布变化可能会给生成准确且连贯的响应带来挑战，尤其是在不同类型的图数据中节点类别数量不同的情况下。为了解决这一挑战并提高在分布变化情况下的准确性，必须赋予我们的 GraphGPT 
分步推理能力。为此，我们提出了利用思维链（COT）技术[47]，该技术明确地模拟思维过程和推理步骤。通过引入 COT，我们的语言模型提高了生成文本的一致性和连贯性，使其能够遵循逻辑清晰的思想发展脉络，从而更好地理解和推理给定的图数据。\n\n\n有关更多技术细节，请参阅我们的论文 [https:\u002F\u002Farxiv.org\u002Fabs\u002F2310.13023](https:\u002F\u002Farxiv.org\u002Fabs\u002F2310.13023) 以及项目网站 [https:\u002F\u002Fgraphgpt.github.io\u002F](https:\u002F\u002Fgraphgpt.github.io\u002F)。\n\n\n-----------\n\n\u003Cspan id='Usage'\u002F>\n\n## 使用指南\n\n\u003Cspan id='all_catelogue'\u002F>\n\n### 目录：\n* \u003Ca href='#代码结构'>1. 代码结构\u003C\u002Fa>\n* \u003Ca href='#环境准备'>2. 环境准备 \u003C\u002Fa>\n* \u003Ca href='#训练GraphGPT'>3. 训练GraphGPT \u003C\u002Fa>\n  * \u003Ca href='#准备预训练检查点'>3.1. 准备预训练检查点\u003C\u002Fa>\n  * \u003Ca href='#自监督指令调优'>3.2. 自监督指令调优\u003C\u002Fa>\n  * \u003Ca href='#提取训练好的投影器'>3.3. 提取训练好的投影器\u003C\u002Fa>\n  * \u003Ca href='#特定任务指令调优'>3.4. 特定任务指令调优\u003C\u002Fa>\n* \u003Ca href='#评估GraphGPT'>4. 评估GraphGPT\u003C\u002Fa>\n  * \u003Ca href='#准备检查点和数据'>4.1. 准备检查点和数据\u003C\u002Fa>\n  * \u003Ca href='#运行评估'>4.2. 运行评估\u003C\u002Fa>\n\n****\n\n\n\n\u003Cspan id='代码结构'\u002F>\n\n### 1. 
代码结构 \u003Ca href='#all_catelogue'>[回到顶部]\u003C\u002Fa>\n\n```\n.\n├── README.md\n├── assets\n│   ├── demo_narrow.gif\n│   ├── screenshot_cli.png\n│   ├── screenshot_gui.png\n│   ├── server_arch.png\n│   └── vicuna_logo.jpeg\n├── format.sh\n├── graphgpt\n│   ├── __init__.py\n│   ├── constants.py\n│   ├── conversation.py\n│   ├── eval\n│   │   ├── README.md\n│   │   ├── requirements.txt\n│   │   ├── run_graphgpt.py\n│   │   ├── run_graphgpt_LP.py\n│   │   ├── run_vicuna.py\n│   │   └── script\n│   │       └── run_model_qa.yaml\n│   ├── model\n│   │   ├── GraphLlama.py\n│   │   ├── __init__.py\n│   │   ├── apply_delta.py\n│   │   ├── apply_lora.py\n│   │   ├── builder.py\n│   │   ├── compression.py\n│   │   ├── convert_fp16.py\n│   │   ├── graph_layers\n│   │   │   ├── __init__.py\n│   │   │   ├── bpe_simple_vocab_16e6.txt.gz\n│   │   │   ├── clip_graph.py\n│   │   │   ├── graph_transformer.py\n│   │   │   ├── mpnn.py\n│   │   │   └── simple_tokenizer.py\n│   │   ├── make_delta.py\n│   │   ├── model_adapter.py\n│   │   ├── model_registry.py\n│   │   ├── monkey_patch_non_inplace.py\n│   │   └── utils.py\n│   ├── protocol\n│   │   └── openai_api_protocol.py\n│   ├── serve\n│   │   ├── __init__.py\n│   │   ├── api_provider.py\n│   │   ├── bard_worker.py\n│   │   ├── cacheflow_worker.py\n│   │   ├── cli.py\n│   │   ├── controller.py\n│   │   ├── gateway\n│   │   │   ├── README.md\n│   │   │   └── nginx.conf\n│   │   ├── gradio_block_arena_anony.py\n│   │   ├── gradio_block_arena_named.py\n│   │   ├── gradio_css.py\n│   │   ├── gradio_patch.py\n│   │   ├── gradio_web_server.py\n│   │   ├── gradio_web_server_multi.py\n│   │   ├── huggingface_api.py\n│   │   ├── inference.py\n│   │   ├── model_worker.py\n│   │   ├── monitor\n│   │   │   ├── basic_stats.py\n│   │   │   ├── clean_battle_data.py\n│   │   │   ├── elo_analysis.py\n│   │   │   ├── hf_space_leaderboard_app.py\n│   │   │   └── monitor.py\n│   │   ├── openai_api_server.py\n│   │   ├── register_worker.py\n│   │   ├── 
test_message.py\n│   │   └── test_throughput.py\n│   ├── train\n│   │   ├── graphchat_trainer.py\n│   │   ├── llama_flash_attn_monkey_patch.py\n│   │   ├── train_graph.py\n│   │   ├── train_lora.py\n│   │   └── train_mem.py\n│   └── utils.py\n├── playground\n│   ├── inspect_conv.py\n│   ├── test_embedding\n│   │   ├── README.md\n│   │   ├── test_classification.py\n│   │   ├── test_semantic_search.py\n│   │   ─…\n```\n\n```shell\n# 请填写以下路径以运行我们 GraphGPT 的第一阶段！\nmodel_path=..\u002Fvicuna-7b-v1.5-16k\ninstruct_ds=.\u002Fdata\u002Fstage_1\u002Fgraph_matching.json\ngraph_data_path=.\u002Fgraph_data\u002Fall_graph_data.pt\npretra_gnn=clip_gt_arxiv\noutput_model=.\u002Fcheckpoints\u002Fstage_1\n\nwandb offline\npython -m torch.distributed.run --nnodes=1 --nproc_per_node=4 --master_port=20001 \\\n    graphgpt\u002Ftrain\u002Ftrain_mem.py \\\n    --model_name_or_path ${model_path} \\\n    --version v1 \\\n    --data_path ${instruct_ds} \\\n    --graph_content .\u002Farxiv_ti_ab.json \\\n    --graph_data_path ${graph_data_path} \\\n    --graph_tower ${pretra_gnn} \\\n    --tune_graph_mlp_adapter True \\\n    --graph_select_layer -2 \\\n    --use_graph_start_end \\\n    --bf16 True \\\n    --output_dir ${output_model} \\\n    --num_train_epochs 3 \\\n    --per_device_train_batch_size 2 \\\n    --per_device_eval_batch_size 2 \\\n    --gradient_accumulation_steps 1 \\\n    --evaluation_strategy \"no\" \\\n    --save_strategy \"steps\" \\\n    --save_steps 2400 \\\n    --save_total_limit 1 \\\n    --learning_rate 2e-3 \\\n    --weight_decay 0. \\\n    --warmup_ratio 0.03 \\\n    --lr_scheduler_type \"cosine\" \\\n    --logging_steps 1 \\\n    --tf32 True \\\n    --model_max_length 2048 \\\n    --gradient_checkpointing True \\\n    --lazy_preprocess True \\\n    --report_to wandb\n```\n\n\u003Cspan id='提取训练好的投影器'\u002F>\n\n#### 3.3. 
提取训练好的投影器  \u003Ca href='#all_catelogue'>[回到顶部]\u003C\u002Fa>\n\n我们可以通过填写[extract_projector.sh](scripts\u002Ftune_script\u002Fextract_projector.sh)中的空白来提取第一阶段训练好的投影器。示例如下：\n\n```shell\n# 请填写以下路径以提取第一阶段的投影器！\nsrc_model=.\u002Fcheckpoints\u002Fstage_1\noutput_proj=.\u002Fcheckpoints\u002Fstage_1_projector\u002Fstage_1_projector.bin\n\npython3.8 .\u002Fscripts\u002Fextract_graph_projector.py \\\n  --model_name_or_path ${src_model} \\\n  --output ${output_proj}\n```\n\n\u003Cspan id='特定任务指令调优'\u002F>\n\n#### 3.4. 特定任务指令调优  \u003Ca href='#all_catelogue'>[回到顶部]\u003C\u002Fa>\n\n* **准备数据：** 我们的特定任务指令数据可以选择多种形式，例如标准或 CoT（思维链）节点分类、链接预测，或者混合数据进行多任务学习。请参考 [task_specific](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FJiabin99\u002FArxiv-PubMed-mix-NC-LP)。\n\n* **开始调优：** 完成上述步骤后，您可以通过填写[graphgpt_stage2.sh](scripts\u002Ftune_script\u002Fgraphgpt_stage2.sh)中的空白来开始第二阶段调优。示例如下：\n\n```shell\n# 请填写以下路径以运行我们 GraphGPT 的第二阶段！\nmodel_path=..\u002Fvicuna-7b-v1.5-16k\ninstruct_ds=.\u002Fdata\u002Fstage_2\u002Fdata_all_mix.json\ngraph_data_path=.\u002Fgraph_data\u002Fall_graph_data.pt\npretra_gnn=clip_gt_arxiv\ntuned_proj=.\u002Fcheckpoints\u002Fstage_1_projector\u002Fstage_1_projector.bin\noutput_model=.\u002Fcheckpoints\u002Fstage_2\n\nwandb offline\npython -m torch.distributed.run --nnodes=1 --nproc_per_node=4 --master_port=20001 \\\n    graphgpt\u002Ftrain\u002Ftrain_mem.py \\\n    --model_name_or_path ${model_path} \\\n    --version v1 \\\n    --data_path ${instruct_ds} \\\n    --graph_content .\u002Farxiv_ti_ab.json \\\n    --graph_data_path ${graph_data_path} \\\n    --graph_tower ${pretra_gnn} \\\n    --pretrain_graph_mlp_adapter ${tuned_proj} \\\n    --tune_graph_mlp_adapter True \\\n    --graph_select_layer -2 \\\n    --use_graph_start_end True \\\n    --bf16 True \\\n    --output_dir ${output_model} \\\n    --num_train_epochs 2 \\\n    --per_device_train_batch_size 1 \\\n    --per_device_eval_batch_size 1 \\\n    --gradient_accumulation_steps 1 \\\n    --evaluation_strategy 
\"no\" \\\n    --save_strategy \"steps\" \\\n    --save_steps 50000 \\\n    --save_total_limit 1 \\\n    --learning_rate 2e-5 \\\n    --weight_decay 0. \\\n    --warmup_ratio 0.03 \\\n    --lr_scheduler_type \"cosine\" \\\n    --logging_steps 1 \\\n    --tf32 True \\\n    --model_max_length 2048 \\\n    --gradient_checkpointing True \\\n    --dataloader_num_workers 4 \\\n    --lazy_preprocess True \\\n    --report_to wandb\n\n```\n\n\n\n\u003Cspan id='评估GraphGPT'\u002F>\n\n## 4. 评估GraphGPT  \u003Ca href='#all_catelogue'>[回到顶部]\u003C\u002Fa>\n\n\u003Cspan id='准备检查点和数据'\u002F>\n\n\n#### 4.1. 准备检查点和数据 \u003Ca href='#all_catelogue'>[回到顶部]\u003C\u002Fa>\n\n* **检查点：** 您可以使用自己的模型或我们发布的检查点来评估GraphGPT。\n* **数据：** 我们为不同的图数据集划分了测试集，并准备了用于评估的指令数据。请参考 [evaluating](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FJiabin99\u002FGraphGPT-eval-instruction)。\n\n\u003Cspan id='运行评估'\u002F>\n\n#### 4.2. 运行评估 \u003Ca href='#all_catelogue'>[回到顶部]\u003C\u002Fa>\n\n您可以填写[graphgpt_eval.sh](scripts\u002Feval_script\u002Fgraphgpt_eval.sh)中的空白来开始第二阶段的评估。示例如下：\n```shell\n# 请填写以下路径以提取第二阶段的投影器！\noutput_model=.\u002Fcheckpoints\u002Fstage_2\ndatapath=.\u002Fdata\u002Feval\u002Farxiv_nc.json\ngraph_data_path=.\u002Fgraph_data\u002Fall_graph_data.pt\nres_path=.\u002Foutput_stage_2_arxiv_nc\nstart_id=0\nend_id=20000\nnum_gpus=2\n\npython3.8 .\u002Fgraphgpt\u002Feval\u002Frun_graphgpt.py --model-name ${output_model}  --prompting_file ${datapath} --graph_data_path ${graph_data_path} --output_res_path ${res_path} --start_id ${start_id} --end_id ${end_id} --num_gpus ${num_gpus}\n```\n---------\n\n\n## 联系方式\n\n如有任何问题或反馈，请随时联系[Jiabin Tang](mailto:jiabintang77@gmail.com)。\n\n## 杂项\n\n\u003Cdiv 
align=\"center\">\n\n[![@HKUDS\u002FGraphGPT的星标罗列](https:\u002F\u002Freporoster.com\u002Fstars\u002FHKUDS\u002FGraphGPT)](https:\u002F\u002Fgithub.com\u002FHKUDS\u002FGraphGPT\u002Fstargazers)\n\n\n[![@HKUDS\u002FGraphGPT的叉子罗列](https:\u002F\u002Freporoster.com\u002Fforks\u002FHKUDS\u002FGraphGPT)](https:\u002F\u002Fgithub.com\u002FHKUDS\u002FGraphGPT\u002Fnetwork\u002Fmembers)\n\n\n[![星历史图表](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FHKUDS_GraphGPT_readme_565eda28ba39.png)](https:\u002F\u002Fstar-history.com\u002F#HKUDS\u002FGraphGPT&Date)\n\n\u003C\u002Fdiv>\n\n## 引用\n\n如果您在研究或应用中发现GraphGPT有用，请引用：\n```tex\n@articles{tang2023graphgpt,\ntitle={GraphGPT: Graph Instruction Tuning for Large Language Models}, \nauthor={Jiabin Tang and Yuhao Yang and Wei Wei and Lei Shi and Lixin Su and Suqi Cheng and Dawei Yin and Chao Huang},\nyear={2023},\neprint={2310.13023},\narchivePrefix={arXiv},\nprimaryClass={cs.CL}\n}\n```\n\n\n\n## 致谢\n您可以参考作为我们框架和代码库基础的相关工作，\n[Vicuna](https:\u002F\u002Fgithub.com\u002Flm-sys\u002FFastChat), [LLaVa](https:\u002F\u002Fgithub.com\u002Fhaotian-liu\u002FLLaVA), 我们也部分借鉴了[MiniGPT-4](https:\u002F\u002Fgithub.com\u002FVision-CAIR\u002FMiniGPT-4)的灵感。在文本-图对齐设计方面，我们采用了[G2P2](https:\u002F\u002Fgithub.com\u002FWenZhihao666\u002FG2P2)的实现。我们的网站和README.md的设计则受到了[NExT-GPT](https:\u002F\u002Fnext-gpt.github.io\u002F)的启发。感谢这些优秀的工作成果。","# GraphGPT 快速上手指南\n\nGraphGPT 是一个通过图指令微调（Graph Instruction Tuning）将大语言模型（LLM）与图结构知识对齐的框架。本指南帮助开发者快速搭建环境并运行模型。\n\n## 1. 
环境准备\n\n### 系统要求\n- **GPU**: 推荐至少 2 张 NVIDIA 3090 (24GB) 或同等算力显卡进行轻量级训练。\n- **CUDA**: 推荐 CUDA 11.8。\n- **Python**: 建议 Python 3.8+。\n\n### 前置依赖安装\nGraphGPT 轻量版训练需要 PyTorch 2.1+。请严格按照以下顺序更新或安装依赖库：\n\n```shell\n# 如果之前安装过旧版 torch，请先卸载\npip uninstall torch\npip uninstall torchvision\npip uninstall torchaudio\n\n# 安装 PyTorch 2.1.0 (CUDA 11.8)\npip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https:\u002F\u002Fdownload.pytorch.org\u002Fwhl\u002Fcu118\n\n# 更新 PyG (PyTorch Geometric) 以适配 PyTorch 2.1+\npip install torch_geometric\npip install pyg_lib torch_scatter torch_sparse torch_cluster torch_spline_conv -f https:\u002F\u002Fdata.pyg.org\u002Fwhl\u002Ftorch-2.1.0+cu118.html\n\n# 安装 Lightning\npip install lightning\n```\n\n> **提示**：国内用户若下载缓慢，可尝试使用清华源或阿里源替代默认 pip 源，但 PyG 的特殊 wheel 地址建议保留官方源以确保兼容性。\n\n## 2. 数据准备\n\n由于版本兼容性原因，请务必使用更新后的图数据。\n\n- **下载地址**: [HuggingFace - All_pyg_graph_data](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FJiabin99\u002FAll_pyg_graph_data)\n- **操作**: 下载后将其放置于项目指定的数据目录中。\n\n其他相关模型权重和数据集（如预训练图编码器、指令数据集等）均可在 [GraphGPT HuggingFace 主页](https:\u002F\u002Fhuggingface.co\u002FJiabin99) 获取。\n\n## 3. 基本使用（训练流程）\n\nGraphGPT 采用两阶段指令微调策略。确保已克隆代码库并进入目录：\n\n```bash\ncd path\u002Fto\u002FGraphGPT\n```\n\n### 第一阶段：自监督指令微调 (Stage-1)\n执行以下脚本启动第一阶段训练：\n\n```shell\nsh .\u002Fscripts\u002Ftune_script\u002Fgraphgpt_stage1.sh\n```\n\n### 第二阶段：任务特定指令微调 (Stage-2)\n第一阶段完成后，执行以下脚本启动第二阶段训练：\n\n```shell\nsh .\u002Fscripts\u002Ftune_script\u002Fgraphgpt_stage2.sh\n```\n\n## 4. 模型评估\n\n完成训练后，可使用提供的评估脚本对模型性能进行测试。\n\n1. **准备检查点与数据**：确保已下载评估所需的指令数据集 [GraphGPT-eval-instruction](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FJiabin99\u002FGraphGPT-eval-instruction) 并配置好训练完成的模型权重路径。\n2. 
**运行评估**：根据具体任务（如节点分类 NC 或链接预测 LP），进入 `graphgpt\u002Feval` 目录运行相应的评估脚本（例如 `run_graphgpt.py` 或 `run_graphgpt_LP.py`）。\n\n---\n*更多技术细节请参考原论文 [ArXiv:2310.13023](https:\u002F\u002Farxiv.org\u002Fabs\u002F2310.13023) 或项目主页。*","某电商平台的推荐算法团队正试图利用大语言模型（LLM）分析复杂的用户 - 商品交互图谱，以生成更具解释性的个性化推荐理由。\n\n### 没有 GraphGPT 时\n- **结构信息丢失**：直接将图谱数据转化为文本序列输入 LLM，导致用户与商品间复杂的多跳连接关系被切断，模型无法理解深层关联。\n- **推理幻觉严重**：面对“为什么推荐此商品”的指令，模型常编造不存在的交互路径，生成的理由缺乏事实依据，难以取信于用户。\n- **微调成本高昂**：为了让通用模型理解图结构，需构造海量特定格式的训练数据，且在消费级显卡上难以完成全量指令微调。\n- **泛化能力薄弱**：模型仅能记忆训练集中的特定图谱模式，一旦遇到新用户或新商品构成的子图，推理效果急剧下降。\n\n### 使用 GraphGPT 后\n- **图指令对齐**：GraphGPT 通过专门的图指令微调机制，让 LLM 直接“读懂”图拓扑结构，精准捕捉用户兴趣的传播路径。\n- **可解释性增强**：模型能基于真实的图遍历路径生成推荐理由（如“因为您购买了 A，且 A 与 B 有强共现关系”），显著减少幻觉。\n- **高效轻量部署**：借助其优化的两阶段训练代码，团队仅需两张 RTX 3090 显卡即可完成微调，大幅降低了算力门槛。\n- **零样本泛化提升**：在未见过的图谱子结构上，GraphGPT 仍能保持稳定的推理能力，快速适应动态变化的电商数据环境。\n\nGraphGPT 成功打破了大语言模型与图数据之间的壁垒，让复杂的结构化知识成为 LLM 可理解、可推理的核心能力。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FHKUDS_GraphGPT_5fb7fb79.png","HKUDS","✨Data Intelligence Lab@HKU✨","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002FHKUDS_fc32cc87.jpg",null,"https:\u002F\u002Fsites.google.com\u002Fview\u002Fchaoh","https:\u002F\u002Fgithub.com\u002FHKUDS",[83,87],{"name":84,"color":85,"percentage":86},"Python","#3572A5",98.9,{"name":88,"color":89,"percentage":90},"Shell","#89e051",1.1,828,81,"2026-04-14T01:36:40","Apache-2.0",4,"Linux","必需 NVIDIA GPU。轻量级训练方案最低需 2 张 NVIDIA 3090 (每张 24GB 显存)；需支持 CUDA 11.8","未说明",{"notes":100,"python":101,"dependencies":102},"1. 必须使用 PyTorch 2.1+ 及对应的 CUDA 11.8 环境。\n2. 若使用旧版图数据，需从 HuggingFace 下载更新后的图数据以确保兼容性。\n3. 如遇 Flash Attention 报错，需在代码中注释相关替换函数。\n4. 
项目涉及复杂的图神经网络依赖 (PyG 系列库)，安装时需严格匹配 PyTorch 版本对应的 wheel 地址。","未说明 (需配合 PyTorch 2.1+)",[103,104,105,106,107,108,109,110,111,112],"torch==2.1.0","torchvision==0.16.0","torchaudio==2.1.0","torch_geometric","pyg_lib","torch_scatter","torch_sparse","torch_cluster","torch_spline_conv","lightning",[15],[115,116,117,118,119],"graph-neural-networks","instruction-tuning","large-language-models","text-graph","graph-learning","2026-03-27T02:49:30.150509","2026-04-17T08:25:09.573501",[123,128,133,138,143,148],{"id":124,"question_zh":125,"answer_zh":126,"source_url":127},36721,"运行脚本时遇到 'No module named graphgpt' 或 'config.json missing' 错误怎么办？","这通常是因为模型加载路径配置错误或未根据机器环境修改并行参数。请检查以下几点：\n1. 确认 pretrain_model_path 指向的目录下确实存在 config.json 文件。\n2. 根据实际机器显卡数量修改启动脚本中的并行参数，例如将 '--nnodes=1 --nproc_per_node=4' 中的 nproc_per_node 改为实际可用的 GPU 数量。\n3. 如果报错 'GraphLlamaConfig object has no attribute pretrain_graph_model_path'，请参考项目 Issue #7 进行配置修复。","https:\u002F\u002Fgithub.com\u002FHKUDS\u002FGraphGPT\u002Fissues\u002F56",{"id":129,"question_zh":130,"answer_zh":131,"source_url":132},36722,"第一阶段训练后找不到 'pytorch_model.bin.index.json' 文件导致提取 Projector 失败如何解决？","较新版本的训练输出可能使用 safetensors 格式而非传统的 bin 索引文件。您可以修改提取脚本来兼容这两种格式。参考解决方案代码如下：\nfrom safetensors.torch import load_file\nimport json, torch, os\n\nmodel_indices = json.load(open(os.path.join(args.model_name_or_path, 'model.safetensors.index.json')))\nfor ckpt_name, weight_keys in ckpt_to_key.items():\n    ckpt_path = os.path.join(args.model_name_or_path, ckpt_name)\n    if ckpt_name.endswith('.safetensors'):\n        ckpt = load_file(ckpt_path)\n    else:\n        ckpt = torch.load(ckpt_path, map_location='cpu')\n    for k in weight_keys:\n        loaded_weights[k] = ckpt[k]","https:\u002F\u002Fgithub.com\u002FHKUDS\u002FGraphGPT\u002Fissues\u002F37",{"id":134,"question_zh":135,"answer_zh":136,"source_url":137},36723,"如何在自定义图数据上对 GraphGPT 进行微调？数据应如何组织？","要使用自定义数据微调，需按以下步骤组织数据：\n1. 
使用 torch.load 打开现有的 'graph_data_all.pt' 文件，它是一个字典结构 {数据集名字: pyg 格式图数据}。\n2. 将您的新图数据（PyG 格式）update 进该字典中。\n3. 在生成 instruction following 数据集时，确保 id 字段的前缀（第一个下划线前的名字）与您加入的整图数据名称一致，以便数据处理类能正确匹配。\n4. 节点特征向量通常由作者使用的 128 维 BERT 模型编码节点文本信息生成，如需复现需自行处理节点文本嵌入。","https:\u002F\u002Fgithub.com\u002FHKUDS\u002FGraphGPT\u002Fissues\u002F33",{"id":139,"question_zh":140,"answer_zh":141,"source_url":142},36724,"安装时遇到 protobuf 版本冲突（icetk 依赖 \u003C3.19 而项目需要 3.20.0）或 PyGObject 错误怎么办？","对于 PyGObject 相关的安装问题，可以尝试安装特定版本来解决。有用户反馈安装 [pygobject:3.26.1](https:\u002F\u002Flaunchpad.net\u002Fubuntu\u002F+source\u002Fpygobject\u002F3.26.1-2) 可解决此类依赖冲突问题。对于 protobuf 版本冲突，建议检查虚拟环境中的包依赖树，尝试降级 protobuf 至兼容版本（如 3.19.x），或者升级 icetk 到支持新版 protobuf 的版本（如果可用）。","https:\u002F\u002Fgithub.com\u002FHKUDS\u002FGraphGPT\u002Fissues\u002F9",{"id":144,"question_zh":145,"answer_zh":146,"source_url":147},36725,"为什么基线模型（如 GCN, GraphSAGE）在 ogbn-arxiv 上的结果低于预期的 70%？","基线结果偏低可能是因为实验中为 GNN 选择了较弱的 BERT 模型作为文本编码器。社区已知使用标准或更强的预训练 BERT 模型（无需额外训练）处理原始文本输入，可以使 GNN 基线达到更高的性能（通常高于 70%）。如果您复现时发现结果过低，请检查是否使用了与论文描述一致的强预训练 BERT 模型来初始化节点特征，而不是使用了受限或较弱的变体。","https:\u002F\u002Fgithub.com\u002FHKUDS\u002FGraphGPT\u002Fissues\u002F52",{"id":149,"question_zh":150,"answer_zh":151,"source_url":127},36726,"运行评估或训练时出现 'AssertionError: config.json missing' 但文件确实存在的错误原因是什么？","即使文件存在，该错误也可能由路径配置错误引起。请仔细检查配置文件（如 vicuna 的 config）中 'pretrain_model_path' 指定的绝对路径或相对路径是否正确指向了包含 config.json 的文件夹。此外，确保运行脚本时的当前工作目录正确，或者改用绝对路径以避免相对路径解析错误。同时，确认是否有权限读取该文件。",[]]