[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-peremartra--Large-Language-Model-Notebooks-Course":3,"tool-peremartra--Large-Language-Model-Notebooks-Course":64},[4,17,27,35,43,56],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":16},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,3,"2026-04-05T11:01:52",[13,14,15],"开发框架","图像","Agent","ready",{"id":18,"name":19,"github_repo":20,"description_zh":21,"stars":22,"difficulty_score":23,"last_commit_at":24,"category_tags":25,"status":16},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",138956,2,"2026-04-05T11:33:21",[13,15,26],"语言模型",{"id":28,"name":29,"github_repo":30,"description_zh":31,"stars":32,"difficulty_score":23,"last_commit_at":33,"category_tags":34,"status":16},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 
图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",107662,"2026-04-03T11:11:01",[13,14,15],{"id":36,"name":37,"github_repo":38,"description_zh":39,"stars":40,"difficulty_score":23,"last_commit_at":41,"category_tags":42,"status":16},3704,"NextChat","ChatGPTNextWeb\u002FNextChat","NextChat 是一款轻量且极速的 AI 助手，旨在为用户提供流畅、跨平台的大模型交互体验。它完美解决了用户在多设备间切换时难以保持对话连续性，以及面对众多 AI 模型不知如何统一管理的痛点。无论是日常办公、学习辅助还是创意激发，NextChat 都能让用户随时随地通过网页、iOS、Android、Windows、MacOS 或 Linux 端无缝接入智能服务。\n\n这款工具非常适合普通用户、学生、职场人士以及需要私有化部署的企业团队使用。对于开发者而言，它也提供了便捷的自托管方案，支持一键部署到 Vercel 或 Zeabur 等平台。\n\nNextChat 的核心亮点在于其广泛的模型兼容性，原生支持 Claude、DeepSeek、GPT-4 及 Gemini Pro 等主流大模型，让用户在一个界面即可自由切换不同 AI 能力。此外，它还率先支持 MCP（Model Context Protocol）协议，增强了上下文处理能力。针对企业用户，NextChat 提供专业版解决方案，具备品牌定制、细粒度权限控制、内部知识库整合及安全审计等功能，满足公司对数据隐私和个性化管理的高标准要求。",87618,"2026-04-05T07:20:52",[13,26],{"id":44,"name":45,"github_repo":46,"description_zh":47,"stars":48,"difficulty_score":23,"last_commit_at":49,"category_tags":50,"status":16},2268,"ML-For-Beginners","microsoft\u002FML-For-Beginners","ML-For-Beginners 是由微软推出的一套系统化机器学习入门课程，旨在帮助零基础用户轻松掌握经典机器学习知识。这套课程将学习路径规划为 12 周，包含 26 节精炼课程和 52 道配套测验，内容涵盖从基础概念到实际应用的完整流程，有效解决了初学者面对庞大知识体系时无从下手、缺乏结构化指导的痛点。\n\n无论是希望转型的开发者、需要补充算法背景的研究人员，还是对人工智能充满好奇的普通爱好者，都能从中受益。课程不仅提供了清晰的理论讲解，还强调动手实践，让用户在循序渐进中建立扎实的技能基础。其独特的亮点在于强大的多语言支持，通过自动化机制提供了包括简体中文在内的 50 多种语言版本，极大地降低了全球不同背景用户的学习门槛。此外，项目采用开源协作模式，社区活跃且内容持续更新，确保学习者能获取前沿且准确的技术资讯。如果你正寻找一条清晰、友好且专业的机器学习入门之路，ML-For-Beginners 
将是理想的起点。",84991,"2026-04-05T10:45:23",[14,51,52,53,15,54,26,13,55],"数据工具","视频","插件","其他","音频",{"id":57,"name":58,"github_repo":59,"description_zh":60,"stars":61,"difficulty_score":10,"last_commit_at":62,"category_tags":63,"status":16},3128,"ragflow","infiniflow\u002Fragflow","RAGFlow 是一款领先的开源检索增强生成（RAG）引擎，旨在为大语言模型构建更精准、可靠的上下文层。它巧妙地将前沿的 RAG 技术与智能体（Agent）能力相结合，不仅支持从各类文档中高效提取知识，还能让模型基于这些知识进行逻辑推理和任务执行。\n\n在大模型应用中，幻觉问题和知识滞后是常见痛点。RAGFlow 通过深度解析复杂文档结构（如表格、图表及混合排版），显著提升了信息检索的准确度，从而有效减少模型“胡编乱造”的现象，确保回答既有据可依又具备时效性。其内置的智能体机制更进一步，使系统不仅能回答问题，还能自主规划步骤解决复杂问题。\n\n这款工具特别适合开发者、企业技术团队以及 AI 研究人员使用。无论是希望快速搭建私有知识库问答系统，还是致力于探索大模型在垂直领域落地的创新者，都能从中受益。RAGFlow 提供了可视化的工作流编排界面和灵活的 API 接口，既降低了非算法背景用户的上手门槛，也满足了专业开发者对系统深度定制的需求。作为基于 Apache 2.0 协议开源的项目，它正成为连接通用大模型与行业专有知识之间的重要桥梁。",77062,"2026-04-04T04:44:48",[15,14,13,26,54],{"id":65,"github_repo":66,"name":67,"description_en":68,"description_zh":69,"ai_summary_zh":69,"readme_en":70,"readme_zh":71,"quickstart_zh":72,"use_case_zh":73,"hero_image_url":74,"owner_login":75,"owner_name":76,"owner_avatar_url":77,"owner_bio":78,"owner_company":79,"owner_location":80,"owner_email":81,"owner_twitter":82,"owner_website":83,"owner_url":84,"languages":85,"stars":94,"forks":95,"last_commit_at":96,"license":97,"difficulty_score":10,"env_os":98,"env_gpu":98,"env_ram":98,"env_deps":99,"category_tags":109,"github_topics":110,"view_count":119,"oss_zip_url":79,"oss_zip_packed_at":79,"status":16,"created_at":120,"updated_at":121,"faqs":122,"releases":153},397,"peremartra\u002FLarge-Language-Model-Notebooks-Course","Large-Language-Model-Notebooks-Course","Practical course about Large Language Models. 
","Large-Language-Model-Notebooks-Course 是一个专注于大语言模型（LLM）实战的开源课程仓库。它为工程师、研究人员和开发者提供了一系列基于 Jupyter Notebook 的手册和项目案例，旨在将抽象的理论知识转化为可落地的代码能力。\n\n很多人学习 LLM 时面临“懂原理却不会用”的困境，这个项目通过循序渐进的教程解决了这一问题。课程内容涵盖三大板块：首先是基础技术与库的使用，包括 Chatbot 构建、代码生成及向量数据库；其次是具体项目实战，讲解设计决策与 LLMOps；最后是面向企业的规模化解决方案。\n\n其独特亮点在于紧跟前沿技术，不仅涉及 OpenAI API 和 Hugging Face 生态，还深入讲解了 PEFT、LoRA、QLoRA 等高效微调方法以及知识蒸馏。虽然它与某本出版书籍内容相关，但项目处于永久开发中，会不断融入新示例和章节。对于希望快速掌握 LLM 应用开发并构建实际产品的技术人员来说，这是一个非常实用的资源库。","# Build with LLMs: Hands-on Projects for Engineers, Researchers and Developers using Large Language Models, GPT, LLaMA, LangChain and Hugging Face\n\u003Cp align=\"center\">\n\u003Ca href=\"https:\u002F\u002Fhubs.la\u002FQ040tvsK0\">\n\u003Cimg width=\"451\" height=\"257\" alt=\"newinmeap\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fperemartra_Large-Language-Model-Notebooks-Course_readme_6750a726dcac.png\" \u002F>\n\u003C\u002Fa>\n\u003C\u002Fp>\n\n### 🚨 News: I'm writing \"Rearchitecting LLMs\" with Manning! Learn to dismantle and rebuild Transformers using Pruning, Knowledge Distillation, and Attention Bypass—the same techniques elite AI labs use to push the boundaries of model efficiency. [Check the MEAP here](https:\u002F\u002Fhubs.la\u002FQ040tvsK0). 50% off with code: MLMartra. 
🚨\n\n\u003Ctable>\n  \u003Ctr>\n    \u003Ctd  width=\"130\">\n      \u003Ca href=\"https:\u002F\u002Famzn.to\u002F4eanT1g\">\n        \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fperemartra_Large-Language-Model-Notebooks-Course_readme_7901ac4096f6.jpg\"  alt=\"Rearchitecting LLMs\" width=\"100%\">\n      \u003C\u002Fa>\n    \u003C\u002Ftd>\n    \u003Ctd>\n      \u003Cp>\n        This is the unofficial repository for the book: \n        \u003Ca href=\"https:\u002F\u002Famzn.to\u002F4eanT1g\"> \u003Cb>Large Language Models:\u003C\u002Fb> Apply and Implement Strategies for Large Language Models\u003C\u002Fa> (Apress).\n        The book is based on the content of this repository, but the notebooks are being updated, and I am incorporating new examples and chapters.\n        If you are looking for the official repository for the book, with the original notebooks, you should visit the \n        \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FApress\u002FLarge-Language-Models-Projects\">Apress repository\u003C\u002Fa>, where you can find all the notebooks in their original format as they appear in the book. Buy it at: \u003Ca href=\"https:\u002F\u002Famzn.to\u002F3Bq2zqs\">[Amazon]\u003C\u002Fa> \u003Ca href=\"https:\u002F\u002Flink.springer.com\u002Fbook\u002F10.1007\u002F979-8-8688-0515-8\">[Springer]\u003C\u002Fa>\n      \u003C\u002Fp>\n    \u003C\u002Ftd>\n  \u003C\u002Ftr>\n\u003C\u002Ftable>\n\n*Please note that the course on GitHub does not contain all the information that is in the book.*\n\n**This practical, free, hands-on course about Large Language Models and their applications is 👷🏼in permanent development👷🏼. I will be posting the different lessons and samples as I complete them.**\n\nThe course provides a hands-on experience using models from OpenAI and the Hugging Face library. We are going to see and use a lot of tools and practice with small projects that will grow as we apply the newly acquired knowledge. 
\n\n![Large Language Models Course Path](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fperemartra_Large-Language-Model-Notebooks-Course_readme_ccd1450b0f6e.jpg)\n\n\u003Ch1> The course is divided into three major sections:\u003C\u002Fh1>\n\n\u003Ch2>1- Techniques and Libraries:\u003C\u002Fh2> \nIn this part, we will explore different techniques through small examples that will enable us to build bigger projects in the following section. We will learn how to use the most common libraries in the world of Large Language Models, always with a practical focus, while basing our approach on published papers.\n\nSome of the topics and technologies covered in this section include: Chatbots, Code Generation, OpenAI API, Hugging Face, Vector databases, LangChain, Fine Tuning, PEFT Fine Tuning, Soft Prompt tuning, LoRA, QLoRA, Evaluate Models, Knowledge Distillation.\n\n\u003Ch2>2- Projects:\u003C\u002Fh2> \nWe will create projects, explaining design decisions. Each project may have more than one possible implementation, as often there is not just one perfect solution. In this section, we will also delve into LLMOps-related topics, although it is not the primary focus of the course.\n\n\u003Ch2>3- Enterprise Solutions:\u003C\u002Fh2> Large Language Models are not a standalone solution. In large corporate environments, they are just one piece of the puzzle. We will explore how to structure solutions capable of transforming organizations with thousands of employees, and how Large Language Models play a main role in these new solutions.\n\n\u003Ch1>How to use the course.\u003C\u002Fh1>\nUnder each section you can find different chapters, each formed by different lessons. The title of each lesson is a link to the lesson page, where you can find all the notebooks and articles of the lesson. \n\nEach lesson is composed of notebooks and articles. 
The notebooks contain sufficient information for understanding the code within them; the articles provide more detailed explanations about the code and the topics covered. \n\nMy advice is to have the article open alongside the notebook and follow along. Many of the articles offer small tips on variations that you can introduce to the notebooks. I recommend following them to get a clearer grasp of the concepts.\n\nMost of the notebooks are hosted on Colab, while a few are on Kaggle. Kaggle provides more memory in the free version compared to Colab, but I find that copying and sharing notebooks is simpler in Colab, and not everyone has a Kaggle account.\n\nSome of the notebooks require more memory than what the free version of Colab provides. As we are working with large language models, this is a common situation that will recur if you continue working with them. You can run the notebooks in your own environment or opt for the Pro version of Colab.\n_____________\n\u003Ch1>🚀1- Techniques and Libraries.\u003C\u002Fh1>\n\nEach notebook is supported with a Medium article where the code is explained in detail. \n## [Introduction to Large Language Models with OpenAI.](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Ftree\u002Fmain\u002F1-Introduction%20to%20LLMs%20with%20OpenAI)\nIn this first section of the course, we will learn to work with the OpenAI API by creating two small projects. We'll delve into OpenAI's roles and how to provide the necessary instructions to the model through the prompt to make it behave as we desire.\n\nThe first project is a restaurant chatbot where the model will take customer orders. Building upon this project, we will construct an SQL statement generator. 
Here, we'll attempt to create a secure prompt that only accepts SQL creation commands and nothing else.\n\n### Create Your First Chatbot Using GPT 3.5, OpenAI, Python and Panel.\nWe will be utilizing OpenAI GPT-3.5 and Panel to develop a straightforward Chatbot tailored for a fast food restaurant. During the course, we will explore the fundamentals of prompt engineering, including understanding the various OpenAI roles, manipulating temperature settings, and how to avoid Prompt Injections. \n| [article panel](https:\u002F\u002Fmedium.com\u002Ftowards-artificial-intelligence\u002Fcreate-your-first-chatbot-using-gpt-3-5-openai-python-and-panel-7ec180b9d7f2) \u002F [article gradio](https:\u002F\u002Fai.plainenglish.io\u002Fcreate-a-simple-chatbot-with-openai-and-gradio-202684d18f35?sk=e449515ec7a803ae828418011bbaca52)| [Notebook panel](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fblob\u002Fmain\u002F1-Introduction%20to%20LLMs%20with%20OpenAI\u002F1_1-First_Chatbot_OpenAI.ipynb) \u002F [notebook gradio](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fblob\u002Fmain\u002F1-Introduction%20to%20LLMs%20with%20OpenAI\u002F1_1-First_Chatbot_OpenAI_Gradio.ipynb)|\n| --- | --- |\n\n### How to Create a Natural Language to SQL Translator Using OpenAI API.\nFollowing the same framework utilized in the previous article to create the ChatBot, we made a few modifications to develop a Natural Language to SQL translator. In this case, the Model needs to be provided with the table structures, and adjustments were made to the prompt to ensure smooth functionality and avoid any potential malfunctions. With these modifications in place, the translator is capable of converting natural language queries into SQL queries. 
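As an illustration of the general pattern (not taken from the course notebooks; the schema, question, and function name below are hypothetical), an NL2SQL prompt can be assembled in the OpenAI chat-message format like this:

```python
# Sketch: build an NL2SQL prompt in the OpenAI chat-message format.
# The table schema and user question are hypothetical examples.
def build_nl2sql_messages(schema: str, question: str) -> list:
    system = (
        "You translate natural language into SQL. "
        "Only answer with a single SQL SELECT statement for this schema:\n"
        + schema
        + "\nRefuse any request that is not a data question."
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": question},
    ]

messages = build_nl2sql_messages(
    "CREATE TABLE employees (id INT, name TEXT, salary REAL);",
    "Who are the three best paid employees?",
)
```

The resulting `messages` list is what you would pass to a chat-completion call; constraining the system role this way is one simple line of defense against prompt injection.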
@fmquaglia has created a notebook using DBML to describe the tables, which is by far a better approach than the original.\n| [Article](https:\u002F\u002Fpub.towardsai.net\u002Fhow-to-create-a-natural-language-to-sql-translator-using-openai-api-e1b1f72ac35a) \u002F [Article Gradio](https:\u002F\u002Fmedium.com\u002Ftowards-artificial-intelligence\u002Ffirst-nl2sql-chat-with-openai-and-gradio-b1de0d6541b4)| [Notebook](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fblob\u002Fmain\u002F1-Introduction%20to%20LLMs%20with%20OpenAI\u002F1_2-Easy_NL2SQL.ipynb) \u002F [Notebook Gradio](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fblob\u002Fmain\u002F1-Introduction%20to%20LLMs%20with%20OpenAI\u002F1_2-Easy_NL2SQL_Gradio.ipynb) \u002F [Notebook DBML](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fblob\u002Fmain\u002F1-Introduction%20to%20LLMs%20with%20OpenAI\u002F1_2b-Easy_NL2SQL.ipynb)\n| --- | --- |\n\n### Brief Introduction to Prompt Engineering with OpenAI.\nWe will explore prompt engineering techniques to improve the results we obtain from models, such as how to format the answer and obtain a structured response using Few Shot Samples. \n| [Article](https:\u002F\u002Fmedium.com\u002Fgitconnected\u002Finfluencing-a-large-language-model-response-with-in-context-learning-b212f0eaa113) | [Notebook](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fblob\u002Fmain\u002F1-Introduction%20to%20LLMs%20with%20OpenAI\u002F1_3-Intro_Prompt_Engineering.ipynb)\n| --- | --- |\n\n## [Vector Databases with LLMs.](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Ftree\u002Fmain\u002F2-Vector%20Databases%20with%20LLMs) \nA brief introduction to Vector Databases, a technology that will accompany us in many lessons throughout the course. 
We will work on an example of Retrieval Augmented Generation using information from various news datasets stored in ChromaDB.\n\n### Influencing Language Models with Personalized Information using a Vector Database. \nIf there's one aspect gaining importance in the world of large language models, it's exploring how to leverage proprietary information with them. In this lesson, we explore a possible solution that involves storing information in a vector database, ChromaDB in our case, and using it to create enriched prompts.\n|[Article](https:\u002F\u002Fpub.towardsai.net\u002Fharness-the-power-of-vector-databases-influencing-language-models-with-personalized-information-ab2f995f09ba?sk=ea2c5286fbff8430e5128b0c3588dbab) | [Notebook](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fblob\u002Fmain\u002F2-Vector%20Databases%20with%20LLMs\u002F2_1_Vector_Databases_LLMs.ipynb) |\n| --- | --- |\n\n### Semantic Cache for RAG systems \nWe enhanced the RAG system by introducing a semantic cache layer capable of determining if a similar question has been asked before. If affirmative, it retrieves information from a cache system created with Faiss instead of accessing the Vector Database. \n\nThe inspiration and base code of the semantic cache present in this notebook exist thanks to the course: https:\u002F\u002Fmaven.com\u002Fboring-bot\u002Fadvanced-llm\u002F1\u002Fhome from Hamza Farooq.\n\n| Article | Notebook |\n| --- | ---|\n| WIP | [Notebook](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fblob\u002Fmain\u002F2-Vector%20Databases%20with%20LLMs\u002Fsemantic_cache_chroma_vector_database.ipynb) |\n\n\n## [LangChain.](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Ftree\u002Fmain\u002F3-LangChain)\nLangChain has been one of the libraries in the universe of large language models that has contributed the most to this revolution. 
\nIt allows us to chain calls to Models and other systems, allowing us to build applications based on large language models. In the course, we will use it several times, creating increasingly complex projects.\n\n### Retrieval Augmented Generation (RAG). Use the Data from your DataFrames with LLMs.\nIn this lesson, we used LangChain to enhance the notebook from the previous lesson, where we used data from two datasets to create an enriched prompt. This time, with the help of LangChain, we built a pipeline that is responsible for retrieving data from the vector database and passing it to the Language Model. The notebook is set up to work with two different datasets and two different models. One of the models is trained for Text Generation, while the other is trained for Text2Text Generation.\n| [Article](https:\u002F\u002Fmedium.com\u002Ftowards-artificial-intelligence\u002Fquery-your-dataframes-with-powerful-large-language-models-using-langchain-abe25782def5) | [Notebook](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fblob\u002Fmain\u002F3-LangChain\u002F3_1_RAG_langchain.ipynb) |\n| --- | --- |\n\n### Create a Moderation system using LangChain. \nWe will create a comment response system using a two-model pipeline built with LangChain. In this setup, the second model will be responsible for moderating the responses generated by the first model.\n\nOne effective way to prevent our system from generating unwanted responses is by using  a second model that has no direct interaction with users to handle response generation. \n\nThis approach can reduce the risk of undesired responses generated by the first model in response to the user's entry. \n\n\nI will create separate notebooks for this task. One will involve models from OpenAI, and the others will utilize open-source models provided by Hugging Face. The results obtained in the three notebooks are very different. 
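The two-model layout can be sketched independently of any specific library; both functions below are hypothetical stand-ins for real model calls, and the banned-word filter merely simulates the moderation step:

```python
# Sketch of a self-moderated pipeline: a first model drafts an answer,
# and a second model, which never interacts with the user directly,
# filters or rewrites that draft before it is published.
def draft_answer(user_comment: str) -> str:
    # Stand-in for the first LLM call (hypothetical).
    return f"Reply to: {user_comment}"

def moderate(draft: str) -> str:
    # Stand-in for the second LLM call; it only ever sees the draft.
    banned = ("idiot", "stupid")
    for word in banned:
        draft = draft.replace(word, "***")
    return draft

def respond(user_comment: str) -> str:
    # The user-facing output always passes through the moderator.
    return moderate(draft_answer(user_comment))
```

Because the moderator never receives the user's raw input, a prompt injection in the comment cannot directly steer the model that produces the final text.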
The system works much better with the OpenAI and LLAMA2 models. \n| Article | Notebook |\n| --- | --- |\n| [OpenAI article](https:\u002F\u002Fpub.towardsai.net\u002Fcreate-a-self-moderated-commentary-system-with-langchain-and-openai-406a51ce0c8d?sk=b4903b827e44642f7f7c311cebaef57f) | [OpenAI notebook](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fblob\u002Fmain\u002F3-LangChain\u002F3_2_OpenAI_Moderation_Chat.ipynb) |\n| [Llama2-7B Article](https:\u002F\u002Flevelup.gitconnected.com\u002Fcreate-a-self-moderated-comment-system-with-llama-2-and-langchain-656f482a48be?sk=701ead7afb80e015ea4345943a1aeb1d) | [Llama2-7B Notebook](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fblob\u002Fmain\u002F3-LangChain\u002F3_2_LLAMA2_Moderation_Chat.ipynb) |\n| No Article | [GPT-J Notebook](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fblob\u002Fmain\u002F3-LangChain\u002F3_2_GPT_Moderation_System.ipynb) |\n\n### Create a Data Analyst Assistant using an LLM Agent. \nAgents are one of the most powerful tools in the world of Large Language Models. The agent is capable of interpreting the user's request and using the tools and libraries at its disposal until it achieves the expected result.\n\nWith LangChain Agents, we are going to create in just a few lines one of the simplest yet incredibly powerful agents. The agent will act as a Data Analyst Assistant and help us analyze data contained in any Excel file. It will be able to identify trends, use models, make forecasts. 
In summary, we are going to create a simple agent that we can use in our daily work to analyze our data.\n| [Article](https:\u002F\u002Fpub.towardsai.net\u002Fcreate-your-own-data-analyst-assistant-with-langchain-agents-722f1cdcdd7e) | [Notebook](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fblob\u002Fmain\u002F3-LangChain\u002F3_3_Data_Analyst_Agent.ipynb) |\n| --- | --- |\n\n### Create a Medical ChatBot with LangChain and ChromaDB. \nIn this example, two technologies seen previously are combined: agents and vector databases. Medical information is stored in ChromaDB, and a LangChain Agent is created, which will fetch it only when necessary to create an enriched prompt that will be sent to the model to answer the user's question.\n\nIn other words, a RAG system is created to assist a Medical ChatBot.\n\n**Attention!!! Use it only as an example. Nobody should take the bot's recommendations as those of a real doctor. I disclaim all responsibility for the use that may be given to the ChatBot. I have built it only as an example of different technologies.**\n| [Article](https:\u002F\u002Fmedium.com\u002Ftowards-artificial-intelligence\u002Fquery-your-dataframes-with-powerful-large-language-models-using-langchain-abe25782def5) | [Notebook](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fblob\u002Fmain\u002F3-LangChain\u002F3_4_Medical_Assistant_Agent.ipynb) |\n| ------ | ------ |\n\n## [Evaluating LLMs.](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Ftree\u002Fmain\u002F4-Evaluating%20LLMs)\nThe metrics used to measure the performance of Large Language Models are quite different from the ones we've been using in more traditional models. We're shifting away from metrics like Accuracy, F1 score, or recall, and moving towards metrics like BLEU, ROUGE, or METEOR. 
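To build intuition for these overlap-based metrics, here is a toy unigram-recall computation in the spirit of ROUGE-1. It is illustrative only, not the official implementation used in the notebooks:

```python
# Toy ROUGE-1 recall: the fraction of reference unigrams that also
# appear in the candidate summary. Real ROUGE adds stemming, F-scores,
# and n-gram/longest-common-subsequence variants.
from collections import Counter

def rouge1_recall(reference: str, candidate: str) -> float:
    ref = Counter(reference.lower().split())
    cand = Counter(candidate.lower().split())
    # Clipped overlap: each reference token counts at most as often
    # as it appears in the candidate.
    overlap = sum(min(cnt, cand[tok]) for tok, cnt in ref.items())
    return overlap / max(sum(ref.values()), 1)

score = rouge1_recall("the cat sat on the mat", "the cat lay on the mat")
```

Here 5 of the 6 reference tokens are recovered, so the score is 5/6; the same clipping idea underlies BLEU's precision counts as well.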
\n\nThese metrics are tailored to the specific task assigned to the language model. \n\nIn this section, we'll explore examples of several of these metrics and how to use them to determine whether one model is superior to another for a given task. We'll delve into practical scenarios where these metrics help us make informed decisions about the performance of different models.\n\n### Evaluating translations with BLEU. \nBLEU is one of the first metrics established to evaluate the quality of translations. In the notebook, we compare the quality of a translation made by Google with one from an open-source model from Hugging Face.\n| Article WIP | [Notebook](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fblob\u002Fmain\u002F4-Evaluating%20LLMs\u002F4_1_bleu_evaluation.ipynb) |\n| --- | --- |\n\n### Evaluating Summarization with ROUGE. \nWe will explore the usage of the ROUGE metric to measure the quality of summaries generated by a language model. \nWe are going to use two T5 models, one of them being the t5-base model and the other a t5-base model fine-tuned specifically for creating summaries.\n\n| [Article](https:\u002F\u002Fmedium.com\u002Ftowards-artificial-intelligence\u002Frouge-metrics-evaluating-summaries-in-large-language-models-d200ee7ca0e6) | [Notebook](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fblob\u002Fmain\u002F4-Evaluating%20LLMs\u002F4_1_rouge_evaluations.ipynb) |\n| --- | --- |\n\n### Monitor an Agent using LangSmith. \nIn this initial example, you can observe how to use LangSmith to monitor the traffic between the various components that make up the Agent. The agent is a RAG system that utilizes a vector database to construct an enriched prompt and pass it to the model. 
LangSmith captures both the use of the Agent's tools and the decisions made by the model, providing information at all times about the sent\u002Freceived data, consumed tokens, and the duration of the query, all in a truly user-friendly environment.\n| [Article](https:\u002F\u002Fmedium.com\u002Ftowards-artificial-intelligence\u002Ftracing-a-llm-agent-with-langsmith-a81975634555)  | [Notebook](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fblob\u002Fmain\u002F4-Evaluating%20LLMs\u002F4_2_tracing_medical_agent.ipynb) |\n| ------ | ------ |\n\n### Evaluating the quality of summaries using Embedding distance with LangSmith. \nPreviously, in the notebook Rouge Metrics: Evaluating Summaries, we learned how to use ROUGE to evaluate which summary best approximated the one created by a human. This time, we will use embedding distance and LangSmith to verify which model produces summaries more similar to the reference ones.\n| [Article](https:\u002F\u002Fmedium.com\u002Ftowards-artificial-intelligence\u002Fevaluating-llm-summaries-using-embedding-distance-with-langsmith-5fb46fdae2a5?sk=24eb18ce187d28547cebd6fd3dd1ddad) | [Notebook](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fblob\u002Fmain\u002F4-Evaluating%20LLMs\u002F4_2_Evaluating_summaries_embeddings.ipynb) |\n| ------ | ------ |\n\n### Evaluating a RAG solution using Giskard. \nWe take the agent that functions as a medical assistant and incorporate Giskard to evaluate if its responses are correct. In this way, not only is the model's response evaluated, but so is the information retrieval in the vector database. 
Giskard is a tool that allows evaluating a complete RAG solution.\n| [Article](https:\u002F\u002Fmedium.com\u002Ftowards-artificial-intelligence\u002Fevaluating-a-rag-solution-with-giskard-1bc138fa44af?sk=10811fe2953eb511fb1ffefda326f7a2) | [Notebook](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fblob\u002Fmain\u002F4-Evaluating%20LLMs\u002F4_3_evaluating_rag_giskard.ipynb)\n| ------ | ------ |\n\n### Introduction to the lm-evaluation library from EleutherAI. \nThe lm-eval library by EleutherAI provides easy access to academic benchmarks that have become industry standards. It supports the evaluation of both Open Source models and APIs from providers like OpenAI, and even allows for the evaluation of adapters created using techniques such as LoRA.\n\nIn this notebook, I will focus on a small but important feature of the library: evaluating models compatible with Hugging Face's Transformers library.\n| Article - WIP| [Notebook](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fblob\u002Fmain\u002F4-Evaluating%20LLMs\u002F4_4_lm-evaluation-harness.ipynb)\n| ------ | ------ |\n\n\n## [Fine Tuning & Optimization.](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Ftree\u002Fmain\u002F5-Fine%20Tuning) \nIn the Fine Tuning & Optimization section, we will explore different techniques such as Prompt Fine Tuning or LoRA, and we will use the Hugging Face PEFT library to efficiently fine-tune Large Language Models. We will also explore techniques like quantization to reduce the size of the models. \n\n### Prompt tuning using the PEFT Library from Hugging Face. \nIn this notebook, two models are trained using Prompt Tuning from the PEFT library. 
This technique not only allows us to train by modifying the weights of very few parameters but also enables us to have different specialized models loaded in memory that use the same foundational model.\n\nPrompt tuning is an additive technique, and the weights of the pre-trained model are not modified. The weights that we modify in this case are those of virtual tokens that we add to the prompt.\n| [Article](https:\u002F\u002Fmedium.com\u002Ftowards-artificial-intelligence\u002Ffine-tuning-models-using-prompt-tuning-with-hugging-faces-peft-library-998ae361ee27) | [Notebook](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fblob\u002Fmain\u002F5-Fine%20Tuning\u002F5_4_Prompt_Tuning.ipynb) |\n| --- | --- |\n\n### Fine-Tuning with LoRA using PEFT from Hugging Face. \nAfter a brief explanation of how the fine-tuning technique LoRA works, we will fine-tune a model from the Bloom family to teach it to construct prompts that can be used to instruct large language models.\n|[Article](https:\u002F\u002Flevelup.gitconnected.com\u002Fefficient-fine-tuning-with-lora-optimal-training-for-large-language-models-266b63c973ca?sk=85d7b5d78e64e568faedfe07a35f81bd) | [Notebook](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fblob\u002Fmain\u002F5-Fine%20Tuning\u002F5_2_LoRA_Tuning.ipynb)\n| --- | --- |\n\n### Fine-Tuning a 7B Model on a single 16GB GPU using QLoRA.\nWe are going to see a brief introduction to quantization, used to reduce the size of big Large Language Models. With quantization, you can load big models while reducing the memory resources needed. It also applies to the fine-tuning process: you can fine-tune the model on a single GPU without consuming all its resources. \nAfter the brief explanation, we see an example of how it is possible to fine-tune a Bloom 7B model on a T4 16GB GPU on Google Colab. 
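To see why quantization saves memory, here is a toy symmetric int8 scheme in plain Python; it is a simplified sketch with made-up values, not the 4-bit NF4 scheme that QLoRA actually uses:

```python
# Toy symmetric int8 quantization: store weights as 8-bit integers plus
# a single float scale, and dequantize on the fly when needed.
# Illustrative only; real schemes (e.g. NF4 in QLoRA) are more refined.
def quantize(weights):
    scale = max(abs(w) for w in weights) / 127  # map the largest weight to 127
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    return [v * scale for v in q]

weights = [0.12, -0.50, 0.33, 1.27]
q, scale = quantize(weights)
restored = dequantize(q, scale)
```

Each 32-bit float becomes one byte (plus a shared scale), roughly a 4x memory reduction, at the cost of a small rounding error visible in `restored`.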
\n| [Article](https:\u002F\u002Fmedium.com\u002Ftowards-artificial-intelligence\u002Fqlora-training-a-large-language-model-on-a-16gb-gpu-00ea965667c1) | [Notebook](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fblob\u002Fmain\u002F5-Fine%20Tuning\u002F5_3_QLoRA_Tuning.ipynb) |\n| --- | --- |\n\n## [Pruning Techniques for Large Language Models](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Ftree\u002Fmain\u002F6-PRUNING)\n**This section is still under construction. The goal is to build a curriculum that will take us from the simplest pruning techniques to creating a model using the same techniques employed by leading companies in the field, such as Microsoft, Google, Nvidia, or OpenAI, to build their models.**\n\n### Prune a distilGPT2 model using l1 norm to determine less important neurons. \nIn the first notebook, the pruning process will be applied to the feedforward layers of a distilGPT2 model. This means the model will have reduced weights in those specific layers. The neurons to prune are selected based on their importance scores, which we compute using the L1 norm of their weights. It is a simple approach for this first example, one that can be used when you want to create a pruned model that mimics the base model in all areas. \n\nBecause the model's structure is altered, a new configuration file must be created to ensure it works correctly with the `transformers` library.\n\n| [Notebook: Pruning a distilGPT2 model.](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fblob\u002Fmain\u002F6-PRUNING\u002F6_1_pruning_structured_l1_diltilgpt2.ipynb) |\n| --- |\n\n### Prune a Llama3.2 model. \nIn this first notebook, we attempt to replicate the pruning process used with the distilGPT2 model but applied to a Llama model. By not taking the model's characteristics into account, the pruning process results in a completely unusable model. 
This notebook serves as an exercise to understand how crucial it is to know the structure of the models that will undergo pruning.
| [Notebook: Pruning a Llama3.2 model, INCORRECT APPROACH.](https://github.com/peremartra/Large-Language-Model-Notebooks-Course/blob/main/6_2_pruning_structured_llama3.2-1b_KO.ipynb) |
| --- |

The second notebook addresses the issues encountered when applying the same pruning process to the Llama model as was used for distilGPT2.

The correct approach is to treat the MLP layers of the model as pairs rather than individual layers and to calculate neuron importance by considering both layers together. Additionally, we switch to using the maximum absolute weight to decide which neurons remain in the pruned layers.

| [Pruning Llama3 Article](https://medium.com/towards-data-science/how-to-prune-llama-3-2-and-similar-large-language-models-cf18e9a2afb6?sk=af4c5e40e967437325050f019b3ae606) | [Notebook: Pruning a Llama3.2 model, CORRECT APPROACH.](https://github.com/peremartra/Large-Language-Model-Notebooks-Course/blob/main/6-PRUNING/6_3_pruning_structured_llama3.2-1b_OK.ipynb) |
| --- | --- |

## Structured Depth Pruning. Eliminating complete blocks from large language models. 
### Depth pruning in a Llama-3.2 model. 
In this notebook, we look at an example of depth pruning, which involves removing entire layers from the model.
The first thing to note is that removing entire layers from a transformer model usually has a significant impact on the model's performance.
This is a much more drastic architectural change compared to the simple removal of neurons from the MLP layers, as seen in the previous example.
| [Notebook: Depth pruning a Llama Model.](https://github.com/peremartra/Large-Language-Model-Notebooks-Course/blob/main/6-PRUNING/6_5_pruning_depth_st_llama3.2-1b_OK.ipynb) |
| --- |

## Attention Bypass
### Pruning Attention Layers. 
This notebook implements the ideas presented in the paper [What Matters in Transformers? Not All Attention is Needed](https://arxiv.org/abs/2406.15786). 

In this notebook, the attention layers that contribute the least to the model are marked to be bypassed, improving inference efficiency and reducing the model's resource consumption. To identify the layers that contribute the least, the model is activated with a simple prompt, and the cosine similarity between each layer's input and output is measured. The smaller the difference, the less modification the layer introduces.

The layer selection process implemented in the notebook is iterative: the least contributing layer is selected, and the contribution of the remaining layers is recalculated using the same prompt. This process is repeated until the desired number of layers has been deactivated.

Since this type of pruning does not alter the model's structure, it does not reduce the model's size.

| Article: WIP. | [Notebook: Pruning Attention Layers.](https://github.com/peremartra/Large-Language-Model-Notebooks-Course/blob/main/6-PRUNING/6_6_pruning_attention_layers.ipynb) |
| --- | --- |

### Adaptive Attention Bypass. 
Adaptive models are those that can dynamically adapt their structure or change the parts they execute, either while creating the response or upon receiving the user's request.
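Both ideas, scoring layers by how little they change their input and adapting how many of them to skip per prompt, can be sketched with toy NumPy code. The names, the threshold, and the complexity formula below are illustrative assumptions, not the notebooks' actual implementation:

```python
import numpy as np

def layer_contribution(x_in, x_out):
    """1 - cosine similarity between a layer's input and output.
    A score near 0 means the layer barely modifies the hidden state."""
    cos = np.dot(x_in, x_out) / (np.linalg.norm(x_in) * np.linalg.norm(x_out))
    return 1.0 - cos

# Toy hidden states: three "layers" perturb the input by different amounts.
x = np.array([1.0, 0.0, 0.0])
outputs = {
    0: np.array([1.0, 0.05, 0.0]),  # near-identity layer -> bypass candidate
    1: np.array([1.0, 1.0, 0.0]),   # strongly transforming layer
    2: np.array([1.0, 0.3, 0.0]),
}
scores = {i: layer_contribution(x, out) for i, out in outputs.items()}
ranked = sorted(scores, key=scores.get)  # least contributing layers first

def bypass_set(prompt_embeddings, ranked, max_bypass=2, threshold=1.0):
    """Illustrative adaptive rule: combine prompt length and embedding
    variance into a complexity score; easy prompts skip more layers."""
    complexity = len(prompt_embeddings) * float(np.var(prompt_embeddings))
    return ranked[:max_bypass] if complexity <= threshold else []

easy = np.array([0.10, 0.10, 0.12])                # short, low variance
hard = np.array([0.9, -0.8, 0.1, 1.5, -1.2, 0.7])  # longer, high variance

print(ranked)                    # [0, 2, 1]
print(bypass_set(easy, ranked))  # [0, 2] -> skip the two near-identity layers
print(bypass_set(hard, ranked))  # []     -> run the full model
```

In the real notebooks the scoring runs over actual hidden states captured with forward hooks, and the iterative variant re-scores the remaining layers after each bypass decision.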
This notebook represents one of the first, if not the first, implementations of an adaptive model compatible with the Transformers library.

The resulting model is capable of deciding which attention layers to execute depending on the complexity of the prompt it receives. It is the most complex notebook in the entire repository and is very close to what can be considered pure research. In fact, there is no paper that describes the functioning of the implemented method, so it is considered an original work by the author (Pere Martra).

The model goes through a calibration process in which the importance of each layer is decided and a configuration file is created. For each prompt received, the complexity is calculated using its length and embedding variance, and the model decides which layers it should use to provide a response to the user.
| Article: WIP. | [Notebook: Adaptive Attention Bypass.](https://github.com/peremartra/Large-Language-Model-Notebooks-Course/blob/main/6-PRUNING/6_6b_Adaptive_Inference_Attention_Pruning.ipynb) |
| --- | --- |

## Knowledge distillation. 
Knowledge Distillation involves training a smaller "student" model to mimic a larger, well-trained "teacher" model. The student learns not just from the correct labels but also from the probability distributions (soft targets) that the teacher model produces, effectively transferring the teacher's learned knowledge into a more compact form.

When combined with pruning, you first create a pruned version of your base model by removing less important connections. During this process, some knowledge is inevitably lost.
To recover this lost knowledge, you can apply Knowledge Distillation using the original base model as the teacher and the pruned model as the student, helping to restore some of the lost performance.

Both techniques address the same challenge: reducing model size and computational requirements while maintaining performance, making them crucial for deploying AI in resource-constrained environments like mobile devices.

### Recovering knowledge from the base model using KD. 
In this notebook, we use Knowledge Distillation to recover some of the knowledge lost during the model pruning process. Llama-3.2-1B is used as the teacher model, and its 40% pruned version as the student. We specifically improve the performance on the Lambada benchmark.
| [Notebook: Knowledge Distillation Llama 3.2.](https://github.com/peremartra/Large-Language-Model-Notebooks-Course/blob/main/6-PRUNING/7_1_knowledge_distillation_Llama.ipynb) |
| --- |

## Bias & Fairness in LLMs. 
This section introduces a preliminary line of work focused on detecting bias in LLMs by visualizing neural activations. While still in its early stages, these analyses pave the way for future fairness-aware pruning strategies, where structural pruning decisions also take into account the impact on different demographic or semantic groups.

### Visualizing Bias in State-of-the-Art Transformer Models.

This notebook introduces techniques for visualizing neural activations in Transformer models, as a first step toward detecting and mitigating bias in language models.
Techniques applied:
  *  Dimensionality reduction with PCA
  *  Visualization using heatmaps
  *  Differential activation analysis between contrastive groups

| Article | Notebook |
| --- | --- |
| [From Biased to Balanced: Visualizing and Fixing Bias in Transformer Models](https://medium.com/data-science-collective/from-biased-to-balanced-visualizing-and-fixing-bias-in-transformer-models-d1a82f35393c?sk=abd12073ee311c3752da3219a5baf20f) | [8_1_transformer_activations_visualization.ipynb](https://github.com/peremartra/Large-Language-Model-Notebooks-Course/blob/main/6-PRUNING/8_1_transformer_activations_visualization.ipynb) |

### Targeted Pruning for Bias Mitigation in Transformer Models
This notebook introduces a novel pruning method designed to mitigate bias in LLMs. By using pairs of contrastive prompts (e.g., "Black man" vs. "white man"), the method identifies neurons that respond differently depending on demographic cues. These neurons are then selectively removed based on a hybrid scoring system that combines bias contribution and structural importance.

The technique is implemented using the [optipfair library](https://github.com/peremartra/optipfair), which provides detailed visualizations of layer-wise activations and internal bias metrics.
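The underlying signal, neurons whose activations differ systematically between contrastive prompt pairs, can be sketched in NumPy. This is a toy illustration of the idea only; optipfair's hybrid score also weighs each neuron's structural importance:

```python
import numpy as np

# Toy activations for one MLP layer: rows = prompts, columns = neurons.
# Think of group_a as activations for "Black man ..." prompts and
# group_b as activations for the matching "white man ..." prompts.
group_a = np.array([[0.2, 1.5, 0.1],
                    [0.3, 1.4, 0.0]])
group_b = np.array([[0.2, 0.1, 0.1],
                    [0.4, 0.2, 0.1]])

# Mean absolute activation difference per neuron across the contrastive pairs.
bias_score = np.abs(group_a.mean(axis=0) - group_b.mean(axis=0))

# Neurons that react most to the demographic cue become pruning candidates.
candidates = np.argsort(bias_score)[::-1]
print(candidates[0])  # 1 -> the neuron whose activation tracks the cue
```

Pruning only the highest-scoring neurons is what keeps the intervention surgical: most of the network, and therefore most of its capability, is left untouched.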
You can explore the model's internal behavior interactively via the companion Hugging Face Space: [🌐 Optipfair Bias Analyzer on Hugging Face](https://huggingface.co/spaces/oopere/optipfair-bias-analyzer). 

The results speak for themselves: with just 0.13% of the parameters pruned, the model's internal bias metric was reduced by 22%, with minimal performance loss. This proof of concept demonstrates that bias-aware pruning can be both precise and efficient, offering a practical tool for building fairer AI systems.
| [Notebook: 8_2_Targeted_Pruning_for_Bias_Mitigation.ipynb](https://github.com/peremartra/Large-Language-Model-Notebooks-Course/blob/main/6-PRUNING/8_2_Targeted_Pruning_for_Bias_Mitigation.ipynb) |
| --- |

_____________
<h1>🚀2- Projects.</h1>

## [Natural Language to SQL.](https://github.com/peremartra/Large-Language-Model-Notebooks-Course/tree/main/P1-NL2SQL)
In this straightforward initial project, we are going to develop a SQL generator from natural language.
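A minimal sketch of the kind of prompt such a generator relies on is shown below. The schema and wording are hypothetical and far simpler than the prompts built in the project's notebooks:

```python
def nl2sql_prompt(question: str) -> str:
    """Build a constrained NL2SQL prompt (toy single-table schema;
    the course notebooks use richer table descriptions, e.g. DBML)."""
    schema = "CREATE TABLE employees (id INT, name TEXT, salary REAL);"
    return (
        "You are a system that translates user questions into SQL.\n"
        f"Database schema:\n{schema}\n"
        "Answer ONLY with a single SQL SELECT statement, nothing else.\n"
        f"Question: {question}\n"
        "SQL:"
    )

prompt = nl2sql_prompt("Who earns more than 50000?")
print(prompt.endswith("SQL:"))  # True -> the model completes with the query
```

Constraining the output format in the prompt is also the first line of defense against injection: anything other than a SELECT statement can be rejected before it reaches the database.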
We'll begin by creating the prompt to implement two solutions: one using OpenAI models running on Azure, and the other with an open-source model from Hugging Face.
| Article | Notebook |
| --- | --- |
| [Create a NL2SQL prompt for OpenAI](https://medium.com/towards-artificial-intelligence/create-a-superprompt-for-natural-language-to-sql-conversion-for-openai-9d19f0efe8f4) | [Prompt creation for OpenAI](https://github.com/peremartra/Large-Language-Model-Notebooks-Course/blob/main/P1-NL2SQL/nl2sql_prompt_OpenAI.ipynb) |
| WIP | [Prompt creation for defog/SQLCoder](https://github.com/peremartra/Large-Language-Model-Notebooks-Course/blob/main/P1-NL2SQL/nl2sql_prompt_SQLCoder.ipynb) |
| [Inference Azure Configuration.](https://pub.towardsai.net/how-to-set-up-an-nl2sql-system-with-azure-openai-studio-2fcfc7b57301) | [Using Azure Inference Point](https://github.com/peremartra/Large-Language-Model-Notebooks-Course/blob/main/P1-NL2SQL/NL2SQL_OpenAI_Azure.ipynb) |

## [Create and publish an LLM.](https://github.com/peremartra/Large-Language-Model-Notebooks-Course/blob/main/P2-MHF/readme.md) 
In this small project, we will create a new model by aligning a microsoft-phi-3 model with DPO and then publish it to Hugging Face.
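The DPO objective behind this alignment can be sketched numerically: the loss shrinks as the policy, relative to a frozen reference model, increases its preference for the chosen answer over the rejected one. This is a NumPy illustration of the formula, not the internals of a training library:

```python
import numpy as np

def dpo_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Direct Preference Optimization loss for one preference pair.
    Inputs are summed log-probabilities of the chosen/rejected answers."""
    margin = (policy_chosen - ref_chosen) - (policy_rejected - ref_rejected)
    return -np.log(1.0 / (1.0 + np.exp(-beta * margin)))  # -log(sigmoid(beta * margin))

# Before training, policy == reference, so there is no margin yet.
neutral = dpo_loss(-10.0, -10.0, -10.0, -10.0)  # -log(0.5) ~= 0.693
# After some training, the policy favors the chosen answer over the rejected one.
aligned = dpo_loss(-8.0, -12.0, -10.0, -10.0)

print(neutral > aligned)  # True -> the loss rewards the learned preference
```

`beta` controls how strongly the policy is allowed to drift from the reference model; the reference term is what keeps DPO from simply collapsing onto the preferred answers.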

| Article | Notebook |
| --- | --- |
| WIP | [Aligning a phi-3 model with DPO.](https://github.com/peremartra/Large-Language-Model-Notebooks-Course/blob/main/P2-MHF/Aligning_DPO_phi3.ipynb) |

_____________
<h1>🚀3- Architecting Enterprise Solutions.</h1>

## [Architecting a NL2SQL Solution for immense Enterprise Databases](https://github.com/peremartra/Large-Language-Model-Notebooks-Course/tree/main/E1-NL2SQL%20for%20big%20Databases)
In this initial solution, we design an architecture for an NL2SQL system capable of operating on a large database. The system is intended to be used with two or three different models; in fact, we use three models in the example. 

It's an architecture that enables a fast project kickoff, providing service for only a few tables in the database and allowing us to add more tables at our own pace.

## [Decoding Risk: Transforming Banks with Customer Embeddings.](https://github.com/peremartra/Large-Language-Model-Notebooks-Course/tree/main/E2-Transforming%20Banks%20With%20Embeddings)
In this solution, we explore the transformative power of embeddings and large language models (LLMs) in customer risk assessment and product recommendation in the financial industry. We'll be altering the format in which we store customer information, and consequently, we'll also be changing how this information travels within the systems, achieving important advantages. 

_____________
### Contributing to the course: 
Please, if you find any problems, open an [issue](https://github.com/peremartra/Large-Language-Model-Notebooks-Course/issues). I will do my best to fix it as soon as possible and give you credit.

If you'd like to make a contribution or suggest a topic, please don't hesitate to start a [Discussion](https://github.com/peremartra/Large-Language-Model-Notebooks-Course/discussions). I'd be delighted to receive any opinions or advice.

Don't be shy: share the course on your social networks and with your friends. Connect with me on [LinkedIn](https://www.linkedin.com/in/pere-martra/) or [Twitter](https://twitter.com/PereMartra) and feel free to share anything you'd like or ask any questions you may have.

Give a Star ⭐️ to the repository. It helps me a lot and encourages me to continue adding lessons. It's a nice way to support free Open Source courses like this one. 

_____________
# References & Papers used in the Course: 
* Pere Martra, [Fragile Knowledge, Robust Instruction-Following: The Width Pruning Dichotomy in Llama-3.2](https://www.techrxiv.org/users/1001026/articles/1361522-fragile-knowledge-robust-instruction-following-the-width-pruning-dichotomy-in-llama-3-2). Width pruning. 

* Tom Kocmi, Christian Federmann, [Large Language Models Are State-of-the-Art Evaluators of Translation Quality](https://arxiv.org/abs/2302.14520). Evaluating LLMs with LLMs. 

* Pere Martra, [Introduction to Large Language Models with OpenAI](https://doi.org/10.1007/979-8-8688-0515-8_1).

* [ReAct: Synergizing Reasoning and Acting in Language Models](https://arxiv.org/abs/2210.03629). LangChain & Agents Section. Medical Assistant Sample. 

* [The Power of Scale for Parameter-Efficient Prompt Tuning](https://doi.org/10.48550/arXiv.2104.08691). Fine Tuning & Optimization Section. Prompt Tuning Sample. 

* [LoRA: Low-Rank Adaptation of Large Language Models](https://arxiv.org/abs/2106.09685). Fine Tuning & Optimization Section. LoRA Fine-Tuning Sample.

* [QLoRA: Efficient Finetuning of Quantized LLMs](https://arxiv.org/abs/2305.14314). Fine Tuning & Optimization Section. QLoRA Fine-Tuning Sample.

* [How to Prompt LLMs for Text-to-SQL: A Study in Zero-shot, Single-domain, and Cross-domain Settings](https://arxiv.org/abs/2305.11853). Project: Natural Language to SQL. 

* Saurav Muralidharan, Sharath Turuvekere Sreenivas, Raviraj Joshi, Marcin Chochowski, Mostofa Patwary, Mohammad Shoeybi, Bryan Catanzaro, Jan Kautz, Pavlo Molchanov, "Compact Language Models via Pruning and Knowledge Distillation," arXiv preprint arXiv:2407.14679, 2024. Available at: [https://doi.org/10.48550/arXiv.2407.14679](https://doi.org/10.48550/arXiv.2407.14679).

* He, S., Sun, G., Shen, Z., & Li, A. (2024). What Matters in Transformers? Not All Attention is Needed. arXiv preprint arXiv:2406.15786. https://doi.org/10.48550/arXiv.2406.15786

* Kim, B. K., Kim, G., Kim, T. H., Castells, T., Choi, S., Shin, J., & Song, H. K. (2024). Shortened LLaMA: A Simple Depth Pruning for Large Language Models. arXiv preprint arXiv:2402.02834. https://doi.org/10.48550/arXiv.2402.02834

* Martra, P. (2024, December 26). Exploring GLU Expansion Ratios: Structured Pruning in Llama-3.2 Models. 
https://doi.org/10.31219/osf.io/qgxea

```
@software{optipfair2025,
  author = {Pere Martra},
  title = {OptiPFair: A Library for Structured Pruning of Large Language Models},
  year = {2025},
  url = {https://github.com/peremartra/optipfair}
}
```
href=\"https:\u002F\u002Flink.springer.com\u002Fbook\u002F10.1007\u002F979-8-8688-0515-8\">[Springer]\u003C\u002Fa>\n      \u003C\u002Fp>\n    \u003C\u002Ftd>\n  \u003C\u002Ftr>\n\u003C\u002Ftable>\n\n*请注意，GitHub 上的课程并不包含书中的所有信息。*\n\n**这个关于大型语言模型及其应用的实用免费动手课程 👷🏼处于永久开发中👷🏼。我将随着完成进度发布不同的课程和示例。**\n\n本课程提供使用 OpenAI 和 Hugging Face 库中的模型的动手体验。我们将看到并使用大量工具，并通过小型项目进行练习，随着我们应用新获得的知识，这些项目将不断增长。 \n\n![Large Language Models Course Path](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fperemartra_Large-Language-Model-Notebooks-Course_readme_ccd1450b0f6e.jpg)\n\n\u003Ch1> 课程分为三个主要部分：\u003C\u002Fh1>\n\n\u003Ch2>1- 技术与库：\u003C\u002Fh2> \n在这一部分，我们将通过小型示例探索不同的技术，使我们能够在下一部分构建更大的项目。我们将学习如何使用大型语言模型世界中最常见的库，始终以实用为重点，同时基于已发表的论文建立我们的方法。\n\n本节涵盖的一些主题和技术包括：聊天机器人、代码生成、OpenAI API、Hugging Face、向量数据库、LangChain、微调（Fine Tuning）、参数高效微调（PEFT Fine Tuning）、软提示调优（Soft Prompt tuning）、低秩适应（LoRA）、量化低秩适应（QLoRA）、评估模型、知识蒸馏。\n\n\u003Ch2>2- 项目：\u003C\u002Fh2> \n我们将创建项目，并解释设计决策。每个项目可能有多种实现方式，因为通常不存在唯一的完美解决方案。在本节中，我们还将深入探讨与 LLMOps（大模型运维）相关的主题，尽管这不是课程的主要重点。\n\n\u003Ch2>3- 企业解决方案：\u003C\u002Fh2> 大型语言模型并非独立的解决方案。在大型企业中，它们只是拼图的一部分。我们将探索如何构建能够转变拥有数千名员工的组织的解决方案，以及大型语言模型在这些新解决方案中扮演的关键角色。\n\n\u003Ch1>如何使用课程。\u003C\u002Fh1>\n在每个部分下，您可以找到不同的章节，这些章节由不同的课程组成。课程的标题是链接到课程页面的链接，您可以在那里找到该课程的所有笔记本和文章。 \n\n每个课程由笔记本和文章组成。笔记本包含足够的信息以理解其中的代码，文章则提供更详细的关于代码和所涵盖主题的解释。 \n\n我的建议是将文章与笔记本一起打开并跟随操作。许多文章提供了可以在笔记本中引入的小变体技巧。我建议您遵循它们以增强概念的理解清晰度。\n\n大多数笔记本托管在 Colab 上，少数托管在 Kaggle 上。Kaggle 的免费版提供的内存比 Colab 更多，但我发现复制和共享笔记本在 Colab 中更简单，而且并非每个人都有 Kaggle 账户。\n\n一些笔记本需要的内存超过了 Colab 免费版提供的内存。由于我们正在处理大型语言模型，这是一个常见情况，如果您继续与它们一起工作，这种情况会再次出现。您可以在自己的环境中运行笔记本，或者选择 Colab 的专业版。\n_____________\n\u003Ch1>🚀1- 技术与库。\u003C\u002Fh1>\n\n每个笔记本都配有 Medium 文章支持，其中详细解释了代码。 \n## [使用 OpenAI 介绍大型语言模型。](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Ftree\u002Fmain\u002F1-Introduction%20to%20LLMs%20with%20OpenAI)\n在课程的这一第一部分中，我们将通过学习创建两个小型项目来了解如何使用 OpenAI API。我们将深入研究 OpenAI 
的角色以及如何通过提示（Prompt）向模型提供必要的指令，使其按照我们的期望行为。\n\n第一个项目是一个餐厅聊天机器人，模型将接收客户订单。在此基础上，我们将构建一个 SQL 语句生成器。在这里，我们将尝试创建一个安全的提示，仅接受 SQL 创建命令，不接受其他任何内容。\n\n### 使用 GPT 3.5、OpenAI、Python 和 Panel 创建您的第一个聊天机器人。\n我们将利用 OpenAI GPT-3.5 和 Panel 开发一个针对快餐店的简单聊天机器人。在课程期间，我们将探索提示工程（Prompt Engineering）的基础知识，包括理解各种 OpenAI 角色、调整温度设置（temperature settings），以及如何避免提示注入（Prompt Injections）。 \n| [Panel 文章](https:\u002F\u002Fmedium.com\u002Ftowards-artificial-intelligence\u002Fcreate-your-first-chatbot-using-gpt-3-5-openai-python-and-panel-7ec180b9d7f2) \u002F [Gradio 文章](https:\u002F\u002Fai.plainenglish.io\u002Fcreate-a-simple-chatbot-with-openai-and-gradio-202684d18f35?sk=e449515ec7a803ae828418011bbaca52)| [Panel 笔记本](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fblob\u002Fmain\u002F1-Introduction%20to%20LLMs%20with%20OpenAI\u002F1_1-First_Chatbot_OpenAI.ipynb) \u002F [Gradio 笔记本](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fblob\u002Fmain\u002F1-Introduction%20to%20LLMs%20with%20OpenAI\u002F1_1-First_Chatbot_OpenAI_Gradio.ipynb)|\n| --- | --- |\n\n### 使用 OpenAI API 创建自然语言到 SQL 翻译器。\n遵循前篇文章中用于创建 ChatBot（聊天机器人）的相同框架，我们进行了一些修改来开发一个自然语言到 SQL 翻译器。在这种情况下，模型需要被提供表结构，并对提示词进行了调整以确保功能顺畅并避免任何潜在故障。有了这些修改，该翻译器能够将自然语言查询转换为 SQL 查询。@fmquaglia 创建了一个使用 DBML（数据库标记语言）来描述表的笔记本，这迄今为止是比原始方法更好的方案。\n| [文章](https:\u002F\u002Fpub.towardsai.net\u002Fhow-to-create-a-natural-language-to-sql-translator-using-openai-api-e1b1f72ac35a) \u002F [Gradio 文章](https:\u002F\u002Fmedium.com\u002Ftowards-artificial-intelligence\u002Ffirst-nl2sql-chat-with-openai-and-gradio-b1de0d6541b4)| [笔记本](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fblob\u002Fmain\u002F1-Introduction%20to%20LLMs%20with%20OpenAI\u002F1_2-Easy_NL2SQL.ipynb) \u002F [Gradio 
笔记本](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fblob\u002Fmain\u002F1-Introduction%20to%20LLMs%20with%20OpenAI\u002F1_2-Easy_NL2SQL_Gradio.ipynb) \u002F [DBML 笔记本](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fblob\u002Fmain\u002F1-Introduction%20to%20LLMs%20with%20OpenAI\u002F1_2b-Easy_NL2SQL.ipynb)\n| --- | --- |\n\n### 使用 OpenAI 进行提示工程简介。\n我们将探索提示工程技术，以改善从模型获得的结果。例如如何使用 Few Shot Samples（少样本示例）来格式化答案并获得结构化响应。 \n| [文章](https:\u002F\u002Fmedium.com\u002Fgitconnected\u002Finfluencing-a-large-language-model-response-with-in-context-learning-b212f0eaa113) | [笔记本](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fblob\u002Fmain\u002F1-Introduction%20to%20LLMs%20with%20OpenAI\u002F1_3-Intro_Prompt_Engineering.ipynb)\n| --- | --- |\n\n## [使用大型语言模型 (LLMs) 的向量数据库](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Ftree\u002Fmain\u002F2-Vector%20Databases%20with%20LLMs) \n对向量数据库的简要介绍，这是一项将在课程许多课程中伴随我们的技术。我们将基于存储在 ChromaDB 中的各种新闻数据集信息，进行检索增强生成 (RAG) 的示例工作。\n\n### 使用向量数据库通过个性化信息影响语言模型。 \n如果大型语言模型世界中有一个方面正在变得重要，那就是探索如何利用专有信息与它们协作。在本课中，我们探索了一种可能的解决方案，涉及将信息存储在向量数据库中（在我们的案例中是 ChromaDB），并使用它来创建增强的提示词。\n|[文章](https:\u002F\u002Fpub.towardsai.net\u002Fharness-the-power-of-vector-databases-influencing-language-models-with-personalized-information-ab2f995f09ba?sk=ea2c5286fbff8430e5128b0c3588dbab) | [笔记本](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fblob\u002Fmain\u002F2-Vector%20Databases%20with%20LLMs\u002F2_1_Vector_Databases_LLMs.ipynb) |\n| --- | --- |\n\n### RAG 系统的语义缓存 \n我们通过引入语义缓存层增强了 RAG 系统，该层能够确定之前是否问过类似的问题。如果是肯定的，它会从使用 Faiss 创建的缓存系统中检索信息，而不是访问向量数据库。 \n\n本笔记本中实现的语义缓存的灵感和基础代码源自 Hamza Farooq 的课程：https:\u002F\u002Fmaven.com\u002Fboring-bot\u002Fadvanced-llm\u002F1\u002Fhome。\n\n| 文章 | 笔记本 |\n| --- | ---|\n| 进行中 | 
[笔记本](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fblob\u002Fmain\u002F2-Vector%20Databases%20with%20LLMs\u002Fsemantic_cache_chroma_vector_database.ipynb) |\n\n\n## [LangChain](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Ftree\u002Fmain\u002F3-LangChain)\nLangChain 一直是大型语言模型宇宙中对这场革命贡献最大的库之一。 \n它允许我们将对模型和其他系统的调用链接起来，使我们能够构建基于大型语言模型的应用程序。在课程中，我们将多次使用它，创建越来越复杂的项目。\n\n### 检索增强生成 (RAG)。使用大型语言模型 (LLMs) 处理数据框 (DataFrames) 中的数据。\n在本课中，我们使用 LangChain 增强了上一课的笔记本，当时我们使用了两个数据集的数据来创建增强提示词。这次，在 LangChain 的帮助下，我们构建了一个管道，负责从向量数据库检索数据并将其传递给语言模型。该笔记本设置为与两个不同的数据集和两个不同的模型一起工作。其中一个模型是为文本生成训练的，而另一个是为文本到文本生成训练的。\n| [文章](https:\u002F\u002Fmedium.com\u002Ftowards-artificial-intelligence\u002Fquery-your-dataframes-with-powerful-large-language-models-using-langchain-abe25782def5) | [笔记本](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fblob\u002Fmain\u002F3-LangChain\u002F3_1_RAG_langchain.ipynb) |\n| --- | --- |\n\n### 使用 LangChain 创建审核系统。 \n我们将使用基于 LangChain 构建的双模型管道创建一个评论回复系统。在这种设置中，第二个模型将负责审核第一个模型生成的响应。\n\n防止我们的系统生成不期望响应的一个有效方法是使用一个不与用户直接交互的第二个模型来处理响应生成。 \n\n这种方法可以降低第一个模型因用户输入而产生不期望响应的风险。 \n\n\n我将为这项任务创建单独的笔记本。其中一个将涉及来自 OpenAI 的模型，其他则将利用 Hugging Face 提供的开源模型。三个笔记本中获得的结果非常不同。系统在 OpenAI 和 LLAMA2 模型上表现要好得多。 \n| 文章 | 笔记本 |\n| --- | --- |\n| [OpenAI 文章](https:\u002F\u002Fpub.towardsai.net\u002Fcreate-a-self-moderated-commentary-system-with-langchain-and-openai-406a51ce0c8d?sk=b4903b827e44642f7f7c311cebaef57f) | [OpenAI 笔记本](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fblob\u002Fmain\u002F3-LangChain\u002F3_2_OpenAI_Moderation_Chat.ipynb) |\n| [Llama2-7B 文章](https:\u002F\u002Flevelup.gitconnected.com\u002Fcreate-a-self-moderated-comment-system-with-llama-2-and-langchain-656f482a48be?sk=701ead7afb80e015ea4345943a1aeb1d) | [Llama2-7B 
笔记本](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fblob\u002Fmain\u002F3-LangChain\u002F3_2_LLAMA2_Moderation_Chat.ipynb) |\n| 无文章 | [GPT-J 笔记本](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fblob\u002Fmain\u002F3-LangChain\u002F3_2_GPT_Moderation_System.ipynb) |\n\n### 使用 LLM Agent（大语言模型智能体）创建数据分析师助手。 \n智能体（Agent）是大语言模型领域中最强大的工具之一。该智能体能够解释用户的请求，并利用其可用的工具和库，直到达成预期结果。\n\n使用 LangChain Agents，我们将用短短几行代码创建一个最简单却极其强大的智能体。该智能体将充当数据分析师助手，帮助我们分析任何 Excel 文件中包含的数据。它将能够识别趋势、使用模型、进行预测。总之，我们将创建一个简单的智能体，用于我们日常工作中分析数据。\n| [文章](https:\u002F\u002Fpub.towardsai.net\u002Fcreate-your-own-data-analyst-assistant-with-langchain-agents-722f1cdcdd7e) | [笔记本](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fblob\u002Fmain\u002F3-LangChain\u002F3_3_Data_Analyst_Agent.ipynb) |\n| --- | --- |\n\n### 使用 LangChain 和 ChromaDB 创建医疗聊天机器人。 \n在此示例中，结合了之前看到的两种技术：智能体和向量数据库。医疗信息存储在 ChromaDB 中，并创建了一个 LangChain Agent，它仅在必要时获取这些信息，以创建一个增强型提示（prompt），该提示将被发送给模型以回答用户的问题。\n\n换句话说，创建了一个 RAG（检索增强生成）系统来辅助医疗聊天机器人。\n\n**注意！！！仅将其作为示例使用。任何人都不应将机器人的建议视为真实医生的建议。我对聊天机器人可能产生的使用后果不承担任何责任。我构建它仅仅是为了展示不同技术的示例。**\n| [文章](https:\u002F\u002Fmedium.com\u002Ftowards-artificial-intelligence\u002Fquery-your-dataframes-with-powerful-large-language-models-using-langchain-abe25782def5) | [笔记本](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fblob\u002Fmain\u002F3-LangChain\u002F3_4_Medical_Assistant_Agent.ipynb) |\n| ------ | ------ |\n\n## [评估 LLMs（大语言模型）。](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Ftree\u002Fmain\u002F4-Evaluating%20LLMs)\n用于衡量大语言模型性能的指标与我们在传统模型中使用的指标有很大不同。我们正逐渐放弃准确率（Accuracy）、F1 分数或召回率（recall）等指标，转而采用 BLEU、ROUGE 或 METEOR 等指标。 \n\n这些指标是专门为分配给语言模型的具体任务量身定制的。 \n\n在本节中，我们将探讨其中几种指标的示例，以及如何利用它们来确定某个模型在给定任务上是否优于另一个模型。我们将深入探讨这些指标帮助我们就不同模型的性能做出明智决策的实际场景。\n\n### 使用 BLEU 评估翻译质量。 \nBLEU 
是最早建立的用于评估翻译质量的指标之一。在笔记本中，我们比较了 Google 制作的翻译与来自 Hugging Face 的开源模型的翻译质量。\n| 文章 WIP | [笔记本](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fblob\u002Fmain\u002F4-Evaluating%20LLMs\u002F4_1_bleu_evaluation.ipynb) |\n| --- | --- |\n\n### 使用 ROUGE 评估摘要生成。 \n我们将探索如何使用 ROUGE 指标来衡量语言模型生成的摘要的质量。 \n我们将使用两个 T5 模型，一个是 t5-Base 模型，另一个是专门设计用于创建摘要的 t5-base 微调模型。\n\n| [文章](https:\u002F\u002Fmedium.com\u002Ftowards-artificial-intelligence\u002Frouge-metrics-evaluating-summaries-in-large-language-models-d200ee7ca0e6) | [笔记本](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fblob\u002Fmain\u002F4-Evaluating%20LLMs\u002F4_1_rouge_evaluations.ipynb) |\n| --- | --- |\n\n### 使用 LangSmith 监控智能体。 \n在这个初始示例中，您可以观察如何使用 LangSmith 监控构成智能体的各个组件之间的流量。该智能体是一个 RAG 系统，利用向量数据库构建增强型提示并将其传递给模型。LangSmith 捕获智能体工具的使用情况以及模型做出的决策，随时提供有关发送\u002F接收数据、消耗 token、查询持续时间的信息，并且这一切都在一个真正用户友好的环境中呈现。\n| [文章](https:\u002F\u002Fmedium.com\u002Ftowards-artificial-intelligence\u002Ftracing-a-llm-agent-with-langsmith-a81975634555)  | [笔记本](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fblob\u002Fmain\u002F4-Evaluating%20LLMs\u002F4_2_tracing_medical_agent.ipynb) |\n| ------ | ------ |\n\n### 使用 LangSmith 通过嵌入距离评估摘要质量。 \n之前在笔记本\"Rouge Metrics: Evaluating Summaries\"中，我们学习了如何使用 ROUGE 来评估哪个摘要最接近人工创建的摘要。这次，我们将使用嵌入距离和 LangSmith 来验证哪个模型生成的摘要更接近参考摘要。\n| [文章](https:\u002F\u002Fmedium.com\u002Ftowards-artificial-intelligence\u002Fevaluating-llm-summaries-using-embedding-distance-with-langsmith-5fb46fdae2a5?sk=24eb18ce187d28547cebd6fd3dd1ddad) | [笔记本](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fblob\u002Fmain\u002F4-Evaluating%20LLMs\u002F4_2_Evaluating_summaries_embeddings.ipynb) |\n| ------ | ------ |\n\n### 使用 Giskard 评估 RAG 解决方案。 \n我们采用充当医疗助手的智能体，并整合 Giskard 来评估其回复是否正确。这样，不仅评估模型的回复，还评估向量数据库中的信息检索。Giskard 是一种允许评估完整 RAG 解决方案的解决方案。\n| 
[文章](https:\u002F\u002Fmedium.com\u002Ftowards-artificial-intelligence\u002Fevaluating-a-rag-solution-with-giskard-1bc138fa44af?sk=10811fe2953eb511fb1ffefda326f7a2) | [笔记本](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fblob\u002Fmain\u002F4-Evaluating%20LLMs\u002F4_3_evaluating_rag_giskard.ipynb)\n| ------ | ------ |\n\n### Eluther.ai 的 lm-evaluation 库简介。 \nEleutherAI 的 lm-eval 库提供了对已成为行业标准的学术基准的便捷访问。它支持评估开源模型以及像 OpenAI 这样的提供商的 API，甚至允许评估使用 LoRA 等技术创建的适配器。\n\n在这个笔记本中，我将专注于该库的一个小而重要的功能：评估兼容 Hugging Face Transformers 库的模型。\n| 文章 - WIP | [笔记本](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fblob\u002Fmain\u002F4-Evaluating%20LLMs\u002F4_4_lm-evaluation-harness.ipynb)\n| ------ | ------ |\n\n## [微调与优化。](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Ftree\u002Fmain\u002F5-Fine%20Tuning) \n在“微调与优化”部分，我们将探索不同的技术，例如 Prompt Fine Tuning（提示微调）或 LoRA（低秩适应），并将使用 Hugging Face 的 PEFT（参数高效微调）库来高效地微调大型语言模型（Large Language Models）。我们将探索量化（quantization）等技术以减少模型的权重占用。 \n\n### 使用 Hugging Face 的 PEFT 库进行 Prompt Tuning。 \n在此笔记本中，我们使用来自 PEFT 库的 Prompt Tuning 训练了两个模型。该技术不仅允许我们通过修改极少参数的权重来进行训练，还使我们能够在内存中加载不同的专用模型，这些模型都使用同一个基础模型。\n\nPrompt Tuning 是一种添加式技术，预训练模型的权重不会被修改。在这种情况下，我们修改的是添加到提示词中的虚拟 token 的权重。\n| [文章](https:\u002F\u002Fmedium.com\u002Ftowards-artificial-intelligence\u002Ffine-tuning-models-using-prompt-tuning-with-hugging-faces-peft-library-998ae361ee27) | [笔记本](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fblob\u002Fmain\u002F5-Fine%20Tuning\u002F5_4_Prompt_Tuning.ipynb) |\n| --- | --- |\n\n### 使用 Hugging Face 的 PEFT 进行 LoRA 微调。 \n在简要解释 LoRA 微调技术的工作原理后，我们将微调一个来自 Bloom 系列的模型，教它构建可用于指令大型语言模型的提示词。\n|[文章](https:\u002F\u002Flevelup.gitconnected.com\u002Fefficient-fine-tuning-with-lora-optimal-training-for-large-language-models-266b63c973ca?sk=85d7b5d78e64e568faedfe07a35f81bd) | 
[笔记本](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fblob\u002Fmain\u002F5-Fine%20Tuning\u002F5_2_LoRA_Tuning.ipynb)\n| --- | --- |\n\n### 使用 QLoRA 在单个 16GB GPU 上微调 7B 模型。\n我们将简要介绍量化（quantization），用于减小大型语言模型的大小。通过量化，你可以加载大模型并减少所需的内存资源。这也适用于微调过程，你可以在单个 GPU（图形处理器）上微调模型而无需消耗所有资源。 \n简要解释之后，我们将看到一个示例，展示如何在 Google Colab 上的 T4 16GB GPU 上微调 Bloom 7B 模型。 \n| [文章](https:\u002F\u002Fmedium.com\u002Ftowards-artificial-intelligence\u002Fqlora-training-a-large-language-model-on-a-16gb-gpu-00ea965667c1) | [笔记本](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fblob\u002Fmain\u002F5-Fine%20Tuning\u002F5_3_QLoRA_Tuning.ipynb) |\n| --- | --- |\n\n## [大型语言模型的剪枝技术](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Ftree\u002Fmain\u002F6-PRUNING)\n**本部分仍在建设中。目标是构建一个课程，将我们从最基础的剪枝技术引导至使用领先公司（如 Microsoft、Google、Nvidia 或 OpenAI）构建其模型时采用的相同技术来创建模型。**\n\n### 使用 L1 范数剪枝 distilGPT2 模型以确定不太重要的神经元。 \n在第一个笔记本中，剪枝（Pruning）过程将应用于 distilGPT2 模型的前馈层。这意味着该特定层的模型权重将减少。要剪枝的神经元是根据其重要性分数选择的，我们使用其权重的 L1 范数（L1 norm）来计算这些分数。这是一个简单的方法，用于此第一个示例，当你想要创建一个在所有方面都模仿基础模型的剪枝模型时可以使用它。 \n\n通过更改模型结构，必须创建一个新的配置文件以确保它能与 `transformers` 库正确配合工作。\n\n| [笔记本：剪枝 distilGPT2 模型。](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fblob\u002Fmain\u002F6-PRUNING\u002F6_1_pruning_structured_l1_diltilgpt2.ipynb) |\n| --- |\n\n### 剪枝 Llama3.2 模型。 \n在第一个笔记本中，我们尝试复制用于 distilGPT2 模型的剪枝过程，但将其应用于 Llama 模型。如果不考虑模型的特性，剪枝过程会导致生成完全无法使用的模型。此笔记本作为一个练习，旨在理解了解将要进行剪枝的模型结构是多么关键。\n| [笔记本：剪枝 Llama3.2 模型 错误方法。](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fblob\u002Fmain\u002F6_2_pruning_structured_llama3.2-1b_KO.ipynb) |\n| --- |\n\n第二个笔记本解决了将用于 distilGPT2 的相同剪枝过程应用于 Llama 模型时所遇到的问题。\n\n正确的方法是将模型的 MLP（多层感知机）层视为成对而非单独层，并通过同时考虑这两层来计算神经元的重要性。此外，我们改用最大绝对权重来决定哪些神经元保留在剪枝后的层中。\n\n| [剪枝 Llama3 
文章](https:\u002F\u002Fmedium.com\u002Ftowards-data-science\u002Fhow-to-prune-llama-3-2-and-similar-large-language-models-cf18e9a2afb6?sk=af4c5e40e967437325050f019b3ae606) | [笔记本：剪枝 Llama3.2 模型 正确方法。](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fblob\u002Fmain\u002F6-PRUNING\u002F6_3_pruning_structured_llama3.2-1b_OK.ipynb) |\n| --- | --- | \n\n## 结构化深度剪枝。消除大型语言模型中的完整模块。 \n### Llama-3.2 模型中的深度剪枝。 \n在此笔记本中，我们将查看深度剪枝的一个示例，这涉及从模型中移除整个层。\n首先需要注意的是，从 Transformer 模型中移除整个层通常会对模型性能产生重大影响。与之前示例中看到的从 MLP 层中简单移除神经元相比，这是一种更为剧烈的架构变更。\n| [笔记本：Llama 模型深度剪枝。](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fblob\u002Fmain\u002F6-PRUNING\u002F6_5_pruning_depth_st_llama3.2-1b_OK.ipynb) |\n| --- |\n\n## 注意力旁路\n\n### 剪枝注意力层。\n本笔记本实现了论文中提出的理念：[Transformer 中什么最重要？并非所有注意力都是必需的](https:\u002F\u002Farxiv.org\u002Fabs\u002F2406.15786)。\n\n在本笔记本中，对模型贡献最小的注意力层被标记为跳过，从而提高推理效率并减少模型的资源消耗。为了识别贡献最小的层，使用简单的提示激活，并测量该层输入和输出之间的余弦相似度。差异越小，该层引入的修改就越少。\n\n笔记本中实现的层选择过程是迭代式的。也就是说，选择贡献最小的层，并使用相同的提示重新计算剩余层的贡献。重复此过程，直到禁用所需数量的层为止。\n\n由于这种类型的剪枝不改变模型结构，因此它不会减少模型的大小。\n\n| 文章：进行中。 | [笔记本：剪枝注意力层。](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fblob\u002Fmain\u002F6-PRUNING\u002F6_6_pruning_attention_layers.ipynb) |\n| --- | --- | \n\n### 自适应注意力绕过。\n自适应模型是指那些能够动态调整其结构或更改其执行部分的模型，无论是在生成响应时还是在接收用户请求时。本笔记本代表了与 Transformers 库兼容的自适应模型的最早实现之一，如果不是第一个的话。\n\n生成的模型能够根据接收到的提示的复杂性来决定执行哪些注意力层。它是整个仓库中最复杂的笔记本，非常接近可被视为纯研究的内容。事实上，没有论文描述所实现方法的运作机制，因此它被认为是作者（Pere Martra）的原创作品。\n\n模型经过校准过程，在此过程中决定每个层的重要性，并创建配置文件。对于接收到的每个提示，使用其长度和嵌入方差计算复杂度，然后模型决定应使用哪些层来向用户提供响应。\n| 文章：进行中。 | [笔记本：自适应注意力绕过。](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fblob\u002Fmain\u002F6-PRUNING\u002F6_6b_Adaptive_Inference_Attention_Pruning.ipynb) |\n| --- | --- |\n\n## 知识蒸馏 (Knowledge 
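The paired treatment of the MLP projections described for the Llama pruning notebooks — one score per neuron, computed jointly over both GLU branches using the maximum absolute weight — might look like this toy NumPy sketch. The names follow Llama's `gate_proj`/`up_proj` convention; the dimensions are made up:

```python
import numpy as np

rng = np.random.default_rng(1)
d_model, d_ff, keep = 8, 32, 24
gate_proj = rng.normal(size=(d_ff, d_model))  # GLU gate branch
up_proj = rng.normal(size=(d_ff, d_model))    # GLU up branch

# One score per neuron, taken jointly over BOTH projections of the pair,
# using the maximum absolute weight rather than an L1 norm.
importance = np.maximum(np.abs(gate_proj).max(axis=1),
                        np.abs(up_proj).max(axis=1))

kept_idx = np.sort(np.argsort(importance)[-keep:])
gate_pruned, up_pruned = gate_proj[kept_idx], up_proj[kept_idx]
# down_proj's columns would be pruned with the same kept_idx.
print(gate_pruned.shape, up_pruned.shape)  # (24, 8) (24, 8)
```

Scoring the branches independently — the mistake the "KO" notebook demonstrates — would keep mismatched neuron sets in the two halves of the GLU and break the model.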
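The iterative layer-selection loop described in the attention-bypass section — measure cosine similarity between each layer's input and output, skip the most redundant layer, then recompute — can be sketched with toy stand-in layers. This is my own simplified rendering of the procedure, not the notebook's code:

```python
import numpy as np

def cosine_sim(a, b):
    a, b = a.ravel(), b.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def select_layers_to_skip(layer_fns, x, n_skip):
    """Iteratively mark the layers whose output is most similar to their
    input, i.e. the layers that modify the hidden states the least."""
    skipped = set()
    for _ in range(n_skip):
        best_layer, best_sim = None, -2.0
        h = x
        for i, fn in enumerate(layer_fns):
            if i in skipped:
                continue          # bypassed layer: hidden states pass through
            out = fn(h)
            sim = cosine_sim(h, out)
            if sim > best_sim:
                best_layer, best_sim = i, sim
            h = out
        skipped.add(best_layer)
    return skipped

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
# Toy stand-in layers: layer 1 is nearly the identity, so it contributes
# least and should be the first one selected for skipping.
layers = [lambda h: h + rng.normal(size=h.shape),
          lambda h: h + 1e-6 * rng.normal(size=h.shape),
          lambda h: h + rng.normal(size=h.shape)]

skip = select_layers_to_skip(layers, x, n_skip=1)
print(skip)  # {1}
```

Note that skipping layers changes inference cost but not the stored weights, which matches the text's point that this pruning does not shrink the model on disk.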
Distillation)。\n知识蒸馏涉及训练一个较小的“学生”模型来模仿一个较大的、训练良好的“教师”模型。学生不仅从正确的标签中学习，还从教师模型产生的概率分布（软目标）中学习，有效地将教师学到的知识转移到更紧凑的形式中。\n\n当与剪枝结合使用时，你首先通过移除不太重要的连接来创建基础模型的剪枝版本。在此过程中，一些知识不可避免地会丢失。为了恢复这些丢失的知识，你可以应用知识蒸馏，使用原始基础模型作为教师，剪枝后的模型作为学生，帮助恢复部分丢失的性能。\n\n这两种技术都解决了同一个挑战：在保持性能的同时减少模型大小和计算需求，使它们对于在移动设备等资源受限环境中部署 AI 至关重要。\n\n### 使用知识蒸馏 (KD) 从基础模型恢复知识。\n在本笔记本中，我们将使用知识蒸馏来恢复模型剪枝过程中丢失的部分知识。Llama-3.2-1B 将用作教师模型，40% 剪枝版本将用作学生模型。我们将专门提高在 Lambada 基准上的性能。\n| [笔记本：知识蒸馏 Llama 3.2。](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fblob\u002Fmain\u002F6-PRUNING\u002F7_1_knowledge_distillation_Llama.ipynb) |\n| --- |\n\n## 大型语言模型 (LLM) 中的偏见与公平性。\n本节介绍了一项初步工作，侧重于通过可视化神经激活来检测大型语言模型 (LLM) 中的偏见。虽然仍处于早期阶段，但这些分析为未来的公平感知剪枝策略铺平了道路，其中结构性剪枝决策也会考虑到对不同人口统计或语义群体的影响。\n\n### 在最先进的 Transformer 模型中可视化偏见。\n本笔记本介绍了在 Transformer 模型中可视化神经激活的技术，作为检测和缓解语言模型偏见的初步步骤。\n采用的技术：\n  * 使用主成分分析 (PCA) 进行降维\n  * 使用热力图进行可视化\n  * 对比组之间的差异激活分析\n\n| 文章 | 笔记本 |\n|---|---|\n| [从偏见到平衡：可视化并修复 Transformer 模型中的偏见](https:\u002F\u002Fmedium.com\u002Fdata-science-collective\u002Ffrom-biased-to-balanced-visualizing-and-fixing-bias-in-transformer-models-d1a82f35393c?sk=abd12073ee311c3752da3219a5baf20f) | [8_1_transformer_activations_visualization.ipynb](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fblob\u002Fmain\u002F6-PRUNING\u002F8_1_transformer_activations_visualization.ipynb) |\n\n### 针对 Transformer 模型偏见缓解的目标剪枝\n本笔记本介绍了一种新颖的剪枝方法，旨在缓解大型语言模型 (LLM) 中的偏见。通过使用成对的对比提示（例如，“黑人男性”与“白人男性”），该方法识别出根据人口统计线索以不同方式响应的神经元。然后根据结合偏见贡献和结构重要性的混合评分系统选择性删除这些神经元。\n\n该技术是使用 [optipfair 库](https:\u002F\u002Fgithub.com\u002Fperemartra\u002Foptipfair) 实现的，该库提供了逐层激活和内部偏见指标的详细可视化。您可以通过配套的 Hugging Face 空间交互式探索模型的内部行为：[🌐 Hugging Face 上的 Optipfair 偏见分析器](https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Foopere\u002Foptipfair-bias-analyzer)。 \n\n结果不言自明：仅剪枝 0.13% 的参数，模型的内部偏见指标就降低了 22%，且性能损失极小。这一概念验证表明，偏见感知剪枝既可以精确又高效——为构建更公平的 AI 系统提供实用工具。\n| 
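The soft-target loss at the heart of the knowledge-distillation section can be written down directly: the KL divergence between temperature-softened teacher and student distributions, scaled by T². A self-contained sketch; in practice it is usually combined with a hard-label cross-entropy term:

```python
import numpy as np

def softmax(z, T=1.0):
    z = np.asarray(z, dtype=float) / T
    e = np.exp(z - z.max())
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """KL(teacher || student) on temperature-softened distributions,
    scaled by T^2 as in Hinton et al.'s formulation."""
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    return float(np.sum(p_t * (np.log(p_t) - np.log(p_s))) * T * T)

teacher = [4.0, 1.0, 0.5]
print(distillation_loss(teacher, teacher))              # 0.0: distributions match
print(distillation_loss([0.1, 3.0, 0.2], teacher) > 0)  # True
```

In the Llama notebook the teacher logits come from the unpruned Llama-3.2-1B and the student logits from its 40%-pruned version, so minimizing this loss pulls the pruned model's distribution back toward the original.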
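The PCA-plus-heatmap workflow from the bias-visualization notebook reduces to a simple core: project activations onto the top principal components and look for separation between contrastive groups. A toy sketch with simulated activations; the "bias" shift here is fabricated purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
# Simulated activations: 20 prompts per group, hidden size 64; group B is
# shifted along one fabricated direction to play the role of a bias signal.
acts = rng.normal(size=(40, 64))
acts[20:] += 3.0 * rng.normal(size=64)

def pca_2d(x):
    centered = x - x.mean(axis=0)
    # SVD-based PCA: rows of Vt are the principal directions.
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ Vt[:2].T   # coordinates on the top-2 components

coords = pca_2d(acts)
print(coords.shape)  # (40, 2)
# The groups separate along the first principal component:
print(abs(coords[:20, 0].mean() - coords[20:, 0].mean()) > 1.0)  # True
```

When the two groups of prompts differ only in a demographic cue, this kind of separation in activation space is the signal the fairness-aware pruning then targets.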
[笔记本：8_2_Targeted_Pruning_for_Bias_Mitigation.ipynb](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fblob\u002Fmain\u002F6-PRUNING\u002F8_2_Targeted_Pruning_for_Bias_Mitigation.ipynb) |\n| ---------------------------------------------------------|\n\n_____________\n\u003Ch1>🚀2- 项目。\u003C\u002Fh1>\n\n## [自然语言转 SQL。](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Ftree\u002Fmain\u002FP1-NL2SQL)。\n在这个简单的基础项目中，我们将开发一个从自然语言生成 SQL 的工具。我们将从创建提示词（Prompt）开始，以实现两种解决方案：一种使用运行在 Azure 上的 OpenAI 模型，另一种使用来自 Hugging Face 的开源模型。\n| 文章 | 笔记 |\n| --- | --- |\n| [为 OpenAI 创建 NL2SQL 提示词](https:\u002F\u002Fmedium.com\u002Ftowards-artificial-intelligence\u002Fcreate-a-superprompt-for-natural-language-to-sql-conversion-for-openai-9d19f0efe8f4) | [OpenAI 提示词创建](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fblob\u002Fmain\u002FP1-NL2SQL\u002Fnl2sql_prompt_OpenAI.ipynb) |\n| 进行中 | [defog\u002FSQLCoder 提示词创建](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fblob\u002Fmain\u002FP1-NL2SQL\u002Fnl2sql_prompt_SQLCoder.ipynb) |\n| [推理 Azure 配置。](https:\u002F\u002Fpub.towardsai.net\u002Fhow-to-set-up-an-nl2sql-system-with-azure-openai-studio-2fcfc7b57301) | [使用 Azure 推理端点](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fblob\u002Fmain\u002FP1-NL2SQL\u002FNL2SQL_OpenAI_Azure.ipynb)\n\n## [创建并发布大语言模型。](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fblob\u002Fmain\u002FP2-MHF\u002Freadme.md) \n在这个小项目中，我们将创建一个新模型，使用 DPO（直接偏好优化）对齐 microsoft-phi-3 模型，然后将其发布到 Hugging Face。 \n| 文章 | 笔记 |\n| --- | --- |\n| 进行中 | [使用 DPO 对齐 phi3-3 模型。](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fblob\u002Fmain\u002FP2-MHF\u002FAligning_DPO_phi3.ipynb)\n\n_____________\n\u003Ch1>🚀3- 
构建企业级解决方案。\u003C\u002Fh1>\n\n## [为大型数据库构建 NL2SQL 企业级解决方案](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Ftree\u002Fmain\u002FE1-NL2SQL%20for%20big%20Databases)。\n在这个初始方案中，我们设计了一个能够在大型数据库上运行的 NL2SQL（自然语言转 SQL）系统架构。该系统旨在配合两到三个不同的模型使用。事实上，我们在示例中使用了三个模型。 \n\n这是一种能够快速启动项目的架构，仅对数据库中的少量表提供服务，允许我们按自己的节奏添加更多表。\n\n## [解码风险：利用客户嵌入（Embeddings）技术变革银行业。](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Ftree\u002Fmain\u002FE2-Transforming%20Banks%20With%20Embeddings)\n在这个方案中，我们探索了嵌入技术和大语言模型（LLM）在金融行业客户风险评估和产品推荐中的变革力量。我们将改变存储客户信息的格式，因此，我们也将改变这些信息在系统内的传输方式，从而实现重要的优势。 \n\n_____________\n### 课程贡献： \n如果您发现任何问题，请打开一个 [问题](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fissues) 。我会尽最大努力尽快修复它，并署名感谢。  \n\n如果您想做出贡献或建议主题，请随时发起一个 [讨论](https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fdiscussions) 。我很乐意接收任何意见或建议。\n\n不要害羞，在您的社交网络上与朋友分享本课程。在 [LinkedIn](https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Fpere-martra\u002F) 或 [Twitter](https:\u002F\u002Ftwitter.com\u002FPereMartra) 上联系我，随时分享您想分享的内容或提出您的问题。\n\n给仓库点个 Star ⭐️。这对我帮助很大，也鼓励我继续添加课程。这是支持像这样免费开源课程的好方法。 \n\n_____________\n# 课程中使用的参考文献与论文： \n* Pere Martra [Fragile Knowledge, Robust Instruction-Following: The Width Pruning Dichotomy in Llama-3.2](https:\u002F\u002Fwww.techrxiv.org\u002Fusers\u002F1001026\u002Farticles\u002F1361522-fragile-knowledge-robust-instruction-following-the-width-pruning-dichotomy-in-llama-3-2)。宽度剪枝。 \n\n* Tom Kocmi, Christian Federmann, [Large Language Models Are State-of-the-Art Evaluators of Translation Quality](https:\u002F\u002Farxiv.org\u002Fabs\u002F2302.14520)。使用 LLM 评估 LLM。 \n\n* Pere Martra, [Introduction to Large Language Models with OpenAI](https:\u002F\u002Fdoi.org\u002F10.1007\u002F979-8-8688-0515-8_1)\n\n* [ReAct: Synergizing Reasoning and Acting in Language Models](https:\u002F\u002Farxiv.org\u002Fabs\u002F2210.03629)。LangChain 
与智能体部分。医疗助手示例。   \n\n* [The Power of Scale for Parameter-Efficient Prompt Tuning](https:\u002F\u002Fdoi.org\u002F10.48550\u002FarXiv.2104.08691)。微调与优化部分。提示词调优示例。 \n\n* [LoRA: Low-Rank Adaptation of Large Language Models](https:\u002F\u002Farxiv.org\u002Fabs\u002F2106.09685)。微调与优化部分。LoRA 微调示例。 \n\n* [QLoRA: Efficient Finetuning of Quantized LLMs](https:\u002F\u002Farxiv.org\u002Fabs\u002F2305.14314)。微调与优化部分。QLoRA 微调示例。\n\n* [How to Prompt LLMs for Text-to-SQL: A Study in Zero-shot, Single-domain, and Cross-domain Settings](https:\u002F\u002Farxiv.org\u002Fabs\u002F2305.11853)。项目。自然语言转 SQL。 \n\n* Saurav Muralidharan, Sharath Turuvekere Sreenivas, Raviraj Joshi, Marcin Chochowski, Mostofa Patwary, Mohammad Shoeybi, Bryan Catanzaro, Jan Kautz, Pavlo Molchanov, \"Compact Language Models via Pruning and Knowledge Distillation,\" arXiv preprint arXiv:2407.14679, 2024。可用地址：[https:\u002F\u002Fdoi.org\u002F10.48550\u002FarXiv.2407.14679](https:\u002F\u002Fdoi.org\u002F10.48550\u002FarXiv.2407.14679)。\n\n* He, S., Sun, G., Shen, Z., & Li, A. (2024)。What matters in transformers? not all attention is needed。arXiv preprint arXiv:2406.15786。https:\u002F\u002Fdoi.org\u002F10.48550\u002FarXiv.2406.15786\n\n* Kim, B. K., Kim, G., Kim, T. H., Castells, T., Choi, S., Shin, J., & Song, H. K. (2024)。Shortened llama: A simple depth pruning for large language models。arXiv preprint arXiv:2402.02834, 11。https:\u002F\u002Fdoi.org\u002F10.48550\u002FarXiv.2402.02834\n\n* Martra, P. 
(2024, December 26)。Exploring GLU Expansion Ratios: Structured Pruning in Llama-3.2 Models。https:\u002F\u002Fdoi.org\u002F10.31219\u002Fosf.io\u002Fqgxea\n\n```\n@software{optipfair2025,\n  author = {Pere Martra},\n  title = {OptiPFair: A Library for Structured Pruning of Large Language Models},\n  year = {2025},\n  url = {https:\u002F\u002Fgithub.com\u002Fperemartra\u002Foptipfair}\n}\n```","# Large-Language-Model-Notebooks-Course 快速上手指南\n\n本仓库是一个关于大语言模型（LLM）及其应用的实践课程，包含大量基于 OpenAI、Hugging Face、LangChain 等技术的 Jupyter Notebook 示例。适合工程师、研究人员和开发者学习 LLM 构建与微调。\n\n## 环境准备\n\n由于大部分 Notebook 托管在 Google Colab 上，推荐优先使用云端环境。若需本地运行，请确保满足以下要求：\n\n*   **操作系统**：Linux \u002F macOS \u002F Windows\n*   **Python 版本**：>= 3.8\n*   **硬件建议**：\n    *   基础练习：CPU 即可\n    *   模型微调\u002F推理：建议配备 NVIDIA GPU (显存 >= 8GB)\n*   **账号准备**：\n    *   [Google Colab](https:\u002F\u002Fcolab.research.google.com\u002F) 账号（免费额度足够入门）\n    *   [OpenAI API](https:\u002F\u002Fplatform.openai.com\u002F) 密钥（部分示例需要）\n\n## 安装步骤\n\n### 方案 A：使用 Google Colab（推荐）\n无需本地安装，直接点击仓库中的 Notebook 链接即可在浏览器中运行。\n\n### 方案 B：本地环境搭建\n1.  **克隆仓库**\n    建议使用国内镜像加速下载：\n    ```bash\n    git clone https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course.git\n    cd Large-Language-Model-Notebooks-Course\n    ```\n\n2.  **创建虚拟环境**\n    ```bash\n    python -m venv venv\n    source venv\u002Fbin\u002Factivate  # Linux\u002FMac\n    # 或\n    venv\\Scripts\\activate     # Windows\n    ```\n\n3.  **安装核心依赖**\n    根据具体章节需求安装，以下为通用核心库（推荐使用清华源加速）：\n    ```bash\n    pip install -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple \\\n    torch transformers langchain openai pandas jupyter panel gradio chromadb faiss-cpu\n    ```\n\n4.  **配置环境变量**\n    将你的 OpenAI API Key 添加到环境变量中：\n    ```bash\n    export OPENAI_API_KEY=\"your-api-key-here\"\n    ```\n\n## 基本使用\n\n### 1. 
启动第一个项目\n进入 `1-Introduction to LLMs with OpenAI` 目录，打开第一个聊天机器人示例：\n```bash\njupyter notebook 1-Introduction%20to%20LLMs%20with%20OpenAI\u002F1_1-First_Chatbot_OpenAI.ipynb\n```\n或者直接在浏览器中访问对应的 Colab 链接运行。\n\n### 2. 运行代码\n*   按顺序点击单元格执行（Cell -> Run All）。\n*   注意查看 Notebook 中的注释，部分代码需要替换为你自己的 API Key。\n*   阅读配套的文章（Article）以理解代码逻辑和提示词工程细节。\n\n### 3. 进阶探索\n完成基础章节后，可依次尝试以下模块：\n*   **向量数据库**：探索 RAG（检索增强生成）系统。\n*   **LangChain**：构建复杂的应用管道。\n*   **微调技术**：学习 LoRA、QLoRA 等高效微调方法。\n\n> **提示**：部分 Notebook 对内存要求较高，如遇 OOM 错误，建议切换至 Colab Pro 或本地 GPU 环境运行。","某科技公司的资深后端工程师小王，负责搭建企业内部的知识库智能助手，希望利用大模型技术提升员工查询效率，但缺乏从零构建 LLM 应用的全栈经验。\n\n### 没有 Large-Language-Model-Notebooks-Course 时\n- 学习路径混乱，需要在 GitHub 和博客间反复切换才能拼凑出 LangChain 与向量数据库的集成方案\n- 面对私有数据微调时，对 LoRA、QLoRA 等参数高效微调技术理解不深，导致显存占用过高且效果不佳\n- 缺乏生产环境部署经验，不清楚如何设计包含评估指标和监控的企业级架构\n- 遇到 Prompt Engineering 或 RAG 检索增强生成问题时，只能靠试错，调试周期漫长\n\n### 使用 Large-Language-Model-Notebooks-Course 后\n- 跟随实战 Notebook 逐步完成从 API 调用到自定义项目构建，直接复用经过验证的代码模块\n- 深入掌握 PEFT 微调与知识蒸馏技巧，能够根据业务需求低成本地定制和优化开源模型\n- 参考企业级解决方案章节，学会如何将 LLM 融入现有 IT 基础设施并处理大规模并发请求\n- 通过结构化的课程路径图，快速定位所需知识点，显著降低了从原型到上线的开发门槛\n\nLarge-Language-Model-Notebooks-Course 帮助工程师跨越了从理论认知到工程落地的鸿沟，实现了高效的技术转型。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fperemartra_Large-Language-Model-Notebooks-Course_51603949.png","peremartra","Pere Martra","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Fperemartra_a94dc0ca.jpg","ML Research Engineer | LLM Optimization & Efficiency | Authoring “Rearchitecting LLMs” @ManningPublications | 1.9K GitHub ⭐",null,"Barcelona","oopere@gmail.com","PereMartra","https:\u002F\u002Fmartra.uadla.com\u002F","https:\u002F\u002Fgithub.com\u002Fperemartra",[86,90],{"name":87,"color":88,"percentage":89},"Jupyter Notebook","#DA5B0B",100,{"name":91,"color":92,"percentage":93},"Python","#3572A5",0,1781,445,"2026-03-30T07:04:54","MIT","未说明",{"notes":100,"python":98,"dependencies":101},"课程主要基于 Jupyter Notebook 形式提供。大部分笔记本托管在 Google Colab，少数在 Kaggle。部分笔记本因内存需求较大，建议运行在本地环境或购买 Colab 
Pro 版本。学习时建议配合配套文章阅读以理解代码细节。",[102,103,104,105,106,107,108],"openai","transformers","langchain","chromadb","faiss","panel","gradio",[14,13,51,26],[111,112,113,104,114,103,115,116,117,118],"chatbots","hf","huggingface","large-language-models","vector-database","fine-tuning-llm","peft-fine-tuning-llm","pruning",4,"2026-03-27T02:49:30.150509","2026-04-06T05:16:40.519592",[123,128,133,138,143,148],{"id":124,"question_zh":125,"answer_zh":126,"source_url":127},1474,"运行 Notebook 时遇到 OpenAI RateLimitError (429) 错误如何解决？","这是 OpenAI 账户配额耗尽导致的。解决方法是使用预付费系统（prepaid system），充值少量金额即可避免信用额度限制，从而绕过此错误。","https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fissues\u002F24",{"id":129,"question_zh":130,"answer_zh":131,"source_url":132},1473,"所有使用 OpenAI API 的 Notebook 报错 `AttributeError: module 'openai' has no attribute 'ChatCompletion'` 怎么办？","这是 OpenAI API 更新导致的不兼容问题。维护者已修复 Notebook。若需临时解决，可降级安装：`pip install \"openai\u003C1.0.0\"`。新代码应使用 `openai.chat.completions.create` 替代旧的 `openai.ChatCompletion.create`。","https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fissues\u002F4",{"id":134,"question_zh":135,"answer_zh":136,"source_url":137},1475,"在免费版 Colab 上使用 QLoRA Notebook 推理时出现 OOM（内存溢出）错误怎么办？","加载模型需要较多内存。维护者已在 Notebook 中添加说明。有用户反馈在 Kaggle（提供 30G RAM）上运行，需要至少 17G+ RAM。建议检查硬件资源或尝试其他平台如 Kaggle。","https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fissues\u002F6",{"id":139,"question_zh":140,"answer_zh":141,"source_url":142},1476,"当菜单项过多时，提示词可能超过 OpenAI 的最大 Token 限制，该如何处理？","维护者已在 Notebook 中添加说明指出此风险。建议后续引入向量数据库（VectorDB）作为上下文搜索，以优化输入长度，避免触发最大 Token 限制。","https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fissues\u002F1",{"id":144,"question_zh":145,"answer_zh":146,"source_url":147},1477,"为什么在 Prompt Tuning 中要将 'act' 和 'prompt' 列合并为一列，而不是分开作为 inputs 和 labels？","因为使用了 PEFT 
库，内部的数据整理函数（DataCollatorForLanguageModeling）会处理数据准备。Transformer 架构在文本生成中，输出是下一个生成的词，每个词的输入是它前面的词序列（tokens）。因此合并后作为单一输入进行训练符合其工作原理。","https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fissues\u002F21",{"id":149,"question_zh":150,"answer_zh":151,"source_url":152},1478,"无法找到向量数据库与 LLM 相关 Notebook 中的数据集文件怎么办？","请确保在 Kaggle 上使用第 13 或 14 版本的 Notebook。如果仍有问题，可以尝试 Google Colab 版本的文件 `2_1_Vector_Databases_LLMs.ipynb`。","https:\u002F\u002Fgithub.com\u002Fperemartra\u002FLarge-Language-Model-Notebooks-Course\u002Fissues\u002F19",[]]
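To make the FAQ answer about merging the 'act' and 'prompt' columns concrete: in causal-LM training the labels are the input ids themselves, shifted one position, so a single merged text column is all the collator needs. A minimal sketch of what `DataCollatorForLanguageModeling` effectively sets up (the token ids below are pretend values, not a real tokenization):

```python
# One merged text column instead of separate inputs/labels columns.
merged = {"text": "act: translator. prompt: Translate to French:"}
token_ids = [101, 7, 42, 13, 99, 5]   # pretend tokenization of the merged text

# For causal LM the collator copies inputs to labels ...
inputs = token_ids
labels = token_ids[:]
# ... and inside the model logits/labels are shifted by one position,
# so position i is trained to predict token i+1:
pairs = list(zip(inputs[:-1], labels[1:]))
print(pairs)  # [(101, 7), (7, 42), (42, 13), (13, 99), (99, 5)]
```

This is why no separate label column is needed: every token already serves as the supervision target for the tokens preceding it.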