[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-ludwig-ai--ludwig":3,"tool-ludwig-ai--ludwig":64},[4,17,27,35,43,56],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":16},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,3,"2026-04-05T11:01:52",[13,14,15],"开发框架","图像","Agent","ready",{"id":18,"name":19,"github_repo":20,"description_zh":21,"stars":22,"difficulty_score":23,"last_commit_at":24,"category_tags":25,"status":16},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",140436,2,"2026-04-05T23:32:43",[13,15,26],"语言模型",{"id":28,"name":29,"github_repo":30,"description_zh":31,"stars":32,"difficulty_score":23,"last_commit_at":33,"category_tags":34,"status":16},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 
绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",107662,"2026-04-03T11:11:01",[13,14,15],{"id":36,"name":37,"github_repo":38,"description_zh":39,"stars":40,"difficulty_score":23,"last_commit_at":41,"category_tags":42,"status":16},3704,"NextChat","ChatGPTNextWeb\u002FNextChat","NextChat 是一款轻量且极速的 AI 助手，旨在为用户提供流畅、跨平台的大模型交互体验。它完美解决了用户在多设备间切换时难以保持对话连续性，以及面对众多 AI 模型不知如何统一管理的痛点。无论是日常办公、学习辅助还是创意激发，NextChat 都能让用户随时随地通过网页、iOS、Android、Windows、MacOS 或 Linux 端无缝接入智能服务。\n\n这款工具非常适合普通用户、学生、职场人士以及需要私有化部署的企业团队使用。对于开发者而言，它也提供了便捷的自托管方案，支持一键部署到 Vercel 或 Zeabur 等平台。\n\nNextChat 的核心亮点在于其广泛的模型兼容性，原生支持 Claude、DeepSeek、GPT-4 及 Gemini Pro 等主流大模型，让用户在一个界面即可自由切换不同 AI 能力。此外，它还率先支持 MCP（Model Context Protocol）协议，增强了上下文处理能力。针对企业用户，NextChat 提供专业版解决方案，具备品牌定制、细粒度权限控制、内部知识库整合及安全审计等功能，满足公司对数据隐私和个性化管理的高标准要求。",87618,"2026-04-05T07:20:52",[13,26],{"id":44,"name":45,"github_repo":46,"description_zh":47,"stars":48,"difficulty_score":23,"last_commit_at":49,"category_tags":50,"status":16},2268,"ML-For-Beginners","microsoft\u002FML-For-Beginners","ML-For-Beginners 是由微软推出的一套系统化机器学习入门课程，旨在帮助零基础用户轻松掌握经典机器学习知识。这套课程将学习路径规划为 12 周，包含 26 节精炼课程和 52 道配套测验，内容涵盖从基础概念到实际应用的完整流程，有效解决了初学者面对庞大知识体系时无从下手、缺乏结构化指导的痛点。\n\n无论是希望转型的开发者、需要补充算法背景的研究人员，还是对人工智能充满好奇的普通爱好者，都能从中受益。课程不仅提供了清晰的理论讲解，还强调动手实践，让用户在循序渐进中建立扎实的技能基础。其独特的亮点在于强大的多语言支持，通过自动化机制提供了包括简体中文在内的 50 多种语言版本，极大地降低了全球不同背景用户的学习门槛。此外，项目采用开源协作模式，社区活跃且内容持续更新，确保学习者能获取前沿且准确的技术资讯。如果你正寻找一条清晰、友好且专业的机器学习入门之路，ML-For-Beginners 将是理想的起点。",84991,"2026-04-05T10:45:23",[14,51,52,53,15,54,26,13,55],"数据工具","视频","插件","其他","音频",{"id":57,"name":58,"github_repo":59,"description_zh":60,"stars":61,"difficulty_score":10,"last_commit_at":62,"category_tags":63,"status":16},3128,"ragflow","infiniflow\u002Fragflow","RAGFlow 
是一款领先的开源检索增强生成（RAG）引擎，旨在为大语言模型构建更精准、可靠的上下文层。它巧妙地将前沿的 RAG 技术与智能体（Agent）能力相结合，不仅支持从各类文档中高效提取知识，还能让模型基于这些知识进行逻辑推理和任务执行。\n\n在大模型应用中，幻觉问题和知识滞后是常见痛点。RAGFlow 通过深度解析复杂文档结构（如表格、图表及混合排版），显著提升了信息检索的准确度，从而有效减少模型“胡编乱造”的现象，确保回答既有据可依又具备时效性。其内置的智能体机制更进一步，使系统不仅能回答问题，还能自主规划步骤解决复杂问题。\n\n这款工具特别适合开发者、企业技术团队以及 AI 研究人员使用。无论是希望快速搭建私有知识库问答系统，还是致力于探索大模型在垂直领域落地的创新者，都能从中受益。RAGFlow 提供了可视化的工作流编排界面和灵活的 API 接口，既降低了非算法背景用户的上手门槛，也满足了专业开发者对系统深度定制的需求。作为基于 Apache 2.0 协议开源的项目，它正成为连接通用大模型与行业专有知识之间的重要桥梁。",77062,"2026-04-04T04:44:48",[15,14,13,26,54],{"id":65,"github_repo":66,"name":67,"description_en":68,"description_zh":69,"ai_summary_zh":69,"readme_en":70,"readme_zh":71,"quickstart_zh":72,"use_case_zh":73,"hero_image_url":74,"owner_login":75,"owner_name":76,"owner_avatar_url":77,"owner_bio":78,"owner_company":79,"owner_location":79,"owner_email":79,"owner_twitter":79,"owner_website":80,"owner_url":81,"languages":82,"stars":91,"forks":92,"last_commit_at":93,"license":94,"difficulty_score":23,"env_os":95,"env_gpu":96,"env_ram":97,"env_deps":98,"category_tags":109,"github_topics":110,"view_count":23,"oss_zip_url":79,"oss_zip_packed_at":79,"status":16,"created_at":131,"updated_at":132,"faqs":133,"releases":149},4060,"ludwig-ai\u002Fludwig","ludwig","Low-code framework for building custom LLMs, neural networks, and other AI models","Ludwig 是一个专为构建自定义大语言模型（LLM）及各类深度神经网络设计的低代码框架。它旨在解决传统深度学习开发中编码复杂、配置繁琐以及大规模训练难以优化的痛点，让用户无需编写大量底层代码，仅通过声明式的 YAML 配置文件即可轻松完成从数据输入到模型训练的全过程。\n\n这款工具非常适合希望快速验证想法的 AI 研究人员、需要高效交付模型的开发者，以及想要尝试定制大模型但受限于工程能力的技术团队。Ludwig 的核心亮点在于其“乐高积木”般的模块化设计，支持多任务与多模态学习，并能自动处理批量大小选择、分布式训练（如 DeepSpeed）、参数高效微调（PEFT）及 4-bit 量化等高级技术。此外，它还提供了从超参数优化到模型可解释性的专家级控制能力，并原生支持 Ray 和 Kubernetes，便于直接部署到生产环境。无论是微调 Llama、Mistral 等主流大模型，还是构建独特的神经网络的架构，Ludwig 都能以极高的效率和扩展性助你一臂之力。","\u003Cp align=\"center\">\n  \u003Ca href=\"https:\u002F\u002Fludwig.ai\">\n    \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fludwig-ai_ludwig_readme_b2b6f1e1f50d.jpg\" height=\"150\">\n  
\u003C\u002Fa>\n\u003C\u002Fp>\n\n\u003Cdiv align=\"center\">\n\n_Declarative deep learning framework built for scale and efficiency._\n\n[![PyPI version](https:\u002F\u002Fbadge.fury.io\u002Fpy\u002Fludwig.svg)](https:\u002F\u002Fbadge.fury.io\u002Fpy\u002Fludwig)\n[![Discord](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FDiscord-Join%20Chat-5865F2?logo=discord&logoColor=white)](https:\u002F\u002Fdiscord.gg\u002FCBgdrGnZjy)\n[![DockerHub](https:\u002F\u002Fimg.shields.io\u002Fdocker\u002Fpulls\u002Fludwigai\u002Fludwig.svg)](https:\u002F\u002Fhub.docker.com\u002Fr\u002Fludwigai)\n[![Downloads](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fludwig-ai_ludwig_readme_e8db7d8a5430.png)](https:\u002F\u002Fpepy.tech\u002Fproject\u002Fludwig)\n[![License](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLicense-Apache%202.0-blue.svg)](https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fblob\u002Fmain\u002FLICENSE)\n[![X](https:\u002F\u002Fimg.shields.io\u002Ftwitter\u002Ffollow\u002Fludwig_ai.svg?style=social&logo=twitter)](https:\u002F\u002Ftwitter.com\u002Fludwig_ai)\n\n\u003C\u002Fdiv>\n\n# 📖 What is Ludwig?\n\nLudwig is a **low-code** framework for building **custom** AI models like **LLMs** and other deep neural networks.\n\nKey features:\n\n- 🛠 **Build custom models with ease:** a declarative YAML configuration file is all you need to train a state-of-the-art LLM on your data. Support for multi-task and multi-modality learning. 
Comprehensive config validation detects invalid parameter combinations and prevents runtime failures.\n- ⚡ **Optimized for scale and efficiency:** automatic batch size selection, distributed training ([DDP](https:\u002F\u002Fpytorch.org\u002Ftutorials\u002Fbeginner\u002Fddp_series_theory.html), [DeepSpeed](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FDeepSpeed)), parameter efficient fine-tuning ([PEFT](https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Fpeft)), 4-bit quantization (QLoRA), paged and 8-bit optimizers, and larger-than-memory datasets.\n- 📐 **Expert level control:** retain full control of your models down to the activation functions. Support for hyperparameter optimization, explainability, and rich metric visualizations.\n- 🧱 **Modular and extensible:** experiment with different model architectures, tasks, features, and modalities with just a few parameter changes in the config. Think building blocks for deep learning.\n- 🚢 **Engineered for production:** prebuilt [Docker](https:\u002F\u002Fhub.docker.com\u002Fu\u002Fludwigai) containers, native support for running with [Ray](https:\u002F\u002Fwww.ray.io\u002F) on [Kubernetes](https:\u002F\u002Fgithub.com\u002Fray-project\u002Fkuberay), export models to [Torchscript](https:\u002F\u002Fpytorch.org\u002Fdocs\u002Fstable\u002Fjit.html) and [Triton](https:\u002F\u002Fdeveloper.nvidia.com\u002Ftriton-inference-server), upload to [HuggingFace](https:\u002F\u002Fhuggingface.co\u002Fmodels) with one command.\n\nLudwig is hosted by the\n[Linux Foundation AI & Data](https:\u002F\u002Flfaidata.foundation\u002F).\n\n**Tech stack:** Python 3.12 | PyTorch 2.7+ | Pydantic 2 | Transformers 5 | Ray 2.54\n\n![img](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fludwig-ai_ludwig_readme_89601297f1a8.gif)\n\n# 💾 Installation\n\nInstall from PyPI. 
Be aware that Ludwig requires Python 3.12+.\n\n```shell\npip install ludwig\n```\n\nOr install with all optional dependencies:\n\n```shell\npip install ludwig[full]\n```\n\nPlease see [contributing](https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fblob\u002Fmain\u002FCONTRIBUTING.md) for more detailed installation instructions.\n\n# 🚂 Getting Started\n\nWant to take a quick peek at some of Ludwig's features? Check out this Colab Notebook 🚀 [![Open In Colab](https:\u002F\u002Fcolab.research.google.com\u002Fassets\u002Fcolab-badge.svg)](https:\u002F\u002Fcolab.research.google.com\u002Fdrive\u002F1lB4ALmEyvcMycE3Mlnsd7I3bc0zxvk39)\n\nLooking to fine-tune LLMs? Check out these notebooks:\n\n1. Fine-Tune Llama-2-7b: [![Open In Colab](https:\u002F\u002Fcolab.research.google.com\u002Fassets\u002Fcolab-badge.svg)](https:\u002F\u002Fcolab.research.google.com\u002Fdrive\u002F1r4oSEwRJpYKBPM0M0RSh0pBEYK_gBKbe)\n1. Fine-Tune Llama-2-13b: [![Open In Colab](https:\u002F\u002Fcolab.research.google.com\u002Fassets\u002Fcolab-badge.svg)](https:\u002F\u002Fcolab.research.google.com\u002Fdrive\u002F1zmSEzqZ7v4twBrXagj1TE_C--RNyVAyu)\n1. 
Fine-Tune Mistral-7b: [![Open In Colab](https:\u002F\u002Fcolab.research.google.com\u002Fassets\u002Fcolab-badge.svg)](https:\u002F\u002Fcolab.research.google.com\u002Fdrive\u002F1i_8A1n__b7ljRWHzIsAdhO7u7r49vUm4)\n\nFor a full tutorial, check out the official [getting started guide](https:\u002F\u002Fludwig.ai\u002Flatest\u002Fgetting_started\u002F), or take a look at end-to-end [Examples](https:\u002F\u002Fludwig.ai\u002Flatest\u002Fexamples).\n\n## Large Language Model Fine-Tuning\n\n[![Open In Colab](https:\u002F\u002Fcolab.research.google.com\u002Fassets\u002Fcolab-badge.svg)](https:\u002F\u002Fcolab.research.google.com\u002Fdrive\u002F1c3AO8l_H6V_x37RwQ8V7M6A-RmcBf2tG?usp=sharing)\n\nLet's fine-tune a pretrained LLM to follow instructions like a chatbot (\"instruction tuning\").\n\n### Prerequisites\n\n- [HuggingFace API Token](https:\u002F\u002Fhuggingface.co\u002Fdocs\u002Fhub\u002Fsecurity-tokens)\n- Access approval to your chosen base model (e.g., [Llama-3.1-8B](https:\u002F\u002Fhuggingface.co\u002Fmeta-llama\u002FLlama-3.1-8B))\n- GPU with at least 12 GiB of VRAM (in our tests, we used an Nvidia T4)\n\n### Running\n\nWe'll use the [Stanford Alpaca](https:\u002F\u002Fcrfm.stanford.edu\u002F2023\u002F03\u002F13\u002Falpaca.html) dataset, which will be formatted as a table-like file that looks like this:\n\n|                    instruction                    |      input       |                      output                       |\n| :-----------------------------------------------: | :--------------: | :-----------------------------------------------: |\n|       Give three tips for staying healthy.        |                  | 1.Eat a balanced diet and make sure to include... |\n| Arrange the items given below in the order to ... | cake, me, eating |                  I eating cake.                   |\n| Write an introductory paragraph about a famous... |  Michelle Obama  | Michelle Obama is an inspirational woman who r... |\n|                        ...    
                     |       ...        |                        ...                        |\n\nCreate a YAML config file named `model.yaml` with the following:\n\n```yaml\nmodel_type: llm\nbase_model: meta-llama\u002FLlama-3.1-8B\n\nquantization:\n  bits: 4\n\nadapter:\n  type: lora\n\nprompt:\n  template: |\n    Below is an instruction that describes a task, paired with an input that may provide further context.\n    Write a response that appropriately completes the request.\n\n    ### Instruction:\n    {instruction}\n\n    ### Input:\n    {input}\n\n    ### Response:\n\ninput_features:\n  - name: prompt\n    type: text\n\noutput_features:\n  - name: output\n    type: text\n\ntrainer:\n  type: finetune\n  learning_rate: 0.0001\n  batch_size: 1\n  gradient_accumulation_steps: 16\n  epochs: 3\n  learning_rate_scheduler:\n    decay: cosine\n    warmup_fraction: 0.01\n\npreprocessing:\n  sample_ratio: 0.1\n\nbackend:\n  type: local\n```\n\nAnd now let's train the model:\n\n```bash\nexport HUGGING_FACE_HUB_TOKEN=\"\u003Capi_token>\"\n\nludwig train --config model.yaml --dataset \"ludwig:\u002F\u002Falpaca\"\n```\n\n## Supervised ML\n\nLet's build a neural network that predicts whether a given movie critic's review on [Rotten Tomatoes](https:\u002F\u002Fwww.kaggle.com\u002Fstefanoleone992\u002Frotten-tomatoes-movies-and-critic-reviews-dataset) was positive or negative.\n\nOur dataset will be a CSV file that looks like this:\n\n|     movie_title      | content_rating |              genres              | runtime | top_critic | review_content                                                                                                                                                                                                   | recommended |\n| :------------------: | :------------: | :------------------------------: | :-----: | ---------- | 
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------- |\n| Deliver Us from Evil |       R        |    Action & Adventure, Horror    |  117.0  | TRUE       | Director Scott Derrickson and his co-writer, Paul Harris Boardman, deliver a routine procedural with unremarkable frights.                                                                                       | 0           |\n|       Barbara        |     PG-13      | Art House & International, Drama |  105.0  | FALSE      | Somehow, in this stirring narrative, Barbara manages to keep hold of her principles, and her humanity and courage, and battles to save a dissident teenage girl whose life the Communists are trying to destroy. | 1           |\n|   Horrible Bosses    |       R        |              Comedy              |  98.0   | FALSE      | These bosses cannot justify either murder or lasting comic memories, fatally compromising a farce that could have been great but ends up merely mediocre.                                                        | 0           |\n|         ...          |      ...       |               ...                |   ...   | ...        | ...                                                                                                                                                                                                              | ...         
|\n\nDownload a sample of the dataset from [here](https:\u002F\u002Fludwig.ai\u002Flatest\u002Fdata\u002Frotten_tomatoes.csv).\n\n```bash\nwget https:\u002F\u002Fludwig.ai\u002Flatest\u002Fdata\u002Frotten_tomatoes.csv\n```\n\nNext create a YAML config file named `model.yaml` with the following:\n\n```yaml\ninput_features:\n  - name: genres\n    type: set\n    preprocessing:\n      tokenizer: comma\n  - name: content_rating\n    type: category\n  - name: top_critic\n    type: binary\n  - name: runtime\n    type: number\n  - name: review_content\n    type: text\n    encoder:\n      type: embed\noutput_features:\n  - name: recommended\n    type: binary\n```\n\nThat's it! Now let's train the model:\n\n```bash\nludwig train --config model.yaml --dataset rotten_tomatoes.csv\n```\n\n**Happy modeling**\n\nTry applying Ludwig to your data. [Reach out on Discord](https:\u002F\u002Fdiscord.gg\u002FCBgdrGnZjy)\nif you have any questions.\n\n# ❓ Why you should use Ludwig\n\n- **Minimal machine learning boilerplate**\n\n  Ludwig takes care of the engineering complexity of machine learning out of\n  the box, enabling research scientists to focus on building models at the\n  highest level of abstraction. Data preprocessing, hyperparameter\n  optimization, device management, and distributed training for\n  `torch.nn.Module` models come completely free.\n\n- **Easily build your benchmarks**\n\n  Creating a state-of-the-art baseline and comparing it with a new model is a\n  simple config change.\n\n- **Easily apply new architectures to multiple problems and datasets**\n\n  Apply new models across the extensive set of tasks and datasets that Ludwig\n  supports. 
Ludwig includes a\n  [full benchmarking toolkit](https:\u002F\u002Farxiv.org\u002Fabs\u002F2111.04260) accessible to\n  any user, for running experiments with multiple models across multiple\n  datasets with just a simple configuration.\n\n- **Highly configurable data preprocessing, modeling, and metrics**\n\n  Any and all aspects of the model architecture, training loop, hyperparameter\n  search, and backend infrastructure can be modified as additional fields in\n  the declarative configuration to customize the pipeline to meet your\n  requirements. For details on what can be configured, check out\n  [Ludwig Configuration](https:\u002F\u002Fludwig.ai\u002Flatest\u002Fconfiguration\u002F)\n  docs.\n\n- **Multi-modal, multi-task learning out-of-the-box**\n\n  Mix and match tabular data, text, images, and even audio into complex model\n  configurations without writing code.\n\n- **Rich model exporting and tracking**\n\n  Automatically track all trials and metrics with tools like Tensorboard,\n  Comet ML, Weights & Biases, MLFlow, and Aim Stack.\n\n- **Automatically scale training to multi-GPU, multi-node clusters**\n\n  Go from training on your local machine to the cloud without code changes.\n\n- **Low-code interface for state-of-the-art models, including pre-trained Huggingface Transformers**\n\n  Ludwig also natively integrates with pre-trained models, such as the ones\n  available in [Huggingface Transformers](https:\u002F\u002Fhuggingface.co\u002Fdocs\u002Ftransformers\u002Findex).\n  Users can choose from a vast collection of state-of-the-art pre-trained\n  PyTorch models to use without needing to write any code at all. 
For example,\n  training a BERT-based sentiment analysis model with Ludwig is as simple as:\n\n  ```shell\n  ludwig train --dataset sst5 --config_str \"{input_features: [{name: sentence, type: text, encoder: bert}], output_features: [{name: label, type: category}]}\"\n  ```\n\n- **Low-code interface for AutoML**\n\n  [Ludwig AutoML](https:\u002F\u002Fludwig.ai\u002Flatest\u002Fuser_guide\u002Fautoml\u002F)\n  allows users to obtain trained models by providing just a dataset, the\n  target column, and a time budget.\n\n  ```python\n  auto_train_results = ludwig.automl.auto_train(dataset=my_dataset_df, target=target_column_name, time_limit_s=7200)\n  ```\n\n- **Easy productionisation**\n\n  Ludwig makes it easy to serve deep learning models, including on GPUs.\n  Launch a REST API for your trained Ludwig model.\n\n  ```shell\n  ludwig serve --model_path=\u002Fpath\u002Fto\u002Fmodel\n  ```\n\n  Ludwig supports exporting models to efficient Torchscript bundles.\n\n  ```shell\n  ludwig export_torchscript --model_path=\u002Fpath\u002Fto\u002Fmodel\n  ```\n\n# 📚 Tutorials\n\n- [Text Classification](https:\u002F\u002Fludwig.ai\u002Flatest\u002Fexamples\u002Ftext_classification)\n- [Tabular Data Classification](https:\u002F\u002Fludwig.ai\u002Flatest\u002Fexamples\u002Fadult_census_income)\n- [Image Classification](https:\u002F\u002Fludwig.ai\u002Flatest\u002Fexamples\u002Fmnist)\n- [Multimodal Classification](https:\u002F\u002Fludwig.ai\u002Flatest\u002Fexamples\u002Fmultimodal_classification)\n\n# 🔬 Example Use Cases\n\n- [Named Entity Recognition Tagging](https:\u002F\u002Fludwig.ai\u002Flatest\u002Fexamples\u002Fner_tagging)\n- [Natural Language Understanding](https:\u002F\u002Fludwig.ai\u002Flatest\u002Fexamples\u002Fnlu)\n- [Machine Translation](https:\u002F\u002Fludwig.ai\u002Flatest\u002Fexamples\u002Fmachine_translation)\n- [Chit-Chat Dialogue Modeling through seq2seq](https:\u002F\u002Fludwig.ai\u002Flatest\u002Fexamples\u002Fseq2seq)\n- [Sentiment 
Analysis](https:\u002F\u002Fludwig.ai\u002Flatest\u002Fexamples\u002Fsentiment_analysis)\n- [One-shot Learning with Siamese Networks](https:\u002F\u002Fludwig.ai\u002Flatest\u002Fexamples\u002Foneshot)\n- [Visual Question Answering](https:\u002F\u002Fludwig.ai\u002Flatest\u002Fexamples\u002Fvisual_qa)\n- [Spoken Digit Speech Recognition](https:\u002F\u002Fludwig.ai\u002Flatest\u002Fexamples\u002Fspeech_recognition)\n- [Speaker Verification](https:\u002F\u002Fludwig.ai\u002Flatest\u002Fexamples\u002Fspeaker_verification)\n- [Binary Classification (Titanic)](https:\u002F\u002Fludwig.ai\u002Flatest\u002Fexamples\u002Ftitanic)\n- [Timeseries forecasting](https:\u002F\u002Fludwig.ai\u002Flatest\u002Fexamples\u002Fforecasting)\n- [Timeseries forecasting (Weather)](https:\u002F\u002Fludwig.ai\u002Flatest\u002Fexamples\u002Fweather)\n- [Movie rating prediction](https:\u002F\u002Fludwig.ai\u002Flatest\u002Fexamples\u002Fmovie_ratings)\n- [Multi-label classification](https:\u002F\u002Fludwig.ai\u002Flatest\u002Fexamples\u002Fmulti_label)\n- [Multi-Task Learning](https:\u002F\u002Fludwig.ai\u002Flatest\u002Fexamples\u002Fmulti_task)\n- [Simple Regression: Fuel Efficiency Prediction](https:\u002F\u002Fludwig.ai\u002Flatest\u002Fexamples\u002Ffuel_efficiency)\n- [Fraud Detection](https:\u002F\u002Fludwig.ai\u002Flatest\u002Fexamples\u002Ffraud)\n\n# 💡 More Information\n\nRead our publications on [Ludwig](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1909.07930.pdf), [declarative ML](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2107.08148.pdf), and [Ludwig’s SoTA benchmarks](https:\u002F\u002Fopenreview.net\u002Fpdf?id=hwjnu6qW7E4).\n\nLearn more about [how Ludwig works](https:\u002F\u002Fludwig.ai\u002Flatest\u002Fuser_guide\u002Fhow_ludwig_works\u002F), [how to get started](https:\u002F\u002Fludwig.ai\u002Flatest\u002Fgetting_started\u002F), and work through more [examples](https:\u002F\u002Fludwig.ai\u002Flatest\u002Fexamples).\n\nIf you are interested in 
[contributing](https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fblob\u002Fmain\u002FCONTRIBUTING.md), have questions, comments, or thoughts to share, or if you just want to be in the\nknow, please consider [joining our Community Discord](https:\u002F\u002Fdiscord.gg\u002FCBgdrGnZjy) and follow us on [X](https:\u002F\u002Ftwitter.com\u002Fludwig_ai)!\n\n# 🤝 Join the community to build Ludwig with us\n\nLudwig is an actively managed open-source project that relies on contributions from folks just like\nyou. Consider joining the active group of Ludwig contributors to make Ludwig an even\nmore accessible and feature rich framework for everyone to use!\n\n\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fgraphs\u002Fcontributors\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fludwig-ai_ludwig_readme_8d08bdcc331a.png\" \u002F>\n\u003C\u002Fa>\u003Cbr\u002F>\n\n## Star History\n\n[![Star History Chart](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fludwig-ai_ludwig_readme_60abce3d35d1.png)](https:\u002F\u002Fstar-history.com\u002F#ludwig-ai\u002Fludwig&Date)\n\n# 👋 Getting Involved\n\n- [Discord](https:\u002F\u002Fdiscord.gg\u002FCBgdrGnZjy)\n- [X (Twitter)](https:\u002F\u002Ftwitter.com\u002Fludwig_ai)\n- [Medium](https:\u002F\u002Fmedium.com\u002Fludwig-ai)\n- [GitHub Issues](https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fissues)\n","\u003Cp align=\"center\">\n  \u003Ca href=\"https:\u002F\u002Fludwig.ai\">\n    \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fludwig-ai_ludwig_readme_b2b6f1e1f50d.jpg\" height=\"150\">\n  \u003C\u002Fa>\n\u003C\u002Fp>\n\n\u003Cdiv 
align=\"center\">\n\n_为规模化与高效性而构建的声明式深度学习框架。_\n\n[![PyPI版本](https:\u002F\u002Fbadge.fury.io\u002Fpy\u002Fludwig.svg)](https:\u002F\u002Fbadge.fury.io\u002Fpy\u002Fludwig)\n[![Discord](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FDiscord-Join%20Chat-5865F2?logo=discord&logoColor=white)](https:\u002F\u002Fdiscord.gg\u002FCBgdrGnZjy)\n[![DockerHub](https:\u002F\u002Fimg.shields.io\u002Fdocker\u002Fpulls\u002Fludwigai\u002Fludwig.svg)](https:\u002F\u002Fhub.docker.com\u002Fr\u002Fludwigai)\n[![下载量](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fludwig-ai_ludwig_readme_e8db7d8a5430.png)](https:\u002F\u002Fpepy.tech\u002Fproject\u002Fludwig)\n[![许可证](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLicense-Apache%202.0-blue.svg)](https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fblob\u002Fmain\u002FLICENSE)\n[![X](https:\u002F\u002Fimg.shields.io\u002Ftwitter\u002Ffollow\u002Fludwig_ai.svg?style=social&logo=twitter)](https:\u002F\u002Ftwitter.com\u002Fludwig_ai)\n\n\u003C\u002Fdiv>\n\n# 📖 Ludwig 是什么？\n\nLudwig 是一个 **低代码** 框架，用于构建 **自定义** 的 AI 模型，例如 **LLM** 和其他深度神经网络。\n\n主要特性：\n\n- 🛠 **轻松构建自定义模型：** 只需一个声明式的 YAML 配置文件，即可在您的数据上训练出最先进的 LLM。支持多任务和多模态学习。全面的配置验证能够检测无效的参数组合，并防止运行时错误。\n- ⚡ **针对规模化与效率优化：** 自动批量大小选择、分布式训练（[DDP](https:\u002F\u002Fpytorch.org\u002Ftutorials\u002Fbeginner\u002Fddp_series_theory.html)、[DeepSpeed](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FDeepSpeed)）、参数高效的微调（[PEFT](https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Fpeft)）、4位量化（QLoRA）、分页及8位优化器，以及超出内存容量的数据集。\n- 📐 **专家级控制：** 您可以完全掌控模型的每一个细节，包括激活函数等。支持超参数优化、可解释性分析以及丰富的指标可视化。\n- 🧱 **模块化与可扩展性：** 仅需在配置中进行少量参数调整，即可尝试不同的模型架构、任务、特征和模态。就像搭建深度学习的积木一样。\n- 🚢 **专为生产环境设计：** 提供预构建的 [Docker](https:\u002F\u002Fhub.docker.com\u002Fu\u002Fludwigai) 容器，原生支持使用 [Ray](https:\u002F\u002Fwww.ray.io\u002F) 在 [Kubernetes](https:\u002F\u002Fgithub.com\u002Fray-project\u002Fkuberay) 上运行，可将模型导出为 [Torchscript](https:\u002F\u002Fpytorch.org\u002Fdocs\u002Fstable\u002Fjit.html) 和 
[Triton](https:\u002F\u002Fdeveloper.nvidia.com\u002Ftriton-inference-server)，并可通过一条命令上传至 [HuggingFace](https:\u002F\u002Fhuggingface.co\u002Fmodels)。\n\nLudwig 由 **Linux 基金会人工智能与数据组织** 托管。\n\n**技术栈：** Python 3.12 | PyTorch 2.7+ | Pydantic 2 | Transformers 5 | Ray 2.54\n\n![img](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fludwig-ai_ludwig_readme_89601297f1a8.gif)\n\n# 💾 安装\n\n从 PyPI 安装。请注意，Ludwig 需要 Python 3.12 或更高版本。\n\n```shell\npip install ludwig\n```\n\n或者安装包含所有可选依赖项的版本：\n\n```shell\npip install ludwig[full]\n```\n\n更多详细的安装说明，请参阅 [贡献指南](https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fblob\u002Fmain\u002FCONTRIBUTING.md)。\n\n# 🚂 入门\n\n想快速了解 Ludwig 的一些功能吗？请查看这个 Colab 笔记本 🚀 [![在 Colab 中打开](https:\u002F\u002Fcolab.research.google.com\u002Fassets\u002Fcolab-badge.svg)](https:\u002F\u002Fcolab.research.google.com\u002Fdrive\u002F1lB4ALmEyvcMycE3Mlnsd7I3bc0zxvk39)\n\n想要微调 LLM 吗？请查看以下笔记本：\n\n1. 微调 Llama-2-7b：[![在 Colab 中打开](https:\u002F\u002Fcolab.research.google.com\u002Fassets\u002Fcolab-badge.svg)](https:\u002F\u002Fcolab.research.google.com\u002Fdrive\u002F1r4oSEwRJpYKBPM0M0RSh0pBEYK_gBKbe)\n1. 微调 Llama-2-13b：[![在 Colab 中打开](https:\u002F\u002Fcolab.research.google.com\u002Fassets\u002Fcolab-badge.svg)](https:\u002F\u002Fcolab.research.google.com\u002Fdrive\u002F1zmSEzqZ7v4twBrXagj1TE_C--RNyVAyu)\n1. 
微调 Mistral-7b：[![在 Colab 中打开](https:\u002F\u002Fcolab.research.google.com\u002Fassets\u002Fcolab-badge.svg)](https:\u002F\u002Fcolab.research.google.com\u002Fdrive\u002F1i_8A1n__b7ljRWHzIsAdhO7u7r49vUm4)\n\n如需完整教程，请参阅官方的 [入门指南](https:\u002F\u002Fludwig.ai\u002Flatest\u002Fgetting_started\u002F) 或查看端到端的 [示例](https:\u002F\u002Fludwig.ai\u002Flatest\u002Fexamples)。\n\n## 大型语言模型微调\n\n[![在 Colab 中打开](https:\u002F\u002Fcolab.research.google.com\u002Fassets\u002Fcolab-badge.svg)](https:\u002F\u002Fcolab.research.google.com\u002Fdrive\u002F1c3AO8l_H6V_x37RwQ8V7M6A-RmcBf2tG?usp=sharing)\n\n让我们对一个预训练的 LLM 进行微调，使其能够像聊天机器人一样遵循指令（“指令微调”）。\n\n### 前提条件\n\n- [HuggingFace API Token](https:\u002F\u002Fhuggingface.co\u002Fdocs\u002Fhub\u002Fsecurity-tokens)\n- 对所选基础模型的访问权限（例如，[Llama-3.1-8B](https:\u002F\u002Fhuggingface.co\u002Fmeta-llama\u002FLlama-3.1-8B)）\n- 至少配备 12 GiB 显存的 GPU（我们在测试中使用了 Nvidia T4）\n\n### 运行步骤\n\n我们将使用 [Stanford Alpaca](https:\u002F\u002Fcrfm.stanford.edu\u002F2023\u002F03\u002F13\u002Falpaca.html) 数据集，该数据集将被格式化为类似表格的文件，内容如下：\n\n|                    指令                    |      输入       |                      输出                       |\n| :-----------------------------------------------: | :--------------: | :-----------------------------------------------: |\n| 给出三条保持健康的建议。                        |                  | 1. 保持均衡饮食，并确保摄入... |\n| 将下列物品按顺序排列：蛋糕、我、吃                | 蛋糕、我、吃     | 我正在吃蛋糕。                   |\n| 写一段关于著名人物的介绍性段落...               | 米歇尔·奥巴马   | 米歇尔·奥巴马是一位鼓舞人心的女性，她... |\n|                        ...                        |       ...        |                        ...                        
|\n\n创建一个名为 `model.yaml` 的 YAML 配置文件，内容如下：\n\n```yaml\nmodel_type: llm\nbase_model: meta-llama\u002FLlama-3.1-8B\n\nquantization:\n  bits: 4\n\nadapter:\n  type: lora\n\nprompt:\n  template: |\n    下面是一条描述任务的指令，以及可能提供进一步上下文的输入。\n    请撰写一个恰当回应请求的内容。\n\n    ### 指令：\n    {instruction}\n\n    ### 输入：\n    {input}\n\n    ### 回答：\n\ninput_features:\n  - name: prompt\n    type: text\n\noutput_features:\n  - name: output\n    type: text\n\ntrainer:\n  type: finetune\n  learning_rate: 0.0001\n  batch_size: 1\n  gradient_accumulation_steps: 16\n  epochs: 3\n  learning_rate_scheduler:\n    decay: cosine\n    warmup_fraction: 0.01\n\npreprocessing:\n  sample_ratio: 0.1\n\nbackend:\n  type: local\n```\n\n现在让我们开始训练模型：\n\n```bash\nexport HUGGING_FACE_HUB_TOKEN=\"\u003Capi_token>\"\n\nludwig train --config model.yaml --dataset \"ludwig:\u002F\u002Falpaca\"\n```\n\n## 监督学习\n\n让我们构建一个神经网络，用于预测给定的电影评论家在[烂番茄](https:\u002F\u002Fwww.kaggle.com\u002Fstefanoleone992\u002Frotten-tomatoes-movies-and-critic-reviews-dataset)上的影评是正面还是负面。\n\n我们的数据集将是一个看起来像这样的CSV文件：\n\n|     电影标题      | 内容分级 |              类型              | 片长 | 权威影评人 | 影评内容                                                                                                                                                                                                   | 是否推荐 |\n| :------------------: | :------------: | :------------------------------: | :-----: | ---------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------- |\n| 拯救恶灵 |       R        |    动作与冒险、恐怖    |  117.0  | TRUE       | 导演斯科特·德里克森和他的联合编剧保罗·哈里斯·博德曼带来了一部平庸的程序化影片，惊悚效果乏善可陈。                                                                                       | 0           |\n|       巴巴拉        |     PG-13      | 艺术片与国际片、剧情片 |  105.0  | FALSE      | 
在这个感人至深的故事中，芭芭拉设法坚守自己的原则、人性与勇气，奋力拯救一位正被共产党试图摧毁生命的年轻女异议分子。 | 1           |\n|   糟糕老板们    |       R        |              喜剧              |  98.0   | FALSE      | 这些老板既无法为谋杀辩护，也无法留下令人难忘的喜剧记忆，从而彻底破坏了一出本可以很出色的闹剧，最终却只沦为平庸之作。                                                        | 0           |\n|         ...          |      ...       |               ...                |   ...   | ...        | ...                                                                                                                                                                                                              | ...         |\n\n从[这里](https:\u002F\u002Fludwig.ai\u002Flatest\u002Fdata\u002Frotten_tomatoes.csv)下载该数据集的样本。\n\n```bash\nwget https:\u002F\u002Fludwig.ai\u002Flatest\u002Fdata\u002Frotten_tomatoes.csv\n```\n\n接下来，创建一个名为`model.yaml`的YAML配置文件，内容如下：\n\n```yaml\ninput_features:\n  - name: genres\n    type: set\n    preprocessing:\n      tokenizer: comma\n  - name: content_rating\n    type: category\n  - name: top_critic\n    type: binary\n  - name: runtime\n    type: number\n  - name: review_content\n    type: text\n    encoder:\n      type: embed\noutput_features:\n  - name: recommended\n    type: binary\n```\n\n就这些！现在让我们训练模型：\n\n```bash\nludwig train --config model.yaml --dataset rotten_tomatoes.csv\n```\n\n**祝建模愉快**\n\n尝试将Ludwig应用于你的数据。如果你有任何问题，请通过[Discord联系我们](https:\u002F\u002Fdiscord.gg\u002FCBgdrGnZjy)。\n\n# ❓ 为什么你应该使用Ludwig\n\n- **极简的机器学习样板代码**\n\n  Ludwig开箱即用地处理了机器学习中的工程复杂性，使研究科学家能够专注于在最高抽象层次上构建模型。数据预处理、超参数优化、设备管理以及针对`torch.nn.Module`模型的分布式训练都完全免费提供。\n\n- **轻松构建基准测试**\n\n  创建最先进的基线并与新模型进行比较，只需简单的配置更改即可。\n\n- **轻松将新架构应用于多个问题和数据集**\n\n  将新模型应用于Ludwig支持的广泛任务和数据集。Ludwig包含一个对任何用户开放的[完整基准测试工具包](https:\u002F\u002Farxiv.org\u002Fabs\u002F2111.04260)，只需简单配置即可在多个数据集上运行多种模型的实验。\n\n- **高度可配置的数据预处理、建模和指标**\n\n  模型架构、训练循环、超参数搜索以及后端基础设施的任何方面都可以作为声明式配置中的附加字段进行修改，以定制流水线来满足你的需求。有关可配置内容的详细信息，请参阅[Ludwig配置文档](https:\u002F\u002Fludwig.ai\u002Flatest\u002Fconfiguration\u002F)。\n\n- 
**开箱即用的多模态、多任务学习**\n\n  无需编写代码，即可将表格数据、文本、图像甚至音频混合搭配成复杂的模型配置。\n\n- **丰富的模型导出与跟踪功能**\n\n  自动使用Tensorboard、Comet ML、Weights & Biases、MLFlow和Aim Stack等工具跟踪所有试验和指标。\n\n- **自动将训练扩展到多GPU、多节点集群**\n\n  无需更改代码，即可从本地机器上的训练过渡到云端。\n\n- **低代码接口，适用于最先进的模型，包括预训练的Huggingface Transformers**\n\n  Ludwig还原生集成预训练模型，例如[Huggingface Transformers](https:\u002F\u002Fhuggingface.co\u002Fdocs\u002Ftransformers\u002Findex)中提供的模型。用户可以从大量最先进的预训练PyTorch模型中选择，而无需编写任何代码。例如，使用Ludwig训练基于BERT的情感分析模型非常简单：\n\n  ```shell\n  ludwig train --dataset sst5 --config_str \"{input_features: [{name: sentence, type: text, encoder: bert}], output_features: [{name: label, type: category}]}\"\n  ```\n\n- **低代码接口，用于AutoML**\n\n  [Ludwig AutoML](https:\u002F\u002Fludwig.ai\u002Flatest\u002Fuser_guide\u002Fautoml\u002F)允许用户仅提供数据集、目标列和时间预算即可获得训练好的模型。\n\n  ```python\n  auto_train_results = ludwig.automl.auto_train(dataset=my_dataset_df, target=target_column_name, time_limit_s=7200)\n  ```\n\n- **易于生产部署**\n\n  Ludwig使深度学习模型的部署变得容易，包括在GPU上部署。为你的Ludwig训练好的模型启动一个REST API。\n\n  ```shell\n  ludwig serve --model_path=\u002Fpath\u002Fto\u002Fmodel\n  ```\n\n  Ludwig还支持将模型导出为高效的Torchscript捆绑包。\n\n  ```shell\n  ludwig export_torchscript --model_path=\u002Fpath\u002Fto\u002Fmodel\n  ```\n\n# 📚 教程\n\n- [文本分类](https:\u002F\u002Fludwig.ai\u002Flatest\u002Fexamples\u002Ftext_classification)\n- [表格数据分类](https:\u002F\u002Fludwig.ai\u002Flatest\u002Fexamples\u002Fadult_census_income)\n- [图像分类](https:\u002F\u002Fludwig.ai\u002Flatest\u002Fexamples\u002Fmnist)\n- [多模态分类](https:\u002F\u002Fludwig.ai\u002Flatest\u002Fexamples\u002Fmultimodal_classification)\n\n# 🔬 示例用例\n\n- [命名实体识别标注](https:\u002F\u002Fludwig.ai\u002Flatest\u002Fexamples\u002Fner_tagging)\n- [自然语言理解](https:\u002F\u002Fludwig.ai\u002Flatest\u002Fexamples\u002Fnlu)\n- [机器翻译](https:\u002F\u002Fludwig.ai\u002Flatest\u002Fexamples\u002Fmachine_translation)\n- [基于序列到序列的闲聊对话建模](https:\u002F\u002Fludwig.ai\u002Flatest\u002Fexamples\u002Fseq2seq)\n- 
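前文提到 `ludwig serve` 会为训练好的模型启动一个 REST API。下面是一个仅用标准库构造预测请求的示意片段；端点路径 `/predict` 与表单编码方式是基于常见用法的假设，实际字段名请以 `ludwig serve` 启动后的输出为准：

```python
from urllib.parse import urlencode
from urllib.request import Request

# 示意性代码：假设 `ludwig serve` 已在本地运行，
# 用表单编码把一条记录的特征值提交到 /predict 端点。
def build_predict_request(host, record):
    body = urlencode(record).encode("utf-8")
    return Request(
        f"http://{host}/predict",
        data=body,  # 表单编码的特征值
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )

req = build_predict_request(
    "localhost:8000",
    {"review_content": "A mediocre thriller.", "top_critic": "TRUE"},
)
# 实际发送时可调用 urllib.request.urlopen(req)
```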
[情感分析](https:\u002F\u002Fludwig.ai\u002Flatest\u002Fexamples\u002Fsentiment_analysis)\n- [使用暹罗网络进行一次学习](https:\u002F\u002Fludwig.ai\u002Flatest\u002Fexamples\u002Foneshot)\n- [视觉问答](https:\u002F\u002Fludwig.ai\u002Flatest\u002Fexamples\u002Fvisual_qa)\n- [语音数字识别](https:\u002F\u002Fludwig.ai\u002Flatest\u002Fexamples\u002Fspeech_recognition)\n- [说话人验证](https:\u002F\u002Fludwig.ai\u002Flatest\u002Fexamples\u002Fspeaker_verification)\n- [二分类（泰坦尼克号）](https:\u002F\u002Fludwig.ai\u002Flatest\u002Fexamples\u002Ftitanic)\n- [时间序列预测](https:\u002F\u002Fludwig.ai\u002Flatest\u002Fexamples\u002Fforecasting)\n- [时间序列预测（天气）](https:\u002F\u002Fludwig.ai\u002Flatest\u002Fexamples\u002Fweather)\n- [电影评分预测](https:\u002F\u002Fludwig.ai\u002Flatest\u002Fexamples\u002Fmovie_ratings)\n- [多标签分类](https:\u002F\u002Fludwig.ai\u002Flatest\u002Fexamples\u002Fmulti_label)\n- [多任务学习](https:\u002F\u002Fludwig.ai\u002Flatest\u002Fexamples\u002Fmulti_task)\n- [简单回归：燃油效率预测](https:\u002F\u002Fludwig.ai\u002Flatest\u002Fexamples\u002Ffuel_efficiency)\n- [欺诈检测](https:\u002F\u002Fludwig.ai\u002Flatest\u002Fexamples\u002Ffraud)\n\n# 💡 更多信息\n\n阅读我们在 [Ludwig](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1909.07930.pdf)、[声明式机器学习](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2107.08148.pdf) 以及 [Ludwig 的 SOTA 基准测试](https:\u002F\u002Fopenreview.net\u002Fpdf?id=hwjnu6qW7E4) 方面的论文。\n\n了解更多关于 [Ludwig 的工作原理](https:\u002F\u002Fludwig.ai\u002Flatest\u002Fuser_guide\u002Fhow_ludwig_works\u002F)、[如何开始使用](https:\u002F\u002Fludwig.ai\u002Flatest\u002Fgetting_started\u002F) 的信息，并查看更多 [示例](https:\u002F\u002Fludwig.ai\u002Flatest\u002Fexamples)。\n\n如果您有兴趣 [参与贡献](https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fblob\u002Fmain\u002FCONTRIBUTING.md)，或有任何问题、意见和想法想要分享，亦或是希望随时了解最新动态，请考虑 [加入我们的社区 Discord](https:\u002F\u002Fdiscord.gg\u002FCBgdrGnZjy)，并关注我们在 [X](https:\u002F\u002Ftwitter.com\u002Fludwig_ai) 上的账号！\n\n# 🤝 加入社区，与我们一起构建 Ludwig\n\nLudwig 是一个由社区积极维护的开源项目，其发展离不开像您这样的贡献者。欢迎加入活跃的 Ludwig 贡献者团队，共同让 Ludwig 
成为一个更加易用、功能更丰富的框架，惠及每一位用户！\n\n\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fgraphs\u002Fcontributors\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fludwig-ai_ludwig_readme_8d08bdcc331a.png\" \u002F>\n\u003C\u002Fa>\u003Cbr\u002F>\n\n## 星标历史\n\n[![星标历史图表](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fludwig-ai_ludwig_readme_60abce3d35d1.png)](https:\u002F\u002Fstar-history.com\u002F#ludwig-ai\u002Fludwig&Date)\n\n# 👋 如何参与\n\n- [Discord](https:\u002F\u002Fdiscord.gg\u002FCBgdrGnZjy)\n- [X（Twitter）](https:\u002F\u002Ftwitter.com\u002Fludwig_ai)\n- [Medium](https:\u002F\u002Fmedium.com\u002Fludwig-ai)\n- [GitHub 问题](https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fissues)","# Ludwig 快速上手指南\n\nLudwig 是一个声明式的低代码深度学习框架，专为构建自定义 AI 模型（包括大语言模型 LLM）而设计。通过简单的 YAML 配置文件，即可实现数据预处理、模型训练、超参数优化及分布式训练，无需编写复杂的底层代码。\n\n## 环境准备\n\n在开始之前，请确保您的开发环境满足以下要求：\n\n*   **操作系统**：Linux, macOS 或 Windows (WSL2 推荐)\n*   **Python 版本**：必须为 **Python 3.12** 或更高版本\n*   **硬件要求**：\n    *   基础任务：普通 CPU 或任意 GPU\n    *   LLM 微调：建议配备至少 12GB 显存的 NVIDIA GPU (如 T4, A10G 等)\n*   **前置依赖**：建议预先安装好 CUDA 驱动（如需 GPU 加速）\n\n> **国内开发者提示**：由于 PyPI 源在国外访问较慢，建议使用国内镜像源进行安装。\n\n## 安装步骤\n\n### 1. 基础安装\n使用 pip 安装核心功能：\n\n```bash\npip install ludwig -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple\n```\n\n### 2. 全功能安装（推荐）\n若需使用所有可选依赖（包括特定编码器、可视化工具等），请执行：\n\n```bash\npip install \"ludwig[full]\" -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple\n```\n\n## 基本使用\n\nLudwig 的核心工作流分为三步：准备数据 -> 编写配置 -> 运行训练。\n\n### 场景一：监督学习（文本分类）\n\n假设我们要根据电影评论预测其是否被推荐（基于 Rotten Tomatoes 数据集）。\n\n**1. 准备数据**\n下载示例数据：\n```bash\nwget https:\u002F\u002Fludwig.ai\u002Flatest\u002Fdata\u002Frotten_tomatoes.csv\n```\n\n**2. 
创建配置文件**\n新建 `model.yaml`，定义输入特征（类型自动识别）和输出目标：\n\n```yaml\ninput_features:\n  - name: genres\n    type: set\n    preprocessing:\n      tokenizer: comma\n  - name: content_rating\n    type: category\n  - name: top_critic\n    type: binary\n  - name: runtime\n    type: number\n  - name: review_content\n    type: text\n    encoder:\n      type: embed\noutput_features:\n  - name: recommended\n    type: binary\n```\n\n**3. 启动训练**\n执行以下命令开始训练：\n\n```bash\nludwig train --config model.yaml --dataset rotten_tomatoes.csv\n```\n\n---\n\n### 场景二：大语言模型微调 (LLM Fine-Tuning)\n\n以微调 Llama-3.1-8B 模型使其遵循指令为例。\n\n**1. 前置检查**\n*   获取 [HuggingFace API Token](https:\u002F\u002Fhuggingface.co\u002Fdocs\u002Fhub\u002Fsecurity-tokens)。\n*   确保已获准访问基座模型（如 `meta-llama\u002FLlama-3.1-8B`）。\n*   设置环境变量：\n    ```bash\n    export HUGGING_FACE_HUB_TOKEN=\"\u003C你的_api_token>\"\n    ```\n\n**2. 创建配置文件**\n新建 `model.yaml`，配置量化（4-bit）、适配器（LoRA）及训练参数：\n\n```yaml\nmodel_type: llm\nbase_model: meta-llama\u002FLlama-3.1-8B\n\nquantization:\n  bits: 4\n\nadapter:\n  type: lora\n\nprompt:\n  template: |\n    Below is an instruction that describes a task, paired with an input that may provide further context.\n    Write a response that appropriately completes the request.\n\n    ### Instruction:\n    {instruction}\n\n    ### Input:\n    {input}\n\n    ### Response:\n\ninput_features:\n  - name: prompt\n    type: text\n\noutput_features:\n  - name: output\n    type: text\n\ntrainer:\n  type: finetune\n  learning_rate: 0.0001\n  batch_size: 1\n  gradient_accumulation_steps: 16\n  epochs: 3\n  learning_rate_scheduler:\n    decay: cosine\n    warmup_fraction: 0.01\n\npreprocessing:\n  sample_ratio: 0.1\n\nbackend:\n  type: local\n```\n\n**3. 
启动训练**\n使用内置的 Alpaca 数据集进行训练：\n\n```bash\nludwig train --config model.yaml --dataset \"ludwig:\u002F\u002Falpaca\"\n```\n\n训练完成后，模型将自动保存，并可导出为 TorchScript 或上传至 HuggingFace。","某电商初创公司的数据团队急需基于用户评论数据微调一个专属的大语言模型，以自动识别产品缺陷并生成改进建议。\n\n### 没有 ludwig 时\n- 工程师需编写数百行复杂的 PyTorch 代码来处理数据加载、分布式训练及混合精度计算，开发周期长达数周。\n- 尝试不同的模型架构（如从 Llama 切换到 Mistral）或调整量化策略时，必须大幅重构底层训练逻辑，试错成本极高。\n- 缺乏统一的配置管理，团队成员间难以复现实验结果，且容易因参数组合错误导致训练中途崩溃。\n- 部署模型到生产环境需要手动转换格式并编写推理服务代码，运维门槛高且容易出错。\n\n### 使用 ludwig 后\n- 仅需编写一份声明式 YAML 配置文件，即可自动完成从数据预处理到 SOTA 大模型微调的全流程，将开发时间缩短至几天。\n- 通过修改配置中的几行参数，就能轻松切换模型骨干、启用 4-bit 量化（QLoRA）或开启多任务学习，灵活探索最优方案。\n- 内置的配置验证机制自动拦截无效参数组合，配合详细的指标可视化，确保实验过程稳定且结果可复现。\n- 利用一条命令即可将训练好的模型导出为 Torchscript 或 Triton 格式，甚至直接上传至 HuggingFace，无缝衔接生产部署。\n\nludwig 通过低代码声明式配置，让团队无需深陷底层代码泥潭，专注于数据与业务逻辑，实现了大模型定制的高效落地。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fludwig-ai_ludwig_c79799e3.png","ludwig-ai","Ludwig","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Fludwig-ai_45361d95.png","Ludwig is a toolbox that allows to train and test deep learning models without the need to write code. 
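配置中的 `prompt.template` 会在训练时对每条 Alpaca 样本做占位符替换（本文配置里的模板是标准 Alpaca 模板的中文翻译）。下面的独立片段用英文原版模板演示 `{instruction}` 与 `{input}` 的填充效果——Ludwig 在内部自动完成这一步，此处仅为示意：

```python
# 示意性代码：展示 prompt.template 对单条 Alpaca 样本的占位符替换结果。
TEMPLATE = """Below is an instruction that describes a task, paired with an input that may provide further context.
Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Input:
{input}

### Response:
"""

row = {
    "instruction": "Summarize the review in one sentence.",
    "input": "The thriller is competent but forgettable.",
}
# 模型实际看到的完整提示词
prompt = TEMPLATE.format(**row)
```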
It is an incubation level project in LF AI Foundation.",null,"https:\u002F\u002Fludwig.ai","https:\u002F\u002Fgithub.com\u002Fludwig-ai",[83,87],{"name":84,"color":85,"percentage":86},"Python","#3572A5",99.9,{"name":88,"color":89,"percentage":90},"Dockerfile","#384d54",0.1,11665,1215,"2026-04-05T17:59:56","Apache-2.0","Linux, macOS, Windows","非绝对必需（支持本地 CPU 运行），但训练 LLM 或大规模模型时推荐 NVIDIA GPU。示例中使用了 Nvidia T4，要求至少 12 GiB 显存。支持分布式训练 (DDP, DeepSpeed) 和量化技术以降低显存需求。","未说明（取决于数据集大小和模型规模，支持大于内存的数据集处理）",{"notes":99,"python":100,"dependencies":101},"该工具是一个声明式深度学习框架，主要通过 YAML 配置文件驱动。安装时可选择 'ludwig[full]' 获取所有可选依赖。针对大语言模型（LLM）微调，需要 HuggingFace API Token 及对应模型的访问权限。支持 Docker 容器化部署及 Kubernetes (Ray) 集群扩展。若进行 4-bit 量化微调 (QLoRA)，可显著降低显存门槛。","3.12+",[102,103,104,105,106,107,108],"PyTorch>=2.7","Pydantic>=2","Transformers>=5","Ray>=2.54","DeepSpeed","PEFT","HuggingFace Hub",[14,51,13,54,26],[111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127,128,129,130],"deep-learning","deeplearning","deep","learning","machine-learning","machinelearning","natural-language-processing","natural-language","computer-vision","data-centric","data-science","pytorch","neural-network","ml","llm","llm-training","fine-tuning","llama","mistral","llama2","2026-03-27T02:49:30.150509","2026-04-06T09:25:38.159793",[134,139,144],{"id":135,"question_zh":136,"answer_zh":137,"source_url":138},18488,"如何在 Mac M1 芯片上安装和运行 Ludwig？","目前 Mac M1 的支持仍在完善中。您可以尝试使用 PyTorch 后端并启用 MPS (Metal Performance Shaders) 加速。有用户测试发现，虽然模型可以在 MPS 上运行（速度比 CPU 快约一倍），但可能会遇到输出包含 NaN 或结果不正确的问题，这是因为部分 PyTorch 算子（如 'aten::nonzero'）在 MPS 后端尚不支持，会回退到 CPU 运行。建议关注 PyTorch 对 M1 的更新进度。如果仅需推理已训练好的模型，可以考虑使用 Apple 的 coremltools 将 PyTorch 模型转换为 CoreML 格式在本地运行。","https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fissues\u002F1101",{"id":140,"question_zh":141,"answer_zh":142,"source_url":143},18489,"读取 CSV 文件时出现 'pandas.errors.ParserError: Expected X fields in line Y, saw Z' 错误怎么办？","这个错误通常意味着 CSV 文件格式有问题（例如列数不匹配或分隔符混淆）。Ludwig 内部的 
`data_utils.read_csv` 函数主要用于内部处理，功能有限（例如不支持 `escape_char` 参数且默认仅支持逗号分隔）。\n解决方案：\n1. 建议直接使用 `pandas.read_csv()` 加载数据，因为它支持更多参数（如 `escape_char`, `sep` 等）来处理复杂的 CSV 文件。\n2. 确保 CSV 文件格式正确，如果字段中包含逗号，请正确使用引号包裹。\n3. 加载为 pandas DataFrame 后，可以直接将其传递给 Ludwig 的 `train()`, `predict()`, 或 `experiment()` 方法，无需通过 Ludwig 的文件读取接口。","https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fissues\u002F201",{"id":145,"question_zh":146,"answer_zh":147,"source_url":148},18490,"运行 `ludwig visualize` 命令后没有生成任何图片或结果，如何解决？","这通常是因为使用了旧版本的命令行参数格式。从 Ludwig v0.3 开始，CLI 参数发生了破坏性变更。\n解决方法：\n1. 升级 Ludwig 到最新版本。\n2. 检查并更新命令参数。例如，绘制混淆矩阵的正确命令格式如下：\n```bash\nludwig visualize \\\n  --visualization confusion_matrix \\\n  --output_feature label \\\n  --ground_truth_metadata results\u002Fexperiment_run_0\u002Fmodel\u002Ftraining_set_metadata.json \\\n  --output_directory visualizations \\\n  --file_format png \\\n  --test_statistics .\u002Fresults\u002Fexperiment_run_0\u002Ftest_statistics.json\n```\n请确保指定了 `--ground_truth_metadata` 和 `--test_statistics` (或 `--training_statistics`) 等必要参数，并参考最新文档确认具体可视化类型的参数要求。","https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fissues\u002F69",[150,155,160,165,170,175,180,185,190,195,200,205,210,215,220,225,230,235,240,245],{"id":151,"version":152,"summary_zh":153,"released_at":154},109043,"v0.12.0","## Ludwig 0.12.0\n\n### 现代化构建系统\n- 从 `setup.py` 加上 10 个依赖文件，迁移到使用 hatchling 的单个 `pyproject.toml`\n- 从 `ludwig\u002Fglobals.py` 动态获取版本号\n- 使用 SafeTensors 实现安全、零拷贝的模型权重序列化（用于 ECD 模型和训练检查点）\n- 添加了 `torchcodec` 依赖（torchaudio 2.x 所需）\n\n### 清理配置层\n- 移除了所有与 marshmallow 向后兼容的层（`@ludwig_dataclass`、`_SchemaAdapter`、`.Schema().load\u002Fdump()`）\n- 将 `BaseMarshmallowConfig` 重命名为 `LudwigBaseConfig`，`DictMarshmallowField` 重命名为 `NestedConfigField`\n- 默认启用严格验证——未知的配置字段现在会发出警告并被移除\n\n### 新增 4 种组合器\n- **FTTransformerCombiner**（类型：`ft_transformer`）：[CLS] 标记 + Transformer 自注意力机制（Gorishniy 等人，NeurIPS 2021）\n- **CrossAttentionCombiner**（类型：`cross_attention`）：所有特征对之间的成对交叉注意力\n- 
**PerceiverCombiner**（类型：`perceiver`）：可学习的瓶颈潜变量标记（Jaegle 等人，ICML 2022）\n- **GatedFusionCombiner**（类型：`gated_fusion`）：受 Flamingo 启发的门控跨模态融合\n\n### 数值特征分词\n- **PLEEncoder**（类型：`ple`）：基于分位数边界的分段线性编码（Gorishniy 等人，NeurIPS 2022）\n- **PeriodicEncoder**（类型：`periodic`）：学习到的正弦特征\n\n### 多任务损失平衡（`trainer.loss_balancing`）\n- `log_transform`：对损失取 log(1+loss) 进行压缩（DB-MTL）\n- `uncertainty`：同方差不确定性加权（Kendall 等人，CVPR 2018）\n- `famo`：快速自适应多任务优化（Liu 等人，NeurIPS 2023）\n- `gradnorm`：梯度归一化（Chen 等人，ICML 2018）\n\n### 模型集成（`trainer.model_soup`）\n通过检查点权重平均来提升泛化能力，且无需额外推理成本（Wortsman 等人，ICML 2022）\n\n### 模态丢弃（`trainer.modality_dropout`）\n为缺失输入提供可学习的嵌入表示，以增强推理时对缺失数据的鲁棒性。\n\n### 质量预设（`preset: medium_quality|high_quality|best_quality`）\n借鉴 AutoGluon 的设计，提供一键式配置，用于在质量和速度之间进行不同权衡。\n\n### 其他改进\n- 在 `ludwig.datasets` 中新增了 `california_housing` 数据集\n- 简化了 torchaudio 的调用（移除了旧版后端的版本检查）\n- 修复了 SafeTensors 在共享内存中处理绑定权重时的问题\n\n### 基准测试结果\n\n| 模型 | Adult Census（AUC） | California Housing（RMSE） |\n|------|-------------------|--------------------------|\n| ft_transformer | **0.919** | **0.461** |\n| transformer | 0.918 | 0.469 |\n| cross_attention | 0.916 | 0.477 |\n| perceiver | 0.916 | 0.477 |\n| concat（基准） | 0.911 | 0.491 |\n\nFT-Transformer 的性能与论文报告的结果相差不超过 0.2%。","2026-04-04T04:01:25",{"id":156,"version":157,"summary_zh":158,"released_at":159},109044,"v0.11.4","## 修复\n\n- 修复 DDP 检查点的竞态条件：使用 `os.makedirs(exist_ok=True)` 防止多个工作进程同时创建训练检查点目录时出现 `FileExistsError` 错误。\n- 修复 `batch_predict` 中的 Dask 元数据不匹配问题：将 `from_ray_dataset()` 放在 `tensor_extension_casting(False)` 上下文中，以确保在校准过程中分区的数据类型与元数据一致。\n- 在分布式依赖中将 Dask 的最低版本锁定为 2026.1.2。\n- 在 `batch_predict` 中禁用张量扩展转换，以解决 Dask 元数据不匹配的问题。","2026-04-02T03:18:58",{"id":161,"version":162,"summary_zh":163,"released_at":164},109045,"v0.11.3","## 变更内容\n\n### 功能\n- 升级至 PyTorch 2.7.1，修复 bitsandbytes 和 CUDA 兼容性问题\n\n### 错误修复\n- 通过禁用张量扩展的类型转换，修复 batch_predict 中的 Dask 元数据不匹配问题\n- 修复序列编码器梯度测试中的 Dropout 被移除问题\n- 修正 StackedCNN 梯度测试，使其在稀疏更新下也能通过\n- 在 CI 中将 torchvision 和 torchaudio 
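上文提到的 model soup（Wortsman 等人，ICML 2022）思路是对多个训练检查点的权重做逐元素平均，以零推理开销换取更好的泛化。下面用纯 Python 列表代替真实张量，给出一个最小示意实现（非 Ludwig 的实际代码）：

```python
# 示意性代码：对若干 {参数名: [浮点数]} 形式的状态字典做均匀加权平均。
# 真实检查点中的权重是张量，这里用列表代替。
def average_checkpoints(checkpoints):
    """逐元素平均多个检查点的同名参数。"""
    n = len(checkpoints)
    names = checkpoints[0].keys()
    return {
        name: [sum(ckpt[name][i] for ckpt in checkpoints) / n
               for i in range(len(checkpoints[0][name]))]
        for name in names
    }

soup = average_checkpoints([
    {"w": [1.0, 2.0]},
    {"w": [3.0, 4.0]},
])
# soup["w"] == [2.0, 3.0]
```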
的版本锁定为与 PyTorch 2.6.0 相匹配的版本\n\n### 维护\n- 预提交建议 (#4075)","2026-04-01T03:30:01",{"id":166,"version":167,"summary_zh":168,"released_at":169},109046,"v0.11.2","## 变更内容\n\n### 错误修复\n- 修复测试运行缓慢导致的失败问题：检查点 API、Dask-expr 的 batch_transform、MinIO S3 存储兼容性\n- 在 Ray Tune 的 S3 存储中使用 fsspec s3fs（修复了 PyArrow\u002FMinIO 分块传输编码不兼容的问题）\n- 修复从 Ray Tune 检查点加载 automl best_model 的问题（使用 `from_checkpoint=True`）\n- 修复分层划分时每类所需最小样本数不足的测试数据大小问题\n\n### 性能优化\n- 优化测试套件：在 14 个以上的集成测试文件中减少数据量、模型规模和训练轮数\n- 将 CI 运行时间缩短约 60%-70%\n\n### 文档更新\n- 修正 README 中的图片 URL，使其指向正确的分支\n- 重写 v0.11 版本的韩语 README\n- 添加 SchemaStore 集成所需的 JSON Schema 导出功能\n\n**完整变更日志**：https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fcompare\u002Fv0.11.1...v0.11.2","2026-02-28T10:56:33",{"id":171,"version":172,"summary_zh":173,"released_at":174},109047,"v0.11.1","## 变更内容\n\n### 功能\n- **迁移到 Pydantic 2**：将整个模式系统从 Marshmallow 迁移至 Pydantic 2 (#4070)\n- **移除 Marshmallow 依赖**：完全移除了 Marshmallow 和 marshmallow-dataclass 作为依赖项 (#4071)\n- **timm 编码器**：新增了包含 MetaFormer 变体的 timm 编码器——CAFormer、ConvFormer、PoolFormer (#4063)\n- **trust_remote_code**：增加了对带有 `trust_remote_code` 参数的自定义 Hugging Face 模型的支持 (#4065)\n\n### 修复\n- **LoRA 保存\u002F加载**：在保存前重新排序 `merge_and_unload`，并在加载时处理合并后的权重 (#4067)\n- **服务端点**：在日志记录前对概率进行裁剪，以防止服务端点出现 `-inf` (#4064)\n- **JSON 安全的进度**：将 `float('inf')` 替换为 `sys.float_info.max`，以确保训练进度的 JSON 兼容性 (#4062)\n- **回调基类**：在回调基类中添加了 `**kwargs`，以实现向前兼容 (#4066)\n- **DevContainer**：重写了 DevContainer 配置，以适应现代开发环境 (#4068)\n\n### 杂项\n- 对过时的配置、Docker 镜像、CI 流水线和文档进行了现代化改造 (#4069)\n- 优化了运行较慢的测试，以缩短 CI 执行时间 (#4072)","2026-02-27T23:25:39",{"id":176,"version":177,"summary_zh":178,"released_at":179},109048,"v0.11.0","# Ludwig v0.11.0\n\n这是一次重大更新，使 Ludwig 与现代 Python\u002FPyTorch\u002FRay 生态系统保持同步。\n\n## 亮点\n\n### 平台与依赖\n- **Python 3.10+** — 放弃对 Python 3.8 和 3.9 的支持\n- **PyTorch 2.6**，配备 `F.scaled_dot_product_attention` 用于自定义注意力机制\n- **Ray 2.54**，采用现代化的 `ray.data.Dataset`（取代了旧版 `DatasetPipeline`）\n- **transformers 5.x**、**torchaudio 
2.x**、**NumPy 2.x**、**Dask 2026.1.2**\n- 兼容 **MLflow 3.10**\n\n### 移除的后端\n- **Horovod** — 因更倾向于使用 Ray 原生的分布式训练而被移除\n- **Neuropod** — 已移除（项目已归档）\n- **GBM (LightGBM)** — 为简化代码库而被移除\n\n### 架构变更\n- Ray `DatasetPipeline` → 惰性执行的 `ray.data.Dataset`\n- 自定义注意力机制改用 `F.scaled_dot_product_attention`（修复了 CUDA 上的 CUBLAS 错误）\n- PyTorch 性能分析器 API 更新至纳秒级精度（`start_ns`\u002F`duration_ns`）\n- Dask-expr 兼容性改进（`dd.concat()`、PyArrow 字符串处理）\n- 处理了 Ray Train 2.54 的破坏性变更（基于检查点的指标报告）\n\n### CI 与质量\n- 单元测试、集成测试和分布式测试套件共计通过 **3,266 项测试**\n- pre-commit 钩子更新至最新版本（black 26、flake8 7.3、isort 8、mdformat 1.0）\n- 全面清理 flake8 警告（移除了所有针对已解决问题的 `# noqa` 禁用注释）\n\n### Bug 修复\n- 修复了 Ray 后端在使用 `defaultdict` lambda 工厂时的序列化问题\n- 修复了 `BatchInferModel` 的 GPU\u002FCPU 设备管理问题\n- 修复了 `NoneTrainer` 在主节点上的指标同步问题\n- 修复了 `LLM.to_device()` 根据实际参数检测设备的问题\n- 修复了 torchaudio 2.x 下音频预处理崩溃的问题\n- 修复了 hyperopt 中 `tune_callbacks` 的透传问题\n\n## 安装\n\n```bash\npip install ludwig==0.11.0\n```\n\n或安装所有可选依赖：\n\n```bash\npip install ludwig[full]==0.11.0\n```\n\n## 完整变更日志\n\n请参阅 [完整差异](https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fcompare\u002Fv0.10.4...v0.11.0) 以了解所有变更。","2026-02-27T04:28:58",{"id":181,"version":182,"summary_zh":183,"released_at":184},109049,"v0.10.4","## 变更内容\n* @arnavgarg1 在 https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3993 中修复了去量化脚本中的一个小拼写错误。\n* 文档：@eltociear 在 https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F4001 中更新了 visualize.py 脚本。\n* [维护] 最新版本的 matplotlib 导致 ptitprince 和 seaborn 的方法调用失效。@alexsherstinsky 在 https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F4007 中进行了修复。\n* [维护] 为 ViTEncoder 的修复实现了一个更优雅的版本，以确保 transformers.ViTModel 正确返回 output_attentions，并减少了代码量。@alexsherstinsky 在 https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F4008 中完成了这项工作。\n* 修复 MNIST 数据源。@mhabedank 在 https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F4011 中完成了此修复。\n* 支持使用正则表达式冻结预训练视觉模型的层。@ethanreidel 在 
https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3981 中实现了这一功能。\n* 添加对 Phi-3 的支持。@arnavgarg1 在 https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F4014 中完成了此项工作。\n\n## 新贡献者\n* @eltociear 在 https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F4001 中做出了首次贡献。\n* @mhabedank 在 https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F4011 中做出了首次贡献。\n\n**完整变更日志**：https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fcompare\u002Fv0.10.3...v0.10.4","2024-07-30T00:11:51",{"id":186,"version":187,"summary_zh":188,"released_at":189},109050,"v0.10.3","## 变更内容\n* 将 Slack 链接替换为 Discord 链接，由 @alexsherstinsky 在 https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3988 中完成。\n* 允许在预处理阶段使用图像字节类型，由 @vijayi1 在 https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3971 中完成。\n* 修复了 'upload_to_hf_hub()' 方法与 'save()' 方法路径不匹配的问题，由 @sanjaydasgupta 在 https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3977 中完成。\n* 进行了小幅改动，以修复响应被错误截断的问题，由 @amankhandelia 在 https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3986 中完成。\n* 将 transformers 的最低版本锁定为 4.39，以降低 Llama\u002FGemma 模型的内存压力，由 @arnavgarg1 在 https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3976 中完成。\n* 真正添加对 RSLoRA 和 DoRA 的支持，由 @arnavgarg1 在 https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3984 中完成。\n\n## 新贡献者\n* @amankhandelia 在 https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3986 中完成了他们的首次贡献。\n\n**完整变更日志**: https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fcompare\u002Fv0.10.2...v0.10.3","2024-04-08T23:13:41",{"id":191,"version":192,"summary_zh":193,"released_at":194},109051,"v0.10.2","## 新增功能\n* **添加对 RSLoRA 和 DoRA 的支持**，由 @arnavgarg1 在 https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3948 中实现  \n启用方法：在配置中将相应标志设置为 `true`（可组合使用）：\n```yaml\n    adapter:\n        type:  lora\n        use_rslora:  
false\n        use_dora: false\n```\n* **添加对本地后端 LLM 的评估批次大小调优支持**，由 @arnavgarg1 在 https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3957 中实现  \n启用方法：在训练器部分将 `eval_batch_size` 设置为 `auto`：\n```yaml\n    trainer:\n        eval_batch_size:  auto\n```\n* **支持从训练检查点加载模型权重**，由 @geoffreyangus 在 https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3969 中实现  \n启用方法：在调用 `LudwigModel.load()` 时传入 `from_checkpoint=True`：\n```python\nLudwigModel.load(model_dir, from_checkpoint=True)\n```\n\n## 完整变更日志\n* 将包含模型权重的 Ludwig 配置文件保存到输出目录中，由 @sanjaydasgupta 在 https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3965 中实现  \n* 添加图像工具 UNet 函数的单元测试，由 @vijayi1 在 https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3921 中实现  \n* 修复：更新 imdb_genre_prediction 数据集 YAML 文件以匹配数据集，由 @jeffreyftang 在 https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3944 中实现  \n* 修复 Kube APT 源，由 @noyoshi 在 https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3952 中实现  \n* 暂时禁用开销较大的文本指标，由 @arnavgarg1 在 https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3954 中实现  \n* 【维护】注释掉 PyTorch Nightly 测试，由 @alexsherstinsky 在 https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3955 中实现  \n* 【修复】修复集成测试失败问题，由 @alexsherstinsky 在 https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3959 中实现  \n* 【维护】使用最新版本的 psutil 库，由 @alexsherstinsky 在 https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3956 中实现  \n\n## 新贡献者\n* @sanjaydasgupta 在 https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3965 中完成了首次贡献\n\n**完整变更日志**：https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fcompare\u002Fv0.10.1...v0.10.2","2024-03-21T20:24:34",{"id":196,"version":197,"summary_zh":198,"released_at":199},109052,"v0.10.1","## 变更内容\n* 修复了Gemma模型微调中的一个严重 bug，该 bug 导致模型无法学习何时停止生成。此问题通过在指令微调的目标张量中使用 EOS 标记来解决，由 @geoffreyangus 在 
https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3945 中实现。\n\n\n**完整变更日志**: https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fcompare\u002Fv0.10.0...v0.10.1","2024-02-28T07:30:10",{"id":201,"version":202,"summary_zh":203,"released_at":204},109053,"v0.10.0","## What's Changed\r\n* Add Phi-2 to model presets by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3912\r\n* Add default LoRA target modules for Phi-2 by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3911\r\n* Add support for prompt lookup decoding during generation by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3917\r\n* Pin pyarrow to \u003C 15.0.0 by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3918\r\n* Add unet encoder-decoder and image output feature by @vijayi1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3913\r\n* fix: Add Nested quantization check by @jeffkinnison in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3916\r\n* fix typo in save_dequantized_base_model log statement by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3923\r\n* Add example for base model dequantization\u002Fupscaling by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3924\r\n* fix: Always return a list of quantization bits values from `get_quantization` by @jeffkinnison in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3926\r\n* fix: set `use_reentrant` to `True` to fix `Mixtral-7b` bug by @geoffreyangus in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3928\r\n* Disabling AdaptionPrompt till PEFT is fixed. 
by @alexsherstinsky in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3935\r\n* Add default LoRA target modules for Gemma by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3936\r\n* Pinning transformers to 4.38.1 or above in order to ensure support for Gemma by @alexsherstinsky in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3940\r\n* Ludwig release version change by @alexsherstinsky in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3941\r\n\r\n## New Contributors\r\n* @vijayi1 made their first contribution in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3913\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fcompare\u002Fv0.9.3...v0.10.0","2024-02-22T19:20:16",{"id":206,"version":207,"summary_zh":208,"released_at":209},109054,"v0.9.3","## What's Changed\r\n* [MAINTENANCE] Use Trusted Publishers credentials instead of User\u002FPassword for uploading releases to PyPi by @alexsherstinsky in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3892\r\n* Add support for official `microsoft\u002Fphi-2` by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3880\r\n* Ensure correct padding token for Phi and Pythia models by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3899\r\n* Enable AdaLoRA tests for LLM adapter by @jeffkinnison in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3896\r\n* Cast `LLMEncoder` output to `torch.float32`, freeze final layer at init. 
by @jeffkinnison in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3900\r\n* Enable IA3 adapters in `LLMEncoder` by @jeffkinnison in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3902\r\n* [Maintenance] Remove torch nightly pin by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3903\r\n* Pin deepspeed to \u003C 0.13 and pandas to \u003C 2.2.0 by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3906\r\n* Add batch size tuning for LLMs by @Infernaught in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3871\r\n\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fcompare\u002Fv0.9.2...v0.9.3","2024-01-23T04:10:36",{"id":211,"version":212,"summary_zh":213,"released_at":214},109055,"v0.9.2","## What's Changed\r\n* fix: Handle missing and unexpected keys during LLMEncoder state dict load by @jeffkinnison in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3841\r\n* fix: Add `name` and `description` classmethods to `IA3Config` by @jeffkinnison in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3844\r\n* Improve IA3 long description by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3845\r\n* fix: Handle missing and unexpected keys during LLMEncoder state dict load, part 2 by @jeffkinnison in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3843\r\n* Update description for max_new_tokens to explain the dynamic setting behavior in our docs by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3847\r\n* Add default LoRA target modules for Mixtral and Mixtral instruct by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3852\r\n* QOL: Fail config validation if a user tries to use ECD with a text output feature and an LLM 
encoder. by @justinxzhao in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3792\r\n* Pin minimum transformers to 4.36 for Mixtral and Phi support by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3854\r\n* Revert hack that leads to OOM during fine-tuning by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3858\r\n* Add support for exporting models to Carton by @VivekPanyam in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3797\r\n* [Maintenance] Bump minimum tokenizers to 0.15 by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3856\r\n* fix: correct typo in FeatureCollection by @dennisrall in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3863\r\n* Convert test main script in algorithm_utils to unit test by @dennisrall in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3864\r\n* Allow hyperopt config to be loaded from a file by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3865\r\n* fix: unify ludwig training set metadata and hf pad token by @geoffreyangus in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3860\r\n* Add a utility to detect LLM usage in a config by @jeffkinnison in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3869\r\n* Early stop training if model weights have nan or inf tensors by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3740\r\n* Scrub credentials from model_hyperparameters.json and description.json by @Infernaught in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3866\r\n* [Maintenance] Bump minimum torch version to 2.0.0 by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3873\r\n* [Maintenance] Fix docker images by pinning ray==2.3.1, daft==0.1.20, 
unpinning proto, and using torch 2.1.1. by @justinxzhao in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3872\r\n* [BUGFIX] Guard against UnicodeEncodeError when saving validation results in Google Colab environment by @alexsherstinsky in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3875\r\n* Docker image fixes part 2: pin to torch==2.1.0, add dependency for urllib\u003C2 by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3877\r\n* Add custom `prepare_for_trianing` logic to ECD model for LLM encoder adapter initialization by @jeffkinnison in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3874\r\n* qol: Fix some lints. by @justinxzhao in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3868\r\n* [Maintenance] Docker Image Fix part 3: fix torchaudio 2.1.0 dependencies by installing `libsox-dev` and update API by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3879\r\n* Add streaming support for zero shot inference by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3878\r\n* [Maintenance] Remove torchdata pin for nightly install by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3855\r\n* Add per-step token utilization to tensorboard and progress tracker. 
by @justinxzhao in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3867\r\n* Set use_reentrant to False for gradient checkpointing by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3882\r\n* [BUGFIX] Pinning torch nightly to January 13, 2024 to avoid AttributeError by @alexsherstinsky in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3885\r\n\r\n## New Contributors\r\n* @VivekPanyam made their first contribution in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3797\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fcompare\u002Fv0.9.1...v0.9.2","2024-01-16T21:14:25",{"id":216,"version":217,"summary_zh":218,"released_at":219},109056,"v0.9.1","## What's Changed\r\n* fix: Handle missing and unexpected keys during LLMEncoder state dict load by @jeffkinnison in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3841\r\n* fix: Add `name` and `description` classmethods to `IA3Config` by @jeffkinnison in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3844\r\n* Improve IA3 long description by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3845\r\n* bump ludwig version by @geoffreyangus in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3846\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fcompare\u002Fv0.9...v0.9.1","2023-12-20T21:47:42",{"id":221,"version":222,"summary_zh":223,"released_at":224},109057,"v0.9","## What's Changed\r\n* int: Rename original `combiner_registry` to `combiner_config_registry`, update decorator name by @ksbrar in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3516\r\n* Add mechanic to override default values for generation during model.predict() by @justinxzhao in 
https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3520\r\n* [feat] Support for numeric date feature inputs by @jeffkinnison in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3517\r\n* Add new synthesized `response` column for text output features during postprocessing by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3521\r\n* Disable flaky twitter bots dataset loading test. by @justinxzhao in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3439\r\n* Add test that verifies that the generation config passed in at model.predict() is used correctly. by @justinxzhao in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3523\r\n* Move loss metric to same device as inputs by @Infernaught in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3522\r\n* Add comment about batch size tuning by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3526\r\n* Ensure user sets backend to local w\u002F quantization by @Infernaught in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3524\r\n* README: Update LLM fine-tuning config by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3530\r\n* Revert \"Ensure user sets backend to local w\u002F quantization (#3524)\" by @tgaddair in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3531\r\n* Improve observability during LLM inference by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3536\r\n* [bug] Pin pydantic to \u003C 2.0 by @jeffkinnison in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3537\r\n* [bug] Support preprocessing `datetime.date` date features by @jeffkinnison in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3534\r\n* Remove obsolete prompt tuning example. 
by @justinxzhao in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3540\r\n* Add Ludwig 0.8 notebook to the README by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3542\r\n* Add `effective_batch_size` to auto-adjust gradient accumulation by @tgaddair in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3533\r\n* Refactor evaluation metrics to support decoded generated text metrics like BLEU and ROUGE. by @justinxzhao in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3539\r\n* Fix sequence generator test. by @justinxzhao in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3546\r\n* Revert \"Add Cosine Annealing LR scheduler as a decay method (#3507)\" by @justinxzhao in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3545\r\n* Set default max_sequence_length to None for LLM text input\u002Foutput features by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3547\r\n* Add skip_all_evaluation as a mechanic to skip all evaluation. by @justinxzhao in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3543\r\n* Roll-forward with fixes: Fix interaction between scheduler.step() and gradient accumulation steps, refactor schedulers to use `LambdaLR`, and add cosine annealing LR scheduler as a decay method. by @justinxzhao in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3555\r\n* fix: Move model to the correct device for eval by @jeffkinnison in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3554\r\n* Report loss in tqdm to avoid log spam by @tgaddair in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3559\r\n* Wrap each metric update in try\u002Fexcept. 
by @justinxzhao in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3562\r\n* Move DDP model to device if it hasn't been wrapped yet by @tgaddair in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3566\r\n* ensure that there are enough colors to match the score index in visua… by @thelinuxkid in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3560\r\n* Pin Transformers to 4.31.0 by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3569\r\n* Add test to show global_max_sequence_length can never exceed an LLM's context length by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3548\r\n* WandB: Add metric logging support on eval end and epoch end by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3586\r\n* schema: Add `prompt` validation check by @ksbrar in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3564\r\n* Unpin Transformers for CodeLlama support by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3592\r\n* Add support for Paged Optimizers (Adam, AdamW), 8-bit optimizers, and new optimizers: LARS, LAMB and LION by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3588\r\n* FIX: Failure in TabTransformer Combiner Unit test by @jimthompson5802 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3596\r\n* fix: Move target tensor to model output device in `check_module_parameters_updated` by @jeffkinnison in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3567\r\n* Allow user to specify huggingface link or local path to pretrained lora weights by @Infernaught in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3572\r\n* Add codellama to tokenizer list for set_pad_token by @Infernaught in 
https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3598\r\n* Set default eval batch size to 2 for LLM fine-tuning by @arnavgarg1 in htt","2023-12-19T22:20:40",{"id":226,"version":227,"summary_zh":228,"released_at":229},109058,"v0.8.6","## What's Changed\r\n* Add consumer complaints generation dataset by @connor-mccorm in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3685\r\n* Set the metadata only during first training run by @Infernaught in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3684\r\n* Add ability to upload Ludwig models to Predibase. by @martindavis in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3687\r\n* Log additional per-GPU information in model metadata files and GPU utilization on tensorboard. by @justinxzhao in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3712\r\n* QoL: Only log generation config being used once at inference time by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3715\r\n* [MAINTENANCE] Adding typehint annotations in backend and data components and fixing mypy errors. by @alexsherstinsky in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3709\r\n* QoL: Limit top-level trainer logging messages such as saving model or resuming model training to main coordinator process by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3718\r\n* Add sample_size as a global preprocessing parameter by @Infernaught in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3650\r\n* QOL: Update recommended vscode settings. 
by @justinxzhao in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3717\r\n* Add new fine-tuning notebooks to README by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3722\r\n* Dynamically set `max_new_tokens` based on output feature length, GMSL and model window size by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3713\r\n* Fix issue while logging cuda device utilization to tensorboard by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3727\r\n\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fcompare\u002Fv0.8.5...v0.8.6","2023-10-13T15:47:19",{"id":231,"version":232,"summary_zh":233,"released_at":234},109059,"v0.8.5","## What's Changed\r\n* Add function to free GPU memory by @Infernaught in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3643\r\n* ❗ Enable LLM fine-tuning tests when no quantization is specified by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3626\r\n* Add check to ensure selected backend works with quantization for LLMs  by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3646\r\n* [CI] Use a torch-nightly-compatible version of torchaudio by @justinxzhao in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3644\r\n* Set do_sample default to True by @Infernaught in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3641\r\n* FIX: Failure in audio feature related test by @jimthompson5802 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3651\r\n* Remove unnecessary peft config updating by @Infernaught in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3642\r\n* FIX: docker build error for ludwig-gpu by @jimthompson5802 in 
https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3658\r\n* Exclude getdaft on Windows by @carlogrisetti in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3629\r\n* Add daft back for Windows since the wheels are now officially published by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3663\r\n* fix: The final batch of an epoch is skipped when batch size is 1 by @jeffkinnison in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3653\r\n* Place metric functions for BLEU and ROUGE on correct devices when using multiple GPUs by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3671\r\n* Remove duplicate metrics by @Infernaught in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3670\r\n* Increment epochs based on last_batch() instead of at the end of the train loop. by @justinxzhao in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3668\r\n* [FEATURE] Support Merging LoRA Weights Into Base Model (Issue-3603) by @alexsherstinsky in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3649\r\n* [FEATURE] Include Mistral-7B model in list of supported base models by @alexsherstinsky in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3674\r\n* [MAINTENANCE] Partially reconcile type hints, fix some warnings, and fix comments in parts of the codebase. by @alexsherstinsky in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3673\r\n* Improve error message for when an LLM base model can't be loaded. 
by @justinxzhao in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3675\r\n* Fix eos_token and pad_token issue by @Infernaught in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3667\r\n* FIX: error with nightly CI tests for test_resize_image by @jimthompson5802 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3678\r\n* [BUGFIX] Remove spurious test directory at the end of the test_llm.py::test_local_path_loading test run by @alexsherstinsky in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3680\r\n* Add per-device logging to tensorboard by @Infernaught in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3677\r\n* Fix dynamic generation config load during `model.predict` by @geoffreyangus in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3666\r\n* [CI] Ensure that mlflow callback cleans up background-saving threads on trainer teardown. by @justinxzhao in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3683\r\n* fix: temporarily remove config validation check for backend by @geoffreyangus in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3688\r\n* fix: Failing test for backend with quantization by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3689\r\n* [BUGFIX] Ensure that full base models and not only adapter weights get saved when merge_and_unload is set by @alexsherstinsky in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3679\r\n* Add Ludwig Star History to README by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3696\r\n* Use sphinx for all docstrings in api.py by @justinxzhao in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3693\r\n* Fix binary variables being visualized as 0 and 1 by @Infernaught in 
https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3691\r\n* [MAINTENANCE] Fix the linting warnings in two backend component classes. by @alexsherstinsky in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3698\r\n* [BUGFIX] Pin deepspeed\u003C0.11, skip Horovod tests by @alexsherstinsky in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3700\r\n* Unpin deepspeed following fix in v0.11.1 by @tgaddair in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3706\r\n* Move on_epoch_end and epoch increment to after run_evaluation loop. by @justinxzhao in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3690\r\n* Remove model_load_path from experiment by @Infernaught in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3707\r\n* [FEATURE] Allow typehints without the quotes. by @alexsherstinsky in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3699\r\n\r\n## New Contributors\r\n* @alexsherstinsky made their first contribution in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3649\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fcompare\u002Fv0.8.4...v0.8.5","2023-10-09T21:39:32",{"id":236,"version":237,"summary_zh":238,"released_at":239},109060,"v0.8.4","## What's Changed\r\n* Add codellama to tokenizer list for set_pad_token by @Infernaught in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3598\r\n* Set default eval batch size to 2 for LLM fine-tuning by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3599\r\n* [CI] Explicitly set eval batch size in determinism tests, introduce a new integration test group, and exclude slow tests. by @justinxzhao in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3590\r\n* [CI] Run sudo apt-get update in GHAs. 
by @justinxzhao in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3608\r\n* Store steps_per_epoch in Trainer by @hungcs in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3601\r\n* Updated characters, underscore and comma preprocessors to be TorchScriptable. by @martindavis in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3602\r\n* [CI] Deflake: Explicitly set eval batch size for mlflow test. by @justinxzhao in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3612\r\n* Fix registration for char error rate. by @justinxzhao in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3604\r\n* fix: Load 8-bit quantized models for eval after fine-tuning by @jeffkinnison in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3606\r\n* Add Code Alpaca and Consumer Complaints Datasets by @connor-mccorm in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3611\r\n* Add support for gradient checkpointing for LLM fine-tuning by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3613\r\n* Bump min support transformers to 4.33.0 by @tgaddair in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3616\r\n* [CI] Fix failing tests on master by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3617\r\n* Eliminate short-circuiting for loading from local by @Infernaught in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3600\r\n* Refactor integration tests into matrix by @tgaddair in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3618\r\n* fix: Check underlying model device type when moving 8-bit quantized models to GPU at eval by @jeffkinnison in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3622\r\n* Fixed range validation for text generation penalty parameters by @tgaddair in 
https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3623\r\n* Update comment for predict to update Ludwig docs by @Infernaught in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3535\r\n* Avoid deprecation warnings on pandas Series.fillna by @carlogrisetti in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3631\r\n* QoL: Default to using fast tokenizer for Llama models by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3625\r\n* fixed typo in EfficientNet's model variant from v2_ to v2_s by @saad-palapa in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3628\r\n* Add pytorch profiler and additional tensorboard logs for GPU memory usage. by @justinxzhao in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3607\r\n* Pin minimum transformers version to `4.33.2` by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3637\r\n\r\n## New Contributors\r\n* @saad-palapa made their first contribution in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3628\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fcompare\u002Fv0.8.3...v0.8.4","2023-09-19T16:20:12",{"id":241,"version":242,"summary_zh":243,"released_at":244},109061,"v0.8.3","## What's Changed\r\n* Add test to show global_max_sequence_length can never exceed an LLM's context length by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3548\r\n* WandB: Add metric logging support on eval end and epoch end by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3586\r\n* schema: Add `prompt` validation check by @ksbrar in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3564\r\n* Unpin Transformers for CodeLlama support by @arnavgarg1 in 
https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3592\r\n* Add support for Paged Optimizers (Adam, AdamW), 8-bit optimizers, and new optimizers: LARS, LAMB and LION by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3588\r\n* fix: Failure in TabTransformer Combiner Unit test by @jimthompson5802 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3596\r\n* fix: Move target tensor to model output device in `check_module_parameters_updated` by @jeffkinnison in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3567\r\n* Allow user to specify huggingface link or local path to pretrained lora weights by @Infernaught in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3572\r\n\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fcompare\u002Fv0.8.2...v0.8.3","2023-09-12T01:05:37",{"id":246,"version":247,"summary_zh":248,"released_at":249},109062,"v0.8.2","## What's Changed\r\n* int: Rename original `combiner_registry` to `combiner_config_registry`, update decorator name by @ksbrar in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3516\r\n* Add mechanic to override default values for generation during model.predict() by @justinxzhao in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3520\r\n* [feat] Support for numeric date feature inputs by @jeffkinnison in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3517\r\n* Add new synthesized `response` column for text output features during postprocessing by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3521\r\n* Disable flaky twitter bots dataset loading test. by @justinxzhao in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3439\r\n* Add test that verifies that the generation config passed in at model.predict() is used correctly. 
by @justinxzhao in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3523\r\n* Move loss metric to same device as inputs by @Infernaught in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3522\r\n* Add comment about batch size tuning by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3526\r\n* Ensure user sets backend to local w\u002F quantization by @Infernaught in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3524\r\n* README: Update LLM fine-tuning config by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3530\r\n* Revert \"Ensure user sets backend to local w\u002F quantization (#3524)\" by @tgaddair in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3531\r\n* Revert \"Ensure user sets backend to local w\u002F quantization\" for release-0.8 branch and upgrade version to 0.8.1.post1 by @justinxzhao in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3532\r\n* Improve observability during LLM inference  by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3536\r\n* [bug] Pin pydantic to \u003C 2.0 by @jeffkinnison in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3537\r\n* [bug] Support preprocessing `datetime.date` date features by @jeffkinnison in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3534\r\n* Remove obsolete prompt tuning example. 
by @justinxzhao in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3540\r\n* Add Ludwig 0.8 notebook to the README by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3542\r\n* Add `effective_batch_size` to auto-adjust gradient accumulation by @tgaddair in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3533\r\n* Refactor evaluation metrics to support decoded generated text metrics like BLEU and ROUGE. by @justinxzhao in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3539\r\n* Fix sequence generator test. by @justinxzhao in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3546\r\n* Revert \"Add Cosine Annealing LR scheduler as a decay method (#3507)\" by @justinxzhao in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3545\r\n* Set default max_sequence_length to None for LLM text input\u002Foutput features by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3547\r\n* Add skip_all_evaluation as a mechanic to skip all evaluation. by @justinxzhao in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3543\r\n* Roll-forward with fixes: Fix interaction between scheduler.step() and gradient accumulation steps, refactor schedulers to use `LambdaLR`, and add cosine annealing LR scheduler as a decay method. by @justinxzhao in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3555\r\n* fix: Move model to the correct device for eval by @jeffkinnison in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3554\r\n* Report loss in tqdm to avoid log spam by @tgaddair in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3559\r\n* Wrap each metric update in try\u002Fexcept. 
by @justinxzhao in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3562\r\n* Move DDP model to device if it hasn't been wrapped yet by @tgaddair in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3566\r\n* ensure that there are enough colors to match the score index in visua… by @thelinuxkid in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3560\r\n* Pin Transformers to 4.31.0 by @arnavgarg1 in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3569\r\n\r\n## New Contributors\r\n* @thelinuxkid made their first contribution in https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fpull\u002F3560\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fludwig-ai\u002Fludwig\u002Fcompare\u002Fv0.8.1...v0.8.2","2023-09-01T14:06:48"]