[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-minimaxir--textgenrnn":3,"tool-minimaxir--textgenrnn":61},[4,18,26,36,44,53],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":17},4358,"openclaw","openclaw\u002Fopenclaw","OpenClaw 是一款专为个人打造的本地化 AI 助手，旨在让你在自己的设备上拥有完全可控的智能伙伴。它打破了传统 AI 助手局限于特定网页或应用的束缚，能够直接接入你日常使用的各类通讯渠道，包括微信、WhatsApp、Telegram、Discord、iMessage 等数十种平台。无论你在哪个聊天软件中发送消息，OpenClaw 都能即时响应，甚至支持在 macOS、iOS 和 Android 设备上进行语音交互，并提供实时的画布渲染功能供你操控。\n\n这款工具主要解决了用户对数据隐私、响应速度以及“始终在线”体验的需求。通过将 AI 部署在本地，用户无需依赖云端服务即可享受快速、私密的智能辅助，真正实现了“你的数据，你做主”。其独特的技术亮点在于强大的网关架构，将控制平面与核心助手分离，确保跨平台通信的流畅性与扩展性。\n\nOpenClaw 非常适合希望构建个性化工作流的技术爱好者、开发者，以及注重隐私保护且不愿被单一生态绑定的普通用户。只要具备基础的终端操作能力（支持 macOS、Linux 及 Windows WSL2），即可通过简单的命令行引导完成部署。如果你渴望拥有一个懂你",349277,3,"2026-04-06T06:32:30",[13,14,15,16],"Agent","开发框架","图像","数据工具","ready",{"id":19,"name":20,"github_repo":21,"description_zh":22,"stars":23,"difficulty_score":10,"last_commit_at":24,"category_tags":25,"status":17},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,"2026-04-05T11:01:52",[14,15,13],{"id":27,"name":28,"github_repo":29,"description_zh":30,"stars":31,"difficulty_score":32,"last_commit_at":33,"category_tags":34,"status":17},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 
代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",160015,2,"2026-04-18T11:30:52",[14,13,35],"语言模型",{"id":37,"name":38,"github_repo":39,"description_zh":40,"stars":41,"difficulty_score":32,"last_commit_at":42,"category_tags":43,"status":17},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",109154,"2026-04-18T11:18:24",[14,15,13],{"id":45,"name":46,"github_repo":47,"description_zh":48,"stars":49,"difficulty_score":32,"last_commit_at":50,"category_tags":51,"status":17},6121,"gemini-cli","google-gemini\u002Fgemini-cli","gemini-cli 是一款由谷歌推出的开源 AI 命令行工具，它将强大的 Gemini 大模型能力直接集成到用户的终端环境中。对于习惯在命令行工作的开发者而言，它提供了一条从输入提示词到获取模型响应的最短路径，无需切换窗口即可享受智能辅助。\n\n这款工具主要解决了开发过程中频繁上下文切换的痛点，让用户能在熟悉的终端界面内直接完成代码理解、生成、调试以及自动化运维任务。无论是查询大型代码库、根据草图生成应用，还是执行复杂的 Git 操作，gemini-cli 都能通过自然语言指令高效处理。\n\n它特别适合广大软件工程师、DevOps 人员及技术研究人员使用。其核心亮点包括支持高达 100 万 token 的超长上下文窗口，具备出色的逻辑推理能力；内置 Google 搜索、文件操作及 Shell 命令执行等实用工具；更独特的是，它支持 MCP（模型上下文协议），允许用户灵活扩展自定义集成，连接如图像生成等外部能力。此外，个人谷歌账号即可享受免费的额度支持，且项目基于 Apache 2.0 
协议完全开源，是提升终端工作效率的理想助手。",100752,"2026-04-10T01:20:03",[52,13,15,14],"插件",{"id":54,"name":55,"github_repo":56,"description_zh":57,"stars":58,"difficulty_score":32,"last_commit_at":59,"category_tags":60,"status":17},4721,"markitdown","microsoft\u002Fmarkitdown","MarkItDown 是一款由微软 AutoGen 团队打造的轻量级 Python 工具，专为将各类文件高效转换为 Markdown 格式而设计。它支持 PDF、Word、Excel、PPT、图片（含 OCR）、音频（含语音转录）、HTML 乃至 YouTube 链接等多种格式的解析，能够精准提取文档中的标题、列表、表格和链接等关键结构信息。\n\n在人工智能应用日益普及的今天，大语言模型（LLM）虽擅长处理文本，却难以直接读取复杂的二进制办公文档。MarkItDown 恰好解决了这一痛点，它将非结构化或半结构化的文件转化为模型“原生理解”且 Token 效率极高的 Markdown 格式，成为连接本地文件与 AI 分析 pipeline 的理想桥梁。此外，它还提供了 MCP（模型上下文协议）服务器，可无缝集成到 Claude Desktop 等 LLM 应用中。\n\n这款工具特别适合开发者、数据科学家及 AI 研究人员使用，尤其是那些需要构建文档检索增强生成（RAG）系统、进行批量文本分析或希望让 AI 助手直接“阅读”本地文件的用户。虽然生成的内容也具备一定可读性，但其核心优势在于为机器",93400,"2026-04-06T19:52:38",[52,14],{"id":62,"github_repo":63,"name":64,"description_en":65,"description_zh":66,"ai_summary_zh":66,"readme_en":67,"readme_zh":68,"quickstart_zh":69,"use_case_zh":70,"hero_image_url":71,"owner_login":72,"owner_name":73,"owner_avatar_url":74,"owner_bio":75,"owner_company":76,"owner_location":77,"owner_email":78,"owner_twitter":79,"owner_website":80,"owner_url":81,"languages":82,"stars":87,"forks":88,"last_commit_at":89,"license":90,"difficulty_score":32,"env_os":91,"env_gpu":92,"env_ram":93,"env_deps":94,"category_tags":100,"github_topics":101,"view_count":32,"oss_zip_url":79,"oss_zip_packed_at":79,"status":17,"created_at":106,"updated_at":107,"faqs":108,"releases":139},9024,"minimaxir\u002Ftextgenrnn","textgenrnn","Easily train your own text-generating neural network of any size and complexity on any text dataset with a few lines of code.","textgenrnn 是一个基于 Keras 和 TensorFlow 构建的 Python 库，旨在让用户仅用几行代码即可轻松训练专属的文本生成神经网络。它解决了传统深度学习模型门槛高、配置复杂的问题，支持用户在字符级或词级上，利用任意文本数据集快速构建从简单到复杂的生成模型。\n\n无论是希望探索 AI 创作潜力的普通用户，还是需要高效原型的开发者与研究人员，都能从中受益。其独特亮点在于采用了包含注意力机制和跳过嵌入的现代架构，显著提升了训练速度与生成质量；同时支持利用 GPU 加速训练（兼容 CuDNN），并在 CPU 上进行推理。此外，它还提供了“交互式模式”，允许用户在生成过程中逐步选择后续内容，为输出增添人为控制的灵活性。预训练模型权重小巧（约 
2MB），便于保存、加载及迁移学习，即使只经过一轮数据训练也能生成连贯文本，是入门文本生成领域的理想工具。","# textgenrnn\n\n![dank text](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fminimaxir_textgenrnn_readme_d5bcaedd93e6.gif)\n\nEasily train your own text-generating neural network of any size and complexity on any text dataset with a few lines of code, or quickly train on a text using a pretrained model.\n\ntextgenrnn is a Python 3 module on top of [Keras](https:\u002F\u002Fgithub.com\u002Ffchollet\u002Fkeras)\u002F[TensorFlow](https:\u002F\u002Fwww.tensorflow.org) for creating [char-rnn](http:\u002F\u002Fkarpathy.github.io\u002F2015\u002F05\u002F21\u002Frnn-effectiveness\u002F)s, with many cool features:\n\n* A modern neural network architecture which utilizes new techniques as attention-weighting and skip-embedding to accelerate training and improve model quality.\n* Train on and generate text at either the character-level or word-level.\n* Configure RNN size, the number of RNN layers, and whether to use bidirectional RNNs.\n* Train on any generic input text file, including large files.\n* Train models on a GPU and then use them to generate text with a CPU.\n* Utilize a powerful CuDNN implementation of RNNs when trained on the GPU, which massively speeds up training time as opposed to typical LSTM implementations.\n* Train the model using contextual labels, allowing it to learn faster and produce better results in some cases.\n\nYou can play with textgenrnn and train any text file with a GPU *for free* in this [Colaboratory Notebook](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1mMKGnVxirJnqDViH7BDJxFqWrsXlPSoK\u002Fview?usp=sharing)! 
Read [this blog post](http:\u002F\u002Fminimaxir.com\u002F2018\u002F05\u002Ftext-neural-networks\u002F) or [watch this video](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=RW7mP6BfZuY) for more information!\n\n## Examples\n\n```python\nfrom textgenrnn import textgenrnn\n\ntextgen = textgenrnn()\ntextgen.generate()\n```\n\n```text\n[Spoiler] Anyone else find this post and their person that was a little more than I really like the Star Wars in the fire or health and posting a personal house of the 2016 Letter for the game in a report of my backyard.\n```\n\nThe included model can easily be trained on new texts, and can generate appropriate text *even after a single pass of the input data*.\n\n```python\ntextgen.train_from_file('hacker_news_2000.txt', num_epochs=1)\ntextgen.generate()\n```\n\n```text\nProject State Project Firefox\n```\n\nThe model weights are relatively small (2 MB on disk), and they can easily be saved and loaded into a new textgenrnn instance. As a result, you can play with models which have been trained on hundreds of passes through the data. (in fact, textgenrnn learns *so well* that you have to increase the temperature significantly for creative output!)\n\n```python\ntextgen_2 = textgenrnn('\u002Fweights\u002Fhacker_news.hdf5')\ntextgen_2.generate(3, temperature=1.0)\n```\n\n```text\nWhy we got money “regular alter”\n\nUrburg to Firefox acquires Nelf Multi Shamn\n\nKubernetes by Google’s Bern\n```\n\nYou can also train a new model, with support for word level embeddings and bidirectional RNN layers by adding `new_model=True` to any train function.\n\n## Interactive Mode\n\nIt's also possible to get involved in how the output unfolds, step by step. Interactive mode will suggest the *top N* options for the next char\u002Fword, and allows you to pick one.  \n  \nWhen running textgenrnn in the terminal, pass `interactive=True` and `top_n=N` to `generate`. 
N defaults to 3.\n\n```python\nfrom textgenrnn import textgenrnn\n\ntextgen = textgenrnn()\ntextgen.generate(interactive=True, top_n=5)\n```\n\n![word_level_demo](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fminimaxir_textgenrnn_readme_f1196874bfdd.gif)\n  \nThis can add a *human touch* to the output; it feels like you're the writer! ([reference](https:\u002F\u002Ffivethirtyeight.com\u002Ffeatures\u002Fsome-like-it-bot\u002F))\n  \n## Usage\n\ntextgenrnn can be installed [from pypi](https:\u002F\u002Fpypi.python.org\u002Fpypi\u002Ftextgenrnn) via `pip`:\n\n```sh\npip3 install textgenrnn\n```\n\nFor the latest textgenrnn, *you must have a minimum TensorFlow version of 2.1.0*.\n\nYou can view a demo of common features and model configuration options in [this Jupyter Notebook](\u002Fdocs\u002Ftextgenrnn-demo.ipynb).\n\n`\u002Fdatasets` contains example datasets using Hacker News\u002FReddit data for training textgenrnn.\n\n`\u002Fweights` contains further-pretrained models on the aforementioned datasets which can be loaded into textgenrnn.\n\n`\u002Foutputs` contains examples of text generated from the above pretrained models.\n\n## Neural Network Architecture and Implementation\n\ntextgenrnn is based off of the [char-rnn](https:\u002F\u002Fgithub.com\u002Fkarpathy\u002Fchar-rnn) project by [Andrej Karpathy](https:\u002F\u002Ftwitter.com\u002Fkarpathy) with a few modern optimizations, such as the ability to work with very small text sequences.\n\n![default model](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fminimaxir_textgenrnn_readme_334891923fdf.png)\n\nThe included pretrained-model follows a [neural network architecture](https:\u002F\u002Fgithub.com\u002Fbfelbo\u002FDeepMoji\u002Fblob\u002Fmaster\u002Fdeepmoji\u002Fmodel_def.py) inspired by [DeepMoji](https:\u002F\u002Fgithub.com\u002Fbfelbo\u002FDeepMoji). 
For the default model, textgenrnn takes in an input of up to 40 characters, converts each character to a 100-D character embedding vector, and feeds those into a 128-cell long-short-term-memory (LSTM) recurrent layer. Those outputs are then fed into *another* 128-cell LSTM. All three layers are then fed into an Attention layer to weight the most important temporal features and average them together (and since the embeddings + 1st LSTM are skip-connected into the attention layer, the model updates can backpropagate to them more easily and prevent vanishing gradients). That output is mapped to probabilities for up to [394 different characters](\u002Ftextgenrnn\u002Ftextgenrnn_vocab.json) that they are the next character in the sequence, including uppercase characters, lowercase, punctuation, and emoji. (if training a new model on a new dataset, all of the numeric parameters above can be configured)\n\n![context model](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fminimaxir_textgenrnn_readme_47d526aa41e2.png)\n\nAlternatively, if context labels are provided with each text document, the model can be trained in a contextual mode, where the model learns the text *given the context* so the recurrent layers learn the *decontextualized* language. The text-only path can piggy-back off the decontextualized layers; in all, this results in much faster training and better quantitative and qualitative model performance than just training the model given the text alone.\n\nThe model weights included with the package are trained on hundreds of thousands of text documents from Reddit submissions ([via BigQuery](http:\u002F\u002Fminimaxir.com\u002F2015\u002F10\u002Freddit-bigquery\u002F)), from a very *diverse* variety of subreddits. The network was also trained using the decontextual approach noted above in order to both improve training performance and mitigate authorial bias.\n\nWhen fine-tuning the model on a new dataset of texts using textgenrnn, all layers are retrained. 
However, since the original pretrained network has a much more robust \"knowledge\" initially, the new textgenrnn trains faster and more accurately in the end, and can potentially learn new relationships not present in the original dataset (e.g. the [pretrained character embeddings](http:\u002F\u002Fminimaxir.com\u002F2017\u002F04\u002Fchar-embeddings\u002F) include the context for the character for all possible types of modern internet grammar).\n\nAdditionally, the retraining is done with a momentum-based optimizer and a linearly decaying learning rate, both of which prevent exploding gradients and make it much less likely that the model diverges after training for a long time.\n\n## Notes\n\n* **You will not get quality generated text 100% of the time**, even with a heavily-trained neural network. That's the primary reason viral [blog posts](http:\u002F\u002Faiweirdness.com\u002Fpost\u002F170685749687\u002Fcandy-heart-messages-written-by-a-neural-network)\u002F[Twitter tweets](https:\u002F\u002Ftwitter.com\u002Fbotnikstudios\u002Fstatus\u002F955870327652970496) utilizing NN text generation often generate lots of texts and curate\u002Fedit the best ones afterward.\n\n* **Results will vary greatly between datasets**. Because the pretrained neural network is relatively small, it cannot store as much data as RNNs typically flaunted in blog posts. For best results, use a dataset with at least 2,000-5,000 documents. If a dataset is smaller, you'll need to train it for longer by setting `num_epochs` higher when calling a training method and\u002For training a new model from scratch. Even then, there is currently no good heuristic for determining a \"good\" model.\n\n* A GPU is not required to retrain textgenrnn, but it will take much longer to train on a CPU. 
If you do use a GPU, I recommend increasing the `batch_size` parameter for better hardware utilization.\n\n## Future Plans for textgenrnn\n\n* More formal documentation\n\n* A web-based implementation using tensorflow.js (works especially well due to the network's small size)\n\n* A way to visualize the attention-layer outputs to see how the network \"learns.\"\n\n* A mode to allow the model architecture to be used for chatbot conversations (may be released as a separate project)\n\n* More depth toward context (positional context + allowing multiple context labels)\n\n* A larger pretrained network which can accommodate longer character sequences and a more indepth understanding of language, creating better generated sentences.\n\n* Hierarchical softmax activation for word-level models (once Keras has good support for it).\n\n* FP16 for superfast training on Volta\u002FTPUs (once Keras has good support for it).\n\n## Articles\u002FProjects using textgenrnn\n\n### Articles\n\n* Lifehacker: [How to Train Your Own Neural Network](https:\u002F\u002Flifehacker.com\u002Fwe-trained-an-ai-to-generate-lifehacker-headlines-1826616918) by Beth Skwarecki\n* New York Times: [Let Our Algorithm Choose Your Halloween Costume](https:\u002F\u002Fwww.nytimes.com\u002Finteractive\u002F2018\u002F10\u002F26\u002Fopinion\u002Fhalloween-spooky-costumes-machine-learning-generator.html) by Janelle Shane\n* CNN Business: [This quirky experiment highlights AI's biggest challenges](https:\u002F\u002Fwww.cnn.com\u002F2018\u002F11\u002F09\u002Ftech\u002Fjanelle-shane-ai\u002Findex.html) by Rachel Metz\n\n### Projects\n\n* [Tweet Generator](https:\u002F\u002Fgithub.com\u002Fminimaxir\u002Ftweet-generator) — Train a neural network optimized for generating tweets based off of any number of Twitter users\n* [Hacker News Simulator](https:\u002F\u002Ftwitter.com\u002Fhackernews_nn) — Twitter bot trained on 300,000+ Hacker News submissions using textgenrnn.\n* 
[SubredditRNN](https:\u002F\u002Fwww.reddit.com\u002Fr\u002Fsubredditnn) — Reddit Subreddit where all submitted content is from textgenrnn bots.\n* [Human-AI Collaborated Pizzas](https:\u002F\u002Fhowtogeneratealmostanything.com\u002Ffood\u002F2018\u002F08\u002F30\u002Fepisode2.html) — Pizza recipes generated with textgenrnn and made in real life.\n* [Board Game Titles](https:\u002F\u002Fboardgamegeek.com\u002Fthread\u002F2105706\u002Fi-trained-neural-network-17000-game-titles-bgg)\n* [Video Game Discussion Forum Titles](https:\u002F\u002Fwww.resetera.com\u002Fthreads\u002Fi-trained-an-ai-on-tens-of-thousands-of-resetera-post-titles-and-discovered-how-the-world-ends.82679\u002F)\n* [A.I Created Cakes](https:\u002F\u002Fwww.cupcaikes.com\u002Findex.html)\n* [AI Created Cookies](http:\u002F\u002Faiweirdness.com\u002Fpost\u002F180892528177\u002Faw-yeah-its-time-for-cookies-with-neural-networks)\n* [AI Generated Songs](http:\u002F\u002Faiweirdness.com\u002Fpost\u002F180654319147\u002Fhow-to-begin-a-song)\n\n### Tweets\n\n* [BuzzFeed YouTube Videos](https:\u002F\u002Ftwitter.com\u002Fminimaxir\u002Fstatus\u002F1064604986951163905)\n* [AWS Services](https:\u002F\u002Ftwitter.com\u002Fjamesoff\u002Fstatus\u002F1073647847130742787)\n* [Recipes + D&D Spells + Heavy Metal Names](https:\u002F\u002Ftwitter.com\u002FThomasClaburn\u002Fstatus\u002F1049069940571955201)\n* [RPG Adventure Names](https:\u002F\u002Ftwitter.com\u002F400goblins\u002Fstatus\u002F1036794962740953088)\n* [The Onion + Cosmopolitan](https:\u002F\u002Ftwitter.com\u002FBBCPARLlAMENT\u002Fstatus\u002F1014834653113585664)\n* [Google Conference Room Names](https:\u002F\u002Ftwitter.com\u002Ftensafefrogs\u002Fstatus\u002F1009912151060951045)\n* [Sith Lords](https:\u002F\u002Ftwitter.com\u002FJanelleCShane\u002Fstatus\u002F1002573232103305216)\n\n## Maintainer\u002FCreator\n\nMax Woolf ([@minimaxir](http:\u002F\u002Fminimaxir.com))\n\n*Max's open-source projects are supported by his 
[Patreon](https:\u002F\u002Fwww.patreon.com\u002Fminimaxir). If you found this project helpful, any monetary contributions to the Patreon are appreciated and will be put to good creative use.*\n\n## Credits\n\nAndrej Karpathy for the original proposal of the char-rnn via the blog post [The Unreasonable Effectiveness of Recurrent Neural Networks](http:\u002F\u002Fkarpathy.github.io\u002F2015\u002F05\u002F21\u002Frnn-effectiveness\u002F).\n\n[Daniel Grijalva](https:\u002F\u002Fgithub.com\u002FJuanets) for [contributing](https:\u002F\u002Fgithub.com\u002Fminimaxir\u002Ftextgenrnn\u002Fpull\u002F52) an interactive mode.\n\n## License\n\nMIT\n\nAttention-layer code used from [DeepMoji](https:\u002F\u002Fgithub.com\u002Fbfelbo\u002FDeepMoji) (MIT Licensed)\n","# textgenrnn\n\n![dank text](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fminimaxir_textgenrnn_readme_d5bcaedd93e6.gif)\n\n只需几行代码，即可在任何文本数据集上轻松训练任意大小和复杂度的文本生成神经网络，或者使用预训练模型快速对一段文本进行训练。\n\ntextgenrnn 是一个基于 [Keras](https:\u002F\u002Fgithub.com\u002Ffchollet\u002Fkeras)\u002F[TensorFlow](https:\u002F\u002Fwww.tensorflow.org) 的 Python 3 模块，用于创建 [char-rnn](http:\u002F\u002Fkarpathy.github.io\u002F2015\u002F05\u002F21\u002Frnn-effectiveness\u002F) 模型，并提供了许多酷炫的功能：\n\n* 一种现代化的神经网络架构，利用注意力加权和跳跃嵌入等新技术来加速训练并提升模型质量。\n* 可以在字符级别或单词级别上进行训练和生成文本。\n* 可以配置 RNN 的规模、RNN 层数以及是否使用双向 RNN。\n* 支持训练任何通用的输入文本文件，包括大型文件。\n* 可以在 GPU 上训练模型，然后在 CPU 上使用该模型生成文本。\n* 在 GPU 上训练时，可以利用强大的 CuDNN 实现的 RNN，相比传统的 LSTM 实现，能够大幅加快训练速度。\n* 使用上下文标签进行训练，使模型在某些情况下能够更快地学习并产生更好的结果。\n\n您可以在这个 [Colaboratory Notebook](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1mMKGnVxirJnqDViH7BDJxFqWrsXlPSoK\u002Fview?usp=sharing) 中免费试用 textgenrnn，并使用 GPU 训练任意文本文件！阅读 [这篇博客文章](http:\u002F\u002Fminimaxir.com\u002F2018\u002F05\u002Ftext-neural-networks\u002F) 或 [观看这个视频](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=RW7mP6BfZuY) 以获取更多信息！\n\n## 示例\n\n```python\nfrom textgenrnn import textgenrnn\n\ntextgen = 
textgenrnn()\ntextgen.generate()\n```\n\n```text\n【剧透】还有谁觉得这篇帖子和那个人有点超出我真正喜欢《星球大战》的程度？他们在火堆旁讨论健康问题，还发布了2016年游戏信件中关于我家后院的报道。\n```\n\n附带的模型可以轻松地在新文本上进行训练，并且即使只遍历一次输入数据，也能生成合适的文本。\n\n```python\ntextgen.train_from_file('hacker_news_2000.txt', num_epochs=1)\ntextgen.generate()\n```\n\n```text\n项目状态 项目 火狐浏览器\n```\n\n模型权重相对较小（磁盘上约 2 MB），可以轻松保存并在新的 textgenrnn 实例中加载。因此，您可以尝试使用经过数百次数据遍历训练过的模型。（事实上，textgenrnn 学习得 *太好了*，以至于需要显著提高温度才能获得更具创造性的输出！）\n\n```python\ntextgen_2 = textgenrnn('\u002Fweights\u002Fhacker_news.hdf5')\ntextgen_2.generate(3, temperature=1.0)\n```\n\n```text\n为什么我们得到了钱“定期改变”\n\n乌尔堡向火狐浏览器收购了奈尔夫·穆尔蒂·沙姆恩\n\n谷歌的伯恩推出的 Kubernetes\n```\n\n您还可以通过在任何训练函数中添加 `new_model=True` 来训练一个新模型，该模型支持单词级别的嵌入和双向 RNN 层。\n\n## 交互模式\n\n您也可以逐步参与到输出内容的生成过程中。交互模式会为您建议下一个字符或单词的 *前 N 个选项*，并允许您从中选择一个。  \n\n在终端中运行 textgenrnn 时，只需在 `generate` 方法中传入 `interactive=True` 和 `top_n=N` 即可。N 的默认值为 3。\n\n```python\nfrom textgenrnn import textgenrnn\n\ntextgen = textgenrnn()\ntextgen.generate(interactive=True, top_n=5)\n```\n\n![word_level_demo](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fminimaxir_textgenrnn_readme_f1196874bfdd.gif)\n  \n这可以为输出增添一份 *人性化的色彩*；感觉就像您自己在写作一样！([参考](https:\u002F\u002Ffivethirtyeight.com\u002Ffeatures\u002Fsome-like-it-bot\u002F))\n  \n## 使用方法\n\ntextgenrnn 可以通过 `pip` 从 [pypi](https:\u002F\u002Fpypi.python.org\u002Fpypi\u002Ftextgenrnn) 安装：\n\n```sh\npip3 install textgenrnn\n```\n\n要使用最新版本的 textgenrnn，*您必须安装至少 TensorFlow 2.1.0 版本*。\n\n您可以在 [这个 Jupyter Notebook](\u002Fdocs\u002Ftextgenrnn-demo.ipynb) 中查看常见功能和模型配置选项的演示。\n\n`\u002Fdatasets` 目录包含用于训练 textgenrnn 的 Hacker News\u002FReddit 数据示例。\n\n`\u002Fweights` 目录包含上述数据集上的进一步预训练模型，可以加载到 textgenrnn 中。\n\n`\u002Foutputs` 目录包含由上述预训练模型生成的文本示例。\n\n## 神经网络架构与实现\n\ntextgenrnn 基于 [Andrej Karpathy](https:\u002F\u002Ftwitter.com\u002Fkarpathy) 的 [char-rnn](https:\u002F\u002Fgithub.com\u002Fkarpathy\u002Fchar-rnn) 
项目，并加入了一些现代优化，例如能够处理非常短的文本序列。\n\n![默认模型](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fminimaxir_textgenrnn_readme_334891923fdf.png)\n\n内置的预训练模型遵循一种受 [DeepMoji](https:\u002F\u002Fgithub.com\u002Fbfelbo\u002FDeepMoji) 启发的 [神经网络架构](https:\u002F\u002Fgithub.com\u002Fbfelbo\u002FDeepMoji\u002Fblob\u002Fmaster\u002Fdeepmoji\u002Fmodel_def.py)。对于默认模型，textgenrnn 接收最多 40 个字符的输入，将每个字符转换为一个 100 维的字符嵌入向量，然后将其输入到一个包含 128 个单元的长短期记忆（LSTM）循环层中。该层的输出再被送入另一个 128 单元的 LSTM 层。随后，所有三层的输出都会被送入一个注意力层，以加权最重要的时间特征并对其进行平均（由于嵌入和第一层 LSTM 被跳过连接到注意力层，模型更新可以更容易地反向传播到这些层，从而防止梯度消失）。最终的输出会被映射为多达 [394 种不同字符](\u002Ftextgenrnn\u002Ftextgenrnn_vocab.json) 的概率分布，表示它们是序列中的下一个字符，包括大写字母、小写字母、标点符号和表情符号。（如果在新数据集上训练新模型，上述所有数值参数都可以进行配置）\n\n![上下文模型](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fminimaxir_textgenrnn_readme_47d526aa41e2.png)\n\n或者，如果每个文本文档都提供了上下文标签，模型可以以上下文模式进行训练，在这种模式下，模型会学习“给定上下文”的文本，从而使循环层学会“去情境化”的语言。仅基于文本的路径可以利用这些去情境化的层；总体而言，这使得训练速度更快，且模型的定量和定性表现都优于仅基于文本进行训练的情况。\n\n软件包中包含的模型权重是在来自 Reddit 提交的数十万份文本文档上训练得到的（[通过 BigQuery](http:\u002F\u002Fminimaxir.com\u002F2015\u002F10\u002Freddit-bigquery\u002F)），这些文档来自非常 *多样化* 的子版块。该网络还采用了上述的去情境化方法进行训练，以提高训练效率并减轻作者偏见的影响。\n\n当使用 textgenrnn 在新的文本数据集上对模型进行微调时，所有层都会被重新训练。然而，由于原始的预训练网络已经具备更为扎实的基础知识，因此新的 textgenrnn 模型在训练过程中会更快、更准确地收敛，并且有可能学习到原始数据集中不存在的新关系（例如，[预训练的字符嵌入](http:\u002F\u002Fminimaxir.com\u002F2017\u002F04\u002Fchar-embeddings\u002F)包含了现代互联网语法中各种可能的语境信息）。\n\n此外，重新训练采用基于动量的优化器和线性衰减的学习率，这两种方法都可以防止梯度爆炸，从而大大降低模型在长时间训练后发生发散的可能性。\n\n## 注意事项\n\n* **即使使用经过大量训练的神经网络，你也无法保证每次生成的文本质量都高**。这也是为什么那些利用神经网络文本生成技术的病毒式 [博客文章](http:\u002F\u002Faiweirdness.com\u002Fpost\u002F170685749687\u002Fcandy-heart-messages-written-by-a-neural-network)\u002F[Twitter 推文](https:\u002F\u002Ftwitter.com\u002Fbotnikstudios\u002Fstatus\u002F955870327652970496)通常会生成大量文本，然后再从中挑选出最好的内容的原因。\n\n* **不同数据集的结果差异很大**。由于预训练的神经网络规模相对较小，它无法像一些博客文章中展示的那样存储大量数据。为了获得最佳效果，建议使用至少包含 2,000–5,000 个文档的数据集。如果数据集较小，则需要通过在调用训练方法时增加 `num_epochs` 
参数或从头开始训练新模型来延长训练时间。即便如此，目前仍然没有一个可靠的启发式方法来判断一个模型是否“良好”。\n\n* 重新训练 textgenrnn 并不需要 GPU，但在 CPU 上训练会花费更长的时间。如果你使用 GPU，建议增加 `batch_size` 参数，以更好地利用硬件资源。\n\n## textgenrnn 的未来计划\n\n* 更正式的文档\n* 使用 tensorflow.js 的基于 Web 的实现（由于网络规模较小，效果尤为理想）\n* 一种可视化注意力层输出的方法，以便观察网络是如何“学习”的。\n* 一种允许将模型架构用于聊天机器人对话的模式（可能会作为独立项目发布）。\n* 进一步深化对上下文的理解（位置上下文以及支持多个上下文标签）。\n* 一个更大的预训练网络，能够处理更长的字符序列，并对语言有更深入的理解，从而生成更好的句子。\n* 针对词级模型的层次 softmax 激活函数（一旦 Keras 对其提供良好支持）。\n* 使用 FP16 进行超快速训练（适用于 Volta\u002FTPU，一旦 Keras 对其提供良好支持）。\n\n## 使用 textgenrnn 的文章\u002F项目\n\n### 文章\n\n* Lifehacker：[如何训练你自己的神经网络](https:\u002F\u002Flifehacker.com\u002Fwe-trained-an-ai-to-generate-lifehacker-headlines-1826616918)——Beth Skwarecki\n* 纽约时报：[让我们的算法为你选择万圣节服装](https:\u002F\u002Fwww.nytimes.com\u002Finteractive\u002F2018\u002F10\u002F26\u002Fopinion\u002Fhalloween-spooky-costumes-machine-learning-generator.html)——Janelle Shane\n* CNN Business：[这个古怪的实验凸显了人工智能面临的最大挑战](https:\u002F\u002Fwww.cnn.com\u002F2018\u002F11\u002F09\u002Ftech\u002Fjanelle-shane-ai\u002Findex.html)——Rachel Metz\n\n### 项目\n\n* [推文生成器](https:\u002F\u002Fgithub.com\u002Fminimaxir\u002Ftweet-generator) —— 训练一个针对 Twitter 用户优化的神经网络，用于生成推文\n* [Hacker News 模拟器](https:\u002F\u002Ftwitter.com\u002Fhackernews_nn) —— 一个基于 textgenrnn 训练的 Twitter 机器人，使用超过 30 万条 Hacker News 提交内容\n* [SubredditRNN](https:\u002F\u002Fwww.reddit.com\u002Fr\u002Fsubredditnn) —— 一个 Reddit 子版块，所有提交的内容均由 textgenrnn 生成的机器人发布\n* [人机协作披萨](https:\u002F\u002Fhowtogeneratealmostanything.com\u002Ffood\u002F2018\u002F08\u002F30\u002Fepisode2.html) —— 使用 textgenrnn 生成食谱并在现实中制作的披萨\n* [桌游标题](https:\u002F\u002Fboardgamegeek.com\u002Fthread\u002F2105706\u002Fi-trained-neural-network-17000-game-titles-bgg)\n* [视频游戏讨论论坛标题](https:\u002F\u002Fwww.resetera.com\u002Fthreads\u002Fi-trained-an-ai-on-tens-of-thousands-of-resetera-post-titles-and-discovered-how-the-world-ends.82679\u002F)\n* [人工智能创作的蛋糕](https:\u002F\u002Fwww.cupcaikes.com\u002Findex.html)\n* 
[人工智能创作的饼干](http:\u002F\u002Faiweirdness.com\u002Fpost\u002F180892528177\u002Faw-yeah-its-time-for-cookies-with-neural-networks)\n* [人工智能生成的歌曲](http:\u002F\u002Faiweirdness.com\u002Fpost\u002F180654319147\u002Fhow-to-begin-a-song)\n\n### 推文\n\n* [BuzzFeed YouTube 视频](https:\u002F\u002Ftwitter.com\u002Fminimaxir\u002Fstatus\u002F1064604986951163905)\n* [AWS 服务](https:\u002F\u002Ftwitter.com\u002Fjamesoff\u002Fstatus\u002F1073647847130742787)\n* [食谱 + D&D 法术 + 重金属乐队名](https:\u002F\u002Ftwitter.com\u002FThomasClaburn\u002Fstatus\u002F1049069940571955201)\n* [RPG 冒险名称](https:\u002F\u002Ftwitter.com\u002F400goblins\u002Fstatus\u002F1036794962740953088)\n* [洋葱报 + 美丽佳人杂志](https:\u002F\u002Ftwitter.com\u002FBBCPARLlAMENT\u002Fstatus\u002F1014834653113585664)\n* [谷歌会议室名称](https:\u002F\u002Ftwitter.com\u002Ftensafefrogs\u002Fstatus\u002F1009912151060951045)\n* [西斯尊主](https:\u002F\u002Ftwitter.com\u002FJanelleCShane\u002Fstatus\u002F1002573232103305216)\n\n## 维护者\u002F创作者\n\nMax Woolf ([@minimaxir](http:\u002F\u002Fminimaxir.com))\n\n*Max 的开源项目由他的 [Patreon](https:\u002F\u002Fwww.patreon.com\u002Fminimaxir) 支持。如果您觉得这个项目有帮助，欢迎向 Patreon 捐款，您的支持将用于富有创意的用途。*\n\n## 致谢\n\nAndrej Karpathy 通过博客文章 [循环神经网络的不合理有效性](http:\u002F\u002Fkarpathy.github.io\u002F2015\u002F05\u002F21\u002Frnn-effectiveness\u002F) 提出了 char-rnn 的原始构想。\n\n[Daniel Grijalva](https:\u002F\u002Fgithub.com\u002FJuanets) 贡献了交互模式 ([pull 请求 #52](https:\u002F\u002Fgithub.com\u002Fminimaxir\u002Ftextgenrnn\u002Fpull\u002F52))。\n\n## 许可证\n\nMIT 许可证\n\n注意层代码取自 [DeepMoji](https:\u002F\u002Fgithub.com\u002Fbfelbo\u002FDeepMoji)（MIT 许可）。","# textgenrnn 快速上手指南\n\ntextgenrnn 是一个基于 Keras\u002FTensorFlow 的 Python 模块，旨在让用户通过几行代码轻松训练任意大小和复杂度的文本生成神经网络（Char-RNN）。它支持字符级或词级训练，并针对 GPU 训练进行了深度优化。\n\n## 环境准备\n\n*   **操作系统**：Linux, macOS 或 Windows\n*   **Python 版本**：Python 3.6+\n*   **核心依赖**：\n    *   TensorFlow >= 2.1.0 (必需)\n    *   Keras (通常随 TensorFlow 安装)\n*   **硬件建议**：\n    *   **训练**：推荐使用 NVIDIA GPU 以利用 CuDNN 加速，大幅缩短训练时间。\n    *   
**推理**：CPU 即可流畅运行生成任务。\n\n## 安装步骤\n\n使用 pip 进行安装。国内开发者建议使用清华源或阿里源以提升下载速度。\n\n```bash\n# 使用默认源安装\npip3 install textgenrnn\n\n# 推荐：使用清华大学镜像源加速安装\npip3 install textgenrnn -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple\n```\n\n> **注意**：安装前请确保您的环境中已安装满足版本要求的 TensorFlow (`pip3 install \"tensorflow>=2.1.0\"`)。\n\n## 基本使用\n\n### 1. 快速生成文本（使用预训练模型）\n\n加载默认预训练模型并直接生成文本，无需任何训练数据。\n\n```python\nfrom textgenrnn import textgenrnn\n\ntextgen = textgenrnn()\ntextgen.generate()\n```\n\n### 2. 微调模型（基于自定义数据集）\n\n只需一个文本文件即可开始训练。以下示例展示如何读取文件并进行单轮 epoch 训练，随后生成内容。\n\n```python\nfrom textgenrnn import textgenrnn\n\ntextgen = textgenrnn()\n\n# 训练模型，num_epochs 控制训练轮数\ntextgen.train_from_file('hacker_news_2000.txt', num_epochs=1)\n\n# 生成文本\ntextgen.generate()\n```\n\n### 3. 加载已保存的模型权重\n\n您可以保存训练好的模型权重（`.hdf5` 文件），并在后续会话中直接加载使用。\n\n```python\nfrom textgenrnn import textgenrnn\n\n# 加载指定权重的模型\ntextgen_2 = textgenrnn('\u002Fweights\u002Fhacker_news.hdf5')\n\n# temperature 参数控制生成的随机性（值越大越具创造性）\ntextgen_2.generate(3, temperature=1.0)\n```\n\n### 4. 
交互式生成模式\n\n在终端运行时，可以开启交互模式，让 AI 提供多个候选词\u002F字符供用户选择，实现人机协作写作。\n\n```python\nfrom textgenrnn import textgenrnn\n\ntextgen = textgenrnn()\n# top_n 设置每次提供的候选选项数量，默认为 3\ntextgen.generate(interactive=True, top_n=5)\n```","某独立游戏开发者希望为复古 RPG 游戏快速生成大量风格统一的 NPC 对话和物品描述，以丰富游戏世界观。\n\n### 没有 textgenrnn 时\n- 开发者需手动撰写数千条文本，耗时数周且容易陷入创意枯竭，导致内容重复单调。\n- 若尝试自建深度学习模型，需从零搭建复杂的 RNN 架构并调试超参数，技术门槛极高且训练缓慢。\n- 缺乏针对字符级或词级的灵活配置，难以模仿特定游戏文本的独特语气（如古英语或科幻术语）。\n- 模型训练依赖昂贵的高性能 GPU 集群，本地开发机无法承担计算负载，迭代周期漫长。\n- 生成的文本往往逻辑混乱，无法通过简单的代码调整“温度”参数来控制内容的创造性与稳定性。\n\n### 使用 textgenrnn 后\n- 仅需几行代码加载游戏剧本数据集，textgenrnn 即可在单块 GPU 上利用 CuDNN 加速训练，数小时内产出海量草稿。\n- 内置的现代神经网络架构支持注意力机制和跳过嵌入，自动捕捉文本风格，无需开发者深入钻研算法细节。\n- 可自由切换字符级或词级训练，并配置双向 RNN 层，精准复刻出符合游戏设定的独特文风。\n- 训练好的模型权重仅约 2MB，可轻松部署在普通 CPU 上进行实时推理，极大降低了运行成本。\n- 通过调节温度参数或使用交互模式，开发者能动态控制输出结果的随机性，甚至人工介入选择后续词汇，实现人机协作创作。\n\ntextgenrnn 将原本需要专业团队数周完成的文本工程，转化为个人开发者几天内即可落地的高效流程，让创意不再受限于生产力。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fminimaxir_textgenrnn_0d19a0b3.png","minimaxir","Max Woolf","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Fminimaxir_3ab20437.jpg","Senior Data Scientist @buzzfeed. 
Plotter of pretty charts.","@buzzfeed ","San Francisco","max@minimaxir.com",null,"https:\u002F\u002Fminimaxir.com","https:\u002F\u002Fgithub.com\u002Fminimaxir",[83],{"name":84,"color":85,"percentage":86},"Python","#3572A5",100,4926,733,"2026-04-06T08:01:03","NOASSERTION","Linux, macOS, Windows","非必需。支持 NVIDIA GPU 以加速训练（利用 CuDNN），CPU 亦可运行但速度较慢。具体型号、显存大小及 CUDA 版本未在文中明确说明。","未说明",{"notes":95,"python":96,"dependencies":97},"该工具基于 Keras\u002FTensorFlow。虽然可以使用 CPU 进行训练和生成，但在 GPU 上训练速度会大幅提升。预训练模型权重较小（约 2MB）。建议使用至少 2,000-5,000 个文档的数据集以获得最佳效果。可通过 Google Colab 免费使用 GPU 进行体验。","3.x (Python 3)",[98,99],"tensorflow>=2.1.0","keras",[14,35],[99,102,103,104,105],"deep-learning","text-generation","tensorflow","python","2026-03-27T02:49:30.150509","2026-04-18T22:33:50.812248",[109,114,119,124,129,134],{"id":110,"question_zh":111,"answer_zh":112,"source_url":113},40467,"在 Google Colab 中运行 notebook 时出现 'cannot import name multi_gpu_model' 错误怎么办？","这是因为 GitHub 上的最新版本已支持 TensorFlow 2，而旧版 notebook 仍强制使用 TF 1.x。解决方法是：\n1. 注释掉 notebook 中的 `%tensorflow_version 1.x` 这一行。\n2. 
将安装命令改为从 Git 安装最新版：`!pip3 install git+git:\u002F\u002Fgithub.com\u002Fminimaxir\u002Ftextgenrnn.git`。\n这样即可解决导入错误并正常使用。","https:\u002F\u002Fgithub.com\u002Fminimaxir\u002Ftextgenrnn\u002Fissues\u002F231",{"id":115,"question_zh":116,"answer_zh":117,"source_url":118},40468,"为什么模型训练或生成时 GPU 利用率为 0% 或完全使用 CPU？","通常是因为 TensorFlow 版本与 textgenrnn 版本不兼容导致的。请确保安装维护者推荐的特定版本组合。如果使用的是 pip 安装的旧版（如 1.4.1），尝试升级或根据社区反馈切换到能正确识别 GPU 的版本配置。确认已安装 `tensorflow-gpu` 且代码中未错误地限制设备可见性。","https:\u002F\u002Fgithub.com\u002Fminimaxir\u002Ftextgenrnn\u002Fissues\u002F92",{"id":120,"question_zh":121,"answer_zh":122,"source_url":123},40469,"使用 train_on_texts 训练列表数据时报错 'numpy.ndarray object has no attribute lower' 如何解决？","该问题通常由 Keras 更新或预处理包版本冲突引起。维护者已在 v1.3.2 版本中修复了此问题。请尝试升级 textgenrnn 到 v1.3.2 或更高版本：`pip install --upgrade textgenrnn`。如果问题依旧，检查是否安装了兼容的 `keras_preprocessing` 包。","https:\u002F\u002Fgithub.com\u002Fminimaxir\u002Ftextgenrnn\u002Fissues\u002F57",{"id":125,"question_zh":126,"answer_zh":127,"source_url":128},40470,"如何从头开始训练模型（不使用预训练权重）？","可以通过调用 `reset()` 方法来实现。初始化模型后，执行 `textgen.reset()` 即可清除预训练权重，使模型架构保持默认但参数随机初始化，从而可以从头开始训练。具体示例可参考项目文档中的 demo notebook。","https:\u002F\u002Fgithub.com\u002Fminimaxir\u002Ftextgenrnn\u002Fissues\u002F5",{"id":130,"question_zh":131,"answer_zh":132,"source_url":133},40471,"在 Colab 上运行时遇到 'Fail to find the dnn implementation' 或 CuDNN 相关错误怎么办？","该仓库代码较老，直接更新 CuDNN 通常无法解决问题。这通常是 TensorFlow 版本与底层 CUDA\u002FcuDNN 环境不匹配导致的。建议不要直接使用过时的官方 notebook，而是寻找社区成员提供的已修复版本的 Colab notebook（例如通过 Issue 评论中分享的链接），或者尝试在本地环境中严格匹配旧版 TensorFlow 和 CUDA 版本。","https:\u002F\u002Fgithub.com\u002Fminimaxir\u002Ftextgenrnn\u002Fissues\u002F254",{"id":135,"question_zh":136,"answer_zh":137,"source_url":138},40472,"模型生成的文本陷入重复循环（looping）该如何调整？","生成重复通常是因为模型过拟合或架构参数设置不当。建议调整以下配置：\n1. 降低 `rnn_size`：除非数据量极大，否则不要使用 1024，推荐尝试 `rnn_size=128` 配合 `num_layers=6`，或 `rnn_size=256` 配合 `num_layers=3`。\n2. 调整 `max_length`：将其设置为训练文档中位长度的一半。设置过大容易导致过拟合。\n3. 
Only if the simpler architectures above do not help should you try more complex structural changes.","https:\u002F\u002Fgithub.com\u002Fminimaxir\u002Ftextgenrnn\u002Fissues\u002F19",[140,145,150,154,159,164,169,174,179,184,189,194],{"id":141,"version":142,"summary_zh":143,"released_at":144},323890,"v2.0.0","TensorFlow 2.1 support has been added! (Thanks to @ZerxXxes for #165!)\n\nAs a result of the migration, TensorFlow 2.1 is now the minimum required version (both CPU and GPU are supported). If you need TensorFlow 1.X, use an older release.","2020-02-03T01:07:00",{"id":146,"version":147,"summary_zh":148,"released_at":149},323891,"v1.5.0","Two major features:\n\n# Synthesize (beta)\n\nGenerate text using two (or more) trained models simultaneously. See [this notebook](https:\u002F\u002Fgithub.com\u002Fminimaxir\u002Ftextgenrnn\u002Fblob\u002Fmaster\u002Fdocs\u002Ftextgenrnn-synthesize.ipynb) for a demo.\n\nBecause the results tend to be messier, a lower `temperature` is recommended. This works with both char-level and word-level models, which can also be mixed. (However, mixing line-delimited text models with full-text models is not recommended!)\n\nIf you run into any problems, please file an issue!\n\n# Generation progress bar\n\nThanks to `tqdm`, all `generate` functions now show a progress bar! You can disable it by passing `progress=False` to the function.\n\nAlso, the default temperature for generation is now `[1.0, 0.5, 0.2, 0.2]`!","2019-01-09T04:36:17",{"id":151,"version":152,"summary_zh":79,"released_at":153},323892,"v1.4.1","2018-10-26T01:11:09",{"id":155,"version":156,"summary_zh":157,"released_at":158},323893,"v1.4","# Features\n\n* Interactive mode, which lets you control which text is added. (#52, thanks @Juanets!)\n* Support for backends other than TensorFlow. (#44, thanks @torokati44!)\n* Allow periodic saving of weights. (#37, thanks @IrekRybark!)\n* Multi-GPU support (beta: see #62)\n\n# Fixes\n\n* Correctly handle `prefix` in word-level models.","2018-08-09T02:55:26",{"id":160,"version":161,"summary_zh":162,"released_at":163},323894,"v1.3.2","Emergency fix for #57, which occurs with newer versions of Keras.","2018-08-04T04:29:12",{"id":165,"version":166,"summary_zh":167,"released_at":168},323895,"v1.3.1","* Added the ability to cycle temperatures during generation (see [this notebook](https:\u002F\u002Fgithub.com\u002Fminimaxir\u002Ftextgenrnn\u002Fblob\u002Fmaster\u002Fdocs\u002Ftextgenrnn-temp-cycle.ipynb) for more information)\n* Added UTF-8 encoding to the vocabulary export.\n* Added `train_new_model` as an alias for `train_on_texts(new_model=True)`.\n* Fixed an issue where specifying the `dropout` parameter could raise an exception.","2018-06-06T05:11:36",{"id":170,"version":171,"summary_zh":172,"released_at":173},323896,"v1.3","* Added `encode_text_vectors` to encode text with the trained network.\n* Added 
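`similarity` to quickly compute cosine similarity and return the most similar texts.\n\nSee [this notebook](https:\u002F\u002Fgithub.com\u002Fminimaxir\u002Ftextgenrnn\u002Fblob\u002Fmaster\u002Fdocs\u002Ftextgenrnn-encode-text.ipynb) for details.","2018-05-07T02:26:25",{"id":175,"version":176,"summary_zh":177,"released_at":178},323897,"v1.2.2","* Made `is_csv` actually work downstream.\n* Minor tweaks to the description","2018-05-06T18:48:26",{"id":180,"version":181,"summary_zh":182,"released_at":183},323898,"v1.2.1","* Added a `validation` parameter to disable validation during training for speed.\n* Added an `is_csv` parameter: use it with `train_from_file` when the source file is a single-column CSV (for example, exported from BigQuery or Google Sheets) so that quote and newline escapes are handled correctly.\n* Some README tweaks","2018-05-05T19:44:42",{"id":185,"version":186,"summary_zh":187,"released_at":188},323899,"v1.2","* Renamed `prop_keep` to `train_size`, with the remaining data used as the validation set.\n* Added `dropout`, which randomly drops input tokens in each epoch.","2018-05-04T02:25:16",{"id":190,"version":191,"summary_zh":192,"released_at":193},323900,"v1.1","- Switched to generating training sequences with `fit_generator` instead of loading all sequences into memory at once. This makes it possible to train on large text files (10MB+) without needing large amounts of memory.\n- Better word-level support:\n  - The model keeps only `max_words` words and discards the rest.\n  - The model is not trained to predict words that are not in the vocabulary.\n  - All punctuation (including smart quotes) is treated as its own token.\n  - When generating text, newlines and tabs have the surrounding whitespace stripped. (Other punctuation is not handled this way because the rules are too complex)\n- Training on a single text no longer uses meta tokens to mark the start or end of the text, and generation no longer uses them either, which gives slightly better output.","2018-04-30T03:37:41",{"id":195,"version":196,"summary_zh":197,"released_at":198},323901,"v1.0","First release after the major refactor.","2018-04-24T02:58:42"]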