[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-patil-suraj--question_generation":3,"tool-patil-suraj--question_generation":64},[4,17,27,35,43,56],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":16},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui is a web UI built on Gradio that makes it easy to run the powerful Stable Diffusion image-generation model locally. It addresses the pain points of the original model, which was command-line only, hard to get started with, and scattered across tools, by consolidating the complex AI image-generation workflow into a single intuitive graphical platform.\n\nEveryday creators who want to get started quickly, designers who need fine-grained control over image details, and developers and researchers who want to explore the model's potential can all benefit. Its core strength is an exceptionally rich feature set: beyond basic modes such as text-to-image, image-to-image, inpainting, and outpainting, it pioneered advanced features such as attention adjustment, prompt matrices, negative prompts, and highres fix. It also ships with face-restoration tools such as GFPGAN and CodeFormer, supports multiple neural-network upscalers, and lets users extend its capabilities without limit through a plugin system. Even on devices with limited VRAM, stable-diffusion-webui offers suitable optimization options, putting high-quality AI art within easy reach.",162132,3,"2026-04-05T11:01:52",[13,14,15],"Development framework","Image","Agent","ready",{"id":18,"name":19,"github_repo":20,"description_zh":21,"stars":22,"difficulty_score":23,"last_commit_at":24,"category_tags":25,"status":16},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code is a high-performance optimization system built for AI coding assistants such as Claude Code, Codex, and Cursor. It is more than a set of configuration files: it is a complete framework honed through long-term real-world use, designed to address the core pain points AI agents face in day-to-day development, including inefficiency, memory loss, security risks, and the lack of continuous learning.\n\nBy introducing modular skills, intuition augmentation, persistent memory, and built-in security scanning, everything-claude-code significantly improves AI performance on complex tasks and helps developers build more stable, more intelligent production-grade AI agents. Its distinctive research-first development philosophy and token-consumption optimizations make model responses faster and cheaper while defending against potential attack vectors.\n\nThe toolkit is especially well suited to software developers, AI researchers, and teams that want to deeply customize their AI workflows. Whether you are building a large codebase or need AI assistance with security audits and automated testing, everything-claude-code provides strong foundational support. An open-source project that won an Anthropic hackathon award, it combines multi-language support with a rich set of battle-tested hooks, letting AI truly grow into an assistant that understands",138956,2,"2026-04-05T11:33:21",[13,15,26],"Language model",{"id":28,"name":29,"github_repo":30,"description_zh":31,"stars":32,"difficulty_score":23,"last_commit_at":33,"category_tags":34,"status":16},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI is a powerful, highly modular visual AI engine built for designing and executing complex Stable Diffusion image-generation workflows. It abandons traditional hand-written code in favor of an intuitive node-based flowchart interface, letting users build personalized generation pipelines by connecting functional modules.\n\nThis design neatly solves the complexity and inflexibility of configuring advanced AI image workflows. Users without a programming background can freely combine models, tune parameters, and preview results in real time, handling everything from basic text-to-image generation to multi-step high-resolution refinement. ComfyUI is broadly compatible: it supports Windows, macOS, and Linux, runs on NVIDIA, AMD, Intel, and Apple Silicon hardware, and was among the first to support cutting-edge models such as SDXL, Flux, and SD3.\n\nWhether you are a researcher or developer probing algorithmic potential, or a designer or experienced AI-art enthusiast seeking maximum creative freedom, ComfyUI delivers. Its modular architecture lets the community keep extending it, making it one of the most flexible open-source diffusion-model tools with the richest ecosystem, helping users turn ideas into results efficiently.",107662,"2026-04-03T11:11:01",[13,14,15],{"id":36,"name":37,"github_repo":38,"description_zh":39,"stars":40,"difficulty_score":23,"last_commit_at":41,"category_tags":42,"status":16},3704,"NextChat","ChatGPTNextWeb\u002FNextChat","NextChat is a light and fast AI assistant that delivers a smooth, cross-platform large-model chat experience. It solves the pain of losing conversation continuity when switching between devices, and of managing many AI models in one place. For daily work, study, or creative brainstorming, NextChat connects users to AI anywhere via web, iOS, Android, Windows, macOS, or Linux clients.\n\nIt suits everyday users, students, and professionals, as well as enterprise teams that need private deployments. For developers it also offers a convenient self-hosting path, with one-click deployment to platforms such as Vercel or Zeabur.\n\nNextChat's core strength is broad model compatibility: it natively supports mainstream models such as Claude, DeepSeek, GPT-4, and Gemini Pro, letting users switch between AI capabilities in a single interface. It was also an early adopter of MCP (Model Context Protocol), strengthening its context handling. For enterprise users, NextChat offers a professional edition with branding customization, fine-grained permission control, internal knowledge-base integration, and security auditing, meeting high standards for data privacy and tailored management.",87618,"2026-04-05T07:20:52",[13,26],{"id":44,"name":45,"github_repo":46,"description_zh":47,"stars":48,"difficulty_score":23,"last_commit_at":49,"category_tags":50,"status":16},2268,"ML-For-Beginners","microsoft\u002FML-For-Beginners","ML-For-Beginners is a structured introductory machine-learning curriculum from Microsoft that helps complete beginners master classic machine learning. The course is planned as a 12-week path with 26 concise lessons and 52 accompanying quizzes, covering the full journey from basic concepts to practical applications, and it solves the beginner's problem of facing a vast body of knowledge with no structured guidance.\n\nDevelopers looking to switch fields, researchers who need to fill in algorithmic background, and curious hobbyists can all benefit. The course pairs clear theory with hands-on practice, building solid skills step by step. A standout feature is its strong multilingual support: an automated pipeline provides versions in more than 50 languages, including Simplified Chinese, dramatically lowering the barrier for learners worldwide. The project is developed in the open with an active community and continuously updated content, so learners get current, accurate material. If you are looking for a clear, friendly, and professional entry point into machine learning, ML-For-Beginners is an ideal place to start.",84991,"2026-04-05T10:45:23",[14,51,52,53,15,54,26,13,55],"Data tools","Video","Plugins","Other","Audio",{"id":57,"name":58,"github_repo":59,"description_zh":60,"stars":61,"difficulty_score":10,"last_commit_at":62,"category_tags":63,"status":16},3128,"ragflow","infiniflow\u002Fragflow","RAGFlow is a leading open-source retrieval-augmented generation (RAG) engine that builds a more accurate, reliable context layer for large language models. It combines state-of-the-art RAG techniques with agent capabilities: it not only extracts knowledge efficiently from all kinds of documents, but also lets models reason and execute tasks on top of that knowledge.\n\nHallucination and stale knowledge are common pain points in LLM applications. By deeply parsing complex document structures (tables, charts, mixed layouts), RAGFlow markedly improves retrieval accuracy, reducing fabricated answers and keeping responses both grounded and current. Its built-in agent mechanism goes further, letting the system not just answer questions but autonomously plan the steps needed to solve complex problems.\n\nThe tool is a good fit for developers, enterprise engineering teams, and AI researchers. Whether you want to stand up a private knowledge-base Q&A system quickly or are exploring how large models can land in vertical domains, RAGFlow can help. It offers a visual workflow editor and flexible APIs, lowering the barrier for non-algorithm users while supporting deep customization for professionals. Released under the Apache 2.0 license, it is becoming an important bridge between general-purpose large models and domain-specific knowledge.",77062,"2026-04-04T04:44:48",[15,14,13,26,54],{"id":65,"github_repo":66,"name":67,"description_en":68,"description_zh":69,"ai_summary_zh":69,"readme_en":70,"quickstart_zh":72,"use_case_zh":73,"hero_image_url":74,"owner_login":75,"owner_name":76,"owner_avatar_url":77,"owner_bio":78,"owner_company":79,"owner_location":80,"owner_email":81,"owner_twitter":82,"owner_website":83,"owner_url":84,"languages":85,"stars":94,"forks":95,"last_commit_at":96,"license":97,"difficulty_score":10,"env_os":98,"env_gpu":98,"env_ram":98,"env_deps":99,"category_tags":105,"github_topics":106,"view_count":10,"oss_zip_url":83,"oss_zip_packed_at":83,"status":16,"created_at":115,"updated_at":116,"faqs":117,"releases":147},1142,"patil-suraj\u002Fquestion_generation","question_generation","Neural question generation using transformers","Question Generation is a Transformer-based question-generation tool focused on automatically generating questions from text passages. It performs end-to-end question generation with pretrained sequence-to-sequence models (such as T5) and supports several input-processing schemes, including an answer-prepend format and an answer-highlight format. It addresses the complex models and manual processing pipelines of earlier question-generation work by providing simplified data-processing and training scripts, letting researchers validate different generation strategies quickly. It suits developers and researchers building question-answering systems, knowledge distillation, or text-understanding applications, especially those who want to fine-tune pretrained models. Technically, it uses pretrained Transformer architectures, supports both answer-aware and answer-agnostic generation, and improves generation quality through different input encodings, offering a flexible, easy-to-use experimental platform for question-generation research.","# Question Generation using 🤗transformers\n\n- [Question Generation using 🤗transformers](#question-generation-using-transformers)\n  - [Project Details](#project-details)\n  - [Initial experiments](#initial-experiments)\n    - [answer aware question generation](#answer-aware-question-generation)\n    - [answer extraction models](#answer-extraction-models)\n    - [Multitask QA-QG](#multitask-qa-qg)\n    - [End-to-End question generation (answer agnostic)](#end-to-end-question-generation-answer-agnostic)\n  - [Results](#results)\n  - [Requirements](#requirements)\n  - [Usage](#usage)\n      - [Question Generation](#question-generation)\n      - [Multitask QA-QG](#multitask-qa-qg-1)\n      - [End-to-end question generation (without answer supervision)](#end-to-end-question-generation-without-answer-supervision)\n  - [Fine-tuning](#fine-tuning)\n    - [Data processing](#data-processing)\n    - [training](#training)\n    - [Evaluation](#evaluation)\n  - [Applications 🚀](#applications-)\n  - [Relevant papers](#relevant-papers)\n\n\n## Project Details\nQuestion generation is the task of automatically generating questions from a text paragraph. The most straightforward approach is answer-aware question generation, in which the model is presented with the answer and the passage and asked to generate a question for that answer by considering the passage context. While there are many papers available for the QG task, it is still not as mainstream as QA. One reason is that most earlier papers use complicated models\u002Fprocessing pipelines and provide no pre-trained models. A few recent papers, specifically UniLM and ProphetNet, have SOTA pre-trained weights available for QG, but their usage seems quite complicated.\n\nThis project is intended as an open-source study on question generation with pre-trained transformers (specifically seq-2-seq models) using straightforward end-to-end methods without overly complicated pipelines. The goal is to provide simplified data processing and training scripts and easy-to-use pipelines for inference.\n\n## Initial experiments\nInitial experiments are conducted using the SQuADv1 dataset and the T5 model, with the different input processing formats described below.\n\n### answer aware question generation\n\nFor answer aware models the input text can be processed in two ways.\n\n**1. prepend format:**\n\n Here the answer is simply added before the context, separated by a sep token. For example\n\n `42 [SEP] 42 is the answer to life, the universe and everything.`\n\n For the T5 model the input is processed like this\n\n `answer: 42  context: 42 is the answer to life, the universe and everything.`\n\n**2. highlight format**\n\nHere the answer span is highlighted within the text with special highlight tokens.\n\n`\u003Chl> 42 \u003Chl> is the answer to life, the universe and everything.`\n\nThis idea is proposed in the \"A Recurrent BERT-based Model for Question Generation\" [paper](https:\u002F\u002Fwww.aclweb.org\u002Fanthology\u002FD19-5821.pdf). See Section 4.3.
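\n\nAs a concrete illustration, here is a minimal sketch of how the two input formats can be built from an (answer, context) pair; the helper functions below are illustrative only and are not part of this repo's code:\n\n```python3\n# Illustrative sketch: build the two answer-aware input formats described above.\ndef make_prepend_input(answer: str, context: str) -> str:\n    # T5-style prepend format: task fields are spelled out as plain text.\n    return f\"answer: {answer}  context: {context}\"\n\ndef make_highlight_input(answer: str, context: str) -> str:\n    # Highlight format: wrap the first occurrence of the answer span in \u003Chl> tokens.\n    start = context.index(answer)  # raises ValueError if the answer is absent\n    end = start + len(answer)\n    return f\"{context[:start]}\u003Chl> {answer} \u003Chl>{context[end:]}\"\n\nprint(make_prepend_input(\"42\", \"42 is the answer to life, the universe and everything.\"))\nprint(make_highlight_input(\"42\", \"42 is the answer to life, the universe and everything.\"))\n```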
\n\n### answer extraction models\n\nAs the answer aware models need answers for generating questions, we need something that can extract answer-like spans from the text. This can be done with various methods, such as NER or noun-phrase extraction. Here, however, a model is trained to extract answer-like spans, to see how well that works. With T5, answer extraction is done using the text-to-text format.\n\nSince the highlight format needs to know the positions of the extracted answer spans, the input for answer extraction is processed as follows:\n\n  1. split the text into sentences.\n  2. for each sentence that has answers, highlight the sentence with `\u003Chl>` tokens.\n  3. for the target text, join the answers in that sentence with `\u003Csep>` tokens.\n\nFor example, for this text\n\n`Python is a programming language. Created by Guido van Rossum and first released in 1991.`\n\nthe following examples will be created:\n\nInput text:\n`\u003Chl> Python is a programming language. \u003Chl> Created by Guido van Rossum and first released in 1991.`\n\ntarget text:\n`Python \u003Csep>`\n\nand\n\nInput text:\n`Python is a programming language. \u003Chl> Created by Guido van Rossum and first released in 1991 \u003Chl>.`\n\ntarget text:\n`Guido van Rossum \u003Csep> 1991 \u003Csep>`\n\nAt inference time the text is split into sentences and each sentence is highlighted.
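\n\nAs a rough sketch of steps 1-3 above (not the repo's actual implementation), the preprocessing could look like this, with nltk's sentence tokenizer standing in for whatever splitter is used internally:\n\n```python3\nimport nltk  # requires: python -m nltk.downloader punkt\n\ndef make_answer_extraction_examples(text, answers):\n    # One (input, target) pair per sentence that contains answers: the sentence\n    # under consideration is wrapped in \u003Chl> tokens, and its answers are\n    # joined with \u003Csep> tokens in the target.\n    sentences = nltk.sent_tokenize(text)\n    examples = []\n    for i, sent in enumerate(sentences):\n        found = [a for a in answers if a in sent]\n        if not found:\n            continue\n        highlighted = sentences[:i] + [f\"\u003Chl> {sent} \u003Chl>\"] + sentences[i + 1:]\n        source = \" \".join(highlighted)\n        target = \" \".join(f\"{a} \u003Csep>\" for a in found)\n        examples.append((source, target))\n    return examples\n\ntext = \"Python is a programming language. Created by Guido van Rossum and first released in 1991.\"\nprint(make_answer_extraction_examples(text, [\"Python\", \"Guido van Rossum\", \"1991\"]))\n```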
\n\n### Multitask QA-QG\n\nFor answer aware question generation we usually need three models: the first extracts answer-like spans, the second generates a question for that answer, and the third is a QA model that takes the question and produces an answer. We can then compare the two answers to see whether the generated question is correct.\n\nHaving three models for a single task is a lot of complexity, so the goal is to create a single multi-task model that can do all three of these tasks:\n\n1. extract answer-like spans\n2. generate a question based on the answer\n3. QA\n\nThe T5 model is fine-tuned in a multi-task way using task prefixes, as described in the paper.\n\n\u003Cp align=\"center\">\n  \u003Cimg width=\"80%\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fpatil-suraj_question_generation_readme_0525dbff7e81.png\">\n\u003C\u002Fp>\n\n### End-to-End question generation (answer agnostic)\n\nIn end-to-end question generation the model is asked to generate questions without being given the answers. [This](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2005.01107v1.pdf) paper discusses these ideas in more detail. Here the T5 model is trained to generate multiple questions simultaneously from just the context. The questions are separated by the `\u003Csep>` token. Here's how the examples are processed:\n\ninput text: `Python is a programming language. Created by Guido van Rossum and first released in 1991.`\n\ntarget text: `Who created Python ? \u003Csep> When was python released ? \u003Csep>`\n\n**All the training details can be found in [this](https:\u002F\u002Fapp.wandb.ai\u002Fpsuraj\u002Fquestion-generation) wandb project.**\n\n## Results\n\nResults on the SQuAD1.0 dev set using the above approaches. For decoding, beam search with num_beams 4 is used, with the max decoding length set to 32.
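\n\nIn 🤗transformers terms, these decoding settings correspond to a `generate` call along the following lines (a sketch only; it assumes the published checkpoint's tokenizer already contains the added `\u003Chl>` token and that the input uses a `generate question:` prefix, which may differ from the evaluation script's exact setup):\n\n```python3\nfrom transformers import T5ForConditionalGeneration, T5Tokenizer\n\n# Sketch: beam-search decoding with the settings reported above.\ntokenizer = T5Tokenizer.from_pretrained(\"valhalla\u002Ft5-small-qg-hl\")\nmodel = T5ForConditionalGeneration.from_pretrained(\"valhalla\u002Ft5-small-qg-hl\")\n\nsource = \"generate question: \u003Chl> 42 \u003Chl> is the answer to life, the universe and everything.\"\ninput_ids = tokenizer(source, return_tensors=\"pt\").input_ids\noutputs = model.generate(input_ids, num_beams=4, max_length=32)\nprint(tokenizer.decode(outputs[0], skip_special_tokens=True))\n```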
\n\nFor multitask qa-qg models the EM and F1 scores are provided as QA-EM and QA-F1.\n\nThe [nlg-eval](https:\u002F\u002Fgithub.com\u002FMaluuba\u002Fnlg-eval) package is used for calculating the metrics.\n\n\n| Name                                                                       | BLEU-4  | METEOR  | ROUGE-L | QA-EM  | QA-F1  | QG-FORMAT |\n|----------------------------------------------------------------------------|---------|---------|---------|--------|--------|-----------|\n| [t5-base-qg-hl](https:\u002F\u002Fhuggingface.co\u002Fvalhalla\u002Ft5-base-qg-hl)             | 21.3226 | 27.0854 | 43.5962 | -      | -      | highlight |\n| [t5-base-qa-qg-hl](https:\u002F\u002Fhuggingface.co\u002Fvalhalla\u002Ft5-base-qa-qg-hl)       | 21.0141 | 26.9113 | 43.2484 | 82.46  | 90.272 | highlight |\n| [t5-small-qa-qg-hl](https:\u002F\u002Fhuggingface.co\u002Fvalhalla\u002Ft5-small-qa-qg-hl)     | 18.9872 | 25.2217 | 40.7893 | 76.121 | 84.904 | highlight |\n| [t5-small-qg-hl](https:\u002F\u002Fhuggingface.co\u002Fvalhalla\u002Ft5-small-qg-hl)           | 18.5921 | 24.9915 | 40.1886 | -      | -      | highlight |\n| [t5-small-qg-prepend](https:\u002F\u002Fhuggingface.co\u002Fvalhalla\u002Ft5-small-qg-prepend) | 18.2791 | 24.6722 | 39.958  | -      | -      | prepend   |\n\n\n## Requirements\n```\ntransformers==3.0.0\nnltk\nnlp==0.2.0 # only if you want to fine-tune.\n```\n\nAfter installing `nltk`, run\n```bash\npython -m nltk.downloader punkt\n```\n\n## Usage\nUse the pipeline, which mimics the 🤗transformers pipeline, for easy inference.\n\nThe pipeline is divided into 3 tasks:\n1. `question-generation`: for single task question generation models.\n2. `multitask-qa-qg`: for multi-task qa,qg models.\n3. `e2e-qg`: for end-to-end question generation.\n\n[![Open In Colab](https:\u002F\u002Fcolab.research.google.com\u002Fassets\u002Fcolab-badge.svg)](https:\u002F\u002Fcolab.research.google.com\u002Fgithub\u002Fpatil-suraj\u002Fquestion_generation\u002Fblob\u002Fmaster\u002Fquestion_generation.ipynb)\n\n#### Question Generation\n\n```python3\nfrom pipelines import pipeline\n\nnlp = pipeline(\"question-generation\")\nnlp(\"42 is the answer to life, the universe and everything.\")\n=> [{'answer': '42', 'question': 'What is the answer to life, the universe and everything?'}]\n```\n\n**prepend format**\n```python3\nnlp = pipeline(\"question-generation\", model=\"valhalla\u002Ft5-small-qg-prepend\", qg_format=\"prepend\")\nnlp(\"42 is the answer to life, the universe and everything.\")\n=> [{'answer': '42 ', 'question': 'What is the answer to life, the universe, and everything?'}]\n```\n\n#### Multitask QA-QG\n```python3\nnlp = pipeline(\"multitask-qa-qg\")\n\n# to generate questions simply pass the text\nnlp(\"42 is the answer to life, the universe and everything.\")\n=> [{'answer': '42', 'question': 'What is the answer to life, the universe and everything?'}]\n\n# for qa pass a dict with \"question\" and \"context\"\nnlp({\n    \"question\": \"What is 42 ?\",\n    \"context\": \"42 is the answer to life, the universe and everything.\"\n})\n=> 'the answer to life, the universe and everything'\n```
\n\n#### End-to-end question generation (without answer supervision)\n```python3\nnlp = pipeline(\"e2e-qg\")\nnlp(\"Python is a programming language. Created by Guido van Rossum and first released in 1991.\")\n=> [\n    'What is a programming language?',\n    'Who created Python?',\n    'When was Python first released?'\n]\n```\n\nBy default the pipelines use the t5-small* models; to use another model, pass its path via the `model` parameter.\n\nBy default the `question-generation` pipeline will download the [valhalla\u002Ft5-small-qg-hl](https:\u002F\u002Fhuggingface.co\u002Fvalhalla\u002Ft5-small-qg-hl) model with the `highlight` qg format. If you want to use the prepend format, provide the path to the prepend model and set `qg_format` to `\"prepend\"`. For extracting answer-like spans it uses the [valhalla\u002Ft5-small-qa-qg-hl](https:\u002F\u002Fhuggingface.co\u002Fvalhalla\u002Ft5-small-qa-qg-hl) model; you can provide a different model through the `ans_model` parameter.\n\nThe `multitask-qa-qg` pipeline is for multi-task models that can extract answer-like spans and do both qg and qa, so it doesn't need a separate `ans_model`. By default the [valhalla\u002Ft5-small-qa-qg-hl](https:\u002F\u002Fhuggingface.co\u002Fvalhalla\u002Ft5-small-qa-qg-hl) model is used with the `highlight` format. If you want to use the prepend format, provide the path to the prepend model and set `qg_format` to `\"prepend\"`.\n\nThe `e2e-qg` pipeline is for end-to-end question generation. These models can generate multiple questions simultaneously without answer supervision. By default it uses [valhalla\u002Ft5-small-e2e-qg](https:\u002F\u002Fhuggingface.co\u002Fvalhalla\u002Ft5-small-e2e-qg).\n\n## Fine-tuning\n\n### Data processing\n\nTo support different data formats, the trainer expects a pre-processed, cached dataset, so you can process the data the way you want. The cached dataset should be saved using `torch.save`, and its `__getitem__` should return a `dict` with `source_ids`, `target_ids`, and `attention_mask` keys.\n\n- `source_ids`: encoded source text\n- `target_ids`: encoded target text\n- `attention_mask`: attention mask for the `source_ids`\n\nThe `T2TDataCollator` takes care of preparing the right `input_ids` and `labels`. It also trims the batches dynamically to remove excess padding tokens, which speeds up training.
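\n\nTo make the expected contract concrete, here is a minimal sketch of a cached dataset that satisfies it (illustrative only; the class name and wrapping are not from this repo):\n\n```python3\nimport torch\nfrom torch.utils.data import Dataset\n\nclass CachedQGDataset(Dataset):\n    # Minimal dataset satisfying the trainer's contract described above.\n    def __init__(self, encoded_examples):\n        # encoded_examples: list of dicts holding pre-tokenized tensors\n        self.examples = encoded_examples\n\n    def __len__(self):\n        return len(self.examples)\n\n    def __getitem__(self, idx):\n        ex = self.examples[idx]\n        return {\n            \"source_ids\": ex[\"source_ids\"],          # encoded source text\n            \"target_ids\": ex[\"target_ids\"],          # encoded target text\n            \"attention_mask\": ex[\"attention_mask\"],  # mask for source_ids\n        }\n\n# Cache it the way the trainer expects:\n# torch.save(CachedQGDataset(examples), \"data\u002Ftrain_data_qg_hl_t5.pt\")\n```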
\n\nThe `data\u002Fsquad_multitask` directory contains the modified SQuAD dataset for answer aware question generation (using both prepend and highlight formats), question answering (text-to-text), answer extraction, and end-to-end question generation. This dataset can be loaded using the awesome 🤗`nlp` library, which makes processing very easy.\n\nTo process and cache the dataset, use the `prepare_data.py` script. It will load the correct tokenizer depending on the `model_type` argument. It adds two new tokens, `\u003Csep>` and `\u003Chl>`, to the tokenizer and saves it at the `{model_type}_qg_tokenizer` path. You should pass this tokenizer to the fine-tuning script.
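\n\nThe tokenizer step is roughly equivalent to the following (a sketch of the idea for `model_type` t5, not the script itself):\n\n```python3\nfrom transformers import T5Tokenizer\n\n# Sketch: add the two special tokens and save the tokenizer for fine-tuning.\ntokenizer = T5Tokenizer.from_pretrained(\"t5-small\")\ntokenizer.add_tokens([\"\u003Csep>\", \"\u003Chl>\"])\ntokenizer.save_pretrained(\"t5_qg_tokenizer\")\n# Note: the model's embeddings must be resized to match during fine-tuning.\n```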
\n\nThe datasets will be saved in the `data\u002F` directory. You should provide the filenames using the `train_file_name` and `valid_file_name` arguments.\n\n**process data for single task question generation with highlight_qg_format**\n```bash\npython prepare_data.py \\\n    --task qg \\\n    --model_type t5 \\\n    --dataset_path data\u002Fsquad_multitask\u002F \\\n    --qg_format highlight_qg_format \\\n    --max_source_length 512 \\\n    --max_target_length 32 \\\n    --train_file_name train_data_qg_hl_t5.pt \\\n    --valid_file_name valid_data_qg_hl_t5.pt\n```\n\n**process data for multi-task qa-qg with highlight_qg_format**\n\nThe `valid_for_qg_only` argument decides whether the validation set should contain only data for the qg task. For my multi-task experiments I used validation data with only the qg task, so that the eval loss curve can be easily compared with the other single-task models.\n\n```bash\npython prepare_data.py \\\n    --task multi \\\n    --valid_for_qg_only \\\n    --model_type t5 \\\n    --dataset_path data\u002Fsquad_multitask\u002F \\\n    --qg_format highlight_qg_format \\\n    --max_source_length 512 \\\n    --max_target_length 32 \\\n    --train_file_name train_data_qa_qg_hl_t5.pt \\\n    --valid_file_name valid_data_qg_hl_t5.pt\n```\n\n**process dataset for end-to-end question generation**\n```bash\npython prepare_data.py \\\n    --task e2e_qg \\\n    --valid_for_qg_only \\\n    --model_type t5 \\\n    --dataset_path data\u002Fsquad_multitask\u002F \\\n    --qg_format highlight_qg_format \\\n    --max_source_length 512 \\\n    --max_target_length 32 \\\n    --train_file_name train_data_e2e_qg_t5.pt \\\n    --valid_file_name valid_data_e2e_qg_t5.pt\n```\n\n### training\nUse the `run_qg.py` script to start training. It uses the transformers `Trainer` class for training the models.\n\n```bash\npython run_qg.py \\\n    --model_name_or_path t5-small \\\n    --model_type t5 \\\n    --tokenizer_name_or_path t5_qg_tokenizer \\\n    --output_dir t5-small-qg-hl \\\n    --train_file_path data\u002Ftrain_data_qg_hl_t5.pt \\\n    --valid_file_path data\u002Fvalid_data_qg_hl_t5.pt \\\n    --per_device_train_batch_size 32 \\\n    --per_device_eval_batch_size 32 \\\n    --gradient_accumulation_steps 8 \\\n    --learning_rate 1e-4 \\\n    --num_train_epochs 10 \\\n    --seed 42 \\\n    --do_train \\\n    --do_eval \\\n    --evaluate_during_training \\\n    --logging_steps 100\n```\n\nOr, if you want to train from a script or notebook:\n\n```python3\nfrom run_qg import run_qg\n\nargs_dict = {\n    \"model_name_or_path\": \"t5-small\",\n    \"model_type\": \"t5\",\n    \"tokenizer_name_or_path\": \"t5_qg_tokenizer\",\n    \"output_dir\": \"t5-small-qg-hl\",\n    \"train_file_path\": \"data\u002Ftrain_data_qg_hl_t5.pt\",\n    \"valid_file_path\": \"data\u002Fvalid_data_qg_hl_t5.pt\",\n    \"per_device_train_batch_size\": 32,\n    \"per_device_eval_batch_size\": 32,\n    \"gradient_accumulation_steps\": 8,\n    \"learning_rate\": 1e-4,\n    \"num_train_epochs\": 10,\n    \"seed\": 42,\n    \"do_train\": True,\n    \"do_eval\": True,\n    \"evaluate_during_training\": True,\n    \"logging_steps\": 100\n}\n\n# start training\nrun_qg(args_dict)\n```\n\n### Evaluation\n\nUse the `eval.py` script for evaluating the model.\n\n```bash\npython eval.py \\\n    --model_name_or_path t5-base-qg-hl \\\n    --valid_file_path valid_data_qg_hl_t5.pt \\\n    --model_type t5 \\\n    --num_beams 4 \\\n    --max_decoding_length 32 \\\n    --output_path hypothesis_t5-base-qg-hl.txt\n```\n\nThis will save the output to the `{output_path}` file.\n\nTo calculate the metrics, install the [nlg-eval](https:\u002F\u002Fgithub.com\u002FMaluuba\u002Fnlg-eval) package and run\n\n```bash\nnlg-eval --hypothesis=hypothesis_t5-base-qg-hl.txt --references=data\u002Freferences.txt --no-skipthoughts --no-glove\n```\n\n## Applications 🚀\n\n1. A simple Trivia Quiz on topics of your choice - \u003Cbr\u002F>\n   [Medium article](https:\u002F\u002Fmedium.com\u002F@nvarshney97\u002Fusing-the-latest-nlp-techniques-for-fun-98f31ce7b556) and its [Colab Notebook](https:\u002F\u002Fcolab.research.google.com\u002Fgist\u002Fnrjvarshney\u002F39ed6c80e2fe293b9e7eca5bc3a45b7d\u002Fquiz.ipynb)\n2. [Autocards, Accelerating learning through machine-generated flashcards](https:\u002F\u002Fpaulbricman.com\u002Fdocs\u002Ftools\u002Fautocards\u002F)\n\n## Relevant papers\n- https:\u002F\u002Farxiv.org\u002Fabs\u002F1906.05416\n- https:\u002F\u002Fwww.aclweb.org\u002Fanthology\u002FD19-5821\u002F\n- https:\u002F\u002Farxiv.org\u002Fabs\u002F2005.01107v1\n","# Question Generation Quick Start Guide\n\n## Environment\n**System requirements**  \nPython 3.6+  \n\n**Prerequisites**  \n```bash\npip install transformers==3.0.0 nltk nlp==0.2.0\n```\n> Optionally, install via a PyPI mirror to speed things up:  \n> ```bash\n> pip install -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple transformers==3.0.0 nltk nlp==0.2.0\n> ```\n\n**NLTK data download**  \n```bash\npython -m nltk.downloader punkt\n```\n\n---\n\n## Installation\n1. Install the dependencies  \n   ```bash\n   pip install transformers==3.0.0 nltk nlp==0.2.0\n   ```\n2. Download the NLTK resources  \n   ```bash\n   python -m nltk.downloader punkt\n   ```\n\n---\n\n## Basic usage\n### 1. Question generation (answer-aware)\n```python\nfrom pipelines import pipeline\n\nnlp = pipeline(\"question-generation\")\nresult = nlp(\"42 is the answer to life, the universe and everything.\")\nprint(result)\n# Output: [{'answer': '42', 'question': 'What is the answer to life, the universe and everything?'}]\n```\n\n### 2. Multitask (QA + QG)\n```python\nnlp = pipeline(\"multitask-qa-qg\")\n# generate questions\nprint(nlp(\"42 is the answer to life, the universe and everything.\"))\n# answer a question\nprint(nlp({\"question\": \"What is 42?\", \"context\": \"42 is the answer to life, the universe and everything.\"}))\n```\n\n### 3. End-to-end generation (no answers required)\n```python\nnlp = pipeline(\"e2e-qg\")\nprint(nlp(\"Python is a programming language. Created by Guido van Rossum and first released in 1991.\"))\n# Output: ['What is a programming language?', 'Who created Python?', 'When was Python first released?']\n```","An online-education platform's development team needs to auto-generate practice questions for its new \"Introduction to Python Programming\" textbook to make learning more interactive.\n\n### Without question_generation\n- Writing questions by hand takes 2-3 hours per chapter, badly slowing course updates\n- Question quality depends on teacher experience; questions often drift from the context or misjudge difficulty (e.g., advanced algorithm questions mixed into basic-concept quizzes)\n- Scaling to a new textbook requires pulling in 3 extra curriculum staff, straining the team\n- The question bank cannot be tuned from student answer data; the feedback loop takes 1-2 weeks\n\n### With question_generation\n- Question generation drops to 5 minutes per chapter, with batch processing for a whole textbook\n- Generated questions match the textbook context precisely (e.g., automatically spotting key entities such as \"Guido van Rossum\" and generating related questions)\n- New textbooks get a companion question bank immediately at launch, with no extra headcount\n- Student answer data is analyzed in real time to auto-tune the difficulty distribution\n\nWith its end-to-end Transformer models, question_generation speeds up textbook question generation by roughly 60x while keeping questions tightly aligned with teaching goals.","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fpatil-suraj_question_generation_3b32e8ce.png","patil-suraj","Suraj Patil","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Fpatil-suraj_34552012.jpg","Researcher @black-forest-labs","@black-forest-labs","India","surajp815@gmail.com","psuraj28",null,"https:\u002F\u002Fgithub.com\u002Fpatil-suraj",[86,90],{"name":87,"color":88,"percentage":89},"Jupyter Notebook","#DA5B0B",89.7,{"name":91,"color":92,"percentage":93},"Python","#3572A5",10.3,1144,350,"2026-03-14T02:52:12","MIT","Not specified",{"notes":100,"python":98,"dependencies":101},"The first run downloads about 5GB of model files; conda is recommended for managing the environment",[102,103,104],"transformers==3.0.0","nltk","nlp==0.2.0",[13,26],[107,108,109,110,111,112,113,114],"nlp","nlg","deep-learning","transformer","t5","question-generation","natural-language-processing","natural-language-generation","2026-03-27T02:49:30.150509","2026-04-06T07:11:51.766336",[118,123,128,133,138,142],{"id":119,"question_zh":120,"answer_zh":121,"source_url":122},5161,"How do I fix GPU out-of-memory (OOM) errors?","Use a single-task pipeline instead of the multitask one: the multitask pipeline loads two models onto the GPU at the same time, which can exhaust memory. If you must use multitask, try lowering the batch size or running in CPU-only mode. See: https:\u002F\u002Fgithub.com\u002Fpatil-suraj\u002Fquestion_generation\u002Fissues\u002F2","https:\u002F\u002Fgithub.com\u002Fpatil-suraj\u002Fquestion_generation\u002Fissues\u002F2",
{"id":124,"question_zh":125,"answer_zh":126,"source_url":127},5162,"How do I generate more questions?","The default maximum generation length is 256 tokens; you can extend it through the generate_kwargs parameter, e.g. max_length=512, num_beams=4. Very long texts need to be processed in chunks. Example:\ngenerate_kwargs = {\n    \"max_length\": 512,\n    \"num_beams\": 4,\n    \"length_penalty\": 1.5,\n}\nnlp(text, **generate_kwargs)\nSee: https:\u002F\u002Fgithub.com\u002Fpatil-suraj\u002Fquestion_generation\u002Fissues\u002F20","https:\u002F\u002Fgithub.com\u002Fpatil-suraj\u002Fquestion_generation\u002Fissues\u002F20",{"id":129,"question_zh":130,"answer_zh":131,"source_url":132},5163,"How do I fix tokenizer instantiation failures?","Install the sentencepiece library. If the error persists after installing it, try downgrading transformers to 3.0.0: pip install transformers==3.0.0. See: https:\u002F\u002Fgithub.com\u002Fpatil-suraj\u002Fquestion_generation\u002Fissues\u002F53","https:\u002F\u002Fgithub.com\u002Fpatil-suraj\u002Fquestion_generation\u002Fissues\u002F53",{"id":134,"question_zh":135,"answer_zh":136,"source_url":137},5164,"How do I fine-tune on the SQuAD dataset?","Preprocess the data with prepare_data.py, specifying the task type via the --task argument (e.g. qg\u002Fe2e_qg), then run the fine-tuning script. Note that the data format must match what the model expects. See: https:\u002F\u002Fgithub.com\u002Fpatil-suraj\u002Fquestion_generation\u002Fissues\u002F16","https:\u002F\u002Fgithub.com\u002Fpatil-suraj\u002Fquestion_generation\u002Fissues\u002F16",{"id":139,"question_zh":140,"answer_zh":141,"source_url":132},5165,"How do I call the question-generation interface correctly?","Use the pipeline interface and specify the model parameters:\nnlp = pipeline(\"question-generation\", model=\"valhalla\u002Ft5-base-qg-hl\", ans_model='valhalla\u002Ft5-base-qa-qg-hl', qg_format=\"highlight\")\nMake sure the input text format matches what the model expects (e.g. contains \u003Chl> tags). See: https:\u002F\u002Fgithub.com\u002Fpatil-suraj\u002Fquestion_generation\u002Fissues\u002F53",{"id":143,"question_zh":144,"answer_zh":145,"source_url":146},5166,"How do I fix input_ids shape errors?","Convert input_ids to a tensor, for example:\ntext = \"Your input text here\"\ninputs = singletask_tokenizer(text, return_tensors='pt')\noutput = singletask_model.generate(inputs['input_ids'])\nPassing the plain list returned by encode leads to a missing shape attribute error. See: https:\u002F\u002Fgithub.com\u002Fpatil-suraj\u002Fquestion_generation\u002Fissues\u002F21","https:\u002F\u002Fgithub.com\u002Fpatil-suraj\u002Fquestion_generation\u002Fissues\u002F21",[]]