[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-declare-lab--MELD":3,"tool-declare-lab--MELD":65},[4,17,25,39,48,56],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":16},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",138956,2,"2026-04-05T11:33:21",[13,14,15],"开发框架","Agent","语言模型","ready",{"id":18,"name":19,"github_repo":20,"description_zh":21,"stars":22,"difficulty_score":10,"last_commit_at":23,"category_tags":24,"status":16},3704,"NextChat","ChatGPTNextWeb\u002FNextChat","NextChat 是一款轻量且极速的 AI 助手，旨在为用户提供流畅、跨平台的大模型交互体验。它完美解决了用户在多设备间切换时难以保持对话连续性，以及面对众多 AI 模型不知如何统一管理的痛点。无论是日常办公、学习辅助还是创意激发，NextChat 都能让用户随时随地通过网页、iOS、Android、Windows、MacOS 或 Linux 端无缝接入智能服务。\n\n这款工具非常适合普通用户、学生、职场人士以及需要私有化部署的企业团队使用。对于开发者而言，它也提供了便捷的自托管方案，支持一键部署到 Vercel 或 Zeabur 等平台。\n\nNextChat 的核心亮点在于其广泛的模型兼容性，原生支持 Claude、DeepSeek、GPT-4 及 Gemini Pro 等主流大模型，让用户在一个界面即可自由切换不同 AI 能力。此外，它还率先支持 MCP（Model Context Protocol）协议，增强了上下文处理能力。针对企业用户，NextChat 提供专业版解决方案，具备品牌定制、细粒度权限控制、内部知识库整合及安全审计等功能，满足公司对数据隐私和个性化管理的高标准要求。",87618,"2026-04-05T07:20:52",[13,15],{"id":26,"name":27,"github_repo":28,"description_zh":29,"stars":30,"difficulty_score":10,"last_commit_at":31,"category_tags":32,"status":16},2268,"ML-For-Beginners","microsoft\u002FML-For-Beginners","ML-For-Beginners 是由微软推出的一套系统化机器学习入门课程，旨在帮助零基础用户轻松掌握经典机器学习知识。这套课程将学习路径规划为 12 周，包含 26 节精炼课程和 52 道配套测验，内容涵盖从基础概念到实际应用的完整流程，有效解决了初学者面对庞大知识体系时无从下手、缺乏结构化指导的痛点。\n\n无论是希望转型的开发者、需要补充算法背景的研究人员，还是对人工智能充满好奇的普通爱好者，都能从中受益。课程不仅提供了清晰的理论讲解，还强调动手实践，让用户在循序渐进中建立扎实的技能基础。其独特的亮点在于强大的多语言支持，通过自动化机制提供了包括简体中文在内的 50 多种语言版本，极大地降低了全球不同背景用户的学习门槛。此外，项目采用开源协作模式，社区活跃且内容持续更新，确保学习者能获取前沿且准确的技术资讯。如果你正寻找一条清晰、友好且专业的机器学习入门之路，ML-For-Beginners 将是理想的起点。",84991,"2026-04-05T10:45:23",[33,34,35,36,14,37,15,13,38],"图像","数据工具","视频","插件","其他","音频",{"id":40,"name":41,"github_repo":42,"description_zh":43,"stars":44,"difficulty_score":45,"last_commit_at":46,"category_tags":47,"status":16},3128,"ragflow","infiniflow\u002Fragflow","RAGFlow 是一款领先的开源检索增强生成（RAG）引擎，旨在为大语言模型构建更精准、可靠的上下文层。它巧妙地将前沿的 RAG 技术与智能体（Agent）能力相结合，不仅支持从各类文档中高效提取知识，还能让模型基于这些知识进行逻辑推理和任务执行。\n\n在大模型应用中，幻觉问题和知识滞后是常见痛点。RAGFlow 通过深度解析复杂文档结构（如表格、图表及混合排版），显著提升了信息检索的准确度，从而有效减少模型“胡编乱造”的现象，确保回答既有据可依又具备时效性。其内置的智能体机制更进一步，使系统不仅能回答问题，还能自主规划步骤解决复杂问题。\n\n这款工具特别适合开发者、企业技术团队以及 AI 研究人员使用。无论是希望快速搭建私有知识库问答系统，还是致力于探索大模型在垂直领域落地的创新者，都能从中受益。RAGFlow 提供了可视化的工作流编排界面和灵活的 API 接口，既降低了非算法背景用户的上手门槛，也满足了专业开发者对系统深度定制的需求。作为基于 Apache 2.0 协议开源的项目，它正成为连接通用大模型与行业专有知识之间的重要桥梁。",77062,3,"2026-04-04T04:44:48",[14,33,13,15,37],{"id":49,"name":50,"github_repo":51,"description_zh":52,"stars":53,"difficulty_score":45,"last_commit_at":54,"category_tags":55,"status":16},519,"PaddleOCR","PaddlePaddle\u002FPaddleOCR","PaddleOCR 是一款基于百度飞桨框架开发的高性能开源光学字符识别工具包。它的核心能力是将图片、PDF 等文档中的文字提取出来，转换成计算机可读取的结构化数据，让机器真正“看懂”图文内容。\n\n面对海量纸质或电子文档，PaddleOCR 解决了人工录入效率低、数字化成本高的问题。尤其在人工智能领域，它扮演着连接图像与大型语言模型（LLM）的桥梁角色，能将视觉信息直接转化为文本输入，助力智能问答、文档分析等应用场景落地。\n\nPaddleOCR 适合开发者、算法研究人员以及有文档自动化需求的普通用户。其技术优势十分明显：不仅支持全球 100 多种语言的识别，还能在 
Windows、Linux、macOS 等多个系统上运行，并灵活适配 CPU、GPU、NPU 等各类硬件。作为一个轻量级且社区活跃的开源项目，PaddleOCR 既能满足快速集成的需求，也能支撑前沿的视觉语言研究，是处理文字识别任务的理想选择。",74913,"2026-04-05T10:44:17",[15,33,13,37],{"id":57,"name":58,"github_repo":59,"description_zh":60,"stars":61,"difficulty_score":62,"last_commit_at":63,"category_tags":64,"status":16},3215,"awesome-machine-learning","josephmisiti\u002Fawesome-machine-learning","awesome-machine-learning 是一份精心整理的机器学习资源清单，汇集了全球优秀的机器学习框架、库和软件工具。面对机器学习领域技术迭代快、资源分散且难以甄选的痛点，这份清单按编程语言（如 Python、C++、Go 等）和应用场景（如计算机视觉、自然语言处理、深度学习等）进行了系统化分类，帮助使用者快速定位高质量项目。\n\n它特别适合开发者、数据科学家及研究人员使用。无论是初学者寻找入门库，还是资深工程师对比不同语言的技术选型，都能从中获得极具价值的参考。此外，清单还延伸提供了免费书籍、在线课程、行业会议、技术博客及线下聚会等丰富资源，构建了从学习到实践的全链路支持体系。\n\n其独特亮点在于严格的维护标准：明确标记已停止维护或长期未更新的项目，确保推荐内容的时效性与可靠性。作为机器学习领域的“导航图”，awesome-machine-learning 以开源协作的方式持续更新，旨在降低技术探索门槛，让每一位从业者都能高效地站在巨人的肩膀上创新。",72149,1,"2026-04-03T21:50:24",[13,37],{"id":66,"github_repo":67,"name":68,"description_en":69,"description_zh":70,"ai_summary_zh":70,"readme_en":71,"readme_zh":72,"quickstart_zh":73,"use_case_zh":74,"hero_image_url":75,"owner_login":76,"owner_name":77,"owner_avatar_url":78,"owner_bio":79,"owner_company":80,"owner_location":80,"owner_email":80,"owner_twitter":80,"owner_website":81,"owner_url":82,"languages":83,"stars":88,"forks":89,"last_commit_at":90,"license":91,"difficulty_score":92,"env_os":79,"env_gpu":93,"env_ram":93,"env_deps":94,"category_tags":97,"github_topics":98,"view_count":10,"oss_zip_url":80,"oss_zip_packed_at":80,"status":16,"created_at":113,"updated_at":114,"faqs":115,"releases":146},2824,"declare-lab\u002FMELD","MELD","MELD: A Multimodal Multi-Party Dataset for Emotion Recognition in Conversation","MELD 是一个专为对话情感识别研究打造的多模态多角色数据集。它基于经典的情感线（EmotionLines）数据升级而来，收录了美剧《老友记》中超过 1400 段对话和 13000 句台词。与传统仅依赖文本的数据不同，MELD 创新性地融合了文本、音频和视觉三种模态信息，为每句台词标注了愤怒、喜悦、悲伤等七种具体情绪以及正负向情感倾向。\n\n该数据集主要解决了现有研究中缺乏高质量多模态会话数据的痛点，特别是在处理多人参与、情绪动态变化的复杂对话场景时，能帮助算法更准确地捕捉非语言线索（如语调、表情），从而提升情感分析的精度。其独特的技术亮点在于提供了完整的“文本 - 声音 - 画面”对齐数据，并详细统计了对话中的情绪转移现象，为验证多模态融合模型提供了坚实基础。\n\nMELD 非常适合人工智能研究人员、自然语言处理开发者以及高校师生使用，是训练和评估对话系统、情感计算模型的理想基准资源。普通用户虽不直接操作数据，但未来基于此研发的智能客服、心理陪伴机器人等产品将因此变得更加善解人意。目前，该数据集已被学术界广泛采用，并推动了多个前沿模型的诞生。","# MELD: A Multimodal Multi-Party Dataset for Emotion Recognition in Conversation\n\n## Note\n\n🔥 If you are interested in IQ testing LLMs, check out our new work: [AlgoPuzzleVQA](https:\u002F\u002Fgithub.com\u002Fdeclare-lab\u002Fpuzzle-reasoning)\n\n:fire: We have released the visual features extracted using Resnet - https:\u002F\u002Fgithub.com\u002Fdeclare-lab\u002FMM-Align\n\n:fire: :fire: :fire: For updated baselines please visit this link: [conv-emotion](https:\u002F\u002Fgithub.com\u002Fdeclare-lab\u002Fconv-emotion)\n\n:fire: :fire: :fire: For downloading the data use wget: \n```wget http:\u002F\u002Fweb.eecs.umich.edu\u002F~mihalcea\u002Fdownloads\u002FMELD.Raw.tar.gz```\n\n## Leaderboard\n\n![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fdeclare-lab_MELD_readme_1491652a74b4.png)\n\n## Updates\n\n10\u002F10\u002F2020: New paper and SOTA in Emotion Recognition in Conversations on the MELD dataset. Refer to the directory [COSMIC](https:\u002F\u002Fgithub.com\u002Fdeclare-lab\u002Fconv-emotion\u002Ftree\u002Fmaster\u002FCOSMIC) for the code. Read the paper -- [COSMIC: COmmonSense knowledge for eMotion Identification in Conversations](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2010.02795.pdf).\n\n22\u002F05\u002F2019: MELD: A Multimodal Multi-Party Dataset for Emotion Recognition in Conversation has been accepted as a full paper at ACL 2019. 
22/05/2019: Dyadic MELD has been released. It can be used to test dyadic conversational models.

15/11/2018: The problem in train.tar.gz has been fixed.

## Research Works using MELD

Zhang, Yazhou, Qiuchi Li, Dawei Song, Peng Zhang, and Panpan Wang. "Quantum-Inspired Interactive Networks for Conversational Sentiment Analysis." IJCAI 2019.

Zhang, Dong, Liangqing Wu, Changlong Sun, Shoushan Li, Qiaoming Zhu, and Guodong Zhou. "Modeling both Context- and Speaker-Sensitive Dependence for Emotion Detection in Multi-speaker Conversations." IJCAI 2019.

Ghosal, Deepanway, Navonil Majumder, Soujanya Poria, Niyati Chhaya, and Alexander Gelbukh. "DialogueGCN: A Graph Convolutional Neural Network for Emotion Recognition in Conversation." EMNLP 2019.


----------------------------------------------------

## Introduction
The Multimodal EmotionLines Dataset (MELD) was created by enhancing and extending the EmotionLines dataset. MELD contains the same dialogue instances available in EmotionLines, but also encompasses the audio and visual modalities along with text. MELD has more than 1400 dialogues and 13000 utterances from the Friends TV series, with multiple speakers participating in the dialogues. Each utterance in a dialogue is labeled with one of seven emotions -- Anger, Disgust, Sadness, Joy, Neutral, Surprise or Fear. MELD also carries a sentiment annotation (positive, negative or neutral) for each utterance.

### Example Dialogue
![](https://oss.gittoolsai.com/images/declare-lab_MELD_readme_fd3b5c995f4e.jpeg)

### Dataset Statistics
| Statistics                      | Train   | Dev     | Test    |
|---------------------------------|---------|---------|---------|
| # of modalities                 | {a,v,t} | {a,v,t} | {a,v,t} |
| # of unique words               | 10,643  | 2,384   | 4,361   |
| Avg. utterance length           | 8.03    | 7.99    | 8.28    |
| Max. utterance length           | 69      | 37      | 45      |
| Avg. # of emotions per dialogue | 3.30    | 3.35    | 3.24    |
| # of dialogues                  | 1039    | 114     | 280     |
| # of utterances                 | 9989    | 1109    | 2610    |
| # of speakers                   | 260     | 47      | 100     |
| # of emotion shifts             | 4003    | 427     | 1003    |
| Avg. duration of an utterance   | 3.59s   | 3.59s   | 3.58s   |

Please visit https://affective-meld.github.io for more details.

### Dataset Distribution

| Emotion  | Train | Dev | Test |
|----------|-------|-----|------|
| Anger    | 1109  | 153 | 345  |
| Disgust  | 271   | 22  | 68   |
| Fear     | 268   | 40  | 50   |
| Joy      | 1743  | 163 | 402  |
| Neutral  | 4710  | 470 | 1256 |
| Sadness  | 683   | 111 | 208  |
| Surprise | 1205  | 150 | 281  |


## Purpose
Multimodal data analysis exploits information from multiple parallel data channels for decision making. With the rapid growth of AI, multimodal emotion recognition has gained major research interest, primarily because of its potential applications in challenging tasks such as dialogue generation and multimodal interaction. A conversational emotion recognition system can be used to generate appropriate responses by analysing user emotions.
Although numerous works on multimodal emotion recognition have been carried out, only a very few focus on understanding emotions in conversations, and even those are limited to dyadic conversations, so they do not scale to emotion recognition in multi-party conversations with more than two participants. EmotionLines can be used as a resource for emotion recognition in text only, as it does not include data from other modalities such as visual and audio. At the same time, no multimodal multi-party conversational dataset was available for emotion recognition research. In this work, we have extended, improved, and further developed the EmotionLines dataset for the multimodal scenario. Emotion recognition in sequential turns poses several challenges, and context understanding is one of them. The emotion changes and emotion flow across the turns of a dialogue make accurate context modelling a difficult task. Since this dataset provides multimodal data sources for each dialogue, we hypothesise that they will improve context modelling and thus benefit overall emotion recognition performance. The dataset can also be used to develop a multimodal affective dialogue system. IEMOCAP and SEMAINE are multimodal conversational datasets that contain an emotion label for each utterance; however, both are dyadic in nature, which underlines the importance of our Multimodal EmotionLines dataset. The other publicly available multimodal emotion and sentiment recognition datasets are MOSEI, MOSI and MOUD; however, none of them is conversational.

## Dataset Creation
The first step was to find the timestamp of every utterance in each of the dialogues present in the EmotionLines dataset. To accomplish this, we crawled through the subtitle files of all the episodes, which contain the beginning and end timestamps of the utterances. This process gave us the season ID, episode ID, and timestamp of each utterance in its episode. We imposed two constraints while obtaining the timestamps: (a) the timestamps of the utterances in a dialogue must be in increasing order, and (b) all the utterances in a dialogue must belong to the same episode and scene.
Applying these two conditions revealed that in EmotionLines a few dialogues consist of multiple natural dialogues; we filtered those cases out of the dataset. Because of this error-correction step, our dataset has a different number of dialogues compared to EmotionLines. After obtaining the timestamp of each utterance, we extracted the corresponding audio-visual clips from the source episodes and, separately, extracted the audio content from those video clips. The final dataset thus contains the visual, audio, and textual modality for each dialogue.
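The two constraints above translate directly into code. The following is a minimal, illustrative sketch (not part of this repository); `parse_ts` and `is_valid_dialogue` are hypothetical helpers operating on timestamps in the same 'hh:mm:ss,ms' format used by the .csv files described below:

```python
from datetime import timedelta

def parse_ts(ts: str) -> timedelta:
    """Parse a subtitle timestamp in 'hh:mm:ss,ms' format."""
    hms, ms = ts.split(',')
    h, m, s = map(int, hms.split(':'))
    return timedelta(hours=h, minutes=m, seconds=s, milliseconds=int(ms))

def is_valid_dialogue(utterances) -> bool:
    """Constraint (a): StartTimes strictly increasing within a dialogue.
    Constraint (b): all utterances from the same episode (scene membership
    was additionally enforced during dataset creation)."""
    starts = [parse_ts(u['StartTime']) for u in utterances]
    one_episode = len({(u['Season'], u['Episode']) for u in utterances}) == 1
    return one_episode and all(a < b for a, b in zip(starts, starts[1:]))
```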
## Paper
The paper describing this dataset can be found here - https://arxiv.org/pdf/1810.02508.pdf

## Download the data
Please visit - http://web.eecs.umich.edu/~mihalcea/downloads/MELD.Raw.tar.gz to download the raw data. Data are stored in .mp4 format and can be found in the XXX.tar.gz files. Annotations can be found at https://github.com/declare-lab/MELD/tree/master/data/MELD.

## Description of the .csv files

### Column Specification
| Column Name  | Description |
|--------------|-------------|
| Sr No.       | Serial number of the utterance, mainly for referencing utterances across different versions or multiple copies with different subsets. |
| Utterance    | Individual utterance from EmotionLines as a string. |
| Speaker      | Name of the speaker associated with the utterance. |
| Emotion      | The emotion (neutral, joy, sadness, anger, surprise, fear, disgust) expressed by the speaker in the utterance. |
| Sentiment    | The sentiment (positive, neutral, negative) expressed by the speaker in the utterance. |
| Dialogue_ID  | The index of the dialogue, starting from 0. |
| Utterance_ID | The index of the utterance within the dialogue, starting from 0. |
| Season       | The season number of the Friends TV show to which the utterance belongs. |
| Episode      | The episode number within that season to which the utterance belongs. |
| StartTime    | The start time of the utterance in the given episode, in the format 'hh:mm:ss,ms'. |
| EndTime      | The end time of the utterance in the given episode, in the format 'hh:mm:ss,ms'. |

### The files
- /data/MELD/train_sent_emo.csv - contains the utterances in the training set along with Sentiment and Emotion labels.
- /data/MELD/dev_sent_emo.csv - contains the utterances in the dev set along with Sentiment and Emotion labels.
- /data/MELD/test_sent_emo.csv - contains the utterances in the test set along with Sentiment and Emotion labels.
- /data/MELD_Dyadic/train_sent_emo_dya.csv - contains the utterances in the training set of the dyadic variant of MELD along with Sentiment and Emotion labels. To find the video clip corresponding to a particular utterance, refer to the columns 'Old_Dialogue_ID' and 'Old_Utterance_ID'.
- /data/MELD_Dyadic/dev_sent_emo_dya.csv - the same for the dev set of the dyadic variant.
- /data/MELD_Dyadic/test_sent_emo_dya.csv - the same for the test set of the dyadic variant.
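As a quick sanity check, the annotation files load directly with pandas. A small sketch (the relative path assumes you are at the repository root):

```python
import pandas as pd

# Load the training-set annotations described above.
df = pd.read_csv("data/MELD/train_sent_emo.csv")

# One emotion and one sentiment label per utterance (row).
print(df[["Speaker", "Utterance", "Emotion", "Sentiment"]].head())

# The per-emotion counts should match the "Dataset Distribution" table.
print(df["Emotion"].value_counts())
```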
## Description of Pickle Files
There are 13 pickle files comprising the data and features used for training the baseline models. A brief description of each pickle file follows.

### Data pickle files:

* **data_emotion.p, data_sentiment.p** - These are the primary data files, each containing 5 elements stored as a list.
    * *data*: a dictionary with the following key/value pairs.
        * *text*: original sentence.
        * *split*: train/val/test - denotes which split the tuple belongs to.
        * *y*: label of the sentence.
        * *dialog*: ID of the dialogue the utterance belongs to.
        * *utterance*: utterance number within the dialogue.
        * *num_words*: number of words in the utterance.
    * *W*: GloVe embedding matrix.
    * *vocab*: the vocabulary of the dataset.
    * *word_idx_map*: mapping of each word in vocab to its index in W.
    * *max_sentence_length*: maximum number of tokens in an utterance in the dataset.
    * *label_index*: mapping of each label (emotion or sentiment) to its assigned index, e.g. label_index['neutral'] = 0.
```python
import pickle
data, W, vocab, word_idx_map, max_sentence_length, label_index = pickle.load(open(filepath, 'rb'))
```

* **text_glove_average_emotion.pkl, text_glove_average_sentiment.pkl** - 300-dimensional textual feature vectors for each utterance, initialized as the average of the GloVe embeddings of all tokens in the utterance. Each file is a list of 3 dictionaries for the train, val and test sets, indexed in the format *dia_utt*, where dia is the dialogue ID and utt is the utterance ID. For example, train_text_avg_emb['0_0'].shape = (300, ).
```python
import pickle
train_text_avg_emb, val_text_avg_emb, test_text_avg_emb = pickle.load(open(filepath, 'rb'))
```

* **audio_embeddings_feature_selection_emotion.pkl, audio_embeddings_feature_selection_sentiment.pkl** - 1611/1422-dimensional audio feature vectors for each utterance, trained for emotion/sentiment classification. These features were originally extracted with [openSMILE](https://www.audeering.com/opensmile/) and then reduced by L2-based feature selection using an SVM. Each file is a list of 3 dictionaries for the train, val and test sets, indexed in the format *dia_utt*, where dia is the dialogue ID and utt is the utterance ID. For example, train_audio_emb['0_0'].shape = (1611, ) or (1422, ).
```python
import pickle
train_audio_emb, val_audio_emb, test_audio_emb = pickle.load(open(filepath, 'rb'))
```

### Model output pickle files:

* **text_glove_CNN_emotion.pkl, text_glove_CNN_sentiment.pkl** - 100-dimensional textual features obtained after training a CNN-based [network](https://github.com/dennybritz/cnn-text-classification-tf) for emotion/sentiment classification. Each file is a list of 3 dictionaries for the train, val and test sets, indexed in the format *dia_utt*, where dia is the dialogue ID and utt is the utterance ID. For example, train_text_CNN_emb['0_0'].shape = (100, ).
```python
import pickle
train_text_CNN_emb, val_text_CNN_emb, test_text_CNN_emb = pickle.load(open(filepath, 'rb'))
```

* **text_emotion.pkl, text_sentiment.pkl** - These files contain the contextual feature representations produced by the unimodal bcLSTM model: a 600-dimensional textual feature vector per utterance for emotion/sentiment classification, stored as a dictionary indexed by dialogue ID. Each file is a list of 3 dictionaries for the train, val and test sets. For example, train_text_emb['0'].shape = (33, 600), where 33 is the maximum number of utterances in a dialogue. Dialogues with fewer utterances are padded with zero vectors.
```python
import pickle
train_text_emb, val_text_emb, test_text_emb = pickle.load(open(filepath, 'rb'))
```

* **audio_emotion.pkl, audio_sentiment.pkl** - These files contain the contextual feature representations produced by the unimodal bcLSTM model: a 300/600-dimensional audio feature vector per utterance for emotion/sentiment classification, stored as a dictionary indexed by dialogue ID. Each file is a list of 3 dictionaries for the train, val and test sets. For example, train_audio_emb['0'].shape = (33, 300) or (33, 600), where 33 is the maximum number of utterances in a dialogue. Dialogues with fewer utterances are padded with zero vectors.
```python
import pickle
train_audio_emb, val_audio_emb, test_audio_emb = pickle.load(open(filepath, 'rb'))
```

* **bimodal_sentiment.pkl** - This file contains the contextual feature representations produced by the bimodal bcLSTM model: a 600-dimensional bimodal (text, audio) feature vector per utterance for sentiment classification, stored as a dictionary indexed by dialogue ID. It is a list of 3 dictionaries for the train, val and test sets. For example, train_bimodal_emb['0'].shape = (33, 600), where 33 is the maximum number of utterances in a dialogue. Dialogues with fewer utterances are padded with zero vectors.
```python
import pickle
train_bimodal_emb, val_bimodal_emb, test_bimodal_emb = pickle.load(open(filepath, 'rb'))
```
## Description of Raw Data
- There are 3 folders (.tar.gz files) - train, dev and test; each corresponds to the video clips of the utterances in the 3 .csv files.
- In each folder, every video clip in the raw data corresponds to one utterance in the corresponding .csv file. The video clips are named in the format diaX1\_uttX2.mp4, where X1 is the Dialogue\_ID and X2 is the Utterance_ID as provided in the corresponding .csv file, denoting the particular utterance.
- For example, consider the video clip **dia6_utt1.mp4** in **train.tar.gz**. The corresponding utterance for this video clip is the row of **train_sent_emo.csv** with **Dialogue_ID=6** and **Utterance_ID=1**, which is *'You liked it? You really liked it?'*
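This naming scheme makes it straightforward to go from an annotation row to its clip. A small sketch (assuming train.tar.gz has been extracted into ./train):

```python
from pathlib import Path
import pandas as pd

df = pd.read_csv("data/MELD/train_sent_emo.csv")
clip_dir = Path("train")  # extracted contents of train.tar.gz

def clip_path(row) -> Path:
    # diaX1_uttX2.mp4, with X1 = Dialogue_ID and X2 = Utterance_ID.
    return clip_dir / f"dia{row.Dialogue_ID}_utt{row.Utterance_ID}.mp4"

row = df[(df.Dialogue_ID == 6) & (df.Utterance_ID == 1)].iloc[0]
print(clip_path(row), "->", row.Utterance)  # train/dia6_utt1.mp4 -> You liked it? ...
```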
## Reading the Data
There are 2 python scripts provided in './utils/':
- read_meld.py \- displays the path of the video file corresponding to an utterance in the .csv file from MELD.
- read_emorynlp \- displays the path of the video file corresponding to an utterance in the .csv file from the Multimodal EmoryNLP Emotion Detection dataset.

## Labelling
For experimentation, all the labels are represented as one-hot encodings, the indices for which are as follows:
- **Emotion** - {'neutral': 0, 'surprise': 1, 'fear': 2, 'sadness': 3, 'joy': 4, 'disgust': 5, 'anger': 6}. Therefore, the label corresponding to the emotion *'joy'* would be [0., 0., 0., 0., 1., 0., 0.]
- **Sentiment** - {'neutral': 0, 'positive': 1, 'negative': 2}. Therefore, the label corresponding to the sentiment *'positive'* would be [0., 1., 0.]

## Class Weights
For the baseline on emotion classification, the following class weights were used, with the same indexing as above.
Class Weights: [4.0, 15.0, 15.0, 3.0, 1.0, 6.0, 3.0].
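A short sketch applying the published indices; the `one_hot` helper is hypothetical, and the class-weight dictionary simply re-keys the list above by emotion index (e.g. for a Keras-style `class_weight` argument):

```python
import numpy as np

# Emotion label indices as published above.
EMOTION_INDEX = {'neutral': 0, 'surprise': 1, 'fear': 2, 'sadness': 3,
                 'joy': 4, 'disgust': 5, 'anger': 6}

def one_hot(label: str) -> np.ndarray:
    vec = np.zeros(len(EMOTION_INDEX))
    vec[EMOTION_INDEX[label]] = 1.0
    return vec

print(one_hot('joy'))  # [0. 0. 0. 0. 1. 0. 0.]

# Baseline emotion class weights, keyed by the same indices.
CLASS_WEIGHTS = dict(enumerate([4.0, 15.0, 15.0, 3.0, 1.0, 6.0, 3.0]))
```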
## Run the baseline

Please follow these steps to run the baseline -

1. Download the features from [here](http://web.eecs.umich.edu/~mihalcea/downloads/MELD.Features.Models.tar.gz).
2. Copy these features into `./data/pickles/`.
3. To train/test the baseline model, run the file `baseline/baseline.py` as follows:
    - `python baseline.py -classify [Sentiment|Emotion] -modality [text|audio|bimodal] [-train|-test]`
    - example command to train the text unimodal model for sentiment classification: `python baseline.py -classify Sentiment -modality text -train`
    - use `python baseline.py -h` to get help text for the parameters.
4. For pre-trained models, download the model weights from [here](http://web.eecs.umich.edu/~mihalcea/downloads/MELD.Features.Models.tar.gz) and place the pickle files inside `./data/models/`.

## Citation
Please cite the following papers if you find this dataset useful in your research:

S. Poria, D. Hazarika, N. Majumder, G. Naik, E. Cambria, R. Mihalcea. MELD: A Multimodal Multi-Party Dataset for Emotion Recognition in Conversation. ACL 2019.

Chen, S.Y., Hsu, C.C., Kuo, C.C. and Ku, L.W. EmotionLines: An Emotion Corpus of Multi-Party Conversations. arXiv preprint arXiv:1802.08379 (2018).

# Multimodal EmoryNLP Emotion Recognition Dataset
----------------------------------------------------
## Description
The Multimodal EmoryNLP Emotion Detection Dataset was created by enhancing and extending the EmoryNLP Emotion Detection dataset. It contains the same dialogue instances available in the EmoryNLP dataset, but also encompasses the audio and visual modalities along with text. The multimodal EmoryNLP dataset contains more than 800 dialogues and 9000 utterances from the Friends TV series, with multiple speakers participating in the dialogues. Each utterance in a dialogue is labeled with one of these seven emotions -- Neutral, Joyful, Peaceful, Powerful, Scared, Mad and Sad. The annotations are borrowed from the original dataset.

### Dataset Statistics
| Statistics                    | Train   | Dev     | Test    |
|-------------------------------|---------|---------|---------|
| # of modalities               | {a,v,t} | {a,v,t} | {a,v,t} |
| # of unique words             | 9,744   | 2,123   | 2,345   |
| Avg. utterance length         | 7.86    | 6.97    | 7.79    |
| Max. utterance length         | 78      | 60      | 61      |
| Avg. # of emotions per scene  | 4.10    | 4.00    | 4.40    |
| # of dialogues                | 659     | 89      | 79      |
| # of utterances               | 7551    | 954     | 984     |
| # of speakers                 | 250     | 46      | 48      |
| # of emotion shifts           | 4596    | 575     | 653     |
| Avg. duration of an utterance | 5.55s   | 5.46s   | 5.27s   |

### Dataset Distribution

| Emotion  | Train | Dev | Test |
|----------|-------|-----|------|
| Joyful   | 1677  | 205 | 217  |
| Mad      | 785   | 97  | 86   |
| Neutral  | 2485  | 322 | 288  |
| Peaceful | 638   | 82  | 111  |
| Powerful | 551   | 70  | 96   |
| Sad      | 474   | 51  | 70   |
| Scared   | 941   | 127 | 116  |

## Data
Video clips of this dataset can be downloaded from [this link](https://drive.google.com/file/d/1UQduKw8QTqGf3RafxrTDfI1NyInYK3fr/view?usp=sharing).
The annotation files can be found at https://github.com/SenticNet/MELD/tree/master/data/emorynlp. There are 3 .csv files; each entry in the first column of these files is an utterance whose corresponding video clip can be found [here](https://drive.google.com/file/d/1UQduKw8QTqGf3RafxrTDfI1NyInYK3fr/view?usp=sharing). Each utterance and its video clip are indexed by the season no., episode no., scene ID and utterance ID. For example, **sea1\_ep2\_sc6\_utt3.mp4** is the clip for the utterance with season no. 1, episode no. 2, scene\_id 6 and utterance\_id 3. A scene is simply a dialogue. This indexing is consistent with the original dataset. The .csv files and the video files are divided into the train, validation and test sets in accordance with the original dataset. Annotations have been borrowed directly from the original EmoryNLP dataset (Zahiri et al. (2018)).
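The four-part file names parse with a single regular expression. A small, hypothetical sketch:

```python
import re

# Parse the EmoryNLP clip naming scheme, e.g. 'sea1_ep2_sc6_utt3.mp4'.
PATTERN = re.compile(r"sea(\d+)_ep(\d+)_sc(\d+)_utt(\d+)\.mp4")

def parse_clip(name: str) -> dict:
    season, episode, scene, utt = map(int, PATTERN.fullmatch(name).groups())
    return {"season": season, "episode": episode,
            "scene_id": scene, "utterance_id": utt}

print(parse_clip("sea1_ep2_sc6_utt3.mp4"))
# {'season': 1, 'episode': 2, 'scene_id': 6, 'utterance_id': 3}
```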
### Description of the .csv files

#### Column Specification
| Column Name  | Description |
|--------------|-------------|
| Utterance    | Individual utterance from EmoryNLP as a string. |
| Speaker      | Name of the speaker associated with the utterance. |
| Emotion      | The emotion (Neutral, Joyful, Peaceful, Powerful, Scared, Mad, Sad) expressed by the speaker in the utterance. |
| Scene_ID     | The index of the dialogue, starting from 0. |
| Utterance_ID | The index of the utterance within the dialogue, starting from 0. |
| Season       | The season number of the Friends TV show to which the utterance belongs. |
| Episode      | The episode number within that season to which the utterance belongs. |
| StartTime    | The start time of the utterance in the given episode, in the format 'hh:mm:ss,ms'. |
| EndTime      | The end time of the utterance in the given episode, in the format 'hh:mm:ss,ms'. |

***Note***: There are a few utterances for which we were not able to find the start and end times due to inconsistencies in the subtitles. Such utterances have been omitted from the dataset. However, we encourage users to find the corresponding utterances in the original dataset and generate video clips for them.

## Citation
Please cite the following papers if you find this dataset useful in your research:

S. Zahiri and J. D. Choi. Emotion Detection on TV Show Transcripts with Sequence-based Convolutional Neural Networks. In The AAAI Workshop on Affective Content Analysis, AFFCON'18, 2018.

S. Poria, D. Hazarika, N. Majumder, G. Naik, E. Cambria, R. Mihalcea. MELD: A Multimodal Multi-Party Dataset for Emotion Recognition in Conversation. ACL 2019.
ACL 2019.\n","# MELD：用于对话中情感识别的多模态多方数据集\n\n## 注意事项\n\n🔥 如果你对LLM的智商测试感兴趣，请查看我们的新工作：[AlgoPuzzleVQA](https:\u002F\u002Fgithub.com\u002Fdeclare-lab\u002Fpuzzle-reasoning)\n\n:fire: 我们已发布使用Resnet提取的视觉特征 - https:\u002F\u002Fgithub.com\u002Fdeclare-lab\u002FMM-Align\n\n:fire: :fire: :fire: 如需更新的基线模型，请访问此链接：[conv-emotion](https:\u002F\u002Fgithub.com\u002Fdeclare-lab\u002Fconv-emotion)\n\n:fire: :fire: :fire: 下载数据时请使用wget：\n```wget http:\u002F\u002Fweb.eecs.umich.edu\u002F~mihalcea\u002Fdownloads\u002FMELD.Raw.tar.gz```\n\n## 排行榜\n\n![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fdeclare-lab_MELD_readme_1491652a74b4.png)\n\n## 更新信息\n\n2020年10月10日：关于MELD数据集的对话情感识别最新论文及SOTA结果。代码请参阅[COSMIC](https:\u002F\u002Fgithub.com\u002Fdeclare-lab\u002Fconv-emotion\u002Ftree\u002Fmaster\u002FCOSMIC)目录。论文阅读地址——[COSMIC：基于常识知识的对话情感识别](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2010.02795.pdf)。\n\n2019年5月22日：MELD：用于对话中情感识别的多模态多方数据集已被ACL 2019会议接收为全文发表。更新后的论文可在此查阅 - https:\u002F\u002Farxiv.org\u002Fpdf\u002F1810.02508.pdf\n\n2019年5月22日：双人版MELD已发布，可用于测试双人对话模型。\n\n2018年11月15日：train.tar.gz文件中的问题已修复。\n\n## 使用MELD的研究工作\n\nZhang, Yazhou, Qiuchi Li, Dawei Song, Peng Zhang, and Panpan Wang. “受量子启发的交互网络用于对话情感分析。” IJCAI 2019。\n\nZhang, Dong, Liangqing Wu, Changlong Sun, Shoushan Li, Qiaoming Zhu, and Guodong Zhou. “建模上下文与说话者相关依赖以进行多说话者对话中的情感检测。” IJCAI 2019。\n\nGhosal, Deepanway, Navonil Majumder, Soujanya Poria, Niyati Chhaya, and Alexander Gelbukh. “DialogueGCN：用于对话中情感识别的图卷积神经网络。” EMNLP 2019.\n\n\n----------------------------------------------------\n\n## 简介\n多模态EmotionLines数据集（MELD）是在EmotionLines数据集的基础上扩展和增强而创建的。MELD包含了EmotionLines中的所有对话实例，同时还增加了音频和视觉模态的数据。MELD包含来自电视剧《老友记》的1400多个对话和13000条话语。这些对话由多位说话者参与。每个话语都被标注为以下七种情感之一——愤怒、厌恶、悲伤、喜悦、中性、惊讶和恐惧。此外，MELD还为每条话语提供了情感标注（正面、负面和中性）。\n\n### 示例对话\n![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fdeclare-lab_MELD_readme_fd3b5c995f4e.jpeg)\n\n### 数据集统计信息\n| 统计指标                      | 训练集   | 开发集     | 测试集    |\n|---------------------------------|---------|---------|---------|\n| 模态数量                   | {a,v,t} | {a,v,t} | {a,v,t} |\n| 唯一词数               | 10,643  | 2,384   | 4,361   |\n| 平均话语长度           | 8.03    | 7.99    | 8.28    |\n| 最大话语长度           | 69      | 37      | 45      |\n| 每个对话平均情感数     | 3.30    | 3.35    | 3.24    |\n| 对话数量                  | 1039    | 114     | 280     |\n| 话语数量                 | 9989    | 1109    | 2610    |\n| 说话者数量                | 260     | 47      | 100     |\n| 情感转换次数              | 4003    | 427     | 1003    |\n| 平均话语持续时间         | 3.59s   | 3.59s   | 3.58s   |\n\n更多详情请访问https:\u002F\u002Faffective-meld.github.io。\n\n### 数据分布\n\n|          | 训练集 | 开发集 | 测试集 |\n|----------|-------|-----|------|\n| 愤怒    | 1109  | 153 | 345  |\n| 厌恶  | 271   | 22  | 68   |\n| 恐惧     | 268   | 40  | 50   |\n| 喜悦      | 1743  | 163 | 402  |\n| 中性  | 4710  | 470 | 1256 |\n| 悲伤  | 683   | 111 | 208  |\n| 惊讶 | 1205  | 150 | 281  |\n\n\n## 
目的\n多模态数据分析通过利用多个并行数据通道的信息来进行决策。随着人工智能的快速发展，多模态情感识别已成为研究热点，这主要得益于其在许多具有挑战性的任务中的潜在应用，例如对话生成、多模态交互等。对话情感识别系统可以通过分析用户的情感来生成适当的回应。尽管目前已有大量关于多模态情感识别的研究，但真正专注于理解对话中情感的研究却寥寥无几。然而，这些研究大多局限于双人对话的理解，因此无法扩展到包含两个以上参与者的多方对话情感识别。EmotionLines可以作为仅基于文本的情感识别资源，因为它不包含视觉和音频等其他模态的数据。同时需要注意的是，目前尚不存在用于情感识别研究的多模态多方对话数据集。在本工作中，我们对EmotionLines数据集进行了扩展、改进，并进一步开发以适应多模态场景。在连续轮次的对话中进行情感识别存在诸多挑战，其中情境理解是关键之一。对话中轮次顺序上的情感变化和情感流动使得准确的情境建模变得十分困难。在本数据集中，由于我们能够获取每个对话的多模态数据源，我们假设这将有助于改善情境建模，从而提升整体的情感识别性能。该数据集也可用于开发多模态情感对话系统。IEMOCAP和SEMAINE是包含每句话情感标签的多模态对话数据集。然而，这些数据集均为双人对话性质，这也凸显了我们多模态EmotionLines数据集的重要性。其他公开可用的多模态情感和情感识别数据集包括MOSEI、MOSI和MOUD，但这些数据集均非对话性质。\n\n## 数据集创建\n第一步是为 EmotionLines 数据集中每个对话中的每句话语找到时间戳。为此，我们遍历了所有剧集的字幕文件，从中提取出每句话的开始和结束时间戳。通过这一过程，我们获得了每个剧集的季 ID、集 ID 以及该集内每句话的时间戳。在获取时间戳时，我们设置了两个约束条件：(a) 对话中各句话的时间戳必须按递增顺序排列；(b) 同一对话中的所有话语必须属于同一集和同一场景。\n根据这两个条件进行筛选后发现，在 EmotionLines 数据集中，部分对话实际上由多个自然对话组成。我们将这些情况从数据集中剔除。由于这一错误修正步骤，我们的数据集与原始 EmotionLines 数据集相比，对话数量有所不同。在获取每句话的时间戳后，我们从源剧集中提取了对应的话语文本及其视听片段。此外，我们还单独提取了这些视频片段中的音频内容。最终，我们的数据集为每个对话提供了视觉、音频和文本三种模态的数据。\n\n## 论文\n有关该数据集的论文可参阅：https:\u002F\u002Farxiv.org\u002Fpdf\u002F1810.02508.pdf\n\n## 下载数据\n请访问 http:\u002F\u002Fweb.eecs.umich.edu\u002F~mihalcea\u002Fdownloads\u002FMELD.Raw.tar.gz 下载原始数据。数据以 .mp4 格式存储，压缩包名为 XXX.tar.gz。标注信息可在 https:\u002F\u002Fgithub.com\u002Fdeclare-lab\u002FMELD\u002Ftree\u002Fmaster\u002Fdata\u002FMELD 中找到。\n\n## .csv 文件说明\n\n### 列说明\n| 列名       | 描述                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |\n|------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|\n| 序号       | 发话的序号，主要用于在不同版本或包含不同子集的多个副本中引用具体的发话内容。 |\n| 发话内容   | EmotionLines 数据集中的一条单独发话内容，以字符串形式表示。                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   |\n| 发话者     | 与该发话内容相关联的发话者姓名。                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                     |\n| 情感       | 发话者在该发话中表达的情感（中性、喜悦、悲伤、愤怒、惊讶、恐惧、厌恶）。                                                                                                                                                                                                                                                                                                                                                                                                                                         |\n| 情感倾向   | 发话者在该发话中表达的情感倾向（正面、中性、负面）。                                                                                                                                                                                                                                                                                                                                                                                                                                                                 |\n| 对话语篇ID | 对话语篇的索引，从0开始计数。                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             |\n| 发话ID     | 该发话在对话语篇中的索引，从0开始计数。                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 |\n| 季节数     | 该发话所属的《老友记》电视剧的季节数。                                                                                                                                                                                                                                                                                                                                                                                                                                                                             |\n| 集数       | 该发话所属的《老友记》某一季度中的集数。                                                                                                                                                                                                                                                                                                                                                                                                                                                              |\n| 开始时间   | 该发话在给定集中的开始时间，格式为“hh:mm:ss,ms”。                                                                                                                                                                                                                                                                                                                                                                                                                                                                   |\n| 结束时间   | 该发话在给定集中的结束时间，格式为“hh:mm:ss,ms”。           
                                                                                                                                                                                                                                                                                                                                                                                                                                                          |\n\n### 文件说明\n- \u002Fdata\u002FMELD\u002Ftrain_sent_emo.csv - 包含训练集中的话语及其情感和情绪标签。\n- \u002Fdata\u002FMELD\u002Fdev_sent_emo.csv - 包含验证集中的话语及其情感和情绪标签。\n- \u002Fdata\u002FMELD\u002Ftest_sent_emo.csv - 包含测试集中的话语及其情感和情绪标签。\n- \u002Fdata\u002FMELD_Dyadic\u002Ftrain_sent_emo_dya.csv - 包含MELD双人对话变体训练集中的话语及其情感和情绪标签。要获取与特定话语对应的视频片段，请参考“Old_Dialogue_ID”和“Old_Utterance_ID”两列。\n- \u002Fdata\u002FMELD_Dyadic\u002Fdev_sent_emo_dya.csv - 包含双人对话变体验证集中的话语及其情感和情绪标签。要获取与特定话语对应的视频片段，请参考“Old_Dialogue_ID”和“Old_Utterance_ID”两列。\n- \u002Fdata\u002FMELD_Dyadic\u002Ftest_sent_emo_dya.csv - 包含双人对话变体测试集中的话语及其情感和情绪标签。要获取与特定话语对应的视频片段，请参考“Old_Dialogue_ID”和“Old_Utterance_ID”两列。\n\n## Pickle 文件说明\n共有13个Pickle文件，包含了用于训练基线模型的数据和特征。以下是每个Pickle文件的简要说明。\n\n### 数据Pickle文件：\n\n* **data_emotion.p、data_sentiment.p** - 这是主要的数据文件，包含以列表形式存储的5个不同元素。\n    * *data*: 一个字典，包含以下键值对：\n        * *text*: 原始句子。\n        * *split*: train\u002Fval\u002Ftest - 表示该元组所属的划分（训练集、验证集或测试集）。\n        * *y*: 句子的标签。\n        * *dialog*: 该话语所属对话的ID。\n        * *utterance*: 对话ID中的话语编号。\n        * *num_words*: 该话语中的单词数量。\n    * W: Glove嵌入矩阵。\n    * vocab: 数据集的词汇表。\n    * word_idx_map: 词汇表中每个词到W中索引的映射。\n    * max_sentence_length: 数据集中单个话语的最大标记数。\n    * label_index: 每个标签（情绪或情感）与其分配索引的映射，例如label_index['neutral']=0。\n```python\nimport pickle\ndata, W, vocab, word_idx_map, max_sentence_length, label_index = pickle.load(open(filepath, 'rb'))\n```\n\n* **text_glove_average_emotion.pkl、text_glove_average_sentiment.pkl** - 包含每个话语的300维文本特征向量，这些向量由该话语中所有标记的Glove嵌入取平均得到。这是一个列表，包含针对训练集、验证集和测试集的3个字典，每个字典以*dia_utt*格式索引，其中dia为对话ID，utt为话语ID。例如，train_text_avg_emb['0_0'].shape = (300, )。\n```python\nimport pickle\ntrain_text_avg_emb、val_text_avg_emb、test_text_avg_emb = pickle.load(open(filepath, 'rb'))\n```\n\n\n\n* **audio_embeddings_feature_selection_emotion.pkl、audio_embeddings_feature_selection_sentiment.pkl** - 包含每个话语的1611\u002F1422维音频特征向量，这些特征是为情绪\u002F情感分类训练得到的。这些特征最初是从[openSMILE](https:\u002F\u002Fwww.audeering.com\u002Fopensmile\u002F)中提取的，随后使用基于L2范数的特征选择方法，并通过SVM进行筛选。这是一个列表，包含针对训练集、验证集和测试集的3个字典，每个字典以*dia_utt*格式索引，其中dia为对话ID，utt为话语ID。例如，train_audio_emb['0_0'].shape = (1611, )或(1422, )。\n```python\nimport pickle\ntrain_audio_emb、val_audio_emb、test_audio_emb = pickle.load(open(filepath, 'rb'))\n```\n\n\n### 模型输出Pickle文件：\n\n* **text_glove_CNN_emotion.pkl、text_glove_CNN_sentiment.pkl** - 包含在基于CNN的[网络](https:\u002F\u002Fgithub.com\u002Fdennybritz\u002Fcnn-text-classification-tf)上训练后得到的100维文本特征，用于情绪\u002F情感分类。这是一个列表，包含针对训练集、验证集和测试集的3个字典，每个字典以*dia_utt*格式索引，其中dia为对话ID，utt为话语ID。例如，train_text_CNN_emb['0_0'].shape = (100, )。\n```python\nimport pickle\ntrain_text_CNN_emb、val_text_CNN_emb、test_text_CNN_emb = pickle.load(open(filepath, 'rb'))\n```\n\n\n* **text_emotion.pkl、text_sentiment.pkl** - 这些文件包含由单模态bcLSTM模型生成的上下文特征表示。每个话语的600维文本特征向量用于情绪\u002F情感分类，以按对话ID索引的字典形式存储。这是一个列表，包含针对训练集、验证集和测试集的3个字典。例如，train_text_emb['0'].shape = (33, 600)，其中33是单个对话中最多的话语数。话语较少的对话会用零向量填充。\n```python\nimport pickle\ntrain_text_emb、val_text_emb、test_text_emb = pickle.load(open(filepath, 'rb'))\n```\n\n\n* **audio_emotion.pkl、audio_sentiment.pkl** - 
这些文件包含由单模态bcLSTM模型生成的上下文特征表示。每个话语的300\u002F600维音频特征向量用于情绪\u002F情感分类，以按对话ID索引的字典形式存储。这是一个列表，包含针对训练集、验证集和测试集的3个字典。例如，train_audio_emb['0'].shape = (33, 300)或(33, 600)，其中33是单个对话中最多的话语数。话语较少的对话会用零向量填充。\n```python\nimport pickle\ntrain_audio_emb、val_audio_emb、test_audio_emb = pickle.load(open(filepath, 'rb'))\n```\n\n\n* **bimodal_sentiment.pkl** - 该文件包含由双模态bcLSTM模型生成的上下文特征表示。每个话语的600维双模态（文本、音频）特征向量用于情感分类，以按对话ID索引的字典形式存储。这是一个列表，包含针对训练集、验证集和测试集的3个字典。例如，train_bimodal_emb['0'].shape = (33, 600)，其中33是单个对话中最多的话语数。话语较少的对话会用零向量填充。\n```python\nimport pickle\ntrain_bimodal_emb、val_bimodal_emb、test_bimodal_emb = pickle.load(open(filepath, 'rb'))\n```\n\n## 原始数据描述\n- 数据集包含3个文件夹（.tar.gz文件）：train、dev和test；每个文件夹分别对应3个.csv文件中的语音片段。\n- 在任何一个文件夹中，原始数据中的每个视频片段都对应于相应.csv文件中的一个话语。视频片段的命名格式为：diaX1\\_uttX2.mp4，其中X1是对话ID，X2是话语ID，与相应.csv文件中的信息一致，用于标识特定的话语。\n- 例如，考虑train.tar.gz中的视频片段dia6_utt1.mp4。该视频片段对应的话语将在train_sent_emp.csv文件中，其Dialogue_ID为6，Utterance_ID为1，内容为*‘You liked it? You really liked it?’*。\n\n## 数据读取\n在’.\u002Futils\u002F’目录下提供了2个Python脚本：\n- read_meld.py \\- 显示MELD数据集中.csv文件中某话语对应的视频文件路径。\n- read_emorynlp \\- 显示多模态EmoryNLP情感识别数据集中.csv文件中某话语对应的视频文件路径。\n\n## 标签定义\n为了实验目的，所有标签均采用独热编码表示，其索引如下：\n- **情感** - {'neutral': 0, 'surprise': 1, 'fear': 2, 'sadness': 3, 'joy': 4, 'disgust': 5, 'anger': 6}。因此，情感为*‘joy’*的标签为[0., 0., 0., 0., 1., 0., 0.]\n- **情绪** - {'neutral': 0, 'positive': 1, 'negative': 2}。因此，情绪为*‘positive’*的标签为[0., 1., 0.]\n\n## 类别权重\n在情感分类的基线模型中，使用了以下类别权重。索引方式与上述相同。\n类别权重：[4.0, 15.0, 15.0, 3.0, 1.0, 6.0, 3.0]。\n\n## 运行基线模型\n\n请按照以下步骤运行基线模型：\n\n1. 从[这里](http:\u002F\u002Fweb.eecs.umich.edu\u002F~mihalcea\u002Fdownloads\u002FMELD.Features.Models.tar.gz)下载特征文件。\n2. 将这些特征文件复制到`.\u002Fdata\u002Fpickles\u002F`目录下。\n3. 要训练或测试基线模型，请运行文件：`baseline\u002Fbaseline.py`，命令如下：\n    - `python baseline.py -classify [Sentiment|Emotion] -modality [text|audio|bimodal] [-train|-test]`\n    - 示例命令：训练文本单模态的情感分类模型——`python baseline.py -classify Sentiment -modality text -train`\n    - 使用`python baseline.py -h`可获取参数帮助信息。\n4. 对于预训练模型，从[这里](http:\u002F\u002Fweb.eecs.umich.edu\u002F~mihalcea\u002Fdownloads\u002FMELD.Features.Models.tar.gz)下载模型权重，并将pickle文件放入`.\u002Fdata\u002Fmodels\u002F`目录中。\n\n## 引用\n如果您在研究中使用了本数据集，请引用以下论文：\n\nS. Poria, D. Hazarika, N. Majumder, G. Naik, E. Cambria, R. Mihalcea. MELD: 一种用于对话中情感识别的多模态多方数据集。ACL 2019。\n\nChen, S.Y., Hsu, C.C., Kuo, C.C. and Ku, L.W. 
EmotionLines: 一个多发言者对话的情感语料库。arXiv预印本arXiv:1802.08379 (2018)。\n\n# 多模态EmoryNLP情感识别数据集\n----------------------------------------------------\n## 描述\n多模态EmoryNLP情感检测数据集是在EmoryNLP情感检测数据集的基础上扩展和增强而创建的。它包含了EmoryNLP情感检测数据集中相同的对话实例，同时还增加了音频和视觉模态信息，与文本信息共同构成多模态数据。多模态EmoryNLP数据集中包含来自电视剧《老友记》的800多个对话和9000多个话语。对话中有多个说话者参与。每个话语都被标注为以下七种情感之一——中性、喜悦、平静、强大、恐惧、愤怒和悲伤。这些标注直接沿用了原始数据集的内容。\n### 数据集统计信息\n| 统计指标                      | Train   | Dev     | Test    |\n|---------------------------------|---------|---------|---------|\n| 模态数量                   | {a,v,t} | {a,v,t} | {a,v,t} |\n| 唯一词数               | 9,744  | 2,123   | 2,345   |\n| 平均话语长度           | 7.86    | 6.97    | 7.79    |\n| 最大话语长度           | 78      | 60      | 61      |\n| 每个场景平均情感数量 | 4.10    | 4.00    | 4.40    |\n| 对话数量                  | 659    | 89     | 79     |\n| 话语数量                 | 7551    | 954    | 984    |\n| 说话者数量                   | 250     | 46      | 48     |\n| 情感转换次数              | 4596    | 575     | 653    |\n| 平均话语时长   | 5.55s   | 5.46s   | 5.27s   |\n\n### 数据分布\n\n|          | Train | Dev | Test |\n|----------|-------|-----|------|\n| 喜悦   | 1677  | 205 | 217  |\n| 愤怒      | 785   | 97  | 86   |\n| 中性  | 2485  | 322 | 288  |\n| 平静 | 638   | 82  | 111  |\n| 强大 | 551   | 70  | 96   |\n| 悲伤 | 474   | 51  | 70   |\n| 恐惧 | 941   | 127 | 116  |\n\n## 数据\n本数据集的视频片段可以从[此链接](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1UQduKw8QTqGf3RafxrTDfI1NyInYK3fr\u002Fview?usp=sharing)下载。\n标注文件可在https:\u002F\u002Fgithub.com\u002FSenticNet\u002FMELD\u002Ftree\u002Fmaster\u002Fdata\u002Femorynlp中找到。共有3个.csv文件。这些csv文件的第一列每条记录对应一个话语，其相应的视频片段可以在[此处](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1UQduKw8QTqGf3RafxrTDfI1NyInYK3fr\u002Fview?usp=sharing)找到。每个话语及其视频片段均按季号、集号、场景编号和话语编号进行索引。例如，**sea1\\_ep2\\_sc6\\_utt3.mp4**表示该片段对应第1季第2集第6场景第3话语。一个场景即为一段对话。这种索引方式与原始数据集保持一致。csv文件和视频文件均按照原始数据集的划分方式分为训练集、验证集和测试集。标注直接沿用了原始的EmoryNLP数据集（Zahiri等，2018）。\n\n### .csv 文件说明\n\n#### 列规范\n| 列名         | 说明                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |\n|--------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|\n| Utterance    | EmoryNLP 数据集中以字符串形式表示的单个话语。                                                                                                                                                                                                                                                                                                                                                                                                                                                       
\n\n### .csv File Description\n\n#### Column Specification\n| Column | Description |\n|--------------|-------------|\n| Utterance | An individual utterance from the EmoryNLP dataset, represented as a string. |\n| Speaker | Name of the speaker associated with the utterance. |\n| Emotion | The emotion (Neutral, Joyful, Peaceful, Powerful, Scared, Mad, or Sad) expressed by the speaker in the utterance. |\n| Scene_ID | Index of the dialogue, starting from 0. |\n| Utterance_ID | Index of the particular utterance within the dialogue, starting from 0. |\n| Season | The season of *Friends* to which the utterance belongs. |\n| Episode | The episode, within the given season, to which the utterance belongs. |\n| StartTime | Start time of the utterance in the given episode, in the format \"hh:mm:ss,ms\". |\n| EndTime | End time of the utterance in the given episode, in the format \"hh:mm:ss,ms\". |
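\n\nThe \"hh:mm:ss,ms\" fields follow the SubRip-style subtitle timestamp convention, so an utterance's duration can be computed directly from StartTime and EndTime. A minimal sketch (the `parse_timestamp` helper and the sample timestamps are our own illustration, not shipped with the dataset):\n\n```python\nfrom datetime import timedelta\n\n# Convert an SRT-style \"hh:mm:ss,ms\" timestamp into a timedelta.\ndef parse_timestamp(ts):\n    hms, millis = ts.split(\",\")\n    hours, minutes, seconds = map(int, hms.split(\":\"))\n    return timedelta(hours=hours, minutes=minutes, seconds=seconds, milliseconds=int(millis))\n\nstart = parse_timestamp(\"00:20:57,256\")\nend = parse_timestamp(\"00:20:59,049\")\nprint((end - start).total_seconds())  # utterance duration in seconds: 1.793\n```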
\n\n***Note***: Owing to some inconsistencies in the subtitles, we could not find the start and end times of a few utterances; these utterances have been removed from the dataset. Users are nevertheless encouraged to locate the corresponding utterances in the original dataset and generate video clips for them.\n## Citation\nPlease cite the following papers if you use this dataset in your research:\n\nS. Zahiri and J. D. Choi. Emotion Detection on TV Show Transcripts with Sequence-Based Convolutional Neural Networks. In the AAAI Workshop on Affective Content Analysis, AFFCON'18, 2018.\n\nS. Poria, D. Hazarika, N. Majumder, G. Naik, E. Cambria, R. Mihalcea. MELD: A Multimodal Multi-Party Dataset for Emotion Recognition in Conversation. ACL 2019.","# MELD Quick Start Guide\n\nMELD (Multimodal EmotionLines Dataset) is a multimodal, multi-party dataset for emotion recognition in conversation. Built from the TV series *Friends*, it provides text, audio, and visual modalities and suits research on emotion dynamics and context modeling in multi-turn dialogue.\n\n## Environment Setup\n\nBefore starting, make sure your environment meets the following requirements:\n\n*   **Operating system**: Linux (Ubuntu 18.04+ recommended) or macOS; Windows users should consider WSL2.\n*   **Disk space**: the raw archive is roughly 6 GB+ and expands further when extracted; reserve at least 20 GB.\n*   **Prerequisites**:\n    *   `wget`: to download the dataset.\n    *   `tar`: to extract the data.\n    *   Python 3.7+ (for data processing and analysis).\n    *   PyTorch or TensorFlow (only if you plan to run the official baseline models; see the [conv-emotion](https:\u002F\u002Fgithub.com\u002Fdeclare-lab\u002Fconv-emotion) repository for its exact dependencies).\n\n## Installation\n\n### 1. Download the dataset\n\nDownload the raw archive from the University of Michigan server with `wget`. *(The server is hosted overseas; if downloads are slow from your region, use a tool that supports resuming interrupted transfers or configure a network proxy.)*\n\n```bash\nwget http:\u002F\u002Fweb.eecs.umich.edu\u002F~mihalcea\u002Fdownloads\u002FMELD.Raw.tar.gz\n```\n\n### 2. Extract the data\n\nAfter the download finishes, extract the archive to obtain the video and audio clips.\n\n```bash\ntar -xzvf MELD.Raw.tar.gz\n```\n\n### 3. Get the annotations\n\nThe annotations (CSV files) are hosted on GitHub. Clone the whole repository, or download just the `data` directory.\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002Fdeclare-lab\u002FMELD.git\n```\n*The annotation files live under `MELD\u002Fdata\u002FMELD\u002F`.*\n\n## Basic Usage\n\nThe core workflow with MELD is aligning the video\u002Faudio clips with the CSV annotations. The example below reads the annotation file and locates the corresponding media file.\n\n### Data layout\nAfter extraction the layout is typically:\n*   `train\u002F`, `dev\u002F`, `test\u002F`: `.mp4` clips (audio + video) or separate `.wav` files, organized by dialogue.\n*   `data\u002FMELD\u002F*.csv`: the key fields `Dialogue_ID`, `Utterance_ID`, `Emotion`, `Speaker`, and so on.\n\n### Reading the annotations in Python\n\nThe following code loads the training annotations and illustrates what each field means:\n\n```python\nimport os\n\nimport pandas as pd\n\n# Paths (adjust to wherever you extracted the data)\ncsv_path = 'MELD\u002Fdata\u002FMELD\u002Ftrain_sent_emo.csv'\nmedia_root = 'MELD.Raw\u002Ftrain'  # assumes the extracted folder is named MELD.Raw\n\n# Load the annotation file\ndf = pd.read_csv(csv_path)\n\n# Inspect the first few rows\nprint(df.head())\n\n# Example: details of the first record\nfirst_row = df.iloc[0]\ndialogue_id = first_row['Dialogue_ID']\nutterance_id = first_row['Utterance_ID']\nemotion = first_row['Emotion']\nspeaker = first_row['Speaker']\ntext = first_row['Utterance']\n\nprint(\"\\nSample info:\")\nprint(f\"Dialogue ID: {dialogue_id}\")\nprint(f\"Utterance ID: {utterance_id}\")\nprint(f\"Speaker: {speaker}\")\nprint(f\"Emotion label: {emotion}\")\nprint(f\"Text: {text}\")\n\n# Build the path of the matching media clip. Released archives commonly name\n# clips like dia{Dialogue_ID}_utt{Utterance_ID}.mp4, but verify against the\n# actual file names in your extracted folders.\nfile_name = f\"dia{dialogue_id}_utt{utterance_id}.mp4\"\nfile_path = os.path.join(media_root, file_name)\n\nif os.path.exists(file_path):\n    print(f\"Media file: {file_path}\")\nelse:\n    print(\"No matching media file found; check the directory layout and naming.\")\n```\n\n### Label taxonomy\nThe dataset uses the following 7 emotion labels:\n*   `anger`\n*   `disgust`\n*   `fear`\n*   `joy`\n*   `neutral`\n*   `sadness`\n*   `surprise`\n\nEach record additionally carries a sentiment-polarity annotation (`sentiment`): `positive`, `negative`, or `neutral`.
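\n\nAs a quick sanity check after loading, you can tabulate both label columns; a minimal sketch reusing `df` from the example above (in `train_sent_emo.csv` the polarity column is typically capitalized as `Sentiment`):\n\n```python\n# Distribution of the 7 emotion labels in the loaded split\nprint(df['Emotion'].value_counts())\n\n# Distribution of the 3 sentiment polarities\nprint(df['Sentiment'].value_counts())\n```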
\n\n> **Tip**: To reproduce the latest SOTA results or use the pre-extracted visual (ResNet) features, visit the officially maintained baseline repository: [conv-emotion](https:\u002F\u002Fgithub.com\u002Fdeclare-lab\u002Fconv-emotion).","A customer-service team is building a support bot that senses a user's emotional state in real time and adapts its reply strategy, to improve satisfaction in complex complaint scenarios.\n\n### Without MELD\n- **Single-modality misreads**: relying on text alone, the model misses sarcasm carried by tone of voice (the words say \"great\" while the voice is angry), keeping emotion-classification accuracy low.\n- **Lost context in multi-party dialogue**: in group chats or call hand-offs, the system struggles to separate different speakers' emotional shifts, routinely attributes them to the wrong person, and cannot track a given user's emotional trajectory.\n- **Weak generalization to real scenes**: trained on clean synthetic data, the model degrades sharply on Friends-style natural conversations with background noise, overlapping speech, and abrupt emotional swings.\n- **Sluggish at emotional turns**: it fails to catch sudden reversals within a dialogue (say, calm flipping to fear), so the bot responds late and compounds the user's frustration.\n\n### With MELD\n- **Multimodal fusion reads emotion accurately**: with MELD's aligned audio, visual, and text data, the model combines intonation and facial expression to surface the real anger behind sarcastic words.\n- **Independent per-speaker emotion tracking**: the dataset's explicit multi-speaker annotations let the system cleanly separate each participant's emotional stream and pinpoint exactly who raised the complaint in a group chat.\n- **Robust to real-world noise**: trained on 13,000+ utterances containing genuine background sound and natural interruptions, the bot stays stable in a noisy call-center environment.\n- **Catches emotional swings as they happen**: drawing on the dataset's rich emotion-shift samples, the model responds promptly when a user jumps from neutral to surprise or sadness and switches to a soothing strategy at once.\n\nBy supplying high-quality multimodal, multi-party conversational data, MELD moves emotion-computing models from merely reading words to genuinely hearing how people feel.","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fdeclare-lab_MELD_1491652a.png","declare-lab","Deep Cognition and Language Research (DeCLaRe) Lab","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Fdeclare-lab_27a2ce73.png","",null,"https:\u002F\u002Fdeclare-lab.github.io","https:\u002F\u002Fgithub.com\u002Fdeclare-lab",[84],{"name":85,"color":86,"percentage":87},"Python","#3572A5",100,1024,232,"2026-04-01T07:09:46","GPL-3.0",4,"Not specified",{"notes":95,"python":93,"dependencies":96},"The README mainly introduces the dataset's (MELD) background, statistics, and download instructions; it does not specify a concrete runtime environment, dependency list, or hardware requirements. The data spans audio, video, and text modalities; the raw archive is downloaded via wget (several GB), and the annotation files live in the data directory of the GitHub repository. To run the related baseline models, see the conv-emotion or COSMIC project links mentioned in the text.",[],[35,37,15],[99,100,101,102,103,104,105,106,107,108,109,110,111,112],"emotion-recognition","sentiment-analysis","multimodal-sentiment-analysis","multimodal-interactions","dialogue-systems","conversational-ai","chatbot","personality-traits","personality-profiling","emotion","dialogue","emotion-detection","multimodal-emotion-recognition","emotion-recognition-in-conversation","2026-03-27T02:49:30.150509","2026-04-06T05:16:47.870169",[116,121,126,131,136,141],{"id":117,"question_zh":118,"answer_zh":119,"source_url":120},13056,"What should I do if downloading the training data (train.tar.gz) fails with an 'unexpected end of file' or 'Unexpected EOF' error?","This usually means the download was incomplete or the source link has gone stale. The maintainers provide a backup download on Google Drive; you can fetch the complete training data with:\n\nwget --load-cookies \u002Ftmp\u002Fcookies.txt \"https:\u002F\u002Fdocs.google.com\u002Fuc?export=download&confirm=$(wget --quiet --save-cookies \u002Ftmp\u002Fcookies.txt --keep-session-cookies --no-check-certificate 'https:\u002F\u002Fdocs.google.com\u002Fuc?export=download&id=1LgBQDIUPmx0SiMOGKxM4v33ig9CSO_Ps' -O- | sed -rn 's\u002F.*confirm=([0-9A-Za-z_]+).*\u002F\\1\\n\u002Fp')&id=1LgBQDIUPmx0SiMOGKxM4v33ig9CSO_Ps\" -O train.tar.gz && rm -rf \u002Ftmp\u002Fcookies.txt\n\nor visit the backup link directly: https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1LgBQDIUPmx0SiMOGKxM4v33ig9CSO_Ps\u002Fview","https:\u002F\u002Fgithub.com\u002Fdeclare-lab\u002FMELD\u002Fissues\u002F5",{"id":122,"question_zh":123,"answer_zh":124,"source_url":125},13057,"What should I do if the download links for the MELD raw data or feature files are unreachable (e.g. expired bit.ly links)?","Some of the short links have expired; try the following official or community-verified alternatives:\n1. Raw data: https:\u002F\u002Fweb.eecs.umich.edu\u002F~mihalcea\u002Fdownloads\u002FMELD.Raw.tar.gz\n2. Features: https:\u002F\u002Fweb.eecs.umich.edu\u002F~mihalcea\u002Fdownloads\u002FMELD.Features.Models.tar.gz\n3. Google Drive backup folder: https:\u002F\u002Fdrive.google.com\u002Fdrive\u002Ffolders\u002F1y4nj9rBMHyEvfLNMcpoKsXm9cGdAcZpy\nIf connections to the University of Michigan server time out, retry later or go through a network proxy.","https:\u002F\u002Fgithub.com\u002Fdeclare-lab\u002FMELD\u002Fissues\u002F54",{"id":127,"question_zh":128,"answer_zh":129,"source_url":130},13058,"How do I fix an 'unexpected keyword argument' error or other version incompatibilities when loading the pretrained model?","This is usually a Keras version mismatch; users report that downgrading Keras to 2.1.5 resolves it. Note also that reproducing the paper's results requires training with class weights; the maintainers have said the corresponding code will be published in a later update.","https:\u002F\u002Fgithub.com\u002Fdeclare-lab\u002FMELD\u002Fissues\u002F7",{"id":132,"question_zh":133,"answer_zh":134,"source_url":135},13059,"How do I run the pretrained model on my own audio files to extract emotion and sentiment labels?","baseline.py loads preprocessed .pkl feature files by default rather than raw audio. To use your own audio, extract the features first:\n1. Audio features: extract audio functionals with the OpenSmile toolkit.\n2. Text features: use GloVe embeddings followed by dimensionality reduction (see the sketch below).\n3. Feature selection: scikit-learn's feature-selection module is a useful reference.\nOnce the features are extracted and packed into a comparable .pkl file, they can be fed to the pretrained model.
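\nFor step 2, a minimal sketch of the usual recipe (our illustration, not the repository's exact pipeline) is to average pretrained GloVe vectors over each utterance's tokens:\n\nimport numpy as np\n\n# glove: dict mapping token -> np.ndarray, e.g. loaded from glove.6B.300d.txt\ndef utterance_vector(tokens, glove, dim=300):\n    vecs = [glove[t] for t in tokens if t in glove]\n    return np.mean(vecs, axis=0) if vecs else np.zeros(dim)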
","https:\u002F\u002Fgithub.com\u002Fdeclare-lab\u002FMELD\u002Fissues\u002F12",{"id":137,"question_zh":138,"answer_zh":139,"source_url":140},13060,"Why does running baseline.py raise an error about the Embedding layer's input_length argument, or a dimension mismatch?","This is a known bug. Around line 124 of baseline.py, the Embedding layer's input_length argument is mistakenly set to the sequence length (sequence_length) when it should be the sentence length (sentence_length). The maintainers have confirmed and fixed the issue; if you hit a similar error (such as a negative value in the second dimension), change that argument to the sentence length by hand.","https:\u002F\u002Fgithub.com\u002Fdeclare-lab\u002FMELD\u002Fissues\u002F6",{"id":142,"question_zh":143,"answer_zh":144,"source_url":145},13061,"Why does the test_model function throw 'TypeError: Keyword argument not understood: input'?","This is an API change across TensorFlow\u002FKeras versions. Newer Keras releases require the keyword `inputs` rather than `input` when constructing a Model. Change:\n`Model(input=model.input, output=model.get_layer(\"utter\").output)`\nto:\n`Model(inputs=model.input, outputs=model.get_layer(\"utter\").output)`\nand check whether `output` likewise needs to become `outputs` (this depends on your exact version).","https:\u002F\u002Fgithub.com\u002Fdeclare-lab\u002FMELD\u002Fissues\u002F23",[]]