[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"tool-Natooz--MidiTok":3,"similar-Natooz--MidiTok":191},{"id":4,"github_repo":5,"name":6,"description_en":7,"description_zh":8,"ai_summary_zh":8,"readme_en":9,"readme_zh":10,"quickstart_zh":11,"use_case_zh":12,"hero_image_url":13,"owner_login":14,"owner_name":15,"owner_avatar_url":16,"owner_bio":17,"owner_company":18,"owner_location":19,"owner_email":20,"owner_twitter":21,"owner_website":22,"owner_url":23,"languages":24,"stars":33,"forks":34,"last_commit_at":35,"license":36,"difficulty_score":37,"env_os":38,"env_gpu":38,"env_ram":38,"env_deps":39,"category_tags":45,"github_topics":49,"view_count":57,"oss_zip_url":20,"oss_zip_packed_at":20,"status":58,"created_at":59,"updated_at":60,"faqs":61,"releases":90},6482,"Natooz\u002FMidiTok","MidiTok","MIDI \u002F symbolic music tokenizers for Deep Learning models 🎶","MidiTok 是一款专为深度学习模型设计的 Python 库，旨在将 MIDI 和 abc 格式的音乐文件高效转化为模型可理解的令牌（Token）序列。它主要解决了音乐生成、转录及音乐信息检索任务中，原始乐谱数据难以直接被 Transformer 等主流 AI 架构处理的难题，实现了从乐谱到数字序列的无缝转换与还原。\n\n这款工具非常适合从事音乐 AI 开发的工程师、研究人员以及希望尝试音乐生成项目的开发者使用。MidiTok 的核心亮点在于其高度的灵活性与兼容性：它不仅内置了 REMI、Compound Word 等多种主流音乐令牌化方案，还支持通过字节对编码（BPE）、Unigram 和 WordPiece 等算法训练自定义词表，以优化特定数据集的表现。此外，MidiTok 深度集成了 Hugging Face 生态，利用其高性能后端实现极速编码，并支持直接将训练好的令牌化器上传至社区共享。配合 PyTorch 等框架，用户只需几行代码即可完成数据加载、增强及模型训练流程，极大地降低了音乐人工智能项目的开发门槛。","# MidiTok\n\nPython package to tokenize music files, introduced at the ISMIR 2021 LBDs.\n\n![MidiTok Logo](docs\u002Fassets\u002Fmiditok_logo_stroke.png?raw=true \"\")\n\n[![PyPI version fury.io](https:\u002F\u002Fbadge.fury.io\u002Fpy\u002Fmiditok.svg)](https:\u002F\u002Fpypi.python.org\u002Fpypi\u002Fmiditok\u002F)\n[![Python 3.9](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fpython-≥3.9-blue.svg)](https:\u002F\u002Fwww.python.org\u002Fdownloads\u002Frelease\u002F)\n[![Documentation 
Status](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FNatooz_MidiTok_readme_6bf48b3e9a6d.png)](https:\u002F\u002Fmiditok.readthedocs.io\u002Fen\u002Flatest\u002F?badge=latest)\n[![GitHub CI](https:\u002F\u002Fgithub.com\u002FNatooz\u002FMidiTok\u002Factions\u002Fworkflows\u002Fpytest.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002FNatooz\u002FMidiTok\u002Factions\u002Fworkflows\u002Fpytest.yml)\n[![Codecov](https:\u002F\u002Fimg.shields.io\u002Fcodecov\u002Fc\u002Fgithub\u002FNatooz\u002FMidiTok)](https:\u002F\u002Fcodecov.io\u002Fgh\u002FNatooz\u002FMidiTok)\n[![GitHub license](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Flicense\u002FNatooz\u002FMidiTok.svg)](https:\u002F\u002Fgithub.com\u002FNatooz\u002FMidiTok\u002Fblob\u002Fmain\u002FLICENSE)\n[![Downloads](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FNatooz_MidiTok_readme_6f46ea8c1944.png)](https:\u002F\u002Fpepy.tech\u002Fproject\u002FMidiTok)\n[![Code style](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fcode%20style-ruff-000000.svg)](https:\u002F\u002Fgithub.com\u002Fastral-sh\u002Fruff)\n\nMidiTok can tokenize MIDI and abc files, i.e. convert them into sequences of tokens ready to be fed to models such as Transformer, for any generation, transcription or MIR task.\nMidiTok features most known [music tokenizations](https:\u002F\u002Fmiditok.readthedocs.io\u002Fen\u002Flatest\u002Ftokenizations.html) (e.g. [REMI](https:\u002F\u002Farxiv.org\u002Fabs\u002F2002.00212), [Compound Word](https:\u002F\u002Farxiv.org\u002Fabs\u002F2101.02402)...), and is built around the idea that they all share common parameters and methods. Tokenizers can be trained with [Byte Pair Encoding (BPE)](https:\u002F\u002Faclanthology.org\u002F2023.emnlp-main.123\u002F), [Unigram](https:\u002F\u002Faclanthology.org\u002FP18-1007\u002F) and [WordPiece](https:\u002F\u002Farxiv.org\u002Fabs\u002F1609.08144), and it offers data augmentation methods.\n\nMidiTok is integrated with the Hugging Face Hub 🤗! 
Don't hesitate to share your models with the community!\n\n**Documentation:** [miditok.readthedocs.io](https:\u002F\u002Fmiditok.readthedocs.io\u002Fen\u002Flatest\u002Findex.html)\n\n## Install\n\n```shell\npip install miditok\n```\nMidiTok uses [Symusic](https:\u002F\u002Fgithub.com\u002FYikai-Liao\u002Fsymusic) to read and write MIDI and abc files, and BPE\u002FUnigram is backed by [Hugging Face 🤗tokenizers](https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Ftokenizers) for superfast encoding.\n\n## Usage example\n\nTokenizing and detokenizing can be done by calling the tokenizer:\n\n```python\nfrom miditok import REMI, TokenizerConfig\nfrom symusic import Score\n\n# Create a multitrack tokenizer; read the doc to explore all the parameters\nconfig = TokenizerConfig(num_velocities=16, use_chords=True, use_programs=True)\ntokenizer = REMI(config)\n\n# Load a MIDI file, convert it to tokens, and convert the tokens back to a MIDI\nmidi = Score(\"path\u002Fto\u002Fyour_midi.mid\")\ntokens = tokenizer(midi)  # calling the tokenizer automatically detects MIDIs, paths and tokens\nconverted_back_midi = tokenizer(tokens)  # PyTorch, TensorFlow and NumPy tensors are supported\n```\n\nHere is a complete yet concise example of how you can use MidiTok to train any PyTorch model. 
And [here](colab-notebooks\u002FExample_HuggingFace_Mistral_Transformer.ipynb) is a simple notebook example showing how to use Hugging Face models to generate music, with MidiTok taking care of tokenizing music files.\n\n```python\nfrom miditok import REMI, TokenizerConfig\nfrom miditok.pytorch_data import DatasetMIDI, DataCollator\nfrom miditok.utils import split_files_for_training\nfrom torch.utils.data import DataLoader\nfrom pathlib import Path\n\n# Creating a multitrack tokenizer, read the doc to explore all the parameters\nconfig = TokenizerConfig(num_velocities=16, use_chords=True, use_programs=True)\ntokenizer = REMI(config)\n\n# Train the tokenizer with Byte Pair Encoding (BPE)\nfiles_paths = list(Path(\"path\", \"to\", \"midis\").glob(\"**\u002F*.mid\"))\ntokenizer.train(vocab_size=30000, files_paths=files_paths)\ntokenizer.save(Path(\"path\", \"to\", \"save\", \"tokenizer.json\"))\n# And pushing it to the Hugging Face hub (you can download it back with .from_pretrained)\ntokenizer.push_to_hub(\"username\u002Fmodel-name\", private=True, token=\"your_hf_token\")\n\n# Split MIDIs into smaller chunks for training\ndataset_chunks_dir = Path(\"path\", \"to\", \"midi_chunks\")\nsplit_files_for_training(\n    files_paths=files_paths,\n    tokenizer=tokenizer,\n    save_dir=dataset_chunks_dir,\n    max_seq_len=1024,\n)\n\n# Create a Dataset, a DataLoader and a collator to train a model\ndataset = DatasetMIDI(\n    files_paths=list(dataset_chunks_dir.glob(\"**\u002F*.mid\")),\n    tokenizer=tokenizer,\n    max_seq_len=1024,\n    bos_token_id=tokenizer[\"BOS_None\"],\n    eos_token_id=tokenizer[\"EOS_None\"],\n)\ncollator = DataCollator(tokenizer.pad_token_id, copy_inputs_as_labels=True)\ndataloader = DataLoader(dataset, batch_size=64, collate_fn=collator)\n\n# Iterate over the dataloader to train a model\nfor batch in dataloader:\n    print(\"Train your model on this batch...\")\n```\n\n## Tokenizations\n\nMidiTok implements the tokenizations: (links to original 
papers)\n* [REMI](https:\u002F\u002Fdl.acm.org\u002Fdoi\u002F10.1145\u002F3394171.3413671)\n* [REMI+](https:\u002F\u002Fopenreview.net\u002Fforum?id=NyR8OZFHw6i)\n* [MIDI-Like](https:\u002F\u002Flink.springer.com\u002Farticle\u002F10.1007\u002Fs00521-018-3758-9)\n* [TSD](https:\u002F\u002Farxiv.org\u002Fabs\u002F2301.11975)\n* [Structured](https:\u002F\u002Farxiv.org\u002Fabs\u002F2107.05944)\n* [CPWord](https:\u002F\u002Fojs.aaai.org\u002Findex.php\u002FAAAI\u002Farticle\u002Fview\u002F16091)\n* [Octuple](https:\u002F\u002Faclanthology.org\u002F2021.findings-acl.70)\n* [MuMIDI](https:\u002F\u002Fdl.acm.org\u002Fdoi\u002F10.1145\u002F3394171.3413721)\n* [MMM](https:\u002F\u002Farxiv.org\u002Fabs\u002F2008.06048)\n* [PerTok](https:\u002F\u002Fwww.arxiv.org\u002Fabs\u002F2410.02060)\n\nYou can find short presentations of each in the [documentation](https:\u002F\u002Fmiditok.readthedocs.io\u002Fen\u002Flatest\u002Ftokenizations.html).\n\n## Contributions\n\nContributions are gratefully welcomed; feel free to open an issue or send a PR if you want to add a tokenization or speed up the code. You can read the [contribution guide](CONTRIBUTING.md) for details.\n\n### Todos\n\n* Support MusicXML files;\n* `no_duration_drums` option, discarding duration tokens for drum notes;\n* Control Change messages;\n* Speed up global\u002Ftrack event parsing with Rust or C++ bindings.\n\n## Citation\n\nIf you use MidiTok for your research, a citation in your manuscript would be greatly appreciated. 
❤️\n\n[**[MidiTok paper]**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2310.17202)\n[**[MidiTok original ISMIR publication]**](https:\u002F\u002Farchives.ismir.net\u002Fismir2021\u002Flatebreaking\u002F000005.pdf)\n```bibtex\n@inproceedings{miditok2021,\n    title={{MidiTok}: A Python package for {MIDI} file tokenization},\n    author={Fradet, Nathan and Briot, Jean-Pierre and Chhel, Fabien and El Fallah Seghrouchni, Amal and Gutowski, Nicolas},\n    booktitle={Extended Abstracts for the Late-Breaking Demo Session of the 22nd International Society for Music Information Retrieval Conference},\n    year={2021},\n    url={https:\u002F\u002Farchives.ismir.net\u002Fismir2021\u002Flatebreaking\u002F000005.pdf},\n}\n```\n\nThe BibTeX citations of all tokenizations can be found [in the documentation](https:\u002F\u002Fmiditok.readthedocs.io\u002Fen\u002Flatest\u002Fcitations.html).\n\n## Acknowledgments\n\n@Natooz thanks his employers, who allowed him to develop this project; in chronological order: [Aubay](https:\u002F\u002Fblog.aubay.com\u002Findex.php\u002Flanguage\u002Fen\u002Fhome\u002F?lang=en), the [LIP6 (Sorbonne University)](https:\u002F\u002Fwww.lip6.fr\u002F?LANG=en), and the [Metacreation Lab (Simon Fraser University)](https:\u002F\u002Fwww.metacreation.net).\n\n## All Thanks To Our Contributors\n\n\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FNatooz\u002FMidiTok\u002Fgraphs\u002Fcontributors\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FNatooz_MidiTok_readme_b26a476e86d3.png\" \u002F>\n\u003C\u002Fa>\n","# MidiTok\n\n用于对音乐文件进行分词的 Python 包，在 ISMIR 2021 LBDs 上首次发布。\n\n![MidiTok Logo](docs\u002Fassets\u002Fmiditok_logo_stroke.png?raw=true \"\")\n\n[![PyPI version fury.io](https:\u002F\u002Fbadge.fury.io\u002Fpy\u002Fmiditok.svg)](https:\u002F\u002Fpypi.python.org\u002Fpypi\u002Fmiditok\u002F)\n[![Python 
3.9](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fpython-≥3.9-blue.svg)](https:\u002F\u002Fwww.python.org\u002Fdownloads\u002Frelease\u002F)\n[![Documentation Status](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FNatooz_MidiTok_readme_6bf48b3e9a6d.png)](https:\u002F\u002Fmiditok.readthedocs.io\u002Fen\u002Flatest\u002F?badge=latest)\n[![GitHub CI](https:\u002F\u002Fgithub.com\u002FNatooz\u002FMidiTok\u002Factions\u002Fworkflows\u002Fpytest.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002FNatooz\u002FMidiTok\u002Factions\u002Fworkflows\u002Fpytest.yml)\n[![Codecov](https:\u002F\u002Fimg.shields.io\u002Fcodecov\u002Fc\u002Fgithub\u002FNatooz\u002FMidiTok)](https:\u002F\u002Fcodecov.io\u002Fgh\u002FNatooz\u002FMidiTok)\n[![GitHub license](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Flicense\u002FNatooz\u002FMidiTok.svg)](https:\u002F\u002Fgithub.com\u002FNatooz\u002FMidiTok\u002Fblob\u002Fmain\u002FLICENSE)\n[![Downloads](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FNatooz_MidiTok_readme_6f46ea8c1944.png)](https:\u002F\u002Fpepy.tech\u002Fproject\u002FMidiTok)\n[![Code style](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fcode%20style-ruff-000000.svg)](https:\u002F\u002Fgithub.com\u002Fastral-sh\u002Fruff)\n\nMidiTok 可以对 MIDI 和 abc 文件进行分词，即将其转换为可直接输入到 Transformer 等模型中的标记序列，适用于各种生成、转录或 MIR 任务。MidiTok 支持大多数已知的音乐分词方法（例如 [REMI](https:\u002F\u002Farxiv.org\u002Fabs\u002F2002.00212)、[Compound Word](https:\u002F\u002Farxiv.org\u002Fabs\u002F2101.02402)...），并且基于这些方法共享通用参数和方法的理念构建。分词器可以使用 [Byte Pair Encoding (BPE)](https:\u002F\u002Faclanthology.org\u002F2023.emnlp-main.123\u002F)、[Unigram](https:\u002F\u002Faclanthology.org\u002FP18-1007\u002F) 和 [WordPiece](https:\u002F\u002Farxiv.org\u002Fabs\u002F1609.08144) 进行训练，同时还提供数据增强方法。\n\nMidiTok 已集成到 Hugging Face Hub 🤗！请随时与社区分享您的模型！\n\n**文档：** [miditok.readthedocs.com](https:\u002F\u002Fmiditok.readthedocs.io\u002Fen\u002Flatest\u002Findex.html)\n\n## 安装\n\n```shell\npip install 
miditok\n```\nMidiTok 使用 [Symusic](https:\u002F\u002Fgithub.com\u002FYikai-Liao\u002Fsymusic) 来读取和写入 MIDI 和 abc 文件，而 BPE\u002FUnigram 则由 [Hugging Face 🤗tokenizers](https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Ftokenizers) 提供支持，实现超快速编码。\n\n## 使用示例\n\n可以通过调用分词器来完成分词和反分词操作：\n\n```python\nfrom miditok import REMI, TokenizerConfig\nfrom symusic import Score\n\n# 创建一个多轨分词器，阅读文档以了解所有参数\nconfig = TokenizerConfig(num_velocities=16, use_chords=True, use_programs=True)\ntokenizer = REMI(config)\n\n# 加载一个 MIDI 文件，将其转换为标记，然后再转换回 MIDI\nmidi = Score(\"path\u002Fto\u002Fyour_midi.mid\")\ntokens = tokenizer(midi)  # 调用分词器会自动检测 MIDI 文件、路径和标记\nconverted_back_midi = tokenizer(tokens)  # 支持 PyTorch、Tensorflow 和 Numpy 张量\n```\n\n以下是一个完整但简洁的示例，展示了如何使用 MidiTok 训练任何 PyTorch 模型。而 [这里](colab-notebooks\u002FExample_HuggingFace_Mistral_Transformer.ipynb) 是一个简单的笔记本示例，说明如何使用 Hugging Face 模型生成音乐，MidiTok 负责对音乐文件进行分词。\n\n```python\nfrom miditok import REMI, TokenizerConfig\nfrom miditok.pytorch_data import DatasetMIDI, DataCollator\nfrom miditok.utils import split_files_for_training\nfrom torch.utils.data import DataLoader\nfrom pathlib import Path\n\n# 创建一个多轨分词器，阅读文档以了解所有参数\nconfig = TokenizerConfig(num_velocities=16, use_chords=True, use_programs=True)\ntokenizer = REMI(config)\n\n# 使用 Byte Pair Encoding (BPE) 训练分词器\nfiles_paths = list(Path(\"path\", \"to\", \"midis\").glob(\"**\u002F*.mid\"))\ntokenizer.train(vocab_size=30000, files_paths=files_paths)\ntokenizer.save(Path(\"path\", \"to\", \"save\", \"tokenizer.json\"))\n# 并将其推送到 Hugging Face hub（您可以使用 .from_pretrained 方法重新下载）\ntokenizer.push_to_hub(\"username\u002Fmodel-name\", private=True, token=\"your_hf_token\")\n\n# 将 MIDI 文件拆分为更小的块以进行训练\ndataset_chunks_dir = Path(\"path\", \"to\", \"midi_chunks\")\nsplit_files_for_training(\n    files_paths=files_paths,\n    tokenizer=tokenizer,\n    save_dir=dataset_chunks_dir,\n    max_seq_len=1024,\n)\n\n# 创建数据集、数据加载器和数据整理器以训练模型\ndataset = DatasetMIDI(\n    
files_paths=list(dataset_chunks_dir.glob(\"**\u002F*.mid\")),\n    tokenizer=tokenizer,\n    max_seq_len=1024,\n    bos_token_id=tokenizer[\"BOS_None\"],\n    eos_token_id=tokenizer[\"EOS_None\"],\n)\ncollator = DataCollator(tokenizer.pad_token_id, copy_inputs_as_labels=True)\ndataloader = DataLoader(dataset, batch_size=64, collate_fn=collator)\n\n# 遍历数据加载器以训练模型\nfor batch in dataloader:\n    print(\"在这一批数据上训练您的模型...\")\n```\n\n## 分词方法\n\nMidiTok 实现了以下分词方法（链接至原始论文）：\n* [REMI](https:\u002F\u002Fdl.acm.org\u002Fdoi\u002F10.1145\u002F3394171.3413671)\n* [REMI+](https:\u002F\u002Fopenreview.net\u002Fforum?id=NyR8OZFHw6i)\n* [MIDI-Like](https:\u002F\u002Flink.springer.com\u002Farticle\u002F10.1007\u002Fs00521-018-3758-9)\n* [TSD](https:\u002F\u002Farxiv.org\u002Fabs\u002F2301.11975)\n* [Structured](https:\u002F\u002Farxiv.org\u002Fabs\u002F2107.05944)\n* [CPWord](https:\u002F\u002Fojs.aaai.org\u002Findex.php\u002FAAAI\u002Farticle\u002Fview\u002F16091)\n* [Octuple](https:\u002F\u002Faclanthology.org\u002F2021.findings-acl.70)\n* [MuMIDI](https:\u002F\u002Fdl.acm.org\u002Fdoi\u002F10.1145\u002F3394171.3413721)\n* [MMM](https:\u002F\u002Farxiv.org\u002Fabs\u002F2008.06048)\n* [PerTok](https:\u002F\u002Fwww.arxiv.org\u002Fabs\u002F2410.02060)\n\n您可以在 [文档](https:\u002F\u002Fmiditok.readthedocs.io\u002Fen\u002Flatest\u002Ftokenizations.html) 中找到简短介绍。\n\n## 贡献\n\n我们非常欢迎贡献，如果您想添加新的分词方法或加快代码速度，请随时提出问题或发送 PR。有关详细信息，请参阅 [贡献指南](CONTRIBUTING.md)。\n\n### 待办事项\n\n* 支持 music-xml 文件；\n* `no_duration_drums` 选项，忽略鼓音符的时长标记；\n* 控制变化消息；\n* 使用 Rust 或 C++ 绑定加速全局\u002F轨道事件解析。\n\n## 引用\n\n如果您在研究中使用 MidiTok，我们非常乐意在您的论文中看到引用。❤️\n\n[**[MidiTok 论文]**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2310.17202)\n[**[MidiTok 的 ISMIR 原始发表文章]**](https:\u002F\u002Farchives.ismir.net\u002Fismir2021\u002Flatebreaking\u002F000005.pdf)\n```bibtex\n@inproceedings{miditok2021,\n    title={{MidiTok}: A Python package for {MIDI} file tokenization},\n    author={Fradet, Nathan and Briot, Jean-Pierre and Chhel, Fabien 
and El Fallah Seghrouchni, Amal and Gutowski, Nicolas},\n    booktitle={Extended Abstracts for the Late-Breaking Demo Session of the 22nd International Society for Music Information Retrieval Conference},\n    year={2021},\n    url={https:\u002F\u002Farchives.ismir.net\u002Fismir2021\u002Flatebreaking\u002F000005.pdf},\n}\n```\n\n所有分词方法的 BibTeX 引用都可以在 [文档](https:\u002F\u002Fmiditok.readthedocs.io\u002Fen\u002Flatest\u002Fcitations.html) 中找到。\n\n## 致谢\n\n@Natooz 感谢其雇主允许他开展该项目，按时间先后顺序分别为 [Aubay](https:\u002F\u002Fblog.aubay.com\u002Findex.php\u002Flanguage\u002Fen\u002Fhome\u002F?lang=en)、[LIP6（索邦大学）](https:\u002F\u002Fwww.lip6.fr\u002F?LANG=en) 以及 [Metacreation Lab（西蒙菲莎大学）](https:\u002F\u002Fwww.metacreation.net)。\n\n## 衷心感谢所有贡献者\n\n\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FNatooz\u002FMidiTok\u002Fgraphs\u002Fcontributors\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FNatooz_MidiTok_readme_b26a476e86d3.png\" \u002F>\n\u003C\u002Fa>","# MidiTok 快速上手指南\n\nMidiTok 是一个用于将 MIDI 和 abc 音乐文件转换为令牌（Token）序列的 Python 库，专为音乐生成、转录及音乐信息检索（MIR）任务设计。它支持多种主流音乐令牌化方案（如 REMI、Compound Word 等），并可结合 BPE、Unigram 等算法进行词表训练，完美适配 Transformer 等深度学习模型。\n\n## 环境准备\n\n*   **操作系统**：Linux, macOS, Windows\n*   **Python 版本**：≥ 3.9\n*   **核心依赖**：\n    *   `symusic`：用于高效读写 MIDI 和 abc 文件。\n    *   `tokenizers` (Hugging Face)：用于加速 BPE\u002FUnigram 编码。\n    *   深度学习框架（可选）：PyTorch, TensorFlow 或 NumPy（用于处理张量数据）。\n\n## 安装步骤\n\n使用 pip 直接安装最新稳定版：\n\n```shell\npip install miditok\n```\n\n> **提示**：国内开发者若遇到下载缓慢问题，可使用清华或阿里镜像源加速安装：\n> ```shell\n> pip install miditok -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple\n> ```\n\n## 基本使用\n\n### 1. 
快速令牌化与还原\n\n以下示例展示如何加载 MIDI 文件，将其转换为令牌序列，并还原回 MIDI 文件。MidiTok 自动支持 PyTorch、TensorFlow 和 NumPy 张量格式。\n\n```python\nfrom miditok import REMI, TokenizerConfig\nfrom symusic import Score\n\n# 配置分词器：设置速度层级、启用和弦识别、启用乐器程序识别\nconfig = TokenizerConfig(num_velocities=16, use_chords=True, use_programs=True)\ntokenizer = REMI(config)\n\n# 加载 MIDI 文件\nmidi = Score(\"path\u002Fto\u002Fyour_midi.mid\")\n\n# 转换为令牌序列\ntokens = tokenizer(midi)  # 自动检测输入类型并转换\n\n# 将令牌序列还原为 MIDI 对象\nconverted_back_midi = tokenizer(tokens)\n```\n\n### 2. 训练自定义分词器并构建数据集\n\n若需训练生成模型，可先使用 BPE 算法训练专属分词器，随后构建 PyTorch DataLoader。\n\n```python\nfrom miditok import REMI, TokenizerConfig\nfrom miditok.pytorch_data import DatasetMIDI, DataCollator\nfrom miditok.utils import split_files_for_training\nfrom torch.utils.data import DataLoader\nfrom pathlib import Path\n\n# 初始化分词器配置\nconfig = TokenizerConfig(num_velocities=16, use_chords=True, use_programs=True)\ntokenizer = REMI(config)\n\n# 1. 训练分词器 (使用 BPE 算法)\nfiles_paths = list(Path(\"path\", \"to\", \"midis\").glob(\"**\u002F*.mid\"))\ntokenizer.train(vocab_size=30000, files_paths=files_paths)\ntokenizer.save(Path(\"path\", \"to\", \"save\", \"tokenizer.json\"))\n\n# (可选) 推送到 Hugging Face Hub\n# tokenizer.push_to_hub(\"username\u002Fmodel-name\", private=True, token=\"your_hf_token\")\n\n# 2. 切割长 MIDI 文件以适应训练序列长度\ndataset_chunks_dir = Path(\"path\", \"to\", \"midi_chunks\")\nsplit_files_for_training(\n    files_paths=files_paths,\n    tokenizer=tokenizer,\n    save_dir=dataset_chunks_dir,\n    max_seq_len=1024,\n)\n\n# 3. 构建 Dataset 和 DataLoader\ndataset = DatasetMIDI(\n    files_paths=list(dataset_chunks_dir.glob(\"**\u002F*.mid\")),\n    tokenizer=tokenizer,\n    max_seq_len=1024,\n    bos_token_id=tokenizer[\"BOS_None\"],\n    eos_token_id=tokenizer[\"EOS_None\"],\n)\ncollator = DataCollator(tokenizer.pad_token_id, copy_inputs_as_labels=True)\ndataloader = DataLoader(dataset, batch_size=64, collate_fn=collator)\n\n# 4. 
开始训练循环\nfor batch in dataloader:\n    print(\"Train your model on this batch...\")\n    # 在此处添加你的模型训练代码\n```","一位音乐 AI 研究员正试图利用 Transformer 架构训练一个能够生成多风格钢琴曲的模型，需要将大量原始 MIDI 文件转化为模型可理解的序列数据。\n\n### 没有 MidiTok 时\n- **格式转换繁琐**：需要手动编写复杂的解析脚本提取音符、时长和力度信息，不同编码标准的 MIDI 文件极易导致程序报错。\n- **表征方式单一**：难以灵活切换 REMI、Compound Word 等主流音乐令牌化策略，每次尝试新算法都要重构整个数据预处理流水线。\n- **词汇表效率低下**：缺乏内置的 BPE 或 Unigram 训练机制，生成的令牌词汇表过于庞大且稀疏，严重拖慢模型收敛速度。\n- **数据增强困难**：无法便捷地对音乐序列进行变调或节奏微调等增强操作，导致训练数据多样性不足，模型容易过拟合。\n\n### 使用 MidiTok 后\n- **一键令牌化**：只需几行代码即可将 MIDI 文件自动转换为标准化的令牌序列，并支持无损还原为可播放的 MIDI 文件，大幅降低开发门槛。\n- **策略灵活切换**：通过修改配置参数即可在 REMI 等多种成熟令牌化方案间无缝切换，快速验证哪种表征最适合当前生成任务。\n- **智能词汇压缩**：直接调用内置的 BPE 训练功能，从海量曲目中学习最优子词单元，将词汇量控制在合理范围，显著提升训练效率。\n- **原生数据增强**：利用工具自带的数据增强方法轻松扩充数据集，让模型接触到更多样的音乐形态，生成结果更具创造力和鲁棒性。\n\nMidiTok 通过标准化音乐数据的“翻译”过程，让研究者能从繁琐的底层解析中解放出来，专注于音乐生成模型的核心架构创新。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002FNatooz_MidiTok_cc10b27b.png","Natooz","Nathan Fradet","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002FNatooz_b6904063.jpg","AI Researcher working on Music Generation","@NuMindAI","Paris",null,"NathanFradet","nathanfradet.com","https:\u002F\u002Fgithub.com\u002FNatooz",[25,29],{"name":26,"color":27,"percentage":28},"Python","#3572A5",96.2,{"name":30,"color":31,"percentage":32},"Jupyter Notebook","#DA5B0B",3.8,858,99,"2026-04-09T10:35:36","MIT",1,"未说明",{"notes":40,"python":41,"dependencies":42},"该工具主要用于将 MIDI 和 abc 文件转换为令牌序列，以便输入到 Transformer 等模型中。它本身不强制要求 GPU，但后续连接的深度学习模型（如 PyTorch 模型）可能需要 GPU 支持。支持通过 Hugging Face Hub 分享和下载预训练的分词器。未来计划支持 Rust 或 C++ 绑定以提升解析速度。",">=3.9",[43,44],"symusic","tokenizers",[46,47,48],"开发框架","其他","音频",[50,51,52,53,54,55,56],"midi","music","deep-learning","generative-model","music-generation","music-information-retrieval","machine-learning",2,"ready","2026-03-27T02:49:30.150509","2026-04-11T10:02:43.672502",[62,67,72,77,82,86],{"id":63,"question_zh":64,"answer_zh":65,"source_url":66},29335,"遇到 'ValueError: invalid literal for int() with base 10' 错误（例如值为 
'Ignore'）该如何解决？","该问题已在代码库中修复，但尚未发布到最新的 PyPI 包版本中。解决方法是直接从 GitHub 安装最新版本的 MidiTok：\n\npip install git+https:\u002F\u002Fgithub.com\u002FNatooz\u002FMidiTok\n\n此错误通常发生在模型输出 NoteOn token 后未跟随 Velocity token，或者解析特定 MIDI 事件时。","https:\u002F\u002Fgithub.com\u002FNatooz\u002FMidiTok\u002Fissues\u002F15",{"id":68,"question_zh":69,"answer_zh":70,"source_url":71},29336,"如何显著提升 MIDI 文件的解析和预处理速度？","MidiTok V3 版本已集成更快的后端支持。建议使用 `symusic` 库替代传统的 `mido` 或 `miditoolkit` 进行读取。\n\n基准测试显示，在 Maestro 数据集上，使用 symusic 比 miditoolkit 快约 320 倍。升级到 MidiTok 3.0.0 并结合 symusic 后端后，端到端的 tokenization 速度可从秒级（~900ms）提升至毫秒级（~70ms）。\n\n确保安装最新版本的 MidiTok 以启用这些性能优化。","https:\u002F\u002Fgithub.com\u002FNatooz\u002FMidiTok\u002Fissues\u002F112",{"id":73,"question_zh":74,"answer_zh":75,"source_url":76},29337,"为什么使用 `tokenize_midi_dataset` 函数比手动脚本慢很多？是否支持多核并行？","旧版本的 `tokenize_midi_dataset` 可能未充分利用所有 CPU 核心，导致速度远慢于手动实现的并行脚本（例如使用 `ProcessPoolExecutor`）。\n\n维护者已发布新版本，引入了 `DatasetMIDI` 类和新的 MIDI 分割方法来解决此性能瓶颈。请升级到最新版本以获得更好的多核支持和数据处理效率。如果仍需使用旧版本，建议参考社区提供的基于 `concurrent.futures` 的手动并行处理脚本作为临时方案。","https:\u002F\u002Fgithub.com\u002FNatooz\u002FMidiTok\u002Fissues\u002F147",{"id":78,"question_zh":79,"answer_zh":80,"source_url":81},29338,"在使用 REMI 分词且开启 `use_time_signatures=True` 时，为什么会出现大量重复的小节（Bar）和拍号（TimeSig）标记？","这是由于在连续的小节中，如果拍号没有发生变化，逻辑上不应重复添加拍号标记，但某些实现细节可能导致了冗余。\n\n虽然这是一个已知行为，但在处理数据稀缺的场景下影响有限。如果需要严格去重，可以在分词后处理 token 序列，或者参考 `REMI.add_time_events` 方法的逻辑，根据拍号变化来确定每小节的 tick 数，从而在生成阶段优化事件添加逻辑。","https:\u002F\u002Fgithub.com\u002FNatooz\u002FMidiTok\u002Fissues\u002F74",{"id":83,"question_zh":84,"answer_zh":85,"source_url":81},29339,"如何将包含跨小节音符的 MIDI 文件转换为 MusicXML 或其他格式而不丢失结构信息？","建议在将 MIDI 转换为 MusicXML 之前，先对音符进行分割处理。\n\n具体步骤如下：\n1. 根据拍号（Time Signature）的变化检测小节边界。\n2. 分析每个音符，如果其持续时间跨越了两个小节，则将其拆分为两个独立的音符。\n3. 
可以参考 MidiTok 中 `REMI.add_time_events` 方法的实现逻辑，该方法能根据拍号计算每小节的 tick 数量，有助于确定分割点。\n\n这种预处理可以确保转换后的格式（如 MusicXML）结构正确，避免长音符导致的对齐问题。",{"id":87,"question_zh":88,"answer_zh":89,"source_url":71},29340,"MidiTok V3 相比 V2 在性能上有哪些具体提升？","MidiTok V3 通过引入更高效的底层解析后端（如 symusic），在多个数据集上实现了显著的性能飞跃：\n\n1. **Maestro 数据集**：端到端分词时间从 V2 的约 947ms (REMI) 降低至 V3 的 73ms，提速超过 10 倍。\n2. **MetaMIDI 数据集**：处理时间从约 300ms 降低至 41ms。\n3. **POP909 数据集**：同样表现出数量级的速度提升。\n\n此外，V3 还改进了数据集加载机制（如 `DatasetMIDI`），更好地支持并行处理，解决了旧版本中 `tokenize_midi_dataset` 函数效率低下的问题。",[91,96,101,106,111,116,121,126,131,136,141,146,151,156,161,166,171,176,181,186],{"id":92,"version":93,"summary_zh":94,"released_at":95},198107,"v3.0.6.post1","## 变更内容\n* 由 @JLenzy 在 https:\u002F\u002Fgithub.com\u002FNatooz\u002FMidiTok\u002Fpull\u002F243 中修复了与旧版 PerTok 的向后兼容性问题。","2025-07-22T12:40:21",{"id":97,"version":98,"summary_zh":99,"released_at":100},198108,"v3.0.6","## 变更内容\n\n* 修复构建目标，并在 GitHub Actions 中使用 hatch，由 @Natooz 在 https:\u002F\u002Fgithub.com\u002FNatooz\u002FMidiTok\u002Fpull\u002F223 中完成\n* 将 codecov\u002Fcodecov-action 从 5.3.1 升级到 5.4.0，由 @dependabot 在 https:\u002F\u002Fgithub.com\u002FNatooz\u002FMidiTok\u002Fpull\u002F226 中完成\n* 修复加载训练好的 `PerTok` 分词器时的 bug，由 @HuwCheston 在 https:\u002F\u002Fgithub.com\u002FNatooz\u002FMidiTok\u002Fpull\u002F229 中完成\n* 将 codecov\u002Fcodecov-action 从 5.4.0 升级到 5.4.2，由 @dependabot 在 https:\u002F\u002Fgithub.com\u002FNatooz\u002FMidiTok\u002Fpull\u002F230 中完成\n* 修复 `PerTok` 解码（MIDI 转换）中 `use_sustain_pedals=True` 时的错误，由 @mimbres 在 https:\u002F\u002Fgithub.com\u002FNatooz\u002FMidiTok\u002Fpull\u002F232 中完成\n* 更新 split.py 文件，由 @FilippoGalli001 在 https:\u002F\u002Fgithub.com\u002FNatooz\u002FMidiTok\u002Fpull\u002F233 中完成\n* 将 codecov\u002Fcodecov-action 从 5.4.2 升级到 5.4.3，由 @dependabot 在 https:\u002F\u002Fgithub.com\u002FNatooz\u002FMidiTok\u002Fpull\u002F235 中完成\n* 修复：使音高范围的上限值包含在内，由 @Natooz 在 https:\u002F\u002Fgithub.com\u002FNatooz\u002FMidiTok\u002Fpull\u002F239 中完成\n* 修复 
`pitch_intervals_max_time_dist` 的类型注解，并对浮点数值进行四舍五入，由 @Natooz 在 https:\u002F\u002Fgithub.com\u002FNatooz\u002FMidiTok\u002Fpull\u002F240 中完成\n* PerTok：用位置标记代替时间偏移，由 @JLenzy 在 https:\u002F\u002Fgithub.com\u002FNatooz\u002FMidiTok\u002Fpull\u002F236 中完成\n* CI 修复，由 @Natooz 在 https:\u002F\u002Fgithub.com\u002FNatooz\u002FMidiTok\u002Fpull\u002F241 中完成\n\n## 新贡献者\n\n* @HuwCheston 在 https:\u002F\u002Fgithub.com\u002FNatooz\u002FMidiTok\u002Fpull\u002F229 中完成了首次贡献\n* @mimbres 在 https:\u002F\u002Fgithub.com\u002FNatooz\u002FMidiTok\u002Fpull\u002F232 中完成了首次贡献\n* @FilippoGalli001 在 https:\u002F\u002Fgithub.com\u002FNatooz\u002FMidiTok\u002Fpull\u002F233 中完成了首次贡献\n\n**完整变更日志**：https:\u002F\u002Fgithub.com\u002FNatooz\u002FMidiTok\u002Fcompare\u002Fv3.0.5...v3.0.6","2025-06-22T09:45:22",{"id":102,"version":103,"summary_zh":104,"released_at":105},198109,"v3.0.5.post1","## 变更内容\n\n* 修复构建目标，并在 GitHub Actions 中使用 hatch，由 @Natooz 在 https:\u002F\u002Fgithub.com\u002FNatooz\u002FMidiTok\u002Fpull\u002F223 中完成\n","2025-02-17T08:38:55",{"id":107,"version":108,"summary_zh":109,"released_at":110},198110,"v3.0.5","## 变更内容\n\n* 修复了在最新 Hugging Face Hub 包更新后导入 `HfHubHTTPError` 的问题，由 @Natooz 在 https:\u002F\u002Fgithub.com\u002FNatooz\u002FMidiTok\u002Fpull\u002F199 中完成。\n* MDTK_200：实现了 `add_trailing_bars` 功能，由 @Mintas 在 https:\u002F\u002Fgithub.com\u002FNatooz\u002FMidiTok\u002Fpull\u002F204 中完成。\n* 移除了文档中对 `split_midis_for_training` 的引用，由 @Zaka 在 https:\u002F\u002Fgithub.com\u002FNatooz\u002FMidiTok\u002Fpull\u002F205 中完成。\n* 捕获了在 `MIDILike` 中解码力度值时的异常，由 @Natooz 在 https:\u002F\u002Fgithub.com\u002FNatooz\u002FMidiTok\u002Fpull\u002F210 中完成。\n* 更新了示例笔记本的引用，由 @emmanuel-ferdman 在 https:\u002F\u002Fgithub.com\u002FNatooz\u002FMidiTok\u002Fpull\u002F216 中完成。\n* 修复了训练初始字母表的 bug，由 @Natooz 在 https:\u002F\u002Fgithub.com\u002FNatooz\u002FMidiTok\u002Fpull\u002F220 中完成。\n* 向 `augment_score` 函数添加了一个 `augment_copy` 参数，由 @pstrepetov 在 
https:\u002F\u002Fgithub.com\u002FNatooz\u002FMidiTok\u002Fpull\u002F221 中完成。\n\n## 新贡献者\n\n* @Mintas 在 https:\u002F\u002Fgithub.com\u002FNatooz\u002FMidiTok\u002Fpull\u002F204 中完成了他们的首次贡献。\n* @Zaka 在 https:\u002F\u002Fgithub.com\u002FNatooz\u002FMidiTok\u002Fpull\u002F205 中完成了他们的首次贡献。\n* @emmanuel-ferdman 在 https:\u002F\u002Fgithub.com\u002FNatooz\u002FMidiTok\u002Fpull\u002F216 中完成了他们的首次贡献。\n* @pstrepetov 在 https:\u002F\u002Fgithub.com\u002FNatooz\u002FMidiTok\u002Fpull\u002F221 中完成了他们的首次贡献。\n\n**完整变更日志**：https:\u002F\u002Fgithub.com\u002FNatooz\u002FMidiTok\u002Fcompare\u002Fv3.0.4...v3.0.5","2025-02-16T18:32:10",{"id":112,"version":113,"summary_zh":114,"released_at":115},198111,"v3.0.4","本次发布引入了由 Lemonaide AI 提供的 `PerTok` 分词器、属性控制标记以及一些小修复。\n\n## 亮点\n\n### PerTok：高性能分词器\n\n（相关论文即将发布）\n\n由 Julian Lenz (@JLenzy) 在 [Lemonaide AI](https:\u002F\u002Fwww.lemonaide.ai) 开发，旨在捕捉乐谱符号中的表现性时值，同时保持极低的序列长度。它通过将时间差分为宏观和微观两类，并引入了一种新的 MicroTime 标记类型来实现这一目标。细微的非量化节拍偏差则由这些 Timeshift 标记来表示。\n\n此外，PerTok 还允许您通过在 `TokenizerConfig` 的 `beat_res` 参数中启用多个重叠值，来编码无限数量的音符细分。未来更新中，微时值标记将扩展到所有分词器。\n\n### 属性控制标记\n\n属性控制标记是额外的标记，用于训练模型，以便在推理过程中通过强制模型预测具有特定特征的音乐来对其进行控制。\n\n## 变更内容\n* @briane412 在 https:\u002F\u002Fgithub.com\u002FNatooz\u002FMidiTok\u002Fpull\u002F164 中对 Example_HuggingFace_Mistral_Transformer.ipynb 的更新\n* @Natooz 在 https:\u002F\u002Fgithub.com\u002FNatooz\u002FMidiTok\u002Fpull\u002F165 中将 `_model_name` 设为受保护属性\n* @Natooz 在 https:\u002F\u002Fgithub.com\u002FNatooz\u002FMidiTok\u002Fpull\u002F167 中修复了分词器训练的相关文档\n* @Natooz 在 https:\u002F\u002Fgithub.com\u002FNatooz\u002FMidiTok\u002Fpull\u002F168 中为拆分标记序列时添加了默认的 `continuing_subword_prefix`\n* @shenranwang 在 https:\u002F\u002Fgithub.com\u002FNatooz\u002FMidiTok\u002Fpull\u002F170 中修复了 MIDI 预分词中的一个小 bug\n* @Natooz 在 https:\u002F\u002Fgithub.com\u002FNatooz\u002FMidiTok\u002Fpull\u002F172 中添加了 `no_preprocess_score` 参数用于分词\n* @Natooz 在 https:\u002F\u002Fgithub.com\u002FNatooz\u002FMidiTok\u002Fpull\u002F173 中实现了 `TokSequence` 
的可加性，并为 MMM 添加了 `concatenate_track_sequences` 参数\n* @Natooz 在 https:\u002F\u002Fgithub.com\u002FNatooz\u002FMidiTok\u002Fpull\u002F175 中更新了文档\n* @Natooz 在 https:\u002F\u002Fgithub.com\u002FNatooz\u002FMidiTok\u002Fpull\u002F177 中修复了空文件（无音轨和\u002F或无音符）的拆分方法\n* @Natooz 在 https:\u002F\u002Fgithub.com\u002FNatooz\u002FMidiTok\u002Fpull\u002F180 中将 logo 改为带有白色外框\n* @helloWorld199 在 https:\u002F\u002Fgithub.com\u002FNatooz\u002FMidiTok\u002Fpull\u002F181 中添加了属性控制功能\n* @Natooz 在 https:\u002F\u002Fgithub.com\u002FNatooz\u002FMidiTok\u002Fpull\u002F182 中更好地区分了 `one_token_stream` 和 `config.one_token_stream_for_programs`\n* @Natooz 在 https:\u002F\u002Fgithub.com\u002FNatooz\u002FMidiTok\u002Fpull\u002F183 中确保在 tokenizer_training_iterator.py 中按小节\u002F拍子拆分 MMM 标记序列时不会将其拼接在一起\n* @scottclowe 在 https:\u002F\u002Fgithub.com\u002FNatooz\u002FMidiTok\u002Fpull\u002F184 中修复了 rST 文档\n* @dependabot 在 https:\u002F\u002Fgithub.com\u002FNatooz\u002FMidiTok\u002Fpull\u002F185 中将 actions\u002Fstale 从 5.1.1 升级到 9.0.0\n* @dependabot 在 https:\u002F\u002Fgithub.com\u002FNatooz\u002FMidiTok\u002Fpull\u002F186 中将 actions\u002Fdownload-artifact 从 3 升级到 4\n* @dependabot 在 https:\u002F\u002Fgithub.com\u002FNatooz\u002FMidiTok\u002Fpull\u002F187 中将 codecov\u002Fcodecov-action 从 3.1.0 升级到 4.5.0\n* Bump a","2024-09-15T10:42:09",{"id":117,"version":118,"summary_zh":119,"released_at":120},198112,"v3.0.3","## 亮点\n\n* 支持 abc 文件，这些文件可以像 MIDI 文件一样使用 symusic 加载和转储；\n* 分词器现在也可以使用 **WordPiece** 和 **Unigram** 算法进行训练了！\n* 分词器的训练和标记 ID 编码现在可以按“小节”或“拍子”进行，这意味着分词器可以在严格的小节或拍子范围内，从基础标记的连续序列中学习新标记。这一设置由分词器配置中的 `encode_ids_split` 属性控制；\n* 现在需要 [symusic](https:\u002F\u002Fgithub.com\u002FYikai-Liao\u002Fsymusic) v0.4.3 或更高版本，以符合 `clip` 方法的使用要求；\n* 更好地处理了 `DatasetMIDI` 和 `DataCollator` 中的文件加载错误；\n* 引入了一个新的 `filter_dataset` 函数，用于在使用 MIDI\u002Fabc 数据集之前对其进行清理；\n* `MMM` 分词器已得到优化，现为完全模块化设计：它现在可以基于其他分词方式（`REMI`、`TSD` 和 `MIDILike`）工作，从而提供更高的灵活性和互操作性；\n* `TokSequence` 对象现在可以进行切片和拼接（例如 `seq3 = seq1[:50] + seq2[50:]`）；\n* 
由分词器分词得到的 `TokSequence` 对象现在可以按小节或拍子子序列进行分割；\n* 其他一些小修复、代码改进和清理工作；\n\n### 方法重命名\n\n此前，一些方法和属性的名称带有“bpe”和“midi”的字样。为了使其更符合这些方法更为通用的用途（支持多种文件格式和训练算法），现已将其更名为更加贴切且准确的名称。\n\n\u003Cdetails>\n  \u003Csummary>带有弃用警告的重命名方法：\u003C\u002Fsummary>\n\n* `midi_to_tokens` --> `encode`；\n* `tokens_to_midi` --> `decode`；\n* `learn_bpe` --> `train`；\n* `apply_bpe` --> `encode_token_ids`；\n* `decode_bpe` --> `decode_token_ids`；\n* `ids_bpe_encoded` --> `are_ids_encoded`；\n* `vocab_bpe` --> `vocab_model`。\n* `tokenize_midi_dataset` --> `tokenize_dataset`；\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n  \u003Csummary>不带弃用警告的重命名方法（使用较少，可减少代码混乱）：\u003C\u002Fsummary>\n\n* `MIDITokenizer` --> `MusicTokenizer`；\n* `augment_midi` --> `augment_score`；\n* `augment_midi_dataset` --> `augment_dataset`；\n* `augment_midi_multiple_offsets` --> `augment_score_multiple_offsets`；\n* `split_midis_for_training` --> `split_files_for_training`；\n* `split_midi_per_note_density` --> `split_score_per_note_density`；\n* `get_midi_programs` --> `get_score_programs`；\n* `merge_midis` --> `merge_scores`；\n* `get_midi_ticks_per_beat` --> `get_score_ticks_per_beat`；\n* `split_midi_per_ticks` --> `split_score_per_ticks`；\n* `split_midi_per_beats` --> `split_score_per_beats`；\n* `split_midi_per_tracks` --> `split_score_per_tracks`；\n* `concat_midis` --> `concat_scores`；\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n  \u003Csummary>受保护的内部方法（无弃用警告，适用于高级用法）：\u003C\u002Fsummary>\n\n* `MIDITokenizer._tokens_to_midi` --> `MusicTokenizer._tokens_to_score`；\n* `MIDITokenizer._midi_to_tokens` --> `MusicTokenizer._score_to_tokens`；\n* `MIDITokenizer._create_midi_","2024-04-25T12:50:05",{"id":122,"version":123,"summary_zh":124,"released_at":125},198113,"v3.0.2","## 简而言之\n\n此新版本引入了一个新的 `DatasetMIDI` 类，用于训练 PyTorch 模型。它基于之前名为 `DatasetTok` 的类，增加了预分词选项，并更好地处理 BOS 和 EOS 标记。此外，还新增了一个 `miditok.pytorch_data.split_midis_for_training` 方法，可以根据乐段的音符密度，动态地将 MIDI 文件切分成接近所需标记序列长度的小片段，从而在最大化数据使用量的同时进行模型训练。为此，还创建了一些新的工具方法，例如用于拆分、拼接或合并 `symusic.Score` 
# v3.0.2 (2024-03-24)

## TL;DR

This new version introduces a new `DatasetMIDI` class for training PyTorch models. It is based on the class previously named `DatasetTok`, adds a pre-tokenization option, and better handles BOS and EOS tokens. A new `miditok.pytorch_data.split_midis_for_training` method dynamically splits MIDI files into smaller chunks close to the desired token sequence length, based on their note densities, so that models can be trained while maximizing the amount of data used. New utility methods were created for this purpose, e.g. to split, concatenate or merge `symusic.Score` objects.

Thanks to @Kinyugo for guiding the development of these features through discussions and tests! (#147)

This update also brings a few minor fixes, and the [documentation](https://miditok.readthedocs.io/) has a brand-new theme!

## What's Changed

* `token_paths` renamed to `files_paths` and `config` to `model_config` by @sunsetsobserver in https://github.com/Natooz/MidiTok/pull/145
* Fix in Octuple when several different time signatures are present, by @ilya16 in https://github.com/Natooz/MidiTok/pull/146
* Notes outside the tokenizer's pitch range are now discarded when decoding pitch intervals, by @Natooz in https://github.com/Natooz/MidiTok/pull/149
* `save_pretrained` made compliant with huggingface_hub v0.21, by @Natooz in https://github.com/Natooz/MidiTok/pull/150
* Allow overriding the `_create_durations_tuples` method at init, by @JLenzy in https://github.com/Natooz/MidiTok/pull/153
* Refactor of the PyTorch data-loading classes and methods by @Natooz and @Kinyugo in https://github.com/Natooz/MidiTok/pull/148
* New documentation theme, using [furo](https://github.com/pradyunsg/furo)!

## New Contributors
* @sunsetsobserver made their first contribution in https://github.com/Natooz/MidiTok/pull/145
* @JLenzy made their first contribution in https://github.com/Natooz/MidiTok/pull/153

**Full Changelog**: https://github.com/Natooz/MidiTok/compare/v3.0.1...v3.0.2
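The idea behind `split_midis_for_training` — cutting a file into chunks whose token counts approach the desired sequence length — can be sketched as a greedy pass over per-bar token counts. This is illustrative only; the real method works on `symusic.Score` objects and estimates lengths from note densities:

```python
def chunk_bars(tokens_per_bar, max_seq_len):
    """Greedily group consecutive bars so each chunk stays within max_seq_len tokens.

    tokens_per_bar: estimated token count of each bar, in order.
    Returns a list of chunks, each a list of bar indices.
    """
    chunks, current, count = [], [], 0
    for bar_idx, n_tokens in enumerate(tokens_per_bar):
        # close the current chunk if adding this bar would exceed the budget
        if current and count + n_tokens > max_seq_len:
            chunks.append(current)
            current, count = [], 0
        current.append(bar_idx)
        count += n_tokens
    if current:
        chunks.append(current)
    return chunks


splits = chunk_bars([100, 200, 300, 250, 50], max_seq_len=512)
```

Each chunk then becomes one training sample close to (but never over) the target length, instead of truncating long files and wasting data.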
# v3.0.1 (2024-02-02)

## What's Changed
* Added a `use_pitchdrum_tokens` option to use dedicated `PitchDrum` tokens for drum tracks.
* Fix in time-signature preprocessing (mismatched time divisions) in https://github.com/Natooz/MidiTok/pull/132 (#131 @EterDelta).
* Fixed the data-augmentation example, and all MIDI file extensions are now considered when loading MIDI files, in https://github.com/Natooz/MidiTok/pull/136 (#135 @oiabtt).
* BPE is now automatically decoded before completing `tokens` during decoding, in https://github.com/Natooz/MidiTok/pull/138 (#137 @oiabtt).
* `load_tokens` now returns `TokSequence` objects, in https://github.com/Natooz/MidiTok/pull/139 (#137 @oiabtt).
* Chord maps are converted from lists back to tuples when loading a tokenizer from a saved config file, by @shenranwang in https://github.com/Natooz/MidiTok/pull/141.
* The `MIDITokenizer.from_pretrained` method can now be used like `AutoTokenizer` in the Hugging Face transformers library, in https://github.com/Natooz/MidiTok/pull/142 (discussed in #127 @oiabtt).

## New Contributors
* @shenranwang made their first contribution in https://github.com/Natooz/MidiTok/pull/141.

**Full Changelog**: https://github.com/Natooz/MidiTok/compare/v3.0.0...v3.0.1
# v3.0.0 (2024-01-17)

## Switching to symusic

This major version marks the switch from the [miditoolkit](https://github.com/YatingMusic/miditoolkit) MIDI reading/writing library to [**symusic**](https://github.com/Yikai-Liao/symusic), along with large optimizations of the MIDI preprocessing steps.

Symusic is a MIDI reading/writing library written in C++ with Python bindings, offering unmatched speed, [**up to 500 times faster than native Python libraries**](https://github.com/Natooz/MidiTok/issues/112#issuecomment-1895948962). It is based on [minimidi](https://github.com/lzqlzzq/minimidi). Both libraries are created and maintained by @Yikai-Liao and @lzqlzzq, who did an outstanding job — and the work continues, as many useful features are on the roadmap! 🫶

**Tokenizers from older versions are compatible with this new version, but some time-related differences may appear when comparing how MIDIs are tokenized and tokens are decoded.**

## Performance gains

These changes make MIDI loading, writing and tokenization much faster! **The overall tokenization process (i.e. loading a MIDI and tokenizing it) is** [**5 to 12 times faster**](https://github.com/Natooz/MidiTok/issues/112#issuecomment-1896286910), depending on the tokenizer and dataset. You can find more benchmark results [here](https://github.com/Natooz/MidiTok/issues/112#issuecomment-1895948962).

With such a speedup, it is no longer necessary to pre-tokenize MIDI files into JSON tokens as previously recommended; MIDI files can instead be **tokenized on the fly** while training or using models! The [usage examples](https://miditok.readthedocs.io/en/latest/examples.html) in the documentation have been updated accordingly, and the code is now simpler.

### Other major changes

* When using time signatures, time tokens are now computed in ticks per beat, instead of ticks per quarter note as previously. This change aligns with the definition of time and duration tokens; before, the MIDI specification was not correctly handled for note durations other than quarter notes (https://github.com/Natooz/MidiTok/pull/124);
* New lint rules and fixes for compliance, improving code quality (https://github.com/Natooz/MidiTok/pull/115);
* MidiTok still supports `miditoolkit.MidiFile` objects, but they are converted to `symusic.Score` objects at runtime, with a deprecation warning;
* Token-level data augmentation has been removed in favor of augmentation performed directly on MIDIs, which is faster, simpler, and handles durations better;
* Documentation fixes;
* The tokenization test workflows have been unified and considerably simplified, making test assertions more robust. The number of test cases and configurations has been increased while reducing the test run time.

### Other minor changes

* Special token values are set in TokenizerConf (https://github.com/Natooz/MidiTok/pull/114);
* README.md updated by @kalyani2003 (https://github.com/Natooz/MidiTok/pu…
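The "ticks per beat" change above comes down to the usual arithmetic relating a time signature's denominator to the beat length. A small sketch of that relation (miditok computes this per time-signature section internally; the helper name is illustrative):

```python
def ticks_per_beat(ticks_per_quarter: int, ts_denominator: int) -> int:
    """Length of one beat in ticks for a */ts_denominator time signature.

    A denominator of 4 makes the quarter note the beat, 8 the eighth note,
    2 the half note, and so on.
    """
    return ticks_per_quarter * 4 // ts_denominator


# With a resolution of 480 ticks per quarter note:
# in 4/4 the beat is a quarter note, in 6/8 an eighth note, in 2/2 a half note.
```

Under the old ticks-per-quarter convention, a `Duration` token in 6/8 would have been measured against 480 ticks instead of 240, which is the mismatch the release fixes.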
# v2.1.8 (2023-11-28)

This new version introduces a new additional token type: pitch intervals. It allows representing pitch intervals between simultaneous and successive notes. For more details on how it works, see the [documentation](https://miditok.readthedocs.io/en/v2.1.7/).

Tests and CI workflows have been greatly improved, fixing a few minor bugs along the way, together with some optimizations.

This new version also **drops support for Python 3.7 and now requires Python 3.8 or higher**. You can read more about this decision, and how to achieve backward compatibility, in the documentation.

**We encourage you to upgrade to the latest [miditoolkit](https://github.com/YatingMusic/miditoolkit) version**, which also includes fixes and improvements. Most notably, its dependencies have been cleaned up and it is **compatible with the latest versions of NumPy!**

## What's Changed

* Typo fixes in the documentation by @eltociear (#89), @gfggithubleet (#91 and #93), @shresthasurav (#94) and @THEFZNKHAN (#98 and #99).
* Fixed a bug when learning BPE without special tokens, by @Natooz in https://github.com/Natooz/MidiTok/pull/92.
* Switched lint/isort/format tooling to Ruff, by @akx in https://github.com/Natooz/MidiTok/pull/105.
* Added the pitch-interval option, by @Natooz in https://github.com/Natooz/MidiTok/pull/103.
* Switched packaging to pyproject.toml and hatch, by @Natooz in https://github.com/Natooz/MidiTok/pull/106.
* Fixed data augmentation, by @parneyw in https://github.com/Natooz/MidiTok/pull/109.
* Handled empty MIDI files, by @feiyuehchen in https://github.com/Natooz/MidiTok/pull/110.
* Improved tests and small optimizations, by @Natooz in https://github.com/Natooz/MidiTok/pull/108.

## New Contributors

* @eltociear made their first contribution in https://github.com/Natooz/MidiTok/pull/89.
* @gfggithubleet made their first contribution in https://github.com/Natooz/MidiTok/pull/91.
* @shresthasurav made their first contribution in https://github.com/Natooz/MidiTok/pull/94.
* @THEFZNKHAN made their first contribution in https://github.com/Natooz/MidiTok/pull/98.
* @akx made their first contribution in https://github.com/Natooz/MidiTok/pull/105.
* @parneyw made their first contribution in https://github.com/Natooz/MidiTok/pull/109.
* @feiyuehchen made their first contribution in https://github.com/Natooz/MidiTok/pull/110.

**Full Changelog**: https://github.com/Natooz/MidiTok/compare/v2.1.7...v2.1.8
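How interval tokens can stand in for absolute pitches on successive notes can be sketched as follows. The token names, the maximum-interval cutoff and the fallback rule here are illustrative assumptions, not miditok's exact encoding — see the documentation for the actual behavior:

```python
def pitch_interval_tokens(pitches, max_interval=16):
    """Encode successive MIDI pitches as interval tokens where possible.

    The first note keeps an absolute Pitch token; each following note is
    encoded as its signed distance to the previous one, falling back to an
    absolute Pitch token when the interval exceeds max_interval.
    Assumes a non-empty pitch list.
    """
    tokens = [f"Pitch_{pitches[0]}"]
    for prev, cur in zip(pitches, pitches[1:]):
        delta = cur - prev
        if abs(delta) <= max_interval:
            tokens.append(f"PitchIntervalTime_{delta}")
        else:
            tokens.append(f"Pitch_{cur}")
    return tokens


toks = pitch_interval_tokens([60, 64, 67, 90])
```

Encoding relative motion this way lets transposed passages share the same subword statistics, which is the motivation for the new token type.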
# v2.1.7 (2023-10-25)

**This release brings the integration of the Hugging Face Hub, along with a few important fixes and improvements!**

## What's Changed

* #87 Hugging Face Hub integration! You can now push and load MidiTok tokenizers from the Hugging Face Hub, using the `.from_pretrained` and `push_to_hub` methods as you would do for your models! Special thanks to @Wauplin and @julien-c for the help and support! 🤗🤗
* #80 (#78 @leleogere) Added a `func_to_get_labels` argument to `DatasetTok`, allowing it to retrieve labels when loading data;
* #81 (#74 @Chunyuan-Li) Fixed multi-stream decoding with several identical programs, along with fixes to the encoding/decoding of time signatures for bar-based tokenizers;
* #84 (#77 @VDT5702) Fix in `detect_chords` when checking whether to use unknown chords;
* #82 (#79 @leleogere) `tokenize_midi_dataset` now reproduces the file tree of the source files. This fixes files with the same name being overwritten by the previous behavior. You can also specify whether or not to overwrite files in the destination directory.

**Full Changelog**: https://github.com/Natooz/MidiTok/compare/v2.1.6...v2.1.7
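The file-tree reproduction of #82 boils down to re-rooting each source path under the destination directory, so that identically named files in different folders no longer collide. A sketch with `pathlib` (the helper name is hypothetical):

```python
from pathlib import Path


def dest_path(midi_path: Path, src_root: Path, dest_root: Path) -> Path:
    """Mirror the source tree under dest_root, one .json token file per MIDI."""
    return (dest_root / midi_path.relative_to(src_root)).with_suffix(".json")


# Two files both named "song.mid" now map to distinct output paths:
a = dest_path(Path("data/artist1/song.mid"), Path("data"), Path("tokens"))
b = dest_path(Path("data/artist2/song.mid"), Path("data"), Path("tokens"))
```

With a flat output directory, both inputs would have produced the same `song.json` path, and the second write would have silently overwritten the first.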
# v2.1.6 (2023-09-28)

### Changelog

* #72 (#71) Added a `program_change` config option that inserts `Program` tokens whenever an event comes from a different track than the previous one, mimicking MIDI `ProgramChange` messages. When this parameter is disabled (the default), a `Program` token prepends each track's program (as done in previous versions);
* #72 `MIDILike` decoding optimized;
* #72 Overlapping pitch bends are deduplicated during preprocessing;
* #72 New `tokenize_check_equals` test method and more test cases;
* #75 and #76 (#73 and #74 by @Chunyuan-Li) Fixed the time signature encoding/decoding workflows for `Bar`/`Position`-based tokenizers (`REMI`, `CPWord`, `Octuple`, `MMM`);
* #76 `Octuple` is now tested with time signatures disabled: as `TimeSig` tokens are only carried with notes, `Octuple` cannot accurately represent time signatures. If a time signature change occurs and the following bar contains no note, the time will be shifted by one or several bars, depending on the previous time signature's numerator and the gap between the last and current note. We do not recommend using `Octuple` with MIDIs containing several time signature changes (at least numerator changes);
* #76 `MMM` tokenization workflow speedup.
# v2.1.5 (2023-08-31)

### Changelog

* #69 bacea19e70ba596a05fbbcf9f2bf53beb9714540 Notes are now sorted in all cases when tokenizing, as MIDIs can contain unsorted notes;
* #70 (#68) New `one_token_stream_for_programs` parameter allowing all tracks of a MIDI to be treated as a single stream of tokens (adding `Program` tokens before `Pitch`/`NoteOn`...). This option is enabled by default and corresponds to the default behavior of previous versions. Disabling it allows having `Program` tokens in the vocabulary (`config.use_programs` enabled) while converting each track independently;
* #70 (#68) `TimeShift` and `Rest` tokens can now be created successively during tokenization, which happens when the tokenizer's largest `TimeShift`/`Rest` value isn't sufficient;
* #70 (#68) Rests are now represented in the same format as `TimeShift`s, and the `config.rest_range` parameter has been renamed `beat_res_rest` for simplicity and flexibility. The default value is `{(0, 1): 8, (1, 2): 4, (2, 12): 2}`;

**Full Changelog**: https://github.com/Natooz/MidiTok/compare/v2.1.4...v2.1.5

Thanks to @caenopy for reporting the bugs fixed here.

### Compatibility

* Tokenizers from previous versions with the `rest_range` parameter will be converted to the new `beat_res_rest` format.

# v2.1.4 (2023-08-25)

### Changelog

* @ilya16 2e1978f5c533b0989c2c4929f5e976511e06c6bb Fix in the `save_tokens` method, reading `kwargs` in the saved json file;
* #67 Added sustain-pedal and pitch-bend tokens for the `REMI`, `TSD` and `MIDILike` tokenizers.

### Compatibility

* `MMM` now adds additional tokens in the same order as other tokenizers, meaning previously saved `MMM` tokenizers with these tokens would need to be converted if needed.
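The `beat_res_rest` mapping of v2.1.5 reads as `{(start_beat, end_beat): samples_per_beat}`: rests up to one beat are sampled in eighths of a beat, rests between one and two beats in quarters, and so on. A sketch of the rest durations (in beats) such a mapping spans — the enumeration is illustrative, not miditok's internal token encoding:

```python
beat_res_rest = {(0, 1): 8, (1, 2): 4, (2, 12): 2}


def rest_durations(beat_res: dict) -> list:
    """All rest durations (in beats) covered by the mapping, shortest first."""
    durations = []
    for (start, end), res in sorted(beat_res.items()):
        for beat in range(start, end):
            for pos in range(res):
                if beat == 0 and pos == 0:
                    continue  # skip the zero-length rest
                durations.append(beat + pos / res)
    return durations


durations = rest_durations(beat_res_rest)
# finest rests are eighths of a beat; the longest rest is 11.5 beats
```

Longer rests than the maximum are then emitted as several successive `Rest` tokens, per the #70 change above.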
# v2.1.3 (2023-08-17)

This big update brings a few important changes and improvements.

### A new common tokenization workflow for all tokenizers

We now distinguish three types of tokens:
1. Global MIDI tokens, which represent attributes and events affecting the music globally, such as the tempo or time signature;
2. Track tokens, representing values of distinct tracks such as the notes, chords or effects;
3. Time tokens, which serve to structure and place the previous categories of tokens in time.

All tokenizations now follow the pattern:

1. Preprocess the MIDI;
2. Gather global MIDI events (tempo...);
3. Gather track events (notes, chords);
4. If "one token stream", concatenate all global and track events and sort them by time of occurrence. Else, concatenate the global events to each sequence of track events;
5. Deduce the time events for all the sequences of events (only one if "one token stream");
6. Return the tokens, as a combination of lists of strings and lists of integers (token ids).

This considerably cleans up the code (DRY, fewer redundant methods), while bringing speedups as the number of calls to sorting methods has been reduced.

### TL;DR: other changes

* New `pytorch_data` submodule offering PyTorch `Dataset` objects and a data collator, to be used when training a PyTorch model. Learn more in the module's documentation;
* `MIDILike`, `CPWord` and `Structured` now natively handle `Program` tokens in a multitrack/`one_token_stream` way;
* Time signature changes are now handled by `TSD`, `MIDILike` and `CPWord`;
* The `time_signature_range` config option is now more flexible/convenient.

### Changelog

* #61 New `pytorch_data` submodule, with `DatasetTok` and `DatasetJsonIO` classes. This module is only loaded if `torch` is installed in the Python environment;
* #61 The `tokenize_midi_dataset()` method now has a `tokenizer_config_file_name` argument, allowing the tokenizer config to be saved with a custom file name;
* #61 "All-in-one" `DataCollator` object to be used with PyTorch `DataLoader`s;
* #62 `Structured` and `MIDILike` now natively handle `Program` tokens. When `config.use_programs` is enabled, a `Program` token is added before each `Pitch`/`NoteOn`/`NoteOff` token to associate its instrument. MIDIs are also treated as a single stream of tokens in this case, whereas otherwise each track is converted into independent token sequences;
* #62 The `miditok.utils.remove_duplicated_notes` method can now remove notes with the same pitch and onset time, regardless of their offset time/duration;
* #62 `miditok.utils.merge_same_program_tracks` is now called in `preprocess_midi` when `config.use_programs` is True;
* #62 Big refactor of the `REMI` codebase, which now has all the features of `REMIPlus`, with code cleanups and speedups (fewer calls to sorting). The `REMIPlus` class is now basically just a wrapped `REMI` with programs and time signatures enabled;
* #62 `TSD` and `MIDILike` now encode and decode time signature changes;
* #63 @ilya16 `Tempo`s can now be created with a logarithmic scale, instead of the default linear scale;
* c53a008cadda0f111058a892c23375edde364077 and 5d1c12e18a35e3e633863f1f675374f28c8f7748 The `track_to_tokens` and `tokens_to_track` methods are now partially removed. They are now protected for the classes that still rely on them, and removed from the others. These methods were made for internal calls and are not recommended to use; the `midi_to_tokens` method is recommended instead;
* #65 @ilya16 `time_signature_range` is changed into a dictionary `{denom_i: [num_i1, ..., num_in] / (min_num_i, max_num_i)}`;
* #65 @ilya16 Fix in the formula computing the number of ticks per bar;
* #66 Adds an option to `TokenizerConfig` to delete successive tempo/time signature changes carrying the same value during MIDI preprocessing;
* #66 Tests now use xdist, a big speedup on GitHub Actions (ty @ilya16!);
* #66 `CPWord` and `Octuple` now follow the common tokenization workflow;
* #66 As a consequence of the previous point, `OctupleMono` is removed, as there was no record of its use. It is now equivalent to `Octuple` without `config.use_programs`;
* #66 `CPWord` now handles time signature changes;
* #66 Tests for tempo and time signature changes are now more robust; exceptions were removed and fixed;
* 5a6378b26d4d8176ca84361c5ecab038d7026f8a `save_tokens` no longer saves programs by default if `config.use_programs` is False.

### Compatibility

* Calls to the `track_to_tokens` and `tokens_to_track` methods are no longer supported. If you used these methods, you may replace them with `midi_to_tokens` and `tokens_to_midi` (or just `__call__` the tokenizer) while selecting the appropriate token sequences/tracks;
* `time_signature_range` now needs to be given as a dictionary;
* Due to changes in the order of the vocabularies of `Octuple` (as programs are now optional), tokenizers and tokens made with previous versions will not be compatible unless the vocab…
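Step 4 of the common workflow — merging global and track events into one time-ordered stream — can be sketched with a simplified `Event` stand-in (miditok's real `Event` class carries more attributes):

```python
from dataclasses import dataclass


@dataclass
class Event:
    type_: str   # e.g. "Tempo", "TimeSig", "Pitch"
    value: str
    time: int    # position in ticks


# global events affect the whole piece; track events belong to one track
global_events = [Event("Tempo", "120.0", 0), Event("TimeSig", "4/4", 0)]
track_events = [Event("Pitch", "60", 0), Event("Pitch", "64", 480)]

# "one token stream": a single, time-sorted sequence over all events.
# sorted() is stable, so same-time events keep their original relative order.
one_stream = sorted(global_events + track_events, key=lambda e: e.time)
```

In the non-"one token stream" case, step 4 instead prepends the global events to each track's own event sequence, yielding one sequence per track.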
# v2.1.2 (2023-07-24)

Thanks to @Kapitan11, who spotted bugs when decoding tokens given as ids/integers (#59), this update brings a few fixes that solve them, alongside tests ensuring that the input/output (i/o) formats of the tokenizers are well handled in every case. The documentation has also been updated on this subject, which was unclear until now.

### Changes

* 394dc4d Fix in the `MuMIDI` and `Octuple` token encodings, which performed the preprocessing steps twice;
* 394dc4d Code of the [single track tests](tests/test_one_track.py) improved, now covering tempos for most tokenizations;
* 394dc4d `MuMIDI` can now decode tempo tokens;
* 394dc4d The `_in_as_seq` decorator is now used solely for the `tokens_to_midi()` method, and removed from `tokens_to_track()`, which explicitly expects a `TokSequence` object as argument (089fa74);
* 089fa74 The `_in_as_seq` decorator now handles all token-id input formats as it should;
* 9fe7639 Fix in `TSD` decoding with multiple input sequences when not in `one_token_stream` mode;
* 9fe7639 Added i/o input-id tests;
* 8c2349bfb771145c805c8a652392ae8f11ed0756 The `unique_track` property is renamed to `one_token_stream`, as it is more explicit and accurate;
* 8c2349bfb771145c805c8a652392ae8f11ed0756 New `convert_sequence_to_tokseq` method, which can convert any input sequence holding ids (integer), tokens (string) or events (`Event`) data into a `TokSequence` or list of `TokSequence` objects, with the appropriate format depending on the tokenizer. This method is used by the `_in_as_seq` decorator;
* 8c2349bfb771145c805c8a652392ae8f11ed0756 New `io_format` tokenizer property, returning the tokenizer's i/o format as a tuple of strings. Their meanings are: *I* for instrument (for non-`one_token_stream` tokenizers), *T* for token, *C* for sub-token class (for multi-voc tokenizers);
* Minor code lint improvements.

### Compatibility

* All good 🙌

# v2.1.1 (2023-07-06)

### Changes

* 220f3842a55693e0d5a68e89f31c3eede6b4ab12 Fix in `learn_bpe()` for tokenizers in `unique_track` mode;
* 30d554693b5c0c6e271cdcd72cb969ef5dc1efaa Fixes in data augmentation (on tokens) in `unique_track` mode: 1) files detected as drums were being skipped, and 2) all pitches except drum ones are now augmented (as opposed to all of them before);
* 30d554693b5c0c6e271cdcd72cb969ef5dc1efaa The tokenizer now creates `Program` tokens from the `tokenizer.config.programs` given by the user.

### Compatibility

* If you used custom `Program` tokens, make sure to give `(-1, 128)` as the `programs` argument of your tokenizer's `TokenizerConfig`. This is already the default; this message only applies if you gave something else.
# v2.1.0 (2023-07-03)

### Major change

This "mid-size" update brings a new `TokenizerConfig` object, holding any tokenizer's configuration. This object is now used to instantiate all tokenizers, and replaces the now-removed `beat_res`, `nb_velocities`, `pitch_range` and `additional_tokens` arguments. It simplifies the code, reduces exceptions, and exposes a simpler way to customize tokenizers. You can read the documentation and examples to see how to use it.

### Changes

* e586b1fa444f90fd4f925f636f1eeffb549aae9d New `TokenizerConfig` object to hold the config and instantiate tokenizers
* 26a67a65b1d7af174d271294c5df38238c9a71b5 @tingled Fix in `__repr__`
* 9970ec472bd7d6e983574d5b28d4b8cbcdd82013 Fix in the CPWord token type graph
* 69e64a7f4c1a8f511bd437d519b13fb838229fa6 The `max_bar_embedding` argument for `REMIPlus` now defaults to False
* 62292d63bde48f619be354c69e371e03b3ee0d21 @Kapitan11 `load_params` is now a private method, and the documentation has been updated for this feature
* 3aeb7ffa03b3e5e5235ca1c42eabedc1311a1db5 Removed the deprecated "slow" BPE methods
* f8ca8548c7e1bd5ac10092f3601fdaeed253694a @ilya16 Fixed the PitchBend time attribute in the `merge_tracks` method
* b12d270660cff14ae36549e3cfc00c320c5032b0 `TSD` now natively handles `Program` tokens, the same way `REMIPlus` does. With the `use_programs` option, MIDIs are converted into a single token sequence for all tracks, instead of one sequence per track;
* Other minor code, lint and docstring improvements

### Compatibility

* On your current/previous projects, you will need to update your code, specifically the way you create tokenizers, to use this update. This doesn't apply to code creating tokenizers from a config file (`params` arg);
* Slow BPE removed. If you still use these methods, we encourage you to switch to the new fast ones. Models you trained will need to keep using the old slow tokenizers.
# v2.0.6 (2023-05-16)

### Changes

* 811bd684e68b2c4ede5a6d0fc4f2396490b1851d #40 #41 Added the `MMM` tokenizer ([Multi-Track Music Machine](https://arxiv.org/abs/2008.06048))

### Compatibility

* All good 🙌

# v2.0.5 (2023-05-04)

### Changes

* f9f63d076bd630606f0482291375af44e37d1136 (related to #37) Added a compatibility check to the `learn_bpe` method
* f1af66ad2fec24007a59961d96069782f9b97ffc Fixed an issue when loading tokens in `learn_bpe` with a `unique_track`-compatible tokenizer (REMIPlus), causing no BPE learning
* f1af66ad2fec24007a59961d96069782f9b97ffc In `learn_bpe`: checking that the total number of unique base tokens (chars) is lower than the target vocabulary size
* 47b616643643bdb3dac82388d10b0603ad988b4f Handled multi-voc indexing with tokens present in all vocabs, e.g. special tokens

### Compatibility

* All good 🙌