[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-microsoft--bioemu":3,"tool-microsoft--bioemu":65},[4,23,32,40,49,57],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":22},2268,"ML-For-Beginners","microsoft\u002FML-For-Beginners","ML-For-Beginners 是由微软推出的一套系统化机器学习入门课程，旨在帮助零基础用户轻松掌握经典机器学习知识。这套课程将学习路径规划为 12 周，包含 26 节精炼课程和 52 道配套测验，内容涵盖从基础概念到实际应用的完整流程，有效解决了初学者面对庞大知识体系时无从下手、缺乏结构化指导的痛点。\n\n无论是希望转型的开发者、需要补充算法背景的研究人员，还是对人工智能充满好奇的普通爱好者，都能从中受益。课程不仅提供了清晰的理论讲解，还强调动手实践，让用户在循序渐进中建立扎实的技能基础。其独特的亮点在于强大的多语言支持，通过自动化机制提供了包括简体中文在内的 50 多种语言版本，极大地降低了全球不同背景用户的学习门槛。此外，项目采用开源协作模式，社区活跃且内容持续更新，确保学习者能获取前沿且准确的技术资讯。如果你正寻找一条清晰、友好且专业的机器学习入门之路，ML-For-Beginners 将是理想的起点。",84991,2,"2026-04-05T10:45:23",[13,14,15,16,17,18,19,20,21],"图像","数据工具","视频","插件","Agent","其他","语言模型","开发框架","音频","ready",{"id":24,"name":25,"github_repo":26,"description_zh":27,"stars":28,"difficulty_score":29,"last_commit_at":30,"category_tags":31,"status":22},3128,"ragflow","infiniflow\u002Fragflow","RAGFlow 是一款领先的开源检索增强生成（RAG）引擎，旨在为大语言模型构建更精准、可靠的上下文层。它巧妙地将前沿的 RAG 技术与智能体（Agent）能力相结合，不仅支持从各类文档中高效提取知识，还能让模型基于这些知识进行逻辑推理和任务执行。\n\n在大模型应用中，幻觉问题和知识滞后是常见痛点。RAGFlow 通过深度解析复杂文档结构（如表格、图表及混合排版），显著提升了信息检索的准确度，从而有效减少模型“胡编乱造”的现象，确保回答既有据可依又具备时效性。其内置的智能体机制更进一步，使系统不仅能回答问题，还能自主规划步骤解决复杂问题。\n\n这款工具特别适合开发者、企业技术团队以及 AI 研究人员使用。无论是希望快速搭建私有知识库问答系统，还是致力于探索大模型在垂直领域落地的创新者，都能从中受益。RAGFlow 提供了可视化的工作流编排界面和灵活的 API 接口，既降低了非算法背景用户的上手门槛，也满足了专业开发者对系统深度定制的需求。作为基于 Apache 2.0 协议开源的项目，它正成为连接通用大模型与行业专有知识之间的重要桥梁。",77062,3,"2026-04-04T04:44:48",[17,13,20,19,18],{"id":33,"name":34,"github_repo":35,"description_zh":36,"stars":37,"difficulty_score":29,"last_commit_at":38,"category_tags":39,"status":22},519,"PaddleOCR","PaddlePaddle\u002FPaddleOCR","PaddleOCR 是一款基于百度飞桨框架开发的高性能开源光学字符识别工具包。它的核心能力是将图片、PDF 等文档中的文字提取出来，转换成计算机可读取的结构化数据，让机器真正“看懂”图文内容。\n\n面对海量纸质或电子文档，PaddleOCR 解决了人工录入效率低、数字化成本高的问题。尤其在人工智能领域，它扮演着连接图像与大型语言模型（LLM）的桥梁角色，能将视觉信息直接转化为文本输入，助力智能问答、文档分析等应用场景落地。\n\nPaddleOCR 
适合开发者、算法研究人员以及有文档自动化需求的普通用户。其技术优势十分明显：不仅支持全球 100 多种语言的识别，还能在 Windows、Linux、macOS 等多个系统上运行，并灵活适配 CPU、GPU、NPU 等各类硬件。作为一个轻量级且社区活跃的开源项目，PaddleOCR 既能满足快速集成的需求，也能支撑前沿的视觉语言研究，是处理文字识别任务的理想选择。",74913,"2026-04-05T10:44:17",[19,13,20,18],{"id":41,"name":42,"github_repo":43,"description_zh":44,"stars":45,"difficulty_score":46,"last_commit_at":47,"category_tags":48,"status":22},3215,"awesome-machine-learning","josephmisiti\u002Fawesome-machine-learning","awesome-machine-learning 是一份精心整理的机器学习资源清单，汇集了全球优秀的机器学习框架、库和软件工具。面对机器学习领域技术迭代快、资源分散且难以甄选的痛点，这份清单按编程语言（如 Python、C++、Go 等）和应用场景（如计算机视觉、自然语言处理、深度学习等）进行了系统化分类，帮助使用者快速定位高质量项目。\n\n它特别适合开发者、数据科学家及研究人员使用。无论是初学者寻找入门库，还是资深工程师对比不同语言的技术选型，都能从中获得极具价值的参考。此外，清单还延伸提供了免费书籍、在线课程、行业会议、技术博客及线下聚会等丰富资源，构建了从学习到实践的全链路支持体系。\n\n其独特亮点在于严格的维护标准：明确标记已停止维护或长期未更新的项目，确保推荐内容的时效性与可靠性。作为机器学习领域的“导航图”，awesome-machine-learning 以开源协作的方式持续更新，旨在降低技术探索门槛，让每一位从业者都能高效地站在巨人的肩膀上创新。",72149,1,"2026-04-03T21:50:24",[20,18],{"id":50,"name":51,"github_repo":52,"description_zh":53,"stars":54,"difficulty_score":46,"last_commit_at":55,"category_tags":56,"status":22},2234,"scikit-learn","scikit-learn\u002Fscikit-learn","scikit-learn 是一个基于 Python 构建的开源机器学习库，依托于 SciPy、NumPy 等科学计算生态，旨在让机器学习变得简单高效。它提供了一套统一且简洁的接口，涵盖了从数据预处理、特征工程到模型训练、评估及选择的全流程工具，内置了包括线性回归、支持向量机、随机森林、聚类等在内的丰富经典算法。\n\n对于希望快速验证想法或构建原型的数据科学家、研究人员以及 Python 开发者而言，scikit-learn 是不可或缺的基础设施。它有效解决了机器学习入门门槛高、算法实现复杂以及不同模型间调用方式不统一的痛点，让用户无需重复造轮子，只需几行代码即可调用成熟的算法解决分类、回归、聚类等实际问题。\n\n其核心技术亮点在于高度一致的 API 设计风格，所有估算器（Estimator）均遵循相同的调用逻辑，极大地降低了学习成本并提升了代码的可读性与可维护性。此外，它还提供了强大的模型选择与评估工具，如交叉验证和网格搜索，帮助用户系统地优化模型性能。作为一个由全球志愿者共同维护的成熟项目，scikit-learn 以其稳定性、详尽的文档和活跃的社区支持，成为连接理论学习与工业级应用的最",65628,"2026-04-05T10:10:46",[20,18,14],{"id":58,"name":59,"github_repo":60,"description_zh":61,"stars":62,"difficulty_score":10,"last_commit_at":63,"category_tags":64,"status":22},3364,"keras","keras-team\u002Fkeras","Keras 是一个专为人类设计的深度学习框架，旨在让构建和训练神经网络变得简单直观。它解决了开发者在不同深度学习后端之间切换困难、模型开发效率低以及难以兼顾调试便捷性与运行性能的痛点。\n\n无论是刚入门的学生、专注算法的研究人员，还是需要快速落地产品的工程师，都能通过 Keras 
轻松上手。它支持计算机视觉、自然语言处理、音频分析及时间序列预测等多种任务。\n\nKeras 3 的核心亮点在于其独特的“多后端”架构。用户只需编写一套代码，即可灵活选择 TensorFlow、JAX、PyTorch 或 OpenVINO 作为底层运行引擎。这一特性不仅保留了 Keras 一贯的高层易用性，还允许开发者根据需求自由选择：利用 JAX 或 PyTorch 的即时执行模式进行高效调试，或切换至速度最快的后端以获得最高 350% 的性能提升。此外，Keras 具备强大的扩展能力，能无缝从本地笔记本电脑扩展至大规模 GPU 或 TPU 集群，是连接原型开发与生产部署的理想桥梁。",63927,"2026-04-04T15:24:37",[20,14,18],{"id":66,"github_repo":67,"name":68,"description_en":69,"description_zh":70,"ai_summary_zh":70,"readme_en":71,"readme_zh":72,"quickstart_zh":73,"use_case_zh":74,"hero_image_url":75,"owner_login":76,"owner_name":77,"owner_avatar_url":78,"owner_bio":79,"owner_company":80,"owner_location":80,"owner_email":81,"owner_twitter":82,"owner_website":83,"owner_url":84,"languages":85,"stars":98,"forks":99,"last_commit_at":100,"license":101,"difficulty_score":29,"env_os":102,"env_gpu":103,"env_ram":104,"env_deps":105,"category_tags":109,"github_topics":80,"view_count":29,"oss_zip_url":80,"oss_zip_packed_at":80,"status":22,"created_at":110,"updated_at":111,"faqs":112,"releases":141},447,"microsoft\u002Fbioemu","bioemu","Inference code for scalable emulation of protein equilibrium ensembles with generative deep learning","bioemu 是一款基于生成式深度学习的蛋白质结构模拟工具。简单来说，只要提供蛋白质的氨基酸序列，bioemu 就能快速生成该蛋白质在自然状态下的多种可能结构（平衡系综），而不仅仅是单一的静态结构。\n\n在传统研究中，模拟蛋白质的动态变化往往需要耗费巨大的算力和时间，bioemu 则有效解决了这一痛点，实现了高效、可扩展的结构采样。它内置了智能过滤机制，能自动剔除原子冲突或链断裂等不合理的物理结构，确保结果的科学性。此外，它还集成了引导系统来优化生成质量。\n\nbioemu 目前仅支持 Linux 系统，适合计算生物学、药物研发领域的研究人员和开发者使用。无论是通过命令行还是 Python API，用户都能轻松上手，快速获取高质量的蛋白质结构样本，从而加速新药研发或生物机理探索的进程。","\n\u003Ch1>\n\u003Cp align=\"center\">\n    \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fmicrosoft_bioemu_readme_af7084d2f4ee.png\" alt=\"BioEmu logo\" width=\"300\"\u002F>\n\u003C\u002Fp>\n\u003C\u002Fh1>\n\n[![DOI:10.1101\u002F2024.12.05.626885](https:\u002F\u002Fzenodo.org\u002Fbadge\u002FDOI\u002F10.1101\u002F2024.12.05.626885.svg)](https:\u002F\u002Fdoi.org\u002F10.1101\u002F2024.12.05.626885)\n[![Requires Python 
3.10+](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FPython-3.10+-blue.svg?logo=python&logoColor=white)](https:\u002F\u002Fpython.org\u002Fdownloads)\n\n\n# Biomolecular Emulator (BioEmu)\n\nBiomolecular Emulator (BioEmu for short) is a model that samples from the approximated equilibrium distribution of structures for a protein monomer, given its amino acid sequence.\n\nFor more information see our \u003Ca href=\"assets\u002Fbioemu_paper.pdf\" target=\"_blank\">paper\u003C\u002Fa>, [citation below](#citation).\n\nThis repository contains inference code and model weights.\n\n## Table of Contents\n- [Installation](#installation)\n- [Sampling structures](#sampling-structures)\n- [Steering to avoid chain breaks and clashes](#steering-to-avoid-chain-breaks-and-clashes)\n- [Azure AI Foundry](#azure-ai-foundry)\n- [Training data](#training-data)\n- [Get in touch](#get-in-touch)\n- [Citation](#citation)\n\n## Installation\nbioemu is provided as a Linux-only pip-installable package. We currently only support Python versions from 3.10 to 3.12:\n\n```bash\npip install bioemu\n```\n\n> [!NOTE]\n> The first time `bioemu` is used to sample structures, it will also setup [Colabfold](https:\u002F\u002Fgithub.com\u002Fsokrypton\u002FColabFold) on a separate virtual environment for MSA and embedding generation. By default this setup uses the `~\u002F.bioemu_colabfold` directory, but if you wish to have this changed please manually set the `BIOEMU_COLABFOLD_DIR` environment variable accordingly before sampling for the first time.\n\n\n## Sampling structures\nYou can sample structures for a given protein sequence using the `sample` module. 
To run a tiny test using the default model parameters and denoising settings:\n```\npython -m bioemu.sample --sequence GYDPETGTWG --num_samples 10 --output_dir ~\u002Ftest-chignolin\n```\n\nAlternatively, you can use the Python API:\n\n```python\nfrom bioemu.sample import main as sample\nsample(sequence='GYDPETGTWG', num_samples=10, output_dir='~\u002Ftest_chignolin')\n```\n\nThe model parameters will be automatically downloaded from [huggingface](https:\u002F\u002Fhuggingface.co\u002Fmicrosoft\u002Fbioemu). A path to a single-sequence FASTA file can also be passed to the `sequence` argument.\n\nSampling times will depend on sequence length and available infrastructure. The following table gives times for collecting 1000 samples measured on an A100 GPU with 80 GB VRAM for sequences of different lengths (using a `batch_size_100=20` setting in `sample.py`):\n | sequence length | time \u002F min |\n | --------------: | ---------: |\n |             100 |          4 |\n |             300 |         40 |\n |             600 |        150 |\n\nBy default, unphysical structures (steric clashes or chain discontinuities) will be filtered out, so you will typically get fewer samples in the output than requested. The difference can be very large if your protein has large disordered regions which are very likely to produce clashes. If you want to get all generated samples in the output, irrespective of whether they are physically valid, use the `--filter_samples=False` argument.\n\n\n> [!NOTE]\n> If you wish to use your own generated MSA instead of the ones retrieved via Colabfold, you can pass an A3M file containing the query sequence as the first row to the `sequence` argument. Additionally, the `msa_host_url` argument can be used to override the default Colabfold MSA query server. See [sample.py](.\u002Fsrc\u002Fbioemu\u002Fsample.py) for more options.\n\nThis code only supports sampling structures of monomers. 
You can try to sample multimers using the [linker trick](https:\u002F\u002Fx.com\u002Fag_smith\u002Fstatus\u002F1417063635000598528), but in our limited experiments, this has not worked well.\n\n## Steering to avoid chain breaks and clashes\n\nBioEmu includes a [steering system](https:\u002F\u002Farxiv.org\u002Fabs\u002F2501.06848) that uses [Sequential Monte Carlo (SMC)](https:\u002F\u002Fwww.stats.ox.ac.uk\u002F~doucet\u002Fdoucet_defreitas_gordon_smcbookintro.pdf) to guide the diffusion process toward more physically plausible protein structures.\nEmpirically, using three (or up to 10) steering particles per output sample greatly reduces the number of unphysical samples (steric clashes or chain breaks) produced by the model.\nSteering applies potential energy functions during denoising to favor conformations that satisfy physical constraints. \nAlgorithmically, steering simulates multiple *candidate samples* per desired output sample and resamples between these particles according to the favorability of the provided potentials. 
\n\n### Quick start with steering\n\nEnable steering with physical constraints using the CLI:\n\n```bash\npython -m bioemu.sample \\\n    --sequence GYDPETGTWG \\\n    --num_samples 100 \\\n    --output_dir ~\u002Fsteered-samples \\\n    --steering_config src\u002Fbioemu\u002Fconfig\u002Fsteering\u002Fphysical_steering.yaml \\\n    --denoiser_config src\u002Fbioemu\u002Fconfig\u002Fdenoiser\u002Fstochastic_dpm.yaml\n```\n\nOr using the Python API:\n\n```python\nfrom bioemu.sample import main as sample\n\nsample(\n    sequence='GYDPETGTWG',\n    num_samples=100,\n    output_dir='~\u002Fsteered-samples',\n    denoiser_config=\"..\u002Fsrc\u002Fbioemu\u002Fconfig\u002Fdenoiser\u002Fstochastic_dpm.yaml\",  # Use stochastic DPM\n    steering_config=\"..\u002Fsrc\u002Fbioemu\u002Fconfig\u002Fsteering\u002Fphysical_steering.yaml\",  # Use physical steering\n)\n```\n\n### Key steering parameters\n\n- `num_steering_particles`: Number of particles per sample (1 = no steering, >1 enables steering)\n- `steering_start_time`: Time at which steering starts (0.0-1.0, default: 0.1); reverse sampling runs from 1 to 0\n- `steering_end_time`: Time at which steering stops (0.0-1.0, default: 0.0); reverse sampling runs from 1 to 0\n- `resampling_interval`: How often to resample particles (default: 1)\n- `steering_config`: Path to potentials configuration file (required for steering)\n\n### Available potentials\n\nThe [`physical_steering.yaml`](.\u002Fsrc\u002Fbioemu\u002Fconfig\u002Fsteering\u002Fphysical_steering.yaml) configuration provides potentials for physical realism:\n- **ChainBreak**: Prevents backbone discontinuities\n- **ChainClash**: Avoids steric clashes between non-neighboring residues\n\nTo steer the system with your own potentials, create a custom `steering_config.yaml` file that instantiates them.\n\n## Azure AI Foundry\nBioEmu is also available on [Azure AI Foundry](https:\u002F\u002Fai.azure.com\u002F). 
See [How to run BioEmu on Azure AI Foundry](AZURE_AI_FOUNDRY.md) for more details.\n\n## Training data\nThe molecular dynamics training data used for BioEmu is available on Zenodo:\n- [CATH](https:\u002F\u002Fdoi.org\u002F10.5281\u002Fzenodo.15629740)\n- [Octapeptides](https:\u002F\u002Fdoi.org\u002F10.5281\u002Fzenodo.15641199)\n- [MegaSim](https:\u002F\u002Fdoi.org\u002F10.5281\u002Fzenodo.15641184)\n\nFor a full description of these, see the \u003Ca href=\"assets\u002Fbioemu_paper.pdf\" target=\"_blank\">paper\u003C\u002Fa>.\n\n## Reproducing results from the paper\nYou can use this code together with code from [bioemu-benchmarks](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu-benchmarks) to approximately reproduce results from our [paper].\n\n- The `bioemu-v1.0` checkpoint contains the model weights used to produce the results in the preprint. Due to simplifications made in the embedding computation and a more efficient sampler, the results obtained with this code are not identical but consistent with the preprint statistics, i.e., mode coverage and free energy errors averaged over the proteins in a test set. Results for individual proteins may differ. \n- [Default] The `bioemu-v1.1` checkpoint contains the model weights used to produce the results in the published Science [paper]. \n- The `bioemu-v1.2` checkpoint contains the model weights trained from an extended set of MD simulations and experimental measurements of folding free energies. \n\nFor more details, please check the [BIOEMU_RESULTS.md](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu-benchmarks\u002Fblob\u002Fmain\u002Fbioemu_benchmarks\u002FBIOEMU_RESULTS.md) document on the bioemu-benchmarks repository.\n\nTo use a specific checkpoint, you can specify the `model_name` in the `bioemu.sample` args, for example, `--model_name=\"bioemu-v1.1\"`.\n\n\n## Side-chain reconstruction and MD-relaxation\nBioEmu outputs structures in backbone frame representation. 
To reconstruct the side-chains, several tools are available. As an example, we interface with [HPacker](https:\u002F\u002Fgithub.com\u002Fgvisani\u002Fhpacker) to conduct side-chain reconstruction, and also provide basic tooling for running a short molecular dynamics (MD) equilibration.\n\n> [!WARNING]\n> This code is experimental and relies on a [conda-based package manager](https:\u002F\u002Fdocs.conda.io\u002Fprojects\u002Fconda\u002Fen\u002Flatest\u002Fuser-guide\u002Finstall\u002Findex.html) due to `hpacker` having `conda` as a dependency. Make sure that `conda` is in your `PATH` and that you have CUDA12-compatible drivers before running the following code.\n\nInstall optional dependencies:\n\n```bash\npip install bioemu[md]\n```\n\nYou can compute side-chain reconstructions via the `bioemu.sidechain_relax` module:\n```bash\npython -m bioemu.sidechain_relax --pdb-path path\u002Fto\u002Ftopology.pdb --xtc-path path\u002Fto\u002Fsamples.xtc\n```\n\n\n> [!NOTE]\n> The first time this module is invoked, it will attempt to install `hpacker` and its dependencies into a separate `hpacker` conda environment. 
If you wish for it to be installed in a different location, please set the `HPACKER_ENV_NAME` environment variable before using this module for the first time.\n\nBy default, side-chain reconstruction and local energy minimization are performed (no full MD integration for efficiency reasons).\nNote that the runtime of this code scales with the size of the system.\nWe suggest running this code on a selection of samples rather than the full set.\n\nThere are two other options:\n- To only run side-chain reconstruction without MD equilibration, add `--no-md-equil`.\n- To run a short NVT equilibration (0.1 ns), add `--md-protocol nvt_equil`\n\nTo see the full list of options, call `python -m bioemu.sidechain_relax --help`.\n\nThe script saves reconstructed all-heavy-atom structures in `samples_sidechain_rec.{pdb,xtc}` and MD-equilibrated structures in `samples_md_equil.{pdb,xtc}` (filename to be altered with `--outname other_name`).\n\n## Third-party code\nThe code in the `openfold` subdirectory is copied from [openfold](https:\u002F\u002Fgithub.com\u002Faqlaboratory\u002Fopenfold) with minor modifications. The modifications are described in the relevant source files.\n## Get in touch\nIf you have any questions not covered here, please create an issue or contact the BioEmu team by writing to the corresponding author on our [paper].\n\n## Citation\nIf you are using our code or model, please cite the following paper:\n```bibtex\n@article{bioemu2025,\n  title={Scalable emulation of protein equilibrium ensembles with generative deep learning},\n  author={Lewis, Sarah and Hempel, Tim and Jim{\\'e}nez-Luna, Jos{\\'e} and Gastegger, Michael and Xie, Yu and Foong, Andrew YK and Satorras, Victor Garc{\\'\\i}a and Abdin, Osama and Veeling, Bastiaan S and Zaporozhets, Iryna and Chen, Yaoyi and Yang, Soojung and Foster, Adam E. 
and Schneuing, Arne and Nigam, Jigyasa and Barbero, Federico and Stimper, Vincent and Campbell, Andrew and Yim, Jason and Lienen, Marten and Shi, Yu and Zheng, Shuxin and Schulz, Hannes and Munir, Usman and Sordillo, Roberto and Tomioka, Ryota and Clementi, Cecilia and No{\\'e}, Frank},\n  journal={Science},\n  pages={eadv9817},\n  year={2025},\n  publisher={American Association for the Advancement of Science},\n  doi={10.1126\u002Fscience.adv9817}\n}\n```\n[paper]: https:\u002F\u002Fwww.science.org\u002Fdoi\u002F10.1126\u002Fscience.adv9817","\u003Ch1>\n\u003Cp align=\"center\">\n    \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fmicrosoft_bioemu_readme_af7084d2f4ee.png\" alt=\"BioEmu logo\" width=\"300\"\u002F>\n\u003C\u002Fp>\n\u003C\u002Fh1>\n\n[![DOI:10.1101\u002F2024.12.05.626885](https:\u002F\u002Fzenodo.org\u002Fbadge\u002FDOI\u002F10.1101\u002F2024.12.05.626885.svg)](https:\u002F\u002Fdoi.org\u002F10.1101\u002F2024.12.05.626885)\n[![Requires Python 3.10+](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FPython-3.10+-blue.svg?logo=python&logoColor=white)](https:\u002F\u002Fpython.org\u002Fdownloads)\n\n\n# 生物分子模拟器 (BioEmu)\n\n生物分子模拟器（简称 BioEmu）是一个模型，它可以根据给定的氨基酸序列，从蛋白质单体的近似平衡结构分布中进行采样。\n\n更多信息请参阅我们的\u003Ca href=\"assets\u002Fbioemu_paper.pdf\" target=\"_blank\">论文\u003C\u002Fa>，[引用信息见下文](#citation)。\n\n本代码仓库包含推理代码和模型权重。\n\n## 目录\n- [安装](#installation)\n- [结构采样](#sampling-structures)\n- [引导以避免链断裂和冲突](#steering-to-avoid-chain-breaks-and-clashes)\n- [Azure AI Foundry](#azure-ai-foundry)\n- [训练数据](#training-data)\n- [联系我们](#get-in-touch)\n- [引用](#citation)\n\n## 安装\nbioemu 以仅限 Linux 的 pip 安装包形式提供。我们目前仅支持 3.10 到 3.12 版本的 Python：\n\n```bash\npip install bioemu\n```\n\n> [!NOTE]\n> 首次使用 `bioemu` 采样结构时，它还会在单独的虚拟环境中设置 [Colabfold](https:\u002F\u002Fgithub.com\u002Fsokrypton\u002FColabFold)，用于 MSA（多序列比对）和嵌入生成。默认情况下，此设置使用 `~\u002F.bioemu_colabfold` 目录，但如果您希望更改此设置，请在首次采样前手动设置 `BIOEMU_COLABFOLD_DIR` 环境变量。\n\n\n## 结构采样\n您可以使用 `sample` 
模块对给定的蛋白质序列进行结构采样。要使用默认模型参数和去噪设置运行一个小测试：\n```\npython -m bioemu.sample --sequence GYDPETGTWG --num_samples 10 --output_dir ~\u002Ftest-chignolin\n```\n\n或者，您可以使用 Python API：\n\n```python\nfrom bioemu.sample import main as sample\nsample(sequence='GYDPETGTWG', num_samples=10, output_dir='~\u002Ftest_chignolin')\n```\n\n模型参数将自动从 [huggingface](https:\u002F\u002Fhuggingface.co\u002Fmicrosoft\u002Fbioemu) 下载。也可以将单序列 FASTA 文件的路径传递给 `sequence` 参数。\n\n采样时间取决于序列长度和可用的基础设施。下表给出了在具有 80 GB 显存的 A100 GPU 上采集 1000 个样本所需的时间，针对不同长度的序列（在 `sample.py` 中使用 `batch_size_100=20` 设置）：\n | 序列长度 | 时间 \u002F 分钟 |\n | --------------: | ---------: |\n |             100 |          4 |\n |             300 |         40 |\n |             600 |        150 |\n\n默认情况下，非物理结构（空间冲突或链不连续）将被过滤掉，因此输出中的样本通常少于请求数量。如果您的蛋白质具有较大的无序区域，差异可能会非常大，因为这些区域很容易产生冲突。如果您想在输出中获取所有生成的样本，无论其物理上是否有效，请使用 `--filter_samples=False` 参数。\n\n\n> [!NOTE]\n> 如果您希望使用自己生成的 MSA 而不是通过 Colabfold 检索的 MSA，可以将包含查询序列作为第一行的 A3M 文件传递给 `sequence` 参数。此外，`msa_host_url` 参数可用于覆盖默认的 Colabfold MSA 查询服务器。更多选项请参见 [sample.py](.\u002Fsrc\u002Fbioemu\u002Fsample.py)。\n\n此代码仅支持单体结构的采样。您可以尝试使用 [linker trick](https:\u002F\u002Fx.com\u002Fag_smith\u002Fstatus\u002F1417063635000598528)（连接子技巧）来采样多聚体，但在我们有限的实验中，效果并不理想。\n\n## 引导以避免链断裂和冲突\n\nBioEmu 包含一个 [steering system](https:\u002F\u002Farxiv.org\u002Fabs\u002F2501.06848)（引导系统），该系统使用 [Sequential Monte Carlo (SMC)](https:\u002F\u002Fwww.stats.ox.ac.uk\u002F~doucet\u002Fdoucet_defreitas_gordon_smcbookintro.pdf)（序贯蒙特卡洛）来引导扩散过程，以生成更符合物理规律的蛋白质结构。\n根据经验，每个输出样本使用三个（或最多 10 个）引导粒子可大大减少模型产生的非物理样本（空间冲突或链断裂）数量。\n引导在去噪过程中应用势能函数，以倾向于满足物理约束的构象。\n在算法上，引导为每个期望的输出样本模拟多个*候选样本*，并根据所提供势能的有利程度在这些粒子之间进行重采样。\n\n### 引导快速开始\n\n使用 CLI 启用带有物理约束的引导：\n\n```bash\npython -m bioemu.sample \\\n    --sequence GYDPETGTWG \\\n    --num_samples 100 \\\n    --output_dir ~\u002Fsteered-samples \\\n    --steering_config src\u002Fbioemu\u002Fconfig\u002Fsteering\u002Fphysical_steering.yaml \\\n    --denoiser_config 
src\u002Fbioemu\u002Fconfig\u002Fdenoiser\u002Fstochastic_dpm.yaml\n```\n\n或使用 Python API：\n\n```python\nfrom bioemu.sample import main as sample\n\nsample(\n    sequence='GYDPETGTWG',\n    num_samples=100,\n    output_dir='~\u002Fsteered-samples',\n    denoiser_config=\"..\u002Fsrc\u002Fbioemu\u002Fconfig\u002Fdenoiser\u002Fstochastic_dpm.yaml\",  # Use stochastic DPM\n    steering_config=\"..\u002Fsrc\u002Fbioemu\u002Fconfig\u002Fsteering\u002Fphysical_steering.yaml\",  # Use physical steering\n)\n```\n\n### 关键引导参数\n\n- `num_steering_particles`: 每个样本的粒子数（1 = 无引导，>1 启用引导）\n- `steering_start_time`: 何时开始引导（0.0-1.0，默认值：0.1），反向采样为 1 -> 0\n- `steering_end_time`: 何时停止引导（0.0-1.0，默认值：0.0），反向采样为 1 -> 0\n- `resampling_interval`: 重采样粒子的频率（默认值：1）\n- `steering_config`: 势能配置文件的路径（引导需要）\n\n### 可用势能\n\n[`physical_steering.yaml`](.\u002Fsrc\u002Fbioemu\u002Fconfig\u002Fsteering\u002Fphysical_steering.yaml) 配置提供了用于物理真实性的势能：\n- **ChainBreak**: 防止骨架不连续\n- **ChainClash**: 避免非相邻残基之间的空间冲突\n\n您可以创建一个自定义的 `steering_config.yaml` YAML 文件，实例化您自己的势能，从而使用您自己的势能来引导系统。\n\n## Azure AI Foundry\nBioEmu 也可在 [Azure AI Foundry](https:\u002F\u002Fai.azure.com\u002F) 上使用。有关更多详细信息，请参阅 [如何在 Azure AI Foundry 上运行 BioEmu](AZURE_AI_FOUNDRY.md)。\n\n## 训练数据\nBioEmu 使用的分子动力学训练数据可在 Zenodo 上获取：\n- [CATH](https:\u002F\u002Fdoi.org\u002F10.5281\u002Fzenodo.15629740)\n- [Octapeptides](https:\u002F\u002Fdoi.org\u002F10.5281\u002Fzenodo.15641199)\n- [MegaSim](https:\u002F\u002Fdoi.org\u002F10.5281\u002Fzenodo.15641184)\n\n有关这些数据的完整描述，请参阅\u003Ca href=\"assets\u002Fbioemu_paper.pdf\" target=\"_blank\">论文\u003C\u002Fa>。\n\n## 复现论文结果\n您可以将此代码与 [bioemu-benchmarks](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu-benchmarks) 中的代码结合使用，以近似复现我们 [论文][paper] 中的结果。\n\n- `bioemu-v1.0` checkpoint（检查点）包含用于生成预印本结果的模型权重。由于对 embedding（嵌入）计算进行了简化，并使用了更高效的 sampler（采样器），使用此代码获得的结果虽不完全相同，但与预印本统计数据一致，即在测试集蛋白质上平均的模式覆盖率和自由能误差。单个蛋白质的结果可能会有所不同。\n- [默认] `bioemu-v1.1` checkpoint 包含用于生成已发表 Science [论文][paper] 中结果的模型权重。\n- `bioemu-v1.2` 
checkpoint 包含基于扩展的 MD（分子动力学）模拟和折叠自由能实验测量数据训练的模型权重。\n\n欲了解更多详情，请查看 bioemu-benchmarks 仓库中的 [BIOEMU_RESULTS.md](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu-benchmarks\u002Fblob\u002Fmain\u002Fbioemu_benchmarks\u002FBIOEMU_RESULTS.md) 文档。\n\n要使用特定的 checkpoint，您可以在 `bioemu.sample` 参数中指定 `model_name`，例如 `--model_name=\"bioemu-v1.1\"`。\n\n\n## 侧链重构与 MD 弛豫\nBioEmu 以骨架框架表示法输出结构。为了重构侧链，有多种工具可用。作为一个示例，我们与 [HPacker](https:\u002F\u002Fgithub.com\u002Fgvisani\u002Fhpacker) 进行接口对接以执行侧链重构，并提供了用于运行短时分子动力学 (MD) 平衡的基本工具。\n\n> [!WARNING]\n> 此代码处于实验阶段，由于 `hpacker` 依赖 `conda`，因此依赖于 [conda 包管理器](https:\u002F\u002Fdocs.conda.io\u002Fprojects\u002Fconda\u002Fen\u002Flatest\u002Fuser-guide\u002Finstall\u002Findex.html)。在运行以下代码之前，请确保 `conda` 位于您的 `PATH` 环境变量中，并且您拥有兼容 CUDA12 的驱动程序。\n\n安装可选依赖项：\n\n```bash\npip install bioemu[md]\n```\n\n您可以通过 `bioemu.sidechain_relax` 模块计算侧链重构：\n```bash\npython -m bioemu.sidechain_relax --pdb-path path\u002Fto\u002Ftopology.pdb --xtc-path path\u002Fto\u002Fsamples.xtc\n```\n\n\n> [!NOTE]\n> 首次调用此模块时，它会尝试将 `hpacker` 及其依赖项安装到单独的 `hpacker` conda 环境中。如果您希望将其安装在其他位置，请在首次使用此模块之前设置 `HPACKER_ENV_NAME` 环境变量。\n\n默认情况下，会执行侧链重构和局部能量最小化（出于效率考虑，不进行完整的 MD 积分）。\n请注意，此代码的运行时间随系统规模而定。\n我们建议对选定的样本而非完整样本集运行此代码。\n\n还有另外两个选项：\n- 若仅运行侧链重构而不进行 MD 平衡，请添加 `--no-md-equil`。\n- 若要运行短时 NVT 平衡（0.1 ns），请添加 `--md-protocol nvt_equil`\n\n要查看完整选项列表，请调用 `python -m bioemu.sidechain_relax --help`。\n\n该脚本将重构后的全重原子结构保存在 `samples_sidechain_rec.{pdb,xtc}` 中，将 MD 平衡后的结构保存在 `samples_md_equil.{pdb,xtc}` 中（可使用 `--outname other_name` 更改文件名）。\n\n## 第三方代码\n`openfold` 子目录中的代码复制自 [openfold](https:\u002F\u002Fgithub.com\u002Faqlaboratory\u002Fopenfold)，并进行了少量修改。相关修改已在对应的源文件中描述。\n\n## 联系我们\n如果您有此处未涵盖的任何问题，请创建 issue 或通过致信我们 [论文][paper] 的通讯作者来联系 BioEmu 团队。\n\n## 引用\n如果您使用我们的代码或模型，请引用以下论文：\n```bibtex\n@article{bioemu2025,\n  title={Scalable emulation of protein equilibrium ensembles with generative deep learning},\n  author={Lewis, Sarah and Hempel, Tim and Jim{\\'e}nez-Luna, Jos{\\'e} and Gastegger, 
Michael and Xie, Yu and Foong, Andrew YK and Satorras, Victor Garc{\'\\i}a and Abdin, Osama and Veeling, Bastiaan S and Zaporozhets, Iryna and Chen, Yaoyi and Yang, Soojung and Foster, Adam E. and Schneuing, Arne and Nigam, Jigyasa and Barbero, Federico and Stimper, Vincent and Campbell, Andrew and Yim, Jason and Lienen, Marten and Shi, Yu and Zheng, Shuxin and Schulz, Hannes and Munir, Usman and Sordillo, Roberto and Tomioka, Ryota and Clementi, Cecilia and No{\\'e}, Frank},\n  journal={Science},\n  pages={eadv9817},\n  year={2025},\n  publisher={American Association for the Advancement of Science},\n  doi={10.1126\u002Fscience.adv9817}\n}\n```\n[paper]: https:\u002F\u002Fwww.science.org\u002Fdoi\u002F10.1126\u002Fscience.adv9817","# BioEmu 快速上手指南\n\n## 环境准备\n- **操作系统**：仅支持 Linux 系统\n- **Python 版本**：3.10 - 3.12\n- **依赖项**：自动安装 Colabfold（首次使用时）\n- **推荐配置**：NVIDIA GPU（建议 CUDA 12 兼容驱动）\n\n> ⚠️ 国内用户建议使用国内镜像源加速安装：\n```bash\npip install bioemu --index-url https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple\n```\n\n## 安装步骤\n1. 创建虚拟环境（推荐）：\n```bash\npython3.10 -m venv bioemu_env\nsource bioemu_env\u002Fbin\u002Factivate\n```\n\n2. 安装核心包：\n```bash\npip install bioemu --index-url https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple\n```\n\n3. 
可选：安装分子动力学工具链：\n```bash\npip install bioemu[md] --index-url https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple\n```\n\n> 📌 首次运行会自动下载模型权重（约 2GB）和安装 Colabfold 依赖，建议设置环境变量指定本地存储路径：\n```bash\nexport BIOEMU_COLABFOLD_DIR=~\u002Fbioemu_colabfold\n```\n\n## 基本使用\n### 最简示例\n```bash\n# CLI 模式\npython -m bioemu.sample \\\n    --sequence GYDPETGTWG \\\n    --num_samples 10 \\\n    --output_dir ~\u002Ftest-chignolin\n```\n\n```python\n# Python API 模式\nfrom bioemu.sample import main as sample\nsample(sequence='GYDPETGTWG', num_samples=10, output_dir='~\u002Ftest_chignolin')\n```\n\n### 输出说明\n- 结果保存在指定 `output_dir` 目录\n- 默认过滤非物理结构（冲突\u002F断链），可通过 `--filter_samples=False` 禁用过滤\n\n### 性能参考（A100 GPU）\n| 序列长度 | 1000样本耗时 |\n|---------|------------|\n| 100     | 4分钟       |\n| 300     | 40分钟      |\n| 600     | 150分钟     |\n\n### 高级用法（带物理引导）\n```bash\npython -m bioemu.sample \\\n    --sequence GYDPETGTWG \\\n    --num_samples 100 \\\n    --output_dir ~\u002Fsteered-samples \\\n    --steering_config src\u002Fbioemu\u002Fconfig\u002Fsteering\u002Fphysical_steering.yaml \\\n    --denoiser_config src\u002Fbioemu\u002Fconfig\u002Fdenoiser\u002Fstochastic_dpm.yaml\n```\n\n> 📌 模型权重默认从 HuggingFace 下载，国内用户可手动下载后通过 `--model_name` 指定本地路径。","结构生物学家张博士正在研究一种新型溶菌酶的构象动态特性，需要生成大量合理的蛋白质三维结构以分析其活性位点的柔性变化。这种酶含有多个潜在无序区域，传统方法难以高效建模。\n\n### 没有 bioemu 时\n- **耗时冗长**：使用分子动力学模拟生成1000个有效构象需72小时以上，且常因链断裂问题导致样本量不足\n- **资源消耗大**：需要多块GPU并行计算，单次模拟成本超过$500\n- **结果不可靠**：约40%生成的结构存在原子碰撞或几何畸变，需人工筛选\n- **无序区域处理差**：IDR区域（120-150位）几乎无法生成合理构象\n- **流程复杂**：需手动准备MSA文件，跨平台调用多个软件工具\n\n### 使用 bioemu 后\n- **分钟级生成**：A100 GPU上15分钟内完成1000个样本生成（含过滤）\n- **成本降低90%**：单次运行成本控制在$50以内，内存占用减少80%\n- **物理合理性提升**：自动生成的结构通过立体化学验证，IDR区域采样完整度达85%\n- **智能过滤机制**：自动排除链断裂和原子碰撞样本，保留率从60%提升至92%\n- **端到端工作流**：通过Python 
API直接输入FASTA文件，自动调用Colabfold生成MSA\n\n核心价值：BioEmu将蛋白质构象采样效率提升15倍以上，同时确保物理合理性，使研究人员能快速获得高质量的动态结构集合用于功能分析。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fmicrosoft_bioemu_55d9b3e1.png","microsoft","Microsoft","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Fmicrosoft_4900709c.png","Open source projects and samples from Microsoft",null,"opensource@microsoft.com","OpenAtMicrosoft","https:\u002F\u002Fopensource.microsoft.com","https:\u002F\u002Fgithub.com\u002Fmicrosoft",[86,90,94],{"name":87,"color":88,"percentage":89},"Python","#3572A5",95.9,{"name":91,"color":92,"percentage":93},"Jupyter Notebook","#DA5B0B",3.7,{"name":95,"color":96,"percentage":97},"Shell","#89e051",0.4,781,131,"2026-04-03T12:11:02","MIT","Linux","需要 NVIDIA GPU，显存 80GB+（测试环境为 A100 GPU）","未说明",{"notes":106,"python":107,"dependencies":108},"仅支持 Linux 系统；首次运行会自动安装 Colabfold 到独立虚拟环境（可通过 BIOEMU_COLABFOLD_DIR 修改路径）；侧链重建功能需 conda 环境及 CUDA12 兼容驱动；支持 Azure AI Foundry 平台运行","3.10-3.12",[104],[18],"2026-03-27T02:49:30.150509","2026-04-06T05:17:26.010169",[113,118,123,128,133,137],{"id":114,"question_zh":115,"answer_zh":116,"source_url":117},1732,"运行时提示文件未找到（FileNotFoundError）如何解决？","请确保使用Docker容器运行以避免本地配置问题。可使用提供的Dockerfile构建环境：\n\n```\n# Dockerfile示例\nFROM nvidia\u002Fcuda:12.4.1-runtime-ubuntu22.04\nENV BIOEMU_COLABFOLD_DIR=\u002Fopt\u002Fbioemu_colabfold\nRUN apt-get update && apt-get install -y git curl wget bzip2 ca-certificates patch\n# 安装micromamba并配置环境\n```\n\n此外需检查文件路径是否正确，确保ColabFold生成的文件能被BioEmu正确读取。","https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fissues\u002F194",{"id":119,"question_zh":120,"answer_zh":121,"source_url":122},1733,"如何解决单体表示维度不匹配的错误（expected input with shape [*, 384]）？","需安装特定版本的ColabFold（1.5.4）并手动修改模块文件：\n\n1. 创建colabfold_env环境：\n```bash\nconda create -n colabfold_env python=3.10\nuv pip install 'colabfold[alphafold-minus-jax]==1.5.4'\n```\n2. 
修改`${SITE_PACKAGES_DIR}\u002Falphafold\u002Fmodel\u002Fmodules.py`文件，调整输出维度匹配要求。","https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fissues\u002F102",{"id":124,"question_zh":125,"answer_zh":126,"source_url":127},1734,"采样速度异常缓慢如何优化？","若在HPC集群上运行，需预先准备MSA文件（Multiple Sequence Alignment）避免联网生成。可通过以下步骤：\n\n1. 离线生成MSA文件\n2. 将预生成的MSA文件路径传入bioemu.sample命令\n3. 确保集群节点有互联网访问权限（若需在线生成MSA）","https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fissues\u002F128",{"id":129,"question_zh":130,"answer_zh":131,"source_url":132},1735,"ColabFold运行成功但BioEmu提示找不到结果文件？","请检查临时目录权限并验证文件生成流程：\n\n1. 确认ColabFold输出路径为`\u002Ftmp\u002F`目录\n2. 检查文件名是否包含`__unknown_description__`等特殊标识\n3. 手动指定结果文件路径：\n```python\nbioemu.sample(..., msa_output_dir='\u002F指定路径\u002F')\n```\n4. 确保BioEmu与ColabFold版本兼容（建议使用bioemu-v1.1）","https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fissues\u002F46",{"id":134,"question_zh":135,"answer_zh":136,"source_url":122},1736,"如何正确安装BioEmu及其依赖环境？","推荐分环境安装并使用Poetry管理依赖：\n\n1. 创建bioemu环境：\n```bash\ncd bioemu\npip install poetry\npoetry install\npip install pdb2pqr openmm msgpack_numpy brotli biotite cloudpathlib zstd einops\n```\n2. 创建colabfold_env环境：\n```bash\nconda create -n colabfold_env python=3.10\nuv pip install 'colabfold[alphafold-minus-jax]==1.5.4'\n```\n3. 使用Docker容器可避免环境冲突。",{"id":138,"question_zh":139,"answer_zh":140,"source_url":122},1737,"运行时报错'RuntimeError: Given normalized_shape=[384]'如何解决？","此问题源于模型输入维度不匹配，需执行以下修复：\n\n1. 更新至最新版本0.1.12：\n```bash\npip install bioemu==0.1.12\n```\n2. 若仍存在问题，手动修改模型文件：\n- 定位到`alphafold\u002Fmodel\u002Fmodules.py`\n- 在第154行添加维度转换逻辑：\n```python\nsingle_repr = single_repr[:, :, :384]\n```\n3. 
确保使用的ColabFold版本与BioEmu兼容。",[142,147,152,157,161,166,171,176,181,186,191,196,201,206,210,215],{"id":143,"version":144,"summary_zh":145,"released_at":146},101216,"v.1.3.0","## What's Changed\r\n* Saves physics-filtered frame to topology.pdb, instead of 0th sample by @thempel in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F186\r\n* Report pytest durations by @sarahnlewis in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F189\r\n* Speed up md-relaxation test by @thempel in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F190\r\n* feat: steering to avoid chain breaks and clashes by @ludwigwinkler in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F171\r\n* chore: update OpenMM version by @thempel in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F192\r\n* Remove Disulfide Potential by @ludwigwinkler in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F198\r\n* feat: Inline ColabFold and AlphaFold2 into BioEmu by @sarahnlewis in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F206\r\n\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fcompare\u002Fv.1.2.0...v.1.3.0","2026-03-30T14:37:48",{"id":148,"version":149,"summary_zh":150,"released_at":151},101217,"v.1.2.0","## What's Changed\r\n* Toy example of PPFT training by @sarahnlewis in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F134\r\n* Check input sequence validity by @josejimenezluna in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F130\r\n* Various documentation and error message updates by @franknoe @josejimenezluna @sarahnlewis \r\n* Default model changed by @josejimenezluna  in  https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F136 \r\n* Updates to the DPM solver by @YuuuXie  in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F148 
and https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F151\r\n* Use KDTree instead of mdtraj.compute_contacts in _filter_unphysical_traj_masks to mitigate MemoryError and improve performance by @ahmedselim2017 in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F158\r\n* Support extra_residue_embeds  by @sarahnlewis in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F174 \r\n* By default, use system time for random seed, not zero by @sarahnlewis in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F180\r\n\r\n## New Contributors\r\n* @ahmedselim2017 made their first contribution in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F160\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fcompare\u002Fv.0.1.12...v.1.2.0","2025-11-24T11:16:01",{"id":153,"version":154,"summary_zh":155,"released_at":156},101218,"v.1.1.0","## What's Changed\r\n* chore: remove `hpacker` citation from `README.md` by @josejimenezluna in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F119\r\n* chore: check for supported model names by @josejimenezluna in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F126\r\n* chore: Add instructions how to use the foundry endpoint by @ryotatomioka in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F124\r\n* chore: make sure input sequence is valid by @josejimenezluna in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F130\r\n* feat: PPFT training by @sarahnlewis in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F134\r\n* Update MODEL_CARD.md by @ryotatomioka in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F125\r\n\r\n\r\n**Full Changelog**: 
https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fcompare\u002Fv.0.1.12...v.1.1.0","2025-06-27T07:41:04",{"id":158,"version":159,"summary_zh":80,"released_at":160},101219,"v.1.0.0pre","2025-06-24T12:30:56",{"id":162,"version":163,"summary_zh":164,"released_at":165},101220,"v.0.1.12","## What's Changed\r\n* fix: don't require `ensurepip` by @ryotatomioka in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F120\r\n\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fcompare\u002Fv.0.1.11...v.0.1.12","2025-05-07T11:27:47",{"id":167,"version":168,"summary_zh":169,"released_at":170},101221,"v.0.1.11","## What's Changed\r\n* Increase `--maxkb` in `pre-commit-config` by @josejimenezluna in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F110\r\n* feat: add afdb cluster definitions by @oabdin in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F109\r\n* Update README.md by @franknoe in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F112\r\n* add hpacker reference to readme by @thempel in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F115\r\n* fix: colabfold install by @josejimenezluna in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F116\r\n* Revert \"chore: tie colabfold dir to the python executable (#104)\" by @ryotatomioka in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F118\r\n\r\n## New Contributors\r\n* @oabdin made their first contribution in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F109\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fcompare\u002Fv.0.1.10...v.0.1.11","2025-05-06T14:27:36",{"id":172,"version":173,"summary_zh":174,"released_at":175},101222,"v.0.1.10","## What's Changed\r\n* fix: avoid globally enabling stackprinter by @ryotatomioka in 
https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F108\r\n* fix: remove debug slice by @thempel in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F107\r\n\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fcompare\u002Fv.0.1.9...v.0.1.10","2025-04-25T10:04:31",{"id":177,"version":178,"summary_zh":179,"released_at":180},101223,"v.0.1.9","## What's Changed\r\n* chore: tie colabfold dir to the python executable by @josejimenezluna in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F104\r\n* Setting N=50 as default in DPMsolver by @franknoe in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F105\r\n* bump version to `0.1.9` by @josejimenezluna in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F106\r\n\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fcompare\u002Fv.0.1.8...v.0.1.9","2025-04-24T12:09:20",{"id":182,"version":183,"summary_zh":184,"released_at":185},101224,"v.0.1.8","## What's Changed\r\n* fix: reconstructed sidechains saved in current dir by @josejimenezluna in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F82\r\n* minor chore: add progress bar while sampling by @josejimenezluna in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F84\r\n* Update README.md by @josejimenezluna in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F88\r\n* feat: Pass sys.executable to setup.sh to prevent installation issue by @ryotatomioka in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F91\r\n* fix: smarter `CONDA_PREFIX` resolution during hpacker install by @josejimenezluna in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F93\r\n* chore: add `cache_so3_dir` to `sample.main` by @josejimenezluna in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F98\r\n* 
chore(deps): bump torch from ==2.4.0 to >=2.6.0, bump bioemu version to 0.1.8 by @dependabot in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F101\r\n* fix: more reliable MD relaxation & free MD by @thempel in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F100\r\n\r\n## New Contributors\r\n* @dependabot made their first contribution in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F101\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fcompare\u002Fv0.1.6...v.0.1.8","2025-04-23T10:43:38",{"id":187,"version":188,"summary_zh":189,"released_at":190},101225,"v.0.1.7","## What's Changed\r\n* fix: reconstructed sidechains saved in current dir by @josejimenezluna in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F82\r\n* minor chore: add progress bar while sampling by @josejimenezluna in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F84\r\n* Update README.md by @josejimenezluna in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F88\r\n* feat: Pass sys.executable to setup.sh to prevent installation issue by @ryotatomioka in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F91\r\n* fix: smarter `CONDA_PREFIX` resolution during hpacker install by @josejimenezluna in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F93\r\n\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fcompare\u002Fv0.1.6...v.0.1.7","2025-04-07T14:56:30",{"id":192,"version":193,"summary_zh":194,"released_at":195},101226,"v0.1.6","## What's Changed\r\n* chore: refactor `HPACKER_PYTHONBIN` by @josejimenezluna in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F79\r\n* Use venv instead of conda for colabfold installation by @ryotatomioka in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F77\r\n* fix: MD 
relaxation with multiple samples by @josejimenezluna in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F81\r\n\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fcompare\u002Fv0.1.5...v0.1.6","2025-03-24T14:47:29",{"id":197,"version":198,"summary_zh":199,"released_at":200},101227,"v0.1.5","## What's Changed\r\n* potential-fix: update `localcolabfold` by @josejimenezluna in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F64\r\n* chore: Add comment about multimers in README by @sarahnlewis in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F69\r\n* fix: skip MD relaxation if setup fails by @thempel in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F68\r\n* chore: use `colabfold` pip package by @josejimenezluna in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F76\r\n* bump version to `0.1.5` by @josejimenezluna in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F75\r\n\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fcompare\u002F0.1.4...0.1.5","2025-03-19T15:14:04",{"id":202,"version":203,"summary_zh":204,"released_at":205},101228,"0.1.4","## What's Changed\r\n* feat: support user's MSA by @YuuuXie in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F48\r\n* chore: add platform specific tag in `publish.yml` by @josejimenezluna in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F53\r\n* chore: Allow users to choose denoiser by @josejimenezluna in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F56\r\n* minor-fix: more helpful message in `ensure_colabfold_install` by @josejimenezluna in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F51\r\n* feat(bioemu): add filter of unphsyical samples by @mgastegger in 
https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F59\r\n* chore: bump version to `0.1.4` by @josejimenezluna in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F60\r\n\r\n## New Contributors\r\n* @YuuuXie made their first contribution in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F48\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fcompare\u002F0.1.2...0.1.4","2025-03-03T13:24:51",{"id":207,"version":208,"summary_zh":80,"released_at":209},101229,"0.1.3","2025-02-27T14:21:55",{"id":211,"version":212,"summary_zh":213,"released_at":214},101230,"0.1.2","## What's Changed\r\n* Fix hpacker installation script by @thempel in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F32\r\n* CI fixes by @sarahnlewis in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F41\r\n* Linux-only README.md by @josejimenezluna in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F43\r\n* chore: automated `hpacker` install by @josejimenezluna in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F45\r\n\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fcommits\u002F0.1.2","2025-02-26T17:11:06",{"id":216,"version":217,"summary_zh":218,"released_at":219},101231,"0.1.0","## What's Changed\r\n* Add MD dependencies, readme, and code coverage reporting by @sarahnlewis in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F6\r\n* model card updates by @thempel in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F7\r\n* Misc fixes by @sarahnlewis in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F8\r\n* add hpacker, update README by @sarahnlewis in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F9\r\n* feat: Save sequence.fasta file in output_dir by @ryotatomioka in 
https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F10\r\n* fix: change checkpoint by @sarahnlewis in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F11\r\n* Add dependabot.yml by @ryotatomioka in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F12\r\n* minor chore: remove white background from emu logo by @josejimenezluna in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F17\r\n* chore: remove dependabot.yml by @sarahnlewis in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F18\r\n* chore(emu): add sampling times by @mgastegger in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F19\r\n* chore: Update README.md by @josejimenezluna in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F20\r\n* chore: setup changes for notebooks by @josejimenezluna in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F21\r\n* chore: Update README.md by @sarahnlewis in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F22\r\n* fix: include checkpoints in non-editable install by @sarahnlewis in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F23\r\n* Update README.md by @franknoe in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F24\r\n* Fix setup.sh by @sarahnlewis in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F27\r\n* Use huggingface instead of keeping model weights in repo by @sarahnlewis in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F26\r\n\r\n## New Contributors\r\n* @sarahnlewis made their first contribution in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F6\r\n* @thempel made their first contribution in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F7\r\n* @ryotatomioka made their first contribution in 
https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F10\r\n* @josejimenezluna made their first contribution in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F17\r\n* @mgastegger made their first contribution in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F19\r\n* @franknoe made their first contribution in https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fpull\u002F24\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fbioemu\u002Fcommits\u002F0.1.0","2025-02-24T16:14:08"]