[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"tool-evo-design--evo":3,"similar-evo-design--evo":115},{"id":4,"github_repo":5,"name":6,"description_en":7,"description_zh":8,"ai_summary_zh":8,"readme_en":9,"readme_zh":10,"quickstart_zh":11,"use_case_zh":12,"hero_image_url":13,"owner_login":14,"owner_name":15,"owner_avatar_url":16,"owner_bio":17,"owner_company":18,"owner_location":18,"owner_email":18,"owner_twitter":18,"owner_website":19,"owner_url":20,"languages":21,"stars":34,"forks":35,"last_commit_at":36,"license":37,"difficulty_score":38,"env_os":39,"env_gpu":40,"env_ram":39,"env_deps":41,"category_tags":48,"github_topics":18,"view_count":38,"oss_zip_url":18,"oss_zip_packed_at":18,"status":50,"created_at":51,"updated_at":52,"faqs":53,"releases":79},1201,"evo-design\u002Fevo","evo","Biological foundation modeling from molecular to genome scale","evo是一个开源的生物基础模型，专注于从分子到基因组尺度的DNA序列建模与设计。它解决了传统生物信息学工具难以高效处理长DNA序列（如整个基因组）的问题，能以单核苷酸精度进行建模，计算和内存需求随序列长度近线性增长，大幅提升了基因组级分析的效率。evo拥有70亿参数，基于包含3000亿token的OpenGenome数据集训练，已成功生成SynGenome——首个包含超1000亿碱基对的AI合成DNA数据库。适合生物信息学研究者、AI开发者和合成生物学团队使用，尤其在基因功能研究、新基因设计和生物系统开发中提供强大支持。其核心技术采用StripedHyena架构，支持长上下文推理，让复杂基因组任务变得可行且高效。","# Evo: DNA foundation modeling from molecular to genome scale\n\n**We have developed a new model called Evo 2 that extends the Evo 1 model and its ideas to all domains of life. 
Please see [https:\u002F\u002Fgithub.com\u002Farcinstitute\u002Fevo2](https:\u002F\u002Fgithub.com\u002Farcinstitute\u002Fevo2) for more details.**\n\n![Evo](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fevo-design_evo_readme_54b0f365577a.jpg)\n\nEvo is a biological foundation model capable of long-context modeling and design.\nEvo uses the [StripedHyena architecture](https:\u002F\u002Fgithub.com\u002Ftogethercomputer\u002Fstripedhyena) to enable modeling of sequences at a single-nucleotide, byte-level resolution with near-linear scaling of compute and memory relative to context length.\nEvo has 7 billion parameters and is trained on [OpenGenome](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FLongSafari\u002Fopen-genome), a prokaryotic whole-genome dataset containing ~300 billion tokens.\n\nWe describe Evo in the paper [“Sequence modeling and design from molecular to genome scale with Evo”](https:\u002F\u002Fwww.science.org\u002Fdoi\u002F10.1126\u002Fscience.ado9336).\n\nWe describe Evo 1.5 in the paper [“Semantic design of functional de novo genes from a genomic language model”](https:\u002F\u002Fwww.nature.com\u002Farticles\u002Fs41586-025-09749-7). We used the Evo 1.5 model to generate [SynGenome](https:\u002F\u002Fevodesign.org\u002Fsyngenome\u002F), the first AI-generated genomics database containing over 100 billion base pairs of synthetic DNA sequences.\n\nWe provide the following model checkpoints:\n| Checkpoint Name                        | Description |\n|----------------------------------------|-------------|\n| `evo-1.5-8k-base`   | A model pretrained with 8,192 context obtained by extending the pretraining of `evo-1-8k-base` to process 50% more training data. |\n| `evo-1-8k-base`     | A model pretrained with 8,192 context. We use this model as the base model for molecular-scale finetuning tasks. |\n| `evo-1-131k-base`   | A model pretrained with 131,072 context using `evo-1-8k-base` as the base model. 
We use this model to reason about and generate sequences at the genome scale. |\n| `evo-1-8k-crispr`   | A model finetuned using `evo-1-8k-base` as the base model to generate CRISPR-Cas systems. |\n| `evo-1-8k-transposon`   | A model finetuned using `evo-1-8k-base` as the base model to generate IS200\u002FIS605 transposons. |\n\n## News\n\n**December 17, 2024:** We have found and fixed a bug in the code for Evo model inference affecting package versions from Nov 15-Dec 16, 2024, which has been corrected in release versions 0.3 and above. If you installed the package during this timeframe, please upgrade to correct the issue.\n\n## Contents\n\n- [Setup](#setup)\n  - [Requirements](#requirements)\n  - [Installation](#installation)\n- [Usage](#usage)\n- [HuggingFace](#huggingface)\n- [Together API](#together-api)\n- [colab](https:\u002F\u002Fcolab.research.google.com\u002Fgithub\u002Fevo-design\u002Fevo\u002Fblob\u002Fmain\u002Fscripts\u002Fhello_evo.ipynb)\n- [Playground wrapper](https:\u002F\u002Fevo.nitro.bio\u002F)\n- [Dataset](#dataset)\n- [Citation](#citation)\n\n## Setup\n\n### Requirements\n\nEvo is based on [StripedHyena](https:\u002F\u002Fgithub.com\u002Ftogethercomputer\u002Fstripedhyena\u002Ftree\u002Fmain).\n\nEvo uses [FlashAttention-2](https:\u002F\u002Fgithub.com\u002FDao-AILab\u002Fflash-attention), which may not work on all GPU architectures.\nPlease consult the [FlashAttention GitHub repository](https:\u002F\u002Fgithub.com\u002FDao-AILab\u002Fflash-attention#installation-and-features) for the current list of supported GPUs. Currently, Evo supports FlashAttention versions \u003C= 2.7.4.post0.\n\nMake sure to install the correct [PyTorch version](https:\u002F\u002Fpytorch.org\u002F) on your system. PyTorch versions >= 2.7.0 and \u003C 2.8.0a0 are supported by FlashAttention 2.7.4.\n\nWe recommend using a fresh conda environment to install these prerequisites. 
Below is an example of how to install these:\n```bash\nconda install -c nvidia cuda-nvcc cuda-cudart-dev\nconda install -c conda-forge flash-attn=2.7.4\n```\n\n### Installation\n\nYou can install Evo using `pip`\n```bash\npip install evo-model\n```\nor directly from the GitHub source\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002Fevo-design\u002Fevo.git\ncd evo\u002F\npip install .\n```\n\nIf you are not using the conda-forge FlashAttention installation shown above, which will automatically install PyTorch, we recommend that you install the PyTorch library before installing all other dependencies (due to dependency issues of the `flash-attn` library; see, e.g., this [issue](https:\u002F\u002Fgithub.com\u002FDao-AILab\u002Fflash-attention\u002Fissues\u002F246)).\n\nOne of our [example scripts](scripts\u002F), demonstrating how to go from generating sequences with Evo to folding proteins ([scripts\u002Fgeneration_to_folding.py](scripts\u002Fgeneration_to_folding.py)), further requires the installation of `prodigal`. 
We have created an [environment.yml](environment.yml) file for this:\n\n```bash\nconda env create -f environment.yml\nconda activate evo-design\n```\n\n## Usage\n\nBelow is an example of how to download Evo and use it locally through the Python API.\n```python\nfrom evo import Evo\nimport torch\n\ndevice = 'cuda:0'\n\nevo_model = Evo('evo-1-131k-base')\nmodel, tokenizer = evo_model.model, evo_model.tokenizer\nmodel.to(device)\nmodel.eval()\n\nsequence = 'ACGT'\ninput_ids = torch.tensor(\n    tokenizer.tokenize(sequence),\n    dtype=torch.int,\n).to(device).unsqueeze(0)\n\nwith torch.no_grad():\n    logits, _ = model(input_ids) # (batch, length, vocab)\n\nprint('Logits: ', logits)\nprint('Shape (batch, length, vocab): ', logits.shape)\n```\nAn example of batched inference can be found in [`scripts\u002Fexample_inference.py`](scripts\u002Fexample_inference.py).\n\nWe provide an [example script](scripts\u002Fgenerate.py) for how to prompt the model and sample a set of sequences given the prompt.\n```bash\npython -m scripts.generate \\\n    --model-name 'evo-1-131k-base' \\\n    --prompt ACGT \\\n    --n-samples 10 \\\n    --n-tokens 100 \\\n    --temperature 1. 
\\\n    --top-k 4 \\\n    --device cuda:0\n```\n\nWe also provide an [example script](scripts\u002Fscore.py) for using the model to score the log-likelihoods of a set of sequences.\n```bash\npython -m scripts.score \\\n    --input-fasta examples\u002Fexample_seqs.fasta \\\n    --output-tsv scores.tsv \\\n    --model-name 'evo-1-131k-base' \\\n    --device cuda:0\n```\n\n## HuggingFace\n\nEvo is integrated with [HuggingFace](https:\u002F\u002Fhuggingface.co\u002Ftogethercomputer\u002Fevo-1-131k-base).\n```python\nfrom transformers import AutoConfig, AutoModelForCausalLM\n\nmodel_name = 'togethercomputer\u002Fevo-1-8k-base'\n\nmodel_config = AutoConfig.from_pretrained(model_name, trust_remote_code=True, revision=\"1.1_fix\")\nmodel_config.use_cache = True\n\nmodel = AutoModelForCausalLM.from_pretrained(\n    model_name,\n    config=model_config,\n    trust_remote_code=True,\n    revision=\"1.1_fix\"\n)\n```\n\n\n## Together API\n\nEvo is available through Together AI with a [web UI](https:\u002F\u002Fapi.together.xyz\u002Fplayground\u002Flanguage\u002Ftogethercomputer\u002Fevo-1-131k-base), where you can generate DNA sequences with a chat-like interface.\n\nFor more detailed or batch workflows, you can call the Together API with a simple example below.\n\n\n```python\nimport openai\nimport os\n\n# Fill in your API information here.\nclient = openai.OpenAI(\n  api_key=TOGETHER_API_KEY,\n  base_url='https:\u002F\u002Fapi.together.xyz',\n)\n\nchat_completion = client.chat.completions.create(\n  messages=[\n    {\n      \"role\": \"system\",\n      \"content\": \"\"\n    },\n    {\n      \"role\": \"user\",\n      \"content\": \"ACGT\", # Prompt the model with a sequence.\n    }\n  ],\n  model=\"togethercomputer\u002Fevo-1-131k-base\",\n  max_tokens=128, # Sample some number of new tokens.\n  logprobs=True\n)\nprint(\n    chat_completion.choices[0].logprobs.token_logprobs,\n    chat_completion.choices[0].message.content\n)\n```\n\n## Dataset\n\nThe OpenGenome dataset for 
pretraining Evo is available at [Hugging Face datasets](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FLongSafari\u002Fopen-genome).\n\n## Citation\n\nPlease cite the following publication when referencing Evo.\n\n```\n@article{nguyen2024sequence,\n   author = {Eric Nguyen and Michael Poli and Matthew G. Durrant and Brian Kang and Dhruva Katrekar and David B. Li and Liam J. Bartie and Armin W. Thomas and Samuel H. King and Garyk Brixi and Jeremy Sullivan and Madelena Y. Ng and Ashley Lewis and Aaron Lou and Stefano Ermon and Stephen A. Baccus and Tina Hernandez-Boussard and Christopher Ré and Patrick D. Hsu and Brian L. Hie },\n   title = {Sequence modeling and design from molecular to genome scale with Evo},\n   journal = {Science},\n   volume = {386},\n   number = {6723},\n   pages = {eado9336},\n   year = {2024},\n   doi = {10.1126\u002Fscience.ado9336},\n   URL = {https:\u002F\u002Fwww.science.org\u002Fdoi\u002Fabs\u002F10.1126\u002Fscience.ado9336},\n}\n```\n\nPlease cite the following publication when referencing Evo 1.5.\n\n```\n@article{merchant2025semantic,\n    author = {Merchant, Aditi T and King, Samuel H and Nguyen, Eric and Hie, Brian L},\n    title = {Semantic design of functional de novo genes from a genomic language model},\n    year = {2025},\n    doi = {10.1038\u002Fs41586-025-09749-7},\n    URL = {https:\u002F\u002Fwww.nature.com\u002Farticles\u002Fs41586-025-09749-7},\n    journal = {Nature}\n}\n```\n","# Evo：从分子尺度到基因组尺度的DNA基础模型\n\n**我们开发了一种名为Evo 2的新模型，它将Evo 
1模型及其理念扩展到了所有生命领域。更多详情请参见[https:\u002F\u002Fgithub.com\u002Farcinstitute\u002Fevo2](https:\u002F\u002Fgithub.com\u002Farcinstitute\u002Fevo2)。**\n\n![Evo](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fevo-design_evo_readme_54b0f365577a.jpg)\n\nEvo是一个能够进行长上下文建模和设计的生物基础模型。\nEvo使用[StripedHyena架构](https:\u002F\u002Fgithub.com\u002Ftogethercomputer\u002Fstripedhyena)，能够在单核苷酸、字节级分辨率下对序列进行建模，同时计算和内存开销与上下文长度呈近线性关系。\nEvo拥有70亿参数，并在[OpenGenome](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FLongSafari\u002Fopen-genome)数据集上进行训练，该数据集是一个原核生物全基因组数据集，包含约3000亿个标记。\n\n我们在论文《利用Evo实现从分子尺度到基因组尺度的序列建模与设计》中详细介绍了Evo（[https:\u002F\u002Fwww.science.org\u002Fdoi\u002F10.1126\u002Fscience.ado9336](https:\u002F\u002Fwww.science.org\u002Fdoi\u002F10.1126\u002Fscience.ado9336)）。\n\n我们还在论文《基于基因组语言模型的功能性从头设计基因》中介绍了Evo 1.5版本（[https:\u002F\u002Fwww.nature.com\u002Farticles\u002Fs41586-025-09749-7](https:\u002F\u002Fwww.nature.com\u002Farticles\u002Fs41586-025-09749-7)）。我们使用Evo 1.5模型生成了[SynGenome](https:\u002F\u002Fevodesign.org\u002Fsyngenome\u002F)，这是首个由人工智能生成的基因组学数据库，包含超过1000亿个碱基对的合成DNA序列。\n\n我们提供了以下模型检查点：\n| 检查点名称                        | 描述 |\n|----------------------------------------|-------------|\n| `evo-1.5-8k-base`   | 一个预训练模型，上下文长度为8,192，通过将`evo-1-8k-base`的预训练数据量增加50%获得。 |\n| `evo-1-8k-base`     | 一个预训练模型，上下文长度为8,192。我们将其用作分子尺度微调任务的基础模型。 |\n| `evo-1-131k-base`   | 一个预训练模型，上下文长度为131,072，以`evo-1-8k-base`为基础模型。我们使用该模型来推理和生成基因组尺度的序列。 |\n| `evo-1-8k-crispr`   | 一个微调模型，以`evo-1-8k-base`为基础模型，用于生成CRISPR-Cas系统。 |\n| `evo-1-8k-transposon`   | 一个微调模型，以`evo-1-8k-base`为基础模型，用于生成IS200\u002FIS605转座子。 |\n\n## 新闻\n\n**2024年12月17日：** 我们发现并修复了Evo模型推理代码中的一个错误，该错误影响了2024年11月15日至12月16日期间的软件包版本。此问题已在0.3版及更高版本中得到解决。如果您在此期间安装了该软件包，请升级以修复问题。\n\n## 目录\n\n- [设置](#setup)\n  - [要求](#requirements)\n  - [安装](#installation)\n- [使用](#usage)\n- [HuggingFace](#huggingface)\n- [Together API](#together-api)\n- 
[colab](https:\u002F\u002Fcolab.research.google.com\u002Fgithub\u002Fevo-design\u002Fevo\u002Fblob\u002Fmain\u002Fscripts\u002Fhello_evo.ipynb)\n- [Playground封装](https:\u002F\u002Fevo.nitro.bio\u002F)\n- [数据集](#dataset)\n- [引用](#citation)\n\n## 设置\n\n### 要求\n\nEvo基于[StripedHyena](https:\u002F\u002Fgithub.com\u002Ftogethercomputer\u002Fstripedhyena\u002Ftree\u002Fmain)。\n\nEvo使用[FlashAttention-2](https:\u002F\u002Fgithub.com\u002FDao-AILab\u002Fflash-attention)，但并非所有GPU架构都支持该库。\n请参阅[FlashAttention GitHub仓库](https:\u002F\u002Fgithub.com\u002FDao-AILab\u002Fflash-attention#installation-and-features)以获取当前支持的GPU列表。目前，Evo支持FlashAttention版本≤2.7.4.post0。\n\n请确保在您的系统上安装正确的[PyTorch版本](https:\u002F\u002Fpytorch.org\u002F)。FlashAttention 2.7.4支持PyTorch版本≥2.7.0且\u003C2.8.0a0。\n\n我们建议使用全新的conda环境来安装这些先决条件。以下是安装示例：\n```bash\nconda install -c nvidia cuda-nvcc cuda-cudart-dev\nconda install -c conda-forge flash-attn=2.7.4\n```\n\n### 安装\n\n您可以通过`pip`安装Evo：\n```bash\npip install evo-model\n```\n或者直接从GitHub源码安装：\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002Fevo-design\u002Fevo.git\ncd evo\u002F\npip install .\n```\n\n如果您没有使用上述conda-forge FlashAttention安装方法——该方法会自动安装PyTorch——我们建议您在安装其他依赖之前先安装PyTorch库（由于`flash-attn`库的依赖问题；例如，请参阅此[问题](https:\u002F\u002Fgithub.com\u002FDao-AILab\u002Fflash-attention\u002Fissues\u002F246)）。\n\n我们的一个[示例脚本](scripts\u002F)展示了如何从使用Evo生成序列到蛋白质折叠的过程（[scripts\u002Fgeneration_to_folding.py](scripts\u002Fgeneration_to_folding.py)），该过程还需要安装`prodigal`。为此，我们创建了一个[environment.yml](environment.yml)文件：\n\n```bash\nconda env create -f environment.yml\nconda activate evo-design\n```\n\n## 使用\n\n以下是如何下载Evo并在本地通过Python API使用它的示例。\n```python\nfrom evo import Evo\nimport torch\n\ndevice = 'cuda:0'\n\nevo_model = Evo('evo-1-131k-base')\nmodel, tokenizer = evo_model.model, evo_model.tokenizer\nmodel.to(device)\nmodel.eval()\n\nsequence = 'ACGT'\ninput_ids = torch.tensor(\n    tokenizer.tokenize(sequence),\n    dtype=torch.int,\n).to(device).unsqueeze(0)\n\nwith 
torch.no_grad():\n    logits, _ = model(input_ids) # (batch, length, vocab)\n\nprint('Logits: ', logits)\nprint('形状 (batch, length, vocab): ', logits.shape)\n```\n批量推理的示例可以在[`scripts\u002Fexample_inference.py`](scripts\u002Fexample_inference.py)中找到。\n\n我们还提供了一个[示例脚本](scripts\u002Fgenerate.py)，演示如何向模型输入提示并根据提示采样一组序列。\n```bash\npython -m scripts.generate \\\n    --model-name 'evo-1-131k-base' \\\n    --prompt ACGT \\\n    --n-samples 10 \\\n    --n-tokens 100 \\\n    --temperature 1. \\\n    --top-k 4 \\\n    --device cuda:0\n```\n\n此外，我们还提供了一个[示例脚本](scripts\u002Fscore.py)，用于使用模型对一组序列的对数似然值进行评分。\n```bash\npython -m scripts.score \\\n    --input-fasta examples\u002Fexample_seqs.fasta \\\n    --output-tsv scores.tsv \\\n    --model-name 'evo-1-131k-base' \\\n    --device cuda:0\n```\n\n## HuggingFace\n\nEvo已集成到[HuggingFace](https:\u002F\u002Fhuggingface.co\u002Ftogethercomputer\u002Fevo-1-131k-base)中。\n```python\nfrom transformers import AutoConfig, AutoModelForCausalLM\n\nmodel_name = 'togethercomputer\u002Fevo-1-8k-base'\n\nmodel_config = AutoConfig.from_pretrained(model_name, trust_remote_code=True, revision=\"1.1_fix\")\nmodel_config.use_cache = True\n\nmodel = AutoModelForCausalLM.from_pretrained(\n    model_name,\n    config=model_config,\n    trust_remote_code=True,\n    revision=\"1.1_fix\"\n)\n```\n\n\n## Together API\n\nEvo可通过Together AI提供的[网页界面](https:\u002F\u002Fapi.together.xyz\u002Fplayground\u002Flanguage\u002Ftogethercomputer\u002Fevo-1-131k-base)访问，在那里您可以使用类似聊天的界面生成DNA序列。\n\n对于更复杂或批量的工作流程，您可以使用下面的简单示例调用Together API。\n\n\n```python\nimport openai\nimport os\n\n# 在此处填写您的 API 信息。\nclient = openai.OpenAI(\n  api_key=TOGETHER_API_KEY,\n  base_url='https:\u002F\u002Fapi.together.xyz',\n)\n\nchat_completion = client.chat.completions.create(\n  messages=[\n    {\n      \"role\": \"system\",\n      \"content\": \"\"\n    },\n    {\n      \"role\": \"user\",\n      \"content\": \"ACGT\", # 向模型提供一个序列作为提示。\n    }\n  ],\n  
model=\"togethercomputer\u002Fevo-1-131k-base\",\n  max_tokens=128, # 采样一定数量的新标记。\n  logprobs=True\n)\nprint(\n    chat_completion.choices[0].logprobs.token_logprobs,\n    chat_completion.choices[0].message.content\n)\n```\n\n## 数据集\n\n用于预训练 Evo 的 OpenGenome 数据集可在 [Hugging Face 数据集](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FLongSafari\u002Fopen-genome) 上获取。\n\n## 引用\n\n在引用 Evo 时，请引用以下出版物（BibTeX 条目保留论文原始英文信息，作者间以 `and` 分隔）：\n\n```\n@article{nguyen2024sequence,\n   author = {Eric Nguyen and Michael Poli and Matthew G. Durrant and Brian Kang and Dhruva Katrekar and David B. Li and Liam J. Bartie and Armin W. Thomas and Samuel H. King and Garyk Brixi and Jeremy Sullivan and Madelena Y. Ng and Ashley Lewis and Aaron Lou and Stefano Ermon and Stephen A. Baccus and Tina Hernandez-Boussard and Christopher Ré and Patrick D. Hsu and Brian L. Hie},\n   title = {Sequence modeling and design from molecular to genome scale with Evo},\n   journal = {Science},\n   volume = {386},\n   number = {6723},\n   pages = {eado9336},\n   year = {2024},\n   doi = {10.1126\u002Fscience.ado9336},\n   URL = {https:\u002F\u002Fwww.science.org\u002Fdoi\u002Fabs\u002F10.1126\u002Fscience.ado9336},\n}\n```\n\n在引用 Evo 1.5 时，请引用以下出版物：\n\n```\n@article{merchant2025semantic,\n    author = {Merchant, Aditi T and King, Samuel H and Nguyen, Eric and Hie, Brian L},\n    title = {Semantic design of functional de novo genes from a genomic language model},\n    year = {2025},\n    doi = {10.1038\u002Fs41586-025-09749-7},\n    URL = {https:\u002F\u002Fwww.nature.com\u002Farticles\u002Fs41586-025-09749-7},\n    journal = {Nature}\n}\n```","# Evo 快速上手指南\n\n## 环境准备\n\n- **系统要求**：NVIDIA GPU（需支持 FlashAttention-2，如 RTX 3090+）\n- **前置依赖**：\n  - PyTorch 2.7.0+（需 \u003C 2.8.0a0）\n  - FlashAttention-2 2.7.4\n  - **国内加速建议**：安装时使用清华镜像源加速\n    ```bash\n    conda config --add channels https:\u002F\u002Fmirrors.tuna.tsinghua.edu.cn\u002Fanaconda\u002Fpkgs\u002Ffree\u002F\n    conda config --set channel_priority strict\n    ```\n\n## 安装步骤\n\n```bash\n# 创建并激活conda环境（推荐）\nconda create -n evo-env python=3.10 -y\nconda activate evo-env\n\n# 安装依赖（使用清华镜像加速）\nconda install -c 
conda-forge flash-attn=2.7.4\n\n# 安装Evo\npip install evo-model\n```\n\n> **备选安装**（从GitHub源码安装）：\n> ```bash\n> git clone https:\u002F\u002Fgithub.com\u002Fevo-design\u002Fevo.git\n> cd evo\u002F\n> pip install .\n> ```\n\n## 基本使用\n\n以下为最简Python使用示例（生成DNA序列logits）：\n\n```python\nfrom evo import Evo\nimport torch\n\ndevice = 'cuda:0'\n\nevo_model = Evo('evo-1-131k-base')\nmodel, tokenizer = evo_model.model, evo_model.tokenizer\nmodel.to(device)\nmodel.eval()\n\nsequence = 'ACGT'\ninput_ids = torch.tensor(\n    tokenizer.tokenize(sequence),\n    dtype=torch.int,\n).to(device).unsqueeze(0)\n\nwith torch.no_grad():\n    logits, _ = model(input_ids)\n\nprint('Logits: ', logits)\nprint('Shape (batch, length, vocab): ', logits.shape)\n```\n\n> **说明**：使用 `evo-1-131k-base` 模型进行基因组级任务。确保GPU可用，运行前检查 `torch.cuda.is_available()`。","某合成生物学初创公司研发团队正开发新型CRISPR-Cas基因编辑工具，需设计高兼容性DNA序列以实现精准基因靶向。\n\n### 没有 evo 时\n- 依赖资深生物学家手动设计序列，单个功能模块耗时3周以上，团队月均仅完成5个迭代\n- 传统工具仅支持8k bp短序列建模，无法覆盖完整基因组功能区域（如启动子-编码区连续性）\n- 生成序列在细胞实验中功能失败率高达65%，需反复修改测试，实验成本激增\n\n### 使用 evo 后\n- evo-1-8k-crispr模型1小时内生成优化序列，设计周期压缩至数小时，月迭代量提升至50+\n- 利用131k上下文能力确保序列在基因组尺度的结构完整性（如避免剪切位点冲突）\n- 实验验证成功率提升至85%，减少无效实验80%，加速从设计到验证的闭环\n\nevo将基因组级DNA设计效率提升40倍，让生物工程师从重复劳动中解放，聚焦创新突破。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fevo-design_evo_54b0f365.jpg","evo-design","Laboratory of Evolutionary Design","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Fevo-design_b4f3a00b.png","Developing biological AI for human good",null,"https:\u002F\u002Fevodesign.org\u002F","https:\u002F\u002Fgithub.com\u002Fevo-design",[22,26,30],{"name":23,"color":24,"percentage":25},"Python","#3572A5",53.4,{"name":27,"color":28,"percentage":29},"Jupyter Notebook","#DA5B0B",45.8,{"name":31,"color":32,"percentage":33},"Shell","#89e051",0.8,1495,177,"2026-04-03T02:58:41","Apache-2.0",3,"未说明","需要 NVIDIA GPU，显存 16GB+，CUDA 
11.7+",{"notes":42,"python":43,"dependencies":44},"建议使用 conda 管理环境，首次运行需下载约 5GB 模型文件，注意升级到 0.3+ 版本以修复 bug","3.8+",[45,46,47],"torch>=2.7.0","flash-attn==2.7.4","transformers>=4.30",[49],"语言模型","ready","2026-03-27T02:49:30.150509","2026-04-06T05:35:30.643330",[54,59,64,69,74],{"id":55,"question_zh":56,"answer_zh":57,"source_url":58},5482,"如何解决 'rotary_emb is not installed' 错误？","安装 Triton 库，使用命令 `pip install triton` 或 `pip install evo-model triton`。例如，在 Google Colab 上，执行 `pip -q install evo-model triton` 可解决问题。","https:\u002F\u002Fgithub.com\u002Fevo-design\u002Fevo\u002Fissues\u002F23",{"id":60,"question_zh":61,"answer_zh":62,"source_url":63},5483,"如何解决 'MHA is not defined' 错误？","降级 flash_attn 版本，使用命令 `pip install flash_attn==2.6.0.post1`。同时确保已安装 Triton 库（`pip install triton`）。","https:\u002F\u002Fgithub.com\u002Fevo-design\u002Fevo\u002Fissues\u002F10",{"id":65,"question_zh":66,"answer_zh":67,"source_url":68},5484,"为什么 Evo 131k 基础模型输出非 DNA 字符？","Together API 上的 Evo 131k 模型存在输出问题，建议使用 Hugging Face 版本进行推理，而非 Together API。例如，通过 Hugging Face 的 AutoModelForCausalLM 加载模型。","https:\u002F\u002Fgithub.com\u002Fevo-design\u002Fevo\u002Fissues\u002F56",{"id":70,"question_zh":71,"answer_zh":72,"source_url":73},5485,"如何设置输入序列长度以避免 CUDA Out of Memory 错误？","通过修改模型配置设置 `max_seqlen` 和 `max_new_tokens`。例如：\n```python\nmodel_config = AutoConfig.from_pretrained('togethercomputer\u002Fevo-1-131k-base', trust_remote_code=True, revision=\"1.1_fix\")\nmodel_config.max_seqlen = 500_000\nmodel = AutoModelForCausalLM.from_pretrained('togethercomputer\u002Fevo-1-131k-base', config=model_config, trust_remote_code=True, revision=\"1.1_fix\")\noutputs = model.generate(input_ids, max_new_tokens=500_000, temperature=1., top_k=4)\n```","https:\u002F\u002Fgithub.com\u002Fevo-design\u002Fevo\u002Fissues\u002F24",{"id":75,"question_zh":76,"answer_zh":77,"source_url":78},5486,"使用 Together API 时模型找不到，如何解决？","模型在 Together API 上可用，但需使用正确的模型名称和端点。注意：博客中的 API 链接错误，应为 
`https:\u002F\u002Fapi.together.xyz\u002Fplayground\u002Flanguage\u002Ftogethercomputer\u002Fevo-1-131k-base` 而非 `genomics`。或直接使用 Hugging Face 版本。","https:\u002F\u002Fgithub.com\u002Fevo-design\u002Fevo\u002Fissues\u002F14",[80,85,90,95,100,105,110],{"id":81,"version":82,"summary_zh":83,"released_at":84},104989,"v0.5","Fix compatibility with transformers v5 and numpy 2.0 (https:\u002F\u002Fgithub.com\u002Fevo-design\u002Fevo\u002Fpull\u002F135)","2026-02-16T17:50:14",{"id":86,"version":87,"summary_zh":88,"released_at":89},104990,"v0.4","Add Evo 1.5: https:\u002F\u002Fgithub.com\u002Fevo-design\u002Fevo\u002Fpull\u002F104","2024-12-18T16:25:45",{"id":91,"version":92,"summary_zh":93,"released_at":94},104991,"v0.3","Fix bug where incorrect Hugging Face checkpoint was being loaded for the base models: https:\u002F\u002Fgithub.com\u002Fevo-design\u002Fevo\u002Fpull\u002F100","2024-12-17T05:45:56",{"id":96,"version":97,"summary_zh":98,"released_at":99},104992,"v0.2.1","Fix problem with package installation: https:\u002F\u002Fgithub.com\u002Fevo-design\u002Fevo\u002Fcommit\u002F15bb34bf85cf84539b9cf60097a7cdc2646eec56","2024-11-15T21:20:53",{"id":101,"version":102,"summary_zh":103,"released_at":104},104993,"v0.2","Update to include new checkpoints, data, and links associated with the Evo publication: https:\u002F\u002Fgithub.com\u002Fevo-design\u002Fevo\u002Fpull\u002F92","2024-11-15T21:12:14",{"id":106,"version":107,"summary_zh":108,"released_at":109},104994,"v0.1.2","Fixes attention projections bug and points HF to the new checkpoints.","2024-04-30T22:35:35",{"id":111,"version":112,"summary_zh":113,"released_at":114},104995,"v0.1.1","Initial evo release","2024-02-27T09:53:10",[116,127,135,149,157,165],{"id":117,"name":118,"github_repo":119,"description_zh":120,"stars":121,"difficulty_score":122,"last_commit_at":123,"category_tags":124,"status":50},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude 
Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",138956,2,"2026-04-05T11:33:21",[125,126,49],"开发框架","Agent",{"id":128,"name":129,"github_repo":130,"description_zh":131,"stars":132,"difficulty_score":122,"last_commit_at":133,"category_tags":134,"status":50},3704,"NextChat","ChatGPTNextWeb\u002FNextChat","NextChat 是一款轻量且极速的 AI 助手，旨在为用户提供流畅、跨平台的大模型交互体验。它完美解决了用户在多设备间切换时难以保持对话连续性，以及面对众多 AI 模型不知如何统一管理的痛点。无论是日常办公、学习辅助还是创意激发，NextChat 都能让用户随时随地通过网页、iOS、Android、Windows、MacOS 或 Linux 端无缝接入智能服务。\n\n这款工具非常适合普通用户、学生、职场人士以及需要私有化部署的企业团队使用。对于开发者而言，它也提供了便捷的自托管方案，支持一键部署到 Vercel 或 Zeabur 等平台。\n\nNextChat 的核心亮点在于其广泛的模型兼容性，原生支持 Claude、DeepSeek、GPT-4 及 Gemini Pro 等主流大模型，让用户在一个界面即可自由切换不同 AI 能力。此外，它还率先支持 MCP（Model Context Protocol）协议，增强了上下文处理能力。针对企业用户，NextChat 提供专业版解决方案，具备品牌定制、细粒度权限控制、内部知识库整合及安全审计等功能，满足公司对数据隐私和个性化管理的高标准要求。",87618,"2026-04-05T07:20:52",[125,49],{"id":136,"name":137,"github_repo":138,"description_zh":139,"stars":140,"difficulty_score":122,"last_commit_at":141,"category_tags":142,"status":50},2268,"ML-For-Beginners","microsoft\u002FML-For-Beginners","ML-For-Beginners 是由微软推出的一套系统化机器学习入门课程，旨在帮助零基础用户轻松掌握经典机器学习知识。这套课程将学习路径规划为 12 周，包含 26 节精炼课程和 52 道配套测验，内容涵盖从基础概念到实际应用的完整流程，有效解决了初学者面对庞大知识体系时无从下手、缺乏结构化指导的痛点。\n\n无论是希望转型的开发者、需要补充算法背景的研究人员，还是对人工智能充满好奇的普通爱好者，都能从中受益。课程不仅提供了清晰的理论讲解，还强调动手实践，让用户在循序渐进中建立扎实的技能基础。其独特的亮点在于强大的多语言支持，通过自动化机制提供了包括简体中文在内的 50 多种语言版本，极大地降低了全球不同背景用户的学习门槛。此外，项目采用开源协作模式，社区活跃且内容持续更新，确保学习者能获取前沿且准确的技术资讯。如果你正寻找一条清晰、友好且专业的机器学习入门之路，ML-For-Beginners 
将是理想的起点。",84991,"2026-04-05T10:45:23",[143,144,145,146,126,147,49,125,148],"图像","数据工具","视频","插件","其他","音频",{"id":150,"name":151,"github_repo":152,"description_zh":153,"stars":154,"difficulty_score":38,"last_commit_at":155,"category_tags":156,"status":50},3128,"ragflow","infiniflow\u002Fragflow","RAGFlow 是一款领先的开源检索增强生成（RAG）引擎，旨在为大语言模型构建更精准、可靠的上下文层。它巧妙地将前沿的 RAG 技术与智能体（Agent）能力相结合，不仅支持从各类文档中高效提取知识，还能让模型基于这些知识进行逻辑推理和任务执行。\n\n在大模型应用中，幻觉问题和知识滞后是常见痛点。RAGFlow 通过深度解析复杂文档结构（如表格、图表及混合排版），显著提升了信息检索的准确度，从而有效减少模型“胡编乱造”的现象，确保回答既有据可依又具备时效性。其内置的智能体机制更进一步，使系统不仅能回答问题，还能自主规划步骤解决复杂问题。\n\n这款工具特别适合开发者、企业技术团队以及 AI 研究人员使用。无论是希望快速搭建私有知识库问答系统，还是致力于探索大模型在垂直领域落地的创新者，都能从中受益。RAGFlow 提供了可视化的工作流编排界面和灵活的 API 接口，既降低了非算法背景用户的上手门槛，也满足了专业开发者对系统深度定制的需求。作为基于 Apache 2.0 协议开源的项目，它正成为连接通用大模型与行业专有知识之间的重要桥梁。",77062,"2026-04-04T04:44:48",[126,143,125,49,147],{"id":158,"name":159,"github_repo":160,"description_zh":161,"stars":162,"difficulty_score":38,"last_commit_at":163,"category_tags":164,"status":50},519,"PaddleOCR","PaddlePaddle\u002FPaddleOCR","PaddleOCR 是一款基于百度飞桨框架开发的高性能开源光学字符识别工具包。它的核心能力是将图片、PDF 等文档中的文字提取出来，转换成计算机可读取的结构化数据，让机器真正“看懂”图文内容。\n\n面对海量纸质或电子文档，PaddleOCR 解决了人工录入效率低、数字化成本高的问题。尤其在人工智能领域，它扮演着连接图像与大型语言模型（LLM）的桥梁角色，能将视觉信息直接转化为文本输入，助力智能问答、文档分析等应用场景落地。\n\nPaddleOCR 适合开发者、算法研究人员以及有文档自动化需求的普通用户。其技术优势十分明显：不仅支持全球 100 多种语言的识别，还能在 Windows、Linux、macOS 等多个系统上运行，并灵活适配 CPU、GPU、NPU 等各类硬件。作为一个轻量级且社区活跃的开源项目，PaddleOCR 既能满足快速集成的需求，也能支撑前沿的视觉语言研究，是处理文字识别任务的理想选择。",74913,"2026-04-05T10:44:17",[49,143,125,147],{"id":166,"name":167,"github_repo":168,"description_zh":169,"stars":170,"difficulty_score":38,"last_commit_at":171,"category_tags":172,"status":50},2181,"OpenHands","OpenHands\u002FOpenHands","OpenHands 是一个专注于 AI 驱动开发的开源平台，旨在让智能体（Agent）像人类开发者一样理解、编写和调试代码。它解决了传统编程中重复性劳动多、环境配置复杂以及人机协作效率低等痛点，通过自动化流程显著提升开发速度。\n\n无论是希望提升编码效率的软件工程师、探索智能体技术的研究人员，还是需要快速原型验证的技术团队，都能从中受益。OpenHands 提供了灵活多样的使用方式：既可以通过命令行（CLI）或本地图形界面在个人电脑上轻松上手，体验类似 Devin 的流畅交互；也能利用其强大的 Python SDK 自定义智能体逻辑，甚至在云端大规模部署上千个智能体并行工作。\n\n其核心技术亮点在于模块化的软件智能体 
SDK，这不仅构成了平台的引擎，还支持高度可组合的开发模式。此外，OpenHands 在 SWE-bench 基准测试中取得了 77.6% 的优异成绩，证明了其解决真实世界软件工程问题的能力。平台还具备完善的企业级功能，支持与 Slack、Jira 等工具集成，并提供细粒度的权限管理，适合从个人开发者到大型企业的各类用户场景。",70612,"2026-04-05T11:12:22",[49,126,125,146]]