[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-souzatharsis--podcastfy":3,"tool-souzatharsis--podcastfy":65},[4,17,27,35,48,57],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":16},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",154349,2,"2026-04-13T23:32:16",[13,14,15],"开发框架","Agent","语言模型","ready",{"id":18,"name":19,"github_repo":20,"description_zh":21,"stars":22,"difficulty_score":23,"last_commit_at":24,"category_tags":25,"status":16},4487,"LLMs-from-scratch","rasbt\u002FLLMs-from-scratch","LLMs-from-scratch 是一个基于 PyTorch 的开源教育项目，旨在引导用户从零开始一步步构建一个类似 ChatGPT 的大型语言模型（LLM）。它不仅是同名技术著作的官方代码库，更提供了一套完整的实践方案，涵盖模型开发、预训练及微调的全过程。\n\n该项目主要解决了大模型领域“黑盒化”的学习痛点。许多开发者虽能调用现成模型，却难以深入理解其内部架构与训练机制。通过亲手编写每一行核心代码，用户能够透彻掌握 Transformer 架构、注意力机制等关键原理，从而真正理解大模型是如何“思考”的。此外，项目还包含了加载大型预训练权重进行微调的代码，帮助用户将理论知识延伸至实际应用。\n\nLLMs-from-scratch 特别适合希望深入底层原理的 AI 开发者、研究人员以及计算机专业的学生。对于不满足于仅使用 API，而是渴望探究模型构建细节的技术人员而言，这是极佳的学习资源。其独特的技术亮点在于“循序渐进”的教学设计：将复杂的系统工程拆解为清晰的步骤，配合详细的图表与示例，让构建一个虽小但功能完备的大模型变得触手可及。无论你是想夯实理论基础，还是为未来研发更大规模的模型做准备",90106,3,"2026-04-06T11:19:32",[15,26,14,13],"图像",{"id":28,"name":29,"github_repo":30,"description_zh":31,"stars":32,"difficulty_score":10,"last_commit_at":33,"category_tags":34,"status":16},3704,"NextChat","ChatGPTNextWeb\u002FNextChat","NextChat 是一款轻量且极速的 AI 助手，旨在为用户提供流畅、跨平台的大模型交互体验。它完美解决了用户在多设备间切换时难以保持对话连续性，以及面对众多 AI 模型不知如何统一管理的痛点。无论是日常办公、学习辅助还是创意激发，NextChat 
都能让用户随时随地通过网页、iOS、Android、Windows、MacOS 或 Linux 端无缝接入智能服务。\n\n这款工具非常适合普通用户、学生、职场人士以及需要私有化部署的企业团队使用。对于开发者而言，它也提供了便捷的自托管方案，支持一键部署到 Vercel 或 Zeabur 等平台。\n\nNextChat 的核心亮点在于其广泛的模型兼容性，原生支持 Claude、DeepSeek、GPT-4 及 Gemini Pro 等主流大模型，让用户在一个界面即可自由切换不同 AI 能力。此外，它还率先支持 MCP（Model Context Protocol）协议，增强了上下文处理能力。针对企业用户，NextChat 提供专业版解决方案，具备品牌定制、细粒度权限控制、内部知识库整合及安全审计等功能，满足公司对数据隐私和个性化管理的高标准要求。",87618,"2026-04-05T07:20:52",[13,15],{"id":36,"name":37,"github_repo":38,"description_zh":39,"stars":40,"difficulty_score":10,"last_commit_at":41,"category_tags":42,"status":16},2268,"ML-For-Beginners","microsoft\u002FML-For-Beginners","ML-For-Beginners 是由微软推出的一套系统化机器学习入门课程，旨在帮助零基础用户轻松掌握经典机器学习知识。这套课程将学习路径规划为 12 周，包含 26 节精炼课程和 52 道配套测验，内容涵盖从基础概念到实际应用的完整流程，有效解决了初学者面对庞大知识体系时无从下手、缺乏结构化指导的痛点。\n\n无论是希望转型的开发者、需要补充算法背景的研究人员，还是对人工智能充满好奇的普通爱好者，都能从中受益。课程不仅提供了清晰的理论讲解，还强调动手实践，让用户在循序渐进中建立扎实的技能基础。其独特的亮点在于强大的多语言支持，通过自动化机制提供了包括简体中文在内的 50 多种语言版本，极大地降低了全球不同背景用户的学习门槛。此外，项目采用开源协作模式，社区活跃且内容持续更新，确保学习者能获取前沿且准确的技术资讯。如果你正寻找一条清晰、友好且专业的机器学习入门之路，ML-For-Beginners 将是理想的起点。",85092,"2026-04-10T11:13:16",[26,43,44,45,14,46,15,13,47],"数据工具","视频","插件","其他","音频",{"id":49,"name":50,"github_repo":51,"description_zh":52,"stars":53,"difficulty_score":54,"last_commit_at":55,"category_tags":56,"status":16},5784,"funNLP","fighting41love\u002FfunNLP","funNLP 是一个专为中文自然语言处理（NLP）打造的超级资源库，被誉为\"NLP 民工的乐园”。它并非单一的软件工具，而是一个汇集了海量开源项目、数据集、预训练模型和实用代码的综合性平台。\n\n面对中文 NLP 领域资源分散、入门门槛高以及特定场景数据匮乏的痛点，funNLP 提供了“一站式”解决方案。这里不仅涵盖了分词、命名实体识别、情感分析、文本摘要等基础任务的标准工具，还独特地收录了丰富的垂直领域资源，如法律、医疗、金融行业的专用词库与数据集，甚至包含古诗词生成、歌词创作等趣味应用。其核心亮点在于极高的全面性与实用性，从基础的字典词典到前沿的 BERT、GPT-2 模型代码，再到高质量的标注数据和竞赛方案，应有尽有。\n\n无论是刚刚踏入 NLP 领域的学生、需要快速验证想法的算法工程师，还是从事人工智能研究的学者，都能在这里找到急需的“武器弹药”。对于开发者而言，它能大幅减少寻找数据和复现模型的时间；对于研究者，它提供了丰富的基准测试资源和前沿技术参考。funNLP 以开放共享的精神，极大地降低了中文自然语言处理的开发与研究成本，是中文 AI 
社区不可或缺的宝藏仓库。",79857,1,"2026-04-08T20:11:31",[15,43,46],{"id":58,"name":59,"github_repo":60,"description_zh":61,"stars":62,"difficulty_score":54,"last_commit_at":63,"category_tags":64,"status":16},5773,"cs-video-courses","Developer-Y\u002Fcs-video-courses","cs-video-courses 是一个精心整理的计算机科学视频课程清单，旨在为自学者提供系统化的学习路径。它汇集了全球知名高校（如加州大学伯克利分校、新南威尔士大学等）的完整课程录像，涵盖从编程基础、数据结构与算法，到操作系统、分布式系统、数据库等核心领域，并深入延伸至人工智能、机器学习、量子计算及区块链等前沿方向。\n\n面对网络上零散且质量参差不齐的教学资源，cs-video-courses 解决了学习者难以找到成体系、高难度大学级别课程的痛点。该项目严格筛选内容，仅收录真正的大学层级课程，排除了碎片化的简短教程或商业广告，确保用户能接触到严谨的学术内容。\n\n这份清单特别适合希望夯实计算机基础的开发者、需要补充特定领域知识的研究人员，以及渴望像在校生一样系统学习计算机科学的自学者。其独特的技术亮点在于分类极其详尽，不仅包含传统的软件工程与网络安全，还细分了生成式 AI、大语言模型、计算生物学等新兴学科，并直接链接至官方视频播放列表，让用户能一站式获取高质量的教育资源，免费享受世界顶尖大学的课堂体验。",79792,"2026-04-08T22:03:59",[46,26,43,13],{"id":66,"github_repo":67,"name":68,"description_en":69,"description_zh":70,"ai_summary_zh":70,"readme_en":71,"readme_zh":72,"quickstart_zh":73,"use_case_zh":74,"hero_image_url":75,"owner_login":76,"owner_name":77,"owner_avatar_url":78,"owner_bio":79,"owner_company":80,"owner_location":80,"owner_email":81,"owner_twitter":80,"owner_website":82,"owner_url":83,"languages":84,"stars":105,"forks":106,"last_commit_at":107,"license":108,"difficulty_score":10,"env_os":109,"env_gpu":110,"env_ram":109,"env_deps":111,"category_tags":116,"github_topics":117,"view_count":10,"oss_zip_url":80,"oss_zip_packed_at":80,"status":16,"created_at":124,"updated_at":125,"faqs":126,"releases":155},7339,"souzatharsis\u002Fpodcastfy","podcastfy","An Open Source Python alternative to NotebookLM's podcast feature: Transforming Multimodal Content into Captivating Multilingual Audio Conversations with GenAI","Podcastfy 是一款开源的 Python 工具，旨在将多模态内容（如文本、图片、PDF、网站链接及 YouTube 视频）转化为生动有趣的多语言音频对话。它被视为 Google NotebookLM 播客功能的开源替代方案，核心解决了用户希望将复杂资料轻松转化为可听内容，同时需要更高定制自由度和程序化控制的需求。\n\n与主要面向研究合成且封闭的商业工具不同，Podcastfy 专注于通过代码实现个性化和规模化生成。用户不仅可以输入特定主题让 AI 自动构思对话，还能深度调整语音风格、语言种类及对话逻辑，非常适合开发者、研究人员以及希望构建自动化内容工作流的技术爱好者。当然，其提供的 Web 
应用和命令行界面也让非技术背景的用户能够便捷体验。\n\n技术亮点方面，Podcastfy 利用生成式 AI 深入理解图文信息，模拟自然的双人交谈场景，支持多种语言输出，并提供了从 Python 包、CLI 到 Docker 部署的完整生态。无论是想为艺术画作生成解说播客，还是将长篇论文转化为听力素材，Podcastfy 都能以开放、灵活的方式帮助用户高效完成创作。","\u003Cdiv align=\"center\">\n\u003Ca name=\"readme-top\">\u003C\u002Fa>\n\n\u003Ca href=\"https:\u002F\u002Ftrendshift.io\u002Frepositories\u002F12965\" target=\"_blank\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fsouzatharsis_podcastfy_readme_4a68feb902da.png\" alt=\"Podcastfy.ai | Trendshift\" style=\"width: 250px; height: 55px;\" width=\"250\" height=\"55\"\u002F>\u003C\u002Fa>\n\n# Podcastfy.ai 🎙️🤖\nAn Open Source API alternative to NotebookLM's podcast feature: Transforming Multimodal Content into Captivating Multilingual Audio Conversations with GenAI\n\n\n\nhttps:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F5d42c106-aabe-44c1-8498-e9c53545ba40\n\n\n\n[Paper](https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fblob\u002Fmain\u002Fpaper\u002Fpaper.pdf) |\n[Python Package](https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fblob\u002F59563ee105a0d1dbb46744e0ff084471670dd725\u002Fpodcastfy.ipynb) |\n[CLI](https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fblob\u002F59563ee105a0d1dbb46744e0ff084471670dd725\u002Fusage\u002Fcli.md) |\n[Web App](https:\u002F\u002Fopenpod.fly.dev\u002F) |\n[Feedback](https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fissues)\n\n[![Open In Colab](https:\u002F\u002Fcolab.research.google.com\u002Fassets\u002Fcolab-badge.svg)](https:\u002F\u002Fcolab.research.google.com\u002Fgithub\u002Fsouzatharsis\u002Fpodcastfy\u002Fblob\u002Fmain\u002Fpodcastfy.ipynb)\n[![PyPi Status](https:\u002F\u002Fimg.shields.io\u002Fpypi\u002Fv\u002Fpodcastfy)](https:\u002F\u002Fpypi.org\u002Fproject\u002Fpodcastfy\u002F)\n![PyPI 
Downloads](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fsouzatharsis_podcastfy_readme_6deaada48438.png)\n[![Issues](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fissues-raw\u002Fsouzatharsis\u002Fpodcastfy)](https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fissues)\n[![Pytest](https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Factions\u002Fworkflows\u002Fpython-app.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Factions\u002Fworkflows\u002Fpython-app.yml)\n[![Docker](https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Factions\u002Fworkflows\u002Fdocker-publish.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Factions\u002Fworkflows\u002Fdocker-publish.yml)\n[![Documentation Status](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fsouzatharsis_podcastfy_readme_13d664e1afd7.png)](https:\u002F\u002Fpodcastfy.readthedocs.io\u002Fen\u002Flatest\u002F?badge=latest)\n[![License](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLicense-Apache_2.0-blue.svg)](https:\u002F\u002Fopensource.org\u002Flicenses\u002FApache-2.0)\n![GitHub Repo stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fsouzatharsis\u002Fpodcastfy)\n\u003C\u002Fdiv>\n\nPodcastfy is an open-source Python package that transforms multi-modal content (text, images) into engaging, multi-lingual audio conversations using GenAI. Input content includes websites, PDFs, images, YouTube videos, as well as user provided topics.\n\nUnlike closed-source UI-based tools focused primarily on research synthesis (e.g. 
NotebookLM ❤️), Podcastfy focuses on open source, programmatic and bespoke generation of engaging, conversational content from a multitude of multi-modal sources, enabling customization and scale.\n\n## Testimonials 💬\n\n> \"Love that you casually built an open source version of the most popular product Google built in the last decade\"\n\n> \"Loving this initiative and the best I have seen so far especially for a 'non-techie' user.\"\n\n> \"Your library was very straightforward to work with. You did Amazing work brother 🙏\"\n\n> \"I think it's awesome that you were inspired\u002Frecognize how hard it is to beat NotebookLM's quality, but you did an *incredible* job with this! It sounds incredible, and it's open-source! Thank you for being amazing!\"\n\n[![Star History Chart](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fsouzatharsis_podcastfy_readme_46786cc2c5b2.png)](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fsouzatharsis_podcastfy_readme_46786cc2c5b2.png)\n\n## Audio Examples 🔊\nThis sample collection was generated using this [Python Notebook](usage\u002Fexamples.ipynb).\n\n### Images\nSample 1: Senecio, 1922 (Paul Klee) and Connection of Civilizations (2017) by Gheorghe Virtosu\n***\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fsouzatharsis_podcastfy_readme_765886dce294.jpeg\" alt=\"Senecio, 1922 (Paul Klee)\" width=\"20%\" height=\"auto\"> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fsouzatharsis_podcastfy_readme_b9e59dfb7072.jpg\" alt=\"Connection of Civilizations (2017) by Gheorghe Virtosu \" width=\"21.5%\" height=\"auto\">\n\u003Cvideo src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002Fa4134a0d-138c-4ab4-bc70-0f53b3507e6b\">\u003C\u002Fvideo>  \n***\nSample 2: The Great Wave off Kanagawa, 1831 (Hokusai) and Takiyasha the Witch and the Skeleton Spectre, c. 
1844 (Kuniyoshi)\n***\n \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fsouzatharsis_podcastfy_readme_580c8798248f.jpg\" alt=\"The Great Wave off Kanagawa, 1831 (Hokusai)\" width=\"20%\" height=\"auto\"> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fsouzatharsis_podcastfy_readme_0ab2ae661ddc.jpg\" alt=\"Takiyasha the Witch and the Skeleton Spectre, c. 1844 (Kuniyoshi)\" width=\"21.5%\" height=\"auto\"> \n\u003Cvideo src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002Ff6aaaeeb-39d2-4dde-afaf-e2cd212e9fed\">\u003C\u002Fvideo>  \n***\nSample 3: Pop culture icon Taylor Swift and Mona Lisa, 1503 (Leonardo da Vinci)\n***\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fsouzatharsis_podcastfy_readme_d2ff05151d1e.png\" alt=\"Taylor Swift\" width=\"28%\" height=\"auto\"> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fsouzatharsis_podcastfy_readme_b92c1b0eda39.jpeg\" alt=\"Mona Lisa\" width=\"10.5%\" height=\"auto\">\n\u003Cvideo src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F3b6f7075-159b-4540-946f-3f3907dffbca\">\u003C\u002Fvideo> \n\n\n### Text\n| Audio | Description  | Source |\n|-------|--|--------|\n| \u003Cvideo src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002Fef41a207-a204-4b60-a11e-06d66a0fbf06\">\u003C\u002Fvideo>  | Personal Website | [Website](https:\u002F\u002Fwww.souzatharsis.com) |\n| [Audio](https:\u002F\u002Fsoundcloud.com\u002Fhigh-lander123\u002Famodei?in=high-lander123\u002Fsets\u002Fpodcastfy-sample-audio-longform&si=b8dfaf4e3ddc4651835e277500384156) (`longform=True`) | Lex Fridman Podcast: 5h interview with Dario Amodei Anthropic's CEO |  [Youtube](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=ugvHCXCOmm4) |\n| 
[Audio](https:\u002F\u002Fsoundcloud.com\u002Fhigh-lander123\u002Fbenjamin?in=high-lander123\u002Fsets\u002Fpodcastfy-sample-audio-longform&si=dca7e2eec1c94252be18b8794499959a&utm_source=clipboard&utm_medium=text&utm_campaign=social_sharing) (`longform=True`)| Benjamin Franklin's Autobiography | [Book](https:\u002F\u002Fwww.gutenberg.org\u002Fcache\u002Fepub\u002F148\u002Fpg148.txt) |\n\n### Multi-Lingual Text\n| Language | Content Type | Description | Audio | Source |\n|----------|--------------|-------------|-------|--------|\n| French | Website | Agroclimate research information | [Audio](https:\u002F\u002Faudio.com\u002Fthatupiso\u002Faudio\u002Fpodcast-fr-agro) | [Website](https:\u002F\u002Fagroclim.inrae.fr\u002F) |\n| Portuguese-BR | News Article | Election polls in São Paulo | [Audio](https:\u002F\u002Faudio.com\u002Fthatupiso\u002Faudio\u002Fpodcast-thatupiso-br) | [Website](https:\u002F\u002Fnoticias.uol.com.br\u002Feleicoes\u002F2024\u002F10\u002F03\u002Fnova-pesquisa-datafolha-quem-subiu-e-quem-caiu-na-disputa-de-sp-03-10.htm) |\n\n\n## Quickstart 💻\n\n### Prerequisites\n- Python 3.11 or higher\n- `$ pip install ffmpeg` (for audio processing)\n\n### Setup\n1. Install from PyPI\n  `$ pip install podcastfy`\n\n2. 
Set up your [API keys](usage\u002Fconfig.md)\n\n### Python\n```python\nfrom podcastfy.client import generate_podcast\n\naudio_file = generate_podcast(urls=[\"\u003Curl1>\", \"\u003Curl2>\"])\n```\n### CLI\n```\npython -m podcastfy.client --url \u003Curl1> --url \u003Curl2>\n```\n\n### Fastapi (Beta for urls)\n```\nContainerize podcastify and launch the api\nDockerfile_api\n\nMake requests to the api look at the notebook for a clear example\nfetch_audio(request_data, ENDPOINT, BASE_URL)\n```\n  \n## Usage 💻\n\n- [Python Package Quickstart](podcastfy.ipynb)\n\n- [How to](usage\u002Fhow-to.md)\n\n- [Python Package Reference Manual](https:\u002F\u002Fpodcastfy.readthedocs.io\u002Fen\u002Flatest\u002Fpodcastfy.html)\n\n- [CLI](usage\u002Fcli.md)\n\n## Customization 🔧\n\nPodcastfy offers a range of customization options to tailor your AI-generated podcasts:\n- Customize podcast [conversation](usage\u002Fconversation_custom.md) (e.g. format, style, voices)\n- Choose to run [Local LLMs](usage\u002Flocal_llm.md) (156+ HuggingFace models)\n- Set other [Configuration Settings](usage\u002Fconfig.md)\n\n## Features ✨\n\n- Generate conversational content from multiple sources and formats (images, text, websites, YouTube, and PDFs).\n- Generate shorts (2-5 minutes) or longform (30+ minutes) podcasts.\n- Customize transcript and audio generation (e.g., style, language, structure).\n- Generate transcripts using 100+ LLM models (OpenAI, Anthropic, Google etc).\n- Leverage local LLMs for transcript generation for increased privacy and control.\n- Integrate with advanced text-to-speech models (OpenAI, Google, ElevenLabs, and Microsoft Edge).\n- Provide multi-language support for global content creation.\n- Integrate seamlessly with CLI and Python packages for automated workflows.\n\n## Built with Podcastfy 🚀\n\n- [OpenNotebook](https:\u002F\u002Fwww.open-notebook.ai\u002F)\n- [SurfSense](https:\u002F\u002Fwww.surfsense.net\u002F)\n- 
[OpenPod](https:\u002F\u002Fopenpod.fly.dev\u002F)\n- [Podcast-llm](https:\u002F\u002Fgithub.com\u002Fevandempsey\u002Fpodcast-llm)\n- [Podcastfy-HuggingFace App](https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fthatupiso\u002FPodcastfy.ai_demo)\n\n\n## Updates 🚀🚀\n\n### v0.4.0+ release\n- Leverage natural conversational multi-Speaker TTS model\n- Generate short or longform podcasts\n- Generate podcasts from input topic using grounded real-time web search\n- Integrate with 100+ LLM models (OpenAI, Anthropic, Google etc) for transcript generation\n\nSee [CHANGELOG](CHANGELOG.md) for more details.\n\n\n## License\n\nThis software is licensed under [Apache 2.0](LICENSE). See [instructions](usage\u002Flicense-guide.md) if you would like to use podcastfy in your software.\n\n## Contributing 🤝\n\nWe welcome contributions! See [Guidelines](GUIDELINES.md) for more details.\n\n## Example Use Cases 🎧🎶\n\n- **Content Creators** can use `Podcastfy` to convert blog posts, articles, or multimedia content into podcast-style audio, enabling them to reach broader audiences. By transforming content into an audio format, creators can cater to users who prefer listening over reading.\n\n- **Educators** can transform lecture notes, presentations, and visual materials into audio conversations, making educational content more accessible to students with different learning preferences. This is particularly beneficial for students with visual impairments or those who have difficulty processing written information.\n\n- **Researchers** can convert research papers, visual data, and technical content into conversational audio. This makes it easier for a wider audience, including those with disabilities, to consume and understand complex scientific information. 
Researchers can also create audio summaries of their work to enhance accessibility.\n\n- **Accessibility Advocates** can use `Podcastfy` to promote digital accessibility by providing a tool that converts multimodal content into auditory formats. This helps individuals with visual impairments, dyslexia, or other disabilities that make it challenging to consume written or visual content.\n  \n## Contributors\n\n\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fgraphs\u002Fcontributors\">\n  \u003Cimg alt=\"contributors\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fsouzatharsis_podcastfy_readme_0f960d27696e.png\"\u002F>\n\u003C\u002Fa>\n\n\u003Cp align=\"right\" style=\"font-size: 14px; color: #555; margin-top: 20px;\">\n    \u003Ca href=\"#readme-top\" style=\"text-decoration: none; color: #007bff; font-weight: bold;\">\n        ↑ Back to Top ↑\n    \u003C\u002Fa>\n\u003C\u002Fp>\n","\u003Cdiv align=\"center\">\n\u003Ca name=\"readme-top\">\u003C\u002Fa>\n\n\u003Ca href=\"https:\u002F\u002Ftrendshift.io\u002Frepositories\u002F12965\" target=\"_blank\">\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fsouzatharsis_podcastfy_readme_4a68feb902da.png\" alt=\"Podcastfy.ai | Trendshift\" style=\"width: 250px; height: 55px;\" width=\"250\" height=\"55\"\u002F>\u003C\u002Fa>\n\n# Podcastfy.ai 🎙️🤖\n一个开源的API，可替代NotebookLM的播客功能：利用生成式AI将多模态内容转化为引人入胜的多语言音频对话\n\n\n\nhttps:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F5d42c106-aabe-44c1-8498-e9c53545ba40\n\n\n\n[论文](https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fblob\u002Fmain\u002Fpaper\u002Fpaper.pdf) |\n[Python包](https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fblob\u002F59563ee105a0d1dbb46744e0ff084471670dd725\u002Fpodcastfy.ipynb) |\n[命令行工具](https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fblob\u002F59563ee105a0d1dbb46744e0ff084471670dd725\u002Fusage\u002Fcli.md) 
|\n[Web应用](https:\u002F\u002Fopenpod.fly.dev\u002F) |\n[反馈](https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fissues)\n\n[![在Colab中打开](https:\u002F\u002Fcolab.research.google.com\u002Fassets\u002Fcolab-badge.svg)](https:\u002F\u002Fcolab.research.google.com\u002Fgithub\u002Fsouzatharsis\u002Fpodcastfy\u002Fblob\u002Fmain\u002Fpodcastfy.ipynb)\n[![PyPI状态](https:\u002F\u002Fimg.shields.io\u002Fpypi\u002Fv\u002Fpodcastfy)](https:\u002F\u002Fpypi.org\u002Fproject\u002Fpodcastfy\u002F)\n![PyPI下载量](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fsouzatharsis_podcastfy_readme_6deaada48438.png)\n[![问题数](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fissues-raw\u002Fsouzatharsis\u002Fpodcastfy)](https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fissues)\n[![Pytest](https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Factions\u002Fworkflows\u002Fpython-app.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Factions\u002Fworkflows\u002Fpython-app.yml)\n[![Docker](https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Factions\u002Fworkflows\u002Fdocker-publish.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Factions\u002Fworkflows\u002Fdocker-publish.yml)\n[![文档状态](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fsouzatharsis_podcastfy_readme_13d664e1afd7.png)](https:\u002F\u002Fpodcastfy.readthedocs.io\u002Fen\u002Flatest\u002F?badge=latest)\n[![许可证](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLicense-Apache_2.0-blue.svg)](https:\u002F\u002Fopensource.org\u002Flicenses\u002FApache-2.0)\n![GitHub仓库星标数](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fsouzatharsis\u002Fpodcastfy)\n\nPodcastfy是一个开源的Python库，它利用生成式AI将多模态内容（文本、图片）转化为引人入胜的多语言音频对话。输入内容包括网站、PDF文件、图片、YouTube视频以及用户提供的主题。\n\n与主要专注于研究综述的闭源UI工具不同（例如NotebookLM ❤️），Podcastfy专注于从多种多模态来源以开源、程序化和定制化的方式生成引人入胜的对话内容，从而实现高度的灵活性和规模化生产。\n\n## 用户评价 💬\n\n> 
“太喜欢了！你竟然随手就做了一个开源版本，而这个产品可是谷歌过去十年中最受欢迎的产品之一。”\n\n> “非常喜欢这项计划，对于非技术背景的用户来说，这绝对是目前为止最好的选择。”\n\n> “你的库非常容易上手。兄弟，你做得太棒了 🙏”\n\n> “我觉得你受到启发并意识到要超越NotebookLM的质量有多么困难，但你在这个项目上做得实在太出色了！声音效果惊人，而且还是开源的！谢谢你这么厉害！”\n\n[![星标历史图](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fsouzatharsis_podcastfy_readme_46786cc2c5b2.png)](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fsouzatharsis_podcastfy_readme_46786cc2c5b2.png)\n\n## 音频示例 🔊\n本示例集是使用此[Python笔记本](usage\u002Fexamples.ipynb)生成的。\n\n### 图片\n示例1：《Senecio》，1922年（保罗·克利）和《文明的连接》，2017年（格奥尔基·维尔托苏）\n***\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fsouzatharsis_podcastfy_readme_765886dce294.jpeg\" alt=\"Senecio, 1922 (Paul Klee)\" width=\"20%\" height=\"auto\"> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fsouzatharsis_podcastfy_readme_b9e59dfb7072.jpg\" alt=\"Connection of Civilizations (2017) by Gheorghe Virtosu \" width=\"21.5%\" height=\"auto\">\n\u003Cvideo src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002Fa4134a0d-138c-4ab4-bc70-0f53b3507e6b\">\u003C\u002Fvideo>  \n***\n示例2：《神奈川冲浪里》，1831年（葛饰北斋）和《女巫泷屋与骷髅幽灵》，约1844年（歌川国芳）\n***\n \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fsouzatharsis_podcastfy_readme_580c8798248f.jpg\" alt=\"The Great Wave off Kanagawa, 1831 (Hokusai)\" width=\"20%\" height=\"auto\"> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fsouzatharsis_podcastfy_readme_0ab2ae661ddc.jpg\" alt=\"Takiyasha the Witch and the Skeleton Spectre, c. 
1844 (Kuniyoshi)\" width=\"21.5%\" height=\"auto\"> \n\u003Cvideo src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002Ff6aaaeeb-39d2-4dde-afaf-e2cd212e9fed\">\u003C\u002Fvideo>  \n***\n示例3：流行文化偶像泰勒·斯威夫特和《蒙娜丽莎》，1503年（列奥纳多·达·芬奇）\n***\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fsouzatharsis_podcastfy_readme_d2ff05151d1e.png\" alt=\"Taylor Swift\" width=\"28%\" height=\"auto\"> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fsouzatharsis_podcastfy_readme_b92c1b0eda39.jpeg\" alt=\"Mona Lisa\" width=\"10.5%\" height=\"auto\">\n\u003Cvideo src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F3b6f7075-159b-4540-946f-3f3907dffbca\">\u003C\u002Fvideo> \n\n\n### 文本\n| 音频 | 描述  | 来源 |\n|-------|--|--------|\n| \u003Cvideo src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002Fef41a207-a204-4b60-a11e-06d66a0fbf06\">\u003C\u002Fvideo>  | 个人网站 | [网站](https:\u002F\u002Fwww.souzatharsis.com) |\n| [音频](https:\u002F\u002Fsoundcloud.com\u002Fhigh-lander123\u002Famodei?in=high-lander123\u002Fsets\u002Fpodcastfy-sample-audio-longform&si=b8dfaf4e3ddc4651835e277500384156) (`longform=True`) | Lex Fridman播客：与Anthropic公司CEO达里奥·阿莫迪的5小时访谈 |  [YouTube](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=ugvHCXCOmm4) |\n| [音频](https:\u002F\u002Fsoundcloud.com\u002Fhigh-lander123\u002Fbenjamin?in=high-lander123\u002Fsets\u002Fpodcastfy-sample-audio-longform&si=dca7e2eec1c94252be18b8794499959a&utm_source=clipboard&utm_medium=text&utm_campaign=social_sharing) (`longform=True`)| 本杰明·富兰克林自传 | [书籍](https:\u002F\u002Fwww.gutenberg.org\u002Fcache\u002Fepub\u002F148\u002Fpg148.txt) |\n\n### 多语言文本\n| 语言 | 内容类型 | 描述 | 音频 | 来源 |\n|----------|--------------|-------------|-------|--------|\n| 法语 | 网站 | 农业气候研究信息 | [音频](https:\u002F\u002Faudio.com\u002Fthatupiso\u002Faudio\u002Fpodcast-fr-agro) | [网站](https:\u002F\u002Fagroclim.inrae.fr\u002F) |\n| 葡萄牙语-巴西 | 新闻文章 | 圣保罗市选举民调 | 
[音频](https:\u002F\u002Faudio.com\u002Fthatupiso\u002Faudio\u002Fpodcast-thatupiso-br) | [网站](https:\u002F\u002Fnoticias.uol.com.br\u002Feleicoes\u002F2024\u002F10\u002F03\u002Fnova-pesquisa-datafolha-quem-subiu-e-quem-caiu-na-disputa-de-sp-03-10.htm) |\n\n\n## 快速入门 💻\n\n### 前置条件\n- Python 3.11或更高版本\n- `$ pip install ffmpeg`（用于音频处理）\n\n### 安装\n1. 从PyPI安装\n  `$ pip install podcastfy`\n\n2. 设置您的[API密钥](usage\u002Fconfig.md)\n\n### Python\n```python\nfrom podcastfy.client import generate_podcast\n\naudio_file = generate_podcast(urls=[\"\u003Curl1>\", \"\u003Curl2>\"])\n```\n### 命令行\n```\npython -m podcastfy.client --url \u003Curl1> --url \u003Curl2>\n```\n\n### Fastapi（针对URL的测试版）\n```\n将Podcastfy容器化并启动API\nDockerfile_api\n\n向API发送请求，参考笔记本中的示例获取清晰的操作指南\nfetch_audio(request_data, ENDPOINT, BASE_URL)\n```\n\n## 使用方法 💻\n\n- [Python 包快速入门](podcastfy.ipynb)\n\n- [操作指南](usage\u002Fhow-to.md)\n\n- [Python 包参考手册](https:\u002F\u002Fpodcastfy.readthedocs.io\u002Fen\u002Flatest\u002Fpodcastfy.html)\n\n- [命令行界面](usage\u002Fcli.md)\n\n## 自定义 🔧\n\nPodcastfy 提供了丰富的自定义选项，帮助您打造个性化的 AI 生成播客：\n- 自定义播客的[对话](usage\u002Fconversation_custom.md)（例如格式、风格、语音等）\n- 选择运行[本地大模型](usage\u002Flocal_llm.md)（支持 156+ 种 HuggingFace 模型）\n- 设置其他[配置选项](usage\u002Fconfig.md)\n\n## 功能 ✨\n\n- 支持从多种来源和格式生成对话式内容，包括图片、文本、网站、YouTube 和 PDF。\n- 可生成短视频（2–5 分钟）或长视频（30 分钟以上）播客。\n- 支持自定义字幕和音频生成（如风格、语言、结构等）。\n- 使用 100 多种大模型生成字幕（OpenAI、Anthropic、Google 等）。\n- 利用本地大模型生成字幕，以提升隐私性和可控性。\n- 集成先进的文本转语音模型（OpenAI、Google、ElevenLabs 和 Microsoft Edge）。\n- 提供多语言支持，助力全球内容创作。\n- 与命令行工具和 Python 包无缝集成，实现自动化工作流。\n\n## 使用 Podcastfy 打造的应用 🚀\n\n- [OpenNotebook](https:\u002F\u002Fwww.open-notebook.ai\u002F)\n- [SurfSense](https:\u002F\u002Fwww.surfsense.net\u002F)\n- [OpenPod](https:\u002F\u002Fopenpod.fly.dev\u002F)\n- [Podcast-llm](https:\u002F\u002Fgithub.com\u002Fevandempsey\u002Fpodcast-llm)\n- [Podcastfy-HuggingFace 应用](https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fthatupiso\u002FPodcastfy.ai_demo)\n\n\n## 更新 🚀🚀\n\n### v0.4.0+ 版本\n- 
引入自然流畅的多角色 TTS 模型\n- 支持生成短篇或长篇播客\n- 基于输入主题，结合实时网络搜索生成播客内容\n- 集成 100 多种大模型（OpenAI、Anthropic、Google 等）用于字幕生成\n\n更多详情请参阅 [CHANGELOG](CHANGELOG.md)。\n\n\n## 许可证\n\n本软件采用 [Apache 2.0](LICENSE) 许可证授权。如果您希望在自己的软件中使用 Podcastfy，请参阅 [使用指南](usage\u002Flicense-guide.md)。\n\n## 贡献 🤝\n\n我们欢迎各类贡献！更多详情请参阅 [贡献指南](GUIDELINES.md)。\n\n## 典型应用场景 🎧🎶\n\n- **内容创作者**可以使用 `Podcastfy` 将博客文章、新闻稿或多媒体内容转换为播客形式的音频，从而触达更广泛的受众。通过将内容转化为音频形式，创作者能够满足那些更倾向于听觉而非阅读的用户需求。\n  \n- **教育工作者**可以将讲义、演示文稿和视觉材料转化为音频对话，使教学内容更容易被不同学习方式的学生所接受。这对于视力障碍学生或难以处理书面信息的学生尤为有益。\n  \n- **研究人员**可以将研究论文、可视化数据和技术性内容转换为对话式音频，以便更广泛的受众（包括残障人士）理解和吸收复杂的科学信息。研究人员还可以为自己的研究成果制作音频摘要，以提升内容的可访问性。\n  \n- **无障碍倡导者**可以利用 `Podcastfy` 推动数字无障碍建设，提供一款将多模态内容转换为听觉形式的工具，帮助视力障碍、诵读困难或其他难以处理文字或视觉内容的人群更好地获取信息。\n  \n## 贡献者\n\n\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fgraphs\u002Fcontributors\">\n  \u003Cimg alt=\"contributors\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fsouzatharsis_podcastfy_readme_0f960d27696e.png\"\u002F>\n\u003C\u002Fa>\n\n\u003Cp align=\"right\" style=\"font-size: 14px; color: #555; margin-top: 20px;\">\n    \u003Ca href=\"#readme-top\" style=\"text-decoration: none; color: #007bff; font-weight: bold;\">\n        ↑ 返回顶部 ↑\n    \u003C\u002Fa>\n\u003C\u002Fp>","# Podcastfy 快速上手指南\n\nPodcastfy 是一个开源 Python 工具，可利用生成式 AI 将多模态内容（文本、图片、网站、YouTube 视频、PDF 等）转换为引人入胜的多语言音频对话。它是 NotebookLM 播客功能的开源替代方案。\n\n## 环境准备\n\n在开始之前，请确保您的开发环境满足以下要求：\n\n*   **操作系统**：Linux, macOS 或 Windows\n*   **Python 版本**：Python 3.11 或更高版本\n*   **前置依赖**：需要安装 `ffmpeg` 用于音频处理。\n\n### 安装 ffmpeg\n\n*   **Ubuntu\u002FDebian**:\n    ```bash\n    sudo apt update && sudo apt install ffmpeg\n    ```\n*   **macOS** (使用 Homebrew):\n    ```bash\n    brew install ffmpeg\n    ```\n*   **Windows**:\n    请下载并安装 ffmpeg 构建版，或将 ffmpeg 添加到系统环境变量 PATH 中。\n\n## 安装步骤\n\n推荐使用 pip 进行安装。国内用户可使用清华或阿里镜像源加速下载。\n\n### 标准安装\n```bash\npip install podcastfy\n```\n\n### 使用国内镜像源安装（推荐）\n```bash\npip install podcastfy -i 
https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple\n```\n\n### 配置 API Keys\n使用前需配置大模型（LLM）和语音合成（TTS）的 API Key。\n您可以创建配置文件或设置环境变量。详细配置方法请参考官方文档 [usage\u002Fconfig.md](https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fblob\u002Fmain\u002Fusage\u002Fconfig.md)。\n\n通常需要在项目根目录创建 `.env` 文件或直接在代码中传入密钥。\n\n## 基本使用\n\nPodcastfy 支持通过 Python 代码或命令行（CLI）快速生成播客。\n\n### 方式一：Python 脚本\n\n创建一个 Python 文件（例如 `main.py`），输入以下内容：\n\n```python\nfrom podcastfy.client import generate_podcast\n\n# 替换为您想要转换的网址列表（支持网站、YouTube 链接、PDF 链接等）\nurls = [\"https:\u002F\u002Fwww.example.com\", \"https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=example\"]\n\n# 生成播客音频\naudio_file = generate_podcast(urls=urls)\n\nprint(f\"播客已生成并保存至：{audio_file}\")\n```\n\n运行脚本：\n```bash\npython main.py\n```\n\n### 方式二：命令行 (CLI)\n\n直接在终端中使用以下命令，无需编写代码：\n\n```bash\npython -m podcastfy.client --url \u003Curl1> --url \u003Curl2>\n```\n\n**示例：**\n```bash\npython -m podcastfy.client --url https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy --url https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=ugvHCXCOmm4\n```\n\n执行完成后，生成的音频文件将保存在当前目录下。\n\n---\n*更多高级用法（如自定义对话风格、选择本地大模型、多语言设置等）请参阅官方文档中的 [How-to Guide](https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fblob\u002Fmain\u002Fusage\u002Fhow-to.md)。*","一位独立教育博主希望将复杂的学术论文和博物馆艺术品图片转化为生动的多语言播客，以吸引全球听众。\n\n### 没有 podcastfy 时\n- **内容转化门槛高**：手动阅读长篇 PDF 论文或分析画作背景耗时数小时，难以快速提取核心观点并编写口语化脚本。\n- **多语言本地化困难**：若要覆盖非英语受众，需额外聘请翻译和配音员，成本高昂且沟通周期长，无法实现即时多语种发布。\n- **形式单一缺乏吸引力**：仅靠文字博客或静态图片难以在通勤等碎片化场景中留住用户，导致优质深度内容传播范围受限。\n- **定制化程度低**：依赖封闭平台的生成工具（如 NotebookLM）无法通过代码调整对话风格、语速或角色设定，难以打造独特的品牌声音。\n\n### 使用 podcastfy 后\n- **自动化内容重塑**：直接输入论文 URL 或艺术品图片，podcastfy 利用 GenAI 自动解析多模态内容，瞬间生成自然流畅的双人对话脚本。\n- **原生多语言支持**：一键配置目标语言，podcastfy 即可生成地道的法语、西班牙语等音频版本，无需额外翻译流程，轻松拓展全球市场。\n- **沉浸式听觉体验**：将枯燥的学术文本转化为引人入胜的音频故事，让听众在通勤或运动时也能轻松消化深度知识，显著提升用户粘性。\n- **高度可编程定制**：作为开源 Python 库，podcastfy 允许开发者通过代码精细控制主持人性格、对话节奏及音频参数，完美契合个人品牌调性。\n\npodcastfy 
通过将多模态信息转化为可定制的多语言音频对话，彻底打破了深度内容创作的语言与形式壁垒。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fsouzatharsis_podcastfy_0d905215.png","souzatharsis","Tharsis Souza","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Fsouzatharsis_b12a61fe.jpg","building data-driven stories",null,"souza.tharsis@gmail.com","www.souzatharsis.com","https:\u002F\u002Fgithub.com\u002Fsouzatharsis",[85,89,93,97,101],{"name":86,"color":87,"percentage":88},"Python","#3572A5",96.7,{"name":90,"color":91,"percentage":92},"TeX","#3D6117",2.6,{"name":94,"color":95,"percentage":96},"Dockerfile","#384d54",0.6,{"name":98,"color":99,"percentage":100},"Makefile","#427819",0.1,{"name":102,"color":103,"percentage":104},"Shell","#89e051",0,6196,717,"2026-04-13T15:30:14","Apache-2.0","未说明","非必需（支持本地运行 LLM，也支持调用 OpenAI、Anthropic 等云端 API）",{"notes":112,"python":113,"dependencies":114},"该工具主要依赖外部 API（如 OpenAI、Google、ElevenLabs 等）或本地部署的 LLM（支持 156+ HuggingFace 模型）。若选择本地运行大模型，需自行配置相应的 GPU 和显存环境；若使用云端 API，则对本地硬件无特殊要求。必须安装 ffmpeg 用于音频处理。支持通过 Docker 容器化部署。","3.11+",[115],"ffmpeg",[15,47,46],[118,119,120,121,122,123],"elevenlabs","gemini","genai","notebooklm","openai","podcast","2026-03-27T02:49:30.150509","2026-04-14T12:28:50.600861",[127,132,137,142,147,151],{"id":128,"question_zh":129,"answer_zh":130,"source_url":131},32964,"遇到 '404 models\u002Fgemini-1.5-pro-latest is not found' 错误怎么办？","该问题通常是由于模型版本更新或 API 变更导致的。维护者已对此进行了修复和更新。请确保将 podcastfy 库升级到最新版本，升级后该错误应不再出现。如果问题依旧，请检查是否使用了正确的 API 密钥配置。","https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fissues\u002F294",{"id":133,"question_zh":134,"answer_zh":135,"source_url":136},32965,"如何生成超过常规时长的长篇幅播客（Long-form Podcasts）？","从 v0.3.6 版本开始，支持生成长达 20-30 分钟以上的播客。使用方法如下：\n1. CLI 方式：添加 `--longform` 标志。\n2. 
Python API 方式：设置 `longform=True` 参数。\n该功能采用了“内容分块与上下文链接”技术以保证连贯性。你还可以通过对话配置中的 `max_num_chunks` 和 `min_chunk_size` 参数进行微调。注意：旧版的 `word_count` 参数已不再使用。","https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fissues\u002F48",{"id":138,"question_zh":139,"answer_zh":140,"source_url":141},32966,"GPT-4o、Claude Sonnet 和 Gemini 等不同模型在生成播客时的表现有何区别？","根据社区测试反馈：\n- **GPT-4o Preview**：生成的对话更长，对话数量是基础版的 2-3 倍。\n- **GPT-4o Base**：内容质量良好，性能与 Gemini 相似，通常是性价比最高的选择。\n- **Claude 3.5 Sonnet**：生成的内容明显较短，有时可能忽略指令（如缺少结论部分）。\n- **o1 Preview**：效果最佳，但成本较高（平均约 1 美元\u002F次）。\n建议优先尝试 GPT-4o 或 Gemini，若追求极致质量且预算充足可选择 o1 Preview。","https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fissues\u002F127",{"id":143,"question_zh":144,"answer_zh":145,"source_url":146},32967,"为什么配置了 Edge TTS 的特定语音（如挪威语），生成的音频仍然是默认的美式英语？","这是一个已知的高优先级 Bug。即使你在配置文件中正确设置了 `default_tts_model: \"edge\"` 和具体的 `default_voices`（例如 `nb-NO-FinnNeural`），系统可能仍会忽略这些设置并使用默认语音。\n临时解决方案：不要仅在配置文件中设置，而是在调用 `generate_podcast` 函数时，显式通过 `tts_model` 参数传递模型名称（例如 `tts_model='edge'`）。请务必将库更新至最新版本以获取修复。","https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fissues\u002F132",{"id":148,"question_zh":149,"answer_zh":150,"source_url":136},32968,"在使用 Gemini TTS 生成长音频时遇到 'input.text is longer than the limit of 5000 bytes' 错误如何解决？","Gemini TTS API 对单次请求的输入文本有 5000 字节的限制。解决此问题的最佳方案是使用 podcastfy 内置的长篇幅（longform）生成功能，它会自动将内容分块处理并合成，从而绕过单次请求的长度限制。请在生成时启用 `longform=True` 或使用 `--longform` 标志。虽然 Google 提供了 Long Audio API，但直接集成可能较复杂且音质可能受影响，推荐使用库自带的分块方案。",{"id":152,"question_zh":153,"answer_zh":154,"source_url":146},32969,"如何在配置中自定义 OpenAI 或 Edge 的默认语音角色？","你可以在对话配置（conversation_config）的 `text_to_speech` 部分指定默认语音。示例配置如下：\n```json\n\"text_to_speech\": {\n    \"default_tts_model\": \"openai\",\n    \"openai\": {\n        \"default_voices\": {\n            \"question\": \"alloy\",\n            \"answer\": \"echo\"\n        }\n    }\n}\n```\n如果是 Edge TTS，结构类似，将键改为 `edge_tts` 并填入对应的语音 ID（如 `nb-NO-FinnNeural`）。如果在 v0.2.17 
及更早版本中发现配置不生效，请尝试在代码调用时直接传入 `tts_model` 参数，或升级到最新修复版。",[156,161,166,171,176,181,186,191,196,201,206,211,216,221,226,231,236,241,246,251],{"id":157,"version":158,"summary_zh":159,"released_at":160},247648,"v0.4.0","## 变更内容\n* v0.4.0 - 由 @souzatharsis 在 https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fpull\u002F180 中添加了 Google 的 TTS 模型\n\n\n**完整变更日志**: https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fcompare\u002Fv0.3.6...v0.4.0","2024-11-16T19:07:47",{"id":162,"version":163,"summary_zh":164,"released_at":165},247649,"v0.3.6","## 变更内容\n* 功能\u002F长篇内容由 @souzatharsis 在 https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fpull\u002F175 中实现\n\n\n**完整变更日志**: https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fcompare\u002Fv0.3.2...v0.3.6","2024-11-13T23:33:34",{"id":167,"version":168,"summary_zh":169,"released_at":170},247650,"v0.3.2","## 变更内容\n* 添加按主题生成播客功能 #126，由 @souzatharsis 在 https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fpull\u002F158 中实现\n* 更新 tts-model 的默认值，由 @souzatharsis 在 https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fpull\u002F159 中完成\n\n\n**完整变更日志**: https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fcompare\u002Fv0.3.0...v0.3.2","2024-11-07T18:25:28",{"id":172,"version":173,"summary_zh":174,"released_at":175},247651,"v0.3.0","## 变更内容\n* #115 由 @souzatharsis 在 https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fpull\u002F153 中实现与 litellm 的集成\n* 文档更新由 @souzatharsis 在 https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fpull\u002F154 中完成\n\n\n**完整变更日志**: https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fcompare\u002Fv0.2.19...v0.3.0","2024-11-06T19:23:36",{"id":177,"version":178,"summary_zh":179,"released_at":180},247652,"v0.2.19","## 变更内容\n* 添加 Google TTS 多语音模型\n\n## 新贡献者\n* @imgbot 在 https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fpull\u002F136 中完成了首次贡献\n* @github-actions 在 
https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fpull\u002F142 中完成了首次贡献\n\n**完整变更日志**: https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fcompare\u002Fv0.2.17...v0.2.19","2024-11-06T14:21:06",{"id":182,"version":183,"summary_zh":184,"released_at":185},247653,"v0.2.17","**完整更新日志**: https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fcompare\u002Fv0.2.16...v0.2.17","2024-10-31T19:16:37",{"id":187,"version":188,"summary_zh":189,"released_at":190},247654,"v0.2.16","## 解决的问题\n* https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fissues\u002F134 从转录文本生成中移除 TTS 特定标记\n* https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fissues\u002F133 重构 text-to-speech.py\n* https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fissues\u002F132 修复了未设置默认 TTS 音色的 bug\n* https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fissues\u002F120 输出文件夹现在可配置\n* https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fissues\u002F114 让 pytest 并行运行\n\n## 新贡献者\n* @twlite 在 https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fpull\u002F123 中做出了首次贡献\n\n**完整变更日志**: https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fcompare\u002Fv0.2.15...v0.2.16","2024-10-31T16:37:11",{"id":192,"version":193,"summary_zh":194,"released_at":195},247655,"v0.2.15","**完整更新日志**: https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fcompare\u002Fv0.2.12...v0.2.15","2024-10-27T21:21:57",{"id":197,"version":198,"summary_zh":199,"released_at":200},247656,"v0.2.12","## 变更内容\n* 由 @ghimirebibek 在 https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fpull\u002F117 中更新了 README.md\n* 使输出目录可自定义，并支持并行生成 #120 #114，由 @souzatharsis 在 https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fpull\u002F121 中实现\n\n## 新贡献者\n* @ghimirebibek 在 https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fpull\u002F117 中完成了首次贡献\n\n**完整变更日志**: 
https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fcompare\u002Fv0.2.11...v0.2.12","2024-10-27T02:08:40",{"id":202,"version":203,"summary_zh":204,"released_at":205},247657,"v0.2.11","**完整更新日志**: https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fcompare\u002Fv0.2.10...v0.2.11","2024-10-26T02:03:03",{"id":207,"version":208,"summary_zh":209,"released_at":210},247658,"v0.2.10","**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fcompare\u002Fv0.2.9...v0.2.10","2024-10-25T23:49:35",{"id":212,"version":213,"summary_zh":214,"released_at":215},247659,"v0.2.9","- Versioned prompts\r\n- Suppress unnecessary langchain warning\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fcompare\u002Fv0.2.8...v0.2.9","2024-10-25T16:23:07",{"id":217,"version":218,"summary_zh":219,"released_at":220},247660,"v0.2.8","## What's Changed\r\n* Feat\u002Finput raw text by @souzatharsis in https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fpull\u002F108\r\n* feat: enable users to customize conversation with free text. 
Also fix… by @souzatharsis in https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fpull\u002F110\r\n\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fcompare\u002Fv0.2.7...v0.2.8","2024-10-25T04:42:02",{"id":222,"version":223,"summary_zh":224,"released_at":225},247661,"v0.2.7","## What's Changed\r\n* Hotfix local pdf by @brumar in https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fpull\u002F91\r\n* Update python-app.yml by @souzatharsis in https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fpull\u002F78\r\n* Add CLI tests #75 and fix #81 by @souzatharsis in https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fpull\u002F82\r\n* Update LICENSE to Apache 2.0 by @souzatharsis in https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fpull\u002F103\r\n* update tests by @souzatharsis in https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fpull\u002F104\r\n* Pr91 updates by @souzatharsis in https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fpull\u002F106\r\n\r\n\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fcompare\u002Fv0.2.6...v0.2.7","2024-10-24T20:18:32",{"id":227,"version":228,"summary_zh":229,"released_at":230},247662,"v0.2.6","## What's Changed\r\n* Create how-to by @souzatharsis in https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fpull\u002F77\r\n\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fcompare\u002Fv0.2.5...v0.2.6","2024-10-16T22:06:48",{"id":232,"version":233,"summary_zh":234,"released_at":235},247663,"v0.2.5","Fixed CLI.\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fcompare\u002Fv0.2.3...v0.2.5","2024-10-16T12:58:02",{"id":237,"version":238,"summary_zh":239,"released_at":240},247664,"v0.2.3","## What's Changed\r\n\r\n* Feat: add local llm option by 
@souzatharsis in https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fpull\u002F65\r\n* feat: add user-provided TTS config such as voices #10 #6 #27 by @souzatharsis \r\n* feat: add open in Colab and setting python version to 3.11 by @Devparihar5 in https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fpull\u002F57\r\n* feat: add edge tts support by @ChinoUkaegbu in https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fpull\u002F58\r\n* Update conversation_custom.md by @Yashbhatt786 in https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fpull\u002F67\r\n* feat: update pypdf with pymupdf (10x faster than pypdf) #56 check by @Devparihar5 in https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fpull\u002F66\r\n* feat: Replace r.jina.ai with simple BeautifulSoup #18 by @souzatharsis \r\n* bug: Fixed CLI for user-provided config #69 @souzatharsis \r\n* Feat: Enable running podcastfy with no API keys thanks to solving #18, #58, #65 by @souzatharsis and @ChinoUkaegbu \r\n\r\n## New Contributors\r\n* @Devparihar5 made their first contribution in https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fpull\u002F57\r\n* @ChinoUkaegbu made their first contribution in https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fpull\u002F58\r\n* @Yashbhatt786 made their first contribution in https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fpull\u002F67","2024-10-15T19:01:31",{"id":242,"version":243,"summary_zh":244,"released_at":245},247665,"v0.2.2","## [0.2.2] - 2024-10-13\r\n\r\n### Added\r\n- Added API reference docs and published it to https:\u002F\u002Fpodcastfy.readthedocs.io\u002Fen\u002Flatest\u002F\r\n\r\n### Fixed \r\n- ([#52](https:\u002F\u002Fgithub.com\u002Fuser\u002Fpodcastfy\u002Fissues\u002F37)) Fixed a simple bug introduced in 0.2.1 that broke the ability to generate podcasts from text inputs!\r\n- Fixed one example in the documentation that was not 
working.","2024-10-13T23:44:24",{"id":247,"version":248,"summary_zh":249,"released_at":250},247666,"v0.2.1","## [0.2.1] - 2024-10-12\r\n\r\n\r\n### Added\r\n- ([#8](https:\u002F\u002Fgithub.com\u002Fuser\u002Fpodcastfy\u002Fissues\u002F8)) Podcastfy is now multi-modal! Users can now generate audio from images by simply providing the paths to the image files.\r\n\r\n### Fixed \r\n- ([#40](https:\u002F\u002Fgithub.com\u002Fuser\u002Fpodcastfy\u002Fissues\u002F37)) Updated default ElevenLabs voice from `BrittneyHart` to `Jessica`. The former was a non-default voice I used from my account, which caused an error for users who don't have it.\r\n\r\n## What's Changed\r\n* wording styles->preferences by @brumar in https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fpull\u002F41\r\n\r\n## New Contributors\r\n* @brumar made their first contribution in https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fpull\u002F41\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fsouzatharsis\u002Fpodcastfy\u002Fcompare\u002Fv0.2.0...v0.2.1","2024-10-12T21:44:58",{"id":252,"version":253,"summary_zh":254,"released_at":255},247667,"v0.2.0","## [0.2.0] - 2024-10-10\r\n\r\n### Added\r\n- Parameterized podcast generation with Conversation Configuration ([#11](https:\u002F\u002Fgithub.com\u002Fuser\u002Fpodcastfy\u002Fissues\u002F11), [#3](https:\u002F\u002Fgithub.com\u002Fuser\u002Fpodcastfy\u002Fissues\u002F3), [#4](https:\u002F\u002Fgithub.com\u002Fuser\u002Fpodcastfy\u002Fissues\u002F4))\r\n  - Users can now customize podcast style, structure, and content\r\n  - See `conversation_custom.md` for detailed options\r\n  - Updated demo in `podcastfy.ipynb`\r\n- LangChain integration for improved LLM interface and observability ([#29](https:\u002F\u002Fgithub.com\u002Fuser\u002Fpodcastfy\u002Fissues\u002F29))\r\n- Changelog to track version updates ([#22](https:\u002F\u002Fgithub.com\u002Fuser\u002Fpodcastfy\u002Fissues\u002F22))\r\n- Tests for Customized 
conversation scenarios\r\n\r\n### Fixed\r\n- CLI now correctly reads from user-provided local .env file ([#37](https:\u002F\u002Fgithub.com\u002Fuser\u002Fpodcastfy\u002Fissues\u002F37))","2024-10-10T18:49:33"]