[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-obsei--obsei":3,"tool-obsei--obsei":64},[4,17,27,35,43,56],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":16},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,3,"2026-04-05T11:01:52",[13,14,15],"开发框架","图像","Agent","ready",{"id":18,"name":19,"github_repo":20,"description_zh":21,"stars":22,"difficulty_score":23,"last_commit_at":24,"category_tags":25,"status":16},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",138956,2,"2026-04-05T11:33:21",[13,15,26],"语言模型",{"id":28,"name":29,"github_repo":30,"description_zh":31,"stars":32,"difficulty_score":23,"last_commit_at":33,"category_tags":34,"status":16},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 
绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",107662,"2026-04-03T11:11:01",[13,14,15],{"id":36,"name":37,"github_repo":38,"description_zh":39,"stars":40,"difficulty_score":23,"last_commit_at":41,"category_tags":42,"status":16},3704,"NextChat","ChatGPTNextWeb\u002FNextChat","NextChat 是一款轻量且极速的 AI 助手，旨在为用户提供流畅、跨平台的大模型交互体验。它完美解决了用户在多设备间切换时难以保持对话连续性，以及面对众多 AI 模型不知如何统一管理的痛点。无论是日常办公、学习辅助还是创意激发，NextChat 都能让用户随时随地通过网页、iOS、Android、Windows、MacOS 或 Linux 端无缝接入智能服务。\n\n这款工具非常适合普通用户、学生、职场人士以及需要私有化部署的企业团队使用。对于开发者而言，它也提供了便捷的自托管方案，支持一键部署到 Vercel 或 Zeabur 等平台。\n\nNextChat 的核心亮点在于其广泛的模型兼容性，原生支持 Claude、DeepSeek、GPT-4 及 Gemini Pro 等主流大模型，让用户在一个界面即可自由切换不同 AI 能力。此外，它还率先支持 MCP（Model Context Protocol）协议，增强了上下文处理能力。针对企业用户，NextChat 提供专业版解决方案，具备品牌定制、细粒度权限控制、内部知识库整合及安全审计等功能，满足公司对数据隐私和个性化管理的高标准要求。",87618,"2026-04-05T07:20:52",[13,26],{"id":44,"name":45,"github_repo":46,"description_zh":47,"stars":48,"difficulty_score":23,"last_commit_at":49,"category_tags":50,"status":16},2268,"ML-For-Beginners","microsoft\u002FML-For-Beginners","ML-For-Beginners 是由微软推出的一套系统化机器学习入门课程，旨在帮助零基础用户轻松掌握经典机器学习知识。这套课程将学习路径规划为 12 周，包含 26 节精炼课程和 52 道配套测验，内容涵盖从基础概念到实际应用的完整流程，有效解决了初学者面对庞大知识体系时无从下手、缺乏结构化指导的痛点。\n\n无论是希望转型的开发者、需要补充算法背景的研究人员，还是对人工智能充满好奇的普通爱好者，都能从中受益。课程不仅提供了清晰的理论讲解，还强调动手实践，让用户在循序渐进中建立扎实的技能基础。其独特的亮点在于强大的多语言支持，通过自动化机制提供了包括简体中文在内的 50 多种语言版本，极大地降低了全球不同背景用户的学习门槛。此外，项目采用开源协作模式，社区活跃且内容持续更新，确保学习者能获取前沿且准确的技术资讯。如果你正寻找一条清晰、友好且专业的机器学习入门之路，ML-For-Beginners 将是理想的起点。",84991,"2026-04-05T10:45:23",[14,51,52,53,15,54,26,13,55],"数据工具","视频","插件","其他","音频",{"id":57,"name":58,"github_repo":59,"description_zh":60,"stars":61,"difficulty_score":10,"last_commit_at":62,"category_tags":63,"status":16},3128,"ragflow","infiniflow\u002Fragflow","RAGFlow 
是一款领先的开源检索增强生成（RAG）引擎，旨在为大语言模型构建更精准、可靠的上下文层。它巧妙地将前沿的 RAG 技术与智能体（Agent）能力相结合，不仅支持从各类文档中高效提取知识，还能让模型基于这些知识进行逻辑推理和任务执行。\n\n在大模型应用中，幻觉问题和知识滞后是常见痛点。RAGFlow 通过深度解析复杂文档结构（如表格、图表及混合排版），显著提升了信息检索的准确度，从而有效减少模型“胡编乱造”的现象，确保回答既有据可依又具备时效性。其内置的智能体机制更进一步，使系统不仅能回答问题，还能自主规划步骤解决复杂问题。\n\n这款工具特别适合开发者、企业技术团队以及 AI 研究人员使用。无论是希望快速搭建私有知识库问答系统，还是致力于探索大模型在垂直领域落地的创新者，都能从中受益。RAGFlow 提供了可视化的工作流编排界面和灵活的 API 接口，既降低了非算法背景用户的上手门槛，也满足了专业开发者对系统深度定制的需求。作为基于 Apache 2.0 协议开源的项目，它正成为连接通用大模型与行业专有知识之间的重要桥梁。",77062,"2026-04-04T04:44:48",[15,14,13,26,54],{"id":65,"github_repo":66,"name":67,"description_en":68,"description_zh":69,"ai_summary_zh":69,"readme_en":70,"readme_zh":71,"quickstart_zh":72,"use_case_zh":73,"hero_image_url":74,"owner_login":67,"owner_name":67,"owner_avatar_url":75,"owner_bio":76,"owner_company":77,"owner_location":77,"owner_email":78,"owner_twitter":79,"owner_website":80,"owner_url":81,"languages":82,"stars":99,"forks":100,"last_commit_at":101,"license":102,"difficulty_score":23,"env_os":103,"env_gpu":104,"env_ram":105,"env_deps":106,"category_tags":112,"github_topics":113,"view_count":23,"oss_zip_url":77,"oss_zip_packed_at":77,"status":16,"created_at":134,"updated_at":135,"faqs":136,"releases":172},2045,"obsei\u002Fobsei","obsei","Obsei is a low code AI powered automation tool. 
It can be used in various business flows like social listening, AI based alerting, brand image analysis, comparative study and more .","Obsei 是一款开源的低代码 AI 自动化工具，旨在帮助企业轻松处理非结构化数据。它通过“观察 - 分析 - 通知”三步流程，自动从社交媒体、应用商店评论、新闻网站等渠道收集信息，利用人工智能进行情感分析、分类或翻译，并将结果推送至工单系统或数据库中，以便团队及时采取行动。\n\n对于需要监控品牌声誉、倾听用户反馈或自动化客户投诉处理的企业而言，Obsei 有效解决了人工收集和分析海量网络数据效率低下、响应滞后的痛点。无论是初创公司的运营人员，还是希望快速构建自动化工作流的技术开发者，都能借助其低代码特性灵活部署，无需深厚的算法背景即可上手。\n\nObsei 的独特之处在于其模块化架构：观察者（Observer）、分析器（Analyzer）和通知器（Informer）可自由组合，支持对接多种数据源与下游应用。此外，它还具备状态持久化能力，能够稳定运行于定时任务或无服务器环境中。目前项目处于 Alpha 阶段，适合愿意尝试新技术并进行场景验证的用户探索使用。","\u003Cp align=\"center\">\n    \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fobsei_obsei_readme_8b7077d74c7f.png\" \u002F>\n\u003C\u002Fp>\n\n---\n\u003Cp align=\"center\">\n    \u003Ca href=\"https:\u002F\u002Fwww.oraika.com\">\n            \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fobsei_obsei_readme_258a37c0f2b8.png\" \u002F>\n    \u003C\u002Fa>\n\u003C\u002Fp>\n\u003Cp align=\"center\">\n    \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fobsei\u002Fobsei\u002Factions\">\n        \u003Cimg alt=\"Test\" src=\"https:\u002F\u002Fgithub.com\u002Fobsei\u002Fobsei\u002Fworkflows\u002FCI\u002Fbadge.svg?branch=master\">\n    \u003C\u002Fa>\n    \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fobsei\u002Fobsei\u002Fblob\u002Fmaster\u002FLICENSE\">\n        \u003Cimg alt=\"License\" src=\"https:\u002F\u002Fimg.shields.io\u002Fpypi\u002Fl\u002Fobsei\">\n    \u003C\u002Fa>\n    \u003Ca href=\"https:\u002F\u002Fpypi.org\u002Fproject\u002Fobsei\">\n        \u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fpypi\u002Fpyversions\u002Fobsei\" alt=\"PyPI - Python Version\" \u002F>\n    \u003C\u002Fa>\n    \u003Ca href=\"https:\u002F\u002Fpypi.org\u002Fproject\u002Fobsei\u002F\">\n        \u003Cimg alt=\"Release\" src=\"https:\u002F\u002Fimg.shields.io\u002Fpypi\u002Fv\u002Fobsei\">\n    \u003C\u002Fa>\n    \u003Ca 
href=\"https:\u002F\u002Fpepy.tech\u002Fproject\u002Fobsei\">\n        \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fobsei_obsei_readme_5d2fa38dd262.png\" alt=\"Downloads\" \u002F>\n    \u003C\u002Fa>\n    \u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fobsei\u002Fobsei-demo\">\n        \u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F%F0%9F%A4%97%20Hugging%20Face-Spaces-blue\" alt=\"HF Spaces\" \u002F>\n    \u003C\u002Fa>\n    \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fobsei\u002Fobsei\u002Fcommits\u002Fmaster\">\n        \u003Cimg alt=\"Last commit\" src=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Flast-commit\u002Fobsei\u002Fobsei\">\n    \u003C\u002Fa>\n    \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fobsei\u002Fobsei\">\n        \u003Cimg alt=\"Github stars\" src=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fobsei\u002Fobsei?style=social\">\n    \u003C\u002Fa>\n    \u003Ca href=\"https:\u002F\u002Fwww.youtube.com\u002Fchannel\u002FUCqdvgro1BzU13tkAfX3jCJA\">\n        \u003Cimg alt=\"YouTube Channel Subscribers\" src=\"https:\u002F\u002Fimg.shields.io\u002Fyoutube\u002Fchannel\u002Fsubscribers\u002FUCqdvgro1BzU13tkAfX3jCJA?style=social\">\n    \u003C\u002Fa>\n    \u003Ca href=\"https:\u002F\u002Fjoin.slack.com\u002Ft\u002Fobsei-community\u002Fshared_invite\u002Fzt-r0wnuz02-FAkAmhTAUoc6pD4SLB9Ikg\">\n        \u003Cimg src=\"https:\u002F\u002Fraw.githubusercontent.com\u002Fobsei\u002Fobsei-resources\u002Fmaster\u002Flogos\u002FSlack_join.svg\" height=\"30\">\n    \u003C\u002Fa>\n    \u003Ca href=\"https:\u002F\u002Ftwitter.com\u002FObseiAI\">\n        \u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Ftwitter\u002Ffollow\u002FObseiAI?style=social\">\n    \u003C\u002Fa>\n\u003C\u002Fp>\n\n---\n\n![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fobsei_obsei_readme_ee3ae753dfb8.gif)\n\n---\n\n\u003Cspan style=\"color:red\">\n\u003Cb>Note\u003C\u002Fb>: Obsei is 
still in the alpha stage, so use it carefully in production. Also, as it is under constant development, the master branch may contain many breaking changes; please use a released version.\n\u003C\u002Fspan>\n\n---\n\n**Obsei** (pronounced \"Ob see\" | \u002Fəb-'sē\u002F) is an open-source, low-code, AI-powered automation tool. _Obsei_ consists of -\n\n- **Observer**: Collect unstructured data from various sources like tweets from Twitter, subreddit comments on Reddit, comments on Facebook page posts, App Store reviews, Google reviews, Amazon reviews, news, websites, etc.\n- **Analyzer**: Analyze the collected unstructured data with various AI tasks like classification, sentiment analysis, translation, PII detection, etc.\n- **Informer**: Send analyzed data to various destinations like ticketing platforms, data storage, dataframes, etc., so that the user can take further actions and perform analysis on the data.\n\nAll the Observers can store their state in databases (SQLite, Postgres, MySQL, etc.), making Obsei suitable for scheduled jobs or serverless applications.\n\n![Obsei diagram](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fobsei_obsei_readme_d7fae3d6af12.png)\n\n### Future direction -\n\n- Text-, image-, audio-, document-, and video-oriented workflows\n- Collect data from every possible private and public channel\n- Add every possible workflow to an AI downstream application to automate manual cognitive workflows\n\n## Use cases\n\n_Obsei_ use cases include, but are not limited to -\n\n- Social listening: monitoring social media posts, comments, customer feedback, etc.\n- Alerting\u002FNotification: To get auto-alerts for events such as customer complaints, qualified sales leads, etc.\n- Automatic customer issue creation based on customer complaints on social media, email, etc.\n- Automatic assignment of proper tags to tickets based on the content of the customer complaint, for example login issues, sign-up issues, delivery issues, etc.\n- Extraction of deeper insights 
from feedback on various platforms\n- Market research\n- Creation of datasets for various AI tasks\n- Many more based on creativity 💡\n\n## Installation\n\n### Prerequisite\n\nInstall the following (if not present already) -\n\n- Install [Python 3.7+](https:\u002F\u002Fwww.python.org\u002Fdownloads\u002F)\n- Install [PIP](https:\u002F\u002Fpip.pypa.io\u002Fen\u002Fstable\u002Finstalling\u002F)\n\n### Install Obsei\n\nYou can install Obsei either via PIP or Conda based on your preference.\nTo install the latest released version -\n\n```shell\npip install obsei[all]\n```\n\nInstall from the master branch (if you want to try the latest features) -\n\n```shell\ngit clone https:\u002F\u002Fgithub.com\u002Fobsei\u002Fobsei.git\ncd obsei\npip install --editable .[all]\n```\n  \nNote: the `all` option will install all the dependencies, which might not be needed for your workflow; alternatively, the \nfollowing options are available to install minimal dependencies as per need -\n - `pip install obsei[source]`: To install dependencies related to all observers\n - `pip install obsei[sink]`: To install dependencies related to all informers\n - `pip install obsei[analyzer]`: To install dependencies related to all analyzers; it will install PyTorch as well\n - `pip install obsei[twitter-api]`: To install dependencies related to the Twitter observer\n - `pip install obsei[google-play-scraper]`: To install dependencies related to the Play Store review scraper observer\n - `pip install obsei[google-play-api]`: To install dependencies related to the official Google Play Store review API based observer\n - `pip install obsei[app-store-scraper]`: To install dependencies related to the Apple App Store review scraper observer\n - `pip install obsei[reddit-scraper]`: To install dependencies related to the Reddit post and comment scraper observer\n - `pip install obsei[reddit-api]`: To install dependencies related to the official Reddit API based observer\n - `pip install obsei[pandas]`: To install dependencies related to 
TSV\u002FCSV\u002FPandas based observer and informer\n - `pip install obsei[google-news-scraper]`: To install dependencies related to the Google News scraper observer\n - `pip install obsei[facebook-api]`: To install dependencies related to the official Facebook page post and comments API based observer\n - `pip install obsei[atlassian-api]`: To install dependencies related to the official Jira API based informer\n - `pip install obsei[elasticsearch]`: To install dependencies related to the Elasticsearch informer\n - `pip install obsei[slack-api]`: To install dependencies related to the official Slack API based informer\n\nYou can also mix multiple dependencies together in a single installation command. For example, to install dependencies for the \nTwitter observer, all analyzers, and the Slack informer, use the following command -\n```shell\npip install obsei[twitter-api, analyzer, slack-api]\n```\n\n\n## How to use\n\nExpand the following steps and create a workflow -\n\n\u003Cdetails>\u003Csummary>\u003Cb>Step 1: Configure Source\u002FObserver\u003C\u002Fb>\u003C\u002Fsummary>\n\n\u003Ctable >\u003Ctbody >\u003Ctr>\u003C\u002Ftr>\u003Ctr>\n\u003Ctd>\u003Cdetails >\u003Csummary>\u003Cimg style=\"vertical-align:middle;margin:2px 10px\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fobsei_obsei_readme_7571ea13179d.png\" width=\"20\" height=\"20\">\u003Cb>Twitter\u003C\u002Fb>\u003C\u002Fsummary>\u003Chr>\n\n```python\nfrom obsei.source.twitter_source import TwitterCredentials, TwitterSource, TwitterSourceConfig\n\n# initialize twitter source config\nsource_config = TwitterSourceConfig(\n   keywords=[\"issue\"], # Keywords, @user or #hashtags\n   lookup_period=\"1h\", # Lookup period from current time, format: `\u003Cnumber>\u003Cd|h|m>` (day|hour|minute)\n   cred_info=TwitterCredentials(\n       # Enter your twitter consumer key and secret. 
Get it from https:\u002F\u002Fdeveloper.twitter.com\u002Fen\u002Fapply-for-access\n       consumer_key=\"\u003Ctwitter_consumer_key>\",\n       consumer_secret=\"\u003Ctwitter_consumer_secret>\",\n       bearer_token='\u003CENTER BEARER TOKEN>',\n   )\n)\n\n# initialize tweets retriever\nsource = TwitterSource()\n```\n\n\u003C\u002Fdetails>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>\u003Cdetails >\u003Csummary>\u003Cimg style=\"vertical-align:middle;margin:2px 10px\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fobsei_obsei_readme_43abd58514fa.png\" width=\"20\" height=\"20\">\u003Cb>Youtube Scrapper\u003C\u002Fb>\u003C\u002Fsummary>\u003Chr>\n\n```python\nfrom obsei.source.youtube_scrapper import YoutubeScrapperSource, YoutubeScrapperConfig\n\n# initialize Youtube source config\nsource_config = YoutubeScrapperConfig(\n    video_url=\"https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=uZfns0JIlFk\", # Youtube video URL\n    fetch_replies=True, # Fetch replies to comments\n    max_comments=10, # Total number of comments and replies to fetch\n    lookup_period=\"1Y\", # Lookup period from current time, format: `\u003Cnumber>\u003Cd|h|m|M|Y>` (day|hour|minute|month|year)\n)\n\n# initialize Youtube comments retriever\nsource = YoutubeScrapperSource()\n```\n\n\u003C\u002Fdetails>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>\u003Cdetails >\u003Csummary>\u003Cimg style=\"vertical-align:middle;margin:2px 10px\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fobsei_obsei_readme_fd8ae6be8fa7.png\" width=\"20\" height=\"20\">\u003Cb>Facebook\u003C\u002Fb>\u003C\u002Fsummary>\u003Chr>\n\n```python\nfrom obsei.source.facebook_source import FacebookCredentials, FacebookSource, FacebookSourceConfig\n\n# initialize facebook source config\nsource_config = FacebookSourceConfig(\n   page_id=\"110844591144719\", # Facebook page id, for example this one for Obsei\n   lookup_period=\"1h\", # Lookup period from current time, format: 
`\u003Cnumber>\u003Cd|h|m>` (day|hour|minute)\n   cred_info=FacebookCredentials(\n       # Enter your facebook app_id, app_secret and long_term_token. Get it from https:\u002F\u002Fdevelopers.facebook.com\u002Fapps\u002F\n       app_id=\"\u003Cfacebook_app_id>\",\n       app_secret=\"\u003Cfacebook_app_secret>\",\n       long_term_token=\"\u003Cfacebook_long_term_token>\",\n   )\n)\n\n# initialize facebook post comments retriever\nsource = FacebookSource()\n```\n\n\u003C\u002Fdetails>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>\u003Cdetails >\u003Csummary>\u003Cimg style=\"vertical-align:middle;margin:2px 10px\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fobsei_obsei_readme_d07dde5ae291.png\" width=\"20\" height=\"20\">\u003Cb>Email\u003C\u002Fb>\u003C\u002Fsummary>\u003Chr>\n\n```python\nfrom obsei.source.email_source import EmailConfig, EmailCredInfo, EmailSource\n\n# initialize email source config\nsource_config = EmailConfig(\n   # List of IMAP servers for most commonly used email providers\n   # https:\u002F\u002Fwww.systoolsgroup.com\u002Fimap\u002F\n   # Also, if you're using a Gmail account then make sure you allow less secure apps on your account -\n   # https:\u002F\u002Fmyaccount.google.com\u002Flesssecureapps?pli=1\n   # Also enable IMAP access -\n   # https:\u002F\u002Fmail.google.com\u002Fmail\u002Fu\u002F0\u002F#settings\u002Ffwdandpop\n   imap_server=\"imap.gmail.com\", # Enter IMAP server\n   cred_info=EmailCredInfo(\n       # Enter your email account username and password\n       username=\"\u003Cemail_username>\",\n       password=\"\u003Cemail_password>\"\n   ),\n   lookup_period=\"1h\" # Lookup period from current time, format: `\u003Cnumber>\u003Cd|h|m>` (day|hour|minute)\n)\n\n# initialize email retriever\nsource = EmailSource()\n```\n\n\u003C\u002Fdetails>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>\u003Cdetails >\u003Csummary>\u003Cimg style=\"vertical-align:middle;margin:2px 10px\" 
src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fobsei_obsei_readme_6b6b38b8aeb1.png\" width=\"20\" height=\"20\">\u003Cb>Google Maps Reviews Scrapper\u003C\u002Fb>\u003C\u002Fsummary>\u003Chr>\n\n```python\nfrom obsei.source.google_maps_reviews import OSGoogleMapsReviewsSource, OSGoogleMapsReviewsConfig\n\n# initialize Outscrapper Maps review source config\nsource_config = OSGoogleMapsReviewsConfig(\n   # Collect API key from https:\u002F\u002Foutscraper.com\u002F\n   api_key=\"\u003CEnter Your API Key>\",\n   # Enter Google Maps link or place id\n   # For example below is for the \"Taj Mahal\"\n   queries=[\"https:\u002F\u002Fwww.google.co.in\u002Fmaps\u002Fplace\u002FTaj+Mahal\u002F@27.1751496,78.0399535,17z\u002Fdata=!4m5!3m4!1s0x39747121d702ff6d:0xdd2ae4803f767dde!8m2!3d27.1751448!4d78.0421422\"],\n   number_of_reviews=10,\n)\n\n\n# initialize Outscrapper Maps review retriever\nsource = OSGoogleMapsReviewsSource()\n```\n\n\u003C\u002Fdetails>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>\u003Cdetails >\u003Csummary>\u003Cimg style=\"vertical-align:middle;margin:2px 10px\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fobsei_obsei_readme_1b94a031e7d9.png\" width=\"20\" height=\"20\">\u003Cb>AppStore Reviews Scrapper\u003C\u002Fb>\u003C\u002Fsummary>\u003Chr>\n\n```python\nfrom obsei.source.appstore_scrapper import AppStoreScrapperConfig, AppStoreScrapperSource\n\n# initialize app store source config\nsource_config = AppStoreScrapperConfig(\n   # Need two parameters app_id and country.\n   # `app_id` can be found at the end of the url of app in app store.\n   # For example - https:\u002F\u002Fapps.apple.com\u002Fus\u002Fapp\u002Fxcode\u002Fid497799835\n   # `310633997` is the app_id for xcode and `us` is country.\n   countries=[\"us\"],\n   app_id=\"310633997\",\n   lookup_period=\"1h\" # Lookup period from current time, format: `\u003Cnumber>\u003Cd|h|m>` (day|hour|minute)\n)\n\n\n# initialize app store reviews 
retriever\nsource = AppStoreScrapperSource()\n```\n\n\u003C\u002Fdetails>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>\u003Cdetails >\u003Csummary>\u003Cimg style=\"vertical-align:middle;margin:2px 10px\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fobsei_obsei_readme_efef15fe89ec.png\" width=\"20\" height=\"20\">\u003Cb>Play Store Reviews Scrapper\u003C\u002Fb>\u003C\u002Fsummary>\u003Chr>\n\n```python\nfrom obsei.source.playstore_scrapper import PlayStoreScrapperConfig, PlayStoreScrapperSource\n\n# initialize play store source config\nsource_config = PlayStoreScrapperConfig(\n   # Needs two parameters: package_name and country.\n   # `package_name` can be found at the end of the URL of the app in the Play Store.\n   # For example - https:\u002F\u002Fplay.google.com\u002Fstore\u002Fapps\u002Fdetails?id=com.google.android.gm&hl=en&gl=US\n   # `com.google.android.gm` is the package_name for Gmail and `us` is the country.\n   countries=[\"us\"],\n   package_name=\"com.google.android.gm\",\n   lookup_period=\"1h\" # Lookup period from current time, format: `\u003Cnumber>\u003Cd|h|m>` (day|hour|minute)\n)\n\n# initialize play store reviews retriever\nsource = PlayStoreScrapperSource()\n```\n\n\u003C\u002Fdetails>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>\u003Cdetails >\u003Csummary>\u003Cimg style=\"vertical-align:middle;margin:2px 10px\" src=\"https:\u002F\u002Fraw.githubusercontent.com\u002Fobsei\u002Fobsei-resources\u002Fmaster\u002Flogos\u002Freddit.png\" width=\"20\" height=\"20\">\u003Cb>Reddit\u003C\u002Fb>\u003C\u002Fsummary>\u003Chr>\n\n```python\nfrom obsei.source.reddit_source import RedditConfig, RedditSource, RedditCredInfo\n\n# initialize reddit source config\nsource_config = RedditConfig(\n   subreddits=[\"wallstreetbets\"], # List of subreddits\n   # Reddit account username and password\n   # You can also enter reddit client_id and client_secret or refresh_token\n   # Create credential at 
https:\u002F\u002Fwww.reddit.com\u002Fprefs\u002Fapps\n   # Also refer https:\u002F\u002Fpraw.readthedocs.io\u002Fen\u002Flatest\u002Fgetting_started\u002Fauthentication.html\n   # Currently Password Flow, Read Only Mode and Saved Refresh Token Mode are supported\n   cred_info=RedditCredInfo(\n       username=\"\u003Creddit_username>\",\n       password=\"\u003Creddit_password>\"\n   ),\n   lookup_period=\"1h\" # Lookup period from current time, format: `\u003Cnumber>\u003Cd|h|m>` (day|hour|minute)\n)\n\n# initialize reddit retriever\nsource = RedditSource()\n```\n\n\u003C\u002Fdetails>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>\u003Cdetails >\u003Csummary>\u003Cimg style=\"vertical-align:middle;margin:2px 10px\" src=\"https:\u002F\u002Fraw.githubusercontent.com\u002Fobsei\u002Fobsei-resources\u002Fmaster\u002Flogos\u002Freddit.png\" width=\"20\" height=\"20\">\u003Cb>Reddit Scrapper\u003C\u002Fb>\u003C\u002Fsummary>\u003Chr>\n\n\u003Ci>Note: Reddit heavily rate-limits scrapers, hence use it to fetch a small amount of data over a long period\u003C\u002Fi>\n\n```python\nfrom obsei.source.reddit_scrapper import RedditScrapperConfig, RedditScrapperSource\n\n# initialize reddit scrapper source config\nsource_config = RedditScrapperConfig(\n   # Reddit subreddit, search, etc. RSS URL. 
For the proper URL refer to the following link -\n   # Refer https:\u002F\u002Fwww.reddit.com\u002Fr\u002Fpathogendavid\u002Fcomments\u002Ftv8m9\u002Fpathogendavids_guide_to_rss_and_reddit\u002F\n   url=\"https:\u002F\u002Fwww.reddit.com\u002Fr\u002Fwallstreetbets\u002Fcomments\u002F.rss?sort=new\",\n   lookup_period=\"1h\" # Lookup period from current time, format: `\u003Cnumber>\u003Cd|h|m>` (day|hour|minute)\n)\n\n# initialize reddit retriever\nsource = RedditScrapperSource()\n```\n\n\u003C\u002Fdetails>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>\u003Cdetails >\u003Csummary>\u003Cimg style=\"vertical-align:middle;margin:2px 10px\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fobsei_obsei_readme_6bd590596518.png\" width=\"20\" height=\"20\">\u003Cb>Google News\u003C\u002Fb>\u003C\u002Fsummary>\u003Chr>\n\n```python\nfrom obsei.source.google_news_source import GoogleNewsConfig, GoogleNewsSource\n\n# initialize Google News source config\nsource_config = GoogleNewsConfig(\n   query='bitcoin',\n   max_results=5,\n   # To fetch the full article text, enable the `fetch_article` flag\n   # By default, Google News gives the title and highlight\n   fetch_article=True,\n   # proxy='http:\u002F\u002F127.0.0.1:8080'\n)\n\n# initialize Google News retriever\nsource = GoogleNewsSource()\n```\n\n\u003C\u002Fdetails>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>\u003Cdetails >\u003Csummary>\u003Cimg style=\"vertical-align:middle;margin:2px 10px\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fobsei_obsei_readme_554aa2cef5a6.png\" width=\"20\" height=\"20\">\u003Cb>Web Crawler\u003C\u002Fb>\u003C\u002Fsummary>\u003Chr>\n\n```python\nfrom obsei.source.website_crawler_source import TrafilaturaCrawlerConfig, TrafilaturaCrawlerSource\n\n# initialize website crawler source config\nsource_config = TrafilaturaCrawlerConfig(\n   urls=['https:\u002F\u002Fobsei.github.io\u002Fobsei\u002F']\n)\n\n# initialize website text retriever\nsource = 
TrafilaturaCrawlerSource()\n```\n\n\u003C\u002Fdetails>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>\u003Cdetails >\u003Csummary>\u003Cimg style=\"vertical-align:middle;margin:2px 10px\" src=\"https:\u002F\u002Fraw.githubusercontent.com\u002Fobsei\u002Fobsei-resources\u002Fmaster\u002Flogos\u002Fpandas.svg\" width=\"20\" height=\"20\">\u003Cb>Pandas DataFrame\u003C\u002Fb>\u003C\u002Fsummary>\u003Chr>\n\n```python\nimport pandas as pd\nfrom obsei.source.pandas_source import PandasSource, PandasSourceConfig\n\n# Initialize your Pandas DataFrame from your sources like CSV, Excel, SQL, etc.\n# In the following example we read a CSV which has two columns: title and text\ncsv_file = \"https:\u002F\u002Fraw.githubusercontent.com\u002Fdeepset-ai\u002Fhaystack\u002Fmaster\u002Ftutorials\u002Fsmall_generator_dataset.csv\"\ndataframe = pd.read_csv(csv_file)\n\n# initialize pandas source config\nsource_config = PandasSourceConfig(\n   dataframe=dataframe,\n   include_columns=[\"score\"],\n   text_columns=[\"name\", \"degree\"],\n)\n\n# initialize pandas source\nsource = PandasSource()\n```\n\n\u003C\u002Fdetails>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003C\u002Ftbody>\n\u003C\u002Ftable>\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\u003Csummary>\u003Cb>Step 2: Configure Analyzer\u003C\u002Fb>\u003C\u002Fsummary>\n\n\u003Ci>Note: To run transformers in an offline mode, check [transformers offline mode](https:\u002F\u002Fhuggingface.co\u002Ftransformers\u002Finstallation.html#offline-mode).\u003C\u002Fi>\n\n\u003Cp>Some analyzers support GPU; to utilize it, pass the \u003Cb>device\u003C\u002Fb> parameter.\nList of possible values of the \u003Cb>device\u003C\u002Fb> parameter (default value \u003Ci>auto\u003C\u002Fi>):\n\u003Col>\n    \u003Cli> \u003Cb>auto\u003C\u002Fb>: GPU (cuda:0) will be used if available, otherwise CPU will be used\n    \u003Cli> \u003Cb>cpu\u003C\u002Fb>: CPU will be used\n    \u003Cli> \u003Cb>cuda:{id}\u003C\u002Fb> - GPU will be used with the provided CUDA device 
id\n\u003C\u002Fol>\n\u003C\u002Fp>\n\n\u003Ctable >\u003Ctbody >\u003Ctr>\u003C\u002Ftr>\u003Ctr>\n\u003Ctd>\u003Cdetails >\u003Csummary>\u003Cimg style=\"vertical-align:middle;margin:2px 10px\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fobsei_obsei_readme_b1b59ee76898.png\" width=\"20\" height=\"20\">\u003Cb>Text Classification\u003C\u002Fb>\u003C\u002Fsummary>\u003Chr>\n\nText classification: Classify text into user provided categories.\n\n```python\nfrom obsei.analyzer.classification_analyzer import ClassificationAnalyzerConfig, ZeroShotClassificationAnalyzer\n\n# initialize classification analyzer config\n# It can also detect sentiments if \"positive\" and \"negative\" labels are added.\nanalyzer_config=ClassificationAnalyzerConfig(\n   labels=[\"service\", \"delay\", \"performance\"],\n)\n\n# initialize classification analyzer\n# For supported models refer https:\u002F\u002Fhuggingface.co\u002Fmodels?filter=zero-shot-classification\ntext_analyzer = ZeroShotClassificationAnalyzer(\n   model_name_or_path=\"typeform\u002Fmobilebert-uncased-mnli\",\n   device=\"auto\"\n)\n```\n\n\u003C\u002Fdetails>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>\u003Cdetails >\u003Csummary>\u003Cimg style=\"vertical-align:middle;margin:2px 10px\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fobsei_obsei_readme_e1e772095c96.png\" width=\"20\" height=\"20\">\u003Cb>Sentiment Analyzer\u003C\u002Fb>\u003C\u002Fsummary>\u003Chr>\n\nSentiment Analyzer: Detect the sentiment of the text. 
Text classification can also perform sentiment analysis, but if you don't want to use a heavy-duty NLP model, use the less resource-hungry, dictionary-based VADER sentiment detector.\n\n```python\nfrom obsei.analyzer.sentiment_analyzer import VaderSentimentAnalyzer\n\n# Vader does not need any configuration settings\nanalyzer_config=None\n\n# initialize vader sentiment analyzer\ntext_analyzer = VaderSentimentAnalyzer()\n```\n\n\u003C\u002Fdetails>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>\u003Cdetails >\u003Csummary>\u003Cimg style=\"vertical-align:middle;margin:2px 10px\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fobsei_obsei_readme_2b730d24c8d4.png\" width=\"20\" height=\"20\">\u003Cb>NER Analyzer\u003C\u002Fb>\u003C\u002Fsummary>\u003Chr>\n\nNER (Named-Entity Recognition) Analyzer: Extract information and classify named entities mentioned in text into pre-defined categories such as person names, organizations, locations, medical codes, time expressions, quantities, monetary values, percentages, etc.\n\n```python\nfrom obsei.analyzer.ner_analyzer import NERAnalyzer\n\n# NER analyzer does not need configuration settings\nanalyzer_config=None\n\n# initialize ner analyzer\n# For supported models refer https:\u002F\u002Fhuggingface.co\u002Fmodels?filter=token-classification\ntext_analyzer = NERAnalyzer(\n   model_name_or_path=\"elastic\u002Fdistilbert-base-cased-finetuned-conll03-english\",\n   device=\"auto\"\n)\n```\n\n\u003C\u002Fdetails>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>\u003Cdetails >\u003Csummary>\u003Cimg style=\"vertical-align:middle;margin:2px 10px\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fobsei_obsei_readme_1bd64751fea1.png\" width=\"20\" height=\"20\">\u003Cb>Translator\u003C\u002Fb>\u003C\u002Fsummary>\u003Chr>\n\n```python\nfrom obsei.analyzer.translation_analyzer import TranslationAnalyzer\n\n# Translator does not need analyzer config\nanalyzer_config = None\n\n# initialize 
translator\n# For supported models refer https:\u002F\u002Fhuggingface.co\u002Fmodels?pipeline_tag=translation\nanalyzer = TranslationAnalyzer(\n   model_name_or_path=\"Helsinki-NLP\u002Fopus-mt-hi-en\",\n   device = \"auto\"\n)\n```\n\n\u003C\u002Fdetails>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>\u003Cdetails >\u003Csummary>\u003Cimg style=\"vertical-align:middle;margin:2px 10px\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fobsei_obsei_readme_8abbf929a193.png\" width=\"20\" height=\"20\">\u003Cb>PII Anonymizer\u003C\u002Fb>\u003C\u002Fsummary>\u003Chr>\n\n```python\nfrom obsei.analyzer.pii_analyzer import PresidioEngineConfig, PresidioModelConfig, \\\n   PresidioPIIAnalyzer, PresidioPIIAnalyzerConfig\n\n# initialize pii analyzer's config\nanalyzer_config = PresidioPIIAnalyzerConfig(\n   # Whether to return only pii analysis or anonymize text\n   analyze_only=False,\n   # Whether to return detail information about anonymization decision\n   return_decision_process=True\n)\n\n# initialize pii analyzer\nanalyzer = PresidioPIIAnalyzer(\n   engine_config=PresidioEngineConfig(\n       # spacy and stanza nlp engines are supported\n       # For more info refer\n       # https:\u002F\u002Fmicrosoft.github.io\u002Fpresidio\u002Fanalyzer\u002Fdeveloping_recognizers\u002F#utilize-spacy-or-stanza\n       nlp_engine_name=\"spacy\",\n       # Update desired spacy model and language\n       models=[PresidioModelConfig(model_name=\"en_core_web_lg\", lang_code=\"en\")]\n   )\n)\n```\n\n\u003C\u002Fdetails>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>\u003Cdetails >\u003Csummary>\u003Cimg style=\"vertical-align:middle;margin:2px 10px\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fobsei_obsei_readme_d517a5bd25a0.png\" width=\"20\" height=\"20\">\u003Cb>Dummy Analyzer\u003C\u002Fb>\u003C\u002Fsummary>\u003Chr>\n\nDummy Analyzer: Does nothing. 
It is simply used for transforming the input (TextPayload) into the output (TextPayload) and adding user-supplied dummy data.\n\n```python\nfrom obsei.analyzer.dummy_analyzer import DummyAnalyzer, DummyAnalyzerConfig\n\n# initialize dummy analyzer's configuration settings\nanalyzer_config = DummyAnalyzerConfig()\n\n# initialize dummy analyzer\nanalyzer = DummyAnalyzer()\n```\n\n\u003C\u002Fdetails>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003C\u002Ftbody>\n\u003C\u002Ftable>\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\u003Csummary>\u003Cb>Step 3: Configure Sink\u002FInformer\u003C\u002Fb>\u003C\u002Fsummary>\n\n\u003Ctable >\u003Ctbody >\u003Ctr>\u003C\u002Ftr>\u003Ctr>\n\u003Ctd>\u003Cdetails >\u003Csummary>\u003Cimg style=\"vertical-align:middle;margin:2px 10px\" src=\"https:\u002F\u002Fraw.githubusercontent.com\u002Fobsei\u002Fobsei-resources\u002Fmaster\u002Flogos\u002Fslack.svg\" width=\"25\" height=\"25\">\u003Cb>Slack\u003C\u002Fb>\u003C\u002Fsummary>\u003Chr>\n\n```python\nfrom obsei.sink.slack_sink import SlackSink, SlackSinkConfig\n\n# initialize slack sink config\nsink_config = SlackSinkConfig(\n   # Provide slack bot\u002Fapp token\n   # For more details refer https:\u002F\u002Fslack.com\u002Fintl\u002Fen-de\u002Fhelp\u002Farticles\u002F215770388-Create-and-regenerate-API-tokens\n   slack_token=\"\u003CSlack_app_token>\",\n   # To get channel id refer https:\u002F\u002Fstackoverflow.com\u002Fquestions\u002F40940327\u002Fwhat-is-the-simplest-way-to-find-a-slack-team-id-and-a-channel-id\n   channel_id=\"C01LRS6CT9Q\"\n)\n\n# initialize slack sink\nsink = SlackSink()\n```\n\n\u003C\u002Fdetails>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>\u003Cdetails >\u003Csummary>\u003Cimg style=\"vertical-align:middle;margin:2px 10px\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fobsei_obsei_readme_e28a4f7f192a.png\" width=\"20\" height=\"20\">\u003Cb>Zendesk\u003C\u002Fb>\u003C\u002Fsummary>\u003Chr>\n\n```python\nfrom obsei.sink.zendesk_sink import 
ZendeskSink, ZendeskSinkConfig, ZendeskCredInfo\n\n# initialize zendesk sink config\nsink_config = ZendeskSinkConfig(\n   # provide zendesk domain\n   domain=\"zendesk.com\",\n   # provide subdomain if you have one\n   subdomain=None,\n   # Enter zendesk user details\n   cred_info=ZendeskCredInfo(\n       email=\"\u003Czendesk_user_email>\",\n       password=\"\u003Czendesk_password>\"\n   )\n)\n\n# initialize zendesk sink\nsink = ZendeskSink()\n```\n\n\u003C\u002Fdetails>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>\u003Cdetails >\u003Csummary>\u003Cimg style=\"vertical-align:middle;margin:2px 10px\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fobsei_obsei_readme_ace43fd305a1.png\" width=\"20\" height=\"20\">\u003Cb>Jira\u003C\u002Fb>\u003C\u002Fsummary>\u003Chr>\n\n```python\nfrom obsei.sink.jira_sink import JiraSink, JiraSinkConfig\n\n# For testing purposes you can start a Jira server locally\n# Refer https:\u002F\u002Fdeveloper.atlassian.com\u002Fserver\u002Fframework\u002Fatlassian-sdk\u002Fatlas-run-standalone\u002F\n\n# initialize Jira sink config\nsink_config = JiraSinkConfig(\n   url=\"http:\u002F\u002Flocalhost:2990\u002Fjira\", # Jira server url\n   # Jira username & password for a user who has permission to create issues\n   username=\"\u003Cusername>\",\n   password=\"\u003Cpassword>\",\n   # The type of issue to be created\n   # For more information refer https:\u002F\u002Fsupport.atlassian.com\u002Fjira-cloud-administration\u002Fdocs\u002Fwhat-are-issue-types\u002F\n   issue_type={\"name\": \"Task\"},\n   # The project under which the issue is to be created\n   # For more information refer https:\u002F\u002Fsupport.atlassian.com\u002Fjira-software-cloud\u002Fdocs\u002Fwhat-is-a-jira-software-project\u002F\n   project={\"key\": \"CUS\"},\n)\n\n# initialize Jira sink\nsink = JiraSink()\n```\n\n\u003C\u002Fdetails>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>\u003Cdetails >\u003Csummary>\u003Cimg 
style=\"vertical-align:middle;margin:2px 10px\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fobsei_obsei_readme_dae29809149b.png\" width=\"20\" height=\"20\">\u003Cb>ElasticSearch\u003C\u002Fb>\u003C\u002Fsummary>\u003Chr>\n\n```python\nfrom obsei.sink.elasticsearch_sink import ElasticSearchSink, ElasticSearchSinkConfig\n\n# For testing purposes you can start an Elasticsearch server locally via docker\n# `docker run -d --name elasticsearch -p 9200:9200 -e \"discovery.type=single-node\" elasticsearch:8.5.0`\n\n# initialize Elasticsearch sink config\nsink_config = ElasticSearchSinkConfig(\n   # Elasticsearch server\n   hosts=\"http:\u002F\u002Flocalhost:9200\",\n   # Index name; it will be created if it does not exist\n   index_name=\"test\",\n)\n\n# initialize Elasticsearch sink\nsink = ElasticSearchSink()\n```\n\n\u003C\u002Fdetails>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>\u003Cdetails >\u003Csummary>\u003Cimg style=\"vertical-align:middle;margin:2px 10px\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fobsei_obsei_readme_576de5fa98da.png\" width=\"20\" height=\"20\">\u003Cb>Http\u003C\u002Fb>\u003C\u002Fsummary>\u003Chr>\n\n```python\nfrom obsei.sink.http_sink import HttpSink, HttpSinkConfig\n\n# For testing purposes you can create a mock http server via postman\n# For more details refer https:\u002F\u002Flearning.postman.com\u002Fdocs\u002Fdesigning-and-developing-your-api\u002Fmocking-data\u002Fsetting-up-mock\u002F\n\n# initialize http sink config (Currently only POST calls are supported)\nsink_config = HttpSinkConfig(\n   # provide http server url\n   url=\"https:\u002F\u002Flocalhost:8080\u002Fapi\u002Fpath\",\n   # Here you can add headers you would like to pass with the request\n   headers={\n       \"Content-type\": \"application\u002Fjson\"\n   }\n)\n\n# To modify or convert the payload, create a convertor class\n# Refer obsei.sink.dailyget_sink.PayloadConvertor for an example\n\n# initialize http sink\nsink = 
HttpSink()\n```\n\n\u003C\u002Fdetails>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>\u003Cdetails >\u003Csummary>\u003Cimg style=\"vertical-align:middle;margin:2px 10px\" src=\"https:\u002F\u002Fraw.githubusercontent.com\u002Fobsei\u002Fobsei-resources\u002Fmaster\u002Flogos\u002Fpandas.svg\" width=\"20\" height=\"20\">\u003Cb>Pandas DataFrame\u003C\u002Fb>\u003C\u002Fsummary>\u003Chr>\n\n```python\nfrom pandas import DataFrame\nfrom obsei.sink.pandas_sink import PandasSink, PandasSinkConfig\n\n# initialize pandas sink config\nsink_config = PandasSinkConfig(\n   dataframe=DataFrame()\n)\n\n# initialize pandas sink\nsink = PandasSink()\n```\n\n\u003C\u002Fdetails>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>\u003Cdetails >\u003Csummary>\u003Cimg style=\"vertical-align:middle;margin:2px 10px\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fobsei_obsei_readme_f324ea1e48ad.png\" width=\"20\" height=\"20\">\u003Cb>Logger\u003C\u002Fb>\u003C\u002Fsummary>\u003Chr>\n\nThis is useful for testing and dry-running the pipeline.\n\n```python\nfrom obsei.sink.logger_sink import LoggerSink, LoggerSinkConfig\nimport logging\nimport sys\n\nlogger = logging.getLogger(\"Obsei\")\nlogging.basicConfig(stream=sys.stdout, level=logging.INFO)\n\n# initialize logger sink config\nsink_config = LoggerSinkConfig(\n   logger=logger,\n   level=logging.INFO\n)\n\n# initialize logger sink\nsink = LoggerSink()\n```\n\n\u003C\u002Fdetails>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003C\u002Ftbody>\n\u003C\u002Ftable>\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\u003Csummary>\u003Cb>Step 4: Join and create workflow\u003C\u002Fb>\u003C\u002Fsummary>\n\n`source` will fetch data from the selected source and feed it to the `analyzer` for processing; the analyzer's output is then fed into a `sink`, which sends the notification to the configured destination.\n\n```python\n# Uncomment if you want logger\n# import logging\n# import sys\n# logger = logging.getLogger(__name__)\n# logging.basicConfig(stream=sys.stdout, 
level=logging.INFO)\n\n# This will fetch information from the configured source, i.e. Twitter, App Store, etc.\nsource_response_list = source.lookup(source_config)\n\n# Uncomment if you want to log source response\n# for idx, source_response in enumerate(source_response_list):\n#     logger.info(f\"source_response#'{idx}'='{source_response.__dict__}'\")\n\n# This will execute the analyzer (sentiment, classification, etc.) on the source data with the provided analyzer_config\nanalyzer_response_list = text_analyzer.analyze_input(\n    source_response_list=source_response_list,\n    analyzer_config=analyzer_config\n)\n\n# Uncomment if you want to log analyzer response\n# for idx, an_response in enumerate(analyzer_response_list):\n#    logger.info(f\"analyzer_response#'{idx}'='{an_response.__dict__}'\")\n\n# Analyzer output is added to segmented_data\n# Uncomment to log it\n# for idx, an_response in enumerate(analyzer_response_list):\n#    logger.info(f\"analyzed_data#'{idx}'='{an_response.segmented_data.__dict__}'\")\n\n# This will send the analyzed output to the configured sink, i.e. Slack, Zendesk, etc.\nsink_response_list = sink.send_data(analyzer_response_list, sink_config)\n\n# Uncomment if you want to log sink response\n# for sink_response in sink_response_list:\n#     if sink_response is not None:\n#         logger.info(f\"sink_response='{sink_response}'\")\n```\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\u003Csummary>\u003Cb>Step 5: Execute workflow\u003C\u002Fb>\u003C\u002Fsummary>\nCopy the code snippets from \u003Cb>Steps 1 to 4\u003C\u002Fb> into a Python file, for example \u003Ccode>example.py\u003C\u002Fcode>, and execute the following command -\n\n```shell\npython example.py\n```\n\n\u003C\u002Fdetails>\n\n## Demo\n\nWe have a minimal [streamlit](https:\u002F\u002Fstreamlit.io\u002F) based UI that you can use to test Obsei.\n\n![Screenshot](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fobsei_obsei_readme_4242c7662fa8.png)\n\n### Watch UI demo video\n\n[![Introductory and demo 
video](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fobsei_obsei_readme_f2da74e06f9a.jpg)](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=GTF-Hy96gvY)\n\nCheck the demo at [![](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F%F0%9F%A4%97%20Hugging%20Face-Spaces-blue)](https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fobsei\u002Fobsei-demo)\n\n(**Note**: Sometimes the Streamlit demo might not work due to rate limiting; use the docker image (locally) in such cases.)\n\nTo test locally, just run\n\n```shell\ndocker run -d --name obsei-ui -p 8501:8501 obsei\u002Fobsei-ui-demo\n\n# You can find the UI at http:\u002F\u002Flocalhost:8501\n```\n\n**To run an Obsei workflow easily using GitHub Actions (no sign-ups and cloud hosting required), refer to this [repo](https:\u002F\u002Fgithub.com\u002Fobsei\u002Fdemo-workflow-action)**.\n\n## Companies\u002FProjects using Obsei\n\nHere are some companies\u002Fprojects (alphabetical order) using Obsei. To add your company\u002Fproject to the list, please raise a PR or contact us via [email](mailto:contact@obsei.com).\n\n- [Oraika](https:\u002F\u002Fwww.oraika.com): Contextually understand customer feedback\n- [1Page](https:\u002F\u002Fwww.get1page.com\u002F): Giving a better context in meetings and calls\n- [Spacepulse](http:\u002F\u002Fspacepulse.in\u002F): The operating system for spaces\n- [Superblog](https:\u002F\u002Fsuperblog.ai\u002F): A blazing fast alternative to WordPress and Medium\n- [Zolve](https:\u002F\u002Fzolve.com\u002F): Creating a financial world beyond borders\n- [Utilize](https:\u002F\u002Fwww.utilize.app\u002F): No-code app builder for businesses with a deskless workforce\n\n## Articles\n\n\u003Ctable>\n\u003Cthead>\n\u003Ctr class=\"header\">\n\u003Cth>Sr. 
No.\u003C\u002Fth>\n\u003Cth>Title\u003C\u002Fth>\n\u003Cth>Author\u003C\u002Fth>\n\u003C\u002Ftr>\n\u003C\u002Fthead>\n\u003Ctbody>\n\u003Ctr>\n\u003Ctd>1\u003C\u002Ftd>\n\u003Ctd>\n    \u003Ca href=\"https:\u002F\u002Freenabapna.medium.com\u002Fai-based-comparative-customer-feedback-analysis-using-deep-learning-models-def0dc77aaee\">AI based Comparative Customer Feedback Analysis Using Obsei\u003C\u002Fa>\n\u003C\u002Ftd>\n\u003Ctd>\n    \u003Ca href=\"https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Freena-bapna-66a8691a\">Reena Bapna\u003C\u002Fa>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>2\u003C\u002Ftd>\n\u003Ctd>\n    \u003Ca href=\"https:\u002F\u002Fmedium.com\u002Fmlearning-ai\u002Flinkedin-app-user-feedback-analysis-9c9f98464daa\">LinkedIn App - User Feedback Analysis\u003C\u002Fa>\n\u003C\u002Ftd>\n\u003Ctd>\n    \u003Ca href=\"http:\u002F\u002Fwww.linkedin.com\u002Fin\u002Fhimanshusharmads\">Himanshu Sharma\u003C\u002Fa>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003C\u002Ftbody>\n\u003C\u002Ftable>\n\n## Tutorials\n\n\u003Ctable>\n\u003Cthead>\n\u003Ctr class=\"header\">\n\u003Cth>Sr. 
No.\u003C\u002Fth>\n\u003Cth>Workflow\u003C\u002Fth>\n\u003Cth>Colab\u003C\u002Fth>\n\u003Cth>Binder\u003C\u002Fth>\n\u003C\u002Ftr>\n\u003C\u002Fthead>\n\u003Ctbody>\n\u003Ctr>\n\u003Ctd rowspan=\"2\">1\u003C\u002Ftd>\n\u003Ctd colspan=\"3\">Observe app reviews from Google play store, Analyze them by performing text classification and then Inform them on console via logger\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>PlayStore Reviews → Classification → Logger\u003C\u002Ftd>\n\u003Ctd>\n    \u003Ca href=\"https:\u002F\u002Fcolab.research.google.com\u002Fgithub\u002Fobsei\u002Fobsei\u002Fblob\u002Fmaster\u002Ftutorials\u002F01_PlayStore_Classification_Logger.ipynb\">\n        \u003Cimg alt=\"Colab\" src=\"https:\u002F\u002Fcolab.research.google.com\u002Fassets\u002Fcolab-badge.svg\">\n    \u003C\u002Fa>\n\u003C\u002Ftd>\n\u003Ctd>\n    \u003Ca href=\"https:\u002F\u002Fmybinder.org\u002Fv2\u002Fgh\u002Fobsei\u002Fobsei\u002FHEAD?filepath=tutorials%2F01_PlayStore_Classification_Logger.ipynb\">\n        \u003Cimg alt=\"Colab\" src=\"https:\u002F\u002Fmybinder.org\u002Fbadge_logo.svg\">\n    \u003C\u002Fa>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd rowspan=\"2\">2\u003C\u002Ftd>\n\u003Ctd colspan=\"3\">Observe app reviews from Google play store, PreProcess text via various text cleaning functions, Analyze them by performing text classification, Inform them to Pandas DataFrame and store resultant CSV to Google Drive\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>PlayStore Reviews → PreProcessing → Classification → Pandas DataFrame → CSV in Google Drive\u003C\u002Ftd>\n\u003Ctd>\n    \u003Ca href=\"https:\u002F\u002Fcolab.research.google.com\u002Fgithub\u002Fobsei\u002Fobsei\u002Fblob\u002Fmaster\u002Ftutorials\u002F02_PlayStore_PreProc_Classification_Pandas.ipynb\">\n        \u003Cimg alt=\"Colab\" src=\"https:\u002F\u002Fcolab.research.google.com\u002Fassets\u002Fcolab-badge.svg\">\n    \u003C\u002Fa>\n\u003C\u002Ftd>\n\u003Ctd>\n    \u003Ca 
href=\"https:\u002F\u002Fmybinder.org\u002Fv2\u002Fgh\u002Fobsei\u002Fobsei\u002FHEAD?filepath=tutorials%2F02_PlayStore_PreProc_Classification_Pandas.ipynb\">\n        \u003Cimg alt=\"Colab\" src=\"https:\u002F\u002Fmybinder.org\u002Fbadge_logo.svg\">\n    \u003C\u002Fa>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd rowspan=\"2\">3\u003C\u002Ftd>\n\u003Ctd colspan=\"3\">Observe app reviews from Apple app store, PreProcess text via various text cleaning function, Analyze them by performing text classification, Inform them to Pandas DataFrame and store resultant CSV to Google Drive\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>AppStore Reviews → PreProcessing → Classification → Pandas DataFrame → CSV in Google Drive\u003C\u002Ftd>\n\u003Ctd>\n    \u003Ca href=\"https:\u002F\u002Fcolab.research.google.com\u002Fgithub\u002Fobsei\u002Fobsei\u002Fblob\u002Fmaster\u002Ftutorials\u002F03_AppStore_PreProc_Classification_Pandas.ipynb\">\n        \u003Cimg alt=\"Colab\" src=\"https:\u002F\u002Fcolab.research.google.com\u002Fassets\u002Fcolab-badge.svg\">\n    \u003C\u002Fa>\n\u003C\u002Ftd>\n\u003Ctd>\n    \u003Ca href=\"https:\u002F\u002Fmybinder.org\u002Fv2\u002Fgh\u002Fobsei\u002Fobsei\u002FHEAD?filepath=tutorials%2F03_AppStore_PreProc_Classification_Pandas.ipynb\">\n        \u003Cimg alt=\"Colab\" src=\"https:\u002F\u002Fmybinder.org\u002Fbadge_logo.svg\">\n    \u003C\u002Fa>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd rowspan=\"2\">4\u003C\u002Ftd>\n\u003Ctd colspan=\"3\">Observe news article from Google news, PreProcess text via various text cleaning function, Analyze them via performing text classification while splitting text in small chunks and later computing final inference using given formula\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>Google News → Text Cleaner → Text Splitter → Classification → Inference Aggregator\u003C\u002Ftd>\n\u003Ctd>\n    \u003Ca 
href=\"https:\u002F\u002Fcolab.research.google.com\u002Fgithub\u002Fobsei\u002Fobsei\u002Fblob\u002Fmaster\u002Ftutorials\u002F04_GoogleNews_Cleaner_Splitter_Classification_Aggregator.ipynb\">\n        \u003Cimg alt=\"Colab\" src=\"https:\u002F\u002Fcolab.research.google.com\u002Fassets\u002Fcolab-badge.svg\">\n    \u003C\u002Fa>\n\u003C\u002Ftd>\n\u003Ctd>\n    \u003Ca href=\"https:\u002F\u002Fmybinder.org\u002Fv2\u002Fgh\u002Fobsei\u002Fobsei\u002FHEAD?filepath=tutorials%2F04_GoogleNews_Cleaner_Splitter_Classification_Aggregator.ipynb\">\n        \u003Cimg alt=\"Colab\" src=\"https:\u002F\u002Fmybinder.org\u002Fbadge_logo.svg\">\n    \u003C\u002Fa>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003C\u002Ftbody>\n\u003C\u002Ftable>\n\n\u003Cdetails>\u003Csummary>\u003Cb>💡Tips: Handle large text classification via Obsei\u003C\u002Fb>\u003C\u002Fsummary>\n\n![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fobsei_obsei_readme_090e883d2ba7.gif)\n\n\u003C\u002Fdetails>\n\n## Documentation\n\nFor detailed installation instructions, usages and examples, refer to our [documentation](https:\u002F\u002Fobsei.github.io\u002Fobsei\u002F).\n\n## Support and Release Matrix\n\n\u003Ctable>\n\u003Cthead>\n\u003Ctr class=\"header\">\n\u003Cth>\u003C\u002Fth>\n\u003Cth>Linux\u003C\u002Fth>\n\u003Cth>Mac\u003C\u002Fth>\n\u003Cth>Windows\u003C\u002Fth>\n\u003Cth>Remark\u003C\u002Fth>\n\u003C\u002Ftr>\n\u003C\u002Fthead>\n\u003Ctbody>\n\u003Ctr>\n\u003Ctd>Tests\u003C\u002Ftd>\n\u003Ctd style=\"text-align:center\">✅\u003C\u002Ftd>\n\u003Ctd style=\"text-align:center\">✅\u003C\u002Ftd>\n\u003Ctd style=\"text-align:center\">✅\u003C\u002Ftd>\n\u003Ctd>Low Coverage as difficult to test 3rd party libs\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>PIP\u003C\u002Ftd>\n\u003Ctd style=\"text-align:center\">✅\u003C\u002Ftd>\n\u003Ctd style=\"text-align:center\">✅\u003C\u002Ftd>\n\u003Ctd style=\"text-align:center\">✅\u003C\u002Ftd>\n\u003Ctd>Fully 
Supported\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>Conda\u003C\u002Ftd>\n\u003Ctd style=\"text-align:center\">❌\u003C\u002Ftd>\n\u003Ctd style=\"text-align:center\">❌\u003C\u002Ftd>\n\u003Ctd style=\"text-align:center\">❌\u003C\u002Ftd>\n\u003Ctd>Not Supported\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003C\u002Ftbody>\n\u003C\u002Ftable>\n\n## Discussion forum\n\nDiscussion about _Obsei_ can be done at the [community forum](https:\u002F\u002Fgithub.com\u002Fobsei\u002Fobsei\u002Fdiscussions).\n\n## Changelogs\n\nRefer to [releases](https:\u002F\u002Fgithub.com\u002Fobsei\u002Fobsei\u002Freleases) for changelogs.\n\n## Security Issue\n\nFor any security issue, please contact us via [email](mailto:contact@oraika.com).\n\n## Stargazers over time\n\n[![Stargazers over time](https:\u002F\u002Fstarchart.cc\u002Fobsei\u002Fobsei.svg)](https:\u002F\u002Fstarchart.cc\u002Fobsei\u002Fobsei)\n\n## Maintainers\n\nThis project is maintained by [Oraika Technologies](https:\u002F\u002Fwww.oraika.com). [Lalit Pagaria](https:\u002F\u002Fgithub.com\u002Flalitpagaria) and [Girish Patel](https:\u002F\u002Fgithub.com\u002FGirishPatel) are the maintainers of this project.\n\n## License\n\n- Copyright holder: [Oraika Technologies](https:\u002F\u002Fwww.oraika.com)\n- The overall license is Apache 2.0; you can read the [License](https:\u002F\u002Fgithub.com\u002Fobsei\u002Fobsei\u002Fblob\u002Fmaster\u002FLICENSE) file.\n- For the multiple other secondary permissive or weak copyleft licenses (LGPL, MIT, BSD, etc.) of third-party components, refer to [Attribution](https:\u002F\u002Fgithub.com\u002Fobsei\u002Fobsei\u002Fblob\u002Fmaster\u002FATTRIBUTION.md).\n- To make the project more commercially friendly, we avoid including third-party components with strong copyleft licenses (GPL, AGPL, etc.) in the project.\n\n## Attribution\n\nThis could not have been possible without this [open-source software](https:\u002F\u002Fgithub.com\u002Fobsei\u002Fobsei\u002Fblob\u002Fmaster\u002FATTRIBUTION.md).\n\n## Contribution\n\nFirst off, thank you for even considering contributing to this package; every contribution, big or small, is greatly appreciated.\nPlease refer to our [Contribution Guideline](https:\u002F\u002Fgithub.com\u002Fobsei\u002Fobsei\u002Fblob\u002Fmaster\u002FCONTRIBUTING.md) and [Code of Conduct](https:\u002F\u002Fgithub.com\u002Fobsei\u002Fobsei\u002Fblob\u002Fmaster\u002FCODE_OF_CONDUCT.md).\n\nThanks so much to all our contributors!\n\n\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fobsei\u002Fobsei\u002Fgraphs\u002Fcontributors\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fobsei_obsei_readme_649d705c3b13.png\" \u002F>\n\u003C\u002Fa>\n","\u003Cp align=\"center\">\n    \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fobsei_obsei_readme_8b7077d74c7f.png\" \u002F>\n\u003C\u002Fp>\n\n---\n\u003Cp align=\"center\">\n    \u003Ca href=\"https:\u002F\u002Fwww.oraika.com\">\n            \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fobsei_obsei_readme_258a37c0f2b8.png\" \u002F>\n    \u003C\u002Fa>\n\u003C\u002Fp>\n\u003Cp align=\"center\">\n    \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fobsei\u002Fobsei\u002Factions\">\n        \u003Cimg alt=\"Test\" src=\"https:\u002F\u002Fgithub.com\u002Fobsei\u002Fobsei\u002Fworkflows\u002FCI\u002Fbadge.svg?branch=master\">\n    \u003C\u002Fa>\n    \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fobsei\u002Fobsei\u002Fblob\u002Fmaster\u002FLICENSE\">\n        \u003Cimg alt=\"License\" src=\"https:\u002F\u002Fimg.shields.io\u002Fpypi\u002Fl\u002Fobsei\">\n    \u003C\u002Fa>\n    \u003Ca href=\"https:\u002F\u002Fpypi.org\u002Fproject\u002Fobsei\">\n        \u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fpypi\u002Fpyversions\u002Fobsei\" 
alt=\"PyPI - Python Version\" \u002F>\n    \u003C\u002Fa>\n    \u003Ca href=\"https:\u002F\u002Fpypi.org\u002Fproject\u002Fobsei\u002F\">\n        \u003Cimg alt=\"Release\" src=\"https:\u002F\u002Fimg.shields.io\u002Fpypi\u002Fv\u002Fobsei\">\n    \u003C\u002Fa>\n    \u003Ca href=\"https:\u002F\u002Fpepy.tech\u002Fproject\u002Fobsei\">\n        \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fobsei_obsei_readme_5d2fa38dd262.png\" alt=\"Downloads\" \u002F>\n    \u003C\u002Fa>\n    \u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fobsei\u002Fobsei-demo\">\n        \u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F%F0%9F%A4%97%20Hugging%20Face-Spaces-blue\" alt=\"HF Spaces\" \u002F>\n    \u003C\u002Fa>\n    \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fobsei\u002Fobsei\u002Fcommits\u002Fmaster\">\n        \u003Cimg alt=\"Last commit\" src=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Flast-commit\u002Fobsei\u002Fobsei\">\n    \u003C\u002Fa>\n    \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fobsei\u002Fobsei\">\n        \u003Cimg alt=\"Github stars\" src=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fobsei\u002Fobsei?style=social\">\n    \u003C\u002Fa>\n    \u003Ca href=\"https:\u002F\u002Fwww.youtube.com\u002Fchannel\u002FUCqdvgro1BzU13tkAfX3jCJA\">\n        \u003Cimg alt=\"YouTube Channel Subscribers\" src=\"https:\u002F\u002Fimg.shields.io\u002Fyoutube\u002Fchannel\u002Fsubscribers\u002FUCqdvgro1BzU13tkAfX3jCJA?style=social\">\n    \u003C\u002Fa>\n    \u003Ca href=\"https:\u002F\u002Fjoin.slack.com\u002Ft\u002Fobsei-community\u002Fshared_invite\u002Fzt-r0wnuz02-FAkAmhTAUoc6pD4SLB9Ikg\">\n        \u003Cimg src=\"https:\u002F\u002Fraw.githubusercontent.com\u002Fobsei\u002Fobsei-resources\u002Fmaster\u002Flogos\u002FSlack_join.svg\" height=\"30\">\n    \u003C\u002Fa>\n    \u003Ca href=\"https:\u002F\u002Ftwitter.com\u002FObseiAI\">\n        \u003Cimg 
src=\"https:\u002F\u002Fimg.shields.io\u002Ftwitter\u002Ffollow\u002FObseiAI?style=social\">\n    \u003C\u002Fa>\n\u003C\u002Fp>\n\n---\n\n![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fobsei_obsei_readme_ee3ae753dfb8.gif)\n\n---\n\n\u003Cspan style=\"color:red\">\n\u003Cb>Note\u003C\u002Fb>: Obsei is still in the alpha stage, so use it carefully in production. Also, as it is under constant development, the master branch may contain many breaking changes. Please use a released version.\n\u003C\u002Fspan>\n\n---\n\n**Obsei** (pronounced "Ob see" | \u002Fəb-'sē\u002F) is an open-source, low-code, AI-powered automation tool. _Obsei_ consists of:\n\n- **Observer**: Collects unstructured data from various sources, such as tweets from Twitter, subreddit comments on Reddit, comments on Facebook page posts, app store reviews, Google reviews, Amazon reviews, news, websites, etc.\n- **Analyzer**: Analyzes the collected unstructured data with various AI tasks, such as classification, sentiment analysis, translation, PII handling, etc.\n- **Informer**: Sends the analyzed data to various destinations, such as ticketing platforms, data storage, dataframes, etc., so that users can take further actions and perform analysis on the data.\n\nAll observers can store their state in databases (SQLite, PostgreSQL, MySQL, etc.), which makes Obsei a good fit for scheduled jobs or serverless applications.\n\n![Obsei diagram](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fobsei_obsei_readme_d7fae3d6af12.png)\n\n### Future direction -\n\n- Workflows oriented towards text, images, audio, documents and video\n- Collect data from every possible private and public channel\n- Add every possible workflow to AI downstream applications to automate manual cognitive work\n\n## Use cases\n\n_Obsei_ use cases include, but are not limited to:\n\n- Social listening: listening to social media posts, comments, customer feedback, etc.\n- Alerting\u002Fnotification: automatic alerts for events such as customer complaints, qualified sales leads, etc.\n- Automatic creation of customer issues based on complaints from social media, email, etc.\n- Automatic assignment of proper tags to tickets based on the content of the customer complaint, for example login issues, sign-up issues, delivery issues, etc.\n- Extraction of deeper insights from feedback on various platforms.\n- Market research.\n- Creation of datasets for various AI tasks.\n- Many more, based on your creativity 💡\n\n## Installation\n\n### Prerequisites\n\nInstall the following (if not already installed):\n\n- Install [Python 3.7+](https:\u002F\u002Fwww.python.org\u002Fdownloads\u002F)\n- Install [PIP](https:\u002F\u002Fpip.pypa.io\u002Fen\u002Fstable\u002Finstalling\u002F)\n\n### Install Obsei\n\nYou can install Obsei via PIP or Conda, based on your preference.\n\nTo install the latest released version -\n\n```shell\npip install obsei[all]\n```\n\nTo install from the master branch (if you want to try the latest features) -\n\n```shell\ngit clone https:\u002F\u002Fgithub.com\u002Fobsei\u002Fobsei.git\ncd obsei\npip install --editable .[all]\n```\n\nNote: the `all` option installs all dependencies, which may not be needed for your workflow. Alternatively, you can install the following minimal dependencies as required:\n- `pip install obsei[source]`: To install dependencies related to all observers\n- `pip install obsei[sink]`: To install dependencies related to all informers\n- `pip install obsei[analyzer]`: To install dependencies related to all analyzers; this will install PyTorch as well\n- `pip install obsei[twitter-api]`: To install dependencies related to the Twitter observer\n- `pip install 
obsei[google-play-scraper]`: To install dependencies related to the Play Store review scrapper observer\n- `pip install obsei[google-play-api]`: To install dependencies related to the official Google Play Store review API based observer\n- `pip install obsei[app-store-scraper]`: To install dependencies related to the Apple App Store review scrapper observer\n- `pip install obsei[reddit-scraper]`: To install dependencies related to the Reddit post and comment scrapper observer\n- `pip install obsei[reddit-api]`: To install dependencies related to the official Reddit API based observer\n- `pip install obsei[pandas]`: To install dependencies related to TSV\u002FCSV\u002FPandas based observers and informers\n- `pip install obsei[google-news-scraper]`: To install dependencies related to the Google News scrapper observer\n- `pip install obsei[facebook-api]`: To install dependencies related to the official Facebook page post and comment API based observer\n- `pip install obsei[atlassian-api]`: To install dependencies related to the official Jira API based informer\n- `pip install obsei[elasticsearch]`: To install dependencies related to the Elasticsearch informer\n- `pip install obsei[slack-api]`: To install dependencies related to the official Slack API based informer\n\nYou can also mix multiple dependencies in a single installation command. For example, to install the Twitter observer, all analyzers and the Slack informer, use the following command:\n```shell\npip install obsei[twitter-api, analyzer, slack-api]\n```\n\n\n## How to use\n\nExpand the following steps and create a workflow -\n\n\u003Cdetails>\u003Csummary>\u003Cb>Step 1: Configure Source\u002FObserver\u003C\u002Fb>\u003C\u002Fsummary>\n\n\u003Ctable >\u003Ctbody >\u003Ctr>\u003C\u002Ftr>\u003Ctr>\n\u003Ctd>\u003Cdetails >\u003Csummary>\u003Cimg style=\"vertical-align:middle;margin:2px 10px\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fobsei_obsei_readme_7571ea13179d.png\" width=\"20\" height=\"20\">\u003Cb>Twitter\u003C\u002Fb>\u003C\u002Fsummary>\u003Chr>\n\n```python\nfrom obsei.source.twitter_source import TwitterCredentials, TwitterSource, TwitterSourceConfig\n\n# initialize twitter source config\nsource_config = TwitterSourceConfig(\n   keywords=[\"issue\"], # Keywords, @user or #hashtags\n   lookup_period=\"1h\", # Lookup period from the current time, format: \u003Cnumber>\u003Cd|h|m> (day|hour|minute)\n   cred_info=TwitterCredentials(\n       # Enter your Twitter consumer key and secret. Get them from https:\u002F\u002Fdeveloper.twitter.com\u002Fen\u002Fapply-for-access\n       consumer_key=\"\u003Ctwitter_consumer_key>\",\n       consumer_secret=\"\u003Ctwitter_consumer_secret>\",\n       bearer_token='\u003CENTER BEARER TOKEN>',\n   )\n)\n\n# initialize tweets retriever\nsource = 
TwitterSource()\n```\n\n\u003C\u002Fdetails>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>\u003Cdetails >\u003Csummary>\u003Cimg style=\"vertical-align:middle;margin:2px 10px\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fobsei_obsei_readme_43abd58514fa.png\" width=\"20\" height=\"20\">\u003Cb>Youtube Scrapper\u003C\u002Fb>\u003C\u002Fsummary>\u003Chr>\n\n```python\nfrom obsei.source.youtube_scrapper import YoutubeScrapperSource, YoutubeScrapperConfig\n\n# initialize Youtube source config\nsource_config = YoutubeScrapperConfig(\n    video_url=\"https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=uZfns0JIlFk\", # Youtube video URL\n    fetch_replies=True, # Fetch replies to comments\n    max_comments=10, # Total number of comments and replies to fetch\n    lookup_period=\"1Y\", # Lookup period from the current time, format: \u003Cnumber>\u003Cd|h|m|M|Y> (day|hour|minute|month|year)\n)\n\n# initialize Youtube comments retriever\nsource = YoutubeScrapperSource()\n```\n\n\u003C\u002Fdetails>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>\u003Cdetails >\u003Csummary>\u003Cimg style=\"vertical-align:middle;margin:2px 10px\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fobsei_obsei_readme_fd8ae6be8fa7.png\" width=\"20\" height=\"20\">\u003Cb>Facebook\u003C\u002Fb>\u003C\u002Fsummary>\u003Chr>\n\n```python\nfrom obsei.source.facebook_source import FacebookCredentials, FacebookSource, FacebookSourceConfig\n\n# initialize facebook source config\nsource_config = FacebookSourceConfig(\n   page_id=\"110844591144719\", # Facebook page id, for example this one is for Obsei\n   lookup_period=\"1h\", # Lookup period from the current time, format: \u003Cnumber>\u003Cd|h|m> (day|hour|minute)\n   cred_info=FacebookCredentials(\n       # Enter your Facebook app_id, app_secret and long-term token. Get them from https:\u002F\u002Fdevelopers.facebook.com\u002Fapps\u002F\n       app_id=\"\u003Cfacebook_app_id>\",\n       app_secret=\"\u003Cfacebook_app_secret>\",\n       long_term_token=\"\u003Cfacebook_long_term_token>\",\n   )\n)\n\n# initialize facebook post comments retriever\nsource = FacebookSource()\n```\n\n\u003C\u002Fdetails>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>\u003Cdetails >\u003Csummary>\u003Cimg 
style=\"vertical-align:middle;margin:2px 10px\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fobsei_obsei_readme_d07dde5ae291.png\" width=\"20\" height=\"20\">\u003Cb>Email\u003C\u002Fb>\u003C\u002Fsummary>\u003Chr>\n\n```python\nfrom obsei.source.email_source import EmailConfig, EmailCredInfo, EmailSource\n\n# initialize email source config\nsource_config = EmailConfig(\n   # List of IMAP servers for the most commonly used email providers\n   # https:\u002F\u002Fwww.systoolsgroup.com\u002Fimap\u002F\n   # Also, if you are using a Gmail account, make sure you allow less secure apps on your account -\n   # https:\u002F\u002Fmyaccount.google.com\u002Flesssecureapps?pli=1\n   # Also enable IMAP access -\n   # https:\u002F\u002Fmail.google.com\u002Fmail\u002Fu\u002F0\u002F#settings\u002Ffwdandpop\n   imap_server=\"imap.gmail.com\", # Enter IMAP server\n   cred_info=EmailCredInfo(\n       # Enter your email account username and password\n       username=\"\u003Cemail_username>\",\n       password=\"\u003Cemail_password>\"\n   ),\n   lookup_period=\"1h\" # Lookup period from the current time, format: \u003Cnumber>\u003Cd|h|m> (day|hour|minute)\n)\n\n# initialize email retriever\nsource = EmailSource()\n```\n\n\u003C\u002Fdetails>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>\u003Cdetails >\u003Csummary>\u003Cimg style=\"vertical-align:middle;margin:2px 10px\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fobsei_obsei_readme_6b6b38b8aeb1.png\" width=\"20\" height=\"20\">\u003Cb>Google Maps Reviews Scrapper\u003C\u002Fb>\u003C\u002Fsummary>\u003Chr>\n\n```python\nfrom obsei.source.google_maps_reviews import OSGoogleMapsReviewsSource, OSGoogleMapsReviewsConfig\n\n# initialize Outscrapper Maps review source config\nsource_config = OSGoogleMapsReviewsConfig(\n   # Get the API key from https:\u002F\u002Foutscraper.com\u002F\n   api_key=\"\u003CEnter your API key>\",\n   # Enter a Google Maps link or place id\n   # For example, below is the link for the "Taj Mahal"\n   queries=[\"https:\u002F\u002Fwww.google.co.in\u002Fmaps\u002Fplace\u002FTaj+Mahal\u002F@27.1751496,78.0399535,17z\u002Fdata=!4m5!3m4!1s0x39747121d702ff6d:0xdd2ae4803f767dde!8m2!3d27.1751448!4d78.0421422\"],\n   number_of_reviews=10,\n)\n\n\n# initialize Outscrapper Maps reviews retriever\nsource = 
OSGoogleMapsReviewsSource()\n```\n\n\u003C\u002Fdetails>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>\u003Cdetails >\u003Csummary>\u003Cimg style=\"vertical-align:middle;margin:2px 10px\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fobsei_obsei_readme_1b94a031e7d9.png\" width=\"20\" height=\"20\">\u003Cb>AppStore评论抓取器\u003C\u002Fb>\u003C\u002Fsummary>\u003Chr>\n\n```python\nfrom obsei.source.appstore_scrapper import AppStoreScrapperConfig, AppStoreScrapperSource\n\n# 初始化AppStore源配置\nsource_config = AppStoreScrapperConfig(\n   # 需要两个参数：app_id和country。\n   # `app_id`可在App Store应用URL的末尾找到。\n   # 例如 - https:\u002F\u002Fapps.apple.com\u002Fus\u002Fapp\u002Fxcode\u002Fid497799835\n   # `497799835`是Xcode的app_id，`us`是国家。\n   countries=[\"us\"],\n   app_id=\"497799835\",\n   lookup_period=\"1h\" # 查找周期，从当前时间开始，格式：\u003C数字>\u003Cd|h|m>（天|小时|分钟）\n)\n\n\n# 初始化AppStore评论获取器\nsource = AppStoreScrapperSource()\n```\n\n\u003C\u002Fdetails>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>\u003Cdetails >\u003Csummary>\u003Cimg style=\"vertical-align:middle;margin:2px 10px\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fobsei_obsei_readme_efef15fe89ec.png\" width=\"20\" height=\"20\">\u003Cb>Play Store评论抓取器\u003C\u002Fb>\u003C\u002Fsummary>\u003Chr>\n\n```python\nfrom obsei.source.playstore_scrapper import PlayStoreScrapperConfig, PlayStoreScrapperSource\n\n# 初始化Play Store源配置\nsource_config = PlayStoreScrapperConfig(\n   # 需要两个参数：package_name和country。\n   # `package_name`可在Play Store应用URL的末尾找到。\n   # 例如 - https:\u002F\u002Fplay.google.com\u002Fstore\u002Fapps\u002Fdetails?id=com.google.android.gm&hl=en&gl=US\n   # `com.google.android.gm`是Gmail的package_name，`us`是国家。\n   countries=[\"us\"],\n   package_name=\"com.google.android.gm\",\n   lookup_period=\"1h\" # 查找周期，从当前时间开始，格式：\u003C数字>\u003Cd|h|m>（天|小时|分钟）\n)\n\n# 初始化Play Store评论获取器\nsource = 
PlayStoreScrapperSource()\n```\n\n\u003C\u002Fdetails>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>\u003Cdetails >\u003Csummary>\u003Cimg style=\"vertical-align:middle;margin:2px 10px\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fobsei_obsei_readme_c621dca9144a.png\" width=\"20\" height=\"20\">\u003Cb>Reddit\u003C\u002Fb>\u003C\u002Fsummary>\u003Chr>\n\n```python\nfrom obsei.source.reddit_source import RedditConfig, RedditSource, RedditCredInfo\n\n# 初始化Reddit源配置\nsource_config = RedditConfig(\n   subreddits=[\"wallstreetbets\"], # 子Reddit列表\n   # Reddit账号用户名和密码\n   # 也可以输入Reddit client_id、client_secret或refresh_token\n   # 在https:\u002F\u002Fwww.reddit.com\u002Fprefs\u002Fapps创建凭据\n   # 参考https:\u002F\u002Fpraw.readthedocs.io\u002Fen\u002Flatest\u002Fgetting_started\u002Fauthentication.html\n   # 目前支持密码流、只读模式和已保存刷新令牌模式\n   cred_info=RedditCredInfo(\n       username=\"\u003Creddit_username>\",\n       password=\"\u003Creddit_password>\"\n   ),\n   lookup_period=\"1h\" # 查找周期，从当前时间开始，格式：\u003C数字>\u003Cd|h|m>（天|小时|分钟）\n)\n\n# 初始化Reddit获取器\nsource = RedditSource()\n```\n\n\u003C\u002Fdetails>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>\u003Cdetails >\u003Csummary>\u003Cimg style=\"vertical-align:middle;margin:2px 10px\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fobsei_obsei_readme_c621dca9144a.png\" width=\"20\" height=\"20\">\u003Cb>Reddit抓取器\u003C\u002Fb>\u003C\u002Fsummary>\u003Chr>\n\n\u003Ci>注意：Reddit对抓取器有严格的速率限制，因此请在长时间内使用它来获取少量数据\u003C\u002Fi>\n\n```python\nfrom obsei.source.reddit_scrapper import RedditScrapperConfig, RedditScrapperSource\n\n# 初始化Reddit抓取器源配置\nsource_config = RedditScrapperConfig(\n   # Reddit子Reddit、搜索等RSS URL。如需正确URL，请参考以下链接：\n   # 参考https:\u002F\u002Fwww.reddit.com\u002Fr\u002Fpathogendavid\u002Fcomments\u002Ftv8m9\u002Fpathogendavids_guide_to_rss_and_reddit\u002F\n   url=\"https:\u002F\u002Fwww.reddit.com\u002Fr\u002Fwallstreetbets\u002Fcomments\u002F.rss?sort=new\",\n   
lookup_period=\"1h\" # 查找周期，从当前时间开始，格式：\u003C数字>\u003Cd|h|m>（天|小时|分钟）\n)\n\n# 初始化Reddit获取器\nsource = RedditScrapperSource()\n```\n\n\u003C\u002Fdetails>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>\u003Cdetails >\u003Csummary>\u003Cimg style=\"vertical-align:middle;margin:2px 10px\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fobsei_obsei_readme_6bd590596518.png\" width=\"20\" height=\"20\">\u003Cb>Google新闻\u003C\u002Fb>\u003C\u002Fsummary>\u003Chr>\n\n```python\nfrom obsei.source.google_news_source import GoogleNewsConfig, GoogleNewsSource\n\n# 初始化Google新闻源配置\nsource_config = GoogleNewsConfig(\n   query='bitcoin',\n   max_results=5,\n   # 要获取全文，启用`fetch_article`标志\n   # 默认情况下，Google新闻提供标题和亮点\n   fetch_article=True,\n   # proxy='http:\u002F\u002F127.0.0.1:8080'\n)\n\n# 初始化Google新闻获取器\nsource = GoogleNewsSource()\n```\n\n\u003C\u002Fdetails>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>\u003Cdetails >\u003Csummary>\u003Cimg style=\"vertical-align:middle;margin:2px 10px\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fobsei_obsei_readme_554aa2cef5a6.png\" width=\"20\" height=\"20\">\u003Cb>网络爬虫\u003C\u002Fb>\u003C\u002Fsummary>\u003Chr>\n\n```python\nfrom obsei.source.website_crawler_source import TrafilaturaCrawlerConfig, TrafilaturaCrawlerSource\n\n# 初始化网站爬虫源配置\nsource_config = TrafilaturaCrawlerConfig(\n   urls=['https:\u002F\u002Fobsei.github.io\u002Fobsei\u002F']\n)\n\n# 初始化网站文本获取器\nsource = TrafilaturaCrawlerSource()\n```\n\n\u003C\u002Fdetails>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>\u003Cdetails >\u003Csummary>\u003Cimg style=\"vertical-align:middle;margin:2px 10px\" src=\"https:\u002F\u002Fraw.githubusercontent.com\u002Fobsei\u002Fobsei-resources\u002Fmaster\u002Flogos\u002Fpandas.svg\" width=\"20\" height=\"20\">\u003Cb>Pandas数据框\u003C\u002Fb>\u003C\u002Fsummary>\u003Chr>\n\n```python\nimport pandas as pd\nfrom obsei.source.pandas_source import PandasSource, PandasSourceConfig\n\n# 
从CSV、Excel、SQL等来源初始化你的Pandas数据框\n# 在以下示例中，我们读取包含两列的CSV文件：title和text\ncsv_file = \"https:\u002F\u002Fraw.githubusercontent.com\u002Fdeepset-ai\u002Fhaystack\u002Fmaster\u002Ftutorials\u002Fsmall_generator_dataset.csv\"\ndataframe = pd.read_csv(csv_file)\n\n# 初始化Pandas数据源配置\nsource_config = PandasSourceConfig(\n   dataframe=dataframe,\n   text_columns=[\"title\", \"text\"],\n)\n\n# 初始化 Pandas 数据源\nsource = PandasSource()\n```\n\n\u003C\u002Fdetails>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003C\u002Ftbody>\n\u003C\u002Ftable>\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\u003Csummary>\u003Cb>步骤 2：配置分析器\u003C\u002Fb>\u003C\u002Fsummary>\n\n\u003Ci>注意：要在离线模式下运行转换器，请查看[转换器离线模式](https:\u002F\u002Fhuggingface.co\u002Ftransformers\u002Finstallation.html#offline-mode)。\u003C\u002Fi>\n\n\u003Cp>一些分析器支持 GPU，可以使用\u003Cb>device\u003C\u002Fb>参数。\n\u003Cb>device\u003C\u002Fb>参数的可能值列表（默认值\u003Ci>auto\u003C\u002Fi>）：\n\u003Col>\n    \u003Cli> \u003Cb>auto\u003C\u002Fb>：如果可用则使用 GPU (cuda:0)，否则使用 CPU\n    \u003Cli> \u003Cb>cpu\u003C\u002Fb>：使用 CPU\n    \u003Cli> \u003Cb>cuda:{id}\u003C\u002Fb>：使用提供的 CUDA 设备 ID 的 GPU\n\u003C\u002Fol>\n\u003C\u002Fp>\n\n\u003Ctable >\u003Ctbody >\u003Ctr>\u003C\u002Ftr>\u003Ctr>\n\u003Ctd>\u003Cdetails >\u003Csummary>\u003Cimg style=\"vertical-align:middle;margin:2px 10px\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fobsei_obsei_readme_b1b59ee76898.png\" width=\"20\" height=\"20\">\u003Cb>文本分类\u003C\u002Fb>\u003C\u002Fsummary>\u003Chr>\n\n文本分类：将文本分类到用户提供的类别中。\n\n```python\nfrom obsei.analyzer.classification_analyzer import ClassificationAnalyzerConfig, ZeroShotClassificationAnalyzer\n\n# 初始化分类分析器配置\n# 如果添加了“positive”和“negative”标签，它也可以检测情感。\nanalyzer_config=ClassificationAnalyzerConfig(\n   labels=[\"服务\", \"延误\", \"性能\"],\n)\n\n# 初始化分类分析器\n# 支持的模型请参考 https:\u002F\u002Fhuggingface.co\u002Fmodels?filter=zero-shot-classification\ntext_analyzer = ZeroShotClassificationAnalyzer(\n   
model_name_or_path=\"typeform\u002Fmobilebert-uncased-mnli\",\n   device=\"auto\"\n)\n```\n\n\u003C\u002Fdetails>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>\u003Cdetails >\u003Csummary>\u003Cimg style=\"vertical-align:middle;margin:2px 10px\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fobsei_obsei_readme_e1e772095c96.png\" width=\"20\" height=\"20\">\u003Cb>情感分析器\u003C\u002Fb>\u003C\u002Fsummary>\u003Chr>\n\n情感分析器：检测文本的情感。文本分类也可以进行情感分析，但如果您不想使用重型 NLP 模型，则可以使用资源消耗较少的基于词典的 Vader 情感检测器。\n\n```python\nfrom obsei.analyzer.sentiment_analyzer import VaderSentimentAnalyzer\n\n# Vader 不需要任何配置设置\nanalyzer_config=None\n\n# 初始化 Vader 情感分析器\ntext_analyzer = VaderSentimentAnalyzer()\n```\n\n\u003C\u002Fdetails>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>\u003Cdetails >\u003Csummary>\u003Cimg style=\"vertical-align:middle;margin:2px 10px\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fobsei_obsei_readme_2b730d24c8d4.png\" width=\"20\" height=\"20\">\u003Cb>NER 分析器\u003C\u002Fb>\u003C\u002Fsummary>\u003Chr>\n\nNER（命名实体识别）分析器：从文本中提取信息，并将提到的命名实体分类到预定义的类别中，例如人名、组织、地点、医疗代码、时间表达式、数量、货币值、百分比等。\n\n```python\nfrom obsei.analyzer.ner_analyzer import NERAnalyzer\n\n# NER 分析器不需要配置设置\nanalyzer_config=None\n\n# 初始化 NER 分析器\n# 支持的模型请参考 https:\u002F\u002Fhuggingface.co\u002Fmodels?filter=token-classification\ntext_analyzer = NERAnalyzer(\n   model_name_or_path=\"elastic\u002Fdistilbert-base-cased-finetuned-conll03-english\",\n   device = \"auto\"\n)\n```\n\n\u003C\u002Fdetails>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>\u003Cdetails >\u003Csummary>\u003Cimg style=\"vertical-align:middle;margin:2px 10px\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fobsei_obsei_readme_1bd64751fea1.png\" width=\"20\" height=\"20\">\u003Cb>翻译器\u003C\u002Fb>\u003C\u002Fsummary>\u003Chr>\n\n```python\nfrom obsei.analyzer.translation_analyzer import TranslationAnalyzer\n\n# 翻译器不需要分析器配置\nanalyzer_config = None\n\n# 初始化翻译器\n# 支持的模型请参考 
https:\u002F\u002Fhuggingface.co\u002Fmodels?pipeline_tag=translation\nanalyzer = TranslationAnalyzer(\n   model_name_or_path=\"Helsinki-NLP\u002Fopus-mt-hi-en\",\n   device = \"auto\"\n)\n```\n\n\u003C\u002Fdetails>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>\u003Cdetails >\u003Csummary>\u003Cimg style=\"vertical-align:middle;margin:2px 10px\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fobsei_obsei_readme_8abbf929a193.png\" width=\"20\" height=\"20\">\u003Cb>PII 匿名化器\u003C\u002Fb>\u003C\u002Fsummary>\u003Chr>\n\n```python\nfrom obsei.analyzer.pii_analyzer import PresidioEngineConfig, PresidioModelConfig, \\\n   PresidioPIIAnalyzer, PresidioPIIAnalyzerConfig\n\n# 初始化 PII 分析器的配置\nanalyzer_config = PresidioPIIAnalyzerConfig(\n   # 是否仅返回 PII 分析结果或匿名化文本\n   analyze_only=False,\n   # 是否返回匿名化决策的详细信息\n   return_decision_process=True\n)\n\n# 初始化 PII 分析器\nanalyzer = PresidioPIIAnalyzer(\n   engine_config=PresidioEngineConfig(\n       # 支持 spacy 和 stanza NLP 引擎\n       # 更多信息请参考\n       # https:\u002F\u002Fmicrosoft.github.io\u002Fpresidio\u002Fanalyzer\u002Fdeveloping_recognizers\u002F#utilize-spacy-or-stanza\n       nlp_engine_name=\"spacy\",\n       # 更新所需的 spacy 模型和语言\n       models=[PresidioModelConfig(model_name=\"en_core_web_lg\", lang_code=\"en\")]\n   )\n)\n```\n\n\u003C\u002Fdetails>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>\u003Cdetails >\u003Csummary>\u003Cimg style=\"vertical-align:middle;margin:2px 10px\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fobsei_obsei_readme_d517a5bd25a0.png\" width=\"20\" height=\"20\">\u003Cb>虚拟分析器\u003C\u002Fb>\u003C\u002Fsummary>\u003Chr>\n\n虚拟分析器：什么都不做。它只是用于将输入（TextPayload）转换为输出（TextPayload），并添加用户提供的虚拟数据。\n\n```python\nfrom obsei.analyzer.dummy_analyzer import DummyAnalyzer, DummyAnalyzerConfig\n\n# 初始化虚拟分析器的配置设置\nanalyzer_config = DummyAnalyzerConfig()\n\n# 初始化虚拟分析器\nanalyzer = 
DummyAnalyzer()\n```\n\n\u003C\u002Fdetails>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003C\u002Ftbody>\n\u003C\u002Ftable>\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\u003Csummary>\u003Cb>步骤 3：配置接收器\u002F通知器\u003C\u002Fb>\u003C\u002Fsummary>\n\n\u003Ctable >\u003Ctbody >\u003Ctr>\u003C\u002Ftr>\u003Ctr>\n\u003Ctd>\u003Cdetails >\u003Csummary>\u003Cimg style=\"vertical-align:middle;margin:2px 10px\" src=\"https:\u002F\u002Fraw.githubusercontent.com\u002Fobsei\u002Fobsei-resources\u002Fmaster\u002Flogos\u002Fslack.svg\" width=\"25\" height=\"25\">\u003Cb>Slack\u003C\u002Fb>\u003C\u002Fsummary>\u003Chr>\n\n```python\nfrom obsei.sink.slack_sink import SlackSink, SlackSinkConfig\n\n# 初始化 Slack 接收器配置\nsink_config = SlackSinkConfig(\n   # 提供 Slack 机器人\u002F应用令牌\n   # 更多详情请参考 https:\u002F\u002Fslack.com\u002Fintl\u002Fen-de\u002Fhelp\u002Farticles\u002F215770388-Create-and-regenerate-API-tokens\n   slack_token=\"\u003CSlack_app_token>\",\n   # 获取频道 ID 请参考 https:\u002F\u002Fstackoverflow.com\u002Fquestions\u002F40940327\u002Fwhat-is-the-simplest-way-to-find-a-slack-team-id-and-a-channel-id\n   channel_id=\"C01LRS6CT9Q\"\n)\n\n# 初始化 Slack 接收器\nsink = SlackSink()\n```\n\n\u003C\u002Fdetails>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>\u003Cdetails >\u003Csummary>\u003Cimg style=\"vertical-align:middle;margin:2px 10px\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fobsei_obsei_readme_e28a4f7f192a.png\" width=\"20\" height=\"20\">\u003Cb>Zendesk\u003C\u002Fb>\u003C\u002Fsummary>\u003Chr>\n\n```python\nfrom obsei.sink.zendesk_sink import ZendeskSink, ZendeskSinkConfig, ZendeskCredInfo\n\n# 初始化 Zendesk 接收器配置\nsink_config = ZendeskSinkConfig(\n   # 提供 Zendesk 域名\n   domain=\"zendesk.com\",\n   # 如果您有子域名，请提供\n   subdomain=None,\n   # 输入 Zendesk 用户详细信息\n   cred_info=ZendeskCredInfo(\n       email=\"\u003Czendesk_user_email>\",\n       password=\"\u003Czendesk_password>\"\n   )\n)\n\n# 初始化 Zendesk 接收器\nsink = 
ZendeskSink()\n```\n\n\u003C\u002Fdetails>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>\u003Cdetails >\u003Csummary>\u003Cimg style=\"vertical-align:middle;margin:2px 10px\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fobsei_obsei_readme_ace43fd305a1.png\" width=\"20\" height=\"20\">\u003Cb>Jira\u003C\u002Fb>\u003C\u002Fsummary>\u003Chr>\n\n```python\nfrom obsei.sink.jira_sink import JiraSink, JiraSinkConfig\n\n# 为了测试，您可以本地启动 Jira 服务器\n# 参考 https:\u002F\u002Fdeveloper.atlassian.com\u002Fserver\u002Fframework\u002Fatlassian-sdk\u002Fatlas-run-standalone\u002F\n\n# 初始化 Jira 接收器配置\nsink_config = JiraSinkConfig(\n   url=\"http:\u002F\u002Flocalhost:2990\u002Fjira\", # Jira 服务器 URL\n    # 具有创建问题权限的用户的用户名和密码\n   username=\"\u003Cusername>\",\n   password=\"\u003Cpassword>\",\n   # 要创建的问题类型\n   # 更多信息请参考 https:\u002F\u002Fsupport.atlassian.com\u002Fjira-cloud-administration\u002Fdocs\u002Fwhat-are-issue-types\u002F\n   issue_type={\"name\": \"Task\"},\n   # 在哪个项目下创建问题\n   # 更多信息请参考 https:\u002F\u002Fsupport.atlassian.com\u002Fjira-software-cloud\u002Fdocs\u002Fwhat-is-a-jira-software-project\u002F\n   project={\"key\": \"CUS\"},\n)\n\n# 初始化 Jira 接收器\nsink = JiraSink()\n```\n\n\u003C\u002Fdetails>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>\u003Cdetails >\u003Csummary>\u003Cimg style=\"vertical-align:middle;margin:2px 10px\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fobsei_obsei_readme_dae29809149b.png\" width=\"20\" height=\"20\">\u003Cb>ElasticSearch\u003C\u002Fb>\u003C\u002Fsummary>\u003Chr>\n\n```python\nfrom obsei.sink.elasticsearch_sink import ElasticSearchSink, ElasticSearchSinkConfig\n\n# 为了测试，您可以使用 Docker 本地启动 Elasticsearch 服务器\n# `docker run -d --name elasticsearch -p 9200:9200 -e \"discovery.type=single-node\" elasticsearch:8.5.0`\n\n# 初始化 Elasticsearch 接收器配置\nsink_config = ElasticSearchSinkConfig(\n   # Elasticsearch 服务器\n   hosts=\"http:\u002F\u002Flocalhost:9200\",\n   # 索引名称，如果不存在则会自动创建\n   
index_name=\"test\",\n)\n\n# 初始化 Elasticsearch 接收器\nsink = ElasticSearchSink()\n```\n\n\u003C\u002Fdetails>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>\u003Cdetails >\u003Csummary>\u003Cimg style=\"vertical-align:middle;margin:2px 10px\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fobsei_obsei_readme_576de5fa98da.png\" width=\"20\" height=\"20\">\u003Cb>Http\u003C\u002Fb>\u003C\u002Fsummary>\u003Chr>\n\n```python\nfrom obsei.sink.http_sink import HttpSink, HttpSinkConfig\n\n# 为了测试，您可以使用 Postman 创建一个模拟 HTTP 服务器\n# 更多详情请参考 https:\u002F\u002Flearning.postman.com\u002Fdocs\u002Fdesigning-and-developing-your-api\u002Fmocking-data\u002Fsetting-up-mock\u002F\n\n# 初始化 HTTP 接收器配置（目前仅支持 POST 请求）\nsink_config = HttpSinkConfig(\n   # 提供 HTTP 服务器 URL\n   url=\"https:\u002F\u002Flocalhost:8080\u002Fapi\u002Fpath\",\n   # 您可以在这里添加希望随请求一起发送的头部信息\n   headers={\n       \"Content-type\": \"application\u002Fjson\"\n   }\n)\n\n# 如果需要修改或转换负载数据，可以创建转换器类\n# 请参考 obsei.sink.dailyget_sink.PayloadConvertor 示例\n\n# 初始化 HTTP 接收器\nsink = HttpSink()\n```\n\n\u003C\u002Fdetails>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>\u003Cdetails >\u003Csummary>\u003Cimg style=\"vertical-align:middle;margin:2px 10px\" src=\"https:\u002F\u002Fraw.githubusercontent.com\u002Fobsei\u002Fobsei-resources\u002Fmaster\u002Flogos\u002Fpandas.svg\" width=\"20\" height=\"20\">\u003Cb>Pandas DataFrame\u003C\u002Fb>\u003C\u002Fsummary>\u003Chr>\n\n```python\nfrom pandas import DataFrame\nfrom obsei.sink.pandas_sink import PandasSink, PandasSinkConfig\n\n# 初始化 Pandas 接收器配置\nsink_config = PandasSinkConfig(\n   dataframe=DataFrame()\n)\n\n# 初始化 Pandas 接收器\nsink = PandasSink()\n```\n\n\u003C\u002Fdetails>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>\u003Cdetails >\u003Csummary>\u003Cimg style=\"vertical-align:middle;margin:2px 10px\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fobsei_obsei_readme_f324ea1e48ad.png\" width=\"20\" 
height=\"20\">\u003Cb>Logger\u003C\u002Fb>\u003C\u002Fsummary>\u003Chr>\n\n这在测试和流水线的干运行中非常有用。\n\n```python\nfrom obsei.sink.logger_sink import LoggerSink, LoggerSinkConfig\nimport logging\nimport sys\n\nlogger = logging.getLogger(\"Obsei\")\nlogging.basicConfig(stream=sys.stdout, level=logging.INFO)\n\n# 初始化 logger 接收器配置\nsink_config = LoggerSinkConfig(\n   logger=logger,\n   level=logging.INFO\n)\n\n# 初始化 logger 接收器\nsink = LoggerSink()\n```\n\n\u003C\u002Fdetails>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003C\u002Ftbody>\n\u003C\u002Ftable>\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\u003Csummary>\u003Cb>步骤 4：连接并创建工作流\u003C\u002Fb>\u003C\u002Fsummary>\n\n`source` 将从选定的来源获取数据，然后将其输入到 `analyzer` 进行处理，我们再将 `analyzer` 的输出输入到 `sink`，以便在该接收器处收到通知。\n\n```python\n# 如果您想记录日志，请取消注释\n# import logging\n# import sys\n# logger = logging.getLogger(__name__)\n# logging.basicConfig(stream=sys.stdout, level=logging.INFO)\n\n# 这将从配置的来源获取信息，例如 Twitter、应用商店等\nsource_response_list = source.lookup(source_config)\n\n# 如果您想记录源响应，请取消注释\n# for idx, source_response in enumerate(source_response_list):\n#     logger.info(f\"source_response#{idx}='{source_response.__dict__}'\")\n\n# 这将使用提供的 analyzer_config 对源数据执行分析器（情感分析、分类等）\nanalyzer_response_list = text_analyzer.analyze_input(\n    source_response_list=source_response_list,\n    analyzer_config=analyzer_config\n)\n\n# 如果您想记录分析器响应，请取消注释\n# for idx, an_response in enumerate(analyzer_response_list):\n#    logger.info(f\"analyzer_response#{idx}='{an_response.__dict__}'\")\n\n# 分析器输出被添加到 segmented_data 中\n# 如果您想记录它，请取消注释\n# for idx, an_response in enumerate(analyzer_response_list):\n#    logger.info(f\"analyzed_data#{idx}='{an_response.segmented_data.__dict__}'\")\n\n# 这将把分析后的输出发送到配置的接收器，例如 Slack、Zendesk 等\nsink_response_list = sink.send_data(analyzer_response_list, sink_config)\n\n# 如果您想记录接收器响应，请取消注释\n# for sink_response in sink_response_list:\n#     if sink_response is not None:\n#         
logger.info(f\"sink_response='{sink_response}'\")\n```\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\u003Csummary>\u003Cb>步骤 5：执行工作流\u003C\u002Fb>\u003C\u002Fsummary>\n将 \u003Cb>步骤 1 至 4\u003C\u002Fb> 中的代码片段复制到一个 Python 文件中，例如 \u003Ccode>example.py\u003C\u002Fcode>，然后执行以下命令：\n\n```shell\npython example.py\n```\n\n\u003C\u002Fdetails>\n\n## 演示\n\n我们提供了一个基于 [Streamlit](https:\u002F\u002Fstreamlit.io\u002F) 的最小化 UI，您可以用来测试 Obsei。\n\n![截图](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fobsei_obsei_readme_4242c7662fa8.png)\n\n### 观看UI演示视频\n\n[![入门与演示视频](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fobsei_obsei_readme_f2da74e06f9a.jpg)](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=GTF-Hy96gvY)\n\n在[![](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F%F0%9F%A4%97%20Hugging%20Face-Spaces-blue)](https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fobsei\u002Fobsei-demo)查看演示\n\n（**注意**：有时由于速率限制，Streamlit演示可能无法正常运行，请在这种情况下使用本地Docker镜像。）\n\n要在本地测试，只需运行：\n\n```shell\ndocker run -d --name obsei-ui -p 8501:8501 obsei\u002Fobsei-ui-demo\n\n# 您可以在 http:\u002F\u002Flocalhost:8501 找到UI\n```\n\n**要通过GitHub Actions轻松运行Obsei工作流（无需注册和云托管），请参考此[仓库](https:\u002F\u002Fgithub.com\u002Fobsei\u002Fdemo-workflow-action)**。\n\n## 使用Obsei的公司\u002F项目\n\n以下是一些使用Obsei的公司\u002F项目（按字母顺序排列）。如需将您的公司\u002F项目添加到列表，请提交PR或通过[电子邮件](mailto:contact@obsei.com)联系我们。\n\n- [Oraika](https:\u002F\u002Fwww.oraika.com)：上下文理解客户反馈\n- [1Page](https:\u002F\u002Fwww.get1page.com\u002F)：为会议和通话提供更好的背景信息\n- [Spacepulse](http:\u002F\u002Fspacepulse.in\u002F)：空间的操作系统\n- [Superblog](https:\u002F\u002Fsuperblog.ai\u002F)：WordPress和Medium的极速替代方案\n- [Zolve](https:\u002F\u002Fzolve.com\u002F)：打造超越国界的金融世界\n- [Utilize](https:\u002F\u002Fwww.utilize.app\u002F)：面向无固定办公场所企业的无代码应用构建器\n\n## 文章\n\n\u003Ctable>\n\u003Cthead>\n\u003Ctr class=\"header\">\n\u003Cth>序号\u003C\u002Fth>\n\u003Cth>标题\u003C\u002Fth>\n\u003Cth>作者\u003C\u002Fth>\n\u003C\u002Ftr>\n\u003C\u002Fthead>\n\u003Ctbody>\n\u003Ctr>\n\u003Ctd>1\u003C\u002Ftd>\n\u003Ctd>\n    
\u003Ca href=\"https:\u002F\u002Freenabapna.medium.com\u002Fai-based-comparative-customer-feedback-analysis-using-deep-learning-models-def0dc77aaee\">基于AI的比较客户反馈分析——使用Obsei\u003C\u002Fa>\n\u003C\u002Ftd>\n\u003Ctd>\n    \u003Ca href=\"https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Freena-bapna-66a8691a\">Reena Bapna\u003C\u002Fa>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>2\u003C\u002Ftd>\n\u003Ctd>\n    \u003Ca href=\"https:\u002F\u002Fmedium.com\u002Fmlearning-ai\u002Flinkedin-app-user-feedback-analysis-9c9f98464daa\">LinkedIn应用——用户反馈分析\u003C\u002Fa>\n\u003C\u002Ftd>\n\u003Ctd>\n    \u003Ca href=\"http:\u002F\u002Fwww.linkedin.com\u002Fin\u002Fhimanshusharmads\">Himanshu Sharma\u003C\u002Fa>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003C\u002Ftbody>\n\u003C\u002Ftable>\n\n## 教程\n\n\u003Ctable>\n\u003Cthead>\n\u003Ctr class=\"header\">\n\u003Cth>序号\u003C\u002Fth>\n\u003Cth>工作流\u003C\u002Fth>\n\u003Cth>Colab\u003C\u002Fth>\n\u003Cth>Binder\u003C\u002Fth>\n\u003C\u002Ftr>\n\u003C\u002Fthead>\n\u003Ctbody>\n\u003Ctr>\n\u003Ctd rowspan=\"2\">1\u003C\u002Ftd>\n\u003Ctd colspan=\"3\">观察Google Play商店的应用评论，通过文本分类进行分析，然后通过日志记录器在控制台中显示\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>PlayStore评论 → 分类 → 日志记录器\u003C\u002Ftd>\n\u003Ctd>\n    \u003Ca href=\"https:\u002F\u002Fcolab.research.google.com\u002Fgithub\u002Fobsei\u002Fobsei\u002Fblob\u002Fmaster\u002Ftutorials\u002F01_PlayStore_Classification_Logger.ipynb\">\n        \u003Cimg alt=\"Colab\" src=\"https:\u002F\u002Fcolab.research.google.com\u002Fassets\u002Fcolab-badge.svg\">\n    \u003C\u002Fa>\n\u003C\u002Ftd>\n\u003Ctd>\n    \u003Ca href=\"https:\u002F\u002Fmybinder.org\u002Fv2\u002Fgh\u002Fobsei\u002Fobsei\u002FHEAD?filepath=tutorials%2F01_PlayStore_Classification_Logger.ipynb\">\n        \u003Cimg alt=\"Colab\" src=\"https:\u002F\u002Fmybinder.org\u002Fbadge_logo.svg\">\n    \u003C\u002Fa>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd rowspan=\"2\">2\u003C\u002Ftd>\n\u003Ctd colspan=\"3\">观察Google 
Play商店的应用评论，通过多种文本清理函数预处理文本，通过文本分类进行分析，将结果存储到Pandas DataFrame中，并保存为CSV文件到Google Drive\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>PlayStore评论 → 预处理 → 分类 → Pandas DataFrame → Google Drive中的CSV\u003C\u002Ftd>\n\u003Ctd>\n    \u003Ca href=\"https:\u002F\u002Fcolab.research.google.com\u002Fgithub\u002Fobsei\u002Fobsei\u002Fblob\u002Fmaster\u002Ftutorials\u002F02_PlayStore_PreProc_Classification_Pandas.ipynb\">\n        \u003Cimg alt=\"Colab\" src=\"https:\u002F\u002Fcolab.research.google.com\u002Fassets\u002Fcolab-badge.svg\">\n    \u003C\u002Fa>\n\u003C\u002Ftd>\n\u003Ctd>\n    \u003Ca href=\"https:\u002F\u002Fmybinder.org\u002Fv2\u002Fgh\u002Fobsei\u002Fobsei\u002FHEAD?filepath=tutorials%2F02_PlayStore_PreProc_Classification_Pandas.ipynb\">\n        \u003Cimg alt=\"Colab\" src=\"https:\u002F\u002Fmybinder.org\u002Fbadge_logo.svg\">\n    \u003C\u002Fa>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd rowspan=\"2\">3\u003C\u002Ftd>\n\u003Ctd colspan=\"3\">观察Apple App Store的应用评论，通过多种文本清理函数预处理文本，通过文本分类进行分析，将结果存储到Pandas DataFrame中，并保存为CSV文件到Google Drive\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>AppStore评论 → 预处理 → 分类 → Pandas DataFrame → Google Drive中的CSV\u003C\u002Ftd>\n\u003Ctd>\n    \u003Ca href=\"https:\u002F\u002Fcolab.research.google.com\u002Fgithub\u002Fobsei\u002Fobsei\u002Fblob\u002Fmaster\u002Ftutorials\u002F03_AppStore_PreProc_Classification_Pandas.ipynb\">\n        \u003Cimg alt=\"Colab\" src=\"https:\u002F\u002Fcolab.research.google.com\u002Fassets\u002Fcolab-badge.svg\">\n    \u003C\u002Fa>\n\u003C\u002Ftd>\n\u003Ctd>\n    \u003Ca href=\"https:\u002F\u002Fmybinder.org\u002Fv2\u002Fgh\u002Fobsei\u002Fobsei\u002FHEAD?filepath=tutorials%2F03_AppStore_PreProc_Classification_Pandas.ipynb\">\n        \u003Cimg alt=\"Colab\" src=\"https:\u002F\u002Fmybinder.org\u002Fbadge_logo.svg\">\n    \u003C\u002Fa>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd rowspan=\"2\">4\u003C\u002Ftd>\n\u003Ctd 
colspan=\"3\">观察Google新闻中的新闻文章，通过多种文本清理函数预处理文本，通过文本分类进行分析，同时将文本分割成小块，再用给定公式计算最终推理结果\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>Google新闻 → 文本清理器 → 文本分割器 → 分类 → 推理聚合器\u003C\u002Ftd>\n\u003Ctd>\n    \u003Ca href=\"https:\u002F\u002Fcolab.research.google.com\u002Fgithub\u002Fobsei\u002Fobsei\u002Fblob\u002Fmaster\u002Ftutorials\u002F04_GoogleNews_Cleaner_Splitter_Classification_Aggregator.ipynb\">\n        \u003Cimg alt=\"Colab\" src=\"https:\u002F\u002Fcolab.research.google.com\u002Fassets\u002Fcolab-badge.svg\">\n    \u003C\u002Fa>\n\u003C\u002Ftd>\n\u003Ctd>\n    \u003Ca href=\"https:\u002F\u002Fmybinder.org\u002Fv2\u002Fgh\u002Fobsei\u002Fobsei\u002FHEAD?filepath=tutorials%2F04_GoogleNews_Cleaner_Splitter_Classification_Aggregator.ipynb\">\n        \u003Cimg alt=\"Colab\" src=\"https:\u002F\u002Fmybinder.org\u002Fbadge_logo.svg\">\n    \u003C\u002Fa>\n\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003C\u002Ftbody>\n\u003C\u002Ftable>\n\n\u003Cdetails>\u003Csummary>\u003Cb>💡提示：通过Obsei处理大型文本分类\u003C\u002Fb>\u003C\u002Fsummary>\n\n![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fobsei_obsei_readme_090e883d2ba7.gif)\n\n\u003C\u002Fdetails>\n\n## 文档\n\n有关详细的安装说明、使用方法和示例，请参阅我们的[文档](https:\u002F\u002Fobsei.github.io\u002Fobsei\u002F)。\n\n## 支持与发布矩阵\n\n\u003Ctable>\n\u003Cthead>\n\u003Ctr class=\"header\">\n\u003Cth>\u003C\u002Fth>\n\u003Cth>Linux\u003C\u002Fth>\n\u003Cth>Mac\u003C\u002Fth>\n\u003Cth>Windows\u003C\u002Fth>\n\u003Cth>备注\u003C\u002Fth>\n\u003C\u002Ftr>\n\u003C\u002Fthead>\n\u003Ctbody>\n\u003Ctr>\n\u003Ctd>测试\u003C\u002Ftd>\n\u003Ctd style=\"text-align:center\">✅\u003C\u002Ftd>\n\u003Ctd style=\"text-align:center\">✅\u003C\u002Ftd>\n\u003Ctd style=\"text-align:center\">✅\u003C\u002Ftd>\n\u003Ctd>覆盖率较低，因为难以测试第三方库\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>PIP\u003C\u002Ftd>\n\u003Ctd style=\"text-align:center\">✅\u003C\u002Ftd>\n\u003Ctd style=\"text-align:center\">✅\u003C\u002Ftd>\n\u003Ctd 
style=\"text-align:center\">✅\u003C\u002Ftd>\n\u003Ctd>完全支持\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003Ctr>\n\u003Ctd>Conda\u003C\u002Ftd>\n\u003Ctd style=\"text-align:center\">❌\u003C\u002Ftd>\n\u003Ctd style=\"text-align:center\">❌\u003C\u002Ftd>\n\u003Ctd style=\"text-align:center\">❌\u003C\u002Ftd>\n\u003Ctd>不支持\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003C\u002Ftbody>\n\u003C\u002Ftable>\n\n## 讨论论坛\n\n关于_Obsei_的讨论可在[社区论坛](https:\u002F\u002Fgithub.com\u002Fobsei\u002Fobsei\u002Fdiscussions)上进行。\n\n## 更改日志\n\n请参阅[发布](https:\u002F\u002Fgithub.com\u002Fobsei\u002Fobsei\u002Freleases)以获取更改日志\n\n## 安全问题\n\n如有任何安全问题，请通过[电子邮件](mailto:contact@oraika.com)联系我们\n\n## 长期以来的观星者\n\n[![长期以来的观星者](https:\u002F\u002Fstarchart.cc\u002Fobsei\u002Fobsei.svg)](https:\u002F\u002Fstarchart.cc\u002Fobsei\u002Fobsei)\n\n## 维护者\n\n本项目由[Oraika Technologies](https:\u002F\u002Fwww.oraika.com)维护。[Lalit Pagaria](https:\u002F\u002Fgithub.com\u002Flalitpagaria)和[Girish Patel](https:\u002F\u002Fgithub.com\u002FGirishPatel)是本项目的维护者。\n\n## 许可协议\n\n- 版权持有者：[Oraika Technologies](https:\u002F\u002Fwww.oraika.com)\n- 整体采用Apache 2.0许可，您可阅读[许可证](https:\u002F\u002Fgithub.com\u002Fobsei\u002Fobsei\u002Fblob\u002Fmaster\u002FLICENSE)文件。\n- 对于第三方组件，我们采用了多种其他次要的宽松许可或弱 copyleft 许可（如LGPL、MIT、BSD等），详情请参阅[署名文件](https:\u002F\u002Fgithub.com\u002Fobsei\u002Fobsei\u002Fblob\u002Fmaster\u002FATTRIBUTION.md)。\n- 为使项目更符合商业需求，我们已将那些具有强 copyleft 许可（如GPL、AGPL等）的第三方组件排除在项目之外。\n\n## 署名\n\n没有这些[开源软件](https:\u002F\u002Fgithub.com\u002Fobsei\u002Fobsei\u002Fblob\u002Fmaster\u002FATTRIBUTION.md)，这一切都不可能实现。\n\n## 贡献\n\n首先，感谢您考虑为本软件包做出贡献，无论大小，每一份贡献都备受珍视。\n请参阅我们的[贡献指南](https:\u002F\u002Fgithub.com\u002Fobsei\u002Fobsei\u002Fblob\u002Fmaster\u002FCONTRIBUTING.md)和[行为规范](https:\u002F\u002Fgithub.com\u002Fobsei\u002Fobsei\u002Fblob\u002Fmaster\u002FCODE_OF_CONDUCT.md)。\n\n非常感谢所有贡献者！\n\n\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fobsei\u002Fobsei\u002Fgraphs\u002Fcontributors\">\n  \u003Cimg 
src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fobsei_obsei_readme_649d705c3b13.png\" \u002F>\n\u003C\u002Fa>","# Obsei 快速上手指南\n\nObsei 是一个开源、低代码的 AI 自动化工具，旨在帮助用户从各种来源收集非结构化数据（如社交媒体评论、应用商店评价等），利用 AI 进行分析（如情感分析、分类），并将结果推送到指定目的地（如工单系统、数据库）。\n\n## 环境准备\n\n在开始之前，请确保您的开发环境满足以下要求：\n\n*   **操作系统**：Linux, macOS 或 Windows\n*   **Python 版本**：Python 3.7 或更高版本\n*   **包管理工具**：pip (通常随 Python 安装)\n\n> **注意**：Obsei 目前处于 Alpha 阶段，生产环境使用需谨慎。建议始终使用已发布的稳定版本，避免直接使用 master 分支以防遇到破坏性变更。\n\n## 安装步骤\n\n您可以使用 pip 进行安装。为了加速下载，中国开发者推荐使用国内镜像源（如阿里云或清华大学源）。\n\n### 1. 安装完整功能版\n如果您希望一次性安装所有依赖项（包含所有数据源、分析器和接收器）：\n\n```shell\npip install obsei[all] -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple\n```\n\n### 2. 按需最小化安装\n`all` 选项会安装大量可能用不到的依赖（包括 PyTorch）。您可以根据实际需求组合安装特定模块：\n\n*   **仅安装数据源 (Observers)**: `pip install obsei[source]`\n*   **仅安装分析器 (Analyzers)**: `pip install obsei[analyzer]` (会自动安装 PyTorch)\n*   **仅安装接收器 (Informers)**: `pip install obsei[sink]`\n*   **组合示例** (安装 Twitter 数据源 + 所有分析器 + Slack 接收器):\n\n```shell\npip install obsei[twitter-api,analyzer,slack-api] -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple\n```\n\n常用可选依赖标识：\n*   `twitter-api`: Twitter 数据抓取\n*   `reddit-api` \u002F `reddit-scraper`: Reddit 数据抓取\n*   `google-play-api` \u002F `google-play-scraper`: Google Play 评论抓取\n*   `facebook-api`: Facebook 页面评论抓取\n*   `atlassian-api`: Jira 工单集成\n*   `slack-api`: Slack 消息推送\n\n## 基本使用\n\nObsei 的工作流由三个核心组件组成：**Observer** (数据源)、**Analyzer** (分析器) 和 **Informer** (数据目的地)。\n\n以下是一个最简化的概念示例，展示如何配置一个工作流（以从 Twitter 抓取数据为例）：\n\n### 第一步：配置数据源 (Observer)\n\n初始化配置对象并创建源实例。您需要替换为您的真实 API 凭证。\n\n```python\nfrom obsei.source.twitter_source import TwitterCredentials, TwitterSource, TwitterSourceConfig\n\n# 初始化 Twitter 源配置\nsource_config = TwitterSourceConfig(\n   keywords=[\"issue\"], # 搜索关键词、@用户 或 #标签\n   lookup_period=\"1h\", # 查找时间段，格式：\u003C数字>\u003Cd|h|m> (天|小时|分钟)\n   cred_info=TwitterCredentials(\n       # 请填写您的 Twitter Developer 凭证\n       
consumer_key=\"\u003Ctwitter_consumer_key>\",\n       consumer_secret=\"\u003Ctwitter_consumer_secret>\",\n       bearer_token=\"\u003Ctwitter_bearer_token>\",\n   )\n)\n\n# 初始化数据获取器\nsource = TwitterSource()\n```\n\n### 第二步：配置分析器 (Analyzer)\n\nObsei 支持多种 AI 任务，如情感分析、文本分类等。这里以资源消耗较少、基于词典的 Vader 情感分析器为例：\n\n```python\nfrom obsei.analyzer.sentiment_analyzer import VaderSentimentAnalyzer\n\n# Vader 不需要任何配置设置\nanalyzer_config = None\n\n# 初始化 Vader 情感分析器\nanalyzer = VaderSentimentAnalyzer()\n```\n\n### 第三步：配置目的地 (Informer) 并执行工作流\n\n将分析后的数据发送到目的地（例如打印到控制台或存入 DataFrame），然后执行整个流程：\n\n```python\nfrom pandas import DataFrame\nfrom obsei.sink.pandas_sink import PandasSink, PandasSinkConfig\n\n# 初始化接收器配置 (此处以输出到 Pandas DataFrame 为例)\nsink_config = PandasSinkConfig(\n    dataframe=DataFrame()\n)\nsink = PandasSink()\n\n# 执行工作流\n# 1. 从源获取数据\nsource_response_list = source.lookup(source_config)\n\n# 2. 进行分析\nanalyzer_response_list = analyzer.analyze_input(\n    source_response_list=source_response_list,\n    analyzer_config=analyzer_config\n)\n\n# 3. 
发送结果到目的地\nsink.send(analyzer_response_list, sink_config)\n```\n\n通过以上三个步骤，您即可构建一个完整的自动化数据处理闭环。您可以参考官方文档替换不同的 Source（如 YouTube, Email, Google Maps）和 Sink（如 Jira, Slack, Elasticsearch）以适应具体业务场景。","某电商品牌的客户体验团队每天需要监控 Twitter、Reddit 及应用商店中数千条关于新品的用户评论，以快速识别潜在的产品缺陷和负面舆情。\n\n### 没有 obsei 时\n- 团队成员需手动登录多个平台逐条复制评论，耗时费力且容易遗漏关键信息。\n- 面对海量非结构化文本，人工判断情感倾向（正面\u002F负面）标准不一，导致误判率高。\n- 发现严重投诉后，依赖人工转发至工单系统，平均响应延迟超过 4 小时，错失最佳挽回时机。\n- 缺乏统一的数据存储格式，难以进行跨渠道的对比分析和长期趋势追踪。\n- 夜间或节假日出现突发舆情时，因无人值守导致品牌声誉受损风险激增。\n\n### 使用 obsei 后\n- 配置 Obsei 的 Observer 模块自动抓取多平台评论，实现 7x24 小时无人值守的数据收集。\n- 利用 Analyzer 模块的 AI 模型统一进行情感分析与分类，准确识别愤怒情绪并提取敏感实体（PII）。\n- 通过 Informer 模块将确认为“严重投诉”的数据自动推送到 Jira 等工单系统，触发即时警报，响应时间缩短至分钟级。\n- 所有分析结果自动存入数据库并形成结构化 DataFrame，便于团队直接生成可视化报表进行深度洞察。\n- 建立自动化闭环流程，即使在非工作时间也能确保负面反馈被即时捕获并分派给相应负责人。\n\nObsei 将原本需要多人协作数小时的被动监控工作，转化为实时、精准且全自动的智能预警体系，显著提升了品牌危机应对能力。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fobsei_obsei_d7fae3d6.png","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Fobsei_0f3eb95b.png","Home of open source projects undertaken by Oraika Technologies Private Limited",null,"contact@oraika.com","OraikaTech","https:\u002F\u002Foraika.com","https:\u002F\u002Fgithub.com\u002Fobsei",[83,87,91,95],{"name":84,"color":85,"percentage":86},"Python","#3572A5",60.4,{"name":88,"color":89,"percentage":90},"Jupyter Notebook","#DA5B0B",39.2,{"name":92,"color":93,"percentage":94},"Dockerfile","#384d54",0.3,{"name":96,"color":97,"percentage":98},"HTML","#e34c26",0.1,1387,175,"2026-04-04T18:23:06","Apache-2.0","未说明 (基于 Python 和 PyTorch，通常支持 Linux, macOS, Windows)","非必需。仅在安装 [analyzer] 组件时依赖 PyTorch，可根据任务选择 CPU 或 GPU 运行，README 未指定具体显卡型号或 CUDA 版本要求。","未说明",{"notes":107,"python":108,"dependencies":109},"该工具处于 Alpha 阶段，生产环境请谨慎使用并建议使用发布版本而非 master 分支。支持模块化安装，可通过 pip install obsei[组件名] 按需安装依赖（如 twitter-api, analyzer, slack-api 等），避免安装不必要的库。Analyzer 组件会安装 PyTorch，其他组件通常不需要重型深度学习框架。","3.7+",[110,111],"torch (可选，安装 analyzer 时必需)","PyPI: 
obsei",[26,15],[114,115,116,117,118,119,120,121,122,123,124,125,126,127,128,129,130,131,132,133],"artificial-intelligence","natural-language-processing","sentiment-analysis","workflow","social-network-analysis","customer-engagement","text-analysis","text-analytics","python","nlp","issue-tracking-system","customer-support","lowcode","text-classification","anonymization","low-code","business-process-automation","workflow-automation","process-automation","social-listening","2026-03-27T02:49:30.150509","2026-04-06T06:45:30.667688",[137,142,147,152,157,162,167],{"id":138,"question_zh":139,"answer_zh":140,"source_url":141},9315,"在 Google Colab 中安装 Obsei 时遇到 \"No module named 'dateparser'\" 错误怎么办？","该问题已在后续版本修复（参考 PR #257）。如果遇到此错误，请尝试降低查找范围或确保安装了最新版本的 Obsei。如果问题依旧，请检查设置中是否启用了相关访问权限（如 Twitter V2 API），启用后可能需要等待几分钟生效。","https:\u002F\u002Fgithub.com\u002Fobsei\u002Fobsei\u002Fissues\u002F229",{"id":143,"question_zh":144,"answer_zh":145,"source_url":146},9316,"Obsei 如何处理超过 BERT 模型限制（512 tokens）的长文本？","Obsei 引入了 TextSplitter（文本分割器）作为预处理节点，灵感来源于 Haystack 分割器。它会将长文本分割成多个块（chunks），并为每个块添加 chunk_id、passage_id 等元数据。虽然目前主要实现的是分割功能，但未来计划增加 `InferenceAggregator` 节点，通过投票或平均等方式聚合多个文本块的推理结果以得出最终结论。","https:\u002F\u002Fgithub.com\u002Fobsei\u002Fobsei\u002Fissues\u002F153",{"id":148,"question_zh":149,"answer_zh":150,"source_url":151},9317,"如何配置 Twitter 源以避免 \"At least one non empty parameter required\" 属性错误？","在配置 `TwitterSourceConfig` 时，必须至少提供以下参数中的一个且不能为空：`query`（查询）、`keywords`（关键词）、`hashtags`（标签）或 `usernames`（用户名）。之前的代码逻辑存在 Bug，已修复为正确检查所有参数是否为空。确保你的配置对象中明确包含了至少一个有效参数，例如：`usernames=[\"@Zappos\"]`。","https:\u002F\u002Fgithub.com\u002Fobsei\u002Fobsei\u002Fissues\u002F243",{"id":153,"question_zh":154,"answer_zh":155,"source_url":156},9318,"Obsei 的分析器（Analyzer）是否支持批量处理以提高性能？","是的，分析器支持批量调用。可以将输入数组划分为多个批次（batches），并将批次数组直接传递给 pipeline 方法，而不是逐个迭代调用。为了灵活性，`batch_size` 
输入参数是可配置的。这有助于降低库的延迟并提高处理效率。","https:\u002F\u002Fgithub.com\u002Fobsei\u002Fobsei\u002Fissues\u002F88",{"id":158,"question_zh":159,"answer_zh":160,"source_url":161},9319,"如何使用 Obsei 进行意图分类（如识别购买、销售意图）？","Obsei 集成了专门的意图分类分析器，用于检测客户目标（如 buy, sell, purchase）。虽然也可以使用零样本分类器（zero-shot classifier），但建议使用独立的意图分类分析器以便加载专用模型。用户可以在 Hugging Face 上找到预训练模型（例如 `shahrukhx01\u002Fbuy-sell-intent-classifier-bert-mini`）并将其集成到 Obsei 工作流中。","https:\u002F\u002Fgithub.com\u002Fobsei\u002Fobsei\u002Fissues\u002F77",{"id":163,"question_zh":164,"answer_zh":165,"source_url":166},9320,"Obsei 演示版本（Demo\u002FHF Space）出现运行时错误如何解决？","如果在使用 Obsei 的 Hugging Face Space 演示应用时遇到运行时错误或界面异常，这通常是后端服务暂时性问题。维护者通常会尽快修复部署问题。如果遇到此类情况，请稍后重试或查看项目 Issue 页面确认是否已有修复公告。","https:\u002F\u002Fgithub.com\u002Fobsei\u002Fobsei\u002Fissues\u002F308",{"id":168,"question_zh":169,"answer_zh":170,"source_url":171},9321,"Obsei 未来的工作流架构会支持 DAG（有向无环图）吗？","项目计划引入基于 DAG 的工作流以支持更复杂的生产部署。目前团队正在评估多种方案，包括 Ray, Temporal, Airflow, NetworkX, Apache Beam 以及容器化方案（类似 Jina）。目标是构建一个抽象层，允许用户插入任何喜欢的框架，但该功能目前优先级较低，旨在先建立用户基础。","https:\u002F\u002Fgithub.com\u002Fobsei\u002Fobsei\u002Fissues\u002F8",[173,178,183,188,193,198,203,208,213,218,223,228,233,238],{"id":174,"version":175,"summary_zh":176,"released_at":177},106699,"v0.0.15","## :star: Major Highlights\r\n- Replaced Gnew package with GoogleNews as it was hampering dependencies upgrade\r\n- Upgrade to Pydantic V2 version\r\n- Removing Python 3.7 support\r\n- Upgraded all dependencies to current latest and greated\r\n\r\n## Changes\r\n\r\n- Use GoogleNews instead of GNews and dependencies upgrade @lalitpagaria (#319)\r\n\r\n","2023-12-31T08:47:15",{"id":179,"version":180,"summary_zh":181,"released_at":182},106700,"v0.0.14","## :star: Major Highlights\r\n- Making dependencies more strict\r\n- Updating more information on Readme\r\n- Adding code coverage reporting with CI job\r\n\r\n## Changes\r\n\r\n- Making dependencies strict @lalitpagaria 
(#270)\r\n","2022-11-19T09:35:20",{"id":184,"version":185,"summary_zh":186,"released_at":187},106701,"v0.0.13","## :star: Major Highlights\r\n- Python 3.10 support\r\n- Segregated optional dependencies to support install on need basis to reduce the docker image size\r\n- Added about [oraika](https:\u002F\u002Fwww.oraika.com) our parent organization and user\r\n\r\n## 🚀 Features\r\n\r\n- Segregated optional dependencies to support install on need basis @GirishPatel (#257)\r\n- Add Python 3.10 support and fix website @lalitpagaria (#218)\r\n\r\n## 🐛 Bug Fixes\r\n\r\n- Fixing sample UI dependencies @lalitpagaria (#262)\r\n- Fix UI docker build @lalitpagaria (#255)\r\n\r\n## 🧰 Maintenance\r\n\r\n- Updating Readme and fixing pypi release @lalitpagaria (#267)\r\n- Fixing mypy reported issues @lalitpagaria (#268)\r\n- Bump actions\u002Fcheckout from 2 to 3.1.0 @dependabot (#263)\r\n- Bump actions\u002Fcache from 2 to 3.0.11 @dependabot (#265)\r\n- Bump docker\u002Fmetadata-action from 3.6.2 to 4.1.1 @dependabot (#266)\r\n- Adding numpy to conda build @lalitpagaria (#256)\r\n\r\n## ⚠️Breaking Changes\r\n- To add ElasticSearch 8.x support few input params are modified (#268)\r\n- By default it will only install bare minimal dependencies, in order to install all use `pip install obsei[all]` (#257)\r\n","2022-11-11T09:54:24",{"id":189,"version":190,"summary_zh":191,"released_at":192},106702,"v0.0.12","## 🐛 Bug Fixes\r\n\r\n- Fix outscrapper map review API along with moving to faster V3 API @lalitpagaria (#250)\r\n- Fix TwitterSource username bug, add Gnews proxy @chxlium (#246)\r\n\r\n## 🧰 Maintenance\r\n\r\n- Upgrade dependencies and add dateparser in dependency list @lalitpagaria (#252)\r\n- Bump actions\u002Fsetup-python from 2 to 4 @dependabot (#248)\r\n- Bump crazy-max\u002Fghaction-docker-meta from 1 to 3.6.2 @dependabot (#226)\r\n- Upgraded click version to fix typer dependency @GirishPatel (#245)\r\n\r\n## 🚀 Misc\r\n\r\n- Add Utilize.app to the list of companies using 
Obsei @arorajatin (#249)\r\n\r\n\r\nThanks to new contributors @chxlium and @arorajatin ","2022-07-23T11:50:53",{"id":194,"version":195,"summary_zh":196,"released_at":197},106703,"v0.0.11","## :star: Major Highlights\r\n- Youtube: Now fetches YouTube video comments (via Scraper)\r\n- License: Removed all strong copyleft dependencies\r\n- Demo: Improved demo UI along with adding more detailed logging\r\n- A few bug fixes, dependency upgrades, CI enhancements and a security fix\r\n\r\n## 🚀 Features\r\n\r\n- Youtube integration via scraper @lalitpagaria (#224)\r\n- Removing third party dependencies with strong copyleft licenses @lalitpagaria (#221)\r\n- Enhancing demo UI @lalitpagaria (#214)\r\n\r\n## 🐛 Bug Fixes\r\n\r\n- Fixing typing-extensions dependency issue on python 3.7 @lalitpagaria (#217)\r\n- Google News max result fix @tanish36 (#211)\r\n- Bug: Updating long\\_term\\_token param to access\\_token for facebook source. @lalitpagaria (#210)\r\n\r\n## 🧰 Maintenance\r\n\r\n- Updated the README @kuutsav (#222)\r\n- Fix security issue with lxml @lalitpagaria (#219)\r\n- Dep upgrade (to address Dependabot for NLTK as well) @lalitpagaria (#215)\r\n- Enabling CI caching @lalitpagaria (#213)\r\n","2022-02-09T11:55:54",{"id":199,"version":200,"summary_zh":201,"released_at":202},106704,"v0.0.10","## :star: Major Highlights\r\n- Google Maps: The observer is now able to fetch Google Maps reviews\r\n- Handle Long Text: Now uses TextSplitter and InferenceAggregator to seamlessly process very long text, for example news articles\r\n- Analyzer: New TextClassification analyzer lets you use non zero-shot classification models\r\n- Analyzer: New spaCy-powered NER analyzer lets you use spaCy-based NER models\r\n- Pandas: You can now use a Pandas DataFrame as an observer and informer, which enables you to load and store data from CSV, TSV, Excel and SQL DBs.\r\n- Miscellaneous: Jinja template support for Slack messages, new tutorials and a pre-commit hook to save dev time\r\n\r\n## 🚀 
Features\r\n\r\n- Jinja Template for Slack, RegEx \\& Lammatizer cleaner functions and Sentence based text splitting @lalitpagaria (#206)\r\n- Add SDK and UI-Demo image release job @lalitpagaria (#205)\r\n- Adding jinja template support for slack messages @lalitpagaria (#199)\r\n- Adding google maps review observer via outscrapper api @lalitpagaria (#195)\r\n- add text classification analyzer @shahrukhx01 (#191)\r\n- Add Pandas as Observer\u002FSource @cnarte (#184)\r\n- Adding app\\_url support to appstore and playstore scrappers @lalitpagaria (#180)\r\n- Adding InferenceAggregator @akar5h (#166)\r\n- spacy ner analyzer , #165 enhancement @akar5h (#171)\r\n- Adding tutorials information and articles @lalitpagaria (#168)\r\n- Colab Tutorials Added @reenabapna (#167)\r\n- TextSplitter Preprocessor Pipeline @akar5h (#160)\r\n- Pre commit integration @salilmishra23 (#156)\r\n- add requirement files for development @salilmishra23 (#159)\r\n- Facebook source time range improvement @GirishPatel (#157)\r\n\r\n## 🐛 Bug Fixes\r\n\r\n- Reverting to older messaging format for Twitter v2 API @lalitpagaria (#198)\r\n- Fixing import issues along @lalitpagaria (#193)\r\n- Fix Twitter Source Config import issue @lalitpagaria (#190)\r\n- Use of BaseSettings Causing regression so reverting the changes @lalitpagaria (#189)\r\n- Creating quoted query before passing to GNews client @lalitpagaria (#181)\r\n- Handle Null case when crawler failed to fetch article @lalitpagaria (#173)\r\n- Email source fixed, no duplicates in each iteration @namanjuneja771 (#158)\r\n\r\n## 🧰 Maintenance\r\n\r\n- Adding support to dailyget message api @lalitpagaria (#179)\r\n\r\n## ⚠️Breaking Changes\r\n\r\n- Fix Twitter Source Config import issue @lalitpagaria (#190)\r\n- Adding InferenceAggregator @akar5h (#166)\r\n- spacy ner analyzer , #165 enhancement @akar5h (#171)\r\n\r\n## 🙏 Release Contributors! 
:heart:\r\n@akar5h @cnarte @GirishPatel @lalitpagaria @namanjuneja771 @reenabapna @salilmishra23 @shahrukhx01\r\n\r\n## 🥳 New Contributors\r\n* @namanjuneja771 made their first contribution in https:\u002F\u002Fgithub.com\u002Fobsei\u002Fobsei\u002Fpull\u002F158\r\n* @salilmishra23 made their first contribution in https:\u002F\u002Fgithub.com\u002Fobsei\u002Fobsei\u002Fpull\u002F159\r\n* @reenabapna made their first contribution in https:\u002F\u002Fgithub.com\u002Fobsei\u002Fobsei\u002Fpull\u002F167\r\n* @cnarte made their first contribution in https:\u002F\u002Fgithub.com\u002Fobsei\u002Fobsei\u002Fpull\u002F184\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fobsei\u002Fobsei\u002Fcompare\u002F0.0.9...v0.0.10","2021-10-05T12:37:38",{"id":204,"version":205,"summary_zh":206,"released_at":207},106705,"0.0.9","## :star: Major Highlights\r\n### Pre-Processing (https:\u002F\u002Fgithub.com\u002Fobsei\u002Fobsei\u002Fpull\u002F110)\r\nThis adds a new component to Obsei. Its main idea is to provide a simple but configurable step to pre-process text before sending it for model prediction. Currently a TextCleaning step is added, which helps users clean raw text. 
It is great contribution by @shahrukhx01 \r\n### Facebook Integration (https:\u002F\u002Fgithub.com\u002Fobsei\u002Fobsei\u002Fpull\u002F150)\r\nObsei can now observe comments from Facebook page's posts thanks to contribution by @GirishPatel \r\n### Google News Integration (https:\u002F\u002Fgithub.com\u002Fobsei\u002Fobsei\u002Fpull\u002F111)\r\nObsei can now search news on GoogleNews and scrap full news article in text\r\n### Website Scrapper Integration (https:\u002F\u002Fgithub.com\u002Fobsei\u002Fobsei\u002Fpull\u002F111)\r\nObsei can now scrap particular URL or full website if it contains sitemap\r\n### Pandas DataFrame Integration (https:\u002F\u002Fgithub.com\u002Fobsei\u002Fobsei\u002Fpull\u002F107)\r\nObsei can now have Pandas DataFrame as Informer to publish Analyzer's data to DataFrame.\r\n\r\n## 🔆 Other Changes\r\n- Added contribution guideline and code of conduct (https:\u002F\u002Fgithub.com\u002Fobsei\u002Fobsei\u002Fcommit\u002Fe102f8915d99e7241974e1b2360b3d075300338a and https:\u002F\u002Fgithub.com\u002Fobsei\u002Fobsei\u002Fcommit\u002Ffcc9a9121efc0b049024b100ce28d8bab521f072)\r\n- Adding version tag along with default logging config (https:\u002F\u002Fgithub.com\u002Fobsei\u002Fobsei\u002Fpull\u002F149)\r\n- Ignoring error during cleaning and fixing exception in google news module (https:\u002F\u002Fgithub.com\u002Fobsei\u002Fobsei\u002Fpull\u002F143)\r\n- Add analyzer batching (https:\u002F\u002Fgithub.com\u002Fobsei\u002Fobsei\u002Fpull\u002F118)\r\n- Adding app name support for app and play store (https:\u002F\u002Fgithub.com\u002Fobsei\u002Fobsei\u002Fpull\u002F128)\r\n- Mypy integration (https:\u002F\u002Fgithub.com\u002Fobsei\u002Fobsei\u002Fpull\u002F135 and https:\u002F\u002Fgithub.com\u002Fobsei\u002Fobsei\u002Fpull\u002F125\r\n- Fixing error regarding offset-naive and offset-aware datetimes comparison (https:\u002F\u002Fgithub.com\u002Fobsei\u002Fobsei\u002Fpull\u002F119)\r\n- Remove import from configuration.py as it is causing loop 
of import (https:\u002F\u002Fgithub.com\u002Fobsei\u002Fobsei\u002Fpull\u002F133)\r\n- Trimming excessive text before passing to model (https:\u002F\u002Fgithub.com\u002Fobsei\u002Fobsei\u002Fpull\u002F114)\r\n- [BUG] Tokenizer loading error in the NER Analyzer (https:\u002F\u002Fgithub.com\u002Fobsei\u002Fobsei\u002Fpull\u002F152)\r\n\r\n## ⚠️ Breaking Changes\r\n- Unifying analyzer request and response (https:\u002F\u002Fgithub.com\u002Fobsei\u002Fobsei\u002Fpull\u002F148)\r\n\r\n## 🙏 Release Contributors! :heart:\r\n@GirishPatel @shahrukhx01 @akar5h @lalitpagaria ","2021-07-03T08:10:50",{"id":209,"version":210,"summary_zh":211,"released_at":212},106706,"0.0.8","This release includes (refer to https:\u002F\u002Fgithub.com\u002Flalitpagaria\u002Fobsei\u002Fprojects\u002F6 for detailed changes) -\r\n1. **New Analyzers**: Personal Information Anonymizer and Translation\r\n2. **GPU support**: Analyzers can now utilize the GPU\r\n3. **Conda release**: Adding initial support to install the package from Conda, as well as creation of a development environment\r\n4. **Windows support**: Adding support for the Windows platform\r\n5. **Obsei UI demo**: Adding a Streamlit-based UI to try Obsei\r\n6. 
**Various bug fixes**","2021-05-21T14:19:45",{"id":214,"version":215,"summary_zh":216,"released_at":217},106707,"0.0.7","This release includes -\r\n\r\n- Email Observer: Currently does not segment attachments (linked issue https:\u002F\u002Fgithub.com\u002Flalitpagaria\u002Fobsei\u002Fissues\u002F30)\r\n- Zendesk Informer (linked issue https:\u002F\u002Fgithub.com\u002Flalitpagaria\u002Fobsei\u002Fissues\u002F31)\r\n- Add extensive example, Colab and Binder support in [Readme](https:\u002F\u002Fgithub.com\u002Flalitpagaria\u002Fobsei#how-to-use) (linked issue https:\u002F\u002Fgithub.com\u002Flalitpagaria\u002Fobsei\u002Fissues\u002F39)\r\n- Remove hydra dependency (linked issue https:\u002F\u002Fgithub.com\u002Flalitpagaria\u002Fobsei\u002Fissues\u002F33)\r\n- Adding Logger Informer, so users can easily test out the end-to-end pipeline (linked commit https:\u002F\u002Fgithub.com\u002Flalitpagaria\u002Fobsei\u002Fcommit\u002Fca99da8b72611788e5969565438f02d7d356cfd1)\r\n- Adding security policy to repo (linked commit https:\u002F\u002Fgithub.com\u002Flalitpagaria\u002Fobsei\u002Fcommit\u002F2685fffdd36b57acdeb6a8adfc6f9920dfa7082b)\r\n- **[Bug\u002FRegression]**: REST interface via Docker build was failing (linked commit https:\u002F\u002Fgithub.com\u002Flalitpagaria\u002Fobsei\u002Fcommit\u002Fef48b36792e1ce0d7118df6f3081f1e3b368c65c)","2021-03-27T21:31:16",{"id":219,"version":220,"summary_zh":221,"released_at":222},106708,"0.0.6","This release includes the following bug fixes -\r\n- Play store scraper was failing (linked issue https:\u002F\u002Fgithub.com\u002Flalitpagaria\u002Fobsei\u002Fissues\u002F34)\r\n- Correct analyzer usages in examples (linked issue https:\u002F\u002Fgithub.com\u002Flalitpagaria\u002Fobsei\u002Fissues\u002F36)","2021-02-05T23:28:46",{"id":224,"version":225,"summary_zh":226,"released_at":227},106709,"0.0.5","```diff\r\n+! 
Release Detail !+\r\n```\r\n- Reddit and Slack integration (https:\u002F\u002Fgithub.com\u002Flalitpagaria\u002Fobsei\u002Fcommit\u002Fb69b64c439fda4d92fee5be18904a3f347d2ea12 and https:\u002F\u002Fgithub.com\u002Flalitpagaria\u002Fobsei\u002Fcommit\u002F7a0f6ab846a30c7fda606d4bee42bedc81eaa6c7)\r\n- New NER and Dummy Analyzers, apart from Sentiment and Classification (https:\u002F\u002Fgithub.com\u002Flalitpagaria\u002Fobsei\u002Fcommit\u002Fe1115595372e83708e081a3c05138abb9116b9e9 and https:\u002F\u002Fgithub.com\u002Flalitpagaria\u002Fobsei\u002Fcommit\u002F870c1eaea3494e00dcda4b9c94aa0203b85082d4)\r\n- Image and Readme improvements\r\n- Documentation website: https:\u002F\u002Flalitpagaria.github.io\u002Fobsei\u002F\r\n- Discussion forum: https:\u002F\u002Fgithub.com\u002Flalitpagaria\u002Fobsei\u002Fdiscussions\r\n\r\n```diff\r\n-! Breaking Changes !-\r\n```\r\nThis release has breaking changes related to changes in packages and arguments for Analyzers. Refer to https:\u002F\u002Fgithub.com\u002Flalitpagaria\u002Fobsei\u002Fcommit\u002Fe1115595372e83708e081a3c05138abb9116b9e9","2021-02-03T01:26:48",{"id":229,"version":230,"summary_zh":231,"released_at":232},106710,"0.0.4","This release includes -\r\n- Google Play store reviews Observer; this needs authentication (linked issue https:\u002F\u002Fgithub.com\u002Flalitpagaria\u002Fobsei\u002Fissues\u002F4)\r\n- Apple App Store reviews Observer via scraping; this does not need authentication (linked issue https:\u002F\u002Fgithub.com\u002Flalitpagaria\u002Fobsei\u002Fissues\u002F9)\r\n- Google Play store reviews Observer via scraping; this does not need authentication (linked issue https:\u002F\u002Fgithub.com\u002Flalitpagaria\u002Fobsei\u002Fissues\u002F10)\r\n- Persist the Observer's current state to persistent storage of the user's choice, like SQLite, MySQL, Postgres, etc. (linked issue 
https:\u002F\u002Fgithub.com\u002Flalitpagaria\u002Fobsei\u002Fissues\u002F6)","2021-01-13T05:17:01",{"id":234,"version":235,"summary_zh":236,"released_at":237},106711,"0.0.3","This release includes -\r\n- Google Play store reviews as source\r\n- Jira client fix\r\n- Sending markdown in Jira description\r\n- A few updates to the readme","2020-12-29T18:18:24",{"id":239,"version":240,"summary_zh":241,"released_at":242},106712,"v0.0.2","First release with support for the following components -\r\n- **Source**: Twitter \r\n- **Analyzer**: Sentiment and Text classification\r\n- **Sink**: HTTP API, ElasticSearch, DailyGet, and Jira\r\n- **Processor**: Simple integration between Source, Analyzer and Sink\r\n- **API Server**: REST interface","2020-12-23T19:41:20"]