automagica


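Automagica is an open source smart robotic process automation (RPA) project that incorporates AI, with the aim of making automation accessible to everyone. It targets repetitive, time-consuming computer work and can automate tasks such as driving a browser together with Excel or encrypting and decrypting files.

The toolkit suits ordinary users who want to automate office work as well as developers who need to customize processes. For developers, Automagica offers a Jupyter Notebook based development environment (Automagica Lab) and a visual flow designer (Automagica Flow), and it supports writing Python code directly, so it combines a low barrier to entry with a high degree of freedom.

Its distinctive technical feature is Automagica Wand, an AI-powered UI element picker intended to identify and operate on interface elements more accurately, which addresses the difficulty traditional RPA has in locating elements in dynamic interfaces. Although the core of the project has been acquired by Netcall and moved toward commercial integration, its original architecture still shows how AI vision capabilities can be combined with conventional automation scripts, which makes it a useful case study for RPA combined with AI.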

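Use case

A finance specialist at an e-commerce company has to download invoice PDFs from several supplier portals every day, extract the key data, enter it into an Excel report, and encrypt sensitive files for archiving.

Without automagica

  • Staff must log in to each website manually to download files, which is slow and prone to mistyped URLs or account names when tired.
  • Amounts and dates in the PDFs are read by eye, which is inefficient and prone to transcription errors.
  • There is no unified process management: once one step fails (for example, a web page element changes), the whole task stops and is hard to troubleshoot.
  • Sensitive financial files are scattered across local folders with no automated encryption, which creates a risk of data leakage.
  • Each new supplier requires writing a complex script again, so maintenance is costly and non-technical staff cannot help improve the process.

With automagica

  • Automagica Bot logs in to each portal and downloads invoices in bulk without manual intervention.
  • The AI-powered Automagica Wand identifies and captures the key fields in the PDFs, raising accuracy to above 99%.
  • The process is designed visually in Automagica Flow, so nodes can be adjusted quickly when a page structure changes, which makes the process more stable.
  • The built-in Encrypt file activity applies Fernet encryption to archived files so that stored data meets security and compliance requirements.
  • Business staff can change the automation logic directly through the low-code interface without relying on a dedicated development team, which shortens response times.

automagica turns a tedious, repetitive finance process into a stable, secure automated loop that can keep being improved, helping the company cut costs and work more efficiently.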

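Runtime requirements

Operating system
  • Windows

GPU
  • Not specified

Memory
  • Not specified

Dependencies
  • Python (version not specified)
  • Jupyter (for Automagica Lab)

Notes
  • The project was acquired by Netcall plc on 13 October 2020 and is no longer offered as open source software (AGPL3 license).
  • Some features (such as Random beep) explicitly work only on Windows.
  • The Automagica Lab component requires Jupyter to be installed beforehand.
  • The original free services were kept for already deployed users only for a three-month transition period.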

Quick start

https://automagica.com

Automagica

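The Automagica project started in 2018 with the aim of developing open source software that makes robotic process automation accessible to everyone.

As successive versions of Automagica were released, more advanced features such as Wand and Portal required a service infrastructure to support more reliable robot operation, advanced services, and management and control.

As usage of these services grew, so did the cost of hosting and maintaining the service layer.

To take these important services to the next stage, we are pleased to announce that on 13 October 2020 Netcall plc, a leading provider of low-code, customer engagement and contact centre software, acquired Oakwood Technologies BV (trading as "Automagica").

Netcall will integrate Automagica's RPA into its Liberty platform, providing a powerful combination of RPA, low-code and customer engagement solutions.

Automagica Robot is no longer available under the terms of the AGPL3 license.

However, we will not stop servicing deployed robots. These robots can continue to use Wand and OCR free of charge for three months, starting today (13 October 2020).

Existing Automagica Portal users will also have free access to the platform for the next three months, during which we will offer them the option to migrate to the commercial service.

We sincerely thank everyone who has contributed to the project.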


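Components

The Automagica suite consists of the following components:

  • Automagica Bot: the runtime/agent that executes automation tasks.
  • Automagica Flow: a visual flow designer for building automations quickly, with full support for Python code.
  • Automagica Wand: an AI-powered UI element picker.
  • Automagica Lab: a notebook-style automation development environment based on Jupyter Notebooks (requires Jupyter to be installed).
  • Automagica Portal: for managing bots, credentials, automations, logs and more.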

[Image: Portal and Flow]

Example

Browser working with Excel:

[Image: Automagica Excel example]

Activities

Overview of all official Automagica activities:

Process Description
Cryptography ‌‌
Random key Generate random Fernet key.
Encrypt text Encrypt text with (Fernet) key.
Decrypt text Decrypt bytes-like object to string with (Fernet) key.
Encrypt file Encrypt file with (Fernet) key. Note that file will be unusable unless unlocked with the same key.
Decrypt file Decrypts file with (Fernet) key
Key from password Generate key based on password and salt. If both password and salt are known the key can be regenerated.
Hash from file Generate hash from file
Hash from text Generate hash from text. Keep in mind that MD5 is not cryptographically secure.
Random ‌‌
Random number Random numbers can be integers (not a fractional number) or a float (fractional number).
Random data Generates all kinds of random data. Specifying locale changes format for some options
Random boolean Generates a random boolean (True or False)
Random name Generates a random name. Adding a locale adds a more common name in the specified locale. Provides first name and last name.
Random words Generates a random sentence. Specifying locale changes language and content based on locale.
Random address Generates a random address. Specifying locale changes random locations and streetnames based on locale.
Random beep Generates a random beep, only works on Windows
Random date Generates a random date.
Today's date Generates today's date.
Generate unique identifier Generates a random UUID4 (universally unique identifier).
Output ‌‌
Display overlay message Display custom OSD (on-screen display) message. Can be used to display a message for a limited amount of time. Can be used for illustration, debugging or as OSD.
Print message in console Print message in console. Can be used to display data in the Automagica Flow console
Browser ‌‌
Chrome Open Chrome Browser
Save all images Save all images on current page in the Browser
Browse to URL Browse to URL.
Find elements by text Find all elements by their text. Text does not need to match exactly, part of text is enough.
Find all links Find all links on a webpage in the browser
Find first link on a webpage Find first link on a webpage
Get all text on webpage Get all the raw body text from current webpage
Highlight element Highlight elements in yellow in the browser
Exit the browser Quit the browser by exiting gracefully. One can also use the native 'quit' function
Find all XPaths Find all elements with specified xpath on a webpage in the browser. Can also use native 'find_elements_by_xpath'
Find XPath in browser Find all elements with specified xpath on a webpage in the browser. Can also use native 'find_elements_by_xpath'
Find class in browser Find element with specified class on a webpage in the browser. Can also use native 'find_element_by_class_name'
Find class in browser Find all elements with specified class on a webpage in the browser. Can also use native 'find_elements_by_class_name' function
Find element in browser based on class and text Find all elements with specified class and text on a webpage in the browser.
Find id in browser Find element with specified id on a webpage in the browser. Can also use native 'find_element_by_id' function
Switch to iframe in browser Switch to an iframe in the browser
Credential Management ‌‌
Set credential Add a credential which stores credentials locally and securely. All parameters should be Unicode text.
Delete credential Delete a locally stored credential. All parameters should be Unicode text.
Get credential Get a locally stored credential. All parameters should be Unicode text.
FTP ‌‌
Create FTP connection (insecure) Can be used to automate activities for FTP
Download file Downloads a file from FTP server. Connection needs to be established first.
Upload file Upload file to FTP server
List FTP files Generate a list of all the files in the FTP directory
Check FTP directory Check if FTP directory exists
Create FTP directory Create an FTP directory. Note that sufficient permissions are required.
Keyboard ‌‌
Press key Press and release an entered key. Make sure your keyboard is on US layout (standard QWERTY). If you are using this on macOS you might need to grant access to your terminal application.
Press key combination Press a combination of two or three keys simultaneously. Make sure your keyboard is on US layout (standard QWERTY).
Type text Simulate keystrokes. If an element ID is specified, text will be typed in a specific field or element based on the element ID (vision) by the recorder.
Mouse ‌‌
Get mouse coordinates Get the x and y pixel coordinates of current mouse position.
Display mouse position Displays mouse position in an overlay.
Mouse click Clicks on an element based on the element ID (vision)
Mouse click coordinates Clicks on an element based on pixel position determined by x and y coordinates. To find coordinates one could use display_mouse_position().
Double mouse click coordinates Double clicks on a pixel position determined by x and y coordinates.
Double mouse click Double clicks on an element based on the element ID (vision)
Right click Right clicks on an element based on the element ID (vision)
Right click coordinates Right clicks on an element based on pixel position determined by x and y coordinates.
Move mouse Moves the pointer to an element based on the element ID (vision)
Move mouse coordinates Moves the pointer to an element based on the pixel position determined by x and y coordinates
Move mouse relative Moves the mouse an x- and y- distance relative to its current pixel position.
Drag mouse Drags mouse to an element based on pixel position determined by x and y coordinates
Drag mouse Drags mouse to an element based on the element ID (vision)
Image ‌‌
Random screen snippet Take a random square snippet from the current screen. Mainly for testing and/or development purposes.
Screenshot Take a screenshot of current screen.
Folder Operations ‌‌
List files in folder List all files in a folder (and subfolders)
Create folder Creates new folder at the given path.
Rename folder Rename a folder
Open a folder Open a folder with the default explorer.
Move a folder Moves a folder from one place to another.
Remove folder Remove a folder including all subfolders and files. For the function to work optimally, all files and subfolders in the main target folder should be closed.
Empty folder Remove all contents from a folder. For the function to work optimally, all files and subfolders in the main target folder should be closed.
Checks if folder exists Check whether folder exists or not, regardless if folder is empty or not.
Copy a folder Copies a folder from one place to another.
Zip Zip folder and its contents. Creates a .zip file.
Unzip Unzips a file or folder from a .zip file.
Return most recent file in directory Return most recent file in directory
Delay ‌‌
Wait Make the robot wait for a specified number of seconds. Note that this activity is blocking. This means that subsequent activities will not occur until the specified waiting time has expired.
Wait for folder Waits until a folder exists. Note that this activity is blocking and will keep the system waiting.
Word Application ‌‌
Start Word Application For this activity to work, Microsoft Office Word needs to be installed on the system.
Save Save active Word document
Save As Save active Word document to a specific location
Append text Append text at end of Word document.
Replace text Can be used for example to replace an arbitrary placeholder value, such as 'XXXX' in a template document. Take note that all strings are case sensitive.
Read all text Read all the text from a document
Export to PDF Export the document to PDF
Export to HTML Export to HTML
Set footers Set the footers of the document
Set headers Set the headers of the document
Quit Word This closes Word, make sure to use 'save' or 'save_as' if you would like to save before quitting.
Word File ‌‌
Read and Write Word files These activities can read, write and edit Word (docx) files without the need of having Word installed.
Read all text Read all the text from the document
Append text Append text at the end of the document
Save Save document
Save as Save file on specified path
Set headers Set headers of Word document
Replace all Replaces all occurrences of a placeholder text in the document with a replacement text.
Outlook Application ‌‌
Start Outlook Application For this activity to work, Outlook needs to be installed on the system.
Send e-mail Send an e-mail using Outlook
Retrieve folders Retrieve list of folders from Outlook
Retrieve e-mails Retrieve list of messages from Outlook
Delete e-mails Deletes e-mail messages in a certain folder. Can be specified by searching on subject, body or sender e-mail.
Move e-mails Move e-mail messages in a certain folder. Can be specified by searching on subject, body or sender e-mail.
Save attachments Save all attachments from certain folder
Retrieve contacts Retrieve all contacts
Add a contact Add a contact to Outlook contacts
Quit Close the Outlook application
Excel Application ‌‌
Start Excel Application For this activity to work, Microsoft Office Excel needs to be installed on the system.
Add worksheet Adds a worksheet to the current workbook
Activate worksheet Activate a worksheet in the current Excel document by name
Save Save the current workbook. Defaults to homedir
Save as Save the current workbook to a specific path
Write cell Write to a specific cell in the currently active workbook and active worksheet
Read cell Read a cell from the currently active workbook and active worksheet
Write range Write to a specific range in the currently active worksheet in the active workbook
Read range Read a range of cells from the currently active worksheet in the active workbook
Run macro Run a macro by name from the currently active workbook
Get worksheet names Get names of all the worksheets in the currently active workbook
Get table Get table data from the currently active worksheet by name of the table
Activate range Activate a particular range in the currently active workbook
Activate first empty cell down Activates the first empty cell going down
Activate first empty cell right Activates the first empty cell going right
Activate first empty cell left Activates the first empty cell going left
Activate first empty cell up Activates the first empty cell going up
Write cell formula Write a formula to a particular cell
Read cell formula Read the formula from a particular cell
Insert empty row Inserts an empty row to the currently active worksheet
Insert empty column Inserts an empty column in the currently active worksheet. Existing columns will shift to the right.
Delete row in Excel Deletes a row from the currently active worksheet. Existing data will shift up.
Delete column Delete a column from the currently active worksheet. Existing columns will shift to the left.
Export to PDF Export to PDF
Insert data as table Insert list of dictionaries as a table in Excel
Read worksheet Read data from a worksheet as a list of lists
Quit Excel This closes Excel, make sure to use 'save' or 'save_as' if you would like to save before quitting.
Excel File ‌‌
Read and Write xlsx files. This activity can read, write and edit Excel (xlsx) files without the need of having Excel installed.
Export file to dataframe Export to pandas dataframe
Activate worksheet Activate a worksheet. By default the first worksheet is activated.
Save as Save file as
Save Save file
Write cell Write a cell based on column and row
Read cell Read a cell based on column and row
Add worksheet Add a worksheet
Get worksheet names Get worksheet names
PowerPoint Application ‌‌
Start PowerPoint Application For this activity to work, PowerPoint needs to be installed on the system.
Save PowerPoint Save PowerPoint Slidedeck
Save PowerPoint Save PowerPoint Slidedeck
Close PowerPoint Application Close PowerPoint
Add PowerPoint Slides Adds slides to a presentation
Slide count Returns the number of slides
Text to slide Add text to a slide
Delete slide Delete a slide
Replace all occurrences of text in PowerPoint slides Can be used for example to replace arbitrary placeholder value in a PowerPoint.
PowerPoint to PDF Export PowerPoint presentation to PDF file
Slides to images Export PowerPoint slides to separate image files
Office 365 ‌‌
Send email Office Outlook 365 Send email Office Outlook 365
Salesforce ‌‌
Salesforce API Activity to make calls to Salesforce REST API.
E-mail (SMTP) ‌‌
Mail with SMTP This function lets you send emails with an e-mail address.
Windows OS ‌‌
Find window with specific title Find a specific window based on the name, either a perfect match or a partial match.
Login to Windows Remote Desktop Create a RDP and login to Windows Remote Desktop
Stop Windows Remote Desktop Stop Windows Remote Desktop
Set Windows password Sets the password for a Windows user.
Check Windows password Validates a Windows user password if it is correct
Lock Windows Locks Windows requiring login to continue.
Check if Windows logged in Checks if the current user is logged in and not on the lockscreen. Most automations do not work properly when the desktop is locked.
Check if Windows is locked Checks if the current user is locked out and on the lockscreen. Most automations do not work properly when the desktop is locked.
Get Windows username Get current logged in user's username
Set clipboard Set any text to the Windows clipboard.
Get clipboard Get the text currently in the Windows clipboard
Empty clipboard Empty text from clipboard. Getting clipboard data after this should return None
Run VBScript Run a VBScript file
Beep Make a beeping sound. Make sure your volume is up and you have hardware connected.
Get all network interface names Returns a list of all network interfaces of the current machine
Enable network interface Enables a network interface by its name.
Disable network interface Disables a network interface by its name.
Get default printer Returns the name of the printer selected as default
Set default printer Set the default printer.
Remove printer Removes a printer by its name
Get service status Returns the status of a service on the machine
Start a service Starts a Windows service
Stop a service Stops a Windows service
Set window to foreground Sets a window to foreground by its title.
Get foreground window title Retrieve the title of the current foreground window
Close window Closes a window by its title
Maximize window Maximizes a window by its title
Restore window Restore a window by its title
Minimize window Minimizes a window by its title
Resize window Resize a window by its title
Hide window Hides a window from the user desktop by using its title
Terminal ‌‌
Run SSH command Runs a command over SSH (Secure Shell)
SNMP ‌‌
SNMP Get Retrieves data from an SNMP agent using SNMP (Simple Network Management Protocol)
Active Directory ‌‌
AD interface Interface to Windows Active Directory through ADSI. Connects to the AD domain to which the machine is joined by default.
Get AD object by name Interface to Windows Active Directory through ADSI
Utilities ‌‌
Get user home path Returns the current user's home path
Get desktop path Returns the current user's desktop path
Get downloads path Returns the current user's default download path
Open file Opens file with default programs
Set wallpaper Set Windows desktop wallpaper with the specified image
Download file from a URL Download file from a URL
System ‌‌
Rename a file This activity will rename a file. If the desired name already exists in the folder, the file will not be renamed. Make sure to add the extension to specify the file type.
Move a file If the new location already contains a file with the same name.
Remove a file Remove a file
Check if file exists This function checks whether the file with the given path exists.
Wait until a file exists. Note that this activity is blocking and will keep the system waiting.
List to .txt Writes a list to a text (.txt) file. Every element of the entered list is written on a new line of the text file.
Read list from .txt file This activity reads the content of a .txt file to a list and returns that list. Every new line from the .txt file becomes a new element of the list. The activity will not work if the entered path does not point to a .txt file.
Read .txt file This activity reads a .txt file and returns the content
Append to .txt Append a text line to a file and creates the file if it does not exist yet.
Make text file Initialize text file
Read .txt file with newlines to list Read a text file to a Python list-object
Copy a file Copies a file from one place to another. If the new location already contains a file with the same name, a random 4 character uid is added to the name.
Get file extension Get extension of a file
Print Send file to default printer. This activity sends a file to the printer. Make sure to have a default printer set up.
PDF ‌‌
Text from PDF Extracts the text from a PDF. This activity reads text from a pdf file. Can only read PDF files that contain a text layer.
Merge PDF Merges multiple PDFs into a single file
Extract page from PDF Extracts a particular range of a PDF to a separate file.
Extract images from PDF Save a specific page from a PDF as an image
Watermark a PDF Watermark a PDF
System Monitoring ‌‌
CPU load Get average CPU load for all cores.
Count CPU Get the number of CPUs in the current system.
CPU frequency Get frequency at which CPU currently operates.
CPU Stats Get CPU statistics
Memory statistics Get memory statistics
Disk stats Get disk statistics of main disk
Partition info Get disk partition info
Boot time Get most recent boot time
Uptime Get uptime since last boot
Image Processing ‌‌
Show image Displays an image specified by the path variable on the default imaging program.
Rotate image Rotate an image
Resize image Resizes the image specified by the path variable.
Get image width Get width of image
Get image height Get height of image
Crop image Crops the image specified by path to a region determined by the box variable.
Mirror image horizontally Mirrors an image with a given path horizontally from left to right.
Mirror image vertically Mirrors an image with a given path vertically from top to bottom.
Process ‌‌
Windows run Use Windows Run to boot a process. Note that this uses keyboard inputs, which means this process can be disrupted by interfering inputs
Run process Use subprocess to open a windows process
Check if process is running Check if process is running. Validates if given process name (name) is currently running on the system.
Get running processes Get names of unique processes currently running on the system.
Kill process Kills a process forcefully
Optical Character Recognition (OCR) ‌‌
Get text with OCR This activity extracts all text from the current screen or an image if a path is specified.
Find text on screen with OCR This activity finds position (coordinates) of specified text on the current screen using OCR.
Click on text with OCR This activity clicks on position (coordinates) of specified text on the current screen using OCR.
Double click on text with OCR This activity double clicks on position (coordinates) of specified text on the current screen using OCR.
Right click on text with OCR This activity right clicks on position (coordinates) of specified text on the current screen using OCR.
UiPath ‌‌
Execute a UiPath process This activity allows you to execute a process designed with the UiPath Studio. All console output from the Write Line activity will be printed as output.
AutoIt ‌‌
Execute an AutoIt script This activity allows you to run an AutoIt script. If you use the ConsoleWrite function, the output will be presented to you.
Alternative frameworks ‌‌
Execute a Robot Framework test case This activity allows you to run a Robot Framework test case. Console output of the test case will be printed.
Run a Blue Prism process This activity allows you to run a Blue Prism process.
Run an Automation Anywhere task This activity allows you to run an Automation Anywhere task.
General ‌‌
Raise exception Raises an exception
SAP GUI ‌‌
Quit SAP GUI Quits the SAP GUI completely and forcibly.
Log in to SAP GUI Logs in to an SAP system on SAP GUI.
Click on a SAP GUI element Clicks on an identifier in the SAP GUI.
Get text from a SAP GUI element Retrieves the text from a SAP GUI element.
Set text of a SAP GUI element Sets the text of a SAP GUI element.
Highlights a SAP GUI element Temporarily highlights a SAP GUI element
Portal ‌‌
Create a new job in the Automagica Portal This activity creates a new job in the Automagica Portal for a given process. The bot performing this activity needs to be in the same team as the process it creates a job for.
Get a credential from the Automagica Portal This activity retrieves a credential from the Automagica Portal.
Vision ‌‌
Check if element is visible on screen This activity can be used to check if a certain element is visible on the screen. Note that this uses Automagica Portal and uses some advanced and fuzzy matching algorithms for finding identical elements.
Wait for an element to appear Wait for an element that is defined with the recorder
Wait Vanish This activity allows the bot to wait for an element to vanish.
Read Text with Automagica Wand This activity allows the bot to detect and read the text of an element by using the Automagica Portal API with a provided sample ID.
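
The "Key from password" activity in the Cryptography group states that a key can be regenerated whenever both the password and the salt are known. The sketch below illustrates that property using only Python's standard library. It is not Automagica's implementation: the function name key_from_password, the choice of PBKDF2 with SHA-256, and the iteration count are all assumptions made for illustration. The result is encoded as url-safe base64 of 32 bytes, which is the shape a Fernet key takes.

```python
import base64
import hashlib
import os


def key_from_password(password: str, salt: bytes, iterations: int = 200_000) -> bytes:
    """Derive a 32-byte key from a password and salt (hypothetical helper).

    The derived bytes are returned url-safe base64 encoded, the same
    encoding used for Fernet keys. The iteration count is an assumption.
    """
    raw = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=32
    )
    return base64.urlsafe_b64encode(raw)


# A random salt; it must be stored alongside the data to regenerate the key.
salt = os.urandom(16)

key_a = key_from_password("correct horse", salt)
key_b = key_from_password("correct horse", salt)             # same password, same salt
key_c = key_from_password("correct horse", os.urandom(16))   # same password, new salt

print(key_a == key_b)  # identical inputs give an identical key
print(key_a == key_c)  # a different salt gives a different key
```

Keeping the salt is what makes regeneration possible: the password alone is not enough to reproduce the key.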
‌‌

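The "Hash from text" and "Hash from file" activities listed above come with a reminder that MD5 is not cryptographically secure. As a rough illustration of what such activities compute, the sketch below hashes text and files with Python's hashlib. The helper names hash_text and hash_file, the SHA-256 default, and the chunk size are assumptions for this example, not Automagica's API.

```python
import hashlib
import os
import tempfile


def hash_text(text: str, algorithm: str = "sha256") -> str:
    """Return the hex digest of a string (hypothetical helper)."""
    return hashlib.new(algorithm, text.encode("utf-8")).hexdigest()


def hash_file(path: str, algorithm: str = "sha256", chunk_size: int = 65536) -> str:
    """Return the hex digest of a file, read in chunks to bound memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Write a small sample file so the example is self-contained.
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as handle:
    handle.write(b"hello")
    sample_path = handle.name

try:
    file_digest = hash_file(sample_path)
finally:
    os.remove(sample_path)

print(hash_text("hello", "md5"))          # MD5: fine for checksums, not for security
print(file_digest == hash_text("hello"))  # same bytes give the same SHA-256 digest
```

MD5 remains useful for detecting accidental corruption, but a digest such as SHA-256 is the safer default wherever tampering is a concern.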
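License

Copyright and licensing

All source code and other files in this repository are copyright of Netcall plc unless stated otherwise.

Commercial license

For more information on licensing, trials and commercial use, see this page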
