automagica
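Automagica is an open-source smart robotic process automation (RPA) project that incorporates AI techniques and aims to make automation accessible to everyone. It targets repetitive, time-consuming computer work and can automate tasks such as browser and Excel interaction or file encryption and decryption, significantly improving productivity.
The toolkit suits both everyday users who want to automate office work and developers who need flexible, customizable workflows. For developers, Automagica offers a Jupyter Notebook based development environment (Automagica Lab) and a visual flow designer (Automagica Flow), and supports writing Python code directly, combining a low barrier to entry with a high degree of freedom.
Its distinctive technical highlight is "Automagica Wand", an AI-powered UI element picker that identifies and operates on software interface elements more accurately, addressing the difficulty traditional RPA has in locating elements in dynamic interfaces. Although the core of the project has been acquired by Netcall and moved toward commercial integration, its original architecture still demonstrates an innovative way of combining AI vision with traditional automation scripts, making it a good case study for applying RPA together with AI.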
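Use case
A finance specialist at an e-commerce company has to download invoice PDFs from several supplier portals every day, extract key data, enter it into an Excel report, and encrypt and archive sensitive files.
Without automagica
- Staff must log in to each website manually to download files, which is slow and prone to mistyped URLs or account names when fatigue sets in.
- Amounts and dates in the PDFs are read by eye, which is inefficient and highly prone to transcription errors.
- There is no unified process management: once a step fails (for example, a page element changes), the whole task stops and is hard to troubleshoot.
- Sensitive financial files are scattered across local folders with no automated encryption, creating a risk of data leakage.
- Every new supplier requires writing complex scripts again, so maintenance costs are very high and non-technical staff cannot help improve the process.
With automagica
- Automagica Bot logs in to each portal and downloads invoices in bulk without manual intervention, freeing up staff time.
- The AI-powered Automagica Wand identifies and captures key PDF fields, raising accuracy to above 99%.
- The process is designed visually in Automagica Flow, so nodes can be adjusted quickly when a page structure changes, which greatly improves stability.
- The built-in encryption activity (Encrypt file) applies Fernet encryption to archived files automatically, so that stored data meets security and compliance requirements.
- Business staff can modify the automation logic directly through the low-code interface without depending on a dedicated development team, which speeds up response considerably.
automagica turns a tedious, repetitive finance process into a stable, secure, self-improving automation loop, helping the company cut costs and raise efficiency.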
Runtime requirements
- Windows

Quick start

Automagica
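The Automagica project started in 2018 with the goal of developing open-source software so that robotic process automation would be accessible to everyone.
As successive versions of Automagica were released, more advanced features such as Wand and Portal required a service infrastructure to support more reliable robot execution, advanced services, and management and control capabilities.
As usage of these services grew, so did the cost of hosting and maintaining the service layer.
To take these important services to the next stage, we are pleased to announce that on 13 October 2020, Netcall plc, a leading provider of low-code, customer engagement, and contact centre software, acquired Oakwood Technologies BV (trading as "Automagica").
Netcall will integrate Automagica's RPA into its Liberty platform, providing a powerful combination of RPA, low-code, and customer engagement solutions.
Automagica Robot is no longer available under the terms of the AGPL3 license.
However, we will not stop supporting robots that are already deployed. These robots can keep using the Wand and OCR features free of charge for three months, starting today (13 October 2020).
Existing Automagica Portal users also retain free access to the platform for the next three months; during this period, we will offer users the option to migrate to the commercial service.
We sincerely thank everyone who has contributed to this project.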

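Components
The Automagica suite consists of the following components:
- Automagica Bot: the runtime/agent that executes automation tasks.
- Automagica Flow: a visual flow designer for building automations quickly, with full support for Python code.
- Automagica Wand: an AI-based UI element selector.
- Automagica Lab: a notebook-style automation development environment based on Jupyter Notebooks (requires Jupyter to be installed).
- Automagica Portal: used to manage bots, credentials, automations, logs, and more.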

Examples
Browser working together with Excel:
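A rough, library-independent sketch of this kind of browser-to-spreadsheet flow is shown below. It deliberately uses only the Python standard library rather than Automagica's own activities, so it should be read as an illustration of the idea, not as the project's API. The names `LinkCollector`, `collect_links`, and `links_to_csv` are hypothetical, the HTML is an inline sample instead of a live page, and CSV output stands in for an Excel workbook.

```python
import csv
import io
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Hypothetical helper: collects (text, href) pairs for every <a href=...> tag."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> tag currently open, if any
        self._text = []     # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


def collect_links(html):
    """Return all (text, href) pairs found in an HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


def links_to_csv(links):
    """Serialize the pairs as CSV text, a format Excel can open."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["text", "url"])
    writer.writerows(links)
    return buffer.getvalue()


if __name__ == "__main__":
    page = '<p><a href="https://example.com/a">First</a> and <a href="/b">Second</a></p>'
    print(links_to_csv(collect_links(page)))
```

In a real automation the HTML would come from a browser session and the rows would be written into a workbook; both steps are left out here because their exact Automagica calls are not shown in this document.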

Activities
Overview of all official Automagica activities:
| Process | Description |
|---|---|
| Cryptography | |
| Generate random Fernet key. | |
| Encrypt text with (Fernet) key. | |
| Decrypt bytes-like object to string with (Fernet) key | |
| Encrypt file with (Fernet) key. Note that file will be unusable unless unlocked with the same key. | |
| Decrypts file with (Fernet) key | |
| Generate key based on password and salt. If both password and salt are known the key can be regenerated. | |
| Generate hash from file | |
| Generate hash from text. Keep in mind that MD5 is not cryptographically secure. | |
| Random | |
| Random numbers can be integers (not a fractional number) or a float (fractional number). | |
| Generates all kinds of random data. Specifying locale changes format for some options | |
| Generates a random boolean (True or False) | |
| Generates a random name. Adding a locale adds a more common name in the specified locale. Provides first name and last name. | |
| Generates a random sentence. Specifying locale changes language and content based on locale. | |
| Generates a random address. Specifying locale changes random locations and streetnames based on locale. | |
| Generates a random beep, only works on Windows | |
| Generates a random date. | |
| Generates today's date. | |
| Generates a random UUID4 (universally unique identifier). | |
| Output | |
| Display custom OSD (on-screen display) message. Can be used to display a message for a limited amount of time. Can be used for illustration, debugging or as OSD. | |
| Print message in console. Can be used to display data in the Automagica Flow console | |
| Browser | |
| Open Chrome Browser | |
| Save all images on current page in the Browser | |
| Browse to URL. | |
| Find all elements by their text. Text does not need to match exactly, part of text is enough. | |
| Find all links on a webpage in the browser | |
| Find first link on a webpage | |
| Get all the raw body text from current webpage | |
| Highlight elements in yellow in the browser | |
| Quit the browser by exiting gracefully. One can also use the native 'quit' function | |
| Find all elements with specified xpath on a webpage in the browser. Can also use native 'find_elements_by_xpath' | |
| Find element with specified xpath on a webpage in the browser. Can also use native 'find_element_by_xpath' | |
| Find element with specified class on a webpage in the browser. Can also use native 'find_element_by_class_name' | |
| Find all elements with specified class on a webpage in the browser. Can also use native 'find_elements_by_class_name' function | |
| Find all elements with specified class and text on a webpage in the browser. | |
| Find element with specified id on a webpage in the browser. Can also use native 'find_element_by_id' function | |
| Switch to an iframe in the browser | |
| Credential Management | |
| Add a credential which stores credentials locally and securely. All parameters should be Unicode text. | |
| Delete a locally stored credential. All parameters should be Unicode text. | |
| Get a locally stored credential. All parameters should be Unicode text. | |
| FTP | |
| Can be used to automate activities for FTP | |
| Downloads a file from FTP server. Connection needs to be established first. | |
| Upload file to FTP server | |
| Generate a list of all the files in the FTP directory | |
| Check if FTP directory exists | |
| Create an FTP directory. Make sure that sufficient permissions are present. | |
| Keyboard | |
| Press and release an entered key. Make sure your keyboard is on US layout (standard QWERTY). If you are using this on macOS you might need to grant access to your terminal application. | |
| Press a combination of two or three keys simultaneously. Make sure your keyboard is on US layout (standard QWERTY). | |
| Simulate keystrokes. If an element ID is specified, text will be typed in a specific field or element based on the element ID (vision) by the recorder. | |
| Mouse | |
| Get the x and y pixel coordinates of current mouse position. | |
| Displays mouse position in an overlay. | |
| Clicks on an element based on the element ID (vision) | |
| Clicks on an element based on pixel position determined by x and y coordinates. To find coordinates one could use display_mouse_position(). | |
| Double clicks on a pixel position determined by x and y coordinates. | |
| Double clicks on an element based on the element ID (vision) | |
| Right clicks on an element based on the element ID (vision) | |
| Right clicks on an element based on the pixel position determined by x and y coordinates. | |
| Moves the pointer to an element based on the element ID (vision) | |
| Moves the pointer to an element based on the pixel position determined by x and y coordinates | |
| Moves the mouse an x- and y- distance relative to its current pixel position. | |
| Drags mouse to an element based on pixel position determined by x and y coordinates | |
| Drags mouse to an element based on the element ID (vision) | |
| Image | |
| Take a random square snippet from the current screen. Mainly for testing and/or development purposes. | |
| Take a screenshot of current screen. | |
| Folder Operations | |
| List all files in a folder (and subfolders) | |
| Creates new folder at the given path. | |
| Rename a folder | |
| Open a folder with the default explorer. | |
| Moves a folder from one place to another. | |
| Remove a folder including all subfolders and files. For the function to work optimally, all files and subfolders in the main target folder should be closed. | |
| Remove all contents from a folder. For the function to work optimally, all files and subfolders in the main target folder should be closed. | |
| Check whether folder exists or not, regardless if folder is empty or not. | |
| Copies a folder from one place to another. | |
| Zip folder and its contents. Creates a .zip file. | |
| Unzips a file or folder from a .zip file. | |
| Return most recent file in directory | |
| Delay | |
| Make the robot wait for a specified number of seconds. Note that this activity is blocking. This means that subsequent activities will not occur until the specified waiting time has expired. | |
| Waits until a folder exists. Note that this activity is blocking and will keep the system waiting. | |
| Word Application | |
| For this activity to work, Microsoft Office Word needs to be installed on the system. | |
| Save active Word document | |
| Save active Word document to a specific location | |
| Append text at end of Word document. | |
| Can be used, for example, to replace an arbitrary placeholder value, such as 'XXXX' in a template document. Take note that all strings are case sensitive. | |
| Read all the text from a document | |
| Export the document to PDF | |
| Export to HTML | |
| Set the footers of the document | |
| Set the headers of the document | |
| This closes Word, make sure to use 'save' or 'save_as' if you would like to save before quitting. | |
| Word File | |
| These activities can read, write and edit Word (docx) files without the need of having Word installed. | |
| Read all the text from the document | |
| Append text at the end of the document | |
| Save document | |
| Save file on specified path | |
| Set headers of Word document | |
| Replaces all occurrences of a placeholder text in the document with a replacement text. | |
| Outlook Application | |
| For this activity to work, Outlook needs to be installed on the system. | |
| Send an e-mail using Outlook | |
| Retrieve list of folders from Outlook | |
| Retrieve list of messages from Outlook | |
| Deletes e-mail messages in a certain folder. Can be specified by searching on subject, body or sender e-mail. | |
| Move e-mail messages in a certain folder. Can be specified by searching on subject, body or sender e-mail. | |
| Save all attachments from certain folder | |
| Retrieve all contacts | |
| Add a contact to Outlook contacts | |
| Close the Outlook application | |
| Excel Application | |
| For this activity to work, Microsoft Office Excel needs to be installed on the system. | |
| Adds a worksheet to the current workbook | |
| Activate a worksheet in the current Excel document by name | |
| Save the current workbook. Defaults to homedir | |
| Save the current workbook to a specific path | |
| Write to a specific cell in the currently active workbook and active worksheet | |
| Read a cell from the currently active workbook and active worksheet | |
| Write to a specific range in the currently active worksheet in the active workbook | |
| Read a range of cells from the currently active worksheet in the active workbook | |
| Run a macro by name from the currently active workbook | |
| Get names of all the worksheets in the currently active workbook | |
| Get table data from the currently active worksheet by name of the table | |
| Activate a particular range in the currently active workbook | |
| Activates the first empty cell going down | |
| Activates the first empty cell going right | |
| Activates the first empty cell going left | |
| Activates the first empty cell going up | |
| Write a formula to a particular cell | |
| Read the formula from a particular cell | |
| Inserts an empty row to the currently active worksheet | |
| Inserts an empty column in the currently active worksheet. Existing columns will shift to the right. | |
| Deletes a row from the currently active worksheet. Existing data will shift up. | |
| Delete a column from the currently active worksheet. Existing columns will shift to the left. | |
| Export to PDF | |
| Insert list of dictionaries as a table in Excel | |
| Read data from a worksheet as a list of lists | |
| This closes Excel, make sure to use 'save' or 'save_as' if you would like to save before quitting. | |
| Excel File | |
| This activity can read, write and edit Excel (xlsx) files without the need of having Excel installed. | |
| Export to pandas dataframe | |
| Activate a worksheet. By default the first worksheet is activated. | |
| Save file as | |
| Save file | |
| Write a cell based on column and row | |
| Read a cell based on column and row | |
| Add a worksheet | |
| Get worksheet names | |
| PowerPoint Application | |
| For this activity to work, PowerPoint needs to be installed on the system. | |
| Save PowerPoint Slidedeck | |
| Save PowerPoint Slidedeck | |
| Close PowerPoint | |
| Adds slides to a presentation | |
| Returns the number of slides | |
| Add text to a slide | |
| Delete a slide | |
| Can be used for example to replace arbitrary placeholder value in a PowerPoint. | |
| Export PowerPoint presentation to PDF file | |
| Export PowerPoint slides to separate image files | |
| Office 365 | |
| Send email Office Outlook 365 | |
| Salesforce | |
| Activity to make calls to Salesforce REST API. | |
| E-mail (SMTP) | |
| This function lets you send emails with an e-mail address. | |
| Windows OS | |
| Find a specific window based on the name, either a perfect match or a partial match. | |
| Create a RDP and login to Windows Remote Desktop | |
| Stop Windows Remote Desktop | |
| Sets the password for a Windows user. | |
| Validates a Windows user password if it is correct | |
| Locks Windows requiring login to continue. | |
| Checks if the current user is logged in and not on the lockscreen. Most automations do not work properly when the desktop is locked. | |
| Checks if the current user is locked out and on the lockscreen. Most automations do not work properly when the desktop is locked. | |
| Get current logged in user's username | |
| Set any text to the Windows clipboard. | |
| Get the text currently in the Windows clipboard | |
| Empty text from clipboard. Getting clipboard data after this should return None | |
| Run a VBScript file | |
| Make a beeping sound. Make sure your volume is up and you have hardware connected. | |
| Returns a list of all network interfaces of the current machine | |
| Enables a network interface by its name. | |
| Disables a network interface by its name. | |
| Returns the name of the printer selected as default | |
| Set the default printer. | |
| Removes a printer by its name | |
| Returns the status of a service on the machine | |
| Starts a Windows service | |
| Stops a Windows service | |
| Sets a window to foreground by its title. | |
| Retrieve the title of the current foreground window | |
| Closes a window by its title | |
| Maximizes a window by its title | |
| Restore a window by its title | |
| Minimizes a window by its title | |
| Resize a window by its title | |
| Hides a window from the user desktop by using its title | |
| Terminal | |
| Runs a command over SSH (Secure Shell) | |
| SNMP | |
| Retrieves data from an SNMP agent using SNMP (Simple Network Management Protocol) | |
| Active Directory | |
| Interface to Windows Active Directory through ADSI. Connects to the AD domain to which the machine is joined by default. | |
| Interface to Windows Active Directory through ADSI | |
| Utilities | |
| Returns the current user's home path | |
| Returns the current user's desktop path | |
| Returns the current user's default download path | |
| Opens file with default programs | |
| Set Windows desktop wallpaper with the specified image | |
| Download file from a URL | |
| System | |
| This activity will rename a file. If the desired name already exists in the folder, the file will not be renamed. Make sure to add the extension to specify the file type. | |
| If the new location already contains a file with the same name. | |
| Remove a file | |
| This function checks whether the file with the given path exists. | |
| Note that this activity is blocking and will keep the system waiting. | |
| Writes a list to a text (.txt) file. Every element of the entered list is written on a new line of the text file. | |
| This activity reads the content of a .txt file to a list and returns that list. Every new line from the .txt file becomes a new element of the list. The activity will not work if the entered path does not point to a .txt file. | |
| This activity reads a .txt file and returns the content | |
| Append a text line to a file and creates the file if it does not exist yet. | |
| Initialize text file | |
| Read a text file to a Python list-object | |
| Copies a file from one place to another. If the new location already contains a file with the same name, a random 4 character uid is added to the name. | |
| Get extension of a file | |
| Send a file to the default printer. Make sure to have a default printer set up. | |
| PDF | |
| Extracts the text from a PDF. This activity reads text from a pdf file. Can only read PDF files that contain a text layer. | |
| Merges multiple PDFs into a single file | |
| Extracts a particular range of a PDF to a separate file. | |
| Save a specific page from a PDF as an image | |
| Watermark a PDF | |
| System Monitoring | |
| Get average CPU load for all cores. | |
| Get the number of CPUs in the current system. | |
| Get frequency at which CPU currently operates. | |
| Get CPU statistics | |
| Get memory statistics | |
| Get disk statistics of main disk | |
| Get disk partition info | |
| Get most recent boot time | |
| Get uptime since last boot | |
| Image Processing | |
| Displays an image specified by the path variable on the default imaging program. | |
| Rotate an image | |
| Resizes the image specified by the path variable. | |
| Get width of image | |
| Get height of image | |
| Crops the image specified by path to a region determined by the box variable. | |
| Mirrors an image with a given path horizontally from left to right. | |
| Mirrors an image with a given path vertically from top to bottom. | |
| Process | |
| Use Windows Run to boot a process. Note this uses keyboard inputs, which means this process can be disrupted by interfering inputs | |
| Use subprocess to open a windows process | |
| Check if process is running. Validates if given process name (name) is currently running on the system. | |
| Get names of unique processes currently running on the system. | |
| Kills a process forcefully | |
| Optical Character Recognition (OCR) | |
| This activity extracts all text from the current screen or an image if a path is specified. | |
| This activity finds position (coordinates) of specified text on the current screen using OCR. | |
| This activity clicks on position (coordinates) of specified text on the current screen using OCR. | |
| This activity double clicks on position (coordinates) of specified text on the current screen using OCR. | |
| This activity right clicks on position (coordinates) of specified text on the current screen using OCR. | |
| UiPath | |
| This activity allows you to execute a process designed with the UiPath Studio. All console output from the Write Line activity will be printed as output. | |
| AutoIt | |
| This activity allows you to run an AutoIt script. If you use the ConsoleWrite function, the output will be presented to you. | |
| Alternative frameworks | |
| This activity allows you to run a Robot Framework test case. Console output of the test case will be printed. | |
| This activity allows you to run a Blue Prism process. | |
| This activity allows you to run an Automation Anywhere task. | |
| General | |
| Raises an exception | |
| SAP GUI | |
| Quits the SAP GUI completely and forcibly. | |
| Logs in to an SAP system on SAP GUI. | |
| Clicks on an identifier in the SAP GUI. | |
| Retrieves the text from a SAP GUI element. | |
| Sets the text of a SAP GUI element. | |
| Temporarily highlights a SAP GUI element | |
| Portal | |
| This activity creates a new job in the Automagica Portal for a given process. The bot performing this activity needs to be in the same team as the process it creates a job for. | |
| This activity retrieves a credential from the Automagica Portal. | |
| Vision | |
| This activity can be used to check if a certain element is visible on the screen. Note that this uses Automagica Portal and uses some advanced and fuzzy matching algorithms for finding identical elements. | |
| Wait for an element that is defined by the recorder | |
| This activity allows the bot to wait for an element to vanish. | |
| This activity allows the bot to detect and read the text of an element by using the Automagica Portal API with a provided sample ID. | |
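The Cryptography rows in the table mention generating a key from a password and salt (so the key can be regenerated when both are known) and hashing text. A minimal standard-library sketch of those two ideas follows. It is not Automagica's implementation: the function names `derive_key` and `hash_text`, the use of PBKDF2-HMAC-SHA256, and the iteration count are all assumptions made for illustration. The one fixed point is that a Fernet key is 32 bytes encoded as URL-safe base64.

```python
import base64
import hashlib


def derive_key(password, salt, iterations=100_000):
    """Derive a Fernet-shaped key (32 bytes, URL-safe base64) from password and salt.

    Assumption: PBKDF2-HMAC-SHA256 with this iteration count; the original
    activity may use different parameters.
    """
    raw = hashlib.pbkdf2_hmac(
        "sha256",
        password.encode("utf-8"),
        salt.encode("utf-8"),
        iterations,
        dklen=32,
    )
    return base64.urlsafe_b64encode(raw)


def hash_text(text, method="md5"):
    """Return the hex digest of text. MD5 is not cryptographically secure."""
    return hashlib.new(method, text.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    key = derive_key("my-password", "my-salt")
    # The same password and salt always reproduce the same key.
    print(key == derive_key("my-password", "my-salt"))
    print(hash_text("hello", method="sha256"))
```

Because the derivation is deterministic, only the password and salt need to be remembered; the key itself never has to be stored.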
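License
Copyright and license
All source code and other files in this repository, unless stated otherwise, are copyrighted by Netcall plc.
Commercial license
For more information on licensing, trials, and commercial use, see this page.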