[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-gorgonia--gorgonia":3,"tool-gorgonia--gorgonia":61},[4,18,26,36,44,53],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":17},4358,"openclaw","openclaw\u002Fopenclaw","OpenClaw 是一款专为个人打造的本地化 AI 助手，旨在让你在自己的设备上拥有完全可控的智能伙伴。它打破了传统 AI 助手局限于特定网页或应用的束缚，能够直接接入你日常使用的各类通讯渠道，包括微信、WhatsApp、Telegram、Discord、iMessage 等数十种平台。无论你在哪个聊天软件中发送消息，OpenClaw 都能即时响应，甚至支持在 macOS、iOS 和 Android 设备上进行语音交互，并提供实时的画布渲染功能供你操控。\n\n这款工具主要解决了用户对数据隐私、响应速度以及“始终在线”体验的需求。通过将 AI 部署在本地，用户无需依赖云端服务即可享受快速、私密的智能辅助，真正实现了“你的数据，你做主”。其独特的技术亮点在于强大的网关架构，将控制平面与核心助手分离，确保跨平台通信的流畅性与扩展性。\n\nOpenClaw 非常适合希望构建个性化工作流的技术爱好者、开发者，以及注重隐私保护且不愿被单一生态绑定的普通用户。只要具备基础的终端操作能力（支持 macOS、Linux 及 Windows WSL2），即可通过简单的命令行引导完成部署。如果你渴望拥有一个懂你",349277,3,"2026-04-06T06:32:30",[13,14,15,16],"Agent","开发框架","图像","数据工具","ready",{"id":19,"name":20,"github_repo":21,"description_zh":22,"stars":23,"difficulty_score":10,"last_commit_at":24,"category_tags":25,"status":17},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,"2026-04-05T11:01:52",[14,15,13],{"id":27,"name":28,"github_repo":29,"description_zh":30,"stars":31,"difficulty_score":32,"last_commit_at":33,"category_tags":34,"status":17},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 
代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",144730,2,"2026-04-07T23:26:32",[14,13,35],"语言模型",{"id":37,"name":38,"github_repo":39,"description_zh":40,"stars":41,"difficulty_score":32,"last_commit_at":42,"category_tags":43,"status":17},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",107888,"2026-04-06T11:32:50",[14,15,13],{"id":45,"name":46,"github_repo":47,"description_zh":48,"stars":49,"difficulty_score":32,"last_commit_at":50,"category_tags":51,"status":17},4721,"markitdown","microsoft\u002Fmarkitdown","MarkItDown 是一款由微软 AutoGen 团队打造的轻量级 Python 工具，专为将各类文件高效转换为 Markdown 格式而设计。它支持 PDF、Word、Excel、PPT、图片（含 OCR）、音频（含语音转录）、HTML 乃至 YouTube 链接等多种格式的解析，能够精准提取文档中的标题、列表、表格和链接等关键结构信息。\n\n在人工智能应用日益普及的今天，大语言模型（LLM）虽擅长处理文本，却难以直接读取复杂的二进制办公文档。MarkItDown 恰好解决了这一痛点，它将非结构化或半结构化的文件转化为模型“原生理解”且 Token 效率极高的 Markdown 格式，成为连接本地文件与 AI 分析 pipeline 的理想桥梁。此外，它还提供了 MCP（模型上下文协议）服务器，可无缝集成到 Claude Desktop 等 LLM 应用中。\n\n这款工具特别适合开发者、数据科学家及 AI 研究人员使用，尤其是那些需要构建文档检索增强生成（RAG）系统、进行批量文本分析或希望让 AI 
助手直接“阅读”本地文件的用户。虽然生成的内容也具备一定可读性，但其核心优势在于为机器",93400,"2026-04-06T19:52:38",[52,14],"插件",{"id":54,"name":55,"github_repo":56,"description_zh":57,"stars":58,"difficulty_score":10,"last_commit_at":59,"category_tags":60,"status":17},4487,"LLMs-from-scratch","rasbt\u002FLLMs-from-scratch","LLMs-from-scratch 是一个基于 PyTorch 的开源教育项目，旨在引导用户从零开始一步步构建一个类似 ChatGPT 的大型语言模型（LLM）。它不仅是同名技术著作的官方代码库，更提供了一套完整的实践方案，涵盖模型开发、预训练及微调的全过程。\n\n该项目主要解决了大模型领域“黑盒化”的学习痛点。许多开发者虽能调用现成模型，却难以深入理解其内部架构与训练机制。通过亲手编写每一行核心代码，用户能够透彻掌握 Transformer 架构、注意力机制等关键原理，从而真正理解大模型是如何“思考”的。此外，项目还包含了加载大型预训练权重进行微调的代码，帮助用户将理论知识延伸至实际应用。\n\nLLMs-from-scratch 特别适合希望深入底层原理的 AI 开发者、研究人员以及计算机专业的学生。对于不满足于仅使用 API，而是渴望探究模型构建细节的技术人员而言，这是极佳的学习资源。其独特的技术亮点在于“循序渐进”的教学设计：将复杂的系统工程拆解为清晰的步骤，配合详细的图表与示例，让构建一个虽小但功能完备的大模型变得触手可及。无论你是想夯实理论基础，还是为未来研发更大规模的模型做准备",90106,"2026-04-06T11:19:32",[35,15,13,14],{"id":62,"github_repo":63,"name":64,"description_en":65,"description_zh":66,"ai_summary_zh":66,"readme_en":67,"readme_zh":68,"quickstart_zh":69,"use_case_zh":70,"hero_image_url":71,"owner_login":64,"owner_name":72,"owner_avatar_url":73,"owner_bio":74,"owner_company":75,"owner_location":75,"owner_email":75,"owner_twitter":75,"owner_website":76,"owner_url":77,"languages":78,"stars":98,"forks":99,"last_commit_at":100,"license":101,"difficulty_score":102,"env_os":103,"env_gpu":104,"env_ram":105,"env_deps":106,"category_tags":112,"github_topics":113,"view_count":32,"oss_zip_url":75,"oss_zip_packed_at":75,"status":17,"created_at":129,"updated_at":130,"faqs":131,"releases":160},5341,"gorgonia\u002Fgorgonia","gorgonia","Gorgonia is a library that helps facilitate machine learning in Go.","Gorgonia 是一个专为 Go 语言打造的机器学习库，旨在让开发者能够轻松编写和计算涉及多维数组的数学方程。它的核心设计理念类似于 Python 界的 Theano 和 TensorFlow，但原生支持 Go 生态，解决了传统机器学习流程中“实验用 Python、部署需重写为 C++\"的割裂痛点。使用 Gorgonia，团队可以全程采用熟悉的 Go 语言栈，从模型研发到生产部署无缝衔接，大幅降低工程复杂度与维护成本。\n\n这款工具特别适合已经在使用 Go 进行后端开发的工程师，以及希望探索非标准深度学习算法（如进化算法、新赫布学习等）的研究人员。它无需切换编程语言即可构建高性能的生产级机器学习系统。\n\n在技术特性上，Gorgonia 
支持自动微分、符号微分、梯度下降优化及数值稳定性处理，并提供了丰富的函数来辅助构建神经网络。其计算性能出色，CPU 实现速度可与 PyTorch 和 TensorFlow 媲美，同时支持 CUDA\u002FGPGPU 加速，并具备未来扩展分布式计算的能力。如果你追求编译部署的简洁性，又需要强大的图计算能力，Gorgonia 是一个值得尝试的专业选择。","![Logo](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fgorgonia_gorgonia_readme_dceeb70e38b4.png)\n\n[![GoDoc](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fgorgonia_gorgonia_readme_ff803135673c.png)](https:\u002F\u002Fgodoc.org\u002Fgorgonia.org\u002Fgorgonia) [![GitHub version](https:\u002F\u002Fbadge.fury.io\u002Fgh\u002Fgorgonia%2Fgorgonia.svg)](https:\u002F\u002Fbadge.fury.io\u002Fgh\u002Fgorgonia%2Fgorgonia) \n![Build and Tests](https:\u002F\u002Fgithub.com\u002Fgorgonia\u002Fgorgonia\u002Fworkflows\u002FBuild%20and%20Tests%20on%20Linux\u002Famd64\u002Fbadge.svg)\n[![codecov](https:\u002F\u002Fcodecov.io\u002Fgh\u002Fgorgonia\u002Fgorgonia\u002Fbranch\u002Fmaster\u002Fgraph\u002Fbadge.svg)](https:\u002F\u002Fcodecov.io\u002Fgh\u002Fgorgonia\u002Fgorgonia)\n[![Go Report Card](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fgorgonia_gorgonia_readme_c1743a0ee95f.png)](https:\u002F\u002Fgoreportcard.com\u002Freport\u002Fgorgonia.org\u002Fgorgonia) [![unstable](http:\u002F\u002Fbadges.github.io\u002Fstability-badges\u002Fdist\u002Funstable.svg)](http:\u002F\u002Fgithub.com\u002Fbadges\u002Fstability-badges)\n\n#\n\nGorgonia is a library that helps facilitate machine learning in Go. Write and evaluate mathematical equations involving multidimensional arrays easily. If this sounds like [Theano](http:\u002F\u002Fdeeplearning.net\u002Fsoftware\u002Ftheano\u002F) or [TensorFlow](https:\u002F\u002Fwww.tensorflow.org\u002F), it's because the idea is quite similar. 
Specifically, the library is pretty low-level, like Theano, but has higher goals like Tensorflow.\n\nGorgonia:\n\n* Can perform automatic differentiation\n* Can perform symbolic differentiation\n* Can perform gradient descent optimizations\n* Can perform numerical stabilization\n* Provides a number of convenience functions to help create neural networks\n* Is fairly quick (comparable to Theano and TensorFlow speed)\n* Supports CUDA\u002FGPGPU computation (OpenCL not yet supported, send a pull request)\n* Will support distributed computing\n\n# Goals #\n\nThe primary goal for Gorgonia is to be a *highly performant* machine learning\u002Fgraph computation-based library that can scale across multiple machines. It should bring the appeal of Go (simple compilation and deployment process) to the ML world. It's a long way from there currently, however, the baby steps are already there.\n\nThe secondary goal for Gorgonia is to provide a platform for the exploration of non-standard deep-learning and neural network-related things. This includes things like neo-hebbian learning, corner-cutting algorithms, evolutionary algorithms, and the like.\n\n# Why Use Gorgonia? #\n\nThe main reason to use Gorgonia is developer comfort. If you're using a Go stack extensively, now you have access to the ability to create production-ready machine learning systems in an environment that you are already familiar with and comfortable with.\n\nML\u002FAI at large is usually split into two stages: the experimental stage where one builds various models, tests, and retests; and the deployed state where a model after being tested and played with, is deployed. 
This necessitates different roles like data scientist and data engineer.\n\nTypically the two phases have different tools: Python ([PyTorch](http:\u002F\u002Fpytorch.org\u002F), etc) is commonly used for the experimental stage, and then the model is rewritten in some more performant language like C++ (using [dlib](http:\u002F\u002Fdlib.net\u002Fml.html), [mlpack](http:\u002F\u002Fmlpack.org) etc). Of course, nowadays the gap is closing and people frequently share the tools between them. Tensorflow is one such tool that bridges the gap.\n\nGorgonia aims to do the same but for the Go environment. Gorgonia is currently fairly performant - its speeds are comparable to PyTorch's and Tensorflow's  CPU implementations. GPU implementations are a bit finicky to compare due to the heavy CGO tax, but rest assured that this is an area of active improvement.\n\n# Getting started\n\n## Installation #\n\nThe package is go-gettable: `go get -u gorgonia.org\u002Fgorgonia`.\n\nGorgonia is compatible with Go modules.\n\n## Documentation\n\nUp-to-date documentation, references, and tutorials are present on the official Gorgonia website at [https:\u002F\u002Fgorgonia.org](https:\u002F\u002Fgorgonia.org).\n\n## Keeping Updated\n\nGorgonia's project has a [Slack channel on gopherslack](https:\u002F\u002Fgophers.slack.com\u002Fmessages\u002Fgorgonia\u002F), as well as a [Twitter account](https:\u002F\u002Ftwitter.com\u002FgorgoniaML). Official updates and announcements will be posted to those two sites.\n\n## Usage\n\nGorgonia works by creating a computation graph and then executing it. Think of it as a programming language, but is limited to mathematical functions, and has no branching capability (no if\u002Fthen or loops). In fact, this is the dominant paradigm that the user should be used to thinking about. 
The computation graph is an [AST](http:\u002F\u002Fen.wikipedia.org\u002Fwiki\u002FAbstract_syntax_tree).\n\nMicrosoft's [CNTK](https:\u002F\u002Fgithub.com\u002FMicrosoft\u002FCNTK), with its BrainScript, is perhaps the best at exemplifying the idea that building a computation graph and running the computation graphs are different things and that the user should be in different modes of thought when going about them.\n\nWhilst Gorgonia's implementation doesn't enforce the separation of thought as far as CNTK's BrainScript does, the syntax does help a little bit.\n\nHere's an example - say you want to define a math expression `z = x + y`. Here's how you'd do it:\n\n[embedmd]:# (example_basic_test.go)\n```Go\npackage gorgonia_test\n\nimport (\n\t\"fmt\"\n\t\"log\"\n\n\t. \"gorgonia.org\u002Fgorgonia\"\n)\n\n\u002F\u002F Basic example of representing mathematical equations as graphs.\n\u002F\u002F\n\u002F\u002F In this example, we want to represent the following equation\n\u002F\u002F\t\tz = x + y\nfunc Example_basic() {\n\tg := NewGraph()\n\n\tvar x, y, z *Node\n\tvar err error\n\n\t\u002F\u002F define the expression\n\tx = NewScalar(g, Float64, WithName(\"x\"))\n\ty = NewScalar(g, Float64, WithName(\"y\"))\n\tif z, err = Add(x, y); err != nil {\n\t\tlog.Fatal(err)\n\t}\n\n\t\u002F\u002F create a VM to run the program on\n\tmachine := NewTapeMachine(g)\n\tdefer machine.Close()\n\n\t\u002F\u002F set initial values then run\n\tLet(x, 2.0)\n\tLet(y, 2.5)\n\tif err = machine.RunAll(); err != nil {\n\t\tlog.Fatal(err)\n\t}\n\n\tfmt.Printf(\"%v\", z.Value())\n\t\u002F\u002F Output: 4.5\n}\n```\n\nYou might note that it's a little more verbose than other packages of similar nature. For example, instead of compiling to a callable function, Gorgonia specifically compiles into a `*program` which requires a `*TapeMachine` to run. 
It also requires a manual `Let(...)` call.\n\nThe author would like to contend that this is a Good Thing - to shift one's thinking to machine-based thinking. It helps a lot in figuring out where things might go wrong.\n\nAdditionally, there is no support for branching - that is to say, there are no conditionals (if\u002Felse) or loops. The aim is not to build a Turing-complete computer.\n\n---\nMore examples are present in the `example` subfolder of the project, and step-by-step tutorials are present on the [main website](https:\u002F\u002Fgorgonia.org\u002Ftutorials\u002F)\n\n## Using CUDA ##\n\nGorgonia comes with CUDA support out of the box.\nPlease see the reference documentation about how CUDA works on [the Gorgonia.org](https:\u002F\u002Fgorgonia.org\u002Freference\u002Fcuda\u002F) website, or jump to the [tutorial](https:\u002F\u002Fgorgonia.org\u002Ftutorials\u002Fmnist-cuda\u002F).\n\n# About Gorgonia's development process\n\n## Versioning ##\n\nWe use [semver 2.0.0](http:\u002F\u002Fsemver.org\u002F) for our versioning. Before 1.0, Gorgonia's APIs are expected to change quite a bit. API is defined by the exported functions, variables, and methods. For the developers' sanity, there are minor differences to SemVer that we will apply before version 1.0. They are enumerated below:\n\n* The MINOR number will be incremented every time there is a deleterious break in API. This means any deletion or any change in function signature or interface methods will lead to a change in the MINOR number.\n* Additive changes will NOT change the MINOR version number before version 1.0. This means that if new functionality were added that does not break the way you use Gorgonia, there would not be an increment in the MINOR version. There will be an increment in the PATCH version.\n\n### API Stability #\nGorgonia's API is, as of right now, not considered stable. 
It will be stable from version 1.0 forward.\n\n\n## Go Version Support ##\n\nGorgonia supports 2 versions below the Master branch of Go. This means Gorgonia will support the current released version of Go, and up to 4 previous versions - providing something doesn't break. Where possible a shim will be provided (for things like new `sort` APIs or `math\u002Fbits` which came out in Go 1.9).\n\nThe current version of Go is 1.13.1. The earliest version Gorgonia supports is Go 1.11.x but Gonum supports only 1.12+. Therefore, the minimum Go version to run the master branch is Go > 1.12.\n\n## Hardware and OS supported ##\n\nGorgonia runs on :\n- linux\u002FAMD64\n- linux\u002FARM7\n- linux\u002FARM64\n- win32\u002FAMD64\n- darwin\u002FAMD64\n- freeBSD\u002FAMD64\n\nIf you have tested Gorgonia on other platforms, please update this list.\n\n## Hardware acceleration\n\nGorgonia uses some pure assembler instructions to accelerate some mathematical operations. Unfortunately, only amd64 is supported.\n\n\n# Contributing #\n\nObviously, since you are most probably reading this on Github, Github will form the major part of the workflow for contributing to this package.\n\nSee also: [CONTRIBUTING.md](CONTRIBUTING.md)\n\n\n## Contributors and Significant Contributors ##\nAll contributions are welcome. However, there is a new class of contributors, called Significant Contributors.\n\nA Significant Contributor has shown *a deep understanding* of how the library works and\u002For its environs.  
Here are examples of what constitutes a Significant Contribution:\n\n* Wrote significant amounts of documentation on **why**\u002Fthe mechanics of particular functions\u002Fmethods and how the different parts affect one another\n* Wrote code and tests around the more intricately connected parts of Gorgonia\n* Wrote code and tests, and had at least 5 pull requests accepted\n* Provided expert analysis on parts of the package (for example, you may be a floating point operations expert who optimized one function)\n* Answered at least 10 support questions.\n\nThe Significant Contributors list will be updated once a month (if anyone even uses Gorgonia that is).\n\n# How To Get Support #\nThe best way to get support right now is to open a [ticket on Github](https:\u002F\u002Fgithub.com\u002Fgorgonia\u002Fgorgonia\u002Fissues\u002Fnew).\n\n# Frequently Asked Questions #\n\n### Why are there seemingly random `runtime.GC()` calls in the tests? ###\n\nThe answer to this is simple - the design of the package uses CUDA in a particular way: specifically, a CUDA device and context are tied to a `VM`, instead of at the package level. This means for every `VM` created, a different CUDA context is created per device per `VM`. This way all the operations will play nicely with other applications that may be using CUDA (this needs to be stress-tested, however).\n\nThe CUDA contexts are only destroyed when the `VM` gets garbage collected (with the help of a finalizer function). In the tests, about 100 `VM`s get created, and garbage collection for the most part can be considered random. This leads to cases where the GPU runs out of memory as there are too many contexts being used.\n\nTherefore at the end of any tests that may use GPU, a `runtime.GC()` call is made to force garbage collection, freeing GPU memory.\n\nIn production, one is unlikely to start that many `VM`s, therefore it's not a problem. 
If there is, open a ticket on GitHub, and we'll look into adding a `Finish()` method for the `VM`s.\n\n\n# Licence #\n\nGorgonia is licensed under a variant of Apache 2.0. It's the same as the Apache 2.0 Licence, except not being able to commercially profit directly from the package unless you're a Significant Contributor (for example, providing commercial support for the package). It's perfectly fine to profit directly from a derivative of Gorgonia (for example, if you use Gorgonia as a library in your product)\n\nEveryone is still allowed to use Gorgonia for commercial purposes (for example: using it in software for your business).\n\n## Dependencies ##\n\nThere are very few dependencies that Gorgonia uses - and they're all pretty stable, so as of now there isn't a need for vendoring tools. These are the list of external packages that Gorgonia calls, ranked in order of reliance that this package has (sub-packages are omitted):\n\n|Package|Used For|Vitality|Notes|Licence|\n|-------|--------|--------|-----|-------|\n|[gonum\u002Fgraph](https:\u002F\u002Fgithub.com\u002Fgonum\u002Fgonum\u002Ftree\u002Fmaster\u002Fgraph)| Sorting `*ExprGraph`| Vital. Removal means Gorgonia will not work | Development of Gorgonia is committed to keeping up with the most updated version|[gonum license](https:\u002F\u002Fgithub.com\u002Fgonum\u002Flicense) (MIT\u002FBSD-like)|\n|[gonum\u002Fblas](https:\u002F\u002Fgithub.com\u002Fgonum\u002Fgonum\u002Ftree\u002Fmaster\u002Fblas)|Tensor subpackage linear algebra operations|Vital. 
Removal means Gorgonia will not work|Development of Gorgonia is committed to keeping up with the most updated version|[gonum license](https:\u002F\u002Fgithub.com\u002Fgonum\u002Flicense) (MIT\u002FBSD-like)|\n|[cu](https:\u002F\u002Fgorgonia.org\u002Fcu)| CUDA drivers | Needed for CUDA operations | Same maintainer as Gorgonia | MIT\u002FBSD-like|\n|[math32](https:\u002F\u002Fgithub.com\u002Fchewxy\u002Fmath32)|`float32` operations|Can be replaced by `float32(math.XXX(float64(x)))`|Same maintainer as Gorgonia, same API as the built-in `math` package|MIT\u002FBSD-like|\n|[hm](https:\u002F\u002Fgithub.com\u002Fchewxy\u002Fhm)|Type system for Gorgonia|Gorgonia's graphs are pretty tightly coupled with the type system | Same maintainer as Gorgonia | MIT\u002FBSD-like|\n|[vecf64](https:\u002F\u002Fgorgonia.org\u002Fvecf64)| optimized `[]float64` operations | Can be generated in the `tensor\u002Fgenlib` package. However, plenty of optimizations have been made\u002Fwill be made | Same maintainer as Gorgonia | MIT\u002FBSD-like|\n|[vecf32](https:\u002F\u002Fgorgonia.org\u002Fvecf32)| optimized `[]float32` operations | Can be generated in the `tensor\u002Fgenlib` package. However, plenty of optimizations have been made\u002Fwill be made | Same maintainer as Gorgonia | MIT\u002FBSD-like|\n|[set](https:\u002F\u002Fgithub.com\u002Fxtgo\u002Fset)|Various set operations|Can be easily replaced|Stable API for the past 1 year|[set licence](https:\u002F\u002Fgithub.com\u002Fxtgo\u002Fset\u002Fblob\u002Fmaster\u002FLICENSE) (MIT\u002FBSD-like)|\n|[gographviz](https:\u002F\u002Fgithub.com\u002Fawalterschulze\u002Fgographviz)|Used for printing graphs|Graph printing is only vital to debugging. 
Gorgonia can survive without, but with a major (but arguably nonvital) feature loss|Last update 12th April 2017|[gographviz license](https:\u002F\u002Fgithub.com\u002Fawalterschulze\u002Fgographviz\u002Fblob\u002Fmaster\u002FLICENSE) (Apache 2.0)|\n|[rng](https:\u002F\u002Fgithub.com\u002Fleesper\u002Fgo_rng)|Used to implement helper functions to generate initial weights|Can be replaced fairly easily. Gorgonia can do without the convenience functions too||[rng license](https:\u002F\u002Fgithub.com\u002Fleesper\u002Fgo_rng\u002Fblob\u002Fmaster\u002FLICENSE) (Apache 2.0)|\n|[errors](https:\u002F\u002Fgithub.com\u002Fpkg\u002Ferrors)|Error wrapping|Gorgonia won't die without it. In fact Gorgonia has also used [goerrors\u002Ferrors](https:\u002F\u002Fgithub.com\u002Fgo-errors\u002Ferrors) in the past.|Stable API for the past 6 months|[errors licence](https:\u002F\u002Fgithub.com\u002Fpkg\u002Ferrors\u002Fblob\u002Fmaster\u002FLICENSE) (MIT\u002FBSD-like)|\n|[gonum\u002Fmat](http:\u002F\u002Fgithub.com\u002Fgonum\u002Fgonum)|Compatibility between `Tensor` and Gonum's Matrix|Development of Gorgonia is committed to keeping up with the most updated version||[gonum license](https:\u002F\u002Fgithub.com\u002Fgonum\u002Flicense) (MIT\u002FBSD-like)|\n|[testify\u002Fassert](https:\u002F\u002Fgithub.com\u002Fstretchr\u002Ftestify)|Testing|Can do without but will be a massive pain in the ass to test||[testify license](https:\u002F\u002Fgithub.com\u002Fstretchr\u002Ftestify\u002Fblob\u002Fmaster\u002FLICENSE) (MIT\u002FBSD-like)|\n\n\n## Various Other Copyright Notices ##\n\nThese are the packages and libraries that inspired and were adapted from in the process of writing Gorgonia (the Go packages that were used were already declared above):\n\n| Source | How it's Used | Licence |\n|------|---|-------|\n| Numpy  | Inspired large portions. Directly adapted algorithms for a few methods (explicitly labeled in the docs) | MIT\u002FBSD-like. 
[Numpy Licence](https:\u002F\u002Fgithub.com\u002Fnumpy\u002Fnumpy\u002Fblob\u002Fmaster\u002FLICENSE.txt) |\n| Theano | Inspired large portions. (Unsure: number of directly adapted algorithms) | MIT\u002FBSD-like [Theano's license](http:\u002F\u002Fdeeplearning.net\u002Fsoftware\u002Ftheano\u002FLICENSE.html) |\n| Caffe | `im2col` and `col2im` directly taken from Caffe. Convolution algorithms inspired by the original Caffe methods | [Caffe Licence](https:\u002F\u002Fgithub.com\u002FBVLC\u002Fcaffe\u002Fblob\u002Fmaster\u002FLICENSE)\n","![Logo](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fgorgonia_gorgonia_readme_dceeb70e38b4.png)\n\n[![GoDoc](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fgorgonia_gorgonia_readme_ff803135673c.png)](https:\u002F\u002Fgodoc.org\u002Fgorgonia.org\u002Fgorgonia) [![GitHub version](https:\u002F\u002Fbadge.fury.io\u002Fgh\u002Fgorgonia%2Fgorgonia.svg)](https:\u002F\u002Fbadge.fury.io\u002Fgh\u002Fgorgonia%2Fgorgonia) \n![Build and Tests](https:\u002F\u002Fgithub.com\u002Fgorgonia\u002Fgorgonia\u002Fworkflows\u002FBuild%20and%20Tests%20on%20Linux\u002Famd64\u002Fbadge.svg)\n[![codecov](https:\u002F\u002Fcodecov.io\u002Fgh\u002Fgorgonia\u002Fgorgonia\u002Fbranch\u002Fmaster\u002Fgraph\u002Fbadge.svg)](https:\u002F\u002Fcodecov.io\u002Fgh\u002Fgorgonia\u002Fgorgonia)\n[![Go Report Card](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fgorgonia_gorgonia_readme_c1743a0ee95f.png)](https:\u002F\u002Fgoreportcard.com\u002Freport\u002Fgorgonia.org\u002Fgorgonia) [![unstable](http:\u002F\u002Fbadges.github.io\u002Fstability-badges\u002Fdist\u002Funstable.svg)](http:\u002F\u002Fgithub.com\u002Fbadges\u002Fstability-badges)\n\n#\n\nGorgonia 是一个帮助在 Go 语言中进行机器学习的库。你可以轻松地编写和评估涉及多维数组的数学方程。如果这听起来像 [Theano](http:\u002F\u002Fdeeplearning.net\u002Fsoftware\u002Ftheano\u002F) 或 [TensorFlow](https:\u002F\u002Fwww.tensorflow.org\u002F)，那是因为它的理念非常相似。具体来说，这个库类似于 Theano，属于比较底层的实现，但目标却更接近 TensorFlow。\n\nGorgonia：\n\n* 可以进行自动微分\n* 
可以进行符号微分\n* 可以执行梯度下降优化\n* 可以进行数值稳定化处理\n* 提供了许多便捷函数来帮助构建神经网络\n* 性能相当快（与 Theano 和 TensorFlow 的速度相当）\n* 支持 CUDA\u002FGPGPU 计算（OpenCL 尚未支持，欢迎提交 Pull Request）\n* 将来会支持分布式计算\n\n# 目标 #\n\nGorgonia 的首要目标是成为一个**高性能**的机器学习\u002F图计算库，能够在多台机器上扩展。它旨在将 Go 语言的优势（简单的编译和部署流程）引入机器学习领域。目前距离这一目标还有很长的路要走，但我们已经迈出了第一步。\n\nGorgonia 的次要目标是为探索非传统的深度学习和神经网络相关技术提供平台。这包括新赫布理论学习、剪枝算法、进化算法等。\n\n# 为什么使用 Gorgonia？#\n\n使用 Gorgonia 的主要原因在于开发者的舒适度。如果你已经在项目中大量使用 Go 技术栈，现在你就可以在一个熟悉且舒适的环境中构建生产级的机器学习系统了。\n\n通常，机器学习和人工智能的工作流程分为两个阶段：实验阶段，即构建各种模型并反复测试；以及部署阶段，即在经过充分测试和调整后，将模型投入实际应用。这两个阶段往往需要不同的角色，例如数据科学家和数据工程师。\n\n传统上，这两个阶段使用的工具并不相同：Python（如 PyTorch 等）常用于实验阶段，而模型随后会被重写成性能更高的语言，比如 C++（使用 dlib、mlpack 等）。当然，如今这种界限正在逐渐模糊，人们也经常共享工具。TensorFlow 就是一个弥合这一差距的工具。\n\nGorgonia 的目标则是为 Go 生态系统实现同样的功能。目前，Gorgonia 的性能已经相当不错——其 CPU 实现的速度可以与 PyTorch 和 TensorFlow 的 CPU 版本相媲美。由于 CGO 调用的开销较大，GPU 实现的性能对比稍显复杂，不过请放心，这方面仍在积极改进中。\n\n# 快速入门\n\n## 安装 ##\n\n可以通过 go get 命令安装：`go get -u gorgonia.org\u002Fgorgonia`。\n\nGorgonia 兼容 Go 模块。\n\n## 文档 ##\n\n最新的文档、参考和教程都可在 Gorgonia 官网 [https:\u002F\u002Fgorgonia.org](https:\u002F\u002Fgorgonia.org) 上找到。\n\n## 获取最新信息 ##\n\nGorgonia 项目有一个 [Slack 频道](https:\u002F\u002Fgophers.slack.com\u002Fmessages\u002Fgorgonia\u002F) 和一个 [Twitter 账号](https:\u002F\u002Ftwitter.com\u002FgorgoniaML)。官方更新和公告都会发布在这两个平台上。\n\n## 使用方法 ##\n\nGorgonia 的工作原理是先创建一个计算图，然后执行它。你可以把它想象成一种编程语言，但它仅限于数学函数，没有分支结构（没有 if\u002Fthen 或循环）。事实上，这就是用户需要习惯的主要思维方式。计算图本质上是一个 [抽象语法树](http:\u002F\u002Fen.wikipedia.org\u002Fwiki\u002FAbstract_syntax_tree)。\n\n微软的 [CNTK](https:\u002F\u002Fgithub.com\u002FMicrosoft\u002FCNTK)，通过其 BrainScript，很好地说明了构建计算图和运行计算图是两件不同的事情，用户在思考时也需要采用不同的模式。\n\n虽然 Gorgonia 的实现并没有像 CNTK 的 BrainScript 那样严格区分这两种思维模式，但其语法确实提供了一些帮助。\n\n下面是一个例子：假设你想定义一个数学表达式 `z = x + y`，以下是实现方式：\n\n[embedmd]:# (example_basic_test.go)\n```Go\npackage gorgonia_test\n\nimport (\n\t\"fmt\"\n\t\"log\"\n\n\t. 
\"gorgonia.org\u002Fgorgonia\"\n)\n\n\u002F\u002F 将数学方程表示为计算图的基本示例。\n\u002F\u002F\n\u002F\u002F 在这个示例中，我们想要表示以下方程：\n\u002F\u002F\t\tz = x + y\nfunc Example_basic() {\n\tg := NewGraph()\n\n\tvar x, y, z *Node\n\tvar err error\n\n\t\u002F\u002F 定义表达式\n\tx = NewScalar(g, Float64, WithName(\"x\"))\n\ty = NewScalar(g, Float64, WithName(\"y\"))\n\tif z, err = Add(x, y); err != nil {\n\t\tlog.Fatal(err)\n\t}\n\n\t\u002F\u002F 创建一个 VM 来运行程序\n\tmachine := NewTapeMachine(g)\n\tdefer machine.Close()\n\n\t\u002F\u002F 设置初始值并运行\n\tLet(x, 2.0)\n\tLet(y, 2.5)\n\tif err = machine.RunAll(); err != nil {\n\t\tlog.Fatal(err)\n\t}\n\n\tfmt.Printf(\"%v\", z.Value())\n\t\u002F\u002F 输出: 4.5\n}\n```\n\n你可能会注意到，与其他类似库相比，这里的代码略显冗长。例如，Gorgonia 不会直接编译成可调用的函数，而是专门编译成一个 `*program`，需要借助 `*TapeMachine` 来运行。此外，还需要手动调用 `Let(...)` 函数。\n\n作者认为，这种设计其实是一件好事——它可以帮助开发者转变思维方式，从传统的编程思维转向面向机器的思维。这有助于更好地理解可能出现的问题所在。\n\n另外，Gorgonia 不支持分支结构，也就是说，没有条件语句（if\u002Felse）或循环。它的设计初衷并不是要构建一台图灵完备的计算机。\n---\n更多示例可以在项目的 `example` 子文件夹中找到，逐步教程则可在 [官方网站](https:\u002F\u002Fgorgonia.org\u002Ftutorials\u002F) 上查阅。\n\n## 使用 CUDA ##\n\nGorgonia 默认支持 CUDA。\n有关 CUDA 的详细使用方法，请参阅 Gorgonia 官网上的参考文档 [https:\u002F\u002Fgorgonia.org\u002Freference\u002Fcuda\u002F](https:\u002F\u002Fgorgonia.org\u002Freference\u002Fcuda\u002F)，或者直接阅读 [CUDA 教程](https:\u002F\u002Fgorgonia.org\u002Ftutorials\u002Fmnist-cuda\u002F)。\n\n# 关于 Gorgonia 的开发过程\n\n## 版本管理 ##\n\n我们采用 [语义化版本 2.0.0](http:\u002F\u002Fsemver.org\u002F) 进行版本管理。在 1.0 版本之前，Gorgonia 的 API 预计会有较大变化。API 定义为公开的函数、变量和方法。为了开发者的便利，在 1.0 版本之前，我们会对语义化版本规范做一些小调整，具体如下：\n\n* 每次出现破坏性的 API 变更时，MINOR 版本号将递增。这意味着任何函数签名或接口方法的删除或更改都会导致 MINOR 版本号的增加。\n* 在 1.0 版本之前，新增功能不会改变 MINOR 版本号。也就是说，如果添加了新功能但不破坏现有使用方式，则 MINOR 版本号不会递增，而是 PATCH 版本号会递增。\n\n### API 稳定性 #\n目前，Gorgonia 的 API 尚未被认为是稳定的。从 1.0 版本开始，API 将被视为稳定。\n\n## Go 版本支持 ##\n\nGorgonia 支持 Go 主分支以下的两个版本。这意味着 Gorgonia 将支持当前发布的 Go 版本，以及最多往前追溯的四个版本——前提是不会出现兼容性问题。在可能的情况下，我们会提供适配层（例如针对 Go 1.9 引入的新 `sort` API 或 `math\u002Fbits` 包）。\n\n当前的 Go 版本是 1.13.1。Gorgonia 
最早支持的版本是 Go 1.11.x，但 Gonum 只支持 1.12 及以上版本。因此，运行主分支的最低 Go 版本要求是 Go > 1.12。\n\n## 支持的硬件与操作系统 ##\nGorgonia 支持以下平台：\n- linux\u002FAMD64\n- linux\u002FARM7\n- linux\u002FARM64\n- win32\u002FAMD64\n- darwin\u002FAMD64\n- freeBSD\u002FAMD64\n\n如果您曾在其他平台上测试过 Gorgonia，请更新此列表。\n\n## 硬件加速 ##\nGorgonia 使用了一些纯汇编指令来加速部分数学运算。遗憾的是，目前仅支持 amd64 架构。\n\n# 贡献指南 #\n\n显然，由于您很可能正在 GitHub 上阅读本文，GitHub 将成为贡献本项目的主要工作流程。\n\n参阅：[CONTRIBUTING.md](CONTRIBUTING.md)\n\n## 贡献者与重要贡献者 ##\n我们欢迎所有贡献。不过，我们引入了一类新的贡献者，称为“重要贡献者”。\n\n重要贡献者需展现出对该库及其相关领域的深刻理解。以下是构成“重要贡献”的示例：\n\n* 编写了大量关于特定函数\u002F方法的**原理**、实现机制，以及各部分之间相互作用的文档；\n* 编写了围绕 Gorgonia 内部复杂模块的代码及测试；\n* 编写了代码并提交了至少 5 个被接受的 Pull Request；\n* 对项目的某些部分提供了专业分析（例如，作为浮点运算专家优化了某个函数）；\n* 回答了至少 10 个支持相关的问题。\n\n重要贡献者名单将每月更新一次（当然，前提是还有人使用 Gorgonia）。\n\n# 如何获取支持 #\n目前获取支持的最佳方式是在 [GitHub 上提交 Issue](https:\u002F\u002Fgithub.com\u002Fgorgonia\u002Fgorgonia\u002Fissues\u002Fnew)。\n\n# 常见问题解答 #\n\n### 为什么测试中会出现看似随机的 `runtime.GC()` 调用？ ###\n\n答案很简单：该库的设计以特定方式使用 CUDA——即 CUDA 设备和上下文与 `VM` 绑定，而不是在包级别绑定。这意味着每创建一个 `VM`，就会为每个设备创建一个独立的 CUDA 上下文。这样可以确保所有操作与其他可能使用 CUDA 的应用程序良好共存（尽管这一点仍需进一步压力测试）。\n\n这些 CUDA 上下文只有在 `VM` 被垃圾回收时才会被销毁（通过终结器函数实现）。在测试中，通常会创建约 100 个 `VM`，而垃圾回收的时间点往往是随机的。这可能导致 GPU 内存不足，因为同时存在过多的 CUDA 上下文。\n\n因此，在可能使用 GPU 的测试结束时，我们会调用 `runtime.GC()` 来强制进行垃圾回收，释放 GPU 内存。\n\n在生产环境中，一般不太可能启动如此多的 `VM`，因此这种情况并不常见。如果确实遇到问题，请在 GitHub 上提交 Issue，我们将考虑为 `VM` 添加 `Finish()` 方法。\n\n# 许可证 #\nGorgonia 采用 Apache 2.0 的变体许可证。它与 Apache 2.0 许可证基本相同，唯一的区别在于，除非您是重要贡献者（例如为该项目提供商业支持），否则不得直接从本项目中获利。不过，您可以自由地从基于 Gorgonia 的衍生作品中获利（例如将 Gorgonia 作为库集成到您的产品中）。\n\n任何人仍然可以将 Gorgonia 用于商业目的（例如将其应用于企业软件中）。\n\n## 依赖项 ##\n\nGorgonia 使用的依赖项非常少，而且都非常稳定，因此目前不需要使用依赖管理工具。以下是 Gorgonia 调用的外部包列表，按本包对其依赖程度排序（省略了子包）：\n\n| 包名 | 用途 | 是否关键 | 备注 | 许可证 |\n|-------|--------|--------|-----|-------|\n|[gonum\u002Fgraph](https:\u002F\u002Fgithub.com\u002Fgonum\u002Fgonum\u002Ftree\u002Fmaster\u002Fgraph)| 对 `*ExprGraph` 进行排序| 关键。移除后 Gorgonia 将无法运行 | Gorgonia 的开发团队致力于保持与最新版本同步|[gonum 
许可证](https:\u002F\u002Fgithub.com\u002Fgonum\u002Flicense)（MIT\u002F类似 BSD）|\n|[gonum\u002Fblas](https:\u002F\u002Fgithub.com\u002Fgonum\u002Fgonum\u002Ftree\u002Fmaster\u002Fblas)|Tensor 子包中的线性代数运算|关键。移除后 Gorgonia 将无法运行|Gorgonia 的开发团队致力于保持与最新版本同步|[gonum 许可证](https:\u002F\u002Fgithub.com\u002Fgonum\u002Flicense)（MIT\u002F类似 BSD）|\n|[cu](https:\u002F\u002Fgorgonia.org\u002Fcu)| CUDA 驱动程序 | 进行 CUDA 操作所必需 | 与 Gorgonia 由同一维护者维护 | MIT\u002F类似 BSD|\n|[math32](https:\u002F\u002Fgithub.com\u002Fchewxy\u002Fmath32)|`float32` 运算|可用 `float32(math.XXX(float64(x)))` 替代|与 Gorgonia 由同一维护者维护，API 与内置 `math` 包相同|MIT\u002F类似 BSD|\n|[hm](https:\u002F\u002Fgithub.com\u002Fchewxy\u002Fhm)|Gorgonia 的类型系统|Gorgonia 的图结构与类型系统紧密耦合 | 与 Gorgonia 由同一维护者维护 | MIT\u002F类似 BSD|\n|[vecf64](https:\u002F\u002Fgorgonia.org\u002Fvecf64)|优化的 `[]float64` 操作 | 可在 `tensor\u002Fgenlib` 包中生成。不过已经进行了大量优化，未来还将继续优化 | 与 Gorgonia 由同一维护者维护 | MIT\u002F类似 BSD|\n|[vecf32](https:\u002F\u002Fgorgonia.org\u002Fvecf32)|优化的 `[]float32` 操作 | 可在 `tensor\u002Fgenlib` 包中生成。不过已经进行了大量优化，未来还将继续优化 | 与 Gorgonia 由同一维护者维护 | MIT\u002F类似 BSD|\n|[set](https:\u002F\u002Fgithub.com\u002Fxtgo\u002Fset)|各种集合操作|可以轻松替换|过去一年 API 稳定|[set 许可证](https:\u002F\u002Fgithub.com\u002Fxtgo\u002Fset\u002Fblob\u002Fmaster\u002FLICENSE)（MIT\u002F类似 BSD）|\n|[gographviz](https:\u002F\u002Fgithub.com\u002Fawalterschulze\u002Fgographviz)|用于打印图|图的打印仅对调试至关重要。Gorgonia 即使没有这一功能也能运行，但会失去一个主要（尽管可能不算关键）特性|最后更新于 2017 年 4 月 12 日|[gographviz 许可证](https:\u002F\u002Fgithub.com\u002Fawalterschulze\u002Fgographviz\u002Fblob\u002Fmaster\u002FLICENSE)（Apache 2.0）|\n|[rng](https:\u002F\u002Fgithub.com\u002Fleesper\u002Fgo_rng)|用于实现生成初始权重的辅助函数|可以较为容易地替换。Gorgonia 也可以不使用这些便利函数||[rng 许可证](https:\u002F\u002Fgithub.com\u002Fleesper\u002Fgo_rng\u002Fblob\u002Fmaster\u002FLICENSE)（Apache 2.0）|\n|[errors](https:\u002F\u002Fgithub.com\u002Fpkg\u002Ferrors)|错误包装|即使没有它，Gorgonia 也不会崩溃。事实上，Gorgonia 过去也曾使用过 [goerrors\u002Ferrors](https:\u002F\u002Fgithub.com\u002Fgo-errors\u002Ferrors)。|过去 6 
个月 API 稳定|[errors 许可证](https:\u002F\u002Fgithub.com\u002Fpkg\u002Ferrors\u002Fblob\u002Fmaster\u002FLICENSE)（MIT\u002F类似 BSD）|\n|[gonum\u002Fmat](http:\u002F\u002Fgithub.com\u002Fgonum\u002Fgonum)|`Tensor` 与 Gonum 矩阵之间的兼容性|Gorgonia 的开发团队致力于保持与最新版本同步||[gonum 许可证](https:\u002F\u002Fgithub.com\u002Fgonum\u002Flicense)（MIT\u002F类似 BSD）|\n|[testify\u002Fassert](https:\u002F\u002Fgithub.com\u002Fstretchr\u002Ftestify)|测试|可以不用，但会使测试变得非常麻烦||[testify 许可证](https:\u002F\u002Fgithub.com\u002Fstretchr\u002Ftestify\u002Fblob\u002Fmaster\u002FLICENSE)（MIT\u002F类似 BSD）|\n\n\n## 其他版权声明 ##\n\n以下是在编写 Gorgonia 过程中受到启发并加以改编的包和库（已使用的 Go 包已在上文列出）：\n\n| 来源 | 使用方式 | 许可证 |\n|------|---|-------|\n| Numpy  | 受到大量启发。部分方法直接采用了其算法（文档中已明确标注） | MIT\u002F类似 BSD。[Numpy 许可证](https:\u002F\u002Fgithub.com\u002Fnumpy\u002Fnumpy\u002Fblob\u002Fmaster\u002FLICENSE.txt) |\n| Theano | 受到大量启发。（不确定直接采用了多少算法） | MIT\u002F类似 BSD [Theano 许可证](http:\u002F\u002Fdeeplearning.net\u002Fsoftware\u002Ftheano\u002FLICENSE.html) |\n| Caffe | `im2col` 和 `col2im` 直接来自 Caffe。卷积算法则受到 Caffe 原始方法的启发 | [Caffe 许可证](https:\u002F\u002Fgithub.com\u002FBVLC\u002Fcaffe\u002Fblob\u002Fmaster\u002FLICENSE)","# Gorgonia 快速上手指南\n\nGorgonia 是一个用于 Go 语言的机器学习库，支持自动微分、符号微分、梯度下降优化及 CUDA\u002FGPU 加速。其设计理念类似于 Theano 和 TensorFlow，旨在让 Go 开发者能够在熟悉的生态中构建高性能的机器学习模型。\n\n## 环境准备\n\n### 系统要求\nGorgonia 支持以下操作系统与架构：\n- **Linux**: AMD64, ARM7, ARM64\n- **Windows**: AMD64 (win32)\n- **macOS**: AMD64 (darwin)\n- **FreeBSD**: AMD64\n\n> **注意**：部分硬件加速功能（纯汇编指令优化）目前仅支持 **amd64** 架构。若需使用 CUDA\u002FGPU 加速，请确保已安装对应的 NVIDIA 驱动和 CUDA Toolkit。\n\n### 前置依赖\n- **Go 版本**：建议使用 **Go 1.12** 或更高版本（官方支持当前版本及前两个主要版本）。\n- **Go Modules**：项目已兼容 Go Modules。\n\n## 安装步骤\n\n使用 `go get` 命令即可安装最新版本的 Gorgonia：\n\n```bash\ngo get -u gorgonia.org\u002Fgorgonia\n```\n\n国内开发者若遇到下载速度慢的问题，可配置国内代理加速：\n\n```bash\nexport GOPROXY=https:\u002F\u002Fgoproxy.cn,direct\ngo get -u gorgonia.org\u002Fgorgonia\n```\n\n## 基本使用\n\nGorgonia 的核心工作流程是：**构建计算图 -> 创建虚拟机 -> 赋值并执行**。\n\n以下是一个最基础的示例，演示如何定义并计算数学表达式 
$z = x + y$：\n\n```go\npackage main\n\nimport (\n\t\"fmt\"\n\t\"log\"\n\n\t\"gorgonia.org\u002Fgorgonia\"\n)\n\nfunc main() {\n\t\u002F\u002F 1. 创建一个新的计算图\n\tg := gorgonia.NewGraph()\n\n\tvar x, y, z *gorgonia.Node\n\tvar err error\n\n\t\u002F\u002F 2. 定义变量节点 (标量，浮点型)\n\tx = gorgonia.NewScalar(g, gorgonia.Float64, gorgonia.WithName(\"x\"))\n\ty = gorgonia.NewScalar(g, gorgonia.Float64, gorgonia.WithName(\"y\"))\n\n\t\u002F\u002F 3. 定义操作：z = x + y\n\tif z, err = gorgonia.Add(x, y); err != nil {\n\t\tlog.Fatal(err)\n\t}\n\n\t\u002F\u002F 4. 创建虚拟机 (TapeMachine) 用于执行计算图\n\tmachine := gorgonia.NewTapeMachine(g)\n\tdefer machine.Close()\n\n\t\u002F\u002F 5. 为变量赋初始值\n\tgorgonia.Let(x, 2.0)\n\tgorgonia.Let(y, 2.5)\n\n\t\u002F\u002F 6. 运行所有计算\n\tif err = machine.RunAll(); err != nil {\n\t\tlog.Fatal(err)\n\t}\n\n\t\u002F\u002F 7. 输出结果\n\tfmt.Printf(\"Result: %v\\n\", z.Value())\n\t\u002F\u002F Output: Result: 4.5\n}\n```\n\n### 关键点说明\n- **计算图模式**：Gorgonia 不支持条件分支（if\u002Felse）或循环，专注于数学运算图的构建与执行。\n- **显式执行**：需要手动创建 `TapeMachine` 并调用 `RunAll()` 来触发计算，这有助于更清晰地控制内存和执行流程。\n- **更多资源**：进阶教程（如神经网络构建、CUDA 使用等）请访问官方文档 [https:\u002F\u002Fgorgonia.org](https:\u002F\u002Fgorgonia.org)。","某电商平台的后端团队主要使用 Go 语言构建高并发微服务，现在需要在交易链路中实时集成一个自定义的欺诈检测模型。\n\n### 没有 gorgonia 时\n- **技术栈割裂**：数据科学家需用 Python 训练模型，后端工程师必须将其重写为 C++ 或通过 RPC 调用外部服务，导致开发流程冗长且易出错。\n- **部署复杂度高**：生产环境需额外维护 Python 运行时或重型 TensorFlow 服务容器，增加了运维负担和资源消耗。\n- **调试困难**：一旦模型预测出现异常，跨语言堆栈追踪极其困难，难以快速定位是数据预处理问题还是算法逻辑错误。\n- **性能损耗**：网络序列化与反序列化带来的延迟，无法满足毫秒级风控决策的严苛要求。\n\n### 使用 gorgonia 后\n- **全链路 Go 化**：团队直接在 Go 代码中定义计算图并执行自动微分，从实验到部署无需切换语言，实现了“编写即生产”。\n- **轻量级交付**：仅需编译单个二进制文件即可包含完整的推理引擎，消除了对外部 AI 框架运行时的依赖，显著降低内存占用。\n- **统一调试体验**：开发人员利用熟悉的 Go 工具链即可对神经网络前向传播和梯度计算进行断点调试，排查效率大幅提升。\n- **极致低延迟**：得益于原生执行和 CUDA 支持，模型推理直接在进程内完成，去除了网络开销，完美契合高频交易场景。\n\ngorgonia 让 Go 
开发者能在熟悉的环境中无缝构建高性能机器学习系统，彻底打破了算法实验与工程落地之间的壁垒。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fgorgonia_gorgonia_dceeb70e.png","Gorgonia","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Fgorgonia_3476b18e.png","",null,"https:\u002F\u002Fgorgonia.org","https:\u002F\u002Fgithub.com\u002Fgorgonia",[79,83,87,91,95],{"name":80,"color":81,"percentage":82},"Go","#00ADD8",96.3,{"name":84,"color":85,"percentage":86},"C","#555555",2.8,{"name":88,"color":89,"percentage":90},"Cuda","#3A4E3A",0.8,{"name":92,"color":93,"percentage":94},"Python","#3572A5",0,{"name":96,"color":97,"percentage":94},"Assembly","#6E4C13",5910,448,"2026-04-07T12:05:37","Apache-2.0",4,"Linux, macOS, Windows, FreeBSD","可选。如需 GPU 加速需 NVIDIA 显卡并支持 CUDA（具体型号、显存及 CUDA 版本未说明）；不支持 OpenCL。纯汇编加速仅限 amd64 架构。","未说明",{"notes":107,"python":108,"dependencies":109},"该工具是基于 Go 语言的机器学习库，非 Python 项目。最低 Go 版本要求为 1.12+（因依赖 Gonum）。GPU 实现涉及较重的 CGO 开销。API 在 1.0 版本前不稳定。","不需要 (基于 Go 语言)",[110,111],"Go > 1.12","gonum (隐含依赖)",[14],[114,115,116,117,118,119,120,121,64,122,123,124,125,126,127,128],"machine-learning","artificial-intelligence","neural-network","computation-graph","differentiation","golang","go","gradient-descent","deep-learning","deeplearning","deep-neural-networks","automatic-differentiation","symbolic-differentiation","hacktoberfest","graph-computation","2026-03-27T02:49:30.150509","2026-04-08T12:17:51.168738",[132,137,142,147,152,156],{"id":133,"question_zh":134,"answer_zh":135,"source_url":136},24226,"如何在 Gorgonia 中实现张量的广播（Broadcasting）机制？","虽然 `Broadcast` 函数已导出，但直接使用时内部类型未导出。推荐的解决方案有两种：\n1. **使用新的 API 设计**：创建一个函数，接受两个输入节点和一个模式，返回两个经过广播处理的节点，而不直接应用运算符。示例代码：\n```go\nfunc Broadcast(a, b *Node, pattern BroadcastPattern) (*Node, *Node, error)\n```\n使用时先调用此函数获取新节点，再对它们进行二元运算。\n2. 
**半手动方式**：将 `broadcastPattern` 作为第三个参数传递给所有已实现的运算符（如 Add, HadamardProd 等）。例如：`Add(a, b, broadcastPattern)`。这种方式虽然不够自动化，但在等待更好实现期间是可行的。","https:\u002F\u002Fgithub.com\u002Fgorgonia\u002Fgorgonia\u002Fissues\u002F223",{"id":138,"question_zh":139,"answer_zh":140,"source_url":141},24227,"在使用 CUDA 运行卷积神经网络示例时遇到\"Import Cycle Not Allowed\"错误或程序挂起，如何解决？","这个问题在 Go 1.15.3 版本中出现，表现为导入循环错误或程序在分配 GPU 内存后无限挂起。\n解决方案：\n1. 该问题已在主分支（master）中修复，请升级到最新版本的 Gorgonia。\n2. 如果必须使用旧版本，可以尝试回退到 Go 1.13.9，但需注意程序可能会挂起。\n3. 在使用 `cudagen` 工具生成代码时，可能需要临时移除文件顶部的 `\u002F\u002F+build cuda` 标签，生成完成后再加回，以避免构建约束排除所有 Go 文件的错误。","https:\u002F\u002Fgithub.com\u002Fgorgonia\u002Fgorgonia\u002Fissues\u002F445",{"id":143,"question_zh":144,"answer_zh":145,"source_url":146},24228,"Gorgonia 支持哪些依赖管理工具？是否计划迁移到 vgo 或 dep？","Gorgonia 曾考虑采用 `dep` 作为官方安装机制，并提供了 `Gopkg.toml` 和 `Gopkg.lock` 文件。关于迁移到 `vgo` (Go Modules 的前身)，维护者表示用户已成功在项目中结合 `vgo` 使用 Gorgonia。但在 `tensor` 包中可能存在一些兼容性问题，特别是在处理 `gonum` 库的最新版本变更时。建议用户尝试使用 Go Modules（现代 Go 版本默认支持），如果遇到特定包的版本冲突，可能需要手动调整依赖版本。","https:\u002F\u002Fgithub.com\u002Fgorgonia\u002Fgorgonia\u002Fissues\u002F116",{"id":148,"question_zh":149,"answer_zh":150,"source_url":151},24229,"Iterator.Chan() 方法是否存在已知问题或潜在风险？","是的，`Iterator.Chan()` 被认为是有问题的（considered harmful）。在重构 `tensor` 包以支持可插拔执行引擎（如 CPU、GPU 或网络计算）的过程中，发现该方法可能导致难以调试的行为，例如创建负长度的 channel 或产生不确定的值。维护者已将相关讨论和修复工作移至 `gorgonia\u002Ftensor` 仓库中进行跟踪。建议用户在遍历张量数据时谨慎使用此方法，或参考最新的文档和代码更新以获取更安全的替代方案。","https:\u002F\u002Fgithub.com\u002Fgorgonia\u002Fgorgonia\u002Fissues\u002F135",{"id":153,"question_zh":154,"answer_zh":155,"source_url":151},24230,"如何为 Gorgonia 自定义执行引擎以支持稀疏张量或远程计算？","Gorgonia 的 `tensor` 包经过重构，将数据结构与执行引擎分离，允许用户插入自定义的执行后端。默认执行引擎是 `StdEng`。用户可以通过实现特定的接口来扩展它，以支持稀疏张量矩阵分解或远程计算（如在 Raspberry Pi 上将计算卸载到工作站）。\n具体步骤包括：\n1. 参考 `internal\u002Fexecution` 包中的默认实现代码。\n2. 查看示例扩展测试文件（example_extension_test.go）了解如何编写自定义引擎。\n3. 
注意：早期的远程计算实验表明，网络延迟可能导致性能严重下降，需谨慎评估使用场景。",{"id":157,"question_zh":158,"answer_zh":159,"source_url":136},24231,"为什么 Broadcast 函数不能直接在包外用于二元运算符？","`Broadcast` 函数虽然导出了，但它依赖于包内私有的 `binOp` 实现（如 `BinaryOperatorType`），这些类型未在包外暴露，导致无法直接在外部调用完整的广播运算逻辑。\n解决思路是将“广播形状计算”与“实际运算”解耦：\n- 第一步：使用一个函数计算 `a` 和 `b` 的广播后形状（类似 NumPy 的 broadcast 对象）。\n- 第二步：使用另一个函数将节点调整为该形状（类似 `broadcast_to`）。\n这样就不需要导出私有类型，且能兼容任何自定义的操作符（Op）。",[161,166,171,176,181,186,191,196,201,206,211,216,221,225,230,235,240,245,249,254],{"id":162,"version":163,"summary_zh":164,"released_at":165},145803,"v0.9.18","这可能是 0.9 分支在 v0.10.0 带来重大变化之前的最后一次发布。","2023-12-03T23:10:42",{"id":167,"version":168,"summary_zh":169,"released_at":170},145804,"v0.9.17","## CI\nCI（GitHub Actions）引入了新的模板系统，这将简化 Go 发布版本的升级流程。此外，它现在为 ARM64 架构提供了自定义运行器。这一改进促使我们在 ARM64 上发现并修复了测试中的几个问题。\n\n## 修复\n* 为 BatchNorm 操作支持平铺权重 (#465)\n* 修复 TapeMachine 的 reset 方法 (#467)\n* 修复 Adam 求解器中的裁剪问题 (#469)\n* 修复 `GlorotEtAlN64` 中的 panic 消息 (#470)\n* 修复并发示例 (#472)\n\n## API 变更\n* 用于创建原生 Value 类型的函数（NewF64、NewF32 等）(#481)\n* **破坏性变更：** 已移除 `BatchNorm1d` 函数；BatchNorm 函数现支持 1D 和 2D 操作 (#482)","2021-03-14T07:55:55",{"id":172,"version":173,"summary_zh":174,"released_at":175},145805,"v0.9.16","这个版本对 `tensor` 包的语义进行了澄清——不安全指针相关的内容也已清理干净。\n\n此外，还修复了 `SoftMax` 中的一些小 bug，现在 `SoftMax` 不再会导致竞态条件。","2020-12-31T13:10:41",{"id":177,"version":178,"summary_zh":179,"released_at":180},145806,"v0.9.15","当向量以重复次数为 1 的方式广播时，其中一个值会被意外置零。这会在神经网络中产生非常奇怪的伪影。\n\n这个问题现已修复。","2020-09-28T01:19:38",{"id":182,"version":183,"summary_zh":184,"released_at":185},145807,"v0.9.14","随着 gorgonia.org\u002Ftensor@v0.9.11 的发布，`tensor` 包现已支持复数。","2020-09-10T18:15:58",{"id":187,"version":188,"summary_zh":189,"released_at":190},145808,"v0.9.13","此版本引入了 GoMachine 的新实现。","2020-08-06T14:36:22",{"id":192,"version":193,"summary_zh":194,"released_at":195},145809,"v0.9.12","@cpllbstr 添加了 Upsample2D 算子。它与 PyTorch 
中的该算子类似：https:\u002F\u002Fpytorch.org\u002Fdocs\u002Fmaster\u002Fgenerated\u002Ftorch.nn.Upsample.html","2020-06-18T18:39:19",{"id":197,"version":198,"summary_zh":199,"released_at":200},145810,"v0.9.11","得益于 @wzzhu 的出色工作，形状推断如今更加稳健。它回归了 Gorgonia 最初对张量形状的理解——即在进行降维操作时，不会过于激进地压缩维度。","2020-06-15T19:24:25",{"id":202,"version":203,"summary_zh":204,"released_at":205},145811,"v0.9.10","在之前的版本中，`repeatOp` 是一个复合操作，其函数签名实际上是 `func repeat(a, nTimes *Node, axes ...int)`。因此，你可以这样使用：`repeat(a, 300, 1, 2, 3)`，表示将张量 `a` 在第 1、2 和 3 个轴上分别重复 300 次。\n\n现在，这一实现已被简化为 `func repeat(a, repeat *Node, axis int)`。之所以这样简化，是因为进一步分析后发现，该函数实际上只是多次调用 `tensor.Repeat`，从而导致大量新张量被分配。然而，符号化操作的核心目的之一，正是为了能够在运行前预先分配好所需的内存。\n\n经过这一简化，`repeatOp` 现在可以调用 `tensor.RepeatReuse`，使重复操作能够复用已预先分配的存储空间，从而减少内存分配次数，提升性能。","2020-04-10T09:10:36",{"id":207,"version":208,"summary_zh":209,"released_at":210},145812,"v0.9.9","Dropout 长期存在一个 bug，已由 @MarkKremer 修复。","2020-03-25T21:35:45",{"id":212,"version":213,"summary_zh":214,"released_at":215},145813,"v0.9.8","Three bugfixes in this release:\r\n\r\n* An off-by-one bug that affected the softmax axes.\r\n* `TrimSpace` is now used in the iris example.\r\n* The return values of scalars are fixed.","2020-02-10T22:19:50",{"id":217,"version":218,"summary_zh":219,"released_at":220},145814,"v0.9.7","Previously, when an expression such as `-(x+y)` was given and `x` and `y` were scalar values, the neg op would fail to correctly pass the derivative into its constituents. This was due to a misuse of `UnsafeDo`, and has now been rectified.","2020-01-19T21:47:25",{"id":222,"version":223,"summary_zh":75,"released_at":224},145815,"v0.9.6","2020-01-04T13:22:37",{"id":226,"version":227,"summary_zh":228,"released_at":229},145816,"v0.9.5","A number of new features were added, mainly to support `golgi` - gorgonia.org\u002Fgolgi. 
Here is an incomplete enumeration:\r\n\r\n* `KeepDims` is introduced as a function to decorate another function\r\n* A bunch of `BroadcastXXX` operations were added (autogenerated)\r\n* `Unconcat`, the opposite of `Concat`\r\n* `BatchedMatMul` supports more than 3D tensors\r\n* `SoftMax` supports multiple axes now\r\n* Monadish handling of `*Nodes`\r\n* Consistent axis operations thanks to @bdleitner\r\n* GAP operator","2019-12-08T00:58:15",{"id":231,"version":232,"summary_zh":233,"released_at":234},145817,"v0.9.4","@owulveryck added the Global Averaging Pool operator. It does not currently support `ADOp`, thus may not be used by `lispMachine` yet.","2019-11-07T12:04:38",{"id":236,"version":237,"summary_zh":238,"released_at":239},145818,"v0.9.3","@blackrez added ARMv6 support. No API changes.","2019-09-06T09:35:28",{"id":241,"version":242,"summary_zh":243,"released_at":244},145819,"v0.9.2","Gorgonia is now citable via Zenodo.","2019-08-30T00:08:21",{"id":246,"version":247,"summary_zh":75,"released_at":248},145820,"v0.9.1","2019-01-30T01:11:15",{"id":250,"version":251,"summary_zh":252,"released_at":253},145821,"v0.9.0-beta","Ongoing notes:\r\n\r\n* **CUDA**: Better CUDA support (IN PROGRESS)\r\n    * ~ColMajor used by default if engine is CUDA.~ (ColMajor is supported, but defaults to RowMajor for all the major cuBLAS versions. Careful reasoning of the parameters obviates the need for ColMajor by default, which causes more headaches. 
It is still supported.)\r\n    * Transposition will be done automatically when transporting data back to the CPU.\r\n    * cudnn operations supported (IN PROGRESS) (note: these are the ones I use most often, hence they get more attention):\r\n        * [x] Conv2d\r\n        * [x] Dropout\r\n        * [x] Maxpool2d\r\n        * [x] BatchNorm\r\n        * [x] Rectify\r\n    * Other CUDA-related optimizations\r\n        * [x] full cuBLAS support\r\n* **New Ops**:\r\n    * BatchNorm\r\n    * InvSqrt\r\n    * CUDA-enabled ops in `ops\u002Fnn` (preview of how things will start to look in v0.10.0)\r\n* **New Features**:\r\n    * Limited shape inference. Working towards a calculus for shapes (first raised in #96 and #97).\r\n* **Optimizations**:\r\n    * Basic ops now use engine functions if available; otherwise they fall back to `Apply`, which adds a penalty from repeatedly calling functions.\r\n    * Faster VMs (1 of 2 VMs): ~greedy goroutines grab gigs from a priority queue. This causes faster execution of code in general.~ (this is moved to a future version of 0.9.xx):\r\n```\r\nbenchmark                           old ns\u002Fop      new ns\u002Fop      delta\r\nBenchmarkTapeMachineExecution-8     3129074510     2695304022     -13.86%\r\n\r\nbenchmark                           old allocs     new allocs     delta\r\nBenchmarkTapeMachineExecution-8     25745          25122          -2.42%\r\n\r\nbenchmark                           old bytes      new bytes      delta\r\nBenchmarkTapeMachineExecution-8     4804578705     4803784111     -0.02%\r\n```\r\n* **Code generation**: some exported API is now auto-generated\r\n* **New Solver**: @ynqa added the Momentum solver.\r\n* **Breaking API**: `Solver` now takes a slice of `ValueGrad` instead of `Nodes`. `ValueGrad` is an interface, which a `*Node` fulfils. An additional utility function, `NodesToValueGrads`, has been added to aid with refactoring. 
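(As an aside, the shape of this migration can be sketched in plain Go. The toy stand-ins below are not Gorgonia's actual code: only the `ValueGrad` interface and `NodesToValueGrads` names come from this note, while the struct fields, the `step` helper, and the learning rate are invented for illustration.)

```go
package main

import "fmt"

// Node is a toy stand-in: the real *gorgonia.Node carries tensors,
// not float64s. This only shows the shape of the API change.
type Node struct {
	val  float64
	grad float64
}

func (n *Node) Value() float64 { return n.val }
func (n *Node) Grad() float64  { return n.grad }

// ValueGrad is the interface the Solver now accepts; *Node fulfils it.
type ValueGrad interface {
	Value() float64
	Grad() float64
}

type Nodes []*Node

// NodesToValueGrads mirrors the utility function added to aid
// refactoring: it adapts a Nodes slice into the []ValueGrad a Solver expects.
func NodesToValueGrads(ns Nodes) []ValueGrad {
	out := make([]ValueGrad, len(ns))
	for i, n := range ns {
		out[i] = n
	}
	return out
}

// step stands in for Solver.Step: a vanilla gradient-descent update.
func step(model []ValueGrad, lr float64) []float64 {
	updated := make([]float64, len(model))
	for i, vg := range model {
		updated[i] = vg.Value() - lr*vg.Grad()
	}
	return updated
}

func main() {
	w, b := &Node{val: 1.0, grad: 0.5}, &Node{val: 0.0, grad: -1.0}
	// Before the change a Solver took Nodes directly; now call sites
	// adapt with one extra call.
	fmt.Println(step(NodesToValueGrads(Nodes{w, b}), 0.5)) // [0.75 0.5]
}
```

The point of the adapter is that the one-line `NodesToValueGrads` call is the only edit existing call sites need.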
This was done for two reasons:\r\n    * ~The support for the BatchNorm operation, which is a verily impure and highly stateful function. The BatchNorm Op has internal states that need to have their gradients updated as well. But the internal state of BatchNorm isn't really part of the expression graph, and really it shouldn't be.~ Turns out there was a better API for `BatchNorm`.\r\n    * In the next version, v0.10.0, we aim to do [better package organization](https:\u002F\u002Fgithub.com\u002Fgorgonia\u002Fgorgonia\u002Fissues\u002F91) for manageability. With this API-breaking change, the solver is now less dependent on the other parts of Gorgonia and can be easily separated.\r\n* **Breaking Semantics**: A `gorgonia.VM` now implements `io.Closer`. It should be treated as a resource as well as a computation device - the VM must be `Close()`d in order for the resources acquired by the VM to actually be released. Turns out, automatic resource management is too difficult. Who'd thunk that?\r\n","2018-08-19T03:02:23",{"id":255,"version":256,"summary_zh":257,"released_at":258},145822,"v0.8.4","Errors were previously shadowed and not returned in the Barzilai-Borwein solver. They've been fixed.","2018-05-11T18:24:18"]