[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-alibaba--Alink":3,"tool-alibaba--Alink":64},[4,17,27,35,43,56],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":16},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,3,"2026-04-05T11:01:52",[13,14,15],"开发框架","图像","Agent","ready",{"id":18,"name":19,"github_repo":20,"description_zh":21,"stars":22,"difficulty_score":23,"last_commit_at":24,"category_tags":25,"status":16},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",138956,2,"2026-04-05T11:33:21",[13,15,26],"语言模型",{"id":28,"name":29,"github_repo":30,"description_zh":31,"stars":32,"difficulty_score":23,"last_commit_at":33,"category_tags":34,"status":16},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",107662,"2026-04-03T11:11:01",[13,14,15],{"id":36,"name":37,"github_repo":38,"description_zh":39,"stars":40,"difficulty_score":23,"last_commit_at":41,"category_tags":42,"status":16},3704,"NextChat","ChatGPTNextWeb\u002FNextChat","NextChat 是一款轻量且极速的 AI 助手，旨在为用户提供流畅、跨平台的大模型交互体验。它完美解决了用户在多设备间切换时难以保持对话连续性，以及面对众多 AI 模型不知如何统一管理的痛点。无论是日常办公、学习辅助还是创意激发，NextChat 都能让用户随时随地通过网页、iOS、Android、Windows、MacOS 或 Linux 端无缝接入智能服务。\n\n这款工具非常适合普通用户、学生、职场人士以及需要私有化部署的企业团队使用。对于开发者而言，它也提供了便捷的自托管方案，支持一键部署到 Vercel 或 Zeabur 等平台。\n\nNextChat 的核心亮点在于其广泛的模型兼容性，原生支持 Claude、DeepSeek、GPT-4 及 Gemini Pro 等主流大模型，让用户在一个界面即可自由切换不同 AI 能力。此外，它还率先支持 MCP（Model Context Protocol）协议，增强了上下文处理能力。针对企业用户，NextChat 提供专业版解决方案，具备品牌定制、细粒度权限控制、内部知识库整合及安全审计等功能，满足公司对数据隐私和个性化管理的高标准要求。",87618,"2026-04-05T07:20:52",[13,26],{"id":44,"name":45,"github_repo":46,"description_zh":47,"stars":48,"difficulty_score":23,"last_commit_at":49,"category_tags":50,"status":16},2268,"ML-For-Beginners","microsoft\u002FML-For-Beginners","ML-For-Beginners 是由微软推出的一套系统化机器学习入门课程，旨在帮助零基础用户轻松掌握经典机器学习知识。这套课程将学习路径规划为 12 周，包含 26 节精炼课程和 52 道配套测验，内容涵盖从基础概念到实际应用的完整流程，有效解决了初学者面对庞大知识体系时无从下手、缺乏结构化指导的痛点。\n\n无论是希望转型的开发者、需要补充算法背景的研究人员，还是对人工智能充满好奇的普通爱好者，都能从中受益。课程不仅提供了清晰的理论讲解，还强调动手实践，让用户在循序渐进中建立扎实的技能基础。其独特的亮点在于强大的多语言支持，通过自动化机制提供了包括简体中文在内的 50 
多种语言版本，极大地降低了全球不同背景用户的学习门槛。此外，项目采用开源协作模式，社区活跃且内容持续更新，确保学习者能获取前沿且准确的技术资讯。如果你正寻找一条清晰、友好且专业的机器学习入门之路，ML-For-Beginners 将是理想的起点。",84991,"2026-04-05T10:45:23",[14,51,52,53,15,54,26,13,55],"数据工具","视频","插件","其他","音频",{"id":57,"name":58,"github_repo":59,"description_zh":60,"stars":61,"difficulty_score":10,"last_commit_at":62,"category_tags":63,"status":16},3128,"ragflow","infiniflow\u002Fragflow","RAGFlow 是一款领先的开源检索增强生成（RAG）引擎，旨在为大语言模型构建更精准、可靠的上下文层。它巧妙地将前沿的 RAG 技术与智能体（Agent）能力相结合，不仅支持从各类文档中高效提取知识，还能让模型基于这些知识进行逻辑推理和任务执行。\n\n在大模型应用中，幻觉问题和知识滞后是常见痛点。RAGFlow 通过深度解析复杂文档结构（如表格、图表及混合排版），显著提升了信息检索的准确度，从而有效减少模型“胡编乱造”的现象，确保回答既有据可依又具备时效性。其内置的智能体机制更进一步，使系统不仅能回答问题，还能自主规划步骤解决复杂问题。\n\n这款工具特别适合开发者、企业技术团队以及 AI 研究人员使用。无论是希望快速搭建私有知识库问答系统，还是致力于探索大模型在垂直领域落地的创新者，都能从中受益。RAGFlow 提供了可视化的工作流编排界面和灵活的 API 接口，既降低了非算法背景用户的上手门槛，也满足了专业开发者对系统深度定制的需求。作为基于 Apache 2.0 协议开源的项目，它正成为连接通用大模型与行业专有知识之间的重要桥梁。",77062,"2026-04-04T04:44:48",[15,14,13,26,54],{"id":65,"github_repo":66,"name":67,"description_en":68,"description_zh":69,"ai_summary_zh":69,"readme_en":70,"readme_zh":71,"quickstart_zh":72,"use_case_zh":73,"hero_image_url":74,"owner_login":75,"owner_name":76,"owner_avatar_url":77,"owner_bio":78,"owner_company":79,"owner_location":79,"owner_email":79,"owner_twitter":79,"owner_website":80,"owner_url":81,"languages":82,"stars":120,"forks":121,"last_commit_at":122,"license":123,"difficulty_score":23,"env_os":124,"env_gpu":124,"env_ram":124,"env_deps":125,"category_tags":133,"github_topics":134,"view_count":23,"oss_zip_url":79,"oss_zip_packed_at":79,"status":16,"created_at":154,"updated_at":155,"faqs":156,"releases":185},2331,"alibaba\u002FAlink","Alink","Alink is the Machine Learning algorithm platform based on Flink, developed by the PAI team of Alibaba computing platform. ","Alink 是由阿里巴巴计算平台 PAI 团队研发的开源机器学习算法平台，基于强大的 Flink 计算引擎构建。它旨在解决大规模数据处理场景下，机器学习算法难以高效运行和扩展的痛点，让用户能够轻松在分布式环境中完成从数据预处理、特征工程到模型训练与预测的全流程任务。\n\nAlink 特别适合数据工程师、算法开发者及科研人员使用。无论是需要处理海量日志的互联网从业者，还是希望快速验证算法模型的研究者，都能通过 Alink 获得稳定的算力支持。其核心亮点在于提供了丰富的通用算法组件库，涵盖分类、回归、聚类等多种主流机器学习任务，并完美支持 Java 和 Python（PyAlink）双语言接口。\n\n借助 PyAlink，用户可以在熟悉的 Jupyter Notebook 环境中以交互式方式编写代码，既能享受 Python 生态的便捷，又能底层调用 Flink 的高性能分布式计算能力，无需关心复杂的集群部署细节。此外，Alink 还配套了详尽的中文教程与插件下载器，大幅降低了学习门槛，帮助用户快速上手并构建高效的机器学习流水线。","\u003Cfont size=7>[English](README.en-US.md)| 简体中文\u003C\u002Ffont>\n\n# Alink\n\n Alink是基于Flink的通用算法平台,由阿里巴巴计算平台PAI团队研发,欢迎大家加入Alink开源用户钉钉群进行交流。\n \n \n\u003Cdiv align=center>\n\u003Cimg src=\"https:\u002F\u002Fimg.alicdn.com\u002Ftfs\u002FTB1kQU0sQY2gK0jSZFgXXc5OFXa-614-554.png\" height=\"25%\" width=\"25%\">\n\u003C\u002Fdiv>\n\n- Alink组件列表：http:\u002F\u002Falinklab.cn\u002Fmanual\u002Findex.html\n- Alink教程：http:\u002F\u002Falinklab.cn\u002Ftutorial\u002Findex.html\n- Alink插件下载器：https:\u002F\u002Fwww.yuque.com\u002Fpinshu\u002Falink_guide\u002Fplugin_downloader\n\n#### Alink教程\n\u003Cdiv align=center>\n\u003Cimg src=\"https:\u002F\u002Fimg.alicdn.com\u002Fimgextra\u002Fi2\u002FO1CN01Z7sbCr1Hg22gLIsdk_!!6000000000786-0-tps-1280-781.jpg\" height=\"50%\" width=\"50%\">\n\u003C\u002Fdiv>\n\n- Alink教程（Java版）：http:\u002F\u002Falinklab.cn\u002Ftutorial\u002Fbook_java.html\n- Alink教程（Python版）：http:\u002F\u002Falinklab.cn\u002Ftutorial\u002Fbook_python.html\n- 源代码地址：https:\u002F\u002Fgithub.com\u002Falibaba\u002FAlink\u002Ftree\u002Fmaster\u002Ftutorial\n- Java版的数据和资料链接：http:\u002F\u002Falinklab.cn\u002Ftutorial\u002Fbook_java_00_reference.html\n- Python版的数据和资料链接：http:\u002F\u002Falinklab.cn\u002Ftutorial\u002Fbook_python_00_reference.html\n- Alink教程(Java版)代码的运行攻略  
http:\u002F\u002Falinklab.cn\u002Ftutorial\u002Fbook_java_00_code_help.html\n- Alink教程(Python版)代码的运行攻略  http:\u002F\u002Falinklab.cn\u002Ftutorial\u002Fbook_python_00_code_help.html\n\n#### 开源算法列表\n\n\u003Cdiv align=center>\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Falibaba_Alink_readme_99d01b03269b.png\" height=\"100%\" width=\"100%\">\n\u003C\u002Fdiv>\n\n#### PyAlink 使用截图\n\n\u003Cdiv align=center>\n\u003Cimg src=\"https:\u002F\u002Fimg.alicdn.com\u002Ftfs\u002FTB1TmKloAL0gK0jSZFxXXXWHVXa-2070-1380.png\" height=\"60%\" width=\"60%\">\n\u003C\u002Fdiv>\n\n# 快速开始\n\n## PyAlink 使用介绍\n\n### 使用前准备：\n---------\n\n#### 包名和版本说明：\n\n  - PyAlink 根据 Alink 所支持的 Flink 版本提供不同的 Python 包：\n其中，`pyalink` 包对应为 Alink 所支持的最新 Flink 版本，当前为 1.13，而 `pyalink-flink-***` 为旧版本的 Flink 版本，当前提供 `pyalink-flink-1.12`, `pyalink-flink-1.11`, `pyalink-flink-1.10` 和 `pyalink-flink-1.9`。\n  - Python 包的版本号与 Alink 的版本号一致，例如`1.6.2`。\n\n#### 安装步骤：\n1. 确保使用环境中有Python3，版本限于 3.6，3.7 和 3.8。\n2. 确保使用环境中安装有 Java 8。\n3. 使用 pip 命令进行安装：\n  `pip install pyalink`、`pip install pyalink-flink-1.12`、`pip install pyalink-flink-1.11`、`pip install pyalink-flink-1.10` 或者 `pip install pyalink-flink-1.9`。\n  \n#### 安装注意事项：\n\n1. `pyalink` 和 `pyalink-flink-***` 不能同时安装，也不能与旧版本同时安装。\n如果之前安装过 `pyalink` 或者 `pyalink-flink-***`，请使用`pip uninstall pyalink` 或者 `pip uninstall pyalink-flink-***` 卸载之前的版本。\n2. 出现`pip`安装缓慢或不成功的情况，可以参考[这篇文章](https:\u002F\u002Fsegmentfault.com\u002Fa\u002F1190000006111096)修改pip源，或者直接使用下面的链接下载 whl 包，然后使用 `pip` 安装：\n   - Flink 1.13：[链接](https:\u002F\u002Falink-release.oss-cn-beijing.aliyuncs.com\u002Fv1.6.2.post0\u002Fpyalink-1.6.2.post0-py3-none-any.whl) (MD5: d4b7b1fe6474b11ca7f45d0fb0daf5bc)\n   - Flink 1.12：[链接](https:\u002F\u002Falink-release.oss-cn-beijing.aliyuncs.com\u002Fv1.6.2.post0\u002Fpyalink_flink_1.12-1.6.2.post0-py3-none-any.whl) (MD5: 527b9ac24383ccc8593cd61b06cc610d)\n   - Flink 1.11：[链接](https:\u002F\u002Falink-release.oss-cn-beijing.aliyuncs.com\u002Fv1.6.2.post0\u002Fpyalink_flink_1.11-1.6.2.post0-py3-none-any.whl) (MD5: 7e59ba00b3739386996cf55d8f522ed2)\n   - Flink 1.10：[链接](https:\u002F\u002Falink-release.oss-cn-beijing.aliyuncs.com\u002Fv1.6.2.post0\u002Fpyalink_flink_1.10-1.6.2.post0-py3-none-any.whl) (MD5: 6d5d9048c9a44f27285467c5117e8deb)\n   - Flink 1.9: [链接](https:\u002F\u002Falink-release.oss-cn-beijing.aliyuncs.com\u002Fv1.6.2.post0\u002Fpyalink_flink_1.9-1.6.2.post0-py3-none-any.whl) (MD5: e89ac35a6a1c63c0426f3d9ca1025880)\n3. 如果有多个版本的 Python，可能需要使用特定版本的 `pip`，比如 `pip3`；如果使用 Anaconda，则需要在 Anaconda 命令行中进行安装。\n\n### 开始使用：\n-------\n可以通过 Jupyter Notebook 来开始使用 PyAlink，能获得更好的使用体验。\n\n使用步骤：\n1. 在命令行中启动Jupyter：`jupyter notebook`，并新建 Python 3 的 Notebook。\n2. 导入 pyalink 包：`from pyalink.alink import *`。\n3. 使用方法创建本地运行环境：\n`useLocalEnv(parallelism, flinkHome=None, config=None)`。\n其中，参数 `parallelism` 表示执行所使用的并行度；`flinkHome` 为 flink 的完整路径，一般情况不需要设置；`config`为Flink所接受的配置参数。运行后出现如下所示的输出，表示初始化运行环境成功：\n```\nJVM listening on ***\n```\n4. 
开始编写 PyAlink 代码，例如：\n```python\nsource = CsvSourceBatchOp()\\\n    .setSchemaStr(\"sepal_length double, sepal_width double, petal_length double, petal_width double, category string\")\\\n    .setFilePath(\"https:\u002F\u002Falink-release.oss-cn-beijing.aliyuncs.com\u002Fdata-files\u002Firis.csv\")\nres = source.select([\"sepal_length\", \"sepal_width\"])\ndf = res.collectToDataframe()\nprint(df)\n```\n\n### 编写代码：\n------\n在 PyAlink 中，算法组件提供的接口基本与 Java API 一致，即通过默认构造方法创建一个算法组件，然后通过 `setXXX` 设置参数，通过 `link\u002FlinkTo\u002FlinkFrom` 与其他组件相连。\n这里利用 Jupyter Notebook 的自动补全机制可以提供书写便利。\n\n对于批式作业，可以通过批式组件的 `print\u002FcollectToDataframe\u002FcollectToDataframes` 等方法或者 `BatchOperator.execute()` 来触发执行；对于流式作业，则通过 `StreamOperator.execute()` 来启动作业。\n\n\n### 更多用法：\n------\n  - [DataFrame 与 Operator 互转](docs\u002Fpyalink\u002Fpyalink-dataframe.md)\n  - [StreamOperator 数据预览](docs\u002Fpyalink\u002Fpyalink-stream-operator-preview.md)\n  - [UDF\u002FUDTF\u002FSQL 使用](docs\u002Fpyalink\u002Fpyalink-udf.md)\n  - [与 PyFlink 一同使用](docs\u002Fpyalink\u002Fpyalink-pyflink.md)\n  - [PyAlink 常见问题](docs\u002Fpyalink\u002Fpyalink-qa.md)\n\n## Java 接口使用介绍\n----------\n\n### 示例代码\n\n```java\nString URL = \"https:\u002F\u002Falink-release.oss-cn-beijing.aliyuncs.com\u002Fdata-files\u002Firis.csv\";\nString SCHEMA_STR = \"sepal_length double, sepal_width double, petal_length double, petal_width double, category string\";\n\nBatchOperator data = new CsvSourceBatchOp()\n        .setFilePath(URL)\n        .setSchemaStr(SCHEMA_STR);\n\nVectorAssembler va = new VectorAssembler()\n        .setSelectedCols(new String[]{\"sepal_length\", \"sepal_width\", \"petal_length\", \"petal_width\"})\n        .setOutputCol(\"features\");\n\nKMeans kMeans = new KMeans().setVectorCol(\"features\").setK(3)\n        .setPredictionCol(\"prediction_result\")\n        .setPredictionDetailCol(\"prediction_detail\")\n        .setReservedCols(\"category\")\n        .setMaxIter(100);\n\nPipeline pipeline = new Pipeline().add(va).add(kMeans);\npipeline.fit(data).transform(data).print();\n```\n\n### Flink-1.13 的 Maven 依赖\n```xml\n\u003Cdependency>\n    \u003CgroupId>com.alibaba.alink\u003C\u002FgroupId>\n    \u003CartifactId>alink_core_flink-1.13_2.11\u003C\u002FartifactId>\n    \u003Cversion>1.6.2\u003C\u002Fversion>\n\u003C\u002Fdependency>\n\u003Cdependency>\n    \u003CgroupId>org.apache.flink\u003C\u002FgroupId>\n    \u003CartifactId>flink-streaming-scala_2.11\u003C\u002FartifactId>\n    \u003Cversion>1.13.0\u003C\u002Fversion>\n\u003C\u002Fdependency>\n\u003Cdependency>\n    \u003CgroupId>org.apache.flink\u003C\u002FgroupId>\n    \u003CartifactId>flink-table-planner_2.11\u003C\u002FartifactId>\n    \u003Cversion>1.13.0\u003C\u002Fversion>\n\u003C\u002Fdependency>\n\u003Cdependency>\n    \u003CgroupId>org.apache.flink\u003C\u002FgroupId>\n    \u003CartifactId>flink-clients_2.11\u003C\u002FartifactId>\n    \u003Cversion>1.13.0\u003C\u002Fversion>\n\u003C\u002Fdependency>\n```\n\n### Flink-1.12 的 Maven 依赖\n```xml\n\u003Cdependency>\n    \u003CgroupId>com.alibaba.alink\u003C\u002FgroupId>\n    \u003CartifactId>alink_core_flink-1.12_2.11\u003C\u002FartifactId>\n    \u003Cversion>1.6.2\u003C\u002Fversion>\n\u003C\u002Fdependency>\n\u003Cdependency>\n    \u003CgroupId>org.apache.flink\u003C\u002FgroupId>\n    \u003CartifactId>flink-streaming-scala_2.11\u003C\u002FartifactId>\n    \u003Cversion>1.12.1\u003C\u002Fversion>\n\u003C\u002Fdependency>\n\u003Cdependency>\n    \u003CgroupId>org.apache.flink\u003C\u002FgroupId>\n    
\u003CartifactId>flink-table-planner_2.11\u003C\u002FartifactId>\n    \u003Cversion>1.12.1\u003C\u002Fversion>\n\u003C\u002Fdependency>\n\u003Cdependency>\n    \u003CgroupId>org.apache.flink\u003C\u002FgroupId>\n    \u003CartifactId>flink-clients_2.11\u003C\u002FartifactId>\n    \u003Cversion>1.12.1\u003C\u002Fversion>\n\u003C\u002Fdependency>\n```\n\n### Flink-1.11 的 Maven 依赖\n```xml\n\u003Cdependency>\n    \u003CgroupId>com.alibaba.alink\u003C\u002FgroupId>\n    \u003CartifactId>alink_core_flink-1.11_2.11\u003C\u002FartifactId>\n    \u003Cversion>1.6.2\u003C\u002Fversion>\n\u003C\u002Fdependency>\n\u003Cdependency>\n    \u003CgroupId>org.apache.flink\u003C\u002FgroupId>\n    \u003CartifactId>flink-streaming-scala_2.11\u003C\u002FartifactId>\n    \u003Cversion>1.11.0\u003C\u002Fversion>\n\u003C\u002Fdependency>\n\u003Cdependency>\n    \u003CgroupId>org.apache.flink\u003C\u002FgroupId>\n    \u003CartifactId>flink-table-planner_2.11\u003C\u002FartifactId>\n    \u003Cversion>1.11.0\u003C\u002Fversion>\n\u003C\u002Fdependency>\n\u003Cdependency>\n    \u003CgroupId>org.apache.flink\u003C\u002FgroupId>\n    \u003CartifactId>flink-clients_2.11\u003C\u002FartifactId>\n    \u003Cversion>1.11.0\u003C\u002Fversion>\n\u003C\u002Fdependency>\n```\n\n### Flink-1.10 的 Maven 依赖\n```xml\n\u003Cdependency>\n    \u003CgroupId>com.alibaba.alink\u003C\u002FgroupId>\n    \u003CartifactId>alink_core_flink-1.10_2.11\u003C\u002FartifactId>\n    \u003Cversion>1.6.2\u003C\u002Fversion>\n\u003C\u002Fdependency>\n\u003Cdependency>\n    \u003CgroupId>org.apache.flink\u003C\u002FgroupId>\n    \u003CartifactId>flink-streaming-scala_2.11\u003C\u002FartifactId>\n    \u003Cversion>1.10.0\u003C\u002Fversion>\n\u003C\u002Fdependency>\n\u003Cdependency>\n    \u003CgroupId>org.apache.flink\u003C\u002FgroupId>\n    \u003CartifactId>flink-table-planner_2.11\u003C\u002FartifactId>\n    \u003Cversion>1.10.0\u003C\u002Fversion>\n\u003C\u002Fdependency>\n```\n\n### Flink-1.9 的 Maven 依赖\n\n```xml\n\u003Cdependency>\n    \u003CgroupId>com.alibaba.alink\u003C\u002FgroupId>\n    \u003CartifactId>alink_core_flink-1.9_2.11\u003C\u002FartifactId>\n    \u003Cversion>1.6.2\u003C\u002Fversion>\n\u003C\u002Fdependency>\n\u003Cdependency>\n    \u003CgroupId>org.apache.flink\u003C\u002FgroupId>\n    \u003CartifactId>flink-streaming-scala_2.11\u003C\u002FartifactId>\n    \u003Cversion>1.9.0\u003C\u002Fversion>\n\u003C\u002Fdependency>\n\u003Cdependency>\n    \u003CgroupId>org.apache.flink\u003C\u002FgroupId>\n    \u003CartifactId>flink-table-planner_2.11\u003C\u002FartifactId>\n    \u003Cversion>1.9.0\u003C\u002Fversion>\n\u003C\u002Fdependency>\n```\n\n\n\n## 快速开始在集群上运行Alink算法\n--------\n\n1. 准备Flink集群\n```shell\n  wget https:\u002F\u002Farchive.apache.org\u002Fdist\u002Fflink\u002Fflink-1.13.0\u002Fflink-1.13.0-bin-scala_2.11.tgz\n  tar -xf flink-1.13.0-bin-scala_2.11.tgz && cd flink-1.13.0\n  .\u002Fbin\u002Fstart-cluster.sh\n```\n\n2. 准备Alink算法包\n```shell\n  git clone https:\u002F\u002Fgithub.com\u002Falibaba\u002FAlink.git\n  # add \u003Cscope>provided\u003C\u002Fscope> in pom.xml of alink_examples.\n  cd Alink && mvn -Dmaven.test.skip=true clean package shade:shade\n```\n\n3. 
运行Java示例\n```shell\n  .\u002Fbin\u002Fflink run -p 1 -c com.alibaba.alink.ALSExample [path_to_Alink]\u002Fexamples\u002Ftarget\u002Falink_examples-1.5-SNAPSHOT.jar\n  # .\u002Fbin\u002Fflink run -p 1 -c com.alibaba.alink.GBDTExample [path_to_Alink]\u002Fexamples\u002Ftarget\u002Falink_examples-1.5-SNAPSHOT.jar\n  # .\u002Fbin\u002Fflink run -p 1 -c com.alibaba.alink.KMeansExample [path_to_Alink]\u002Fexamples\u002Ftarget\u002Falink_examples-1.5-SNAPSHOT.jar\n```\n\n## 部署\n----------\n\n[集群部署](docs\u002Fdeploy\u002Fcluster-deploy.md)\n","\u003Cfont size=7>[English](README.en-US.md)| 简体中文\u003C\u002Ffont>\n\n# Alink\n\n Alink是基于Flink的通用算法平台,由阿里巴巴计算平台PAI团队研发,欢迎大家加入Alink开源用户钉钉群进行交流。\n \n \n\u003Cdiv align=center>\n\u003Cimg src=\"https:\u002F\u002Fimg.alicdn.com\u002Ftfs\u002FTB1kQU0sQY2gK0jSZFgXXc5OFXa-614-554.png\" height=\"25%\" width=\"25%\">\n\u003C\u002Fdiv>\n\n- Alink组件列表：http:\u002F\u002Falinklab.cn\u002Fmanual\u002Findex.html\n- Alink教程：http:\u002F\u002Falinklab.cn\u002Ftutorial\u002Findex.html\n- Alink插件下载器：https:\u002F\u002Fwww.yuque.com\u002Fpinshu\u002Falink_guide\u002Fplugin_downloader\n\n#### Alink教程\n\u003Cdiv align=center>\n\u003Cimg src=\"https:\u002F\u002Fimg.alicdn.com\u002Fimgextra\u002Fi2\u002FO1CN01Z7sbCr1Hg22gLIsdk_!!6000000000786-0-tps-1280-781.jpg\" height=\"50%\" width=\"50%\">\n\u003C\u002Fdiv>\n\n- Alink教程（Java版）：http:\u002F\u002Falinklab.cn\u002Ftutorial\u002Fbook_java.html\n- Alink教程（Python版）：http:\u002F\u002Falinklab.cn\u002Ftutorial\u002Fbook_python.html\n- 源代码地址：https:\u002F\u002Fgithub.com\u002Falibaba\u002FAlink\u002Ftree\u002Fmaster\u002Ftutorial\n- Java版的数据和资料链接：http:\u002F\u002Falinklab.cn\u002Ftutorial\u002Fbook_java_00_reference.html\n- Python版的数据和资料链接：http:\u002F\u002Falinklab.cn\u002Ftutorial\u002Fbook_python_00_reference.html\n- Alink教程(Java版)代码的运行攻略  http:\u002F\u002Falinklab.cn\u002Ftutorial\u002Fbook_java_00_code_help.html\n- Alink教程(Python版)代码的运行攻略  http:\u002F\u002Falinklab.cn\u002Ftutorial\u002Fbook_python_00_code_help.html\n\n#### 开源算法列表\n\n\u003Cdiv align=center>\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Falibaba_Alink_readme_99d01b03269b.png\" height=\"100%\" width=\"100%\">\n\u003C\u002Fdiv>\n\n#### PyAlink 使用截图\n\n\u003Cdiv align=center>\n\u003Cimg src=\"https:\u002F\u002Fimg.alicdn.com\u002Ftfs\u002FTB1TmKloAL0gK0jSZFxXXXWHVXa-2070-1380.png\" height=\"60%\" width=\"60%\">\n\u003C\u002Fdiv>\n\n# 快速开始\n\n## PyAlink 使用介绍\n\n### 使用前准备：\n---------\n\n#### 包名和版本说明：\n\n  - PyAlink 根据 Alink 所支持的 Flink 版本提供不同的 Python 包：\n其中，`pyalink` 包对应为 Alink 所支持的最新 Flink 版本，当前为 1.13，而 `pyalink-flink-***` 为旧版本的 Flink 版本，当前提供 `pyalink-flink-1.12`, `pyalink-flink-1.11`, `pyalink-flink-1.10` 和 `pyalink-flink-1.9`。\n  - Python 包的版本号与 Alink 的版本号一致，例如`1.6.2`。\n\n#### 安装步骤：\n1. 确保使用环境中有Python3，版本限于 3.6，3.7 和 3.8。\n2. 确保使用环境中安装有 Java 8。\n3. 使用 pip 命令进行安装：\n  `pip install pyalink`、`pip install pyalink-flink-1.12`、`pip install pyalink-flink-1.11`、`pip install pyalink-flink-1.10` 或者 `pip install pyalink-flink-1.9`。\n  \n#### 安装注意事项：\n\n1. `pyalink` 和 `pyalink-flink-***` 不能同时安装，也不能与旧版本同时安装。\n如果之前安装过 `pyalink` 或者 `pyalink-flink-***`，请使用`pip uninstall pyalink` 或者 `pip uninstall pyalink-flink-***` 卸载之前的版本。\n2. 
出现`pip`安装缓慢或不成功的情况，可以参考[这篇文章](https:\u002F\u002Fsegmentfault.com\u002Fa\u002F1190000006111096)修改pip源，或者直接使用下面的链接下载 whl 包，然后使用 `pip` 安装：\n   - Flink 1.13：[链接](https:\u002F\u002Falink-release.oss-cn-beijing.aliyuncs.com\u002Fv1.6.2.post0\u002Fpyalink-1.6.2.post0-py3-none-any.whl) (MD5: d4b7b1fe6474b11ca7f45d0fb0daf5bc)\n   - Flink 1.12：[链接](https:\u002F\u002Falink-release.oss-cn-beijing.aliyuncs.com\u002Fv1.6.2.post0\u002Fpyalink_flink_1.12-1.6.2.post0-py3-none-any.whl) (MD5: 527b9ac24383ccc8593cd61b06cc610d)\n   - Flink 1.11：[链接](https:\u002F\u002Falink-release.oss-cn-beijing.aliyuncs.com\u002Fv1.6.2.post0\u002Fpyalink_flink_1.11-1.6.2.post0-py3-none-any.whl) (MD5: 7e59ba00b3739386996cf55d8f522ed2)\n   - Flink 1.10：[链接](https:\u002F\u002Falink-release.oss-cn-beijing.aliyuncs.com\u002Fv1.6.2.post0\u002Fpyalink_flink_1.10-1.6.2.post0-py3-none-any.whl) (MD5: 6d5d9048c9a44f27285467c5117e8deb)\n   - Flink 1.9: [链接](https:\u002F\u002Falink-release.oss-cn-beijing.aliyuncs.com\u002Fv1.6.2.post0\u002Fpyalink_flink_1.9-1.6.2.post0-py3-none-any.whl) (MD5: e89ac35a6a1c63c0426f3d9ca1025880)\n3. 如果有多个版本的 Python，可能需要使用特定版本的 `pip`，比如 `pip3`；如果使用 Anaconda，则需要在 Anaconda 命令行中进行安装。\n\n### 开始使用：\n-------\n可以通过 Jupyter Notebook 来开始使用 PyAlink，能获得更好的使用体验。\n\n使用步骤：\n1. 在命令行中启动Jupyter：`jupyter notebook`，并新建 Python 3 的 Notebook。\n2. 导入 pyalink 包：`from pyalink.alink import *`。\n3. 使用方法创建本地运行环境：\n`useLocalEnv(parallelism, flinkHome=None, config=None)`。\n其中，参数 `parallelism` 表示执行所使用的并行度；`flinkHome` 为 flink 的完整路径，一般情况不需要设置；`config`为Flink所接受的配置参数。运行后出现如下所示的输出，表示初始化运行环境成功：\n```\nJVM listening on ***\n```\n4. 开始编写 PyAlink 代码，例如：\n```python\nsource = CsvSourceBatchOp()\\\n    .setSchemaStr(\"sepal_length double, sepal_width double, petal_length double, petal_width double, category string\")\\\n    .setFilePath(\"https:\u002F\u002Falink-release.oss-cn-beijing.aliyuncs.com\u002Fdata-files\u002Firis.csv\")\nres = source.select([\"sepal_length\", \"sepal_width\"])\ndf = res.collectToDataframe()\nprint(df)\n```\n\n### 编写代码：\n------\n在 PyAlink 中，算法组件提供的接口基本与 Java API 一致，即通过默认构造方法创建一个算法组件，然后通过 `setXXX` 设置参数，通过 `link\u002FlinkTo\u002FlinkFrom` 与其他组件相连。\n这里利用 Jupyter Notebook 的自动补全机制可以提供书写便利。\n\n对于批式作业，可以通过批式组件的 `print\u002FcollectToDataframe\u002FcollectToDataframes` 等方法或者 `BatchOperator.execute()` 来触发执行；对于流式作业，则通过 `StreamOperator.execute()` 来启动作业。\n\n\n### 更多用法：\n------\n  - [DataFrame 与 Operator 互转](docs\u002Fpyalink\u002Fpyalink-dataframe.md)\n  - [StreamOperator 数据预览](docs\u002Fpyalink\u002Fpyalink-stream-operator-preview.md)\n  - [UDF\u002FUDTF\u002FSQL 使用](docs\u002Fpyalink\u002Fpyalink-udf.md)\n  - [与 PyFlink 一同使用](docs\u002Fpyalink\u002Fpyalink-pyflink.md)\n  - [PyAlink 常见问题](docs\u002Fpyalink\u002Fpyalink-qa.md)\n\n## Java 接口使用介绍\n----------\n\n### 示例代码\n\n```java\nString URL = \"https:\u002F\u002Falink-release.oss-cn-beijing.aliyuncs.com\u002Fdata-files\u002Firis.csv\";\nString SCHEMA_STR = \"sepal_length double, sepal_width double, petal_length double, petal_width double, category string\";\n\nBatchOperator data = new CsvSourceBatchOp()\n        .setFilePath(URL)\n        .setSchemaStr(SCHEMA_STR);\n\nVectorAssembler va = new VectorAssembler()\n        .setSelectedCols(new String[]{\"sepal_length\", \"sepal_width\", \"petal_length\", \"petal_width\"})\n        .setOutputCol(\"features\");\n\nKMeans kMeans = new KMeans().setVectorCol(\"features\").setK(3)\n        .setPredictionCol(\"prediction_result\")\n        .setPredictionDetailCol(\"prediction_detail\")\n        .setReservedCols(\"category\")\n        
.setMaxIter(100);\n\nPipeline pipeline = new Pipeline().add(va).add(kMeans);\npipeline.fit(data).transform(data).print();\n```\n\n### Flink-1.13 的 Maven 依赖\n```xml\n\u003Cdependency>\n    \u003CgroupId>com.alibaba.alink\u003C\u002FgroupId>\n    \u003CartifactId>alink_core_flink-1.13_2.11\u003C\u002FartifactId>\n    \u003Cversion>1.6.2\u003C\u002Fversion>\n\u003C\u002Fdependency>\n\u003Cdependency>\n    \u003CgroupId>org.apache.flink\u003C\u002FgroupId>\n    \u003CartifactId>flink-streaming-scala_2.11\u003C\u002FartifactId>\n    \u003Cversion>1.13.0\u003C\u002Fversion>\n\u003C\u002Fdependency>\n\u003Cdependency>\n    \u003CgroupId>org.apache.flink\u003C\u002FgroupId>\n    \u003CartifactId>flink-table-planner_2.11\u003C\u002FartifactId>\n    \u003Cversion>1.13.0\u003C\u002Fversion>\n\u003C\u002Fdependency>\n\u003Cdependency>\n    \u003CgroupId>org.apache.flink\u003C\u002FgroupId>\n    \u003CartifactId>flink-clients_2.11\u003C\u002FartifactId>\n    \u003Cversion>1.13.0\u003C\u002Fversion>\n\u003C\u002Fdependency>\n```\n\n### Flink-1.12 的 Maven 依赖\n```xml\n\u003Cdependency>\n    \u003CgroupId>com.alibaba.alink\u003C\u002FgroupId>\n    \u003CartifactId>alink_core_flink-1.12_2.11\u003C\u002FartifactId>\n    \u003Cversion>1.6.2\u003C\u002Fversion>\n\u003C\u002Fdependency>\n\u003Cdependency>\n    \u003CgroupId>org.apache.flink\u003C\u002FgroupId>\n    \u003CartifactId>flink-streaming-scala_2.11\u003C\u002FartifactId>\n    \u003Cversion>1.12.1\u003C\u002Fversion>\n\u003C\u002Fdependency>\n\u003Cdependency>\n    \u003CgroupId>org.apache.flink\u003C\u002FgroupId>\n    \u003CartifactId>flink-table-planner_2.11\u003C\u002FartifactId>\n    \u003Cversion>1.12.1\u003C\u002Fversion>\n\u003C\u002Fdependency>\n\u003Cdependency>\n    \u003CgroupId>org.apache.flink\u003C\u002FgroupId>\n    \u003CartifactId>flink-clients_2.11\u003C\u002FartifactId>\n    \u003Cversion>1.12.1\u003C\u002Fversion>\n\u003C\u002Fdependency>\n```\n\n### Flink-1.11 的 Maven 依赖\n```xml\n\u003Cdependency>\n    \u003CgroupId>com.alibaba.alink\u003C\u002FgroupId>\n    \u003CartifactId>alink_core_flink-1.11_2.11\u003C\u002FartifactId>\n    \u003Cversion>1.6.2\u003C\u002Fversion>\n\u003C\u002Fdependency>\n\u003Cdependency>\n    \u003CgroupId>org.apache.flink\u003C\u002FgroupId>\n    \u003CartifactId>flink-streaming-scala_2.11\u003C\u002FartifactId>\n    \u003Cversion>1.11.0\u003C\u002Fversion>\n\u003C\u002Fdependency>\n\u003Cdependency>\n    \u003CgroupId>org.apache.flink\u003C\u002FgroupId>\n    \u003CartifactId>flink-table-planner_2.11\u003C\u002FartifactId>\n    \u003Cversion>1.11.0\u003C\u002Fversion>\n\u003C\u002Fdependency>\n\u003Cdependency>\n    \u003CgroupId>org.apache.flink\u003C\u002FgroupId>\n    \u003CartifactId>flink-clients_2.11\u003C\u002FartifactId>\n    \u003Cversion>1.11.0\u003C\u002Fversion>\n\u003C\u002Fdependency>\n```\n\n### Flink-1.10 的 Maven 依赖\n```xml\n\u003Cdependency>\n    \u003CgroupId>com.alibaba.alink\u003C\u002FgroupId>\n    \u003CartifactId>alink_core_flink-1.10_2.11\u003C\u002FartifactId>\n    \u003Cversion>1.6.2\u003C\u002Fversion>\n\u003C\u002Fdependency>\n\u003Cdependency>\n    \u003CgroupId>org.apache.flink\u003C\u002FgroupId>\n    \u003CartifactId>flink-streaming-scala_2.11\u003C\u002FartifactId>\n    \u003Cversion>1.10.0\u003C\u002Fversion>\n\u003C\u002Fdependency>\n\u003Cdependency>\n    \u003CgroupId>org.apache.flink\u003C\u002FgroupId>\n    \u003CartifactId>flink-table-planner_2.11\u003C\u002FartifactId>\n    \u003Cversion>1.10.0\u003C\u002Fversion>\n\u003C\u002Fdependency>\n```\n\n### Flink-1.9 的 
Maven 依赖\n\n```xml\n\u003Cdependency>\n    \u003CgroupId>com.alibaba.alink\u003C\u002FgroupId>\n    \u003CartifactId>alink_core_flink-1.9_2.11\u003C\u002FartifactId>\n    \u003Cversion>1.6.2\u003C\u002Fversion>\n\u003C\u002Fdependency>\n\u003Cdependency>\n    \u003CgroupId>org.apache.flink\u003C\u002FgroupId>\n    \u003CartifactId>flink-streaming-scala_2.11\u003C\u002FartifactId>\n    \u003Cversion>1.9.0\u003C\u002Fversion>\n\u003C\u002Fdependency>\n\u003Cdependency>\n    \u003CgroupId>org.apache.flink\u003C\u002FgroupId>\n    \u003CartifactId>flink-table-planner_2.11\u003C\u002FartifactId>\n    \u003Cversion>1.9.0\u003C\u002Fversion>\n\u003C\u002Fdependency>\n```\n\n\n\n## 快速开始在集群上运行Alink算法\n--------\n\n1. 准备Flink集群\n```shell\n  wget https:\u002F\u002Farchive.apache.org\u002Fdist\u002Fflink\u002Fflink-1.13.0\u002Fflink-1.13.0-bin-scala_2.11.tgz\n  tar -xf flink-1.13.0-bin-scala_2.11.tgz && cd flink-1.13.0\n  .\u002Fbin\u002Fstart-cluster.sh\n```\n\n2. 准备Alink算法包\n```shell\n  git clone https:\u002F\u002Fgithub.com\u002Falibaba\u002FAlink.git\n  # 在 alink_examples 的 pom.xml 中添加 \u003Cscope>provided\u003C\u002Fscope>。\n  cd Alink && mvn -Dmaven.test.skip=true clean package shade:shade\n```\n\n3. 运行Java示例\n```shell\n  .\u002Fbin\u002Fflink run -p 1 -c com.alibaba.alink.ALSExample [path_to_Alink]\u002Fexamples\u002Ftarget\u002Falink_examples-1.5-SNAPSHOT.jar\n  # .\u002Fbin\u002Fflink run -p 1 -c com.alibaba.alink.GBDTExample [path_to_Alink]\u002Fexamples\u002Ftarget\u002Falink_examples-1.5-SNAPSHOT.jar\n  # .\u002Fbin\u002Fflink run -p 1 -c com.alibaba.alink.KMeansExample [path_to_Alink]\u002Fexamples\u002Ftarget\u002Falink_examples-1.5-SNAPSHOT.jar\n```\n\n## 部署\n----------\n\n[集群部署](docs\u002Fdeploy\u002Fcluster-deploy.md)","# Alink 快速上手指南\n\nAlink 是阿里巴巴基于 Flink 开发的通用机器学习算法平台，支持批处理和流处理，提供 Java 和 Python（PyAlink）两种 API。\n\n## 1. 环境准备\n\n在开始之前，请确保您的开发环境满足以下要求：\n\n*   **操作系统**：Linux, macOS 或 Windows\n*   **Python 版本**：Python 3.6, 3.7 或 3.8 (仅限 PyAlink)\n*   **Java 版本**：JDK 8\n*   **构建工具**：Maven (仅限 Java 源码编译或集群部署)\n\n## 2. 安装步骤\n\n### 方式一：使用 PyAlink (推荐新手)\n\nPyAlink 根据底层 Flink 版本提供不同的安装包。`pyalink` 对应最新支持的 Flink 版本（当前为 1.13），其他版本需指定后缀。\n\n**安装命令：**\n\n```bash\n# 安装最新版 (Flink 1.13)\npip install pyalink\n\n# 或者安装特定 Flink 版本\n# pip install pyalink-flink-1.12\n# pip install pyalink-flink-1.11\n# pip install pyalink-flink-1.10\n# pip install pyalink-flink-1.9\n```\n\n> **注意**：\n> 1. `pyalink` 与 `pyalink-flink-*` 包互斥，请勿同时安装。若已安装旧版，请先执行 `pip uninstall \u003C包名>` 卸载。\n> 2. 
若 pip 下载缓慢，可手动下载 `.whl` 文件后安装。国内用户可使用以下阿里云镜像链接：\n>    - Flink 1.13: [下载链接](https:\u002F\u002Falink-release.oss-cn-beijing.aliyuncs.com\u002Fv1.6.2.post0\u002Fpyalink-1.6.2.post0-py3-none-any.whl)\n>    - Flink 1.12: [下载链接](https:\u002F\u002Falink-release.oss-cn-beijing.aliyuncs.com\u002Fv1.6.2.post0\u002Fpyalink_flink_1.12-1.6.2.post0-py3-none-any.whl)\n\n### 方式二：使用 Java API (Maven 依赖)\n\n在项目的 `pom.xml` 中添加对应 Flink 版本的依赖。以下以 **Flink 1.13** 为例：\n\n```xml\n\u003Cdependencies>\n    \u003C!-- Alink 核心依赖 -->\n    \u003Cdependency>\n        \u003CgroupId>com.alibaba.alink\u003C\u002FgroupId>\n        \u003CartifactId>alink_core_flink-1.13_2.11\u003C\u002FartifactId>\n        \u003Cversion>1.6.2\u003C\u002Fversion>\n    \u003C\u002Fdependency>\n    \n    \u003C!-- Flink 相关依赖 -->\n    \u003Cdependency>\n        \u003CgroupId>org.apache.flink\u003C\u002FgroupId>\n        \u003CartifactId>flink-streaming-scala_2.11\u003C\u002FartifactId>\n        \u003Cversion>1.13.0\u003C\u002Fversion>\n    \u003C\u002Fdependency>\n    \u003Cdependency>\n        \u003CgroupId>org.apache.flink\u003C\u002FgroupId>\n        \u003CartifactId>flink-table-planner_2.11\u003C\u002FartifactId>\n        \u003Cversion>1.13.0\u003C\u002Fversion>\n    \u003C\u002Fdependency>\n    \u003Cdependency>\n        \u003CgroupId>org.apache.flink\u003C\u002FgroupId>\n        \u003CartifactId>flink-clients_2.11\u003C\u002FartifactId>\n        \u003Cversion>1.13.0\u003C\u002Fversion>\n    \u003C\u002Fdependency>\n\u003C\u002Fdependencies>\n```\n\n*(注：若使用 Flink 1.12\u002F1.11\u002F1.10\u002F1.9，请将 artifactId 中的版本号及 flink 依赖版本调整为对应版本)*\n\n## 3. 基本使用\n\n### PyAlink 示例\n\n建议使用 Jupyter Notebook 以获得更好的交互体验。\n\n1.  **启动环境**：\n    ```bash\n    jupyter notebook\n    ```\n2.  **编写代码**：\n    新建 Python 3 Notebook，输入以下代码运行一个简单的数据读取与列选择示例：\n\n```python\nfrom pyalink.alink import *\n\n# 1. 初始化本地运行环境 (并行度设为 1)\nuseLocalEnv(1)\n\n# 2. 创建数据源并设置 Schema\nsource = CsvSourceBatchOp()\\\n    .setSchemaStr(\"sepal_length double, sepal_width double, petal_length double, petal_width double, category string\")\\\n    .setFilePath(\"https:\u002F\u002Falink-release.oss-cn-beijing.aliyuncs.com\u002Fdata-files\u002Firis.csv\")\n\n# 3. 执行操作：选择部分列\nres = source.select([\"sepal_length\", \"sepal_width\"])\n\n# 4. 触发执行并打印结果 (转为 Pandas DataFrame)\ndf = res.collectToDataframe()\nprint(df)\n```\n\n### Java API 示例\n\n以下是一个完整的 KMeans 聚类流程示例：\n\n```java\nimport com.alibaba.alink.operator.batch.source.CsvSourceBatchOp;\nimport com.alibaba.alink.pipeline.dataproc.vector.VectorAssembler;\nimport com.alibaba.alink.pipeline.clustering.KMeans;\nimport com.alibaba.alink.pipeline.Pipeline;\nimport com.alibaba.alink.operator.batch.BatchOperator;\n\npublic class QuickStart {\n    public static void main(String[] args) throws Exception {\n        String URL = \"https:\u002F\u002Falink-release.oss-cn-beijing.aliyuncs.com\u002Fdata-files\u002Firis.csv\";\n        String SCHEMA_STR = \"sepal_length double, sepal_width double, petal_length double, petal_width double, category string\";\n\n        \u002F\u002F 1. 加载数据\n        BatchOperator data = new CsvSourceBatchOp()\n                .setFilePath(URL)\n                .setSchemaStr(SCHEMA_STR);\n\n        \u002F\u002F 2. 特征组装\n        VectorAssembler va = new VectorAssembler()\n                .setSelectedCols(new String[]{\"sepal_length\", \"sepal_width\", \"petal_length\", \"petal_width\"})\n                .setOutputCol(\"features\");\n\n        \u002F\u002F 3. 
定义 KMeans 算法\n        KMeans kMeans = new KMeans()\n                .setVectorCol(\"features\")\n                .setK(3)\n                .setPredictionCol(\"prediction_result\")\n                .setMaxIter(100);\n\n        \u002F\u002F 4. 构建管道并运行\n        Pipeline pipeline = new Pipeline().add(va).add(kMeans);\n        pipeline.fit(data).transform(data).print();\n    }\n}\n```\n\n### 核心概念提示\n*   **组件连接**：通过 `link`, `linkTo`, `linkFrom` 将多个算子串联。\n*   **触发执行**：\n    *   批处理作业：调用 `print()`, `collectToDataframe()` 或 `BatchOperator.execute()`。\n    *   流处理作业：调用 `StreamOperator.execute()`。","某电商数据团队需要基于 Flink 构建实时用户行为分析系统，以快速迭代推荐算法模型。\n\n### 没有 Alink 时\n- **开发门槛高**：数据科学家需手动编写大量 Java\u002FScala 代码调用 Flink ML 底层 API，难以发挥 Python 生态优势。\n- **算法复用难**：缺乏统一的算法组件库，每次新需求都要重复造轮子，特征工程与模型训练逻辑耦合严重。\n- **调试周期长**：本地模拟大规模流式数据极其困难，代码上线前无法有效验证逻辑，排查错误耗时耗力。\n- **运维成本高**：自定义算子缺乏标准化监控指标，任务失败时难以快速定位是数据倾斜还是算法收敛问题。\n\n### 使用 Alink 后\n- **Python 友好**：通过 PyAlink 直接使用 Python 编写流批一体的机器学习管道，无缝衔接 Pandas 与 Jupyter 工作流。\n- **组件丰富**：直接调用 Alink 内置的数十种标准化算法组件（如特征标准化、随机森林），像搭积木一样构建复杂链路。\n- **即时反馈**：利用 `useLocalEnv` 在本地轻松模拟并行环境，秒级预览中间结果，大幅缩短从想法到验证的闭环时间。\n- **生产就绪**：依托 Flink 原生能力，一键将本地调试好的脚本部署为高可用集群任务，自动处理容错与状态管理。\n\nAlink 让算法工程师能专注于业务逻辑创新，而非底层分布式框架的繁琐实现，真正实现了机器学习的高效落地。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Falibaba_Alink_dca50285.png","alibaba","Alibaba","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Falibaba_f65f7221.png","Alibaba Open Source",null,"https:\u002F\u002Fopensource.alibaba.com\u002F","https:\u002F\u002Fgithub.com\u002Falibaba",[83,87,91,95,99,103,107,111,114,117],{"name":84,"color":85,"percentage":86},"Java","#b07219",50.8,{"name":88,"color":89,"percentage":90},"HTML","#e34c26",34.8,{"name":92,"color":93,"percentage":94},"Python","#3572A5",7.7,{"name":96,"color":97,"percentage":98},"C++","#f34b7d",4.6,{"name":100,"color":101,"percentage":102},"TypeScript","#3178c6",1.6,{"name":104,"color":105,"percentage":106},"CMake","#DA3434",0.3,{"name":108,"color":109,"percentage":110},"Jupyter Notebook","#DA5B0B",0.1,{"name":112,"color":113,"percentage":110},"Less","#1d365d",{"name":115,"color":116,"percentage":110},"Scala","#c22d40",{"name":118,"color":119,"percentage":110},"Shell","#89e051",3619,791,"2026-03-30T13:11:28","Apache-2.0","未说明",{"notes":126,"python":127,"dependencies":128},"1. PyAlink 包需根据 Flink 版本选择安装（pyalink 对应最新 Flink 1.13，或 pyalink-flink-1.12\u002F1.11\u002F1.10\u002F1.9），不同版本包不能共存。\n2. Java 接口开发需使用 Scala 2.11 编译的 Flink 依赖。\n3. 支持批式和流式作业，可通过 Jupyter Notebook 获得更好体验。\n4. 
集群部署需自行准备 Flink 环境并编译 Alink 源码包。","3.6, 3.7, 3.8",[129,130,131,132],"Java 8","Apache Flink (1.9 - 1.13)","pyalink (对应 Flink 版本)","Scala 2.11 (Java 接口依赖)",[13,51],[135,136,137,138,139,140,141,142,143,144,145,146,147,148,149,150,151,152,153],"machine-learning","flink","classification","clustering","regression","graph-algorithms","xgboost","recommender","recommender-system","feature-engineering","statistics","kafka","data-mining","apriori","word2vec","flink-ml","fm","flink-machine-learning","graph-embedding","2026-03-27T02:49:30.150509","2026-04-06T05:35:30.319573",[157,162,166,171,176,181],{"id":158,"question_zh":159,"answer_zh":160,"source_url":161},10703,"WebUI 启动后运行流程报错，是否与 Flink 版本有关？","是的，WebUI Server 默认使用的是 Flink 1.9 版本。如果您使用的是 Flink 1.10 或其他版本，需要修改依赖后重新打包。具体代码位置在：https:\u002F\u002Fgithub.com\u002Falibaba\u002FAlink\u002Fblob\u002F53bef0f1d17c023af1ce30b4116e4c30b21f6c68\u002Fwebui\u002Fserver\u002Fpom.xml#L19，请修改该处的 Flink 版本号以匹配您的环境。","https:\u002F\u002Fgithub.com\u002Falibaba\u002FAlink\u002Fissues\u002F204",{"id":163,"question_zh":164,"answer_zh":165,"source_url":161},10704,"WebUI 部署后访问地址显示为 localhost 而非具体 IP，导致连接失败怎么办？","这是由于前后端分离配置问题导致的。临时解决方法是修改前端代码中的请求地址：打开文件 https:\u002F\u002Fgithub.com\u002Falibaba\u002FAlink\u002Fblob\u002Fmaster\u002Fwebui\u002Fweb\u002Fsrc\u002Frequests\u002Frequest.ts，将第一行的目标 IP 修改为您的服务器实际 IP，然后重新打包部署。正规做法是在前端所在的 Docker 镜像中配置 Nginx，增加对 Server 的转发规则。",{"id":167,"question_zh":168,"answer_zh":169,"source_url":170},10705,"使用 PyAlink 时出现 'JavaPackage' object is not callable 报错如何解决？","该错误通常是因为安装的 PyAlink 版本过旧或与当前环境不兼容。请下载并安装最新版本的 PyAlink 进行尝试。确保版本与您的 Flink 和 Scala 版本匹配（例如 1.0-flink-1.9.0-scala-2.11 或更新的 1.0.1 版本）。","https:\u002F\u002Fgithub.com\u002Falibaba\u002FAlink\u002Fissues\u002F17",{"id":172,"question_zh":173,"answer_zh":174,"source_url":175},10706,"运行 Alink 或 PyAlink 时报错，对 Java 运行环境有什么要求？","Alink 的运行环境必须使用 Java 8 (JDK 1.8)。如果使用了其他版本的 Java（如 Java 11 或更高），可能会导致类加载错误或 'JavaPackage' object is not callable 等异常。请检查您的 java -version 输出，确保为 1.8.x。","https:\u002F\u002Fgithub.com\u002Falibaba\u002FAlink\u002Fissues\u002F6",{"id":177,"question_zh":178,"answer_zh":179,"source_url":180},10707,"如何获取 Alink 的编译 Jar 包？是否需要在 Maven 仓库手动下载？","不需要手动编译，Alink 每次发布都会将对应的 Jar 包发布到 Maven 中央仓库。您可以直接在 Maven 仓库搜索 \"Alink\" 找到对应版本进行依赖引入。搜索地址：https:\u002F\u002Fmvnrepository.com\u002Fsearch?q=Alink。如果需要自行构建，项目已包含所需的 Flink 源码，通常无需手动安装 Flink。","https:\u002F\u002Fgithub.com\u002Falibaba\u002FAlink\u002Fissues\u002F106",{"id":182,"question_zh":183,"answer_zh":184,"source_url":175},10708,"Maven 打包时提示无法解析依赖包（如 json-lib）或 Unknown host 错误怎么办？","这通常是网络问题导致无法连接到 Maven 阿里云镜像（maven.aliyun.com）。请检查网络连接和 DNS 解析是否正常。如果问题持续，可以尝试切换其他 Maven 镜像源，或者检查 pom.xml 中的仓库配置是否正确。此外，确保运行环境为 Java 8，某些依赖可能不兼容高版本 JDK。",[186,191,196,201,206,211,216,221,226,231,236,241,246,251,256,261,266,271,276,281],{"id":187,"version":188,"summary_zh":189,"released_at":190},71296,"v1.6.2","Release version 1.6.2","2023-11-03T10:32:24",{"id":192,"version":193,"summary_zh":194,"released_at":195},71297,"v1.6.1","Optimize performance and fixed bugs.","2023-03-15T07:27:31",{"id":197,"version":198,"summary_zh":199,"released_at":200},71298,"v1.6.0"," Optimize performance and fixed bugs. ","2022-11-08T08:03:36",{"id":202,"version":203,"summary_zh":204,"released_at":205},71299,"v1.5.8","1. Add more rules about exception.\r\n2. Add more graph algorithms.","2022-09-08T08:08:07",{"id":207,"version":208,"summary_zh":209,"released_at":210},71300,"v1.5.7","1. Add pipeline model support in  model stream.\r\n2. Refine online learning.\r\n3. 
Fixed some bugs.","2022-07-21T04:22:06",{"id":212,"version":213,"summary_zh":214,"released_at":215},71301,"v1.5.6","1. Add partition in ak, csv and parquet source\u002Fsink.\r\n2. Add model stream as initial model in online learning\r\n3. Add lazyVizDive and lazyVizStatistics\r\n4. Add hbase connector\r\n5. Add custom path in resource plugin","2022-06-17T11:52:18",{"id":217,"version":218,"summary_zh":219,"released_at":220},71302,"v1.5.5","1. Add datahub catalog.\r\n2. Add Kernel Density Estimate\r\n3. Fixed some bugs","2022-05-10T12:56:24",{"id":222,"version":223,"summary_zh":224,"released_at":225},71303,"v1.5.4","1. Add parquet source\r\n2. Add Export2FileSinkStreamOp https:\u002F\u002Fwww.yuque.com\u002Fpinshu\u002Falink_tutorial\u002Fbook_java_3_2_5\r\n3. Add Model File Path on predictop and pipeline.\r\n4. Add TFRecordDataset source\u002Fsink\r\n5. Add ONNX predict operators","2022-04-22T10:43:59",{"id":227,"version":228,"summary_zh":229,"released_at":230},71304,"v1.5.3","1. Add xgboost wrapper using plugin\r\n2. Add PyTorch model predictor, close #207.\r\n3. Support pyalink on vvp. https:\u002F\u002Fwww.yuque.com\u002Fpinshu\u002Falink_tutorial\u002Fflink_vvp\r\n4. Support pyalink on dsw. https:\u002F\u002Fwww.yuque.com\u002Fpinshu\u002Falink_tutorial\u002Fpai_dsw\r\n5. Support pyalink on pai-designer. https:\u002F\u002Fwww.yuque.com\u002Fpinshu\u002Falink_tutorial\u002Fpai_designer\r\n6. Add serializer of MTable and Tensor types\r\n7. Add code of pyalink, close #208.\r\n8. Fix #199, #193.","2022-04-02T10:06:55",{"id":232,"version":233,"summary_zh":234,"released_at":235},71305,"v1.5.2","1. Improve the performance of model stream.\r\n2. Add ToTensor, ToVector, ToMTable and support tensor, vector, mtable types in csv source.\r\n3. Keras sequential operators now support string\u002Fint types; Improve plugin mechanism in TF predictor.\r\n4. Add redis to plugin and add lookup redis operator.\r\n5. PyAlink: StreamOperator print function now supports specifying port. ","2022-01-07T10:37:31",{"id":237,"version":238,"summary_zh":239,"released_at":240},71306,"v1.5.1","1. Improve the performance of dl module.\r\n2. Resolve many issues on Windows platform.\r\n3. Add incremental training mode for LR, Softmax etc.\r\n4. Improve the performance of graph-based random walk algorithms.\r\n","2021-11-26T10:13:20",{"id":242,"version":243,"summary_zh":244,"released_at":245},71307,"v1.5.0","1. Add timeseries\r\n  ○ Add prophet model. #176\r\n  ○ Add AutoArima, Arima, HoltWinters, AutoGarch\r\n  ○ Add LSTNet, DeepAR\r\n2. Add deep learning module (Linux and MacOSX Intel chips)\r\n3. Add MTable, Tensor\r\n4. Add resource plugin\r\n5. Improve usage of PyFlink in PyAlink, close #178","2021-10-09T08:42:50",{"id":247,"version":248,"summary_zh":249,"released_at":250},71308,"v1.4.0","1. Adapt flink 1.13.\r\n2. Fixed some bugs\r\n3. Add some feature engineering methods\r\n4. Refine the documents of BatchOp\u002FStreamOp\r\n5. Add java demos","2021-06-11T17:44:00",{"id":252,"version":253,"summary_zh":254,"released_at":255},71309,"v1.3.2","Release note:\r\n1. Fix SLF4J load error when running Java example #109\r\n2. One hot encode: a little optimization #112\r\n3. Add quoting in mysql column name #159\r\n4. Fix some errors (decimal exception & partition set invalid) #162\r\n5. Fix partition overwrite in hive.\r\n6. Upgrade the flink version from 1.12.0 to 1.12.1","2021-02-24T05:21:10",{"id":257,"version":258,"summary_zh":259,"released_at":260},71310,"v1.3.1","1. Adapt flink 1.12.\r\n2. Add plugin of kafka.\r\n3. 
Add s3 file system.\r\n4. Add odps catalog.\r\n5. Fix poisson and add glm model info.\r\n6. Add multi-files in pipeline loader and local predictor loader.\r\n7. Use legacy serializer to be compatible with old ak format.\r\n8. Change vector type to CompositeType and change sparse vector to a pojo type.\r\n9. Remove the REGEXP_REPLACE in sql selector for flink 1.12","2021-01-07T11:49:15",{"id":262,"version":263,"summary_zh":264,"released_at":265},71311,"v1.3.0","1. Add more model info batch ops and support printing model info in pipeline model.\r\n2. Add recommendation module.\r\n   - Supported recommenders are:\r\n      - Als\r\n      - Factorization Machines\r\n      - ItemCF\r\n      - UserCF\r\n   - Other supported functions for the recommendation module are:\r\n      - Leave k-object out\r\n      - Leave top k-object out\r\n      - Ranking evaluation\r\n      - Multi-Label evaluation\r\n3. Add online learning algorithms.\r\n   - ftrl model filter\r\n4. Add a series of similarity algorithms.\r\n   - VectorNearestNeighbor\r\n   - TextSimilarity\r\n   - TextNearestNeighbor\r\n   - TextApproxNearestNeighbor\r\n   - StringSimilarity\r\n   - StringNearestNeighbor\r\n   - StringApproxNearestNeighbor\r\n5. Add DocWordCountBatchOp, KeywordsExtractionBatchOp, TfidfBatchOp, WordCountBatchOp\r\n6. Add KNN\r\n7. Add GeoKMeans, Streaming Kmeans\r\n8. Add model selector algorithms.\r\n   - RandomSearchCV\r\n   - RandomSearchTVSplit\r\n9. Add plugin in filesystem and catalog. Add catalogs of hive, mysql, derby and sqlite\r\n10. PyAlink:\r\n   - Align with new functionalities on the Java side, including new operators, catalog, plugin mechanism, and so on;\r\n   - For Flink version 1.9, PyAlink now depends on PyFlink directly, resulting in supporting flink run, and table-related operations.\r\n11. Fix some issues, optimize performance and add more parameters in linear and tree models\r\n12. Add test utils module and optimize performance of unit tests.\r\n13. Remove the db module.\r\n14. Refine the save\u002Fload in pipeline and pipeline model. Use Ak as the default format for save\u002Fload.\r\n15. Support loading LocalPredictor from an Ak file saved on a filesystem. This avoids `collect` when loading the LocalPredictor. See #78 #79\r\n16. Add multi-thread support in all mappers\r\n17. Optimize memory usage of batch prediction.\r\n18. Add pseudoInverse in matrix\r\n19. Support sparse vectors without a size\r\n20. Fix a sequencing issue when using linkFrom with the model info batch op\r\n21. Optimize the format of lazy print.\r\n22. Add Stopwatch and TimeSpan\r\n23. Add serialVersionUID in all serializable classes.","2020-11-30T08:11:40",{"id":267,"version":268,"summary_zh":269,"released_at":270},71312,"v1.2.0","1. Adapt for Flink 1.11\r\n   - Flink API calls (#129), Hive connectors (#130) and kafka connector (#129) are adapted for Flink 1.11.\r\n   - Adjust `FilePath` of `FileSystem` for Flink 1.11 #131\r\n2. Add Factorization Machines classification and regression #115\r\n3. Support Lazy APIs for higher user interactivity and richer information.\r\nLazy APIs enable intermediate outputs of the ML pipeline to be printed, collected, and post-processed along with the main stream of data processing. 
Such intermediate outputs include: ML model and training information, evaluation metrics, data statistics, etc.\r\n   - Supported in PyAlink\r\n   - Support Lazy APIs for BatchOperators and related methods in EstimatorBase\u002FTransformerBase #116\r\n   - Add model information:\r\n      - Linear model #118 #132\r\n      - Tree model #125\r\n      - PCA #117\r\n      - ChisqSelector #117\r\n      - VectorChisqSelector #117\r\n      - KMeans #120\r\n      - BisectingKMeans #120\r\n      - NaiveBayes #122\r\n      - Lda #122\r\n      - GaussianMixture #120\r\n      - OneHotEncoder #120\r\n      - QuantileDiscretizer #120\r\n      - MinMaxScaler #122\r\n      - VectorMinMaxScaler #122\r\n      - MaxAbsScaler #122\r\n      - VectorMaxAbsScaler #122\r\n      - StandardScaler #122\r\n      - VectorStandardScaler #122\r\n   - Add training information:\r\n      - word2vec #125\r\n   - Add statistics:\r\n      - Correlation #117\r\n      - Summary #117\r\n   - Add EvaluationMetrics #124\r\n4. Add FileSystem APIs. #126\r\nUsing FileSystem APIs, users can process files on different file systems with a unified and friendly experience. Such processing can be `exists`, `isDir`, `list`, `read`, `write` or other commonly used file functions. Supported file systems are:\r\n      - HDFS\r\n      - OSS\r\n      - Local\r\n5. Ak source\u002Fsink and Csv source\u002Fsink now support the new FileSystem APIs. #126\r\nAk is a file format storing data together with its schema that can be written to a filesystem, taking advantage of a compressed, tabular data representation. The supported APIs are shown in the table below:\r\n\r\n    |  | HDFS | OSS | Local |\r\n    | :---: | :---: | :---: | :---: |\r\n    | Ak source | ✔️ | ✔️ | ✔️ |\r\n    | Ak sink | ✔️ | ✔️ | ✔️ |\r\n    | Csv source | ✔️ | ✔️ | ✔️ |\r\n    | Csv sink | ✔️ | ✔️ | ✔️ |\r\n\r\n\r\n6. Support EqualWidthDiscretizer. #123\r\n7. Feature enhancements and API unification in Clustering. #121\r\n8. Refine code of QuantileDiscretizer and OneHotEncoder #111\r\n9. Fix predict stream op in alspredictstreamop.md #104","2020-07-31T08:58:17",{"id":272,"version":273,"summary_zh":274,"released_at":275},71313,"v1.1.2","1. Add transformers among formats Vector, CSV, Json, KV, Columns and Triple #93\r\n• Support AnyToAny transformation\r\n• Unified transformation params for easy use.\r\n2. Support SQL select statements in the Pipeline and LocalPredictor #61\r\n• Support flink planner built-in functions regarding individual rows: comparison, logical, arithmetic, string, temporal, conditional, type conversion, hash, etc.\r\n• Add alink_shaded\u002Fshaded_protobuf_java to support usage of native Calcite.\r\n3. Support Hive source and sink #96\r\n• Support Batch\u002FStream source&sink of Hive.\r\n• Support table partitioning.\r\n• Simplify the dependency on the Hive jar.\r\n• Support multiple versions: 2.0, 2.1, 2.2, 2.3, 3.0\r\n4. Fix PyAlink starting and UDF issues on Windows #76, #77\r\n5. Support BigInteger type in MySql source #86\r\n6. Add open and close in mapper. #92\r\n7. Add open function in SegmentMapper and StopwordsRemoverMapper #94\r\n8. Unify HandleInvalid Params #95","2020-06-08T15:21:50",{"id":277,"version":278,"summary_zh":279,"released_at":280},71314,"v1.1.1","## Enhancements & New Features\r\n1. Optimize conversion between operator and dataframe\r\n2. Auto-detect localIp when using useRemoteEnv\r\n3. Add enum type parameter #65\r\n• Adapt enum type params in quantile, distance and decision tree. 
#67\r\n• Linear model train params changed to enum #71\r\n• Kafka, StringIndexer and Join add enum parameters #72\r\n• Adapt enum type params in pca, chi square test, glm and correlation. #73\r\n4. StreamOp window group by #68\r\n5. Add operators to parse strings in CSV, JSON and KV formats to columns #70\r\n6. Tokenizer supports string split with multiple spaces #69\r\n7. Make error message clear when selected columns are not found #66\r\n8. Add an FTRL example #64\r\n## Fix & Refinements\r\n1. Fix dill version conflict\r\n2. Fix ALSExample error #33\r\n3. Fix bug of HasVectorSize alias #56\r\n4. Fix mysqlsource error when using collect method #45","2020-04-20T13:01:39",{"id":282,"version":283,"summary_zh":284,"released_at":285},71315,"v1.1.0","## Enhancements & New Features\r\n\r\n- Improve UDF\u002FUDTF operators; Java and PyAlink now have consistent usage and behaviors. #32 #44.\r\n- Publish to Maven Central and PyPI.\r\n- Support Flink 1.10 and Flink 1.9. #46\r\n   - https:\u002F\u002Fgithub.com\u002Falibaba\u002FAlink\u002Freleases\u002Ftag\u002Fv1.1.0-flink-1.10\r\n   - https:\u002F\u002Fgithub.com\u002Falibaba\u002FAlink\u002Freleases\u002Ftag\u002Fv1.1.0-flink-1.9\r\n- Support more Kafka connectors. #41.\r\n\r\n## API change\r\n\r\n- Modify Naive Bayes algorithm as a text classifier. #47\r\n- Modify and enhance the parameters and models in QuantileDiscretizer, OneHotEncoder and Bucketizer. #48\r\n\r\n## Documentation\r\n\r\n- Update data links in docs and codes. #28\r\n- Update PyAlink install instructions. #8\r\n\r\n## Fix & Refinements\r\n\r\n- Fix the problem in LDA online method and refine comments in FeatureLabelUtil. #29\r\n- Fix the bug that the initial data of KMeansAssignCluster is not cleared. #31\r\n- Fix the int overflow bug in reading large csv files, and add test cases for CsvFileInputSplit. See #27\r\n- Clean up some code. #15\r\n- Remove a redundant test case whose data source is inaccessible. See #28\r\n- Fix the NPE in PCA. 
see #42\r\n\r\n## PyPI support\r\n\r\n- Support PyAlink installation using `pip install pyalink`\r\n\r\n## Maven Dependencies\r\n\r\nAlink is now synchronized to the Maven central repository, which you can easily add to Maven projects.\r\n\r\n### With Flink-1.10\r\n```xml\r\n\u003Cdependency>\r\n    \u003CgroupId>com.alibaba.alink\u003C\u002FgroupId>\r\n    \u003CartifactId>alink_core_flink-1.10_2.11\u003C\u002FartifactId>\r\n    \u003Cversion>1.1.0\u003C\u002Fversion>\r\n\u003C\u002Fdependency>\r\n\u003Cdependency>\r\n    \u003CgroupId>org.apache.flink\u003C\u002FgroupId>\r\n    \u003CartifactId>flink-streaming-scala_2.11\u003C\u002FartifactId>\r\n    \u003Cversion>1.10.0\u003C\u002Fversion>\r\n\u003C\u002Fdependency>\r\n\u003Cdependency>\r\n    \u003CgroupId>org.apache.flink\u003C\u002FgroupId>\r\n    \u003CartifactId>flink-table-planner_2.11\u003C\u002FartifactId>\r\n    \u003Cversion>1.10.0\u003C\u002Fversion>\r\n\u003C\u002Fdependency>\r\n```\r\n\r\n### With Flink-1.9\r\n```xml\r\n\u003Cdependency>\r\n    \u003CgroupId>com.alibaba.alink\u003C\u002FgroupId>\r\n    \u003CartifactId>alink_core_flink-1.9_2.11\u003C\u002FartifactId>\r\n    \u003Cversion>1.1.0\u003C\u002Fversion>\r\n\u003C\u002Fdependency>\r\n\u003Cdependency>\r\n    \u003CgroupId>org.apache.flink\u003C\u002FgroupId>\r\n    \u003CartifactId>flink-streaming-scala_2.11\u003C\u002FartifactId>\r\n    \u003Cversion>1.9.0\u003C\u002Fversion>\r\n\u003C\u002Fdependency>\r\n\u003Cdependency>\r\n    \u003CgroupId>org.apache.flink\u003C\u002FgroupId>\r\n    \u003CartifactId>flink-table-planner_2.11\u003C\u002FartifactId>\r\n    \u003Cversion>1.9.0\u003C\u002Fversion>\r\n\u003C\u002Fdependency>\r\n```\r\n","2020-02-28T13:20:53"]