[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-intel--BigDL":3,"tool-intel--BigDL":61},[4,18,26,36,44,52],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":17},4358,"openclaw","openclaw\u002Fopenclaw","OpenClaw 是一款专为个人打造的本地化 AI 助手，旨在让你在自己的设备上拥有完全可控的智能伙伴。它打破了传统 AI 助手局限于特定网页或应用的束缚，能够直接接入你日常使用的各类通讯渠道，包括微信、WhatsApp、Telegram、Discord、iMessage 等数十种平台。无论你在哪个聊天软件中发送消息，OpenClaw 都能即时响应，甚至支持在 macOS、iOS 和 Android 设备上进行语音交互，并提供实时的画布渲染功能供你操控。\n\n这款工具主要解决了用户对数据隐私、响应速度以及“始终在线”体验的需求。通过将 AI 部署在本地，用户无需依赖云端服务即可享受快速、私密的智能辅助，真正实现了“你的数据，你做主”。其独特的技术亮点在于强大的网关架构，将控制平面与核心助手分离，确保跨平台通信的流畅性与扩展性。\n\nOpenClaw 非常适合希望构建个性化工作流的技术爱好者、开发者，以及注重隐私保护且不愿被单一生态绑定的普通用户。只要具备基础的终端操作能力（支持 macOS、Linux 及 Windows WSL2），即可通过简单的命令行引导完成部署。如果你渴望拥有一个懂你",349277,3,"2026-04-06T06:32:30",[13,14,15,16],"Agent","开发框架","图像","数据工具","ready",{"id":19,"name":20,"github_repo":21,"description_zh":22,"stars":23,"difficulty_score":10,"last_commit_at":24,"category_tags":25,"status":17},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,"2026-04-05T11:01:52",[14,15,13],{"id":27,"name":28,"github_repo":29,"description_zh":30,"stars":31,"difficulty_score":32,"last_commit_at":33,"category_tags":34,"status":17},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 
代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",141543,2,"2026-04-06T11:32:54",[14,13,35],"语言模型",{"id":37,"name":38,"github_repo":39,"description_zh":40,"stars":41,"difficulty_score":32,"last_commit_at":42,"category_tags":43,"status":17},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",107888,"2026-04-06T11:32:50",[14,15,13],{"id":45,"name":46,"github_repo":47,"description_zh":48,"stars":49,"difficulty_score":10,"last_commit_at":50,"category_tags":51,"status":17},4487,"LLMs-from-scratch","rasbt\u002FLLMs-from-scratch","LLMs-from-scratch 是一个基于 PyTorch 的开源教育项目，旨在引导用户从零开始一步步构建一个类似 ChatGPT 的大型语言模型（LLM）。它不仅是同名技术著作的官方代码库，更提供了一套完整的实践方案，涵盖模型开发、预训练及微调的全过程。\n\n该项目主要解决了大模型领域“黑盒化”的学习痛点。许多开发者虽能调用现成模型，却难以深入理解其内部架构与训练机制。通过亲手编写每一行核心代码，用户能够透彻掌握 Transformer 架构、注意力机制等关键原理，从而真正理解大模型是如何“思考”的。此外，项目还包含了加载大型预训练权重进行微调的代码，帮助用户将理论知识延伸至实际应用。\n\nLLMs-from-scratch 特别适合希望深入底层原理的 AI 开发者、研究人员以及计算机专业的学生。对于不满足于仅使用 
API，而是渴望探究模型构建细节的技术人员而言，这是极佳的学习资源。其独特的技术亮点在于“循序渐进”的教学设计：将复杂的系统工程拆解为清晰的步骤，配合详细的图表与示例，让构建一个虽小但功能完备的大模型变得触手可及。无论你是想夯实理论基础，还是为未来研发更大规模的模型做准备",90106,"2026-04-06T11:19:32",[35,15,13,14],{"id":53,"name":54,"github_repo":55,"description_zh":56,"stars":57,"difficulty_score":10,"last_commit_at":58,"category_tags":59,"status":17},4292,"Deep-Live-Cam","hacksider\u002FDeep-Live-Cam","Deep-Live-Cam 是一款专注于实时换脸与视频生成的开源工具，用户仅需一张静态照片，即可通过“一键操作”实现摄像头画面的即时变脸或制作深度伪造视频。它有效解决了传统换脸技术流程繁琐、对硬件配置要求极高以及难以实时预览的痛点，让高质量的数字内容创作变得触手可及。\n\n这款工具不仅适合开发者和技术研究人员探索算法边界，更因其极简的操作逻辑（仅需三步：选脸、选摄像头、启动），广泛适用于普通用户、内容创作者、设计师及直播主播。无论是为了动画角色定制、服装展示模特替换，还是制作趣味短视频和直播互动，Deep-Live-Cam 都能提供流畅的支持。\n\n其核心技术亮点在于强大的实时处理能力，支持口型遮罩（Mouth Mask）以保留使用者原始的嘴部动作，确保表情自然精准；同时具备“人脸映射”功能，可同时对画面中的多个主体应用不同面孔。此外，项目内置了严格的内容安全过滤机制，自动拦截涉及裸露、暴力等不当素材，并倡导用户在获得授权及明确标注的前提下合规使用，体现了技术发展与伦理责任的平衡。",88924,"2026-04-06T03:28:53",[14,15,13,60],"视频",{"id":62,"github_repo":63,"name":64,"description_en":65,"description_zh":66,"ai_summary_zh":66,"readme_en":67,"readme_zh":68,"quickstart_zh":69,"use_case_zh":70,"hero_image_url":71,"owner_login":72,"owner_name":73,"owner_avatar_url":74,"owner_bio":75,"owner_company":76,"owner_location":76,"owner_email":77,"owner_twitter":76,"owner_website":76,"owner_url":78,"languages":79,"stars":117,"forks":118,"last_commit_at":119,"license":120,"difficulty_score":10,"env_os":121,"env_gpu":122,"env_ram":123,"env_deps":124,"category_tags":136,"github_topics":137,"view_count":32,"oss_zip_url":76,"oss_zip_packed_at":76,"status":17,"created_at":145,"updated_at":146,"faqs":147,"releases":163},4401,"intel\u002FBigDL","BigDL","BigDL: Distributed TensorFlow, Keras and PyTorch on Apache Spark\u002FFlink & Ray","BigDL 是一套专为英特尔硬件优化的开源 AI 库，旨在帮助用户轻松将数据分析和人工智能应用从笔记本电脑无缝扩展至云端集群。它核心解决了深度学习框架（如 TensorFlow、PyTorch）在处理大规模数据时面临的分布式训练难、硬件加速复杂以及隐私安全等痛点。\n\n这套工具集特别适合需要处理海量数据的 AI 开发者、数据科学家及研究人员。通过其子项目 Orca，用户可以在 Spark 或 Ray 集群上直接运行现有的单机深度学习代码，无需重写逻辑即可实现分布式扩容；Nano 模块则能透明地加速英特尔 CPU 和 GPU 上的模型推理与训练。此外，BigDL 还提供了 Chronos 用于自动化时间序列分析，Friesian 
用于构建推荐系统，以及 PPML 利用硬件安全技术保护数据隐私。\n\n值得注意的是，原本的大语言模型（LLM）支持已迁移至新项目 IPEX-LLM，专注于在英特尔架构上高效运行大模型。BigDL 让开发者能够利用熟悉的 Python 生态，结合大数据平台的强大算力，以更低的成本构建高性能、可扩展且安全的智能应用。","> [!IMPORTANT]\n> ***`bigdl-llm` has now become `ipex-llm`, and our future development will move to the [IPEX-LLM](https:\u002F\u002Fgithub.com\u002Fintel-analytics\u002Fipex-llm) project.***\n\n---\n\u003Cdiv align=\"center\">\n\n\u003Cp align=\"center\"> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fintel_BigDL_readme_69c9d089f658.jpg\" height=\"140px\">\u003Cbr>\u003C\u002Fp>\n\n\u003C\u002Fdiv>\n\n---\n\n## Overview\n\nBigDL seamlessly scales your data analytics & AI applications from laptop to cloud, with the following libraries:\n\n- `LLM` ***(deprecated - please use [IPEX-LLM](https:\u002F\u002Fgithub.com\u002Fintel-analytics\u002Fipex-llm) instead)***: Optimized large language model library for Intel CPU and GPU\n\n- [Orca](#orca): Distributed Big Data & AI (TF & PyTorch) Pipeline on Spark and Ray\n\n- [Nano](#nano): Transparent Acceleration of TensorFlow & PyTorch Programs on Intel CPU\u002FGPU\n\n- [DLlib](#dllib): “Equivalent of Spark MLlib” for Deep Learning\n\n- [Chronos](#chronos): Scalable Time Series Analysis using AutoML\n\n- [Friesian](#friesian): End-to-End Recommendation Systems\n\n- [PPML](#ppml): Secure Big Data and AI (with SGX\u002FTDX Hardware Security)\n\nFor more information, you may [read the docs](https:\u002F\u002Fbigdl.readthedocs.io\u002F).\n\n---\n\n## Choosing the right BigDL library\n```mermaid\nflowchart TD;\n    Feature1{{HW Secured Big Data & AI?}};\n    Feature1-- No -->Feature2{{Python vs. 
Scala\u002FJava?}};\n    Feature1-- \"Yes\"  -->ReferPPML([\u003Cem>\u003Cstrong>PPML\u003C\u002Fstrong>\u003C\u002Fem>]);\n    Feature2-- Python -->Feature3{{What type of application?}};\n    Feature2-- Scala\u002FJava -->ReferDLlib([\u003Cem>\u003Cstrong>DLlib\u003C\u002Fstrong>\u003C\u002Fem>]);\n    Feature3-- \"Large Language Model\" -->ReferLLM([\u003Cem>\u003Cstrong>LLM\u003C\u002Fstrong>\u003C\u002Fem>]);\n    Feature3-- \"Big Data + AI (TF\u002FPyTorch)\" -->ReferOrca([\u003Cem>\u003Cstrong>Orca\u003C\u002Fstrong>\u003C\u002Fem>]);\n    Feature3-- Accelerate TensorFlow \u002F PyTorch -->ReferNano([\u003Cem>\u003Cstrong>Nano\u003C\u002Fstrong>\u003C\u002Fem>]);\n    Feature3-- DL for Spark MLlib -->ReferDLlib2([\u003Cem>\u003Cstrong>DLlib\u003C\u002Fstrong>\u003C\u002Fem>]);\n    Feature3-- High Level App Framework -->Feature4{{Domain?}};\n    Feature4-- Time Series -->ReferChronos([\u003Cem>\u003Cstrong>Chronos\u003C\u002Fstrong>\u003C\u002Fem>]);\n    Feature4-- Recommender System -->ReferFriesian([\u003Cem>\u003Cstrong>Friesian\u003C\u002Fstrong>\u003C\u002Fem>]);\n    \n    click ReferLLM \"https:\u002F\u002Fgithub.com\u002Fintel-analytics\u002Fipex-llm\"\n    click ReferNano \"https:\u002F\u002Fgithub.com\u002Fintel-analytics\u002FBigDL-2.x#nano\"\n    click ReferOrca \"https:\u002F\u002Fgithub.com\u002Fintel-analytics\u002FBigDL-2.x#orca\"\n    click ReferDLlib \"https:\u002F\u002Fgithub.com\u002Fintel-analytics\u002FBigDL-2.x#dllib\"\n    click ReferDLlib2 \"https:\u002F\u002Fgithub.com\u002Fintel-analytics\u002FBigDL-2.x#dllib\"\n    click ReferChronos \"https:\u002F\u002Fgithub.com\u002Fintel-analytics\u002FBigDL-2.x#chronos\"\n    click ReferFriesian \"https:\u002F\u002Fgithub.com\u002Fintel-analytics\u002FBigDL-2.x#friesian\"\n    click ReferPPML \"https:\u002F\u002Fgithub.com\u002Fintel-analytics\u002FBigDL-2.x#ppml\"\n    \n    classDef ReferStyle1 fill:#5099ce,stroke:#5099ce;\n    classDef Feature fill:#FFF,stroke:#08409c,stroke-width:1px;\n   
 class ReferLLM,ReferNano,ReferOrca,ReferDLlib,ReferDLlib2,ReferChronos,ReferFriesian,ReferPPML ReferStyle1;\n    class Feature1,Feature2,Feature3,Feature4,Feature5,Feature6,Feature7 Feature;\n    \n```\n---\n## Installing\n\n - To install BigDL, we recommend using a [conda](https:\u002F\u002Fdocs.conda.io\u002Fprojects\u002Fconda\u002Fen\u002Flatest\u002Fuser-guide\u002Finstall\u002F) environment:\n\n    ```bash\n    conda create -n my_env \n    conda activate my_env\n    pip install bigdl\n    ```\n    To install the latest nightly build, use `pip install --pre --upgrade bigdl`; see the [Python](https:\u002F\u002Fbigdl.readthedocs.io\u002Fen\u002Flatest\u002Fdoc\u002FUserGuide\u002Fpython.html) and [Scala](https:\u002F\u002Fbigdl.readthedocs.io\u002Fen\u002Flatest\u002Fdoc\u002FUserGuide\u002Fscala.html) user guides for more details.\n\n - To install each individual library, such as Chronos, use `pip install bigdl-chronos`; see the [document website](https:\u002F\u002Fbigdl.readthedocs.io\u002F) for more details.\n---\n\n## Getting Started\n### Orca\n\n- The _Orca_ library seamlessly scales out your single-node **TensorFlow**, **PyTorch** or **OpenVINO** programs across large clusters (so as to process distributed Big Data).\n\n  \u003Cdetails>\u003Csummary>Show Orca example\u003C\u002Fsummary>\n  \u003Cbr\u002F>\n\n  You can build end-to-end, distributed data processing & AI programs using _Orca_ in 4 simple steps:\n\n  ```python\n  # 1. Initialize Orca Context (to run your program on K8s, YARN or local laptop)\n  from bigdl.orca import init_orca_context, OrcaContext\n  sc = init_orca_context(cluster_mode=\"k8s\", cores=4, memory=\"10g\", num_nodes=2) \n\n  # 2. Perform distributed data processing (supporting Spark DataFrames,\n  # TensorFlow Dataset, PyTorch DataLoader, Ray Dataset, Pandas, Pillow, etc.)\n  spark = OrcaContext.get_spark_session()\n  df = spark.read.parquet(file_path)\n  df = df.withColumn('label', df.label-1)\n  ...\n\n  # 3. 
Build deep learning models using standard framework APIs\n  # (supporting TensorFlow, PyTorch, Keras, OpenVINO, etc.)\n  from tensorflow import keras\n  ...\n  model = keras.models.Model(inputs=[user, item], outputs=predictions)  \n  model.compile(...)\n\n  # 4. Use Orca Estimator for distributed training\u002Finference\n  from bigdl.orca.learn.tf.estimator import Estimator\n  est = Estimator.from_keras(keras_model=model)  \n  est.fit(data=df,\n          feature_cols=['user', 'item'],\n          label_cols=['label'],\n          ...)\n  ```\n\n  \u003C\u002Fdetails> \n\n  *See Orca [user guide](https:\u002F\u002Fbigdl.readthedocs.io\u002Fen\u002Flatest\u002Fdoc\u002FOrca\u002FOverview\u002Forca.html), as well as [TensorFlow](https:\u002F\u002Fbigdl.readthedocs.io\u002Fen\u002Flatest\u002Fdoc\u002FOrca\u002FHowto\u002Ftf2keras-quickstart.html) and [PyTorch](https:\u002F\u002Fbigdl.readthedocs.io\u002Fen\u002Flatest\u002Fdoc\u002FOrca\u002FHowto\u002Fpytorch-quickstart.html) quickstarts, for more details.*\n\n- In addition, you can also run standard **Ray** programs on a Spark cluster using _**RayOnSpark**_ in Orca.\n\n  \u003Cdetails>\u003Csummary>Show RayOnSpark example\u003C\u002Fsummary>\n  \u003Cbr\u002F>\n  \n  You can not only run Ray programs on a Spark cluster, but also write Ray code inline with Spark code (so as to process the in-memory Spark RDDs or DataFrames) using _RayOnSpark_ in Orca.\n \n  ```python\n  # 1. Initialize Orca Context (to run your program on K8s, YARN or local laptop)\n  from bigdl.orca import init_orca_context, OrcaContext\n  sc = init_orca_context(cluster_mode=\"yarn\", cores=4, memory=\"10g\", num_nodes=2, init_ray_on_spark=True) \n\n  # 2. Distributed data processing using Spark\n  spark = OrcaContext.get_spark_session()\n  df = spark.read.parquet(file_path).withColumn(...)\n  \n  # 3. Convert Spark DataFrame to Ray Dataset\n  from bigdl.orca.data import spark_df_to_ray_dataset\n  dataset = spark_df_to_ray_dataset(df)\n  \n  # 4. 
Use Ray to operate on Ray Datasets\n  import ray\n\n  @ray.remote\n  def consume(data) -> int:\n     num_batches = 0\n     for batch in data.iter_batches(batch_size=10):\n         num_batches += 1\n     return num_batches\n\n  print(ray.get(consume.remote(dataset)))\n  ```\n\n  \u003C\u002Fdetails>  \n  \n  *See RayOnSpark [user guide](https:\u002F\u002Fbigdl.readthedocs.io\u002Fen\u002Flatest\u002Fdoc\u002FOrca\u002FOverview\u002Fray.html) and [quickstart](https:\u002F\u002Fbigdl.readthedocs.io\u002Fen\u002Flatest\u002Fdoc\u002FOrca\u002FHowto\u002Fray-quickstart.html) for more details.*\n### Nano\nYou can transparently accelerate your TensorFlow or PyTorch programs on your laptop or server using *Nano*. With minimal code changes, *Nano* automatically applies modern CPU optimizations (e.g., SIMD, multiprocessing, low precision, etc.) to standard TensorFlow and PyTorch code, with up to 10x speedup.\n\n\u003Cdetails>\u003Csummary>Show Nano inference example\u003C\u002Fsummary>\n\u003Cbr\u002F>\n\nYou can automatically optimize a trained PyTorch model for inference or deployment using _Nano_:\n\n```python\nmodel = ResNet18()\nmodel.load_state_dict(...)\ntrain_dataloader = ...\nval_dataloader = ...\ndef accuracy(pred, target):\n  ... 
\n\nfrom bigdl.nano.pytorch import InferenceOptimizer\noptimizer = InferenceOptimizer()\noptimizer.optimize(model,\n                   training_data=train_dataloader,\n                   validation_data=val_dataloader,\n                   metric=accuracy)\nnew_model, config = optimizer.get_best_model()\n\noptimizer.summary()\n```\nThe output of `optimizer.summary()` will be something like:\n```\n -------------------------------- ---------------------- -------------- ----------------------\n|             method             |        status        | latency(ms)  |     metric value     |\n -------------------------------- ---------------------- -------------- ----------------------\n|            original            |      successful      |    45.145    |        0.975         |\n|              bf16              |      successful      |    27.549    |        0.975         |\n|          static_int8           |      successful      |    11.339    |        0.975         |\n|         jit_fp32_ipex          |      successful      |    40.618    |        0.975*        |\n|  jit_fp32_ipex_channels_last   |      successful      |    19.247    |        0.975*        |\n|         jit_bf16_ipex          |      successful      |    10.149    |        0.975         |\n|  jit_bf16_ipex_channels_last   |      successful      |    9.782     |        0.975         |\n|         openvino_fp32          |      successful      |    22.721    |        0.975*        |\n|         openvino_int8          |      successful      |    5.846     |        0.962         |\n|        onnxruntime_fp32        |      successful      |    20.838    |        0.975*        |\n|    onnxruntime_int8_qlinear    |      successful      |    7.123     |        0.981         |\n -------------------------------- ---------------------- -------------- ----------------------\n* means we assume the metric value of the traced model does not change, so we don't recompute metric value to save time.\nOptimization cost 60.8s in 
total.\n```\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\u003Csummary>Show Nano Training example\u003C\u002Fsummary>\n\u003Cbr\u002F>\nYou may easily accelerate PyTorch training (e.g., IPEX, BF16, Multi-Instance Training, etc.) using Nano:\n\n```python\nfrom bigdl.nano.pytorch import TorchNano\n\n# Define your training loop inside `TorchNano.train`\nclass Trainer(TorchNano):\n    def train(self):\n        model = ResNet18()\n        optimizer = torch.optim.SGD(...)\n        loss_func = ...\n        train_loader = ...\n        val_loader = ...\n        num_epochs = ...\n\n        # call `setup` to prepare the model, optimizer(s) and dataloader(s) for accelerated training\n        model, optimizer, (train_loader, val_loader) = self.setup(model, optimizer, train_loader, val_loader)\n\n        for epoch in range(num_epochs):\n            model.train()\n            for data, target in train_loader:\n                optimizer.zero_grad()\n                output = model(data)\n                # replace loss.backward() with self.backward(loss)\n                loss = loss_func(output, target)\n                self.backward(loss)\n                optimizer.step()\n\n# Accelerated training (IPEX, BF16 and Multi-Instance Training)\nTrainer(use_ipex=True, precision='bf16', num_processes=2).train()\n```\n\n\u003C\u002Fdetails>  \n\n*See Nano [user guide](https:\u002F\u002Fbigdl.readthedocs.io\u002Fen\u002Flatest\u002Fdoc\u002FNano\u002FOverview\u002Fnano.html) and [tutorial](https:\u002F\u002Fgithub.com\u002Fintel-analytics\u002FBigDL-2.x\u002Ftree\u002Fmain\u002Fpython\u002Fnano\u002Ftutorial) for more details.*\n    \n### DLlib\n\nWith _DLlib_, you can write distributed deep learning applications as standard (**Scala** or **Python**) Spark programs, using the same **Spark DataFrames** and **ML Pipeline** APIs.\n\n\u003Cdetails>\u003Csummary>Show DLlib Scala example\u003C\u002Fsummary>\n\u003Cbr\u002F>\n\nYou can build distributed deep learning applications for Spark using *DLlib* Scala APIs in 3 simple steps:\n\n```scala\n\u002F\u002F 1. 
Call `initNNContext` at the beginning of the code: \nimport com.intel.analytics.bigdl.dllib.NNContext\nval sc = NNContext.initNNContext()\n\n\u002F\u002F 2. Define the deep learning model using Keras-style API in DLlib:\nimport com.intel.analytics.bigdl.dllib.keras.layers._\nimport com.intel.analytics.bigdl.dllib.keras.Model\nval input = Input[Float](inputShape = Shape(10))  \nval dense = Dense[Float](12).inputs(input)  \nval output = Activation[Float](\"softmax\").inputs(dense)  \nval model = Model(input, output)\n\n\u002F\u002F 3. Use `NNEstimator` to train\u002Fpredict\u002Fevaluate the model using Spark DataFrame and ML pipeline APIs\nimport org.apache.spark.sql.SparkSession\nimport org.apache.spark.ml.feature.MinMaxScaler\nimport org.apache.spark.ml.Pipeline\nimport com.intel.analytics.bigdl.dllib.nnframes.NNEstimator\nimport com.intel.analytics.bigdl.dllib.nn.CrossEntropyCriterion\nimport com.intel.analytics.bigdl.dllib.optim.Adam\nval spark = SparkSession.builder().getOrCreate()\nval trainDF = spark.read.parquet(\"train_data\")\nval validationDF = spark.read.parquet(\"val_data\")\nval scaler = new MinMaxScaler().setInputCol(\"in\").setOutputCol(\"value\")\nval estimator = NNEstimator(model, CrossEntropyCriterion())  \n        .setBatchSize(128).setOptimMethod(new Adam()).setMaxEpoch(5)\nval pipeline = new Pipeline().setStages(Array(scaler, estimator))\n\nval pipelineModel = pipeline.fit(trainDF)  \nval predictions = pipelineModel.transform(validationDF)\n```\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\u003Csummary>Show DLlib Python example\u003C\u002Fsummary>\n\u003Cbr\u002F>\n\nYou can build distributed deep learning applications for Spark using *DLlib* Python APIs in 3 simple steps:\n\n```python\n# 1. Call `init_nncontext` at the beginning of the code:\nfrom bigdl.dllib.nncontext import init_nncontext\nsc = init_nncontext()\n\n# 2. 
Define the deep learning model using Keras-style API in DLlib:\nfrom bigdl.dllib.keras.layers import Input, Dense, Activation\nfrom bigdl.dllib.keras.models import Model\ninput = Input(shape=(10,))\ndense = Dense(12)(input)\noutput = Activation(\"softmax\")(dense)\nmodel = Model(input, output)\n\n# 3. Use `NNEstimator` to train\u002Fpredict\u002Fevaluate the model using Spark DataFrame and ML pipeline APIs\nfrom pyspark.sql import SparkSession\nfrom pyspark.ml.feature import MinMaxScaler\nfrom pyspark.ml import Pipeline\nfrom bigdl.dllib.nnframes import NNEstimator\nfrom bigdl.dllib.nn.criterion import CrossEntropyCriterion\nfrom bigdl.dllib.optim.optimizer import Adam\nspark = SparkSession.builder.getOrCreate()\ntrain_df = spark.read.parquet(\"train_data\")\nvalidation_df = spark.read.parquet(\"val_data\")\nscaler = MinMaxScaler().setInputCol(\"in\").setOutputCol(\"value\")\nestimator = NNEstimator(model, CrossEntropyCriterion())\\\n    .setBatchSize(128)\\\n    .setOptimMethod(Adam())\\\n    .setMaxEpoch(5)\npipeline = Pipeline(stages=[scaler, estimator])\n\npipelineModel = pipeline.fit(train_df)\npredictions = pipelineModel.transform(validation_df)\n```\n\n\u003C\u002Fdetails>\n\n*See DLlib [NNFrames](https:\u002F\u002Fbigdl.readthedocs.io\u002Fen\u002Flatest\u002Fdoc\u002FDLlib\u002FOverview\u002Fnnframes.html) and [Keras API](https:\u002F\u002Fbigdl.readthedocs.io\u002Fen\u002Flatest\u002Fdoc\u002FDLlib\u002FOverview\u002Fkeras-api.html) user guides for more details.*\n\n### Chronos\n\nThe *Chronos* library makes it easy to build fast, accurate and scalable **time series analysis** applications (with AutoML).\n\n\u003Cdetails>\u003Csummary>Show Chronos example\u003C\u002Fsummary>\n\u003Cbr\u002F>\n\nYou can train a time series forecaster using _Chronos_ in 3 simple steps:\n\n```python\nfrom bigdl.chronos.forecaster import TCNForecaster \nfrom bigdl.chronos.data.repo_dataset import get_public_dataset\n\n# 1. 
Process time series data using `TSDataset`\ntsdata_train, tsdata_val, tsdata_test = get_public_dataset(name='nyc_taxi')\nfor tsdata in [tsdata_train, tsdata_val, tsdata_test]:\n    tsdata.roll(lookback=100, horizon=1)\n\n# 2. Create a `TCNForecaster` (automatically configured based on tsdata_train)\nforecaster = TCNForecaster.from_tsdataset(tsdata_train)\n\n# 3. Train the forecaster for prediction\nforecaster.fit(tsdata_train)\n\npred = forecaster.predict(tsdata_test)\n```\n\nTo apply AutoML, use `AutoTSEstimator` instead of normal forecasters.\n```python\n# Create and fit an `AutoTSEstimator`\nfrom bigdl.chronos.autots import AutoTSEstimator\nautotsest = AutoTSEstimator(model=\"tcn\", future_seq_len=10)\n\ntsppl = autotsest.fit(data=tsdata_train, validation_data=tsdata_val)\npred = tsppl.predict(tsdata_test)\n```\n\n\u003C\u002Fdetails>  \n\n*See Chronos [user guide](https:\u002F\u002Fbigdl.readthedocs.io\u002Fen\u002Flatest\u002Fdoc\u002FChronos\u002Findex.html) and [quick start](https:\u002F\u002Fbigdl.readthedocs.io\u002Fen\u002Flatest\u002Fdoc\u002FChronos\u002FQuickStart\u002Fchronos-autotsest-quickstart.html) for more details.*\n\n### Friesian\nThe *Friesian* library makes it easy to build end-to-end, large-scale **recommendation systems** (including *offline* feature transformation and training, *near-line* feature and model update, and *online* serving pipelines). \n\n*See Friesian [readme](https:\u002F\u002Fgithub.com\u002Fintel-analytics\u002FBigDL-2.x\u002Fblob\u002Fmain\u002Fpython\u002Ffriesian\u002FREADME.md) for more details.* \n\n### PPML\n\n*BigDL PPML* provides a **hardware (Intel SGX) protected** *Trusted Cluster Environment* for running distributed Big Data & AI applications (in a secure fashion on private or public cloud). 
\n\n*See PPML [user guide](https:\u002F\u002Fbigdl.readthedocs.io\u002Fen\u002Flatest\u002Fdoc\u002FPPML\u002FOverview\u002Fppml.html) and [tutorial](https:\u002F\u002Fgithub.com\u002Fintel-analytics\u002FBigDL-2.x\u002Fblob\u002Fmain\u002Fppml\u002FREADME.md) for more details.* \n\n## Getting Support\n\n- [Mail List](mailto:bigdl-user-group+subscribe@googlegroups.com)\n- [User Group](https:\u002F\u002Fgroups.google.com\u002Fforum\u002F#!forum\u002Fbigdl-user-group)\n- [Github Issues](https:\u002F\u002Fgithub.com\u002Fintel-analytics\u002FBigDL-2.x\u002Fissues)\n---\n\n## Citation\n\nIf you've found BigDL useful for your project, you may cite our papers as follows:\n\n- *[BigDL 2.0](https:\u002F\u002Farxiv.org\u002Fabs\u002F2204.01715): Seamless Scaling of AI Pipelines from Laptops to Distributed Cluster*\n  ```\n  @INPROCEEDINGS{9880257,\n      title={BigDL 2.0: Seamless Scaling of AI Pipelines from Laptops to Distributed Cluster}, \n      author={Dai, Jason Jinquan and Ding, Ding and Shi, Dongjie and Huang, Shengsheng and Wang, Jiao and Qiu, Xin and Huang, Kai and Song, Guoqiong and Wang, Yang and Gong, Qiyuan and Song, Jiaming and Yu, Shan and Zheng, Le and Chen, Yina and Deng, Junwei and Song, Ge},\n      booktitle={2022 IEEE\u002FCVF Conference on Computer Vision and Pattern Recognition (CVPR)}, \n      year={2022},\n      pages={21407-21414},\n      doi={10.1109\u002FCVPR52688.2022.02076}\n  }\n  ```\n\n[^1]: Performance varies by use, configuration and other factors. `bigdl-llm` may not optimize to the same degree for non-Intel products. 
Learn more at www.Intel.com\u002FPerformanceIndex.\n\n- *[BigDL](https:\u002F\u002Farxiv.org\u002Fabs\u002F1804.05839): A Distributed Deep Learning Framework for Big Data*\n  ```\n  @INPROCEEDINGS{10.1145\u002F3357223.3362707,\n      title = {BigDL: A Distributed Deep Learning Framework for Big Data},\n      author = {Dai, Jason Jinquan and Wang, Yiheng and Qiu, Xin and Ding, Ding and Zhang, Yao and Wang, Yanzhang and Jia, Xianyan and Zhang, Cherry Li and Wan, Yan and Li, Zhichao and Wang, Jiao and Huang, Shengsheng and Wu, Zhongyuan and Wang, Yang and Yang, Yuhao and She, Bowen and Shi, Dongjie and Lu, Qi and Huang, Kai and Song, Guoqiong},\n      booktitle = {Proceedings of the ACM Symposium on Cloud Computing (SoCC)},\n      year = {2019},\n      pages = {50–60},\n      doi = {10.1145\u002F3357223.3362707}\n  }\n  ```\n","> [!IMPORTANT]\n> ***`bigdl-llm` 现已更名为 `ipex-llm`，我们未来的开发将迁移至 [IPEX-LLM](https:\u002F\u002Fgithub.com\u002Fintel-analytics\u002Fipex-llm) 项目。***\n\n---\n\u003Cdiv align=\"center\">\n\n\u003Cp align=\"center\"> \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fintel_BigDL_readme_69c9d089f658.jpg\" height=\"140px\">\u003Cbr>\u003C\u002Fp>\n\n\u003C\u002Fdiv>\n\n---\n\n## 概述\n\nBigDL 可以无缝地将您的数据分析和 AI 应用从笔记本电脑扩展到云端，提供以下库：\n\n- `LLM` ***(已弃用 - 请改用 [IPEX-LLM](https:\u002F\u002Fgithub.com\u002Fintel-analytics\u002Fipex-llm))***：针对 Intel CPU 和 GPU 优化的大语言模型库\n\n- [Orca](#orca)：基于 Spark 和 Ray 的分布式大数据与 AI（TF 和 PyTorch）流水线\n\n- [Nano](#nano)：在 Intel CPU\u002FGPU 上透明加速 TensorFlow 和 PyTorch 程序\n\n- [DLlib](#dllib)：深度学习领域的“Spark MLlib 对等工具”\n\n- [Chronos](#chronos)：使用 AutoML 进行可扩展的时间序列分析\n\n- [Friesian](#friesian)：端到端的推荐系统\n\n- [PPML](#ppml)：安全的大数据与 AI（采用 SGX\u002FTDX 硬件安全技术）\n\n如需更多信息，请访问 [文档](https:\u002F\u002Fbigdl.readthedocs.io\u002F)。\n\n---\n\n## 选择合适的 BigDL 库\n```mermaid\nflowchart TD;\n    Feature1{{硬件安全的大数据与 AI？}};\n    Feature1-- 否 -->Feature2{{Python 还是 Scala\u002FJava？}};\n    Feature1-- 是  
-->ReferPPML([\u003Cem>\u003Cstrong>PPML\u003C\u002Fstrong>\u003C\u002Fem>]);\n    Feature2-- Python -->Feature3{{应用类型是什么？}};\n    Feature2-- Scala\u002FJava -->ReferDLlib([\u003Cem>\u003Cstrong>DLlib\u003C\u002Fstrong>\u003C\u002Fem>]);\n    Feature3-- “大语言模型” -->ReferLLM([\u003Cem>\u003Cstrong>LLM\u003C\u002Fstrong>\u003C\u002Fem>]);\n    Feature3-- “大数据 + AI（TF\u002FPyTorch）” -->ReferOrca([\u003Cem>\u003Cstrong>Orca\u003C\u002Fstrong>\u003C\u002Fem>]);\n    Feature3-- 加速 TensorFlow \u002F PyTorch -->ReferNano([\u003Cem>\u003Cstrong>Nano\u003C\u002Fstrong>\u003C\u002Fem>]);\n    Feature3-- 针对 Spark MLlib 的 DL -->ReferDLlib2([\u003Cem>\u003Cstrong>DLlib\u003C\u002Fstrong>\u003C\u002Fem>]);\n    Feature3-- 高层次应用框架 -->Feature4{{领域是什么？}};\n    Feature4-- 时间序列 -->ReferChronos([\u003Cem>\u003Cstrong>Chronos\u003C\u002Fstrong>\u003C\u002Fem>]);\n    Feature4-- 推荐系统 -->ReferFriesian([\u003Cem>\u003Cstrong>Friesian\u003C\u002Fstrong>\u003C\u002Fem>]);\n    \n    click ReferLLM \"https:\u002F\u002Fgithub.com\u002Fintel-analytics\u002Fipex-llm\"\n    click ReferNano \"https:\u002F\u002Fgithub.com\u002Fintel-analytics\u002FBigDL-2.x#nano\"\n    click ReferOrca \"https:\u002F\u002Fgithub.com\u002Fintel-analytics\u002FBigDL-2.x#orca\"\n    click ReferDLlib \"https:\u002F\u002Fgithub.com\u002Fintel-analytics\u002FBigDL-2.x#dllib\"\n    click ReferDLlib2 \"https:\u002F\u002Fgithub.com\u002Fintel-analytics\u002FBigDL-2.x#dllib\"\n    click ReferChronos \"https:\u002F\u002Fgithub.com\u002Fintel-analytics\u002FBigDL-2.x#chronos\"\n    click ReferFriesian \"https:\u002F\u002Fgithub.com\u002Fintel-analytics\u002FBigDL-2.x#friesian\"\n    click ReferPPML \"https:\u002F\u002Fgithub.com\u002Fintel-analytics\u002FBigDL-2.x#ppml\"\n    \n    classDef ReferStyle1 fill:#5099ce,stroke:#5099ce;\n    classDef Feature fill:#FFF,stroke:#08409c,stroke-width:1px;\n    class ReferLLM,ReferNano,ReferOrca,ReferDLlib,ReferDLlib2,ReferChronos,ReferFriesian,ReferPPML ReferStyle1;\n    class 
Feature1,Feature2,Feature3,Feature4,Feature5,Feature6,Feature7 Feature;\n    \n```\n---\n## 安装\n\n - 建议使用 [conda](https:\u002F\u002Fdocs.conda.io\u002Fprojects\u002Fconda\u002Fen\u002Flatest\u002Fuser-guide\u002Finstall\u002F) 环境来安装 BigDL：\n\n    ```bash\n    conda create -n my_env \n    conda activate my_env\n    pip install bigdl\n    ```\n    如需安装最新的 nightly 版本，请使用 `pip install --pre --upgrade bigdl`；更多详细信息请参阅 [Python](https:\u002F\u002Fbigdl.readthedocs.io\u002Fen\u002Flatest\u002Fdoc\u002FUserGuide\u002Fpython.html) 和 [Scala](https:\u002F\u002Fbigdl.readthedocs.io\u002Fen\u002Flatest\u002Fdoc\u002FUserGuide\u002Fscala.html) 用户指南。\n\n - 若要单独安装某个库，例如 Chronos，可以使用 `pip install bigdl-chronos`；更多详情请访问 [文档网站](https:\u002F\u002Fbigdl.readthedocs.io\u002F)。\n---\n\n## 入门\n### Orca\n\n- _Orca_ 库能够无缝地将您单节点的 **TensorFlow**、**PyTorch** 或 **OpenVINO** 程序扩展到大型集群中运行（以便处理分布式大数据）。\n\n  \u003Cdetails>\u003Csummary>显示 Orca 示例\u003C\u002Fsummary>\n  \u003Cbr\u002F>\n\n  您可以使用 _Orca_ 通过 4 个简单步骤构建端到端的分布式数据处理与 AI 程序：\n\n  ```python\n  # 1. 初始化 Orca 上下文（以便在 K8s、YARN 或本地笔记本上运行程序）\n  from bigdl.orca import init_orca_context, OrcaContext\n  sc = init_orca_context(cluster_mode=\"k8s\", cores=4, memory=\"10g\", num_nodes=2)\n\n  # 2. 执行分布式数据处理（支持 Spark DataFrames、TensorFlow Dataset、PyTorch DataLoader、Ray Dataset、Pandas、Pillow 等）\n  spark = OrcaContext.get_spark_session()\n  df = spark.read.parquet(file_path)\n  df = df.withColumn('label', df.label-1)\n  ...\n\n  # 3. 使用标准框架 API 构建深度学习模型\n  # （支持 TensorFlow、PyTorch、Keras、OpenVINO 等）\n  from tensorflow import keras\n  ...\n  model = keras.models.Model(inputs=[user, item], outputs=predictions)  \n  model.compile(...)\n\n  # 4. 
使用 Orca Estimator 进行分布式训练\u002F推理\n  from bigdl.orca.learn.tf.estimator import Estimator\n  est = Estimator.from_keras(keras_model=model)  \n  est.fit(data=df,\n          feature_cols=['user', 'item'],\n          label_cols=['label'],\n          ...)\n  ```\n\n  \u003C\u002Fdetails> \n\n  *有关更多详细信息，请参阅 Orca [用户指南](https:\u002F\u002Fbigdl.readthedocs.io\u002Fen\u002Flatest\u002Fdoc\u002FOrca\u002FOverview\u002Forca.html)，以及 [TensorFlow](https:\u002F\u002Fbigdl.readthedocs.io\u002Fen\u002Flatest\u002Fdoc\u002FOrca\u002FHowto\u002Ftf2keras-quickstart.html) 和 [PyTorch](https:\u002F\u002Fbigdl.readthedocs.io\u002Fen\u002Flatest\u002Fdoc\u002FOrca\u002FHowto\u002Fpytorch-quickstart.html) 快速入门。*\n\n- 此外，您还可以使用 Orca 中的 _**RayOnSpark**_ 在 Spark 集群上运行标准 **Ray** 程序。\n\n  \u003Cdetails>\u003Csummary>显示 RayOnSpark 示例\u003C\u002Fsummary>\n  \u003Cbr\u002F>\n  \n  您不仅可以在 Spark 集群上运行 Ray 程序，还可以使用 Orca 中的 _RayOnSpark_ 将 Ray 代码内嵌到 Spark 代码中（以便处理内存中的 Spark RDD 或 DataFrame）。\n \n  ```python\n  # 1. 初始化 Orca 上下文（以便在 K8s、YARN 或本地笔记本上运行程序）\n  from bigdl.orca import init_orca_context, OrcaContext\n  sc = init_orca_context(cluster_mode=\"yarn\", cores=4, memory=\"10g\", num_nodes=2, init_ray_on_spark=True)\n\n  # 2. 使用 Spark 进行分布式数据处理\n  spark = OrcaContext.get_spark_session()\n  df = spark.read.parquet(file_path).withColumn(...)\n  \n  # 3. 将 Spark DataFrame 转换为 Ray 数据集\n  from bigdl.orca.data import spark_df_to_ray_dataset\n  dataset = spark_df_to_ray_dataset(df)\n\n  # 4. 
使用 Ray 操作 Ray 数据集\n  import ray\n\n  @ray.remote\n  def consume(data) -> int:\n     num_batches = 0\n     for batch in data.iter_batches(batch_size=10):\n         num_batches += 1\n     return num_batches\n\n  print(ray.get(consume.remote(dataset)))\n  ```\n\n  \u003C\u002Fdetails>  \n  \n  *有关更多详细信息，请参阅 RayOnSpark [用户指南](https:\u002F\u002Fbigdl.readthedocs.io\u002Fen\u002Flatest\u002Fdoc\u002FOrca\u002FOverview\u002Fray.html) 和 [快速入门](https:\u002F\u002Fbigdl.readthedocs.io\u002Fen\u002Flatest\u002Fdoc\u002FOrca\u002FHowto\u002Fray-quickstart.html)。*\n\n### Nano\n您可以在笔记本电脑或服务器上使用 *Nano* 透明地加速您的 TensorFlow 或 PyTorch 程序。只需进行最少的代码修改，*Nano* 就能自动将现代 CPU 优化（例如 SIMD、多进程、低精度等）应用于标准的 TensorFlow 和 PyTorch 代码，从而实现高达 10 倍的加速。\n\n\u003Cdetails>\u003Csummary>显示 Nano 推理示例\u003C\u002Fsummary>\n\u003Cbr\u002F>\n\n您可以使用 _Nano_ 自动优化已训练好的 PyTorch 模型以用于推理或部署：\n\n```python\n# 注意：load_state_dict 的返回值不是模型本身，需先构建模型再加载权重\nmodel = ResNet18()\nmodel.load_state_dict(...)\ntrain_dataloader = ...\nval_dataloader = ...\ndef accuracy(pred, target):\n    ...\n\nfrom bigdl.nano.pytorch import InferenceOptimizer\noptimizer = InferenceOptimizer()\noptimizer.optimize(model,\n                   training_data=train_dataloader,\n                   validation_data=val_dataloader,\n                   metric=accuracy)\nnew_model, config = optimizer.get_best_model()\n\noptimizer.summary()\n```\n`optimizer.summary()` 的输出可能如下所示：\n```\n -------------------------------- ---------------------- -------------- ----------------------\n|             method             |        status        | latency(ms)  |     metric value     |\n -------------------------------- ---------------------- -------------- ----------------------\n|            original            |      successful      |    45.145    |        0.975         |\n|              bf16              |      successful      |    27.549    |        0.975         |\n|          static_int8           |      successful      |    11.339    |        0.975         |\n|         jit_fp32_ipex          |      successful      |   
 40.618    |        0.975*        |\n|  jit_fp32_ipex_channels_last   |      successful      |    19.247    |        0.975*        |\n|         jit_bf16_ipex          |      successful      |    10.149    |        0.975         |\n|  jit_bf16_ipex_channels_last   |      successful      |    9.782     |        0.975         |\n|         openvino_fp32          |      successful      |    22.721    |        0.975*        |\n|         openvino_int8          |      successful      |    5.846     |        0.962         |\n|        onnxruntime_fp32        |      successful      |    20.838    |        0.975*        |\n|    onnxruntime_int8_qlinear    |      successful      |    7.123     |        0.981         |\n -------------------------------- ---------------------- -------------- ----------------------\n* 表示我们假设追踪模型的指标值不会改变，因此为了节省时间，不再重新计算指标值。\n优化总耗时 60.8 秒。\n```\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\u003Csummary>显示 Nano 训练示例\u003C\u002Fsummary>\n\u003Cbr\u002F>\n您可以通过 Nano 轻松加速 PyTorch 训练（例如 IPEX、BF16、多实例训练等）：\n\n```python\nmodel = ResNet18()\noptimizer = torch.optim.SGD(...)\ntrain_loader = ...\nval_loader = ...\nloss_func = ...\nnum_epochs = ...\n\nfrom bigdl.nano.pytorch import TorchNano\n\n# 在 `TorchNano.train` 中定义您的训练循环\nclass Trainer(TorchNano):\n    def train(self):\n        # 调用 `setup` 准备模型、优化器和数据加载器，以便进行加速训练\n        # （将返回值绑定到新变量名，避免在方法内遮蔽外层变量）\n        acc_model, acc_optimizer, (acc_train_loader, acc_val_loader) = self.setup(\n            model, optimizer, train_loader, val_loader)\n\n        for epoch in range(num_epochs):\n            acc_model.train()\n            for data, target in acc_train_loader:\n                acc_optimizer.zero_grad()\n                output = acc_model(data)\n                # 将 loss.backward() 替换为 self.backward(loss)\n                loss = loss_func(output, target)\n                self.backward(loss)\n                acc_optimizer.step()\n\n# 加速训练（IPEX、BF16 和多实例训练）\nTrainer(use_ipex=True, precision='bf16', num_processes=2).train()\n```\n\n\u003C\u002Fdetails>  \n\n*请参阅 Nano [用户指南](https:\u002F\u002Fbigdl.readthedocs.io\u002Fen\u002Flatest\u002Fdoc\u002FNano\u002FOverview\u002Fnano.html) 和 
[教程](https:\u002F\u002Fgithub.com\u002Fintel-analytics\u002FBigDL-2.x\u002Ftree\u002Fmain\u002Fpython\u002Fnano\u002Ftutorial) 以获取更多详细信息。*\n    \n### DLlib\n\n借助 _DLlib_，您可以使用标准的 (**Scala** 或 **Python**) Spark 程序，通过相同的 **Spark DataFrames** 和 **ML Pipeline** API 来编写分布式深度学习应用程序。\n\n\u003Cdetails>\u003Csummary>显示 DLlib Scala 示例\u003C\u002Fsummary>\n\u003Cbr\u002F>\n\n您可以通过以下三个简单步骤使用 *DLlib* 的 Scala API 为 Spark 构建分布式深度学习应用程序：\n\n```scala\n\u002F\u002F 1. 在代码开头调用 `initNNContext`： \nimport com.intel.analytics.bigdl.dllib.NNContext\nval sc = NNContext.initNNContext()\n\n\u002F\u002F 2. 使用 DLlib 中的 Keras 风格 API 定义深度学习模型：\nimport com.intel.analytics.bigdl.dllib.keras.layers._\nimport com.intel.analytics.bigdl.dllib.keras.Model\nval input = Input[Float](inputShape = Shape(10))  \nval dense = Dense[Float](12).inputs(input)  \nval output = Activation[Float](\"softmax\").inputs(dense)  \nval model = Model(input, output)\n\n\u002F\u002F 3. 使用 `NNEstimator` 通过 Spark DataFrame 和 ML Pipeline API 来训练\u002F预测\u002F评估模型\nimport org.apache.spark.sql.SparkSession\nimport org.apache.spark.ml.feature.MinMaxScaler\nimport org.apache.spark.ml.Pipeline\nimport com.intel.analytics.bigdl.dllib.nnframes.NNEstimator\nimport com.intel.analytics.bigdl.dllib.nn.CrossEntropyCriterion\nimport com.intel.analytics.bigdl.dllib.optim.Adam\nval spark = SparkSession.builder().getOrCreate()\nval trainDF = spark.read.parquet(\"train_data\")\nval validationDF = spark.read.parquet(\"val_data\")\nval scaler = new MinMaxScaler().setInputCol(\"in\").setOutputCol(\"value\")\nval estimator = NNEstimator(model, CrossEntropyCriterion())  \n        .setBatchSize(128).setOptimMethod(new Adam()).setMaxEpoch(5)\nval pipeline = new Pipeline().setStages(Array(scaler, estimator))\n\nval pipelineModel = pipeline.fit(trainDF)  \nval predictions = pipelineModel.transform(validationDF)\n```\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\u003Csummary>显示 DLlib Python 示例\u003C\u002Fsummary>\n\u003Cbr\u002F>\n\n您也可以通过以下三个简单步骤使用 *DLlib* 
的 Python API 为 Spark 构建分布式深度学习应用程序：\n\n```python\n# 1. 在代码开头调用 `init_nncontext`：\nfrom bigdl.dllib.nncontext import init_nncontext\nsc = init_nncontext()\n\n# 2. 使用 DLlib 中的 Keras 风格 API 定义深度学习模型：\nfrom bigdl.dllib.keras.layers import Input, Dense, Activation\nfrom bigdl.dllib.keras.models import Model\ninput = Input(shape=(10,))\ndense = Dense(12)(input)\noutput = Activation(\"softmax\")(dense)\nmodel = Model(input, output)\n\n# 3. 使用 `NNEstimator` 通过 Spark DataFrame 和 ML pipeline API 训练\u002F预测\u002F评估模型\nfrom pyspark.sql import SparkSession\nfrom pyspark.ml.feature import MinMaxScaler\nfrom pyspark.ml import Pipeline\nfrom bigdl.dllib.nnframes import NNEstimator\nfrom bigdl.dllib.nn.criterion import CrossEntropyCriterion\nfrom bigdl.dllib.optim.optimizer import Adam\nspark = SparkSession.builder.getOrCreate()\ntrain_df = spark.read.parquet(\"train_data\")\nvalidation_df = spark.read.parquet(\"val_data\")\nscaler = MinMaxScaler().setInputCol(\"in\").setOutputCol(\"value\")\nestimator = NNEstimator(model, CrossEntropyCriterion())\\\n    .setBatchSize(128)\\\n    .setOptimMethod(Adam())\\\n    .setMaxEpoch(5)\npipeline = Pipeline(stages=[scaler, estimator])\n\npipelineModel = pipeline.fit(train_df)\npredictions = pipelineModel.transform(validation_df)\n```\n\n\u003C\u002Fdetails>\n\n*更多详细信息，请参阅 DLlib 的 [NNFrames](https:\u002F\u002Fbigdl.readthedocs.io\u002Fen\u002Flatest\u002Fdoc\u002FDLlib\u002FOverview\u002Fnnframes.html) 和 [Keras API](https:\u002F\u002Fbigdl.readthedocs.io\u002Fen\u002Flatest\u002Fdoc\u002FDLlib\u002FOverview\u002Fkeras-api.html) 用户指南。*\n\n### Chronos\n\n*Chronos* 库使构建快速、准确且可扩展的 **时间序列分析** 应用程序（支持 AutoML）变得简单易行。\n\n\u003Cdetails>\u003Csummary>显示 Chronos 示例\u003C\u002Fsummary>\n\u003Cbr\u002F>\n\n您可以通过以下三个简单步骤使用 *Chronos* 训练时间序列预测器：\n\n```python\nfrom bigdl.chronos.forecaster import TCNForecaster \nfrom bigdl.chronos.data.repo_dataset import get_public_dataset\n\n# 1. 
使用 `TSDataset` 处理时间序列数据\ntsdata_train, tsdata_val, tsdata_test = get_public_dataset(name='nyc_taxi')\nfor tsdata in [tsdata_train, tsdata_val, tsdata_test]:\n    tsdata.roll(lookback=100, horizon=1)\n\n# 2. 创建一个 `TCNForecaster`（根据训练数据自动配置）\nforecaster = TCNForecaster.from_tsdataset(tsdata_train)\n\n# 3. 训练预测器以进行预测\nforecaster.fit(tsdata_train)\n\npred = forecaster.predict(tsdata_test)\n```\n\n要应用 AutoML，可以使用 `AutoTSEstimator` 代替普通预测器。\n```python\n# 创建并拟合一个 `AutoTSEstimator`\nfrom bigdl.chronos.autots import AutoTSEstimator\nautotsest = AutoTSEstimator(model=\"tcn\", future_seq_len=10)\n\ntsppl = autotsest.fit(data=tsdata_train, validation_data=tsdata_val)\npred = tsppl.predict(tsdata_test)\n```\n\n\u003C\u002Fdetails>  \n\n*更多详细信息，请参阅 Chronos 的 [用户指南](https:\u002F\u002Fbigdl.readthedocs.io\u002Fen\u002Flatest\u002Fdoc\u002FChronos\u002Findex.html) 和 [快速入门](https:\u002F\u002Fbigdl.readthedocs.io\u002Fen\u002Flatest\u002Fdoc\u002FChronos\u002FQuickStart\u002Fchronos-autotsest-quickstart.html)。*\n\n### Friesian\n*Friesian* 库使构建端到端的大规模 **推荐系统** 变得容易（包括 *离线* 特征转换与训练、*近线* 特征和模型更新，以及 *在线* 服务管道）。\n\n*更多详细信息，请参阅 Friesian 的 [README](https:\u002F\u002Fgithub.com\u002Fintel-analytics\u002FBigDL-2.x\u002Fblob\u002Fmain\u002Fpython\u002Ffriesian\u002FREADME.md)。*\n\n### PPML\n\n*BigDL PPML* 提供一种 **基于硬件（Intel SGX）保护的可信集群环境**，用于在私有云或公有云上以安全的方式运行分布式大数据和 AI 应用程序。\n\n*更多详细信息，请参阅 PPML 的 [用户指南](https:\u002F\u002Fbigdl.readthedocs.io\u002Fen\u002Flatest\u002Fdoc\u002FPPML\u002FOverview\u002Fppml.html) 和 [教程](https:\u002F\u002Fgithub.com\u002Fintel-analytics\u002FBigDL-2.x\u002Fblob\u002Fmain\u002Fppml\u002FREADME.md)。*\n\n## 获取支持\n\n- 邮件列表：[mail list](mailto:bigdl-user-group+subscribe@googlegroups.com)\n- 用户组：[user group](https:\u002F\u002Fgroups.google.com\u002Fforum\u002F#!forum\u002Fbigdl-user-group)\n- GitHub 问题：[GitHub issues](https:\u002F\u002Fgithub.com\u002Fintel-analytics\u002FBigDL-2.x\u002Fissues)\n\n---\n\n## 引用\n\n如果您发现 BigDL 对您的项目有所帮助，您可以按照以下方式引用我们的论文：\n\n- *[BigDL 
2.0](https:\u002F\u002Farxiv.org\u002Fabs\u002F2204.01715)：从笔记本电脑到分布式集群的 AI 流程无缝扩展*\n  ```\n  @INPROCEEDINGS{9880257,\n      title={BigDL 2.0: Seamless Scaling of AI Pipelines from Laptops to Distributed Cluster}, \n      author={Dai, Jason Jinquan and Ding, Ding and Shi, Dongjie and Huang, Shengsheng and Wang, Jiao and Qiu, Xin and Huang, Kai and Song, Guoqiong and Wang, Yang and Gong, Qiyuan and Song, Jiaming and Yu, Shan and Zheng, Le and Chen, Yina and Deng, Junwei and Song, Ge},\n      booktitle={2022 IEEE\u002FCVF Conference on Computer Vision and Pattern Recognition (CVPR)}, \n      year={2022},\n      pages={21407-21414},\n      doi={10.1109\u002FCVPR52688.2022.02076}\n  }\n  ```\n\n[^1]: 性能因使用场景、配置及其他因素而异。`bigdl-llm` 可能无法针对非 Intel 产品达到相同的优化效果。更多信息请访问 www.Intel.com\u002FPerformanceIndex。\n\n- *[BigDL](https:\u002F\u002Farxiv.org\u002Fabs\u002F1804.05839)：面向大数据的分布式深度学习框架*\n  ```\n  @INPROCEEDINGS{10.1145\u002F3357223.3362707,\n      title = {BigDL: A Distributed Deep Learning Framework for Big Data},\n      author = {Dai, Jason Jinquan and Wang, Yiheng and Qiu, Xin and Ding, Ding and Zhang, Yao and Wang, Yanzhang and Jia, Xianyan and Zhang, Cherry Li and Wan, Yan and Li, Zhichao and Wang, Jiao and Huang, Shengsheng and Wu, Zhongyuan and Wang, Yang and Yang, Yuhao and She, Bowen and Shi, Dongjie and Lu, Qi and Huang, Kai and Song, Guoqiong},\n      booktitle = {Proceedings of the ACM Symposium on Cloud Computing (SoCC)},\n      year = {2019},\n      pages = {50–60},\n      doi = {10.1145\u002F3357223.3362707}\n  }\n  ```","# BigDL 快速上手指南\n\n> **重要提示**：原 `bigdl-llm` 组件已更名为 **`ipex-llm`**。如果您主要关注大语言模型（LLM）在 Intel CPU\u002FGPU 上的运行，请前往 [IPEX-LLM 项目](https:\u002F\u002Fgithub.com\u002Fintel-analytics\u002Fipex-llm)。本指南主要针对 BigDL 的其他核心组件（如 Orca, Nano, DLlib 等）。\n\nBigDL 是一套用于在 Intel 架构（CPU\u002FGPU）上无缝扩展数据分析和 AI 应用的库，支持从笔记本电脑到云集群的部署。\n\n## 1. 
环境准备\n\n在开始之前，请确保您的系统满足以下要求：\n\n*   **操作系统**：Linux (推荐 Ubuntu\u002FCentOS), macOS, 或 Windows。\n*   **Python 版本**：建议 Python 3.8 - 3.10。\n*   **前置依赖**：\n    *   推荐安装 [Conda](https:\u002F\u002Fdocs.conda.io\u002Fprojects\u002Fconda\u002Fen\u002Flatest\u002Fuser-guide\u002Finstall\u002F) 以管理虚拟环境。\n    *   若使用分布式功能（Orca），需确保集群环境（如 K8s, YARN）或本地 Spark 环境可用。\n*   **硬件加速**：BigDL 自动利用 Intel CPU 指令集（如 AVX-512, AMX）及 GPU，无需额外配置驱动即可享受性能提升。\n\n## 2. 安装步骤\n\n推荐使用 Conda 创建独立环境进行安装。\n\n### 创建并激活环境\n```bash\nconda create -n bigdl_env python=3.9\nconda activate bigdl_env\n```\n\n### 安装 BigDL 核心包\n安装包含基础功能的完整包：\n```bash\npip install bigdl\n```\n\n### 安装特定组件（可选）\n如果您只需要特定功能模块，可以单独安装以减少依赖体积：\n*   **时序分析 (Chronos)**: `pip install bigdl-chronos`\n*   **推荐系统 (Friesian)**: `pip install bigdl-friesian`\n*   **隐私计算 (PPML)**: `pip install bigdl-ppml`\n\n> **提示**：如需体验最新夜间构建版本，可使用 `pip install --pre --upgrade bigdl`。国内用户若遇下载缓慢，可配置 pip 使用清华或阿里镜像源（例如：`pip install -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple bigdl`）。\n\n## 3. 基本使用\n\nBigDL 包含多个子库，以下是两个最常用场景的快速示例。\n\n### 场景一：使用 Nano 加速 PyTorch 推理\n**Nano** 可以在几乎不修改代码的情况下，自动应用 Intel CPU 优化技术（如多进程、低精度量化等），显著提升 TensorFlow 或 PyTorch 程序的运行速度。\n\n```python\nfrom bigdl.nano.pytorch import InferenceOptimizer\nimport torch\nfrom torchvision.models import resnet18\n\n# 1. 加载预训练模型\nmodel = resnet18(pretrained=True)\nmodel.eval()\n\n# 2. 准备数据 (此处仅为示例占位符)\ntrain_dataloader = ... \nval_dataloader = ...\n\ndef accuracy(pred, target):\n    # 定义准确率计算逻辑\n    ...\n\n# 3. 初始化优化器并执行优化\noptimizer = InferenceOptimizer()\noptimizer.optimize(model,\n                   training_data=train_dataloader,\n                   validation_data=val_dataloader,\n                   metric=accuracy)\n\n# 4. 
获取最佳加速模型\nnew_model, config = optimizer.get_best_model()\n\n# 查看不同优化策略的性能对比\noptimizer.summary()\n```\n*执行后，`summary()` 将输出原始模型与多种加速策略（如 BF16, INT8, OpenVINO 等）的延迟和精度对比。*\n\n### 场景二：使用 Orca 进行分布式训练\n**Orca** 允许您将单机的 TensorFlow\u002FPyTorch 程序轻松扩展到 Spark 或 Ray 集群上进行分布式大数据处理。\n\n```python\nfrom bigdl.orca import init_orca_context, OrcaContext\nfrom bigdl.orca.learn.tf.estimator import Estimator\nfrom tensorflow import keras\n\n# 1. 初始化 Orca 上下文 (支持本地、K8s、YARN)\nsc = init_orca_context(cluster_mode=\"local\", cores=4, memory=\"10g\")\n\n# 2. 获取 Spark Session 进行分布式数据处理\nspark = OrcaContext.get_spark_session()\ndf = spark.read.parquet(\"hdfs:\u002F\u002F\u002Fpath\u002Fto\u002Fdata\")\n# 数据预处理...\ndf = df.withColumn('label', df.label-1)\n\n# 3. 构建标准 Keras 模型\n# （单输出单元应使用 sigmoid 激活；对单个单元做 softmax 会恒等于 1）\ninputs = keras.Input(shape=(10,))\nx = keras.layers.Dense(12, activation='relu')(inputs)\noutputs = keras.layers.Dense(1, activation='sigmoid')(x)\nkeras_model = keras.Model(inputs, outputs)\nkeras_model.compile(optimizer='adam', loss='mse')\n\n# 4. 
使用 Orca Estimator 进行分布式训练\nest = Estimator.from_keras(keras_model=keras_model)\nest.fit(data=df,\n        feature_cols=['feature1', 'feature2'],\n        label_cols=['label'],\n        batch_size=32,\n        epochs=5)\n```\n\n---\n更多详细文档、API 参考及高级教程，请访问 [BigDL 官方文档](https:\u002F\u002Fbigdl.readthedocs.io\u002F)。","某电商数据团队需要在现有 Spark 大数据集群上，利用历史用户行为数据训练大规模深度学习推荐模型，以优化实时商品推荐效果。\n\n### 没有 BigDL 时\n- **架构割裂严重**：数据处理依赖 Spark，而模型训练需迁移至独立的 TensorFlow\u002FPyTorch 集群，数据搬运耗时且易出错。\n- **资源利用率低**：现有的 Spark 计算节点在数据预处理阶段空闲，无法直接复用其算力进行分布式模型训练。\n- **开发门槛高**：数据工程师需额外学习复杂的分布式深度学习框架，且难以将熟悉的 Spark MLlib 流程平滑升级为深度神经网络。\n- **扩展性差**：随着数据量激增，单机或小规模集群训练时间从小时级延长至数天，无法满足业务快速迭代需求。\n\n### 使用 BigDL 后\n- **无缝集成架构**：通过 BigDL Orca 组件，直接在 Spark 集群上运行分布式 TensorFlow 或 PyTorch 程序，实现“数据在哪里，计算就在哪里”。\n- **资源高效复用**：透明地调用现有 Spark 集群的 CPU\u002FGPU 资源进行模型训练，无需维护额外的 AI 基础设施，硬件成本显著降低。\n- **开发体验流畅**：支持用标准的 Python 代码编写模型，像使用 Spark MLlib 一样简单，数据科学家可专注于算法而非底层分布式逻辑。\n- **线性加速能力**：轻松将训练任务从笔记本电脑扩展至云端千核集群，训练速度提升数十倍，使大规模推荐模型的日更成为可能。\n\nBigDL 打破了大数据与深度学习的技术壁垒，让企业能以最低成本在现有数据平台上构建高性能 AI 应用。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fintel_BigDL_69c9d089.jpg","intel","Intel® Corporation","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Fintel_bfc55d04.png","",null,"webadmin@linux.intel.com","https:\u002F\u002Fgithub.com\u002Fintel",[80,84,88,92,96,100,104,108,111,114],{"name":81,"color":82,"percentage":83},"Jupyter 
Notebook","#DA5B0B",67,{"name":85,"color":86,"percentage":87},"Python","#3572A5",15.5,{"name":89,"color":90,"percentage":91},"Scala","#c22d40",14.5,{"name":93,"color":94,"percentage":95},"Java","#b07219",1.4,{"name":97,"color":98,"percentage":99},"Shell","#89e051",1.3,{"name":101,"color":102,"percentage":103},"Dockerfile","#384d54",0.2,{"name":105,"color":106,"percentage":107},"Makefile","#427819",0,{"name":109,"color":110,"percentage":107},"RobotFramework","#00c0b5",{"name":112,"color":113,"percentage":107},"C","#555555",{"name":115,"color":116,"percentage":107},"PowerShell","#012456",2693,732,"2026-04-02T08:35:21","Apache-2.0","Linux, macOS, Windows","非必需。支持 Intel CPU 和 Intel GPU（通过 IPEX\u002FBigDL-Nano 优化）；若使用 OpenVINO 后端可加速 Intel 集成显卡。未提及对 NVIDIA GPU 或 CUDA 的具体需求。","未说明（示例代码中演示了单节点 10GB 内存配置，实际取决于数据规模和模型大小）",{"notes":125,"python":126,"dependencies":127},"1. 原 bigdl-llm 组件已弃用，大语言模型功能已迁移至 ipex-llm 项目。\n2. 推荐使用 conda 创建虚拟环境进行安装。\n3. 该工具主要针对 Intel 硬件架构（CPU\u002FGPU）进行了深度优化，支持在笔记本电脑到云端集群间无缝扩展。\n4. 支持多种运行模式：本地、K8s、YARN。\n5. 
包含多个子库（Orca, Nano, DLlib 等），可按需单独安装（如 pip install bigdl-chronos）。","未明确指定版本（建议 Python 3.x，需配合 conda 环境使用）",[128,129,130,131,132,133,134,135],"conda","bigdl","tensorflow","pytorch","spark","ray","openvino","ipex",[14],[138,139,140,141,129,142,143,144,131],"apache-spark","deep-neural-network","distributed-deep-learning","keras-tensorflow","analytics-zoo","python","scala","2026-03-27T02:49:30.150509","2026-04-06T20:55:25.073182",[148,153,158],{"id":149,"question_zh":150,"answer_zh":151,"source_url":152},20016,"在 Kubernetes 上使用 PPML 镜像运行 Spark 任务时遇到“初始作业未接受任何资源”的错误，如何解决？","PPML 镜像专为启用 SGX 的平台设计（推荐第 3-4 代至强平台并在 BIOS 中启用 SGX）。如果普通 Apache Spark 镜像能运行而 PPML 镜像报错，请首先检查平台是否已启用 SGX 以及是否配置了足够的 EPC（SGX 保留内存）。该错误信息实际上是一个警告，会持续显示直到作业获得足够资源。如果 EPC 内存有限且不需要 SGX 功能，可以尝试设置参数 `--conf spark.kubernetes.sgx.enabled=false` 来禁用 SGX 支持进行测试。若仍无法解决，可考虑暂时切换回标准的 Apache Spark 镜像。","https:\u002F\u002Fgithub.com\u002Fintel\u002FBigDL\u002Fissues\u002F5178",{"id":154,"question_zh":155,"answer_zh":156,"source_url":157},20017,"BigDL-LLM 子模块依赖的二进制文件是从哪里下载的？如何自行编译这些依赖？","BigDL-LLM 子模块及其他相关模块默认会从 SourceForge 下载预编译的二进制文件。目前项目已迁移，未来的开发将转移到 [IPEX-LLM](https:\u002F\u002Fgithub.com\u002Fintel-analytics\u002Fipex-llm) 项目。如果您需要针对不同平台重新编译、调试或应用自定义补丁，建议关注新的 IPEX-LLM 仓库以获取最新的构建指南和源码编译支持。","https:\u002F\u002Fgithub.com\u002Fintel\u002FBigDL\u002Fissues\u002F5212",{"id":159,"question_zh":160,"answer_zh":161,"source_url":162},20018,"在 Scala 目录下执行 make-dist.sh 脚本时出现“找不到符号 (Cannot find symbol)”错误怎么办？","该编译错误通常与特定分支的代码状态有关。根据反馈，切换到 `Branch-2.3` 分支后重新构建可以成功解决此问题。请确保您检出的是正确的稳定分支再进行打包操作。","https:\u002F\u002Fgithub.com\u002Fintel\u002FBigDL\u002Fissues\u002F5210",[164,169,174,179,184,189,194,198,202,206,210,215,220,225,230,235,240,245,250,255],{"id":165,"version":166,"summary_zh":167,"released_at":168},118040,"v2.5.0b1","## 亮点\r\n***注意：*** BigDL v2.5.0b1 已更新，包含了功能性和安全性方面的改进。建议用户升级到最新版本。","2024-10-15T07:20:46",{"id":170,"version":171,"summary_zh":172,"released_at":173},118041,"v2.4.0","## 亮点\r\n***注意：*** BigDL v2.4.0 
已更新，包含了功能性和安全性方面的改进。建议用户升级到最新版本。","2024-03-06T09:39:13",{"id":175,"version":176,"summary_zh":177,"released_at":178},118042,"v2.3.0","## 亮点\n***注意：*** BigDL v2.3.0 已更新，包含了功能性和安全性方面的改进。建议用户升级至最新版本。\n\nNano\n- 增强了 `trace` 和 `量化` 流程（用于 PyTorch 和 TensorFlow 模型优化）\n- 新的推理优化方法（包括对 Intel ARC 系列 GPU 的支持、CPU fp16、JIT int8 等）\n- 新的推理\u002F训练特性（包括 TorchCCL 支持、异步推理流水线、压缩模型保存、自动 channels_last_3d 转换、针对自定义 TF 训练循环的多实例训练等）\n- 推理优化模型的性能提升和开销降低\n- 更加友好的文档和 API 设计\n\nOrca:\n- 针对不同数据输入的分步式 TensorFlow 和 PyTorch 教程。\n- 分布式 MMCV 流水线的改进及示例。\n- 进一步增强 Orca Estimator（通过 Hook 实现更灵活的 PyTorch 训练循环、改进多输出预测、针对 OpenVINO 的内存优化等）。\n\nChronos\n- 预测器延迟降低 70%\n- 新增 `bigdl.chronos.aiops` 模块，基于 Chronos 算法应用于 AIOps 场景。\n- 增强基于 TF 的 TCNForecaster，以获得更高的准确性。\n\nFriesian:\n- 使用 Helm Chart 在 Kubernetes 上自动部署推荐系统推理流水线。\n\nPPML\n- TDX（包括 VM 和 CoCo）支持大数据、深度学习训练与推理服务（涵盖 TDX-VM 编排与 k8s 部署、TDXCC 安装与部署、证明与密钥管理支持等）。\n- 全新可信机器学习工具包（支持安全、分布式 SparkML 和 LightGBM）。\n- 可信大数据工具包升级（EPC 使用量减少 2 倍以上、支持 Apache Flink、Azure MAA、多 KMS 等）。\n- 可信深度学习工具包升级（通过 BigDL Nano、tcmalloc 等技术提升性能）。\n- 可信 DL 推理工具包升级（支持 Torch Serve、TF-Serving，并改善吞吐量和延迟）。","2024-03-06T09:39:20",{"id":180,"version":181,"summary_zh":182,"released_at":183},118043,"v2.2.0","## 亮点\n***注意：*** BigDL v2.2.0 已更新，包含了功能性和安全性方面的改进。建议用户升级到最新版本。\n\n* Nano\n   * 扩展 BigDL Nano 推理支持，新增对集成显卡及更多数据类型（INT8\u002FBF16\u002FFP16 量化）的支持\n   * 增加多项性能优化功能（如 Keras 的 InferenceOptimizer、PyTorch 训练循环的 Nano 装饰器、用于线程数控制和自动混合精度的 Nano 上下文管理器等）\n   * 支持安装更多版本的 PyTorch 和 TensorFlow，并根据不同平台提供可选依赖项\n* PPML\n   * 升级 BigDL PPML 解决方案，支持新的 LibOS（如 Gramine 1.3.1、Occlum 0.29.2），以提升安全性、性能和稳定性，并简化部署流程。\n   * 扩展对更多大数据框架（Spark 3.1.3、Flink、Hive 等）、Python 和数据科学工具（Numpy、Pandas、scikit-learn、Torch Serve、Triton、Flask 等）的支持，同时支持使用 Orca 进行分布式深度学习训练。\n   * 改进证明机制（如 MREnclave 证明）、密钥管理（如多 KMS 支持）以及加密功能（如透明加密），以构建更完善的端到端安全流水线。\n   * 初步支持在 SPR TDX 平台上运行 BigDL PPML（包括虚拟机和 TDX 机密容器）。\n* Chronos\n   * 扩展 BigDL Chronos 支持范围，新增对 Windows 和 Mac 系统以及 Python 3.8\u002F3.9 版本的支持。\n   * 为 Chronos 用户提供基准测试工具，以便评估 Chronos 在其平台上的性能。\n   * 
增加多项性能优化功能（如 TCNForecaster 的准确性和性能提升、更低的内存占用、自动超参搜索、更快且更易移植的 TSDataset 等）。\n* Friesian\n   * 新增对 LightGBM 训练的支持。\n   * 优化在线推理流水线的性能。\n* Orca\n   * 改进 Orca Estimator API，提升用户体验。\n   * 针对使用 Spark DataFrame 进行分布式训练的场景，优化内存使用。\n   * 更好地支持图像输入与可视化，并结合 Xshards 实现高效处理。\n   * 使用 Orca 开展分布式 MMCV 应用。\n* 文档\n   * 提供在 YARN\u002FK8s\u002FDatabricks 上运行 BigDL Orca 的教程。\n   * 提供 Azure 平台上的 BigDL PPML 解决方案文档。\n   * 提供 Chronos 预测与部署流程的操作指南及示例。","2024-03-06T09:39:27",{"id":185,"version":186,"summary_zh":187,"released_at":188},118044,"v2.1.0","## 亮点\n***注意：*** BigDL v2.1.0 已更新，包含了功能性和安全性方面的改进。建议用户升级到最新版本。\n\n- Orca\n  - 改进 Orca Estimator 的用户体验和 API 一致性。\n  - 在 Orca TensorFlow2 Estimator 中支持直接保存和加载 TensorFlow 模型格式。\n  - 提供更多示例（例如 PyTorch 脑部图像分割、用于分布式 Python 数据处理的 XShards 教程等）。\n  - 在 Orca PyTorch Estimator 中支持自定义指标。\n- Nano\n  - 新的推理优化流水线，包含更多优化方法和全新的 InferenceOptimizer。\n  - 更多训练优化方法（如 bf16、channel last）。\n  - 增加 TorchNano 支持，用于 PyTorch 模型的自定义训练循环。\n  - 多实例训练的自动学习率缩放。\n  - 内置通过超参数优化实现的 AutoML 支持。\n  - 支持广泛的 PyTorch 版本（1.9–1.12）和 TensorFlow 版本（2.7–2.9）。\n- DLlib\n  - 添加 LightGBM 支持。\n  - 改进 Keras 风格的模型摘要 API。\n  - 增加 Python 对 HDFS 文件加载的支持。\n- Chronos\n  - 添加新的 Autoformer（https:\u002F\u002Farxiv.org\u002Fabs\u002F2106.13008）预测器及在 CPU 上优化的流水线。\n  - TensorFlow 2 支持 LSTM、Seq2Seq、TCN 和 MTNet 预测器。\n  - 增加轻量级（不依赖 Spark\u002FRay Tune）的自动调参功能。\n  - 更好地支持分布式工作流（Spark DataFrame 和分布式 Pandas 处理）。\n  - 增加更多安装选项，使安装更加轻便。\n- Friesian:\n  - 将 DeepRec（https:\u002F\u002Fgithub.com\u002Falibaba\u002FDeepRec）与 Friesian 集成。\n  - 增加更多参考示例，例如多任务推荐、TFRS（https:\u002F\u002Fwww.tensorflow.org\u002Frecommenders）列表式排序、LightGBM 训练等。\n  - 添加离线分布式相似度搜索的参考示例（使用 FAISS）。\n  - FeatureTable 中提供更多操作（例如基于 BERT 的字符串嵌入等）。\n- PPML \n  - 在 Gramine 上升级 BigDL PPML。\n  - 改进证明和密钥管理流程。\n  - 在 BigDL PPML 上支持更多大数据框架（包括 Spark、Flink、Hive、HDFS 等）。\n  - 增加 PPMLContext API，用于加密 I\u002FO 和 KMS，支持不同的文件格式、加密算法和 KMS 服务。\n  - 在 VFL 场景中支持 PSI、PyTorch NN、Keras NN、FGBoost（联邦 XGBoost）、以及 VFL 
下的线性回归和逻辑回归。\n\n","2024-03-06T09:39:33",{"id":190,"version":191,"summary_zh":192,"released_at":193},118045,"v2.0.0","## 亮点\r\n***注意：*** BigDL v2.0.0 已更新，增加了功能性和安全性方面的改进。建议用户升级到最新版本。","2024-03-06T09:39:40",{"id":195,"version":196,"summary_zh":76,"released_at":197},118046,"v0.13.0","2024-03-07T02:15:02",{"id":199,"version":200,"summary_zh":76,"released_at":201},118047,"v0.12.2","2024-03-07T02:15:08",{"id":203,"version":204,"summary_zh":76,"released_at":205},118048,"v0.12.1","2024-03-07T02:15:15",{"id":207,"version":208,"summary_zh":76,"released_at":209},118049,"v0.11.1","2024-03-07T02:15:21",{"id":211,"version":212,"summary_zh":213,"released_at":214},118050,"v0.10.0","## 亮点\n\n* 持续优化 RNN。我们支持将 LSTM 和 GRU 与 MKL-DNN 集成，性能提升约 3 倍。\n\n* ONNX 支持。我们支持通过 ONNX 加载第三方框架的模型。\n\n* 更丰富的数据预处理支持以及分割推理流水线支持。\n\n\n## 详情\n* [新特性] 完整 MaskRCNN 模型支持，并包含数据处理功能。\n* [新特性] 支持可变尺寸的 Resize 操作。\n* [新特性] 支持区域建议的批量输入。\n* [新特性] 支持在一个 mini-batch 中包含不同尺寸的样本。\n* [新特性] 实现了 MAP 验证方法。\n* [新特性] ROILabel 功能增强，同时支持目标检测和分割任务。\n* [新特性] 分割任务支持灰度图像。\n* [新特性] 为特征金字塔网络（FPN）添加 TopBlocks 支持。\n* [新特性] 将 GRU 与 MKL-DNN 集成并支持。\n* [新特性] 为 MaskRCNN 添加 MaskHead 支持。\n* [新特性] 为 MaskRCNN 添加 BoxHead 支持。\n* [新特性] 为 MaskRCNN 添加 RegionalProposal 支持。\n* [新特性] 为 ONNX 添加 Shape 操作支持。\n* [新特性] 为 ONNX 添加 Gemm 操作支持。\n* [新特性] 为 ONNX 添加 Gather 操作支持。\n* [新特性] 为 ONNX 添加 AveragePool 操作支持。\n* [新特性] 为 ONNX 添加 BatchNormalization 操作支持。\n* [新特性] 为 ONNX 添加 Concat 操作支持。\n* [新特性] 为 ONNX 添加 Conv 操作支持。\n* [新特性] 为 ONNX 添加 MaxPool 操作支持。\n* [新特性] 为 ONNX 添加 Reshape 操作支持。\n* [新特性] 为 ONNX 添加 Relu 操作支持。\n* [新特性] 为 ONNX 添加 SoftMax 操作支持。\n* [新特性] 为 ONNX 添加 Sum 操作支持。\n* [新特性] 为 ONNX 添加 Squeeze 操作支持。\n* [新特性] 为 ONNX 添加 Const 操作支持。\n* [新特性] 实现了 ONNX 模型加载器。\n* [新特性] RioAlign 层支持。\n* [增强] 对齐 mklblas 和 mkl-dnn 中的批归一化层。\n* [增强] Python API 增强，支持嵌套列表输入。\n* [增强] 使用 MKL-DNN 支持多模型训练\u002F推理。\n* [增强] 将 BatchNormalization 与 Scale 融合。\n* [增强] SoftMax companion object 支持无参数初始化。\n* [增强] Python 支持使用 MKL-DNN 进行训练。\n* [增强] 文档改进。\n* [Bug 修复] 修复模型版本比较问题。\n* [Bug 修复] 修复 
ParallelTable 的图反向传播 bug。\n* [Bug 修复] 修复使用 MKL-DNN 训练时的内存泄漏问题。\n* [Bug 修复] 修复训练过程中由非规格化数值引起的性能问题。\n* [Bug 修复] 修复在 MKL-DNN 下 SoftMax 发生段错误的问题。\n* [Bug 修复] 修复 TimeDistributedCriterion 的 Python API 与 Scala 不一致的问题。","2024-03-07T02:15:28",{"id":216,"version":217,"summary_zh":218,"released_at":219},118051,"v0.9.0","## 亮点\n\n* 继续支持 VNNI 加速，我们为更多 CNN 模型（包括目标检测模型）添加了优化，并增强了对 VNNI 的模型尺度生成功能支持。\n* 增加基于注意力机制的模型支持，我们为语言模型和翻译模型都实现了 Transformer 架构。\n* RNN 优化：我们支持将 LSTM 与 MKL-DNN 集成，性能提升约 3 倍。\n\n\n## 详情\n* [新特性] 添加注意力层支持\n* [新特性] 添加前馈网络层支持\n* [新特性] 添加 ExpandSize 层支持\n* [新特性] 添加 TableOperation 层，以支持不同输入尺寸的表格计算\n* [新特性] 添加层归一化层支持\n* [新特性] 为语言模型和翻译模型添加 Transformer 支持\n* [新特性] 在 Transformer 模型中添加束搜索支持\n* [新特性] 添加逐层自适应学习率缩放优化方法\n* [新特性] 添加 LSTM 与 MKL-DNN 集成支持\n* [新特性] 添加空洞卷积与 MKL-DNN 集成支持\n* [新特性] 为 LarsSGD 优化方法添加参数处理功能\n* [新特性] 支持与 MKL-DNN 的亲和性绑定选项\n* [增强] 完善配置和构建相关文档\n* [增强] 改进反射机制，以便获取构造函数参数的默认值\n* [增强] 用户可使用一个 AllReduce 参数进行多优化方法训练\n* [增强] CAddTable 层增强，支持沿特定维度扩展输入\n* [增强] ResNet-50 预处理流水线增强，用中心裁剪替换随机裁剪\n* [增强] 支持为任意掩码计算模型尺度\n* [增强] 启用全局平均池化\n* [增强] 检查输入形状与底层 MKL-DNN 布局的一致性\n* [增强] 线程池增强，在执行器运行时抛出适当的异常\n* [增强] 支持从 ntc 到 tnc 的 MKL-DNN 格式转换\n* [Bug 修复] 修复反向图生成拓扑排序问题\n* [Bug 修复] 修复 MemoryData 哈希码计算问题\n* [Bug 修复] 修复 BCECriterion 的日志输出问题\n* [Bug 修复] 修复容器量化时掩码设置问题\n* [Bug 修复] 修复多执行器使用同一工作节点运行时的验证准确率问题\n* [Bug 修复] 修复多组掩码卷积与批归一化之间的 INT8 层融合问题\n* [Bug 修复] 修复 JoinTable 尺度生成问题\n* [Bug 修复] 修复 CMul 在特殊输入格式下的前向传播问题\n* [Bug 修复] 修复模型融合后权重变化问题\n* [Bug 修复] 修复空间卷积原语初始化问题\n","2024-03-07T02:15:35",{"id":221,"version":222,"summary_zh":223,"released_at":224},118052,"v0.8.0","## 亮点\n* 增加 MKL-DNN Int8 支持，尤其是对 VNNI 加速的支持。低精度推理可显著提升延迟和吞吐量。\n* 增加在 MKL-DNN 下运行 MKL-BLAS 模型的支持。我们利用 MKL-DNN 来加速 MKL-BLAS 模型的训练和推理。\n* 增加对 Spark 2.4 的支持。我们的示例和 API 与 Spark 2.4 完全兼容，并随其他 Spark 版本一同发布了适用于 Spark 2.4 的二进制文件。\n\n## 详情\n* [新特性] 增加 MKL-DNN Int8 支持，特别是对 VNNI 的支持。\n* [新特性] 增加在 MKL-DNN 下运行 MKL-BLAS 模型的支持。\n* [新特性] 增加对 Spark 2.4 的支持。\n* [新特性] 增加自动融合功能，以加速模型推理。\n* [新特性] 为低精度推理增加内存重排支持。\n* [新特性] 为 DNN 张量增加字节支持。\n* [新特性] 在 
MKL-DNN 层中增加 SAME 填充模式。\n* [新特性] 为训练完成添加组合（加\u002F或）触发条件。\n* [增强] 增强 Inception-V1 的 Python 训练支持。\n* [增强] 分布式优化器增强，以支持自定义优化器。\n* [增强] 为 DNN 支持的层增加计算输出形状的功能。\n* [增强] 新的 MKL-DNN 计算线程池。\n* [增强] 为 Predictor 增加 MKL-DNN 支持。\n* [增强] 对稀疏张量、MKL-DNN 支持等文档进行增强。\n* [增强] 为 AvgPooling 和 MaxPooling 层增加 ceilm 模式。\n* [增强] 为 DLClassifierModel 增加二分类支持。\n* [增强] 改进内存重排功能，支持 NHWC 和 NCHW 之间的转换。\n* [Bug 修复] 修复输入被缩小的 SoftMax 层问题。\n* [Bug 修复] TensorFlow 加载器支持检查所有数据类型。\n* [Bug 修复] 修复加载 TensorFlow 图时 Add 运算不支持 double 类型的问题。\n* [Bug 修复] 修复训练过程中验证阶段单步权重更新缺失的问题。\n* [Bug 修复] 修复 Scala 编译器在 2.10 和 2.11 版本中的安全问题。\n* [Bug 修复] 修复模型广播缓存 UUID 问题。\n* [Bug 修复] 修复批量大小为 1 时的预测器问题。","2024-03-07T02:15:41",{"id":226,"version":227,"summary_zh":228,"released_at":229},118053,"v0.7.0","## 亮点\n* MKL-DNN 支持增强，包括训练优化、更多模型的训练支持以及模型序列化支持\n* 针对基于 MKL-DNN 的模型推出全新分布式优化器。该优化器可在分布式训练过程中重叠执行训练与通信操作，从而提升多节点环境下的可扩展性\n## 详情\n* [新特性] 新增优化方法 ParallelAdam，充分利用多线程能力\n* [新特性] 增加推荐系统中广泛使用的评估指标 HitRate\n* [新特性] 增加推荐系统中广泛使用的评估指标 NDCG\n* [新特性] 在分布式训练同步参数时支持通信优先级\n* [新特性] 支持 ModelBroadcast 自定义配置\n* [新特性] 针对基于 MKL-DNN 的模型推出全新分布式优化器。该优化器可在分布式训练过程中重叠执行训练与通信操作，从而提升多节点环境下的可扩展性\n* [API 变更] 在 Python model.predict API 中新增 batch size 参数\n* [增强] 为 LeNet 添加 MKL-DNN 训练示例\n* [增强] 通过消除梯度截断和零梯度问题，提升基于 MKL-DNN 的模型训练性能\n* [增强] 增加基于 MKL-DNN 的 VGG-16 训练示例\n* [增强] 支持 Graph 输出中的嵌套表结构\n* [增强] 优化线程池，使其与 MKL-DNN 引擎兼容\n* [增强] 提供 MKL-DNN 模型序列化支持\n* [增强] 增加 VGG-16 验证示例\n* [Bug 修复] 修复当 batch size 发生变化时，JoinTable 在反向传播过程中抛出异常的问题\n* [Bug 修复] 将 ReshapeLoadTF 中的 Reshape 操作改为 InferReShape\n* [Bug 修复] 修复 Predictor 中 splitBatch 问题，该问题出现在模型包含多个 Graph 且每个 Graph 均输出表的情况下\n* [Bug 修复] 修复 MDL-DNN 推理性能问题，避免在推理阶段复制权重\n* [Bug 修复] 修复存在未标记数据时训练会崩溃的问题\n* [Bug 修复] 修复输入为灰度图像而模型需要 3 通道输入的情况\n* [Bug 修复] 调整样式检查任务，确保输入和输出文件均采用 UTF-8 编码格式\n* [Bug 修复] 仅在指定 MKL-DNN 引擎时加载相关库\n* [Bug 修复] 对 org.tensorflow.framework 进行阴影处理以避免冲突\n* [Bug 修复] 修复 dlframes 未被打包进 pip 的问题\n* [Bug 修复] 修复 LocalPredictor 因嵌套 logger 变量而无法序列化的问题\n* [Bug 修复] 克隆 Cell 时需清空 Recurrent preTopology 的输出\n* [Bug 修复] MM 
层在多次运行时对相同输入产生不同输出\n* [Bug 修复] Distribute predictor 在执行 `mapPartition` 时会发送两次模型\n* [文档] 针对 Spark 2.3 的 Kubernetes 编程指南\n* [文档] 增加将预处理组件和模型封装在一个 Graph 中的相关文档，并提供其 Python API","2024-03-07T02:15:48",{"id":231,"version":232,"summary_zh":233,"released_at":234},118054,"v0.6.0","## Highlights\r\n* We integrate [MKL-DNN](https:\u002F\u002Fgithub.com\u002Fintel\u002Fmkl-dnn) as an alternative execution engine for CNN models. MKL-DNN provides better training\u002Finference performance and less memory consuming. On some CNN models, we find there’s 2x throughput improvement in our experiment. \r\n* Support using different optimization methods to optimize different parts of the model. This is necessary when train some models. \r\n* Spark 2.3 support. We have tested our code and examples on Spark 2.3. We release the binary for Spark 2.3, and Spark 1.5 will not be  supported. \r\n \r\n## Details\r\n* [New Feature] MKL-DNN integration. We integrate [MKL-DNN](https:\u002F\u002Fgithub.com\u002Fintel\u002Fmkl-dnn) as an alternative execution engine for CNN models. It supports speedup layers like: AvgPooling, MaxPooling, CAddTable, LRN, JoinTable, Linear, ReLU, SpatialConvolution, SpatialBatchnormalization, Softmax. MKL-DNN provides better training\u002Finference performance and less memory consuming.\r\n* [New Feature] Layer fusion. Support layer fusion on conv + relu, batchnorm + relu, conv + batchnorm and conv + sum(some of the fusion can only be applied in the inference). Layer fusion provides better performance especially on inference. Currently layer fusion are only available for MKL-DNN related layers.\r\n* [New Feature] Multiple optimization method support in optimizer. 
Support using different optimization methods to optimize different parts of the model.\r\n* [New Feature] Add a new optimization method Ftrl, which is often used in recommendation model training.\r\n* [New Feature] Add a new example: Training Resnet50 on ImageNet dataset.\r\n* [New Feature] Add new OpenCV based image preprocessing transformer ChannelScaledNormalizer.\r\n* [New Feature] Add new OpenCV based image preprocessing transformer RandomAlterAspect.\r\n* [New Feature] Add new OpenCV based image preprocessing transformer RandomCropper.\r\n* [New Feature] Add new OpenCV based image preprocessing transformer RandomResize.\r\n* [New Feature] Support loading Tensorflow Max operation.\r\n* [New Feature] Allow user to specify input port when loading Tensorflow model. If the input operation accepts multiple tensors as input, user can specify which to feed data to instead of feed all tensors.\r\n* [New Feature] Support loading Tensorflow Gather operation.\r\n* [New Feature] Add random split for ImageFrame\r\n* [New Feature] Add setLabel and getURI API into ImageFrame\r\n* [API Change] Add batch size into the Python model.predict API.\r\n* [API Change] Add generateBackward into load Tensorflow model API, which allows user choose whether to generate backward path when load Tensorflow model.\r\n* [API Change] Add feature() and label() to the Sample.\r\n* [API Change] Deprecate the DLClassifier\u002FDLEstimator in org.apache.spark.ml. Prefer using DLClassifier\u002FDLEstimator under com.intel.analytics.bigdl.dlframes.\r\n* [Enhancement] Refine StridedSlice. Support begin\u002Fend\u002FshrinkAxis mask just like Tensorflow.\r\n* [Enhancement] Add layer sync to SpatialBatchNormalization.  SpatialBatchNormalization can calculate mean\u002Fstd on a larger batch size. The model with SpatialBatchNormalization layer can converge to a better accuracy even the local batch size is small.\r\n* [Enhancement] Code refactor in DistriOptimizer for advanced parameter operations, e.g. 
global gradient clipping.\r\n* [Enhancement] Add more models to the LoadModel example.\r\n* [Enhancement] Share Const values when broadcasting the model. Const values do not change, so they can be shared when using multiple models for inference on the same node, which reduces memory usage.\r\n* [Enhancement] Refine the getTime and time counting implementation.\r\n* [Enhancement] Support group serializers so that layers of the same hierarchy can share the same serializer.\r\n* [Enhancement] Dockerfile uses Python 2.7.\r\n* [Bug Fix] Fix a memory leak when using a quantized model in the predictor.\r\n* [Bug Fix] Fix PY4J Java gateway incompatibility in Spark local mode for Spark 2.3.\r\n* [Bug Fix] Fix a bug in the Python inception example.\r\n* [Bug Fix] Fix a bug when running a Tensorflow model that uses a loop.\r\n* [Bug Fix] Fix a bug in the Squeeze layer.\r\n* [Bug Fix] Fix the Python API for random split.\r\n* [Bug Fix] Use parameters() instead of getParameterTable() to get weight and bias in serialization.\r\n* [Document] Fix inaccuracies in the quantized model document.\r\n* [Document] Fix incorrect instructions for generating Sequence files for the ImageNet 2012 dataset in the document.\r\n* [Document] Move the bigdl-core build document to a separate page and refine the format.\r\n* [Document] Fix incorrect commands in the Tensorflow load and transfer learning examples.\r\n","2024-03-07T02:15:54",{"id":236,"version":237,"summary_zh":238,"released_at":239},118055,"v0.5.0","## Highlights\r\n* Bring in a Keras-like API (Scala and Python). Users can easily run their Keras code (training and inference) on Apache Spark through BigDL. For more details, see [this link](https:\u002F\u002Fbigdl-project.github.io\u002F0.5.0\u002F#KerasStyleAPIGuide\u002Fkeras-api-python\u002F).\r\n* Support loading Tensorflow dynamic models (e.g. 
LSTM, RNN) in BigDL and support more Tensorflow operations; see [this page](https:\u002F\u002Fbigdl-project.github.io\u002F0.5.0\u002F#APIGuide\u002Ftensorflow_ops_list\u002F).\r\n* Support combining data preprocessing and neural network layers in the same model (to make model deployment easy)\r\n* Speed up various modules in BigDL (BCECriterion, rmsprop, LeakyRelu, etc.)\r\n* Add DataFrame-based image reader and transformer\r\n \r\n## New Features\r\n* Tensor can be converted to OpenCVMat\r\n* Bring in a new Keras-like API for Scala and Python\r\n* Support loading Tensorflow dynamic models (e.g. LSTM, RNN)\r\n* Support loading more Tensorflow operations (InvertPermutation, ConcatOffset, Exit, NextIteration, Enter, RefEnter, LoopCond, ControlTrigger, TensorArrayV3, TensorArrayGradV3, TensorArrayGatherV3, TensorArrayScatterV3, TensorArrayConcatV3, TensorArraySplitV3, TensorArrayReadV3, TensorArrayWriteV3, TensorArraySizeV3, StackPopV2, StackPop, StackPushV2, StackPush, StackV2, Stack)\r\n* ResizeBilinear supports NCHW\r\n* ImageFrame supports loading Hadoop sequence files\r\n* ImageFrame supports gray images\r\n* Add Kv2Tensor Operation(Scala)\r\n* Add PGCriterion to compute the negative policy gradient given the action distribution, sampled action and reward\r\n* Support gradually increasing the learning rate in LearningrateScheduler\r\n* Add FixExpand and add more options to AspectScale for image preprocessing\r\n* Add RowTransformer(Scala)\r\n* Support adding preprocessors to Graph, which allows users to combine preprocessing and a trainable model into one model\r\n* The Resnet on cifar-10 example supports loading images from HDFS\r\n* Add CategoricalColHashBucket operation(Scala)\r\n* Predictor supports Table as output\r\n* Add BucketizedCol operation(Scala)\r\n* Support using DenseTensor and SparseTensor together to create a Sample\r\n* Add CrossProduct Layer (Scala)\r\n* Provide an option to allow users to bypass exceptions in transformers\r\n* DenseToSparse layer supports disabling backward propagation\r\n* 
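A minimal sketch of the DenseToSparse idea mentioned above (a dense tensor reduced to the indices and values of its non-zero entries); this is plain illustrative Python with a hypothetical helper name, not BigDL's implementation:

```python
# Illustrative sketch only (hypothetical helper, not BigDL's DenseToSparse layer):
# convert a dense 2-D matrix into a COO-style sparse representation,
# keeping only the coordinates and values of the non-zero entries.
def dense_to_sparse(matrix):
    indices, values = [], []
    for i, row in enumerate(matrix):
        for j, v in enumerate(row):
            if v != 0:
                indices.append((i, j))
                values.append(v)
    shape = (len(matrix), len(matrix[0]))
    return indices, values, shape
```

Only the non-zero entries are stored, which is the memory saving that sparse tensors provide.
* 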
Add CategoricalColVocaList Operation(Scala)\r\n* Support ImageFrame in the Python optimizer\r\n* Support getting the executor number and executor cores in Python\r\n* Add IndicatorCol Operation(Scala)\r\n* Add TensorOp, which is an operation with Tensor[T]-formatted input and output, and provides shortcuts to build Operations for tensor transformation by closures. (Scala)\r\n* Provide a Dockerfile to make it easy to set up a BigDL testing environment\r\n* Add CrossCol Operation(Scala)\r\n* Add MkString Operation(Scala)\r\n* Add a prediction service interface that supports concurrent calls and accepts bytes input\r\n* Add SparseTensor.cast & SparseTensor.applyFun\r\n* Add DataFrame-based image reader and transformer\r\n* Support loading Tensorflow model files saved by the tf.saved_model API\r\n* SparseMiniBatch supports multiple TensorDataTypes\r\n \r\n## Enhancement\r\n* ImageFrame supports serialization\r\n* A default implementation of zeroGradParameter is added to AbstractModule\r\n* Improve the style of the document website\r\n* Models in different threads share weights in model training\r\n* Speed up leaky relu\r\n* Speed up Rmsprop\r\n* Speed up BCECriterion\r\n* Support calling Java functions in the Python executor and ModelBroadcast in Python\r\n* Add detailed instructions to run-on-ec2\r\n* Optimize padding mechanism\r\n* Fix Maven compile warnings\r\n* Check duplicate layers in the container\r\n* Refine the document that introduces how to automatically deploy BigDL on a Dataproc cluster\r\n* Refactor adding extra jars\u002FPython packages for Python users. Now users only need to set the env variables BIGDL_JARS & BIGDL_PACKAGES\r\n* Implement appendColumn and avoid the error caused by API mismatches between different Spark versions\r\n* Add Python inception training on ImageNet example\r\n* Update "can't find locality partition for partition ..." to a warning message\r\n\r\n## API change\r\n* Move DataFrame-based API to the dlframe package\r\n* Refine the Container hierarchy. 
The add method (used in Sequential, Concat…) is moved to a subclass, DynamicContainer\r\n* Refine the serialization code hierarchy\r\n* Dynamic Graph is now an internal class, which is only used to run Tensorflow models\r\n* Operation is not allowed to be used outside Graph\r\n* Make the getParamter method final and private[bigdl]; it should only be used in model training\r\n* Remove the updateParameter method, which is only used in internal tests\r\n* Some Tensorflow-related operations are marked as internal; they should only be used when running Tensorflow models\r\n \r\n## Bug Fix\r\n* Fix a sparse sample batch bug. It should add another dimension instead of concatenating the original tensor\r\n* Fix some activations and layers not working in TimeDistributed and RnnCell\r\n* Fix a bug in the SparseTensor resize method\r\n* Fix a bug when converting SparseTensor to DenseTensor\r\n* Fix a bug in SpatialFullConvolut","2024-03-07T02:16:02",{"id":241,"version":242,"summary_zh":243,"released_at":244},118056,"v0.4.0","## Highlights\r\n* Support all Keras layers, and support Keras 1.2.2 model loading. 
See [keras-support](https:\u002F\u002Fbigdl-project.github.io\u002F0.4.0\u002F#ProgrammingGuide\u002Fkeras-support\u002F) for details\r\n* Python 3.6 support\r\n* OpenCV support, and add a dozen image transformers based on OpenCV\r\n* More layers\u002Foperations\r\n\r\n## New Features\r\n* Models & Layers & Operations & Loss function\r\n  + Add layers for Keras: Cropping2D, Cropping3D, UpSampling1D, UpSampling2D, UpSampling3D, Masking, Maxout, HighWay, GaussianDropout, GaussianNoise, CAveTable, VolumetricAveragePooling, HardSigmoid, SReLU, LocallyConnected1D, LocallyConnected2D, SpatialSeparableConvolution, ActivityRegularization, SpatialDropout1D, SpatialDropout2D, SpatialDropout3D\r\n  + Add criterions for Keras: PoissonCriterion, KullbackLeiblerDivergenceCriterion, MeanAbsolutePercentageCriterion, MeanSquaredLogarithmicCriterion, CosineProximityCriterion\r\n  + Support NHWC for LRN and BatchNormalization\r\n  + Add LookupTableSparse (lookup table for multivalue)\r\n  + Add activation argument for recurrent layers\r\n  + Add MultiRNNCell\r\n  + Add SpatialSeparableConvolution\r\n  + Add MSRA filler\r\n  + Support SAME padding in 3D conv, and allow users to configure the padding size in convlstm and convlstm3d\r\n  + TF operations: SegmentSum, conv3d related operations, Dilation2D, Dilation2DBackpropFilter, Dilation2DBackpropInput, Digamma, Erf, Erfc, Lgamma, TanhGrad, depthwise, Rint, All, Any, Range, Exp, Expm1, Round, FloorDiv, TruncateDiv, Mod, FloorMod, TruncateMod, IntopK, Round, Maximum, Minimum, BatchMatMul, Sqrt, SqrtGrad, Square, RsqrtGrad, AvgPool, AvgPoolGrad, BiasAddV1, SigmoidGrad, Relu6, Relu6Grad, Elu, EluGrad, Softplus, SoftplusGrad, LogSoftmax, Softsign, SoftsignGrad, Abs, LessEqual, GreaterEqual, ApproximateEqual, Log, LogGrad, Log1p, Log1pGrad, SquaredDifference, Div, Ceil, Inv, InvGrad, IsFinite, IsInf, IsNan, Sign, TopK. 
See details at [tensorflow_ops_list](https:\u002F\u002Fbigdl-project.github.io\u002F0.4.0\u002F#APIGuide\u002Ftensorflow_ops_list\u002F)\r\n  + Add object detection related layers: PriorBox, NormalizeScale, Proposal, DetectionOutputSSD, DetectionOutputFrcnn, Anchor\r\n* Transformer\r\n  + Add image transformers based on OpenCV: Resize, Brightness, ChannelOrder, Contrast, Saturation, Hue, ChannelNormalize, PixelNormalize, RandomCrop, CenterCrop, FixedCrop, DetectionCrop, Expand, Filler, ColorJitter, RandomSampler, MatToFloats, AspectScale, RandomAspectScale, BytesToMat\r\n  + Add transformers: RandomTransformer, RoiProject, RoiHFlip, RoiResize, RoiNormalize\r\n* API change\r\n  + Add predictImage function in LocalPredictor\r\n  + Add partition number option for ImageFrame read\r\n  + Add an API to get a node from a graph model by name\r\n  + Support List of JTensors for label in the Python API\r\n  + Expose local optimizer and predictor in the Python API\r\n* Install & Deploy\r\n  + Support BigDL on [Spark on k8s](https:\u002F\u002Fgithub.com\u002Fapache-spark-on-k8s\u002Fspark)\r\n* Model Save\u002FLoad\r\n  + Support big models (parameters exceeding 2.1G) for both Java serialization and protobuf\r\n  + Support Keras model loading\r\n* Training\r\n  + Allow users to set new training data or a new criterion when reusing an optimizer\r\n  + Support gradient clipping (constant clip and clip by L2-norm)\r\n\r\n## Enhancement\r\n* Speed up BatchNormalization\r\n* Speed up MSECriterion\r\n* Speed up Adam\r\n* Speed up static graph execution\r\n* Support reading TFRecord files from HDFS\r\n* Support reading raw binary files from HDFS\r\n* Check input size in the concat layer\r\n* Add proper exception handling for CaffeLoader & Persister\r\n* Add serialization support for multiple tensor numerics\r\n* Add an Activity wrapper for Python to simplify the return value\r\n* Override joda-time in hadoop-aws to reduce compile time\r\n* LocalOptimizer: use a ModelBroadcast-like method to clone modules\r\n* 
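The gradient clipping added under Training above (constant clip and clip by L2-norm) can be sketched in plain Python; this is an illustration of the two techniques with hypothetical function names, not BigDL's code:

```python
def clip_by_value(grads, lo, hi):
    # Constant clipping: clamp each gradient component into [lo, hi].
    return [min(max(g, lo), hi) for g in grads]

def clip_by_l2_norm(grads, max_norm):
    # L2-norm clipping: if the global L2 norm exceeds max_norm,
    # rescale the whole gradient vector, preserving its direction.
    norm = sum(g * g for g in grads) ** 0.5
    if norm > max_norm:
        scale = max_norm / norm
        return [g * scale for g in grads]
    return list(grads)
```

L2-norm clipping bounds the update magnitude while keeping its direction intact, which is why it is often preferred over per-component clipping.
* 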
Time counting for ParallelTable's forward\u002Fbackward\r\n* Use shade to package jar-with-dependencies to manage package conflicts\r\n* Support loading bigdl_conf_file in multiple Python zip files\r\n\r\n## Bug Fix\r\n* Fix getModel failing in DistriOptimizer when model parameters exceed 2.1G\r\n* Fix core number being 0 when there's only one core in the system\r\n* Fix SparseJoinTable throwing an exception if the input's nElement changed\r\n* Fix some issues found when saving a BigDL model to a Tensorflow-format file\r\n* Fix return object type error of DLClassifier.transform in Python\r\n* Fix graph generateBackward being lost in serialization\r\n* Fix resizing a tensor to an empty tensor not working properly\r\n* Fix Adapter layer not supporting different batch sizes at runtime\r\n* Fix Adapter layer not being serializable directly\r\n* Fix calling the wrong function when setting user-defined MKL threads\r\n* Fix SmoothL1Criterion and SoftmaxWithCriterion not dealing with the input's offset\r\n* Fix L1Regularization throwing NullPointerException while broadcasting the model\r\n* Fix CMul layer crashing for certain configurations\r\n","2024-03-07T02:16:08",{"id":246,"version":247,"summary_zh":248,"released_at":249},118057,"v0.3.0","## Highlights\r\n* New protobuf-based model storage format\r\n* Support model quantization\r\n* Support sparse tensors and models\r\n* Easier and broader Tensorflow model load support\r\n* More layers\u002Foperations\r\n* Apache Spark 2.2 support\r\n \r\n## New Features\r\n* Models & Layers & Operations & Loss function\r\n  + Support convlstm3D model\r\n  + Support Variational Auto Encoder\r\n  + Support Unet\r\n  + Support PTB model\r\n  + Add SpatialWithinChannelLRN layer\r\n  + Add 3D-deconv layer\r\n  + Add BifurcateSplitTable layer\r\n  + Add KLD criterion\r\n  + Add Gaussian layer\r\n  + Add Sampler layer\r\n  + Add RNN decoder layer\r\n  + Support NHWC data format in 2D-conv, 2D-pooling layers\r\n  + Support same\u002Fvalid padding type in 2D-conv and 2D-pooling layers\r\n 
 + Support dynamic execution flow in Graph\r\n  + Graph nodes can pass nested tensors\r\n  + Layer\u002FOperation can support different input and output numeric tensor types\r\n  + Start to support operations in BigDL, adding the following operations: LogicalNot, LogicalOr, LogicalAnd, 1D Max Pooling, Squeeze, Prod, Sum, Reshape, Identity, ReLU, Equals, Greater, Less, Switch, Merge, Floor, L2Loss, RandomUniform, Rank, MatMul, SoftMax, Conv2d, Add, Assert, Onehot, Assign, Cast, ExpandDims, MaxPool, Realdiv, BiasAdd, Pad, Tile, StridedSlice, Transpose, Negative, AssignGrad, BiasAddGrad, Deconv2D, Conv2DBackFilter, CrossEntropy, MaxPoolGrad, NoOp, RandomUniform, ReluGrad, Select, Sum, Pow, BroadcastGradientArgs, Control Dependency\r\n  + Start to support sparse layers in BigDL, adding the following sparse layers: SparseLinear, SparseJoinTable, DenseToSparse\r\n* Tensor\r\n  + Support sparse tensors\r\n  + Support scalars (0-D tensors)\r\n  + Tensor supports more numeric types: boolean, short, int, long, string, char, bytestring\r\n  + Tensor doesn't display its full content in toString when there are too many elements\r\n* API change\r\n  + Expose evaluate API to Python\r\n  + Add a predictClass API to the model to simplify the code when users want to use a model for classification\r\n  + Change model.test to model.evaluate in Python\r\n  + Refine the Recurrent, BiRecurrent and RnnCell APIs\r\n  + Sample.features from ndarray to JTensor\u002FList[JTensor]\r\n  + Sample.label from ndarray to JTensor\r\n* Install & Deploy\r\n  + Support Apache Spark 2.2\r\n  + Add a script to run BigDL on the Google DataProc platform\r\n  + Refine run-example.sh scripts to run BigDL examples on AWS with built-in Spark\r\n  + Pip install will now auto-install spark-2.2\r\n  + Add a Dockerfile\r\n* Model Save\u002FLoad\r\n  + New model persistence format (protobuf-based) to provide a better user experience when saving\u002Floading BigDL models\r\n  + Support loading more operations from Tensorflow\r\n  + Support reading tensor content from a Tensorflow 
checkpoint\r\n  + Support loading a subset of a Tensorflow graph\r\n  + Support loading a Tensorflow preprocessing graph (read\u002Fparse TFRecord data, image decoders and queues)\r\n  + Automatically convert data in a Tensorflow queue to an RDD to feed model training in BigDL\r\n  + Support loading deconv layers from Caffe and Tensorflow\r\n  + Support saving\u002Floading the SpatialCrossLRN Torch module\r\n* Training\r\n  + Allow users to modify the optimization algorithm status when resuming training in Python\r\n  + Allow users to specify optimization algorithms, learning rate and learning rate decay when using BigDL in the Spark ML pipeline\r\n  + Allow users to stop gradients on some layers in backpropagation\r\n  + Allow users to freeze layer parameters in training\r\n  + Add ML pipeline Python API; users can use BigDL with ML pipeline in Python code\r\n \r\n## Enhancement\r\n1. Support model quantization. Users can speed up model inference by quantizing the model\r\n2. Display BigDL models in Tensorboard\r\n3. Users can easily convert a sequential model to a graph model by invoking the newly added toGraph method\r\n4. Remove unnecessary contiguous check in 3D conv\r\n5. Support global average pooling\r\n6. Support regularizer in the 3D convolution layer\r\n7. Add regularizer for convlstmpeephole3d\r\n8. Throw more meaningful messages in layers and criterions\r\n9. Migrate GRU\u002FLSTM\u002FRNN\u002FLSTM-Peephole definitions from sequence to graph\r\n10. Switch to pytest for Python unit tests\r\n11. Speed up tanh layer\r\n12. Speed up sigmoid layer\r\n13. Speed up recurrent layer\r\n14. Support batch normalization in recurrent\r\n15. Speed up Python ndarray to Scala tensor conversion\r\n16. Improve gradient sync performance in distributed training\r\n17. Speed up tensor dot operation with MKL dot\r\n18. Speed up copy operation in recurrent container\r\n19. Speed up logsoftmax\r\n20. Move classes.lst and img_class.lst to the model example folder, so users can find them more easily\r\n21. 
Ensure spark.speculation is set to false to get better performance in training\r\n22. Easier to turn on performance data in the distributed training log\r\n23. Optimize memory usage when broadcasting the model\r\n24. Support MLlib vector as a feature for BigDL\r\n25. Support creating multi-tensor Samples in Python\r\n26. Support resizing in BytesToBGRImg\r\n \r\n## Bug Fix\r\n1. Fix TemporalConv layer not returning its parameter table\r\n2. Fix some bugs when loading dilated g","2024-03-07T02:16:15",{"id":251,"version":252,"summary_zh":253,"released_at":254},118058,"v0.2.0","## New feature\r\n\r\n* A new BigDL documentation website is online at https:\u002F\u002Fbigdl-project.github.io\u002F, which replaces the original BigDL wiki\r\n* Added new models & layers\r\n    + TreeLSTM and examples for sentiment analytics\r\n    + convLSTM layer\r\n    + 1D convolution layer\r\n    + Mean Absolute Error (MAE) metrics\r\n    + TimeDistributed layer\r\n    + VolumetricConvolution (3D convolution)\r\n    + VolumetricMaxPooling\r\n    + RoiPooling layer\r\n    + DiceCoefficient loss\r\n    + bi-recurrent layers\r\n* API change\r\n    + Allow users to set regularization per layer\r\n    + Allow users to set the learning rate per layer\r\n    + Add predictClass API for Python\r\n    + Add DLEstimator for the Spark ML pipeline\r\n    + Add Functional API for model definition\r\n    + Add MovieLens dataset API\r\n    + Add 4d normalize support\r\n    + Add evaluator API to simplify model testing\r\n* Install & Deploy\r\n    + Allow users to install BigDL from pip\r\n    + Support the win64 platform\r\n    + A new script to auto-pack\u002Fdistribute Python dependencies in yarn cluster mode\r\n* Model Save\u002FLoad\r\n    + Allow users to save a BigDL model as a Caffe model file\r\n    + Allow users to load\u002Fsave some Tensorflow models (covering Tensorflow slim APIs)\r\n    + Support saving\u002Floading model files from\u002Fto s3\u002Fhdfs\r\n* Optimization\r\n    + Add plateau learning rate schedule\r\n    + Allow users 
to adjust the optimization process based on loss and score\r\n    + Add exponential learning rate decay\r\n    + Add natural exp decay learning rate schedule\r\n    + Add multistep learning rate policy\r\n\r\n## Enhancement\r\n1.\tOptimization method API refactor\r\n2.\tAllow users to load a Caffe model without pre-defining a BigDL model\r\n3.\tOptimize recurrent layer performance\r\n4.\tRefine the ML pipeline related API, and add more examples\r\n5.\tOptimize JoinTable layer performance\r\n6.\tAllow users to use the nio blockmanager on Spark 1.5\r\n7.\tRefine the layer parameter initialization algorithm API\r\n8.\tRefine the Sample class to save memory when caching train\u002Ftest datasets in tensor format\r\n9.\tRefine the MiniBatch API to support padding and multiple tensors\r\n10.\tRemove bigdl.sh. BigDL will set MKL behavior through the MKL Java API, and users can control this via Java properties\r\n11.\tAllow users to remove Spark logs from the redirected log file\r\n12.\tAllow users to create a SpatialConvolution layer without bias\r\n13.\tRefine the validation metrics API\r\n14.\tRefine SmoothL1Criterion and reduce tensor storage usage\r\n15.\tUse reflection to handle differences between Spark 2 platforms, so users need not recompile BigDL for different Spark 2 platforms\r\n16.\tOptimize FlattenTable performance\r\n17.\tUse Maven package instead of a script to copy dist artifacts together\r\n\r\n\r\n## Bug Fix\r\n1.\tFix some errors in the Text-classifier document\r\n2.\tFix a bug when calling JoinTable after clearState()\r\n3.\tFix a bug in the Concat layer when the dimension concatenated along is larger than 2\r\n4.\tFix a bug in the MapTable layer\r\n5.\tFix some multi-thread errors not being caught\r\n6.\tFix Maven artifact dependency issue\r\n7.\tFix model save method not closing the stream\r\n8.\tFix a bug in BCECriterion\r\n9.\tFix some ConcatTable modules not clearing the gradInput buffer\r\n10.\tFix SpatialDilatedConvolution not clearing gradInput 
content\r\n","2024-03-07T02:16:22",{"id":256,"version":257,"summary_zh":258,"released_at":259},118059,"v0.1.1","**Release Notes**\r\n* API Change\r\n1.\t**Use bigdl as the top-level package name for all BigDL Python modules**\r\n2.\tAllow users to change the model in the optimizer\r\n3.\tAllow users to define a model in the Python API\r\n4.\tAllow users to invoke BigDL Scala code from Python in third-party projects\r\n5.\tAllow users to use the BigDL random generator in Python\r\n6.\tAllow users to use the forward\u002Fbackward methods in Python\r\n7.\tAdd BiRnn layer to Python\r\n8.\tRemove the useless CriterionTable layer\r\n\r\n\r\n* Enhancement\r\n1.\tLoad libjmkl.so in the class load phase\r\n2.\tSupport Python 3.5\r\n3.\tInitialize the gradient buffer at the start of backward to reduce memory usage\r\n4.\tAuto-pack Python dependencies in yarn cluster mode\r\n\r\n* Bug Fix\r\n1.\tFix optimizer continuing without failure after the maximum number of retries\r\n2.\tFix LookupTable Python API throwing a noSuchMethod error\r\n3.\tFix an addmv bug for 1x1 matrices\r\n4.\tFix LeNet Python example error\r\n5.\tFix Python text file loading encoding issue\r\n6.\tFix HardTanh performance issue\r\n7.\tFix data possibly distributing unevenly in the VGG example when the input partition count is too large\r\n8.\tFix a bug in SpatialDilatedConvolution\r\n9.\tFix a bug in the BCECriterion loss function\r\n10.\tFix runtime error when running BigDL on PySpark 1.5\r\n","2024-03-07T02:16:28"]