[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-tobegit3hub--simple_tensorflow_serving":3,"tool-tobegit3hub--simple_tensorflow_serving":61},[4,18,26,36,44,53],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":17},4358,"openclaw","openclaw\u002Fopenclaw","OpenClaw 是一款专为个人打造的本地化 AI 助手，旨在让你在自己的设备上拥有完全可控的智能伙伴。它打破了传统 AI 助手局限于特定网页或应用的束缚，能够直接接入你日常使用的各类通讯渠道，包括微信、WhatsApp、Telegram、Discord、iMessage 等数十种平台。无论你在哪个聊天软件中发送消息，OpenClaw 都能即时响应，甚至支持在 macOS、iOS 和 Android 设备上进行语音交互，并提供实时的画布渲染功能供你操控。\n\n这款工具主要解决了用户对数据隐私、响应速度以及“始终在线”体验的需求。通过将 AI 部署在本地，用户无需依赖云端服务即可享受快速、私密的智能辅助，真正实现了“你的数据，你做主”。其独特的技术亮点在于强大的网关架构，将控制平面与核心助手分离，确保跨平台通信的流畅性与扩展性。\n\nOpenClaw 非常适合希望构建个性化工作流的技术爱好者、开发者，以及注重隐私保护且不愿被单一生态绑定的普通用户。只要具备基础的终端操作能力（支持 macOS、Linux 及 Windows WSL2），即可通过简单的命令行引导完成部署。如果你渴望拥有一个懂你",349277,3,"2026-04-06T06:32:30",[13,14,15,16],"Agent","开发框架","图像","数据工具","ready",{"id":19,"name":20,"github_repo":21,"description_zh":22,"stars":23,"difficulty_score":10,"last_commit_at":24,"category_tags":25,"status":17},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,"2026-04-05T11:01:52",[14,15,13],{"id":27,"name":28,"github_repo":29,"description_zh":30,"stars":31,"difficulty_score":32,"last_commit_at":33,"category_tags":34,"status":17},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",147882,2,"2026-04-09T11:32:47",[14,13,35],"语言模型",{"id":37,"name":38,"github_repo":39,"description_zh":40,"stars":41,"difficulty_score":32,"last_commit_at":42,"category_tags":43,"status":17},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",108111,"2026-04-08T11:23:26",[14,15,13],{"id":45,"name":46,"github_repo":47,"description_zh":48,"stars":49,"difficulty_score":32,"last_commit_at":50,"category_tags":51,"status":17},4721,"markitdown","microsoft\u002Fmarkitdown","MarkItDown 是一款由微软 AutoGen 团队打造的轻量级 Python 工具，专为将各类文件高效转换为 Markdown 格式而设计。它支持 PDF、Word、Excel、PPT、图片（含 OCR）、音频（含语音转录）、HTML 乃至 YouTube 链接等多种格式的解析，能够精准提取文档中的标题、列表、表格和链接等关键结构信息。\n\n在人工智能应用日益普及的今天，大语言模型（LLM）虽擅长处理文本，却难以直接读取复杂的二进制办公文档。MarkItDown 
恰好解决了这一痛点，它将非结构化或半结构化的文件转化为模型“原生理解”且 Token 效率极高的 Markdown 格式，成为连接本地文件与 AI 分析 pipeline 的理想桥梁。此外，它还提供了 MCP（模型上下文协议）服务器，可无缝集成到 Claude Desktop 等 LLM 应用中。\n\n这款工具特别适合开发者、数据科学家及 AI 研究人员使用，尤其是那些需要构建文档检索增强生成（RAG）系统、进行批量文本分析或希望让 AI 助手直接“阅读”本地文件的用户。虽然生成的内容也具备一定可读性，但其核心优势在于为机器",93400,"2026-04-06T19:52:38",[52,14],"插件",{"id":54,"name":55,"github_repo":56,"description_zh":57,"stars":58,"difficulty_score":10,"last_commit_at":59,"category_tags":60,"status":17},4487,"LLMs-from-scratch","rasbt\u002FLLMs-from-scratch","LLMs-from-scratch 是一个基于 PyTorch 的开源教育项目，旨在引导用户从零开始一步步构建一个类似 ChatGPT 的大型语言模型（LLM）。它不仅是同名技术著作的官方代码库，更提供了一套完整的实践方案，涵盖模型开发、预训练及微调的全过程。\n\n该项目主要解决了大模型领域“黑盒化”的学习痛点。许多开发者虽能调用现成模型，却难以深入理解其内部架构与训练机制。通过亲手编写每一行核心代码，用户能够透彻掌握 Transformer 架构、注意力机制等关键原理，从而真正理解大模型是如何“思考”的。此外，项目还包含了加载大型预训练权重进行微调的代码，帮助用户将理论知识延伸至实际应用。\n\nLLMs-from-scratch 特别适合希望深入底层原理的 AI 开发者、研究人员以及计算机专业的学生。对于不满足于仅使用 API，而是渴望探究模型构建细节的技术人员而言，这是极佳的学习资源。其独特的技术亮点在于“循序渐进”的教学设计：将复杂的系统工程拆解为清晰的步骤，配合详细的图表与示例，让构建一个虽小但功能完备的大模型变得触手可及。无论你是想夯实理论基础，还是为未来研发更大规模的模型做准备",90106,"2026-04-06T11:19:32",[35,15,13,14],{"id":62,"github_repo":63,"name":64,"description_en":65,"description_zh":66,"ai_summary_zh":67,"readme_en":68,"readme_zh":69,"quickstart_zh":70,"use_case_zh":71,"hero_image_url":72,"owner_login":73,"owner_name":74,"owner_avatar_url":75,"owner_bio":76,"owner_company":77,"owner_location":78,"owner_email":79,"owner_twitter":77,"owner_website":80,"owner_url":81,"languages":82,"stars":120,"forks":121,"last_commit_at":122,"license":123,"difficulty_score":32,"env_os":124,"env_gpu":125,"env_ram":126,"env_deps":127,"category_tags":141,"github_topics":142,"view_count":32,"oss_zip_url":77,"oss_zip_packed_at":77,"status":17,"created_at":151,"updated_at":152,"faqs":153,"releases":192},6003,"tobegit3hub\u002Fsimple_tensorflow_serving","simple_tensorflow_serving","Generic and easy-to-use serving service for machine learning models","simple_tensorflow_serving 是一款通用且易用的机器学习模型部署服务，旨在简化从模型训练到线上推理的最后一公里。它解决了开发者在部署不同框架模型时面临的接口不统一、环境配置复杂以及多版本管理困难等痛点，让用户无需编写大量服务端代码即可快速搭建高性能的推理接口。\n\n这款工具非常适合机器学习工程师、后端开发人员以及算法研究人员使用。无论是需要快速验证原型的科研人员，还是追求稳定生产环境的开发团队，都能从中受益。其核心亮点在于极强的兼容性，不仅原生支持 TensorFlow，还广泛涵盖 PyTorch、MXNet、ONNX、Scikit-learn 等主流框架，真正实现了“一次部署，多框通用”。\n\n此外，simple_tensorflow_serving 具备多项实用特性：支持 GPU 加速推理以提升性能；允许同时在线服务多个模型及其不同版本，并能自动检测更新实现动态加载；提供标准的 RESTful API，方便任何编程语言的客户端调用；甚至能根据模型自动生成客户端代码，大幅降低对接成本。配合 Docker 和 Kubernetes，它能轻松适应从本地测试到大规模集群的","simple_tensorflow_serving 是一款通用且易用的机器学习模型部署服务，旨在简化从模型训练到线上推理的最后一公里。它解决了开发者在部署不同框架模型时面临的接口不统一、环境配置复杂以及多版本管理困难等痛点，让用户无需编写大量服务端代码即可快速搭建高性能的推理接口。\n\n这款工具非常适合机器学习工程师、后端开发人员以及算法研究人员使用。无论是需要快速验证原型的科研人员，还是追求稳定生产环境的开发团队，都能从中受益。其核心亮点在于极强的兼容性，不仅原生支持 TensorFlow，还广泛涵盖 PyTorch、MXNet、ONNX、Scikit-learn 等主流框架，真正实现了“一次部署，多框通用”。\n\n此外，simple_tensorflow_serving 具备多项实用特性：支持 GPU 加速推理以提升性能；允许同时在线服务多个模型及其不同版本，并能自动检测更新实现动态加载；提供标准的 RESTful API，方便任何编程语言的客户端调用；甚至能根据模型自动生成客户端代码，大幅降低对接成本。配合 Docker 和 Kubernetes，它能轻松适应从本地测试到大规模集群的各种部署场景，是构建高效机器学习服务流水线的得力助手。","# Simple TensorFlow Serving\n\n![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Ftobegit3hub_simple_tensorflow_serving_readme_22f9d43f040e.jpeg)\n\n## Introduction\n\nSimple TensorFlow Serving is the generic and easy-to-use serving service for machine learning models. 
Read more in \u003Chttps:\u002F\u002Fstfs.readthedocs.io>.\n\n* [x] Support distributed TensorFlow models\n* [x] Support the general RESTful\u002FHTTP APIs\n* [x] Support inference with accelerated GPU\n* [x] Support `curl` and other command-line tools\n* [x] Support clients in any programming language\n* [x] Support code-gen client by models without coding\n* [x] Support inference with raw file for image models\n* [x] Support statistical metrics for verbose requests\n* [x] Support serving multiple models at the same time\n* [x] Support dynamic online and offline for model versions\n* [x] Support loading new custom op for TensorFlow models\n* [x] Support secure authentication with configurable basic auth\n* [x] Support multiple models of TensorFlow\u002FMXNet\u002FPyTorch\u002FCaffe2\u002FCNTK\u002FONNX\u002FH2o\u002FScikit-learn\u002FXGBoost\u002FPMML\u002FSpark MLlib\n\n## Installation\n\nInstall the server with [pip](https:\u002F\u002Fpypi.python.org\u002Fpypi\u002Fsimple-tensorflow-serving).\n\n```bash\npip install simple_tensorflow_serving\n```\n\nOr install from [source code](https:\u002F\u002Fgithub.com\u002Ftobegit3hub\u002Fsimple_tensorflow_serving).\n\n```bash\npython .\u002Fsetup.py install\n\npython .\u002Fsetup.py develop\n\nbazel build simple_tensorflow_serving:server\n```\n\nOr use the [docker image](https:\u002F\u002Fhub.docker.com\u002Fr\u002Ftobegit3hub\u002Fsimple_tensorflow_serving\u002F).\n\n```bash\ndocker run -d -p 8500:8500 tobegit3hub\u002Fsimple_tensorflow_serving\n\ndocker run -d -p 8500:8500 tobegit3hub\u002Fsimple_tensorflow_serving:latest-gpu\n\ndocker run -d -p 8500:8500 tobegit3hub\u002Fsimple_tensorflow_serving:latest-hdfs\n\ndocker run -d -p 8500:8500 tobegit3hub\u002Fsimple_tensorflow_serving:latest-py34\n```\n\n```bash\ndocker-compose up -d\n```\n\nOr deploy in [Kubernetes](https:\u002F\u002Fkubernetes.io\u002F).\n\n```bash\nkubectl create -f .\u002Fsimple_tensorflow_serving.yaml\n```\n\n## Quick Start\n\nStart the server with the TensorFlow [SavedModel](https:\u002F\u002Fwww.tensorflow.org\u002Fprogrammers_guide\u002Fsaved_model).\n\n```bash\nsimple_tensorflow_serving --model_base_path=\".\u002Fmodels\u002Ftensorflow_template_application_model\"\n```\n\nCheck out the dashboard at [http:\u002F\u002F127.0.0.1:8500](http:\u002F\u002F127.0.0.1:8500) in a web browser.\n\n![dashboard](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Ftobegit3hub_simple_tensorflow_serving_readme_df26f4f58c36.png)\n\nGenerate a Python client and access the model with test data, without writing any code.\n\n```bash\ncurl http:\u002F\u002Flocalhost:8500\u002Fv1\u002Fmodels\u002Fdefault\u002Fgen_client?language=python > client.py\n```\n\n```bash\npython .\u002Fclient.py\n```\n\n## Advanced Usage\n\n### Multiple Models\n\nIt supports serving multiple models and multiple versions of these models. 
You can run the server with this configuration.\n\n```json\n{\n  \"model_config_list\": [\n    {\n      \"name\": \"tensorflow_template_application_model\",\n      \"base_path\": \".\u002Fmodels\u002Ftensorflow_template_application_model\u002F\",\n      \"platform\": \"tensorflow\"\n    }, {\n      \"name\": \"deep_image_model\",\n      \"base_path\": \".\u002Fmodels\u002Fdeep_image_model\u002F\",\n      \"platform\": \"tensorflow\"\n    }, {\n       \"name\": \"mxnet_mlp_model\",\n       \"base_path\": \".\u002Fmodels\u002Fmxnet_mlp\u002Fmx_mlp\",\n       \"platform\": \"mxnet\"\n    }\n  ]\n}\n```\n\n```bash\nsimple_tensorflow_serving --model_config_file=\".\u002Fexamples\u002Fmodel_config_file.json\"\n```\n\nAdding or removing model versions will be detected automatically, and the latest files will be reloaded into memory. You can easily choose the specified model and version for inference.\n\n```python\nendpoint = \"http:\u002F\u002F127.0.0.1:8500\"\ninput_data = {\n  \"model_name\": \"default\",\n  \"model_version\": 1,\n  \"data\": {\n      \"keys\": [[11.0], [2.0]],\n      \"features\": [[1, 1, 1, 1, 1, 1, 1, 1, 1],\n                   [1, 1, 1, 1, 1, 1, 1, 1, 1]]\n  }\n}\nresult = requests.post(endpoint, json=input_data)\n```\n\n### GPU Acceleration\n\nIf you want to use GPU, try the docker image with the GPU tag and put the CUDA files in `\u002Fusr\u002Fcuda_files\u002F`.\n\n```bash\nexport CUDA_SO=\"-v \u002Fusr\u002Fcuda_files\u002F:\u002Fusr\u002Fcuda_files\u002F\"\nexport DEVICES=$(\\ls \u002Fdev\u002Fnvidia* | xargs -I{} echo '--device {}:{}')\nexport LIBRARY_ENV=\"-e LD_LIBRARY_PATH=\u002Fusr\u002Flocal\u002Fcuda\u002Fextras\u002FCUPTI\u002Flib64:\u002Fusr\u002Flocal\u002Fnvidia\u002Flib:\u002Fusr\u002Flocal\u002Fnvidia\u002Flib64:\u002Fusr\u002Fcuda_files\"\n\ndocker run -it -p 8500:8500 $CUDA_SO $DEVICES $LIBRARY_ENV tobegit3hub\u002Fsimple_tensorflow_serving:latest-gpu\n```\n\nYou can set the session config and GPU options in the command-line parameters or the model config file.\n\n```bash\nsimple_tensorflow_serving --model_base_path=\".\u002Fmodels\u002Ftensorflow_template_application_model\" --session_config='{\"log_device_placement\": true, \"allow_soft_placement\": true, \"allow_growth\": true, \"per_process_gpu_memory_fraction\": 0.5}'\n```\n\n```json\n{\n  \"model_config_list\": [\n    {\n      \"name\": \"default\",\n      \"base_path\": \".\u002Fmodels\u002Ftensorflow_template_application_model\u002F\",\n      \"platform\": \"tensorflow\",\n      \"session_config\": {\n        \"log_device_placement\": true,\n        \"allow_soft_placement\": true,\n        \"allow_growth\": true,\n        \"per_process_gpu_memory_fraction\": 0.5\n      }\n    }\n  ]\n}\n```\n\n### Generated Client\n\nYou can generate the test JSON data for the online models.\n\n```bash\ncurl http:\u002F\u002Flocalhost:8500\u002Fv1\u002Fmodels\u002Fdefault\u002Fgen_json\n```\n\nOr generate clients in different languages (Bash, Python, Golang, JavaScript etc.) 
for your model without writing any code.\n\n```bash\ncurl http:\u002F\u002Flocalhost:8500\u002Fv1\u002Fmodels\u002Fdefault\u002Fgen_client?language=python > client.py\ncurl http:\u002F\u002Flocalhost:8500\u002Fv1\u002Fmodels\u002Fdefault\u002Fgen_client?language=bash > client.sh\ncurl http:\u002F\u002Flocalhost:8500\u002Fv1\u002Fmodels\u002Fdefault\u002Fgen_client?language=golang > client.go\ncurl http:\u002F\u002Flocalhost:8500\u002Fv1\u002Fmodels\u002Fdefault\u002Fgen_client?language=javascript > client.js\n```\n\nThe generated code should look like the following and can be tested immediately.\n\n```python\n#!\u002Fusr\u002Fbin\u002Fenv python\n\nimport requests\n\ndef main():\n  endpoint = \"http:\u002F\u002F127.0.0.1:8500\"\n  json_data = {\"model_name\": \"default\", \"data\": {\"keys\": [[1], [1]], \"features\": [[1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0], [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]]} }\n  result = requests.post(endpoint, json=json_data)\n  print(result.text)\n\nif __name__ == \"__main__\":\n  main()\n```\n\n```python\n#!\u002Fusr\u002Fbin\u002Fenv python\n\nimport requests\n\ndef main():\n  endpoint = \"http:\u002F\u002F127.0.0.1:8500\"\n\n  input_data = {\"keys\": [[1.0], [1.0]], \"features\": [[1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0], [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]]}\n  result = requests.post(endpoint, json=input_data)\n  print(result.text)\n\nif __name__ == \"__main__\":\n  main()\n```\n\n### Image Model\n\nFor image models, we can request with the raw image files instead of constructing array data.\n\nNow start serving the image model like [deep_image_model](https:\u002F\u002Fgithub.com\u002Ftobegit3hub\u002Fdeep_image_model).\n\n```bash\nsimple_tensorflow_serving --model_base_path=\".\u002Fmodels\u002Fdeep_image_model\u002F\"\n```\n\nThen request with a raw image file that has the same shape as your model's input.\n\n```bash\ncurl -X POST -F 'image=@.\u002Fimages\u002Fmew.jpg' -F \"model_version=1\" 127.0.0.1:8500\n```\n\n## TensorFlow Estimator Model\n\nIf we use the TensorFlow Estimator API to export the model, the model signature should look like this.\n\n```\ninputs {\n  key: \"inputs\"\n  value {\n    name: \"input_example_tensor:0\"\n    dtype: DT_STRING\n    tensor_shape {\n      dim {\n        size: -1\n      }\n    }\n  }\n}\noutputs {\n  key: \"classes\"\n  value {\n    name: \"linear\u002Fbinary_logistic_head\u002F_classification_output_alternatives\u002Fclasses_tensor:0\"\n    dtype: DT_STRING\n    tensor_shape {\n      dim {\n        size: -1\n      }\n      dim {\n        size: -1\n      }\n    }\n  }\n}\noutputs {\n  key: \"scores\"\n  value {\n    name: \"linear\u002Fbinary_logistic_head\u002Fpredictions\u002Fprobabilities:0\"\n    dtype: DT_FLOAT\n    tensor_shape {\n      dim {\n        size: -1\n      }\n      dim {\n        size: 2\n      }\n    }\n  }\n}\nmethod_name: \"tensorflow\u002Fserving\u002Fclassify\"\n```\n\nWe need to construct the string tensor for inference and use base64 to encode the string for HTTP. 
Here is the example Python code.\n\n```python\nimport base64\n\nimport requests\nimport tensorflow as tf\n\ndef _float_feature(value):\n  return tf.train.Feature(float_list=tf.train.FloatList(value=[value]))\n\ndef _bytes_feature(value):\n  return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))\n\ndef main():\n  # Raw input data\n  feature_dict = {\"a\": _bytes_feature(\"10\"), \"b\": _float_feature(10)}\n\n  # Create Example as base64 string\n  example_proto = tf.train.Example(features=tf.train.Features(feature=feature_dict))\n  tensor_proto = tf.contrib.util.make_tensor_proto(example_proto.SerializeToString(), dtype=tf.string)\n  tensor_string = tensor_proto.string_val.pop()\n  base64_tensor_string = base64.urlsafe_b64encode(tensor_string)\n\n  # Request server\n  endpoint = \"http:\u002F\u002F127.0.0.1:8500\"\n  json_data = {\"model_name\": \"default\", \"base64_decode\": True, \"data\": {\"inputs\": [base64_tensor_string]}}\n  result = requests.post(endpoint, json=json_data)\n  print(result.json())\n```\n\n### Custom Op\n\nIf your models rely on a new TensorFlow [custom op](https:\u002F\u002Fwww.tensorflow.org\u002Fextend\u002Fadding_an_op), you can run the server while loading the .so files.\n\n```bash\nsimple_tensorflow_serving --model_base_path=\".\u002Fmodel\u002F\" --custom_op_paths=\".\u002Ffoo_op\u002F\"\n```\n\nPlease check out the complete example in [.\u002Fexamples\u002Fcustom_op\u002F](.\u002Fexamples\u002Fcustom_op\u002F).\n\n### Authentication\n\nFor enterprises, we can enable basic auth for all the APIs, and any anonymous request will be denied.\n\nNow start the server with the configured username and password.\n\n```bash\n.\u002Fserver.py --model_base_path=\".\u002Fmodels\u002Ftensorflow_template_application_model\u002F\" --enable_auth=True --auth_username=\"admin\" --auth_password=\"admin\"\n```\n\nIf you are using the Web dashboard, just enter your credentials. 
If you are using clients, give the username and password within the request.\n\n```bash\ncurl -u admin:admin -H \"Content-Type: application\u002Fjson\" -X POST -d '{\"data\": {\"keys\": [[11.0], [2.0]], \"features\": [[1, 1, 1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1, 1, 1]]}}' http:\u002F\u002F127.0.0.1:8500\n```\n\n```python\nendpoint = \"http:\u002F\u002F127.0.0.1:8500\"\ninput_data = {\n  \"data\": {\n      \"keys\": [[11.0], [2.0]],\n      \"features\": [[1, 1, 1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1, 1, 1]]\n  }\n}\nauth = requests.auth.HTTPBasicAuth(\"admin\", \"admin\")\nresult = requests.post(endpoint, json=input_data, auth=auth)\n```\n\n### TLS\u002FSSL\n\nIt supports TLS\u002FSSL, and you can generate the self-signed secret files for testing.\n\n```bash\nopenssl req -x509 -newkey rsa:4096 -nodes -out \u002Ftmp\u002Fsecret.pem -keyout \u002Ftmp\u002Fsecret.key -days 365\n```\n\nThen run the server with the certificate files.\n\n```bash\nsimple_tensorflow_serving --enable_ssl=True --secret_pem=\u002Ftmp\u002Fsecret.pem --secret_key=\u002Ftmp\u002Fsecret.key --model_base_path=\".\u002Fmodels\u002Ftensorflow_template_application_model\"\n```\n\n## Supported Models\n\nFor MXNet models, you can load with commands and configuration like these.\n\n```bash\nsimple_tensorflow_serving --model_base_path=\".\u002Fmodels\u002Fmxnet_mlp\u002Fmx_mlp\" --model_platform=\"mxnet\"\n```\n\n```python\nendpoint = \"http:\u002F\u002F127.0.0.1:8500\"\ninput_data = {\n  \"model_name\": \"default\",\n  \"model_version\": 1,\n  \"data\": {\n      \"data\": [[12.0, 2.0]]\n  }\n}\nresult = requests.post(endpoint, json=input_data)\nprint(result.text)\n```\n\nFor ONNX models, you can load with commands and configuration like these.\n\n```bash\nsimple_tensorflow_serving --model_base_path=\".\u002Fmodels\u002Fonnx_mnist_model\u002Fonnx_model.proto\" --model_platform=\"onnx\"\n```\n\n```python\nendpoint = \"http:\u002F\u002F127.0.0.1:8500\"\ninput_data = {\n  \"model_name\": \"default\",\n  \"model_version\": 1,\n  \"data\": {\n      \"data\": [[...]]\n  }\n}\nresult = requests.post(endpoint, json=input_data)\nprint(result.text)\n```\n\nFor H2o models, you can load with commands and configuration like these.\n\n```bash\n# Start H2o server with \"java -jar h2o.jar\"\n\nsimple_tensorflow_serving --model_base_path=\".\u002Fmodels\u002Fh2o_prostate_model\u002FGLM_model_python_1525255083960_17\" --model_platform=\"h2o\"\n```\n\n```python\nendpoint = \"http:\u002F\u002F127.0.0.1:8500\"\ninput_data = {\n  \"model_name\": \"default\",\n  \"model_version\": 1,\n  \"data\": {\n      \"data\": [[...]]\n  }\n}\nresult = requests.post(endpoint, json=input_data)\nprint(result.text)\n```\n\nFor Scikit-learn models, you can load with commands and configuration like these.\n\n```bash\nsimple_tensorflow_serving --model_base_path=\".\u002Fmodels\u002Fscikitlearn_iris\u002Fmodel.joblib\" --model_platform=\"scikitlearn\"\n\nsimple_tensorflow_serving --model_base_path=\".\u002Fmodels\u002Fscikitlearn_iris\u002Fmodel.pkl\" --model_platform=\"scikitlearn\"\n```\n\n```python\nendpoint = \"http:\u002F\u002F127.0.0.1:8500\"\ninput_data = {\n  \"model_name\": \"default\",\n  \"model_version\": 1,\n  \"data\": {\n      \"data\": [[...]]\n  }\n}\nresult = requests.post(endpoint, json=input_data)\nprint(result.text)\n```\n\nFor XGBoost models, you can load with commands and configuration like these.\n\n```bash\nsimple_tensorflow_serving --model_base_path=\".\u002Fmodels\u002Fxgboost_iris\u002Fmodel.bst\" 
--model_platform=\"xgboost\"\n\nsimple_tensorflow_serving --model_base_path=\".\u002Fmodels\u002Fxgboost_iris\u002Fmodel.joblib\" --model_platform=\"xgboost\"\n\nsimple_tensorflow_serving --model_base_path=\".\u002Fmodels\u002Fxgboost_iris\u002Fmodel.pkl\" --model_platform=\"xgboost\"\n```\n\n```python\nendpoint = \"http:\u002F\u002F127.0.0.1:8500\"\ninput_data = {\n  \"model_name\": \"default\",\n  \"model_version\": 1,\n  \"data\": {\n      \"data\": [[...]]\n  }\n}\nresult = requests.post(endpoint, json=input_data)\nprint(result.text)\n```\n\nFor PMML models, you can load with commands and configuration like these. This relies on [Openscoring](https:\u002F\u002Fgithub.com\u002Fopenscoring\u002Fopenscoring) and [Openscoring-Python](https:\u002F\u002Fgithub.com\u002Fopenscoring\u002Fopenscoring-python) to load the models.\n\n```bash\njava -jar .\u002Fthird_party\u002Fopenscoring\u002Fopenscoring-server-executable-1.4-SNAPSHOT.jar\n\nsimple_tensorflow_serving --model_base_path=\".\u002Fmodels\u002Fpmml_iris\u002FDecisionTreeIris.pmml\" --model_platform=\"pmml\"\n```\n\n```python\nendpoint = \"http:\u002F\u002F127.0.0.1:8500\"\ninput_data = {\n  \"model_name\": \"default\",\n  \"model_version\": 1,\n  \"data\": {\n      \"data\": [[...]]\n  }\n}\nresult = requests.post(endpoint, json=input_data)\nprint(result.text)\n```\n\n\n## Supported Client\n\nHere is the example client in [Bash](.\u002Fbash_client\u002F).\n\n```bash\ncurl -H \"Content-Type: application\u002Fjson\" -X POST -d '{\"data\": {\"keys\": [[1.0], [2.0]], \"features\": [[10, 10, 10, 8, 6, 1, 8, 9, 1], [6, 2, 1, 1, 1, 1, 7, 1, 1]]}}' http:\u002F\u002F127.0.0.1:8500\n```\n\nHere is the example client in [Python](.\u002Fpython_client\u002F).\n\n```python\nendpoint = \"http:\u002F\u002F127.0.0.1:8500\"\npayload = {\"data\": {\"keys\": [[11.0], [2.0]], \"features\": [[1, 1, 1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1, 1, 1]]}}\n\nresult = requests.post(endpoint, json=payload)\n```\n\nHere is the example client in [C++](.\u002Fcpp_client\u002F).\n\nHere is the example client in [Java](.\u002Fjava_client\u002F).\n\nHere is the example client in [Scala](.\u002Fscala_client\u002F).\n\nHere is the example client in [Go](.\u002Fgo_client\u002F).\n\n```go\nendpoint := \"http:\u002F\u002F127.0.0.1:8500\"\ndataByte := []byte(`{\"data\": {\"keys\": [[11.0], [2.0]], \"features\": [[1, 1, 1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1, 1, 1]]}}`)\nvar dataInterface map[string]interface{}\njson.Unmarshal(dataByte, &dataInterface)\ndataJson, _ := json.Marshal(dataInterface)\n\nresp, err := http.Post(endpoint, \"application\u002Fjson\", bytes.NewBuffer(dataJson))\n```\n\nHere is the example client in [Ruby](.\u002Fruby_client\u002F).\n\n```ruby\nendpoint = \"http:\u002F\u002F127.0.0.1:8500\"\nuri = URI.parse(endpoint)\nheader = {\"Content-Type\" => \"application\u002Fjson\"}\ninput_data = {\"data\" => {\"keys\"=> [[11.0], [2.0]], \"features\"=> [[1, 1, 1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1, 1, 1]]}}\nhttp = Net::HTTP.new(uri.host, uri.port)\nrequest = Net::HTTP::Post.new(uri.request_uri, header)\nrequest.body = input_data.to_json\n\nresponse = http.request(request)\n```\n\nHere is the example client in [JavaScript](.\u002Fjavascript_client\u002F).\n\n```javascript\nvar options = {\n    uri: \"http:\u002F\u002F127.0.0.1:8500\",\n    method: \"POST\",\n    json: {\"data\": {\"keys\": [[11.0], [2.0]], \"features\": [[1, 1, 1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1, 1, 1]]}}\n};\n\nrequest(options, function (error, response, body) {});\n```\n\nHere 
is the example client in [PHP](.\u002Fphp_client\u002F).\n\n```php\n$endpoint = \"127.0.0.1:8500\";\n$inputData = array(\n    \"keys\" => [[11.0], [2.0]],\n    \"features\" => [[1, 1, 1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1, 1, 1]],\n);\n$jsonData = array(\n    \"data\" => $inputData,\n);\n$ch = curl_init($endpoint);\ncurl_setopt_array($ch, array(\n    CURLOPT_POST => TRUE,\n    CURLOPT_RETURNTRANSFER => TRUE,\n    CURLOPT_HTTPHEADER => array(\n        \"Content-Type: application\u002Fjson\"\n    ),\n    CURLOPT_POSTFIELDS => json_encode($jsonData)\n));\n\n$response = curl_exec($ch);\n```\n\nHere is the example client in [Erlang](.\u002Ferlang_client\u002F).\n\n```erlang\nssl:start(),\napplication:start(inets),\nhttpc:request(post,\n  {\"http:\u002F\u002F127.0.0.1:8500\", [],\n  \"application\u002Fjson\",\n  \"{\\\"data\\\": {\\\"keys\\\": [[11.0], [2.0]], \\\"features\\\": [[1, 1, 1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1, 1, 1]]}}\"\n  }, [], []).\n```\n\nHere is the example client in [Lua](.\u002Flua_client\u002F).\n\n```lua\nlocal endpoint = \"http:\u002F\u002F127.0.0.1:8500\"\nkeys_array = {}\nkeys_array[1] = {1.0}\nkeys_array[2] = {2.0}\nfeatures_array = {}\nfeatures_array[1] = {1, 1, 1, 1, 1, 1, 1, 1, 1}\nfeatures_array[2] = {1, 1, 1, 1, 1, 1, 1, 1, 1}\nlocal input_data = {\n    [\"keys\"] = keys_array,\n    [\"features\"] = features_array,\n}\nlocal json_data = {\n    [\"data\"] = input_data\n}\nrequest_body = json:encode (json_data)\nlocal response_body = {}\n\nlocal res, code, response_headers = http.request{\n    url = endpoint,\n    method = \"POST\", \n    headers = \n      {\n          [\"Content-Type\"] = \"application\u002Fjson\";\n          [\"Content-Length\"] = #request_body;\n      },\n      source = ltn12.source.string(request_body),\n      sink = ltn12.sink.table(response_body),\n}\n```\n\nHere is the example client in [Rust](.\u002Fswift_client\u002F).\n\nHere is the example client in [Swift](.\u002Fswift_client\u002F).\n\nHere is the example client in [Perl](.\u002Fperl_client\u002F).\n\n```perl\nmy $endpoint = \"http:\u002F\u002F127.0.0.1:8500\";\nmy $json = '{\"data\": {\"keys\": [[11.0], [2.0]], \"features\": [[1, 1, 1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1, 1, 1]]}}';\nmy $req = HTTP::Request->new( 'POST', $endpoint );\n$req->header( 'Content-Type' => 'application\u002Fjson' );\n$req->content( $json );\n$ua = LWP::UserAgent->new;\n\n$response = $ua->request($req);\n```\n\nHere is the example client in [Lisp](.\u002Fswift_client\u002F).\n\nHere is the example client in [Haskell](.\u002Fswift_client\u002F).\n\nHere is the example client in [Clojure](.\u002Fclojure_client\u002F).\n\nHere is the example client in [R](.\u002Fr_client\u002F).\n\n```r\nendpoint \u003C- \"http:\u002F\u002F127.0.0.1:8500\"\nbody \u003C- list(data = list(a = 1), keys = 1)\njson_data \u003C- list(\n  data = list(\n    keys = list(list(1.0), list(2.0)), features = list(list(1, 1, 1, 1, 1, 1, 1, 1, 1), list(1, 1, 1, 1, 1, 1, 1, 1, 1))\n  )\n)\n\nr \u003C- POST(endpoint, body = json_data, encode = \"json\")\nstop_for_status(r)\ncontent(r, \"parsed\", \"text\u002Fhtml\")\n```\n\nHere is the example with Postman.\n\n![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Ftobegit3hub_simple_tensorflow_serving_readme_deddffc393cd.png)\n\n\n## Performance\n\nYou can run SimpleTensorFlowServing with any WSGI server for better performance. We have benchmarked and compare with `TensorFlow Serving`. 
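For a quick client-side sanity check, a short Python probe like the following can measure end-to-end latency against a running server (an illustrative sketch, not the project's benchmark harness).\n\n```python\n# Illustrative latency probe; assumes a server is already listening on port 8500.\nimport time\n\nimport requests\n\nendpoint = \"http:\u002F\u002F127.0.0.1:8500\"\npayload = {\"data\": {\"keys\": [[11.0], [2.0]], \"features\": [[1, 1, 1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1, 1, 1]]}}\n\nn = 1000\nstart = time.time()\nfor _ in range(n):\n  requests.post(endpoint, json=payload)\nprint(\"mean latency: %.3f ms\" % ((time.time() - start) \u002F n * 1000))\n```\n\n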
Find more details in [benchmark](.\u002Fbenchmark\u002F).\n\nSTFS(Simple TensorFlow Serving) and TFS(TensorFlow Serving) have similar performance across different models. The vertical axis is inference latency (microseconds); lower is better.\n\n![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Ftobegit3hub_simple_tensorflow_serving_readme_9f9cbf9f3033.jpeg)\n\nThen we test with `ab` using concurrent clients on CPU and GPU. `TensorFlow Serving` performs better, especially with GPUs.\n\n![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Ftobegit3hub_simple_tensorflow_serving_readme_820e6d10644d.jpeg)\n\nFor the [simplest model](.\u002Fbenchmark\u002Fsimplest_model\u002F), each request costs only ~1.9 microseconds, and one instance of Simple TensorFlow Serving can achieve 5000+ QPS. With a larger batch size, it can run inference on more than 1M instances per second.\n\n![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Ftobegit3hub_simple_tensorflow_serving_readme_b6af3ad7d332.jpeg)\n\n## How It Works\n\n1. `simple_tensorflow_serving` starts the HTTP server as a `flask` application.\n2. It loads the TensorFlow models with the `tf.saved_model.loader` Python API.\n3. It constructs the feed_dict data from the JSON body of the request.\n   ```\n   \u002F\u002F Method: POST, Content-Type: application\u002Fjson\n   {\n     \"model_version\": 1, \u002F\u002F Optional\n     \"data\": {\n       \"keys\": [[1], [2]],\n       \"features\": [[1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0], [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]]\n     }\n   }\n   ```\n4. It uses the TensorFlow Python API to call `sess.run()` with the feed_dict data.\n5. To support multiple versions, it starts an independent thread to load models.\n6. For generated clients, it reads the user's model and renders code with [Jinja](http:\u002F\u002Fjinja.pocoo.org\u002F) templates.\n\n![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Ftobegit3hub_simple_tensorflow_serving_readme_30985215dee0.jpeg)\n\n
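The sketch below condenses steps 1-4 into one minimal Flask handler. It is illustrative only, not the project's actual source; the model path `.\u002Fmodels\u002Fexample\u002F1` and the `serving_default` signature name are placeholders.\n\n```python\n# Illustrative sketch of the serving flow (TF 1.x APIs); paths are placeholders.\nimport tensorflow as tf\nfrom flask import Flask, jsonify, request\n\napp = Flask(__name__)\n\n# Step 2: load the SavedModel once at startup.\nsess = tf.Session(graph=tf.Graph())\nwith sess.graph.as_default():\n  meta_graph = tf.saved_model.loader.load(\n      sess, [tf.saved_model.tag_constants.SERVING], \".\u002Fmodels\u002Fexample\u002F1\")\nsignature = meta_graph.signature_def[\"serving_default\"]\n\n# Step 1: the HTTP server is a flask application.\n@app.route(\"\u002F\", methods=[\"POST\"])\ndef inference():\n  # Step 3: construct the feed_dict from the JSON body of the request.\n  data = request.get_json()[\"data\"]\n  feed_dict = {signature.inputs[k].name: v for k, v in data.items()}\n  fetches = {k: out.name for k, out in signature.outputs.items()}\n  # Step 4: run the session and return JSON-serializable results.\n  results = sess.run(fetches, feed_dict=feed_dict)\n  return jsonify({k: v.tolist() for k, v in results.items()})\n```\n\n## Contribution\n\nFeel free to open an issue or send a pull request for this project. 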
You are warmly welcome to add clients in more languages to access TensorFlow models.\n","# 简单的 TensorFlow Serving\n\n![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Ftobegit3hub_simple_tensorflow_serving_readme_22f9d43f040e.jpeg)\n\n## 简介\n\n简单的 TensorFlow Serving 是一种通用且易于使用的机器学习模型推理服务。更多信息请参阅 \u003Chttps:\u002F\u002Fstfs.readthedocs.io>。\n\n* [x] 支持分布式 TensorFlow 模型\n* [x] 支持通用的 RESTful\u002FHTTP API\n* [x] 支持使用 GPU 加速进行推理\n* [x] 支持 `curl` 及其他命令行工具\n* [x] 支持任何编程语言的客户端\n* [x] 支持通过模型自动生成客户端代码，无需手动编写\n* [x] 支持对图像模型使用原始文件进行推理\n* [x] 支持为详细请求提供统计指标\n* [x] 支持同时服务多个模型\n* [x] 支持模型版本的动态上线与下线\n* [x] 支持加载 TensorFlow 模型的新自定义算子\n* [x] 支持可配置的基本认证安全机制\n* [x] 支持 TensorFlow\u002FMXNet\u002FPyTorch\u002FCaffe2\u002FCNTK\u002FONNX\u002FH2o\u002FScikit-learn\u002FXGBoost\u002FPMML\u002FSpark MLlib 等多种框架的模型\n\n## 安装\n\n使用 [pip](https:\u002F\u002Fpypi.python.org\u002Fpypi\u002Fsimple-tensorflow-serving) 安装服务器。\n\n```bash\npip install simple_tensorflow_serving\n```\n\n或者从 [源代码](https:\u002F\u002Fgithub.com\u002Ftobegit3hub\u002Fsimple_tensorflow_serving) 安装。\n\n```bash\npython .\u002Fsetup.py install\n\npython .\u002Fsetup.py develop\n\nbazel build simple_tensorflow_serving:server\n```\n\n也可以使用 [Docker 镜像](https:\u002F\u002Fhub.docker.com\u002Fr\u002Ftobegit3hub\u002Fsimple_tensorflow_serving\u002F)。\n\n```bash\ndocker run -d -p 8500:8500 tobegit3hub\u002Fsimple_tensorflow_serving\n\ndocker run -d -p 8500:8500 tobegit3hub\u002Fsimple_tensorflow_serving:latest-gpu\n\ndocker run -d -p 8500:8500 tobegit3hub\u002Fsimple_tensorflow_serving:latest-hdfs\n\ndocker run -d -p 8500:8500 tobegit3hub\u002Fsimple_tensorflow_serving:latest-py34\n```\n\n```bash\ndocker-compose up -d\n```\n\n或者在 [Kubernetes](https:\u002F\u002Fkubernetes.io\u002F) 中部署。\n\n```bash\nkubectl create -f .\u002Fsimple_tensorflow_serving.yaml\n```\n\n## 快速入门\n\n使用 TensorFlow 的 [SavedModel](https:\u002F\u002Fwww.tensorflow.org\u002Fprogrammers_guide\u002Fsaved_model) 启动服务器。\n\n```bash\nsimple_tensorflow_serving --model_base_path=\".\u002Fmodels\u002Ftensorflow_template_application_model\"\n```\n\n在浏览器中访问 [http:\u002F\u002F127.0.0.1:8500](http:\u002F\u002F127.0.0.1:8500)，查看仪表板。\n\n![dashboard](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Ftobegit3hub_simple_tensorflow_serving_readme_df26f4f58c36.png)\n\n生成 Python 客户端，无需编写代码即可使用测试数据访问模型。\n\n```bash\ncurl http:\u002F\u002Flocalhost:8500\u002Fv1\u002Fmodels\u002Fdefault\u002Fgen_client?language=python > client.py\n```\n\n```bash\npython .\u002Fclient.py\n```\n\n## 高级用法\n\n### 多个模型\n\n该服务支持同时服务多个模型及其不同版本。您可以使用以下配置运行服务器。\n\n```json\n{\n  \"model_config_list\": [\n    {\n      \"name\": \"tensorflow_template_application_model\",\n      \"base_path\": \".\u002Fmodels\u002Ftensorflow_template_application_model\u002F\",\n      \"platform\": \"tensorflow\"\n    }, {\n      \"name\": \"deep_image_model\",\n      \"base_path\": \".\u002Fmodels\u002Fdeep_image_model\u002F\",\n      \"platform\": \"tensorflow\"\n    }, {\n       \"name\": \"mxnet_mlp_model\",\n       \"base_path\": \".\u002Fmodels\u002Fmxnet_mlp\u002Fmx_mlp\",\n       \"platform\": \"mxnet\"\n    }\n  ]\n}\n```\n\n```bash\nsimple_tensorflow_serving --model_config_file=\".\u002Fexamples\u002Fmodel_config_file.json\"\n```\n\n添加或移除模型版本时，系统会自动检测并重新加载最新文件到内存中。您可以轻松选择指定的模型和版本进行推理。\n\n```python\nendpoint = \"http:\u002F\u002F127.0.0.1:8500\"\ninput_data = {\n  \"model_name\": \"default\",\n  \"model_version\": 1,\n  \"data\": {\n      \"keys\": [[11.0], [2.0]],\n      \"features\": [[1, 1, 1, 1, 1, 1, 1, 1, 1],\n                   [1, 1, 1, 
1, 1, 1, 1, 1, 1]]\n  }\n}\nresult = requests.post(endpoint, json=input_data)\n```\n\n### GPU 加速\n\n如果您希望使用 GPU，请尝试带有 GPU 标签的 Docker 镜像，并将 CUDA 文件放置在 `\u002Fusr\u002Fcuda_files\u002F` 目录下。\n\n```bash\nexport CUDA_SO=\"-v \u002Fusr\u002Fcuda_files\u002F:\u002Fusr\u002Fcuda_files\u002F\"\nexport DEVICES=$(\\ls \u002Fdev\u002Fnvidia* | xargs -I{} echo '--device {}:{}')\nexport LIBRARY_ENV=\"-e LD_LIBRARY_PATH=\u002Fusr\u002Flocal\u002Fcuda\u002Fextras\u002FCUPTI\u002Flib64:\u002Fusr\u002Flocal\u002Fnvidia\u002Flib:\u002Fusr\u002Flocal\u002Fnvidia\u002Flib64:\u002Fusr\u002Fcuda_files\"\n\ndocker run -it -p 8500:8500 $CUDA_SO $DEVICES $LIBRARY_ENV tobegit3hub\u002Fsimple_tensorflow_serving:latest-gpu\n```\n\n您可以在命令行参数或模型配置文件中设置会话配置和 GPU 选项。\n\n```bash\nsimple_tensorflow_serving --model_base_path=\".\u002Fmodels\u002Ftensorflow_template_application_model\" --session_config='{\"log_device_placement\": true, \"allow_soft_placement\": true, \"allow_growth\": true, \"per_process_gpu_memory_fraction\": 0.5}'\n```\n\n```json\n{\n  \"model_config_list\": [\n    {\n      \"name\": \"default\",\n      \"base_path\": \".\u002Fmodels\u002Ftensorflow_template_application_model\u002F\",\n      \"platform\": \"tensorflow\",\n      \"session_config\": {\n        \"log_device_placement\": true,\n        \"allow_soft_placement\": true,\n        \"allow_growth\": true,\n        \"per_process_gpu_memory_fraction\": 0.5\n      }\n    }\n  ]\n}\n```\n\n### 自动生成的客户端\n\n您可以为在线模型生成测试 JSON 数据。\n\n```bash\ncurl http:\u002F\u002Flocalhost:8500\u002Fv1\u002Fmodels\u002Fdefault\u002Fgen_json\n```\n\n或者为您的模型生成不同语言（Bash、Python、Golang、JavaScript 等）的客户端，而无需编写任何代码。\n\n```bash\ncurl http:\u002F\u002Flocalhost:8500\u002Fv1\u002Fmodels\u002Fdefault\u002Fgen_client?language=python > client.py\ncurl http:\u002F\u002Flocalhost:8500\u002Fv1\u002Fmodels\u002Fdefault\u002Fgen_client?language=bash > client.sh\ncurl http:\u002F\u002Flocalhost:8500\u002Fv1\u002Fmodels\u002Fdefault\u002Fgen_client?language=golang > client.go\ncurl http:\u002F\u002Flocalhost:8500\u002Fv1\u002Fmodels\u002Fdefault\u002Fgen_client?language=javascript > client.js\n```\n\n生成的代码应如下所示，可以直接测试：\n\n```python\n#!\u002Fusr\u002Fbin\u002Fenv python\n\nimport requests\n\ndef main():\n  endpoint = \"http:\u002F\u002F127.0.0.1:8500\"\n  json_data = {\"model_name\": \"default\", \"data\": {\"keys\": [[1], [1]], \"features\": [[1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0], [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]]} }\n  result = requests.post(endpoint, json=json_data)\n  print(result.text)\n\nif __name__ == \"__main__\":\n  main()\n```\n\n```python\n#!\u002Fusr\u002Fbin\u002Fenv python\n\nimport requests\n\ndef main():\n  endpoint = \"http:\u002F\u002F127.0.0.1:8500\"\n\n  input_data = {\"keys\": [[1.0], [1.0]], \"features\": [[1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0], [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]]}\n  result = requests.post(endpoint, json=input_data)\n  print(result.text)\n\nif __name__ == \"__main__\":\n  main()\n```\n\n### 图像模型\n\n对于图像模型，我们可以直接使用原始图像文件进行请求，而无需构造数组数据。\n\n现在开始服务类似于 [deep_image_model](https:\u002F\u002Fgithub.com\u002Ftobegit3hub\u002Fdeep_image_model) 的图像模型。\n\n```bash\nsimple_tensorflow_serving --model_base_path=\".\u002Fmodels\u002Fdeep_image_model\u002F\"\n```\n\n然后使用与您的模型形状相同的原始图像文件进行请求。\n\n```bash\ncurl -X POST -F 'image=@.\u002Fimages\u002Fmew.jpg' -F \"model_version=1\" 127.0.0.1:8500\n```\n\n## TensorFlow Estimator 模型\n\n如果我们使用 TensorFlow Estimator API 导出模型，模型签名应如下所示。\n\n```\ninputs {\n  key: \"inputs\"\n  value {\n    name: 
\"input_example_tensor:0\"\n    dtype: DT_STRING\n    tensor_shape {\n      dim {\n        size: -1\n      }\n    }\n  }\n}\noutputs {\n  key: \"classes\"\n  value {\n    name: \"linear\u002Fbinary_logistic_head\u002F_classification_output_alternatives\u002Fclasses_tensor:0\"\n    dtype: DT_STRING\n    tensor_shape {\n      dim {\n        size: -1\n      }\n      dim {\n        size: -1\n      }\n    }\n  }\n}\noutputs {\n  key: \"scores\"\n  value {\n    name: \"linear\u002Fbinary_logistic_head\u002Fpredictions\u002Fprobabilities:0\"\n    dtype: DT_FLOAT\n    tensor_shape {\n      dim {\n        size: -1\n      }\n      dim {\n        size: 2\n      }\n    }\n  }\n}\nmethod_name: \"tensorflow\u002Fserving\u002Fclassify\"\n```\n\n我们需要构造用于推理的字符串张量，并使用 base64 对字符串进行编码以供 HTTP 使用。以下是示例 Python 代码。\n\n```python\ndef _float_feature(value):\n  return tf.train.Feature(float_list=tf.train.FloatList(value=[value]))\n\ndef _bytes_feature(value):\n  return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))\n\ndef main():\n  # 原始输入数据\n  feature_dict = {\"a\": _bytes_feature(\"10\"), \"b\": _float_feature(10)}\n\n  # 将 Example 创建为 base64 字符串\n  example_proto = tf.train.Example(features=tf.train.Features(feature=feature_dict))\n  tensor_proto = tf.contrib.util.make_tensor_proto(example_proto.SerializeToString(), dtype=tf.string)\n  tensor_string = tensor_proto.string_val.pop()\n  base64_tensor_string = base64.urlsafe_b64encode(tensor_string)\n\n  # 向服务器发送请求\n  endpoint = \"http:\u002F\u002F127.0.0.1:8500\"\n  json_data = {\"model_name\": \"default\", \"base64_decode\": True, \"data\": {\"inputs\": [base64_tensor_string]}}\n  result = requests.post(endpoint, json=json_data)\n  print(result.json())\n```\n\n### 自定义 Op\n\n如果您的模型依赖于新的 TensorFlow [自定义 Op](https:\u002F\u002Fwww.tensorflow.org\u002Fextend\u002Fadding_an_op)，您可以在加载 so 文件的同时运行服务器。\n\n```bash\nsimple_tensorflow_serving --model_base_path=\".\u002Fmodel\u002F\" --custom_op_paths=\".\u002Ffoo_op\u002F\"\n```\n\n请查看 [.\u002Fexamples\u002Fcustom_op\u002F](.\u002Fexamples\u002Fcustom_op\u002F) 中的完整示例。\n\n### 身份验证\n\n对于企业用户，我们可以为所有 API 启用基本身份验证，并拒绝任何匿名请求。\n\n现在使用配置的用户名和密码启动服务器。\n\n```bash\n.\u002Fserver.py --model_base_path=\".\u002Fmodels\u002Ftensorflow_template_application_model\u002F\" --enable_auth=True --auth_username=\"admin\" --auth_password=\"admin\"\n```\n\n如果您使用 Web 仪表板，只需输入您的凭据即可。如果您使用客户端，则需要在请求中提供用户名和密码。\n\n```bash\ncurl -u admin:admin -H \"Content-Type: application\u002Fjson\" -X POST -d '{\"data\": {\"keys\": [[11.0], [2.0]], \"features\": [[1, 1, 1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1, 1, 1]]}}' http:\u002F\u002F127.0.0.1:8500\n```\n\n```python\nendpoint = \"http:\u002F\u002F127.0.0.1:8500\"\ninput_data = {\n  \"data\": {\n      \"keys\": [[11.0], [2.0]],\n      \"features\": [[1, 1, 1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1, 1, 1]]\n  }\n}\nauth = requests.auth.HTTPBasicAuth(\"admin\", \"admin\")\nresult = requests.post(endpoint, json=input_data, auth=auth)\n```\n\n### TSL\u002FSSL\n\n它支持 TSL\u002FSSL，您可以生成自签名的密钥文件用于测试。\n\n```bash\nopenssl req -x509 -newkey rsa:4096 -nodes -out \u002Ftmp\u002Fsecret.pem -keyout \u002Ftmp\u002Fsecret.key -days 365\n```\n\n然后使用证书文件运行服务器。\n\n```bash\nsimple_tensorflow_serving --enable_ssl=True --secret_pem=\u002Ftmp\u002Fsecret.pem --secret_key=\u002Ftmp\u002Fsecret.key --model_base_path=\".\u002Fmodels\u002Ftensorflow_template_application_model\"\n```\n\n## 支持的模型\n\n对于 MXNet 模型，您可以使用以下命令和配置进行加载。\n\n```bash\nsimple_tensorflow_serving 
--model_base_path=\".\u002Fmodels\u002Fmxnet_mlp\u002Fmx_mlp\" --model_platform=\"mxnet\"\n```\n\n```python\nendpoint = \"http:\u002F\u002F127.0.0.1:8500\"\ninput_data = {\n  \"model_name\": \"default\",\n  \"model_version\": 1,\n  \"data\": {\n      \"data\": [[12.0, 2.0]]\n  }\n}\nresult = requests.post(endpoint, json=input_data)\nprint(result.text)\n```\n\n对于 ONNX 模型，您可以使用以下命令和配置进行加载。\n\n```bash\nsimple_tensorflow_serving --model_base_path=\".\u002Fmodels\u002Fonnx_mnist_model\u002Fonnx_model.proto\" --model_platform=\"onnx\"\n```\n\n```python\nendpoint = \"http:\u002F\u002F127.0.0.1:8500\"\ninput_data = {\n  \"model_name\": \"default\",\n  \"model_version\": 1,\n  \"data\": {\n      \"data\": [[...]]\n  }\n}\nresult = requests.post(endpoint, json=input_data)\nprint(result.text)\n```\n\n对于 H2o 模型，您可以使用以下命令和配置进行加载。\n\n```bash\n# 使用 \"java -jar h2o.jar\" 启动 H2o 服务器\n\nsimple_tensorflow_serving --model_base_path=\".\u002Fmodels\u002Fh2o_prostate_model\u002FGLM_model_python_1525255083960_17\" --model_platform=\"h2o\"\n```\n\n```python\nendpoint = \"http:\u002F\u002F127.0.0.1:8500\"\ninput_data = {\n  \"model_name\": \"default\",\n  \"model_version\": 1,\n  \"data\": {\n      \"data\": [[...]]\n  }\n}\nresult = requests.post(endpoint, json=input_data)\nprint(result.text)\n```\n\n对于 Scikit-learn 模型，您可以使用以下命令和配置进行加载。\n\n```bash\nsimple_tensorflow_serving --model_base_path=\".\u002Fmodels\u002Fscikitlearn_iris\u002Fmodel.joblib\" --model_platform=\"scikitlearn\"\n\nsimple_tensorflow_serving --model_base_path=\".\u002Fmodels\u002Fscikitlearn_iris\u002Fmodel.pkl\" --model_platform=\"scikitlearn\"\n```\n\n```python\nendpoint = \"http:\u002F\u002F127.0.0.1:8500\"\ninput_data = {\n  \"model_name\": \"default\",\n  \"model_version\": 1,\n  \"data\": {\n      \"data\": [[...]]\n  }\n}\nresult = requests.post(endpoint, json=input_data)\nprint(result.text)\n```\n\n对于 XGBoost 模型，您可以使用以下命令和配置进行加载。\n\n```bash\nsimple_tensorflow_serving --model_base_path=\".\u002Fmodels\u002Fxgboost_iris\u002Fmodel.bst\" --model_platform=\"xgboost\"\n\nsimple_tensorflow_serving --model_base_path=\".\u002Fmodels\u002Fxgboost_iris\u002Fmodel.joblib\" --model_platform=\"xgboost\"\n\nsimple_tensorflow_serving --model_base_path=\".\u002Fmodels\u002Fxgboost_iris\u002Fmodel.pkl\" --model_platform=\"xgboost\"\n```\n\n```python\nendpoint = \"http:\u002F\u002F127.0.0.1:8500\"\ninput_data = {\n  \"model_name\": \"default\",\n  \"model版本\": 1,\n  \"data\": {\n      \"data\": [[...]]\n  }\n}\nresult = requests.post(endpoint, json=input_data)\nprint(result.text)\n```\n\n对于 PMML 模型，您可以使用以下命令和配置进行加载。这依赖于 [Openscoring](https:\u002F\u002Fgithub.com\u002Fopenscoring\u002Fopenscoring) 和 [Openscoring-Python](https:\u002F\u002Fgithub.com\u002Fopenscoring\u002Fopenscoring-python) 来加载模型。\n\n```bash\njava -jar .\u002Fthird_party\u002Fopenscoring\u002Fopenscoring-server-executable-1.4-SNAPSHOT.jar\n\nsimple_tensorflow_serving --model_base_path=\".\u002Fmodels\u002Fpmml_iris\u002FDecisionTreeIris.pmml\" --model_platform=\"pmml\"\n```\n\n```python\nendpoint = \"http:\u002F\u002F127.0.0.1:8500\"\ninput_data = {\n  \"model_name\": \"default\",\n  \"model版本\": 1,\n  \"data\": {\n      \"data\": [[...]]\n  }\n}\nresult = requests.post(endpoint, json=input_data)\nprint(result.text)\n```\n\n## 支持的客户端\n\n以下是 [Bash](.\u002Fbash_client\u002F) 中的示例客户端。\n\n```bash\ncurl -H \"Content-Type: application\u002Fjson\" -X POST -d '{\"data\": {\"keys\": [[1.0], [2.0]], \"features\": [[10, 10, 10, 8, 6, 1, 8, 9, 1], [6, 2, 1, 1, 1, 1, 7, 1, 1]]}}' 
http:\u002F\u002F127.0.0.1:8500\n```\n\n以下是 [Python](.\u002Fpython_client\u002F) 中的示例客户端。\n\n```python\nendpoint = \"http:\u002F\u002F127.0.0.1:8500\"\npayload = {\"data\": {\"keys\": [[11.0], [2.0]], \"features\": [[1, 1, 1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1, 1, 1]]}}\n\nresult = requests.post(endpoint, json=payload)\n```\n\n以下是 [C++](.\u002Fcpp_client\u002F) 中的示例客户端。\n\n以下是 [Java](.\u002Fjava_client\u002F) 中的示例客户端。\n\n以下是 [Scala](.\u002Fscala_client\u002F) 中的示例客户端。\n\n以下是 [Go](.\u002Fgo_client\u002F) 中的示例客户端。\n\n```go\nendpoint := \"http:\u002F\u002F127.0.0.1:8500\"\ndataByte := []byte(`{\"data\": {\"keys\": [[11.0], [2.0]], \"features\": [[1, 1, 1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1, 1, 1]]}}`)\nvar dataInterface map[string]interface{}\njson.Unmarshal(dataByte, &dataInterface)\ndataJson, _ := json.Marshal(dataInterface)\n\nresp, err := http.Post(endpoint, \"application\u002Fjson\", bytes.NewBuffer(dataJson))\n```\n\n以下是 [Ruby](.\u002Fruby_client\u002F) 中的示例客户端。\n\n```ruby\nendpoint = \"http:\u002F\u002F127.0.0.1:8500\"\nuri = URI.parse(endpoint)\nheader = {\"Content-Type\" => \"application\u002Fjson\"}\ninput_data = {\"data\" => {\"keys\"=> [[11.0], [2.0]], \"features\"=> [[1, 1, 1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1, 1, 1]]}}\nhttp = Net::HTTP.new(uri.host, uri.port)\nrequest = Net::HTTP::Post.new(uri.request_uri, header)\nrequest.body = input_data.to_json\n\nresponse = http.request(request)\n```\n\n以下是 [JavaScript](.\u002Fjavascript_client\u002F) 中的示例客户端。\n\n```javascript\nvar options = {\n    uri: \"http:\u002F\u002F127.0.0.1:8500\",\n    method: \"POST\",\n    json: {\"data\": {\"keys\": [[11.0], [2.0]], \"features\": [[1, 1, 1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1, 1, 1]]}}\n};\n\nrequest(options, function (error, response, body) {});\n```\n\n以下是 [PHP](.\u002Fphp_client\u002F) 中的示例客户端。\n\n```php\n$endpoint = \"127.0.0.1:8500\";\n$inputData = array(\n    \"keys\" => [[11.0], [2.0]],\n    \"features\" => [[1, 1, 1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1, 1, 1]],\n);\n$jsonData = array(\n    \"data\" => $inputData,\n);\n$ch = curl_init($endpoint);\ncurl_setopt_array($ch, array(\n    CURLOPT_POST => TRUE,\n    CURLOPT_RETURNTRANSFER => TRUE,\n    CURLOPT_HTTPHEADER => array(\n        \"Content-Type: application\u002Fjson\"\n    ),\n    CURLOPT_POSTFIELDS => json_encode($jsonData)\n));\n\n$response = curl_exec($ch);\n```\n\n以下是 [Erlang](.\u002Ferlang_client\u002F) 中的示例客户端。\n\n```erlang\nssl:start(),\napplication:start(inets),\nhttpc:request(post,\n  {\"http:\u002F\u002F127.0.0.1:8500\", [],\n  \"application\u002Fjson\",\n  \"{\\\"data\\\": {\\\"keys\\\": [[11.0], [2.0]], \\\"features\\\": [[1, 1, 1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1, 1, 1]]}}\"\n  }, [], []).\n```\n\n以下是 [Lua](.\u002Flua_client\u002F) 中的示例客户端。\n\n```lua\nlocal endpoint = \"http:\u002F\u002F127.0.0.1:8500\"\nkeys_array = {}\nkeys_array[1] = {1.0}\nkeys_array[2] = {2.0}\nfeatures_array = {}\nfeatures_array[1] = {1, 1, 1, 1, 1, 1, 1, 1, 1}\nfeatures_array[2] = {1, 1, 1, 1, 1, 1, 1, 1, 1}\nlocal input_data = {\n    [\"keys\"] = keys_array,\n    [\"features\"] = features_array,\n}\nlocal json_data = {\n    [\"data\"] = input_data\n}\nrequest_body = json:encode (json_data)\nlocal response_body = {}\n\nlocal res, code, response_headers = http.request{\n    url = endpoint,\n    method = \"POST\", \n    headers = \n      {\n          [\"Content-Type\"] = \"application\u002Fjson\";\n          [\"Content-Length\"] = #request_body;\n      },\n      source = ltn12.source.string(request_body),\n      sink = 
ltn12.sink.table(response_body),\n}\n```\n\n以下是 [Rust](.\u002Fswift_client\u002F) 中的示例客户端。\n\n以下是 [Swift](.\u002Fswift_client\u002F) 中的示例客户端。\n\n以下是 [Perl](.\u002Fperl_client\u002F) 中的示例客户端。\n\n```perl\nmy $endpoint = \"http:\u002F\u002F127.0.0.1:8500\";\nmy $json = '{\"data\": {\"keys\": [[11.0], [2.0]], \"features\": [[1, 1, 1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1, 1, 1]]}}';\nmy $req = HTTP::Request->new( 'POST', $endpoint );\n$req->header( 'Content-Type' => 'application\u002Fjson' );\n$req->content( $json );\n$ua = LWP::UserAgent->new;\n\n$response = $ua->request($req);\n```\n\n以下是 [Lisp](.\u002Fswift_client\u002F) 中的示例客户端。\n\n以下是 [Haskell](.\u002Fswift_client\u002F) 中的示例客户端。\n\n以下是 [Clojure](.\u002Fclojure_client\u002F) 中的示例客户端。\n\n以下是 [R](.\u002Fr_client\u002F) 中的示例客户端。\n\n```r\nendpoint \u003C- \"http:\u002F\u002F127.0.0.1:8500\"\nbody \u003C- list(data = list(a = 1), keys = 1)\njson_data \u003C- list(\n  data = list(\n    keys = list(list(1.0), list(2.0)), features = list(list(1, 1, 1, 1, 1, 1, 1, 1, 1), list(1, 1, 1, 1, 1, 1, 1, 1, 1))\n  )\n)\n\nr \u003C- POST(endpoint, body = json_data, encode = \"json\")\nstop_for_status(r)\ncontent(r, \"parsed\", \"text\u002Fhtml\")\n```\n\n以下是 Postman 的示例。\n\n![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Ftobegit3hub_simple_tensorflow_serving_readme_deddffc393cd.png)\n\n\n## 性能\n\n你可以使用任何 WSGI 服务器来运行 SimpleTensorFlowServing，以获得更好的性能。我们已经进行了基准测试，并与 `TensorFlow Serving` 进行了比较。更多详细信息请参见 [benchmark](.\u002Fbenchmark\u002F)。\n\n对于不同的模型，STFS（Simple TensorFlow Serving）和 TFS（TensorFlow Serving）的性能相似。纵坐标是推理延迟（微秒），越低越好。\n\n![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Ftobegit3hub_simple_tensorflow_serving_readme_9f9cbf9f3033.jpeg)\n\n然后我们使用 `ab` 工具在 CPU 和 GPU 上测试并发客户端。`TensorFlow Serving` 在 GPU 上表现得更好。\n\n![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Ftobegit3hub_simple_tensorflow_serving_readme_820e6d10644d.jpeg)\n\n对于 [最简单的模型](.\u002Fbenchmark\u002Fsimplest_model\u002F)，每个请求仅需约 1.9 微秒，而一个 Simple TensorFlow Serving 实例可以达到 5000+ QPS。通过增加批处理大小，每秒可以推理超过 100 万个实例。\n\n![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Ftobegit3hub_simple_tensorflow_serving_readme_b6af3ad7d332.jpeg)\n\n## 工作原理\n\n1. `simple_tensorflow_serving` 使用 `flask` 应用程序启动 HTTP 服务器。\n2. 使用 Python API `tf.saved_model.loader` 加载 TensorFlow 模型。\n3. 根据请求的 JSON 主体构造 feed_dict 数据。\n   ```\n   \u002F\u002F 方法：POST，内容类型：application\u002Fjson\n   {\n     \"model_version\": 1, \u002F\u002F 可选\n     \"data\": {\n       \"keys\": [[1], [2]],\n       \"features\": [[1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0], [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]]\n     }\n   }\n   ```\n4. 使用 TensorFlow Python API 通过 feed_dict 数据执行 `sess.run()`。\n5. 对于支持多个版本的情况，会启动独立线程来加载模型。\n6. 
对于生成的客户端，它会读取用户的模型，并使用 [Jinja](http:\u002F\u002Fjinja.pocoo.org\u002F) 模板渲染代码。\n\n![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Ftobegit3hub_simple_tensorflow_serving_readme_30985215dee0.jpeg)\n\n## 贡献\n\n欢迎为该项目提交问题或拉取请求。非常欢迎添加更多语言的客户端，以便访问 TensorFlow 模型。","# Simple TensorFlow Serving 快速上手指南\n\nSimple TensorFlow Serving 是一个通用且易用的机器学习模型服务工具，支持 TensorFlow、PyTorch、MXNet、ONNX、Scikit-learn 等多种框架模型的部署。\n\n## 环境准备\n\n*   **操作系统**：Linux \u002F macOS \u002F Windows (推荐 Linux)\n*   **Python 版本**：Python 3.4+\n*   **前置依赖**：\n    *   `pip` 包管理工具\n    *   若需 GPU 加速，请确保已安装 NVIDIA 驱动及 CUDA 环境（推荐使用 Docker）\n*   **网络建议**：国内用户建议使用国内 PyPI 镜像源加速安装。\n\n## 安装步骤\n\n### 方式一：使用 pip 安装（推荐）\n\n```bash\n# 使用国内镜像源加速安装\npip install simple_tensorflow_serving -i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple\n```\n\n### 方式二：使用 Docker 部署（推荐生产环境或 GPU 场景）\n\n**CPU 版本：**\n```bash\ndocker run -d -p 8500:8500 tobegit3hub\u002Fsimple_tensorflow_serving\n```\n\n**GPU 版本：**\n```bash\ndocker run -d -p 8500:8500 tobegit3hub\u002Fsimple_tensorflow_serving:latest-gpu\n```\n\n### 方式三：源码安装\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002Ftobegit3hub\u002Fsimple_tensorflow_serving.git\ncd simple_tensorflow_serving\npython .\u002Fsetup.py install\n```\n\n## 基本使用\n\n以下示例展示如何启动服务并进行最简单的推理调用。\n\n### 1. 启动服务\n\n假设你有一个 TensorFlow SavedModel 格式的模型，位于 `.\u002Fmodels\u002Ftensorflow_template_application_model` 目录下。\n\n```bash\nsimple_tensorflow_serving --model_base_path=\".\u002Fmodels\u002Ftensorflow_template_application_model\"\n```\n\n启动成功后，可在浏览器访问 `http:\u002F\u002F127.0.0.1:8500` 查看可视化监控面板。\n\n### 2. 自动生成客户端代码\n\n无需手动编写请求代码，工具可根据模型签名自动生成测试客户端。\n\n**生成 Python 客户端：**\n```bash\ncurl http:\u002F\u002Flocalhost:8500\u002Fv1\u002Fmodels\u002Fdefault\u002Fgen_client?language=python > client.py\n```\n\n### 3. 
执行推理\n\n运行生成的客户端脚本即可发送测试数据并获取结果：\n\n```bash\npython .\u002Fclient.py\n```\n\n生成的 `client.py` 核心逻辑如下（仅供参考）：\n\n```python\n#!\u002Fusr\u002Fbin\u002Fenv python\n\nimport requests\n\ndef main():\n  endpoint = \"http:\u002F\u002F127.0.0.1:8500\"\n  # 自动填充的测试数据\n  json_data = {\"model_name\": \"default\", \"data\": {\"keys\": [[1], [1]], \"features\": [[1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0], [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]]} }\n  result = requests.post(endpoint, json=json_data)\n  print(result.text)\n\nif __name__ == \"__main__\":\n  main()\n```\n\n### 进阶提示\n*   **多模型支持**：可通过 JSON 配置文件同时加载多个不同框架的模型。\n*   **图像模型**：支持直接上传原始图片文件进行推理，无需手动转换为数组。\n*   **安全认证**：支持配置 Basic Auth 用户名密码及 SSL\u002FTLS 加密传输。","某电商团队需要将训练好的 TensorFlow 商品推荐模型和 PyTorch 图像识别模型快速部署到生产环境，以支持实时用户行为分析。\n\n### 没有 simple_tensorflow_serving 时\n- **多框架适配难**：团队需为 TensorFlow 和 PyTorch 分别编写不同的推理服务代码，维护两套独立的部署脚本，开发成本高昂。\n- **版本更新繁琐**：每次模型迭代都需要手动重启服务或编写复杂的热加载逻辑，导致服务中断风险高且上线周期长。\n- **客户端对接慢**：后端开发人员必须手动编写各语言（如 Python、Java）的 HTTP 请求封装代码，容易出错且效率低下。\n- **缺乏监控可视**：无法直观查看模型调用次数、延迟等统计指标，排查性能瓶颈如同“盲人摸象”。\n\n### 使用 simple_tensorflow_serving 后\n- **统一服务入口**：通过一个配置文件即可同时托管 TensorFlow 和 PyTorch 等多个框架的模型，提供标准的 RESTful API 统一对外服务。\n- **动态热更新**：只需将新模型文件放入指定目录，simple_tensorflow_serving 自动检测并加载最新版本，实现零停机平滑升级。\n- **自动生成客户端**：利用 `curl` 一键生成特定语言的客户端代码，开发人员无需手写请求逻辑，直接调用即可测试模型。\n- **内置监控看板**：访问 Web 仪表盘即可实时查看各模型的请求量与耗时统计，运维监控一目了然。\n\nsimple_tensorflow_serving 通过屏蔽底层框架差异与简化运维流程，让算法模型从“实验室”到“生产线”的部署时间从数天缩短至分钟级。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Ftobegit3hub_simple_tensorflow_serving_df26f4f5.png","tobegit3hub","tobe","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Ftobegit3hub_41fbce74.png","The architecture and developer for LLM infrastructure.",null,"China","tobeg3oogle@gmail.com","https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Fchendihao\u002F","https:\u002F\u002Fgithub.com\u002Ftobegit3hub",[83,87,91,95,99,103,107,111,114,117],{"name":84,"color":85,"percentage":86},"JavaScript","#f1e05a",52.9,{"name":88,"color":89,"percentage":90},"Python","#3572A5",40.1,{"name":92,"color":93,"percentage":94},"HTML","#e34c26",3.5,{"name":96,"color":97,"percentage":98},"Shell","#89e051",1.4,{"name":100,"color":101,"percentage":102},"Swift","#F05138",0.4,{"name":104,"color":105,"percentage":106},"Lua","#000080",0.3,{"name":108,"color":109,"percentage":110},"Go","#00ADD8",0.2,{"name":112,"color":113,"percentage":110},"Dockerfile","#384d54",{"name":115,"color":116,"percentage":110},"C++","#f34b7d",{"name":118,"color":119,"percentage":110},"Ruby","#701516",758,185,"2026-02-04T09:16:03","Apache-2.0","Linux","非必需。若需加速，需 NVIDIA GPU 及 CUDA 环境（通过 Docker 镜像 `latest-gpu` 部署，需挂载宿主机的 `\u002Fdev\u002Fnvidia*` 设备及 CUDA 库文件）","未说明",{"notes":128,"python":129,"dependencies":130},"该工具是一个通用的模型服务框架，支持多种框架模型（TensorFlow, MXNet, PyTorch 等）。推荐使用 Docker 部署（提供 CPU、GPU、HDFS 等多个镜像标签）。支持动态加载\u002F卸载模型版本、自定义 TensorFlow Op、基础认证 (Basic Auth) 及 SSL\u002FTLS 加密。可通过 API 自动生成多语言客户端代码。","支持 Python 2 (示例含 py34 标签) 及 Python 3 (通过 pip 安装或源码编译)",[131,132,133,134,135,136,137,138,139,140],"TensorFlow","MXNet","PyTorch","Caffe2","CNTK","ONNX","H2o","Scikit-learn","XGBoost","PMML",[52,14],[143,144,145,146,147,148,149,150],"tensorflow-models","savedmodel","tensorflow","serving","client","http","machine-learning","deep-learning","2026-03-27T02:49:30.150509","2026-04-10T06:26:43.929640",[154,159,164,169,173,178,182,187],{"id":155,"question_zh":156,"answer_zh":157,"source_url":158},27209,"遇到 AttributeError: module 'tensorflow' has no attribute 'gfile' 错误怎么办？","该问题是由于 TensorFlow 
2.x 版本中移除了 `tf.gfile` 属性导致的。维护者确认此问题已通过更新为新的 TensorFlow API 解决。请确保升级到最新版本的 simple-tensorflow-serving，以兼容 TensorFlow 2.0 及以上版本。","https:\u002F\u002Fgithub.com\u002Ftobegit3hub\u002Fsimple_tensorflow_serving\u002Fissues\u002F45",{"id":160,"question_zh":161,"answer_zh":162,"source_url":163},27210,"启动服务时出现 TemplateNotFound: index.html 错误如何解决？","这是一个已知问题，已在版本 0.8.2 中修复。请通过运行命令 `pip install -U simple-tensorflow-serving>=0.8.2` 升级到 0.8.2 或更高版本即可解决该模板丢失问题。","https:\u002F\u002Fgithub.com\u002Ftobegit3hub\u002Fsimple_tensorflow_serving\u002Fissues\u002F60",{"id":165,"question_zh":166,"answer_zh":167,"source_url":168},27211,"Docker 镜像默认是 CPU 模式吗？如何启用 GPU 支持？","默认 Docker 镜像确实使用 CPU 模式。若要启用 GPU，请使用标记为 `latest-gpu` 的镜像（如 `tobegit3hub\u002Fsimple_tensorflow_serving:latest-gpu`）。运行时需挂载 CUDA 文件并传递设备参数，示例命令如下：\nexport CUDA_SO=\"-v \u002Fusr\u002Fcuda_files\u002F:\u002Fusr\u002Fcuda_files\u002F\"\nexport DEVICES=$(ls \u002Fdev\u002Fnvidia* | xargs -I{} echo '--device {}:{}')\nexport LIBRARY_ENV=\"-e LD_LIBRARY_PATH=\u002Fusr\u002Flocal\u002Fcuda\u002Fextras\u002FCUPTI\u002Flib64:\u002Fusr\u002Flocal\u002Fnvidia\u002Flib:\u002Fusr\u002Flocal\u002Fnvidia\u002Flib64:\u002Fusr\u002Fcuda_files\"\ndocker run -it -p 8500:8500 $CUDA_SO $DEVICES $LIBRARY_ENV tobegit3hub\u002Fsimple_tensorflow_serving:latest-gpu","https:\u002F\u002Fgithub.com\u002Ftobegit3hub\u002Fsimple_tensorflow_serving\u002Fissues\u002F26",{"id":170,"question_zh":171,"answer_zh":172,"source_url":168},27212,"如何在 GPU 模式下限制显存占用，避免分配全部显存？","在 GPU 模式下，默认可能会占用全部显存。虽然目前无法直接通过编译 TF-Serving 添加配置，但可以在 TensorFlow Python 服务器端通过设置 `per_process_gpu_memory_fraction` 参数或使用 `allow_growth=True` 来控制显存增长。建议关注相关 TensorFlow Serving 议题以获取后续原生支持。",{"id":174,"question_zh":175,"answer_zh":176,"source_url":177},27213,"为什么在 Python 3 环境下请求返回 \"Invalid json data\" 错误？","这是由于 Python 2 和 Python 3 之间的兼容性问题，特别是字典操作（如 `has_key`）在 Python 3 中不再支持。维护者已确认该问题并计划在使用 Python 3 的 Docker 镜像（如 `:latest-py34`）中进行修复。建议暂时使用 Python 2 环境或等待官方修复更新。","https:\u002F\u002Fgithub.com\u002Ftobegit3hub\u002Fsimple_tensorflow_serving\u002Fissues\u002F25",{"id":179,"question_zh":180,"answer_zh":181,"source_url":177},27214,"如何将 TensorFlow 的 checkpoint 文件（.meta, .index, .data）转换为 SavedModel 格式以供 serving 使用？","需要编写 Python 脚本加载 checkpoint 并导出为 SavedModel 格式。具体步骤包括：使用 `tf.train.import_meta_graph` 加载 .meta 文件，恢复变量，然后使用 `tf.saved_model.simple_save` 或 `tf.compat.v1.saved_model.builder` 将模型保存为 SavedModel 格式（包含 saved_model.pb 和 variables 文件夹）。",{"id":183,"question_zh":184,"answer_zh":185,"source_url":186},27215,"如何使用图片进行推理？输入格式是什么？","现在支持以 base64 编码的图片作为输入。您可以拉取最新代码获取预训练模型，并使用以下命令进行测试：\nsimple_tensorflow_serving --model_base_path=\".\u002Fmodels\u002Fdeep_image_platforms\"\ncurl -X POST -F 'image=@.\u002Fimages\u002Fmew.jpg' -F \"model_version=1\" 127.0.0.1:8500\n或者直接访问 http:\u002F\u002Flocalhost:8500\u002F 仪表盘上传图片进行推理。","https:\u002F\u002Fgithub.com\u002Ftobegit3hub\u002Fsimple_tensorflow_serving\u002Fissues\u002F21",{"id":188,"question_zh":189,"answer_zh":190,"source_url":191},27216,"在 Windows 10 上通过 pip 安装后运行命令提示找不到目录怎么办？","在 Windows 上运行时，请确保模型路径正确且存在。注意 Windows 路径分隔符可能需要使用双反斜杠 `\\\\` 或正斜杠 `\u002F`。例如：`simple_tensorflow_serving --model_base_path=\".\u002Fmodels\u002Ftensorflow_template_application_model\"`。如果路径中包含空格或特殊字符，请用引号包裹。同时确认当前目录下确实存在指定的模型文件夹。","https:\u002F\u002Fgithub.com\u002Ftobegit3hub\u002Fsimple_tensorflow_serving\u002Fissues\u002F51",[]]