[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-orobix--retina-unet":3,"tool-orobix--retina-unet":64},[4,17,26,40,48,56],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":16},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,3,"2026-04-05T11:01:52",[13,14,15],"开发框架","图像","Agent","ready",{"id":18,"name":19,"github_repo":20,"description_zh":21,"stars":22,"difficulty_score":23,"last_commit_at":24,"category_tags":25,"status":16},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",107662,2,"2026-04-03T11:11:01",[13,14,15],{"id":27,"name":28,"github_repo":29,"description_zh":30,"stars":31,"difficulty_score":23,"last_commit_at":32,"category_tags":33,"status":16},2268,"ML-For-Beginners","microsoft\u002FML-For-Beginners","ML-For-Beginners 是由微软推出的一套系统化机器学习入门课程，旨在帮助零基础用户轻松掌握经典机器学习知识。这套课程将学习路径规划为 12 周，包含 26 节精炼课程和 52 
道配套测验，内容涵盖从基础概念到实际应用的完整流程，有效解决了初学者面对庞大知识体系时无从下手、缺乏结构化指导的痛点。\n\n无论是希望转型的开发者、需要补充算法背景的研究人员，还是对人工智能充满好奇的普通爱好者，都能从中受益。课程不仅提供了清晰的理论讲解，还强调动手实践，让用户在循序渐进中建立扎实的技能基础。其独特的亮点在于强大的多语言支持，通过自动化机制提供了包括简体中文在内的 50 多种语言版本，极大地降低了全球不同背景用户的学习门槛。此外，项目采用开源协作模式，社区活跃且内容持续更新，确保学习者能获取前沿且准确的技术资讯。如果你正寻找一条清晰、友好且专业的机器学习入门之路，ML-For-Beginners 将是理想的起点。",84991,"2026-04-05T10:45:23",[14,34,35,36,15,37,38,13,39],"数据工具","视频","插件","其他","语言模型","音频",{"id":41,"name":42,"github_repo":43,"description_zh":44,"stars":45,"difficulty_score":10,"last_commit_at":46,"category_tags":47,"status":16},3128,"ragflow","infiniflow\u002Fragflow","RAGFlow 是一款领先的开源检索增强生成（RAG）引擎，旨在为大语言模型构建更精准、可靠的上下文层。它巧妙地将前沿的 RAG 技术与智能体（Agent）能力相结合，不仅支持从各类文档中高效提取知识，还能让模型基于这些知识进行逻辑推理和任务执行。\n\n在大模型应用中，幻觉问题和知识滞后是常见痛点。RAGFlow 通过深度解析复杂文档结构（如表格、图表及混合排版），显著提升了信息检索的准确度，从而有效减少模型“胡编乱造”的现象，确保回答既有据可依又具备时效性。其内置的智能体机制更进一步，使系统不仅能回答问题，还能自主规划步骤解决复杂问题。\n\n这款工具特别适合开发者、企业技术团队以及 AI 研究人员使用。无论是希望快速搭建私有知识库问答系统，还是致力于探索大模型在垂直领域落地的创新者，都能从中受益。RAGFlow 提供了可视化的工作流编排界面和灵活的 API 接口，既降低了非算法背景用户的上手门槛，也满足了专业开发者对系统深度定制的需求。作为基于 Apache 2.0 协议开源的项目，它正成为连接通用大模型与行业专有知识之间的重要桥梁。",77062,"2026-04-04T04:44:48",[15,14,13,38,37],{"id":49,"name":50,"github_repo":51,"description_zh":52,"stars":53,"difficulty_score":10,"last_commit_at":54,"category_tags":55,"status":16},519,"PaddleOCR","PaddlePaddle\u002FPaddleOCR","PaddleOCR 是一款基于百度飞桨框架开发的高性能开源光学字符识别工具包。它的核心能力是将图片、PDF 等文档中的文字提取出来，转换成计算机可读取的结构化数据，让机器真正“看懂”图文内容。\n\n面对海量纸质或电子文档，PaddleOCR 解决了人工录入效率低、数字化成本高的问题。尤其在人工智能领域，它扮演着连接图像与大型语言模型（LLM）的桥梁角色，能将视觉信息直接转化为文本输入，助力智能问答、文档分析等应用场景落地。\n\nPaddleOCR 适合开发者、算法研究人员以及有文档自动化需求的普通用户。其技术优势十分明显：不仅支持全球 100 多种语言的识别，还能在 Windows、Linux、macOS 等多个系统上运行，并灵活适配 CPU、GPU、NPU 等各类硬件。作为一个轻量级且社区活跃的开源项目，PaddleOCR 既能满足快速集成的需求，也能支撑前沿的视觉语言研究，是处理文字识别任务的理想选择。",74913,"2026-04-05T10:44:17",[38,14,13,37],{"id":57,"name":58,"github_repo":59,"description_zh":60,"stars":61,"difficulty_score":23,"last_commit_at":62,"category_tags":63,"status":16},2471,"tesseract","tesseract-ocr\u002Ftesseract","Tesseract 
是一款历史悠久且备受推崇的开源光学字符识别（OCR）引擎，最初由惠普实验室开发，后由 Google 维护，目前由全球社区共同贡献。它的核心功能是将图片中的文字转化为可编辑、可搜索的文本数据，有效解决了从扫描件、照片或 PDF 文档中提取文字信息的难题，是数字化归档和信息自动化的重要基础工具。\n\n在技术层面，Tesseract 展现了强大的适应能力。从版本 4 开始，它引入了基于长短期记忆网络（LSTM）的神经网络 OCR 引擎，显著提升了行识别的准确率；同时，为了兼顾旧有需求，它依然支持传统的字符模式识别引擎。Tesseract 原生支持 UTF-8 编码，开箱即用即可识别超过 100 种语言，并兼容 PNG、JPEG、TIFF 等多种常见图像格式。输出方面，它灵活支持纯文本、hOCR、PDF、TSV 等多种格式，方便后续数据处理。\n\nTesseract 主要面向开发者、研究人员以及需要构建文档处理流程的企业用户。由于它本身是一个命令行工具和库（libtesseract），不包含图形用户界面（GUI），因此最适合具备一定编程能力的技术人员集成到自动化脚本或应用程序中",73286,"2026-04-03T01:56:45",[13,14],{"id":65,"github_repo":66,"name":67,"description_en":68,"description_zh":69,"ai_summary_zh":69,"readme_en":70,"readme_zh":71,"quickstart_zh":72,"use_case_zh":73,"hero_image_url":74,"owner_login":75,"owner_name":76,"owner_avatar_url":77,"owner_bio":78,"owner_company":79,"owner_location":79,"owner_email":80,"owner_twitter":79,"owner_website":81,"owner_url":82,"languages":83,"stars":88,"forks":89,"last_commit_at":90,"license":79,"difficulty_score":91,"env_os":92,"env_gpu":93,"env_ram":92,"env_deps":94,"category_tags":106,"github_topics":79,"view_count":23,"oss_zip_url":79,"oss_zip_packed_at":79,"status":16,"created_at":107,"updated_at":108,"faqs":109,"releases":139},3682,"orobix\u002Fretina-unet","retina-unet","Retina blood vessel segmentation with a convolutional neural network","retina-unet 是一款基于卷积神经网络的开源工具，专门用于从眼底图像中自动分割视网膜血管。它通过将图像中的每个像素分类为“血管”或“非血管”，解决了传统人工标注效率低、主观性强以及自动化分析难度大的问题，为糖尿病视网膜病变等眼科疾病的辅助诊断提供关键技术支撑。\n\n该工具非常适合医学影像领域的研究人员、AI 开发者以及生物医学工程师使用。用户可以直接利用其预训练模型进行测试，或基于提供的代码框架在自己的数据集上进行训练和微调。\n\nretina-unet 的核心技术亮点在于采用了经典的 U-Net 架构，并针对眼底图像特性进行了深度优化。在训练前，它对数据进行了灰度转换、标准化、限制对比度自适应直方图均衡化（CLAHE）及伽马校正等一系列精细预处理。此外，通过随机截取包含视野边界的小图像块进行训练，模型学会了有效区分血管与视野边缘。在权威的 DRIVE 数据库测试中，retina-unet 取得了极高的 ROC 曲线下面积（AUC）分数，表现优于许多已发表的方法，证明了其在复杂医学图像分割任务中的卓越性能与可靠性。","# Retina blood vessel segmentation with a convolution neural network (U-net)\n\n![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Forobix_retina-unet_readme_423f9c23f33c.png)\n\nThis repository 
contains the implementation of a convolutional neural network used to segment blood vessels in retina fundus images. This is a binary classification task: the neural network predicts if each pixel in the fundus image is either a vessel or not.  \nThe neural network structure is derived from the *U-Net* architecture, described in this [paper](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1505.04597.pdf).  \nThe performance of this neural network is tested on the DRIVE database, and it achieves the best score in terms of area under the ROC curve in comparison to the other methods published so far. Also on the STARE datasets, this method reports one of the best performances.\n\n\n## Methods\nBefore training, the 20 images of the DRIVE training datasets are pre-processed with the following transformations:\n- Gray-scale conversion\n- Standardization\n- Contrast-limited adaptive histogram equalization (CLAHE)\n- Gamma adjustment\n\nThe training of the neural network is performed on sub-images (patches) of the pre-processed full images. Each patch, of dimension 48x48, is obtained by randomly selecting its center inside the full image. Also the patches partially or completely outside the Field Of View (FOV) are selected, in this way the neural network learns how to discriminate the FOV border from blood vessels.  \nA set of 190000 patches is obtained by randomly extracting 9500 patches in each of the 20 DRIVE training images. Although the patches overlap, i.e. different patches may contain same part of the original images, no further data augmentation is performed. The first 90% of the dataset is used for training (171000 patches), while the last 10% is used for validation (19000 patches).\n\nThe neural network architecture is derived from the *U-net* architecture (see the [paper](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1505.04597.pdf)).\nThe loss function is the cross-entropy and the stochastic gradient descent is employed for optimization. 
The activation function after each convolutional layer is the Rectified Linear Unit (ReLU), and a dropout of 0.2 is used between two consecutive convolutional layers.  \nTraining is performed for 150 epochs, with a mini-batch size of 32 patches. On a GeForce GTX TITAN GPU, training takes about 20 hours.\n\n\n## Results on DRIVE database\nTesting is performed with the 20 images of the DRIVE testing dataset, using the gold standard as ground truth. Only the pixels belonging to the FOV are considered. The FOV is identified with the masks included in the DRIVE database.  \nIn order to improve the performance, the vessel probability of each pixel is obtained by averaging multiple predictions. With a stride of 5 pixels in both height and width, multiple consecutive overlapping patches are extracted in each testing image. Then, for each pixel, the vessel probability is obtained by averaging probabilities over all the predicted patches covering the pixel.\n\nThe results reported in the `.\u002Ftest` folder refer to the trained model that achieved the minimum validation loss. 
The `.\u002Ftest` folder includes:\n- Model:\n  - `test_model.png` schematic representation of the neural network\n  - `test_architecture.json` description of the model in json format\n  - `test_best_weights.h5` weights of the model which reported the minimum validation loss, as HDF5 file\n  - `test_last_weights.h5`  weights of the model at last epoch (150th), as HDF5 file\n  - `test_configuration.txt` configuration of the parameters of the experiment\n- Experiment results:\n  - `performances.txt` summary of the test results, including the confusion matrix\n  - `Precision_recall.png` the precision-recall plot and the corresponding Area Under the Curve (AUC)\n  - `ROC.png` the Receiver Operating Characteristic (ROC) curve and the corresponding AUC\n  - `all_*.png` the 20 images of the pre-processed originals, ground truth and predictions relative to the DRIVE testing dataset\n  - `sample_input_*.png` sample of 40 patches of the pre-processed original training images and the corresponding ground truth\n  - `test_Original_GroundTruth_Prediction*.png` from top to bottom, the original pre-processed image, the ground truth and the prediction. In the predicted image, each pixel shows the vessel predicted probability, no threshold is applied.\n\n\nThe following table compares this method to other recent techniques, which have published their performance in terms of Area Under the ROC curve (AUC ROC) on the DRIVE dataset.\n\n| Method                  | AUC ROC on DRIVE |\n| ----------------------- |:----------------:|\n| Soares et al [1]        | .9614            |\n| Azzopardi et al. [2]    | .9614            |\n| Osareh et al  [3]       | .9650            |\n| Roychowdhury et al. [4] | .9670            |\n| Fraz et al.  [5]        | .9747            |\n| Qiaoliang et al. [6]    | .9738            |\n| Melinscak et al. 
[7]    | .9749            |\n| Liskowski et al.^ [8]   | .9790            |\n| **this method**         | **.9790**        |\n\n^ different definition of FOV\n\n## Running the experiment on DRIVE\nThe code is written in Python. The experiment on the DRIVE database can be replicated by following the guidelines below.\n\n\n### Prerequisites\nThe neural network is developed with the Keras library; see the [Keras repository](https:\u002F\u002Fgithub.com\u002Ffchollet\u002Fkeras) for installation instructions.\n\nThis code has been tested with Keras 1.1.0, using either Theano or TensorFlow as backend. To avoid a dimension mismatch, it is important to set `\"image_dim_ordering\": \"th\"` in the `~\u002F.keras\u002Fkeras.json` configuration file. If this file does not exist, you can create it. See the Keras documentation for more details.\n\nThe following dependencies are needed:\n- numpy >= 1.11.1\n- PIL >= 1.1.7\n- opencv >= 2.4.10\n- h5py >= 2.6.0\n- ConfigParser >= 3.5.0b2\n- scikit-learn >= 0.17.1\n\n\nYou will also need the DRIVE database, which can be freely downloaded as explained in the next section.\n\n### Training\n\nFirst of all, you need the DRIVE database. We are not allowed to provide the data here, but you can download the DRIVE database at the official [website](http:\u002F\u002Fwww.isi.uu.nl\u002FResearch\u002FDatabases\u002FDRIVE\u002F). Extract the images to a folder and name it, for example, \"DRIVE\". 
This folder should have the following tree:\n```\nDRIVE\n│\n└───test\n|    ├───1st_manual\n|    └───2nd_manual\n|    └───images\n|    └───mask\n│\n└───training\n    ├───1st_manual\n    └───images\n    └───mask\n```\nWe refer to the DRIVE website for the description of the data.\n\nIt is convenient to create HDF5 datasets of the ground truth, masks and images for both training and testing.\nIn the root folder, just run:\n```\npython prepare_datasets_DRIVE.py\n```\nThe HDF5 datasets for training and testing will be created in the folder `.\u002FDRIVE_datasets_training_testing\u002F`.  \nN.B: If you gave a different name for the DRIVE folder, you need to specify it in the `prepare_datasets_DRIVE.py` file.\n\nNow we can configure the experiment. All the settings can be specified in the file `configuration.txt`, organized in the following sections:  \n**[data paths]**  \nChange these paths only if you have modified the `prepare_datasets_DRIVE.py` file.  \n**[experiment name]**  \nChoose a name for the experiment, a folder with the same name will be created and will contain all the results and the trained neural networks.  \n**[data attributes]**  \nThe network is trained on sub-images (patches) of the original full images, specify here the dimension of the patches.  \n**[training settings]**  \nHere you can specify:  \n- *N_subimgs*: total number of patches randomly extracted from the original full images. This number must be a multiple of 20, since an equal number of patches is extracted in each of the 20 original training images.\n- *inside_FOV*: choose if the patches must be selected only completely inside the FOV. The neural network correctly learns how to exclude the FOV border if also the patches including the mask are selected. 
However, a higher number of patches are required for training.\n- *N_epochs*: number of training epochs.\n- *batch_size*: mini batch size.\n- *nohup*: the standard output during the training is redirected and saved in a log file.\n\n\nAfter all the parameters have been configured, you can train the neural network with:\n```\npython run_training.py\n```\nIf available, a GPU will be used.  \nThe following files will be saved in the folder with the same name of the experiment:\n- model architecture (json)\n- picture of the model structure (png)\n- a copy of the configuration file\n- model weights at last epoch (HDF5)\n- model weights at best epoch, i.e. minimum validation loss (HDF5)\n- sample of the training patches and their corresponding ground truth (png)\n\n\n### Evaluate the trained model\nThe performance of the trained model is evaluated against the DRIVE testing dataset, consisting of 20 images (as many as in the training set).\n\nThe parameters for the testing can be tuned again in the `configuration.txt` file, specifically in the [testing settings] section, as described below:  \n**[testing settings]**  \n- *best_last*: choose the model for prediction on the testing dataset: best = the model with the lowest validation loss obtained during the training; last = the model at the last epoch.\n- *full_images_to_test*: number of full images for testing, max 20.\n- *N_group_visual*: choose how many images per row in the saved figures.\n- *average_mode*: if true, the predicted vessel probability for each pixel is computed by averaging the predicted probability over multiple overlapping patches covering the same pixel.\n- *stride_height*: relevant only if average_mode is True. 
The stride along the height for the overlapping patches, smaller stride gives higher number of patches.\n- *stride_width*: same as stride_height.\n- *nohup*: the standard output during the prediction is redirected and saved in a log file.\n\nThe section **[experiment name]** must be the name of the experiment you want to test, while **[data paths]** contains the paths to the testing datasets. Now the section **[training settings]** will be ignored.\n\nRun testing by:\n```\npython run_testing.py\n```\nIf available, a GPU will be used.  \nThe following files will be saved in the folder with same name of the experiment:\n- The ROC curve  (png)\n- The Precision-recall curve (png)\n- Picture of all the testing pre-processed images (png)\n- Picture of all the corresponding segmentation ground truth (png)\n- Picture of all the corresponding segmentation predictions (png)\n- One or more pictures including (top to bottom): original pre-processed image, ground truth, prediction\n- Report on the performance\n\nAll the results are referred only to the pixels belonging to the FOV, selected by the masks included in the DRIVE database\n\n\n## Results on STARE database\n\nThis neural network has been tested also on another common database, the [STARE](http:\u002F\u002Fcecas.clemson.edu\u002F~ahoover\u002Fstare\u002F). The neural network is identical as in the experiment with the DRIVE dataset, however some modifications in the code and in the methodology were necessary due to the differences between the two datasets.  \nThe STARE consists of 20 retinal fundus images with two sets of manual segmentation provided by two different observers, with the former one considered as the ground truth. Conversely to the DRIVE dataset, there is no standard division into train and test images, therefore the experiment has been performed with the *leave-one-out* method. 
The training-testing cycle has been repeated 20 times: at each iteration one image has been left out from the training set and then used for the test.  \nThe pre-processing is the same applied for the DRIVE dataset, and 9500 random patches of 48x48 pixels each are extracted from each of the 19 images forming the training set. Also the area outside the FOV has been considered for the patch extraction. From these patches, 90% (162450 patches) are used for training and 10% (18050 patches) are used for validation.  The training parameters (epochs, batch size...) are the same as in the DRIVE experiment.  \nThe test is performed each time on the single image left out from the training dataset. Similarly to the DRIVE dataset, the vessel probability of each pixel is obtained by averaging over multiple overlapping patches, obtained with a stride of 5 pixels in both width and height. Only the pixels belonging to the FOV are considered. This time the FOV is identified by applying a color threshold in the original images, since no masks are available in the STARE dataset.  \n\nThe following table shows the results (in terms of AUC ROC) obtained over the 20 different trainings, with the stated image used for test.\n\n| STARE image| AUC ROC|\n| ---------- |:------:|\n| im0239.ppm | .9751 |\n| im0324.ppm | .9661 |\n| im0139.ppm | .9845 |\n| im0082.ppm | .9929 |\n| im0240.ppm | .9832 |\n| im0003.ppm | .9856 |\n| im0319.ppm | .9702 |\n| im0163.ppm | .9952 |\n| im0077.ppm | .9925 |\n| im0162.ppm | .9913 |\n| im0081.ppm | .9930 |\n| im0291.ppm | .9635 |\n| im0005.ppm | .9703 |\n| im0235.ppm | .9912 |\n| im0004.ppm | .9732 |\n| im0044.ppm | .9883 |\n| im0001.ppm | .9709 |\n| im0002.ppm | .9588 |\n| im0236.ppm | .9893 |\n| im0255.ppm | .9819 |\n\n__AVERAGE:   .9805 +- .0113__\n\nThe folder `.\u002FSTARE_results` contains all the predictions. Each image shows (from top to bottom) the pre-processed original image of the STARE dataset, the ground truth and the corresponding prediction. 
In the predicted image, each pixel shows the vessel predicted probability, no threshold is applied.\n\nThe following table compares this method to other recent techniques, which have published their performance in terms of Area Under the ROC curve (AUC ROC) on the STARE dataset.\n\n| Method                  | AUC ROC on STARE |\n| ----------------------- |:----------------:|\n| Soares et al [1]        | .9671           |\n| Azzopardi et al. [2]    | .9563            |\n| Roychowdhury et al. [4] | .9688            |\n| Fraz et al.  [5]        | .9768            |\n| Qiaoliang et al. [6]    | .9879            |\n| Liskowski et al.^ [8]   | .9930            |\n| **this method**         | **.9805**        |\n\n^ different definition of FOV\n\n## Bibliography\n\n[1] Soares et al., “Retinal vessel segmentation using the 2-d Gabor wavelet and supervised classification,” *Medical Imaging, IEEE Transactions on*, vol. 25, no. 9, pp. 1214–1222, 2006.\n\n[2] Azzopardi et al., “Trainable cosfire filters for vessel delineation with application to retinal images,”\n*Medical image analysis*, vol. 19, no. 1, pp. 46–57, 2015.\n\n[3] Osareh et al., “Automatic blood vessel segmentation in color images of retina,” *Iran. J. Sci. Technol. Trans. B: Engineering*, vol. 33, no. B2, pp. 191–206, 2009.\n\n[4] Roychowdhury et al., “Blood vessel segmentation of fundus images by major vessel extraction and subimage\nclassification,” *Biomedical and Health Informatics, IEEE Journal of*, vol. 19, no. 3, pp. 1118–1128, 2015.\n\n[5] Fraz et al., \"An Ensemble Classification-Based Approach Applied to Retinal Blood Vessel Segmentation\",   *IEEE Transactions on Biomedical Engineering*, vol. 59, no. 9, pp. 2538-2548, 2012.\n\n[6] Qiaoliang et al., \"A Cross-Modality Learning Approach for Vessel Segmentation in Retinal Images\", *IEEE Transactions on Medical Imaging*, vol. 35, no. 1, pp. 
109-118, 2016.\n\n[7] Melinscak et al., \"Retinal vessel segmentation using deep neural networks\", *In Proceedings of the 10th International Conference on Computer Vision Theory and Applications (VISIGRAPP 2015)*, (2015), pp. 577–582.\n\n[8] Liskowski et al., \"Segmenting Retinal Blood Vessels with Deep Neural Networks\",  *IEEE Transactions on Medical Imaging*, vol. PP, no. 99, pp. 1-1, 2016.\n\n\n## Acknowledgements\n\nThis work was supported by the EU Marie Curie Initial Training Network (ITN) “REtinal VAscular Modelling, Measurement And Diagnosis\" (REVAMMAD), Project no. 316990.\n\n## License\n\nThis project is licensed under the MIT License\n\nCopyright (c) 2016 Daniele Cortinovis, Orobix Srl (www.orobix.com).\n","# 基于卷积神经网络（U-Net）的视网膜血管分割\n\n![](https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Forobix_retina-unet_readme_423f9c23f33c.png)\n\n本仓库包含用于分割视网膜眼底图像中血管的卷积神经网络实现。这是一个二分类任务：神经网络预测眼底图像中的每个像素是否为血管。  \n该神经网络结构基于*U-Net*架构，相关描述见这篇[论文](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1505.04597.pdf)。  \n该神经网络在DRIVE数据库上的性能测试表明，其ROC曲线下面积指标优于目前已发表的其他方法。此外，在STARE数据集上，该方法也表现出优异的性能。\n\n## 方法\n在训练之前，对DRIVE训练数据集中的20张图像进行以下预处理：\n- 灰度化\n- 标准化\n- 对比度受限自适应直方图均衡化（CLAHE）\n- Gamma校正\n\n神经网络的训练是在预处理后完整图像的子图像（补丁）上进行的。每个补丁的尺寸为48×48像素，其中心点随机选取于整张图像内。同时，也会选择部分或完全位于视野外（FOV）的补丁，以便让网络学会如何区分视野边界与血管。  \n通过从DRIVE训练集的20张图像中每张随机抽取9,500个补丁，共得到190,000个补丁。尽管这些补丁存在重叠——即不同补丁可能包含原始图像的同一区域——但并未进行额外的数据增强。数据集中前90%用于训练（171,000个补丁），后10%用于验证（19,000个补丁）。\n\n神经网络架构源自*U-Net*架构（参见[论文](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1505.04597.pdf)）。损失函数采用交叉熵，优化方法为随机梯度下降。每个卷积层后的激活函数为ReLU，并在连续两个卷积层之间使用0.2的丢弃率（Dropout）。  \n训练共进行150个epoch，每次使用32个补丁作为小批量。在GeForce GTX TITAN GPU上运行，整个训练过程大约持续20小时。\n\n## DRIVE数据库上的结果\n测试使用DRIVE测试数据集中的20张图像，并以金标准作为真实标签。仅考虑属于视野（FOV）范围内的像素。视野区域由DRIVE数据库附带的掩码标识。  \n为了提升性能，每个像素的血管概率是通过对多次预测结果取平均值获得的。在每张测试图像中，按高度和宽度方向每隔5个像素提取一组连续且部分重叠的补丁。随后，对于每个像素，其血管概率即为覆盖该像素的所有预测补丁中概率的平均值。\n\n`.\u002Ftest`文件夹中报告的结果对应于验证损失最小的已训练模型。该文件夹包含：\n- 模型：\n  - `test_model.png` 神经网络结构示意图\n  - `test_architecture.json` 
模型的JSON格式描述\n  - `test_best_weights.h5` 验证损失最小的模型权重，存储为HDF5文件\n  - `test_last_weights.h5` 最终第150个epoch时的模型权重，存储为HDF5文件\n  - `test_configuration.txt` 实验参数配置\n- 实验结果：\n  - `performances.txt` 测试结果摘要，包括混淆矩阵\n  - `Precision_recall.png` 精确率-召回率曲线及其对应的AUC值\n  - `ROC.png` ROC曲线及其对应的AUC值\n  - `all_*.png` DRIVE测试数据集的20张预处理后的原始图像、真实标签及预测结果\n  - `sample_input_*.png` 预处理后训练图像的40个补丁样本及其对应的真实标签\n  - `test_Original_GroundTruth_Prediction*.png` 从上至下依次为预处理后的原始图像、真实标签和预测结果。在预测图像中，每个像素显示的是预测的血管概率，未应用任何阈值。\n\n下表将本方法与其他近期技术进行了比较，这些方法均已在DRIVE数据集上公布了其ROC曲线下面积（AUC ROC）指标。\n\n| 方法                  | AUC ROC on DRIVE |\n| ----------------------- |:----------------:|\n| Soares等 [1]        | .9614            |\n| Azzopardi等. [2]    | .9614            |\n| Osareh等  [3]       | .9650            |\n| Roychowdhury等. [4] | .9670            |\n| Fraz等.  [5]        | .9747            |\n| Qiaoliang等. [6]    | .9738            |\n| Melinscak等. [7]    | .9749            |\n| Liskowski等.^ [8]   | .9790            |\n| **本方法**         | **.9790**        |\n\n^ 不同的视野定义\n\n## 在DRIVE数据集上运行实验\n代码使用Python编写，可通过以下步骤在DRIVE数据集上复现该实验。\n\n\n### 先决条件\n神经网络基于Keras库开发，安装可参考[Keras官方仓库](https:\u002F\u002Fgithub.com\u002Ffchollet\u002Fkeras)。\n\n本代码已在Keras 1.1.0版本上测试完成，后端可选择Theano或TensorFlow。为避免维度不匹配问题，需在`~\u002F.keras\u002Fkeras.json`配置文件中设置`\"image_dim_ordering\": \"th\"`。若该文件不存在，可自行创建。更多细节请参阅Keras文档。\n\n所需依赖如下：\n- numpy >= 1.11.1\n- PIL >=1.1.7\n- opencv >=2.4.10\n- h5py >=2.6.0\n- ConfigParser >=3.5.0b2\n- scikit-learn >= 0.17.1\n\n\n此外，您还需要DRIVE数据集，可在下一节中了解其免费下载方式。\n\n### 训练\n\n首先，你需要 DRIVE 数据库。我们无法在此提供数据，但你可以从官方 [网站](http:\u002F\u002Fwww.isi.uu.nl\u002FResearch\u002FDatabases\u002FDRIVE\u002F) 下载 DRIVE 数据库。将图像解压到一个文件夹中，并将其命名为“DRIVE”，例如。该文件夹的目录结构应如下所示：\n```\nDRIVE\n│\n└───test\n|    ├───1st_manual\n|    └───2nd_manual\n|    └───images\n|    └───mask\n│\n└───training\n    ├───1st_manual\n    └───images\n    └───mask\n```\n有关数据的详细说明，请参阅 DRIVE 官方网站。\n\n为便于训练和测试，建议分别创建用于真值、掩码和图像的 HDF5 
数据集。\n\n在根目录下，只需运行以下命令：\n```\npython prepare_datasets_DRIVE.py\n```\n训练和测试用的 HDF5 数据集将被创建在 `.\u002FDRIVE_datasets_training_testing\u002F` 文件夹中。  \n注意：如果你为 DRIVE 文件夹指定了不同的名称，则需要在 `prepare_datasets_DRIVE.py` 文件中进行相应修改。\n\n接下来可以配置实验。所有设置均可在 `configuration.txt` 文件中指定，文件按以下部分组织：  \n**[数据路径]**  \n仅当您修改了 `prepare_datasets_DRIVE.py` 文件时，才需更改这些路径。  \n**[实验名称]**  \n为实验选择一个名称，系统将创建同名文件夹，用于存储所有结果及训练好的神经网络。  \n**[数据属性]**  \n网络是在原始完整图像的子图像（补丁）上进行训练的，请在此处指定补丁的尺寸。  \n**[训练设置]**  \n您可以在此处指定：  \n- *N_subimgs*：从原始完整图像中随机提取的补丁总数。此数值必须是 20 的倍数，因为每个原始训练图像都会提取相同数量的补丁。  \n- *inside_FOV*：选择是否仅在 FOV 内部完全选取补丁。如果同时选取包含掩码的补丁，神经网络也能正确学习如何排除 FOV 边缘区域。不过，这样会需要更多的补丁用于训练。  \n- *N_epochs*：训练轮数。  \n- *batch_size*：小批量大小。  \n- *nohup*：训练过程中的标准输出会被重定向并保存到日志文件中。\n\n完成所有参数配置后，即可通过以下命令开始训练神经网络：\n```\npython run_training.py\n```\n如果有可用的 GPU，程序将自动使用 GPU 进行计算。  \n以下文件将保存在与实验同名的文件夹中：  \n- 模型架构（json）  \n- 模型结构图（png）  \n- 配置文件副本  \n- 最后一 epoch 的模型权重（HDF5）  \n- 验证损失最低的最优 epoch 的模型权重（HDF5）  \n- 训练补丁及其对应真值的示例图片（png）\n\n### 评估训练好的模型\n训练好的模型将在 DRIVE 测试数据集上进行评估，该数据集包含 20 张图像，与训练集数量相同。\n\n测试参数可在 `configuration.txt` 文件的 [测试设置] 部分再次调整，具体说明如下：  \n**[测试设置]**  \n- *best_last*：选择用于测试数据集预测的模型：best 表示训练过程中验证损失最低的模型；last 表示最后一个 epoch 的模型。  \n- *full_images_to_test*：用于测试的完整图像数量，最多 20 张。  \n- *N_group_visual*：选择保存的图片中每行显示的图像数量。  \n- *average_mode*：如果为真，则每个像素的血管概率预测值将通过对覆盖同一像素的多个重叠补丁的预测概率取平均值得到。  \n- *stride_height*：仅在 average_mode 为 True 时有效。表示重叠补丁沿高度方向的步长，步长越小，生成的补丁数量越多。  \n- *stride_width*：与 stride_height 相同。  \n- *nohup*：预测过程中的标准输出会被重定向并保存到日志文件中。\n\n其中，**[实验名称]** 必须填写您要测试的实验名称，而 **[数据路径]** 则需填写测试数据集的路径。此时，**[训练设置]** 部分会被忽略。\n\n运行测试的命令如下：\n```\npython run_testing.py\n```\n如果有可用的 GPU，程序将自动使用 GPU 进行计算。  \n以下文件将保存在与实验同名的文件夹中：  \n- ROC 曲线（png）  \n- 精确率-召回率曲线（png）  \n- 所有测试预处理图像的图片（png）  \n- 所有对应分割真值的图片（png）  \n- 所有对应分割预测结果的图片（png）  \n- 一张或多张包含（从上到下）：原始预处理图像、真值、预测结果的图片  \n- 性能报告  \n\n所有结果仅针对属于 FOV 区域的像素，这些像素由 DRIVE 数据库中包含的掩码所选定。\n\n## STARE 数据库上的结果\n\n该神经网络也在另一个常用数据集 
[STARE](http:\u002F\u002Fcecas.clemson.edu\u002F~ahoover\u002Fstare\u002F) 上进行了测试。神经网络与使用 DRIVE 数据集的实验中所用的完全相同，但由于两个数据集之间的差异，代码和方法上仍需进行一些修改。  \nSTARE 数据集包含 20 张视网膜眼底图像，由两位不同的观察者提供了两组手动分割标注，其中第一组被视为真实标签。与 DRIVE 数据集不同的是，STARE 没有标准的训练集和测试集划分，因此实验采用了“留一法”进行。训练-测试循环重复了 20 次：每次迭代都会从训练集中排除一张图像，并将其用于测试。  \n预处理步骤与 DRIVE 数据集相同，从构成训练集的 19 张图像中，每张图像随机提取 9500 个 48×48 像素的补丁。在补丁提取过程中，FOV 外部区域也被纳入考虑。这些补丁中，90%（162450 个）用于训练，10%（18050 个）用于验证。训练参数（如 epoch 数、批大小等）与 DRIVE 实验中的设置一致。  \n每次测试都针对从训练集中单独留出的一张图像进行。与 DRIVE 数据集类似，每个像素的血管概率是通过对多个重叠补丁取平均得到的，这些补丁在宽度和高度方向上的步长均为 5 像素。仅 FOV 内的像素会被计入。由于 STARE 数据集没有提供掩码，本次通过在原始图像上应用颜色阈值来确定 FOV。\n\n下表展示了在 20 次不同训练中，以指定图像作为测试时所获得的 AUC ROC 结果。\n\n| STARE 图像 | AUC ROC |\n| ---------- |:-------:|\n| im0239.ppm | .9751 |\n| im0324.ppm | .9661 |\n| im0139.ppm | .9845 |\n| im0082.ppm | .9929 |\n| im0240.ppm | .9832 |\n| im0003.ppm | .9856 |\n| im0319.ppm | .9702 |\n| im0163.ppm | .9952 |\n| im0077.ppm | .9925 |\n| im0162.ppm | .9913 |\n| im0081.ppm | .9930 |\n| im0291.ppm | .9635 |\n| im0005.ppm | .9703 |\n| im0235.ppm | .9912 |\n| im0004.ppm | .9732 |\n| im0044.ppm | .9883 |\n| im0001.ppm | .9709 |\n| im0002.ppm | .9588 |\n| im0236.ppm | .9893 |\n| im0255.ppm | .9819 |\n\n__平均值：.9805 ± .0113__\n\n文件夹 `.\u002FSTARE_results` 包含所有预测结果。每张图像从上到下依次显示 STARE 数据集的预处理后原始图像、真实标签以及对应的预测结果。在预测图像中，每个像素显示的是预测的血管概率，未应用任何阈值。\n\n下表将本方法与其他近期技术在 STARE 数据集上的 AUC ROC 性能进行了比较。\n\n| 方法                  | STARE 上的 AUC ROC |\n| ----------------------- |:------------------:|\n| Soares 等人 [1]        | .9671           |\n| Azzopardi 等人 [2]    | .9563            |\n| Roychowdhury 等人 [4] | .9688            |\n| Fraz 等人 [5]        | .9768            |\n| Qiaoliang 等人 [6]    | .9879            |\n| Liskowski 等人^ [8]   | .9930            |\n| **本方法**         | **.9805**        |\n\n^ 不同的 FOV 定义\n\n## 参考文献\n\n[1] Soares 等人，“基于二维 Gabor 小波和监督分类的视网膜血管分割”，《IEEE 医学成像汇刊》，第 25 卷，第 9 期，第 1214–1222 页，2006 年。\n\n[2] Azzopardi 等人，“可训练的 cosfire 滤波器用于血管勾勒及其在视网膜图像中的应用”，《医学图像分析》，第 19 卷，第 1 期，第 46–57 页，2015 
年。\n\n[3] Osareh 等人，“视网膜彩色图像中的自动血管分割”，《伊朗科学技术期刊 B：工程》，第 33 卷，第 B2 期，第 191–206 页，2009 年。\n\n[4] Roychowdhury 等人，“通过主要血管提取和子图像分类实现眼底图像的血管分割”，《IEEE 生物医学与健康信息学杂志》，第 19 卷，第 3 期，第 1118–1128 页，2015 年。\n\n[5] Fraz 等人，“应用于视网膜血管分割的集成分类方法”，《IEEE 生物医学工程汇刊》，第 59 卷，第 9 期，第 2538–2548 页，2012 年。\n\n[6] Qiaoliang 等人，“用于视网膜图像血管分割的跨模态学习方法”，《IEEE 医学成像汇刊》，第 35 卷，第 1 期，第 109–118 页，2016 年。\n\n[7] Melinscak 等人，“利用深度神经网络进行视网膜血管分割”，《第 10 届国际计算机视觉理论与应用会议（VISIGRAPP 2015）论文集》（2015 年），第 577–582 页。\n\n[8] Liskowski 等人，“利用深度神经网络分割视网膜血管”，《IEEE 医学成像汇刊》，PP 卷，第 99 期，第 1–1 页，2016 年。\n\n## 致谢\n\n本研究得到了欧盟玛丽居里初始培训网络（ITN）“视网膜血管建模、测量与诊断”（REVAMMAD）项目的支持，项目编号为 316990。\n\n## 许可证\n\n本项目采用 MIT 许可证授权。\n\n版权所有 © 2016 Daniele Cortinovis, Orobix Srl (www.orobix.com)。","# Retina-UNet 快速上手指南\n\nRetina-UNET 是一个基于 U-Net 架构的卷积神经网络项目，用于视网膜眼底图像中的血管分割。该项目在 DRIVE 数据集上取得了领先的 ROC 曲线下面积（AUC）表现。\n\n## 环境准备\n\n本项目基于 Python 开发，依赖 Keras 深度学习框架。\n\n### 系统要求\n- **操作系统**: Linux \u002F macOS \u002F Windows\n- **Python 版本**: 建议 Python 2.7 或 3.5+ (原文基于较旧版本测试，建议使用兼容环境)\n- **硬件加速**: 推荐使用 NVIDIA GPU (如 GeForce GTX TITAN) 以加速训练，CPU 亦可运行但速度较慢。\n\n### 前置依赖\n请确保安装以下库及其最低版本要求：\n- `numpy` >= 1.11.1\n- `PIL` (Pillow) >= 1.1.7\n- `opencv` (cv2) >= 2.4.10\n- `h5py` >= 2.6.0\n- `ConfigParser` (Python 3 中为 `configparser`) >= 3.5.0b2\n- `scikit-learn` >= 0.17.1\n- `Keras` >= 1.1.0 (后端支持 Theano 或 TensorFlow)\n\n**重要配置**:\n为了避免维度不匹配错误，必须设置 Keras 后端图像维度顺序为 `\"th\"` (channels_first)。\n请在用户主目录下创建或编辑 `~\u002F.keras\u002Fkeras.json` 文件，内容如下：\n```json\n{\n    \"image_dim_ordering\": \"th\",\n    \"epsilon\": 1e-07,\n    \"floatx\": \"float32\",\n    \"backend\": \"tensorflow\"\n}\n```\n*(注：若使用国内网络，安装 Keras 及后端时可配置 pip 使用清华源或阿里源加速)*\n\n## 安装步骤\n\n1. **克隆代码仓库**\n   ```bash\n   git clone \u003Crepository_url>\n   cd retina-unet\n   ```\n\n2. 
**安装 Python 依赖**\n   建议使用虚拟环境，并通过 pip 安装所需包（国内用户可添加 `-i https:\u002F\u002Fpypi.tuna.tsinghua.edu.cn\u002Fsimple`）：\n   ```bash\n   pip install numpy pillow opencv-python h5py scikit-learn keras\n   # 注意：ConfigParser 在 Python 3 中内置，无需单独安装；若报错请尝试 pip install configparser\n   ```\n\n3. **准备 DRIVE 数据集**\n   本项目不包含数据，需手动下载 DRIVE 数据库。\n   - 访问官网下载：[DRIVE Database](http:\u002F\u002Fwww.isi.uu.nl\u002FResearch\u002FDatabases\u002FDRIVE\u002F)\n   - 解压后整理目录结构如下（假设文件夹命名为 `DRIVE`）：\n     ```text\n     DRIVE\n     │\n     ├───test\n     |    ├───1st_manual\n     |    ├───2nd_manual\n     |    ├───images\n     |    └───mask\n     │\n     └───training\n         ├───1st_manual\n         ├───images\n         └───mask\n     ```\n   - 将 `DRIVE` 文件夹放置在项目根目录下。\n\n4. **预处理数据**\n   运行脚本将图像转换为 HDF5 格式数据集：\n   ```bash\n   python prepare_datasets_DRIVE.py\n   ```\n   *注意：如果 DRIVE 文件夹名称不同，需修改 `prepare_datasets_DRIVE.py` 中的路径配置。生成的数据将保存在 `.\u002FDRIVE_datasets_training_testing\u002F`。*\n\n## 基本使用\n\n### 1. 配置实验参数\n编辑根目录下的 `configuration.txt` 文件。主要关注以下部分：\n- **[experiment name]**: 设置实验名称，结果将保存至同名文件夹。\n- **[data attributes]**: 设置补丁尺寸（默认 48x48）。\n- **[training settings]**:\n  - `N_subimgs`: 提取的补丁总数（必须是 20 的倍数，默认 190000）。\n  - `N_epochs`: 训练轮数（默认 150）。\n  - `batch_size`: 批大小（默认 32）。\n\n### 2. 训练模型\n执行训练脚本。若有可用 GPU，将自动调用。\n```bash\npython run_training.py\n```\n训练完成后，实验文件夹中将生成模型架构 (.json)、权重文件 (.h5) 及训练样本可视化图。\n\n### 3. 评估模型\n训练结束后，使用测试集评估模型性能。\n再次编辑 `configuration.txt`，重点修改 **[testing settings]** 部分：\n- `best_last`: 选择使用验证损失最小的模型 (`best`) 还是最后一轮的模型 (`last`)。\n- `average_mode`: 设为 `true` 以通过重叠补丁平均化预测结果，提升精度。\n- `stride_height` \u002F `stride_width`: 重叠步长（默认为 5）。\n\n运行测试脚本：\n```bash\npython run_testing.py\n```\n\n### 4. 
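View the results\nAfter testing finishes, open the folder named after the experiment to find:\n- `ROC.png`: the ROC curve and the AUC value.\n- `Precision_recall.png`: the precision-recall curve.\n- `test_Original_GroundTruth_Prediction*.png`: side-by-side comparisons of the original image, the ground-truth annotation, and the model prediction.\n- `performances.txt`: a performance summary that includes the confusion matrix.","An ophthalmology research team at a top-tier (Grade III-A) hospital is building an automated screening system for diabetic retinopathy and needs to extract vessel structures accurately from thousands of fundus photographs in order to quantify lesions such as microaneurysms.\n\n### Without retina-unet\n- Doctors have to trace vessel outlines by hand; a single image takes more than 20 minutes, and agreement between annotators is poor, which makes a high-quality ground-truth dataset hard to build.\n- Traditional image-processing algorithms (such as threshold segmentation) are highly sensitive to uneven illumination and noise and often mistake the edge of the optic nerve for vessels, producing many false positives in downstream lesion analysis.\n- With no end-to-end deep learning solution available, researchers have to build a U-Net architecture and tune its hyperparameters themselves; reproducing results on public datasets such as DRIVE is very difficult, and the development cycle stretches by months.\n\n### With retina-unet\n- A pretrained convolutional neural network performs pixel-wise binary classification automatically and outputs a high-accuracy vessel mask within seconds per image, removing the manual tracing step and keeping annotations objective and consistent.\n- The built-in preprocessing pipeline of CLAHE contrast enhancement and gamma correction suppresses background noise in fundus images and, combined with prediction over overlapping patches, markedly improves accuracy near the field-of-view boundary.\n- The best-performing weights trained on the DRIVE database can be reused directly, reaching a leading area under the ROC curve (AUC) without tuning from scratch, so the team can focus immediately on the higher-level diagnostic logic.\n\nBy providing validated, high-accuracy vessel segmentation, retina-unet moves ophthalmic image analysis from laborious hand-crafted feature engineering to automated diagnosis.","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Forobix_retina-unet_423f9c23.png","orobix","Orobix","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Forobix_1acfd248.png","",null,"info@orobix.com","www.orobix.com","https:\u002F\u002Fgithub.com\u002Forobix",[84],{"name":85,"color":86,"percentage":87},"Python","#3572A5",100,1350,466,"2026-04-01T14:01:46",4,"Not specified","Not strictly required but recommended (the code detects and uses an available GPU automatically). The documented test environment was a GeForce GTX TITAN, on which training took about 20 hours. GPU use requires a backend (Theano or TensorFlow) configured with CUDA support.",{"notes":95,"python":96,"dependencies":97},"1. The DRIVE or STARE dataset must be downloaded manually; the project does not provide data.\n2. Key configuration: set \"image_dim_ordering\": \"th\" (channels_first) in ~\u002F.keras\u002Fkeras.json, otherwise dimension-mismatch errors occur.\n3. The project was developed against the older Keras 1.1.0 and may conflict with modern deep learning environments; running it in a dedicated virtual environment is recommended.\n4. 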
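Training requires extracting a large number of patches from the original images; the preprocessing script generates datasets in HDF5 format.","Not specified (inferred from the dependency versions to be Python 2.7 or an early Python 3 release)",[98,99,100,101,102,103,104,105],"keras==1.1.0","numpy>=1.11.1","PIL>=1.1.7","opencv>=2.4.10","h5py>=2.6.0","ConfigParser>=3.5.0b2","scikit-learn>=0.17.1","theano or tensorflow (as the Keras backend)",[14],"2026-03-27T02:49:30.150509","2026-04-06T05:44:09.815574",[110,115,120,125,130,135],{"id":111,"question_zh":112,"answer_zh":113,"source_url":114},16876,"Is the code compatible with Keras 2? What should I do about a shape mismatch in \"concat\" mode or a visualize_util import error?","The original code was written for Keras 1.1, and running it unmodified on Keras 2 raises errors (such as a concat shape mismatch or a missing visualize_util module). The maintainers have confirmed that the code has been updated for Keras 2 compatibility. If you hit these errors, make sure you have pulled the latest version. When migrating older code, the main changes are replacing the `merge` layer with `concatenate` and replacing `keras.utils.visualize_util` with `keras.utils.vis_utils`.","https:\u002F\u002Fgithub.com\u002Forobix\u002Fretina-unet\u002Fissues\u002F31",{"id":116,"question_zh":117,"answer_zh":118,"source_url":119},16877,"How do I obtain the DRIVE retinal vessel segmentation dataset? What if the official link is unreachable or returns 404?","The DRIVE dataset is downloaded by request from the official site. Go to http:\u002F\u002Fwww.isi.uu.nl\u002FResearch\u002FDatabases\u002FDRIVE\u002Fdownload.php, enter a valid email address, and click send. Within a few minutes the system sends an email containing a unique verification code, which can then be used on the page to download an archive of about 30 MB (containing the original TIFF images, the segmentation masks in GIF format, and the field-of-view masks). If the page is temporarily unreachable, the problem may be on the server side; try again later or check your spam folder.","https:\u002F\u002Fgithub.com\u002Forobix\u002Fretina-unet\u002Fissues\u002F20",{"id":121,"question_zh":122,"answer_zh":123,"source_url":124},16878,"Is this network architecture suitable for other semantic segmentation tasks (such as ultrasound nerve segmentation)? What changes are needed?","Although retinal vessel segmentation (tree-like structures) and ultrasound nerve segmentation (blob-like structures) involve different features, the U-Net architecture is generally applicable to both. For a different task, the main recommended changes are in preprocessing: remove low-frequency contrast variation and normalize intensity locally so that every patch has similar intensity statistics, which emphasizes local variation. Adaptive histogram equalization is one technique to consider. For multi-class tasks, the number of output channels and the loss function can be changed following the FCN approach.","https:\u002F\u002Fgithub.com\u002Forobix\u002Fretina-unet\u002Fissues\u002F13",{"id":126,"question_zh":127,"answer_zh":128,"source_url":129},16879,"Is the radius computation in the `is_patch_inside_FOV` function correct?","It is now; the earlier code contained an error. The correct logic subtracts half of the patch diagonal rather than its full length. The original `R_inside = 270 - int(patch_h*1.42)` has been corrected. Because the diagonal of a square patch is `patch_h * sqrt(2)`, the limiting radius should be `270 - (patch_h * sqrt(2) \u002F 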
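2)`. The maintainers updated the code in response to community feedback to fix this mathematical error.","https:\u002F\u002Fgithub.com\u002Forobix\u002Fretina-unet\u002Fissues\u002F19",{"id":131,"question_zh":132,"answer_zh":133,"source_url":134},16880,"How can this project be used for multi-class classification or for color data?","For multi-class tasks, follow the approach of \"Fully Convolutional Networks for Semantic Segmentation\": change the number of channels in the output layer (num_output) to match the number of classes and adjust the loss function accordingly. For multi-class problems with grayscale labels, one user suggested switching the loss function to MMSE and convolving the final layer down to a single channel, but whether that works depends on how the labels are encoded. If the dataset needs color information, make sure the input preprocessing pipeline supports multi-channel (RGB) input rather than only the current single-channel grayscale processing.","https:\u002F\u002Fgithub.com\u002Forobix\u002Fretina-unet\u002Fissues\u002F22",{"id":136,"question_zh":137,"answer_zh":138,"source_url":124},16881,"Why are grayscale images used for training? What exactly does removing “slower trends” in preprocessing mean?","Grayscale images are used because retinal images (except those from a very few recent cameras) are essentially single-channel, with color usually synthesized afterwards, so the information itself has one channel. The phrase “removing slower trends across patches” in the preprocessing discussion mainly refers to eliminating uneven illumination or low-frequency contrast variation. In practice this usually involves local intensity normalization, so that each patch has intensity statistics similar to those of the other patches, which brings out the local vessel structure.",[]]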