U-Net is a convolutional neural network that was developed for biomedical image segmentation at the Computer Science Department of the University of Freiburg. The network is based on the fully convolutional network, and its architecture was modified and extended to work with fewer training images and to yield more precise segmentations. Segmentation of a 512 × 512 image takes less than a second on a modern GPU.

Description

The U-Net architecture stems from the so-called “fully convolutional network” first proposed by Long, Shelhamer, and Darrell. The main idea is to supplement a usual contracting network with successive layers in which pooling operations are replaced by upsampling operators. These layers therefore increase the resolution of the output. A successive convolutional layer can then learn to assemble a precise output based on this information. One important modification in U-Net is that the upsampling part has a large number of feature channels, which allow the network to propagate context information to higher-resolution layers. As a consequence, the expansive path is more or less symmetric to the contracting path, which yields a u-shaped architecture. The network has no fully connected layers and uses only the valid part of each convolution. To predict the pixels in the border region of the image, the missing context is extrapolated by mirroring the input image. This tiling strategy is important for applying the network to large images, since otherwise the resolution would be limited by the GPU memory.
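The effect of using only valid convolutions can be traced with a little arithmetic. The sketch below assumes the layer layout of the original paper (two unpadded 3×3 convolutions per level, 2×2 max pooling, 2×2 up-convolution, four levels), which this section does not itself specify; the function names are hypothetical. It computes the output size for a given input tile and mirror-pads a tile with NumPy so that border pixels receive the context they would otherwise lack.

```python
import numpy as np


def valid_unet_output_size(input_size: int, depth: int = 4) -> int:
    """Spatial size of the output of a U-Net built from unpadded convolutions.

    Assumed layout (not stated in the text above): two 3x3 valid convolutions
    per level, 2x2 max pooling going down, 2x2 up-convolution going up.
    """
    size = input_size
    for _ in range(depth):
        size -= 4  # two valid 3x3 convolutions lose 2 pixels each
        if size <= 0 or size % 2:
            raise ValueError("input size does not fit this layout")
        size //= 2  # 2x2 max pooling halves the resolution
    size -= 4  # two convolutions at the lowest resolution
    for _ in range(depth):
        size = size * 2 - 4  # up-convolution doubles, two convolutions shrink
    if size <= 0:
        raise ValueError("input size is too small for this layout")
    return size


def mirror_pad(tile: np.ndarray, margin: int) -> np.ndarray:
    """Extrapolate missing border context by mirroring a 2D tile."""
    return np.pad(tile, margin, mode="reflect")


input_size = 572
output_size = valid_unet_output_size(input_size)
margin = (input_size - output_size) // 2  # context consumed on each side

# A tile at the image border: mirror it so the network input has full size.
tile = np.arange(output_size * output_size, dtype=np.float32)
tile = tile.reshape(output_size, output_size)
padded = mirror_pad(tile, margin)

print(output_size, margin, padded.shape)  # 388 92 (572, 572)
```

Under these assumptions a 572-pixel input tile yields a 388-pixel output, so each tile needs 92 pixels of surrounding context. Tiles in the interior of a large image would take that context from neighbouring pixels; only tiles at the image border need mirroring.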

History

U-Net was created by Olaf Ronneberger, Philipp Fischer, and Thomas Brox in 2015 and reported in the paper “U-Net: Convolutional Networks for Biomedical Image Segmentation”. It improves on and extends the fully convolutional network (FCN) described by Evan Shelhamer, Jonathan Long, and Trevor Darrell in "Fully convolutional networks for semantic segmentation" (2014).

Network architecture

The network consists of a contracting path and an expansive path, which give it the u-shaped architecture. The contracting path is a typical convolutional network that consists of the repeated application of convolutions, each followed by a rectified linear unit (ReLU) and a max pooling operation. During the contraction, the spatial information is reduced while feature information is increased. The expansive path combines the feature and spatial information through a sequence of up-convolutions and concatenations with high-resolution features from the contracting path.

Example architecture of U-Net for producing k 256-by-256 image masks for a 256-by-256 RGB image
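How the two paths exchange information can be illustrated on array shapes alone. The NumPy sketch below is a shape-level illustration, not a trainable network: the learned convolutions and up-convolutions are replaced by simple stand-ins, and the function names and the channel count of 64 are assumptions. It shows max pooling halving the resolution while the channel count grows, and the concatenation step bringing high-resolution features from the contracting path back into the expansive path.

```python
import numpy as np


def max_pool_2x2(x: np.ndarray) -> np.ndarray:
    """2x2 max pooling with stride 2 on a (channels, height, width) array."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).max(axis=(2, 4))


def upsample_2x(x: np.ndarray) -> np.ndarray:
    """Nearest-neighbour 2x upsampling; stand-in for a learned up-convolution."""
    return x.repeat(2, axis=1).repeat(2, axis=2)


def double_channels(x: np.ndarray) -> np.ndarray:
    """Stand-in for convolutions that double the number of feature channels."""
    return np.concatenate([x, x], axis=0)


def halve_channels(x: np.ndarray) -> np.ndarray:
    """Stand-in for an up-convolution that halves the number of channels."""
    return x[: x.shape[0] // 2]


rng = np.random.default_rng(0)
# High-resolution features from the contracting path (64 channels assumed).
features = rng.random((64, 256, 256), dtype=np.float32)

# Contracting step: less spatial information, more feature channels.
down = double_channels(max_pool_2x2(features))

# Expansive step: upsample, reduce channels, then concatenate the
# high-resolution features from the contracting path (skip connection).
up = halve_channels(upsample_2x(down))
merged = np.concatenate([features, up], axis=0)

print(down.shape, up.shape, merged.shape)
```

In a real U-Net the concatenated tensor would be passed through further convolutions, which learn how to combine the coarse context with the fine spatial detail.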

Applications

There are many applications of U-Net in biomedical image segmentation, such as brain image segmentation ("BRATS") and liver image segmentation ("SLIVER07"), as well as protein binding site prediction. Variations of U-Net have also been applied to medical image reconstruction. Some variants and applications of U-Net are:

1. Pixel-wise regression using U-Net and its application on pansharpening
2. 3D U-Net: Learning Dense Volumetric Segmentation from Sparse Annotation
3. TernausNet: U-Net with VGG11 Encoder Pre-Trained on ImageNet for Image Segmentation
4. Image-to-image translation to estimate fluorescent stains
5. Binding site prediction in protein structures

Implementations

jakeret (2017): "Tensorflow Unet"

U-Net source code from Pattern Recognition and Image Processing at the Computer Science Department of the University of Freiburg, Germany
