In computer science, landmark detection is the process of finding significant landmarks in an image. This originally referred to finding landmarks for navigational purposes, for instance in robot vision or creating maps from satellite images. Methods used in navigation have been extended to other fields, notably facial recognition, where they are used to identify key points on a face. Landmark detection also has important applications in medicine, identifying anatomical landmarks in medical images.

Applications

Navigation

Facial landmarks

Finding facial landmarks is an important step in facial identification of people in an image. Facial landmarks can also be used to extract information about the mood and intention of the person. Methods used fall into three categories: holistic methods, constrained local model methods, and regression-based methods. Holistic methods are pre-programmed with statistical information on face shape and landmark location coefficients. The classic holistic method is the active appearance model (AAM), introduced in 1998. Since then there have been a number of extensions and improvements to the method. These are largely improvements to the fitting algorithm and can be classified into two groups: analytical fitting methods and learning-based fitting methods. Analytical methods apply nonlinear optimization methods such as the Gauss–Newton algorithm. This algorithm is very slow, but better alternatives have been proposed, such as the project-out inverse compositional (POIC) algorithm and the simultaneous inverse compositional (SIC) algorithm. Learning-based fitting methods use machine learning techniques to predict the facial coefficients. These can use linear regression, nonlinear regression, and other fitting methods. In general, the analytical fitting methods are more accurate and do not need training, while the learning-based fitting methods are faster but need to be trained. Other extensions to the basic AAM method analyse wavelets in the image rather than pixel intensity. This helps with fitting unseen parts of the face, which the basic AAM finds troublesome.
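The analytical fitting idea can be illustrated with a minimal Gauss–Newton sketch. This is not an AAM implementation: as a simplifying assumption, it fits only a similarity transform (scale, rotation, translation) of a fixed mean shape to observed landmark coordinates, whereas a real AAM also fits shape and appearance coefficients against image pixels. The function names `transform`, `jacobian`, and `gauss_newton_fit`, and the toy data, are hypothetical.

```python
import numpy as np

def transform(shape, p):
    """Apply a similarity transform p = (scale, angle, tx, ty) to an (n, 2) shape."""
    s, theta, tx, ty = p
    c, sn = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -sn], [sn, c]])
    return s * shape @ rot.T + np.array([tx, ty])

def jacobian(shape, p):
    """Jacobian of the flattened transformed shape with respect to p."""
    s, theta, _, _ = p
    c, sn = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -sn], [sn, c]])
    drot = np.array([[-sn, -c], [c, -sn]])  # derivative of rot w.r.t. theta
    J = np.zeros((2 * shape.shape[0], 4))
    J[:, 0] = (shape @ rot.T).ravel()        # d/d scale
    J[:, 1] = (s * shape @ drot.T).ravel()   # d/d angle
    J[0::2, 2] = 1.0                         # d/d tx affects x rows only
    J[1::2, 3] = 1.0                         # d/d ty affects y rows only
    return J

def gauss_newton_fit(mean_shape, observed, p0, iterations=20, tol=1e-10):
    """Minimise the sum of squared landmark residuals with Gauss-Newton steps."""
    p = np.asarray(p0, dtype=float)
    for _ in range(iterations):
        r = (transform(mean_shape, p) - observed).ravel()
        J = jacobian(mean_shape, p)
        # Solve the linearised least-squares problem J @ step = -r.
        step, *_ = np.linalg.lstsq(J, -r, rcond=None)
        p = p + step
        if np.linalg.norm(step) < tol:
            break
    return p

# Toy data: a five-point "mean shape" and a transformed copy of it.
mean_shape = np.array([[-1.0, -1.0], [1.0, -1.0], [0.0, 0.0], [-0.5, 1.0], [0.5, 1.0]])
true_p = np.array([1.5, 0.3, 2.0, -1.0])
observed = transform(mean_shape, true_p)
fitted = gauss_newton_fit(mean_shape, observed, p0=[1.0, 0.0, 0.0, 0.0])
print(fitted)
```

Each iteration linearises the residual around the current parameters and solves a linear least-squares problem. No training is needed, which matches the trade-off described above, but the per-iteration cost grows with the size of the Jacobian.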

Medical images

Cephalometry

Fashion

Landmark detection in fashion images is used for classification. This aids in the retrieval of images with specified features from a database or general search. An example of a fashion landmark is the location of the hemline of a dress. Fashion landmark detection is particularly difficult due to the extreme deformation that can occur in clothing. Some classical methods of feature detection, such as the scale-invariant feature transform, have been used in the past. However, it is now more common to use deep learning methods. This has been helped enormously by the publication of a number of large fashion datasets that can be used for training. These methods include regression-based models, constraint-based models, and attentive models. The particular problem of fashion landmark detection (deformation) has led to pose estimation models, which detect and take into account the pose of the model wearing the clothes.

Methods

There are several algorithms for locating landmarks in images. Nowadays the task is usually solved using artificial neural networks, and especially deep learning algorithms, but evolutionary algorithms such as particle swarm optimization can also be useful for this task.

Deep Learning

Deep learning has had a significant impact on autonomous facial landmark detection by enabling more accurate and efficient detection of landmarks in real-world photos. With traditional computer vision techniques, detecting facial landmarks could be challenging due to variations in lighting, head position, and occlusion. Convolutional neural networks (CNNs) have revolutionized landmark detection by allowing computers to learn features from large datasets of images. By training a CNN on a dataset of images with labeled facial landmarks, the algorithm can learn to detect these landmarks in new images with high accuracy, even when they appear in different lighting conditions, at different angles, or in partially occluded views. In particular, solutions based on this approach have achieved real-time efficiency on mobile devices' GPUs and have found use in augmented reality applications.

Evolutionary algorithm

Evolutionary algorithms try, at the training stage, to learn how to determine landmarks correctly. Training is an iterative process. After the last iteration, the result is a system that can determine the landmark to a certain accuracy. In particle swarm optimization, a set of particles searches for landmarks, and each particle applies an update formula in each iteration to optimize landmark detection.
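The particle update can be sketched as follows. The text does not specify the formula, so this example assumes the standard particle swarm velocity update with inertia and two attraction terms. It also simplifies the setting: instead of training a detector, the particles search a synthetic image directly for a single landmark, modelled here as the brightest point of a Gaussian blob. All names and parameter values (`make_image`, `pso_landmark`, `inertia`, `c1`, `c2`) are illustrative assumptions.

```python
import numpy as np

def make_image(size=64, center=(40.0, 25.0), sigma=10.0):
    """Synthetic image with one bright blob standing in for a landmark."""
    ys, xs = np.mgrid[0:size, 0:size]
    cx, cy = center
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * sigma ** 2))

def cost(image, pos):
    """Negative intensity at the nearest pixel, so lower cost is better."""
    h, w = image.shape
    x = int(np.clip(np.rint(pos[0]), 0, w - 1))
    y = int(np.clip(np.rint(pos[1]), 0, h - 1))
    return -image[y, x]

def pso_landmark(image, n_particles=40, iterations=100,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Search for the (x, y) position with the lowest cost using PSO."""
    rng = np.random.default_rng(seed)
    h, w = image.shape
    upper = np.array([w - 1, h - 1], dtype=float)
    pos = rng.uniform(0, upper, size=(n_particles, 2))
    vel = np.zeros_like(pos)
    pbest = pos.copy()                                  # best position per particle
    pbest_cost = np.array([cost(image, p) for p in pos])
    gbest = pbest[np.argmin(pbest_cost)].copy()         # best position of the swarm
    for _ in range(iterations):
        r1 = rng.random((n_particles, 1))
        r2 = rng.random((n_particles, 1))
        # Velocity update: inertia + pull to personal best + pull to global best.
        vel = inertia * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, 0, upper)
        costs = np.array([cost(image, p) for p in pos])
        improved = costs < pbest_cost
        pbest[improved] = pos[improved]
        pbest_cost[improved] = costs[improved]
        gbest = pbest[np.argmin(pbest_cost)].copy()
    return gbest

image = make_image()
found = pso_landmark(image)
print(found)
```

In a practical system the cost function would compare image content around each candidate position with a learned or template description of the landmark. The intensity-based cost here only keeps the sketch self-contained.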
