Machine learning control (MLC) is a subfield of machine learning, intelligent control, and control theory which aims to solve optimal control problems with machine learning methods. Key applications are complex nonlinear systems for which linear control theory methods are not applicable.
Types of problems and tasks
Four types of problems are commonly encountered:
* Control parameter identification: MLC translates to a parameter identification problem if the structure of the control law is given but the parameters are unknown (Thomas Bäck & Hans-Paul Schwefel, Spring 1993, "An overview of evolutionary algorithms for parameter optimization", Journal of Evolutionary Computation (MIT Press), vol. 1, no. 1, pp. 1-23). One example is the genetic algorithm for optimizing coefficients of a PID controller (N. Benard, J. Pons-Prats, J. Periaux, G. Bugeda, J.-P. Bonnet & E. Moreau, 2015, "Multi-Input Genetic Algorithm for Experimental Optimization of the Reattachment Downstream of a Backward-Facing Step with Surface Plasma Actuator", Paper AIAA 2015-2957 at 46th AIAA Plasmadynamics and Lasers Conference, Dallas, TX, USA, pp. 1-23) or discrete-time optimal control.
* Control design as regression problem of the first kind: MLC approximates a general nonlinear mapping from sensor signals to actuation commands, if the sensor signals and the optimal actuation command are known for every state. One example is the computation of sensor feedback from a known full state feedback. Neural networks are commonly used for such tasks.
* Control design as regression problem of the second kind: MLC may also identify arbitrary nonlinear control laws which minimize the cost function of the plant. In this case, neither a model, the control law structure, nor the optimizing actuation command needs to be known. The optimization is only based on the control performance (cost function) as measured in the plant. Genetic programming is a powerful regression technique for this purpose.
* Reinforcement learning control: The control law may be continually updated over measured performance changes (rewards) using reinforcement learning.
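The first problem type above, tuning the parameters of a fixed control law, can be illustrated with a minimal sketch of a genetic algorithm searching PID gains. Everything specific here is an assumption made for illustration and is not taken from the cited studies: the toy first-order plant `dy/dt = -y + u`, the actuator saturation, the gain ranges, and the helper names `closed_loop_cost` and `genetic_pid_search`. The cost is evaluated only from the simulated closed-loop response, mirroring how MLC scores a controller by its measured performance.

```python
import random


def closed_loop_cost(gains, steps=200, dt=0.05):
    """Integrated squared tracking error of a PID loop on an assumed toy plant."""
    kp, ki, kd = gains
    y, integral, prev_err, cost = 0.0, 0.0, 1.0, 0.0
    for _ in range(steps):
        err = 1.0 - y                          # unit step reference
        integral += err * dt
        deriv = (err - prev_err) / dt
        u = kp * err + ki * integral + kd * deriv
        u = max(-10.0, min(10.0, u))           # assumed actuator saturation
        y += dt * (-y + u)                     # plant dy/dt = -y + u, explicit Euler
        prev_err = err
        cost += err * err * dt
    return cost


def genetic_pid_search(pop_size=30, generations=40, seed=0):
    """Evolve (Kp, Ki, Kd) with tournament selection, blend crossover, mutation."""
    rng = random.Random(seed)
    bounds = [(0.0, 10.0), (0.0, 10.0), (0.0, 0.5)]   # assumed ranges for Kp, Ki, Kd
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    history = []                                       # best cost per generation
    for _ in range(generations):
        scored = sorted((closed_loop_cost(g), g) for g in pop)
        history.append(scored[0][0])
        next_pop = [scored[0][1][:], scored[1][1][:]]  # elitism: keep the two best
        while len(next_pop) < pop_size:
            _, p1 = min(rng.sample(scored, 3))         # tournament of size 3
            _, p2 = min(rng.sample(scored, 3))
            w = rng.random()                           # blend crossover weight
            child = []
            for a, b, (lo, hi) in zip(p1, p2, bounds):
                g = w * a + (1.0 - w) * b + rng.gauss(0.0, 0.05 * (hi - lo))
                child.append(min(hi, max(lo, g)))      # clip mutated gain to its range
            next_pop.append(child)
        pop = next_pop
    best_cost, best = min((closed_loop_cost(g), g) for g in pop)
    return best, best_cost, history


if __name__ == "__main__":
    best, best_cost, history = genetic_pid_search()
    print("best gains (Kp, Ki, Kd):", best)
    print("best cost:", best_cost)
```

Because the two best individuals are copied unchanged into each new generation, the best cost recorded in `history` can never increase. In an experiment, `closed_loop_cost` would be replaced by a measurement on the plant itself.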
Adaptive Dynamic Programming
Adaptive Dynamic Programming (ADP), also known as approximate dynamic programming or neuro-dynamic programming, is a machine learning control method that combines reinforcement learning with dynamic programming to solve optimal control problems for complex systems. ADP addresses the "curse of dimensionality" in traditional dynamic programming by approximating value functions or control policies using parametric structures such as neural networks. The core idea revolves around learning a control policy that minimizes a long-term cost function $J$, defined as
$$J(x(t)) = \int_t^{\infty} e^{-\gamma(\tau - t)}\, r(x(\tau), u(\tau))\, d\tau,$$
where $x$ is the system state, $u$ is the control input, $r$ is the instantaneous reward, and $\gamma$ is a discount factor. ADP employs two interacting components: a critic that estimates the value function $V(x) \approx J(x)$, and an actor that updates the control policy $u(x)$. The critic and actor are trained iteratively using temporal difference learning or gradient descent to satisfy the Hamilton-Jacobi-Bellman (HJB) equation:
$$\min_{u} \left[ r(x, u) + \frac{\partial V}{\partial x} f(x, u) \right] = \gamma V(x),$$
where $\dot{x} = f(x, u)$ describes the system dynamics. Key variants include heuristic dynamic programming (HDP), dual heuristic programming (DHP), and globalized dual heuristic programming (GDHP).
ADP has been applied to robotics, power systems, and autonomous vehicles, offering a data-driven framework for near-optimal control without requiring full system models. Challenges remain in ensuring stability guarantees and convergence for general nonlinear systems.
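The critic and actor interplay described in this section can be sketched on a problem small enough to check by hand. The sketch below is a simplification under stated assumptions, not a reference ADP implementation: it uses the discrete-time Bellman equation rather than the continuous-time HJB equation, a known scalar linear model, a quadratic stage cost, and one-parameter critic and actor (`V(x) = p*x^2`, `u(x) = -k*x`) in place of neural networks. The constants `A`, `B`, `Q`, `R`, `GAMMA` and the function names are all hypothetical.

```python
import random

A, B = 0.9, 0.5      # assumed toy dynamics: x_next = A*x + B*u
Q, R = 1.0, 1.0      # assumed stage cost: r(x, u) = Q*x^2 + R*u^2
GAMMA = 0.95         # discount factor


def train_actor_critic(steps=5000, alpha=0.5, beta=0.1, seed=0):
    """Alternate temporal-difference critic steps and gradient actor steps."""
    rng = random.Random(seed)
    p, k = 0.0, 0.0                       # critic V(x) = p*x^2, actor u(x) = -k*x
    for _ in range(steps):
        x = rng.uniform(-1.0, 1.0)        # sampled state
        u = -k * x
        x_next = A * x + B * u
        # Critic: move V(x) toward the one-step target r + GAMMA * V(x_next).
        td_error = Q * x * x + R * u * u + GAMMA * p * x_next ** 2 - p * x * x
        p += alpha * td_error * x * x     # dV/dp = x^2
        # Actor: gradient descent on r(x, u) + GAMMA * V(f(x, u)) with respect to k.
        dq_du = 2.0 * R * u + 2.0 * GAMMA * p * x_next * B
        k -= beta * dq_du * (-x)          # chain rule with du/dk = -x
    return p, k


def riccati_reference(iterations=500):
    """Value iteration on the discounted scalar Riccati equation, for comparison."""
    p = 0.0
    for _ in range(iterations):
        p = Q + GAMMA * A * A * p - (GAMMA * A * B * p) ** 2 / (R + GAMMA * B * B * p)
    k = GAMMA * A * B * p / (R + GAMMA * B * B * p)
    return p, k


if __name__ == "__main__":
    print("learned  (p, k):", train_actor_critic())
    print("Riccati  (p, k):", riccati_reference())
```

In this linear-quadratic setting the quadratic critic can represent the true value function exactly, so the learned pair can be compared directly with the Riccati solution. With neural-network approximators and nonlinear dynamics no such closed-form check is available, which is where the stability and convergence challenges arise.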
Applications
MLC has been successfully applied to many nonlinear control problems, exploring unknown and often unexpected actuation mechanisms. Example applications include:
* spacecraft attitude control,
* thermal control of buildings,
* feedback control of turbulence,
* and remotely operated underwater vehicles.
Many more engineering MLC applications are summarized in the review article by Peter J. Fleming & R. C. Purshouse (2002), "Evolutionary algorithms in control systems engineering: a survey", Control Engineering Practice, vol. 10, no. 11, pp. 1223-1241.
As is the case for all general nonlinear methods, MLC does not guarantee convergence, optimality, or robustness for a range of operating conditions.
See also
* Reinforcement learning
References
Further reading
* Dimitris C Dracopoulos (August 1997) "Evolutionary Learning Algorithms for Neural Adaptive Control", Springer.
* Thomas Duriez, Steven L. Brunton & Bernd R. Noack (November 2016) "Machine Learning Control - Taming Nonlinear Dynamics and Turbulence", Springer.