Dual Control Theory
Dual control theory is a branch of control theory that deals with the control of systems whose characteristics are initially unknown. It is called ''dual'' because in controlling such a system the controller's objectives are twofold:
(1) Action: to control the system as well as possible based on current knowledge of it.
(2) Investigation: to experiment with the system so as to learn about its behavior and control it better in the future.
These two objectives may be partly in conflict; in the context of reinforcement learning this is known as the exploration–exploitation trade-off (as in the multi-armed bandit problem). Dual control theory was developed by Alexander Aronovich Fel'dbaum in 1960. He showed that in principle the optimal solution can be found by dynamic programming, but this is often impractical; as a result, a number of methods for designing sub-optimal dual controllers have been devised. Example: To use an analogy ...
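The two objectives can be illustrated with a minimal sub-optimal dual controller for a scalar plant y = b·u + noise with unknown gain b. This is a hypothetical toy model, not Fel'dbaum's original formulation: the input mixes a certainty-equivalence term (action) with a probing term scaled by the current uncertainty in the gain estimate (investigation).

```python
import random

def run_dual_controller(b_true=2.0, setpoint=1.0, steps=50, seed=0):
    """Sub-optimal dual controller for the scalar plant y = b*u + noise
    with unknown gain b.  The input has two parts: a certainty-equivalence
    term (action) and a probing term that shrinks as the variance p of the
    gain estimate shrinks (investigation)."""
    rng = random.Random(seed)
    b_hat, p = 1.0, 10.0          # prior mean and variance of the unknown gain
    r = 0.1                       # measurement-noise variance
    errors = []
    for _ in range(steps):
        probe = (p ** 0.5) * rng.gauss(0.0, 0.1)   # investigate while uncertain
        u = setpoint / b_hat + probe               # act on current knowledge
        y = b_true * u + rng.gauss(0.0, r ** 0.5)
        # recursive least-squares update of the gain estimate from (u, y)
        k = p * u / (r + p * u * u)
        b_hat += k * (y - b_hat * u)
        p *= r / (r + p * u * u)
        errors.append(abs(y - setpoint))
    return b_hat, errors
```

As the estimate converges, the probing term fades and the controller behaves like a standard certainty-equivalence regulator; this trade-off is exactly the action/investigation conflict described above.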



Control Theory
Control theory is a field of mathematics that deals with the control of dynamical systems in engineered processes and machines. The objective is to develop a model or algorithm governing the application of system inputs to drive the system to a desired state, while minimizing any ''delay'', ''overshoot'', or ''steady-state error'' and ensuring a level of control stability, often with the aim of achieving a degree of optimality. To do this, a controller with the requisite corrective behavior is required. This controller monitors the controlled process variable (PV) and compares it with the reference or set point (SP). The difference between the actual and desired values of the process variable, called the ''error'' signal, or SP-PV error, is applied as feedback to generate a control action that brings the controlled process variable to the same value as the set point. Other aspects that are also studied are controllability and observability. Control theory is used in control system eng ...
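The SP-PV feedback loop can be sketched with a proportional controller acting on a toy integrator plant (an illustrative model, not from the article): each step, the error signal sp - pv generates a corrective action that drives the process variable toward the set point.

```python
def p_controller(sp, pv0, kp=0.5, steps=30):
    """Proportional feedback on a toy integrator plant (pv += u):
    each step the SP-PV error is turned into a corrective action
    that moves the process variable toward the set point."""
    pv = pv0
    history = [pv]
    for _ in range(steps):
        error = sp - pv        # SP-PV error signal
        u = kp * error         # proportional control action
        pv += u                # plant response
        history.append(pv)
    return history
```

For this particular plant the residual error shrinks by a factor (1 - kp) per step, so the loop converges for 0 < kp < 2, overshoots (alternating error) for kp > 1, and is unstable beyond kp = 2.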



Reinforcement Learning
Reinforcement learning (RL) is an area of machine learning concerned with how intelligent agents ought to take actions in an environment in order to maximize a notion of cumulative reward. Reinforcement learning is one of the three basic machine learning paradigms, alongside supervised learning and unsupervised learning. Reinforcement learning differs from supervised learning in not needing labelled input/output pairs to be presented, and in not needing sub-optimal actions to be explicitly corrected. Instead, the focus is on finding a balance between exploration (of uncharted territory) and exploitation (of current knowledge). The environment is typically stated in the form of a Markov decision process (MDP), because many reinforcement learning algorithms for this context use dynamic programming techniques. The main difference between classical dynamic programming methods and reinforcement learning algorithms is that the latter do not assume knowledge of an exact mathematica ...
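The model-free aspect can be sketched with tabular Q-learning on a toy chain MDP (a hypothetical example, not from the article): the agent balances exploration and exploitation with an epsilon-greedy rule and learns action values from sampled transitions alone, without the transition model that classical dynamic programming would require.

```python
import random

def q_learning(n_states=4, episodes=200, alpha=0.5, gamma=0.9, eps=0.2, seed=0):
    """Tabular Q-learning on a toy chain MDP: states 0..n_states-1,
    action 0 moves left (floored at 0), action 1 moves right, and the
    agent earns reward 1 for reaching the terminal right end.  Only
    sampled transitions are used -- no model of the dynamics."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        s = 0
        while s != n_states - 1:
            # epsilon-greedy: explore with probability eps, else exploit
            if rng.random() < eps:
                a = rng.randrange(2)
            else:
                a = max((0, 1), key=lambda act: q[s][act])
            s2 = max(0, s - 1) if a == 0 else s + 1
            r = 1.0 if s2 == n_states - 1 else 0.0
            # temporal-difference update toward the sampled target
            q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])
            s = s2
    return q
```

After training, the greedy policy moves right everywhere, and the learned values reflect the discount: roughly 1, 0.9, and 0.81 for the right action in the three non-terminal states.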



Multi-armed Bandit
In probability theory and machine learning, the multi-armed bandit problem (sometimes called the ''K''- or ''N''-armed bandit problem) is a problem in which a fixed, limited set of resources must be allocated between competing (alternative) choices in a way that maximizes the expected gain, when each choice's properties are only partially known at the time of allocation and may become better understood as time passes or as resources are allocated to it. This is a classic reinforcement learning problem that exemplifies the exploration–exploitation tradeoff dilemma. The name comes from imagining a gambler at a row of slot machines (sometimes known as "one-armed bandits"), who has to decide which machines to play, how many times to play each machine and in which order to play them, and whether to continue with the current machine or try a different machine. The multi-armed bandit problem also falls into the broad category of stochastic scheduling. In the problem, each mach ...
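A standard heuristic for this allocation is the epsilon-greedy rule, sketched here for Gaussian-reward arms (an illustrative model, not from the article): occasionally pull a random arm to learn each arm's properties, otherwise pull the arm that currently looks best.

```python
import random

def epsilon_greedy_bandit(means, pulls=2000, eps=0.1, seed=0):
    """Epsilon-greedy allocation for a K-armed bandit: with probability
    eps pull a random arm (exploration), otherwise pull the arm with
    the best empirical mean so far (exploitation).  Rewards are drawn
    from Gaussians with the given means and unit variance."""
    rng = random.Random(seed)
    counts = [0] * len(means)
    estimates = [0.0] * len(means)
    for _ in range(pulls):
        if rng.random() < eps:
            arm = rng.randrange(len(means))                           # explore
        else:
            arm = max(range(len(means)), key=lambda a: estimates[a])  # exploit
        reward = rng.gauss(means[arm], 1.0)
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]     # running mean
    return counts, estimates
```

Over many pulls the best arm accumulates most of the allocations, while the eps fraction of random pulls keeps refining the estimates of the other arms.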


Alexander Aronovich Fel'dbaum
Alexander is a male given name. The most prominent bearer of the name is Alexander the Great, the king of the Ancient Greek kingdom of Macedonia who created one of the largest empires in ancient history. Variants listed here are Aleksandar, Aleksander and Aleksandr. Related names and diminutives include Iskandar, Alec, Alek, Alex, Alexandre, Aleks, Aleksa and Sander; feminine forms include Alexandra, Alexandria, and Sasha. Etymology: The name ''Alexander'' originates from the Greek Ἀλέξανδρος (Aléxandros; 'defending men' or 'protector of men'). It is a compound of the verb ἀλέξειν (aléxein; 'to ward off, avert, defend') and the noun ἀνήρ (anḗr, genitive ἀνδρός; 'man'). It is an example of the widespread motif of Greek names expressing "battle-prowess", in this case the ability to withstand or push back an enemy battle line. The earliest attested form of the name is the Mycenaean Greek feminine anthroponym a-re-ka-sa-da-ra (/Alexandra/), written in the Linear B syllabic script. Alaksandu, alternatively called ''Alakasandu'' or ' ...



Optimal Control
Optimal control theory is a branch of mathematical optimization that deals with finding a control for a dynamical system over a period of time such that an objective function is optimized. It has numerous applications in science, engineering and operations research. For example, the dynamical system might be a spacecraft with controls corresponding to rocket thrusters, and the objective might be to reach the moon with minimum fuel expenditure. Or the dynamical system could be a nation's economy, with the objective to minimize unemployment; the controls in this case could be fiscal and monetary policy. A dynamical system may also be introduced to embed operations research problems within the framework of optimal control theory. Optimal control is an extension of the calculus of variations, and is a mathematical optimization method for deriving control policies. The method is largely due to the work of Lev Pontryagin and Richard Bellman in the 1950s, after contributions to calc ...
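Bellman's dynamic-programming approach can be illustrated on the simplest case, a scalar finite-horizon linear-quadratic regulator (a standard textbook example, not taken from the article): sweeping the Riccati recursion backward from the final time yields the optimal state-feedback gain at each stage.

```python
def scalar_lqr(a=1.0, b=1.0, q=1.0, r=1.0, horizon=50):
    """Finite-horizon LQR for the scalar system x' = a*x + b*u with
    stage cost q*x**2 + r*u**2, solved by Bellman's dynamic programming:
    the Riccati recursion is swept backward from the final time, yielding
    the optimal feedback u_t = -gains[t] * x_t."""
    p = 0.0          # terminal cost-to-go coefficient
    gains = []
    for _ in range(horizon):
        k = a * b * p / (r + b * b * p)      # optimal gain at this stage
        p = q + a * a * p - a * b * p * k    # Riccati backward update
        gains.append(k)
    gains.reverse()  # gains[t] now applies at forward time t
    return gains
```

Far from the horizon the gains settle to the infinite-horizon value (for a = b = q = r = 1 this is (sqrt(5) - 1) / 2), while the final-stage gain is zero because no cost remains to be influenced.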



Dynamic Programming
Dynamic programming is both a mathematical optimization method and a computer programming method. The method was developed by Richard Bellman in the 1950s and has found applications in numerous fields, from aerospace engineering to economics. In both contexts it refers to simplifying a complicated problem by breaking it down into simpler sub-problems in a recursive manner. While some decision problems cannot be taken apart this way, decisions that span several points in time do often break apart recursively. Likewise, in computer science, if a problem can be solved optimally by breaking it into sub-problems and then recursively finding the optimal solutions to the sub-problems, then it is said to have ''optimal substructure''. If sub-problems can be nested recursively inside larger problems, so that dynamic programming methods are applicable, then there is a relation between the value of the larger problem and the values of the sub-problems. Cormen, T. H.; Leiserson, C. E.; Rives ...
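Optimal substructure and the sub-problem relation can be shown with the standard Fibonacci example (illustrative, not from the article): top-down memoization and bottom-up tabulation both exploit the recursion F(n) = F(n-1) + F(n-2), solving each sub-problem exactly once.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    """Top-down: cache each sub-problem so it is solved only once;
    the larger problem's value is built from fib(n-1) and fib(n-2)."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)

def fib_bottom_up(n):
    """Bottom-up: tabulate sub-problem values in increasing order,
    keeping only the two most recent."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a
```

Without the cache the same sub-problems recur exponentially often; either DP style reduces the work to linear time.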


Car Analogy
The car analogy is a common technique, used predominantly in engineering textbooks, to ease the understanding of abstract concepts: a car, its component parts, and common circumstances surrounding it are used as analogs for elements of the conceptual systems. The car analogy can be seen elsewhere, in textbooks covering other subjects and at various educational levels, such as in explaining the regulation of human body temperature. Uses of car analogies: The effectiveness of car analogies resides in their capacity to explain difficult concepts (usually difficult due to their high level of abstraction) in more mundane terms with which the target audience is comfortable, and in which many also have a special interest. Because of this, car analogies appear more often in works related to applied sciences and technology. To work, car analogies translate agents of action as the car driver, the seller, or police officers; likewise, elements of a system are referred to as car parts, such as wheels, mot ...


Automation And Remote Control
''Automation and Remote Control'' (Russian: Автоматика и Телемеханика, translit. Avtomatika i Telemekhanika) is a Russian scientific journal published by MAIK Nauka/Interperiodica Press and distributed in English by Springer Science+Business Media. The journal was established in April 1936 by the USSR Academy of Sciences Department of Control Processes Problems. Cofounders were the Trapeznikov Institute of Control Sciences and the Institute of Information Transmission Problems. The journal covers research on control theory problems and applications. The editor-in-chief is Andrey A. Galyaev. According to the ''Journal Citation Reports'', the journal has a 2020 impact factor of 0.520. History: The journal was established in April 1936 and published bimonthly. Since 1956 the journal has been a monthly publication and was translated into English and published in the United States under the title ''Automation and Remote Control'' by Plenum Publishi ...