Transfer learning (TL) is a research problem in machine learning (ML) that focuses on storing knowledge gained while solving one problem and applying it to a different but related problem. For example, knowledge gained while learning to recognize cars could apply when trying to recognize trucks. This area of research bears some relation to the long history of psychological literature on transfer of learning, although practical ties between the two fields are limited. From a practical standpoint, reusing or transferring information from previously learned tasks to the learning of new tasks has the potential to significantly improve the sample efficiency of a reinforcement learning agent.


History

In 1976, Stevo Bozinovski and Ante Fulgosi published a paper explicitly addressing transfer learning in neural network training. The paper gives a mathematical and geometrical model of transfer learning. In 1981, a report was given on the application of transfer learning in training a neural network on a dataset of images representing letters of computer terminals. Both positive and negative transfer learning were experimentally demonstrated. In 1993, Lorien Pratt published a paper on transfer in machine learning, formulating the discriminability-based transfer (DBT) algorithm. In 1997, Pratt and Sebastian Thrun guest edited a special issue of ''Machine Learning'' devoted to transfer learning, and by 1998 the field had advanced to include multi-task learning, along with a more formal analysis of its theoretical foundations. ''Learning to Learn'', edited by Thrun and Pratt, is a 1998 review of the subject. Transfer learning has also been applied in cognitive science, with Pratt guest editing an issue of ''Connection Science'' on the reuse of neural networks through transfer in 1996.

In his NIPS 2016 tutorial, Andrew Ng underlined the importance of TL by saying that it would be the next driver of ML commercial success after supervised learning. In the 2020 paper "Rethinking Pre-training and Self-training", Zoph et al. showed that pre-training can hurt accuracy, and advocated self-training instead.


Definition

The definition of transfer learning is given in terms of domains and tasks. A domain \mathcal{D} consists of a feature space \mathcal{X} and a marginal probability distribution P(X), where X = \{x_1, \ldots, x_n\} \in \mathcal{X}. Given a specific domain \mathcal{D} = \{\mathcal{X}, P(X)\}, a task consists of two components: a label space \mathcal{Y} and an objective predictive function f : \mathcal{X} \rightarrow \mathcal{Y}. The function f is used to predict the corresponding label f(x) of a new instance x. This task, denoted by \mathcal{T} = \{\mathcal{Y}, f\}, is learned from training data consisting of pairs \{x_i, y_i\}, where x_i \in X and y_i \in \mathcal{Y}. (Material in this section was copied from a source available under a Creative Commons Attribution 4.0 International License.)

Given a source domain \mathcal{D}_S and learning task \mathcal{T}_S, and a target domain \mathcal{D}_T and learning task \mathcal{T}_T, where \mathcal{D}_S \neq \mathcal{D}_T or \mathcal{T}_S \neq \mathcal{T}_T, transfer learning aims to help improve the learning of the target predictive function f_T(\cdot) in \mathcal{D}_T using the knowledge in \mathcal{D}_S and \mathcal{T}_S.
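As an illustration of this definition, the following minimal sketch (assuming only NumPy; the data, dimensions and learning rates are illustrative assumptions, not from the original text) learns a source predictor f_S on plentiful source-domain data and warm-starts the target predictor f_T with its weights before fine-tuning on a handful of target-domain examples:

```python
import numpy as np

rng = np.random.default_rng(0)

def train_logistic(X, y, w_init=None, lr=0.1, epochs=200):
    """Fit a logistic predictor f: X -> Y by batch gradient descent on the log-loss."""
    n, d = X.shape
    w = np.zeros(d) if w_init is None else w_init.copy()
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-X @ w))   # P(y = 1 | x) under the current weights
        w -= lr * X.T @ (p - y) / n        # gradient step
    return w

# Source domain D_S and task T_S: plenty of labelled data (synthetic, for illustration).
X_s = rng.normal(size=(1000, 5))
y_s = (X_s @ np.array([1.0, -2.0, 0.5, 0.0, 1.5]) > 0).astype(float)

# Target domain D_T and task T_T: related but shifted, with only a few labelled examples.
X_t = rng.normal(size=(20, 5)) + 0.3
y_t = (X_t @ np.array([1.2, -1.8, 0.4, 0.1, 1.4]) > 0).astype(float)

f_S = train_logistic(X_s, y_s)                          # learn the source predictor
f_T = train_logistic(X_t, y_t, w_init=f_S, epochs=50)   # transfer: warm-start and fine-tune

print("source weights:", np.round(f_S, 2))
print("target weights:", np.round(f_T, 2))
```

When the source and target tasks are related, warm-starting f_T from f_S of this kind typically needs fewer target examples and fewer training iterations than learning the target predictor from scratch.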


Applications

Algorithms are available for transfer learning in Markov logic networks and Bayesian networks. Transfer learning has also been applied to cancer subtype discovery (Hajiramezanali, E., Dadaneh, S. Z., Karbalayghareh, A., Zhou, Z. & Qian, X. Bayesian multi-domain learning for cancer subtype discovery from next-generation sequencing count data. 32nd Conference on Neural Information Processing Systems (NeurIPS 2018), Montréal, Canada), building utilization, general game playing, text classification, digit recognition, medical imaging and spam filtering.

In 2020 it was discovered that, because of their similar physical natures, transfer learning is possible between electromyographic (EMG) signals from the muscles and electroencephalographic (EEG) brainwaves, from the gesture recognition domain to the mental state recognition domain. It was also noted that the relationship worked in the reverse direction, showing that EEG can likewise be used to classify EMG. The experiments showed that the accuracy of neural networks and convolutional neural networks was improved through transfer learning both at the first epoch (prior to any learning, i.e. compared to standard random weight initialization) and at the asymptote (the end of the learning process). That is, algorithms are improved by exposure to another domain. Moreover, the end user of a pre-trained model can change the structure of its fully connected layers to achieve superior performance on a new task, as illustrated in the sketch below. In the domain of machine learning on code, transfer learning has been shown to be useful for automatically repairing security vulnerabilities.
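A minimal sketch of that head-replacement fine-tuning, assuming PyTorch and a recent torchvision are installed (the choice of ResNet-18, the number of target classes and the dummy batch are illustrative assumptions):

```python
import torch
import torch.nn as nn
from torchvision import models

# Start from a network pre-trained on a large source domain (here ImageNet).
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze the pre-trained backbone so only the new head is updated.
for param in model.parameters():
    param.requires_grad = False

# Replace the fully connected classification head for a 10-class target task.
num_target_classes = 10
model.fc = nn.Linear(model.fc.in_features, num_target_classes)

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# A dummy target-domain batch stands in for real data.
images = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, num_target_classes, (8,))

model.train()
optimizer.zero_grad()
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
print(f"fine-tuning loss: {loss.item():.3f}")
```

A common variation, when more target data is available, is to unfreeze some of the later backbone layers and train them with a smaller learning rate than the new head.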


Software

Several compilations of transfer learning and domain adaptation algorithms have been implemented over the past decades:

* ADAPT (Python)
* TLlib (Python)
* Domain-Adaptation-Toolbox (Matlab); Ke Yan (2016), "Domain adaptation toolbox"


See also

* Crossover (genetic algorithm)
* Domain adaptation
* General game playing
* Multi-task learning
* Multitask optimization
* Zero-shot learning


References


Sources

* Thrun, Sebastian; Pratt, Lorien, eds. ''Learning to Learn''. Springer Science & Business Media, 2012. ISBN 978-1-4615-5529-2.