Transformer (machine learning)
A transformer is a deep learning architecture developed by researchers at Google and based on the multi-head attention mechanism, proposed in the 2017 paper "Attention Is All You Need". Text is converted to numerical representations called tokens, and each token is converted into a vector by looking it up in a word embedding table. At each layer, each token is then contextualized within the scope of the context window with other (unmasked) tokens via a parallel multi-head attention mechanism, allowing the signal for key tokens to be amplified and that for less important tokens to be diminished. Transformers have the advantage of having no recurrent units, and therefore require less training time than earlier recurrent neural network (RNN) architectures such as long short-term memory (LSTM). Later variations ...
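The pipeline described above (token ids looked up in an embedding table, then contextualized by attention) can be sketched in a few lines of NumPy. This is a minimal single-head illustration of scaled dot-product attention, not the full multi-head transformer layer; the vocabulary size, dimensions, token ids, and projection matrices are arbitrary assumptions for the example.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V, mask=None):
    # Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    if mask is not None:
        # Masked positions get a large negative score, so ~zero weight
        scores = np.where(mask, scores, -1e9)
    return softmax(scores) @ V

rng = np.random.default_rng(0)
vocab_size, d_model, seq_len = 100, 16, 4         # toy sizes (assumptions)
embedding_table = rng.normal(size=(vocab_size, d_model))
token_ids = np.array([7, 42, 3, 99])              # hypothetical token ids
x = embedding_table[token_ids]                    # lookup: tokens -> vectors

# One attention head: project x to queries, keys, and values
Wq, Wk, Wv = (rng.normal(size=(d_model, d_model)) for _ in range(3))
out = attention(x @ Wq, x @ Wk, x @ Wv)
print(out.shape)  # one contextualized vector per token: (4, 16)
```

A real transformer layer runs several such heads in parallel on lower-dimensional projections and concatenates their outputs, followed by a position-wise feed-forward network; because all tokens attend at once, there is no recurrence to unroll during training.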