HOME

TheInfoList



OR:

ML.NET is a
free software Free software or libre software is computer software distributed under terms that allow users to run the software for any purpose as well as to study, change, and distribute it and any adapted versions. Free software is a matter of liberty, n ...
machine learning Machine learning (ML) is a field of inquiry devoted to understanding and building methods that 'learn', that is, methods that leverage data to improve performance on some set of tasks. It is seen as a part of artificial intelligence. Machine ...
library A library is a collection of materials, books or media that are accessible for use and not just for display purposes. A library provides physical (hard copies) or digital access (soft copies) materials, and may be a physical location or a vi ...
for the C# and F# programming languages. It also supports Python models when used together with NimbusML. The preview release of ML.NET included transforms for
feature engineering Feature engineering or feature extraction or feature discovery is the process of using domain knowledge to extract features (characteristics, properties, attributes) from raw data. The motivation is to use these extra features to improve the qua ...
like
n-gram In the fields of computational linguistics and probability, an ''n''-gram (sometimes also called Q-gram) is a contiguous sequence of ''n'' items from a given sample of text or speech. The items can be phonemes, syllables, letters, words or ...
creation, and learners to handle binary classification, multi-class classification, and regression tasks. Additional ML tasks like anomaly detection and recommendation systems have since been added, and other approaches like deep learning will be included in future versions.


Machine learning

ML.NET brings model-based Machine Learning analytic and prediction capabilities to existing .NET developers. The framework is built upon .NET Core and .NET Standard inheriting the ability to run cross-platform on
Linux Linux ( or ) is a family of open-source Unix-like operating systems based on the Linux kernel, an operating system kernel first released on September 17, 1991, by Linus Torvalds. Linux is typically packaged as a Linux distribution, whi ...
,
Windows Windows is a group of several proprietary graphical operating system families developed and marketed by Microsoft. Each family caters to a certain sector of the computing industry. For example, Windows NT for consumers, Windows Server for se ...
and
macOS macOS (; previously OS X and originally Mac OS X) is a Unix operating system developed and marketed by Apple Inc. since 2001. It is the primary operating system for Apple's Mac computers. Within the market of desktop and la ...
. Although the ML.NET framework is new, its origins began in 2002 as a Microsoft Research project named TMSN (
text mining Text mining, also referred to as ''text data mining'', similar to text analytics, is the process of deriving high-quality information from text. It involves "the discovery by computer of new, previously unknown information, by automatically extract ...
search and navigation) for use internally within Microsoft products. It was later renamed to TLC (the learning code) around 2011. ML.NET was derived from the TLC library and has largely surpassed its parent says Dr. James McCaffrey, Microsoft Research. Developers can train a Machine Learning Model or reuse an existing Model by a 3rd party and run it on any environment offline. This means developers do not need to have a background in Data Science to use the framework. Support for the
open-source Open source is source code that is made freely available for possible modification and redistribution. Products include permission to use the source code, design documents, or content of the product. The open-source model is a decentralized so ...
Open Neural Network Exchange ( ONNX)
Deep Learning Deep learning (also known as deep structured learning) is part of a broader family of machine learning methods based on artificial neural networks with representation learning. Learning can be supervised, semi-supervised or unsupervised. ...
model format was introduced from build 0.3 in ML.NET. The release included other notable enhancements such as Factorization Machines,
LightGBM LightGBM, short for light gradient-boosting machine, is a free and open-source distributed gradient-boosting framework for machine learning, originally developed by Microsoft. It is based on decision tree algorithms and used for ranking, classifi ...
, Ensembles, LightLDA transform and OVA. The ML.NET integration of
TensorFlow TensorFlow is a free and open-source software library for machine learning and artificial intelligence. It can be used across a range of tasks but has a particular focus on training and inference of deep neural networks. "It is machine learnin ...
is enabled from the 0.5 release. Support for x86 & x64 applications was added to build 0.7 including enhanced recommendation capabilities with Matrix Factorization. A full roadmap of planned features have been made available on the official GitHub repo. The first stable 1.0 release of the framework was announced at
Build (developer conference) Microsoft Build (often stylised as ) is an annual conference event held by Microsoft, aimed at software engineers and web developers using Windows, Microsoft Azure and other Microsoft technologies. First held in 2011, it serves as a succes ...
2019. It included the addition of a Model Builder tool and AutoML (Automated Machine Learning) capabilities. Build 1.3.1 introduced a preview of Deep Neural Network training using C# bindings for Tensorflow and a Database loader which enables model training on databases. The 1.4.0 preview added ML.NET scoring on ARM processors and Deep Neural Network training with GPU's for Windows and Linux.


Performance

Microsoft's paper on machine learning with ML.NET demonstrated it is capable of training sentiment analysis models using large datasets while achieving high accuracy. Its results showed 95% accuracy on Amazon's 9GB review dataset.


Model builder

The ML.NET CLI is a
Command-line interface A command-line interpreter or command-line processor uses a command-line interface (CLI) to receive commands from a user in the form of lines of text. This provides a means of setting parameters for the environment, invoking executables and pro ...
which uses ML.NET AutoML to perform model training and pick the best algorithm for the data. The ML.NET Model Builder preview is an extension for
Visual Studio Visual Studio is an integrated development environment (IDE) from Microsoft. It is used to develop computer programs including web site, websites, web apps, web services and mobile apps. Visual Studio uses Microsoft software development platfor ...
that uses ML.NET CLI and ML.NET AutoML to output the best ML.NET Model using a
GUI The GUI ( "UI" by itself is still usually pronounced . or ), graphical user interface, is a form of user interface that allows users to interact with electronic devices through graphical icons and audio indicator such as primary notation, inste ...
.


Model explainability

AI fairness and explainability has been an area of debate for AI Ethicists in recent years. A major issue for Machine Learning applications is the black box effect where end users and the developers of an application are unsure of how an algorithm came to a decision or whether the dataset contains bias. Build 0.8 included model explainability API's that had been used internally in Microsoft. It added the capability to understand the feature importance of models with the addition of 'Overall Feature Importance' and 'Generalized Additive Models'. When there are several variables that contribute to the overall score, it is possible to see a breakdown of each variable and which features had the most impact on the final score. The official documentation demonstrates that the scoring metrics can be output for debugging purposes. During training & debugging of a model, developers can preview and inspect live filtered data. This is possible using the
Visual Studio Visual Studio is an integrated development environment (IDE) from Microsoft. It is used to develop computer programs including web site, websites, web apps, web services and mobile apps. Visual Studio uses Microsoft software development platfor ...
DataView tools.


Infer.NET

Microsoft Research announced the popular Infer.NET model-based machine learning framework used for research in academic institutions since 2008 has been released open source and is now part of the ML.NET framework. The Infer.NET framework utilises probabilistic programming to describe
probabilistic model A statistical model is a mathematical model that embodies a set of statistical assumptions concerning the generation of sample data (and similar data from a larger population). A statistical model represents, often in considerably idealized form, ...
s which has the added advantage of interpretability. The Infer.NET namespace has since been changed to Microsoft.ML.Probabilistic consistent with ML.NET namespaces.


NimbusML Python support

Microsoft acknowledged that the Python programming language is popular with Data Scientists, so it has introduced NimbusML the experimental Python bindings for ML.NET. This enables users to train and use machine learning models in Python. It was made open source similar to Infer.NET.


Machine learning in the browser

ML.NET allows users to export trained models to the Open Neural Network Exchange (ONNX) format. This establishes an opportunity to use models in different environments that don't use ML.NET. It would be possible to run these models in the client side of a browser using ONNX.js, a javascript client-side framework for deep learning models created in the Onnx format.


AI School Machine Learning Course

Along with the rollout of the ML.NET preview, Microsoft rolled out free AI tutorials and courses to help developers understand techniques needed to work with the framework.


See also

*
scikit-learn scikit-learn (formerly scikits.learn and also known as sklearn) is a free software machine learning library for the Python programming language. It features various classification, regression and clustering algorithms including support-vector ...
* Accord.NET *
LightGBM LightGBM, short for light gradient-boosting machine, is a free and open-source distributed gradient-boosting framework for machine learning, originally developed by Microsoft. It is based on decision tree algorithms and used for ranking, classifi ...
*
TensorFlow TensorFlow is a free and open-source software library for machine learning and artificial intelligence. It can be used across a range of tasks but has a particular focus on training and inference of deep neural networks. "It is machine learnin ...
* Microsoft Cognitive Toolkit *
List of numerical analysis software Listed here are notable end-user computer applications intended for use with numerical or data analysis: Numerical-software packages General-purpose computer algebra systems Interface-oriented Language-oriented Historically signific ...
* List of numerical libraries for .NET framework


References


Further reading

* * *


External links

* {{Microsoft FOSS Data mining and machine learning software Deep learning software Probabilistic models Probabilistic software Free statistical software .NET software Free software programmed in C Sharp Free software programmed in C++ Microsoft free software Open-source artificial intelligence Software using the MIT license 2018 software