ML.NET

ML.NET is a free software machine learning library for the C# and F# programming languages. It also supports Python models when used together with NimbusML. The preview release of ML.NET included transforms for feature engineering, such as n-gram creation, and learners to handle binary classification, multi-class classification, and regression tasks. Additional ML tasks such as anomaly detection and recommendation systems have since been added, and other approaches such as deep learning will be included in future versions.
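
To make the n-gram feature engineering step concrete, here is a minimal sketch in plain Python. It is not ML.NET's transform API; the function names `word_ngrams` and `ngram_counts` and the sample sentence are hypothetical, and tokenization is assumed to be simple whitespace splitting.

```python
from collections import Counter


def word_ngrams(text, n):
    """Return the contiguous word n-grams of `text` as tuples."""
    tokens = text.lower().split()
    # Slide a window of width n across the token list.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_counts(text, n):
    """Count how often each n-gram occurs; the counts can serve as features."""
    return Counter(word_ngrams(text, n))


# Hypothetical sample text, used only for illustration.
bigrams = ngram_counts("the cat sat on the mat the cat slept", 2)
```

Each distinct n-gram becomes one feature whose value is its count, which a learner can then consume as a numeric vector.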


Machine learning

ML.NET brings model-based machine learning analytics and prediction capabilities to existing .NET developers. The framework is built upon .NET Core and .NET Standard, inheriting the ability to run cross-platform on Linux, Windows and macOS. Although the ML.NET framework is new, its origins date to 2002, when it began as a Microsoft Research project named TMSN (text mining search and navigation) for internal use within Microsoft products. It was renamed TLC (the learning code) around 2011. According to Dr. James McCaffrey of Microsoft Research, ML.NET was derived from the TLC library and has largely surpassed its parent.

Developers can train a machine learning model, or reuse an existing model from a third party, and run it offline in any environment. This means developers do not need a background in data science to use the framework.

Support for the open-source Open Neural Network Exchange (ONNX) deep learning model format was introduced in build 0.3 of ML.NET. That release included other notable enhancements such as Factorization Machines, LightGBM, Ensembles, the LightLDA transform and OVA (one-versus-all). ML.NET integration with TensorFlow has been available since the 0.5 release. Build 0.7 added support for x86 and x64 applications, along with enhanced recommendation capabilities using Matrix Factorization. A full roadmap of planned features has been made available on the official GitHub repository.

The first stable release of the framework, 1.0, was announced at Build 2019. It added a Model Builder tool and AutoML (automated machine learning) capabilities. Build 1.3.1 introduced a preview of deep neural network training using C# bindings for TensorFlow, and a database loader that enables model training on databases. The 1.4.0 preview added ML.NET scoring on ARM processors and deep neural network training with GPUs on Windows and Linux.


Performance

Microsoft's paper on machine learning with ML.NET demonstrated that it can train sentiment analysis models on large datasets while achieving high accuracy. The results showed 95% accuracy on Amazon's 9 GB review dataset.


Model builder

The ML.NET CLI is a command-line interface which uses ML.NET AutoML to perform model training and pick the best algorithm for the data. The ML.NET Model Builder preview is an extension for Visual Studio that uses the ML.NET CLI and ML.NET AutoML to output the best ML.NET model through a GUI.
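
The idea of training several candidates and keeping the best one can be sketched in plain Python. This only illustrates the selection loop and is not the ML.NET CLI or AutoML API; the two toy learners, every function name and the data are hypothetical, and validation accuracy is assumed as the selection metric.

```python
def train_majority(rows):
    """Baseline: always predict the most common label seen in training."""
    labels = [label for _, label in rows]
    majority = max(set(labels), key=labels.count)
    return lambda x: majority


def train_threshold(rows):
    """Pick the cut-off on the single feature that best separates the labels."""
    best_cut, best_hits = None, -1
    for cut, _ in rows:
        hits = sum((x >= cut) == label for x, label in rows)
        if hits > best_hits:
            best_cut, best_hits = cut, hits
    return lambda x: x >= best_cut


def accuracy(model, rows):
    """Fraction of rows whose label the model predicts correctly."""
    return sum(model(x) == label for x, label in rows) / len(rows)


def select_best(trainers, train_rows, valid_rows):
    """Train every candidate and keep the one with the best validation accuracy."""
    scored = []
    for name, trainer in trainers.items():
        model = trainer(train_rows)
        scored.append((accuracy(model, valid_rows), name, model))
    best_score, best_name, best_model = max(scored, key=lambda s: s[0])
    return best_name, best_score, best_model


# Hypothetical (feature, label) rows, used only for illustration.
train_rows = [(1.0, False), (2.0, False), (3.0, False), (6.0, True), (7.0, True)]
valid_rows = [(1.5, False), (6.5, True), (8.0, True), (2.5, False)]
trainers = {"majority": train_majority, "threshold": train_threshold}
best_name, best_score, best_model = select_best(trainers, train_rows, valid_rows)
```

Scoring on rows held out from training is the design choice that matters here: it rewards the candidate that generalizes rather than the one that memorizes the training data.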


Model explainability

AI fairness and explainability have been areas of debate among AI ethicists in recent years. A major issue for machine learning applications is the black box effect, where end users and developers of an application are unsure how an algorithm came to a decision or whether the dataset contains bias. Build 0.8 included model explainability APIs that had been used internally at Microsoft. It added the ability to understand the feature importance of models through 'Overall Feature Importance' and 'Generalized Additive Models'. When several variables contribute to the overall score, it is possible to see a breakdown of each variable and which features had the most impact on the final score. The official documentation demonstrates that the scoring metrics can be output for debugging purposes. During training and debugging of a model, developers can preview and inspect live filtered data using the Visual Studio DataView tools.
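
A per-feature breakdown of a score is easiest to see for an additive model. The sketch below assumes the simplest such case, a linear model whose score is a bias plus one weighted term per feature; it is not the ML.NET explainability API, and the function names, weights and input values are hypothetical.

```python
def feature_contributions(weights, bias, features):
    """Split a linear model's score into one additive term per feature."""
    contributions = {name: weights[name] * value for name, value in features.items()}
    score = bias + sum(contributions.values())
    return score, contributions


def rank_by_impact(contributions):
    """Order feature names from largest to smallest absolute contribution."""
    return sorted(contributions, key=lambda name: abs(contributions[name]), reverse=True)


# Hypothetical weights and input, chosen only for illustration.
weights = {"rooms": 2.0, "age": -0.5, "distance": -1.0}
example = {"rooms": 3.0, "age": 4.0, "distance": 0.5}
score, parts = feature_contributions(weights, bias=1.0, features=example)
ranking = rank_by_impact(parts)
```

Because the terms add up to the score exactly, the sign of each contribution shows whether that feature pushed the prediction up or down, and its magnitude shows by how much.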


Infer.NET

Microsoft Research announced that Infer.NET, the popular model-based machine learning framework used for research in academic institutions since 2008, has been released as open source and is now part of the ML.NET framework. The Infer.NET framework uses probabilistic programming to describe probabilistic models, which has the added advantage of interpretability. The Infer.NET namespace has since been changed to Microsoft.ML.Probabilistic, consistent with ML.NET namespaces.
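
As a rough illustration of what describing a probabilistic model and then inferring from it means, the sketch below models two independent coins and computes a posterior after an observation. It does not use Infer.NET or its API: a probabilistic programming framework automates the inference step, whereas this hypothetical function, `posterior_first_heads`, simply enumerates every outcome.

```python
from fractions import Fraction
from itertools import product


def posterior_first_heads(p_heads=Fraction(1, 2)):
    """P(first coin is heads | the two coins are not both heads), by enumeration."""
    evidence = Fraction(0)
    joint = Fraction(0)
    for first, second in product([True, False], repeat=2):
        # Prior probability of this pair of outcomes for two independent coins.
        weight = (p_heads if first else 1 - p_heads) * (p_heads if second else 1 - p_heads)
        if first and second:
            continue  # inconsistent with the observation "not both heads"
        evidence += weight
        if first:
            joint += weight
    # Bayes' rule: P(first heads and observation) / P(observation)
    return joint / evidence


posterior = posterior_first_heads()
```

With fair coins the prior probability of heads is 1/2, and observing that the coins are not both heads lowers it to 1/3. The model is readable as a direct statement of assumptions, which is the interpretability advantage the text refers to.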


NimbusML Python support

Microsoft acknowledged that the Python programming language is popular with data scientists, so it introduced NimbusML, experimental Python bindings for ML.NET. This enables users to train and use machine learning models in Python. Like Infer.NET, it was made open source.


Machine learning in the browser

ML.NET allows users to export trained models to the Open Neural Network Exchange (ONNX) format. This makes it possible to use models in environments that do not use ML.NET. For example, these models could be run on the client side of a browser using ONNX.js, a JavaScript client-side framework for deep learning models created in the ONNX format.


AI School Machine Learning Course

Along with the rollout of the ML.NET preview, Microsoft rolled out free AI tutorials and courses to help developers understand techniques needed to work with the framework.


See also

* scikit-learn
* Accord.NET
* LightGBM
* TensorFlow
* Microsoft Cognitive Toolkit
* List of numerical analysis software
* List of numerical libraries for .NET framework

