HOME

TheInfoList



OR:

SAS (previously "Statistical Analysis System") is a
statistical software The following is a list of statistical software. Open-source * ADaMSoft – a generalized statistical software with data mining algorithms and methods for data management * ADMB – a software suite for non-linear statistical modeling based on C+ ...
suite developed by SAS Institute for
data management Data management comprises all disciplines related to handling data as a valuable resource, it is the practice of managing an organization's data so it can be analyzed for decision making. Concept The concept of data management emerged alongsi ...
, advanced analytics,
multivariate analysis Multivariate statistics is a subdivision of statistics encompassing the simultaneous observation and analysis of more than one outcome variable, i.e., '' multivariate random variables''. Multivariate statistics concerns understanding the differ ...
,
business intelligence Business intelligence (BI) consists of strategies, methodologies, and technologies used by enterprises for data analysis and management of business information. Common functions of BI technologies include Financial reporting, reporting, online an ...
, and
predictive analytics Predictive analytics encompasses a variety of Statistics, statistical techniques from data mining, Predictive modelling, predictive modeling, and machine learning that analyze current and historical facts to make predictions about future or other ...
. SAS was developed at
North Carolina State University North Carolina State University (NC State, North Carolina State, NC State University, or NCSU) is a public university, public Land-grant university, land-grant research university in Raleigh, North Carolina, United States. Founded in 1887 and p ...
from 1966 until 1976, when SAS Institute was incorporated. SAS was further developed in the 1980s and 1990s with the addition of new statistical procedures, additional components and the introduction of JMP. A point-and-click interface was added in version 9 in 2004. A social media analytics product was added in 2010. SAS Viya, a suite of analytics and artificial intelligence software, was introduced in 2016.


Technical overview and terminology

SAS is a software suite that can mine, alter, manage and retrieve data from a variety of sources and perform statistical analysis on it. SAS provides a graphical point-and-click user interface for non-technical users and more through the SAS language. SAS programs have DATA steps, which retrieve and manipulate data, PROC (procedures) which analyze the data, and may also have functions. Each step consists of a series of statements. The DATA step has executable statements that result in the software taking an action, and declarative statements that provide instructions to read a data set or alter the data's appearance. The DATA step has two phases: compilation and execution. In the compilation phase, declarative statements are processed and syntax errors are identified. Afterwards, the execution phase processes each executable statement sequentially. Data sets are organized into tables with rows called "observations" and columns called "variables". Additionally, each piece of data has a descriptor and a value. PROC statements call upon named procedures. Procedures perform analysis and reporting on data sets to produce statistics, analyses, and graphics. There are more than 300 named procedures and each one performs a substantial body of statistical work. PROC statements can also display results, sort data or perform other operations. SAS macros are pieces of code or variables that are coded once and referenced to perform repetitive tasks. SAS data can be published in HTML, PDF, Excel, RTF and other formats using the Output Delivery System, which was first introduced in 2007. SAS Enterprise Guide is SAS's point-and-click interface. It generates code to manipulate data or perform analysis without the use of the SAS programming language. The SAS software suite has more than 200 add-on packages, sometimes called components Some of these SAS components, i.e. add on packages to Base SAS include:


History


Origins

The development of SAS started in 1966 after
North Carolina State University North Carolina State University (NC State, North Carolina State, NC State University, or NCSU) is a public university, public Land-grant university, land-grant research university in Raleigh, North Carolina, United States. Founded in 1887 and p ...
re-hired Anthony Barr to program his analysis of variance and regression software so that it would run on
IBM System/360 The IBM System/360 (S/360) is a family of mainframe computer systems announced by IBM on April 7, 1964, and delivered between 1965 and 1978. System/360 was the first family of computers designed to cover both commercial and scientific applicati ...
computers. The project was funded by the
National Institutes of Health The National Institutes of Health (NIH) is the primary agency of the United States government responsible for biomedical and public health research. It was founded in 1887 and is part of the United States Department of Health and Human Service ...
. and was originally intended to analyze agricultural data to improve crop yields. Barr was joined by student James Goodnight, who developed the software's statistical routines, and the two became project leaders. In 1968, Barr and Goodnight integrated new multiple regression and
analysis of variance Analysis of variance (ANOVA) is a family of statistical methods used to compare the Mean, means of two or more groups by analyzing variance. Specifically, ANOVA compares the amount of variation ''between'' the group means to the amount of variati ...
routines. In 1972, after issuing the first release of SAS, the project lost its funding. According to Goodnight, this was because NIH only wanted to fund projects with medical applications. Goodnight continued teaching at the university for a salary of $1 and access to mainframe computers for use with the project, until it was funded by the University Statisticians of the Southern Experiment Stations the following year. John Sall joined the project in 1973 and contributed to the software's econometrics, time series, and matrix algebra. Another early participant, Caroll G. Perkins, contributed to SAS' early programming. Jolayne W. Service and Jane T. Helwig created SAS's first documentation. The first versions of SAS, from SAS 71 to SAS 82, were named after the year in which they were released. In 1971, SAS 71 was published as a limited release. It was used only on IBM mainframes and had the main elements of SAS programming, such as the DATA step and the most common procedures, i.e. PROCs. The following year a full version was released as SAS 72, which introduced the MERGE statement and added features for handling missing data or combining data sets. The development of SAS was described by the
CNBC CNBC is an American List of business news channels, business news channel owned by the NBCUniversal News Group, a unit of Comcast's NBCUniversal. The network broadcasts live business news and analysis programming during the morning, Day ...
website as an " inflection point" in the history of artificial intelligence. In 1976, Barr, Goodnight, Sall, and Helwig removed the project from North Carolina State and incorporated it as the SAS Institute, Inc.


Development

SAS was redesigned in SAS 76. The INPUT and INFILE statements were improved so they could read most data formats used by IBM mainframes. Generating reports was also added through the PUT and FILE statements. The ability to analyze
general linear model The general linear model or general multivariate regression model is a compact way of simultaneously writing several multiple linear regression models. In that sense it is not a separate statistical linear model. The various multiple linear regre ...
s was also added as was the FORMAT procedure, which allowed developers to customize the appearance of data. In 1979, SAS 79 added support for the IBM VM/ CMS operating system and introduced the DATASETS procedure. Three years later, SAS 82 introduced an early macro language and the APPEND procedure. Beginning with SAS 4, released in 1984, SAS releases have followed a sequential naming convention not based on year of release. SAS version 4 had limited features, but made SAS more accessible. Version 5 introduced a complete macro language, array subscripts, and a full-screen interactive user interface called Display Manager. In 1985, SAS was rewritten in the C programming language. This enabled the SAS' MultiVendor Architecture which allows the software to run on
UNIX Unix (, ; trademarked as UNIX) is a family of multitasking, multi-user computer operating systems that derive from the original AT&T Unix, whose development started in 1969 at the Bell Labs research center by Ken Thompson, Dennis Ritchie, a ...
,
MS-DOS MS-DOS ( ; acronym for Microsoft Disk Operating System, also known as Microsoft DOS) is an operating system for x86-based personal computers mostly developed by Microsoft. Collectively, MS-DOS, its rebranding as IBM PC DOS, and a few op ...
, and
Windows Windows is a Product lining, product line of Proprietary software, proprietary graphical user interface, graphical operating systems developed and marketed by Microsoft. It is grouped into families and subfamilies that cater to particular sec ...
. It was previously written in
PL/I PL/I (Programming Language One, pronounced and sometimes written PL/1) is a procedural, imperative computer programming language initially developed by IBM. It is designed for scientific, engineering, business and system programming. It has b ...
, Fortran, and
assembly language In computing, assembly language (alternatively assembler language or symbolic machine code), often referred to simply as assembly and commonly abbreviated as ASM or asm, is any low-level programming language with a very strong correspondence bet ...
. In the 1980s and 1990s, SAS released a number of components to complement Base SAS. SAS/GRAPH, which produces graphics, was released in 1980, as well as the SAS/ETS component, which supports econometric and time series analysis. A component intended for pharmaceutical users, SAS/PH-Clinical, was released in the 1990s. The
Food and Drug Administration The United States Food and Drug Administration (FDA or US FDA) is a List of United States federal agencies, federal agency of the United States Department of Health and Human Services, Department of Health and Human Services. The FDA is respo ...
standardized on using SAS/PH-Clinical for new drug applications in 2002. Vertical products like SAS Financial Management and SAS Human Capital Management (then called CFO Vision and HR Vision respectively) were also introduced. JMP was developed by SAS co-founder John Sall and a team of developers, in order to take advantage of the graphical user interface introduced in the 1984
Apple Macintosh Mac is a brand of personal computers designed and marketed by Apple Inc., Apple since 1984. The name is short for Macintosh (its official name until 1999), a reference to the McIntosh (apple), McIntosh apple. The current product lineup inclu ...
. JMP's name originally stood for "John's Macintosh Project". JMP was shipped for the first time in 1989. Updated versions of JMP were released continuously after 2002 with the most recent release in 2016. In January 2022, JMP became a wholly owned subsidiary of SAS Institute, having previously been a business unit of the company. SAS 6 was used throughout the 1990s and was available on a wider range of operating systems, including
Macintosh Mac is a brand of personal computers designed and marketed by Apple Inc., Apple since 1984. The name is short for Macintosh (its official name until 1999), a reference to the McIntosh (apple), McIntosh apple. The current product lineup inclu ...
,
OS/2 OS/2 is a Proprietary software, proprietary computer operating system for x86 and PowerPC based personal computers. It was created and initially developed jointly by IBM and Microsoft, under the leadership of IBM software designer Ed Iacobucci, ...
,
Silicon Graphics Silicon Graphics, Inc. (stylized as SiliconGraphics before 1999, later rebranded SGI, historically known as Silicon Graphics Computer Systems or SGCS) was an American high-performance computing manufacturer, producing computer hardware and soft ...
, and PRIMOS. SAS introduced new features through dot-releases. From 6.06 to 6.09, a user interface based on the Windows paradigm was introduced and support for SQL was added. Version 7 introduced the Output Delivery System (ODS) and an improved text editor. Subsequent releases improved upon the ODS. For example, more output options were added in version 8. The number of operating systems that were supported was reduced to
UNIX Unix (, ; trademarked as UNIX) is a family of multitasking, multi-user computer operating systems that derive from the original AT&T Unix, whose development started in 1969 at the Bell Labs research center by Ken Thompson, Dennis Ritchie, a ...
,
Windows Windows is a Product lining, product line of Proprietary software, proprietary graphical user interface, graphical operating systems developed and marketed by Microsoft. It is grouped into families and subfamilies that cater to particular sec ...
and
z/OS z/OS is a 64-bit operating system for IBM z/Architecture mainframes, introduced by IBM in October 2000. It derives from and is the successor to OS/390, which in turn was preceded by a string of MVS versions.Starting with the earliest: ...
, and
Linux Linux ( ) is a family of open source Unix-like operating systems based on the Linux kernel, an kernel (operating system), operating system kernel first released on September 17, 1991, by Linus Torvalds. Linux is typically package manager, pac ...
was added. SAS 8 and SAS Enterprise Miner were released in 1999.


Recent history

In 2002, SAS Text Miner software was introduced. Text Miner analyzes text data like emails for patterns in
business intelligence Business intelligence (BI) consists of strategies, methodologies, and technologies used by enterprises for data analysis and management of business information. Common functions of BI technologies include Financial reporting, reporting, online an ...
applications. In 2004, SAS Version 9.0 was released, referred to as "Project Mercury" internally, and was designed to make SAS accessible to a broader range of business users. SAS 9.0 added custom user interfaces based on the user's role and established the point-and-click user interface of SAS Enterprise Guide as the software's primary
graphical user interface A graphical user interface, or GUI, is a form of user interface that allows user (computing), users to human–computer interaction, interact with electronic devices through Graphics, graphical icon (computing), icons and visual indicators such ...
(GUI). The
Customer Relationship Management Customer relationship management (CRM) is a strategic process that organizations use to manage, analyze, and improve their interactions with customers. By leveraging data-driven insights, CRM helps businesses optimize communication, enhance cus ...
(CRM) features were improved in 2004 with SAS Interaction Management. In 2008, SAS announced Project Unity, designed to integrate data quality, data integration, and
master data management Master data management (MDM) is a discipline in which business and information technology collaborate to ensure the uniformity, accuracy, stewardship, semantic consistency, and accountability of the enterprise's official shared master data assets. ...
. SAS Institute Inc v World Programming Ltd was a lawsuit with developers of a competing implementation, World Programming System, alleging that they had infringed SAS's copyright in part by implementing the same functionality. The case was referred by the United Kingdom's
High Court of Justice The High Court of Justice in London, known properly as His Majesty's High Court of Justice in England, together with the Court of Appeal (England and Wales), Court of Appeal and the Crown Court, are the Courts of England and Wales, Senior Cour ...
to the
European Court of Justice The European Court of Justice (ECJ), officially the Court of Justice (), is the supreme court of the European Union in matters of European Union law. As a part of the Court of Justice of the European Union, it is tasked with interpreting ...
on 11 August 2010. In May 2012, the
European Court of Justice The European Court of Justice (ECJ), officially the Court of Justice (), is the supreme court of the European Union in matters of European Union law. As a part of the Court of Justice of the European Union, it is tasked with interpreting ...
ruled in favor of World Programming, finding that "the functionality of a computer program and the programming language cannot be protected by copyright." A free version of SAS was introduced for students in 2010. SAS Social Media Analytics, a tool for social media monitoring, engagement and sentiment analysis, was also released that year. SAS Rapid Predictive Modeler (RPM), which creates basic analytical models using
Microsoft Excel Microsoft Excel is a spreadsheet editor developed by Microsoft for Microsoft Windows, Windows, macOS, Android (operating system), Android, iOS and iPadOS. It features calculation or computation capabilities, graphing tools, pivot tables, and a ...
, was introduced the same year. In 2010, JMP 9 included a new interface for using the
R programming language R is a programming language for statistical computing and data visualization. It has been widely adopted in the fields of data mining, bioinformatics, data analysis, and data science. The core R language is extended by a large number of so ...
and an add-in for MS Excel. The following year, a High Performance Computing platform was made available in a partnership with
Teradata Teradata Corporation is an American software company that provides cloud database and Analytics, analytics-related software, products, and services. The company was formed in 1979 in Brentwood, California, as a collaboration between researchers a ...
and EMC Greenplum. In 2011, the company released SAS Enterprise Miner 7.1. The company introduced 27 data management products from October 2013 to October 2014 and updates to 160 others. At the SAS Global Forum 2015, SAS announced several new products that were specialized for different industries, as well as new training software. The company has invested in the development of artificial general intelligence, or "strong AI", with the goal of advancing deep learning and natural language processing to the point of achieving cognitive computing. In 2019, SAS announced that it would be investing $1 billion into the development of advanced artificial intelligence, deep learning, natural language processing and
machine learning Machine learning (ML) is a field of study in artificial intelligence concerned with the development and study of Computational statistics, statistical algorithms that can learn from data and generalise to unseen data, and thus perform Task ( ...
. It announced an additional $1 billion investment into these areas in 2023, particularly for industries such as finance, insurance, government, health care and energy. In September 2023, the company announced its expansion of research into the applications of generative AI in analytics, data management and modeling.


Software products

As of 2011, SAS's largest set of products was its line for
customer intelligence Customer intelligence (CI) as part of business intelligence is the process of gathering information regarding customers, and their details and activities, to build deeper and more effective customer relationship management, customer relationships ...
. SAS modules for web, social media and marketing analytics may be used to profile customers and prospects, attempt to predict their behaviors, and manage communications. SAS also provides the SAS Fraud Framework, which is designed to monitor transactions across different networks and use analytics to identify anomalies that are indicative of fraud. This software uses artificial intelligence to monitor income and assets. SAS has various analytical tools related to
risk management Risk management is the identification, evaluation, and prioritization of risks, followed by the minimization, monitoring, and control of the impact or probability of those risks occurring. Risks can come from various sources (i.e, Threat (sec ...
. SAS has products for specific industries, such as government, retail, telecommunications, aerospace, marketing optimization, and
high-performance computing High-performance computing (HPC) is the use of supercomputers and computer clusters to solve advanced computation problems. Overview HPC integrates systems administration (including network and security knowledge) and parallel programming into ...
. The company also has a suite of analytical products related to health care and life sciences.


Comparison to other products

In a 2005 article for the '' Journal of Marriage and Family'' comparing statistical packages from SAS and its competitors Stata and SPSS, Alan C. Acock wrote that SAS programs provide "extraordinary range of data analysis and data management tasks," but were difficult to learn and use. SPSS and Stata, meanwhile, were both easier to learn but had less capable analytic abilities, though these could be expanded with paid (in SPSS) or free (in Stata) add-ons. Acock concluded that SAS was best for power users, while occasional users would benefit most from SPSS and Stata. A 2014 comparison by the
University of California, Los Angeles The University of California, Los Angeles (UCLA) is a public university, public Land-grant university, land-grant research university in Los Angeles, California, United States. Its academic roots were established in 1881 as a normal school the ...
, gave similar results. Competitors such as Revolution Analytics and Alpine Data Labs advertise their products as considerably cheaper than SAS's. In a 2011 comparison, Doug Henschen of '' InformationWeek'' found that start-up fees for the three are similar, though he admitted that the starting fees were not necessarily the best basis for comparison. SAS's business model is not weighted as heavily on initial fees for its programs, instead focusing on revenue from annual subscription fees.


SAS Viya

In 2016, SAS Viya, an artificial intelligence, machine learning, analytics and data management platform, was introduced with a new architecture optimized for running SAS software in public clouds. Viya also increased interoperability with open source software, allowing models to be built in tools such as R, Python and Jupyter, and then executed on SAS's Cloud Analytics Services (CAS) engine. In 2020, a further architectural revamp in Viya 4 containerized the software. SAS sells Viya alongside SAS 9.4, and has not positioned it as a replacement for SAS 9.4. In 2023, two new
software as a service Software as a service (SaaS ) is a cloud computing service model where the provider offers use of application software to a client and manages all needed physical and software resources. SaaS is usually accessed via a web application. Unlike o ...
(SaaS) modules for SAS Viya were released as a private preview: Workbench, for use in creating AI models, and App Factory, for use in creating AI applications. Both modules support multiple programming languages and are expected to become generally available in 2024. SAS Viya also became available on Microsoft Azure Marketplace under a pay-as-you-use model in 2023. In 2023, the company introduced SAS Health, a common health data model built on the SAS Viya platform.


Adoption

According to IDC, SAS is the largest market-share holder in "advanced analytics" with 35.4 percent of the market as of 2013. It is the fifth largest market-share holder for business intelligence (BI) software with a 6.9% share and the largest independent vendor. It competes in the BI market against SAP BusinessObjects, IBM Cognos,
SPSS Modeler IBM SPSS Modeler is a data mining and text analytics software application from IBM. It is used to build Predictive modelling, predictive models and conduct other analytic tasks. It has a visual interface which allows users to leverage statistica ...
, Oracle Hyperion, and Microsoft Power BI. SAS has been named in the Gartner Leader's Quadrant for Data Integration Tools and for Business Intelligence and Analytical Platforms. A study published in 2011 in '' BMC Health Services Research'' found that SAS was used in 42.6 percent of data analyses in health service research, based on a sample of 1,139 articles drawn from three journals.


Uses and applications


Education

SAS' analytical software is used in education to measure and visualize student outcomes and growth trends. Several states, including Virginia, North Carolina, Mississippi, Missouri, and North Dakota use its software to measure and analyze learning loss and learning recovery in students.


Environmental science

SAS and the International Institute for Applied Systems Analysis launched an app that crowdsources image data related to deforestation to train AI algorithms that can identify human impact on the environment. The
University of Florida The University of Florida (Florida or UF) is a public university, public land-grant university, land-grant research university in Gainesville, Florida, United States. It is a senior member of the State University System of Florida and a preem ...
's Center for Coastal Solutions partners with SAS to develop research, training programs and analytical tools related to environmental issues affecting coastal communities. The UNC Center for Galapagos Studies partnered with SAS in 2023 to create a model that can track the health and migratory patterns of species such as sea turtles and hammerhead sharks, as well as the health of the phytoplankton population.


Finance and insurance

SAS's fraud detection and prevention software is used by the tax agencies of various countries, such as the United States, and
Malta Malta, officially the Republic of Malta, is an island country in Southern Europe located in the Mediterranean Sea, between Sicily and North Africa. It consists of an archipelago south of Italy, east of Tunisia, and north of Libya. The two ...
.


Healthcare and life sciences

SAS has been a partner of the
Cleveland Clinic Cleveland Clinic is an American Nonprofit organization, nonprofit Academic health science center, academic Medical centers in the United States, medical center based in Cleveland, Ohio. Owned and operated by the Cleveland Clinic Foundation, an O ...
since 1982. During the
COVID-19 pandemic The COVID-19 pandemic (also known as the coronavirus pandemic and COVID pandemic), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), began with an disease outbreak, outbreak of COVID-19 in Wuhan, China, in December ...
, the clinic used predictive models developed by SAS to forecast factors such as patient volume, availability of medical equipment and bed capacity in various scenarios. SAS joined UNC Chapel Hill's Rapidly Emerging Antiviral Drug Development Initiative (READDI) in 2021.


See also

* Comparison of numerical-analysis software * Comparison of OLAP servers * JMP (statistical software), a subsidiary of SAS Institute Inc. * SAS language *
R (programming language) R is a programming language for statistical computing and Data and information visualization, data visualization. It has been widely adopted in the fields of data mining, bioinformatics, data analysis, and data science. The core R language is ...


References


Further reading

* * Wikiversity:Data Analysis using the SAS Language


External links

*
SAS OnDemand for Academics
No-cost access for learners (free SAS Profile required)


SAS for Developers

SAS community forums
{{Good article SAS Institute Fourth-generation programming languages Business intelligence software Proprietary software programmed in C Data mining and machine learning software Data warehousing Extract, transform, load tools Mathematical optimization software Numerical software Proprietary commercial software for Linux Proprietary cross-platform software Science software for Linux