GotoBLAS
In scientific computing, GotoBLAS and GotoBLAS2 are open source implementations of the BLAS (Basic Linear Algebra Subprograms) API with many hand-crafted optimizations for specific processor types. GotoBLAS was developed by Kazushige Goto at the Texas Advanced Computing Center. At one point, it was used in seven of the world's ten fastest supercomputers. GotoBLAS remains available, but development ceased with a final version tuned for Intel's Nehalem architecture (contemporary in 2008). OpenBLAS is an actively maintained fork of GotoBLAS, developed at the Lab of Parallel Software and Computational Science, ISCAS. GotoBLAS was written by Goto during his sabbatical leave from the Japan Patent Office in 2002. It was initially optimized for the Pentium 4 processor and immediately boosted the performance of a supercomputer based on that CPU from 1.5 TFLOPS to 2 TFLOPS. For a time, the library was available at no cost for noncommercial use. A later open source version was ...
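The hallmark of GotoBLAS is matrix multiplication blocked around the cache hierarchy. The following is a minimal sketch of that cache-blocking idea, not Goto's actual kernel: real GotoBLAS packs operands into contiguous buffers and dispatches to hand-written assembly micro-kernels, and the block size here is illustrative rather than tuned.

#include <stddef.h>

/* Cache-blocked C += A * B for row-major n-by-n matrices. Each
 * BLOCK-by-BLOCK tile of the operands is reused while it is hot in
 * cache, which is the core idea behind GotoBLAS-style GEMM. */
#define BLOCK 64

void gemm_blocked(size_t n, const double *A, const double *B, double *C)
{
    for (size_t ii = 0; ii < n; ii += BLOCK)
        for (size_t kk = 0; kk < n; kk += BLOCK)
            for (size_t jj = 0; jj < n; jj += BLOCK)
                for (size_t i = ii; i < ii + BLOCK && i < n; i++)
                    for (size_t k = kk; k < kk + BLOCK && k < n; k++) {
                        double a = A[i * n + k];   /* scalar reused across j */
                        for (size_t j = jj; j < jj + BLOCK && j < n; j++)
                            C[i * n + j] += a * B[k * n + j];
                    }
}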


Basic Linear Algebra Subprograms
Basic Linear Algebra Subprograms (BLAS) is a specification that prescribes a set of low-level routines for performing common linear algebra operations such as vector addition, scalar multiplication, dot products, linear combinations, and matrix multiplication. They are the ''de facto'' standard low-level routines for linear algebra libraries; the routines have bindings for both C ("CBLAS interface") and Fortran ("BLAS interface"). Although the BLAS specification is general, BLAS implementations are often optimized for speed on a particular machine, so using them can bring substantial performance benefits. BLAS implementations typically take advantage of special floating-point hardware such as vector registers or SIMD instructions. BLAS originated as a Fortran library in 1979, and its interface was standardized by the BLAS Technical (BLAST) Forum, whose latest BLAS report can be found on the netlib website. This Fortran library is known as the ''reference implementation'' (sometimes co ...
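In practice, the CBLAS bindings mentioned above look like this. A minimal sketch, assuming a BLAS implementation with CBLAS headers is installed (link flags vary by implementation, e.g. -lblas, -lcblas, or -lopenblas):

#include <stdio.h>
#include <cblas.h>

int main(void)
{
    double x[3] = {1.0, 2.0, 3.0};
    double y[3] = {4.0, 5.0, 6.0};

    /* Level 1 BLAS DAXPY: y := 2*x + y */
    cblas_daxpy(3, 2.0, x, 1, y, 1);

    /* Level 1 BLAS DDOT: dot product of x and y */
    double d = cblas_ddot(3, x, 1, y, 1);

    printf("y = [%g %g %g], x.y = %g\n", y[0], y[1], y[2], d);
    return 0;
}

The Fortran-style naming convention survives in CBLAS: the leading "d" selects double precision, while "s", "c", and "z" variants cover single precision and the complex types.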


OpenBLAS
In scientific computing, OpenBLAS is an open-source implementation of the BLAS (Basic Linear Algebra Subprograms) and LAPACK APIs with many hand-crafted optimizations for specific processor types. It is developed at the Lab of Parallel Software and Computational Science, ISCAS. OpenBLAS adds optimized implementations of linear algebra kernels for several processor architectures, including Intel Sandy Bridge and Loongson. It claims performance comparable to Intel MKL; this mostly holds for the BLAS part, while the LAPACK part falls behind. On machines that support the AVX2 instruction set, OpenBLAS can achieve performance similar to MKL, but there are currently almost no open-source libraries comparable to MKL on CPUs with the AVX-512 instruction set. OpenBLAS is a fork of GotoBLAS2, which was created by Kazushige Goto at the Texas Advanced Computing Center.
History and present
OpenBLAS was developed by the parallel software group led by Professor Yunquan ...
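Beyond the standard CBLAS entry points, OpenBLAS exposes a few extensions of its own, such as run-time thread control. A minimal sketch, assuming OpenBLAS's cblas.h (which declares openblas_set_num_threads); setting the OPENBLAS_NUM_THREADS environment variable has the same effect:

#include <cblas.h>

int main(void)
{
    /* OpenBLAS extension: use 4 threads for subsequent BLAS calls */
    openblas_set_num_threads(4);

    double a[4] = {1, 2, 3, 4};
    double b[4] = {5, 6, 7, 8};
    double c[4] = {0};

    /* Level 3 BLAS DGEMM: C := 1.0 * A * B + 0.0 * C for 2x2 matrices */
    cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                2, 2, 2, 1.0, a, 2, b, 2, 0.0, c, 2);
    return 0;
}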




Kazushige Goto
Kazushige Goto is a software engineer specializing in high-performance, hand-written machine code.
Education
Goto was a research associate at the Texas Advanced Computing Center at the University of Texas at Austin when he wrote his famously hand-optimized assembly routines for supercomputing and PC platforms, which outperform the best compiler-generated code. Several of the fastest supercomputers in the world still use his implementation of the Basic Linear Algebra Subprograms (BLAS), known as GotoBLAS.
Career
In 2010, Goto joined Microsoft's Technical Computing Group with the title of Senior Researcher. In July 2012, he joined Intel with the title of Software Engineer. Goto continues to write hand-optimized machine code, utilizing detail ...


Math Kernel Library
Intel oneAPI Math Kernel Library (Intel oneMKL; formerly Intel Math Kernel Library or Intel MKL) is a library of optimized math routines for science, engineering, and financial applications. Core math functions include BLAS, LAPACK, ScaLAPACK, sparse solvers, fast Fourier transforms, and vector math. The library supports Intel processors and is available for the Windows, Linux, and macOS operating systems. ''Intel oneAPI Math Kernel Library'' is not to be confused with ''oneAPI Math Kernel Library'' (oneMKL) Interfaces, a piece of open-source glue code that allows Intel MKL routines to be used from Data Parallel C++.
History and licensing
Intel launched the Math Kernel Library on May 9, 2003, and called it blas.lib. The project's development teams are located in Russia and the United States. The library was available in standalone form, free of charge, under the terms of the Intel Simplified Software License, which allows redistribution. Since April 2020, MKL has become part of oneAPI ...
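Because MKL implements the standard CBLAS and LAPACKE interfaces, portable code needs no MKL-specific calls; only the header and link line change. A minimal sketch, assuming the single dynamic library linking model (cc example.c -lmkl_rt), one of the documented ways to link MKL:

#include <stdio.h>
#include <mkl.h>   /* umbrella header: CBLAS, LAPACKE, FFTs, vector math */

int main(void)
{
    double x[4] = {3.0, 0.0, 4.0, 0.0};

    /* Standard Level 1 BLAS through MKL: Euclidean norm of x */
    double nrm = cblas_dnrm2(4, x, 1);

    printf("||x|| = %g\n", nrm);   /* prints 5 */
    return 0;
}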


BSD License
BSD licenses are a family of permissive free software licenses, imposing minimal restrictions on the use and distribution of covered software. This is in contrast to copyleft licenses, which have share-alike requirements. The original BSD license was used for its namesake, the Berkeley Software Distribution (BSD), a Unix-like operating system. The original version has since been revised, and its descendants are referred to as modified BSD licenses. BSD is both a license and a class of license (generally referred to as BSD-like). The modified BSD license (in wide use today) is very similar to the license originally used for the BSD version of Unix. The BSD license is a simple license that merely requires that all code retain the BSD license notice if redistributed in source code format, or reproduce the notice if redistributed in binary format. The BSD license (unlike some other licenses, e.g. the GPL) does not require that source code be distributed at all.
Terms
In addition t ...


Numerical Linear Algebra
Numerical linear algebra, sometimes called applied linear algebra, is the study of how matrix operations can be used to create computer algorithms which efficiently and accurately provide approximate answers to questions in continuous mathematics. It is a subfield of numerical analysis and a type of linear algebra. Computers use floating-point arithmetic and cannot exactly represent irrational (and most rational) numbers, so when a computer algorithm is applied to a matrix of data, it can sometimes amplify the difference between a number stored in the computer and the true number it approximates. Numerical linear algebra uses properties of vectors and matrices to develop computer algorithms that minimize the error introduced by the computer, and it is also concerned with ensuring that the algorithm is as efficient as possible. Numerical linear algebra aims to solve problems of continuous mathematics using finite-precision computers, so its applications to the natural and social sciences ar ...
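An illustrative example (not from the source) of the finite-precision effect described above: 0.1 has no exact binary floating-point representation, so repeated addition accumulates rounding error.

#include <stdio.h>

int main(void)
{
    double sum = 0.0;
    for (int i = 0; i < 1000000; i++)
        sum += 0.1;   /* each addition rounds to the nearest double */

    /* Mathematically sum == 100000 exactly; in practice it is not. */
    printf("sum   = %.12f\n", sum);
    printf("error = %.3e\n", sum - 100000.0);
    return 0;
}

Much of numerical linear algebra consists of arranging computations (orderings, factorizations, pivoting) so that such errors stay bounded rather than compounding.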




ACM Transactions On Mathematical Software
''ACM Transactions on Mathematical Software'' (''TOMS'') is a quarterly scientific journal that aims to disseminate the latest findings of note in the field of numeric, symbolic, algebraic, and geometric computing applications. It is one of the oldest scientific journals specifically dedicated to mathematical algorithms and their implementation in software, and has been published since March 1975 by the Association for Computing Machinery (ACM). The journal's scope is described on the TOMS homepage of the ACM Digital Library. The purpose of the journal was laid out by its founding editor, John Rice, in the inaugural issue. The decision to found the journal came out of the 1970 Mathematical Software Symposium at Purdue University, also organized by Rice, who then negotiated with both SIAM ...


New York Times
''The New York Times'' (''the Times'', ''NYT'', or the Gray Lady) is a daily newspaper based in New York City with a worldwide readership reported in 2020 to comprise a declining 840,000 paid print subscribers and a growing 6 million paid digital subscribers. It is also a producer of popular podcasts such as ''The Daily''. Founded in 1851 by Henry Jarvis Raymond and George Jones, it was initially published by Raymond, Jones & Company. The ''Times'' has won 132 Pulitzer Prizes, the most of any newspaper, and has long been regarded as a national "newspaper of record". In print it is ranked 18th in the world by circulation and 3rd in the U.S. The paper is owned by The New York Times Company, which is publicly traded. It has been governed by the Sulzberger family since 189 ...


Automatically Tuned Linear Algebra Software
Automatically Tuned Linear Algebra Software (ATLAS) is a software library for linear algebra. It provides a mature open source implementation of the BLAS APIs for C and Fortran 77. ATLAS is often recommended as a way to automatically generate an optimized BLAS library. While its performance often trails that of specialized libraries written for one specific hardware platform, it is often the first or even the only optimized BLAS implementation available on new systems, and it is a large improvement over the generic BLAS available at Netlib. For this reason, ATLAS is sometimes used as a performance baseline for comparison with other products. ATLAS runs on most Unix-like operating systems and on Microsoft Windows (using Cygwin). It is released under a BSD-style license without the advertising clause, and many well-known mathematics applications, including MATLAB, Mathematica, Scilab, SageMath, and some builds of GNU Octave, may use it.
Functionality
ATLAS provides a full implementation of the ...
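The "automatically tuned" part of ATLAS is an empirical search performed at build time: candidate kernel parameters are compiled, timed, and the fastest kept. A toy sketch of that idea (illustrative only, not ATLAS's actual search, which explores many more parameters):

#include <stdio.h>
#include <time.h>

#define N 512
static double A[N * N], B[N * N], C[N * N];

/* Time one blocked matrix multiply with a given block size. The data
 * content is irrelevant; only the memory-access pattern matters. */
static double time_mm(int bs)
{
    clock_t t0 = clock();
    for (int ii = 0; ii < N; ii += bs)
        for (int kk = 0; kk < N; kk += bs)
            for (int jj = 0; jj < N; jj += bs)
                for (int i = ii; i < ii + bs; i++)
                    for (int k = kk; k < kk + bs; k++)
                        for (int j = jj; j < jj + bs; j++)
                            C[i * N + j] += A[i * N + k] * B[k * N + j];
    return (double)(clock() - t0) / CLOCKS_PER_SEC;
}

int main(void)
{
    int candidates[] = {16, 32, 64, 128};   /* each divides N evenly */
    int best = candidates[0];
    double best_t = 1e30;

    for (int c = 0; c < 4; c++) {
        double t = time_mm(candidates[c]);
        printf("block %3d: %.3f s\n", candidates[c], t);
        if (t < best_t) { best_t = t; best = candidates[c]; }
    }
    printf("selected block size: %d\n", best);
    return 0;
}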


L2 Cache
A CPU cache is a hardware cache used by the central processing unit (CPU) of a computer to reduce the average cost (time or energy) of accessing data from the main memory. A cache is a smaller, faster memory, located closer to a processor core, which stores copies of the data from frequently used main memory locations. Most CPUs have a hierarchy of multiple cache levels (L1, L2, often L3, and rarely even L4), with separate instruction-specific and data-specific caches at level 1. Cache memory is typically implemented with static random-access memory (SRAM), which in modern CPUs is by far the largest part of the chip by area, but SRAM is not always used for all levels (of I- or D-cache); sometimes the outer levels, or even all levels, are implemented with eDRAM. Other types of caches exist (that are not counted towards the "cache size" of the most important caches mentioned above), such as the translation lookaside buffer (TLB), which is part of the memory management unit (MMU) whi ...
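The performance impact of the cache hierarchy is easy to observe from ordinary code. An illustrative example (timings vary by machine): traversing a matrix in row order touches memory sequentially and hits the cache, while column order strides across it and misses far more often.

#include <stdio.h>
#include <time.h>

#define N 4096
static double m[N][N];   /* 128 MB, far larger than any cache level */

int main(void)
{
    double sum = 0.0;
    clock_t t0 = clock();
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            sum += m[i][j];          /* row order: sequential access */
    double row_t = (double)(clock() - t0) / CLOCKS_PER_SEC;

    t0 = clock();
    for (int j = 0; j < N; j++)
        for (int i = 0; i < N; i++)
            sum += m[i][j];          /* column order: 32 KB stride per access */
    double col_t = (double)(clock() - t0) / CLOCKS_PER_SEC;

    printf("row: %.3f s, column: %.3f s (sum = %g)\n", row_t, col_t, sum);
    return 0;
}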

