Continuous analytics is a
data science
Data science is an interdisciplinary academic field that uses statistics, scientific computing, scientific methods, processing, scientific visualization, algorithms and systems to extract or extrapolate knowledge from potentially noisy, stru ...
process that abandons
ETLs and complex batch
data pipelines in favor of
cloud
In meteorology, a cloud is an aerosol consisting of a visible mass of miniature liquid droplets, frozen crystals, or other particles, suspended in the atmosphere of a planetary body or similar space. Water or various other chemicals may ...
-native and
microservices
In software engineering, a microservice architecture is an architectural pattern that organizes an application into a collection of loosely coupled, fine-grained services that communicate through lightweight protocols. This pattern is characterize ...
paradigms. Continuous
data processing
Data processing is the collection and manipulation of digital data to produce meaningful information. Data processing is a form of ''information processing'', which is the modification (processing) of information in any manner detectable by an o ...
enables real time interactions and immediate insights with fewer resources.
Defined
Analytics
Analytics is the systematic computational analysis of data or statistics. It is used for the discovery, interpretation, and communication of meaningful patterns in data, which also falls under and directly relates to the umbrella term, data sc ...
is the application of
mathematics
Mathematics is a field of study that discovers and organizes methods, Mathematical theory, theories and theorems that are developed and Mathematical proof, proved for the needs of empirical sciences and mathematics itself. There are many ar ...
and
statistics
Statistics (from German language, German: ', "description of a State (polity), state, a country") is the discipline that concerns the collection, organization, analysis, interpretation, and presentation of data. In applying statistics to a s ...
to big data. Data scientists write analytics programs to look for solutions to business problems, like forecasting
demand
In economics, demand is the quantity of a goods, good that consumers are willing and able to purchase at various prices during a given time. In economics "demand" for a commodity is not the same thing as "desire" for it. It refers to both the desi ...
or setting an optimal price. The continuous approach runs multiple stateless engines which concurrently enrich, aggregate, infer and act on the data. Data scientists, dashboards and client apps all access the same raw or real-time data derivatives with proper identity-based security,
data masking
Data masking or data obfuscation is the process of modifying sensitive data in such a way that it is of no or little value to unauthorized intruders while still being usable by software or authorized personnel. Data masking can also be referred ...
and
versioning in real-time.
Traditionally, data scientists have not been part of
IT development teams, like regular
Java
Java is one of the Greater Sunda Islands in Indonesia. It is bordered by the Indian Ocean to the south and the Java Sea (a part of Pacific Ocean) to the north. With a population of 156.9 million people (including Madura) in mid 2024, proje ...
programmers. This is because their skills set them apart in their own department not normally related to IT, i.e., math, statistics, and data science. So it is logical to conclude that their approach to writing
software code does not enjoy the same efficiencies as the traditional programming team. In particular traditional programming has adopted the Continuous Delivery approach to writing code and the
agile methodology
Agile software development is an umbrella term for approaches to developing software that reflect the values and principles agreed upon by ''The Agile Alliance'', a group of 17 software practitioners, in 2001. As documented in their ''Manifesto ...
. That releases software in a continuous circle, called
iterations.
Continuous analytics then is the extension of the continuous delivery software development model to the
big data
Big data primarily refers to data sets that are too large or complex to be dealt with by traditional data processing, data-processing application software, software. Data with many entries (rows) offer greater statistical power, while data with ...
analytics development team. The goal of the continuous analytics practitioner then is to find ways to incorporate writing analytics code and installing big data software into the agile development model of automatically running unit and functional tests and building the environment system with automated tools.
To make this work means getting
data scientists
Data science is an interdisciplinary academic field that uses statistics, scientific computing, scientific methods, processing, scientific visualization, algorithms and systems to extract or extrapolate knowledge from potentially noisy, structu ...
to write their code in the same
code repository
In version control systems, a repository is a data structure that stores metadata for a set of files or directory structure. Depending on whether the version control system in use is distributed, like Git or Mercurial, or centralized, like Subvers ...
that regular programmers use so that software can pull it from there and run it through the build process. It also means saving the configuration of the big data cluster (sets of
virtual machines
In computing, a virtual machine (VM) is the virtualization or emulator, emulation of a computer system. Virtual machines are based on computer architectures and provide the functionality of a physical computer. Their implementations may involve ...
) in some kind of repository as well. That facilitates sending out analytics code and big data software and objects in the same automated way as the continuous integration process.
Data Scientist Ricardo Ramon Benitez
/ref>
External links
Continuous analytics
Development model
References
Data analysis
Big data
{{Database-stub