Apache Beam is an open-source, unified programming model for defining and executing data processing pipelines, including ETL, batch, and stream (continuous) processing.
Beam pipelines are defined using one of the provided SDKs and executed on one of Beam's supported ''runners'' (distributed processing back-ends), including Apache Flink, Apache Samza, Apache Spark, and Google Cloud Dataflow.
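The separation between defining a pipeline and choosing where it runs can be illustrated with a small, self-contained Python sketch. It does not use the Beam SDK: the names `Pipeline`, `flat_map`, `map_elements`, `EagerRunner`, and `LazyRunner` are hypothetical and exist only for this illustration. The sketch assumes a pipeline is simply an ordered list of transforms and shows two toy "runners" executing the same definition in different ways.

```python
from typing import Callable, Iterable, List

# Hypothetical names for illustration only; this is NOT the Apache Beam API.
Transform = Callable[[Iterable], Iterable]


class Pipeline:
    """An ordered list of transforms, described independently of how it runs."""

    def __init__(self) -> None:
        self.transforms: List[Transform] = []

    def apply(self, transform: Transform) -> "Pipeline":
        self.transforms.append(transform)
        return self  # returning self allows chained calls


def flat_map(fn: Callable) -> Transform:
    """Build a transform that expands each element into zero or more outputs."""
    def transform(elements: Iterable) -> Iterable:
        for element in elements:
            yield from fn(element)
    return transform


def map_elements(fn: Callable) -> Transform:
    """Build a transform that applies fn to each element."""
    def transform(elements: Iterable) -> Iterable:
        return (fn(element) for element in elements)
    return transform


class EagerRunner:
    """Materializes each stage as a full list before the next stage starts."""

    def run(self, pipeline: Pipeline, source: Iterable) -> list:
        data = list(source)
        for transform in pipeline.transforms:
            data = list(transform(data))
        return data


class LazyRunner:
    """Chains the stages as generators so elements flow through one at a time."""

    def run(self, pipeline: Pipeline, source: Iterable) -> list:
        stream = iter(source)
        for transform in pipeline.transforms:
            stream = transform(stream)
        return list(stream)


def build_pipeline() -> Pipeline:
    # The definition mentions no runner; it only describes the processing steps.
    return Pipeline().apply(flat_map(str.split)).apply(map_elements(str.lower))


lines = ["Beam unifies batch", "and stream processing"]
eager = EagerRunner().run(build_pipeline(), lines)
lazy = LazyRunner().run(build_pipeline(), lines)
```

Both toy runners produce the same list of lowercase words from the same pipeline definition, which is the property a runner-independent model is meant to provide. Real runners differ in far more than evaluation order (distribution, fault tolerance, windowing), none of which this sketch attempts to model.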
History
Apache Beam is one implementation of the model described in the Dataflow model paper. The Dataflow model is based on previous work on distributed processing abstractions at Google, in particular on FlumeJava and MillWheel.
In 2014, Google released an open SDK implementation of the Dataflow model, together with an environment to execute Dataflows locally (non-distributed) as well as in the Google Cloud Platform service.
Timeline
Apache Beam makes minor releases every 6 weeks.
See also
* List of Apache Software Foundation projects