Apache Storm is a distributed stream processing computation framework written predominantly in the Clojure programming language. It uses custom-created “spouts” and “bolts” to define information sources and manipulations, allowing batch, distributed processing of streaming data.

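The initial release was on 17 September 2011. A Storm application is designed as a “topology” in the shape of a directed acyclic graph (DAG), with spouts and bolts acting as the graph vertices. Edges on the graph are named streams and direct data from one node to another.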

Together, the topology acts as a data transformation pipeline. Storm became an Apache Top-Level Project in September 2014, having been in incubation since September 2013.
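The graph-shaped pipeline described above can be sketched in plain Python. This is only an illustration of the idea, not Storm's API: the node names, the stream names, and the helper `run_topology` are all hypothetical, and everything runs in a single process rather than across a cluster.

```python
from collections import defaultdict, deque

# All names here are hypothetical; this models the concept of a
# topology and does not use the Storm library.

def sentence_spout():
    """Source node: emits one tuple per sentence."""
    for sentence in ["the cow jumped", "the moon"]:
        yield (sentence,)

def split_bolt(tup):
    """Bolt: turns one sentence tuple into one tuple per word."""
    return [(word,) for word in tup[0].split()]

def upper_bolt(tup):
    """Bolt: upper-cases the word carried by the tuple."""
    return [(tup[0].upper(),)]

# Each edge is a named stream: (source node, stream name, destination node).
EDGES = [
    ("spout", "sentences", "split"),
    ("split", "words", "upper"),
]
BOLTS = {"split": split_bolt, "upper": upper_bolt}

def run_topology(spout, bolts, edges):
    """Route every spout tuple along the edges; collect what the sink nodes emit."""
    outgoing = defaultdict(list)
    for src, stream, dst in edges:
        outgoing[src].append((stream, dst))

    results = []
    # Each queue entry is (node that emitted the tuple, tuple).
    queue = deque(("spout", tup) for tup in spout())
    while queue:
        node, tup = queue.popleft()
        targets = outgoing[node]
        if not targets:
            results.append(tup)  # no outgoing stream: this node is a sink
            continue
        for _stream, dst in targets:
            for out in bolts[dst](tup):
                queue.append((dst, out))
    return results

result = run_topology(sentence_spout, BOLTS, EDGES)
```

Keeping the edges in a separate list from the node functions mirrors the description above: the nodes do the work, and the named streams only decide where each tuple goes next.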

Apache Storm is developed under the Apache License, making it available for most companies to use. Git is used for version control and Atlassian JIRA for issue tracking, under the Apache Incubator program.

Nodes- There are two types of nodes, i.e. Master Node and Worker Node. The Master Node runs a daemon called Nimbus, which assigns tasks to machines and monitors their performance.

On the other hand, the Worker Node runs a daemon called Supervisor, which assigns tasks to the worker processes on its node and operates them as needed.

Components- Storm has three critical components, viz. Topology, Stream, and Spout. Topology is a network made of Stream and Spout.

Stream is an unbounded pipeline of tuples, and Spout is the source of the data streams: it converts the data into a stream of tuples and sends them to the bolts to be processed. Storm is but one of dozens of stream processing engines; for a more complete list, see Stream processing.
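To make the idea of an unbounded stream of tuples concrete, here is a minimal sketch using Python generators. It does not use Storm itself: `counter_spout` and `double_bolt` are hypothetical names, and the endless counter stands in for a real data source such as a message queue.

```python
import itertools

def counter_spout():
    """Hypothetical spout: wraps an endless raw source into tuples."""
    for n in itertools.count(1):  # never terminates on its own
        yield ("reading", n)

def double_bolt(stream):
    """Hypothetical bolt: consumes tuples lazily and emits transformed tuples."""
    for name, value in stream:
        yield (name, value * 2)

# The stream has no end, so a consumer takes only as many tuples as it needs.
first_three = list(itertools.islice(double_bolt(counter_spout()), 3))
```

Because both stages are lazy, the bolt processes each tuple as it arrives instead of waiting for the source to finish, which is the property that separates stream processing from a batch job over a finite input.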

