Kafka
Apache Kafka is a streaming platform that can handle billions of events every hour. It scales easily and is fault tolerant.
Spark
Apache Spark is an open-source distributed processing engine. It can read data from multiple sources and process it in parallel.
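To make the idea of reading several sources and processing them in parallel concrete, here is a minimal sketch that uses only Python's standard library as a stand-in. It is not Spark code and does not use the Spark API; the `SOURCES` dictionary and the `count_errors` and `total_errors` helpers are hypothetical names chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory "sources"; a real engine would read files, tables, or topics.
SOURCES = {
    "logs": ["error disk full", "info started"],
    "events": ["error timeout", "error disk full"],
}


def count_errors(lines):
    """Count the lines in one source that start with 'error'."""
    return sum(1 for line in lines if line.startswith("error"))


def total_errors(sources):
    """Process every source independently, then combine the partial results."""
    # Each source is handled by its own task, so the work can run in parallel.
    with ThreadPoolExecutor() as pool:
        partial_counts = list(pool.map(count_errors, sources.values()))
    # Merge the per-source counts into a single answer.
    return sum(partial_counts)


if __name__ == "__main__":
    print(total_errors(SOURCES))
```

The pattern is the part that carries over: split the work by source, process each piece independently, then merge the partial results. A distributed engine applies the same idea across many machines rather than threads in one process.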
Hadoop
Apache Hadoop is one of the most widely used frameworks for storing and processing big data. It can be deployed on commodity hardware and scales as user needs grow.
Hive
Apache Hive is a SQL engine that runs on top of Hadoop. It enables developers to query big data using simple SQL queries.
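As an illustration of what a "simple SQL query" over a dataset can look like, the sketch below runs an aggregate query with Python's built-in `sqlite3` module. SQLite is used here only as a stand-in so the example runs without a cluster; the `page_views` table, its columns, and the `top_pages` helper are hypothetical, and a real Hive setup would submit a similar query through a Hive client instead.

```python
import sqlite3

# A plain aggregate query; the table and column names are made up for this sketch.
QUERY = """
SELECT page, COUNT(*) AS views
FROM page_views
GROUP BY page
ORDER BY views DESC
"""


def top_pages():
    """Load a few sample rows into an in-memory table and run the query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE page_views (page TEXT, user_id INTEGER)")
        conn.executemany(
            "INSERT INTO page_views VALUES (?, ?)",
            [("home", 1), ("home", 2), ("pricing", 1)],
        )
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for page, views in top_pages():
        print(page, views)
```

The point of the example is the query itself: the developer states what result is wanted in SQL, and the engine decides how to execute it over the stored data.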
Cloudera
Cloudera provides an enterprise distribution of many open-source big data tools. It offers customers secured and integrated versions of these tools.
Trino
Trino is based on the Presto query engine. It allows users to query data regardless of where it is stored.
Flink
Apache Flink is gaining popularity because it can process data in real time and also offers features similar to those of Apache Spark.
Apache Kafka terminology