Apache Kafka is a distributed event streaming platform that can handle billions of events every hour. It scales easily and is fault tolerant.
Apache Spark is an open-source unified analytics engine for large-scale data processing. It can read data from multiple sources and process it in parallel.
Apache Hadoop is one of the most widely used frameworks to process and store big data. It can be deployed on commodity hardware and scales as per user needs.
Apache Hive is an open-source data warehouse system built on top of Hadoop. It enables developers to query big data using simple SQL-like queries (HiveQL).
Cloudera offers a commercial distribution of many open-source big data tools. It provides secured and integrated big data tools to customers.
Trino is based on the Presto distributed SQL query engine. Trino allows users to query data no matter where it is stored.
Apache Flink is gaining popularity as it can process data in real time and also offers features similar to those of Apache Spark.
Apache Kafka terminology