PySpark provides many advantages, but it has a few drawbacks as well. A few of them are:
- Since Python is an interpreted language, PySpark code runs relatively slower than equivalent Scala code.
- PySpark consumes a lot of memory, and managing it can be challenging when a large number of processes are running.
- Expressing a problem in PySpark can be difficult.
- A few functionalities are available only in Scala/Java and not in PySpark.
- Since Spark is written in Scala, PySpark cannot fully utilize Spark's internal functioning.
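The interpreter-overhead point above can be sketched with a plain-Python analogy (no Spark installation required): a hand-written Python loop plays the role of a PySpark UDF, where each value is processed one at a time in the interpreter, while the C-implemented `sum` built-in plays the role of Spark's native column expressions that stay inside the JVM. The function name `interpreted_sum` is illustrative only and not part of any Spark API.

```python
import timeit

def interpreted_sum(xs):
    # Element-by-element work in the Python interpreter,
    # analogous to a PySpark UDF where each row is shipped
    # to a Python worker and processed one value at a time.
    total = 0
    for x in xs:
        total += x
    return total

data = list(range(100_000))

# Both approaches produce the same answer...
assert interpreted_sum(data) == sum(data)

# ...but the built-in (C-implemented) version is typically much
# faster, just as Spark's built-in functions usually outperform
# equivalent Python UDFs that pay per-row interpreter overhead.
loop_time = timeit.timeit(lambda: interpreted_sum(data), number=20)
builtin_time = timeit.timeit(lambda: sum(data), number=20)
print(f"loop: {loop_time:.3f}s  builtin: {builtin_time:.3f}s")
```

The same reasoning is why PySpark code is usually faster when it sticks to built-in DataFrame functions instead of Python UDFs: built-ins avoid serializing every row across the JVM-Python boundary.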
Apache Kafka use cases