Spark Components for Data Analysis:
1. Spark Core- 💡
It contains the basic functionality of Spark, including components for task scheduling, memory management, fault recovery, and interacting with storage systems.
2. Spark SQL- 📊
It is Spark’s package for working with structured data. It allows querying data via SQL as well as the Apache Hive variant of SQL, called the Hive Query Language (HQL).
3. Spark Streaming- 🔎
It is a Spark component that enables processing of live streams of data.
4. MLlib- 🎰
Spark comes with a library containing common machine learning (ML) functionality, called MLlib. MLlib provides multiple types of machine learning algorithms, including classification, regression, clustering, and collaborative filtering.
5. GraphX- 📈
It is a library for manipulating graphs (e.g., a social network’s friend graph) and performing graph-parallel computations.