A quick deep-dive into Apache Spark, one of the most popular distributed data engineering tools.
What is Spark? Why is it so popular? When and how should you use it?
Learn the differences between its subcomponents (RDDs, DataFrames, SQL, Streaming, ...), set up PySpark, and write Spark transformations using Python in a Jupyter Notebook.