ProgramGuru
Apache Spark Course
Apache Spark Course for Beginners
Module 1: Introduction to Big Data
1. What is Big Data?
2. 3Vs of Big Data: Volume, Velocity, Variety
3. Limitations of Traditional Data Processing
4. Overview of Big Data Tools: Hadoop, Spark, Hive
Module 2: Getting Started with Apache Spark
1. What is Apache Spark?
2. Use Cases of Apache Spark in Industry
3. Installing Apache Spark on Windows (Step-by-Step Guide)
4. Installing Apache Spark on a Local Linux System
5. Installing Apache Spark on macOS
6. Running Your First Spark Application
Module 3: Apache Spark Architecture
1. Spark Ecosystem Components: Core, SQL, Streaming, MLlib
2. Driver, Executors, and Cluster Manager in Apache Spark
3. Job, Stage, and Task in Apache Spark
4. Understanding DAG and Lazy Evaluation in Apache Spark
Module 4: RDDs - Resilient Distributed Datasets
1. What is an RDD in Apache Spark?
2. Creating RDDs from Collections and Files
3. Transformations vs Actions in Apache Spark
4. Persistence and Caching in Apache Spark
5. Limitations of RDDs
Module 5: Introduction to DataFrames
1. Why DataFrames over RDDs?
2. Creating and Displaying DataFrames in PySpark
3. Reading CSV, JSON, and Parquet Files in PySpark
4. Selecting, Filtering, and Transforming Data in Spark DataFrames
Module 6: PySpark Essentials
1. Setting up PySpark in Jupyter Notebook
2. Basic DataFrame Operations in PySpark
3. Working with Columns, Expressions, and User-Defined Functions in PySpark
4. Practical Examples with Real Datasets
Module 7: Spark SQL
1. Creating Temporary Views and Global Views in Spark SQL
2. Using SQL Queries on DataFrames in Spark
3. Common Aggregations and Joins in Spark SQL
4. Optimization using Catalyst in Apache Spark
Module 8: Working with Complex Data
1. Dealing with Nested JSON in Apache Spark
2. Exploding Arrays and Structs in Apache Spark
3. Flattening Hierarchical Data
4. Schema Evolution and Inference in Apache Spark
Module 9: Data Cleaning & Transformations
1. Dropping Nulls and Handling Missing Values in Apache Spark
2. Replacing, Filtering, and Grouping Data in PySpark
3. Window Functions in Apache Spark
4. Data Aggregation and Pivots in PySpark
Module 10: Spark Streaming
1. What is Spark Streaming?
2. Micro-batching and Continuous Data Processing in Spark Streaming
3. Reading from Kafka, Socket, and File Stream in Spark
4. Window Operations and Aggregations in Spark Streaming
Module 11: Introduction to Machine Learning with Spark MLlib
1. Overview of Spark MLlib
2. Feature Engineering with VectorAssembler and StringIndexer in PySpark
3. Building Machine Learning Pipelines in Apache Spark
4. Classification Using Logistic Regression in Spark MLlib
5. Linear Regression in Spark MLlib
6. Clustering with KMeans in Spark MLlib
Module 12: Project – Real-World Data Pipeline
1. Movie Ratings Project with Apache Spark