(BIG-DATA-PYTHON.AJ1) / ISBN : 978-1-64459-315-8
This course includes
Lessons
TestPrep
Hands-On Labs
AI Tutor (Add-on)

Lessons

9+ Lessons | 20+ Exercises | 50+ Quizzes | 65+ Flashcards | 65+ Glossary Terms

TestPrep

30+ Pre Assessment Questions | 30+ Post Assessment Questions

Hands-On Labs

48+ LiveLabs | 12+ Video tutorials | 20+ Minutes

1

Preface

  • About
2

The Python Data Science Stack

  • Introduction
  • Python Libraries and Packages
  • Using Pandas
  • Data Type Conversion
  • Aggregation and Grouping
  • Exporting Data from Pandas
  • Visualization with Pandas
  • Summary
3

Statistical Visualizations

  • Introduction
  • Types of Graphs and When to Use Them
  • Components of a Graph
  • Seaborn
  • Which Tool Should Be Used?
  • Types of Graphs
  • Pandas DataFrames and Grouped Data
  • Changing Plot Design: Modifying Graph Components
  • Exporting Graphs
  • Summary
4

Working with Big Data Frameworks

  • Introduction
  • Hadoop
  • Spark
  • Writing Parquet Files
  • Handling Unstructured Data
  • Summary
5

Diving Deeper with Spark

  • Introduction
  • Getting Started with Spark DataFrames
  • Writing Output from Spark DataFrames
  • Exploring Spark DataFrames
  • Data Manipulation with Spark DataFrames
  • Graphs in Spark
  • Summary
6

Handling Missing Values and Correlation Analysis

  • Introduction
  • Setting up the Jupyter Notebook
  • Missing Values
  • Handling Missing Values in Spark DataFrames
  • Correlation
  • Summary
7

Exploratory Data Analysis

  • Introduction
  • Defining a Business Problem
  • Translating a Business Problem into Measurable Metrics and Exploratory Data Analysis (EDA)
  • Structured Approach to the Data Science Project Life Cycle
  • Summary
8

Reproducibility in Big Data Analysis

  • Introduction
  • Reproducibility with Jupyter Notebooks
  • Gathering Data in a Reproducible Way
  • Code Practices and Standards
  • Avoiding Repetition
  • Summary
9

Creating a Full Analysis Report

  • Introduction
  • Reading Data in Spark from Different Data Sources
  • SQL Operations on a Spark DataFrame
  • Generating Statistical Measurements
  • Summary

1

The Python Data Science Stack

  • Interacting with the Python Shell
  • Calculating the Square
  • Grouping a DataFrame
  • Applying a Function to a Column
  • Subsetting a DataFrame
  • Slicing and Subsetting
  • Reading Data from a CSV File
  • Viewing the Standard Deviation
  • Calculating the Median Value
  • Calculating the Mean Value
2

Statistical Visualizations

  • Plotting an Analytical Graph
  • Creating a Graph
  • Creating a Graph for a Mathematical Function
  • Creating a Line Graph Using Seaborn
  • Creating a Line Graph Using pandas
  • Creating a Line Graph Using matplotlib
  • Detecting Outliers
  • Displaying Histograms
  • Using a Box Plot
  • Constructing a Scatterplot
  • Plotting a Line Graph with Styles and Color
  • Configuring a Title and Labels for Axis Objects
  • Designing a Complete Plot
  • Exporting a Graph to a File on a Disk
3

Working with Big Data Frameworks

  • Performing DataFrame Operations in Spark
  • Accessing Data with Spark
  • Parsing Text in Spark
4

Diving Deeper with Spark

  • Creating a DataFrame Using a CSV File
  • Creating a DataFrame from an Existing RDD
  • Specifying the Schema of a DataFrame
  • Removing a Column from a DataFrame
  • Renaming a Column in a DataFrame
  • Adding a Column to a DataFrame
  • Creating a KDE Plot
  • Creating a Linear Model Plot
  • Creating a Bar Chart
5

Handling Missing Values and Correlation Analysis

  • Filtering Data
  • Counting Missing Values
  • Handling NaN Values
  • Using the Backward and Forward Filling Methods
  • Calculating Correlation Coefficient
6

Exploratory Data Analysis

  • Generating the Feature Importance of the Target Variable
  • Identifying the Target Variable
  • Plotting a Heatmap
  • Generating a Normal Distribution Plot
7

Reproducibility in Big Data Analysis

  • Performing Data Reproducibility
  • Preprocessing Missing Values with High Reproducibility
  • Normalizing the Data
