Big Data Hadoop and Spark Developer Training Course

$699.00

Our online Big Data Hadoop certification training course helps you master the Hadoop framework and big data tools, and pass the Cloudera CCA175 certification exam. Enroll now!

Program Overview:

The AWS Big Data certification training prepares you to host big data and perform
distributed processing on the AWS platform, and is aligned with the AWS Certified
Data Analytics – Specialty exam. The course is developed by industry leaders and
follows the latest best practices.

Program Features:

  • Four industry-based course-end projects
  • Interactive learning with Jupyter notebook-integrated labs
  • Dedicated mentoring session from our industry-expert faculty members

Delivery Mode:

Blended – Online self-paced learning

Prerequisites:

Participants in this AWS Big Data Certification online course should have:

  • Basic knowledge of AWS technical essentials
  • Fair understanding of big data and Hadoop concepts

Target Audience:

This course is best suited for the following professionals:

  • Data scientists
  • Data engineers
  • Solutions architects
  • Data analysts

Key Learning Outcomes:

When you complete this AWS Big Data Certification course, you will be able to:

  • Understand how to use Amazon EMR to process data using Hadoop ecosystem tools
  • Understand how to use Amazon Kinesis for real-time big data processing
  • Analyze and transform big data using Kinesis Streams
  • Visualize data and perform queries using Amazon QuickSight

Certification Details and Criteria:

  • At least 85 percent attendance in one live virtual classroom
  • A score of at least 75 percent on the course-end assessment

Course Curriculum:

Lesson 01 – AWS in Big Data Introduction

  • Introduction to Cloud Computing
  • Cloud Computing Deployment Models
  • Amazon Web Services Cloud Platform
  • The Cloud Computing Difference
  • AWS Cloud Economics
  • AWS Virtuous Cycle
  • AWS Cloud Architecture Design Principles
  • Why AWS for Big Data – Reasons
  • Why AWS for Big Data – Challenges
  • Databases in AWS
  • Relational vs Non-Relational Databases
  • Data Warehousing in AWS
  • Services for Collecting, Processing, Storing, and Analyzing Big Data
    1. Amazon Redshift
    2. Amazon Kinesis
    3. Amazon EMR
    4. Amazon DynamoDB
    5. Amazon Machine Learning
    6. AWS Lambda
    7. Amazon Elasticsearch Service
    8. Amazon EC2 (big data analytics software on EC2 instances)
  • Key Takeaway
  • Knowledge Checks
  • Lesson End Project

Lesson 02 – Collection

  • Objectives
  • Amazon Kinesis Fundamentals
  • Loading Data into Kinesis Stream
  • Kinesis Data Stream High-Level Architecture
  • Kinesis Stream Core Concepts
  • Kinesis Stream Emitting Data to AWS Services
  • Kinesis Connector Library
  • Kinesis Firehose
  • Transferring Data Using Lambda
  • Amazon SQS
  • IoT and Big Data
  • IoT Framework
  • AWS Data Pipeline
  • AWS Data Pipeline Components
  • Key Takeaway
  • Knowledge Checks
  • Lesson End Project

Lesson 03 – Storage

  • Objectives
  • Introduction to AWS Big Data Storage Services
  • Amazon Glacier
  • Glacier and Big Data
  • DynamoDB Introduction
  • The Architecture of the DynamoDB Table
  • DynamoDB in AWS Ecosystem
  • DynamoDB Partitions
  • Data Distribution
  • Local Secondary Index (LSI)
  • Global Secondary Index (GSI)
  • DynamoDB GSI vs LSI
  • DynamoDB Stream
  • Cross-Region Replication in DynamoDB
  • Partition Key Selection
  • Snowball & AWS Big Data
  • AWS DMS
  • Amazon Aurora in Big Data
  • Key Takeaway
  • Knowledge Checks
  • Lesson End Project

Lesson 04 – Processing I

  • Objectives
  • Introduction to AWS Big Data Processing Services
  • Amazon Elastic MapReduce (EMR)
  • Apache Hadoop
  • EMR Architecture
  • Storage Options
  • EMR File Storage and Compression
  • Supported File Format and File Size
  • Single-AZ Concept
  • EMR Operations
  • EMR Releases
  • AWS Cluster
  • Launching a Cluster
  • Advanced EMR Setting Option
  • Choosing Instance Type
  • Number of Instances
  • Monitoring EMR
  • Resizing of Cluster
  • Using Hue with EMR
  • Set Up Hue for LDAP
  • Hive on EMR
  • Hive Use Cases
  • Key Takeaway
  • Knowledge Checks
  • Lesson End Project

Lesson 05 – Processing II

  • HBase with EMR
  • HBase Use Cases
  • Comparison of HBase with Redshift and DynamoDB
  • HBase Architecture
  • HBase on S3
  • HBase and EMRFS
  • HBase Integration
  • HCatalog
  • Presto with EMR
  • Advantages of Presto
  • Presto Architecture
  • Spark with EMR
  • Spark Use Cases
  • Spark Components
  • Spark Integration With EMR
  • AWS Lambda in AWS Big Data Ecosystem
  • Limitations of Lambda
  • Lambda and Kinesis Stream
  • Lambda and Redshift
  • Key Takeaway
  • Knowledge Checks
  • Lesson End Project

Lesson 06 – Analysis I

  • Objectives
  • Introduction to AWS Big Data Analysis Services
  • Redshift
  • Redshift Architecture
  • Redshift in the AWS Ecosystem
  • Columnar Databases
  • Redshift Table Design
  • Redshift Workload Management
  • Redshift Loading Data
  • Redshift Maintenance and Operations
  • Key Takeaway
  • Knowledge Checks
  • Lesson End Project

Lesson 07 – Analysis II

  • Machine Learning
  • Machine Learning – Use Cases
  • Algorithms
  • Amazon SageMaker
  • Elasticsearch
  • Amazon Elasticsearch Service
  • Loading of Data into Elasticsearch
  • Logstash
  • Kibana
  • RStudio
  • Characteristics
  • Athena
  • Presto and Hive
  • Integration with AWS Glue
  • Comparison of Athena with Other AWS Services
  • Lab: Run Query on S3 Using Serverless Athena
  • Key Takeaway
  • Knowledge Checks
  • Lesson End Project

Lesson 08 – Visualization

  • Objectives
  • Introduction to AWS Big Data Visualization Services
  • Amazon QuickSight
  • Amazon QuickSight – Use Cases
  • Lab: Create an Analysis with a Single Visual Using Sample Data
  • Working with Data
  • QuickSight Visualization
  • Big Data Visualization
  • Apache Zeppelin
  • Jupyter Notebook
  • Comparison Between Notebooks
  • D3.js (Data-Driven Documents)
  • MicroStrategy
  • Key Takeaway
  • Knowledge Checks
  • Lesson End Project

Lesson 09 – Security

  • Objectives
  • Introduction to AWS Big Data Security Services
  • EMR Security
  • Roles
  • Private Subnet
  • Encryption At Rest and In Transit
  • Redshift Security
  • KMS Overview
  • CloudHSM
  • Limit Data Access
  • STS and Cross Account Access
  • CloudTrail
  • Key Takeaway
  • Knowledge Checks
  • Lesson End Project