Course Outline

Introduction

  • Introduction to Cloud Computing and Big Data solutions
  • Overview of Apache Hadoop features and architecture

Setting up Hadoop

  • Planning a Hadoop cluster (on-premises, cloud, etc.)
  • Selecting the operating system and Hadoop distribution
  • Provisioning resources (hardware, network, etc.)
  • Downloading and installing the software
  • Sizing the cluster for flexibility

Working with HDFS

  • Understanding the Hadoop Distributed File System (HDFS)
  • Overview of the HDFS command reference
  • Accessing HDFS
  • Performing basic file operations on HDFS
  • Using S3 as a complement to HDFS

Overview of MapReduce

  • Understanding data flow in the MapReduce framework
  • Map, Shuffle, Sort, and Reduce
  • Demo: Computing Top Salaries
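
A minimal sketch of the four phases listed above, written in plain Python rather than against the Hadoop API. It is not the course's demo code: the sample records, the function names, and the choice of department as the grouping key are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical sample records: (department, employee, salary).
RECORDS = [
    ("engineering", "ana", 120000),
    ("engineering", "bo", 135000),
    ("sales", "cy", 90000),
    ("sales", "di", 87000),
    ("engineering", "ed", 110000),
]


def map_phase(records):
    """Map: emit one (key, value) pair per input record."""
    for department, employee, salary in records:
        yield department, (salary, employee)


def shuffle_and_sort(pairs):
    """Shuffle: group values by key. Sort: order the keys before reducing."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())


def reduce_phase(grouped, top_n=1):
    """Reduce: keep the top_n highest salaries for each key."""
    for key, values in grouped:
        yield key, sorted(values, reverse=True)[:top_n]


def top_salaries(records, top_n=1):
    """Chain the phases in the order the framework would run them."""
    return dict(reduce_phase(shuffle_and_sort(map_phase(records)), top_n))


if __name__ == "__main__":
    for department, winners in top_salaries(RECORDS).items():
        print(department, winners)
```

On a real cluster the map and reduce steps run on different nodes and the framework performs the shuffle and sort between them; here everything runs in one process so the data flow is easy to follow.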

Working with YARN

  • Understanding resource management in Hadoop
  • Working with ResourceManager, NodeManager, and ApplicationMaster
  • Scheduling jobs under YARN
  • Scheduling for large numbers of nodes and clusters
  • Demo: Job scheduling

Integrating Hadoop with Spark

  • Setting up storage for Spark (HDFS, Amazon S3, NoSQL, etc.)
  • Understanding Resilient Distributed Datasets (RDDs)
  • Creating an RDD
  • Implementing RDD transformations
  • Demo: Implementing a text search program for movie titles

Managing a Hadoop Cluster

  • Monitoring Hadoop
  • Securing a Hadoop cluster
  • Adding and removing nodes
  • Running a performance benchmark
  • Tuning a Hadoop cluster to optimise performance
  • Backup, recovery, and business continuity planning
  • Ensuring high availability (HA)

Upgrading and Migrating a Hadoop Cluster

  • Assessing workload requirements
  • Upgrading Hadoop
  • Moving from on-premises to cloud and vice versa
  • Recovering from failures

Troubleshooting

Summary and Conclusion

Requirements

  • Experience in system administration
  • Familiarity with the Linux command line
  • A solid understanding of big data concepts

Target Audience

  • System administrators
  • Database administrators (DBAs)

Duration

  • 35 Hours
