Course Outline
Big Data Overview:
- What is Big Data?
- Why Big Data is gaining popularity
- Big Data case studies
- Key characteristics of Big Data
- Solutions for working with Big Data
Hadoop & Its Components:
- What is Hadoop, and what are its core components?
- Hadoop architecture and the types of data it can handle and process.
- A brief history of Hadoop, including the companies that use it and the reasons behind their adoption.
- Detailed explanation of the Hadoop framework and its components.
- Understanding HDFS (Hadoop Distributed File System), including read and write operations.
- Setting up a Hadoop cluster in various modes: standalone, pseudo-distributed, and multi-node. This includes configuring a Hadoop cluster in VirtualBox, KVM, or VMware, managing critical network configurations, running Hadoop daemons, and testing the cluster.
- What is the MapReduce framework and how does it operate?
- Executing MapReduce jobs on a Hadoop cluster.
- Understanding replication, mirroring, and rack awareness in the context of Hadoop clusters.
Hadoop Cluster Planning:
- How to plan your Hadoop cluster effectively.
- Understanding the hardware and software requirements for cluster planning.
- Analysing workloads and planning the cluster to prevent failures and ensure optimal performance.
What is MapR and Why Choose MapR:
- An overview of MapR and its architecture.
- Understanding and working with the MapR Control System, MapR Volumes, snapshots, and mirrors.
- Planning a cluster within the MapR context.
- Comparing MapR with other distributions and Apache Hadoop.
- Installing MapR and deploying a cluster.
Cluster Setup & Administration:
- Managing services, nodes, snapshots, mirror volumes, and remote clusters.
- Understanding and managing individual nodes.
- Understanding Hadoop components and installing them alongside MapR services.
- Accessing data on the cluster, including via NFS.
- Managing data using volumes.
- Handling users and groups.
- Assigning roles to nodes.
- Commissioning and decommissioning nodes.
- Cluster administration and performance monitoring.
- Configuring and analysing metrics for performance tracking.
- Configuring and administering MapR security.
- Understanding and working with M7, the native storage solution for MapR tables.
- Configuring and tuning the cluster for optimum performance.
Cluster Upgrade and Integration with Other Setups:
- Upgrading the MapR software version and understanding the different types of upgrades.
- Configuring a MapR cluster to access an HDFS cluster.
- Setting up a MapR cluster on Amazon Elastic MapReduce.
All the above topics include demonstrations and practical sessions to provide learners with hands-on experience of the technology.
Requirements
- Basic knowledge of the Linux File System (FS)
- Fundamental Java skills
- Familiarity with Apache Hadoop (recommended)
28 Hours
Testimonials (1)
practical things of doing, also theory was served good by Ajay