Cloudera Designing & Building Big Data Applications (CDBBDA)

$3,095.00

New Age Technologies has been delivering Authorized Training since 1996. We offer Cloudera’s full suite of authorized courses, covering Apache Spark, Hadoop, Apache HBase, MapReduce, Data Science, Big Data Applications, and more. If you have any questions or can’t find the Cloudera class you are interested in, contact one of our Cloudera Training Specialists. Invest in your future today with Cloudera training from New Age Technologies.

Cloudera Training Specialists | ☏ 502.909.0819

Current Promotion

  • Enter code "CLOUDERA10" at checkout to receive 10% off, or request the gift card equivalent

Cloudera Designing & Building Big Data Applications Overview:

The hands-on Cloudera Designing & Building Big Data Applications course prepares you to work through the entire process of designing and building solutions: ingesting data, determining the appropriate file format for storage, processing the stored data, and presenting the results to the end user in an easy-to-digest form. You will go beyond MapReduce to use additional elements of the enterprise data hub (EDH) and develop converged applications that are highly relevant to the business.

Who Should Attend:

  • Developers, engineers, and architects who want to use Hadoop and related tools to solve real-world problems

Cloudera Designing & Building Big Data Applications Prerequisites:

Before attending this course, you should meet the following prerequisites:

  • Attendance at Cloudera Developer Training for Apache Hadoop, or equivalent practical experience (required)
  • Good knowledge of Java and basic familiarity with Linux (required)
  • Experience with SQL (helpful)

Cloudera Designing & Building Big Data Applications Objectives:

In this course, you will learn skills such as:

  • Creating a data set with Kite SDK
  • Developing custom Flume components for data ingestion
  • Managing a multi-stage workflow with Oozie
  • Analyzing data with Crunch
  • Writing user-defined functions for Hive and Impala
  • Transforming data with Morphlines
  • Indexing data with Cloudera Search

Cloudera Designing & Building Big Data Applications Certification:

  • Cloudera Certified Developer for Apache Hadoop (CCDH)

Cloudera Designing & Building Big Data Applications Outline:

Module 1: Application Architecture
  • Scenario Explanation
  • Understanding the Development Environment
  • Identifying and Collecting Input Data
  • Selecting Tools for Data Processing and Analysis
  • Presenting Results to the User
Module 2: Defining and Using Data Sets
  • Metadata Management
  • What is Apache Avro?
  • Avro Schemas
  • Avro Schema Evolution
  • Selecting a File Format
  • Performance Considerations
Module 3: Using the Kite SDK Data Module
  • What is the Kite SDK?
  • Fundamental Data Module Concepts
  • Creating New Data Sets Using the Kite SDK
  • Loading, Accessing, and Deleting a Data Set
Module 4: Importing Relational Data with Apache Sqoop
  • What is Apache Sqoop?
  • Basic Imports
  • Limiting Results
  • Improving Sqoop’s Performance
  • Sqoop 2
Module 5: Capturing Data with Apache Flume
  • What is Apache Flume?
  • Basic Flume Architecture
  • Flume Sources
  • Flume Sinks
  • Flume Configuration
  • Logging Application Events to Hadoop
Module 6: Developing Custom Flume Components
  • Flume Data Flow and Common Extension Points
  • Custom Flume Sources
  • Developing a Flume Pollable Source
  • Developing a Flume Event-Driven Source
  • Custom Flume Interceptors
  • Developing a Header-Modifying Flume Interceptor
  • Developing a Filtering Flume Interceptor
  • Writing Avro Objects with a Custom Flume Interceptor
Module 7: Managing Workflows with Apache Oozie
  • The Need for Workflow Management
  • What is Apache Oozie?
  • Defining an Oozie Workflow
  • Validation, Packaging, and Deployment
  • Running and Tracking Workflows Using the CLI
  • Hue UI for Oozie
Module 8: Processing Data Pipelines with Apache Crunch
  • What is Apache Crunch?
  • Understanding the Crunch Pipeline
  • Comparing Crunch to Java MapReduce
  • Working with Crunch Projects
  • Reading and Writing Data in Crunch
  • Data Collection API
  • Functions
  • Utility Classes in the Crunch API
Module 9: Working with Tables in Apache Hive
  • What is Apache Hive?
  • Accessing Hive
  • Basic Query Syntax
  • Creating and Populating Hive Tables
  • How Hive Reads Data
  • Using the RegexSerDe in Hive
Module 10: Developing User-Defined Functions
  • What are User-Defined Functions?
  • Implementing a User-Defined Function
  • Deploying Custom Libraries in Hive
  • Registering a User-Defined Function in Hive
Module 11: Executing Interactive Queries with Impala
  • What is Impala?
  • Comparing Hive to Impala
  • Running Queries in Impala
  • Support for User-Defined Functions
  • Data and Metadata Management
Module 12: Understanding Cloudera Search
  • What is Cloudera Search?
  • Search Architecture
  • Supported Document Formats
Module 13: Indexing Data with Cloudera Search
  • Collection and Schema Management
  • Morphlines
  • Indexing Data in Batch Mode
  • Indexing Data in Near Real Time
Module 14: Presenting Results to Users
  • Solr Query Syntax
  • Building a Search UI with Hue
  • Accessing Impala through JDBC
  • Powering a Custom Web Application with Impala and Search

Boost your salary by obtaining your Cloudera certification: Cloudera Certified Developer for Apache Hadoop (CCDH).
