Maxtrain.com - 513-322-8888 - 866-595-6863


Extracting Business Value from Big Data with Pig and Hive


Increase productivity by avoiding the low-level Java coding that MapReduce requires, and quickly begin extracting business value from your data for competitive advantage. In this big data training course, you will learn to gain access to previously inaccessible data, gather and feed data into Hadoop for storage, transform and filter data using Pig, and extract value using Hive and Spark SQL.

You Will Learn How To

  • Manipulate complex data sets stored in Hadoop for competitive advantage
  • Automate the transfer of data into Hadoop storage with Flume and Sqoop
  • Filter data with Extract-Transform-Load (ETL) operations using Pig
  • Query multiple data sets for analysis with Pig and Hive
  • Perform real-time queries on Hadoop data with Tez and Spark SQL



The Hadoop Ecosystem

  • Hadoop overview
  • Surveying the Hadoop components
  • Defining the Hadoop architecture

Exploring HDFS and MapReduce

Storing data in HDFS

  • Achieving reliable and secure storage
  • Monitoring storage metrics
  • Controlling HDFS from the command line

Parallel processing with MapReduce

  • Detailing the MapReduce approach
  • Transferring algorithms, not data
  • Dissecting the key stages of a MapReduce job

Automating data transfer

  • Facilitating data ingress and egress
  • Aggregating data with Flume
  • Configuring data fan-in and fan-out
  • Moving relational data with Sqoop

Executing Data Flows with Pig

Describing characteristics of Apache Pig

  • Contrasting Pig with MapReduce
  • Identifying Pig use cases
  • Pinpointing key Pig configurations

Structuring unstructured data

  • Representing data in Pig's data model
  • Running Pig Latin commands at the Grunt shell
  • Expressing transformations in Pig Latin syntax
  • Invoking Load and Store functions

Performing ETL with Pig

Transforming data with relational operators

  • Creating new relations with joins
  • Reducing data size by sampling
  • Extending Pig with user-defined functions

Filtering data with Pig

  • Consolidating data sets with unions
  • Partitioning data sets with splits
  • Injecting parameters into Pig scripts

Manipulating Data with Hive

Leveraging business advantages of Hive

  • Factoring Hive into components
  • Imposing structure on data with Hive

Organizing data in the Hive data warehouse

  • Creating Hive databases and tables
  • Contrasting available data types in Hive
  • Loading and storing data efficiently with SerDes

Designing data layout for maximum performance

  • Populating tables from queries
  • Partitioning Hive tables for optimal queries
  • Composing HiveQL queries

Extracting Business Value with HiveQL

Performing joins on unstructured data

  • Distinguishing joins available in Hive
  • Optimizing join structure for performance

Pushing HiveQL to the limit

  • Sorting, distributing and clustering data
  • Reducing query complexity with views
  • Improving query performance with indexes

Deploying Hive in production

  • Designing Hive schemas
  • Setting up data compression
  • Debugging Hive scripts

Streamlining storage management with HCatalog

  • Unifying the data view with HCatalog
  • Leveraging HCatalog to access the Hive metastore
  • Communicating via the HCatalog interfaces
  • Populating a Hive table from Pig

Interacting with Hadoop Data in Real Time

  • Performing low-latency queries with Impala
  • Leveraging the Tez execution engine to improve performance
  • Reducing data access time with Spark SQL



Recommended Experience: Knowledge of databases and SQL.




Anyone with working knowledge of databases and SQL.

$2990.00 List Price

4-Day Course

