Maxtrain.com - [email protected] - 513-322-8888 - 866-595-6863
Hadoop Programming on the Cloudera Platform is a 5-day, instructor-led training course that introduces you to Apache Hadoop and key Hadoop ecosystem projects: Pig, Hive, Sqoop, Impala, Oozie, HBase, and Spark.
This intensive Hadoop training course combines lectures with guided, hands-on labs that help you build theoretical knowledge of, and gain practical experience with, Apache Hadoop and related Apache projects.
You will learn:
Lab 1. Learning the Lab Environment
Lab 2. The Hadoop Distributed File System
Lab 3. Hadoop Streaming MapReduce
Lab 4. Programming Java MapReduce Jobs on Hadoop
Lab 5. Getting Started with Apache Pig
Lab 6. Apache Pig HDFS Command-Line Interface
Lab 7. Working with Data Sets in Apache Pig
Lab 8. Using Relational Operators in Apache Pig
Lab 9. The Hive and Beeline Shells
Lab 10. Hive Data Definition Language
Lab 11. Using Select Statement in HiveQL
Lab 12. Table Partitioning in Hive
Lab 13. Data Import and Export with Sqoop
Lab 14. Using Impala
Lab 15. Elements of Functional Programming with Python
Lab 16. Using the spark-submit Tool
Lab 17. The Spark Shell
Lab 18. RDD Performance Improvement Techniques
Lab 19. Spark ETL and HDFS Interface
Lab 20. Using Broadcast Variables
Lab 21. Using Accumulators
Lab 22. Common Map / Reduce Programs in Spark
Lab 23. Spark SQL
Lab 24. Getting Started with GraphX
Lab 25. PageRank with GraphX
Lab 26. Using Random Forests for Classification with Spark MLlib
Lab 27. Using k-means Algorithm from MLlib
Participants should have general knowledge of programming in Java and SQL, as well as experience working in Unix environments (e.g., running shell commands).
Business Analysts, IT Architects, Technical Managers and Developers
5-Day Course