This course provides a technical overview of Apache Hadoop. It includes high-level information about concepts, architecture, operation, and uses of the Hortonworks Data Platform (HDP) and the Hadoop ecosystem. The course also serves as an optional primer for those who plan to attend a hands-on, instructor-led course.
Instructor did a great job. From experience, this subject can be a bit dry to teach, but he was able to keep it very engaging and made it much easier to focus.
Excellent presentation skills, subject matter knowledge, and command of the environment.
Instructor was outstanding. Knowledgeable, presented well, and class timing was perfect.
No previous Hadoop or programming knowledge is required. Students will need browser access to the Internet.
Detailed Class Syllabus
Describe the use case for Hadoop
Identify Hadoop Ecosystem architectural categories
Data Governance and Integration
Detail the HDFS architecture
Describe data ingestion options and frameworks for batch and real-time streaming
Explain the fundamentals of parallel processing
See popular data transformation and processing engines in action
Apache Hive
Detail the architecture and features of YARN
Describe how to secure Hadoop
Operational overview with Ambari
Loading data into HDFS
Data manipulation with Hive
Risk Analysis with Pig
Risk Analysis with Spark and Zeppelin
Securing Hive with Ranger