The Ultimate Hands-On Hadoop - Tame your Big Data!

4.5
21,432 comments
Payment
Paid course
Certificate
Free certificate
Duration
14.5 hours of course content
About the course

The world of Hadoop and "Big Data" can be intimidating - hundreds of different technologies with cryptic names form the Hadoop ecosystem. With this Hadoop tutorial, you'll not only understand what those systems are and how they fit together - but you'll go hands-on and learn how to use them to solve real business problems!

Learn and master the most popular big data technologies in this comprehensive course, taught by a former engineer and senior manager from Amazon and IMDb. We'll go way beyond Hadoop itself, and dive into all sorts of distributed systems you may need to integrate with.

  • Install and work with a real Hadoop installation right on your desktop with Hortonworks (now part of Cloudera) and the Ambari UI
  • Manage big data on a cluster with HDFS and MapReduce
  • Write programs to analyze data on Hadoop with Pig and Spark
  • Store and query your data with Sqoop, Hive, MySQL, HBase, Cassandra, MongoDB, Drill, Phoenix, and Presto
  • Design real-world systems using the Hadoop ecosystem
  • Learn how your cluster is managed with YARN, Mesos, Zookeeper, Oozie, Zeppelin, and Hue
  • Handle streaming data in real time with Kafka, Flume, Spark Streaming, Flink, and Storm

Understanding Hadoop is a highly valuable skill for anyone working at companies with large amounts of data.

Almost every large company you might want to work at uses Hadoop in some way, including Amazon, eBay, Facebook, Google, LinkedIn, IBM, Spotify, Twitter, and Yahoo! And it's not just technology companies that need Hadoop; even the New York Times uses Hadoop for processing images.

This course is comprehensive, covering over 25 different technologies in over 14 hours of video lectures. It's filled with hands-on activities and exercises, so you get some real experience in using Hadoop - it's not just theory.

You'll find a range of activities in this course for people at every level. If you're a project manager who just wants to learn the buzzwords, there are web UIs for many of the activities in the course that require no programming knowledge. If you're comfortable with command lines, we'll show you how to work with them too. And if you're a programmer, I'll challenge you with writing real scripts on a Hadoop system using Scala, Pig Latin, and Python.

You'll walk away from this course with a real, deep understanding of Hadoop and its associated distributed systems, and you can apply Hadoop to real-world problems. Plus a valuable completion certificate is waiting for you at the end! 

Please note that the focus of this course is on application development, not Hadoop administration, although you will pick up some administration skills along the way.

Knowing how to wrangle "big data" is an incredibly valuable skill for today's top tech employers. Don't be left behind - enroll now!

  • "The Ultimate Hands-On Hadoop was a crucial discovery for me. I supplemented your course with a bunch of literature and conferences until I managed to land an interview. I can proudly say that I landed a job as a Big Data Engineer around a year after I started your course. Thanks so much for all the great content you have generated and the crystal clear explanations." - Aldo Serrano
  • "I honestly wouldn’t be where I am now without this course. Frank makes the complex simple by helping you through the process every step of the way. Highly recommended and worth your time especially the Spark environment. This course helped me achieve a far greater understanding of the environment and its capabilities." - Tyler Buck
Syllabus
Learn all the buzzwords! And install the Hortonworks Data Platform Sandbox.
Identify the major components of the Hadoop ecosystem, and run Hadoop on your desktop.
Udemy 101: Getting the Most From This Course
How to ask questions, tune the video playback, enable captions, and leave reviews.
Tips for Using This Course
If you have trouble downloading Hortonworks Data Platform...
Installing Hadoop [Step by Step]
After a quick intro, we'll dive right in and install Hortonworks Sandbox in a virtual machine right on your own PC. This is the quickest way to get up and running with Hadoop so you can start learning and experimenting with it. We'll then download some real movie ratings data, and use Hive to analyze it!
Hadoop Overview and History
What's Hadoop for? What problems does it solve? Where did it come from? We'll learn Hadoop's backstory in this lecture.
Overview of the Hadoop Ecosystem
We'll take a quick tour of all the technologies we'll cover in this course, and how they all fit together. You'll come out of this lecture knowing all the buzzwords!
Using Hadoop's Core: HDFS and MapReduce
Learn how HDFS distributes your data across a cluster, how to manage files on HDFS, and use MapReduce to analyze this data.
HDFS: What it is, and how it works
Learn how Hadoop's Distributed Filesystem allows you to store massive data sets across a cluster of commodity computers, in a reliable and scalable manner.
Installing the MovieLens Dataset
Before we can analyze movie ratings data from GroupLens using Hadoop, we need to load it into HDFS. You don't need to mess with command lines or programming to use HDFS. We'll start by importing some real movie ratings data into HDFS just using a web-based UI provided by Ambari.
[Activity] Install the MovieLens dataset into HDFS using the command line
Developers might be more comfortable interacting with HDFS via the command line interface. We'll import the same data, this time from a terminal prompt.
MapReduce: What it is, and how it works
Learn how mappers and reducers provide a clever way to analyze massive distributed datasets quickly and reliably.
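To make the mapper and reducer idea concrete, here is a minimal in-memory sketch in plain Python. It is not the Hadoop API and not necessarily how the course implements it: the sample records, the (user_id, movie_id, rating) layout, and the function names are all assumptions made for illustration. It counts how many times each rating value appears.

```python
from collections import defaultdict

# Hypothetical sample records: (user_id, movie_id, rating).
# The real MovieLens files are far larger; this layout is an assumption.
RATINGS = [
    (1, 50, 5),
    (1, 172, 4),
    (2, 50, 4),
    (3, 133, 1),
    (3, 172, 4),
]


def mapper(record):
    """Emit one (rating, 1) key/value pair per input record."""
    _user_id, _movie_id, rating = record
    yield rating, 1


def shuffle(pairs):
    """Group values by key, imitating what a framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reducer(key, values):
    """Collapse all values collected for one key into a single count."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle, and reduce in sequence on one machine."""
    mapped = (pair for record in records for pair in mapper(record))
    grouped = shuffle(mapped)
    return dict(reducer(key, values) for key, values in sorted(grouped.items()))


rating_counts = run_job(RATINGS)
print(rating_counts)  # {1: 1, 4: 3, 5: 1}
```

On a real cluster the mapper calls would run in parallel on the nodes that hold the data, and the shuffle step would move each key's values to a reducer over the network; this sketch only shows the data flow.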
Requirements
  • You will need access to a PC running 64-bit Windows, macOS, or Linux with an Internet connection and at least 8 GB of free (not total) RAM, if you want to participate in the hands-on activities and exercises. If your PC does not meet these requirements, you can still follow along in the course without doing hands-on activities.
  • Some activities will require some prior programming experience, preferably in Python or Scala.
  • A basic familiarity with the Linux command line will be very helpful.
What will you learn?
  • Design distributed systems that manage "big data" using Hadoop and related technologies.
  • Use HDFS and MapReduce for storing and analyzing data at scale.
  • Use Pig and Spark to create scripts to process data on a Hadoop cluster in more complex ways.
  • Analyze relational data using Hive and MySQL
  • Analyze non-relational data using HBase, Cassandra, and MongoDB
  • Query data interactively with Drill, Phoenix, and Presto
  • Choose an appropriate data storage technology for your application
  • Understand how Hadoop clusters are managed by YARN, Tez, Mesos, Zookeeper, Zeppelin, Hue, and Oozie.
  • Publish data to your Hadoop cluster using Kafka, Sqoop, and Flume
  • Consume streaming data using Spark Streaming, Flink, and Storm
Instructors
Sundog Education by Frank Kane
Founder, Sundog Education. Machine Learning Pro

Sundog Education's mission is to make highly valuable career skills in big data, data science, and machine learning accessible to everyone in the world. Our consortium of expert instructors shares our knowledge in these emerging fields with you, at prices anyone can afford. 

Sundog Education is led by Frank Kane and owned by Frank's company, Sundog Software LLC. Frank spent 9 years at Amazon and IMDb, developing and managing the technology that automatically delivers product and movie recommendations to hundreds of millions of customers, all the time. Frank holds 17 issued patents in the fields of distributed computing, data mining, and machine learning. In 2012, Frank left to start his own successful company, Sundog Software, which focuses on virtual reality environment technology, and teaching others about big data analysis.

Due to our volume of students we are unable to respond to private messages; please post your questions within the Q&A of your course. Thanks for understanding.

Frank Kane
Founder, Sundog Education

Frank spent 9 years at Amazon and IMDb, developing and managing the technology that automatically delivers product and movie recommendations to hundreds of millions of customers, all the time. Frank holds 17 issued patents in the fields of distributed computing, data mining, and machine learning. In 2012, Frank left to start his own successful company, Sundog Software, which focuses on virtual reality environment technology, and teaching others about big data analysis.

Due to our volume of students, I am unable to respond to private messages; please post your questions within the Q&A of your course. Thanks for understanding.

Platform
Udemy
Udemy courses are well suited for professional development. The platform is set up so that experts launch their own courses. All materials come with lifetime access. You can find a course on this platform on, without exaggeration, any topic, from a tutorial for a particular camera to a theoretical course on financial risk management. The language and format of instruction are set by the instructor, so it is worth reading the course information carefully before purchasing.
122.99 $ 189.99 $
Rating
4.5
12,115
7,263
1,659
254
201