Course: Streaming Big Data with Spark Streaming & Scala - Hands On!

4.6
2,717 reviews
Payment
Paid course
Certificate
Free certification
Duration
6.5 hours of course content
About the course

New! Updated for Spark 3.0.0!

"Big Data" analysis is a hot and highly valuable skill. Thing is, "big data" never stops flowing! Spark Streaming is a new and quickly developing technology for processing massive data sets as they are created - why wait for some nightly analysis to run when you can constantly update your analysis in real time, all the time? Whether it's clickstream data from a big website, sensor data from a massive "Internet of Things" deployment, financial data, or something else - Spark Streaming is a powerful technology for transforming and analyzing that data right when it is created, all the time.

You'll be learning from an ex-engineer and senior manager from Amazon and IMDb.

This course gets your hands on some real live Twitter data, simulated streams of Apache access logs, and even data used to train machine learning models! You'll write and run real Spark Streaming jobs right at home on your own PC, and toward the end of the course, we'll show you how to take those jobs to a real Hadoop cluster and run them in a production environment too.

Across over 30 lectures and almost 6 hours of video content, you'll:

  • Get a crash course in the Scala programming language
  • Learn how Apache Spark operates on a cluster
  • Set up discretized streams with Spark Streaming and transform them as data is received
  • Use structured streaming to stream into dataframes in real-time
  • Analyze streaming data over sliding windows of time
  • Maintain stateful information across streams of data
  • Connect Spark Streaming with highly scalable sources of data, including Kafka, Flume, and Kinesis
  • Dump streams of data in real-time to NoSQL databases such as Cassandra
  • Run SQL queries on streamed data in real time
  • Train machine learning models in real time with streaming data, and use them to make predictions that keep getting better over time
  • Package, deploy, and run self-contained Spark Streaming code on a real Hadoop cluster using Amazon Elastic MapReduce
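
As a rough illustration of two ideas from the list above, sliding windows and state carried across batches, here is a minimal sketch in plain Scala. It deliberately uses ordinary collections rather than the Spark Streaming API, and every name in it (WindowingSketch, countsPerBatch, windowedSums, runningTotals) is hypothetical and not taken from the course.

```scala
// Illustrative sketch only: a plain Scala List stands in for a stream of batches.
// None of these names come from the course or from the Spark API.
object WindowingSketch {
  // Hypothetical event counts, one entry per batch interval.
  val countsPerBatch: List[Int] = List(1, 2, 3, 4, 5)

  // Sliding window: sum the counts in each group of `windowLength` batches,
  // advancing by `slide` batches per step. The last group may be shorter
  // if the batches do not divide evenly.
  def windowedSums(counts: List[Int], windowLength: Int, slide: Int): List[Int] =
    counts.sliding(windowLength, slide).map(_.sum).toList

  // Stateful processing: carry a running total forward from batch to batch.
  def runningTotals(counts: List[Int]): List[Int] =
    counts.scanLeft(0)(_ + _).tail

  def main(args: Array[String]): Unit = {
    println(windowedSums(countsPerBatch, 3, 1)) // List(6, 9, 12)
    println(runningTotals(countsPerBatch))      // List(1, 3, 6, 10, 15)
  }
}
```

In a real streaming job the window would be defined in units of time rather than element counts, and the state would be kept by the framework between batches; the sketch only shows the shape of the two computations.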

This course is very hands-on, filled with achievable activities and exercises to reinforce your learning. By the end of this course, you'll be confidently creating Spark Streaming scripts in Scala, and be prepared to tackle massive streams of data in a whole new way. You'll be surprised at how easy Spark Streaming makes it!

Curriculum
Getting Started
Get your development environment for Spark Streaming set up.
Tip: Apply for a Twitter Developer Account now!
Warning about Java 11 and Spark 2.4!
Introduction, and Getting Set Up
A brief introduction to the course, and then we'll get your development environment for Spark and Scala all set up on your desktop. A quick test application will confirm Spark is working on your system! Remember - be sure to install Spark 2.2 for this course, and install Java 8, not Java 9.
[Activity] Stream Live Tweets with Spark Streaming!
Get set up with a Twitter developer account, and run your first Spark Streaming application to listen to and print out live Tweets as they happen!
Udemy 101: Getting the Most From This Course
A Crash Course in Scala
Write simple code in the Scala programming language, and understand functional programming.
[Activity] Scala Basics: Part 1
We start our crash course in the Scala programming language by covering some basics of the language: types and variables, printing, and boolean comparisons.
[Exercise] Scala Basics: Part 2
Part 2 of our introduction to the basics of Scala programming, and a simple exercise to get you writing your own Scala code.
[Exercise] Flow Control in Scala
Our Scala crash course continues, illustrating various means of flow control in Scala. For loops, do/while loops, while loops, etc.
[Exercise] Functions in Scala
Scala is a functional programming language, and so understanding how functions work and are treated in Scala is hugely important! This lecture covers the fundamentals, and lets you put it into practice.
[Exercise] Data Structures in Scala
We wrap up our Scala crash course with the data structures commonly used in Spark with Scala: tuples, lists, and maps.
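
To give a flavor of the topics named in this section (values and types, flow control, functions, and tuples, lists, and maps), here is a small self-contained sketch. It is not taken from the course materials; the object name and all identifiers are made up for illustration.

```scala
// Illustrative sketch only; all names here are hypothetical.
object ScalaCrashCourseSketch {
  // Immutable value with an explicit type.
  val greeting: String = "Hello"

  // Flow control: a for loop accumulating into a mutable variable.
  def sumUpTo(n: Int): Int = {
    var total = 0
    for (i <- 1 to n) total += i
    total
  }

  // Functions are values: `square` can be stored and passed around.
  val square: Int => Int = x => x * x

  // A higher-order function that takes another function as an argument.
  def applyTwice(f: Int => Int, x: Int): Int = f(f(x))

  // Tuple: a fixed-size group of values of possibly different types.
  val pair: (String, Int) = ("Spark", 3)

  // List: an immutable sequence, transformed with map and filter.
  val numbers: List[Int] = List(1, 2, 3, 4)
  val squares: List[Int] = numbers.map(square)
  val evens: List[Int] = numbers.filter(_ % 2 == 0)

  // Map: key/value lookup with a default for missing keys.
  val ports: Map[String, Int] = Map("http" -> 80, "https" -> 443)

  def main(args: Array[String]): Unit = {
    println(s"$greeting, ${pair._1} ${pair._2}") // Hello, Spark 3
    println(sumUpTo(4))                          // 10
    println(applyTwice(square, 3))               // 81
    println(squares)                             // List(1, 4, 9, 16)
    println(evens)                               // List(2, 4)
    println(ports.getOrElse("ftp", -1))          // -1
  }
}
```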
Requirements
  • To follow along with the examples, you'll need a personal computer. The course is filmed using Windows 10, but the tools we install are available for Linux and MacOS as well.
  • We'll walk through installing the required software in the first lecture: The Scala IDE, Spark, and a JDK.
  • My course "Taming Big Data with Apache Spark - Hands On!" would be a helpful introduction to Spark in general, but it is not required for this course. A quick introduction to Spark is included.
  • The course includes a crash course in the Scala programming language if you're new to it; if you already know Scala, then great.
What will you learn?
  • Process massive streams of real-time data using Spark Streaming
  • Integrate Spark Streaming with data sources, including Kafka, Flume, and Kinesis
  • Use Spark 2's Structured Streaming API
  • Create Spark applications using the Scala programming language
  • Output transformed real-time data to Cassandra or file systems
  • Integrate Spark Streaming with Spark SQL to query streaming data in real time
  • Train machine learning models with streaming data, and use those models for real-time predictions
  • Ingest Apache access log data and transform streams of it
  • Receive real-time streams of Twitter feeds
  • Maintain stateful data across a continuous stream of input data
  • Query streaming data across sliding windows of time
Instructors
Sundog Education by Frank Kane
Founder, Sundog Education. Machine Learning Pro

Sundog Education's mission is to make highly valuable career skills in big data, data science, and machine learning accessible to everyone in the world. Our consortium of expert instructors shares our knowledge in these emerging fields with you, at prices anyone can afford. 

Sundog Education is led by Frank Kane and owned by Frank's company, Sundog Software LLC. Frank spent 9 years at Amazon and IMDb, developing and managing the technology that automatically delivers product and movie recommendations to hundreds of millions of customers, all the time. Frank holds 17 issued patents in the fields of distributed computing, data mining, and machine learning. In 2012, Frank left to start his own successful company, Sundog Software, which focuses on virtual reality environment technology, and teaching others about big data analysis.

Due to our volume of students we are unable to respond to private messages; please post your questions within the Q&A of your course. Thanks for understanding.

Frank Kane
Founder, Sundog Education

Frank spent 9 years at Amazon and IMDb, developing and managing the technology that automatically delivers product and movie recommendations to hundreds of millions of customers, all the time. Frank holds 17 issued patents in the fields of distributed computing, data mining, and machine learning. In 2012, Frank left to start his own successful company, Sundog Software, which focuses on virtual reality environment technology, and teaching others about big data analysis.

Due to our volume of students, I am unable to respond to private messages; please post your questions within the Q&A of your course. Thanks for understanding.

Platform
Udemy
Udemy courses are well suited to professional development. The platform is set up so that experts launch their own courses. All materials come with lifetime access. On this platform you can find a course on, without exaggeration, any topic, from a tutorial for a particular camera to a theoretical course on financial risk management. The language and format of instruction are set by the instructor, so it is worth reading the course information carefully before buying.
96.99 $ 149.99 $
Rating
4.6
5 stars: 1,519
4 stars: 920
3 stars: 219
2 stars: 39
1 star: 20