Description
Introduction to Apache Spark is designed to introduce you to one of the most important Big Data technologies on the market, Apache Spark. You will start by learning some of the basic concepts behind Spark, including Resilient Distributed Datasets (RDDs), which tie everything together. From there, you will learn how to work with datasets in Spark using a functional programming approach as well as SQL. Finally, you will learn how to use the Eclipse IDE to write programs that work with data, along with a common technique for deploying code for Apache Spark jobs.
Instructor
MITCHELL PEARSON
BI Consultant and Trainer
As a Business Intelligence Consultant and Trainer for Pragmatic Works, Mitchell’s focus is on the full BI Stack (SSIS, SSAS and SSRS). In addition to the BI Stack, he also has experience with Data Modeling, T-SQL, MDX, Power Pivot and the Power BI Tools. Mitchell graduated from the University of North Florida in 2007 and is constantly expanding his knowledge of all things SQL Server.
What to Know Before the Class
This course is aimed at application and database developers interested in learning about Big Data technologies. No knowledge of Spark or Hadoop is assumed. Knowledge of a development language such as Java, C#, or Python is helpful but not required.