Big Data Analytics Using Spark

Course Starts on April 1, 2018

The University of California, San Diego is offering a free online course on Big Data Analytics Using Spark.

Course Details

In data science, data is called “big” if it cannot fit into the memory of a single standard laptop or workstation. The analysis of big datasets requires using a cluster of tens, hundreds or thousands of computers. Effectively using such clusters requires distributed file systems, such as the Hadoop Distributed File System (HDFS), and corresponding computational models, such as Hadoop MapReduce and Spark.
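As a rough illustration of the computational model mentioned above, the sketch below imitates the map, shuffle and reduce steps in plain Python on a single machine. It is not Spark or Hadoop code and is not taken from the course; the function names and the two sample partitions are assumptions made for this example. On a real cluster each partition would live on a different machine.

```python
from collections import defaultdict
from functools import reduce


def map_phase(partition):
    """Emit a (word, 1) pair for every word in one partition of the input."""
    return [(word.lower(), 1) for line in partition for word in line.split()]


def shuffle(pairs):
    """Group the emitted values by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Combine the values of each key into a single count."""
    return {key: reduce(lambda a, b: a + b, values) for key, values in groups.items()}


def word_count(partitions):
    """Run map on every partition, then shuffle and reduce the combined output."""
    mapped = [pair for part in partitions for pair in map_phase(part)]
    return reduce_phase(shuffle(mapped))


# Hypothetical input: two partitions standing in for data held on two machines.
partitions = [
    ["big data needs clusters"],
    ["spark processes big data"],
]
counts = word_count(partitions)
```

The map step never looks outside its own partition, which is what allows the work to be spread across many computers.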

Length: 10 weeks
Effort: 10 hours per week
Subject: Data Analysis & Statistics
Institution: University of California, San Diego and edX
Languages: English
Price: Free
Certificate Available: Yes, a Verified Certificate can be added for $350

Providers’ Details

The University of California, San Diego (UC San Diego) is a student-centered, research-focused, service-oriented public institution that provides opportunity for all. This young university has made its mark regionally, nationally and internationally.

Benefits

You will learn how to perform supervised and unsupervised machine learning on massive data sets using the Machine Learning Library (MLlib). In this course, as in the other courses in this MicroMasters program, you will gain hands-on experience using PySpark within the Jupyter notebook environment.

Learning Outcomes

  • Programming Spark using PySpark
  • Identifying the computational trade-offs in a Spark application
  • Performing data loading and cleaning using Spark and Parquet
  • Modeling data through statistical and machine learning methods

Instructors

Yoav Freund

Dr. Freund is a Professor of Computer Science and Engineering at the University of California, San Diego.
