Predictive Analytics Using Apache Spark MLlib on Databricks
Duration: 1h 57m | Updated: Oct 26, 2021 | Video: 1280x720, 48kHz | 272 MB
Genre: eLearning | Language: English | Level: Advanced
This course will teach you to understand and implement important predictive analytics techniques, such as regression and classification, using Apache Spark MLlib APIs on Databricks.
The Spark unified analytics engine is one of the most popular frameworks for big data analytics and processing. Spark offers comprehensive, easy-to-use machine learning APIs that you can use to build predictive models for regression and classification and to pre-process the data that feeds into these models.
In this course, Predictive Analytics Using Apache Spark MLlib on Databricks, you will learn to implement machine learning models using Spark ML APIs. First, you will understand the different Spark libraries available for machine learning: the older RDD-based library and the newer DataFrame-based library. You will then explore the range of transformers available in Spark for pre-processing data for machine learning, such as scaling and standardization transformers for numeric data, and label encoding and one-hot encoding transformers for categorical data.
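To make the pre-processing steps concrete, here is a minimal sketch in plain Python of what standardization, label encoding, and one-hot encoding compute. It is not Spark code and does not reproduce the defaults of Spark's transformers: the function names, the sample data, the use of the population standard deviation, and the alphabetical index assignment are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def standardize(values):
    """Rescale numbers to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        # A constant column carries no spread; map everything to 0.
        return [0.0] * len(values)
    return [(v - mu) / sigma for v in values]


def label_encode(categories):
    """Map each distinct category to an integer index (alphabetical order,
    an arbitrary choice for this sketch)."""
    index = {c: i for i, c in enumerate(sorted(set(categories)))}
    return [index[c] for c in categories], index


def one_hot(indices, size):
    """Turn each integer index into a 0/1 vector of the given length."""
    return [[1 if i == j else 0 for j in range(size)] for i in indices]


# Hypothetical sample columns.
ages = [20, 30, 40]
colors = ["red", "blue", "red"]

scaled = standardize(ages)
encoded, mapping = label_encode(colors)
vectors = one_hot(encoded, len(mapping))
```

Here `scaled` is centered on zero, `encoded` holds one integer per row, and `vectors` holds one indicator vector per row, which is the general shape of data that numeric and categorical transformers hand on to a model.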
Next, you will use linear regression and ensemble models such as random forest and gradient boosted trees to build regression models, and you will use these models for prediction on batch data. You will also see how to use Spark ML Pipelines to chain together transformers and estimators into a complete machine learning workflow.
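The idea of chaining transformers and estimators can be sketched in plain Python as follows. This is a toy illustration of the fit/transform pattern, not the Spark ML API: every class name, the dictionary-based rows, and the training data are hypothetical, and the regression is a one-feature least-squares fit.

```python
class MeanCenterer:
    """Estimator: learns the mean of column x and returns a fitted transformer."""
    def fit(self, rows):
        mu = sum(r["x"] for r in rows) / len(rows)
        return CenterModel(mu)


class CenterModel:
    """Transformer: adds an x_centered column using the learned mean."""
    def __init__(self, mu):
        self.mu = mu

    def transform(self, rows):
        return [{**r, "x_centered": r["x"] - self.mu} for r in rows]


class SimpleLinearRegression:
    """Estimator: one-feature least squares on x_centered against y."""
    def fit(self, rows):
        xs = [r["x_centered"] for r in rows]
        ys = [r["y"] for r in rows]
        x_bar, y_bar = sum(xs) / len(xs), sum(ys) / len(ys)
        sxx = sum((x - x_bar) ** 2 for x in xs)
        slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
        return LinearModel(slope, y_bar - slope * x_bar)


class LinearModel:
    """Transformer: adds a prediction column."""
    def __init__(self, slope, intercept):
        self.slope, self.intercept = slope, intercept

    def transform(self, rows):
        return [{**r, "prediction": self.slope * r["x_centered"] + self.intercept}
                for r in rows]


class Pipeline:
    """Fits each stage in order, passing transformed data to the next stage."""
    def __init__(self, stages):
        self.stages = stages

    def fit(self, rows):
        fitted = []
        for stage in self.stages:
            # Estimators are fitted; plain transformers are used as they are.
            model = stage.fit(rows) if hasattr(stage, "fit") else stage
            rows = model.transform(rows)
            fitted.append(model)
        return PipelineModel(fitted)


class PipelineModel:
    """Applies every fitted stage in order to new data."""
    def __init__(self, models):
        self.models = models

    def transform(self, rows):
        for model in self.models:
            rows = model.transform(rows)
        return rows


# Hypothetical training data following y = 2x + 1.
train = [{"x": 1, "y": 3}, {"x": 2, "y": 5}, {"x": 3, "y": 7}]
model = Pipeline([MeanCenterer(), SimpleLinearRegression()]).fit(train)

# Batch prediction on unseen rows that have no label column.
predictions = model.transform([{"x": 4}])
```

Fitting the pipeline once and reusing the fitted result means the pre-processing learned from the training data (here, the mean of `x`) is applied unchanged to the batch being scored.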