From the course: Scala Essential Training for Data Science
Summary of Scala and Spark RDDs - Scala Tutorial
- [Instructor] Let's summarize some of the key facts about Scala and Spark RDDs. RDDs are distributed data structures. That means their data is split into partitions spread across multiple nodes, and operations on an RDD run in parallel on those nodes. Now, single-node clusters, like the one we're using here, are useful for development and testing, but in big data production environments, you should consider running multiple nodes in your Spark cluster.
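To make the single-node versus multi-node point concrete, here is a minimal sketch, assuming Spark is on the classpath. The object name `RddSummarySketch`, the helper `partitionsAndSum`, and the app name are hypothetical and chosen only for illustration; they are not from the course's exercise files. The sketch builds a small RDD with four partitions, runs a `map` and `reduce` over it, and takes the master URL as a parameter so the same code could run on a local single-node setup or, in principle, against a real cluster.

```scala
import org.apache.spark.sql.SparkSession

object RddSummarySketch {

  // Hypothetical helper: builds a small RDD and returns
  // (number of partitions, sum of the squares of 1 to 100).
  def partitionsAndSum(master: String): (Int, Long) = {
    val spark = SparkSession.builder()
      .appName("rdd-summary-sketch") // illustrative name, an assumption
      .master(master)                // e.g. "local[*]" for single-node development
      .getOrCreate()
    try {
      val sc = spark.sparkContext

      // Distribute a local collection as an RDD split into 4 partitions.
      val numbers = sc.parallelize(1 to 100, numSlices = 4)

      // map runs on each partition in parallel; reduce combines the partial results.
      val sumOfSquares = numbers.map(n => n.toLong * n).reduce(_ + _)

      (numbers.getNumPartitions, sumOfSquares)
    } finally {
      spark.stop() // release local resources when done
    }
  }

  def main(args: Array[String]): Unit = {
    // "local[*]" uses all cores of the local machine as a stand-in for a cluster.
    val (partitions, total) = partitionsAndSum("local[*]")
    println(s"partitions = $partitions, sum of squares = $total")
  }
}
```

With a `local[...]` master, all four partitions are processed on one machine. On a multi-node cluster the RDD code would stay the same, and only the master setting (typically supplied at submit time rather than hard-coded) would differ.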