From the course: Scala Essential Training for Data Science

Creating DataFrames

- [Instructor] In this video, I'm going to create data frames. Now, data frames are kind of like relational tables. They're a data structure that's organized into rows, and they have named columns. You may have heard of data frames before if you've worked with R or the pandas package in Python. The data frames in Spark are very similar. What I'd like to do here is create three data frames using some text files that are available as exercise files. So if you have access to those exercise files, you can download the text files and follow along with me. The first thing I'll do is create a value, or a local variable, called spark, which is a Spark session. To do that, I'll first import a package that we'll be using, and that package is called org.apache.spark.sql.SparkSession, and then I'll create a value called spark, and I'll assign that to a session, and I'll call the builder function, and I'll give this session a name, I'll call it…
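
The steps described so far can be sketched roughly as follows. This is a minimal illustration rather than the instructor's exact code: the application name `DataFrameExample`, the `local[*]` master setting, and the column names `id` and `name` are all hypothetical. The transcript excerpt ends before the text files are loaded, so the sketch builds a tiny data frame from an in-memory sequence just to show rows with named columns. It assumes Spark is on the classpath, for example in `spark-shell`.

```scala
import org.apache.spark.sql.SparkSession

// Create (or reuse) a Spark session through the builder.
// The app name and master are placeholders, not the course's values.
val spark = SparkSession.builder()
  .appName("DataFrameExample") // hypothetical name
  .master("local[*]")          // assumption: run locally on all available cores
  .getOrCreate()

// Brings in the toDF conversion for local Scala collections.
import spark.implicits._

// A small stand-in for data that would normally come from the exercise files:
// two rows, two named columns.
val df = Seq((1, "alice"), (2, "bob")).toDF("id", "name")

df.printSchema() // lists the column names and their inferred types
df.show()        // prints the rows as a table
```

Note that `getOrCreate()` returns an already running session if there is one, which is why the same lines also work inside `spark-shell`, where a session named `spark` exists by default.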
