Evolution of Apache Spark
Journey of Spark in 1.x series
● Madhukara Phatak
● Technical Lead at Tellius
● Consultant and Trainer at
● Consult in Hadoop, Spark
and Scala
● Spark 1.0
● State of Big data
● Change in ecosystem
● Dawn of structured data
● Working with structured sources
● Dawn of custom memory management
● Evolution of Libraries
Spark 1.0
● Released in May 2014 [1]
● First production-ready, backward-compatible release
● Contains
○ Spark batch
○ Spark streaming
○ Shark
○ MLLib and Graphx
● Developed over 4 years
● Better hadoop
State of Big data Industry
● Map/Reduce was the way to do big data processing
● HDFS was primary source of the data
● Tools like Sqoop were developed to move data into HDFS,
which acted as the single source of data
● All data was assumed to be unstructured by default, and
structure was laid on top of it
● Hive and Pig were popular ways to do structured and
semi structured data processing on top of Map/Reduce
Spark 1.0 Ideas
● RDD abstraction supported Map/Reduce-style processing
● HDFS was the primary supported source, with memory as
the speedup layer
● Spark-streaming viewed as faster batch processing
rather than as streaming
● To support Hive, Shark was created to generate RDD
code rather than Map/Reduce
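The Map/Reduce style that the RDD abstraction started from can be sketched in a few lines. This is a plain-Python toy, not Spark or Hadoop code; `map_phase`, `shuffle` and `reduce_phase` are hypothetical names chosen for this illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def map_phase(lines: Iterable[str]) -> List[Tuple[str, int]]:
    # Emit one (word, 1) pair per word, as a mapper would.
    return [(word, 1) for line in lines for word in line.split()]


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    # Group all values by key, as the framework does between map and reduce.
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped: Dict[str, List[int]]) -> Dict[str, int]:
    # Combine the values of each key into a single count.
    return {key: sum(values) for key, values in grouped.items()}


lines = ["spark makes hadoop faster", "spark loves memory"]
counts = reduce_phase(shuffle(map_phase(lines)))
print(counts)
```

Note that the engine only ever sees functions and key/value pairs here; it knows nothing about what the data means.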
Changes from 2014
● The big data industry has gone through many radical
changes in thinking in the last two years
● Some of those changes started in Spark and others
were influenced by other frameworks
● These changes are important to understand why Spark
2.0 abstractions are radically different from Spark 1.0
● Many of these were already discussed in earlier meetups;
links to the videos are in the references
Dawn of Structured Data
Usage of Big data in 2014.
● Most people were using higher level tools like
Hive and Pig to process data rather than using Map/Reduce
● Most of the data resided in RDBMS databases,
and users ETL'd data from MySQL to Hive to query it
● So a lot of use cases analysed structured data, contrary
to the basic assumption of unstructured data in big data
● A huge amount of time was consumed by ETL and
non-optimized workflows from Hive
Spark with Structured Data in 1.2
● Spark recognised the market’s need for structured data
and started to evolve the platform to support that use case
● The first attempt was a specialised RDD called
SchemaRDD in Spark 1.2, which represented structured data
● But this approach was not clean
● Also, even though there was an InputFormat to read
structured data, there was no direct API to read from
structured sources
DataSource API in Spark 1.3
● First API to provide a unified way to read from
structured and semi-structured sources
● Can read from RDBMS and NoSQL databases like
MongoDB, Cassandra, etc.
● An advanced API, like InputFormat, which gives the source
a lot of control to optimize data locality
● So in Spark 1.3, Spark addressed the need for structured
data to be first class in the big data ecosystem
● For more info, refer to the Anatomy of DataSource API talk [2]
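As a rough illustration of what giving control to the source can mean, the sketch below models a source that is handed the required columns and a simple filter and decides for itself how to scan. It is a plain-Python toy, not the Spark DataSource API: `InMemorySource`, `scan` and the `(column, value)` equality-filter shape are hypothetical names chosen for this example.

```python
from typing import Any, Dict, Iterator, List, Optional, Tuple

Row = Dict[str, Any]


class InMemorySource:
    """Hypothetical structured source: exposes a schema and a scan method."""

    def __init__(self, schema: List[str], rows: List[Row]) -> None:
        self.schema = schema
        self._rows = rows

    def scan(
        self,
        columns: List[str],
        equals: Optional[Tuple[str, Any]] = None,
    ) -> Iterator[Row]:
        # The source, not the engine, applies the filter and prunes columns,
        # so only the needed data ever leaves the source.
        for row in self._rows:
            if equals is not None and row[equals[0]] != equals[1]:
                continue
            yield {name: row[name] for name in columns}


source = InMemorySource(
    schema=["id", "city", "amount"],
    rows=[
        {"id": 1, "city": "Bangalore", "amount": 10.0},
        {"id": 2, "city": "Chennai", "amount": 25.5},
        {"id": 3, "city": "Bangalore", "amount": 7.25},
    ],
)

result = list(source.scan(columns=["id", "amount"], equals=("city", "Bangalore")))
print(result)
```

A real source such as an RDBMS could translate the same request into a `SELECT id, amount ... WHERE city = ...` query instead of filtering in memory.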
DataFrame abstraction in Spark
● Spark understood that modifying the RDD abstraction was not
good enough
● Frameworks like Hive and Pig tried, and failed, to map
queries efficiently onto Map/Reduce
● So Spark came up with the DataFrame abstraction, which
goes through a completely different, highly optimized
pipeline from that of RDD
● For more info, refer to the Anatomy of DataFrame API talk [3]
Evolution of InMemory processing
In memory in Spark 1.0
● Spark was the first open source big data framework to
embrace in memory computing
● Cheaper hardware and abstractions like RDD allowed
Spark to exploit memory more efficiently than the
other Hadoop ecosystem projects
● The first implementation of in-memory computing
followed the typical cache approach of keeping serialized
Java bytes
● This proved to be challenging later
Challenges of in memory in Java
● As more and more big data frameworks started to
exploit memory, they soon ran into a few limitations of
the Java memory model
● Java memory is tuned for short-lived objects, and
complete control of memory is given to the JVM
● But as big data systems started using the JVM for long-term
storage, the JVM memory model started to feel inadequate
● Also, as the Java heap grew to cache more data, GC
pauses started to kill performance
Custom memory management
● Apache Flink was the first big data system to implement
custom memory management in Java
● Flink follows a DataFrame-like API with custom memory
management
● The custom memory model with a non-GC-based
approach proved to be highly successful
● Observing this trend in the community, Spark
adopted the same approach in Spark 1.4
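The idea behind custom memory management, many records packed into one contiguous buffer instead of one heap object per record, can be sketched as follows. This is a Python analogy only, not how Flink or Spark implement it; `PackedRecordStore` and the record layout (an 8-byte integer id plus an 8-byte float amount) are assumptions made for the example.

```python
import struct

# Fixed-width binary layout for one record: 8-byte int id, 8-byte float amount.
RECORD = struct.Struct("<qd")


class PackedRecordStore:
    """Hypothetical store keeping all records in one contiguous buffer."""

    def __init__(self, capacity: int) -> None:
        self._buffer = bytearray(RECORD.size * capacity)
        self._count = 0

    def append(self, record_id: int, amount: float) -> None:
        # Write the fields at the next free offset; no record object is created.
        RECORD.pack_into(self._buffer, self._count * RECORD.size, record_id, amount)
        self._count += 1

    def total_amount(self) -> float:
        # Read the amount field straight out of the buffer for each record.
        return sum(
            RECORD.unpack_from(self._buffer, i * RECORD.size)[1]
            for i in range(self._count)
        )

    def size_in_bytes(self) -> int:
        return len(self._buffer)


store = PackedRecordStore(capacity=3)
store.append(1, 10.0)
store.append(2, 25.5)
store.append(3, 7.25)
total = store.total_amount()
print(total, store.size_in_bytes())
```

Because the system owns the layout, the memory footprint is known exactly (16 bytes per record here), and a garbage collector has a single buffer to track instead of one object per record.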
Tungsten in Spark 1.4
● Spark released the first version of custom memory
management in version 1.4
● It was supported only for DataFrames, as they need custom memory
● Custom memory management greatly improved Spark’s use of
larger JVM heap sizes, with fewer GC pauses
● Solved the OOM issues which plagued earlier versions of Spark
● For more info, refer to the Anatomy of In memory
management in Spark talk [4]
DSLs for data processing
RDD and Map/Reduce API
● The RDD API of Spark follows the functional programming
paradigm, which is similar to Map/Reduce
● The RDD API passes around opaque function objects, which
is great for programming but bad for system-based
optimization
● The Map/Reduce API of Java also follows the same patterns,
but is less elegant than the Scala one
● Hard to optimise compared to Pig/Hive
● So we saw a steady increase in custom DSLs in the
Hadoop world
Need for DSLs in Hadoop
● DSLs like Pig or Hive are much easier to
understand compared to the Java API
● Less error prone, and they help to be very specific
● Can be easily optimised, as a DSL only focuses on what
to do, not how to do it
● As Java Map/Reduce mixes what with how, it’s hard to
optimize compared to Hive and Pig
● So more and more people preferred these DSLs over
platform level APIs
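The difference between what and how can be made concrete with a small sketch. An opaque function can only be called, while a declarative predicate is data the system can inspect, for example to work out which columns it has to read. This is a plain-Python toy; `GreaterThan` and `columns_needed` are hypothetical names, not part of Hive, Pig or Spark.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


@dataclass(frozen=True)
class GreaterThan:
    """Declarative predicate: says *what* to keep, not *how* to evaluate it."""

    column: str
    value: float

    def evaluate(self, row: Row) -> bool:
        return row[self.column] > self.value


def columns_needed(predicate: GreaterThan, selected: List[str]) -> List[str]:
    # Possible only because the predicate is inspectable data, not opaque code.
    needed = list(selected)
    if predicate.column not in needed:
        needed.append(predicate.column)
    return needed


rows: List[Row] = [
    {"id": 1, "city": "Bangalore", "amount": 10.0},
    {"id": 2, "city": "Chennai", "amount": 25.5},
    {"id": 3, "city": "Bangalore", "amount": 7.25},
]

# Opaque: the system can run this function but cannot see inside it.
opaque: Callable[[Row], bool] = lambda row: row["amount"] > 9.0
# Declarative: same logic, but the column and threshold are visible.
declarative = GreaterThan("amount", 9.0)

same = [r for r in rows if opaque(r)] == [r for r in rows if declarative.evaluate(r)]
needed = columns_needed(declarative, ["id"])
print(same, needed)
```

Both forms filter the same rows, but only the declarative one tells an optimizer that the `city` column never needs to be read.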
Challenges of DSL in Hadoop
● Hive and Pig DSLs do not integrate well with
Map/Reduce APIs
● DSLs often lack the flexibility of a complete programming
language
● Hive and Pig DSLs don’t define a single shared abstraction,
so you are not able to mix them
● DSLs are powerful for optimization but soon become
limited in terms of functionality
Scala as language to host DSL
● Scala is one of the first languages to embrace DSLs as
first class citizens
● Scala features like implicits, higher order functions,
structural types, etc. make it easy to build DSLs and
integrate them with the language
● This allows any Scala library to integrate a DSL and
harness the full power of the language
● Many libraries outside big data define their own DSLs.
Ex: Slick, Akka-http, Sbt
DF DSL and Spark SQL DSL
● To harness the power of custom memory management and
Hive-like optimizations, Spark encourages writing DF and
Spark SQL DSL code over Spark RDD code
● Whenever we write this DSL, all the features of the Scala
language and its libraries are available, which makes it
more powerful than Pig/Hive
● Other frameworks like Flink and Beam follow the same ideas
on Scala, Java 8, etc.
● You can easily mix and match the DSL with the RDD API
Dataset DSL in Spark 1.6
● DataFrame DSL was introduced in 1.4 and stabilised in 1.5
● As Spark observed the user and performance benefits of
DSL-based programming, it wanted to make it an important
pillar of Spark
● So in Spark 1.6, Spark released the Dataset DSL, which is
poised to compete with the RDD API in user land
● This indicates a big shift in thinking, as we are moving
further and further away from the 1.0 Map/Reduce and
unstructured mindset
Evolution of Libraries
Evolution of libraries vs frameworks
● Spark is one of the first big data frameworks to build a
platform rather than a collection of frameworks
● A single abstraction results in multiple libraries, not
multiple frameworks
● All these libraries benefit from improvements in the
run time
● This allowed Spark to build a large ecosystem in very little
time
● To understand the meaning of platform, refer to the
Introduction to Flink talk [5]
Data exchange between Libraries
● As more and more libraries were added to Spark, having a
common way to exchange data became important
● Initially libraries used RDD as the data exchange
format, but soon discovered some limitations
● Limitations of RDD as a data exchange format are
○ No defined schema. Need to come up with a domain
object for each library
○ Too low level
○ Custom serialization is hard to integrate
DataFrame as data exchange format
● Over the last few releases, Spark has been making DataFrame
the new data exchange format of Spark
● A DataFrame has a schema and can be easily passed
around between libraries
● DataFrame is a higher level abstraction compared to RDD
● As DataFrames are serialized using platform specific
code generation, all libraries follow the same serialization
● Dataset will have the same advantages
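Why a schema helps as an exchange format can be sketched with two toy libraries that share nothing but a schema-carrying container. This is plain Python, not Spark; `Frame`, `feature_library` and `stats_library` are hypothetical names invented for this illustration.

```python
from typing import Any, List, NamedTuple, Tuple


class Frame(NamedTuple):
    """Hypothetical exchange format: column names and types travel with the rows."""

    schema: Tuple[Tuple[str, type], ...]
    rows: List[Tuple[Any, ...]]

    def column(self, name: str) -> List[Any]:
        # Look the column up by name through the schema, not by position.
        index = [field for field, _ in self.schema].index(name)
        return [row[index] for row in self.rows]


def feature_library(raw: Frame) -> Frame:
    # "Library A": adds a derived column and extends the schema accordingly.
    amounts = raw.column("amount")
    rows = [row + (amount > 9.0,) for row, amount in zip(raw.rows, amounts)]
    return Frame(raw.schema + (("is_large", bool),), rows)


def stats_library(frame: Frame) -> float:
    # "Library B": needs no domain class from library A, only the schema.
    flags = frame.column("is_large")
    return sum(flags) / len(flags)


raw = Frame(
    schema=(("id", int), ("amount", float)),
    rows=[(1, 10.0), (2, 25.5), (3, 7.25)],
)
share = stats_library(feature_library(raw))
print(share)
```

With a schema-less collection of objects, `stats_library` would have to import a domain class defined by `feature_library`; here the two only agree on column names.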
Learnings from Spark 1.x
● Structured/semi-structured data is a first class citizen of
big data processing systems
● Custom memory management and code-generated
serialization give the best performance on the JVM
● DataFrame/Dataset are the new abstraction layers on which
to build next generation big data processing systems
● DSLs are the way forward over Map/Reduce-like APIs
● Having high level structured abstractions makes libraries
coexist happily on a platform
