GPU Computing for Data Science
John Joo
Data Science Evangelist @ Domino Data Lab
• Why use GPUs?
• Example applications in data science
• Programming your GPU
Case Study:
Monte Carlo Simulations
• Simulate behavior when randomness is a key component
• Average the results of many simulations
• Make predictions
Little Information in One “Noisy Simulation”
Price(t+1) = Price(t) · e^(InterestRate · dt + noise)
Many “Noisy Simulations” ➡ Actionable Information
Price(t+1) = Price(t) · e^(InterestRate · dt + noise)
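A minimal CPU-only sketch of the idea on these two slides, using only the Python standard library. It assumes the noise term sits inside the exponent and is Gaussian with scale sigma · sqrt(dt). The function names and the default values for `price0`, `rate`, `sigma`, `dt` and `steps` are illustrative assumptions, not values from the talk.

```python
import math
import random


def simulate_path(price0, rate, sigma, dt, steps, rng):
    """Run one noisy simulation and return the final price."""
    price = price0
    for _ in range(steps):
        # Assumed noise model: Gaussian shock scaled by sigma * sqrt(dt).
        noise = sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        # Price(t+1) = Price(t) * e^(InterestRate * dt + noise)
        price *= math.exp(rate * dt + noise)
    return price


def monte_carlo_mean(n_paths, price0=100.0, rate=0.05, sigma=0.2,
                     dt=0.01, steps=100, seed=0):
    """Average the final price over many independent noisy simulations."""
    rng = random.Random(seed)
    finals = [simulate_path(price0, rate, sigma, dt, steps, rng)
              for _ in range(n_paths)]
    return sum(finals) / n_paths


if __name__ == "__main__":
    # A single path carries little information; the average is far more stable.
    print("one path:   ", monte_carlo_mean(1))
    print("5,000 paths:", monte_carlo_mean(5000))
```

Each path is independent of the others, which is what makes this workload a natural fit for parallel hardware.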
Monte Carlo Simulations Are Often Slow
• Lots of simulation data is required to create valid models
• Generating lots of data takes time
• CPU works sequentially
CPUs designed for sequential, complex tasks
Source: Mythbusters
GPUs designed for parallel, low level tasks
Source: Mythbusters
Applications of GPU Computing in Data Science
• Matrix Manipulation
• Numerical Analysis
• Sorting
• String matching
• Monte Carlo simulations
• Machine learning
• Search
Algorithms for GPU Acceleration
• Inherently parallel
• Matrix operations
• High floating-point operations per second (FLOPS)
GPUs Make Deep Learning Accessible
                     CPU cluster    Stanford AI Lab
# of machines        1,000          3
# of CPUs or GPUs    2,000 CPUs     12 GPUs
Cores                16,000         18,432
Power used           600 kW         4 kW
Cost                 $5,000,000     $33,000
Adam Coates, Brody Huval, Tao Wang, David Wu, Bryan Catanzaro, Andrew Ng; JMLR W&CP 28(3): 1337–1345, 2013
CPU vs GPU Architecture:
Structured for Different Purposes
4-8 High Performance Cores
100s-1000s of bare bones cores
Both CPU and GPU are required
Compute intensive
Everything else
General Purpose GPU Computing (GPGPU)
Heterogeneous Computing
Getting Started: Hardware
• Need a computer with GPU
• GPU should not be operating your
Spin up a GPU/CPU computer with 1 click.
8 CPU cores, 15 GB RAM
1,536 GPU cores, 4GB RAM
Getting Started: Hardware
Programming CPU
• Sequential
• Write code top to bottom
• Can do complex tasks
• Independent
Programming GPU
• Parallel
• Multi-threaded - race conditions
• Low level tasks
• Dependent on CPU
Getting Started: Software
Talking to your GPU
CUDA and OpenCL are GPU computing frameworks
Choosing How to Interface with GPU:
Simplicity vs Flexibility
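[Figure: options ranging from application-specific libraries, to general-purpose GPU libraries, to custom CUDA/OpenCL code]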
Application Specific Libraries
• Theano - Symbolic math
• TensorFlow - ML
• Lasagne - NN
• Pylearn2 - ML
• mxnet - NN
• ABSsysbio - Systems Bio
• cudaBayesreg - fMRI
• mxnet - NN
• rpud - SVM
• rgpu - bioinformatics
Tutorial on using Theano, Lasagne, and nolearn:
General Purpose GPU Libraries
• Python and R wrappers for basic matrix and linear algebra operations
• scikit-cuda
• cudamat
• gputools
• Drop-in library
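As one illustration of the wrapper libraries listed above, here is a hedged sketch of a matrix product with scikit-cuda. It assumes an NVIDIA GPU with CUDA, PyCUDA, scikit-cuda and NumPy installed. The helper names `matmul_cpu` and `matmul_gpu` are hypothetical. The GPU imports are kept inside the function so the file still loads on a machine without a GPU, and the plain-Python version is only a reference for checking results.

```python
def matmul_cpu(a, b):
    """Reference matrix product on the CPU using plain Python lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def matmul_gpu(a, b):
    """Matrix product on the GPU via scikit-cuda (requires CUDA hardware)."""
    import numpy as np
    import pycuda.autoinit  # noqa: F401  (creates a CUDA context on import)
    import pycuda.gpuarray as gpuarray
    import skcuda.linalg as linalg

    linalg.init()
    # Copy both operands to GPU memory as single-precision arrays.
    a_gpu = gpuarray.to_gpu(np.asarray(a, dtype=np.float32))
    b_gpu = gpuarray.to_gpu(np.asarray(b, dtype=np.float32))
    # The multiplication itself runs on the GPU.
    c_gpu = linalg.dot(a_gpu, b_gpu)
    # Copy the result back to the CPU and convert to nested lists.
    return c_gpu.get().tolist()
```

On a GPU machine, `matmul_gpu(a, b)` should agree with `matmul_cpu(a, b)` up to single-precision rounding.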
Drop-in Library
Credit: NVIDIA
Also works for Python!
Custom CUDA/OpenCL Code
1. Allocate memory on the GPU
2. Transfer data from CPU to GPU
3. Launch the kernel to operate on the GPU
4. Transfer results back to CPU
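The four steps above can be sketched with PyCUDA as follows. This is a toy example under stated assumptions, not the code from the talk. The `scale` kernel, the helper names and the block size of 256 threads are all illustrative, and running `scale_on_gpu` requires an NVIDIA GPU with CUDA, PyCUDA and NumPy installed.

```python
# CUDA C source for a toy kernel: each GPU thread scales one array element.
KERNEL_SRC = """
__global__ void scale(float *data, float factor, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}
"""


def blocks_needed(n, threads_per_block=256):
    """Number of thread blocks needed to cover n elements (ceiling division)."""
    return (n + threads_per_block - 1) // threads_per_block


def scale_on_gpu(values, factor):
    """Multiply every element of a non-empty sequence by factor on the GPU."""
    # Imported here so this file still loads on machines without a GPU.
    import numpy as np
    import pycuda.autoinit  # noqa: F401  (creates a CUDA context on import)
    import pycuda.driver as cuda
    from pycuda.compiler import SourceModule

    host = np.ascontiguousarray(values, dtype=np.float32)
    n = host.size

    # 1. Allocate memory on the GPU
    dev = cuda.mem_alloc(host.nbytes)
    # 2. Transfer data from CPU to GPU
    cuda.memcpy_htod(dev, host)
    # 3. Launch the kernel to operate on the GPU
    kernel = SourceModule(KERNEL_SRC).get_function("scale")
    kernel(dev, np.float32(factor), np.int32(n),
           block=(256, 1, 1), grid=(blocks_needed(n), 1))
    # 4. Transfer results back to CPU
    result = np.empty_like(host)
    cuda.memcpy_dtoh(result, dev)
    return result
```

The bounds check `if (i < n)` matters because the grid is rounded up to whole blocks, so the last block may contain threads with no element to process.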
Example of using Python and CUDA:
Monte Carlo Simulations
• Using PyCUDA to interface Python and CUDA
• Simulating 3 million paths, 100 time steps
Python Code for CPU vs. Python/PyCUDA Code for GPU
[Figure: side-by-side code listings]
Steps in the Python/PyCUDA code:
1. Allocate memory on the GPU
2. Transfer data from CPU to GPU
3. Launch the kernel to operate on the GPU cores
4. Transfer results back to CPU
• Python code for CPU: 26 sec
• Python/PyCUDA code for GPU: 1.5 sec, with 8 more lines of code
• 17x speed up
Some sample Jupyter notebooks
• Monte Carlo example using PyCUDA
• PyCUDA example compiling CUDA C for kernel
• Scikit-cuda example of matrix multiplication
• Calculating a distance matrix using rpud
More resources
• Berkeley GPU workshop
• Duke Statistics on GPU (Python)
• Andreas Klöckner’s webpage (Python)
• Summary of GPU libraries
More resources
• Walk through of CUDA programming in R
• List of libraries for GPU computing in R
• Matrix computations in Machine Learning

More Related Content

What's hot

ChatGPT ChatBot
ChatGPT ChatBotChatGPT ChatBot
ChatGPT ChatBot
Generative AI at the edge.pdf
Generative AI at the edge.pdfGenerative AI at the edge.pdf
Generative AI at the edge.pdf
Qualcomm Research
Generative AI
Generative AIGenerative AI
Generative AI
shailesh sangle
Why Social Media Chat Bots Are the Future of Communication - Deck
Why Social Media Chat Bots Are the Future of Communication - DeckWhy Social Media Chat Bots Are the Future of Communication - Deck
Why Social Media Chat Bots Are the Future of Communication - Deck
Jan Rezab
State of the Cloud 2023—The AI era
State of the Cloud 2023—The AI eraState of the Cloud 2023—The AI era
State of the Cloud 2023—The AI era
Bessemer Venture Partners
What is ChatGPT.pdf
What is ChatGPT.pdfWhat is ChatGPT.pdf
What is ChatGPT.pdf
The Podcasting
Using the power of Generative AI at scale
Using the power of Generative AI at scaleUsing the power of Generative AI at scale
Using the power of Generative AI at scale
Maxim Salnikov
Deep dive into ChatGPT
Deep dive into ChatGPTDeep dive into ChatGPT
Deep dive into ChatGPT
The Future Of Work & The Work Of The Future
The Future Of Work & The Work Of The FutureThe Future Of Work & The Work Of The Future
The Future Of Work & The Work Of The Future
Arturo Pelayo
ChatGPT Training Session
ChatGPT Training SessionChatGPT Training Session
ChatGPT Evaluation for NLP
ChatGPT Evaluation for NLPChatGPT Evaluation for NLP
ChatGPT Evaluation for NLP
Prompting is an art / Sztuka promptowania
Prompting is an art / Sztuka promptowaniaPrompting is an art / Sztuka promptowania
Prompting is an art / Sztuka promptowania
Michal Jaskolski
ChatGPT SEO Guide 2023
ChatGPT SEO Guide 2023ChatGPT SEO Guide 2023
ChatGPT SEO Guide 2023
Web Trainings Academy
Unlocking the Power of ChatGPT and AI in Testing - NextSteps, presented by Ap...
Unlocking the Power of ChatGPT and AI in Testing - NextSteps, presented by Ap...Unlocking the Power of ChatGPT and AI in Testing - NextSteps, presented by Ap...
Unlocking the Power of ChatGPT and AI in Testing - NextSteps, presented by Ap...
Habits at Work - Merci Victoria Grace, Growth, Slack - 2016 Habit Summit
Habits at Work - Merci Victoria Grace, Growth, Slack - 2016 Habit SummitHabits at Work - Merci Victoria Grace, Growth, Slack - 2016 Habit Summit
Habits at Work - Merci Victoria Grace, Growth, Slack - 2016 Habit Summit
Habit Summit
The Future of AI in Digital Marketing Transforming Customer Experiences.pdf
The Future of AI in Digital Marketing Transforming Customer Experiences.pdfThe Future of AI in Digital Marketing Transforming Customer Experiences.pdf
The Future of AI in Digital Marketing Transforming Customer Experiences.pdf
Different Roles in Machine Learning Career
Different Roles in Machine Learning CareerDifferent Roles in Machine Learning Career
Different Roles in Machine Learning Career

What's hot (20)

ChatGPT ChatBot
ChatGPT ChatBotChatGPT ChatBot
ChatGPT ChatBot
Generative AI at the edge.pdf
Generative AI at the edge.pdfGenerative AI at the edge.pdf
Generative AI at the edge.pdf
Generative AI
Generative AIGenerative AI
Generative AI
Why Social Media Chat Bots Are the Future of Communication - Deck
Why Social Media Chat Bots Are the Future of Communication - DeckWhy Social Media Chat Bots Are the Future of Communication - Deck
Why Social Media Chat Bots Are the Future of Communication - Deck
State of the Cloud 2023—The AI era
State of the Cloud 2023—The AI eraState of the Cloud 2023—The AI era
State of the Cloud 2023—The AI era
What is ChatGPT.pdf
What is ChatGPT.pdfWhat is ChatGPT.pdf
What is ChatGPT.pdf
Using the power of Generative AI at scale
Using the power of Generative AI at scaleUsing the power of Generative AI at scale
Using the power of Generative AI at scale
Deep dive into ChatGPT
Deep dive into ChatGPTDeep dive into ChatGPT
Deep dive into ChatGPT
The Future Of Work & The Work Of The Future
The Future Of Work & The Work Of The FutureThe Future Of Work & The Work Of The Future
The Future Of Work & The Work Of The Future
ChatGPT Training Session
ChatGPT Training SessionChatGPT Training Session
ChatGPT Training Session
ChatGPT Evaluation for NLP
ChatGPT Evaluation for NLPChatGPT Evaluation for NLP
ChatGPT Evaluation for NLP
Prompting is an art / Sztuka promptowania
Prompting is an art / Sztuka promptowaniaPrompting is an art / Sztuka promptowania
Prompting is an art / Sztuka promptowania
ChatGPT SEO Guide 2023
ChatGPT SEO Guide 2023ChatGPT SEO Guide 2023
ChatGPT SEO Guide 2023
Unlocking the Power of ChatGPT and AI in Testing - NextSteps, presented by Ap...
Unlocking the Power of ChatGPT and AI in Testing - NextSteps, presented by Ap...Unlocking the Power of ChatGPT and AI in Testing - NextSteps, presented by Ap...
Unlocking the Power of ChatGPT and AI in Testing - NextSteps, presented by Ap...
Habits at Work - Merci Victoria Grace, Growth, Slack - 2016 Habit Summit
Habits at Work - Merci Victoria Grace, Growth, Slack - 2016 Habit SummitHabits at Work - Merci Victoria Grace, Growth, Slack - 2016 Habit Summit
Habits at Work - Merci Victoria Grace, Growth, Slack - 2016 Habit Summit
The Future of AI in Digital Marketing Transforming Customer Experiences.pdf
The Future of AI in Digital Marketing Transforming Customer Experiences.pdfThe Future of AI in Digital Marketing Transforming Customer Experiences.pdf
The Future of AI in Digital Marketing Transforming Customer Experiences.pdf
Different Roles in Machine Learning Career
Different Roles in Machine Learning CareerDifferent Roles in Machine Learning Career
Different Roles in Machine Learning Career

Viewers also liked

DAMA Webinar - Big and Little Data Quality
DAMA Webinar - Big and Little Data QualityDAMA Webinar - Big and Little Data Quality
DAMA Webinar - Big and Little Data Quality
Booz Allen Field Guide to Data Science
Booz Allen Field Guide to Data Science Booz Allen Field Guide to Data Science
Booz Allen Field Guide to Data Science
Booz Allen Hamilton
Bridging the Gap Between Data Science & Engineer: Building High-Performance T...
Bridging the Gap Between Data Science & Engineer: Building High-Performance T...Bridging the Gap Between Data Science & Engineer: Building High-Performance T...
Bridging the Gap Between Data Science & Engineer: Building High-Performance T...
Working With Big Data
Working With Big DataWorking With Big Data
Working With Big Data
Seth Familian
Analytics Trends 2016: The next evolution
Analytics Trends 2016: The next evolutionAnalytics Trends 2016: The next evolution
Analytics Trends 2016: The next evolution
Deloitte United States
Empowering developers to deploy their own data stores
Empowering developers to deploy their own data storesEmpowering developers to deploy their own data stores
Empowering developers to deploy their own data stores
Tomas Doran
Net Promoter Score Pitfalls to Avoid
Net Promoter Score Pitfalls to AvoidNet Promoter Score Pitfalls to Avoid
Net Promoter Score Pitfalls to Avoid
Aureus Analytics
Pollen VC Building A Digital Lending Business
Pollen VC Building A Digital Lending BusinessPollen VC Building A Digital Lending Business
Pollen VC Building A Digital Lending Business
Pollen VC
Ways of Seeing Data: Towards a Critical Literacy for Data Visualisations as R...
Ways of Seeing Data: Towards a Critical Literacy for Data Visualisations as R...Ways of Seeing Data: Towards a Critical Literacy for Data Visualisations as R...
Ways of Seeing Data: Towards a Critical Literacy for Data Visualisations as R...
Jonathan Gray
Visualising Data with Code
Visualising Data with CodeVisualising Data with Code
Visualising Data with Code
Ri Liu
Data made out of functions
Data made out of functionsData made out of functions
Data made out of functions
GAME ON! Integrating Games and Simulations in the Classroom
GAME ON! Integrating Games and Simulations in the Classroom GAME ON! Integrating Games and Simulations in the Classroom
GAME ON! Integrating Games and Simulations in the Classroom
Brian Housand
What to Upload to SlideShare
What to Upload to SlideShareWhat to Upload to SlideShare
What to Upload to SlideShare
Mobile-First SEO - The Marketers Edition #3XEDigital
Mobile-First SEO - The Marketers Edition #3XEDigitalMobile-First SEO - The Marketers Edition #3XEDigital
Mobile-First SEO - The Marketers Edition #3XEDigital
Aleyda Solís
Dear NSA, let me take care of your slides.
Dear NSA, let me take care of your slides.Dear NSA, let me take care of your slides.
Dear NSA, let me take care of your slides.
IT in Healthcare
IT in HealthcareIT in Healthcare
IT in Healthcare
African Americans: College Majors and Earnings
African Americans: College Majors and Earnings African Americans: College Majors and Earnings
African Americans: College Majors and Earnings
CEW Georgetown
SXSW 2016: The Need To Knows
SXSW 2016: The Need To KnowsSXSW 2016: The Need To Knows
SXSW 2016: The Need To Knows
Ogilvy Consulting
Creative Traction Methodology - For Early Stage Startups
Creative Traction Methodology - For Early Stage StartupsCreative Traction Methodology - For Early Stage Startups
Creative Traction Methodology - For Early Stage Startups
Tommaso Di Bartolo
Mobile Is Eating the World (2016)
Mobile Is Eating the World (2016)Mobile Is Eating the World (2016)
Mobile Is Eating the World (2016)

Viewers also liked (20)

DAMA Webinar - Big and Little Data Quality
DAMA Webinar - Big and Little Data QualityDAMA Webinar - Big and Little Data Quality
DAMA Webinar - Big and Little Data Quality
Booz Allen Field Guide to Data Science
Booz Allen Field Guide to Data Science Booz Allen Field Guide to Data Science
Booz Allen Field Guide to Data Science
Bridging the Gap Between Data Science & Engineer: Building High-Performance T...
Bridging the Gap Between Data Science & Engineer: Building High-Performance T...Bridging the Gap Between Data Science & Engineer: Building High-Performance T...
Bridging the Gap Between Data Science & Engineer: Building High-Performance T...
Working With Big Data
Working With Big DataWorking With Big Data
Working With Big Data
Analytics Trends 2016: The next evolution
Analytics Trends 2016: The next evolutionAnalytics Trends 2016: The next evolution
Analytics Trends 2016: The next evolution
Empowering developers to deploy their own data stores
Empowering developers to deploy their own data storesEmpowering developers to deploy their own data stores
Empowering developers to deploy their own data stores
Net Promoter Score Pitfalls to Avoid
Net Promoter Score Pitfalls to AvoidNet Promoter Score Pitfalls to Avoid
Net Promoter Score Pitfalls to Avoid
Pollen VC Building A Digital Lending Business
Pollen VC Building A Digital Lending BusinessPollen VC Building A Digital Lending Business
Pollen VC Building A Digital Lending Business
Ways of Seeing Data: Towards a Critical Literacy for Data Visualisations as R...
Ways of Seeing Data: Towards a Critical Literacy for Data Visualisations as R...Ways of Seeing Data: Towards a Critical Literacy for Data Visualisations as R...
Ways of Seeing Data: Towards a Critical Literacy for Data Visualisations as R...
Visualising Data with Code
Visualising Data with CodeVisualising Data with Code
Visualising Data with Code
Data made out of functions
Data made out of functionsData made out of functions
Data made out of functions
GAME ON! Integrating Games and Simulations in the Classroom
GAME ON! Integrating Games and Simulations in the Classroom GAME ON! Integrating Games and Simulations in the Classroom
GAME ON! Integrating Games and Simulations in the Classroom
What to Upload to SlideShare
What to Upload to SlideShareWhat to Upload to SlideShare
What to Upload to SlideShare
Mobile-First SEO - The Marketers Edition #3XEDigital
Mobile-First SEO - The Marketers Edition #3XEDigitalMobile-First SEO - The Marketers Edition #3XEDigital
Mobile-First SEO - The Marketers Edition #3XEDigital
Dear NSA, let me take care of your slides.
Dear NSA, let me take care of your slides.Dear NSA, let me take care of your slides.
Dear NSA, let me take care of your slides.
IT in Healthcare
IT in HealthcareIT in Healthcare
IT in Healthcare
African Americans: College Majors and Earnings
African Americans: College Majors and Earnings African Americans: College Majors and Earnings
African Americans: College Majors and Earnings
SXSW 2016: The Need To Knows
SXSW 2016: The Need To KnowsSXSW 2016: The Need To Knows
SXSW 2016: The Need To Knows
Creative Traction Methodology - For Early Stage Startups
Creative Traction Methodology - For Early Stage StartupsCreative Traction Methodology - For Early Stage Startups
Creative Traction Methodology - For Early Stage Startups
Mobile Is Eating the World (2016)
Mobile Is Eating the World (2016)Mobile Is Eating the World (2016)
Mobile Is Eating the World (2016)

Similar to GPU Computing for Data Science

"Making Computer Vision Software Run Fast on Your Embedded Platform," a Prese...
"Making Computer Vision Software Run Fast on Your Embedded Platform," a Prese..."Making Computer Vision Software Run Fast on Your Embedded Platform," a Prese...
"Making Computer Vision Software Run Fast on Your Embedded Platform," a Prese...
Edge AI and Vision Alliance
GPU Computing With Apache Spark And Python
GPU Computing With Apache Spark And PythonGPU Computing With Apache Spark And Python
GPU Computing With Apache Spark And Python
Jen Aman
The Rise of Parallel Computing
The Rise of Parallel ComputingThe Rise of Parallel Computing
The Rise of Parallel Computing
Current Trends in HPC
Current Trends in HPCCurrent Trends in HPC
Current Trends in HPC
Putchong Uthayopas
Tim Child
PostgreSQL with OpenCL
PostgreSQL with OpenCLPostgreSQL with OpenCL
PostgreSQL with OpenCL
Muhaza Liebenlito
Kernel Recipes 2016 - Speeding up development by setting up a kernel build farm
Kernel Recipes 2016 - Speeding up development by setting up a kernel build farmKernel Recipes 2016 - Speeding up development by setting up a kernel build farm
Kernel Recipes 2016 - Speeding up development by setting up a kernel build farm
Anne Nicolas
GPU and Deep learning best practices
GPU and Deep learning best practicesGPU and Deep learning best practices
GPU and Deep learning best practices
Lior Sidi
Programming Models for Heterogeneous Chips
Programming Models for  Heterogeneous ChipsProgramming Models for  Heterogeneous Chips
Programming Models for Heterogeneous Chips
Facultad de Informática UCM
OpenCL & the Future of Desktop High Performance Computing in CAD
OpenCL & the Future of Desktop High Performance Computing in CADOpenCL & the Future of Desktop High Performance Computing in CAD
OpenCL & the Future of Desktop High Performance Computing in CAD
Design World
GPU enablement for data science on OpenShift | DevNation Tech Talk
GPU enablement for data science on OpenShift | DevNation Tech TalkGPU enablement for data science on OpenShift | DevNation Tech Talk
GPU enablement for data science on OpenShift | DevNation Tech Talk
Red Hat Developers
Debugging Numerical Simulations on Accelerated Architectures - TotalView fo...
 Debugging Numerical Simulations on Accelerated Architectures  - TotalView fo... Debugging Numerical Simulations on Accelerated Architectures  - TotalView fo...
Debugging Numerical Simulations on Accelerated Architectures - TotalView fo...
Rogue Wave Software
The GPGPU Continuum
The GPGPU ContinuumThe GPGPU Continuum
The GPGPU Continuum
Ofer Rosenberg
Stream Processing
Stream ProcessingStream Processing
Stream Processing
Introduction to DPDK
Introduction to DPDKIntroduction to DPDK
Introduction to DPDK
Kernel TLV
NVidia CUDA for Bruteforce Attacks - DefCamp 2012
NVidia CUDA for Bruteforce Attacks - DefCamp 2012NVidia CUDA for Bruteforce Attacks - DefCamp 2012
NVidia CUDA for Bruteforce Attacks - DefCamp 2012
GPU databases - How to use them and what the future holds
GPU databases - How to use them and what the future holdsGPU databases - How to use them and what the future holds
GPU databases - How to use them and what the future holds
Arnon Shimoni
SCFE 2020 OpenCAPI presentation as part of OpenPWOER Tutorial
SCFE 2020 OpenCAPI presentation as part of OpenPWOER TutorialSCFE 2020 OpenCAPI presentation as part of OpenPWOER Tutorial
SCFE 2020 OpenCAPI presentation as part of OpenPWOER Tutorial
Ganesan Narayanasamy
OpenPOWER Acceleration of HPCC Systems
OpenPOWER Acceleration of HPCC SystemsOpenPOWER Acceleration of HPCC Systems
OpenPOWER Acceleration of HPCC Systems
HPCC Systems
Gpgpu intro
Gpgpu introGpgpu intro
Gpgpu intro
Dominik Seifert

Similar to GPU Computing for Data Science (20)

"Making Computer Vision Software Run Fast on Your Embedded Platform," a Prese...
"Making Computer Vision Software Run Fast on Your Embedded Platform," a Prese..."Making Computer Vision Software Run Fast on Your Embedded Platform," a Prese...
"Making Computer Vision Software Run Fast on Your Embedded Platform," a Prese...
GPU Computing With Apache Spark And Python
GPU Computing With Apache Spark And PythonGPU Computing With Apache Spark And Python
GPU Computing With Apache Spark And Python
The Rise of Parallel Computing
The Rise of Parallel ComputingThe Rise of Parallel Computing
The Rise of Parallel Computing
Current Trends in HPC
Current Trends in HPCCurrent Trends in HPC
Current Trends in HPC
PostgreSQL with OpenCL
PostgreSQL with OpenCLPostgreSQL with OpenCL
PostgreSQL with OpenCL
Kernel Recipes 2016 - Speeding up development by setting up a kernel build farm
Kernel Recipes 2016 - Speeding up development by setting up a kernel build farmKernel Recipes 2016 - Speeding up development by setting up a kernel build farm
Kernel Recipes 2016 - Speeding up development by setting up a kernel build farm
GPU and Deep learning best practices
GPU and Deep learning best practicesGPU and Deep learning best practices
GPU and Deep learning best practices
Programming Models for Heterogeneous Chips
Programming Models for  Heterogeneous ChipsProgramming Models for  Heterogeneous Chips
Programming Models for Heterogeneous Chips
OpenCL & the Future of Desktop High Performance Computing in CAD
OpenCL & the Future of Desktop High Performance Computing in CADOpenCL & the Future of Desktop High Performance Computing in CAD
OpenCL & the Future of Desktop High Performance Computing in CAD
GPU enablement for data science on OpenShift | DevNation Tech Talk
GPU enablement for data science on OpenShift | DevNation Tech TalkGPU enablement for data science on OpenShift | DevNation Tech Talk
GPU enablement for data science on OpenShift | DevNation Tech Talk
Debugging Numerical Simulations on Accelerated Architectures - TotalView fo...
 Debugging Numerical Simulations on Accelerated Architectures  - TotalView fo... Debugging Numerical Simulations on Accelerated Architectures  - TotalView fo...
Debugging Numerical Simulations on Accelerated Architectures - TotalView fo...
The GPGPU Continuum
The GPGPU ContinuumThe GPGPU Continuum
The GPGPU Continuum
Stream Processing
Stream ProcessingStream Processing
Stream Processing
Introduction to DPDK
Introduction to DPDKIntroduction to DPDK
Introduction to DPDK
NVidia CUDA for Bruteforce Attacks - DefCamp 2012
NVidia CUDA for Bruteforce Attacks - DefCamp 2012NVidia CUDA for Bruteforce Attacks - DefCamp 2012
NVidia CUDA for Bruteforce Attacks - DefCamp 2012
GPU databases - How to use them and what the future holds
GPU databases - How to use them and what the future holdsGPU databases - How to use them and what the future holds
GPU databases - How to use them and what the future holds
SCFE 2020 OpenCAPI presentation as part of OpenPWOER Tutorial
SCFE 2020 OpenCAPI presentation as part of OpenPWOER TutorialSCFE 2020 OpenCAPI presentation as part of OpenPWOER Tutorial
SCFE 2020 OpenCAPI presentation as part of OpenPWOER Tutorial
OpenPOWER Acceleration of HPCC Systems
OpenPOWER Acceleration of HPCC SystemsOpenPOWER Acceleration of HPCC Systems
OpenPOWER Acceleration of HPCC Systems
Gpgpu intro
Gpgpu introGpgpu intro
Gpgpu intro

More from Domino Data Lab

What's in your workflow? Bringing data science workflows to business analysis...
What's in your workflow? Bringing data science workflows to business analysis...What's in your workflow? Bringing data science workflows to business analysis...
What's in your workflow? Bringing data science workflows to business analysis...
Domino Data Lab
The Proliferation of New Database Technologies and Implications for Data Scie...
The Proliferation of New Database Technologies and Implications for Data Scie...The Proliferation of New Database Technologies and Implications for Data Scie...
The Proliferation of New Database Technologies and Implications for Data Scie...
Domino Data Lab
Racial Bias in Policing: an analysis of Illinois traffic stops data
Racial Bias in Policing: an analysis of Illinois traffic stops dataRacial Bias in Policing: an analysis of Illinois traffic stops data
Racial Bias in Policing: an analysis of Illinois traffic stops data
Domino Data Lab
Data Quality Analytics: Understanding what is in your data, before using it
Data Quality Analytics: Understanding what is in your data, before using itData Quality Analytics: Understanding what is in your data, before using it
Data Quality Analytics: Understanding what is in your data, before using it
Domino Data Lab
Supporting innovation in insurance with randomized experimentation
Supporting innovation in insurance with randomized experimentationSupporting innovation in insurance with randomized experimentation
Supporting innovation in insurance with randomized experimentation
Domino Data Lab
Leveraging Data Science in the Automotive Industry
Leveraging Data Science in the Automotive IndustryLeveraging Data Science in the Automotive Industry
Leveraging Data Science in the Automotive Industry
Domino Data Lab
Summertime Analytics: Predicting E. coli and West Nile Virus
Summertime Analytics: Predicting E. coli and West Nile VirusSummertime Analytics: Predicting E. coli and West Nile Virus
Summertime Analytics: Predicting E. coli and West Nile Virus
Domino Data Lab
Reproducible Dashboards and other great things to do with Jupyter
Reproducible Dashboards and other great things to do with JupyterReproducible Dashboards and other great things to do with Jupyter
Reproducible Dashboards and other great things to do with Jupyter
Domino Data Lab
GeoViz: A Canvas for Data Science
GeoViz: A Canvas for Data ScienceGeoViz: A Canvas for Data Science
GeoViz: A Canvas for Data Science
Domino Data Lab
Managing Data Science | Lessons from the Field
Managing Data Science | Lessons from the Field Managing Data Science | Lessons from the Field
Managing Data Science | Lessons from the Field
Domino Data Lab
Doing your first Kaggle (Python for Big Data sets)
Doing your first Kaggle (Python for Big Data sets)Doing your first Kaggle (Python for Big Data sets)
Doing your first Kaggle (Python for Big Data sets)
Domino Data Lab
Leveraged Analytics at Scale
Leveraged Analytics at ScaleLeveraged Analytics at Scale
Leveraged Analytics at Scale
Domino Data Lab
How I Learned to Stop Worrying and Love Linked Data
How I Learned to Stop Worrying and Love Linked DataHow I Learned to Stop Worrying and Love Linked Data
How I Learned to Stop Worrying and Love Linked Data
Domino Data Lab
Software Engineering for Data Scientists
Software Engineering for Data ScientistsSoftware Engineering for Data Scientists
Software Engineering for Data Scientists
Domino Data Lab
Making Big Data Smart
Making Big Data SmartMaking Big Data Smart
Making Big Data Smart
Domino Data Lab
Moving Data Science from an Event to A Program: Considerations in Creating Su...
Moving Data Science from an Event to A Program: Considerations in Creating Su...Moving Data Science from an Event to A Program: Considerations in Creating Su...
Moving Data Science from an Event to A Program: Considerations in Creating Su...
Domino Data Lab
Building Data Analytics pipelines in the cloud using serverless technology
Building Data Analytics pipelines in the cloud using serverless technologyBuilding Data Analytics pipelines in the cloud using serverless technology
Building Data Analytics pipelines in the cloud using serverless technology
Domino Data Lab
Leveraging Open Source Automated Data Science Tools
Leveraging Open Source Automated Data Science ToolsLeveraging Open Source Automated Data Science Tools
Leveraging Open Source Automated Data Science Tools
Domino Data Lab
Domino and AWS: collaborative analytics and model governance at financial ser...
Domino and AWS: collaborative analytics and model governance at financial ser...Domino and AWS: collaborative analytics and model governance at financial ser...
Domino and AWS: collaborative analytics and model governance at financial ser...
Domino Data Lab
The Role and Importance of Curiosity in Data Science
The Role and Importance of Curiosity in Data ScienceThe Role and Importance of Curiosity in Data Science
The Role and Importance of Curiosity in Data Science
Domino Data Lab

More from Domino Data Lab (20)

What's in your workflow? Bringing data science workflows to business analysis...
What's in your workflow? Bringing data science workflows to business analysis...What's in your workflow? Bringing data science workflows to business analysis...
What's in your workflow? Bringing data science workflows to business analysis...
The Proliferation of New Database Technologies and Implications for Data Scie...
The Proliferation of New Database Technologies and Implications for Data Scie...The Proliferation of New Database Technologies and Implications for Data Scie...
The Proliferation of New Database Technologies and Implications for Data Scie...
Racial Bias in Policing: an analysis of Illinois traffic stops data
Racial Bias in Policing: an analysis of Illinois traffic stops dataRacial Bias in Policing: an analysis of Illinois traffic stops data
Racial Bias in Policing: an analysis of Illinois traffic stops data
Data Quality Analytics: Understanding what is in your data, before using it

GPU Computing for Data Science

  • 1. GPU Computing for Data Science John Joo Data Science Evangelist @ Domino Data Lab
  • 2. Outline • Why use GPUs? • Example applications in data science • Programming your GPU
  • 3. Case Study: Monte Carlo Simulations • Simulate behavior when randomness is a key component • Average the results of many simulations • Make predictions
  • 4. Little Information in One “Noisy Simulation” Price(t+1) = Price(t) · e^(InterestRate · dt) + noise
  • 5. Many “Noisy Simulations” ➡ Actionable Information Price(t+1) = Price(t) · e^(InterestRate · dt) + noise
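The price update on slides 4 and 5 can be sketched in plain Python as follows. This is a minimal illustration, not the presenter's code: the Gaussian noise, the starting price, the rate, the step size, and the names `simulate_path` and `monte_carlo_estimate` are all assumptions made for the example. One path on its own is noisy; averaging many paths gives a stable estimate.

```python
import math
import random


def simulate_path(n_steps, price0, rate, dt, noise_scale, rng):
    """Advance one price path: Price(t+1) = Price(t) * e^(rate * dt) + noise."""
    growth = math.exp(rate * dt)
    price = price0
    for _ in range(n_steps):
        # Assumption: zero-mean Gaussian noise; the slides do not specify it.
        price = price * growth + rng.gauss(0.0, noise_scale)
    return price


def monte_carlo_estimate(n_paths, n_steps=100, price0=100.0, rate=0.05,
                         dt=0.01, noise_scale=0.5, seed=0):
    """Average the final price over many independent noisy paths."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        # Paths are simulated one after another, as a CPU loop would do.
        total += simulate_path(n_steps, price0, rate, dt, noise_scale, rng)
    return total / n_paths


if __name__ == "__main__":
    print("one noisy path:", simulate_path(100, 100.0, 0.05, 0.01, 0.5,
                                           random.Random(1)))
    print("average of 20,000 paths:", monte_carlo_estimate(20_000))
```

Because the assumed noise has zero mean, the average should settle near Price(0) · e^(InterestRate · total time), while any single path can land several units away from it.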
  • 6. Monte Carlo Simulations Are Often Slow • Lots of simulation data is required to create valid models • Generating lots of data takes time • CPU works sequentially
  • 7. CPUs designed for sequential, complex tasks Source: Mythbusters
  • 8. GPUs designed for parallel, low level tasks Source: Mythbusters
  • 9. GPUs designed for parallel, low level tasks Source: Mythbusters
  • 10. Applications of GPU Computing in Data Science • Matrix manipulation • Numerical analysis • Sorting • FFT • String matching • Monte Carlo simulations • Machine learning • Search Algorithms for GPU Acceleration • Inherently parallel • Matrix operations • High floating-point operations per second (FLOPS)
  • 11. GPUs Make Deep Learning Accessible (Google Datacenter vs Stanford AI Lab) • # of machines: 1,000 vs 3 • # of CPUs or GPUs: 2,000 CPUs vs 12 GPUs • Cores: 16,000 vs 18,432 • Power used: 600 kW vs 4 kW • Cost: $5,000,000 vs $33,000 • Source: Adam Coates, Brody Huval, Tao Wang, David Wu, Bryan Catanzaro, Andrew Ng; JMLR W&CP 28 (3): 1337–1345, 2013
  • 12. CPU vs GPU Architecture: Structured for Different Purposes • CPU: 4-8 high-performance cores • GPU: 100s-1000s of bare-bones cores
  • 13. Both CPU and GPU are required • GPU: compute-intensive functions • CPU: everything else • General Purpose GPU Computing (GPGPU) • Heterogeneous Computing
  • 14. Getting Started: Hardware • Need a computer with GPU • GPU should not be operating your display • Spin up a GPU/CPU computer with 1 click: 8 CPU cores, 15 GB RAM; 1,536 GPU cores, 4 GB RAM
  • 16. Getting Started: Software • Programming CPU: sequential; write code top to bottom; can do complex tasks; independent • Programming GPU: parallel; multi-threaded (race conditions); low-level tasks; dependent on CPU
  • 17. Talking to your GPU CUDA and OpenCL are GPU computing frameworks
  • 18. Choosing How to Interface with GPU: Simplicity vs Flexibility • Application-specific libraries: high simplicity, low flexibility • General-purpose GPU libraries: in between • Custom CUDA/OpenCL code: low simplicity, high flexibility
  • 19. Application-Specific Libraries • Python: Theano - symbolic math • TensorFlow - ML • Lasagne - NN • Pylearn2 - ML • mxnet - NN • ABC-SysBio - systems biology • R: cudaBayesreg - fMRI • mxnet - NN • rpud - SVM • rgpu - bioinformatics • Tutorial on using Theano, Lasagne, and nolearn:
  • 20. General Purpose GPU Libraries • Python and R wrappers for basic matrix and linear algebra operations • scikit-cuda • cudamat • gputools • HiPLARM • Drop-in library
  • 21. Drop-in Library Credit: NVIDIA Also works for Python!
  • 22. Custom CUDA/OpenCL Code 1. Allocate memory on the GPU 2. Transfer data from CPU to GPU 3. Launch the kernel to operate on the GPU cores 4. Transfer results back to CPU
  • 23. Example of using Python and CUDA: Monte Carlo Simulations • Using PyCUDA to interface Python and CUDA • Simulating 3 million paths, 100 time steps each
  • 24. Python Code for CPU vs Python/PyCUDA Code for GPU • 8 more lines of code
  • 25. Python Code for CPU vs Python/PyCUDA Code for GPU • 1. Allocate memory on the GPU
  • 26. Python Code for CPU vs Python/PyCUDA Code for GPU • 2. Transfer data from CPU to GPU
  • 27. Python Code for CPU vs Python/PyCUDA Code for GPU • 3. Launch the kernel to operate on the GPU cores
  • 28. Python Code for CPU vs Python/PyCUDA Code for GPU • 4. Transfer results back to CPU
  • 29. Python Code for CPU: 26 sec • Python/PyCUDA Code for GPU (8 more lines of code): 1.5 sec • 17x speedup
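The four steps walked through on the slides above (allocate, transfer, launch, transfer back) might look roughly like this with PyCUDA. This is a hedged sketch, not the code shown on the slides, which did not survive in this transcript. The kernel `step_prices`, the helper `blocks_needed`, and the wrapper `step_prices_on_gpu` are hypothetical names, and the kernel applies only one time step with noise generated on the host. Running the wrapper assumes an NVIDIA GPU with the CUDA toolkit, NumPy, and PyCUDA installed.

```python
# CUDA C source for a hypothetical kernel: each GPU thread updates one path.
KERNEL_SOURCE = """
__global__ void step_prices(float *prices, const float *noise,
                            float growth, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        prices[i] = prices[i] * growth + noise[i];
    }
}
"""


def blocks_needed(n_items, threads_per_block):
    """Ceiling division: number of thread blocks that covers every item."""
    return (n_items + threads_per_block - 1) // threads_per_block


def step_prices_on_gpu(prices, noise, growth, threads_per_block=256):
    """Apply one price update on the GPU and return the result as an array."""
    # Imported here so the module still loads on machines without a GPU.
    import numpy as np
    import pycuda.autoinit  # noqa: F401  (creates a CUDA context)
    import pycuda.driver as cuda
    from pycuda.compiler import SourceModule

    prices = np.ascontiguousarray(prices, dtype=np.float32)
    noise = np.ascontiguousarray(noise, dtype=np.float32)
    n = prices.size

    # 1. Allocate memory on the GPU.
    prices_gpu = cuda.mem_alloc(prices.nbytes)
    noise_gpu = cuda.mem_alloc(noise.nbytes)

    # 2. Transfer data from CPU to GPU.
    cuda.memcpy_htod(prices_gpu, prices)
    cuda.memcpy_htod(noise_gpu, noise)

    # 3. Launch the kernel to operate on the GPU cores.
    kernel = SourceModule(KERNEL_SOURCE).get_function("step_prices")
    kernel(prices_gpu, noise_gpu, np.float32(growth), np.int32(n),
           block=(threads_per_block, 1, 1),
           grid=(blocks_needed(n, threads_per_block), 1))

    # 4. Transfer results back to CPU.
    result = np.empty_like(prices)
    cuda.memcpy_dtoh(result, prices_gpu)
    return result
```

The bounds check `if (i < n)` matters because the grid is rounded up to whole blocks, so the last block usually contains threads with no path to update.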
  • 30. Some sample Jupyter notebooks • • Monte Carlo example using PyCUDA • PyCUDA example compiling CUDA C for kernel instructions • Scikit-cuda example of matrix multiplication • Calculating a distance matrix using rpud
  • 31. More resources • NVIDIA • • Berkeley GPU workshop • gpuWorkshop.html • Duke Statistics on GPU (Python) • CUDAPython.html • Andreas Klöckner’s webpage (Python) • • Summary of GPU libraries •
  • 32. More resources • Walkthrough of CUDA programming in R • programming-with-gpus-and-r.html • List of libraries for GPU computing in R • HighPerformanceComputing.html • Matrix computations in machine learning • talk_dhillon.pdf