Data Science Reinvents Learning?

2015-08-24 • San Jose
Paco Nathan, @pacoid 
Director, O’Reilly Learning
Data Science Reinvents Learning?
Beyond Gutenberg and Erasmus
meetup.com/SF-Bay-ACM/events/221693508/

Some Background…
• O’Reilly Learning: you may only hear about us in  
a few instances, if we do our job well; ACM is a great
forum for this discussion
• prior: built out the community evangelism and training
program for Apache Spark at Databricks
• prior: led Data teams for several years, working on
large-scale ML apps in industry, including one of the
largest Hadoop instances running in AWS (2008) and
one of the first 100% AWS system architectures (2006)
• …
• ancient prior: Stanford CSD teaching fellowship (1984-86,
Alice Supton, Stuart Reges) peer-teaching CS course
which later became Residential Computing

Intro
Quite candidly, the one common catch phrase
in Silicon Valley that I find most terrifying:
“It’s like Uber, for ___”

Intro
Ostensibly that leads to a question: how might
an “Uber for Education” look?

Intro
a) Similar to Cthulhu, we might regret actually seeing that

Intro
b) Would we really need that anywho?

Intro
c) Uber itself might not take that approach …
Perhaps “Uber for Learning” might be somewhat 
more apt?
In any case, what comes after Books,
Kindle, MOOCs?

Some Definitions…
“Learning”
ergo…
“Education”
ergo…
“School”

X (on the slide, the chain above is crossed out)
Some Definitions…
Schools are great to have…
If you need a school, pick a  
good one and go
To be clear, we’re not a school

Some Definitions…
Even the best schools these days question 
what they will become in 5-10 years
Not-so-best schools are perhaps questioning  
much more than that

Some Definitions…
Oh BTW, too many (funded) teams seem to  
have this mediocre idea for “education”:
1. assessment: collect test scores ➜
2. define “quantified student” ➜
3. reuse online marketing funnel ad-tech ➜
4. invoke agile coding teams ➜
5. ship mobile/cloud-based SaaS platform ➜
6. ...
7. profit

Some Definitions…
LMS

K-12 not so much, except perhaps in the
case of Safari for Schools
undergrad textbooks?
graduate textbooks, conferences?
professional focus of our audience
• vocational: making a career move
• aspirational: improvement within a career path
• proficiency: has a specific pain-point, needs to resolve it
• familiarity: wants to join in a team dialog about a topic,
e.g., conversational programmer
Learner Personas for professional category

What about MOOCs?
Massive Open Online Courses –
a seven-year trend, beginning with:
Connectivism and Connective Knowledge 
George Siemens, Stephen Downes 
University of Manitoba (2008)
http://cck11.mooc.ca/

What about MOOCs?
Anthony Joseph 
UC Berkeley
early Jun 2015
edx.org/course/uc-berkeleyx/uc-berkeleyx-cs100-1x-introduction-big-6181
Ameet Talwalkar 
UCLA
late Jun 2015
edx.org/course/uc-berkeleyx/uc-berkeleyx-cs190-1x-scalable-machine-6066

What about MOOCs?
Pros:
• cost-effective to reach a large audience
• popular with students
• ¿ addresses “train the trainers” bottleneck ?
Cons:
• expensive to produce and curate
• most students are sampling
• low completion rates
• somewhat chaotic
• lecture fatigue
• ¿ reinforces advantage of the elites ?

What about MOOCs?
Online education: MOOCs taken by educated few 
Ezekiel Emanuel, Nature 503, 342 (2013-11-21)
• 80% of students already have an advanced degree
• 80% come from the richest 6% of the population
Michael Shanks @Stanford: retrenchment around traditional
disciplines will make disparities even more pronounced
An Early Report Card on Massive Open Online Courses 
Geoffrey Fowler, WSJ (2013-10-08)
Amherst, Duke, etc., have rejected edX
see: Open edX Universities Symposium @GWU, 2015-11-11

• search engines surface too many choices  
among the available learning content
• we must get people wanting to interact with
the material – generally due to social context
• academe strives to decontextualize, which  
is the opposite of learning in context
• how do we recognize that learning has
occurred?
• what is the learning promise?
What about MOOCs?

Introduction to Robotics
Peter Corke @QUT
https://moocs.qut.edu.au/learn/introduction-to-robotics-august-2015
• effective use of peer review for scaling
• worked well reaching into Africa, India
Peer Review

Effective Thinking Through Mathematics
Michael Starbird @UT/Austin
https://www.edx.org/course/effective-thinking-through-mathematics-utaustinx-ut-9-01x
• getting students to articulate their
epiphany moments is more interesting  
than other results – Donna Kidwell
Epiphany Moments

Caltech Offers Online Course with
Live Lectures in Machine Learning
Yaser Abu-Mostafa (2012-03-30)
http://www.caltech.edu/news/caltech-offers-online-course-live-lectures-machine-learning-4248
• significant improvement through the use
of “flipped” a.k.a. inverted classrooms
Inverted Classrooms

Scalable Learning 
David Black-Schaffer @Uppsala 
Sverker Janson @KTH SICS
https://www.scalable-learning.com/
• active learning: Flipped Classroom and Just-in-Time Teaching
• exams built directly into specific diagrams within videos
• metrics for where in video+code that students get stuck
• instructor can customize subsequent classroom discussions  
(active teaching phase) based on stuck/unstuck metrics
Inverted Classrooms
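A minimal sketch of the stuck/unstuck idea above, not Scalable Learning's actual implementation. It assumes a hypothetical event log of (student, video position in seconds, event kind) tuples, treats "pause", "rewind", and "wrong_answer" as signs of being stuck, and buckets those events into 30-second segments so an instructor can see which parts of a video to revisit in class. The event names, the bucket size, and the function name are all assumptions for illustration.

```python
from collections import defaultdict

# Hypothetical event kinds treated as "stuck" signals (an assumption, not a real API).
STUCK_KINDS = {"pause", "rewind", "wrong_answer"}


def stuck_segments(events, bucket_seconds=30, top_n=3):
    """Return the top_n video segments ranked by number of distinct stuck students.

    events: iterable of (student_id, video_seconds, kind) tuples.
    Result: list of (segment_start_seconds, distinct_student_count), most stuck first.
    """
    students_by_segment = defaultdict(set)
    for student_id, video_seconds, kind in events:
        if kind in STUCK_KINDS:
            # Round the video position down to the start of its segment.
            start = int(video_seconds // bucket_seconds) * bucket_seconds
            students_by_segment[start].add(student_id)
    ranked = sorted(
        ((start, len(students)) for start, students in students_by_segment.items()),
        key=lambda item: (-item[1], item[0]),
    )
    return ranked[:top_n]


if __name__ == "__main__":
    sample_log = [
        ("ana", 95, "pause"),
        ("ana", 100, "rewind"),
        ("ben", 110, "wrong_answer"),
        ("cat", 40, "play"),
        ("cat", 310, "pause"),
    ]
    for start, count in stuck_segments(sample_log):
        print(f"{start:>4}s: {count} student(s) stuck")
```

Counting distinct students per segment, rather than raw events, keeps one student who rewinds repeatedly from dominating the ranking.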

How to Flip a Class
CTL @UT/Austin
http://ctl.utexas.edu/teaching/flipping-a-class/how
1. identify where the flipped classroom model makes
the most sense for your course
2. spend class time engaging students in application
activities with feedback
3. clarify connections between inside and outside  
of class learning
4. adapt your materials for students to acquire course
content in preparation for class
5. extend learning beyond class through individual  
and collaborative practice
Inverted Classrooms

Learning programming at scale
Philip Guo
O’Reilly Radar (2015-08-13)
http://radar.oreilly.com/2015/08/learning-programming-at-scale.html
• PythonTutor
• Codechella
Tutors could keep an eye on around  
50 learners during a 30-minute session,  
start 12 chat conversations, and  
concurrently help 3 learners at once
Collaborative Learning

Data-driven Education and the Quantified Student
Lorena Barba @GWU
PyData Seattle 2015
https://youtu.be/2YIZ2SY9mW4
• keynote talk: abstract, slides
• homepage
If you study just one link in this entire talk…

If by some bizarre chance you haven’t used  
it already, go to https://jupyter.org/
• 50+ different language kernels
• new funding 2015-07
• UC Berkeley, Cal Poly
• nbgrader autograder by Jess Hamrick
• jupyterhub multi-user server
• curating a list of examples
• repeatable science!
see also: 
Teaching with Jupyter Notebooks 
http://tinyurl.com/scipy2015-education
Project Jupyter

Deploying JupyterHub for Education 
Jessica Hamrick 
Rackspace blog (2015-03-24) 
https://developer.rackspace.com/blog/deploying-jupyterhub-for-education/
Project Jupyter

Literate Programming 
Don Knuth 
Univ of Chicago Press (1992) 
literateprogramming.com/
Instead of imagining that our main task is  
to instruct a computer what to do, let us 
concentrate rather on explaining to human 
beings what we want a computer to do
Evoking some earlier works…

Most definitely check out CodeNeuro,
both online and the conf/hackathon…
Some great examples:
Jeremy Freeman, HHMI Janelia Farm
http://notebooks.codeneuro.org/
Matthew Conlen, NY Data Company
http://lightning-viz.org/
Olga Botvinnik, UCSD
http://yeolab.github.io/flotilla/docs/gallery/
Great Examples

http://mybinder.org/
turn a GitHub repo into a collection  
of interactive notebooks powered by
Jupyter and Kubernetes
Launch Vehicles

Embracing Jupyter Notebooks at O'Reilly 
Andrew Odewahn 
O’Reilly Media (2015-05-07)
https://beta.oreilly.com/ideas/jupyter-at-oreilly
O’Reilly Media is using our Atlas platform  
to make Jupyter Notebooks a first-class
authoring environment for our publishing
program
Jupyter, Thebe, Atlas, Docker, etc.
Content Toolchain

On Demand Analytic and Learning Environments with Jupyter 
Kyle Kelley, Andrew Odewahn 
lambdaops.com/jupyter-environments-odsc2015/
Exploring a couple themes, in particular:
• computational narratives
- exploratory data analysis
- software development/collaboration
- API exploration
- technical papers
- reports, exec dashboards
• code-as-media
- Thebe project, etc.
Content Toolchain

Personal experiences during 2012-2015  
as an author and instructor…
Just Enough Math 
Paco Nathan 
O’Reilly Media (2014) 
http://justenoughmath.com
Content Toolchain

Learnings based on working on this
project with Kyle and Andrew…
How to transition from roles of data scientist,
software developer, engineering director –
into roles of author, teacher – and vice versa
Content Toolchain

Interactive notebooks:
Sharing the code
Helen Shen
Nature (2014-11-05)
nature.com/news/interactive-notebooks-sharing-the-code-1.16261
Content Toolchain

Content Toolchain
Atlas is our content platform backed by Git,
for project collaboration among authors,
editors, et al.
https://atlas.oreilly.com/

Content Toolchain
Thebe (a moon of Jupiter) provides a layer
atop Jupyter that is needed for publishing,
white-labeled content, etc.
https://github.com/oreillymedia/thebe

Content Toolchain
Beta is our new site design:
https://beta.oreilly.com/learning

Content Toolchain
Contrast our current talent workflow and this
new world of Jupyter+Docker+Thebe+cloud …
How would it work with known successes such
as Head First?
[diagram labels: production, presentation, authoring; Thebe: player; Jupyter: notebook; Docker: container; web page: interaction; Git: versioning; Atlas: publications; various formats; cloud infra]

Does Science begin with
Phenomenology?

Audience Patterns for Learning: ad-hoc

Audience Patterns for Learning: architecture
[diagram labels: events, inverted, on-demand; Mostly Synchronous, Mostly Asynch; Inverted Classroom; Paywall, Subscription, Free Content]

The Learning Architecture:
Defining Development and Enabling Continuous Learning
David Mallon, Dani Johnson
Bersin (2014-05-06)
http://www.bersin.com/Practice/Detail.aspx?docid=17435&mode=search&p=Learning-@-Development
This report is designed to help leaders  
and talent development and learning  
professionals to take positive steps  
toward understanding and implementing  
learning architectures
Sidebar: Learning Architecture

Think of a favorite open source framework …
who (or where) are the experts in this graph?
Sidebar: Innovators vs. Experts
Diffusion of Innovation 
Everett Rogers (1962) 
http://sphweb.bumc.bu.edu/otlt/MPH-Modules/SB/SB721-Models/SB721-Models4.html
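To make the graph concrete, here is a small sketch that maps a person's position on the adoption curve to one of Rogers' five adopter categories, using the conventional shares (2.5%, 13.5%, 34%, 34%, 16%). The function name and the idea of describing a person by a cumulative adoption fraction are assumptions for illustration; the sketch does not answer where the experts sit, it only labels the regions of the curve.

```python
# Rogers' adopter categories with their conventional population shares.
ROGERS_CATEGORIES = [
    ("innovators", 0.025),
    ("early adopters", 0.135),
    ("early majority", 0.34),
    ("late majority", 0.34),
    ("laggards", 0.16),
]


def adopter_category(adoption_fraction):
    """Map a cumulative adoption fraction (0.0 to 1.0) to a Rogers category.

    adoption_fraction is the share of the eventual community that had
    already adopted the framework when this person picked it up.
    """
    if not 0.0 <= adoption_fraction <= 1.0:
        raise ValueError("adoption_fraction must be between 0 and 1")
    upper_bound = 0.0
    for name, share in ROGERS_CATEGORIES:
        upper_bound += share
        if adoption_fraction <= upper_bound:
            return name
    # Guards against floating-point rounding when the fraction is 1.0.
    return ROGERS_CATEGORIES[-1][0]


if __name__ == "__main__":
    for fraction in (0.01, 0.10, 0.30, 0.70, 0.95):
        print(f"{fraction:.2f} -> {adopter_category(fraction)}")
```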

Building Blocks
In software engineering, we rarely hand a  
developer the spec for some app and say  
“Start from scratch, then come back when
you’re done.” Instead:
• focus on MVP
• leverage APIs, libraries, microservices, etc.
• iterate on small, incremental changes
• this allows for TDD, CI, etc.
• plus, customer experiments ➜ data science
Compare/contrast that with how publishers
approach authors, speakers, instructors?

Building Blocks
Proposing a new format spec to replace  
EPUB, MOBI, etc.:
• video segments + transcripts
• notebooks in Jupyter+Thebe+Docker
• metadata (persona, topics, cues, etc.)
• links to Git repos, Dat data
• annotations atop existing content
• webcast/livestream
• social interaction (TA/mentoring)
• evaluation modules
• discourse analytics
most reused across a spectrum
of synchronous to async
instrumented for experiments,
analytics, iteration
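One way to picture such a format is as a manifest that bundles the pieces listed above into a single reusable unit. The sketch below is only a guess at a possible shape: the class names, field names, and example values are hypothetical, and nothing here reflects an actual O'Reilly spec.

```python
from dataclasses import dataclass, field, asdict
import json


@dataclass
class VideoSegment:
    url: str
    transcript_url: str


@dataclass
class LearningUnit:
    """Hypothetical manifest for one reusable building block of content."""
    title: str
    personas: list = field(default_factory=list)        # e.g. "proficiency"
    topics: list = field(default_factory=list)
    video_segments: list = field(default_factory=list)  # VideoSegment items
    notebooks: list = field(default_factory=list)       # paths to Jupyter notebooks
    container_image: str = ""                           # Docker image for the notebooks
    repos: list = field(default_factory=list)           # links to Git repos, Dat data
    evaluation_modules: list = field(default_factory=list)

    def to_json(self):
        """Serialize the manifest, including nested video segments, as JSON."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)


if __name__ == "__main__":
    unit = LearningUnit(
        title="Intro to gradient descent",
        personas=["proficiency"],
        topics=["optimization"],
        video_segments=[
            VideoSegment("https://example.com/v1.mp4", "https://example.com/v1.txt")
        ],
        notebooks=["notebooks/gradient_descent.ipynb"],
        container_image="example/notebook-image:latest",
    )
    print(unit.to_json())
```

Keeping the manifest as plain data means the same unit could be reused across synchronous and async delivery, with analytics attached by whatever system reads it.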

Dimensional Reduction
(questions grouped on the slide under: learner, topic, experience)
• Do you have sufficient familiarity with the topic? (total newbie ↔ good overview)
• Can you build on familiarity with a related topic? (utterly confused ↔ familiar territory)
• Do you have necessary proficiency in the topic? (must get unstuck ↔ send pull request)
• How many boundaries must you span to achieve structural literacy for this topic? (concise topic ↔ interdisciplinary)
• What is your primary motivation to learn this topic? (want to, for myself ↔ have to, for my job)
• Where are you on the "diffusion of innovation" curve w.r.t. the topic? (bleeding edge ↔ COBOL 2020)
• How high is the transaction cost for the experience delivered to you? (on-demand ↔ major event)
• Does the learning experience immerse you within a diverse, supportive social context? ("go read the code" ↔ full-team participation)
Did we mention intense needs
for data analytics at scale?

Is it possible to measure “distance” between  
a learner and a subject community?
From Amateurs to Connoisseurs: 
Modeling the Evolution of User  
Expertise through Online Reviews 
Julian McAuley, Jure Leskovec 
http://i.stanford.edu/~julian/pdfs/www13.pdf
Recommender Systems
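One naive reading of that question, sketched under loud assumptions: score a learner from 0 to 1 on each of the eight questions from the Dimensional Reduction slide, do the same for members of the subject community, and take the Euclidean distance between the learner and the community average. This is not the model from the paper cited above; the dimension names, the 0 to 1 scale, and both function names are hypothetical.

```python
import math

# Hypothetical names for the eight questions on the Dimensional Reduction slide.
DIMENSIONS = [
    "familiarity", "related_familiarity", "proficiency", "boundaries",
    "motivation", "diffusion", "transaction_cost", "social_context",
]


def centroid(profiles):
    """Average a non-empty list of profiles (dicts of dimension -> score in [0, 1])."""
    if not profiles:
        raise ValueError("need at least one profile")
    return {d: sum(p[d] for p in profiles) / len(profiles) for d in DIMENSIONS}


def distance(learner, community_profiles):
    """Euclidean distance between a learner and the community centroid."""
    center = centroid(community_profiles)
    return math.sqrt(sum((learner[d] - center[d]) ** 2 for d in DIMENSIONS))


if __name__ == "__main__":
    newcomer = {d: 0.1 for d in DIMENSIONS}
    community = [{d: 0.8 for d in DIMENSIONS}, {d: 0.6 for d in DIMENSIONS}]
    print(f"distance to community: {distance(newcomer, community):.3f}")
```

A smaller distance would suggest content pitched at the community's level; a larger one suggests the learner first needs overview material.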

Back to “Uber for Learning” – approaching from a learner
(audience) perspective, generally within a social context
Given that:
• books aren’t used by learners as much anymore
• experts don’t have time to write books anymore
If we can:
• fit learners’ needs to topics w.r.t. subject communities,
based on their S-curve positions
• personalize lectures for learners’ pain-points
• reuse containerized building blocks
Imagine the extent to which our current data science  
tooling and techniques can be leveraged?
Summary

PS: If you are interested in opportunities  
to write, speak, teach, mentor, code, etc.,  
based on these approaches, let us know
Get Involved!

presenter:
Just Enough Math
O’Reilly (2014)
justenoughmath.com
monthly newsletter for updates,  
events, conf summaries, etc.:
liber118.com/pxn/
