Information Technology Infrastructure Committee (ITIC): Report to the NAC - Larry Smarr
This document summarizes the December 2013 report from the NASA Advisory Council's Information Technology Infrastructure Committee (ITIC). It discusses NASA's transition to a more agile, collaborative agency that brings together experts from multiple centers to solve problems. The report outlines NASA's vision for a "OneNASA" organization enabled by unified IT tools and infrastructure. It also notes that NASA has begun implementing improved IT governance and developing a framework to coordinate IT investments across centers and missions.
The document describes the Strongly Coupled LambdaCloud project at Calit2. It discusses two new buildings at Calit2 that house over 1000 researchers working on nanotech, biotech, chips, VR and other areas. It also describes the OptIPuter network that connects these researchers via 10Gbps lightpaths, enabling collaborative data-intensive research worldwide. The network includes 50 OptIPortals connected to resources like supercomputers and satellite imagery.
The Energy Efficient Cyberinfrastructure in Slowing Climate Change - Larry Smarr
10.04.28
Invited Speaker
Community Alliance for Distributed Energy Resources
Scripps Forum, UCSD
Title: The Energy Efficient Cyberinfrastructure in Slowing Climate Change
La Jolla, CA
Why Researchers are Using Advanced Networks - Larry Smarr
07.07.03
Remote Talk from Calit2 to:
Building KAREN Communities for Collaboration Forum
KIWI Advanced Research and Education Network
University of Auckland, Auckland City, New Zealand
Title: Why Researchers are Using Advanced Networks
La Jolla, CA
OptIPuter-A High Performance SOA LambdaGrid Enabling Scientific Applications - Larry Smarr
07.03.21
IEEE Computer Society Tsutomu Kanai Award Keynote
At the Joint Meeting of the: 8th International Symposium on Autonomous Decentralized Systems
2nd International Workshop on Ad Hoc, Sensor and P2P Networks
11th IEEE International Workshop on Future Trends of Distributed Computing Systems
Title: OptIPuter-A High Performance SOA LambdaGrid Enabling Scientific Applications
Sedona, AZ
The Jump to Light Speed - Data Intensive Earth Sciences are Leading the Way to the International LambdaGrid - Larry Smarr
05.06.14
Keynote to the 15th Federation of Earth Science Information Partners Assembly Meeting: Linking Data and Information to Decision Makers
Title: The Jump to Light Speed - Data Intensive Earth Sciences are Leading the Way to the International LambdaGrid
San Diego, CA
From the Shared Internet to Personal Lightwaves: How the OptIPuter is Transfo... - Larry Smarr
The document summarizes how the OptIPuter project is transforming scientific research through user-controlled high-speed optical network connections. It provides examples of how 1-10Gbps connections through projects like National LambdaRail are enabling new forms of collaborative work and access to scientific instruments and global data repositories. The OptIPuter creates an environment where researchers can access remote resources through local "OptIPortals" connected to these high-speed optical networks.
Remote Telepresence for Exploring Virtual Worlds - Larry Smarr
The document describes the history and development of remote telepresence and virtual reality technologies over several decades. It outlines key projects and innovations including the NSFnet which connected supercomputers in the 1980s, the development of the CAVE virtual reality system in the early 1990s, and more advanced optical network projects like OptIPuter in the 2000s which enabled high-resolution telepresence and collaboration across global research centers.
High Performance Cyberinfrastructure for Data-Intensive Research - Larry Smarr
This document summarizes a lecture given by Dr. Larry Smarr on high performance cyberinfrastructure for data-intensive research. The summary discusses:
1) The need for dedicated high-bandwidth networks separate from the shared internet to enable big data research due to the increasing volume of digital scientific data.
2) Extensions being made to networks like CENIC in California to provide campus "Big Data Freeways" connecting instruments, computing resources, and remote facilities.
3) The use of networks like HPWREN to provide high-performance wireless access for data-intensive applications in rural areas like astronomy, wildfire detection, and more.
Toward a Global Interactive Earth Observing Cyberinfrastructure - Larry Smarr
The document discusses the need for a new generation of cyberinfrastructure to support interactive global earth observation. It outlines several prototyping projects that are building examples of systems enabling real-time control of remote instruments, remote data access and analysis. These projects are driving the development of an emerging cyber-architecture using web and grid services to link distributed data repositories and simulations.
Project StarGate: An End-to-End 10Gbps HPC to User Cyberinfrastructure ANL * Calit2 * LBNL * NICS * ORNL * SDSC - Larry Smarr
09.11.03
Report to the
Dept. of Energy Advanced Scientific Computing Advisory Committee
Title: Project StarGate An End-to-End 10Gbps HPC to User Cyberinfrastructure ANL * Calit2 * LBNL * NICS * ORNL * SDSC
Oak Ridge, TN
The OptiPuter, Quartzite, and Starlight Projects: A Campus to Global-Scale Testbed for Optical Technologies Enabling LambdaGrid Computing - Larry Smarr
05.03.09
Invited Talk
Optical Fiber Communication Conference (OFC2005)
Title: The OptiPuter, Quartzite, and Starlight Projects: A Campus to Global-Scale Testbed for Optical Technologies Enabling LambdaGrid Computing
Anaheim, CA
06.12.13
Panelist
Panel on Issues, Challenges, and Future Directions of Multimedia Research
IEEE International Symposium on Multimedia (ISM 2006)
Title: Towards GigaPixel Displays
La Jolla, CA
- The Pacific Research Platform (PRP) interconnects campus DMZs across multiple institutions to provide high-speed connectivity for data-intensive research.
- The PRP utilizes specialized data transfer nodes called FIONAs that provide disk-to-disk transfer speeds of 10-100Gbps.
- Early applications of the PRP include distributing telescope data between UC campuses, connecting particle physics experiments to computing resources, and enabling real-time wildfire sensor data analysis.
The Pacific Research Platform (PRP) is a multi-institutional cyberinfrastructure project that connects researchers across California and beyond to share large datasets. It spans the 10 University of California campuses, major private research universities, supercomputer centers, and some out-of-state universities. Fifteen multi-campus research teams in fields like physics, astronomy, earth sciences, biomedicine, and multimedia will drive the technical needs of the PRP over five years. The goal is to create a "big data freeway" to allow high-speed sharing of data between research labs, supercomputers, and repositories across multiple networks without performance loss over long distances.
Towards a High-Performance National Research Platform Enabling Digital Research - Larry Smarr
The document summarizes Dr. Larry Smarr's keynote presentation on enabling a high-performance national research platform. It describes how multi-institutional research increasingly relies on access to large datasets, requiring new cyberinfrastructure. The Pacific Research Platform provides high-bandwidth networking between universities to support research collaborations across disciplines. The next steps involve scaling this model into a national and global platform. The presentation highlights how the PRP enables various scientific applications and drives innovation through improved data transfer capabilities and distributed computing resources.
Creating a Big Data Machine Learning Platform in California - Larry Smarr
Big Data Tech Forum: Big Data Enabling Technologies and Applications
San Diego Chinese American Science and Engineering Association (SDCASEA)
Sanford Consortium
La Jolla, CA
December 2, 2017
Peering The Pacific Research Platform With The Great Plains Network - Larry Smarr
The Pacific Research Platform (PRP) connects research institutions across the western United States with high-speed networks to enable data-intensive science collaborations. Key points:
- The PRP connects 15 campuses across California and links to the Great Plains Network, allowing researchers to access remote supercomputers, share large datasets, and collaborate on projects like analyzing data from the Large Hadron Collider.
- The PRP utilizes Science DMZ architectures with dedicated data transfer nodes called FIONAs to achieve high-speed transfer of large files. Kubernetes is used to manage distributed storage and computing resources.
- Early applications include distributed climate modeling, wildfire science, plankton imaging, and cancer genomics.
Opening Keynote Lecture
15th Annual ON*VECTOR International Photonics Workshop
Calit2’s Qualcomm Institute
University of California, San Diego
February 29, 2016
The document discusses the Pacific Research Platform (PRP), a distributed cyberinfrastructure that connects researchers and data across multiple campuses in California and beyond using optical fiber networking. Key points:
- The PRP uses high-speed networking infrastructure like the CENIC network to connect data generators and consumers across 15+ campuses, creating an integrated "big data freeway system".
- It deploys specialized data transfer nodes called FIONAs to enable high-speed transfer of large datasets between sites at near the full network speed.
- Recent additions include using Kubernetes to orchestrate containers across the PRP infrastructure and integrating machine learning resources through the CHASE-CI grant to support data-intensive AI applications.
Pacific Wave and PRP Update: Big News for Big Data - Larry Smarr
The Pacific Research Platform (PRP) aims to create a "Big Data freeway system" across research institutions in the western United States and Pacific region by leveraging high-bandwidth optical fiber networks. The PRP connects multiple universities and national laboratories, providing bandwidth up to 100Gbps for data-intensive science applications. Initial testing of the PRP demonstrated disk-to-disk transfer speeds exceeding 5Gbps between many sites. The PRP will be expanded with SDN/SDX capabilities to enable even higher performance for large-scale datasets from fields like astronomy, genomics, and particle physics.
Looking Back, Looking Forward: NSF CI Funding 1985-2025 - Larry Smarr
This document provides an overview of the development of national research platforms (NRPs) from 1985 to the present, with a focus on the Pacific Research Platform (PRP). It describes the evolution of the PRP from early NSF-funded supercomputing centers to today's distributed cyberinfrastructure utilizing optical networking, containers, Kubernetes, and distributed storage. The PRP now connects over 15 universities across the US and internationally to enable data-intensive science and machine learning applications across multiple domains. Going forward, the document discusses plans to further integrate regional networks and partner with new NSF-funded initiatives to develop the next generation of NRPs through 2025.
The Rise of Supernetwork Data Intensive Computing - Larry Smarr
Invited Remote Lecture to SC21
The International Conference for High Performance Computing, Networking, Storage, and Analysis
St. Louis, Missouri
November 18, 2021
My Remembrances of Mike Norman Over The Last 45 Years - Larry Smarr
Mike Norman has been a leader in computational astrophysics for over 45 years. Some of his influential work includes:
- Cosmic jet simulations in the early 1980s which helped explain phenomena from galactic centers.
- Pioneering the use of adaptive mesh refinement in the 1990s to achieve dynamic load balancing on supercomputers.
- Massive cosmology simulations in the late 2000s with over 100 trillion particles using thousands of processors across multiple supercomputing sites, producing petabytes of data.
- Developing end-to-end workflows in the 2000s to couple supercomputers, high-speed networks, and large visualization systems to enable real-time analysis of extremely large astrophysics simulations.
1. “Toward A National Big Data Superhighway”
Closing Keynote
Internet2 Global Summit
Washington, DC
April 26, 2017
Dr. Larry Smarr
Director, California Institute for Telecommunications and Information Technology
Harry E. Gruber Professor,
Dept. of Computer Science and Engineering
Jacobs School of Engineering, UCSD
http://lsmarr.calit2.net
2. Abstract
Research in data-intensive fields is increasingly multi-investigator and multi-institutional, depending on ever more rapid access to ultra-large, heterogeneous, and widely distributed datasets. The Pacific Research Platform (PRP) is an NSF-funded research project which extends NSF-funded campus Science DMZs to a regional model, built on the CENIC/Pacific Wave backbone, establishing a science-driven high-capacity data-centric "freeway system." The PRP spans all 10 campuses of the University of California, as well as the major California private research universities, four supercomputer centers, and several universities outside California. Fifteen multi-campus data-intensive application teams, including particle physics, astronomy/astrophysics, earth sciences, biomedicine, and scalable multimedia, act as drivers of the PRP, providing feedback over the five years to the technical design staff. Over the next three years, the PRP will examine sustainable methods for expanding such regional networks to a national scale.
3. Vision: Creating a West Coast "Big Data Freeway", Connected by CENIC/Pacific Wave to Internet2 & GLIF
Use Lightpaths to Connect Big Data Generators and Consumers, Creating a "Big Data" Freeway Integrated With High Performance Global Networks
"The Bisection Bandwidth of a Cluster Interconnect, but Deployed on a 20-Campus Scale."
This Vision Has Been Building for Over a Decade
4. NSF's OptIPuter Project: Using Supernetworks to Meet the Needs of Data-Intensive Researchers
OptIPortal: Termination Device for the OptIPuter Global Backplane
Calit2 (UCSD, UCI), SDSC, and UIC Leads; PI: Larry Smarr
Univ. Partners: NCSA, USC, SDSU, NW, TA&M, UvA, SARA, KISTI, AIST
Industry: IBM, Sun, Telcordia, Chiaro, Calient, Glimmerglass, Lucent
2003-2009, $13,500,000
In August 2003, Jason Leigh and his students used RBUDP to blast data from NCSA to SDSC over the TeraGrid DTFnet, achieving 18Gbps file transfer out of the available 20Gbps. (LS Slide 2005)
5. DOE ESnet's Science DMZ: A Scalable Network Design Model for Optimizing Science Data Transfers
• A Science DMZ integrates 4 key concepts into a unified whole:
– A network architecture designed for high-performance applications, with the science network distinct from the general-purpose network
– The use of dedicated systems as data transfer nodes (DTNs)
– Performance measurement and network testing systems that are regularly used to characterize and troubleshoot the network
– Security policies and enforcement mechanisms that are tailored for high-performance science environments
http://fasterdata.es.net/science-dmz/
"Science DMZ" Coined 2010
The DOE ESnet Science DMZ and the NSF "Campus Bridging" Taskforce Report Formed the Basis for the NSF Campus Cyberinfrastructure Network Infrastructure and Engineering (CC-NIE) Program
6. Based on Community Input and on ESnet's Science DMZ Concept, NSF Has Funded Over 100 Campuses to Build Local Big Data Freeways
Map legend: Red: 2012 CC-NIE Awardees; Yellow: 2013 CC-NIE Awardees; Green: 2014 CC*IIE Awardees; Blue: 2015 CC*DNI Awardees; Purple: Multiple-Time Awardees
Source: NSF
7. I Believe, as Greg Bell Has Said, We Should Engineer the Network as an Instrument of Discovery
It is all about the end users!
We Must Optimize the Instrument for Multi-Campus Collaborating Application Teams
8. How the CC-NIE Prism@UCSD Grant Transforms Big Data Microbiome Science: Preparing for the Knight/Smarr 1 Million Core-Hour Analysis
Diagram components: Knight Lab FIONA (12 cores/GPU, 128 GB RAM, 3.5 TB SSD, 48TB disk, 10Gbps NIC); Prism@UCSD (10Gbps); Gordon; Data Oasis (7.5PB, 200GB/s); Knight 1024 Cluster in the SDSC co-lo; CHERuB (100Gbps); Emperor and other vis tools (64-megapixel data analysis wall); link speeds of 40Gbps, 120Gbps, and 1.3Tbps
9. The Next Logical Step: Build a Regional DMZ by Connecting West Coast Campus DMZs
• May 2014: LS Gives Invited Presentation to UC IT Leadership Council
– Strong Support from UC and UCOP CIOs
• July 2014: LS Gives Invited Talk to CENIC Annual Retreat
– CENIC/PW Agrees to Act as Backplane
– CIO Support Extends to CA Private Research Universities
• December 2014: UCOP CIO and VPRs Provide PRP "Momentum Money"
• January 2015: Kickoff of PRPv0 by Network Engineers
– Begins Every-Two-Week Conference Calls, Now Weekly
• March 2015: LS Invited "Blue Sky" Presentation to UC VCR/CIO Summit
– NSF PRP Proposal Submitted With Letters of Commitment From:
– 50 Researchers from 15 Campuses
– 32 IT/Network Organization Leaders
10. The Pacific Research Platform: a Working End-to-End Science-Driven Regional DMZ-Connector
NSF CC*DNI Grant: $5M, 10/2015-10/2020
PI: Larry Smarr, UC San Diego Calit2
Co-PIs:
• Camille Crittenden, UC Berkeley CITRIS
• Tom DeFanti, UC San Diego Calit2
• Philip Papadopoulos, UCSD SDSC
• Frank Wuerthwein, UCSD Physics and SDSC
PRP is Built on CENIC/Pacific Wave
11. Our Prototype System: Built for Scientists Out of a Bunch of Independently Managed Networks
• Challenge:
– Campus DMZs, Regional (e.g., CENIC), National (Internet2), and International Networks (e.g., GLIF) are Individually-Architected Systems
• How Do They Work Together with Predictable Performance?
• PRP is Focused on Disk-to-Disk Data Movement
– From the Eyes of Domain Scientists
– End-to-End Performance for Their Data is Their Only Real Metric of Concern (As It Should Be)
Source: Phil Papadopoulos
12. PRP Science DMZ Data Transfer Nodes (DTNs): Flash I/O Network Appliances (FIONAs)
UCSD Designed FIONAs to Solve the Disk-to-Disk Data Transfer Problem at Full Speed on 10G, 40G, and 100G Networks
FIONAs: 10/40G, $8,000
FIONette: 1G, $1,000
Phil Papadopoulos, SDSC & Tom DeFanti, Joe Keefe & John Graham, Calit2
13. More Than 30 PRP-Installed FIONAs: Customized to the Needs of Application Teams
• Data Transfer Nodes
– 1, 10, 40, and 100Gb/s NICs
• Storage Transfer Nodes
– Up to 160TB of Rotating Disks
– Nonvolatile Memory Disks (NVMe, 10x Faster than Flash)
– ½ PB Flash Disk (at SC15, on Loan From Vendor)
• Compute Transfer Nodes
– 12-48 Intel CPU Cores
– 1-8 GPUs (Delivers Up to 500,000 GPU Core-Hours/Day)
• Visualization Transfer Nodes
– 3-45 Tiled Displays (up to 180 Megapixels, 2D & 3D)
– 360-Megapixel SunCAVE Coming Soon
14. PRP Continues to Expand Rapidly While Increasing Connectivity: 1 1/2 Years of Progress, 12 Sites to 24 Sites
24 DMZ FIONAs Connected at 10G and 40G (maps: January 29, 2016 and April 24, 2017)
Source: John Graham, Calit2
15. We Measure FIONA Disk-to-Disk Throughput with a 10GB File Transfer, 4 Times Per Day in Both Directions, for All PRP Sites
See Time-Lapse Movie, Jan 2016 to Today:
http://prp-maddash.calit2.optiputer.net/optiputer/optiputer.mp4
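The PRP's dashboard drives these measurements across real DTN pairs; purely to illustrate the disk-to-disk metric itself, here is a minimal local sketch. The file size, temporary directories, and simple local copy are stand-ins for the actual PRP tooling, not a description of it:

```python
import os
import shutil
import tempfile
import time

def disk_to_disk_gbps(size_bytes, src_dir, dst_dir):
    """Write a test file, copy it disk-to-disk, and return achieved Gb/s."""
    src = os.path.join(src_dir, "testfile.bin")
    dst = os.path.join(dst_dir, "testfile.bin")
    with open(src, "wb") as f:
        f.write(os.urandom(size_bytes))          # synthetic payload
    start = time.perf_counter()
    shutil.copy(src, dst)                        # stand-in for a WAN disk-to-disk transfer
    elapsed = time.perf_counter() - start
    return size_bytes * 8 / elapsed / 1e9        # bits moved per second, in Gb/s

with tempfile.TemporaryDirectory() as a, tempfile.TemporaryDirectory() as b:
    rate = disk_to_disk_gbps(100 * 2**20, a, b)  # 100 MB test file
    print(f"achieved {rate:.2f} Gb/s")
```

The PRP tests use 10GB files in both directions between every site pair; the point of the sketch is only that the end-to-end metric is timed bytes on disk, not raw link speed.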
16. We Have Held a Number of PRP Science Engagement Workshops
at UC San Diego, UC Merced, UC Davis, and UC Berkeley
Source: Camille Crittenden, UC Berkeley
17. PRP's First 1.5 Years: Connecting Campus Application Teams and Devices
18. We Scale the Working PRP by Providing Multi-Campus Application Teams With Disk-to-Disk Measurements
Sites shown: UIC, UCSD, UCI, U Hawaii, USC, NCAR, SDSU
19. LHC Researchers Look to PRP to Fix the Last-Mile Architecture in California: Data and Compute Resources Can Both Be Shared
PRP Provides an Implementation of All This on a Single FIONA; PRP Helps Integrate Local Resources into This FIONA.
Diagram: login nodes, compute scheduler, compute cluster, storage cluster, DTN, and CTN within the Science DMZ, connected to the WAN (CTN = compute transfer node; DTN = data transfer node)
Source: Frank Wuerthwein, UCSD, SDSC
20. >360 California Scientists Are Researching Particle Physics Big Data Analysis
• ATLAS (Members)
– UCB/LBNL (63)
– SLAC/Stanford (51)
– UCSC (30)
– UCI (32)
• Total of 176 members listed in the ATLAS HR database at CERN
• CMS (Members)
– Caltech (29)
– LLNL (3)
– UCD (41)
– UCLA (17)
– UCR (25)
– UCSD (36)
– UCSB (35)
• Total of 186 members listed in the CMS HR database at CERN
Source: Frank Wuerthwein, UCSD, SDSC
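The subtotals and the ">360" headline can be cross-checked directly from the per-institution counts on the slide:

```python
# Member counts per institution, taken directly from the slide
atlas = {"UCB/LBNL": 63, "SLAC/Stanford": 51, "UCSC": 30, "UCI": 32}
cms = {"Caltech": 29, "LLNL": 3, "UCD": 41, "UCLA": 17,
       "UCR": 25, "UCSD": 36, "UCSB": 35}

atlas_total = sum(atlas.values())   # 176, matching the ATLAS HR database figure
cms_total = sum(cms.values())       # 186, matching the CMS HR database figure
print(atlas_total, cms_total, atlas_total + cms_total)  # 176 186 362 -> ">360"
```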
21. LHC Computing and Data Resources at 10 Institutions
• ATLAS Institutions
– SLAC "T2"
– NERSC (used by both)
– UCSC T3
– UCI T3
• CMS Institutions
– Caltech T2
– UCSD T2
– SDSC (used by both)
– UCD T3
– UCR T3
– UCSB T3
Lots of Potential Network Traffic for LHC on PRP
Source: Frank Wuerthwein, UCSD, SDSC
22. A 100 Gbps FIONA at UCSC Connects the UCSC Hyades Cluster to the NERSC Supercomputer at LBNL, Supporting UCSC Remote Access to Large Data Subsets of the Dark Energy Spectroscopic Instrument (DESI) and AGORA Galaxy Simulation Data Produced at NERSC
250 images per night; 800GB per night
Shawfeng Dong, UCSC Cyberengineer
UCSC, Feb 7, 2017
23. PRP Now Enables Distributed Virtual Reality
40G FIONAs: 20x40G PRP-Connected WAVE@UC San Diego and WAVE@UC Merced
Transferring 5 CAVEcam Images from UCSD to UC Merced: 2 Gigabytes Now Takes 2 Seconds (8 Gb/sec)
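The quoted rate follows directly from the slide's figures:

```python
# CAVEcam transfer quoted on the slide: 2 gigabytes in 2 seconds
bytes_moved = 2 * 10**9          # 2 GB, taking decimal gigabytes
seconds = 2.0
gbps = bytes_moved * 8 / seconds / 10**9
print(f"{gbps:.0f} Gb/s")        # 8 Gb/s, matching the slide
```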
24. PRP Will Link the Laboratories of the Pacific Earthquake Engineering Research Center
http://peer.berkeley.edu/
PEER Labs: UC Berkeley, Caltech, Stanford, UC Davis, UC San Diego, and UC Los Angeles
John Graham Installing FIONette at PEER, Feb 10, 2017
25. Cancer Genomics Hub (UCSC) is Housed in SDSC: Large Data Flows to End Users at UCSC, UCB, UCSF, …
Jan 2016: 30,000 TB Per Year (chart annotations: 1G, 8G, 15G)
Data Source: David Haussler, Brad Smith, UCSC
27. The Prototype PRP Has Attracted New Application Drivers (More in the Next Larry and Scott Talks)
Scott Sellars and Marty Ralph, Center for Western Weather and Water Extremes
Frank Vernon, Expansion of HPWREN
Tom Levy, Cultural Heritage
Cryo-EM
28. PRP UC-JupyterHub Backbone (UCSD and UCB)
GPU JupyterHub node: 2 x 14-core CPUs, 256GB RAM, 1.2TB FLASH, 3.8TB SSD, Nvidia K80 GPU, Dual 40GbE NICs, and a Trusted Platform Module
GPU JupyterHub node: 1 x 18-core CPU, 128GB RAM, 3.8TB SSD, Nvidia K80 GPU, Dual 40GbE NICs, and a Trusted Platform Module
Next Step: Deploy Across PRP
Source: John Graham, Calit2
29. Key Phenomena Causing Extreme Precipitation in the Western U.S. (Ralph et al. 2014): Atmospheric Rivers (fall and winter), Southwest Monsoon (summer and fall), Great Plains Convection (spring and summer), and Front Range Upslope (rain/snow)
CW3E (Center for Western Weather and Water Extremes), Based at UCSD/Scripps Oceanography; Director: F. Martin Ralph; Website: cw3e.ucsd.edu
Funded collaborations: CW3E-North at Sonoma County Water Agency
Data is at the heart of what we do!
• High-resolution numerical models
• Satellite images
• Ground-based weather stations
• Weather radar
• Historical climate data
Big Data Collaboration with: Collaboration on Atmospheric Water Between UC San Diego and UC Irvine; Director: Soroosh Sorooshian, UC Irvine; Website: http://chrs.web.uci.edu
Source: Scott Sellars, CW3E
30. Improvement of Over 1000x With PRP
Diagram: Calit2's FIONA (with GPUs) at UC Irvine and Calit2's FIONA (with GPUs) at UC San Diego, linked over the Pacific Research Platform (10-100 Gb/s) to SDSC's COMET
Complete workflow time: from 20 days, to 20 hours, to 20 minutes!
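The "over 1000x" claim is easy to verify from the quoted workflow times:

```python
minutes_per_day = 24 * 60
before = 20 * minutes_per_day    # original workflow: 20 days
after = 20                       # PRP-enabled workflow: 20 minutes
speedup = before / after
print(speedup)                   # 1440.0, i.e. over 1000x
```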
31. Cryo-electron Microscopy (cryo-EM)
Has Driven a “Resolution Revolution” in the Last Five Years
Exposure (every 60 seconds):
X & Y dimensions: 7420 x 7676 Pixels
Frames per Movie: 10 - 50
Size: 3 - 10 GB per Movie
Every 24 hours:
Number of Movies: ~1400
Data Size: ~5 TB
Typical Datasets:
Length of Time: 2 - 6 Days
Total size: 10 - 30 TB
Each Cryo-EM ‘Image’ is Actually a Movie
Source: Michael A. Cianfrocco,
Elizabeth Villa, & Andres Leschziner, UCSD
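The per-movie and per-day figures above are mutually consistent, as a quick check shows (my arithmetic, assuming decimal GB/TB):

```python
# Daily cryo-EM data volume from the slide's numbers: ~1400 movies/day
# at 3-10 GB per movie.
def daily_tb(movies_per_day: int, gb_per_movie: float) -> float:
    """Daily output in TB, decimal units (1 TB = 1000 GB)."""
    return movies_per_day * gb_per_movie / 1000

low, high = daily_tb(1400, 3), daily_tb(1400, 10)
print(f"{low:.1f} - {high:.1f} TB/day")  # 4.2 - 14.0 TB/day, matching the ~5 TB figure
```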
32. Using PRP to Connect Cryo-EM across California
With End Users and Computational Facilities
Long term:
‣ Partner with Cryo-EM Facilities to Stream Data
Straight from Microscopes (over PRP) to SDSC
‣ Perform All Cryo-EM Analysis (from Micrographs
to 3D Models) via Web Browser on SDSC
‣ Expand Computing to Other XSEDE Resources
(e.g. Xstream) and DOE’s NERSC
Short term:
‣ Provide 2D and 3D Analysis on Particle Stacks on
Comet at SDSC
Source: Michael A. Cianfrocco, UCSD
3 Supercomputer Centers: SDSC, NERSC, Xstream
cosmic-cryoem.org
~20 Microscopes in CA
UCLA
UC Davis
UC Santa Cruz
SF Bay
UC Berkeley, LBNL,
UCSF, Stanford
San Diego
UCSD, TSRI, Salk*
33. Linking Cultural Heritage and Archaeology Datasets
at UCB, UCLA, UCM and UCSD with CAVEkiosks
48 Megapixel CAVEkiosk
UCSD Library
48 Megapixel CAVEkiosk
UCB Library
24 Megapixel CAVEkiosk
UCM Library
34. PRP is the Platform Chosen for 2017 Expansion
of HPWREN, Connected to CENIC, into Orange and Riverside Counties
• PRP CENIC 100G Link
UCSD to SDSU
– DTN FIONAs Endpoints
– Data Redundancy
– Disaster Recovery
– High Availability
– Network Redundancy
• Anchor to CENIC at UCI
– PRP FIONA Connects to
CalREN-HPR Network
– Data Replication Site
• Potential Future UCR
CENIC Anchor
Source: Frank Vernon,
Greg Hidley, UCSD
35. Proposed Cognitive Hardware and Software Ecosystem
On the Pacific Research Platform
• Working With 30 CSE Machine Learning Researchers
– Goal is 320 Game GPUs in 32-40 FIONAs at 10 PRP Campuses
– PRP Couples FIONAs with GPUs into a Condor-Managed Cloud
• PRP Access to Emerging Processors
– IBM TrueNorth, KnuEdge, FPGA, and Qualcomm Snapdragon
• Software Including a Wide Range of Open ML Algorithms
• Metrics for Performance of Processors and Algorithms
Source: Tom DeFanti, Calit2
FIONA with 8 Game GPUs
36. We are Now Investigating
How the PRP Prototype Might Be Extended to National-Scale
From the text of the PRP cooperative agreement:
After approximately 18 (or TBD) months, a site visit and comprehensive review of
progress towards meeting project milestones and goals and overall performance and
management processes will take place, including user community relationships,
scientific impacts, and the status of the project as a model for potential future
national-scale, network-aware, data-focused cyberinfrastructure attributes,
approaches, and capabilities.
37. Expanding to National Research Platform and Global Research Platform
Via CENIC/Pacific Wave, Internet2, and International Links
PRP’s Current International Partners
Korea Shows Distance is Not the Barrier
to Above 5Gb/s Disk-to-Disk Performance
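To put 5 Gb/s disk-to-disk in perspective, here is a rough estimate (mine, not from the slides) of how long a 30 TB dataset, the upper end of the cryo-EM figures earlier, would take to move at that rate, ignoring protocol overhead:

```python
# Transfer time for a dataset at a given sustained disk-to-disk rate
# (decimal units: 1 TB = 1e12 bytes, 1 Gb/s = 1e9 bits/s).
def transfer_hours(tb: float, gbps: float) -> float:
    """Hours to move `tb` terabytes at `gbps` gigabits per second."""
    seconds = tb * 1e12 * 8 / (gbps * 1e9)
    return seconds / 3600

print(f"{transfer_hours(30, 5):.1f} hours")  # ~13.3 hours for 30 TB at 5 Gb/s
```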
38. PRP Working on Connecting Guam
via the University of Oregon-Based Network Startup Resource Center
The PRP shipped a FIONette
to CENIC’s John Hess
to be Installed in Guam Mid-May
To support projects in:
• Geography
• Climate History
• Guam EPSCoR
• The UOG Marine Laboratory
“During the quarter century that this group has been helping to build internet infrastructure
around the world, there’s hardly a place on the planet that has not been touched
by the great work of the Network Startup Resource Center,” -- Larry Smarr.
39. PRP is Partnering with the Advanced CyberInfrastructure –
Research and Education Facilitators (ACI-REF) NSF Grant to Explore Extension
PRP Connected
ACI-REF has also spawned the 28-member Campus Research Computing consortium (CaRC), funded by the NSF as a Research Coordination Network (RCN). CaRC is dedicated to sharing best practices, expertise, and resources, enabling the advancement of campus-based research computing activities around the nation.
Jim Bottum, Principal Investigator
ACI-REF
CaRC
40. Announcing the First National Research Platform
Workshop August 7-8, 2017
Co-Chairs:
Larry Smarr, Calit2
& Jim Bottum, Internet2
See pacificresearchplatform.org
for Registration Information
41. Toward a National Research Platform
PRP has 3 FTEs to Connect ~25 Campuses.
How Many are Needed to Expand to an NRP
Serving Researchers at 250 Campuses in Dozens of Fields?
What is the Path Forward?
As Internet2 Board of Trustees Member
John Evans Said to Me Last Night:
“We Are Near an Inflection Point.”
42. Our Support:
• US National Science Foundation (NSF) awards CNS-0821155, CNS-1338192,
CNS-1456638, ACI-1540112, and ACI-1541349
• University of California Office of the President CIO
• UCSD Chancellor’s Integrated Digital Infrastructure Program
• UCSD Next Generation Networking initiative
• Calit2 and Calit2 Qualcomm Institute
• CENIC, Pacific Wave and StarLight
• DOE ESnet