If you’re familiar with relational databases, designing your app to use a NoSQL database like DynamoDB may be new to you. In this webinar, we’ll walk you through common data design patterns for a variety of applications to help you learn how to design a schema, then store and retrieve the data with DynamoDB. We will discuss the benefits of using DynamoDB to develop mobile, web, IoT, and gaming apps.
Learning Objectives:
Learn schema design best practices with DynamoDB across multiple use cases, including gaming, AdTech, IoT, and others
Who Should Attend:
Architects, Developers, and SysOps interested in learning how to design NoSQL schemas to support mobile, web, IoT, AdTech, and gaming apps.
Familiarity with DynamoDB is helpful
Data processing and analysis is where big data is most often consumed - driving business intelligence (BI) use cases that discover and report on meaningful patterns in the data. In this session, we will discuss options for processing, analyzing and visualizing data. We will also look at partner solutions and BI-enabling services from AWS. Attendees will learn about optimal approaches for stream processing, batch processing and interactive analytics. AWS services to be covered include: Amazon Machine Learning, Elastic MapReduce (EMR), and Redshift.
This session will begin with an introduction to non-relational (NoSQL) databases and compare them with relational (SQL) databases. Learn the fundamentals of Amazon DynamoDB, a fully managed NoSQL database service, and see the DynamoDB console first-hand. See a walk-through demo of building a serverless web application using this high-performance key-value and JSON document store.
NoSQL is an important part of many big data strategies. Attend this session to learn how Amazon DynamoDB helps you create fast ingest and response data sets. We demonstrate how to use DynamoDB for batch-based query processing and ETL operations (using a SQL-like language) through integration with Amazon EMR and Hive. Then, we show you how to reduce costs and achieve scalability by connecting data to Amazon ElastiCache for handling massive read volumes. We’ll also discuss how to add indexes on DynamoDB data for free-text searching by integrating with Elasticsearch using AWS Lambda and DynamoDB Streams. Finally, you’ll find out how you can take your high-velocity, high-volume data (such as IoT data) in DynamoDB and connect it to a data warehouse (Amazon Redshift) to enable BI analysis.
This webinar discusses Amazon DynamoDB, a NoSQL, highly scalable, SSD-based, zero administration database service in the AWS Cloud. We explain how DynamoDB works and also walk through some best practices and tips to get the most out of the service.
Interested in learning about event-driven programming? In this session we will introduce you to some of the basics of using Amazon DynamoDB, its newly launched Streams feature and AWS Lambda. We will provide an overview of both AWS products and walk you through the process of building a real-world application using AWS Triggers, which combines DynamoDB Streams and AWS Lambda.
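As a rough illustration of the trigger pattern described above, the sketch below shows the kind of function AWS Lambda might invoke with a batch of DynamoDB Streams records. It assumes the usual stream event shape (a `Records` list whose entries carry `eventName` and a `dynamodb` map with `Keys`). The function name `handler`, the `player_id` key attribute, and the sample event are hypothetical and not taken from the session.

```python
def handler(event, context=None):
    """Summarize the item changes carried by one DynamoDB Streams batch."""
    counts = {"INSERT": 0, "MODIFY": 0, "REMOVE": 0}
    inserted = []
    for record in event.get("Records", []):
        name = record.get("eventName")
        if name in counts:
            counts[name] += 1
        if name == "INSERT":
            # Key values arrive in DynamoDB JSON, e.g. {"S": "p1"}.
            # "player_id" is an assumed key attribute for this sketch.
            inserted.append(record["dynamodb"]["Keys"]["player_id"]["S"])
    return {"counts": counts, "inserted": inserted}


# Hand-written sample event for local experimentation (not real stream output).
sample_event = {
    "Records": [
        {"eventName": "INSERT",
         "dynamodb": {"Keys": {"player_id": {"S": "p1"}}}},
        {"eventName": "MODIFY",
         "dynamodb": {"Keys": {"player_id": {"S": "p1"}}}},
        {"eventName": "REMOVE",
         "dynamodb": {"Keys": {"player_id": {"S": "p2"}}}},
    ]
}
result = handler(sample_event)
```

Because the handler only reads the event dictionary, it can be exercised locally before being attached to a stream.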
This document provides an overview of Amazon Web Services storage options for big data and analytics workloads. It discusses Amazon S3, Amazon EBS volume types, use cases for different storage solutions, examples of customers optimizing storage, and a new feature called EBS Elastic Volumes that allows modifying the configuration of live EBS volumes non-disruptively.
BDA305 NEW LAUNCH! Intro to Amazon Redshift Spectrum: Now query exabytes of d... - Amazon Web Services
Amazon Redshift Spectrum is a new feature that extends Amazon Redshift’s analytics capabilities beyond the data stored in your data warehouse to also query your data in Amazon S3. You can use Amazon Redshift and your existing business intelligence tools to run SQL queries against exabytes of data, and Redshift Spectrum applies sophisticated query optimization, scaling processing across thousands of nodes so results are fast – even with large data sets and complex queries.
Data Warehousing in the Era of Big Data: Intro to Amazon Redshift - Amazon Web Services
An overview of how Amazon Redshift uses columnar technology, massively parallel processing, and other techniques to deliver fast query performance on petabyte-size datasets.
FSI201 FINRA’s Managed Data Lake – Next Gen Analytics in the Cloud - Amazon Web Services
FINRA’s Data Lake unlocks the value in its data to accelerate analytics and machine learning at scale. FINRA's Technology group has changed its customers' relationship with data by creating a Managed Data Lake that enables discovery on petabytes of capital markets data, while saving time and money over traditional analytics solutions. FINRA’s Managed Data Lake includes a centralized data catalog and separates storage from compute, allowing users to query from petabytes of data in seconds. Learn how FINRA uses Spot instances and services such as Amazon S3, Amazon EMR, Amazon Redshift, and AWS Lambda to provide the 'right tool for the right job' at each step in the data processing pipeline. All of this is done while meeting FINRA’s security and compliance responsibilities as a financial regulator.
Amazon DynamoDB is a fast and flexible NoSQL database service for applications that need consistent, single-digit millisecond latency at any scale. It is a fully managed cloud database and supports both document and key-value store models. Its flexible data model and reliable performance make it a great fit for mobile, web, gaming, ad tech, IoT, and many other applications.
Learning Objectives:
Understand the differences between relational and non-relational databases
Learn about common use cases for DynamoDB across gaming, ad tech, IoT, and more
See how DynamoDB helps customers handle spikes in traffic and save development time for new feature launches
Who Should Attend:
Developers, IT Decision Makers, and Executives interested in learning more about Amazon Web Services’ serverless NoSQL service to scale mobile, web, IoT, ad tech, and gaming apps
Data collection and storage is a primary challenge for any big data architecture. This session will focus on the different types of data that customers are handling to drive high-scale workloads on AWS. Our goal is to help you choose the best approach for your workload. We will dive into optimization techniques that improve performance and reduce the cost of data ingestion and AWS services including Amazon S3, DynamoDB, and Kinesis.
Created by: Mark Korver, Senior Solutions Architect
DynamoDB is a NoSQL database service built for fast, scalable, consistent performance. This presentation introduces DynamoDB and discusses how to get started, provision throughput, design for the DynamoDB data model, query and scan tables and scale reads and writes without downtime.
This document provides an overview of Amazon DynamoDB including key concepts like tables, data types, indexes, scaling, data modeling best practices, and example scenarios. It discusses how to design DynamoDB tables for different data access patterns including 1:1, 1:N, and N:M relationships. It also provides recommendations for modeling time series data, popular fast-changing items, and messaging applications.
SmugMug: From MySQL to Amazon DynamoDB (DAT204) | AWS re:Invent 2013 - Amazon Web Services
SmugMug.com is a popular hosting and commerce platform for photo enthusiasts with hundreds of thousands of subscribers and millions of viewers. Learn how SmugMug uses Amazon DynamoDB to provide customers detailed information about millions of daily image and video views. SmugMug shares code and information about their stats stack, which includes an HTTP interface to Amazon DynamoDB and also interfaces with their internal PHP stack and other tools such as Memcached. Get a detailed picture of lessons learned and the methods SmugMug uses to create a system that is easy to use, reliable, and high performing.
Building Big Data Applications with Serverless Architectures - June 2017 AWS... - Amazon Web Services
Learning Objectives:
- Use cases and best practices for serverless big data applications
- Leverage AWS technologies such as AWS Lambda and Amazon Kinesis
- Learn to perform ETL, event processing, ad-hoc analysis, real-time processing, and MapReduce with serverless
Building data processing applications is challenging and time-consuming, and often requires specialized expertise to deploy and operate. With serverless computing, you can perform real-time stream processing of multiple data types without needing to spin up servers or install software, allowing you to deploy big data applications quickly and more easily. Come learn how you can use AWS Lambda with Amazon Kinesis to analyze streaming data in real-time and then store the results in a managed NoSQL database such as Amazon DynamoDB. You’ll learn tips and tricks for doing in-line processing, data manipulation, and even distributed MapReduce on large data sets.
(BDT314) A Big Data & Analytics App on Amazon EMR & Amazon Redshift - Amazon Web Services
Nasdaq has extended its use of Amazon Redshift to include Amazon EMR and Amazon S3 in order to better manage storage and compute resources separately. Data is ingested into Redshift and then transformed and unloaded to S3. EMR is then used to convert the data to Parquet format and write it to S3 partitioned by date. The data in S3 is accessed using Presto with encryption at rest. Hive is used to manage schemas and partitions across data sources. Tools were developed to help with encryption, schema management, and data migrations between systems while maintaining security and performance.
(BDT310) Big Data Architectural Patterns and Best Practices on AWS - Amazon Web Services
The world is producing an ever increasing volume, velocity, and variety of big data. Consumers and businesses are demanding up-to-the-second (or even millisecond) analytics on their fast-moving data, in addition to classic batch processing. AWS delivers many technologies for solving big data problems. But what services should you use, why, when, and how? In this session, we simplify big data processing as a data bus comprising various stages: ingest, store, process, and visualize. Next, we discuss how to choose the right technology in each stage based on criteria such as data structure, query latency, cost, request rate, item size, data volume, durability, and so on. Finally, we provide reference architecture, design patterns, and best practices for assembling these technologies to solve your big data problems at the right cost.
Getting Started with Amazon Redshift - AWS July 2016 Webinar Series - Amazon Web Services
Traditional data warehouses become expensive and slow down as the volume of your data grows. Amazon Redshift is a fast, petabyte-scale data warehouse that makes it easy to analyze all of your data using existing business intelligence tools for as low as $1000/TB/year. This webinar will provide an introduction to Amazon Redshift and cover the essentials you need to deploy your data warehouse in the cloud so that you can achieve faster analytics and save costs.
Learning Objectives:
• Get an introduction to Amazon Redshift's massively parallel processing, columnar, scale-out architecture
• Learn how to configure your data warehouse cluster, optimize schema, and load data efficiently
• Get an overview of all the latest features including interleaved sorting and user-defined functions
(ARC202) Real-World Real-Time Analytics | AWS re:Invent 2014 - Amazon Web Services
Working with big volumes of data is a complicated task, but it's even harder if you have to do everything in real time and try to figure it all out yourself. This session will use practical examples to discuss architectural best practices and lessons learned when solving real-time social media analytics, sentiment analysis, and data visualization decision-making problems with AWS. Learn how you can leverage AWS services like Amazon RDS, AWS CloudFormation, Auto Scaling, Amazon S3, Amazon Glacier, and Amazon Elastic MapReduce to perform highly performant, reliable, real-time big data analytics while saving time, effort, and money. Gain insight from two years of real-time analytics successes and failures so you don't have to go down this path on your own.
Intro to Amazon Redshift Spectrum: Quickly Query Exabytes of Data in S3 - Jun... - Amazon Web Services
Learning Objectives:
- Learn about Redshift Spectrum, a new feature that allows you to run Redshift queries directly against your data in Amazon S3
- Understand common use cases for Redshift Spectrum
- Identify strategies for improving performance and saving costs when querying your data in Amazon S3
Amazon Redshift Spectrum is a new feature that extends Amazon Redshift’s analytics capabilities beyond the data stored in your data warehouse to also query your data in Amazon S3. You can use Amazon Redshift and your existing business intelligence tools to run SQL queries against exabytes of data.
In this session, we will show you how you can easily start querying your data stored in Amazon S3 with Redshift Spectrum. You can run Amazon Redshift queries on Amazon S3 data on its own, or you can run queries that join together data in S3 with data already in your Redshift data warehouse. We will highlight technical details of query execution and implementation of Redshift Spectrum. We will talk about supported queries, data formats, and strategies to save cost by compressing or transforming your data into a columnar format.
If you’re familiar with relational databases, designing your app to use a fully-managed NoSQL database service like Amazon DynamoDB may be new to you. In this webinar, we’ll walk you through common NoSQL design patterns for a variety of applications to help you learn how to design a schema, then store and retrieve data with DynamoDB. We will discuss best practices with DynamoDB to develop IoT, AdTech, and gaming apps.
Building a Real-Time Geospatial-Aware Recommendation Engine - Amazon Web Services
Recommendation engines help your prospects and customers find the most relevant offers and content. In this presentation, you will learn how to use AWS building blocks to build your own location-aware recommendation engine. You’ll see how to store real-time events using Amazon Kinesis and Amazon DynamoDB. See how to easily move data into Amazon Redshift using Kinesis Firehose. As your site or app rises in popularity, you’ll need to track a wider variety of events and scale to handle traffic and usage spikes. Learn architectural patterns for processing large datasets and high-request volume applications.
Amazon DynamoDB Design Patterns for Ultra-High Performance Apps (DAT304) | AW... - Amazon Web Services
The document summarizes a presentation on advanced design patterns for building ultra-high performance apps using Amazon DynamoDB. The presentation covers using hash and range schemas to model social networks, secondary indexes for flexible querying of image data, conditional writes for synchronizing game state, and fine-grained access control for user data. Examples are provided for each pattern discussed.
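The conditional-write pattern for game state mentioned above can be sketched as an optimistic-locking update: the write succeeds only if the stored version still matches the version the client last read. The block below only builds the request parameters in the shape the DynamoDB `UpdateItem` operation expects. The table layout (`game_id` key, `state` and `version` attributes) and the function name are assumptions made for illustration, not details from the presentation.

```python
def conditional_move_params(table, game_id, new_state, expected_version):
    """Build keyword arguments for a conditional UpdateItem request."""
    return {
        "TableName": table,
        "Key": {"game_id": {"S": game_id}},
        # Placeholders (#s, #v) avoid clashes with DynamoDB reserved words.
        "UpdateExpression": "SET #s = :state, #v = :next",
        # The write is applied only if nobody else bumped the version.
        "ConditionExpression": "#v = :expected",
        "ExpressionAttributeNames": {"#s": "state", "#v": "version"},
        "ExpressionAttributeValues": {
            ":state": {"S": new_state},
            # Numbers are sent as strings in DynamoDB JSON.
            ":expected": {"N": str(expected_version)},
            ":next": {"N": str(expected_version + 1)},
        },
    }


params = conditional_move_params("Games", "g-42", "PLAYER_TWO_TURN", 7)
```

With boto3 installed and credentials configured, these parameters could be passed as `client.update_item(**params)`. If another player has already updated the item, the condition fails and the service rejects the write, so the caller can re-read the state and retry.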
This document summarizes a presentation on building applications with DynamoDB. The presentation covers:
- Getting started with DynamoDB by making two decisions - choosing a primary key and provisioning throughput - and making one API call to create a table.
- Data modeling concepts in DynamoDB including tables, items, attributes, primary keys, and queries. Common patterns like modeling relationships and handling large items are also discussed.
- Programming the DynamoDB API and available operations like PutItem, GetItem, Query and Scan. Conditional updates, batch operations, and pagination of results are also covered.
- Real-world data modeling examples including storing scores and leaderboards for an online game and creating
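The "two decisions, one API call" idea from the first bullet can be sketched as follows. The helper builds the parameters for a `CreateTable` request from a primary key choice and a provisioned throughput choice. The table name, attribute names, and capacity numbers are placeholders chosen for this example, not values from the presentation.

```python
def create_table_params(table, hash_key, range_key=None,
                        read_units=5, write_units=5):
    """Build CreateTable arguments from a key choice and a throughput choice.

    hash_key and range_key are (attribute_name, attribute_type) pairs,
    where the type is "S" (string), "N" (number) or "B" (binary).
    """
    attributes = [{"AttributeName": hash_key[0], "AttributeType": hash_key[1]}]
    key_schema = [{"AttributeName": hash_key[0], "KeyType": "HASH"}]
    if range_key is not None:
        attributes.append(
            {"AttributeName": range_key[0], "AttributeType": range_key[1]})
        key_schema.append({"AttributeName": range_key[0], "KeyType": "RANGE"})
    return {
        "TableName": table,
        "AttributeDefinitions": attributes,
        "KeySchema": key_schema,
        "ProvisionedThroughput": {
            "ReadCapacityUnits": read_units,
            "WriteCapacityUnits": write_units,
        },
    }


# Decision 1: primary key (game + score). Decision 2: throughput.
params = create_table_params(
    "GameScores", ("game_id", "S"), ("score", "N"),
    read_units=10, write_units=5)
# With boto3 available: boto3.client("dynamodb").create_table(**params)
```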
A new generation of sophisticated geospatial mobile apps is being developed; these apps are serverless and can scale to virtually unlimited users without any infrastructure or servers to manage. This session will take a practical approach to developing lean and cost-effective real-world location-based mobile apps through live demonstrations and code walkthroughs. It will showcase how cloud services can be used to authenticate users, store and synchronize data, understand behavior, react upon location and state changes, test apps and send notifications to nearby app users.
Olivier Klein, Solutions Architect, Amazon Web Services, Greater China
The document discusses event driven programming using Visual Studio and VB.NET. It describes key aspects of event driven programming including event loops, GUI design using forms and controls, trigger functions, and event handlers. It provides examples of how to use these tools and techniques in Visual Studio and VB.NET, demonstrating the development process with code snippets and screenshots.
Event driven programming is a programming paradigm where the flow of the program is determined by events such as user actions, sensor outputs, or messages from other programs or devices. The program waits for these external events to occur and then triggers a response by executing appropriate event handler functions. This contrasts with procedural programming which follows a linear sequence of instructions. Event driven programming is commonly used for graphical user interface programs and operating systems, but can also be applied to non-GUI programs like burglar alarms or process control systems that need to respond to external stimuli.
This document provides an overview of NoSQL databases and then discusses Amazon DynamoDB in more depth. It explains that NoSQL databases are an alternative to relational databases for certain data-intensive applications. It then discusses DynamoDB specifically, highlighting that it is a fully managed NoSQL database that provides fast and predictable performance, flexible data model, automatic scaling, and pay per request pricing. The document also provides examples of applications that were built on DynamoDB as part of a challenge.
Event-driven programming is a programming paradigm where the flow of the program is determined by events such as user input. In event-driven programming, an event trigger causes an event handler method to execute. Common event triggers include user interactions like mouse clicks or key presses. This paradigm is well-suited for graphical user interfaces that allow users to interact with and control the flow of the program.
Event driven programming is commonly used for GUI applications where events like button clicks or text changes trigger event handler functions. Key aspects include event handlers that contain code to execute in response to events, trigger functions that determine which handler to run, and event loops that constantly check for events. This approach provides flexibility by allowing programmers to control where code runs and what specific user actions it will respond to.
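The three pieces named above (event handlers, a lookup that decides which handler runs, and an event loop) can be illustrated with a small sketch. This is a toy model, not how any particular GUI toolkit implements it: the class name `EventLoop`, the event names, and the handlers are invented for the example, and the loop simply drains a queue instead of waiting indefinitely for input.

```python
from collections import deque


class EventLoop:
    """Toy event loop: queue events, then dispatch them to handlers."""

    def __init__(self):
        self.handlers = {}    # event type -> list of handler functions
        self.queue = deque()  # pending (event type, payload) pairs

    def on(self, event_type, handler):
        """Register a handler for one kind of event."""
        self.handlers.setdefault(event_type, []).append(handler)

    def post(self, event_type, payload=None):
        """Record that an event happened; it is handled later by run()."""
        self.queue.append((event_type, payload))

    def run(self):
        """Process queued events in arrival order until none are left."""
        while self.queue:
            event_type, payload = self.queue.popleft()
            # The dictionary lookup plays the role of the trigger function:
            # it decides which handlers respond to this event.
            for handler in self.handlers.get(event_type, []):
                handler(payload)


log = []
loop = EventLoop()
loop.on("click", lambda target: log.append(f"clicked {target}"))
loop.on("text_changed", lambda text: log.append(f"text is now {text!r}"))

loop.post("click", "OK button")
loop.post("text_changed", "hello")
loop.post("scroll", 3)  # no handler registered, so this event is ignored
loop.run()
```

After `run()` returns, `log` holds one entry for each handled event, in the order the events were posted.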
Event driven programming uses graphical user interface elements like forms, controls, and events to build interactive programs. Key aspects include forms that contain controls for user interaction, trigger functions that specify code for different control events, and an event loop that detects user interactions and runs the corresponding event handler code. Event driven programming is well-suited for GUIs and allows programmers to gradually build up programs by adding and coding individual controls. However, complex programs with many forms can be confusing for users and errors may be harder to detect than in simpler programs.
A talk about lessons learned from building developer and sysadmin-facing tools at Puppet and Docker. It’s applicable to open source tools for broader use and the tools you and your teams develop inside your organization. Building tools, both external and internal, is hard. Getting people to adopt those tools is even harder. These are ideas to make it easier.
Introducing AWS IoT - Interfacing with the Physical World - Technical 101 - Amazon Web Services
This document provides an overview of AWS IoT, a service that allows devices to securely connect and interact with cloud applications and other devices. It discusses how AWS IoT provides a complete platform for connected devices with SDKs, authentication/authorization, a rules engine, device shadows and registry. It also highlights how AWS IoT supports MQTT and HTTP protocols, allows devices to securely connect and exchange messages, and integrates with other AWS services and third-party services. The document concludes with information on getting started with AWS IoT device SDKs.
In this session you'll learn about the decisions that went into designing and building DynamoDB, and how it allows you to stay focused on your application while enjoying single digit latencies at any scale. We'll dive deep on how to model data, maintain maximum throughput, and drive analytics against your data, while profiling real world use cases, tips and tricks from customers running on DynamoDB today.
Understanding how memory is managed with MongoDB is instrumental in maximizing database performance and hardware utilisation. This talk covers the workings of low level operating system components like the page cache and memory mapped files. We will examine the differences between RAM, SSD and hard disk drives to help you choose the right hardware configuration. Finally, we will learn how to monitor and analyze memory and disk usage using the MongoDB Management Service, Linux administration commands and MongoDB commands.
This document provides an overview and summary of a presentation on Amazon DynamoDB. The presentation will cover DynamoDB tables, APIs, data types, indexes, scaling, data modeling, scenarios and best practices. It will also discuss using DynamoDB Streams to enable cross-region replication and integration with other AWS services like S3, CloudSearch, ElastiCache and Redshift. The goal is to teach design patterns and best practices for building highly scalable applications with DynamoDB.
(SDD407) Amazon DynamoDB: Data Modeling and Scaling Best Practices | AWS re:I... - Amazon Web Services
Amazon DynamoDB is a fully managed, highly scalable distributed database service. In this technical talk, we show you how to use DynamoDB to build high-scale applications like social gaming, chat, and voting. We show you how to use building blocks such as secondary indexes, conditional writes, consistent reads, and batch operations to build the higher-level functionality such as multi-item atomic writes and join queries. We also discuss best practices such as index projections, item sharding, and parallel scan for maximum scalability.
AWS re:Invent 2016: How Toyota Racing Development Makes Racing Decisions in R... - Amazon Web Services
Toyota Racing Development (TRD) developed a robust and highly performant real-time data analysis tool for professional racing. In this talk, learn how we structured a reliable, maintainable, decoupled architecture built around Amazon DynamoDB as both a streaming mechanism and a long-term persistent data store. In racing, milliseconds matter and even moments of downtime can cost a race. You'll see how we used DynamoDB together with Amazon Kinesis and Kinesis Firehose to build a real-time streaming data analysis tool for competitive racing.
Building a Real Time Dashboard with Amazon Kinesis, Amazon Lambda and Amazon ... - Amazon Web Services
Organisations today need a way to manage the ever-increasing volume of data from numerous sources such as log systems, click streams or connected devices and be able to analyse this data in real-time. In this session we will walk through an architecture demonstration of how to leverage AWS services to meet these needs.
Speaker: Ganesh Raja, Solutions Architect, Amazon Web Services
In this session, we explore Amazon DynamoDB capabilities and benefits in detail and discuss how to get the most out of your DynamoDB database. We go over schema design best practices with DynamoDB across multiple use cases, including gaming, AdTech, IoT, and others. We also explore designing efficient indexes, scanning, and querying, and go into detail on a number of recently released features, including JSON document support, Streams, and more.
Amazon DynamoDB is a fully managed NoSQL database service for applications that need consistent, single-digit millisecond latency at any scale. This talk explores DynamoDB capabilities and benefits in detail and discusses how to get the most out of your DynamoDB database. We go over schema design best practices with DynamoDB across multiple use cases, including gaming, AdTech, IoT, and others. We also explore designing efficient indexes, scanning, and querying, and go into detail on a number of recently released features, including JSON document support, Streams, and more.
This document provides an overview and summary of key aspects of DynamoDB, including:
1. DynamoDB is a fully managed NoSQL database that scales to any workload and provides fast and consistent performance.
2. DynamoDB uses a table structure with partition and sort keys to organize and access data, and scales both read and write throughput independently across partitions.
3. Common challenges include hot keys/partitions that can cause throttling, and designing schemas and partitions to spread access uniformly across the keyspace.
4. NoSQL data modeling focuses on aggregations rather than relations, using patterns like hierarchical data structures and parent-child relationships to model one-to-many and many-to-
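One widely used way to address the hot-key problem from item 3 is write sharding: a calculated suffix is appended to a busy partition key so that writes spread over several partition key values, and readers query every suffix and merge the results. The sketch below shows the key calculation only. The function names, the `#` separator, and the default of 10 shards are assumptions for illustration.

```python
import hashlib


def sharded_key(base_key, item_id, shards=10):
    """Return a partition key with a deterministic shard suffix.

    The suffix is derived from item_id, so the same item always lands
    on the same shard and can be read back without scanning all shards.
    """
    digest = hashlib.sha256(item_id.encode("utf-8")).digest()
    suffix = int.from_bytes(digest[:4], "big") % shards
    return f"{base_key}#{suffix}"


def all_shard_keys(base_key, shards=10):
    """List every partition key a reader must query for one base key."""
    return [f"{base_key}#{n}" for n in range(shards)]


# Writes for one busy day spread across the shard keys.
keys_used = {sharded_key("2016-07-01", f"order-{n}") for n in range(1000)}
```

The trade-off is that reading everything for one base key now takes one query per shard, so the shard count is usually kept as small as the write load allows.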
Amazon DynamoDB is a fully managed, highly scalable distributed database service. In this technical talk, we show you how to use Amazon DynamoDB to build high-scale applications like social gaming, chat, and voting. We show you how to use building blocks such as secondary indexes, conditional writes, consistent reads, and batch operations to build the higher-level functionality such as multi-item atomic writes and join queries. We also discuss best practices such as index projections, item sharding, and parallel scan for maximum scalability.
Speakers:
Philip Fitzsimons, AWS Solutions Architect
Richard Freeman, PhD, Senior Data Scientist/Architect, JustGiving
This document provides tips and best practices for using Amazon DynamoDB. It discusses using indexes like local secondary indexes (LSI) and global secondary indexes (GSI) as well as scaling DynamoDB. It also covers data modeling patterns for different types of relationships and using DynamoDB for scenarios like storing time series data and building catalogs. The document provides two case studies for using DynamoDB with other AWS services like S3, Lambda, Elasticsearch and EMR/Hive to enable big data analytics.
SQL to NoSQL Best Practices with Amazon DynamoDB - AWS July 2016 Webinar Se... - Amazon Web Services
Applications have traditionally stored data in a relational database management system (RDBMS) and have used Structured Query Language (SQL) to retrieve and update that data. The growth of “internet scale” apps, such as e-commerce, social media, and mobile apps, and the rise of big data have increased data throughput demands beyond the range of traditional relational databases. Non-relational (NoSQL) databases enable your application to scale more cost effectively, even for extraordinarily high demand. Amazon DynamoDB is a fully managed NoSQL database service that lets you focus on your app so you don’t have to worry about hardware acquisition or database management and lets you scale down your costs for off-peak periods. In this webinar, we’ll describe common database tasks, then compare and contrast SQL with equivalent DynamoDB operations.
Learning Objectives:
• Why consider the switch from SQL to NoSQL?
• Benefits of Amazon’s NoSQL database service
• Common SQL database operations and their DynamoDB equivalents
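To make the last objective concrete, here is a hedged sketch of how a simple SQL lookup might translate into parameters for the DynamoDB `Query` operation. It assumes a hypothetical `Music` table whose partition key is a string attribute named `Artist`. The helper name and table are invented for this example and do not come from the webinar.

```python
def query_params(table, key_name, key_value, columns):
    """Build Query arguments roughly equivalent to:

        SELECT <columns> FROM <table> WHERE <key_name> = '<key_value>'

    Assumes key_name is the table's partition key and holds a string.
    """
    # Give every attribute exactly one "#aN" placeholder so that names
    # colliding with DynamoDB reserved words are still accepted.
    placeholders = {}
    for attr in [key_name, *columns]:
        placeholders.setdefault(attr, f"#a{len(placeholders)}")
    return {
        "TableName": table,
        "KeyConditionExpression": f"{placeholders[key_name]} = :pk",
        "ProjectionExpression": ", ".join(placeholders[c] for c in columns),
        "ExpressionAttributeNames": {
            ph: attr for attr, ph in placeholders.items()},
        "ExpressionAttributeValues": {":pk": {"S": key_value}},
    }


# SELECT Artist, SongTitle FROM Music WHERE Artist = 'No One You Know'
params = query_params(
    "Music", "Artist", "No One You Know", ["Artist", "SongTitle"])
```

Unlike SQL, the `WHERE` clause here must name the partition key. Filtering on arbitrary columns would need a secondary index or a scan.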
This session will begin with an introduction to non-relational (NoSQL) databases and compare them with relational (SQL) databases. We will also explain the fundamentals of Amazon DynamoDB, a fully managed NoSQL database service. Learn the fundamentals of DynamoDB and see the new DynamoDB console first-hand as we discuss common use cases and benefits of this high-performance key-value and JSON document store.
This document summarizes tips and best practices for using DynamoDB. It discusses using local secondary indexes (LSI) and global secondary indexes (GSI) to query data. It covers scaling DynamoDB tables by partitioning and provisioning throughput. Common data modeling patterns like one-to-one, one-to-many, and many-to-many relationships are presented. Best practices for time series data, caching frequently accessed items, and optimizing queries are provided. Examples of using DynamoDB for game analytics and metadata storage in S3 are also included.
Data warehousing is a critical component for analysing and extracting actionable insights from your data. Amazon Redshift allows you to deploy a scalable data warehouse in a matter of minutes and start analysing your data right away using your existing business intelligence tools.
Explore Amazon DynamoDB capabilities and benefits in detail and learn how to get the most out of your DynamoDB database. We go over best practices for schema design with DynamoDB across multiple use cases, including gaming, IoT, and others. We explore designing efficient indexes, scanning, and querying, and go into detail on a number of recently released features, including DynamoDB Accelerator (DAX), DynamoDB Time-to-Live, and more. We also provide lessons learned from operating DynamoDB at scale, including provisioning DynamoDB for IoT.
AWS re:Invent 2016: How DataXu scaled its Attribution System to handle billio... - Amazon Web Services
"Attribution" is the marketing term of art for allocating full or partial credit to individual advertisements that eventually lead to a purchase, sign up, download, or other desired consumer interaction. We'll share how we use DynamoDB at the core of our attribution system to store terabytes of advertising history data. The system is cost effective and dynamically scales from 0 to 300K requests per second on demand with predictable performance and low operational overhead.
Following the Amazon DynamoDB Deep Dive session, this workshop is a design session (no computer needed) in which we will work through several real world DynamoDB use cases. For each one, we will go over the requirements, propose and analyze possible solutions and their pros and cons, with an eye for performance efficiency, scalability, and cost optimization.
This document discusses best practices for using DynamoDB for game data, including tips for indexing, scaling, data modeling, and real-world use cases. It provides examples of using local secondary indexes (LSI) and global secondary indexes (GSI) to query game data efficiently. It also recommends modeling time series data with separate tables per time period. The document concludes with an overview of how Nexon Korea uses DynamoDB for mobile game databases.
Amazon DynamoDB is a fully managed, highly scalable distributed database service. In this technical talk, we will deep dive on how to:
- Use DynamoDB to build high-scale applications like social gaming, chat, and voting.
- Model these applications using DynamoDB, including how to use building blocks such as conditional writes, consistent reads, and batch operations to build higher-level functionality such as multi-item atomic writes and join queries.
- Incorporate best practices such as index projections, item sharding, and parallel scan for maximum scalability.
The document provides an overview of Amazon DynamoDB, including its key capabilities like auto scaling, on-demand throughput capacity, and integration with other AWS services; it also discusses DynamoDB fundamentals like data modeling techniques, partitioning strategies to scale workload, and using secondary indexes to enable richer queries. Use cases that benefit from DynamoDB include applications that require massive scale, predictable low latency, or flexible schemas to support unstructured or semi-structured data like IoT, gaming metadata, social feeds, and e-commerce cart data.
Similar to AWS December 2015 Webinar Series - Design Patterns using Amazon DynamoDB (20)
How to build forecasting services using ML and deep learn... Amazon Web Services
Forecasting is an important process for many companies and is used in various areas to try to accurately predict the growth and distribution of a product, the use of the resources needed on production lines, financial presentations, and much more. Amazon uses advanced forecasting techniques, and some of these services have been made available to all AWS customers.
In this session we will show how to pre-process data that contains a time component and then use an algorithm that, based on the type of data analyzed, produces an accurate forecast.
Big Data for Startups: how to build Big Data applications in Server... Amazon Web Services
The variety and volume of data created every day is growing ever faster and represents a unique opportunity to innovate and create new startups.
However, managing large amounts of data can seem complex: building large-scale Big Data clusters looks like an investment accessible only to established companies. But the elasticity of the Cloud and, in particular, Serverless services let us break through these limits.
Let's see how to develop Big Data applications quickly, without worrying about infrastructure, while dedicating all our resources to developing our ideas and creating innovative products.
You can now use Amazon Elastic Kubernetes Service (EKS) to run Kubernetes pods on AWS Fargate, the serverless compute engine built for containers on AWS. This makes it easier than ever to build and run your Kubernetes applications in the AWS cloud. In this session we will present the main features of the service and how to deploy your application in a few steps.
Twenty years ago Amazon went through a radical transformation aimed at increasing the pace of innovation. Over this period we learned how changing our approach to application development allowed us to significantly increase agility and release velocity and, ultimately, to build more reliable and scalable applications. In this session we will show how we define modern applications and how building modern apps affects not only the application architecture, but also the organizational structure, the development release pipelines, and even the operating model. We will also describe common approaches to modernization, including the approach used by Amazon.com itself.
How to spend up to 90% less with containers and Spot Instances Amazon Web Services
The use of containers is continuously growing.
If designed correctly, container-based applications are very often stateless and flexible.
AWS ECS, EKS, and Kubernetes on EC2 can take advantage of Spot Instances, leading to average savings of 70% compared with On-Demand Instances. In this session we will discover together what the characteristics of Spot Instances are and how they can easily be used on AWS. We will also learn how Spreaker uses Spot Instances to run different kinds of applications, in production, at a fraction of the on-demand cost!
In recent months, many customers have been asking us the question – how to monetise Open APIs, simplify Fintech integrations and accelerate adoption of various Open Banking business models. Therefore, AWS and FinConecta would like to invite you to Open Finance marketplace presentation on October 20th.
Event Agenda :
Open banking so far (short recap)
• PSD2, OB UK, OB Australia, OB LATAM, OB Israel
Intro to Open Finance marketplace
• Scope
• Features
• Tech overview and Demo
The role of the Cloud
The Future of APIs
• Complying with regulation
• Monetizing data / APIs
• Business models
• Time to market
One platform for all: a Strategic approach
Q&A
Make your startup's offering unique on the market with Machine Lea... Amazon Web Services
To create value and build a distinctive, recognizable offering, successful startups know how to combine established technologies with innovative, purpose-built components.
AWS provides ready-to-use services and, at the same time, lets you customize and create the differentiating elements of your offering.
Focusing on Machine Learning technologies, we will see how to select the artificial intelligence services offered by AWS and, with the help of a demo, how to build custom Machine Learning models using SageMaker Studio.
OpsWorks Configuration Management: automate the management and deployment of... Amazon Web Services
With the traditional approach to IT, it was difficult for many years to implement DevOps techniques, which until now have often involved manual activities that occasionally led to application downtime and interrupted users' work. With the advent of the cloud, DevOps techniques are now within everyone's reach at low cost for any kind of workload, ensuring greater system reliability and resulting in significant improvements in business continuity.
AWS offers AWS OpsWorks as a Configuration Management tool that aims to automate and simplify the management and deployment of EC2 instances by means of Chef and Puppet workloads.
Find out how to use AWS OpsWorks to ensure the reliability of your application installed on EC2 instances.
Microsoft Active Directory on AWS to support your Windows Workloads Amazon Web Services
Do you want to know the options for running Microsoft Active Directory on AWS? When moving Microsoft workloads to AWS, it is important to consider how to deploy Microsoft Active Directory to support group policy management, authentication, and authorization. In this session, we will discuss the options for deploying Microsoft Active Directory on AWS, including AWS Directory Service for Microsoft Active Directory and deploying Active Directory on Windows on Amazon Elastic Compute Cloud (Amazon EC2). We cover topics such as integrating your on-premises Microsoft Active Directory environment with the cloud and using SaaS applications, such as Office 365, with AWS Single Sign-On.
From facial recognition to the detection of fraud or manufacturing defects, image and video analysis based on artificial intelligence techniques is evolving and being refined at a rapid pace. In this webinar we will explore the possibilities offered by AWS services for applying state-of-the-art computer vision techniques to real-world scenarios.
Amazon Web Services and VMware are hosting a free virtual event next Wednesday, October 14, from 12:00 to 13:00, dedicated to VMware Cloud ™ on AWS, the on-demand service that lets you run applications in cloud environments based on VMware vSphere® and access a wide range of AWS services, making full use of the AWS cloud while protecting existing VMware investments.
Build your first serverless ledger-based app with QLDB and NodeJS Amazon Web Services
Many companies today build applications with ledger-type functionality, for example to verify the history of credits or debits in banking transactions, or to track the supply chain flow of their products.
At the heart of these solutions are ledger databases, which provide a transparent, immutable, and cryptographically verifiable transaction log, but they are complex and costly tools to manage.
Amazon QLDB eliminates the need to build complex custom systems by providing a fully managed serverless ledger database.
In this session we will discover how to build a complete serverless application that uses QLDB's features.
With the rise of microservice architectures and rich mobile and web applications, APIs are more important than ever for delivering an exceptional user experience to end users. In this session we will learn how to tackle modern API design challenges with GraphQL, an open source API query language used by Facebook, Amazon, and others, and how to use AWS AppSync, a managed serverless GraphQL service on AWS. We will dig into several scenarios to understand how AppSync can help address these use cases by creating modern APIs with real-time and offline data update capabilities.
We will also learn how Sky Italia uses AWS AppSync to deliver real-time sports updates to the users of its web portal.
Oracle databases and VMware Cloud™ on AWS: debunking the myths Amazon Web Services
Many organizations take advantage of the cloud by migrating their Oracle workloads, securing significant gains in agility and cost efficiency.
Migrating these workloads can create complexity when modernizing and refactoring applications, and on top of this there can be performance risks introduced when applications are moved out of on-premises data centers.
In these slides, AWS and VMware experts present simple, practical measures to ease and simplify the migration of Oracle workloads and accelerate the move to the cloud; they also go deeper into the architecture and demonstrate how to make full use of VMware Cloud ™ on AWS.
1) The document discusses building a minimum viable product (MVP) using Amazon Web Services (AWS).
2) It provides an example of an MVP for an omni-channel messenger platform that was built from 2017 to connect ecommerce stores to customers via web chat, Facebook Messenger, WhatsApp, and other channels.
3) The founder discusses how they started with an MVP in 2017 with 200 ecommerce stores in Hong Kong and Taiwan, and have since expanded to over 5000 clients across Southeast Asia using AWS for scaling.
This document discusses pitch decks and fundraising materials. It explains that venture capitalists will typically spend only 3 minutes and 44 seconds reviewing a pitch deck. Therefore, the deck needs to tell a compelling story to grab their attention. It also provides tips on tailoring different types of decks for different purposes, such as creating a concise 1-2 page teaser, a presentation deck for pitching in-person, and a more detailed read-only or fundraising deck. The document stresses the importance of including key information like the problem, solution, product, traction, market size, plans, team, and ask.
This document discusses building serverless web applications using AWS services like API Gateway, Lambda, DynamoDB, S3 and Amplify. It provides an overview of each service and how they can work together to create a scalable, secure and cost-effective serverless application stack without having to manage servers or infrastructure. Key services covered include API Gateway for hosting APIs, Lambda for backend logic, DynamoDB for database needs, S3 for static content, and Amplify for frontend hosting and continuous deployment.
This document provides tips for fundraising from startup founders Roland Yau and Sze Lok Chan. It discusses generating competition to create urgency for investors, fundraising in parallel rather than sequentially, having a clear fundraising narrative focused on what you do and why it's compelling, and prioritizing relationships with people over firms. It also notes how the pandemic has changed fundraising, with examples of deals done virtually during this time. The tips emphasize being fully prepared before fundraising and cultivating connections with investors in advance.
AWS_HK_StartupDay_Building Interactive websites while automating for efficien... Amazon Web Services
This document discusses Amazon's machine learning services for building conversational interfaces and extracting insights from unstructured text and audio. It describes Amazon Lex for creating chatbots, Amazon Comprehend for natural language processing tasks like entity extraction and sentiment analysis, and how they can be used together for applications like intelligent call centers and content analysis. Pre-trained APIs simplify adding machine learning to apps without requiring ML expertise.
Amazon Elastic Container Service (Amazon ECS) is a highly scalable container management service that simplifies the management of Docker containers through an orchestration layer for controlling deployment and the related lifecycle. In this session we will present the main features of the service, the reference architectures for different workloads, and the simple steps needed to quickly migrate one or more of your containers.
DefCamp_2016_Chemerkin_Yury-publish.pdf - Presentation by Yury Chemerkin at DefCamp 2016 discussing mobile app vulnerabilities, data protection issues, and analysis of security levels across different types of mobile applications.
Demystifying Neural Networks And Building Cybersecurity Applications Priyanka Aash
In today's rapidly evolving technological landscape, Artificial Neural Networks (ANNs) have emerged as a cornerstone of artificial intelligence, revolutionizing various fields including cybersecurity. Inspired by the intricacies of the human brain, ANNs have a rich history and a complex structure that enables them to learn and make decisions. This blog aims to unravel the mysteries of neural networks, explore their mathematical foundations, and demonstrate their practical applications, particularly in building robust malware detection systems using Convolutional Neural Networks (CNNs).
Garbage In, Garbage Out: Why poor data curation is killing your AI models (an... Zilliz
Enterprises have traditionally prioritized data quantity, assuming more is better for AI performance. However, a new reality is setting in: high-quality data, not just volume, is the key. This shift exposes a critical gap – many organizations struggle to understand their existing data and lack effective curation strategies and tools. This talk dives into these data challenges and explores the methods of automating data curation.
Generative AI technology is a fascinating field that focuses on creating comp... Nohoax Kanont
Generative AI technology is a fascinating field that focuses on creating computer models capable of generating new, original content. It leverages the power of large language models, neural networks, and machine learning to produce content that can mimic human creativity. This technology has seen a surge in innovation and adoption since the introduction of ChatGPT in 2022, leading to significant productivity benefits across various industries. With its ability to generate text, images, video, and audio, generative AI is transforming how we interact with technology and the types of tasks that can be automated.
Finetuning GenAI For Hacking and Defending Priyanka Aash
Generative AI, particularly through the lens of large language models (LLMs), represents a transformative leap in artificial intelligence. With advancements that have fundamentally altered our approach to AI, understanding and leveraging these technologies is crucial for innovators and practitioners alike. This comprehensive exploration delves into the intricacies of GenAI, from its foundational principles and historical evolution to its practical applications in security and beyond.
TrustArc Webinar - Innovating with TRUSTe Responsible AI Certification TrustArc
In a landmark year marked by significant AI advancements, it’s vital to prioritize transparency, accountability, and respect for privacy rights with your AI innovation.
Learn how to navigate the shifting AI landscape with our innovative solution TRUSTe Responsible AI Certification, the first AI certification designed for data protection and privacy. Crafted by a team with 10,000+ privacy certifications issued, this framework integrated industry standards and laws for responsible AI governance.
This webinar will review:
- How compliance can play a role in the development and deployment of AI systems
- How to model trust and transparency across products and services
- How to save time and work smarter in understanding regulatory obligations, including AI
- How to operationalize and deploy AI governance best practices in your organization
UiPath Community Day Amsterdam: Code, Collaborate, Connect UiPathCommunity
Welcome to our third live UiPath Community Day Amsterdam! Come join us for a half-day of networking and UiPath Platform deep-dives, for devs and non-devs alike, in the middle of summer ☀.
📕 Agenda:
12:30 Welcome Coffee/Light Lunch ☕
13:00 Event opening speech
Ebert Knol, Managing Partner, Tacstone Technology
Jonathan Smith, UiPath MVP, RPA Lead, Ciphix
Cristina Vidu, Senior Marketing Manager, UiPath Community EMEA
Dion Mes, Principal Sales Engineer, UiPath
13:15 ASML: RPA as Tactical Automation
Tactical robotic process automation for solving short-term challenges, while establishing standard and re-usable interfaces that fit IT's long-term goals and objectives.
Yannic Suurmeijer, System Architect, ASML
13:30 PostNL: an insight into RPA at PostNL
Showcasing the solutions our automations have provided, the challenges we’ve faced, and the best practices we’ve developed to support our logistics operations.
Leonard Renne, RPA Developer, PostNL
13:45 Break (30')
14:15 Breakout Sessions: Round 1
Modern Document Understanding in the cloud platform: AI-driven UiPath Document Understanding
Mike Bos, Senior Automation Developer, Tacstone Technology
Process Orchestration: scale up and have your Robots work in harmony
Jon Smith, UiPath MVP, RPA Lead, Ciphix
UiPath Integration Service: connect applications, leverage prebuilt connectors, and set up customer connectors
Johans Brink, CTO, MvR digital workforce
15:00 Breakout Sessions: Round 2
Automation, and GenAI: practical use cases for value generation
Thomas Janssen, UiPath MVP, Senior Automation Developer, Automation Heroes
Human in the Loop/Action Center
Dion Mes, Principal Sales Engineer @UiPath
Improving development with coded workflows
Idris Janszen, Technical Consultant, Ilionx
15:45 End remarks
16:00 Community fun games, sharing knowledge, drinks, and bites 🍻
Self-Healing Test Automation Framework - Healenium Knoldus Inc.
Revolutionize your test automation with Healenium's self-healing framework. Automate test maintenance, reduce flakes, and increase efficiency. Learn how to build a robust test automation foundation. Discover the power of self-healing tests. Transform your testing experience.
"Building Future-Ready Apps with .NET 8 and Azure Serverless Ecosystem", Stan...Fwdays
.NET 8 brought a lot of improvements for developers and maturity to the Azure serverless container ecosystem. So, this talk will cover these changes and explain how you can apply them to your projects. Another reason for this talk is the re-invention of Serverless from a DevOps perspective as a Platform Engineering trend with Backstage and the recent Radius project from Microsoft. So now is the perfect time to look at developer productivity tooling and serverless apps from Microsoft's perspective.
Keynote: AI & Future Of Offensive Security Priyanka Aash
In the presentation, the focus is on the transformative impact of artificial intelligence (AI) in cybersecurity, particularly in the context of malware generation and adversarial attacks. AI promises to revolutionize the field by enabling scalable solutions to historically challenging problems such as continuous threat simulation, autonomous attack path generation, and the creation of sophisticated attack payloads. The discussions underscore how AI-powered tools like AI-based penetration testing can outpace traditional methods, enhancing security posture by efficiently identifying and mitigating vulnerabilities across complex attack surfaces. The use of AI in red teaming further amplifies these capabilities, allowing organizations to validate security controls effectively against diverse adversarial scenarios. These advancements not only streamline testing processes but also bolster defense strategies, ensuring readiness against evolving cyber threats.
Choosing the Best Outlook OST to PST Converter: Key Features and Considerations webbyacad software
When looking for a good software utility to convert Outlook OST files to PST format, it is important to find one that is easy to use and has useful features. WebbyAcad OST to PST Converter Tool is a great choice because it is simple to use for anyone, whether you are tech-savvy or not. It can smoothly change your files to PST while keeping all your data safe and secure. Plus, it can handle large amounts of data and convert multiple files at once, which can save you a lot of time. It even comes with 24*7 technical support assistance and a free trial, so you can try it out before making a decision. Whether you need to recover, move, or back up your data, Webbyacad OST to PST Converter is a reliable option that gives you all the support you need to manage your Outlook data effectively.
2. What to expect from the session
• Brief history of data processing
• DynamoDB Internals
• Tables, API, data types, indexes
• Scaling and data modeling
• Design patterns and best practices
• Event driven applications and DDB Streams
4. Data Volume Since 2010
• 90% of stored data generated in last 2 years
• 1 Terabyte of data in 2010 equals 6.5 Petabytes today
• Linear correlation between data pressure and technical innovation
• No reason these trends will not continue over time
9. Partition Keys
Partition Key uniquely identifies an item
Partition Key is used for building an unordered hash index
Allows table to be partitioned for scale
Key space: 00 to FF, split into the ranges 00-54, 55-A9, and AA-FF
Id = 1, Name = Jim: Hash (1) = 7B
Id = 2, Name = Andy, Dept = Eng: Hash (2) = 48
Id = 3, Name = Kim, Dept = Ops: Hash (3) = CD
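The mapping from a partition key to a slice of the key space can be sketched in a few lines. This is only an illustration: the one-byte key space and the three ranges are read off the slide, and MD5 is an assumed stand-in, since the hash function DynamoDB uses internally is not stated here.

```python
import hashlib

# Key-space ranges as read from the slide: three partitions covering 00..FF.
PARTITION_RANGES = [(0x00, 0x54), (0x55, 0xA9), (0xAA, 0xFF)]


def key_hash(partition_key: str) -> int:
    """Map a partition key to one byte (0x00..0xFF).

    MD5 is an illustrative assumption, not DynamoDB's real hash function.
    """
    return hashlib.md5(partition_key.encode("utf-8")).digest()[0]


def partition_for(hash_value: int) -> int:
    """Return the 1-based partition whose range contains hash_value."""
    for number, (low, high) in enumerate(PARTITION_RANGES, start=1):
        if low <= hash_value <= high:
            return number
    raise ValueError(f"hash {hash_value:#x} is outside the key space")


# The hash values shown on the slide (7B, 48, CD) land in partitions 2, 1, 3.
print([partition_for(h) for h in (0x7B, 0x48, 0xCD)])
# Any key can be placed the same way once it has been hashed.
print(partition_for(key_hash("Id=1")))
```

Because placement depends only on the hash, items with unrelated keys spread across partitions, which is what lets the table scale out.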
10. Partition:Sort Key
Partition:Sort Key uses two attributes together to uniquely identify an Item
Within unordered hash index, data is arranged by the sort key
No limit on the number of items (∞) per partition key
• Except if you have local secondary indexes
Key space: 00:0 to FF:∞, split at 54:∞ / 55 and A9:∞ / AA
Partition 1, Hash (2) = 48:
Customer# = 2, Order# = 10, Item = Pen
Customer# = 2, Order# = 11, Item = Shoes
Partition 2, Hash (1) = 7B:
Customer# = 1, Order# = 10, Item = Toy
Customer# = 1, Order# = 11, Item = Boots
Partition 3, Hash (3) = CD:
Customer# = 3, Order# = 10, Item = Book
Customer# = 3, Order# = 11, Item = Paper
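A small in-memory model may help show why a partition key plus a sort key supports range queries. The `SortedTable` class below is hypothetical and only mimics the idea (items grouped by partition key, kept ordered by sort key); it is not how DynamoDB is implemented.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict


class SortedTable:
    """Toy table: items are grouped by partition key and ordered by sort key."""

    def __init__(self):
        self._sort_keys = defaultdict(list)  # partition key -> sorted sort keys
        self._items = {}                     # (partition key, sort key) -> attributes

    def put(self, partition_key, sort_key, attributes):
        # The pair (partition key, sort key) identifies exactly one item.
        if (partition_key, sort_key) not in self._items:
            keys = self._sort_keys[partition_key]
            keys.insert(bisect_left(keys, sort_key), sort_key)
        self._items[(partition_key, sort_key)] = attributes

    def query(self, partition_key, low, high):
        """Return items of one partition key with low <= sort key <= high."""
        keys = self._sort_keys.get(partition_key, [])
        start, end = bisect_left(keys, low), bisect_right(keys, high)
        return [(k, self._items[(partition_key, k)]) for k in keys[start:end]]


orders = SortedTable()
orders.put(2, 11, "Shoes")
orders.put(2, 10, "Pen")
orders.put(1, 10, "Toy")
print(orders.query(2, 10, 11))  # both orders of customer 2, in sort-key order
```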
11. Partitions are three-way replicated
Replica 1, Replica 2, and Replica 3 each hold a copy of Partition 1, Partition 2, through Partition N, so every item is stored three times:
Id = 2, Name = Andy, Dept = Engg
Id = 3, Name = Kim, Dept = Ops
Id = 1, Name = Jim
13. Local secondary index (LSI)
Alternate sort key attribute
Index is local to a partition key
Table: A1 (partition), A2 (sort), A3, A4, A5
LSIs:
• KEYS_ONLY: A1 (partition), A3 (sort), A2 (item key)
• INCLUDE A3: A1 (partition), A4 (sort), A2 (item key), A3 (projected)
• ALL: A1 (partition), A5 (sort), A2 (item key), A3 (projected), A4 (projected)
10 GB max per partition key, i.e. LSIs limit the # of range keys!
14. Global secondary index (GSI)
Alternate partition and/or sort key
Index is across all partition keys
Table: A1 (partition), A2, A3, A4, A5
GSIs:
• INCLUDE A3: A5 (partition), A4 (sort), A1 (item key), A3 (projected)
• ALL: A4 (partition), A5 (sort), A1 (item key), A2 (projected), A3 (projected)
• KEYS_ONLY: A2 (partition), A1 (item key)
RCUs/WCUs provisioned separately for GSIs
Online indexing
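The three projection types (KEYS_ONLY, INCLUDE, ALL) decide which attributes are copied into an index entry. The helper below is a hypothetical sketch of that rule using the slide's attribute names A1 to A5; the function name and signature are assumptions made for illustration.

```python
def project(item, table_keys, index_keys, projection="KEYS_ONLY", include=()):
    """Return the attributes an index entry would carry for `item`.

    table_keys: key attributes of the base table (always carried along).
    index_keys: key attributes of the index itself.
    """
    if projection == "ALL":
        return dict(item)
    names = list(index_keys) + [k for k in table_keys if k not in index_keys]
    if projection == "INCLUDE":
        names += [n for n in include if n not in names]
    elif projection != "KEYS_ONLY":
        raise ValueError(f"unknown projection type: {projection}")
    return {name: item[name] for name in names if name in item}


item = {"A1": "a1", "A2": "a2", "A3": "a3", "A4": "a4", "A5": "a5"}

# GSI keyed on A5 (partition) and A4 (sort), projecting A3 as well.
print(project(item, ("A1",), ("A5", "A4"), "INCLUDE", include=("A3",)))
# GSI keyed on A2 only, carrying just the keys.
print(project(item, ("A1",), ("A2",), "KEYS_ONLY"))
```

Smaller projections mean smaller index entries, which in turn means fewer capacity units consumed when the index is read or updated.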
15. How do GSI updates work?
Client, Table (primary table partitions), Global Secondary Index
1. Update request (client to table)
2. Update response (table to client)
2. Asynchronous update (in progress) (table to Global Secondary Index)
If GSIs don’t have enough write capacity, table writes will be throttled!
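The asynchronous flow can be pictured with a toy model in which the table answers the client first and applies the index change later. Everything here (the `TableWithGsi` class, the explicit `propagate` step) is a hypothetical simplification meant only to show why a GSI read can briefly lag behind the table; it does not model write-capacity throttling.

```python
from collections import deque


class TableWithGsi:
    """Toy table whose single GSI is updated after the write is acknowledged."""

    def __init__(self, gsi_key):
        self.gsi_key = gsi_key
        self.items = {}          # primary key -> item
        self.gsi = {}            # GSI key value -> set of primary keys
        self._pending = deque()  # index updates not yet applied

    def put(self, primary_key, item):
        old = self.items.get(primary_key)
        self.items[primary_key] = item
        self._pending.append((primary_key, old, item))
        return "OK"  # the response goes back before the index is updated

    def propagate(self):
        """Apply queued index updates (stands in for the asynchronous step)."""
        while self._pending:
            primary_key, old, new = self._pending.popleft()
            if old is not None and self.gsi_key in old:
                self.gsi.get(old[self.gsi_key], set()).discard(primary_key)
            if self.gsi_key in new:
                self.gsi.setdefault(new[self.gsi_key], set()).add(primary_key)

    def query_gsi(self, value):
        return sorted(self.gsi.get(value, set()))


games = TableWithGsi(gsi_key="Opponent")
games.put("72f49", {"Opponent": "Bob", "Status": "PENDING"})
print(games.query_gsi("Bob"))  # index has not caught up yet
games.propagate()
print(games.query_gsi("Bob"))  # now the item is visible through the index
```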
16. LSI or GSI?
LSI can be modeled as a GSI
If data size in an item collection > 10 GB, use GSI
If eventual consistency is okay for your scenario, use GSI!
18. Scaling
Throughput
• Provision any amount of throughput to a table
Size
• Add any number of items to a table
• Max item size is 400 KB
• LSIs limit the number of range keys due to 10 GB limit
Scaling is achieved through partitioning
19. Throughput
Provisioned at the table level
• Write capacity units (WCUs) are measured in 1 KB per second
• Read capacity units (RCUs) are measured in 4 KB per second
• RCUs measure strongly consistent reads
• Eventually consistent reads cost 1/2 of consistent reads
Read and write throughput limits are independent
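As a rough worked example of the unit sizes above, the helpers below convert an item size and a request rate into capacity units. The function names are made up for this sketch, and it assumes each item is rounded up to the next 1 KB (writes) or 4 KB (reads), with eventually consistent reads costing half.

```python
import math


def write_capacity_units(item_size_bytes, writes_per_second=1):
    """WCUs needed: each write is rounded up to the next 1 KB."""
    return math.ceil(item_size_bytes / 1024) * writes_per_second


def read_capacity_units(item_size_bytes, reads_per_second=1,
                        eventually_consistent=False):
    """RCUs needed: each read is rounded up to the next 4 KB.

    Eventually consistent reads cost half of strongly consistent reads.
    """
    units = math.ceil(item_size_bytes / 4096) * reads_per_second
    return units / 2 if eventually_consistent else units


# A 1.5 KB item written 100 times per second.
print(write_capacity_units(1536, writes_per_second=100))
# A 6 KB item read 10 times per second, strongly vs eventually consistent.
print(read_capacity_units(6144, reads_per_second=10))
print(read_capacity_units(6144, reads_per_second=10, eventually_consistent=True))
```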
21. Partitioning example
Table size = 8 GB, RCUs = 5000, WCUs = 500
Number of partitions = 3
RCUs per partition = 5000/3 = 1666.67
WCUs per partition = 500/3 = 166.67
Data/partition = 8/3 = 2.67 GB
RCUs and WCUs are uniformly spread across partitions
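The figure of three partitions can be reproduced with a short calculation. The per-partition limits used below (3,000 RCUs, 1,000 WCUs, 10 GB) are an assumption: they are not stated in this transcript, and they reflect guidance documented around the time of the talk, so treat them as illustrative rather than as current DynamoDB behavior.

```python
import math


def estimate_partitions(table_size_gb, rcus, wcus,
                        rcu_limit=3000, wcu_limit=1000, size_limit_gb=10):
    """Estimate the partition count from throughput and from size.

    The limits are assumed values (see the note above), not official constants.
    """
    by_throughput = rcus / rcu_limit + wcus / wcu_limit
    by_size = table_size_gb / size_limit_gb
    return max(1, math.ceil(max(by_throughput, by_size)))


partitions = estimate_partitions(table_size_gb=8, rcus=5000, wcus=500)
print(partitions)                  # 3
print(round(5000 / partitions, 2)) # RCUs per partition
print(round(500 / partitions, 2))  # WCUs per partition
print(round(8 / partitions, 2))    # GB per partition
```

The point of the arithmetic is that provisioned throughput is divided evenly, so a single partition can serve only a fraction of what the table as a whole was provisioned for.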
22. What causes throttling?
If sustained throughput goes beyond provisioned throughput per partition
Non-uniform workloads
• Hot keys/hot partitions
• Very large bursts
Mixing hot data with cold data
• Use a table per time period
From the example before:
• Table created with 5000 RCUs, 500 WCUs
• RCUs per partition = 1666.67
• WCUs per partition = 166.67
• If sustained throughput > (1666 RCUs or 166 WCUs) per key or partition, DynamoDB may throttle requests
• Solution: Increase provisioned throughput
24. Getting the most out of DynamoDB throughput
“To get the most out of DynamoDB throughput, create tables where the hash key element has a large number of distinct values, and values are requested fairly uniformly, as randomly as possible.” (DynamoDB Developer Guide)
Space: access is evenly spread over the key-space
Time: requests arrive evenly spaced in time
27. 1:1 relationships or key-values
Use a table or GSI with an alternate partition key
Use GetItem or BatchGetItem API
Example: Given an SSN or license number, get attributes
Users Table
Partition key | Attributes
SSN = 123-45-6789 | Email = johndoe@nowhere.com, License = TDL25478134
SSN = 987-65-4321 | Email = maryfowler@somewhere.com, License = TDL78309234
Users-Email-GSI
Partition key | Attributes
License = TDL78309234 | Email = maryfowler@somewhere.com, SSN = 987-65-4321
License = TDL25478134 | Email = johndoe@nowhere.com, SSN = 123-45-6789
28. 1:N relationships or parent-children
Use a table or GSI with partition and sort key
Use Query API
Example:
• Given a device, find all readings between epoch X, Y
Device-measurements
Partition key | Sort key | Attributes
DeviceId = 1 | epoch = 5513A97C | Temperature = 30, pressure = 90
DeviceId = 1 | epoch = 5513A9DB | Temperature = 30, pressure = 90
29. N:M relationships
Use a table and GSI with partition and sort key elements switched
Use Query API
Example: Given a user, find all games. Or given a game, find all users.
User-Games-Table
Partition key | Sort key
UserId = bob | GameId = Game1
UserId = fred | GameId = Game2
UserId = bob | GameId = Game3
Game-Users-GSI
Partition key | Sort key
GameId = Game1 | UserId = bob
GameId = Game2 | UserId = fred
GameId = Game3 | UserId = bob
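The "switched keys" idea can be sketched with plain dictionaries: the same rows are indexed once by user and once by game. The `build_index` helper is hypothetical and stands in for the table and its GSI.

```python
from collections import defaultdict

# Rows of the User-Games table as (UserId, GameId) pairs, taken from the slide.
USER_GAMES = [("bob", "Game1"), ("fred", "Game2"), ("bob", "Game3")]


def build_index(rows, partition_pos, sort_pos):
    """Group rows by one column and keep the other column sorted."""
    index = defaultdict(list)
    for row in rows:
        index[row[partition_pos]].append(row[sort_pos])
    for values in index.values():
        values.sort()
    return dict(index)


games_by_user = build_index(USER_GAMES, partition_pos=0, sort_pos=1)  # table
users_by_game = build_index(USER_GAMES, partition_pos=1, sort_pos=0)  # GSI

print(games_by_user["bob"])    # given a user, find all games
print(users_by_game["Game2"])  # given a game, find all users
```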
31. Hierarchical Data Structures as Items…
Use composite sort key to define a Hierarchy
Highly selective result sets with sort queries
Index anything, scales to any size
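One way to picture a composite sort key is to join the hierarchy levels into a single string and query by prefix. The `#` separator and the location names below are assumptions chosen for illustration; the prefix scan mimics what a "begins with" condition on the sort key would select.

```python
from bisect import bisect_left


def make_sort_key(*levels, sep="#"):
    """Join hierarchy levels (for example country, state, city) into one key."""
    return sep.join(levels)


def query_prefix(sorted_keys, prefix):
    """Return all keys that start with prefix, using the sort order to stop early."""
    start = bisect_left(sorted_keys, prefix)
    matches = []
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break
        matches.append(key)
    return matches


keys = sorted(
    make_sort_key(*path)
    for path in [
        ("USA", "WA", "Seattle"),
        ("USA", "WA", "Spokane"),
        ("USA", "NY", "NYC"),
        ("Canada", "BC", "Vancouver"),
    ]
)

print(query_prefix(keys, "USA#WA#"))  # one state
print(query_prefix(keys, "USA#"))     # the whole country
```

Because the keys are stored in sorted order, each level of the hierarchy maps to one contiguous range, which is what makes these queries highly selective.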
32. … or as Documents (JSON)
JSON data types (M, L, BOOL, NULL)
Document SDKs Available
Indexing only via Streams/Lambda
400KB max item size (limits hierarchical data structure)
35. Time series tables
Table | Partition key | Sort key | Attributes | RCUs | WCUs | Data
Events_table_2015_April (current table) | Event_id | Timestamp | Attribute1 …. Attribute N | 10000 | 10000 | Hot
Events_table_2015_March (older table) | Event_id | Timestamp | Attribute1 …. Attribute N | 1000 | 1 | Cold
Events_table_2015_February (older table) | Event_id | Timestamp | Attribute1 …. Attribute N | 100 | 1 | Cold
Events_table_2015_January (older table) | Event_id | Timestamp | Attribute1 …. Attribute N | 10 | 1 | Cold
Don’t mix hot and cold data; archive cold data to Amazon S3
36. Dealing with time series data
Use a table per time period
Pre-create daily, weekly, monthly tables
Provision required throughput for current table
Writes go to the current table
Turn off (or reduce) throughput for older tables
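The table-per-period pattern can be sketched as a naming rule plus a throughput plan. The naming scheme follows the table names on the earlier slide; the rule of dividing read capacity by ten for each month of age is only an assumption that happens to reproduce the slide's numbers, not a recommendation.

```python
from datetime import date

MONTHS = ("January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December")


def table_name(day):
    """Name of the monthly table that holds events for `day`."""
    return f"Events_table_{day.year}_{MONTHS[day.month - 1]}"


def month_start(day, months_ago):
    """First day of the month that lies `months_ago` months before `day`."""
    index = day.year * 12 + (day.month - 1) - months_ago
    return date(index // 12, index % 12 + 1, 1)


def throughput_plan(today, older_tables=3, hot_rcus=10000, hot_wcus=10000):
    """Return (table, RCUs, WCUs) for the current table and the older ones."""
    plan = []
    for age in range(older_tables + 1):
        name = table_name(month_start(today, age))
        if age == 0:
            plan.append((name, hot_rcus, hot_wcus))  # writes go here
        else:
            # Assumed rule: reads drop tenfold per month, writes are minimal.
            plan.append((name, max(1, hot_rcus // 10 ** age), 1))
    return plan


for row in throughput_plan(date(2015, 4, 15)):
    print(row)
```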
38. Scaling bottlenecks
ProductCatalog Table: 50 partitions (Partition 1 … Partition K … Partition M … Partition 50), 2000 RCUs each
Product A, Product B
Shoppers: 70,000/sec
SELECT Id, Description, ...
FROM ProductCatalog
WHERE Id="POPULAR_PRODUCT"
40. Cache popular items
ProductCatalog Table in DynamoDB: Partition 1, Partition 2
User, Cache, DynamoDB
SELECT Id, Description, ...
FROM ProductCatalog
WHERE Id="POPULAR_PRODUCT"
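A read-through cache in front of the table can be sketched as follows. The `ReadThroughCache` class and its `fetch` callback are hypothetical; a real deployment would also need expiry or invalidation, which this sketch leaves out.

```python
from collections import OrderedDict


class ReadThroughCache:
    """Tiny LRU read-through cache; `fetch` stands in for a database read."""

    def __init__(self, fetch, capacity=100):
        self._fetch = fetch
        self._capacity = capacity
        self._entries = OrderedDict()
        self.misses = 0

    def get(self, key):
        if key in self._entries:
            self._entries.move_to_end(key)  # mark as recently used
            return self._entries[key]
        self.misses += 1
        value = self._fetch(key)            # only misses reach the table
        self._entries[key] = value
        if len(self._entries) > self._capacity:
            self._entries.popitem(last=False)  # evict least recently used
        return value


catalog = {"POPULAR_PRODUCT": {"Id": "POPULAR_PRODUCT", "Description": "..."}}
cache = ReadThroughCache(catalog.__getitem__)

for _ in range(70000):  # one second of shopper traffic for the hot item
    cache.get("POPULAR_PRODUCT")
print(cache.misses)     # only the first request reached the table
```

With the hot item served from memory, the partition that stores it no longer has to absorb the whole request rate.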
43. Messages App
Messages Table
David
Inbox:
SELECT *
FROM Messages
WHERE Recipient='David'
ORDER BY Date DESC
LIMIT 50
Outbox:
SELECT *
FROM Messages
WHERE Sender='David'
ORDER BY Date DESC
LIMIT 50
44. Large and small attributes mixed
Messages Table (partition key: Recipient, sort key: Date)
Recipient | Date | Sender | Message
David | 2014-10-02 | Bob | …
… 48 more messages for David …
David | 2014-10-03 | Alice | …
Alice | 2014-09-28 | Bob | …
Alice | 2014-10-01 | Carol | …
(Many more messages)
Message: large message bodies, attachments
David's inbox: 50 items × 256 KB each
SELECT *
FROM Messages
WHERE Recipient='David'
ORDER BY Date DESC
LIMIT 50
45. Computing inbox query cost
Items evaluated by query
Average item size
Conversion ratio
Eventually consistent reads
50 * 256KB * (1 RCU / 4KB) * (1 / 2) = 1600 RCU
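The arithmetic above can be written as a small function. This is a sketch with an assumed function name; it models a Query, where the sizes of all evaluated items are summed and then rounded up to the next 4 KB unit.

```python
import math


def query_rcus(item_count: int, item_size_kb: float,
               eventually_consistent: bool = True) -> float:
    """Estimate read capacity units consumed by one Query."""
    total_kb = item_count * item_size_kb
    units = math.ceil(total_kb / 4)  # 1 RCU per 4 KB, rounded up
    # Eventually consistent reads cost half as much as strongly consistent ones.
    return units / 2 if eventually_consistent else units
```

query_rcus(50, 256) reproduces the 1600 RCU figure; the same query with strongly consistent reads would cost 3200.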
46. Recipient Date Sender Subject MsgId
David 2014-10-02 Bob Hi!… afed
David 2014-10-03 Alice RE: The… 3kf8
Alice 2014-09-28 Bob FW: Ok… 9d2b
Alice 2014-10-01 Carol Hi!... ct7r
Separate the bulk data
Inbox-GSI Messages Table
MsgId Body
9d2b …
3kf8 …
ct7r …
afed …
David
1. Query Inbox-GSI: 1 RCU (50 sequential items at 128 bytes)
2. BatchGetItem Messages: 1600 RCU (50 separate items at 256 KB)
Uniformly distributes large item reads
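The two-step read can be sketched as a pair of request builders. This is only an illustration under assumptions: the GSI is assumed to live on a table named Messages keyed by MsgId, the attribute names come from the slide, and the dictionaries follow the low-level Query and BatchGetItem request shapes without actually calling the service.

```python
from typing import Dict, List


def inbox_query_params(recipient: str, limit: int = 50) -> dict:
    """Step 1: read the small index entries for a recipient, newest first."""
    return {
        "TableName": "Messages",          # assumed table name
        "IndexName": "Inbox-GSI",
        "KeyConditionExpression": "#r = :r",
        "ExpressionAttributeNames": {"#r": "Recipient"},
        "ExpressionAttributeValues": {":r": {"S": recipient}},
        "ScanIndexForward": False,        # descending sort key (Date) order
        "Limit": limit,
    }


def batch_get_params(index_items: List[Dict[str, dict]]) -> dict:
    """Step 2: fetch the large bodies by MsgId, only when they are needed."""
    keys = [{"MsgId": item["MsgId"]} for item in index_items]
    return {"RequestItems": {"Messages": {"Keys": keys}}}
```

An application would typically run step 1 to render the inbox list and run step 2 only for the messages the user opens, so the large reads are spread across partitions by MsgId.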
50. Reduce one-to-many item sizes
Configure secondary index projections
Use GSIs to model M:N relationship between sender and recipient
Distribute large items
Querying many large items at once
Inbox Messages Outbox
52. GameId Date Host Opponent Status
d9bl3 2014-10-02 David Alice DONE
72f49 2014-09-30 Alice Bob PENDING
o2pnb 2014-10-08 Bob Carol IN_PROGRESS
b932s 2014-10-03 Carol Bob PENDING
ef9ca 2014-10-03 David Bob IN_PROGRESS
Games Table
Hierarchical Data Structures
Partition key
53. Query for incoming game requests
DynamoDB indexes provide partition and sort
What about queries for two equalities and a sort?
SELECT * FROM Game
WHERE Opponent='Bob'
AND Status='PENDING'
ORDER BY Date DESC
(hash)
(range)
(?)
54. Secondary Index
Opponent Date GameId Status Host
Alice 2014-10-02 d9bl3 DONE David
Carol 2014-10-08 o2pnb IN_PROGRESS Bob
Bob 2014-09-30 72f49 PENDING Alice
Bob 2014-10-03 b932s PENDING Carol
Bob 2014-10-03 ef9ca IN_PROGRESS David
Approach 1: Query filter
Bob
Partition key Sort key
55. Secondary Index
Approach 1: Query filter
Bob
Opponent Date GameId Status Host
Alice 2014-10-02 d9bl3 DONE David
Carol 2014-10-08 o2pnb IN_PROGRESS Bob
Bob 2014-09-30 72f49 PENDING Alice
Bob 2014-10-03 b932s PENDING Carol
Bob 2014-10-03 ef9ca IN_PROGRESS David
SELECT * FROM Game
WHERE Opponent='Bob'
ORDER BY Date DESC
FILTER ON Status='PENDING'
(filtered out)
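The behavior of a query filter can be simulated in memory. The sketch below is not the DynamoDB API; it uses the rows from the slide and a hypothetical helper to show that the filter runs after the key condition, so filtered-out items are still evaluated.

```python
GAMES_INDEX = [
    {"Opponent": "Alice", "Date": "2014-10-02", "GameId": "d9bl3", "Status": "DONE", "Host": "David"},
    {"Opponent": "Carol", "Date": "2014-10-08", "GameId": "o2pnb", "Status": "IN_PROGRESS", "Host": "Bob"},
    {"Opponent": "Bob", "Date": "2014-09-30", "GameId": "72f49", "Status": "PENDING", "Host": "Alice"},
    {"Opponent": "Bob", "Date": "2014-10-03", "GameId": "b932s", "Status": "PENDING", "Host": "Carol"},
    {"Opponent": "Bob", "Date": "2014-10-03", "GameId": "ef9ca", "Status": "IN_PROGRESS", "Host": "David"},
]


def query_with_filter(index, opponent, status):
    """Return (matching items, number of items evaluated)."""
    # Key condition: partition key equality, sort key in descending order.
    evaluated = sorted((g for g in index if g["Opponent"] == opponent),
                       key=lambda g: g["Date"], reverse=True)
    # Filter: applied to the evaluated items before they are returned.
    returned = [g for g in evaluated if g["Status"] == status]
    return returned, len(evaluated)
```

For Bob and PENDING, three items are evaluated and two are returned; the IN_PROGRESS game still counts toward the read even though it is dropped.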
57. Send back less data “on the wire”
Simplify application code
Simple SQL-like expressions
• AND, OR, NOT, ()
Use query filter
Your index isn’t entirely selective
59. Secondary Index
Approach 2: Composite key
Opponent StatusDate GameId Host
Alice DONE_2014-10-02 d9bl3 David
Carol IN_PROGRESS_2014-10-08 o2pnb Bob
Bob IN_PROGRESS_2014-10-03 ef9ca David
Bob PENDING_2014-09-30 72f49 Alice
Bob PENDING_2014-10-03 b932s Carol
Partition key Sort key
60. Opponent StatusDate GameId Host
Alice DONE_2014-10-02 d9bl3 David
Carol IN_PROGRESS_2014-10-08 o2pnb Bob
Bob IN_PROGRESS_2014-10-03 ef9ca David
Bob PENDING_2014-09-30 72f49 Alice
Bob PENDING_2014-10-03 b932s Carol
Secondary Index
Approach 2: Composite key
Bob
SELECT * FROM Game
WHERE Opponent='Bob'
AND StatusDate BEGINS_WITH 'PENDING'
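The composite-key approach can be sketched the same way. This in-memory illustration (hypothetical helper names, data from the slide) concatenates Status and Date into one sort key so that a prefix match selects only the wanted items.

```python
def status_date(status: str, date: str) -> str:
    """Build the composite sort key, e.g. 'PENDING_2014-10-03'."""
    return f"{status}_{date}"


INDEX = [
    {"Opponent": "Alice", "StatusDate": status_date("DONE", "2014-10-02"), "GameId": "d9bl3"},
    {"Opponent": "Carol", "StatusDate": status_date("IN_PROGRESS", "2014-10-08"), "GameId": "o2pnb"},
    {"Opponent": "Bob", "StatusDate": status_date("IN_PROGRESS", "2014-10-03"), "GameId": "ef9ca"},
    {"Opponent": "Bob", "StatusDate": status_date("PENDING", "2014-09-30"), "GameId": "72f49"},
    {"Opponent": "Bob", "StatusDate": status_date("PENDING", "2014-10-03"), "GameId": "b932s"},
]


def query_begins_with(index, opponent: str, prefix: str):
    """Partition key equality plus a prefix condition on the sort key."""
    rows = [r for r in index
            if r["Opponent"] == opponent and r["StatusDate"].startswith(prefix)]
    return sorted(rows, key=lambda r: r["StatusDate"])
```

Unlike the filter approach, every item touched here is returned: the query for Bob with prefix "PENDING" reads exactly the two pending games.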
62. Sparse indexes
Id
(Partition)
User Game Score Date Award
1 Bob G1 1300 2012-12-23
2 Bob G1 1450 2012-12-23
3 Jay G1 1600 2012-12-24
4 Mary G1 2000 2012-10-24 Champ
5 Ryan G2 123 2012-03-10
6 Jones G2 345 2012-03-20
Game-scores-table
Award
(Partition)
Id User Score
Champ 4 Mary 2000
Award-GSI
Scan sparse GSIs
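Why a sparse index is cheap to scan can be shown with a short simulation. This is a sketch using the table from the slide and a hypothetical helper; it only models the rule that an item appears in an index when it has the index key attribute.

```python
SCORES = [
    {"Id": 1, "User": "Bob", "Game": "G1", "Score": 1300, "Date": "2012-12-23"},
    {"Id": 2, "User": "Bob", "Game": "G1", "Score": 1450, "Date": "2012-12-23"},
    {"Id": 3, "User": "Jay", "Game": "G1", "Score": 1600, "Date": "2012-12-24"},
    {"Id": 4, "User": "Mary", "Game": "G1", "Score": 2000, "Date": "2012-10-24", "Award": "Champ"},
    {"Id": 5, "User": "Ryan", "Game": "G2", "Score": 123, "Date": "2012-03-10"},
    {"Id": 6, "User": "Jones", "Game": "G2", "Score": 345, "Date": "2012-03-20"},
]


def build_sparse_index(table, key_attr, projected):
    """Keep only items that carry the index key, with the projected attributes."""
    return [
        {key_attr: item[key_attr], **{name: item[name] for name in projected}}
        for item in table
        if key_attr in item
    ]


award_gsi = build_sparse_index(SCORES, "Award", ["Id", "User", "Score"])
```

Scanning award_gsi touches one entry instead of six, which is the point of scanning the sparse GSI rather than the base table.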
68. Trade off read cost for write scalability
Consider throughput per partition key
Shard write-heavy partition keys
Your write workload is not horizontally scalable
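Sharding a write-heavy partition key can be sketched as follows. The suffix scheme and helper names are assumptions; the idea is that each write goes to one of N suffixed keys, and a read has to visit all N and combine the results, which is the read-cost trade-off named above.

```python
import random
from typing import List


def sharded_key(base_key: str, shard_count: int, rng=random) -> str:
    """Pick one of `shard_count` suffixed keys for a write."""
    return f"{base_key}_{rng.randrange(shard_count)}"


def all_shard_keys(base_key: str, shard_count: int) -> List[str]:
    """List every shard key that a read must query and aggregate."""
    return [f"{base_key}_{n}" for n in range(shard_count)]
```

With ten shards, writes to a hot key such as a hypothetical "Candidate_A" spread over "Candidate_A_0" through "Candidate_A_9", and a total is computed by reading all ten.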
69. Concatenate attributes to form useful secondary index keys
Take advantage of sparse indexes
Replace filter with indexes
You want to optimize a query as much as possible
Status + Date
71. Stream of updates to a table
Asynchronous
Exactly once
Strictly ordered
• Per item
Highly durable
• Scale with table
24-hour lifetime
Sub-second latency
DynamoDB Streams
72. View Type Destination
Old image—before update Name = John, Destination = Mars
New image—after update Name = John, Destination = Pluto
Old and new images Name = John, Destination = Mars; Name = John, Destination = Pluto
Keys only Name = John
View types
UpdateItem (Name = John, Destination = Pluto)
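The four view types can be illustrated with a small function that assembles what a stream record would carry for the update above. This is a simplified model rather than the service's exact record format; the function name is hypothetical, while the view type names and the Keys, OldImage, and NewImage fields follow DynamoDB Streams terminology.

```python
def stream_record(view_type: str, key_names, old: dict, new: dict) -> dict:
    """Return the images a stream record exposes for a given view type."""
    record = {"Keys": {name: new[name] for name in key_names}}
    if view_type in ("OLD_IMAGE", "NEW_AND_OLD_IMAGES"):
        record["OldImage"] = old   # item as it was before the update
    if view_type in ("NEW_IMAGE", "NEW_AND_OLD_IMAGES"):
        record["NewImage"] = new   # item as it is after the update
    return record


before = {"Name": "John", "Destination": "Mars"}
after = {"Name": "John", "Destination": "Pluto"}
keys_only = stream_record("KEYS_ONLY", ["Name"], before, after)
both = stream_record("NEW_AND_OLD_IMAGES", ["Name"], before, after)
```

Here keys_only carries just Name = John, while both carries the Mars and Pluto versions side by side.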
74. DynamoDB Streams
Open Source Cross-Region Replication Library
Asia Pacific (Sydney) EU (Ireland) Replica
US East (N. Virginia)
Cross-region replication
77. Analytics with DynamoDB Streams
Collect and de-dupe data in DynamoDB
Aggregate data in-memory and flush periodically
Performing real-time aggregation and analytics
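The "aggregate in memory and flush periodically" step can be sketched as a small stream consumer. Everything here is an assumption for illustration: the class name, the sink callback (which would write totals to an aggregate table), and flushing by record count rather than by time to keep the sketch deterministic.

```python
from collections import Counter
from typing import Callable, Dict


class StreamAggregator:
    """Count stream records per key in memory and flush totals in batches."""

    def __init__(self, flush_every: int, sink: Callable[[Dict[str, int]], None]) -> None:
        self._flush_every = flush_every
        self._sink = sink            # stand-in for a write of aggregated totals
        self._counts: Counter = Counter()
        self._pending = 0

    def process(self, key: str, amount: int = 1) -> None:
        """Handle one stream record for the given aggregation key."""
        self._counts[key] += amount
        self._pending += 1
        if self._pending >= self._flush_every:
            self.flush()

    def flush(self) -> None:
        """Emit the accumulated totals and reset the in-memory state."""
        if self._counts:
            self._sink(dict(self._counts))
            self._counts.clear()
        self._pending = 0
```

A consumer would call process for each record read from the stream and call flush once more on shutdown so that no partial batch is lost.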