Scylla Summit 2016: ScyllaDB, Present and Future - ScyllaDB
Where is Scylla now and where is it going? ScyllaDB's CTO Avi Kivity outlines the 3 ScyllaDB Commitments, and gives an overview of the ScyllaDB road map.
Learn how to get started with Scylla
Join us for an overview of NoSQL best practices and a look into the scale-out vs. scale-up models and the Scylla philosophy for accelerated success. In this session, we will show you how to get fast wins when using Scylla. We will cover architectural concepts, installation best practices, data model antipatterns, use case examples, scale-out vs. scale-up approaches, and management and monitoring tools. If you’re a database architect, developer, or manager, this session is for you!
Architectural concepts
Installation of a single cluster of Scylla
Using Docker to get a 3-node cluster on your laptop
Connecting your application to the database
Data model antipatterns
Management and monitoring installation
Many NoSQL DBaaS vendors limit which cloud platforms you can run on and how much data you can store, and require you to over-provision cloud infrastructure resources, while failing to deliver performance and low latency at scale.
In this session, we will compare the performance and Total Cost of Ownership (TCO) of competing NoSQL DBaaS offerings. We will also review how to migrate to Scylla Cloud, our fully managed database service.
You will learn:
- The true cost of ownership for selected NoSQL DBaaS offerings
- The 8 essentials for selecting a NoSQL DBaaS
- Migration options from Apache Cassandra, DynamoDB and other databases
Seastar is an open source framework for building highly scalable, asynchronous distributed applications. It uses a shared-nothing architecture with no locks or threads to achieve linear scaling across cores. Applications built on Seastar can handle millions of connections and I/O operations in parallel. It uses an asynchronous programming model based on promises and futures with zero-copy networking and disk I/O for high performance.
Scylla Summit 2022: Operating at Monstrous Scales: Benchmarking Petabyte Work... - ScyllaDB
ScyllaDB is a distributed database designed to scale horizontally and vertically — in theory. What about in practice? ScyllaDB’s Benny Halevy, Director, Software Engineering, will take you through the process and results of benchmarking our NoSQL database at the petabyte level, showing how you can use advanced features like workload prioritization to control priorities of transactional (read-write) and analytic (read-only) queries on the same cluster with smooth and predictable performance.
To watch all of the recordings hosted during Scylla Summit 2022 visit our website here: https://www.scylladb.com/summit.
Scylla Summit 2018: Consensus in Eventually Consistent Databases - ScyllaDB
Eventually consistent databases choose to remain available under failure, allowing for conflicting data to be stored in different replicas (later repaired by background processes). Weakening the consistency guarantees improves not only availability, but also performance, as the number of replicas involved in a given operation can be minimized. There are, however, use-cases that require the opposite trade-off. Indeed, Apache Cassandra and Scylla provide Lightweight Transactions (LWT), which allow single-key linearizable updates. The mechanism underlying LWT is asynchronous consensus. In this talk, we'll describe the characteristics and requirements of Scylla's consensus implementation, and how it enables strongly consistent updates. We will also cover how consensus can be applied to other aspects of the system, such as schema changes, node membership, and range movements, in order to improve their reliability and safety. We will thus show that an eventually consistent database can leverage consensus without compromising either availability or performance.
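The single-key conditional update that LWT exposes can be illustrated with a small in-memory sketch. This shows only the compare-and-set semantics seen by a client; it is not Scylla's or Cassandra's implementation. The class and method names are hypothetical, and the lock merely stands in for the consensus round that a real cluster runs across replicas.

```python
import threading


class ConditionalStore:
    """Toy single-process store with conditional (compare-and-set) writes."""

    def __init__(self):
        self._rows = {}
        # Assumption: a local lock stands in for the consensus round that
        # serializes competing updates to the same key in a real cluster.
        self._lock = threading.Lock()

    def insert_if_not_exists(self, key, value):
        """Write only if the key is absent; return (applied, current value)."""
        with self._lock:
            if key in self._rows:
                return False, self._rows[key]
            self._rows[key] = value
            return True, value

    def update_if_equals(self, key, expected, new_value):
        """Write only if the stored value equals `expected`."""
        with self._lock:
            if key not in self._rows or self._rows[key] != expected:
                return False, self._rows.get(key)
            self._rows[key] = new_value
            return True, new_value
```

A caller that loses the race receives `applied == False` together with the value that won, and can decide whether to retry.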
Scylla is a new open source NoSQL database that is compatible with Apache Cassandra but provides significantly higher performance through a redesign that takes advantage of modern hardware. Scylla is capable of over 1.8 million operations per second per node with predictable low latencies. It uses an architecture with shard-per-core and reactor programming that avoids locks and threads for near-linear scaling. Scylla also has its own efficient unified cache and I/O scheduler that maximize throughput and allow it to outperform Cassandra on benchmarks by an order of magnitude. Scylla is fully compatible with Cassandra and aims to build an open source community around ongoing core database improvements.
Scylla’s Journey Towards Being an Elastic Cloud Native Database - ScyllaDB
Cloud native databases must scale while serving a growing online workload, with minimal disruption, and complete the scaling operation as fast as possible. In this session we will review the different components that are stressed in scaling scenarios and present work we have done over the year to improve Scylla’s elasticity as we enhance it to be a true cloud native database.
How to Monitor and Size Workloads on AWS i3 instances - ScyllaDB
There is a new class of machines in town! Amazon recently unveiled i3, a new class of machines targeted at I/O-intensive workloads. Scylla will officially support i3, and previews are already available.
Join our webinar to learn how to build a state-of-the-art database solution. Presenters Glauber Costa and Eyal Gutkind will cover how to:
- Determine which workloads can benefit from i3 instances
- Ensure Scylla fully leverages the great resources in the i3 family
- Effectively navigate the Scylla monitoring system and identify bottlenecks
You'll also see a live demonstration with a dashboard featuring an i3 cluster with different data models and workloads.
ScyllaDB CTO Avi Kivity looks at the present state of Scylla's capabilities, and offers a glimpse of what's to come. From incremental compaction strategy to take advantage of newer, denser nodes, to data transformations with User Defined Functions (UDFs) and User Defined Aggregates (UDAs), ScyllaDB continues to expand its horizons for capabilities, use cases and APIs.
Why you need benchmarks
Finding the right database solution for your use case can be an arduous journey. The database deployment touches aspects of throughput performance, latency control, high availability and data resilience.
You will need to decide on the infrastructure to use: Cloud, on-premise or a hybrid solution.
Data models also have an impact on finding the right fit for the use case. Once you establish a requirements set, the next step is to test your use case against the databases of choice.
In this workshop, we will discuss the different data points you need to collect in order to get the most realistic testing environment.
We will cover:
Data model impact on performance and latency
Client behavior related to database capabilities
Failover and high availability testing
Hardware selection and cluster configuration impact
We will show 2 benchmarking tools you can use to test and benchmark your clusters to identify the optimal deployment scenario for your use case.
Attend this virtual workshop if you are:
Looking to minimize the cost of your database deployment
Making a database decision based on performance and scale data
Planning to emulate your workload on a pre-production system where you can test, fail fast and learn.
ScyllaDB: What could you do with Cassandra compatibility at 1.8 million reque... - Data Con LA
Scylla is a new, open-source NoSQL data store with a novel design optimized for modern hardware, capable of 1.8 million requests per second per node, while providing Apache Cassandra compatibility and scaling properties. While conventional NoSQL databases suffer from latency hiccups, expensive locking, and low throughput due to low processor utilization, the Scylla design is based on a modern shared-nothing approach. Scylla runs multiple engines, one per core, each with its own memory, CPU and multi-queue NIC. The result is a NoSQL database that delivers an order of magnitude more performance, with less performance tuning needed from the administrator.
With extra performance to work with, NoSQL projects can have more flexibility to focus on other concerns, such as functionality and time to market. Come for the tech details on what Scylla does under the hood, and leave with some ideas on how to do more with NoSQL, faster.
Speaker bio
Don Marti is technical marketing manager for ScyllaDB. He has written for Linux Weekly News, Linux Journal, and other publications. He co-founded the Linux consulting firm Electric Lichen. Don is a strategic advisor for Mozilla, and has previously served as president and vice president of the Silicon Valley Linux Users Group and on the program committees for Uselinux, Codecon, and LinuxWorld Conference and Expo.
Mesosphere and Contentteam: A New Way to Run Cassandra - DataStax Academy
We, Ben Whitehead and Robert Stupp, will show you how to run Cassandra on Mesos. We will go through the technical steps to plan, set up, and operate even large-scale Cassandra clusters on Mesos. We will also illustrate how the Cassandra-on-Mesos framework helps you set up Cassandra on Mesos, schedule regular maintenance tasks, and manage hardware failures in the heart of your data center.
Scylla Summit 2022: What’s New in ScyllaDB Operator for Kubernetes - ScyllaDB
This document summarizes the Scylla Operator for Kubernetes, including its developers, features, releases, and roadmap. Key points include:
- The Scylla Operator manages and automates tasks for Scylla clusters on Kubernetes.
- Features include seedless mode, security enhancements, performance tuning, and improved stability.
- It follows a rapid 6-week release cycle and supports the latest two releases.
- Future plans include additional performance optimizations, persistent storage support, TLS encryption, and multi-datacenter capabilities.
Since its inception, Scylla has offered a compelling alternative to Apache Cassandra, providing better performance for a lower cost of ownership.
With Scylla Open Source 4.0 we continue to extend our CQL interface features and capabilities and also now provide an open source alternative to DynamoDB, allowing you to run your workloads anywhere, on any cloud provider, or on premises.
Join ScyllaDB co-founders, CTO Avi Kivity and CEO Dor Laor, for a look at the new features in Scylla Open Source 4.0, and architectural and cost comparisons with the coming Cassandra 4.0.
Topics will include:
Improved consistency with our new Lightweight Transactions
Scylla Operator for Kubernetes
How we stack up against Apache Cassandra 4.0
Our “run anywhere” DynamoDB alternative
Seastar is a framework for disk, network, compute, and multicore intensive applications such as databases and filesystems. It treats multicore CPUs and disk I/O as asynchronous entities like networking, replacing locks with message passing. This provides benefits like high throughput, low latency, and control over where throughput and latency occur. The keynote discussed Seastar's approach to scheduling, opportunities around coroutines, and its goals for modules, stream revamping, and task co-execution. Compatibility policies were outlined emphasizing community involvement in supported compilers, APIs, and architectures.
Scylla on Kubernetes: Introducing the Scylla Operator - ScyllaDB
The document introduces the Scylla Operator for Kubernetes, which provides a management layer for Scylla on Kubernetes. It addresses some limitations of using StatefulSets alone to run Scylla, such as safe scale down operations and tracking member identity. The operator implements the controller pattern with custom resources to deploy and manage Scylla clusters on Kubernetes. It handles tasks like cluster creation and scale up/down while addressing issues like local storage failures.
Renegotiating the boundary between database latency and consistency - ScyllaDB
With the increasing complexity of modern distributed systems, concerns around latency, availability, and consistency have become almost 'universal'. In response, a new generation of distributed databases is taking over: databases capable of harnessing the power and capabilities of the multi-cloud ecosystem. This new generation of distributed databases is challenging many of the traditional tradeoffs between relational and non-relational models.
This webinar will explore the technologies and trends behind this new generation of distributed databases, then take a technical deep dive into one example: the open source non-relational database ScyllaDB. ScyllaDB was built specifically for extreme low latencies, but has recently increased consistency by implementing the Raft consensus protocol. Engineers will share how they are implementing a low-latency architecture, and how strongly consistent topology and schema changes enable highly reliable and safe systems, without sacrificing low-latency characteristics.
Patience with Apache Cassandra’s volatile latencies was wearing thin at Rakuten, a global online retailer serving 1.5B worldwide members. The Rakuten Catalog Platform team architected an advanced data platform – with Cassandra at its core – to normalize, validate, transform, and store product data for their global operations. However, while the business was expecting this platform to support extreme growth with exceptional end-user experiences, the team was battling Cassandra’s instability, inconsistent performance at scale, and maintenance overhead. So, they decided to migrate.
Join this webinar to hear a firsthand account of:
How specific Cassandra challenges were impacting the team and their product
How they determined whether migration would be worth the effort
What processes they used to evaluate alternative databases
What their migration required from a technical perspective
Strategies (and lessons learned) for your own database migration
Scylla: 1 Million CQL operations per second per server - Avi Kivity
My Cassandra Summit 2015 presentation introducing Scylla, an open source NoSQL implementation compatible with Apache Cassandra, but 10 times faster.
De-animated
http://scylladb.com
Scylla Summit 2016: Keynote - Big Data Goes Native - ScyllaDB
This document discusses Scylla, a new database that aims to improve upon existing databases. It notes several key differences in Scylla's architecture that allow it to be faster and more scalable than other databases, including its use of techniques like log-structured merge trees, lock-free design, and asynchronous programming. The document also outlines Scylla's value proposition as the fastest database with the best high availability and ease of management compared to other options.
Seastar is a C++ asynchronous programming framework that allows for multi-domain async programming across networking, storage I/O, and multi-core communications. It uses an event-driven model where each logical core runs a task scheduler independently. Logical cores communicate through queues. Seastar is applicable for workloads with high I/O to compute ratios, high concurrency needs, and distributed applications. It provides futures/promises abstractions and rich APIs for tasks like HTTP servers, RPC, and distributed databases.
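The per-core scheduler and queue-based communication described above can be sketched in a few lines. Seastar itself is a C++ framework; this Python `asyncio` version is only an analogy, with each task playing the role of one logical core. The names `Shard`, `shard_for`, `submit`, and `demo` are hypothetical and are not part of Seastar's API.

```python
import asyncio
import zlib


class Shard:
    """Stands in for one logical core: it alone touches its own data."""

    def __init__(self, shard_id):
        self.shard_id = shard_id
        self.data = {}                  # never shared, so no locks needed
        self.inbox = asyncio.Queue()    # other shards talk to us via messages

    async def run(self):
        while True:
            op, key, value, reply = await self.inbox.get()
            if op == "stop":
                reply.set_result(None)
                return
            if op == "put":
                self.data[key] = value
                reply.set_result(True)
            elif op == "get":
                reply.set_result(self.data.get(key))


def shard_for(key, n_shards):
    # Deterministic hash, so a given key is always owned by the same shard.
    return zlib.crc32(key.encode()) % n_shards


async def submit(shards, op, key, value=None):
    """Send a message to the owning shard and await the reply future."""
    reply = asyncio.get_running_loop().create_future()
    await shards[shard_for(key, len(shards))].inbox.put((op, key, value, reply))
    return await reply


async def demo(n_shards=4):
    shards = [Shard(i) for i in range(n_shards)]
    tasks = [asyncio.create_task(s.run()) for s in shards]
    await submit(shards, "put", "user:1", "alice")
    result = await submit(shards, "get", "user:1")
    for s in shards:                    # shut every shard down cleanly
        done = asyncio.get_running_loop().create_future()
        await s.inbox.put(("stop", "", None, done))
        await done
    await asyncio.gather(*tasks)
    return result
```

Running `asyncio.run(demo())` stores a value on the shard that owns the key and reads it back through the same message path. Because each key has exactly one owner, no shard ever needs a lock around its data.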
The document discusses Samsung SDS America's biometric authentication solution. It offers several benefits: eliminating password management to save time and costs; ensuring secure storage of biometric credentials on devices rather than servers; compatibility with FIDO Alliance security standards; and minimizing testing needs through one-time integration. Biometric authentication using methods like voice or iris scans provides a convenient, reliable and secure alternative to passwords for accessing corporate data on devices. It is compliant with various technical standards and works across multiple device platforms.
WTF: Why Do Banner Ads Still Exist On Mobile? - AppNexus
Banner ads continue to dominate the mobile advertising format despite widespread criticism because:
1) The open mobile advertising supply chain has been slow to support new formats due to its complexity.
2) Closed ecosystems found in companies like Facebook are more flexible, but publishers and marketers must balance that flexibility against the risk of relying too heavily on any single company.
3) Progress is being made towards native advertising but it still only accounts for around 25% of display ad spending according to projections.
This document discusses rich media banners and their interactivity. It begins by defining different levels of interactivity, from user-to-user to user-to-system. It then provides guidelines for rich media creatives, defining them as advertisements that users can interact with. Examples of rich media formats are provided, like expandable banners, video ads, and floating ads. The document also reviews IAB ad size and animation length recommendations.
This document provides information about Apple Inc. and Samsung Electronics Co., Ltd., including:
1) Company profiles that describe each company's products, internationalization strategies, and competitive advantages.
2) Overviews of the socio-political, economic, and financial environments in India, the United States, Australia, and South Korea - four key countries for the companies' international operations.
3) Details on Apple and Samsung's specific operations and sources of risk in those selected countries.
Maximize the benefits of digital advertising media by analyzing customers' reaction and quantitatively measuring ad effectiveness, and efficiently deliver advertising messages.
In the world of ecommerce, 2015 was a turning point in terms of mobile’s place in the affiliate shopping experience.
More than a device on which to research or browse, smartphones and tablets last year triumphed as sales portals -- mobile sales now represent nearly 1 in 3 sales that occur in the CJ Affiliate global network. Near the end of 2016, sales on mobile devices for some categories will likely account for 40% of global affiliate sales.
Big Data Day LA 2016/ Data Science Track - Decision Making and Lambda Archite... - Data Con LA
This document discusses decision making systems and the lambda architecture. It introduces decision making algorithms like multi-armed bandits that balance exploration vs exploitation. Contextual multi-armed bandits are discussed as well. The lambda architecture is then described as having serving, speed, and batch layers to enable low latency queries, real-time updates, and batch model training. The software stack of Kafka, Spark/Spark Streaming, HBase and MLLib is presented as enabling scalable stream processing and machine learning.
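The exploration vs. exploitation balance mentioned above can be made concrete with one of the simplest bandit strategies, epsilon-greedy. The summary does not say which algorithm the talk used, so this is a generic sketch; the class name, the `simulate` helper, and the Bernoulli reward model are all assumptions made for illustration.

```python
import random


class EpsilonGreedyBandit:
    """With probability epsilon explore a random arm, otherwise exploit."""

    def __init__(self, n_arms, epsilon=0.1, seed=None):
        self.epsilon = epsilon
        self.counts = [0] * n_arms       # pulls per arm
        self.values = [0.0] * n_arms     # running mean reward per arm
        self.rng = random.Random(seed)

    def select_arm(self):
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.counts))        # explore
        # Exploit: arm with the highest estimated reward (first on ties).
        return max(range(len(self.values)), key=self.values.__getitem__)

    def update(self, arm, reward):
        self.counts[arm] += 1
        # Incremental mean: new = old + (reward - old) / n
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


def simulate(true_rates, steps=5000, epsilon=0.1, seed=0):
    """Run the bandit against arms that pay 1.0 with the given probabilities."""
    bandit = EpsilonGreedyBandit(len(true_rates), epsilon, seed)
    env = random.Random(seed + 1)        # separate stream for the environment
    for _ in range(steps):
        arm = bandit.select_arm()
        reward = 1.0 if env.random() < true_rates[arm] else 0.0
        bandit.update(arm, reward)
    return bandit
```

A contextual bandit would additionally condition `select_arm` on features of the current request, which this sketch omits.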
My talk on data center futures at Samsung SDS, given at Techtonic Summit in NYC. CoreOS has a nicely produced full video of the talk with the slides shown, along with the other talks, which were also very informative.
TLDR: A single management layer providing large shared clusters is the only way to approach Google / Amazon levels of efficiency. There are a limited number of open source options, and I discuss why we chose Kubernetes and a container-centric foundation.
(SDD407) Amazon DynamoDB: Data Modeling and Scaling Best Practices | AWS re:I... - Amazon Web Services
Amazon DynamoDB is a fully managed, highly scalable distributed database service. In this technical talk, we show you how to use DynamoDB to build high-scale applications like social gaming, chat, and voting. We show you how to use building blocks such as secondary indexes, conditional writes, consistent reads, and batch operations to build the higher-level functionality such as multi-item atomic writes and join queries. We also discuss best practices such as index projections, item sharding, and parallel scan for maximum scalability.
Samsung is a South Korean multinational electronics company founded in 1938. It has annual revenue over $305 billion and employs 489,000 people globally. Samsung operates in 80 countries through 15 regional headquarters and has diverse business areas including consumer electronics, IT, mobile communications, and semiconductor manufacturing. It has a strong focus on innovation through its $9 billion annual R&D budget and 34 R&D centers worldwide. Samsung holds the top market share position for LCD screens and mobile phones. It faces challenges from short product lifecycles and aggressive Chinese competitors, but maintains its leading position through localized marketing, premium pricing, and vertical integration across manufacturing and supply chain.
This document provides an overview of Sybase's global IT infrastructure, including:
- 5 data centers located across the US, UK and Hong Kong supporting over 3100 physical servers and 3500 VMs
- 1.5 petabytes of storage from various vendors and 120TB of backups per week
- Metrics on help desk tickets, messaging/mobility users, Access Anywhere/VDI instances
- Organization of infrastructure teams into Networks, Client Infrastructure, Data Centers, Applications groups
- Use of operations teams in India for lower-cost support
Brk3288 sql server v.next with support on linux, windows and containers was... - Bob Ward
This document discusses Microsoft's plans to deliver SQL Server on Linux and other heterogeneous environments. Key points include:
- SQL Server will be available on Linux, Windows, and Docker containers, allowing choice of operating system. It will support multiple languages and tools.
- Microsoft is delivering more options in response to businesses adopting heterogeneous environments with various data types, languages, and platforms.
- The document outlines SQL Server's capabilities on Linux such as high availability, security, and tools/drivers available now or in development.
Managing and Deploying High Performance Computing Clusters using Windows HPC ... - Saptak Sen
The new management features built into Windows HPC Server 2008 R2 are the foundation for deploying and managing HPC clusters of scale up to 1000 nodes. Join us for a deep dive in monitoring and diagnostic tools, a review of the updated heat-map and template-based deployment. We also cover the new PowerShell-based scripting capabilities: the basics of management shell, as well as the underlying design and key concepts, new Reporting Capabilities, and a discussion on network boot.
Executing hundreds or thousands of process instances per second? Yes, it's possible. This webinar is about best practices for high-load situations, and how to scale Camunda BPM horizontally.
SQL Server 2017 will bring SQL Server to Linux for the first time. This presentation covers the scope, schedule, and architecture as well as a background on why Microsoft is making SQL Server available on Linux.
The document discusses using Dell EMC Isilon all-flash storage for SAS GRID workloads. It describes a test of the Isilon F810 node with hardware-accelerated compression using a multi-user SAS analytics workload. The testing focused on performance, scalability, compression benefits, deduplication savings, and cost when running the workload on an Isilon cluster with up to 12 grid nodes and comparing results with and without enabling various compression options.
Event-Driven Architecture Masterclass: Engineering a Robust, High-performance... - ScyllaDB
Discover how to avoid common pitfalls when shifting to an event-driven architecture (EDA) in order to boost system recovery and scalability. We cover Kafka Schema Registry, in-broker transformations, event sourcing, and more.
Tobiasz Janusz Koprowski presented a beginner's guide to tips and tricks for using Windows Azure SQL Database. The presentation covered key Azure SQL Database concepts like database tiers, performance levels measured in Database Transaction Units (DTUs), data migration options, and compatibility with on-premises SQL Server versions. It provided an overview of supported and non-supported features between SQL Azure and different SQL Server versions. The presentation aimed to help attendees understand how to plan, configure and manage databases in the Azure SQL Database platform.
SQL Server v.Next will be released for Linux in 2017. The summary provides an overview of the key points about SQL Server on Linux including:
- SQL Server will have the same functionality and capabilities on Linux as on Windows. It will support the same editions and features such as high availability, security, and programming features.
- The architecture involves a SQL Platform Abstraction Layer that maps Windows APIs to Linux system calls to provide a consistent programming model.
- An early adoption program is currently underway to get feedback from customers and partners on functionality and to help validate SQL Server on Linux prior to general availability in 2017.
Horizontal Scaling for Millions of Customers! elangovans
This document provides an overview of Elangovan Shanmugam's experience and expertise in software architecture. Some key points:
- Elangovan has over 25 years of experience in software development and has designed resilient systems that can handle millions of customers and transactions per second.
- He discusses his work on Tax products that can import documents in under 2 seconds for 45 million filers, and his role as Chief Architect for Mint which serves 35 million customers processing billions of transactions daily.
- The document outlines Elangovan's approach to software architecture including strategies for microservices, scalability, high availability, and application architecture for multiple platforms and millions of users.
The document discusses troubleshooting performance issues for SQL Server. It begins with an introduction and case study on the MS Society of Canada's website. It then discusses optimizing the environment, using Performance Monitor (PerfMon) to monitor performance, and concludes with recommendations to address issues like high CPU usage, slow disk speeds, and insufficient memory.
The document provides a summary of Ashutosh Pandey's experience as an Oracle Database professional with over 10 years of experience. He has extensive skills in database administration, performance tuning, high availability solutions, database backups and recovery, and data replication technologies. Some of the key projects listed in his experience include Oracle 11g RAC implementations, database upgrades, data migrations involving large databases, and GoldenGate setups for active-active replication.
Migrate or modernize your database applications using Azure SQL Database Mana... - ALI ANWAR, OCP®
Data Platform Summit 2019 is a community initiative by eDominer Systems. The agenda included presentations on Azure SQL Database Managed Instance, migration to the cloud with Azure SQL Database, and a demo. Azure SQL Database Managed Instance provides fully managed SQL Server instances in Azure with built-in intelligence and security. It offers several options for migrating SQL Server workloads to the cloud.
Design Like a Pro: How to Pick the Right System Architecture - Inductive Automation
Whether your automation project has only a few tags or hundreds of thousands of tags, you need to make sure that it will work properly now and that it has enough room to grow in the future. Having the right architecture and server sizes is absolutely essential in reaching this goal.
This resume is for Pramod Singh, who has 5 years of experience as an MSSQL DBA. He has worked at Infosys Limited from 2010-2014 and currently works at L&T InfoTech. He has skills in SQL Server, IIS, cluster management, and Azure. He has experience with projects at Microsoft and Otis Elevators involving SQL Server, databases, and ERP systems. His responsibilities include database administration, maintenance, backups, performance monitoring and incident management. He has certifications from Microsoft and has attended Oracle and SQL Server trainings.
PASS Summit - SQL Server 2017 Deep Dive - Travis Wright
Deep dive into SQL Server 2017 covering SQL Server on Linux, containers, HA improvements, SQL graph, machine learning, python, adaptive query processing, and much much more.
Windows Azure SQL Database for Beginners (tips & tricks)
The document provides an overview and introduction to Windows Azure SQL Database including:
- Key features such as scalability, availability, data protection, and programmatic DBA functionality.
- Performance levels are described in DTU (database transaction units) with different tiers for Basic, Standard, and Premium databases.
- Limitations are discussed around database sizing, collations, logins/users, and compatibility with on-premises SQL Server features.
Similar to Scylla Summit 2016: Scylla at Samsung SDS (20)
Using ScyllaDB for Real-Time Write-Heavy Workloads - ScyllaDB
Keeping latencies low for highly concurrent, intensive data ingestion
ScyllaDB’s “sweet spot” is workloads over 50K operations per second that require predictably low (e.g., single-digit millisecond) latency. Its unique architecture makes it particularly valuable for real-time write-heavy workloads such as those commonly found in IoT, logging systems, real-time analytics, and order processing.
Join ScyllaDB technical director Felipe Cardeneti Mendes and principal field engineer Lubos Kosco to learn about:
- Common challenges that arise with real-time write-heavy workloads
- The tradeoffs teams face and tips for negotiating them
- ScyllaDB architectural elements that support real-time write-heavy workloads
- How your peers are using ScyllaDB with similar workloads
Unconventional Methods to Identify Bottlenecks in Low-Latency and High-Throug... - ScyllaDB
In this presentation, we explore how standard profiling and monitoring methods may fall short in identifying bottlenecks in low-latency data ingestion workflows. Instead, we showcase the power of simple yet clever methods that can uncover hidden performance limitations.
Attendees will discover unconventional techniques, including clever logging, targeted instrumentation, and specialized metrics, to pinpoint bottlenecks accurately. Real-world use cases will be presented to demonstrate the effectiveness of these methods. By the end of the session, attendees will be equipped with alternative approaches to identify bottlenecks and optimize their low-latency data ingestion workflows for high throughput.
Mitigating the Impact of State Management in Cloud Stream Processing Systems - ScyllaDB
Stream processing is a crucial component of modern data infrastructure, but constructing an efficient and scalable stream processing system can be challenging. Decoupling compute and storage architecture has emerged as an effective solution to these challenges, but it can introduce high latency issues, especially when dealing with complex continuous queries that necessitate managing extra-large internal states.
In this talk, we focus on addressing the high latency issues associated with S3 storage in stream processing systems that employ a decoupled compute and storage architecture. We delve into the root causes of latency in this context and explore various techniques to minimize the impact of S3 latency on stream processing performance. Our proposed approach is to implement a tiered storage mechanism that leverages a blend of high-performance and low-cost storage tiers to reduce data movement between the compute and storage layers while maintaining efficient processing.
Throughout the talk, we will present experimental results that demonstrate the effectiveness of our approach in mitigating the impact of S3 latency on stream processing. By the end of the talk, attendees will have gained insights into how to optimize their stream processing systems for reduced latency and improved cost-efficiency.
Measuring the Impact of Network Latency at Twitter - ScyllaDB
Widya Salim and Victor Ma will outline the causal impact analysis, framework, and key learnings used to quantify the impact of reducing Twitter's network latency.
Architecting a High-Performance (Open Source) Distributed Message Queuing Sys... - ScyllaDB
BlazingMQ is a new open source* distributed message queuing system developed at and published by Bloomberg. It provides highly-performant queues to applications for asynchronous, efficient, and reliable communication. This system has been used at scale at Bloomberg for eight years, where it moves terabytes of data and billions of messages across tens of thousands of queues in production every day.
BlazingMQ provides highly-available, fault-tolerant queues courtesy of replication based on the Raft consensus algorithm. In addition, it provides a rich set of enterprise message routing strategies, enabling users to implement a variety of scenarios for message processing.
Written in C++ from the ground up, BlazingMQ has been architected with low latency as one of its core requirements. This has resulted in some unique design and implementation choices at all levels of the system, such as its lock-free threading model, custom memory allocators, compact wire protocol, multi-hop network topology, and more.
This talk will provide an overview of BlazingMQ. We will then delve into the system’s core design principles, architecture, and implementation details in order to explore the crucial role they play in its performance and reliability.
*BlazingMQ will be released as open source between now and P99 (exact timing is still TBD)
Noise Canceling RUM by Tim Vereecke, AkamaiScyllaDB
Noisy Real User Monitoring (RUM) data can ruin your P99!
We introduce a fresh concept called "Human Visible Navigations" (HVN) to tackle this risk. It focuses on the experiences you actually care about when talking about the speed of your sites:
- Human: We exclude noise coming from bots and synthetic measurements.
- Visible: We remove any partial or fully hidden experiences. These tend to be very slow but users don’t see this slowness.
- Navigations: We ignore lightning fast back-forward navigations which usually have few optimisation opportunities.
Adopting Human Visible Navigations provides you with these key benefits:
- Fewer changes staying below the radar
- Fewer data fluctuations
- Fewer blindspots when finding bottlenecks
- Better correlation with business metrics
This is supported by plenty of real world examples coming from the world's largest scale modeling site (6M Monthly visits) in combination with aggregated data from the brand new rumarchive.com (open source)
After attending this session, your P99 and other percentiles will become less noisy and easier to tune!
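The three HVN rules listed above can be read as a simple filter over RUM beacons. The following is a minimal sketch under that reading; the `Beacon` type and its field names are hypothetical and do not come from any particular RUM product.

```python
from dataclasses import dataclass


@dataclass
class Beacon:
    """Hypothetical RUM beacon; the field names are assumptions for illustration."""
    is_bot_or_synthetic: bool   # bot traffic or synthetic measurement
    was_hidden: bool            # page was partially or fully hidden while loading
    navigation_type: str        # e.g. "navigate", "reload", "back_forward"
    load_time_ms: float


def is_human_visible_navigation(beacon: Beacon) -> bool:
    # Human: drop bots and synthetic tests. Visible: drop hidden page loads.
    # Navigations: drop back-forward navigations.
    return (
        not beacon.is_bot_or_synthetic
        and not beacon.was_hidden
        and beacon.navigation_type != "back_forward"
    )


def hvn_load_times(beacons):
    return [b.load_time_ms for b in beacons if is_human_visible_navigation(b)]


beacons = [
    Beacon(False, False, "navigate", 1200.0),
    Beacon(True, False, "navigate", 300.0),       # bot
    Beacon(False, True, "navigate", 9000.0),      # hidden tab
    Beacon(False, False, "back_forward", 40.0),   # back-forward navigation
    Beacon(False, False, "reload", 1500.0),
]
print(hvn_load_times(beacons))
```

Percentiles would then be computed over the filtered load times only.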
Always-on Profiling of All Linux Threads, On-CPU and Off-CPU, with eBPF & Con...ScyllaDB
In this session, Tanel introduces a new open source eBPF tool for efficiently sampling both on-CPU and off-CPU events for every thread (task) in the OS. Standard Linux performance tools (like perf) allow you to easily profile on-CPU threads doing work, but if we want to include the off-CPU timing and reasons for the full picture, things get complicated. Combining eBPF task state arrays with periodic sampling allows us to get a system-level overview of where threads spend their time, even when blocked and sleeping, and to drill down to the individual thread level to understand why.
Performance Budgets for the Real World by Tammy EvertsScyllaDB
Performance budgets have been around for more than ten years. Over those years, we’ve learned a lot about what works, what doesn’t, and what we need to improve. In this session, Tammy revisits old assumptions about performance budgets and offers some new best practices. Topics include:
• Understanding performance budgets vs. performance goals
• Aligning budgets with user experience
• Pros and cons of Core Web Vitals
• How to stay on top of your budgets to fight regressions
Using Libtracecmd to Analyze Your Latency and Performance TroublesScyllaDB
Trying to figure out why your application is responding late can be difficult, especially if it is because of interference from the operating system. This talk will briefly go over how to write a C program that can analyze what in the Linux system is interfering with your application. It will use trace-cmd to enable kernel trace events as well as tracing lock functions, and it will then go over a quick tutorial on how to use libtracecmd to read the created trace.dat file to uncover the cause of interference to your application.
Reducing P99 Latencies with Generational ZGCScyllaDB
With the low-latency garbage collector ZGC, GC pause times are no longer a big problem in Java. With sub-millisecond pause times, there are instead other things in the GC and JVM that can cause application threads to experience unexpected latencies. This talk will dig into a specific use case where the GC pauses are no longer the cause of unexpected latencies and look at how adding generations to ZGC helps lower the p99 application latencies.
5 Hours to 7.7 Seconds: How Database Tricks Sped up Rust Linting Over 2000XScyllaDB
Linters are a type of database! They are a collection of lint rules — queries that look for rule violations to report — plus a way to execute those queries over a source code dataset.
This is a case study about using database ideas to build a linter that looks for breaking changes in Rust library APIs. Maintainability and performance are key: new Rust releases tend to have mutually-incompatible ways of representing API information, and we cannot afford to reimplement and optimize dozens of rules for each Rust version separately. Fortunately, databases don't require rewriting queries when the underlying storage format or query plan changes! This allows us to ship massive optimizations and support multiple Rust versions without making any changes to the queries that describe lint rules.
"Ship now, optimize later" can be a sustainable development practice after all. Join us to see how!
How Netflix Builds High Performance Applications at Global ScaleScyllaDB
We all want to build applications that are blazingly fast. We also want to scale them to users all over the world. Can the two happen together? Can users in the slowest of environments also get a fast experience? Learn how we do this at Netflix: how we understand every user's needs and preferences and build high performance applications that work for every user, every time.
Conquering Load Balancing: Experiences from ScyllaDB DriversScyllaDB
Load balancing seems simple on the surface, with algorithms like round-robin, but the real world loves throwing curveballs. Join me in this session as we delve into the intricacies of load balancing within ScyllaDB Drivers. Discover firsthand experiences from our journey in driver development, where we employed the Power of Two Choices algorithm, optimized the implementation of load balancing in Rust Driver, mitigated cloud costs through zone-aware load balancing and combated the issue of overloading a particular core of ScyllaDB. Be prepared to delve into the practical and theoretical aspects of load balancing, gaining valuable insights along the way.
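As a rough illustration of the Power of Two Choices algorithm mentioned above, here is a minimal sketch: pick two candidate nodes at random and send the request to the one with fewer requests in flight. The node names and the `in_flight` bookkeeping are assumptions for illustration; this is not the ScyllaDB drivers' actual implementation.

```python
import random


def pick_node(in_flight, rng=random):
    """Sample two distinct nodes and return the one with fewer in-flight requests."""
    first, second = rng.sample(list(in_flight), 2)
    return first if in_flight[first] <= in_flight[second] else second


# Hypothetical cluster: node name -> number of requests currently in flight.
in_flight = {"node-a": 0, "node-b": 0, "node-c": 0, "node-d": 0}
rng = random.Random(42)
for _ in range(1000):
    in_flight[pick_node(in_flight, rng)] += 1
print(in_flight)
```

Compared with picking a single node at random, comparing just two candidates keeps the load spread much tighter while avoiding a scan over every node on each request.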
Interaction Latency: Square's User-Centric Mobile Performance MetricScyllaDB
Mobile performance metrics often take inspiration from the backend world and measure resource usage (CPU usage, memory usage, etc) and workload durations (how long a piece of code takes to run).
However, mobile apps are used by humans and the app performance directly impacts their experience, so we should primarily track user-centric mobile performance metrics. Following the lead of tech giants, the mobile industry at large is now adopting the tracking of app launch time and smoothness (jank during motion).
At Square, our customers spend most of their time in the app long after it's launched, and they don't scroll much, so app launch time and smoothness aren't critical metrics. What should we track instead?
This talk will introduce you to Interaction Latency, a user-centric mobile performance metric inspired by the Web Vital metric "Interaction to Next Paint" (web.dev/inp). We'll go over why apps need to track this, how to properly implement its tracking (it's tricky!), how to aggregate this metric, and what thresholds you should target.
How to Avoid Learning the Linux-Kernel Memory ModelScyllaDB
The Linux-kernel memory model (LKMM) is a powerful tool for developing highly concurrent Linux-kernel code, but it also has a steep learning curve. Wouldn't it be great to get most of LKMM's benefits without the learning curve?
This talk will describe how to do exactly that by using the standard Linux-kernel APIs (locking, reference counting, RCU) along with some simple rules of thumb, thus gaining most of LKMM's power with less learning. And the full LKMM is always there when you need it!
99.99% of Your Traces are Trash by Paige CruzScyllaDB
Distributed tracing is still finding its footing in many organizations today. One challenge to overcome is the data volume: keeping 100% of your traces is expensive and unnecessary. Enter sampling. Head vs. tail: how do you decide? Let’s look at the design of Sifter and get familiar with why tail-based sampling is the way to enact a cost-effective tracing solution while actually increasing the system’s observability.
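To show what "tail-based" means in practice, here is a minimal sketch of a generic tail-based sampling rule: the keep/drop decision is made only after the whole trace has finished, so it can depend on errors and latency. This is not Sifter's algorithm; the span fields, the threshold, and the baseline rate are hypothetical.

```python
import random


def keep_trace(spans, slow_ms=500.0, baseline_rate=0.01, rng=random):
    """Decide whether to keep a finished trace (illustrative rule, not Sifter's)."""
    if any(span["error"] for span in spans):
        return True  # always keep traces that contain an error
    if max(span["duration_ms"] for span in spans) >= slow_ms:
        return True  # always keep slow traces
    return rng.random() < baseline_rate  # keep a small random share of the rest


failed = [{"duration_ms": 12.0, "error": True}]
slow = [{"duration_ms": 900.0, "error": False}, {"duration_ms": 30.0, "error": False}]
print(keep_trace(failed), keep_trace(slow))
```

A head-based sampler would have to make the same decision when the trace starts, before it knows whether the request will fail or be slow.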
Square's Lessons Learned from Implementing a Key-Value Store with RaftScyllaDB
To put it simply, Raft is used to make a use case (e.g., key-value store, indexing system) more fault tolerant to increase availability using replication (despite server and network failures). Raft has been gaining ground due to its simplicity without sacrificing consistency and performance.
Although we'll cover Raft's building blocks, this is not about the Raft algorithm; it is more about the micro-lessons one can learn from building fault-tolerant, strongly consistent distributed systems using Raft. Things like majority agreement rule (quorum), write-ahead log, split votes & randomness to reduce contention, heartbeats, split-brain syndrome, snapshots & logs replay, client requests dedupe & idempotency, consistency guarantees (linearizability), leases & stale reads, batching & streaming, parallelizing persisting & broadcasting, version control, and more!
And believe it or not, you might be using some of these techniques without even realizing it!
This is inspired by the Raft paper (raft.github.io), publications & courses on Raft, and an attempt to implement a key-value store using Raft as a side project.
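One of the micro-lessons listed above, client request dedupe & idempotency, can be sketched as follows. The sketch assumes each client tags its requests with an increasing sequence number and has at most one request outstanding; the class and field names are hypothetical and are not taken from the talk.

```python
class KVStateMachine:
    """Hypothetical key-value state machine that applies each client request at most once."""

    def __init__(self):
        self.data = {}
        self.last_applied = {}  # client_id -> (seq, result) of the latest applied request

    def apply(self, client_id, seq, key, value):
        last = self.last_applied.get(client_id)
        if last is not None and seq <= last[0]:
            return last[1]  # duplicate (e.g. a client retry): return the cached result
        self.data[key] = value
        result = ("ok", key, value)
        self.last_applied[client_id] = (seq, result)
        return result


sm = KVStateMachine()
sm.apply("client-1", 1, "x", 1)
sm.apply("client-2", 1, "x", 5)
sm.apply("client-1", 1, "x", 1)  # retry of an already applied request: ignored
print(sm.data)
```

Without the dedupe table, the retried write from `client-1` would overwrite the newer value written by `client-2`.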
A Deep Dive Into Concurrent React by Matheus AlbuquerqueScyllaDB
Writing fluid user interfaces becomes more and more challenging as the application complexity increases. In this talk, we’ll explore how proper scheduling improves your app’s experience by diving into some of the concurrent React features, understanding their rationales, and how they work under the hood.
Generative AI technology is a fascinating field that focuses on creating comp...Nohoax Kanont
Generative AI technology is a fascinating field that focuses on creating computer models capable of generating new, original content. It leverages the power of large language models, neural networks, and machine learning to produce content that can mimic human creativity. This technology has seen a surge in innovation and adoption since the introduction of ChatGPT in 2022, leading to significant productivity benefits across various industries. With its ability to generate text, images, video, and audio, generative AI is transforming how we interact with technology and the types of tasks that can be automated.
The Zaitechno Handheld Raman Spectrometer is a powerful and portable tool for rapid, non-destructive chemical analysis. It utilizes Raman spectroscopy, a technique that analyzes the vibrational fingerprint of molecules to identify their chemical composition. This handheld instrument allows for on-site analysis of materials, making it ideal for a variety of applications, including:
Material identification: Identify unknown materials, minerals, and contaminants.
Quality control: Ensure the quality and consistency of raw materials and finished products.
Pharmaceutical analysis: Verify the identity and purity of pharmaceutical compounds.
Food safety testing: Detect contaminants and adulterants in food products.
Field analysis: Analyze materials in the field, such as during environmental monitoring or forensic investigations.
The Zaitechno Handheld Raman Spectrometer is easy to use and features a user-friendly interface. It is compact and lightweight, making it ideal for field applications. With its rapid analysis capabilities, the Zaitechno Handheld Raman Spectrometer can help you improve efficiency and productivity in your research or quality control workflows.
The Challenge of Interpretability in Generative AI Models.pdfSara Kroft
Navigating the intricacies of generative AI models reveals a pressing challenge: interpretability. Our blog delves into the complexities of understanding how these advanced models make decisions, shedding light on the mechanisms behind their outputs. Explore the latest research, practical implications, and ethical considerations, as we unravel the opaque processes that drive generative AI. Join us in this insightful journey to demystify the black box of artificial intelligence.
Dive into the complexities of generative AI with our blog on interpretability. Find out why making AI models understandable is key to trust and ethical use and discover current efforts to tackle this big challenge.
Discovery Series - Zero to Hero - Task Mining Session 1DianaGray10
This session is focused on providing you with an introduction to task mining. We will go over different types of task mining and provide you with a real-world demo on each type of task mining in detail.
Finetuning GenAI For Hacking and DefendingPriyanka Aash
Generative AI, particularly through the lens of large language models (LLMs), represents a transformative leap in artificial intelligence. With advancements that have fundamentally altered our approach to AI, understanding and leveraging these technologies is crucial for innovators and practitioners alike. This comprehensive exploration delves into the intricacies of GenAI, from its foundational principles and historical evolution to its practical applications in security and beyond.
Redefining Cybersecurity with AI CapabilitiesPriyanka Aash
In this comprehensive overview of Cisco's latest innovations in cybersecurity, the focus is squarely on resilience and adaptation in the face of evolving threats. The discussion covers the imperative of tackling Mal information, the increasing sophistication of insider attacks, and the expanding attack surfaces in a hybrid work environment. Emphasizing a shift towards integrated platforms over fragmented tools, Cisco introduces its Security Cloud, designed to provide end-to-end visibility and robust protection across user interactions, cloud environments, and breaches. AI emerges as a pivotal tool, from enhancing user experiences to predicting and defending against cyber threats. The blog underscores Cisco's commitment to simplifying security stacks while ensuring efficacy and economic feasibility, making a compelling case for their platform approach in safeguarding digital landscapes.
It's your unstructured data: How to get your GenAI app to production (and spe...Zilliz
So you've successfully built a GenAI app POC for your company -- now comes the hard part: bringing it to production. Aparavi addresses the challenges of AI projects while addressing data privacy and PII. Our Service for RAG helps AI developers and data scientists to scale their app to 1000s to millions of users using corporate unstructured data. Aparavi’s AI Data Loader cleans, prepares and then loads only the relevant unstructured data for each AI project/app, enabling you to operationalize the creation of GenAI apps easily and accurately while giving you the time to focus on what you really want to do - building a great AI application with useful and relevant context. All within your environment and never having to share private corporate data with anyone - not even Aparavi.
"Hands-on development experience using wasm Blazor", Furdak Vladyslav.pptxFwdays
I will share my personal experience of full-time development on wasm Blazor
What difficulties our team faced: life hacks with Blazor app routing, whether it is necessary to write JavaScript, which technology stack and architectural patterns we chose
What conclusions we made and what mistakes we committed
TrustArc Webinar - Innovating with TRUSTe Responsible AI CertificationTrustArc
In a landmark year marked by significant AI advancements, it’s vital to prioritize transparency, accountability, and respect for privacy rights with your AI innovation.
Learn how to navigate the shifting AI landscape with our innovative solution TRUSTe Responsible AI Certification, the first AI certification designed for data protection and privacy. Crafted by a team with 10,000+ privacy certifications issued, this framework integrated industry standards and laws for responsible AI governance.
This webinar will review:
- How compliance can play a role in the development and deployment of AI systems
- How to model trust and transparency across products and services
- How to save time and work smarter in understanding regulatory obligations, including AI
- How to operationalize and deploy AI governance best practices in your organization
DefCamp_2016_Chemerkin_Yury-publish.pdf - Presentation by Yury Chemerkin at DefCamp 2016 discussing mobile app vulnerabilities, data protection issues, and analysis of security levels across different types of mobile applications.
Welcome to Cyberbiosecurity. Because regular cybersecurity wasn't complicated...Snarky Security
How wonderful it is that in our modern age, every bit of our biological data can be digitized, stored, and potentially pilfered by cyber thieves! Isn't it just splendid to think that while scientists are busy pushing the boundaries of biotechnology, hackers could be plotting the next big bio-data heist? This delightful scenario is brought to you by the ever-expanding digital landscape of biology and biotechnology, where the integration of computer science, engineering, and data science transforms our understanding and manipulation of biological systems.
While the fusion of technology and biology offers immense benefits, it also necessitates a careful consideration of the ethical, security, and associated social implications. But let's be honest, in the grand scheme of things, what's a little risk compared to potential scientific achievements? After all, progress in biotechnology waits for no one, and we're just along for the ride in this thrilling, slightly terrifying, adventure.
So, as we continue to navigate this complex landscape, let's not forget the importance of robust data protection measures and collaborative international efforts to safeguard sensitive biological information. After all, what could possibly go wrong?
-------------------------
This document provides a comprehensive analysis of the security implications of biological data use. The analysis explores various aspects of biological data security, including the vulnerabilities associated with data access, the potential for misuse by state and non-state actors, and the implications for national and transnational security. Key aspects considered include the impact of technological advancements on data security, the role of international policies in data governance, and the strategies for mitigating risks associated with unauthorized data access.
This view offers valuable insights for security professionals, policymakers, and industry leaders across various sectors, highlighting the importance of robust data protection measures and collaborative international efforts to safeguard sensitive biological information. The analysis serves as a crucial resource for understanding the complex dynamics at the intersection of biotechnology and security, providing actionable recommendations to enhance biosecurity in a digital and interconnected world.
The evolving landscape of biology and biotechnology, significantly influenced by advancements in computer science, engineering, and data science, is reshaping our understanding and manipulation of biological systems. The integration of these disciplines has led to the development of fields such as computational biology and synthetic biology, which utilize computational power and engineering principles to solve complex biological problems and innovate new biotechnological applications. This interdisciplinary approach has not only accelerated research and development but also introduced new capabilities such as gene editing and biomanufact
Retrieval Augmented Generation Evaluation with RagasZilliz
Retrieval Augmented Generation (RAG) enhances chatbots by incorporating custom data in the prompt. Using large language models (LLMs) as judge has gained prominence in modern RAG systems. This talk will demo Ragas, an open-source automation tool for RAG evaluations. Christy will talk about and demo evaluating a RAG pipeline using Milvus and RAG metrics like context F1-score and answer correctness.
"Building Future-Ready Apps with .NET 8 and Azure Serverless Ecosystem", Stan...Fwdays
.NET 8 brought a lot of improvements for developers and maturity to the Azure serverless container ecosystem. So, this talk will cover these changes and explain how you can apply them to your projects. Another reason for this talk is the re-invention of Serverless from a DevOps perspective as a Platform Engineering trend with Backstage and the recent Radius project from Microsoft. So now is the perfect time to look at developer productivity tooling and serverless apps from Microsoft's perspective.
Scylla Summit 2016: Scylla at Samsung SDS
1. ScyllaDB in Samsung SDS
Dror Gadot, Kuyul Noh
Samsung SDS
From PoC to contribution, and beyond
2. Agenda • Challenges & Solution
• Use Cases
• Technical Validation
• Future Plan
3. 3 / 23
Samsung SDS
IT Services / Business Solutions / Logistics BPO (2)
Consulting / SI (1)
Infrastructure Outsourcing
Application Outsourcing
Supply Chain & Logistics
(1) SI : Systems Integration
(2) BPO : Business Process Outsourcing
Enterprise Applications
Enterprise Analytics
Enterprise Mobility
• As an ‘IT Solution & Service Provider’, Samsung SDS plays a pivotal
role in improving IT competitiveness across the Samsung Group so that
its companies become top tier in multiple industries.
4. 4 / 23
Samsung SDS (2/2)
51 Global Offices in 30 countries
Global Presence
SDS China
Beijing, China
Global HQ
Seoul,
Korea
SDS Latin America
Sao Paulo, Brazil
SDS Asia Pacific
Singapore
SDS America
New Jersey, USA
SDS India
New Delhi, India
SDS Europe
Weybridge, UK
SDS Middle East
Dubai, UAE
Global Footprint
4 SW Centers
29 Logistics Offices
7 Subsidiaries
11 Data Centers
5. 5 / 23
Samsung SDS – Scylla
• Deep evaluation of ScyllaDB solution
• Prepare adoption of ScyllaDB
(improve performance & reduce cost of internal systems)
• Contribution to ScyllaDB code base
• Define additional collaboration schemes
6. 6 / 23
Challenges & Solution
• Challenges of a NoSQL
§ Performance dilemma
- To get higher performance, more servers must be added to a cluster.
- 100~200 nodes in a cluster? Performance/management issues
§ JVM limitation
- JVM-based applications have excellent portability.
- DBMS on JVM? Garbage Collection, Memory management issues
• Solution
§ No longer JVM-based
§ NUMA friendly new architecture
§ High Performance Network processing
7. Agenda • Challenges & Solution
• Use Cases
▸ IoT Platform
▸ Messenger Service
▸ Requirements
• Technical Validation
• Future Plan
8. 8 / 23
IoT Platform (1/2)
• An enterprise IoT platform that manages the entire lifecycle
of data to provide analytical insights for business operations.
[Diagram: Enterprise IoT Platform, Brightics™. Stages: Edge, Connect, Process, Analyze, Utilize, with E2E Security. Labels shown: Sensor, Device, PLC, Work Station, Operations Manager, Data Scientist, Enterprise System]
9. 9 / 23
IoT Platform (2/2)
[Diagram: pipeline stages Edge, Connect, Process, Analyze, Utilize. Labels shown: Sensor Device, Work Station, Video / Smart device, Predictive Analytics, Anomaly Detection, Visualization Tools, Enterprise System, Interface, Analytics Model, Hadoop Eco., In-Memory, IoT Connectivity, Edge Gateway, Data Connectivity, Batch Processing, Real-time Processing, Micro Service Execution, CEP, …, IoT Data Processing]
10. 10 / 23
Messenger Service (1/2)
• Square Messenger provides a communication service to
400,000 users optimized for business.
Real-time conversation with Mobile and Desktop
Always on Messenger Service
Collaboration up to 600 people
Follow existing conversations with chat history
Seamless
Communication
Collaboration
for
GroupChat
Advanced
Security
Message recall, Private conversation
Check the message read status
Screen Lock based on Password & fingerprint
Screen capture prevention
11. 11 / 23
Messenger Service (2/2)
[Diagram: messenger service architecture. Labels shown: Connect, Messaging, Utilize, Messaging Service, Messaging Interface, Push, Contact, Presence, Message Processing, Agent, External Service, Analytics, User Management, Desktop, Android, Authentication, Message Data Processing]
12. 12 / 23
Requirements
• Higher throughput and Lower latency
• Elastic Scalability
• Stability for 24 x 7 services
• Reduce # of Physical Servers
• Minimal code changes of existing application
13. Agenda • Challenges & Solution
• Use Cases
• Technical Validation
▸ Testing Environment
▸ Functional Test
▸ Non-Functional Test
• Future Plan
14. 14 / 23
Testing Environment
[Diagram: Node #1 to Node #6, each running Other and ScyllaDB, divided into base nodes and additional nodes for Scale-Out; Agent #1 to Agent #3, each running Cassandra-stress]
* Software
• OS: CentOS 7.2
• ScyllaDB: 1.0
• Other : 2.1.8
• Cassandra-stress: 2.1.8
※ Replication Factor : 3
* Hardware
• Model : Supermicro 6048R
• CPU : 16core
• Main Memory : 64GB
• NIC : 10GB * 4ea
• Disk : SSD 300GB (RAID 0)
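With a replication factor of 3, the number of replicas that must answer a request depends on the consistency level, which in turn bounds how many replica failures a request can survive. The sketch below works this out for the standard Cassandra-style levels ONE, QUORUM and ALL; the slides do not state which consistency levels were tested, so the selection here is an assumption.

```python
def required_acks(consistency_level, replication_factor):
    """Replicas that must respond for a request at the given consistency level."""
    levels = {
        "ONE": 1,
        "QUORUM": replication_factor // 2 + 1,  # strict majority of replicas
        "ALL": replication_factor,
    }
    return levels[consistency_level]


def tolerated_replica_failures(consistency_level, replication_factor):
    """Replicas of a partition that can be down while requests still succeed."""
    return replication_factor - required_acks(consistency_level, replication_factor)


RF = 3  # replication factor used in the testing environment
summary = {
    cl: (required_acks(cl, RF), tolerated_replica_failures(cl, RF))
    for cl in ("ONE", "QUORUM", "ALL")
}
print(summary)  # {'ONE': (1, 2), 'QUORUM': (2, 1), 'ALL': (3, 0)}
```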
15. 15 / 23
Testing Scenario
§ Has only 1 column,
but data size is varied.
[Data Schema]
§ Has always fixed column.
• Functional
§ Monitoring Tool
§ Data export/import
§ Backup/restore (snapshot)
§ Cassandra Compatibility
§ Client Connection (cqlsh, thrift)
§ Repair, Compaction, etc.
• Non-Functional
§ Performance
- By Scale-Out (3 → 4 → 5 → 6 nodes)
- By Consistency Level
- By workload, etc.
§ Availability
- Recovery after Seed node down
- Recovery after 2 nodes down, etc.
§ Scalability
- Add 1 node after Seed Node down
- Add 1~2 nodes under heavy stress, etc.
§ Stability
- Aging test for 5 days under heavy stress
Case 2Case 1
[Testing Items]
16. 16 / 23
Functional Test
Test items | Results: Other | Results: ScyllaDB | Remark
Monitoring (nodetool, UI etc.) | O | O | Supports Tessera, Riemann-dash UI (Docker Container)
Data migration (data file compatibility) | - | O | Fully compatible with other NoSQL (ver. 2.1.8)
Client connection (cqlsh, thrift) | O | O | Thrift is supported as of Ver. 1.3
Repair command | X | O | Other NoSQL: during manual repair, many time-outs occurred under heavy writing
cqlsh features | - | △ | Not supported features: Counter type, Secondary Index, Trigger (will be supported at 1.4)
Compaction features | O | O | Supports SizeTiered, Leveled, DateTiered types

Item | Other | Scylla
CQL data types | 8 | 7
Functions | 5 | 5
cqlsh commands | 11 | 11
CQL commands | 28 | 24
※ updated based on ScyllaDB Ver. 1.3 RC3 (‘16.8.18) (O: fully meet, △ : partially meet, X : don't meet)
• Most features work well; a few are under development
17. 17 / 23
Non-Functional Test - Performance(1/2)
(load: 3,000 threads)
[Chart: TPS and Latency (unit : ms), Other vs. ScyllaDB]
Case1-Read: TPS 194,144 / 776,283; Latency 5.4 / 15.5
Case1-Write: TPS 84,999 / 349,722; Latency 7.7 / 35.3
• Has 2~8 times higher performance
18. 18 / 23
Non-Functional Test - Performance(2/2)
[Chart: TPS and Latency (unit : ms), Other vs. ScyllaDB]
Case1- Read 70%, Write 30%: TPS 96,610 / 518,482; Latency 5.1 / 30.6
Case2- Read 50%, Write 50%: TPS 69,038 / 407,883; Latency 1.5 / 43.4
19. 19 / 23
Non-Functional Test – Availability/Scalability
• No issues on availability and scalability
[Availability]
Down Seed Node & Rejoin it to cluster (kill, restart)
⇒ The TPS decreased for 40~60 ms, and then recovered to the previous TPS
within 50~70 ms once the node rejoined.
[Scalability]
Add 2 nodes into cluster simultaneously
⇒ The TPS decreased when the 2 nodes were added,
and then increased to the expected TPS after 100 ms
in both cases.
[Charts: Other vs. ScyllaDB]
20. 20 / 23
Non-Functional Test– Stability
• Remained stable under heavy stress for 5 days
[Charts: TPS, Latency (unit : ms), CPU Usage]
ScyllaDB
• Average TPS : 113,879
• Average Latency : 1.37 ms
Other
• Average TPS : 16,249
• Average Latency : 25.2 ms
※ Data Schema : Case2, Transaction Type : Read 50%, Write 50%, Work Load : 400 threads
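For a sense of scale, the averages reported on this slide can be turned into ratios. This is plain arithmetic on the slide's four numbers; the variable names are only for illustration.

```python
# Five-day averages as reported on the stability slide.
scylla = {"tps": 113_879, "latency_ms": 1.37}
other = {"tps": 16_249, "latency_ms": 25.2}

tps_ratio = scylla["tps"] / other["tps"]                      # throughput advantage
latency_ratio = other["latency_ms"] / scylla["latency_ms"]    # latency advantage

print(f"throughput: {tps_ratio:.1f}x higher, latency: {latency_ratio:.1f}x lower")
# prints "throughput: 7.0x higher, latency: 18.4x lower"
```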
21. Agenda • Challenges & Solution
• Use Cases
• Technical Validation
• Future Plan
▸ Continuous engagement
▸ Contribution
22. 22 / 23
Continuous engagement
• Ver. 0.10 (Oct. 2015)
§ Feasibility test, Requirements discussion
• Ver. 0.17 (Feb.2016)
§ Functional/Performance Test
§ Report bugs (performance drop when new two nodes were added, Manual repair/compact time-out, etc.)
• Ver. 1.0 (Apr.2016)
§ Use cases based PoC
§ Report bugs (Large partition data insertion error, Major compaction error, etc.)
• Ver. 1.3 (Aug.2016)
§ New feature test
23. 23 / 23
Future Plan
• Applying to business cases
§ IoT data gathering, Message processing, etc.
§ Many new use cases
• Planning to develop additional enterprise features
§ Large cluster management, ScyllaDB as a Service, etc.
• Will contribute to community
§ Monitoring tool
§ Management tool