Hochverfügbarkeitslösungen mit MariaDB (High-Availability Solutions with MariaDB)

The document discusses high-availability solutions for MariaDB databases. It begins by defining high availability and related concepts such as Recovery Time Objective (RTO) and Recovery Point Objective (RPO). It then presents MariaDB and MaxScale architectures that provide high availability, including single-node, primary/replica, Galera Cluster, and SkySQL deployments. Key aspects covered are automatic failover, load balancing, data filtering, and service level agreements.
2. High Availability - HA

High availability (HA) is a characteristic of a system which aims to ensure an agreed level of operational performance, usually uptime, for a higher than normal period.

https://en.wikipedia.org/wiki/High_availability
3. RPO / RTO

Recovery Time Objective
The Recovery Time Objective (RTO) is the targeted duration of time and a service level within which a business process must be restored after a disruption in order to avoid a break in business continuity. According to business continuity planning methodology, the RTO is established during the Business Impact Analysis (BIA) by the owner(s) of the process, including identifying time frames for alternate or manual workarounds.

Recovery Point Objective
A Recovery Point Objective (RPO) is the maximum acceptable interval during which transactional data is lost from an IT service. For example, if the RPO is measured in minutes, then in practice off-site mirrored backups must be continuously maintained; a daily off-site backup will not suffice.

Diagram: https://upload.wikimedia.org/wikipedia/commons/6/69/RPO_RTO_example_converted.png
https://en.wikipedia.org/wiki/Disaster_recovery
5. Architecture - Single Node
[Diagram: Your Applications → MariaDB Primary (r/w)]
Single Node Setup
● No failover option
● Backup / Restore is key
● RPO / RTO define the SLA
6. Architecture - Primary / Replica Setup
[Diagram: Your Applications → MariaDB Primary (r/w) → MariaDB Replica]
Primary / Replica Node Setup
● “Manual” failover to the Replica
● Asynchronous Replication
● Semi-synchronous Replication
● “Passive” Hardware
● The manual failover process defines the SLA
● The backup process can run on the Replica
7. Architecture - Primary / Replica Setup
[Diagram: Your Applications → MariaDB Primary (r/w) → MariaDB Replica, MariaDB Replica]
Primary / Replica Node Setup
● “Manual” failover to a Replica
● Asynchronous Replication
● Semi-synchronous Replication
● Galera Cluster
● “Passive” Hardware
● The manual failover process defines the SLA
● The backup process can run on a Replica
8. Architecture for high Availability with MaxScale
[Diagram: Your Applications → MaxScale → MariaDB Primary (r/w), MariaDB Replica (r), MariaDB Replica (r)]
MariaDB MaxScale is an advanced SQL firewall, proxy, router, and load balancer:
● MaxScale performs automated failover for MariaDB replication.
● MaxScale's ReadWriteSplit router performs query-based load balancing.
● MaxScale's Cache filter can improve SELECT performance by caching and reusing results.
● MaxScale can filter data via Data Masking, with defined patterns.
● MaxScale also helps to avoid downtime and hiccups caused by:
  ○ Upgrades and Patches
  ○ Adding Nodes
  ○ DoS attacks
  ○ SQL Injection
  ○ Security Violations
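As a sketch of how the read/write splitting in this diagram might be wired up, a minimal maxscale.cnf service and listener could look like the following (server addresses, user names, and passwords are placeholders, not values from this deck):

```ini
# Hypothetical maxscale.cnf fragment -- names and credentials are placeholders
[server1]
type=server
address=10.0.0.11
port=3306
# server2 and server3 are defined the same way

[Splitter-Service]
type=service
router=readwritesplit
servers=server1,server2,server3
user=maxuser
password=maxpwd

[Splitter-Listener]
type=listener
service=Splitter-Service
protocol=MariaDBClient
port=4006
```

Applications then connect to MaxScale on port 4006; writes are routed to the Primary and reads are balanced across the Replicas.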
9. Architecture for high Availability in SkySQL
[Diagram: Your Applications → MaxScale → MariaDB Primary (r/w, Availability Zone 1), MariaDB Replica (r, Availability Zone 2); Performance: Standard]
MariaDB SkySQL SLA
SkySQL Foundation Tier
● Multi-node configurations will deliver a 99.95% service availability on a per-billing-month basis.
● For example, with this availability target in a 30 day calendar month the maximum service downtime is 21 minutes and 54 seconds.
SkySQL Power Tier
● Multi-node configurations will deliver a 99.995% service availability on a per-billing-month basis.
● For example, with this availability target in a 30 day calendar month the maximum service downtime is 2 minutes and 11 seconds.
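The downtime figures follow directly from the SLA percentage. A small sketch of the arithmetic (note: the quoted figures match an averaged 730-hour billing month rather than exactly 30 × 24 hours, an assumption on my part):

```python
def max_downtime(sla_percent, month_minutes=730 * 60):
    """Maximum allowed downtime per billing month for a given SLA,
    assuming an averaged 730-hour month. Returns (minutes, seconds)."""
    total = month_minutes * (1 - sla_percent / 100)
    minutes = int(total)
    seconds = round((total - minutes) * 60)
    return minutes, seconds

foundation = max_downtime(99.95)   # Foundation Tier
power = max_downtime(99.995)       # Power Tier
```

Running this reproduces the slide's 21 min 54 s and 2 min 11 s budgets.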
12. Traditional Setup
● Prior to MaxScale 2.5, MaxScale HA required manual intervention.
● While all MaxScale nodes can route queries, perform read/write splitting, and other operations, only the “active” MaxScale node (passive=false) could perform automatic failover.
● If the “active” MaxScale went down, one of the remaining MaxScale nodes had to be set to passive=false so that particular node could handle automatic failover.
● This was usually done with the help of third-party tools such as:
  ○ keepalived
  ○ corosync/pacemaker
13. Typical Recommended Architecture (Traditionally)
[Diagram: Keepalived Virtual IP in front of MaxScale 1 (Active) and MaxScale 2 (Passive); Primary replicating to Replica-1 and Replica-2]
● Can’t have both MaxScale instances doing database failover
● Must use third-party tools such as Keepalived to control which is the “Active” MaxScale
● Support issues in case of a Keepalived failure
● Complex configuration
● Only one MaxScale can be used for query routing
14. Why “Cooperative Locking”?
● Starting with MaxScale 2.5, cooperative locking was introduced.
● Multiple MaxScale nodes can work together without the need for any third-party components.
● The MaxScale nodes seamlessly decide which one is the primary MaxScale and which are not.
  ○ This is done by a special locking mechanism.
● The primary MaxScale handles the MariaDB failover.
● Two modes to choose from:
  ○ majority_of_running
  ○ majority_of_all
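As a sketch of how this is enabled in maxscale.cnf (server names and credentials are placeholders; the option belongs to the mariadbmon monitor module):

```ini
# Hypothetical monitor section -- names and credentials are placeholders
[MariaDB-Monitor]
type=monitor
module=mariadbmon
servers=server1,server2,server3
user=maxuser
password=maxpwd
auto_failover=true
auto_rejoin=true
cooperative_monitoring_locks=majority_of_running
```

The same monitor definition is used on every MaxScale instance; the instances then compete for locks on the backend servers to elect the primary MaxScale.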
15. cooperative_monitoring_locks (maxscale.cnf)
majority_of_running
● Default in SkySQL if the customer opts for a dual-MaxScale setup.
● The MaxScale node that holds the majority of locks becomes the primary.
● In this mode, only the “Running” MariaDB nodes are counted; nodes that are down are excluded.
● The required number of locks is calculated as:
  ○ floor(n_servers/2) + 1 (integer division), where “n_servers” is the number of alive servers in the cluster
  ○ Consider a 3-node cluster:
    ■ All 3 nodes alive: 3/2 + 1 = 2
    ■ 1 node down: 2/2 + 1 = 2
    ■ 2 nodes down: 1/2 + 1 = 1
  ○ This tolerates more node failures while still being able to perform automatic MariaDB failover.
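The lock threshold above can be sketched as a one-liner; note the integer (floor) division, which matches the worked examples:

```python
def required_locks(n_alive_servers):
    """Locks a MaxScale node needs to become primary under
    majority_of_running: a majority of the currently running
    servers, i.e. floor(n/2) + 1."""
    return n_alive_servers // 2 + 1
```

Because the threshold shrinks as servers drop out, the surviving MaxScale can keep doing automatic failover even after multiple node failures.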
16. majority_of_running
[Diagram: Primary DC (MaxScale 1) and DR DC (MaxScale 2), async replication; one DB node is down]
● One node goes down; the minimum number of DB locks required drops to 2, which can still be achieved.
● MaxScale 1 is “primary”.
● Automatic DB failover remains active.
17. majority_of_running
[Diagram: Primary DC (MaxScale 1) and DR DC (MaxScale 2); two DB nodes are down]
● Two nodes go down; the minimum number of DB locks required drops to 1, which can still be achieved.
● MaxScale 1 is still the “primary” MaxScale.
● Automatic DB failover remains active.
18. majority_of_running
[Diagram: the entire Primary DC is down; the DR DC takes over]
● The entire data center goes down.
● The minimum number of DB locks required drops to 1, which can still be achieved.
● MaxScale 3 becomes “primary”.
● Automatic DB failover remains active.
19. cooperative_monitoring_locks (maxscale.cnf)
majority_of_running
● Can cause split-brain (multiple MaxScale nodes becoming primary!)
  ○ Consider a Primary / DR setup.
  ○ In case of a network partition between the two data centers, the MaxScale nodes in each data center will become “primary”, as they cannot see the DB nodes on the other side.
  ○ This leads to two “Primary” MariaDB servers, one running in each data center!
  ○ An unlikely scenario, but keep it in mind.
20. majority_of_running
[Diagram: network between Primary DC (MaxScale 1) and DR DC (MaxScale 2) is lost; each DC ends up with its own Primary; async replication broken]
● The network between the two data centers is LOST.
● The MaxScale nodes can only see the DB nodes within their own data center.
● The “majority_of_running” rule applies: the minimum number of locks required drops to 2 in the DC and to 1 in the DR.
● Split-brain! We now have two “primary” MaxScale nodes!
● The new “primary” MaxScale node in the DR promotes one of the Replicas to “Primary DB”.
● Two Primary DB nodes are running, one in each DC, creating data inconsistency!
21. cooperative_monitoring_locks (maxscale.cnf)
majority_of_all
● In this mode, all nodes are considered, whether running or not.
● The MaxScale node that holds the majority of locks becomes the primary.
● The required number of locks is calculated as:
  ○ floor(n_servers/2) + 1 (integer division), where “n_servers” is the total number of MariaDB servers in the cluster
  ○ For example:
    ■ 3-node setup: locks required = 3/2 + 1 = 2
    ■ 7-node setup: locks required = 7/2 + 1 = 4
● If too many MariaDB nodes go down at the same time, none of the MaxScale nodes will be able to acquire the minimum number of locks.
  ○ With a total of 3 backend servers, if 2 nodes go down, the required minimum of 2 locks cannot be achieved.
  ○ No automatic failover.
  ○ A minimum of floor(n_servers/2) + 1 servers must be alive for automatic failover to work.
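The trade-off between the two modes can be sketched in a few lines: majority_of_all counts every server (safer against split-brain, but failover stops sooner), while majority_of_running counts only the alive ones.

```python
def failover_possible(n_total, n_alive, mode="majority_of_all"):
    """Whether enough locks can be acquired for automatic failover.
    Sketch of the two cooperative_monitoring_locks modes:
    the lock threshold is a majority of either all servers or
    only the currently running ones."""
    base = n_total if mode == "majority_of_all" else n_alive
    required = base // 2 + 1
    return n_alive >= required
```

For a 3-server cluster with 1 survivor, majority_of_all disables failover while majority_of_running still permits it, which is exactly the split-brain exposure described on slide 19.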
22. majority_of_all
[Diagram: Primary DC (MaxScale 1, Primary, Replica-1) and DR DC (MaxScale 2, Replica-2), async replication]
● Locks required: 3/2 + 1 = 2 (integer division).
● MaxScale 1, for instance, holds the most locks, so it becomes “primary”.
● The other MaxScale nodes are “secondary”.
23. majority_of_all
[Diagram: one DB node down; Primary DC and DR DC, async replication]
● One node goes down; the required minimum of 2 DB locks can still be achieved.
● MaxScale 1 is still “primary”.
● It is possible that another MaxScale node becomes primary, but only one at a time.
24. majority_of_all
[Diagram: two DB nodes down; only Replica-2 in the DR DC remains]
● Two nodes go down; the required minimum of 2 DB locks can no longer be achieved!
● All MaxScale nodes become “secondary”; automatic failover is disabled.
26. majority_of_all
[Diagram: network partition between the DCs; the DR DC is read-only]
● The network between the two data centers is broken; the MaxScale nodes in the DC can each acquire 2 locks, which matches the required minimum of 2.
● The DC MaxScale can still perform automatic failover.
● The DR MaxScale can only get a lock on 1 node, so its automatic failover is disabled.
27. Architecture - higher Availability Options
[Diagram: Your Applications → MaxScale in Datacenter 1 (MariaDB Primary r/w, two MariaDB Replicas r) and MaxScale in Datacenter 2 (two MariaDB Replicas r)]
28. MaxScale config_sync_cluster
When configuring MaxScale configuration synchronization for the first time, the same static configuration files should be used on all MaxScale instances: the value of “config_sync_cluster” must be the same on all instances, and the cluster (i.e. the monitor) it points to, along with its servers, must be identical in every configuration.
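A sketch of the relevant maxscale.cnf entries (monitor name and credentials are placeholders; config_sync_cluster and the related sync credentials are MaxScale configuration parameters):

```ini
# Must be identical on every MaxScale instance in the cluster
[MariaDB-Monitor]
type=monitor
module=mariadbmon
servers=server1,server2,server3
user=maxuser
password=maxpwd

[maxscale]
config_sync_cluster=MariaDB-Monitor
config_sync_user=sync_user
config_sync_password=sync_pwd
```

With this in place, runtime configuration changes made on one MaxScale instance are propagated to the others through the monitored cluster.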
31. Xpand - the distributed OLTP Database
● Transactional
● Distributed SQL
● Full Elasticity (↑↓)
● Read/Write Scale
32. Xpand - the distributed OLTP Database
When you run a distributed database, you always think about:
● Data Distribution
● Data Replication
● Skewing
● Shared-nothing
● Distributed SQL
● Data locality
● GEO-Distribution
● Read/write performance
● etc.
49. Serves Multiple Problem Domains
● High-volume, fast, parallel asynchronous replication
● Active-Active topology
● Passive standby for disaster recovery
● Daisy-chain replication to multiple regions for global access