The document discusses SQL query performance analysis in SQL Server. It covers the query optimizer, execution plans, statistics analysis, and different types of queries and scans. The query optimizer is cost-based and determines the most efficient execution plan using cardinality estimates and cost models. Ad hoc queries are non-parameterized queries that SQL Server treats differently from prepared queries. Execution plans show the steps and methods used to retrieve and process data. Statistics help the optimizer generate accurate cardinality estimates so it can pick high-performing plans.
2. Topics To Cover
• Query Optimizer
• Ad hoc queries
• Execution Plan
• Statistics Analysis
3. Query Optimizer
The query optimizer in SQL Server is cost-based. A plan's cost includes:
1. The cost of using different resources (CPU and I/O)
2. Total execution time
It estimates the cost by using:
• Cardinality: the total number of rows processed at each level
of a query plan, estimated with the help of histograms, predicates,
and constraints
• The cost model of each algorithm: used to perform operations
such as sorting, searching, and comparisons
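As a minimal sketch, the optimizer's chosen plan, with its cost and cardinality estimates, can be inspected without running the query. The SqlMessage table is the example used later in this deck:

```sql
-- Return the estimated execution plan as XML (including per-operator
-- cost and estimated row counts) instead of executing the query.
SET SHOWPLAN_XML ON;
GO
SELECT MsgID, Severity FROM SqlMessage WHERE MsgID = 100;
GO
SET SHOWPLAN_XML OFF;
GO
```

In SQL Server Management Studio, the same information is available via "Display Estimated Execution Plan".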
4. Ad hoc queries
Any non-parameterized query is called an ad hoc query. For
example:
SELECT MsgID, Severity FROM SqlMessage WHERE MsgID = 100
When SQL Server executes a query, the query goes through two steps,
just as in other programming languages:
• 1. Compilation
• 2. Execution
5. Properties of an ad hoc query
• Case sensitive
• Space sensitive
• Parameter sensitive
SQL Server treats two otherwise identical queries with different
parameter values as different statements. For example:
• SELECT MsgID, Severity FROM SqlMessage WHERE MsgID = 1
• SELECT MsgID, Severity FROM SqlMessage WHERE MsgID = 2
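Parameter sensitivity can be observed directly in the plan cache. A sketch, assuming the two statements above have just been run:

```sql
-- Each textual variant of an ad hoc query compiles to its own
-- cached plan. After running the two statements above, inspect
-- the cache: both appear as separate 'Adhoc' entries.
SELECT st.text, cp.usecounts, cp.objtype
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
WHERE cp.objtype = 'Adhoc';
```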
6. Effect of faulty C# code
• SQL Server takes an extra n × (compilation time) ms to return
records.
• Extra time is spent inserting entries into the plan cache.
• SQL Server has to frequently fire a job to evict cached
plans, since the cache will reach its maximum size very soon.
• This decreases the performance not only of this query but of
all queries from other applications, since the faulty code
forces cached plans of other statements to be evicted.
7. Prepared queries
Example:
(@Msgid int) SELECT MsgID, Severity FROM SqlMessage WHERE
MsgID = @Msgid
• A prepared query is not case, space, or parameter sensitive,
which is our goal.
Stored procedure:
• A precompiled batch of SQL queries that shares a common
execution plan.
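One way to issue the prepared form above from T-SQL is sp_executesql, which lets every MsgID value reuse the same cached plan; a sketch:

```sql
-- Parameterized execution: one cached plan serves all MsgID values.
EXEC sys.sp_executesql
    N'SELECT MsgID, Severity FROM SqlMessage WHERE MsgID = @Msgid',
    N'@Msgid int',
    @Msgid = 100;
```

From C#, the equivalent is a SqlCommand with SqlParameter objects rather than string concatenation.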
8. Execution Plan
• What is an index in SQL Server?
An index is a way of organizing the data in a table to make
operations such as searching, sorting, and grouping faster. In
other words, we need indexing when a query has:
• a WHERE clause (searching)
• an ORDER BY clause (sorting)
• a GROUP BY clause (grouping), etc.
9. Table scan
SELECT * FROM Student WHERE RollNo = 111
The time complexity of a table scan is O(n).
RollNo Name Country Age
101 Greg UK 23
102 Sachin India 21
103 Akaram Pakistan 22
107 Miyabi China 18
108 Marry Russia 27
109 Scott USA 31
110 Benazir Bangladesh 17
111 Miyabi Japan 24
112 Rahul India 27
113 Nicolus France 19
10. Clustered index
• When we create a clustered index on a
table, the physical organization of the table changes.
• The table's data is then stored as a balanced
tree (B-tree).
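For the Student table above, a clustered index on RollNo might be created as follows (the index name is illustrative):

```sql
-- Make RollNo the clustered index key: the table's rows are now
-- physically ordered in a B-tree on RollNo.
CREATE CLUSTERED INDEX IX_Student_RollNo
    ON Student (RollNo);

-- This lookup can now be an index seek instead of a table scan.
SELECT * FROM Student WHERE RollNo = 111;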
12. Types of scanning
• Table scan: very slow; used only if the table has
no clustered index.
• Index scan: also slow; used when the table has a
clustered index and either the WHERE clause references non-key
columns, or the query is not covered (discussed later),
or both.
• Index seek: very fast; our goal is to achieve this.
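A sketch of the difference, assuming the Student table has a clustered index on RollNo (as created earlier) and no index on Age:

```sql
-- Seek: the predicate is on the clustered index key.
SELECT * FROM Student WHERE RollNo = 111;   -- clustered index seek

-- Scan: Age is not a key column, so every row must be examined.
SELECT * FROM Student WHERE Age = 24;       -- clustered index scan
```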
13. Terms of the execution plan
• Predicate: a condition in the WHERE clause on either a non-
key column or a column that is not covered.
• Object: the name of the source the data is read from. It
can be a table, a clustered index, or a non-clustered index.
• Output list: the names of the columns retrieved from the
object.
• Seek predicate: a condition in the WHERE clause on a
key column or a fully covered column.
14. Non-clustered index
• A non-clustered index is a logical organization of the table's
data. It can be built on one of two structures:
1. A heap
2. A clustered index
• If the table has a clustered index, the leaf nodes of the non-
clustered index store the key columns of the clustered index.
• If the table has no clustered index, the leaf nodes of the non-
clustered index store a RID, which is unique for each row of the table.
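Continuing the Student example, a non-clustered index might be added like this (the index name is illustrative):

```sql
-- A non-clustered index on Name. With a clustered index on RollNo
-- in place, each leaf entry of this index carries the RollNo key;
-- on a heap it would carry a RID instead.
CREATE NONCLUSTERED INDEX IX_Student_Name
    ON Student (Name);
```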
17. Covering of queries
• We can specify a maximum of 16 key column names.
• The combined size of the key columns cannot exceed 900 bytes.
• All columns must belong to the same table.
• The data type of a key column cannot be ntext, text,
varchar(max), nvarchar(max), varbinary(max), xml, or image.
• A key column cannot be a non-deterministic computed column.
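A covering index can also be built with INCLUDE, which stores extra columns only at the leaf level; included columns do not count against the key limits above. A sketch on the Student table (index name illustrative):

```sql
-- Cover the query: Name is the key, Country rides along in the leaf,
-- so the SELECT below is answered entirely from the index (index seek).
CREATE NONCLUSTERED INDEX IX_Student_Name_Covering
    ON Student (Name)
    INCLUDE (Country);

SELECT Name, Country FROM Student WHERE Name = 'Miyabi';
```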
18. Statistics Analysis
• The query optimizer uses statistics to create query plans that
improve query performance.
• Accurate statistics lead to a high-quality query plan.
• Auto create and auto update apply strictly to single-column
statistics.
• The query optimizer determines when statistics might be out
of date by counting the number of data modifications since
the last statistics update and comparing that count to a
threshold.
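Statistics can be inspected and refreshed manually; a sketch, assuming the illustrative IX_Student_RollNo index from earlier exists:

```sql
-- Show the histogram and density information the optimizer uses.
DBCC SHOW_STATISTICS ('Student', IX_Student_RollNo);

-- Rebuild all statistics on the table by reading every row.
UPDATE STATISTICS Student WITH FULLSCAN;
```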
19. To improve cardinality estimates
• If possible, simplify expressions that contain constants.
• If there is a correlation between columns, use a computed
column.
• Rewrite the query to use a parameter instead of a local
variable.
• Avoid changing a parameter's value within a stored
procedure before using it in the query.
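The parameter-versus-local-variable point can be sketched with the SqlMessage example; the value 100 is illustrative:

```sql
-- With a local variable the optimizer cannot see the value at
-- compile time, so it falls back to an average-density estimate.
DECLARE @id int = 100;
SELECT MsgID, Severity FROM SqlMessage WHERE MsgID = @id;

-- With a parameter it can use the histogram for the sniffed value.
EXEC sys.sp_executesql
    N'SELECT MsgID, Severity FROM SqlMessage WHERE MsgID = @Msgid',
    N'@Msgid int', @Msgid = 100;
```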
20. Goal
• Should we use a subquery or an inner join?
• Should we use a temp table or a table variable?
Other tools:
• SQL Server Profiler
• Database Engine Tuning Advisor
• Resource Governor