This document discusses MongoDB capacity planning. It begins with a brief history of databases and the factors driving NoSQL adoption. It then discusses MongoDB's origins and key features like document storage, auto-sharding, and high availability. The document emphasizes that capacity planning requires understanding an application's requirements, resources used, and monitoring metrics over time. It provides examples of measuring and planning for storage, memory, CPU, and network resources as applications and data change. The goal of capacity planning is to continuously and proactively scale resources to meet evolving needs.
This document provides recommendations for optimizing performance of a SharePoint farm. It suggests architecting the farm with separate web, service application, and database servers. It also provides tips for SQL Server tuning, such as setting the maximum RAM, formatting disks, and configuring maintenance plans. Additionally, it recommends techniques like caching, minimizing page size, limiting navigation depth, and leveraging tools to identify bottlenecks. The overall message is to consider each layer of the farm and apply techniques like caching, SQL optimization, and network configuration to improve performance.
From Batch to Realtime with Hadoop - Berlin Buzzwords - June 2012 – larsgeorge
This document summarizes Lars George's presentation on moving from batch to real-time processing with Hadoop. It discusses using Hadoop (HDFS and MapReduce) for batch processing of large amounts of data and integrating real-time databases and stream processing tools like HBase and Storm to enable faster querying and analytics. Example architectures shown combine batch and real-time systems by using real-time tools to process streaming data and periodically syncing results to Hadoop and HBase for long-term storage and analysis.
Polyglot Persistence - Two Great Tastes That Taste Great Together – John Wood
The days of the relational database being a one-stop-shop for all of your persistence needs are over. Although NoSQL databases address some issues that can’t be addressed by relational databases, the opposite is true as well. The relational database offers an unparalleled feature set and rock-solid stability. One cannot overstate the importance of using the right tool for the job, and for some jobs, one tool is not enough. This talk focuses on the strengths and weaknesses of both relational and NoSQL databases, the benefits and challenges of polyglot persistence, and examples of polyglot persistence in the wild.
These slides were presented at WindyCityDB 2010.
The rise of NoSQL is characterized by confusion and ambiguity, much like any fast-emerging organic movement in the absence of well-defined standards and adequate software solutions. Whether you are a developer or an architect, many questions come to mind when faced with the decision of where your data should be stored and how it should be managed. The following are some of these questions: What does the rise of all these NoSQL technologies mean to my enterprise? What is NoSQL to begin with? Does it mean "No SQL"? Could this be just another fad? Is it a good idea to bet the future of my enterprise on these new exotic technologies and simply abandon proven mature Relational DataBase Management Systems (RDBMS)? How scalable is scalable? Assuming that I am sold, how do I choose the one that fits my needs best? Is there a middle ground somewhere? What is this Polyglot Persistence I hear about? The answers to these questions and many more are the subject of this talk, along with a survey of the most popular NoSQL technologies. Be there or be square.
This document discusses the limitations of relational databases for modern applications and real-time architectures. It describes how NoSQL databases like Aerospike can provide better performance and scalability. Specific examples are given of how Aerospike has been used to power applications in domains like advertising technology, social media, travel portals, and financial services that require high throughput, low latency access to large datasets.
Slides from my talk at ACCU2011 in Oxford on 16th April 2011. A whirlwind tour of the non-relational database families, with a little more detail on Redis, MongoDB, Neo4j and HBase.
This document discusses performance metrics for monitoring and optimizing a social network built using Django/Python. It recommends tools like New Relic for high-level insights, Graphite for detailed metrics storage and querying, and PgFouine for analyzing database queries. Specific metrics discussed include page load times broken down by component, database query analysis, background task performance, and deploy impact. The goal is to identify bottlenecks and optimize performance across development, systems, and pages.
At Yahoo! over the past year we have helped migrate hundreds of our grids' users to YARN. Our YARN clusters have in aggregate run over 18 million jobs with more than 3 billion tasks, consuming over 10 thousand years of compute time, with one single cluster running 90 thousand jobs a day. From this experience we would like to share what we have learned about running YARN well, how this is different from running a 1.0 based cluster, and what it takes to migrate your jobs to YARN from 1.0.
When dealing with infrastructure we often go through the process of determining the different resources needed to meet our application requirements. This talk looks into the way that resources are used by MongoDB and which aspects should be considered to determine the sizing, capacity and deployment of a MongoDB cluster given the different scenarios, different sets of operations and storage engines available.
Eberhard Wolff is an architecture and technology manager at adesso, a leading IT consultancy in Germany. He discusses how cloud computing differs from traditional enterprise computing by having many inexpensive instances with unreliable networks. To build reliable systems, applications need to be stateless, easily scaled, and handle failures. He uses the example of Spring Biking, an e-commerce site, to illustrate an architecture with stateless frontends, elastic scaling, and database replication to achieve reliability in the cloud.
Karmasphere Studio is an integrated development environment (IDE) for Hadoop that allows users to develop, debug, deploy, and monitor Hadoop jobs. It integrates with major Hadoop distributions and cloud providers like Amazon to enable easy deployment of jobs to private or public clusters. The IDE aims to provide all the typical development and deployment tools within a single interface for working with Hadoop.
HBaseCon 2012 | Building a Large Search Platform on a Shoestring Budget – Cloudera, Inc.
This document discusses YapMap, a visual search platform built on Hadoop and HBase. It summarizes how YapMap interfaces with HBase data, uses HBase as a data processing pipeline with checkpoints, and had to adjust schemas and migrate data as the system evolved. It also covers how YapMap constructs search indexes in shards based on HBase regions and stored indexes on HDFS. The document concludes with some lessons learned around optimizing HBase operations.
Faster Data Integration Pipeline Execution using Spark-Jobserver – Databricks
As you may already know, the open-source Spark Job Server offers a powerful platform for managing Spark jobs, jars, and contexts, turning Spark into a much more convenient and easy-to-use service. The Spark-Jobserver can keep Spark context warmed up and readily available for accepting new jobs. At Informatica we are leveraging the Spark-Jobserver offerings to solve the data-visualization use-case.
MapReduce with Apache Hadoop is a framework for distributed processing of large datasets across clusters of computers. It allows for parallel processing of data, fault tolerance, and scalability. The framework includes Hadoop Distributed File System (HDFS) for reliable storage, and MapReduce for distributed computing. MapReduce programs can be written in various languages and frameworks provide higher-level interfaces like Pig and Hive.
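To make the map and reduce phases concrete, here is a tiny self-contained Python simulation of the classic word-count job (a real Hadoop job would use the Java API or a streaming wrapper; this sketch only mirrors the dataflow):

```python
from collections import defaultdict

def map_phase(lines):
    """Map: emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.split():
            yield word, 1

def reduce_phase(pairs):
    """Shuffle + reduce: group pairs by key, then sum each group's values."""
    grouped = defaultdict(list)
    for word, count in pairs:  # stands in for the framework's shuffle step
        grouped[word].append(count)
    return {word: sum(counts) for word, counts in grouped.items()}

print(reduce_phase(map_phase(["to be or not to be"])))
# {'to': 2, 'be': 2, 'or': 1, 'not': 1}
```

The same two functions, unchanged, would scale out on a cluster because each map call and each per-key reduction is independent of the others.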
For the last 40 years or so, we used relational databases successfully in nearly all business contexts and systems of nearly all sizes. Therefore, if you feel no pain using an RDBMS, you can stay with it. But, if you always have to work around your RDBMS to get your job done, a document-oriented database might be worth a look.
RavenDB is a 2nd generation document database that allows you to write a data-access layer with much more freedom and far fewer constraints. If you have to work with large volumes of data, thousands of queries per second, unstructured/semi-structured data or event sourcing, you will find RavenDB particularly rewarding.
In this talk we will explore some document database usage scenarios. I will share some data modeling techniques and several architectural criteria to help you decide where RavenDB can safely be adopted as the right choice.
The document discusses migrating from Amazon Redshift to Spark and Presto for data warehousing and querying needs at Stitch Fix. Redshift was experiencing performance issues with too many queries in the morning when production pipelines and ad-hoc queries used the same cluster. Simply scaling Redshift up and down was not feasible. The proposed solution is to use Presto for light ad-hoc queries, Spark for heavy jobs, store data in S3 as the single source of truth, and run the systems on EMR which can scale up and down quickly.
Website redesign, if not done in the proper manner, could spell doom for a site in the search rankings. Soumya Shankar & Vishal, who work as SEO professionals with Convonix, have created a resourceful presentation on the factors that need to be taken care of, while you are redesigning a website. Our website design & usability team follows these processes with the utmost care, and have successfully managed to replace entire websites, without seeing major fluctuations in rankings.
Here are the steps to solve each equation by graphing:
1) x² + 5x + 6 = 0
Write in standard form: x² + 5x + 6 = 0.
Graph the related function y = x² + 5x + 6. The x-intercepts are the solutions: x = −3 and x = −2.
2) x² + 8x + 16 = 0
Write in standard form: x² + 8x + 16 = 0.
Graph the related function y = x² + 8x + 16. The graph touches the x-axis at a single point, so there is one (double) solution: x = −4.
3) x² − 2x + 3 = 0
Write in standard form: x² − 2x + 3 = 0.
Graph the related function y = x² − 2x + 3. The graph has no x-intercepts, so the equation has no real solutions.
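The same three equations can be checked numerically; here is a minimal Python sketch using the quadratic formula, where the sign of the discriminant tells us how many x-intercepts the graph has:

```python
import math

def real_roots(a, b, c):
    """Return the real roots of a*x^2 + b*x + c = 0 as a sorted list."""
    d = b * b - 4 * a * c  # discriminant: its sign decides the x-intercept count
    if d < 0:
        return []              # graph never crosses the x-axis
    if d == 0:
        return [-b / (2 * a)]  # graph touches the x-axis once (double root)
    sq = math.sqrt(d)
    return sorted([(-b - sq) / (2 * a), (-b + sq) / (2 * a)])

print(real_roots(1, 5, 6))   # equation 1: [-3.0, -2.0]
print(real_roots(1, 8, 16))  # equation 2: [-4.0]
print(real_roots(1, -2, 3))  # equation 3: [] (no real solutions)
```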
Customizable and area exclusive. Stay top of mind, foster trust and stand out from the competition. Includes email, blog and social media content. Small monthly fee. pat@greatreachinc.com
19 Sept 12 - Is social exclusion still important for older people? – ILC-UK
The concept of social exclusion explicitly recognises that material exclusion is both caused by and causes exclusion from other domains essential for wellbeing, and builds on a longstanding tradition within public policy and social science research. However, the terminology ‘social exclusion’ is perhaps most synonymous with the former Labour government, with the coalition government having disbanded the Social Exclusion Unit Taskforce. In its place there exists something of a gulf in terminology to replace the usage of ‘social exclusion’ in policy terms, although the concept itself continues to play some part in policy making, while the term itself is still widely used within academic research and in EU and UN policy.
In comparison to children, young people, and families, social exclusion among older people has received little attention. This is despite the fact that it is perhaps among this group that the notion of social exclusion is most pertinent, with older people at high risk of social isolation and loneliness, as well as exhibiting substantial inequalities in income and housing. In addition, within the extant evidence base, there has been comparatively little longitudinal research into social exclusion patterns among older people.
At this event, ILC-UK presented the results from a report examining social exclusion among older people, 'Is Social Exclusion still important for Older People?', sponsored by Age UK. The work investigated trends in the number of socially excluded people, and examined their outcomes. Other speakers also contributed to a debate exploring the underlying question of whether social exclusion should remain part of public policy and whether ‘social exclusion is still important for older people’.
Agenda from the event:
08:15 – 08:30
Registration with Tea/Coffee/Pastries
08:30 – 08:35
Welcome - David Sinclair, ILC-UK
08:35 - 08:50
Is Social Exclusion still important for Older People? - Dylan Kneale, ILC-UK
08:50 - 09:10
Greg Lewis, Age UK
Justin Russell, Department for Work and Pensions
09:10 - 09:25
Debate
09:25 – 09:30
Close - David Sinclair, ILC-UK
This document provides tips on how to market smarter and save money through direct mail campaigns. It discusses the importance of cleaning mailing lists to remove outdated addresses, which can save significant money on printing and postage costs while improving response rates. Additional ways to market smarter with direct mail mentioned are using various finishing techniques like embossing, foil stamping, and die-cutting to make mail pieces stand out from the crowd at relatively low cost. The document emphasizes that mailing more does not necessarily mean better results and offers examples of companies wasting significant funds by mailing to inactive addresses without properly updating their lists.
This document provides strategies for authors to increase their online visibility and use of tags (keywords). It discusses organizing online credentials, tagging content with relevant keywords that potential customers may search, and uploading photos to image sites and groups with those tags to drive backlinks and search traffic. The goal is to have customers find and spread information about the author and their content online through strategic tagging and sharing across websites and social media.
The document discusses issues related to longevity, aging populations, and their economic impacts. It notes that populations are aging globally as life expectancy increases, which will significantly impact economies by reducing the ratio of working-age to older individuals. This could reduce economic growth by decreasing workforce participation and increasing costs for pensions and healthcare. However, aging populations also represent new opportunities for certain industries that cater to older consumers. Addressing the challenges of aging societies will be important for economic policymakers.
The document introduces EMC's new Data Domain DD800 appliance series and Data Domain Archiver for backup and archive storage. The systems provide faster backup speeds, increased capacity, and cost-effective long-term retention compared to traditional tape storage. The appliances leverage data deduplication and support all major backup software for backup, archive, and disaster recovery across on-premise and off-premise locations.
21 Jan 14 - I can't afford to die - Managing the cost of dying in an ageing society – ILC-UK
The document summarizes a discussion event on managing the costs of dying in an aging society. It provides an agenda for the event including speakers from organizations like Cruse Bereavement Care, Sun Life Direct, and Sue Ryder Manorlands Hospice. The speakers will discuss issues like rising funeral costs, gaps in state support for funerals, and the need for individuals and families to plan financially for end of life. Research presented will examine projections that the total cost of dying in England and Wales could triple by 2037 due to rising costs and an increasing number of deaths. Questions will focus on understanding and reducing the costs associated with dying, end of life, bereavement, and the roles of the public, private, and voluntary sectors.
Three tips are provided to improve online effectiveness: 1) Maximize Google Analytics to identify top traffic sources and content, 2) Write engaging content with a catchy title and proofread, 3) Maintain a consistent social media presence by updating regularly, monitoring brand sentiment, and separating personal and professional accounts.
This document contains examples of solving multi-step equations through addition, subtraction, multiplication, and division. It provides 5 warm up problems involving equations with variables, then explains the steps to solve two-step equations by translating words to math, and using operations to isolate the variable. Examples are given showing division and multiplication used to solve equations with more than one operation.
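The two-step procedure described above, undoing each operation in reverse order to isolate the variable, can be expressed directly in a few lines of Python (the example equation is illustrative, not taken from the document):

```python
def solve_two_step(a, b, c):
    """Solve a*x + b = c by inverse operations:
    first subtract b from both sides, then divide both sides by a."""
    return (c - b) / a

# Example: 3x + 4 = 19  ->  subtract 4 (3x = 15), divide by 3 (x = 5)
print(solve_two_step(3, 4, 19))
```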
Public service and demographic change: an ILC-UK/Actuarial Profession joint debate – ILC-UK
Full details of the event are available here: http://www.ilcuk.org.uk/index.php/events/ilc_uk_and_the_actuarial_profession_debate_public_service_and_demographic_c
The live blog for this event is available here: http://blog.ilcuk.org.uk/2013/04/23/live-blog-public-service-and-demographic-change/
With the advance of social media as a standard way of communicating, being literate now includes being digitally educated. Savvy communicators realize social sites like Facebook can be utilized as platforms to connect, educate, and encourage relationships. In “Tips and Best Practices for Shepherding with Facebook” you’ll walk away with strategies and tools to build community and extend your personal and ministry reach. Facebook is already accomplishing many of the goals we have for church communication. Learn how to take advantage of the Facebook social structure to build a strong local or global community.
Deploying any software can be a challenge if you don't understand how resources are used or how to plan for the capacity of your systems. Whether you need to deploy or grow a single MongoDB instance, a replica set, or tens of sharded clusters, you probably share the same challenges in trying to size that deployment.
This webinar will cover what resources MongoDB uses, and how to plan for their use in your deployment. Topics covered will include understanding how to model and plan capacity needs for new and growing deployments. The goal of this webinar will be to provide you with the tools needed to be successful in managing your MongoDB capacity planning tasks.
Deploying MongoDB can be a challenge if you don't understand how resources are used or how to plan for the capacity of your systems. If you need to deploy, or grow, a MongoDB single instance, replica set, or tens of sharded clusters, then you probably share the same challenges in trying to size that deployment. This talk will cover what resources MongoDB uses, and how to plan for their use in your deployment. Topics covered will include understanding how to model and plan capacity needs from the perspective of a new deployment, growing an existing one, and defining the scalability steps along your path to the top. The goal of this presentation will be to provide you with the tools needed to be successful in managing your MongoDB capacity planning tasks.
MongoDB capacity planning involves determining hardware requirements and sizing to meet performance and availability expectations. Key aspects include measuring the working set, monitoring resource usage, and iteratively planning as requirements and data change over time. Resources like CPU, storage, memory and network need to be considered based on the application's throughput, responsiveness and availability needs.
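As a rough illustration of that iterative planning loop, here is a minimal Python sketch projecting when a growing working set would outgrow a server's memory; the growth rate and sizes below are hypothetical examples, not figures from the talk:

```python
def months_until_ram_exceeded(working_set_gb, ram_gb, monthly_growth):
    """Project how many months until the working set no longer fits in RAM.

    working_set_gb: hot data plus indexes that must stay memory-resident
    ram_gb:         memory available to the database on this server
    monthly_growth: fractional growth per month, e.g. 0.10 for 10%
    """
    months = 0
    while working_set_gb <= ram_gb:
        working_set_gb *= 1 + monthly_growth
        months += 1
    return months

# Hypothetical numbers: 40 GB working set, 64 GB RAM, 10% monthly growth.
print(months_until_ram_exceeded(40, 64, 0.10))  # 5
```

Re-running a projection like this against measured growth each month is one simple way to keep the scaling decision proactive rather than reactive.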
DoneDeal AWS Data Analytics Platform built using AWS products: EMR, Data Pipeline, S3, Kinesis, Redshift and Tableau. Custom-built ETL was written using PySpark.
This document discusses handling massive writes for online transaction processing (OLTP) systems. It begins with an introduction and overview of the topics to be covered, including terminology, differences between massive reads versus writes, and potential solutions using relational databases, NoSQL databases, and code optimizations. Specific solutions discussed for massive writes include using memory, fast disks, caching, column-oriented databases, SQL tuning, database partitioning, reading from slaves, and sharding or splitting data across multiple databases. The document provides pros and cons of each approach and examples of performance improvements observed.
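One of the write-scaling techniques listed above, splitting data across multiple databases, can be sketched in a few lines of Python; the shard count and key format are illustrative assumptions, not details from the talk:

```python
import hashlib

def shard_for(key: str, n_shards: int) -> int:
    """Route a record to a shard by hashing its key.

    A stable hash (not Python's per-process randomized hash()) keeps the
    key-to-shard mapping consistent across processes and restarts.
    """
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % n_shards

# Writes for the same key always land on the same shard:
print(shard_for("user:42", 4) == shard_for("user:42", 4))  # True
```

Note that a plain modulo scheme like this reshuffles most keys when `n_shards` changes; consistent hashing is the usual refinement when shards are added or removed often.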
This document discusses capacity planning for deploying MongoDB. It defines capacity planning as planning for requirements like availability, throughput, and responsiveness by determining necessary resources like CPU, memory, storage, and network capacity. It emphasizes starting capacity planning before launch to avoid downtime. Key aspects of capacity planning for MongoDB include estimating working memory set size, storage I/O needs based on data size and access patterns, using tools like IOStat and MongoDB Management Service for monitoring and automation, and conducting iterative testing and deployments. Failure occurs if planned resources cannot meet requirements.
Capacity Planning For Your Growing MongoDB Cluster – MongoDB
This document discusses hardware provisioning best practices for MongoDB. It covers key concepts like bottlenecks, working sets, and replication vs sharding. It also presents two case studies where these concepts were applied: 1) For a Spanish bank storing logs, the working set was 4TB so they provisioned servers with at least that much RAM. 2) For an online retailer storing products, testing found the working set was 270GB, so they recommended a replica set with 384GB RAM per server to avoid complexity of sharding. The key lessons are to understand requirements, test with a proof of concept, measure resource usage, and expect that applications may become bottlenecks over time.
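The retailer sizing in the second case study is back-of-envelope arithmetic, which can be made explicit in a short Python sketch; the 20% headroom figure here is an assumption for illustration, not a number from the talk:

```python
def ram_fits(working_set_gb: float, server_ram_gb: float,
             headroom: float = 0.2) -> bool:
    """True if the working set fits in RAM with a fractional headroom
    reserved for the OS, connection overhead, and near-term growth."""
    usable = server_ram_gb * (1 - headroom)
    return working_set_gb <= usable

# Case-study numbers: a 270 GB working set on 384 GB servers.
print(ram_fits(270, 384))  # True
```

When a check like this passes on a single replica set, the case study's conclusion follows: the operational complexity of sharding can be deferred.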
SharePoint Saturday San Antonio: SharePoint 2010 Performance – Brian Culver
Is your farm struggling to serve your organization? How long is it taking between page requests? Where is the bottleneck in your farm? Is your SQL Server tuned properly? Worried about upgrading due to poor performance? We will look at various tools for analyzing and measuring performance of your farm. We will look at simple SharePoint and IIS configuration options to instantly improve performance. I will discuss advanced approaches for analyzing, measuring and implementing optimizations in your farm.
This document provides a checklist for deploying MongoDB, including application design considerations like schema and sharding, operational requirements for performance, capacity, high availability, backup, security, and monitoring. It also discusses hardware requirements and maintenance processes like upgrades.
SharePoint Saturday The Conference 2011 - SP2010 Performance – Brian Culver
New developers and teams are now polyglot:
- they use multiple programming languages (Java, Javascript, Ruby, ...)
- they use multiple persistence store (RDBMS, NoSQL, Hadoop)
In this talk you will learn about the benefits of being polyglot: using the right language or framework for each purpose, and selecting the right persistence store for specific constraints.
This presentation will show how developers can mix the Java platform with other technologies such as NodeJS and AngularJS to build applications in a more productive way. It is also an opportunity to talk about the Command Query Responsibility Segregation (CQRS) pattern, which allows developers to be more effective and deliver the proper application to the user more quickly.
This presentation was delivered during Devfest Nantes 2014
Building a Mongo DSL in Scala at Hot Potato (Lincoln Hochberg) – MongoSF
Hot Potato is a startup that connects audiences around shared interests through their mobile app and website. They use Scala and MongoDB for their API and data storage due to MongoDB's scalability and flexibility as the app grows. MongoDB allows the data model and queries to evolve over time as reads, writes, and data usage changes, unlike a traditional RDBMS which requires more normalization. Scala provides benefits like immutability, functional programming, and concurrency which are well-suited for Hot Potato's asynchronous services.
Database as a Service on the Oracle Database Appliance Platform – Maris Elsins
Speaker: Marc Fielding, Co-speaker: Maris Elsins.
Oracle Database Appliance provides a robust, highly-available, cost-effective, and surprisingly scalable platform for database as a service environment. By leveraging Oracle Enterprise Manager's self-service features, databases can be provisioned on a self-service basis to a cluster of Oracle Database Appliance machines. Discover how multiple ODA devices can be managed together to provide both high availability and incremental, cost-effective scalability. Hear real-world lessons learned from successful database consolidation implementations.
The document summarizes Michael DelNegro's work operationalizing MongoDB at AOL after it was initially introduced by a developer in 2010. Key aspects of the operationalization effort included establishing standards for host setup, directory structure, and build scripts; implementing monitoring with Argus, Nagios, and MMS; performing backups; sharing information through internal documentation; and addressing challenges like requiring more planning from developers. The effort helped support over 500 MongoDB servers in production running a variety of projects and applications at AOL.
- MongoDB is well-suited for systems of engagement that have demanding real-time requirements, diverse and mixed data sets, massive concurrency, global deployment, and no downtime tolerance.
- It performs well for workloads with mixed reads, writes, and updates and scales horizontally on demand. However, it is less suited for analytical workloads, data warehousing, business intelligence, or transaction processing workloads.
- MongoDB shines for use cases involving single views of data, mobile and geospatial applications, real-time analytics, catalogs, personalization, content management, and log aggregation. It is less optimal for workloads requiring joins, full collection scans, high-latency writes, or five nines uptime.
Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...) – Lucas Jellema
This presentation gives a brief overview of the history of relational databases, ACID and SQL, and presents some of their key strengths and potential weaknesses. It introduces the rise of NoSQL - why it arose, what it entails, and when to use it. The presentation focuses on MongoDB as a prime example of a NoSQL document store and shows how to interact with MongoDB from JavaScript (NodeJS) and Java.
This document summarizes Terry Bunio's presentation on breaking and fixing broken data. It begins by thanking sponsors and providing information about Terry Bunio and upcoming SQL events. It then discusses the three types of broken data: inconsistent, incoherent, and ineffectual data. For each type, it provides an example and suggestions on how to identify and fix the issues. It demonstrates how to use tools like Oracle Data Modeler, execution plans, SQL Profiler, and OStress to diagnose problems to make data more consistent, coherent and effective.
Similar to 2013 CPM Conference, Nov 6th, NoSQL Capacity Planning
Retrieval Augmented Generation Evaluation with Ragas – Zilliz
Retrieval Augmented Generation (RAG) enhances chatbots by incorporating custom data in the prompt. Using large language models (LLMs) as judge has gained prominence in modern RAG systems. This talk will demo Ragas, an open-source automation tool for RAG evaluations. Christy will talk about and demo evaluating a RAG pipeline using Milvus and RAG metrics like context F1-score and answer correctness.
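The context F1-score mentioned above can be illustrated with plain set arithmetic; this is only a sketch of the precision/recall/F1 shape of the metric (Ragas itself computes its metrics differently, often using an LLM as judge, and the document names below are made up for the example):

```python
def context_f1(retrieved, relevant):
    """F1 over retrieved context chunks vs. ground-truth relevant chunks."""
    retrieved, relevant = set(retrieved), set(relevant)
    if not retrieved or not relevant:
        return 0.0
    true_positives = len(retrieved & relevant)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(retrieved)  # how much retrieved was relevant
    recall = true_positives / len(relevant)      # how much relevant was retrieved
    return 2 * precision * recall / (precision + recall)

# Two of three retrieved chunks are relevant; one relevant chunk was missed.
print(context_f1(["doc1", "doc2", "doc3"], ["doc2", "doc3", "doc4"]))
```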
The History of Embeddings & Multimodal Embeddings – Zilliz
Frank Liu will walk through the history of embeddings and how we got to the cool embedding models used today. He'll end with a demo on how multimodal RAG is used.
"Building Future-Ready Apps with .NET 8 and Azure Serverless Ecosystem", Stan... – Fwdays
.NET 8 brought a lot of improvements for developers and maturity to the Azure serverless container ecosystem. So, this talk will cover these changes and explain how you can apply them to your projects. Another reason for this talk is the re-invention of Serverless from a DevOps perspective as a Platform Engineering trend with Backstage and the recent Radius project from Microsoft. So now is the perfect time to look at developer productivity tooling and serverless apps from Microsoft's perspective.
UiPath Community Day Amsterdam: Code, Collaborate, Connect – UiPathCommunity
Welcome to our third live UiPath Community Day Amsterdam! Come join us for a half-day of networking and UiPath Platform deep-dives, for devs and non-devs alike, in the middle of summer ☀.
📕 Agenda:
12:30 Welcome Coffee/Light Lunch ☕
13:00 Event opening speech
Ebert Knol, Managing Partner, Tacstone Technology
Jonathan Smith, UiPath MVP, RPA Lead, Ciphix
Cristina Vidu, Senior Marketing Manager, UiPath Community EMEA
Dion Mes, Principal Sales Engineer, UiPath
13:15 ASML: RPA as Tactical Automation
Tactical robotic process automation for solving short-term challenges, while establishing standard and re-usable interfaces that fit IT's long-term goals and objectives.
Yannic Suurmeijer, System Architect, ASML
13:30 PostNL: an insight into RPA at PostNL
Showcasing the solutions our automations have provided, the challenges we’ve faced, and the best practices we’ve developed to support our logistics operations.
Leonard Renne, RPA Developer, PostNL
13:45 Break (30')
14:15 Breakout Sessions: Round 1
Modern Document Understanding in the cloud platform: AI-driven UiPath Document Understanding
Mike Bos, Senior Automation Developer, Tacstone Technology
Process Orchestration: scale up and have your Robots work in harmony
Jon Smith, UiPath MVP, RPA Lead, Ciphix
UiPath Integration Service: connect applications, leverage prebuilt connectors, and set up customer connectors
Johans Brink, CTO, MvR digital workforce
15:00 Breakout Sessions: Round 2
Automation, and GenAI: practical use cases for value generation
Thomas Janssen, UiPath MVP, Senior Automation Developer, Automation Heroes
Human in the Loop/Action Center
Dion Mes, Principal Sales Engineer @UiPath
Improving development with coded workflows
Idris Janszen, Technical Consultant, Ilionx
15:45 End remarks
16:00 Community fun games, sharing knowledge, drinks, and bites 🍻
Finetuning GenAI For Hacking and DefendingPriyanka Aash
Generative AI, particularly through the lens of large language models (LLMs), represents a transformative leap in artificial intelligence. With advancements that have fundamentally altered our approach to AI, understanding and leveraging these technologies is crucial for innovators and practitioners alike. This comprehensive exploration delves into the intricacies of GenAI, from its foundational principles and historical evolution to its practical applications in security and beyond.
"Making .NET Application Even Faster", Sergey Teplyakov.pptxFwdays
In this talk we're going to explore performance improvement lifecycle, starting with setting the performance goals, using profilers to figure out the bottle necks, making a fix and validating that the fix works by benchmarking it. The talk will be useful for novice and seasoned .NET developers and architects interested in making their application fast and understanding how things work under the hood.
Self-Healing Test Automation Framework - HealeniumKnoldus Inc.
Revolutionize your test automation with Healenium's self-healing framework. Automate test maintenance, reduce flakes, and increase efficiency. Learn how to build a robust test automation foundation. Discover the power of self-healing tests. Transform your testing experience.
Cracking AI Black Box - Strategies for Customer-centric Enterprise ExcellenceQuentin Reul
The democratization of Generative AI is ushering in a new era of innovation for enterprises. Discover how you can harness this powerful technology to deliver unparalleled customer value and securing a formidable competitive advantage in today's competitive market. In this session, you will learn how to:
- Identify high-impact customer needs with precision
- Harness the power of large language models to address specific customer needs effectively
- Implement AI responsibly to build trust and foster strong customer relationships
Whether you're at the early stages of your AI journey or looking to optimize existing initiatives, this session will provide you with actionable insights and strategies needed to leverage AI as a powerful catalyst for customer-driven enterprise success.
It's your unstructured data: How to get your GenAI app to production (and spe...Zilliz
So you've successfully built a GenAI app POC for your company -- now comes the hard part: bringing it to production. Aparavi addresses the challenges of AI projects while addressing data privacy and PII. Our Service for RAG helps AI developers and data scientists to scale their app to 1000s to millions of users using corporate unstructured data. Aparavi’s AI Data Loader cleans, prepares and then loads only the relevant unstructured data for each AI project/app, enabling you to operationalize the creation of GenAI apps easily and accurately while giving you the time to focus on what you really want to do - building a great AI application with useful and relevant context. All within your environment and never having to share private corporate data with anyone - not even Aparavi.
5. Some History
• 1970's Relational Databases Invented
– Storage is expensive
– Data is normalized
– Data storage is abstracted away from app
• 1980's RDBMS commercialized
– Client/Server model
– SQL becomes the standard
• 1990's Things begin to change
– Client/Server => 3-tier architecture
– Internet and the Web
7. Some History
• 2000's Web 2.0
– "Social Media"
– E-Commerce
– Decrease of HW prices
– Increase of collected data
• Result
– Need to scale
– How do we keep up?
9. Developers
• Agile Development Methodology
– Shorter development cycles
– Constant evolution of requirements
– Flexibility at design time
• Relational Schema
– Hard to evolve
• must stay in sync with application
18. MongoDB History
• Designed/developed by founders of DoubleClick, ShopWiki, Gilt Groupe, etc.
• First production site March 2008 - businessinsider.com
• Open Source – AGPL, written in C++
• Version 0.8 – first official release February 2009
• Version 2.4 – March 2013
23. Better Data Locality
• Data model means "entities" can reside "together"
• Optimize schema for read and write access patterns
• Minimize "seeks" as they dominate IO slowdown
• Failure to take advantage of document model:
– no improved performance
– all the disadvantages with non of the advantages!
– incorrect model can overshoot "all data embedded"
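To make the locality point concrete, here is a small illustrative sketch (the order/line-item schema is hypothetical, not from the deck) contrasting a referenced model with an embedded document and counting the logical reads needed to assemble one entity:

```python
# Referenced model: order rows and line-item rows live in separate
# collections, so reassembling an order costs one read per related record.
order_row = {"_id": 1001, "customer": "acme"}
line_item_rows = [
    {"order_id": 1001, "sku": "A-1", "qty": 2},
    {"order_id": 1001, "sku": "B-7", "qty": 1},
]

# Embedded model: the same entity stored "together" as a single document,
# which can be fetched in one read (better data locality, fewer seeks).
order_doc = {
    "_id": 1001,
    "customer": "acme",
    "items": [{"sku": "A-1", "qty": 2}, {"sku": "B-7", "qty": 1}],
}

def reads_referenced(order, items):
    """One read for the order plus one per related line item."""
    return 1 + len(items)

def reads_embedded(doc):
    """The whole entity comes back in a single document read."""
    return 1

print(reads_referenced(order_row, line_item_rows))  # 3
print(reads_embedded(order_doc))                    # 1
```

With the embedded form the entity comes back in one seek; with the referenced form, read cost grows with the number of related records.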
25. In-memory Caching
• memory mapped files
• caching handled by the OS
• naturally leaves the most frequently accessed data in RAM
• for best performance, have enough RAM to fit the indexes and working data set
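A minimal sketch of that RAM guideline (all sizes below are made-up inputs, not measurements), with some headroom reserved for connections and the OS itself:

```python
def fits_in_ram(index_bytes, working_set_bytes, ram_bytes, headroom=0.85):
    """Check whether indexes plus the working set fit in available RAM,
    leaving headroom for connections, journaling, and the OS."""
    return index_bytes + working_set_bytes <= ram_bytes * headroom

GiB = 1024 ** 3
print(fits_in_ram(index_bytes=8 * GiB, working_set_bytes=40 * GiB,
                  ram_bytes=64 * GiB))  # True: 48 GiB <= 54.4 GiB
print(fits_in_ram(index_bytes=8 * GiB, working_set_bytes=60 * GiB,
                  ram_bytes=64 * GiB))  # False: 68 GiB > 54.4 GiB
```

The 0.85 headroom factor is an assumption for illustration; the right value depends on your deployment.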
27. Auto-Sharding
• horizontal scaling is "built-in" to the product
• Replication is for HA
• Sharding is for scaling
• Number of servers in a replica set is based on HA requirements
• Number of shards is based on capacity needed vs. single server/replica set capacity
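The sizing rule above can be sketched with hypothetical throughput figures: shard count comes from required capacity versus what one replica set can sustain, while replica-set size comes from HA needs, not throughput.

```python
import math

def shards_needed(required_ops_per_sec, ops_per_replica_set):
    """Shards scale capacity: round up so no shard is over budget."""
    return math.ceil(required_ops_per_sec / ops_per_replica_set)

def total_servers(num_shards, replica_set_size=3):
    """replica_set_size comes from HA requirements (3 is a common minimum)."""
    return num_shards * replica_set_size

n = shards_needed(required_ops_per_sec=90_000, ops_per_replica_set=25_000)
print(n)                 # 4 shards
print(total_servers(n))  # 12 servers
```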
28. MongoDB Performance*
• Top 5 Marketing Firm
– Data: Key/value
– Queries: Key-based, 1–100 docs/query, 80/20 read/write
– Servers: ~250
– Ops/sec: 1,200,000
• Government Agency
– Data: 10+ fields, arrays, nested documents
– Queries: Compound queries, range queries, MapReduce, 20/80 read/write
– Servers: ~50
– Ops/sec: 500,000
• Top 5 Investment Bank
– Data: 20+ fields, arrays, nested documents
– Queries: Compound queries, range queries, 50/50 read/write
– Servers: ~40
– Ops/sec: 30,000
* These figures are provided as examples. Your application governs your performance.
33. What
• There is one thing that is absolutely mandatory in order to succeed at capacity planning
• Without it, you will not be successful
• We must have REQUIREMENTS from the business
– without requirements, we're building a roadmap without knowing the desired destination
Imagine building a car without knowing its required top speed, acceleration, fuel economy, and cost.
35. What
• Availability: what is the uptime requirement?
• Throughput
– average read/write/users
– peak throughput?
– OPS (operations per second)? per hour? per day?
• Responsiveness
– what is acceptable latency?
– is higher latency acceptable during peak times?
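These requirements can be turned into concrete planning numbers. A small sketch with a hypothetical workload and an assumed 5x peak factor:

```python
def ops_per_sec(ops_per_day):
    """Average throughput: operations per day over seconds in a day."""
    return ops_per_day / 86_400

def peak_ops_per_sec(ops_per_day, peak_factor=5.0):
    """Traffic is rarely flat; plan for peaks, not averages."""
    return ops_per_sec(ops_per_day) * peak_factor

def downtime_minutes_per_year(availability):
    """Express an uptime requirement as allowed downtime."""
    return (1 - availability) * 365 * 24 * 60

print(round(ops_per_sec(432_000_000)))          # 5000 average ops/sec
print(round(peak_ops_per_sec(432_000_000)))     # 25000 ops/sec at a 5x peak
print(round(downtime_minutes_per_year(0.999)))  # 526 minutes/year
```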
38. When
• At the beginning, before production; after you launch, you must continue the process
• Lack of future planning: failure to project the performance drop-off as the amount of data increases
• Process (steps) -> ACTIONS:
– Requirements: ask, guess, try/measure
– Understand application needs
– Choose hardware to meet that pattern (...)
– Determine how many machines you need
– Monitor to recognize growth exceeding current capacity
39. Capacity Planning: What?
• Understand Resources
– Storage
– Memory
– CPU
– Network
• Understand Your Application
– Monitor and Collect Metrics
– Model to Predict Change
– Allocate and Deploy
– (repeat process)
40. Resource Usage
• Storage
– IOPS
– Size
– Data & Loading Patterns
• Memory
– Working Set
• CPU
– Speed
– Cores
• Network
– Latency
– Throughput
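A back-of-the-envelope sizing for the storage resource above (the document counts, sizes, and overhead factors are all assumptions for illustration, not measurements):

```python
def storage_bytes(doc_count, avg_doc_bytes, index_overhead=0.25,
                  padding_overhead=0.15, copies=3):
    """Raw data plus assumed index and padding overhead, multiplied by
    the number of copies kept across the replica set."""
    raw = doc_count * avg_doc_bytes
    per_node = raw * (1 + index_overhead + padding_overhead)
    return per_node * copies

GiB = 1024 ** 3
total = storage_bytes(doc_count=500_000_000, avg_doc_bytes=1_024)
print(round(total / GiB))  # 2003 GiB across a 3-member replica set
```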
58. Starter Questions
• What is the working set?
– How does that equate to memory?
– How much disk access will that require?
• How efficient are the queries?
• What is the rate of data change?
• How big are the highs and lows?
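Rough arithmetic for the first two questions, with hypothetical numbers: estimate the working set from the "hot" fraction of the data, and estimate how many reads must go to disk when the cache misses.

```python
def working_set_bytes(total_data_bytes, hot_fraction):
    """Working set approximated as the fraction of data touched regularly."""
    return total_data_bytes * hot_fraction

def disk_reads_per_sec(reads_per_sec, cache_hit_ratio):
    """Reads that miss the in-memory cache turn into disk IOPS."""
    return reads_per_sec * (1 - cache_hit_ratio)

GiB = 1024 ** 3
ws = working_set_bytes(total_data_bytes=500 * GiB, hot_fraction=0.1)
print(round(ws / GiB, 1))                   # 50.0 GiB should stay in RAM
print(round(disk_reads_per_sec(10_000, 0.99)))  # 100 reads/sec hit disk
```

Both the 10% hot fraction and the 99% hit ratio are assumed figures; only measurement of your own workload can supply real ones.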
59. Deployment Types
All of these use the same resources:
• Single Instance
• Multiple Instances (Replica Set)
• Cluster (Sharding)
• Data Centers
61. Monitoring
• CLI and internal status commands
• mongostat; mongotop; db.serverStatus()
• Plug-ins for munin, Nagios, cacti, etc.
• Integration via SNMP to other tools
• MMS (MongoDB Monitoring Service)
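Most of the counters these tools expose are cumulative, so usable rates come from deltas between successive samples. A sketch of that calculation (the field names mimic serverStatus opcounters; the sample values are fabricated):

```python
def rates(sample_t0, sample_t1, interval_sec):
    """Per-second rates from two cumulative counter snapshots."""
    return {key: (sample_t1[key] - sample_t0[key]) / interval_sec
            for key in sample_t0}

# Two snapshots taken 60 seconds apart (fabricated values).
t0 = {"opcounters.query": 1_000_000, "opcounters.insert": 200_000}
t1 = {"opcounters.query": 1_060_000, "opcounters.insert": 203_000}

per_sec = rates(t0, t1, interval_sec=60)
print(per_sec["opcounters.query"])   # 1000.0 queries/sec
print(per_sec["opcounters.insert"])  # 50.0 inserts/sec
```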
67. Velocity of Change
• Limitations -> change takes time
– Data Movement
– Allocation/Provisioning (servers/mem/disk)
• Improvement
– Limit the size of each change (if you can)
– Increase frequency
– MEASURE its effect
– Practice