Elasticsearch


This document provides an overview of Elasticsearch, including its use cases at companies like GitHub, Stack Overflow, and Netflix. It discusses Elasticsearch's data indexing and querying capabilities. Key topics covered include document mapping and types, shards and replicas, analyzers, term queries, match queries, sorting, aggregations, and cluster configuration. The document concludes with lessons learned and a reference to Elasticsearch's documentation.


1. ELASTICSEARCH

2. Agenda - Intro - Data Indexing - Data Querying - Cluster - Lessons learned - Q & A

3. Intro - General - Written in Java (Lucene based) - Full Text Search Engine - Distributed (easy to scale) - High availability - Document oriented - RESTful API (JSON over HTTP) - Schema-less - Community support

4. Intro - Use cases: GitHub Search - search repos, users, issues, PRs - search lines of code - track events & logs

5. Intro - Use cases: Stack Overflow - full-text search (FTS) combined with geolocation - related questions & answers

6. Intro - Use cases: Netflix - query log events - tracking service deployments - related items

7. Intro - Nomenclature - Document? - Type? - Index? - Shard? - Replica? - Mapping? - Cluster?

8. Intro - Nomenclature: Document

9. Intro - Nomenclature: Doc Type A type is what defines a Document (list of fields and their types)

10. Intro - Nomenclature: Index - Elasticsearch stores data in logical indices (each of which can hold multiple types) - An index has at least 1 shard

11. Intro - Nomenclature: Shard & Replica - Shard: a single Lucene index - Replica: a copy of a primary shard
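
As a rough illustration of how primaries and replicas add up, the sketch below counts the Lucene indices that back one Elasticsearch index. The function name and the example numbers are assumptions made for illustration; they do not come from the slides.

```python
def total_shard_copies(primary_shards: int, replicas_per_primary: int) -> int:
    """Return the number of Lucene indices (primaries plus replicas) backing one index."""
    if primary_shards < 1:
        raise ValueError("an index has at least 1 primary shard")
    if replicas_per_primary < 0:
        raise ValueError("replica count cannot be negative")
    # Each primary shard exists once, plus one copy per configured replica.
    return primary_shards * (1 + replicas_per_primary)


# Hypothetical index with 5 primary shards and 1 replica of each.
print(total_shard_copies(5, 1))  # 10
```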

12. Intro - Nomenclature: Mapping Each index has a mapping that defines each type.

13. Intro - Nomenclature: Analogy Index ⇔ Database Mapping ⇔ Schema Index.Type ⇔ Table Document ⇔ Table.Row Document.Field ⇔ Table.Column

14. Intro - Inverted Index
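
The idea behind an inverted index can be sketched in a few lines: instead of mapping documents to their words, it maps each term to the documents that contain it. This is a toy model with whitespace tokenization and made-up documents, not how Lucene stores its index.

```python
from collections import defaultdict


def build_inverted_index(docs: dict[int, str]) -> dict[str, list[int]]:
    """Map each lowercased term to the sorted IDs of the documents containing it."""
    postings: dict[str, set[int]] = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


# Hypothetical documents, chosen only for illustration.
docs = {
    1: "New Brand Analytics",
    2: "brand new day",
    3: "analytics at scale",
}
index = build_inverted_index(docs)
print(index["brand"])      # [1, 2]
print(index["analytics"])  # [1, 3]
```

Looking up a term is then a single dictionary access, which is why term lookups stay fast as the number of documents grows.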

15. Indexing: Mapping - Static: define how each field should be mapped to the search engine. - Dynamic: automatically created when a new type or new field is introduced.
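
A static mapping could look roughly like the request body below, built here as a Python dictionary. The field names are hypothetical, and the sketch assumes a recent Elasticsearch version, where mappings are no longer nested under a type name and `text` / `keyword` replace the older analyzed / not-analyzed string fields. The versions the slides describe would place the properties under a type.

```python
import json

# Static mapping: field names are hypothetical.
# Assumes a recent Elasticsearch version, where "text" fields are analyzed
# for full-text search and "keyword" fields are indexed as one exact term.
static_mapping = {
    "mappings": {
        "dynamic": "strict",  # reject fields that are not declared below
        "properties": {
            "userHandle": {"type": "keyword"},
            "message": {"type": "text", "analyzer": "standard"},
            "postedAt": {"type": "date"},
        },
    }
}

print(json.dumps(static_mapping, indent=2))
```

Leaving out `"dynamic": "strict"` keeps the default dynamic behaviour, in which a mapping is created automatically for any new field.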

16. Indexing: FTS vs Exact match “New Brand Analytics” => [“New Brand Analytics”]

17. Indexing: FTS vs Exact match “New Brand Analytics” => [‘New’, ‘Brand’, ‘Analytics’]
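
The difference between the two slides above can be sketched as follows. The helper names are invented for illustration and the tokenization is deliberately simplified (whitespace split, no lowercasing).

```python
def index_exact(value: str) -> list[str]:
    """Exact match: the whole value is stored as one term."""
    return [value]


def index_full_text(value: str) -> list[str]:
    """Full-text: the value is split into one term per word (simplified)."""
    return value.split()


def term_matches(indexed_terms: list[str], query_term: str) -> bool:
    """A term lookup succeeds only if the query equals an indexed term."""
    return query_term in indexed_terms


value = "New Brand Analytics"
print(term_matches(index_exact(value), "Brand"))                # False
print(term_matches(index_exact(value), "New Brand Analytics"))  # True
print(term_matches(index_full_text(value), "Brand"))            # True
```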

18. Indexing: Analyzers - Standard - Simple - Whitespace - Language - Stop words - Pattern - Custom

19. Indexing: Analyzers “The quick brown fox jumped over the lazy dogs,123-456”
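
To give a feel for how analyzers differ on the sample sentence, the sketch below applies three regex-based approximations. They are crude stand-ins written for this example, not Lucene's actual tokenizers, and the function names are assumptions.

```python
import re

TEXT = "The quick brown fox jumped over the lazy dogs,123-456"


def whitespace_like(text: str) -> list[str]:
    """Split on whitespace only; case and punctuation are kept."""
    return text.split()


def simple_like(text: str) -> list[str]:
    """Lowercase, then keep runs of letters only (digits are dropped)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def standard_like(text: str) -> list[str]:
    """Lowercase, then keep runs of letters or digits (a crude stand-in)."""
    return re.findall(r"[^\W_]+", text.lower())


for analyzer in (whitespace_like, simple_like, standard_like):
    print(f"{analyzer.__name__}: {analyzer(TEXT)}")
```

The last token shows the contrast most clearly: the whitespace variant keeps `dogs,123-456` intact, the simple variant keeps only `dogs`, and the standard-like variant yields `dogs`, `123`, and `456`.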

20. Querying - Basic queries - Filtering - Sort - Scripts - Aggregation

21. Querying - Basic Queries Term query: matches documents that have a ‘userHandle’ field containing the exact term ‘Amine’ (the query term is not analyzed).
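
A request body for such a term query might be built like this. The helper function is an assumption for illustration, while the `userHandle` field and the term `Amine` come from the slide.

```python
import json


def term_query(field: str, value: str) -> dict:
    """Build a term query body; the value is matched as-is, without analysis."""
    return {"query": {"term": {field: value}}}


body = term_query("userHandle", "Amine")
print(json.dumps(body))
# {"query": {"term": {"userHandle": "Amine"}}}
```

Because the query term is not analyzed, `amine` and `Amine` are different terms here; whether either one matches depends on how the field was analyzed at index time.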

22. Querying - Basic Queries

23. Querying - Match Queries

24. Querying - Multifield Queries

25. Querying - Sort & Limit Queries
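
The match, multi-field, and sort & limit slides carry only titles in this transcript, so the bodies below are a guess at the kind of requests they cover. They use standard query DSL keys (`match`, `multi_match`, `sort`, `size`); all field names and query text are hypothetical.

```python
import json

# Match query: the query text is analyzed before matching.
match_body = {"query": {"match": {"message": "brand analytics"}}}

# Multi-field query: the same text is searched across several fields.
multi_match_body = {
    "query": {
        "multi_match": {
            "query": "brand analytics",
            "fields": ["title", "message"],
        }
    }
}

# Sort and limit: newest first, ten hits at most.
sorted_body = {
    "query": {"match": {"message": "brand analytics"}},
    "sort": [{"postedAt": {"order": "desc"}}],
    "size": 10,
}

for body in (match_body, multi_match_body, sorted_body):
    print(json.dumps(body))
```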

26. Querying - Script Filters

27. Querying - Aggregation - Pre-loads candidate Docs in memory. - Agg happens in memory. - { “size”: 0 } unless data needs to be seen. - Nested types = Nested aggregation. - More flexible than Facets, but slightly slower.
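
An aggregation request that follows the `size: 0` advice might look like the sketch below. The aggregation name and the field names are hypothetical.

```python
import json

# Field names and the aggregation name are hypothetical.
agg_body = {
    "size": 0,  # return no hits, only the aggregation results
    "query": {"match": {"message": "brand analytics"}},
    "aggs": {
        "posts_per_user": {
            "terms": {"field": "userHandle", "size": 5}
        }
    },
}
print(json.dumps(agg_body, indent=2))
```

Here the top-level `size` controls how many documents are returned, while the `size` inside `terms` limits the number of buckets.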

28. Cluster - Config - cluster.name - The name of the cluster the node joins - node.name - The name of this specific node - node.master - Whether this node is eligible to be elected master (several nodes can be eligible, but only one is the elected master at a time) - node.data - Whether this node holds data.
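
As a small sketch, the four settings can be rendered into the `key: value` lines used in `elasticsearch.yml`. The cluster and node names are invented, and the setting names follow the slide; newer Elasticsearch releases express node roles through `node.roles` instead of `node.master` and `node.data`.

```python
def render_settings(settings: dict[str, object]) -> str:
    """Render flat settings as 'key: value' lines, as used in elasticsearch.yml."""
    lines = []
    for key, value in settings.items():
        # YAML booleans are lowercase; other values are written as-is.
        text = str(value).lower() if isinstance(value, bool) else str(value)
        lines.append(f"{key}: {text}")
    return "\n".join(lines)


# Hypothetical cluster and node names.
data_node = {
    "cluster.name": "analytics-cluster",
    "node.name": "node-1",
    "node.master": False,  # not eligible to be elected master
    "node.data": True,     # holds shards
}
print(render_settings(data_node))
```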

29. Cluster - Config

30. Lessons learned - The number of primary shards cannot be changed after an index is created. - NEVER EVER allocate more than 50% of the available RAM to the ES heap. - Concurrent inserts to the same document can cause version collisions.
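
One way to see why the shard count is fixed: documents are routed to a shard by a formula of the form hash(routing) % number_of_primary_shards, so a different count would send lookups to the wrong shard. The sketch below uses CRC32 as a stand-in hash (an assumption; it is not the hash Elasticsearch uses) and hypothetical document IDs.

```python
import zlib


def shard_for(doc_id: str, primary_shards: int) -> int:
    """Pick a shard as hash(id) % primary_shards (CRC32 stands in for the real hash)."""
    return zlib.crc32(doc_id.encode("utf-8")) % primary_shards


doc_ids = [f"doc-{i}" for i in range(100)]  # hypothetical document IDs

# With a fixed shard count, a document always lands on the same shard.
assert all(shard_for(d, 5) == shard_for(d, 5) for d in doc_ids)

# Changing the shard count changes where many documents would be looked up.
moved = sum(shard_for(d, 5) != shard_for(d, 6) for d in doc_ids)
print(f"{moved} of {len(doc_ids)} documents would map to a different shard")
```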

31. Q & A https://www.elastic.co/guide/index.html
