This document discusses how Walmart uses Apache Solr as a "not-so-evil twin" to complement its source-of-truth database and help scale its data infrastructure. It describes how Walmart abstracts the complexity of managing databases, caches, search queries, and messaging to provide scalable querying across database shards. Using Solr has allowed Walmart to offload queries, recurring reads, and analytics from the primary database.
Building a Real-Time News Search Engine: Presented by Ramkumar Aiyengar, Bloo... (Lucidworks)
The document discusses the challenges of building a news search engine at Bloomberg L.P. It describes how Bloomberg uses Apache Solr/Lucene to index millions of news stories and handle complex search queries from customers. Some key challenges discussed include optimizing searches over huge numbers of documents and metadata fields, handling arbitrarily complex queries, and developing an alerting system to notify users of new matching results. The system has been scaled up to include thousands of Solr cores distributed across data centers to efficiently search and retrieve news content.
PlayStation and Lucene - Indexing 1M documents per second: Presented by Alexa... (Lucidworks)
The document discusses search capabilities for the PlayStation Network. It describes how the initial system indexed 200,000 documents per second for the PS Store, but more capabilities were needed. It then details the challenges in moving from a relational database to NoSQL to support indexing 1 million documents per second across multiple services for 65 million monthly active users.
The document discusses Netflix's use of Elasticsearch for querying log events. It describes how Netflix evolved from storing logs in files to using Elasticsearch to enable interactive exploration of billions of log events. It also summarizes some of Netflix's best practices for running Elasticsearch at scale, such as automatic sharding and replication, flexible schemas, and extensive monitoring.
Queue Based Solr Indexing with Collection Management: Presented by Devansh Dh... (Lucidworks)
This document discusses Gannett's transition from synchronous to asynchronous indexing using queues. It summarizes Gannett's current indexing architecture, the problems with synchronous operations like outages during schema changes, and how moving to a queue-based system with RabbitMQ allows for faster schema changes, auto-scaling of workers, and more resilient indexing with minimal authoring impact. The queue-based approach includes real-time, batch, and prep queues to handle different types of document changes and updates in a decoupled, eventually consistent manner.
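The decoupled, queue-based flow described above can be sketched roughly as follows. This is a minimal in-memory illustration, not Gannett's actual code: a real deployment would consume from RabbitMQ (for example with the pika client) and send batches to Solr, and the function and variable names here are invented for the example.

```python
import json
import queue

# Stand-in for a RabbitMQ channel: a simple in-memory queue
# illustrates how authoring is decoupled from indexing.
doc_queue = queue.Queue()

def enqueue_change(doc_id, fields):
    """Authoring side: publish a document change and return immediately,
    so schema changes or indexer outages don't block authors."""
    doc_queue.put(json.dumps({"id": doc_id, **fields}))

def drain_batch(max_docs=100):
    """Worker side: pull up to max_docs pending changes into one batch,
    so a single Solr update request covers many authoring events."""
    batch = []
    while len(batch) < max_docs:
        try:
            batch.append(json.loads(doc_queue.get_nowait()))
        except queue.Empty:
            break
    return batch

enqueue_change("story-1", {"title": "Breaking news"})
enqueue_change("story-2", {"title": "Follow-up"})
batch = drain_batch()
```

Because consumers only ever see serialized messages, workers can be scaled out or taken down for schema changes while the queue absorbs writes, which is the "eventually consistent" property the abstract describes.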
This document provides an overview of Central Log Management at the University of Cape Town. It discusses Splunk and the ELK stack for collecting, analyzing, and monitoring machine data from various sources. Splunk is featured for its collection, search, reporting, and alerting capabilities. The ELK stack deployed at UCT includes Logstash to process logs from firewalls and send them to Elasticsearch for storage and querying in Kibana for visualization. Shipper and indexer configurations are shown for ingesting Palo Alto firewall logs into Elasticsearch.
Webinar: Replace Google Search Appliance with Lucidworks Fusion (Lucidworks)
Lucidworks Senior Search Engineer, Evan Sayer, and Enterprise Content Management and Big Data Architect for the County of Sacramento, Guy Sperry, explore the benefits of replacing Google Search Appliance with Lucidworks Fusion.
Searching The Enterprise Data Lake With Solr - Watch Us Do It!: Presented by... (Lucidworks)
The document discusses searching enterprise data lakes with Apache Solr. It begins with an overview of how data storage has evolved from single databases to data warehouses to modern data lakes that store vast amounts of raw and processed data. The challenge is finding needed data in this environment. The document then covers the process for indexing data lake contents with Solr, including ingesting data, configuring Solr, parsing and indexing data, searching and analyzing data. It concludes with a demonstration of performing these steps and resources for further information.
Discover the latest and upcoming features of the Stack: data lifecycle management for hot/warm/cold architectures with Data Streams, improvements in memory and disk usage, and improvements in query routing; multi-language data analytics with the query DSL, SQL, KQL, PromQL, and EQL; and the new Alerts and Actions system.
This presentation summarizes the work done to compare NoSQL versus MySQL for a proposed Internet Access Logs application.
The work done has four parts:
- An initial study of the current state of open source NoSQL solutions
- Why MongoDB has been chosen and how it has been installed and configured
- Design of a schema, a few PHP classes and scripts for testing MongoDB and MySQL
- The comparative results and conclusions
More info at http://www.ciges.net/mysql-vs-mongodb-para-el-analisis-de-logs-de-acceso-a-internet or at https://github.com/Ciges/internet_access_control_demo
Elastic{ON} is the major conference organized by Elastic, where new features and the roadmap are announced. In this session, we will explore what's new in the Elastic Stack.
This document summarizes an open source scalable log analytics solution. The solution uses Lumberjack for log collection, Logstash for indexing and filtering logs, Redis for buffering, Elasticsearch for indexing and searching logs, MongoDB for document storage, and Kibana with D3.js for visualization. Logs are collected from servers by Lumberjack, sent to Logstash for processing, indexed by Elasticsearch for searching, stored in MongoDB for retrieval, and visualized through dashboards and reports in Kibana. The solution allows for real-time log analysis, flexible searching and filtering, and scales horizontally as needs grow.
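As a rough illustration of the Logstash filtering step in a pipeline like the one above, the sketch below parses a common-format access log line into a structured event ready for indexing. The log format, regex pattern, and function names are assumptions for illustration, not configuration from the deployment described.

```python
import re

# A minimal stand-in for a Logstash grok filter: parse a common-format
# access log line into a structured dict ready for indexing.
LOG_PATTERN = re.compile(
    r'(?P<client>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<bytes>\d+)'
)

def parse_line(line):
    """Return a structured event dict, or None if the line doesn't match."""
    m = LOG_PATTERN.match(line)
    if not m:
        return None
    event = m.groupdict()
    # Cast numeric fields so they can be aggregated after indexing.
    event["status"] = int(event["status"])
    event["bytes"] = int(event["bytes"])
    return event

line = '10.0.0.5 - - [05/May/2016:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 1024'
doc = parse_line(line)
```

Once each line is a flat dict like this, shipping it to Elasticsearch and querying or visualizing it in Kibana is straightforward, which is what makes the filtering stage the workhorse of such pipelines.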
Lessons Learned in Deploying the ELK Stack (Elasticsearch, Logstash, and Kibana) (Cohesive Networks)
Slides from the Chicago AWS user group on May 5th, 2016. Asaf Yigal, Co-Founder and VP Product at Logz.io, presented on using Elasticsearch, Logstash, and Kibana in Amazon Web Services.
"Setting up the increasingly-popular open-source ELK Stack (Elasticsearch, Logstash, and Kibana) on AWS might seem like an easy task, but we have gone through several iterations in our architecture and have made some mistakes in our deployments that have turned out to be common in the industry. In this talk, we will go through what we did and explain what worked and what failed -- and why. We will also provide a complete blueprint of how to set up ELK for production on AWS." ~ @asafyigal
Elasticsearch is a free and open source distributed search and analytics engine. It allows documents to be indexed and searched quickly and at scale. Elasticsearch is built on Apache Lucene and uses RESTful APIs. Documents are stored in JSON format across distributed shards and replicas for fault tolerance and scalability. Elasticsearch is used by many large companies due to its ability to easily scale with data growth and handle advanced search functions.
Efficient Scalable Search in a Multi-Tenant Environment: Presented by Harry H... (Lucidworks)
This document discusses efficient scalable search in a multi-tenant environment. It describes Bloomberg Vault, which hosts large volumes of enterprise communications and documents for compliance. The system uses a distributed architecture with shards that are loaded on demand to serve search queries. Security is ensured by dynamically generating field values that encapsulate access permissions for each user's view of a document.
Logging is one of those things that everyone complains about but few dedicate time to. Of course, the first rule of logging is "do it". Without that, you have no visibility into system activities when investigations are required. But the end goal is much, much more than this. Almost all applications require security audit logs for compliance; application logs for visibility across all cloud properties; and application tracing for tracking usage patterns and business intelligence. The latter is the magic sauce that helps businesses learn about their customers, or in some cases the data is FOR the customer. Without a strategy, this can get very messy, fast. In this session Michele will discuss design patterns for a sound logging and audit strategy; considerations for security and compliance; the benefits of a NoSQL approach; and more.
Never Stop Exploring - Pushing the Limits of Solr: Presented by Anirudha Jadh... (Lucidworks)
This document discusses optimizing Solr for near real-time indexing of large datasets. The author describes benchmarking different indexing configurations, finding that batching documents by time, size or number provides much higher indexing throughput than single documents. The author proposes a PID controller to dynamically adjust batching parameters based on indexing performance. Future work includes refining the PID controller, integrating it with benchmarking tools, and using it for hardware sizing.
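A PID controller for batch sizing along these lines could be sketched as below. This is a toy illustration, not the author's implementation: the gains, target throughput, and class name are invented for the example.

```python
class BatchSizePID:
    """Toy PID controller that nudges an indexing batch size toward a
    target throughput (docs/sec). Gains and target are illustrative
    assumptions, not values from the talk."""

    def __init__(self, target, kp=0.5, ki=0.1, kd=0.05):
        self.target = target
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = 0.0

    def adjust(self, batch_size, measured_throughput):
        # Standard PID terms: proportional to the current error,
        # integral of past errors, derivative of the error's change.
        error = self.target - measured_throughput
        self.integral += error
        derivative = error - self.prev_error
        self.prev_error = error
        correction = (self.kp * error + self.ki * self.integral
                      + self.kd * derivative)
        # Never let the batch collapse below a single document.
        return max(1, int(batch_size + correction))

pid = BatchSizePID(target=10000)
# Measured throughput is below target, so the controller grows the batch.
size = pid.adjust(500, measured_throughput=8000)
```

Calling `adjust` after every indexing cycle closes the feedback loop: when measured throughput falls below target the batch grows, and when it overshoots the batch shrinks, which is the dynamic tuning the abstract proposes.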
The document outlines an agenda for a conference on search and recommenders hosted by Lucidworks, including presentations on use cases for ecommerce, compliance, fraud and customer support; a demo of Lucidworks Fusion which leverages signals from user engagement to power both search and recommendations; and a discussion of future directions including ensemble and click-based recommendation approaches.
Building an ETL pipeline for Elasticsearch using Spark (Itai Yaffe)
How we, at eXelate, built an ETL pipeline for Elasticsearch using Spark, including :
* Processing the data using Spark.
* Indexing the processed data directly into Elasticsearch using the elasticsearch-hadoop plug-in for Spark.
* Managing the flow using some of the services provided by AWS (EMR, Data Pipeline, etc.).
The presentation includes some tips and discusses some of the pitfalls we encountered while setting up this process.
Elasticsearch is a distributed, open source search and analytics engine built on Apache Lucene. It allows storing and searching of documents of any schema in JSON format. Documents are organized into indexes which can have multiple shards and replicas for scalability and high availability. Elasticsearch provides a RESTful API and can be easily extended with plugins. It is widely used for full-text search, structured search, analytics and more in applications requiring real-time search and analytics of large volumes of data.
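As a small concrete example of the JSON document model described above, the sketch below builds the newline-delimited body accepted by Elasticsearch's `_bulk` endpoint (an action line followed by a source line per document). The index name and documents are illustrative.

```python
import json

def bulk_body(index, docs):
    """Build the newline-delimited JSON body for Elasticsearch's _bulk
    endpoint: one action line plus one source line per document."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index, "_id": doc["id"]}}))
        lines.append(json.dumps(doc))
    # The bulk API requires a trailing newline after the last line.
    return "\n".join(lines) + "\n"

body = bulk_body("logs", [{"id": "1", "msg": "boot"},
                          {"id": "2", "msg": "ready"}])
```

Batching documents this way is how clients keep indexing throughput high: one HTTP request carries many index operations, and Elasticsearch distributes them across shards.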
Each money persona has its own set of beliefs and behaviours. Identifying your money persona allows you to better understand and improve your relationship with money.
Future competitiveness is found where things are created together @ First Tuesday Bergen (First Tuesday Bergen)
Future competitiveness is found where things are created together
Johnny Anderson - Head of Marketing and Communications @ Skandiabanken
The world keeps shrinking, and large global companies are making themselves small and agile. Soon, truly everything will be connected to everything. The financial industry faces a new threat landscape, and many established logics are being challenged. This also creates enormous opportunities for small technology companies with big ambitions. Those who want to be an innovative powerhouse in the future must work hard, share knowledge, and find good partnerships. And we as a bank must be challenged from the outside in order to see things in new ways and create the concepts and customer experiences of the future.
Johnny Anderson is Head of Marketing & Communications at Skandiabanken, responsible for the bank's brand, communications, customer development, and customer service. He has long experience developing the bank's concepts, channels, and services since 2005, holds a degree in media studies from the University of Bergen, and is interested in how new technology influences and changes our behaviour.
Developing A Big Data Search Engine - Where we have gone. Where we are going:... (Lucidworks)
Mark Miller discussed the history and future of search engines like Lucene and Solr. He explained that Lucene search engines currently lead the field and are widely used. While search has scaled up, it remains imperfect and can be flaky at large scales. Extensive testing, including of integration and distributions, is needed to improve reliability. Leveraging Hadoop's distributed capabilities can help push search engines to be more scalable and correct at large sizes.
Rackspace: Email's Solution for Indexing 50K Documents per Second: Presented... (Lucidworks)
George Bailey and Cameron Baker of Rackspace presented their solution for indexing over 50,000 documents per second for Rackspace Email. They modernized their system using Apache Flume for event processing and aggregation and SolrCloud for real-time search. This reduced indexing time from over 20 minutes to under 5 seconds, reduced the number of physical servers needed from over 100 to 14, and increased indexing throughput from 1,000 to over 50,000 documents per second while supporting over 13 billion searchable documents.
1. The document provides an overview of iShares exchange traded funds (ETFs), including BlackRock's growth in the ETF market, what ETFs are and their benefits, key areas to consider when selecting ETFs, and iShares' product range.
2. iShares is the leading ETF provider globally with over $1 trillion in assets under management and 780 ETFs listed. ETFs provide diversified exposure to indexes with the trading flexibility of stocks at a low cost.
3. The document discusses factors to consider when selecting ETFs like the underlying index, domicile and registration, and structure and risks. It also outlines the evolution of the large and
This document discusses various approaches to analyzing delays in construction projects, including As-Planned vs As-Built, Impacted As-Planned, Collapsed As-Built, and Time Impact Analysis using snapshot and window approaches. It defines key delay analysis terms and provides examples of inserting delays into schedules and calculating extension of time and costs using different methods. The preferred approach discussed is window-based Time Impact Analysis, which divides a project into time windows and compares schedules to determine delay impacts at different points in time. Concurrent delays that cannot be separated are generally only entitled to extension of time but not additional costs.
Ubiquitous Solr - A Database's not-so-evil Twin (Ayon Sinha)
Ubiquitous Solr - A Database’s not-so-evil Twin
This document discusses how Apache Solr can be used as a "not-so-evil" twin to complement and scale a database. As user demand and data volume increases over time, offloading queries, analytics, and recurring reads to Solr can help reduce load on the database. This allows separating concerns between the source of truth database and a search index for improved performance and scalability. The document advocates for viewing Solr as a first-class citizen in the technology stack that can symbiotically exist with databases, NoSQL, and big data systems to power use cases beyond just search.
1. We provide database administration and management services for Oracle, MySQL, and SQL Server databases.
2. Big Data solutions need to address storing large volumes of varied data and extracting value from it quickly through processing and visualization.
3. Hadoop is commonly used to store and process large amounts of unstructured and semi-structured data in parallel across many servers.
Bill Yetman of Ancestry gave a presentation on starting a Big Data initiative. He summarized that Big Data initiatives should:
1) Understand business goals and work backwards to define needed analytics and actions.
2) Deliver value at each step by producing actionable insights and business results.
3) Follow companies like Netflix, Facebook, and LinkedIn that are leaders in Big Data architecture to learn best practices.
Transform from database professional to a Big Data architect (Saurabh K. Gupta)
This document discusses transitioning from an Oracle DBA to a Big Data architect. It provides an overview of big data, key technologies, and how DBAs can leverage their skills. The speaker is introduced as a data leader with Oracle experience who will cover how DBAs can contribute to big data. The agenda includes an overview of big data, designing big data solutions, common technologies, and building a big data team. Common data sources, acquisition methods, storage options, and analytics are also summarized.
5 Things that Make Hadoop a Game Changer
Webinar by Elliott Cordo, Caserta Concepts
There is much hype and mystery surrounding Hadoop's role in analytic architecture. In this webinar, Elliott presented, in detail, the services and concepts that make Hadoop a truly unique solution - a game changer for the enterprise. He talked about the real benefits of a distributed file system, the multi-workload processing capabilities enabled by YARN, and the three other important things you need to know about Hadoop.
To access the recorded webinar, visit the event site: https://www.brighttalk.com/webcast/9061/131029
For more information on the services and solutions that Caserta Concepts offers, please visit http://casertaconcepts.com/
This document provides an overview of big data and business analytics. It discusses the key characteristics of big data, including volume, variety, and velocity. Volume refers to the enormous and growing amount of data being generated. Variety means data comes in all types from structured to unstructured. Velocity indicates that data is being created in real-time and needs to be analyzed rapidly. The document also outlines some of the challenges of big data and how cloud computing and technologies like Hadoop are helping to manage and analyze large, complex data sets.
The Data Lake and Getting Businesses the Big Data Insights They Need (Dunn Solutions Group)
Do terms like "Data Lake" confuse you? You’re not alone. With all of the technology buzzwords flying around today, it can become a task to keep up with and clearly understand each of them. However a data lake is definitely something to dedicate the time to understand. Leveraging data lake technology, companies are finally able to keep all of their disparate information and streams of data in one secure location ready for consumption at any time – this includes structured, unstructured, and semi-structured data. For more information on our Big Data Consulting Services, don’t hesitate to visit us online at: http://bit.ly/2fvV5rR
1. Introduction to the Course "Designing Data Bases with Advanced Data Models... (Fabio Fumarola)
Information technology has led us into an era where the production, sharing, and use of information are part of everyday life, and in which we are often unaware actors: it is now almost inevitable to leave a digital trail of many of the actions we perform every day, for example through digital content such as photos, videos, and blog posts, and everything that revolves around social networks (Facebook and Twitter in particular). On top of this, with the "internet of things" we see a growing number of devices such as watches, bracelets, thermostats, and many other items that can connect to the network and therefore generate large data streams. This explosion of data justifies the birth of the term Big Data: it denotes data produced in large quantities, at remarkable speed, and in different formats, which requires processing technologies and resources that go far beyond conventional data management and storage systems. It is immediately clear that 1) data storage models based on the relational model, and 2) processing systems based on stored procedures and computation on grids, are not applicable in these contexts. As regards point 1, RDBMSs, widely used for a great variety of applications, run into problems when the amount of data grows beyond certain limits. Scalability and implementation cost are only part of the disadvantages: very often, when faced with managing big data, variability - that is, the lack of a fixed structure - also represents a significant problem. This has given a boost to the development of NoSQL databases. The website NoSQL Databases defines NoSQL databases as "Next Generation Databases mostly addressing some of the points: being non-relational, distributed, open source and horizontally scalable."
These databases are distributed, open source, horizontally scalable, without a predetermined schema (key-value, column-oriented, document-based, and graph-based), easily replicable, without ACID guarantees, and able to handle large amounts of data. They are integrated with processing tools based on the MapReduce paradigm proposed by Google in 2004. MapReduce, together with the open source Hadoop framework, represents the new model for distributed processing of large amounts of data, supplanting techniques based on stored procedures and computational grids (point 2). The relational model taught in basic database design courses has many limitations compared to the demands posed by new applications based on Big Data, which use NoSQL databases to store data and MapReduce to process large amounts of data.
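The MapReduce paradigm mentioned above can be illustrated with the classic word-count example, here run entirely in memory; on Hadoop the same map and reduce phases would run distributed across a cluster, with the framework handling the shuffle between them. Function names are illustrative.

```python
from itertools import groupby

def map_phase(lines):
    """Map: emit a (word, 1) pair for every word in the input."""
    return [(word.lower(), 1) for line in lines for word in line.split()]

def reduce_phase(pairs):
    """Shuffle + reduce: sort pairs by key, then sum counts per key."""
    pairs = sorted(pairs)  # stands in for the distributed shuffle/sort
    return {key: sum(count for _, count in group)
            for key, group in groupby(pairs, key=lambda kv: kv[0])}

counts = reduce_phase(map_phase(["big data", "big deal"]))
```

The point of the paradigm is that `map_phase` and `reduce_phase` contain no coordination logic at all; the framework can therefore partition the input, run many mappers and reducers in parallel, and recover from individual task failures.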
Course Website http://pbdmng.datatoknowledge.it/
Data Integration and Data Warehousing for Cloud, Big Data and IoT: What’s Ne... (Rittman Analytics)
Mark Rittman presented at Big Data World in London in March 2017 on data integration and data warehousing for cloud, big data, and IoT. He discussed the history of data warehousing and how it has evolved from traditional RDBMS implementations to embrace big data technologies like Hadoop. He described how cloud data warehouse offerings from Google BigQuery and Amazon Redshift combine the scalability of big data with the structure of data warehousing. Rittman also covered new approaches to ETL using data pipelines, schema discovery using machine learning, emerging open-source BI tools, and his current work in these areas.
The term "Data Lake" has become almost as overused and undescriptive as "Big Data". Many believe that centralizing datasets in HDFS makes a data lake, but then they struggle to realize any tangible value. This talk will redefine the "Data Lake" by describing four specific, key characteristics that we at Koverse have learned are crucial to successful enterprise data lake deployments. These characteristics are 1) indexing and search across all data sets, 2) interactive access for all users in the enterprise, 3) multi-level access control, and 4) integration with data science tools. These characteristics define a system that lets people realize value from their data versus getting lost in the hype. The talk will go on to provide a technical description of how we have integrated several projects, namely Apache Accumulo, Hadoop, and Spark, to implement an enterprise data lake with these key features.
Slides used for the keynote at the even Big Data & Data Science http://eventos.citius.usc.es/bigdata/
Some slides are borrowed from various Hadoop/big data presentations.
What exactly is big data? The definition of big data is data that contains greater variety, arriving in increasing volumes and with more velocity. This is also known as the three Vs. Put simply, big data is larger, more complex data sets, especially from new data sources.
The buzzword of the past year is "Data Science". But what does it really mean? What does a "Data Scientist" actually do? What tools does Microsoft provide? And what other tools are there besides Microsoft's?
Data is being generated at a feverish pace and forward thinking companies are integrating big data and analytics as part of their core strategy from day one. However, it is often hard to sift through the hype around big data and many companies start with only a small subset of data. Can smaller companies benefit from big data efforts? We will discuss several use cases and examples of how startups are using data to optimize their operations, connect with their users, and expand their market.
Hadoop World 2011: Data Mining in Hadoop, Making Sense of it in Mahout! - Mic... (Cloudera, Inc.)
Much of Hadoop adoption thus far has been for use cases such as processing log files, text mining, and storing masses of file data -- all very necessary, but largely not exciting. In this presentation, Michael Cutler presents a selection of methodologies, primarily using Mahout, that will enable you to derive real insight into your data (mined in Hadoop) and build a recommendation engine focused on the implicit data collected from your users.
Real Time Interactive Queries in Hadoop: Big Data Warehousing Meetup (Caserta)
During the Big Data Warehousing Meetup, we discussed options for enabling real-time/interactive queries to support business intelligence type functionality on Hadoop. Also, Hortonworks provided a deep-dive demo of Stinger! You can access that slideshow here: http://www.slideshare.net/CasertaConcepts/stinger-initiative-hortonworks
If you would like more information, please don't hesitate to contact us at info@casertaconcepts.com. Or, visit our website at http://casertaconcepts.com/.
Similar to Ubiquitous Solr - A Database's Not-So-Evil Twin: Presented by Ayon Sinha, WalmartLabs (20)
Search is the Tip of the Spear for Your B2B eCommerce Strategy (Lucidworks)
With ecommerce experiencing explosive growth, it seems intuitive that the B2B segment of that ecosystem is mirroring the same trajectory. That said, B2B has very different needs when it comes to transacting with the same style of experiences that we see in B2C. For instance, B2B ecommerce is about precision findability, whereas B2C customers can convert at higher rates when they’re just browsing online. In order for the B2B buying experience to be successful, search needs to be tuned to meet the unique needs of the segment.
In this webinar with Forrester senior analyst Joe Cicman, you’ll learn:
-Which verticals in B2B will drive the most growth, and how machine-learning powered personalization tactics can be deployed to support those specific verticals
-Why an omnichannel selling approach must be deployed in order to see success in B2B
-How deploying content search capabilities will support a longer sales cycle at scale
-What the next steps are to support a robust B2B commerce strategy supported by new technology
Speakers
Joe Cicman, Senior Analyst, Forrester
Jenny Gomez, VP of Marketing, Lucidworks
Customer loyalty starts with quickly responding to your customer’s needs. When it comes to resolving open support cases, time is of the essence. Time spent searching for answers adds up and creates inefficiencies in resolving cases at scale. Relevant answers need to be a few clicks away and easily accessible for agents directly from their service console.
We will explore how Lucidworks’ Agent Insights application automatically connects agents with the correct answers and resources. You’ll learn how to:
-Configure a proactive widget in an agent’s case view page to access resources across third-party systems (such as Sharepoint, Confluence, JIRA, Zendesk, and ServiceNow).
-Easily set up query pipelines to autonomously route assets and resources that are relevant to the case-at-hand—directly to the right agent.
-Identify subject matter experts within your support data and access tribal knowledge with lightning-fast speed.
How Crate & Barrel Connects Shoppers with Relevant Products (Lucidworks)
Lunch and Learn during Retail TouchPoints #RIC21 virtual event.
***
Crate & Barrel’s previous search solution couldn’t provide its shoppers with an online search and browse experience consistent with the customer-centric Crate & Barrel brand. Meanwhile, Crate & Barrel merchandisers spent the bulk of their time manually creating and maintaining search rules. The search experience impacted customer retention, loyalty, and revenue growth.
Join this lunch & learn for an interactive chat on how Crate & Barrel partnered with Lucidworks to:
-Improve search and browse by modernizing the technology stack with ML-based personalization and merchandising solutions
-Enhance the experience for both shoppers and merchandisers
-Explore signals to transform the omnichannel shopping experience
Questions? Visit https://lucidworks.com/contact/
Learn how to guide customers to relevant products using eCommerce search, hyper-personalisation, and recommendations in our ‘Best-In-Class Retail Product Discovery’ webinar.
Nowadays, shoppers want their online experience to be engaging, inspirational, and fulfilling. They want to find what they’re looking for quickly and easily. If the sought-after item isn’t available, they want the next best product or content surfaced to them. They want a website to understand their goals as though they were talking to a sales assistant in person, in-store.
In this webinar, we explore IMRG industry data insights and a best-in-class example of retail product discovery. You’ll learn:
- How AI can drive increased revenue through hyper-personalised experiences
- How user intent can be easily understood and results displayed immediately
- How merchandisers can be empowered to curate results and product placement – all without having to rely on IT.
Presented by:
Dave Hawkins, Principal Sales Engineer - Lucidworks
Matthew Walsh, Director of Data & Retail - IMRG
Connected Experiences Are Personalized Experiences (Lucidworks)
Many companies claim personalization and omnichannel capabilities are top priorities. Few are able to deliver on those experiences.
For a recent Lucidworks-commissioned study, Forrester Consulting surveyed 350+ global business decision-makers to see what gets in the way of achieving these goals. They discovered that inefficient technology, lack of behavioral insights, and failure to tie initiatives to enterprise-wide goals are some of the most frequent blockers to personalization success.
Join guest speaker, Forrester VP and Principal Analyst, Brendan Witcher, and Lucidworks CEO, Will Hayes, to hear the results of the Forrester Consulting study, how to avoid “digital blindness,” and how to apply VoC data in real-time to delight customers with personalized experiences connected across every touchpoint.
In this webinar, you’ll learn:
- Why companies who utilize real-time customer signals report more effective personalization
- How to connect employees and customers in a shared experience through search and browse
- How Lucidworks clients Lenovo, Morgan Stanley and Red Hat fast-tracked improvements in conversion, engagement and customer satisfaction
Featuring
- Will Hayes, CEO, Lucidworks
- Brendan Witcher, VP, Principal Analyst, Forrester
Intelligent Insight Driven Policing with MC+A, Toronto Police Service and Luc... (Lucidworks)
Intelligent Policing. Leveraging Data to more effectively Serve Communities.
Policing in the next decade is anticipated to be very different from historical methods: more data driven, more focused on the intricacies of the communities they serve, and more open and collaborative to make informed recommendations a reality. Whether it's social populations, NIBRS, or organizational improvement that's the driver, the IT requirement is largely the same: provide access to large volumes of siloed data to gain a full 360-degree understanding of existing connections and patterns for improved insight and recommendation.
Join us for a round table discussion of how the Toronto Police Service is better serving their community through deploying a unified intelligent data platform.
Data innovation improves officers' engagement with existing data and streamlines investigation workflows by enhancing collaboration. This improved visibility into existing police data allows for a more intelligent and responsive police force.
In this webinar, we'll cover:
-The technology needs of an intelligent police force.
-How a Global Search improves an officer's interaction with existing data.
Featuring:
-Simon Taylor, VP, Worldwide Channels & Alliances, Lucidworks
-Michael Cizmar, Managing Director, MC+A
-Ian Williams, Manager of Analytics & Innovation, Toronto Police Service
Preparing for Peak in Ecommerce | eTail Asia 2020Lucidworks
This document provides a framework for prioritizing onsite search problems and key performance indicators (KPIs) to measure for e-commerce search optimization. It recommends prioritizing fixing searches that yield no results, improving relevance of results, and reducing false positives. The most essential KPIs to measure include query latency, throughput, and result relevance, measured through click-through rates and NDCG scores. The document also provides tips for self-benchmarking search performance and examples of search performance benchmarks across nine e-commerce sites from various industries.
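The KPIs above include NDCG, which scores a ranking against the ideal ordering of the same results. As a rough illustration (not taken from the benchmark document), NDCG@k can be computed as:

```python
import math

def dcg(relevances):
    """Discounted cumulative gain for a ranked list of relevance grades."""
    return sum(rel / math.log2(rank + 2)  # rank is 0-based, so +2
               for rank, rel in enumerate(relevances))

def ndcg(relevances, k=10):
    """NDCG@k: DCG of the actual ranking divided by DCG of the ideal one."""
    actual = dcg(relevances[:k])
    ideal = dcg(sorted(relevances, reverse=True)[:k])
    return actual / ideal if ideal > 0 else 0.0

print(ndcg([3, 2, 1, 0]))  # perfect ordering scores 1.0
print(ndcg([0, 1, 2, 3]))  # reversed ordering scores below 1.0
```

A score of 1.0 means the engine returned the most relevant documents first; click-through rates can supply the relevance grades when editorial judgments are unavailable.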
Accelerate The Path To Purchase With Product Discovery at Retail Innovation C...Lucidworks
Wish your conversion rates were higher? Can’t figure out how to efficiently and effectively serve all the visitors on your site? Embarrassed by the quality of your product discovery experience? The bar is high and the influx of online shopping over recent months has reminded us that the opportunities are real. We’re all deep in holiday prep, but let’s take a few minutes to think about January 2021 and beyond. How can we position ourselves for success with our customers and against our competition?
Grab your lunch and let’s dive into three strategies that need to be part of your 2021 roadmap. You don’t need an army to get there. But you do need to take action and capitalize on the shoppers abandoning the product discovery journey on your site.
In this session, attendees will find out how to:
-Take control of merchandising at scale;
-Implement hands-free search relevancy; and
-Address personalization challenges.
AI-Powered Linguistics and Search with Fusion and RosetteLucidworks
For a personalized search experience, search curation requires robust text interpretation, data enrichment, relevancy tuning and recommendations. In order to achieve this, language and entity identification are crucial.
For teams working on search applications, advanced language packages allow them to achieve greater recall without sacrificing precision.
Join us for a guided tour of our new Advanced Linguistics packages, available in Fusion, thanks to the technology partnership between Lucidworks and Basistech.
We’ll explore the application of language identification and entity extraction in the context of search, along with practical examples of personalizing search and enhancing entity extraction.
In this webinar, we’ll cover:
-How Fusion uses the Rosette Basic Linguistics and Entity Extraction packages
-Tips for improving language identification and treatment as well as data enrichment for personalization
-Speech2 demo modeling Active Recommendation
-How to use Rosette’s packages with Fusion Pipelines to build custom entities for specific domain use cases
Featuring:
-Radu Miclaus, Director of Product, AI and Cloud, Lucidworks
-Robert Lucarini, Senior Software Engineer, Lucidworks
-Nick Belanger, Solutions Engineer, Basis Technology
The Service Industry After COVID-19: The Soul of Service in a Virtual MomentLucidworks
Before COVID-19, almost 80% of the US workforce worked in service jobs that involve in-person interaction with strangers. Now, leaders of service organizations must reshape their offerings during the pandemic and prepare for whatever the new normal turns out to be. Our three panelists will share ideas for adapting their service businesses, now that closer-than-six-feet isn’t an option.
Join Lucidworks as we talk shop with 3 service business leaders, covering:
-Common impacts of the pandemic on service businesses (and what to do about them),
-How service teams can maintain a human touch across virtual channels, and
-Plans for the future, before and after the pandemic subsides.
Featuring
-Sara Nathan, President & CEO, AMIGOS
-Anthony Carruesco, Founder, AC Fly Fishing
-Sara Bradley, Chef and Proprietor, Freight House
-Justin Sears, VP Product Marketing, Lucidworks
Webinar: Smart answers for employee and customer support after covid 19 - EuropeLucidworks
The COVID-19 pandemic has forced companies to support far more customers and employees through digital channels than ever before. Many are turning to chatbots to help meet increasing demand, but traditional rules-based approaches can’t keep up. Our new Smart Answers add-on to Lucidworks Fusion makes existing chatbots and virtual assistants more intelligent and more valuable to the people you serve.
Smart Answers for Employee and Customer Support After COVID-19Lucidworks
Watch our on-demand webinar showcasing Smart Answers on Lucidworks Fusion. This technology makes existing chatbots and virtual assistants more intelligent and more valuable to the people you serve.
In this webinar, we’ll cover:
-How search and deep learning extend conversational frameworks for improved experiences
-How Smart Answers improves customer care, call deflection, and employee self-service
-A live demo of Smart Answers for multi-channel self-service support
Applying AI & Search in Europe - featuring 451 ResearchLucidworks
In the current climate, it’s now more important than ever to digitally enable your workforce and customers.
Hear from Simon Taylor, VP Global Partners & Alliances, Lucidworks and Matt Aslett, Research Vice President, 451 Research to get the inside scoop on how industry leaders in Europe are developing and executing their digital transformation strategies.
In this webinar, we’ll discuss:
-The top challenges and aspirations European business and technology leaders are solving using AI and search technology
-Which search and AI use cases are making the biggest impact in industries such as finance, healthcare, retail and energy in Europe
-What technology buyers should look for when evaluating AI and search solutions
Webinar: Accelerate Data Science with Fusion 5.1Lucidworks
This document introduces Fusion 5.1 and its new capabilities for integrating with data science tools like Tensorflow, Scikit-Learn, and Spacy.
It provides an overview of Fusion's capabilities for understanding content, users, and delivering insights at scale. The document then demonstrates Fusion's Jupyter Notebook integration for reading and writing data and running SQL queries.
Finally, it shows how Fusion integrates with Seldon Core to easily deploy machine learning models with tools like Tensorflow and Scikit-Learn. A live demo is provided of deploying a custom model and using it in Fusion's query and indexing pipelines.
Webinar: 5 Must-Have Items You Need for Your 2020 Ecommerce StrategyLucidworks
In this webinar with 451 Research, you'll understand how retailers are using AI to predict customer intent and learn which key performance metrics are used by more than 120 online retailers in Lucidworks’ 2019 Retail Benchmark Survey.
In this webinar, you’ll learn:
● What trends and opportunities are facing the ecommerce industry in 2020
● Why search is the universal path to understanding customer intent
● How large online retailers apply AI to maximize the effectiveness of their personalization efforts
Where Search Meets Science and Style Meets Savings: Nordstrom Rack's Journey ...Lucidworks
Nordstrom Rack | Hautelook curates and serves customers a wide selection of on-trend apparel, accessories, and shoes at an everyday savings of up to 75 percent off regular prices. With over a million visitors shopping across different platforms every day, and a realization that customers have become accustomed to robust and personalized search interactions, Nordstrom Rack | Hautelook launched an initiative over a year ago to provide data science-driven digital experiences to their customers.
In this session, we’ll discuss Nordstrom Rack | Hautelook’s journey of operationalizing a hefty strategy, optimizing a fickle infrastructure, and rallying troops around a single vision of building an expansible machine-learning driven product discovery engine.
The audience will learn about:
-The key technical challenges and outcomes that come with onboarding a solution
-The lessons learned of creating and executing operational design
-The use of Lucidworks Fusion to plug custom data science models into search and browse applications to understand user intent and deliver personalized experiences
Apply Knowledge Graphs and Search for Real-World Decision IntelligenceLucidworks
Knowledge graphs and machine learning are on the rise as enterprises hunt for more effective ways to connect the dots between the data and the business world. With newer technologies, the digital workplace can dramatically improve employee engagement, data-driven decisions, and actions that serve tangible business objectives.
In this webinar, you will learn:
-- Introduction to knowledge graphs and where they fit in the ML landscape
-- How breakthroughs in search affect your business
-- The key features to consider when choosing a data discovery platform
-- Best practices for adopting AI-powered search, with real-world examples
Webinar: Building a Business Case for Enterprise SearchLucidworks
The document discusses building a business case for enterprise search. It notes that 85% of information is unstructured data locked in various locations and applications. Many knowledge workers spend a significant portion of their day searching across multiple systems for information. The rise of unstructured data and AI capabilities can help organizations unlock value from their information assets. Effective enterprise search powered by AI can provide real-time intelligence, personalized information, and more efficient research to help knowledge workers.
Ubiquitous Solr - A Database's Not-So-Evil Twin: Presented by Ayon Sinha, WalmartLabs
1. Ubiquitous Solr - A Database’s not-so-evil Twin
Ayon Sinha
Data Foundation @WalmartLabs
October 13-16, 2016 • Austin, TX
2. 2
Search Engine… Lucene… Solr
• Internet and Intranet Search
• Relevance
• Search Suggestions
• Faceting
• Recommendations
• Time series
• Log search
• Geo-spatial search
• Analytics
• Graph search
• Document Store
3. 3
Overview
• How to scale any data infrastructure with Apache Solr
• Build a high performance and highly available data platform for
internal and external users alike
• Walmart’s commitment to open source
4. 4
About me
• Team lead at the Data Foundation team for the largest retailer and the
largest private employer in the world
• Prior to Walmart, worked at startups building recommendation and
analytics systems
• And prior to that, was building search applications, recommendation
systems and Hadoop-based analytics systems for the largest online
auction company, eBay, for 6 years
• Have been a manuscript reviewer for Manning Publications for 4 years
and have helped shape the contents of “Hadoop in Practice” and “Big
Data”
5. 5
About Walmart
• 11,000+ Stores in 27 countries
• 11 eCommerce sites
• 250M customers weekly in stores and online
• Millions of database transactions per day
• Sales, Holidays and massive volume shifts
7. 7
Turns out to be a great idea!
Users seem to like the new product
8. 8
Users REALLY like this…
Higher volume, increased use cases. Quick fix scaling
alternatives add some headroom … and complexity
9. 9
We need more Business Intelligence
Business is looking good but source-of-truth data store,
not so much …
10. 10
Scale up (in a hurry) with hardware
Least risk. Diminishing returns. What next?
11. 11
Design to scale out
• Offload queries to Search Engines
• Offload recurring reads to Cache
• Offload analytics to OLAP datastores
• Shard the database
… and of course do something to hide the complexity. It is
worth it.
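The offload-and-hide-the-complexity idea above can be sketched as a thin data-access layer that decides where each read goes. This is a minimal illustration with hypothetical in-memory stand-ins, not Walmart's actual implementation:

```python
class DataAccessLayer:
    """Routes reads away from the source-of-truth DB (hypothetical sketch)."""

    def __init__(self, db, search, cache):
        self.db = db          # source-of-truth store
        self.search = search  # e.g. a Solr client
        self.cache = cache    # e.g. a memcached/Redis client

    def query(self, text):
        # Full-text and faceted queries hit the search engine, not the DB,
        # sparing database indexes, connections, and memory.
        return self.search.search(text)

    def get(self, key):
        # Recurring point reads are served from cache when warm.
        hit = self.cache.get(key)
        if hit is not None:
            return hit
        value = self.db[key]     # cache miss: fall through to the DB
        self.cache[key] = value  # warm the cache for the next read
        return value


# In-memory stand-ins, just to show the flow:
class StubSearch:
    def __init__(self, docs):
        self.docs = docs

    def search(self, text):
        return [d for d in self.docs if text in d]


dal = DataAccessLayer(
    db={"sku-1": "blue jeans"},
    search=StubSearch(["blue jeans", "red shirt"]),
    cache={},
)
print(dal.query("jeans"))  # served by the search engine
print(dal.get("sku-1"))    # first read falls through to the DB
print(dal.get("sku-1"))    # second read is a cache hit
```

The point of the abstraction is that callers never choose a backend themselves, so the source-of-truth database only sees the traffic nothing else can absorb.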
13. 13
The “not-so-evil” Twin to protect your Source of Truth DB
• What if a copy of your source-of-truth data is available … Just about
anywhere you want it?
• How could you use a search engine to protect and augment your
database?
– Redirect queries
• Helps scale by reducing demand for
– database indexing
– database connections
– scarce database resources like memory, storage
• Not-so-evil Twin
– Adding multiple near real-time search indexes adds complexity … and
it comes at a cost; but done right, the benefits far outweigh the costs
14. 14
Our Approach
• Abstract the complexity of managing
– source-of-truth database
– cache coherence
– search queries
– message bus
• Abstract connection pool management
• Provide a scalable way to query across shards with full control of Solr
schema
• And a way to analyze big data without affecting real-time systems,
isolating individual data domains
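Querying across shards, as the approach above calls for, is a scatter-gather: send the query to every shard and merge the results by score. SolrCloud does this distribution server-side (via the `shards` parameter); the sketch below, with hypothetical stub shards, just shows the idea:

```python
import heapq

def query_shards(shards, q, top_k=3):
    """Scatter a query to every shard and merge the hits by score.

    Each shard is any object exposing search(q) -> [(score, doc), ...].
    """
    merged = []
    for shard in shards:
        merged.extend(shard.search(q))
    # Keep only the globally highest-scoring documents.
    return heapq.nlargest(top_k, merged, key=lambda pair: pair[0])


class StubShard:
    def __init__(self, hits):
        self.hits = hits

    def search(self, q):
        return [(score, doc) for score, doc in self.hits if q in doc]


shards = [
    StubShard([(0.9, "solr scaling"), (0.4, "solr faq")]),
    StubShard([(0.7, "solr twin")]),
]
print(query_shards(shards, "solr", top_k=2))
```

Because each shard returns only its own top hits, adding shards scales the corpus without any single node holding the full index.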
20. 20
Lessons learned
A Search engine like Apache Solr is…
• not limited to search-based business applications.
• a first class citizen in your persistence technology stack; it
complements the SoT database.
• easy to adopt, with all of us as a community for support.
21. 21
The Future
• Symbiotic existence of Solr/Lucene with RDBMS, NoSQL and Big
Data systems
• Walmart is committed to being part of the community building it
22. 22
Questions? Reach us at:
• You can reach me, Ayon Sinha, at:
– asinha@walmartlabs.com
– https://www.linkedin.com/in/ayonsinha
• Jason Sardina, our Lead Persistence Architect
– jsardina@walmartlabs.com
• @WalmartLabs is always hiring the best