This presentation is part of my work for the course 'Big Data Seminar' at TU Berlin within the IT4BI (Information Technology for Business Intelligence) master programme.
Scheduling and sharing resources in Data Clusters
Jose Luis Lopez Pino

2. Table of contents
   1. Introduction
      The problem
      Solutions
   2. YARN
      Architecture
      Advantages
      Drawbacks
      Performance
   3. Mesos
      Architecture
      Advantages
      Drawbacks
      Performance
   4. Omega
      Architecture
      Advantages
      Drawbacks
      Performance
   5. Related work
      Resource managers
      Scheduling techniques
   6. Conclusions
18. Related work: Scheduling techniques
Lottery scheduling [11]
Dynamic Proportional Share Scheduling [7]
Calibration: how does a particular task perform in a particular node? [5]
Stragglers and speculative relaunch [13]
Delay scheduling: achieve locality, relax fairness [12]
Rich resource-requests [2]
Optimize short jobs [3]
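
As an illustration of the first technique in this list, lottery scheduling [11] gives each client a number of tickets and picks the next client to serve by drawing one ticket at random, so a client's long-run share of the resource is proportional to the tickets it holds. The following is only a minimal sketch and is not taken from the slides: the function name `lottery_pick` and the `tickets` mapping are hypothetical names chosen for this example.

```python
import random
from typing import Dict, Optional


def lottery_pick(tickets: Dict[str, int],
                 rng: Optional[random.Random] = None) -> str:
    """Return one client, chosen with probability proportional to its tickets.

    `tickets` maps a (hypothetical) client name to its ticket count.
    """
    rng = rng or random.Random()
    if any(count < 0 for count in tickets.values()):
        raise ValueError("ticket counts must be non-negative")
    total = sum(tickets.values())
    if total <= 0:
        raise ValueError("at least one client must hold a ticket")

    # Draw the winning ticket number in [0, total).
    winner = rng.randrange(total)

    # Walk the clients and find whose ticket range contains the winner.
    for client, count in tickets.items():
        if winner < count:
            return client
        winner -= count

    raise AssertionError("unreachable: winner is always below the ticket total")
```

With `tickets = {"a": 3, "b": 1}`, repeated calls should select `"a"` roughly three times as often as `"b"`; any single draw is random, which is why the technique gives proportional shares only on average.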
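
The "achieve locality, relax fairness" idea behind delay scheduling [12] can be sketched as follows: when a node becomes free, the job that fairness would serve next is skipped if it has no task with input data on that node, and it is allowed to run a non-local task only after it has been skipped a given number of times. This is an illustrative sketch under stated assumptions, not code from the slides: the `Job` class, the `assign` function and the `max_skips` parameter are hypothetical names, and the job list is assumed to be already sorted by whatever fairness policy is in use.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class Job:
    name: str
    # Pending task id -> nodes that hold the task's input data.
    pending: Dict[str, Set[str]]
    skip_count: int = 0


def assign(node: str, jobs: List[Job],
           max_skips: int) -> Optional[Tuple[str, str, bool]]:
    """Offer a free node to the jobs in order; return (job, task, is_local)."""
    for job in jobs:  # assumed to be sorted by the fairness policy
        if not job.pending:
            continue

        # Prefer a task whose input data lives on the free node.
        local = next((t for t, nodes in job.pending.items() if node in nodes),
                     None)
        if local is not None:
            del job.pending[local]
            job.skip_count = 0
            return job.name, local, True

        # The job has waited long enough: give up locality for this task.
        if job.skip_count >= max_skips:
            task = next(iter(job.pending))
            del job.pending[task]
            return job.name, task, False

        # Otherwise skip the job for now and try the next one in line.
        job.skip_count += 1

    return None
```

A larger `max_skips` trades more waiting (less strict fairness) for a higher chance that each task runs on a node holding its data.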
References
[1] Ronnie Chaiken, Bob Jenkins, Per-Åke Larson, Bill Ramsey, Darren Shakib, Simon Weaver, and Jingren Zhou. SCOPE: easy and efficient parallel processing of massive data sets. Proceedings of the VLDB Endowment, 1(2):1265–1276, 2008.
[2] Carlo Curino, Djellel Difallah, Chris Douglas, Raghu Ramakrishnan, and Sriram Rao. Reservation-based scheduling: If you're late don't blame us!
[3] Khaled Elmeleegy. Piranha: Optimizing short jobs in Hadoop. Proceedings of the VLDB Endowment, 6(11):985–996, 2013.
[4] Michael Isard, Vijayan Prabhakaran, Jon Currey, Udi Wieder, Kunal Talwar, and Andrew Goldberg. Quincy: fair scheduling for distributed computing clusters. In Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles, pages 261–276. ACM, 2009.
[5] Gunho Lee, Byung-Gon Chun, and Randy H Katz. Heterogeneity-aware resource allocation and scheduling in the cloud. In Proceedings of the 3rd USENIX Workshop on Hot Topics in Cloud Computing, HotCloud, volume 11, 2011.
[6] Kyong-Ha Lee, Yoon-Joon Lee, Hyunsik Choi, Yon Dohn Chung, and Bongki Moon. Parallel data processing with MapReduce: a survey. ACM SIGMOD Record, 40(4):11–20, 2012.
[7] Thomas Sandholm and Kevin Lai. Dynamic proportional share scheduling in Hadoop. In Job scheduling strategies for parallel processing, pages 110–131. Springer, 2010.
[8] Malte Schwarzkopf, Andy Konwinski, Michael Abd-El-Malek, and John Wilkes. Omega: Flexible, scalable schedulers for large compute clusters. In Proceedings of the 8th ACM European Conference on Computer Systems, EuroSys '13, pages 351–364, New York, NY, USA, 2013. ACM.
[9] Facebook Engineering Team. Under the hood: Scheduling MapReduce jobs more efficiently with Corona.
[10] Vinod K. Vavilapalli. Apache Hadoop YARN: Yet Another Resource Negotiator. In Proc. SOCC, 2013.
[11] Carl A Waldspurger and William E Weihl. Lottery scheduling: Flexible proportional-share resource management. In Proceedings of the 1st USENIX conference on Operating Systems Design and Implementation, page 1. USENIX Association, 1994.
[12] Matei Zaharia, Dhruba Borthakur, Joydeep Sen Sarma, Khaled Elmeleegy, Scott Shenker, and Ion Stoica. Delay scheduling: a simple technique for achieving locality and fairness in cluster scheduling. In Proceedings of the 5th European conference on Computer systems, pages 265–278. ACM, 2010.
[13] Matei Zaharia, Andy Konwinski, Anthony D Joseph, Randy H Katz, and Ion Stoica. Improving MapReduce performance in heterogeneous environments. In OSDI, volume 8, page 7, 2008.