Caserta Presentation:
General Data Protection Regulation (GDPR) is a business and technical challenge for companies worldwide - and the deadlines are coming fast! American institutions that do business in the EU or have customers from the EU will have their data practices affected. With this in mind, Caserta – joined by Waterline Data, Salt Recruiting, and Squire Patton Boggs – hosted a BDW Meetup on the GDPR, which is perhaps the most controversial data legislation that has been passed to date.
Joe Caserta, Founding President, Caserta, spoke on the basics of the GDPR, how it will impact data privacy around the world, and some techniques geared towards compliance.
This presentation will discuss the stories of three companies spanning different industries: the challenges they faced, how cloud analytics addressed them, the technologies implemented to solve the challenges, and how they benefited from their new cloud analytics environments.
The objectives of this session include:
• Detail and explain the key benefits and advantages of moving BI and analytics workloads to the cloud, and why companies shouldn’t wait any longer to make their move.
• Compare the different analytics cloud options companies have, and the pros and cons of each.
• Describe some of the challenges companies may face when moving their analytics to the cloud, and what they need to prepare for.
• Provide the case studies of three companies, what issues they were solving for, what technologies they implemented and why, and how they benefited from their new solutions.
• Learn what to look for when considering a partner and trusted advisor to assist with an analytics cloud migration.
An Overview of the Neo4j Cloud Strategy and the Future of Graph Databases in ... (Neo4j)
This document discusses how graphs and cloud computing can accelerate innovation. It notes that all data and organizations are naturally connected in complex ways and graphs are core to modern intelligent applications. Connections in data help with personalization, recommendations, health, fraud prevention, and more. The document highlights growing adoption of graph databases and Neo4j's cloud-managed graph database service, Neo4j Aura, which provides simplicity, flexibility, reliability, and empowers faster iteration and collaboration in the cloud.
The 20th annual Enterprise Data World (EDW) Conference took place in San Diego last month, April 17-21. It is recognized as the most comprehensive educational conference on data management in the world.
Joe Caserta was a featured presenter. His session, "Evolving from the Data Warehouse to Big Data Analytics - the Emerging Role of the Data Lake," highlighted the challenges and steps needed to become a data-driven organization.
Joe also participated in two panel discussions during the show:
• "Data Lake or Data Warehouse?"
• "Big Data Investments Have Been Made, But What's Next
For more information on Caserta Concepts, visit our website at http://casertaconcepts.com/.
Slides: Case Study — How J.B. Hunt is Driving Efficiency with AI and Real-Tim... (DATAVERSITY)
J.B. Hunt, one of the leading providers of transportation and logistics services in North America, recognizes the criticality of customer responsiveness, service quality, and operational efficiency for its success. However, with its data spread across multiple sources, including legacy mainframe systems, the organization was struggling to meet data requirements from multiple departments. They struggled to troubleshoot operational issues and respond to customers quickly.
Join this webinar to hear about the optimized solution J.B. Hunt implemented, which automates real-time data pipelines for a reliable cloud data lake and provides multiple user groups an in-the-moment view of data without overwhelming internal operational systems. Discover how J.B. Hunt now leverages a modernized data environment to accelerate data delivery and drive various AI and analytics initiatives such as real-time service-pricing, competitive counterbidding, and improving their customer experience.
Learn how you can:
• Ingest data in real-time from legacy mainframe systems, enterprise applications, and more
• Create a reliable cloud data lake to accelerate AI and Analytic Initiatives
• Catalog, prepare, and provision data to empower data consumers
• Drive operational efficiency and customer experience with AI-augmented insights
Caserta Concepts, Datameer and Microsoft shared their combined knowledge and a use case on big data, the cloud and deep analytics. Attendees learned how a global leader in the test, measurement and control systems market reduced its big data implementation time from 18 months to just a few.
Speakers shared how to provide a business user-friendly, self-service environment for data discovery and analytics, and focused on how to extend and optimize Hadoop-based analytics, highlighting the advantages and practical applications of deploying in the cloud for enhanced performance, scalability and lower TCO.
Agenda included:
- Pizza and Networking
- Joe Caserta, President, Caserta Concepts - Why are we here?
- Nikhil Kumar, Sr. Solutions Engineer, Datameer - Solution use cases and technical demonstration
- Stefan Groschupf, CEO & Chairman, Datameer - The evolving Hadoop-based analytics trends and the role of cloud computing
- James Serra, Data Platform Solution Architect, Microsoft - Benefits of the Azure Cloud Service
- Q&A, Networking
Smarter businesses apply AI to learn and continuously evolve the way they work. To extract full value from AI, companies need a data strategy that gives them access to all their data – no matter where it lives – in an environment that easily scales and applies the latest discovery technology, including advanced analytics, visualization and AI. Learn how IBM Watson and Data provides all the tools companies need to embed AI, machine learning and deep learning in their business, while enabling professionals to gain the most from their data to drive smarter business and lead industry-changing transformations.
Joe Caserta was a featured speaker, along with MIT Sloan School faculty and other industry thought-leaders. His session "You're the New CDO, Now What?" discussed how new CDOs can accomplish their strategic objectives and overcome tactical challenges in this emerging executive leadership role.
In its tenth year, the MIT CDOIQ Symposium 2016 continues to explore the developing role of the Chief Data Officer.
Moving Past Infrastructure Limitations Presented by MediaMath
This presentation was given at a Big Data Warehousing Meetup with Caserta Concepts, MediaMath and Qubole. You can learn more about the event here: http://www.meetup.com/Big-Data-Warehousing/events/228372516/
Event description:
At Caserta Concepts, we are firm believers in big data thriving on the cloud. The instant-on, nearly unlimited storage and computing capabilities of AWS have made it the de facto solution for a full spectrum of organizations needing to process large amounts of data.
What's more, an ecosystem of value-added platforms has emerged to further ease and democratize the implementation of cloud based solutions. Qubole has developed a great platform for easily deploying and managing ephemeral and long-lived Hadoop and Spark clusters on AWS.
Moving Past Infrastructure Limitations: Data Warehousing at MediaMath
Over the past year and a half, MediaMath has undertaken a “data liberation” effort in an attempt to leave their big-box, monolithic data warehouse behind. In this talk, Rory Sawyer, Software Engineer at MediaMath, will describe how this effort transformed MediaMath’s legacy architecture and legacy mindset, which imposed harsh inefficiencies on data sharing and utilization. The current mindset removes these inefficiencies and allows them to say “yes” to more projects and ideas.
Rory will also demo how MediaMath uses Amazon Web Services and Qubole so that infrastructure is no longer a limiting factor on what and how users query. This combination allows them to scale their resources up and down as needed while bridging different data sources and execution engines. Using and extending MediaMath’s data warehousing is no longer a privileged activity but an ability that every employee and client has.
Creating a DevOps Practice for Analytics -- Strata Data, September 28, 2017 (Caserta)
Over the past eight or nine years, applying DevOps practices to various areas of technology within business has grown in popularity and produced demonstrable results. These principles are particularly fruitful when applied to a data analytics environment. Bob Eilbacher explains how to implement a strong DevOps practice for data analysis, starting with the necessary cultural changes that must be made at the executive level and ending with an overview of potential DevOps toolchains. Bob also outlines why DevOps and disruption management go hand in hand.
Topics include:
- The benefits of a DevOps approach, with an emphasis on improving quality and efficiency of data analytics
- Why the push for a DevOps practice needs to come from the C-suite and how it can be integrated into all levels of business
- An overview of the best tools for developers, data analysts, and everyone in between, based on the business’s existing data ecosystem
- The challenges that come with transforming into an analytics-driven company and how to overcome them
- Practical use cases from Caserta clients
This presentation was originally given by Bob at the 2017 Strata Data Conference in New York City.
Focus on Your Analysis, Not Your SQL Code (DATAVERSITY)
This document discusses the challenges of using SQL for data analysis and introduces Alteryx as an alternative. It notes that SQL can be difficult to understand and repeat, while Alteryx allows users to see the full data workflow, perform transformations without coding, and access different data sources flexibly. The presentation includes an agenda, overview of Alteryx's benefits, and demonstration of its capabilities.
A modern, flexible approach to Hadoop implementation incorporating innovation... (DataWorks Summit)
A modern, flexible approach to Hadoop implementation incorporating innovations from HP Haven
Jeff Veis
Vice President
HP Software Big Data
Gilles Noisette
Master Solution Architect
HP EMEA Big Data CoE
Are you your company’s chief data officer? Given the scarcity of the official role, it’s likely that you’re not — at least in title. But that doesn't mean that you shouldn't operate like one. Do you approach data leadership as a C-level executive or a senior data head? Is your team’s output strategic or just operational? In this interactive keynote, one of the Windy City’s foremost data leaders will lead an interactive discussion on what it takes to lead like a chief, what it looks like, and how to get there and get it done.
Using Machine Learning to Understand and Predict Marketing ROI (DATAVERSITY)
Marketing is all about attracting, retaining and building profitable relationships with your customers, but how do you know which customers to target, which campaigns to run, and which marketing programs to invest in, to get the most return for your dollar?
Join Alteryx and Keyrus as we demonstrate how to combine all relevant marketing, sales and customer data, and perform sophisticated analytics to deepen customer insight and calculate ROI of marketing programs.
You’ll walk away knowing how to:
• Segment and profile your customers – take that raw data and translate it into real value
• Build a marketing attribution model within Alteryx, creating a personal answer engine for your company
• Leverage R or Python code in an Alteryx workflow so data scientists can collaborate with non-coding stakeholders in a code-friendly and code-free environment
Join Alteryx and Keyrus and get the actionable insights you need to drive marketing ROI analytics, and answer million-dollar questions without spending millions of dollars on standardized solutions.
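To make the attribution-modeling idea above concrete, here is a minimal sketch in plain Python (not Alteryx-specific) of a linear attribution model: each conversion's revenue is split equally across the channels that touched the customer, and per-channel ROI is computed against spend. The journey data, channel names, and spend figures are invented for illustration.

```python
from collections import defaultdict

def linear_attribution(journeys):
    """Split each conversion's revenue equally across the channels
    that touched the customer before converting."""
    credit = defaultdict(float)
    for touches, revenue in journeys:
        share = revenue / len(touches)
        for channel in touches:
            credit[channel] += share
    return dict(credit)

def roi(credit, spend):
    """Per-channel ROI: (attributed revenue - spend) / spend."""
    return {ch: (credit.get(ch, 0.0) - cost) / cost
            for ch, cost in spend.items()}

# Hypothetical journeys: (channel touches, conversion revenue)
journeys = [
    (["email", "search"], 100.0),           # two touches -> 50 each
    (["search"], 60.0),
    (["display", "email", "search"], 90.0), # three touches -> 30 each
]
credit = linear_attribution(journeys)
# email: 50 + 30 = 80, search: 50 + 60 + 30 = 140, display: 30
print(roi(credit, {"email": 40.0, "search": 70.0, "display": 60.0}))
```

A real model would weight touches by recency or use a statistical approach, but the equal-split baseline is a common starting point.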
Reveal the Intelligence in your Data with Talend Data Fabric (Jean-Michel Franco)
Discover the Winter'20 release of Talend Data Fabric.
Find out about the newly released product, Talend Data Inventory, and the powerful new capabilities and AI that accelerate and modernize data engineering. Find out how to:
- Ensure trusted data at first sight with Data Inventory
- Increase efficiency and productivity with Pipeline Designer
- Automate more integration tasks with AI and APIs
Data Catalog as the Platform for Data Intelligence (Alation)
Data catalogs are in wide use today across hundreds of enterprises as a means to help data scientists and business analysts find and collaboratively analyze data. Over the past several years, customers have increasingly used data catalogs in applications beyond their search & discovery roots, addressing new use cases such as data governance, cloud data migration, and digital transformation. In this session, the founder and CEO of Alation will discuss the evolution of the data catalog, the many ways in which data catalogs are being used today, the importance of machine learning in data catalogs, and discuss the future of the data catalog as a platform for a broad range of data intelligence solutions.
This document discusses how enterprise information management is key to effective governance, risk management, and compliance (GRC). It defines GRC and explains that traditional GRC strategies often fail because information is siloed across unstructured files and structured data systems. Effective GRC requires synchronizing information and activities across governance, risk, and compliance to operate efficiently, enable information sharing, report activities, and avoid duplication. The document proposes that an information management system like M-Files can bridge the gap by structuring unstructured content and building relationships between structured and unstructured data. This allows information to be more easily found, visualized, and analyzed to support GRC.
Reinventing the Modern Information Pipeline: Paxata and MapR (Lilia Gutnik)
(Presented at MapR's Big Data Everywhere event in Redwood City, CA in December 2016)
The relationship between business teams and IT has changed as the complexity of data has increased. A traditional data pipeline designed for an IT-centered approach to information management is not designed for the data demands of today's business decisions. Designing a big data strategy requires modernizing previous approaches. Self-service data preparation in a collaborative, intuitive, governed, and secure environment is the key to a nimble and decisive business unit.
During this Big Data Warehousing Meetup, Caserta Concepts and Databricks addressed the number one operational and analytic goal of nearly every organization today – to have a complete view of every customer. Customer Data Integration (CDI) must be implemented to cleanse and match customer identities within and across various data systems. CDI has been a long-standing data engineering challenge, not just one of logic and complexity but also of performance and scalability.
The speakers brought together best practice techniques with Apache Spark to achieve complete CDI.
Speakers:
Joe Caserta, President, Caserta Concepts
Kevin Rasmussen, Big Data Engineer, Caserta Concepts
Vida Ha, Lead Solutions Engineer, Databricks
The sessions covered a series of problems that are adequately solved with Apache Spark, as well as those that require additional technologies to implement correctly. Topics included:
· Building an end-to-end CDI pipeline in Apache Spark
· What works, what doesn’t, and how our use of Spark evolves
· Innovation with Spark including methods for customer matching from statistical patterns, geolocation, and behavior
· Using PySpark and Python’s rich module ecosystem for data cleansing, standardization, and matching
· Using GraphX for matching and scalable clustering
· Analyzing large data files with Spark
· Using Spark for ETL on large datasets
· Applying Machine Learning & Data Science to large datasets
· Connecting BI/Visualization tools to Apache Spark to analyze large datasets internally
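The deterministic core of the customer-matching topics above can be sketched in plain Python (the talk used PySpark; the normalization rules and sample records here are hypothetical): standardize fields, then cluster records that share a match key.

```python
import re
from collections import defaultdict

def normalize(record):
    """Standardize fields before matching (hypothetical rules:
    lowercase, strip punctuation, collapse whitespace)."""
    name = re.sub(r"[^a-z ]", "", record["name"].lower())
    name = re.sub(r"\s+", " ", name).strip()
    email = record["email"].strip().lower()
    return {"name": name, "email": email}

def match_customers(records):
    """Cluster records sharing a normalized email -- a simple
    deterministic match key; real CDI layers on fuzzy and
    statistical matching (geolocation, behavior, etc.)."""
    clusters = defaultdict(list)
    for i, rec in enumerate(records):
        clusters[normalize(rec)["email"]].append(i)
    return list(clusters.values())

records = [
    {"name": "Jane  Doe", "email": "JDoe@example.com "},
    {"name": "jane doe.", "email": "jdoe@example.com"},
    {"name": "Bob Smith", "email": "bob@example.com"},
]
print(match_customers(records))  # -> [[0, 1], [2]]
```

In Spark the same logic becomes a `groupBy` on the normalized key, which is what makes the approach scale to large customer datasets.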
The speakers also touched on data governance, rapidly onboarding new data, and how to balance rapid agility and time to market with critical decision support and customer interaction. They also shared examples of problems that Apache Spark is not optimized for.
ADV Slides: The World in 2045 – What Has Artificial Intelligence Created? (DATAVERSITY)
How will technology and society change in the next 25 years? We have been discussing how technology has evolved in the last few years; in this episode, we look forward to the next 25 years.
The year 2045 may seem far away, but we already have predictions about the technological innovations prevalent in 2045. Hint: Artificial intelligence will have a huge impact.
DataOps: Nine steps to transform your data science impact – Strata London, May 18 (Harvinder Atwal)
According to Forrester Research, only 22% of companies are currently seeing a significant return from data science expenditures. Most data science implementations are high-cost IT projects, local applications that are not built to scale for production workflows, or laptop decision support projects that never impact customers. Despite this high failure rate, we keep hearing the same mantra and solutions over and over again. Everybody talks about how to create models, but not many people talk about getting them into production where they can impact customers.
Harvinder Atwal offers an entertaining and practical introduction to DataOps, a new and independent approach to delivering data science value at scale, used at companies like Facebook, Uber, LinkedIn, Twitter, and eBay. The key to adding value through DataOps is to adapt and borrow principles from Agile, Lean, and DevOps. However, DataOps is not just about shipping working machine learning models; it starts with better alignment of data science with the rest of the organization and its goals. Harvinder shares experience-based solutions for increasing your velocity of value creation, including Agile prioritization and collaboration, new operational processes for an end-to-end data lifecycle, developer principles for data scientists, cloud solution architectures to reduce data friction, self-service tools giving data scientists freedom from bottlenecks, and more. The DataOps methodology will enable you to eliminate daily barriers, putting your data scientists in control of delivering ever-faster cutting-edge innovation for your organization and customers.
25 May 2018, the General Data Protection Regulation (GDPR) deadline, is less than six months away.
With attention on the regulation at its peak, there is growing concern among organizations affected by it.
We would like to invite you to join our webinar, where we will share our approach and help your organization and your document repository become compliant with GDPR.
During the webinar, our special guests, George Parapadakis – Business Solutions Strategy, Alfresco and Bart van Bouwel – Managing Partner, CDI-Partners, will provide you with:
- How to implement GDPR in your document repository
- How the Alfresco Digital Business Platform can help your organization to be compliant with GDPR
- Xenit approach: a managed shared drive
- Xenit demonstration
- Top tips to start preparing for the GDPR.
Webinar: Designing Storage Architectures for Data Privacy, Compliance and Gov... (Storage Switzerland)
Managing data is about more than managing capacity growth; organizations today need to adhere to increasingly strict data privacy, compliance and governance regulations. Privacy regulations like GDPR and California’s Consumer Privacy Act place new expectations on organizations that require them to not only protect data but also organize it so it can be found and deleted on request. Traditional backup and archive are ill-equipped to help organization adhere to these new regulations.
In this webinar, join Storage Switzerland and Hitachi Vantara for a roundtable discussion on the meaning of these various regulations, their impact on traditional storage infrastructures, and how to design a storage architecture that can meet today’s regulations as well as tomorrow’s.
How Cloudera SDX can aid GDPR compliance 6.21.18 (Cloudera, Inc.)
Big data solutions from Cloudera can help organizations comply with the GDPR in three main ways:
1) Provide comprehensive encryption, access controls, and auditing to satisfy principles around integrity, confidentiality, and accountability.
2) Track the classification, usage, and lineage of personal data to demonstrate lawfulness, fairness, and transparency.
3) Enable capabilities like fast data updates, redaction, and erasure of individual records to comply with principles regarding purpose limitation, data minimization, accuracy, and storage limitation.
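The record-erasure capability in point 3 can be sketched in plain Python (not Cloudera-specific; the field names and retention policy are invented for illustration): on an erasure request, the subject's personal fields are dropped while non-personal business fields are retained.

```python
def erase_subject(records, subject_id, retain_fields=("order_id", "amount")):
    """Honor a GDPR erasure request: drop the subject's personal
    fields but keep non-personal business fields (hypothetical
    retention policy), marking the identity as erased."""
    out = []
    for rec in records:
        if rec.get("customer_id") == subject_id:
            redacted = {k: v for k, v in rec.items() if k in retain_fields}
            redacted["customer_id"] = "ERASED"
            out.append(redacted)
        else:
            out.append(rec)
    return out

# Hypothetical ledger mixing personal and business data
ledger = [
    {"customer_id": "c1", "name": "Jane Doe", "order_id": 7, "amount": 19.99},
    {"customer_id": "c2", "name": "Bob Smith", "order_id": 8, "amount": 5.00},
]
print(erase_subject(ledger, "c1"))
```

At platform scale the hard part is doing this across every copy of the data with full lineage, which is what the shared catalog and governance services are meant to provide.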
[Webinar Slides] Data Privacy – Learn What It Takes to Protect Your Information (AIIM International)
Follow along with these webinar slides as we take a close look at what it takes to prepare for all kinds of data privacy regulations – learn how to protect your data in order to be compliant with regulators or for healthy business practices in general.
Want to follow along with the webinar replay? Download it here for free: http://info.aiim.org/protect-your-information
Beyond GDPR Compliance - Role of Internal Audit (Omo Osagiede)
Internal audit can play a strategic role in supporting an organization's GDPR compliance and remediation activities by:
1) Providing expertise and a "big picture" view of personal data flows and requirements.
2) Identifying opportunities to improve data governance and privacy risk management practices.
3) Conducting reviews of key GDPR compliance elements like data mapping, privacy impact assessments, and data subject rights management.
Mastering Data Compliance in a Dynamic Business Landscape (Denodo)
Watch full webinar here: https://buff.ly/48rpLQ3
Join us for an enlightening webinar, "Mastering Data Compliance in a Dynamic Business Landscape," presented by Denodo Technologies and W5 Consulting. This session is tailored for business leaders and decision-makers who are navigating the complexities of data compliance in an ever-evolving business environment.
This webinar will focus on why data compliance is crucial for your business. Discover how to turn compliance into a competitive advantage, enhancing operational efficiency and market trust. We'll also address the risks of non-compliance, including financial penalties and the loss of customer trust, and provide strategies to proactively overcome these challenges.
Key Takeaways:
- How can your business leverage data management practices to stay agile and compliant in a rapidly changing regulatory landscape?
- Keys to balancing data accessibility with security and privacy in today's data-driven environment.
- What are the common pitfalls in achieving compliance with regulations like GDPR, CCPA, and HIPAA, and how can your business avoid them?
We will go beyond the technical aspects and delve into how you can strategically position your organization in the realm of data management and compliance. Learn how to craft a data compliance strategy that aligns with your business goals, enhances operational efficiency, and builds stakeholder trust.
Building the Governance Ready Enterprise for GDPR Compliance (Index Engines Inc.)
The EU General Data Protection Regulation (GDPR) fundamentally changes how organizations manage personal data, giving citizens the right to access, rectify, erase, restrict, and migrate their personal content held in any data center of an organization that does business in the European Union.
Index Engines' technology delivers extensive search and management solutions that empower you to find all personal data under management with considerable precision and meet or exceed the requirements of the regulation through implementation of powerful indexing technology. Index Engines supports all classes of data from primary storage to legacy backup data.
The document discusses organizations' experiences with GDPR compliance after the May 2018 deadline. It finds that many organizations are still dealing with residual risks and have uncovered more personal data than expected during their discovery processes. Specifically, organizations have struggled to fully comply with data deletion requests due to data being spread across systems without full lineage. The document advocates that organizations view GDPR not just as a compliance burden but as an opportunity to improve data governance, build customer trust, and enable digital expansion.
Date: 15th November 2017
Location: AI Lab Theatre
Time: 16:30 - 17:00
Speaker: Elisabeth Olafsdottir / Santiago Castro
Organisation: Microsoft / Keyrus
CISO Round Table on Effective Implementation of DLP & Data Security (Priyanka Aash)
The document discusses an effective implementation of data loss prevention (DLP) and data security. It covers key factors like the evolving threat landscape, business drivers for DLP, common challenges, and approaches to solve data security issues. An effective methodology is proposed, including identifying critical data and channels, deploying suitable policies, monitoring incidents, and establishing governance through continuous review and improvement. Critical success factors include business involvement, a phased implementation approach, and repeating the plan-do-check-act cycle periodically. The expected project outcomes are protection of critical channels, improved data tracking and awareness, and happier customers and auditors.
This document discusses security considerations for data lakes. It notes that data lakes consolidate an organization's most valuable data, making them an attractive target for hackers. The document outlines key risks like housing all customer data in a single repository and in the cloud. It proposes security design principles like zero trust and least privilege. The document then presents a protection framework with components like access controls, network security, data protection policies and governance. Specific capabilities are described for areas like platform access, policies, network isolation, and data protection. The goal is to properly secure the data lake while still enabling data sharing and analytics.
The document discusses the risks associated with big data, including increased data production leading to higher costs of replication and storage, evolving privacy and security regulations, and growing litigation and discovery obligations. It notes that most of the significant risks and costs of big data are not clearly visible and addresses challenges in areas like existing infrastructure, regulatory compliance, contracting, data retention, and eDiscovery.
Richard Hogg & Dennis Waldron - #InfoGov17 - Cognitive Unified Governance & P... (ARMA International)
GDPR is coming: May 25, 2018 brings a whole new order of EU personal data privacy and protection rights, duties, and obligations. What changes, what's your risk, and how can you start to prepare?
How can a Unified Governance strategy and capabilities transform both your information governance program, and provide a framework for personal data?
How that strategy can leverage metadata to support and accelerate meeting regulatory issues.
The General Data Protection Regulation and the DAMA DMBOK – Tools you can use for Compliance
Abstract: The General Data Protection Regulation will be the law governing data privacy in Europe in 2018. Surveys show that less than 50% of organisations are aware of the changes within the legislation, and even fewer have any plan for achieving compliance. In this session, Daragh O Brien takes us on a high level overview of the GDPR and how the disciplines of the DMBOK can help compliance.
Notes: DMBOK is an abbreviation for the "Data Management Body of Knowledge," which is published by DAMA International (The Data Management Association)
Information is currency in the 21st century...Is your data enabling you to drive the right digital transformation in your organisation? - Jasmit Sagoo, CTO, Veritas
Data- and database security & GDPR: end-to-end offer (Capgemini)
This document discusses Capgemini and Sogeti's end-to-end offering for database security and GDPR compliance. It outlines a four-phase approach including a GDPR readiness assessment, roadmap development, privacy impact assessment, and implementing database security solutions. Each phase has defined activities, timelines, and results to help organizations assess their GDPR compliance and secure databases containing personal data. The offering is designed to help organizations address new accountability and security requirements under the upcoming GDPR regulation.
My keynote speech at the ISACA IIA Belgium software watch day in October 2014 in Brussels on the value of big data and data analytics for auditors and other assurance professionals
Cloudera's big data platform can help organizations comply with the EU's General Data Protection Regulation (GDPR) in three key ways:
1. It provides a single system to securely store, govern, and manage all analytic workloads and personal data across on-premises, cloud, structured, and unstructured data sources.
2. Its shared services like data catalog, security, governance, and lifecycle management can be applied uniformly across the platform to meet GDPR principles like data minimization, storage limitation, and accuracy.
3. Specific capabilities like its GDPR data hub, consent management, and ability to delete individual data records upon request help automate key GDPR requirements at scale.
The EU General Data Protection Regulation and how Oracle can help (Niklas Hjorthen)
The document discusses Oracle's technology solutions that can help organizations comply with the EU General Data Protection Regulation (GDPR). It provides an overview of GDPR requirements and describes Oracle products that address key areas like data discovery, access controls, monitoring and auditing, and personal data management. It outlines a multi-step approach organizations can take using Oracle technologies to establish the necessary technical foundation and processes for GDPR compliance.
Similar to General Data Protection Regulation - BDW Meetup, October 11th, 2017
Introduction to Data Science (Data Summit, 2017) (Caserta)
This document summarizes an introduction to data science presentation by Joe Caserta and Bill Walrond of Caserta Concepts. Caserta Concepts is an internationally recognized data innovation and engineering consulting firm. The agenda covers why data science is important, challenges of working with big data, governing big data, the data pyramid, what data scientists do, standards for data science, and a demonstration of data analysis. Popular machine learning algorithms like regression, decision trees, k-means clustering and collaborative filtering are also discussed.
Looker Data Modeling in the Age of Cloud - BDW Meetup May 2, 2017 (Caserta)
This document discusses the evolution of data analytics and modeling. It describes three waves: the first with slow hardware and manual entry; the second with faster PCs but tool explosions; and the third wave now with big data, cloud warehouses, and data-driven tools like Looker and BigQuery. It argues that in this current wave, having a flexible yet performant data model built on SQL in a warehouse, and using a language like LookML to define relationships and translate questions, allows gaining reliable answers with agility without worrying about low-level syntax or tools.
The Data Lake - Balancing Data Governance and Innovation (Caserta)
Joe Caserta gave the presentation "The Data Lake - Balancing Data Governance and Innovation" at DAMA NY's one day mini-conference on May 19th. Speakers covered emerging trends in Data Governance, especially around Big Data.
For more information on Caserta Concepts, visit our website at http://casertaconcepts.com/.
Caserta Concepts, Datameer and Microsoft shared their combined knowledge and a use case on big data, the cloud and deep analytics. Attendees learned how a global leader in the test, measurement and control systems market reduced their big data implementation time from 18 months to just a few.
Speakers shared how to provide a business user-friendly, self-service environment for data discovery and analytics, and focused on how to extend and optimize Hadoop-based analytics, highlighting the advantages and practical applications of deploying on the cloud for enhanced performance, scalability and lower TCO.
Agenda included:
- Pizza and Networking
- Joe Caserta, President, Caserta Concepts - Why are we here?
- Nikhil Kumar, Sr. Solutions Engineer, Datameer - Solution use cases and technical demonstration
- Stefan Groschupf, CEO & Chairman, Datameer - The evolving Hadoop-based analytics trends and the role of cloud computing
- James Serra, Data Platform Solution Architect, Microsoft - Benefits of the Azure Cloud Service
- Q&A, Networking
For more information on Caserta Concepts, visit our website: http://casertaconcepts.com/
This document discusses appropriate and inappropriate use cases for Apache Spark based on the type of data and workload. It provides examples of good uses, such as batch processing, ETL, and machine learning/data science. It also gives examples of bad uses, such as random access queries, frequent incremental updates, and low latency stream processing. The document recommends using a database instead of Spark for random access, updates, and serving live queries. It suggests using message queues instead of files for low latency stream processing. The goal is to help users understand how to properly leverage Spark for big data workloads.
This document discusses balancing data governance and innovation. It describes how traditional data analytics methods can inhibit innovation by requiring lengthy processes to analyze new data. The document advocates adopting a data lake approach using tools like Hadoop and Spark to allow for faster ingestion and analysis of diverse data types. It also discusses challenges around simultaneously enabling innovation through a data lake while still maintaining proper data governance, security, and quality. Achieving this balance is key for organizations to leverage data for competitive advantage.
Introducing Kudu, Big Data Warehousing MeetupCaserta
Not just an SQL interface or file system, Kudu - the new, updating column store for Hadoop, is changing the storage landscape. It's easy to operate and makes new data immediately available for analytics or operations.
At the Caserta Concepts Big Data Warehousing Meetup, our guests from Cloudera outlined the functionality of Kudu and talked about why it will become an integral component in big data warehousing on Hadoop.
To learn more about what Caserta Concepts has to offer, visit http://casertaconcepts.com/
How do you balance the need for structured and rule-based governance to assure enterprise data quality - with the imperative to innovate in order to stay relevant and competitive in today's business marketplace?
At the recent CDO Summit in NYC, a range of C-Level Executives across a variety of industries came to hear Joe Caserta, president of Caserta Concepts, put it all in perspective.
Joe talked about the challenges of "data sprawl" and the paradigm shift underway in the evolving big data and data-driven world.
For more information or to contact us, visit http://casertaconcepts.com/
Joe Caserta, President at Caserta Concepts presented at the 3rd Annual Enterprise DATAVERSITY conference. The emphasis of this year's agenda is on the key strategies and architecture necessary to create a successful, modern data analytics organization.
Joe Caserta presented What Data Do You Have and Where is it?
For more information on the services offered by Caserta Concepts, visit our website at http://casertaconcepts.com/.
Joe Caserta, President at Caserta Concepts, presented "Setting Up the Data Lake" at a DAMA Philadelphia Chapter Meeting.
For more information on the services offered by Caserta Concepts, visit our website at http://casertaconcepts.com/.
Incorporating the Data Lake into Your Analytic ArchitectureCaserta
Joe Caserta, President at Caserta Concepts presented at the 3rd Annual Enterprise DATAVERSITY conference. The emphasis of this year's agenda is on the key strategies and architecture necessary to create a successful, modern data analytics organization.
Joe Caserta presented Incorporating the Data Lake into Your Analytics Architecture.
For more information on the services offered by Caserta Concepts, visit our website at http://casertaconcepts.com/.
During a Big Data Warehousing Meetup in NYC, Elliott Cordo, Chief Architect at Caserta Concepts, discussed emerging trends in real time data processing. The presentation included processing frameworks such as Spark and Storm, as well as datastore technologies ranging from NoSQL to Hadoop. He also discussed exciting new AWS services such as Lambda, Kinesis, and Kinesis Firehose.
In this presentation at DAMA New York, Joe started by asking a key question: why are we doing this? Why analyze and share all these massive amounts of data? Basically, it comes down to the belief that in any organization, in any situation, if we can get the data and make it correct and timely, insights from it will become instantly actionable for companies to function more nimbly and successfully. Enabling the use of data can be a world-changing, world-improving activity, and this session presents the steps necessary to get you there. Joe explained the concept of the "data lake" and also emphasized the role of a strong data governance strategy that incorporates seven components needed for a successful program.
For more information on this presentation or Caserta Concepts, visit our website at http://casertaconcepts.com/.
Big Data Warehousing Meetup: Dimensional Modeling Still Matters!!!Caserta
Joe Caserta went over the details inside the big data ecosystem and the Caserta Concepts Data Pyramid, which includes Data Ingestion, Data Lake/Data Science Workbench and the Big Data Warehouse. He then dove into the foundation of dimensional data modeling, which is as important as ever in the top tier of the Data Pyramid. Topics covered:
- The 3 grains of Fact Tables
- Modeling the different types of Slowly Changing Dimensions
- Advanced Modeling techniques like Ragged Hierarchies, Bridge Tables, etc.
- ETL Architecture.
He also talked about ModelStorming, a technique used to quickly convert business requirements into an Event Matrix and Dimensional Data Model.
This was a jam-packed abbreviated version of 4 days of rigorous training of these techniques being taught in September by Joe Caserta (Co-Author, with Ralph Kimball, The Data Warehouse ETL Toolkit) and Lawrence Corr (Author, Agile Data Warehouse Design).
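The Type 2 slowly changing dimension handling mentioned above can be sketched in a few lines: when a tracked attribute changes, the current dimension row is expired and a new versioned row is inserted. A minimal in-memory illustration in Python, with hypothetical column names (`customer_id`, `start_date`, `end_date`) standing in for a real dimension table:

```python
from datetime import date

def scd2_update(dim_rows, natural_key, new_attrs, today):
    """Apply a Type 2 change: expire the current row, insert a new version."""
    for row in dim_rows:
        if row["customer_id"] == natural_key and row["end_date"] is None:
            if all(row.get(k) == v for k, v in new_attrs.items()):
                return dim_rows          # nothing changed; keep current version
            row["end_date"] = today      # close out the old version
            new_row = {**row, **new_attrs,
                       "start_date": today, "end_date": None}
            dim_rows.append(new_row)
            return dim_rows
    raise KeyError(f"no current row for customer {natural_key}")

dim = [{"customer_id": 42, "city": "NYC",
        "start_date": date(2015, 1, 1), "end_date": None}]
scd2_update(dim, 42, {"city": "Boston"}, date(2017, 10, 11))
# dim now holds the expired NYC row plus a current Boston row.
```

In a warehouse this would be an UPDATE plus INSERT inside one transaction; the sketch just shows the versioning logic itself.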
For more information, visit http://casertaconcepts.com/.
Against the backdrop of Big Data, the Chief Data Officer, by any name, is emerging as the central player in the business of data, including cybersecurity. The MITCDOIQ Symposium explored the developing landscape, from local organizational issues to global challenges, through case studies from industry, academic, government and healthcare leaders.
Joe Caserta, president at Caserta Concepts, presented "Big Data's Impact on the Enterprise" at the MITCDOIQ Symposium.
Presentation Abstract: Organizations are challenged with managing an unprecedented volume of structured and unstructured data coming into the enterprise from a variety of verified and unverified sources. With that is the urgency to rapidly maximize value while also maintaining high data quality.
Today we start with some history and the components of data governance and information quality necessary for successful solutions. I then bring it all to life with 2 client success stories, one in healthcare and the other in banking and financial services. These case histories illustrate how accurate, complete, consistent and reliable data results in a competitive advantage and enhanced end-user and customer satisfaction.
To learn more, visit www.casertaconcepts.com
Harnessing Wild and Untamed (Publicly Available) Data for the Cost efficient ...weiwchu
We recently discovered that models trained with large-scale speech datasets sourced from the web could achieve superior accuracy and potentially lower cost than traditionally human-labeled or simulated speech datasets. We developed a customizable AI-driven data labeling system. It infers word-level transcriptions with confidence scores, enabling supervised ASR training. It also robustly generates phone-level timestamps even in the presence of transcription or recognition errors, facilitating the training of TTS models. Moreover, it automatically assigns labels such as scenario, accent, language, and topic tags to the data, enabling the selection of task-specific data for training a model tailored to that particular task. We assessed the effectiveness of the datasets by fine-tuning open-source large speech models such as Whisper and SeamlessM4T and analyzing the resulting metrics. In addition to openly available data, our data handling system can also be tailored to provide reliable labels for proprietary data from certain vertical domains. This customization enables supervised training of domain-specific models without the need for human labelers, eliminating data breach risks and significantly reducing data labeling cost.
Introduction to Data Science
1.1 What is Data Science, importance of data science,
1.2 Big data and data Science, the current Scenario,
1.3 Industry Perspective Types of Data: Structured vs. Unstructured Data,
1.4 Quantitative vs. Categorical Data,
1.5 Big Data vs. Little Data, Data science process
1.6 Role of Data Scientist
1. Overview of statistical software such as ODK, surveyCTO, and CSPro
2. Software installation(for computer, and tablet or mobile devices)
3. Create a data entry application
4. Create the data dictionary
5. Create the data entry forms
6. Enter data
7. Add Edits to the Data Entry Application
8. CAPI questions and texts
Big Data and Analytics Shaping the future of PaymentsRuchiRathor2
The payments industry is experiencing a data-driven revolution powered by big data and analytics.
Here's a glimpse into 5 ways this dynamic duo is transforming how we pay.
In essence, big data and analytics are playing a pivotal role in building a future filled with faster, more secure, and convenient payment methods for everyone.
Combined supervised and unsupervised neural networks for pulse shape discrimi...Samuel Jackson
Our methodology for pulse shape discrimination is split into two steps. Firstly, we learn a model to discriminate between pulses using "clean" low-rate examples by removing pile-up & saturated events. In addition to traditional tail sum discrimination, we investigate three different choices for discrimination between γ-pulses, fast neutrons, and thermal neutrons. We consider clustering the pulses directly using Gaussian Mixture Modelling (GMM), using variational autoencoders to learn a representation of the pulses and then clustering the learned representation (VAE+GMM), and using density ratio estimation to discriminate between a mixed (γ + neutron) and pure (γ only) sources using a multi-layer perceptron (MLP) as a supervised learning problem.
Secondly, we aim to classify and recover pile-up events in the < 150 ns regime by training a single unified multi-label MLP. To frame the problem as a multi-label supervised learning method, we first simulate pile-up events with known components. Then, using the simulated data and combining it with single event data, we train a final multi-label MLP to output a binary code indicating both how many and which type of events are present within an event window.
Towards an Analysis-Ready, Cloud-Optimised service for FAIR fusion dataSamuel Jackson
We present our work to improve data accessibility and performance for data-intensive tasks within the fusion research community. Our primary goal is to develop services that facilitate efficient access for data-intensive applications while ensuring compliance with FAIR principles [1], as well as adoption of interoperable tools, methods and standards.
The major outcome of our work is the successful creation and deployment of a data service for the MAST (Mega Ampere Spherical Tokamak) experiment [2], leading to substantial enhancements in data discoverability, accessibility, and overall data retrieval performance, particularly in scenarios involving large-scale data access. Our work follows the principles of Analysis-Ready, Cloud Optimised (ARCO) data [3] by using cloud optimised data formats for fusion data.
Our system consists of a query-able metadata catalogue, complemented with an object storage system for publicly serving data from the MAST experiment. We will show how our solution integrates with the Pandata stack [4] to enable data analysis and processing at scales that would have previously been intractable, paving the way for data-intensive workflows running routinely with minimal pre-processing on the part of the researcher. By using a cloud-optimised file format such as zarr [5] we can enable interactive data analysis and visualisation while avoiding large data transfers. Our solution integrates with common python data analysis libraries for large, complex scientific data such as xarray [6] for complex data structures and dask [7] for parallel computation and lazily working with larger that memory datasets.
The incorporation of these technologies is vital for advancing simulation, design, and enabling emerging technologies like machine learning and foundation models, all of which rely on efficient access to extensive repositories of high-quality data. Relying on the FAIR guiding principles for data stewardship not only enhances data findability, accessibility, and reusability, but also fosters international cooperation on the interoperability of data and tools, driving fusion research into new realms and ensuring its relevance in an era characterised by advanced technologies in data science.
[1] Wilkinson, M., Dumontier, M., Aalbersberg, I. et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci Data 3, 160018 (2016) https://doi.org/10.1038/sdata.2016.18
[2] M Cox, The Mega Amp Spherical Tokamak, Fusion Engineering and Design, Volume 46, Issues 2–4, 1999, Pages 397-404, ISSN 0920-3796, https://doi.org/10.1016/S0920-3796(99)00031-9
[3] Stern, Charles, et al. "Pangeo forge: crowdsourcing analysis-ready, cloud optimized data production." Frontiers in Climate 3 (2022): 782909.
[4] Bednar, James A., and Martin Durant. "The Pandata Scalable Open-Source Analysis Stack." (2023).
[5] Alistair Miles (2024) ‘zarr-developers/zarr-python: v2.17.1’. Zenodo. doi: 10.5281/zenodo.10790679
[6] Hoyer, S. & Hamman, J., (20
Solution Manual for First Course in Abstract Algebra A, 8th Edition by John B...rightmanforbloodline
Solution Manual for First Course in Abstract Algebra A, 8th Edition by John B. Fraleigh, Verified Chapters 1 - 56,.pdf
Annex K RBF's The World Game pdf documentSteven McGee
Signals & Telemetry Annex K for RBF's The World Game / Trade Federations / USPTO 13/573,002 Heart Beacon Cycle Time - Space Time Chain meters, metrics, standards. Adaptive Procedural template framework structured data derived from DoD / NATO's system of systems engineering tech framework
Dataguard Switchover Best Practices using DGMGRL (Dataguard Broker Command Line)
General Data Protection Regulation - BDW Meetup, October 11th, 2017
1. Big Data Warehousing Meetup
General Data Protection Regulation
GDPR
October 11, 2017
2. About Caserta
Data Intelligence Consulting and Modern Data Engineering
Data Lakes, Data Laboratories, Data Warehouses
Award-winning company for Data Innovation
Data Science, Machine Learning, Artificial Intelligence
Internationally recognized work force
Keynote Speakers, Educators, Mentors
Strategy, Architecture, Governance, Implementation
5. GDPR Cannot be Ignored
GDPR Compliance Top Data Protection Priority for 92% of US Organizations in 2017
- PwC Survey
• The GDPR requirements will force U.S. companies to change
the way they process, store, and protect customers’ personal
data.
• Companies must be able to show compliance by May 25,
2018
• A data protection officer (DPO) may be required
6. GDPR in a Town Near You
The New York legislature, inspired by the GDPR, proposed the
Right to be Forgotten Act.
GDPR will continue influencing privacy regulations across
the globe
Companies that comply with the GDPR will be better
prepared for future changes in U.S. legislation.
7. Data Elements Regulated
Basic identity information such as name, address and ID numbers
Web data such as location, IP address, cookie data and RFID tags
Health and genetic data
Biometric data
Racial or ethnic data
Political opinions
Sexual orientation
8. The Technical Challenge
“Delete all my personal data without undue delay when it is
no longer necessary or when consent has been withdrawn”
Legal: Right to Erasure or Right to be Forgotten
Engineer: Need the ability to delete some specific subset
or all data associated with a customer from all data
systems
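The engineering requirement above (delete a specific subset or all data for a customer across every system) can be sketched as a routine that fans a delete request out over each registered store and reports what was purged for the audit trail. A minimal illustration in Python, with hypothetical in-memory dicts standing in for real databases and downstream systems:

```python
def erase_customer(customer_id, data_stores):
    """Delete every record tied to customer_id across all registered stores.

    Each store here is a hypothetical dict of record_id -> record; a real
    system would wrap a database, object store, or downstream API instead.
    """
    report = {}
    for name, store in data_stores.items():
        doomed = [rid for rid, rec in store.items()
                  if rec.get("customer_id") == customer_id]
        for rid in doomed:
            del store[rid]
        report[name] = len(doomed)  # audit trail: how many records were purged
    return report

# Usage: two toy "systems" holding records for customers 1 and 2.
stores = {
    "crm": {"a": {"customer_id": 1, "email": "x@y.com"},
            "b": {"customer_id": 2, "email": "z@y.com"}},
    "events": {"e1": {"customer_id": 1, "page": "/home"}},
}
print(erase_customer(1, stores))  # {'crm': 1, 'events': 1}
```

The hard part in practice is the registry itself: erasure only works if every system holding personal data is known and reachable, which is why the later slides stress keeping an inventory of data and processes.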
9. More GDPR Technical Goals
The pseudonymisation and encryption of personal data.
The ability to ensure the ongoing confidentiality, integrity,
availability and resilience of processing systems and services.
The ability to restore the availability and access to personal data in
a timely manner in the event of a physical or technical incident.
A process for regularly testing, assessing and evaluating the
effectiveness of technical and organizational measures for ensuring
the security of the processing.
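Pseudonymisation, the first goal above, replaces direct identifiers with tokens that cannot be re-linked without a separately held key. A minimal sketch using a keyed HMAC; the key handling and field names here are illustrative, not from the slides:

```python
import hashlib
import hmac

def pseudonymize(record, fields, key):
    """Replace the named identifier fields with keyed HMAC tokens.

    Without the key, tokens cannot be traced back to the original values;
    with it, the same input always yields the same token, so joins across
    datasets still work on the pseudonymized columns.
    """
    out = dict(record)
    for f in fields:
        token = hmac.new(key, str(record[f]).encode(), hashlib.sha256).hexdigest()
        out[f] = token[:16]  # truncated for readability; keep the full digest in practice
    return out

key = b"kept-in-a-separate-key-store"  # hypothetical: manage via a real key vault
rec = {"name": "Ada Lovelace", "email": "ada@example.com", "plan": "pro"}
masked = pseudonymize(rec, ["name", "email"], key)
# 'plan' survives untouched; 'name' and 'email' are now opaque tokens.
```

Because the mapping is only reversible via the key store, deleting or rotating the key also supports the erasure requirement for data already copied downstream.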
10. GDPR Three-Legged Stool
Metadata
github.com/linkedin/wherehows
Data Access
Something Similar to DALI (Data Access LinkedIn)
Data Lifecycle Management
gobblin.apache.org
11. GDPR Tips
Bake Data Privacy into the Design
Encrypt the Data, Implement Access Control Governance
Enable Fine Grain Access Control (FGAC)
Keep Inventory of Data and Processes
Document how data is collected, purged
Record or detect Data Lineage
Potentially Hire Data Protection Officer
Or Consultants to establish GDPR Strategy & Execution Plan
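Fine grain access control from the tips above can be illustrated as a column-level policy check that filters what each role may read before any row leaves the system. The roles and field names below are hypothetical:

```python
# Hypothetical column-level policy: which fields each role may read.
POLICY = {
    "analyst": {"country", "plan", "signup_date"},  # no direct identifiers
    "support": {"name", "email", "plan"},
    "dpo":     {"name", "email", "country", "plan", "signup_date"},
}

def read_record(record, role):
    """Return only the columns the given role is allowed to see."""
    allowed = POLICY.get(role, set())  # unknown roles see nothing
    return {k: v for k, v in record.items() if k in allowed}

row = {"name": "Ada", "email": "ada@example.com",
       "country": "UK", "plan": "pro", "signup_date": "2017-10-11"}
print(read_record(row, "analyst"))
# {'country': 'UK', 'plan': 'pro', 'signup_date': '2017-10-11'}
```

Real deployments push this policy into the query layer (views, Ranger-style plugins, or warehouse column grants), but the decision logic is the same: default-deny, with personal data exposed only to roles that need it.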
12. Thank You
Joe Caserta, President
joe@casertaconcepts.com
Twitter: joe_caserta