Of course, you know what data is. You probably know what Big Data and small data are. But what the heck is all this buzz about data? Why is data so important today? These are the questions this session addresses. The session goes beyond definitions and descriptions: we will talk about data, about different options for data usage, and about how we can benefit from data.
Detecting solar farms with deep learning (Jason Brown)
Talk delivered at Free and Open Source Software for Geo North America 2019 (FOSS4GNA)
Large scale solar arrays or farms have been installed globally faster than can be reliably tracked by interested stakeholders. We have built a deep learning model with Sentinel 2 satellite imagery that allows us to create accurate, timely global maps of solar farms.
This document describes a geospatial modeling tool developed to retrieve climate data from large climate model databases in an efficient manner. The tool integrates R programming with ArcGIS to subset and extract grid point data for specific study areas from netCDF climate model files. It was tested on CORDEX climate model data and found to accurately obtain grid points, providing a less tedious method than manual retrieval. The tool allows climate data to be efficiently obtained and prepared as model inputs.
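The tool itself integrates R with ArcGIS; purely as an illustration of the underlying idea (subsetting grid points for a study area out of a netCDF climate model file), here is a minimal Python sketch using xarray. The file name, variable name, and coordinate bounds are hypothetical.

```python
# Minimal sketch: subset grid points from a netCDF climate file for a study
# area. The actual tool uses R + ArcGIS; this xarray version only illustrates
# the idea. File name, variable, and bounds are hypothetical.
import xarray as xr

ds = xr.open_dataset("cordex_tasmax.nc")  # hypothetical CORDEX file

# Bounding box of the study area (hypothetical coordinates).
subset = ds["tasmax"].sel(lat=slice(45.0, 47.5), lon=slice(5.0, 8.0))

# Export the selected grid points as a table ready for use as model input.
subset.to_dataframe().to_csv("study_area_tasmax.csv")
```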
This document discusses the use of machine learning techniques for analyzing astronomical and earth observation data. It notes that large sky survey projects generate huge amounts of data that exceeds our ability to analyze using traditional methods. Machine learning can help process and extract insights from this big data. Specifically, the document discusses how convolutional neural networks have achieved 98% accuracy in classifying galaxy images from sky surveys. It also provides examples of applying machine learning to tasks like terrain classification from earth observation satellite imagery.
Scientific Computing With Amazon Web Services (Jamie Kinney)
Researchers from around the world are increasingly using AWS for a wide array of use cases. This presentation describes how AWS facilitates scientific collaboration and powers some of the world's largest scientific efforts, including real-world examples from NASA JPL, the European Space Agency (ESA) and CERN's CMS particle detector.
The Pacific Research Platform (PRP) aims to achieve transparent and rapid data access among collaborating scientists at multiple institutions through an integrated implementation of data-focused networking that extends the university campus Science DMZ model to a regional, national, and, eventually, a global scale.
PRP researchers are routinely achieving high-performance end-to-end networking from their labs to their collaborators’ labs and data centers, traversing multiple, heterogeneous Science DMZs and wide-area networks connecting multiple campus gateways, enabling researchers across the partnership to transfer data over dedicated optical lightpaths at speeds from 10Gb/s to 100Gb/s.
STAIR Lab introduces two new datasets for deep learning: STAIR Captions, a dataset of 100,000 images with captions, and STAIR Actions, a video dataset of everyday actions covering 100 action categories. STAIR Captions is based on the MS-COCO 2014 dataset and uses a 2D CNN plus RNN model to generate captions. STAIR Actions contains videos from a research project and aims to help with action recognition tasks. Both datasets are publicly available to researchers through the STAIR Lab website.
Anatol Salanevich developed a method for seismic data analysis based on the maximin method. He created a program called Maximin in the Qt IDE. The program takes as input a dbf file of seismic events produced by CreateShapeGIS and outputs a csv file with a list of clusters.
The Next Light Wave: Why Too Much Light is an Issue (GTTP-GHOU-NUCLIO)
Presentation on the importance of light for astronomy and society, given at the International Conference on Communication and Light, 2-4 November, Braga, Portugal, by Pedro Russo.
EventNet is a neural network architecture that can efficiently process asynchronous event streams from event cameras in real-time. It uses a temporal coding function to recursively update the network's state as new events arrive, avoiding redundant computation. Experimental results show it can perform tasks like target motion estimation and ego-motion estimation at rates over 1000 Hz using only the new event data. Compared to frame-based and PointNet approaches, EventNet significantly reduces computation time by recursively updating representations rather than reprocessing all prior events.
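The recursive-update idea can be illustrated generically. The toy sketch below is not EventNet's actual temporal coding function, only the general pattern it exploits: each incoming event updates a running state in O(1), so no past event is ever reprocessed.

```python
# Toy illustration of event-driven recursive state updates: each new event
# updates a running summary in O(1) instead of reprocessing all past events.
# This is NOT EventNet's temporal coding function, just the general idea.
from dataclasses import dataclass

@dataclass
class RunningState:
    count: int = 0
    mean_x: float = 0.0
    mean_y: float = 0.0

    def update(self, x: float, y: float) -> None:
        # Incremental mean update: state(t) = f(state(t-1), event(t)).
        self.count += 1
        self.mean_x += (x - self.mean_x) / self.count
        self.mean_y += (y - self.mean_y) / self.count

state = RunningState()
for event in [(1.0, 2.0), (1.5, 2.5), (0.5, 1.0)]:  # stream of (x, y) events
    state.update(*event)
print(state.mean_x, state.mean_y)  # summary is available after every event
```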
This document discusses investigating the capabilities of ESRI products for handling large datasets and big data. The objectives are to study ESRI's existing abilities to process and analyze large data sets, and to examine ESRI's architecture for big data processing. The author works with New York taxi trip data, comparing different processing and visualization methods in Python, ArcPy, and Tableau Public. These include spatial joining, data filtering, and creating visualizations to analyze patterns and outliers. The conclusion evaluates the best method based on processing time, dependencies, and license restrictions. Objective 2 briefly outlines ESRI's machine-based architecture for hosting big data solutions.
The document discusses the impact and usage trends of the EGI (European Grid Initiative) federated cloud computing infrastructure. It notes that EGI has supported over 23,000 research papers since 2008. Usage of EGI resources has increased significantly in recent years across many research domains, with computing hours increasing by 40% from 2016 to 2017. EGI provides federated cloud computing resources to thousands of individual researchers and supports the long-tail of science through various applications and thematic services.
Mike Warren is the co-founder and CTO of Descartes Labs, a company that operates a geospatial analysis platform using multiple integrated satellite image datasets. The platform provides analysis-ready images with historical records for machine learning and allows users to find, measure, monitor changes over time, and predict future changes to minimize risk and optimize outcomes. It eliminates much of the data preparation time typically required by geospatial scientists by maintaining a growing archive of processed images and a robust pipeline for continuous updates as new images become available.
This document discusses using spatial analysis and mixture of Gaussians modeling to analyze geo-tagged tweets from a city to identify hot spots and patterns in people's behavior over time and location. The goal is to empirically model the spatial density of tweets. The document describes using expectation maximization to fit the mixture of Gaussians model to synthetic and real tweet location data, and issues that arose such as model collapsing and the lack of a global maximum. BIC was used to select the number of clusters but also had limitations. Future work proposed focusing on city centers and understanding when BIC works best.
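As a minimal sketch of the modeling step described above, the following uses scikit-learn's GaussianMixture (fitted via EM, as in the document) and BIC to choose the number of components; the point data here is synthetic rather than real tweet locations.

```python
# Minimal sketch: fit a mixture of Gaussians to 2-D point locations via EM
# and pick the number of clusters by BIC. Synthetic data stands in for the
# (lon, lat) coordinates of geo-tagged tweets.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
points = np.vstack([
    rng.normal(loc=(0, 0), scale=0.5, size=(300, 2)),  # hot spot 1
    rng.normal(loc=(5, 5), scale=1.0, size=(200, 2)),  # hot spot 2
])

best_k, best_bic, best_model = None, np.inf, None
for k in range(1, 8):
    gmm = GaussianMixture(n_components=k, n_init=3, random_state=0).fit(points)
    bic = gmm.bic(points)
    if bic < best_bic:
        best_k, best_bic, best_model = k, bic, gmm

print("clusters chosen by BIC:", best_k)
```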
SDSC Technology Forum: Increasing the Impact of High Resolution Topography Da... (OpenTopography Facility)
High-resolution topography is a powerful tool for studying the Earth's surface, vegetation, and urban landscapes, with broad scientific, engineering, and educational applications. Over the past decade, there has been dramatic growth in the acquisition of these data for scientific, environmental, engineering, and planning purposes. In the US, the U.S. Geological Survey is undertaking the 3D Elevation Program (3DEP) to map the entire lower 48 states with lidar by 2023.
The richness of these topography datasets makes them extremely valuable beyond the applications that drove their acquisition, so they are of interest to a large and varied user community. A cyberinfrastructure platform that enables users to efficiently discover, access, and process these massive volumes of data increases the impact of investments in data collection and catalyzes scientific discovery. It also informs the critical, elevation-dependent decisions made across our nation every day, ranging from the immediate safety of life, property, and the environment to long-term planning for infrastructure projects.
Join us to hear about the motivations, technology, and data assets behind the National Science Foundation funded OpenTopography platform, which aims to democratize access to high resolution topographic data. OpenTopography’s innovation is in co-locating massive volumes of topographic data with processing tools that enable users with varied expertise and application domains to quickly and easily access and process data, to enable innovation and decision making.
This document discusses using Predix technology to forecast energy generation from solar power plants. It describes how Predix can be used for now-casting, short-term forecasting, and long-term forecasting of solar energy production. Predix utilizes sensors and analytics to process data from solar installations and predict upcoming energy generation for balancing the energy grid and avoiding blackouts.
The purpose of this study is to develop a system that helps a user determine whether a location can be labeled a "Safe" residence. The output is based on an analysis of the city's local crime history, which involves examining a huge amount of geolocation data and zeroing in on a single area. Areas with the majority of crime incidents are highlighted as Unsafe. Clicking or hovering on a single record displays the name, the associated crime, and its rank based on the number of crimes that occurred. Big Data Hadoop and Hive systems are implemented in Azure for the analysis.
Keywords: Hadoop, Big Data, Hive, Azure
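The study runs its analysis in Hadoop/Hive on Azure; as a small illustrative stand-in for the core aggregation (count crimes per area, rank areas, flag the worst as Unsafe), here is a pandas sketch with hypothetical column names and data.

```python
# Illustrative stand-in for the core aggregation: count crimes per area,
# rank areas by incident count, and flag the worst as "Unsafe". The real
# study runs this in Hive on Azure; columns and data here are hypothetical.
import pandas as pd

crimes = pd.DataFrame({
    "area":  ["Downtown", "Downtown", "Harbor", "Suburb", "Downtown"],
    "crime": ["theft", "assault", "theft", "vandalism", "burglary"],
})

counts = crimes.groupby("area").size().sort_values(ascending=False)
threshold = counts.median()

for rank, (area, n) in enumerate(counts.items(), start=1):
    label = "Unsafe" if n > threshold else "Safe"
    print(f"#{rank} {area}: {n} incidents -> {label}")
```

The Hive equivalent would be a GROUP BY over areas with an ORDER BY on the counts.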
Valerii Vasylkov: Erlang. Measurements and benefits (Аліна Шепшелей)
The document discusses the benefits of Erlang, including its functional nature, powerful pattern matching, built-in concurrency and fault tolerance through its "let it crash" philosophy, ability to perform distributed computation, and capability for hot code upgrades without downtime. It covers Erlang's actor-model approach to concurrency, its use of processes and message passing, supervision trees for fault tolerance, and tools for debugging, profiling, and detecting bottlenecks.
Marina Bril: Organizing the work of marketing teams and the economic rationale... (Аліна Шепшелей)
1. Information on building a marketing team for projects.
2. Team members' areas of responsibility and the definition of KPIs.
3. An algorithm for prioritizing the tasks of agency and in-house employees.
JHipster is a Yeoman generator used to create a Spring Boot and AngularJS project. It saves development time by including accepted practices and scaffolding for both design and runtime. The generator supports technologies like Spring Boot, AngularJS, Bootstrap, and MySQL. Developers can add additional functionality through JHipster modules and sub-generators. The generated projects include tools for testing, deployment to Docker, and integration with services like Elasticsearch.
It's hard to imagine any business without IT now, just like any house without electricity. And it is hard to imagine any successful business with no clouds in its IT.
A step-by-step guide to building a multipurpose parser for scalable web data extraction.
The design and usage of a universal format for stripped web articles.
A comparison of the format with AMP (Google), Facebook Instant Articles, and Apple News.
This document discusses Google's involvement in virtual reality (VR) and its Daydream VR platform. It outlines some of the key differences between Daydream and other VR platforms like Oculus Rift, provides details on Android OS optimizations and the Google VR SDK for developing Daydream apps, and briefly touches on potential future applications of VR in areas like education, medicine, news and entertainment.
Dmitriy Kouperman: Working with legacy systems. Stabilization, monitoring, man... (Аліна Шепшелей)
About half of all developers have, one way or another, dealt with legacy projects. Not everyone can (or wants to) work with them. But with the right approach, such projects can be carried out with pleasure and even enthusiasm. We suggest a way of understanding such legacy code, discuss the relevant project management techniques and practices, and explore solutions the developers found useful:
• Examples of optimization that are worth a try;
• Monitoring applications with JavaMelody;
• Monitoring applications with logs and ELK (Elasticsearch + Logstash + Kibana);
• Monitoring applications with Java Mission Control and the Heap Dump Memory Analyzer Tool.
Dmytro Zaitsev: VIPER. Make your MVP cleaner (Аліна Шепшелей)
VIPER is an architectural pattern for structuring Android applications. It divides an app into distinct layers - View, Interactor, Presenter, Entity, and Router. The Presenter handles view logic and communication between the View and Interactor. The Interactor contains business logic. The View displays content from the Presenter. VIPER aims to make apps easier to understand, maintain, and test by separating concerns and reducing dependencies between layers. It is best for medium to large apps but may be overkill for small projects.
Anna Lavrova: Gladiator in the suit. Crisis is our brand! (Аліна Шепшелей)
The document discusses the challenges that can arise when a software development team loses its project lead. It notes that without a team lead to guide them, team members may leave the project. It also suggests that the development roadmap could lack estimates, clients may leave if release dates are not met, designs may not follow guidelines, retrospectives may not occur, and it may be unclear who is responsible for creating stories. The document closes by thanking the audience and providing contact information for any questions.
Mihail Patalaha: ASO. How to start and how to finish? (Аліна Шепшелей)
Mikhail Patalakha is a mobile ASO manager with experience managing over 50 successful projects. He provides tips for optimizing mobile app keywords and rankings, including opening the application, researching competitors' keywords, removing duplicates, getting new keywords from Google Keyword Planner, defining competition and traffic from services like SensorTower and ASOdesk, choosing keywords based on difficulty, and calculating approximate visitor numbers using a provided formula. His contact information is provided for further questions.
Andrew Veles: Product design is about the process (Аліна Шепшелей)
This document discusses product design and the product design process. It emphasizes that product design is about focus, thinking through every step of the process from initial ideas to implementation. This includes activities like creating portraits, user stories, specifications, site maps, flows, wireframes, prototypes, and UI design. It also notes that the goal is stable growth for the product over time, but that the solution designed may need to change as problems change. Examples are provided of redesigns for a mobile app, desktop app, and logo. The conclusion emphasizes that building the right features for the right users is more challenging than just building features.
Andrey Sobol: Blockchain crowdfunding, or "Mommy, look, I launched an IPO" (Аліна Шепшелей)
I will talk about:
• Why crowdfunding in cryptocurrency is a good idea
• How you can create a DIY IPO
• Why a blockchain IPO gives confidence to your investors
Vladimir Lozanov: How to deliver high quality apps to the App Store (Аліна Шепшелей)
Mobile QA teams are responsible for thoroughly testing apps before release to ensure high quality. They use a variety of manual and automated testing methods at different stages of development. QA works closely with development and customer support to catch bugs, validate fixes, and improve the product based on user feedback. The goal is to deliver stable, bug-free apps through collaboration across teams.
This document provides an overview of how SQL Server processes queries. It discusses the key components like the query processor, parser, algebrizer, optimizer and executor. The query processor breaks queries into logical and physical representations. The optimizer chooses the most efficient execution plan. The executor then runs the query. It also touches on topics like parameter sniffing, locking, deadlocks and the thread pool model.
This document summarizes Kx Systems, a company that provides a high-performance time-series database called kdb+. Kdb+ can process and analyze large volumes of real-time and historical time-series data extremely fast with low latency. It is widely used in financial services and is now being applied to other industries like manufacturing, utilities, and life sciences. Kx Systems offers software, consulting services, and can help clients integrate kdb+ with their existing technologies and scale their deployments.
Tooling Up for Efficiency: DIY Solutions @ Netflix - ABD319 - re:Invent 2017 (Amazon Web Services)
At Netflix, we have traditionally approached cloud efficiency from a human standpoint, whether it be in-person meetings with the largest service teams or manually flipping reservations. Over time, we realized that these manual processes are not scalable as the business continues to grow. Therefore, in the past year, we have focused on building out tools that allow us to make more insightful, data-driven decisions around capacity and efficiency. In this session, we discuss the DIY applications, dashboards, and processes we built to help with capacity and efficiency. We start at the ten thousand foot view to understand the unique business and cloud problems that drove us to create these products, and discuss implementation details, including the challenges encountered along the way. Tools discussed include Picsou, the successor to our AWS billing file cost analyzer; Libra, an easy-to-use reservation conversion application; and cost and efficiency dashboards that relay useful financial context to 50+ engineering teams and managers.
Computing Just What You Need: Online Data Analysis and Reduction at Extreme ... (Ian Foster)
This document discusses computing challenges posed by rapidly increasing data scales in scientific applications and high performance computing. It introduces the concept of online data analysis and reduction as an alternative to traditional offline analysis. The key message is that dramatic changes in HPC system geography, driven by the different growth rates of the underlying technologies, are creating new application structures and computational-logistics problems, and with them exciting new computer science opportunities in online data analysis and reduction.
In this video from ChefConf 2014 in San Francisco, Cycle Computing CEO Jason Stowe outlines the biggest challenge facing us today, Climate Change, and suggests how Cloud HPC can help find a solution, including ideas around Climate Engineering, and Renewable Energy.
"As proof points, Jason uses three use cases from Cycle Computing customers, including from companies like HGST (a Western Digital Company), Aerospace Corporation, Novartis, and the University of Southern California. It’s clear that with these new tools that leverage both Cloud Computing, and HPC – the power of Cloud HPC enables researchers, and designers to ask the right questions, to help them find better answers, faster. This all delivers a more powerful future, and means to solving these really difficult problems."
Watch the video presentation: http://insidehpc.com/2014/09/video-hpc-cluster-computing-64-156000-cores/
Making Earth observation data available by using Amazon S3 is accelerating scientific discovery and enabling the creation of new products. Attend and learn how the scale and performance of Amazon S3 lets earth scientists, researchers, startups, and GIS professionals gather and analyse planetary-scale data without worrying about limitations of bandwidth, storage, memory, or processing power. Co-presented with support of the Australian Geoscience Data Cube collaboration, DigitalGlobe’s Geospatial Big Data Platform and the developer of the popular ObservedEarth mobile app.
Speakers:
Craig Lawton, Public Sector Solutions Architect, Amazon Web Services
Lachlan Hurst, Observed Earth
Matt Paget, Senior Experimental Scientist, CSIRO
Dan Getman, Digital Globe
This document discusses using deep learning techniques to detect extreme weather patterns in climate data. It begins by outlining the scientific motivation and successes of deep learning in computer vision. It then describes early successes applying deep learning to climate science tasks like classifying tropical cyclones, atmospheric rivers, and weather fronts. Challenges include dealing with multi-variate climate data and lack of labeled examples. Future work involves creating unified deep learning models that can perform detection, localization, and segmentation of extreme weather across different climate datasets.
Mehr und schneller ist nicht automatisch besser ("More and faster is not automatically better"), data2day, 06.10.16 (Boris Adryan)
The law of large numbers always holds: statistical certainty increases with the number of data points, provided the data are collected fairly. Unfortunately, collecting data often costs money, and so, especially in the field of sensor technology (keyword: Internet of Things), one is forced to make sensible compromises. In this talk I summarize the findings of a project in which data analytics showed that, going forward, only 60% of the deployed sensors are really needed. Nor does it always have to be real-time analysis: with a data strategy tailored to the business case, unnecessary expenses can be avoided.
This thesis proposal aims to develop a system called Eureka to efficiently discover training data for visual machine learning tasks. Eureka combines early discard filters, just-in-time machine learning, and the ability to create more accurate filters without writing new code. The goal is to reduce the manual effort required of domain experts to find and label rare phenomena in large unlabeled visual datasets. The proposal outlines research thrusts to apply Eureka in different computing environments like edge, cloud, and smart storage, as well as different problem domains including images, videos, and other multidimensional data. Initial experiments show Eureka can discover more true positives per unit time compared to naive hand-labeling.
How to expand the Galaxy from genes to Earth in six simple steps (and live sm... (Raffaele Montella)
FACE-IT is an effort to develop a new IT infrastructure to accelerate existing disciplinary research and enable information transfer among traditionally separate fields. At present, finding data and processing it into usable form can dominate research efforts. By providing ready access not only to data but also to the software tools used to process it for specific uses (e.g., climate impact and economic model inputs), FACE-IT allows researchers to concentrate their efforts on analysis. Lowering barriers to data access lets researchers stretch in new directions and learn from and respond to the needs of other fields. FACE-IT builds on the Globus Galaxies platform, which has been developed over the past several years at the University of Chicago. FACE-IT also benefits from substantial software development by the communities who created most of the domain-specific tools required to populate FACE-IT with useful capabilities. The FACE-IT Galaxy manages earth system datatypes (such as NetCDF), new tool parameters (dates, map, opendap), aggregated datatypes (RAFT), service providers, and cool map visualizers.
Stephen Cantrell, kdb+ Developer at Kx Systems: “Kdb+: How Wall Street Tech can Speed up the World” (Dataconomy Media)
You can see some additional notes here:
https://github.com/cantrells/berlin_kdb_demo?files=1
A modified k-means algorithm for big data clustering (SK Ahammad Fahad)
The amount of data grows every moment, and it comes from everywhere: social media, sensors, search engines, GPS signals, transaction records, satellites, financial markets, e-commerce sites, and so on. This large volume of data may be semi-structured, unstructured, or structured, so it is important to derive meaningful information from such huge data sets. Clustering is the process of categorizing data so that items are grouped in the same cluster when they are similar according to specific metrics. In this paper, we work on the k-means clustering technique to cluster big data. Several methods have been proposed for improving the performance of the k-means clustering algorithm. We propose a method that makes the algorithm less time-consuming and more effective and efficient, yielding better clustering with reduced complexity. According to our observation, the quality of the resulting clusters heavily depends on the selection of the initial centroids and on how data points change clusters in subsequent iterations. As we know, after a certain number of iterations, only a small portion of the data points change their clusters. Therefore, our proposed method first finds the initial centroids and then separates the data elements that will not change their cluster from those that may change their cluster in subsequent iterations, which significantly reduces the workload for very large data sets. We evaluate our method on different data sets and compare it with other methods.
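A simplified sketch of the kind of optimization the paper describes appears below: after each assignment step, points that are much closer to their own centroid than to any other are marked stable and skipped in later iterations. The margin test here is a stand-in heuristic, not the paper's exact interval criterion.

```python
# Simplified sketch of the paper's idea: after each assignment step, points
# that are much closer to their own centroid than to any other are marked
# "stable" and skipped in later iterations. The margin test is a stand-in
# heuristic, not the paper's exact interval criterion.
import numpy as np

def modified_kmeans(X, k, iters=20, margin=2.0, seed=0):
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    labels = np.zeros(len(X), dtype=int)
    active = np.ones(len(X), dtype=bool)      # points still being re-checked

    for _ in range(iters):
        # Distances from active points to every centroid: shape (n_active, k).
        d = np.linalg.norm(X[active, None, :] - centroids[None, :, :], axis=2)
        labels[active] = d.argmin(axis=1)

        # Mark points stable when the runner-up centroid is >= margin times
        # farther away than the winner; they skip future assignment steps.
        d_sorted = np.sort(d, axis=1)
        idx = np.flatnonzero(active)
        active[idx[d_sorted[:, 1] >= margin * d_sorted[:, 0]]] = False

        # Centroids are still recomputed from all points each iteration.
        for j in range(k):
            members = X[labels == j]
            if len(members):
                centroids[j] = members.mean(axis=0)
    return labels, centroids
```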
As a Presidio Fellow in Sustainability and Sports at the Presidio Graduate School, San Francisco, CA [http://www.presidio.edu/academics/presidiopro/certificates/sports-sustainability], I presented a class on energy efficiency and solar power in sports stadiums and arenas. It covers related issues of advanced BIM (Building Information Modeling or Building Intelligence Management), the Internet of Everything (IoT), continuous commissioning over the building lifecycle, LED lighting systems, and more.
Benchmarking search relevance in industry vs academia (Nick Craswell)
Update of my WSDM 2017 practice and experience talk (also on SlideShare) about lessons from industry on the use of offline metrics in information retrieval. Since a key need is more training and test sets, this talk describes our more recent data releases.
[CS570] Machine Learning Team Project (I know what items really are) (Kunwoo Park)
This document summarizes a team's approach to predicting which items users might be interested in using a recommendation system. It describes extracting features from user and item metadata to train an SVM model, but this was too computationally expensive. Instead, the team used logistic regression with stochastic gradient descent. They tested features like age, gender and network similarities. Their combined model outperformed random prediction baselines on the KDD Cup 2012 Track 1 dataset.
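The switch from an SVM to SGD-trained logistic regression can be sketched in a few lines with scikit-learn; the features and labels below are hypothetical stand-ins for the KDD Cup 2012 data.

```python
# Minimal sketch of switching from SVM to logistic regression trained with
# stochastic gradient descent, via scikit-learn. Features (age gap, gender
# match, network similarity) and data are hypothetical stand-ins.
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
X = rng.random((1000, 3))  # columns: age gap, gender match, similarity
y = (X[:, 2] + 0.1 * rng.standard_normal(1000) > 0.5).astype(int)

# loss="log_loss" makes SGDClassifier an SGD-trained logistic regression
# (the loss was named "log" in older scikit-learn versions).
model = SGDClassifier(loss="log_loss", max_iter=1000, random_state=0).fit(X, y)
print("predicted interest probability:", model.predict_proba(X[:3])[:, 1])
```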
This document discusses graph databases and Neo4j. It begins with an agenda that includes stories about graph databases in Washington DC, the state of graph databases in 2019, innovation waves, and recommendations for the future regarding AI and graphs. It then provides examples of how Neo4j is being used by organizations like ICIJ, NASA, and to search for cures for cancer. The document discusses the graph database market and Neo4j's dominance. It outlines the Neo4j graph platform vision and upcoming features. Specific customer use cases are presented, including ones for the German Center for Diabetes Research, Caterpillar, and DeviantArt.
Jim Gray presented on his work with large databases and grid computing. He discussed two major projects - TerraServer and SkyServer/World Wide Telescope. TerraServer is a photo database of the United States containing over 15 TB of imagery data accessed through an SQL database. SkyServer is a database of astronomical data containing images and attributes of celestial objects from surveys like SDSS. Gray discussed lessons learned from building and managing these large databases, and future plans to build databases from inexpensive disk bricks. He advocated for grid computing through web services as a way to federate and access distributed data sources on the internet.
Oleksandr Yefremov: Continuously delivering mobile projects (Аліна Шепшелей)
This document discusses best practices for continuously delivering mobile projects. It outlines a CI/CD workflow that includes running tests and manual QA on pull requests, notifying stakeholders, automatically generating changelogs and version bumps, preparing release artifacts, and publishing them to stores or S3. Key steps are running tests on pull requests, using strict PR naming conventions, notifying teams in Slack, automating versioning and publishing with scripts and Fastlane, and deploying beta builds to Fabric/Crashlytics. The full workflow aims to streamline mobile releases by automating repetitive tasks and integrating all steps.
Alexander Voronov: Test driven development in the real world (Аліна Шепшелей)
This document discusses test-driven development (TDD) practices. It covers topics like the benefits of cleaner interfaces and unbiased design when tests are written first. It also addresses challenges like introducing TDD to an existing codebase or team. Key points emphasized are starting simple with critical features, finding the lowest testable point, and making incremental changes to introduce tests and refactoring step-by-step. Continuous integration practices are also highlighted.
Apache Spark is an open-source parallel processing framework that supports in-memory processing to boost the performance of big-data analytic applications. We will cover approaches of processing Big Data on Spark cluster for real time analytic, machine learning and iterative BI and also discuss the pros and cons of using Spark in Azure cloud.
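A minimal PySpark sketch of the in-memory pattern mentioned above: load the data once, cache it, and reuse it across repeated analytic queries. The input path and column names are hypothetical.

```python
# Minimal PySpark sketch: load data once, cache it in memory, and reuse it
# across several analytic queries, which is where Spark's in-memory model
# pays off. The file path and column names are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("big-data-demo").getOrCreate()

events = spark.read.json("events.json").cache()  # keep in memory for reuse

# Repeated/iterative queries hit the cached data instead of re-reading it.
events.groupBy("user_id").count().show(5)
events.agg(F.avg("duration").alias("avg_duration")).show()

spark.stop()
```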
Valerii Iakovenko: Drones as part of the present (Аліна Шепшелей)
Drones are tools that have become firmly embedded in our lives. They are sources of geospatial information that form the basis of, and supplement, many monitoring and control systems. The talk will go into detail about agribusiness.
This document provides an overview and agenda for an Apache HBase workshop. It introduces HBase as an open-source NoSQL database built on Hadoop that uses a column-family data model. The agenda covers what HBase is, its data model including rows, columns, cells and versions, CRUD operations, architecture including regions and masters, schema design best practices, and the Java API. Performance tips are given for client reads and writes such as using batches, caching, and tuning durability.
Anton Ivinskyi: Application level metrics and performance tests (Аліна Шепшелей)
It is important to understand how your code behaves in production, not just guess how it should behave. Know what takes time and what goes wrong. Measure it all. Be ready for the load with performance tests.
Anton Parkhomenko: Boost your design workflow, or git rebase for designers (Аліна Шепшелей)
The document provides 4 tips to boost a designer's workflow: 1) Use Git to version and collaborate on design files, 2) Automate repetitive processes, 3) Be prepared for changes by using flexible components and responsive design, 4) Create prototypes to gather feedback early in the design process.
Kononenko Alina: Designing for Apple Watch and Apple TV (Аліна Шепшелей)
Apple Watch and Apple TV apps are inherently different from other apps, in both form and function. You will learn watchOS and tvOS user experience foundations and design principles, get a quick overview of the best existing solutions, and see possible ways of extending your projects to these platforms.
Gregory Shehet: 'Undefined' on prod, or how to test a React app (Аліна Шепшелей)
During the lecture we'll discuss unit testing of the interface. The technology stack: React (Redux, MobX), Mocha/Chai, React Test Utils, Enzyme, Tape/Ava. I will also mention how we at Grammarly are rewriting Selenium tests as unit tests, and how that works.
Alexey Osipenko: Basics of functional reactive programming (Аліна Шепшелей)
During the talk, we will incrementally construct primitives and an algebra (in the worst case, we will just come up with a library interface) for organizing an application's interaction with the mess of the real world. This approach keeps the application logic in pure functions and declaratively associates external events with the necessary output.
And no React JS.
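A tiny sketch of the kind of primitive such a talk might construct, assuming nothing about the speaker's actual library: an event stream with pure map/filter combinators, where side effects happen only in subscribers.

```python
# Tiny sketch of a functional-reactive primitive: a stream with pure
# map/filter combinators; side effects happen only in subscribe callbacks.
# Purely illustrative; not the talk's actual library.
class Stream:
    def __init__(self):
        self._subscribers = []

    def subscribe(self, fn):
        self._subscribers.append(fn)

    def emit(self, value):
        for fn in self._subscribers:
            fn(value)

    def map(self, f):  # pure transformation producing a new stream
        out = Stream()
        self.subscribe(lambda v: out.emit(f(v)))
        return out

    def filter(self, pred):
        out = Stream()
        self.subscribe(lambda v: pred(v) and out.emit(v))
        return out

clicks = Stream()
clicks.map(lambda e: e["x"]).filter(lambda x: x > 100).subscribe(print)
clicks.emit({"x": 150})  # prints 150; the logic above stayed pure
```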
Roman Ugolnikov: Migration and source control for your DB (Аліна Шепшелей)
The document discusses database migration and source control. It describes how database structure, data, and logic can change across versions. It recommends using tools like Liquibase and Flyway to manage database schema changes and keep the database schema in sync with code. These tools allow defining changes in migration files and rolling back changes if needed. The document also covers how the tools work, supported databases, file formats, preconditions, and provides a demo of using the tools for a sample database migration.
Alex Theedom: Java EE revisits design patterns (Аліна Шепшелей)
Enter "Django Channels": new way of desinging and thinking about your application. It separates transport and processing concerns in typical Django project using combination of ASGI (Asynchronous Server Gateway Interface) and worker processes, enabling your application to be "event-oriented" and implement new workflows for processing your data. How does it work? What do you need to start? Is it even useful? Learn for yourself with this introductory talk.
Alexey Tokar: To find a needle in a haystack (Аліна Шепшелей)
The talk will cover core principles of text search applicable to fixed-size dictionaries. We will take a deep look at some algorithms hidden inside huge search engines and behind basic search inputs on websites. My goal is to compare different search approaches and to provide an objective assessment based on the complexity, memory consumption, and CPU utilization of each.
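One classic structure for fixed-dictionary search is a trie; the sketch below is illustrative and not necessarily among the algorithms the talk compares.

```python
# Minimal trie sketch for fixed-dictionary word lookup, one of the classic
# structures behind dictionary search. Illustrative only; the talk compares
# several approaches by complexity, memory, and CPU cost.
class Trie:
    def __init__(self):
        self.root = {}

    def insert(self, word):
        node = self.root
        for ch in word:
            node = node.setdefault(ch, {})
        node["$"] = True  # end-of-word marker

    def contains(self, word):
        node = self.root
        for ch in word:
            if ch not in node:
                return False
            node = node[ch]
        return "$" in node

trie = Trie()
for w in ("needle", "need", "haystack"):
    trie.insert(w)
print(trie.contains("needle"), trie.contains("hay"))  # True False
```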
Den Golotyuk: Big data from 30 million daily users (Аліна Шепшелей)
This document summarizes the key details of an analytics company called .io over the past year, in three sentences:
The company has grown significantly in the past year and now serves over 30 million unique users across 200 customers globally. It focuses on collecting and processing huge data flows in a simple way for customers, handling the complex analytics internally and providing simple outputs. The company is run by a small team of 4 engineers and processes over 2 billion daily requests and 100 GB of daily backups across 150 cloud and physical nodes.
Anton Fedorchenko: Swift for server-side development (Аліна Шепшелей)
Since the Swift programming language was open-sourced in December 2015, its popularity has boomed. This smart move by Apple introduced new opportunities for the language and increased its impact on the developer community, including expanding Swift to other platforms and using it for server-side development. The presentation gives an introduction to server-side development with Swift, highlights the most popular frameworks and solutions, and covers key questions regarding the language's adoption.
Ruslan Shevchenko: Programming languages landscape: new & old ideas (Аліна Шепшелей)
In this lecture we will talk about emerging developments in the field of industrial programming languages, new ideas for the niche and mainstream markets, and what can be used now. We will mark the positions and main characteristics of today's spectrum of new languages, from Scala, Rust, and Julia to Wolfram and Racket.
Cracking AI Black Box - Strategies for Customer-centric Enterprise Excellence (Quentin Reul)
The democratization of Generative AI is ushering in a new era of innovation for enterprises. Discover how you can harness this powerful technology to deliver unparalleled customer value and secure a formidable competitive advantage in today's market. In this session, you will learn how to:
- Identify high-impact customer needs with precision
- Harness the power of large language models to address specific customer needs effectively
- Implement AI responsibly to build trust and foster strong customer relationships
Whether you're at the early stages of your AI journey or looking to optimize existing initiatives, this session will provide you with actionable insights and strategies needed to leverage AI as a powerful catalyst for customer-driven enterprise success.
TrustArc Webinar - Innovating with TRUSTe Responsible AI Certification (TrustArc)
In a landmark year marked by significant AI advancements, it’s vital to prioritize transparency, accountability, and respect for privacy rights with your AI innovation.
Learn how to navigate the shifting AI landscape with our innovative solution, TRUSTe Responsible AI Certification, the first AI certification designed for data protection and privacy. Crafted by a team that has issued 10,000+ privacy certifications, this framework integrates industry standards and laws for responsible AI governance.
This webinar will review:
- How compliance can play a role in the development and deployment of AI systems
- How to model trust and transparency across products and services
- How to save time and work smarter in understanding regulatory obligations, including AI
- How to operationalize and deploy AI governance best practices in your organization
Generative AI technology is a fascinating field that focuses on creating comp... (Nohoax Kanont)
Generative AI technology is a fascinating field that focuses on creating computer models capable of generating new, original content. It leverages the power of large language models, neural networks, and machine learning to produce content that can mimic human creativity. This technology has seen a surge in innovation and adoption since the introduction of ChatGPT in 2022, leading to significant productivity benefits across various industries. With its ability to generate text, images, video, and audio, generative AI is transforming how we interact with technology and the types of tasks that can be automated.
UiPath Community Day Amsterdam: Code, Collaborate, Connect (UiPathCommunity)
Welcome to our third live UiPath Community Day Amsterdam! Come join us for a half-day of networking and UiPath Platform deep-dives, for devs and non-devs alike, in the middle of summer ☀.
📕 Agenda:
12:30 Welcome Coffee/Light Lunch ☕
13:00 Event opening speech
Ebert Knol, Managing Partner, Tacstone Technology
Jonathan Smith, UiPath MVP, RPA Lead, Ciphix
Cristina Vidu, Senior Marketing Manager, UiPath Community EMEA
Dion Mes, Principal Sales Engineer, UiPath
13:15 ASML: RPA as Tactical Automation
Tactical robotic process automation for solving short-term challenges, while establishing standard and re-usable interfaces that fit IT's long-term goals and objectives.
Yannic Suurmeijer, System Architect, ASML
13:30 PostNL: an insight into RPA at PostNL
Showcasing the solutions our automations have provided, the challenges we’ve faced, and the best practices we’ve developed to support our logistics operations.
Leonard Renne, RPA Developer, PostNL
13:45 Break (30')
14:15 Breakout Sessions: Round 1
Modern Document Understanding in the cloud platform: AI-driven UiPath Document Understanding
Mike Bos, Senior Automation Developer, Tacstone Technology
Process Orchestration: scale up and have your Robots work in harmony
Jon Smith, UiPath MVP, RPA Lead, Ciphix
UiPath Integration Service: connect applications, leverage prebuilt connectors, and set up customer connectors
Johans Brink, CTO, MvR digital workforce
15:00 Breakout Sessions: Round 2
Automation, and GenAI: practical use cases for value generation
Thomas Janssen, UiPath MVP, Senior Automation Developer, Automation Heroes
Human in the Loop/Action Center
Dion Mes, Principal Sales Engineer @UiPath
Improving development with coded workflows
Idris Janszen, Technical Consultant, Ilionx
15:45 End remarks
16:00 Community fun games, sharing knowledge, drinks, and bites 🍻
It's your unstructured data: How to get your GenAI app to production (and spe... (Zilliz)
So you've successfully built a GenAI app POC for your company -- now comes the hard part: bringing it to production. Aparavi addresses the challenges of AI projects while safeguarding data privacy and PII. Our Service for RAG helps AI developers and data scientists scale their apps from thousands to millions of users using corporate unstructured data. Aparavi's AI Data Loader cleans, prepares, and then loads only the relevant unstructured data for each AI project/app, enabling you to operationalize the creation of GenAI apps easily and accurately while giving you the time to focus on what you really want to do: building a great AI application with useful and relevant context. All within your environment, and without ever having to share private corporate data with anyone - not even Aparavi.
"Making .NET Application Even Faster", Sergey Teplyakov.pptxFwdays
In this talk we're going to explore the performance improvement lifecycle: setting performance goals, using profilers to find the bottlenecks, making a fix, and validating that the fix works by benchmarking it. The talk will be useful for novice and seasoned .NET developers and architects interested in making their applications fast and understanding how things work under the hood.
Demystifying Neural Networks And Building Cybersecurity Applications (Priyanka Aash)
In today's rapidly evolving technological landscape, Artificial Neural Networks (ANNs) have emerged as a cornerstone of artificial intelligence, revolutionizing various fields including cybersecurity. Inspired by the intricacies of the human brain, ANNs have a rich history and a complex structure that enables them to learn and make decisions. This blog aims to unravel the mysteries of neural networks, explore their mathematical foundations, and demonstrate their practical applications, particularly in building robust malware detection systems using Convolutional Neural Networks (CNNs).
Discovery Series - Zero to Hero - Task Mining Session 1 (DianaGray10)
This session is focused on providing you with an introduction to task mining. We will go over different types of task mining and provide you with a real-world demo on each type of task mining in detail.
How UiPath Discovery Suite supports identification of Agentic Process Automat... (DianaGray10)
📚 Understand the basics of the new persona-based, LLM-powered Agentic Process Automation and discover how existing UiPath Discovery Suite products like Communication Mining, Process Mining, and Task Mining can be leveraged to identify APA candidates.
Topics Covered:
💡 Idea Behind APA: Explore the innovative concept of Agentic Process Automation and its significance in modern workflows.
🔄 How APA is Different from RPA: Learn the key differences between Agentic Process Automation and Robotic Process Automation.
🚀 Discover the Advantages of APA: Uncover the unique benefits of implementing APA in your organization.
🔍 Identifying APA Candidates with UiPath Discovery Products: See how UiPath's Communication Mining, Process Mining, and Task Mining tools can help pinpoint potential APA candidates.
🔮 Discussion on Expected Future Impacts: Engage in a discussion on the potential future impacts of APA on various industries and business processes.
Enhance your knowledge on the forefront of automation technology and stay ahead with Agentic Process Automation. 🧠💼✨
Speakers:
Arun Kumar Asokan, Delivery Director (US) @ qBotica and UiPath MVP
Naveen Chatlapalli, Solution Architect @ Ashling Partners and UiPath MVP
The Challenge of Interpretability in Generative AI Models (Sara Kroft)
Navigating the intricacies of generative AI models reveals a pressing challenge: interpretability. Our blog delves into the complexities of understanding how these advanced models make decisions, shedding light on the mechanisms behind their outputs. Explore the latest research, practical implications, and ethical considerations, as we unravel the opaque processes that drive generative AI. Join us in this insightful journey to demystify the black box of artificial intelligence.
Dive into the complexities of generative AI with our blog on interpretability. Find out why making AI models understandable is key to trust and ethical use and discover current efforts to tackle this big challenge.
Denis Reznik: Data-Driven Future
1. Data-Driven Future
What to Learn and What to Expect?
Denis Reznik
Data Architect at Intapp Kyiv
Microsoft Data Platform MVP
2. About me
• Denis Reznik
• Kyiv, Ukraine
• Data Architect at Intapp, Inc.
• Microsoft Data Platform MVP
• Co-Founder of Ukrainian Data Community
3. Agenda
• Data is a new Oil (c)
• Data and Science
• Data in Big Companies
• Data and Application Development
• Data-Driven Future
4. Data is a New Oil
“Data is the new oil. It’s valuable, but if unrefined it
cannot really be used. It has to be changed into gas,
plastic, chemicals, etc to create a valuable entity that
drives profitable activity; so must data be broken
down, analyzed for it to have value.”
(c) Clive Humby, UK mathematician
5. Data and Science
• Thousands of years: Empirical
• Few hundred years: Theoretical
• Last fifty years: Computational ("query the world")
• Last twenty years: eScience (Data Science) ("download the world")
10. Parallel Processing
Temperature Sensor Dataset (n items)
Q: How many times was the temperature above the norm during the last week?
A: 5
Time: 2 sec
Algorithmic complexity: O(n)
11. Parallel Processing
Temperature Sensor Datasets (k items in each)
Q: How many times was the temperature above the norm during the last week?
Partial answers per dataset: A: 1, A: 0, A: 3, A: 4
Time: 0.5 sec
Algorithmic complexity: O(n/k)
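A minimal sketch of the idea on these two slides, assuming the readings live in one Python list: split the n readings into k chunks, count threshold exceedances per chunk in parallel, and sum the partial answers, cutting the scan from O(n) to roughly O(n/k) wall-clock time with k workers. The threshold and data are made up.

```python
# Minimal sketch of the slides' idea: split n sensor readings into k chunks,
# count threshold exceedances per chunk in parallel, then sum the partial
# counts. With k workers the O(n) scan takes about O(n/k) wall-clock time.
from concurrent.futures import ProcessPoolExecutor
import random

NORM = 25.0  # hypothetical temperature norm

def count_above(chunk):
    return sum(1 for t in chunk if t > NORM)

if __name__ == "__main__":
    readings = [random.uniform(15, 30) for _ in range(1_000_000)]
    k = 4
    chunks = [readings[i::k] for i in range(k)]  # k strided chunks

    with ProcessPoolExecutor(max_workers=k) as pool:
        partials = list(pool.map(count_above, chunks))

    print("partial answers:", partials, "total:", sum(partials))
```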
19. Data-Driven Future
• The amount of data is growing, and this is cool
• More and more decisions are based on data
• More and more applications are developed
• It is exciting to be a Software Engineer now!