This document introduces the Reactive Data System (RDS) framework called RFX for solving fast data problems reactively. It discusses how RFX was developed to handle common issues like counting pageviews, unique users, and real-time marketing. RFX is an open source, full stack framework that uses various tools like Kafka, Spark, and Redis to process high volumes of event data in real-time for applications like analytics, advertising, and monitoring. The document provides an example architecture and topology for collecting tracking data, processing it through RFX components, and generating reports.
1. Introduction to RFX for Backend Developer
Reactive Function (X)
The open source framework for solving fast data problems
and reacting to the world with deep learning
By Triều Nguyễn, the creator of RFX
http://mc2ads.com (Reactive Big Data Lab)
λ(x)
2. About myself
2008: Java developer; developed a social trading network for a startup (Yopco)
2011: joined FPT Online as a software engineer; worked on the banbe.net social network and the VnExpress Mobile RESTful API
2012: backend engineer at Greengar Studios
12/2012 to now: back at FPT Online as lead engineer; developed the new version of the Data Analytics Platform (ad network eclick.vn and VnExpress News)
3. Contents of this talk
1. What is Rfx?
2. Inception and ideas
3. How Rfx was born
4. Why Rfx?
a. from the big-picture view (business)
b. from the business view
c. from specific problems
5. Concepts and architecture: the BIG picture
6. Coding and tutorials from practical problems
7. Resources for self-study
4. What is RFX, or Reactive Function X?
● A framework for reactive, real-time processing of big and fast data
● A collection of open source tools
● The mission of RFX
→ “BUILD a digital data-driven brain for every company in the world”
5. Inception and ideas
The ideas date from when I was a student, during an internship at DRD, a non-profit organization
More info at http://activefunctor.blogspot.com
11. Why Rfx ?
● Ideas since 2007 (from Haskell and Actor model theory)
● In R&D and deployed in production since 2013
● Open source: Apache License, Version 2.0
● Full stack: front-end and back-end
● Applies Agile to analytics and data science
● Applies the Reactive Lambda Architecture
● Fast, near-real-time processing
● Tested with 1,000,000 logs per second
● Simple development model for big data developers
13. Business domains where Rfx can be used
● real-time data analytics for digital marketing, advertising
● hospital systems
● personal banking system
● financial institutions (fraud detection)
● manufacturing plant
● airline systems
● online trading system
● emergency control system
● manufacturing plant management system
● road tolling system detects
● social networking site
15. Problem: How to monitor mobile web performance and react to slow response times
http://sixrevisions.com/mobile/pay-attention-to-mobile-web-performance
16. Problem: Monitoring sensor data and real-time security checking
In a luggage management system, events are produced by the check-in process and by the various radio-frequency identification (RFID) readers, which emit events about the movement of luggage through the system. The events generated by the event processing system are consumed by the luggage control system itself, by airport staff, or even by the passengers themselves.
17. Reactive Lambda Architecture
[Diagram: Actor (User, Mobile, Browser, ...), System (Rfx Topology), Database (NoSQL). Flow labels: "data + context + metadata" and "useful (data + relationship)".]
1. Actor → System
2. System → Database
3. Database → System
4. System → Actor
18. Concepts
● Each user who uses the services and creates data is an actor in the system
● An actor is the source of all events (aka logs): clicking, reading news, sending messages to friends, playing games, ...
● A functor (aka neuron) is a computing object used for storing and processing data and for emitting results to subscribed functors
● A topology is a directed graph that defines how functors are connected by data streams and how they process data
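The functor and topology concepts can be illustrated with a minimal sketch. This is not the RFX API: the `Functor` class, its `subscribe` and `receive` methods, and the "user url" log format are hypothetical names chosen only for this example. It wires two functors into a tiny topology, where a parser emits structured events to a subscribed counter.

```python
class Functor:
    """Hypothetical computing node: holds state, processes an event,
    and emits its result to every subscribed functor."""

    def __init__(self, name, handler):
        self.name = name
        self.handler = handler   # function(state, event) -> result or None
        self.state = {}          # local storage owned by this functor
        self.subscribers = []    # outgoing edges of the topology graph

    def subscribe(self, other):
        """Add a directed edge from this functor to `other`."""
        self.subscribers.append(other)
        return other

    def receive(self, event):
        result = self.handler(self.state, event)
        if result is not None:
            for functor in self.subscribers:
                functor.receive(result)


def parse(state, line):
    # Assumed log format for this sketch: "<user> <url>"
    user, url = line.split()
    return {"user": user, "url": url}


def count(state, event):
    # Store a running pageview count per URL; emit nothing (sink node).
    state[event["url"]] = state.get(event["url"], 0) + 1
    return None


# Topology: parser -> counter
parser = Functor("parser", parse)
counter = Functor("counter", count)
parser.subscribe(counter)

for line in ["u1 /home", "u2 /home", "u1 /news"]:
    parser.receive(line)

print(counter.state)
```

In this sketch the topology is just the set of `subscribers` lists; a production system would place a message queue between functors rather than direct method calls.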
21. There are 3 demos, from a simple to an advanced user story
User story 1: Counting Real-time URL Pageview
User story 2: Monitoring Social Media Statistics
User story 3: Social Ranking for Recommendation Engine
22. User Story
Domain problem: Reactive Real-time Marketing
User story details:
1. The user reads news on a website
→ track user activities (pageview, time on site)
2. The user logs in with a Facebook account
3. The user clicks the Facebook Like button
→ track what the user liked and commented on
4. The marketer/data analyst should see the trending, most-read articles in real time
● → Personalized articles for the reader
● → Native advertising in real time
24. Demo user story 1: Counting Real-time URL Pageview
Input:
1. The pageview logs from HTTP
Output:
1. The total number of pageviews
2. The total number of pageviews per hour
3. The total number of pageviews per minute
4. The total number of pageviews per second
5. The total number of pageviews per URL
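A minimal in-memory sketch of the five outputs above, assuming each pageview log has already been parsed into a URL and a Unix timestamp. The `PageviewCounter` class and its field names are hypothetical. The slides list Redis among the tools, so a real deployment would more likely keep these counters in an external store than in process memory.

```python
from collections import Counter
from datetime import datetime, timezone


class PageviewCounter:
    """Hypothetical counter keeping one tally per output listed above."""

    def __init__(self):
        self.total = 0
        self.per_hour = Counter()
        self.per_minute = Counter()
        self.per_second = Counter()
        self.per_url = Counter()

    def record(self, url, epoch_seconds):
        # Bucket the event by truncating its UTC timestamp to each resolution.
        ts = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
        self.total += 1
        self.per_hour[ts.strftime("%Y-%m-%d %H")] += 1
        self.per_minute[ts.strftime("%Y-%m-%d %H:%M")] += 1
        self.per_second[ts.strftime("%Y-%m-%d %H:%M:%S")] += 1
        self.per_url[url] += 1


pageviews = PageviewCounter()
# Sample events (url, epoch seconds); the values are made up for illustration.
for url, t in [("/home", 0), ("/home", 1), ("/news", 61), ("/home", 3600)]:
    pageviews.record(url, t)

print(pageviews.total, dict(pageviews.per_url))
```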
26. Demo user story 2: Monitoring Social Media Statistics
Input:
1. The pageview logs from HTTP
Output:
The social media statistics from:
1. Facebook: Like, Share, Comment
2. Twitter: Tweet Count
3. LinkedIn: Share Count
4. Geolocation heat-map report
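The geolocation heat-map item can be sketched as simple grid binning, assuming each pageview log has already been resolved to a latitude/longitude pair (for example by an IP lookup, which is not shown). The function names and the one-degree cell size are assumptions for illustration. The social network counters are left out here because they depend on each network's own API.

```python
import math
from collections import Counter


def heatmap_cell(lat, lon, cell_degrees=1.0):
    """Map a coordinate to the grid cell that contains it."""
    return (math.floor(lat / cell_degrees), math.floor(lon / cell_degrees))


def build_heatmap(points, cell_degrees=1.0):
    """Count how many points fall into each grid cell."""
    return Counter(heatmap_cell(lat, lon, cell_degrees) for lat, lon in points)


# Sample visitor locations (made up): two near Ho Chi Minh City, one near Hanoi.
points = [(10.8, 106.6), (10.7, 106.7), (21.0, 105.8)]
heat = build_heatmap(points)
print(dict(heat))
```

Each cell count can then be rendered as a color intensity on a map; a smaller `cell_degrees` gives a finer grid.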
28. Demo user story 3: Social Ranking for Recommendation Engine
Input:
● Data: the URL of the article
● Context: where (user's location), when (time of visit), from where (referrer URL)
● Metadata: keywords, category of the article
Output:
Real-time statistics about pageviews, social media statistics (Share, Like, Comment), and recommended articles
The list of articles is ranked by:
● most liked and same category
● most viewed and same category
● most liked, same category and near user's location
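The three ranking rules above can be sketched as follows. This is an illustrative sketch, not the RFX implementation: the `Article` fields, the function names, and the 50 km radius used for "near" are all assumptions, and distance is computed with the haversine formula.

```python
import math
from dataclasses import dataclass


@dataclass
class Article:
    url: str
    category: str
    likes: int
    views: int
    lat: float   # location associated with the article (assumed field)
    lon: float


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    radius = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * radius * math.asin(math.sqrt(a))


def top_in_category(articles, category, metric, limit=10):
    """Rank articles of one category by 'likes' or 'views', highest first."""
    same = [a for a in articles if a.category == category]
    return sorted(same, key=lambda a: getattr(a, metric), reverse=True)[:limit]


def top_liked_nearby(articles, category, user_lat, user_lon, radius_km=50.0, limit=10):
    """Most liked articles in the category within radius_km of the user."""
    near = [a for a in articles
            if distance_km(user_lat, user_lon, a.lat, a.lon) <= radius_km]
    return top_in_category(near, category, "likes", limit)


# Sample data, made up for illustration.
articles = [
    Article("/a", "tech", likes=5, views=100, lat=10.8, lon=106.6),
    Article("/b", "tech", likes=9, views=50, lat=21.0, lon=105.8),
    Article("/c", "sports", likes=20, views=500, lat=10.8, lon=106.6),
    Article("/d", "tech", likes=7, views=80, lat=10.9, lon=106.7),
]

print([a.url for a in top_in_category(articles, "tech", "likes")])
print([a.url for a in top_in_category(articles, "tech", "views")])
print([a.url for a in top_liked_nearby(articles, "tech", 10.8, 106.6)])
```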
30. Reference Resources
Main website for Rfx: http://www.mc2ads.com
Ideas:
● http://journal.frontiersin.org/Journal/10.3389/fninf.2010.00112/full
● http://singularityhub.com/2014/04/20/new-imaging-method-shows-young-neurons-making-connections-exchanging-information
● http://en.wikipedia.org/wiki/Actor_model_theory
● http://java.dzone.com/articles/introduction-event-processing
● http://www.technologyreview.com/featuredstory/526501/brain-mapping
● http://www.technologyreview.com/featuredstory/513696/deep-learning
Apache Storm: http://storm.incubator.apache.org
Apache Kafka: http://kafka.apache.org
In-memory NoSQL: http://redis.io
Deep Learning for Java: http://deeplearning4j.org
Distributed processing with Actor Model: http://akka.io
Papers:
● Real-Time Visualization of Streaming Text Data
● Hypernetworks for the Science of Complex Systems
Main Blogs: http://www.mc2ads.org
31. The end and thank you
https://github.com/mc2ads/rfx
http://www.mc2ads.org
λ(x)