![Stream Processing with Apache Kafka, Samza, and Flink cover photo](https://cdn.statically.io/img/secure.meetupstatic.com/photos/event/2/c/2/d/clean_479831309.webp)
What we’re about
Stream processing and real-time event processing are everywhere. This group's goal is to showcase cutting-edge developments in stream processing across the industry. The meetup focuses on Apache Kafka, Apache Samza, Apache Flink, Change Data Capture, Lambda/Kappa Architecture, and related topics. Hosted by LinkedIn.
Past meetup talks are available at https://www.youtube.com/playlist?list=PLZDyxA22zzGx34wdHESUux2_V1qfkQ8zx
Upcoming events (1)
[In-Person + Online] Stream Processing with Apache Kafka, Samza, and Flink
- Venue: 700 E Middlefield Rd, Mountain View, Building 4, 1st Floor, Together
- Zoom: https://linkedin.zoom.us/j/99117864226
5:30 - 6:00: Networking [in-person only + catered food]
6:00 - 6:05: Welcome
6:05 - 6:40: Kafka @ GitHub: Challenges and future direction
Eric Sun, GitHub
Eric discusses how Apache Kafka is used at GitHub to power features through event streaming and background job processing platforms, our specific design trade-offs and challenges in scaling these services, and what the future holds for our platform team.
- Eric Sun brings 10 years of experience on platform teams to the Data Pipelines team at GitHub. Our group oversees the messaging platforms that connect GitHub’s applications and data stores for millions of developers worldwide. He enjoys working on performance problems and specializes in incident analysis and remediation.
6:40 - 7:15: Revenue alerting at scale
Arsh Khandelwal & Kshitij Grover, Orb
In this talk, engineers from Orb will discuss the stream processing and query challenges of implementing usage and spend alerting product features at the scale of millions of events a second. Orb powers billing for some of the world's fastest-growing infrastructure companies, including Vercel, Replit, and Pinecone. In Orb, you can set up metrics (effectively queries) that determine the basis of consumption charges for your customers, and set up alerts that trigger as a result of customer spend. This translates to an invalidation architecture problem, with serious challenges in maintaining correctness while staying performant, especially when allowing extremely flexible metric definitions.
- Arsh Khandelwal is an infrastructure engineer at Orb, responsible for a broad range of infrastructure efforts, including scaling our real-time cost alerting infrastructure, improving the reliability of our datastores, and tackling the ingestion system at Orb that handles billions of events a day. Before Orb, Arsh worked at Nuro on the platform team, responsible for storing logs across the system. Having had a glimpse of how hard it was to build a system to track costs, Arsh wanted to bring business-aware data infrastructure to larger enterprises.
- Kshitij Grover is co-founder and CTO of Orb, the billing platform for the fastest-growing infrastructure companies. Prior to founding Orb, Kshitij was an engineering leader at Asana. He received degrees in Computer Science and Philosophy from Caltech.
7:15 - 7:50: Streaming Data Enrichment and Analysis with Redpanda, CDC, and Python
Bryan Wood, Redpanda
Learn how to utilize a streaming data platform to facilitate real-time data enrichment from monolithic databases into modern, streaming architectures. We’ll leverage Debezium for Change Data Capture (CDC) and explore the development of a Python-based service that subscribes to this data for real-time processing and applies machine learning to perform sentiment analysis.
To show how data streaming and processing pipelines work in practice, the service then publishes the processed sentiment data to another topic, highlighting how simple this has become with recent advances in streaming and ML.
Attendees will gain insights into the use of Debezium with Redpanda, developing Python services for ML applications on streaming data, and strategies for data aggregation and visualization. This talk is ideal for product leaders, managers, architects, and technical leads interested in adopting streaming technologies to enhance data-driven decision-making in their organizations.
- Bryan Wood is a creative technology leader passionate about driving positive change and leading teams on the journey of continuous improvement. Bryan is inspired to deliver better products, more quickly, through low-friction interactions, automation, and seamless integration with product organizations and stakeholders.
Ideals & Values
-- Assume good intent
-- Innovation, Efficiency, and Simplicity lead the market
-- Use the right tool for the right job
-- Be passionate about your work
-- Have as much fun as you can