Redpanda Data’s Post

Redpanda Data reposted this

Pau Labarta Bajo

I build real-world ML products. And then help you do the same 🚀

How do you build real-time ML systems, at scale, without burning cash? 🧠 ↓↓↓

The problem 🤔

Let’s say you work as an ML engineer at a fintech startup whose flagship product is a mobile app for online payments. The company is still small, but a critical problem you need to tackle from day 0 is the automatic detection of fraudulent transactions.

The system needs to be
> Fast. Otherwise you will detect fraud too late, and the predictions, no matter how precise they are, will be useless.
> Scalable, based on the volume of transactions your users generate per second. You aim to become a serious contender to VISA or MasterCard, so you want a design that can scale to over 65,000 transactions per second.
> Cost-efficient. You don’t want high upfront costs 💸 and complex infrastructure to manage (well, who does? 😛)

So the question is
> How can you build a scalable and cost-effective real-time system to detect fraud❓

The solution 🧠

To make your system capable of
→ ingesting transactions, and
→ producing fraud notifications to end-users
at scale and fast, you can use a streaming data platform (aka message bus), like Apache Kafka or, even better, Redpanda Data.

What about scalability? 🎛️

The scalability of the system depends on 2 things:
> The number of brokers in your message bus and the resources you allocate to them (CPU, memory…).
> The number of Docker instances of your Fraud Detector Service that you run (horizontal scaling).

What about costs? 💸

To run this system you need at least 2 things:
> A compute platform, where your Fraud Detector Service runs. For example, Amazon EKS, Google Cloud GKE or Quix Cloud, and
> A streaming platform, like Apache Kafka or Redpanda Data.

Apache Kafka or Redpanda❓

Apache Kafka is built in Java and requires a complex mesh of services to run, while Redpanda Data is designed from the ground up in C++ and shipped as a single binary, which means it is easier and cheaper to run. 🤑
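To make the ingestion, detection and horizontal-scaling ideas above a bit more concrete, here is a minimal sketch in plain Python. It is a toy simulation, not a real client: in-memory lists stand in for the partitions of a transactions topic, `zlib.crc32` stands in for whatever partitioner your client library uses, the round-robin assignment is a simplification of how a consumer group spreads partitions, and `is_fraud` is a hypothetical threshold rule where your ML model would go. All names (`NUM_PARTITIONS`, `partition_for`, `assign_partitions`, `run_instance`, `simulate`) are made up for illustration.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 6  # hypothetical partition count for the transactions topic


def partition_for(key, num_partitions=NUM_PARTITIONS):
    """Map a message key to a partition with a stable hash.

    Assumption: crc32 is only a stand-in for a real client's partitioner.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def assign_partitions(num_partitions, num_instances):
    """Spread partitions round-robin across service instances.

    Simplified stand-in for consumer-group partition assignment.
    """
    assignment = defaultdict(list)
    for partition in range(num_partitions):
        assignment[partition % num_instances].append(partition)
    return dict(assignment)


def is_fraud(tx):
    """Placeholder for the ML model: a toy amount threshold."""
    return tx["amount"] > 10_000


def run_instance(owned_partitions, topic):
    """One Fraud Detector Service instance: read its partitions, emit alerts."""
    alerts = []
    for partition in owned_partitions:
        for tx in topic.get(partition, []):
            if is_fraud(tx):
                alerts.append({"user_id": tx["user_id"], "tx_id": tx["tx_id"]})
    return alerts


def simulate(transactions, num_instances):
    """Route transactions to partitions, then let each instance process its share."""
    topic = defaultdict(list)
    for tx in transactions:
        # Keying by user_id keeps each user's transactions in one partition.
        topic[partition_for(tx["user_id"])].append(tx)

    assignment = assign_partitions(NUM_PARTITIONS, num_instances)
    alerts = []
    for instance_id in sorted(assignment):
        alerts.extend(run_instance(assignment[instance_id], topic))
    return alerts
```

Calling `simulate(transactions, 1)` and `simulate(transactions, 3)` flags the same transactions; only the way the work is split across instances changes. Note that in this sketch, instances beyond the number of partitions receive no work, so the partition count caps how far you can scale out.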
My recommendation 💡

If you want to start building real-time ML systems without investing a lot in upfront infrastructure costs, I recommend you try Redpanda Serverless.
→ Within a few seconds you get your production cluster up and running 🚀
→ You don’t need to manage any infrastructure 🎉, and
→ You only pay for what you use, so you start from 0 USD and scale based on your growth 🤑

Click here to try it for FREE ↓↓↓ https://lnkd.in/eSd-P4Es

----

Hi there! It's Pau Labarta Bajo 👋

Every day I share free, hands-on content on production-grade ML to help you build real-world ML products.

Follow me and click on the 🔔 so you don't miss what's coming next

#machinelearning #realtimeml #streaming

Pau Labarta Bajo

I build real-world ML products. And then help you do the same 🚀

4d

Join 16k+ Machine Learning developers in The Real World ML Newsletter. Every Saturday morning. For FREE ↓↓↓ https://www.realworldml.net/subscribe

Axel Mendoza

Senior MLOps Engineer | Founder @ ConsciousML ⇒ Helping you build ML-products with ease

4d

Hi Pau Labarta Bajo! I've seen it mentioned that Kafka and Flink could be substituted by Bytewax and Quix Streams. Any idea how they compare to Redpanda? Redpanda has way more stars on GitHub, though 😁

Shobha Mourya

Data Scientist proficient in statistical & exploratory data analysis and Machine Learning using Python | MLOps | GenAI | R | SQL | Tableau

4d

Sounds like an event-based messaging system using a publish-subscribe model based on topics.

Abhi Khan

I love data and playing with it using ML Algo

4d

That's great. Also, please teach us how to implement an LLM for a company's routine tasks or predictions.

Edward Akorlie

Experienced Backend Software and Data Engineer | Python, Ruby, Elixir | Optimizing Server-Side Performance | Scalable System Design

4d

Redpanda or Kafka

Pedro Alcantara Costa

Control and Automation Engineer | Industry 4.0 | Business Intelligence (BI) | Data Science and Analytics | Internet of Things (IoT) | Machine Learning | User Training | Systems Governance

4d

Very interesting! Thank you for sharing.
