Learn how to orchestrate tasks like data ingestion, transformation, and AI calls, as well as how to monitor and get analytics on data products.
I build AI systems that turn unstructured data into business value: optimizing at the system level, creating feedback loops, and building data assets. CEO @Aisbach | Host of How AI Is Built
Bringing AI into data orchestration, or orchestrating data workflows for AI? Today, you can learn about both.

In today's episode of How AI Is Built, I learned from Hugo Lu how to build robust, cost-efficient, and scalable data pipelines that are easy to monitor. Hugo is the founder of Orchestra, a serverless data orchestration tool that aims to provide a unified control plane for managing data pipelines, infrastructure, and analytics across an organization's modern data stack.

If you only take away three things, here they are:

1. Find the right level of abstraction when building data orchestration tasks and workflows. "I think the right level of abstraction is always good. I think like Prefect do this really well, right? Their big sell was, just put a decorator on a function and it becomes a task. That is a great idea. You know, just make tasks modular and have them do all the boilerplate stuff like error logging, monitoring of data, all of that stuff."

2. Modularize data pipeline components. "It's just around understanding what that dev workflow should look like. I think it should be a bit more modular." A modular architecture in which components like data ingestion, transformation, and model training are decoupled allows better flexibility and scalability.

3. Adopt a streaming/event-driven architecture for low-latency AI use cases. "If you've got an event-driven architecture, then, you know, that's not what you use an orchestration tool for... if you're having a conversation with a chatbot, like, you know, you're sending messages, you're sending events, you're getting a response back. That I would argue should be dealt with by microservices."

Listen now: https://lnkd.in/dPgZQ6Am

Question to you: How are AI workloads changing the way you approach data orchestration? Are you using specialized tools or adapting existing ones?

Stay tuned for next week, when I discuss how to build data pipelines specifically for generative AI with Derek Tu from Carbon.
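To make the first takeaway concrete: here is a minimal, plain-Python sketch of the decorator pattern Hugo credits Prefect with popularizing. It is not Prefect's actual API; it just illustrates the idea of a `@task` decorator that wraps a pure function with boilerplate (logging and error handling) so the function body stays focused on pipeline logic.

```python
import functools
import logging

logging.basicConfig(level=logging.INFO)

def task(fn):
    """Illustrative decorator: turn a plain function into a 'task' that
    gets start/finish logging and error reporting for free."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        logging.info("task %s started", fn.__name__)
        try:
            result = fn(*args, **kwargs)
        except Exception:
            logging.exception("task %s failed", fn.__name__)
            raise
        logging.info("task %s finished", fn.__name__)
        return result
    return wrapper

@task
def ingest():
    # Stand-in for a real data source.
    return [1, 2, 3]

@task
def transform(rows):
    return [r * 10 for r in rows]

# A tiny "pipeline": modular tasks composed in sequence.
result = transform(ingest())
print(result)  # -> [10, 20, 30]
```

In Prefect itself the equivalent is decorating functions with `@task` and composing them inside a `@flow`; the orchestrator then supplies retries, monitoring, and observability, which is exactly the "boilerplate stuff" the quote refers to.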
#genai #llms #data #dataengineering #dataorchestration