spark-streaming

Here are 1,039 public repositories matching this topic...

risingwave

SQL stream processing, analytics, and management. We decouple storage and compute to offer efficient joins, instant failover, dynamic scaling, speedy bootstrapping, and concurrent query serving.

  • Updated Jul 26, 2024
  • Rust

This repo contains PySpark implementations of real-world use cases: batch data processing, streaming data processing sourced from Kafka, sockets, etc., Spark optimizations, business-specific big data processing scenarios, and machine learning use cases.

  • Updated Jul 24, 2024
  • Jupyter Notebook

The U.S. Department of Transportation's (DOT) Bureau of Transportation Statistics tracks the on-time performance of domestic flights operated by large air carriers. Summary information on the number of on-time, delayed, canceled, and diverted flights is published in DOT's monthly Air Travel Consumer Report and in this dataset of 2015 flight dela…

  • Updated Jul 17, 2024
  • Scala

Generate relevant synthetic data quickly for your projects. The Databricks Labs synthetic data generator (aka `dbldatagen`) may be used to generate large simulated / synthetic data sets for tests, POCs, and other uses in Databricks environments, including in Delta Live Tables pipelines.

  • Updated Jul 26, 2024
  • Python
