#############################
Video Source: www.youtube.com/watch?v=GqAcTrqKcrY
In this video, you will be building a real-time data streaming pipeline, covering each phase from data ingestion to processing and finally storage. We'll utilize a powerful stack of tools and technologies, including Apache Airflow, Python, Apache Kafka, Apache Zookeeper, Apache Spark, and Cassandra, all neatly containerized using Docker.

More free courses: https://datamasterylab.com

What You'll Learn:
• Setting up a data pipeline with Apache Airflow
• Streaming data with Kafka and Kafka Connect
• Using Zookeeper for distributed synchronization
• Data processing with Apache Spark
• Data storage solutions with Cassandra and PostgreSQL
• Containerizing your data engineering environment with Docker

Timestamps:
• 0:00 Introduction
• 0:53 System architecture
• 3:47 Getting data from API with Airflow
• 17:10 Docker Compose for the architecture
• 26:09 Streaming data into Kafka
• 44:29 Apache Spark and Cassandra setup
• 49:33 Streaming data into Cassandra
• 1:27:05 Outro

Author:
• LinkedIn: / yusuf-ganiyu-b90140107
• Twitter: / yusufoganiyu
• Medium: / yusuf.ganiyu
• Like this video? Buy me a coffee: https://www.buymeacoffee.com/yusuf.ga...

Useful Links and Resources:
• Code: https://github.com/airscholar/e2e-dat...
• Medium Article: / realtime-data-engineering-project-with-air...
• Docker Compose Documentation: https://docs.docker.com/compose/
• Apache Kafka Official Site: https://kafka.apache.org/
• Apache Spark Official Site: https://spark.apache.org/
• Apache Airflow Official Site: https://airflow.apache.org/
• Cassandra: https://cassandra.apache.org/
• Confluent Docs: https://docs.confluent.io/home/overvi...

Tags: Data Engineering, Apache Airflow, Kafka, Apache Spark, Cassandra, PostgreSQL, Zookeeper, Docker, Docker Compose, ETL Pipeline, Data Pipeline, Big Data, Streaming Data, Real-time Analytics, Kafka Connect, Spark Master, Spark Worker, Schema Registry, Control Center, Data Streaming

Hashtags: #confluent #DataEngineering #ApacheAirflow #Kafka #ApacheSpark #Cassandra #PostgreSQL #Docker #ETLPipeline #DataPipeline #StreamingData #RealTimeAnalytics
#############################