airscholar/README.md

Hi, I'm Yusuf Ganiyu (@airscholar)

About Me

Data Engineer @AstraZeneca | Founder @DataMasteryLab | AI & Big Data Architect | YouTuber @CodeWithYu | Teaching 50K+ students worldwide

I build AI-powered, production-grade data systems and architect big data solutions for the future:

  • AI & Machine Learning (MLOps, LLMs, Generative AI, Vector Databases)
  • Big Data Engineering (Spark, Kafka, Airflow, Flink, dbt)
  • Cloud & Distributed Systems
  • Real-time Streaming & Intelligent Analytics
  • AI-Native Data Platforms & Data Mesh

Focus Areas

Building the future of data with:

  • Generative AI integration into data pipelines
  • Real-time ML systems and intelligent data workflows
  • Scalable big data architectures for AI workloads
  • LLM fine-tuning and RAG (Retrieval-Augmented Generation)
  • Next-gen data platforms and AI-powered analytics

Where to Find Me

  • Data Mastery Lab - My AI & Data educational platform
  • YouTube - Code With Yu - End-to-end data engineering tutorials
  • Medium - 3K+ followers | Writing about AI, Big Data & Future Tech
  • Substack - Writing about Big Data, ML, AI, AI Agents & Future Tech
  • LinkedIn - Let's connect!
  • Udemy - Teaching AI-powered data engineering & emerging technologies

Background

MSc in Computational Intelligence and Data Analytics | Cranfield University

Mission

Empowering the next generation of data & AI professionals to build intelligent, scalable, future-ready solutions.

Let's collaborate on AI and big data projects!

Latest YouTube Videos

Latest Medium Stories

Pinned

  1. e2e-data-engineering Public

    An end-to-end data engineering pipeline that orchestrates data ingestion, processing, and storage using Apache Airflow, Python, Apache Kafka, Apache Zookeeper, Apache Spark, and Cassandra. All comp…

    Python | 318 stars | 143 forks

  2. RedditDataEngineering Public

    This project provides a comprehensive data pipeline solution to extract, transform, and load (ETL) Reddit data into a Redshift data warehouse. The pipeline leverages a combination of tools and serv…

    Python | 207 stars | 92 forks

  3. RealtimeStreamingEngineering Public

    This project serves as a comprehensive guide to building an end-to-end data engineering pipeline using TCP/IP Socket, Apache Spark, OpenAI LLM, Kafka and Elasticsearch. It covers each stage from da…

    Python | 43 stars | 31 forks

  4. FlinkCommerce Public

    This repository contains an Apache Flink application for real-time sales analytics built using Docker Compose to orchestrate the necessary infrastructure components, including Apache Flink, Elastic…

    Java | 48 stars | 30 forks

  5. SparkingFlow Public

    This project demonstrates how to use Apache Airflow to submit jobs to an Apache Spark cluster in different programming languages, using Python, Scala, and Java as examples.

    Java | 48 stars | 29 forks

  6. realtime-voting-data-engineering Public

    This repository contains the code for a real-time election voting system. The system is built using Python, Kafka, Spark Streaming, Postgres and Streamlit. The system is built using Docker Compose t…

    Python | 45 stars | 33 forks