Featured Projects

A showcase of data engineering pipelines, cloud architecture, and analytical platforms.

Data Engineering Chemical Engineering

Monitoring Refinery Operations Using Real-Time Intelligence in Microsoft Fabric

I built a data solution that monitors the performance of a Crude Distillation Unit (CDU) using Real-Time Intelligence in Microsoft Fabric. The project simulates sensor data from the fractional distillation of crude oil and streams it to two destinations at once: a Fabric Lakehouse that feeds a historical pipeline, and a KQL database in a Fabric Eventhouse for real-time monitoring.
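
The simulate-and-fan-out idea can be sketched in plain Python. This is only an illustration, not the project's code: the event schema, the unit name `CDU-01`, and the temperature and pressure ranges are all made-up assumptions, and two in-memory lists stand in for the Lakehouse and Eventhouse destinations.

```python
import json
import random
from datetime import datetime, timedelta, timezone


def simulate_cdu_readings(n, seed=0, start=None):
    """Yield n simulated sensor events, one per second.

    The field names and value ranges are hypothetical placeholders,
    not real process data from the project.
    """
    rng = random.Random(seed)  # seeded so runs are reproducible
    ts = start or datetime(2024, 1, 1, tzinfo=timezone.utc)
    for i in range(n):
        yield {
            "timestamp": (ts + timedelta(seconds=i)).isoformat(),
            "unit": "CDU-01",
            "column_top_temp_c": round(rng.gauss(120.0, 2.0), 2),
            "furnace_outlet_temp_c": round(rng.gauss(360.0, 5.0), 2),
            "column_pressure_kpa": round(rng.gauss(150.0, 3.0), 2),
        }


def fan_out(event, sinks):
    """Serialize one event and append the same payload to every sink."""
    payload = json.dumps(event)
    for sink in sinks:
        sink.append(payload)


# Stand-ins for the historical store and the real-time store.
lakehouse, eventhouse = [], []
for event in simulate_cdu_readings(5):
    fan_out(event, [lakehouse, eventhouse])
```

Both lists end up with identical payloads, which mirrors the description of one stream feeding a historical path and a real-time path at once.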

Microsoft Fabric PySpark Python Kusto Query Language (KQL) Power BI
See project details
Data Engineering Economics/Finance

Nigerian Macroeconomic Data Pipeline and Analytics

This project is an end-to-end data engineering pipeline that automates the collection, transformation, and storage of Nigerian economic data (exchange rates and inflation) covering the last 20 years. Following the medallion architecture, the pipeline ingests data from the Central Bank of Nigeria (CBN) API, transforms it, and loads it into a PostgreSQL database for further analytics.
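
As a rough illustration of the medallion layering (raw "bronze" records, cleaned "silver" records, aggregated "gold" output), here is a minimal sketch in plain Python. It is not the project's Spark code: the record shape (`date`, `rate`), the cleaning rules, and the monthly average are assumptions chosen only to show the pattern.

```python
from collections import defaultdict
from datetime import date


def to_silver(bronze_rows):
    """Clean raw rows: parse types, drop bad rows, keep one row per day."""
    seen = set()
    silver = []
    for row in bronze_rows:
        try:
            day = date.fromisoformat(row["date"])
            # Raw rates may arrive as strings with thousands separators.
            rate = float(str(row["rate"]).replace(",", ""))
        except (KeyError, TypeError, ValueError):
            continue  # skip rows that cannot be parsed
        if day in seen:
            continue  # skip duplicate days
        seen.add(day)
        silver.append({"date": day, "rate": rate})
    return sorted(silver, key=lambda r: r["date"])


def to_gold(silver_rows):
    """Aggregate cleaned rows into a monthly average rate."""
    buckets = defaultdict(list)
    for row in silver_rows:
        buckets[row["date"].strftime("%Y-%m")].append(row["rate"])
    return {month: sum(v) / len(v) for month, v in sorted(buckets.items())}
```

Keeping each layer as its own function means the raw data is never modified, so a cleaning rule can be changed and the silver and gold layers rebuilt from bronze.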

Airflow Apache Spark Python PostgreSQL Docker Hadoop (HDFS) Metabase
View on GitHub View a snapshot of the Insights Dashboard
Data Engineering

FAANG Stock Data Pipeline

Developed a fully containerized data platform that tracks FAANG stock performance. Built for scalability and reproducibility, the project uses Terraform to provision GCP resources and Docker for containerization. Airflow orchestrates the pipeline, dbt transforms and models the data in BigQuery, and Looker Studio dashboards show market volatility and historical growth.

Airflow BigQuery dbt Terraform
View on GitHub
MIT Professional Education

MIT Data Science & ML Capstone

A portfolio of machine learning models and statistical projects completed during the MIT Professional Education program.

Predictive Modeling Feature Engineering Regression Collaborative Filtering Matrix Factorization Recommendation Systems Descriptive Statistics Data Visualization EDA
Access MIT Portfolio
ETL / Docker

NYC Taxi Data Pipeline

Automated batch ingestion of public NYC taxi datasets into PostgreSQL, written in Python and built so that database loads stay scalable as the data grows.
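
Batch ingestion of this kind usually means reading rows in fixed-size chunks and committing each chunk separately. The sketch below shows that pattern under clear assumptions: it uses the standard-library `sqlite3` module as a stand-in for PostgreSQL so that it runs anywhere, and the `trips` table and its three columns are hypothetical, not the project's schema.

```python
import sqlite3
from itertools import islice


def batched(rows, size):
    """Yield lists of at most `size` rows from any iterable."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def ingest(conn, rows, batch_size=1000):
    """Insert rows in batches, committing once per batch.

    Returns the number of rows inserted. The table layout is a
    hypothetical placeholder.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS trips "
        "(pickup TEXT, dropoff TEXT, distance REAL)"
    )
    total = 0
    for chunk in batched(rows, batch_size):
        with conn:  # commits on success, rolls back on error
            conn.executemany("INSERT INTO trips VALUES (?, ?, ?)", chunk)
        total += len(chunk)
    return total
```

Committing per batch keeps memory use bounded and means a failure partway through loses at most one batch rather than the whole load.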

Python PostgreSQL Docker
View on GitHub

Price Analytics Automation

An automated dynamic pricing system for e-commerce clients that adjusted prices based on real-time competitor data.
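
The page does not describe the pricing rule itself, so the following is purely a hypothetical example of one common approach: undercut the cheapest competitor by a small percentage, but never go below a floor or above a ceiling. The function name, parameters, and the 1% default are all assumptions made for illustration.

```python
def reprice(current, competitor_prices, floor, ceiling, undercut=0.01):
    """Return a new price based on the cheapest competitor.

    Hypothetical rule: price slightly below the lowest competitor,
    clamped to the range [floor, ceiling]. If no competitor prices
    are available, keep the current price.
    """
    if not competitor_prices:
        return current
    target = min(competitor_prices) * (1 - undercut)
    return round(min(max(target, floor), ceiling), 2)
```

The floor protects margin when a competitor prices very low, and the ceiling stops the price from drifting upward when every competitor is expensive.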

Enterprise Architecture

Azure Sales Analytics Platform

An end-to-end platform that moves data from on-premises SQL databases to Azure Synapse via Azure Data Factory (ADF), Databricks, and Data Lake Storage Gen2. The platform runs on automated triggers, and the results are visualized in Power BI.

Azure ADF Databricks Synapse Power BI