Data Engineering Portfolio

Data Engineer building reliable pipelines and analytics-ready datasets

Designing scalable data platforms with Python, SQL, Airflow, dbt, Azure, AWS, Databricks, and modern cloud warehouses.

Featured Project

AZURE DATA ENGINEERING

NYC 311 Service Requests Lakehouse

Azure-first medallion lakehouse for NYC 311 operational analytics, transforming raw API data into analytics-ready bronze, silver, and gold datasets.

  • Azure Data Factory -> ADLS Gen2 -> Databricks pipeline with proven raw landing and medallion processing
  • Reusable data quality checks, dimensional models, and reporting marts
  • Architecture notes, runbooks, SQL assets, notebook exports, and cloud execution proof
Azure Data Factory, ADLS Gen2, Databricks, PySpark, Delta Lake, Python, SQL, Power BI, GitHub Actions
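
The bronze, silver, and gold flow described above can be illustrated with a minimal pure-Python sketch. The project itself runs on Databricks with PySpark and Delta Lake; the sample records, field names (`unique_key`, `complaint_type`, `created_date`), and function names below are assumptions for illustration, not code from the project.

```python
from collections import Counter
from datetime import datetime

# Hypothetical raw (bronze) records, shaped like an API response.
BRONZE = [
    {"unique_key": "1", "complaint_type": " Noise ", "created_date": "2024-01-05T10:00:00"},
    {"unique_key": "1", "complaint_type": " Noise ", "created_date": "2024-01-05T10:00:00"},  # duplicate
    {"unique_key": "2", "complaint_type": "Heating", "created_date": "2024-01-05T12:30:00"},
    {"unique_key": "3", "complaint_type": "", "created_date": "not-a-date"},  # fails checks
]


def to_silver(bronze_rows):
    """Deduplicate on unique_key, trim text, and drop rows that fail basic checks."""
    seen, silver = set(), []
    for row in bronze_rows:
        key = row.get("unique_key")
        complaint = (row.get("complaint_type") or "").strip()
        try:
            created = datetime.fromisoformat(row.get("created_date") or "")
        except ValueError:
            continue  # unparseable timestamp: reject the row
        if not key or not complaint or key in seen:
            continue  # missing key, empty complaint type, or duplicate
        seen.add(key)
        silver.append(
            {"unique_key": key, "complaint_type": complaint, "created_date": created}
        )
    return silver


def to_gold(silver_rows):
    """Aggregate silver rows into daily request counts per complaint type."""
    counts = Counter(
        (r["created_date"].date().isoformat(), r["complaint_type"]) for r in silver_rows
    )
    return [
        {"date": d, "complaint_type": c, "requests": n}
        for (d, c), n in sorted(counts.items())
    ]
```

In this sketch the silver step plays the role of the data quality checks, and the gold step stands in for a reporting mart.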

Data Engineering

REAL AWS CLOUD PROOF

Cloud Flight Fare Pipeline

End-to-end pipeline deployed and run on real AWS infrastructure: EventBridge Scheduler -> ECS/Fargate -> Flight API -> S3 Bronze -> Redshift Serverless -> dbt staging/marts/tests -> CloudWatch Logs. Includes proof screenshots, runbooks, and notes on cost control and secret handling.

AWS, ECS/Fargate, EventBridge, S3, Redshift, dbt, Docker, CloudWatch
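
One common way to organize an S3 bronze layer is a date-partitioned key per raw API response. The sketch below shows that idea only; the key layout, the `bronze` prefix, and the `bronze_key` helper are hypothetical and may differ from what the project actually uses.

```python
from datetime import datetime, timezone


def bronze_key(source: str, run_time: datetime, prefix: str = "bronze") -> str:
    """Build a date-partitioned object key for one raw API response.

    Assumes run_time is timezone-aware; it is converted to UTC so that
    partitions do not depend on where the container runs.
    """
    ts = run_time.astimezone(timezone.utc)
    return (
        f"{prefix}/{source}/"
        f"year={ts:%Y}/month={ts:%m}/day={ts:%d}/"
        f"{source}_{ts:%Y%m%dT%H%M%SZ}.json"
    )
```

A scheduled task could call `bronze_key("flight_fares", datetime.now(timezone.utc))` once per run, so each execution writes to its own object and reruns never overwrite earlier raw data.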

PYTHON DATA INGESTION

Travelpayouts Flight Collector

Python API ingestion project that collects live Travelpayouts flight fare data and publishes dated CSV snapshots for analytics.

Python, API Ingestion, CSV, Scheduling, pytest, GitHub Actions
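
Publishing a dated CSV snapshot can be sketched with the standard library alone. The column names, the `fares_YYYY-MM-DD.csv` file naming, and the `write_snapshot` helper are assumptions for illustration; the API call itself is left out, so `rows` stands for records that have already been fetched.

```python
import csv
from datetime import date
from pathlib import Path

# Hypothetical column set; the real collector may publish different fields.
FIELDNAMES = ["origin", "destination", "depart_date", "price"]


def write_snapshot(rows, out_dir, snapshot_date=None):
    """Write fare rows to <out_dir>/fares_YYYY-MM-DD.csv and return the path."""
    snapshot_date = snapshot_date or date.today()
    out_path = Path(out_dir) / f"fares_{snapshot_date.isoformat()}.csv"
    out_path.parent.mkdir(parents=True, exist_ok=True)
    with out_path.open("w", newline="", encoding="utf-8") as f:
        # extrasaction="ignore" drops any fields not listed in FIELDNAMES.
        writer = csv.DictWriter(f, fieldnames=FIELDNAMES, extrasaction="ignore")
        writer.writeheader()
        writer.writerows(rows)
    return out_path
```

Putting the date in the file name gives one file per day, so a rerun on the same day replaces that day's snapshot instead of appending duplicates.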

AI Data Engineering

Projects connecting cloud data engineering foundations with RAG, vector search, and LLM-powered analytics assistants.

AI DATA ENGINEERING / HYBRID RAG

In Progress

CivicLens RAG — NYC 311 Operations Copilot

AI data engineering project that extends the NYC 311 Lakehouse with a RAG assistant that cites its sources when answering questions about service request documentation, data definitions, pipeline runbooks, and operational analytics.

  • Ingests NYC 311 documentation, data dictionary notes, and project runbooks
  • Chunks, embeds, and stores searchable vectors with metadata
  • Retrieves relevant context and generates LLM answers with source citations
  • Designed as a hybrid RAG layer on top of trusted data engineering assets
Python, FastAPI, PostgreSQL, pgvector, OpenAI API, Embeddings, Hybrid RAG, Vector Search, SQL, Docker, GitHub Actions
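
The chunk, embed, store, and retrieve steps listed above can be sketched in plain Python. This is a toy version under stated assumptions: word counts stand in for a real embedding model, an in-memory list stands in for pgvector, and all function names are hypothetical. It covers only the vector-similarity half of a hybrid setup and leaves out the LLM call.

```python
import math
import re
from collections import Counter


def chunk(text, max_words=40):
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text):
    """Toy embedding: lowercase token counts (a stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def build_index(docs):
    """docs maps a source name to its text; returns a list of (vector, metadata)."""
    index = []
    for source, text in docs.items():
        for i, piece in enumerate(chunk(text)):
            index.append((embed(piece), {"source": source, "chunk": i, "text": piece}))
    return index


def retrieve(index, question, k=2):
    """Return metadata for the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[0]), reverse=True)
    return [meta for _, meta in ranked[:k]]
```

Keeping `source` and `chunk` in the metadata is what makes citations possible: whatever text is passed to the model can be traced back to the document it came from.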

Supporting Work

AI REPORTING SAAS

Sumryze - AI-Powered SEO Reporting Dashboard

SaaS-style dashboard for automated SEO reporting, AI-generated summaries, analytics visualizations, and client-ready insights.

Next.js, TypeScript, Tailwind, OpenAI, REST APIs, Vercel

DATA ANALYTICS

Floral Daily SKU Analysis

Sales and inventory analysis project focused on daily SKU movement, reporting, and business decision support.

SQL, Analytics, Reporting

Skills & Tools

Orchestration & Workflow

Apache Airflow, Azure Data Factory, EventBridge Scheduler, GitHub Actions

Cloud Execution & Containers

Docker, ECS/Fargate, ECR, CloudWatch Logs

Storage, Lakehouse & Warehouse

ADLS Gen2, Delta Lake, Databricks, Amazon S3, Redshift Serverless, PostgreSQL

Transformation & Modeling

Python, SQL, PySpark, dbt, Dimensional Modeling

Data Quality & CI

dbt Tests, pytest, Data Validation, Validation SQL, GitHub Actions

Analytics Enablement

Power BI, SQL Marts, KPI Design, BI Handoff, Documentation

About

I am a Data Engineer focused on building reliable cloud data pipelines, analytics-ready datasets, and documented project workflows.

My projects show end-to-end data engineering work across API ingestion, cloud storage, transformation layers, data quality checks, dbt modeling, and analytics-ready outputs. I have built portfolio projects using Azure Data Factory, ADLS Gen2, Databricks, PySpark, Delta Lake, AWS S3, ECS/Fargate, Redshift Serverless, CloudWatch, Python, SQL, and dbt.

I care about clear SQL, maintainable Python, reproducible workflows, validation checks, and documentation that helps reviewers understand how a pipeline works.

With a background in analytics and web development, I can connect technical data engineering work with dashboards, reporting needs, and user-facing project presentation.

Contact

Interested in collaborating on data engineering work or portfolio projects? Reach out and I will follow up.