Welcome to Marcelo's Profile Page

📧 marcsopranzi@gmail.com

About Me

I am Marcelo, a Data Engineer passionate about working with data and streaming solutions. I am committed to automating everything except creativity :)

With over 10 years of experience, I specialize in designing and implementing end-to-end data solutions, from data ingestion to reporting. I have modernized many manual and on-premises monolithic systems into scalable, cloud-based applications. My expertise includes designing and implementing data warehouses, data lakes, and streaming solutions, with a strong focus on role-based access control (RBAC), data quality, and GDPR compliance.

🔍 Open to Flexible Opportunities: I am excited about roles where I can apply my expertise to create impactful solutions, whether in contract or permanent positions, and collaborate effectively in remote or hybrid environments.

🔭 What I'm Currently Working On...

I am developing data solutions and pipelines using technologies such as Python, SQL, GitHub Actions, dbt, AWS, Kubernetes, Docker, and Airflow.

🚀 Projects

Scalable ETL Pipeline

This project demonstrates an ETL pipeline on GCP built from containerized components with Docker and Kubernetes. It showcases orchestration, data transformation, and scaling with Kubernetes.

View Project on GitHub

DWH Foundations

This project uses Airflow and GCP to demonstrate core data warehouse concepts, such as data lineage, in both on-premises and cloud-based setups.

View Project on GitHub

Real-Time Processing

This project showcases the integration of Kafka, PySpark, and Cassandra to create a real-time data ingestion solution.

View Project on GitHub

Deploying an ETL with Notifications

This project demonstrates how to use infrastructure as code (IaC) to deploy a small ETL pipeline on AWS, with Slack notifications via Lambda and SNS.
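As a rough illustration of the notification path (not the project's actual code), a Lambda function subscribed to an SNS topic could forward each message to a Slack incoming webhook. The sketch below assumes a hypothetical `SLACK_WEBHOOK_URL` environment variable and a made-up default subject; only the SNS event shape (`Records[].Sns.Subject` / `Message`) and the Slack `{"text": ...}` payload are standard.

```python
import json
import os
import urllib.request


def build_slack_payloads(event: dict) -> list:
    """Turn each SNS record in a Lambda event into a Slack webhook payload."""
    payloads = []
    for record in event.get("Records", []):
        sns = record["Sns"]
        # SNS allows an empty subject; fall back to an illustrative default.
        subject = sns.get("Subject") or "ETL notification"
        payloads.append({"text": f"*{subject}*\n{sns['Message']}"})
    return payloads


def lambda_handler(event, context):
    """Lambda entry point: post one Slack message per SNS record."""
    # SLACK_WEBHOOK_URL is a hypothetical variable name chosen for this sketch.
    webhook_url = os.environ["SLACK_WEBHOOK_URL"]
    payloads = build_slack_payloads(event)
    for payload in payloads:
        request = urllib.request.Request(
            webhook_url,
            data=json.dumps(payload).encode("utf-8"),  # supplying data makes this a POST
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(request) as response:
            response.read()
    return {"sent": len(payloads)}
```

Keeping the payload formatting in its own function means it can be checked locally without AWS or Slack access.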

View Project on GitHub

Deploying a Solution with CI/CD Principles

In this website project, I used GitHub Actions to deploy changes to a different bucket depending on the branch name.
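The branch-to-bucket idea can be sketched as a small lookup that a deploy step might call. This is an assumption-laden illustration rather than the project's workflow: the bucket names and the `bucket_for_ref` helper are hypothetical, while `refs/heads/<branch>` is the format GitHub Actions uses for `GITHUB_REF` on branch pushes.

```python
from typing import Optional

# Hypothetical branch-to-bucket mapping; the names are illustrative only.
BRANCH_BUCKETS = {
    "main": "example-site-prod",
    "develop": "example-site-dev",
}


def bucket_for_ref(git_ref: str) -> Optional[str]:
    """Return the target bucket for a ref such as 'refs/heads/main'.

    Returns None when the branch has no configured bucket, so the caller
    can skip deployment instead of guessing a target.
    """
    prefix = "refs/heads/"
    branch = git_ref[len(prefix):] if git_ref.startswith(prefix) else git_ref
    return BRANCH_BUCKETS.get(branch)
```

A workflow step could pass the `GITHUB_REF` environment variable to this function and deploy only when it returns a bucket name.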

View Project on GitHub