About Me
I am Marcelo, a Data Engineer passionate about data and streaming solutions. I am committed to automating everything except the creativity :)
With over 10 years of experience, I specialize in designing and implementing end-to-end data solutions, from data ingestion to reporting. I have successfully modernized manual, on-premises monolithic systems into scalable, cloud-based applications. My expertise includes designing and implementing data warehouses, data lakes, and streaming solutions, with a strong focus on RBAC, Data Quality, and GDPR compliance.
Open to Flexible Opportunities: I am excited about roles where I can apply my expertise to create impactful solutions, whether in contract or permanent positions, and collaborate effectively in remote or hybrid environments.
What I'm Currently Working On...
I am developing data solutions and pipelines using technologies such as Python, SQL, GitHub Actions, DBT, AWS, Kubernetes, Docker and Airflow.
Projects
Scalable ETL Pipeline
This project demonstrates a GCP ETL pipeline built from containerized components with Docker and Kubernetes. It showcases orchestration, data transformations, and scalability on Kubernetes.
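As a rough illustration of one containerized transform step, here is a minimal Python sketch; the bucket names, object name, and transformation are hypothetical placeholders, not the project's actual code:

```python
import json
import os

from google.cloud import storage  # assumes google-cloud-storage is installed

# Hypothetical configuration, injected by the Kubernetes Job/Deployment spec.
RAW_BUCKET = os.environ["RAW_BUCKET"]
CURATED_BUCKET = os.environ["CURATED_BUCKET"]
OBJECT_NAME = os.environ["OBJECT_NAME"]


def transform(record: dict) -> dict:
    """Illustrative row-level transformation: normalize keys, drop empties."""
    return {k.lower(): v for k, v in record.items() if v not in (None, "")}


def main() -> None:
    client = storage.Client()

    # Extract: read newline-delimited JSON from the raw bucket.
    raw_blob = client.bucket(RAW_BUCKET).blob(OBJECT_NAME)
    records = [json.loads(line) for line in raw_blob.download_as_text().splitlines()]

    # Transform, then load the curated output for the next pipeline step.
    curated = [transform(r) for r in records]
    out_blob = client.bucket(CURATED_BUCKET).blob(OBJECT_NAME)
    out_blob.upload_from_string("\n".join(json.dumps(r) for r in curated))


if __name__ == "__main__":
    main()
```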
DWH Foundations
This project uses Airflow and GCP to demonstrate core data warehouse concepts, such as data lineage, in both on-premises and cloud-based setups.
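A minimal Airflow DAG sketch of the extract-then-load pattern such a project builds on; the DAG id and task bodies are illustrative assumptions, and the explicit task dependency at the end is what Airflow surfaces as lineage:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract(**context):
    # Placeholder: a real DAG would pull from a source system here.
    return ["row-1", "row-2"]


def load(**context):
    # Placeholder: a real DAG would write into the warehouse here.
    rows = context["ti"].xcom_pull(task_ids="extract")
    print(f"loading {len(rows)} rows")


with DAG(
    dag_id="dwh_foundations_example",  # hypothetical DAG id
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    load_task = PythonOperator(task_id="load", python_callable=load)

    # The dependency graph is the lineage Airflow tracks and visualizes.
    extract_task >> load_task
```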
Real-Time Processing
This project showcases the integration of Kafka, PySpark, and Cassandra to create a real-time data ingestion solution.
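A condensed PySpark Structured Streaming sketch of the Kafka-to-Cassandra path, assuming the Kafka source and the DataStax Cassandra connector packages are on the Spark classpath; the broker address, topic, schema, keyspace, and table names are illustrative:

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import StringType, StructField, StructType, TimestampType

spark = SparkSession.builder.appName("realtime-ingest-example").getOrCreate()

# Illustrative event schema; the real project's payload may differ.
schema = StructType([
    StructField("id", StringType()),
    StructField("event_time", TimestampType()),
    StructField("payload", StringType()),
])

# Read the Kafka topic as an unbounded stream and parse the JSON value.
events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "kafka:9092")  # hypothetical broker
    .option("subscribe", "events")                    # hypothetical topic
    .load()
    .select(from_json(col("value").cast("string"), schema).alias("e"))
    .select("e.*")
)


def write_batch(df, epoch_id):
    # Write each micro-batch to Cassandra via the DataStax connector.
    (df.write.format("org.apache.spark.sql.cassandra")
       .options(keyspace="ingest", table="events")  # hypothetical names
       .mode("append")
       .save())


query = events.writeStream.foreachBatch(write_batch).start()
query.awaitTermination()
```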
Deploying an ETL with Notifications
This project demonstrates how to provision infrastructure as code (IaC) in AWS for a small ETL pipeline with Slack notifications via Lambda & SNS.
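For the notification leg, a Lambda handler subscribed to the SNS topic might look roughly like this; the SLACK_WEBHOOK_URL environment variable and the message format are assumptions, not the stack's exact code:

```python
import json
import os
import urllib.request

# Hypothetical: the Slack incoming-webhook URL is injected by the IaC stack.
SLACK_WEBHOOK_URL = os.environ["SLACK_WEBHOOK_URL"]


def handler(event, context):
    """Lambda entry point: forward SNS notifications from the ETL to Slack."""
    for record in event["Records"]:
        message = record["Sns"]["Message"]  # standard SNS event shape
        body = json.dumps({"text": f"ETL update: {message}"}).encode("utf-8")
        req = urllib.request.Request(
            SLACK_WEBHOOK_URL,
            data=body,
            headers={"Content-Type": "application/json"},
        )
        urllib.request.urlopen(req)
    return {"statusCode": 200}
```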
Deploying a Solution with CI/CD Principles
In this website project, I used GitHub Actions to deploy changes to different buckets based on the branch name.
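A minimal Python sketch of the branch-to-bucket routing such a workflow step could run; the bucket names, site directory, and S3 as the target store are assumptions (GITHUB_REF_NAME is set by GitHub Actions itself):

```python
import os
import pathlib

import boto3  # assumes the workflow runner has AWS credentials configured

# Hypothetical branch-to-bucket mapping.
BRANCH_TO_BUCKET = {
    "main": "my-site-prod",
    "develop": "my-site-staging",
}


def deploy(site_dir: str = "public") -> None:
    branch = os.environ["GITHUB_REF_NAME"]  # provided by GitHub Actions
    bucket = BRANCH_TO_BUCKET.get(branch)
    if bucket is None:
        print(f"no deployment target for branch {branch}; skipping")
        return

    s3 = boto3.client("s3")
    for path in pathlib.Path(site_dir).rglob("*"):
        if path.is_file():
            # Preserve the site's directory layout as the S3 key.
            s3.upload_file(str(path), bucket, path.relative_to(site_dir).as_posix())


if __name__ == "__main__":
    deploy()
```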