I'm passionate about the real knowledge that comes from analyzing raw, real-world data. The world is full of misinformation and bias, due to a certain inability to be true to ourselves and to face what the facts and the data have to say. Working with data and innovative technologies to help people live better and companies make better decisions is one of my main drives.
Ensuring data quality: extracting, transforming, and validating data from different sources (Postgres, Snowflake, and Redshift databases, as well as files from AWS S3 through the Glue Data Catalog and Athena), using SQL and PySpark for querying, validation, and transformation.
Making good visualizations: building advanced dashboards and visualizations for business areas in Looker and Metabase, gathering data from different sources.
Data enrichment: creating new, useful features by combining data from related entities to strengthen analytical exploration.
Data pipeline: helping the data engineering team architect and build the data pipeline from scratch on the AWS cloud, using S3 as the data lake, EMR, Lambda, and Step Functions for ETL, and Redshift as the data warehouse.