Fallou
Tall
Lead Data Engineer - Azure, Databricks and Google Cloud Certified
Professional Status
Employed
Open to opportunities
Experiences
Lead Data Engineer: View 360 project
National Bank of Canada
Since November 2021
Profile and analyze relevant data
Develop data processing pipelines
Profile and optimize SQL queries
Put the data processing pipelines into production
Automate quality assurance testing
Monitor production status
Develop a PySpark library for quality assurance testing
Stack: Azure, AWS, Databricks, Python, Snowflake, Spark, SQL
Senior Data Engineer (consultant): Hadoop Cluster Migration
Orange - Ivory Coast
September 2020 to August 2021
Migrate Flume, Pig, Spark 1 and Sqoop workflows to Spark 2.
Orchestrate workflows on the new cluster with Oozie.
Migrate Oracle databases to Hive.
Develop a Spark library to boost data engineers' productivity.
Stack: Hadoop (Hive, HBase, HDFS, Oozie...), Scala, Spark, Oracle
Data Engineer & Data Scientist (Consultant): Customer Experience Management
Orange - Senegal
June 2019 to August 2020
Extract raw QoE (NPS, resp.) data from HDFS, then transform it and finally save it to the Hive data warehouse via Spark
Explore then prepare QoE data for modeling with PySpark and Pandas
Predict customer QoE by machine learning with Scikit-learn
Deploy the QoE prediction model as a Flask API and store the results in SQL Server
Automate the retraining of the QoE prediction model at regular monthly intervals (experimentally defined)
Extract Churn data from Hive via Spark, then transform it and finally save it to the Hive data warehouse via Spark
Perform correlation analysis between NPS and QoE (Churn and QoE, resp.) with PySpark and Pandas, then visualize the results in a Tableau dashboard
Explore then prepare Churn data for modeling with Spark
Develop customer churn prediction models (base, recharge and data) by machine learning with SparkML
Deploy churn prediction models in batch mode via Spark and store results in SQL Server
Automate the retraining of the model at regular monthly intervals (experimentally defined) with Oozie
Report traffic alarms and alerts in real time for the dynamic management of Orange sites
Predict the outages of these sites by machine learning based on the alarms and alerts data
Extract Fibers data from Hive, then transform it and finally save it to the Hive data warehouse via Spark
Develop algorithms with SparkML for recommending fiber to customers and recommending areas for fiber deployment to Orange
Deploy recommendation models in batch mode via Spark and store results in SQL Server
Orchestrate data processing pipelines with Oozie
Stack: Hadoop (HDFS, Hive, Oozie), Scala, Spark, SQL Server, Tableau, Python, Scikit-Learn, Flask
Data Engineer & Data Scientist (Consultant): Implementation of a digital data monetization platform
Orange - Senegal
October 2018 to April 2019
Set up the microservices architecture composed of Kubernetes, Kafka, Cassandra, Spark and Node.js
Extract CDRs and probes data from Kafka, then transform and store it in Cassandra via Spark Streaming
Develop a model for locating the living and working places of Orange customers in Dakar from CDRs data with Spark
Develop a model for determining the origin-destination matrix of Orange customers in Dakar from probes data with Spark
Validate algorithms with the urban transportation service data and demographic data
Extrapolate the results obtained to the entire population of Dakar
Predict population movements in Dakar by machine learning with SparkML based on origin-destination data combined with probes data
Dockerize and deploy the Spark applications on Kubernetes
Stack: Cassandra, Kafka, Kubernetes, Scala, Spark
Data Scientist: Development of an HR chatbot
Atos - Senegal
February 2018 to July 2018
Design and implement the architecture of the application
Scrape HR data from the HR platform with Beautiful Soup
Store scraped data into Google Cloud Storage
Clean and prepare training data for modeling
Define the intents and entities then manually create some dialog flows
Automatically generate dialog flows with Rasa Interactive
Develop an Intent Classification Model with Rasa-NLU and TensorFlow
Develop an entity recognition model with Rasa-NLU and spaCy
Develop a chatbot response prediction model with Rasa-Core and TensorFlow
Dockerize the app, then connect it to the Facebook Messenger API by setting up a webhook
Deploy the chatbot on Google Cloud Platform via App-Engine Flex
Stack: Beautiful Soup, GCP, Messenger API, Python, Rasa-Core, Rasa-NLU, spaCy, TensorFlow
Data Scientist: Optimization of supercomputer energy consumption by artificial intelligence
Bull - France
April 2017 to January 2018
Literature review of job scheduling algorithms
Extract Slurm log history from MySQL with Pandas
Explore then prepare data for modeling with Pandas
Develop a clustering model of applications that run on the system by machine learning with Scikit-Learn
Develop a supercomputer user classification model by machine learning with Scikit-Learn
Deploy models as REST APIs
Develop an energy-efficient job scheduling algorithm in Python based on the prediction of the resource consumption of jobs and their owners
Stack: Anaconda, MySQL, Python, Scikit-Learn
Education
Cooperative Master in Mathematical Sciences - Major Big Data
African Institute for Mathematical Sciences (AIMS - Senegal)
August 2016 to February 2018
Bachelor in Applied Mathematics
Cheikh Anta Diop University (UCAD - Senegal)
October 2009 to July 2013
Deep Learning Specialization
deeplearning.ai
April 2017 to December 2017
Skills
Big Data
Apache Hadoop
Advanced
Apache Spark
Advanced
Cloudera/Hortonworks
Good
Kafka
Good
Cloud
AWS
Advanced
Azure
Advanced
Databricks
Advanced
GCP
Advanced
Snowflake
Advanced
Data Science
Machine Learning
Advanced
Deep Learning
Good
Statistics
Advanced
Data Structures & Algorithms
Advanced
Problem Solving
Good
Languages
English
Advanced
French
Expert
Wolof
Expert
Programming
Python
Advanced
R
Good
Scala
Advanced
SQL
Advanced
Project Management
Agile methodology
Advanced
Atlassian
Advanced
GitHub
Advanced
Certifications
AWS Certified Data Analytics - Specialty (ongoing)
- April 2022
Databricks Developer Essentials
- January 2022
Databricks Certified Associate Developer for Apache Spark
- January 2022
Databricks Certified Associate Developer for Apache Spark
- October 2021
Azure Data Engineer Associate - MCID: 991749803
- October 2021
Google Cloud Professional Data Engineer
- December 2020