Tel: +447789843468, Email: matthieu.marshall@gmail.com, LinkedIn: https://uk.linkedin.com/in/matthieu-marshall-18a659a4
Summary
Data Engineer with key skills in data wrangling using Python, SQL and PySpark. Currently working for a market research company, using Azure Databricks and Azure Data Factory, among other technologies, for ETL processes. Previous experience at a technology consultancy working with Python, Scala, SQL and elements of the Hadoop stack for the ingestion, processing and surfacing of data.
Experience
Data Engineering Lead at Redslim (July 2020 - Present)
Worked in a small Agile Scrum team developing tools to ingest and enrich sales data for use by headquarters and analysts, largely within the FMCG and pharmaceutical industries.
- Developed a library using PySpark to clean and process sales data into a standard format.
- Migrated ETL steps from SQL to PySpark.
- Coordinated ETL processes run on Databricks and Azure SQL Server using Azure Data Factory.
- Developed custom Python/PySpark scripts to transform data according to client needs.
- Worked with Azure DevOps to manage work items and administer CI pipelines.
- Wrote stored procedures for loading data from Azure Storage to Azure SQL Server databases.
- Supported proof-of-concept development for running Python code in Azure Container Instances.
- Developed a server-side data application service making use of FastAPI, DuckDB and PyArrow.
- Supported the increasing adoption of Agile methodologies within the team and wider company.
I have also developed my soft skills by interviewing, onboarding and coaching members of the Python development team, as well as discussing wider business requirements with management to feed and refine our backlog of work.
Data Engineer at Roke Manor Research Ltd (Nov 2017 - July 2020)
I started at Roke on the graduate development programme, which I finished in November 2019 when I was promoted to Data Engineer. During this time, I was involved in the following:
Project supporting development of a Big Data Pipeline for use in the defence domain.
- Worked on a data pipeline using Apache NiFi and Kafka to ingest data, which is stored in Apache HBase and then presented to the user through a GraphQL API via a Scala backend.
- Worked on customer site in a rainbow team alongside other contractors.
- Produced documentation and technical drawings in order to design solutions for the team to implement.
- Worked in an Agile Scrum environment and took a leading role in Scrum ceremonies, taking the place of the Scrum Master when they were absent.
- Worked in a secured Linux development environment and with tools such as Jenkins, Docker, Ansible and Python to administer development environments and drive continuous integration.
- Communicated with users and external stakeholders to explain the platform and support their integration with it.
- Provided support to colleagues on other software engineering tasks to automate the testing and deployment of the platform.
Project providing test support to a database development project in the law enforcement domain.
- Generated test data for the migration and creation of a new law enforcement database.
- Used Python, SQL and custom-made software to create a range of complex test data.
- Risk-assessed the migration from Microsoft SQL Server to PostgreSQL and produced documentation on this for the customer.
Project conducting a capability assessment of open source methods for face recognition.
- Used image-processing and machine learning packages in Python, such as scikit-learn and OpenCV, to identify individuals from a single photo.
- Investigated and carried out web-scraping for labelled face images.
Project assisting a cyber vulnerability assessment in the automotive sector.
- Investigated the security of the internal network of a modern car.
- Collected, visualised and analysed bus network data.
Other work activity:
- Successfully supported bidding for a Defence and Security Accelerator project.
- Planned and delivered activities for STEM promotion events and school visits.
- Assisted with the organisation and running of workshops with defence stakeholders.
Finance Intern at the University of Southampton (Jun 2016 - Sep 2016)
My primary role as an intern with the Professional Services Finance Department at the University of Southampton was to prepare a data set for benchmarking analysis. In doing so, I built on the following skills:
- Technical knowledge
  o Used Microsoft Excel to manipulate hundreds of lines of data using functions such as SUMIF, COUNTIF, VLOOKUP and others.
  o Improved my ability to organise and analyse numerical data.
  o Gained insight into business operations, finance and strategy within the university which can be applied elsewhere.
- Project management
  o Coordinated by email and in person with estates employees, managers, heads of faculty finance and other accountants in order to collect data.
  o Managed and instructed team members on how to prepare relevant data for submission to the dataset.
  o Completed work to deadlines while also carrying out day to day accounting tasks with Agresso.
- Communicating Ideas
  o Discussed with colleagues in order to overcome problems with the project.
  o Used Pivot tables to summarise and present data clearly.
Skills
Python • PostgreSQL • PySpark • Unix • Data Analysis • Statistical Modeling • Agile Methodologies • Docker • Hadoop Stack • Azure Databricks • Azure Data Factory • Scala • R
Education
BSc, Mathematics and Statistics, First Class Honours, University of Southampton (Sep 2014 - Jun 2017)
- Developed my coding skills with R, carrying out regression, factorial and survival analysis in order to complete coursework.
- Built on my modelling skills, learning about the properties and applications of a range of statistical distributions in order to select the most appropriate one for analysis.
- Strengthened my ability to analyse situations in a mathematical manner and break down problems into a logical set of steps in order to solve them.
- Group projects
  o Used R for regression analysis on airline fares in order to assess the suitability of the model.
  o Used Simul8 to solve a problem relating to queues in a call centre.
  o Improved my skills in discussing mathematical concepts with my team.
  o Collaborated to deliver high-scoring reports and presentations to academic staff and my peers.
A levels at The Cherwell School, Oxford (2007 - 2014)
Studied for A levels (grade) in French (B), Maths (A), Further Maths (B) and Geography (A), as well as AS levels in Physics (A), Biology (A) and General Studies (A).
Certifications
Advanced Python • LinkedIn Learning
Advanced SQL for Machine Learning • LinkedIn Learning
Using Power BI with Excel • LinkedIn Learning
Intermediate Python for Data Science • DataCamp
Understanding Machine Learning with Python • Pluralsight
Introduction to R • DataCamp
Awards & References
Award for work on Programme NELSON while at Roke Manor Research Ltd (April 2020)
“Matthieu … ended up working most of the weekend to resolve issues. He did this off his own back which highlights the level of professionalism and dedication he always applies, and why he is one of the strongest people we have on the programme…. I can’t thank him enough for his efforts.
This is an outstanding effort, particularly during the current times. Matthieu is a very well-respected member of the team, and recently he has been taking on more responsibility and using his own initiative to solve problems as they arise.”
You may download a copy of my CV in both French and English from this GitHub repository: https://github.com/matthieumarshall/Matthieu-Marshall
References are also available upon request.