I’m the Director of Machine Learning at Monzo Bank in the United Kingdom. Monzo is an app-only bank - you download the app, sign up, and you then get a hot coral debit card in the post in a couple of days: from there, you can manage all of your finances via the app, including chatting with customer support.
My role now boils down to 'making machine learning work for Monzo' - or, put another way, I spend my time thinking about how ML methods, tools, and people can be most impactful across the company.
I stumbled into working on machine learning while doing a PhD in Computer Science which was originally meant to be about ubiquitous computing. Shortly after starting my research, the Netflix Prize was announced and one of my advisors forwarded me an email announcement about it.
At this stage, I knew nothing about machine learning but I was fascinated by online recommender systems, and so I started learning and shaping my research on the topic of how recommender systems evolve over time. I still think that recommender systems are one of the best ways to be introduced to machine learning!
Since then, all of the work that I have done has touched on ML in one way or another, although primarily from a systems perspective.
Although I started as an individual contributor when I joined Monzo, I code much less nowadays - the last major system that I was hands-on with was the first version of our feature store. The majority of my recent days have been spent talking, reading, and writing.
My current approach is to set machine learning to one side and spend the majority of my time understanding the different parts of the business and what problems they are looking to solve.
Using this approach, machine learning opportunities tend to jump out - problems we want to automate that are (a) non-deterministic, (b) have a lot of data, and (c) could have a meaningful impact on that area’s KPIs.
I’m a staunch believer in end-to-end machine learning work - it is faster than splitting training and shipping across different teams, and it keeps the machine learning focused on the end use-case, which makes it more impactful.
That means that the Machine Learning Scientist in my team who trains a model will then ship it as well, rather than hand it off to another team. This way, they get to see their modeling efforts all the way through to impact, and they stay close to the end-users of their systems.
Right now, my first port of call would be to find a Machine Learning Scientist in my team who has some capacity to look into it :)
When mentoring people through this scenario, I always recommend the same things: look at the data, write things down, and have a baseline. Working on a new ML problem is as much about defining the problem as it is working on the solution, and so keeping track of everything that happens along the way is super important - as is having a sense for how to measure what ‘good’ looks like.
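To make the "have a baseline" advice concrete, here is a minimal sketch (not Monzo code - the data and labels are invented for illustration) of the kind of trivial baseline worth writing down before any modeling: always predict the most common training label, then measure it, so any real model has a number to beat.

```python
from collections import Counter

def majority_baseline(train_labels):
    """Return a predictor that always outputs the most common training label."""
    most_common = Counter(train_labels).most_common(1)[0][0]
    return lambda _features: most_common

def accuracy(predict, examples):
    """Fraction of (features, label) examples where the prediction matches."""
    return sum(predict(x) == y for x, y in examples) / len(examples)

# Hypothetical labelled data: (features, label) pairs.
train = [({"amount": 10}, "ok"), ({"amount": 9000}, "fraud"), ({"amount": 12}, "ok")]
test = [({"amount": 11}, "ok"), ({"amount": 8000}, "fraud")]

baseline = majority_baseline([label for _, label in train])
print(accuracy(baseline, test))  # 0.5 here - the bar any model must clear
```

Writing this down first also forces the "what does good look like" question: if the baseline already scores well on the chosen metric, the problem definition (or the metric) probably needs rethinking before the model does.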
A large part of our work is enabled by Platform teams who run our infrastructure and build tooling that we use - both across our analytics and production stacks. For example, here’s a great overview of the >1,500 microservices running in our platform by Matt Heath and Suhail Patel - I know that without all of the hard work that underpins our infrastructure, our journey into machine learning would have looked very different.
When building specific systems, we collaborate with the teams that own that problem area in the business. Sometimes, that means we split the problem in two. More often, it means that ML Scientists embed with that team for a while.
There’s a spectrum of org structures - ranging from fully embedding Machine Learning Scientists into product teams through to fully centralising Machine Learning Scientists into a single team. Neither of those extremes really works: folks either lose touch with their discipline and peers or lose touch with the business.
All of the org structures that I’ve seen ‘work’ are some kind of hybrid. I don’t think there’s a one-size-fits-all solution, especially inside a single company where things change quickly over time!
Many different ways!
One of the ones I’d like to highlight: we split up the code that creates a dataset from the code that trains a model, typically into separate pipelines. This echoes the importance of data quality in our work, and also helps to make our pipelines more plug & play with each other.
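A minimal sketch of that split (illustrative only - the function names, quality gate, and the toy threshold "model" are my assumptions, not Monzo's pipelines): the two steps share nothing except the dataset artifact on disk, so either side can be re-run, swapped, or reused independently.

```python
import csv
import os
import tempfile

# Pipeline 1: dataset creation - owns data quality, writes a dataset artifact.
def build_dataset(raw_rows, path):
    """Validate raw rows and write the training dataset to disk."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["amount", "label"])
        for row in raw_rows:
            if row["amount"] is None:  # basic quality gate: drop incomplete rows
                continue
            writer.writerow([row["amount"], row["label"]])

# Pipeline 2: model training - reads the artifact, knows nothing about raw sources.
def train_model(path):
    """Fit a trivial threshold 'model' from the dataset file."""
    amounts = {"fraud": [], "ok": []}
    with open(path) as f:
        for row in csv.DictReader(f):
            amounts[row["label"]].append(float(row["amount"]))
    # Decision boundary halfway between the two class means.
    threshold = (sum(amounts["fraud"]) / len(amounts["fraud"])
                 + sum(amounts["ok"]) / len(amounts["ok"])) / 2
    return lambda amount: "fraud" if amount > threshold else "ok"

# Illustrative run: the two pipelines communicate only via the dataset file.
path = os.path.join(tempfile.mkdtemp(), "dataset.csv")
raw = [{"amount": 10, "label": "ok"},
       {"amount": None, "label": "ok"},   # removed by the quality gate
       {"amount": 9000, "label": "fraud"}]
build_dataset(raw, path)
model = train_model(path)
```

Because the contract between the two is just the dataset file, a second model can train from the same artifact, and the dataset step can be rebuilt without touching training code - the plug & play property described above.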
I spoke at length about processes and tools we use in this episode of the MLOps Community podcast. In short: we often discover tools we need by examining the areas of our work that are the most repetitive. That’s why we built a feature store and a model store (note: I have a separate open-source project that is also a model store).
The impact of our work is measured in cost savings or revenue - everything we do ultimately finds its way back to one of these metrics. That’s in large part thanks to being in a company that is not seeking to optimise engagement metrics because it doesn’t sell ads.
The biggest impact we’ve seen so far is when tackling different typologies of financial crime using machine learning. Beyond the impact on our KPIs, we’ve heard directly from customers who we have saved from being scammed out of their life savings - and it is very hard to quantify that.
For lower-level monitoring (e.g., response times, error rates), we use Grafana, just like all of our backend engineers.
Monzo runs all of our analytics in Google Cloud using dbt, BigQuery, and Looker. Having a single place where everyone in the company goes to look for numbers (regardless of whether they are high-level company metrics or nuanced metrics for A/B tests) has been very powerful, and so we decided that our ML performance monitoring should go there as well.
Our choice to use these tools, rather than something that was designed specifically for machine learning monitoring, was deliberate - we then don’t need to manage these tools ourselves, and giving others access and insight into our work is much easier.
The most impactful Machine Learning Scientists that I’ve worked with all have a product-oriented mindset: they recognise that (a) machine learning is a tool, and therefore may not always be the right tool for a task, and (b) that we’re using this tool to try and deliver value to our customers, and so the focus should be on the end-goal rather than just on the details of the modeling.
A lot of what we learn in school or books is focused on training models for a well-defined task on a given dataset. It’s been very rare for me to find this scenario at work!
When applying ML, nearly all of those conditions are no longer true: the task needs to be defined (and can nearly always be formulated in more than one way), the dataset may need to be created and designed, and the focus will often be on systems.
A practical example of this: one of the biggest improvements that I’ve ever seen to an ML system came from an ML Scientist having a hypothesis about data that could become a new feature, and then relentlessly setting up everything we needed to capture that input.
I enjoy watching online machine learning courses every now and again, as a way to re-learn continuously. The last three I went through were Full Stack Deep Learning, Luigi Patruno’s Sagemaker Course, and a previous edition of the fast.ai course.
Beyond that, I do continue to (sporadically) peer review submissions to academic conferences - most recently, I’ve been on the Senior Programme Committee for ACM RecSys - and so I’ll peruse papers from these conferences when the proceedings are published.