Hi! I’m Nihit. I am a Staff Engineer at Facebook, where I currently work on business integrity.
Along with a friend of mine, I also write a biweekly newsletter focused on challenges and opportunities associated with real-world applications of ML.
I studied Computer Science as an undergrad, and the first machine learning course I took was in my junior year (if I recall correctly!). During the following year, I worked on hierarchical topic extraction from text documents, as part of my senior thesis. This collection of experiences was my formal introduction to data science and machine learning.
I understood the potential real world impact of these technologies only really after I started working. I spent a couple of years at LinkedIn, working on search quality, where I got to dabble in ML problems in search: query understanding, autocomplete, spelling correction and of course, ranking.
Following this, I went to grad school at Stanford, where I took a few courses focused on ML and deep learning applications in vision and language understanding. One of the more fun projects I got to work on was analyzing project pitches on Kickstarter to predict project success or failure. The following year, I was a TA for Andrew Ng’s Machine Learning course. Teaching was a new experience for me, and I think it helped me understand the importance of communicating technical ideas well.
Since then I’ve been at Facebook, where I currently work on business integrity. My team works on detection of various classes of bad or harmful content in ads, in order to make sure we can take these ads down in a timely manner.
This varies quite a bit week to week, but it is usually some combination of the following:
A framework that I’ve found helpful is the following:
In my experience, a setup like the one described above generally positions ML teams to deliver. It is important, though, to keep refining each step of the framework to make sure that ML, as fantastic a tool as it is, is pointed in a direction that actually improves the product experience for users.
Qualitative evaluation and dogfooding are equally important. Especially for applications like search and recommendations, where personalization is an inherent part of the machine learning application, I’ve found it super useful to dogfood the product (or the model output) to better understand what works well and what is broken.
My first instinct would be to understand the problem (what is the goal, what are the constraints) and evaluate whether machine learning is the right tool for building a solution. I feel this step gets missed often: ML is a fantastic tool, if applied to the right set of problems.
Assuming the answer to the first question is yes, the next step is usually to map the business/product problem to a machine learning problem. This means (1) figuring out what your model optimizes for (the objective function) and (2) collecting data to train your model. In practice, each of these steps can be quite complex. For example, how should a high-level objective such as ‘autonomous driving’ be translated into a set of machine learning problems?
The step after this is usually training and validation of models. If this is a new domain and I’m building the first or second version of a machine learning model, my approach is usually to start simple and make sure the infrastructure for training, monitoring and collecting data over time is set up correctly.
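As one illustration of what “start simple” can mean, a majority-class baseline gives a floor that any first model has to beat before more complexity is justified. The labels below are made up for illustration:

```python
from collections import Counter

def majority_baseline(train_labels, test_labels):
    """Accuracy of always predicting the most common training label.

    Any real model should beat this floor to justify its complexity.
    """
    majority = Counter(train_labels).most_common(1)[0][0]
    correct = sum(1 for y in test_labels if y == majority)
    return correct / len(test_labels)

# Hypothetical binary labels: 1 = harmful ad, 0 = benign
train_y = [0, 0, 0, 1, 0, 0, 1, 0]
test_y = [0, 1, 0, 0]
print(majority_baseline(train_y, test_y))  # 0.75
```

Comparing every candidate model against this kind of trivial baseline is a cheap sanity check that the modeling effort is actually adding value.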
Usually, the step after this would be some sort of online experimentation or validation. Ideally a controlled A/B test, but for new problem areas or urgent firefighting this might not always be practical, so this is somewhat problem dependent.
This definitely rings true. Depending on the nature of the project, I’ll collaborate with some subset of the following:
Scaling myself - I would say this has definitely been a learning process for me. I understand the importance of it, but honestly I don’t know if I’m good at it yet. One thing that’s helped me a lot is taking the time to understand my colleagues and collaborators. Knowing whose judgement to trust on what issues has helped me delegate a lot more effectively, but this is always a work in progress.
Across the industry, I believe we are still in the early innings of ML adoption - while the general principles/good practices are well understood, the tools and workflows for scaling adoption of these practices are yet to mature.
A few observations about the machine learning ecosystem within Facebook:
One thing I have found extremely helpful time and time again is exploratory/qualitative data analysis - what does my data look like? Which features are correlated? Are the labels reliable or do they need cleanup? My go-to is usually Jupyter notebooks with relevant integrations for fetching/visualizing the specific type of data.
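As a toy illustration of those two checks (in practice this would live in a Jupyter notebook, likely with pandas; the columns and labels below are made up), feature correlation and labeler agreement might be sketched as:

```python
import statistics

def pearson(xs, ys):
    """Pearson correlation between two feature columns."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

def rater_agreement(labels_a, labels_b):
    """Fraction of examples where two labelers agree — a quick
    proxy for whether the labels need cleanup."""
    return sum(a == b for a, b in zip(labels_a, labels_b)) / len(labels_a)

# Hypothetical columns from an ads dataset
clicks = [1, 2, 3, 4, 5]
spend = [2, 4, 6, 8, 10]          # perfectly correlated with clicks
print(pearson(clicks, spend))      # 1.0 — one of these may be redundant

labeler_1 = [0, 1, 1, 0, 1]
labeler_2 = [0, 1, 0, 0, 1]
print(rater_agreement(labeler_1, labeler_2))  # 0.8
```

Highly correlated features hint at redundancy, and low inter-labeler agreement is an early warning that model metrics will be noisy no matter how good the model is.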
Another thing I’d like to highlight is creating reproducible training and deployment workflows - version everything. It will save a ton of debugging headaches down the line.
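One minimal way to sketch “version everything” is to fingerprint each training run by hashing its config together with the training data, and to store that fingerprint with every model artifact. The function and config names here are hypothetical:

```python
import hashlib
import json

def run_fingerprint(config: dict, data_bytes: bytes) -> str:
    """Deterministic ID for a training run: a hash of the config
    (canonical JSON, so key order doesn't matter) plus the raw
    training data. Identical inputs always yield the same ID."""
    h = hashlib.sha256()
    h.update(json.dumps(config, sort_keys=True).encode())
    h.update(data_bytes)
    return h.hexdigest()[:12]

# Hypothetical training config
cfg = {"lr": 0.01, "model": "logreg", "seed": 7}
fp = run_fingerprint(cfg, b"example training data")
print(fp)
```

When a production model misbehaves weeks later, the fingerprint tells you exactly which config and data produced it, which is most of the debugging battle.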
We’ve learned to pay a lot more attention to post-deployment model management (including monitoring). We have two kinds of monitoring for production models today:
For the most important metrics, we have alerts set up to notify the on-call in case of unexpected movements. This setup, along with tools for exploratory data analysis to drill down and look at individual events, is a helpful starting point for debugging production issues.
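A minimal sketch of that kind of alerting, assuming a simple threshold on deviation from the historical mean (the metric values below are made up):

```python
import statistics

def should_alert(history, current, n_sigma=3.0):
    """Flag a production metric if the current value deviates from
    the historical mean by more than n_sigma standard deviations."""
    mean = statistics.fmean(history)
    sd = statistics.stdev(history)
    return abs(current - mean) > n_sigma * sd

# Hypothetical daily rate of ads flagged as violating
daily_positive_rate = [0.050, 0.052, 0.049, 0.051, 0.050, 0.048]
print(should_alert(daily_positive_rate, 0.050))  # False: within normal range
print(should_alert(daily_positive_rate, 0.120))  # True: page the on-call
```

Real monitoring systems account for seasonality and trends, but even a threshold this crude catches the large, sudden movements that matter most for firefighting.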
The most impactful ML engineers I know understand that machine learning is, in most cases, only one part of the product. This means sometimes solving issues outside your domain of expertise or comfort.
They are generalists and willing to adapt or learn new frameworks/tools/model architectures as needed.
There’s a lot of value in starting with a simple approach first and focusing on the end-to-end data and label collection process. Over time, the quantity and quality of data you have is likely the single biggest limiting factor on your model’s performance.
So many! In general, ML courses and books are a great resource for learning foundations. However, in my experience this is only a small fraction of what you need to be a good machine learning practitioner. Here are a few things I have learned over the years that I wish someone had told me on Day 1:
I wish I had a more deliberate process for this! I often learn about new ideas or developments (papers, tools, products, upcoming meetups etc) through the community of ML researchers and practitioners around me. Some examples that come to mind: TWiML AI, MLOps.Community, curated list of resources such as Awesome MLOps, internal groups and paper reading/tech talk events at Facebook.
Then, for a few hours every week I’ll sit down and sort through these updates and dive deeper into ones I find interesting or useful. This could be reading a paper, listening to a podcast or trying out a new tool/product.