Learn more about Jeremy on his
I currently work as a machine learning engineer at Duo Security; we have a range of products (such as
We built
I got my degree in Materials Science and Engineering. Throughout my time in university, I worked in various materials research labs. For the first two years, I worked in a lab exploring some novel techniques for carbon sequestration. This was very hands-on research and involved a lot of time in a physical laboratory formulating materials and performing various experiments.
I eventually shifted my focus to computational modeling research, thinking that it would be more efficient to perform a wide range of experiments via computer simulations, identify a few promising candidates, and reduce the time spent in a physical laboratory formulating materials.
I really enjoyed the modeling aspects of my computational materials research and I kept reading about interesting applications of data science and machine learning, so that summer I decided to apply for data science internships to “dip my toes in the water” and see if I enjoyed it. I ended up getting an internship at Red Hat where I built some web scrapers and used some basic NLP techniques to find duplicate product questions across web forums. I had an absolute blast that summer and it confirmed my desire to pursue machine learning as a career.
However, that internship experience also showed me that I had a lot to learn. After finishing my materials science degree, I immediately dove into full-time self-study to build my machine learning foundation. I put together a curriculum of topics, watched lectures from Stanford, MIT, and Udacity, and started writing
After making my way through this initial set of coursework, I started working on some side projects to apply what I had learned. One of those side projects gained some momentum and grew into a startup, which I wrote more about
No day is exactly the same, but most of my days are some combination of:
I’m currently working with some teammates to re-architect our machine learning infrastructure to be more modular, flexible, and scalable as Duo continues to grow and add new machine learning capabilities to our products. This involves work such as tracking all of our models with MLflow, building out an offline feature store for our batch jobs, and working on shifting some of our model training workflows from EMR clusters to a container-native approach.
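To make the MLflow piece concrete, here’s a minimal sketch of what tracking a training run and logging the resulting model can look like; the experiment name, data, and metric below are illustrative stand-ins rather than anything from Duo’s actual pipelines.

```python
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data; in a real pipeline this would come from the feature store.
X, y = make_classification(n_samples=5_000, n_features=20, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=0)

mlflow.set_experiment("login-anomaly-detection")  # hypothetical experiment name

with mlflow.start_run(run_name="daily-retrain"):
    params = {"n_estimators": 200, "max_depth": 8}
    model = RandomForestClassifier(**params).fit(X_train, y_train)

    # Parameters, metrics, and the model artifact are all attached to this run,
    # so any retrained model can be traced back to how it was produced.
    mlflow.log_params(params)
    mlflow.log_metric("val_auc", roc_auc_score(y_val, model.predict_proba(X_val)[:, 1]))
    mlflow.sklearn.log_model(model, artifact_path="model")
```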
One of the things that attracted me to Duo is the company’s strong product culture. My team works with an amazing product manager who spends a lot of time talking with customers, performing competitive analysis across the industry, and shaping a product vision for data-informed features and improvements to our product line.
We spend a lot of time focusing on understanding the problems we’re trying to tackle, being careful not to prescribe specific solutions too early on in the process. Part of this includes not making the assumption that machine learning will always be the right tool to solve a problem.
In fact, we’ll usually only reach for machine learning solutions after initially deploying a heuristic-based approach, observing the performance characteristics, and identifying the failure modes of simple rules.
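For illustration, a heuristic-first baseline might be as simple as a hand-written rule like the one below; the feature names and thresholds are made up for the example, not an actual detection rule.

```python
from dataclasses import dataclass


@dataclass
class LoginAttempt:
    country: str
    hour_utc: int
    failed_attempts_last_hour: int


def flag_suspicious_login(attempt: LoginAttempt, usual_countries: set[str]) -> bool:
    """Hand-written rule: flag logins from an unusual country, at odd hours,
    preceded by a burst of failed attempts. Thresholds are illustrative."""
    odd_hours = attempt.hour_utc < 6 or attempt.hour_utc > 22
    return (
        attempt.country not in usual_countries
        and odd_hours
        and attempt.failed_attempts_last_hour >= 3
    )


# Watching where this rule fails (e.g. false positives for travelers, misses for
# in-country attacks) tells us whether a learned model is worth the added complexity.
attempt = LoginAttempt(country="BR", hour_utc=3, failed_attempts_last_hour=5)
print(flag_suspicious_login(attempt, usual_countries={"US", "CA"}))  # True
```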
At Duo, we have a design research team that has put a ton of effort into defining a set of user personas representing the various user groups that interact with our products (e.g. a security administrator, an end user, the company’s CISO). Each persona includes a description of their background, objectives and motivations, and the concerns or pain points they may have with our products. When we’re talking about changes or improvements to the product, we make sure to specify the change in the context of the relevant personas.
We also use Duo’s products in our day-to-day work! We’ll often
Perhaps it would be helpful to provide a concrete example. At my last company, I worked on a project with some of our threat researchers who wanted better visibility into the malware landscape. Specifically, they wanted to be able to (1) track the evolution of malware over time and (2) identify malware variants where we may be missing family attribution.
If you’re not familiar with malware detection, those objectives probably don’t mean much to you. I was starting off from a similar position! I spent countless hours on Zoom calls with these researchers to understand their existing workflows, how they detect and categorize malware, and where they lacked visibility in their current workflows.
I learned that the threat actor groups who deliver malware track the effectiveness of their attacks and will ship new versions when their malware starts getting detected (and thus blocked) too often. I also learned that our products provided value to our customers not only by blocking malware, but also by providing intelligence on the malware that we detect. Thus, it’s important to not only detect the malware but effectively categorize it (i.e. malware family attribution).
Once I felt that I had a decent understanding of the problem that we wanted to address, I mocked up some wireframes in
Once the researchers were able to play around with the initial prototype powered by these simple models, they were able to give me feedback on where performance was suffering and I was able to make targeted improvements to the models (and UX design) to address their concerns.
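The interview doesn’t go into the specific features or models used, but as a purely illustrative sketch, a first pass at finding variants with missing family attribution could be a nearest-neighbor search over per-sample feature vectors: samples that sit very close to a known family are candidates for attribution, while distant ones may be genuinely new.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

# Illustrative only: each row is a malware sample embedded as a numeric feature
# vector (e.g. static or behavioral features); some samples have family labels.
rng = np.random.default_rng(0)
labeled_vectors = rng.normal(size=(100, 32))
labeled_families = rng.choice(["family_a", "family_b", "family_c"], size=100)
unlabeled_vectors = rng.normal(size=(10, 32))

nn = NearestNeighbors(n_neighbors=1).fit(labeled_vectors)
distances, indices = nn.kneighbors(unlabeled_vectors)

DISTANCE_THRESHOLD = 7.5  # arbitrary cut-off for this toy data
for dist, idx in zip(distances.ravel(), indices.ravel()):
    candidate = labeled_families[idx] if dist < DISTANCE_THRESHOLD else "unknown / needs review"
    print(f"nearest family: {labeled_families[idx]:<10} distance: {dist:.2f} -> {candidate}")
```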
In general, if you’re given a new, unfamiliar problem to solve, I recommend that you:
Related: check out this
It can be a big effort indeed. On a typical project, we’ll collaborate with product managers and designers to make sure that (1) we understand the problem well and (2) the solutions we develop are intuitive to the end user. We’ll also collaborate with other engineering teams, since the ML service we deploy is usually just one piece of the larger system that powers the product, and with our site reliability engineers to ensure the services we deploy are robust and in line with company standards.
In order to scale myself, I try to always leave behind a trail of documentation so that I’m never the only person who knows how to operate the systems that I build. I also try to take the opportunity to automate processes wherever feasible so that I can be efficient with how my time is spent.
I’ve seen a number of organizational structures including:
With all of this said, I don’t think there’s one organizational model that reigns supreme in all scenarios. Ultimately, the right organizational model is going to depend on your company size, maturity, and variety of data science work available.
For more on this topic, I’d recommend checking out
To start, we value rapid iteration as part of our process. This leads us to make upfront investments in infrastructure aimed at reducing the time it takes to get from an experiment to results. We spend a lot of time thinking about how we can enable paved paths for common workflows while preserving the flexibility for data scientists to use the right tool or framework for their problem.
I’m a big fan of the Unix philosophy of simple, sharp tools that work well together. In the context of machine learning infrastructure, most of our effort is focused on designing interfaces between modular components which represent the machine learning lifecycle. For example,
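As one way to picture that interface-first approach, here’s a minimal sketch (not Duo’s actual architecture) where each lifecycle stage is defined by a small interface, so individual components can be swapped without touching the rest of the pipeline.

```python
from typing import Any, Protocol


class FeatureStore(Protocol):
    def get_features(self, entity_ids: list[str]) -> Any: ...


class Trainer(Protocol):
    def train(self, features: Any, labels: Any) -> Any: ...


class ModelRegistry(Protocol):
    def publish(self, model: Any, name: str) -> str: ...


def training_pipeline(
    store: FeatureStore,
    trainer: Trainer,
    registry: ModelRegistry,
    entity_ids: list[str],
    labels: Any,
    model_name: str,
) -> str:
    """Each stage depends only on its neighbours' interfaces, so the feature
    store backend, training framework, or registry can change independently."""
    features = store.get_features(entity_ids)
    model = trainer.train(features, labels)
    return registry.publish(model, model_name)
```

A concrete `FeatureStore` here could be backed by batch files or a database, and the `Trainer` could wrap whichever framework fits the problem; the pipeline itself never needs to change.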
Additionally, we like to stay scrappy at the beginning of projects and get feedback from low-fidelity mockups before spending much time building anything. We take a very incremental approach towards developing solutions, which naturally leads to shorter feedback loops.
A couple years ago, I wrote a
The central theme of that blog post is that the machine learning lifecycle is an extremely iterative process. The figure below depicts a visual summary of the process that I follow.
A few tools that I’ve found valuable as I follow this process on projects:
At Duo, we train and deploy thousands of models daily; this scale necessitates a high level of automation to build, deploy, and monitor these models. Currently, our models are retrained on a fixed schedule, although we’ve been looking into the feasibility of switching to an event-based model where retraining is tied to some metric/threshold.
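As a sketch of what that event-based trigger might look like, the check below kicks off retraining when a monitored metric degrades past a tolerated threshold; the metric and numbers are hypothetical.

```python
def should_retrain(recent_score: float, baseline_score: float,
                   max_relative_drop: float = 0.05) -> bool:
    """Trigger retraining when a monitored metric (e.g. AUC on recently
    labeled data) drops more than a tolerated fraction below its baseline."""
    return recent_score < baseline_score * (1 - max_relative_drop)


# Example: with a baseline AUC of 0.92, a recent AUC of 0.85 (an ~8% drop)
# exceeds the 5% tolerance, so a retraining job would be enqueued.
if should_retrain(recent_score=0.85, baseline_score=0.92):
    print("metric degraded past threshold -> enqueue retraining job")
```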
In terms of monitoring performance, we have a series of alerts that fire when our model behavior deviates from its expected parameters (e.g. flagging way too many login attempts as anomalous). We also have feedback mechanisms built into the product which allow our users to signal when our models are doing well and when they’re doing poorly.
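A minimal version of that flag-rate alert could look like the check below; the expected rate and tolerance are made-up values, not production thresholds.

```python
def flag_rate_alert(flagged: int, total: int,
                    expected_rate: float = 0.01, tolerance: float = 3.0) -> bool:
    """Fire an alert when the observed anomaly-flag rate strays well beyond
    its expected level (numbers here are illustrative, not production values)."""
    observed_rate = flagged / max(total, 1)
    return observed_rate > expected_rate * tolerance


# Example: 900 flags out of 20,000 logins is a 4.5% flag rate, well above
# the ~1% we expect, so the alert fires.
if flag_rate_alert(flagged=900, total=20_000):
    print("ALERT: anomaly flag rate far above expected range")
```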
I’ve been fortunate to work with and learn from a good number of highly effective practitioners. One of the common themes that I’ve observed is that they spend a lot of time talking with stakeholders and end users. For example, one of my previous coworkers had a practice of scheduling a ton of “informational interviews” at the outset of a project; he would take time to understand the existing workflows and established processes that various people followed in order to better understand the task we were trying to automate. He was very skilled at identifying the part of the process where we could feasibly introduce automation with ML to make the end users more effective at the overall task.
Additionally, I’ve found that highly effective practitioners are also usually the most curious; they will invest a lot of time making sure they deeply understand the systems and tools they’re using to build out a solution.
I keep a Trello board with my learning goals for the current year. This board has a combination of blog posts I want to write, side projects I want to work on, and courses I’m interested in taking. I also try to keep up with the firehose of research papers and blog posts from newsletters and my Twitter feed that pique my interest.
I’ve found writing to be a really effective learning tool for me in a number of ways. One of my original goals for writing as I learned about machine learning topics was to provide a snapshot of my mental state when a concept “clicked” and record this in my own words; that way I can easily revisit a subject in the future, even if I haven’t touched it in a few years, and quickly get back up to speed. I also discovered that writing is a really effective way to identify gaps in my understanding of a topic, since it forces me to think through a narrative flow of how I would teach myself a given concept. Finally, publishing my writing on a blog has opened the door for so many conversations and friendships with people who have similar interests – it’s been incredibly rewarding!
Read more mentor interviews?