I work as a principal data scientist at OLX Group and I lead a small team of two people. Our main focus right now is helping our data scientists be more effective. Mostly, it’s about solving engineering challenges that come with model deployment.
I also go into other areas of the process — from problem definition to model evaluation. One of the projects I’m doing now is standardizing how we productionize machine learning in our data science department.
After work, I run DataTalks.Club — a community of people who love data. We have weekly events and amazing discussions in our Slack.
I started my career as a Java developer. I worked at a bank and my colleagues told me about this new exciting course on Coursera by Andrew Ng. Then I took more courses and eventually did a master’s in business intelligence. At the same time, I was freelancing, doing machine learning projects in Java. That and my master’s thesis helped me build a good portfolio of projects, which was enough to get my first data science job.
At my first job, my colleague convinced me to try Kaggle. It was a competition about finding the correct answers to a set of multiple-choice questions. I failed miserably in that competition, but I also learned a lot. Most importantly, I learned that all the theoretical knowledge I had from my master’s and online courses was quite useless for applied machine learning problems. I took part in more competitions, and this is when I really learned machine learning.
After some time, I joined a startup. In a startup, there’s always more work than people. There, I would do everything: work on the roadmap, set up data pipelines, write scrapers, and buy groceries. It was an amazing experience and I realized that being a generalist is more interesting for me than being a specialist in one particular area. When I joined OLX, I saw that many of my colleagues don’t like deploying machine learning projects; they’d rather focus on modelling. But I liked the deployment part, so I started helping my colleagues with that from my first days.
Now I work as a principal data scientist on a variety of things, from identifying the most impactful projects to unifying how we do machine learning across the organization.
I do a number of things:
I spend most of my day in meetings.
It’s also important to document all these steps. I like creating “project journals”. A project journal is a document that contains everything related to a project: the problem description, the KPIs, and notes from meetings (who was there, the decisions made, and the next steps).
Experimentation is probably the best way of doing it. Usually we do it with A/B tests.
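To make the A/B testing idea concrete, here is a minimal sketch of one common way to read out such a test: a two-proportion z-test on conversion counts. The function name, the numbers, and the choice of test are assumptions for illustration only; the interview does not say which statistical method is used.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare conversion rates of variant A (control) and B (treatment).

    Returns the z statistic and the two-sided p-value under the
    normal approximation with a pooled standard error.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: 100 of 1000 users convert in A, 150 of 1000 in B.
z, p_value = two_proportion_z_test(100, 1000, 150, 1000)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

A small p-value suggests the difference between the variants is unlikely to be noise alone. In practice the sample size and the success metric would be fixed before the experiment starts.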
We usually have monitoring and on-call for important projects.
They can do things end-to-end. Also, they are good communicators and can convince others to help them.
A successful machine learning project involves a lot of talking. At the beginning, you need to understand the problem well. Once you have a model, you need to explain how it works. If others don’t understand how it works, they won’t trust your solution. Finally, when your solution is ready, you also need to convince others to use it. That involves a lot of talking as well.
Another thing the school didn’t emphasise enough was the importance of setting up a cross-validation framework. I think this is the most important machine learning skill. You can answer any question by setting up a cross-validation framework and then experimenting.
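As a rough illustration of what "setting up a cross-validation framework and then experimenting" can look like, here is a minimal sketch in plain Python. Everything in it is an assumption made for the example: the helper names (`k_fold_indices`, `cross_validate`), the toy data, the two candidate models, and the choice of mean absolute error as the metric. It is not a description of any setup mentioned in the interview.

```python
import random
from statistics import mean


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 once and split them into k nearly equal folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(fit, predict, X, y, k=5, seed=0):
    """Return the mean absolute error on each of the k held-out folds."""
    scores = []
    for held_out in k_fold_indices(len(X), k, seed):
        held_out_set = set(held_out)
        train = [i for i in range(len(X)) if i not in held_out_set]
        model = fit([X[i] for i in train], [y[i] for i in train])
        errors = [abs(predict(model, X[i]) - y[i]) for i in held_out]
        scores.append(mean(errors))
    return scores


# Candidate 1 (hypothetical baseline): always predict the training mean.
def fit_mean(X, y):
    return mean(y)


def predict_mean(model, x):
    return model


# Candidate 2 (hypothetical): least-squares line on a single feature.
def fit_line(X, y):
    mx, my = mean(X), mean(y)
    var = sum((x - mx) ** 2 for x in X)
    slope = sum((x - mx) * (t - my) for x, t in zip(X, y)) / var
    return slope, my - slope * mx


def predict_line(model, x):
    slope, intercept = model
    return slope * x + intercept


# Toy data: y = 2x + 1 plus Gaussian noise.
rng = random.Random(42)
X = [float(i) for i in range(50)]
y = [2 * x + 1 + rng.gauss(0, 1) for x in X]

baseline_mae = mean(cross_validate(fit_mean, predict_mean, X, y))
line_mae = mean(cross_validate(fit_line, predict_line, X, y))
print(f"baseline MAE: {baseline_mae:.2f}, line MAE: {line_mae:.2f}")
```

Because both candidates are scored on the same folds with the same metric, the question "is idea B better than idea A?" becomes a matter of swapping in a different `fit` and `predict` pair and comparing the numbers.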
I often do “just-in-time learning” (this is what Eugene called it during our interview). When I don’t know how to solve something, I start digging in: I do a lot of googling, read blogs and papers, and eventually find the solution. This type of learning works best for me because I focus on the problem and learn by solving it.
Also, I have access to Udemy at work, and I like watching courses there. I’ve finished courses on product and project management, web development, cloud services, vector graphics, marketing, copywriting, public speaking, and many other areas. When possible, I try to use these skills. If I don’t, I forget the content of the course within a month.