If you work for a company that makes or sells consumer goods, your first data science project will most likely be a recommendation engine. Here I am purposely excluding projects that focus on data analysis and counting only machine learning models that ship to consumer-facing platforms. After all, users face too many choices every day, and it is your job to make each user's life simpler (in this case, by reducing the number of items to choose from).
In my career, I have worked for two consumer-goods companies: one in e-commerce and the other in video on demand. At both companies, my first deliverable was a recommendation engine. As you may have guessed already, I like to write tips and guides that reflect on the mistakes I have made and the experience I have gained through trial and error. This is a guide for young data science teams who are ready to ship their first recommendation engine at work!
Your company may not have that many choices to begin with
What comes to mind when you think of a recommendation engine? Netflix? Amazon? Both? A recommendation engine is a classic example of how machine learning can be used in real businesses, and there are abundant resources on the topic, not to mention the famous papers on how Netflix, Amazon, and YouTube build theirs. Those papers are the result of many years of work by some of the brightest minds in the industry. These companies also offer their users a practically endless number of choices, so their need for a complex recommendation engine is well justified. With that said, I would like to invite you to think about what you are trying to solve. More specifically, is a complex recommendation engine really necessary in your case?
It is hard to pinpoint how many choices are enough to justify a recommendation engine. I am also not discouraging you from using data in your product. However, you might want to weigh the effort, the resources, and the expected return of the machine learning product you are about to build.
Also, if the number of choices for users is small, you will suffer from not having enough data. Collaborative filtering may stop being useful because the correlation between any two items will be either all too strong or all too weak. Think of a pizza restaurant that sells only two types of pizza: cheese and sausage. Every customer who comes to your pizza place has only two options to choose from. Do you want to risk recommending just one pizza to the customer (as your best guess)? If I were you, I would rather show both menu items at the same time, since it is not very difficult for a customer to choose between two options.
When you face this problem, my advice is to think simple. Talk to the operators and find out what logic is currently used to rank or recommend items. See if there is a way for a machine to improve the current process. The improvement could be as small as automating the existing process or as large as a more refined version of it.
Depending on your company's roadmap, the number of choices available to users may grow in the future. So it is good to be prepared, but don't go overboard by building a convolutional neural network over two types of pizza. Instead, use your resources to stabilize the infrastructure or to gather the right data points. That way, once your company is ready to expand, your team will be in a better position to support the expansion.
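As an illustration of what "thinking simple" can look like, here is a minimal sketch of a popularity ranking, one kind of rule an operator might already be applying by hand. The function name, the `(user_id, item_id)` log format, and the sample data are all assumptions made for this example, not something taken from a real system.

```python
from collections import Counter

def rank_by_popularity(purchases, top_n=3):
    """Rank items by how many distinct users bought them, most popular first."""
    # Deduplicate (user, item) pairs so one heavy buyer cannot dominate the ranking.
    unique_pairs = set(purchases)
    counts = Counter(item for _user, item in unique_pairs)
    # Sort by count descending, then by item name so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _count in ranked[:top_n]]

# Hypothetical purchase log: (user_id, item_id) pairs.
purchases = [
    ("u1", "cheese"), ("u1", "cheese"), ("u2", "cheese"),
    ("u2", "sausage"), ("u3", "veggie"), ("u3", "cheese"),
    ("u4", "sausage"),
]
top_items = rank_by_popularity(purchases, top_n=2)
print(top_items)
```

A baseline like this is cheap to build and gives you something to compare against if you later move to a more elaborate model.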
Of course, you won't have the right kind of data
The three pillars of a recommendation engine are explicit ratings, item metadata, and user metadata (yes, this is subjective). It is surprising how many problems you will run into when trying to get your hands on each of them.
No explicit rating
An explicit rating is a measure of how much user A likes item 123. The famous Netflix Prize competition was about predicting this rating value. Another favorite toy dataset, the IMDB dataset, also has user ratings. As a result, most research papers that build recommendation models on these datasets treat the explicit user rating as an important component. In the real world, many platforms do have explicit rating features: Facebook's like, Instagram's heart, or Amazon's user ratings. However, just as many platforms have no explicit rating data at all. Even when they do, the data is often sparse and can be quite meaningless.
In this case, your team will need to devise a way to convert user behavior into an implicit rating, or make recommendations without a rating score. Both methods can yield good results, but which one works better depends on the case.
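One possible way to derive an implicit rating is to give each type of user event a weight and sum the weights per user and item. The sketch below assumes a hypothetical event log of `(user, item, event_type)` tuples. The event names, the weights, and the cap are illustrative choices that you would have to tune for your own platform.

```python
from collections import defaultdict

# Hypothetical event weights: a stronger signal of interest gets a larger weight.
EVENT_WEIGHTS = {"view": 1.0, "add_to_cart": 3.0, "purchase": 5.0}

def implicit_ratings(events, max_score=5.0):
    """Turn (user, item, event_type) logs into capped implicit scores."""
    scores = defaultdict(float)
    for user, item, event_type in events:
        # Unknown event types contribute nothing instead of raising an error.
        scores[(user, item)] += EVENT_WEIGHTS.get(event_type, 0.0)
    # Cap the total so repeated events cannot grow the score without bound.
    return {pair: min(total, max_score) for pair, total in scores.items()}

events = [
    ("u1", "item_123", "view"),
    ("u1", "item_123", "add_to_cart"),
    ("u1", "item_456", "view"),
    ("u2", "item_123", "purchase"),
    ("u2", "item_123", "view"),
]
ratings = implicit_ratings(events)
for pair, score in sorted(ratings.items()):
    print(pair, score)
```

The resulting scores can stand in for explicit ratings in models that expect a user-item rating, although they measure engagement rather than stated preference.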
Not enough item metadata
Item metadata is an important ingredient for content-based recommendation (which does not need any user data). Except in a few lucky cases, your team will most likely find that there is not enough item metadata, or, just as problematic, that the data is dirty. Unless your company's DB admins designed the database with machine learning in mind, the legacy DB structure, the data quality, and many other factors may not be optimal for a data science team. Alternatively, the people who are supposed to input the data may overlook the importance of data quality. Then this is no longer only a DB problem but a data culture problem in the organization (a bigger problem that cannot be fixed with one email).
My team still struggles with these legacy issues: how people input data, how the organization monitors data, and how data is created. Especially if the data science team is new to the organization, it is tough to get support from other teams to change a work process that has been passed down for decades.
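To make the role of item metadata concrete, here is a minimal sketch of a content-based recommendation that ranks items by how many tags they share with a given item, using Jaccard similarity. The catalog, the tags, and the function names are hypothetical, and Jaccard is just one simple similarity measure among many.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets: |intersection| / |union|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def similar_items(item_id, catalog, top_n=2):
    """Rank the other catalog items by tag overlap with item_id."""
    target = catalog[item_id]
    scored = [
        (jaccard(target, tags), other)
        for other, tags in catalog.items()
        if other != item_id
    ]
    # Highest similarity first; ties are broken by item id for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other for _score, other in scored[:top_n]]

# Hypothetical catalog: item id -> set of metadata tags.
catalog = {
    "movie_a": {"action", "sci-fi", "space"},
    "movie_b": {"action", "sci-fi"},
    "movie_c": {"romance", "drama"},
    "movie_d": {"sci-fi", "drama"},
    "movie_e": set(),  # an item whose metadata was never filled in
}
print(similar_items("movie_a", catalog))
print(similar_items("movie_e", catalog, top_n=4))
```

Note what happens with `movie_e`: because its tag set is empty, every similarity is zero and the ranking falls back to alphabetical order. That is the practical cost of missing metadata.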
No user demographic data or metadata
Nowadays, most mobile platforms require you to enter an email address or phone number to sign up. Some may ask for your gender or age; others may even ask for more specific demographic data. I think you can guess what I am about to share next: yes, user metadata is not always there for you. There can be many reasons for this. Maybe your company did not collect it because nobody thought it was needed in the first place. Maybe you cannot use the user data because of a regulatory issue.
If you are using a third-party solution like Google Analytics, you may notice that it offers estimated demographic data. However, you don't know how this data is collected, and most importantly, you cannot directly map it to your user IDs. Someone in the meeting room will suggest surveying the user base to gather demographic data. My personal opinion is that a one-time survey is too costly, and what are you going to do if your user base is more than 1 million?
I have not used user demographic data in any of the recommendation models I have built, so I cannot say what effect it has as an input feature. However, the models without demographic data yielded good results as well. I think demographic data can help explain why a model behaves a certain way.
Be reasonable with exception handling cases
Business stakeholders like to add exceptions to everything. Let's take a movie recommendation platform (much like Netflix) as an example. Imagine your team has built a wonderful recommendation engine that can spit out a personalized movie list for any individual user. Here are some examples (some a bit exaggerated, some real) of what can happen to your model at the front end.
Advertising team: They have just signed a contract with company Y, which sells item G. The contract requires your company to show movie F to at least 1 million users (perhaps item G happens to appear in movie F).
Contents acquisition team: They just spent a gazillion dollars on buying the rights to movie G in your region. They need to break even by heavily advertising movie G to every user on the platform.
Company president: Your company president happened to read an editorial about movie B saying what an exceptional story it is. The president wants to know why movie B is not on every user's recommendation list.
You get the point, right? In these circumstances, your team has no choice but to amend the output of the recommendation engine. Exceptions happen all the time, and as a data science team in a commercial company, we are expected to accommodate business requirements.
My advice here is to be reasonable. Your team should be able to defend the value of the recommendation model. At the same time, you should always remember that it takes time for big companies to transition to a more data-driven culture. Let this be the small price you pay for including a data-serving product in your legacy platform. In my experience, business stakeholders become more relaxed about requirements and exceptions after a while. They are more willing to remove the exception layers once they have seen how the model works. This will also buy your team time to monitor and refine the model.
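One way to accommodate such exceptions without touching the model itself is a thin override layer applied to the model's output. The sketch below is only an illustration under assumed names: `apply_business_overrides`, the `pinned` list, and the movie IDs are hypothetical, and placing pinned items first is just one possible policy.

```python
def apply_business_overrides(recommended, pinned, list_size=5):
    """Place pinned items first, then fill the remaining slots with model output."""
    final = []
    for item in list(pinned) + list(recommended):
        if len(final) >= list_size:
            break
        # Skip duplicates so a pinned item the model also chose appears once.
        if item not in final:
            final.append(item)
    return final

# Hypothetical personalized list for one user, best match first.
model_output = ["movie_x", "movie_g", "movie_y", "movie_z", "movie_w"]
# Hypothetical titles the business requires every user to see.
pinned = ["movie_f", "movie_g"]

final_list = apply_business_overrides(model_output, pinned, list_size=5)
print(final_list)
```

Keeping the exceptions in a separate layer like this means the model's raw output stays intact for monitoring, and a rule can be dropped later by removing an entry from `pinned`.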