BG-Menu: churn predictor
BG-Menu is one of the biggest food delivery services in Bulgaria. A visitor to their website can browse different restaurant chains, decide what they want to eat, and order the meal online. BG-Menu then delivers the meal from the chosen restaurant to the user's address. The client wanted to predict the likelihood that a user will churn, so that retention incentives can be triggered in time.
As is standard for its industry, BG-Menu invests heavily in user acquisition; however, a percentage of users churn every month, and BG-Menu believes it can reduce this percentage by knowing who is most likely to churn.
We developed and deployed a model which predicts which users are going to churn with 85%+ accuracy.
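The model internals are not described above; as a rough illustration only, a churn classifier of this kind can be sketched as a simple logistic regression over user-activity features. Everything below — the feature names and the synthetic data — is invented for the example and is not BG-Menu's actual model or data:

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(X, y, lr=0.1, epochs=500):
    """Fit a logistic-regression churn model with plain stochastic gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi  # gradient of log-loss w.r.t. the logit
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b

def predict_churn(w, b, x, threshold=0.5):
    """True if the predicted churn probability crosses the threshold."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= threshold

# Hypothetical features, scaled to [0, 1]:
#   x[0] = days since last order / 30,  x[1] = orders in last month / 10.
# Label 1 = churned. Purely synthetic data for illustration.
random.seed(0)
X, y = [], []
for _ in range(200):
    churned = random.random() < 0.5
    days = random.uniform(0.6, 1.0) if churned else random.uniform(0.0, 0.4)
    orders = random.uniform(0.0, 0.2) if churned else random.uniform(0.4, 1.0)
    X.append([days, orders])
    y.append(1 if churned else 0)

w, b = train_logistic(X, y)
accuracy = sum(predict_churn(w, b, xi) == yi for xi, yi in zip(X, y)) / len(X)
print(f"training accuracy: {accuracy:.2f}")
```

In production such a score would be computed per user and fed into the incentive-triggering logic rather than thresholded once on training data.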
BG-Menu: delivery time predictor
BG-Menu is one of the biggest food delivery services in Bulgaria. On their website a visitor can browse different restaurant chains, decide what they want to eat, and order the meal online. BG-Menu then delivers the meal from the chosen restaurant to the user's address.
BG-Menu was facing challenges in predicting delivery times for different orders. Underestimating the delivery time leads to frustrated customers who do not receive their order on time, while overestimating it leads to a drop in conversion rates, because customers are not willing to wait that long for a meal.
We developed and deployed a model which predicts the delivery time of an order from restaurant A to place B. Some of the achieved performance metrics:
- 30% of predictions within 3 minutes of the actual delivery time
- 45% of predictions within 5 minutes of the actual delivery time
- 70% of predictions within 10 minutes of the actual delivery time
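These figures are a tolerance-band accuracy: the share of predictions that land within N minutes of the actual delivery time. A minimal sketch of how such a metric is computed, using made-up predicted/actual times rather than real order data:

```python
def accuracy_within(predicted, actual, tolerance_minutes):
    """Fraction of predictions within `tolerance_minutes` of the actual time."""
    hits = sum(abs(p - a) <= tolerance_minutes for p, a in zip(predicted, actual))
    return hits / len(predicted)

# Illustrative (made-up) predicted vs. actual delivery times, in minutes.
predicted = [32, 45, 28, 51, 40, 36, 47, 30, 55, 42]
actual    = [30, 49, 27, 60, 41, 44, 46, 33, 53, 40]

for tol in (3, 5, 10):
    print(f"within {tol:>2} min: {accuracy_within(predicted, actual, tol):.0%}")
```

Reporting several tolerance bands, as the bullet list above does, gives a fuller picture than a single regression error number such as MAE.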
NEG: topic classification
NEG owns the largest online community website in Bulgaria after Facebook. The forum already contains a large amount of content, and new content is produced in high volume every day (50+ million page views per month).
They needed better categorization of what is discussed in the different topics/sub-forums, so that these places can be used for more targeted advertising.
We built a model which classifies a post into one of up to 50 categories (e.g. Health, Business, Technology). The model achieves around 80% accuracy in classifying posts from the forum.
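The actual model is not detailed above; as a rough illustration of multi-class text categorization, here is a toy multinomial naive Bayes classifier. The category names match the examples in the text, but the training snippets are invented and far smaller than any real training set:

```python
import math
from collections import Counter, defaultdict

class NaiveBayesTopicClassifier:
    """Tiny multinomial naive Bayes over bag-of-words features."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # category -> word counts
        self.doc_counts = Counter()              # category -> number of docs
        self.vocab = set()

    def train(self, text, category):
        words = text.lower().split()
        self.word_counts[category].update(words)
        self.doc_counts[category] += 1
        self.vocab.update(words)

    def classify(self, text):
        words = text.lower().split()
        total_docs = sum(self.doc_counts.values())
        best, best_score = None, float("-inf")
        for cat, counts in self.word_counts.items():
            # log prior + sum of log likelihoods with add-one smoothing
            score = math.log(self.doc_counts[cat] / total_docs)
            denom = sum(counts.values()) + len(self.vocab)
            for w in words:
                score += math.log((counts[w] + 1) / denom)
            if score > best_score:
                best, best_score = cat, score
        return best

clf = NaiveBayesTopicClassifier()
clf.train("doctor symptoms flu medicine hospital", "Health")
clf.train("startup revenue market investment profit", "Business")
clf.train("software laptop code smartphone update", "Technology")
print(clf.classify("new smartphone software update released"))
```

A production system for 50 categories would add tokenization, stop-word handling, and much more training data per category, but the scoring shape stays the same.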
SEC Live: filings reader
SEC Live aimed to change the way people consume SEC filings. Their flagship product extracted the best out of filings by combining machine learning and natural language processing algorithms to make sense of both textual and numeric data, then wrapped it all up in an elegant web reader where analysts and investors could keep their research in one place. Two challenges stood out:
- Size - the filings repository contained millions of loosely structured SEC filings totaling over 4 TB. Fetching and processing such a volume regularly and on time is no trivial task.
- Variety - there are over 80 different types of filings with different content, sections, and structure. Categorizing and processing such a varied source of textual data is a very complex classification problem.
We developed a big data processing solution that ran all of the ML algorithms on a scalable, elastic Hadoop MapReduce environment. By allocating and freeing a cloud of 100 instances on demand, it could process the full set of filings in less than an hour while keeping the cloud infrastructure cost under $500 per month.
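The Hadoop job itself is not reproduced above, but the MapReduce pattern it follows can be sketched in plain Python. This hypothetical job merely counts filings per type; the real pipeline ran the ML algorithms in the same map/shuffle/reduce shape:

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical input: (filing_id, filing_type) pairs standing in for raw filings.
filings = [
    ("0001", "10-K"), ("0002", "10-Q"), ("0003", "8-K"),
    ("0004", "10-K"), ("0005", "8-K"), ("0006", "8-K"),
]

def map_phase(record):
    """Emit (key, value) pairs; here: one count per filing type."""
    _, filing_type = record
    yield (filing_type, 1)

def reduce_phase(key, values):
    """Aggregate all values for one key; here: sum the counts."""
    return (key, sum(values))

# Shuffle/sort: group mapper output by key, as the framework would do
# across the cluster before handing each key's values to a reducer.
mapped = sorted(kv for rec in filings for kv in map_phase(rec))
results = dict(
    reduce_phase(k, (v for _, v in group))
    for k, group in groupby(mapped, key=itemgetter(0))
)
print(results)
```

Because mappers see one record at a time and reducers see one key at a time, the same code shape scales from this in-memory toy to 100 cloud instances.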
To accommodate the variety of filings, we employed different approaches to training the ML algorithms, which led us to build an internal UI for annotating training data and validating classification results.
Additionally, we carried out thorough performance testing, tuning, and monitoring to support traffic of 1,000 simultaneous users.