Alleviating the cold-start problem by using demographic data and domain-aware similarity measure

Abstract

Recommender systems (RSs) are a cornerstone of most online businesses that cater to a large customer base, such as e-commerce and social network platforms. RSs enable these platforms to provide tailor-made experiences to each of their customers by strategically utilizing user/item rating data or any other available data. Collaborative filtering (CF) techniques are among the most popular and successful RS models. However, CF techniques often suffer from the cold-start (CS) problem. In particular, they struggle with complete cold-start (CCS) situations, in which no user/item rating history is available, and incomplete cold-start (ICS) situations, in which only a limited amount of user/item rating history is available.
In this paper, we explore two models that use novel ideas to combat the CCS and ICS problems. The first model (DCF) focuses on the intelligent use of user demographic data to combat the CCS problem. The second model (PIPCF) uses a novel domain-specific similarity measure called Proximity-Impact-Popularity (PIP) to combat the ICS problem. We also propose our own model (DPIP-CF), which combines these two ideas with some of our own modifications to combat the CCS and ICS problems simultaneously.
We use the MovieLens dataset, a popular, publicly available dataset that is often used to test RSs. Through a series of experiments, we demonstrate the strengths of DCF and PIPCF in dealing with the CCS and ICS problems, respectively. Finally, we show that our DPIP-CF model outperforms all other models discussed in this paper and is a viable solution for dealing with the CCS and ICS problems simultaneously.
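
The distinction between CCS and ICS situations drawn above can be made concrete with a minimal sketch. It assumes ratings arrive as (user, item, rating) triples and that "a limited amount" of history means fewer than some cutoff; the function names and the `ics_threshold` parameter are hypothetical and are not taken from the paper.

```python
from collections import Counter


def cold_start_regime(rating_count, ics_threshold=20):
    """Label a user by how much rating history is available.

    ics_threshold is an illustrative cutoff (an assumption), not a value
    reported in the paper.
    """
    if rating_count == 0:
        return "CCS"   # complete cold start: no rating history at all
    if rating_count < ics_threshold:
        return "ICS"   # incomplete cold start: only limited history
    return "warm"      # enough history for ordinary CF


def classify_users(ratings, all_users, ics_threshold=20):
    """Map every user to a regime, given (user, item, rating) triples."""
    counts = Counter(user for user, _item, _rating in ratings)
    return {
        user: cold_start_regime(counts.get(user, 0), ics_threshold)
        for user in all_users
    }
```

Under this reading, a user with no ratings would be routed to a demographic-based model, while a user with a few ratings could be handled by a similarity measure designed for sparse histories.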