Tushar Prakash, Raksha Jalan, Onoe Naoyuki
30th September 2024
This blog post is written by Brijraj Singh. It summarizes a paper accepted at the AAAI-2025 conference, which can be found here: Paper link
Though recommender systems are technical in nature, they essentially deal with human behavior. This work is motivated by the following behavioural and technical considerations.
Fig.2 An experiment on rats: The analysis of Dopamine firing [Lembke, A. (2021). Dopamine nation: Finding balance in the age of indulgence. Penguin.]
Behavioural Motivation: According to the theory of predictive coding and the free energy principle in neuroscience, the human brain works towards reducing the difference between the predicted world model and the world state observed through sensory input (Friston and Kiebel 2009); this difference is known as free energy. The reduction in free energy gives a sense of satisfaction through the release of dopamine and motivates the brain to optimize it even further (Friston et al. 2014) by continuing the engagement. In the case of online services, it is the recommendation model that provides relevant content which the brain finds close to its predicted world model (based on prior experience of engagement), and this is one of the reasons for dopamine release. This often results in continuous engagement with the platform, which is invariably broken at some point due to more pressing work, alternate pursuits, distractions, fatigue, or simply wanting to get out of a rabbit hole.
Fig.2 illustrates a famous neuroscience experiment in which the dopamine level in a rat's brain is monitored while the rat is engaged in activities. It shows that a cue-conditioned rat makes a connection between the lighting of a bulb and receiving a reward. Therefore, when the bulb lights up, there is a spike in the dopamine level in its brain, which the brain's homeostasis mechanism quickly brings under control. This firing gives anticipatory pleasure, which usually occurs in anticipation of good things. However, the control process brings the level down, and in most cases pushes it below the baseline level. This deficit in dopamine is the cause of the craving to obtain the reward. If the reward is received, there is an even higher spike; otherwise, the dopamine level dips even deeper than before. Hence, an expected reward that fails to materialize is worse than a reward that was never anticipated in the first place.
The same phenomenon occurs with users who interact with online platforms that rely on time-driven sessions. A user who, after multiple interactions, expects to receive more and more relevant recommendations may have to leave the platform for longer than the threshold duration (because of some other work). When the user returns, the recommendations are dominated by their long-term preferences, because the time-driven session is reset after the period of inactivity on the platform. This hurts the user experience and affects the CTR in the long term.
Technical Motivation: Each individual behaves differently while using online services. Even if the content provided by the recommender is highly relevant to a group of users, only a few users can afford to stay continuously engaged (e.g. binge-watching). Most other users can consume content only intermittently because of other tasks, distractions, or fatigue. A time-driven session-based recommender system (SBRS) treats an idle time threshold θ as a waiting period, and if the user does not come back within this duration, it creates a new session.
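The time-driven rule described above can be sketched as follows. This is a minimal illustration, not the paper's code: the function name `time_driven_sessions`, the use of minutes as the time unit, and the sample timestamps are all assumptions made for the example.

```python
from typing import List


def time_driven_sessions(timestamps: List[float], theta: float) -> List[int]:
    """Assign a session id to each interaction of a single user.

    `timestamps` must be sorted in ascending order. A new session starts
    whenever the idle gap since the previous interaction exceeds `theta`.
    """
    session_ids: List[int] = []
    current = 0
    for i, t in enumerate(timestamps):
        if i > 0 and t - timestamps[i - 1] > theta:
            current += 1  # idle time exceeded the threshold: open a new session
        session_ids.append(current)
    return session_ids


# Hypothetical interaction times in minutes, with theta = 30 minutes.
times = [0, 5, 12, 50, 55, 200]
ids = time_driven_sessions(times, theta=30)
print(ids)
```

Note that the split depends only on the gaps between interactions, never on what the user was consuming, which is the limitation the rest of this post addresses.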
Fig.4 Long-term preferences spoil short-term preferences and degrade the user experience
Fig. 5 Case (i) Session splits with same categories, Case (ii) Session doesn’t split otherwise
In all previous studies of SBRS, sessions have been defined by time. In general, 30 minutes has been extensively used as the limit on the length of the break (Bernardis et al. 2022), or the limit is treated as a hyperparameter that, once decided, is fixed for all users. Using a constant time period as the threshold forces any session-based model to summarize the user's activities within a session even if they are heterogeneous in nature. This confuses recommendation models, which expect homogeneous behavior within a session so that they can clearly capture short-term preferences and summarize them.
Evaluating the model on the last session has been the convention, but this makes more sense in the entertainment domain, where the user’s behavior is captured over the long term and becomes visible in the nth session.
Fig.7 Evaluation in entertainment domains and in e-commerce domain.
In the e-commerce domain, by contrast, the user’s behavior is mostly driven by their requirements rather than their choices. Once a purchase is made, the user may suddenly switch to a different category of items, because the next need is not necessarily similar to the previous one. E.g., a user who has purchased a dress may start exploring electronics or groceries. Hence, the model should be evaluated on the last item of the same session, because the (n-1)th session and the nth session may consist of interactions with different categories of items.
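The two evaluation conventions contrasted above can be sketched as follows. This is only an illustration under assumptions: the helper names `last_session_split` and `last_item_split`, the sample item ids, and the choice to hold out the last item of every session with at least two items are not taken from the paper.

```python
from typing import List, Tuple

Session = List[str]


def last_session_split(sessions: List[Session]) -> Tuple[List[Session], Session]:
    """Entertainment-style convention: hold out the n-th (last) session."""
    return sessions[:-1], sessions[-1]


def last_item_split(sessions: List[Session]) -> List[Tuple[Session, str]]:
    """E-commerce-style convention (assumed reading): within each session,
    the last item is the target and the preceding items are the input."""
    return [(s[:-1], s[-1]) for s in sessions if len(s) >= 2]


# Hypothetical sessions of one e-commerce user.
sessions = [
    ["dress_a", "dress_b", "dress_c"],
    ["phone_a", "phone_b"],
    ["rice"],
]

history, held_out_session = last_session_split(sessions)
within_session_pairs = last_item_split(sessions)
print(held_out_session)
print(within_session_pairs)
```

In this toy example the held-out last session (groceries) is unrelated to the earlier dress and phone sessions, whereas each within-session target shares a category with its input prefix.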
In this paper, we propose a novel method of creating sessions based on content, which we call content-driven sessions. We argue that a content-driven session captures user behavior more precisely and therefore improves recommendation performance. Content-driven sessions are created on the basis of homogeneity in the content. We propose clustering the items in the feature space to determine a) the total number of categories available and b) the category of each item. The category of each item is then used to decide the boundary of each session. The proposed method creates a new session ID in the existing dataset and keeps the rest of the experimental setting intact. We evaluate the proposed method on four benchmark datasets with 6 session-based recommendation models. The proposed method outperformed time-driven sessions on all the baseline models. In this way, the content-driven method a) helps users continue the same experience after returning, b) provides a solution to the problem of choosing the threshold θ in TS-SBRS, and c) better captures the user interaction pattern and improves recommendation performance. The proposed method performs clustering to create category labels. In domains (such as e-commerce) where the category is already known, the clustering step can be skipped.
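A minimal sketch of content-driven session assignment is shown below. It assumes the category of each item is already available, either from clustering or from metadata, and it assumes the simplest possible boundary rule: a new session starts whenever the category differs from that of the previous item. The paper's exact rule may differ, and the function name `content_driven_sessions` and the sample data are hypothetical.

```python
from typing import Dict, List


def content_driven_sessions(items: List[str], item_category: Dict[str, str]) -> List[int]:
    """Assign a session id to each item in one user's chronological history.

    Assumed rule: open a new session whenever the item's category differs
    from the category of the previous item. Timestamps are not used at all.
    """
    session_ids: List[int] = []
    current = 0
    prev_category = None
    for i, item in enumerate(items):
        category = item_category[item]
        if i > 0 and category != prev_category:
            current += 1  # content changed: open a new session
        session_ids.append(current)
        prev_category = category
    return session_ids


# Hypothetical item-to-category labels (e.g. produced by a clustering step).
item_category = {
    "m1": "comedy", "m2": "comedy",
    "m3": "thriller", "m4": "thriller",
    "m5": "comedy",
}
history = ["m1", "m2", "m3", "m4", "m5"]
new_session_ids = content_driven_sessions(history, item_category)
print(new_session_ids)
```

Because only the session ID column changes, the resulting IDs can replace the time-driven ones in an existing dataset while the rest of the pipeline stays the same.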
In most cases, content-driven sessions were found to outperform time-driven sessions. The results are obtained on 6 baselines, including STAMP, NARM, GRU4Rec, CD-HRNN, and Tr4Rec, on datasets such as MovieLens (movies), Goodreads (books), LastFM (music), and Amazon (e-commerce).