Recommendation Systems of Short Video Platforms

Auditing Algorithms of Short Format Video Platforms to Understand the Rabbit Hole Effect on YouTube Shorts

Abstract

The rapid rise of short-format video platforms such as TikTok and YouTube Shorts over the last five years has fundamentally transformed how users consume content. More than any other platforms, they rely on recommendation systems to deliver content: these systems analyze user behavior to learn user interests and serve content directly, with minimal explicit user input. While this approach simplifies the user experience and generally increases user satisfaction, there are concerns that it can lead users into rabbit holes: situations in which users are recommended increasingly narrow content, which can reinforce biases and limit exposure to diverse perspectives.

The scientific literature on recommendation systems has repeatedly shown that they generate rabbit holes. However, a gap remains in understanding how exactly these rabbit holes form on Short-Format Video (SFV) platforms such as YouTube Shorts or TikTok. While many studies have attempted to audit recommendation systems, a methodology is lacking to quantify how strongly different user actions influence both the output of the algorithm and the formation of rabbit holes. To address this knowledge gap, the main research question of this study is:

"How can the feature importance of user actions in algorithmic recommendation systems on Short-Form Video Platforms, such as YouTube Shorts, be analyzed to understand their role in leading users down a rabbit hole?"

This research question is divided into two sub-questions which, when answered, provide a comprehensive answer to the main research question:

1. How can a tool be built to collect a large dataset of algorithmic recommendations based on user actions?
2. How can this dataset be analyzed to understand the key user actions that lead to falling into a rabbit hole on YouTube Shorts?

As the sub-questions indicate, this research proceeds in two distinct phases. The first is to develop a tool that fully simulates a human user of YouTube Shorts, the SFV platform of interest in this study, complete with user interests, actions, and decision making. The second phase runs this automation tool with varying user personas, each interacting with content in different ways, while recording all output of the algorithm. The tool is novel in that it uses Large Language Model (LLM) prompts to understand the content of the videos it watches and to decide how to react (positively or negatively) based on the video classification and the persona's interests. This allows the researchers to understand how different user actions influence the algorithm and let it learn user interests, and to investigate whether rabbit holes form on YouTube Shorts and which user actions intensify them.
The tool collects a time series of video recommendations with two key variables for each point: the video category and the user's reaction (positive or negative).
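The persona's per-video decision step described above can be sketched roughly as follows. All names here are illustrative: in the actual tool the category would come from an LLM classification of the video, which this minimal sketch replaces with a plain category string.

```python
import random

def decide_reaction(video_category: str, persona_interests: set,
                    positive_prob: float = 1.0) -> str:
    """Decide how a simulated persona reacts to a recommended video.

    A video whose category matches the persona's interests triggers the
    positive action (here "watch") with probability `positive_prob`,
    letting the experimenter moderate the action's influence; anything
    else is skipped (the negative reaction).
    """
    if video_category in persona_interests and random.random() < positive_prob:
        return "watch"  # positive reaction; could equally be "like", "share", ...
    return "skip"       # negative reaction

# Example: a persona interested in cooking and travel
persona = {"cooking", "travel"}
reaction = decide_reaction("cooking", persona, positive_prob=1.0)
```

Logging `(video_category, reaction)` at each step yields exactly the two-variable time series the tool records.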

A time series of 550 videos is collected for each of the watch, like, share, dislike, and skip reactions. The watch and like runs are repeated at varying probabilities of the reaction occurring, to moderate their influence. The research then analyzes these time series by plotting the running average of videos that aligned positively with the user's interests, as well as the different video categories recommended to the user. These plots reveal which actions are most influential in steering the algorithm toward user interests and how rabbit holes form. The results are as follows:

- An automated tool that fully simulates users on YouTube Shorts was successfully developed and can now be used to perform algorithmic audits. A full explanation of the development and architecture of the bot is provided in the methodology section. The tool itself is also shared with the scientific community on GitHub so that other researchers can perform further audits.
- Among user interactions, watching videos is identified as the most influential in guiding the YouTube Shorts algorithm. Liking also has a significant impact, though less than watching. In contrast, sharing and disliking videos have minimal influence on the algorithm’s output.
- The results also confirm that the YouTube Shorts algorithm has a strong tendency to create rabbit holes by progressively narrowing content recommendations based on user engagement, particularly through actions such as watching and liking videos. Consistency in user actions is shown to be key to the intensity of rabbit holes; diversifying interactions can therefore help users avoid them.
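The running-average analysis used to produce these results can be sketched simply. The function below is a minimal illustration, assuming the reaction log is a boolean time series where `True` means the recommended video matched the persona's interests:

```python
def running_average_alignment(matched: list) -> list:
    """Cumulative running average of interest-aligned recommendations.

    `matched[i]` is True if the i-th recommended video aligned with the
    persona's interests (hypothetical log format). A rising curve over
    the series indicates the algorithm narrowing toward the persona's
    interests, i.e. a forming rabbit hole.
    """
    averages = []
    aligned = 0
    for i, hit in enumerate(matched, start=1):
        aligned += int(hit)
        averages.append(aligned / i)
    return averages

# Example: recommendations drift toward the persona's interests over time
series = [False, False, True, True, True, True]
curve = running_average_alignment(series)
```

Plotting such a curve for each reaction type (watch, like, share, dislike, skip) makes their relative influence on the algorithm directly comparable.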

From these results, this study aims to advance the understanding of how YouTube Shorts' recommendation algorithms foster algorithmic rabbit holes. The study introduces a novel automated bot for simulating user behavior, offering a new tool for auditing algorithmic processes on short-form video platforms. By providing insights into the mechanics of content personalization, this work contributes to ongoing discussions about algorithm transparency and ethical considerations in digital content curation.

The research on auditing SFV platforms for the rabbit hole effect aligns with the overarching theme of the CoSEM program within the faculty of Technology, Policy and Management. By examining the intricacies of algorithmic recommendation systems on SFV platforms, the study intersects policy and technology within the context of a complex system. The objective is not only to understand the dynamics of these algorithms comprehensively but also to provide policymakers with a tool to analyze and understand the algorithms shaping our world. The results of this thesis can help platform developers optimize algorithms to reduce rabbit holes, guide policymakers in crafting regulations that promote transparency, and inform users on how to engage more critically with content.