Human-computer interaction has long been a focus of technological evolution; however, for such systems to reach their full potential, machines must recognize that humans are constantly influenced by emotions. Text affective content analysis models are one attempt to integrate human psychology into computers by detecting the emotion conveyed in written input. There are numerous approaches to implementing such systems, with supervised learning remaining popular. The challenge of creating textual affective datasets lies not in the availability of records, since data production, especially of text, has never been higher, but in ensuring the consistency of the annotations provided by humans when they are included in the process. This study conducts a systematic literature review focused on documenting published corpora. The annotation process, as well as trends in how it has evolved, is examined to identify dataset particularities. The ultimate intent is to lay the groundwork for a broader study analyzing the relationship between interrater agreement levels and the performance scores of models trained on these datasets. The relevant literature was retrieved from three search engines, Scopus, IEEE Xplore, and Web of Science, with a focus on manually labeled written records that are not part of multimodal systems, resulting in an analysis of 41 datasets. The aggregated results show that, when humans are recruited for this task, researchers tend to use multiple annotators and to calculate the degree of agreement between them to ensure the data's reliability before use. They are also inclined either to train the annotators before the procedure or to tailor the set of labels so as to increase the uniformity of the ratings. As a result, this paper highlights the variety of annotation process characteristics and points towards standardizing this task.
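As an illustration of the agreement calculation mentioned above (the specific coefficient varies across the reviewed corpora and is not fixed by this study), a widely used pairwise measure is Cohen's kappa,
\[
\kappa = \frac{p_o - p_e}{1 - p_e},
\]
where $p_o$ is the observed proportion of items on which two annotators assign the same label and $p_e$ is the agreement expected by chance given each annotator's label distribution. When more than two annotators rate the same records, generalizations such as Fleiss' kappa or Krippendorff's alpha are commonly reported instead.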