N. De Leeuw

1 record found

NLP and reinforcement learning to generate morally aligned text

How does explainable models perform compared to black-box models


This paper evaluates the performance of an automated explainable model, MoralStrength, to predict morality, or more precisely Moral Foundations Theory (MFT) traits. MFT is a way to represent and divide morality into precise and detailed traits. This evaluation happens in ...