Journal article
Empirical Analysis of Multi-Task Learning for Reducing Identity Bias in Toxic Comment Detection
Proceedings of the International AAAI Conference on Web and Social Media, Vol.14, pp.683-693
05/26/2020
DOI: 10.1609/icwsm.v14i1.7334
Abstract
With the recent rise of toxicity in online conversations on social media platforms, using modern machine learning algorithms for toxic comment detection has become a central focus of many online applications. Researchers and companies have developed a variety of models to identify toxicity in online conversations, reviews, or comments, with mixed success. However, many existing approaches have learned to incorrectly treat non-toxic comments that contain certain trigger words (e.g., gay, lesbian, black, muslim) as a potential source of toxicity. In this paper, we evaluate several state-of-the-art models with the specific focus of reducing model bias towards these commonly attacked identity groups. We propose a multi-task learning model with an attention layer that jointly learns to predict the toxicity of a comment as well as the identities present in the comment in order to reduce this bias. We then compare our model to an array of shallow and deep-learning models using metrics designed especially to test for unintended model bias within these identity groups.
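The joint objective described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the encoder, the form of the attention layer, or how the two losses are combined. The sketch assumes token vectors already produced by some shared encoder, a single learned query vector for attention pooling, sigmoid output heads, and a hypothetical `id_weight` that balances the identity loss against the toxicity loss. All names and values are illustrative.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def attention_pool(token_vectors, query):
    """Softmax-weighted average of token vectors (one simple form of attention)."""
    scores = [dot(h, query) for h in token_vectors]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(token_vectors[0])
    pooled = [sum(w * h[d] for w, h in zip(weights, token_vectors)) for d in range(dim)]
    return pooled, weights


def bce(p, y, eps=1e-12):
    """Binary cross-entropy for one probability and one 0/1 label."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


def multitask_forward(token_vectors, params):
    """Shared attention-pooled representation feeding two task heads."""
    pooled, weights = attention_pool(token_vectors, params["query"])
    tox_prob = sigmoid(dot(pooled, params["w_tox"]) + params["b_tox"])
    id_probs = [sigmoid(dot(pooled, w) + b)
                for w, b in zip(params["w_id"], params["b_id"])]
    return tox_prob, id_probs, weights


def joint_loss(tox_prob, id_probs, tox_label, id_labels, id_weight=0.5):
    """Toxicity loss plus a weighted identity loss (the weighting is an assumption)."""
    tox_loss = bce(tox_prob, tox_label)
    id_loss = sum(bce(p, y) for p, y in zip(id_probs, id_labels)) / len(id_labels)
    return tox_loss + id_weight * id_loss


# Toy data: 5 tokens, 4-dimensional vectors, 3 hypothetical identity labels.
rng = random.Random(0)
DIM, N_IDENTITIES = 4, 3


def rand_vec():
    return [rng.uniform(-1, 1) for _ in range(DIM)]


params = {
    "query": rand_vec(),
    "w_tox": rand_vec(),
    "b_tox": 0.0,
    "w_id": [rand_vec() for _ in range(N_IDENTITIES)],
    "b_id": [0.0] * N_IDENTITIES,
}
tokens = [rand_vec() for _ in range(5)]

# A non-toxic comment that mentions the first identity group.
tox_prob, id_probs, attn = multitask_forward(tokens, params)
loss = joint_loss(tox_prob, id_probs, tox_label=0, id_labels=[1, 0, 0])
```

Because both heads read the same pooled representation, gradients from the identity head would also shape the shared layers during training. That sharing is the mechanism a multi-task setup relies on, though whether it reduces identity bias in practice is what the paper evaluates empirically.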
Details
- Title: Subtitle
- Empirical Analysis of Multi-Task Learning for Reducing Identity Bias in Toxic Comment Detection
- Creators
- Ameya Vaidya; Feng Mai - Stevens Institute of Technology; Yue Ning - Stevens Institute of Technology
- Resource Type
- Journal article
- Publication Details
- Proceedings of the International AAAI Conference on Web and Social Media, Vol.14, pp.683-693
- DOI
- 10.1609/icwsm.v14i1.7334
- ISSN
- 2162-3449
- eISSN
- 2334-0770
- Number of pages
- 11
- Language
- English
- Date published
- 05/26/2020
- Academic Unit
- Business Analytics
- Record Identifier
- 9984701832502771