Conference proceeding
Mitigating Adversarial Norm Training with Moral Axioms
Proceedings of the ... AAAI Conference on Artificial Intelligence, Vol.37(10), pp.11882-11889
AAAI Conference on Artificial Intelligence
06/27/2023
DOI: 10.1609/aaai.v37i10.26402
Abstract
This paper addresses the issue of adversarial attacks on ethical AI systems. We investigate using moral axioms and rules of deontic logic in a norm learning framework to mitigate adversarial norm training. This model of moral intuition and construction provides AI systems with moral guard rails yet still allows for learning conventions. We evaluate our approach with a questionnaire that draws inspiration from a study commonly used in moral development research. The questionnaire tests an agent's ability to reason to moral conclusions despite opposed testimony. Our findings suggest that our model can still correctly evaluate moral situations and learn conventions in an adversarial training environment. We conclude that adding axiomatic moral prohibitions and deontic inference rules to a norm learning model makes it less vulnerable to adversarial attacks.
Details
- Title: Subtitle
- Mitigating Adversarial Norm Training with Moral Axioms
- Creators
- Taylor Olson - Northwestern University; Kenneth D. Forbus - Northwestern University
- Contributors
- B Williams (Editor); Y Chen (Editor); J Neville (Editor)
- Resource Type
- Conference proceeding
- Publication Details
- Proceedings of the ... AAAI Conference on Artificial Intelligence, Vol.37(10), pp.11882-11889
- Series
- AAAI Conference on Artificial Intelligence
- DOI
- 10.1609/aaai.v37i10.26402
- ISSN
- 2159-5399
- eISSN
- 2374-3468
- Publisher
- Association for the Advancement of Artificial Intelligence
- Number of pages
- 8
- Grant note
- FA9550-20-1-0091 / Air Force Office of Scientific Research; United States Department of Defense; Air Force Office of Scientific Research (AFOSR)
- Language
- English
- Date published
- 06/27/2023
- Academic Unit
- Computer Science
- Record Identifier
- 9984948140302771