"Aligning Language Agents' Behaviours with Human Moral Norms" | Seminar 4
Topic: Aligning Language Agents' Behaviours with Human Moral Norms
Abstract: Language agents, designed to interact with their environment and achieve goals through natural language, traditionally rely on Reinforcement Learning (RL). The emergence of Large Language Models (LLMs) has expanded their capabilities, offering greater autonomy and adaptability. However, little attention has been paid to the morality of these agents. RL agents are often trained to pursue specific goals while neglecting moral consequences, and LLMs may absorb biases from their training data, which could lead to immoral behaviours in practical applications. This presentation introduces our latest research endeavours focused on enhancing both the task performance and the ethical conduct of language agents engaged in complex interactive tasks.
For RL agents, we use text-based games as a simulation environment, mirroring real-world complexities with embedded moral dilemmas. Our objective thus extends beyond improving game performance to developing agents that exhibit moral behaviour. We first develop a novel algorithm that boosts the moral reasoning of RL agents using a moral-aware learning module, enabling adaptive learning of both task execution and ethical behaviour. Given the implicit nature of morality, we further integrate a cost-effective human-in-the-loop strategy to guide RL agents toward moral decision-making. This method substantially reduces the amount of human feedback required, demonstrating that minimal human input can both enhance task performance and reduce immoral behaviour. A rough sketch of the kind of objective such an approach induces appears below.
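The following minimal sketch illustrates one way a moral-aware module could shape an RL agent's reward and how a human might be queried only when the module is uncertain. All names, the linear penalty, and the uncertainty gate are assumptions for illustration, not the algorithm presented in the talk.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Transition:
    observation: str    # text description of the current game state
    action: str         # natural-language action the agent issued
    task_reward: float  # reward returned by the game engine
    moral_cost: float   # harm score from a (hypothetical) moral-aware module, >= 0

def shaped_reward(t: Transition, moral_weight: float = 1.0) -> float:
    """Combine the task reward with a morality penalty, so the policy is
    pushed away from immoral actions while still optimising task progress."""
    return t.task_reward - moral_weight * t.moral_cost

def moral_cost_with_human(t: Transition, model_cost: float, uncertain: bool,
                          ask_human: Callable[[str, str], float]) -> float:
    """Query a human annotator only when the module is uncertain, keeping
    the amount of human feedback small (the spirit of a cost-effective
    human-in-the-loop strategy, not its actual implementation)."""
    return ask_human(t.observation, t.action) if uncertain else model_cost
```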
Shifting focus to LLM agents, we begin with a comprehensive review of morality in LLM research, examining their performance on moral tasks, alignment strategies for incorporating moral values, and the evaluation metrics provided by existing datasets and benchmarks. We then explore how LLM agents can improve their moral decision-making through reflection. Our experiments, conducted within text-based games, show that integrating reflection enables LLM agents to make more ethical decisions when confronted with moral dilemmas.
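As a rough picture of what reflection-based decision-making can look like in a text-based game, the sketch below has the model choose an action, critique its own choice on moral grounds, and then decide again. The prompts and the single revision pass are illustrative assumptions, not the method presented in the talk.

```python
from typing import Callable

LLM = Callable[[str], str]  # any text-in, text-out language model

def act_with_reflection(llm: LLM, observation: str, candidates: list[str]) -> str:
    """Choose an action, reflect on its moral implications, then revise once."""
    menu = "\n".join(candidates)
    # First pass: pick an action purely on task grounds.
    first = llm(f"Game state:\n{observation}\nPossible actions:\n{menu}\n"
                "Choose the best action.")
    # Reflection: ask the model to critique its own choice on moral grounds.
    critique = llm(f"You chose: {first}\nCould this action harm anyone or "
                   "violate a moral norm? Answer briefly.")
    # Revision: decide again with the critique in context.
    return llm(f"Game state:\n{observation}\nPossible actions:\n{menu}\n"
               f"Earlier choice: {first}\nReflection: {critique}\n"
               "Choose the final action, preferring a morally acceptable one.")
```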
Speaker: Prof Ling Chen (Australian Artificial Intelligence Institute, UTS)
Ling Chen is a Professor in the School of Computer Science at the University of Technology Sydney, where she is the Deputy Head of School (Research). She also leads the Data Science and Knowledge Discovery Laboratory (the DSKD Lab) within the Australian Artificial Intelligence Institute (AAII) at UTS. Ling has worked in machine learning and data mining for 20 years. Her recent research interests include anomaly detection, data representation learning, and dialogue and interactive systems. Her research has gained recognition from both government and industry: she has received multiple competitive grants from the Australian Research Council (ARC), along with gift and contract research support from partners such as Facebook Research and TPG Telecom. Ling serves on the editorial boards of journals including the IEEE Journal of Social Computing, the Elsevier journal Data and Knowledge Engineering, and Computer Standards and Interfaces.