Tao Shen, Data Science and Knowledge Discovery Lab

SUPERVISOR: Distinguished Professor Chengqi Zhang

THESIS TOPIC: Knowledge Graph-Based Text Representation Learning for Natural Language Processing

Image: Tao Shen overlooking Sydney Harbour

What is the most rewarding aspect of your research?

Tackling natural language processing tasks starts with studying pre-processing toolkits and learning widely used algorithms. Building on the previous state-of-the-art machine learning algorithms for a specific task, I then make innovative, non-trivial improvements along several directions, including performance, efficiency, interpretability, practicability, and generalisation. These improvements are reflected in higher scores on academic benchmark datasets, and they also benefit a broad spectrum of industrial applications. My publications are listed in detail on my Google Scholar profile.

What are the real-world applications of your research?

Most natural language processing tasks are closely connected to real-world applications, such as machine translation and dialogue systems. My research relates most closely to practical applications involving real-world factoid or commonsense knowledge graphs. For example: given a curated knowledge graph that suffers from sparseness, how can it be auto-completed by predicting the missing links? Given a factoid question from a user, how can the backend knowledge graphs be used to derive the correct answer efficiently? And given a commonsense knowledge graph, how can a machine be equipped with the capability to reason?

What ideas do you have for future research?

In the future, we will focus on handling low-resource and imbalanced scenarios well, such as few-shot learning, AI's long-tail problem, and cross-lingual settings. How to reach a better trade-off between model scale and effectiveness has also recently become an open problem in natural language processing. Moreover, user privacy protection in machine learning is attracting tremendous attention from both academia and industry, so we will also consider federated learning, a decentralised technique.

Are you involved in collaborative research?

We have been collaborating with universities and corporations from all over the world, such as the University of Queensland, the University of Washington, and Microsoft. The collaborations mainly involve developing state-of-the-art machine learning algorithms for natural language processing.

What inspired you to undertake a PhD in computer science?

I majored in communication engineering as an undergraduate, and by chance I became involved in a project related to computer vision. I started to study the basics of computer science and joined a laboratory in the Faculty to learn advanced algorithms. After a period of study, I found I was very interested in computer science because its techniques solve real-world problems effectively and because of its promising future applications. I therefore chose to pursue further research in computer science by undertaking a PhD.
