The world has already been fundamentally transformed by data-driven intelligence, says Professor Longbing Cao, but we are only in the early stages. “If parallels are drawn with the evolution of the internet, the socio-economic and cultural impact of data science will be unprecedented,” he says.
Professor Cao is a professor of information technology in the Faculty of Engineering and IT at the University of Technology Sydney (UTS). He began promoting data science and analytics research, education and development in the early 2000s, while still working as a chief technology officer in business. He established Australia’s first Data Science Lab at UTS in 2007, followed by the highly regarded UTS Advanced Analytics Institute (AAi) in 2011.
For his global leadership in data science research, education and community building, and for the impact of that work on business and government decision making, Professor Cao was recognised with the prestigious Eureka Prize for Excellence in Data Science in 2019.
Professor Cao and his team made a very early commitment to tackling real-life big data analytics issues for large governmental and industrial organisations inside and outside Australia. For more than a decade now they have worked on challenges and opportunities in areas such as the capital markets, banking, transport, marketing, social networks, taxation, immigration, health, social welfare and education.
Professor Cao’s focus is on discovering actionable insights from data – insights that are unique to specific organisations or issues and which inform or recommend smarter decisions and actions. “It’s about the complex, real world,” he says.
This requires intelligent models and algorithms that consider the real-life characteristics and complexities hidden in data, he says. ‘Transformational’ insights are not something that can be achieved by businesses simply applying off-the-shelf programs, pre-trained tools or laboratory-based algorithms and models without significant modification, he argues.
“Our research aims for actionable intelligence and solutions that disclose intrinsic and intricate working mechanisms, interactions, structures, relations, hierarchies and dynamics,” he says. “These drive the problem formation, evolution and its consequences.”
It also requires strategic and systematic ‘data science thinking’ – something that takes data science from the merely technical to higher-level thinking, he says.
“In practice, the quality of data science work is highly dependent on the quality of data science thinking. It is data science thinking that differentiates a good data scientist from an average one, and a quality data science output from a bad or average one,” he writes in his recent book, Data Science Thinking (Springer, 2018).
The quality of data science relies on high-level intelligence, rich data science thinking and multidisciplinary knowledge and enterprise experience, Professor Cao says. “The success of data science also needs the meta-synthesis of data intelligence, behaviour intelligence, organisational intelligence, social intelligence and human intelligence. Without these, so-called data science is straightforward data analysis – not deep, systematic, personalised, highly dimensional and dynamic data science.”
The work done by Professor Cao and his team has produced significant business and socio-economic benefits.
The researchers have partnered with a number of federal government agencies, including the Australian Taxation Office (ATO), and with businesses such as SAS and Microsoft, as well as general and health insurers, banks and airlines.
In collaboration with the innovation lab and data science group of one major bank, the researchers were able to develop a ‘universal representation’ of all its customers – what Professor Cao calls an ‘enterprise data gene’. This representation was valid regardless of the business lines with which the customers were affiliated or their individual circumstances.
“This is the first time an enterprise-wide benchmark has been established that can be used as the foundation for saving low-level data manipulation siloed in specific units, and to prevent inconsistent benchmarking and evaluation,” Professor Cao says. This enterprise data gene is estimated to have saved millions of dollars for the bank in the area of credit card risk management alone.
In another project the team’s research contributed directly to the ability of the ATO to tackle multibillion-dollar tax debt associated with businesses and individuals overclaiming or being late with payments.
Modelling debtor behaviour and developing behaviour insights helped improve another government client’s processes, with one result being customised ‘early intervention’ SMS messages to debtors to help them keep up with payments. This work involved debt in the billions of dollars.
Another series of government data science innovations was developed to analyse, detect, predict and recover incorrect income reporting and declarations for social welfare services. This body of work represents one of the very first systematic explorations of data science for human services in the world.
The group’s machine learning expertise aided the development of a ‘smart trademark’ tool for IP Australia, while its pattern-based discovery and prediction algorithms have helped improve the accuracy of diagnosis and prognosis for a range of cancer types.
“The values in data are comprehensive,” Professor Cao says. “They can be monetary, such as recovered debt or saved overpayments; they can be social good, such as improving community services; they can be health-related, by alerting people to bad behaviours that will affect quality of life. Data science can also be widely used for environmental improvement, to enhance government policy, and in business applications such as smart manufacturing.”
“We are lucky to be living in the era of data science.”
Downloads
- Longbing Cao. Data Science Thinking: The Next Scientific, Technological and Economic Revolution, ISBN: 978-3-319-95092-1, Springer International Publishing, 2018.
- Longbing Cao. Data Science: A Comprehensive Overview. ACM Computing Surveys, 50(3): 43:1–42, 2017.
- Longbing Cao. Data Science: Profession and Education. IEEE Intelligent Systems, 34(5): 35–44, 2019
- Longbing Cao. Data Science: Challenges and Directions. Communications of the ACM, 60(8): 59–68, 2017.
Research team
- Professor, Advanced Analytics Institute
Faculty
- Faculty of Engineering and Information Technology
- UTS Advanced Analytics Institute
Funded by
- Australian Research Council Discovery and Linkage grants, and various industry and government partners