Deep industry collaborations in nlPUG are driving transformative outcomes across a wide range of sectors and applications.

Our research
Unstructured text analysis for the Transport Accident Commission (TAC)
Using static and dynamic topic modelling, our team has helped predict the recovery trajectories of clients of the TAC, the Victorian Government’s accident compensation agency. We have also tackled the complexity of the TAC’s internal documentation, generating an informative taxonomy of its document collections and building a customised search facility. This collaborative industry project, carried out as part of a Cooperative Research Centre (CRC), has funded a PhD scholarship and a research associate position at UTS, and its deliverables have been used in the TAC’s broader framework for client analysis and needs prediction.
Multi-document summarisation for RoZetta Technology
As part of a close collaboration with Sydney-based data science company RoZetta Technology, our team is developing an automated multi-document summarisation tool that will generate informative and fluent summaries from clusters of related documents. The tool could be used, for instance, to generate real-time summaries of financial news at the beginning of a trading day. The project is funding a current PhD position and an adjunct researcher.
Named-entity recognition in Persian
Our researchers have developed a novel approach to named-entity recognition (NER) in Persian, in collaboration with Australian company Sintelix. The project funded a PhD scholarship, and the resultant software, also modified to enable Arabic NER, is now used by Sintelix. Because Persian has fewer annotated resources than other mainstream languages, one of the project’s key contributions has been the public release of the first NER-annotated Persian dataset. The project has also delivered four different sets of word embeddings, trained on unannotated corpora, for a comprehensive Persian dictionary of nearly 50,000 unique words.