Our projects
Collection Digitisation with the Museum of Applied Arts and Sciences (MAAS), Sydney
The Perceptual Imaging Laboratory is excited to work together with MAAS and UTS's Mechanical and Mechatronic Engineering Laboratories to produce 3D models of some of the museum's wonderful collection. Digitising these works of art is a thrill, but what we are really excited about is the future of 3D scanning. Creating 3D scans is just the start; it's what the scans are used for that really shows the potential of this technology.
Imagine reconstructing a 3D model of a piece of ancient history lost in war and using 3D printing to bring it back from oblivion, or capturing the iridescent sheen of a butterfly's wing in detail indistinguishable to the naked eye. Imagine a child's excitement when virtual reality technology allows them to walk the streets of ancient Rome, stand on the shores of a methane lake on Titan, or feel Leonardo's brush strokes on a perfect copy of the Mona Lisa. Imagine unlocking information about the past by digitally restoring a damaged painting without risking harm to the original, removing centuries of grime from a statue and examining the sculptor's chisel marks, or viewing an artwork in a perfect reconstruction of its original home. Now imagine all of these things being possible for anybody, regardless of how remotely they live. These are just some of the exciting future directions of 3D scanning in which UTS is proud to play a part. For further details, please contact Prof Stuart Perry.
Intention Prediction Based on Multimodal Feature Fusion
Intent prediction is highly relevant across a wide array of fields, including autonomous driving, robot navigation, virtual reality, assistive technology, and game development, among others. Our research focuses on an approach to human intention prediction and trajectory prediction based on the fusion of multimodal features. This method has the potential to help computational systems understand and adapt to complex real-world situations, thus strengthening the interaction and collaboration between AI and humans. For further details, please contact A/Prof Min Xu.
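As a purely illustrative sketch of multimodal feature fusion (not the lab's actual model), the example below concatenates hypothetical visual and motion embeddings and predicts an intention class together with a short future trajectory; all modality names, layer sizes, and the prediction horizon are assumptions for illustration.

```python
import torch
import torch.nn as nn

class MultimodalIntentionPredictor(nn.Module):
    """Toy fusion model: encode each modality, concatenate, then predict an
    intention class and (x, y) trajectory offsets over a short horizon.
    All dimensions are illustrative assumptions, not the published method."""
    def __init__(self, visual_dim=512, motion_dim=64,
                 hidden_dim=256, num_intentions=5, horizon=12):
        super().__init__()
        self.horizon = horizon
        self.visual_enc = nn.Sequential(nn.Linear(visual_dim, hidden_dim), nn.ReLU())
        self.motion_enc = nn.Sequential(nn.Linear(motion_dim, hidden_dim), nn.ReLU())
        self.fusion = nn.Sequential(nn.Linear(2 * hidden_dim, hidden_dim), nn.ReLU())
        self.intention_head = nn.Linear(hidden_dim, num_intentions)
        self.trajectory_head = nn.Linear(hidden_dim, horizon * 2)

    def forward(self, visual_feat, motion_feat):
        # Fuse by concatenating the per-modality embeddings.
        fused = self.fusion(torch.cat(
            [self.visual_enc(visual_feat), self.motion_enc(motion_feat)], dim=-1))
        return (self.intention_head(fused),
                self.trajectory_head(fused).view(-1, self.horizon, 2))

# Example: a batch of 8 pre-extracted feature vectors per modality.
logits, trajectory = MultimodalIntentionPredictor()(torch.randn(8, 512), torch.randn(8, 64))
```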
Multimedia Affective Computing
Emotional factors usually affect users' preferences for and evaluations of images and videos. Understanding multimedia emotions makes intelligent human-multimedia interaction possible; therefore multimedia emotion analysis attracts increasing attention from both the research community and industry. However, major challenges remain. It is difficult to classify an image or a video into a single emotion type, since different image regions or video segments can represent different emotions. Moreover, developing representative features is challenging because there is a gap between low-level features and high-level emotions. For video affective computing, temporal features need to be considered alongside spatial features. In this research, we propose the concept of an affective map for pixel-level image emotion analysis and investigate spatial-temporal features for video emotion analysis. For further details, please contact A/Prof Min Xu.
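To make the idea of a pixel-level affective map concrete, here is a minimal, hypothetical sketch (not the published method): a small fully-convolutional network that outputs a per-pixel probability distribution over a handful of emotion classes.

```python
import torch
import torch.nn as nn

class AffectiveMapNet(nn.Module):
    """Hypothetical sketch: per-pixel emotion probabilities (an 'affective map').
    The architecture and number of emotion classes are assumptions."""
    def __init__(self, num_emotions=8):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU())
        self.classifier = nn.Conv2d(32, num_emotions, 1)

    def forward(self, image):
        # Output: (batch, num_emotions, H, W), softmax over emotions at each pixel.
        return torch.softmax(self.classifier(self.features(image)), dim=1)
```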
Visual Saliency Estimation
Saliency maps that integrate individual feature maps into a global measure of visual attention are widely used to estimate human gaze density. Most existing methods consider low-level visual features and the locations of objects, and/or emphasise spatial position with a centre prior. Recent psychology research suggests that emotions strongly influence human visual attention. In this research, we explore the influence of emotional content on visual attention. On top of traditional bottom-up saliency map generation, our saliency map incorporates three emotion factors: general emotional content, facial expression intensity, and emotional object locations. For further details, please contact A/Prof Min Xu.
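As a rough sketch of how such a combination might look (the weights and normalisation below are assumptions, not the lab's published formulation), one simple option is a weighted sum of the bottom-up map with the three emotion-related maps:

```python
import numpy as np

def emotion_aware_saliency(bottom_up, emotion_content, face_intensity,
                           emotion_objects, weights=(1.0, 0.5, 0.5, 0.5)):
    """Illustrative fusion of a bottom-up saliency map with three emotion maps
    (all HxW arrays in [0, 1]) by weighted sum; the weights are arbitrary here."""
    maps = (bottom_up, emotion_content, face_intensity, emotion_objects)
    combined = sum(w * m for w, m in zip(weights, maps))
    combined -= combined.min()
    return combined / (combined.max() + 1e-8)  # renormalise to [0, 1]
```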
Lightfield Imaging for Advanced 3D Scanning (joint project with Animal Logic Entertainment and the School of Optometry and Vision Science, UNSW)
Light-field cameras such as the Lytro ILLUM capture not only the intensity of light striking the sensor, but also the angle of incidence of the light. This information can be used to reconstruct depth information, change the focal point post-capture, and change the viewpoint post-capture. PILab is working to develop algorithms for 3D shape capture using light-field camera systems. In particular, we aim to develop 3D capture methods that can quantify and capture material appearance properties, such as colour changes with angle or gloss characteristics, by exploiting the additional information present in light-field images. For further details, please contact Prof Stuart Perry.
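For readers unfamiliar with light-field processing, the standard shift-and-add method for post-capture refocusing gives a feel for what the extra angular data enables. The sketch below is a textbook illustration (a greyscale 4D light field indexed as L[u, v, y, x] is assumed), not PILab's capture pipeline:

```python
import numpy as np
from scipy.ndimage import shift

def refocus(lightfield, alpha):
    """Shift-and-add refocusing of a 4D light field L[u, v, y, x].
    `alpha` sets the synthetic focal plane: each sub-aperture view is shifted
    in proportion to its offset from the central view, then all are averaged."""
    U, V, H, W = lightfield.shape
    cu, cv = (U - 1) / 2.0, (V - 1) / 2.0
    out = np.zeros((H, W))
    for u in range(U):
        for v in range(V):
            out += shift(lightfield[u, v],
                         (alpha * (u - cu), alpha * (v - cv)),
                         order=1, mode='nearest')
    return out / (U * V)
```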
Road Crack Detection and Classification using Digital Image Processing (joint project with Vietnam National University Hanoi)
Pavement crack detection is an essential problem in road maintenance. Despite the considerable work that has been done on this problem, the accuracy of current crack analysis methods in determining the required crack parameters, such as crack width, length and direction, is often inadequate. In fact, because there is no clear definition of what constitutes a crack, current methods sometimes erroneously identify other image features as cracks. The lack of high-quality ground truth data presents a particular impediment to overcoming the limitations of current methods. PILab is working on this project in conjunction with Vietnam National University Hanoi. For further details, please contact Prof Stuart Perry.
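To give a sense of the kind of baseline these methods improve on, a classical image-processing approach treats cracks as thin structures darker than their surroundings. The sketch below (kernel size and threshold are arbitrary assumptions) uses a black top-hat transform, and is a common baseline rather than the project's method:

```python
import cv2
import numpy as np

def crack_mask(gray, kernel_size=15, thresh=20):
    """Baseline crack detector: a black top-hat transform highlights pixels
    darker than their neighbourhood, then thresholding and a small opening
    remove speckle. Parameters here are illustrative only."""
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (kernel_size, kernel_size))
    blackhat = cv2.morphologyEx(gray, cv2.MORPH_BLACKHAT, kernel)
    _, mask = cv2.threshold(blackhat, thresh, 255, cv2.THRESH_BINARY)
    return cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((3, 3), np.uint8))
```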
Perceptually optimized plenoptic data representation and coding
A light field (LF) is a type of plenoptic data that has recently come into prominence. Conventional digital sensors record the amount of light arriving at each position, regardless of its direction. LF images open up a range of new possibilities for processing digital images. By capturing redundant information, i.e., light from several directions, LF cameras have three advantages over traditional cameras: 1) images with small viewpoint changes can be reconstructed, 2) a depth map can be computed using depth from focus, and 3) images can be refocused after capture. The higher the quality of the image, however, the greater the storage and transmission requirements. Hence, the need for effective data representation and coding methods that integrate human perception characteristics is more compelling than ever. In this context, we aim to address two major challenges of LF data: LF visual representation and coding. LF visual representation considers the quality of display on different application devices, while LF coding mostly considers compression performance for an efficient video framework. PILab is working on this project in conjunction with Vietnam National University Hanoi. For further details, please contact Prof Stuart Perry.
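One common representation strategy in the LF coding literature is to order the sub-aperture views into a pseudo-video sequence so that an ordinary video codec can exploit inter-view redundancy. The sketch below shows a simple serpentine ordering; it illustrates the general idea only, not the coding scheme under development:

```python
import numpy as np

def subaperture_pseudo_video(lightfield):
    """Arrange the (u, v) sub-aperture views of a light field L[u, v, y, x]
    into a serpentine sequence of frames for a conventional video codec."""
    U, V = lightfield.shape[:2]
    frames = []
    for u in range(U):
        order = range(V) if u % 2 == 0 else reversed(range(V))
        for v in order:
            frames.append(lightfield[u, v])
    return np.stack(frames)  # shape: (U * V, H, W)
```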
Light field image reconstruction and synthetic light field image generation
Light field imaging technology captures not only pixel intensity but also the direction of the incident light. This additional dimensionality allows the generation of images at different focal lengths and with extended depth of field, and allows for more flexible image manipulation. We explore different techniques for light field reconstruction and synthesis, with the aim of understanding the most efficient and accurate way to synthesise light field images. Synthetic light field image generation uses focal stack images or depth information to predict the perspective views of a scene and synthesise a light field image. Both of these techniques could change the way we perceive and render traditional photography. For further details, please contact A/Prof Eva Cheng.
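A very rough sketch of the depth-based route is shown below: a single view is forward-warped to a neighbouring viewpoint using a per-pixel disparity map. The sign convention, and the absence of occlusion handling and hole filling, are simplifying assumptions for illustration only:

```python
import numpy as np

def warp_view(centre_view, disparity, du, dv):
    """Forward-warp a greyscale centre view to viewpoint offset (du, dv) using
    a per-pixel disparity map; an illustrative sketch with no hole filling."""
    H, W = centre_view.shape
    out = np.zeros_like(centre_view)
    ys, xs = np.mgrid[0:H, 0:W]
    # Pixels move in proportion to their disparity and the view offset.
    new_y = np.clip(np.round(ys + dv * disparity).astype(int), 0, H - 1)
    new_x = np.clip(np.round(xs + du * disparity).astype(int), 0, W - 1)
    out[new_y, new_x] = centre_view
    return out
```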
Classification of Glaucoma in 3D Optical Coherence Tomography (joint project with the Optical Imaging & Visualization Laboratory (OIVL), School of Optometry and Vision Science, UNSW)
PILab is working with the School of Optometry and Vision Science at the University of New South Wales on the computer-aided detection of glaucoma in Optical Coherence Tomography imagery. It is hoped that this collaboration can produce technologies that help diagnose problems for patients earlier and allow for more effective treatment of this condition. For further details, please contact Prof Stuart Perry.
OCT Image of a Human Retina
Learning-based point cloud compression
PILab is working in the context of the JPEG standardisation group within the International Organization for Standardization (ISO) to develop new standards for point cloud compression based on innovative deep-learning technologies. For further details, please contact Prof Stuart Perry.
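As a loose illustration of what "learning-based" means here (this is a toy sketch, not the JPEG codec design), a point-cloud autoencoder compresses a set of xyz points into a small latent code and reconstructs the points from it; a real codec would add quantisation, entropy coding, and rate-distortion training:

```python
import torch
import torch.nn as nn

class PointCloudAutoencoder(nn.Module):
    """Toy learned point-cloud codec: a PointNet-style encoder pools per-point
    features into one latent vector (the 'compressed' code); a decoder maps it
    back to N points. All sizes are illustrative assumptions."""
    def __init__(self, num_points=1024, latent_dim=128):
        super().__init__()
        self.num_points = num_points
        self.encoder = nn.Sequential(
            nn.Conv1d(3, 64, 1), nn.ReLU(),
            nn.Conv1d(64, latent_dim, 1))
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ReLU(),
            nn.Linear(256, num_points * 3))

    def forward(self, points):                        # points: (batch, N, 3)
        feats = self.encoder(points.transpose(1, 2))  # (batch, latent, N)
        latent = feats.max(dim=2).values              # order-invariant pooling
        return latent, self.decoder(latent).view(-1, self.num_points, 3)
```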
Semantic-Aware Simultaneous Localization and Mapping System for Unmanned Aerial Vehicles Using Deep Learning
The golden age of Unmanned Aerial Vehicle (UAV) technology is upon us and promises to build on the industry's evolution and success over the past year. Enterprise drone use has increased rapidly, with the construction, mining, agriculture, surveying, and real estate sectors leading the way. Drones are starting to become an indispensable part of our economic infrastructure. To solve many tasks automatically and efficiently, UAVs need a map of the environment. The ability to perform Simultaneous Localization and Mapping (SLAM) on mobile platforms allows the design of systems that can operate in complex environments based only on their onboard sensors, without relying entirely on an external reference system such as GPS. To deal with challenging conditions (e.g., fast movement, perceptual aliasing, or dynamic environments), modern SLAM algorithms exploit advances in 3D vision and deep learning. These approaches provide better performance and high-level semantic labels, which both increase the robustness of SLAM and enrich the capabilities of downstream applications.
Given the vital role of SLAM in UAV systems, PILab is working on a novel visual SLAM method that uses deep learning to simultaneously label the semantics of 3D visual data. The robustness of the proposed system rests on two aspects: (1) deep-learning-based image analysis improves stability in challenging environments, and (2) an efficient high-level semantic map representation allows the system to be integrated on smaller UAV platforms. PILab is working on this project in conjunction with Vietnam National University Hanoi. For further details, please contact Prof Stuart Perry.
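The skeleton below sketches, under heavy simplification, how these two aspects fit together in one SLAM iteration: a tracking front end updates the pose, a segmentation network labels the frame, and a labelled landmark is inserted into a compact semantic map. The `track` and `segment` callables are hypothetical stand-ins, not components of the actual system:

```python
from dataclasses import dataclass, field
import numpy as np

@dataclass
class SemanticMap:
    """Minimal semantic map: 3D points, each carrying a class label."""
    points: list = field(default_factory=list)
    labels: list = field(default_factory=list)

    def insert(self, xyz, label):
        self.points.append(xyz)
        self.labels.append(label)

def semantic_slam_step(frame, pose, track, segment, semantic_map):
    """One illustrative iteration: update the pose, label the frame, and add a
    labelled landmark at the new position. A real system would triangulate
    features rather than store a single dominant label per frame."""
    new_pose = track(frame, pose)      # visual-odometry stand-in
    label_image = segment(frame)       # deep segmentation stand-in
    dominant = int(np.bincount(label_image.ravel()).argmax())
    semantic_map.insert(tuple(new_pose[:3]), dominant)
    return new_pose
```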
VR Technology for Empathy Creation
Diverse groups (e.g. gender diverse people, disabled people, cultural minorities, and more) face various difficulties because technology and societal systems are unsuited to and non-inclusive of their needs. This is a result of a lack of understanding, awareness, and empathy among technology designers, policy makers, and service providers regarding these issues. There is currently no truly compelling, accessible way to enable people to "experience" the perspective of these diverse groups. Virtual Reality (VR) can provide immersive experiences; however, there is no guidance on how VR can be used to address these issues, and it is not currently used extensively as a tool to enable human-centred design. PILab is working on the use of VR technologies to present users with immersive experiences that allow them to experience the problems and challenges faced by under-privileged groups. These experiences would be designed in conjunction with experts in Equity, Diversity and Inclusion, and created using actors, 360-degree video capture technology, and principles of effective user interface design and human-computer interaction. PILab is working on this project in conjunction with Vividhata. For further details, please contact Prof Stuart Perry or A/Prof Arti Agrawal.
Physically-based simulation of 3D materials
PILab is working with the UTS School of Biomedical Engineering as well as the University of Genova on modelling 3D semi-fluids. This has a variety of applications including bioprinting and computational design. For further details, please contact A/Prof Nico Pietroni.
Other PILab projects include:
Monte Carlo simulation and modelling of scatter in high-speed CT imaging systems
For further details, please contact Dr Daniel Franklin.
Nanocomposite materials for low-cost, high-sensitivity PET systems
For further details, please contact Dr Daniel Franklin.
Real-time 3D tracking of compact radiation sources in a scattering medium
For further details, please contact Dr Daniel Franklin.
Scatter in long axial field-of-view PET scanners
For further details, please contact Dr Daniel Franklin.
Immersive VR Experiences for Empathy Generation
Extended Reality (XR), encompassing Virtual, Augmented and Mixed Reality technologies, continues to grow and find application in society. Up to now, this technology has been applied primarily in commercial or entertainment settings. However, it also has great potential for supporting social justice and Equity, Diversity and Inclusion.
PILab has been working with our partner Vividhata Pty Ltd (a social enterprise) to develop virtual reality experiences that explore and raise awareness of how inadvertent gender and cultural exclusion, harassment and discrimination create long-term problems for individuals. By providing 360-degree immersion through VR headsets, viewers gain a deeper experience of the issues that various groups face on a daily basis. When this is combined with interactive elements to facilitate the Equity, Diversity and Inclusion training provided by Vividhata, there is great potential for the development of effective VR-based training. This can in turn make a huge difference for culturally diverse groups and for organisations that rely on the increasingly diverse workforces of the 21st century.
Below is a recording of a viewer interacting with an interactive, immersive 360-degree bystander-training experience, set around observing harassment in a public transport setting, created by students at UTS for the iMOVE CRC to showcase the ability of VR to create immersion and empathy. (The scenario was filmed using student and staff actors - nobody was actually harassed in real life!) Please do not view this experience if you are likely to feel uncomfortable or triggered by this scenario.
Interactive VR experience on the topic of harassment on public transport
Behind the scenes look at the creation and goals of the prototype VR experience
PILab VR for Empathy Partners: