One Step Closer to Genomics Empowered Healthcare: AAII in Collaboration with 23Strands
Genome and Gene Expression Data Analysis
Milestone 1: BiblioGene Miner
Taking individual genetic variability into account is a new frontier in modern medical research. Knowing the genetic basis of a disease can substantially improve risk assessment, diagnostics, and treatment strategies. To support this, AAII researchers, in collaboration with 23Strands, have created BiblioGene Miner.
BiblioGene Miner is text mining software that aims to identify latent features of genes from scientific papers. Taking raw scientific papers on the target disease as input, BiblioGene Miner can 1) recognise the names of genes and other bioentities (diseases, chemicals, genetic variants), 2) produce a gene similarity matrix based on quantitative evidence from the papers and biomedical databases, and 3) profile each gene’s importance and specificity to the target disease. In this way, we can leverage the massive volume of data contained in scholarly papers and encode latent gene features to support subsequent heterogeneous bioentity graph representation.
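To make step 2 concrete, here is a minimal sketch of one way a gene similarity matrix could be derived from the literature. It assumes that gene names have already been recognised in each paper and that similarity is measured as the cosine between paper co-occurrence vectors. Both are illustrative assumptions, not a description of how BiblioGene Miner computes similarity. The gene names and papers are hypothetical placeholders.

```python
from math import sqrt

# Hypothetical toy input: each paper is reduced to the set of gene names
# recognised in it (entity recognition is assumed to have run already).
papers = [
    {"GENE_A", "GENE_B"},
    {"GENE_A", "GENE_B", "GENE_C"},
    {"GENE_A", "GENE_C"},
    {"GENE_B"},
]


def gene_vectors(papers):
    """Map each gene to a binary vector marking the papers that mention it."""
    genes = sorted(set().union(*papers))
    return {g: [1 if g in p else 0 for p in papers] for g in genes}


def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def similarity_matrix(papers):
    """Return the gene order and a symmetric gene-by-gene similarity matrix."""
    vecs = gene_vectors(papers)
    genes = list(vecs)
    matrix = [[cosine(vecs[a], vecs[b]) for b in genes] for a in genes]
    return genes, matrix


genes, matrix = similarity_matrix(papers)
for gene, row in zip(genes, matrix):
    print(gene, [round(x, 3) for x in row])
```

Genes that tend to appear in the same papers receive a score close to 1, and genes that never appear together receive 0. A real pipeline would likely weight evidence by more than raw co-occurrence, for example by also drawing on biomedical databases as described above.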
Pilot Case Study: Atrial fibrillation
Atrial fibrillation (AF) is one of the most common forms of cardiac arrhythmia. Disease progression is closely related to atrial size and the extent of atrial fibrosis, both of which are affected by genetic factors. Several gene groups and genetic mutations have been linked to AF, though the evidence is far from sufficient to begin integrating this knowledge into clinical practice. The framework we used in this comprehensive pilot case study exploits the literature to capture known associations between biomedical entities and diseases, then combines the results with network analytics and link prediction to identify both core and potentially emerging genetic factors. The results indicate that our strategy is a promising solution for genetic factor analysis and prediction: it successfully identified some key biomedical entities associated with AF. Read the full paper.
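As an illustration of how link prediction can surface potentially emerging factors, the sketch below scores unconnected pairs in a small entity network by the Jaccard coefficient of their neighbour sets. The choice of Jaccard scoring is an assumption made for illustration. The case study's actual link prediction method is not specified here, and the network is a hypothetical toy example.

```python
from itertools import combinations

# Hypothetical toy network: an edge means two entities are associated
# in the literature. Node names are placeholders, not study results.
edges = [
    ("AF", "GENE_A"),
    ("AF", "GENE_B"),
    ("GENE_A", "GENE_C"),
    ("GENE_B", "GENE_C"),
    ("GENE_C", "GENE_D"),
]


def build_adjacency(edges):
    """Build an undirected adjacency map: node -> set of neighbours."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def jaccard_scores(adj):
    """Score each unconnected pair that shares at least one neighbour."""
    scores = {}
    for u, v in combinations(sorted(adj), 2):
        if v in adj[u]:
            continue  # already linked, nothing to predict
        common = adj[u] & adj[v]
        if common:
            scores[(u, v)] = len(common) / len(adj[u] | adj[v])
    return scores


adj = build_adjacency(edges)
scores = jaccard_scores(adj)
ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
for pair, score in ranked:
    print(pair, round(score, 3))
```

In this toy network, GENE_C is not directly linked to AF but shares two neighbours with it, so the pair receives a relatively high score. In a literature-derived network, a pair like this would be a candidate for an association that has not yet been reported directly.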
Pipeline and Case Study on Stroke
In a recent paper, we systematically introduced the updated BiblioGene Miner pipeline (also called BiblioEngine) and performed an additional case study on stroke. Stroke is a prevalent cardiovascular disease with detrimental implications for individuals and public health, yet its complex genetic landscape remains incompletely understood. The case study maps the research intelligence and knowledge landscape of the field, identifies three groups of frequently studied genes, and reveals genes that merit further investigation because they show potential despite limited current evidence. Read the full paper (forthcoming).
Milestone 2: GGE Analysis
Genome and gene expression (GGE) data promises to pave the way for human whole-genome sequencing in medical settings. Using AI-empowered techniques, we have pinpointed crucial associations between genes and health characteristics (e.g. diseases and treatments). AAII’s research team, in partnership with 23Strands, has successfully identified GGE data from a vast amount of scientific literature. As an initial step towards the ultimate knowledge cohort, the team developed and applied a series of intelligent bibliometrics-enhanced models.
Unlocking the knowledge about associations between GGE and health characteristics that is contained in the massive body of biomedical research articles can supplement human whole-genome sequencing data. By distilling and transferring these identified associations and genetic biomarkers, we are a step closer to integrating them into treatment regimens. This progress in our understanding of the link between genes and health characteristics is a milestone towards revolutionising existing approaches to healthcare.