UNIVERSITY OF CALGARY
Network Driven Bio-Data Integration and Mining for Bio-Medical
Predictions
By
Ala Qabaja
A THESIS
SUBMITTED TO THE FACULTY OF GRADUATE STUDIES IN PARTIAL
FULFILMENT OF TH REQUIREMENTS FOR THE DEGREE OF
MASTER OF SCIENCE
DEPARTMENT OF COMPUTER SCIENCE
CALGARY, ALBERTA
July, 2013
© Ala Qabaja 2013
II
Abstract
Drug repositioning is increasingly attracting much attention from pharmaceutical
community to tackle the problem of long term development in drug discovery. The complex
nature of human diseases, for example cancer, poses major challenges in pharmaceutical
industry nowadays. With the increasing amount of research conducted to understand
associations between drugs and diseases, a new direction of research has come to light.
Thanks to the development of high-throughput technologies to generate tremendous
amount of data and to the web-based systems to store and organize the generated data,
drug repositioning has become cost and time effective.
Since biomolecular interactions and omics-data integration has had success in drug
development, we have been motivated to develop a new paradigm that integrates data
from three major sources to predict novel therapeutic drug indications. Microarray data,
biomedical text mining data and biomolecular interactions are all integrated to predict
ranked lists of genes based on their relevance to a particular drug’s or disease’s molecular
action. These ranked lists of genes are used as raw input for building a disease-drug
connectivity map based on enrichment statistical measure. This integrative paradigm was
able to report a sensitivity improvement of 18% and 26% in comparison with using text-
mining and microarray data, respectively, independently. In addition, this paradigm was
able to predict many clinically validated disease-drug associations that could not be
captured with using microarray or text mining data independently.
The robustness of the integrative paradigm has been further investigated to predict
functional miRNA-disease associations. In here, disease-gene associations from microarray
III
experiments and text mining together with miRNA-gene associations from computational
predictions and protein networks have been integrated to build miRNA-disease
associations. The findings of the proposed model were validated against gold standard
datasets using ROC analysis and results were promising (AUC = 0.81). The proposed
integrated approach allowed us to reconstruct functional associations between miRNAs
and human diseases and to uncover functional roles of newly discovered miRNAs
IV
Acknowledgements
I would like first to express my sincere gratitude and thanks to the best friend and the best
supervisor Prof. Reda Alhajj. I would like to thank you for introducing me to the field of
data mining, databases and bioinformatics. Without your academic and financial support, I
wouldn’t have been able to finish this thesis.
I would like to thank Dr. Moussavi and Dr. Rokne for spending time reviewing this
thesis.
I would like to thank Dr. Lisa Hiaghm for teaching me the concepts of algorithms and
their design and analysis, Dr. Moussavi for teaching me the concept of software
requirements gathering and engineering, Dr. Gordon Chua for teaching me the applications
of computer algorithms on real biological problems. Without knowing these concepts, it
would have been really hard for me to build this system.
I would like to thank all great friends who had their fingerprints helping me to build
this manuscript. Also I would like to thank those who were caring about me and about my
family. Thanks to Ali Rahmani, Atieh Sarraf, Fatemeh Keshavarsh, Shermin Bazazian, Thaer
Shunnar, Omar Zarour, Omar Addam, Mohammed Shalalfeh, Abdullah Al-Shiekh, Ala
Hamdan, Sara Aghakhani, Sarah Fakhoury, Derar Al-Asi, Abdelrahman Qiswani, Ala Kassab,
Maisa Barbarawi, Nour El-Masri and Tamer Jarada.
Special thanks to Dr. Jalal Kawash for his kind assistance
Finally my deepest love and thanks go to my family. Thanks to my greatest father,
(Mohammed), most amazing mother (Mazuza), sisters (Nadin, Suzan, Bayan and Doa) and
brothers (Abood and Mahmoud).
V
To my first and only love, Suha Janajreh
To my newborn boy, Jad Qabaja
My whole life is nothing without you
1 / 156 100%