
18 records found

Motivation. DNA molecules mutate thousands of times every day. Some mutations are harmful to human cells, and may lead to the loss of function in important genes involved in DNA damage repair (DDR) mechanisms. Diseases such as tumors can exploit mutations in important, dri ...
Targeted and successful cellular therapies for disease treatment require an extensive mapping of the complex structure and dynamics of the molecular mechanisms which determine the behaviour and function of the cell. CELL-seq is a genome-wide screening procedure measuring specific and tar ...

Assessing Machine Learning Robustness to Sample Selection Bias

Evaluating the effectiveness of semi-supervised learning techniques

This paper tackles the problem of sample selection bias in machine learning, where the assumption of train and test sets being drawn from the same distribution is often violated. Existing solutions in domain adaptation, such as semi-supervised learning techniques, aim to correct ...
Sample selection bias occurs when the selected samples in a subset of the original data set follow a different distribution than the samples from the original data set. This type of bias in the training set could result in a classifier being unable to predict samples from a testi ...
Importance weighting is a class of domain adaptation techniques for machine learning, which aims to correct the discrepancy in distribution between the train and test datasets, often caused by sample selection bias. In doing so, it frequently uses unlabeled data from the test set ...
Sample selection bias is a well-known problem in machine learning, where the source and target data distributions differ, leading to biased predictions and difficulties in generalization. This bias presents significant challenges for modern machine learning algorithms. To tackle ...
Domain adaptation allows machine learning models to perform well in a domain that is different from the available train data. This non-trivial task is approached in many ways and often relies on assumptions about the source (train) and target (test) domains. Unsupervised domain a ...
Synthetic lethality (SL) is a relationship between two genes, exploited for targeted anti-cancer therapy, whereby functional loss of both genes induces cell death, but the functional loss of either gene alone is non-lethal. Computational prediction of SL gene pairs is sought afte ...
Double-strand break (DSB) repair is a critical cellular process which repairs breaks in both strands of the DNA double helix. Different repair mechanisms are tasked with repairing such breaks. Predicting deficiencies in repair mechanisms has been widely used for therapeutic purpo ...
The inclusion of intronic reads in the downstream analysis of RNA-sequencing (RNA-seq) data has long been controversial. Recent studies show that intronic reads do contain relevant biological signal. Additionally, studies have discovered differential expression unique to intronic ...
Genomics has revolutionized our understanding of evolution, hereditary diseases, and more. The advent of long-read DNA sequencers, e.g. Oxford Nanopore Technologies' innovations, has opened up many new research possibilities in genomics. These sequencers produce significantly longer DNA ...
Motivation: Many tumors show deficiencies in DNA damage repair. These deficiencies can play a role in the disease, but also expose vulnerabilities with therapeutic potential. Targeted treatments exploit specific repair deficiencies, for instance based on synthetic lethality. To d ...

Attention-based deep learning for DNA repair outcome prediction

Learning how the cell repairs DNA breaks using local sequence context

Recent advancements in quantification of repair outcomes of CRISPR-Cas9 mediated double-stranded DNA breaks (DSBs) have allowed for the use of machine learning for predicting the frequencies of these repair outcomes. Local DNA sequence context influences the frequencies of mutati ...
Motivation: As one of the most common and life-threatening diseases in humans, cancer is a result of the accumulation of somatic mutations throughout the life cycle. Somatic mutation is the joint result of DNA lesions, which arise from damage to DNA caused by mutagens, and the fa ...

Explainable Survival Analysis

For Urothelial Cancer

Survival analysis is a statistical method used to predict when an event will occur. Machine learning survival models have been used in many cancer studies. However, machine learning models may not always be interpretable. The current lack of research for explainable survival anal ...
Double-strand breaks are lesions to the DNA and can be fatal for cells. Therefore, these breaks are repaired, primarily by one of three major repair pathways. Two of these pathways are non-homologous end-joining (NHEJ) and theta-mediated end-joining (TMEJ). These pathways leav ...

Machine Learning of Synthetic Lethality

Data Integration, Generalisation, and Selection Bias

Synthetic lethality (SL) arises between two genes when loss of function of both genes would lead cells to become inviable. This can be exploited for therapy, where a drug is used to selectively kill diseased cells by perturbing one gene of an SL pair where the other gene is inact ...
Due to their altered genetic context, cancer cells can become dependent on specific genes for their survival. Such cancer-specific dependencies may represent promising therapeutic targets. However, knowledge of which molecular features of cancer cells induce specific dependencies ...