NA

19 records found

Comparison of Cloud-to-Cloud Distance Calculation Methods

Is the Most Complex Always the Most Suitable?

Cloud-to-cloud (C2C) distance calculations are frequently performed as an initial stage in change detection and spatiotemporal analysis with point clouds. There are various methods for calculating C2C distance, also called inter-point distance, which refers to the distance betwee ...
The advantages of using point clouds for change detection analysis include comprehensive spatial and temporal representation, as well as high precision and accuracy in the calculations. These benefits make point clouds a powerful data type for spatio-temporal analysis. Neverthele ...

SALoBa

Maximizing Data Locality and Workload Balance for Fast Sequence Alignment on GPUs

Sequence alignment forms an important backbone in many sequencing applications. A commonly used strategy for sequence alignment is an approximate string matching with a two-dimensional dynamic programming approach. Although some prior work has been conducted on GPU acceleration o ...
Recent advances in DNA sequencing technology have opened new doors for scientists to use genomic data analysis in a variety of applications that directly affect human lives. However, the analysis of unprecedented volumes of sequencing data being produced represents a formidable c ...
The seeding heuristic is widely used in many DNA analysis applications to speed up the analysis time. In many applications, seeding takes a substantial amount of the total execution time. In this paper, we present an efficient GPU implementation for computing maximal exact matchi ...
Background: Immense improvements in sequencing technologies enable producing large amounts of high throughput and cost effective next-generation sequencing (NGS) data. This data needs to be processed efficiently for further downstream analyses. Computing systems need this large a ...

ArrowSAM

In-Memory Genomics Data Processing Using Apache Arrow

The rapidly growing size of genomics data bases, driven by advances in sequencing technologies, demands fast and cost-effective processing. However, processing this data creates many challenges, particularly in selecting appropriate algorithms and computing platforms. Computing s ...
BACKGROUND: In Overlap-Layout-Consensus (OLC) based de novo assembly, all reads must be compared with every other read to find overlaps. This makes the process rather slow and limits the practicality of using de novo assembly methods at a large scale in the field. Darwin is a fas ...

Correction to

GASAL2: A GPU accelerated sequence alignment library for high-Throughput NGS data (BMC Bioinformatics (2019) 20 (520) DOI: 10.1186/s12859-019-3086-9)

Following publication of the original article [1], the author requested changes to the figures 4, 7, 8, 9, 12 and 14 to align these with the text. The corrected figures are supplied below. The original article [1] has been corrected. [Typesetter, please insert new supplied figure ...

GASAL2

A GPU accelerated sequence alignment library for high-throughput NGS data

BACKGROUND: Due the computational complexity of sequence alignment algorithms, various accelerated solutions have been proposed to speedup this analysis. NVBIO is the only available GPU library that accelerates sequence alignment of high-throughput NGS data, but has limited perfo ...
Background: Pairwise sequence alignment is widely used in many biological tools and applications. Existing GPU accelerated implementations mainly focus on calculating optimal alignment score and omit identifying the optimal alignment itself. In GATK HaplotypeCaller (HC), the semi ...

SparkGA2

Production-quality memory-efficient Apache Spark based genome analysis framework

Due to the rapid decrease in the cost of NGS (Next Generation Sequencing), interest has increased in using data generated from NGS to diagnose genetic diseases. However, the data generated by NGS technology is usually in the order of hundreds of gigabytes per experiment, thus req ...
Convolutional Neural Networks (CNNs) are a class of widely used deep artificial neural networks. However, training large CNNs to produce state-of-the-art results can take a long time. In addition, we need to reduce compute time of the inference stage for trained networks to make ...
In order to improve the accuracy of indel detection, micro-assembly is used in multiple variant callers, such as the GATK HaplotypeCaller to reassemble reads in a specific region of the genome. Assembly is a computationally intensive process that causes runtime bottlenecks. In th ...
Sequence alignment is a core step in the processing of DNA and RNA sequencing data. In this paper, we present a high performance GPU accelerated library (GASAL) for pairwise sequence alignment of DNA and RNA sequences. The GASAL library provides accelerated kernels for local, glo ...
Much research has been dedicated to reducing the computational time associated with the analysis of genome data, which resulted in shifting the bottleneck from the time needed for the computational analysis part to the actual time needed for sequencing of DNA information. DNA seq ...
The large amount of data generated by NextGeneration Sequencing (NGS) technology, usually in the order of hundreds of gigabytes per experiment, has to be analyzed quickly to generate meaningful variant results. The first step
in analyzing such data is to map those sequenced r ...
DNA read alignment is a major step in genome analysis. However, as DNA reads continue to become longer, new approaches need to be developed to effectively use these longer reads in the alignment process. Modern aligners commonly use a two-step approach for read alignment: 1. seed ...
The fast decrease in cost of DNA sequencing has resulted in an enormous growth in available genome data, and hence led to an increasing demand for fast DNA analysis algorithms used for diagnostics of genetic disorders, such as cancer. One of the most computationally intensive ste ...