NA

20 records found

Authored

The advantages of using point clouds for change detection analysis include comprehensive spatial and temporal representation, as well as high precision and accuracy in the calculations. These benefits make point clouds a powerful data type for spatio-temporal analysis. Neverthele ...

Comparison of Cloud-to-Cloud Distance Calculation Methods

Is the Most Complex Always the Most Suitable?

Cloud-to-cloud (C2C) distance calculations are frequently performed as an initial stage in change detection and spatiotemporal analysis with point clouds. There are various methods for calculating C2C distance, also called inter-point distance, which refers to the distance betwee ...

SALoBa

Maximizing Data Locality and Workload Balance for Fast Sequence Alignment on GPUs

Sequence alignment forms an important backbone in many sequencing applications. A commonly used strategy for sequence alignment is an approximate string matching with a two-dimensional dynamic programming approach. Although some prior work has been conducted on GPU acceleration o ...

Background: Immense improvements in sequencing technologies enable producing large amounts of high throughput and cost effective next-generation sequencing (NGS) data. This data needs to be processed efficiently for further downstream analyses. Computing systems need this larg ...

The seeding heuristic is widely used in many DNA analysis applications to speed up the analysis time. In many applications, seeding takes a substantial amount of the total execution time. In this paper, we present an efficient GPU implementation for computing maximal exact matchi ...
Recent advances in DNA sequencing technology have opened new doors for scientists to use genomic data analysis in a variety of applications that directly affect human lives. However, the analysis of unprecedented volumes of sequencing data being produced represents a formidable c ...

ArrowSAM

In-Memory Genomics Data Processing Using Apache Arrow

The rapidly growing size of genomics data bases, driven by advances in sequencing technologies, demands fast and cost-effective processing. However, processing this data creates many challenges, particularly in selecting appropriate algorithms and computing platforms. Computin ...

BACKGROUND: In Overlap-Layout-Consensus (OLC) based de novo assembly, all reads must be compared with every other read to find overlaps. This makes the process rather slow and limits the practicality of using de novo assembly methods at a large scale in the field. Darwin is a ...

Convolutional Neural Networks (CNNs) are a class of widely used deep artificial neural networks. However, training large CNNs to produce state-of-the-art results can take a long time. In addition, we need to reduce compute time of the inference stage for trained networks to ma ...

Background: Pairwise sequence alignment is widely used in many biological tools and applications. Existing GPU accelerated implementations mainly focus on calculating optimal alignment score and omit identifying the optimal alignment itself. In GATK HaplotypeCaller (HC), the semi ...

SparkGA2

Production-quality memory-efficient Apache Spark based genome analysis framework

Due to the rapid decrease in the cost of NGS (Next Generation Sequencing), interest has increased in using data generated from NGS to diagnose genetic diseases. However, the data generated by NGS technology is usually in the order of hundreds of gigabytes per experiment, thus ...

GASAL2

A GPU accelerated sequence alignment library for high-throughput NGS data

BACKGROUND: Due the computational complexity of sequence alignment algorithms, various accelerated solutions have been proposed to speedup this analysis. NVBIO is the only available GPU library that accelerates sequence alignment of high-throughput NGS data, but has limited pe ...

Correction to

GASAL2: A GPU accelerated sequence alignment library for high-Throughput NGS data (BMC Bioinformatics (2019) 20 (520) DOI: 10.1186/s12859-019-3086-9)

Following publication of the original article [1], the author requested changes to the figures 4, 7, 8, 9, 12 and 14 to align these with the text. The corrected figures are supplied below. The original article [1] has been corrected. [Typesetter, please insert new supplied fig ...

In order to improve the accuracy of indel detection, micro-assembly is used in multiple variant callers, such as the GATK HaplotypeCaller to reassemble reads in a specific region of the genome. Assembly is a computationally intensive process that causes runtime bottlenecks. In th ...
The large amount of data generated by NextGeneration Sequencing (NGS) technology, usually in the order of hundreds of gigabytes per experiment, has to be analyzed quickly to generate meaningful variant results. The first step
in analyzing such data is to map those sequenced r ...
Much research has been dedicated to reducing the computational time associated with the analysis of genome data, which resulted in shifting the bottleneck from the time needed for the computational analysis part to the actual time needed for sequencing of DNA information. DNA seq ...
Sequence alignment is a core step in the processing of DNA and RNA sequencing data. In this paper, we present a high performance GPU accelerated library (GASAL) for pairwise sequence alignment of DNA and RNA sequences. The GASAL library provides accelerated kernels for local, glo ...
The fast decrease in cost of DNA sequencing has resulted in an enormous growth in available genome data, and hence led to an increasing demand for fast DNA analysis algorithms used for diagnostics of genetic disorders, such as cancer. One of the most computationally intensive ste ...
DNA read alignment is a major step in genome analysis. However, as DNA reads continue to become longer, new approaches need to be developed to effectively use these longer reads in the alignment process. Modern aligners commonly use a two-step approach for read alignment: 1. seed ...

Contributed

DNA aligning is a compute-intensive and time-consuming task required for all further DNA processing. It consists in finding for each DNA string from a sample its location in a reference genome. Usual techniques for short reads (hundreds of bases) involve seed-extension, where a s ...