AL

22 records found

Efficient Video Action Recognition

How well does TriDet perform and generalize in a limited compute power and data setting?

In temporal action localization, given an input video, the goal is to predict the action that is present in the video, along with its temporal boundaries. Several powerful models have been proposed throughout the years, with transformer-based models achieving state-of-the-art per ...

Efficient Temporal Action Localization via Vision-Language Modelling

An Empirical Study on the STALE Model's Efficiency and Generalizability in Resource-constrained Environments

Temporal Action Localization (TAL) aims to localize the start and end times of actions in untrimmed videos and classify the corresponding action types. TAL plays an important role in understanding video. Existing TAL approaches heavily rely on deep learning and require large-scal ...

Benchmarking Data and Computational Efficiency of ActionFormer on Temporal Action Localization Tasks

Analysing the Performance and Generalizability of ActionFormer in Resource-constrained Environments

In temporal action localization, given an input video, the goal is to predict which actions it contains, where they begin and where they end. Training and testing current state-of-the-art, deep learning models is done assuming access to large amounts of data and computational pow ...

TemporalMaxer Performance in the Face of Constraint: A Study in Temporal Action Localization

A Comprehensive Analysis on the Adaptability of TemporalMaxer in Resource-Scarce Environments

This paper presents an analysis of the data and compute efficiency of the TemporalMaxer deep learning model in the context of temporal action localization (TAL), which involves accurately detecting the start and end times of specific video actions. The study explores the performa ...

Algal Bloom Forecasting using Remote Sensing

Discovering the most predictive data modalities for Algal Bloom Forecasting

An algal bloom is defined as a rapid increase in common algae (phytoplankton) abundance in water bodies and it can occur when a group of certain environmental factors is combined. If the algae populations grow out of control, such algal blooms become problematic and cause damage ...

Algal Bloom Forecasting in a Classification and Regression Setting

Implementing a UNet Architecture to evaluate the differences between both settings

Forecasting algal blooms using remote sensing data is less labour-intensive and has better coverage in time and space than direct water sampling. The paper implements a deep learning technique, the UNet Architecture, to predict the chlorophyll concentration, which is a good ind ...
The term "Algal Bloom" refers to the accumulation of algae in a confined geological space. Algal blooms may harm human health and negatively affect ecological systems around the area. Thus, forecasting algal blooms could mitigate the environmental and socio-economic damages. Particular ...
This research presents a method for forecasting algal blooms using remote sensing with spatially and temporally sparse satellite data. The method involves the use of multiple interpolation methods to interpolate the sparse input data. The approach is shown to be effective in pred ...
Color information has been shown to provide useful information during image classification. Yet current popular deep convolutional neural networks use 2-dimensional convolutional layers. The first 2-dimensional convolutional layer in the network combines the color channels of the ...
CycleGANs [1] and CIConv [2] are both relatively new approaches to their respective applications. For CycleGANs this application is unpaired image-to-image domain adaptation and for CIConv this application is making images more robust to illumination changes. We investigate w ...
While deep neural networks show great potential for being part of safety-critical applications such as autonomous driving, covering their sensitivity to illumination shifts by adding training data is often non-trivial. The undesired illumination shift between train and test dat ...
The possibility to improve an existing method by making (part of) it learnable is explored in this research. The work that this research extends added prior knowledge to a Convolutional Neural Network (CNN) to improve its performance when dealing with an illumination shift. The m ...
Color Invariant Convolution (CIConv) is a learnable Convolutional Neural Network (CNN) layer that reduces the distribution shift between the source and target set in the CNN under an illumination-based domain shift. We explore the semantic segmentation performance for day-night do ...

To alleviate lower classification performance on rare classes in imbalanced datasets, a possible solution is to augment the underrepresented classes with synthetic samples. Domain adaptation can be incorporated in a classifier to decrease the domain discrepan ...

Wheat is a widely used ingredient for food products. To increase the production and quality of wheat, the density of 'wheat heads' in a farm can be studied. Accurately locating wheat heads in images can be challenging. A lot of work has taken place in supervised semantic segmentatio ...
In the field of ecology, camera traps are important tools to collect information on the wildlife of certain areas. The problem that arises with many camera traps is that they can collect more images than a human can realistically go through by themselves. To help classify thes ...
The natural world is long-tailed: rare classes are observed orders of magnitude less frequently than common ones, leading to highly imbalanced data where rare classes can have only handfuls of examples. Learning from few examples is a known challenge for deep learning based clas ...
Camera traps are used around the world to provide data on species, population sizes and how species are interacting. However, this creates a lot of work in identifying which animal was actually spotted near the camera. Attempts have been made to use deep learning to identify anima ...
This research paper analyses the effect that using frequency information can have on object detectors. The latter are complex networks that learn information about objects from images and are then able to predict the location of these objects in new, unseen images. There are, how ...
In this paper, we run two methods of explanation, namely LIME and Grad-CAM, on a convolutional neural network trained to label images with the LEGO bricks that are visible in them. We evaluate them on two criteria, the improvement of the network's core performance and the trust t ...