S. Pintea | TU Delft Repository

A step towards understanding why classification helps regression

A number of computer vision deep regression approaches report improved results when adding a classification loss to the regression loss. Here, we explore why this is useful in practice and when it is beneficial. To do so, we start from precisely controlled dataset variations and ...

Is there progress in activity progress prediction?

Activity progress prediction aims to estimate what percentage of an activity has been completed. Currently this is done with machine learning approaches, trained and evaluated on complicated and realistic video datasets. The videos in these datasets vary drastically in length and ...

Objects do not disappear

Video object detection by single-frame object location anticipation

Objects in videos are typically characterized by continuous smooth motion. We exploit continuous smooth motion in three ways. 1) Improved accuracy by using object motion as an additional source of supervision, which we obtain by anticipating object locations from a static keyfram ...

Deep Vanishing Point Detection

Geometric priors make dataset variations vanish

Deep learning has improved vanishing point detection in images. Yet, deep networks require expensive annotated datasets trained on costly hardware and do not generalize to even slightly different domains, and minor problem variants. Here, we address these issues by injecting deep ...

No frame left behind

Full Video Action Recognition

Not all video frames are equally informative for recognizing an action. It is computationally infeasible to train deep networks on all video frames when actions develop over hundreds of frames. A common heuristic is uniformly sampling a small number of video frames and using thes ...

Resolution learning in deep convolutional networks using scale-space theory

Resolution in deep convolutional neural networks (CNNs) is typically bounded by the receptive field size through filter sizes, and subsampling layers or strided convolutions on feature maps. The optimal resolution may vary significantly depending on the dataset. Modern CNNs hard- ...

Seismic inversion with deep learning

A proposal for litho-type classification

This article investigates bypassing the inversion steps involved in a standard litho-type classification pipeline and performing the litho-type classification directly from imaged seismic data. We consider a set of deep learning methods that map the seismic data directly into lit ...

Semi-Supervised Lane Detection With Deep Hough Transform

Conference paper (2021) - Yancong Lin (author), Silvia L. Pintea (author), Jan C. van Gemert (author)

Current work on lane detection relies on large manually annotated datasets. We reduce the dependency on annotations by leveraging massive cheaply available unlabelled data. We propose a novel loss function exploiting geometric knowledge of lanes in Hough space, where a lane can b ...

Deep Hough-Transform Line Priors

Conference paper (2020) - Yancong Lin (author), Silvia L. Pintea (author), Jan C. van Gemert (author)

Classical work on line segment detection is knowledge-based; it uses carefully designed geometric priors using either image gradients, pixel groupings, or Hough transform variants. Instead, current deep learning methods do away with all prior knowledge and replace priors by train ...

Top-down networks

A coarse-to-fine reimagination of CNNs

Conference paper (2020) - Ioannis Lelekas (author), Nergis Tömen (author), Silvia L. Pintea (author), Jan C. van Gemert (author)

Biological vision adopts a coarse-to-fine information processing pathway, from initial visual detection and binding of salient features of a visual scene, to the enhanced and preferential processing given relevant stimuli. On the contrary, CNNs employ a fine-to-coarse processing, ...

Divide and Count

Generic Object Counting by Image Divisions

Journal article (2019) - Tobias Stahl (author), Silvia L. Pintea (author), Jan C. van Gemert (author)

We propose a general object counting method that does not use any prior category information. We learn from local image divisions to predict global image-level counts without using any form of local annotations. Our method separates the input image into a set of image divisions - ...

Using Phase Instead of Optical Flow for Action Recognition

Conference paper (2019) - Omar Hommos (author), Silvia L. Pintea (author), Pascal S.M. Mettes (author), Jan C. van Gemert (author)

Currently, the most common motion representation for action recognition is optical flow. Optical flow is based on particle tracking which adheres to a Lagrangian perspective on dynamics. In contrast to the Lagrangian perspective, the Eulerian model of dynamics does not track, but ...

Hand-tremor frequency estimation in videos

Conference paper (2019) - Silvia L. Pintea (author), Jian Zheng (author), Xilin Li (author), Paulina J.M. Bank (author), Jacobus J. van Hilten (author), Jan C. van Gemert (author)

We focus on the problem of estimating human hand-tremor frequency from input RGB video data. Estimating tremors from video is important for non-invasive monitoring, analyzing and diagnosing patients suffering from motor-disorders such as Parkinson’s disease. We consider two appro ...

Recurrent Knowledge Distillation

Conference paper (2018) - Silvia L. Pintea (author), Yue Liu (author), Jan C. van Gemert (author)

Knowledge distillation compacts deep networks by letting a small student network learn from a large teacher network. The accuracy of knowledge distillation recently benefited from adding residual layers. We propose to reduce the size of the student network even further by recasti ...

Asymmetric kernel in Gaussian Processes for learning target variance

Journal article (2018) - Silvia L. Pintea (author), Jan C. van Gemert (author), Arnold W.M. Smeulders (author)

This work incorporates the multi-modality of the data distribution into a Gaussian Process regression model. We approach the problem from a discriminative perspective by learning, jointly over the training data, the target space variance in the neighborhood of a certain sample th ...

Video Acceleration Magnification

Conference paper (2017) - Yichao Zhang (author), Silvia L. Pintea (author), Jan C. van Gemert (author)

The ability to amplify or reduce subtle image changes over time is useful in contexts such as video editing, medical video analysis, product quality control and sports. In these contexts there is often large motion present which severely distorts current video amplification metho ...

One-step time-dependent future video frame prediction with a convolutional encoder-decoder neural network

Conference paper (2017) - Vedran Vukotic (author), Silvia L. Pintea (author), Christian Raymond (author), Guillaume Gravier (author), Jan C. van Gemert (author)

There is an inherent need for autonomous cars, drones, and other robots to have a notion of how their environment behaves and to anticipate changes in the near future. In this work, we focus on anticipating future appearance given the current frame of a video. Existing work focus ...

Large scale Gaussian Process for overlap-based object proposal scoring

This work considers the task of object proposal scoring by integrating the consistency between state-of-the-art object proposal algorithms. It represents a novel way of thinking about proposals, as it starts with the assumption that consistent proposals are most likely centered ...

Making a Case for Learning Motion Representations with Phase

Conference paper (2016) - Silvia L. Pintea (author), Jan C. van Gemert (author)

This work advocates Eulerian motion representation learning over the current standard Lagrangian optical flow model. Eulerian motion is well captured by using phase, as obtained by decomposing the image through a complex-steerable pyramid. We discuss the gain of Eulerian motion i ...

Featureless

Bypassing feature extraction in action categorization

This method introduces an efficient manner of learning action categories without the need of feature estimation. The approach starts from low-level values, in a similar style to the successful CNN methods. However, rather than extracting general image features, we learn to predic ...