M.A. Migut

36 records found

Exploring the Gorillas in the Malware Jungle

Investigating the communication and attack characteristics of the Gorilla botnet

The rise of the Internet of Things (IoT) has introduced unprecedented levels of convenience, but it also presents a significant cybersecurity challenge. In particular, the insecure nature of many IoT devices fuels the emergence of advanced IoT botnets. The Gorilla botnet is ...

Optimal Multiple Importance Resampling

Optimal Spatial Reuse for Monte Carlo Light Transport Simulation

Ray tracing has experienced increasing adoption in various spaces of computer graphics. The ReSTIR (Reservoir-based Spatiotemporal Importance Resampling) family of techniques has enabled several orders of magnitude speedups in light transport simulation algorithms which rely on r ...

A Comparative Study of Threshold Multiparty Private Set Intersection Protocols

For Cyber Threat Intelligence Sharing in a Medical Setting

Within the field of cyber threat intelligence (CTI), healthcare institutions are among the organizations most frequently targeted by cybercriminals. To mitigate future attacks on their digital infrastructures, healthcare institutions can collaborate and exchange security logs. These ...
With the rapid growth in data collection, efficient data processing is critical. Dimensionality reduction methods, like t-distributed stochastic neighbour embedding (t-SNE), compress high-dimensional data into embeddings that preserve the key features of the datasets making data ...

Accelerating hyperbolic t-SNE using the Lorentz Hyperboloid

Exploring a different way to speed up hyperbolic t-SNE

This paper investigates a method for accelerating hyperbolic t-SNE, a popular high-dimensional data visualization technique. In particular, it focuses on building a hyperbolic t-SNE variant that uses a different model of hyperbolic space (called the Lorentz Hyperboloid model) fo ...

Accelerating hyperbolic t-SNE in the Klein Disk model

Accelerating hyperbolic t-distributed Stochastic Neighbourhood Embedding approximation using a polar quadtree in the Klein Disk model

In this work we aim to implement a variation of the acceleration of hyperbolic t-SNE done by Skrodzki et al. [19]. This variation aims to embed the points in the Klein Disk model of hyperbolic space instead of the Poincaré Disk model using an altered version of a polar quadtree ...
Dimensionality reduction is an important task in high-dimensional data visualisation. Among the popular algorithms for achieving this is t-SNE, which aims to preserve local neighbourhoods in the lower-dimensional embeddings. While t-SNE traditionally works in Euclidean space, emb ...
This research evaluates the performance of Meta's Code Llama 7B model in generating comments for Java code written in Polish. Using a mixed-methods approach, we combine quantitative and qualitative analyses to assess the model's accuracy and limitations. We preprocess a dat ...
This paper evaluates the performance of Large Language Models, specifically StarCoder 2, in non-English code summarization, with a focus on the Greek language. We establish a hierarchical error taxonomy through an open coding approach to enhance the understanding and improvement ...
Interest in Large Language Models is growing, especially in software development tasks such as code completion and comment generation. However, most Large Language Models are primarily trained on English language data, raising concerns about their effectiveness when applied to ot ...
After the emergence of BERT, Large Language Models (LLMs) have demonstrated remarkable multilingual capabilities and have seen widespread adoption globally, particularly in the field of programming. However, current evaluations and benchmarks of LLMs on code primarily focus on En ...

LLM of Babel

An analysis of the behavior of large language models when performing Java code summarization in Dutch

How well do large language models (LLMs) infer text in a non-English context when performing code summarization? The goal of this paper was to understand the mistakes made by LLMs when performing code summarization in Dutch. We categorized the mistakes made by CodeQwen1.5-7b when ...
Masked Autoencoders (MAEs) represent a significant shift in self-supervised learning (SSL) due to their independence from augmentation techniques for generating positive (and/or negative) pairs as in contrastive frameworks. Their masking and reconstruction strategy also aligns we ...

The artificially generated microbiome

A study on the generation and potential use cases of predicted meta-omics data

Motivation: Imbalances in the human gut microbiome have been linked to various conditions, including inflammatory bowel disease (IBD), diabetes, and mental health disorders. While metagenomics and amplicon sequencing are the most commonly used technologies to characterize ...
Accurate forecasts are essential for integrating wind energy into the power grid. With wind energy's growing role in the renewable mix, precise short-term generation forecasts are increasingly vital. Turbine-level forecasts are critical for optimal wind farm operation, control, a ...

Probing the Dark Web

Optimizing Port Scanning for Dark Web Protocol Analysis

The inception of onion routing in the mid-1990s, evolving into Tor (The Onion Router) and other anonymous networks, marked a pivotal moment in the quest for internet privacy. However, the emergence of the dark web, facilitated by these networks, has also increased cybercrime act ...

From Course to Online Learning Paths

Improving the Teacher's Experience of an Existing Online Node-link Course Tool

The website Skill Circuits is a tool developed by teachers at the Delft University of Technology. Skill Circuits is an online learning tool that presents students with a node-link (i.e. a tree) structure where each node represents a skill, containing tasks that aim to teac ...

Comparative Study of Loss Functions in Personal Identification for Smartwatch Data

Examining Accuracy of Loss Functions in Personal Identification using Outlier Detection with Auto-encoders on Data from Smartwatches

Smartwatches are equipped with sensors that allow continuous monitoring of physiological and physical activities, making them ideal sources of data for data analysis. However, accurately identifying individuals based on smartwatch data can be challenging due to the presence of o ...
Since the recent rise and advancement of video conferencing platforms such as Zoom, it has become important to interpret the logistics of remote online meetings. Analysing verbal and non-verbal cues (such as body language) between members of these virtual forums can provide addit ...

Revealing the Secret to Successful Virtual Meetings: How Personality, Social Skills, and More Impact Conversational Involvement

Do people that are assessed as better conversational partners have a higher level of involvement?

Understanding how to maximize involvement in virtual meetings has become far more important with the rapid rise of video conferencing tools during the COVID-19 pandemic. This research builds upon the data collected in the MEMO Corpus, specifically interview foota ...