M. Finavaro Aniche

30 records found

Technical debt is a term that describes the consequences of taking shortcuts or quick-and-dirty solutions in the software engineering process, in order to gain short-term advantages in the development process of software projects. In this paper, we investigate the technical debt ...
Web APIs are being used for increasingly large and complex use cases. Right now it can be hard to make sure that what is documented about an API is correct everywhere and to know if a change will have an impact on the users of a web API. When details are missing in an API specifica ...
Practice is central in mathematics skill acquisition. The practice process can be facilitated by flexible digital exercise systems, supporting personalized learning and providing students with parameterized, open answer exercises containing answer-specific feedback. However, curr ...

Covert DNS Storage Channel Detection

Uncovering surreptitious data exchange using the phonebook of the internet

The cyber arms race has red and blue teams continuously on their toes to keep ahead. Increasingly capable cyber actors breach secure networks at a worrying scale. While network monitoring and analysis should identify blatant data exfiltration attempts, covert channels bypass thes ...
Software testing is an integral part of the development of embedded systems. Among other reasons, tests are frequently used to ensure that a system meets all the specifications, which is especially important when designing systems for the medical industry. Software changes that h ...
The convenient service offered by credit cards and the technological advances in e-commerce have caused the number of online payment transactions to increase daily. With this rising number, the opportunity for fraudsters to obtain cardholder details via online credit card fraud h ...

Renaming for Everyone

Language-parametric Renaming in Spoofax

A refactoring is a program transformation that improves the design of the source code, while preserving its behavior. Most modern IDEs offer a number of automated refactorings as editor services. The Rename refactoring is the most-commonly applied refactoring and is used to chang ...
Many types of database management systems exist, but finding the one that is right for a specific use case is becoming increasingly difficult. Benchmarks allow one to compare various systems, but in a world where distributed DBMSs are increasingly used for mission critical p ...

Computational Thinking Dashboard

For learners in Jupyter notebooks

Computational Thinking (CT) - the process of thinking like a programmer or computer scientist - is a skill that has the potential to transform the way students learn at educational institutions in different domains and different grade levels. With the increasing integration ...
Static Analysis is of indispensable value for the robustness of software systems and the efficiency of developers. Moreover, many modern-day software systems are composed of interacting subsystems written in different programming languages. However, in most cases no static valida ...
Active state machine learning algorithms are a class of algorithms that allow us to infer state machines representing certain systems. These algorithms interact with a system and build a hypothesis of what the state machine describing that system looks like according to the behav ...
Privacy-preserving data aggregation protocols have been researched widely, but usually cannot guarantee correctness of the aggregate if users are malicious. These protocols can be extended with zero-knowledge proofs and commitments to work in the malicious model, but this incurs ...
As organizations start to adopt machine learning in critical business scenarios, the development processes change and the reliability of the applications becomes more important. To investigate these changes and improve the reliability of those applications, we conducted two studi ...

Perspective Discovery in Controversial Debates

An exploration of unsupervised topic models

Since the introduction of the Web, online platforms have become a place to share opinions across various domains (e.g., social media platforms, discussion fora or webshops). Consequently, many researchers have seen a need to classify, summarise or categorise these large sets of u ...
Code duplication is a form of technical debt frequently observed in software systems. Its existence negatively affects the maintainability of a system in numerous ways. In order to tackle the issues that come with it, various automated clone detection techniques have been propose ...
Attributing malware to the developers who write it is an important factor in cybercrime investigative work. Clustering not only malware of the same family but also inter-family malware relations provides more information about the authors and aid ...

The Error that is the Error Message

Comparing information expectations of novice programmers against the information in Python error messages

Learning to program is not an easy task, as has become evident from the abundance of research papers concerning the subject. One of the barriers to learning a new programming language is understanding its error messages, as coding errors have to be resolved before the pr ...
Modern web information systems use machine learning models to provide personalized user services and experiences. However, machine learning models require annotated data for training, and creating annotated data is done through crowdsourcing tasks. The content used in annotation ...
Huge amounts of log data are generated every day by software. These data contain valuable information about the behavior and the health of the system, which is rarely exploited, because of their volume and unstructured nature. Manually going through log files is a time-consuming ...
Log data, produced by every computer system and program, are widely used as a source of valuable information for monitoring and understanding the behavior and health of those systems. However, as large-scale systems generate a massive amount of log data every minute, it is impossible to detect t ...