G.J.P.M. Houben
32 records found
Neural information retrieval (IR) has transitioned from using classical human-defined relevance rules to leveraging complex neural models for retrieval tasks. While benefiting from advances in machine learning (ML), neural IR also inherits several drawbacks, including the opacity
...
This thesis addresses the semantic gap in visual understanding, improving visual models with semantic reasoning capabilities so they can handle tasks like image captioning, question-answering, and scene understanding. The main focus is on integrating visual and textual data, leve
...
We are witnessing a paradigm shift in machine learning (ML) and artificial intelligence (AI) from a focus primarily on innovating ML models, the model-centric paradigm, to prioritising high-quality, reliable data for AI/ML applications, the data-centric paradigm. This emphasis on
...
Over the last two decades, the machine learning (ML) field has witnessed a dramatic expansion, propelled by burgeoning data volumes and the advancement of computational technologies. Deep learning (DL) in particular has demonstrated remarkable success across a wide range of domai
...
Schema matching is a critical data integration process, which aims at capturing relevance between elements of different datasets; when datasets are tabular, it translates to the process of discovering related columns among them. Accurately discovering column matches is integral f
...
Web search plays an important role in the contemporary information landscape, shaping individual and collective knowledge by providing fast and effortless access to vast amounts of resources. We rely on web search engines for various information needs, some of which can carry ser
...
Data processing has heavily evolved in the last two decades, from single-node processing to distributed processing and from the MapReduce paradigm to the stream processing paradigm. At the same time, cloud computing has emerged as the primary means of deploying and operating a da
...
Music annotation and transcription of music sheets are traditionally performed by experts. Although these processes result in high-quality data, the scope of each effort is relatively narrow, resulting in highly specialised and specific datasets of annotated music compositions, wh
...
Search engines are used to gather information. This interaction sometimes influences users and changes their attitude towards a topic. Prior work has shown that it is a complex endeavour to understand attitude change, as there are many things
...
Several input types have been developed in different technological landscapes like crowdsourcing and conversational agents. However, sign language remains one of the input types that has not been explored. Although a large number of people around the world use sign language a
...
This thesis mainly studies causality in natural language processing. Understanding causality is key to the success of NLP applications, especially in high-stakes domains. Causality comes in various perspectives such as enable and prevent that, despite their importance, have b
...
Interpretability of ML models, and of image recognition models specifically, is an increasing problem. This thesis presents the design and implementation of Brickroutine, a system that uses a trained model. Using human annotations, semantic interpretations are given to image c
...
As the world continues to embrace cloud computing, more applications are being scaled elastically. Elastic scaling allows applications to add or remove computing resources based on the load experienced by the application. When the load is high, more resources are provisioned enabl
...
Natural Language Interfaces for Databases (NLIDBs) offer a way for users to reason about data. They do not require the user to know the data structure or its relations, or to be familiar with a query language like SQL; they only require the use of natural language. This thesis focuses
...
Characterization and Mitigation of High-Confidence Errors Through the Use of Human-In-The-Loop Methods
Domain Expert Driven Approach to Model Development
In the use of Machine Learning systems, attaining the trust of end-users can often be difficult. Many of the current state-of-the-art systems operate as black boxes. Errors produced by these black-box systems, without further explanation as to why these decisio
...
Machine learning inference queries are a type of database query in which a model pipeline is needed to evaluate the query's Boolean predicates. Using a model zoo, it is possible to select a variety of models to execute in a sequence rather than using a highly specialized model
...
Knowing Better Than the AI
How the Dunning-Kruger Effect Shapes Reliance on Human-AI Decision Making
Artificial Intelligence (AI) is increasingly helping people with all kinds of tasks, due to its promising capabilities. In some cases an AI system takes over a task entirely, but in others an AI system making decisions on its own would be undesirable due to ethical and l
...
Data drift refers to variation in the production data compared to the training data, which can sometimes cause a machine learning model's performance to decay. Some machine learning models face this problem in production: they receive drifted data while there is no ground truth to ev
...