A. Anand
25 records found
QuanTemp
A real-world open-domain benchmark for fact-checking numerical claims
With the growth of misinformation on the web, automated fact-checking has garnered immense interest as a means of detecting misinformation and disinformation. Current systems have made significant advancements in handling synthetic claims sourced from Wikipedia, and noteworthy prog
...
Dual-encoder-based dense retrieval models have become the standard in IR. They employ large Transformer-based language models, which are notoriously inefficient in terms of resources and latency. We propose Fast-Forward indexes - vector forward indexes which exploit the semantic m
...
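To illustrate the general idea of a vector forward index, the following is a minimal sketch (not the paper's implementation): first-stage lexical candidates are re-scored by looking up precomputed document vectors instead of running the Transformer at query time, and the two scores are interpolated. The names rerank_with_forward_index and alpha are hypothetical.

import numpy as np

def rerank_with_forward_index(query_vec, candidates, forward_index, alpha=0.5):
    """Re-score first-stage candidates by interpolating their lexical score
    with a semantic score looked up from a precomputed forward index.

    query_vec     : dense query embedding (np.ndarray)
    candidates    : list of (doc_id, lexical_score) pairs, e.g. from BM25
    forward_index : dict mapping doc_id -> precomputed dense document vector
    alpha         : interpolation weight (illustrative parameter)
    """
    rescored = []
    for doc_id, lexical_score in candidates:
        doc_vec = forward_index[doc_id]              # O(1) lookup, no Transformer call
        semantic_score = float(np.dot(query_vec, doc_vec))
        score = alpha * lexical_score + (1 - alpha) * semantic_score
        rescored.append((doc_id, score))
    return sorted(rescored, key=lambda x: x[1], reverse=True)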
Contextual ranking models have delivered impressive performance improvements over classical models in the document ranking task. However, these highly over-parameterized models tend to be data-hungry and require large amounts of data even for fine-tuning. In this article, we prop
...
A significant challenge in text-ranking systems is handling hard queries that form the tail end of the query distribution. Difficulty may arise due to the presence of uncommon, underspecified, or incomplete queries. In this work, we improve the ranking performance of hard or dif
...
Large language models (LLMs) have recently gained significant attention due to their unparalleled zero-shot performance on various natural language processing tasks. However, the pre-training data utilized in LLMs is often confined to a specific corpus, resulting in inherent fres
...
Answering complex questions is a challenging task that requires question decomposition and multistep reasoning for arriving at the solution. While existing supervised and unsupervised approaches are specialized to a certain task and involve training, recently proposed prompt-base
...
Zorro
Valid, sparse, and stable explanations in graph neural networks
With the ever-increasing popularity and applications of graph neural networks, several proposals have been made to explain and understand the decisions of a graph neural network. Explanations for graph neural networks differ in principle from other input settings. It is important
...
This tutorial presents explainable information retrieval (ExIR), an emerging area focused on fostering responsible and trustworthy deployment of machine learning systems in the context of information retrieval. As the field has rapidly evolved in the past 4-5 years, numerous appr
...
Neural document ranking models perform impressively well due to superior language understanding gained from pre-training tasks. However, due to their complexity and large number of parameters, these (typically transformer-based) models are often non-interpretable in that ranking d
...
Pre-trained contextual language models such as BERT, GPT, and XLNet work quite well for document retrieval tasks. Such models are fine-tuned on query-document/query-passage level relevance labels to capture the ranking signals. However, the documents are longer than the
...
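A common way to handle documents that exceed a contextual model's input limit is to split them into overlapping passages, score each passage, and aggregate the passage scores. The sketch below uses max pooling and hypothetical names (score_fn, passage_len, stride); it illustrates the general pattern rather than this paper's specific method.

def score_long_document(score_fn, query, doc_tokens, passage_len=128, stride=64):
    """Score a document longer than the model's input limit.

    Splits the document into overlapping passages, scores each with a
    passage-level model, and aggregates with max pooling (one common choice).
    score_fn : callable(query, passage_tokens) -> relevance score
    """
    scores = []
    for start in range(0, max(len(doc_tokens) - passage_len, 0) + 1, stride):
        passage = doc_tokens[start:start + passage_len]
        scores.append(score_fn(query, passage))
    return max(scores) if scores else 0.0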
Contextual models like BERT are highly effective in numerous text-ranking tasks. However, it is still unclear whether contextual models understand well-established notions of relevance that are central to IR. In this paper, we use probing, a recent approach used to analyze
...
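Probing in this sense typically means training a small classifier on frozen representations to test whether a property (for instance, a classical relevance signal such as exact term overlap) is linearly recoverable from them. A minimal sketch, assuming precomputed [CLS] embeddings of query-document pairs and binary property labels; the names are illustrative.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def probe_frozen_embeddings(embeddings, labels):
    """Fit a linear probe on frozen contextual embeddings.

    embeddings : (n, d) array of [CLS] representations of query-document pairs
    labels     : binary property being probed, e.g. whether a query term occurs
                 in the document
    A high cross-validated accuracy suggests the property is linearly
    recoverable from the frozen representation.
    """
    probe = LogisticRegression(max_iter=1000)
    scores = cross_val_score(probe, np.asarray(embeddings), np.asarray(labels), cv=5)
    return scores.mean()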
This paper proposes a novel approach towards better interpretability of a trained text-based ranking model in a post-hoc manner. A popular approach to post-hoc interpretability of text ranking models is based on locally approximating the model behavior using a simple ranker. Since
...
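Locally approximating a ranker with a simple model usually follows a LIME-style recipe: perturb the document, query the black-box ranker, and fit an interpretable surrogate on the perturbations. A minimal sketch under those assumptions; score_fn, keep_prob, and the Ridge surrogate are illustrative choices, not the paper's exact procedure.

import random
import numpy as np
from sklearn.linear_model import Ridge

def local_surrogate_explanation(score_fn, query, doc_terms, n_samples=200, keep_prob=0.7):
    """Approximate a black-box ranker locally around one (query, document) pair.

    score_fn  : callable(query, list_of_terms) -> relevance score (the trained ranker)
    doc_terms : tokenized document
    Returns per-term weights of a linear surrogate fit on randomly masked variants.
    """
    X, y = [], []
    for _ in range(n_samples):
        mask = [1 if random.random() < keep_prob else 0 for _ in doc_terms]
        kept = [t for t, m in zip(doc_terms, mask) if m]
        X.append(mask)
        y.append(score_fn(query, kept))
    surrogate = Ridge(alpha=1.0).fit(np.array(X), np.array(y))
    return dict(zip(doc_terms, surrogate.coef_))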
Configurable software systems have become increasingly popular as they enable customized software variants. The main challenge in dealing with configuration problems is that the number of possible configurations grows exponentially as the number of features increases. Therefore,
...
Quality control is essential for creating extractive question answering (EQA) datasets via crowdsourcing. Aggregation across answers, i.e., word spans within passages annotated by different crowd workers, is one major focus for ensuring quality. However, crowd workers cannot r
...
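One simple baseline for aggregating extractive answer spans across crowd workers is token-level voting: count, for each position, how many workers included it and keep the region supported above a threshold. The sketch below is such a baseline, not necessarily the aggregation studied in the paper; the names and the 50% threshold are illustrative.

from collections import Counter

def aggregate_spans(worker_spans, threshold=0.5):
    """Aggregate extractive answer spans from multiple crowd workers.

    worker_spans : list of (start, end) token offsets, one span per worker
    threshold    : fraction of workers that must include a position
    Returns the (start, end) bounds covering all positions with enough votes,
    or None if no position reaches the threshold.
    """
    votes = Counter()
    for start, end in worker_spans:
        for i in range(start, end):
            votes[i] += 1
    needed = threshold * len(worker_spans)
    selected = sorted(i for i, v in votes.items() if v >= needed)
    if not selected:
        return None
    return selected[0], selected[-1] + 1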
For What It's Worth
Humans Overwrite Their Economic Self-interest to Avoid Bargaining With AI Systems
As algorithms are increasingly augmenting and substituting human decision-making, understanding how the introduction of computational agents changes the fundamentals of human behavior becomes vital. This pertains not only to users, but also to those parties who face the consequences
...
Contextual ranking models have delivered impressive performance improvements over classical models in the document ranking task. However, these highly over-parameterized models tend to be data-hungry and require large amounts of data even for fine-tuning. This paper proposes a si
...
SparCAssist
A Model Risk Assessment Assistant Based on Sparse Generated Counterfactuals
We introduce SparCAssist, a general-purpose risk assessment tool for machine learning models trained for language tasks. It evaluates models' risk by inspecting their behavior on counterfactuals, namely out-of-distribution instances generated based on the given data instance.
...
BERT Rankers are Brittle
A Study using Adversarial Document Perturbations
Contextual ranking models based on BERT are now well established for a wide range of passage and document ranking tasks. However, the robustness of BERT-based ranking models under adversarial inputs is under-explored. In this paper, we argue that BERT-rankers are not immune to ad
...
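To make the notion of an adversarial document perturbation concrete, the following is a minimal greedy sketch: try single-token substitutions from a candidate pool and keep whichever swap most reduces the ranker's score, up to a small budget. This illustrates a generic black-box attack pattern, not the specific perturbations studied in the paper; score_fn and candidate_tokens are assumed to be supplied by the caller.

def greedy_perturbation(score_fn, query, doc_tokens, candidate_tokens, budget=3):
    """Greedily swap tokens in a document to reduce a ranker's score.

    score_fn         : callable(query, tokens) -> relevance score (black-box ranker)
    candidate_tokens : pool of replacement tokens to try (supplied externally)
    budget           : maximum number of token swaps
    """
    tokens = list(doc_tokens)
    for _ in range(budget):
        base = score_fn(query, tokens)
        best = (0.0, None, None)                      # (score drop, position, replacement)
        for i in range(len(tokens)):
            for cand in candidate_tokens:
                trial = tokens[:i] + [cand] + tokens[i + 1:]
                drop = base - score_fn(query, trial)
                if drop > best[0]:
                    best = (drop, i, cand)
        if best[1] is None:
            break                                     # no swap lowers the score further
        tokens[best[1]] = best[2]
    return tokens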
There has been significant progress in unsupervised network representation learning (UNRL) approaches over graphs recently with flexible random-walk approaches, new optimization objectives, and deep architectures. However, there is no common ground for systematic comparison of em
...