M. Izadi | TU Delft Repository

A Transformer-Based Approach for Smart Invocation of Automatic Code Completion

Transformer-based language models are highly effective for code completion, with much research dedicated to enhancing the content of these completions. Despite their effectiveness, these models come with high operational costs and can be intrusive, especially when they suggest to ...

Investigating the Performance of Language Models for Completing Code in Functional Programming Languages

A Haskell Case Study

Conference paper (2024) - Tim van Dam (author), Frank van der Heijden (author), Philippe de Bekker (author), Berend Nieuwschepen (author), Marc Otten (author), Maliheh Izadi (author)

Language model-based code completion models have quickly grown in use, helping thousands of developers write code in many different programming languages. However, research on code completion models typically focuses on imperative languages such as Python and JavaScript, which re ...

Correction to

The potential of an adaptive computerized dynamic assessment tutor in diagnosing and assessing learners’ listening comprehension (Education and Information Technologies, (2024), 29, 3, (3637-3661), 10.1007/s10639-023-11871-w)

Journal article (2024) - Mehri Izadi (author), M. Izadi (author), Maliheh Izadi (author), Farrokhlagha Heidari (author)

In the PDF of this article, the pages were incorrectly numbered as ‘2303–2327’ when they should have been ‘3637–3661’. The page range was correct in the HTML version of the article. The original article has been corrected.

Generative AI in Software Engineering Must Be Human-Centered

The Copenhagen Manifesto

Journal article (2024) - Daniel Russo (author), Sebastian Baltes (author), Niels van Berkel (author), Paris Avgeriou (author), Fabio Calefato (author), Beatriz Cabrero-Daniel (author), Gemma Catolino (author), Maliheh Izadi (author), Bogdan Vasilescu (author), More authors...

The (ab)use of Open Source Code to Train Large Language Models

Conference paper (2023) - Ali Al-Kaswan (author), Maliheh Izadi (author)

In recent years, Large Language Models (LLMs) have gained significant popularity due to their ability to generate human-like text and their potential applications in various fields, such as Software Engineering. LLMs for Code are commonly trained on large unsanitized corpora of s ...

Correction to

The potential of an adaptive computerized dynamic assessment tutor in diagnosing and assessing learners’ listening comprehension (Education and Information Technologies, (2024), 29, 3, (3637-3661), 10.1007/s10639-023-11871-w)

Journal article (2023) - Mehri Izadi (author), M. Izadi (author), Maliheh Izadi (author), Farrokhlagha Heidari (author)

The copyright holder in the original publication of this article was incorrect. The original article has been corrected.

Extending Source Code Pre-Trained Language Models to Summarise Decompiled Binaries

Binary reverse engineering is used to understand and analyse programs for which the source code is unavailable. Decompilers can help, transforming opaque binaries into a more readable source code-like representation. Still, reverse engineering is difficult and costly, involving c ...

The NLBSE'23 Tool Competition

Conference paper (2023) - Rafael Kallis (author), Maliheh Izadi (author), Luca Pascarella (author), Oscar Chaparro (author), Pooja Rani (author)

We report on the organization and results of the second edition of the tool competition from the International Workshop on Natural Language-based Software Engineering (NLBSE'23). As in the prior edition, we organized the competition on automated issue report classification, with ...

The potential of an adaptive computerized dynamic assessment tutor in diagnosing and assessing learners’ listening comprehension

Journal article (2023) - Mehri Izadi (author), M. Izadi (author), Maliheh Izadi (author), Farrokhlagha Heidari (author)

In today’s environment of growing class sizes due to the prevalence of online and e-learning systems, providing one-to-one instruction and feedback has become a challenging task for teachers. However, the dialectical integration of instruction and assessment into a seamless and dynamic activity can provide a continuous flow of assessment information for teachers to boost and individualize learning. In this regard, adaptive learning technology is one way to facilitate teacher-supported learning and personalize curriculum and learning experiences. This study aimed to investigate the potential of an adaptive Computerized Dynamic Assessment (C-DA) tool applicable as a language diagnostician and assistant. The study sought insight into 75 Iranian EFL learners’ listening development by focusing on the learning potential exhibited through learners’ assessment and the degree of internalization of mediation. To achieve this, a C-DA tutor including two dynamic listening comprehension tests, each comprising 20 items arranged in order of difficulty, was developed. Test takers unable to answer an item correctly were provided with graduated hints for different comprehension- and production-type items, and the overall difficulty level of the test was adapted to the test takers’ proficiency level. In order to have a full diagnosis of each individual’s listening development, the adaptive C-DA automatically generated five test scores on each learner’s performance: actual (unmediated) score, mediated score, gain score, Learning Potential Score (LPS), and transfer score. The results of paired-sample t-tests revealed a significant development from the actual to the mediated scores. Furthermore, the LPSs indicated that the tutor was capable of revealing learners’ potential for learning. Moreover, learners with high LPSs achieved the highest mean transfer scores, followed by learners with medium and low LPSs. The results of Mann-Whitney tests revealed a significant difference in the degree of internalization of mediation between learners with mid-range and low-range LPSs on the easy test, and between learners with high-range and low-range LPSs on the difficult test. The findings of this research can have important theoretical and practical implications for researchers and educationalists. The instructional value of this adaptive C-DA tool lies in its unique opportunities for individualizing learning and developing individual learning plans in accordance with learners’ needs.

STACC: Code Comment Classification using SentenceTransformers

Code comments are a key resource for information about software artefacts. Depending on the use case, only some types of comments are useful. Thus, automatic approaches to classify these comments have been proposed. In this work, we address this need by proposing STACC, a set o ...

Enriching Source Code with Contextual Data for Code Completion Models

An Empirical Study

Conference paper (2023) - Tim van Dam (author), Maliheh Izadi (author), Arie van Deursen (author)

Transformer-based pre-trained models have recently achieved great results in solving many software engineering tasks including automatic code completion which is a staple in a developer’s toolkit. While many have striven to improve the code-understanding abilities of such models, ...

Semantically-enhanced topic recommendation systems for software projects

Journal article (2023) - Maliheh Izadi (author), Mahtab Nejati (author), Abbas Heydarnoori (author)

Software-related platforms such as GitHub and Stack Overflow have enabled their users to collaboratively label software entities with a form of metadata called topics. Tagging software repositories with relevant topics can facilitate various downstream tasks. For instance, a correct and complete set of topics assigned to a repository can increase its visibility. Consequently, this improves the outcome of tasks such as browsing, searching, navigation, and organization of repositories. Unfortunately, assigned topics are usually highly noisy, and some repositories do not have well-assigned topics. Thus, there have been efforts to recommend topics for software projects; however, the semantic relationships among these topics have not been exploited so far. In this work, we propose two recommender models for tagging software projects that incorporate the semantic relationships among topics. Our approach has two main phases: (1) we first take a collaborative approach to curate a dataset of quality topics specifically for the domain of software engineering and development. We also enrich this data with the semantic relationships among these topics and encapsulate them in a knowledge graph we call SED-KGraph. Then, (2) we build two recommender systems. The first one operates only on the list of original topics assigned to a repository and the relationships specified in our knowledge graph. The second predictive model, however, assumes there are no topics available for a repository; hence, it predicts the relevant topics based on both the textual information of a software project (such as its README file) and SED-KGraph. We built SED-KGraph in a crowd-sourced project with 170 contributors from both academia and industry. Through their contributions, we constructed SED-KGraph with 2,234 carefully evaluated relationships among 863 community-curated topics. Regarding the recommenders’ performance, the experiment results indicate that our solutions outperform baselines that neglect the semantic relationships among topics by at least 25% and 23% in terms of Average Success Rate and Mean Average Precision, respectively. We share SED-KGraph as a rich form of knowledge for the community to re-use and build upon. We also release the source code of our two recommender models, KGRec and KGRec+ (https://github.com/mahtab-nejati/KGRec).

Targeted Attack on GPT-Neo for the SATML Language Model Data Extraction Challenge [PRESENTATION]

Previous work has shown that Large Language Models are susceptible to so-called data extraction attacks. This allows an attacker to extract a sample that was contained in the training data, which has massive privacy implications. The construction of data extraction attacks is cha ...

CatIss

An Intelligent Tool for Categorizing Issues Reports using Transformers

Conference paper (2022) - Maliheh Izadi (author)

Users use Issue Tracking Systems to keep track and manage issue reports in their repositories. An issue is a rich source of software information that contains different reports including a problem, a request for new features, or merely a question about the software product. As th ...

On the Evaluation of NLP-based Models for Software Engineering

Conference paper (2022) - Maliheh Izadi (author), Martin Nili Ahmadabadi (author)

NLP-based models have been increasingly incorporated to address SE problems. These models are either employed in the SE domain with little to no change, or they are greatly tailored to source code and its unique characteristics. Many of these approaches are considered to be outpe ...

Predicting the objective and priority of issue reports in software repositories

Journal article (2022) - Maliheh Izadi (author), Kiana Akbari (author), Abbas Heydarnoori (author)

Software repositories such as GitHub host a large number of software entities. Developers collaboratively discuss, implement, use, and share these entities. Proper documentation plays an important role in successful software management and maintenance. Users exploit Issue Tracking Systems, a facility of software repositories, to keep track of issue reports, to manage the workload and processes, and finally, to document the highlights of their team’s effort. An issue report is a rich source of collaboratively-curated software knowledge, and can contain a reported problem, a request for new features, or merely a question about the software product. As the number of these issues increases, it becomes harder to manage them manually. GitHub provides labels for tagging issues as a means of issue management. However, about half of the issues in GitHub’s top 1000 repositories do not have any labels. In this work, we aim to automate the process of managing issue reports for software teams. We propose a two-stage approach to predict both the objective behind opening an issue and its priority level using feature engineering methods and state-of-the-art text classifiers. To the best of our knowledge, we are the first to fine-tune a Transformer for issue classification. We train and evaluate our models in both project-based and cross-project settings. The latter approach provides a generic prediction model applicable to any unseen software project or projects with little historical data. Our proposed approach can successfully predict the objective and priority level of issue reports with 82% (fine-tuned RoBERTa) and 75% (Random Forest) accuracy, respectively. Moreover, we conducted human labeling and evaluation on unlabeled issues from six unseen GitHub projects to assess the performance of the cross-project model on new data. The model achieves 90% accuracy on the sample set. We measure inter-rater reliability and obtain an average Percent Agreement of 85.3% and a Randolph’s free-marginal Kappa of 0.71, which indicate substantial agreement among labelers.

CodeFill

Multi-token Code Completion by Jointly learning from Structure and Naming Sequences

Conference paper (2022) - Maliheh Izadi (author), Roberta Gismondi (author), Georgios Gousios (author)

Code completion is an essential feature of IDEs, yet current auto-completers are restricted to either grammar-based or NLP-based single token completions. Both approaches have significant drawbacks: grammar-based autocompletion is restricted in dynamically-typed language environ ...

Topic recommendation for software repositories using multi-label classification algorithms

Journal article (2021) - Maliheh Izadi (author), Abbas Heydarnoori (author), Georgios Gousios (author)

Many platforms exploit collaborative tagging to provide their users with faster and more accurate results while searching or navigating. Tags can communicate different concepts such as the main features, technologies, functionality, and the goal of a software repository. Recently ...

Automated Recovery of Issue-Commit Links Leveraging Both Textual and Non-textual Data

Conference paper (2021) - Pooya Rostami Mazrae (author), Maliheh Izadi (author), Abbas Heydarnoori (author)

An issue report documents the discussions around required changes in issue-tracking systems, while a commit contains the change itself in the version control system. Recovering links between issues and commits can facilitate many software evolution tasks such as bug localization, defect prediction, software quality measurement, and software documentation. A previous study on over half a million issues from GitHub reports that only about 42.2% of issues are manually linked by developers to their pertinent commits. Automating the linking of commit-issue pairs can contribute to the improvement of these tasks. So far, state-of-the-art approaches for automated commit-issue linking suffer from low precision, leading to unreliable results, sometimes to the point of requiring human supervision of the predicted links. The low performance gets even more severe when there is a lack of textual information in either commits or issues. Current approaches have also proven computationally expensive. We propose Hybrid-Linker, an enhanced approach that overcomes such limitations by exploiting two information channels: (1) a non-textual component that operates on non-textual, automatically recorded information of the commit-issue pairs to predict a link, and (2) a textual component that does the same using textual information of the commit-issue pairs. Then, combining the results from the two classifiers, Hybrid-Linker makes the final prediction. Thus, every time one component falls short in predicting a link, the other component fills the gap and improves the results. We evaluate Hybrid-Linker against competing approaches, namely FRLink and DeepLink, on a dataset of 12 projects. Hybrid-Linker achieves 90.1% recall, 87.8% precision, and an 88.9% F-measure. It also outperforms FRLink and DeepLink by 31.3% and 41.3%, respectively, in terms of F-measure. Moreover, the proposed approach exhibits extensive improvements in terms of performance as well. Finally, our source code and data are publicly available.
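
As a quick consistency check on the figures reported above, the F-measure can be recomputed from the stated precision and recall. This is a minimal sketch that assumes the abstract's F-measure is the balanced F1 score (the harmonic mean of precision and recall); the abstract does not say which variant was used, and the function name `f_measure` is illustrative rather than taken from the authors' code.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract (assumed to be F1 inputs).
reported_precision = 0.878
reported_recall = 0.901

print(f"{f_measure(reported_precision, reported_recall):.1%}")  # prints 88.9%
```

Under that assumption, the recomputed value agrees with the 88.9% F-measure stated in the abstract.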

Improving Quality of a Post's Set of Answers in Stack Overflow

Conference paper (2020) - Mohammadreza Tavakoli (author), Maliheh Izadi (author), Abbas Heydarnoori (author)

Community Question Answering platforms such as Stack Overflow help a wide range of users solve their challenges online. As the popularity of these communities has grown over the years, both the number of members and the number of posts have escalated. Also, due to the diverse backgrounds, skills, expertise, and viewpoints of users, each question may obtain more than one answer. Therefore, the focus has changed toward producing posts that have a set of answers more valuable for the community as a whole, not just one accepted answer aimed at satisfying only the question-asker. As in any large community, a large number of low-quality posts on Stack Overflow require improvement. We call these posts "deficient", and define them as posts with questions that either have no answer yet or whose answers can be improved by others. In this paper, we propose an approach to automate the identification of such posts and boost their set of answers with the help of related experts. With the help of 60 participants, we trained a classification model to identify deficient posts by investigating the relationship between characteristics of 3075 questions posted on Stack Overflow and their need for a better set of answers. Then, we developed an Eclipse plugin named SOPI and integrated the prediction model in the plugin to link these deficient posts to related developers (in terms of their development context and expertise area) and help them improve the answer set. We evaluated both the functionality of our plugin and the impact of answers submitted to Stack Overflow with the help of 10 and 15 expert industrial developers, respectively. Our results indicate that decision trees, specifically the J48 algorithm, predict a deficient question better than the other methods, with 94.5% precision and 90.3% recall. We conclude that our plugin not only helps programmers contribute more easily to Stack Overflow, but also improves the quality of existing answers.
