S. Proksch
33 records found
A reliable dependency resolution process should minimize dependency-related issues. We identify transparency, stability, and flexibility as the three core properties that define a reliable resolution process and discuss how different dependency declaration strategies affect them.
...
Research on open-source software evolution gained popularity in the last decade focusing on the theoretical determining factors. Additional works studied growth patterns modeling using time series techniques on small projects and metrics samples or non-openly available larger dat
...
In the modern digital landscape, cybersecurity threats are a significant concern, particularly for publicly accessible computer systems. Vulnerabilities, or flaws in system design, can be exploited by malicious actors to compromise system security and integrity. This paper explor
...
This paper examines the release practices of Java Maven Repositories on GitHub. Most prior research in this vein has been done on Maven Central, the largest Maven package repository. However, GitHub hosts 15.5 million Java repositories, and is left untapped. Additionally of inter
...
Navigating Repositories
Assessing the Impact of External Repositories on Packages in Maven Central
This paper presents a comprehensive experimental study on the use and impact of external repositories in the Maven ecosystem. For this research the prevalence, naming patterns, and potential risks associated with external repositories were analyzed. We analyzed 199,188 packages a
...
This study conducts an investigation of the challenges faced by aging projects in Maven Central, focusing on the issue of missing dependencies. Using the Maven Explorer indexer, we systematically examine the correlation between the age of a project and the frequency of dependency
...
Can we extract a relevant, available, and self-contained core of the Maven ecosystem?
Extracting the pillars of the community, and their dependencies.
The Maven ecosystem, with an emphasis on Maven Central, contains a plethora of toy-projects. This paper addresses this problem by formulating a core containing the pillars of the Maven ecosystem, such that it can be exploited for research concerning library quality. The constru
...
Discovering Digital Siblings
Quantifying Inter-Repository Similarity Through GitHub Dependency Structures
Open Source developers typically use Git repositories to transparently store the source code of projects and contribute to the code of others. There are millions of repositories actively hosted on platforms such as GitHub. This presents an opportunity for sharing knowledge betwee
...
Finding your digital sibling
Grouping GitHub projects that share certain attributes based on interactions and activities
This study explores the feasibility of categorizing GitHub projects based on their interactions and activities, aiming to assist both researchers and practitioners in navigating the vast landscape of open-source software. Through experiments and analysis, key attributes contribut
...
Finding your digital sibling: which other GitHub projects are similar to yours?
Finding similar repositories based on the available documentation
This paper aims to study the importance of considering the documentation side of GitHub repositories when assessing the similarity between two or more applications. Readme and Wiki files, along with Comments from the source files, are the dimensions proposed to be analyzed throug
...
Contribution of source code identifiers to GitHub project similarity
Which other GitHub projects are similar to yours?
GitHub is an online platform that hosts millions of projects. Many of these projects have the same topic or share the same goal. Finding similar projects which can be used as role models, inspiration or examples can help developers meet their requirements faster and more efficien
...
GitHub is the home of hundreds of millions of Open Source Software (OSS) repositories where users collaborate on projects and find inspiration for new ideas. Some of these projects have certain build configurations set up to make building, testing, and deploying the software more
...
Call graphs are useful tools for representing method relationships within software projects and correlations between dependencies. Although static analysis is a prevalent method for call graph construction, it has its limitations such as struggling with handling dynamic features
...
The escalating complexity of software systems in the digital age heavily relies on reusable code collections (packages) for their development and operation. Despite the numerous advantages of pre-existing libraries, managing dependencies can be intricate and time-consuming. This t
...
Uncovering secrets of the Maven Repository: Java Build Aspects
An empirical analysis
The Maven Central Repository hosts over 11 million packages. As Maven itself is a build tool for Java, the majority of these packages are Java archives.
This research aims to analyze these packages and look into various build aspects of these projects (the research questions) ...
Maven Central serves as the de-facto repository for distributing free and open-source Java libraries and components. Evaluating its present state and overall robustness is pivotal for enabling the community to make well-informed decisions concerning its future progression. Such i
...
Uncovering Secrets of the Maven Repository
Maven packaging
Maven, a widely adopted software ecosystem for Java libraries, plays a critical role in the development and deployment of software applications. However, there exists a limited understanding of the composition and characteristics of the Maven repository, leaving users and contrib
...
GitHub Mining
The Implementation of Continuous Integration Pipelines
While continuous integration has already been proven to positively affect software development, little is known about how it should be implemented based on project context. This paper investigates how CI pipelines are configured by analysing data mined from software projects on G
...
Discovering the metrics for assessing a project’s maturity
An analysis of key indicators of maturity
Continuous integration (CI) is a software engineering practice that promotes frequent code integration into a shared repository, improving the productivity within development teams as well as the quality of the software being developed. While CI adoption has gained traction, stud
...
Exploring Descriptive Metrics of Build Performance
A Study of GitHub Actions in Continuous Integration Projects
The Continuous Integration (CI) practice has been rapidly growing and developing ever since its introduction. This practice has been constantly providing benefits to developers such as early bug detection and feedback to development teams. In this study, we aim to identify the
...