Using a Time Dependency Graph to find the most widely used Debian package

Abstract

A central principle of open source development is that developers can reuse existing libraries instead of rewriting the same functionality, which is a major reason for the practice's popularity. However, libraries usually depend on other existing pieces of code, so a failure in one small piece of code can crash an entire application through its dependencies. Since Debian is one of the largest community-run distributions, it is important to have a good tool for analysing these dependencies and helping to avoid such crashes. This research therefore focuses on building a dependency graph for Debian's package manager and determining which packages are the most widely used. What separates this research from previous related work is the addition of a time component to the graph, in the form of a release-date timestamp for each version of a package. This allows extensive traversal of the graph, including traversal restricted to specific time periods, so that no transitive dependencies should be missed. The paper concludes by defining the most widely used packages as those that achieve the highest PageRank score in the graph. The top three are "libc6", "libgcc1" and "multiarch-support".
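As an illustration of the idea only, the following minimal sketch builds a small time-filtered dependency graph and ranks its packages with a plain power-iteration PageRank. It is not the paper's implementation. The release data, the dates, the "app-a" and "app-b" packages, and the function names `build_graph` and `pagerank` are all hypothetical. The sketch also assumes that edges point from a package to the packages it depends on, so that heavily depended-upon packages accumulate score.

```python
from collections import defaultdict
from datetime import date

# Hypothetical toy data: (package, version, release date, dependencies).
# Real Debian metadata would be far larger; the dates here are invented.
RELEASES = [
    ("libc6", "2.24", date(2017, 1, 1), []),
    ("libgcc1", "6.3", date(2017, 2, 1), ["libc6"]),
    ("app-a", "1.0", date(2017, 3, 1), ["libc6", "libgcc1"]),
    ("app-b", "1.0", date(2018, 5, 1), ["libgcc1"]),
]


def build_graph(releases, start, end):
    """Keep releases dated within [start, end]; edges point to dependencies."""
    nodes = set()
    edges = defaultdict(set)
    for name, _version, released, deps in releases:
        if start <= released <= end:
            nodes.add(name)
            for dep in deps:
                # A dependency becomes a node even if its own release
                # falls outside the window (an assumption of this sketch).
                nodes.add(dep)
                edges[name].add(dep)
    return nodes, edges


def pagerank(nodes, edges, damping=0.85, iterations=100):
    """Power-iteration PageRank; dangling mass is spread uniformly."""
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Packages without dependencies have no outgoing edges.
        dangling = sum(rank[node] for node in nodes if not edges.get(node))
        new_rank = {
            node: (1.0 - damping) / n + damping * dangling / n
            for node in nodes
        }
        for src in nodes:
            targets = edges.get(src)
            if targets:
                share = damping * rank[src] / len(targets)
                for dst in targets:
                    new_rank[dst] += share
        rank = new_rank
    return rank


# Rank packages over the whole period, then over 2017 only.
nodes, edges = build_graph(RELEASES, date(2017, 1, 1), date(2018, 12, 31))
rank = pagerank(nodes, edges)
nodes_2017, edges_2017 = build_graph(RELEASES, date(2017, 1, 1), date(2017, 12, 31))
rank_2017 = pagerank(nodes_2017, edges_2017)

for name in sorted(rank, key=rank.get, reverse=True):
    print(name, round(rank[name], 4))
```

Changing the `start` and `end` arguments restricts the graph to a time period, which is how the timestamp on each release enters this sketch; a fuller treatment would presumably also resolve which version of each dependency was current at a given date.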