The Grid computing vision promises to provide the platform needed for a new and more demanding range of applications. For this promise to be fulfilled, a number of hurdles need to be overcome, including the design and deployment of adequate resource management and information services. In this context, understanding the characteristics of real Grid workloads is a crucial step both for improving the quality of existing Grid services and for guiding the design of new solutions. Towards this goal, we present the characteristics of traces from four real Grid environments: LCG, Grid3, and TeraGrid, which are among the largest production Grids currently deployed, and the DAS, which is a research Grid. We focus our analysis on the characteristics of virtual organizations, users, and individual jobs. We further attempt to quantify the evolution and the performance of the Grid systems from which our traces originate. Finally, given the scarcity of the information available for analysis purposes, we discuss the requirements of a new format for Grid traces, and we propose the establishment of a virtual center for workload-based Grid benchmarking data: the Grid Workloads Archive.