Memory Layout Optimisation on Abstract Syntax Trees
Impact on Utilisation Speed During Type Checking and Code Generation Phases
More Info
expand_more
Abstract
In the field of software engineering, the speed of compilation plays a crucial role in enhancing development productivity. This thesis investigates the impact of optimising the memory layout of Abstract Syntax Trees (ASTs) on the performance of the type checking and code generation phases in the compilation process of strictly-typed procedural programming languages. Existing compilers often employ a traditional Object-Oriented (OO) approach to AST construction, leading to inefficiencies such as high memory overhead and suboptimal cache usage. We applied Data-Oriented Design (DOD) principles to restructure ASTs, aiming to enhance data locality and reduce memory access times.
The study investigates various AST memory layouts, including a transition from a naive OO model to a Struct-of-Arrays (SoA) model. We evaluated the effect of these memory layouts on compiler performance using a set of benchmarks based on real-world code. We executed the benchmarks on diverse hardware platforms, measuring key performance metrics such as type checking time, code generation time, and cache miss rates.
The results demonstrate that adopting an SoA model for ASTs significantly reduces the duration of the type checking and code generation phases, with improvements varying across different hardware architectures. We provide empirical evidence supporting the application of DOD principles and specifically SoA to ASTs for performance improvements. By highlighting and addressing the performance bottlenecks associated with traditional AST layouts, we contribute to the broader goal of advancing compiler design and optimisation.