Scratch is a popular visual programming language aimed at children, and is used by teachers and after-school code clubs to teach students about programming. Measuring whether students understand the underlying concepts, however, is a difficult task. In this research, we tried clustering Scratch projects by complexity to help students improve their programming skills. To do so, we selected an existing data set and extracted features that indicate code complexity. Previous researchers attempted clustering on a single metric that summarises a project's overall complexity. Other researchers set out to measure the growth of students by clustering the projects those students created.
With that in mind, we adopted a partition-based clustering algorithm to cluster the projects, as this method indicates outliers. We examined the quality of the resulting clusters using the silhouette coefficient. We set up five experiments with different input vectors to determine the impact of each input on the clusters. We found no clear indication that the projects cluster by the selected features. This could mean that Scratch projects are not suitable for measuring a high-level understanding of programming concepts. Including the project name in the input vector had a negligible effect on the outcome of the experiments.
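
To illustrate how a partition-based clustering can be scored with the silhouette coefficient, the following is a minimal sketch and not the pipeline used in this research. It assumes k-means as the partition-based algorithm (the text does not name the one used), Euclidean distance, and two hypothetical features per project (block count and script count). The toy values are invented for illustration only.

```python
import math
import random


def kmeans(points, k, iters=50, seed=0):
    """Lloyd's k-means; returns one cluster label per point."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # pick k distinct points as starting centroids
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return labels


def silhouette(points, labels):
    """Mean silhouette coefficient over all points, a value in [-1, 1]."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the nearest other cluster.
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Hypothetical feature vectors: (block count, script count) per project.
projects = [(10, 2), (12, 3), (11, 2), (80, 15), (85, 14), (82, 16)]
labels = kmeans(projects, k=2)
score = silhouette(projects, labels)
print(labels, round(score, 3))
```

A score close to 1 indicates compact, well-separated clusters, while a score near 0 indicates overlapping clusters, which is the kind of outcome that would give no clear indication of structure. In practice the features would usually be scaled before clustering so that no single feature dominates the distance.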