Solar energy is an abundant, scalable, and clean source of energy. With the exponential drop in the price of photovoltaic (PV) modules, more and more rooftop PV systems are being installed worldwide. Because these small-scale PV systems do not use expensive sensors, malfunctions are difficult to detect. Undetected malfunctions can lead to lower energy generation and financial losses for the owners. Thus, a method for PV yield monitoring is developed for early and remote fault detection. This method does not use the conventional analytical approach, as that approach depends on inaccurately extrapolated weather data. Instead, the proposed method uses data from similar or neighbouring PV systems (peers) to estimate the expected energy generation. By comparing the expected energy generation with the actual energy generation, a faulty system can be flagged. In this project, information from about 12,000 PV systems is used, which includes system design information such as location, number of panels, and panel orientation, along with historical daily energy generation for periods ranging from two months to seven years per system.
In this thesis, a machine learning model was developed for predicting energy yields, with a Genetic Algorithm (GA) used for optimization. The model splits the available data into system design, system location, and system yield data, and uses these three categories as criteria for finding PV systems similar to the monitored system. Once good peer systems are located, their yield data are used to estimate the expected energy yield of the monitored system. The three criteria do not have equal influence on finding good peers, so their relative weights were optimized on the training data. After optimization, the relative influence of system design : system yield : system location was found to be 0.125 : 0.875 : 0, with on average 16 good peers needed for accurate predictions. The proposed model has a mean normalized RMSE of 0.057, and about 95% of the systems tested had an R² score higher than 0.85. By comparison, the existing commercial software at Solar Monkey has a mean normalized RMSE of 0.082, and about 83% of the systems tested had an R² score higher than 0.85.
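The peer-selection idea can be illustrated with a minimal Python sketch. Only the weights (0.125 : 0.875 : 0) and the peer count (16) come from the text. The function names, the assumption that each criterion yields a distance scaled to [0, 1], the use of mean specific yield (kWh/kWp) for the estimate, and the choice to normalize RMSE by the mean actual yield are all hypothetical choices made for illustration, and the thesis may define them differently.

```python
import math

# Weights reported in the text for design : yield : location.
WEIGHTS = {"design": 0.125, "yield": 0.875, "location": 0.0}
N_PEERS = 16  # average number of good peers reported in the text


def combined_distance(dists, weights=WEIGHTS):
    """Weighted sum of per-criterion distances (assumed scaled to [0, 1])."""
    return sum(weights[k] * dists[k] for k in weights)


def select_peers(candidates, n_peers=N_PEERS):
    """Return ids of the n_peers candidates closest to the monitored system.

    `candidates` maps a system id to its per-criterion distances
    from the monitored system (hypothetical data layout).
    """
    ranked = sorted(candidates, key=lambda sid: combined_distance(candidates[sid]))
    return ranked[:n_peers]


def expected_yield(peer_specific_yields, capacity_kwp):
    """Estimate daily energy (kWh) from the peers' mean specific yield (kWh/kWp)."""
    mean_specific = sum(peer_specific_yields) / len(peer_specific_yields)
    return mean_specific * capacity_kwp


def normalized_rmse(actual, predicted):
    """RMSE divided by the mean actual value (one common normalization, assumed here)."""
    mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    return math.sqrt(mse) / (sum(actual) / len(actual))
```

With a location weight of 0, geographic distance has no effect on the ranking in this sketch, which mirrors the reported result that yield similarity dominates peer selection.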
The energy generation predicted by the proposed model is compared with the actual energy generation to detect malfunctions in the monitored system. To this end, 120 randomly chosen PV systems were analysed for faults. Based on this analysis, a semi-automatic categorization framework was created, with the proposed model as one of its criteria, to identify common cases such as missing data, under-performance, over-performance, and false positives. Using the categorization framework, several PV systems were identified as interesting examples: under-performance due to broken panels or strings, over-performance due to a change in system size, and false positives. The model is especially useful for separating system design mismatch from actual system malfunctions. The framework shows how the proposed peer-to-peer model can be used for fault detection alongside certain other models.
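As a rough illustration of comparing predicted and actual generation, the following Python sketch labels each day and reports only the deviations that persist. The 20% tolerance, the five-day persistence window, and both function names are hypothetical. The framework described above is semi-automatic and combines several criteria, so this is a simplified stand-in rather than the thesis's procedure.

```python
def categorize_day(actual_kwh, expected_kwh, tolerance=0.2):
    """Label one day by comparing actual with expected energy (kWh)."""
    if actual_kwh is None:
        return "missing data"
    if expected_kwh <= 0:
        return "normal"  # nothing expected, so nothing to compare against
    ratio = actual_kwh / expected_kwh
    if ratio < 1 - tolerance:
        return "under-performance"
    if ratio > 1 + tolerance:
        return "over-performance"
    return "normal"


def persistent_flags(actual, expected, min_days=5, tolerance=0.2):
    """Return the non-normal labels that persist for at least min_days in a row.

    Requiring persistence is an assumed way to suppress one-off deviations
    that would otherwise show up as false positives.
    """
    flags = set()
    run_label, run_length = None, 0
    for a, e in zip(actual, expected):
        label = categorize_day(a, e, tolerance)
        run_length = run_length + 1 if label == run_label else 1
        run_label = label
        if label != "normal" and run_length >= min_days:
            flags.add(label)
    return flags
```

A system whose actual yield stays well below the peer-based estimate for a week would be flagged as under-performing, while a single cloudy-day mismatch would not.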