[1911.03336v1] Hierarchical Clustering for Smart Meter Electricity Loads based on Quantile Autocovariances
The main advantage of this approach is that each series can be summarized by a small set of representative features, which makes the methods easy to implement, computationally efficient, easy to automate, and scalable to hundreds of thousands of series, i.e., suitable for real-world applications with large time series datasets such as those obtained from smart meters.

Abstract In order to improve the efficiency and sustainability of electricity systems, most countries worldwide are deploying advanced metering infrastructures, and in particular household smart meters, in the residential sector. This technology records electricity load time series at very high frequency, information that can be exploited to develop new clustering models that group individual households by similar consumption patterns. To this end, in this work we propose three hierarchical clustering methodologies that capture different characteristics of the time series. These are based on a set of “dissimilarity” measures computed over different features: quantile autocovariances, and simple and partial autocorrelations. The main advantage is that they summarize each time series in a few representative features, so they are computationally efficient, robust against outliers, easy to automate, and scalable to hundreds of thousands of smart meter series. We evaluate the performance of each clustering model on a real-world smart meter dataset with thousands of half-hourly time series. The results show that the obtained clusters identify relevant consumption behaviors of households and capture part of their geo-demographic segmentation. Moreover, we apply a supervised classification procedure to explore which features are most relevant in defining each cluster.
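The feature-based pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the usual sample definition of the quantile autocovariance at lag l, namely the empirical joint probability that X_t is at or below its τ-quantile and X_{t+l} is at or below its τ'-quantile, minus ττ'. It also assumes a Euclidean distance between feature vectors. The function names, lags, and probability levels are hypothetical choices made for illustration; only the complete linkage is taken from the paper's figure captions.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist


def qaf_features(x, lags=(1, 2), probs=(0.1, 0.5, 0.9)):
    """Sample quantile autocovariances of one series, flattened to a vector.

    Assumed definition (not taken from the source text):
    gamma_l(tau, tau') = P(X_t <= q_tau, X_{t+l} <= q_tau') - tau * tau'.
    """
    x = np.asarray(x, dtype=float)
    q = np.quantile(x, probs)                      # empirical quantiles
    ind = (x[None, :] <= q[:, None]).astype(float)  # indicators, one row per level
    feats = []
    for lag in lags:
        a = ind[:, :-lag]                          # I(X_t <= q_tau)
        b = ind[:, lag:]                           # I(X_{t+lag} <= q_tau')
        joint = a @ b.T / a.shape[1]               # empirical joint probabilities
        feats.append(joint - np.outer(probs, probs))
    return np.concatenate([f.ravel() for f in feats])


def cluster_series(series, n_clusters=2):
    """Complete-linkage hierarchical clustering on the feature vectors."""
    features = np.vstack([qaf_features(s) for s in series])
    dist = pdist(features, metric="euclidean")     # assumed dissimilarity
    tree = linkage(dist, method="complete")
    return fcluster(tree, t=n_clusters, criterion="maxclust")
```

Each series is reduced to len(lags) × len(probs)² numbers regardless of its length, which is why this kind of approach scales to many meters: the pairwise distances are computed on short feature vectors rather than on the raw half-hourly loads.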
‹Figure 1. Dendrogram obtained with quantile autocovariances and complete linkage. (Numerical Results)
Figure 2. Dendrogram obtained with autocorrelation coefficients and complete linkage. (Numerical Results)
Figure 3. Dendrogram obtained with partial autocorrelation coefficients and complete linkage. (Numerical Results)
Figure 4. Main clusters obtained with quantile autocovariances and complete linkage. (Numerical Results)
Figure 5. Main clusters obtained with autocorrelation coefficients and complete linkage. (Numerical Results)
Figure 6. Main clusters obtained with partial autocorrelation coefficients and complete linkage. (Numerical Results)
Figure 7. Predictor importance estimates for clusters based on quantile autocovariances. (Numerical Results)
Figure 8. Predictor importance estimates for clusters based on autocorrelation coefficients. (Numerical Results)
Figure 9. Predictor importance estimates for clusters based on partial autocorrelation coefficients. (Numerical Results)
Figure 10. Prototype’s hourly profile for clusters obtained with quantile autocovariances and complete linkage. (Numerical Results)
Figure 11. Prototype’s hourly profile for clusters obtained with autocorrelation coefficients and complete linkage. (Numerical Results)
Figure 12. Prototype’s hourly profile for clusters obtained with partial autocorrelation coefficients and complete linkage. (Numerical Results)›