Citation: XU Shouliang, XU Jian. User Daily Load Classification Method Based on Improved Deep Clustering Under Spark Architecture[J]. Modern Electric Power. DOI: 10.19725/j.cnki.1007-2322.2023.0133

User Daily Load Classification Method Based on Improved Deep Clustering Under Spark Architecture

Abstract: Load clustering is an important technique in power system management. Mining users' electricity consumption patterns through clustering helps power system managers understand and optimize system operation, improving its efficiency and economy. As load data grow in volume and complexity, traditional load clustering methods struggle to process massive, high-dimensional load data efficiently and accurately. This paper therefore proposes a daily load classification method based on improved deep clustering under the Spark distributed computing architecture. First, a convolutional neural network autoencoder extracts representative feature vectors for each user, which are fed into a K-means clustering layer to perform load clustering; the feature extraction model and the clustering model are then jointly optimized to form a deep clustering model. Second, to account for the adverse effect on the neural network of marginal load samples lying near the boundaries between load classes, self-paced learning is introduced and a new loss function is designed. Finally, big data technology is combined with the deep clustering algorithm, and the Spark distributed computing platform is used to run the deep clustering algorithm in parallel. Case studies show that the proposed algorithm outperforms traditional algorithms in both clustering quality and processing efficiency.
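The idea of shielding the model from boundary samples can be illustrated with a minimal sketch. The abstract does not give the paper's loss function, so the code below is not the authors' method: it assumes a generic hard self-paced weighting rule (a sample contributes only if its squared distance to the nearest centroid is below a threshold) applied to a plain K-means centroid update. The function names, the threshold `lam`, and the toy feature vectors are all hypothetical; in the paper's setting the features would come from the autoencoder.

```python
import numpy as np


def self_paced_weights(losses, lam):
    """Hard self-paced weights (assumed rule): 1 if loss < lam, else 0."""
    return (losses < lam).astype(float)


def self_paced_kmeans_step(features, centroids, lam):
    """One K-means update that ignores samples far from every centroid.

    features:  (n, d) array of feature vectors (hypothetical stand-ins
               for autoencoder embeddings of daily load curves)
    centroids: (k, d) array of current cluster centres
    lam:       self-paced threshold on the squared distance
    """
    # Squared Euclidean distance from every sample to every centroid: (n, k).
    d2 = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)
    # Per-sample loss is the squared distance to the assigned centroid.
    losses = d2[np.arange(len(features)), labels]
    weights = self_paced_weights(losses, lam)

    new_centroids = centroids.copy()
    for k in range(len(centroids)):
        # Only confidently assigned ("easy") samples move the centroid.
        mask = (labels == k) & (weights > 0)
        if mask.any():
            new_centroids[k] = features[mask].mean(axis=0)
    return labels, weights, new_centroids


if __name__ == "__main__":
    # Toy data: two tight groups plus one sample midway between them.
    feats = np.array([[0.0, 0.0], [0.2, 0.0], [5.0, 5.0], [5.2, 5.0], [2.5, 2.5]])
    cents = np.array([[0.0, 0.0], [5.0, 5.0]])
    labels, weights, cents = self_paced_kmeans_step(feats, cents, lam=1.0)
    print(weights)
    print(cents)
```

In a self-paced schedule the threshold is typically raised over training so that harder samples are admitted gradually; how the paper sets or schedules it is not stated in the abstract.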

     
