Polo del Conocimiento, Vol 10, No 6 (2025)

Machine learning-based fault type identification using random forests

Alex Ricardo Guamán Andrade, Jorge Rigoberto López Ortega, Hernán Patricio Moyano Ayala, José Luis Guamán Andrade

Abstract


This paper presents a framework for fault classification in electrical distribution networks that uses a Random Forest (RF) classifier trained on simulation-generated datasets. A 6-node IEEE test system was modeled in MATLAB Simulink to emulate different fault types under diverse operating conditions. Electrical quantities were recorded for each simulated scenario, and statistical features, mainly root mean square (RMS) values, were extracted from them to build a structured dataset. The RF classifier was trained on the labeled data and evaluated using stratified cross-validation. It achieved an overall accuracy of 86% across seven fault classes, with high precision and recall for most of them, especially A-G, AB, AC, B-G, and C-G. The model was less effective at distinguishing ABC faults from BC faults. It nevertheless generalized to an external test case containing an AB fault, which was classified correctly even though that case was not part of the training set. The results support the effectiveness of the proposed methodology for scalable, data-driven fault diagnosis in distribution networks. Combining simulation-based data generation with ensemble learning offers a robust basis for real-time network monitoring and adaptive protection. Future research will focus on expanding the feature space, improving the classification of symmetric faults, and deploying the framework on edge computing architectures for real-time operation.
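As an illustration of the RMS feature extraction step mentioned above, the following is a minimal sketch in plain Python. It is not the authors' implementation: the signal names (`Ia`, `Ib`, `Ic`), the 60 Hz frequency, the sampling rate, and the amplitudes are all assumptions chosen for the example, and the waveforms are synthetic sinusoids rather than Simulink output.

```python
import math


def rms(samples):
    """Return the root mean square of a sequence of samples."""
    if not samples:
        raise ValueError("rms() needs at least one sample")
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def sine_wave(amplitude, freq_hz, sample_rate_hz, n_samples, phase_rad=0.0):
    """Generate a sampled sinusoid (a stand-in for a simulated measurement)."""
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * k / sample_rate_hz + phase_rad)
        for k in range(n_samples)
    ]


def rms_features(signals):
    """Map a dict of named signals to a dict of their RMS values."""
    return {name: rms(samples) for name, samples in sorted(signals.items())}


# Assumed values for illustration only: 60 Hz system sampled at 6 kHz,
# so 100 samples cover exactly one cycle.
F0_HZ = 60
FS_HZ = 6000
N_SAMPLES = 100

# Hypothetical scenario: phase A current is much larger than phases B and C,
# loosely resembling a single line-to-ground (A-G) fault.
signals = {
    "Ia": sine_wave(10.0, F0_HZ, FS_HZ, N_SAMPLES),
    "Ib": sine_wave(1.0, F0_HZ, FS_HZ, N_SAMPLES, -2 * math.pi / 3),
    "Ic": sine_wave(1.0, F0_HZ, FS_HZ, N_SAMPLES, 2 * math.pi / 3),
}

features = rms_features(signals)
for name, value in features.items():
    print(f"{name}: {value:.4f}")
```

Over a whole number of cycles, the RMS of a sinusoid equals its amplitude divided by the square root of 2, so the elevated phase shows up directly as a larger feature value. In a pipeline like the one described, one such feature vector per simulated scenario, paired with its fault label, would form a row of the training dataset.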