Qi Zhou

and 6 more

Machine learning can improve the accuracy of identifying mass movements in seismic signals and extend early warning times. However, we lack a thorough understanding of which seismic features are effective and of the limitations of different machine learning models, especially for debris-flow warning. Here, we investigate the importance of seismic features for binary debris-flow classification using two ensemble models: Random Forest (RF) and eXtreme Gradient Boosting (XGB). We find that an established approach, which trains machine learning models for debris-flow classification on more than seventy seismic signal features, may be affected by redundant input information. These seismic features are derived from physical and statistical knowledge of impact sources and are grouped into waveform, spectrum, spectrogram, and network sets. Our results show that models using only six selected seismic features perform similarly on the binary debris-flow classification task to published benchmark results obtained with seventy features. We also consider a model that captures patterns in sequential data, rather than using only the information in a single window as the ensemble models do: the Long Short-Term Memory (LSTM) algorithm does not improve binary debris-flow classification performance over RF and XGB. In the debris-flow alarm task, however, the LSTM model predicts debris-flow initiation more consistently and generates fewer false warnings. Our proposed framework simplifies seismic signal-driven early warning for debris flows and provides an appropriate workflow for identifying other mass movements.
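The idea of ranking features by importance and retraining on a small subset can be illustrated with a minimal sketch. It is not the authors' pipeline, and everything in it is an assumption made for illustration: the data are synthetic Gaussian values rather than seismic features, a nearest-centroid classifier stands in for RF and XGB so that the example needs only the Python standard library, and importance is measured by permutation (the drop in accuracy when one feature column is shuffled). All function names are hypothetical.

```python
import random


def make_data(n_per_class=200, n_features=10, informative=(0, 1), shift=2.0, seed=0):
    """Synthetic stand-in for windowed features (assumption: not real seismic data)."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
            if label == 1:
                # Only the informative features differ between the two classes.
                for j in informative:
                    row[j] += shift
            X.append(row)
            y.append(label)
    return X, y


def fit_centroids(X, y):
    """Nearest-centroid classifier: one mean vector per class (stand-in for RF/XGB)."""
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def predict(centroids, x):
    """Return the class whose centroid is closest in squared Euclidean distance."""
    return min(centroids, key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])))


def accuracy(centroids, X, y):
    return sum(predict(centroids, x) == t for x, t in zip(X, y)) / len(y)


def permutation_importance(centroids, X, y, n_repeats=5, seed=1):
    """Mean accuracy drop when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    base = accuracy(centroids, X, y)
    scores = []
    for j in range(len(X[0])):
        drop = 0.0
        for _ in range(n_repeats):
            col = [x[j] for x in X]
            rng.shuffle(col)
            X_perm = [x[:j] + [v] + x[j + 1:] for x, v in zip(X, col)]
            drop += base - accuracy(centroids, X_perm, y)
        scores.append(drop / n_repeats)
    return scores


def select_top_k(scores, k):
    """Indices of the k features with the largest importance scores."""
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]


X_train, y_train = make_data(seed=0)
X_test, y_test = make_data(seed=42)

full_model = fit_centroids(X_train, y_train)
scores = permutation_importance(full_model, X_test, y_test)
top = select_top_k(scores, k=2)

# Retrain on the selected subset only and compare held-out accuracy.
X_train_small = [[x[j] for j in top] for x in X_train]
X_test_small = [[x[j] for j in top] for x in X_test]
small_model = fit_centroids(X_train_small, y_train)

acc_full = accuracy(full_model, X_test, y_test)
acc_small = accuracy(small_model, X_test_small, y_test)
print("selected features:", sorted(top))
print("accuracy, all features:", acc_full, "| selected features:", acc_small)
```

In this toy setting the reduced model should score close to the full model, because the discarded columns carry no class information. With real data the same comparison would need to be made on held-out events, and the ranking could instead come from the importance measures that RF and XGB implementations provide.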