BrutNet: A Novel Approach for Violence Detection and Classification
using DCNN with GRU
Abstract
Automatic Violence Detection and Classification (AVDC) with deep
learning has garnered significant attention in computer vision research.
This paper presents a novel approach that combines a custom Deep
Convolutional Neural Network (DCNN) with a Gated Recurrent Unit (GRU) to
develop a new AVDC model called BrutNet. Specifically, we develop a
time-distributed DCNN (TD-DCNN) that generates a compact 2D representation
of 512 spatial features per frame from a set of equally spaced frames of
dimension 160×90 extracted from short video segments. To leverage the
temporal information, a GRU layer is then employed, generating a condensed
1D vector that is passed through multiple dense layers for binary
classification of violent and non-violent content. Overfitting is addressed by
incorporating dropout layers with a rate of 0.5, while the hidden and
output layers employ rectified linear unit (ReLU) and sigmoid
activations, respectively. The model is trained on an NVIDIA Tesla K80
GPU via Google Colab and outperforms existing models across several
video datasets, including Hockey Fights, Movie Fights, AVD, and
RWF-2000. Notably, our model requires only 3.416 million parameters
while achieving test accuracies of 97.62%, 100%, 97.22%, and 86.43% on
the respective datasets. Thus, BrutNet shows the potential to emerge as
a highly efficient and robust AVDC model in support of greater public
safety, content moderation and censorship, computer-aided
investigations, and law enforcement.
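
For illustration, the following minimal Keras-style sketch shows one way
the architecture summarised above could be assembled. Only the 160×90
frame size, the 512 features per frame, the time-distributed DCNN, the
GRU layer, the 0.5 dropout rate, and the ReLU/sigmoid activations are
taken from the abstract; the number of frames per segment, the
convolutional stack, the GRU width, the dense-layer sizes, and the
optimizer are illustrative assumptions rather than the authors'
implementation.

    # Minimal sketch of a TD-DCNN + GRU classifier as described in the
    # abstract. Frame count, conv stack, GRU width, dense sizes, and the
    # optimizer are assumptions; the 160x90 frame size, 512 features per
    # frame, dropout rate of 0.5, and ReLU/sigmoid activations follow the text.
    from tensorflow.keras import layers, models

    FRAMES = 30                 # assumed number of equally spaced frames per segment
    FRAME_SHAPE = (90, 160, 3)  # height x width x channels for 160x90 frames

    # Per-frame DCNN that compresses each frame into 512 spatial features.
    frame_encoder = models.Sequential([
        layers.Input(shape=FRAME_SHAPE),
        layers.Conv2D(32, 3, activation="relu", padding="same"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu", padding="same"),
        layers.MaxPooling2D(),
        layers.Conv2D(128, 3, activation="relu", padding="same"),
        layers.GlobalAveragePooling2D(),
        layers.Dense(512, activation="relu"),   # 512 features per frame
    ], name="td_dcnn")

    model = models.Sequential([
        layers.Input(shape=(FRAMES, *FRAME_SHAPE)),
        layers.TimeDistributed(frame_encoder),  # compact 2D representation: (FRAMES, 512)
        layers.GRU(256),                        # condensed 1D temporal vector (width assumed)
        layers.Dropout(0.5),
        layers.Dense(128, activation="relu"),   # hidden dense layer with ReLU
        layers.Dropout(0.5),
        layers.Dense(1, activation="sigmoid"),  # violent vs. non-violent output
    ], name="brutnet_sketch")

    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])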