Face Detection:
We created the graphical user interface (GUI) in C++ by modifying the imglab tools for image annotation. We trained the interface to
detect seal faces, allowing for automated detection of all seal faces in
each photo. In addition, the GUI allows for the option to manually
select seal faces by drawing boxes around valid faces in the
application. A valid seal face is determined based on the quality and
clarity of the image, as well as the angle of the seal face to the
camera. Invalid faces are those that are too blurry, not facing the
camera, or are partially obstructed. Invalid faces are ignored by the
software, as are regions of the image not marked as faces. Variations in
illumination and other imaging conditions can introduce noise into the
data and impede analysis. We next converted the photos to grayscale so
that the model learns from physical features and patterns of coloration
rather than from the colors themselves, which also helps to reduce
overfitting during training. After all photos were aligned and cropped, we manually grouped
photos of the same seals into folders by individual. To train our face
detector, we selected all seal faces from the 516 photos taken at Brandt
Ledges on January 29th, 2020.
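
The grayscale conversion described above can be illustrated with a minimal sketch. The text does not state which conversion was used, so the ITU-R BT.601 luma weights below are an assumption, and the function name and the tiny example image are hypothetical.

```python
# Sketch only: the BT.601 luma weights are an assumption; the conversion
# actually applied to the seal photos is not specified in the text.
def to_grayscale(rgb_image):
    """Convert rows of (R, G, B) tuples (0-255) to rows of 0-255 gray values."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in rgb_image
    ]


# Hypothetical 2x2 image: pure red, pure green, pure blue, and mid-gray.
photo = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (128, 128, 128)],
]
gray = to_grayscale(photo)
```

Each pixel is reduced to a single intensity value, so differences in hue are discarded while contrast between light and dark regions is retained.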
Our imglab-based face detection software is a convolutional neural network (CNN) that uses the
Max-Margin Object Detection (MMOD) loss function. The first three layers of
the network downsample the input images by a factor of 8 and output a
feature map with 32 channels. This feature map then passes through four
more convolutional layers with batch normalization and Rectified Linear
Unit (ReLU) nonlinearities. The final output has only one channel; a large
value indicates that the network has found an object at that location, and
a small value indicates that it has not.
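
To make the relationship between the single-channel output and the input image concrete, the following sketch maps high-scoring output cells back to input-pixel coordinates. It is not the detector itself: the threshold of 0, the ceiling-division size estimate (which assumes padded convolutions), and the mapping of a cell to the centre of its 8 x 8 input patch are all assumptions, and the score values are made up for illustration.

```python
STRIDE = 8  # total downsampling factor stated in the text


def feature_map_size(height, width, stride=STRIDE):
    """Approximate output size after downsampling (assumes padded convolutions)."""
    return (-(-height // stride), -(-width // stride))  # ceiling division


def detections(score_map, threshold=0.0, stride=STRIDE):
    """Return (x, y, score) in input-pixel coordinates for cells above threshold."""
    hits = []
    for row_idx, row in enumerate(score_map):
        for col_idx, score in enumerate(row):
            if score > threshold:
                # Assumed mapping: centre of the stride x stride input patch.
                x = col_idx * stride + stride // 2
                y = row_idx * stride + stride // 2
                hits.append((x, y, score))
    return hits


# Hypothetical 2x3 score map with one strong response.
score_map = [
    [-1.2, -0.8, -1.0],
    [-0.9, 2.3, -1.1],
]
found = detections(score_map)
```

Under these assumptions, a 480 x 640 input yields a 60 x 80 score map, and the single positive cell in the example corresponds to a location near pixel (12, 12) in the input.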
Using the full 2020 dataset, we measured the accuracy of the model with
5-fold stratified cross-validation. Each stratum (i.e., each location and
date) was split into 5 sections. For each fold, 4 of the 5 sections were
chosen as a training set while the remaining section was used as a
validation set. For each fold, the training set contained ~413 photos
from all 5 locations, and the validation set contained ~103 photos from
the same 5 locations. The accuracy of the face detector was measured by
two metrics: precision and recall (Figure 3).
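
The splitting procedure and the two metrics can be sketched as follows. This is an illustrative outline rather than the code used in the study: the function names, the fixed random seed, the round-robin assignment within each stratum, and the example strata and counts are all hypothetical.

```python
import random
from collections import defaultdict


def stratified_folds(photos, k=5, seed=0):
    """Split (photo_id, stratum) pairs into k folds, stratum by stratum."""
    by_stratum = defaultdict(list)
    for photo_id, stratum in photos:
        by_stratum[stratum].append(photo_id)

    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    folds = [[] for _ in range(k)]
    for stratum in sorted(by_stratum):
        ids = by_stratum[stratum]
        rng.shuffle(ids)
        # Deal the photos of this stratum across the folds in turn.
        for i, photo_id in enumerate(ids):
            folds[i % k].append(photo_id)
    return folds


def precision_recall(true_positives, false_positives, false_negatives):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall


# Hypothetical data: two strata (location and date pairs) of 10 photos each.
strata = [("site_1", "date_1"), ("site_2", "date_1")]
photos = [(f"{s[0]}_{i}", s) for s in strata for i in range(10)]
folds = stratified_folds(photos)

# For fold 0, one section is held out for validation; the rest form the training set.
validation = folds[0]
training = [p for fold in folds[1:] for p in fold]

# Hypothetical counts: 90 correct detections, 10 false alarms, 30 missed faces.
precision, recall = precision_recall(90, 10, 30)
```

Because each stratum is divided separately, every fold contains photos from every location and date. With the made-up counts above, precision is 0.9 and recall is 0.75.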