
Building a Geological Cyber-infrastructure: Automatically detecting Clasts in Photomicrographs
  • Ari'El Encarnacion,
  • Ben Katin,
  • Matty Mookerjee,
  • Gurman Gill
Ari'El Encarnacion
Sonoma State University
Ben Katin
Sonoma State University
Matty Mookerjee
Sonoma State University
Gurman Gill
Sonoma State University

Corresponding Author: gurman@gmail.com


Abstract

To incentivize participation in, and contribution to, the growth of an earth-science-based cyberinfrastructure, analytical environments need to be developed that allow automatic analysis and classification of data from connected data repositories. The purpose of this study is to investigate a machine learning technique for automatically detecting shear-sense-indicating clasts (i.e., sigma or delta clasts and mica fish) in photomicrographs and determining their shear sense (i.e., sinistral (CCW) or dextral (CW) shearing). Previous work employed transfer learning, a technique in which a pre-trained Convolutional Neural Network (CNN) is repurposed, together with artificially augmented image datasets to distinguish between CCW and CW shearing. Preprocessing images by denoising, a process in which noise at different scales is removed while the edges in an image are preserved, improved classification accuracy. However, upon randomizing the denoising parameters, the CNN model did not converge due to a severe lack of data. While efforts to acquire more labeled data are ongoing, this work compensates for the shortage by implementing a pre-processing “detection” system that automatically crops images to the regions containing clasts. This is done using YOLOv3, a CNN-based object detection system that outputs a bounding box around each object of interest. YOLOv3 was trained using 93 photomicrographs containing bounding boxes of 344 shear-sense-indicating clasts. The retrained detector was tested on two sets: set A, with 10 photomicrographs containing clasts, and set B, with 100 photomicrographs not containing clasts. All but one of the clasts in set A were correctly detected, with an average confidence score of 96.6%. On set B, the detector correctly reported no clasts in 72% of the images. On the remaining images, where clasts were incorrectly identified, the average confidence score was 78.3%. Because these false detections scored lower on average than the true detections in set A, applying a threshold to the confidence scores could make the system more accurate.
Future work involves utilizing the bounding boxes output by the detection system to refine and improve the CNN model for classifying shear sense of clasts in photomicrographs.
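
The two post-detection steps described in the abstract, thresholding on confidence and cropping to the bounding box, can be illustrated with a minimal sketch. It is not the authors' implementation: the `Detection` record, the function names, the pixel-coordinate box format, and the 0.9 threshold are all assumptions made for illustration, and the image is a plain nested list so the example has no external dependencies. The two confidence values reuse the reported averages (96.6% and 78.3%) only as sample inputs; individual detections would vary around those averages, so a single threshold would not necessarily separate every true and false detection.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Detection:
    """Hypothetical detector output: a pixel bounding box and a confidence in [0, 1]."""
    x: int            # left edge of the box, in pixels
    y: int            # top edge of the box, in pixels
    width: int
    height: int
    confidence: float


def filter_by_confidence(detections: Sequence[Detection],
                         threshold: float) -> List[Detection]:
    """Keep only detections whose confidence is at least `threshold`."""
    return [d for d in detections if d.confidence >= threshold]


def crop(image: Sequence[Sequence[int]], det: Detection) -> List[List[int]]:
    """Return the sub-image inside the box, clamped to the image bounds."""
    rows = len(image)
    cols = len(image[0]) if rows else 0
    top, left = max(0, det.y), max(0, det.x)
    bottom = min(rows, det.y + det.height)
    right = min(cols, det.x + det.width)
    return [list(row[left:right]) for row in image[top:bottom]]


if __name__ == "__main__":
    # Toy 4x6 "image" whose pixel value encodes its (row, column) position.
    image = [[r * 10 + c for c in range(6)] for r in range(4)]
    detections = [
        Detection(x=1, y=1, width=3, height=2, confidence=0.966),
        Detection(x=4, y=2, width=5, height=5, confidence=0.783),
    ]
    # 0.9 is an assumed threshold, not a value taken from the study.
    for det in filter_by_confidence(detections, threshold=0.9):
        print(det.confidence, crop(image, det))
```

In practice the threshold would be chosen on held-out data, since raising it removes low-confidence false detections at the risk of also discarding genuine clasts.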