
A Mathematical Assessment of the Isolation Random Forest Method for Anomaly Detection in Big Data.
  • Fernando Morales,
  • Jorge Ramírez,
  • Edgar Ramos
Fernando Morales
Universidad Nacional de Colombia Sede Medellin

Corresponding Author: famoralesj@unal.edu.co

Jorge Ramírez
Universidad Nacional de Colombia Sede Medellin
Edgar Ramos
Universidad Nacional de Colombia Sede Medellin

Abstract

We present a mathematical analysis of the Isolation Random Forest Method (IRF Method) for anomaly detection, introduced in F. T. Liu, K. M. Ting, Z.-H. Zhou, Isolation-based anomaly detection, TKDD 6 (2012) 3:1–3:39. We prove that the IRF space can be endowed with a probability measure induced by the Isolation Tree algorithm (iTree). In this setting, we prove the convergence of the IRF method using the Law of Large Numbers. A couple of counterexamples show that the method is inconclusive and that no certificate of quality can be given when it is used to detect anomalies. Hence, we propose an alternative version of the method whose mathematical foundation is fully justified. Furthermore, we present a criterion for choosing the number of sampled trees needed to guarantee confidence intervals for the numerical results. Finally, numerical experiments compare the performance of the classic method with that of the proposed one.
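As a rough illustration of the ideas named in the abstract, the sketch below averages the isolation depth of a point over independently sampled random trees (the quantity to which the Law of Large Numbers applies) and estimates how many trees give a target confidence level. It is not the authors' method: the helper names `isolation_depth`, `mean_depth`, and `trees_needed` are hypothetical, the data are one-dimensional for brevity, and the tree-count bound uses Hoeffding's inequality as a stand-in for the paper's own criterion, which is not reproduced here.

```python
import math
import random


def isolation_depth(data, x, rng, max_depth=50):
    """Number of random splits needed to isolate x in one random tree.

    Assumes 1-D data and that x is one of the data points (illustrative only).
    """
    points = list(data)
    depth = 0
    while len(points) > 1 and depth < max_depth:
        lo, hi = min(points), max(points)
        if lo == hi:  # remaining points are identical, cannot split further
            break
        split = rng.uniform(lo, hi)
        # Keep only the side of the split that contains x.
        if x < split:
            points = [p for p in points if p < split]
        else:
            points = [p for p in points if p >= split]
        depth += 1
    return depth


def mean_depth(data, x, n_trees, seed=0):
    """Average isolation depth of x over n_trees independently sampled trees."""
    rng = random.Random(seed)
    total = sum(isolation_depth(data, x, rng) for _ in range(n_trees))
    return total / n_trees


def trees_needed(depth_bound, eps, delta):
    """Hoeffding bound (an assumption, not the paper's criterion).

    Smallest n with 2 * exp(-2 * n * eps**2 / depth_bound**2) <= delta, so the
    sample mean of depths in [0, depth_bound] is within eps of its expectation
    with probability at least 1 - delta.
    """
    return math.ceil(depth_bound ** 2 * math.log(2 / delta) / (2 * eps ** 2))


if __name__ == "__main__":
    data = [i / 10 for i in range(10)] + [10.0]  # ten clustered points, one outlier
    n = trees_needed(depth_bound=len(data) - 1, eps=0.5, delta=0.05)
    print("trees:", n)
    print("outlier mean depth:", mean_depth(data, 10.0, n))
    print("inlier mean depth:", mean_depth(data, 0.5, n))
```

In this toy setting the outlier is typically isolated after far fewer splits than a point inside the cluster, which is the heuristic behind isolation-based scoring; the abstract's counterexamples caution that such depth averages alone do not certify an anomaly.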
26 Jan 2021: Submitted to Mathematical Methods in the Applied Sciences
27 Jan 2021: Submission Checks Completed
27 Jan 2021: Assigned to Editor
30 Jan 2021: Reviewer(s) Assigned
18 Feb 2022: Review(s) Completed, Editorial Evaluation Pending
21 Feb 2022: Editorial Decision: Revise Major
01 Apr 2022: 1st Revision Received
04 Apr 2022: Submission Checks Completed
04 Apr 2022: Assigned to Editor
04 Apr 2022: Reviewer(s) Assigned
03 Jul 2022: Review(s) Completed, Editorial Evaluation Pending
03 Jul 2022: Editorial Decision: Accept
15 Jan 2023: Published in Mathematical Methods in the Applied Sciences, volume 46, issue 1, pages 1156-1177. DOI: 10.1002/mma.8570