A novel architecture for knowledge mining from digitised document libraries
Luca Malinverno
Alessio Tugnoli

and 6 more

November 25, 2022
This paper examines a novel knowledge mining architecture, based on Azure cloud data and AI services, for extracting data from the Emporium library, a modern art journal published between 1895 and 1964. The knowledge mining starts with Optical Character Recognition (OCR) and custom Named Entity Recognition (NER) on digitised images of the pages, and provides the end user with a user-friendly search portal for navigating the hundreds of pages in milliseconds through semantic queries. The study shows how this architecture fits an art scholar's perspective and how it enables more comprehensive statistics and descriptions of the document corpus.
