Abstract
Large Language Models (LLMs) are being applied in an increasingly
diverse set of contexts. In their current state, however, even
state‐of‐the‐art LLMs such as Generative Pre‐Trained Transformer 4
(GPT‐4) struggle to extract information from real‐world
technical documentation without heavy preprocessing. One such area
with real‐world technical documentation is telecommunications
engineering, which could greatly benefit from domain‐specific LLMs. The
unique format and overall structure of telecommunications internal
specifications differ greatly from standard English, so applying
out‐of‐the‐box Natural Language Processing (NLP) tools directly
is not a viable option. This article provides a
brief outline of the limitations of out‐of‐the‐box NLP tools for
processing technical information generated by telecommunications experts
and expands the concept of Technical Language Processing (TLP) to the
telecommunications domain. Additionally, we emphasize the importance
of use case definition by introducing the required information mapping
from the perspective of a Q&A application that uses internal
specifications as the source of knowledge. Finally, we recommend actions to
mitigate the effect of the internal specifications format on information
extraction, effectively achieving LLM‐friendly internal
specifications.