Amit Kumar Singh -

In-band Network Telemetry with Programmable Data Plane Research Review: Challenges and Future Directions

Amit Kumar Singh 1, Dr. Mayank Pandey 2
1 Department of Computer Science and Engineering, MNNIT Allahabad, Prayagraj, India - 211004
2 Department of Computer Science and Engineering, MNNIT Allahabad, Prayagraj, India - 211004
1 amit.2021rcs02@mnnit.ac.in, 2 mayanakpandey@mnnit.ac.in
Corresponding author: Amit Kumar Singh. E-mail: amit.2021rcs02@mnnit.ac.in

ABSTRACT: In the past few years, "Network Telemetry" has emerged as a popular buzzword for newer network data collection and consumption techniques. Network telemetry refers to the practice of collecting, analyzing, and interpreting data from network devices and infrastructure in real time to gain visibility and insights into network performance and behavior. It is becoming increasingly important as networks grow more complex and the demand for high-performance, reliable connectivity increases. By using network telemetry, organizations can proactively identify and troubleshoot issues before they become critical, optimize network performance, and ensure that service-level agreements (SLAs) are being met. In recent years, the rise of programmable data planes and the development of programming languages such as P4 have provided robust tools for creating new network protocols and revamping existing network applications and systems. Programmable data planes can therefore reshape network monitoring. In-band Network Telemetry (INT) is a technique that embeds real-time telemetry data directly into data packets as they traverse the network. It leverages the flexibility and programmability of modern data planes, particularly through the P4 programming language.
P4, designed for programming protocol-independent packet processors, enables the dynamic definition and modification of packet processing behaviors, making it an ideal tool for implementing INT. The collection of real-time data and detailed network-wide information is crucial for designing practical monitoring tools integrated into complex Operations, Administration, and Maintenance (OAM) applications. This review paper provides a comprehensive analysis of the current state of INT on programmable data planes using P4, covering its fundamental principles, implementation methodologies, benefits, and challenges. It also highlights the technical advancements in data plane programmability and their potential for next-generation network measurement. We also discuss the research gaps and future directions for various network management requirements.

Keywords: In-band Network Telemetry, Software Defined Networks, Programmable data plane, P4, Network Monitoring

1. Introduction

Efficient methods for monitoring network performance, detecting congestion, failures, and anomalies, and responding to them in real time have always been, and will continue to be, crucial and challenging in modern networks. Over the years, various methods have been proposed to gather information about the status of the network and to make decisions based on that status. These methods can be categorized as active and passive monitoring solutions, and they offer a comprehensive view of network status for Operations, Administration, and Maintenance (OAM) applications [1]. Active monitoring, also known as synthetic monitoring, involves injecting test traffic and measuring network performance proactively to identify potential issues before they occur. However, because it measures synthetic rather than real traffic, active monitoring might not always accurately represent the network's performance.
In contrast, passive monitoring analyzes real network traffic by capturing either all or a specific portion of the traffic flowing through the network, offering a more precise view. Passive monitoring commonly collects counters and statistics directly from network devices using protocols like SNMP [2] and NETCONF [3]. Because these protocols are polling-based, their high processing overhead can limit performance, especially in large-scale networks. This challenge has led to network telemetry solutions, where network devices push specific metrics in real time to a collector. Table 1 summarizes the comparison of active, passive, and hybrid network measurement across several parameters.

Table 1: Comparison of Active, Passive, and Hybrid Network Measurement

| Parameter | Active Measurement | Passive Measurement | Hybrid Measurement |
|---|---|---|---|
| Data Collection Approach | Actively generates test traffic or probes to measure network performance. | Observes existing network traffic without actively injecting test data. | Combines active probing and passive observation to gain insights into network performance. |
| Real-time Data | Provides real-time data on network performance. | Typically provides real-time data as it analyzes live network traffic. | Offers real-time data through active probes and passive analysis of live traffic. |
| Control Over Testing | Offers control over the type, timing, and volume of test traffic. | Lacks control over network conditions since it relies on existing traffic. | Provides control through active measurements, while passive measurements are less controlled. |
| Intrusiveness | Can be intrusive as it injects test traffic, potentially affecting network resources. | Non-intrusive as it observes existing traffic without adding to it. | Can be somewhat intrusive due to active probing but is less intrusive than pure active measurement. |
| Resource Overhead | Active probes can consume network resources, leading to increased overhead. | Typically has lower resource overhead as it doesn't add extra traffic. | Moderate resource overhead due to active probing but still lower than pure active measurement. |
| Coverage | May not capture all network traffic and nuances. Focuses on specific test points. | Captures all network traffic for analysis, providing comprehensive data. | Provides comprehensive data through passive observation and controlled data points with active probing. |
| Privacy Concerns | Generally fewer privacy concerns since it generates test traffic. | Privacy concerns are minimal as it doesn't actively introduce new data. | Privacy concerns may exist, primarily when active probing captures sensitive data. |
| Complexity | May require setup and configuration of test traffic generators. | Simpler to set up and deploy compared to active measurement. | Can be complex due to the integration of both active and passive monitoring tools. |
| Use Cases | Useful for proactive issue detection, controlled testing, and specific performance measurements. | Effective for continuous monitoring, traffic analysis, and anomaly detection. | Ideal for gaining a comprehensive view of network performance by combining the strengths of both active and passive measurement. |

The advent of the programmable data plane (PDP) [4] has started a new era of network telemetry solutions. PDP is a technology that enables the programming of packet processing tasks through domain-specific high-level languages and programmable switches. Notable languages facilitating data plane programmability include POF (Protocol-Oblivious Forwarding) [5] and P4 [6]. These advancements have paved the way for innovative telemetry methods, enabling end-to-end monitoring directly within the data plane. In this context, In-Band Network Telemetry (INT) is a primary network monitoring framework implemented using P4 and developed by the P4 Language Consortium [7]. INT involves embedding monitoring information into data packets as they traverse the network rather than using dedicated packets.
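
The idea of embedding monitoring information into data packets can be illustrated with a minimal Python sketch. This is a simulation under stated assumptions, not P4 code and not the INT specification: the role names (source, transit, sink) follow INT terminology, while the class names, the metadata fields (switch_id, queue_depth, hop_latency_us), and the sample values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class HopMetadata:
    """Per-hop telemetry record (hypothetical fields, for illustration only)."""
    switch_id: int
    queue_depth: int      # packets waiting in the egress queue
    hop_latency_us: int   # time spent inside the switch, in microseconds


@dataclass
class Packet:
    """A data packet that can carry a stack of telemetry records in-band."""
    payload: bytes
    int_enabled: bool = False
    int_stack: List[HopMetadata] = field(default_factory=list)


def int_source(pkt: Packet) -> Packet:
    """Source role: mark the packet so that later hops add telemetry."""
    pkt.int_enabled = True
    return pkt


def int_transit(pkt: Packet, switch_id: int, queue_depth: int,
                hop_latency_us: int) -> Packet:
    """Transit role: append local state to the packet if telemetry is enabled."""
    if pkt.int_enabled:
        pkt.int_stack.append(HopMetadata(switch_id, queue_depth, hop_latency_us))
    return pkt


def int_sink(pkt: Packet) -> Tuple[Packet, List[HopMetadata]]:
    """Sink role: strip the telemetry and return it as a report for a collector."""
    report = list(pkt.int_stack)
    pkt.int_stack.clear()
    pkt.int_enabled = False
    return pkt, report


def run_demo() -> Tuple[Packet, List[HopMetadata]]:
    """Send one packet across three simulated switches (made-up values)."""
    pkt = int_source(Packet(payload=b"hello"))
    for switch_id, queue_depth, latency in [(1, 3, 12), (2, 40, 250), (3, 1, 9)]:
        int_transit(pkt, switch_id, queue_depth, latency)
    return int_sink(pkt)
```

In this sketch, calling run_demo() returns the original packet with its telemetry removed, plus a three-entry report from which a collector could, for example, pick out the hop with the highest latency. The design point is that no separate probe packet is needed: the measurement rides on the data packet itself.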
P4, a representative data plane programming language, provides packet processing abstractions for networking elements in a target-independent manner. Unlike OpenFlow, which relies on predefined protocol headers, P4 allows for adaptable header definitions and matching functionality without the need to extend the specification. Moreover, P4 supports stateful programming, enabling forwarding procedures to be defined based on network state, flow statistics, and historical data at the node level, all without controller intervention. The P4 framework empowers researchers to enhance existing network applications. The P4 Application Working Group has identified several areas of interest, including forwarding-plane telemetry, flow monitoring using sketches, heavy-hitter detection in the data plane, low-latency congestion control, big data aggregation inside the network, middlebox functions such as load balancing, fast in-network caching of distributed services, and consensus protocols at network speed. Because INT can obtain accurate information about the network, it has attracted broad interest from academia and industry.

Every day, many end-user devices such as mobile phones and smart devices connect to the Internet to access a wide range of remotely hosted services. The interconnection between these endpoints is facilitated by numerous intermediate transit networks responsible for transmitting large volumes of data packets in a best-effort manner. Although edge networks, data center networks, and transit networks have slightly different objectives, their operators share the everyday necessity of understanding the state of their networks. OAM processes are used to fulfill this essential requirement. OAM aims to answer a crucial question for network operators: "What is happening inside the network?".
INT is a real-time method for tracking and evaluating network performance. INT provides detailed information about the network traffic, including packet loss, delay, and congestion, which can be used to optimize network performance and troubleshoot issues. The programmability of the data plane empowers network operators to customize the functionality of network devices in every layer where INT is deployed. The objective of network measurement is to comprehend the ongoing events within a network, which is fundamental to making decisions about managing and enhancing its performance, availability, security, and efficiency. Network measurement is the basis for controlling and managing the network [8]. As many academic and research organizations have started to focus on production network measurement, many improved measurement methods have been proposed by academia and industry. Network measurement can be categorized into three distinct research areas based on the evolution of measurement techniques: first, traditional network measurement, which has been in use since 1995; second, software-defined measurement, which emerged and developed alongside software-defined networks (SDN) starting in 2008; and third, network telemetry, which has become prominent since 2015 due to the flexibility of programmable data plane (PDP) technology. This review article focuses on network-wide in-band telemetry solutions implemented in P4, particularly highlighting INT and other innovative approaches that leverage P4. It explores and comprehensively collects related academic and industrial publications and projects in the In-band Network Telemetry and Data Plane Programmability domain.
1.2 Challenges Related to Traditional Network Measurement Methods

Conventional OAM tools such as ping, traceroute, and NetFlow are often unable to monitor and troubleshoot today's and next-generation large, complex, dynamic, and programmable networks. The following are some of the common issues with traditional OAM tools.

• Limited Insight: Traditional OAM tools typically rely on periodic polling (pull-based) to collect network data. This approach provides limited visibility into network activity, as it only captures a snapshot of network conditions at specific points in time. This can make it difficult to identify issues that occur between polling intervals.

• Limited Data Range: Managing today's network operations requires a large amount of data from multiple sources. This data may come from different network devices, different components of a network device, or different network planes. It can be of different varieties, such as data related to packet processing engines and the traffic manager, data related to device configurations and operations, and data regarding line cards, user flows, control protocol packets, etc. Traditional OAM tools cannot provide all the necessary data probes and operate only on a narrow range of data, which must be configured statically by network administrators.

• Lack of Formal Data Model: The lack of a formal data model in traditional OAM techniques can make it difficult for network administrators to manage and analyze network data effectively. It can also lead to inconsistencies in data collection and interpretation, which can hinder troubleshooting and problem-resolution efforts. For example, SNMP relies on a hierarchical structure of objects known as the Management Information Base (MIB), which provides a way to organize and access data from network devices.
However, the MIB structure can be complex and difficult to navigate, particularly in large or heterogeneous networks.

1.3 Network Measurement in Software-Defined Environments

The rise of SDN [9] and PDP [10] has sparked a profound transformation in the conventional network measurement approach. These innovations have changed network measurement by providing more extensive capabilities for programming the control plane and data plane. By utilizing SDN, it becomes possible to reconstruct the network's control and data planes. Moreover, traditional flow collection and sampling methods [11] have been retained. By incorporating a programmable data plane, network administrators gain the flexibility to program a switch's packet processing logic, implement innovative functions and protocols, and deploy network measurement applications that connect the programmable data plane with terminals or measurement servers. As a result, the network measurement process becomes more direct and efficient. Academia and industry have put forth numerous software-defined measurement techniques to facilitate a diverse array of network performance and function measurements. These techniques encompass a wide range of aspects, including available bandwidth [12], packet loss [13], throughput [14], latency [15], path tracing [16], data plane rule consistency verification [17], long flow detection [18], and fault location [19].

1.4 Network Telemetry and In-band Network Telemetry

In recent years, "Network Telemetry" has become a popular buzzword for modern network data collection and utilization methods. Network telemetry collects, analyzes, and interprets data from network devices and infrastructure in real time to gain visibility and insights into network performance and behavior. Network telemetry is becoming increasingly important as networks grow more complex and the demand for high-performance, reliable connectivity increases.
Using network telemetry, organizations can proactively identify and troubleshoot issues before they become critical, optimize network performance, and ensure that service-level agreements (SLAs) are met. The IETF (Internet Engineering Task Force) and several other projects have been instrumental in developing this body of knowledge. The most important of these projects are the Open Networking Foundation (ONF) [20] and the OpenConfig [21] project. ONF is a non-profit organization that promotes the development and adoption of open-source software-defined networking (SDN) technologies. The ONF was founded in 2011 by a group of networking industry leaders, including Google, Facebook, Microsoft, Deutsche Telekom, and Yahoo. OpenConfig is a vendor-neutral, open-source project that provides a standard data model for network devices and infrastructure. The OpenConfig project is led by network operators and engineers from the world's largest service providers, including Google, AT&T, and Comcast. Some of the mainstream technologies developed by the projects mentioned above are as follows:

• SDN and Data Plane Programmability: SDN advocated disaggregating the control, data, and management planes of networking devices. Disaggregation means that the interface between the control and data plane is open and well-defined, which makes it possible for different vendors to take responsibility for different planes. Disaggregation requires a well-defined forwarding abstraction, that is, a general-purpose way for the control plane to instruct the data plane to forward packets in a particular manner. This led to the development of the OpenFlow protocol [22]. Data plane programmability refers to thinking beyond the fixed-function forwarding pipelines of network switches and moving to programmable ones.
This led to the development of PISA (Protocol Independent Switch Architecture) and the P4 (Programming Protocol-independent Packet Processors) programming language.

• YANG (Yet Another Next Generation): YANG is a modeling language used to describe data models for network devices and services. The IETF developed it to standardize the description of network management data, including configuration data, operational data, and event notifications. YANG is used with the NETCONF (Network Configuration Protocol) [23] or RESTCONF (RESTful Network Configuration Protocol) [24] protocols to configure and manage network devices. YANG [25] data models define the hierarchy and structure of the configuration and operational data, as well as the data types and constraints.

• gNMI and gNOI: gNMI (gRPC Network Management Interface) [26] and gNOI (gRPC Network Operations Interface) [27] are two related network management protocols developed by Google and based on the gRPC framework. gNMI provides a standard interface through which network devices publish telemetry data and network management applications subscribe to it and retrieve operational data, configuration information, and statistics from network devices. It uses YANG models [28] to describe the data elements and operations that can be accessed, and gRPC as the underlying transport protocol for efficient communication between devices and management applications. gNOI, on the other hand, provides a standardized interface for managing the lifecycle of network devices, including software upgrades, certificate management, and other operational tasks. It uses the same gRPC transport protocol and YANG data models as gNMI but with different operations and data elements.

• GPB (Google Protocol Buffers): GPB is a language-agnostic binary serialization format developed by Google. It allows data to be encoded in a compact and efficient structure that can be used for communication between systems, storage, and data serialization for machine learning models.
GPB is widely used within Google, and other companies and projects also use its open-source implementation. It offers many advantages over other serialization formats, such as smaller message sizes, faster parsing and serialization, and language independence. It also supports versioning of data structures, which allows for backward and forward compatibility.

The tools and techniques described above help achieve the goals of modern network telemetry frameworks. These goals include support for (i) subscription-based push and streaming of network telemetry data, (ii) dynamic, interactive, and automated OAM, (iii) network data analytics for root cause analysis, and (iv) vendor- and protocol-agnostic OAM. Network telemetry can be applied to various planes within a network, as well as to sources outside the network. Data plane telemetry involves collecting data related to the actual forwarding of network traffic, such as packet loss, latency, throughput, and traffic flows. Control plane telemetry focuses on gathering data about the network protocols and algorithms used for routing and signaling. Management plane telemetry involves monitoring the network management infrastructure itself. It collects data on device configurations, software versions, network inventory, and the health of network devices and services. Telemetry can also be applied to external sources outside the network infrastructure. This includes data from application servers, virtual machines, containers, or other external systems that interact with the network. Further, network telemetry can be broadly classified as in-band or out-of-band, based on the method used to collect and transmit telemetry data in a network environment.

1.5 Contribution of the Paper

This survey has two main objectives.
Firstly, it aims to provide a comprehensive introduction to and overview of network measurement methods, and then focuses on the evolution of data plane programming and its relationship to network measurement methods. Secondly, the survey examines publications that describe applied research based on P4 technology and INT. The key contributions of this survey can be summarized as follows:

• Evolution of network measurement methods: This section explains the development of various network measurement methods and their applications.

• Overview of data plane programming with P4: This section covers various aspects of data plane programming with P4, including the P4 programming language, architectures, compilers, targets, and data plane APIs. It covers the foundational elements and discusses advancements, extensions, and experiences related to P4.

• Advancements in P4 data planes: This section summarizes research efforts aimed at advancing P4 data planes. It includes topics such as optimization of development and deployment, testing and debugging, research on P4 targets, and advances in control plane operation.

• Analysis of P4-based applied research: This section analyzes a substantial body of literature comprising research papers focused on P4-based applied research. The papers are categorized into different application domains, and their essential contributions are summarized. The survey explores how the reviewed works benefit from the core features of P4.

1.6 Novelty

While several research reviews have focused on network measurement, very few review papers have specifically addressed INT technology using data plane programmability. This review aims to fill this gap by providing an overview of the background, research status, applications, technical opportunities, problems, and challenges related to in-band network telemetry technology and the programmable data plane.
This review paper encompasses publications on data plane programmability using P4 and INT, and categorizes them into different application areas. Based on this classification, a reader who chooses to work in a particular field, for example congestion control using P4, can refer to the papers listed in the corresponding section. The different INT approaches are categorized in the same way. The paper starts from the basics of data plane programmability and in-band network telemetry, and also explains the workflow required for testing a basic P4 program.

1.6 Organization of this Review

The structure of this review article is as follows. Section 2, Illustrations of In-Band Network Telemetry Approaches, explains the various network measurement methods, i.e., INT, IOAM, and active network telemetry, along with other related concepts. Section 3 gives a detailed description of the P4 programming language and the experimental setup for running P4-based applications. Section 4 gives a detailed overview of the research related to INT applications such as network monitoring, congestion control, flow monitoring, and network security. Section 5 describes the research domain associated with INT using data plane programmability and P4.
Section 6 concludes this review, presents the current research status of the programmable data plane and in-band network telemetry in academia and industry, and describes the research gaps and future challenges related to INT and data plane programmability.

1.7 Abbreviations Used

P4: Programming Protocol-independent Packet Processors
PDP: Programmable Data Plane
DCN: Data Center Networks
INT: In-Band Network Telemetry
IETF: Internet Engineering Task Force
ONF: Open Networking Foundation
TPP: Tiny Packet Program
FPGA: Field-Programmable Gate Array
DRL: Deep Reinforcement Learning
KDN: Knowledge Defined Networking
SFC: Service Function Chaining
IDC: Internet Data Centers
QoS: Quality of Service
MNP: Multi-Modal Network Processor
IR: Intermediate Representation
OAM: Operations, Administration and Maintenance
CLI: Command Line Interface
MIB: Management Information Base
SLAs: Service Level Agreements
PISA: Protocol Independent Switch Architecture
YANG: Yet Another Next Generation
GPB: Google Protocol Buffers
NFV: Network Function Virtualization
MPLS: Multiprotocol Label Switching
POF: Protocol-Oblivious Forwarding
AM-PM: Alternate Marking-Performance Measurement
gRPC: Google Remote Procedure Call
ANT: Active Network Telemetry
IOAM: In situ Operation Administration and Maintenance
NETCONF: Network Configuration Protocol
SDN: Software Defined Networks

2. ILLUSTRATIONS OF IN-BAND NETWORK TELEMETRY APPROACHES

The IETF IPPM working group and P4.org are actively leading research on in-band network telemetry in the data plane. INT [29] is being driven by P4.org. INT has proposed key implementation approaches and primarily focuses on utilizing programmable data planes to perform path-level network monitoring.
In contrast, the IETF is the driving force behind IOAM [30] and is pursuing research and standardization of the architecture and protocols associated with in-band network telemetry. This work has resulted in multiple RFC documents and attracted a lot of interest from equipment manufacturers. IOAM, AM-PM, and ANT are three further examples of in-band network telemetry methods. Both AM-PM and ANT offer more adaptability and implementation simplicity than INT and IOAM.

2.1 In-band Network Telemetry

INT involves embedding telemetry data within the network traffic itself. This means the telemetry data is carried alongside the regular data packets flowing through the network. INT typically uses packet headers or metadata fields to carry telemetry information. Network devices along the data path can extract and process this telemetry data to gain insights into network performance, latency, packet loss, and other relevant metrics. INT provides real-time visibility into network behavior without additional dedicated telemetry channels. The basic working of in-band network telemetry is shown in Fig. 1.

Fig. 1: In-band Network Telemetry System

INT is primarily concerned with leveraging programmable data planes to enable path-level network measurement and has proposed some initial implementation concepts. IOAM, on the other hand, is being promoted by the IETF and is focused on researching and standardizing the architecture and protocols for in-band network telemetry. This endeavor has garnered considerable attention from equipment manufacturers and led to the creation of multiple RFC documents. Data plane programmability provides the customization and flexibility that in-band network telemetry requires. In-band telemetry involves inserting telemetry data into the data packets. Data plane programmability enables the efficient processing of these packets by implementing customized packet-handling logic using programmable forwarding pipelines.
This allows for the seamless extraction of telemetry data without introducing significant latency or performance degradation to the regular data traffic. Further, data plane programmability allows for implementing protocol-agnostic telemetry mechanisms that can extract and process telemetry data from different types of network traffic, including IP, Ethernet, MPLS, or even custom protocols. Furthermore, data plane programmability allows network administrators to define and customize the data collection points, specify which metrics to collect, and determine the granularity of the telemetry data. This flexibility ensures that the telemetry data captured aligns with the specific monitoring requirements and provides relevant insights. Network telemetry is a research domain with substantial theoretical value and promising engineering applications for network management.

In 2015, INT was collaboratively introduced by Barefoot, Arista, Dell, Intel, and VMware. Instead of collecting and reporting network status through the network control plane, this framework acts independently of it. Within the INT architecture, switching devices process and interpret packets that carry telemetry instructions. When these telemetry packets pass through a device, the instructions direct it to collect and embed network data. Jeyakumar et al. [31] created the TPP programming interface in 2014, drawing influence from active network notions [32]. TPP enables customized TPP packets to be sent from terminals for purposes including congestion control, load balancing, network diagnostics, and network performance monitoring. TPP promotes the flow of information between switches and data packets. The TPP protocol format has Control, Instructions, and Memory fields. The length is specified in the Control field. Table 2 describes the terms related to in-band network telemetry.
Table 2: In-band telemetry related terms and descriptions

| Term | Description |
|---|---|
| INT Header | The header data within the telemetry packet that signifies the telemetry content |
| INT Packet | Packets containing telemetry data |
| INT Metadata | A package designed to gather telemetry data |
| INT Instruction | Guidelines for specifying the INT metadata items to be included in telemetry data packets |
| INT Source Node | The entity responsible for inserting INT Headers at the beginning point |
| INT Sink Node | The entity tasked with retrieving INT information at the endpoint |
| INT Transit Hop | The entity positioned in the middle to embed INT Metadata |
| Telemetry Server | A server dedicated to receiving and handling telemetry data |

The INT Source Node, INT Sink Node, and INT Transit Hop listed in Table 2 are the parts of the INT forwarding plane. The INT Source Node embeds telemetry instructions in regular data packets or dedicated telemetry packets, while the INT Sink Node collects and delivers telemetry information. Both nodes may be NICs, network management software, terminal network protocol stacks, or network applications. The INT Transit Hop simply needs to provide telemetry metadata according to the INT packet specifications. The instructions cover four operations: Insert (LOAD), Read (STORE), Conditional Store (CSTORE), and Conditional Execute (CEXEC). Insert directs the switch to include the state value into the packet, while Read advises the switch to extract the state value from the packet.

Kim et al. [29] presented INT technology based on the programmable data plane and showed how to diagnose spikes in HTTP request latency. By combining path marking and queue delay data in INT messages, the authors determined that switch queue congestion often triggers these latency spikes. Building on earlier research, P4.org established the INT data plane standard, detailing INT system terminology, telemetry metadata definitions, and INT encapsulation.
The standard also includes examples of how INT is carried over the VXLAN, Geneve, NSH, TCP, UDP, and GRE protocols.

2.2 IOAM

OAM (Operations, Administration, and Maintenance), as defined in [33], encompasses a suite of tools for tasks such as performance assessment, fault identification, and problem resolution. Within this context, In-situ Operations, Administration, and Maintenance (IOAM) is a technology that records operational data within a packet as it travels between two points in the network, as described in [30], [34]. IOAM complements existing out-of-band OAM methods that rely on ICMP or other probe packet types. An IOAM domain comprises three essential components: the encapsulating node, the decapsulating node, and the transit node. This technology enables intricate OAM functions, including path tracing, path validation, and SLA (Service Level Agreement) verification. The related IETF Internet drafts are summarized in Table 3.

Table 3: Internet drafts by IETF concerning IOAM

Proposed Draft | Description
--- | ---
A YANG Data Model for In-Situ OAM [35] | Constructs a YANG module description for the IOAM function.
Network Service Header (NSH) Encapsulation for In-situ OAM (IOAM) Data [36] | Explains how the Network Service Header (NSH) encapsulates IOAM data fields.
MPLS Data Plane Encapsulation for In-situ OAM Data [37] | Outlines the MPLS data plane encapsulation used to carry IOAM data fields.
VXLAN-GPE Encapsulation for In-situ OAM Data [38] | Describes how IOAM data is encapsulated in VXLAN-GPE.
Geneve Encapsulation for In-situ OAM Data [30] | Describes how IOAM data is encapsulated in Geneve.
EtherType Protocol Identification of In-situ OAM Data [39] | Defines a header that contains the IOAM data fields and an EtherType that designates IOAM data fields as the next protocol in a packet.
In-situ OAM Raw Data Export with IPFIX [40] | Explains how IPFIX may be used to export IOAM data in its raw form from network devices to systems.
In-situ OAM Direct Exporting [41] | The Direct Export (DEX) option allows IOAM data to be exported directly without being placed into in-flight data packets.
Proof of Transit [42] | Outlines methods for securely proving that traffic traveled over the specified route.
Data Fields for In-situ OAM [43] | Explains the in-situ OAM data fields and related data types.
In-situ OAM IPv6 Options [44] | Explains the IPv6 encapsulation of IOAM data fields.
In-situ OAM Flags [45] | Adds new flags, including the Loopback and Active flags, to the IOAM Trace Option headers.
In-situ OAM Profiles [46] | Introduces the idea of IOAM profiles determined by use cases.

2.3 AM-PM

Alternate Marking Performance Measurement (AM-PM) [47] is a hybrid telemetry technique for evaluating end-to-end network packet loss and delay. It offers clear benefits in deployment versatility and in its ability to deliver highly precise statistical data. Each packet header in the AM-PM technique includes a binary marking field called the "Marking Bit," which can have the value "0" or "1." The Marking Bit divides the flow into consecutive blocks of packets, making it easier to synchronize and coordinate measurement events between two measurement checkpoints. Different encapsulation contexts, such as Geneve, SFC NSH, BIER, MPLS, and QUIC, are being investigated for AM-PM. It may also be used as an overlay for existing network protocols such as IP. Six working groups within the IETF are actively debating AM-PM, and several Internet drafts contributed by many different authors show the broad interest that manufacturers and operators have in the technique. The telemetry strategies discussed in this section are compared in Table 4.
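The block-based marking described above can be illustrated with a short Python sketch. It is only a toy model under stated assumptions: the names `Packet`, `mark_flow`, `count_per_block`, and `loss_per_block` are hypothetical, blocks are delimited by packet count rather than by time, packets arrive in order, and no block is lost in its entirety (otherwise two adjacent blocks with the same mark would merge).

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class Packet:
    seq: int   # hypothetical sequence number, used here only to simulate drops
    mark: int  # the AM-PM Marking Bit: 0 or 1


def mark_flow(num_packets, block_size):
    """Colour a flow in alternating blocks of `block_size` packets."""
    return [Packet(seq=i, mark=(i // block_size) % 2) for i in range(num_packets)]


def count_per_block(packets):
    """Count the packets in each run of identical marks, as a checkpoint would."""
    return [sum(1 for _ in run) for _, run in groupby(packets, key=lambda p: p.mark)]


def loss_per_block(upstream_counts, downstream_counts):
    """Compare the counters of two checkpoints block by block."""
    return [sent - received for sent, received in zip(upstream_counts, downstream_counts)]


if __name__ == "__main__":
    flow = mark_flow(num_packets=12, block_size=4)
    # Simulate a lossy segment between the two checkpoints.
    arrived = [p for p in flow if p.seq not in {5, 9, 10}]
    sent = count_per_block(flow)         # [4, 4, 4]
    received = count_per_block(arrived)  # [4, 3, 2]
    print(loss_per_block(sent, received))  # [0, 1, 2]
```

Because both checkpoints only compare per-block counters, the data packets carry nothing beyond the single marking bit, which is where the low overhead of the technique comes from.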
Table 4: INT Term and Significance

Strategy | Telemetry instructions carried in business packets | Telemetry data carried by business packets | MTU limit | Location of telemetry data export | Network overhead | Telemetry metadata
--- | --- | --- | --- | --- | --- | ---
INT | ✓ | ✓ | Finite | Decapsulating node | High | Huge
IOAM | ✓ | ✓ | Finite | Decapsulating node | High | Huge
PBT-M | ✓ | ✓ | Open | Each hop | High | Huge
iFIT | ✓ | x | Open | Each hop | Low | Huge
AM-PM | ✓ | x | Open | Each hop | Low | Limited
ANT | ✓ | ✓ | Finite | Decapsulating node | High | Huge

Riesenberg et al. [48] argue that effective network management relies heavily on network telemetry, particularly in large networks. The Alternate Marking Performance Measurement (AM-PM) technique is one method for precisely assessing packet loss and latency within a network, and it requires an overhead of only one or two bits per data packet. Their paper presents a time-multiplexed parsing methodology that enables the practical and precise integration of AM-PM into network devices using just a single bit per packet. It also reports experimental findings from both hardware and software implementations using P4.

Karaagac et al. [49] use AM-PM to support the uninterrupted and reliable operation of Industrial Wireless Sensor Networks (IWSNs), where constant visibility and awareness of network activities are imperative. This is crucial for use cases demanding deterministic, real-time network services with stringent latency and reliability requirements. Continuous monitoring of network devices is essential to ensure their proper functioning, promptly identify and address any issues, and verify compliance with all system specifications. In this context, the article explores a lightweight telemetry solution tailored for IWSNs. The solution enables the precise and ongoing collection of telemetry data based on network flows, all without imposing any additional overhead on the packets being monitored.
The proposed monitoring approach leverages the Alternate Marking Performance Monitoring (AM-PM) concept, primarily evaluating end-to-end and hop-by-hop reliability and delay performance in critical application flows.

Persistent packet loss within cloud-scale overlay networks can significantly degrade the experience of tenants. Cloud providers are highly motivated to identify the root causes of such problems promptly and automatically. However, current solutions are either tailored for physical networks or struggle to pinpoint the specific reasons behind packet loss. Fang et al. [50] propose recording and analyzing the on-site forwarding behavior of packets through packet-level tracing. The complexity of cloud-scale overlay networks, characterized by high network intricacy, a multi-tenant environment, and diverse root causes, poses significant challenges to achieving this objective. The authors introduce VTrace, an automated diagnostic system designed to address persistent packet loss issues in cloud-scale overlay networks. VTrace leverages the "fast path-slow path" architecture of virtual forwarding devices (VFDs), such as vSwitches, to install several rules for "coloring, matching, and logging" within VFDs. These rules enable the selective tracking and in-depth inspection of packets of interest. The forwarding details at each hop are logged and subsequently analyzed using an efficient path reconstruction approach. The authors present experimental results that show VTrace's minimal performance overhead and rapid responsiveness.

2.4 Active Network Telemetry

Active Network Telemetry (ANT) is a telemetry data collection approach based on active measurement. The key concept is to proactively generate a telemetry probe that travels along a chosen telemetry path.
For example, Microsoft's EverFlow data center telemetry solution adopts a proactive strategy by creating telemetry packets that are exact replicas of those seen during network failures. The virtual network device notifies the SDN controller of the presence of these telemetry data packets. The control plane then examines this status data to pinpoint issues. EverFlow works in real time, but it can only detect the failure states that its telemetry packets actually encounter, so sporadic network failures are difficult to find.

Liu et al. [51] introduced the idea of "Network Telemetry as a Service" and the NetVision active network telemetry platform. The number and type of probe packets that NetVision actively sends are adapted to the network's status and telemetry requirements. This strategy improves telemetry coverage and scalability while lowering telemetry overhead. NetVision uses segment routing (SR) for route management, and the detection path is customized by changing the SR labels. The "SR+INT" dual-stack structure used by the NetVision probe includes the label list length and telemetry label list in the INT stack and the output port label and length in the SR stack. The router removes the SR label and then inserts the INT label to carry out telemetry forwarding.

Pan et al. merged in-band network telemetry with active telemetry and proposed the INT-path framework. To support a user-specified monitoring path, INT-path, like NetVision, couples the INT probe with a source routing label stack. The controller chooses the telemetry source node and end node in advance and then computes the INT route using a centralized routing algorithm. Periodically, the telemetry source node updates the empty probe delivered by the external host by adding INT information and inserting the INT path as SR labels into the probe header.
The IP address of the controller is used as the destination address at the end node when sending the INT probe. Because P4 does not support parsing two variable-length stacks, INT-path constructs fixed-length SR label stacks statically and uses a right shift operation to pop the stack.

Lin et al. [52] introduced NetView, a network telemetry framework designed specifically for data center networks. NetView can accommodate various telemetry applications and frequencies on demand. This is accomplished with only one vantage server, which actively sends dedicated probes to monitor each device. Technically, NetView separates the probe into a forwarding stack and a telemetry stack, with the former handling flexible forwarding and the latter monitoring network status to ensure thorough coverage and visibility. Several probe generation and update techniques also greatly reduce the number of probes, providing strong scalability.

Because they use business packets to transport telemetry data, INT, IOAM, and ANT are constrained by the MTU, as shown in Table 4. However, PBT, iFIT, and AM-PM only use business packets to export telemetry data and transmit telemetry instructions, which results in significant network overhead. AM-PM only offers delay and packet loss rate measurements, whereas the other telemetry solutions provide more detailed telemetry data. The in-band network telemetry approaches are classified and summarized in Table 5.

Table 5: Classification and summary of in-band network telemetry approaches

Classification | Proposed Work | Technical Aspects
--- | --- | ---
INT | Kim [17] | An INT demonstration that measures instantaneous HTTP delay using the programmable data plane.
INT | POINT [53] | A framework for intent-driven INT.
INT | Gulenko [54] | An example of an INT implementation in Open vSwitch.
INT | INT in 6TiSCH [55] | INT is extended from the wired network to the wireless network.
INT | Sel-INT [56] | A real-time, selective, and programmable INT solution based on Protocol Oblivious Forwarding (POF).
INT | ML-INT [57] | A P4-based adaptable multilayer INT system for an IP-over-optical network.
IOAM | PBT [58] | Telemetry is realized by tagging business packets or by inserting telemetry instructions into business packets that do not carry telemetry data. PBT-M denotes the marking scheme, and PBT the instruction insertion scheme.
IOAM | iFIT [59] | Combines the benefits of IOAM and PBT.
IOAM | Co-iOAM [60] | An in-situ network telemetry technique for monitoring and troubleshooting resource-constrained LoWPAN/LPWAN networks.
AM-PM | AM-PM [19] | The early AM-PM approach.
AM-PM | Mizrahi [61] | Introduces several methods and the AM-PM workflow.
AM-PM | Riesenberg [62] | An AM-PM prototype system based on a Marvell Prestera chip and P4 programming. The delay measurement error with double marking was below 100 ns.
AM-PM | Karaagac [49] | A monitoring system based on AM-PM that can assess the reliability and delay performance of industrial wireless sensor networks end to end and hop by hop.
AM-PM | VTrace [50] | An automated diagnosis method that addresses the persistent packet loss issue in cloud networks.
ANT | EverFlow [63] | Microsoft's real-time telemetry technology for data centers, which proactively creates telemetry packets identical to those affected by network loss.
ANT | NetVision [51] | Network telemetry as a service.
ANT | INT-path | Couples the INT probe with the source routing stack to support a user-specified monitoring path.
ANT | NetView [52] | Supports a range of telemetry applications and telemetry frequencies on demand, actively monitoring each device by sending specialized probes.

2.5 Comparison between INT Modes of Operation and Traditional Networks

Fig 2: INT Modes

In this section, we compare the INT modes of operation with traditional network monitoring methods such as SNMP and NetFlow.
The INT modes of operation have a potential security vulnerability because they modify packets within the network, which could lead to eavesdropping or tampering. Implementing security measures in the data plane pipeline to counter these attacks is more complex.

INT-MD simplifies flow tracking because it involves single telemetry packets that are easier to analyze at the monitoring station. On the other hand, INT-MX and INT-XD require a more detailed analysis of telemetry reports, including timestamps and sequence numbers, to accurately identify the network state experienced by the packet. This additional computation may delay obtaining the network state, although conventional methods are not necessarily faster. Furthermore, INT and traditional telemetry methods differ in how they behave under congestion. Traditional methods must poll for the network state and may lose accuracy when network links are congested. In contrast, INT can swiftly report the network state based on specific events, such as congestion, offering quick and precise insights into packet drops. Moreover, INT provides more granular measurements at the nanosecond scale, whereas traditional methods report at the millisecond scale.
This granularity enhances the precision of the network information provided by INT, which can be extremely valuable for troubleshooting. The INT modes and traditional networks are compared in Table 6.

Table 6: Comparison of INT Modes and Traditional Networks

Aspect | INT Modes | Traditional Networks
--- | --- | ---
Visibility | Provides detailed per-packet telemetry | Limited visibility and monitoring capabilities
Packet-level Information | Captures granular packet information | Limited or no visibility into packet-level details
Network Performance Monitoring | Enables real-time monitoring of network performance metrics | Limited performance monitoring capabilities
Troubleshooting | Facilitates accurate and efficient troubleshooting by pinpointing issues at the packet level | Troubleshooting is often time consuming and involves manual inspection
Latency Measurement | Allows precise measurement of latency at various network stages | Limited or no capability to measure latency accurately
Quality of Service (QoS) Monitoring | Provides insights into QoS parameters such as packet loss, jitter, and delay | Limited or no visibility into QoS metrics
Security Monitoring | Enables the detection and analysis of security-related events and anomalies | Limited security monitoring capabilities
Customizability and Flexibility | Offers programmability and flexibility to define telemetry metrics and processing logic | Limited customization options and rigid network configurations
Scalability | Scales efficiently with the ability to collect telemetry from multiple network devices | Scalability may be limited by hardware and management constraints
Overhead | Introduces additional overhead due to the collection and processing of telemetry data | Minimal overhead
Network Automation and Analytics | Enables the use of telemetry data for automated network management and advanced analytics | Limited automation and analytics capabilities

3. PROGRAMMABLE DATA PLANE

The Programmable Data Plane [10] refers to a networking infrastructure that enables the customization of network processing logic to meet the demands of modern applications. This infrastructure allows network traffic to be processed at high speed and with low latency, enabling new networking features that are not possible with traditional networking hardware. The Programmable Data Plane also provides improved security and network visibility. With standard networking hardware, monitoring and controlling network traffic can be challenging, leading to security vulnerabilities. The Programmable Data Plane, however, allows custom security policies to be created and network traffic to be monitored in real time.

To implement INT using the Programmable Data Plane, the data plane must be programmed to extract network traffic data and forward it to a monitoring tool. This requires a data plane programming language such as P4, which enables the creation of custom packet headers and processing logic. Implementing In-band Network Telemetry in this way allows real-time collection and monitoring of network performance metrics. The benefits include real-time metrics with high accuracy and flexibility in creating custom network performance metrics and policies. A Programmable Data Plane for INT also has limitations, including the complexity of programming the data plane and the potential impact on network performance.

3.1 Programming Protocol-Independent Packet Processors

P4 [6] is a domain-specific programming language designed for programmable network devices. It is an open-source language that allows network administrators and developers to program network devices such as switches and routers at the packet processing level. P4 provides a way to specify programmatically how network packets are processed and forwarded, independent of the underlying hardware.
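To make the idea of programmable per-packet logic more concrete, the following Python sketch models the three INT roles from Table 2 (source, transit hop, sink) as plain functions. It is not P4 and does not follow the INT header layout of the P4.org specification; the class `Packet`, the functions `int_source`, `int_transit`, and `int_sink`, and the instruction names `switch_id` and `queue_depth` are all hypothetical, chosen only to show the flow of instructions and metadata. In this simplified model the source only adds instructions and does not record its own metadata.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    payload: bytes
    int_instructions: tuple = ()  # empty tuple: the packet carries no INT header
    int_stack: tuple = ()         # per-hop metadata, oldest hop first


def int_source(pkt, instructions):
    """Source node: attach the instructions telling later hops what to report."""
    return replace(pkt, int_instructions=tuple(instructions), int_stack=())


def int_transit(pkt, switch_state):
    """Transit hop: append the requested items taken from the local switch state."""
    if not pkt.int_instructions:
        return pkt  # ordinary packet, forwarded untouched
    metadata = {name: switch_state[name] for name in pkt.int_instructions}
    return replace(pkt, int_stack=pkt.int_stack + (metadata,))


def int_sink(pkt):
    """Sink node: extract the telemetry report and restore the original packet."""
    report = list(pkt.int_stack)
    clean = replace(pkt, int_instructions=(), int_stack=())
    return report, clean


if __name__ == "__main__":
    pkt = int_source(Packet(b"data"), ("switch_id", "queue_depth"))
    pkt = int_transit(pkt, {"switch_id": 1, "queue_depth": 5, "hop_latency_ns": 900})
    pkt = int_transit(pkt, {"switch_id": 2, "queue_depth": 0, "hop_latency_ns": 300})
    report, delivered = int_sink(pkt)
    print(report)     # one dictionary per transit hop, containing only the requested items
    print(delivered)  # the original payload with no INT state attached
```

In a P4 program the same steps would be expressed as header definitions, a parser, and match-action tables, and the report would be sent to the telemetry server rather than returned to the caller.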
Two versions of the P4 programming language have been proposed by the P4 working groups: P4_14 [64], [65] and P4_16 [65]. P4_14 has some limitations, such as its inability to describe various targets and architectures, loose semantics, and insufficient support for program modularization. P4_16 addresses these shortcomings by introducing struct types, expressions, and nested data structures while also supporting multiple targets and pipeline architectures. The current version of the P4 language compiler implements most of the features of the P4 language. The P4_16 language specification provides a detailed description of the syntax, semantic rules, and requirements for conformant implementations of the language.

3.1.1 P4 Compiler

The P4 compiler, provided by P4 target manufacturers, compiles P4 programs into configuration binaries. The P4 compiler performs two main functions: first, it converts the P4 program into a target-independent intermediate representation (IR), and second, it maps the IR to a specific target. The P4 compiler generates run-time mapping metadata, enabling communication between the control plane and data plane using P4Runtime [53]. It also generates an executable file for the target data plane, specifying the header format and corresponding operations. Each P4 compiler is designed for a specific P4 target. For example, Xilinx P4-SDNet and P4FPGA are used for FPGA-based devices, while p4c [69] and PISCES are used for software switches such as bmv2. The Barefoot P4 compiler is used for Barefoot Tofino's programmable ASIC.

3.1.2 P4 Runtime

P4Runtime is an open standard API used to configure and control P4-programmable forwarding planes. It allows network operators to dynamically configure and manage the behavior of their network forwarding devices using a high-level, abstracted interface.
P4Runtime provides a separation between the control plane and the data plane: the control plane can be implemented using a variety of programming languages and frameworks, while the data plane is programmed in P4. The API is based on Protocol Buffers, a language-agnostic binary serialization format, and provides a set of standard P4Runtime services, including packet I/O, table management, and counter management. With P4Runtime, network operators can deploy and manage P4-based forwarding planes with greater agility and efficiency, enabling faster innovation and shorter time to market. Many networking vendors, including Cisco, Arista, Huawei, Juniper, and Barefoot Networks, support P4Runtime.

3.1.3 P4 Targets

P4 targets are the hardware or software platforms that can be programmed using P4. Some common P4 targets are listed below.

- FPGA-based devices: These are hardware devices that can be reprogrammed using P4. They are commonly used in research and development environments and are known for their flexibility and versatility.
- Programmable ASICs: These are application-specific integrated circuits that can be programmed using P4. They are designed for high-performance, low-latency networking applications in data centers and carrier-grade networks.
- Software switches: These are software-based switches that can be programmed using P4. They are commonly used in virtualized network environments and are known for their flexibility and scalability.
- Smart NICs: These network interface cards include programmable logic and can be programmed using P4. They are used in high-performance computing and data center environments to offload network processing from the host CPU.
- Network processors: These are specialized chips designed for packet processing and can be programmed using P4.
  They are used in networking equipment such as routers and switches.

3.1.4 P4 Workflow

Fig 3: P4 Typical Workflow

The workflow for implementing a P4 program consists of the following steps.

- Identify the target platform: The first step is determining the target platform, which can be a software switch, programmable ASIC, or FPGA. The choice of target platform determines the P4 compiler and development tools that will be used.
- Design the P4 program: The next step is to design the P4 program, which involves defining the packet header format, parser, control flow, and actions. The P4 program is written using the P4 language and can be tested using a P4 simulator.
- Compile the P4 program: The P4 program is compiled into a target-specific binary format using a P4 compiler. The compiler generates the necessary runtime metadata for communicating with the control plane.
- Deploy the P4 program: The compiled binary is then deployed on the target platform. This involves loading the binary onto the hardware or software switch and configuring the switch to use the P4 program.
- Test the P4 program: Once the P4 program is deployed, it can be tested using various traffic generators and test scenarios to verify its correctness and performance.
- Debug and refine the P4 program: If any issues or bugs are discovered during testing, the P4 program can be debugged and refined using the P4 development tools and simulator.
- Update the P4 program: As network requirements change, the P4 program may need to be updated or modified. This involves repeating the design, compile, deploy, and test steps for the updated program.

4. IN-BAND NETWORK TELEMETRY APPLICATIONS SUMMARY

In-band network telemetry technology was initially used primarily for measuring network performance. Over time, its applications have expanded to include monitoring and optimization, load balancing, fault location, congestion control, routing decision-making, traffic engineering, and network data plane verification.

Table 7: Summary of in-band network telemetry applications

Proposed Method | Research Objective | Technical Description
--- | --- | ---
FindINT [66] | Detect and locate lost INT packets | The INT packet coding scheme employs two marking strategies to assess the packet loss rate and location on a per-flow basis.
LossSight [67] | Packet loss monitoring | Identifies instances of packet loss, determines when and where the losses occur, diagnoses the underlying reasons for the losses, and retrieves the lost INT data.
INT-detector [57] | Anomaly detection | A fast network anomaly detection system that combines In-band Network Telemetry (INT) with Deep Learning (DL).
Sonata [68] | Monitoring | PISA switches: the BMv2 P4 software switch, Barefoot Wedge 100B-65X (Tofino), Thrift API, Apache Spark.
INT-ONOS [69] | Network monitoring | BMv2 switches, P4 language.
DeltaINT [70] | Low bandwidth overhead | Barefoot Tofino.
TPP [31] | Detection of microbursts | Switch, port, and queue length are recorded hop by hop.
ProML-INT [71] | Troubleshooting | Fine-grained telemetry data from the optical packet network can be acquired in less than 1 ms, followed by offline deep neural network training to find abnormalities and classify their causes.
Hohemberger [72] | Troubleshooting | The anomaly detection rate reaches 97 percent by using machine learning to organize telemetry deployment strategies.
Jia [73] | Troubleshooting | A fast gray failure detection and localization technique using INT.
Sel-INT [56] | Path verification | The system gathers device IDs at a 10 percent sampling rate to identify inconsistent forwarding paths resulting from misconfiguration.
Wang [74] | Match rule verification | By dynamically modifying the report frequency of network flow table matches based on variations in the INT telemetry information value, network bandwidth overhead can be reduced by a factor of more than 39.
HPCC [75] | Congestion control | In high-performance data center networks, comprehensive switch load information controls the transmitting rate of the terminal, and this rate is continually updated by ACK packets.
CLOVE [76] | Routing decision | The reserved field in the header of the encapsulation protocol is used to carry INT, enabling real-time path utilization detection, source node path weight maintenance, and weighted multi-path selection.
KDN [77] | Network intelligence | A proof-of-concept knowledge-defined network solution built on the "ONOS+BMv2" platform.
NetworkAI [78] | Network intelligence | Deep reinforcement learning in conjunction with INT is used to dynamically generate control strategies that are nearly optimal.
DINT [79] | Minimizing network overhead | DINT continuously gathers network information at a frequency that adapts to the behavior of the network. A higher frequency is used when traffic is more erratic, and a lower frequency when traffic is more stable.

FindINT [66] ("Detect and Locate the Lost In-Band Network Telemetry Packet") is designed to detect and pinpoint the loss of In-band Network Telemetry (INT) packets. The proposed solution identifies instances of missing INT packets and accurately determines their specific locations within the network.
By effectively identifying and locating lost telemetry data, FindINT facilitates efficient troubleshooting and network optimization, ultimately enhancing the overall reliability and performance of the network.

LossSight [67] serves various functions: identifying instances of packet loss, determining when and where these losses occur, diagnosing the underlying reasons for the losses, and retrieving the lost INT data. Empirical findings show that LossSight performs well with minimal impact on system resources. Its detection accuracy and diagnostic precision approach 100%, and its detection latency is on the order of milliseconds. Notably, LossSight uses a generative adversarial network to restore lost telemetry information with high accuracy and reliability.

INT-detector [57] is an automated, high-speed network anomaly detection system that combines In-band Network Telemetry (INT) and Deep Learning (DL) techniques. The proposed solution aims to detect network anomalies efficiently and swiftly, enabling prompt action to mitigate potential issues and enhance network performance.

In [80], INT is employed to ensure QoS at a per-packet level. The authors achieve this by dynamically adjusting priority levels based on queuing statistics gathered across the network for each packet. The outcome is a decrease in the highest latency experienced by packets, with the aim of bounding delay and jitter for each packet. The implementation is carried out in the data plane, eliminating the need for communication with an SDN controller.

Sonata [68] is a versatile and scalable telemetry system that orchestrates the gathering and analysis of network traffic.
It offers a declarative interface to express queries for diverse telemetry tasks, enabling real-time execution by distributing each query across the stream processor and data plane. The system optimizes limited switch memory resources by dynamically refining each query to focus solely on traffic that meets the query criteria, allowing as much of the query as possible to run on the network switch at line rate.

In [69], the authors describe the development and execution of an INT monitoring framework within the ONOS controller. The approach involves an INT architecture tailored for UDP, and the authors explain how the method and implementation can be readily adapted to support TCP. The design is implemented in P4, introducing a packet-level monitoring system in ONOS.

DeltaINT [70] is a framework that provides a versatile INT solution supporting various packet-level and flow-level network management applications while minimizing bandwidth overhead. The critical insight behind DeltaINT is that state changes are often insignificant, so the entire state information is embedded into a packet only when there is a significant state change. The authors propose two variants of DeltaINT that balance bandwidth usage and measurement accuracy, both exhibiting lower bandwidth overhead than the original INT framework.

TPP [31] provides a method to integrate new data plane features quickly into the network. It exposes a network programming interface that directly allows end hosts to query and modify network state through compact packet programs. The proposed method supports a distributed programming model where all end hosts participate in a task and a logically centralized model where a central controller can oversee and program the network.
The paper demonstrates that TPPs enable various novel and practical end-host applications.

ProML-INT [71] allows the real-time visualization of packets over optical networks and facilitates customized performance monitoring and troubleshooting. The authors also provide a detailed overview of its system design and explain how to selectively insert INT fields in packets to control the overhead of ML-INT.

In [72], the authors introduce a machine learning-based orchestration model and formalize the INT Orchestration Plan Problem. INT emerges as a novel network monitoring approach, enhancing network-wide visibility through real-time collection of low-level telemetry items and facilitating prompt issue detection, including microbursts. While recent research has concentrated on developing INT mechanisms and monitoring solutions, little emphasis has been placed on coordinating the collection of telemetry items. This coordination is challenging because the choice of telemetry items to gather affects the consistency and freshness of network-wide visibility.

Data Center Networks (DCNs) should keep functioning even in the presence of internal failures, and network operators need help to resolve those failures swiftly. Nevertheless, certain failures, known as "gray failures," can occur silently, causing considerable harm to the network without any prior notification. To tackle these challenges, the authors of [73] propose a rapid detection and localization mechanism based on INT. The approach employs simplified INT probe packets for network-wide telemetry, enabling servers under ToR (Top of Rack) switches to learn all feasible paths between sources and destinations. In the event of a network failure, the affected paths are swiftly identified and removed from the path information table at each server, using a time-out mechanism.
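The time-out mechanism can be sketched as a small table that remembers when each path was last confirmed by a probe. This is a minimal illustration rather than the authors' implementation: the class `PathTable`, its methods, and the representation of a path as a tuple of switch IDs are assumptions made for the example, and the caller supplies the timestamps explicitly so that the behavior is deterministic.

```python
class PathTable:
    """Per-server table of paths, each kept alive by periodic INT probes."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s
        self._last_seen = {}  # path (tuple of switch IDs) -> time of last probe

    def record_probe(self, path, now):
        """Refresh a path each time a probe that traversed it arrives."""
        self._last_seen[tuple(path)] = now

    def expire(self, now):
        """Remove paths whose probes stopped arriving and return them."""
        stale = [p for p, seen in self._last_seen.items()
                 if now - seen > self.timeout_s]
        for path in stale:
            del self._last_seen[path]
        return stale

    def live_paths(self):
        """Paths still considered usable for source routing."""
        return list(self._last_seen)


if __name__ == "__main__":
    table = PathTable(timeout_s=1.0)
    table.record_probe((1, 2, 3), now=0.0)
    table.record_probe((1, 4, 3), now=0.0)
    table.record_probe((1, 2, 3), now=0.9)  # probes via switch 4 have stopped
    print(table.expire(now=1.5))   # [(1, 4, 3)]
    print(table.live_paths())      # [(1, 2, 3)]
```

The list returned by `expire` corresponds to the outdated entries, while `live_paths` holds the candidates that remain available to the server.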
This proactive approach empowers servers to reroute traffic based on source routing, effectively preventing significant packet loss and ensuring uninterrupted quality of experience. Concurrently, all outdated path entries are sent to a remote controller, which localizes the failure centrally by identifying common path elements. To validate their proposal, the authors constructed a virtual network testbed using software P4 switches and a Redis database. The evaluation results demonstrate that the system can swiftly detect failures and reroute affected traffic while localizing the failure within a few seconds.

Sel-INT [56] addresses the high overheads caused by per-packet operation by designing and implementing a runtime-programmable selective INT system. The approach involves developing a selective INT scheme based on protocol oblivious forwarding (POF) and then extending the Open vSwitch (OVS) platform to create a software switch that supports Sel-INT. The authors also implemented a Data Analyzer to extract and analyze the INT data. The results show that the overheads of INT were significantly reduced and that the packet processing throughput of the software switch improved significantly when the sampling rate was kept below 20 percent. Even at a low sampling rate of 7.1 percent, the proposed approach still maintains relatively high monitoring accuracy on fast-changing INT data.

Wang [74] proposes a bandwidth-efficient In-band Network Telemetry (INT) system for monitoring the rules that packets in a flow match. The system is designed to reduce the overhead of INT and avoid the bottleneck caused by the aggregation of data, which is a problem with existing INT systems. The proposed system uses a two-level aggregation approach that first aggregates data within each network switch and then aggregates data from multiple switches.
The system also introduces a new INT data structure, called a Rule-Track Record (RTR), to efficiently record the matched rules for each packet. Experimental results demonstrate that the proposed system can reduce the amount of INT data generated by up to 90 percent while maintaining high accuracy in tracking rule matches. The system can also reduce the processing time of rule matching by up to 70 percent.

HPCC (High Precision Congestion Control) [75] is a novel congestion control algorithm for data center networks. HPCC leverages an end-to-end feedback mechanism to estimate switch queue lengths accurately. The algorithm then uses this information to adjust the congestion window size and control congestion in the network. The authors evaluate HPCC’s performance using simulations and real-world experiments on a 32-node testbed. The results show that HPCC outperforms several state-of-the-art congestion control algorithms in throughput, fairness, and stability.

Clove [76] is a congestion-aware load balancing system for virtual edge networks. Clove leverages software-defined networking (SDN) to monitor network congestion and dynamically balance traffic among multiple servers. Clove provides several benefits, such as reduced network congestion, improved server utilization, and enhanced user experience. The authors evaluate Clove’s performance using a real-world deployment in a virtualized environment and demonstrate its effectiveness in handling network congestion and load balancing.
The paper highlights Clove’s potential for supporting emerging applications requiring low latency, high throughput, and reliable network performance.

Knowledge-Defined Networking (KDN) [77] is a novel approach to network management that leverages INT. KDN aims to enable network operators to better understand network behavior and performance by providing them with fine-grained telemetry data. The authors propose a KDN architecture that utilizes INT to collect and analyze telemetry data and machine learning techniques to build a knowledge base of network behavior. The knowledge base can then inform network management decisions such as traffic engineering and resource allocation. The authors evaluate the KDN architecture using simulations and demonstrate its effectiveness in improving network performance, reducing congestion, and enhancing security. The paper concludes by highlighting the potential of KDN in supporting emerging applications that require low latency, high reliability, and scalable network performance.

NetworkAI [78] is an intelligent network architecture for self-learning control strategies in software-defined networks (SDNs). The NetworkAI architecture includes two main components: a deep reinforcement learning (DRL) module and a control module. The DRL module learns optimal control strategies through interactions with the network, while the control module uses these strategies to control the network. The authors evaluate the performance of NetworkAI using simulations and experiments, showing that it outperforms existing control strategies in terms of network delay and throughput.
The proposed architecture could have applications in various network environments, including data centers, telecommunications networks, and the Internet of Things (IoT).

DINT [79] was developed with two primary objectives: minimizing network overhead while maintaining monitoring quality, and ensuring that the algorithm is lightweight enough to be executed on commodity PISA devices with limited programming capabilities. DINT accomplishes this by incorporating network information into regular packets at a time interval that adjusts dynamically to changes in traffic volume. As a result, DINT continuously gathers network information at a frequency that adapts to the behavior of the network: a higher frequency is employed when traffic is more erratic, and a lower frequency when traffic is more stable.

5. RESEARCH DOMAINS RELATED TO NETWORK TELEMETRY

5.1 Network Performance Telemetry

Both the control plane and the management plane must gather up-to-date information on the network’s performance to meet the operational and security needs of a network. Typical network performance metrics include latency, packet loss, available bandwidth, and Quality of Service (QoS) measures. For instance, Kim et al. [81] created an INT-based one-way delay measurement that uses HTTP requests within the programmable data plane to measure latency. Other systems, such as EverFlow [82] and Pingmesh [58], proactively create network probes to track changes in end-to-end latency and jitter and to identify sudden delays suggestive of network failures in data centers. Fioccola et al. [60] and Riesenberg et al. [62] calculate hop-by-hop network delays using AM-PM (Active Measurement - Passive Measurement) approaches, which often results in minimal data plane overhead and extremely precise measurements. Kagami et al. 
[61] proposed CAPEST, an INT-based data plane offloading method for estimating network capacity and available bandwidth. CAPEST adds Port ID, Capacity, and Available Bandwidth information to telemetry packets only when link capacity and available bandwidth need to be gauged, which avoids the drawbacks of conventional INT systems. To obtain a more accurate capacity estimate, CAPEST applies an autocorrelation operation to the histogram of estimated statistical information and its inverse. The available bandwidth estimate is then derived from the capacity and usage calculations. Finally, the capacity and available bandwidth values are combined with the telemetry information carried in the data packet. CAPEST improves measurement accuracy and real-time response while significantly reducing telemetry intrusion.

SIMON [83] is a reliable system that reconstructs crucial network state variables, such as packet queuing times at switches, link utilizations, and the flow-level composition of queues and links, for precise and scalable monitoring in data centers. Geng et al. [83] have shown that SIMON can run in near real time by using the function approximation capabilities of multi-layered neural networks to speed it up by a factor of 5000 to 100,000.

5.2 Network Monitoring

The paper [84] presents a streamlined sampling-based method for monitoring communication. A switch assists a packet in gathering data about its journey from source to destination, including any congestion it encounters en route. Notably, this approach incurs virtually no additional bandwidth usage, as it adds only a minimal amount of information to the header of the monitored IP packet. This makes it practical to monitor every packet without significant overhead.
In a previous study, the authors conducted network simulations of large-scale, tightly coupled High-Performance Computing (HPC) applications, demonstrating that this approach can provide comprehensive insights into traffic patterns and congestion issues and help identify their underlying causes. In [84], they describe the implementation of this approach using P4 for data center networks and showcase its capabilities through a basic experimental demonstration.

The article [55] introduces an effective method for monitoring and gathering data from Industrial Wireless Sensor Networks, with a primary focus on the 6TiSCH network stack. This stack encompasses a comprehensive set of protocols tailored for highly dependable and energy-efficient wireless mesh networks. The presented monitoring approach establishes an adaptable and robust in-band network telemetry framework. This framework minimizes resource usage and communication overhead while accommodating a diverse array of monitoring tasks and strategies, addressing various network scenarios and application needs. The technical capabilities of the proposed solution are assessed through a real-world implementation and a blend of practical and theoretical analyses. The results underscore the potential of in-band telemetry to deliver highly efficient network monitoring. Notably, these monitoring activities do not adversely affect network performance or behavior, which supports the suitability of this approach for deployment in Industrial Wireless Sensor Networks.

In [85], the authors report on monitoring INT packet loss. They conceived, implemented, and made available a robust system for tracking packet loss in INT, which they named LossSight.
LossSight encompasses a range of functionalities, including the identification of packet loss incidents, determination of their timing and origin, analysis of the underlying causes, and retrieval of lost INT data. The experimental results show that LossSight performs well with minimal impact on system resources: detection accuracy and diagnostic precision approach 100%, while detection latency stays at the millisecond level.

FlowStalker [86] is a P4-based solution for monitoring network traffic flows in real time. The proposed architecture includes two components: a monitoring agent and a control agent. The monitoring agent captures flow packets and generates flow statistics in real time, whereas the control agent manages the monitoring agents and handles user queries for flow data. The P4 program used by the monitoring agent is designed to efficiently process flow packets and extract the necessary flow information, including packet count, byte count, source IP address, destination IP address, source port, and destination port. This information is then used to generate and store flow statistics in a local flow table. Users can query the flow table for flow statistics based on various parameters, such as source or destination IP address, port number, and protocol type. The control agent can also dynamically reconfigure the monitoring agents to adapt to changes in the network. The authors evaluate FlowStalker using a software-based P4 switch and demonstrate its ability to capture and process various flows in real time with low overhead. They also show that FlowStalker can handle many flow queries from multiple users without sacrificing performance.

P4BID [87] is a novel information flow control system for P4-based networks.
The goal of P4BID is to prevent information leakage in data center networks by enforcing information flow control policies on the data plane. The authors first introduce the concept of information flow control and explain how it can be applied to P4-based networks. They then present the design of P4BID, which consists of two main components: a P4 program that enforces information flow control policies and a dynamic policy manager that can be updated in real time. The P4 program monitors packet headers and payload, tags packets with labels representing their security clearance level, and drops or allows packets based on the current security policy. The dynamic policy manager enables network administrators to update the security policy at runtime in response to changes in the network environment or to security threats. The authors evaluate P4BID on a testbed of P4-based switches and show that it can enforce information flow control policies with low overhead and minimal impact on network performance. They also demonstrate the effectiveness of P4BID in preventing information leakage.

The paper [88] provides an overview of the key concepts and techniques involved in traffic management, such as scheduling, policing, and shaping, and discusses how these functions can be implemented using P4 programmable switches. The authors also highlight the challenges and opportunities of using P4 for traffic management, including the need for efficient resource allocation, scalable control plane integration, and dynamic adaptation to changing traffic patterns. The paper reviews various traffic management techniques implemented using P4, including hierarchical scheduling, congestion control, and quality of service (QoS) enforcement.
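To make the notion of policing mentioned above concrete, the sketch below shows a token-bucket policer, one common way of enforcing a rate limit. It is an illustration only and is not taken from [88]; the class name, rate, and burst size are assumptions, and the Python code stands in for logic that a programmable switch would realise with its own metering primitives.

```python
class TokenBucketPolicer:
    """Admit a packet only if enough tokens (bytes of credit) are available."""

    def __init__(self, rate_bytes_per_s, burst_bytes):
        self.rate = rate_bytes_per_s   # long-term permitted rate
        self.burst = burst_bytes       # bucket depth, i.e. the largest tolerated burst
        self.tokens = burst_bytes      # the bucket starts full
        self.last = 0.0                # time of the previous update

    def allow(self, packet_bytes, now):
        # Refill the bucket for the elapsed time, capped at the bucket depth.
        elapsed = now - self.last
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if packet_bytes <= self.tokens:
            self.tokens -= packet_bytes
            return True                # conforming packet: forward it
        return False                   # non-conforming packet: drop (or mark) it


# Hypothetical limit of 1000 bytes/s with a 1500-byte burst allowance.
policer = TokenBucketPolicer(rate_bytes_per_s=1000, burst_bytes=1500)
decisions = [
    policer.allow(1000, now=0.0),  # fits in the initial burst
    policer.allow(1000, now=0.1),  # too soon, only about 600 bytes of credit
    policer.allow(1000, now=1.0),  # bucket has refilled
]
```

A shaper differs from this policer in that it would queue the non-conforming packet until enough tokens accumulate instead of dropping it.
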
The authors of [88] also discuss the limitations of existing approaches and suggest potential directions for future research, such as developing more efficient algorithms for traffic management, improving the programmability and flexibility of P4, and exploring new applications of programmable data planes in traffic management.

In [89], the authors further extend the advantages of SR-INT by studying how to plan SR-INT schemes for flows at the network level. This involves balancing the trade-off between bandwidth usage and network monitoring coverage. They formulate this problem as a mixed integer linear programming (MILP) model and prove that it is NP-hard. To reduce the time complexity of solving it, they propose a novel greedy algorithm based on path ranking and a column generation (CG) based approximation algorithm. Extensive simulations verify the performance of the proposed algorithms.

SFANT [90] is a new network telemetry scheme that uses the Segment Routing IPv6 mechanism to collect network status information from data plane forwarding devices. The method is based on P4, which allows for flexible and fine-grained control over the network telemetry process. The SFANT scheme has several advantages. First, it is more accurate, as it collects information directly from the data plane forwarding device rather than from user packets. Second, it is more flexible, as it can manage various kinds of network status information and be tailored to specific network requirements. Third, it is more efficient, requiring no additional bandwidth or resources from the user packets.

PacketScope [91] is a lightweight, packet-centric monitoring system for network traffic and congestion implemented in P4. It provides real-time monitoring of network traffic at the packet level, allowing network operators to detect congestion and other network issues as they occur.
PacketScope uses a combination of P4 programmable switches and a central monitoring system to provide both low-level packet information and higher-level statistics about network performance. The authors evaluate the performance of PacketScope using simulations and experiments, demonstrating its effectiveness in detecting network congestion and other issues with minimal impact on network performance. The proposed system could have applications in various network environments, including data centers, cloud networks, and the Internet of Things (IoT).

Harrison et al. [92] proposed a monitoring system that uses programmable switches to identify heavy traffic flows within a network. Every edge switch in the network keeps counters and threshold values for specific identifiers such as source IP and destination IP. Whenever a counter surpasses its local threshold value, the switch transmits a report to the controller that includes both the counter and the corresponding identifier. The controller consolidates the counter values for the same identifier from multiple switches, compares the sum against a global threshold value, and declares the identifier a heavy flow if the cumulative count exceeds the global threshold.

The authors of [93] introduce a framework that enables P4-based network monitoring and scheduling within SDN switches, eliminating the need for controller intervention. The framework contains a monitoring module that uses INT to collect status information (such as hop latency, queue occupancy, ingress port ID, and switch ID) from each switch in the forwarding plane. The traffic scheduling module then uses the link status information obtained by the monitoring module to route traffic along alternative paths in the event of congestion.
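The interplay between the monitoring and scheduling modules described above can be illustrated with a short Python sketch. It is not the implementation from [93], which runs inside the switches; the queue-occupancy threshold, the latency-based tie-break, and all names below are assumptions chosen for the example. Only the four metadata fields come from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HopMetadata:
    """Per-hop INT metadata, using the fields listed in the text."""
    switch_id: int
    ingress_port: int
    hop_latency_us: int
    queue_occupancy: int


QUEUE_THRESHOLD = 100  # assumed congestion threshold, in packets


def is_congested(hops, threshold=QUEUE_THRESHOLD):
    """A path counts as congested if any hop reports a queue above the threshold."""
    return any(h.queue_occupancy > threshold for h in hops)


def select_path(current, candidates, telemetry, threshold=QUEUE_THRESHOLD):
    """Keep the current path unless it is congested; otherwise move to the
    uncongested alternative with the lowest total hop latency."""
    if not is_congested(telemetry[current], threshold):
        return current
    healthy = [p for p in candidates
               if p != current and not is_congested(telemetry[p], threshold)]
    if not healthy:
        return current  # no better option is known, so stay on the current path
    return min(healthy, key=lambda p: sum(h.hop_latency_us for h in telemetry[p]))


# Hypothetical telemetry for three candidate paths.
telemetry = {
    "A": [HopMetadata(1, 1, 20, 150), HopMetadata(2, 1, 15, 10)],  # switch 1 is congested
    "B": [HopMetadata(1, 2, 30, 5), HopMetadata(3, 1, 25, 8)],
    "C": [HopMetadata(1, 3, 10, 4), HopMetadata(4, 1, 12, 6)],
}
new_path = select_path("A", ["A", "B", "C"], telemetry)
```
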
The framework in [93] thus monitors network status efficiently without introducing any additional delays or probe packets.

The work in [94] presents a monitoring framework for Low Latency, Low Loss, Scalable Throughput (L4S) networks using INT and the P4 programming language. The proposed framework aims to provide comprehensive visibility into L4S network performance and to enable effective troubleshooting, performance optimization, and congestion control. The authors describe the key components of the framework, including the P4-based INT probes, the data collector, and the data analytics module. The INT probes are implemented using P4, allowing flexible and fine-grained packet processing at line rate. The data collector aggregates the collected INT data and stores it in a centralized location for analysis. The data analytics module provides real-time insights into network performance and congestion.

SRv6-INT [95] is a runtime monitoring solution for Service Function Chaining (SFC) in Beyond 5G Mobile Edge Computing (B5G-MEC) networks. The proposed solution leverages Segment Routing over IPv6 (SRv6) and INT to provide real-time monitoring and optimization of SFCs. The SRv6-enabled routers implement network functions as SRv6 segments, while the INT probes collect network performance metrics and forward them to the data collector. The data collector aggregates the INT data and stores it for analysis, while the analytics module provides real-time insights into network performance and optimization.
The authors show how the proposed solution can dynamically optimize SFCs based on real-time network conditions, resulting in significant energy savings and reduced latency.

The FlowSpy [96] framework consists of three main components: a P4 program that captures packet headers, a monitoring controller that communicates with the P4 program, and a monitoring application that processes and analyzes the monitored data. The authors demonstrate the effectiveness of FlowSpy by evaluating its performance in various network scenarios, including a campus network and an internet data center (IDC) network. The results show that FlowSpy achieves high accuracy and efficiency in detecting various types of network traffic, such as TCP and UDP traffic, and can monitor network traffic effectively at high traffic rates. The authors also compare FlowSpy to existing network monitoring approaches, showing that it outperforms them in accuracy and efficiency.

INT-FLOC [97] is an intelligent software-defined network (SDN) fault localization framework that uses programmable INT and machine learning algorithms. It aims to reduce the time and resources required to locate network faults by accurately identifying the root cause of the problem. The INT-FLOC framework comprises three main components: data collection, feature engineering, and machine learning-based fault localization. The authors evaluate INT-FLOC using simulations and experiments, showing that it can locate network faults with high accuracy and low false-positive rates. The proposed framework has considerable potential for enhancing the efficiency and reliability of network fault diagnosis in SDN environments.

The approach proposed in [98] consists of two main components: a query processing module and a data processing module. The query processing module receives network monitoring queries from users.
It transforms them into a set of P4 (Programming Protocol-Independent Packet Processors) rules, which are then installed into the data processing module. The data processing module monitors network traffic and processes the P4 rules to extract relevant data for each query.

Int-Opt [99] is an adaptable and expressive telemetry system designed for the versatile monitoring of Virtual Network Function (VNF) service chains through active probing. Int-Opt makes it possible to define specific monitoring criteria for each service chain. These criteria are then translated into telemetry item collection tasks that retrieve the required telemetry data from programmable data plane elements using P4. In this approach, the Software-Defined Networking (SDN) controller generates the minimum number of monitoring flows needed to fulfill the telemetry needs of the service chains deployed in the network. The analysis demonstrates that the proposed approach can reduce monitoring overhead by 39% and total delays by 57%. Such optimization has the potential to enhance the scalability of existing expressive monitoring frameworks, making them suitable for larger real-time network environments.

5.3 Congestion Control

In [100], the authors first explain the challenges of traditional congestion control mechanisms in networks, such as their high overhead and slow response time. They then introduce the benefits of using P4 technology for congestion control, such as its programmability and ability to handle high-speed networks. The proposed approach consists of two main components, a congestion detection module and a congestion avoidance module, both implemented using P4 technology.
The congestion detection module monitors network traffic and detects congestion by analyzing packet headers and flow statistics, while the congestion avoidance module dynamically adjusts the network parameters to avoid congestion. The authors evaluate the performance of their approach using simulation experiments and show that it can detect and avoid congestion in high-speed networks with low overhead and fast response time. They also compare their approach to traditional congestion control mechanisms, showing that it outperforms them in accuracy and efficiency.

ConQuest [101] is a novel approach for measuring queue occupancy in the data plane of a network device using P4 programmable switches. The proposed method, QProbe, provides a fine-grained measurement of queue occupancy and allows for accurate estimation of packet loss rates, which are crucial in congestion control. The QProbe architecture consists of P4 programs deployed on the switch pipeline. These programs extract relevant packet header fields and compute the queue occupancy for each packet at various points in the pipeline. The computed values are then sent to a centralized controller for further analysis. The authors demonstrate the effectiveness of QProbe using a prototype implementation on a P4 programmable switch. They show that QProbe can measure queue occupancy accurately with low overhead and provide insights into the root causes of congestion. Additionally, they show that QProbe can be used for various applications such as real-time congestion detection, flow-level performance analysis, and end-to-end delay measurement.

The paper [80] proposes a novel approach to fast, soft failure recovery in packet-optical networks using P4-based telemetry processing. This processing enables fine-grained network traffic monitoring and rapid detection of soft failures, such as congestion and link degradation.
The proposed approach is implemented using P4 and an OpenFlow controller. The P4-based telemetry processing extracts network telemetry data, which is sent to the OpenFlow controller for analysis and decision-making. The controller then issues commands to the network devices to implement the necessary failure recovery measures. The experiments show that the approach can detect soft failures within seconds and recover from them within tens of milliseconds, significantly reducing network downtime and improving network reliability.

HINT [102] is an in-band network telemetry structure designed to provide insights into network congestion for the learning process of the end-host TCP algorithm. Machine learning-driven congestion control methods have so far only partially tapped into network data, focusing more on intricate model design while potentially overlooking valuable pieces of data. The core concept of HINT is to adjust switch behavior via the P4 programming language, instructing switches to embed basic device information, such as processing delay and queue occupancy, directly into transmitted packets. Preliminary experimental findings indicate that this approach incurs minimal network overhead while substantially enhancing visibility and, consequently, improving the accuracy of TCP decisions made by the end-host. At the same time, the programmability of both switches and hosts makes it possible to tailor default behaviors to evolving user requirements.

For P4QCN [103], the authors first introduce the challenges of congestion control in data center networks, such as the need for high scalability, low latency, and low overhead. They then explain the benefits of using P4 technology for congestion control, such as its programmability and flexibility. The proposed mechanism, P4QCN, consists of two main components, a congestion detection module and a congestion control module, both implemented using P4 technology.
The congestion detection module monitors network traffic and detects congestion by analyzing packet headers and flow statistics, while the congestion control module dynamically adjusts the network parameters to avoid congestion. The authors evaluate the performance of P4QCN using simulation experiments and show that it can control congestion in data center networks with low overhead and fast response time. They also compare P4QCN to existing congestion control mechanisms and show that it outperforms them in scalability and efficiency.

5.4 Network Security

INT has become increasingly popular due to its ability to provide valuable insights into network operations. However, INT has potential vulnerabilities, such as man-in-the-middle attacks and Trojan horse injection, which can result in falsified network measurements and disastrous consequences. SINT [104] is a secure INT architecture proposed to mitigate these vulnerabilities. SINT is designed to be implemented using chipset-based multi-modal network processors (MNP) and adopts blockchain technology to prevent random access and malicious modification. The authors intended SINT to be lightweight and to minimize the intrusiveness of the INT and blockchain operations. The chiplet MNP system makes SINT highly flexible and adaptive, facilitating INT convergence and the related blockchain updates. The proposed architecture dispatches INT tasks and blockchain operations to different chips, seeking a trade-off among measurement accuracy, security requirements, and computing resources on the data plane.

Fek et al. [105] proposed an anti-spoofing system for the network that uses OpenFlow 1.5 and the P4 domain-specific language within the SDN data plane. They introduced five algorithms: HTTP Redirect, TCP Proxy, TCP Reset, TCP Safe Reset, and DNS Anti-Spoofing.
In this system, the programmable switch acts as a SYN proxy and forwards to the server only those requests for which the three-way handshake has been completed. Another study examined the feasibility of employing the P4 language to detect DDoS attack traffic by implementing the History-Based IP Filtering algorithm. This algorithm maintains a database of legitimate sources, tracking the number of packets received from each source and the duration of observation. The packet and duration values are then compared to a predefined threshold, and any sources with values below the threshold are discarded.

PINT [106] bounds the per-packet telemetry overhead to a user-specified value. Overhead must be managed because excessive overhead can degrade the performance of the network. PINT bounds the amount of information added to each packet and uses multiple packets to encode the requested data, so that the per-packet overhead limit can be as small as one bit. The authors analyze PINT and establish performance bounds, even in scenarios where multiple queries are processed simultaneously.

5.5 Routing

Lewis et al. [107] address the scalability issues that existing intent-based networking approaches often suffer from when dealing with large-scale networks. To address this challenge, the paper proposes using P4, a domain-specific language for programming network data planes, to enable scalable intent-based networking. P4 allows network administrators to program the behavior of the data plane flexibly, making it suitable for implementing intent-based policies. The authors introduce a P4-based architecture called the Intent Plane, which consists of three key components: Intent Compiler, Intent Store, and Network Controller. The Intent Compiler translates high-level intents specified by network administrators into P4 programs that define the desired behavior of the data plane.
The Intent Store stores and manages the intents, while the Network Controller interacts with the data plane and orchestrates the deployment and enforcement of the intents.

Luo et al. [108] introduce the concept of path segments, which are building blocks of explicit paths in the network. Path segments are defined as specific portions of a path that can be independently controlled and manipulated. The authors propose a scalable approach to explicit path control using path segments, where complex routes can be constructed by combining multiple segments. The paper also discusses various algorithms and protocols used in the proposed mechanism, such as the Path Segment Allocation Algorithm (PSAA) and the Path Segment Protocol (PSP). These ensure proper allocation and management of path segments and efficient communication between the controller and network devices.

Contra [109] is a novel programmable system designed to optimize network routing decisions for improved performance. The paper addresses the need for routing protocols that adapt to changing network conditions and traffic patterns. The key idea behind Contra is to enable routers to make routing decisions based on performance metrics rather than traditional static routing tables. By leveraging programmable network devices, Contra allows network operators to define and customize routing policies that consider real-time network conditions and performance objectives. The author highlights the limitations of existing routing protocols, such as the Border Gateway Protocol (BGP), which primarily selects paths based on the number of hops or the shortest path without considering other performance factors. Contra aims to overcome these limitations by introducing a flexible and programmable routing framework.

Michel et al. [110] discuss the benefits of policy routing using process-level identifiers.
By incorporating application-specific policies into the routing decision process, network operators can optimize performance, enforce security measures, and enable differentiated services based on the requirements of processes or applications. The fundamental idea behind the proposed approach is to associate routing policies with specific processes or applications running on a host. By identifying and tracking processes at the network layer, the paper aims to enable more precise control over routing decisions based on application-specific requirements and policies.

6. CONCLUSION

Existing monitoring methods and mechanisms need to be revised to meet the requirements of modern networks. These methods suffer from two fundamental limitations. Firstly, they are too slow because they rely on the CPU and the control plane, which struggle to keep up with rapid changes in network state. Secondly, they cannot provide end-to-end visibility, which makes it challenging to correlate the state of individual network elements with the actual path of a flow. One of the primary research gaps in this area lies in developing efficient and flexible mechanisms for inserting and extracting telemetry information within the network data plane. Another research gap lies in the optimization of INT data collection and processing. Interest in INT is steadily growing within both academia and industry. Academic researchers are exploring its potential to offer novel solutions for network closed-loop control, while the industry has already developed advanced network monitoring products. Numerous international organizations, research institutions, and universities have expressed keen interest in In-band Network Telemetry. Network telemetry is actively pursued by prominent organizations such as the IETF, the Open Networking Foundation (ONF), and Barefoot Networks. The ONF is involved in several network telemetry projects, such as NG-SDN, P4, and Information Model.
Barefoot Networks provides high-performance programmable network solutions, focusing on high-speed Ethernet switch chips. One of its noteworthy endeavors is Deep Insight, a packet-by-packet network monitoring application that uses the Tofino forwarding chip and the P4 language. Both academia and industry need more research on a universal telemetry model for integrated, converged network scenarios. Crucial directions for in-band network telemetry technology include creating an abstract model that separates network-wide measurement queries from fundamental measurement behaviors, leveraging the adaptability and extensibility of programmable data planes to develop on-route telemetry solutions across heterogeneous networks, and choosing appropriate network state parameters to establish a cohesive view of the network state. While new telemetry technologies that combine the strengths of various approaches have been proposed, the integration of telemetry technologies remains a valuable area of study. By leveraging INT technology, data center operators can maintain high-performance, secure, and reliable networks that support the diverse and demanding needs of modern applications and services. Similarly, data plane programmability in data center networks is pivotal for creating agile, efficient, and secure infrastructures. By allowing administrators to define and modify data plane behavior, data centers can adapt to changing requirements, optimize performance, enhance security, and provide a flexible platform for innovation and the development of new networking paradigms. However, existing research on network telemetry has primarily focused on architecture and applications, yielding limited progress in fundamental measurement capabilities. Areas such as telemetry across heterogeneous networks, universal telemetry models, and high-performance telemetry data query systems have received little attention. 
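As an illustration of the insertion and extraction of telemetry information identified above as a research gap, the following minimal Python sketch models how per-hop metadata could travel with a packet and be tied back to the path of a flow. It is a simplified simulation, not a P4 program and not the INT wire format: the names HopMetadata, Packet, int_source, int_transit, int_sink, and simulate_path, as well as the chosen metadata fields, are assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Tuple


@dataclass
class HopMetadata:
    """Per-hop telemetry record (illustrative fields, not the INT header layout)."""
    switch_id: int
    queue_depth: int      # packets queued at egress when this packet left
    hop_latency_us: int   # time spent inside the switch, in microseconds


@dataclass
class Packet:
    payload: bytes
    int_enabled: bool = False
    int_stack: List[HopMetadata] = field(default_factory=list)


def int_source(pkt: Packet) -> Packet:
    """Source role: mark the packet so that later hops add telemetry."""
    pkt.int_enabled = True
    return pkt


def int_transit(pkt: Packet, switch_id: int, queue_depth: int,
                hop_latency_us: int) -> Packet:
    """Transit role: append this hop's metadata if telemetry is requested."""
    if pkt.int_enabled:
        pkt.int_stack.append(HopMetadata(switch_id, queue_depth, hop_latency_us))
    return pkt


def int_sink(pkt: Packet) -> Tuple[Packet, List[HopMetadata]]:
    """Sink role: strip telemetry before delivery and return it as a report."""
    report = list(pkt.int_stack)
    pkt.int_stack.clear()
    pkt.int_enabled = False
    return pkt, report


def simulate_path(payload: bytes,
                  hops: Iterable[Tuple[int, int, int]]) -> Tuple[Packet, List[HopMetadata]]:
    """Send one packet across hops given as (switch_id, queue_depth, latency_us)."""
    pkt = int_source(Packet(payload))
    for switch_id, queue_depth, latency_us in hops:
        pkt = int_transit(pkt, switch_id, queue_depth, latency_us)
    return int_sink(pkt)
```

Calling simulate_path(b"data", [(1, 3, 10), (2, 40, 95), (3, 0, 8)]) returns the original payload with the telemetry removed, together with a report whose entries follow the order of the traversed switches. Because each record names its switch, the deep queue at switch 2 can be attributed to a specific hop on the flow's actual path.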
Building on previous studies, several avenues for further research in INT can be identified. One area for improvement is the telemetry report monitoring station, whose packet processing rate should be increased. Additionally, the network can be made more intelligent by generating reports based on the network state rather than processing every packet. For instance, if consecutive packets carry similar embedded INT metadata, such as queue occupancy, some of them could be discarded to reduce the monitoring station’s workload and improve efficiency. Based on the research gaps identified in this study, we believe that future research in INT could focus on the following areas:
• Improving the processing and forwarding of user data traffic through the data plane.
• Maintaining the trade-off between forwarding performance and network telemetry.
• Enhancing the performance and scalability of INT deployments.
• Enabling efficient collection and processing of telemetry data within the data plane.
• Providing insights into dynamic and adaptive telemetry configurations.
• Addressing the security and privacy implications of In-band Network Telemetry using data plane programming.
Statements & Declarations
Funding: The authors declare that no funds, grants, or other support were received during the preparation of this manuscript.
Competing Interests: The authors have no relevant financial or non-financial interests to disclose.
Author Contributions: All authors contributed to the study conception and design. The first draft of the manuscript was written by Amit Kumar Singh and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.
Data Availability: This review paper synthesizes information from previously published studies. All data supporting the conclusions drawn in this manuscript are derived from the cited references. No new data were generated or analyzed during the current study.
References:
[1] T. Mizrahi, N. Sprecher, E. 
Bellagamba, and Y. Weingarten, “An Overview of Operations, Administration, and Maintenance (OAM) Tools,” Jun. 2014, doi: 10.17487/RFC7276.
[2] D. Levi, P. Meyer, and B. Stewart, “Simple Network Management Protocol (SNMP) Applications,” Dec. 2002, doi: 10.17487/RFC3413.
[3] “Information on RFC 6241 » RFC Editor.” Accessed: Jan. 03, 2024. [Online]. Available: https://www.rfc-editor.org/info/rfc6241
[4] R. Bifulco and G. Rétvári, “A Survey on the Programmable Data Plane: Abstractions, Architectures, and Open Problems”.
[5] H. Song, “Protocol-Oblivious Forwarding: Unleash the Power of SDN through a Future-Proof Forwarding Plane”.
[6] P. Bosshart et al., “P4: Programming Protocol-Independent Packet Processors”.
[7] “In-band Network Telemetry (INT) Dataplane Specification Version 2.1”.
[8] H. Zhang, Z. Cai, Q. Liu, Q. Xiao, Y. Li, and C. F. Cheang, “A Survey on Security-Aware Measurement in SDN,” 2018, doi: 10.1155/2018/2459154.
[9] B. Astuto, A. Nunes, M. Mendonca, X.-N. Nguyen, K. Obraczka, and T. Turletti, “A Survey of Software-Defined Networking: Past, Present, and Future of Programmable Networks,” IEEE Communications Surveys & Tutorials, vol. 16, no. 3, 2014, doi: 10.1109/SURV.2014.012214.00180.
[10] O. Michel, S. Schmid, R. Bifulco, and G. Rétvári, “The Programmable Data Plane: Abstractions, Architectures, Algorithms, and Applications,” ACM Comput. Surv., vol. 54, 2021, doi: 10.1145/3447868.
[11] R. Ben Basat, G. Einziger, J. Gong, J. M. Technion, and D. Raz, “q-MAX: A Unified Scheme for Improving Network Measurement Throughput,” 2019, doi: 10.1145/3355369.3355569.
[12] P. Megyesi, A. Botta, G. Aceto, A. Pescapé, and S. Molnár, “Challenges and solution for measuring available bandwidth in software defined networks,” Comput Commun, vol. 99, pp. 48–61, Feb. 2017, doi: 10.1016/J.COMCOM.2016.12.004.
[13] X. Zhang, Y. Wang, J. Zhang, L. Wang, and Y. 
Zhao, “A two-way link loss measurement approach for software-defined networks,” in 2017 IEEE/ACM 25th International Symposium on Quality of Service (IWQoS), 2017, pp. 1–10. doi: 10.1109/IWQoS.2017.7969164.
[14] W. Queiroz, M. A. M. Capretz, and M. Dantas, “An approach for SDN traffic monitoring based on big data techniques,” Journal of Network and Computer Applications, vol. 131, pp. 28–39, Apr. 2019, doi: 10.1016/J.JNCA.2019.01.016.
[15] C. Yu, C. Lumezanu, A. Sharma, Q. Xu, G. Jiang, and H. V. Madhyastha, “Software-defined latency monitoring in data center networks,” Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 8995, pp. 360–372, 2015, doi: 10.1007/978-3-319-15509-8_27.
[16] S. Wang, J. Zhang, T. Huang, J. Liu, Y. jie Liu, and F. R. Yu, “FlowTrace: measuring round-trip time and tracing path in software-defined networking with low communication overhead,” Frontiers of Information Technology and Electronic Engineering, vol. 18, no. 2, pp. 206–219, Feb. 2017, doi: 10.1631/FITEE.1601280.
[17] Q. Li, X. Zou, Q. Huang, J. Zheng, and P. P. C. Lee, “Dynamic Packet Forwarding Verification in SDN,” IEEE Trans Dependable Secure Comput, vol. 16, no. 06, pp. 915–929, Nov. 2019, doi: 10.1109/TDSC.2018.2810880.
[18] A. Alghadhban and B. Shihada, “FLight: A Fast and Lightweight Elephant-Flow Detection Mechanism”, doi: 10.1109/ICDCS.2018.00161.
[19] Y. Zhao, P. Zhang, and Y. Jin, “Netography: Troubleshoot your network with packet behavior in SDN,” Proceedings of the NOMS 2016 - 2016 IEEE/IFIP Network Operations and Management Symposium, pp. 878–882, Jun. 2016, doi: 10.1109/NOMS.2016.7502919.
[20] “Open Networking Foundation.” Accessed: Jan. 03, 2024. [Online]. Available: https://opennetworking.org/
[21] “OpenConfig.” Accessed: Jan. 03, 2024. [Online]. Available: https://www.openconfig.net/
[22] N. 
McKeown et al., “OpenFlow: Enabling Innovation in Campus Networks”.
[23] “Network Configuration (netconf).” Accessed: Jan. 03, 2024. [Online]. Available: https://datatracker.ietf.org/wg/netconf/about/
[24] “RFC 8040 - RESTCONF Protocol.” Accessed: Jan. 03, 2024. [Online]. Available: https://datatracker.ietf.org/doc/html/rfc8040
[25] “RFC 6020 - YANG - A Data Modeling Language for the Network Configuration Protocol (NETCONF).” Accessed: Jan. 03, 2024. [Online]. Available: https://datatracker.ietf.org/doc/html/rfc6020
[26] “Understanding gRPC Services for Managing Network Devices | Junos OS | Juniper Networks.” Accessed: Jan. 08, 2024. [Online]. Available: https://www.juniper.net/documentation/us/en/software/junos/grpc-network-services/topics/concept/grpc-services-overview.html
[27] “gNOI.” Accessed: Jan. 08, 2024. [Online]. Available: https://documentation.nokia.com/srlinux/22-6/SR_Linux_Book_Files/SysMgmt_Guide/gnoi_interface.html
[28] M. Bjorklund, “Internet Engineering Task Force (IETF),” 2010, [Online]. Available: http://www.rfc-editor.org/info/rfc6020.
[29] C. Kim, A. Sivaraman, N. Katta, A. Bas, A. Dixit, and L. J. Wobker, “In-band Network Telemetry via Programmable Dataplanes,” 2015.
[30] “draft-brockners-ippm-ioam-geneve-05 - Geneve encapsulation for In-situ OAM Data.” Accessed: Jan. 03, 2024. [Online]. Available: https://datatracker.ietf.org/doc/draft-brockners-ippm-ioam-geneve/
[31] V. Jeyakumar, M. Alizadeh, Y. Geng, C. Kim, and D. Mazières, “Millions of little minions: Using packets for low latency network programming and visibility,” Computer Communication Review, vol. 44, no. 4, pp. 3–14, Feb. 2015, doi: 10.1145/2619239.2626292.
[32] J. M. Smith, W. David Sincoskie, and B. J. Communications Research David, “A Survey of Active Network Research”.
[33] “RFC 7276 - An Overview of Operations, Administration, and Maintenance (OAM) Tools.” Accessed: Jan. 03, 2024. [Online]. 
Available: https://datatracker.ietf.org/doc/rfc7276/
[34] “RFC 9197 - Data Fields for In Situ Operations, Administration, and Maintenance (IOAM).” Accessed: Jan. 03, 2024. [Online]. Available: https://datatracker.ietf.org/doc/html/rfc9197
[35] T. Zhou, E. Huawei, J. Guichard, F. F. Brockners, C. Systems, and S. Raghavan, “A YANG Data Model for In-Situ OAM,” 2024. [Online]. Available: https://trustee.ietf.org/license-info
[36] F. Brockners and S. Bhandari, Eds., “Network Service Header (NSH) Encapsulation for In Situ OAM (IOAM) Data,” Aug. 2023, doi: 10.17487/RFC9452.
[37] R. Gandhi and F. Brockners, “MPLS Data Plane Encapsulation for In Situ OAM Data,” 2023. [Online]. Available: https://trustee.ietf.org/license-info
[38] C. H. Gredler RtBrick Inc J Leddy Comcast S Youell JMPC T Mizrahi Marvell A Kfir B Gafni and P. M. Lapukhov Facebook Spiegel, “VXLAN-GPE Encapsulation for In-situ OAM Data draft-brockners-ippm-ioam-vxlan-gpe-00,” 2018. [Online]. Available: http://datatracker.ietf.org/drafts/current/.
[39] “EtherType Protocol Identification of In-situ OAM Data.” Accessed: Jan. 04, 2024. [Online]. Available: https://www.ietf.org/archive/id/draft-weis-ippm-ioam-eth-05.html
[40] “In-situ OAM raw data export with IPFIX.” Accessed: Jan. 04, 2024. [Online]. Available: https://www.ietf.org/archive/id/draft-spiegel-ippm-ioam-rawexport-06.html
[41] H. Song, B. Gafni, F. Brockners, S. Bhandari, and T. Mizrahi, “RFC 9326 In Situ Operations, Administration, and Maintenance (IOAM) Direct Exporting Abstract,” 2022. [Online]. Available: https://www.rfc-editor.org/info/rfc9326
[42] F. Brockners, “Network Working Group,” 2020. [Online]. Available: https://trustee.ietf.org/
[43] “Information on RFC 9197 » RFC Editor.” Accessed: Jan. 04, 2024. [Online]. Available: https://www.rfc-editor.org/info/rfc9197
[44] S. Bhandari, E. Thoughtspot, F. Brockners, and E. Cisco, “RFC 9486: IPv6 Options for In Situ Operations, Administration, and Maintenance (IOAM),” 2023. [Online]. 
Available: https://www.rfc-editor.org/info/rfc9486
[45] T. Mizrahi, F. Brockners, S. Bhandari, B. Gafni, and M. Spiegel, “RFC 9322: In Situ Operations, Administration, and Maintenance (IOAM) Loopback and Active Flags,” 2022. [Online]. Available: https://www.rfc-editor.org/info/rfc9322
[46] C. S. Bhandari et al., “Network Working Group,” 2022. [Online]. Available: https://trustee.ietf.org/license-info
[47] “Alternate-Marking Method for Passive and Hybrid Performance Monitoring,” 2018. [Online]. Available: https://trustee.ietf.org/license-info
[48] A. Riesenberg, Y. Kirzon, M. Bunin, E. Galili, G. Navon, and T. Mizrahi, “Time-Multiplexed Parsing in Marking-based Network Telemetry,” 2019, doi: 10.1145/3319647.3325837.
[49] A. Karaagac, E. De Poorter, and J. Hoebeke, “Alternate Marking-based Network Telemetry for Industrial WSNs”.
[50] C. Fang et al., “VTrace: Automatic Diagnostic System for Persistent Packet Loss in Cloud-Scale Overlay Network,” SIGCOMM 2020 - Proceedings of the 2020 Annual Conference of the ACM Special Interest Group on Data Communication on the Applications, Technologies, Architectures, and Protocols for Computer Communication, pp. 31–43, Jul. 2020, doi: 10.1145/3387514.3405851.
[51] Z. Liu, J. Bi, Y. Zhou, Y. Wang, and Y. Lin, “NetVision: Towards Network Telemetry as a Service”, doi: 10.1109/ICNP.2018.00036.
[52] Y. Lin et al., “NetView: Towards On-Demand Network-Wide Telemetry in the Data Center”.
[53] “P4 Runtime - Putting the Control Plane in Charge of the Forwarding Plane - Open Networking Foundation.” Accessed: Jan. 04, 2024. [Online]. Available: https://opennetworking.org/news-and-events/blog/p4-runtime-putting-the-control-plane-in-charge-of-the-forwarding-plane/
[54] S. Ibanez, G. Brebner, N. McKeown, and N. Zilberman, “The P4→NetFPGA Workflow for Line-Rate Packet Processing”.
[55] A. Karaagac, E. De Poorter, and J. Hoebeke, “In-Band Network Telemetry in Industrial Wireless Sensor Networks,” IEEE Transactions on Network and Service Management, vol. 
17, no. 1, pp. 517–531, Mar. 2020, doi: 10.1109/TNSM.2019.2949509.
[56] S. Tang, D. Li, B. Niu, J. Peng, Z. Zhu, and S. Member, “Sel-INT: A Runtime-Programmable Selective In-Band Network Telemetry System,” IEEE Transactions on Network and Service Management, vol. 17, no. 2, 2020, doi: 10.1109/TNSM.2019.2953327.
[57] Y. Zhang et al., “Automating Rapid Network Anomaly Detection With In-Band Network Telemetry,” IEEE Networking Letters, vol. 4, no. 1, pp. 39–42, Nov. 2021, doi: 10.1109/lnet.2021.3130573.
[58] C. Guo et al., “Pingmesh: A Large-Scale System for Data Center Network Latency Measurement and Analysis”, doi: 10.1145/2785956.2787496.
[59] C. Kim, A. Sivaraman, N. P. K. Katta, A. Bas, A. A. Dixit, and L. J. Wobker, “In-band Network Telemetry via Programmable Dataplanes,” 2015. [Online]. Available: https://api.semanticscholar.org/CorpusID:15782087
[60] T. Pan et al., “INT-path: Towards Optimal Path Planning for In-band Network-Wide Telemetry”.
[61] N. S. Kagami, R. I. Tavares Da Costa Filho, and L. P. Gaspary, “CAPEST: Offloading Network Capacity and Available Bandwidth Estimation to Programmable Data Planes,” IEEE Transactions on Network and Service Management, vol. 17, no. 1, 2020, doi: 10.1109/TNSM.2019.2934316.
[62] A. Riesenberg, Y. Kirzon, M. Bunin, E. Galili, G. Navon, and T. Mizrahi, “Time-Multiplexed Parsing in Marking-based Network Telemetry,” 2019, doi: 10.1145/3319647.3325837.
[63] “IIGR: Packet-Level Telemetry in Large DC Networks (‘Everflow’) | by Michael Orr | Medium.” Accessed: Jan. 04, 2024. [Online]. Available: https://medium.com/@orr101/iigr-packet-level-telemetry-in-large-dc-networks-everflow-59b642eae073
[64] “The P4 Language Specification,” 2018.
[65] “The P4 Language Consortium.” [Online]. Available: https://p4.org/p4-spec/docs/P4-16-v1.0.0-spec.html
[66] L. Tan, W. Su, J. Miao, and W. Zhang, “FindINT: Detect and Locate the Lost in-Band Network Telemetry Packet,” IEEE Networking Letters, vol. 4, no. 1, pp. 20–24, Mar. 
2021, doi: 10.1109/lnet.2021.3067343.
[67] L. Tan, W. Su, W. Zhang, H. Shi, J. Miao, and P. Manzanares-Lopez, “A Packet Loss Monitoring System for In-Band Network Telemetry: Detection, Localization, Diagnosis and Recovery,” IEEE Transactions on Network and Service Management, vol. 18, no. 4, pp. 4151–4168, Dec. 2021, doi: 10.1109/TNSM.2021.3125012.
[68] A. Gupta, N. Feamster, R. Harrison, J. Rexford, M. Canini, and W. Willinger, “Sonata: Query-driven streaming network telemetry,” SIGCOMM 2018 - Proceedings of the 2018 Conference of the ACM Special Interest Group on Data Communication, vol. 15, no. 18, pp. 357–371, Aug. 2018, doi: 10.1145/3230543.3230555.
[69] N. Van Tu, J. Hyun, and J. Won-Ki Hong, “Towards ONOS-based SDN Monitoring using In-band Network Telemetry”.
[70] S. Sheng, Q. Huang, and P. P. C. Lee, “A general delta-based in-band network telemetry framework with extremely low bandwidth overhead,” Computer Networks, vol. 223, p. 109573, 2023, doi: 10.1016/j.comnet.2023.109573.
[71] S. Tang, J. Kong, B. Niu, and Z. Zhu, “Programmable Multilayer INT: An Enabler for AI-Assisted Network Automation,” IEEE Communications Magazine, vol. 58, no. 1, pp. 26–32, Jan. 2020, doi: 10.1109/MCOM.001.1900365.
[72] R. Hohemberger et al., “Orchestrating In-Band Data Plane Telemetry With Machine Learning,” IEEE Communications Letters, vol. 23, no. 12, 2019, doi: 10.1109/LCOMM.2019.2946562.
[73] C. Jia et al., “Rapid Detection and Localization of Gray Failures in Data Centers via In-band Network Telemetry,” 2020.
[74] S.-Y. Wang, Y.-R. Chen, J.-Y. Li, H.-W. Hu, J.-A. Tsai, and Y.-B. Lin, “A Bandwidth-Efficient INT System for Tracking the Rules Matched by the Packets of a Flow”.
[75] Y. Li et al., “HPCC: High Precision Congestion Control,” Conference of the ACM Special Interest Group on Data Communication, vol. 19, 2019, doi: 10.1145/3341302.3342085.
[76] N. Katta et al., “Clove: Congestion-Aware Load Balancing at the Virtual Edge,” 2017, doi: 10.1145/3143361.3143401.
[77] J. Hyun and J. 
Won-Ki Hong, “Knowledge-Defined Networking using In-band Network Telemetry”.
[78] H. Yao, T. Mai, X. Xu, P. Zhang, M. Li, and Y. Liu, “NetworkAI: An Intelligent Network Architecture for Self-Learning Control Strategies in Software Defined Networks,” IEEE Internet Things J, vol. 5, no. 6, 2018, doi: 10.1109/JIOT.2018.2859480.
[79] H. B. Brum, C. R. P. Dos Santos, and T. C. Ferreto, “DINT: A Dynamic Algorithm for In-band Network Telemetry.” [Online]. Available: https://github.com/HenriqueBBrum/DINT/tree/main
[80] F. Cugini, P. Gunning, F. Paolucci, P. Castoldi, and A. Lord, “P4 In-Band Telemetry (INT) for Latency-aware VNF in Metro Networks,” 2019.
[81] C. Kim, A. Sivaraman, N. Katta, A. Bas, A. Dixit, and L. J. Wobker, “In-band Network Telemetry via Programmable Dataplanes,” 2015.
[82] Y. Zhu et al., “Packet-Level Telemetry in Large Datacenter Networks”, doi: 10.1145/2785956.2787483.
[83] Y. Geng et al., “SIMON: A Simple and Scalable Method for Sensing, Inference and Measurement in Data Center Networks”, Accessed: Jan. 04, 2024. [Online]. Available: https://www.usenix.org/conference/nsdi19/presentation/geng
[84] P. Taffet and J. Mellor-Crummey, “Lightweight, Packet-Centric Monitoring of Network Traffic and Congestion Implemented in P4”, doi: 10.1109/HOTI.2019.00026.
[85] L. Tan, W. Su, W. Zhang, H. Shi, J. Miao, and P. Manzanares-Lopez, “A Packet Loss Monitoring System for In-Band Network Telemetry: Detection, Localization, Diagnosis and Recovery,” IEEE Transactions on Network and Service Management, vol. 18, no. 4, pp. 4151–4168, Dec. 2021, doi: 10.1109/TNSM.2021.3125012.
[86] L. Castanheira, R. Parizotto, and A. E. Schaeffer-Filho, “FlowStalker: Comprehensive Traffic Flow Monitoring on the Data Plane Using P4”.
[87] K. Grewal and J. Hsu, “P4BID: Information Flow Control in P4,” 2022, doi: 10.1145/3519939.3523717.
[88] D. Sanvito, “Traffic Management in Networks with Programmable Data Planes,” in Special Topics in Information Technology, A. 
Geraci, Ed., Cham: Springer International Publishing, 2021, pp. 13–23. doi: 10.1007/978-3-030-62476-7_2.
[89] B. Chen, F. Chen, S. Tang, Q. Zheng, and Z. Zhu, “On Orchestration of Segment Routing and In-band Network Telemetry,” IEEE Transactions on Network and Service Management, Dec. 2023, doi: 10.1109/TNSM.2023.3254200.
[90] Y. Liu, Y. Xia, W. Zhang, W. Jia, and J. Wu, “SFANT: A SRv6-based Flexible and Active Network Telemetry Scheme in Programming Data Plane,” 2023, doi: 10.1109/TNSE.2023.3277000.
[91] P. Taffet and J. Mellor-Crummey, “Lightweight, Packet-Centric Monitoring of Network Traffic and Congestion Implemented in P4”, doi: 10.1109/HOTI.2019.00026.
[92] R. Harrison, Q. Cai, A. Gupta, and J. Rexford, “Network-Wide Heavy Hitter Detection with Commodity Switches,” 2018, doi: 10.1145/3185467.3185476.
[93] J. Geng, J. Yan, Y. Ren, and Y. Zhang, “Design and Implementation of Network Monitoring and Scheduling Architecture Based on P4.”
[94] H. N. Nguyen, B. Mathieu, M. Letourneau, G. Doyen, S. Tuffin, and E. Montes De Oca, “A Comprehensive P4-based Monitoring Framework for L4S leveraging In-band Network Telemetry,” 2023, doi: 10.1109/NOMS56928.2023.10154331.
[95] X. Yan, Z. Xu, B. Chen, and Z. Zhu, “SRv6-INT: Runtime Monitoring for Green Service Function Chaining in B5G-MEC.”
[96] B. Guan and S.-H. Shen, “FlowSpy: An Efficient Network Monitoring Framework using P4 in Software-Defined Networks”.
[97] Y. Tang, Y. Wu, G. Cheng, and Z. Xu, “Intelligence Enabled SDN Fault Localization via Programmable In-band Network Telemetry”.
[98] Y. Zhou et al., “HyperSight: Towards Scalable, High-Coverage, and Dynamic Network Monitoring Queries,” IEEE Journal on Selected Areas in Communications, vol. 38, no. 6, 2020, doi: 10.1109/JSAC.2020.2986690.
[99] D. Bhamare, A. Kassler, J. Vestin, M. A. Khoshkholghi, and J. Taheri, “IntOpt: In-band Network Telemetry Optimization for NFV Service Chain Monitoring”.
[100] B. Turkovic, F. Kuipers, K. Langendoen, and N. 
van Adrichem, “Fast network congestion detection and avoidance using P4,” 2018, doi: 10.1145/3229574.3229581.
[101] X. Chen et al., “Fine-Grained Queue Measurement in the Data Plane,” 2019, doi: 10.1145/3359989.3365408.
[102] A. Sacco, A. Angi, F. Esposito, and G. Marchetto, “HINT: Supporting Congestion Control Decisions with P4-driven In-Band Network Telemetry,” in IEEE International Conference on High Performance Switching and Routing, HPSR, IEEE Computer Society, 2023, pp. 83–88. doi: 10.1109/HPSR57248.2023.10147977.
[103] J. Geng, J. Yan, and Y. Zhang, “P4QCN: Congestion Control Using P4-Capable Device in Data Center Networks,” Electronics, vol. 8, no. 3, p. 280, Mar. 2019, doi: 10.3390/ELECTRONICS8030280.
[104] Y. Zhao, G. Cheng, and Y. Tang, “SINT: Toward a Blockchain-Based Secure In-Band Network Telemetry Architecture,” IEEE Transactions on Information Forensics and Security, vol. 18, p. 2667, 2023, doi: 10.1109/TIFS.2023.3269891.
[105] Y. Afek, A. Bremler-Barr, and L. Shafir, “Network Anti-Spoofing with SDN Data plane”.
[106] R. Ben Basat et al., “PINT: Probabilistic In-band Network Telemetry,” vol. 19, doi: 10.1145/3387514.3405894.
[107] B. Lewis, L. Fawcett, M. Broadbent, and N. Race, “Using P4 to Enable Scalable Intents in Software Defined Networks”, doi: 10.1109/ICNP.2018.00064.
[108] L. Luo, H. Yu, S. Luo, Z. Ye, X. Du, and M. Guizani, “Scalable explicit path control in software-defined networks,” Journal of Network and Computer Applications, vol. 141, pp. 86–103, 2019, doi: 10.1016/j.jnca.2019.05.014.
[109] K.-F. Hsu, R. Beckett, A. Chen, J. Rexford, P. Tammana, and D. Walker, “Contra: A Programmable System for Performance-aware Routing”.
[110] O. Michel and E. Keller, “Policy Routing using Process-Level Identifiers,” 2016, doi: 10.1109/IC2EW.2016.10.