TLV (Tag-Length-Value) is a binary format used to represent data in a structured way. TLV is commonly used in computer networking protocols, smart card applications, and other data exchange scenarios. The three parts of TLV are:
- Tag: Uniquely identifies the type of data. It's typically a single byte or a small sequence of bytes.
- Length: Length of the data field in bytes. In some protocols, the lengths of tag and length fields are also included.
- Value: Actual data being transmitted, which can be of any type or format.
Entities that send messages would encode information into TLV format. Entities that receive such messages would decode them to retrieve the information. Many programming languages have libraries for TLV encoding and decoding. Developers can also build their own custom encoders and decoders, perhaps optimized for their applications.
Could you explain the TLV format with an example?
Developers will usually represent data in a form that's most convenient and efficient for processing. For example, this could be an associative array, a linked list, or a class with attributes. When this data needs to be stored or transmitted, it has to be serialized. This is where TLV format is used. A TLV encoder reads the data/message and outputs a stream of bytes. A TLV decoder does the reverse.
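As a minimal sketch of such an encoder and decoder, the following assumes a 1-byte tag and a 1-byte length, which is only one of many possible layouts. The tags 0x01 and 0x02 and the function names are hypothetical, chosen for illustration.

```python
import struct

def encode_tlv(tag: int, value: bytes) -> bytes:
    """Encode one element as a 1-byte tag, a 1-byte length, then the value."""
    if not 0 <= tag <= 0xFF:
        raise ValueError("tag must fit in one byte")
    if len(value) > 0xFF:
        raise ValueError("value too long for a 1-byte length field")
    return struct.pack("BB", tag, len(value)) + value

def decode_tlvs(data: bytes) -> list:
    """Decode a byte stream into a list of (tag, value) pairs."""
    items = []
    offset = 0
    while offset < len(data):
        if offset + 2 > len(data):
            raise ValueError("truncated TLV header")
        tag, length = struct.unpack_from("BB", data, offset)
        offset += 2
        if offset + length > len(data):
            raise ValueError("truncated TLV value")
        items.append((tag, data[offset:offset + length]))
        offset += length
    return items

# In-memory representation: an associative array (dict).
message = {"name": "Ada", "age": 36}

# Hypothetical tag assignment: 0x01 = name (UTF-8), 0x02 = age (one byte).
stream = (encode_tlv(0x01, message["name"].encode("utf-8"))
          + encode_tlv(0x02, bytes([message["age"]])))

print(stream.hex())         # serialized bytes
print(decode_tlvs(stream))  # back to (tag, value) pairs
```

The decoder returns raw bytes for each value. Interpreting them (as text, integer, and so on) is left to the application, which knows what each tag means.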
The figure shows a TLV example used for F-TEID, an information element (IE) in 5G's PFCP protocol. The type value 21 indicates that this is an F-TEID. The protocol defines other values for other IEs. Since the type field is two bytes, it's encoded as 0x0015.
The length field is also 2 bytes. Its value indicates the number of bytes that follow. These bytes form the 'V' in TLV. The 4-byte TEID field is mandatory. Byte 5 contains some flags (CHID, CH, V6, V4) that indicate the presence of optional fields. For example, if V6 is set to 1, a 16-byte IPv6 address is present.
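The layout described above can be sketched in code. This is an illustration rather than a complete PFCP implementation: the bit positions assumed for V4 and V6 within the flags byte, the function name, and the sample TEID and address are all assumptions, and the CH and CHID flags are not handled.

```python
import ipaddress
import struct
from typing import Optional

FTEID_TYPE = 21   # IE type for F-TEID, as stated in the text
FLAG_V4 = 0x01    # assumed position of the V4 flag in byte 5
FLAG_V6 = 0x02    # assumed position of the V6 flag in byte 5

def encode_fteid(teid: int,
                 ipv4: Optional[str] = None,
                 ipv6: Optional[str] = None) -> bytes:
    """Encode an F-TEID-like IE: 2-byte type, 2-byte length, then the value."""
    flags = 0
    addresses = b""
    if ipv4 is not None:
        flags |= FLAG_V4
        addresses += ipaddress.IPv4Address(ipv4).packed  # 4 bytes
    if ipv6 is not None:
        flags |= FLAG_V6
        addresses += ipaddress.IPv6Address(ipv6).packed  # 16 bytes
    # Value: flags byte, 4-byte TEID, then whichever addresses are present.
    value = struct.pack("!BI", flags, teid) + addresses
    # Header: big-endian type and length; length counts only the value bytes.
    return struct.pack("!HH", FTEID_TYPE, len(value)) + value

ie = encode_fteid(0x12345678, ipv4="192.0.2.1")
print(ie.hex())
```

With only an IPv4 address present, the value is 1 + 4 + 4 = 9 bytes, so the length field carries 0x0009.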
Does 'T' in TLV refer to "tag" or "type"?
The TLV acronym can refer to either "Tag-Length-Value" or "Type-Length-Value." Both of these terms are used interchangeably and can be considered correct.
In some contexts, "Tag" and "Type" may be used to refer to slightly different things. "Tag" may refer to an identifier used within a particular protocol. "Type" may refer to a more general data type, such as an integer, string, or binary data. However, in most cases, the terms "Tag" and "Type" are used interchangeably to refer to the identifier of the data being transmitted.
What are the benefits of using the TLV format?
The TLV format allows data to be structured in a flexible way. Data can be organized into logical groups. The length field allows for variable-length data to be represented.
The format is also extensible, meaning that new tags can be added to the format without requiring changes to existing code. This makes it easy to add new functionality to an existing system.
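Extensibility relies on decoders skipping tags they don't recognize, which the length field makes possible. The sketch below assumes 1-byte tag and length fields; the tag table and the tag 0x7F standing in for a newer field are hypothetical.

```python
# Tags this (older) decoder understands. Hypothetical values for illustration.
KNOWN_TAGS = {0x01: "name", 0x02: "age"}

def decode_known(data: bytes) -> dict:
    """Decode known fields and silently skip unknown ones."""
    fields = {}
    offset = 0
    while offset + 2 <= len(data):
        tag, length = data[offset], data[offset + 1]
        value = data[offset + 2:offset + 2 + length]
        # The length field tells us where the next element starts,
        # even when we don't know what this tag means.
        offset += 2 + length
        if tag in KNOWN_TAGS:
            fields[KNOWN_TAGS[tag]] = value
    return fields

# A newer sender inserts an element with tag 0x7F between two known ones.
stream = (bytes([0x01, 3]) + b"Ada"
          + bytes([0x7F, 2, 0xAA, 0xBB])
          + bytes([0x02, 1, 36]))

print(decode_known(stream))
```

The older decoder still recovers both fields it knows about. For brevity this sketch does not check for truncated input, which a production decoder should do.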
It's also an efficient way to store and transmit data. By using a binary format, the data can be transmitted more quickly and with less overhead than with text-based formats.
The use of length fields in the TLV format makes it easy to detect errors in the data. If the length of a field does not match the expected value, it is likely that an error has occurred.
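A minimal sketch of such checks follows, assuming 1-byte tag and length fields. The tags 0x10 and 0x11, their expected sizes, and the TlvError class are hypothetical.

```python
# Hypothetical tags whose values have a fixed, known size in bytes.
EXPECTED_LENGTHS = {0x10: 4, 0x11: 2}

class TlvError(ValueError):
    """Raised when a TLV stream is malformed."""

def validate(data: bytes) -> int:
    """Return the number of well-formed elements, or raise TlvError."""
    offset = 0
    count = 0
    while offset < len(data):
        if len(data) - offset < 2:
            raise TlvError(f"truncated header at offset {offset}")
        tag, length = data[offset], data[offset + 1]
        offset += 2
        # The declared length must not run past the end of the buffer.
        if length > len(data) - offset:
            raise TlvError(f"tag {tag:#04x}: length {length} exceeds remaining data")
        # For fixed-size fields, the declared length must match the expected one.
        expected = EXPECTED_LENGTHS.get(tag)
        if expected is not None and length != expected:
            raise TlvError(f"tag {tag:#04x}: expected {expected} bytes, got {length}")
        offset += length
        count += 1
    return count

good = bytes([0x10, 4, 1, 2, 3, 4, 0x11, 2, 9, 9])
print(validate(good))
```

Length checks catch truncation and framing errors but not corrupted values of the right size. Detecting those needs a checksum or similar mechanism.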
Which standards have adopted the TLV format?
The OSI (Open Systems Interconnection) reference model is a layered architecture. TLV appears in protocols across the OSI layers, often for optional or extensible fields rather than for the whole message. For example, at the data link layer, LLDP frames carried over Ethernet and the information elements in Wi-Fi management frames use TLV. At the network layer, IP options and ICMPv6 Neighbor Discovery options are two examples. At the application layer, there are plenty of protocols that use TLV or TLV-like framing for at least some fields: HTTP/2, CoAP, DNS, and MQTT are some examples.
TLV is typically not used at the physical layer, which usually deals with raw bits. TLV is less common at the transport, session and presentation layers, though exceptions exist: TCP options and SCTP chunks are TLV-style structures.
Here are more standards that use TLV:
- ISO 7816: This defines the communication between smart cards and card readers. Data objects carried in APDUs (Application Protocol Data Units) are commonly TLV-encoded.
- Bluetooth: The Bluetooth Low Energy (BLE) specification uses the TLV format to encode the data for advertising and communication between BLE devices.
- SIM/eSIM cards: In cellular systems, SIM and eSIM cards use the TLV format to store data, and exchange data with the mobile device.
What endianness is used in TLV?
The endianness used depends on the specific protocol or application, which must specify the endianness it's using. If messages are being exchanged between devices with different endianness, proper conversion would be needed before processing those messages.
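To illustrate why both sides must agree on byte order, the sketch below encodes the same element both ways using Python's struct module. The 2-byte tag, 2-byte length and 2-byte value layout is an assumption made for the example.

```python
import struct

tag, value = 0x0015, 0x1234

# Same logical element (2-byte tag, 2-byte length, 2-byte value),
# encoded in big-endian (">") and little-endian ("<") byte order.
big = struct.pack(">HHH", tag, 2, value)
little = struct.pack("<HHH", tag, 2, value)

print(big.hex())     # most significant byte first
print(little.hex())  # least significant byte first

# Decoding with the wrong byte order yields different numbers
# without raising any error.
wrong = struct.unpack(">HHH", little)
print([hex(x) for x in wrong])
```

Since the misread is silent, byte order mistakes tend to surface as nonsensical tags or lengths further down the decoder.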
There are some protocols that mix endianness for different fields within the same TLV message. One such protocol is the Bluetooth Low Energy (BLE) protocol. The endianness of the length field and the value field may be different. Specifically, the length field in the TLV message is always encoded in little-endian byte order, while the endianness of the value field depends on the type of data being transmitted. For example, if the value field contains a 16-bit unsigned integer, it's encoded in little-endian byte order, since the length field is also in little-endian byte order. However, if the value field contains a 32-bit floating-point value, it's encoded in big-endian byte order.
What are some best practices for designing TLV-based messages?
Use standardized tags to ensure that your messages can be easily understood and implemented by other systems. Consider using a standardized byte order or including byte order information in the message.
Define a clear message structure that defines the order and the type of fields. When possible, use fixed-length fields. These approaches make parsing and processing more efficient. Where variable-length fields are needed, use a length field to indicate the size of the data. Include error checking in the message to ensure that the message is valid and has not been corrupted during transmission.
When designing a TLV message, reserve certain tags for future use. This can ensure that new fields can be added to the message without requiring changes to existing code. Use flags to indicate which fields in the message are optional. This can help reduce the size of the message and make it easier to parse. Flags also help with backward compatibility. A version number included in a message can help older systems interwork with newer systems. Another technique is to use a variable-length encoding scheme that can represent both older and newer data formats.
How should developers encode/decode messages in TLV format?
Any TLV encoder/decoder must be tested for correctness. Even when invalid input is fed to them, they should fail gracefully. They can log a warning message and keep applications robust and secure.
The length value can't be set until all message fields are encoded. One approach is to increment the length value as each field is encoded.
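One way to apply this is to write a placeholder for the length, keep a running count while encoding the fields, and then write the final count back into the placeholder. The sketch below assumes 2-byte big-endian tag and length fields. The function name and sample fields are hypothetical.

```python
import struct

def encode_message(tag: int, fields: list) -> bytes:
    """Encode a TLV whose value is the concatenation of already-encoded fields."""
    buf = bytearray()
    buf += struct.pack("!H", tag)
    length_pos = len(buf)   # remember where the length field lives
    buf += b"\x00\x00"      # placeholder until the value is complete
    length = 0
    for field in fields:
        buf += field
        length += len(field)  # increment the length as each field is encoded
    # All fields are encoded, so the real length can now be written.
    struct.pack_into("!H", buf, length_pos, length)
    return bytes(buf)

msg = encode_message(0x0015, [b"\x01", b"\x12\x34\x56\x78"])
print(msg.hex())
```

The alternative is to encode the value into a separate buffer first and then prepend the header, which is simpler but involves an extra copy.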
If fields are not byte-aligned, bit manipulation is required. This involves bit shifting and bit masking to get or set specific bits of a field. Here's an example (assuming big-endian byte order, with bits numbered from 1 at the least significant bit):
- Encoder: Data dt is in the range 0-3 (2 bits). It's to be encoded into bits 5 and 4 of a 2-byte field fld without disturbing other bits. Assuming those two bits are currently zero, we can do fld |= (dt & 0x0003) << 3.
- Decoder: We wish to extract bits 5 and 4 from a 2-byte field fld. We can do dt = (fld & 0x0018) >> 3 or dt = (fld >> 3) & 0x0003.
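The same operations can be sketched in Python. Unlike the one-line encoder expression, this version clears the two target bits before setting them, so it also works when they already hold a value. The function names are hypothetical, and bits are counted from 1 at the least significant bit.

```python
MASK = 0x0018  # bits 5 and 4 of a 2-byte field (bit 1 is the least significant)
SHIFT = 3

def set_dt(fld: int, dt: int) -> int:
    """Return fld with bits 5 and 4 replaced by dt; other bits are untouched."""
    if not 0 <= dt <= 3:
        raise ValueError("dt must fit in 2 bits")
    cleared = fld & ~MASK & 0xFFFF  # zero the target bits, keep 16 bits
    return cleared | ((dt << SHIFT) & MASK)

def get_dt(fld: int) -> int:
    """Extract bits 5 and 4 of fld as a value in the range 0-3."""
    return (fld & MASK) >> SHIFT

fld = set_dt(0xFFFF, 0b01)
print(hex(fld), get_dt(fld))
```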
Developers can use a well-tested and widely-used TLV library rather than writing a custom implementation. Examples include libtins and tlvcpp (C++); Apache MINA and TlvParser (Java); Construct and TLV-Coding (Python); BouncyCastle and PeterO.TlvLib (.NET).
What are some variations of the TLV format?
TLV has some variations and they offer different trade-offs between simplicity, flexibility, and efficiency:
- TVL: The tag comes first, followed by the value, and then the length. This format is sometimes used in legacy systems, but it is less common than the TLV format.
- LV: There's no tag field. This format is simpler than TLV, but it doesn't provide any information about the meaning or purpose of the data.
- TFLV: Type-specific flags field is also included.
- Nested TLV: The value field can contain nested TLV structures, allowing for the representation of more complex data. This is often used in protocols that require hierarchical or nested data structures.
- Extended TLV (ETLV): This includes an extra "extended" field in the tag that provides additional information about the tag's purpose or context. This allows for more flexibility in the use of tags, as well as better support for backward compatibility.
- Binary TLV (BTLV): BTLV is a binary variant of TLV that's designed for use in low-level system programming. It uses fixed-width fields for the tag, length, and value, and is often used in embedded systems and low-level network protocols.
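Of these variations, nested TLV is perhaps the easiest to illustrate. The sketch below assumes 1-byte tag and length fields and a hypothetical convention in which a tag with its top bit set marks a container whose value is itself a sequence of TLVs. Real protocols define their own way of signalling this.

```python
CONSTRUCTED = 0x80  # hypothetical convention: top bit of the tag marks a container

def encode(tag: int, value: bytes) -> bytes:
    """Encode one element with a 1-byte tag and a 1-byte length."""
    return bytes([tag, len(value)]) + value

def decode(data: bytes) -> list:
    """Decode a TLV sequence, recursing into container elements."""
    items = []
    offset = 0
    while offset < len(data):
        if len(data) - offset < 2:
            raise ValueError("truncated TLV header")
        tag, length = data[offset], data[offset + 1]
        offset += 2
        if length > len(data) - offset:
            raise ValueError("truncated TLV value")
        value = data[offset:offset + length]
        offset += length
        if tag & CONSTRUCTED:
            items.append((tag, decode(value)))  # value holds nested TLVs
        else:
            items.append((tag, value))          # value is plain data
    return items

inner = encode(0x01, b"Ada") + encode(0x02, bytes([36]))
outer = encode(0x81, inner)
print(outer.hex())
print(decode(outer))
```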
What are the alternatives to the TLV format?
TLV is a binary format. Other binary formats include Protocol Buffers, ASN.1 and MessagePack. Protocol Buffers was developed by Google. It's language- and platform-independent. ASN.1 is an older format that's widely adopted. It has different TLV-based encoding rules (BER, DER) and non-TLV-based encoding rules (PER, XER). MessagePack is designed to be fast and compact. These alternatives allow developers to define and maintain message definitions in readable syntax while encoding them into binary formats.
Sometimes an efficient binary format is not essential. Developers may prefer a textual format that's easier to read and parse. In such situations, XML and JSON formats could be used. These are widely used in web services and mobile applications. XML is also used for document formatting.
The use of TLV encoding can be traced back to the development of the Abstract Syntax Notation One (ASN.1) standard in the 1980s within the telecommunications industry. ASN.1's Basic Encoding Rules (BER) use TLV encoding to represent complex data structures in a compact and efficient format.
In the 1990s, TLV is used in the development of the EMV (Europay, Mastercard, and Visa) standard for payment cards. The EMV standard uses TLV encoding to represent the data stored on a payment card, including the cardholder's name, card number, expiration date, and other information.
Through the 2000s, the use of TLV encoding becomes more common in computer networking protocols, such as the Simple Network Management Protocol (SNMP) and the Link Layer Discovery Protocol (LLDP).