
Adding TLS Fragmentation to OpenConnect

In this post, I discuss what it means to apply fragmentation at the TCP and TLS Record layers during the TLS handshake, and how we can use these techniques to evade Internet censorship.

Moorko

31 Aug 2024 • 12 min read

As an Iranian, I can't remember the Internet without VPNs. From the moment I discovered the Internet, I discovered VPNs too. Although I didn’t know why I was using it or what it actually did, I just knew that if I wanted to access my favorite websites (like YouTube, etc.), I had to run this piece of software before opening my browser.

Years have passed and things have changed. I’m not that child anymore. I know that I had to use a VPN because the dictatorship doesn’t want me or my people to access the free flow of information. One more thing has changed, too: I've learned that I can fight back.

After the Mahsa Amini protests in 2022, the Islamic Regime intensified its efforts to combat internet censorship circumvention tools. In response to the widespread use of VPNs and other technologies that Iranians relied on to bypass digital restrictions and Internet blackouts[[1]], the regime took more aggressive measures to block these tools and prevent access to uncensored information. This crackdown has reached new extremes, with some experts arguing that Iran’s internet censorship is now even more restrictive than China’s. As the regime goes to greater lengths to suppress online freedom, my own fight against censorship has become more determined and passionate than ever.

In this post, I’m going to quickly and simply review what TLS is, then discuss what fragmentation means in the TLS Record and TCP layers[[2]], and how we can use it to circumvent the network sensors deployed as part of the Islamic Regime’s Great Firewall (IRGFW).

Transport Layer Security (TLS)

TLS (formerly known as SSL) is a widely used security protocol that ensures data protection for communications over the Internet. While TLS can be used to encrypt any application data, its use in securing HTTP is the most publicly visible—like when a web browser accesses a website and you see the "https" sign instead of "http."

But what happens if we don’t use TLS? In that case, anyone with the proper access to your network can read your HTTP packets. This means they can view HTTP headers, such as the Host header. For example, let’s imagine you’re a dictator and you want to ban Twitter. You could inspect every HTTP packet and check if the Host header equals twitter.com, and if it does, you simply block that packet from reaching Twitter.

This is where TLS comes in: with TLS, you (sorry, dictator-you!) can’t read the Host header anymore because it’s encrypted.

TLS handshake

If our packets are already getting encrypted, then what's the problem? Well, before TLS can transmit encrypted application data, it performs an action called a handshake. The first message of this handshake is sent unencrypted and contains something called Server Name Indication (SNI), which is basically the content of the HTTP Host header.

So, as a dictator, you wouldn’t be able to see the content I’m transmitting to the website, but you can see which website I’m trying to connect to. A network sensor can parse this handshake (specifically the first packet sent by the client, called ClientHello) and block the TCP connection if the dictator doesn’t like the destination.
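To make this concrete, here is a minimal sketch of what a very simple sensor could do. It is my own illustration and not how any real sensor is implemented: the function name naive_extract_sni and the cursor helper are hypothetical, and the field layout follows the ClientHello wire format. The important assumption is in the comments: the whole ClientHello must sit in one TLS record that is fully present in the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct cursor { const uint8_t *p; size_t left; };

/* Read a big-endian 16-bit integer. */
static size_t rd16(const uint8_t *p) { return ((size_t)p[0] << 8) | p[1]; }

/* Consume k bytes; *field points at them. Returns 0 if not enough data. */
static int take(struct cursor *c, size_t k, const uint8_t **field)
{
    if (c->left < k) return 0;
    *field = c->p;
    c->p += k;
    c->left -= k;
    return 1;
}

/*
 * Naive SNI extraction (hypothetical sensor logic). Assumes the whole
 * ClientHello is inside ONE TLS record that is fully present in buf.
 * Copies the NUL-terminated name into out and returns its length,
 * or returns -1 if the name cannot be found.
 */
int naive_extract_sni(const uint8_t *buf, size_t len, char *out, size_t out_size)
{
    struct cursor c = { buf, len };
    const uint8_t *f;

    assert(out != NULL && out_size > 0);

    /* TLS record header: type (0x16 = handshake), version, length. */
    if (!take(&c, 5, &f) || f[0] != 0x16) return -1;
    if (rd16(f + 3) > c.left) return -1;        /* record not fully present */
    c.left = rd16(f + 3);

    /* Handshake header: type (0x01 = ClientHello), 24-bit length. */
    if (!take(&c, 4, &f) || f[0] != 0x01) return -1;
    size_t hs_len = ((size_t)f[1] << 16) | rd16(f + 2);
    if (hs_len > c.left) return -1;             /* rest is in another record */
    c.left = hs_len;

    if (!take(&c, 34, &f)) return -1;                           /* version + random */
    if (!take(&c, 1, &f) || !take(&c, f[0], &f)) return -1;     /* session_id */
    if (!take(&c, 2, &f) || !take(&c, rd16(f), &f)) return -1;  /* cipher_suites */
    if (!take(&c, 1, &f) || !take(&c, f[0], &f)) return -1;     /* compression */
    if (!take(&c, 2, &f) || rd16(f) > c.left) return -1;        /* extensions length */
    c.left = rd16(f);

    while (take(&c, 4, &f)) {
        size_t ext_type = rd16(f), ext_len = rd16(f + 2);
        const uint8_t *ext;
        if (!take(&c, ext_len, &ext)) return -1;
        if (ext_type != 0x0000) continue;       /* not server_name */
        /* list length (2), name type (1), name length (2), name. */
        if (ext_len < 5 || ext[2] != 0x00) return -1;
        size_t name_len = rd16(ext + 3);
        if (name_len > ext_len - 5 || name_len >= out_size) return -1;
        memcpy(out, ext + 5, name_len);
        out[name_len] = '\0';
        return (int)name_len;
    }
    return -1;
}
```

Notice the two early returns marked "not fully present" and "rest is in another record". A parser this simple gives up as soon as the ClientHello does not arrive in one piece, which is exactly the weakness the rest of this post tries to exploit.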

If you want more details on this topic, you can read this great blog post:

Circumventing the GFW with TLS Record Fragmentation | System Security Group

How Fragmentation Can Be Extended to the TLS Layer


So, What Can We Do Now?

The only thing we can do is make it harder for network sensors to effectively process ClientHello and the SNI. Why does this matter? Well, network sensors are resource-intensive because it’s not easy to dissect every packet across different protocols. On top of that, these sensors need to be fast, since not all packets are "illegal." If they can’t operate quickly, they’ll significantly degrade network quality for all users (as we’re already seeing in Iran’s network).

The idea, then, is simple: make ClientHello parsing more expensive for the network sensors tasked with detecting "illegal" websites.

TLS handshake fragmentation

As I mentioned earlier, the Achilles’ heel of a network sensor is its performance, and the "bullet" we can shoot at that heel is to force it to hold state. These sensors process millions (if not billions) of connections, and allocating memory to maintain per-connection state is extremely resource-intensive. As a result, they often choose to ignore packets that would require it, allowing us to evade censorship.

One approach to achieve this is to fragment our TLS handshake packets. If a network sensor wants to parse fragmented data, it needs to reassemble the fragments first, which increases the processing burden.

This fragmentation can happen at two layers: the TCP layer and the TLS Record layer. I’ll start by examining TCP fragmentation first, as it’s simpler.

TCP fragmentation

First of all, "fragmentation" isn’t strictly the right word in the context of TCP. In TCP, packets are referred to as segments, and each TCP segment can contain either complete application messages or only parts of them if they can’t fit into a single segment.

Here, we essentially want to divide our data stream into multiple TCP segments:

Segment 1:
|------------|
| This is sup|
| posed to be|
| data in the|
| TCP segment|
|------------|

      |
      |
      |
      V


Segment 1:
|------------|
| This is sup|
|------------|

Segment 2:
|------------|
| posed to be|
|------------|


Segment 3:
|------------|
| data in the|
|------------|

Segment 4:
|------------|
| TCP segment|
|------------|

This is what we're trying to do, dividing a TCP segment into multiple segments.

Implementing TCP fragmentation (a.k.a. segmentation) is actually quite simple. We'll explore this further after we examine TLS Records and the concept of their fragmentation.

TLS Record Protocol

The TLS Record Protocol is a layered protocol. At each layer, messages may include fields for length, description, and content. The Record Protocol takes the messages to be transmitted, fragments the data into manageable blocks, optionally compresses it, applies a MAC, encrypts the data, and transmits the result.

When data is received, it is decrypted, verified, decompressed, reassembled, and then delivered to higher-level clients.

TLS Record

The TLS Record Layer receives uninterpreted data from higher layers in non-empty blocks of arbitrary size. It fragments these information blocks into TLSPlaintext records, carrying data in chunks of 2^14 bytes or less.

An important aspect of the TLS Record specifications for us is that client message boundaries are not preserved in the record layer. This means that multiple client messages of the same ContentType may be coalesced into a single TLSPlaintext record, or a single message may be fragmented across several records.

Now that we understand that a single handshake message can be carried in one or more TLSPlaintext records, let’s look at the structure of TLSPlaintext:

struct {
    ContentType type;
    ProtocolVersion version;
    uint16 length;
    char fragment[TLSPlaintext.length];
} TLSPlaintext;

In the struct above, fragment is the data we want to convey via a TLS Record.
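On the wire, everything before fragment is a 5-byte header: one byte for the type, two for the version, and two for the length in big-endian order. Here is a small sketch of that header in C. The struct and function names are hypothetical (they are not taken from OpenConnect, OpenSSL, or GnuTLS), and I use only uint8_t members so the compiler has no reason to insert padding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical on-the-wire view of the TLSPlaintext header (5 bytes). */
struct tls_record_header {
    uint8_t content_type;    /* 22 (0x16) = handshake */
    uint8_t version_major;   /* 3 */
    uint8_t version_minor;   /* 3 for TLS 1.2 */
    uint8_t length_be[2];    /* payload length, big-endian */
};

/* Only uint8_t members, so no padding is expected; verify at compile time. */
_Static_assert(sizeof(struct tls_record_header) == 5, "header must be 5 bytes");

/* Fill a header for a handshake record carrying `length` payload bytes. */
void tls_header_init(struct tls_record_header *h, uint16_t length)
{
    assert(length <= (1u << 14));   /* a record carries at most 2^14 bytes */
    h->content_type = 22;
    h->version_major = 3;
    h->version_minor = 3;
    h->length_be[0] = (uint8_t)(length >> 8);
    h->length_be[1] = (uint8_t)(length & 0xff);
}

/* Read the payload length back out of a header. */
uint16_t tls_header_length(const struct tls_record_header *h)
{
    return (uint16_t)((h->length_be[0] << 8) | h->length_be[1]);
}
```

For a 27-byte payload, tls_header_init produces the bytes 16 03 03 00 1b, followed on the wire by the 27 bytes of fragment.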

Now let's say we have a TLS record as depicted below:

TLSRecord [
  TLSPlaintext {type: handshake, version: 1.2, length: 27, fragment: "Some data, SNI: twitter.com"}
]

Our goal is to split that single TLSPlaintext record into multiple structures:

TLSRecord [
  TLSPlaintext {type: handshake, version: 1.2, length: 10, fragment: "Some data,"},
  
  TLSPlaintext {type: handshake, version: 1.2, length: 10, fragment: " SNI: twit"},
  
  TLSPlaintext {type: handshake, version: 1.2, length: 7, fragment: "ter.com"},
]

Combining TLS Record and TCP fragmentation together

Assume we have the same TLSRecord with a single TLSPlaintext in a single TCP segment:

Segment 1:
|-------------|
| TlsRecord [ |
| TLSPlaintext|
| {type: hand |
| shake, vers |
| ion: 1.2,   |
| length: 27, |
| fragment: " |
| Some data,  |
| SNI: twitte |
| r.com } ]   |
|-------------|

By combining TCP and TLS record fragmentation, our final structure should look something like this:

Segment 1:
|-------------|
| TlsRecord [ |
| TLSPlaintext|
| {type: hand |
| shake, vers |
| ion: 1.2,   |
| length: 10, |
| fragment: "S|
| ome data,",}|
|-------------|


Segment 2:
|-------------|
| TLSPlaintext|
| {type: hand |
| shake, vers |
| ion: 1.2,   |
| length: 10, |
| fragment: " |
| SNI: twit"},|
|-------------|


Segment 3:
|-------------|
| TLSPlaintext|
| {type: hand |
| shake, vers |
| ion: 1.2,   |
| length: 7,  |
| fragment: " |
| ter.com"} ] |
|-------------|

Now, imagine that as a network sensor, you need to buffer all these segments, then reassemble them to create a meaningful stream of data (i.e., a TLSRecord). After that, you’d also need to reassemble multiple TLSPlaintext records into a single TLSPlaintext. This operation can be quite expensive. On the other hand, it doesn’t create a significant anomaly in normal traffic since these fragmentations can happen during regular communication as well.
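To give a feeling for the second half of that work, here is a rough sketch of record-level reassembly, assuming the TCP stream has already been put back in order. The function name reassemble_handshake is hypothetical and this is only my illustration of the idea, not code from any real sensor. It walks over consecutive handshake records and concatenates their payloads into one buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TLS_HANDSHAKE  0x16
#define TLS_HEADER_LEN 5

/*
 * Concatenate the payloads of consecutive handshake records found in
 * stream[0..stream_len) into out. Returns the number of payload bytes
 * written, or -1 if a record is truncated or out is too small.
 */
long reassemble_handshake(const uint8_t *stream, size_t stream_len,
                          uint8_t *out, size_t out_size)
{
    size_t pos = 0, written = 0;

    assert(stream != NULL && out != NULL);

    while (stream_len - pos >= TLS_HEADER_LEN && stream[pos] == TLS_HANDSHAKE) {
        /* Bytes 3..4 of the header hold the big-endian payload length. */
        size_t frag_len = ((size_t)stream[pos + 3] << 8) | stream[pos + 4];
        pos += TLS_HEADER_LEN;

        /* Truncated record: a real sensor would have to wait for more data. */
        if (frag_len > stream_len - pos) return -1;
        /* Every connection being tracked needs buffer space like this. */
        if (frag_len > out_size - written) return -1;

        memcpy(out + written, stream + pos, frag_len);
        written += frag_len;
        pos += frag_len;
    }
    return (long)written;
}
```

Feeding it the three records from the example above (payload lengths 10, 10, and 7) gives back the original 27 bytes. The point is the out buffer: the sensor has to keep one of these per connection until the message is complete, and that is the state we are forcing it to hold.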

We can take it a step further by capturing those TCP segments right before they leave our system and sending them in the wrong order or even sending them twice (mimicking TCP retransmission). This would result in out-of-order TCP segments, forcing network sensors to perform TCP reassembly. In the case of Iran’s network, this wouldn't raise alarms, as the nation-wide network is already highly unstable due to censorship, with frequent occurrences of out-of-order and retransmitted packets.

Now, let’s look at how I implemented these techniques in the OpenConnect project.

Implementing fragmentation in OpenConnect

OpenConnect supports two TLS libraries: OpenSSL and GnuTLS.

OpenSSL

TLS Record fragmentation

Implementing TLS Record fragmentation was super easy in OpenSSL as it already provides an API for it: SSL_set_split_send_fragment.

TCP fragmentation

When OpenSSL (or GnuTLS) finishes creating the TLS Record and handling anything related to the TLS layer, it typically calls the send() function[[3]] (not directly—OpenSSL takes several factors into account, but we’ll simplify for now) to transfer data to the kernel’s network stack. The kernel then handles fitting the data into TCP segments and sending them out.

What we need to do is intercept this process and make those libraries (i.e., OpenSSL and GnuTLS) use our own transport layer function instead of the default one. So, we essentially need to replace their transport layer function—we need to be the ones responsible for calling the send() function.

But why do we need to do this? And what exactly will this new function do?

Why?

This is the only way (at least as far as I know) that we can mimic this behavior while maintaining the consistency of the existing code.

What?

The custom transport layer function should send a single stream of data in multiple chunks. It can be as simple as this:

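#include <sys/param.h>   // MIN
#include <sys/socket.h>  // send
#include <sys/types.h>   // ssize_t

int custom_send_function (int socket_fd, const char* data, const int size)
{
    // Our desired TCP segment payload size
    const int tcp_frag_size = 20;
    int written_bytes = 0;

    while (written_bytes < size)
    {
        // Never hand more than tcp_frag_size bytes to a single send() call
        const int chunk = MIN(tcp_frag_size, size - written_bytes);
        const ssize_t ret = send(socket_fd, &data[written_bytes], chunk, 0);
        if (ret < 0)
        {
            // send() failed; the caller can inspect errno
            return -1;
        }
        // send() may accept fewer bytes than requested, so count what it took
        written_bytes += (int)ret;
    }

    return written_bytes;
}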

Simple custom transport layer function

For this code to work, though, you need to know about a TCP mechanism called Nagle’s algorithm, which Linux enables by default. This algorithm essentially makes the sender hold back small writes until there’s enough data to fill a packet or the data already in flight has been acknowledged, so that the data can be sent all at once. So by default, we don’t necessarily get a TCP segment every time we call the send() function. To disable this mechanism, the TCP socket should have the TCP_NODELAY option[[4]] enabled. Fortunately, OpenConnect enables this option during the initial setup, so there’s nothing more we need to do for that. 🙂
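If you want to try this in a program that doesn't already set the option, it is a single setsockopt() call. The two helper names below are hypothetical; setsockopt(), getsockopt(), and TCP_NODELAY are the standard socket API described in tcp(7).

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Disable Nagle's algorithm on a TCP socket. Returns 0 on success, -1 on error. */
int disable_nagle(int socket_fd)
{
    const int on = 1;

    assert(socket_fd >= 0);
    return setsockopt(socket_fd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on));
}

/* Report whether TCP_NODELAY is set: 1 = set, 0 = not set, -1 = error. */
int nagle_is_disabled(int socket_fd)
{
    int value = 0;
    socklen_t len = sizeof(value);

    if (getsockopt(socket_fd, IPPROTO_TCP, TCP_NODELAY, &value, &len) != 0)
        return -1;
    return value != 0;
}
```

Keep in mind that even with TCP_NODELAY the kernel makes the final decision about segment boundaries, so it's worth checking the result with a packet capture.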

Now that we have our custom function, how do we replace OpenSSL’s transport layer with ours?

To do that, we need to understand a few OpenSSL-specific concepts:

BIO object

A BIO is an I/O abstraction in OpenSSL, it hides many of the underlying I/O details from an application. If an application uses a BIO for its I/O it can transparently handle SSL connections, unencrypted network connections and file I/O.[[5]]

Each SSL object is also associated with two BIO objects. A BIO object is used for sending or receiving data from the underlying transport layer. For example you might create a BIO to represent a TCP socket. The SSL object uses one BIO for reading data and one BIO for writing data. In most cases we would use the same BIO for each direction.[[6]]

BIO method

BIO objects in OpenSSL use BIO methods to perform the actual I/O operations, which can involve sockets, files, etc. To implement our own transport layer, we need to create a custom BIO method and have OpenConnect use this custom BIO object instead of the default OpenSSL socket BIO.

One important detail is that this BIO object is used throughout the entire connection and handles all the data transmission after the connection is established. If we continue using our TCP fragmentation function during the entire transmission, all data passing through OpenConnect would be fragmented into multiple TCP segments, which is highly inefficient. Therefore, we only want to use our custom send/write function during the handshake. After the handshake is completed, we restore OpenSSL’s default transport layer to ensure efficient data transmission for user traffic through OpenConnect.

Then you might wonder, why not just use the default OpenSSL socket BIO and modify the specific transport layer function (like the write() function) for TCP fragmentation during the handshake, and then revert back to the default afterwards? The reason lies in the implementation of the BIO_new_socket() function[[7]]. As you can see, the BIO_s_socket() function returns a static const structure[[8]], which means we can’t modify any entries in that struct.

So what we need to do instead is create a custom BIO method, populate its entries using the default BIO socket method, and have OpenConnect use this new method. This custom BIO method allows its entries to be modified, which is exactly what we'll do later. But first, let's see how we create and populate this custom BIO method with OpenSSL's default transport layer functions.

static BIO_METHOD* mutable_socket_method (void)
{
	BIO_METHOD* bio_method = BIO_meth_new(BIO_TYPE_SOCKET, "custom_socket");
	if (bio_method == NULL) {
		// Allocation failed; let the caller handle it
		return NULL;
	}

	// The built-in socket method is const, so we copy its entries one by one
	const BIO_METHOD* default_method = BIO_s_socket();

	BIO_meth_set_write_ex(bio_method, BIO_meth_get_write_ex(default_method));
	BIO_meth_set_write(bio_method, BIO_meth_get_write(default_method));

	BIO_meth_set_read_ex(bio_method, BIO_meth_get_read_ex(default_method));
	BIO_meth_set_read(bio_method, BIO_meth_get_read(default_method));

	BIO_meth_set_gets(bio_method, BIO_meth_get_gets(default_method));
	BIO_meth_set_puts(bio_method, BIO_meth_get_puts(default_method));

	BIO_meth_set_ctrl(bio_method, BIO_meth_get_ctrl(default_method));
	BIO_meth_set_callback_ctrl(bio_method, BIO_meth_get_callback_ctrl(default_method));

	BIO_meth_set_create(bio_method, BIO_meth_get_create(default_method));
	BIO_meth_set_destroy(bio_method, BIO_meth_get_destroy(default_method));

	return bio_method;
}

I don’t think this piece of code needs much explanation since it’s quite simple and straightforward. 🙂

Now that we have our BIO method ready, we can create a BIO object, modify the necessary functions during the handshake, and then restore the default behavior for the rest of the program’s flow. Something like this:

bio_socket_method = mutable_socket_method();
https_bio = BIO_new(bio_socket_method);

SSL_set_bio(https_ssl, https_bio, https_bio);

if (is_tcp_fragmentation_enabled) {
  BIO_meth_set_write(bio_socket_method, tcp_frag_write_func);
  BIO_meth_set_write_ex(bio_socket_method, tcp_frag_write_ex_func);
}

// Perform TLS handshake here
Handshake();
// ....


// Restore BIO method write functions
BIO_meth_set_write_ex(bio_socket_method, BIO_meth_get_write_ex(BIO_s_socket()));
BIO_meth_set_write(bio_socket_method, BIO_meth_get_write(BIO_s_socket()));

And that's it! We've successfully enabled fragmentation for both the TLS Record and TCP layers.

Now, let's move on to GnuTLS.

GnuTLS

TLS Record fragmentation

Unlike OpenSSL, GnuTLS doesn’t offer a specific API to automatically handle fragmentation. We need to implement it ourselves within our custom transport layer function (referred to as a "push" function in GnuTLS). When GnuTLS prepares packets to be sent out, it calls our push function. We then take that data, parse it using the TLSPlaintext structure we discussed earlier to extract the TLS Record, and divide that TLS Record into multiple records. It looks something like this:

static ssize_t tls_fragment_push_func(int socket_fd, const void* original_data, size_t original_size)
{
	char* data_to_be_written = (char*)original_data;
	size_t data_size_to_be_written = original_size;
	const size_t tls_record_frag_size = 10;


	struct tls_record_header_st base_header = *((struct tls_record_header_st*)data_to_be_written);

	/*
	 * We only want to fragment TLS records that have ContentType of Handshake.
	 */
	if (base_header.content_type == Handshake) {

		const size_t data_without_base_header_size = data_size_to_be_written - sizeof(struct tls_record_header_st);
		const char* data_without_base_header = data_to_be_written + sizeof(struct tls_record_header_st);

		// Calculate the number of the new TLS Record headers we want
		const int needs_carry = data_without_base_header_size % tls_record_frag_size == 0 ? 0 : 1;
		const int number_of_headers = (data_without_base_header_size / tls_record_frag_size) + needs_carry;
		const size_t tls_frag_rec_overhead = number_of_headers * sizeof(struct tls_record_header_st);

		/*
		 * We need to allocate a new buffer as we need to copy new headers
		 * in between our data.
		 */
		data_size_to_be_written = data_without_base_header_size + tls_frag_rec_overhead;
		data_to_be_written = (char*)malloc(data_without_base_header_size + tls_frag_rec_overhead);
		if (data_to_be_written == NULL) {
			return -1;
		}

		/*
		 * Create a new buffer with fragmented TLS records.
		 */
		ssize_t data_remained_bytes = data_without_base_header_size;
		ssize_t copied_bytes = 0;
		for (int i = 0; i < number_of_headers; i++) {

			const size_t frag_size = MIN(tls_record_frag_size, data_remained_bytes);

			// Copy TLS Record base header
			base_header.length = htons(frag_size);
			memcpy(data_to_be_written + copied_bytes, (char*)(&base_header), sizeof(struct tls_record_header_st));
			copied_bytes += sizeof(struct tls_record_header_st);

			// Copy fragment of data
			memcpy(data_to_be_written + copied_bytes, data_without_base_header + (i * tls_record_frag_size), frag_size);
			copied_bytes += frag_size;
			data_remained_bytes -= frag_size;
		}
	}

	/*
	 * Perform TCP fragmentation here if needed and send the data out.
	 * ....
	 */


	// return written_bytes;
}

In the code above, we parse the TLS Record header, check whether the record type is Handshake, and if so, build a new buffer that spreads the original payload across multiple TLS Records, each with its own copy of the header and an updated length.

Next, we replace the default GnuTLS push function with our custom function during the handshake, and then restore the default function afterward:

if (is_tls_rec_frag_enabled) {
    gnutls_transport_set_push_function(vpninfo->https_sess, tls_fragment_push_func);
}

// Perform TLS handshake ....


if (is_tls_rec_frag_enabled) {
    gnutls_transport_set_push_function(vpninfo->https_sess, tls_default_push_func);
}

And that’s it! (Well, not entirely—there are some nuances in the code, but the concept remains the same). We’ve successfully implemented TLS handshake fragmentation at both the TLS Record and TCP layers for OpenSSL and GnuTLS. 🙂

The final result would look something like this:

[Figure: ClientHello fragmented in TLS Record layer]

Conclusion

In this post, I briefly covered what TLS is and why it matters. We then looked at TCP fragmentation, explored the TLS Record Protocol, and discussed what it means to apply fragmentation at that layer.

Afterward, we walked through the code examples and saw how to implement fragmentation using OpenSSL and GnuTLS. As we’ve seen, it’s quite simple. 🙂

This was my first blog post written in English, and I hope to write more often!

[[1]]: In 2019 and 2022

[[2]]: By TCP fragmentation I actually mean segmentation, so don't confuse it with IP fragmentation.

[[3]]: send(2)

[[4]]: TCP_NODELAY is a socket option. See tcp(7).

[[5]]: OpenSSL documentation: Link

[[6]]: OpenSSL documentation: Link

[[7]]: OpenSSL code: Link

[[8]]: OpenSSL code: Link