WebSocket Protocol: Opening Handshake - Part 2

Proving the opening handshake is valid
Published on 2024/03/27

After going over the first half of the opening handshake, we will try to understand how the protocol expects two entities to prove that a handshake was received. I'll add the handshake request here again to help us out:

GET /chat HTTP/1.1
Host: server.example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Origin: http://example.com
Sec-WebSocket-Protocol: chat, superchat
Sec-WebSocket-Version: 13

I think this section is missing a few details, but this is reasonable since it's an RFC. If there was no way to prove that the handshake was received, a WebSocket server could be tricked into accepting connections that are not WebSocket connections at all, for example carefully crafted requests sent from a web page using XMLHttpRequest or a form submission. This means the protocol needs a way for the server to show the client that it actually read and understood the handshake, and this has to complete before any data is exchanged. Note that this is a proof of receipt rather than authentication: on its own it does not stop cross-site WebSocket hijacking, which is what checking the Origin header is for. But how exactly is this achieved?

I think the RFC example is pretty clear. The server grabs the value stored in the Sec-WebSocket-Key header and does the following:

  • Concatenate the Sec-WebSocket-Key value exactly as sent (still base64-encoded, not decoded) dGhlIHNhbXBsZSBub25jZQ== with the fixed GUID 258EAFA5-E914-47DA-95CA-C5AB0DC85B11
  • Take the SHA-1 hash of the resulting string dGhlIHNhbXBsZSBub25jZQ==258EAFA5-E914-47DA-95CA-C5AB0DC85B11
  • Base64-encode the 20-byte hash and return the result, s3pPLMBiTxaQ9kYGzzhZRbK+xOo=, in the Sec-WebSocket-Accept field

The response from the server will look like this:

HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
Sec-WebSocket-Protocol: chat

Based on what we explored in part 1 and here, this response looks intuitive. The first line acknowledges the request to switch protocols: if we look back at the client handshake, it asked for a connection upgrade from HTTP to WebSocket. If the server responds with anything other than 101, the handshake has not completed and HTTP semantics still apply. Next come the Upgrade and Connection headers, which confirm what the connection has been upgraded to. They go hand in hand with the 101 status code.

Based on all the fields we covered, here are the cases in which the client would consider the handshake as not accepted:

  • A status code other than 101.
  • A Sec-WebSocket-Accept that doesn't match what the client computed.
  • A missing Sec-WebSocket-Accept header.

In any of these cases, no data frames will be exchanged. The last field, Sec-WebSocket-Protocol, is optional: as covered in part 1, the client can list the subprotocols it will accept, and it then verifies that the server responded with one of them.

Thoughts

We have only just started, but we are already building some solid fundamental knowledge. I was looking forward to learning how malicious connection attempts are prevented and was happy to learn a simple yet effective mechanism to do so. Computing the Sec-WebSocket-Accept is incredibly simple; here's a rough Go example:

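import (
  "crypto/sha1"
  "encoding/base64"
)

// computeSecWebsocketAccept derives the Sec-WebSocket-Accept value from the
// client's Sec-WebSocket-Key. For the RFC sample key dGhlIHNhbXBsZSBub25jZQ==
// it returns s3pPLMBiTxaQ9kYGzzhZRbK+xOo=.
func computeSecWebsocketAccept(wsKey string) string {
  // Fixed GUID defined by the RFC.
  const guid = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"

  // SHA-1 of the key (as sent, still base64) concatenated with the GUID.
  sha1Hash := sha1.Sum([]byte(wsKey + guid))

  return base64.StdEncoding.EncodeToString(sha1Hash[:])
}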

I wish I had the time to create a small WebSocket client that can complete a successful handshake. Maybe next time (but sadly probably not).
