[Golang] Understanding TCP: Debunking Misconceptions with Practical Examples

TCP can seem simple, especially in Go. You get a net.Conn, call Write to send data and Read to receive it, and everything appears to work. The API is minimal and very Go-like. That simplicity can also mislead, because the details beneath the surface are easy to overlook.

This blog post aims to demystify the real workings of TCP using practical examples, tackling common myths, all within the context of Go development.

A Simple Starting Point

With Go’s net.Conn, getting something running takes only a few lines. For example:

Sender:

conn.Write([]byte("hello"))

Receiver:

buf := make([]byte, 1024)
n, _ := conn.Read(buf)
fmt.Println(string(buf[:n]))

In basic situations, this approach seems flawless: one write, one read, and the message arrives intact. That can lead to the assumption that it will always be this simple.

From Simple Messages to File Transfers

Let’s take this further and try transferring a file:

Sender:

file, _ := os.Open("large_file.dat")
io.Copy(conn, file)

Receiver:

buf := make([]byte, 1024)
n, _ := conn.Read(buf)
out.Write(buf[:n])

Suddenly, issues arise. The transfer does not complete, data seems to disappear, and the result varies from run to run. The receiver calls Read only once, so it gets at most 1024 bytes, and the rest of the file is never read. This brings us to the actual nature of TCP.

The Nature of TCP: Byte Stream, Not Messages

TCP delivers a continuous stream of bytes, not discrete messages. Here’s what that means:

If you call conn.Write(A) and then conn.Write(B), the receiver could see:
AB in one combined read
A in one read, followed by B in the next
A split across multiple reads
A plus the first part of B in one read, with the rest of B later

TCP doesn’t recognize the end of one message and the start of another. It only guarantees the data arrives in the order it was sent.
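To see this without depending on network timing, here is a minimal sketch that uses net.Pipe, an in-memory connection from the standard library, as a stand-in for a TCP connection. That substitution is an assumption made for the demo: a real socket also buffers data and may merge separate writes into one read, which net.Pipe does not do. The helper name readChunks is hypothetical.

```go
package main

import (
	"fmt"
	"io"
	"net"
)

// readChunks writes each message to one end of an in-memory connection and
// reads from the other end with a buffer of bufSize bytes. It returns the
// data as the reader saw it, one entry per Read call.
func readChunks(messages []string, bufSize int) ([]string, error) {
	client, server := net.Pipe()

	go func() {
		defer client.Close() // closing signals EOF to the reader
		for _, m := range messages {
			if _, err := client.Write([]byte(m)); err != nil {
				return
			}
		}
	}()

	defer server.Close()
	var chunks []string
	buf := make([]byte, bufSize)
	for {
		n, err := server.Read(buf)
		if n > 0 {
			chunks = append(chunks, string(buf[:n]))
		}
		if err == io.EOF {
			return chunks, nil
		}
		if err != nil {
			return chunks, err
		}
	}
}

func main() {
	// One Write of 11 bytes, read through a 4-byte buffer.
	chunks, err := readChunks([]string{"hello world"}, 4)
	if err != nil {
		fmt.Println("read error:", err)
		return
	}
	fmt.Printf("%q\n", chunks) // prints ["hell" "o wo" "rld"]
}
```

A single Write shows up as three Reads, and nothing in the received bytes says where the original message started or ended.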

The 1024 Buffer: A Convenient Illusion

Why did the small example work? The message was small and the timing was favorable, so everything arrived in a single Read. That luck runs out with larger data, which exposes the underlying issue: a buffer size is not a substitute for message boundaries.

Creating Reliable Transfers: Using Message Boundaries

To transfer data consistently, the receiver needs to know how much data belongs to each message. We can solve this by including the size of the data before the data itself:

[length][payload]

Sender:

data := []byte("hello world")
binary.Write(conn, binary.BigEndian, uint32(len(data)))
conn.Write(data)

Receiver:

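var length uint32
if err := binary.Read(conn, binary.BigEndian, &length); err != nil {
  log.Fatal(err) // connection closed or header incomplete
}

// ReadFull keeps reading until buf is full or the connection fails.
buf := make([]byte, length)
if _, err := io.ReadFull(conn, buf); err != nil {
  log.Fatal(err)
}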

This method lets the receiver know exactly how many bytes to expect for each piece of data.
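The same idea can be wrapped in a pair of reusable helpers. This is a sketch under a few assumptions: the names writeFrame, readFrame, and roundTrip are made up for illustration, the 1 MiB cap is an arbitrary limit so that a bad header cannot trigger a huge allocation, and a bytes.Buffer stands in for the connection.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
	"io"
)

// maxFrameSize is an arbitrary cap chosen for this sketch.
const maxFrameSize = 1 << 20

var errFrameTooLarge = errors.New("frame exceeds maxFrameSize")

// writeFrame sends a 4-byte big-endian length followed by the payload.
func writeFrame(w io.Writer, payload []byte) error {
	if len(payload) > maxFrameSize {
		return errFrameTooLarge
	}
	var header [4]byte
	binary.BigEndian.PutUint32(header[:], uint32(len(payload)))
	if _, err := w.Write(header[:]); err != nil {
		return err
	}
	_, err := w.Write(payload)
	return err
}

// readFrame reads one length-prefixed frame and returns its payload.
func readFrame(r io.Reader) ([]byte, error) {
	var header [4]byte
	if _, err := io.ReadFull(r, header[:]); err != nil {
		return nil, err
	}
	length := binary.BigEndian.Uint32(header[:])
	if length > maxFrameSize {
		return nil, errFrameTooLarge
	}
	payload := make([]byte, length)
	if _, err := io.ReadFull(r, payload); err != nil {
		return nil, err
	}
	return payload, nil
}

// roundTrip writes every message into an in-memory buffer (standing in for
// the connection) and reads them back one frame at a time.
func roundTrip(messages []string) ([]string, error) {
	var stream bytes.Buffer
	for _, m := range messages {
		if err := writeFrame(&stream, []byte(m)); err != nil {
			return nil, err
		}
	}
	var out []string
	for range messages {
		payload, err := readFrame(&stream)
		if err != nil {
			return nil, err
		}
		out = append(out, string(payload))
	}
	return out, nil
}

func main() {
	got, err := roundTrip([]string{"hello", "world"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%q\n", got) // prints ["hello" "world"]
}
```

Both messages sit back to back in the same stream, yet each comes out whole because the reader never asks for more bytes than the header announced.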

Creating a SendFile and ReceiveFile Function

Let’s implement a file transfer in Go with message boundaries:

Sender:

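// Needs: encoding/binary, errors, io, math, net, os
func sendFile(conn net.Conn, filePath string) error {
  file, err := os.Open(filePath)
  if err != nil {
    return err
  }
  defer file.Close()

  fileInfo, err := file.Stat()
  if err != nil {
    return err
  }

  // The header is a uint32, so sizes above 4 GiB - 1 would be truncated.
  if fileInfo.Size() > math.MaxUint32 {
    return errors.New("file too large for a 4-byte length prefix")
  }

  // Send the size first and stop if the header cannot be written.
  fileSize := uint32(fileInfo.Size())
  if err := binary.Write(conn, binary.BigEndian, fileSize); err != nil {
    return err
  }
  _, err = io.Copy(conn, file)
  return err
}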

Receiver:

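func receiveFile(conn net.Conn, destPath string) error {
  // Read the 4-byte size header; without it the payload length is unknown.
  var fileSize uint32
  if err := binary.Read(conn, binary.BigEndian, &fileSize); err != nil {
    return err
  }

  file, err := os.Create(destPath)
  if err != nil {
    return err
  }
  defer file.Close()

  // Stream exactly fileSize bytes to disk instead of buffering the whole
  // file in memory. CopyN returns an error if the connection ends early.
  _, err = io.CopyN(file, conn, int64(fileSize))
  return err
}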

Explaining `conn.Write` and `binary.Write`

There’s often confusion about why we use both conn.Write and binary.Write. Here’s a quick rundown:

conn.Write sends raw data as bytes.
binary.Write encodes fixed-size values, such as numbers, into bytes in the byte order you specify and then writes those bytes.

For example:

binary.Write(conn, binary.BigEndian, uint32(100))

This is equivalent to:

conn.Write([]byte{0x00, 0x00, 0x00, 0x64})
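You can check this yourself by encoding into a bytes.Buffer instead of a connection. The sketch below compares binary.Write with binary.BigEndian.PutUint32, which fills a byte slice directly; the two function names are hypothetical helpers for the comparison.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// encodeWithBinaryWrite encodes v as 4 big-endian bytes using binary.Write.
func encodeWithBinaryWrite(v uint32) ([]byte, error) {
	var buf bytes.Buffer
	if err := binary.Write(&buf, binary.BigEndian, v); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// encodeManually produces the same 4 bytes by filling a slice directly.
func encodeManually(v uint32) []byte {
	out := make([]byte, 4)
	binary.BigEndian.PutUint32(out, v)
	return out
}

func main() {
	a, err := encodeWithBinaryWrite(100)
	if err != nil {
		fmt.Println("encode error:", err)
		return
	}
	b := encodeManually(100)
	fmt.Printf("% x\n", a) // prints 00 00 00 64
	fmt.Printf("% x\n", b) // prints 00 00 00 64
}
```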

Importance of Endianness

When you send numbers as bytes, the order matters. For the value 100 as a uint32:

Big Endian: [00][00][00][64] (standard in networking)
Little Endian: [64][00][00][00]

Ensuring both sender and receiver agree on this order is crucial for correct data interpretation.
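Here is a small sketch of what a mismatch looks like in practice. The sender encodes 100 in big-endian order, and the hypothetical helper decodeBothWays shows what a receiver gets with each byte order.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// decodeBothWays encodes v as big-endian, then decodes the same 4 bytes
// once with the matching byte order and once with the wrong one.
func decodeBothWays(v uint32) (matching, mismatched uint32) {
	var wire [4]byte
	binary.BigEndian.PutUint32(wire[:], v)
	return binary.BigEndian.Uint32(wire[:]), binary.LittleEndian.Uint32(wire[:])
}

func main() {
	right, wrong := decodeBothWays(100)
	fmt.Println(right, wrong) // prints 100 1677721600
}
```

If that number were a length prefix, the receiver would try to read more than 1.6 billion bytes for a 100-byte message.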

Beyond Boundaries: Understanding Data Types

Knowing how many bytes to read is one thing, but understanding what those bytes mean is another. For instance, a byte sequence like [00 00 00 64] could be a number, a string, or part of a file.

When Structure Is Essential

If your system only handles one type of data, like file transfers, boundaries are enough. But if it handles various data types (file chunks, messages, commands), you need a different approach:

[type][length][payload]

This helps the receiver understand what the data is and how to handle it.
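As a rough sketch, a typed frame could use one byte for the type and four bytes for the length. The type values, field widths, and function names below are all assumptions made for illustration, and a real receiver would also cap the length before allocating.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

// Message type tags. The specific values are invented for this sketch.
const (
	typeText    byte = 1
	typeCommand byte = 2
)

// writeTyped sends [type (1 byte)][length (4 bytes, big-endian)][payload].
func writeTyped(w io.Writer, msgType byte, payload []byte) error {
	header := make([]byte, 5)
	header[0] = msgType
	binary.BigEndian.PutUint32(header[1:], uint32(len(payload)))
	if _, err := w.Write(header); err != nil {
		return err
	}
	_, err := w.Write(payload)
	return err
}

// readTyped reads one frame and returns its type tag and payload.
func readTyped(r io.Reader) (byte, []byte, error) {
	header := make([]byte, 5)
	if _, err := io.ReadFull(r, header); err != nil {
		return 0, nil, err
	}
	payload := make([]byte, binary.BigEndian.Uint32(header[1:]))
	if _, err := io.ReadFull(r, payload); err != nil {
		return 0, nil, err
	}
	return header[0], payload, nil
}

// describe round-trips one message through an in-memory buffer and shows
// how a receiver might dispatch on the type tag.
func describe(msgType byte, payload string) (string, error) {
	var stream bytes.Buffer
	if err := writeTyped(&stream, msgType, []byte(payload)); err != nil {
		return "", err
	}
	t, p, err := readTyped(&stream)
	if err != nil {
		return "", err
	}
	switch t {
	case typeText:
		return "text: " + string(p), nil
	case typeCommand:
		return "command: " + string(p), nil
	default:
		return "", fmt.Errorf("unknown message type %d", t)
	}
}

func main() {
	for _, t := range []byte{typeText, typeCommand, 9} {
		out, err := describe(t, "ping")
		fmt.Println(out, err)
	}
}
```

The same payload bytes are handled differently depending on the tag, and an unrecognized tag is reported as an error instead of being guessed at.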

Managing Data: File Transfer vs. Streaming

An important aspect is knowing whether data exchanges have a defined end:

File Transfer:

[length][file]

Has a clear start and finish.

Streaming:

[chunk][chunk][chunk][...]

Can go on indefinitely, ending only when the connection is closed or a special signal indicates the end.
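For the streaming case, a receiver loop has to tell a clean close apart from a stream that was cut off mid-chunk. The sketch below assumes each chunk is length-prefixed in the same way as the earlier examples; readStream, frame, and countChunks are hypothetical names, and a bytes.Reader stands in for the connection.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

// readStream reads [length][payload] chunks from r until the stream ends.
// io.EOF before a header means the peer closed cleanly between chunks;
// an end of stream anywhere else means a chunk was cut short.
func readStream(r io.Reader, handle func(chunk []byte)) error {
	var header [4]byte
	for {
		if _, err := io.ReadFull(r, header[:]); err != nil {
			if err == io.EOF {
				return nil // clean end of stream
			}
			return err // includes io.ErrUnexpectedEOF for a partial header
		}
		chunk := make([]byte, binary.BigEndian.Uint32(header[:]))
		if _, err := io.ReadFull(r, chunk); err != nil {
			if err == io.EOF {
				return io.ErrUnexpectedEOF // header arrived, payload did not
			}
			return err
		}
		handle(chunk)
	}
}

// frame returns payload prefixed with its 4-byte big-endian length.
func frame(payload string) []byte {
	out := make([]byte, 4, 4+len(payload))
	binary.BigEndian.PutUint32(out, uint32(len(payload)))
	return append(out, payload...)
}

// countChunks runs readStream over wire and reports how many chunks arrived.
func countChunks(wire []byte) (int, error) {
	count := 0
	err := readStream(bytes.NewReader(wire), func([]byte) { count++ })
	return count, err
}

func main() {
	wire := append(frame("a"), frame("bc")...)

	n, err := countChunks(wire)
	fmt.Println(n, err) // prints 2 <nil>

	n, err = countChunks(wire[:len(wire)-1]) // last chunk cut short
	fmt.Println(n, err)                      // prints 1 unexpected EOF
}
```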

Practical File Transfers with Chunks

Instead of

[file size][huge file]

more practical systems use:

[chunk][chunk][chunk]

Because each chunk is framed on its own, the receiver can report progress as chunks arrive, and a transfer can resume from the last complete chunk instead of starting over.
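A chunked transfer with progress reporting might look like the sketch below. Several details are assumptions for the sake of the example: every chunk carries its own 4-byte length, a zero-length chunk marks the end of the transfer, and the names sendChunked, receiveChunked, and transfer are invented. Resuming is left out to keep it short.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
	"io"
	"strings"
)

// sendChunked reads src in pieces of at most chunkSize bytes and writes each
// piece as [length][payload]. A zero-length chunk marks the end (an assumed
// convention for this sketch). progress receives the running byte count.
func sendChunked(w io.Writer, src io.Reader, chunkSize int, progress func(sent int64)) error {
	if chunkSize <= 0 {
		return errors.New("chunkSize must be positive")
	}
	buf := make([]byte, chunkSize)
	var header [4]byte
	var sent int64
	for {
		n, err := io.ReadFull(src, buf)
		if n > 0 {
			binary.BigEndian.PutUint32(header[:], uint32(n))
			if _, werr := w.Write(header[:]); werr != nil {
				return werr
			}
			if _, werr := w.Write(buf[:n]); werr != nil {
				return werr
			}
			sent += int64(n)
			progress(sent)
		}
		if err == io.EOF || err == io.ErrUnexpectedEOF {
			break // source exhausted
		}
		if err != nil {
			return err
		}
	}
	binary.BigEndian.PutUint32(header[:], 0)
	_, err := w.Write(header[:])
	return err
}

// receiveChunked copies chunk payloads to dst until the zero-length marker
// and returns how many data chunks it received.
func receiveChunked(r io.Reader, dst io.Writer) (int, error) {
	var header [4]byte
	chunks := 0
	for {
		if _, err := io.ReadFull(r, header[:]); err != nil {
			return chunks, err
		}
		n := binary.BigEndian.Uint32(header[:])
		if n == 0 {
			return chunks, nil
		}
		if _, err := io.CopyN(dst, r, int64(n)); err != nil {
			return chunks, err
		}
		chunks++
	}
}

// transfer pushes data through an in-memory buffer in chunkSize pieces.
func transfer(data string, chunkSize int) (string, int, error) {
	var wire, received bytes.Buffer
	report := func(sent int64) { fmt.Printf("sent %d of %d bytes\n", sent, len(data)) }
	if err := sendChunked(&wire, strings.NewReader(data), chunkSize, report); err != nil {
		return "", 0, err
	}
	chunks, err := receiveChunked(&wire, &received)
	return received.String(), chunks, err
}

func main() {
	got, chunks, err := transfer("hello world", 4)
	fmt.Println(got, chunks, err)
}
```

The sender never needs to know the total size up front, which also makes this layout usable for data that is generated on the fly.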

Conclusion: Building a Solid Understanding

Here’s the key takeaway:

TCP → Gives you bytes  
Length → Tells you how many bytes to read  
Structure → Explains what those bytes are

Keep these guidelines in mind:

Never assume Read corresponds to a complete message.
Buffer size is not a boundary marker.
Define clear message boundaries for complex data.
Use length prefixes for simplicity.
Incorporate type definitions only when necessary.

By focusing on these three elements (bytes, boundaries, and meaning), you can build more reliable networking systems. Keep this guide handy as you navigate TCP’s complexities.

