Overview

A string is a read-only slice of arbitrary bytes. It is not required to hold Unicode text, UTF-8 text, or any other predefined format.

A string literal without byte-level escapes always holds valid UTF-8, because Go source code is itself UTF-8.
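The difference between a plain literal and one with byte-level escapes can be checked with a small sketch. The helper name describe is made up for this illustration; utf8.ValidString comes from the standard library.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// describe (hypothetical helper) reports the byte length of s and
// whether its bytes form valid UTF-8.
func describe(s string) (int, bool) {
	return len(s), utf8.ValidString(s)
}

func main() {
	plain := "héllo"  // no byte-level escapes: valid UTF-8, é takes 2 bytes
	raw := "\xff\xfe" // byte-level escapes: arbitrary bytes, not valid UTF-8

	fmt.Println(describe(plain)) // 6 true
	fmt.Println(describe(raw))   // 2 false
}
```

Note that len counts bytes, not characters, which is why the five-character literal reports a length of 6.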

Representation

In the Go runtime, a string is represented by the stringStruct struct:

type stringStruct struct {
	str unsafe.Pointer
	len int
}
  • str is a pointer to an immutable backing array of bytes
  • len is the length of the string in bytes (the backing array may be larger when it is shared with other strings)

For example:

[Figure: string representation]

Strings are immutable, so there is no need for a capacity field (a string cannot grow in place).
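As a rough check of the two-word layout described above, the size of a string value can be compared with the size of a machine word. The helper headerWords is a name invented for this sketch.

```go
package main

import (
	"fmt"
	"unsafe"
)

// headerWords (hypothetical helper) returns the size of a string value
// measured in machine words: one word for the pointer, one for the length.
func headerWords() uintptr {
	var s string
	return unsafe.Sizeof(s) / unsafe.Sizeof(uintptr(0))
}

func main() {
	s := "hello, world"
	fmt.Println(headerWords()) // 2
	fmt.Println(len(s))        // 12: the len field, read in constant time
}
```

A []byte value, by contrast, is three words because it also carries a capacity.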

It is safe for multiple strings to share the same storage, so slicing s yields a new two-word structure, with a potentially different pointer and length, that still refers to the same byte sequence. This means slicing requires no allocation or copying, making string slices as efficient as passing around explicit indexes.

When a string is assigned to another string, the two-word value is copied, resulting in two different string values that share the same backing array. The cost of copying a string is the same regardless of its size: a two-word copy.

Casting and Memory Allocation

Because the bytes of a string are immutable while those of a []byte are not, converting a []byte to a string, or a string to a []byte, generally results in a copy.
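A minimal sketch of why the copy matters: after the conversion, changes to the slice must not be visible through the string. The function name convertThenMutate is made up for the example.

```go
package main

import "fmt"

// convertThenMutate (hypothetical helper) converts b to a string and then
// overwrites the first byte of b. The returned string is unaffected
// because the conversion copied the bytes.
func convertThenMutate(b []byte) string {
	s := string(b)
	b[0] = 'X'
	return s
}

func main() {
	b := []byte("hello")
	s := convertThenMutate(b)
	fmt.Println(s, string(b)) // hello Xello
}
```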

However, there are some optimizations that the compiler makes to avoid copies:

  1. For a map m of type map[string]T and a []byte b, the lookup m[string(b)] doesn’t allocate
  2. Converting a string to a []byte in order to range over its bytes doesn’t allocate
  3. Converting a []byte to a string for comparison purposes doesn’t allocate
  4. A conversion from []byte to string that is used in a string concatenation needs no separate copy, provided at least one of the concatenated string values is a non-blank string constant
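One way to observe some of these optimizations is testing.AllocsPerRun, which works outside of test files too. The sketch below assumes the standard gc compiler; the results are implementation details and may differ between versions. The function measure and the sink variables are names invented for the example, and the key is deliberately longer than 32 bytes so the plain conversion cannot use a small stack buffer.

```go
package main

import (
	"fmt"
	"testing"
)

// Package-level sinks keep the compiler from discarding the results.
var (
	sinkString string
	sinkInt    int
	sinkBool   bool
)

// measure (hypothetical helper) returns the average allocations per call
// for a plain conversion whose result escapes, a map lookup keyed by
// string(b), and a comparison against string(b).
func measure() (plain, lookup, compare float64) {
	b := []byte("a key that is longer than thirty-two bytes in total")
	m := map[string]int{string(b): 1}

	plain = testing.AllocsPerRun(100, func() { sinkString = string(b) })
	lookup = testing.AllocsPerRun(100, func() { sinkInt = m[string(b)] })
	compare = testing.AllocsPerRun(100, func() { sinkBool = string(b) == "some other key" })
	return plain, lookup, compare
}

func main() {
	plain, lookup, compare := measure()
	fmt.Println("plain conversion:", plain) // at least 1 with gc
	fmt.Println("map lookup:      ", lookup)
	fmt.Println("comparison:      ", compare)
}
```

With the gc compiler, the lookup and the comparison are expected to report zero allocations, while the plain conversion reports one.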

Substrings and Memory Leaks

The Go specification doesn’t say whether the result of a substring operation shares data with the original string. The standard Go compiler, however, lets them share the same backing array.

This can cause memory leaks: a small substring keeps the entire backing array of the original string reachable, so the garbage collector cannot free it even when the original string is no longer used. The solution is to make a copy of the substring:

string([]byte(s[:ind]))
// or
strings.Clone(s[:ind])

References