Overview
A string is a read-only slice of arbitrary bytes. It is not required to hold Unicode, UTF-8 text, or any other predefined format
A string literal, absent byte-level escapes, always holds valid UTF-8 sequences
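The difference between the two statements above can be checked with the standard `unicode/utf8` package. This is a minimal sketch; the helper name `describe` and the sample values are made up for illustration.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// describe is a hypothetical helper: it reports the byte length of s and
// whether those bytes form valid UTF-8.
func describe(s string) (n int, valid bool) {
	return len(s), utf8.ValidString(s)
}

func main() {
	plain := "héllo"         // literal without byte-level escapes
	raw := "\xbd\xb2\x3d\xbc" // literal built from byte-level escapes

	fmt.Println(describe(plain)) // 6 true  (é takes two bytes)
	fmt.Println(describe(raw))   // 4 false (0xbd cannot start a UTF-8 sequence)
}
```

Both values are ordinary strings; only the second one holds bytes that are not UTF-8 text.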
Representation
A string is represented by a stringStruct struct:
- str is a pointer to an immutable byte sequence
- len is the total number of bytes in this sequence
Strings are immutable, so there is no need for a capacity field (you can’t grow them)
It is safe for multiple strings to share the same storage, so slicing a string s results in a new 2-word structure with a potentially different pointer and length that still refers to the same byte sequence
This means that slicing can be done without allocation or copying, making string slices as efficient as passing around explicit indexes
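A small sketch of the sharing described above: it compares data pointers to see whether a slice lies inside the original string's bytes. It assumes Go 1.20 or later for `unsafe.StringData`; `sharesStorage` is a hypothetical helper, and the result reflects the standard compiler's behavior rather than a language guarantee.

```go
package main

import (
	"fmt"
	"unsafe"
)

// sharesStorage reports whether the bytes of sub lie inside the byte
// sequence of s. Empty strings are treated as not sharing anything.
func sharesStorage(s, sub string) bool {
	if len(s) == 0 || len(sub) == 0 {
		return false
	}
	start := uintptr(unsafe.Pointer(unsafe.StringData(s)))
	p := uintptr(unsafe.Pointer(unsafe.StringData(sub)))
	return p >= start && p+uintptr(len(sub)) <= start+uintptr(len(s))
}

func main() {
	s := "hello, world"
	hello := s[:5] // same pointer as s, shorter length
	world := s[7:] // pointer advanced by 7 bytes

	// Both are expected to print true with the standard compiler.
	fmt.Println(sharesStorage(s, hello))
	fmt.Println(sharesStorage(s, world))
}
```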
Casting and Memory Allocation
Because the bytes of a string are immutable while a []byte can be modified, converting a []byte to a string and vice versa results in a copy
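The copy is observable: after the conversion, writing to the slice does not affect the string. A minimal sketch, with `convertThenMutate` as a made-up helper name:

```go
package main

import "fmt"

// convertThenMutate converts b to a string and then overwrites the first
// byte of b. The returned string keeps the original contents because the
// conversion copied the bytes.
func convertThenMutate(b []byte) string {
	s := string(b)
	if len(b) > 0 {
		b[0] = 'X'
	}
	return s
}

func main() {
	b := []byte("hello")
	s := convertThenMutate(b)
	fmt.Println(s, string(b)) // hello Xello
}
```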
However, there are some optimizations that the compiler makes to avoid copies:
- For a map m of type map[string]T and a []byte b, the lookup m[string(b)] doesn’t allocate
- No allocation when converting a string into a []byte for ranging over the bytes
- No allocation when converting a []byte into a string for comparison purposes
- No allocation for a conversion from []byte to string that is used in a string concatenation where at least one of the concatenated string values is a non-blank string constant
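The first three cases can be measured with `testing.AllocsPerRun`, which reports the average number of heap allocations per call. This is a rough sketch, not a benchmark: the function names and the sample key are made up, and the zero results are what the standard compiler is expected to produce, not something the language specification promises. The concatenation case is left out because the concatenation itself still allocates its result.

```go
package main

import (
	"fmt"
	"testing"
)

// Package-level sinks keep the compiler from discarding the work.
var (
	sinkInt int
	sinkStr string
)

// key is deliberately long, so a small temporary buffer could not mask a copy.
var key = []byte("a-key-that-is-comfortably-longer-than-thirty-two-bytes")

// lookupAllocs measures a map lookup keyed by string(key).
func lookupAllocs() float64 {
	m := map[string]int{string(key): 1}
	return testing.AllocsPerRun(100, func() { sinkInt += m[string(key)] })
}

// rangeAllocs measures ranging over []byte(s).
func rangeAllocs() float64 {
	s := string(key)
	return testing.AllocsPerRun(100, func() {
		for _, c := range []byte(s) {
			sinkInt += int(c)
		}
	})
}

// compareAllocs measures comparing two converted byte slices.
func compareAllocs() float64 {
	other := append([]byte(nil), key...)
	return testing.AllocsPerRun(100, func() {
		if string(key) == string(other) {
			sinkInt++
		}
	})
}

// storeAllocs is the contrast case: the converted string is kept,
// so the bytes have to be copied.
func storeAllocs() float64 {
	return testing.AllocsPerRun(100, func() { sinkStr = string(key) })
}

func main() {
	fmt.Println("map lookup:", lookupAllocs())  // expected 0 with the standard compiler
	fmt.Println("range:     ", rangeAllocs())   // expected 0 with the standard compiler
	fmt.Println("compare:   ", compareAllocs()) // expected 0 with the standard compiler
	fmt.Println("store:     ", storeAllocs())   // expected at least 1
}
```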
Substrings and Memory Leaks
When performing a substring operation, the Go specification doesn’t specify whether the resulting string and the one involved in the substring operation should share the same data. However, the standard Go compiler does let them share the same backing array
Thus, there can be memory leaks: a small substring is backed by the same byte array as the original string, so keeping the substring alive keeps the whole array alive. The solution is to make a copy of the substring.
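One way to make such a copy is `strings.Clone`, assuming Go 1.18 or later. The sketch below is illustrative: `prefix` is a hypothetical helper, and it cuts at a byte offset, so it may split a multi-byte character.

```go
package main

import (
	"fmt"
	"strings"
)

// prefix returns the first n bytes of s as an independent copy, so the
// result does not keep the backing array of s alive.
func prefix(s string, n int) string {
	if n < 0 {
		n = 0
	}
	if n > len(s) {
		n = len(s)
	}
	return strings.Clone(s[:n])
}

func main() {
	big := strings.Repeat("x", 1<<20) // about 1 MiB of data
	small := prefix(big, 8)           // holds only its own 8 bytes
	fmt.Println(len(small), small)    // 8 xxxxxxxx
}
```

Once `big` is no longer referenced, its storage can be collected even while `small` is still in use.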