Recently a YouTube video, “So you think you know Go”, was doing the rounds with some Go gotchas. It’s an excellent little trip through some of the traps in Go programming that developers may not be aware of.
The gotchas are all explained in the spec, but that doesn’t seem to stop people running into them.
This post focuses on the sub slice section of that video.
Slices are an interesting feature of Go. Technically, they are structs that contain, amongst other things, a pointer to a backing array. The slice struct is quite simple, and we will use the SliceHeader type from the reflect package to see what’s happening in that struct.
In my opinion the easiest way to understand slices in Go is to think of them as views on a backing array. They do not hold any data themselves; they merely describe which part of a backing array to use.
Because your application deals with the slice struct rather than the backing array, the backing array can change without you doing any bookkeeping. That is, if append finds that the backing array isn’t big enough for the elements you want to add, Go creates a new, larger array, copies the data from the old array to the new one, and returns a slice that points at it. As long as you use the slice that append returns, your application never needs to be any the wiser.
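To make that reallocation visible, here is a minimal sketch. The helper grow is hypothetical (it is not part of any library); it appends one value and compares the address of the first element before and after to tell whether the data moved.

```go
package main

import "fmt"

// grow is a hypothetical helper: it appends v to s and reports whether
// append had to move the data to a new backing array.
func grow(s []int, v int) (out []int, moved bool) {
	out = append(s, v)
	// If the address of the first element changed, the data lives in a new array.
	moved = len(s) > 0 && &out[0] != &s[0]
	return out, moved
}

func main() {
	s := make([]int, 3, 4) // len 3, cap 4: room for exactly one more element

	s, moved := grow(s, 1)
	fmt.Println(len(s), cap(s), moved) // 4 4 false: fits in the existing array

	s, moved = grow(s, 2)
	fmt.Println(len(s), moved) // 5 true: capacity exceeded, data was copied
}
```

The new capacity after the second append is chosen by the runtime, so the sketch deliberately doesn’t print it.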
Hopefully that preamble makes the next section a little easier to understand, but may make the gotchas even more painful :)
Subslices
What is a sub slice? A sub slice is a slice of another slice. Here’s an example:
a := []int{1,2,3}
b := a[:2]
Given the understanding of slices explained above, it’s reasonable to expect that a and b are different views on the same backing array, and you’d be right. This can be proved by looking at the slice header.
playground example
package main

import (
	"fmt"
	"reflect"
	"unsafe"
)

func main() {
	a := []int{1, 2, 3}
	b := a[:2]

	// View each slice through its header: data pointer, length, and capacity.
	sha := (*reflect.SliceHeader)(unsafe.Pointer(&a))
	shb := (*reflect.SliceHeader)(unsafe.Pointer(&b))

	fmt.Printf("%#v\n", sha)
	fmt.Printf("%#v\n", shb)
}
// Output
&reflect.SliceHeader{Data:0xc000014020, Len:3, Cap:3}
&reflect.SliceHeader{Data:0xc000014020, Len:2, Cap:3}
Note: the Data address will depend on your machine; the point is that the value is the same for both slices.
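reflect.SliceHeader has been marked as deprecated since Go 1.20, so here is a sketch of the same check using unsafe.SliceData, assuming Go 1.20 or later. The helper name sameBacking is made up for this example.

```go
package main

import (
	"fmt"
	"unsafe"
)

// sameBacking is a hypothetical helper that reports whether two int slices
// start at the same address in memory.
func sameBacking(x, y []int) bool {
	return unsafe.SliceData(x) == unsafe.SliceData(y)
}

func main() {
	a := []int{1, 2, 3}
	b := a[:2]

	fmt.Println(sameBacking(a, b))     // true: both views start at a[0]
	fmt.Println(len(b), cap(b))        // 2 3
	fmt.Println(sameBacking(a, a[1:])) // false: this view starts at a[1]
}
```

Note that the last case still shares the backing array with a; it simply starts one element later, so comparing start addresses alone doesn’t detect every kind of overlap.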
Gotchas
With that all out of the way it’s time to look at the gotchas presented in the video.
A working example of the gotchas can be found on this playground link.
The slice and sub slice that demonstrate the issues are:
beatles := []string{"John", "Paul", "George", "Ringo"}
guitarists := beatles[:3]
If the guitarists are upper cased, is that going to affect the beatles?
// Upper case the guitarists
for i, g := range guitarists {
	guitarists[i] = strings.ToUpper(g)
}
// Note the effect on guitarists
fmt.Println("Guitarists", guitarists)
// Note the effect on beatles
fmt.Println("Beatles", beatles)
// Output
Guitarists [JOHN PAUL GEORGE]
Beatles [JOHN PAUL GEORGE Ringo]
That’s correct, the beatles slice sees the changes made to the guitarists slice. This is because they are both looking at (and modifying) the same backing array.
Now, append a name to the guitarists slice:
// Append one name to guitarists
guitarists = append(guitarists, "Fred")
for i, g := range guitarists {
	guitarists[i] = strings.ToUpper(g)
}
// Note the effect on guitarists
fmt.Println("Guitarists", guitarists)
// Note the effect on beatles
fmt.Println("Beatles", beatles)
// Output
Guitarists [JOHN PAUL GEORGE FRED]
Beatles [JOHN PAUL GEORGE FRED]
Surprised? append saw that the guitarists slice had a length of three and a capacity of four, so it wrote the fourth name into the existing backing array. That overwrote the fourth name in the backing array, which the beatles slice also shares.
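Whether append overwrites shared data comes down to the spare capacity of the sub slice. The sketch below checks that up front; fitsInPlace is a hypothetical helper written for this post, not something from the standard library.

```go
package main

import "fmt"

// fitsInPlace is a hypothetical helper: it reports whether appending n
// elements to s fits inside the capacity s already has, in which case
// append writes into the shared backing array instead of allocating.
func fitsInPlace(s []string, n int) bool {
	return len(s)+n <= cap(s)
}

func main() {
	beatles := []string{"John", "Paul", "George", "Ringo"}
	guitarists := beatles[:3]

	fmt.Println(len(guitarists), cap(guitarists)) // 3 4
	fmt.Println(fitsInPlace(guitarists, 1))       // true: "Ringo" would be overwritten
	fmt.Println(fitsInPlace(guitarists, 2))       // false: append would have to allocate
}
```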
Reset the slices, then append two names to the list:
// Reset beatles and guitarists
beatles = []string{"John", "Paul", "George", "Ringo"}
guitarists = beatles[:3]
// Append *two* names to guitarists
guitarists = append(guitarists, "Fred", "Wilma")
for i, g := range guitarists {
	guitarists[i] = strings.ToUpper(g)
}
fmt.Println("Beatles", beatles)
fmt.Println("Guitarists", guitarists)
// Output
Beatles [John Paul George Ringo]
Guitarists [JOHN PAUL GEORGE FRED WILMA]
The beatles slice hasn’t changed at all, but the guitarists slice has. append saw that the backing array for guitarists didn’t have enough capacity to hold a fifth string, so it created a new backing array, copied the existing data over, and guitarists was assigned a slice struct that is a view on that new backing array. The upper casing happened after the append, so it only touched the new array. There was no need to do any of that for the beatles slice because the original backing array still provides everything that it requires. The two slices now point to two different backing arrays.
The lesson that should be learnt from this is: don’t modify sub slices, because you can’t easily tell whether the change will show up in the original slice.
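If you do have to append to a sub slice, one option the language offers is the full slice expression a[low:high:max], which caps the capacity of the result. The sketch below is only an illustration of that idea; upperWithExtra is a hypothetical helper and is not taken from the video.

```go
package main

import (
	"fmt"
	"strings"
)

// upperWithExtra is a hypothetical helper. It clamps the capacity of names
// to its length, so the append below has to allocate a new backing array
// and the caller's data is left untouched.
func upperWithExtra(names []string, extra string) []string {
	out := append(names[:len(names):len(names)], extra)
	for i, n := range out {
		out[i] = strings.ToUpper(n)
	}
	return out
}

func main() {
	beatles := []string{"John", "Paul", "George", "Ringo"}
	guitarists := upperWithExtra(beatles[:3], "Fred")

	fmt.Println("Guitarists", guitarists) // Guitarists [JOHN PAUL GEORGE FRED]
	fmt.Println("Beatles", beatles)       // Beatles [John Paul George Ringo]
}
```

This only protects writes made after the append. Anything written through the sub slice before it grows would still land in the shared backing array.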
Copy and sub slices
Copying a sub slice using the copy function means that, from the moment the copy is made, it’s a view on a different backing array to the original. Modifications to the copy will not be visible through the first slice.
Be aware, though, that there are some gotchas lurking in this approach (Playground link).
Create guitarists with []string{}:
beatles := []string{"John", "Paul", "George", "Ringo"}
guitarists := []string{}
copy(guitarists, beatles[:3])
fmt.Println(guitarists)
// Output
[]
Because the guitarists slice was created with a len of 0, copy will copy zero items across. copy never grows the destination; it copies the smaller of len(dst) and len(src) elements.
Create guitarists with make, providing a len of 2:
guitarists := make([]string, 2)
copy(guitarists, beatles[:3])
fmt.Println(guitarists)
// Output
[John Paul]
Same as before: copy sees how much room there is to copy into, and only copies that amount.
Create guitarists with make, providing a len of 5:
guitarists := make([]string, 5)
copy(guitarists, beatles[:3])
fmt.Println(guitarists)
// Output
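[John Paul George  ]
More than ample room for the data to be copied across. The two unused elements keep their zero value, the empty string, which is why the output ends with extra spaces.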
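copy also returns the number of elements it copied, which makes the three cases above easy to compare side by side. A small sketch, with copyInto as a hypothetical helper written for this post:

```go
package main

import "fmt"

// copyInto is a hypothetical helper: it allocates a destination of length
// dstLen, copies src into it, and returns the destination together with
// the number of elements copy reported.
func copyInto(dstLen int, src []string) ([]string, int) {
	dst := make([]string, dstLen)
	n := copy(dst, src)
	return dst, n
}

func main() {
	beatles := []string{"John", "Paul", "George", "Ringo"}

	for _, dstLen := range []int{0, 2, 5} {
		dst, n := copyInto(dstLen, beatles[:3])
		fmt.Printf("len(dst)=%d copied=%d dst=%q\n", dstLen, n, dst)
	}
	// Output:
	// len(dst)=0 copied=0 dst=[]
	// len(dst)=2 copied=2 dst=["John" "Paul"]
	// len(dst)=5 copied=3 dst=["John" "Paul" "George" "" ""]
}
```

Printing with %q shows the empty strings that a plain Println hides.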
Therefore best practice for copy is to give the destination a len equal to the number of source elements that you want copied. If you want all of it then:
guitarists = make([]string, len(beatles))
copy(guitarists, beatles)
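Putting the make and the copy together, here is a sketch of a small helper that sizes the destination from the source. cloneStrings is a hypothetical name used for illustration only.

```go
package main

import (
	"fmt"
	"strings"
)

// cloneStrings is a hypothetical helper that returns a copy of src backed
// by its own array, sized so that copy transfers every element.
func cloneStrings(src []string) []string {
	dst := make([]string, len(src))
	copy(dst, src)
	return dst
}

func main() {
	beatles := []string{"John", "Paul", "George", "Ringo"}
	guitarists := cloneStrings(beatles[:3])

	// Appending and upper casing now only touch the copy.
	guitarists = append(guitarists, "Fred")
	for i, g := range guitarists {
		guitarists[i] = strings.ToUpper(g)
	}

	fmt.Println("Guitarists", guitarists) // Guitarists [JOHN PAUL GEORGE FRED]
	fmt.Println("Beatles", beatles)       // Beatles [John Paul George Ringo]
}
```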
Pointers
Finally, if you must have a second variable through which every modification, including an append, is seen by both, then you just need a pointer to the slice (note all of the dereferencing).
b := &[]int{3,4,5}
c := b
*c = append(*c, 99)
fmt.Println(*b)
fmt.Println(*c)
// Output
[3 4 5 99]
[3 4 5 99]
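The same idea applies when passing a slice to a function. The sketch below contrasts the two approaches; addByValue and addByPointer are hypothetical names used only for this illustration.

```go
package main

import "fmt"

// addByValue appends to its own copy of the slice struct and returns the
// length it sees; the caller's slice struct is not updated.
func addByValue(s []int, v int) int {
	s = append(s, v)
	return len(s)
}

// addByPointer appends through a pointer, so the caller's slice struct
// is updated with the new length (and new backing array, if any).
func addByPointer(s *[]int, v int) {
	*s = append(*s, v)
}

func main() {
	b := []int{3, 4, 5}

	fmt.Println(addByValue(b, 99), len(b)) // 4 3: the caller still sees three items

	addByPointer(&b, 99)
	fmt.Println(b) // [3 4 5 99]
}
```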
Hopefully this post has both opened some eyes and provided guidance on how to deal with sub slices and why the best practice exists.