
Go Sub slices

Recently a YouTube video, “So you think you know Go”, was doing the rounds with some Go gotchas. It’s an excellent little trip through some of the traps in Go that developers may or may not be aware of.

The gotchas are all explained in the spec, but that doesn’t seem to stop people running into them.

This post focuses on the sub slice section of that video.

Slices are an interesting feature of Go. Technically, a slice is a small struct that contains, amongst other things, a pointer to a backing array. The struct is quite simple, and we will use the SliceHeader type from the reflect package to see what’s happening inside it.

In my opinion the easiest way to understand slices in Go is to think of them as views on a backing array. They do not hold any data at all, they merely show you which parts of a backing array to use as a slice.
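To make the “view” idea concrete, here is a minimal sketch. The window helper and the numbers are made up for illustration; the point is only that two slices over one array see each other’s writes.

```go
package main

import "fmt"

// window is a hypothetical helper: it returns a view of elements
// [lo, hi) of s. No data is copied; the result shares s's backing array.
func window(s []int, lo, hi int) []int {
	return s[lo:hi]
}

func main() {
	backing := [5]int{10, 20, 30, 40, 50}

	all := backing[:]           // view of all five elements
	middle := window(all, 1, 4) // view of elements 1, 2 and 3

	// A write through one view is visible through the other,
	// and in the array itself.
	middle[0] = 99

	fmt.Println(backing) // [10 99 30 40 50]
	fmt.Println(all)     // [10 99 30 40 50]
	fmt.Println(middle)  // [99 30 40]
}
```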

Because your application deals with the slice struct rather than the backing array, changes can be made to the backing array without you having to do any bookkeeping. That is, if Go decides that the backing array isn’t big enough for the changes you request, Go can create a new array provisioned with more memory, copy the data from the old array to the new array, and your application never needs to be any the wiser.
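The behind-the-scenes move can be observed, roughly, by comparing the address of the first element before and after an append. This is only a sketch: appendAndReport is a hypothetical helper, and it assumes a non-empty slice.

```go
package main

import "fmt"

// appendAndReport is a hypothetical helper. It appends v to s and reports
// whether the data now lives in a different backing array, detected by
// comparing the address of the first element before and after.
// s must be non-empty.
func appendAndReport(s []int, v int) ([]int, bool) {
	first := &s[0]
	s = append(s, v)
	return s, first != &s[0]
}

func main() {
	s := make([]int, 3, 4) // len 3, cap 4: room for exactly one more element

	s, moved := appendAndReport(s, 1)
	fmt.Println(len(s), cap(s), moved) // 4 4 false

	// The backing array is now full, so the next append has to allocate.
	s, moved = appendAndReport(s, 2)
	fmt.Println(len(s), moved) // 5 true
}
```

The new capacity after the second append is deliberately not printed, since how much extra room append provisions is an implementation detail.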

Hopefully that preamble makes the next section a little easier to understand, but may make the gotchas even more painful :)

Subslices

What is a sub slice? A sub slice is a slice of another slice. Here’s an example:

a := []int{1,2,3}
b := a[:2]

Given the understanding of slices explained above, it’s reasonable to expect that a and b are different views on the same backing array, and you’d be right. This can be proved by looking at the slice header.

package main

import (
	"fmt"
	"reflect"
	"unsafe"
)

func main() {
	a := []int{1, 2, 3}
	b := a[:2]

	sha := (*reflect.SliceHeader)(unsafe.Pointer(&a))
	shb := (*reflect.SliceHeader)(unsafe.Pointer(&b))
	// Print each header on its own line
	fmt.Printf("%#v\n", sha)
	fmt.Printf("%#v\n", shb)
}

// Output
&reflect.SliceHeader{Data:0xc000014020, Len:3, Cap:3}
&reflect.SliceHeader{Data:0xc000014020, Len:2, Cap:3}

Note: the Data address will be dependent on your machine; the point is that the value is the same for both slices.
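As far as I’m aware, reflect.SliceHeader has been marked as deprecated since Go 1.20. The same evidence can be gathered without the reflect or unsafe packages by comparing the address of the first element and printing len and cap. A minimal sketch, where sameStart is a hypothetical helper that assumes non-empty slices:

```go
package main

import "fmt"

// sameStart is a hypothetical helper. It reports whether two non-empty
// int slices begin at the same element of the same backing array.
func sameStart(a, b []int) bool {
	return &a[0] == &b[0]
}

func main() {
	a := []int{1, 2, 3}
	b := a[:2]

	fmt.Println(sameStart(a, b)) // true
	fmt.Println(len(a), cap(a))  // 3 3
	fmt.Println(len(b), cap(b))  // 2 3
}
```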

Gotchas

With that all out of the way it’s time to look at the gotchas presented in the video.

A working example of the gotchas can be found on this playground link.

The slice and sub slice that demonstrate the issues are:

beatles := []string{"John", "Paul", "George", "Ringo"}
guitarists := beatles[:3]

If the guitarists are upper cased, is that going to affect the beatles?

	// Upper case the guitarists
	for i, g := range guitarists {
		guitarists[i] = strings.ToUpper(g)
	}
	// Note the effect on guitarists
	fmt.Println("Guitarists", guitarists)
	// Note the effect on beatles
	fmt.Println("Beatles", beatles)

// Output
Guitarists [JOHN PAUL GEORGE]
Beatles [JOHN PAUL GEORGE Ringo]

That’s correct, the beatles slice sees the changes made to the guitarists slice. This is because they are both looking at (and modifying) the same backing array.

Now, append a name to the guitarists slice

	// Append one name to guitarists
	guitarists = append(guitarists, "Fred")
	for i, g := range guitarists {
		guitarists[i] = strings.ToUpper(g)
	}
	// Note the effect on guitarists
	fmt.Println("Guitarists", guitarists)
	// Note the effect on beatles
	fmt.Println("Beatles", beatles)

// Output
Guitarists [JOHN PAUL GEORGE FRED]
Beatles [JOHN PAUL GEORGE FRED]

Surprised? The guitarists slice has a length of three but shares the backing array’s capacity of four, so append had room to write a fourth name in place. That fourth name overwrote the fourth element of the backing array, which the beatles slice also sees.
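Whether an append will write in place or allocate comes down to len and cap. The following sketch uses a hypothetical fits helper to make that check explicit for the guitarists slice:

```go
package main

import "fmt"

// fits is a hypothetical helper. It reports whether n more elements can be
// appended to s without append allocating a new backing array.
func fits(s []string, n int) bool {
	return len(s)+n <= cap(s)
}

func main() {
	beatles := []string{"John", "Paul", "George", "Ringo"}
	guitarists := beatles[:3]

	fmt.Println(len(guitarists), cap(guitarists)) // 3 4

	// One more name fits, so it would land on top of "Ringo".
	fmt.Println(fits(guitarists, 1)) // true

	// Two more names do not fit, so append would have to allocate.
	fmt.Println(fits(guitarists, 2)) // false
}
```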

Reset the slices, then append two names to guitarists

	// Reset beatles and guitarists
	beatles = []string{"John", "Paul", "George", "Ringo"}
	guitarists = beatles[:3]
	
	// Append *two* names to guitarists
	guitarists = append(guitarists, "Fred","Wilma")
	for i, g := range guitarists {
		guitarists[i] = strings.ToUpper(g)
	}
	fmt.Println("Beatles", beatles)
	fmt.Println("Guitarists", guitarists)

// Output
Beatles [John Paul George Ringo]
Guitarists [JOHN PAUL GEORGE FRED WILMA]

The beatles slice hasn’t changed at all, but the guitarists slice has. append saw that the backing array for guitarists didn’t have enough capacity to hold a fifth string, so it created a new backing array to hold that data, copied the existing data over, and returned a slice struct for guitarists that is a view on that new backing array. There was no need to do that for the beatles slice because the original backing array still provides everything that it requires. The two slices now point to two different backing arrays.

The lesson to learn here is: don’t modify or append to a sub slice while the original slice is still in use.
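If a sub slice really must be appended to, one option worth knowing about is the full slice expression s[low:high:max] from the Go spec, which limits the capacity of the result. The sketch below is not from the video; capped is a hypothetical helper that sets the capacity equal to the length, so the first append is forced onto a new backing array.

```go
package main

import "fmt"

// capped is a hypothetical helper. It returns the first n elements of s
// with the capacity limited to n, so that a later append cannot write
// into the rest of s's backing array.
func capped(s []string, n int) []string {
	return s[:n:n]
}

func main() {
	beatles := []string{"John", "Paul", "George", "Ringo"}
	guitarists := capped(beatles, 3)

	// cap(guitarists) is 3, so this append allocates a new backing array.
	guitarists = append(guitarists, "Fred")

	fmt.Println(beatles)    // [John Paul George Ringo]
	fmt.Println(guitarists) // [John Paul George Fred]
}
```

This only protects against append. Until an append actually happens, guitarists is still a view on the original backing array, so element writes such as guitarists[0] = "JOHN" remain visible through beatles.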

Copy and sub slices

Copying a sub slice into a new slice with the copy function means that, from the moment the copy is made, the new slice is a view on a different backing array from the original. Modifications to the copy will not be visible through the original slice.

Be aware, though, that there are some gotchas lurking in this approach.

Create guitarists with []string{}

	beatles := []string{"John", "Paul", "George", "Ringo"}
	guitarists := []string{}
	copy(guitarists, beatles[:3])
	fmt.Println(guitarists)

// Output
[]

Because the guitarists slice was created with a len and cap of 0, copy will copy zero items across.

Create guitarists with make providing a len of 2

	guitarists := make([]string, 2)
	copy(guitarists, beatles[:3])
	fmt.Println(guitarists)

// Output
[John Paul]

Same as before, copy sees how much room there is to copy into, and only copies that amount.

Create guitarists with make providing a len of 5

	guitarists := make([]string, 5)
	copy(guitarists, beatles[:3])
	fmt.Println(guitarists)

// Output
[John Paul George  ]

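There is more than enough room here, so all three names are copied. The two remaining elements keep their zero value, the empty string, which is why the output ends with extra spaces.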
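The three cases above follow one rule: copy transfers the smaller of the two lengths, and returns that count. A small sketch, with a hypothetical copied helper, that checks the return value for each destination length used above:

```go
package main

import "fmt"

// copied is a hypothetical helper. It reports how many elements copy
// transfers from a source of length srcLen into a destination of
// length dstLen.
func copied(dstLen, srcLen int) int {
	dst := make([]string, dstLen)
	src := make([]string, srcLen)
	return copy(dst, src)
}

func main() {
	fmt.Println(copied(0, 3)) // 0
	fmt.Println(copied(2, 3)) // 2
	fmt.Println(copied(5, 3)) // 3
}
```

Checking the value returned by copy is a cheap way to notice that a destination was shorter than intended.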

Best practice for copy is therefore to create the destination with a len equal to the number of elements you want to copy. If you want all of them, then:

guitarists = make([]string, len(beatles))
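Putting make and copy together, a copy of a sub slice that is safe to modify might look like the sketch below. cloneStrings is a hypothetical helper name, not something from the video.

```go
package main

import "fmt"

// cloneStrings is a hypothetical helper. It returns a copy of src that
// has its own backing array.
func cloneStrings(src []string) []string {
	dst := make([]string, len(src)) // len matches the source exactly
	copy(dst, src)
	return dst
}

func main() {
	beatles := []string{"John", "Paul", "George", "Ringo"}
	guitarists := cloneStrings(beatles[:3])

	// Neither the element write nor the append touches beatles.
	guitarists[0] = "JOHN"
	guitarists = append(guitarists, "Fred")

	fmt.Println(beatles)    // [John Paul George Ringo]
	fmt.Println(guitarists) // [JOHN Paul George Fred]
}
```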

Pointers

Finally, if you must have a second variable through which every modification, including append, is seen by both, then you just need a pointer to the slice (note all of the dereferencing).

	b := &[]int{3,4,5}
	c := b
	*c = append(*c, 99)
	fmt.Println(*b)
	fmt.Println(*c)

// Output
[3 4 5 99]
[3 4 5 99]
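For contrast, here is a sketch of what happens without the pointer, next to the pointer version. Assigning a slice copies only the slice struct, so an append through the copy updates the copy’s length and not the original’s. push is a hypothetical helper.

```go
package main

import "fmt"

// push is a hypothetical helper. It appends v through a pointer, so every
// holder of the pointer sees the updated slice struct.
func push(p *[]int, v int) {
	*p = append(*p, v)
}

func main() {
	b := []int{3, 4, 5}

	// Plain assignment copies the slice struct (pointer, len, cap).
	plain := b
	plain = append(plain, 99)
	fmt.Println(len(b), len(plain)) // 3 4

	// A pointer to the slice shares the struct itself.
	shared := &b
	push(shared, 99)
	fmt.Println(b, *shared) // [3 4 5 99] [3 4 5 99]
}
```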

Hopefully this post has opened some eyes, and provided guidance on both how to deal with sub slices and why the best practice exists.
