Go

WaitGroup usage

Why is knowing the internals of the Go runtime so important? For that matter, why is knowing Computer Science from the Silicon on up so important?

First and foremost, knowing what’s going on underneath allows us to reason about code and guard against error-prone patterns before they become an issue.

I can give a concrete example. I was chatting on IRC and a fellow developer asked about best practice for placing the WaitGroup.Add() call. Specifically, they wanted to know whether it is better to place the call before the go keyword (which launches the Goroutine), as in example 1, or inside the function the Goroutine runs, as in example 2.

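// Example 1: wg.Add(1) is called before the Goroutine is launched.
package main

import (
    "fmt"
    "sync"
)

func main() {
    var wg sync.WaitGroup
    wg.Add(1) // register one Goroutine before it starts
    go func() {
        fmt.Println("I am a Go routine")
        wg.Done() // signal that this Goroutine has finished
    }()
    wg.Wait() // block until the counter is back at zero
}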
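// Example 2: wg.Add(1) is called inside the Goroutine (racy).
package main

import (
    "fmt"
    "sync"
)

func main() {
    var wg sync.WaitGroup
    go func() {
        wg.Add(1) // may run after main has already passed wg.Wait()
        fmt.Println("I am a Go routine")
        wg.Done()
    }()
    wg.Wait() // returns immediately if the counter is still zero
}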

My initial thought was to put the wg.Add() call before the Goroutine is launched, but I had not reasoned through why this might be better or worse than the other option. On careful examination, it turns out to be imperative that the call to wg.Add() is made before the Goroutine is launched.

To understand why the wg.Add() call must be made before the Goroutine is launched, let’s walk through the code examples (note: the anonymous function is a closure, so it captures the wg variable from main rather than receiving a copy).

There are two ‘threads’ in each example: the main ‘thread’ and the Goroutine ‘thread’. When the main thread exits, the application closes, whether or not other Goroutines are still running. The point of a WaitGroup is to hold the main thread open until the Goroutines have signalled (via calls to wg.Done()) that they have finished their work.

The wg.Add() function adds its argument to a counter (so wg.Add(1) increments it by one), wg.Done() decrements that counter by one, and wg.Wait() blocks until the counter is 0. The counter must never drop below 0; if it does, the program panics.
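The counter behaviour described above can be sketched with several Goroutines at once. This is a minimal illustration, not code from the IRC discussion: the helper name runWorkers and the finished tally are assumptions introduced purely for the example.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// runWorkers is a hypothetical helper: it launches n Goroutines and
// returns how many of them had finished by the time Wait returned.
func runWorkers(n int) int {
	var wg sync.WaitGroup
	var finished int64 // updated atomically by the workers

	for i := 0; i < n; i++ {
		wg.Add(1) // counter goes up by one before each Goroutine starts
		go func() {
			defer wg.Done() // counter goes down by one when the worker returns
			atomic.AddInt64(&finished, 1)
		}()
	}

	wg.Wait() // blocks until the counter is back at zero
	return int(atomic.LoadInt64(&finished))
}

func main() {
	fmt.Println(runWorkers(5)) // prints 5: every worker is accounted for
}
```

Because each wg.Add(1) happens in the launching thread, the counter already equals the number of started workers whenever wg.Wait() is reached.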

In the first example the program counter points first to the creation of the wg variable, then to the wg.Add() call, which increments the counter, and then to the go keyword. The main thread continues to the wg.Wait() call and blocks there while the counter is greater than 0. If the Goroutine has already completed its task and called wg.Done(), the counter is back at 0 and wg.Wait() returns immediately. Either way, the main thread cannot reach the end and exit before the Goroutine has done its work.

In the second example the program counter points first to the creation of the wg variable, and then the go keyword is encountered. That hands a new Goroutine to the scheduler, but the program counter for the main thread is free to continue to the next statement (the wg.Wait() call). Only when the scheduler actually runs the new Goroutine does the anonymous function start and increment the counter for the WaitGroup.

It should be abundantly clear that there is no guarantee about the order in which those two threads of operation will run. Maybe the main thread will reach the wg.Wait() call before the counter has been incremented by the Goroutine, in which case it sees a counter of 0 and does not wait at all. Maybe it won’t.

That’s a race condition: the outcome is determined by a race between two threads of operation. That is certainly not what we want, because the program could exit before the Goroutine has done its work. That is why the wg.Add() call must occur before the Goroutine is created.
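The race can be made visible with a small experiment. The sketch below is illustrative only: the function names addInside, addBefore and missed are hypothetical, and the number of misses reported for the racy variant depends on the scheduler, so it may well be 0 on a given run.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// addInside follows example 2: Add runs inside the Goroutine, so Wait
// may observe a zero counter and return before any work has happened.
func addInside() bool {
	var wg sync.WaitGroup
	var worked int32
	go func() {
		wg.Add(1) // too late: the caller may already be past Wait
		defer wg.Done()
		atomic.StoreInt32(&worked, 1)
	}()
	wg.Wait()
	return atomic.LoadInt32(&worked) == 1
}

// addBefore follows example 1: the counter is already 1 when Wait runs.
func addBefore() bool {
	var wg sync.WaitGroup
	var worked int32
	wg.Add(1)
	go func() {
		defer wg.Done()
		atomic.StoreInt32(&worked, 1)
	}()
	wg.Wait()
	return atomic.LoadInt32(&worked) == 1
}

// missed counts how often f returned before its Goroutine did the work.
func missed(f func() bool, trials int) int {
	n := 0
	for i := 0; i < trials; i++ {
		if !f() {
			n++
		}
	}
	return n
}

func main() {
	const trials = 1000
	// The first count varies from run to run; the second stays at 0.
	fmt.Println("Add inside Goroutine, missed runs:", missed(addInside, trials))
	fmt.Println("Add before Goroutine, missed runs:", missed(addBefore, trials))
}
```

Running a program like this under Go’s race detector (the -race flag) may also flag the misplaced wg.Add() call, which is another way to catch the mistake before it reaches review.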

My instinct was correct, but that was a matter of luck. It’s only on closer inspection of what’s happening that it becomes clear how dangerous the second style is. With that in mind, I now insist in all code reviews that the wg.Add() call is made before the Goroutine is created, and so should you.
