When new people join the Go-Miami group, they always write that they want to learn more about Go's concurrency model. Concurrency seems to be the big buzzword around the language. It was for me when I first started hearing about Go. It was Rob Pike's Go Concurrency Patterns video that finally convinced me I needed to learn this language.
To understand how Go makes writing concurrent programs easier and less prone to errors, we first need to understand what a concurrent program is and the problems that result from such programs. I will not be talking about CSP (Communicating Sequential Processes) in this post, which is the basis for Go's implementation of channels. This post will focus on what a concurrent program is, the role that goroutines play, and how the GOMAXPROCS environment variable and runtime function affect the behavior of the Go runtime and the programs we write.
Processes and Threads
When we run an application, like the browser I am using to write this post, a process is created by the operating system for the application. The job of the process is to act like a container for all the resources the application uses and maintains as it runs. These resources include things like a memory address space, handles to files, devices and threads.
A thread is a path of execution that is scheduled by the operating system to execute the application's code against a processor or core. A process starts out with one thread, the main thread, and when that thread terminates the process terminates. This is because the main thread is the origin for the application. The main thread can then in turn launch more threads, and those threads can launch even more threads. Once we have more than one thread running in our program, we have a concurrent program.
The operating system schedules a thread to run on an available processor or core regardless of which process the thread belongs to. Each operating system has its own algorithms that make these decisions, and it is best for us to write concurrent programs that are not specific to one algorithm or another. Plus, these algorithms can change with every new release of an operating system, so relying on them is a dangerous game to play.
Goroutines and Parallelism
Goroutines are functions that we ask the Go runtime's goroutine scheduler to execute concurrently. The main function also executes on a goroutine, but it is one we never launch ourselves with the go keyword. Goroutines are considered lightweight because they use little memory and few resources, and their initial stack size is small. Go 1.2 raised the initial stack size from 4K to 8K, and Go 1.4 lowered it to 2K. The stack has the ability to grow and shrink as needed.
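To make the "lightweight" claim a little more concrete, here is a minimal sketch that launches a large number of goroutines and waits for all of them. The function name launch and the count of 100,000 are my own choices for illustration, and the sync.WaitGroup and atomic counter are just one reasonable way to wait and count; none of this comes from the examples later in the post.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// launch starts n goroutines, waits for all of them to finish,
// and returns how many reported completion.
// (launch is a hypothetical helper used only for this illustration.)
func launch(n int) int64 {
	var wg sync.WaitGroup
	var finished int64

	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Atomic add so concurrent increments do not race.
			atomic.AddInt64(&finished, 1)
		}()
	}

	// Block until every goroutine has called Done.
	wg.Wait()
	return finished
}

func main() {
	fmt.Println("goroutines finished:", launch(100000))
}
```

Creating this many operating system threads would be far more expensive, which is the practical meaning of lightweight here.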
The operating system schedules threads to run against available processors, and the Go runtime schedules goroutines to run against available threads from the scheduler's thread pool. Before Go 1.5, which is the behavior this post describes, the scheduler's thread pool is allocated with only one thread by default; from Go 1.5 on, the default is the number of logical CPUs. Even with one thread, hundreds of thousands of goroutines can be scheduled to run concurrently. It is not recommended to change the size of the scheduler's thread pool, but if you want to run goroutines in parallel, Go provides the ability to change it via the GOMAXPROCS environment variable or runtime function.
Parallelism is when two or more threads are executing code simultaneously against different processors or cores. We can run goroutines in parallel as long as we are running on a machine with multiple processors or cores and we add more than one thread to the scheduler's thread pool. If we add more threads to the scheduler's thread pool but run our program on a single-CPU machine, our goroutines will run against multiple threads but will be running concurrently against the single CPU, not in parallel.
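As a small sketch of the runtime function mentioned above: runtime.GOMAXPROCS returns the previous setting, and calling it with 0 only queries the current value without changing it. The helper names currentMaxProcs and setMaxProcs are mine, added only to keep the example readable. The environment variable route would be running the program with GOMAXPROCS set in the shell instead.

```go
package main

import (
	"fmt"
	"runtime"
)

// currentMaxProcs reports the current setting without changing it.
// Passing 0 to runtime.GOMAXPROCS is a query only.
func currentMaxProcs() int {
	return runtime.GOMAXPROCS(0)
}

// setMaxProcs changes the setting and returns the previous value.
func setMaxProcs(n int) int {
	return runtime.GOMAXPROCS(n)
}

func main() {
	fmt.Println("logical CPUs:", runtime.NumCPU())
	fmt.Println("GOMAXPROCS at start:", currentMaxProcs())

	prev := setMaxProcs(2)
	fmt.Println("previous:", prev, "current:", currentMaxProcs())

	// Restore the original setting when we are done experimenting.
	setMaxProcs(prev)
}
```

The printed values depend on the machine and the Go release, so no output is shown here.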
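The rule in the paragraph above can be written down as a tiny function: the number of goroutines that can truly run at the same instant is bounded by both the thread pool size and the number of CPUs. This is only a sketch of that reasoning; effectiveParallelism is a hypothetical helper, not something the runtime exposes.

```go
package main

import (
	"fmt"
	"runtime"
)

// effectiveParallelism returns the smaller of the thread pool size and
// the CPU count, which is the upper bound on goroutines running in parallel.
// (Hypothetical helper for illustration only.)
func effectiveParallelism(maxProcs, numCPU int) int {
	if maxProcs < numCPU {
		return maxProcs
	}
	return numCPU
}

func main() {
	// Two threads on a single CPU: concurrent, but not parallel.
	fmt.Println(effectiveParallelism(2, 1))

	// The bound for the machine and settings this program runs with.
	fmt.Println(effectiveParallelism(runtime.GOMAXPROCS(0), runtime.NumCPU()))
}
```

The first call prints 1, matching the single-CPU case described above; the second depends on the machine.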
Concurrency Example
Let's build a small program that shows Go running goroutines concurrently. In this example the scheduler's thread pool has one thread, which was the default before Go 1.5; on newer releases, set GOMAXPROCS=1 to reproduce it:
package main

import (
	"fmt"
	"time"
)

func main() {
	fmt.Println("Starting Go Routines")

	// Display the letters a through z.
	go func() {
		for char := 'a'; char < 'a'+26; char++ {
			fmt.Printf("%c ", char)
		}
	}()

	// Display the numbers 1 through 26.
	go func() {
		for number := 1; number < 27; number++ {
			fmt.Printf("%d ", number)
		}
	}()

	// Give both goroutines time to finish before main returns.
	fmt.Println("Waiting To Finish")
	time.Sleep(1 * time.Second)
	fmt.Println("\nTerminating Program")
}
Starting Go Routines
Waiting To Finish
a b c d e f g h i j k l m n o p q r s t u v w x y z 1 2 3 4 5 6 7 8 9 10 11
12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
Terminating Program
We can see that the first goroutine finishes displaying all 26 letters and then the second goroutine gets a turn to display all 26 numbers. Because the first goroutine needs so little time to complete its work, we don't see the scheduler interrupt it before it finishes. We can give the scheduler a reason to swap the goroutines by putting a sleep into the first goroutine:
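package main

import (
	"fmt"
	"time"
)

func main() {
	fmt.Println("Starting Go Routines")

	// Sleep first so the scheduler swaps in the other goroutine,
	// then display the letters a through z.
	go func() {
		time.Sleep(1 * time.Microsecond)
		for char := 'a'; char < 'a'+26; char++ {
			fmt.Printf("%c ", char)
		}
	}()

	// Display the numbers 1 through 26.
	go func() {
		for number := 1; number < 27; number++ {
			fmt.Printf("%d ", number)
		}
	}()

	// Give both goroutines time to finish before main returns.
	fmt.Println("Waiting To Finish")
	time.Sleep(1 * time.Second)
	fmt.Println("\nTerminating Program")
}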
Starting Go Routines
Waiting To Finish
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 a
b c d e f g h i j k l m n o p q r s t u v w x y z
Terminating Program
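A sleep is not the only way to give the scheduler a reason to swap goroutines. As a hedged sketch, the variation below has each goroutine call runtime.Gosched after every item, which yields the processor so other goroutines can run. To keep it self-checking I have also swapped in a sync.WaitGroup instead of the one second sleep and collected the output in a buffer guarded by a mutex; the function name printBoth and those choices are mine, not part of the examples above. The exact interleaving is up to the scheduler, so no output is shown.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime"
	"sync"
)

// printBoth runs the letter and number goroutines and returns everything
// they wrote. Each goroutine yields after every item with runtime.Gosched.
// (printBoth is a hypothetical helper for this illustration.)
func printBoth() string {
	var (
		wg  sync.WaitGroup
		mu  sync.Mutex // guards out
		out bytes.Buffer
	)

	wg.Add(2)

	// Write the letters a through z, yielding after each one.
	go func() {
		defer wg.Done()
		for char := 'a'; char < 'a'+26; char++ {
			mu.Lock()
			fmt.Fprintf(&out, "%c ", char)
			mu.Unlock()
			runtime.Gosched()
		}
	}()

	// Write the numbers 1 through 26, yielding after each one.
	go func() {
		defer wg.Done()
		for number := 1; number < 27; number++ {
			mu.Lock()
			fmt.Fprintf(&out, "%d ", number)
			mu.Unlock()
			runtime.Gosched()
		}
	}()

	// Wait for both goroutines instead of sleeping for a fixed time.
	wg.Wait()
	return out.String()
}

func main() {
	fmt.Println(printBoth())
}
```

Waiting on the WaitGroup returns as soon as both goroutines are done, rather than guessing how long they need.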
Parallel Example
In our past two examples the goroutines were running concurrently, but not in parallel. Let's make a change to the code to allow the goroutines to run in parallel. All we need to do is set the size of the scheduler's thread pool to two threads:
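package main

import (
	"fmt"
	"runtime"
	"time"
)

func main() {
	// Allow two threads to execute goroutines at the same time.
	runtime.GOMAXPROCS(2)

	fmt.Println("Starting Go Routines")

	// Display the letters a through z.
	go func() {
		for char := 'a'; char < 'a'+26; char++ {
			fmt.Printf("%c ", char)
		}
	}()

	// Display the numbers 1 through 26.
	go func() {
		for number := 1; number < 27; number++ {
			fmt.Printf("%d ", number)
		}
	}()

	// Give both goroutines time to finish before main returns.
	fmt.Println("Waiting To Finish")
	time.Sleep(1 * time.Second)
	fmt.Println("\nTerminating Program")
}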
Starting Go Routines
Waiting To Finish
a b 1 2 3 4 c d e f 5 g h 6 i 7 j 8 k 9 10 11 12 l m n o p q 13 r s 14
t 15 u v 16 w 17 x y 18 z 19 20 21 22 23 24 25 26
Terminating Program
Conclusion
Just because we can change the size of the scheduler's thread pool doesn't mean we should. There is a reason the Go team has set the runtime defaults the way they did, especially the default for the scheduler's thread pool. Just know that arbitrarily adding threads to the scheduler's thread pool and running goroutines in parallel will not necessarily provide better performance for your programs. Always profile and benchmark your programs, and make sure the Go runtime configuration is only changed if absolutely required.
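As one possible way to act on that advice, here is a rough sketch that measures the same workload under different thread pool sizes using testing.Benchmark, which can be called from an ordinary program. The work function is a made-up CPU-bound task standing in for real application code, and the goroutine count, loop size, and the values 1, 2 and 4 are arbitrary assumptions. Treat the numbers it prints as specific to your machine, not as general guidance.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"testing"
)

// work is a hypothetical CPU-bound task used only for this illustration.
func work(n int) int {
	sum := 0
	for i := 0; i < n; i++ {
		sum += i % 7
	}
	return sum
}

// runParallel runs work on the given number of goroutines and
// returns the sum of their results.
func runParallel(goroutines, n int) int {
	results := make([]int, goroutines)
	var wg sync.WaitGroup

	for g := 0; g < goroutines; g++ {
		wg.Add(1)
		go func(slot int) {
			defer wg.Done()
			// Each goroutine writes only to its own slot, so no lock is needed.
			results[slot] = work(n)
		}(g)
	}
	wg.Wait()

	total := 0
	for _, r := range results {
		total += r
	}
	return total
}

// benchmarkWithProcs measures runParallel under a specific GOMAXPROCS
// value and restores the previous setting afterwards.
func benchmarkWithProcs(procs int) testing.BenchmarkResult {
	prev := runtime.GOMAXPROCS(procs)
	defer runtime.GOMAXPROCS(prev)

	return testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			runParallel(4, 100000)
		}
	})
}

func main() {
	for _, procs := range []int{1, 2, 4} {
		fmt.Printf("GOMAXPROCS=%d: %s\n", procs, benchmarkWithProcs(procs))
	}
}
```

If the timings barely change between settings, or get worse, that is a sign the extra threads are not helping this particular workload.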