The How and Why of Go, Part 1: Tooling

I'm one of those people surprised at the success of the Go programming language. Here is a language that prides itself on offering less than languages designed decades ago, is unabashedly not OOP, and lacked a decent dependency management system (at least initially), but is still wildly successful, with a number of significant open source projects written in it (e.g. Docker, Terraform and Kubernetes). Another intriguing thing is that people who use Go as their primary language rarely complain about it (maybe generics would be nice here and there), while those who come into initial contact with it can't stop swearing, at least at first (mea culpa). I used this gap in affinity as a chance to understand the intricacies of Go by diving into the platform, and writing down what I think is necessary knowledge for newcomers to become productive on it. The target readers are developers already proficient in one language and platform; as the text is already quite long, I didn't want to explain common programming terminology. Unavoidably, my perspective is skewed by the technologies I'm acquainted with, especially Python, with which I frequently compare Go, but the text should be useful for all newcomers to Go, whatever their background. I hope that those already working in Go can also find a useful tip here and there.

And now to the "Why" in the title. The design of Go is a bit curious, in that it leaves out most features of other popular programming languages, going for simplicity rather than recommending itself with more features. My aim was to make a proper attempt at understanding the context for this choice, by following the path from expectations from a language, to design principles, to language features, and finally to the embodiment of the language in terms of compiler, runtime and tooling. This process is never perfect for any technical product, as there are incidental turns taken at every step, but I think the knowledge of how different aspects of a language came to be the way they are, while depending on each other and the context, is very important and useful. I therefore attempted to start with an overview of the "intellectual" history of Go, connecting following discussions of features to this history.

This first part in what I intend to be a two-part series will concentrate on the Go toolchain, that is, the set of tools for writing, verifying, compiling and maintaining Go applications. As we will see, Go tooling has come a long way, and offers a first-class development environment for writing correct and performant applications. The second installment will deal with the most important features of the language, also in the light of Go design decisions. I would like both parts to stay as up-to-date and relevant as possible, so if you have any suggestions for improvement, do leave a comment, and I will make sure to address it.

Why is Go the way it is?

In order to appreciate the design decisions that went into the Go language, it's important to understand where the language designers started from, and what problems they expected their language to solve. As Rob Pike explains in detail in this article from 2012, Go was not designed to experiment with PLT concepts. The languages and technologies it intended to replace were those in daily use at Google (C++, Java and Python); Go was designed to solve the problems these platforms presented at Google scale. A rough three-part categorization of the problems Go is intended to address can be done as follows:

Build issues: The problems that the C and C++ model of compilation presents are well-known. As explained in the linked article, compiling a moderately large C++ codebase can lead to gigabytes of IO. Go avoids this and similar issues by making unused imports an error, and replacing header files and includes with an inverted dependency model of compilation. Circular imports are also not allowed. One interesting side effect of this emphasis on dependency hygiene is that copy-paste is preferred to importing large packages:

Dependency hygiene trumps code reuse. One example of this in practice is that the (low-level) net package has its own integer-to-decimal conversion routine to avoid depending on the bigger and dependency-heavy formatted I/O package. Another is that the string conversion package strconv has a private implementation of the definition of 'printable' characters rather than pull in the large Unicode character class tables; that strconv honors the Unicode standard is verified by the package's tests.

Developer ergonomics: Go is a minimalist language that tries to do away with features of languages used at Google, such as C++, Java and Python, which are not coincidentally also quite popular in the rest of the dev community. In fact, as this candid blog post explains, the initial drive to develop Go came from unpleasant experiences developing concurrent code in C++, and the intention of the standards committee to make the language even more complex. While differentiating itself from popular languages, Go cannot stray too far away from them, as it is intended to be used in production at a company the size of Google. As such, it has to be easy to learn, and familiar to junior developers. This simplicity is at the service of solving modern programming challenges, foremost among them concurrency. Concurrency in Go is provided through communicating sequential processes (CSP), the advantage of which is that it is easy to integrate into a procedural language. Another modern feature that now meets a C-like language in Go is garbage collection. Due to the type system and memory allocation features of Go, however, garbage collection is different from the way it works in languages like Java or Python; we will discuss this in the follow-up post.

Google-scale: The design of Go is optimized for unambiguity and parsability. The underlying reasons are the ease of writing external tools and avoiding discord in large developer teams. As an example, Pike mentions that having languages that are whitespace-sensitive, such as Python, is not an issue in itself, but Python embedded in SWIG declarations turns out to be a huge problem. In order to preclude such nuisances, Go has curly braces and clear formatting rules. Another example is the now famous auto-formatting tool that provides a standard through implementation. This, and similar tools like gofix which we will discuss later, are possible because the language is easy to parse and unambiguous (in comparison to e.g. C++, which can have statements that can be parsed multiple ways). These tools enable standards to be set within large groups, and also allow systematic changes, such as API changes, to be made on large code bases. Generally, it can also be said that the design concerns gathered in the previous categories also contribute to scaling Go, especially those concerning concurrency primitives and strong standard library support. Another aspect of scale is the number of people working on a project. As Pike correctly observes, developers tend to stick to the subset they understand of a complex language with many features. As Go has a rather limited set of features, there is no subset to agree on.

Obviously, not all the design choices that went into Go can be explained through these points. There are quite a number of things that are put to good use in other languages, but are explicitly shunned in Go, such as OOP, exceptions and generics. In my opinion, there is one general thread that connects the dots, which is that in Go, things that are a pain in the large are not allowed in the small, either. Or in Rob Pike's words:

As with many of the design decisions in Go, it forces the programmer to think earlier about a larger-scale issue … that if left until later may never be addressed satisfactorily.

Another important aspect to keep in mind when reading Go documentation, and wondering at how basic it is, is that the priority given to keeping the implementation simple and feasible in certain areas led the designers to simply omit features that one takes for granted in other languages, but that lead to hidden complexity in the implementation. As stated in the official FAQ:

Go was designed with an eye on felicity of programming, speed of compilation, orthogonality of concepts, and the need to support features such as concurrency and garbage collection. Your favorite feature may be missing because it doesn't fit, because it affects compilation speed or clarity of design, or because it would make the fundamental system model too difficult.

The Go toolchain

The core of the Go toolchain is the go command line tool that bundles the most important components, including the compiler. In the rest of this text, Go refers to the language and ecosystem, whereas go (with lowercase g) refers to the command line tool. Go adheres to the recent pattern of delivering development tools where the entry point is one single command which accepts a first argument as an action (other examples could be git and kubectl). In daily work, when building Go code, one rarely has to deal directly with the actual compiler or linker, which are hidden somewhere in the Go distribution. The Go compiler is in fact written in Go itself, and thanks to this fact and the snappy compile times, bootstrapping the Go toolchain is one of the simplest ways to get an up-to-date version on your computer. You will first need to get the out-of-date but still useful version of go from system repositories, as in apt install golang. Afterwards, download the latest source package from the official downloads page. After unpacking it, run the command ./make.bash in the src/ subdirectory. This will compile the compiler, various other tools and the library. On my relatively outdated i7 2.40GHz computer it took 5 minutes in total. The compiler will now reside in the bin/ directory; you can either use it by referencing it explicitly or by setting the search path with export PATH=`pwd`/bin:$PATH. If you pack the following into the file hello.go, and compile it with go build hello.go, you should have the traditional Hello World:

package main

import "fmt"

func main() {  
    fmt.Println("Hello world")
}

The executable is created by default with the name of the file, i.e. hello, which means no more a.out. As we mentioned, the compiler and linker are being called in the background; you can find out how and where by running the same command with the -x option, i.e. go build -x hello.go. In this verbose output, you can see how go creates a temporary working directory, creates some files for specifying where the various object files are, and then brings everything together.

You can also run the Hello World file with go run hello.go; this will directly execute the code without creating an executable. The argument to this command does not have to be a file; it can also be a package or directory (with a main package; more on this later).

Cross-compiling Go

It is also possible to cross-compile Go code for another platform. This can be achieved using environment variables that specify the target operating system and architecture. In order to compile for Linux on the Raspberry Pi, which uses an ARM chip, for example, you would need to run the following:

GOOS=linux GOARCH=arm go build hello.go

If you now look at what kind of a file the resulting binary is with file hello, you will see that it's an ELF 32-bit LSB executable, ARM.
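
If you want the binary itself to tell you which platform it was built for, the runtime package exposes the target as constants. The following is a minimal sketch of a Hello World variation; the platform helper is my own naming, not something from the toolchain:

```go
package main

import (
	"fmt"
	"runtime"
)

// platform returns the operating system and architecture this binary was
// compiled for, joined in the "GOOS/GOARCH" form.
func platform() string {
	return runtime.GOOS + "/" + runtime.GOARCH
}

func main() {
	fmt.Println("Hello world from", platform())
}
```

Built with GOOS=linux GOARCH=arm as above and run on the target machine, this reports linux/arm, since both values are fixed at compile time.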

Built-in Go tools

As the Go language targets large teams of developers without too much experience, the toolchain contains a couple of tools that effectively set standards by implementing them. The best-known of these is the gofmt tool that automatically formats code. The aim is avoiding bikeshedding discussions by providing one correct, automated way of formatting Go code. The gofmt tool is delivered as a part of the Go codebase; if you built the code from source as explained above, you should have it lying next to the go tool. In daily usage, gofmt is called using the alias go fmt, which is simply gofmt -l -w. With these options, gofmt reformats the files in-place, and prints their names. This isn't all gofmt can do, however. It is also a useful tool for simple transformations using the -r option. Let's say that you modified a frequently called function yarbl, and changed the order of its arguments; the first and second arguments have to be switched. That is, instead of yarbl(x, y, z), you need yarbl(y, x, z). The following command will update all calls to yarbl in file code.go (we will see later how to refer to a module or package) and make them fit the new signature:

gofmt -r 'yarbl(x, y, z) -> yarbl(y, x, z)' -w code.go

In the pattern specification, you need to use single lowercase letters to match sub-expressions; anything else will be matched exactly. With the above pattern, the following code:

yarbl(x, y, z)  
yarbl(foo, bar, zap)  

will be changed to the following:

yarbl(y, x, z)  
yarbl(bar, foo, zap)  

This feature is rather useful for refactoring Go code, e.g. in order to change the name of a function or variable in order to export it publicly. Another switch gofmt accepts is -s, which can be used to simplify your code, but the transformations carried out by this option are relatively limited, in complexity and in number.
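
To give an idea of what -s does, here is a small sketch showing three of the simplifications listed in the gofmt documentation. The code is already in the simplified form; the comments show the longer spelling that gofmt -s would rewrite. The point type and the simplified function are made up for this illustration:

```go
package main

import "fmt"

type point struct{ x, y int }

// simplified uses the short forms that gofmt -s produces; each comment
// shows the equivalent longer form it would be rewritten from.
func simplified() ([]point, []int, int) {
	// Before: []point{point{1, 2}, point{3, 4}}
	pts := []point{{1, 2}, {3, 4}}

	nums := []int{10, 20, 30, 40}
	// Before: nums[1:len(nums)]
	tail := nums[1:]

	count := 0
	// Before: for _ = range nums
	for range nums {
		count++
	}
	return pts, tail, count
}

func main() {
	fmt.Println(simplified())
}
```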

Environment variables

Before we go any further, I would like to explain the role of a couple of environment variables in the functioning of the go tool. We already saw above how the target platform and operating system can be passed into the Go compiler via environment variables. There are three more environment variables that determine the way Go looks for, stores and compiles code. In the order of importance, these are GOPATH, GOBIN and GOROOT. Other environment variables affect other functionality, as you can read in the official documentation (or on the commandline with go help environment), but they are not as significant. You can also print all the environment variables that Go consults by running go env, or go env VARNAME to get the value of a specific variable. These commands will also print the default values if the particular variables are not set.

GOPATH used to have a very central role in how code under development was organized; you needed to place your code in a very specific place under GOPATH for the go tool to work, but this situation has changed with the new module system, which we discuss below. By setting GOPATH, you can determine where go downloads third party packages and source code. If not set, it defaults to the go directory in the user's home directory. You can set it to an arbitrary directory, for example the directory you are in, with export GOPATH=`pwd`.

GOBIN determines where Go saves executables that are compiled with the two other very frequently used go commands, go install and go get. These commands are used for compiling and putting executables to the GOBIN directory from local and remote code respectively. You can get a taste of the first command by running go install hello.go in the directory where the Hello World code resides. This should place the hello binary in the GOBIN directory. When not set, GOBIN defaults to $GOPATH/bin.

GOROOT is the directory in which Go looks for the standard library. In normal usage, you don't need to set this yourself: go will figure this out by looking at where it's running.
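
The fallback from GOBIN to $GOPATH/bin can be sketched in a few lines of Go. This is only an illustration of the rule described above, not how the go tool implements it; installDir is a hypothetical helper, and since GOPATH may be a list of directories, the sketch assumes that only the first entry matters:

```go
package main

import (
	"fmt"
	"go/build"
	"os"
	"path/filepath"
)

// installDir mirrors the rule described above: use GOBIN when it is set,
// otherwise fall back to the bin directory under GOPATH. GOPATH may hold
// several directories; this sketch only looks at the first one.
func installDir(gobin, gopath string) string {
	if gobin != "" {
		return gobin
	}
	entries := filepath.SplitList(gopath)
	if len(entries) == 0 {
		return ""
	}
	return filepath.Join(entries[0], "bin")
}

func main() {
	// build.Default.GOPATH already holds the default value when the
	// GOPATH environment variable is not set.
	fmt.Println(installDir(os.Getenv("GOBIN"), build.Default.GOPATH))
}
```

For the authoritative values, go env GOBIN and go env GOPATH remain the place to look.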

Organizing your code in modules and packages

The Hello World example above had as its first line the declaration package main. Every Go code file needs such a line at the very top (optionally after some comments for documentation), telling the compiler to which package the code in the file belongs. In order to understand and use packages, we need to start at a higher level of abstraction, namely modules. Modules are the distribution units of Go code, be it libraries or executables. Technically, they are collections of packages that have common dependencies and compilation conditions. In the old way of doing things, modules were determined by the path in which Go files were located with respect to the GOPATH, but this is not necessary anymore; you can define modules with a single command, as we will see later, which creates a go.mod file in a certain format. Once a module is defined this way, you can organize your code into packages, just like the main package that we used above. Before we continue with examples, I would like to point out that you can get rather detailed documentation on modules on the command line with the go help modules command (available online here). As per this documentation, the module-related behavior of the go tool can be controlled in detail using certain environment variables. Generally, however, you can assume that if you're in a module (i.e. there is a go.mod file in the current directory or one of its parent directories), you are in module mode, and the instructions here apply. We will later handle downloading and installing Go code without the use of modules.

We will create our module in an empty directory; the name of the directory is not important. Within this directory, run the command go mod init myprinter. This will create the aforementioned go.mod, which should have the following content:

module myprinter

go 1.14  

Side note: In the Go world, module names are connected to how they can be found on the internet; the conventional way of naming a module is prefixing it with its repository URL. We will deal with this topic later, in order not to complicate the matters at this point.

The two lines of go.mod are, obviously, the module name and the Go version with which it was created. Let's add some code to this module; add the following to the file myprinter.go right next to go.mod:

package main

import (  
    "fmt"
    "os"
    "path/filepath"
)

func main() {  
    dir, _ := filepath.Abs(filepath.Dir(os.Args[0]))
    fmt.Println(dir)
}

This file has the package declaration main, but the filename is completely different, which go permits. You can in fact put code for the same package in different files in the same directory, with the restriction that there is only one package in a directory. The only exception to this single package rule is the test package; more on this later. Now within the same directory, run the command go install myprinter. Before doing so, however, make sure that you have set GOBIN to a practical location. You should end up with the executable myprinter in the GOBIN directory, and when you run it, its output should be the path to the executable you just ran. The main package has a special meaning in Go. When you ask Go to create an executable from a code directory, it will look for the main package within that directory and compile it to an executable, with the main function as the entry point. That is, you cannot create an executable out of an arbitrary file; it has to be a main package, even if it's a subpackage. For subpackages, the last part of the path specification will be taken as the name of the executable. If the main package is at the base, as with the toy example here, it will be the name of the module. Fittingly, you cannot create a main package and import code from it; Go will complain that the location you are trying to import from "is a program, not an importable package".

Now let's move the logic for finding the path of the current executable to a separate package. To add a new package to our module, create the subdirectory pathfinder and put the following in the file pathfinder.go in that directory:

package pathfinder

import (  
    "os"
    "path/filepath"
)

func Find() string {  
    dir, _ := filepath.Abs(filepath.Dir(os.Args[0]))
    return dir
}

Here is something you should pay attention to: the Find function has to be capitalized. Otherwise Go will complain that it cannot be found when accessed in myprinter.go. This is an interesting feature of the Go language: Visibility is tied to capitalization. We will see more on this in the second installment, but you should keep it in mind in case you see an error. Also modify the myprinter.go file to look like this:

package main

import (  
    "fmt"
    "myprinter/pathfinder"
)

func main() {  
    fmt.Println(pathfinder.Find())
}

As you can see, we are importing our new package as myprinter/pathfinder. Go does not have relative imports; every import path has to uniquely identify the package it is importing – another feature through a lack of feature, making code analysis and refactoring easier. You can now run go install myprinter, and it should create a binary in the same location that does the same thing. The last argument to go install is optional; when omitted, Go will build and install the main package in the current directory. The command go build we saw earlier will do something very similar, simply dropping the compiled executable in the current directory instead of moving it to $GOBIN.
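
The capitalization rule that made us write Find instead of find can also be checked programmatically: the standard library's go/ast package has an IsExported function that reports whether a name starts with an upper-case letter. A small sketch, in which the visible wrapper is just my own naming:

```go
package main

import (
	"fmt"
	"go/ast"
)

// visible reports whether an identifier with the given name would be
// accessible from other packages, i.e. whether it is exported.
func visible(name string) bool {
	return ast.IsExported(name)
}

func main() {
	for _, name := range []string{"Find", "find"} {
		fmt.Printf("%s exported: %t\n", name, visible(name))
	}
}
```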

You might be asking yourself, how can one check whether a package that is not an executable but simply a library is error-free and can be compiled? This can be done with both go build and go install. For non-main packages, both of these commands will compile the intermediate package binary, and then discard it (this behavior is new with modules; in the past, go install used to compile packages to $GOPATH/pkg).

Other useful subcommands

In addition to fmt, build and install, the base go tool has a number of very useful subcommands. You can list these by simply running go. Additional information on each subcommand can be printed by running e.g. go help build. I would strongly recommend reading these help pages every now and then; I found out about go build -x while going through the help page, for example. In this section, I would like to go into a bit more detail on two subcommands that are rather useful, go list and go doc. The go list subcommand prints information about the packages specified as arguments, or the package in the current directory if none are specified. We can list the base package of our myprinter module, for example, with the command go list myprinter. You have to do this while inside the module directory, because otherwise module mode will not be activated, and the module will not be found. The output will simply be the name of the base package, myprinter. What if we want to refer to all the packages of a module recursively? Ellipsis, or three dots, is the wildcard you need for this purpose, as in go list myprinter/.... The go subcommands that operate on packages accept arguments with an ellipsis; for example, to build all of the myprinter module, we could run go build myprinter/..., which would be totally useless in this case. We will see more useful applications of the ellipsis later.

If we run go list myprinter/..., we will get the following list:

myprinter
myprinter/pathfinder

This is all nice and dandy, but not that useful; the same could be achieved with some grep (well, someone else could do it, at least). The real power of go list is in the use of the template argument, documented in go help list. The template can be given as the argument -f, and can include statements that interpolate from the Package data structure (also documented in the help printout). For example, for each path, we can print the package path and the imports, as follows:

$ go list -f "{{ .ImportPath }}: {{ .Imports }}" myprinter/...
myprinter: [fmt myprinter/pathfinder]
myprinter/pathfinder: [os path/filepath]

If you want to try out this command on a large module, you can try something from the standard library, such as net/.... Alternatively, you can also use the special argument all, which will print information on all "active" packages, meaning those that are depended on, including those in the standard library. go list can also be used to print information about modules, with the flag -m. With this flag, the struct that is used for interpolation is, as one would expect, Module instead of Package. For both packages and modules, there is a lot of extra information that can be printed out, which can be rather useful for automated analysis and overviews of large code bases.
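
The -f argument is written in the syntax of Go's text/template package, so you can try out a format string in plain Go before handing it to go list. The sketch below renders the format string from above against a stand-in struct. pkgInfo and render are hypothetical names, and pkgInfo carries only two of the many fields of the real Package struct:

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// pkgInfo is a stand-in with two of the fields that go list documents
// for its Package struct.
type pkgInfo struct {
	ImportPath string
	Imports    []string
}

// render applies a go list style format string to a single package.
func render(format string, p pkgInfo) (string, error) {
	tmpl, err := template.New("list").Parse(format)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := tmpl.Execute(&sb, p); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	out, err := render("{{ .ImportPath }}: {{ .Imports }}", pkgInfo{
		ImportPath: "myprinter",
		Imports:    []string{"fmt", "myprinter/pathfinder"},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // myprinter: [fmt myprinter/pathfinder]
}
```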

Once you list out the packages in a module, you will probably want to get more information about what's in them. The command for this purpose is go doc. If you go ahead and try to print documentation on our toy module with go doc myprinter, you will see that an empty line is printed out; this is because there is no documentation. Let's add the following to the top of the file myprinter.go:

// A module with an entry point that prints the path to the binary.
//
// This module is for demo purposes. It does not do anything useful.
// You can read the blog post at http://okigiveup.net.

If you now run go doc myprinter, you will see the above text. This is the convention for documenting Go packages: a short description, and then a longer text, both as comments at the very top, and separated by a blank line. By default, go doc does not print any members from a main package. If we run it on the pathfinder subpackage, we will see that it prints information on the Find function:

package pathfinder // import "myprinter/pathfinder"

func Find() string  

When given a single argument that is a package path, go doc will print the documentation for the package and list the exported symbols (as mentioned above, symbols are exported by capitalizing their names). As you might guess, we could get extra documentation on the Find function, but we don't have any yet. The go doc tool looks for a comment block right before a function as its documentation (the same is valid for constants, package variables etc); let's add the following to pathfinder.go right before Find:

// Find finds and returns the path to the currently executing binary

Now, in order to get this documentation, we would need to refer to the Find function somehow. There are two ways of doing this: either with myprinter/pathfinder.Find, or by providing a second argument, as in go doc myprinter/pathfinder Find. Both should give you the following result:

package pathfinder // import "myprinter/pathfinder"

func Find() string  
    Find finds and returns the path to the currently executing binary
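
The link between a comment and the function that follows it is not special to go doc; the parser in the standard library attaches such comments to the declaration itself. The following sketch parses a shortened copy of pathfinder.go from a string and pulls out the doc comment of Find. It is not how go doc is implemented; funcDoc is a hypothetical helper and the function body is reduced to a stub:

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// src is a shortened copy of pathfinder.go with a stub body.
const src = `package pathfinder

// Find finds and returns the path to the currently executing binary
func Find() string { return "" }
`

// funcDoc returns the comment block placed right before the named
// top-level function, or an empty string if there is none.
func funcDoc(source, name string) (string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "pathfinder.go", source, parser.ParseComments)
	if err != nil {
		return "", err
	}
	for _, decl := range file.Decls {
		if fn, ok := decl.(*ast.FuncDecl); ok && fn.Name.Name == name {
			// Text strips the comment markers; it is safe on a nil Doc.
			return fn.Doc.Text(), nil
		}
	}
	return "", nil
}

func main() {
	doc, err := funcDoc(src, "Find")
	if err != nil {
		panic(err)
	}
	fmt.Print(doc)
}
```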

Another built-in tool that is useful for checking the correctness of Go code is go vet. There are certain kinds of errors that are possible in Go code which the compiler can't (or won't) find; for example, string interpolation arguments can be missing or invalid (a %d where a string is specified), or nil checks can be unnecessary because a value cannot be nil. go vet has a number of built-in checks that are all applied by default; you can see a list with go doc vet (or by following the above link). When you run a test using go test (details of this command will be discussed later), go vet is applied with a subset of these checks, such as the printf check, which concerns the aforementioned string interpolation. If you have a CI pipeline, it makes sense to add go vet to catch any subtle issues that might otherwise slip through.
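
To make the string interpolation case concrete, here is a sketch of a %d verb meeting a string argument. The compiler accepts this without complaint, and the mistake only shows up in the output at runtime. I keep the format string in a variable here so that the snippet stays clean when vetted; as far as I can tell, the printf check needs the format to be a constant, such as a string literal written directly in the call, to report the mismatch. The mismatched function is made up for this example:

```go
package main

import "fmt"

// mismatched formats a string with an integer verb. This compiles fine;
// fmt reports the problem inside the resulting string instead.
func mismatched() string {
	format := "%d" // as a literal in the call, go vet would flag this
	return fmt.Sprintf(format, "hello")
}

func main() {
	fmt.Println(mismatched()) // %!d(string=hello)
}
```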

Dependency management and the build system

Dependency management in the Go world is a curious story. In earlier versions of Go, dependency management was, to put it mildly, quite difficult to get used to. It was essentially very close to a bash script that used go list to print out and clone all the git repositories referenced in the code. For a long time it wasn't even possible to pin versions of dependencies. The recommended way to get reproducible builds for a project was to copy dependencies into the project repository (see the last paragraph of the previous link). You also had to put your own code in a very specific place, along with the dependencies, which had a weird feeling of propagating the dependencies up the code hierarchy, instead of down (i.e. in a subfolder like node_modules). Fortunately, the new module system, available since version 1.11, frees developers from this sorry state of affairs. It is the result of a nearly two-year-long design discussion; you can read the various posts that explain the state of the design, together with extensive discussion in the comments section, here. The resulting dependency management system is the new standard, and is miles better than the old way of doing things. Therefore I will not discuss the old GOPATH-based one, and concentrate solely on the module-based dependency management system here.

A very interesting decision Go has taken from the beginning is to combine the package system with code hosting. Above, we called our module myprinter; this is actually not the conventional way of naming modules. What we should have done is to name the module after the version control location where it would be hosted, i.e. something like github.com/afroisalreadyinu/myprinter. When you do so, Go can fetch and install these modules without any additional work on your or the community's part, like hosting a module index such as PyPI, the Python Package Index. The details of the remote import path specification can be found in the documentation with go help importpath. The gist of it is as follows. Certain well-known code hosting sites, such as GitHub and Bitbucket, have built-in support so that you can use them in package paths. You can also directly use VCS URLs, such as ones that end with .git for Git repos. The version control systems Go can work with are not limited to Git; you can also point to Bazaar, Fossil, Mercurial and Subversion repositories. A third, more general remote import mechanism is possible through the use of meta tags on HTML pages. If a page has a specifically formatted meta tag that points to a location that hosts a code repository, the URL of that page can be used in an import path. The details can be found in the importpath documentation mentioned above, or online in the go command documentation.
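
As a rough illustration of the meta tag mechanism: according to go help importpath, the tag has the name go-import, and its content consists of the import prefix, the version control system and the repository root, separated by spaces. The sketch below only assembles such a tag. The metaTag helper, the domain example.org and the repository URL are all made up, and serving the page is left out:

```go
package main

import "fmt"

// metaTag builds a go-import meta tag in the format described in
// go help importpath: import prefix, VCS and repository root.
func metaTag(importPrefix, vcs, repoRoot string) string {
	return fmt.Sprintf(`<meta name="go-import" content="%s %s %s">`,
		importPrefix, vcs, repoRoot)
}

func main() {
	// Hypothetical values: a page served at example.org/myprinter that
	// points to a Git repository hosted elsewhere.
	fmt.Println(metaTag("example.org/myprinter", "git",
		"https://github.com/example/myprinter"))
}
```

A page at https://example.org/myprinter with this tag in its head section would let people import the package as example.org/myprinter, independently of where the repository actually lives.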

This relatively simple scheme does make the import strings longer than usual, but it is actually a nice solution to the perennial problem of specifying which package you are referring to in which import. Since the import path refers also to the location, you will not run into problems using libraries that share a name, and you can easily clone a repository to some other location, and use that version instead of the "canonical" one. Go faced some criticism that tying package names to code hosting sites would centralize package distribution, especially considering the dominance of GitHub in this space, but compared to other package hosting solutions, such as Python's PyPI and the registry of node, Go's solution is actually more decentralized, since one can host a package on many different, easy to set up locations. Go also has a well thought out module proxy protocol; you can read about it in go help goproxy. This proxy protocol enables one to host dependencies without resorting to any public infrastructure with very little pain, as there are multiple independent implementations. You can read up on using a module proxy, and reasons you should host one, in this blog post.

So how do you add a dependency to your project? By simply importing it. Let's say we would like to print our message to the terminal in color using github.com/fatih/color. In order to do so, we first modify myprinter.go to import and use it:

package main

import (
    "github.com/fatih/color"
    "myprinter/pathfinder"
)

func main() {  
    color.Blue(pathfinder.Find())
}

If you now run go install myprinter, you should see go fetch the new dependency and place it in $GOPATH/pkg/mod directory, with subdirectories named in the same scheme as the URL module path. In addition, the go.mod file should get updated, and the following line added:

require github.com/fatih/color v1.9.0

When you add a new dependency as we did right now, and then run a go subcommand (such as build, install, test or list), Go will pick the latest stable release version, based on semantic versioning, download it, and add it to go.mod. What Go won't do is to extract and add the dependencies of the new package to go.mod. If you look at the go.mod of the new dependency, you will see that it depends on two other packages, but these are not in the updated go.mod of our module. This is in contrast to pip in Python, for example, where all dependencies will be spit out if you do a pip freeze. If you remove a dependency from your code, you can reliably remove it from go.mod by running go mod tidy. As we will see later, there is one more file that is edited when new dependencies are added, but before that we need to discover the go get command.

Installing and updating software with go get

We have seen how one can build locally available code with go install and go build. What if we want to install a command, such as goversion, which gives information on the compilation context of a binary? The command we need is go get. It accepts the same URL of the package that you would use in an import statement. Using go get, we can install goversion either from Github, or from the domain of its developer (rsc.io), which redirects to the correct repository. Let's opt for the latter:

go get rsc.io/goversion

This will download, compile and install the executable as $GOBIN/goversion. If you run go get from within the module directory, you will see that it has been added to the go.mod file as a dependency, but with the comment indirect at the end of the line. An indirect dependency of a module is one that is not directly visible from the code. Using go get to install an executable is one way to get such a dependency; the other is manually updating a dependency of a dependency (called a transitive dependency), which is also done with go get.

In semantic versioning, version numbers are specified in the format MAJOR.MINOR.PATCH. As we will see later, in Go, major version changes are never done automatically; they are pretty much treated as a different module. One can use Go tooling, however, to view and apply minor and patch updates. We saw the go list command above; we can use it to view all the dependencies of a module, including the transitive ones, with go list -m all. There is another very useful flag that adds update status information to this output; running go list -m -u all will list, for each dependency, the current version and the available update. What do you do if you find a dependency in there and don't know how it got in? There is a command for that, too; go mod why -m MODULE will figure out the shortest path to that module through your dependencies and print it.

Our toy module depends on github.com/fatih/color, which has been frozen for a while and did not have its dependencies updated. When I run go list -m -u all, I can see that there are a number of dependencies with available updates. In such a situation, we have in principle three options: update one single dependency, update all transitive dependencies stemming from a direct dependency, or update all dependencies. Go allows all of these. The first one, updating a single dependency, can be achieved with e.g. go get github.com/mattn/go-isatty; this will update to the highest version in the currently used major version (i.e. minor and patch updates). If you want to update to a specific version instead, you can do this by specifying the version explicitly, as in go get github.com/mattn/go-isatty@v0.0.13. Keep in mind that Go always expects that single-letter v prefix wherever a version has to be specified; @0.0.13 will not work. The version here can also be provided as @latest, which will mean the highest version under the current major version.
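The selection rule described above can be sketched in a few lines of Go. This is a minimal illustration and not the algorithm of the go command: the names parse, less and highestInMajor are made up for this example, pre-release and build suffixes are rejected, and v0 and v1 are treated as separate majors although they share an import path in practice.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// version holds the numeric parts of a vMAJOR.MINOR.PATCH string.
type version struct{ major, minor, patch int }

// parse accepts only the plain vMAJOR.MINOR.PATCH form. The leading "v" is
// mandatory, mirroring the rule that @0.0.13 is not a valid version query.
func parse(s string) (version, error) {
	if !strings.HasPrefix(s, "v") {
		return version{}, fmt.Errorf("%q: missing v prefix", s)
	}
	parts := strings.Split(s[1:], ".")
	if len(parts) != 3 {
		return version{}, fmt.Errorf("%q: want vMAJOR.MINOR.PATCH", s)
	}
	var n [3]int
	for i, p := range parts {
		v, err := strconv.Atoi(p)
		if err != nil || v < 0 {
			return version{}, fmt.Errorf("%q: bad number %q", s, p)
		}
		n[i] = v
	}
	return version{n[0], n[1], n[2]}, nil
}

// less reports whether a sorts before b, comparing numerically.
func less(a, b version) bool {
	if a.major != b.major {
		return a.major < b.major
	}
	if a.minor != b.minor {
		return a.minor < b.minor
	}
	return a.patch < b.patch
}

// highestInMajor returns the highest candidate that shares the major version
// of current. Candidates that fail to parse are skipped.
func highestInMajor(current string, candidates []string) (string, error) {
	cur, err := parse(current)
	if err != nil {
		return "", err
	}
	best, bestStr := cur, current
	for _, c := range candidates {
		v, err := parse(c)
		if err != nil || v.major != cur.major {
			continue
		}
		if less(best, v) {
			best, bestStr = v, c
		}
	}
	return bestStr, nil
}

func main() {
	// Hypothetical list of published tags; "1.11.0" lacks the v prefix.
	available := []string{"v1.8.0", "v1.10.0", "v1.9.0", "v2.0.0", "1.11.0"}
	got, err := highestInMajor("v1.7.0", available)
	fmt.Println(got, err)
}
```

Note that v1.10.0 wins over v1.9.0 because the parts are compared as numbers and not as strings, and that v2.0.0 is never considered.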

The second option, updating all transitive dependencies stemming from a direct dependency, can be achieved using the -u flag. If we run go get -u github.com/fatih/, go will fetch the next valid update version for all the dependencies of this one dependency, and update them. If you want to run only patch updates, you can use -u=patch. The last action, updating all dependencies, can be done by omitting all arguments, and running go get -u at the base of the module. With any of these update commands, if you also append -t, test dependencies will also be updated.

Whichever way we update indirect dependencies, the new versions will be tagged as indirect in go.mod. The next time myprinter is built, these new versions of the transitive dependencies will be used, overriding the dependencies in github.com/fatih/color. In case the latter is updated, however, obviating the need for the indirect dependency, the next go command will remove the indirect dependency from go.mod. If you want to do this explicitly, you can run go mod tidy.

File hash checking

As already mentioned, there is another file in addition to go.mod that is changed when dependencies of a module change. This file is go.sum, which contains the cryptographic hash of each module, even the transitive dependencies that were not included in go.mod. An error will be raised if the contents of a module do not hash to this value, which is first saved when the dependency is added. In fact, even if you haven't installed a module before, Go will check its hash against a central database to make sure the code has not been modified (or manipulated, if you are so inclined) since the version was published. The URL of this service is stored in the GOSUMDB environment variable, with the default value sum.golang.org. If this environment variable's value is off, or if the go command is called with the -insecure flag (also turning off HTTPS certificate validation), checksum validation is skipped. The check is done lazily, only when a module is downloaded. If you want to make sure that the locally cached dependencies have not been tampered with, and have the same sum as when they were downloaded, you can run the command go mod verify.
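To get a feeling for what such a hash covers, here is a small sketch loosely modeled on the h1: scheme visible in go.sum: every file is hashed with SHA-256, the sorted list of "hash, name" lines is hashed again, and the result is base64 encoded. The function hashFiles and the in-memory file map are assumptions made for this example; the go command works on the files of the downloaded module, and its exact input format should be looked up in its documentation.

```go
package main

import (
	"crypto/sha256"
	"encoding/base64"
	"fmt"
	"sort"
)

// hashFiles computes one digest over a set of files given as name -> content.
func hashFiles(files map[string][]byte) string {
	names := make([]string, 0, len(files))
	for name := range files {
		names = append(names, name)
	}
	// Sort the names so the result does not depend on map iteration order.
	sort.Strings(names)

	summary := sha256.New()
	for _, name := range names {
		sum := sha256.Sum256(files[name])
		// One line per file: hex digest, two spaces, file name.
		fmt.Fprintf(summary, "%x  %s\n", sum, name)
	}
	return "h1:" + base64.StdEncoding.EncodeToString(summary.Sum(nil))
}

func main() {
	module := map[string][]byte{
		"go.mod":   []byte("module example.com/demo\n"),
		"color.go": []byte("package demo\n"),
	}
	before := hashFiles(module)

	// Changing a single byte in any file changes the overall hash.
	module["color.go"] = []byte("package demo \n")
	after := hashFiles(module)

	fmt.Println(before)
	fmt.Println(after)
	fmt.Println("unchanged:", before == after)
}
```

Because the digest depends on every file name and every file content, a single stored line is enough to detect any modification of a module version.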

These correctness checks might sound a tad excessive (they are definitely much more detailed than the ones I'm used to from other languages), but they are direct results of the priorities set in the design discussion of the Go build system. These priorities are discussed in detail in the blog post Reproducible, Verifiable, Verified Builds, where it's explained that the Go build mechanism should provide builds that have the following three properties:

  • Reproducible: When repeated, a go install or go build will create the same result.

  • Verifiable: A build artefact should record information on how exactly it was produced.

  • Verified: The build process should check that the expected source code packages are being used.

The use of go.mod and go.sum as explained above enables reproducible and verified builds. In order to make build output verifiable, the Go compiler packs the necessary build information into its output. We can use the goversion tool that we installed above to print this information. By default, goversion only prints the Go version with which a binary has been built, but it can be made to print the complete build context. If you run it on our little executable with $GOBIN/goversion -mh $GOBIN/myprinter, you should get something similar to the following:

/home/ulas/go/bin/myprinter go1.14
    path  myprinter
    mod   myprinter                      (devel)
    dep   github.com/fatih/color         v1.9.0                              h1:8xPHl4/q1VyqGIPif1F+1V3Y3lSmrq01EabUW3CoW5s=
    dep   github.com/mattn/go-colorable  v0.1.4                              h1:snbPLB8fVfU9iwbbo30TPtbLRzwWu6aJS6Xh4eaaviA=
    dep   github.com/mattn/go-isatty     v0.0.11                             h1:FxPOTFNqGkuDUGi3H/qkUbQO4ZiBa2brKq5r0l8TGeM=
    dep   golang.org/x/sys               v0.0.0-20191026070338-33540a1f6037  h1:YyJpGZS1sBuBCzLAR1VEpK193GlqGZbnPFnPV/5Rsb4=

Given a Go binary, a user has complete access to the build context. I find the design of the build system rather impressive, as it strictly adheres to clear principles without compromising on usability. Especially for mission-critical applications that need to be testable with different dependency configurations, and debuggable deep into the dependency tree, Go offers a very convincing toolchain without burdening the developer with too many tools and commands.

Replacing packages with local copies

One thing that I frequently do in Python is open the code of a dependency and edit it or add debug statements while developing my own code. If you use virtual environments, the Python tool for isolating dependency contexts, this is particularly easy, as it would affect only a single such environment. How would one go about doing this in Go? One could fiddle around with the code in the package cache, but this is not recommended practice, and it will break the hash validation. In fact, the source files of dependencies downloaded by go are not even editable on my computer. The supported way of doing this would be to use the replace feature of go.mod. One can tell the module system, through a line in the go.mod file, that a local directory should be used for satisfying a dependency instead of downloading it. Let's say that I checked out github.com/fatih/color locally to /home/ulas/code/color, made a couple of changes to it, and would like to make sure it works with our sample repo. I can tell go to use this local checkout with the following command:

go mod edit -replace=github.com/fatih/color=/home/ulas/code/color

This will add the following line to go.mod:

replace github.com/fatih/color => /home/ulas/code/color

One can of course add this line manually, instead of using a command. Now, when we build myprinter, the local code checkout will be used. This replacement can be removed either by removing the replace directive from go.mod, or with the following command:

go mod edit -dropreplace=github.com/fatih/color

Import paths and major versions

Go takes semantic versioning rather seriously. The idea behind the major version in semantic versioning is that it signifies backwards-incompatible changes. Go treats such different major versions as different modules; you can import different major versions of a module, refer to them in the same package namespace, and have multiple references to different major versions in go.mod. This is called semantic import versioning. In order to demonstrate this, I have forked github.com/syohex/gowsay, turned it into a library instead of an executable, and added two versions to it. Version v1.0.0 is pretty straightforward: gowsay.MakeCow accepts a string to wrap and an options struct. Version v2.0.2 (I had to up the version a couple of times because I didn't get things right) improves the interface by exporting enumerations for the cow types and accepting one as an argument. There are two things you have to pay attention to when writing a library for external use (or rather, two things that I didn't pay attention to, and that cost me time). The first is that the module name in go.mod should be the same as how you would refer to it when used, i.e. with the repository path. In the case of gowsay the module name has to be github.com/afroisalreadyinu/gowsay. The other thing is that the version tag has to start with a v; otherwise go will not recognize it as a valid version, and will simply use the latest state of the repo. Now let's use gowsay in our demo codebase, by modifying main.go to look like this:

package main

import (  
    "github.com/afroisalreadyinu/gowsay"
    "github.com/fatih/color"
    "myprinter/pathfinder"
)

func main() {  
    path := pathfinder.Find()
    message, err := gowsay.MakeCow(path, gowsay.Mooptions{})
    if err != nil {
        message = path
    }
    color.Blue(message)
}

We see an example of error handling the Go way here; gowsay.MakeCow has multiple return values, with the second one being an error. If this error is not nil, we print only the path, and not the cow-wrapped path. If you now do a go install, you should see the following new line in the require section of go.mod:

github.com/afroisalreadyinu/gowsay v1.0.0

Although there are two versions, Go automatically picks version v1.0.0, and not the highest version. Conceptually, the basic module path github.com/afroisalreadyinu/gowsay always refers to version 1.

Updating major versions

What if we want to use gowsay version 2? The solution the designers of Go have come up with is building the version into the import path. That is, if we import gowsay as github.com/afroisalreadyinu/gowsay/v2, any following command such as go install myprinter will download and compile version v2.0.2. A subtle and important point when changing the major version of a library you are working on is that you have to make sure to change the module name in go.mod. For version v1.0.0 of gowsay, for example, the first line of go.mod will simply be the following:

module github.com/afroisalreadyinu/gowsay

When we tag and release the next major release, we have to change this line to the following:

module github.com/afroisalreadyinu/gowsay/v2

Otherwise, go will complain with a message similar to the following:

go get github.com/afroisalreadyinu/gowsay@v2.0.2:
github.com/afroisalreadyinu/gowsay@v2.0.2: invalid version: module contains a
go.mod file, so major version must be compatible: should be v0 or v1, not v2

Another way to update the go.mod file and use the next major version of a dependency is to use go get with a higher version. But you have to be careful here: If you simply run go get github.com/afroisalreadyinu/gowsay@v2.0.2, you will get the same error message as above. The reason is that github.com/afroisalreadyinu/gowsay refers to major version 1. Go will check out version v2.0.2, look for the module name without the v2 and, failing at this, issue an error message. The path has to carry the major version suffix here as well, as in go get github.com/afroisalreadyinu/gowsay/v2@v2.0.2.
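The rule behind this error message can be written down as a small check: a module path without a suffix covers v0 and v1, and a path ending in /vN covers exactly major version N. The sketch below is my own simplified rendering of that rule, with made-up function names; it ignores special cases such as gopkg.in paths and pre-release versions.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// majorOf extracts the major number from a version such as "v2.0.2".
func majorOf(version string) (int, error) {
	if !strings.HasPrefix(version, "v") {
		return 0, fmt.Errorf("%q: missing v prefix", version)
	}
	head := strings.SplitN(version[1:], ".", 2)[0]
	return strconv.Atoi(head)
}

// pathMajor returns the major version implied by a module path: N for a
// trailing /vN with N >= 2, and 1 for a path without such a suffix.
func pathMajor(path string) int {
	i := strings.LastIndex(path, "/v")
	if i < 0 {
		return 1
	}
	n, err := strconv.Atoi(path[i+2:])
	if err != nil || n < 2 {
		return 1
	}
	return n
}

// compatible reports an error when version cannot belong to the module path.
func compatible(path, version string) error {
	major, err := majorOf(version)
	if err != nil {
		return err
	}
	want := pathMajor(path)
	if want == 1 && major <= 1 {
		return nil // the unsuffixed path covers both v0 and v1
	}
	if want == major {
		return nil
	}
	return fmt.Errorf("%s@%s: path and major version do not match", path, version)
}

func main() {
	base := "github.com/afroisalreadyinu/gowsay"
	fmt.Println(compatible(base, "v1.0.0"))
	fmt.Println(compatible(base, "v2.0.2"))
	fmt.Println(compatible(base+"/v2", "v2.0.2"))
}
```

Only the second call returns an error, which corresponds to the failing go get invocation described above.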

Once you import version 2.0.2 of gowsay, you can use the new interface, referring to the new version using the same name as before:

package main

import (  
    "github.com/afroisalreadyinu/gowsay/v2"
    "github.com/fatih/color"
    "myprinter/pathfinder"
)

func main() {  
    path := pathfinder.Find()
    message, err := gowsay.MakeCow(path, gowsay.BeavisZen, gowsay.Mooptions{})
    if err != nil {
        message = path
    }
    color.Blue(message)
}

It wouldn't be the case with our silly gowsay library, but if you felt the need to refer to different major versions of a library in the same go file, you can definitely do that. One of the imports, however, has to be prefixed with an alternative reference, so that the names do not clash, as in the following example:

import (  
    "github.com/afroisalreadyinu/gowsay"
    gowsayTwo "github.com/afroisalreadyinu/gowsay/v2"
)

The Go module system has a number of other features that we will not go into in detail here, such as vendoring, where the code depended on is stored in the repository. The best place to read up on these is the Modules page of the Go wiki on Github, which is exhaustive as far as I can judge. I would highly recommend at least skimming through that page, in order to have an idea of the tools that are available, and get a glimpse of the versatility Go offers.

Testing

Testing is an integral part of Go, as one would expect of a language of our times. Beyond built-in support at the language and library level for automated testing, there are multiple tools for putting tests to use in various ways. A good starting point for testing in Go is the output of go help test. We can demonstrate Go testing facilities by adding a relatively useless test to our myprinter module. In terms of where to put the test code, our options would be either a separate directory, which would make matching code to tests very difficult, or having test files next to the code they exercise. The latter would put us in a difficult situation, since test code would have to be in the same namespace as functional code due to the one namespace per directory rule. In fact, Go allows a separate namespace for tests through a built-in exception. Any file that matches the pattern *_test.go is considered a test file. These files are excluded when normal application code is built. When you run go test, however, test files are compiled and linked against the application code. Test files can also have the package name package_test, where package is the package name of the application code. We can demonstrate this by putting the following into a file named pathfinder_test.go in the pathfinder directory of our mini-project:

package pathfinder_test

import "testing"

func TestFind(t *testing.T) {  
    t.Fail()
}

If you now switch to the pathfinder subdirectory and run go test, you should see a report like the following.

--- FAIL: TestFind (0.00s)
FAIL
exit status 1
FAIL    myprinter/pathfinder    0.002s

As with many other commands, go test will use the package in the current directory if no argument is supplied. If we wanted to run the failing test from the base directory, we would need to call the command as go test myprinter/pathfinder. What if we want to run all the tests in a project? One might expect go test myprinter to work, but that refers only to the myprinter base package; the way one can refer to all subpackages of a package is by using the ellipsis, as in go test myprinter/....

There is an interesting feature of the go test runner. If you run the same non-failing tests consecutively without modifying relevant code, the tests will actually not be run; you will see a (cached) in the output next to the test's name. This is a great feature that lets you run all the tests of a package without unnecessary overhead, but in case you want to override the cache, you can enforce running them by using the -count option, as in go test myprinter/... -count 1. This option enables setting the exact number of times a set of tests is run.

A test that fails without a decent output is of course quite useless; we need assertions that provide more information. Go doesn't come with an assertions library, interestingly, but there are excellent third party alternatives. One widely used open-source package is github.com/stretchr/testify/assert. This library has many useful tools for writing better tests; you should definitely have a look at the readme. We can improve our test by asserting that pathfinder.Find does not return an empty string, which might be the case if the underlying filepath.Abs call fails:

package pathfinder_test

import (  
    "github.com/stretchr/testify/assert"
    "myprinter/pathfinder"
    "testing"
)

func TestFind(t *testing.T) {  
    assert.NotEqual(t, "", pathfinder.Find())
}

Useful test options

The go test command has quite a few tricks up its sleeve, helping you get the most out of automated tests. The flags of go test are documented under go help testflag; don't be surprised if you can't find them under go help test. Among the arguments, the -count argument was already mentioned; this is very useful when you are trying to debug intermittently failing tests. If you want to run only a single test, you can use the -run option. This option accepts a regular expression and runs only the tests matching it. When you are running multiple tests, the test run will continue even if there are failing tests. You can override this behavior, and have the test run stop when a test fails, by supplying the -failfast option.
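One thing that is easy to miss about -run is that the regular expression is not anchored, so -run Find selects every test with Find anywhere in its name. The following sketch imitates this selection with the regexp package; selectTests is a helper written for this example, and all test names except TestFind are invented.

```go
package main

import (
	"fmt"
	"regexp"
)

// selectTests returns the names that pattern matches anywhere in the name,
// which is how an unanchored regular expression behaves.
func selectTests(pattern string, names []string) ([]string, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return nil, err
	}
	var selected []string
	for _, name := range names {
		if re.MatchString(name) {
			selected = append(selected, name)
		}
	}
	return selected, nil
}

func main() {
	names := []string{"TestFind", "TestFindRelative", "TestPrint"}
	for _, pattern := range []string{"Find", "^TestFind$", "Print|Relative"} {
		selected, err := selectTests(pattern, names)
		fmt.Println(pattern, selected, err)
	}
}
```

To run exactly one test you therefore need to anchor the expression yourself, as in go test -run '^TestFind$'.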

Coverage analysis of tests is built into the test tool; you can enable it with the -cover flag. The coverage analysis tooling of Go is quite intricate and versatile; you can read the details in this blog post from the time of its release. Using only the -cover option will make go print the percentage of statements covered in the package targeted by a test file. If you want to get a detailed analysis of which lines were covered, you have to use the -coverprofile option to provide a filename in which coverage analysis will be saved. For example, we can do a coverage analysis of our pet project with the following command:

go test ./... -coverprofile=cover.out

The resulting cover.out is a text file that can be turned into a nice HTML page using the command go tool cover -html=cover.out. Running this command will open a browser window with a colorful display. Lines covered will be in green, whereas lines skipped will be in red. You can also see the exact number of times a line was called by running the test with the -covermode=count option. When this option is used, the intensity of the green will actually change depending on how many times a line was executed; you can also see the exact count by hovering over a line. The default value of the covermode option is set, which records only whether a line was run at all. The third and last option is atomic, which can be used in parallel tests, and which we will deal with in the second part of this tutorial.
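The difference between the three modes becomes clearer when you picture what the instrumentation does: roughly speaking, a counter update is placed at the start of every block of statements, and the mode decides what that update looks like. The following hand-instrumented function is a sketch of this idea under that assumption; the recorder type and the mark method are invented for the example and are not what the cover tool actually generates.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// recorder keeps one counter per instrumented block.
type recorder struct {
	mode string
	hits []uint32
}

// mark records that a block was entered, in the manner of the given mode.
func (r *recorder) mark(block int) {
	switch r.mode {
	case "set":
		r.hits[block] = 1 // only remembers that the block ran
	case "count":
		r.hits[block]++ // counts runs, not safe for concurrent use
	case "atomic":
		atomic.AddUint32(&r.hits[block], 1) // counts runs, safe for concurrent use
	}
}

// abs is instrumented by hand: each block starts with a mark call.
func abs(r *recorder, x int) int {
	r.mark(0)
	if x < 0 {
		r.mark(1)
		return -x
	}
	r.mark(2)
	return x
}

// run calls abs on all inputs and returns the collected counters.
func run(mode string, inputs []int) []uint32 {
	r := &recorder{mode: mode, hits: make([]uint32, 3)}
	for _, x := range inputs {
		abs(r, x)
	}
	return r.hits
}

func main() {
	inputs := []int{3, 5, 7} // no negative input, so block 1 never runs
	for _, mode := range []string{"set", "count", "atomic"} {
		fmt.Println(mode, run(mode, inputs))
	}
}
```

In all three modes the counter for the negative branch stays at zero, which is what would show up as a red line in the HTML report; only count and atomic tell you that the other blocks ran three times.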

You might wonder how tests are run, considering that Go is a compiled language and all code that runs must be packed into an executable. This is what Go does behind the curtains; tests are compiled in a per-package manner into executables in temporary directories and executed. You can achieve the same thing with the -c flag; in our module, if you switch to the pathfinder subdirectory and run go test -c, you should end up with an executable named pathfinder.test. This is not only useful, but pretty much necessary if you want to use a debugger (more on these later) to debug your tests.

One last useful option to go test worth mentioning here is -race, which enables the built-in race detector. We will look at the concurrency features of Go in the second part; this option will be covered when the topic comes up.

There are two more areas handled by Go's testing module: Benchmarking and example code. We will not go into the details of these here, but keep in mind that there is extensive support for these in the standard tools, and you don't need to roll out your own.

Further Go tools

It is possible to speak of three levels of Go tools: The ones that are first-order subcommands of the go command, those that are available under go tool, and those that need to be installed with go get. We have dealt with those in the first group, such as fmt, build, list above already. The second group comprises a set of tools directed to more fine-grained compilation, analysis and debugging of Go programs. You can get a list if you run go tool. Covering all of these subcommands is beyond the scope of this tutorial, but you can have a quick look at the documentation for a command CMD with go doc cmd/CMD. Most of them are relevant for more involved work with the Go compiler and the language; we saw one, go tool cover, which can be used to convert coverage report output to html. Another important go tool subcommand is pprof, which is used for displaying profiling output.

Viewing documentation in the browser with godoc

Among the third-party tools for working with Go code, a couple are very useful for daily work. The first of these is the godoc tool, not to be confused with go doc. Whereas go doc prints documentation, godoc runs a server with documentation for all the packages that can be found in the standard library and installed modules. After installing it with go get golang.org/x/tools/cmd/godoc, you can start it with $GOBIN/godoc -http=:6060, the argument providing the location to listen at. If you now go to http://localhost:6060/ on your computer, you should see a web page looking very similar to the official Go documentation. This is way better than trying to figure out the right path to a symbol on the command line. Another useful feature of godoc is the -index flag that makes the documentation searchable. When called with this argument, a search box will be available on the top right of the page.

goimports

One tool I find very useful is goimports, which makes it much easier to work with Go import statements. Because unused imports are an error in Go, one frequently needs to add and remove imports to a file as one tries things out. Particularly annoying is adding print statements with fmt.Printf, having the compiler tell you that you need to import it, removing the same statement after resolving the issue, and then having the compiler tell you that you now have an unused import. goimports is a tool that solves this issue by adding and removing the respective imports. After installing it with go get golang.org/x/tools/cmd/goimports, you can use it as a gofmt replacement, since it takes care of the imports in addition to running gofmt. In case you are using Emacs, integrating it with the default Go mode is as simple as setting it as the formatter with (setq gofmt-command "path/to/goimports").

One remarkable thing here is that a tool like goimports is possible because the language is so simple and strict. In Python, for example, in order to figure out what a file imports, you pretty much have to execute it, as an import statement can happen anywhere. It is actually common practice to do an import within a function to beat circular imports, which are prohibited in Go. In Go, imports are allowed only in the header; that's why one can automate handling them, or create dependency graphs and analysis. That is, it was the thinking that went into the design of Go that allows tools like these to be written.
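To illustrate how little is needed to read the imports of a Go file, here is a short sketch using the go/parser package from the standard library. With the parser.ImportsOnly mode, parsing stops after the import declarations, so the rest of the file is never looked at, let alone executed. The helper listImports and the sample source are made up for this example.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
	"strconv"
)

// listImports returns the import paths of a Go source file without
// compiling or running it.
func listImports(src string) ([]string, error) {
	fset := token.NewFileSet()
	// parser.ImportsOnly stops parsing after the import declarations.
	file, err := parser.ParseFile(fset, "example.go", src, parser.ImportsOnly)
	if err != nil {
		return nil, err
	}
	var paths []string
	for _, spec := range file.Imports {
		// The path is stored as a quoted string literal.
		path, err := strconv.Unquote(spec.Path.Value)
		if err != nil {
			return nil, err
		}
		paths = append(paths, path)
	}
	return paths, nil
}

// sample is a small source file used as input.
const sample = `package main

import (
	"fmt"
	color "github.com/fatih/color"
)

func main() { color.Blue(fmt.Sprint("hi")) }
`

func main() {
	paths, err := listImports(sample)
	fmt.Println(paths, err)
}
```

A tool like goimports obviously does much more than this, but the starting point is the same: the imports of a file can be determined from its text alone.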

errcheck

We saw an example of error handling in Go above: A function can return multiple values, one of which can be an error. The calling code has the responsibility to read this error value, and handle it accordingly. What frequently happens is that either the return values are not read at all, because the function is called as a bare statement, or the error is bound to the blank identifier _. Although this might make sense in some contexts, you want to avoid it as much as possible in production code. errcheck is a Go tool for detecting cases of error return values not being handled. You can install it with go get github.com/kisielk/errcheck, and call it in the same manner as other go tools, e.g. with errcheck ./... at the base of a module to check all packages. By default, errcheck will report only on the cases where the return value of a function with an error is not matched at all; passing in the -blank flag will make it also report cases where error values are matched to the blank identifier.
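The three situations look like this in code. The functions parsePort and validate are hypothetical helpers written for this example; the comments state what errcheck is expected to report according to the description above.

```go
package main

import (
	"fmt"
	"strconv"
)

// parsePort is a hypothetical helper returning a value and an error.
func parsePort(s string) (int, error) {
	port, err := strconv.Atoi(s)
	if err != nil {
		return 0, fmt.Errorf("invalid port %q: %w", s, err)
	}
	return port, nil
}

// validate is a hypothetical helper returning only an error.
func validate(port int) error {
	if port < 1 || port > 65535 {
		return fmt.Errorf("port %d out of range", port)
	}
	return nil
}

func main() {
	// 1. Return value discarded entirely: reported by errcheck by default.
	validate(70000)

	// 2. Error bound to the blank identifier: reported only with -blank.
	ignored, _ := parsePort("http")
	fmt.Println("silently got", ignored)

	// 3. Error handled: nothing to report.
	port, err := parsePort("8080")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	if err := validate(port); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("listening on", port)
}
```

The first two cases compile without any complaint, which is exactly why a separate tool is useful here.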

Debugging

The last topic we will touch upon is debugging Go programs. A considerable subset of developers shun using debuggers, especially for compiled languages like Go or C, but mastering a debugger definitely pays back in reduced debugging time, even if you consider only the time spent adding new print statements and recompiling. Go does not include a built-in debugger, and opts for exporting debug symbols and providing lightweight support for GDB, the GNU debugger. Since the debug symbols are exported by default when building a go executable, you can start debugging our toy module with gdb $GOBIN/myprinter, once you have installed it. You will get a curious message when GDB starts; it will either tell you that a file named runtime-gdb.py has failed to load due to a configuration error, or that it has been loaded. This file, the only Python file in the Go source repository, is a GDB extension responsible for integrating Go types and concepts (such as goroutines) with GDB. If it could not be loaded, you can follow the directions in the initial output of GDB to enable it.

I will not go into the details of using GDB with Go; you can read up on it on the Go blog. You will recognize, however, that even this post on the official Go blog recommends the third-party alternative Delve, instead of GDB. Delve, a Go debugger written in Go, is in fact much easier to use, as it is integrated into the Go toolchain, and more complete. First, install it with go get github.com/go-delve/delve/cmd/dlv. To debug a Go executable, simply navigate to the main package directory (in our toy module the base directory) and run $GOBIN/dlv debug. You can also debug your tests, by switching into the appropriate directory and running dlv test. Once the debugger is started, there are a number of commands available. All frequently used commands have two forms: standard and a short alias. The most useful commands and their aliases are break (b) to set a breakpoint, continue (c) to continue execution until a breakpoint or termination, next (n) to execute one source line, print (p) to print the value of a variable, and list (ls or l) to show code. When you start delve with dlv debug, you land in the initialization of the executable; list will show you some go runtime C. You can land at the beginning of your program by setting a breakpoint there with b main.main, and continuing until the breakpoint with c. Delve will run the code until the start of the main function and print the surrounding code context. When debugging our toy module, you could for example enter n twice after the beginning of main.main is reached, landing at the line after path is set, and then print the value of this variable with p path. This is a very simple example of what delve can provide; I would highly recommend reading the getting started guide, and going through the commands listed when you enter help at the delve console.

Both GDB and Delve are CLI debuggers. If you are more into visual debuggers, you can use one of the popular IDEs with Go integration, such as VSCode or Goland from JetBrains. Unfortunately, I'm not familiar with any of these, but a quick Google search shows that they can be used as debuggers for Go.

In the next episode

In this first part of this tutorial, we looked at the reasons Go was developed, the fundamental ideas behind its design, and the tooling packed mostly into the go command and some other third-party packages. You should now be ready to write, test, combine, debug and package Go code using these standard tools. In the next part of this tutorial, we will go into more detail of what kind of code to actually write.

Resources

  • Go at Google: Language Design in the Service of Software Engineering provides the reasoning behind the design of Go, the trade-offs and explicit non-goals. It's a great resource for understanding which problems the designers wanted to solve, and why they left certain things out. A similar, but less systematic text is Less is exponentially more, which details the very early driving guidelines and design decisions that went into Go.

  • The design of Go dependency management was discussed over a number of blog posts which are linked to on this page; I would definitely recommend [??]. Once the design was finalized and implemented, it was announced over a couple of posts on the official Go blog; the first post has links to the further ones. These posts discuss practical aspects of working with modules and dependency management. If you want to read the intricate details and questions, these are discussed extensively on the wiki of the go project on Github.

  • Go tooling in action is an excellent screencast on developing and improving Go code with standard and a couple of third-party tools, especially improving performance using pprof, go-torch and go-wrk. A nice display of how fast Go code, especially web services, can be made to perform.

  • An Overview of Go's Tooling is an excellent tutorial that covers a lot of similar ground to this post. I actually picked up quite a few tricks from it. It also covers a couple of topics such as compiler options and benchmarking that are not covered here.

  • Go's Tooling is an Undervalued Technology is an enthusiastic look at a couple of aspects of Go tooling. It covers topics skipped in this post, such as vendoring.