Introduction

This tutorial goes through the details of how Go builds binaries and uses linked packages. We assume that the reader has installed the language tools and has been able to create and run a basic application. The purpose of the post is to help the reader grok the internals of the language and select the proper package management strategy.

Let’s follow the example from the installation instructions and create a simple application in our $GOPATH directory.

$ tree
.
└── src
    └── github.com
        └── orloffm
            └── hello
                └── hello.go

And the content of the file should be:

package main

import "fmt"

func main() {
        fmt.Printf("hello, world\n")
}

Packaging in Go

First, a few words about Go project structure.

All source code in Go should be kept under $GOPATH/src. The subdirectory where the code files reside inside $GOPATH/src/ (in our case, github.com/orloffm/hello) is called an import path. Import paths identify applications and libraries in the Go world. This approach works because most programs are developed on GitHub (or equivalent services), so URL-based identification is transparent, reliable and unique.

The code files directly under the import path comprise the package. Every directory is a package. The type of the package, application or library, is defined by the package clause at the top of the code file.

  • If it is set to main, the package is treated as an application. It gets compiled into an executable and stored inside the $GOPATH/bin directory.
  • All other values mean the package is a library. It will be compiled into a non-executable file and stored inside $GOPATH/pkg/<cpu_architecture>/ (for example, linux_amd64). More on this later; we will use zzzz as an example.
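
To make the rule concrete, here is a minimal sketch that reads only the package clause of a source file and reports how the package would be treated. It uses the standard go/parser package; the helper names packageName and kind are our own, made up for illustration.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
)

// packageName parses only the package clause of a Go source file
// and returns the declared package name.
func packageName(src string) (string, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "hello.go", src, parser.PackageClauseOnly)
	if err != nil {
		return "", err
	}
	return f.Name.Name, nil
}

// kind reports how the Go tools treat a package with the given name
// (hypothetical helper: "main" means application, anything else library).
func kind(name string) string {
	if name == "main" {
		return "application"
	}
	return "library"
}

func main() {
	for _, src := range []string{"package main\n", "package zzzz\n"} {
		name, err := packageName(src)
		if err != nil {
			fmt.Println("parse error:", err)
			continue
		}
		fmt.Printf("package %s: %s\n", name, kind(name))
	}
}
```

Running it should report main as an application and zzzz as a library.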

Since applications are final products and are not used in other projects, we usually mean libraries when we say the word "packages". That’s why libraries are compiled into the pkg directory.

Packages can contain multiple files, but all of them should be in the same directory and all should declare the same package name. One can think of them as a single document split into multiple files: all functions from all files can call each other, and everything will be compiled together as a whole.

The common-sense approach is to give your library package the same name as your GitHub repository. For example, the Cobra framework is developed at https://github.com/spf13/cobra, and all files in the repository’s root start with package cobra.

One can reference packages in code files using the import directive and their import paths. In the sample, we only import fmt. This package is pre-built and is stored at $GOROOT/pkg/<cpu_architecture>/fmt.a ($GOROOT is most likely /usr/local/go) along with the other built-in first-party Go packages.

Simple application

Let’s return to our simple application. It has only a single code file.

We can use three go commands to compile our package: build, install and run.

Run

go run hello.go compiles the listed files into a temporary executable and runs it. No binary files are left in the project directory.

$ cd src/github.com/orloffm/hello/
$ go run hello.go
hello, world

If the package consists of multiple files, you need to pass all of them as arguments, for example go run *.go.

Libraries cannot be run. This is what we get if we switch package from main to zzzz:

go run: cannot run non-main package

⇒ Use this command to quickly execute the code file.

Build

go build <import path> works at the package level. It compiles the package into a binary. If the package is an application, the executable is written to the current directory; for a library, the result is discarded.

Imported packages are also compiled if needed.

⇒ Use this command to test if the project builds.

Install

go install <import path> does what build does, but it also always stores the binaries in the $GOPATH structure.

⇒ Use install to create binaries.

Now, to examples.

Applications

Let’s install our simple app.

go install github.com/orloffm/hello

Here we have the resulting executable.

$ tree
.
├── bin
│    └── hello
└── src
    └── github.com
        └── orloffm
            └── hello
                └── hello.go

Note that its name is defined not by the name of the source file hello.go, but by the last part of the import path (because packages may contain multiple files).

We can now run the program by name:

$ hello
hello, world

Libraries

Had we run the same command with package changed to zzzz, we would’ve gotten the following.

$ tree
.
├── pkg
│    └── linux_amd64
│        └── github.com
│            └── orloffm
│                └── hello.a
└── src
    └── github.com
        └── orloffm
            └── hello
                └── hello.go

Note that the binary’s name is hello.a, not zzzz.a. Again, this is the last part of the import path.

Target directories

To summarise, on a 64-bit Ubuntu machine:

  • Standard binaries are stored in:
    • Applications (there are three: go, godoc and gofmt) — in $GOROOT/bin/.
    • Libraries (fmt.a, bytes.a etc) — in $GOROOT/pkg/<cpu_architecture>/.
  • All other binaries are built into:
    • Applications — $GOPATH/bin/.
    • Libraries — $GOPATH/pkg/<cpu_architecture>/.

This is in stark contrast with the approach of classical programming languages, where every project builds into its own target directory. The Go compiler looks only in these directories when building packages.
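
The naming rules for non-standard packages can be summed up in a few lines of Go. This is only a sketch of the layout described in this post, not the go tool's own code. The function name installTarget and its parameters are ours, and details such as the .exe suffix on Windows are ignored.

```go
package main

import (
	"fmt"
	"path"
)

// installTarget returns the file that go install is expected to produce
// under the GOPATH layout described above. platform is the name of the
// OS/architecture directory, for example "linux_amd64".
func installTarget(gopath, platform, importPath, pkgName string) string {
	if pkgName == "main" {
		// Applications are named after the last element of the import path.
		return path.Join(gopath, "bin", path.Base(importPath))
	}
	// Libraries keep the full import path, plus ".a", under pkg/<platform>.
	return path.Join(gopath, "pkg", platform, importPath+".a")
}

func main() {
	const gopath = "/home/ubuntu/work"
	fmt.Println(installTarget(gopath, "linux_amd64", "github.com/orloffm/hello", "main"))
	fmt.Println(installTarget(gopath, "linux_amd64", "github.com/orloffm/hello", "zzzz"))
}
```

With the paths used in this post, the two lines printed are /home/ubuntu/work/bin/hello and /home/ubuntu/work/pkg/linux_amd64/github.com/orloffm/hello.a, matching the trees shown earlier. Note that the package name only selects between bin and pkg; the file name always comes from the import path.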

Using other packages

Now, how do we work with the third-party packages that we’ve imported? Let’s follow the earlier example of the imported github.com/spf13/cobra package and change our simple application to this:

package main

import (
        "fmt"
        _ "github.com/spf13/cobra"
)

func main() {
        fmt.Printf("hello, world\n")
}

We don’t use anything from the package, so to avoid the compiler error, we add _ before the import path. Currently the build fails:

$ go build github.com/orloffm/hello
src/github.com/orloffm/hello/hello.go:5:2: cannot find package "github.com/spf13/cobra" in any of:
    /usr/local/go/src/github.com/spf13/cobra (from $GOROOT)
    /home/ubuntu/work/src/github.com/spf13/cobra (from $GOPATH)
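
The error message spells out the lookup order: first $GOROOT/src, then the src directory of $GOPATH. A minimal sketch of that order follows. The function name searchDirs is ours, and we assume that $GOPATH may hold several entries that are tried in turn.

```go
package main

import (
	"fmt"
	"path"
)

// searchDirs lists the directories checked for importPath, in order:
// first $GOROOT/src, then the src directory of every $GOPATH entry.
func searchDirs(goroot string, gopaths []string, importPath string) []string {
	dirs := []string{path.Join(goroot, "src", importPath)}
	for _, gp := range gopaths {
		dirs = append(dirs, path.Join(gp, "src", importPath))
	}
	return dirs
}

func main() {
	dirs := searchDirs("/usr/local/go", []string{"/home/ubuntu/work"}, "github.com/spf13/cobra")
	for _, d := range dirs {
		fmt.Println(d)
	}
}
```

For our machine this prints the same two directories that the failed build listed.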

Manual

The manual approach would consist of:

  1. Recursively finding all dependencies. (In this case, cobra uses another package by the same author, pflag.)
  2. Cloning each of them into $GOPATH/src/<import path>.

Something like:

$ cd $GOPATH/src
$ git clone --depth=1 http://github.com/spf13/cobra.git github.com/spf13/cobra
$ git clone --depth=1 http://github.com/spf13/pflag.git github.com/spf13/pflag

Now github.com/orloffm/hello will build. go build will compile it together with cobra and pflag, but will write only the hello executable to the current directory and keep no library binaries. go install will create all binaries: in $GOPATH/pkg for the dependent libraries and in $GOPATH/bin for our application.
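
Step 1 of the manual approach, finding all dependencies recursively, is a plain graph walk. Below is a sketch under two assumptions of ours: the import graph is written out by hand and heavily simplified (in real life it would be read from the source files), and a package counts as standard when the first element of its import path contains no dot. The names isStandard and thirdPartyDeps are made up for this example.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// isStandard is a heuristic, not an official rule: standard library
// import paths such as "fmt" have no dot in their first element.
func isStandard(importPath string) bool {
	first := importPath
	if i := strings.Index(importPath, "/"); i >= 0 {
		first = importPath[:i]
	}
	return !strings.Contains(first, ".")
}

// thirdPartyDeps returns every non-standard package reachable from root,
// walking the import graph breadth-first and visiting each package once.
func thirdPartyDeps(root string, imports map[string][]string) []string {
	seen := map[string]bool{root: true}
	queue := []string{root}
	var deps []string
	for len(queue) > 0 {
		pkg := queue[0]
		queue = queue[1:]
		for _, imp := range imports[pkg] {
			if isStandard(imp) || seen[imp] {
				continue
			}
			seen[imp] = true
			deps = append(deps, imp)
			queue = append(queue, imp)
		}
	}
	sort.Strings(deps)
	return deps
}

func main() {
	// Hand-written, simplified import graph for illustration only.
	imports := map[string][]string{
		"github.com/orloffm/hello": {"fmt", "github.com/spf13/cobra"},
		"github.com/spf13/cobra":   {"fmt", "os", "github.com/spf13/pflag"},
		"github.com/spf13/pflag":   {"fmt", "strings"},
	}
	for _, dep := range thirdPartyDeps("github.com/orloffm/hello", imports) {
		fmt.Println("clone into $GOPATH/src/" + dep)
	}
}
```

The walk yields cobra and pflag, the two repositories we cloned by hand.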

Get command

go get automates the work described in the previous section. We can simply execute

$ go get github.com/spf13/cobra

and get the binaries of cobra and its dependency pflag installed. More specifically, this command:

  1. Downloads the latest source code of the specified package (e.g. from a GitHub repository) into the $GOPATH/src folder, keeping its whole directory structure.
  2. Recursively downloads the latest source code of the other packages specified in import directives in the files.
  3. Installs all downloaded packages. This can be disabled by the -d command line argument.

In other words, if there is a package developed on GitHub, running this command gets you its local compiled binary.
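
The second step relies on reading import directives from source files. We don't reproduce go get's real implementation here; the sketch below only shows that the standard go/parser package is enough to extract the import paths of a file. The helper name importsOf is ours.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
	"strconv"
)

// importsOf returns the import paths declared in a Go source file.
// It parses only up to the end of the import declarations.
func importsOf(src string) ([]string, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "hello.go", src, parser.ImportsOnly)
	if err != nil {
		return nil, err
	}
	var paths []string
	for _, spec := range f.Imports {
		// The path is stored as a quoted string literal, so unquote it.
		p, err := strconv.Unquote(spec.Path.Value)
		if err != nil {
			return nil, err
		}
		paths = append(paths, p)
	}
	return paths, nil
}

// helloSrc is the hello.go file from this post.
const helloSrc = `package main

import (
	"fmt"
	_ "github.com/spf13/cobra"
)

func main() {
	fmt.Printf("hello, world\n")
}
`

func main() {
	paths, err := importsOf(helloSrc)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, p := range paths {
		fmt.Println(p)
	}
}
```

For our hello.go this lists fmt and github.com/spf13/cobra; repeating the same step on every downloaded package gives the recursion.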

Problems

As one can see, a particular version of cobra hasn’t been mentioned anywhere. This obviously leads to two issues:

  1. Non-reproducible builds. Our application may work today, but the next day cobra could change, and we won’t be able to download its exact version:
    • The import directive does not support versioning, so we won’t know which version we were using at all.
    • go get does not support versions, so even if we noted the proper version, we’d have to download it and all its dependencies manually.
  2. Since, as we saw above, all binaries are put into a single shared directory structure, it is impossible to work with multiple versions of the same package in different projects.

Solutions before mid 2015

The first problem could’ve been easily solved by a third-party package manager that replaced go get. It would store the required versions of the dependencies in a custom separate file and download them on request.

For the second problem, there’ve been several approaches.

  1. Import path rewriting. This means changing import paths in the source code and downloading dependencies to those changed paths.
  2. Different $GOPATH paths for every project.

Obviously, both these approaches are far from a proper workflow.

Vendor directory

Go has supported vendoring since version 1.5 (mid 2015), and it is the answer to both problems. (In 1.5 it was an experiment that had to be enabled by setting GO15VENDOREXPERIMENT=1; it has been on by default since 1.6.) Basically, it means that if there is a directory named vendor inside some import path, it will serve as a higher-priority replacement for $GOPATH/src for that particular package. This allows us to add this vendor directory to source control and always keep the preferred versions of our dependencies.

Let’s do it for our example. First, clean things up.

$ cd $GOPATH
$ rm -rf bin pkg src/github.com/spf13

Now manually download the packages into the vendor directory.

$ cd $GOPATH/src/github.com/orloffm/hello
$ mkdir vendor && cd vendor
$ git clone --depth=1 http://github.com/spf13/cobra.git github.com/spf13/cobra
$ git clone --depth=1 http://github.com/spf13/pflag.git github.com/spf13/pflag

So we have a clean source tree:

$ tree -L 8
.
└── src
    └── github.com
        └── orloffm
            └── hello
                ├── hello.go
                └── vendor
                    └── github.com
                        └── spf13
                            ├── cobra
                            └── pflag

OK, we have the dependencies in the vendor directory. Now let’s install our application.

$ go install github.com/orloffm/hello
$ tree -L 9
.
├── bin
│    └── hello
├── pkg
│    └── linux_amd64
│        └── github.com
│            └── orloffm
│                └── hello
│                    └── vendor
│                        └── github.com
│                            └── spf13
│                                ├── cobra.a
│                                └── pflag.a
└── src (omitted)

Note that the dependent packages are built into subdirectories of $GOPATH/pkg/linux_amd64/github.com/orloffm/hello/vendor/. This means that they do not affect any other project in any way. But we saw above that the location of the binary is defined by the package’s import path. Is this an exception?

Actually, no. Cobra is imported in hello.go as github.com/spf13/cobra, but if the compiler finds its source in the vendor directory, as it did in our example, it will internally treat the Cobra package as having the import path github.com/orloffm/hello/vendor/github.com/spf13/cobra. So, if Cobra were also used by some other package X in hello’s dependency tree that resolved it from a different location, X wouldn’t be able to use the Cobra objects created by our application, because X would see Cobra under a different import path. We must therefore have a single copy of every dependency’s code when building the package.
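
The lookup can be modelled roughly as follows. This is a simplified sketch, assuming the nearest vendor directory up the tree wins and the plain import path is the fallback; it stops at the first path element and leaves out the other rules of the real lookup. The names vendorCandidates and resolve are ours, and the exists map stands in for the file system.

```go
package main

import (
	"fmt"
	"path"
)

// vendorCandidates lists the internal import paths tried when the package
// at importer (an import path under $GOPATH/src) imports importPath.
// The nearest vendor directory comes first, the plain import path last.
func vendorCandidates(importer, importPath string) []string {
	var out []string
	for dir := importer; dir != "." && dir != "/"; dir = path.Dir(dir) {
		out = append(out, path.Join(dir, "vendor", importPath))
	}
	return append(out, importPath)
}

// resolve returns the first candidate present in exists, or "" if none is.
func resolve(importer, importPath string, exists map[string]bool) string {
	for _, c := range vendorCandidates(importer, importPath) {
		if exists[c] {
			return c
		}
	}
	return ""
}

func main() {
	// The source tree from our example, as paths relative to $GOPATH/src.
	exists := map[string]bool{
		"github.com/orloffm/hello/vendor/github.com/spf13/cobra": true,
		"github.com/orloffm/hello/vendor/github.com/spf13/pflag": true,
	}
	hello := "github.com/orloffm/hello"
	cobra := resolve(hello, "github.com/spf13/cobra", exists)
	fmt.Println(cobra)
	// cobra itself imports pflag; the lookup starts from cobra's directory.
	fmt.Println(resolve(cobra, "github.com/spf13/pflag", exists))
}
```

In this model both cobra and pflag resolve to paths under github.com/orloffm/hello/vendor/, which is why their compiled files ended up under the matching pkg subdirectory.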

It is worth noting that Go supports nested vendor directories; that is, vendored packages might also have a vendor directory inside, and it will all be handled properly. Obviously, using this functionality only increases the mess discussed in the previous paragraph.

Thus we arrive at the following conclusions:

  1. Every dependency should be vendored.
  2. All packages should be stored inside the vendor directory of our package.
  3. They should be stored in a flat structure (i.e. without nested vendor directories).
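
The third rule is easy to check mechanically. A small sketch, assuming we already have a list of package directories relative to $GOPATH/src; the helper names vendorDepth and nested are ours, and the nested pflag copy in the example is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// vendorDepth counts how many path elements of dir are named "vendor".
func vendorDepth(dir string) int {
	n := 0
	for _, elem := range strings.Split(dir, "/") {
		if elem == "vendor" {
			n++
		}
	}
	return n
}

// nested returns the directories that sit below more than one vendor
// directory, i.e. the ones that break the flat structure.
func nested(dirs []string) []string {
	var out []string
	for _, d := range dirs {
		if vendorDepth(d) > 1 {
			out = append(out, d)
		}
	}
	return out
}

func main() {
	dirs := []string{
		"github.com/orloffm/hello/vendor/github.com/spf13/cobra",
		// Hypothetical: cobra shipping its own vendored copy of pflag.
		"github.com/orloffm/hello/vendor/github.com/spf13/cobra/vendor/github.com/spf13/pflag",
	}
	for _, d := range nested(dirs) {
		fmt.Println("nested vendor directory:", d)
	}
}
```

Only the second directory is reported, because it lies below two vendor directories.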

Thoughts

Now we can theoretically solve the versioning problem. But since go get does not support vendoring and versions, we need to find a proper tool to do it for us.

What should such a tool do?

  1. Recursively download all dependencies.
  2. Store them in a flat structure in vendor for compilation.
  3. Keep their versions in some file.
  4. Preferably respect the version information left by other tools for other packages.
  5. Use all known version information when downloading packages.
  6. (Optional) Should support Windows for greater compatibility.
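
To give an idea of what "keep their versions in some file" could look like, here is a sketch of a version file and its round trip through JSON. The format is entirely hypothetical: the type names, field names and the placeholder revision are ours and are not taken from any of the tools reviewed below.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Dependency pins one vendored package to a revision.
type Dependency struct {
	ImportPath string `json:"importPath"`
	Revision   string `json:"revision"`
}

// Manifest is a hypothetical version file; no real tool uses this format.
type Manifest struct {
	Package      string       `json:"package"`
	Dependencies []Dependency `json:"dependencies"`
}

// encode renders the manifest as indented JSON.
func encode(m Manifest) (string, error) {
	b, err := json.MarshalIndent(m, "", "  ")
	return string(b), err
}

// decode reads a manifest back from JSON.
func decode(s string) (Manifest, error) {
	var m Manifest
	err := json.Unmarshal([]byte(s), &m)
	return m, err
}

func main() {
	m := Manifest{
		Package: "github.com/orloffm/hello",
		Dependencies: []Dependency{
			// Placeholder revisions; a real tool would record commit hashes.
			{ImportPath: "github.com/spf13/cobra", Revision: "REVISION-PLACEHOLDER"},
			{ImportPath: "github.com/spf13/pflag", Revision: "REVISION-PLACEHOLDER"},
		},
	}
	s, err := encode(m)
	if err != nil {
		fmt.Println("encode error:", err)
		return
	}
	fmt.Println(s)
}
```

A tool that satisfies points 3 to 5 would write such a file next to the source code and read the equivalent files of the dependencies before downloading anything.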

Package managers

Let’s go through the existing package managers and review them. A list of available tools can be found at https://github.com/golang/go/wiki/PackageManagementTools#go15vendorexperiment. We’ll look only at those that can lock versions.

godep

Website: https://github.com/tools/godep, starred by 3710.

Dates back to pre-vendor times. Does not recover the dependencies; it is meant to copy dependencies from $GOPATH and check them in. Doesn’t seem to work with vendor properly.

glide

Website: https://github.com/Masterminds/glide, starred by 2345.

Automatically creates a list of nested dependencies. Flattens vendor directories with -s -v argument.

Pros: Supports other tools’ config files.

Cons: Sometimes glide install fails on Windows.

gom

Website: https://github.com/mattn/gom, starred by 1126.

gopm

Website: https://github.com/gpmgo/gopm, starred by 1062.

Does not seem to support vendor.

govendor

Website: https://github.com/kardianos/govendor, starred by 804.

Is meant to copy dependencies from $GOPATH and check them in; it can also fetch from remote locations. Does not do so automatically based on the source files. Flattens the vendor directories of referenced packages. Stores configuration as vendor/vendor.json.

Cons: Stores configuration inside the vendor directory, so the whole directory cannot simply be listed in .gitignore.

goop

Website: https://github.com/nitrous-io/goop, starred by 764.

Does not support vendor.

govend

Website: https://github.com/govend/govend, starred by 150.

With the --prune flag removes unused files, which includes nested vendor directories. Does not respect the vendored versions of dependencies. Does not support configs in dependencies. Fails on Windows.

bunch

Website: https://github.com/dkulchenko/bunch, starred by 71.

Does not support vendor.

trash

Website: https://github.com/rancher/trash, starred by 66.

Downloads dependencies according to its config file. The config file should be created and maintained manually. Also supports configs from glide. Removes nested vendor directories.

Conclusion

Comparison

Tool     | Creates vendor from scratch    | Flattens vendor | Config support in deps          | Windows
glide    | Yes, glide init                | With -s -v      | Itself, Godep, gb, gom, and GPM | Bugs
govendor | No, manual invocation          | Yes             | Yes                             |
gom      | Yes, gom gen gomfile           | Keeps           | No                              |
govend   | Yes, govend -v -l              | With --prune    | No                              | Bugs
bunch    | No                             |                 |                                 |
godep    | No                             |                 |                                 |
trash    | Partial, manual config editing | No, removes it  | No                              | No

Currently, it seems, the only two proper tools are govendor and glide. glide looks better because it auto-creates the configuration file and supports other tools’ configuration files in dependencies.