“How do you decide whether a variable should be a pointer or a value?” and “How do you know when to use a pointer receiver or a value receiver for a method?” are among the most common questions I receive from Go newcomers. There is no official rule that dictates which types should use a pointer and which should use a value, but the topic of pointer and value semantics in Go is worth a deep dive, and that is what we’ll do in this article. We’ll look at some examples, especially from the Go standard packages, examine which semantic they use, and then come up with a general guideline for choosing the semantic for our own types.
Value and pointer
First, a quick recap of what a value and a pointer mean in the context of this article. A Go variable can hold a value, such as a string. It can also hold an address, which “points” to something (or to nothing at all!). A pointer typically occupies 8 bytes on a 64-bit machine, or 4 bytes on a 32-bit machine.
The next question is: why do we need pointers? The short answer is to share data. If sharing is not something your program has to do, you don’t need to create pointer variables at all. In the following program, x, which is an integer value, is shared with the increment function by the main function.
package main

func main() {
	x := 1
	increment(&x)
}

func increment(x *int) {
	*x++
}
The following is another version of increment that, in contrast, uses the value semantic. In this version x is copied to increment’s stack frame, not shared.
package main

func main() {
	x := 1
	x = increment(x)
}

func increment(x int) int {
	x++ // x++ is a statement in Go, so it cannot be returned directly
	return x
}
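To see the two semantics side by side, here is a minimal sketch. The names incrementPointer and incrementValue are made up for this comparison; they are not part of the programs above.

```go
package main

import "fmt"

// incrementPointer mutates the caller's variable through the shared address.
func incrementPointer(x *int) {
	*x++
}

// incrementValue works on its own copy and hands the result back.
func incrementValue(x int) int {
	x++
	return x
}

func main() {
	a := 1
	incrementPointer(&a) // a itself is changed
	fmt.Println(a)       // 2

	b := 1
	c := incrementValue(b) // b is copied; only c holds the new value
	fmt.Println(b, c)      // 1 2
}
```

With the pointer version the caller has to remember that a may change behind its back. With the value version the change is only visible through the return value.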
It’s crucial to know the trade-offs between these two semantics. A pointer allows a function further down the call stack to mutate a variable, which can make the program harder to reason about. A value, on the other hand, results in a data copy, which may be slow (if the data is large enough) or sometimes unsafe (for example, a strings.Builder panics when a copy of it is written to after the original has been used, and a sync.WaitGroup must not be copied after first use, so it will not work correctly if passed by value across functions).
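As an illustration of a type that does not tolerate copying, the sketch below copies a strings.Builder that is already in use and then writes to the copy. The helper name writeToCopy is hypothetical; it recovers from the panic only so that the program can report it.

```go
package main

import (
	"fmt"
	"strings"
)

// writeToCopy copies a used strings.Builder by value and writes to the copy.
// It reports whether that write panicked.
func writeToCopy() (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()

	var original strings.Builder
	original.WriteString("hello")

	copied := original          // value copy of a Builder that is already in use
	copied.WriteString("world") // the Builder detects the copy and panics
	return false
}

func main() {
	fmt.Println(writeToCopy()) // true
}
```

Passing a *strings.Builder instead shares the one Builder and avoids the problem.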
Mixing pointer and value semantics for the same type is rarely a good idea, because it hurts readability and makes the code harder to reason about. Here’s our user type and its methods.
// some methods use value while others use pointer
// DON'T DO THIS
type user struct{}
func (u user) funcA() { ... }
func (u *user) funcB() { ... }
func (u *user) funcC() { ... }
Some methods can mutate user while the others cannot. It is a nuisance to read a code base in which we have to guess which methods mutate user and which don’t.
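A small sketch makes the guessing game concrete. The name field and both method names are assumptions added for illustration; the two methods have the same body and differ only in their receiver.

```go
package main

import "fmt"

type user struct {
	name string
}

// renameByValue has a value receiver: it changes a copy of the user.
func (u user) renameByValue(name string) {
	u.name = name
}

// renameByPointer has a pointer receiver: it changes the caller's user.
func (u *user) renameByPointer(name string) {
	u.name = name
}

func main() {
	u := user{name: "john"}

	u.renameByValue("jane")
	fmt.Println(u.name) // john

	u.renameByPointer("jane")
	fmt.Println(u.name) // jane
}
```

At the call site the two calls look identical, which is exactly why a mixed method set is hard to read.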
Is there any general guideline we can follow to ensure we’re using value and pointer effectively? Fortunately the answer is yes!
Rule of thumb
To keep our sanity, here’s a rule of thumb we can follow:
- For basic types (like string, int, bool, etc.), always use value semantic. It’s generally ok to copy variables of these types.
- For reference types (slice, map, channel, interface, and function), use value semantic. It may seem confusing that value semantic is ok for a slice, for example, because copying an entire slice would not be efficient at all. In fact, when a slice value is passed to a function, it is not the underlying array that is copied into the function’s stack frame but the slice header (a pointer to the backing array, a length, and a capacity), which is far smaller.
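The slice header behaviour described above can be sketched as follows. The function names setFirst and appendLocal are made up for this example.

```go
package main

import "fmt"

// setFirst receives a copy of the slice header. The copy still points at the
// same backing array, so the element write is visible to the caller.
func setFirst(s []int, v int) {
	s[0] = v
}

// appendLocal appends to its own copy of the header. The caller's header,
// and therefore its length, is unchanged.
func appendLocal(s []int, v int) {
	s = append(s, v)
}

func main() {
	nums := []int{1, 2, 3}

	setFirst(nums, 10)
	fmt.Println(nums) // [10 2 3]

	appendLocal(nums, 4)
	fmt.Println(len(nums)) // 3
}
```

Only the small header is copied on each call, yet element writes still reach the caller’s data. A function that needs to grow the slice would normally return the new slice instead.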
There are exceptions, however. For example, when decoding or unmarshaling raw data into a map, we need to pass a pointer to the map.
raw := []byte(`{"name":"john"}`)
var data map[string]string
json.Unmarshal(raw, &data)
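A fuller version of the snippet, with the error checked, might look like this. The wrapper name decodeUser is an assumption added for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decodeUser unmarshals raw JSON into a map. json.Unmarshal gets the address
// of the map so that it can allocate the map and store it in our variable.
func decodeUser(raw []byte) (map[string]string, error) {
	var data map[string]string // nil until Unmarshal fills it in
	if err := json.Unmarshal(raw, &data); err != nil {
		return nil, err
	}
	return data, nil
}

func main() {
	data, err := decodeUser([]byte(`{"name":"john"}`))
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(data["name"]) // john
}
```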
Another exception is when we marshal a struct to a JSON object and want to make a field explicitly nullable. In the example below, a nil Email pointer means the user has no email, and omitempty removes the email field from the JSON object in that case.
type User struct {
	Name  string  `json:"name"`
	Email *string `json:"email,omitempty"`
}
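Here is a short sketch of how this struct marshals with and without an email. The toJSON helper and the example address are placeholders, not part of any real API.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type User struct {
	Name  string  `json:"name"`
	Email *string `json:"email,omitempty"`
}

// toJSON marshals a User and returns the result as a string.
func toJSON(u User) (string, error) {
	b, err := json.Marshal(u)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	email := "john@example.com" // placeholder address

	withEmail, _ := toJSON(User{Name: "john", Email: &email})
	withoutEmail, _ := toJSON(User{Name: "john"}) // Email is nil

	fmt.Println(withEmail)    // {"name":"john","email":"john@example.com"}
	fmt.Println(withoutEmail) // {"name":"john"}
}
```

With a plain string field, an empty email and a missing email would be indistinguishable. The pointer lets nil stand for "no email".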
So far we have covered basic and reference types. In the next sections, we’ll look at types from the Go standard packages and at user-defined types, one by one.
Types in standard packages
It’s surprisingly easy to tell which semantic to use for types in the Go standard packages. If we pay attention to a type’s API, especially its mutation methods, we can usually get a good idea of which semantic to use. Let’s look at some examples!
First, let’s take a look at the time package.
package time
// Now returns the current local time.
func Now() Time { ... }
The standard time package provides the Time type, which can be constructed by calling Now. The return type of Now is a value, which is a clear signal that Time follows the value semantic. Looking at Time’s mutation API confirms that.
package time
func (t Time) Add(d Duration) Time { ... }
As can be seen, the Add method returns a new Time value, which is further solid evidence that we should apply the value semantic to this type.
In contrast, when we look at the File type in the os package, we can see that it uses the pointer semantic.
package os
func Open(name string) (*File, error) {
return OpenFile(name, O_RDONLY, 0)
}
func (f *File) Write(b []byte) (n int, err error) { ... }
The Open constructor function (like the OpenFile constructor function) returns a *File, which is a hint that File uses the pointer semantic. This is reinforced by the fact that File’s methods (Write, for instance) all use pointer receivers.
Did the Go team decide which semantic to use at random? Of course not! The choice of the value semantic for Time and the pointer semantic for File is deliberate. Let’s analyze how these types could be used in our code base.
birthday := time.Date(1985, 12, 20, 1, 0, 0, 0, time.Local)
christmasDay := birthday.Add(5 * 24 * time.Hour)
Think about it: if my birthday is on December 20th, adding 5 more days doesn’t leave it as my birthday. It becomes Christmas Day instead. So a Time mutation results in another instance of Time.
f, _ := os.OpenFile("/temp/to_do_list.txt", os.O_CREATE|os.O_RDWR, 0644)
f.Write([]byte("writing my new article"))
On the other side of the scale, if I have a file in which I jot down my to-do list and I add an item to it, it is still the exact same file; the write does not produce a new instance of the file. Therefore, the chosen semantic is pointer.
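The same idea can be sketched with a helper that receives the *os.File and writes to it. The names addItem and writeTodoList are hypothetical, and a temporary file stands in for the to-do list so the example cleans up after itself.

```go
package main

import (
	"fmt"
	"os"
)

// addItem appends one line to the file. It receives the same *os.File the
// caller holds, so the write lands in the caller's file.
func addItem(f *os.File, item string) error {
	_, err := f.WriteString(item + "\n")
	return err
}

// writeTodoList creates a temporary file, adds two items through addItem,
// and returns the file's contents.
func writeTodoList() (string, error) {
	f, err := os.CreateTemp("", "to_do_list_*.txt")
	if err != nil {
		return "", err
	}
	defer os.Remove(f.Name()) // runs after Close
	defer f.Close()

	for _, item := range []string{"write article", "review article"} {
		if err := addItem(f, item); err != nil {
			return "", err
		}
	}

	b, err := os.ReadFile(f.Name())
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	contents, err := writeTodoList()
	if err != nil {
		fmt.Println("failed:", err)
		return
	}
	fmt.Print(contents)
}
```

Both calls to addItem work on one and the same file, so both lines end up in it.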
User-defined types
Knowing how semantics are selected for standard package types, we can apply the same reasoning to our own types. The important question to ask is: if I mutate an instance of this type, is it still the same instance, or will it be a different instance? Mistakes are inevitable in our engineering journey, so it is ok to pick the wrong semantic for a type and make amends later. However, we must be careful when designing an API for external clients. This requires extra thought, because switching from value to pointer, or vice versa, may be a backward-incompatible change for those clients.
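As a sketch of how that question plays out, consider two hypothetical types, money and account. Both are invented for this illustration and follow the same reasoning as Time and File.

```go
package main

import "fmt"

// money is a hypothetical type with the value semantic: adding to an amount
// yields a different amount, just as Time.Add yields a different Time.
type money struct {
	cents int64
}

func (m money) add(other money) money {
	return money{cents: m.cents + other.cents}
}

// account is a hypothetical type with the pointer semantic: after a deposit
// it is still the same account, just as a File is the same file after a Write.
type account struct {
	balance money
}

func (a *account) deposit(m money) {
	a.balance = a.balance.add(m)
}

func main() {
	price := money{cents: 500}
	total := price.add(money{cents: 250})
	fmt.Println(price.cents, total.cents) // 500 750

	acc := &account{}
	acc.deposit(total)
	fmt.Println(acc.balance.cents) // 750
}
```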
Another point to note is that, no matter which semantic is chosen, a user-defined type needs a pointer receiver on the method that implements an unmarshaler or decoder interface (the json.Unmarshaler interface, for instance), because that method has to write the decoded data into the receiver.
Summary
We have come a long way, and here’s a checklist we can refer to when deciding between a pointer and a value:
- For basic types (string, int, bool, etc.) and reference types (slice, map, channel, interface, and function), use value semantic. There are a few exceptions:
  - It must be a pointer when we pass it to an unmarshal or decode function.
  - It’s ok to use a pointer for marshalling purposes (like making a field nullable in a JSON object).
- For the types in standard packages, pay attention to their APIs to know which semantic to use.
- For our own defined types, ask ourselves one question: if I mutate an instance of this type, is it still the same instance, or will it be a different instance? If it’s still the same instance, use pointer. Otherwise, use value. It’s not always possible to get it right in the beginning, so don’t be afraid to switch semantics when the time comes.
- No matter which semantic is chosen for a type, to implement an unmarshaler or decoder interface (like json.Unmarshaler), the method receiver must be a pointer.