The unsafe Package in Golang

2016/10/22, by @TapirLiu

The unsafe standard package in Golang is a special package. Why? This article will explain it in detail.

The Warnings From Go Official Documents

The unsafe package docs say:

Packages that import unsafe may be non-portable and are not protected by the Go 1 compatibility guidelines.

And the Go 1 compatibility guidelines say:

Packages that import unsafe may depend on internal properties of the Go implementation. We reserve the right to make changes to the implementation that may break such programs.

Yes, the package name already implies that the unsafe package is unsafe. But how dangerous is it? Let's first look at the role of the unsafe package.

The Role Of The unsafe Package

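Up to now (Go 1.7), the unsafe package contains the following resources:

three functions:
- func Alignof(x ArbitraryType) uintptr
- func Offsetof(x ArbitraryType) uintptr
- func Sizeof(x ArbitraryType) uintptr
and one type:
- type Pointer *ArbitraryType

Here, ArbitraryType is not a real type; it is just a placeholder.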

Unlike most functions in Golang, calls to the above three functions are always evaluated at compile time instead of run time. This means their results can be assigned to constants.

(BTW, the functions in the unsafe package are not the only ones whose calls are evaluated at compile time. Calls to the builtin len and cap functions are also evaluated at compile time when the argument is an array value and the expression contains no function calls or channel receives.)
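
The compile-time evaluation described above can be checked with a small sketch. The record type and the buf variable are hypothetical names made up for this illustration; the point is only that every right-hand side below is accepted in a const declaration.

```go
package main

import (
	"fmt"
	"unsafe"
)

// record is a hypothetical struct used only for this illustration.
type record struct {
	id   int64
	flag bool
}

// buf is a hypothetical array variable; its length is part of its type.
var buf [16]byte

// These are constant declarations, so each right-hand side
// has to be evaluated by the compiler, not at run time.
const (
	int64Size = unsafe.Sizeof(int64(0))      // size of an int64 in bytes
	idOffset  = unsafe.Offsetof(record{}.id) // offset of the first field
	bufLen    = len(buf)                     // len of an array value
)

func main() {
	fmt.Println(int64Size, idOffset, bufLen) // 8 0 16
}
```

If any of these calls were evaluated at run time, the const block would be rejected by the compiler.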

Besides the three functions, the only type in the unsafe package, Pointer, also serves the compiler.

For safety reasons, Golang doesn't allow direct conversions between:

values of two different pointer types, for example, *int64 and *float64.
values of a pointer type and uintptr.

But with the help of unsafe.Pointer, we can break Go's type and memory safety and make the above conversions possible. How? Let's read the rules listed in the docs of the unsafe package:

A pointer value of any type can be converted to an unsafe.Pointer.
An unsafe.Pointer can be converted to a pointer value of any type.
A uintptr can be converted to an unsafe.Pointer.
An unsafe.Pointer can be converted to a uintptr.

These rules are consistent with the Go spec:

Any pointer or value of underlying type uintptr can be converted to a Pointer type and vice versa.

The rules show that unsafe.Pointer is much like void* in the C language. And yes, void* in C is dangerous!

Under the above rules, for two different types T1 and T2, a *T1 value can be converted to an unsafe.Pointer value, and then the unsafe.Pointer value can be converted to a *T2 value (or a uintptr value). In this way, Go's type and memory safety is bypassed. Surely, this is dangerous if misused.

An example:

package main

import (
	"fmt"
	"unsafe"
)
func main() {
	var n int64 = 5
	var pn = &n
	var pf = (*float64)(unsafe.Pointer(pn))
	// now, pn and pf are pointing at the same memory address
	fmt.Println(*pf) // 2.5e-323
	*pf = 3.14159
	fmt.Println(n) // 4614256650576692846
}

The conversion in this example may be meaningless, but it is safe and legal: int64 and float64 have the same size, and every 8-byte bit pattern is a valid value of both types.

So the resources in the unsafe package serve the Go compiler, and the role of the unsafe.Pointer type is to bypass Go's type and memory safety.

More About unsafe.Pointer and uintptr

Here are some facts about unsafe.Pointer and uintptr:

uintptr is an integer type.
- the data at the address represented by a uintptr variable may be GCed even if the uintptr variable is still alive.
unsafe.Pointer is a pointer type.
- but unsafe.Pointer values can't be dereferenced.
- the data at the address represented by an unsafe.Pointer variable will not be GCed while the unsafe.Pointer variable is still alive.
*unsafe.Pointer is a general pointer type, just like *int etc.
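
The facts above can be seen together in a small sketch. The readInt helper is a hypothetical name for this illustration; it reaches an int through an unsafe.Pointer that is itself reached through a *unsafe.Pointer.

```go
package main

import (
	"fmt"
	"unsafe"
)

// readInt is a hypothetical helper used only for this illustration.
func readInt(x *int) int {
	p := unsafe.Pointer(x) // a pointer value: it keeps *x reachable
	// _ = *p              // would not compile: unsafe.Pointer can't be dereferenced

	pp := &p // pp has type *unsafe.Pointer, an ordinary pointer type
	q := *pp // dereferencing a *unsafe.Pointer yields an unsafe.Pointer

	return *(*int)(q) // convert back to *int before dereferencing
}

func main() {
	x := 42
	fmt.Println(readInt(&x)) // 42

	// A uintptr is just an integer holding the address at this moment.
	// It does not keep x alive and can be used in arithmetic.
	addr := uintptr(unsafe.Pointer(&x))
	fmt.Printf("%#x\n", addr)
}
```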

As uintptr is an integer type, uintptr values support arithmetic operations. So by using uintptr and unsafe.Pointer, we can bypass the restriction that *T values can't be offset in Golang:

package main

import (
	"fmt"
	"unsafe"
)

func main() {
	a := [4]int{0, 1, 2, 3}
	p1 := unsafe.Pointer(&a[1])
	p3 := unsafe.Pointer(uintptr(p1) + 2 * unsafe.Sizeof(a[0]))
	*(*int)(p3) = 6
	fmt.Println("a =", a) // a = [0 1 2 6]
	
	// ...

	type Person struct {
		name   string
		age    int
		gender bool
	}
	
	who := Person{"John", 30, true}
	pp := unsafe.Pointer(&who)
	pname := (*string)(unsafe.Pointer(uintptr(pp) + unsafe.Offsetof(who.name)))
	page := (*int)(unsafe.Pointer(uintptr(pp) + unsafe.Offsetof(who.age)))
	pgender := (*bool)(unsafe.Pointer(uintptr(pp) + unsafe.Offsetof(who.gender)))
	*pname = "Alice"
	*page = 28
	*pgender = false
	fmt.Println(who) // {Alice 28 false}
}

How Dangerous Is The unsafe Package?

About the unsafe package, Ian, one of the core members of the Go team, has confirmed:

the signatures of the functions in the unsafe package will not change in later Go versions,
and the unsafe.Pointer type will always be there in later Go versions.

So, the three functions in the unsafe package don't look dangerous. The Go team leaders even want to put them elsewhere. The only unsafety of these functions is that their calls may return different values in later Go versions. It is hard to call this small unsafety a danger.

It looks like all the dangers of the unsafe package are related to using unsafe.Pointer. The unsafe package docs list some cases of using unsafe.Pointer legally or illegally. Here are just some of the illegal use cases:

package main

import (
	"fmt"
	"unsafe"
)

// case A: conversions between unsafe.Pointer and uintptr 
//         don't appear in the same expression
func illegalUseA() {
	fmt.Println("===================== illegalUseA")
	
	pa := new([4]int)
	
	// split the legal use
	// p1 := unsafe.Pointer(uintptr(unsafe.Pointer(pa)) + unsafe.Sizeof(pa[0]))
	// into two expressions (illegal use):
	ptr := uintptr(unsafe.Pointer(pa))
	p1 := unsafe.Pointer(ptr + unsafe.Sizeof(pa[0]))
	// "go vet" will make a warning for the above line:
	// possible misuse of unsafe.Pointer
	
	// the unsafe package docs, https://golang.org/pkg/unsafe/#Pointer,
	// consider the above splitting illegal.
	// the current Go compiler and runtime (1.7.3) can't detect
	// this illegal use.
	// however, to keep your program working in later Go versions,
	// it is best to comply with the unsafe package docs.

	*(*int)(p1) = 123
	fmt.Println("*(*int)(p1)  :", *(*int)(p1))
}	

// case B: pointers are pointing at unknown addresses
func illegalUseB() {
	fmt.Println("===================== illegalUseB")
	
	a := [4]int{0, 1, 2, 3}
	p := unsafe.Pointer(&a)
	p = unsafe.Pointer(uintptr(p) + uintptr(len(a)) * unsafe.Sizeof(a[0]))
	// now p points just past the end of the memory occupied by a.
	// the unsafe docs already treat such a pointer as invalid,
	// although merely creating it goes undetected.
	// writing through it, as below, is clearly illegal.
	*(*int)(p) = 123
	fmt.Println("*(*int)(p)  :", *(*int)(p)) // 123 or not 123
	// the current Go compiler/runtime (1.7.3) and "go vet"
	// do not detect the illegal use here.

	// however, the current Go runtime (1.7.3) does
	// detect the illegal use in the code below and panics.
	p = unsafe.Pointer(&a)
	for i := 0; i <= len(a); i++ {
		*(*int)(p) = 123 // the Go runtime (1.7.3) never panicked here in the tests

		fmt.Println(i, ":", *(*int)(p))
		// panics at the above line in the last iteration, when i == 4:
		// runtime error: invalid memory address or nil pointer dereference
		
		p = unsafe.Pointer(uintptr(p) + unsafe.Sizeof(a[0]))
	}
}

func main() {
	illegalUseA()
	illegalUseB()
}

It is hard for compilers to detect illegal unsafe.Pointer uses in a Go program. Running "go vet" can help find some potential ones, but not all of them. The same goes for the Go runtime, which also can't detect all illegal uses. Illegal unsafe.Pointer uses may make a program crash or behave weirdly (sometimes normally and sometimes abnormally). This is why using the unsafe package is dangerous.
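
For contrast with the illegal cases, here is a minimal sketch of pointer arithmetic that follows the documented pattern: both conversions and the arithmetic stay in one expression, and the result still points into the original array. The elemPtr helper is a hypothetical name for this illustration; in real code, &a[i] is the normal way to write this.

```go
package main

import (
	"fmt"
	"unsafe"
)

// elemPtr returns a pointer to a[i], computed with pointer arithmetic.
// It is a hypothetical helper for illustration only.
func elemPtr(a *[4]int, i int) *int {
	// Refuse offsets that would leave the array, so the result
	// never points at unknown memory (case B above).
	if i < 0 || i >= len(a) {
		panic("elemPtr: index out of range")
	}
	// The uintptr never gets stored in a variable (case A above).
	return (*int)(unsafe.Pointer(uintptr(unsafe.Pointer(a)) + uintptr(i)*unsafe.Sizeof(a[0])))
}

func main() {
	a := [4]int{0, 1, 2, 3}
	*elemPtr(&a, 3) = 6
	fmt.Println(a) // [0 1 2 6]
}
```

The explicit bounds check is a design choice of this sketch: it restores by hand the guarantee that ordinary indexing gives for free.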

Conversion of a *T1 to *T2

For a conversion of a *T1 to unsafe.Pointer, and then to *T2, the unsafe package docs say:

Provided that T2 is no larger than T1 and that the two share an equivalent memory layout, this conversion allows reinterpreting data of one type as data of another type.

This definition of "equivalent memory layout" is somewhat vague, and it looks like the Go team deliberately keeps it vague. This makes using the unsafe package more dangerous.

As the Go team is not willing to give an accurate definition here, this article doesn't attempt to either. Here, a small number of confirmed legal use cases are listed:

Legal Use Case 1: conversions between []T and []MyT

In this example, we will use int as T:

type MyInt int

In Golang, []int and []MyInt are two different types, and the underlying type of each is itself. So values of []int can't be converted to []MyInt, and vice versa. But with the help of unsafe.Pointer, the conversions are possible:

package main

import (
	"fmt"
	"unsafe"
)

func main() {
	type MyInt int
	
	a := []MyInt{0, 1, 2}
	// b := ([]int)(a) // error: cannot convert a (type []MyInt) to type []int
	b := *(*[]int)(unsafe.Pointer(&a))
	
	b[0] = 3
	
	fmt.Println("a =", a) // a = [3 1 2]
	fmt.Println("b =", b) // b = [3 1 2]
	
	a[2] = 9
	
	fmt.Println("a =", a) // a = [3 1 9]
	fmt.Println("b =", b) // b = [3 1 9]
}

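Legal Use Case 2: calls to pointer-related functions in the sync/atomic package

Most of the parameter and result types of the following functions in the sync/atomic package are either unsafe.Pointer or *unsafe.Pointer:

- func CompareAndSwapPointer(addr *unsafe.Pointer, old, new unsafe.Pointer) (swapped bool)
- func LoadPointer(addr *unsafe.Pointer) (val unsafe.Pointer)
- func StorePointer(addr *unsafe.Pointer, val unsafe.Pointer)
- func SwapPointer(addr *unsafe.Pointer, new unsafe.Pointer) (old unsafe.Pointer)

To use these functions, the unsafe package must be imported.
NOTE: *unsafe.Pointer is a general pointer type, so a value of *unsafe.Pointer can be converted to unsafe.Pointer, and vice versa.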

package main

import (
	"fmt"
	"log"
	"math/rand"
	"sync"
	"sync/atomic"
	"time"
	"unsafe"
)

var data *string

// get data atomically
func Data() string {
	p := (*string)(atomic.LoadPointer(
		(*unsafe.Pointer)(unsafe.Pointer(&data)),
	))
	if p == nil {
		return ""
	}
	return *p
}

// set data atomically
func SetData(d string) {
	atomic.StorePointer(
		(*unsafe.Pointer)(unsafe.Pointer(&data)),
		unsafe.Pointer(&d),
	)
}

func main() {
	var wg sync.WaitGroup
	wg.Add(200)
	
	for range [100]struct{}{} {
		go func() {
			time.Sleep(time.Second * time.Duration(rand.Intn(1000)) / 1000)
			
			log.Println(Data())
			wg.Done()
		}()
	}
	
	for i := range [100]struct{}{} {
		go func(i int) {
			time.Sleep(time.Second * time.Duration(rand.Intn(1000)) / 1000)
			s := fmt.Sprint("#", i)
			log.Println("====", s)
			
			SetData(s)
			wg.Done()
		}(i)
	}
	
	wg.Wait()
	
	fmt.Println("final data = ", *data)
}
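
The example above only uses LoadPointer and StorePointer. As a further sketch, CompareAndSwapPointer from the same package can implement a "set at most once" value. The config variable and the setOnce function are hypothetical names made up for this illustration.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"unsafe"
)

// config is a hypothetical shared value that should be set at most once.
var config *string

// setOnce stores s only if config is still nil,
// and reports whether this call was the one that set it.
func setOnce(s string) bool {
	return atomic.CompareAndSwapPointer(
		(*unsafe.Pointer)(unsafe.Pointer(&config)),
		nil,
		unsafe.Pointer(&s),
	)
}

func main() {
	var wg sync.WaitGroup
	var wins int32

	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			if setOnce(fmt.Sprint("#", i)) {
				atomic.AddInt32(&wins, 1)
			}
		}(i)
	}
	wg.Wait()

	// Only one goroutine can swap nil for its own pointer.
	fmt.Println("winners:", wins) // winners: 1
}
```

Which goroutine wins depends on scheduling, so the stored string differs between runs, but the number of winners does not.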

Conclusions

The unsafe package serves the Go compiler rather than the Go runtime.
The package is named unsafe to remind you to use it carefully.
Using unsafe.Pointer is not always a bad idea; sometimes we must use it.
The type system of Golang is designed for both safety and efficiency, but safety is more important than efficiency in the Go type system. Generally Go is performant, but sometimes the safety guarantees really cause some inefficiencies in Go programs. The unsafe package lets experienced programmers remove these inefficiencies by breaking the safety of the Go type system, safely.
Again, the unsafe package can certainly be misused, and misusing it is dangerous.
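
As one sketch of such an inefficiency, converting a []byte to a string normally copies the bytes. The bytesToString helper below (a hypothetical name, not from the original article) avoids the copy by reinterpreting the slice value as a string value. It assumes that a string is laid out like the first two words (data pointer and length) of a slice, which holds for the standard gc compiler but is not promised by the language spec.

```go
package main

import (
	"fmt"
	"unsafe"
)

// bytesToString reinterprets b as a string without copying.
// Assumption: string and slice values share the same leading layout.
// The caller must not modify b afterwards, because the returned
// string shares its memory and strings are expected to be immutable.
func bytesToString(b []byte) string {
	return *(*string)(unsafe.Pointer(&b))
}

func main() {
	b := []byte("hello")
	s := bytesToString(b)
	fmt.Println(s, len(s)) // hello 5
}
```

The ordinary conversion string(b) stays the safe default; a trick like this is only worth its risk where the copy is a measured bottleneck.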

Reposted from: http://www.tapirgames.com/blog/golang-unsafe