【Golang】关于Go语言中Slice扩容策略

一、概述

当切片的容量不足时,我们会调用 runtime.growslice 函数为切片扩容,扩容是为切片分配新的内存空间并拷贝原切片中元素的过程,我们先来看新切片的容量是如何确定的,使用的是 growslice 函数

func growslice(et *_type, old slice, cap int) slice {
	if raceenabled {
		callerpc := getcallerpc()
		racereadrangepc(old.array, uintptr(old.len*int(et.size)), callerpc, funcPC(growslice))
	}
	if msanenabled {
		msanread(old.array, uintptr(old.len*int(et.size)))
	}

	if cap < old.cap {
		panic(errorString("growslice: cap out of range"))
	}

	if et.size == 0 {
		// append should not create a slice with nil pointer but non-zero len.
		// We assume that append doesn't need to preserve old.array in this case.
		return slice{unsafe.Pointer(&zerobase), old.len, cap}
	}

	newcap := old.cap
	doublecap := newcap + newcap
	if cap > doublecap {
		newcap = cap
	} else {
		if old.cap < 1024 {
			newcap = doublecap
		} else {
			// Check 0 < newcap to detect overflow
			// and prevent an infinite loop.
			for 0 < newcap && newcap < cap {
				newcap += newcap / 4
			}
			// Set newcap to the requested cap when
			// the newcap calculation overflowed.
			if newcap <= 0 {
				newcap = cap
			}
		}
	}

	var overflow bool
	var lenmem, newlenmem, capmem uintptr
	// Specialize for common values of et.size.
	// For 1 we don't need any division/multiplication.
	// For sys.PtrSize, compiler will optimize division/multiplication into a shift by a constant.
	// For powers of 2, use a variable shift.
	switch {
	case et.size == 1:
		lenmem = uintptr(old.len)
		newlenmem = uintptr(cap)
		capmem = roundupsize(uintptr(newcap))
		overflow = uintptr(newcap) > maxAlloc
		newcap = int(capmem)
	case et.size == sys.PtrSize:
		lenmem = uintptr(old.len) * sys.PtrSize
		newlenmem = uintptr(cap) * sys.PtrSize
		capmem = roundupsize(uintptr(newcap) * sys.PtrSize)
		overflow = uintptr(newcap) > maxAlloc/sys.PtrSize
		newcap = int(capmem / sys.PtrSize)
	case isPowerOfTwo(et.size):
		var shift uintptr
		if sys.PtrSize == 8 {
			// Mask shift for better code generation.
			shift = uintptr(sys.Ctz64(uint64(et.size))) & 63
		} else {
			shift = uintptr(sys.Ctz32(uint32(et.size))) & 31
		}
		lenmem = uintptr(old.len) << shift
		newlenmem = uintptr(cap) << shift
		capmem = roundupsize(uintptr(newcap) << shift)
		overflow = uintptr(newcap) > (maxAlloc >> shift)
		newcap = int(capmem >> shift)
	default:
		lenmem = uintptr(old.len) * et.size
		newlenmem = uintptr(cap) * et.size
		capmem, overflow = math.MulUintptr(et.size, uintptr(newcap))
		capmem = roundupsize(capmem)
		newcap = int(capmem / et.size)
	}

	// The check of overflow in addition to capmem > maxAlloc is needed
	// to prevent an overflow which can be used to trigger a segfault
	// on 32bit architectures with this example program:
	//
	// type T [1<<27 + 1]int64
	//
	// var d T
	// var s []T
	//
	// func main() {
	//   s = append(s, d, d, d, d)
	//   print(len(s), "\n")
	// }
	if overflow || capmem > maxAlloc {
		panic(errorString("growslice: cap out of range"))
	}

	var p unsafe.Pointer
	if et.ptrdata == 0 {
		p = mallocgc(capmem, nil, false)
		// The append() that calls growslice is going to overwrite from old.len to cap (which will be the new length).
		// Only clear the part that will not be overwritten.
		memclrNoHeapPointers(add(p, newlenmem), capmem-newlenmem)
	} else {
		// Note: can't use rawmem (which avoids zeroing of memory), because then GC can scan uninitialized memory.
		p = mallocgc(capmem, et, true)
		if lenmem > 0 && writeBarrier.enabled {
			// Only shade the pointers in old.array since we know the destination slice p
			// only contains nil pointers because it has been cleared during alloc.
			bulkBarrierPreWriteSrcOnly(uintptr(p), uintptr(old.array), lenmem-et.size+et.ptrdata)
		}
	}
	memmove(p, old.array, lenmem)

	return slice{p, old.len, newcap}
}

二、分配策略

在分配内存空间之前需要先确定新的切片容量,运行时根据切片的当前容量选择不同的策略进行扩容:

  1. 如果期望容量大于当前容量的两倍就会使用期望容量;
  2. 如果当前切片的长度小于 1024 就会将容量翻倍;
  3. 如果当前切片的长度大于 1024 就会每次增加 25% 的容量,直到新容量大于期望容量

三、代码验证

1、测试代码1

package main

import (
	"fmt"
	"time"
)

func main() {
	vals := []string{"a", "b", "c"}
	oldCapacity := cap(vals)
	//避免结果输出过快
	tick := time.NewTicker(1 * time.Millisecond)
	for {
		<-tick.C
		vals = append(vals, "a")
		if capacity := cap(vals); capacity != oldCapacity {
			multiplier := float64(capacity) / float64(oldCapacity)
			fmt.Printf("len(vals) is %d => cap(vals) is %d , oldCap is %d ; multiplier is %.2f\n", len(vals), capacity, oldCapacity, multiplier)
			oldCapacity = capacity
		}
	}
}

验证结果1

len(vals) is 4 => cap(vals) is 6 , oldCap is 3 ; multiplier is 2.00
len(vals) is 7 => cap(vals) is 12 , oldCap is 6 ; multiplier is 2.00
len(vals) is 13 => cap(vals) is 24 , oldCap is 12 ; multiplier is 2.00
len(vals) is 25 => cap(vals) is 48 , oldCap is 24 ; multiplier is 2.00
len(vals) is 49 => cap(vals) is 96 , oldCap is 48 ; multiplier is 2.00
len(vals) is 97 => cap(vals) is 192 , oldCap is 96 ; multiplier is 2.00
len(vals) is 193 => cap(vals) is 384 , oldCap is 192 ; multiplier is 2.00
len(vals) is 385 => cap(vals) is 768 , oldCap is 384 ; multiplier is 2.00
len(vals) is 769 => cap(vals) is 1536 , oldCap is 768 ; multiplier is 2.00
len(vals) is 1537 => cap(vals) is 2048 , oldCap is 1536 ; multiplier is 1.33
len(vals) is 2049 => cap(vals) is 2560 , oldCap is 2048 ; multiplier is 1.25
len(vals) is 2561 => cap(vals) is 3584 , oldCap is 2560 ; multiplier is 1.40
len(vals) is 3585 => cap(vals) is 4608 , oldCap is 3584 ; multiplier is 1.29
len(vals) is 4609 => cap(vals) is 6144 , oldCap is 4608 ; multiplier is 1.33

通过输出结果我们可以看到,刚开始slice容量为3,之后每次扩容都是遵循之前的扩容策略,但是当容量为1536时,下次扩容按照我们的扩容策略计算应该是1920(1536+1536*1/4),但是实际上的容量却是2048,那么2048是怎么得到的呢?当执行到代码片一之后仅会确定切片的大致容量,之后还需要根据切片中的元素大小对齐内存,当数组中元素所占的字节大小为 1、8 或者 2 的倍数时,运行时会使用其他代码对齐内存。

因为string类型占用16个字节,是2的倍数,在确定大致容量后,之后会进行内存对齐,会执行下面的代码

case isPowerOfTwo(et.size):
        var shift uintptr
        if sys.PtrSize == 8 {
            // Mask shift for better code generation.
            shift = uintptr(sys.Ctz64(uint64(et.size))) & 63
        } else {
            shift = uintptr(sys.Ctz32(uint32(et.size))) & 31
        }
        //当数据类型size为16时,shift=4
        lenmem = uintptr(old.len) << shift
        newlenmem = uintptr(cap) << shift
        //执行代码片一之后,newcap为1920
        capmem = roundupsize(uintptr(newcap) << shift)
        overflow = uintptr(newcap) > (maxAlloc >> shift)
        newcap = int(capmem >> shift)

上面的代码执行后

//shift=4
newcap=1920(1536+1536*1/4)
capmem=1920*16

之后执行 runtime.roundupsize 函数会将待申请的内存向上取整,取整时会使用 runtime.class_to_size 数组,使用该数组中的整数可以提高内存的分配效率并减少碎片,这可能会导致容量的变化与上面的扩容策略有所出入,我们再看一下roundupsize方法

//  _MaxSmallSize   = 32768 //代表小对象的最大字节数也就是32kb
//  smallSizeDiv    = 8     
//  smallSizeMax    = 1024 
func roundupsize(size uintptr) uintptr {
	if size < _MaxSmallSize {
		if size <= smallSizeMax-8 {
			return uintptr(class_to_size[size_to_class8[divRoundUp(size, smallSizeDiv)]])
		} else {//这里
			return uintptr(class_to_size[size_to_class128[divRoundUp(size-smallSizeMax, largeSizeDiv)]])
		}
	}
	if size+_PageSize < size {
		return size
	}
	return alignUp(size, _PageSize)
}

返回的值为

class_to_size[size_to_class128[divRoundUp(size-smallSizeMax, largeSizeDiv]]
divRoundUp(size-smallSizeMax, largeSizeDiv=(29696+128-1)/128=232

size_to_class128[32]=66
class_to_size[66]=32768

即返回的capmem为32768,之后会执行下面这段代码

newcap = int(capmem >> shift)

最终的容量为newcap=32768/16=2048

2、测试代码2

package main

import (
	"fmt"
	"time"
)

func main() {
	vals := []byte{'a', 'b', 'c'}
	oldCapacity := cap(vals)
	//避免结果输出过快
	tick := time.NewTicker(1 * time.Millisecond)
	for {
		<-tick.C
		vals = append(vals, 'a')
		if capacity := cap(vals); capacity != oldCapacity {
			multiplier := float64(capacity) / float64(oldCapacity)
			fmt.Printf("len(vals) is %d => cap(vals) is %d , oldCap is %d ; multiplier is %.2f\n", len(vals), capacity, oldCapacity, multiplier)
			oldCapacity = capacity
		}
	}
}

验证结果2

len(vals) is 3 => cap(vals) is 8 , oldCap is 2 ; multiplier is 4.00
len(vals) is 9 => cap(vals) is 16 , oldCap is 8 ; multiplier is 2.00
len(vals) is 17 => cap(vals) is 32 , oldCap is 16 ; multiplier is 2.00
len(vals) is 33 => cap(vals) is 64 , oldCap is 32 ; multiplier is 2.00
len(vals) is 65 => cap(vals) is 128 , oldCap is 64 ; multiplier is 2.00
len(vals) is 129 => cap(vals) is 256 , oldCap is 128 ; multiplier is 2.00
len(vals) is 257 => cap(vals) is 512 , oldCap is 256 ; multiplier is 2.00
len(vals) is 513 => cap(vals) is 1024 , oldCap is 512 ; multiplier is 2.00
len(vals) is 1025 => cap(vals) is 1280 , oldCap is 1024 ; multiplier is 1.25
len(vals) is 1281 => cap(vals) is 1792 , oldCap is 1280 ; multiplier is 1.40
len(vals) is 1793 => cap(vals) is 2304 , oldCap is 1792 ; multiplier is 1.29
len(vals) is 2305 => cap(vals) is 3072 , oldCap is 2304 ; multiplier is 1.33
len(vals) is 3073 => cap(vals) is 4096 , oldCap is 3072 ; multiplier is 1.33
len(vals) is 4097 => cap(vals) is 5376 , oldCap is 4096 ; multiplier is 1.31
len(vals) is 5377 => cap(vals) is 6784 , oldCap is 5376 ; multiplier is 1.26
len(vals) is 6785 => cap(vals) is 9472 , oldCap is 6784 ; multiplier is 1.40
len(vals) is 9473 => cap(vals) is 12288 , oldCap is 9472 ; multiplier is 1.30

从输出结果我们可以看出,刚开始slice容量为21,在第一次扩容时就不满足代码片一中的扩容策略,按照代码片一中的扩容策略,此次扩容,slice的容量应该是42(21*2),我们从代码中分析一下原因,当数据类型size为1时,在执行代码片一之后会执行以下代码:

    case et.size == 1:
        lenmem = uintptr(old.len)
        newlenmem = uintptr(cap)
        //执行过代码片一之后,newcap=42
        capmem = roundupsize(uintptr(newcap))
        overflow = uintptr(newcap) > maxAlloc
        newcap = int(capmem)

同样会使用roundupsize向上取整,会执行以下代码

    if size < _MaxSmallSize {
        //smallSizeMax 每个span中能存放的最多的小对象个数
        if size <= smallSizeMax-8 {
            //这里
            return uintptr(class_to_size[size_to_class8[divRoundUp(size, smallSizeDiv)]])
        } else {
            return uintptr(class_to_size[size_to_class128[divRoundUp(size-smallSizeMax, largeSizeDiv)]])
        }
    }

返回的值为

class_to_size[size_to_class8[divRoundUp(size, smallSizeDiv)]]
divRoundUp(size, smallSizeDiv)=(42+8-1)/8=6

size_to_class8[6]=4
class_to_size[4]=48

即返回的capmem=48,因此newcap=48

3、内存对齐

计算出了新容量之后,还没有完,出于内存的高效利用考虑,还要进行内存对齐

capmem := roundupsize(uintptr(newcap) * uintptr(et.size))

newcap就是前文中计算出的newcap,et.size代表slice中一个元素的大小,capmem计算出来的就是此次扩容需要申请的内存大小。roundupsize函数就是处理内存对齐的函数。

func roundupsize(size uintptr) uintptr {
	if size < _MaxSmallSize {
		if size <= 1024-8 {
			return uintptr(class_to_size[size_to_class8[(size+7)>>3]])
		} else {
			return uintptr(class_to_size[size_to_class128[(size-1024+127)>>7]])
		}
	}
	if size+_PageSize < size {
		return size
	}
	return round(size, _PageSize)
}

 _MaxSmallSize的值在64位macos上是32«10,也就是2的15次方,32k。golang事先生成了一个内存对齐表。通过查找(size+7) » 3,也就是需要多少个8字节,然后到class_to_size中寻找最小内存的大小。承接上文的size,应该是40,size_to_class8的内容是这样的:

size_to_class8:1 1 2 3 3 4 4 5 5 6 6 7 7 8 8 9 9 10 10 11 11...

查表得到的数字是4,而class_to_size的内容是这样的:

class_to_size:0 8 16 32 48 64 80 96 112 128 144 160 176 192 208 224 240 256...

因此得到最小的对齐内存是48字节。完成内存对齐计算后,重新计算应有的容量,也就是48/8 = 6。扩容得到的容量就是6了。

4、向上取整

之前说到了slice扩容时可能会将容量向上取整,为什么需要向上取整?

这与golang的内存分配机制有关,因为我没有深入太多细节,所以这里大概说一下, 目前,当我们创建一个切片或增长一个切片时,我们使用 malloc为该底层数组 (目前golang应该是用Tmalloc来分配内存)分配内存。 golang中分配内存的原理大概是把内存分成大小不等的块(8,16,32,48......)。然后在需要时,找个一个大小合适的内存块,分配给这个对象(只是大概的原理,其真实的分配流程必然更加复杂),比如说我们创建一个容量为42的字符型slice([]byte),那么malloc会为其底层数组分配大小为48B的内存块。

假设我们的slice容量不向上取整,那么我们的slice指向该大小为42字节的数组(其实际分配内存为48字节),len为22,cap=42,那么就有6字节的内存被浪费了,也无法分配给其他的对象。

所以我们在slice 扩容时对其容量向上取整,比如说我们创建有一个容量为21的字符型slice([]byte),之后apped操作触发slice扩容,那么在确定其容量时向上取整48,那么同样会为其底层数组分配大小为48的内存块,但是不会造成内存的浪费了。

posted @ 2022-10-02 11:24  踏雪无痕SS  阅读(437)  评论(0编辑  收藏  举报