go java gc_Golang GC算法-CSDN博客

本文链接：https://blog.csdn.net/weixin_34856444/article/details/114171199

概括

Go的垃圾回收官方形容为非分代非紧缩写屏障三色并发标记清理算法。

非分代：不像Java那样分为年轻代和年老代，自然也没有minor gc和maj o gc的区别。

非紧缩：在垃圾回收之后不会进行内存整理以清除内存碎片。

写屏障：在并发标记的过程中，如果应用程序(mutator)修改了对象图，就可能出现标记遗漏的可能，写屏障就是为了处理标记遗漏的问题。

三色：将GC中的对象按照搜索的情况分成三种：

黑色: 对象在这次GC中已标记,且这个对象包含的子对象也已标记

灰色: 对象在这次GC中已标记, 但这个对象包含的子对象未标记

白色: 对象在这次GC中未标记

并发：可以和应用程序(mutator)在一定程度上并发执行。

标记清理：GC算法分为两个大步骤：标记阶段找出要回收的对象，清理阶段则回收未被标记的对象(要被回收的对象)

触发时机

gcTriggerAlways: 强制触发GC，没找到什么情况下使用这个

gcTriggerHeap: 当前分配的内存达到一定值(动态计算)就触发GC

gcTriggerTime: 当一定时间(2分钟)没有执行过GC就触发GC

gcTriggerCycle: 要求启动新一轮的GC, 已启动则跳过, 手动触发GC的runtime.GC()会使用这个条件

func gcStart(mode gcMode, trigger gcTrigger) {

// Since this is called from malloc and malloc is called in

// the guts of a number of libraries that might be holding

// locks, don't attempt to start GC in non-preemptible or

// potentially unstable situations.

mp := acquirem()

if gp := getg(); gp == mp.g0 || mp.locks > 1 || mp.preemptoff != "" {

releasem(mp)

return

}

releasem(mp)

mp = nil

// 检查GC条件是否满足，和下面的test()构成双检查锁，如果满足GC条件但目前处于GC清理阶段，那就参与清理

for trigger.test() && gosweepone() != ^uintptr(0) {

sweep.nbgsweep++

}

// 加锁检查

semacquire(&work.startSema)

if !trigger.test() {

semrelease(&work.startSema)

return

}

/*************** ..... *****************/

}

在trigger.test()函数中，检查是否满足GC触发的条件

func (t gcTrigger) test() bool {

if !memstats.enablegc || panicking != 0 {

return false

}

if t.kind == gcTriggerAlways {

return true

}

if gcphase != _GCoff {

return false

}

switch t.kind {

case gcTriggerHeap:

// Non-atomic access to heap_live for performance. If

// we are going to trigger on this, this thread just

// atomically wrote heap_live anyway and we'll see our

// own write.

return memstats.heap_live >= memstats.gc_trigger

case gcTriggerTime:

if gcpercent < 0 {

return false

}

lastgc := int64(atomic.Load64(&memstats.last_gc_nanotime))

// forcegcperiod = 2分钟

return lastgc != 0 && t.now-lastgc > forcegcperiod

case gcTriggerCycle:

// t.n > work.cycles, but accounting for wraparound.

return int32(t.n-work.cycles) > 0

}

return true

}

const (

// gcTriggerAlways indicates that a cycle should be started

// unconditionally, even if GOGC is off or we're in a cycle

// right now. This cannot be consolidated with other cycles.

gcTriggerAlways gcTriggerKind = iota

// gcTriggerHeap indicates that a cycle should be started when

// the heap size reaches the trigger heap size computed by the

// controller.

gcTriggerHeap

// gcTriggerTime indicates that a cycle should be started when

// it's been more than forcegcperiod nanoseconds since the

// previous GC cycle.

gcTriggerTime

// gcTriggerCycle indicates that a cycle should be started if

// we have not yet started cycle number gcTrigger.n (relative

// to work.cycles).

gcTriggerCycle

)

算法过程

Sweep Termination: 对未清扫的span进行清扫, 只有上一轮的GC的清扫工作完成才可以开始新一轮的GC

Mark: 扫描所有根对象, 和根对象可以到达的所有对象, 标记它们不被回收

Mark Termination: 完成标记工作, 重新扫描部分根对象(要求STW)

Sweep: 按标记结果清扫span

golang_gc.jpg

func gcStart(mode gcMode, trigger gcTrigger) {

// 拿到锁，保证只有一个执行流进入到这个临界区

semacquire(&worldsema)

// 启动后台扫描任务(G)

if mode == gcBackgroundMode {

gcBgMarkStartWorkers()

}

gcResetMarkState()

work.stwprocs, work.maxprocs = gomaxprocs, gomaxprocs

if work.stwprocs > ncpu {

work.stwprocs = ncpu

}

work.heap0 = atomic.Load64(&memstats.heap_live)

work.pauseNS = 0

work.mode = mode

now := nanotime()

work.tSweepTerm = now

work.pauseStart = now

if trace.enabled {

traceGCSTWStart(1)

}

systemstack(stopTheWorldWithSema)

// Finish sweep before we start concurrent scan.

systemstack(func() {

finishsweep_m()

})

// clearpools before we start the GC. If we wait they memory will not be

// reclaimed until the next GC cycle.

clearpools()

work.cycles++

if mode == gcBackgroundMode { // Do as much work concurrently as possible

gcController.startCycle()

work.heapGoal = memstats.next_gc

// Enter concurrent mark phase and enable

// write barriers.

setGCPhase(_GCmark)

gcBgMarkPrepare() // Must happen before assist enable.

gcMarkRootPrepare()

// Mark all active tinyalloc blocks. Since we're

// allocating from these, they need to be black like

// other allocations. The alternative is to blacken

// the tiny block on every allocation from it, which

// would slow down the tiny allocator.

gcMarkTinyAllocs()

// At this point all Ps have enabled the write

// barrier, thus maintaining the no white to

// black invariant. Enable mutator assists to

// put back-pressure on fast allocating

// mutators.

atomic.Store(&gcBlackenEnabled, 1)

// Assists and workers can start the moment we start

// the world.

gcController.markStartTime = now

// Concurrent mark.

systemstack(func() {

now = startTheWorldWithSema(trace.enabled)

})

work.pauseNS += now - work.pauseStart

work.tMark = now

}

semrelease(&work.startSema)

}

关键函数及路径：

gcBgMarkStartWorkers():准备后台标记工作goroutine(allp)，启动后等待该任务通知信号量bgMarkReady再继续，notewakeup(&work.bgMarkReady)

gcResetMarkState():重置一些全局状态和所有gorontine的栈(一种根对象)扫描状态

systemstack(stopTheWorldWithSema):启动stop the world

systemstack(func(){finishsweep_m()}): 不断去除要清理的span进行清理，然后重置gcmark位

clearpools(): 清扫sched.sudogcache和sched.deferpool，不知道在干嘛......

gcController.startCycle():启动新一轮GC，设置gc controller的状态位和计算一些估计值

setGCPhase(_GCmark):设置GC阶段，启用写屏障

gcBgMarkPrepare():设置后台标记任务计数；work.nproc = ^uint32(0)，work.nwait = ^uint32(0)

gcMarkRootPrepare(): 计算扫描根对象的任务数量

gcMarkTinyAllocs(): 标记所有tiny alloc等待合并的对象

atomic.Store(&gcBlackenEnabled, 1): 启用辅助GC

systemstack(func(){now=startTheWorldWithSema(trace.enable)}): 停止stop the world

func gcBgMarkWorker(_p_ *p) {

/********** ....... ***********/

// 通知gcBgMarkStartWorkers可以继续处理

notewakeup(&work.bgMarkReady)

for {

// 切换到g0运行

systemstack(func() {

// Mark our goroutine preemptible so its stack

// can be scanned. This lets two mark workers

// scan each other (otherwise, they would

// deadlock). We must not modify anything on

// the G stack. However, stack shrinking is

// disabled for mark workers, so it is safe to

// read from the G stack.

casgstatus(gp, _Grunning, _Gwaiting)

switch _p_.gcMarkWorkerMode {

default:

throw("gcBgMarkWorker: unexpected gcMarkWorkerMode")

case gcMarkWorkerDedicatedMode:

gcDrain(&_p_.gcw, gcDrainUntilPreempt|gcDrainFlushBgCredit)

if gp.preempt {

lock(&sched.lock)

for {

gp, _ := runqget(_p_)

if gp == nil {

break

}

globrunqput(gp)

}

unlock(&sched.lock)

}

// Go back to draining, this time

// without preemption.

gcDrain(&_p_.gcw, gcDrainNoBlock|gcDrainFlushBgCredit)

case gcMarkWorkerFractionalMode:

gcDrain(&_p_.gcw, gcDrainFractional|gcDrainUntilPreempt|gcDrainFlushBgCredit)

case gcMarkWorkerIdleMode:

gcDrain(&_p_.gcw, gcDrainIdle|gcDrainUntilPreempt|gcDrainFlushBgCredit)

}

casgstatus(gp, _Gwaiting, _Grunning)

})

/******** ...... ***********/

// 判断是否所有后台标记任务都完成, 并且没有更多的任务

if incnwait == work.nproc && !gcMarkWorkAvailable(nil) {

gcMarkDone()

}

gcDrain()是执行标记的函数

当所有标记任务完成时，执行gcMarkDone()函数

func gcDrain(gcw *gcWork, flags gcDrainFlags) {

initScanWork := gcw.scanWork

// 如果根对象未扫描完，则先扫描根对象，Jobs为根对象总数，next相当于一个对象任务的取数器

if work.markrootNext < work.markrootJobs {

for !(preemptible && gp.preempt) {

job := atomic.Xadd(&work.markrootNext, +1) - 1

if job >= work.markrootJobs {

break

}

// 将会扫描根对象，并把它加入到标记队列gcWork中之中，也就是把对象变成灰色

markroot(gcw, job)

if check != nil && check() {

goto done

}

// 当根对象全部put到标记队列中, 消费标记队列，根据对象图进行消费

for !(preemptible && gp.preempt) {

if work.full == 0 {

gcw.balance()

}

var b uintptr

if blocking {

b = gcw.get()

} else {

b = gcw.tryGetFast()

if b == 0 {

b = gcw.tryGet()

}

if b == 0 {

// work barrier reached or tryGet failed.

break

}

scanobject(b, gcw)

// 如果已经扫描了一定数量的对象(gcCreditSlack的值是2000)

if gcw.scanWork >= gcCreditSlack {

// 把扫描的对象数量添加到全局

atomic.Xaddint64(&gcController.scanWork, gcw.scanWork)

// 减少辅助GC的工作量和唤醒等待中的G

if flushBgCredit {

gcFlushBgCredit(gcw.scanWork - initScanWork)

initScanWork = 0

}

idleCheck -= gcw.scanWork

gcw.scanWork = 0

// 如果是idle模式且达到了检查的扫描量, 则检查是否有其他任务(G), 如果有则跳出循环

if idle && idleCheck <= 0 {

idleCheck += idleCheckThreshold

if pollWork() {

break

}

done:

// 把扫描的对象数量添加到全局

if gcw.scanWork > 0 {

atomic.Xaddint64(&gcController.scanWork, gcw.scanWork)

// 减少辅助GC的工作量和唤醒等待中的G

if flushBgCredit {

gcFlushBgCredit(gcw.scanWork - initScanWork)

}

gcw.scanWork = 0

}

func gcMarkDone() {

semacquire(&work.markDoneSema)

// Re-check transition condition under transition lock.

if !(gcphase == _GCmark && work.nwait == work.nproc && !gcMarkWorkAvailable(nil)) {

semrelease(&work.markDoneSema)

return

}

// 暂时禁止启动新的后台标记任务

atomic.Xaddint64(&gcController.dedicatedMarkWorkersNeeded, -0xffffffff)

prevFractionalGoal := gcController.fractionalUtilizationGoal

gcController.fractionalUtilizationGoal = 0

// 转换到Mark Termination阶段，进入STW阶段

systemstack(stopTheWorldWithSema)

// 标记对根对象的扫描已完成

work.markrootDone = true

// 禁止辅助GC和后台任务

atomic.Store(&gcBlackenEnabled, 0)

// 唤醒所有因为辅助GC而休眠的G

gcWakeAllAssists()

semrelease(&work.markDoneSema)

// 计算下一次触发gc需要的heap大小

nextTriggerRatio := gcController.endCycle()

// 计算下一次触发gc需要的heap大小

gcMarkTermination(nextTriggerRatio)

}

func gcMarkTermination(nextTriggerRatio float64) {

// 禁止辅助GC和后台标记任务的运行

// 重新允许本地标记队列(下次GC使用)

// 设置当前GC阶段到完成标记阶段, 并启用写屏障

atomic.Store(&gcBlackenEnabled, 0)

gcBlackenPromptly = false

setGCPhase(_GCmarktermination)

systemstack(func() {gcMark(startTime)})

systemstack(func() {

// 设置当前GC阶段到关闭, 并禁用写屏障

setGCPhase(_GCoff)

// 唤醒后台清扫任务, 将在STW结束后开始运行

gcSweep(work.mode)

})

// 更新下一次触发gc需要的heap大小(gc_trigger)

gcSetTriggerRatio(nextTriggerRatio)

// 重置清扫状态

sweep.nbgsweep = 0

sweep.npausesweep = 0

// 统计执行GC的次数然后唤醒等待清扫的G

lock(&work.sweepWaiters.lock)

memstats.numgc++

injectglist(work.sweepWaiters.head.ptr())

work.sweepWaiters.head = 0

unlock(&work.sweepWaiters.lock)

// 重新启动世界

systemstack(func() { startTheWorldWithSema(true) })

// 移动标记队列使用的缓冲区到自由列表, 使得它们可以被回收

prepareFreeWorkbufs()

// 释放未使用的栈

systemstack(freeStackSpans)

semrelease(&worldsema)

// 重新允许当前的G被抢占

releasem(mp)

mp = nil

当标记的扫描工作完成之后，会进入到GC Mark Termination阶段，也就是gcMarkDone()函数，关键路径：

systemstack(stopTheWorldWithSema)：启动STW

gcWakeAllAssists()：唤醒所有因辅助gc而休眠的G

nextTriggerRatio:=gcController.endCycle()：计算下一次触发gc需要的heap大小

setGCPhase(_GCmarktermination)：启用写屏障

systemstack(func() {gcMark(startTime)})：再次执行标记

systemstack(func(){setGCPhase(_GCoff)；gcSweep(work.mode)})：关闭写屏障，唤醒后台清扫任务，将在STW结束后开始运行

gcSetTriggerRatio(nextTriggerRatio)：更新下次触发gc时的heap大小

systemstack(func() { startTheWorldWithSema(true) }): 停止STW

STW分析：web程序中，我们关注最大停顿时间

STW出现在两个位置，分别是在初始标记阶段Mark和并发标记完成后重标记Mark Termination:

初始标记阶段：

systemstack(stopTheWorldWithSema):启动stop the world

systemstack(func(){finishsweep_m()}): 不断去除要清理的span进行清理，然后重置gcmark位

clearpools(): 清扫sched.sudogcache和sched.deferpool，不知道在干嘛......

gcController.startCycle():启动新一轮GC，设置gc controller的状态位和计算一些估计值

gcMarkRootPrepare(): 计算扫描根对象的任务数量

gcMarkTinyAllocs(): 涂灰所有tiny alloc等待合并的对象

systemstack(func(){now=startTheWorldWithSema(trace.enable)}): 停止stop the world

找出其中比较耗时的阶段：

finishsweep_m():如果上一次GC清扫阶段没有完成，那么在新的一轮GC阶段中就会在阻塞在这里，使得原本可以和应用程序并行的清扫阶段被放进STW。所以，如果频繁的执行GC，就可能会使得GC的最大停顿时间变长。

clearpools():时间复杂度大概为：O(5*L),L为_defer中链表的长度。

gcController.startCycle():O(P)，P为go的P的数量，和cpu数有关，时间复杂度可以忽略

gcMarkRootPrepare(): O(全局变量区)，包括bss段和data段

gcMarkTinyAllocs(): O(P)

个人觉得，对STW影响最大的是finishsweep_m()阶段，所有我们应该尽量避免让go在清扫期执行新一轮的GC。

重新标记阶段

systemstack(stopTheWorldWithSema)：启动STW

gcWakeAllAssists()：唤醒所有因辅助gc而休眠的G

nextTriggerRatio:=gcController.endCycle()：计算下一次触发gc需要的heap大小

setGCPhase(_GCmarktermination)：启用写屏障

systemstack(func() {gcMark(startTime)})：再次执行标记

systemstack(func(){setGCPhase(_GCoff)；gcSweep(work.mode)})：关闭写屏障，唤醒后台清扫任务，将在STW结束后开始运行

gcSetTriggerRatio(nextTriggerRatio)：更新下次触发gc时的heap大小

systemstack(func() { startTheWorldWithSema(true) }): 停止STW

找出其中比较耗时的阶段：

gcWakeAllAssists()：O(G)，将所有可运行的G插入到调度链表

systemstack(func() {gcMark(startTime)})：