Let's Go: Golang Concurrency, Part 2

Latest recommended article published on 2024-09-17 13:39:00

cunjie3951

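Overview

One of the unique features of Go is the use of channels to communicate safely between goroutines. In this article, you'll learn what channels are, how to use them effectively, and some common patterns.

What Is a Channel?

A channel is a synchronized in-memory queue that goroutines and regular functions can use to send and receive typed values. Communication is serialized through the channel.

You create a channel using make() and specify the type of values the channel accepts: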

ch := make(chan int)

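Go provides a nice arrow syntax for sending to and receiving from channels:

// send value to a channel
ch <- 5

// receive value from a channel
x := <-ch

You don't have to consume the value. It's fine to just pop a value off the channel and discard it: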

<-ch
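Putting the three operations together, here is a minimal sketch of a complete program. The helper name roundTrip is hypothetical and exists only for this illustration; the send has to run in its own goroutine because the channel is unbuffered.

```go
package main

import "fmt"

// roundTrip sends v on an unbuffered channel from a helper goroutine
// and returns the value that the caller receives.
func roundTrip(v int) int {
	ch := make(chan int)

	// A send on an unbuffered channel waits for a matching receive,
	// so the sends run in a separate goroutine.
	go func() {
		ch <- v     // first value: received and returned below
		ch <- v + 1 // second value: received and discarded below
	}()

	x := <-ch // receive and keep the value
	<-ch      // receive and throw the value away
	return x
}

func main() {
	fmt.Println(roundTrip(5)) // prints 5
}
```

If the sends were made directly in roundTrip instead of in a goroutine, the first send would wait forever for a receiver and the runtime would report a deadlock.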

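By default, channels are blocking. If you send a value to a channel, you block until someone receives it. Similarly, if you receive from a channel, you block until someone sends a value to that channel.

The following program demonstrates this. The main() function creates a channel and starts a goroutine that prints "start", reads a value from the channel, and prints it too. Then main() starts another goroutine that just prints a dash ("-") every second. Then it sleeps for 2.5 seconds, sends a value to the channel, and sleeps 3 more seconds to let all the goroutines finish.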

package main

import (
    "fmt"
    "time"
)

func main() {
    ch := make(chan int)

    // Start a goroutine that reads a value from a channel and prints it
    go func(ch chan int) {
        fmt.Println("start")
        fmt.Println(<-ch)
    }(ch)

    // Start a goroutine that prints a dash every second
    go func() {
        for i := 0; i < 5; i++ {
            time.Sleep(time.Second)
            fmt.Println("-")
        }
    }()

    // Sleep for 2.5 seconds
    time.Sleep(2500 * time.Millisecond)

    // Send a value to the channel
    ch <- 5

    // Sleep three more seconds to let all goroutines finish
    time.Sleep(3 * time.Second)
}

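This program demonstrates the blocking nature of channels very well. The first goroutine prints "start" right away, but then it is blocked trying to receive from the channel until the main() function has slept for 2.5 seconds and sent the value. The other goroutine just provides a visual indication of the flow of time by printing a dash every second.

Here is the output: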

start
-
-
5
-
-
-

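Buffered Channels

This behavior tightly couples senders to receivers, and sometimes that's not what you want. Go provides several mechanisms to address that.

Buffered channels are channels that can hold a certain (predefined) number of values, so senders don't block until the buffer is full, even if no one is receiving.

To create a buffered channel, just add the capacity as a second argument: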

ch := make(chan int, 5)

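The following program illustrates the behavior of buffered channels. The main() program defines a buffered channel with a capacity of 3. Then it starts one goroutine that reads a value from the channel every second and prints it, and another goroutine that just prints a dash every second to give a visual indication of the passage of time. Then it sends five values to the channel.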

package main

import (
    "fmt"
    "time"
)

func main() {
    ch := make(chan int, 3)

    // Start a goroutine that reads a value from the channel every second and prints it
    go func(ch chan int) {
        for {
            time.Sleep(time.Second)
            fmt.Printf("Goroutine received: %d\n", <-ch)
        }

    }(ch)

    // Start a goroutine that prints a dash every second
    go func() {
        for i := 0; i < 5; i++ {
            time.Sleep(time.Second)
            fmt.Println("-")
        }
    }()

    // Push values to the channel as fast as possible
    for i := 0; i < 5; i++ {
        ch <- i
        fmt.Printf("main() pushed: %d\n", i)
    }

    // Sleep five more seconds to let all goroutines finish
    time.Sleep(5 * time.Second)
}

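What happens at runtime? The first three values are buffered by the channel immediately, and then the main() function blocks. After a second, the goroutine receives a value, and the main() function can push another value. Another second goes by, the goroutine receives another value, and the main() function can push the last value. From this point on, the goroutine keeps receiving the remaining values from the channel, one per second.

Here is the output (lines printed within the same second by different goroutines may appear in a slightly different order):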

main() pushed: 0
main() pushed: 1
main() pushed: 2
-
Goroutine received: 0
main() pushed: 3
-
Goroutine received: 1
main() pushed: 4
-
Goroutine received: 2
-
Goroutine received: 3
-
Goroutine received: 4

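Select

Buffered channels (as long as the buffer is big enough) can address the issue of temporary fluctuations where there aren't enough receivers to process all the sent messages. But there is also the opposite problem of blocked receivers waiting for messages to process. Go has you covered.

What if you want your goroutine to do something else when there are no messages to process? A good example is when your receiver is waiting for messages from multiple channels. You don't want to block on channel A if channel B has messages right now. The following program attempts to compute the sum of 3 and 5 using the full power of the machine.

The idea is to simulate a complex operation (e.g. a remote query to a distributed DB) with redundancy. The sum() function (note how it's defined as a nested function inside main()) accepts two int parameters and returns an int channel. An internal anonymous goroutine sleeps for a random time of up to one second and then writes the sum to the channel, closes it, and exits.

Now, main calls sum(3, 5) four times and stores the resulting channels in variables ch1 to ch4. The four calls to sum() return immediately because the random sleeping happens inside the goroutine that each sum() call starts.

Here comes the cool part. The select statement lets the main() function wait on all the channels and respond to the first one that returns. The select statement operates a little like a switch statement.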

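package main

import (
    "fmt"
    "math/rand"
    "time"
)

func main() {
    r := rand.New(rand.NewSource(time.Now().UnixNano()))

    sum := func(a int, b int) <-chan int {
        ch := make(chan int)
        // Random time up to one second. It is picked here, outside the
        // goroutine, because a *rand.Rand is not safe for concurrent use.
        delay := time.Duration(r.Intn(1000)) * time.Millisecond
        go func() {
            time.Sleep(delay)
            ch <- a + b
            close(ch)
        }()
        return ch
    }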

    // Call sum 4 times with the same parameters
    ch1 := sum(3, 5)
    ch2 := sum(3, 5)
    ch3 := sum(3, 5)
    ch4 := sum(3, 5)

    // wait for the first goroutine to write to its channel
    select {
    case result := <-ch1:
        fmt.Printf("ch1: 3 + 5 = %d\n", result)
    case result := <-ch2:
        fmt.Printf("ch2: 3 + 5 = %d\n", result)
    case result := <-ch3:
        fmt.Printf("ch3: 3 + 5 = %d\n", result)
    case result := <-ch4:
        fmt.Printf("ch4: 3 + 5 = %d\n", result)
    }
}

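Sometimes you don't want the main() function to block waiting even for the first goroutine to finish. In this case, you can add a default case that will execute if all the channels are blocked.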
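A minimal sketch of a select with a default case follows. The helper name tryReceive is hypothetical; it wraps a single non-blocking receive so that the caller can carry on with other work when nothing is available.

```go
package main

import "fmt"

// tryReceive returns a value from ch if one is ready right now.
// The boolean is false when the receive would have blocked.
func tryReceive(ch <-chan int) (int, bool) {
	select {
	case v := <-ch:
		return v, true
	default:
		// Nothing can be received at the moment: do not wait.
		return 0, false
	}
}

func main() {
	ch := make(chan int, 1)

	_, ok := tryReceive(ch)
	fmt.Println(ok) // prints false: nothing has been sent yet

	ch <- 8
	v, ok := tryReceive(ch)
	fmt.Println(v, ok) // prints 8 true
}
```

The same default case can be added to a select with several receive cases, in which case it runs only when none of them can proceed.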

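Web Crawler Example

In the previous article, I showed a solution to the Web crawler exercise from the Tour of Go. I used goroutines and a synchronized map. I also solved the exercise using channels. The complete source code for both solutions is available on GitHub.

Let's look at the relevant parts. First, here is a struct that is sent to a channel whenever a goroutine parses a page. It contains the current depth and all the URLs found on the page.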

type links struct {
    urls  []string
    depth int
}

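The fetchURL() function accepts a URL, a depth, and an output channel. It uses the fetcher (provided by the exercise) to get the URLs of all the links on the page. It sends the list of URLs as a single message to the candidates channel, as a links struct with a decremented depth. The depth represents how much further we should crawl. When the depth reaches 0, no further processing takes place.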

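func fetchURL(url string, depth int, candidates chan links) {
    body, urls, err := fetcher.Fetch(url)
    if err != nil {
        fmt.Println(err)
    } else {
        fmt.Printf("found: %s %q\n", url, body)
    }

    // Always send exactly one message, even after an error, because
    // the coordinator counts one message per goroutine
    candidates <- links{urls, depth - 1}
}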

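The ChannelCrawl() function coordinates everything. It keeps track of all the URLs that have already been fetched in a map. There is no need to synchronize access because no other function or goroutine touches the map. It also defines the candidates channel that all the goroutines write their results to.

Then it starts invoking fetchURL as a goroutine for each new URL. The logic keeps track of how many goroutines have been launched by managing a counter. Whenever a value is read from the channel, the counter is decremented (because the sending goroutine exits after sending), and whenever a new goroutine is launched, the counter is incremented. If the depth gets to zero, no new goroutines are launched, and the main function keeps reading from the channel until all the goroutines are done.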
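To see the counter bookkeeping in isolation, here is a self-contained sketch of the same scheme. It is not the crawler from the exercise: the in-memory graph stands in for a real fetcher, and the names result, crawl, visit, and pending are hypothetical. Unlike the crawler, it records a page as visited only when it actually starts a goroutine for it.

```go
package main

import "fmt"

// result is what each worker goroutine reports back: the links it
// found and the depth remaining for those links.
type result struct {
	urls  []string
	depth int
}

// crawl walks a hypothetical in-memory link graph (standing in for a
// real fetcher) and returns how many distinct pages it visited.
func crawl(graph map[string][]string, seed string, depth int) int {
	results := make(chan result)
	visited := map[string]bool{seed: true} // touched only by this goroutine

	// visit is one worker: look up the links, send one message, exit.
	visit := func(url string, depth int) {
		results <- result{graph[url], depth - 1}
	}

	pending := 1 // number of workers that have not reported yet
	go visit(seed, depth)

	for pending > 0 {
		r := <-results
		pending-- // the sender exits right after sending
		for _, u := range r.urls {
			if visited[u] || r.depth <= 0 {
				continue
			}
			visited[u] = true
			pending++ // one more worker to wait for
			go visit(u, r.depth)
		}
	}
	return len(visited)
}

func main() {
	graph := map[string][]string{
		"a": {"b", "c"},
		"b": {"a", "d"},
		"c": {"d"},
		"d": {},
	}
	fmt.Println(crawl(graph, "a", 3)) // prints 4
}
```

Because every worker sends exactly one message and the loop receives exactly one message per worker, the loop ends when the last worker has reported, without a WaitGroup or a mutex.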

// ChannelCrawl crawls links from a seed url
func ChannelCrawl(url string, depth int, fetcher Fetcher) {
    candidates := make(chan links)
    // Mark the seed as fetched so that links back to it are not crawled again
    fetched := map[string]bool{url: true}
    counter := 1

    // Fetch initial url to seed the candidates channel
    go fetchURL(url, depth, candidates)

    for counter > 0 {
        candidateLinks := <-candidates
        counter--
        depth = candidateLinks.depth
        for _, candidate := range candidateLinks.urls {
            // Already fetched. Continue...
            if fetched[candidate] {
                continue
            }

            // Add to the fetched map
            fetched[candidate] = true

            if depth > 0 {
                counter++
                go fetchURL(candidate, depth, candidates)
            }
        }
    }