Platform Engineering on Google Cloud

张建飞（Frank）

Published 2023-11-14 09:43:52


Article link: https://blog.csdn.net/significantfrank/article/details/134410939


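GCP (Google Cloud Platform) is Google's cloud platform. It provides IaaS, PaaS, and serverless computing services both to Google's internal products (Google Search, Gmail, YouTube, and others) and to external customers.

This article walks you through GCP and its product features hands-on, to get a feel for the product design philosophy and architectural thinking behind Google Cloud, so that we can distill the best ideas and put them to use ourselves. The article is organized as follows.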

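1. Register a Free Trial account

The cloud product catalog is huge, so registering a GCP Free Trial account is well worth it. With an account, we can unlock most GCP features. Registration is not difficult: go to cloud.google.com and follow the instructions.

One thing to note: registration requires a credit card for verification, and the trial comes with 300 USD of credit. The trial lasts 3 months, so resource usage during those 3 months is free as long as it stays within the credit. The available resources are limited, but they are enough to run some demos and work through some tutorials.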

2. Deploy hello-app on GKE

2.1 Deploy hello-app by following the tutorial

This is a Learn tutorial: it guides you page by page, step by step, through deploying a web application on GKE.

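2.2 Try out autoscaling

After deploying the web app by following the tutorial above, you will notice that the Deployment starts with 3 replicas but drops to 1 after a while. This is the autoscaler at work: when it sees that average CPU utilization across the pods stays below the 80% target, it scales the Deployment down from 3 pods to 1.

Editing the autoscaler configuration directly may fail with an error saying that two autoscalers cannot coexist. In that case, delete the existing autoscaler first and then save.

You can also run kubectl get hpa to inspect the autoscaler. For this example it prints the following, where MINPODS is 2, the value we set in the console above.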

NAME                 REFERENCE              TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
hello-app-hpa-ryrw   Deployment/hello-app   0%/80%    2         5         2          4h41m

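Here hpa is short for HorizontalPodAutoscaler, a built-in Kubernetes API resource in the autoscaling API group (not a CRD). For more on autoscaling, see the autoscale documentation.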
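
To make the scale-down behavior concrete, here is a minimal Python sketch of the replica calculation described in the Kubernetes HorizontalPodAutoscaler documentation: desired = ceil(current replicas × current metric / target metric), clamped to the configured minimum and maximum. The function name and parameters are illustrative, and the sketch deliberately ignores details such as pod readiness and stabilization windows.

```python
import math

def desired_replicas(current_replicas, current_cpu_pct, target_cpu_pct,
                     min_pods, max_pods, tolerance=0.1):
    """Simplified HPA rule: ceil(current * currentMetric / targetMetric), clamped."""
    ratio = current_cpu_pct / target_cpu_pct
    # Within the tolerance band the autoscaler leaves the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    # The result is always kept between the configured minimum and maximum.
    return max(min_pods, min(max_pods, desired))

print(desired_replicas(3, 0, 80, min_pods=2, max_pods=5))    # idle pods: prints 2 (MINPODS)
print(desired_replicas(2, 160, 80, min_pods=2, max_pods=5))  # overloaded: prints 4
```

With 0% utilization against an 80% target, the formula yields 0, so the result is pinned to MINPODS, which matches the REPLICAS column in the output above.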

2.3 Update hello-app

In this tutorial, the hello-app code is cloned from GitHub; in Cloud Shell it lives at kubernetes-engine-samples/hello-app. It is a very simple Go HTTP server. Following the tutorial, we built a hello-app v1 image, created an Artifact Registry repository in the us-west4 region, and pushed the v1 image to it.

Now we can add a visit counter, build a new v2 image, and perform a rolling update of the workload. The steps are as follows:

1. Modify kubernetes-engine-samples/hello-app/main.go:

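// count is shared by all requests. Handlers run concurrently, so it is
// updated atomically (add "sync/atomic" to the import block).
var count int64

func hello(w http.ResponseWriter, r *http.Request) {
    log.Printf("Serving request: %s", r.URL.Path)
    host, _ := os.Hostname()
    // Each pod keeps its own counter in memory.
    n := atomic.AddInt64(&count, 1)
    fmt.Fprintf(w, "Visited count: %d\n", n)
    fmt.Fprintf(w, "Version: 2.0.0\n")
    fmt.Fprintf(w, "Hostname: %s\n", host)
}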

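2. Start the service locally with go run main.go, then run curl localhost:8080 to check that the change works.
3. Once the test passes, rebuild the Docker image as v2 with the following command: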

docker build -t ${REGION}-docker.pkg.dev/${PROJECT_ID}/hello-repo/hello-app:v2 .

# Or, with the variables substituted
docker build -t us-west4-docker.pkg.dev/my-second-project-398309/hello-repo/hello-app:v2 .

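Note: the REGION and PROJECT_ID environment variables were set during section 2.1. Verify them with echo ${PROJECT_ID}, because environment variables are lost when Cloud Shell reconnects and must then be set again.

4. Push the new image to our project's repository: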

docker push ${REGION}-docker.pkg.dev/${PROJECT_ID}/hello-repo/hello-app:v2

# Or, with the variables substituted
docker push us-west4-docker.pkg.dev/my-second-project-398309/hello-repo/hello-app:v2

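5. Perform a rolling update
In a Kubernetes environment, we would normally edit the YAML manifest so that the container uses the v2 image pushed above and then run kubectl apply; the control plane carries out the update for us. Because this tutorial is based on the GUI console, we can also do the update there: click Rolling update under Actions, change the Container Image to the v2 image, and click Update.

Afterwards, open the public endpoint exposed by hello-app-service. If the response includes the visit count, the update succeeded.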
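
As a small aid for that check, here is a Python sketch that parses the plain-text response written by the v2 handler and confirms that it reports version 2.0.0 together with a visit count. The function names and the sample hostname are hypothetical; fetching the body from the real endpoint is left out so the example stays self-contained.

```python
def parse_hello_response(body):
    """Turn the 'Key: value' lines written by hello-app into a dict."""
    fields = {}
    for line in body.splitlines():
        key, sep, value = line.partition(": ")
        if sep:
            fields[key] = value
    return fields

def is_v2(body):
    """True if the response carries the v2 version string and a visit counter."""
    fields = parse_hello_response(body)
    return fields.get("Version") == "2.0.0" and "Visited count" in fields

# Sample body in the format produced by the v2 handler (hostname is made up).
sample = "Visited count: 3\nVersion: 2.0.0\nHostname: hello-app-abc\n"
print(is_v2(sample))
```

Because the counter lives in each pod's memory, successive requests served by different replicas may report different counts.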

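3. Try out Terraform

3.1 Basic concepts

Terraform is an automation and orchestration tool for IT infrastructure that lets you manage and maintain IT resources with code. You describe your infrastructure, such as virtual machines, storage accounts, and network interfaces, in configuration files that capture the topology of your cloud resources. This is what we usually call IaC (Infrastructure as Code).
The two most important concepts to understand are Provider and Resource.

Provider: Terraform is a framework that can support any cloud vendor; each cloud vendor is a different Provider.
Resource: a Resource is any component of the infrastructure. It can be a physical component such as a server, or a logical component such as a security group. A resource block is declared with two strings, the resource type and the resource name. For example: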

resource "google_compute_network" "vpc_network" {
  name = "terraform-network"
}

The resource type is google_compute_network and the name is vpc_network. The prefix of the type maps to the name of the provider. In the example configuration, Terraform manages the google_compute_network resource with the google provider. Together, the resource type and resource name form a unique ID for the resource. For example, the ID for your network is google_compute_network.vpc_network.
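
The naming rule in that paragraph can be illustrated with a few lines of Python. This is only a sketch of the convention (provider taken from the prefix of the resource type, ID formed from type and name); the helper names are made up, and Terraform itself resolves providers in more ways than this.

```python
def provider_of(resource_type):
    """By convention, the provider is the part of the type before the first underscore."""
    return resource_type.split("_", 1)[0]

def resource_address(resource_type, resource_name):
    """Type and name together identify the resource within a configuration."""
    return f"{resource_type}.{resource_name}"

print(provider_of("google_compute_network"))                      # google
print(resource_address("google_compute_network", "vpc_network"))  # google_compute_network.vpc_network
```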

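To learn more about providers and resources, see the official HashiCorp documentation. For GCP's Terraform resource definitions, see GCP's Terraform documentation. Huawei Cloud likewise has its own Terraform guide.

Next, let's follow the Terraform Tutorial and try out its IaC capabilities.

3.2 Create a VPC with Terraform

Following the tutorial, we create a new main.tf file and start by creating the VPC network. See the tutorial for the details.

3.3 Create a virtual machine with Terraform

Next we create the Compute Engine virtual machine resource. See the tutorial for the details.
At this point main.tf contains the following network resources (the virtual machine resource from the tutorial is not repeated here):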

# Create VPC network and subnet
resource "google_compute_network" "vpc_network" {
  name                    = "my-custom-mode-network"
  auto_create_subnetworks = false
  mtu                     = 1460
}

resource "google_compute_subnetwork" "default" {
  name          = "my-custom-subnet"
  ip_cidr_range = "10.0.1.0/24"
  region        = "us-west1"
  network       = google_compute_network.vpc_network.id
}

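3.4 Run Terraform

Running Terraform generally takes three steps:

Step 1: run terraform init to download the required provider plugins and create the .terraform directory.
Step 2: run terraform plan to check that main.tf is valid and preview the resources that will be created.
Step 3: run terraform apply to actually create the resources.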
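
The three steps above can be sketched as a small Python driver. It only assembles the command lines and, optionally, runs them in order; the helper names and the plan file name tfplan are assumptions made for illustration, and running it requires the terraform CLI on the PATH. The flags used (-chdir, -input=false, -out) are standard Terraform CLI options.

```python
import subprocess

def terraform_steps(workdir, plan_file="tfplan"):
    """Return the argv lists for init, plan and apply, in order."""
    base = ["terraform", f"-chdir={workdir}"]
    return [
        base + ["init", "-input=false"],
        base + ["plan", "-input=false", f"-out={plan_file}"],
        # Applying a saved plan creates exactly what the plan step previewed.
        base + ["apply", "-input=false", plan_file],
    ]

def run_terraform(workdir):
    """Run each step, stopping at the first failure (needs the terraform CLI)."""
    for argv in terraform_steps(workdir):
        subprocess.run(argv, check=True)

# Print the commands instead of running them.
for argv in terraform_steps("."):
    print(" ".join(argv))
```

Saving the plan to a file and applying that file is one reasonable design choice: it avoids the case where the configuration or the cloud state changes between the preview and the apply.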

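Note: a free trial account can create only a limited set of resources, so you may run into insufficient quota. In that case, check your usage in Quota management; if quota is the cause, release some resources and try again.

3.5 Use Terraform to create an environment for deploying a web application

Next, we will create a web application and deploy it to the virtual machine. This requires the following preparation:

Add a custom SSH firewall rule so that we can SSH into the virtual machine remotely. The Huawei Cloud equivalent is a security group.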

# Append the following to the existing main.tf. When terraform apply runs,
# resources that already exist are left untouched and only the newly added resources are created

# add ssh firewall rule
resource "google_compute_firewall" "ssh" {
  name = "allow-ssh"
  allow {
    ports    = ["22"]
    protocol = "tcp"
  }
  direction     = "INGRESS"
  network       = google_compute_network.vpc_network.id
  priority      = 1000
  source_ranges = ["0.0.0.0/0"]
  target_tags   = ["ssh"]
}
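
The source_ranges value in this rule decides which client addresses may connect. The Python sketch below uses the standard ipaddress module to show how a CIDR match works: 0.0.0.0/0 matches every IPv4 address, so the rule above opens port 22 to the whole internet, while a narrower range admits only addresses inside it. The function name and the sample addresses (taken from the reserved documentation ranges) are illustrative and not part of the tutorial.

```python
import ipaddress

def is_source_allowed(source_ip, source_ranges):
    """True if source_ip falls inside any of the CIDR ranges."""
    ip = ipaddress.ip_address(source_ip)
    return any(ip in ipaddress.ip_network(cidr) for cidr in source_ranges)

print(is_source_allowed("203.0.113.7", ["0.0.0.0/0"]))        # True: matches every IPv4 address
print(is_source_allowed("203.0.113.7", ["198.51.100.0/24"]))  # False: outside the range
```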

Log in to the flask-vm virtual machine over SSH

Build the Flask application
Create the app.py file

Write a simple hello-world HTTP server:

from flask import Flask

app = Flask(__name__)

@app.route('/')
def hello_cloud():
    return 'Hello Terraform!'

if __name__ == '__main__':
    # Listen on all interfaces so the app is reachable from outside the VM.
    app.run(host='0.0.0.0')

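Run python3 app.py. Flask serves on port 5000 by default, and host='0.0.0.0' makes it listen on all interfaces rather than only on localhost.
Open port 5000 on the virtual machine
To apply this configuration, append the following to the existing main.tf and then run terraform apply.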

# expose 5000 port 
resource "google_compute_firewall" "flask" {
  name    = "flask-app-firewall"
  network = google_compute_network.vpc_network.id

  allow {
    protocol = "tcp"
    ports    = ["5000"]
  }
  source_ranges = ["0.0.0.0/0"]
}

# An output value that exposes the VM's external IP address as a URL
output "Web-server-URL" {
 value = join("",["http://",google_compute_instance.default.network_interface.0.access_config.0.nat_ip,":5000"])
}

Then we deploy a simple Python Flask application to verify that the web server works. This completes the experiment.
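
Before opening the URL in a browser, it can help to check that port 5000 is reachable at all, which separates firewall problems from application problems. The following Python sketch does this with a plain TCP connection. The function name is made up, and 127.0.0.1 is a placeholder: substitute the address reported by the Web-server-URL output.

```python
import socket

def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, or timed out: treat all of these as "not open".
        return False

# Placeholder address; use the VM's external IP from the Terraform output.
print(port_is_open("127.0.0.1", 5000))
```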

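4. GitOps-style CI/CD

4.1 Basic concepts

Cloud Build is GCP's serverless CI/CD platform. Together with related services such as source repositories and Artifact Registry, it covers source hosting, builds, artifact storage, testing, and deployment.
The term GitOps was coined by Weaveworks. If DevOps is a philosophy, GitOps is a concrete practice of it: infrastructure provisioning, CI/CD, and other operations tasks are turned into code through IaC (Infrastructure as Code), for example Cloud Build configuration, Terraform configuration, and Kubernetes configuration, and those configuration files are then managed in Git.
A prerequisite for GitOps is therefore that our infrastructure can be declaratively managed.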
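
To show what declarative management means in practice, here is a toy Python sketch of a reconciliation step: the desired state (as it would be stored in Git) is compared with the observed state, and the differences become actions. This is a conceptual illustration only; the data shapes and names are invented and do not correspond to how Cloud Build, Terraform, or Kubernetes represent state internally.

```python
def reconcile(desired, actual):
    """Compare declared state with observed state and list the actions needed."""
    actions = []
    for name, spec in sorted(desired.items()):
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != spec:
            actions.append(("update", name))
    # Anything running that is no longer declared gets removed.
    for name in sorted(actual.keys() - desired.keys()):
        actions.append(("delete", name))
    return actions

desired = {"hello-app": {"image": "hello-app:v2", "replicas": 2}}
actual = {"hello-app": {"image": "hello-app:v1", "replicas": 2},
          "old-job": {"image": "job:v1", "replicas": 1}}
print(reconcile(desired, actual))  # [('update', 'hello-app'), ('delete', 'old-job')]
```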

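4.2 Implementation overview

This experiment follows the reference Tutorial, which uses two Git repositories:

app repository: contains the source code of the application itself
env repository: contains the manifests for the Kubernetes deployment

We keep the app repository and the env repository separate because they have different lifecycles and purposes. The main users of the app repository are real people, and the repository is dedicated to a specific application. The main users of the env repository are automated systems (such as Cloud Build), and the repository may be shared by several applications. The env repository can have multiple branches, each mapped to a specific environment.

For this demo we have only one environment; real-world setups can have several.

4.3 Implement GitOps CI with Cloud Build

Continuous integration (CI) refers to the build and unit testing stages of the software release process. Every revision that is committed triggers an automated build and test.
In GCP, a trigger detects code changes pushed to Git, and the trigger is configured with a Cloud Build pipeline configuration file: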

steps:
# This step runs the unit tests on the app
- name: 'python:3.7-slim'
  id: Test
  entrypoint: /bin/sh
  args:
  - -c
  - 'pip install flask && python test_app.py -v'

# This step builds the container image.
- name: 'gcr.io/cloud-builders/docker'
  id: Build
  args:
  - 'build'
  - '-t'
  - 'us-central1-docker.pkg.dev/$PROJECT_ID/my-repository/hello-cloudbuild:$SHORT_SHA'
  - '.'

# This step pushes the image to Artifact Registry
# The PROJECT_ID and SHORT_SHA variables are automatically
# replaced by Cloud Build.
- name: 'gcr.io/cloud-builders/docker'
  id: Push
  args:
  - 'push'
  - 'us-central1-docker.pkg.dev/$PROJECT_ID/my-repository/hello-cloudbuild:$SHORT_SHA'

The configuration above is a typical CI flow: run the tests, build the image, and push the image.
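
The image reference used in the Build and Push steps can be reproduced in a few lines of Python, which makes it easy to see which tag a given commit will produce. In Cloud Build, SHORT_SHA is the first seven characters of COMMIT_SHA. The function names, the project ID my-project, and the commit hash below are placeholders for illustration.

```python
def short_sha(commit_sha):
    """Cloud Build's SHORT_SHA is the first seven characters of COMMIT_SHA."""
    return commit_sha[:7]

def image_ref(project_id, commit_sha, region="us-central1",
              repository="my-repository", image="hello-cloudbuild"):
    """Build the Artifact Registry reference used by the pipeline above."""
    return f"{region}-docker.pkg.dev/{project_id}/{repository}/{image}:{short_sha(commit_sha)}"

# "my-project" and the commit hash are placeholders.
print(image_ref("my-project", "0123456789abcdef0123456789abcdef01234567"))
```

Tagging each image with the commit hash ties every deployed artifact back to the exact source revision that produced it.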

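4.4 Implement GitOps CD with Cloud Build

Continuous delivery (CD), put simply, adds a deployment step on top of CI. In the tutorial, this step is also triggered through a Git repository.
To do this, we create a new repository and configure a trigger for it; the trigger's job is to deploy to the GKE cluster whenever it sees a new image. For that trigger to fire, we append the following to the CI cloudbuild.yaml from earlier. Its main job is to write the new image into the CD configuration file and push it to the CD configuration repository, which triggers the CD run.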

# This step clones the hello-cloudbuild-env repository
- name: 'gcr.io/cloud-builders/gcloud'
  id: Clone env repository
  entrypoint: /bin/sh
  args:
  - '-c'
  - |
    gcloud source repos clone hello-cloudbuild-env && \
    cd hello-cloudbuild-env && \
    git checkout candidate && \
    git config user.email $(gcloud auth list --filter=status:ACTIVE --format='value(account)')

# This step generates the new manifest
- name: 'gcr.io/cloud-builders/gcloud'
  id: Generate manifest
  entrypoint: /bin/sh
  args:
  - '-c'
  - |
     sed "s/GOOGLE_CLOUD_PROJECT/${PROJECT_ID}/g" kubernetes.yaml.tpl | \
     sed "s/COMMIT_SHA/${SHORT_SHA}/g" > hello-cloudbuild-env/kubernetes.yaml

# This step pushes the manifest back to hello-cloudbuild-env
- name: 'gcr.io/cloud-builders/gcloud'
  id: Push manifest
  entrypoint: /bin/sh
  args:
  - '-c'
  - |
    set -x && \
    cd hello-cloudbuild-env && \
    git add kubernetes.yaml && \
    git commit -m "Deploying image us-central1-docker.pkg.dev/$PROJECT_ID/my-repository/hello-cloudbuild:${SHORT_SHA}
    Built from commit ${COMMIT_SHA} of repository hello-cloudbuild-app
    Author: $(git log --format='%an <%ae>' -n 1 HEAD)" && \
    git push origin candidate
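
The Generate manifest step is plain text substitution. As a rough Python equivalent of its two sed commands, the sketch below replaces the GOOGLE_CLOUD_PROJECT and COMMIT_SHA placeholders in a template string. The one-line template is a stand-in invented for this example (the real kubernetes.yaml.tpl is a full manifest), and my-project and 0123456 are placeholder values.

```python
def render_manifest(template, project_id, short_sha):
    """Mirror the two sed substitutions from the 'Generate manifest' step."""
    return (template
            .replace("GOOGLE_CLOUD_PROJECT", project_id)
            .replace("COMMIT_SHA", short_sha))

# A one-line stand-in for kubernetes.yaml.tpl.
template = "image: us-central1-docker.pkg.dev/GOOGLE_CLOUD_PROJECT/my-repository/hello-cloudbuild:COMMIT_SHA"
print(render_manifest(template, "my-project", "0123456"))
```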

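At this point kubernetes.yaml has been written to the candidate branch of hello-cloudbuild-env. To carry out the deployment, we create another trigger, whose build configuration file we name cloudbuild-delivery.yaml, with the following content: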

# [START cloudbuild-delivery]
steps:
# This step deploys the new version of our container image
# in the hello-cloudbuild Kubernetes Engine cluster.
- name: 'gcr.io/cloud-builders/kubectl'
  id: Deploy
  args:
  - 'apply'
  - '-f'
  - 'kubernetes.yaml'
  env:
  - 'CLOUDSDK_COMPUTE_REGION=us-central1'
  - 'CLOUDSDK_CONTAINER_CLUSTER=hello-cloudbuild'

# This step copies the applied manifest to the production branch
# The COMMIT_SHA variable is automatically
# replaced by Cloud Build.
- name: 'gcr.io/cloud-builders/git'
  id: Copy to production branch
  entrypoint: /bin/sh
  args:
  - '-c'
  - |
    set -x && \
    # Configure Git to create commits with Cloud Build's service account
    git config user.email $(gcloud auth list --filter=status:ACTIVE --format='value(account)') && \
    # Switch to the production branch and copy the kubernetes.yaml file from the candidate branch
    git fetch origin production && git checkout production && \
    git checkout $COMMIT_SHA kubernetes.yaml && \
    # Commit the kubernetes.yaml file with a descriptive commit message
    git commit -m "Manifest from commit $COMMIT_SHA
    $(git log --format=%B -n 1 $COMMIT_SHA)" && \
    # Push the changes back to Cloud Source Repository
    git push origin production
# [END cloudbuild-delivery]

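With this in place, whenever the hello-cloudbuild-env repository receives a commit, the deployment is triggered: the latest Docker image is deployed to the Autopilot cluster and becomes reachable through the service. If everything succeeds, the applied manifest is finally written to the production branch.

4.5 Verify the result

Modify the code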

# Change the greeting
sed -i 's/Hello Cloud Build/Hello huawei/g' app.py
sed -i 's/Hello Cloud Build/Hello huawei/g' test_app.py

# Or change it back
sed -i 's/Hello huawei/Hello Cloud Build/g' app.py
sed -i 's/Hello huawei/Hello Cloud Build/g' test_app.py

Commit and push the change (or open a merge request):
git add app.py test_app.py
git commit -m "Hello Cloud Build"
git push
Check the CI/CD builds:
https://console.cloud.google.com/cloud-build/builds
Check the deployed containers: kubectl get pods
Check the service: http://34.172.165.210/

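5. GKE Autopilot

Follow the guide on creating an Autopilot cluster to create an Autopilot cluster.

Follow the guide on deploying a stateless workload to deploy a stateless application.

We again use the Docker image created in section 2.1. In Cloud Shell, create a deployment.yaml with the following content: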

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3
  selector:
    matchLabels:
      run: my-app
  template:
    metadata:
      labels:
        run: my-app
    spec:
      containers:
      - name: hello-app
        image: us-west1-docker.pkg.dev/polynomial-text-398202/hello-repo/hello-app:v1

Then deploy it with kubectl apply -f deployment.yaml and check the result with kubectl get pods:

NAME                      READY   STATUS    RESTARTS   AGE
my-app-7fcfbd5c8d-55jcf   1/1     Running   0          61m
my-app-7fcfbd5c8d-lrxcn   1/1     Running   0          61m
my-app-7fcfbd5c8d-wfzzz   1/1     Running   0          61m
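
For a scripted check of the same information, the following Python sketch parses the tabular output of kubectl get pods and reports whether every pod is Running with all of its containers ready. The function name is made up, and the parsing assumes the default column layout shown above.

```python
def all_pods_ready(kubectl_output):
    """Check that every pod row is Running with all containers ready."""
    rows = kubectl_output.strip().splitlines()[1:]  # skip the header row
    for row in rows:
        name, ready, status = row.split()[:3]
        ready_count, total = ready.split("/")
        if status != "Running" or ready_count != total:
            return False
    # An empty listing means nothing was deployed, so report False.
    return bool(rows)

output = """\
NAME                      READY   STATUS    RESTARTS   AGE
my-app-7fcfbd5c8d-55jcf   1/1     Running   0          61m
my-app-7fcfbd5c8d-lrxcn   1/1     Running   0          61m
my-app-7fcfbd5c8d-wfzzz   1/1     Running   0          61m
"""
print(all_pods_ready(output))  # True
```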
