1. Overview
Before Hadoop 2.0.0, a cluster had only one NameNode, which made it a single point of failure: if the NameNode host went down, the whole cluster was unavailable until the NameNode was restarted. Even planned maintenance required taking the entire cluster offline, so 24/7 availability was impossible. Hadoop 2.0 and later add a NameNode high-availability (HA) mechanism; this article covers deploying Hadoop HA on Kubernetes.
HDFS
![image.png](https://img-blog.csdnimg.cn/img_convert/2ec969f4d33d362365b39c1c86b7605d.png)
YARN
![image.png](https://img-blog.csdnimg.cn/img_convert/1209e27669d3068ef21d915ad47676f2.png)
2. Deployment
This deployment builds on the non-HA orchestration; if you are not familiar with it, please read my earlier article first.
1) Add the journalNode resources
1. StatefulSet controller
```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: {{ include "hadoop.fullname" . }}-hdfs-jn
  annotations:
    checksum/config: {{ include (print $.Template.BasePath "/hadoop-configmap.yaml") . | sha256sum }}
  labels:
    app.kubernetes.io/name: {{ include "hadoop.name" . }}
    helm.sh/chart: {{ include "hadoop.chart" . }}
    app.kubernetes.io/instance: {{ .Release.Name }}
    app.kubernetes.io/component: hdfs-jn
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: {{ include "hadoop.name" . }}
      app.kubernetes.io/instance: {{ .Release.Name }}
      app.kubernetes.io/component: hdfs-jn
  serviceName: {{ include "hadoop.fullname" . }}-hdfs-jn
  replicas: {{ .Values.hdfs.jounralNode.replicas }}
  template:
    metadata:
      labels:
        app.kubernetes.io/name: {{ include "hadoop.name" . }}
        app.kubernetes.io/instance: {{ .Release.Name }}
        app.kubernetes.io/component: hdfs-jn
    spec:
      affinity:
        podAntiAffinity:
        {{- if eq .Values.antiAffinity "hard" }}
          requiredDuringSchedulingIgnoredDuringExecution:
          - topologyKey: "kubernetes.io/hostname"
            labelSelector:
              matchLabels:
                app.kubernetes.io/name: {{ include "hadoop.name" . }}
                app.kubernetes.io/instance: {{ .Release.Name }}
                app.kubernetes.io/component: hdfs-jn
        {{- else if eq .Values.antiAffinity "soft" }}
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 5
            podAffinityTerm:
              topologyKey: "kubernetes.io/hostname"
              labelSelector:
                matchLabels:
                  app.kubernetes.io/name: {{ include "hadoop.name" . }}
                  app.kubernetes.io/instance: {{ .Release.Name }}
                  app.kubernetes.io/component: hdfs-jn
        {{- end }}
      terminationGracePeriodSeconds: 0
      containers:
      - name: hdfs-jn
        image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
        imagePullPolicy: {{ .Values.image.pullPolicy | quote }}
        command:
        - "/bin/bash"
        - "/opt/apache/tmp/hadoop-config/bootstrap.sh"
        - "-d"
        resources:
{{ toYaml .Values.hdfs.jounralNode.resources | indent 10 }}
        readinessProbe:
          tcpSocket:
            port: 8485
          initialDelaySeconds: 10
          timeoutSeconds: 2
        livenessProbe:
          tcpSocket:
            port: 8485
          initialDelaySeconds: 10
          timeoutSeconds: 2
        volumeMounts:
        - name: hadoop-config
          mountPath: /opt/apache/tmp/hadoop-config
        {{- range .Values.persistence.journalNode.volumes }}
        - name: {{ .name }}
          mountPath: {{ .mountPath }}
        {{- end }}
        securityContext:
          runAsUser: {{ .Values.securityContext.runAsUser }}
          privileged: {{ .Values.securityContext.privileged }}
      volumes:
      - name: hadoop-config
        configMap:
          name: {{ include "hadoop.fullname" . }}
{{- if .Values.persistence.journalNode.enabled }}
  volumeClaimTemplates:
  {{- range .Values.persistence.journalNode.volumes }}
  - metadata:
      name: {{ .name }}
      labels:
        app.kubernetes.io/name: {{ include "hadoop.name" $ }}
        helm.sh/chart: {{ include "hadoop.chart" $ }}
        app.kubernetes.io/instance: {{ $.Release.Name }}
        app.kubernetes.io/component: hdfs-jn
    spec:
      accessModes:
      - {{ $.Values.persistence.journalNode.accessMode | quote }}
      resources:
        requests:
          storage: {{ $.Values.persistence.journalNode.size | quote }}
      {{- if $.Values.persistence.journalNode.storageClass }}
      {{- if (eq "-" $.Values.persistence.journalNode.storageClass) }}
      storageClassName: ""
      {{- else }}
      storageClassName: "{{ $.Values.persistence.journalNode.storageClass }}"
      {{- end }}
      {{- end }}
  {{- end }}
{{- else }}
      - name: dfs
        emptyDir: {}
{{- end }}
```
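Before applying the chart, it is worth rendering the new template locally to catch Go-template or YAML errors early. A quick sketch, assuming the chart directory is `./hadoop` and the file above is saved as `templates/hdfs-jn-statefulset.yaml` (a hypothetical path; adjust to your layout):

```shell
# Render only the JournalNode StatefulSet to catch template/YAML errors early.
# Guarded so the snippet degrades gracefully where helm is not installed.
if command -v helm >/dev/null 2>&1; then
  helm template hadoop-ha ./hadoop -s templates/hdfs-jn-statefulset.yaml || true
  status="attempted"
else
  status="helm-not-found"
fi
echo "render=$status"
```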
2. Service
```yaml
# A headless service to create DNS records
apiVersion: v1
kind: Service
metadata:
  name: {{ include "hadoop.fullname" . }}-hdfs-jn
  labels:
    app.kubernetes.io/name: {{ include "hadoop.name" . }}
    helm.sh/chart: {{ include "hadoop.chart" . }}
    app.kubernetes.io/instance: {{ .Release.Name }}
    app.kubernetes.io/component: hdfs-jn
spec:
  ports:
  - name: jn
    port: {{ .Values.service.journalNode.ports.jn }}
    protocol: TCP
    {{- if and (eq .Values.service.journalNode.type "NodePort") .Values.service.journalNode.nodePorts.jn }}
    nodePort: {{ .Values.service.journalNode.nodePorts.jn }}
    {{- end }}
  type: {{ .Values.service.journalNode.type }}
  selector:
    app.kubernetes.io/name: {{ include "hadoop.name" . }}
    app.kubernetes.io/instance: {{ .Release.Name }}
    app.kubernetes.io/component: hdfs-jn
```
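This service gives each JournalNode pod a stable DNS name of the form `<pod>.<service>.<namespace>.svc.cluster.local`, which is what `dfs.namenode.shared.edits.dir` in the ConfigMap has to point at. A sketch of how the `qjournal://` URI is assembled, assuming fullname `hadoop-ha-hadoop`, namespace `hadoop-ha`, and nameservice ID `mycluster` (all assumptions; check the rendered chart output and your `hdfs-site.xml`):

```shell
# Build the qjournal:// shared-edits URI from the StatefulSet pod DNS names.
svc="hadoop-ha-hadoop-hdfs-jn"   # assumed: {{ include "hadoop.fullname" . }}-hdfs-jn
ns="hadoop-ha"                   # release namespace
port=8485                        # service.journalNode.ports.jn
replicas=3                       # hdfs.jounralNode.replicas
jns=""
i=0
while [ "$i" -lt "$replicas" ]; do
  host="${svc}-${i}.${svc}.${ns}.svc.cluster.local:${port}"
  jns="${jns:+${jns};}${host}"   # join hosts with ';'
  i=$((i + 1))
done
edits_dir="qjournal://${jns}/mycluster"   # "mycluster" is an assumed nameservice ID
echo "$edits_dir"
```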
2) Update the configuration
1. Modify values.yaml
```yaml
image:
  repository: myharbor.com/bigdata/hadoop
  tag: 3.3.2
  pullPolicy: IfNotPresent

# The version of the hadoop libraries being used in the image.
hadoopVersion: 3.3.2
logLevel: INFO

# Select antiAffinity as either hard or soft, default is soft
antiAffinity: "soft"

hdfs:
  nameNode:
    replicas: 2
    pdbMinAvailable: 1
    resources:
      requests:
        memory: "256Mi"
        cpu: "10m"
      limits:
        memory: "2048Mi"
        cpu: "1000m"
  dataNode:
    # Will be used as dfs.datanode.hostname
    # You still need to set up services + ingress for every DN
    # Datanodes will expect to
    externalHostname: example.com
    externalDataPortRangeStart: 9866
    externalHTTPPortRangeStart: 9864
    replicas: 3
    pdbMinAvailable: 1
    resources:
      requests:
        memory: "256Mi"
        cpu: "10m"
      limits:
        memory: "2048Mi"
        cpu: "1000m"
  webhdfs:
    enabled: true
  jounralNode:
    replicas: 3
    pdbMinAvailable: 1
    resources:
      requests:
        memory: "256Mi"
        cpu: "10m"
      limits:
        memory: "2048Mi"
        cpu: "1000m"

yarn:
  resourceManager:
    pdbMinAvailable: 1
    replicas: 2
    resources:
      requests:
        memory: "256Mi"
        cpu: "10m"
      limits:
        memory: "2048Mi"
        cpu: "2000m"
  nodeManager:
    pdbMinAvailable: 1
    # The number of YARN NodeManager instances.
    replicas: 1
    # Create statefulsets in parallel (K8S 1.7+)
    parallelCreate: false
    # CPU and memory resources allocated to each node manager pod.
    # This should be tuned to fit your workload.
    resources:
      requests:
        memory: "256Mi"
        cpu: "500m"
      limits:
        memory: "2048Mi"
        cpu: "1000m"

persistence:
  nameNode:
    enabled: true
    storageClass: "hadoop-ha-nn-local-storage"
    accessMode: ReadWriteOnce
    size: 1Gi
    local:
    - name: hadoop-ha-nn-0
      host: "local-168-182-110"
      path: "/opt/bigdata/servers/hadoop-ha/nn/data/data1"
    - name: hadoop-ha-nn-1
      host: "local-168-182-111"
      path: "/opt/bigdata/servers/hadoop-ha/nn/data/data1"
  dataNode:
    enabled: true
    enabledStorageClass: false
    storageClass: "hadoop-ha-dn-local-storage"
    accessMode: ReadWriteOnce
    size: 1Gi
    local:
    - name: hadoop-ha-dn-0
      host: "local-168-182-110"
      path: "/opt/bigdata/servers/hadoop-ha/dn/data/data1"
    - name: hadoop-ha-dn-1
      host: "local-168-182-110"
      path: "/opt/bigdata/servers/hadoop-ha/dn/data/data2"
    - name: hadoop-ha-dn-2
      host: "local-168-182-110"
      path: "/opt/bigdata/servers/hadoop-ha/dn/data/data3"
    - name: hadoop-ha-dn-3
      host: "local-168-182-111"
      path: "/opt/bigdata/servers/hadoop-ha/dn/data/data1"
    - name: hadoop-ha-dn-4
      host: "local-168-182-111"
      path: "/opt/bigdata/servers/hadoop-ha/dn/data/data2"
    - name: hadoop-ha-dn-5
      host: "local-168-182-111"
      path: "/opt/bigdata/servers/hadoop-ha/dn/data/data3"
    - name: hadoop-ha-dn-6
      host: "local-168-182-112"
      path: "/opt/bigdata/servers/hadoop-ha/dn/data/data1"
    - name: hadoop-ha-dn-7
      host: "local-168-182-112"
      path: "/opt/bigdata/servers/hadoop-ha/dn/data/data2"
    - name: hadoop-ha-dn-8
      host: "local-168-182-112"
      path: "/opt/bigdata/servers/hadoop-ha/dn/data/data3"
    volumes:
    - name: dfs1
      mountPath: /opt/apache/hdfs/datanode1
      hostPath: /opt/bigdata/servers/hadoop-ha/dn/data/data1
    - name: dfs2
      mountPath: /opt/apache/hdfs/datanode2
      hostPath: /opt/bigdata/servers/hadoop-ha/dn/data/data2
    - name: dfs3
      mountPath: /opt/apache/hdfs/datanode3
      hostPath: /opt/bigdata/servers/hadoop-ha/dn/data/data3
  journalNode:
    enabled: true
    storageClass: "hadoop-ha-jn-local-storage"
    accessMode: ReadWriteOnce
    size: 1Gi
    local:
    - name: hadoop-ha-jn-0
      host: "local-168-182-110"
      path: "/opt/bigdata/servers/hadoop-ha/jn/data/data1"
    - name: hadoop-ha-jn-1
      host: "local-168-182-111"
      path: "/opt/bigdata/servers/hadoop-ha/jn/data/data1"
    - name: hadoop-ha-jn-2
      host: "local-168-182-112"
      path: "/opt/bigdata/servers/hadoop-ha/jn/data/data1"
    volumes:
    - name: jn
      mountPath: /opt/apache/hdfs/journalnode

service:
  nameNode:
    type: NodePort
    ports:
      dfs: 9000
      webhdfs: 9870
    nodePorts:
      dfs: 30900
      webhdfs: 30870
  dataNode:
    type: NodePort
    ports:
      webhdfs: 9864
    nodePorts:
      webhdfs: 30864
  resourceManager:
    type: NodePort
    ports:
      web: 8088
    nodePorts:
      web: 30088
  journalNode:
    type: ClusterIP
    ports:
      jn: 8485
    nodePorts:
      jn: ""

securityContext:
  runAsUser: 9999
  privileged: true
```
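The `jounralNode.replicas: 3` above is the usual minimum: JournalNodes use a majority quorum, so you should run an odd number of replicas, and the shared edit log stays writable as long as a majority of them is alive. A quick sanity check of the arithmetic:

```shell
# Majority quorum: with n JournalNodes, an edit must be acked by floor(n/2)+1
# of them, so the cluster tolerates n - (floor(n/2)+1) JournalNode failures.
replicas=3
quorum=$(( replicas / 2 + 1 ))
tolerated=$(( replicas - quorum ))
echo "replicas=${replicas} quorum=${quorum} tolerated_failures=${tolerated}"
```

With 3 replicas, writes need 2 acknowledgements and one JournalNode can be lost without taking HDFS down; 5 replicas would tolerate two failures.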
2. Modify hadoop/templates/hadoop-configmap.yaml
The changes there are fairly extensive, so they are not listed here; a git download link is given at the bottom of the article.
3) Install
```shell
# Create the local storage directories on every node
mkdir -p /opt/bigdata/servers/hadoop-ha/{nn,dn,jn}/data/data{1..3}
chmod -R 777 /opt/bigdata/servers/hadoop-ha/{nn,dn,jn}

# Install the chart
helm install hadoop-ha ./hadoop -n hadoop-ha --create-namespace
```
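The NameNodes can only format and bootstrap once a JournalNode quorum is up, so it can help to wait for the JN pods before checking the rest. A sketch (the label value matches the `hdfs-jn` component label used in the templates above; the guard skips the call where kubectl is unavailable):

```shell
# Wait for all JournalNode pods to become Ready before inspecting the NameNodes.
if command -v kubectl >/dev/null 2>&1; then
  kubectl -n hadoop-ha wait pod \
    -l app.kubernetes.io/component=hdfs-jn \
    --for=condition=Ready --timeout=300s || true
  waited="yes"
else
  waited="skipped"
fi
echo "waited=$waited"
```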
Check the pods and services:
```shell
kubectl get pods,svc -n hadoop-ha -owide
```
![image.png](https://img-blog.csdnimg.cn/img_convert/b12e520a012f2048a0af5e0e60e3ec80.png)
HDFS Web UI (nn1): http://192.168.182.110:31870/dfshealth.html#tab-overview
![image.png](https://img-blog.csdnimg.cn/img_convert/ac34478bd99c2bb07d37d5dd9db19a29.png)
HDFS Web UI (nn2): http://192.168.182.110:31871/dfshealth.html#tab-overview
![image.png](https://img-blog.csdnimg.cn/img_convert/46b3d9ebeceed0a852ffc29cb7a692f2.png)
YARN Web UI (rm1): http://192.168.182.110:31088/cluster/cluster
![image.png](https://img-blog.csdnimg.cn/img_convert/553d10e5a300c35bf925e6243b4de5ff.png)
YARN Web UI (rm2): http://192.168.182.110:31089/cluster/cluster
![image.png](https://img-blog.csdnimg.cn/img_convert/270ad9e2b7bece9edda9dadc16cfed38.png)
4) Test and verify
```shell
kubectl exec -it hadoop-ha-hadoop-hdfs-nn-0 -n hadoop-ha -- bash
```
![image.png](https://img-blog.csdnimg.cn/img_convert/ab5d6ceb3410c53a01ae37b687814a61.png)
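Inside the NameNode pod you can check which NameNode is active and run a simple write smoke test. A sketch, assuming the NameNode IDs are `nn0` and `nn1` (an assumption; confirm them against `dfs.ha.namenodes.<nameservice>` in the rendered hdfs-site.xml):

```shell
# Guarded so the snippet is a no-op outside the pod (where the hdfs CLI exists).
if command -v hdfs >/dev/null 2>&1; then
  hdfs haadmin -getAllServiceState             # active/standby state of every NameNode
  hdfs haadmin -getServiceState nn0 || true    # state of one NameNode by ID (assumed "nn0")
  hdfs dfs -mkdir -p /tmp/ha-smoke-test        # write through the active NameNode
  hdfs dfs -ls /tmp
  checked="yes"
else
  checked="skipped"
fi
echo "ha-check=$checked"
```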
5) Uninstall
```shell
helm uninstall hadoop-ha -n hadoop-ha

# Force-delete any remaining pods, then remove the namespace
kubectl delete pod -n hadoop-ha `kubectl get pod -n hadoop-ha|awk 'NR>1{print $1}'` --force
kubectl patch ns hadoop-ha -p '{"metadata":{"finalizers":null}}'
kubectl delete ns hadoop-ha --force

# Clean up the data directories on every node
rm -fr /opt/bigdata/servers/hadoop-ha/{nn,dn,jn}/data/data{1..3}/*
```