upgrade k8s (by quqi99)

Author: Zhang Hua  Published: 2023-11-17
Copyright notice: Feel free to reproduce this article, but please keep a hyperlink to the original source together with the author information and this copyright notice (http://blog.csdn.net/quqi99)

This post first collects some theoretical notes on upgrading k8s found online; the steps in that theory section have not actually been tested.

Theory - upgrade k8s from 1.20.6 to 1.20.15 with kubeadm

refer: Cloud-native Kubernetes: K8S cluster version upgrade (v1.20.6 - v1.20.15) - https://blog.csdn.net/cronaldo91/article/details/133789264

1, check version
kubectl get nodes
kubectl version
kubeadm version
kubectl get componentstatuses
kubectl get deployments --all-namespaces

2, upgrade kubeadm
apt install kubeadm=1.20.15*
kubeadm version

3, upgrade master1
kubeadm upgrade plan
kubeadm upgrade apply v1.20.15
#in offline mode, load the control-plane images first (the image tags must match the version being applied, i.e. v1.20.15 here)
docker image load -i kube-apiserver:v1.20.15.tar
docker image load -i kube-scheduler:v1.20.15.tar
docker image load -i kube-controller-manager:v1.20.15.tar
docker image load -i kube-proxy:v1.20.15.tar
docker image list
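To keep the offline tarballs consistent with the release being applied, a minimal sketch (assuming a connected build host and the k8s.gcr.io registry that 1.20 uses) of listing, pulling and exporting the exact images:
# list the exact images kubeadm needs for the target version
kubeadm config images list --kubernetes-version v1.20.15
# on a connected host, pull them and export one tarball per image for the offline nodes
kubeadm config images pull --kubernetes-version v1.20.15
docker image save k8s.gcr.io/kube-apiserver:v1.20.15 -o kube-apiserver:v1.20.15.tar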

4, upgrade master2, but please use 'kubeadm upgrade node' instead of 'kubeadm upgrade apply'
apt install kubeadm=1.20.15*
kubeadm version
kubeadm upgrade node

5, upgrade kubelet and kubectl on master1
kubectl drain master1 --ignore-daemonsets
apt install kubelet=1.20.15* kubectl=1.20.15*
systemctl daemon-reload && systemctl restart kubelet
kubectl uncordon master1
kubectl get nodes

6, upgrade kubelet and kubectl on master2
kubectl drain master2 --ignore-daemonsets
apt install kubelet=1.20.15* kubectl=1.20.15*
systemctl daemon-reload && systemctl restart kubelet
kubectl uncordon master2
kubectl get nodes

7, upgrade worker
apt install kubeadm=1.20.15*
kubeadm version
kubeadm upgrade node
kubectl drain worker1 --ignore-daemonsets --delete-emptydir-data
apt install kubelet=1.20.15*
systemctl daemon-reload && systemctl restart kubelet
kubectl uncordon worker1
kubectl get nodes

8, verify the cluster
kubectl get nodes
kubeadm alpha certs check-expiration
kubectl get pods -n kube-system

Practice - Upgrade k8s from 1.21 to 1.26 with charms

The upgrade below follows the n-1 pattern (upgrade path: 1.21 --> 1.22 --> 1.23 --> 1.24 --> 1.25 --> 1.26); it has been verified in a real environment and it works.

0, deploy a k8s 1.21 test env via the bundle - https://github.com/charmed-kubernetes/bundle/blob/main/releases/1.21/bundle.yaml
   juju scp kubernetes-master/0:config ~/.kube/config

1, backup db
juju run-action etcd/leader snapshot --wait
juju scp etcd/1:/home/ubuntu/etcd-snapshots/etcd-snapshot-2023-11-21-02.25.59.tar.gz .
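Before upgrading any charm it is also worth confirming that the etcd cluster itself is healthy; a minimal sketch, assuming the deployed etcd charm revision provides a 'health' action (check with 'juju actions etcd' first):
juju actions etcd                          #confirm which actions this charm revision provides
juju run-action etcd/leader health --wait
watch juju status etcd                     #all units should be active/idle before starting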

2, upgrade containerd to the latest stable charm revision (without --revision it lands on the 1.23 charm; in my test it went from revision 607 to 200) - https://ubuntu.com/kubernetes/docs/1.22/upgrading
   please wait for the units to turn back to the "active" state - https://paste.ubuntu.com/p/ZtCKs9jx7k/
juju upgrade-charm containerd
watch juju status containerd
# fix 'blocked' and 'idle' state
juju run -u <unit_in_blocked_state> 'hooks/update-status'

3, upgrade etcd to 1.23 directly without --revision (from revision 607 to 768), and wait for the status to turn back to "active" 
juju upgrade-charm etcd
watch juju status etcd

4, upgrade the additional charms to 1.23 one by one (after waiting for the status to turn back to "active" then start the next one)
#juju upgrade-charm easyrsa --revision 420  #we don't pin --revision 420 (1.22); we upgrade it to 1.23 directly
juju upgrade-charm easyrsa                  #from 395(1.21) to 441(1.23)
juju upgrade-charm flannel                  #from 571 to 619
juju upgrade-charm kubeapi-load-balancer    #from 814 to 866
#juju upgrade-charm calico
#juju upgrade-charm hacluster-kubernetes-master
#juju upgrade-charm hacluster-dashboard
#juju upgrade-charm hacluster-keystone
#juju upgrade-charm hacluster-vault
#juju upgrade-charm telegraf
#juju upgrade-charm public-policy-routing
#juju upgrade-charm landscape-client
#juju upgrade-charm filebeat
#juju upgrade-charm ntp
#juju upgrade-charm nfs-client
#juju upgrade-charm nrpe-container
#juju upgrade-charm nrpe-host
#juju upgrade-charm prometheus-ceph-exporter

5, upgrade k8s master from 1.21 to 1.22 (from revision 1034 to 1078)
   1.22 is in old charmstore, not in charmhub - https://bugs.launchpad.net/charm-kubernetes-master/+bug/2043783/comments/2
#the revision for 1.22 is 1078 according to the 1.22 bundle - https://github.com/charmed-kubernetes/bundle/blob/main/releases/1.22/bundle.yaml
#should use ch: instead of cs: for the old charmstore charm - https://charmhub.io/containers-kubernetes-master
#juju refresh kubernetes-master --switch ch:containers-kubernetes-master --revision 1078
#ERROR --switch and --revision are mutually exclusive
juju upgrade-charm kubernetes-master --revision 1078
watch juju status kubernetes-master          #wait for the status to turn back to "active"
juju config kubernetes-master channel=1.22/stable
juju run-action kubernetes-master/0 upgrade  #one by one as well, mainly install kube-proxy snap - https://paste.ubuntu.com/p/tnSxpGXx5D/
juju run-action kubernetes-master/1 upgrade

6, upgrade k8s worker from 1.21 to 1.22 (from revision 788 to 816)
# https://github.com/charmed-kubernetes/bundle/blob/main/releases/1.22/bundle.yaml
juju upgrade-charm kubernetes-worker --revision 816
juju config kubernetes-worker channel=1.22/stable
juju run-action kubernetes-worker/0 upgrade         #one by one as well, mainly install kube-proxy snap
juju run-action kubernetes-worker/1 upgrade
juju run-action kubernetes-worker/2 upgrade
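With more workers the per-unit actions can be scripted instead of typed one by one; a minimal sketch, assuming jq is installed and the application is named kubernetes-worker (the action itself still runs serially, unit by unit):
for u in $(juju status kubernetes-worker --format=json | jq -r '.applications["kubernetes-worker"].units | keys[]'); do
  juju run-action "$u" upgrade --wait
done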

7, upgrade k8s master and worker from 1.22 to 1.23, only some charms (not all) are in charmhub for 1.23 - https://github.com/charmed-kubernetes/bundle/blob/main/releases/1.23/bundle.yaml
juju upgrade-charm kubernetes-master --revision 1106
watch juju status kubernetes-master
juju run-action kubernetes-master/0 upgrade
juju run-action kubernetes-master/1 upgrade
juju upgrade-charm kubernetes-worker --revision 838
juju run-action kubernetes-worker/0 upgrade
juju run-action kubernetes-worker/1 upgrade
juju run-action kubernetes-worker/2 upgrade

8, Charmed Kubernetes 1.24 has relocated from the charmstore to charmhub, which means upgrading each charm requires --switch during this step.
   # https://github.com/charmed-kubernetes/bundle/blob/main/releases/1.24/bundle.yaml
juju upgrade-charm containerd --switch ch:containerd --channel 1.24/stable   #channel=1.24/stable, rev=27
juju upgrade-charm etcd --switch ch:etcd --channel 1.24/stable               #channel=1.24/stable, rev=701
juju upgrade-charm easyrsa --switch ch:easyrsa --channel 1.24/stable         #channel=1.24/stable, rev=15
juju upgrade-charm flannel --switch ch:flannel --channel 1.24/stable         #channel=1.24/stable, rev=28
#juju upgrade-charm calico --switch ch:calico --channel 1.24/stable
#juju upgrade-charm hacluster-kubernetes-master --switch ch:hacluster --channel latest/stable
#juju upgrade-charm hacluster-dashboard --switch ch:hacluster --channel latest/stable
#juju upgrade-charm hacluster-keystone --switch ch:hacluster --channel latest/stable
#juju upgrade-charm hacluster-vault --switch ch:hacluster --channel latest/stable
#juju upgrade-charm telegraf --switch ch:telegraf --channel latest/stable
#juju upgrade-charm public-policy-routing
#juju upgrade-charm landscape-client
#juju upgrade-charm filebeat --switch ch:filebeat --channel latest/stable
#juju upgrade-charm ntp --switch ch:ntp --channel latest/stable
#juju upgrade-charm nfs-client
#juju upgrade-charm nrpe-container --switch ch:nrpe --channel latest/stable
#juju upgrade-charm nrpe-host --switch ch:nrpe --channel latest/stable
#juju upgrade-charm prometheus-ceph-exporter --switch ch:prometheus-ceph-exporter --channel latest/stable

#for charm 1.24, kubernetes-master is also renamed to kubernetes-control-plane
juju upgrade-charm kubernetes-master --switch ch:kubernetes-control-plane --channel 1.24/stable  #channel=1.24/stable, rev=171
juju config kubernetes-master channel=1.24/stable  #for the following 'run-action ... upgrade'
juju run-action kubernetes-master/0 upgrade
juju run-action kubernetes-master/1 upgrade
# for the message "ceph-storage relation deprecated, use ceph-client instead" 
juju remove-relation kubernetes-master:ceph-storage ceph-mon
juju upgrade-charm kubernetes-worker --switch ch:kubernetes-worker --channel 1.24/stable
juju config kubernetes-worker channel=1.24/stable
juju run-action kubernetes-worker/0 upgrade
juju run-action kubernetes-worker/1 upgrade
juju run-action kubernetes-worker/2 upgrade
kubectl get nodes -A

9, upgrade k8s from 1.24 to 1.25
juju upgrade-charm containerd --channel 1.25/stable         #channel=1.25/stable, rev=41
juju upgrade-charm etcd --channel 1.25/stable               #channel=1.25/stable, rev=718
juju upgrade-charm easyrsa --channel 1.25/stable            #channel=1.25/stable, rev=26
juju upgrade-charm flannel --channel 1.25/stable            #channel=1.25/stable, rev=49
juju upgrade-charm kubernetes-master --channel 1.25/stable  #channel=1.25/stable, rev=219
juju config kubernetes-master channel=1.25/stable
juju run-action kubernetes-master/0 upgrade
juju run-action kubernetes-master/1 upgrade
juju config kubernetes-worker channel=1.25/stable
juju run-action kubernetes-worker/0 upgrade
juju run-action kubernetes-worker/1 upgrade
juju run-action kubernetes-worker/2 upgrade
kubectl get nodes -A

10, upgrade k8s from 1.25 to 1.26, need a 'juju refresh' after 'juju run-action'
juju upgrade-charm containerd --channel 1.26/stable         #channel=1.26/stable, rev=54
juju upgrade-charm etcd --channel 1.26/stable               #channel=1.26/stable, rev=728
juju upgrade-charm easyrsa --channel 1.26/stable            #channel=1.26/stable, rev=33
juju upgrade-charm flannel --channel 1.26/stable            #channel=1.26/stable, rev=63
juju upgrade-charm kubeapi-load-balancer --channel 1.26/stable
juju upgrade-charm kubernetes-master --channel 1.26/stable  #channel=1.26/stable, rev=247

juju config kubernetes-master channel=1.26/stable
juju run-action kubernetes-master/0 upgrade
juju run-action kubernetes-master/1 upgrade
juju refresh kubernetes-master --channel 1.26/stable

juju config kubernetes-worker channel=1.26/stable
juju run-action kubernetes-worker/0 upgrade
juju run-action kubernetes-worker/1 upgrade
juju run-action kubernetes-worker/2 upgrade
juju refresh kubernetes-worker --channel 1.26/stable
kubectl get nodes -A

Customer issue

While upgrading just from 1.21 to 1.22, the customer ran 'juju upgrade-charm kubernetes-master' and found that charmhub has no 1.22 (1.22 is too old and has been removed; only 1.23 through 1.29 are available, which can be checked with 'juju info --series focal kubernetes-worker'), yet a k8s upgrade is not supposed to jump straight from 1.21 to 1.23.
So what to do without 1.22? Ask the product team to add 1.22 back? Alternatively, could a local charm serve as a stepping stone ( https://github.com/charmed-kubernetes/charm-kubernetes-worker/releases/tag/1.22%2Bck2 )? The method below can build a 1.22 charm, but whether we can upgrade from 1.21 to the local 1.22 and then on to 1.23 in charmhub still needs testing.
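If the local-charm route were tried, the built .charm file would be attached with --path; a minimal, untested sketch assuming the file produced by the build steps below ends up at ~/charms/layers/builds/kubernetes-worker.charm:
juju upgrade-charm kubernetes-worker --path ~/charms/layers/builds/kubernetes-worker.charm
juju config kubernetes-worker channel=1.22/stable
juju run-action kubernetes-worker/0 upgrade   #then one by one for the remaining units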

# pip: markers 'python_version < "3.8"' don't match your environment; Ignoring Jinja2: markers 'python_version >= "3.0" and python_version <= "3.4"'
# ppa:deadsnakes/ppa only has >=python3.5 as well, so have to use xenial instead
#juju add-machine --series jammy --constraints "mem=16G cores=8 root-disk=100G" -n 2
juju add-machine --series xenial -n 1
juju ssh 0
# but xenial is also using python3.5, and it said:
DEPRECATION: Python 3.5 reached the end of its life on September 13th, 2020. Please upgrade your Python as Python 3.5 is no longer maintained. pip 21.0 will drop support for Python 3.5 in January 2021. pip 21.0 will remove support for this functionality.

On top of that xenial machine I then built python3.2 from source, after which the charm build succeeded:
sudo apt-get install build-essential zlib1g-dev libncurses5-dev libgdbm-dev libnss3-dev libssl-dev libreadline-dev libffi-dev wget -y
wget https://www.python.org/ftp/python/3.2.6/Python-3.2.6.tgz
tar -xf Python-3.2.6.tgz
cd Python-3.2.6/
./configure --enable-optimizations
make -j$(nproc)
sudo make altinstall
python3.2 --version
alias python=python3.2
alias python3=python3.2
sudo apt install build-essential -y
sudo apt install python3-pip python3-dev python3-nose python3-mock -y
cd $CHARM_LAYERS_DIR/..
charm build --debug ./layers/kubernetes-worker/
cd /home/ubuntu/charms/layers/builds/kubernetes-worker
zip -rq ../kubernetes-worker.charm .

sudo snap install charm --classic
mkdir -p /home/ubuntu/charms
mkdir -p ~/charms/{layers,interfaces}
export JUJU_REPOSITORY=/home/ubuntu/charms
export CHARM_INTERFACES_DIR=$JUJU_REPOSITORY/interfaces
export CHARM_LAYERS_DIR=$JUJU_REPOSITORY/layers
export CHARM_BUILD_DIR=$JUJU_REPOSITORY/layers/builds
cd $CHARM_LAYERS_DIR
git clone https://github.com/charmed-kubernetes/charm-kubernetes-worker.git kubernetes-worker
cd kubernetes-worker && git checkout -b 1.22+ck2 1.22+ck2
sudo apt install python3-virtualenv tox -y
cd .. && charm build --debug layers/kubernetes-worker/
#cd ${JUJU_REPOSITORY}/layers/builds/kubernetes-worker && tox -e func
cd /home/ubuntu/charms/layers/builds/kubernetes-worker
zip -rq ../kubernetes-worker.charm .

Another issue: in 1.24 the charm kubernetes-master was renamed to kubernetes-control-plane, and the charmhub releases (which currently start at 1.23) are already published as kubernetes-control-plane, so 'juju info --series focal kubernetes-control-plane' shows information while 'juju info --series focal kubernetes-master' shows nothing.

Test environment setup

Update 20231120 - Note: the approach below, which only changes part of the bundle, is error-prone and eventually failed with "2023-11-20 04:35:53 INFO unit.kubernetes-control-plane/0.juju-log server.go:316 status-set: waiting: Failed to setup auth-webhook tokens; will retry". Building a 1.21 environment is actually simple: just use the 1.21 release bundle directly. See: https://github.com/charmed-kubernetes/bundle/blob/main/releases/1.21/bundle.yaml

We need an environment as close as possible to the customer's, which is still running the old charmstore charm at revision 768 shown below.

charmstore - https://charmhub.io/containers-kubernetes-worker
charmhub - https://charmhub.io/kubernetes-worker
charmstore - https://charmhub.io/containers-kubernetes-master
charmhub - https://charmhub.io/kubernetes-control-plane
  kubernetes-worker:
    charm: cs:~containers/kubernetes-worker-768
    channel: stable

The test tooling currently generates bundles that default to latest/stable in charmhub.

    charm: ch:kubernetes-control-plane
    channel: latest/stable

My initial idea was to use "--use-stable-charms --charmstore --k8s-channel stable --revision-info ./juju_export_bundle.txt" so that the generated bundle turns ch: back into cs:

./generate-bundle.sh -s focal --name k8s --num-control-planes 2 --num-workers 2 --calico --use-stable-charms --charmstore --k8s-channel stable --revision-info ./juju_export_bundle.txt

However, the customer provided the output of juju export-bundle rather than juju status, so --revision-info above does not work, and the command above produces:

cs:~containers/kubernetes-worker
channel: latest/stable

So my plan was:

  • Manually edit b/k8s/kubernetes.yaml to change only the k8s master and k8s worker to the same revisions as the customer's. Also, stable revision 768 no longer exists in the charmstore, and the charmstore website has been shut down, so there is no way to look up which revision is slightly newer than 768; by incrementing the number and trying 'juju deploy --series focal cs:containers-kubernetes-worker-770 test' one by one, 770 turned out to work. So the worker finally becomes: cs:~containers/kubernetes-worker-770
  • Keep the other charms at the latest stable charmstore revisions, so those upgrades can be skipped during the upgrade test and the effort stays focused on the problematic k8s worker. The master therefore becomes: cs:~containers/kubernetes-master-1008 (see the fragment below)
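The relevant part of the edited b/k8s/kubernetes.yaml then looks roughly like this (illustrative fragment only; the revisions 770 and 1008 come from the probing above, everything else stays as generated):

  kubernetes-master:
    charm: cs:~containers/kubernetes-master-1008
    channel: stable
  kubernetes-worker:
    charm: cs:~containers/kubernetes-worker-770
    channel: stable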

After editing b/k8s/kubernetes.yaml, run './generate-bundle.sh --name k8s --replay --run' to finish building the test environment.

./generate-bundle.sh --name k8s --replay --run
watch -c juju status --color                                                    
sudo snap install kubectl --classic                                             
juju ssh kubernetes-control-plane/leader -- cat config > ~/.kube/config         
source <(kubectl completion bash)                                               
kubectl completion bash |sudo tee /etc/bash_completion.d/kubectl                
kubectl cluster-info

Asking upstream for help

It seems 1.21 can be upgraded directly to 1.23 - https://bugs.launchpad.net/charm-kubernetes-master/+bug/2043783/comments/1
1.23 is a transitional charmstore release; 1.24 is the first official charmhub-only release, which is why 1.22 can no longer be seen in the store.
The revision numbers of the 1.22 release can be found here (https://github.com/charmed-kubernetes/bundle/blob/main/releases/1.22/bundle.yaml), e.g. for the workers: cs:~containers/kubernetes-worker-816

1, When switching to an old charmstore revision (e.g. 816), use --switch, but note that ch must be used instead of cs here

# switch back to old charm 816 in charmstore (NOTE: should use ch instead of cs here) - https://charmhub.io/containers-kubernetes-worker
juju refresh kubernetes-worker --switch ch:containers-kubernetes-worker --revision 816

2, When switching back to charmhub (1.24+), --switch is needed again

juju refresh kubernetes-worker --switch ch:kubernetes-worker --channel 1.2x/stable

3, In 1.24 kubernetes-master is renamed to kubernetes-control-plane - https://ubuntu.com/kubernetes/docs/1.24/upgrading

juju refresh kubernetes-master --switch ch:kubernetes-control-plane --channel 1.24/stable

Final write-up - test n-2 upgrade

Will the n-2 pattern below work?

  • Upgrade directly from 1.21 to 1.23, skipping 1.22 (both 1.21 and 1.23 are in the charmstore)
  • Upgrade from 1.23 to 1.24 (1.24 is the first release in charmhub)
  • Upgrade directly from 1.24 to 1.26, again n-2, skipping 1.25?
  • The upgrade-charm commands below can be run in parallel; then wait until juju status shows everything active again (sometimes 'juju run -u <unit_in_blocked_state> hooks/update-status' helps in between), and only after that run the 'juju run-action xxx upgrade' commands to upgrade the kube-proxy snap on k8s-master and k8s-worker
1, upgrade k8s from 1.21 to 1.23 directly (1.23 is the last charm in the charmstore) - https://github.com/charmed-kubernetes/bundle/blob/main/releases/1.23/bundle.yaml
juju upgrade-charm containerd --revision 200
juju upgrade-charm etcd --revision 655
juju upgrade-charm easyrsa --revision 441
juju upgrade-charm flannel --revision 619
juju upgrade-charm kubeapi-load-balancer --revision 866
juju upgrade-charm kubernetes-master --revision 1106
juju upgrade-charm kubernetes-worker --revision 838

# the commands above can be run in parallel with each other, but not in parallel with the commands below
watch juju status  #wait for the status to 'active' again

juju config kubernetes-master channel=1.23/stable
juju run-action kubernetes-master/0 upgrade
juju run-action kubernetes-master/1 upgrade
juju config kubernetes-worker channel=1.23/stable
juju run-action kubernetes-worker/0 upgrade
juju run-action kubernetes-worker/1 upgrade
juju run-action kubernetes-worker/2 upgrade

2, upgrade k8s from 1.23 to 1.24; 1.24 is the first version in charmhub, and kubernetes-master is renamed to kubernetes-control-plane as well
juju upgrade-charm containerd --switch ch:containerd --channel 1.24/stable
juju upgrade-charm etcd --switch ch:etcd --channel 1.24/stable
juju upgrade-charm easyrsa --switch ch:easyrsa --channel 1.24/stable
juju upgrade-charm flannel --switch ch:flannel --channel 1.24/stable
juju upgrade-charm kubernetes-master --switch ch:kubernetes-control-plane --channel 1.24/stable
juju upgrade-charm kubernetes-worker --switch ch:kubernetes-worker --channel 1.24/stable

watch juju status  #wait for the status to 'active' again

juju config kubernetes-master channel=1.24/stable
juju run-action kubernetes-master/0 upgrade
juju run-action kubernetes-master/1 upgrade
juju config kubernetes-worker channel=1.24/stable
juju run-action kubernetes-worker/0 upgrade
juju run-action kubernetes-worker/1 upgrade
juju run-action kubernetes-worker/2 upgrade

3, upgrade k8s from 1.24 to 1.26 (n-2 / skip-level upgrade, skipping 1.25)
juju upgrade-charm containerd --channel 1.26/stable
juju upgrade-charm etcd --channel 1.26/stable
juju upgrade-charm easyrsa --channel 1.26/stable
juju upgrade-charm flannel --channel 1.26/stable
juju upgrade-charm kubernetes-master --channel 1.26/stable
juju config kubernetes-worker channel=1.26/stable

watch juju status  #wait for the status to 'active' again

juju config kubernetes-master channel=1.26/stable
juju run-action kubernetes-master/0 upgrade
juju run-action kubernetes-master/1 upgrade
juju config kubernetes-worker channel=1.26/stable
juju run-action kubernetes-worker/0 upgrade
juju run-action kubernetes-worker/1 upgrade
juju run-action kubernetes-worker/2 upgrade

# Starting with 1.26 a final 'juju refresh' is needed; without the refresh command below, 'kubectl get nodes' keeps showing the workers at 1.24 while the masters are at 1.26, and the master side keeps reporting: Waiting for kubelet,kube-proxy to start
#juju refresh kubernetes-master --channel 1.26/stable
juju refresh kubernetes-worker --channel 1.26/stable
kubectl get nodes

Other issues:

  • upgrade juju 2.9.11 to 2.9.29+ to avoid: https://bugs.launchpad.net/juju/+bug/1968931
1, backup the controller
juju model-config -m controller backup-dir="/var/snap/juju-db/common"
juju create-backup -m controller
2, upgrade your juju client
snap refresh juju --channel=2.9/stable
3, upgrade the controller
juju upgrade-controller
4, once done, upgrade your model
juju upgrade-model --dry-run
juju upgrade-model
  • When 'juju status' cannot get back to active along the way, running 'juju run -u <unit_in_blocked_state> hooks/update-status' usually resolves it (restarting the unit also works; see the sketch after this list). For ntp this kind of problem may additionally require restarting chronyd first, and for the kubernetes api it may additionally require restarting the crm api resource first
  • If ceph is involved, when upgrading to 1.24 you may need to handle the message "ceph-storage relation deprecated, use ceph-client instead" with: juju remove-relation kubernetes-master:ceph-storage ceph-mon
  • If calico is involved and you hit 'stat /opt/calicoctl/kubeconfig: no such file or directory', copying /opt/calicoctl/kubeconfig from another worker can serve as a workaround
  • Note: although my own tests show the skip-level upgrade works, it is not officially recommended; when advising customers we should stay consistent with the official docs (https://ubuntu.com/kubernetes/docs/upgrading), just as with o7k we generally do not help customers upgrade either - upgrades can go through the bs service, and whoever runs the upgrade takes the blame if it breaks
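A minimal sketch of the two ways to nudge a stuck unit mentioned above, using ntp merely as the example application (the unit agent's systemd service follows the jujud-unit-<app>-<n> naming):
juju run --application ntp 'hooks/update-status'           #poke every unit of the application (juju 2.9 syntax)
juju ssh ntp/0 'sudo systemctl restart jujud-unit-ntp-0'   #or restart the unit agent itself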

20240814 - k8s offline

#for k8s
juju config containerd http_proxy=http://squid.internal:3128 https_proxy=http://squid.internal:3128 no_proxy="127.0.0.1,localhost,::1,10.149.0.0/16"
#For MicroK8S you should instead set the juju model's proxy - https://microk8s.io/docs/install-proxy
juju model-config http-proxy=http://squid.internal:3128 https-proxy=http://squid.internal:3128 no-proxy="127.0.0.1,localhost,::1,10.149.0.0/16"

juju actually configures the proxy in /etc/profile.d/juju-proxy.sh and not /etc/environment, so microk8s doesn't use it.
You can manually update /etc/environment, or the microk8s charm (which stsstack-bundles uses) also has containerd_http_proxy and containerd_custom_registries options:
https://charmhub.io/microk8s/configuration
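A minimal sketch of using those charm options instead of hand-editing /etc/environment (the proxy value is just the squid.internal example from above; confirm the exact option names on the deployed revision first):
juju config microk8s | grep containerd   #list the containerd-related options this charm revision exposes
juju config microk8s containerd_http_proxy=http://squid.internal:3128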

20241230 - my first microk8s case

https://github.com/canonical/k8s-dqlite/issues/196

cd microk8s
./generate-bundle.sh -n microk8s -s focal --run
juju add-unit microk8s -n2  #it will create microk8s HA automatically
#mkdir -p /home/ubuntu/.kube && juju exec --unit microk8s/0 microk8s.config && cat ~/.kube/config
./configure
#why does the microk8s snap support 1.31 while the microk8s charm doesn't - https://charmhub.io/microk8s
juju exec -a microk8s "sudo snap refresh microk8s --channel=1.31/stable"  #upgrade microk8s to 1.31/stable
kubectl get nodes

#or create microk8s HA by hand
juju ssh 1
sudo snap refresh microk8s --channel=1.31/stable
#or on a host which has microk8s
mkdir -p ~/.kube && microk8s config > ~/.kube/config
sudo usermod -a -G microk8s ubuntu
newgrp microk8s
microk8s status --wait-ready
microk8s kubectl get nodes
microk8s add-node
microk8s join 10.149.144.78:25000/db1eeacc37302589777dd1c6632c83c1/9779088ae6e6 --worker
#microk8s leave
#microk8s remove-node 10.22.254.79

Kine is an etcd shim developed by k3s.io (a layer between kube-apiserver and an RDBMS) that maps the etcd API onto lightweight relational databases such as SQLite, Postgres and MySQL/MariaDB, as well as the NATS message queue. It was designed to give any k8s (especially k3s) environment an alternative to etcd (etcd has almost no use cases outside k8s, iterates slowly, and keeps declining because nobody wants to maintain it). Kine stores everything in a single table, kine (id, name, created, deleted, create_revision, prev_revision, lease, value, old_value); every write inserts a new row for the created or updated k8s object, and the name column uses the same layout as etcd, /registry/<RESOURCE_TYPE>/<NAMESPACE>/<NAME> (e.g. /registry/serviceaccounts/kube-system/replication-controller; try: select count(id),name from kine group by name), to represent the objects in the cluster.

#https://www.cncf.io/wp-content/uploads/2020/11/MicroK8s-High-Availability-Alex-Chalkias.pdf
#start k8s-dqlite, dqlite is distributed sqlite3
/snap/microk8s/7449/bin/k8s-dqlite --storage-dir=/var/snap/microk8s/7449/var/kubernetes/backend/ --listen=unix:///var/snap/microk8s/7449/var/kubernetes/backend/kine.sock:12379
#To get a raw sqlite shell
sudo /snap/microk8s/current/bin/dqlite --cert /var/snap/microk8s/current/var/kubernetes/backend/cluster.crt \
  --key /var/snap/microk8s/current/var/kubernetes/backend/cluster.key --servers file:var/snap/microk8s/current/var/kubernetes/backend/cluster.yaml k8s
dqlite> SELECT name FROM sqlite_master WHERE type = 'table';
kine
sqlite_sequence
#print all current database keys and create a tarball that you can inspect
sudo microk8s dbctl --debug backup
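Once inside that dqlite shell, a couple of queries against the kine layout described earlier show how many rows the store holds and which keys accumulate the most revisions (a sketch using the column names listed above):
dqlite> SELECT count(*) FROM kine;
dqlite> SELECT name, count(id) AS revisions FROM kine GROUP BY name ORDER BY revisions DESC LIMIT 10;
dqlite> SELECT max(id) FROM kine;   -- the current highest revision id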

Use valgrind to check whether k8s-dqlite has a memory leak:

#https://blog.csdn.net/quqi99/article/details/140525441
sudo apt install valgrind -y
sudo systemctl stop snap.microk8s.daemon-k8s-dqlite.service
#sudo deluser $USER microk8s
sudo usermod -a -G microk8s $USER && id $USER
newgrp microk8s
sudo /usr/bin/valgrind --track-origins=yes --leak-check=full --show-reachable=yes --trace-children=yes --log-file=/home/ubuntu/valgrind_k8s-dqlite.log /snap/microk8s/current/bin/k8s-dqlite --storage-dir=/var/snap/microk8s/current/var/kubernetes/backend/ --listen=unix:///var/snap/microk8s/current/var/kubernetes/backend/kine.sock:12379

#but it reported lots of errors like the ones below; valgrind seems incompatible with the snap binary - AddressSanitizer is another option to try
==1421963== Conditional jump or move depends on uninitialised value(s)
==1421963==    at 0x456349: ??? (in /snap/microk8s/7449/bin/k8s-dqlite)
==1421963==    by 0x456E5E: ??? (in /snap/microk8s/7449/bin/k8s-dqlite)
==1421963==    by 0x470B4E: ??? (in /snap/microk8s/7449/bin/k8s-dqlite)
==1421963==    by 0x47531E: ??? (in /snap/microk8s/7449/bin/k8s-dqlite)
==1421963==    by 0x1FFF000287: ???
==1421963==    by 0x47531E: ??? (in /snap/microk8s/7449/bin/k8s-dqlite)
==1421963==    by 0x1954D5F: ???
==1421963==    by 0x2: ???
==1421963==    by 0x470964: ??? (in /snap/microk8s/7449/bin/k8s-dqlite)
==1421963==    by 0x4708EE: ??? (in /snap/microk8s/7449/bin/k8s-dqlite)
==1421963==    by 0x2: ???
==1421963==    by 0x1FFF000307: ???
==1421963==  Uninitialised value was created by a stack allocation
==1421963==    at 0x63A00F: ??? (in /snap/microk8s/7449/bin/k8s-dqlite)

==1489004== Invalid read of size 16
==1489004==    at 0x59BB97: ??? (in /snap/microk8s/7449/bin/k8s-dqlite)
==1489004==  Address 0xc0002a7400 is in a rw- anonymous segment

#these errors are actually expected (https://bytes.usc.edu/cs104/wiki/valgrind),
#we can ignore them as follows, auto-generating a suppression file via --gen-suppressions=all, or simply not care about them.
valgrind --gen-suppressions=all --error-limit=no --leak-check=full --show-leak-kinds=definite --show-reachable=yes --trace-children=yes --log-file=/home/ubuntu/valgrind_k8s-dqlite.log /snap/microk8s/current/bin/k8s-dqlite --storage-dir=/var/snap/microk8s/current/var/kubernetes/backend/ --listen=unix:///var/snap/microk8s/current/var/kubernetes/backend/kine.sock:12379
awk '/^{/,/^}/ {print; if ($0 ~ /^}/) print ""}' valgrind_k8s-dqlite.log > suppressions.supp
awk '{ if ($0 ~ /<insert_a_suppression_name_here>/) print "   suppression_" NR; else print $0 }' suppressions.supp > temp.supp && mv temp.supp suppressions.supp
valgrind --suppressions=suppressions.supp  --error-limit=no --leak-check=full --show-leak-kinds=definite --show-reachable=yes --trace-children=yes --log-file=/home/ubuntu/valgrind_k8s-dqlite.log /snap/microk8s/current/bin/k8s-dqlite --storage-dir=/var/snap/microk8s/current/var/kubernetes/backend/ --listen=unix:///var/snap/microk8s/current/var/kubernetes/backend/kine.sock:12379

However, with valgrind attached, dqlite keeps throwing errors like "ERRO[0076] failed to list /registry/flowschemas/ for revision 202251", 'kubectl get nodes' shows the node running under valgrind turning NotReady, and htop shows very high CPU usage; clearly the CPU overhead introduced by valgrind is what breaks dqlite and makes the node NotReady. Options like '--num-callers=20' can reduce valgrind's overhead a little, but that only treats the symptom.
So I switched to the much faster ASan instead, which had no such problem; the steps follow. One note: microk8s is a snap, and the snap is built with static linking, whereas ASan appears to work with dynamic linking; rebuilding k8s-dqlite with dynamic linking also runs fine (outside the snap, of course).

#unable to access 'https://salsa.debian.org/debian/libtirpc.git/': Failed to connect to salsa.debian.org port 443: Connection timed out
#even squid.internal can't access certain specific websites, so use lxd in my home instead
lxc launch focal focalsru --profile default
lxc exec focalsru bash
apt install build-essential cmake gcc libasan6 golang-go -y
#https://go.dev/doc/install
GO_VERSION=1.23.4
GO_ARCH=linux-amd64
curl -o go.tgz https://dl.google.com/go/go${GO_VERSION}.${GO_ARCH}.tar.gz
sudo rm -rf /usr/lib/go && sudo tar -C /usr/lib -xzf go.tgz
sudo chown -R $USER /usr/lib/go
export GOROOT=/usr/lib/go
export GOPATH=/bak/golang
export PATH=$GOROOT/bin:$GOPATH/bin:$PATH
go version

git clone https://github.com/canonical/k8s-dqlite
cd k8s-dqlite
#https://github.com/canonical/microk8s/blob/master/build-scripts/components/k8s-dqlite/version.sh
git checkout -b 1.3.0 v1.3.0
make clean
#snap uses static way, but asan uses dynamic way
#make static -j
patch -p1 < diff  #see the diff file below
make dynamic -j
$ ldd k8s-dqlite/bin/dynamic/k8s-dqlite |grep asan
        libasan.so.5 => /lib/x86_64-linux-gnu/libasan.so.5 (0x00007fb2b867f000)
lxc file pull focalsru/root/k8s-dqlite/bin k8s-dqlite --recursive
scp -o "ProxyCommand ssh hua@t440p -W %h:%p" -r k8s-dqlite ubuntu@bastion:/tmp
scp -i /home/ubuntu/.local/share/juju/ssh/juju_id_rsa -r /tmp/k8s-dqlite ubuntu@10.149.144.126:

sudo usermod -a -G microk8s $USER && id $USER
newgrp microk8s
export LD_LIBRARY_PATH=/home/ubuntu/k8s-dqlite/bin/dynamic/lib:$LD_LIBRARY_PATH  #use dynamic lib
#/snap/microk8s/current/bin/k8s-dqlite --storage-dir=/var/snap/microk8s/current/var/kubernetes/backend/ --listen=unix:///var/snap/microk8s/current/var/kubernetes/backend/kine.sock:12379
export ASAN_OPTIONS=halt_on_error=0:symbolize=1:detect_leaks=1:verbosity=1:log_path=asan.log
/home/ubuntu/k8s-dqlite/bin/dynamic/k8s-dqlite --storage-dir=/var/snap/microk8s/current/var/kubernetes/backend/ --listen=unix:///var/snap/microk8s/current/var/kubernetes/backend/kine.sock:12379
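With log_path set as above, the sanitizer writes its findings to asan.log.<pid> in the working directory instead of stderr; a quick way to keep an eye on it (sketch):
ls -l asan.log.*    #one file per k8s-dqlite process
tail -f asan.log.*  #leak/error reports appear here while the workload runs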

root@focalsru:~/k8s-dqlite# git diff
diff --git a/hack/dynamic-dqlite.sh b/hack/dynamic-dqlite.sh
index bca84e7..5a70ae2 100755
--- a/hack/dynamic-dqlite.sh
+++ b/hack/dynamic-dqlite.sh
@@ -84,6 +84,7 @@ if [ ! -f "${BUILD_DIR}/sqlite/libsqlite3.la" ]; then
 fi
 
 # build dqlite
+ASAN_OPTION="-fsanitize=address -fno-omit-frame-pointer -fsanitize-recover=all -g -O1"
 if [ ! -f "${BUILD_DIR}/dqlite/libdqlite.la" ]; then
   (
     cd "${BUILD_DIR}"
@@ -92,8 +93,8 @@ if [ ! -f "${BUILD_DIR}/dqlite/libdqlite.la" ]; then
     cd dqlite
     autoreconf -i > /dev/null
     ./configure --enable-build-raft \
-      CFLAGS="-I${BUILD_DIR}/sqlite -I${BUILD_DIR}/libuv/include -I${BUILD_DIR}/lz4/lib -Werror=implicit-function-declaration" \
-      LDFLAGS=" -L${BUILD_DIR}/libuv/.libs -L${BUILD_DIR}/lz4/lib -L${BUILD_DIR}/libnsl/src" \
+      CFLAGS="-I${BUILD_DIR}/sqlite -I${BUILD_DIR}/libuv/include -I${BUILD_DIR}/lz4/lib -Werror=implicit-function-declaration ${ASAN_OPTION}" \
+      LDFLAGS=" -L${BUILD_DIR}/libuv/.libs -L${BUILD_DIR}/lz4/lib -L${BUILD_DIR}/libnsl/src -fsanitize=address" \
       UV_CFLAGS="-I${BUILD_DIR}/libuv/include" \
       UV_LIBS="-L${BUILD_DIR}/libuv/.libs" \
       SQLITE_CFLAGS="-I${BUILD_DIR}/sqlite" \
@@ -101,7 +102,7 @@ if [ ! -f "${BUILD_DIR}/dqlite/libdqlite.la" ]; then
       LZ4_LIBS="-L${BUILD_DIR}/lz4/lib" \
       > /dev/null
 
-    make -j > /dev/null
+    make -j
   )
 fi
 
@@ -122,8 +123,8 @@ fi
   cp -r dqlite/include/* "${INSTALL_DIR}/include"
 )
 
-export CGO_CFLAGS="-I${INSTALL_DIR}/include"
-export CGO_LDFLAGS="-L${INSTALL_DIR}/lib -ldqlite -luv -llz4 -lsqlite3 -Wl,-z,stack-size=1048576"
+export CGO_CFLAGS="-I${INSTALL_DIR}/include ${ASAN_OPTION}"
+export CGO_LDFLAGS="-L${INSTALL_DIR}/lib -ldqlite -luv -llz4 -lsqlite3 -Wl,-z,stack-size=1048576 -fsanitize=address"
 export LD_LIBRARY_PATH="${INSTALL_DIR}/lib"
 
 echo "Libraries are in '${INSTALL_DIR}/lib'"
diff --git a/hack/dynamic-go-build.sh b/hack/dynamic-go-build.sh
index ab80887..21a70f3 100755
--- a/hack/dynamic-go-build.sh
+++ b/hack/dynamic-go-build.sh
@@ -4,7 +4,7 @@ DIR="$(realpath `dirname "${0}"`)"
 
 . "${DIR}/dynamic-dqlite.sh"
 
-go build \
+go build -asan \
   -tags libsqlite3 \
   -ldflags '-s -w -extldflags "-Wl,-rpath,$ORIGIN/lib -Wl,-rpath,$ORIGIN/../lib"' \
   "${@}"

Some other memory debugging approaches:

kubectl top pods -A --sort-by=memory --sum=true
kubectl describe node <master-node-name>

cd <sosreport>/proc then run:
$ for pid in [0-9]*; do
  if [ -f "$pid/status" ]; then
    mem=$(grep -i VmRSS "$pid/status" 2>/dev/null | awk '{print $2}')
    name=$(grep -i Name "$pid/status" 2>/dev/null | awk '{print $2}')
    echo "$mem KB $name (PID: $(basename $pid))"
  fi
done | sort -nr | head -2
766836 KB kubelite (PID: 1982)
575196 KB k8s-dqlite (PID: 779)

#top -p $(pgrep -f k8s-dqlite)
while true; do
    echo "$(date): $(ps -o pid,%mem,%cpu,cmd -p $(pgrep -f k8s-dqlite))" >> kubelite_memory.log
    sleep 60
done

kubelet resource protection: systemReserved is set to {memory: 1Gi}, evictionHard is set to {memory.available: "<500Mi"}

[Service]
# Maximum memory the service can use
MemoryMax=2G
# A warning threshold for memory usage
MemoryHigh=1500M
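On MicroK8s those kubelet reservations would go into the kubelet args file, and the [Service] limits above would live in a systemd drop-in for the dqlite daemon; a sketch assuming the stock snap layout (note a snap refresh may regenerate these files):
echo '--system-reserved=memory=1Gi' | sudo tee -a /var/snap/microk8s/current/args/kubelet
echo '--eviction-hard=memory.available<500Mi' | sudo tee -a /var/snap/microk8s/current/args/kubelet
sudo snap restart microk8s.daemon-kubelite
sudo systemctl edit snap.microk8s.daemon-k8s-dqlite.service   #paste the [Service] limits above into the drop-in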

But even running the following script did not trigger the memory leak.

vim create-update-configmap.sh
#!/bin/bash
COUNT=1000
kubectl create configmap test-config --from-literal=key=initial-value
if [ $? -ne 0 ]; then
  echo "Failed to create initial ConfigMap"
  exit 1
fi
for i in $(seq 1 $COUNT); do
  kubectl patch configmap test-config --patch '{"data": {"key": "value'$i'"}}'
  if [ $? -ne 0 ]; then
    echo "Failed to update ConfigMap"
    exit 1
  fi
  echo "Updated ConfigMap to value$i"
  sleep 1
done
echo "ConfigMap updated $COUNT times successfully!"

Current status: VmRSS keeps growing, yet ASan sees nothing wrong (could this be related to using the dynamic build rather than the static build inside the snap?). That still does not confirm a memory leak (a real leak means MemFree and MemAvailable both keep shrinking until OOM), because the effect of glibc's caching has to be ruled out first: replace glibc malloc with another allocator (e.g. gperftools' tcmalloc), and if the problem disappears after the swap, glibc's caching is the likely cause.

Early Linux malloc had only a single arena (the "stage" on which the allocator performs); every allocation had to lock the arena and release the lock afterwards, which caused heavy lock contention when many threads allocated and freed memory concurrently.
Later more arenas were added, which is ptmalloc2: there is one main arena (which can request virtual memory with brk/mmap) plus several non-main arenas (mmap only). When malloc is called, the thread first checks its thread-local variable for an existing arena; if there is one it tries to lock it, and on success allocates from it; if the lock fails (another thread is using it), it walks the arena list looking for an unlocked arena and allocates from that one.
glibc requests virtual memory in 64MB blocks and then carves them into smaller chunks as the application needs them.
MALLOC_ARENA_MAX tunes the number of arenas (a larger value reduces allocation contention, while in some cases a smaller value fits better, e.g. in memory-sensitive applications it reduces memory waste).
tcache is glibc's fast per-thread allocation cache; to disable it: export GLIBC_TUNABLES=glibc.malloc.tcache_count=0

GLIBC_TUNABLES=glibc.malloc.tcache_count=0 MALLOC_ARENA_MAX=1 /home/ubuntu/k8s-dqlite/bin/dynamic/k8s-dqlite --storage-dir=/var/snap/microk8s/current/var/kubernetes/backend/ --listen=unix:///var/snap/microk8s/current/var/kubernetes/backend/kine.sock:12379

If libtcmalloc is used instead, it in turn cannot be combined with the statically linked /snap/microk8s/7449/bin/k8s-dqlite.

#https://blog.csdn.net/uestcyms/article/details/142894411
sudo apt install google-perftools libgoogle-perftools-dev -y
google-pprof --version
LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libtcmalloc.so.4 HEAPPROFILE=/tmp/mem_leak /snap/microk8s/7449/bin/k8s-dqlite --storage-dir=/var/snap/microk8s/current/var/kubernetes/backend/ --listen=unix:///var/snap/microk8s/current/var/kubernetes/backend/kine.sock:12379
pprof --text /home/ubuntu/k8s-dqlite/bin/dynamic/k8s-dqlite /tmp/mem_leak.0001.heap  #the heap file name follows the HEAPPROFILE prefix above

I then tried a fully static ASan build, which failed again (next step: stay with the dynamic build and compile some of the dependent libraries with ASan as well).
C/C++ dynamic ASan build: install the libasan6 package; compile flags (-fsanitize=address -fno-omit-frame-pointer -fsanitize-recover=all -g -O1); link flags (-lasan is implied and need not be given explicitly, though in practice I still seem to need -fsanitize=address at link time to avoid errors)
C/C++ static ASan build: install the libgcc-$(gcc -dumpversion | cut -d. -f1)-dev package (dpkg -L libgcc-9-dev | grep libasan.a); compile flags (-fsanitize=address -fno-omit-frame-pointer -fsanitize-recover=all -g -O1); link flags (-static-libasan -static, where -static links everything statically, so every dependency, libc included, must ship a static library - all of them)
Go's ASan support: dynamic (go build -asan), static (-asan -ldflags="-static")

Other tools

sudo systemd-cgtop | grep microk8s
ps aux --sort -%mem
glibc mtrace - requires modifying the source - https://www.cnblogs.com/const-zpc/p/16364426.html
gperftools(tcmalloc | cpu profiler | heap profiler) - https://www.cnblogs.com/minglee/p/10124174.html

Previously I only built the dqlite inside k8s-dqlite with ASan; now I also build sqlite with it

# build sqlite3
ASAN_OPTION="-fsanitize=address -fno-omit-frame-pointer -fsanitize-recover=all -g -O1"
if [ ! -f "${BUILD_DIR}/sqlite/libsqlite3.la" ]; then
  (
    cd "${BUILD_DIR}"
    rm -rf sqlite
    git clone "${REPO_SQLITE}" --depth 1 --branch "${TAG_SQLITE}" > /dev/null
    cd sqlite
    export CFLAGS="${ASAN_OPTION}"
    export LDFLAGS="-fsanitize=address"
    ./configure --disable-readline \
      > /dev/null
    make libsqlite3.la -j > /dev/null
    unset CFLAGS
    unset LDFLAGS
  )
fi

As a result, the following memory leaks were reported while building sqlite3 itself:

#alternatively:
git clone "${REPO_SQLITE}" --depth 1 --branch "${TAG_SQLITE}" > /dev/null
cd sqlite
export CFLAGS="-fsanitize=address -fno-omit-frame-pointer -fsanitize-recover=all -g -O1"
export LDFLAGS="-fsanitize=address"
./configure --disable-readline
make libsqlite3.la -j
unset CFLAGS
unset LDFLAGS

#the error is as follows:
+ cd sqlite
+ export 'CFLAGS=-fsanitize=address -fno-omit-frame-pointer -fsanitize-recover=all -g -O1'
+ CFLAGS='-fsanitize=address -fno-omit-frame-pointer -fsanitize-recover=all -g -O1'
+ export LDFLAGS=-fsanitize=address
+ LDFLAGS=-fsanitize=address
+ ./configure --disable-readline
configure: WARNING: Can't find Tcl configuration definitions
configure: WARNING: *** Without Tcl the regression tests cannot be executed ***
configure: WARNING: *** Consider using --with-tcl=... to define location of Tcl ***
+ make libsqlite3.la -j
AddressSanitizer:DEADLYSIGNAL
AddressSanitizer:DEADLYSIGNAL
AddressSanitizer:DEADLYSIGNAL
AddressSanitizer:DEADLYSIGNAL

=================================================================
==387496==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 10240 byte(s) in 1 object(s) allocated from:
    #0 0x7342ea90da06 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cc:153
    #1 0x64f38cc68bbd in Symbol_insert /root/k8s-dqlite/hack/.build/dynamic/sqlite/tool/lemon.c:5519
    #2 0x64f38cc6a0c9 in Symbol_new /root/k8s-dqlite/hack/.build/dynamic/sqlite/tool/lemon.c:5421
    #3 0x64f38cc700b1 in parseonetoken /root/k8s-dqlite/hack/.build/dynamic/sqlite/tool/lemon.c:2620
    #4 0x64f38cc700b1 in Parse /root/k8s-dqlite/hack/.build/dynamic/sqlite/tool/lemon.c:3112
    #5 0x64f38cc75868 in main /root/k8s-dqlite/hack/.build/dynamic/sqlite/tool/lemon.c:1703
    #6 0x7342ea632082 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x24082)

Direct leak of 5120 byte(s) in 1 object(s) allocated from:
    #0 0x7342ea90da06 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cc:153
    #1 0x64f38cc68825 in Symbol_init /root/k8s-dqlite/hack/.build/dynamic/sqlite/tool/lemon.c:5482
    #2 0x64f38cc75729 in main /root/k8s-dqlite/hack/.build/dynamic/sqlite/tool/lemon.c:1692
    #3 0x7342ea632082 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x24082)

Direct leak of 2048 byte(s) in 1 object(s) allocated from:
    #0 0x7342ea90da06 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cc:153
    #1 0x64f38cc730b4 in Configtable_init /root/k8s-dqlite/hack/.build/dynamic/sqlite/tool/lemon.c:5821
    #2 0x64f38cc731b5 in Configlist_init /root/k8s-dqlite/hack/.build/dynamic/sqlite/tool/lemon.c:1331
    #3 0x64f38cc7508d in FindStates /root/k8s-dqlite/hack/.build/dynamic/sqlite/tool/lemon.c:908
    #4 0x64f38cc762bc in main /root/k8s-dqlite/hack/.build/dynamic/sqlite/tool/lemon.c:1755
    #5 0x7342ea632082 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x24082)

SUMMARY: AddressSanitizer: 17408 byte(s) leaked in 3 allocation(s).
make[1]: *** [Makefile:1140: parse.c] Error 1
make[1]: *** Waiting for unfinished jobs....
make[1]: *** [Makefile:1267: fts5parse.c] Segmentation fault (core dumped)
make: *** [Makefile:38: bin/dynamic/lib/libdqlite.so] Error 2

#lemon can also be compiled on its own (cc -fsanitize=address -o lemon tool/lemon.c), which by itself does not yet report a memory leak, but running './lemon ./test.y' then triggers the error
$ cat test.y 
// declarations
%token_type { int }
%token NUMBER.

// rules
expr ::= NUMBER(A). { printf("Number: %d\n", A); }
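Putting the repro together (the commands come from the note above; test.y is the grammar just shown):
cc -fsanitize=address -fno-omit-frame-pointer -g -O1 -o lemon tool/lemon.c
./lemon ./test.y   #LeakSanitizer reports the calloc allocations from Symbol_init/Symbol_insert on exit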

How to find the sqlite version corresponding to a microk8s version:

#The customer upgraded microk8s from v1.30.1 to v1.31.2 on Nov 29; the command below shows when the snap revision changed. Which version was in use before Nov 29, and does that version leak memory too?
#cat /var/lib/snapd/sequence/microk8s.json | jq '.sequence[] | {name: .name, channel: .channel, revision: .revision}' && ls -l /var/lib/snapd/snaps/ | grep microk8s
hua@minipc:/bak/work/canonical/microk8s$ git checkout 1.31
Already on '1.31'
hua@minipc:/bak/work/canonical/microk8s$ cat build-scripts/components/k8s-dqlite/version.sh 
#!/bin/bash
echo "master"
hua@minipc:/bak/work/canonical/microk8s$ git checkout 1.30
Switched to branch '1.30'
hua@minipc:/bak/work/canonical/microk8s$ cat build-scripts/components/k8s-dqlite/version.sh 
#!/bin/bash
echo "master"

#but both 1.30 and 1.31 seem to use sqlite 3.40.0
$ git show 1.30 |grep Date
Date:   Fri Apr 19 09:49:23 2024 +0300
$ git show 1.31 |grep Date
Date:   Wed Aug 14 14:25:31 2024 +0300
hua@minipc:/bak/work/canonical/k8s-dqlite$ git log --tags --simplify-by-decoration --before="Fri Apr 19 09:49:23 2024 +0300" --pretty=format:'%D' | grep 'tag:' |head -n1
tag: v1.1.9
hua@minipc:/bak/work/canonical/k8s-dqlite$ git log --tags --simplify-by-decoration --before="Wed Aug 14 14:25:31 2024 +0300" --pretty=format:'%D' | grep 'tag:' |head -n1
tag: v1.1.10

hua@minipc:/bak/work/canonical/k8s-dqlite$ git checkout -b 1.1.9 v1.1.9
Switched to a new branch '1.1.9'
hua@minipc:/bak/work/canonical/k8s-dqlite$ grep -E 'REPO_SQLITE|TAG_SQLITE' ./hack/env.sh
REPO_SQLITE="https://github.com/sqlite/sqlite.git"
TAG_SQLITE="version-3.40.0"
hua@minipc:/bak/work/canonical/k8s-dqlite$ git checkout -b 1.1.10 v1.1.10
Switched to a new branch '1.1.10'
hua@minipc:/bak/work/canonical/k8s-dqlite$ grep -E 'REPO_SQLITE|TAG_SQLITE' ./hack/env.sh
REPO_SQLITE="https://github.com/sqlite/sqlite.git"
TAG_SQLITE="version-3.40.0"

The root cause turned out to be -
https://github.com/canonical/k8s-dqlite/pull/240
https://github.com/canonical/k8s-dqlite/issues/36
https://github.com/canonical/k8s-dqlite/issues/196
