Author: Zhang Hua   Published: 2023-10-15
Copyright: feel free to repost, but please include a hyperlink to the original source together with the author info and this copyright notice (http://blog.csdn.net/quqi99)
What is sunbeam
sunbeam is a tool for deploying OpenStack. It uses juju to define two clouds (microk8s and sunbeam): microk8s is used to deploy the OpenStack control services (in the openstack model), and sunbeam is used to deploy sunbeam-controller (in the admin/controller model) - see the quick check after this list:
- the OpenStack control plane is deployed inside microk8s
- the OpenStack data plane (ovn and nova-compute) is deployed as a snap (openstack-hypervisor)
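A quick way to confirm this layout on a bootstrapped node (a sketch, not part of the original steps; the controller name sunbeam-controller appears in the "Some Info" section below):
juju clouds --all
juju models -c sunbeam-controller    # expect admin/controller (manual) and openstack (on microk8s)
snap list openstack-hypervisor       # the data-plane snap on the hypervisor node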
Deploy sunbeam
Approach:
- when resetting the environment, microk8s should also be removed and then reinstalled by sunbeam
- once the microk8s snap is seen installed, run the featured-network workaround below to download the images
# Backup images, remember to use 'microk8s.ctr' instead of 'ctr', and remove 'ctr' by 'sudo apt purge containerd docker.io -y'
microk8s.ctr --namespace k8s.io images ls -q |grep -v sha256 |xargs -I {} sh -c 'fname=$(echo "{}" | tr -s "/" "_"); microk8s.ctr --namespace k8s.io image export ${fname}.tar "{}"'
ls *.tar | xargs -I {} sh -c 'fname=$(echo "{}" | sed "s/_/\//g" | sed "s/.tar$//g");echo $fname; microk8s.ctr --namespace k8s.io image import {} ${fname}'
# or use microk8s.registry
#sudo microk8s.enable registry
#ls *.tar | xargs -I {} sh -c 'fname=$(echo "{}" | sed "s/_/\//g" | sed "s/.tar$//g"); sudo docker load -i {}; sudo docker tag ${fname} localhost:32000/${fname}; sudo docker push localhost:32000/${fname}'
microk8s.ctr --namespace k8s.io image ls
sudo sed -i "s#registry.k8s.io#registry.cn-hangzhou.aliyuncs.com/google_containers#g" /var/snap/microk8s/current/args/containerd-template.toml
sudo systemctl restart snap.microk8s.daemon-containerd.service
# sudo microk8s.stop && sudo microk8s.start  # restarting microk8s is not needed; restarting snap.microk8s.daemon-containerd above is enough
cat << EOF | sudo tee -a /var/snap/microk8s/current/args/containerd-env
# should use http type
HTTP_PROXY=http://192.168.99.186:3129
HTTPS_PROXY=http://192.168.99.186:3129
NO_PROXY=10.0.0.0/8,127.0.0.0/16,192.168.0.0/16
EOF
- then run "sudo microk8s.stop && sudo microk8s.start"
- if this doesn't work, install sunbeam on a machine abroad first, then use the commands above to back up the images and import them here
juju add-model sunbeam && juju add-machine --series jammy --constraints "root-disk=100G mem=32G cores=8"
juju ssh 0
# when hitting all kinds of strange problems, reset the environment first
sudo snap remove --purge openstack
sudo snap remove --purge juju
sudo snap remove --purge juju-db
sudo snap remove --purge kubectl
sudo /usr/sbin/remove-juju-services
sudo rm -rf /var/lib/juju
rm -rf ~/.local/share/juju
rm -rf ~/snap/juju/
rm -rf ~/snap/openstack
rm -rf ~/snap/openstack-hypervisor
rm -rf ~/snap/microstack/
rm -rf ~/snap/microk8s/
sudo snap remove --purge vault
sudo snap remove --purge microk8s
sudo snap remove --purge openstack-hypervisor
rm -rf /home/hua/.local/share/openstack/deployments.yaml
sudo init 6 # better to reboot, otherwise some calico interfaces and network namespaces cannot be removed
# with 2023.2/candidate you will hit: Please run `sunbeam configure` first, so use 2023.1/stable
sudo snap install openstack --channel 2023.1/stable
python3 -c "import socket; print(socket.getfqdn())"
sunbeam prepare-node-script | bash -x
sudo usermod -a -G snap_daemon $USER && newgrp snap_daemon
#ERROR failed to bootstrap model: machine is already provisioned
sudo remove-juju-services
# if it hangs at 'Bootstrapping juju into machine' with no response, 'journalctl -f' shows it is an ssh problem
# the ssh key has a passphrase; configure passwordless login, and also configure 'NOPASSWD:ALL'
lxc network set lxdbr0 ipv4.address=192.168.9.1/24
lxc network set lxdbr0 ipv6.address none
lxc network set lxdbr0 ipv4.dhcp=true
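A quick check (an addition, assuming the default lxdbr0 bridge) that the settings above took effect:
lxc network show lxdbr0 | grep -E 'ipv4.address|ipv6.address|ipv4.dhcp'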
sunbeam cluster bootstrap --accept-defaults
mkdir -p ~/.kube && sudo chown -R $USER ~/.kube
sudo usermod -a -G snap_microk8s $USER && newgrp snap_microk8s
microk8s.kubectl get pods --all-namespaces
microk8s.ctr --namespace k8s.io image ls
#registry.k8s.io, docker.io, registry.jujucharms.com, quay.io
##echo 'HTTPS_PROXY=http://192.168.99.186:9311' |sudo tee -a /var/snap/microk8s/current/args/containerd-env
microk8s.ctr --namespace k8s.io containers ls
alias kubectl='sudo /snap/bin/microk8s.kubectl'
source <(kubectl completion bash) && kubectl completion bash |sudo tee /etc/bash_completion.d/kubectl
sunbeam cluster list
#Unable to complete operation for new subnet. The number of DNS nameservers exceeds the limit 5.
#The workaround is to modify /run/systemd/resolve/resolv.conf, but don't restart systemd-resolved, according to
#https://github.com/openstack-snaps/snap-openstack/commit/7b7ca702efb490f13624002093e1b0b4cefe3aab
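A minimal sketch of that workaround (assuming more than 5 nameserver lines are present); trim the file by hand and do NOT restart systemd-resolved:
grep -c '^nameserver' /run/systemd/resolve/resolv.conf   # sunbeam requires <= 5
sudo vim /run/systemd/resolve/resolv.conf                # delete the extra nameserver lines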
sunbeam configure --accept-defaults --openrc demo-openrc
sunbeam openrc > admin-openrc
source admin-openrc
sunbeam launch ubuntu --name test
sudo journalctl -u snap.openstack.clusterd.service -f
source <(openstack complete)
openstack complete |sudo tee /etc/bash_completion.d/openstack
openstack hypervisor list
sudo snap get openstack-hypervisor node
sudo snap logs openstack-hypervisor.hypervisor-config-service
sudo snap logs openstack-hypervisor.ovn-controller
#juju switch openstack && juju ssh ovn-central/0
sudo microk8s.kubectl -n openstack exec -it ovn-central-0 -- bash
sudo microk8s.kubectl -n openstack exec -it ovn-central-0 -c ovn-northd -- ovn-sbctl --db=ssl:ovn-central-0.ovn-central-endpoints.openstack.svc.cluster.local:16642 -c /etc/ovn/cert_host -C /etc/ovn/ovn-central.crt -p /etc/ovn/key_host list
cat /var/snap/openstack-hypervisor/common/etc/nova/nova.conf
Featured deployment
The deployment above assumed a normal network; on a featured (restricted) network, more hacks may be needed.
1, First, if the OS is not fresh, the env needs to be reset
sudo snap remove --purge microk8s
sudo snap remove --purge juju
sudo snap remove --purge openstack
sudo snap remove --purge openstack-hypervisor
sudo /usr/sbin/remove-juju-services
sudo rm -rf /var/lib/juju
rm -rf ~/.local/share/juju
rm -rf ~/snap/openstack
rm -rf ~/snap/openstack-hypervisor
rm -rf ~/snap/microstack/
rm -rf ~/snap/juju/
rm -rf ~/snap/microk8s/
sudo init 6 # better to reboot, otherwise some calico interfaces and network namespaces cannot be removed
2, Since my OS has an ssh key protected by a passphrase, one extra step is needed to use the passwordless .local/share/juju/ssh/juju_id_rsa
#because my default ssh key has password, so need one extra step to avoid: Timeout before authentication for 192.168.99.179 port 56142
cat .local/share/juju/ssh/juju_id_rsa.pub |sudo tee -a ~/.ssh/authorized_keys
ssh hua@minipc.lan -i .local/share/juju/ssh/juju_id_rsa
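A non-interactive check (an addition) that the passwordless juju key really works, which is what avoids the authentication timeout above:
ssh -o BatchMode=yes -i .local/share/juju/ssh/juju_id_rsa hua@minipc.lan true && echo ok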
3, DNS must use the long (FQDN) form:
echo '192.168.99.179 minipc.lan minipc' |sudo tee -a /etc/hosts
python3 -c "import socket; print(socket.getfqdn())"
4, Logs can be inspected with: sudo journalctl -f
5, A featured-network environment must be able to download images correctly
echo 'HTTP_PROXY=http://192.168.99.186:9311' |sudo tee -a /var/snap/microk8s/current/args/containerd-env
echo 'HTTPS_PROXY=http://192.168.99.186:9311' |sudo tee -a /var/snap/microk8s/current/args/containerd-env
echo 'NO_PROXY=10.0.0.0/8,192.168.0.0/16,127.0.0.1,172.16.0.0/12' |sudo tee -a /var/snap/microk8s/current/args/containerd-env
sudo snap restart microk8s
microk8s.kubectl get pods --all-namespaces
microk8s.ctr --namespace k8s.io image ls
How to debug snap
A small patch was produced for the LP bug (https://bugs.launchpad.net/snap-openstack/+bug/2039403)
diff --git a/sunbeam-python/sunbeam/utils.py b/sunbeam-python/sunbeam/utils.py
index 542c1c1..cd02ee2 100644
--- a/sunbeam-python/sunbeam/utils.py
+++ b/sunbeam-python/sunbeam/utils.py
@@ -242,7 +242,7 @@ def get_free_nic() -> str:
return nic
-def get_nameservers(ipv4_only=True) -> List[str]:
+def get_nameservers(ipv4_only=True, max_count=5) -> List[str]:
"""Return a list of nameservers used by the host."""
resolve_config = Path("/run/systemd/resolve/resolv.conf")
nameservers = []
@@ -258,7 +258,7 @@ def get_nameservers(ipv4_only=True) -> List[str]:
nameservers = list(set(nameservers))
except FileNotFoundError:
nameservers = []
- return nameservers
+ return nameservers[:max_count]
The file lives at /snap/openstack/274/lib/python3.10/site-packages/sunbeam/utils.py, but how can the diff be applied for quick debugging in production? The snap filesystem is read-only, which makes this painful. The steps below don't work either, because after 'sudo snap remove openstack' the sunbeam command is gone too, so 'sunbeam configure --accept-defaults --openrc demo-openrc' can no longer be run.
cd snap-openstack/sunbeam-python
tox -epy3
#In order to debug read-only file /snap/openstack/274/lib/python3.10/site-packages/sunbeam/utils.py
sudo unsquashfs -d squashfs-root /var/lib/snapd/snaps/openstack_*.snap
sudo snap try ./squashfs-root/ --devmode
cd ./squashfs-root/lib/python3.10/site-packages
sudo patch -p1 < ./diff
cd ~ && sudo vim ./squashfs-root/lib/python3.10/site-packages/sunbeam/utils.py
import rpdb;rpdb.set_trace()
sudo ./squashfs-root/bin/python3 -m pip install rpdb
sudo systemctl restart snap.openstack.clusterd.service
nc 127.0.0.1 4444
sunbeam configure --accept-defaults --openrc demo-openrc #trigger it, but there is no sunbeam now
Rebuilding the snap also requires removing the installed snap first, so it can't be used for debugging either
git clone https://github.com/openstack-snaps/snap-openstack.git
cd snap-openstack/
patch -p1 < diff
sudo apt install build-essential -y
sudo snap install --classic snapcraft
#snapcraft clean
sudo snapcraft
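If the rebuild succeeds, the locally built snap would then be installed the standard way (a sketch; the exact filename depends on the build output):
sudo snap install ./openstack_*.snap --dangerous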
The 'mount --bind' method below does work, but since only a single file is mounted, only logging can be used:
cp /snap/openstack/274/lib/python3.10/site-packages/sunbeam/utils.py .
sudo mount --bind utils.py /snap/openstack/274/lib/python3.10/site-packages/sunbeam/utils.py
mount | grep "utils.py"
#now we can modify /snap/openstack/274/lib/python3.10/site-packages/sunbeam/utils.py (NOTE: it's not ./utils.py)
LOG.warn("quqi {}".format(nameservers[:max_count]))
#python needs to be restarted if they are running in the daemon
sudo systemctl restart snap.openstack.clusterd.service
sunbeam configure --accept-defaults --openrc demo-openrc #trigger it
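When done, remove the bind mount to restore the pristine snap file (a cleanup step, an addition):
sudo umount /snap/openstack/274/lib/python3.10/site-packages/sunbeam/utils.py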
I also wanted to 'mount --bind' a whole directory and then install the rpdb module, but it didn't work (to be confirmed when there is a chance):
mkdir -p ~/snap_write/ && cp -r /snap/openstack/274 ~/snap_write/
sudo mount --bind ~/snap_write/274 /snap/openstack/274
mount |grep /snap/openstack/274
vim /snap/openstack/274/lib/python3.10/site-packages/sunbeam/utils.py
LOG.warn("quqi {}".format(nameservers[:max_count]))
import rpdb;rpdb.set_trace()
/snap/openstack/274/bin/pip install rpdb
#python needs to be restarted if they are running in the daemon
sudo systemctl restart snap.openstack.clusterd.service
sunbeam configure --accept-defaults --openrc demo-openrc #trigger it
nc 127.0.0.1 4444
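Likewise, the directory bind mount should be cleaned up afterwards (an addition):
sudo umount /snap/openstack/274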
Some Info
juju ssh -m admin/controller 0
ubuntu@juju-5d90c3-sunbeam-0:~$ juju clouds |tail -n2
Only clouds with registered credentials are shown.
There are more clouds, use --all to see them.
microk8s 1 localhost k8s 0 built-in A Kubernetes Cluster
sunbeam 1 default manual 0 local
ubuntu@juju-5d90c3-sunbeam-0:~$ juju controllers |tail -n1
sunbeam-controller* admin/controller juju-5d90c3-sunbeam-0.cloud.sts superuser 2 - - 3.2.0
ubuntu@juju-5d90c3-sunbeam-0:~$ juju models |tail -n3
Model Cloud/Region Type Status Machines Cores Units Access Last connection
admin/controller* sunbeam/default manual available 1 8 4 admin just now
openstack sunbeam-microk8s/localhost kubernetes available 0 - 24 admin 1 minute ago
ubuntu@juju-5d90c3-sunbeam-0:~$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
metallb-system speaker-2rspk 1/1 Running 0 108m
kube-system coredns-6f5f9b5d74-ctc9d 1/1 Running 0 109m
kube-system calico-node-m74rh 1/1 Running 0 107m
metallb-system controller-9556c586f-kqslx 1/1 Running 0 108m
kube-system calico-kube-controllers-7457875fc6-xdst9 1/1 Running 0 106m
openstack modeloperator-7f5fcd7474-w2f5p 1/1 Running 0 105m
openstack cinder-ceph-mysql-router-0 2/2 Running 0 105m
openstack ovn-relay-0 2/2 Running 0 105m
openstack certificate-authority-0 1/1 Running 0 104m
openstack horizon-mysql-router-0 2/2 Running 1 (101m ago) 105m
openstack horizon-0 2/2 Running 0 105m
openstack keystone-mysql-router-0 2/2 Running 0 104m
openstack cinder-ceph-0 2/2 Running 0 105m
openstack rabbitmq-0 2/2 Running 0 105m
openstack placement-0 2/2 Running 0 104m
openstack neutron-0 2/2 Running 0 104m
openstack keystone-0 2/2 Running 0 105m
openstack glance-0 2/2 Running 1 (91m ago) 104m
openstack traefik-0 2/2 Running 0 105m
openstack cinder-mysql-router-0 2/2 Running 2 (41m ago) 105m
openstack neutron-mysql-router-0 2/2 Running 2 (35m ago) 104m
openstack nova-api-mysql-router-0 2/2 Running 2 (10m ago) 104m
openstack cinder-0 3/3 Running 1 (8m43s ago) 104m
kube-system hostpath-provisioner-69cd9ff5b8-kdjpp 1/1 Running 5 (7m22s ago) 108m
openstack nova-mysql-router-0 2/2 Running 3 (7m19s ago) 105m
openstack nova-0 4/4 Running 2 (7m19s ago) 103m
openstack glance-mysql-router-0 2/2 Running 1 (7m19s ago) 104m
openstack ovn-central-0 4/4 Running 2 (5m51s ago) 103m
openstack nova-cell-mysql-router-0 2/2 Running 1 (4m38s ago) 105m
openstack mysql-0 2/2 Running 1 (3m21s ago) 104m
openstack placement-mysql-router-0 2/2 Running 3 (7m19s ago) 104m
ubuntu@juju-5d90c3-sunbeam-0:~$ juju switch admin/controller
sunbeam-controller:juju-5d90c3-sunbeam-0.cloud.sts/openstack -> sunbeam-controller:admin/controller
ubuntu@juju-5d90c3-sunbeam-0:~$ juju status
Model Controller Cloud/Region Version SLA Timestamp Notes
controller sunbeam-controller sunbeam/default 3.2.0 unsupported 03:50:49Z upgrade available: 3.2.3
SAAS Status Store URL
certificate-authority active local juju-5d90c3-sunbeam-0.cloud.sts/openstack.certificate-authority
keystone waiting local juju-5d90c3-sunbeam-0.cloud.sts/openstack.keystone
ovn-relay active local juju-5d90c3-sunbeam-0.cloud.sts/openstack.ovn-relay
rabbitmq active local juju-5d90c3-sunbeam-0.cloud.sts/openstack.rabbitmq
App Version Status Scale Charm Channel Rev Exposed Message
controller active 1 juju-controller 3.2/stable 14 no
microceph unknown 0 microceph edge 9 no
microk8s active 1 microk8s legacy/stable 121 no
openstack-hypervisor active 1 openstack-hypervisor 2023.1/stable 105 no
sunbeam-machine active 1 sunbeam-machine latest/edge 1 no
Unit Workload Agent Machine Public address Ports Message
controller/0* active idle 0 10.5.1.11
microk8s/0* active idle 0 10.5.1.11 16443/tcp
openstack-hypervisor/0* active idle 0 10.5.1.11
sunbeam-machine/0* active idle 0 10.5.1.11
Machine State Address Inst id Base AZ Message
0 started 10.5.1.11 manual: ubuntu@22.04 Manually provisioned machine
Offer Application Charm Rev Connected Endpoint Interface Role
microceph microceph microceph 9 0/0 ceph ceph-client provider
ubuntu@juju-5d90c3-sunbeam-0:~$ juju switch openstack
sunbeam-controller:admin/controller -> sunbeam-controller:juju-5d90c3-sunbeam-0.cloud.sts/openstack
ubuntu@juju-5d90c3-sunbeam-0:~$ juju status
Model Controller Cloud/Region Version SLA Timestamp
openstack sunbeam-controller sunbeam-microk8s/localhost 3.2.0 unsupported 03:56:29Z
App Version Status Scale Charm Channel Rev Address Exposed Message
certificate-authority active 1 tls-certificates-operator latest/stable 22 10.152.183.253 no
cinder waiting 1 cinder-k8s 2023.1/stable 47 10.152.183.47 no installing agent
cinder-ceph waiting 1 cinder-ceph-k8s 2023.1/stable 38 10.152.183.65 no installing agent
cinder-ceph-mysql-router 8.0.34-0ubuntu0.22.04.1 active 1 mysql-router-k8s 8.0/candidate 64 10.152.183.165 no
cinder-mysql-router 8.0.34-0ubuntu0.22.04.1 active 1 mysql-router-k8s 8.0/candidate 64 10.152.183.124 no
glance active 1 glance-k8s 2023.1/stable 59 10.152.183.202 no
glance-mysql-router 8.0.34-0ubuntu0.22.04.1 active 1 mysql-router-k8s 8.0/candidate 64 10.152.183.77 no
horizon active 1 horizon-k8s 2023.1/stable 56 10.152.183.234 no http://10.20.21.10/openstack-horizon
horizon-mysql-router 8.0.34-0ubuntu0.22.04.1 active 1 mysql-router-k8s 8.0/candidate 64 10.152.183.218 no
keystone waiting 1 keystone-k8s 2023.1/stable 125 10.152.183.123 no installing agent
keystone-mysql-router 8.0.34-0ubuntu0.22.04.1 active 1 mysql-router-k8s 8.0/candidate 64 10.152.183.78 no
mysql 8.0.34-0ubuntu0.22.04.1 active 1 mysql-k8s 8.0/candidate 99 10.152.183.183 no
neutron waiting 1 neutron-k8s 2023.1/stable 53 10.152.183.187 no installing agent
neutron-mysql-router 8.0.34-0ubuntu0.22.04.1 active 1 mysql-router-k8s 8.0/candidate 64 10.152.183.45 no
nova waiting 1 nova-k8s 2023.1/stable 48 10.152.183.59 no installing agent
nova-api-mysql-router 8.0.34-0ubuntu0.22.04.1 active 1 mysql-router-k8s 8.0/candidate 64 10.152.183.46 no
nova-cell-mysql-router 8.0.34-0ubuntu0.22.04.1 active 1 mysql-router-k8s 8.0/candidate 64 10.152.183.194 no
nova-mysql-router 8.0.34-0ubuntu0.22.04.1 active 1 mysql-router-k8s 8.0/candidate 64 10.152.183.110 no
ovn-central active 1 ovn-central-k8s 23.03/stable 61 10.152.183.195 no
ovn-relay active 1 ovn-relay-k8s 23.03/stable 49 10.20.21.11 no
placement active 1 placement-k8s 2023.1/stable 43 10.152.183.90 no
placement-mysql-router 8.0.34-0ubuntu0.22.04.1 active 1 mysql-router-k8s 8.0/candidate 64 10.152.183.210 no
rabbitmq 3.9.13 active 1 rabbitmq-k8s 3.9/stable 30 10.20.21.12 no
traefik 2.10.4 maintenance 1 traefik-k8s 1.0/candidate 148 10.20.21.10 no updating ingress configuration for 'ingress:48'
Unit Workload Agent Address Ports Message
certificate-authority/0* active idle 10.1.105.20
cinder-ceph-mysql-router/0* active idle 10.1.105.9
cinder-ceph/0* blocked idle 10.1.105.12 (ceph) integration missing
cinder-mysql-router/0* active idle 10.1.105.7
cinder/0* waiting idle 10.1.105.30 (workload) Not all relations are ready
glance-mysql-router/0* active idle 10.1.105.19
glance/0* active idle 10.1.105.35
horizon-mysql-router/0* active idle 10.1.105.11
horizon/0* active idle 10.1.105.13
keystone-mysql-router/0* active idle 10.1.105.25
keystone/0* waiting idle 10.1.105.22 (workload) Not all relations are ready
mysql/0* active idle 10.1.105.36 Primary
neutron-mysql-router/0* active idle 10.1.105.26
neutron/0* waiting idle 10.1.105.29 (workload) Not all relations are ready
nova-api-mysql-router/0* active idle 10.1.105.21
nova-cell-mysql-router/0* active idle 10.1.105.18
nova-mysql-router/0* active idle 10.1.105.8
nova/0* waiting idle 10.1.105.31 (workload) Not all relations are ready
ovn-central/0* active idle 10.1.105.37
ovn-relay/0* active idle 10.1.105.10
placement-mysql-router/0* active idle 10.1.105.28
placement/0* active idle 10.1.105.27
rabbitmq/0* active idle 10.1.105.23
traefik/0* maintenance idle 10.1.105.24 updating ingress configuration for 'ingress:48'
Offer Application Charm Rev Connected Endpoint Interface Role
certificate-authority certificate-authority tls-certificates-operator 22 1/1 certificates tls-certificates provider
keystone keystone keystone-k8s 125 1/1 identity-credentials keystone-credentials provider
ovn-relay ovn-relay ovn-relay-k8s 49 1/1 ovsdb-cms-relay ovsdb-cms provider
rabbitmq rabbitmq rabbitmq-k8s 30 1/1 amqp rabbitmq provider
20231128 update - try microk8s
Normally, install it like this:
sudo update-alternatives --install "$(which editor)" editor "$(which vim)" 15
sudo update-alternatives --config editor
# only for snap downloads; it turned out to be useless
#sudo systemctl edit --full snapd.service
#[Service]
#Environment="HTTP_PROXY=http://192.168.99.186:9311"
#Environment="HTTPS_PROXY=http://192.168.99.186:9311"
#Environment="NO_PROXY=localhost,127.0.0.1,192.168.0.0/24,10.0.0.0/8,172.16.0.0/16,*.lan"
#sudo systemctl restart snapd.service
#sudo snap set system proxy.http="http://localhost:8081"
#sudo snap set system proxy.https="http://localhost:8081"
snap info microk8s
sudo snap install microk8s --classic
mkdir -p ~/.kube && sudo chown -R $USER ~/.kube
sudo usermod -a -G microk8s $USER && newgrp microk8s #switch to the group microk8s
sudo journalctl -f -u snap.microk8s.daemon-kubelite
microk8s.kubectl get pods --all-namespaces
microk8s.ctr --namespace k8s.io image ls
microk8s.ctr --namespace k8s.io containers ls
alias kubectl='sudo /snap/bin/microk8s.kubectl'
source <(kubectl completion bash) && kubectl completion bash |sudo tee /etc/bash_completion.d/kubectl
kubectl describe pod -n kube-system coredns-864597b5fd-pj27h
But on a featured network this is bound to fail; both of the attempts below, modifying containerd-env and modifying containerd-template.toml, failed
curl -x http://192.168.99.186:3129 https://registry.k8s.io
openssl s_client -connect registry.k8s.io:443
sudo HTTP_PROXY=http://192.168.99.186:3129 HTTPS_PROXY=http://192.168.99.186:3129 ctr --namespace=k8s.io images pull registry.k8s.io/pause:3.7
sudo ctr --namespace=k8s.io images pull registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.7
#echo -e 'HTTP_PROXY=http://192.168.99.186:3129\nHTTPS_PROXY=http://192.168.99.186:3129' |tee -a /var/snap/microk8s/current/args/containerd-env
sudo sed -i "s#registry.k8s.io#registry.cn-hangzhou.aliyuncs.com/google_containers#g" /var/snap/microk8s/current/args/containerd-template.toml
sudo systemctl restart snap.microk8s.daemon-containerd.service
# sudo microk8s.stop && sudo microk8s.start  # restarting microk8s is not needed; restarting snap.microk8s.daemon-containerd above is enough
Tried downloading all the images directly; that failed too:
kubectl get pods --all-namespaces -o=jsonpath='{range .items[*]}{.metadata.namespace}:{.metadata.name}{"\t"}{range .spec.containers[*]}{.image}{"\n"}{end}{end}'
sudo HTTP_PROXY=http://192.168.99.186:3129 HTTPS_PROXY=http://192.168.99.186:3129 ctr --namespace=k8s.io images pull registry.k8s.io/pause:3.7
sudo HTTP_PROXY=http://192.168.99.186:3129 HTTPS_PROXY=http://192.168.99.186:3129 ctr --namespace=k8s.io images pull docker.io/calico/kube-controllers:v3.25.1
sudo HTTP_PROXY=http://192.168.99.186:3129 HTTPS_PROXY=http://192.168.99.186:3129 ctr --namespace=k8s.io images pull docker.io/calico/node:v3.25.1
sudo microk8s.ctr --namespace=k8s.io images pull registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:1.10.1
Even L3-layer tools failed.
So switch to the method below to download all images:
git clone https://github.com/ubuntu/microk8s.git
cd microk8s
# know it's 1.28 according to 'snap info microk8s'
git checkout -b 1.28 v1.28
# grep -ir 'image:' * | awk '{print $3 $4}'
# reference - https://soulteary.com/2019/09/08/build-your-k8s-environment-with-microk8s.html
images=(
nginx:latest
rocks.canonical.com/cdk/diverdane/nginxdualstack:1.0.0
nginx:1.14.2
cdkbot/microbot-amd64
docker.io/calico/cni:v3.23.4
docker.io/calico/cni:v3.23.4
docker.io/calico/pod2daemon-flexvol:v3.23.4
docker.io/calico/node:v3.23.4
docker.io/calico/kube-controllers:v3.23.4
docker.io/calico/cni:v3.21.1
docker.io/calico/cni:v3.21.1
docker.io/calico/pod2daemon-flexvol:v3.21.1
docker.io/calico/node:v3.21.1
docker.io/calico/kube-controllers:v3.17.3
docker.io/calico/cni:v3.25.1
docker.io/calico/cni:v3.25.1
docker.io/calico/node:v3.25.1
docker.io/calico/kube-controllers:v3.25.1
)
for image in ${images[@]};do
sudo HTTP_PROXY=http://192.168.99.186:3129 HTTPS_PROXY=http://192.168.99.186:3129 ctr --namespace k8s.io images pull $image
done
# remember to handle pause as well
#sudo HTTP_PROXY=http://192.168.99.186:3129 HTTPS_PROXY=http://192.168.99.186:3129 ctr --namespace k8s.io images pull registry.k8s.io/pause:3.7
sudo sed -i "s#registry.k8s.io#registry.cn-hangzhou.aliyuncs.com/google_containers#g" /var/snap/microk8s/current/args/containerd-template.toml
sudo systemctl restart snap.microk8s.daemon-containerd.service
microk8s.kubectl get pods --all-namespaces -A
sudo usermod -a -G microk8s hua && sudo chown -R hua ~/.kube
newgrp microk8s
microk8s.inspect && microk8s status
microk8s.kubectl get pods --all-namespaces -A
# a handy way to see what went wrong,
microk8s.kubectl describe pod --all-namespaces > tmp/tmp #error reason
grep -r failed tmp/tmp |tail -n1
# then restart microk8s (microk8s stop && microk8s start)
# next, 'kubectl get pods --all-namespaces -A' showed one calico-node failing to start because a landscape service occupied port 9099
kubectl get pods --all-namespaces -A
These images can be backed up and re-imported with the commands below:
# Note: when importing images remember to use microk8s.ctr instead of ctr, otherwise "microk8s.ctr --namespace k8s.io image ls" won't show them
sudo ctr --namespace k8s.io images ls -q |grep -v sha256 |xargs -I {} sh -c 'fname=$(echo "{}" | tr -s "/" "_"); sudo ctr --namespace k8s.io image export ${fname}.tar "{}"'
ls *.tar | xargs -I {} sh -c 'fname=$(echo "{}" | sed "s/_/\//g" | sed "s/.tar$//g");echo $fname; microk8s.ctr --namespace k8s.io image import {} ${fname}'
microk8s.ctr --namespace k8s.io image ls
sudo microk8s.stop && sudo microk8s.start
sudo sed -i "s#registry.k8s.io#registry.cn-hangzhou.aliyuncs.com/google_containers#g" /var/snap/microk8s/current/args/containerd-template.toml
sudo systemctl restart snap.microk8s.daemon-containerd.service
# sunbeam-specific images
sudo HTTP_PROXY=http://192.168.99.186:3129 HTTPS_PROXY=http://192.168.99.186:3129 ctr --namespace k8s.io images pull quay.io/metallb/speaker:v0.13.3
kubectl delete -n metallb-system pod speaker-52bqz --grace-period=0 --force
Continue to deploy cos:
# https://charmhub.io/topics/canonical-observability-stack/tutorials/install-microk8s
juju add-model cos sunbeam
juju switch cos
juju deploy cos-lite --trust
watch --color juju status --color --relations
Full steps
# Backup images, remember to use 'microk8s.ctr' instead of 'ctr', and remove 'ctr' by 'sudo apt purge containerd docker.io -y'
microk8s.ctr --namespace k8s.io images ls -q |grep -v sha256 |xargs -I {} sh -c 'fname=$(echo "{}" | tr -s "/" "_"); microk8s.ctr --namespace k8s.io image export ${fname}.tar "{}"'
# ls *.tar | xargs -I {} sh -c 'fname=$(echo "{}" | sed "s/_/\//g" | sed "s/.tar$//g");echo $fname; microk8s.ctr --namespace k8s.io image import {} ${fname}'
# microk8s.ctr --namespace k8s.io image ls
# Reset the env
sudo snap remove --purge openstack
sudo snap remove --purge juju
sudo snap remove --purge juju-db
sudo snap remove --purge kubectl
sudo /usr/sbin/remove-juju-services
sudo rm -rf /var/lib/juju
rm -rf ~/.local/share/juju
rm -rf ~/snap/juju/
rm -rf ~/snap/openstack
rm -rf ~/snap/openstack-hypervisor
rm -rf ~/snap/microstack/
rm -rf ~/snap/microk8s/
sudo snap remove --purge vault
sudo snap remove --purge microk8s
sudo snap remove --purge openstack-hypervisor
sudo init 6 # better to reboot, otherwise some calico interfaces and network namespaces cannot be removed
# Create a ssh key without password, and use 'NOPASSWD:ALL'
ssh-keygen
echo 'hua ALL=(ALL) NOPASSWD:ALL' |sudo tee -a /etc/sudoers
# Install sunbeam
sudo snap install openstack --channel 2023.1/stable
sunbeam prepare-node-script | bash -x
sudo usermod -a -G snap_daemon $USER && newgrp snap_daemon
sunbeam cluster bootstrap --accept-defaults
journalctl -f
# none of the following is needed any more; just use the mirror approach instead
# Monitor the status, when seeing the microk8s snap by using 'snap list |grep k8s', or the log 'Adding MicroK8S unit to machine ...', then load the image
ls *.tar | xargs -I {} sh -c 'fname=$(echo "{}" | sed "s/_/\//g" | sed "s/.tar$//g");echo $fname; microk8s.ctr --namespace k8s.io image import {} ${fname}'
# or use micro.registry
#sudo microk8s.enable registry
#ls *.tar | xargs -I {} sh -c 'fname=$(echo "{}" | sed "s/_/\//g" | sed "s/.tar$//g"); sudo docker load -i {}; sudo docker tag ${fname} localhost:32000/${fname}; sudo docker push localhost:32000/${fname}'
microk8s.ctr --namespace k8s.io image ls
sudo sed -i "s#registry.k8s.io#registry.cn-hangzhou.aliyuncs.com/google_containers#g" /var/snap/microk8s/current/args/containerd-template.toml
sudo systemctl restart snap.microk8s.daemon-containerd.service
# sudo microk8s.stop && sudo microk8s.start  # restarting microk8s is not needed; restarting snap.microk8s.daemon-containerd above is enough
microk8s.kubectl get pods --all-namespaces
cat << EOF | sudo tee -a /var/snap/microk8s/current/args/containerd-env
# should use http type
HTTP_PROXY=http://192.168.99.186:3129
HTTPS_PROXY=http://192.168.99.186:3129
NO_PROXY=10.0.0.0/8,127.0.0.0/16,192.168.0.0/16
EOF
# If there is a pause during the installation of pod in microk8s, we can restart microk8s by: sudo microk8s.stop && sudo microk8s.start
# Use sunbeam
alias kubectl='sudo /snap/bin/microk8s.kubectl'
source <(kubectl completion bash) && kubectl completion bash |sudo tee /etc/bash_completion.d/kubectl
source <(openstack complete) && openstack complete |sudo tee /etc/bash_completion.d/openstack
sunbeam configure --accept-defaults --openrc demo-openrc
sunbeam openrc > admin-openrc
source admin-openrc
sunbeam launch ubuntu --name test
sudo journalctl -u snap.openstack.clusterd.service -f
sudo snap logs openstack-hypervisor.ovn-controller
openstack hypervisor list
sunbeam cluster list
# Debug hacks
microk8s.kubectl describe pod --all-namespaces > tmp && grep -r failed tmp |tail -n1
microk8s.kubectl logs -n kube-system pod/xxx
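Another quick triage command (an addition): sort recent events by time to surface the latest failures:
microk8s.kubectl get events -A --sort-by='.lastTimestamp' | tail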
$ microk8s.ctr --namespace k8s.io image ls -q |grep -v sha256
docker.io/calico/cni:v3.21.1
docker.io/calico/cni:v3.23.4
docker.io/calico/cni:v3.23.5
docker.io/calico/cni:v3.25.1
docker.io/calico/kube-controllers:v3.17.3
docker.io/calico/kube-controllers:v3.23.4
docker.io/calico/kube-controllers:v3.23.5
docker.io/calico/kube-controllers:v3.25.1
docker.io/calico/node:v3.21.1
docker.io/calico/node:v3.23.4
docker.io/calico/node:v3.23.5
docker.io/calico/node:v3.25.1
docker.io/calico/pod2daemon-flexvol:v3.21.1
docker.io/calico/pod2daemon-flexvol:v3.23.4
docker.io/cdkbot/hostpath-provisioner:1.4.2
docker.io/coredns/coredns:1.9.3
docker.io/jujusolutions/charm-base:ubuntu-20.04
docker.io/jujusolutions/charm-base:ubuntu-22.04
docker.io/jujusolutions/jujud-operator:3.2.0
quay.io/metallb/controller:v0.13.3
quay.io/metallb/speaker:v0.13.3
registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.7
registry.k8s.io/pause:3.7
rocks.canonical.com/cdk/diverdane/nginxdualstack:1.0.0
During this process, if 'juju clouds' shows the problem below, it's because it was run inside 'newgrp snap_daemon'; starting with juju 3.x the snap is strictly confined (see 'snap debug sandbox-features'), so the snap_daemon group is bound to lack permissions.
update.go:85: cannot change mount namespace according to change mount (/run/user/1000/doc/by-app/snap.juju /run/user/1000/doc none bind,rw,x-snapd.ignore-missing 0 0): cannot inspect "/run/user/1000/doc": lstat /run/user/1000/doc: permission denied
If you hit the problem below, search for 'restrict' in /etc/ntp.conf and disable IPv6 there:
11月 29 13:43:28 minipc ntpd[1910]: bind(30) AF_INET6 fe80::ecee:eeff:feee:eeee%14#123 flags 0x11 failed: Cannot assign requested address
11月 29 13:43:28 minipc ntpd[1910]: unable to create socket on cali93e42ce2874 (14) for fe80::ecee:eeff:feee:eeee%14#123
Also, if snaps cannot be downloaded, the network on the gw is at fault. Pods in microk8s that never finish deploying may be related to this, because they pull the openstack charm images from registry.jujucharms.com (which is currently reachable):
grep -r 'PullImage from image service failed' /var/log/syslog | awk -F'image="' '{split($2, a, "@sha256:"); print a[1]}'
Nov 29 13:50:15 minipc microk8s.daemon-kubelite[38665]: E1129 13:50:15.323748 38665 remote_image.go:171] "PullImage from image service failed" err="rpc error: code = Unknown desc = failed to pull and unpack image \"registry.jujucharms.com/charm/6a0rnzywlucfo4rvn7y2aylcc19uaarnwsrge/ovn-sb-db-server-image@sha256:d132bf917fde0e48743ace9f0bceb0ae3ba17a7cc41c0a76c4160a1fb606940a\": failed to resolve reference \"registry.jujucharms.com/charm/6a0rnzywlucfo4rvn7y2aylcc19uaarnwsrge/ovn-sb-db-server-image@sha256:d132bf917fde0e48743ace9f0bceb0ae3ba17a7cc41c0a76c4160a1fb606940a\": pull access denied, repository does not exist or may require authorization: server message: insufficient_scope: authorization failed" image="registry.jujucharms.com/charm/6a0rnzywlucfo4rvn7y2aylcc19uaarnwsrge/ovn-sb-db-server-image@sha256:d132bf917fde0e48743ace9f0bceb0ae3ba17a7cc41c0a76c4160a1fb606940a"
sudo HTTP_PROXY=http://192.168.99.186:3129 HTTPS_PROXY=http://192.168.99.186:3129 microk8s.ctr --namespace=k8s.io images pull registry.jujucharms.com/charm/6a0rnzywlucfo4rvn7y2aylcc19uaarnwsrge/ovn-sb-db-server-image
the way - microk8s.registry
Installation:
sudo microk8s.enable registry
The port wasn't known at first; the methods below revealed that the port is 5000 and the externally exposed NodePort is 32000
journalctl -u snap.microk8s.daemon-containerd -u snap.microk8s.daemon-registry
cat /var/snap/microk8s/current/args/containerd.toml |grep '\.registry' -A1
cat /var/snap/microk8s/6100/args/certs.d/localhost\:32000/hosts.toml
hua@minipc:~$ sudo microk8s.kubectl logs -n container-registry registry-77c7575667-q9qr2 |tail -n1
time="2023-11-29T06:08:29.221059247Z" level=info msg="listening on [::]:5000" go.version=go1.16.15 instance.id=5e67e401-e456-49bc-acaf-d6406f012e7f service=registry version="v2.8.1+unknown"
$ microk8s.kubectl get svc -A |grep reg
container-registry registry NodePort 10.152.183.20 <none> 5000:32000/TCP 33m
curl http://192.168.99.179:32000/v2/_catalog
sudo microk8s.kubectl proxy
http://localhost:8001/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy/
kubectl run -it --rm --image=alpine:latest test-container-registry -- sh
# apk add curl
Import the images:
ls *.tar | xargs -I {} sh -c 'fname=$(echo "{}" | sed "s/_/\//g" | sed "s/.tar$//g"); microk8s.ctr image import {} localhost:32000/${fname}'
But it's unclear how to list the images inside localhost:32000. "microk8s.ctr images ls localhost:32000" certainly won't work, because ctr talks to containerd rather than to a registry. docker might be needed here (sudo docker image ls localhost:32000/k8s.io), but that was empty too; do the images need to be exported from ctr into docker, or re-tagged with docker? (Note: maybe just use 'microk8s.ctr images ls'; try it next time)
ls *.tar | xargs -I {} sh -c 'fname=$(echo "{}" | sed "s/_/\//g" | sed "s/.tar$//g"); sudo docker load -i {}; sudo docker tag ${fname} localhost:32000/${fname}; sudo docker push localhost:32000/${fname}'
#sudo docker image ls localhost:32000
# the 'docker image ls localhost:32000' above still shows nothing; use the commands below instead. docker doesn't care where an image is stored, it only looks at the tag (note the tag contains localhost)
sudo docker image ls
sudo docker image ls localhost:32000/docker.io/calico/node
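The registry's own HTTP API can also list what is stored in localhost:32000 (a sketch using the standard Docker registry v2 API, matching the _catalog call earlier; the repository name below assumes the tagging scheme used above):
curl -s http://localhost:32000/v2/_catalog
curl -s http://localhost:32000/v2/docker.io/calico/node/tags/list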
the way - use mirror
sudo docker pull registry.k8s.io/pause:3.7
sudo docker pull k8s.m.daocloud.io/pause:3.7
# https://github.com/DaoCloud/public-image-mirror
# https://microk8s.io/docs/registry-private
sudo mkdir -p /var/snap/microk8s/current/args/certs.d/registry.k8s.io
echo '
server = "registry.k8s.io"
[host."https://registry.aliyuncs.com/v2/google_containers"]
capabilities = ["pull", "resolve"]
override_path = true
' | sudo tee /var/snap/microk8s/current/args/certs.d/registry.k8s.io/hosts.toml
sudo mkdir -p /var/snap/microk8s/current/args/certs.d/rocks.canonical.com
echo '
server = "rocks.canonical.com"
[host."https://rocks-canonical.m.daocloud.io"]
capabilities = ["pull", "resolve"]
override_path = true
' | sudo tee /var/snap/microk8s/current/args/certs.d/rocks.canonical.com/hosts.toml
sudo mkdir -p /var/snap/microk8s/current/args/certs.d/registry.jujucharms.com
echo '
server = "registry.jujucharms.com"
[host."https://jujucharms.m.daocloud.io"]
capabilities = ["pull", "resolve"]
override_path = true
' | sudo tee /var/snap/microk8s/current/args/certs.d/registry.jujucharms.com/hosts.toml
sudo mkdir -p /var/snap/microk8s/current/args/certs.d/quay.io
echo '
server = "quay.io"
[host."https://quay.m.daocloud.io"]
capabilities = ["pull", "resolve"]
override_path = true
' | sudo tee /var/snap/microk8s/current/args/certs.d/quay.io/hosts.toml
sudo mkdir -p /var/snap/microk8s/current/args/certs.d/gcr.io
echo '
server = "gcr.io"
[host."https://gcr.m.daocloud.io"]
capabilities = ["pull", "resolve"]
override_path = true
' | sudo tee /var/snap/microk8s/current/args/certs.d/gcr.io/hosts.toml
sudo mkdir -p /var/snap/microk8s/current/args/certs.d/docker.io
echo '
server = "docker.io"
[host."https://m.daocloud.io/docker.io"]
capabilities = ["pull", "resolve"]
override_path = true
' | sudo tee /var/snap/microk8s/current/args/certs.d/docker.io/hosts.toml
# if it doesn't take effect, it's probably the permission problem below
sudo chown -R hua:hua /var/snap/microk8s/current/args/certs.d
sudo snap restart microk8s
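A quick verification pull through containerd (an addition): the image name stays unchanged, containerd rewrites the host per hosts.toml:
sudo microk8s.ctr --namespace k8s.io images pull registry.k8s.io/pause:3.7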
Other issues
With the mirror approach above, sunbeam installed successfully, but launching a test VM reported the error below,
$ sudo dmesg | grep 'apparmor="DENIED"' |tail -n1
[ 8189.359936] audit: type=1400 audit(1701263375.834:2157): apparmor="DENIED" operation="open" profile="snap.openstack-hypervisor.nova-api-metadata" name="/etc/nova/api-paste.ini" pid=1440903 comm="python3" requested_mask="r" denied_mask="r" fsuid=0 ouid=1000
It can be fixed like this:
sudo vim /var/lib/snapd/apparmor/profiles/snap.openstack-hypervisor.nova-api-metadata
/etc/nova/api-paste.ini r,
/etc/nova/** r,
sudo apparmor_parser -r /var/lib/snapd/apparmor/profiles/snap.openstack-hypervisor.nova-api-metadata
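Afterwards the same dmesg grep as above should show no new denials (a verification step, an addition):
sudo dmesg | grep 'apparmor="DENIED"' |grep nova-api-metadata |tail -n1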
But afterwards 'openstack hypervisor list' was still empty and the error below appeared. There are simply too many bugs, so I'll stop here for now:
hua@minipc:~$ journalctl -f -u snap.openstack-hypervisor.nova-compute.service
11月 29 21:31:06 minipc nova-compute[1655283]: 2023-11-29 21:31:06.227 1655283 INFO nova.virt.libvirt.driver [None req-646001d6-e3ff-414d-8be4-6bf0c1882b2b - - - - - -] Connection event '0' reason 'Failed to connect to libvirt: Failed to connect socket to '/var/snap/openstack-hypervisor/common/run/libvirt/virtqemud-sock': No such file or directory'
20240222 - docker registry acceleration
sudo mkdir -p /etc/docker
sudo tee /etc/docker/daemon.json <<-EOF
{
"registry-mirrors": [
"https://dockerproxy.com",
"https://mirror.baidubce.com",
"https://docker.m.daocloud.io",
"https://docker.nju.edu.cn",
"https://docker.mirrors.sjtug.sjtu.edu.cn"
]
}
EOF
sudo systemctl daemon-reload
sudo systemctl restart docker
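Verify the mirrors are active (an addition):
sudo docker info | grep -A6 'Registry Mirrors'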
20240618 - Use sunbeam to verify SRU (>jammy)
1, Setting horizon to NodePort ourselves didn't seem to work
kubectl get service -n openstack
kubectl patch service horizon --namespace openstack -p '{"spec": {"type": "NodePort"}}'
kubectl describe service horizon --namespace openstack |grep NodePort
kubectl get svc horizon --namespace openstack -o yaml
kubectl get pods --namespace openstack -l app.kubernetes.io/name=horizon
kubectl get nodes -o wide
kubectl logs horizon-0 --namespace openstack
sudo ufw allow 32241
2, Switching to 'sunbeam dashboard-url' worked this time; horizon can be reached both via the FIP (10.20.21.11) and via NodePort=32241
#above NodePort didn't work so let's change to use dashboard-url instead
#juju run horizon/0 get-dashboard-url
sunbeam dashboard-url
#http://10.20.21.11:80/openstack-horizon
sshuttle -v -r ubuntu@seg 10.20.0.0/16
$ juju status horizon |grep active
horizon active 1 horizon-k8s 2023.2/stable 62 10.152.183.130 no http://10.20.21.11/openstack-horizon
horizon/0* active idle 10.1.141.236
$ kubectl get services -o wide -A|grep '10.152.183.130'
openstack horizon NodePort 10.152.183.130 <none> 65535:32241/TCP 6h14m app.kubernetes.io/name=horizon
3, But where does horizon's apache2 actually run? Following the traffic, only traefik was found; it plays an haproxy-like HA role (charm deployments used haproxy for HA; now that openstack runs on microk8s, k8s's traefik provides HA for openstack instead)
$ kubectl get nodes -o wide |grep Ready
harhall Ready <none> 4h43m v1.28.10 10.230.62.239 <none> Ubuntu Core 20 5.15.0-101-generic containerd://1.6.28
#10.152.183.131 is ClusterIP for LoadBalancer, pod's cidr is 10.1.141.0/24(cni), the IP of br-ex is 10.20.21.1
$ kubectl get services --namespace openstack -o wide |grep 10.20.21.11
traefik-public LoadBalancer 10.152.183.131 10.20.21.11 80:30109/TCP,443:31719/TCP 3h44m app.kubernetes.io/name=traefik-public
ubuntu@harhall:~$ kubectl describe service traefik-public -n openstack
...
IP: 10.152.183.131
LoadBalancer Ingress: 10.20.21.11
Port: traefik-public 80/TCP
TargetPort: 80/TCP
NodePort: traefik-public 30109/TCP
Endpoints: 10.1.141.211:80
Port: traefik-public-tls 443/TCP
TargetPort: 443/TCP
NodePort: traefik-public-tls 31719/TCP
Endpoints: 10.1.141.211:443
$ kubectl describe service horizon -n openstack
...
Type: NodePort
IP: 10.152.183.130
IPs: 10.152.183.130
Port: placeholder 65535/TCP
TargetPort: 65535/TCP
NodePort: placeholder 32241/TCP
Endpoints: 10.1.141.236:65535
$ kubectl get pods --namespace openstack -o wide |grep 10.1.141.211
traefik-public-0 2/2 Running 0 4h32m 10.1.141.211 harhall <none> <none>
4, The horizon unit can also be accessed quickly in the usual juju way. It turns out horizon is not run through apache but through wsgi.py; wsgi runs on the host, and inside the horizon/0 unit only the horizon charm and the pebble process run (pebble is a systemd-like tool - https://github.com/canonical/pebble ; list services with '/charm/bin/pebble services', stop a service with '/charm/bin/pebble stop <xxx>', view juju logs with '/charm/bin/pebble logs')
#kubectl exec -it horizon-0 --namespace openstack -- /bin/bash
cat $HOME/snap/openstack/current/account.yaml
juju switch openstack && juju ssh horizon/0
#https://opendev.org/openstack/sunbeam-charms.git
root@horizon-0:/var/lib/juju# grep -r -i 'script' /var/lib/juju/agents/unit-horizon-0/charm/src/templates/wsgi-horizon.conf
WSGIScriptAlias {{ ingress_public.ingress_path }} /usr/share/openstack-dashboard/openstack_dashboard/wsgi.py process-group=horizon
root@horizon-0:/var/lib/juju# find / -name 'openstack-dashboard'
<empty>
ubuntu@harhall:~$ sudo find / -name 'openstack_dashboard' |grep dist-packages
/var/snap/microk8s/common/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/274/fs/usr/lib/python3/dist-packages/openstack_dashboard
/var/snap/microk8s/common/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/236/fs/usr/lib/python3/dist-packages/openstack_dashboard
#https://review.opendev.org/c/openstack/horizon/+/910321/4/openstack_dashboard/dashboards/identity/projects/tabs.py
sudo vim /var/snap/microk8s/common/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/236/fs/usr/lib/python3/dist-packages/openstack_dashboard/dashboards/identity/projects/tabs.py
ubuntu@harhall:~$ ps -ef |grep wsgi |grep horizon |head -n1
42420 1366796 1366774 4 04:22 ? 00:19:22 (wsgi:horizon) -DFOREGROUND
5, Now we know the code lives at /var/snap/microk8s/common/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/236/fs/usr/lib/python3/dist-packages/openstack_dashboard/dashboards/identity/projects/tabs.py. I assumed it came from the microk8s snap and was therefore read-only, but after unsquashing (note: for debugging a read-only snap, 'mount --bind' can be used) the file is nowhere to be found - ./squashfs-root/usr/lib/python3/dist-packages/openstack_dashboard
sudo unsquashfs -d squashfs-root /var/lib/snapd/snaps/microk8s*.snap
#need to first exit bash from 'kubectl -n openstack exec -it horizon-0 -- bash', then run
ubuntu@harhall:~$ sudo snap try ./squashfs-root/ --devmode
2024-06-18T03:06:50Z INFO Waiting for "snap.microk8s.daemon-kubelite.service" to stop.
microk8s v1.28.10 mounted from /home/ubuntu/squashfs-root
ubuntu@harhall:~$ ls ./squashfs-root/usr/lib/python3/dist-packages/openstack_dashboard
ls: cannot access './squashfs-root/usr/lib/python3/dist-packages/openstack_dashboard': No such file or directory
6, Ah, it actually comes from the Rocks image, and the file is not read-only - it can be modified in place, which makes things easy.
ubuntu@harhall:~$ microk8s.kubectl describe -n openstack pod/horizon-0 |grep 'Image:' |grep horizon
Image: registry.jujucharms.com/charm/ctws3sfxf7i2j12tvw9obbkcq9q0576tozepc/horizon-image@sha256:a1d591317e14c9d4a4bae7971cd8074ca9d4e5f0399779bd8da313b130f482f8
'mount |grep upperdir' shows that the directory given by upperdir is writable (upperdir=/var/snap/microk8s/common/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/548/fs)
7, But after restarting with 'sudo microk8s.stop dashboard && sudo microk8s.start dashboard' (presumably this restarts the k8s dashboard), the whole o7k stopped working, and even 'sudo microk8s.stop && sudo microk8s.start' could not recover it.
cd /var/snap/microk8s/common/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/236/fs/usr/lib/python3/dist-packages
patch -p1 < diff
sudo microk8s.stop dashboard && sudo microk8s.start dashboard
8, To recover the openstack environment quickly: running charmed o7k on microk8s brings the benefit shown below of recovering o7k by deleting pods; in addition, k8s's LB can replace the charmed haproxy
ubuntu@harhall:~$ juju status |grep mysql |grep installing |grep -v router
glance-mysql 8.0.35-0ubuntu0.22.04.1 waiting 1 mysql-k8s 8.0/stable 127 10.152.183.184 no installing agent
keystone-mysql 8.0.35-0ubuntu0.22.04.1 waiting 1 mysql-k8s 8.0/stable 127 10.152.183.218 no installing agent
nova-mysql 8.0.35-0ubuntu0.22.04.1 waiting 1 mysql-k8s 8.0/stable 127 10.152.183.105 no installing agent
kubectl delete pod -n openstack glance-mysql-0
kubectl delete pod -n openstack keystone-mysql-0
kubectl delete pod -n openstack nova-mysql-0
#also delete router pods
kubectl delete pod -n openstack keystone-mysql-router-0
kubectl delete pod -n openstack nova-api-mysql-router-0
kubectl delete pod -n openstack nova-cell-mysql-router-0
kubectl delete pod -n openstack nova-mysql-router-0
9, Then delete the horizon-0 pod to restart horizon, after which the patch was verified successfully
kubectl delete pod -n openstack horizon-0
ubuntu@harhall:~$ sudo grep -r 'identity.get_domain_id_for_operation' /var/snap/microk8s/common/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/236/fs/usr/lib/python3/dist-packages/openstack_dashboard/dashboards/identity/projects/tabs.py
domain_id = identity.get_domain_id_for_operation(self.request)
domain_id = identity.get_domain_id_for_operation(self.request)
domain_id = identity.get_domain_id_for_operation(self.request)
10, But I also wanted to try rpdb, and it didn't work
sudo vim /var/snap/microk8s/common/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/236/fs/usr/lib/python3/dist-packages/openstack_dashboard/dashboards/identity/projects/tabs.py
#import rpdb;rpdb.set_trace()
import rpdb;rpdb.Rpdb(addr='0.0.0.0', port=4444).set_trace()
sudo pip3 uninstall rpdb && /snap/openstack/335/bin/pip3 install rpdb
kubectl delete pod -n openstack horizon-0
Having used sunbeam to verify 2023.2 on mantic (4:23.3.0-0ubuntu1.2 - strictly speaking this doesn't verify mantic itself: mantic uses 23.3.0 as its base, so only 23.3.0 is verified, and sunbeam actually runs jammy + 23.3.0), I went on to use sunbeam to verify 2024.1 (noble uses 4:24.0.0-0ubuntu1.1 as base) and it failed, for an unrelated reason: only one domain could now be listed in the domain list (previously multiple could be listed). After switching to devstack to debug, the cause turned out to be this (https://bugs.launchpad.net/keystone/+bug/2041611 , https://review.opendev.org/c/openstack/keystone/+/900028). As shown below, a project-scoped token can return multiple domains, while a domain-scoped token should return only one domain; an earlier bug made domain-scoped tokens return multiple domains too (and horizon strictly uses domain-scoped tokens when calling the keystone API), which indirectly masked my earlier bug (https://bugs.launchpad.net/horizon/+bug/2054799): since multiple domains can no longer be listed, the precondition for triggering that bug - logging in via the admin domain to set the domain context for another domain such as the k8s domain - no longer exists.
#project-scoped token can return multiple domains
$ env |grep OS_
OS_PROJECT_DOMAIN_ID=default
OS_CACERT=
OS_AUTH_URL=http://localhost:5000/v3/
OS_USER_DOMAIN_ID=default
OS_USERNAME=admin
OS_PROJECT_NAME=admin
OS_PASSWORD=password
#export OS_PROJECT_NAME=admin
#export OS_PROJECT_DOMAIN_ID=default
$ openstack domain list
+----------------------------------+---------+---------+--------------------+
| ID | Name | Enabled | Description |
+----------------------------------+---------+---------+--------------------+
| default | Default | True | The default domain |
| e3c1e806d5204b638906c6e64d72ec9a | k8s | True | |
+----------------------------------+---------+---------+--------------------+
#domain-scoped token should return only one domain
#but although the token of the domain member user making the API request is strictly domain-scoped,
#all domains are returned without this patch https://bugs.launchpad.net/keystone/+bug/2041611
unset OS_PROJECT_NAME
unset OS_PROJECT_DOMAIN_ID
export OS_DOMAIN_NAME=default
$ openstack domain list
+---------+---------+---------+--------------------+
| ID | Name | Enabled | Description |
+---------+---------+---------+--------------------+
| default | Default | True | The default domain |
+---------+---------+---------+--------------------+
Besides, charms can for now create a 2024.1 environment on jammy with the method below (there is no suitable ovn version for 2024.1 yet, so ml2-ovs is used). On noble, juju 3.x uses a different charmcraft.yaml format, and a charm cannot manage both formats at once, so charms cannot run directly on noble yet, which also means the SRU debian package cannot be verified there (sunbeam only verifies the patch, not the deb).
I tried installing jammy-caracal first and then series-upgrading only the dashboard unit to mantic, but after the upgrade the machine's console-log showed eth0 had no IP, so it couldn't continue (hence it also wasn't possible to verify whether the mysql-router charm mentioned earlier has the problem)
./generate-bundle.sh -s jammy -r caracal -n osttest --ml2-ovs --run --openstack-dashboard
juju ssh openstack-dashboard/0 -- sudo -s
systemctl stop jujud-machine-12.service && systemctl disable jujud-machine-12.service
#we should NOT disable /etc/apt/sources.list.d/cloud-archive.list etc, otherwise, we will hit package dependencies issue during do-release-upgrade
sed -i 's/Prompt=lts/Prompt=normal/g' /etc/update-manager/release-upgrades
apt upgrade -y && apt dist-upgrade -y
reboot
juju ssh openstack-dashboard/0 -- sudo -s
#series upgrade from jammy to mantic
do-release-upgrade
Since jammy-caracal was installed, upgrading straight from jammy to noble (skipping mantic) avoids the IP problem above.
Mantic == Bobcat which is < Caracal. If you deploy Jammy Caracal using these instructions you should not upgrade to Mantic because you will then be running an unsupported combination of packages. Forget Mantic entirely and go to straight Noble.
sed -i 's/Prompt=normal/Prompt=lts/g' /etc/update-manager/release-upgrades
do-release-upgrade -d -f DistUpgradeViewNonInteractive
systemctl start jujud-machine-12.service && systemctl enable jujud-machine-12.service
20240923 - Revisiting in theory how to quickly install sunbeam/microk8s inside China
A docker image cache can be built with a docker registry (see: https://docs.docker.com/docker-hub/mirror/), and also see (How to use private registry in docker and containerd and k8s (by quqi99) - https://blog.csdn.net/quqi99/article/details/104842160 ).
Assume a cache server has already been prepared for https://k8s.gcr.io (http://10.245.167.25:8082) and another for https://docker.io (http://10.245.166.36:8082); add the config below to microk8s and restart (sudo systemctl restart snap.microk8s.daemon-containerd.service):
sudo mkdir -p /var/snap/microk8s/current/args/certs.d/k8s.gcr.io
cat << EOF | sudo tee /var/snap/microk8s/current/args/certs.d/k8s.gcr.io/hosts.toml
server = "http://10.245.167.25:8082"
[host."http://10.245.167.25:8082"]
capabilities = ["pull", "resolve"]
EOF
sudo mkdir -p /var/snap/microk8s/current/args/certs.d/docker.io
cat << EOF | sudo tee /var/snap/microk8s/current/args/certs.d/docker.io/hosts.toml
server = "http://10.245.166.36:8082"
[host."http://10.245.166.36:8082"]
capabilities = ["pull", "resolve"]
EOF
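As noted above, restart containerd so the hosts.toml changes take effect:
sudo systemctl restart snap.microk8s.daemon-containerd.service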
To install sunbeam, set up microk8s as above first and then install sunbeam.
If using the docker-registry charm to deploy the docker registry, it can be deployed like this:
series: focal
applications:
docker-io-caching-registry:
charm: ./docker-registry
num_units: 1
options:
cache-password: CHANGEME
cache-remoteurl: https://registry-1.docker.io
cache-username: CHANGEME
http_proxy: http://xx:3128
https_proxy: http://xx:3128
registry-http-proxy: http://xx:3128
registry-https-proxy: http://xx:3128
registry-port: 8082
constraints: arch=amd64 cpu-cores=4 allocate-public-ip=true
k8s-gcr-io-caching-registry:
charm: ./docker-registry
num_units: 1
options:
cache-remoteurl: https://k8s.gcr.io
http_proxy: http://xx:3128
https_proxy: http://xx:3128
registry-http-proxy: http://xx:3128
registry-https-proxy: http://xx:3128
registry-port: 8082
constraints: arch=amd64 cpu-cores=4 allocate-public-ip=true
Alternatively, skip self-hosting a docker registry and use the public Chinese registries as in 'the way - use mirror' above (or simply add the m.daocloud.io prefix, e.g.: k8s.gcr.io/coredns/coredns => m.daocloud.io/k8s.gcr.io/coredns/coredns).
#https://github.com/DaoCloud/public-image-mirror
#sudo docker pull registry.k8s.io/pause:3.7
#sudo docker pull k8s.m.daocloud.io/pause:3.7
registry.k8s.io - https://registry.aliyuncs.com/v2/google_containers
rocks.canonical.com - https://rocks-canonical.m.daocloud.io
registry.jujucharms.com - https://jujucharms.m.daocloud.io
quay.io - https://quay.m.daocloud.io
gcr.io - https://gcr.m.daocloud.io
docker.io - https://m.daocloud.io/docker.io
sudo chown -R hua:hua /var/snap/microk8s/current/args/certs.d
sudo snap restart microk8s
The microk8s charm also has a custom_registries config option - https://github.com/canonical/charm-microk8s/blob/legacy/config.yaml#L27 , for example:
juju config containerd custom_registries='[{"url": "https://zhhuabj-bastion.cloud.sts:5000", "username": "test", "password": "password"}]'
docker supports registry mirrors as well, via "registry-mirrors" in /etc/docker/daemon.json:
"registry-mirrors": [
"https://docker.m.daocloud.io"
]
But why doesn't the method below work? To be investigated:
#https://microk8s.io/
sudo snap install microk8s --classic
sudo mkdir -p /var/snap/microk8s/current/args/certs.d/registry.k8s.io
echo '
server = "registry.k8s.io"
[host."https://m.daocloud.io/registry.k8s.io"]
capabilities = ["pull", "resolve"]
override_path = true
' | sudo tee /var/snap/microk8s/current/args/certs.d/registry.k8s.io/hosts.toml
sudo mkdir -p /var/snap/microk8s/current/args/certs.d/rocks.canonical.com
echo '
server = "rocks.canonical.com"
[host."https://m.daocloud.io/rocks.canonical.com"]
capabilities = ["pull", "resolve"]
override_path = true
' | sudo tee /var/snap/microk8s/current/args/certs.d/rocks.canonical.com/hosts.toml
sudo mkdir -p /var/snap/microk8s/current/args/certs.d/registry.jujucharms.com
echo '
server = "registry.jujucharms.com"
[host."https://m.daocloud.io/registry.jujucharms.com"]
capabilities = ["pull", "resolve"]
override_path = true
' | sudo tee /var/snap/microk8s/current/args/certs.d/registry.jujucharms.com/hosts.toml
sudo mkdir -p /var/snap/microk8s/current/args/certs.d/quay.io
echo '
server = "quay.io"
[host."https://m.daocloud.io/quay.io"]
capabilities = ["pull", "resolve"]
override_path = true
' | sudo tee /var/snap/microk8s/current/args/certs.d/quay.io/hosts.toml
sudo mkdir -p /var/snap/microk8s/current/args/certs.d/gcr.io
echo '
server = "gcr.io"
[host."https://m.daocloud.io/gcr.io"]
capabilities = ["pull", "resolve"]
override_path = true
' | sudo tee /var/snap/microk8s/current/args/certs.d/gcr.io/hosts.toml
sudo mkdir -p /var/snap/microk8s/current/args/certs.d/docker.io
echo '
server = "docker.io"
[host."https://m.daocloud.io/docker.io"]
capabilities = ["pull", "resolve"]
override_path = true
' | sudo tee /var/snap/microk8s/current/args/certs.d/docker.io/hosts.toml
sudo chown -R $USER /var/snap/microk8s/current/args/certs.d
journalctl -u snap.microk8s.daemon-containerd.service -f
sudo systemctl restart snap.microk8s.daemon-containerd.service
sudo tail -f /var/log/syslog |grep 'image'
#sudo docker pull docker.io/calico/cni:v3.25.1
#sudo docker pull m.daocloud.io/docker.io/calico/cni:v3.25.1
#sed -i "s#k8s.gcr.io#registry.cn-hangzhou.aliyuncs.com/google_containers#g" /var/snap/microk8s/current/args/containerd-template.toml
#sudo systemctl restart snap.microk8s.daemon-containerd.service
#cat /var/snap/microk8s/3629/args/containerd.toml
microk8s.ctr --namespace k8s.io image ls
microk8s kubectl get all --all-namespaces
microk8s status --wait-ready
microk8s enable registry
It reports errors like the following: clearly docker.m.daocloud.io returned StatusNotFound, and then falling back to docker.io hit Forbidden:
9月 24 14:28:07 minipc microk8s.daemon-containerd[3155217]: time="2024-09-24T14:28:07.871135376+08:00" level=info msg="PullImage \"docker.io/calico/cni:v3.25.1\""
9月 24 14:28:08 minipc microk8s.daemon-containerd[3155217]: time="2024-09-24T14:28:08.010462692+08:00" level=info msg="trying next host - response was http.StatusNotFound" host=docker.m.daocloud.io
9月 24 14:28:21 minipc microk8s.daemon-containerd[3155217]: time="2024-09-24T14:28:21.587410000+08:00" level=error msg="PullImage \"docker.io/calico/cni:v3.25.1\" failed" error="failed to pull and unpack image \"docker.io/calico/cni:v3.25.1\": failed to resolve reference \"docker.io/calico/cni:v3.25.1\": unexpected status from HEAD request to https://www.docker.com/v2/calico/cni/manifests/v3.25.1: 403 Forbidden"
So I edited /var/snap/microk8s/current/args/containerd-template.toml directly, adding the config below (verify with: sudo cat /var/snap/microk8s/current/args/containerd.toml), and after running 'sudo systemctl restart snap.microk8s.daemon-containerd.service' everything worked.
[plugins."io.containerd.grpc.v1.cri".registry.mirrors]
[plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
endpoint = ["https://docker.m.daocloud.io","http://hub-mirror.c.163.com"]
[plugins."io.containerd.grpc.v1.cri".registry.mirrors."gcr.io"]
endpoint = ["gcr.m.daocloud.io"]
[plugins."io.containerd.grpc.v1.cri".registry.mirrors."k8s.gcr.io"]
endpoint = ["k8s-gcr.m.daocloud.io"]
[plugins."io.containerd.grpc.v1.cri".registry.mirrors."quay.io"]
endpoint = ["quay.m.daocloud.io"]
[plugins."io.containerd.grpc.v1.cri".registry.mirrors."registry.k8s.io"]
endpoint = ["k8s.m.daocloud.io"]
[plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.elastic.co"]
endpoint = ["elastic.m.daocloud.io"]
[plugins."io.containerd.grpc.v1.cri".registry.mirrors."registry.jujucharms.com"]
endpoint = ["jujucharms.m.daocloud.io"]
[plugins."io.containerd.grpc.v1.cri".registry.mirrors."rocks.canonical.com"]
endpoint = ["rocks-canonical.m.daocloud.io"]
[plugins."io.containerd.grpc.v1.cri".registry.mirrors."harbor.sundayhk.com"]
endpoint = ["https://harbor.sundayhk.com"]
# 'plugins."io.containerd.grpc.v1.cri".registry' contains config related to the registry
#[plugins."io.containerd.grpc.v1.cri".registry]
# config_path = "${SNAP_DATA}/args/certs.d"
Reference
[1] Sunbeam underlying projects - https://discourse.ubuntu.com/t/sunbeam-underlying-projects/37526
[2] https://github.com/canonical/snap-openstack.git