A Neo4j Cluster on Kubernetes
This post assumes that you're familiar with deploying Neo4j on Kubernetes. I wrote an article on the Neo4j blog that explains it in more detail.
The StatefulSet we create for our core servers requires persistent storage, achieved via the PersistentVolumeClaim (PVC) primitive. A Neo4j cluster with 3 core servers will have the following PVCs:
$ kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
datadir-neo-helm-neo4j-core-0 Bound pvc-043efa91-cc54-11e7-bfa5-080027ab9eac 10Gi RWO standard 45s
datadir-neo-helm-neo4j-core-1 Bound pvc-1737755a-cc54-11e7-bfa5-080027ab9eac 10Gi RWO standard 13s
datadir-neo-helm-neo4j-core-2 Bound pvc-18696bfd-cc54-11e7-bfa5-080027ab9eac 10Gi RWO standard 11s
Each PVC has a corresponding PersistentVolume (PV) that satisfies it:
$ kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pvc-043efa91-cc54-11e7-bfa5-080027ab9eac 10Gi RWO Delete Bound default/datadir-neo-helm-neo4j-core-0 standard 41m
pvc-1737755a-cc54-11e7-bfa5-080027ab9eac 10Gi RWO Delete Bound default/datadir-neo-helm-neo4j-core-1 standard 40m
pvc-18696bfd-cc54-11e7-bfa5-080027ab9eac 10Gi RWO Delete Bound default/datadir-neo-helm-neo4j-core-2 standard 40m
The PVCs and PVs are usually created at the same time as the StatefulSet is deployed. We need to intervene in that lifecycle so that our dataset is already in place before the StatefulSet is deployed.
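This works because PVC names created from a StatefulSet's volumeClaimTemplates are deterministic: `<template name>-<statefulset name>-<ordinal>`. A small sketch of the naming convention (the variable names are mine; the values match the `neo-helm` release used in this post):

```shell
# PVCs created by a StatefulSet follow the pattern
# <volumeClaimTemplate name>-<statefulset name>-<ordinal>,
# which is why we can pre-create them with matching names.
template="datadir"                   # name of the volumeClaimTemplate
statefulset="neo-helm-neo4j-core"    # name of the StatefulSet
replicas=3

# Prints datadir-neo-helm-neo4j-core-0 through -2
for i in $(seq 0 $((replicas - 1))); do
  echo "${template}-${statefulset}-${i}"
done
```

As long as a Bound PVC with the expected name already exists, the StatefulSet controller reuses it instead of creating a new one.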
Deploying an existing dataset
We can do this with the following steps:
- Manually create PVCs with the names above
- Attach pods to those PVCs
- Copy our dataset onto those pods
- Delete the pods
- Deploy our Neo4j cluster
We can create the PVCs and pods with the following script:
pvs.sh
#!/usr/bin/env bash

set -exuo pipefail

for i in $(seq 0 2); do
  cat <<EOF | kubectl apply -f -
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: datadir-neo-helm-neo4j-core-${i}
  labels:
    app: neo4j
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
EOF

  cat <<EOF | kubectl apply -f -
kind: Pod
apiVersion: v1
metadata:
  name: neo4j-load-data-${i}
  labels:
    app: neo4j-loader
spec:
  volumes:
    - name: datadir-neo4j-core-${i}
      persistentVolumeClaim:
        claimName: datadir-neo-helm-neo4j-core-${i}
  containers:
    - name: neo4j-load-data-${i}
      image: ubuntu
      volumeMounts:
        - name: datadir-neo4j-core-${i}
          mountPath: /data
      command: ["/bin/bash", "-ecx", "while :; do printf '.'; sleep 5 ; done"]
EOF
done
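If you'd rather inspect the generated manifests before they hit the cluster, the heredocs can be redirected into files instead of being piped straight into `kubectl apply`. A minimal sketch of that variant (the file names are my own choice):

```shell
# Write each PVC manifest to a file for review instead of applying it
# directly; `kubectl apply -f pvc-0.yaml` etc. can be run afterwards.
for i in $(seq 0 2); do
  cat <<EOF > "pvc-${i}.yaml"
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: datadir-neo-helm-neo4j-core-${i}
  labels:
    app: neo4j
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
EOF
done
```

This also makes it easy to keep the manifests in version control alongside the rest of the deployment.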
Let's run the script to create our PVCs and pods:
$ ./pvs.sh
++ seq 0 2
+ for i in $(seq 0 2)
+ cat
+ kubectl apply -f -
persistentvolumeclaim "datadir-neo-helm-neo4j-core-0" configured
+ cat
+ kubectl apply -f -
pod "neo4j-load-data-0" configured
+ for i in $(seq 0 2)
+ cat
+ kubectl apply -f -
persistentvolumeclaim "datadir-neo-helm-neo4j-core-1" configured
+ cat
+ kubectl apply -f -
pod "neo4j-load-data-1" configured
+ for i in $(seq 0 2)
+ cat
+ kubectl apply -f -
persistentvolumeclaim "datadir-neo-helm-neo4j-core-2" configured
+ cat
+ kubectl apply -f -
pod "neo4j-load-data-2" configured
Now we can copy our database onto the pods:
for i in $(seq 0 2); do
  kubectl cp graph.db.tar.gz neo4j-load-data-${i}:/data/
  kubectl exec neo4j-load-data-${i} -- bash -c "mkdir -p /data/databases && tar -xf /data/graph.db.tar.gz -C /data/databases"
done
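The extract step is worth rehearsing locally before touching the pods, since the `-C /data/databases` flag is what puts the store at `/data/databases/graph.db`, where Neo4j expects to find it. A local dry run, with a throwaway tarball standing in for the real backup:

```shell
# Build a throwaway graph.db tarball, extract it with the same flags
# used inside the pods, and confirm the store lands at
# <mount>/databases/graph.db. (GNU tar auto-detects the gzip
# compression, which is why plain -xf works on a .tar.gz.)
workdir=$(mktemp -d)

mkdir -p "${workdir}/graph.db"
echo "placeholder" > "${workdir}/graph.db/neostore"
tar -czf "${workdir}/graph.db.tar.gz" -C "${workdir}" graph.db

mkdir -p "${workdir}/data/databases"
tar -xf "${workdir}/graph.db.tar.gz" -C "${workdir}/data/databases"

ls "${workdir}/data/databases/graph.db"
```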
graph.db.tar.gz contains a backup of a local database I created:
$ tar -tvf graph.db.tar.gz
drwxr-xr-x 0 markneedham staff 0 24 Jul 15:23 graph.db/
drwxr-xr-x 0 markneedham staff 0 24 Jul 15:23 graph.db/certificates/
drwxr-xr-x 0 markneedham staff 0 17 Feb 2017 graph.db/index/
drwxr-xr-x 0 markneedham staff 0 24 Jul 15:22 graph.db/logs/
-rw-r--r-- 0 markneedham staff 8192 24 Jul 15:23 graph.db/neostore
-rw-r--r-- 0 markneedham staff 896 24 Jul 15:23 graph.db/neostore.counts.db.a
-rw-r--r-- 0 markneedham staff 1344 24 Jul 15:23 graph.db/neostore.counts.db.b
-rw-r--r-- 0 markneedham staff 9 24 Jul 15:23 graph.db/neostore.id
-rw-r--r-- 0 markneedham staff 65536 24 Jul 15:23 graph.db/neostore.labelscanstore.db
...
-rw------- 0 markneedham staff 1700 24 Jul 15:23 graph.db/certificates/neo4j.key
We'll run the following commands to check that the database is in place:
$ kubectl exec neo4j-load-data-0 -- ls -lh /data/databases/
total 4.0K
drwxr-xr-x 6 501 staff 4.0K Jul 24 14:23 graph.db
$ kubectl exec neo4j-load-data-1 -- ls -lh /data/databases/
total 4.0K
drwxr-xr-x 6 501 staff 4.0K Jul 24 14:23 graph.db
$ kubectl exec neo4j-load-data-2 -- ls -lh /data/databases/
total 4.0K
drwxr-xr-x 6 501 staff 4.0K Jul 24 14:23 graph.db
So far, so good. The pods have done their job, so we'll tear them down:
$ kubectl delete pods -l app=neo4j-loader
pod "neo4j-load-data-0" deleted
pod "neo4j-load-data-1" deleted
pod "neo4j-load-data-2" deleted
Now we're ready to deploy our Neo4j cluster.
$ helm install incubator/neo4j --name neo-helm --wait --set authEnabled=false
Finally, we'll run a Cypher query to check that the Neo4j servers are using the database we uploaded:
$ kubectl exec neo-helm-neo4j-core-0 -- bin/cypher-shell "match (n) return count(*)"
count(*)
32314
$ kubectl exec neo-helm-neo4j-core-1 -- bin/cypher-shell "match (n) return count(*)"
count(*)
32314
$ kubectl exec neo-helm-neo4j-core-2 -- bin/cypher-shell "match (n) return count(*)"
count(*)
32314
Success!
We could achieve a similar result using init containers, but I haven't had a chance to try that approach yet. If you give it a try, let me know in the comments and I'll add it to the post.
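For the curious, an untested sketch of what that might look like: an initContainer on the core pods that seeds the data volume before the Neo4j container starts. Everything here is an assumption — in particular, how the backup archive gets into the pod (another volume, an object store, etc.) is left open:

```yaml
# Untested sketch: an initContainer that populates /data/databases
# before Neo4j starts, skipping the copy if the store already exists.
# Assumes the backup archive is reachable at /backup/graph.db.tar.gz
# via some volume not shown here.
initContainers:
  - name: seed-database
    image: ubuntu
    volumeMounts:
      - name: datadir
        mountPath: /data
    command:
      - bash
      - -c
      - |
        if [ ! -d /data/databases/graph.db ]; then
          mkdir -p /data/databases
          tar -xf /backup/graph.db.tar.gz -C /data/databases
        fi
```

The `if` guard matters: init containers run on every pod restart, so the seed step should be a no-op once the store exists.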
Translated from: https://www.javacodegeeks.com/2017/11/kubernetes-copy-dataset-statefulsets-persistentvolume.html