Linux kernel parameters rss, container kernel parameters

Container security: kernel parameters

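Containers share the host's kernel, so changing a kernel parameter may affect the host and other containers. An unprivileged container cannot modify most kernel parameters. The ones it can modify belong mainly to the IPC namespace and the network namespace, and a parameter is modifiable only if it meets three conditions:

It is on Docker's whitelist.

It is namespaced.

It is visible inside the container.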

1、Docker whitelist

IPC namespace: kernel.msgmax, kernel.msgmnb, kernel.msgmni, kernel.sem, kernel.shmall, kernel.shmmax, kernel.shmmni, kernel.shm_rmid_forced, fs.mqueue.msg_default, fs.mqueue.msg_max, fs.mqueue.msgsize_default, fs.mqueue.msgsize_max, fs.mqueue.queues_max

Network namespace: net.*
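The whitelist above can be turned into a quick pre-check before passing a name to docker run --sysctl. The following is a minimal sketch, not Docker's own code: the function name is_whitelisted is hypothetical, and matching the fs.mqueue entries by prefix (rather than by the five exact names listed) is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a sysctl name matches the whitelist listed
# above. Illustrative only; this is not how Docker itself is implemented.
is_whitelisted() {
    case "$1" in
        # IPC namespace: System V IPC parameters
        kernel.msgmax|kernel.msgmnb|kernel.msgmni|kernel.sem) return 0 ;;
        kernel.shmall|kernel.shmmax|kernel.shmmni|kernel.shm_rmid_forced) return 0 ;;
        # IPC namespace: POSIX message queues (assumed to be a prefix match)
        fs.mqueue.*) return 0 ;;
        # Network namespace
        net.*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example: classify a few names.
for key in net.core.somaxconn kernel.shmmax vm.swappiness; do
    if is_whitelisted "$key"; then
        echo "$key: whitelisted"
    else
        echo "$key: not whitelisted"
    fi
done
```

Passing this check covers only the first of the three conditions; a whitelisted name can still be rejected later if it is not visible inside the container.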

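2、Namespaced parameters

A privileged container can modify every parameter that is visible inside the container. The ones whose modification does not affect the host are the namespaced parameters, and they roughly coincide with the whitelist. In the following run, the value set inside a privileged container leaves the host value unchanged: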

# sysctl -a | grep fs.mqueue.msg_max

fs.mqueue.msg_max = 10

# docker run --privileged --rm centos /bin/bash -c "sysctl -w fs.mqueue.msg_max=5"

fs.mqueue.msg_max = 5

# sysctl -a | grep fs.mqueue.msg_max

fs.mqueue.msg_max = 10

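3、Whitelisted parameters that are not visible inside the container

The following parameters match the whitelist but are not visible inside the container: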

net.bridge.bridge-nf-call-arptables

net.bridge.bridge-nf-call-ip6tables

net.bridge.bridge-nf-call-iptables

net.bridge.bridge-nf-filter-pppoe-tagged

net.bridge.bridge-nf-filter-vlan-tagged

net.bridge.bridge-nf-pass-vlan-input-dev

net.core.bpf_jit_enable

net.core.bpf_jit_harden

net.core.bpf_jit_kallsyms

net.core.bpf_jit_limit

net.core.busy_poll

net.core.busy_read

net.core.default_qdisc

net.core.dev_weight

net.core.dev_weight_rx_bias

net.core.dev_weight_tx_bias

net.core.fb_tunnels_only_for_init_net

net.core.flow_limit_cpu_bitmap

net.core.flow_limit_table_len

net.core.max_skb_frags

net.core.message_burst

net.core.message_cost

net.core.netdev_budget

net.core.netdev_budget_usecs

net.core.netdev_max_backlog

net.core.netdev_rss_key

net.core.netdev_tstamp_prequeue

net.core.optmem_max

net.core.rmem_default

net.core.rmem_max

net.core.rps_sock_flow_entries

net.core.tstamp_allow_data

net.core.warnings

net.core.wmem_default

net.core.wmem_max

net.ipv4.icmp_msgs_burst

net.ipv4.icmp_msgs_per_sec

net.ipv4.inet_peer_maxttl

net.ipv4.inet_peer_minttl

net.ipv4.inet_peer_threshold

net.ipv4.ipfrag_secret_interval

net.ipv4.route.error_burst

net.ipv4.route.error_cost

net.ipv4.route.gc_elasticity

net.ipv4.route.gc_interval

net.ipv4.route.gc_min_interval

net.ipv4.route.gc_min_interval_ms

net.ipv4.route.gc_thresh

net.ipv4.route.gc_timeout

net.ipv4.route.max_size

net.ipv4.route.min_adv_mss

net.ipv4.route.min_pmtu

net.ipv4.route.mtu_expires

net.ipv4.route.redirect_load

net.ipv4.route.redirect_number

net.ipv4.route.redirect_silence

net.ipv4.tcp_allowed_congestion_control

net.ipv4.tcp_available_congestion_control

net.ipv4.tcp_available_ulp

net.ipv4.tcp_low_latency

net.ipv4.tcp_max_orphans

net.ipv4.tcp_mem

net.ipv4.udp_mem

net.netfilter.nf_log_all_netns

net.nf_conntrack_max

net.unix.max_dgram_qlen

net.netfilter.nf_conntrack_tcp_timeout_close_wait

net.netfilter.nf_conntrack_tcp_timeout_established

net.ipv6.conf.default.disable_ipv6

net.ipv6.conf.all.disable_ipv6

net.netfilter.nf_conntrack_count

4、Commonly used modifiable kernel parameters

net.core.somaxconn

net.ipv4.conf.xxx.proxy_arp_pvlan

net.ipv4.ip_default_ttl

net.ipv4.ip_forward

net.ipv4.tcp_base_mss

net.ipv4.tcp_sack

net.ipv4.tcp_syncookies

net.ipv4.tcp_timestamps

net.ipv4.tcp_tw_reuse

net.ipv4.tcp_window_scaling

net.ipv4.tcp_wmem

net.ipv4.udp_rmem_min

net.ipv4.udp_wmem_min

5、Kernel parameters that containers change by default

Docker changes some kernel parameters by default, which is worth keeping in mind:

net.unix.max_dgram_qlen = 10

net.netfilter.nf_conntrack_tcp_timeout_close_wait = 60

net.netfilter.nf_conntrack_tcp_timeout_established = 432000

net.ipv6.conf.default.disable_ipv6 = 1

net.ipv6.conf.all.disable_ipv6 = 1

net.netfilter.nf_conntrack_count = 0
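One way to find such differences on a given machine is to save the output of sysctl -a on the host and inside a container, then compare the two dumps. The sketch below assumes both files use the usual "key = value" format; the function name sysctl_diff and the file names in the usage note are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list keys whose values differ between two saved
# "sysctl -a" dumps (first argument: host dump, second: container dump).
# Keys that exist in only one of the two files are ignored.
sysctl_diff() {
    awk -F ' = ' '
        # First file: remember the host value of every key.
        NR == FNR { host[$1] = $2; next }
        # Second file: report keys present in both dumps with different values.
        ($1 in host) && host[$1] != $2 {
            printf "%s: host=%s container=%s\n", $1, host[$1], $2
        }' "$1" "$2"
}
```

A possible usage is to run sysctl -a > host.txt on the host and docker run --rm centos sysctl -a > container.txt, then call sysctl_diff host.txt container.txt. Parameters that are not visible inside the container do not appear in the second dump and are therefore not reported.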

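6、Modifying kernel parameters in Docker

If the container uses the host's IPC or network namespace (for example --ipc=host or --net=host), the corresponding parameters cannot be modified.

A privileged container can modify every visible parameter, which may affect the host and other containers.

In an unprivileged container, kernel parameters are exposed through a read-only file system, so any attempt to modify one from inside the container fails: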

# sysctl -w net.ipv6.icmp.ratelimit=500

sysctl: setting key "net.ipv6.icmp.ratelimit": Read-only file system
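Whether a container will hit this error can be checked without attempting a write, by looking at how /proc/sys is mounted. The sketch below assumes the container runtime exposes /proc/sys as its own entry in the mounts table (as the read-only error above suggests); the function name proc_sys_mode is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print "ro" or "rw" for the /proc/sys mount found in a
# mounts table (default: /proc/mounts). Prints nothing if /proc/sys has no
# mount entry of its own. If several entries exist, the last one wins.
proc_sys_mode() {
    awk '$2 == "/proc/sys" {
             mode = "rw"
             n = split($4, opts, ",")
             for (i = 1; i <= n; i++) if (opts[i] == "ro") mode = "ro"
         }
         END { if (mode != "") print mode }' "${1:-/proc/mounts}"
}
```

Running proc_sys_mode inside the container and getting "ro" indicates that sysctl -w will fail there, and that the parameter has to be set from outside, for example with docker run --sysctl.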

Passing a parameter that is not on the whitelist to docker run fails with a "not whitelisted" error:

# docker run -it --sysctl vm.swappiness=10 centos /bin/bash

invalid argument "vm.swappiness=10" for "--sysctl" flag: sysctl 'vm.swappiness=10' is not whitelisted

See 'docker run --help'.

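Passing a parameter that is not visible inside the container to docker run makes container creation fail with "no such file or directory":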

# docker run -it --sysctl net.core.busy_poll=1 centos /bin/bash

docker: Error response from daemon: OCI runtime create failed: container_linux.go:345: starting container process caused "process_linux.go:430: container init caused \"write sysctl key net.core.busy_poll: open /proc/sys/net/core/busy_poll: no such file or directory\"": unknown.

ERRO[0000] error waiting for container: context canceled
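The error message shows that the runtime writes the value to a file under /proc/sys whose path is derived from the parameter name. A minimal sketch of that mapping, and of a visibility check built on it, follows. The function names are hypothetical, and the mapping assumes that no component of the name itself contains a dot (an interface name such as a VLAN device could).

```shell
#!/bin/sh
# Hypothetical helpers; not part of docker or sysctl.

# Map a dotted sysctl key to its file path under a procfs root
# (second argument, default /proc/sys).
sysctl_path() {
    printf '%s/%s\n' "${2:-/proc/sys}" "$(printf '%s' "$1" | tr '.' '/')"
}

# Succeed if the key is visible, i.e. its file exists under the given root.
sysctl_visible() {
    [ -e "$(sysctl_path "$1" "$2")" ]
}
```

Run inside a throwaway container, sysctl_visible net.core.busy_poll would be expected to fail in the environment shown above, which predicts the error before the parameter is passed to --sysctl.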

Setting a modifiable parameter with docker run, here net.ipv4.ip_default_ttl=32. The TTL in the ping reply drops from 64 to 32:

# sysctl -a | grep net.ipv4.ip_default_ttl

net.ipv4.ip_default_ttl = 64

# docker run --rm centos /bin/bash -c "ping -c 1 172.17.0.2"

PING 172.17.0.2 (172.17.0.2) 56(84) bytes of data.

64 bytes from 172.17.0.2: icmp_seq=1 ttl=64 time=0.279 ms

--- 172.17.0.2 ping statistics ---

1 packets transmitted, 1 received, 0% packet loss, time 0ms

rtt min/avg/max/mdev = 0.279/0.279/0.279/0.000 ms

# docker run --sysctl net.ipv4.ip_default_ttl=32 --rm centos /bin/bash -c "ping -c 1 172.17.0.2"

PING 172.17.0.2 (172.17.0.2) 56(84) bytes of data.

64 bytes from 172.17.0.2: icmp_seq=1 ttl=32 time=0.033 ms

--- 172.17.0.2 ping statistics ---

1 packets transmitted, 1 received, 0% packet loss, time 0ms

rtt min/avg/max/mdev = 0.033/0.033/0.033/0.000 ms

Enabling IPv6 by adjusting kernel parameters:

# docker run --rm centos ip a

1: lo: mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000

link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00

inet 127.0.0.1/8 scope host lo

valid_lft forever preferred_lft forever

2: tunl0@NONE: mtu 1480 qdisc noop state DOWN group default qlen 1000

link/ipip 0.0.0.0 brd 0.0.0.0

148: eth0@if149: mtu 1500 qdisc noqueue state UP group default

link/ether 02:42:ac:11:00:02 brd ff:ff:ff:ff:ff:ff link-netnsid 0

inet 172.17.0.2/16 brd 172.17.255.255 scope global eth0

valid_lft forever preferred_lft forever

# docker run --sysctl net.ipv6.conf.default.disable_ipv6=0 --sysctl net.ipv6.conf.all.disable_ipv6=0 --rm centos ip a

1: lo: mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000

link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00

inet 127.0.0.1/8 scope host lo

valid_lft forever preferred_lft forever

inet6 ::1/128 scope host

valid_lft forever preferred_lft forever

2: tunl0@NONE: mtu 1480 qdisc noop state DOWN group default qlen 1000

link/ipip 0.0.0.0 brd 0.0.0.0

150: eth0@if151: mtu 1500 qdisc noqueue state UP group default

link/ether 02:42:ac:11:00:02 brd ff:ff:ff:ff:ff:ff link-netnsid 0

inet 172.17.0.2/16 brd 172.17.255.255 scope global eth0

valid_lft forever preferred_lft forever

inet6 fe80::42:acff:fe11:2/64 scope link tentative

valid_lft forever preferred_lft forever

7、Modifying kernel parameters in Kubernetes

Inspect the relevant field:

# kubectl explain pod.spec.securityContext.sysctls

KIND: Pod

VERSION: v1

RESOURCE: sysctls

DESCRIPTION:

Sysctls hold a list of namespaced sysctls used for the pod. Pods with

unsupported sysctls (by the container runtime) might fail to launch.

Sysctl defines a kernel parameter to be set

FIELDS:

name -required-

Name of a property to set

value -required-

Value of a property to set

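A parameter that can be modified in Docker cannot necessarily be modified in Kubernetes. By default, Kubernetes treats only the following three parameters as safe: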

kernel.shm_rmid_forced

net.ipv4.ip_local_port_range

net.ipv4.tcp_syncookies

To use other kernel parameters, they must be allowed explicitly in the kubelet arguments:

# cat /etc/sysconfig/kubelet

KUBELET_EXTRA_ARGS="--allowed-unsafe-sysctls=net.ipv6.conf.*"
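The combined effect of the safe list and the --allowed-unsafe-sysctls value can be sketched as a small check. This is a simplified illustration, not the kubelet's own logic: the function name k8s_sysctl_allowed is hypothetical, the safe list is the three names given above, and allow-list entries are treated as shell patterns, whereas the kubelet is only assumed here to accept exact names and names ending in "*".

```shell
#!/bin/sh
# Hypothetical helper: succeed if a sysctl name is one of the three safe
# names listed above, or matches an entry of a comma-separated allow list
# written in the style of --allowed-unsafe-sysctls.
# The body runs in a subshell so the IFS and "set -f" changes stay local.
k8s_sysctl_allowed() (
    name=$1
    case "$name" in
        kernel.shm_rmid_forced|net.ipv4.ip_local_port_range|net.ipv4.tcp_syncookies)
            exit 0 ;;
    esac
    IFS=,
    set -f    # keep "*" in allow-list entries from expanding to file names
    for pattern in $2; do
        case "$name" in
            $pattern) exit 0 ;;    # unquoted on purpose: entry acts as a pattern
        esac
    done
    exit 1
)
```

For example, k8s_sysctl_allowed net.ipv6.conf.all.disable_ipv6 "net.ipv6.conf.*" succeeds, while the same name with an empty allow list fails because it is not one of the three safe parameters.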

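Enable the IPv6 kernel parameters in the pod. Note that sysctls belongs to the pod-level securityContext (pod.spec.securityContext), not to the container: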

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    run: test1
  name: test1
  namespace: default
spec:
  selector:
    matchLabels:
      run: test1
  template:
    metadata:
      labels:
        run: test1
    spec:
      containers:
      - args:
        - "/bin/sh"
        - "-c"
        - "sleep 120"
        image: centos:latest
        name: test1
      securityContext:
        sysctls:
        - name: net.ipv6.conf.default.disable_ipv6
          value: '0'
        - name: net.ipv6.conf.all.disable_ipv6
          value: '0'

Check the IPv6 address:

# kubectl exec test1-66ccc967c5-7xrpc ip a

1: lo: mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000

link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00

inet 127.0.0.1/8 scope host lo

valid_lft forever preferred_lft forever

2: tunl0@NONE: mtu 1480 qdisc noop state DOWN group default qlen 1000

link/ipip 0.0.0.0 brd 0.0.0.0

4: eth0@if154: mtu 1440 qdisc noqueue state UP group default

link/ether 66:cf:30:b8:22:71 brd ff:ff:ff:ff:ff:ff link-netnsid 0

inet 172.16.137.96/32 scope global eth0

valid_lft forever preferred_lft forever

inet6 fe80::64cf:30ff:feb8:2271/64 scope link

valid_lft forever preferred_lft forever
