Linux redis故障,linux – Redis Stall故障排除

我们在服务器上运行了几个redis实例.还有多个Web层服务器连接到那些同时遇到停顿的实例.

我们当时正在进行数据包捕获,确定在TX和RX流量中存在停顿,如下面的wireshark IO图:

redis调用有一个相关的峰值,但我怀疑这是一个影响,而不是由于时滞造成的原因:

使用15 / s采样间隔(这是作为计数器收集),平均有136个内存分配停顿:

似乎同时迁移的NUMA页面数量异乎寻常:

虽然上面看起来很正常,但是有两个连续的数据点使得它与图中看到的其他300个峰值相比变得异常.

内存压缩失败和压缩失速也有相关的峰值:

虽然我在这里有丰富的内存信息,但我的Linux内存知识还不够深入,无法真正假设一个好的故事,将所有这些信息汇总在一起来解释失速.任何具有深入Linux内存知识(也许是深度内存知识)的人都可以将这些信息结合在一起吗?

我们以15秒的间隔从/ proc / vmstat收集所有统计信息,因此如果您认为可能会添加任何数据,请执行此操作.我刚刚选择了似乎有趣的活动,特别是alloc stall,numa迁移和压缩停顿/失败.总计如下,并涵盖20天的正常运行时间:

[kbrandt@ny-redis01: ~] uptime

21:11:49 up 20 days,20:05,8 users,load average: 1.05,0.74,0.69

[kbrandt@ny-redis01: ~] cat /proc/vmstat

nr_free_pages 105382

nr_alloc_batch 5632

nr_inactive_anon 983455

nr_active_anon 15870487

nr_inactive_file 12904618

nr_active_file 2266184

nr_unevictable 0

nr_mlock 0

nr_anon_pages 16361259

nr_mapped 26329

nr_file_pages 15667318

nr_dirty 48588

nr_writeback 0

nr_slab_reclaimable 473720

nr_slab_unreclaimable 37147

nr_page_table_pages 38701

nr_kernel_stack 987

nr_unstable 0

nr_bounce 0

nr_vmscan_write 356302

nr_vmscan_immediate_reclaim 174305

nr_writeback_temp 0

nr_isolated_anon 0

nr_isolated_file 32

nr_shmem 423906

nr_dirtied 3071978326

nr_written 3069010459

numa_hit 1825289996

numa_miss 3360625955

numa_foreign 3360626253

numa_interleave 64798

numa_local 1856473774

numa_other 3329442177

workingset_refault 297175

workingset_activate 24923

workingset_nodereclaim 0

nr_anon_transparent_hugepages 41

nr_free_cma 0

nr_dirty_threshold 3030688

nr_dirty_background_threshold 1515344

pgpgin 25709012

pgpgout 12284206511

pswpin 143954

pswpout 341570

pgalloc_dma 430

pgalloc_dma32 498407404

pgalloc_normal 8131576449

pgalloc_movable 0

pgfree 8639210186

pgactivate 12022290

pgdeactivate 14512106

pgfault 61444049878

pgmajfault 23740

pgrefill_dma 0

pgrefill_dma32 1084722

pgrefill_normal 13419119

pgrefill_movable 0

pgsteal_kswapd_dma 0

pgsteal_kswapd_dma32 11991303

pgsteal_kswapd_normal 1051781383

pgsteal_kswapd_movable 0

pgsteal_direct_dma 0

pgsteal_direct_dma32 58737

pgsteal_direct_normal 36277968

pgsteal_direct_movable 0

pgscan_kswapd_dma 0

pgscan_kswapd_dma32 13416911

pgscan_kswapd_normal 1053143529

pgscan_kswapd_movable 0

pgscan_direct_dma 0

pgscan_direct_dma32 58926

pgscan_direct_normal 36291030

pgscan_direct_movable 0

pgscan_direct_throttle 0

zone_reclaim_Failed 0

pginodesteal 0

slabs_scanned 1812992

kswapd_inodesteal 5096998

kswapd_low_wmark_hit_quickly 8600243

kswapd_high_wmark_hit_quickly 5068337

pageoutrun 14095945

allocstall 567491

pgrotated 971171

drop_pagecache 8

drop_slab 0

numa_pte_updates 58218081649

numa_huge_pte_updates 416664

numa_hint_faults 57988385456

numa_hint_faults_local 57286615202

numa_pages_migrated 39923112

pgmigrate_success 48662606

pgmigrate_fail 2670596

compact_migrate_scanned 29140124

compact_free_scanned 28320190101

compact_isolated 21473591

compact_stall 57784

compact_fail 37819

compact_success 19965

htlb_buddy_alloc_success 0

htlb_buddy_alloc_fail 0

unevictable_pgs_culled 5528

unevictable_pgs_scanned 0

unevictable_pgs_rescued 18567

unevictable_pgs_mlocked 20909

unevictable_pgs_munlocked 20909

unevictable_pgs_cleared 0

unevictable_pgs_stranded 0

thp_fault_alloc 11613

thp_fault_fallback 53

thp_collapse_alloc 3

thp_collapse_alloc_Failed 0

thp_split 9804

thp_zero_page_alloc 1

thp_zero_page_alloc_Failed 0

还有所有/ proc / sys / vm / *设置如果有帮助:

***/proc/sys/vm/admin_reserve_kbytes***

8192

***/proc/sys/vm/block_dump***

0

***/proc/sys/vm/dirty_background_bytes***

0

***/proc/sys/vm/dirty_background_ratio***

10

***/proc/sys/vm/dirty_bytes***

0

***/proc/sys/vm/dirty_expire_centisecs***

3000

***/proc/sys/vm/dirty_ratio***

20

***/proc/sys/vm/dirty_writeback_centisecs***

500

***/proc/sys/vm/drop_caches***

1

***/proc/sys/vm/extfrag_threshold***

500

***/proc/sys/vm/hugepages_treat_as_movable***

0

***/proc/sys/vm/hugetlb_shm_group***

0

***/proc/sys/vm/laptop_mode***

0

***/proc/sys/vm/legacy_va_layout***

0

***/proc/sys/vm/lowmem_reserve_ratio***

256 256 32

***/proc/sys/vm/max_map_count***

65530

***/proc/sys/vm/memory_failure_early_kill***

0

***/proc/sys/vm/memory_failure_recovery***

1

***/proc/sys/vm/min_free_kbytes***

90112

***/proc/sys/vm/min_slab_ratio***

5

***/proc/sys/vm/min_unmapped_ratio***

1

***/proc/sys/vm/mmap_min_addr***

4096

***/proc/sys/vm/nr_hugepages***

0

***/proc/sys/vm/nr_hugepages_mempolicy***

0

***/proc/sys/vm/nr_overcommit_hugepages***

0

***/proc/sys/vm/nr_pdflush_threads***

0

***/proc/sys/vm/numa_zonelist_order***

default

***/proc/sys/vm/oom_dump_tasks***

1

***/proc/sys/vm/oom_kill_allocating_task***

0

***/proc/sys/vm/overcommit_kbytes***

0

***/proc/sys/vm/overcommit_memory***

1

***/proc/sys/vm/overcommit_ratio***

50

***/proc/sys/vm/page-cluster***

3

***/proc/sys/vm/panic_on_oom***

0

***/proc/sys/vm/percpu_pagelist_fraction***

0

***/proc/sys/vm/scan_unevictable_pages***

0

***/proc/sys/vm/stat_interval***

1

***/proc/sys/vm/swappiness***

60

***/proc/sys/vm/user_reserve_kbytes***

131072

***/proc/sys/vm/vfs_cache_pressure***

100

***/proc/sys/vm/zone_reclaim_mode***

0

更新:

有一个thp_split在时间上接近:

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值