Greetings! The logs contain many messages like these:
Oct 10 19:05:52 ns kernel: BUG: soft lockup - CPU#0 stuck for 10s! [swapper:0]
Oct 10 19:05:53 ns kernel: BUG: soft lockup - CPU#3 stuck for 10s! [swapper:0]
Oct 10 19:05:53 ns kernel: BUG: soft lockup - CPU#2 stuck for 10s! [dd:29344]
Oct 10 19:05:54 ns kernel: BUG: soft lockup - CPU#0 stuck for 10s! [swapper:0]
Oct 10 19:05:55 ns kernel: BUG: soft lockup - CPU#3 stuck for 10s! [swapper:0]
Oct 10 19:05:55 ns kernel: BUG: soft lockup - CPU#2 stuck for 10s! [dd:29344]
Oct 10 19:05:56 ns kernel: BUG: soft lockup - CPU#0 stuck for 10s! [swapper:0]
Oct 10 19:05:56 ns kernel: BUG: soft lockup - CPU#0 stuck for 10s! [swapper:0]
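To see at a glance which CPUs and tasks the reports hit, the messages can be tallied. This is only a minimal sketch: it assumes the syslog line format shown above, and the names `LOCKUP_RE` and `tally_soft_lockups` are made up for illustration.

```python
import re
from collections import Counter

# Matches lines such as:
#   "... kernel: BUG: soft lockup - CPU#2 stuck for 10s! [dd:29344]"
# Groups: CPU number, stuck duration in seconds, command name, PID.
LOCKUP_RE = re.compile(
    r"BUG: soft lockup - CPU#(\d+) stuck for (\d+)s! \[(.+):(\d+)\]"
)


def tally_soft_lockups(lines):
    """Count soft lockup reports per (cpu, command) pair.

    `lines` is any iterable of log lines, e.g. an open file object.
    Lines that are not soft lockup reports are ignored.
    """
    counts = Counter()
    for line in lines:
        match = LOCKUP_RE.search(line)
        if match:
            cpu = int(match.group(1))
            command = match.group(3)
            counts[(cpu, command)] += 1
    return counts
```

It could be fed an open log file (for example `tally_soft_lockups(open("/var/log/messages"))`, where the path is an assumption about where syslog writes kernel messages on this system).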
The full trace for swapper:
Oct 10 19:05:53 ns kernel: BUG: soft lockup - CPU#3 stuck for 10s! [swapper:0]
Oct 10 19:05:53 ns kernel: CPU 3:
Oct 10 19:05:53 ns kernel: Modules linked in: ip_conntrack_netbios_ns xt_state ip_conntrack nfnetlink iptable_filter ip_tables ipt_REJECT ip6t_REJECT xt_tcpudp ip6table_filter ip6_tables x_tables ipv6 xfrm_nalgo crypto_api dm_mirror dm_multipath scsi_dh video hwmon backlight sbs i2c_ec button battery asus_acpi acpi_memhotplug ac parport_pc lp parport floppy sg shpchp e1000e e1000 i2c_i801 i2c_core serio_raw pcspkr dm_raid45 dm_message dm_region_hash dm_log dm_mod dm_mem_cache usb_storage ata_piix libata sd_mod scsi_mod raid1 ext3 jbd uhci_hcd ohci_hcd ehci_hcd
Oct 10 19:05:53 ns kernel: Pid: 0, comm: swapper Not tainted 2.6.18-164.el5 #1
Oct 10 19:05:53 ns kernel: RIP: 0010:[<ffffffff80064bcc>] [<ffffffff80064bcc>] .text.lock.spinlock+0x2/0x30
Oct 10 19:05:53 ns kernel: RSP: 0018:ffff81011fcbff08 EFLAGS: 00000282
Oct 10 19:05:53 ns kernel: RAX: 0000000000091188 RBX: ffffffff8030c080 RCX: ffff81011fcbff30
Oct 10 19:05:53 ns kernel: RDX: ffff8100433f72c0 RSI: ffff81000101f560 RDI: ffffffff8030c100
Oct 10 19:05:53 ns kernel: RBP: ffff81011fcbfe80 R08: 00000000340cf100 R09: ffff810080beb200
Oct 10 19:05:53 ns kernel: R10: ffff81011fcbff98 R11: 0000000000000202 R12: ffffffff8005dc8e
Oct 10 19:05:53 ns kernel: R13: ffff81000101f560 R14: ffffffff80077717 R15: ffff81011fcbfe80
Oct 10 19:05:53 ns kernel: FS: 0000000000000000(0000) GS:ffff81011fc556c0(0000) knlGS:0000000000000000
Oct 10 19:05:53 ns kernel: CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
Oct 10 19:05:53 ns kernel: CR2: 00002b050954e000 CR3: 0000000112869000 CR4: 00000000000006e0
Oct 10 19:05:53 ns kernel:
Oct 10 19:05:53 ns kernel: Call Trace:
Oct 10 19:05:53 ns kernel: <IRQ> [<ffffffff8009d704>] __rcu_process_callbacks+0xe4/0x1a1
Oct 10 19:05:53 ns kernel: [<ffffffff8009d7e4>] rcu_process_callbacks+0x23/0x43
Oct 10 19:05:53 ns kernel: [<ffffffff80093ebb>] tasklet_action+0x89/0xfd
Oct 10 19:05:53 ns kernel: [<ffffffff8001235a>] __do_softirq+0x89/0x133
Oct 10 19:05:53 ns kernel: [<ffffffff8005e2fc>] call_softirq+0x1c/0x28
Oct 10 19:05:53 ns kernel: [<ffffffff8006cb14>] do_softirq+0x2c/0x85
Oct 10 19:05:53 ns kernel: [<ffffffff8006b2cc>] default_idle+0x0/0x50
Oct 10 19:05:53 ns kernel: [<ffffffff8005dc8e>] apic_timer_interrupt+0x66/0x6c
Oct 10 19:05:53 ns kernel: <EOI> [<ffffffff8006b2f5>] default_idle+0x29/0x50
Oct 10 19:05:53 ns kernel: [<ffffffff8004939e>] cpu_idle+0x95/0xb8
Oct 10 19:05:53 ns kernel: [<ffffffff80076e23>] start_secondary+0x45a/0x469
and for dd:
Oct 10 19:05:53 ns kernel: BUG: soft lockup - CPU#2 stuck for 10s! [dd:29344]
Oct 10 19:05:53 ns kernel: CPU 2:
Oct 10 19:05:53 ns kernel: Modules linked in: ip_conntrack_netbios_ns xt_state ip_conntrack nfnetlink iptable_filter ip_tables ipt_REJECT ip6t_REJECT xt_tcpudp ip6table_filter ip6_tables x_tables ipv6 xfrm_nalgo crypto_api dm_mirror dm_multipath scsi_dh video hwmon backlight sbs i2c_ec button battery asus_acpi acpi_memhotplug ac parport_pc lp parport floppy sg shpchp e1000e e1000 i2c_i801 i2c_core serio_raw pcspkr dm_raid45 dm_message dm_region_hash dm_log dm_mod dm_mem_cache usb_storage ata_piix libata sd_mod scsi_mod raid1 ext3 jbd uhci_hcd ohci_hcd ehci_hcd
Oct 10 19:05:53 ns kernel: Pid: 29344, comm: dd Not tainted 2.6.18-164.el5 #1
Oct 10 19:05:53 ns kernel: RIP: 0010:[<ffffffff80064bcf>] [<ffffffff80064bcf>] .text.lock.spinlock+0x5/0x30
Oct 10 19:05:53 ns kernel: RSP: 0018:ffff81011fc8bf08 EFLAGS: 00000282
Oct 10 19:05:54 ns kernel: RAX: 0000000000091189 RBX: ffffffff8030c080 RCX: ffff81011fc8bf30
Oct 10 19:05:54 ns kernel: RDX: ffff81002d4b5ec0 RSI: ffff810001016f60 RDI: ffffffff8030c100
Oct 10 19:05:54 ns kernel: RBP: ffff81011fc8be80 R08: 00000000340cf100 R09: ffff810080be2c00
Oct 10 19:05:54 ns kernel: R10: ffff81011fc8bf98 R11: 00000000c4fcaf60 R12: ffffffff8005dc8e
Oct 10 19:05:54 ns kernel: R13: ffff810001016f60 R14: ffffffff80077717 R15: ffff81011fc8be80
Oct 10 19:05:54 ns kernel: FS: 00002b2e005a6f10(0000) GS:ffff81011fc55ec0(0000) knlGS:0000000000000000
Oct 10 19:05:54 ns kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Oct 10 19:05:54 ns kernel: CR2: 00002b27a074e0a0 CR3: 0000000111b21000 CR4: 00000000000006e0
Oct 10 19:05:54 ns kernel:
Oct 10 19:05:54 ns kernel: Call Trace:
Oct 10 19:05:54 ns kernel: <IRQ> [<ffffffff8009d6ac>] __rcu_process_callbacks+0x8c/0x1a1
Oct 10 19:05:54 ns kernel: [<ffffffff8009d7e4>] rcu_process_callbacks+0x23/0x43
Oct 10 19:05:54 ns kernel: [<ffffffff80093ebb>] tasklet_action+0x89/0xfd
Oct 10 19:05:54 ns kernel: [<ffffffff8001235a>] __do_softirq+0x89/0x133
Oct 10 19:05:54 ns kernel: [<ffffffff8005e2fc>] call_softirq+0x1c/0x28
Oct 10 19:05:54 ns kernel: [<ffffffff8006cb14>] do_softirq+0x2c/0x85
Oct 10 19:05:54 ns kernel: [<ffffffff8005dc8e>] apic_timer_interrupt+0x66/0x6c
Oct 10 19:05:54 ns kernel: <EOI> [<ffffffff80038d1a>] sha_transform+0x180/0x1ef
Oct 10 19:05:54 ns kernel: [<ffffffff801a05b6>] extract_buf+0x3b/0xef
Oct 10 19:05:54 ns kernel: [<ffffffff801a0ade>] extract_entropy_user+0x7b/0xd2
Oct 10 19:05:54 ns kernel: [<ffffffff8000b695>] vfs_read+0xcb/0x171
Oct 10 19:05:54 ns kernel: [<ffffffff80011b72>] sys_read+0x45/0x6e
Oct 10 19:05:54 ns kernel: [<ffffffff8005d28d>] tracesys+0xd5/0xe0
I have not run into this before, which is why I am asking: what should I do about it? Is this a software problem or a hardware one?
++++
At these moments the server hangs and does not respond to anything...
Around or at those same moments, atop shows one of the CPUs at 100% load, with all 100% attributed to IRQ.
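Both traces above are inside `__do_softirq`, so it may help to check whether that 100% is hardware interrupt time or softirq time. Below is a rough sketch that compares two samples of `/proc/stat` per CPU. It assumes the usual column order (user, nice, system, idle, iowait, irq, softirq, steal); the helper names `parse_proc_stat`, `irq_share` and `sample` are hypothetical.

```python
import time

# Assumed column order of the per-CPU lines in /proc/stat.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_proc_stat(text):
    """Return {cpu_name: {field: jiffies}} for the per-CPU lines of /proc/stat."""
    result = {}
    for line in text.splitlines():
        parts = line.split()
        # Keep "cpu0", "cpu1", ...; skip the aggregate "cpu" line and other rows.
        if not parts or not parts[0].startswith("cpu") or parts[0] == "cpu":
            continue
        values = [int(v) for v in parts[1:1 + len(FIELDS)]]
        result[parts[0]] = dict(zip(FIELDS, values))
    return result


def irq_share(before, after):
    """Percentage of elapsed jiffies spent in irq and softirq, per CPU."""
    shares = {}
    for cpu, new in after.items():
        old = before.get(cpu)
        if old is None:
            continue
        delta = {key: new[key] - old[key] for key in new if key in old}
        total = sum(delta.values())
        if total <= 0:
            continue  # no time elapsed between the samples for this CPU
        shares[cpu] = {
            "irq": 100.0 * delta.get("irq", 0) / total,
            "softirq": 100.0 * delta.get("softirq", 0) / total,
        }
    return shares


def sample(interval=1.0, path="/proc/stat"):
    """Read /proc/stat twice, `interval` seconds apart, and compare."""
    with open(path) as fh:
        before = parse_proc_stat(fh.read())
    time.sleep(interval)
    with open(path) as fh:
        after = parse_proc_stat(fh.read())
    return irq_share(before, after)
```

Calling `sample()` in a loop while the problem is happening would show which CPU is affected and whether the time lands in the `irq` or the `softirq` column. During a complete hang the script itself may of course not get scheduled.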