I blew up the dragonflybsd.org domain when I upgraded the box running the DNS. The new version of bind disallows certain constructions (domain names with underscores), and as per normal stupidity it decided to stop serving the entire file.
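A rough pre-upgrade check for this kind of breakage might look like the sketch below. It assumes the rejected names are owner names containing a label with characters outside the classic host name set (letters, digits, hyphens); the function name and the sample zone are hypothetical, and the real name server's rules may differ (wildcards and service-style names such as SRV owners are flagged here too).

```python
import re

# Host name label per RFC 952/1123: letters, digits, hyphens,
# with no leading or trailing hyphen. Underscores do not match.
_LABEL = re.compile(r"^[A-Za-z0-9]([A-Za-z0-9-]*[A-Za-z0-9])?$")


def bad_owner_names(zone_text):
    """Return owner names that contain a label failing the host name rules.

    Hypothetical helper: it only looks at the first field of lines that
    start in column one, and ignores directives ($TTL, $ORIGIN) and "@".
    """
    bad = []
    for line in zone_text.splitlines():
        line = line.split(";", 1)[0]          # drop zone-file comments
        if not line or line[0].isspace():     # blank, or inherits previous owner
            continue
        owner = line.split()[0]
        if owner.startswith("$") or owner == "@":
            continue
        labels = owner.rstrip(".").split(".")
        if any(not _LABEL.match(label) for label in labels):
            bad.append(owner)
    return bad


# Made-up zone data for illustration only.
sample_zone = """\
$TTL 3600
@         IN SOA ns1.example.org. admin.example.org. 1 3600 900 604800 3600
www       IN A 192.0.2.1
build_box IN A 192.0.2.2
"""

flagged = bad_owner_names(sample_zone)
```

Running something like this over a zone file before upgrading would list the names worth renaming, rather than discovering them when the whole zone stops loading.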
# The 'NATA' set of drivers are set to replace the previous ATA drivers,
# and this set of drivers is mutually exclusive with the old ones. This means,
# you can't have both at the same time!
panic: vm_page_dirty: page in free/cache queue!
mp_lock = 00000000; cpuid = 0; lapic.id = 00000000
Trace beginning at frame 0xd4bbe9f8
panic(c04f8288,0,c0531b18,d4bbea28,18981463) at panic+0x17f
panic(c0531b18,c05de3c0,d4681000,c049f277,c0f0cc00) at panic+0x17f
pmap_remove_pte(c05de3c0,bff51a04,d4681000,d4bbea80,0) at pmap_remove_pte+0xb3
pmap_enter(c05de3c0,d4681000,c0ae3600,7,1) at pmap_enter+0x245
vm_fault(c05c2320,d4681000,3,1,0) at vm_fault+0x202
vm_fault_wire(c05c2320,d3618cc0,0,d4bbeb84,1) at vm_fault_wire+0xaf
vm_map_wire(c05c2320,d4680000,d4683000,0,0) at vm_map_wire+0x19d
kmem_alloc3(c05c2320,3000,0,d42cec00,c049fb23) at kmem_alloc3+0x206
lwkt_alloc_thread(0,3000,ffffffff,0,0) at lwkt_alloc_thread+0x120
lwp_fork(c9b06700,d37eada0,40000014,d4bbecb4,c9b06700) at lwp_fork+0x16f
fork1(c9b06700,40000014,d4bbec9c,50,0) at fork1+0x600
sys_fork(d4bbece8,0,0,0,1) at sys_fork+0x35
syscall2(d4bbed40) at syscall2+0x294
Xint0x80_syscall() at Xint0x80_syscall+0x35
Debugger("panic")
panic: vm_page_dirty: page in free/cache queue!
mp_lock = 00000001; cpuid = 1; lapic.id = 01000000
Trace beginning at frame 0xd48fa93c
panic(c04f83a8,1000000,c0531c5c,d48fa96c,2d6f463) at panic+0x17f
panic(c0531c5c,c05de520,d442f000,c049f377,c08ee6f0) at panic+0x17f
pmap_remove_pte(c05de520,bff510bc,d442f000,d48fa9c4,0) at pmap_remove_pte+0xb3
pmap_enter(c05de520,d442f000,c0ba6cf0,7,1) at pmap_enter+0x245
kmem_slab_alloc(1000,1000,2,c02d21e9,c185e4b0) at kmem_slab_alloc+0x44f
kmalloc(1000,c0566400,2,40,c9b05900) at kmalloc+0x27b
mmrw(c05a06e0,d48fac88,20000,d48fab10,c0261564) at mmrw+0x39d
mmread(d48faafc,c05626f8,c05a06e0,d48fac88,20000) at mmread+0x22
dev_dread(c05a06e0,d48fac88,20000,c05a06e0,d48fab68) at dev_dread+0x2c
spec_read(d48fab68,20,d3f556d0,d48fabc0,d48fac88) at spec_read+0x52
ufsspec_read(d48fab68,d48fab98,c02d24be,d48fab68,c056e7b0) at ufsspec_read+0x28
ufs_vnoperatespec(d48fab68,c056e7b0,c185df10,0,0) at ufs_vnoperatespec+0x16
vop_read(c185df10,d3f555d0,d48fac88,20000,c1705af8) at vop_read+0x34
vn_read(d2d03f30,d48fac88,c1705af8,0,c049dc92) at vn_read+0x16e
dofileread(3,d2d03f30,d48fac88,0,d48face8) at dofileread+0xc5
kern_preadv(3,d48fac88,0,d48face8,8079820,20,d48fac80,1,0,0,20,0,0,c9b0a200) at kern_preadv+0xa6
sys_read(d48face8,d48facf8,c,0,1) at sys_read+0x75
syscall2(d48fad40) at syscall2+0x294
Xint0x80_syscall() at Xint0x80_syscall+0x35
Debugger("panic")
CPU1 stopping CPUs: 0x00000001
-- Hmm, so is the hardware bad after all?
431 Name: 426 mailto:sage [2007/04/02 (Mon) 23:37:54]
panic: vm_page_dirty: page in free/cache queue!
mp_lock = 00000000; cpuid = 0; lapic.id = 00000000
Trace beginning at frame 0xd2c319f8
panic(c04f83a8,0,c0531c5c,d2c31a28,1a0c3463) at panic+0x17f
panic(c0531c5c,c05de520,d2e43000,c049f377,c0f75690) at panic+0x17f
pmap_remove_pte(c05de520,bff4b90c,d2e43000,d2c31a80,0) at pmap_remove_pte+0xb3
pmap_enter(c05de520,d2e43000,c1010a90,7,1) at pmap_enter+0x245
vm_fault(c05c2480,d2e43000,3,1,100100) at vm_fault+0x202
vm_fault_wire(c05c2480,d3361d00,0,d2c31b84,1) at vm_fault_wire+0xaf
vm_map_wire(c05c2480,d2e43000,d2e46000,0,0) at vm_map_wire+0x19d
kmem_alloc3(c05c2480,3000,0,c049fc23,c049fc23) at kmem_alloc3+0x206
lwkt_alloc_thread(0,3000,ffffffff,0,0) at lwkt_alloc_thread+0x120
lwp_fork(c9b07400,d3de4620,c0000034,c0269d98,c9b07400) at lwp_fork+0x16f
fork1(c9b07400,c0000034,d2c31c9c,0,28082100) at fork1+0x600
sys_vfork(d2c31ce8,d2c31d40,1ab3f,0,1) at sys_vfork+0x35
syscall2(d2c31d40) at syscall2+0x294
Xint0x80_syscall() at Xint0x80_syscall+0x35
Debugger("panic")
$ cd /path/to/sys/config
$ config -rd ~/kern ~/MYKERNEL-WITH-DEBUG
$ cd ~/kern
$ make -s kernel-depend && make -sj3 kernel.debug
$ su
# cp kernel.debug /test-kernel
# reboot
(boot loader)
OK unload
OK load /test-kernel
OK boot
Or so I thought, but then on a UP (GENERIC) kernel, env MAKEOBJDIRPREFIX=/home/USER/obj make -j4 buildworld gave:
panic: vm_page_dirty: page in free/cache queue!
Trace beginning at frame 0xd4208948
panic(c0559c8f,c0624d80,c059ca24,d4208978,1c072563) at panic+0x99
panic(c059ca24,c0652d20,d4332000,d376a900,c1003fc8) at panic+0x99
pmap_remove_pte(c0652d20,bff50cc8,d4332000,d42089d0,0) at pmap_remove_pte+0xb3
pmap_enter(c0652d20,d4332000,c08bbdc8,7,1) at pmap_enter+0x245
kmem_slab_alloc(1000,1000,2,c185da90,d4208a70) at kmem_slab_alloc+0x421
kmalloc(1000,c05d5220,2,d4208ab4,c15bb800) at kmalloc+0x27b
mmrw(c0622860,d4208c8c,20000,d4208b1c,c02b97d8) at mmrw+0x39d
mmread(d4208b08,c05d1518,c0622860,d4208c8c,20000) at mmread+0x22
dev_dread(c0622860,d4208c8c,20000,c0622860,d4208b74) at dev_dread+0x2c
spec_read(d4208b74,20,d2d42170,d4208bc4,20000) at spec_read+0x52
ufsspec_read(d4208b74,d4208ba4,c032621a,d4208b74,c05dd230) at ufsspec_read+0x28
ufs_vnoperatespec(d4208b74,c05dd230,c185d4f0,0,0) at ufs_vnoperatespec+0x16
vop_read(c185d4f0,d2d42070,d4208c8c,20000,c1705c48) at vop_read+0x34
vn_read(c820ef40,d4208c8c,c1705c48,0,d2f06e03) at vn_read+0x168
dofileread(3,c820ef40,d4208c8c,0,d4208ce8) at dofileread+0xc5
kern_preadv(3,d4208c8c,0,d4208ce8,805b760,20,d4208c84,1,0,0,20,0,0,c8110600) at kern_preadv+0xa6
sys_read(d4208ce8,d4208cf8,c,0,0) at sys_read+0x75
syscall2(d4208d40) at syscall2+0x214
Xint0x80_syscall() at Xint0x80_syscall+0x35
Debugger("panic")
Stopped at Debugger+0x44: movb $0,in_Debugger.0
db>
HAMMER is going to be a little unstable as I commit the crash recovery code. I'm about half way through it. Meta-data updates to the disk media have now been separated out. I have a few things left to do before crash recovery will actually work:
* I have to flush the undo buffers out before the meta-data buffers.
* Then I have to flush the volume header so mount can see the updated undo info.
* Then I have to flush out the meta-data buffers that the UNDO info refers to.
* And, finally, the mount code must scan the UNDO buffers and perform any required UNDOs.
The idea being that if a crash occurs at any point in the above sequence, HAMMER will be able to run the UNDOs to undo any partially written meta-data. HAMMER would be able to do this at mount-time and it would probably take less than a second, so basically this gives us our instant crash-recovery feature.
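The ordering argument above can be tried out on a toy model. The following is only a sketch in Python, not HAMMER's code or on-disk format: the Disk class, the field names, and the final "commit" write that retires the undo records are all assumptions made for illustration. It performs the writes in the order listed, cuts the sequence off at every possible point to simulate a crash, and then runs a mount-time undo scan.

```python
class Disk:
    """Simulated media (hypothetical layout, not HAMMER's)."""

    def __init__(self, meta):
        self.meta = dict(meta)       # block number -> meta-data contents
        self.undo = []               # (block number, old contents) records
        self.header_undo_count = 0   # undo records the volume header vouches for


def flush_steps(disk, updates):
    """Yield one callable per media write, in the order described above."""
    # 1. Undo records (the old contents) go out first.
    for blk in updates:
        yield lambda blk=blk: disk.undo.append((blk, disk.meta[blk]))
    # 2. The volume header makes those undo records visible to mount.
    yield lambda: setattr(disk, "header_undo_count", len(disk.undo))
    # 3. Only now are the meta-data blocks themselves overwritten.
    for blk, new in updates.items():
        yield lambda blk=blk, new=new: disk.meta.__setitem__(blk, new)

    # 4. Assumed commit step: retire the undo records once everything landed.
    def commit():
        disk.undo.clear()
        disk.header_undo_count = 0

    yield commit


def run(disk, updates, crash_after=None):
    """Perform the writes; stop after crash_after writes to simulate a crash."""
    for n, write in enumerate(flush_steps(disk, updates)):
        if crash_after is not None and n >= crash_after:
            return False             # crashed part-way through the flush
        write()
    return True


def mount_recover(disk):
    """Mount-time scan: roll back every undo record the header knows about."""
    for blk, old in reversed(disk.undo[:disk.header_undo_count]):
        disk.meta[blk] = old
    disk.undo.clear()
    disk.header_undo_count = 0


old = {1: "a", 2: "b"}
new = {1: "A", 2: "B"}
outcomes = []
for crash_after in range(7):         # 6 writes in total; 6 means no crash
    d = Disk(old)
    run(d, new, crash_after)
    mount_recover(d)
    outcomes.append(d.meta)
```

In this model every crash point leaves the meta-data either entirely old or entirely new after the mount-time scan, which is the property the write ordering is meant to provide. Swapping steps 1 and 3 in flush_steps breaks it, since a block can then be overwritten before its undo record exists.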
One interesting outcome of the separation work I just committed is that the frontend VOPs are *massively* disconnected from backend disk I/O now. In coming weeks I hope to take advantage of this separation to remove the remaining stalls and significantly improve HAMMER's performance.