"ROOTBACKUP=1" corruption problems on amd64 (OPENBSD_4_0)

From: Didier Wiroth (didier.wirothmcesr.etat.lu)
Date: Thu Mar 29 2007 - 02:11:36 CDT


Hello,
I'm using ROOTBACKUP=1 to get daily root backups on several boxes running
amd64 OPENBSD_4_0.
I noticed that on one box (the hardware is about 3 months old), the
backup partition is *always* corrupted after the backup.
The corruption happens every day.

Does anyone have an idea what the problem could be?

I'm using an LSI MegaRAID controller (see dmesg below); here is the
bioctl output:
# bioctl ami0
Volume Status Size Device
 ami0 0 Online 104857600000 sd0 RAID5
      0 Online 400083124224 0:0.0 noencl <ST3400620NS 3.AE>
      1 Online 400083124224 0:1.0 noencl <ST3400620NS 3.AE>
      2 Online 400083124224 0:2.0 noencl <ST3400620NS 3.AE>
      3 Online 400083124224 0:3.0 noencl <ST3400620NS 3.AE>
      4 Online 400083124224 0:4.0 noencl <ST3400620NS 3.AE>
 ami0 1 Online 20971520000 sd1 RAID0
      0 Online 400083124224 0:0.0 noencl <ST3400620NS 3.AE>
      1 Online 400083124224 0:1.0 noencl <ST3400620NS 3.AE>
      2 Online 400083124224 0:2.0 noencl <ST3400620NS 3.AE>
      3 Online 400083124224 0:3.0 noencl <ST3400620NS 3.AE>
      4 Online 400083124224 0:4.0 noencl <ST3400620NS 3.AE>
 ami0 2 Online 739246080000 sd2 RAID5
      0 Online 400083124224 0:0.0 noencl <ST3400620NS 3.AE>
      1 Online 400083124224 0:1.0 noencl <ST3400620NS 3.AE>
      2 Online 400083124224 0:2.0 noencl <ST3400620NS 3.AE>
      3 Online 400083124224 0:3.0 noencl <ST3400620NS 3.AE>
      4 Online 400083124224 0:4.0 noencl <ST3400620NS 3.AE>
 ami0 3 Online 739451600896 sd3 RAID5
      0 Online 400083124224 0:0.0 noencl <ST3400620NS 3.AE>
      1 Online 400083124224 0:1.0 noencl <ST3400620NS 3.AE>
      2 Online 400083124224 0:2.0 noencl <ST3400620NS 3.AE>
      3 Online 400083124224 0:3.0 noencl <ST3400620NS 3.AE>
      4 Online 400083124224 0:4.0 noencl <ST3400620NS 3.AE>
 ami0 4 Hot spare 400083124224 0:5.0 noencl <ST3400620NS 3.AE>
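
As a rough cross-check of the layout shown by bioctl, the logical volume sizes can be added up against the raw capacity of the five member disks. This is only a sketch: it assumes every volume is striped across all five disks (as the listing suggests), that a RAID5 volume costs one disk's worth of parity, and that RAID0 has no overhead. The hot spare is left out, and the names used here are purely illustrative.

```python
# Sizes in bytes, copied from the bioctl output above.
DISK_BYTES = 400083124224
N_DISKS = 5  # member disks 0:0.0 .. 0:4.0, hot spare excluded

volumes = [
    ("sd0", "RAID5", 104857600000),
    ("sd1", "RAID0", 20971520000),
    ("sd2", "RAID5", 739246080000),
    ("sd3", "RAID5", 739451600896),
]

def raw_bytes(level, size, n_disks):
    """Raw disk space consumed by a logical volume of the given usable size."""
    if level == "RAID5":
        # Assumption: one disk's worth of parity per stripe set.
        return size * n_disks // (n_disks - 1)
    if level == "RAID0":
        return size
    raise ValueError(f"unhandled RAID level: {level}")

raw_used = sum(raw_bytes(level, size, N_DISKS) for _, level, size in volumes)
raw_total = DISK_BYTES * N_DISKS

print("raw used :", raw_used)
print("raw total:", raw_total)
print("unused   :", raw_total - raw_used)
```

Under those assumptions the four volumes account for the whole raw capacity of the five disks, i.e. they are slices of one and the same disk set.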

Here is the daily mail report I get:
Backing up root filesystem:

copying /dev/rsd0a to /dev/rsd0h
262139+1 records in
262139+1 records out
2147443200 bytes transferred in 548.279 secs (3916696 bytes/sec)
** /dev/rsd0h
** Last Mounted on /
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
UNREF FILE I=103073 OWNER=root MODE=100555
SIZE=282672 MTIME=Feb 13 08:58 2007
CLEAR? yes

UNREF FILE I=103086 OWNER=root MODE=100555
SIZE=106928 MTIME=Feb 13 08:58 2007
CLEAR? yes

UNREF FILE I=103113 OWNER=root MODE=100500
SIZE=255536 MTIME=Feb 13 08:58 2007
CLEAR? yes

** Phase 5 - Check Cyl groups
FREE BLK COUNT(S) WRONG IN SUPERBLK
SALVAGE? yes

SUMMARY INFORMATION BAD
SALVAGE? yes

BLK(S) MISSING IN BIT MAPS
SALVAGE? yes

3116 files, 24391 used, 1007208 free (280 frags, 125866 blocks, 0.0% fragmentation)

MARK FILE SYSTEM CLEAN? yes
---- end snip ------
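
The dd figures in the report can be sanity-checked with a few lines of arithmetic. This is a sketch, not part of the daily script: the record size of 8192 bytes is an assumption inferred from the record count, and the variable names are made up for illustration.

```python
# Figures copied from the dd summary in the daily report above.
TOTAL_BYTES = 2147443200
ELAPSED_SECS = 548.279
BLOCK_SIZE = 8192  # assumption: 16 sectors of 512 bytes per dd record

# Number of full records, and size of the trailing partial record.
full, tail = divmod(TOTAL_BYTES, BLOCK_SIZE)

# Average throughput over the whole copy.
rate = TOTAL_BYTES / ELAPSED_SECS

print(f"{full}+{1 if tail else 0} records, last record {tail} bytes")
print(f"{rate:.0f} bytes/sec ({rate / 1e6:.2f} MB/s)")
```

With that record size the numbers agree with the report: 262139 full records plus one partial record of 512 bytes, copied at roughly 3.9 MB/s.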

Here is the dmesg:
OpenBSD 4.0-stable (GENERIC.MP) #0: Mon Jan 8 12:54:22 CET 2007
    rootcediesbak.cedies.etat.lu:/home/sources/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 2146562048 (2096252K)
avail mem = 1834729472 (1791728K)
using 22937 buffers containing 214863872 bytes (209828K) of memory
mainbus0 (root)
bios0 at mainbus0: SMBIOS rev. 2.4 0xf0690 (74 entries)
bios0: stem manufacturer P5WDG2 WS PRO
mainbus0: Intel MP Specification (Version 1.4) (INTEL PRO )
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Core(TM)2 CPU 6600 2.40GHz, 2404.44 MHz
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,LONG
cpu0: 4MB 64b/line 16-way L2 cache
cpu0: apic clock running at 267MHz
cpu1 at mainbus0: apid 1 (application processor)
cpu1: Intel(R) Core(TM)2 CPU 6600 2.40GHz, 2404.11 MHz
cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,LONG
cpu1: 4MB 64b/line 16-way L2 cache
mpbios: bus 0 is type PCI
mpbios: bus 1 is type PCI
mpbios: bus 2 is type PCI
mpbios: bus 3 is type PCI
mpbios: bus 4 is type PCI
mpbios: bus 5 is type PCI
mpbios: bus 6 is type ISA
ioapic0 at mainbus0 apid 2 pa 0xfec00000, version 20, 24 pins
ioapic1 at mainbus0 apid 3 pa 0xfec10000, version 20, 24 pins
pci0 at mainbus0 bus 0: configuration mode 1
pchb0 at pci0 dev 0 function 0 vendor "Intel", unknown product 0x277c rev 0xc0
ppb0 at pci0 dev 1 function 0 vendor "Intel", unknown product 0x277d rev 0xc0
pci1 at ppb0 bus 5
vga1 at pci1 dev 0 function 0 vendor "NVIDIA", unknown product 0x0163 rev 0xa1
wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
ppb1 at pci0 dev 28 function 0 "Intel 82801GB PCIE" rev 0x01
pci2 at ppb1 bus 3
ppb2 at pci2 dev 0 function 0 "Intel PCIE-PCIE" rev 0x09
pci3 at ppb2 bus 4
ami0 at pci3 dev 1 function 0 "Symbios Logic MegaRAID" rev 0x01: apic 3 int 0 (irq 10)
ami0: LSI 523, 32b, FW 713R, BIOS vG121, 64MB RAM
ami0: 1 channels, 0 FC loops, 4 logical drives
scsibus0 at ami0: 40 targets
sd0 at scsibus0 targ 0 lun 0: <AMI, Host drive #00, > SCSI2 0/direct fixed
sd0: 100000MB, 100000 cyl, 64 head, 32 sec, 512 bytes/sec, 204800000 sec total
sd1 at scsibus0 targ 1 lun 0: <AMI, Host drive #01, > SCSI2 0/direct fixed
sd1: 20000MB, 20000 cyl, 64 head, 32 sec, 512 bytes/sec, 40960000 sec total
sd2 at scsibus0 targ 2 lun 0: <AMI, Host drive #02, > SCSI2 0/direct fixed
sd2: 705000MB, 705000 cyl, 64 head, 32 sec, 512 bytes/sec, 1443840000 sec total
sd3 at scsibus0 targ 3 lun 0: <AMI, Host drive #03, > SCSI2 0/direct fixed
sd3: 705196MB, 705196 cyl, 64 head, 32 sec, 512 bytes/sec, 1444241408 sec total
scsibus1 at ami0: 16 targets
ppb3 at pci0 dev 28 function 4 "Intel 82801G PCIE" rev 0x01
pci4 at ppb3 bus 2
mskc0 at pci4 dev 0 function 0 "Marvell Yukon 88E8052" rev 0x21, Marvell Yukon-2 EC rev. A3 (0x2): apic 2 int 16 (irq 10)
msk0 at mskc0 port A, address 00:18:f3:46:84:3d
eephy0 at msk0 phy 0: Marvell 88E1111 Gigabit PHY, rev. 2
uhci0 at pci0 dev 29 function 0 "Intel 82801GB USB" rev 0x01: apic 2 int 20 (irq 11)
usb0 at uhci0: USB revision 1.0
uhub0 at usb0
uhub0: Intel UHCI root hub, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhci1 at pci0 dev 29 function 1 "Intel 82801GB USB" rev 0x01: apic 2 int 17 (irq 7)
usb1 at uhci1: USB revision 1.0
uhub1 at usb1
uhub1: Intel UHCI root hub, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
uhci2 at pci0 dev 29 function 2 "Intel 82801GB USB" rev 0x01: apic 2 int 18 (irq 3)
usb2 at uhci2: USB revision 1.0
uhub2 at usb2
uhub2: Intel UHCI root hub, rev 1.00/1.00, addr 1
uhub2: 2 ports with 2 removable, self powered
uhci3 at pci0 dev 29 function 3 "Intel 82801GB USB" rev 0x01: apic 2 int 19 (irq 5)
usb3 at uhci3: USB revision 1.0
uhub3 at usb3
uhub3: Intel UHCI root hub, rev 1.00/1.00, addr 1
uhub3: 2 ports with 2 removable, self powered
ehci0 at pci0 dev 29 function 7 "Intel 82801GB USB" rev 0x01: apic 2 int 20 (irq 11)
usb4 at ehci0: USB revision 2.0
uhub4 at usb4
uhub4: Intel EHCI root hub, rev 2.00/1.00, addr 1
uhub4: 8 ports with 8 removable, self powered
ppb4 at pci0 dev 30 function 0 "Intel 82801BA AGP" rev 0xe1
pci5 at ppb4 bus 1
"TI TSB43AB22 FireWire" rev 0x00 at pci5 dev 3 function 0 not configured
skc0 at pci5 dev 5 function 0 "Marvell Yukon 88E8001/8003/8010" rev 0x14, Marvell Yukon Lite (0x9): apic 2 int 21 (irq 11)
sk0 at skc0 port A, address 00:18:f3:46:84:3e
eephy1 at sk0 phy 0: Marvell 88E1011 Gigabit PHY, rev. 5
pcib0 at pci0 dev 31 function 0 "Intel 82801GB LPC" rev 0x01
pciide0 at pci0 dev 31 function 1 "Intel 82801GB IDE" rev 0x01: DMA, channel 0 configured to compatibility, channel 1 configured to compatibility
atapiscsi0 at pciide0 channel 0 drive 0
scsibus2 at atapiscsi0: 2 targets
cd0 at scsibus2 targ 0 lun 0: <_NEC, DVD_RW ND-4571A, 1-01> SCSI0 5/cdrom removable
cd0(pciide0:0:0): using PIO mode 4, Ultra-DMA mode 2
pciide0: channel 1 disabled (no drives)
pciide1 at pci0 dev 31 function 2 "Intel 82801GB SATA" rev 0x01: DMA, channel 0 configured to native-PCI, channel 1 configured to native-PCI
pciide1: using apic 2 int 23 (irq 10) for native-PCI interrupt
ichiic0 at pci0 dev 31 function 3 "Intel 82801GB SMBus" rev 0x01: apic 2 int 23 (irq 0)
iic0 at ichiic0
isa0 at pcib0
isadma0 at isa0
com0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
pckbc0 at isa0 port 0x60/5
pckbd0 at pckbc0 (kbd slot)
pckbc0: using irq 1 for kbd slot
wskbd0 at pckbd0: console keyboard, using wsdisplay0
pms0 at pckbc0 (aux slot)
pckbc0: using irq 12 for aux slot
wsmouse0 at pms0 mux 0

Thx a lot
Didier