Re: Strange server crashes with large table and myisamchk

From: gerald_clark (gerald_clarksuppliersystems.com)
Date: Fri Jul 02 2004 - 08:04:13 CDT


The syslog is telling you that your hard drive is failing.
Replace it.
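
To see how widespread the damage is before swapping the disk, the end_request lines quoted below can be tallied per sector. This is only a rough sketch in Python: the function name failing_sectors and the sample text are made up for illustration, and the pattern assumes the log format is exactly as it appears in the quoted syslog excerpt.

```python
import re
from collections import Counter

# Matches entries such as:
#   "... kernel: end_request: I/O error, dev 03:02 (hda), sector 316864"
IO_ERROR = re.compile(r"end_request: I/O error, dev (\S+) \((\w+)\),\s+sector (\d+)")

def failing_sectors(log_text):
    """Return a Counter mapping (device, sector) to the number of I/O errors logged."""
    # Drop mail quote markers ("> ") in case the log was pasted from a reply.
    unquoted = re.sub(r"(?m)^>\s?", "", log_text)
    # Mail wrapping may split one entry across two lines, so collapse whitespace.
    flat = " ".join(unquoted.split())
    return Counter((m.group(2), int(m.group(3))) for m in IO_ERROR.finditer(flat))

# Hypothetical sample, copied from three of the quoted syslog entries.
sample = """
> Jul 2 03:10:28 t56 kernel: end_request: I/O error, dev 03:02 (hda),
> sector 316864
> Jul 2 03:10:30 t56 kernel: end_request: I/O error, dev 03:02 (hda),
> sector 316872
> Jun 30 14:06:55 t56 kernel: end_request: I/O error, dev 03:02 (hda),
> sector 74432
"""

counts = failing_sectors(sample)
for (dev, sector), n in sorted(counts.items()):
    print(dev, sector, n)
```

If the list of distinct sectors keeps growing from night to night, that points at the disk itself rather than at myisamchk, which simply reads and rewrites far more of the disk than normal operation does.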

Hanno Fietz wrote:

> Hello everybody,
>
> I'm experiencing problems with a 4.0.15 MySQL-Server running on a SuSE
> Linux 8.2 box with 512 MB RAM, some one-point-something GHz CPU and 40
> GB IDE Harddisk.
>
> We have a database with some administrative tables and one large data
> table (now ~ 30 M rows, ~ 1GB index file and ~ 800 MB data file) that
> we insert new rows into on a per-minute basis. Read / Write ratio
> probably is around 1 : 2 or 1 : 3. To achieve good performance despite
> the size of the table, we run "myisamchk -r" and "myisamchk -R 1"
> every night as a part of the backup routine. The server is taken down
> for that purpose.
>
> For the last two weeks now, we are getting these syslog messages when
> running the optimization:
>
> Jul 2 03:10:28 t56 kernel: hda: dma_intr: status=0x51 {
> DriveReadySeekComplete Error }
> Jul 2 03:10:28 t56 kernel: hda: dma_intr: error=0x40 {
> UncorrectableError }, LBAsect=429367, sector=316864
> Jul 2 03:10:28 t56 kernel: end_request: I/O error, dev 03:02 (hda),
> sector 316864
> Jul 2 03:10:28 t56 kernel: klogd 1.4.1, ---------- state change
> ----------
> Jul 2 03:10:30 t56 kernel: hda: dma_intr: status=0x51 {
> DriveReadySeekComplete Error }
> Jul 2 03:10:30 t56 kernel: hda: dma_intr: error=0x40 {
> UncorrectableError }, LBAsect=429367, sector=316872
> Jul 2 03:10:30 t56 kernel: end_request: I/O error, dev 03:02 (hda),
> sector 316872
> Jul 2 03:10:32 t56 kernel: hda: dma_intr: status=0x51 {
> DriveReadySeekComplete Error }
> Jul 2 03:10:32 t56 kernel: hda: dma_intr: error=0x40 {
> UncorrectableError }, LBAsect=429367, sector=316880
> Jul 2 03:10:32 t56 kernel: end_request: I/O error, dev 03:02 (hda),
> sector 316880
> Jul 2 03:10:33 t56 kernel: hda: dma_intr: status=0x51 {
> DriveReadySeekComplete Error }
> Jul 2 03:10:33 t56 kernel: hda: dma_intr: error=0x40 {
> UncorrectableError }, LBAsect=429367, sector=316888
> Jul 2 03:10:33 t56 kernel: end_request: I/O error, dev 03:02 (hda),
> sector 316888
> Jul 2 03:10:39 t56 kernel: hda: dma_intr: status=0x51 {
> DriveReadySeekComplete Error }
> Jul 2 03:10:39 t56 kernel: hda: dma_intr: error=0x40 {
> UncorrectableError }, LBAsect=429367, sector=316896
> Jul 2 03:10:39 t56 kernel: end_request: I/O error, dev 03:02 (hda),
> sector 316896
> Jul 2 03:10:39 t56 kernel: hda: dma_intr: status=0x51 {
> DriveReadySeekComplete Error }
> Jul 2 03:10:39 t56 kernel: hda: dma_intr: error=0x40 {
> UncorrectableError }, LBAsect=429367, sector=316904
> Jul 2 03:10:39 t56 kernel: end_request: I/O error, dev 03:02 (hda),
> sector 316904
> Jul 2 03:10:39 t56 kernel: hda: dma_intr: status=0x51 {
> DriveReadySeekComplete Error }
> Jul 2 03:10:39 t56 kernel: hda: dma_intr: error=0x40 {
> UncorrectableError }, LBAsect=429367, sector=316912
> Jul 2 03:10:39 t56 kernel: end_request: I/O error, dev 03:02 (hda),
> sector 316912
> Jul 2 03:12:17 t56 kernel: hda: dma_intr: status=0x51 {
> DriveReadySeekComplete Error }
> Jul 2 03:12:17 t56 kernel: hda: dma_intr: error=0x40 {
> UncorrectableError }, LBAsect=159072, sector=46592
> Jul 2 03:12:17 t56 kernel: end_request: I/O error, dev 03:02 (hda),
> sector 46592
> Jul 2 03:12:19 t56 kernel: hda: dma_intr: status=0x51 {
> DriveReadySeekComplete Error }
> Jul 2 03:12:19 t56 kernel: hda: dma_intr: error=0x40 {
> UncorrectableError }, LBAsect=159072, sector=46600
> Jul 2 03:12:19 t56 kernel: end_request: I/O error, dev 03:02 (hda),
> sector 46600
> Jul 2 03:13:14 t56 kernel: hda: dma_intr: status=0x51 {
> DriveReadySeekComplete Error }
> Jul 2 03:13:14 t56 kernel: hda: dma_intr: error=0x40 {
> UncorrectableError }, LBAsect=285328, sector=172864
> Jul 2 03:13:14 t56 kernel: end_request: I/O error, dev 03:02 (hda),
> sector 172864
> Jul 2 03:13:16 t56 kernel: hda: dma_intr: status=0x51 {
> DriveReadySeekComplete Error }
> Jul 2 03:13:16 t56 kernel: hda: dma_intr: error=0x40 {
> UncorrectableError }, LBAsect=285328, sector=172872
> Jul 2 03:13:16 t56 kernel: end_request: I/O error, dev 03:02 (hda),
> sector 172872
>
>
> Occasionally (not always!!), the MySQL-Server won't come up again
> after optimization, sometimes myisamchk even leaves the table
> corrupted and has to be run again. To make it even more confusing:
> sometimes I get server crashes during shutdown, due to signal 11
> (SEGV). I included a resolved stack dump below:
>
> 0x8071f64 handle_segfault + 420
> 0x82916c8 pthread_sighandler + 184
> 0x8188a9f btr_search_drop_page_hash_index + 5359
> 0x8188e1a btr_search_drop_page_hash_when_freed + 138
> 0x81dbbea fseg_free_extent + 746
> 0x81dc7fa fseg_free_step + 2458
> 0x815c3ba btr_free_but_not_root + 122
> 0x8100efe dict_drop_index_tree + 94
> 0x814969a row_upd_clust_step + 538
> 0x81499fa row_upd + 106
> 0x8149c62 row_upd_step + 322
> 0x811c7be que_run_threads + 334
> 0x8136132 row_drop_table_for_mysql + 2114
> 0x80cf4ce delete_table__11ha_innobasePCc + 270
> 0x80c5c8c ha_delete_table__F7db_typePCc + 60
> 0x80d3bf1 mysql_rm_table_part2__FP3THDP13st_table_listbT2 + 497
> 0x80d38c1 mysql_rm_table__FP3THDP13st_table_listc + 177
> 0x807e6f1 mysql_execute_command__Fv + 8561
> 0x8080565 mysql_parse__FP3THDPcUi + 149
> 0x807bac3 dispatch_command__F19enum_server_commandP3THDPcUi + 1443
> 0x807b50e do_command__FP3THD + 158
> 0x807acfe handle_one_connection + 638
> 0x828ee7c pthread_start_thread + 220
> 0x82c258a thread_start + 4
>
>
> Server crashes like that (caught signal 11) have recently been
> observed during normal operations as well, also preceded by hd errors
> in the syslog:
>
> Jun 30 14:06:55 t56 kernel: hda: dma_intr: status=0x51 {
> DriveReadySeekComplete Error }
> Jun 30 14:06:55 t56 kernel: hda: dma_intr: error=0x40 {
> UncorrectableError }, LBAsect=186887, sector=74432
> Jun 30 14:06:55 t56 kernel: end_request: I/O error, dev 03:02 (hda),
> sector 74432
>
>
> The server restarted itself after that and wrote error messages to the
> logfile. Again, I include the stack trace:
>
> 0x8071f64 handle_segfault + 420
> 0x82916c8 pthread_sighandler + 184
> 0x82aad07 vfprintf + 6295
> 0x82b1645 vsprintf + 85
> 0x823928b ut_sprintf + 27
> 0x8226406 sync_array_cell_print + 166
> 0x8226ea4 sync_array_print_long_waits + 116
> 0x80f99d8 srv_error_monitor_thread + 88
> 0x828ee7c pthread_start_thread + 220
> 0x82c258a thread_start + 4
>
>
> I have googled the syslog messages and worked my way through several
> forums but can't really pinpoint the problem. It seems there are some
> problems with our hard disk, which could mean that it is damaged (bad
> blocks etc.) but what I can't see is why this is so closely related to
> the optimization / backup script. There definitely is a strong
> correlation (we do get hd errors outside the backup process, but very
> rarely) between running myisamchk and getting I / O errors, but I just
> don't know if myisamchk causes the problem or if it is prone to suffer
> from disk trouble more than other processes.
>
> Any help would be appreciated.
> Some questions I have:
> - How do I read the resolved stack trace? There are function calls
> (probably youngest first), OK, but what does that " + xxx" at the end
> of each line mean?
> - Do the function calls executed just before pthread_sighandler have
> something in common?
> - Is there a way to get more output out of myisamchk apart from -v?
>
> Hanno Fietz
>
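
On the stack trace question: the " + xxx" is the offset in bytes from the start of the named function to the address on that line, so "handle_segfault + 420" means the address 0x8071f64 lies 420 bytes into handle_segfault. A minimal sketch of how such a line is resolved, in Python: the symbol table here is hypothetical, with start addresses obtained by subtracting the offsets shown in the quoted trace, and resolve is an illustrative name, not part of any MySQL tool.

```python
import bisect

# Hypothetical symbol table excerpt: (start address, name), sorted by address.
# Start addresses are derived from the quoted trace (address minus offset).
SYMBOLS = [
    (0x8071DC0, "handle_segfault"),
    (0x8291610, "pthread_sighandler"),
    (0x82C2586, "thread_start"),
]
STARTS = [start for start, _ in SYMBOLS]

def resolve(address):
    """Map an address to 'symbol + offset' using the nearest symbol at or below it."""
    i = bisect.bisect_right(STARTS, address) - 1
    if i < 0:
        # The address lies below every known symbol; leave it unresolved.
        return hex(address)
    start, name = SYMBOLS[i]
    return f"{name} + {address - start}"

for addr in (0x8071F64, 0x82916C8, 0x82C258A):
    print(hex(addr), resolve(addr))
```

The lines are listed innermost call first, so the functions directly below pthread_sighandler are the ones that were executing when the signal arrived.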

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql