Re: Strange server crashes with large table and myisamchk
From: gerald_clark (gerald_clark at suppliersystems.com)
Date: Fri Jul 02 2004 - 08:50:25 CDT
Try this:
dd if=/dev/hda of=/dev/null
This will exercise the entire drive.
You should see lots of errors if your drive is failing.
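If you also want a list of where the reads fail, the same read-everything test can be sketched in Python. This is only a sketch, not a replacement for dd: scan_device and chunk_size are names made up for this example, and it assumes a Unix system where the device can be opened read-only (normally as root) and where seeking to the end reports its size.

```python
import os

def scan_device(path, chunk_size=1024 * 1024):
    """Read `path` from start to end, like dd to /dev/null.

    Returns (bytes_read, bad_offsets), where bad_offsets lists the byte
    offsets of chunks whose read failed with an OS-level error.
    """
    bytes_read = 0
    bad_offsets = []
    fd = os.open(path, os.O_RDONLY)
    try:
        # Assumption: seeking to the end reports the size (true for regular
        # files and for Linux block devices).
        size = os.lseek(fd, 0, os.SEEK_END)
        offset = 0
        while offset < size:
            try:
                data = os.pread(fd, chunk_size, offset)
            except OSError:
                # Record the failed chunk and skip past it.
                bad_offsets.append(offset)
                offset += chunk_size
                continue
            if not data:
                break
            bytes_read += len(data)
            offset += len(data)
    finally:
        os.close(fd)
    return bytes_read, bad_offsets
```

Calling scan_device("/dev/hda") would then return the number of bytes read and the offsets that could not be read. A healthy drive should give an empty list.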
Another possibility is a bad cable. Cables don't usually go bad if they are
not disturbed. Drives do.
A failing IDE controller is another unlikely possibility.
I would put my money on the drive.
40G is tiny these days, and cheap.
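To see how many distinct sectors are affected, the end_request lines in the syslog excerpt quoted below can be boiled down to a list of failing sectors per device. A rough sketch, assuming the kernel messages have the same form as in that excerpt; failing_sectors and IO_ERROR are names invented for this example.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   kernel: end_request: I/O error, dev 03:02 (hda), sector 316864
# \s+ before "sector" tolerates a line wrap at that point.
IO_ERROR = re.compile(
    r"end_request: I/O error, dev \S+ \((\w+)\),\s+sector (\d+)"
)

def failing_sectors(log_text):
    """Return {device: sorted list of distinct sectors with I/O errors}."""
    sectors = defaultdict(set)
    for device, sector in IO_ERROR.findall(log_text):
        sectors[device].add(int(sector))
    return {device: sorted(found) for device, found in sectors.items()}
```

Feeding it the contents of the syslog file gives one sorted sector list per drive. If that list keeps growing from night to night, it supports the failing-drive diagnosis.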
Hanno Fietz wrote:
> Yes, I was suspecting that as well, but: Why do I get these messages
> whenever I run myisamchk and (almost) never at any other time? Is
> myisamchk using the hd more extensively than e.g. MySQL itself? Can
> the rather large demand for temporary disk space account for that?
>
> Thanks,
> Hanno
>
> gerald_clark wrote:
>
>> It is telling you that your hard drive is failing.
>> Replace it.
>>
>> Hanno Fietz wrote:
>>
>>> Hello everybody,
>>>
>>> I'm experiencing problems with a 4.0.15 MySQL-Server running on a
>>> SuSE Linux 8.2 box with 512 MB RAM, some one-point-something GHz CPU
>>> and 40 GB IDE Harddisk.
>>>
>>> We have a database with some administrative tables and one large
>>> data table (now ~ 30 M rows, ~ 1GB index file and ~ 800 MB data
>>> file) that we insert new rows into on a per-minute basis. Read /
>>> Write ratio probably is around 1 : 2 or 1 : 3. To achieve good
>>> performance despite the size of the table, we run "myisamchk -r" and
>>> "myisamchk -R 1" every night as a part of the backup routine. The
>>> server is taken down for that purpose.
>>>
>>> For the last two weeks now, we are getting these syslog messages
>>> when running the optimization:
>>>
>>> Jul 2 03:10:28 t56 kernel: hda: dma_intr: status=0x51 {
>>> DriveReadySeekComplete Error }
>>> Jul 2 03:10:28 t56 kernel: hda: dma_intr: error=0x40 {
>>> UncorrectableError }, LBAsect=429367, sector=316864
>>> Jul 2 03:10:28 t56 kernel: end_request: I/O error, dev 03:02 (hda),
>>> sector 316864
>>> Jul 2 03:10:28 t56 kernel: klogd 1.4.1, ---------- state change
>>> ----------
>>> Jul 2 03:10:30 t56 kernel: hda: dma_intr: status=0x51 {
>>> DriveReadySeekComplete Error }
>>> Jul 2 03:10:30 t56 kernel: hda: dma_intr: error=0x40 {
>>> UncorrectableError }, LBAsect=429367, sector=316872
>>> Jul 2 03:10:30 t56 kernel: end_request: I/O error, dev 03:02 (hda),
>>> sector 316872
>>> Jul 2 03:10:32 t56 kernel: hda: dma_intr: status=0x51 {
>>> DriveReadySeekComplete Error }
>>> Jul 2 03:10:32 t56 kernel: hda: dma_intr: error=0x40 {
>>> UncorrectableError }, LBAsect=429367, sector=316880
>>> Jul 2 03:10:32 t56 kernel: end_request: I/O error, dev 03:02 (hda),
>>> sector 316880
>>> Jul 2 03:10:33 t56 kernel: hda: dma_intr: status=0x51 {
>>> DriveReadySeekComplete Error }
>>> Jul 2 03:10:33 t56 kernel: hda: dma_intr: error=0x40 {
>>> UncorrectableError }, LBAsect=429367, sector=316888
>>> Jul 2 03:10:33 t56 kernel: end_request: I/O error, dev 03:02 (hda),
>>> sector 316888
>>> Jul 2 03:10:39 t56 kernel: hda: dma_intr: status=0x51 {
>>> DriveReadySeekComplete Error }
>>> Jul 2 03:10:39 t56 kernel: hda: dma_intr: error=0x40 {
>>> UncorrectableError }, LBAsect=429367, sector=316896
>>> Jul 2 03:10:39 t56 kernel: end_request: I/O error, dev 03:02 (hda),
>>> sector 316896
>>> Jul 2 03:10:39 t56 kernel: hda: dma_intr: status=0x51 {
>>> DriveReadySeekComplete Error }
>>> Jul 2 03:10:39 t56 kernel: hda: dma_intr: error=0x40 {
>>> UncorrectableError }, LBAsect=429367, sector=316904
>>> Jul 2 03:10:39 t56 kernel: end_request: I/O error, dev 03:02 (hda),
>>> sector 316904
>>> Jul 2 03:10:39 t56 kernel: hda: dma_intr: status=0x51 {
>>> DriveReadySeekComplete Error }
>>> Jul 2 03:10:39 t56 kernel: hda: dma_intr: error=0x40 {
>>> UncorrectableError }, LBAsect=429367, sector=316912
>>> Jul 2 03:10:39 t56 kernel: end_request: I/O error, dev 03:02 (hda),
>>> sector 316912
>>> Jul 2 03:12:17 t56 kernel: hda: dma_intr: status=0x51 {
>>> DriveReadySeekComplete Error }
>>> Jul 2 03:12:17 t56 kernel: hda: dma_intr: error=0x40 {
>>> UncorrectableError }, LBAsect=159072, sector=46592
>>> Jul 2 03:12:17 t56 kernel: end_request: I/O error, dev 03:02 (hda),
>>> sector 46592
>>> Jul 2 03:12:19 t56 kernel: hda: dma_intr: status=0x51 {
>>> DriveReadySeekComplete Error }
>>> Jul 2 03:12:19 t56 kernel: hda: dma_intr: error=0x40 {
>>> UncorrectableError }, LBAsect=159072, sector=46600
>>> Jul 2 03:12:19 t56 kernel: end_request: I/O error, dev 03:02 (hda),
>>> sector 46600
>>> Jul 2 03:13:14 t56 kernel: hda: dma_intr: status=0x51 {
>>> DriveReadySeekComplete Error }
>>> Jul 2 03:13:14 t56 kernel: hda: dma_intr: error=0x40 {
>>> UncorrectableError }, LBAsect=285328, sector=172864
>>> Jul 2 03:13:14 t56 kernel: end_request: I/O error, dev 03:02 (hda),
>>> sector 172864
>>> Jul 2 03:13:16 t56 kernel: hda: dma_intr: status=0x51 {
>>> DriveReadySeekComplete Error }
>>> Jul 2 03:13:16 t56 kernel: hda: dma_intr: error=0x40 {
>>> UncorrectableError }, LBAsect=285328, sector=172872
>>> Jul 2 03:13:16 t56 kernel: end_request: I/O error, dev 03:02 (hda),
>>> sector 172872
>>>
>>>
>>> Occasionally (not always!!), the MySQL-Server won't come up again
>>> after optimization, sometimes myisamchk even leaves the table
>>> corrupted and has to be run again. To make it even more confusing:
>>> sometimes I get server crashes during shutdown, due to signal 11
>>> (SEGV). I included a resolved stack dump below:
>>>
>>> 0x8071f64 handle_segfault + 420
>>> 0x82916c8 pthread_sighandler + 184
>>> 0x8188a9f btr_search_drop_page_hash_index + 5359
>>> 0x8188e1a btr_search_drop_page_hash_when_freed + 138
>>> 0x81dbbea fseg_free_extent + 746
>>> 0x81dc7fa fseg_free_step + 2458
>>> 0x815c3ba btr_free_but_not_root + 122
>>> 0x8100efe dict_drop_index_tree + 94
>>> 0x814969a row_upd_clust_step + 538
>>> 0x81499fa row_upd + 106
>>> 0x8149c62 row_upd_step + 322
>>> 0x811c7be que_run_threads + 334
>>> 0x8136132 row_drop_table_for_mysql + 2114
>>> 0x80cf4ce delete_table__11ha_innobasePCc + 270
>>> 0x80c5c8c ha_delete_table__F7db_typePCc + 60
>>> 0x80d3bf1 mysql_rm_table_part2__FP3THDP13st_table_listbT2 + 497
>>> 0x80d38c1 mysql_rm_table__FP3THDP13st_table_listc + 177
>>> 0x807e6f1 mysql_execute_command__Fv + 8561
>>> 0x8080565 mysql_parse__FP3THDPcUi + 149
>>> 0x807bac3 dispatch_command__F19enum_server_commandP3THDPcUi + 1443
>>> 0x807b50e do_command__FP3THD + 158
>>> 0x807acfe handle_one_connection + 638
>>> 0x828ee7c pthread_start_thread + 220
>>> 0x82c258a thread_start + 4
>>>
>>>
>>> Server crashes like that (caught signal 11) have recently been
>>> observed during normal operations as well, also preceded by hd
>>> errors in the syslog:
>>>
>>> Jun 30 14:06:55 t56 kernel: hda: dma_intr: status=0x51 {
>>> DriveReadySeekComplete Error }
>>> Jun 30 14:06:55 t56 kernel: hda: dma_intr: error=0x40 {
>>> UncorrectableError }, LBAsect=186887, sector=74432
>>> Jun 30 14:06:55 t56 kernel: end_request: I/O error, dev 03:02 (hda),
>>> sector 74432
>>>
>>>
>>> The server restarted itself after that and wrote error messages to
>>> the logfile. Again, I include the stack trace:
>>>
>>> 0x8071f64 handle_segfault + 420
>>> 0x82916c8 pthread_sighandler + 184
>>> 0x82aad07 vfprintf + 6295
>>> 0x82b1645 vsprintf + 85
>>> 0x823928b ut_sprintf + 27
>>> 0x8226406 sync_array_cell_print + 166
>>> 0x8226ea4 sync_array_print_long_waits + 116
>>> 0x80f99d8 srv_error_monitor_thread + 88
>>> 0x828ee7c pthread_start_thread + 220
>>> 0x82c258a thread_start + 4
>>>
>>>
>>> I have googled the syslog messages and worked myself through several
>>> forums but can't really pinpoint the problem. It seems there are
>>> some problems with our hard disk, which could mean that it is
>>> damaged (bad blocks etc.) but what I can't see is why this is so
>>> closely related to the optimization / backup script. There definitely
>>> is a strong correlation (we do get hd errors outside the backup
>>> process, but very rarely) between running myisamchk and getting
>>> I/O errors, but I just don't know if myisamchk causes the problem or
>>> if it is prone to suffer from disk trouble more than other processes.
>>>
>>> Any help would be appreciated.
>>> Some questions I have:
>>> - How do I read the resolved stack trace? There are function calls
>>> (probably youngest first), OK, but what does that " + xxx" at the
>>> end of each line mean?
>>> - Do the function calls executed just before pthread_sighandler
>>> have something in common?
>>> - Is there a way to get more output out of myisamchk apart from -v?
>>>
>>> Hanno Fietz
>>>
>>
>>
>>
>>
>
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql