OSEC

Neohapsis is currently accepting applications for employment. For more information, please visit our website www.neohapsis.com or email hr@neohapsis.com
 
RE: adding primary index after inserting data

From: Knappke, Christian (christian.knappkesap.com)
Date: Thu Jan 22 2004 - 01:51:40 CST


> i am running maxdb 7.50 Build005 on debian 3.0r2 with kernel
> 2.4.23-xfs.
>
> i created a table
>
> CREATE TABLE "ACCOUNTING"."CISCO3640BR_200310"
> (
> "ID" Integer NOT NULL DEFAULT SERIAL (1),
> "SOURCE" Varchar (16) ASCII NOT NULL DEFAULT ' ',
> "DESTINATION" Varchar (16) ASCII NOT NULL DEFAULT ' ',
> "PACKETS" Integer,
> "BYTES" Integer,
> "TIMSTAMP" Timestamp NOT NULL
> )
>
> i disabled the log and did not add an primary key and no index.

MaxDB doesn't have any tables without a primary key. If you don't
define it, then MaxDB creates a (hidden) key for you.

The table data is stored in a B* tree structure and is sorted by
the primary key. Always. Period.

> afterwards i inserted the data from a csv file (about 24mio rows)

So these 24000000 rows are sorted by the generated key.
 
> the insert took on the test mashine 2hours (it has an slow megatrend
> raid5 controller :)
>
> afterwards - still with disabled log - i tried to add an primary key
> over the ID field.
>
> ALTER TABLE "ACCOUNTING"."CISCO3640BR_200310" ADD PRIMARY KEY ("ID")

Now you force a completely different sort order of the 24000000 rows.
It also affords a complete rewrite of all data pages.

> now the sqlstudio is hanging since 5day's. and the machine is doing
> nearly nothing.

This *can* take some time, depending on the system resources (memory,
CPU, I/O).

> it does not respond to taskmgr and doesnt accept any commands, but it
> redraws from time to time the screen.
>
> any ideas? is this possible to reindex 24mio rows?

It is possible (and quite fast) to create/alter *indexes*. But you should
be careful to alter the *primary key*. Note the difference.

You should define the primary key when you "CREATE TABLE ...". Other
(secondary) indexes may be created after the table load.

> and maxdb eats up more space on harddisk, as mysql does.

MaxDB is a different DBMS. You can't compare only this part without
taking other benefits into account.

> the table need's currently with maxdb 2.4gb space in harddisk
> (without primary key and without any index.)

The primary key is part of the table data and doesn't take extra space.

> mysql need's for the data 1.2gb and for index 631mb
>
> is it normal, that maxdb need 3time's more space?

You meant 30% more. It depends a lot on the data types, DEFAULT constraints etc.

HTH
Christian
--
#include <std_disclaimer.h>
/* The opinions stated above are my own and not
   necessarily those of my employer. */

--
MaxDB Discussion Mailing List
For list archives: http://lists.mysql.com/maxdb