|
Neohapsis is currently accepting applications for employment. For more information, please visit our website www.neohapsis.com or email hr@neohapsis.com |
From: Olaf Stein (steino
ccri.net)
Date: Wed Sep 05 2007 - 14:47:33 CDT
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Thanks baron,
I will try this just for test purposes as I already wrote a script, which is
slow but not as bad as using IN()
Olaf
On 9/5/07 3:29 PM, "Baron Schwartz" <baron
xaprb.com> wrote:
> IN() subqueries in MySQL are badly optimized. It's usually better to
> use a JOIN, even though it's non-standard:
>
> DELETE geno_260k.*
> FROM geno_260k
> INNER JOIN (
> SELECT ident FROM geno_260k
> WHERE a1=0
> GROUP BY ident HAVING count(*)>250000
> ) AS der USING(ident);
>
> Try profiling this and see if it's faster. It probably will be on any
> reasonably large data set, as long as the table has an index on ident.
>
> Note I changed the COUNT(a1) to COUNT(*) for efficiency. Counting a
> column counts the number of values (e.g. non-null). Counting * just
> counts the number of rows and can be faster. COUNT(*) is what you want
> to use 99% of the time.
>
> Regards
> Baron
>
> Olaf Stein wrote:
>> Thanks,
>>
>> This seems to work but that IN seems to be really slow...
>>
>>
>> On 9/5/07 9:41 AM, "Justin" <mysql
b0rker.com> wrote:
>>
>>> try
>>>
>>> SELECT * from geno_260k WHERE ident IN (SELECT ident FROM geno_260k WHERE
>>> a1=0
>>> GROUP BY ident HAVING count(a1)>250000);
>>>
>>> This will give you what you're deleting first.. then if that is good. do
>>>
>>>
>>> DELETE FROM geno_260k WHERE ident IN (SELECT ident FROM geno_260k WHERE
>>> a1=0
>>> GROUP BY ident HAVING count(a1)>250000);
>>>
>>> (note the change in case is just my way of seeing things.. it's not
>>> necessary that I know of)
>>>
>>>
>>> ----- Original Message -----
>>> From: "Olaf Stein" <steino
ccri.net>
>>> To: "MySql" <mysql
lists.mysql.com>
>>> Sent: Wednesday, September 05, 2007 9:35 AM
>>> Subject: Delete query question
>>>
>>>
>>>> Hey all
>>>>
>>>> I am stuck here (thinking wise) and need some ideas:
>>>>
>>>> I have this table:
>>>>
>>>> CREATE TABLE `geno_260k` (
>>>> `genotype_id` int(10) unsigned NOT NULL auto_increment,
>>>> `ident` int(10) unsigned NOT NULL,
>>>> `marker_id` int(10) unsigned NOT NULL,
>>>> `a1` tinyint(3) unsigned NOT NULL,
>>>> `a2` tinyint(3) unsigned NOT NULL default '0',
>>>> PRIMARY KEY (`genotype_id`),
>>>> KEY `ident` (`ident`),
>>>> KEY `marker_id` (`marker_id`),
>>>> CONSTRAINT `geno_260k_ibfk_2` FOREIGN KEY (`marker_id`) REFERENCES
>>>> `markers` (`marker_id`),
>>>> CONSTRAINT `geno_260k_ibfk_1` FOREIGN KEY (`ident`) REFERENCES
>>>> `individual` (`ident`)
>>>> ) ENGINE=InnoDB AUTO_INCREMENT=91318273 DEFAULT CHARSET=utf8
>>>>
>>>>
>>>> And with the following query I get 159 ident's back:
>>>>
>>>> select ident from geno_260k where a1=0 group by ident having
>>>> count(a1)>250000;
>>>>
>>>> I want to delete all records containing those idents (about 260000 per
>>>> ident
>>>> so 159*260000).
>>>> So I thought
>>>>
>>>> delete from geno_260k where ident=(select ident from geno_260k where a1=0
>>>> group by ident having count(a1)>250000);
>>>>
>>>> But mysql can not select and delete from the same table.
>>>>
>>>> Any ideas?
>>>>
>>>> Thanks
>>>> Olaf
>>>>
>>>>
>>>>
>>>> --
>>>> MySQL General Mailing List
>>>> For list archives: http://lists.mysql.com/mysql
>>>> To unsubscribe: http://lists.mysql.com/mysql?unsub=mysql
b0rker.com
>>>>
>>>>
>>
>>
>>
>>
>>
>> -------------------------
>> Olaf Stein
>> DBA
>> Center for Quantitative and Computational Biology
>> Columbus Children's Research Institute
>> 700 Children's Drive
>> phone: 1-614-355-5685
>> cell: 1-614-843-0432
>> email: steino
ccri.net
>>
>>
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]