OSEC

Neohapsis is currently accepting applications for employment. For more information, please visit our website www.neohapsis.com or email hr@neohapsis.com
Re: Really strange index/speed issues

From: Chris Hemmings (listscheeky.org)
Date: Tue Sep 11 2007 - 11:34:21 CDT


Chris Hemmings wrote:
> Baron Schwartz wrote:
>> Hi Chris,
>>
>> Chris Hemmings wrote:
>>> Dan Buettner wrote:
>>>> Chris, a couple of thoughts -
>>>>
>>>> First, your index on the section is doing you no good (at this time)
>>>> since
>>>> all the values are the same. You may already know that, but thought
>>>> I'd
>>>> mention it.
>>>>
>>>> Second, my theory on why query #1 is faster - if all your prices
>>>> range from
>>>> 1 up, and you're querying for prices greater than 0, then MySQL can
>>>> just
>>>> return the first 30 rows after sorting them.
>>>>
>>>> The second query, where you are looking for prices greater than 1,
>>>> MySQL has
>>>> to sort and then examine a number of rows until it finds enough
>>>> matching
>>>> rows (price > 1) to satisfy your query. This likely takes a little
>>>> bit of
>>>> time. How many rows do you have with price = 1? It would have to
>>>> scan over
>>>> that many before it could start satisfying your query, if you think
>>>> about
>>>> it.
>>>>
>>>> HTH,
>>>> Dan
>>>>
>>>>
>>>>
>>>> On 9/10/07, Chris Hemmings <listscheeky.org> wrote:
>>>>> Hello,
>>>>>
>>>>> I have a table, currently holding 128,978 rows... In this table, I
>>>>> have a
>>>>> section column (int) and a price column (int). Every row has a
>>>>> section of
>>>>> 1
>>>>> currently, every row has a price, ranging from 1 to 10,000.
>>>>>
>>>>> I have an index on both columns separately.
>>>>>
>>>>> Have a look at these two queries, can someone tell me why there is
>>>>> such a
>>>>> difference in speed of execution? (Note difference in price
>>>>> qualifier)
>>>>>
>>>>> ########################################
>>>>>
>>>>> SELECT *
>>>>> FROM `table1`
>>>>> WHERE price >0
>>>>> AND section =1
>>>>> ORDER BY price
>>>>> LIMIT 0 , 30
>>>>>
>>>>> Showing rows 0 - 29 (128,978 total, Query took 0.9462 sec)
>>>>>
>>>>> Explain output: 1 SIMPLE table1 ALL section,price NULL NULL NULL 96734
>>>>> Using
>>>>> where; Using filesort
>>>>>
>>>>> ########################################
>>>>>
>>>>> SELECT *
>>>>> FROM `table1`
>>>>> WHERE price >1
>>>>> AND section =1
>>>>> ORDER BY price
>>>>> LIMIT 0 , 30
>>>>>
>>>>>
>>>>> Showing rows 0 - 29 (128,949 total, Query took 0.0008 sec)
>>>>>
>>>>> Explain output: 1 SIMPLE table1 ALL section,price NULL NULL NULL 96734
>>>>> Using
>>>>> where; Using filesort
>>>>>
>>>>> ########################################
>>>>>
>>>>> Other info:
>>>>>
>>>>> Query cacheing = off
>>>>> MySQL version = 5.0.32
>>>>> OS = Debian Sarge
>>>>>
>>>>> Sure, the second query returns 29 fewer records than the first, but
>>>>> should
>>>>> that make the difference in time?
>>>>>
>>>>> Hope you can shed some light onto this :-)
>>>>>
>>>>> Ta!
>>>>>
>>>>> Chris.
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> MySQL General Mailing List
>>>>> For list archives: http://lists.mysql.com/mysql
>>>>> To unsubscribe:
>>>>> http://lists.mysql.com/mysql?unsub=drbuettnergmail.com
>>>>>
>>>>>
>>>>
>>>
>>> Thanks Dan,
>>>
>>> I've got you on the section index... I was going to use that later,
>>> when I get somre real data in there.
>>>
>>> Anyway, I agree with your logic, but, the inverse is happening. The
>>> one where it has to actually exclude some rows (because price>1) is
>>> actually faster. Thats really why this has me baffled, I would
>>> presume that the price>1 would be slower as it does have to filter
>>> rows out first.
>>
>> There's an easy way to find out: FLUSH STATUS, run the query, SHOW
>> STATUS LIKE 'handler%'. Do this on an otherwise quiet server if
>> possible.Or use MySQL Query Profiler -- it does a lot of math for you
>> :-)Baron
>>
>>
>>
>>
>>
>>
>
> Thanks Baron!
>
> I think you have hit upon something, doing what you said on a 'silent'
> server, I get the following:
>
> SELECT * FROM `table1` WHERE price >1 AND section =1 ORDER BY price
> LIMIT 0 , 30;
>
> 30 rows in set (0.00 sec)
>
> mysql> SHOW STATUS LIKE 'handler%';
> +----------------------------+-------+
> | Variable_name | Value |
> +----------------------------+-------+
> | Handler_commit | 0 |
> | Handler_delete | 0 |
> | Handler_discover | 0 |
> | Handler_prepare | 0 |
> | Handler_read_first | 0 |
> | Handler_read_key | 1 |
> | Handler_read_next | 29 |
> | Handler_read_prev | 0 |
> | Handler_read_rnd | 0 |
> | Handler_read_rnd_next | 0 |
> | Handler_rollback | 0 |
> | Handler_savepoint | 0 |
> | Handler_savepoint_rollback | 0 |
> | Handler_update | 0 |
> | Handler_write | 14 |
> +----------------------------+-------+
> 15 rows in set (0.00 sec)
>
>
> SELECT * FROM `table1` WHERE price >0 AND section =1 ORDER BY price
> LIMIT 0 , 30;
>
> 30 rows in set (0.95 sec)
>
> mysql> SHOW STATUS LIKE 'handler%';
> +----------------------------+--------+
> | Variable_name | Value |
> +----------------------------+--------+
> | Handler_commit | 0 |
> | Handler_delete | 0 |
> | Handler_discover | 0 |
> | Handler_prepare | 0 |
> | Handler_read_first | 0 |
> | Handler_read_key | 1 |
> | Handler_read_next | 128978 |
> | Handler_read_prev | 0 |
> | Handler_read_rnd | 30 |
> | Handler_read_rnd_next | 0 |
> | Handler_rollback | 0 |
> | Handler_savepoint | 0 |
> | Handler_savepoint_rollback | 0 |
> | Handler_update | 0 |
> | Handler_write | 14 |
> +----------------------------+--------+
> 15 rows in set (0.01 sec)
>
> So, the slower query obvisouly has the larger Handler_read_next number
> in it. Looking at the manual, it says the following for that variable:
>
> "The number of requests to read the next row in key order. This value is
> incremented if you are querying an index column with a range constraint
> or if you are doing an index scan."
>
> So, being a bit of a MySQL novice, I can't really see how >0 increases
> the number of reads by such a VAST amount... Can you guys put this in
> layman's terms for me?
>
> Thanks for the help guys :-)
>
> Chris.
>

Hi Guys,

I'm still not having much luck with this, but, I have spotted something
that maybe I missed earlier.

Doing an explain on both queries again produces the following:

explain select * from table1 where price>0 and section=1 order by price
limit 0,20;

| 1 | SIMPLE | table1 | range | section,price | section | 4
| NULL | 128966 | Using where; Using filesort |

explain select * from table1 where price>1 and section=1 order by price
limit 0,20;

| 1 | SIMPLE | table1 | range | section,price | price | 4 |
NULL | 128951 | Using where |

I'm not sure why these 'explains' looks different to the previous
example. For some reason, the second query (price>1) is using the
correct key, but on the first query, the wrong key is being used and
then a filesort has to take place.

Any ideas why this seems to happen, why should the incorrect key be
selected just by changing the value to qualify on?

I have tried a few more queries that are similar on other tables if you
would like them.

Thanks,

Chris.

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql