OSEC

Neohapsis is currently accepting applications for employment. For more information, please visit our website www.neohapsis.com or email hr@neohapsis.com
Re: Possible to find this duplicate?

From: Mark Goodge (markgood-stuff.co.uk)
Date: Sat Feb 13 2010 - 11:06:10 CST


On 13/02/2010 16:12, Brian Dunning wrote:
> Hey all -
>
> I have a table listing references for chapters in a book. I'm trying to find all the cases where a single chapter lists more than one reference from the same author. In this example table, I want it to find IDs 1 and 2, because they're both from the same author, and both in chapter 1 of the book. It should not return ID 4, because that's in a different chapter.
>
> Note that J. and John have to be considered the same. For my purposes, it's sufficient to look at the first word, Smith, and consider that a duplicate.
>
> +----+--------------+---------+
> | ID | Author | Chapter |
> +----+--------------+---------+
> | 1 | Smith, John | 1 |
> | 2 | Smith, J. | 1 |
> | 3 | Williams, B. | 1 |
> | 4 | Smith, John | 2 |
> +----+--------------+---------+
>
> I haven't been able to even get a start on this. Any suggestions?

Try this:

   select
   count(id) as total,
   concat(substring_index(Author,",",1),Chapter) as my_reference
   from my_table
   group by my_reference
   having total > 1

That may or may not work straight off, I haven't tested it. But the
thing you're looking for is something involving a substring_index on the
Author column.

Mark

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql