dbi_doing_top_n_queries_aka_limit_in_oracle

This is part of The Pile, a partial archive of some open source mailing lists and newsgroups.



To: Tony Foiani <anthony_foiani@non.hp.com>
From: Tony Foiani <anthony_foiani@non.hp.com>
Subject: Re: Limiting the number of records selected.
Date: 13 Mar 2001 18:32:55 -0700

Continuing the ongoing saga...  I wrote that the "top-n" queries in
Oracle can be handled by using a nested SELECT with an ORDER BY, then
using ROWNUM in the external WHERE clause.  But...

> I just tried it on our instance and it doesn't work.  We are on
> Oracle7 Server Release 7.3.4.5.0.  Is this something that only works
> in Oracle 8+ ?  It doesn't seem to like the ORDER BY clause in the
> FROM clause query.  If I take ORDER BY dsc out the query runs but
> not with the desired results.  What gives?

Well, I saw this feature (for doing "top-n" queries) listed as one of
the reasons to migrate to Oracle8i:

   http://www.elementkjournals.com/dbm/0003/dbm0031.htm

So I wouldn't be surprised if this were indeed a version 8 feature.

Oh, and here's the 8.1.5 documentation on this feature:

   http://oradoc.photo.net/ora81/DOC/server.815/a68003/rollup_c.htm#33316

(This is obviously where the docs I referenced earlier are pointing,
but it seems to be removed from the 8.1.6 documentation set.  Uhg.)

> | SQL>r 
> |   1  SELECT * 
> |   2     FROM ( SELECT * FROM codes ORDER BY dsc ) 
> |   3*    WHERE ROWNUM <= 10 
> |    FROM ( SELECT * FROM codes ORDER BY dsc ) 
> |                               * 
> | ERROR at line 2: 
> | ORA-00907: missing right parenthesis 
>
> If I take ORDER BY dsc out the query runs but not with the desired
> results:
>
> | SQL>r 
> |   1  SELECT * 
> |   2     FROM ( SELECT * FROM codes ) 
> |   3*    WHERE ROWNUM <= 10 
>
> What gives?

As I said above, it probably is an Oracle 8 feature.

You can always "brute force" this programmatically; just grab them
all, in order, then display only the ones you're interested in.  You
can always finish the handle after you've found those.

If the records are very wide, you might save some time by returning
only an identifying field in the desired sort order; after you grab
the id fields for the relevant range of records, you can then fetch
exactly those using an IN clause or similar.  (But I'd guess that the
extra execute would be more costly, unless you have lots of very wide
fields in the SELECTed columns!)

Hm.  Here's an arcane way of doing it:

   http://www.caribdata.co.uk/additional/new_user.html#top_n

And a variant on the same:

   http://www.cstone.net/~phh5j/sqlfaq.html#TOP

(Both were the result for a google.com search for "oracle top-n
query".)

If you have a lot of records, however, I'd expect that asking Oracle
to just give you all of them, in sequence, would be faster.  Both of
the above methods appear to be quadratic algorithms, while a simple
ordered query should run in only n-log-n time.

The XSQL servlet has "max-rows" and "skip-rows" values for this
reason:

   http://otn.oracle.com/tech/xml/xsql_servlet/htdocs/relnotes.htm#ID2827

But I would guess it's doing exactly as I suggest above -- get the
whole thing, and only show the ones you care about (and quit early, if
possible).

In your case, I'd just do:

| my $sql = "SELECT * FROM codes ORDER BY dsc";
| my $sth = $dbh->prepare($sql);
| $sth->execute();
| my $n = 0;
| while (my $cur = $sth->fetch())
| {
|     $n++;
|     next if $n < $start_row_num;
|     last if $n > $last_row_num;
|     # do stuff with $cur here
| }
| $sth->finish();

Hope this helps,
t.

===

To: Bill OConnor <bill_oconnor@yahoo.com>
From: Mark Thornber <emthornber@iee.org>
Subject: Re: Limiting the number of records selected.
Date: Wed, 14 Mar 2001 08:34:16 +0000

Bill,

Try to find a copy of 'SQL for Smarties' by Joe Celko.

Chapter 25 Subsets has example code for the Top N values which I have
successfully used in the past to get Next N as well.

The final iteration of the code to display top three salaries (taken
from the book) is :-

select distinct count(*), a.salary
from employees as a, employees as b
where (a.salary <= b.salary)
group by a.salary
having count(*) <=3;

(I did have to translate the SQL into the local version :-^)

In general what is being done is find the number of rows 'less than' the
target row and use that to order and then select the required target
rows.

Hope that makes sense.


===
To: dbi-users@perl.org
From: "Difalco, Phil" <PJDa@pge.com>
Subject: RE: Limiting the number of records selected.
Date: Wed, 14 Mar 2001 12:04:13 -0800

The select statement below, only works if the top 3 salaries are unique.

If two (or more) of the top salaried people have similar salaries - this
query breaks down (for the intended result).

So, as long as the target column is unique, this is an "great" way to limit
selected rows...
Thanks...


-----Original Message-----
From: Mark Thornber [mailto:emthornber@iee.org]
Sent: Wednesday, March 14, 2001 12:34 AM
Subject: Re: Limiting the number of records selected.


Try to find a copy of 'SQL for Smarties' by Joe Celko.

Chapter 25 Subsets has example code for the Top N values which I have
successfully used in the past to get Next N as well.

The final iteration of the code to display top three salaries (taken from
the book) is :-

	SELECT DISTINCT COUNT(*), a.salary
	FROM employees AS a, employees AS b
	WHERE (a.salary <= b.salary)
	GROUP BY a.salary
	HAVING COUNT(*) <=3;

(I did have to translate the SQL into the local version :-^)

In general what is being done is find the number of rows 'less than' the
target row and use that to order and then select the required target rows.

===


the rest of The Pile (a partial mailing list archive)

doom@kzsu.stanford.edu