blogs.conchango.com

welcome to the conchango blogging site
Welcome to blogs.conchango.com Sign in | Join | Help
in Search

Jim 2.0

SQL Quick Tip: Using Top N to browse a database

If you ever need to browse through a database, and dip in and out of tables to see what the data looks like, but you have very little data to go on, then you will be aware that simply running a select * against each table can be a time consuming and inefficient process. I am in this situation currently with a database that has no enforced referential integrity between it's tables that I am having to explore (foreign keys exist but aren't enforced at the database level).

In such situations, I like to use the top N command to return just a portion of the data. This gives me enough to get a feel for what the data in a table looks like, but doesn't waste time trying to return all the rows. The following will return the top N rows for T-SQL and PL/SQL respectively: 

T-SQL

SELECT  TOP 100 field
FROM     tablename

PL/SQL

SELECT  field
FROM    tablename
WHERE  rownum <= 100

Obviously the top N rows will depend on what column the ordering is based on, and as above will run on whatever the default is. You can specify an order by clause, although this will increase the time taken to run the query.

James

Published 09 August 2007 23:23 by James.Pipe
Filed under: , , , ,

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

David.Portas said:

There are some differences between ROWNUM and TOP. ROWNUM is evaluated *before* ORDER BY so it does not respect the sort order in the outer part of the query. You have to construct a query such that the result is presorted. Ex:

SELECT * FROM (SELECT * FROM employees ORDER BY salary) WHERE ROWNUM < 5; However, not all query operations preserve order so the result is potentially undefined in some cases.

TOP will use the ORDER BY specified on the same SELECT statement as the TOP keyword. There is NO default ordering. If TOP is used without ORDER BY then the result is undefined and will be determined at runtime by the execution plan. If you have multiple processors and multiple data files for example there is every chance of getting different sets of rows returned from successive executions.

August 10, 2007 16:38
 

Mohit Nayyar said:

We can make use of something like this script to check the existence of a specific column in a given database; even this script will also highlight the constraints like Primary Key or Foreign Key (if available).

SELECT base.TABLE_NAME, base.COLUMN_NAME, cons.CONSTRAINT_TYPE

FROM INFORMATION_SCHEMA.COLUMNS base

LEFT JOIN INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE usage ON (base.TABLE_SCHEMA = usage.TABLE_SCHEMA and base.TABLE_NAME = usage.TABLE_NAME and base.COLUMN_NAME = usage.COLUMN_NAME)

LEFT JOIN INFORMATION_SCHEMA.TABLE_CONSTRAINTS CONS on (base.TABLE_SCHEMA = usage.TABLE_SCHEMA and usage.CONSTRAINT_NAME = cons.CONSTRAINT_NAME and usage.TABLE_NAME = cons.TABLE_NAME)

WHERE base.COLUMN_NAME = 'StreamDetailID'

And using TOP clause to check data is a really good example. We can also use “SET ROWCOUNT” to do the same stuff and with this there is no need to change the existing query and putting TOP clause in that. And even SET ROWCOUNT limit applies after ORDER BY clause.

On the same lines, in SQL Server 2005 we can make TOP clause parametrized as well. That means at run time we can decide how many rows we actually need. TOP Percent is another good example when we need to pull a specific percentage of result set instead of any fixed number.

Mohit Nayyar

August 14, 2007 17:49

Leave a Comment

(required) 
(optional)
(required) 
Submit
Powered by Community Server (Personal Edition), by Telligent Systems