blogs.conchango.com

welcome to the conchango blogging site
Welcome to blogs.conchango.com Sign in | Join | Help
in Search

Jim 2.0

PL-SQL: Returning a value from the previous row in a query

For any number of reasons which I won’t go into, you may have a need, or desire, to return a value in a query from one row next to a value from another. A typical example might be in a daily sales database, where I want to see yesterday’s total sales against today’s total sales. It is straight forward enough to write the query to return the total sales for each day, the question however, is how do we display the sales for each day alongside the sales from the previous day?

The long way round:

One way we can do this is to carry out a self join on the query, offset based on the date, e.g.:

    1 select  ds1.SalesDate

    2 ,       ds1.SalesValue

    3 ,       ds2.SalesDate

    4 ,       ds2.SalesValue

    5 from    Daily_sales ds1

    6 ,       Daily_sales ds2

    7 where   ds2.SalesDate = (

    8                         select    max(ds.SalesDate)

    9                         from      Daily_sales ds

   10                         where     ds.SalesDate < ds1.SalesDate

   11                         )

   12 order   by ds1.SalesDate

   13 ;

As you can see, this is a very simple join and probably not very robust. If the database had data for a chain of stores for example, you may be interested to see the sales data each day against the previous day for each store, in which case you need to join on the field which identifies the store, in this example, StoreID:

    1 select  ds1.StoreID

    2 ,       ds1.SalesDate

    3 ,       ds1.SalesValue

    4 ,       ds2.SalesDate

    5 ,       ds2.SalesValue

    6 from    Daily_sales ds1

    7 inner   join Daily_sales ds2

    8 on      ds1.StoreID = ds2.StoreID

    9 where   ds2.SalesDate = (

   10                         select    max(ds.SalesDate)

   11                         from      Daily_sales ds

   12                         where     ds.SalesDate < ds1.SalesDate

   13                         and       ds.StoreID = ds1.StoreID

   14                         )

   15 order   by ds1.StoreID

   16 ,       ds1.SalesDate

   17 ;

Things get even more complex (and messy) if you have a database which instead of storing the total sales for each day, stores all sales for each day. In order to see total sales by day against total sales from the previous day for each store therefore, we need to write a query to sum the sales for each day and then perform a self join on this query using the same principles as above:

    1 select  q1.StoreID

    2 ,       q1.SalesDay

    3 ,       q1.DaySales

    4 ,       q2.PreviousSalesDay

    5 ,       q2.PreviousDaySales

    6 from    (

    7         select  ds.StoreID

    8         ,       trunc(ds.SalesDate) SalesDay

    9         ,       sum(ds.SalesValue) DaySales

   10         from    Daily_sales ds

   11         group   by ds.StoreID

   12         ,       trunc(ds.SalesDate)

   13         ) q1

   14 inner   join (

   15         select  ds.StoreID

   16         ,       trunc(ds.SalesDate) PreviousSalesDay

   17         ,       sum(ds.SalesValue) PreviousDaySales

   18         from    Daily_sales ds

   19         group   by ds.StoreID

   20         ,       trunc(ds.SalesDate)

   21         ) q2

   22 on      q1.StoreID = q2.StoreID

   23 where   q2.SalesDay = (

   24                       select    max(sq1.SalesDay) MaxSalesDay

   25                       from      (

   26                                 select  ds.StoreID

   27                                 ,       trunc(ds.SalesDate) SalesDay

   28                                 ,       sum(ds.SalesValue) DaySales

   29                                 from    Daily_sales ds

   30                                 group   by ds.StoreID

   31                                 ,       trunc(ds.SalesDate)

   32                                 ) sq1

   33                       where     sq1.StoreID = q1.StoreID

   34                       and       sq1.SalesDay < q1.SalesDay

   35                       )

   36 order   by q1.StoreID

   37 ,       q1.SalesDay

   38 ;

Using Analytic functions:

We can achieve everything we have done above much more simply and efficiently using the lag analytic function, which I will now demonstrate. First, the list of sales for each day along side the previous days sales:

    1 select  ds.SalesDate

    2 ,       ds.SalesValue

    3 ,       lag(ds.SalesDate,1) over (order by ds.SalesDate) PreviousDate

    4 ,       lag(ds.SalesValue,1) over (order by ds.SalesDate) PreviousSales

    5 from    Daily_sales ds

    6 order   by ds.SalesDate

    7 ;

To view the data by store we can use partition by:

    1 select  ds.StoreID

    2 ,       ds.SalesDate

    3 ,       ds.SalesValue

    4 ,       lag(ds.SalesDate,1) over (partition by ds.StoreID order by ds.SalesDate) PreviousDate

    5 ,       lag(ds.SalesValue,1) over (partition by ds.StoreID order by ds.SalesDate) PreviousSales

    6 from    Daily_sales ds

    7 order   by ds.StoreID

    8 ,       ds.SalesDate

    9 ;

Now for the final example we see how we can simply use the sum function for the scenario where our database stores individual sales for the day that have to be aggregated to give the total sales each day:

    1 select  ds.StoreID

    2 ,       trunc(ds.SalesDate) SalesDay

    3 ,       sum(ds.SalesValue) DaySales

    4 ,       lag(trunc(ds.SalesDate),1) over (partition by ds.StoreID order by trunc(ds.SalesDate)) PreviousDay

    5 ,       lag(sum(ds.SalesValue),1) over (partition by ds.StoreID order by trunc(ds.SalesDate)) PreviousSales

    6 from    Daily_sales ds

    7 group   by ds.StoreID

    8 ,       trunc(ds.SalesDate)

    9 order   by ds.StoreID

   10 ,       ds.SalesDate

   11 ;

It is immediately obvious from these examples how much more succinct and easy to apply these functions are compared to the equivalent queries using joins. Furthermore, in performance terms they are much faster to execute. Finally, because they are less complicated to apply in certain situations such as this example, they can be much more accurate, predominantly because there is less scope to make mistakes when writing the query. Using the self joins it is critical that the where clauses and calculations for example are executed in the correct places, but that issue is avoided using the analytic functions.

Notes for T-SQL:

Whilst analytic functions are not specific to PL-SQL, Oracle appears to offer a wider range of them than in T-SQL and it seems there is no T-SQL equivalent to the Lag and Lead functions. It is still possible however to capture the previous/next row value more efficiently than with a self join by using the MIN/MAX functions and the Where clause as you would in the self join to specify the criteria. See here for an example (under Reporting Aggregate Functions).

James

Published 24 January 2007 20:33 by James.Pipe

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

Jim 2.0 said:

Hi, following on from my previous two posts about Analytic Functions , and specifically about the lack

January 30, 2007 03:51
 

Jim 2.0 said:

. . . Or making the complex simpler. Following on from my previous post , Analytic Functions are powerful

January 30, 2007 23:04
 

sarvendra said:

can u please provide more on offset value in lag function.............i need to change this value according to the max no. of rows in my table.

can i use

lag('col_name',max(rownum)/4,0) over(order by col2_name)............

November 26, 2007 06:41
 

James.Pipe said:

Looks like you have too many arguments in the lag function

lag('col_name',max(rownum)/4) over(order by col2_name) looks like it should work. Try it and see.

November 28, 2007 15:13
 

Adrian said:

hi can someone help me in removing white spaces from this query

sqlplus -s $DB_USERID_RW/$DB_PASSWORD_RW<<_EOF  

     set role dba;

     set heading off;

     set show off;

     set verify off;

     set echo off;

     set feedback off;

     spool $DM_SIT/mrsc/tablespace_gt_90.lst    

     select b.tablespace_name, b.bytes, b.file_name

     from dba_free_space a, (select distinct(tablespace_name) tablespace_name,

                             sum(bytes) bytes, file_name  

                             from dba_data_files

                             where tablespace_name not like '%ROLL%' and

                                   tablespace_name not like '%TMP%'

                                   group by tablespace_name, file_name) b

     where a.tablespace_name = b.tablespace_name

     group by b.tablespace_name, b.file_name, b.bytes

     having ((b.bytes - sum(nvl(a.bytes,0)))/(b.bytes))*100 > 90

     order by b.tablespace_name, b.file_name, b.bytes;

     spool off;  

_EOF

this is the output

PERFSTAT                       104857600

/var/opt/vgdb10/u01/oracle_dbf/rtcis/data/rtcis_perfstat_01.dbf

SL_DATA_OTHER                  1192230912

/var/opt/vgdb10/u03/oracle_dbf/rtcis/data/sl_data_other.dbf

SYSTEM                         157286400

/var/opt/vgdb10/u02/oracle_dbf/rtcis/data/rtcis_system_01.dbf

May 14, 2008 18:22

Leave a Comment

(required) 
(optional)
(required) 
Submit
Powered by Community Server (Personal Edition), by Telligent Systems