Monday, March 15, 2010

Formulating Expressions a Step at a Time: Lazy Evaluation

I am reading this book, and I found the section with the title “Formulating Expressions a Step at a Time” particularly interesting:

First, the book shows the query for the sentence "Get pairs of supplier numbers such that the suppliers concerned are collocated (i.e., are in the same city)" written in Tutorial D:

( ( ( S RENAME ( SNO AS SA ) ) { SA , CITY } JOIN
( S RENAME ( SNO AS SB ) ) { SB , CITY } )
WHERE SA < SB ) { SA , SB }

And then it proceeds to show you how to write this query in a more readable (step by step) way:

WITH ( S RENAME ( SNO AS SA ) ) { SA , CITY } AS R1 ,
( S RENAME ( SNO AS SB ) ) { SB , CITY } AS R2 ,
R1 JOIN R2 AS R3 ,
R3 WHERE SA < SB AS R4 :
R4 { SA, SB }

Finally, it shows you how to write this query in SQL:

WITH T1 AS ( SELECT SNO AS SA , CITY
FROM S ) ,
T2 AS ( SELECT SNO AS SB , CITY
FROM S ) ,
T3 AS ( SELECT *
FROM T1 NATURAL JOIN T2 ) ,
T4 AS ( SELECT *
FROM T3
WHERE SA < SB )
SELECT SA , SB
FROM T4

Thanks to the “with” keyword, both in SQL and in Tutorial D, it is possible to deal with this query in a “step by step” way, instead of having to deal with it as a single hard-to-write and hard-to-read expression (note that this is not a recursive query, so the with keyword is not being used for that purpose in these examples).
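To check that the step-by-step form really is just another spelling of the nested expression, here is a minimal sketch that runs both against SQLite from Python. The sample rows in S and the variable names are my own assumptions for illustration; they are not taken from the book, and SQLite is only standing in for whatever SQL product you use.

```python
import sqlite3

# Hypothetical sample data for the suppliers table S (illustrative only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE S (SNO TEXT PRIMARY KEY, CITY TEXT)")
conn.executemany(
    "INSERT INTO S (SNO, CITY) VALUES (?, ?)",
    [("S1", "London"), ("S2", "Paris"), ("S3", "Paris"),
     ("S4", "London"), ("S5", "Athens")],
)

# Step-by-step formulation, mirroring the WITH query above.
stepwise_sql = """
WITH T1 AS ( SELECT SNO AS SA, CITY FROM S ),
     T2 AS ( SELECT SNO AS SB, CITY FROM S ),
     T3 AS ( SELECT * FROM T1 NATURAL JOIN T2 ),
     T4 AS ( SELECT * FROM T3 WHERE SA < SB )
SELECT SA, SB FROM T4 ORDER BY SA, SB
"""

# The same query written as one nested expression.
nested_sql = """
SELECT SA, SB
FROM ( SELECT SNO AS SA, CITY FROM S ) AS A
     NATURAL JOIN
     ( SELECT SNO AS SB, CITY FROM S ) AS B
WHERE SA < SB
ORDER BY SA, SB
"""

stepwise_rows = conn.execute(stepwise_sql).fetchall()
nested_rows = conn.execute(nested_sql).fetchall()

# Both formulations return the same pairs of collocated suppliers.
print(stepwise_rows)
print(stepwise_rows == nested_rows)
```

With this made-up data, both statements return the London pair and the Paris pair; the supplier in Athens has no partner and does not appear.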

Sadly, so far I have been unable to find an equivalent for this syntax in Dataphor… While it is possible to write something like the following (I am not 100% confident the syntax is right, but I think it should give you the general idea):

var R1 := S {SNO SA, CITY}
var R2 := S {SNO SB, CITY}
var R3 := R1 JOIN R2
var R4 := R3 WHERE SA < SB
select R4 {SA, SB}

In Dataphor, the variables (R1, R2, etc.) are not lazily evaluated, and therefore the performance is not as good as in, for example, the SQL case. (I ran a similar example in SQL Server, and the expressions were evaluated lazily, at the end, instead of one by one, resulting in far better performance.) Maybe I am doing something wrong?

I wonder how hard it would be to make Dataphor generate SQL using the WITH keyword for those databases that support it (so far the latest versions of SQL Server (2008) and Oracle (10 & 11) seem to support this syntax)… I guess it is time to ask the Dataphor authors…
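Just to make the idea concrete, here is a rough sketch of what "generating SQL with WITH" could look like: the named steps are kept as unevaluated expression text and only folded into one statement at the end, so the database sees the whole query at once. This is not how Dataphor works internally (I do not know that); build_with_query, the step list, and the sample rows are hypothetical names and data I made up, and SQLite is used only because it is easy to run.

```python
import sqlite3

def build_with_query(steps, final_select):
    """Fold an ordered list of (name, sql_text) steps into one WITH statement.

    Hypothetical helper for illustration: nothing is executed here, the
    steps are only combined as text.
    """
    ctes = ",\n     ".join(f"{name} AS ({body})" for name, body in steps)
    return f"WITH {ctes}\n{final_select}"

# Each step is just a named expression; none of them has been run yet.
steps = [
    ("R1", "SELECT SNO AS SA, CITY FROM S"),
    ("R2", "SELECT SNO AS SB, CITY FROM S"),
    ("R3", "SELECT * FROM R1 NATURAL JOIN R2"),
    ("R4", "SELECT * FROM R3 WHERE SA < SB"),
]
query = build_with_query(steps, "SELECT SA, SB FROM R4 ORDER BY SA, SB")

# Assumed sample data, only so the generated statement can be executed.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE S (SNO TEXT PRIMARY KEY, CITY TEXT)")
conn.executemany(
    "INSERT INTO S (SNO, CITY) VALUES (?, ?)",
    [("S1", "London"), ("S2", "Paris"), ("S3", "Paris"), ("S4", "London")],
)

# The single generated statement is sent to the database in one go.
generated_rows = conn.execute(query).fetchall()
print(query)
print(generated_rows)
```

The string concatenation is the easy part; the hard part for a real product would presumably be deciding which steps can be translated to SQL at all.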

1 comment:

Nathan Allan said...

Expression indirection syntax would be a valuable addition to Dataphor for sure. Challenges include getting the syntax right and translating into SQL. It would also be tempting to overuse the feature, especially by the VB crowd who seem to loathe expressiveness. Challenges aside, this is a feature that we've wanted almost since the beginning though.