A colleague just came to me to ask if I knew how to make a particular thing happen in SQL; turns out I’d done what he wanted before, so it jogged a memory and I could solve it. But I don’t know what the term of art is for this pattern. Basically, it’s a missing-values problem: you have a large number of observations from, let’s say, every hour of the day; they’re stored in the form of (day, hour, value) triples. You expect one value for every hour of every day. When the value should be there but isn’t, fill it in with some default.
For this I use a cross product, like so:
create view expected as select distinct day, hour from ( select distinct day from my_table ) a, ( select distinct hour from my_table ) b;
The values we actually do have would be retrieved with “select * from my_table”. Now if we want to find (day, hour) pairs for which we don’t have values, we do
select a.day day, a.hour hour, nvl(b.value, 0) value from expected a left outer join my_table b on a.day = b.day and a.hour = b.hour
nvl(a, b) is a function that returns
a is null,
a otherwise; the PostgreSQL equivalent is
This solution works okay, so long as the cross product of days and hours isn’t too large. In the specific case of days and hours, it probably won’t be. But in the more generic case where you’d left-outer-joining a cross product to a set of actual data, there’s no guarantee that the cross product will be small.
So is there a SQL term of art for this? Is there a more elegant solution?