How to determine the Record Offset given a primary key and a defined order by clause – Part 1 Dev Sneak Peek vers 1.2.5
Apr 162010

Part 2

See Part 1 of this article How to determine the Record Offset given a primary key and a defined order by clause – Part 1

To summarize where we left off in part 1, our user sorted our table by age descending and selected 16th record on the 5th page (25 records per page). The record has on offset of 140. The user book marks the record. 150 records are added (100 of them with an age less than 35) and 50 records are deleted (30 of them with an age less than 35). For a net change of 70 records with an age less than 35, resulting in a offset of 210.

So the question is how do we NOW determine the offset of the record that was book marked ONE week ago after the table has been updated?

One way we could do this is to execute the query again with the same sort and then loop through the records, counting them until we find the record with the primary key id the user book marked. Here’s our user’s book mark URL

http://www.example.com/pagedgui.php?id=48&sort1=age&dir1=desc

$pk_id = $_GET['id']
$sort = $_GET['sort1'] . ' ' . $_GET['dir1']
$sql = 'select * from students order by ' . $sort
mysql_execute_query($sql);

$offset = 0
for each record
    if the primary key id of the current record == $pk_id then
        exit for
    end if
    $offset++
end for each

When the loop exits $offset will have the offset of the record we’re looking for. Its kind of inefficient, but its simple and straight forward to implement. But what if you have a table with 300K plus records? And what if you have lots of users all accessing the web site at the same time? This probably isn’t the best solution.

Wait a minute! Doesn’t SQL have a count(*) function, can’t we use that? Yes and No. Yes we could use it, but not without doing some extra work as the only thing we have is the primary key id of the book marked record. So this would require us to lookup the value of the age of the record tat was book marked.

$pk_id = $_GET['id']
$sort = $_GET['sort1'] . ' ' . $_GET['dir1']
$sql = 'select age from students where id = ' . $pk_id
mysql_execute_query($sql);

// returns a value of 35, see the previous post
$age = age of the record returned

// count all records with an age greater than 35 (descending sort here)
$sql = 'select count(*) as rec_count from students where age < ' . $age .
    ' order by ' . $sort
mysql_execute_query($sql);
$offset = rec_count of the record return

This certainly is an improvement in efficiency over the previous solution. But what if there are several records in the table with the age of 35, this will exclude more than just the record with the primary key id of 48. But what if we make it >= (greater than or equal)? That wont work either for it may include some records that shouldn’t be included… in fact, we need a secondary sort, perhaps name.

select count(*) as rec_count from students where age >= $age and name >= $name
order by age desc, name desc

Ok, but what if there are two people with the same name and the same age? We have to add the primary key to the sort as well. In fact to make the above query work the primary key always has to be included in the sort to ensure that there is a guaranteed unique order.

select count(*) as rec_count from students where age >= $age and name >= $name
and id > $pk_id order by age desc, name desc, id desc

That works (note: no two records can have the same primary key id, as it must be unique), but what if some one enters a record into the table that doesn’t have a age? NULL values are a fact that we have to deal with. So how about…

select count(*) as rec_count from students where
(age is not null and $age is not null and age >= $age) or
(age is not null and $age is null)
order by age desc

Since NULL values sort after non Null values in a descending sort (at least in MySQL and before non NULL values in an ascending sort) if the age of the record being compared to is not null and the age of the record for the primary key pass in is NULL, then include the record in the count. We’ve also had to add not null clauses to the >= comparison too. And I’ve also simplified this first pass at the query so you can see whats going on. If we were to also add in similar logic for the name field and add back in the final check for the primary key id, things really start to get messy quick.

select count(*) as rec_count from students where
(age is not null and $age is not null and age > $age) or
(age is not null and $age is not null and age = $age and name is not null
    and $name is not null and name > $name) or
(age is not null and $age is not null and age = $age and name is not null
    and $name is not null and name = $name and id > $pk_id) or
(age is not null and $age is null) or
(age is null and $age is null and name is not null and $name is not null
    and name > $name) or
(age is null and $age is null and name is not null and $name is not null
    and name = $name and id > $pk_id) or
(age is null and $age is null and name is not null and $name is null) or
(age is null and $age is null and name is null and $name is null
    and id > $pk_id)
order by age desc, name desc, id desc

YUCK! We can simplify this a little as follows…

select count(*) as rec_count from students where
(age is not null and ($age is not null and age > $age or
(name is not null and $name is not null and ((name > $name) or
(name = $name and id > $pk_id))))) or
($age is null and (age is not null or
(age is null and (($name is not null and name is not null and ((name > $name) or
(name = $name and id > $pk_id))) or ($name is null and ((name is not null) or
(name is null and id > $pk_id)))))))
order by age desc, name desc, id desc

But its no great improvement as it is on the complex side and this is only for 2 sort fields and the primary key!! Can you imagine what the SQL statement would look like for 7 or 8 sort fields? And what are the chances you could write the SQL statement without making any errors? There must be a better way…

See my next post, How to determine the Record Offset given a primary key and a defined order by clause – Part 3, for the answer.

Leave a Reply

(required)

(required)