GotoDBA Database Development,DB Camparison DB Comparison Project: Pseudo Columns

DB Comparison Project: Pseudo Columns

This is the third post in this project, where I compare 4 major databases: Oracle, SQL Server, PostgreSQL, and MariaDB, and this time I’ll talk about pseudo columns (or system columns). If you want to read about this project and see the other post, you can click here. Let’s begin:

Coming from the Oracle world, pseudo columns are something I’m very familiar with. It seems like Oracle is richer than the other databases with these columns, so I decided to see what other databases have to offer.

What is a Pseudo Column?

To define pseudo columns I’ll use Oracle documentation:

A pseudocolumn behaves like a table column, but is not actually stored in the table. You can select from pseudocolumns, but you cannot insert, update, or delete their values. A pseudocolumn is also similar to a function without arguments (refer to Functions). However, functions without arguments typically return the same value for every row in the result set, whereas pseudocolumns typically return a different value for each row

As you can see from the definition, adding a “column” that returns the date is not really a pseudo column, a column that returns the row number, however (but not a function), is.

There are different uses for pseudo columns, but they are basically not a necessity and you’ll see that different databases implement different pseudo columns or don’t implement them at all.

I wanted to add that there are a few pseudo columns related to different features (like hierarchical queries), I won’t talk about those. In this post I’ll only discuss the basic pseudo columns for regular tables. Another thing that can be confusing is virtual columns. Some databases support virtual columns which are column that are part of the table definition but do not store any actual data (they can be an output of a function for example). I don’t consider virtual column as pseudo columns because they are indeed part of the table.

Oracle

This time I’ll start with Oracle. Let’s see a few of the main pseudo columns Oracle has and what they mean. For more information about these columns (and to see some other ones) you can check the documentation.

ROWNUM

The ROWNUM pseudo column returns the number of row in the resultset (before the “order by” clause). This was the only way to filter rows based on their location in the resultset (TOP-N queries) until Oracle introduced the TOP-N feature in 2013.

Example:

SQL> DESC test_tab
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 ID                                                 NUMBER(38)
 F1                                                 NUMBER(38)

SQL> SELECT rownum,id,f1 FROM test_tab;

    ROWNUM         ID         F1
---------- ---------- ----------
         1          1          1
         2          2          2
         3          3         10
         4                     4
         5          5          5

ORA_ROWSCN

Oracle uses an internal sequence number (called SCN – System Change Number) to track changes to data in the database. the ORA_ROWSCN reflects the SCN of most recently changed the row or block.

Example:

SQL> SELECT ora_rowscn,id,f1 FROM test_tab;

ORA_ROWSCN         ID         F1
---------- ---------- ----------
   2067372          1          1
   2067372          2          2
   2067372          3         10
   2067372                     4
   2090957          5          5

ROWID

To locate a row, Oracle uses the ROWID as the physical address of the file. It contains information about the data file, the data block and the number of row in the block.

Example:

ROWID                      ID         F1
------------------ ---------- ----------
AAAF3HAALAAADKzAAA          1          1
AAAF3HAALAAADKzAAB          2          2
AAAF3HAALAAADKzAAC          3         10
AAAF3HAALAAADKzAAD                     4
AAAF3HAALAAADK3AAA          5          5

MariaDB

MariaDB apparently doesn’t support any pseudo columns. The only thing I managed to find is the “_rowid” column which is a synonym to the primary key column (you can use it if you don’t know the PK column name). The following query will return an error if the table doesn’t have a primary key or has a primary key on multiple columns:

Example:

MariaDB [test]> desc test_tab;
+-------+---------+------+-----+---------+-------+
| Field | Type    | Null | Key | Default | Extra |
+-------+---------+------+-----+---------+-------+
| id    | int(11) | NO   | PRI | NULL    |       |
| f1    | int(11) | YES  | UNI | NULL    |       |
+-------+---------+------+-----+---------+-------+
2 rows in set (0.001 sec)

MariaDB [test]> select _rowid,id from test_tab;
+--------+----+
| _rowid | id |
+--------+----+
|      1 |  1 |
|      2 |  2 |
|      3 |  3 |
|      5 |  5 |
+--------+----+
4 rows in set (0.000 sec)

SQL Server

I couldn’t find any pseudo columns in SQL Server as well. The only thing I found was a hidden column, undocumented, that apparently should not be used, called %%physloc%%. This looks similar to ROWID in Oracle as it provides the physical address of the row. More about it here.

Example:

1> select %%physloc%%,id from test_tab;
2> go
                   id
------------------ -----------
0x4801000001000000           1
0x4801000001000100           2
0x4801000001000200           3
0x4801000001000300        NULL

(4 rows affected)

PostgreSQL

PostgreSQL supports a few pseudo columns. I’ll show here 3 of them and you can find more about it here.

tableoid

the tableoid column returns the OID of the table that contains the row.

Example:

lirondb=# select tableoid,id from test_tab;
 tableoid | id
----------+----
    16396 |
    16396 |  3
    16396 |  2
    16396 |  1

ctid

This will return the physical address of the row. This is similar to ROWID in Oracle, however, there is a major difference between how Oracle and PosgreSQL work when updating a row. Oracle updates in place, so the ROWID doesn’t change. Moreover, even if the row needs to move, Oracle will migrate it, leaving a pointer in the original location, so Oracle can (and does) rely on the ROWID to access a row (there are some exceptions like when shrinking a table, but I won’t get into this now). PostgreSQL, on the other hand, creates a new row for every update, and the new row has a new location, therefore, ctid is not permanent and can’t be relied on.

Example:

lirondb=# select ctid,id from test_tab;
  ctid  | id
--------+----
 (0,4)  |
 (0,13) |  3
 (0,14) |  2
 (0,16) |  1
(4 rows)

xmin

The xmin pseudo column seems to me like the SCN in Oracle. The concept here (a bit different, but still) is to show the transaction id of the last transaction that created this row. Since PostgreSQL creates new row with every change (let’s call it a version), the xmin column is the transaction ID of the transaction which inserted the current version of the row.

Example:

lirondb=*# select xmin,id from test_tab;
 xmin | id
------+----
  510 |
  519 |  3
  520 |  2
  522 |  1
(4 rows)

Conclusion

Both SQl Server and MariaDB don’t really use pseudo columns, while Oracle and PostgreSQL have quite a few columns you can use with any query. Why would you want to use them? That, as always, depends. I’ve seen cases in Oracle where using a ROWID for example, helped improve performance of some SQL statements. And ROWNUM is very common (it’s unnecessary now with the TOP-N query, but I did find a benefit of using ROWNUM still. Read here if you’re interested). I’m sure that there are useful scenarios in PostgreSQL for using these columns (like this).

Do we have to have pseudo columns in the database? Probably not. Are they useful? Probably in some cases. Either way, if you do use these and then move to another database, be aware that you won’t necessarily manage to find a replacement.

Final Notes

This post was fun to research. As an Oracle person, I’ve always assumed that pseudo columns are part every database. I figured out it’s not quite the case a while ago when looking for the equivalent of Oracle one in some other database. This is why I decided to write about it as part of this DB comparison project.

I hope you enjoyed this post. Please let me know if you have any other interesting differences between different databases, or leave a comment below if you have anything to say about this post.

If you’d like to get an email when I publish a new post, fill in your email and click the “subscribe” button below:

3 thoughts on “DB Comparison Project: Pseudo Columns”

  1. Nice article. There are pseudo columns in SQL Server like $IDENTITY, $ROWGUID. %%physloc%% is a special one .. not directly used in query but can be used by sys.fn_PhysLocFormatter() to identity physical location of the row in a data page.

  2. Hi David, thanks for the comment.
    The thing with $IDENTITY and $ROWGUID is that they are available only if you have guid or identity columns in the table, so they are basically aliases (like the _rowid in MariaDB), they are not really pseudo columns. Maybe I should still add them to the post though.

Leave a Reply

Your email address will not be published. Required fields are marked *

Related Post