Before I start I just wanted to remind that this is one part of a series about comparing the basic functionality of four different RDBMS: SQL Server, Oracle, MariaDB, and PostgreSQL. For more information about this project and links to the other posts, check this introduction post. And now to the post:
One of the basic concepts of RDBMS is transactions. Transactions and how RDBMS handle them are described by ACID (Atomicity, Consistency, Isolation, Durability). In short (you can read more on Wikipedia), this is what it means:
- Atomicity – multiple statements in a single transaction will be either all succeed or all fail. In one transaction there should not be a case where some statements succeed and others fail.
- Consistency – a transaction brings the database from one valid state to another. So a completed transaction cannot break data consistency (e.g. referential constraints and column uniqueness).
- Isolation – concurrent transactions will leave the database in the same state as if they were executed sequentially.
- Durability – once a transaction is committed (completes successfully) it will survive a system failure.
These are the concepts and I have to say that there are a few ways to break some of these, But let’s leave it for now and concentrate on how the different databases behave with simple transactions.
What I tested
There are a few things I’d like to test, but before that I have to say that I’m using the default database configuration. There might be options to change how specific database works, but I won’t do that here.
What I decided to test is the basic transactions behavior and check how the following things work:
- Auto commit – some databases have auto commit as default (meaning every single statement is a transaction on its own). I’ll check which databases configure that by default and this is the only thing that I might change from the default behavior in order to continue my tests. Also note that different clients might set this by default at the client level, which might be different than the database default.
- Locks – locking is quite basic so I’ll just mention some information about it, but generally, all databases behaves the same: when I update a row it will be locked and changes to the same row and to the table structure will be disallowed, but changes to other rows in the table will be allowed.
- Read consistency – after I update a row in a table, what do other sessions see when they query the table?
- Transactional DDLs – what happens when I run a DDL (create, alter, or drop an object) within a transaction.
MariaDB
It’s important to remember that MariaDB (or MySQL) supports all kind of different storage engines and each behaves differently. Here, for the RDBMS comparison I’ll use the default InnoDB storage engine.
Auto Commit
By default, MariaDB is configured with auto commit. With auto commit on I can’t really test anything, so I’ll turn it off. There are two ways to disable this:
For s single transaction
SET TRANSACTION;
<commands>
COMMIT | ROLLBACK;
Or for the entire session:
SET autocommit=0;
Read Consistency
For this test I started two sessions and executed some statements. I’m posting the test in the same block, and I added comments which session I’m displaying.
-- session 1
MariaDB [test]> SELECT * FROM test_tab;
+------+------+
| id | f1 |
+------+------+
| 1 | 1 |
| 2 | 2 |
| 3 | 3 |
| NULL | 4 |
+------+------+
4 rows in set (0.000 sec)
MariaDB [test]> UPDATE test_tab SET f1=10 WHERE id=3;
Query OK, 1 row affected (0.000 sec)
Rows matched: 1 Changed: 1 Warnings: 0
-- session 2
MariaDB [test]> SELECT * FROM test_tab;
+------+------+
| id | f1 |
+------+------+
| 1 | 1 |
| 2 | 2 |
| 3 | 3 |
| NULL | 4 |
+------+------+
4 rows in set (0.001 sec)
-- session 1
MariaDB [test]> COMMIT;
Query OK, 0 rows affected (0.314 sec)
-- session 2
MariaDB [test]> SELECT * FROM test_tab;
+------+------+
| id | f1 |
+------+------+
| 1 | 1 |
| 2 | 2 |
| 3 | 10 |
| NULL | 4 |
+------+------+
4 rows in set (0.000 sec)
As you can see, with MariaDB default behavior (InnoDB), when updating a row, other sessions see the old value until the transaction is committed. Once the transaction is committed, all sessions see the new value. Note that you can set the transaction isolation level to READ UNCOMMITTED to be able to query “dirty reads”, meaning, queries will show rows with the most recent values, even if the transactions that changed these rows are not yet committed.
Transactional DDL
To check this I’ll simply start a transaction, insert a row to the table, create a new table, rollback, and see what happens. If DDLs are part of the transaction I expect the new row and new table not to exist.
MariaDB [test]> insert into test_tab values (5,5);
Query OK, 1 row affected (0.000 sec)
MariaDB [test]> create table new_tab(id int);
Query OK, 0 rows affected (0.068 sec)
MariaDB [test]> rollback;
Query OK, 0 rows affected (0.000 sec)
MariaDB [test]> select * from test_tab where id=5;
+------+------+
| id | f1 |
+------+------+
| 5 | 5 |
+------+------+
1 row in set (0.154 sec)
MariaDB [test]> select * from new_tab;
Empty set (0.000 sec)
As you can see both the row and table exist! This is because MariaDB does not support transactional DDL. When we issue a DDL, the open transaction is committed and the DDL is executed is a non-transactional way (cannot be committed or rolled back).
PostgreSQL
Auto Commit
Even though the documentation states that PostgreSQL default is not auto commit, it seems like it is (it might be that the auto commit is the psql client default though). So just like with MariaDB I will disable it. Again, we can simply start a transaction manually:
SET TRANSACTION;
<commands>
COMMIT | ROLLBACK;
Or set this at the session level (according to the documentation):
SET AUTOCOMMIT = off
However, in psql client this doesn’t work, so this is how we can turn it off:
\set AUTOCOMMIT false
Read Consistency
Let’s try the same as before:
-- session 1
lirondb=# SELECT * FROM test_tab;
id | f1
----+----
1 | 1
2 | 2
| 4
3 | 3
(4 rows)
lirondb=*# UPDATE test_tab SET f1=10 WHERE id=3;
UPDATE 1
-- session 2
lirondb=# SELECT * FROM test_tab;
id | f1
----+----
1 | 1
2 | 2
| 4
3 | 3
(4 rows)
-- session 1
lirondb=*# commit;
COMMIT
-- session 2
lirondb=*# SELECT * FROM test_tab;
id | f1
----+----
1 | 1
2 | 2
| 4
3 | 10
(4 rows)
PostgreSQL behaves exactly like MariaDB above: other sessions don’t see row changes until the transaction is committed. In PostgreSQL there is a READ UNCOMMITTED transaction isolation level, but unlike MariaDB, it is ignored and behaves just like the READ COMMITTED isolation level (which is the default I showed above).
Transactional DDL
lirondb=# insert into test_tab values (5,5);
INSERT 0 1
lirondb=*# create table new_tab(id int);
CREATE TABLE
lirondb=*# rollback;
ROLLBACK
lirondb=# select * from test_tab where id=5;
id | f1
----+----
(0 rows)
lirondb=*# select * from new_tab;
ERROR: relation "new_tab" does not exist
LINE 1: select * from new_tab;
PostgreSQL, unlike MariaDB, allows transactional DDLs. This means that if we have an open transaction and we execute a DDL, the DDL is part of the transaction and is committed or rolled back with it. In this case, we rolled back the transaction so both the row and the table don’t exist after the rollback.
SQL Server
Auto Commit
Like the two databases above, SQL Server has auto commit on by default. You can manually declare a transaction by using:
BEGIN TRAN[SACTION];
<commands>
COMMIT | ROLLBACK;
Or disable it completely for your session using:
SET IMPLICIT_TRANSACTIONS ON;
Read consistency
Now I’ll try what I tried before. Opening a transaction, updating a row and then, without committing, trying to read the table:
-- session 1
1> set implicit_transactions on;
2> go
1> SELECT * FROM test_tab;
2> go
id f1
----------- -----------
1 1
2 2
3 3
NULL 4
(4 rows affected)
1> UPDATE test_tab SET f1=10 WHERE id=3;
2> go
(1 rows affected)
-- session 2
1> SELECT * FROM test_tab;
2> go
-- waits
Well, this is a first. In SQL Server, locking a table for update, blocks queries on this table from another session. This is a well known issue with SQL Server where many time developers overcome this with the “nolock” option. I will cancel session 2 and try this instead:
1> SELECT * FROM test_tab WITH (nolock);
2> go
id f1
----------- -----------
1 1
2 2
3 10
NULL 4
(4 rows affected)
So this worked, but note that this is a dirty read, the update that changed f1 to 10 is not committed yet, but can be seen by other sessions and that might be a problem. However, SQL Server can support the behavior the other databases present using the READ_COMMITTED_SNAPSHOT setting at the database level.
Locks
Another thing about SQL Server is that it’s the only database out of the four (as far as I know) that uses lock escalations. This means that if many rows or pages of a table are locked, SQL Server will escalated that lock to a table level (instead of each row/page independently) in order to reduce resource overhead. You can read more about it here.
Transactional DDL
1> insert into test_tab values (5,5);
2> go
(1 rows affected)
1> create table new_tab(id int);
2> go
1> rollback;
2> go
3> select * from test_tab where id=5;
4> go
id f1
----------- -----------
(0 rows affected)
1> select * from new_tab;
2> go
Msg 208, Level 16, State 1, Server WINDEV2104EVAL, Line 1
Invalid object name 'new_tab'.
SQL Server, just like PostgreSQL, allows transactional DDL. The rollback I issued reverted both the insert and the create statement.
Oracle
Auto Commit
Oracle is the only one here that doesn’t have auto commit on by default, not in general and not in any of its built-in client tools. You can however configure auto commit at the client level. I’m not aware of a way to turn this on at the database level, but for SQL*Plus client tool this is how you do it:
SET autocommit on
Read consistency
Let’s do the same as we did with the other databases:
-- session 1
SQL> SELECT * FROM test_tab;
ID F1
---------- ----------
1 1
2 2
3 3
4
SQL> UPDATE test_tab SET f1=10 WHERE id=3;
1 row updated.
-- session 2
SQL> SELECT * FROM test_tab;
ID F1
---------- ----------
1 1
2 2
3 3
4
-- session 1
SQL> COMMIT;
Commit complete.
SQL> SELECT * FROM test_tab;
ID F1
---------- ----------
1 1
2 2
3 10
4
So Oracle behaves just like MariaDB and PostgreSQL and we can only see committed data. If there are open transactions that changed some data we are querying, we will see the committed version of the data. In Oracle there is no “read uncommitted” isolation level and dirty reads are disallowed (similar to PostgreSQL).
Transactional DDL
SQL> insert into test_tab values (5,5);
1 row created.
SQL> create table new_tab(id int);
Table created.
SQL> rollback;
Rollback complete.
SQL> select * from test_tab where id=5;
ID F1
---------- ----------
5 5
SQL> select * from new_tab;
no rows selected
Oracle works like MariaDB and doesn’t allow transactional DDLs. The “create table” I issued committed the open transaction and the DDL itself is not transactional. This is why after the rollback I see both the new table and the new row.
Conclusion
Here is a summary of the things I found in this post:
- Auto Commit is enabled by default for MariaDB and SQL Server. PostgreSQL documentation claims that auto commit is not enabled by default at the database level, but at least the “psql” command line tool has it enabled by default. With Oracle, the default in the database and all the DB and built-in client tools is auto commit off.
- When a session is changing a table in an uncommitted transaction, PostgreSQL, MariaDB, and Oracle will allow querying the table from a different session, showing the last committed data as a result. In SQL Server, the second session will wait for the transaction to complete before querying the table. However, SQL Server has the option to allow row versioning which makes it behave like the other databases.
- MariaDB has a transaction isolation level setting to allow dirty reads (uncommitted data of another session). SQL Server allows dirty reads using the “with (nolock)” option when querying the table. PostgreSQL and Oracle don’t have the option the allow dirty reads.
- From a transactional DDL standpoint, Oracle and MariaDB don’t allow this. Any DDL command is committed immediately and cannot be rolled back, while it also commits any open transaction you might had. PostgreSQL and SQL Server on the other hand allow transactional DDL, so DDLs are part of the transaction and will be committed or rolled back with it.
Final Notes
This was a tricky post to write. Transaction Isolation level is a complex topic and I could have gone down the rabbit hole and get lost. This is why I kept it simple with the default settings to compare the databases, but I definitely did not test all the available options and settings.
I hope you enjoyed this post. Please let me know if you have any other interesting differences between the different databases, or leave a comment below if you have anything to say about this post.
If you’d like to get an email when I publish a new post, fill in your email and click the “subscribe” button below:
1 thought on “DB Comparison Project: Handling Transactions”