Rowids and Primary Keys
SQLite has the concept of a rowid. A special column with a unique integer identifier for each row. It’s present in most tables, but generally hidden from view. This post gives an overview of how rowids work and their relationship with primary keys.
sqlite3
command-line tool.
If you run the following SQL:
CREATE TABLE rowidExample (id INTEGER);
You’ll end up with a table containing only one visible column. But the rowid
column is there too,
just not shown by default.
If we run these statements:
INSERT INTO rowidExample DEFAULT VALUES;
INSERT INTO rowidExample DEFAULT VALUES;
INSERT INTO rowidExample DEFAULT VALUES;
SELECT rowid, id FROM rowidExample;
You’ll get the following results:
rowid | id |
---|---|
1 | NULL |
2 | NULL |
3 | NULL |
At this point it might be tempting to think “Great! I don’t need to specify a primary key now, I’ve got this one anyway”. Suppress that thought. The rowid (mostly) isn’t for you or me. It’s for SQLite itself. If you rely on it as-is, you’ll run in to problems.
For example, rowids can change. If you run these statements:
DELETE FROM rowidExample WHERE rowid = 2;
VACUUM;
SELECT rowid, id FROM rowidExample;
You’ll get these results:
rowid | id |
---|---|
1 | NULL |
2 | NULL |
Which might seem odd, we just deleted the row with rowid 2 after all. But VACUUM
can cause rowids
to be changed1.
So what do we do instead?
Make your own keys
You should always specify your own keys. If you need a unique reference for a row, then you’ll want to explicitly model that in your tables. A simple option is to add an integer primary key column, like this:
CREATE TABLE explicitPrimaryKey (id INTEGER PRIMARY KEY);
The natural follow-up is to try the same steps as before:
INSERT INTO explicitPrimaryKey DEFAULT VALUES;
INSERT INTO explicitPrimaryKey DEFAULT VALUES;
INSERT INTO explicitPrimaryKey DEFAULT VALUES;
SELECT rowid, id FROM explicitPrimaryKey;
rowid | id |
---|---|
1 | 1 |
2 | 2 |
3 | 3 |
This is mostly the same, except the rowid
and id
columns both have the same value.
Taking the next step, we get:
DELETE FROM explicitPrimaryKey WHERE rowid = 2;
VACUUM;
SELECT rowid, id FROM explicitPrimaryKey;
rowid | id |
---|---|
1 | 1 |
3 | 3 |
Which is what we really wanted the first time! The rowids haven’t been changed by the VACUUM
.
That’s because columns declared as INTEGER PRIMARY KEY
become an alias for the rowid column.
SQLite now knows that they shouldn’t be rearranged and leaves them alone.
Which is the “real” primary key?
That’s a fun question.
For you and me, creating and querying tables, the primary key is the one we
declare. It might be a column with PRIMARY KEY
as one of the constraints, or a PRIMARY KEY
table
constraint across multiple columns.
For the internals of SQLite though, the “real” primary key is the rowid. That’s the value that gets used in the internal data structures.
If you use a column declared as INTEGER PRIMARY KEY
, then our “outside” primary key is an alias of
the “internal” primary key. Most of the time, we don’t need to worry about that though2.
What about AUTOINCREMENT
?
For most cases, you don’t want or need it.
If you have an INTEGER PRIMARY KEY
column, because it’s an alias for the rowid, SQLite will
automatically provide a value for you if you don’t specify one. It’ll try to pick the next largest
number in that column. If you’ve managed to use up to the maximum 64-bit integer value
(9,223,372,036,854,775,807), it’ll pick numbers at random until it finds an unused one or gives up.
If you add the AUTOINCREMENT
keyword to that column, the behaviour will change a bit. SQLite
guarantees that values in this column will always be larger than existing values. If you insert the
maximum integer value to this column, SQLite will not allow any more rows to be inserted in the
table.
Other special or unusual things
Primary keys don’t have to always be integers. You can have all sorts:
-- A table of books, using the ISBN as the primary key
CREATE TABLE books (
isbn TEXT PRIMARY KEY,
title TEXT NOT NULL
);
-- A table matching students to enrolled courses.
-- The combination of studentId and courseId must be unique,
-- you can't have a student enrolling in any given course more than once.
CREATE TABLE enrollments (
studentId INTEGER NOT NULL,
courseId TEXT NOT NULL,
enrollmentDate DATE,
PRIMARY KEY (studentId, courseId)
);
It’s interesting to note that in these cases, the PRIMARY KEY
is basically just a
UNIQUE
constraint. Because of this, SQLite allows NULL values unless the column is declared as NOT NULL
. This isn’t standard SQL, but is a long-standing quirk of SQLite.
For example:
CREATE TABLE booksAllowingNull (
isbn TEXT PRIMARY KEY,
title TEXT,
author TEXT
);
INSERT INTO booksAllowingNull VALUES (NULL, 'The Colour of Magic', 'Pratchett, T');
INSERT INTO booksAllowingNull VALUES (NULL, 'The Light Fantastic', 'Pratchett, T');
Will result in two rows being inserted into the table, both with a NULL
value for isbn
.
If we change this to:
CREATE TABLE booksWithoutNull (
isbn TEXT PRIMARY KEY NOT NULL,
title TEXT,
author TEXT
);
INSERT INTO booksWithoutNull VALUES (NULL, 'The Colour of Magic', 'Pratchett, T');
The INSERT
will fail, as we’d expected.
It’s worth noting that again, columns declared INTEGER PRIMARY KEY
get special treatment and are
automatically assumed to be NOT NULL
even if you don’t specify it3.
Rowid aliases
You can refer to the rowid using the names rowid
, _rowid_
or oid
. But you can also create
columns with those names. So for this table, the internal rowid value can’t be accessed from the outside:
CREATE TABLE rowidNames (rowid INTEGER, _rowid_ INTEGER, oid INTEGER);
I have no idea why anyone would choose do this, but it turns out you can!
WITHOUT ROWID
tables
It is possible to declare a table as explicitly not having a rowid. This isn’t as common and deserves an article to itself. For now it’s enough to note that this is a thing that is possible. If you want to read more about this feature, there is an excellent description on the SQLite website.
References
The SQLite website has excellent, detailed information about:
-
Copying rows between tables with
INSERT INTO table2 SELECT * FROM table1;
won’t keep the same rowids either. ↩︎ -
You might sometimes care about this for performance reasons, since lookups made by rowid are usually faster than other primary key types. ↩︎
-
This automatic behaviour also happens for
STRICT
andWITHOUT ROWID
tables. ↩︎