Dynamic columns in MariaDB

Add columns to a table on the fly without altering its schema.

MariaDB and similar SQL database systems allow for a variety of data types that may be used for storing data in columns within tables. When creating or altering a table’s schema, it’s good to know what to expect, to know what kind of data will be stored in each column. If you know that a column will contain numbers, use a numeric data type like INT, not VARCHAR. It’s best to use the appropriate data type for a column. Generally, you’ll have better control of the data and possibly better performance.

But sometimes you can’t predict what type of data might be entered into a column. For such a situation, you might use VARCHAR set to 255 characters wide, or maybe TEXT if plenty of data might be entered. This is a very cool and fairly new alternative: you could create a table in which you would add columns on the fly, but without altering the table’s schema. That may sound absurd, but it’s possible to do this in MariaDB with dynamic columns.

Dynamic columns are basically columns within a column. If you know programming well, they’re like a hash within an array. That may sound confusing, but it will make more sense when you see it in action. To illustrate this, I’ll pull some ideas from my new book, Learning MySQL and MariaDB (O’Reilly 2015). All of the examples in my book and this article are based on a database for bird-watchers.

Preparing for dynamic columns

Suppose we want to create a table, which will be accessed by a web page. This web page will have a form in which users can enter information on a bird they spotted. So as not to restrict them, one of the columns of the table will store details about the bird that the birder noticed. We’ll create that table in MariaDB by entering the following:

CREATE TABLE bird_sightings
(sighting_id INT AUTO_INCREMENT KEY, 
 human_id INT, time_seen DATETIME,
 observations BLOB);

The first column in this table is a simple identifier for each row. The second column, the human_id will be used to join this table to one containing information on birdwatchers. These columns don’t matter much to this article.

The column we care about here is the observations column. We’ll store the dynamic columns in that column. To be able to hold plenty, we’re using the BLOB data type. We could use TEXT, or even VARCHAR if we don’t exceed its limit.

We could add more columns, perhaps one for recording the GPS location where the bird was seen. That’s all we need to do, though, to create a table that will hold dynamic columns.

You’re probably wondering at this point where the dynamism comes in. You’ll see that when we insert data into this table.

Inserting dynamic columns

To be able to transform a simple column into a dynamic column in MariaDB, we would execute an INSERT statement and use the COLUMN_CREATE( ) function to give the value associated with the column to become or contain the dynamic column. Let’s see how this looks by putting some data into the table we created. We’ll enter the following SQL statement:

INSERT INTO bird_sightings 
(human_id, time_seen, observations) 
VALUES
  (27, NOW(),
   COLUMN_CREATE('wing-shape','pointed', 
  'bill-shape','all-purpose', 'main-color','gray') );

INSERT INTO bird_sightings 
(human_id, time_seen, observations) 
VALUES
  (36, NOW(),
   COLUMN_CREATE('wing-shape','rounded', 'wingspan','60', 
  'bill-shape','all-purpose', 'main-color','orange') );

We’re putting plenty of information in the observations column for each row we’re inserting here. We’re using COLUMN_CREATE( ) to create the lists of observations: each observation has a key and a value. The basic syntax is to give the name of the key within quotes, followed by the value for the key. So in the first SQL statement, we’re inserting a key of wing-shape with a value of pointed to indicate the bird’s wings were pointed.

Notice that the first entry has three observations and the second entry has four. Each row can have different keys and a different number of keys. It’s pretty open. Incidentally, dynamic columns were introduced in version 5.3 of MariaDB, but in this earlier version one could give only a number for the keys. Starting with MariaDB version 10.0.1, you can give strings for the keys, as you see in the examples here. MySQL doesn’t have dynamic columns. So you’ll have to upgrade to MariaDB if you want to use them.

Getting data from dynamic columns

Let’s see now how data may be retrieved from a dynamic column. We’ll execute a SELECT statement to get each person’s name, the date of their bird sighting, and we’ll use the COLUMN_GET( ) function to get values from the observations column. We’ll join our table to another one, the humans table to get the person’s name.

SELECT name_first AS 'Birder', 
DATE_FORMAT(time_seen, '%b %d') AS 'Date',
COLUMN_GET(observations, 'wing-shape' AS CHAR) AS 'Wings',
COLUMN_GET(observations, 'wingspan' AS INT) AS 'Span (cm)',
COLUMN_GET(observations, 'bill-shape' AS CHAR) AS 'Beak'
FROM bird_sightings
JOIN humans USING(human_id);

+---------+--------+---------+-----------+-------------+
| Birder  | Date   | Wings   | Span (cm) | Beak        |
+---------+--------+---------+-----------+-------------+
| Anahit  | Apr 14 | pointed |      NULL | all-purpose |
| Michael | Apr 14 | rounded |        60 | all-purpose |
+---------+--------+---------+-----------+-------------+

Here you can see three examples of the syntax of COLUMN_GET( ). The first argument is the name of the column containing a dynamic column. The second argument specifies the key to use to get the data we want. In this example, we’re extracting the wing shape, then the estimated measurement of the wingspan, and then the shape of the bird’s bill for the last field.

We’re also including the keyword, AS within COLUMN_GET( ) to indicate the data type it should use (i.e., CHAR), to cast the value it returns for field in the results table. This is necessary because MariaDB won’t know what to expect since the data type of a dynamic column isn’t predefined and can vary depending on which key is called.

We can do all of the usual things to improve the results. So that the headings in the results table look nicer, we’re using the AS clause for each column selected. Notice that since the row for Anahit’s bird sighting didn’t include the bird’s wingspan, there is a NULL value for that field in the results table. We could use the IFNULL( ) function to set that to blank or something like, unknown.

The complexity related to using dynamic columns may see a bit much, especially when we use COLUMN_GET( ). But keep in mind that one would normally put these SQL statements within code for a database and especially for a web site that interfaces with a database. So it’s not that much to enter since a web interface will reuse it many times.

Changing dynamic columns

As with any data entered in a database, at some point we may need to change it. With dynamic columns, we’ll need to use the COLUMN_ADD( ) function. It may seem odd, to use a function named column add to change the data. However, it will either add or replace the key/value pair that’s given, if it already exists. Let’s try using it to see it in action.

In the results earlier you can see that Michael said that the bird he saw had a wingspan of 60. We added cm in parentheses to the heading to clarify that the value is in centimeters. But suppose we anticipate that some users will enter other values (e.g., millimeters or inches). The heading we were using would then be incorrect. Therefore, we need to record what the numeric value represents in the column. Since we don’t need to do any calculations or sorting based on that value, let’s change the value for the row above from ‘60‘ to ‘60 cm‘. Let’s also add a wingspan for Anahit’s bird, but with inches. To make these changes, we’ll execute UPDATE statements like this:

UPDATE bird_sightings
SET observations = 
    COLUMN_ADD(observations, 'wingspan', '60 cm')
WHERE sighting_id = 2;

UPDATE bird_sightings
SET observations = 
    COLUMN_ADD(observations, 'wingspan', '10 in')
WHERE sighting_id = 1;

In the first SQL statement here, we’re changing the wingspan of Michael’s bird. In the second, we’re adding the wingspan for Anahit’s bird. This is easy because we know what keys are contained in the observations column. But if we don’t know the keys contained in a column, we’ll have to check to see what it contains.

Investigating dynamic columns

There are a couple of functions that may be used to get a list of the contents of a dynamic column. If we don’t know, we can use the COLUMN_LIST( ) function like this:

SELECT sighting_id AS 'Id',
COLUMN_LIST(observations) 
  AS 'observations - keys' 
FROM bird_sightings;

+----+----------------------------------------------------+
| Id | observations - keys                                |
+----+----------------------------------------------------+
|  1 | `wingspan`,`bill-shape`,`main-color`,`wing-shape` |
|  2 | `wingspan`,`bill-shape`,`main-color`,`wing-shape` |
+----+----------------------------------------------------+

That’s useful. We can also use COLUMN_LIST( ) in a WHERE clause to determine which rows have a specific key. However, there’s a function just for that, COLUMN_EXISTS( ). Here’s how that looks:

SELECT sighting_id AS 'Id'
FROM bird_sightings
WHERE COLUMN_EXISTS(observations, 'wing-shape');

The COLUMN_LIST( ) function is good for getting a list of keys. But if we want a list of keys and their values for a dynamic column, we can use the COLUMN_JSON( ) function. It will give us both, as you can see in the results here:

SELECT COLUMN_JSON(observations) AS 'Observations'
FROM bird_sightings
WHERE sighting_id = 1 \G

*************************** 1. row ***************************
Observations: {"wingspan":"10 in","bill-shape":"all-purpose",
               "main-color":"gray","wing-shape":"pointed"}

Here you can see what I was saying earlier about a dynamic column being like a hash — or an associative array, as a hash is called in PHP.

Deleting values in dynamic columns

We’ve added and changed dynamic columns. Now let’s delete some key/value pairs from one of the rows we entered. Let’s delete the wingspan and bill-shape for the first row. To do that, we’ll use the UPDATE statement with another function, COLUMN_DELETE( ) like this:

UPDATE bird_sightings
SET observations = 
COLUMN_DELETE(observations, 'wing-shape', 'bill-shape')
WHERE sighting_id = 1;

With the COLUMN_DELETE( ), after specifying the column name, we can give, in a comma separated list, all of the keys to delete: we don’t have to enter each key in a separate SQL statement. We can do this because deleting is simple: there isn’t the elaborating of keys and values together as required by the other dynamic column functions when deleting.

Conclusion

Dynamic columns are very cool. They’re perhaps not as efficient as using regular columns, but they have a lot of potential for the right situation. They could be used in particular if you want to be responsive to user needs. You could use a dynamic column to collect ideas for new columns until you see some repetition in the data. For the examples here, if we see many users entering wing shapes and spans, we could create columns for them. Until then, we can stick with the dynamic columns.

If you’d like more information on dynamic columns, see the MariaDB Documentation. If you’d like more information on me and my books, check out my web site.

tags: , , ,