So let’s start with a simple MySQL setup:
mysql> CREATE TABLE example (prefix VARCHAR(50),
domain VARCHAR(30), PRIMARY KEY(prefix, domain));
Query OK, 0 rows affected (0.02 sec)
mysql> INSERT INTO example VALUES
("hello", "world.org"), ("foo", "bar"), ("Foot","ball");
Query OK, 3 rows affected (0.00 sec)
Records: 3 Duplicates: 0 Warnings: 0
Alright… so we have some sort of “split up” email addresses. It’s not a problem to combine them together on the fly as far as MySQL provides basic string operations.
mysql> SELECT CONCAT(prefix, "@", domain) as email_addr FROM example; +-----------------+ | email_addr | +-----------------+ | foo@bar | | Foot@ball | | hello@world.org | +-----------------+ 3 rows in set (0.00 sec)
So now let’s say, we want to remove the Top Level Domain from the domain (“world” instead of “world.org”) [1].
mysql> SELECT SUBSTR(domain, 0, LOCATE('.', domain)-1) as tld FROM example;
+-----+
| tld |
+-----+
| |
| |
| |
+-----+
3 rows in set (0.00 sec)
What? Let’s slow down… take a substring of domain starting at position zero and with length of the position of the “.” (dot character) minus 1 (before that dot character). Okay… so something has to be wrong about it.
mysql> SELECT domain as tld FROM example; +-----------+ | tld | +-----------+ | bar | | ball | | world.org | +-----------+ 3 rows in set (0.00 sec)
A quarter of an hour later I realized the problem with the help of a colleague:
mysql> SELECT SUBSTR(domain, 1, LOCATE('.', domain)-1) as tld FROM example;
+-------+
| tld |
+-------+
| |
| |
| world |
+-------+
3 rows in set (0.00 sec)
SQL (and therefore the substr function) is one-based. So to address the first character you have to specify it using “1″. MySQL uses 0 to tell “no match”.
Stupid world. Why do we have conventions? To break them? I was aware that MATLAB sucks in this regard too, but I was shocked, when I heard about Lua (both languages one-based).
Note [1]. If you really have such an usecase, please refer to SUBSTRING_INDEX.
Recent Comments