In this lesson we'll cover:
In previous lessons, you learned that certain functions work on some data types, but not others. For example, COUNT works with any data type, but SUM only works for numerical data (if this doesn't sound familiar, you should revisit this lesson. This is actually more complicated than it appears: in order to use SUM, the data must appear to be numeric, but it must also be stored in the database in a numeric form.
You might run into this, for example, if you have a column that appears to be entirely numeric, but happens to contain spaces or commas. Yes, it turns out that numeric columns cannot contain commas—If you upload data to Mode with commas in a column full of numbers, Mode will treat that column as non-numeric. Generally, numeric column types in various SQL databases do not support commas or currency symbols. To make things more complicated, SQL databases can store data in many different formats with different levels of precision.
The INTEGER data type, for example, only stores whole numbers—no decimals. The DOUBLE PRECISION data type, on the other hand, can store between 15 and 17 significant decimal digits (almost certainly more than you need unless you're a physicist). There are a lot of data types, so it doesn't make sense to list them all here. For the complete list, click here.
Here is the list of exact data types stored :
| Imported as | Stored as | With these rules |
|---|---|---|
| String | VARCHAR(1024) | Any characters, with a maximum field length of 1024 characters. |
| Date/Time | TIMESTAMP | Stores year, month, day, hour, minute and second values as YYYY-MM-DD hh:mm:ss. |
| Number | DOUBLE PRECISION | Numerical, with up to 17 significant digits decimal precision. |
| Boolean | BOOLEAN | Only TRUE or FALSE values. |
"Imported as" refers to the types that is selected in the import flow (see image below), "Stored as" refers to the official SQL data type, and the third column explains the rules associated with the SQL data type.
It's certainly best for data to be stored in its optimal format from the beginning, but if it isn't, you can always change it in your query. It's particularly common for dates or numbers, for example, to be stored as strings. This becomes problematic when you want to sum a column and you get an error because SQL is reading numbers as strings. When this happens, you can use CAST or CONVERT to change the data type to a numeric one that will allow you to perform the sum.
You can actually achieve this with two different type of syntax. For example, CAST(column_name AS integer) and column_name::integer produce the same result.
You could replace integer with any other data type that would make sense for that column—all values in a given column must fit with the new data types.
Mode Community (the site you're using to complete this tutorial) performs implicit conversion in certain circumstances, so data types are rarely likely to be problematic. However, if you're accessing an internal database (your employer's, for example), you may need to be careful about managing data types for some functions.