best practices

Six rules for creating valid CSV files

What makes a “good” CSV file as opposed to an “evil” one? Okay, I suppose evilness isn’t really a property of CSV files, but if you’ve ever wrestled with a bad one, I think you know what I mean.

Since you're now aware that CSV (comma separated value) files are superior to delimited files for data importing, let’s talk about how to create a CSV file.

Here are the rules for creating CSV files:

  1. Separate data fields with a delimiter, usually a comma. This should be a single character. If you do not want to use a comma, you should choose something like a tab or a pipe (|) character.

  2. Keep each record on a separate line. Each record must start on its own line, but a single record can span multiple lines.

  3. Do not follow the last record in a file with a carriage return.

  4. In the first line of the file, include a header with a list of the column names in the file. This is optional, but strongly recommended; it allows the file to be self-documenting.

  5. Make sure the header list is delimited in the same way as the rest of the file. This helps guard against the data fields being transposed in the data when it is loaded, which can lead to getting wrong answers when you query the data.

  6. Remember that the enclosing character (typically double quotes) must be used when required, such as when the delimiter appears in a field.

For more detail on these rules, you can look at Wikipedia and RFC 4180 (the Request for Comments document in the CSV specification).

Put your flat file mastery to the test

Now that you have a thorough mastery of the different types of flat files, try out some data imports. If you have ThoughtSpot, you can upload a CSV file (or several related CSV files) from your web browser.

 

Happy data searching!