
Working with CSVs on the Command Line
September 23, 2013

Comma-separated values (CSV) files and their close relatives (e.g., tab-separated values) play a very important role in open access science. CSV is an informally defined file format that stores tabular data (think spreadsheets) in plain text. Within the file, each row contains a record, and each field in that record is separated by a comma, tab, or some other character.

This format offers several significant advantages. Because they are plain text, these files can be easily read and edited without the need for specialized or proprietary software. CSVs are also version-independent, so ten years down the road you won’t have to track down some ancient piece of software in order to revisit your data (or do the same for someone else’s data). Support for CSV files is built into most data analysis software, programming languages, and online services (see Some Useful Resources at the end of this article for links for your software of choice).

To many, the command line (or shell) is a strange and scary place. Those familiar with working in the command line, however, wouldn’t trade its power, speed, and flexibility for anything, no matter how shiny or “user friendly” an alternative drag-and-drop interface might be. Even though the Unix command line has existed for decades, it’s still always just a few clicks away, whether you use Linux on some HPC hardware or cloud service, or whether you use Mac OS X on your laptop. This power, speed, and flexibility extend to working with CSV files, and it’s here that I’d like to demonstrate a small slice of what can be done with just a few keystrokes.

You don’t have to be a command line wizard to use the commands below, but you do need to know the basics of how to get to a command line (hint for OS X users: it’s through the Terminal app) and how to do things like changing directories (cd) and listing files (ls). If these topics are new to you, there’s a pretty good introduction linked below in Some Useful Resources. I’ve also included a brief description of the command line tools used in this cookbook below in A Quick Overview of the Commands Used in this Cookbook. In the next section, I’ll quickly introduce two of the command line’s most powerful features: pipes and output redirection.
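For readers who want to check that they have the basics assumed above, here is a minimal sketch of what they look like in practice: changing into a directory, listing its files, and looking at a small CSV. The file name samples.csv and its contents are made up purely for illustration, and the temporary directory is only there so that nothing on your system gets overwritten.

```shell
# Work in a throwaway directory so no existing files are touched.
# (The directory and file names below are hypothetical.)
workdir=$(mktemp -d)
cd "$workdir"

# Create a tiny CSV: a header row, then one record per line,
# with the fields in each record separated by commas.
printf 'site,temperature,ph\nA,21.5,7.1\nB,19.8,6.9\n' > samples.csv

# List the files in the current directory.
ls

# Because a CSV is plain text, ordinary tools can display it directly.
cat samples.csv

# Count the lines in the file (the header row is included in the count).
wc -l < samples.csv
```

If these commands run without complaint, you have everything needed to follow along with the rest of this cookbook.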
