
Reading CSVs and other delimited files with F#

When most people are introduced to F# they're shown one or more of the available type providers. Type providers automatically generate types for your data from a sample, which can be supplied inline, from a StreamReader, or via a URL.

FEBRUARY 1, 2019

If you're exploring a new data source, like an API, these are wickedly sexy and effective tools. However, in production they can become a bit of a pain to use and manage.

F# is absolutely loaded with what I'll call "language tools", and providers are an excellent example of this. That makes it easy to forget that you will at times need to go lower-level, and in most situations I'd suggest you'll either want to be there or inevitably wind up there.

If you're working in .NET and not using CsvHelper to read/write delimited files, you're missing out. It's an incredible tool.

For those "rolling their own", just stop, right now...

As always, using imperative libraries in F# typically means writing small adapter functions. Below is a snippet from a module I frequently use to parse delimited files in my F# projects, including a sample of how to partially apply the base read function. read is configurable at runtime and accepts a file path & record mapper function.

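module Csv =
  open CsvHelper
  open CsvHelper.Configuration  // home of Configuration; newer CsvHelper releases rename it CsvConfiguration
  open System.IO

  // Lazily reads the file at `path`, mapping each data row with `mapCsv`
  let read config path mapCsv =
      seq {
          use reader = new StreamReader(path = path)
          use csv = new CsvReader(reader, configuration = config)

          // Consume the header row so fields can be looked up by name
          csv.Read() |> ignore
          csv.ReadHeader() |> ignore

          while csv.Read() do
              yield mapCsv csv
      }

  // `read` specialised with a tab-delimited configuration
  let readTabDelimited path mapCsv =
      let config = Configuration()
      config.Delimiter <- "\t"
      read config path mapCsv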

To use it, simply provide a working file path and a function with a CsvReader -> 'a signature.

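type MyRecord = { FullName : string }

// "people.tsv" is a placeholder; substitute the path to your own file
let records =
    Csv.readTabDelimited "people.tsv"
        (fun csv -> { FullName = csv.GetField("FULL_NAME") })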

And that's it! You might be asking yourself, why not use csv.GetRecords<MyRecord>? I personally like to avoid [<CLIMutable>] at all costs, because immutability is why I'm using F# in the first place. Plus, creating new record types in F# is so trivial that I am PERFECTLY happy to write my own mapping code, even for things like consuming an IDataReader.
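The same adapter-plus-mapper shape carries over to database readers. Below is a minimal sketch, assuming the standard System.Data IDataReader interface; the names readAll and mapMyRecord are hypothetical, and an in-memory DataTable stands in for a real query result so the snippet runs without a database.

```fsharp
open System.Data

type MyRecord = { FullName : string }

// Hypothetical adapter: pulls every row through a mapper, mirroring Csv.read.
// Built eagerly as a list so the reader can be disposed right after the call.
let readAll (mapRow : IDataRecord -> 'a) (reader : IDataReader) : 'a list =
    [ while reader.Read() do
        yield mapRow (reader :> IDataRecord) ]

// Hand-written mapping from a row to an immutable record
let mapMyRecord (row : IDataRecord) =
    { FullName = row.GetString(row.GetOrdinal("FULL_NAME")) }

// In-memory table standing in for a real query result (sample data only)
let table = new DataTable()
table.Columns.Add("FULL_NAME", typeof<string>) |> ignore
table.Rows.Add([| box "Ada Lovelace" |]) |> ignore
table.Rows.Add([| box "Grace Hopper" |]) |> ignore

let records =
    use reader = table.CreateDataReader()
    readAll mapMyRecord reader
```

With a real database you would pass the reader returned by your command's ExecuteReader() in place of table.CreateDataReader(); the mapper stays the same.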

It's important to remember that with very little massaging F# can be great at interoperating with imperative libraries.