joeware - never stop exploring... :)

Information about joeware mixed with wild and crazy opinions...

AdFind CSV output, Embedded double quotes, and Excel

by @ 10:26 pm on 1/16/2012. Filed under tech

If you have ever had AdFind output CSV before *and* you sent that output into Excel[1] *and* one or more of the fields had embedded double quotes as part of the value you likely ran into an issue with how AdFind escapes the double quotes.

The default for AdFind’s double quote escape character is "\" which is the old CSV standard that I grew up with (you used "\" to escape all characters that needed escaped). So, logically, that is what I wrote AdFind to use. I first wrote the CSV functionality after an MVP summit in Spring 2005, it was first released in October/November 2005.

Interestingly, from a timing standpoint, in October 2005, RFC4180 was published which specified a standard for escaping double quotes and as you may imagine, they didn’t choose "\", no the standard wants you to escape a double quote with another double quote.

   7.  If double-quotes are used to enclose fields, then a double-quote
       appearing inside a field must be escaped by preceding it with
       another double quote.  For example:

       "aaa","b""bb","ccc"

I pretty much ignored that RFC… in actuality I probably had no clue it existed, RFCs on spreadsheet formats wasn’t really something I was wont to go looking for. However in January 2009 I added a new switch to the version of AdFind released in February 2009 called -csvqesc which allowed you to specify the escape character. I don’t recall why I did it but I expect someone asked me to do it so I did it. It is unlikely I thought to do it myself, I try hard not to use quotes in field values (along with all sorts of other characters that are painful to deal with in scripts) and don’t think I would have run into the issue importing the data into Excel that I wouldn’t have solved via a quick perl script. Anyway, the switch allows you to specify -csvqesc \" which will then escape any embedded double quotes with another double quote.

So now kick forward a few years to 2011… I get an email explaining the problem with the escape character for the double quotes and I absolutely space on the new switch and explain why the ability isn’t in AdFind and that I will add a DCR for the functionality. Even later as I start going through the source determining the "cost" of adding the functionality I see that I already had the capability in the tool… So I sent a new email saying… "Hey… here is how you can do it…".

I am not entirely surprised that I forgot all about the switch, I have a bazillion and three switches in the tool. It does annoy me a little though so I have…. tada… added a new shortcut switch that will make it a little easier to remember -csvxl which stands for Excel CSV. It is a little easier to remember. The shortcut simply inserts -csv and -csvqesc \" into the command stream for you.

    joe

 

[1] Or some number of other spreadsheet apps.

Rating 4.60 out of 5

Comments are closed.

[joeware – never stop exploring… :) is proudly powered by WordPress.]