Ever want to parse a CSV file with perl?
If you have used perl, the answer is probably yes. What happens if the delimiter you use, say the comma, is also used in one of the fields but the field is quoted so the comma shouldn’t be visible when you parse the string by comma…
The logical way to split up a string in this way would be to use split// except it isn’t smart enough to do it correct. For instance if you have something like
"CN=Builtin,DC=joe,DC=com","{CB61E1C9-C7FD-4ED7-848C-F28953526B7B}"
"OU=CleanOU,DC=joe,DC=com","{A2FF7DA6-7882-45D0-B24B-01C9F5B72C29}"
"CN=Computers,DC=joe,DC=com","{E11E1B12-7182-4330-B075-09B9DFC6E9F9}"
"OU=contacts,DC=joe,DC=com","{EC879C4D-1732-4818-99BE-7E895D05FD1E}"
"OU=Domain Controllers,DC=joe,DC=com","{6B9EC17A-A3B5-48EF-A541-38CC1422A851}"
The first line (stored in $thisline) if parsed with
my @fields=split/,/,$thisline;
would yield
$field[0]="CN=Builtin
$field[1]=DC=joe
$field[2]=DC=com"
$field[3]="{CB61E1C9-C7FD-4ED7-848C-F28953526B7B}"
when what we logically wanted was
$field[0]="CN=Builtin,DC=joe,DC=com"
$field[1]="{CB61E1C9-C7FD-4ED7-848C-F28953526B7B}"
The answer? It is built into the perl core and is called quotewords.
At the top of your script put in the line
use Text::ParseWords;
Then when you want to parse out a quoted CSV line you use the line
my @fields = quotewords(",", 0, $thisline);
Why did I have to figure this out? Well I am updating ADFIND to have a CSV output format and I needed to sort some of that csv output by pwdLastSet. Well the date format by default is mm/dd/yyyy-hh:mm:ss which doesn’t sort all that well in Excel. So I needed a quick perl script to rearrange the date format to yyyy/mm/dd-hh:mm:ss. So when writing it I ran into the whole split and quoted fields issue.
Below is the resulting quick and dirty script. Long term I am not sure what I will do but most likely I will add another switch to decode time/date stamps into a string sortable format.
You will note that I don’t assume the delimiter, I actually figure it out on the fly.
use Text::ParseWords; $in=shift; $out=shift; open IFH,"< $in" or die("Error opening input file - $in : $!\n"); open OFH,">$out" or die("Error opening output file - $out : $!\n"); $datefilter='(\d\d)\/(\d\d)\/(\d\d\d\d)-(\d\d:\d\d:\d\d)'; $delimiter=""; foreach $thisline () { chomp $thisline; if ($thisline=~/\"$datefilter\"/) { if (!$delimiter) { ($delimiter)=($thisline=~/\"(.)\"/); print "Delimiter is '$delimiter'\n"; } my @fields = quotewords($delimiter, 0, $thisline); map {if (/$datefilter/) {$_="$3/$1/$2-$4";}} @fields; $thisline=""; map{$thisline.="\"$_\"$delimiter"} @fields; $thisline=~s/$delimiter$//; } print OFH "$thisline\n"; }