Separators in CSV values

Discussion of bugs and problems found in Altap Salamander. In your reports, please be as descriptive as possible, and report one incident per report. Do not post crash reports here, send us the generated bug report by email instead, please.
User avatar
mdruiter
Posts: 263
Joined: 22 Feb 2006, 15:33
Location: Amsterdam, The Netherlands
Contact:

Separators in CSV values

Post by mdruiter »

Sometimes, I have CSV files where some values contain separators. An example (also attached):

Code: Select all

LastUpdated, Location, Percentile
2010-10-20, "Amsterdam, The Netherlands", 0
This file is an excerpt from a result generated by: http://stackusers.com/Users/Find

Every comma is seen as a separator, which causes the value to be split. Even if I explicitly specify double quotes as text qualifiers. :(

After a while, I found the problem. :idea:
The commas each have a space after it. So the first double quote is not directly following the comma, and it is not seen as a text qualifier.
When I remove the space after the commas, the text qualifiers are respected and the file is properly displayed.

I'm not sure whitespace around separators is "officially" allowed when using text qualifiers (whatever "officially" means anyway). But it wouldn't hurt! And if it does, one can choose not to use text delimiters.
It would be nice if the Database Viewer did The Right Thing automatically.
Attachments
test.csv
(98 Bytes) Downloaded 421 times
therube
Posts: 681
Joined: 14 Dec 2006, 06:22

Re: Separators in CSV values

Post by therube »

I'm not seeing an issue?

CSV Options: Separator, Comma; Text Qualifier, Double Quotes; Convert, Auto.
WinXP Pro SP3 or Win7 x86 | SS 2.54
User avatar
mdruiter
Posts: 263
Joined: 22 Feb 2006, 15:33
Location: Amsterdam, The Netherlands
Contact:

Re: Separators in CSV values

Post by mdruiter »

I see this:
Screen shot
Screen shot
csv.PNG (1.7 KiB) Viewed 9848 times
But I would like to see this:
Expected screen shot
Expected screen shot
csvok.PNG (1.71 KiB) Viewed 9848 times
therube
Posts: 681
Joined: 14 Dec 2006, 06:22

Re: Separators in CSV values

Post by therube »

I see this:

Image

I see that (^^^) from your attached file, which is different from your "coded" file.

Image

Attached file is "correct" (assuming you want each field comma delimited, with double quotes as text qualifier).
The coded file is kind ambiguous in that respect. (First field is comma+space delimited, the second field you want to be comma+space delimited, but also with double quote qualifier. Assume that because all fields do not match, that is throwing things off (& not that I know how things things are supposed to work, though it would seem to me that if you want as you want, then all fields should be double quoted text qualifier, not just the one.)
Last edited by therube on 14 Feb 2011, 21:34, edited 1 time in total.
WinXP Pro SP3 or Win7 x86 | SS 2.54
User avatar
mdruiter
Posts: 263
Joined: 22 Feb 2006, 15:33
Location: Amsterdam, The Netherlands
Contact:

Re: Separators in CSV values

Post by mdruiter »

Ouch, sorry. Apparently I edited the file before pressing Submit (but after Add the file). :oops:

To summarize the feature request: it would be nice if spaces around text qualifiers (if any) were ignored.
therube
Posts: 681
Joined: 14 Dec 2006, 06:22

Re: Separators in CSV values

Post by therube »

(I've probably edited my reply above - a number of times, so re-read though I think the end result will be the same.)
WinXP Pro SP3 or Win7 x86 | SS 2.54
User avatar
mdruiter
Posts: 263
Joined: 22 Feb 2006, 15:33
Location: Amsterdam, The Netherlands
Contact:

Re: Separators in CSV values

Post by mdruiter »

I agree the format (inconsistent text qualifier use, spaces after separators) could be improved. But I cannot change the generation process of these files.
And I do not see any downsides of my proposal. Except that somebody will have to implement it. :)
Why not do The Right Thing in more cases?
therube
Posts: 681
Joined: 14 Dec 2006, 06:22

Re: Separators in CSV values

Post by therube »

How do other CSV programs handle the same data?


If your data is consistent, if you changed to space separated with double quote qualifier, appears it should work - though it then does retain the commas as part of the field data. (Tested with a VERY small sampling of data ;-).)
WinXP Pro SP3 or Win7 x86 | SS 2.54
User avatar
SvA
Posts: 486
Joined: 29 Mar 2006, 02:41
Location: DE

Re: Separators in CSV values

Post by SvA »

Well, there is some kind of "official" description of the format in rfc 4180 and there is the most "down to earth" description of the format at http://creativyst.com/Doc/Articles/CSV/CSV01.htm. They contradict each other in this point.

I believe there are two approaches to this issue
  • go with the flow and implement the latter
  • be versatile and add an option
Note: The latter description failes to mention that Microsoft redefined the C of CSV to read Character instead of Comma and uses the list separator character defined in Window's Regional setting as the separator for the CSV file, so, for me, Excel uses ; as the separator. Salamander already does a good job at guessing the separator character.
Moreover the description of Excel's preservation of leading spaces is not correct, at least not according to my tests with Excel 2007. Excel expects the CSV format to preserve spaces adjacent to the separator when writing, and preserves those spaces when reading. However, after a field is read, excel checks it's format and converts the field to the type it recognizes. If, for instance, the content of a field looks like a number, it is convertet to a number and any spaces, leading or trailing (and leading zeros also) are lost. This has nothing to do with reading the CSV, but with postprocessing the data read.
Post Reply