Bug #4181

DataoneEMLParser treats all delimited files as text/csv

Added by Ben Leinfelder about 11 years ago. Updated about 11 years ago.

Status: Closed
Priority: Normal
Assignee: Ben Leinfelder
Category: d1_libclient_java
Target version: -
Start date:
Due date:
% Done: 100%
Milestone: CCI-1.3
Product Version: *
Story Points:
Sprint:

Description

Sarah Clark noticed that her tab-delimited .txt file was marked as formatId=text/csv. The problem traces back to the EML parser, which assigns text/csv to all "dataFormat/textFormat/simpleDelimited" data regardless of the specified fieldDelimiter.
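
For illustration, here is a minimal sketch of how a parser could choose a formatId from the fieldDelimiter instead of always returning text/csv. It is not the actual DataoneEMLParser code: the class name, the method name, the list of accepted tab spellings, and the decision to fall back to text/csv for every other delimiter are all assumptions made for this example.

```java
import java.util.Locale;

// Hypothetical helper, not part of d1_libclient_java.
public class DelimitedFormatSketch {

    public static final String CSV = "text/csv";
    public static final String TSV = "text/tab-separated-values";

    // Maps an EML fieldDelimiter value to a format identifier.
    // Assumption: only tab gets its own type; every other delimiter
    // (comma, pipe, space, ...) keeps the existing text/csv behavior.
    public static String formatIdFor(String fieldDelimiter) {
        if (fieldDelimiter == null || fieldDelimiter.isEmpty()) {
            return CSV;
        }
        // Check for a literal tab before trimming, since trim() would remove it.
        if (fieldDelimiter.equals("\t")) {
            return TSV;
        }
        // Assumed textual spellings of a tab in metadata documents.
        String d = fieldDelimiter.trim().toLowerCase(Locale.ROOT);
        switch (d) {
            case "\\t":
            case "#x09":
            case "0x09":
            case "tab":
                return TSV;
            default:
                return CSV;
        }
    }

    public static void main(String[] args) {
        System.out.println(formatIdFor("\t"));
        System.out.println(formatIdFor(","));
        System.out.println(formatIdFor("|"));
    }
}
```

Whether delimiters other than tab should receive their own identifiers is a separate question, so this sketch deliberately leaves them as text/csv.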

History

#1 Updated by Matthew Jones about 11 years ago

Many files are called CSV even if they use field delimiters other than commas. Some people also consider tab-, space-, and pipe-delimited files to be CSV files, although tab-delimited files have their own MIME type too. The MIME type situation with CSV is confused, but following RFC 4180 is possibly a sane choice. The question is whether we then need a separate type for every other kind of delimited text file. It is important to distinguish tab- and space-delimited files from text/plain files (which are unstructured). It's less clear what the benefit is of distinguishing comma-delimited files from other delimited files, as one very frequently encounters ".csv" files that use non-comma delimiters, and programs like Excel are built to read them.
