Bug #4181
DataoneEMLParser treats all delimited files as text/csv
100%
Description
Sarah Clark noticed that her tab-delimited .txt file was marked as formatId=text/csv. I traced it back to the EML parser, which labels all "dataFormat/textFormat/simpleDelimited" data as text/csv regardless of the specified fieldDelimiter.
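To illustrate the behavior being asked for, here is a rough sketch of choosing the formatId from the fieldDelimiter instead of hard-coding text/csv. It is not the DataoneEMLParser code. The function names, the list of tab spellings (a literal tab, "\t", "#x09", "tab"), and the use of text/tab-separated-values as the target formatId are all assumptions made for the example; anything that is not a tab stays text/csv, matching the current behavior.

```python
import xml.etree.ElementTree as ET

# Assumed spellings of a tab delimiter that might appear in EML metadata.
# This list is illustrative, not taken from the parser or the EML schema.
TAB_SPELLINGS = {"\\t", "#x09", "tab"}


def format_id_for_delimiter(field_delimiter):
    """Return a format ID for a simpleDelimited fieldDelimiter value.

    Hypothetical helper: tab delimiters map to text/tab-separated-values,
    everything else keeps the existing text/csv label.
    """
    text = field_delimiter or ""
    # Check for a literal tab before stripping, since strip() would remove it.
    if "\t" in text or text.strip().lower() in TAB_SPELLINGS:
        return "text/tab-separated-values"
    return "text/csv"


def format_id_from_eml(physical_xml):
    """Look up dataFormat/textFormat/simpleDelimited/fieldDelimiter in an
    EML fragment and map it to a format ID. Returns None when the fragment
    does not describe simpleDelimited data."""
    root = ET.fromstring(physical_xml)
    node = root.find(".//dataFormat/textFormat/simpleDelimited/fieldDelimiter")
    if node is None:
        return None
    return format_id_for_delimiter(node.text)
```

With this sketch, a fragment whose fieldDelimiter is "#x09" would be labeled text/tab-separated-values, while one whose fieldDelimiter is "," or "|" would remain text/csv.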
History
#1 Updated by Matthew Jones about 11 years ago
Many files are called CSV even if they use field delimiters other than commas. Tab-, space-, and pipe-delimited files are also considered CSV files by some, but tab-delimited files have their own MIME type too. The MIME type situation with CSV is confused, but following RFC 4180 is possibly sane. The question is, do we then need a separate type for every other delimited text file type? It is important to distinguish tab- and space-delimited files from text/plain files (which are unstructured). It's not as clear what the benefit is in distinguishing comma-delimited files from other delimited files, as one will very frequently encounter ".csv" files using non-comma delimiters, and programs like Excel are built to read them.