-
Describe the bug To Reproduce
Expected behavior Observed behavior Additional context
Using version 23.0.0
var iso8859Encoding = Encoding.GetEncoding("ISO-8859-1");
using (var stream = new MemoryStream())
using (var writer = new StreamWriter(stream, iso8859Encoding))
using (var csv = new CsvWriter(writer, CultureInfo.InvariantCulture))
{
var options = new TypeConverterOptions { Formats = new[] { "yyyy-MM-dd" } };
csv.Context.TypeConverterOptionsCache.AddOptions<DateTime>(options);
csv.Context.TypeConverterOptionsCache.AddOptions<DateTime?>(options);
csv.Context.RegisterClassMap<MyModelMap>();
await csv.WriteRecordsAsync(myModels);
await writer.FlushAsync();
await stream.FlushAsync();
stream.Seek(0, SeekOrigin.Begin);
return iso8859Encoding.GetString(stream.ToArray());
}
One of the models contains the string:
What I get in the output is:
What I expect is:
Replies: 9 comments
-
CsvHelper writes the characters you give it to the TextWriter. You can change when a field is quoted by using a config option.
-
So, I'm handing a smart double quote to CsvWriter, CsvWriter is writing a smart double quote, and then the ISO-8859-1 encoding is writing it as a plain double quote, resulting in a malformed CSV file. What, then, is the correct approach for writing a CSV file in a specific Encoding, if the Encoding can break the output CSV? For now, I'm running Replace() on every string field in the input records, but that's not a very elegant solution. Does CsvWriter need to be encoding aware?
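One way to stop the encoder from silently turning smart quotes into straight ones is to request the encoding with an explicit fallback. The sketch below assumes the built-in .NET overload `Encoding.GetEncoding(name, encoderFallback, decoderFallback)`; the names `FallbackSketch`, `EncodeStrict`, and `EncodeWithPlaceholder` are made up for illustration and are not part of CsvHelper.

```csharp
using System;
using System.Text;

public static class FallbackSketch
{
    // Throws EncoderFallbackException when a character has no
    // ISO-8859-1 equivalent, instead of substituting a look-alike.
    public static byte[] EncodeStrict(string text)
    {
        var strict = Encoding.GetEncoding(
            "ISO-8859-1",
            EncoderFallback.ExceptionFallback,
            DecoderFallback.ExceptionFallback);
        return strict.GetBytes(text);
    }

    // Writes '?' for every character that ISO-8859-1 cannot represent,
    // so no new double quote can appear in the encoded output.
    public static byte[] EncodeWithPlaceholder(string text)
    {
        var lossy = Encoding.GetEncoding(
            "ISO-8859-1",
            new EncoderReplacementFallback("?"),
            new DecoderReplacementFallback("?"));
        return lossy.GetBytes(text);
    }
}
```

An encoding built this way could be passed to the StreamWriter in place of the default one. The trade-off is that the write either fails or loses the original character, so it suits cases where a malformed file is worse than a missing quote mark.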
-
What is the
-
These are both converted to " in ISO-8859-1:
“ (U+201C) LEFT DOUBLE QUOTATION MARK
” (U+201D) RIGHT DOUBLE QUOTATION MARK
-
What encoding is the source text?
-
At some point, you'll need to convert the UTF-8 text into ISO-8859-1 before it hits the TextWriter. Some options that I can think of include overriding CsvWriter.WriteToBuffer and converting the value, then calling the base method.
This example works correctly. Strings in the IDE are UTF-16, which is why I used that there.
void Main()
{
var text = "one “two” three";
var stringEncoding = Encoding.GetEncoding("UTF-16");
var encoding = Encoding.GetEncoding("ISO-8859-1");
var records = new List<Foo>
{
new Foo { Id = 1, Name = encoding.GetString(Encoding.Convert(stringEncoding, encoding, stringEncoding.GetBytes(text))) },
};
var config = new CsvConfiguration(CultureInfo.InvariantCulture)
{
};
using (var stream = new MemoryStream())
using (var writer = new StreamWriter(stream, encoding))
using (var csv = new CsvWriter(writer, config))
{
csv.WriteRecords(records);
writer.Flush();
stream.Position = 0;
encoding.GetString(stream.ToArray()).Dump();
}
}
private class Foo
{
public int Id { get; set; }
public string Name { get; set; }
}
This is what overriding CsvWriter.WriteToBuffer would look like.
private class Utf8ToIso88591Writer : CsvWriter
{
private Encoding sourceEncoding = Encoding.UTF8;
private Encoding destEncoding = Encoding.GetEncoding("ISO-8859-1");
public Utf8ToIso88591Writer(TextWriter writer, CultureInfo cultureInfo, bool leaveOpen = false) : base(writer, cultureInfo, leaveOpen) { }
public Utf8ToIso88591Writer(TextWriter writer, CsvConfiguration configuration) : base(writer, configuration) { }
public override void WriteToBuffer(string value)
{
var bytes = sourceEncoding.GetBytes(value);
bytes = Encoding.Convert(sourceEncoding, destEncoding, bytes);
value = destEncoding.GetString(bytes);
base.WriteToBuffer(value);
}
}
This is what a custom type converter would look like.
private class Utf8ToIso88591Converter : DefaultTypeConverter
{
private Encoding sourceEncoding = Encoding.UTF8;
private Encoding destEncoding = Encoding.GetEncoding("ISO-8859-1");
public override string ConvertToString(object value, IWriterRow row, MemberMapData memberMapData)
{
var s = value as string;
if (s == null)
{
return base.ConvertToString(value, row, memberMapData);
}
var bytes = sourceEncoding.GetBytes(s);
bytes = Encoding.Convert(sourceEncoding, destEncoding, bytes);
s = destEncoding.GetString(bytes);
return s;
}
}
You would then apply the converter globally for strings like this.
csv.Context.TypeConverterCache.AddConverter<string>(new Utf8ToIso88591Converter());
It would be fastest to convert the whole file at once, if that's possible.
-
Thanks for that.
-
Hi, I have the same problem with quotation marks (“”). However, I can't use the proposed solutions, as I'm working on a dynamic form where I don't have a strongly typed object to write as CSV.
public MemoryStream WriteCsv(IEnumerable<FormSubmission> submissions, Form form)
{
var stream = new MemoryStream();
var writer = new StreamWriter(stream, Encoding.Latin1);
if (form?.Fields == null || !form.Fields.Any() || !submissions.Any())
{
return stream;
}
var configuration = new CsvConfiguration(CultureInfo.InvariantCulture)
{
Delimiter = ";"
};
var csv = new CsvWriter(writer, configuration);
var headers = new List<string> { "FormId", "State" };
foreach (var field in form.Fields)
{
headers.Add(field.Name);
}
// Write the headers
foreach (var header in headers)
{
csv.WriteField(header);
}
csv.NextRecord();
// Write the rows
foreach (var s in submissions)
{
csv.WriteField(s.Id);
csv.WriteField(s.Status);
// for each defined field get a formatted value to write on file
foreach (var field in form.Fields)
{
var submittedField = s.GetSubmittedField(field);
if (submittedField == null)
{
csv.WriteField(string.Empty);
continue;
}
var value = submittedField.GetFormattedValue();
csv.WriteField(value);
}
csv.NextRecord();
}
writer.Flush();
stream.Position = 0;
return stream;
}
So I can't use the converter as
For the moment, I replace the quotes when formatting the value of the field. However, it would be a much better and cleaner approach to have it integrated on the method
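For field-by-field writing like the code above, one option is to push each value through the target encoding before handing it to CsvWriter, so the writer sees the same characters that will end up in the file and can quote them. This is only a sketch: `EncodingRoundTrip` and `ToTargetCharset` are hypothetical names, the formatted value is assumed to be a string, and which character a smart quote turns into depends on the encoding's fallback.

```csharp
using System.Text;

public static class EncodingRoundTrip
{
    // Encodes and decodes the value with the target encoding so that any
    // character substitution happens before CsvWriter inspects the field.
    public static string ToTargetCharset(string value, Encoding target)
    {
        if (string.IsNullOrEmpty(value))
        {
            return value;
        }

        return target.GetString(target.GetBytes(value));
    }
}
```

In the loop above this would be called as `csv.WriteField(EncodingRoundTrip.ToTargetCharset(value, Encoding.Latin1));`. It is the same round trip the type converter earlier in the thread performs, just applied by hand at each call site.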