Parsing a CSV with commas in the data (tags: comma, data, CSV)

2023-09-03 04:17:25 Author: 北城以北、思念不归

Possible duplicate: Handling commas in files

I wrote my own CSV parser, and it works fine until I hit this record:

            B002VECGTG,B002VECGTG,HAS_17131_spaceshooter,"4,426",0.04%,"4,832",0.03%,0%,1,0.02%,$20.47 ,1

The commas inside the quoted fields "4,426" and "4,832" break my parser.

This is what I am using to parse the line of text:

            char[] comma = { ',' };
            string[] words = line.Split(comma);

How do I prevent my program from breaking?

Recommended answer

You can't just split on comma. To implement a proper parser for that case, you need to loop through the string yourself, keeping track of whether you are inside quotes or not. If you are inside a quoted string, you should keep on until you find another quote.

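// Requires: using System.Collections.Generic;
IEnumerable<string> LineSplitter(string line)
{
    int fieldStart = 0;
    for (int i = 0; i < line.Length; i++)
    {
        if (line[i] == ',')
        {
            // Unquoted comma: emit the field that just ended.
            yield return line.Substring(fieldStart, i - fieldStart);
            fieldStart = i + 1;
        }
        else if (line[i] == '"')
        {
            // Skip ahead to the closing quote so commas inside quotes are ignored.
            // The bounds check guards against an unterminated quote.
            for (i++; i < line.Length && line[i] != '"'; i++) { }
        }
    }
    // Emit the last field, which has no trailing comma.
    yield return line.Substring(fieldStart);
}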
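
Note that the splitter above returns quoted fields with their surrounding quotes still attached. If you want the bare values, one possible variation is sketched below. It is only an illustration, not part of the original answer: the class name CsvSketch and the method SplitCsvLine are hypothetical, and it assumes the common convention that a doubled quote ("") inside a quoted field stands for one literal quote. It handles a single line only, so quoted fields that span several lines are out of scope.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class CsvSketch
{
    // Hypothetical helper: splits one CSV line and returns the unquoted field values.
    public static List<string> SplitCsvLine(string line)
    {
        var fields = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;

        for (int i = 0; i < line.Length; i++)
        {
            char c = line[i];
            if (inQuotes)
            {
                if (c == '"' && i + 1 < line.Length && line[i + 1] == '"')
                {
                    current.Append('"'); // assumed convention: "" is a literal quote
                    i++;
                }
                else if (c == '"')
                {
                    inQuotes = false;    // closing quote
                }
                else
                {
                    current.Append(c);   // commas inside quotes are kept as data
                }
            }
            else if (c == '"')
            {
                inQuotes = true;         // opening quote
            }
            else if (c == ',')
            {
                fields.Add(current.ToString()); // field boundary
                current.Clear();
            }
            else
            {
                current.Append(c);
            }
        }

        fields.Add(current.ToString());  // last field has no trailing comma
        return fields;
    }

    static void Main()
    {
        // Shortened version of the record from the question.
        string line = "B002VECGTG,HAS_17131_spaceshooter,\"4,426\",0.04%,\"4,832\"";
        foreach (string field in SplitCsvLine(line))
            Console.WriteLine(field);
    }
}
```

For the shortened record in Main, this yields five fields, with 4,426 and 4,832 each kept as a single value. Building each field in a StringBuilder rather than taking substrings is what makes it easy to drop the quotes and collapse doubled quotes in the same pass.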