How to get data from a text file in C#

2023-09-04 05:41:33 Author: 要刺激别喊疼

I have a text file that contains a large amount of data. The first thing I have to do is filter out the leaf cell data, which is scattered throughout the file. For this, I first pick the lines beginning with ADD GCELL, which contain the primary data. Next, I have to get the related data from the same text file using the CELLID that appears in that ADD GCELL line. The related data is in the lines beginning with ADD GTRX, and the fields I need are FREQ, TRXNO and ISMAINBCCH. In a nutshell, CELLID is the common value between the ADD GCELL and ADD GTRX lines. I have written some code in C#, but I got stuck. Here is part of the text file:

...........................

ADD GCELL:CELLID=13, CELLNAME="NR_0702_07021_G1_A", MCC="424", MNC="02", LAC=6112, CI=7021, NCC=6, BCC=0, EXTTP=Normal_cell, IUOTP=Concentric_cell, ENIUO=ON, DBFREQBCCHIUO=Extra, FLEXMAIO=OFF, CSVSP=3, CSDSP=5, PSHPSP=4, PSLPSVP=6, BSPBCCHBLKS=1, BSPAGBLKSRES=4, BSPRACHBLKS=1, TYPE=GSM900_DCS1800, OPNAME="Tester", VIPCELL=NO
..............................
ADD GTRX:TRXID=11140, TRXNAME="T_RAK_JaziratHamra_G_702_7021_A-0", FREQ=99, TRXNO=0, CELLID=13, IDTYPE=BYID, ISMAINBCCH=YES, ISTMPTRX=NO, GTRXGROUPID=80;

The code I have written so far is:

using (StreamReader sr = File.OpenText(filename))
{
    while ((s = sr.ReadLine()) != null)
    {
        if (s.Contains("ADD GCELL:"))
        {
            s = s.Replace("ADD GCELL:", "");
            string[] items = s.Split(',');
            foreach (string str in items)
            {
                string[] str1 = str.Split('=');
                if (str1[0] == "CELLID")
                {
                    cellidnew = str1[1];
                }
                string fieldname = str1[0];
                string value = str1[1].Replace(";", string.Empty).Replace("\"", string.Empty);

            }

            Getgtrxvalues(filename, ref cellname, ref cellidnew, ref Frequency, ref TRXNO ,ref ISMAINBCCH);


        }
    }
}

private static void Getgtrxvalues(string filename, ref string cellname, ref string cellid, ref int Frequency, ref int TRXNO, ref bool ISMAINBCCH)
{
    string s;
    using (StreamReader sr = File.OpenText(filename))
    {
        while ((s = sr.ReadLine()) != null)
        {
            if (s.Contains("ADD GTRX:"))
            {
                // Unfinished: the FREQ, TRXNO and ISMAINBCCH values for the
                // matching CELLID still have to be read from this line.
            }
        }
    }
}

Update

Everything is working fine, except for one more condition I have to satisfy. For the ADD GTRX lines I am currently taking all the values, including FREQ, when ISMAINBCCH=YES. But there are also FREQ values on the lines where ISMAINBCCH=NO, and I need those as comma-separated values. For example, here I first take the FREQ where CELLID=639 (the value is dynamic; it can be anything) and ISMAINBCCH=YES. That part is done. The next task is to concatenate, comma-separated, the FREQ values where CELLID=639 and ISMAINBCCH=NO, so the output I want here is 24,28,67. How can I achieve this?

Lines:

 ADD GTRX:TRXID=0, TRXNAME="M_RAK_JeerExch_G_1879_18791_A-0", FREQ=81, TRXNO=0, CELLID=639, IDTYPE=BYID, ISMAINBCCH=YES, ISTMPTRX=NO, GTRXGROUPID=2556;
 ADD GTRX:TRXID=1, TRXNAME="M_RAK_JeerExch_G_1879_18791_A-1", FREQ=24, TRXNO=1, CELLID=639, IDTYPE=BYID, ISMAINBCCH=NO, ISTMPTRX=NO, GTRXGROUPID=2556;
 ADD GTRX:TRXID=5, TRXNAME="M_RAK_JeerExch_G_1879_18791_A-2", FREQ=28, TRXNO=2, CELLID=639, IDTYPE=BYID, ISMAINBCCH=NO, ISTMPTRX=NO, GTRXGROUPID=2556;
 ADD GTRX:TRXID=6, TRXNAME="M_RAK_JeerExch_G_1879_18791_A-3", FREQ=67, TRXNO=3, CELLID=639, IDTYPE=BYID, ISMAINBCCH=NO, ISTMPTRX=NO, GTRXGROUPID=2556;

Update

Finally, I did it as shown in the code below.

I created one more property, DEFINED_TCH_FRQ = null, to hold the concatenated string. The problem is that it is very slow. I am iterating over the text file twice: the first time with sr.ReadLine(), and the second time with File.ReadLines to build the concatenated string (previously I used File.ReadAllLines for this and got an out-of-memory exception).

List<int> intarr = new List<int>();
intarr.Clear();
var gtrx = new Gtrx
{
    CellId = int.Parse(PullValue(s, "CELLID")),
    Freq = int.Parse(PullValue(s, "FREQ")),
    TrxNo = int.Parse(PullValue(s, "TRXNO")),
    IsMainBcch = PullValue(s, "ISMAINBCCH").ToUpper() == "YES",
    Commabcch = new List<string> { PullValue(s, "ISMAINBCCH") },
    DEFINED_TCH_FRQ = null,
    TrxName = PullValue(s, "TRXNAME"),
};

if (!intarr.Contains(gtrx.CellId))
{
    if (!_dictionary.ContainsKey(gtrx.CellId))
    {
        // No GCell record for this id. Do something!
        continue;
    }
    intarr.Add(gtrx.CellId);
    string results = string.Empty;

    var result = String.Join(",",
        from ss in File.ReadLines(filename)
        where ss.Contains("ADD GTRX:")
        where int.Parse(PullValue(ss, "CELLID")) == gtrx.CellId
        where PullValue(ss, "ISMAINBCCH").ToUpper() != "YES"
        select int.Parse(PullValue(ss, "FREQ")));
    results = result;

    var gtrxnew = new Gtrx
    {
        DEFINED_TCH_FRQ = results
    };

    _dictionary[gtrx.CellId].Gtrx = gtrx;
}

Update

Finally, I first saved the lines starting with ADD GTRX into an array using File.ReadAllLines, and then used only that array to build the concatenated string instead of storing the entire text file. This gave some performance improvement. Now my question is: if I convert my text files, each containing hundreds of thousands of lines, to XML and then retrieve the data from the XML file, will that improve performance? And if I use a DataTable and DataSet rather than classes here, will that improve performance?

Recommended answer

Assuming the data is consistent, and assuming the GCELL lines come before the GTRX lines (since each GTRX references the ID of a GCELL), you could create a simple parser for this and store the values in a dictionary.

The first thing to do is create classes to hold the Gtrx data and the Gcell data. Keep in mind that I am only grabbing a subset of the data; you can add more fields if you need them:

private class Gtrx
{
    public int Freq { get; set; }
    public int TrxNo { get; set; }
    public string TrxName { get; set; }
    public int CellId { get; set; }
    public bool IsMainBcch { get; set; }
}

private class Gcell
{
    public int CellId { get; set; }
    public string CellName { get; set; }
    public string Mcc { get; set; }
    public int Lac { get; set; }
    public int Ci { get; set; }
}

In addition to these classes, we also need a class to "link" the two together:

private class GcellGtrx
{
    public Gcell Gcell { get; set; }
    public Gtrx Gtrx { get; set; }
}

Now we can build a simple parser:

private readonly Dictionary<int, GcellGtrx> _dictionary = new Dictionary<int, GcellGtrx>();

string data = "ADD GCELL:CELLID=13, CELLNAME=\"NR_0702_07021_G1_A\", MCC=\"424\", MNC=\"02\", LAC=6112, CI=7021, NCC=6, BCC=0, EXTTP=Normal_cell, IUOTP=Concentric_cell, ENIUO=ON, DBFREQBCCHIUO=Extra, FLEXMAIO=OFF, CSVSP=3, CSDSP=5, PSHPSP=4, PSLPSVP=6, BSPBCCHBLKS=1, BSPAGBLKSRES=4, BSPRACHBLKS=1, TYPE=GSM900_DCS1800, OPNAME=\"Tester\", VIPCELL=NO" + Environment.NewLine;
data = data + "ADD GTRX:TRXID=11140, TRXNAME=\"T_RAK_JaziratHamra_G_702_7021_A-0\", FREQ=99, TRXNO=0, CELLID=13, IDTYPE=BYID, ISMAINBCCH=YES, ISTMPTRX=NO, GTRXGROUPID=80;" + Environment.NewLine;

using (var sr = new StringReader(data))
{
    string line;
    // Reading in the loop condition ensures 'continue' below still advances
    // to the next line instead of looping forever on the same one.
    while ((line = sr.ReadLine()) != null)
    {
        line = line.Trim();
        if (line.StartsWith("ADD GCELL:"))
        {
            var gcell = new Gcell
            {
                CellId = int.Parse(PullValue(line, "CELLID")),
                CellName = PullValue(line, "CELLNAME"),
                Ci = int.Parse(PullValue(line, "CI")),
                Lac = int.Parse(PullValue(line, "LAC")),
                Mcc = PullValue(line, "MCC")
            };
            var gcellGtrx = new GcellGtrx();
            gcellGtrx.Gcell = gcell;
            _dictionary.Add(gcell.CellId, gcellGtrx);
        }
        if (line.StartsWith("ADD GTRX:"))
        {
            var gtrx = new Gtrx
            {
                CellId = int.Parse(PullValue(line, "CELLID")),
                Freq = int.Parse(PullValue(line, "FREQ")),
                TrxNo = int.Parse(PullValue(line, "TRXNO")),
                IsMainBcch = PullValue(line, "ISMAINBCCH").ToUpper() == "YES",
                TrxName = PullValue(line, "TRXNAME")
            };

            if (!_dictionary.ContainsKey(gtrx.CellId))
            {
                // No GCell record for this id. Do something!
                continue;
            }
            _dictionary[gtrx.CellId].Gtrx = gtrx;
        }
    }
}

// Now you can pull your data using a CellId:
// GcellGtrx cell13 = _dictionary[13];
// 
// Or you could iterate through each one:
// foreach (KeyValuePair<int, GcellGtrx> kvp in _dictionary)
// {
//     int key = kvp.Key;
//     GcellGtrx gCellGtrxdata = kvp.Value;
//     // Do Stuff
// }
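
Note that the parser above stores a single Gtrx per cell, so each new ADD GTRX line for the same CELLID overwrites the previous one. For the follow-up in the question (joining the FREQ values where ISMAINBCCH=NO), one option is to collect those values per CELLID during a single pass rather than re-reading the file for every cell. The following is only a sketch under that assumption: the class TchFreqSketch and its methods CollectTchFreqs and Field are hypothetical names, not part of the original answer, and Field is a small stand-in for the PullValue helper.

```csharp
using System;
using System.Collections.Generic;

public static class TchFreqSketch
{
    // Hypothetical helper: maps CELLID -> comma-separated FREQ values taken
    // from the ADD GTRX lines with ISMAINBCCH=NO, reading the input once.
    public static Dictionary<int, string> CollectTchFreqs(IEnumerable<string> lines)
    {
        var freqsByCell = new Dictionary<int, List<string>>();
        foreach (string raw in lines)
        {
            string line = raw.Trim();
            if (!line.StartsWith("ADD GTRX:", StringComparison.Ordinal))
                continue;
            if (Field(line, "ISMAINBCCH").ToUpperInvariant() == "YES")
                continue; // the main BCCH frequency is handled separately

            int cellId = int.Parse(Field(line, "CELLID"));
            List<string> list;
            if (!freqsByCell.TryGetValue(cellId, out list))
            {
                list = new List<string>();
                freqsByCell[cellId] = list;
            }
            list.Add(Field(line, "FREQ"));
        }

        var result = new Dictionary<int, string>();
        foreach (KeyValuePair<int, List<string>> kvp in freqsByCell)
            result[kvp.Key] = string.Join(",", kvp.Value);
        return result;
    }

    // Minimal stand-in for PullValue: returns the text between "KEY=" and the
    // next comma, without whitespace, a trailing ';' or surrounding quotes.
    private static string Field(string line, string key)
    {
        int ndx = line.IndexOf(key + "=", StringComparison.OrdinalIgnoreCase);
        if (ndx < 0) return "";
        int start = ndx + key.Length + 1;
        int end = line.IndexOf(',', start);
        if (end == -1) end = line.Length;
        return line.Substring(start, end - start).Trim().TrimEnd(';').Trim('"');
    }
}
```

It could be called as TchFreqSketch.CollectTchFreqs(File.ReadLines(filename)); File.ReadLines enumerates the file lazily, so the whole file is not held in memory at once. For the four sample CELLID=639 lines in the question, the entry for 639 would be 24,28,67.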

Finally, we need to define a simple helper method:

private string PullValue(string line, string key)
{
    key = key + "=";
    int ndx = line.IndexOf(key, StringComparison.OrdinalIgnoreCase);
    if (ndx >= 0)
    {
        int start = ndx + key.Length;
        int ndx2 = line.IndexOf(',', start);
        if (ndx2 == -1)
            ndx2 = line.Length; // last field on the line: read to the end

        // Strip whitespace, a trailing ';' terminator and surrounding quotes.
        return line.Substring(start, ndx2 - start).Trim().TrimEnd(';').Trim('"');
    }

    return "";
}
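
One thing to be aware of is that PullValue searches for the key as a substring, so a short key such as ID would also match inside TRXID or CELLID, and the line is scanned again for every field. A possible alternative, shown here only as a sketch, is to split each line once into a key/value map. The class LineFieldsSketch and its method ParseFields are hypothetical names, and the sketch assumes that values never contain a comma or an equals sign, which holds for the sample lines but may not hold for the whole file.

```csharp
using System;
using System.Collections.Generic;

public static class LineFieldsSketch
{
    // Hypothetical helper: turns "ADD GTRX:TRXID=0, FREQ=81, ...;" into a
    // case-insensitive key/value map. Assumes values contain no ',' or '='.
    public static Dictionary<string, string> ParseFields(string line)
    {
        var fields = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);

        // Drop the command prefix ("ADD GTRX:") if there is one.
        int colon = line.IndexOf(':');
        string body = colon >= 0 ? line.Substring(colon + 1) : line;

        foreach (string part in body.Split(','))
        {
            int eq = part.IndexOf('=');
            if (eq <= 0) continue; // not a KEY=VALUE pair

            string key = part.Substring(0, eq).Trim();
            string value = part.Substring(eq + 1).Trim().TrimEnd(';').Trim('"');
            fields[key] = value; // a repeated key keeps its last value
        }
        return fields;
    }
}
```

With this, a field is read as fields["CELLID"], and a missing key can be detected with fields.TryGetValue instead of comparing against an empty string.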

That should do it! See if that works for you. Keep in mind that this is very basic. You would probably want to handle some possible errors (such as a key not existing).
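
As one illustration of that kind of error handling: when a key is missing, PullValue returns an empty string, and int.Parse("") throws a FormatException. A small wrapper around int.TryParse lets the caller skip or log such lines instead. This is a minimal sketch; SafeParseSketch and TryGetInt are hypothetical names, not part of the answer's code.

```csharp
using System;
using System.Globalization;

public static class SafeParseSketch
{
    // Hypothetical helper: returns false instead of throwing when the text
    // (for example the empty string from a missing key) is not an integer.
    public static bool TryGetInt(string text, out int value)
    {
        return int.TryParse(text, NumberStyles.Integer,
                            CultureInfo.InvariantCulture, out value);
    }
}
```

Inside the parser loop, a call such as SafeParseSketch.TryGetInt(PullValue(line, "CELLID"), out cellId) could then decide whether to build the object or move on to the next line.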