SerializationException when serializing lots of objects in .NET

2023-09-02 11:58:41 Author: 没关系,我会放手


I'm running into problems serializing lots of objects in .NET. The object graph is pretty big with some of the new data sets being used, so I'm getting:

System.Runtime.Serialization.SerializationException
"The internal array cannot expand to greater than Int32.MaxValue elements."

Has anyone else hit this limit? How have you solved it?

I would like to keep using the built-in serialization mechanism if possible, but it seems I may have to roll my own (and maintain backwards compatibility with the existing data files).

The objects are all POCOs and are serialized using BinaryFormatter. Each serialized object implements ISerializable to selectively serialize its members (some of them are recalculated during loading).

It looks like this is a known issue at Microsoft (details here), but it has been resolved as Won't Fix. The details are (from the link):

Binary serialization fails for object graphs with more than ~13.2 million objects. The attempt to do so causes an exception in ObjectIDGenerator.Rehash with a misleading error message referencing Int32.MaxValue.

Upon examination of ObjectIDGenerator.cs in the SSCLI source code, it appears that larger object graphs could be handled by adding additional entries into the sizes array. See the following lines:

// Table of prime numbers to use as hash table sizes. Each entry is the
// smallest prime number larger than twice the previous entry.
private static readonly int[] sizes = {5, 11, 29, 47, 97, 197, 397,
797, 1597, 3203, 6421, 12853, 25717, 51437, 102877, 205759, 
411527, 823117, 1646237, 3292489, 6584983};

However, it would be nice if serialization worked for any reasonable size of the object graph.
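
To make the arithmetic behind that report concrete, here is a small sketch. It assumes the generator grows its table once the object count exceeds roughly twice the current `sizes` entry, which would put the ceiling at 2 × 6584983 = 13,169,966 and match the ~13.2 million figure; that growth rule is my reading, not something stated in the report. The class and method names are made up for illustration. The sketch also computes which primes would follow the quoted table under the rule in its comment; the framework class itself cannot be patched this way from user code.

```csharp
using System;
using System.Collections.Generic;

public static class SizesTableSketch
{
    // Last entry of the sizes table quoted above.
    public const int LastQuotedSize = 6584983;

    // Trial-division primality test; fast enough for values near Int32.MaxValue.
    public static bool IsPrime(long n)
    {
        if (n < 2) return false;
        if (n % 2 == 0) return n == 2;
        for (long d = 3; d * d <= n; d += 2)
            if (n % d == 0) return false;
        return true;
    }

    // Smallest prime strictly greater than twice the given value.
    public static long NextSize(long previous)
    {
        long candidate = 2 * previous + 1;
        while (!IsPrime(candidate)) candidate++;
        return candidate;
    }

    // Hypothetical extra entries that would still fit in an Int32.
    public static List<long> AdditionalSizes()
    {
        var result = new List<long>();
        long size = NextSize(LastQuotedSize);
        while (size <= int.MaxValue)
        {
            result.Add(size);
            size = NextSize(size);
        }
        return result;
    }

    public static void Main()
    {
        // Assumed ceiling: two tracked objects per unit of the last table size.
        Console.WriteLine("Approximate object limit: " + 2L * LastQuotedSize);
        foreach (long size in AdditionalSizes())
            Console.WriteLine("Candidate extra size: " + size);
    }
}
```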

Solution

I tried reproducing the problem, but the code takes forever to run even when each of the 13+ million objects is only 2 bytes. So I suspect you could not only fix the problem but also significantly improve performance by packing your data a little better in your custom ISerializable implementations. Don't let the serializer see so deep into your structure; cut it off at the point where your object graph blows up into hundreds of thousands of array elements or more (presumably, if you have that many objects, they are pretty small, or you wouldn't be able to hold them in memory anyway). Take this example, which lets the serializer see classes B and C but manually manages the collection of class A:

using System;
using System.Collections.Generic;

class Program
{
    static void Main(string[] args)
    {
        // 8 B instances, each holding 2,000,000 A instances (16 million small objects).
        C c = new C(8, 2000000);
        System.Runtime.Serialization.Formatters.Binary.BinaryFormatter bf = new System.Runtime.Serialization.Formatters.Binary.BinaryFormatter();
        System.IO.MemoryStream ms = new System.IO.MemoryStream();
        bf.Serialize(ms, c);
        ms.Seek(0, System.IO.SeekOrigin.Begin);

        // Print a few sample values before the round trip.
        for (int i = 0; i < 3; i++)
            for (int j = i; j < i + 3; j++)
                Console.WriteLine("{0}, {1}", c.all[i][j].b1, c.all[i][j].b2);
        Console.WriteLine("=====");

        // Drop the original graph, deserialize, and print the same samples for comparison.
        c = null;
        c = (C)(bf.Deserialize(ms));
        for (int i = 0; i < 3; i++)
            for (int j = i; j < i + 3; j++)
                Console.WriteLine("{0}, {1}", c.all[i][j].b1, c.all[i][j].b2);
        Console.WriteLine("=====");
    }
}

class A
{
    byte dataByte1;
    byte dataByte2;
    public A(byte b1, byte b2)
    {
        dataByte1 = b1;
        dataByte2 = b2;
    }

    public UInt16 GetAllData()
    {
        return (UInt16)((dataByte1 << 8) | dataByte2);
    }

    public A(UInt16 allData)
    {
        dataByte1 = (byte)(allData >> 8);
        dataByte2 = (byte)(allData & 0xff);
    }

    public byte b1
    {
        get
        {
            return dataByte1;
        }
    }

    public byte b2
    {
        get
        {
            return dataByte2;
        }
    }
}

[Serializable()]
class B : System.Runtime.Serialization.ISerializable
{
    string name;
    List<A> myList;

    public B(int size)
    {
        myList = new List<A>(size);

        for (int i = 0; i < size; i++)
        {
            myList.Add(new A((byte)(i % 255), (byte)((i + 1) % 255)));
        }
        name = "List of " + size.ToString();
    }

    public A this[int index]
    {
        get
        {
            return myList[index];
        }
    }

    #region ISerializable Members

    // Serialize the whole list as a single UInt16 array, so the formatter
    // tracks one array instead of one object per A instance.
    public void GetObjectData(System.Runtime.Serialization.SerializationInfo info, System.Runtime.Serialization.StreamingContext context)
    {
        UInt16[] packed = new UInt16[myList.Count];
        info.AddValue("name", name);
        for (int i = 0; i < myList.Count; i++)
        {
            packed[i] = myList[i].GetAllData();
        }
        info.AddValue("packedData", packed);
    }

    // Deserialization constructor: rebuild the A instances from the packed array.
    protected B(System.Runtime.Serialization.SerializationInfo info, System.Runtime.Serialization.StreamingContext context)
    {
        name = info.GetString("name");
        UInt16[] packed = (UInt16[])(info.GetValue("packedData", typeof(UInt16[])));
        myList = new List<A>(packed.Length);
        for (int i = 0; i < packed.Length; i++)
            myList.Add(new A(packed[i]));
    }

    #endregion
}

[Serializable()]
class C
{
    public List<B> all;
    public C(int count, int size)
    {
        all = new List<B>(count);
        for (int i = 0; i < count; i++)
        {
            all.Add(new B(size));
        }
    }
}
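
If you do end up rolling your own format for the packed part, the same packing idea works without BinaryFormatter at all. The following is only a sketch of that alternative, not part of the example above: it writes a length prefix followed by the packed UInt16 values with BinaryWriter and reads them back with BinaryReader. The class and method names are made up for illustration, and the length-prefix layout is an assumption, not an existing file format.

```csharp
using System;
using System.IO;

public static class PackedRoundTrip
{
    // Same layout as A.GetAllData above: first byte in the high 8 bits.
    public static ushort Pack(byte b1, byte b2) { return (ushort)((b1 << 8) | b2); }
    public static byte High(ushort packed) { return (byte)(packed >> 8); }
    public static byte Low(ushort packed) { return (byte)(packed & 0xff); }

    // Write an Int32 count followed by the packed values.
    public static byte[] Write(ushort[] packed)
    {
        using (var ms = new MemoryStream())
        using (var writer = new BinaryWriter(ms))
        {
            writer.Write(packed.Length);
            foreach (ushort value in packed) writer.Write(value);
            writer.Flush();
            return ms.ToArray();
        }
    }

    // Read the count, then that many packed values.
    public static ushort[] Read(byte[] data)
    {
        using (var reader = new BinaryReader(new MemoryStream(data)))
        {
            int count = reader.ReadInt32();
            var packed = new ushort[count];
            for (int i = 0; i < count; i++) packed[i] = reader.ReadUInt16();
            return packed;
        }
    }

    public static void Main()
    {
        var packed = new ushort[1000];
        for (int i = 0; i < packed.Length; i++)
            packed[i] = Pack((byte)(i % 255), (byte)((i + 1) % 255));

        byte[] bytes = Write(packed);
        ushort[] restored = Read(bytes);

        // 4 bytes of count plus 2 bytes per element.
        Console.WriteLine("Bytes written: " + bytes.Length);
        Console.WriteLine("{0}, {1}", High(restored[5]), Low(restored[5]));
    }
}
```

Each element costs two bytes on disk and no object tracking, which is the property the packed array in class B relies on as well.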

 