Bulk insert or update with NHibernate

Hi, I'm working on a project where we need to process several XML files once a day and populate a database with the information contained in those files.

Each file is roughly 1 MB and contains about 1,000 records; we usually need to process between 12 and 25 of these files. I've seen some information regarding bulk inserts using NHibernate, but our problem is somewhat trickier, since the XML files contain new records mixed with updated records.

In the XML there is a flag that tells us whether a specific record is new or an update to an existing record, but not what information has changed. The XML records do not contain our DB identifier, but we can use an identifier from the XML record to uniquely locate the corresponding record in our DB.

Our strategy so far has been to identify whether the current record is an insert or an update. Based on that, we either perform an insert on the DB, or we search for the existing record, update the object with the information coming from the XML record, and finally perform an update on the DB.
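
For reference, a minimal sketch of that per-record flow might look like the following. The Product entity, its ExternalId property, and the XmlRecord type are hypothetical stand-ins for the real model; only the insert-or-search-then-update shape is taken from the description above.

```csharp
using System.Linq;
using NHibernate;
using NHibernate.Linq;

// Hypothetical entity: Id is the DB identifier, ExternalId is the identifier
// carried by the XML records.
public class Product
{
    public virtual int Id { get; protected set; }
    public virtual string ExternalId { get; set; }
    public virtual string Name { get; set; }
    public virtual decimal Price { get; set; }
}

// Hypothetical in-memory representation of one XML record.
public class XmlRecord
{
    public bool IsNew { get; set; }
    public string ExternalId { get; set; }
    public string Name { get; set; }
    public decimal Price { get; set; }
}

public static class PerRecordImporter
{
    public static void Process(ISessionFactory factory, XmlRecord record)
    {
        using (ISession session = factory.OpenSession())
        using (ITransaction tx = session.BeginTransaction())
        {
            if (record.IsNew)
            {
                // The flag says this is a new record: plain insert.
                session.Save(new Product
                {
                    ExternalId = record.ExternalId,
                    Name = record.Name,
                    Price = record.Price
                });
            }
            else
            {
                // Update: locate the row by the external identifier, copy the
                // values over and let dirty checking flush the change on commit.
                // (Assumes the row exists; error handling omitted.)
                var product = session.Query<Product>()
                    .Where(p => p.ExternalId == record.ExternalId)
                    .SingleOrDefault();
                product.Name = record.Name;
                product.Price = record.Price;
            }
            tx.Commit();
        }
    }
}
```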

The problem with our current approach is that we are having issues with DB locks and our performance degrades very quickly. We have thought about some alternatives, like having separate tables for the distinct operations or even separate DBs, but such a move would mean a big effort, so before making any decision I would like to ask for the community's opinion on this matter. Thanks in advance.

Recommended answer

A couple of ideas:

- Always try to use IStatelessSession for bulk operations (a sketch follows below).
- If you're still not happy with the performance, just skip NHibernate and use a stored procedure or a parameterized query specific to this task, or use IQuery.ExecuteUpdate() (also sketched below).
- If you're using SQL Server, you could convert your XML to BCP format XML and then run BULK INSERT on it (only for the insertions).
- If you're having too many DB locks, try grouping the operations: first find out what needs to be inserted and what needs to be updated, then get the PKs for the updates, then run BULK INSERT for the insertions, then run the updates (see the grouping sketch below).
- If parsing the source files is a performance issue (i.e. it maxes out a CPU core), try doing it in parallel; you could use Parallel Extensions (see the last sketch below).
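
To illustrate the first point, a bulk pass with IStatelessSession might look roughly like this, reusing the hypothetical Product and XmlRecord types from the sketch in the question. A stateless session keeps no first-level cache and does no dirty checking, so memory use stays flat, but updates have to be issued explicitly.

```csharp
using System.Collections.Generic;
using System.Linq;
using NHibernate;
using NHibernate.Linq;

public static class StatelessImporter
{
    public static void Run(ISessionFactory factory, IEnumerable<XmlRecord> records)
    {
        using (IStatelessSession session = factory.OpenStatelessSession())
        using (ITransaction tx = session.BeginTransaction())
        {
            foreach (var record in records)
            {
                if (record.IsNew)
                {
                    session.Insert(new Product
                    {
                        ExternalId = record.ExternalId,
                        Name = record.Name,
                        Price = record.Price
                    });
                }
                else
                {
                    // No dirty checking in a stateless session: load, change,
                    // then update explicitly. (Assumes the row exists.)
                    var product = session.Query<Product>()
                        .Where(p => p.ExternalId == record.ExternalId)
                        .SingleOrDefault();
                    product.Name = record.Name;
                    product.Price = record.Price;
                    session.Update(product);
                }
            }
            tx.Commit();
        }
    }
}
```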
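
For the IQuery.ExecuteUpdate() suggestion, the updates can be pushed down as HQL DML so no entity has to be loaded at all; a rough sketch, again assuming the hypothetical Product/XmlRecord shape:

```csharp
using NHibernate;

public static class DmlUpdater
{
    // Issues a single UPDATE statement per changed record instead of a SELECT
    // followed by an UPDATE of a tracked entity.
    public static int Apply(ISession session, XmlRecord record)
    {
        return session.CreateQuery(
                "update Product set Name = :name, Price = :price " +
                "where ExternalId = :externalId")
            .SetParameter("name", record.Name)
            .SetParameter("price", record.Price)
            .SetParameter("externalId", record.ExternalId)
            .ExecuteUpdate();   // returns the number of affected rows
    }
}
```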
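
The grouping idea could be sketched like this: classify all records first, write the inserts in one phase, resolve the rows touched by updates in one query, and only then apply the updates, so each table is hit in large, predictable batches rather than an interleaved mix (same hypothetical types as above):

```csharp
using System.Collections.Generic;
using System.Linq;
using NHibernate;
using NHibernate.Linq;

public static class GroupedImporter
{
    public static void Run(ISessionFactory factory, IReadOnlyCollection<XmlRecord> records)
    {
        var inserts = records.Where(r => r.IsNew).ToList();
        var updates = records.Where(r => !r.IsNew).ToList();

        using (IStatelessSession session = factory.OpenStatelessSession())
        using (ITransaction tx = session.BeginTransaction())
        {
            // Phase 1: all inserts together.
            foreach (var r in inserts)
            {
                session.Insert(new Product
                {
                    ExternalId = r.ExternalId,
                    Name = r.Name,
                    Price = r.Price
                });
            }

            // Phase 2: resolve every row touched by an update in one round trip.
            // (With thousands of ids the IN list may need to be chunked.)
            var externalIds = updates.Select(r => r.ExternalId).ToList();
            var existing = session.Query<Product>()
                .Where(p => externalIds.Contains(p.ExternalId))
                .ToDictionary(p => p.ExternalId);

            // Phase 3: apply the updates.
            foreach (var r in updates)
            {
                var product = existing[r.ExternalId];
                product.Name = r.Name;
                product.Price = r.Price;
                session.Update(product);
            }

            tx.Commit();
        }
    }
}
```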
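
And if parsing turns out to be the bottleneck, the files can be parsed in parallel while keeping the database work single-threaded; a sketch with Parallel.ForEach, where the element and attribute names ("record", "isNew", "externalId", and so on) are assumptions about the feed's layout:

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using System.Xml.Linq;

public static class ParallelParser
{
    // Parses each daily file on its own core and collects the records;
    // only the parsing is parallel, the DB writes stay sequential.
    public static List<XmlRecord> ParseAll(IEnumerable<string> paths)
    {
        var records = new ConcurrentBag<XmlRecord>();

        Parallel.ForEach(paths, path =>
        {
            var doc = XDocument.Load(path);
            foreach (var e in doc.Descendants("record"))
            {
                records.Add(new XmlRecord
                {
                    IsNew      = (bool)e.Attribute("isNew"),
                    ExternalId = (string)e.Element("externalId"),
                    Name       = (string)e.Element("name"),
                    Price      = (decimal)e.Element("price")
                });
            }
        });

        return records.ToList();
    }
}
```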