How to convert huge XML files into relational data for querying

2023-09-04 03:18:04 Author: 对你微笑纯属礼貌

I have at least 100 XML files, each about 300 MB, containing email messages in roughly the format listed below.

Now my question is: how do I get this data into, say, a SQL Server database so that I can run queries against it? My queries would be along the lines of: has a certain person sent an email to another certain person within a given period, with certain keywords in the subject or body?

Here is what I have tried:

1) Loading each XML file into an XML data type column in SQL Server. With this approach I could not come up with the XPath(?) queries to do what I need. Is it even possible to do this in XPath?

2) Loading each file into a .NET DataSet using ReadXml and ReadXmlSchema. This seems to load fine, and it seems to create the right number of DataTables with the foreign keys etc., but it would mean creating 100 sets of tables in the database and then somehow joining them all into one single table before performing the query.

Let me know if you have any other suggestions.

Thanks.

<Message>
<MsgID>4651286700000CAA00EF00010000</MsgID>
<MsgTime>2007-05-21-01.04.39.000000</MsgTime>
<MsgTimeUTC>1179723879</MsgTimeUTC>
<MsgLang>CODE 1252</MsgLang>
<Sender>
	<UserInfo>
		<FirstName>X</FirstName>
		<LastName>Y</LastName>
		<AccountName>121212</AccountName>
		<CorporateEmailAddress>someone@somewhere.com</CorporateEmailAddress>
	</UserInfo>
</Sender>
<Recipient DeliveryType = " ">
	<UserInfo>
		<FirstName>A</FirstName>
		<LastName>B</LastName>
		<FirmNumber>7593</FirmNumber>
		<AccountName>STRATEGIC AS</AccountName>
		<AccountNumber>604806</AccountNumber>
		<CorporateEmailAddress>A@B.COM</CorporateEmailAddress>
	</UserInfo>
</Recipient>
<Subject>
	Please review the following
</Subject>
<Attachment>
	<FileName>37715772.htm</FileName>
	<FileID>503242486522279_37715772.htm</FileID>
	<FileSize>31175</FileSize>
</Attachment>
<MsgBody>
	This is the message Body
</MsgBody>
</Message>

Recommended answer

Use the XML Bulk Load component:

http://support.microsoft.com/kb/316005
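
For readers who want to see the general shape of the XML-to-relational step, here is a minimal sketch in Python. It is not the XML Bulk Load component itself: it streams `<Message>` elements with the standard library's `iterparse` and inserts them into SQLite, which stands in for SQL Server here. The `Messages` root element, the table and column names, and the helper functions are all assumptions made for illustration; only the element names inside `<Message>` come from the question. Every file is loaded into the same two tables, so there is no need for one set of tables per file.

```python
import io
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical schema: one row per message, one row per recipient.
SCHEMA = """
CREATE TABLE IF NOT EXISTS messages (
    msg_id TEXT, msg_time_utc INTEGER, sender_email TEXT,
    subject TEXT, body TEXT
);
CREATE TABLE IF NOT EXISTS recipients (msg_id TEXT, email TEXT);
"""

# Trimmed copy of the question's sample, wrapped in an assumed root element.
SAMPLE = b"""<Messages>
<Message>
<MsgID>4651286700000CAA00EF00010000</MsgID>
<MsgTimeUTC>1179723879</MsgTimeUTC>
<Sender><UserInfo>
  <CorporateEmailAddress>someone@somewhere.com</CorporateEmailAddress>
</UserInfo></Sender>
<Recipient DeliveryType=" "><UserInfo>
  <CorporateEmailAddress>A@B.COM</CorporateEmailAddress>
</UserInfo></Recipient>
<Subject>
    Please review the following
</Subject>
<MsgBody>
    This is the message Body
</MsgBody>
</Message>
</Messages>"""


def text(elem, path):
    """Return the stripped text found at path, or None if it is missing."""
    value = elem.findtext(path)
    return value.strip() if value is not None else None


def load_messages(source, conn):
    """Stream <Message> elements from a path or binary file object into conn."""
    conn.executescript(SCHEMA)
    count = 0
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != "Message":
            continue
        msg_id = text(elem, "MsgID")
        utc = text(elem, "MsgTimeUTC")
        conn.execute(
            "INSERT INTO messages VALUES (?, ?, ?, ?, ?)",
            (msg_id, int(utc) if utc else None,
             text(elem, "Sender/UserInfo/CorporateEmailAddress"),
             text(elem, "Subject"), text(elem, "MsgBody")),
        )
        for rcpt in elem.findall("Recipient"):
            conn.execute(
                "INSERT INTO recipients VALUES (?, ?)",
                (msg_id, text(rcpt, "UserInfo/CorporateEmailAddress")),
            )
        elem.clear()  # drop the parsed children so large files use less memory
        count += 1
    conn.commit()
    return count


def find_messages(conn, sender, recipient, start_utc, end_utc, keyword):
    """Messages from sender to recipient in a UTC window mentioning keyword."""
    pattern = f"%{keyword}%"
    sql = """
        SELECT DISTINCT m.msg_id, m.subject
        FROM messages AS m
        JOIN recipients AS r ON r.msg_id = m.msg_id
        WHERE m.sender_email = ? AND r.email = ?
          AND m.msg_time_utc BETWEEN ? AND ?
          AND (m.subject LIKE ? OR m.body LIKE ?)
    """
    args = (sender, recipient, start_utc, end_utc, pattern, pattern)
    return conn.execute(sql, args).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    load_messages(io.BytesIO(SAMPLE), conn)
    print(find_messages(conn, "someone@somewhere.com", "A@B.COM",
                        1179700000, 1179800000, "review"))
```

Calling `load_messages` once per file with the same connection appends to the same tables, and the period filter uses the `MsgTimeUTC` epoch value rather than the formatted `MsgTime` string. With SQL Server, the same two-table layout could presumably be filled by XML Bulk Load using a mapping schema, or by any client that streams the files in a similar way.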