Extracting files from an Attachment field in an Access database

2023-09-02 10:24:11 Author: 最美的风景不过有你的陪伴

We are working on a project where we need to migrate data stored in an Access database to a cache database. The Access database contains columns with a data type of Attachment; some of the tuples contain multiple attachments. I am able to obtain the filenames of these files by using .FileName, but I'm unsure how to determine when one file ends and another starts in .FileData.

I am using the following to obtain this data:

System.Data.OleDb.OleDbCommand command= new System.Data.OleDb.OleDbCommand();
command.CommandText = "select [Sheet1].[pdf].FileData,* from [Sheet1]";
command.Connection = conn;
System.Data.OleDb.OleDbDataReader rdr = command.ExecuteReader();

Solution

(My original answer to this question was misleading. It worked okay for PDF files that were subsequently opened with Adobe Reader, but it did not always work properly for other types of files. The following is the corrected version.)

Unfortunately we cannot directly retrieve the contents of a file in an Access Attachment field using OleDb. The Access Database Engine prepends some metadata to the binary contents of the file, and that metadata is included if we retrieve the .FileData via OleDb.

To illustrate, a document named "Document1.pdf" is saved to an Attachment field using the Access UI. The original file begins directly with the '%PDF-1.4' signature (the hex dump that illustrated this is not reproduced here).

If we use the following code to try and extract the PDF file to disk

using (OleDbCommand cmd = new OleDbCommand())
{
    cmd.Connection = con;
    cmd.CommandText = 
            "SELECT Attachments.FileData " +
            "FROM AttachTest " +
            "WHERE Attachments.FileName='Document1.pdf'";
    using (OleDbDataReader rdr = cmd.ExecuteReader())
    {
        rdr.Read();
        byte[] fileData = (byte[])rdr[0];
        using (var fs = new FileStream(
                @"C:\Users\Gord\Desktop\FromFileData.pdf", 
                FileMode.Create, FileAccess.Write))
        {
            fs.Write(fileData, 0, fileData.Length);
            fs.Close();
        }
    }
}

then the resulting file will include the metadata at the beginning of the file (20 bytes in this case), ahead of the actual PDF content.

Adobe Reader is able to open this file because it is robust enough to ignore any "junk" that may appear in the file before the '%PDF-1.4' signature. Unfortunately not all file formats and applications are so forgiving of extraneous bytes at the beginning of the file.
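To see how many extra bytes precede the real content in a given case, one can scan the retrieved byte array for the file's known signature. The following is only a diagnostic sketch for PDF data: the class name AttachmentPrefixDemo and the method IndexOfSignature are hypothetical, and the 20 placeholder bytes merely stand in for the metadata described above. It is not a general way to strip the prefix, since other file types have different signatures or none at all.

```csharp
using System;
using System.Text;

public static class AttachmentPrefixDemo
{
    // Returns the index of the first occurrence of signature in data, or -1 if absent.
    public static int IndexOfSignature(byte[] data, byte[] signature)
    {
        if (signature.Length == 0) return 0;
        for (int i = 0; i <= data.Length - signature.Length; i++)
        {
            int j = 0;
            while (j < signature.Length && data[i + j] == signature[j]) j++;
            if (j == signature.Length) return i;
        }
        return -1;
    }

    public static void Main()
    {
        // Stand-in for the bytes read via rdr[0]: 20 placeholder bytes
        // (assumed prefix length, taken from the example above) followed by PDF content.
        byte[] pdfBody = Encoding.ASCII.GetBytes("%PDF-1.4\n...");
        byte[] fileData = new byte[20 + pdfBody.Length];
        Array.Copy(pdfBody, 0, fileData, 20, pdfBody.Length);

        int offset = IndexOfSignature(fileData, Encoding.ASCII.GetBytes("%PDF-"));
        Console.WriteLine("Signature found at offset " + offset);  // offset is 20 here
    }
}
```

A non-zero offset confirms that the data is not a clean copy of the original file, which is exactly what trips up less tolerant applications.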

The only Official™ way of extracting files from an Attachment field in Access is to use the .SaveToFile method of an ACE DAO Field2 object, like so:

// required COM reference: Microsoft Office 14.0 Access Database Engine Object Library
//
// using Microsoft.Office.Interop.Access.Dao; ...
var dbe = new DBEngine();
Database db = dbe.OpenDatabase(@"C:\Users\Public\Database1.accdb");
Recordset rstMain = db.OpenRecordset(
        "SELECT Attachments FROM AttachTest WHERE ID=1",
        RecordsetTypeEnum.dbOpenSnapshot);
// the value of an Attachment field is a child recordset with one row per attached file
Recordset2 rstAttach = (Recordset2)rstMain.Fields["Attachments"].Value;
// check EOF first so we never read a field when positioned past the last attachment
while ((!rstAttach.EOF) && (!"Document1.pdf".Equals(rstAttach.Fields["FileName"].Value)))
{
    rstAttach.MoveNext();
}
if (rstAttach.EOF)
{
    Console.WriteLine("Not found.");
}
else
{
    // SaveToFile writes the file contents without the metadata prefix
    Field2 fld = (Field2)rstAttach.Fields["FileData"];
    fld.SaveToFile(@"C:\Users\Gord\Desktop\FromSaveToFile.pdf");
}
rstAttach.Close();
rstMain.Close();
db.Close();

Note that if you try to use the .Value of the Field2 object you will still get the metadata at the beginning of the byte sequence; the .SaveToFile process is what strips it out.
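For the situation in the question, where a row can hold several attachments, the same DAO objects can be walked in a nested loop: each row of the child recordset is one file, so there is no need to work out where one file ends and the next begins. The sketch below is an untested outline under stated assumptions: the class AttachmentExporter, the method SaveAllAttachments, and its parameters are hypothetical names, and it relies on the same COM reference as the example above (Windows only).

```csharp
// Assumes the COM reference "Microsoft Office 14.0 Access Database Engine Object Library".
using System;
using System.IO;
using Microsoft.Office.Interop.Access.Dao;

public static class AttachmentExporter
{
    // Hypothetical helper: saves every attachment found in attachmentField,
    // for every row returned by sql, into targetDir. Returns the number of files saved.
    public static int SaveAllAttachments(
        string dbPath, string sql, string attachmentField, string targetDir)
    {
        int saved = 0;
        var dbe = new DBEngine();
        Database db = dbe.OpenDatabase(dbPath);
        try
        {
            Recordset rstMain = db.OpenRecordset(sql, RecordsetTypeEnum.dbOpenSnapshot);
            while (!rstMain.EOF)
            {
                // Child recordset: one row per attached file in this record.
                var rstAttach = (Recordset2)rstMain.Fields[attachmentField].Value;
                while (!rstAttach.EOF)
                {
                    string fileName = (string)rstAttach.Fields["FileName"].Value;
                    var fld = (Field2)rstAttach.Fields["FileData"];
                    // SaveToFile strips the metadata prefix, as in the example above.
                    fld.SaveToFile(Path.Combine(targetDir, fileName));
                    saved++;
                    rstAttach.MoveNext();
                }
                rstAttach.Close();
                rstMain.MoveNext();
            }
            rstMain.Close();
        }
        finally
        {
            db.Close();
        }
        return saved;
    }
}
```

With the table from the question, a call might look like SaveAllAttachments(dbPath, "SELECT [pdf] FROM [Sheet1]", "pdf", outputDir), where dbPath and outputDir are placeholders. Attachments in different rows can share a file name, so a real migration would probably need to add the row's key to the target path.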

 