Monday, October 13, 2014

Processing SQL Server FILESTREAM Data, Part 4 - Readin' and Writin'


In the prior installments in this series I covered some background, FILESTREAM setup, and the file and table creation for this project. In this final installment we'll finally see some C# code that I used to read and write the FILESTREAM data.

The Three "R"s

I was always confused by the irony that only one of the legendary Three "R"'s actually starts with an "R". Yet another indictment of American education? But I digress.
Before we work on the code to read FILESTREAM data, let's write to it first. First, we'll need a couple of structures to store information returned from various database operations.

Then we can create a routine that mimics an SMTP send, but instead stores the email information to the database tables we created in "Processing SQL Server FILESTREAM Data, Part 3 - Creating Tables". Pardon the formatting in order to make the overlong lines fit within the blog template.

A couple of notes about the code shown above:
  • The code uses Marc Gravell and Sam Saffron's superb Micro-ORM Dapper which I highly recommend. While religious wars rage over the use of Micro-ORMs vs heavy ORMs I far prefer Dapper to other approaches;
  • The INSERT statements use the SQL Server OUTPUT clause to return ID information about the inserted rows, which is a more efficient method than sending a subsequent SELECT query for the information;
  • Once the streams have been opened, the .Net 4.0 CopyTo method will do a nice job of copying the bytes. If you're on an earlier version of the framework this method can easily be created. See Jon Skeet's sample implementation here.

Once the email message has been inserted into the master table and we have its ID we can then attempt to insert the attachments into their corresponding detail table. This is done in two steps:
  1. Insert the metadata about the attachment to the EmailAttachments table. Once this is complete you can retrieve a file name and context ID for streaming attachment data to the FILESTREAM;
  2. Open the FILESTREAM using provided framework methods for doing so. Write the attachment data to the FILESTREAM;

Seems simple, but there is a subtlety. The INSERT statement to add the metadata must add at least one byte of data to the file using Transact-SQL. That is indicated by the null byte ("0x00") that is the last value of the statement. If you don't supply this, instead supplying NULL or, as I initially attempted, default, SQL Server will not create a file since you haven't given it any data. Consequently the SQL Server PathName() function will return NULL and the call to open the SqlFileStream will fail unceremoniously.
There are two ways that I could have submitted the attachment data to SQL Server, as the last value of the INSERT statement to the EmailAttachments table, or using streaming as I did in the example. I chose the latter so that, in the case of very large attachment, I could stream the file in chunks rather than reading the entire file into memory to submit via INSERT statement. This is less resource intensive under the heavy load I expect for this utility.
I then created a separate Windows service to read the messages, attempt to send via SMTP, log successes and failures, and queue for retrying a certain number of times. The heart of the portion that reads the attachments looks quite similar to the write operation

Some notes about the code shown above:
  • I created a result class, shown earlier in this post, for retaining the file path and transaction context returned from the query;
  • Note that you must create a transaction for the SELECT in order for the GET_FILESTREAM_TRANSACTION_CONTEXT method to return a context that can be used in the SqlFileStream constructor;
  • Once again I have used the CopyTo method to move the bytes between the streams.

Summary

That finishes the heart of the SQL Server FILESTREAM operations for the utility I was constructing. The real trick of it was the initial configuration and understanding the process. Hopefully this series of articles will help someone past the problems I encountered. Good luck and good coding!

4 comments:

  1. Reading, Riting, and Rithmitic..... Spelling was not deemed important.

    ReplyDelete
  2. Quick question..... Can a Filestream contain a foreign key? Like so:

    CREATE TABLE [dbo].[AttachmentFileStream](
    [attachmentID] [bigint] IDENTITY(1,1) NOT NULL,
    [guid] [uniqueidentifier] ROWGUIDCOL UNIQUE NOT NULL DEFAULT (newid()),
    [Name] [nvarchar](100) NOT NULL,
    [Data] [varbinary](max) FILESTREAM NULL,
    [emailID] [bigint] NOT NULL, //Foreign Key
    [description] [varchar](max) NULL, //I should delete this because I am not using this column
    [filesize] [float] NULL, //I should probably calculate the size everytime and delete this column.
    [timestep] [datetime2](7) NULL DEFAULT (sysdatetime()),
    PRIMARY KEY CLUSTERED......

    This way all attachments for a single email can be grouped together.

    ReplyDelete
    Replies
    1. Patrick, my apologies for the delayed reply. I honestly did not see your comments until this very day.

      There is no reason that the table containing the FILESTREAM cannot also contain another column that is a a FOREIGN KEY into a related table. If you run into any issues with that construct feel free to contact me.

      Delete
  3. Wow i can say that this is another great article as expected of this blog.Bookmarked this site.. Odzyskiwanie danych Łódź

    ReplyDelete