A note on Azure storage and downloading large files

I have written a simple ASP.NET MVC application for upload and download of files to/from Azure storage.

Getting large file upload to work was the first exercise, described here. That is working well; but what about download?

If your files in Azure storage are public, you can simply serve an URL to the file. If it is not public though, you have a couple of choices:

1. Download the file under application control, by writing to Response.OutputStream or using a FileResult action.

2. Issue a Shared Access Signature (SAS) to the client which enables it to retrieve the file directly from Azure storage. The SAS is sent as an URL argument which tells Azure storage that the request is authorised. The browser downloads the file directly, so it makes no difference to your web application if the file is large.

Note that if you use the first option, it will not work with large files if you simply call DownloadToStream or similar:

container.GetBlockBlobReference(FileName).DownloadToStream(Response.OutputStream);

Why not? Well, the way this code works is that it downloads the large file to the web server, then sends it to the browser. What if your large file is 5GB? The browser will wait a long time for the first byte to be served (giving the user an unresponsive page); but before that happens, the web application will probably throw an exception because it does not like downloading such a large file.

This means the SAS option is a good one, though note that you have to specify an expiry time which could cause problems for users on a slow connection.

Another option is to serve the file in chunks. Use CloudBlockBlob.DownloadRangeToStream to write to Response.OutputStream in a loop until the download is complete. Call Response.Flush() after each chunk to send the chunk to the browser immediately.

This gives the user a nice responsive download experience complete with a cancel option as provided by the browser, and does not crash the application on the server. It seems to me a reasonable approach if the web application is also hosted on Azure and therefore has a fast connection to Azure storage.

What about resuming a failed download? The SAS approach should work as Azure supports it. You could also support this in your app with some additional work since Resume means reading the Range header in a GET request. I have not tried doing this but you might find some clues here.