Tuesday, November 15, 2011

Private Streaming with CloudFront: A Guide

Update, 1 Oct 2012: This post is largely obsolete, as Amazon recently added private streaming support to the CloudFront section of the AWS Console.  The original post follows.

I'll just assume you're aware of the IaaS offering known as Amazon Web Services, AWS.  CloudFront is a CDN in the AWS micropayments-as-you-go style, which offers the ability to serve non-public content stored in S3.  This is a compendium of the things I learned setting up a private streaming distribution for use with PHP.

This is going to be fairly low-level, since I like to drink deeply of the systems I'm working with.  I don't think AWS works smoothly enough yet that you can put the API on the "it's magic" side of the line.


Don't touch that yet: Making S3 CloudFront friendly

CloudFront will request content from S3 using the virtual-hosting style, your-bucket-name.s3.amazonaws.com, so you can only create a CloudFront distribution for a bucket that has a "DNS safe" name.  For me, that meant going back and renaming the bucket since I didn't want to upload another 1G of files.

If you're going to serve your content with TLS (née SSL) enabled, over https, you need an SSL-safe name: one without dots.  Amazon has a wildcard certificate for *.s3.amazonaws.com which will not match bucket.example.com.s3.amazonaws.com, since the * in a certificate only matches one level of name.  With the dots replaced by dashes, bucket-example-com would work fine.
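If you want to check a name before committing to it, something like the following would do.  It's a minimal sketch, not Amazon's official validation: the function names are my own, and the rules are a simplified reading of "DNS safe" (lowercase letters, digits, dots and dashes, 3 to 63 characters).

```php
<?php
// Simplified DNS-safety check (an assumption, not Amazon's exact rules):
// 3 to 63 characters, and every dot-separated label is made of lowercase
// letters, digits and dashes, starting and ending with a letter or digit.
function is_dns_safe_bucket($name)
{
    if (strlen($name) < 3 || strlen($name) > 63) {
        return false;
    }
    foreach (explode('.', $name) as $label) {
        if (!preg_match('/^[a-z0-9]([a-z0-9-]*[a-z0-9])?\z/', $label)) {
            return false;
        }
    }
    return true;
}

// SSL-safe means DNS-safe with no dots, so the name is a single label
// under the *.s3.amazonaws.com wildcard certificate.
function is_ssl_safe_bucket($name)
{
    return is_dns_safe_bucket($name) && strpos($name, '.') === false;
}

// Suggests an SSL-safe variant by replacing dots with dashes.
function ssl_safe_variant($name)
{
    return str_replace('.', '-', $name);
}
```

With these, bucket.example.com passes the DNS check but fails the SSL one, and ssl_safe_variant('bucket.example.com') gives bucket-example-com.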

Once you have a suitable bucket name ready for CloudFront, you can proceed with creating your distribution.

Creating or Configuring a Distribution

The CloudFront web console (the AWS management console's CloudFront tab) is particularly lacking in support for private distributions at the moment.  If I were doing things again, I'd probably skip creating the distribution in the console, and hack up a script to do everything that way.  The lack of support means that you end up having to call the API somehow, regardless.

At any rate, whether you're creating the distribution from whole cloth in a script or re-configuring one created via the console, there are three important settings to engage, along with one dependency.  That dependency is to create an origin access identity (OAI).  Then, you can associate the OAI to the account and to the origin, and set yourself as trusted signer.  At this point, the distribution is fully private, and you can grant access to the OAI to your S3 objects, and move on to signing your URLs.

I'm working in PHP as usual, so I grabbed the AWS SDK for PHP via Amazon's pear channel: pear channel-discover pear.amazonwebservices.com ; pear install pear.amazonwebservices.com/sdk

This lets you include('AWSSDKforPHP/sdk.php'); in your PHP script, and start using the calls documented in the SDK reference.  The first thing we need to call is create_oai, which creates one of our dependencies.  Save yourself some time, and print every response you get.  In particular, you'll need the Id and S3CanonicalUser from the create_oai response later.  Otherwise, you'll have to call get_oai_list (which returns an array rather than a CFResponse) and get_oai to find what you missed.

Creating a Distribution from Scratch (Recommended)

If you have created a configuration through generate_config_xml, or a distribution through create_distribution which internally calls generate_config_xml, then S3Origin will properly receive the OriginAccessIdentity child, and you can skip the following section.  All you need to do is pass 'OriginAccessIdentity' => $oai_id, 'TrustedSigners' => array('Self'), 'Streaming' => true in the options, and everything will work.

Updating the Config of an Existing Distribution

If you've created the distribution already, you can get its ID by looking at its properties in the console, or through list_distributions with the Streaming option set.  Next, call get_distribution_config with the Streaming option to fetch the distribution's configuration.  You can pass this response straight to update_config_xml; again, with the Streaming option set, but also 'OriginAccessIdentity' => $oai_id, 'TrustedSigners' => array('Self'). Now here's the part that tripped me up for almost three days: the XML that comes back does not include an OriginAccessIdentity element as a child of S3Origin.  This seems to apply to 1.4.7 as well as the 1.4.6 version of the PHP SDK I was using.  The distribution is private (will require signed URLs) because the OriginAccessIdentity is present in the distribution config, but you need to add the OriginAccessIdentity under S3Origin by hand.

For me, it was something like this:

$sxml = simplexml_load_string($new_cfg);
// Map Amazon's default namespace to the "cf" prefix for XPath queries.
$sxml->registerXPathNamespace("cf",
    "http://cloudfront.amazonaws.com/doc/2010-11-01/");
// Add the OAI reference under every S3Origin node.
foreach ($sxml->xpath('//cf:S3Origin') as $node) {
    $node->addChild('OriginAccessIdentity',
        "origin-access-identity/cloudfront/$oai_id");
}
$new_cfg = $sxml->asXML();


The call to registerXPathNamespace tells SimpleXML that we want to access items in Amazon's namespace with a prefix of "cf", then we use that in the XPath expression to find the S3Origin node, to which we add our OAI.  Without this, CloudFront won't try to actually use the OAI to access S3 content, and will therefore be limited to public files.
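If you expect to run this more than once, it may be worth wrapping the same idea in a function that skips any S3Origin that already has the element.  This is only a sketch: the function name, the sample config (heavily trimmed, not a real response) and the OAI Id E2EXAMPLE are all made up for illustration.

```php
<?php
// Adds an OriginAccessIdentity child to each S3Origin that lacks one.
// Hypothetical helper; the namespace URI is the one from the config XML.
function add_oai_to_s3_origin($config_xml, $oai_id)
{
    $ns = 'http://cloudfront.amazonaws.com/doc/2010-11-01/';
    $sxml = simplexml_load_string($config_xml);
    if ($sxml === false) {
        throw new InvalidArgumentException('Config is not valid XML');
    }
    $sxml->registerXPathNamespace('cf', $ns);
    foreach ($sxml->xpath('//cf:S3Origin') as $node) {
        // Only add the element when it is missing, so re-running is harmless.
        if (count($node->children($ns)->OriginAccessIdentity) === 0) {
            $node->addChild('OriginAccessIdentity',
                "origin-access-identity/cloudfront/$oai_id");
        }
    }
    return $sxml->asXML();
}

// A trimmed, made-up streaming distribution config to try it on.
$sample_config = '<StreamingDistributionConfig'
    . ' xmlns="http://cloudfront.amazonaws.com/doc/2010-11-01/">'
    . '<S3Origin><DNSName>video-example-com.s3.amazonaws.com</DNSName></S3Origin>'
    . '<CallerReference>example</CallerReference>'
    . '<Enabled>true</Enabled>'
    . '</StreamingDistributionConfig>';

$patched = add_oai_to_s3_origin($sample_config, 'E2EXAMPLE');
```

Passing $patched back through the function a second time should leave it with a single OriginAccessIdentity element.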

At this point, you have something suitable for set_distribution_config, with the Streaming option, of course.

Letting CloudFront Access S3

I didn't want to set every video as accessible to the CloudFront OAI individually, so I set up a bucket policy instead.  Bucket policies let you grant/revoke permissions for all objects in the bucket, overriding the individual object ACLs.

However, they are not terribly easy to write.  I took the example from the S3 documentation Access Control → Using Bucket Policies → Example Cases.... The CanonicalUser to use can be found via get_oai's S3CanonicalUser element, if you didn't record it after calling create_oai.

After substituting your S3CanonicalUser and bucket name in the appropriate spots in the policy, it needs to be set as the actual bucket policy.  To do this at the console: click the bucket, choose Actions → Properties from the button above, and look for the "Edit bucket policy" button on the Permissions tab. Paste and save your policy, and you should be set.
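If you'd rather generate the policy than hand-edit it, a sketch along these lines should produce something you can paste into that box.  The function name, the Id and the Sid are my own choices; the overall shape (a CanonicalUser principal allowed s3:GetObject on every object in the bucket) follows what the policy is meant to do, but do compare it against the documentation's example before relying on it.

```php
<?php
// Builds a bucket policy granting the CloudFront OAI read access to all
// objects in the bucket. Hypothetical helper; Id and Sid are arbitrary labels.
function cloudfront_bucket_policy($bucket, $canonical_user)
{
    $policy = array(
        'Version' => '2008-10-17',
        'Id' => 'PolicyForCloudFrontPrivateContent',
        'Statement' => array(
            array(
                'Sid' => 'AllowCloudFrontOAIRead',
                'Effect' => 'Allow',
                // The S3CanonicalUser value recorded from create_oai / get_oai.
                'Principal' => array('CanonicalUser' => $canonical_user),
                'Action' => 's3:GetObject',
                // "/*" covers every object in the bucket.
                'Resource' => "arn:aws:s3:::$bucket/*",
            ),
        ),
    );
    return json_encode($policy, JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES);
}
```

For example, echo cloudfront_bucket_policy('video-example-com', $s3_canonical_user); prints the JSON for the bucket used in this post, where $s3_canonical_user holds the value from your own OAI.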

Signing URLs

CloudFront does request authentication differently than other AWS services, so you need to create an RSA keypair for it in your AWS account.  From the console, click your name way up in the upper right, above the tabs, and choose Security Credentials.  Log in again, and scroll down a bit, and you'll see a tab for Key Pairs.  This is where you generate your signing key.  You'll need to save the private key and keep it on your server for creating the signatures.

If you search the web hard enough, you can find some URL signing code from RyanP@AWS.  I played with this for a while, but for streaming distributions it gives you a certainly-wrong URL, and I never did test whether it actually worked once I understood how to sign URLs.  (I suspect it doesn't.)  The URL signer that was actually helpful was the one that is included in the CloudFront Developer's Guide under "Using a Signed URL.... → Signature Code, Examples, and Tools → Create ... using PHP".

Whatever you're using to sign your URLs with, test it with the private key and example policies from the developer's guide to make sure you get the same output.  I had to go back and undo some of my "improvements" to my signing script.  For one, the canned policy is not necessarily the JSON which would be output by a json encoder: slashes in the resource URL are not backslash-escaped.  Secondly, I assembled some parameters through http_build_query, which turned the '~' characters in the Signature field into %7E, and CloudFront didn't care for that either.

Since we're streaming, using RTMP, there are two URLs involved in the process.  There's a streamer at syourhostname.cloudfront.net/cfx/st which is the "streamer" in jwPlayer parlance, or "netConnectionURL" in flowplayer.  This is where Flash connects to access the stream; it is like a server in its own right, that takes another "file" or "URL" (again, in jwPlayer and flowplayer terms, respectively) to determine what content should be streamed over that connection.

The streamer does not participate in the URL signing process at all.  It is only the content to be streamed that is signed.  Therefore, the URL that actually gets signed is the full path of your file, with the extension, within your S3 bucket.  If you have video-example-com as your bucket, then video-example-com.s3.amazonaws.com/usa/managing-payments.flv will use usa/managing-payments.flv as the resource to sign.
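For reference, here is roughly what the signing boils down to for a canned policy.  Treat it as a sketch under assumptions, not a replacement for the developer's guide code: the function names are mine, and you should still verify the output against the guide's example key and policy.  It avoids the two pitfalls above by assembling the policy and the query string by hand instead of using json_encode or http_build_query.

```php
<?php
// Canned policy, built by hand: no whitespace and no backslash-escaped
// slashes in the resource.
function cf_canned_policy($resource, $expires)
{
    return '{"Statement":[{"Resource":"' . $resource
        . '","Condition":{"DateLessThan":{"AWS:EpochTime":'
        . (int) $expires . '}}}]}';
}

// Base64 with CloudFront's URL-safe substitutions: + to -, = to _, / to ~.
function cf_url_safe_base64($bytes)
{
    return strtr(base64_encode($bytes), '+=/', '-_~');
}

// Returns "Expires=...&Signature=...&Key-Pair-Id=..." for the resource.
// $private_key_pem is the contents of the private key file saved earlier.
function cf_signed_params($resource, $expires, $private_key_pem, $key_pair_id)
{
    $key = openssl_pkey_get_private($private_key_pem);
    if ($key === false) {
        throw new RuntimeException('Could not load the private key');
    }
    $signature = '';
    $policy = cf_canned_policy($resource, $expires);
    if (!openssl_sign($policy, $signature, $key, OPENSSL_ALGO_SHA1)) {
        throw new RuntimeException('Signing failed');
    }
    // Joined by hand so the '~' characters in the signature stay as-is.
    return 'Expires=' . (int) $expires
        . '&Signature=' . cf_url_safe_base64($signature)
        . '&Key-Pair-Id=' . $key_pair_id;
}
```

A call would look something like cf_signed_params('usa/managing-payments.flv', time() + 3600, file_get_contents($key_file), $key_pair_id), where $key_file and $key_pair_id stand in for your own key path and key pair Id.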

The resulting signature then gets attached to the stream name as Flash prefers it, flv:usa/managing-payments?Expires=....  jwPlayer seems to work with the URL as-is, when invoking it through its jwplayer JavaScript function, but flowplayer needs the signed URL to be URL-encoded, except for the colon.  That is, my final URL-building for flowplayer takes the shape: $file_url = 'flv:' . urlencode("folder/file?$sign_params");  I did not need any more encoding of slashes or spaces in the folder/file names.
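Spelled out as functions, the two shapes look like this.  It's a sketch that assumes .flv files only; the function names are mine, and $sign_params is the "Expires=...&Signature=...&Key-Pair-Id=..." string from whatever signer you use.

```php
<?php
// Stream name as-is: the "flv:" prefix, the path without its .flv
// extension, then the signed query parameters.
function flv_stream_name($path, $sign_params)
{
    $name = preg_replace('/\.flv\z/i', '', $path);
    return "flv:$name?$sign_params";
}

// flowplayer variant: everything after the colon is URL-encoded.
function flowplayer_stream_name($path, $sign_params)
{
    $name = preg_replace('/\.flv\z/i', '', $path);
    return 'flv:' . urlencode("$name?$sign_params");
}
```

The only difference between the two is the urlencode call; the colon after "flv" stays outside it in both.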

I can't really say more about jwPlayer here, because once I got the OriginAccessIdentity associated to the S3Origin and could access my files in either player, I went back to using flowplayer.

Troubleshooting

If you've just made changes, wait a bit and try again.  S3 ACLs don't seem to apply instantly, so bucket policies probably don't either, and Amazon notes somewhere that it can take up to 15 minutes for a CloudFront distribution to be updated after an edit is made.

If you're still having problems after waiting it out, turn on access logging for both CloudFront and the S3 bucket.  That's what ultimately directed me toward my problem, that CloudFront was not using its OAI to access the files in S3.

Access logging of an S3 bucket uses the "prefix" parameter as-is: if you set it to 's3-videos-' you'll get files in the root of the access logging bucket, with names beginning 's3-videos-'.  The CloudFront console's Edit Distribution screen seems to suggest the same, but it actually uses it as a folder name.  I gave it a prefix of 'cf-' and all my CloudFront logs ended up in 'cf-/'.

The access log files aren't available instantly, but in time, many very short files will show up.  This is where a tool like s3sync or Cloudberry Explorer comes in handy, to fetch them so you can see them without a three-click download process between each one.  CloudFront's logs are gzip-compressed, but S3's are just text.

Incidentally, you can edit a distribution in CloudFront without affecting its private status.  The console only changes the properties that it supports.

I can't really endorse any particular app for accessing or managing CloudFront distributions, because all of the ones that understood streaming distributions seemed blissfully unaware of my misconfiguration.

Conclusion

To compress all of the above into an "If I were doing this again from scratch" set of steps:
  1. Create a DNS- and possibly SSL-safe S3 bucket.
  2. Upload your videos into it.
  3. Create a CloudFront Origin Access Identity, through the API.  Record the OAI's Id and S3CanonicalUser values.
  4. Create a CloudFront Distribution through the API, using the OAI Id from the previous step, TrustedSigners=[Self], and Streaming=true to make it a private streaming distribution.
  5. Create a bucket policy on the S3 bucket, using the S3CanonicalUser from the OAI, granting it GetObject on all items in the bucket.
  6. Create an RSA keypair in the Security Credentials meta-console.  (For lack of a better term.)
  7. Save the private key to your server.
  8. Fetch the example code from the CloudFront Developer's Guide to sign your URLs.
  9. Use the signing code to sign URLs for "path/file.flv".
  10. Use "flv:path/file?Expires=...&Signature=...&Key-Pair-Id=..." as the file URL in your player.
  11. Use the http or https://syourhostname.cloudfront.net/cfx/st URL as the streamer URL in your player.
At this point, everything should be working.  If not, Amazon allows themselves 15 minutes for changes to CloudFront to propagate, so take a little break and try again.

3 comments:

Ronak said...

Nice video..Looks really good.
I am one of the developer team member. I would like to share a simple application called Bucket Explorer, which helps you to working with all features of Amazon S3 and cloudfront.
You can also create different types of Distribution on Bucket using it,in very easy way.

Public Distribution.
Private Distribution.
Streaming Distribution.
Private Streaming Distributon.

Andy Mandy said...

Thank you so much for featuring CloudBerry Explorer on your blog!
Andy, CloudBerry Lab team

Seethaprasad said...

Thanks a lot for the ":" hint in the blog which makes flowplayer work. We wasted lot of time wondering what was going wrong.