Optimising Rocket Chat Content Delivery with CloudFront
It's recommended to use Rocket Chat with either Amazon S3 or Google Cloud Storage to store uploaded files. However, when downloading files stored with these solutions, there is typically a redirect to the blob store to actually obtain the data. This post discusses how to eliminate intermediate redirects and leverage edge caching using Amazon CloudFront.
Adding CloudFront
The first thing to do is to set up a CloudFront distribution with your Rocket Chat reverse proxy as its origin. Configure the default behaviour to pass requests through to the origin (using the AWS-managed CachingDisabled cache policy with the AllViewerExceptHostHeader origin request policy). Only certain paths are safe to cache, and these need to be enabled on a case-by-case basis (for the TL;DR, see Distribution Overview).
Eliminate Redirects
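Out of the box, when using Amazon S3 as a file store, the client makes two requests to download a file: it first requests the file from Rocket Chat, which responds with a redirect to a signed S3 URL, and it then follows that redirect to fetch the content from S3.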
To enable effective caching, the intermediate redirect can be removed entirely: a Lambda@Edge function resolves the signed URL and passes it to CloudFront as the new origin for the request.
That means that a request to GET /file-upload/{id}/image.jpg will directly return the content of image.jpg to the client, without a redirect. The response can optionally be cached, but even without caching this may reduce latency, because the redirect is resolved and followed within AWS rather than by the client.
We can use a Lambda@Edge Origin Request function, written in Node.js, to eliminate the redirect. I created a generic, unit-tested version of this with built-in retries, which can be easily deployed and used in a cache behaviour. See the README for follow-origin-redirects on GitHub.
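As a rough illustration of the idea (this is not the code from follow-origin-redirects), an Origin Request handler could ask the origin for the file without following redirects and, if it receives one, point the request at the redirect target instead. The sketch below assumes a Node.js 18+ runtime with a global fetch and a custom (non-S3) origin for Rocket Chat. The helper names applyRedirect and makeHandler are made up for this example, the origin defaults are assumptions, and there are no retries or error handling.

```javascript
// Sketch only: names and defaults here are illustrative assumptions,
// not the follow-origin-redirects implementation.

// Rewrite a CloudFront origin-request so it targets the redirect location.
function applyRedirect(request, location) {
  const target = new URL(location);
  request.origin = {
    custom: {
      domainName: target.hostname,
      port: 443, // assumes the redirect target is served over HTTPS
      protocol: 'https',
      path: '',
      sslProtocols: ['TLSv1.2'],
      readTimeout: 30,
      keepaliveTimeout: 5,
      customHeaders: {},
    },
  };
  request.uri = target.pathname;
  // CloudFront stores the query string without the leading "?".
  request.querystring = target.search.replace(/^\?/, '');
  // The Host header should match the new origin.
  request.headers.host = [{ key: 'Host', value: target.hostname }];
  return request;
}

// Build the Lambda@Edge handler; fetchImpl is injectable so it can be stubbed.
function makeHandler(fetchImpl) {
  return async function handler(event) {
    const request = event.Records[0].cf.request;
    const origin = request.origin.custom;
    const query = request.querystring ? `?${request.querystring}` : '';
    const url = `${origin.protocol}://${origin.domainName}${origin.path}${request.uri}${query}`;

    // Forward cookies so Rocket Chat can authenticate the viewer.
    const cookies = (request.headers.cookie || []).map((h) => h.value).join('; ');
    const response = await fetchImpl(url, {
      redirect: 'manual', // do not follow the redirect ourselves
      headers: cookies ? { cookie: cookies } : {},
    });

    const location = response.headers.get('location');
    if (response.status >= 300 && response.status < 400 && location) {
      return applyRedirect(request, location);
    }
    return request; // no redirect: let CloudFront contact the origin as usual
  };
}
```

In a deployment this would be exported with something like `exports.handler = makeHandler(fetch)`. Note that in this simplified form a request that does not redirect reaches the origin twice, once from the function and once from CloudFront.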
Edge Caching
Now that CloudFront resolves the redirect internally, we can enable edge caching. There are two options for caching, and which you pick depends on the following trade-offs.
Option | Authentication in Cache Key | Authentication for File Access | Cache Scope |
---|---|---|---|
1 | Yes | Yes | Per-user session |
2 | No | None (if cached) | All users |
Given files have a unique ID and you must know this ahead of time in order to access them, you might consider the risk of unauthenticated access to cached files acceptable.
Option 1
Rocket Chat uses cookies and query strings for authentication (depending on client type), so we must create a new behaviour to serve the path /file-upload/* and instruct CloudFront to use this data in the cache key using a Cache Policy.
Ensure that you set a sensible minimum, maximum and default TTL (for example, one year), and instruct CloudFront to include the following query strings/cookies in the cache key under "Cache Key settings":
- rc_token (the user's session token for this device)
- rc_uid (the user ID)
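As an illustration, the cache key settings above could be expressed as a CloudFront cache policy configuration along these lines. The policy name and the one-year TTLs are assumptions for the example, and buildPerUserCachePolicy is a made-up helper. The object follows the shape of the CachePolicyConfig used by the CloudFront CreateCachePolicy API.

```javascript
const ONE_YEAR_SECONDS = 365 * 24 * 60 * 60; // 31,536,000

// Sketch of a cache policy that keys uploaded-file requests on the
// Rocket Chat session identifiers. "RocketChatPerUserUploads" is a made-up name.
function buildPerUserCachePolicy(name = 'RocketChatPerUserUploads') {
  const authKeys = ['rc_token', 'rc_uid'];
  return {
    Name: name,
    MinTTL: ONE_YEAR_SECONDS,
    DefaultTTL: ONE_YEAR_SECONDS,
    MaxTTL: ONE_YEAR_SECONDS,
    ParametersInCacheKeyAndForwardedToOrigin: {
      EnableAcceptEncodingGzip: true,
      EnableAcceptEncodingBrotli: true,
      HeadersConfig: { HeaderBehavior: 'none' },
      // Web clients send the credentials as cookies...
      CookiesConfig: {
        CookieBehavior: 'whitelist',
        Cookies: { Quantity: authKeys.length, Items: authKeys },
      },
      // ...while other clients send them as query strings.
      QueryStringsConfig: {
        QueryStringBehavior: 'whitelist',
        QueryStrings: { Quantity: authKeys.length, Items: authKeys },
      },
    },
  };
}

console.log(JSON.stringify(buildPerUserCachePolicy(), null, 2));
```

The same values can be entered by hand in the console; the object form is mainly useful if the distribution is managed through the AWS SDK or infrastructure-as-code tooling.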
Option 2
If you want all users to share the same cache for files and don't need authentication to access them, then you can create a new behaviour to serve the path /file-upload/*, and use the AWS-managed Managed-CachingOptimized cache policy with the Managed-AllViewerExceptHostHeader origin request policy.
Fixing Inefficient Proxying
As discussed, Rocket Chat typically generates signed URLs to download the content of uploaded files. However, there is one exception: when you click on a file to preview it (for example on the web). In that case, for Amazon S3-backed file storage, the URL loaded is actually:
GET https://example.com/ufs/AmazonS3:Uploads/{id}/image.jpg
The response of this API streams the bytes from S3 and proxies them via the Rocket Chat server, which can be very inefficient when using a remote blob store. It seems to work this way because the UFS APIs use the generic file system interface, which I assume was designed with local files in mind.
However, there is a workaround we can use here - the following APIs appear to be equivalent if you're only using one type of file store like S3:
GET https://example.com/ufs/AmazonS3:Uploads/{id}/image.jpg
GET https://example.com/file-upload/{id}/image.jpg
So if we re-write /ufs/AmazonS3:Uploads/ to /file-upload/, we can route all requests to download files directly to S3. Luckily, URL re-writing is a five-line CloudFront Function:
function handler(event) {
    var request = event.request;
    // Map the UFS preview path onto the regular file download path
    request.uri = request.uri.replace('/ufs/AmazonS3:Uploads/', '/file-upload/');
    return request;
}
To use this, we need to add a new behaviour to the distribution to match /ufs/*, with the above CloudFront Function used as the Viewer Request association. This way the URL re-write happens before CloudFront computes the cache key for the request, and before it hits the origin.
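To sanity-check the rewrite before deploying it, the handler can be called locally with a minimal stand-in for the viewer request event. The events below contain only the field the function reads, and the file ID and the second path are placeholders made up for this example.

```javascript
// Same rewrite as in the CloudFront Function.
function handler(event) {
  var request = event.request;
  request.uri = request.uri.replace('/ufs/AmazonS3:Uploads/', '/file-upload/');
  return request;
}

// Minimal stand-ins for a viewer-request event (only `request.uri` is used).
var previewEvent = { request: { uri: '/ufs/AmazonS3:Uploads/abc123/image.jpg' } };
var otherEvent = { request: { uri: '/ufs/SomeOtherStore/abc123/image.jpg' } };

console.log(handler(previewEvent).uri); // prints "/file-upload/abc123/image.jpg"
console.log(handler(otherEvent).uri);   // unchanged: "/ufs/SomeOtherStore/abc123/image.jpg"
```

Paths that do not contain the S3 store prefix pass through untouched, so matching the broader /ufs/* pattern is harmless.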
Caching Static Assets
Now that the low-hanging fruit is addressed, we can optimise delivery of static assets like JavaScript, CSS, and sprites. Rocket Chat downloads around 5 MB of these assets when loading the web interface; they rarely change and can easily be served from an edge cache.
The list below notes some static asset paths and their requirements:
- /meteor_runtime_config.js - Requires the User-Agent header and hash query parameter
- /*.js - All other JavaScript seems fairly static (no other input requirements)
- /*.css - Static CSS
- /assets/* - Static logo images, served with cache-busting headers
- /packages/* - Emoji images, served with cache-busting headers
- /fonts/* - Includes rocketchat.woff2
Distribution Overview
The following table shows different behaviours you can consider adding to your CloudFront distribution, with the Cache Policy, Origin Request Policy, and any function logic (Lambda@Edge / CloudFront Functions) in between.
Path | Caching Policy | Origin Request Policy | Associated Functions |
---|---|---|---|
/assets/* | CachingForced | - | - |
/packages/* | CachingForced | - | - |
/fonts/* | CachingForced | - | - |
/*.css | CachingOptimized | - | - |
/meteor*.js | CachingDisabled | AllViewerExceptHostHeader | - |
/*.js | CachingOptimized | - | - |
/file-upload/* | CachingForced | AllViewerExceptHostHeader | FollowRedirects |
/ufs/AmazonS3:Uploads/* | CachingForced | AllViewerExceptHostHeader | RewriteUfs, FollowRedirects |
Default (*) | CachingDisabled | AllViewerExceptHostHeader | - |
There is one custom (non-AWS managed) caching policy in use:
- CachingForced - A copy of CachingOptimized with a large minimum cache time
And two functions:
- FollowRedirects - See earlier in post, follow redirects to S3 internally
- RewriteUfs - See earlier in post, re-write /ufs/AmazonS3:Uploads/ to /file-upload/
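CloudFront evaluates behaviours in the order they are listed and uses the first path pattern that matches, so the order of the rows in the table matters (for example, /meteor*.js needs to come before /*.js). The sketch below mimics that matching to check which cache policy a given path would receive. It is a simplified model for illustration only: patternToRegExp and matchBehaviour are made-up helpers, only the `*` wildcard is handled, and the sample paths are placeholders.

```javascript
// Behaviours in precedence order, mirroring the table (first match wins).
const behaviours = [
  { path: '/assets/*', cachePolicy: 'CachingForced' },
  { path: '/packages/*', cachePolicy: 'CachingForced' },
  { path: '/fonts/*', cachePolicy: 'CachingForced' },
  { path: '/*.css', cachePolicy: 'CachingOptimized' },
  { path: '/meteor*.js', cachePolicy: 'CachingDisabled' },
  { path: '/*.js', cachePolicy: 'CachingOptimized' },
  { path: '/file-upload/*', cachePolicy: 'CachingForced' },
  { path: '/ufs/AmazonS3:Uploads/*', cachePolicy: 'CachingForced' },
  { path: '*', cachePolicy: 'CachingDisabled' }, // default behaviour
];

// Convert a path pattern to a RegExp; "*" matches any run of characters.
function patternToRegExp(pattern) {
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, '\\$&');
  return new RegExp('^' + escaped.replace(/\*/g, '.*') + '$');
}

// Return the first behaviour whose pattern matches the request path.
function matchBehaviour(uri) {
  return behaviours.find((b) => patternToRegExp(b.path).test(uri));
}

console.log(matchBehaviour('/meteor_runtime_config.js').cachePolicy);  // CachingDisabled
console.log(matchBehaviour('/packages/emoji/sprite.css').cachePolicy); // CachingForced
console.log(matchBehaviour('/api/v1/info').cachePolicy);               // CachingDisabled
```

Swapping the /meteor*.js and /*.js entries in the array shows why the order matters: the runtime config would then pick up CachingOptimized.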
Future Work
To improve this further, Rocket Chat would really need some architectural changes to allow CDNs to accelerate file transfers, and to eliminate proxying via the Rocket Chat server. In addition to simplifying the steps needed to allow downloads from a CDN, uploads would need to follow something like this pattern:
- Client requests a signed URL to upload to
- Client uploads to the signed URL
- Client notifies Rocket Chat of completed upload
The upload URL could point to CloudFront, S3 Transfer Acceleration, the nearest S3 bucket, etc.
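From a client's point of view, the three steps above might look roughly like the sketch below. Rocket Chat does not provide these endpoints today: the /api/hypothetical/uploads paths, the response fields (id, uploadUrl), and the uploadViaSignedUrl helper are all invented purely to illustrate the flow.

```javascript
// Hypothetical flow: none of these endpoints exist in Rocket Chat today.
// `http` defaults to the global fetch and can be replaced with a stub.
async function uploadViaSignedUrl(file, http = fetch) {
  // 1. Ask the server for a signed URL to upload to.
  const created = await http('/api/hypothetical/uploads', {
    method: 'POST',
    headers: { 'content-type': 'application/json' },
    body: JSON.stringify({ name: file.name, type: file.type }),
  });
  const { id, uploadUrl } = await created.json();

  // 2. Upload the bytes straight to the signed URL (CloudFront, S3, etc.).
  const put = await http(uploadUrl, {
    method: 'PUT',
    headers: { 'content-type': file.type },
    body: file.data,
  });
  if (!put.ok) throw new Error(`Upload failed with status ${put.status}`);

  // 3. Tell the server the upload has finished.
  await http(`/api/hypothetical/uploads/${id}/complete`, { method: 'POST' });
  return id;
}
```

The key property is that the file bytes in step 2 never pass through the Rocket Chat server; it only issues the URL and records the result.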