Publishing assets: Sequential vs Parallel
A few days ago, we quietly deployed a change that nearly doubled publishing speed for the vast majority of Unbounce pages.
Our publishing process involves two main steps:
- Render the page, and publish it to S3
- Find all the assets referenced by the page, and publish them to S3 (where they are served by Cloudfront)
The second step can only be done after the first step is done, because it needs the rendered page to find all the assets. The first step is quite complex, with little room for optimization. However, we recently discovered that there was some low-hanging fruit for optimization in the second step.
We realized that time spent publishing pages is split pretty evenly between those two steps. It takes about as long to deal with all the assets as it does to render the page for publishing. This meant there was an improvement we could make: publishing the assets in parallel.
Sequential vs Parallel
If you were to write out some pseudo-code of what step 2 looks like, it might look something like this:
for each asset in the rendered page, do the following, one at a time:
    download the asset from wherever it’s stored
    upload the asset to S3/Cloudfront
A really simple diagram of the flow of this code looks like:
This probably seems pretty straightforward, and it is. It’s how we’ve been doing publishing for years. The problem (or potential for improvement) is that it downloads the first asset, and uploads it completely, before it does anything with the second asset. And it does the same with the second asset, before looking at the third one, and so on.
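To make the one-at-a-time behavior concrete, here is a minimal Python sketch of that loop. It is not our production code: `publish_assets_sequentially`, `download`, and `upload` are hypothetical names, and the two callables stand in for whatever actually fetches an asset and pushes it to S3.

```python
from typing import Callable, Iterable


def publish_assets_sequentially(
    asset_urls: Iterable[str],
    download: Callable[[str], bytes],      # hypothetical: fetches an asset's bytes
    upload: Callable[[str, bytes], None],  # hypothetical: pushes the bytes to storage
) -> int:
    """Publish assets one at a time and return how many were published."""
    published = 0
    for url in asset_urls:
        # Each asset is fully downloaded and uploaded before the next one starts.
        data = download(url)
        upload(url, data)
        published += 1
    return published
```

The total time here is roughly the sum of every download and every upload, since nothing overlaps.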
There’s a nifty little thing called Amdahl’s Law that applies here. It’s a bunch of math that, when applied to our situation, effectively says that if half the publishing time is spent in something we can parallelize, then at best we can nearly halve how long publishing takes. Uploading the assets is a perfect candidate to parallelize.
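For the curious, the math is short enough to write down. This is a small sketch of the standard Amdahl’s Law formula; the function name `amdahl_speedup` and its parameter names are just for illustration.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Overall speedup when `parallel_fraction` of the work is split across `workers`."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    # The serial part (rendering, in our case) is not sped up at all.
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)
```

With half the time parallelizable and 50 things running at once, `amdahl_speedup(0.5, 50)` comes out to roughly 1.96, meaning publishing would take about 51% as long in the ideal case. No matter how many workers you add, the speedup never reaches 2, because the serial half is still there.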
Here’s some pseudocode for the newly parallelized asset publishing:
for each asset in the rendered page, do the following, all at the same time:
    download the asset from wherever it’s stored
    upload the asset to S3/Cloudfront
It looks easy when it’s pseudocode, but trust me when I say it’s not nearly as easy in real code. A really simple diagram of this parallel code looks like:
Instead of uploading the assets one at a time, we do them all at the same time. If there were 50 assets, uploading the assets should take about 1/50th the time. That seems like a huge speedup, but when it’s applied to the whole publishing time, there’s still that pesky rendering to do first.
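As a rough illustration of the parallel version, here is a sketch using a thread pool from Python’s standard library. This is an assumption about one way to do it, not a description of our actual implementation: `publish_assets_in_parallel`, `download`, `upload`, and the `max_workers` default are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Sequence


def publish_assets_in_parallel(
    asset_urls: Sequence[str],
    download: Callable[[str], bytes],      # hypothetical: fetches an asset's bytes
    upload: Callable[[str, bytes], None],  # hypothetical: pushes the bytes to storage
    max_workers: int = 16,                 # assumed cap on simultaneous transfers
) -> int:
    """Publish assets concurrently and return how many were published."""
    if not asset_urls:
        return 0

    def publish_one(url: str) -> None:
        upload(url, download(url))

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # list() drains the results, so an exception raised in any
        # worker thread is re-raised here instead of being lost.
        list(pool.map(publish_one, asset_urls))
    return len(asset_urls)
```

Threads suit this kind of work because each one spends most of its time waiting on the network. Setting `max_workers` to the number of assets gives the "all at the same time" behavior; a smaller cap limits how many transfers compete for the connection.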
Also, sadly, running 50 of these at the same time doesn’t always mean a 50-times speedup. 50 at the same time means 50 downloads (or uploads) at the same time. If the network connection can’t handle that much traffic, maybe you’ll only see a 40- or 30-times speedup. Additionally, if one of those 50 is a big file, it probably accounts for more than 1/50th of the overall time, so we still have to wait for it.
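The big-file effect is easy to see with a back-of-the-envelope estimate. The sketch below assumes unlimited bandwidth and made-up per-asset durations; `estimate_publish_times` is a hypothetical helper, not something from our codebase.

```python
from typing import Sequence, Tuple


def estimate_publish_times(durations: Sequence[float]) -> Tuple[float, float]:
    """Return (sequential, ideal_parallel) time for per-asset durations in seconds."""
    # One at a time: every asset's time adds up.
    sequential = float(sum(durations))
    # All at once (ideal case): we only wait for the slowest asset.
    ideal_parallel = float(max(durations, default=0.0))
    return sequential, ideal_parallel
```

For example, with 49 small assets at 0.1 seconds each and one big asset at 2 seconds (invented numbers), this gives about 6.9 seconds sequentially and 2 seconds in parallel: a speedup of roughly 3.5 times, not 50.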
Result
So what does all this mean in the end, for our customers? What I’ve seen from our measurements so far is that for 90% of pages, publishing takes only 58% as long as it did before.
This graph shows the median (50th percentile), 90th, 95th, and 99th percentile numbers for each 3-hour window over the last 7 days. As you can see, they all drop significantly on the afternoon of the 21st, when we deployed this change.
A page that previously may have taken 13 seconds to publish now takes less than 8 seconds, and a page that took 7 seconds to publish now takes only 4. This depends on the page, however. Pages that use very few assets but still take a long time to render won’t show as much benefit from this improvement.
I’m really happy with these changes, and I hope you will be too.
–Derek Lewis
Senior Software Developer