BigPipe in Drupal

BigPipe was conceived at Facebook as a way to load dynamic pages quickly. It loads the various sections of a web page in parallel, so end users don't have to wait for the DOM to be completely ready before they can start interacting with the site. In this article, we will talk about the architectural changes that made this kind of rendering possible and dive into the big_pipe module to see how it works.

How does this work?

During rendering, the personalized parts are turned into placeholders.
By default, Drupal 8 uses the Single Flush strategy (aka "traditional") for replacing the placeholders, i.e. we don't send a response until we've replaced all placeholders.
The BigPipe module introduces a new strategy that allows us to flush the initial page first, and then stream the replacements for the placeholders.
This results in hugely improved front-end/perceived performance (watch the 40-second screencast above).

A common misconception about BigPipe is that it makes the web server stack return pages faster. In reality, the total load time for a page stays the same; the advantage is that a user can start interacting with a section of the page as soon as it's ready (a.k.a. perceived performance).
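To make the distinction concrete, here is a minimal sketch (not Drupal code) that models both delivery strategies with made-up timings. The function names and the numbers are hypothetical; the only point is that the completion time is identical, while the moment the first markup reaches the browser differs.

```javascript
// Hypothetical model: the skeleton takes `skeletonMs` to render and each
// personalized placeholder takes the listed number of milliseconds.
function singleFlushTimeline(skeletonMs, placeholderMs) {
  const total = placeholderMs.reduce((sum, ms) => sum + ms, skeletonMs);
  // Nothing is sent until every placeholder has been replaced.
  return { firstBytesAt: total, completeAt: total };
}

function streamedTimeline(skeletonMs, placeholderMs) {
  const total = placeholderMs.reduce((sum, ms) => sum + ms, skeletonMs);
  // The skeleton is flushed as soon as it is ready; replacements follow.
  return { firstBytesAt: skeletonMs, completeAt: total };
}

const timings = [300, 500]; // made-up render times for two placeholders
console.log(singleFlushTimeline(40, timings)); // { firstBytesAt: 840, completeAt: 840 }
console.log(streamedTimeline(40, timings)); // { firstBytesAt: 40, completeAt: 840 }
```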

Perceived performance


In the screenshot above, the request completion time is the same whether big_pipe is enabled or disabled. However, the page is ready for user interaction at 44 ms with big_pipe enabled, versus 964 ms with it disabled.

To understand BigPipe's caching strategy, let's first look at how caching works in Drupal 7 and Drupal 8.

Caching in Drupal 7

Caching in Drupal 8


To understand the above examples better, let's take a look at the data being rendered in the different regions.

Header: Menu & Banner. These are mostly static & are not going to change frequently. (Good candidate for caching)
Footer: Menu, Legal text etc. Again mostly static content (Good candidate for caching)
Content: Content of node with nid:1. This is not going to change until an update to the node. (Good candidate for caching with proper cache invalidation)
Sidebar 1: User info block. Supposed to be different for each user. (Can't be cached)
Sidebar 2: Search block. The layout would stay the same. (Good candidate for caching)

In Drupal 7, the complete page would be cacheable for anonymous users. But there would be no caching for authenticated users, even though parts of the page could be cached. For authenticated requests in Drupal 7, one could use modules like authcache to get better performance.

In Drupal 8, however, this is doable with the Dynamic Page Cache module that ships in core. Drupal 8 core supports cacheability metadata, which is what the Dynamic Page Cache module relies on. To enable the module, go to admin/modules, select Dynamic Page Cache and save the configuration.

(Stay tuned for our next post to read about Authcache Vs Dynamic page cache)

Dynamic Page cache

Cacheability metadata: Cache Tags & cache contexts

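The background concept that brought the Dynamic Page Cache module into core was the introduction of cacheability metadata: cache tags and cache contexts. Cache contexts describe what a piece of content varies by (for example, per user or per language), while cache tags describe which data it depends on, so that it can be invalidated when that data changes.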
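As a rough illustration of how the two kinds of metadata play different roles, here is a toy cache written for this article. It is not Drupal's cache API; the class and method names are made up. Contexts end up in the cache ID, so each variation is stored separately, while tags are stored next to the item, so it can be dropped when the underlying data changes.

```javascript
// Toy render cache (hypothetical, not Drupal's API).
class ToyRenderCache {
  constructor() {
    this.items = new Map();
  }

  // Build a cache ID from a base key plus the current value of each context.
  static cid(key, contexts, request) {
    const parts = [...contexts].sort().map((c) => `[${c}]=${request[c]}`);
    return [key, ...parts].join(':');
  }

  set(key, contexts, tags, request, markup) {
    this.items.set(ToyRenderCache.cid(key, contexts, request), { tags, markup });
  }

  get(key, contexts, request) {
    const hit = this.items.get(ToyRenderCache.cid(key, contexts, request));
    return hit ? hit.markup : null;
  }

  // Drop every item that carries at least one of the given tags.
  invalidateTags(tags) {
    for (const [cid, item] of this.items) {
      if (item.tags.some((tag) => tags.includes(tag))) {
        this.items.delete(cid);
      }
    }
  }
}

const cache = new ToyRenderCache();
const editor = { 'user.permissions': 'editor' };
cache.set('node_view', ['user.permissions'], ['node:1'], editor, '<article>Node 1</article>');
console.log(cache.get('node_view', ['user.permissions'], editor)); // <article>Node 1</article>
cache.invalidateTags(['node:1']); // node 1 was updated
console.log(cache.get('node_view', ['user.permissions'], editor)); // null
```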

Auto-Placeholdering

One major way in which Drupal's implementation differs from Facebook's (or any general implementation of BigPipe) is its ability to identify the regions that can benefit from BigPipe delivery.

When we talk about caching sections of a page, there can be sections that cannot be cached, or for which caching is an overhead. Drupal can now identify such sections automatically, based on a few conditions (explained below), and Drupal core replaces them with placeholders. So while preparing the HTML response, Drupal doesn't need to wait for these uncacheable parts to pull fresh data. Instead, it can send out the skeleton markup with the non-cached parts of the page rendered as placeholders. The placeholders are then processed by a placeholder strategy. Drupal core ships with only one placeholder strategy, called single flush, but it also provides a way for contrib modules to define their own strategies (this is where the big_pipe module hooks in).

Auto Placeholdering Conditions


High Cardinality: Some content that can have a lot of variations. e.g., a block that needs to change per user.
High Invalidation rate: If the content of a block is going to change frequently.
Low max-age: Cached content with a low max-age likewise implies a high invalidation rate.

While rendering a page, all such blocks are identified & automatically replaced with placeholders. Drupal 8 core uses the following criteria for placeholdering:

renderer.config:
  required_cache_contexts: ['languages:language_interface', 'theme', 'user.permissions']
  auto_placeholder_conditions:
    max-age: 0
    contexts: ['session', 'user']
    tags: []

max-age: 0 - Render arrays with max-age set to 0 cannot be cached. This makes them perfect candidates for auto-placeholdering.
contexts: ['session', 'user'] - Render arrays that vary by the session or user cache contexts have a very high cardinality. This makes caching them an overhead.
tags: [] - Lists cache tags with a high invalidation frequency; render arrays carrying any of them are auto-placeholdered. The list is empty by default, so no cache tag triggers auto-placeholdering out of the box.
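The three conditions can be read as a simple predicate. The sketch below is not core's implementation; it assumes a render array is placeholdered as soon as any one condition matches, that contexts and tags are compared by exact name, and that -1 stands for "cache permanently". The function name and the shape of the `cacheability` object are made up for illustration.

```javascript
const PERMANENT = -1; // assumed sentinel for "cache forever"

// Mirrors the values of auto_placeholder_conditions shown above.
const conditions = { maxAge: 0, contexts: ['session', 'user'], tags: [] };

function shouldAutoPlaceholder(cacheability, conds) {
  const { maxAge, contexts = [], tags = [] } = cacheability;
  // 1. Max-age at or below the threshold: caching is not worthwhile.
  if (maxAge !== undefined && maxAge !== PERMANENT && maxAge <= conds.maxAge) {
    return true;
  }
  // 2. Varies by a high-cardinality cache context.
  if (contexts.some((context) => conds.contexts.includes(context))) {
    return true;
  }
  // 3. Carries a cache tag that is invalidated very often.
  return tags.some((tag) => conds.tags.includes(tag));
}

console.log(shouldAutoPlaceholder({ maxAge: 0 }, conditions)); // true
console.log(shouldAutoPlaceholder({ contexts: ['user'] }, conditions)); // true
console.log(shouldAutoPlaceholder({ contexts: ['user.permissions'], tags: ['node:1'] }, conditions)); // false
```

Under these assumptions the user info block from Sidebar 1 would be placeholdered, while the node content (tagged node:1, varying only by permissions) would be rendered and cached as usual.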

Caching is performed once the response consists of cached content plus placeholders. The next step is to replace the placeholders with actual content. Core provides only the single flush strategy, wherein the complete page (with its placeholders still in place) is passed to Drupal\Core\EventSubscriber\HtmlResponsePlaceholderStrategySubscriber, and all placeholders are replaced with actual content in one go. Drupal 8 core allows contrib modules to define their own strategies, which opens the door to BigPipe, ESI, etc.

To read more about cacheability metadata & auto-placeholdering, I would recommend going through:

P.S.: The flush strategies & placeholdering are only applicable to HTML responses.

Enter Big Pipe Module

At a high level, BigPipe sends an HTML response in chunks:

one chunk: everything just before </body> — this contains BigPipe placeholders for the personalized parts of the page. Let's call it The Skeleton.
n chunks: a <script> tag per BigPipe placeholder in The Skeleton.
one chunk: </body> and everything after it.
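The chunk sequence above can be sketched in a few lines. This is a simplified, hypothetical helper rather than the module's code: it assumes each replacement is already available as a list of AJAX commands, and it ignores the extra attributes and escaping that a real implementation needs on the <script> tags.

```javascript
// Hypothetical helper: turns a rendered Skeleton plus the rendered
// replacements into the ordered list of chunks described above.
function toChunks(skeletonHtml, replacements) {
  const marker = '</body>';
  const index = skeletonHtml.indexOf(marker);
  if (index === -1) {
    throw new Error('The Skeleton has no </body> tag');
  }
  // Chunk 1: everything just before </body>.
  const preBody = skeletonHtml.slice(0, index);
  // Last chunk: </body> and everything after it.
  const postBody = skeletonHtml.slice(index);
  // n chunks in between: one <script> tag per placeholder replacement.
  const scripts = replacements.map(
    (commands) => `<script type="application/json">${JSON.stringify(commands)}</script>`
  );
  return [preBody, ...scripts, postBody];
}

const chunks = toChunks('<html><body><p>Hello</p></body></html>', [
  [{ command: 'insert', data: '<span>Hi, user</span>' }],
]);
console.log(chunks.length); // 3
```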

The major way in which Drupal's implementation differs from Facebook's implementation (and others) is in its ability to automatically figure out which parts of the page can benefit from BigPipe-style delivery using auto-placeholdering.

BigPipe can only work if JavaScript is enabled. The BigPipe module in Drupal also makes it possible to replace placeholders without JavaScript. Technically it's not BigPipe but the use of multiple flushes, termed 'no-JS BigPipe'.

This allows us to use both no-JS BigPipe and "classic" BigPipe in the same response to maximize the amount of content we can send as early as possible.

Getting deeper into implementation:

BigPipe placeholders: 1 HtmlResponse + N embedded AjaxResponses.
- BigPipe does not use multiple AJAX requests/responses. It uses a single HTML response. But it is a long-lived one: The Skeleton is sent first, the closing </body> tag is not yet sent, and the connection is kept open. Whenever another BigPipe placeholder is rendered, Drupal sends (and so actually appends to the already-sent HTML) something like <script type="application/json">[{"command":"settings","settings":{…}}, {"command":…}.
- The content of the <script> tag above is the same as that of an AJAX response. The BigPipe module has JavaScript that listens for these and applies them. It is termed an Embedded AJAX Response (since it is embedded in the HTML response).
No-JS BigPipe placeholders: 1 HtmlResponse + N embedded HtmlResponses.
- The Skeleton is split into multiple parts; the separators are where the no-JS BigPipe placeholders used to be. Whenever another no-JS BigPipe placeholder is rendered, Drupal sends (and so actually appends to the already-sent HTML) something like <link rel="stylesheet" …><script …><content>.
- For every no-JS BigPipe placeholder, the associated CSS & JS is sent alongside (if it has not already been sent via drupalSettings.ajaxPageState.libraries). This is termed an Embedded HTML Response.

Combining all of the above, when using both BigPipe placeholders and no-JS BigPipe placeholders, BigPipe sends: 1 HtmlResponse + M Embedded HTML Responses + N Embedded AJAX Responses.

BigPipe request flow

Disclaimer: The module is under active development. Things you see below might change going forward. The section below dives into code chunks from the BigPipe module to see how it plays with core. I have tried to put down the parts of the module that show how it hooks into core.

How does the module hook into the core's placeholder strategy?

The module actually doesn't hook into core's placeholdering strategy; it defines a new one.

ChainedPlaceholderStrategy in core looks at the available placeholder strategies. These should be tagged with placeholder_strategy.
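The chaining idea can be sketched as follows. This is an illustration written for this article, not core's code: each strategy is modelled as a plain function that receives the placeholders nobody has handled yet and returns the ones it decided to handle. The strategy names and the `personalized` flag are hypothetical.

```javascript
// Runs the strategies in order; each one only sees what is still unhandled.
function chainStrategies(strategies, placeholders) {
  const remaining = { ...placeholders };
  const processed = {};
  for (const strategy of strategies) {
    if (Object.keys(remaining).length === 0) {
      break; // Every placeholder has been claimed.
    }
    const handled = strategy(remaining);
    for (const key of Object.keys(handled)) {
      processed[key] = handled[key];
      delete remaining[key];
    }
  }
  return processed;
}

// Hypothetical strategy: only claims personalized placeholders.
const streamPersonalized = (placeholders) =>
  Object.fromEntries(
    Object.entries(placeholders)
      .filter(([, placeholder]) => placeholder.personalized)
      .map(([key, placeholder]) => [key, { ...placeholder, via: 'stream' }])
  );

// Hypothetical fallback: claims everything that is left.
const singleFlush = (placeholders) =>
  Object.fromEntries(
    Object.entries(placeholders).map(([key, placeholder]) => [key, { ...placeholder, via: 'single_flush' }])
  );

const result = chainStrategies([streamPersonalized, singleFlush], {
  user_block: { personalized: true },
  search_block: { personalized: false },
});
console.log(result.user_block.via, result.search_block.via); // stream single_flush
```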

How does the module create placeholders?

BigPipe module defines its own placeholder strategy Drupal\big_pipe\Render\Placeholder\BigPipeStrategy which implements PlaceholderStrategyInterface.

This strategy is activated only if the current request is associated with a session. Without a session, it is assumed this response is not actually dynamic & can be handled using internal page cache.
The strategy also defines 2 sub-strategies to handle the 2 cases shown in the BigPipe request flow image above:
- with JavaScript enabled: #attached[big_pipe_placeholders]

/**
 * Creates a BigPipe JS placeholder.
 *
 * @param string $original_placeholder
 *   The original placeholder.
 * @param array $placeholder_render_array
 *   The render array for a placeholder.
 *
 * @return array
 *   The resulting BigPipe JS placeholder render array.
 */
protected static function createBigPipeJsPlaceholder($original_placeholder, array $placeholder_render_array) { ... }

- with JavaScript disabled: #attached[big_pipe_nojs_placeholders]

/**
 * Creates a BigPipe no-JS placeholder.
 *
 * @param string $original_placeholder
 *   The original placeholder.
 * @param array $placeholder_render_array
 *   The render array for a placeholder.
 *
 * @return array
 *   The resulting BigPipe no-JS placeholder render array.
 *
 * @todo Figure out how to simplify this. Perhaps no new placeholder is in
 *   fact necessary?
 * @todo Related, perhaps distinguish between "HTML" and "non-HTML (attr
 *   value)" use cases? Because right now, this *breaks* HTML and therefore
 *   breaks response filters: this indiscriminately uses a <div> as a
 *   placeholder, which is invalid inside a HTML attribute, and thus breaks
 *   DOM parsing.
 */
protected static function createBigPipeNoJsPlaceholder($original_placeholder, array $placeholder_render_array) { ... }

HtmlResponseAttachmentsProcessor doesn't know about BigPipe placeholders?

The BigPipe module creates another service, html_response.attachments_processor.big_pipe, to override the processing of attachments.

big_pipe.services.yml
---
html_response.attachments_processor.big_pipe:
  public: false
  class: \Drupal\big_pipe\Render\BigPipeResponseAttachmentsProcessor
  decorates: html_response.attachments_processor
  decoration_inner_name: html_response.attachments_processor.original
  arguments: ['@html_response.attachments_processor.original', '@asset.resolver', '@config.factory', '@asset.css.collection_renderer', '@asset.js.collection_renderer', '@request_stack', '@renderer', '@module_handler']

BigPipeResponseAttachmentsProcessor extends HtmlResponseAttachmentsProcessor, overriding processAttachments() to do the following:

Remove BigPipe placeholders from attachments

$attachments = $response->getAttachments();
$big_pipe_placeholders = [];
$big_pipe_nojs_placeholders = [];
if (isset($attachments['big_pipe_placeholders'])) {
  $big_pipe_placeholders = $attachments['big_pipe_placeholders'];
  unset($attachments['big_pipe_placeholders']);
}
if (isset($attachments['big_pipe_nojs_placeholders'])) {
  $big_pipe_nojs_placeholders = $attachments['big_pipe_nojs_placeholders'];
  unset($attachments['big_pipe_nojs_placeholders']);
}
$response->setAttachments($attachments);

Process the attachments that HtmlResponseAttachmentsProcessor understands

// Call HtmlResponseAttachmentsProcessor to process all other attachments.
$this->htmlResponseAttachmentsProcessor->processAttachments($response);

Attach back the BigPipe placeholders

// Restore BigPipe placeholders.
$attachments = $response->getAttachments();
if (count($big_pipe_placeholders)) {
  $attachments['big_pipe_placeholders'] = $big_pipe_placeholders;
}
if (count($big_pipe_nojs_placeholders)) {
  $attachments['big_pipe_nojs_placeholders'] = $big_pipe_nojs_placeholders;
}
$response->setAttachments($attachments);
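The three steps above form a classic decorator: hide what the inner processor does not understand, delegate, then put it back. Here is a language-neutral sketch of that pattern in JavaScript. The class names are invented for this example, and it assumes that the inner processor rejects attachment types it does not recognise.

```javascript
// Stand-in for the core processor (hypothetical): it refuses unknown types.
class CoreProcessor {
  processAttachments(response) {
    const known = ['library', 'drupalSettings'];
    for (const type of Object.keys(response.attachments)) {
      if (!known.includes(type)) {
        throw new Error(`Unknown attachment type: ${type}`);
      }
    }
    response.processed = true;
    return response;
  }
}

// Decorator: same method, wraps the inner processor.
class BigPipeAwareProcessor {
  constructor(inner) {
    this.inner = inner;
  }

  processAttachments(response) {
    // 1. Remove the BigPipe placeholders from the attachments.
    const hidden = {};
    for (const type of ['big_pipe_placeholders', 'big_pipe_nojs_placeholders']) {
      if (type in response.attachments) {
        hidden[type] = response.attachments[type];
        delete response.attachments[type];
      }
    }
    // 2. Let the inner processor handle everything it understands.
    this.inner.processAttachments(response);
    // 3. Attach the BigPipe placeholders back.
    Object.assign(response.attachments, hidden);
    return response;
  }
}

const processor = new BigPipeAwareProcessor(new CoreProcessor());
const response = { attachments: { library: ['big_pipe/big_pipe'], big_pipe_placeholders: { p1: {} } } };
processor.processAttachments(response);
console.log(response.processed, Object.keys(response.attachments)); // true [ 'library', 'big_pipe_placeholders' ]
```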

Drupal core responds with HTML using HtmlResponse, which uses single flush. How do we respond using the n-flush strategy provided by BigPipe?

The BigPipe module provides BigPipeResponse, which extends HtmlResponse. The module also defines its own EventSubscriber to handle responses using BigPipeResponse.

The BigPipeResponse class overrides sendContent() to make use of the bigPipe service (which is actually responsible for sending data in chunks).

/**
 * {@inheritdoc}
 */
public function sendContent() {
  $this->bigPipe->sendContent($this->content, $this->getAttachments());
  return $this;
}

Where does BigPipe do the heavy-lifting of chunked response?

BigPipe defines another service to handle this: big_pipe.

big_pipe.services.yml
---
big_pipe:
  class: Drupal\big_pipe\Render\BigPipe
  arguments: ['@renderer', '@session', '@request_stack', '@http_kernel', '@event_dispatcher']

This service does the heavy lifting needed for a chunked response. Control is passed down to it from the BigPipeResponse class discussed in the section above. It's responsible for:

Breaking down the content being rendered into parts: preBody, placeholders, postBody

list($pre_body, $post_body) = explode('</body>', $content, 2);
$this->sendPreBody($pre_body, $nojs_placeholders, $cumulative_assets);
$this->sendPlaceholders($placeholders, $this->getPlaceholderOrder($pre_body), $cumulative_assets);
$this->sendPostBody($post_body);

Collecting all the assets listed in drupalSettings.ajaxPageState.libraries & attaching them to each response.

$cumulative_assets = AttachedAssets::createFromRenderArray(['#attached' => $attachments]);
$cumulative_assets->setAlreadyLoadedLibraries(explode(',', $attachments['drupalSettings']['ajaxPageState']['libraries']));
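The reason for tracking cumulative assets is that a library which has already been sent should not be sent again with a later placeholder. A minimal sketch of that bookkeeping, with a made-up helper name and a plain Set standing in for the already-loaded library list:

```javascript
// Returns the libraries that still need to be sent for one placeholder and
// records them as loaded, so later placeholders do not send them again.
function librariesToSend(alreadyLoaded, placeholderLibraries) {
  const toSend = placeholderLibraries.filter((library) => !alreadyLoaded.has(library));
  toSend.forEach((library) => alreadyLoaded.add(library));
  return toSend;
}

// The page state arrives as a comma-separated string, as in the snippet above.
const loaded = new Set('core/jquery,core/drupal'.split(','));
console.log(librariesToSend(loaded, ['core/drupal', 'core/drupal.ajax'])); // [ 'core/drupal.ajax' ]
console.log(librariesToSend(loaded, ['core/drupal.ajax'])); // []
```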

Helper functions:

sendPreBody(): Sends everything until just before </body>.
sendNoJsPlaceholders(): Sends no-JS BigPipe placeholders' replacements as embedded HTML responses.
sendPlaceholders(): Sends BigPipe placeholders' replacements as embedded AJAX responses.
sendPostBody(): Sends </body> and everything after it.

Where are the AJAX commands set in the BigPipe placeholders processed?

BigPipe defines its own library for handling assets & defines big_pipe.js in it.

big_pipe.libraries.yml
---
big_pipe:
  version: VERSION
  js:
    js/big_pipe.js: {}
  drupalSettings:
    bigPipePlaceholders: []
  dependencies:
    - core/jquery
    - core/drupal
    - core/drupal.ajax
    - core/drupalSettings

The JavaScript is responsible for:

Processing all the script tags between script[data-big-pipe-event="start"] & script[data-big-pipe-event="stop"]
function bigPipeProcessContainer(context) {
  // Make sure we have bigPipe related scripts before processing further.
  if (!context.querySelector('script[data-big-pipe-event="start"]')) {
    return false;
  }

  $(context).find('script[data-drupal-ajax-processor="big_pipe"]').once('big-pipe')
    .each(bigPipeProcessPlaceholder);

  // If we see a stop element always clear the timeout.
  if (context.querySelector('script[data-big-pipe-event="stop"]')) {
    if (timeoutID) {
      clearTimeout(timeoutID);
    }
    return true;
  }

  return false;
}

Executing the AJAX commands included in the script tag
function bigPipeProcessPlaceholder(index, placeholder) {
  var placeholderName = this.getAttribute('data-big-pipe-placeholder');
  var content = this.textContent.trim();
  // Ignore any placeholders that are not in the known placeholder list.
  // This is used to avoid someone trying to XSS the site via the
  // placeholdering mechanism.
  if (typeof drupalSettings.bigPipePlaceholders[placeholderName] !== 'undefined') {
    // If we try to parse the content too early textContent will be empty,
    // making JSON.parse fail. Remove once so that it can be processed again
    // later.
    if (content === '') {
      $(this).removeOnce('big-pipe');
    }
    else {
      var response = JSON.parse(content);
      // Use a dummy url.
      var ajaxObject = Drupal.ajax({url: 'big-pipe/placeholder.json'});
      ajaxObject.success(response);
    }
  }
}
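Stripped of jQuery and the DOM, the decision logic in the placeholder processing comes down to three outcomes: ignore, retry later, or apply. The sketch below isolates that logic so it can be run on its own. The function name, the string return values and the `applyCommands` callback are inventions for this example and are not part of big_pipe.js.

```javascript
// DOM-free sketch of the decision made for each embedded <script> tag.
function processEmbeddedResponse(knownPlaceholders, name, textContent, applyCommands) {
  // Unknown placeholder names are ignored (guards against injected scripts).
  if (!Object.prototype.hasOwnProperty.call(knownPlaceholders, name)) {
    return 'ignored';
  }
  const content = textContent.trim();
  // The tag may be seen before its content has arrived: try again later.
  if (content === '') {
    return 'retry';
  }
  // Otherwise the content is a JSON list of AJAX commands.
  applyCommands(JSON.parse(content));
  return 'applied';
}

const known = { 'callback=user_block': true };
const log = [];
const apply = (commands) => commands.forEach((command) => log.push(command.command));

console.log(processEmbeddedResponse(known, 'evil', '[{"command":"insert"}]', apply)); // ignored
console.log(processEmbeddedResponse(known, 'callback=user_block', '  ', apply)); // retry
console.log(processEmbeddedResponse(known, 'callback=user_block', '[{"command":"insert"}]', apply)); // applied
```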

The complete magic recipe for rendering the blocks in parallel is above.

That's pretty much it from my end. I'd really like to thank Wim Leers, Fabianx and others who worked really hard on bringing this caching strategy to Drupal 8 and on getting it into core with the 8.1 release.

References: