Historic DrupalCon Amsterdam 2014 - Let's get to the bottom of Headless Drupal

Let the Debates Begin - Part II

Intro

Whatever it is, and in this article we are going to venture a proposal for a canonical definition, Headless Drupal seems to synthesize a heartfelt need in the context of the current Drupal problematic. It has been a hot subject for quite some time now, with an active group presence on Drupal Groups and a veritable avalanche of articles and presentations. Barring the obvious number one topic of Drupal 8 (which we'll debate in the next article in this series), and successfully competing with "the new PHP" itself as a center of interest, it was really the hottest topic at DrupalCon Amsterdam 2014, with training, presentations and at least one very important BOF.

Let's group the concepts first, and then open up questions and concerns with a view to supporting the ongoing debate in the Drupal community.

Concepts

eatings's session on Headless Drupal

Video: https://www.youtube.com/watch?v=p4tl64NnUqc

Slides: http://slides.com/eatings/decoupled-drupal-and-the-future?token=7Kifj3qFerCnUxwwWwBw8AyTLzMr#/

The title treats the decoupled front-end as a synonym for Headless Drupal (see the debate on geedeeoh, "What's in a name"). The presentation aims to be a comprehensive high-level overview of the meaning and challenges involved in Headless Drupal.

What is Headless?
  • Drupal as a content model and entity API minus the client-facing theme layer
  • Drupal as a RESTful web services endpoint (sic)
  • Drupal as a content repository
  • Your app up front, Drupal in the back
  • "Bring your own front-end"
  • Client-side JS MVC frontend frameworks rendering Drupal data.
  • Drush
Advantages
  • Complete separation of concerns
  • Drupal Backend + Front-end App
  • Ultimate control of your frontend stack (sic)
  • Drupal's content modelling tools
  • Content authoring and workflow tools
  • Users, roles, permissions
  • Server exposed RESTful entity data

Of course, you don't get anything from the front-end you might have become accustomed to depending on, like Drupal's render() pipeline, visual templating, panels, display suite, other display modules, the front-facing Form API, and of course Drupal's theme system.
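
To make the "bring your own front-end" idea concrete, here is a minimal sketch in Python of a client rendering entity data it has received as JSON. The payload shape (`nid`, `title`, `body`, `tags`) and the `render_article` helper are assumptions made for illustration, not the output format of any particular Drupal module; a real client would fetch the JSON over HTTP from whatever REST endpoint the site exposes.

```python
import json
from html import escape

# Hypothetical JSON as a headless Drupal back-end might return it for one node.
# The field names are illustrative assumptions, not a real module's schema.
SAMPLE_RESPONSE = """
{
  "nid": 42,
  "title": "Headless Drupal & friends",
  "body": "Your app up front, Drupal in the back.",
  "tags": ["decoupled", "rest"]
}
"""


def render_article(entity: dict) -> str:
    """Render an entity dict to HTML entirely on the client side.

    This takes over the job Drupal's theme layer would do in a coupled site.
    """
    tags = ", ".join(escape(tag) for tag in entity.get("tags", []))
    return (
        f'<article id="node-{int(entity["nid"])}">'
        f"<h1>{escape(entity['title'])}</h1>"
        f"<p>{escape(entity['body'])}</p>"
        f"<footer>{tags}</footer>"
        "</article>"
    )


entity = json.loads(SAMPLE_RESPONSE)
html_output = render_article(entity)
print(html_output)
```

Note that escaping, markup and layout are now the client's responsibility, which is exactly the trade-off described above: full control of the front-end stack, but none of Drupal's rendering conveniences.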

Use cases for going headless
  • Things are changing very quickly in the front-end: there's Backbone.js, AngularJS, single-page apps, etc. Drupal core can't keep up and shouldn't even try.
  • Frontend moves at innovation speed, Drupal + Content Model moves at its own pace
  • Everybody can use the tools they prefer in contexts that make the most sense.

Then, some examples are given:

  • Drupal site feeding mobile and other apps
  • Drupal content modelling plugging into massive online retail system
  • Drupal pushing content to multiple web endpoints
  • Swappable frontend teams, techniques and systems, with consistent Drupal base.
  • What if all Drupal front ends were decoupled services-based apps?

And the conclusion, expressed as "coda", 95% of which we agree with fully, is quoted in full in the debate section below.

One fascinating example mentioned is the WordPress demo front-end client for "Headless WordPress", an interface for the JSON REST API plugin for WordPress. It generates a Backbone JavaScript application that pairs with the server-side API.

Also, Solstice, "A simple Solr wrapper for AngularJS apps".

In the questions, an argument is made for considering evolutionary, tightly coupled front ends (reference demos?) that can better take advantage of having "everything in memory in PHP" and will not break cache systems. Larry Garfield also mentions a little Drupal 8 specific jewel: pushing front-end logic, etc. to Twig templates will allow use of twig.js, "a pure JavaScript implementation of the Twig PHP templating language", which will effectively shove the front-end into the browser.

kvirta's Turbocharging Drupal syndication with Node.JS

This presentation is exceedingly interesting since it is born out of the pain of developing a huge distributed web app, and also because it dovetails nicely with the vision of Headless Drupal presented in Four Kitchens' training and rejected session:

  • CMS: Drupal as closed CMS for content modeling and generation
  • Back-end
    • Drupal as delivery back-end, as prototype and then in many cases for production too.
    • For most systems, a Node.js back-end for delivery via REST API
  • Front-end: Multiple clients, one of which may be Drupal.

How interesting for two giants (the largest video delivery system in Finland, on the one hand, and Pantheon partner Four Kitchens, on the other) to come to similar conclusions via hard-won experience "from the trenches", with no preconceived notions! See the Debate section below for my suggestion of adopting this as the canonical definition of Headless Drupal.

Kudos to kvirta for their team sharing everything they've learned with such hard work!

Requirements they were faced with:
  • Build a platform for a large TV broadcaster to support subscription video and an on-demand video service (Live sports: Finnish ice hockey!)
Architecture:
  • Video content management system stores video metadata, and is fed by uploaded content and linear TV programming (ERP system feed)
  • The CMS may receive uploaded videos, but they are moved out of the CMS and into a separate stream. The CMS also directs streaming of videos from multiple content delivery networks. The CMS does not deliver any web pages outside of admin CMS functionality: this is Headless Drupal delivering JSON.
    • Custom modules for
      • Integration to TV ERP system (storing up to 3 months of video metadata programming)
      • Controlling the video binary management system
      • Marking videos as set to be played
  • Content delivery system for downstream clients: multiple websites (Drupal 7 and WordPress), mobile and Smart TV apps
    • Downstream clients required time-bound attributes, such as the seconds since the last fetch and whether content was new or changed. The seconds were passed as a parameter in the URL, so the CDN was directed to send content as of that moment, not before; this made caching impossible.
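
Why a per-request time parameter defeats caching can be shown with a small simulation. This is only a sketch under stated assumptions: the `/feed/videos` path and the `since` parameter name are hypothetical, and the dictionary stands in for a URL-keyed HTTP cache such as a CDN or Varnish.

```python
from urllib.parse import urlencode


def feed_url(base: str, seconds_since_last_fetch: int) -> str:
    """Build a feed URL carrying the client's elapsed seconds (hypothetical parameter)."""
    return f"{base}?{urlencode({'since': seconds_since_last_fetch})}"


def simulate_cache(urls: list[str]) -> tuple[int, int]:
    """Count (hits, misses) for a cache keyed on the full URL."""
    cache: dict[str, str] = {}
    hits = misses = 0
    for url in urls:
        if url in cache:
            hits += 1
        else:
            misses += 1
            cache[url] = f"response for {url}"  # stand-in for the origin's reply
    return hits, misses


# Five clients polling with different elapsed times: every URL is unique.
time_bound = [feed_url("/feed/videos", s) for s in (17, 42, 61, 75, 118)]
# The same five requests without the time-bound parameter.
static = ["/feed/videos"] * 5

time_bound_result = simulate_cache(time_bound)
static_result = simulate_cache(static)
print(time_bound_result)  # (0, 5): every request reaches the origin
print(static_result)      # (4, 1): only the first request reaches the origin
```

Because each client's elapsed time differs, nearly every URL is distinct, so every request falls through to the origin server.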
What they found:
  • As the number of downstream clients increased, the headless Drupal server was unable to scale, in spite of special steps taken for performance:
    • A fast database (MongoDB) was used for field storage.
    • The JSON Views feeds originated in a SOLR backend.
    • MongoDB field storage is faster but isn't compatible with Views unless used with Entity Field Queries, which were not optimum. The team concluded that using MongoDB was not then worth the trouble.
    • With 40 fields per item, and 1000 items in a feed page, Field API results in 40,000 calls to every hook per page load, slowing things down considerably.
    • Due to the time-bound requirements for downstream clients described above, caching with, say, Varnish, was impossible.
  • An initial modification to simply index outside of Drupal via Apache SOLR integration failed
    • SOLR was already being used as a backend to Views and search.
    • SOLR was schemaless and so convenient
    • Content could be distributed via a simple REST API
    • But the need for frequent re-indexing, in order to include popularity data and to order often viral content by popularity, led to a very slow process on SOLR itself despite various attempts at overcoming the problem.
  • As a result all data except the full-text search backend was moved to MongoDB, whose document based records were the most flexible solution for indexing outside of Drupal.
    • MongoDB locks the whole database for indexing, which will be solved in a short-term future version, but even so it is the fastest solution.
    • The content is then fed downstream via a Node.js REST API integrated with MongoDB and the full text search on SOLR. This is integrated with the Drupal 7 CMS via the MongoDB Indexer module (in Sandbox, see below).
    • MongoDB Indexer uses a straight connection to MongoDB without the use of an API, and de-normalizes the data for optimized distribution by the Node.js REST API
  • Once the optimized and indexed data was stored in MongoDB, a Node.js content delivery system was found to be the fastest means of delivery, also allowing logic to be applied as part of the delivery process to data garnered from various sources (session and auth validation with a single sign-on server in the back-end, hidden token inclusion to protect streams, user profile info from the sales server, live sports statistics).
    • Thanks to its event-driven non-blocking character, Node.js is eight times as fast as PHP running on the server (with no caching, remember; and that's PHP, including Symfony, which "is slow also"), given the nature of the task at hand. So MongoDB is never the bottleneck.
    • No Node.js framework was used, although one would not have noticeably hindered speed.
    • No fronting Nginx was used (the Cluster npm module takes advantage of multiple cores and Forever keeps the process running), but Nginx is on the roadmap, after sufficient performance testing.
    • Downstream clients log in directly to the Node.js app.
    • TCP connections might be next bottleneck if everything runs sufficiently fast.
  • It all proved to be quite a change for PHP programmers
    • Asynchronous processing and code
    • The speed at which cutting-edge projects such as Node.js and MongoDB actually change, even over the 18-month lifetime of the project.
  • When do you need to use Node.js for your web services content delivery system
  • How to use Node.js to speed up Drupal content delivery
  • Detailed comparison between scaling with Drupal vs. Node.js
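
The de-normalization step described above can be illustrated with a minimal Python sketch. It is not the MongoDB Indexer module's actual code: the record shapes and names (`VIDEOS`, `SERIES`, `TAGS`, `denormalize`) are hypothetical, and a plain dictionary stands in for the MongoDB collection. The point is only that references are resolved once, at index time, so the delivery layer can return a document without any joins or hook invocations.

```python
# Hypothetical normalized records, loosely modelled on separate entities
# that reference each other by ID.
VIDEOS = {101: {"title": "Ice hockey final", "series_id": 7, "tag_ids": [1, 2]}}
SERIES = {7: {"name": "Live sports"}}
TAGS = {1: {"label": "hockey"}, 2: {"label": "live"}}


def denormalize(video_id: int) -> dict:
    """Resolve every reference up front and return one flat document."""
    video = VIDEOS[video_id]
    return {
        "_id": video_id,
        "title": video["title"],
        "series": SERIES[video["series_id"]]["name"],
        "tags": [TAGS[tag_id]["label"] for tag_id in video["tag_ids"]],
    }


# Index time: build the delivery documents once (a dict stands in for MongoDB).
index = {video_id: denormalize(video_id) for video_id in VIDEOS}


def deliver(video_id: int) -> dict:
    """Delivery time: a single key lookup, with no further resolution needed."""
    return index[video_id]


print(deliver(101))
```

The cost is paid whenever content changes and the document has to be rebuilt, which is why indexing speed mattered so much in the account above.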
MongoDB Indexer module:

Debate

Check out some of the other sessions and BOF meetings! They flesh out the topic nicely. But with the detail we have already shared, there is enough to draw some conclusions and ask some questions.

Questions and concerns, and a proposal:

  • In the web app industry, the front-end is changing rapidly, new frameworks are emerging and developers want more control and want to standardize all their web app work, whether with Drupal or not. Headless Drupal in any of its varieties is a solution to this problem. This is actually not a concern, just a point that has to be saluted as one on which there is wide agreement.
  • But is the action all in the front-end? No, the back-end is changing just as fast. Drupal 7 and Drupal 8 can both serve as back-ends, and that should be the prototype as the information architecture and API mature, and for many applications may serve well in production. However, for many others Drupal cannot scale as a REST API delivery server and is outpaced by many alternatives (notably Node.js, but also Go, Erlang, Java, Scala, etc.).
  • Another front that has to be dealt with is the need for scalable persistence solutions above and beyond Drupal's database layer: back-end integration with MongoDB, CouchDB, Riak, and also network databases is another exciting field that modern web apps need to embrace, and at an ever-increasing pace.
  • To say, as eatings does, "Frontend moves at innovation speed, Drupal + Content Model moves at its own pace" is inadequate: we need to consider both the front-end and the back-end for integration and migration. Nor can we assume, as this definition of Headless Drupal does, that innovation is to be restricted only to the front-end. To be fair, @eatings does cover all the options in the excellent summary of examples he provides.
  • Drush isn't headless Drupal: it is part and parcel of a front-end: the command line. While it can be made part of an automated deployment system, for example, in conjunction with other tools, like Puppet, etc., it is not and was not designed to be a stand-alone component.
  • "Swappable frontend teams, techniques and systems, with consistent Drupal base." This is the most confusing of all: what do we mean by "base" here? Reality demands: "Swappable frontend teams, techniques and systems". Period.
  • At the end of his presentation, I think @eatings goes to the heart of the Drupal problematic, which is why his talk is so valuable. To quote:

Drupal has tried to be all things to everybody. All-in-one, soup-to-nuts Drupal-shaped hammer. You bend the app to Drupal, you build the use case around Drupal, you live and die by the whole stack: Presentation, Content, Data Model. What if Drupal was just another layer in your stack (and didn't have to own all of it)... that you could freely swap out, any piece at any time?

  • Then he says: "By decoupling the front end, we make it more flexible and widely relevant". I would say, by decoupling on all tiers, via a complete separation of concerns, everything is made more flexible and widely relevant. Drupal, and any other framework, will always be used where it is needed for what it does best!
  • Innovating in both the front-end and the back-end raises the question "So why use Drupal at all?" The answer is two-fold. First, using Drupal at least as the CMS, and possibly as one of the clients as well, allows for tremendous opportunities for participation of Drupal and its community in the most exciting and demanding of projects as the future unfolds. Secondly, actual use of Drupal will grow in all aspects as it is used with Drush, for example, or as an all-purpose data migration tool. Even if the decision is to migrate to a completely different stack, Drupal migration will be there, and new developers will come into contact with it and the Drupal community. Drupal and similar CMSs, like Backdrop and WordPress, will continue to grow in their own niches and would do well to excel in what they do best.
  • "Everybody can use the tools they prefer in contexts that make the most sense." That we can all agree on, can't we? What if Drupal was built like that?
  • A very important use case is that a RESTful back-end exposing an API dovetails perfectly as a material base for implementing structured content as a content strategy. The same kind of decoupling as discussed here is implicit. See Jeff Eaton's The Battle for the Body Field and NPR's COPE (check out the complete transcript and the works cited).
  • So one of the great advantages of diving into Headless Drupal is that we don't need to wait for Drupal 8 at all.

Proposal for canonical definition of Headless Drupal

  • CMS: Drupal as closed CMS for content modeling and generation
  • Back-end
    • Drupal as delivery back-end, as prototype and then in many cases for production too.
    • For most systems, a Node.js back-end for delivery via REST API
  • Front-end: Multiple clients, one of which may be Drupal.
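
The delivery tier in this proposal can be sketched in a few lines. The example below uses Python's asyncio rather than Node.js, purely to illustrate the event-driven, non-blocking style; the three data sources, their names and their return values are hypothetical stand-ins, with `asyncio.sleep` simulating I/O latency.

```python
import asyncio

# Stand-in for content that the CMS has already indexed for delivery.
INDEX = {101: {"title": "Ice hockey final"}}


async def fetch_document(video_id: int) -> dict:
    await asyncio.sleep(0.05)  # simulated lookup in the content store
    return INDEX[video_id]


async def fetch_profile(user_id: int) -> dict:
    await asyncio.sleep(0.05)  # simulated call to a user/sales service
    return {"user_id": user_id, "subscriber": True}


async def fetch_live_stats(video_id: int) -> dict:
    await asyncio.sleep(0.05)  # simulated call to a statistics service
    return {"viewers": 1234}


async def deliver(video_id: int, user_id: int) -> dict:
    """Wait on all three sources concurrently, then merge them into one response."""
    document, profile, stats = await asyncio.gather(
        fetch_document(video_id),
        fetch_profile(user_id),
        fetch_live_stats(video_id),
    )
    return {**document, "playable": profile["subscriber"], "stats": stats}


response = asyncio.run(deliver(101, 5))
print(response)
```

While one request is waiting on a slow source, the event loop is free to serve others, so the three waits overlap instead of adding up. Any client, Drupal included, can consume the merged response.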