Scalability and Architecture Issues of SME Web Content Management Systems

Many low-end and mid-range web CMS providers profess (indeed stress) scalability and extensibility, but in my experience the architectures of those systems are fundamentally ill-suited to achieving these aims. What are your thoughts?

The design of most of these systems is centered on the website, i.e. web-centric, which is fair enough as they are not intended to be enterprise content management systems in the wider sense.  However, they are also often web-server-centric, and this poses significant scalability issues.

Issue 1: web-server-centric scaling and caching

In the web-server-centric (WSC) model, scaling is achieved by scaling out the front-end servers into a web-farm.  Typically these servers share a single database, but they must notify one another to flush their caches, which requires communication between front-end servers.  Worse, the amount of communication traffic increases with each additional server in the web-farm, reducing the benefit of each server added.  The extra server-to-server channels also raise security issues.
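To make the growth in notification traffic concrete, here is a rough sketch in Python. It assumes the simplest scheme, in which the server that handles a content change notifies every other front-end server directly; real products may use multicast or a shared channel instead, and the function names are purely illustrative.

```python
def invalidation_messages(servers: int, writes: int) -> int:
    """Messages sent if the server handling each write notifies every peer."""
    return writes * (servers - 1)


def peer_links(servers: int) -> int:
    """Directed server-to-server links that must be configured and secured."""
    return servers * (servers - 1)


if __name__ == "__main__":
    # Assumed workload: 1000 content changes, purely for illustration.
    for n in (2, 4, 8, 16):
        print(n, invalidation_messages(n, writes=1000), peer_links(n))
```

Under this assumption the messages per change grow linearly with the size of the farm, while the number of links to configure and secure grows roughly with its square.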

Issue 2: web-server-centric connectivity and control

The WSC model typically only allows full access to the API via the web server.  This may be exposed as a web service, or some other distributed communications technology, but the web server is still involved in interpreting the request.  This means that any other system that needs to be involved in the creation, management or processing of content needs to have access to a web server.  This again raises security and scalability concerns, as it increases the processing demand on each front-end server for work that has nothing to do with presenting content to a remote web-based client.

Issue 3: web-server-centric extensibility

Next on the list of woes is that custom business logic must be added to all the web servers themselves, possibly compromising security and increasing both the processing burden and the complexity of the task.  For example, an extension may need guaranteed once-only execution, but that is difficult to achieve with a pure API approach (advocated by many of these systems).  Transactional behaviour like this typically needs either a message-based or a database solution.  The code must also handle the possibility of an event being missed or fired multiple times, depending on the guarantees of the CMS API.
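As an illustration of the database route, the sketch below records each event ID in the same transaction as the work it triggers, so a repeated delivery becomes a no-op. It is a minimal sketch only: the table names, the `handle_once` function and the use of SQLite are my own assumptions and not part of any particular CMS, and it only protects side effects that live in the same database.

```python
import sqlite3


def make_store() -> sqlite3.Connection:
    """Create an in-memory database with a dedup table and an example work table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE processed (event_id TEXT PRIMARY KEY)")
    conn.execute("CREATE TABLE audit_log (event_id TEXT, note TEXT)")
    conn.commit()
    return conn


def handle_once(conn: sqlite3.Connection, event_id: str, note: str) -> bool:
    """Apply the work for event_id at most once; return True if it was applied."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            # The primary key rejects an event ID that was already handled.
            conn.execute("INSERT INTO processed (event_id) VALUES (?)", (event_id,))
            # Hypothetical business logic, kept inside the same transaction.
            conn.execute(
                "INSERT INTO audit_log (event_id, note) VALUES (?, ?)",
                (event_id, note),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # duplicate delivery: the work has already been done
```

Calling `handle_once(conn, "evt-1", "page published")` twice applies the work once and returns `False` the second time. Missed events are a separate problem and would still need some form of reconciliation, for example periodically comparing the content store against the `processed` table.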

Issue 4: web-server-centric integration

Integrating other systems with WSC approaches requires access to the front-end web servers, often requires custom web services (or other extension points) to expose the specific APIs, and further burdens the already overloaded web server.

An alternative approach

An SOA design based on loosely coupled subsystems, installed on the same or multiple machines, would solve these issues directly.  No additional hardware or software is required to support this approach, but enterprise quality could be achieved by adding discovery, orchestration, enterprise service bus and management systems.

Most of the systems I’ve worked with already have the necessary service boundaries in their internal API.  To be truly SOA these interfaces should be adapted to support coarse-grained messages and properly controlled boundaries (exception shielding, process isolation, independent security).
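To show what a coarse-grained message and exception shielding might look like at such a boundary, here is a small Python sketch. Every name in it (`PublishPageRequest`, `ServiceResponse`, `publish_page`, the `PUBLISH_FAILED` code) is hypothetical and not taken from any real CMS; `_publish` stands in for whatever fine-grained internal API already exists.

```python
import logging
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class PublishPageRequest:
    """One coarse-grained message carrying everything the operation needs."""
    page_id: str
    title: str
    body: str
    tags: tuple = ()


@dataclass(frozen=True)
class ServiceResponse:
    ok: bool
    error_code: str = ""
    reference: str = ""  # correlation ID the caller can quote to support


def _publish(request: PublishPageRequest) -> None:
    """Stand-in for the existing fine-grained internal API (hypothetical)."""
    if not request.title:
        raise ValueError("title missing for page %s" % request.page_id)


def publish_page(request: PublishPageRequest) -> ServiceResponse:
    """Service boundary: internal exceptions never cross it."""
    try:
        _publish(request)
        return ServiceResponse(ok=True)
    except Exception:
        # Exception shielding: log the details internally and return only
        # a generic error code plus a reference for later diagnosis.
        reference = uuid.uuid4().hex
        logging.getLogger("cms.boundary").exception(
            "publish failed ref=%s", reference
        )
        return ServiceResponse(
            ok=False, error_code="PUBLISH_FAILED", reference=reference
        )
```

The caller sends one self-contained message instead of a series of fine-grained calls, and on failure sees only an error code and a reference, never a stack trace or internal detail.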

What do you think?  Do you agree that many low to middle range web CMS solutions get it wrong?  Do you know of one that does not suffer from these criticisms?


One thought on “Scalability and Architecture Issues of SME Web Content Management Systems”

  1. Bas Groot

    I totally agree that a web server centric CMS is a fundamentally flawed construct.
    Apart from issues 1-4 there are even more objections!

    Web servers are designed from the ground up to serve files such as .html, .gif and .jpeg, not to serve applications.
    Because the web server starts and kills many processes all the time, web server applications are structurally crippled:
    – all these processes start up and contain the -entire- application suite many times over
    – enormous overreliance on the database for cached and temporary data shared between different user sessions
    – overreliance on file caching to avoid the side effects of the above
    – server-initiated data and integration tasks require multi-tier U-turn constructs
    – very hard to guarantee one-time execution of any background processing task
    – slow, complex (integration) requests are done inline, occupying all web server sockets

    In a nutshell: web servers are not designed to function as application servers.

    I practise what I preach: I have a web application platform that is not web-server-centric but runs as a separate daemon next to the web server.
    So these nasty web server centric challenges are just not there.

    – Issue 1: web-server-centric scaling and caching
    Because the memory footprint is considerably smaller, the web server can be configured to serve much larger numbers of requests.
    Nothing is cached, which saves web server memory and avoids consistency and security issues altogether.
    It’s extremely fast: one recent 8-core pizza-box server can easily process 5 to 10 million personalised page views per day.
    I said personalised. Even for a site with an audience of millions, just a handful of servers will suffice.

    – Issue 2: web-server-centric connectivity and control
    The web server is just a “passthrough hatch”; the WAXTRAPP application logic is responsible for security and has much finer control and much more security intelligence than a web server ever could.

    – Issue 3: web-server-centric extensibility
    WAXTRAPP is designed as an extensible web application platform and all extending takes place right there.
    The web server is not involved in extending as it just passes on HTTP requests to WAXTRAPP.
    The web server config is hardly ever changed: domains, URLs and sites are all virtual and managed from WAXTRAPP.

    – Issue 4: web-server-centric integration
    WAXTRAPP takes the integration burden. If it’s about remote web services requests, it’s all handled at WAXTRAPP.
    For server-initiated processing and integration tasks, WAXTRAPP manages that and has complete control.

    Because WAXTRAPP does the application work and the web server only does web serving,
    – these two tasks do not depend on each other’s load
    – the entire application suite is not running in many duplicates

    This construct essentially means that the middle tier is a server of its own.
    This way one can place a much larger request load on each server and greatly simplify the development of high-performance web applications.

    And it is a mid-market/mid-price system.
    Sorry that it may sound like an ad but I really mean to be informing…
