Everything you wanted to know about Movable Type’s Publish Queue

I recently added the following chapter to [Movable Type’s Open Source Operations Manual](http://www.majordojo.com/2009/03/an-open-source-movable-type-operations-manual.php) and wanted to publish it here for review and feedback from the community.
## About Publish Queue
The Movable Type Publish Queue is an essential component of any large-scale Movable Type-powered web site because it plays a crucial role in optimizing publishing performance. Using the publish queue has a number of benefits:
* It **eliminates redundant, duplicated and unnecessary publication** of files.
* It **offloads publishing to a standalone process** which can be throttled and scaled independently of the Movable Type web application itself.
* It **speeds up the commenting experience** by reducing the number of files that an end user must wait to be published prior to being able to navigate the web site again.
## How it Works
It might be best to describe how the publish queue works by examining a scenario in which it would be utilized: republishing the necessary files in response to a comment.
### Adding Jobs to the Queue
When a comment comes in to Movable Type, multiple files often need to be updated: the comment must be published to the entry’s permalink page, and any other pages that display a comment count for that entry may need to be updated as well.
Each of those pages (assuming they are configured to be published via the publish queue) will then be added to the “publish queue.” When this happens, a publishing “job” is created and added to the database for each page that needs to be published. There is one row in the database for each individual job in the system.
Now let’s assume for a moment that shortly after receiving the first comment, a second one is published by a different visitor to your web site. This action also results in pages needing to be republished. This time, however, before those pages are added to the queue as jobs, the system checks whether a job corresponding to each page is already on the queue. If there is, the new job is discarded because it would only duplicate work that is already scheduled. If the job is not already on the queue, then it is added. This ensures that no unnecessary work is performed by the system.
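To make the duplicate check concrete, here is a minimal sketch in Python. It is not Movable Type's actual code (which is Perl and stores jobs in the database); the `PublishQueue` class, its `pending` dictionary and the file paths are all made up for illustration.

```python
class PublishQueue:
    """Toy in-memory stand-in for the job table: one entry per pending page."""

    def __init__(self):
        self.pending = {}  # file path -> job record (hypothetical structure)

    def enqueue(self, path):
        """Add a job for `path` unless one is already waiting.

        Returns True if a job was added, False if it was discarded as a duplicate.
        """
        if path in self.pending:
            return False  # an existing job will publish this page anyway
        self.pending[path] = {"path": path}
        return True


queue = PublishQueue()
# First comment: both pages are queued.
first = [queue.enqueue(p) for p in ("entry/hello.html", "index.html")]
# Second comment, moments later: the same pages are already waiting.
second = [queue.enqueue(p) for p in ("entry/hello.html", "index.html")]
print(first, second, len(queue.pending))
```

Two comments arrive, but only two jobs (one per page) end up on the queue.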
In addition, each page that is added to the publish queue is given a priority which dictates the order in which the corresponding job will be processed. The higher the priority, the sooner the system will work on the job. Movable Type assigns priority based upon the following criteria:

| Page/Template Type | Priority |
| --- | --- |
| Preferred Page and Entry archives | 10 |
| Index templates with a filename beginning with “index” or “default” | 9 |
| Feed index templates | 9 |
| All other index templates | 8 |
| Non-preferred Page and Entry archives | 5 |
| Daily archives | 4 |
| Weekly archives | 3 |
| Monthly archives | 2 |
| Any Category archive | 1 |
| Any Author archive | 1 |
| Yearly archives | 1 |

And that is how jobs are added to the queue. A separate process is then responsible for publishing them.
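As a rough illustration of how those priorities translate into processing order, the sketch below keeps jobs in a Python heap and always pops the highest priority first. The priority numbers come from the table above; the dictionary keys, the helper functions and the tie-breaking rule (equal priorities are served in insertion order) are my own assumptions, not something Movable Type documents.

```python
import heapq
import itertools

# Priorities from the table above; the keys are made-up labels.
PRIORITY = {
    "preferred_entry_archive": 10,
    "main_index": 9,
    "feed_index": 9,
    "other_index": 8,
    "non_preferred_entry_archive": 5,
    "daily_archive": 4,
    "weekly_archive": 3,
    "monthly_archive": 2,
    "category_archive": 1,
    "author_archive": 1,
    "yearly_archive": 1,
}

_counter = itertools.count()  # assumed tie-breaker: insertion order


def push(heap, template_type, path):
    # heapq is a min-heap, so negate the priority to pop the highest first.
    heapq.heappush(heap, (-PRIORITY[template_type], next(_counter), path))


def pop(heap):
    return heapq.heappop(heap)[2]


heap = []
push(heap, "monthly_archive", "2009/03/index.html")
push(heap, "preferred_entry_archive", "2009/03/hello.html")
push(heap, "main_index", "index.html")
push(heap, "feed_index", "atom.xml")
order = [pop(heap) for _ in range(4)]
print(order)
```

The entry's permalink page comes out first and the monthly archive last, regardless of the order in which they were queued.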
### Creating Publish Queue Workers
One or more publish queue “workers” can be created to process jobs on the queue. The number of workers needed by a system is based largely upon two variables:
* The capacity of any one worker to process jobs on the queue.
* The volume of jobs being added to the queue over time.
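Those two variables lend themselves to a back-of-the-envelope estimate. The sketch below is only that: the function name, the `headroom` parameter and the sample numbers are hypothetical, and real publishing times vary a great deal from template to template.

```python
import math


def workers_needed(jobs_per_minute, seconds_per_job, headroom=1.0):
    """Estimate how many workers keep up with the rate of incoming jobs.

    One worker finishes 60 / seconds_per_job jobs per minute; a `headroom`
    above 1.0 adds spare capacity for bursts.
    """
    per_worker_rate = 60.0 / seconds_per_job
    return max(1, math.ceil(jobs_per_minute * headroom / per_worker_rate))


# 120 jobs arrive per minute and each takes about 2 seconds to publish,
# so one worker handles 30 jobs per minute.
exact = workers_needed(jobs_per_minute=120, seconds_per_job=2)
padded = workers_needed(jobs_per_minute=120, seconds_per_job=2, headroom=1.5)
print(exact, padded)
```

If the queue keeps growing over time, the volume of jobs is outpacing the capacity of the workers and it is time to add another one.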
A worker is created by running the “run-periodic-tasks” script that comes with every copy of Movable Type. This script can be run in three modes:
* **daemon mode** – in this mode the script never quits; instead it constantly monitors the job queue and begins work on a job almost the instant it becomes available.
* **run-once** – in this mode the script is run via the command line and will quit only after there is no more work on the queue to be done.
* **scheduled task** – in this mode the script is executed in the “run-once” mode periodically according to a schedule defined by cron or a similar service.
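The difference between the modes comes down to what the worker does when the queue is empty. The following Python sketch mimics that behavior; it is not the `run-periodic-tasks` script itself, and every name in it (`run_once`, `run_daemon`, `publish`, `keep_running`) is invented for the example.

```python
import time
from collections import deque


def publish(path, published):
    """Stand-in for real publishing: just record the path."""
    published.append(path)


def run_once(queue, published):
    """Run-once mode: work until the queue is empty, then return."""
    while queue:
        publish(queue.popleft(), published)


def run_daemon(queue, published, sleep_seconds=1.0, keep_running=lambda: True):
    """Daemon mode: never quit on an empty queue; sleep briefly and poll again."""
    while keep_running():
        if queue:
            publish(queue.popleft(), published)
        else:
            time.sleep(sleep_seconds)


queue = deque(["index.html", "atom.xml"])
published = []
run_once(queue, published)
print(published, len(queue))
```

A scheduled task is simply `run_once` invoked on a timer by something like cron, which means a job may wait up to one scheduling interval before a worker picks it up.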
### Processing Jobs on the Queue
Each worker will monitor the queue for jobs. When one becomes available it is pulled off the queue to be worked on. Once it is “off the queue” no other workers can claim it. This makes sure that no two workers are trying to work on the same job at the same time.
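Here is one way to picture that guarantee, sketched in Python with threads standing in for worker processes. A lock plays the role of whatever atomic update the database performs; the `JobTable` class and its fields are hypothetical and say nothing about how the real job table is laid out.

```python
import threading


class JobTable:
    """Toy job table; the lock stands in for an atomic database update."""

    def __init__(self, paths):
        self._lock = threading.Lock()
        self._claimed_by = {path: None for path in paths}

    def claim(self, worker_id):
        """Atomically claim one unclaimed job, or return None if none is left."""
        with self._lock:
            for path, owner in self._claimed_by.items():
                if owner is None:
                    self._claimed_by[path] = worker_id
                    return path
        return None


table = JobTable(["index.html", "atom.xml", "archives.html"])
claimed = []  # (worker_id, path) pairs; list.append is thread-safe in CPython


def worker(worker_id):
    # Keep claiming jobs until the table has nothing left to hand out.
    while (path := table.claim(worker_id)) is not None:
        claimed.append((worker_id, path))


threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

paths = sorted(path for _, path in claimed)
print(paths)
```

Four workers compete for three jobs, yet each job is claimed exactly once.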
In the event that something goes wrong during the publishing process and the file is not published, the system will notice, saying something akin to, “uh-oh, look at this job that was claimed on the queue, but was never successfully finished,” and then free up the job for a worker to pick up and try again. If the task is retried more than 5 times, then the job is marked as failed and left on the queue. In this state it is possible for a similar job to be placed on the queue, and if the problem that caused the publishing failure is not transient, then that job is likely to fail again.
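The retry behavior can be sketched as a small state machine. This is an illustration only: the job dictionary, the state names and the exact way attempts are counted are my assumptions, with the limit of 5 taken from the paragraph above.

```python
MAX_RETRIES = 5  # from the text: more than 5 retries marks the job as failed


def attempt(job, publish):
    """Run one publishing attempt and return the job's resulting state."""
    try:
        publish(job["path"])
    except OSError as err:
        job["failures"] += 1
        job["last_error"] = str(err)
        # Release the claim for another try unless the retry budget is spent.
        job["state"] = "failed" if job["failures"] > MAX_RETRIES else "pending"
    else:
        job["state"] = "done"
    return job["state"]


def always_fails(path):
    # Simulates a non-transient problem, such as an unwritable directory.
    raise OSError(f"cannot write {path}")


job = {"path": "index.html", "failures": 0, "state": "pending"}
states = []
while job["state"] == "pending":
    states.append(attempt(job, always_fails))
print(states, job["failures"])
```

With a failure that never goes away, the job bounces back to the queue five times and is then marked as failed, which is why a freshly queued job for the same page is likely to meet the same fate.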
An important thing to note is that once a job has been pulled off the queue by a worker, it remains possible for that same page to be added to the queue again in response to another comment. The rationale is that by the time the page has finished being rebuilt it is most likely out of date, and so needs to be published again.
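Put differently, the duplicate check only looks at jobs that are still waiting, not at jobs a worker has already claimed. A minimal Python sketch of that rule, using a made-up `PublishQueue` class with two hypothetical sets:

```python
class PublishQueue:
    def __init__(self):
        self.waiting = set()      # jobs no worker has claimed yet
        self.in_progress = set()  # jobs a worker is currently publishing

    def enqueue(self, path):
        """Only waiting jobs count as duplicates; in-progress ones do not."""
        if path in self.waiting:
            return False
        self.waiting.add(path)
        return True

    def claim(self, path):
        """Move a job from waiting to in progress."""
        self.waiting.remove(path)
        self.in_progress.add(path)


queue = PublishQueue()
queue.enqueue("entry/hello.html")
queue.claim("entry/hello.html")                  # a worker starts rebuilding the page
added_again = queue.enqueue("entry/hello.html")  # another comment arrives meanwhile
print(added_again)
```

The second job is accepted, so the page is rebuilt once more after the worker finishes and the newest comment is not lost.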
## What Powers It?
The Publish Queue is powered by a standalone job/queue management library called “The Schwartz.” The Schwartz is actually a more generic and abstract job management system capable of processing any number of tasks via a similar queuing mechanism.
For the time being, Movable Type only utilizes the Schwartz for publishing, but in the future may use this framework for sending emails or other non-critical system tasks.
## Publish Queue Tools

Publish Queue Manager screenshot

There is one tool in particular that is recommended for most systems that utilize the Publish Queue, aptly named the Publish Queue Manager.
This tool provides a user interface within Movable Type that allows administrators to monitor and inspect jobs on the queue. Each job can be deleted, or have its priority changed.
For more information, visit the plugin’s web site at the following URL:
[http://www.majordojo.com/projects/movable-type/publish-queue-manager/](http://www.majordojo.com/projects/movable-type/publish-queue-manager/)
## Additional Reading
To learn more about the Publish Queue, consider reading the following resources:
* [Using the Publish Queue](http://www.movabletype.org/documentation/administrator/publishing/publish-queue.html)
* [Setting up run-periodic-tasks](http://www.movabletype.org/documentation/administrator/setting-up-run-periodic-taskspl.html)
* [Scalable Publishing Models in Movable Type](http://www.movabletype.org/documentation/enterprise/publish-queue.html)
* [The Schwartz Homepage](http://search.cpan.org/~bradfitz/TheSchwartz-1.07/lib/TheSchwartz.pm)
By the way, a new version of the Movable Type Operations Manual is now [available](http://www.majordojo.com/projects/downloads/MTOpsManual.pdf).


4 Comments on “Everything you wanted to know about Movable Type’s Publish Queue”

  1. Retroriff says:

    You don’t need MT publish queue if your comments need approval, am I right?

  2. Byrne says:

    I should add a section called “When to Use Publish Queue.”
    But in regards to your question, that is not necessarily a rule. Regardless of your moderation policy, publishing a comment *may* result in a large number of files to be republished. And whenever that is the case, publishing in the background is preferred.
    More importantly however, when you publish content statically (i.e. *not* via publish queue) someone, whether a visitor or an editor from within the admin, has to wait for those pages to publish. I personally hate to wait, and would much rather the app return control to me so that I can go about my business.
    However, I almost always have permalink pages publish statically/synchronously, that way you are guaranteed that when a user returns to the entry they just commented on that they will see their comment listed there.

  3. Hans says:

    Great article. Which templates do you suggest putting in the publish queue, and which should publish statically for faster commenting?
    By the way your Ajax commenting does not work.

  4. Hans says:

    Ah, commenting does work, but I did not get a message that it was being published. I thought it was freezing. Sorry.

