Reducing the memory usage of the XML-RPC Endpoint in WordPress

WordPress uses the XML-RPC remote publishing interface to provide a standardized way to transfer data between a third-party client, such as a mobile app, and the CMS. Many popular clients now use this interface to transfer posts, pages, and comments, but unfortunately it requires a lot of memory on the server side when used to publish pictures or videos. I wanted to reduce the memory usage of XML-RPC request parsing in WordPress, and make it possible to upload large pictures and videos through the XML-RPC Endpoint on cheap shared hosting installations with a strict PHP memory limit.

A quick recap. WordPress uses a conventional approach to XML-RPC request parsing: the whole XML-RPC request is loaded into memory ($HTTP_RAW_POST_DATA), and intermediate values are often stored in memory as well. Since a single XML-RPC request document might contain a large picture or a video, the parsing process can take a lot of memory. For instance, when you upload a 3 MB picture, you need to upload at least 4 MB of data, because base64 encoding increases the image size by approximately 33%. Hence the full parsing procedure requires at least 7 MB of memory: a 4 MB variable that holds the XML-RPC document and 3 MB for the variable that holds the decoded image data. That’s not good. If you want to upload a short 100 MB video recorded on your mobile device, you may have to set a really high value for the ‘memory_limit’ option, which may not be possible on shared hosting.
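The arithmetic can be checked with a short sketch. It is written in Python purely for illustration (the plugin itself is PHP), and the helper names `base64_encoded_size` and `peak_memory_estimate` are made up for this example. It assumes base64 without line breaks, which maps every 3 input bytes to 4 output characters, and it ignores the XML markup around the payload.

```python
import base64

MB = 1024 * 1024


def base64_encoded_size(raw_bytes: int) -> int:
    """Length of the base64 text for raw_bytes of input (no line breaks)."""
    # Every group of up to 3 input bytes becomes exactly 4 output characters.
    return 4 * ((raw_bytes + 2) // 3)


def peak_memory_estimate(raw_bytes: int) -> int:
    """Rough lower bound: the encoded request plus the decoded file, both in memory."""
    return base64_encoded_size(raw_bytes) + raw_bytes


# Sanity check of the formula against the standard library encoder.
assert len(base64.b64encode(b"x" * 1000)) == base64_encoded_size(1000)

for size_mb in (3, 100):
    raw = size_mb * MB
    print(
        f"{size_mb} MB file: "
        f"{base64_encoded_size(raw) / MB:.1f} MB encoded, "
        f"at least {peak_memory_estimate(raw) / MB:.1f} MB in memory"
    )
```

The estimate is a lower bound only: any extra copy made while parsing adds to it.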

That said, I thought the XML-RPC parsing process could be improved by using a streaming XML parser that reads the data directly from the input stream and stores intermediate values on disk. In detail, my plan was to get rid of $HTTP_RAW_POST_DATA, read the request from the php://input stream, parse it with a streaming parser, store intermediate values on disk, and pack these changes into a plugin.

Unfortunately I hit a dead end: PHP always populates the variable $HTTP_RAW_POST_DATA on POST requests with a “text/xml” content type. Even if you set the directive always_populate_raw_post_data to Off, PHP populates that variable. The only exceptions are requests with a content type of “application/x-www-form-urlencoded” or “multipart/form-data”. I investigated further, and it seems that there isn’t a simple way to get rid of $HTTP_RAW_POST_DATA without modifying the PHP source code. Shared hosting installations therefore won’t get a huge benefit from this plugin, since they can’t install a modified version of PHP. It can still help a little, though, because it stores intermediate values on disk rather than in memory.

The plugin does the following:

  • Gets the PHP input stream (php://input).
  • Reads from the input stream and parses the content chunk by chunk (instead of using $HTTP_RAW_POST_DATA).
  • Doesn’t store any partial values, or the final parsed value, in memory. It uses a temporary file in the tmp directory instead (only for base64 data).
  • Changes the function mw_newMediaObject to use a path to the input file on disk rather than accepting the whole content as a parameter.
  • Introduces a new function that copies the uploaded file to the right location.
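The first three steps can be sketched as follows. This is a minimal illustration of the technique in Python, not the plugin's PHP code: the class `Base64ToDisk`, the function `parse_request`, the chunk size, and the simplified request body are all assumptions made for the example. An incremental Expat parser is fed the request chunk by chunk, and the text inside each `<base64>` element is decoded and written straight to a temporary file, so neither the encoded nor the decoded payload is ever held in memory in full.

```python
import base64
import io
import os
import tempfile
import xml.parsers.expat

CHUNK_SIZE = 8192  # bytes read from the request stream per iteration (arbitrary)


class Base64ToDisk:
    """Expat handlers that spool the content of <base64> elements to temp files."""

    def __init__(self):
        self.paths = []      # paths of the temp files written so far
        self._out = None     # open temp file while inside a <base64> element
        self._pending = ""   # base64 characters not yet forming a 4-character group

    def start(self, name, attrs):
        if name == "base64":
            self._out = tempfile.NamedTemporaryFile(delete=False, suffix=".upload")
            self._pending = ""

    def chars(self, data):
        if self._out is None:
            return  # text outside <base64> is ignored in this sketch
        # Drop whitespace, then decode only complete 4-character groups.
        text = self._pending + "".join(data.split())
        usable = len(text) - len(text) % 4
        self._out.write(base64.b64decode(text[:usable]))
        self._pending = text[usable:]

    def end(self, name):
        if name == "base64" and self._out is not None:
            if self._pending:
                raise ValueError("truncated base64 payload")
            self._out.close()
            self.paths.append(self._out.name)
            self._out = None


def parse_request(stream):
    """Parse an XML-RPC request from a binary stream; return the temp file paths."""
    handler = Base64ToDisk()
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = handler.start
    parser.EndElementHandler = handler.end
    parser.CharacterDataHandler = handler.chars
    while True:
        chunk = stream.read(CHUNK_SIZE)
        if not chunk:
            break
        parser.Parse(chunk, False)   # feed one chunk; more data will follow
    parser.Parse(b"", True)          # signal the end of the document
    return handler.paths


# Demo with a simplified request body (a real newMediaObject call carries more fields).
payload = bytes(range(256)) * 100
body = (
    b"<methodCall><methodName>metaWeblog.newMediaObject</methodName>"
    b"<params><param><value><base64>"
    + base64.b64encode(payload)
    + b"</base64></value></param></params></methodCall>"
)
paths = parse_request(io.BytesIO(body))
with open(paths[0], "rb") as f:
    restored = f.read()
os.remove(paths[0])
print(len(paths), "file written,", len(restored), "bytes decoded")
```

The demo builds the request in memory only to have something to parse; in the scenario described above the stream would be the request body itself. The decoder keeps at most three leftover characters between callbacks, because base64 can only be decoded in groups of four.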

If you want to test it, here is the link to the code. (Note: this plugin is only an ‘alpha’ version; do not use it in production.) I don’t think I will develop it further, since a REST(ful) API has already been published on WordPress.com, and I hope it will soon be adopted in core too.

Special thanks to Luca Ercoli, who assisted me with the PHP core stuff.
