Generating PDFs from HTML – The Easy Way

17 Aug 2010

Recently, I had a spec to produce PDF downloads/exports of a reporting suite. On the face of it, this doesn’t sound like a particularly complicated task, but after some research it turned out to be much more complicated than it needed to be.

The programming language was PHP, with some parts being converted to C++ using Facebook’s HipHop. I’ll therefore be talking about PHP in this post.

Googling for ways to produce PDFs from HTML using PHP brings up a number of solutions.

Most of these have some basic requirements, and some common limitations:

  • They need valid XHTML (unfortunately I was dealing with invalid XHTML)
  • None of them can replicate what JS would have done to a page
  • None of them can replicate what Flash would look like on the page
  • They are slow
  • Their XHTML & CSS support is limited (especially HTML5 & CSS3)
  • They would have been slow to implement (I was going to have to write a lot of code)

So I carried on looking; I knew I needed a better solution than any of these offered. Eventually I stumbled upon WKHTMLTOPDF. This is basically just a server-side install of WebKit, with a wrapper around it to focus and extend some of its functionality. Its key function is rendering web pages (including JavaScript execution) and saving them as PDFs.

This makes generating PDFs really easy. All you have to do is write the HTML, CSS & JS that would have gone to the browser to a file, pass this file to WKHTMLTOPDF, and serve the generated PDF to the browser!
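One way to get hold of "the HTML that would have gone to the browser" is PHP's output buffering. The following is only a minimal sketch, assuming PHP 7.1 or later; `renderReportPage()` and `captureToFile()` are hypothetical names invented for illustration, standing in for whatever code normally echoes your page.

```php
<?php
// Hypothetical stand-in for the code that normally echoes the page to the browser.
function renderReportPage(): void
{
    echo '<html><head><title>Report</title></head>';
    echo '<body><h1>Monthly report</h1></body></html>';
}

// Run $render with output buffering switched on, so its output is collected
// in memory instead of being sent to the client, then write it to $file.
function captureToFile(callable $render, string $file): string
{
    ob_start();
    try {
        $render();
    } finally {
        // Always close the buffer, even if $render throws.
        $html = ob_get_clean();
    }
    file_put_contents($file, $html);
    return $html;
}

// Assumed location: a uniquely named file in the system temp directory.
$htmlFile = sys_get_temp_dir() . '/' . uniqid('pdf_', true) . '.html';
$html = captureToFile('renderReportPage', $htmlFile);
```

After this runs, `$htmlFile` holds the same markup the browser would have received, ready to be handed to WKHTMLTOPDF.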

Here are a few lines of PHP to show just how easy it is to do!

  // get the full html that would be sent to the browser.
  $html = 'All the HTML of the page, with full server-side links to images & css & javascript eg. /var/www/css/global.css';

  // unique temporary name, placed under /tmp so the same path is used
  // for writing the HTML, running WKHTMLTOPDF and serving the PDF
  $tempFilename = '/tmp/' . uniqid('export_');

  // write the html
  $fp = fopen($tempFilename . '.html', 'w');
  fwrite($fp, $html);
  fclose($fp);

  // download and put WKHTMLTOPDF anywhere you like on your server
  // this is linux here. It must be executable by apache (or whatever web server you're using)
  $path = '/usr/local/bin/WKHTMLTOPDF';

  // this also adds in a delay of 800 milliseconds, to allow JS execution to finish
  // (note the double hyphens on the options; the paths are quoted for the shell)
  $cmd = $path . ' --enable-plugins --javascript-delay 800 '
      . escapeshellarg($tempFilename . '.html') . ' '
      . escapeshellarg($tempFilename . '.pdf');
  exec($cmd);

  // serve the PDF
  header('Content-Type: application/pdf');
  header('Content-Disposition: attachment; filename="export.pdf"');
  readfile($tempFilename . '.pdf');

  // cleanup
  unlink($tempFilename . '.html');
  unlink($tempFilename . '.pdf');
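The snippet above never checks whether the conversion actually worked. A slightly more defensive variant, sketched here under assumptions rather than taken from the original project, builds the command in one place and checks the exit status of `exec()` before anything is served. The function names `buildPdfCommand()` and `convertHtmlToPdf()` are made up for illustration, and the code assumes PHP 7 or later.

```php
<?php
// Build the shell command. Every path is quoted with escapeshellarg()
// so spaces or shell metacharacters cannot break the command line.
function buildPdfCommand(string $binary, string $htmlFile, string $pdfFile, int $jsDelayMs = 800): string
{
    return escapeshellarg($binary)
        . ' --enable-plugins'
        . ' --javascript-delay ' . $jsDelayMs
        . ' ' . escapeshellarg($htmlFile)
        . ' ' . escapeshellarg($pdfFile);
}

// Run the conversion and report success only if the process exited
// with status 0 and a non-empty PDF file exists afterwards.
function convertHtmlToPdf(string $binary, string $htmlFile, string $pdfFile): bool
{
    // Redirect stderr so any error text lands in $output rather than the server log.
    exec(buildPdfCommand($binary, $htmlFile, $pdfFile) . ' 2>&1', $output, $exitCode);
    return $exitCode === 0 && is_file($pdfFile) && filesize($pdfFile) > 0;
}

// Example: show the command that would be run (paths are placeholders).
echo buildPdfCommand('/usr/local/bin/WKHTMLTOPDF', '/tmp/report.html', '/tmp/report.pdf'), "\n";
```

With a check like this, the `header()` and `readfile()` calls can be wrapped in `if (convertHtmlToPdf(...))`, and an error page sent otherwise, so a failed conversion does not result in an empty download.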

And there you have it! Top quality PDFs, with minimal code!

About this blog

Blog of Jon Reed. I am a Senior Software Engineer at AOL UK. I believe in working hard & playing hard. I love gadgets and technology.
