Adds support for exporting wiki pages to PDF on the client-side using the web browser.
Recommended
Type: XAR
Category: Application
Developed by: XWiki Development Team
Active Installs: 42
Rating: 2 Votes
License: GNU Lesser General Public License 2.1
Compatibility: XWiki 14.2+

Installable with the Extension Manager

Description

Uses paged.js along with the CSS Paged Media Module and the CSS Generated Content for Paged Media Module to export wiki pages to PDF using the browser's print to PDF feature.

This application provides:

  • a default PDF template for basic needs
  • a template provider to help create new PDF templates
  • an administration section to configure the list of PDF templates the end user can select from
  • an improved PDF Export Options modal that allows the user to select the PDF template
  • a customized Export Modal that supports multi-page PDF export
  • a PDF export job that renders the selected XWiki pages on the server-side in a background (daemon) thread
  • components to print web pages to PDF on the server-side using a headless Chrome web browser running inside a Docker container

History

Originally the XWiki PDF Export feature was developed to work server-side. However, as XWiki's development progressed, more and more features were implemented in JavaScript, and the server-side PDF export cannot export changes made to the HTML DOM by JavaScript. That would require a JavaScript engine running on the server side, and it is not easy to integrate one that executes every JavaScript framework properly. We therefore decided to rewrite the PDF export feature, and this extension is the result.

This extension is currently experimental: it should work, but there are likely still important bugs and we need your input to report them.

Our goal, once this extension is reported to work well enough in a large variety of environments, is to include it in XWiki Standard by default and to replace the default server-side PDF export with it. This will likely occur in XWiki Standard 14.x (where x >= 7).

How it works

The Front-end

  • The user opens the "Export" modal using the "More Actions > Export" page menu, selects the pages to export and then clicks on the "Export as PDF" button which opens the "PDF Export Options" modal.
  • The user chooses the PDF export options and then clicks on the "Export" button.
  • The JavaScript click listener on the "Export" button makes an HTTP request to start the PDF export job on the back-end, passing the collected data (the list of pages to export, the PDF template, whether to generate the cover page and the table of contents, etc.); the HTTP response includes the id of the scheduled job.
  • The JavaScript code then makes subsequent HTTP requests to get the status of the PDF export job, passing the id received when the job was started, until the job ends (either successfully or with a failure).
  • (14.4.3+, 14.6+) The user can click on the "Cancel" button to cancel the running PDF export job; this sends an HTTP request to the back-end to stop the PDF export by setting the corresponding flag on the job status; the PDF export job won't stop immediately but as soon as it reads the cancel flag.
  • When the JavaScript code detects that the PDF export job finished (based on its status) it has two options:
    • if the job status specifies a PDF file, which is the case when the PDF is generated server-side, then it redirects the user to that file
    • otherwise it uses a hidden iframe to load the PDF template passing the id of the finished PDF export job, waits for everything to load and be ready for print then calls window.print() which opens the browser's print modal that the user can use to save the result as PDF
  • The PDF template uses the status of the PDF export job specified on the HTTP request to generate the HTML that is going to be printed to PDF
    • it uses paged.js to split the HTML content in print pages and to generate the PDF cover page, table of contents as well as the page header and footer
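
The front-end flow above boils down to a polling loop followed by a choice between redirecting and printing. Here is a minimal JavaScript sketch of that logic. The status fields (state, pdfFileURL), the state names and the callbacks (fetchStatus, redirect, printInIframe) are hypothetical, chosen for illustration; they are not the extension's actual API.

```javascript
// Hypothetical job status shape:
//   { state: 'RUNNING' | 'FINISHED' | 'FAILED', pdfFileURL?: string }
// Decide what the front-end should do next for a given job status.
function nextAction(status) {
  if (status.state === 'FAILED') {
    return 'report-error';
  }
  if (status.state !== 'FINISHED') {
    return 'poll-again';
  }
  // A PDF file on the status means the PDF was generated server-side.
  return status.pdfFileURL ? 'redirect' : 'print';
}

// Poll the job status until the job ends, then act on the result.
// fetchStatus(jobId) must return a promise resolving to the job status;
// redirect and printInIframe are callbacks provided by the caller.
async function waitForExport(jobId, {fetchStatus, redirect, printInIframe, intervalMs = 1000}) {
  for (;;) {
    const status = await fetchStatus(jobId);
    const action = nextAction(status);
    if (action === 'redirect') {
      return redirect(status.pdfFileURL);
    } else if (action === 'print') {
      // Client-side: load the PDF template in a hidden iframe, then call window.print().
      return printInIframe(jobId);
    } else if (action === 'report-error') {
      throw new Error('PDF export job ' + jobId + ' failed');
    }
    // The job is still running: wait before asking again.
    await new Promise(resolve => setTimeout(resolve, intervalMs));
  }
}
```

Keeping the decision in a pure function (nextAction) separates it from the HTTP and timing details, which makes the two outcomes (server-side file vs. browser print) easy to reason about.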

The Back-end

  • The PDF export job simply iterates the list of wiki pages to export and renders them to HTML, collecting the results, without aggregating them (this is done later by the PDF template)
  • The rendering results are exposed on the job status (to be read by the PDF template) but they are accessible only by the user that triggered the export.
  • If the configuration says that the PDF should be generated server-side then the PDF export job uses a dedicated component to generate the PDF using a headless Chrome web browser and saves the PDF file as a temporary resource, exposing its reference on the job status.
  • The PDF printer component is responsible for downloading the Docker image, creating the Docker container and connecting to the headless Chrome web browser running inside.
  • The PDF printer uses a separate browser context for each export, copying the cookies from the original request that triggered the PDF export in order to have the user authenticated
  • The PDF printer tells Chrome to open the PDF template and waits for everything to be ready before calling the Chrome API to save the web page as PDF, returning the generated PDF file to the PDF export job
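
The rendering loop and its cooperative cancellation can be pictured with the simplified sketch below. The real job runs in Java on the XWiki side; this JavaScript only mirrors the described logic, and renderPage, status.canceled, status.results and the state names are hypothetical.

```javascript
// Render each page to HTML and collect the results without aggregating them.
// The cancel flag is only checked between pages, so a cancel request takes
// effect when the loop next reads the flag, not immediately.
function runExportJob(pages, renderPage, status) {
  status.results = [];
  for (const page of pages) {
    if (status.canceled) {
      status.state = 'CANCELED';
      return status;
    }
    status.results.push({page, html: renderPage(page)});
  }
  status.state = 'FINISHED';
  return status;
}
```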

Export Modes

There are multiple ways in which the PDF can be generated and the application provides configuration options (in xwiki.properties) to choose what's best for you.

Managed Docker Container (Default)

By default the PDF is generated on the server-side using a headless Chrome web browser running inside a Docker container. The application takes care of:

  • pulling the right Docker image
  • creating the container and starting it
  • stopping the container at the end when XWiki shuts down

The requirements for this are:

  • Docker 20.10+ must be installed on the machine running XWiki (the servlet engine) if XWiki is not itself inside a Docker container (see the following section). The reason is that in this case (XWiki running outside Docker, on the same machine as the Docker daemon) the Chrome browser running inside a Docker container needs to access the XWiki instance running on the Docker host. This is possible thanks to the magic host-gateway value that was introduced in Docker 20.10 and which we use when creating the Chrome container like this: --add-host=host.xwiki.internal:host-gateway.
  • the OS user running XWiki (e.g. "tomcat") must be allowed to use Docker (e.g. on Linux this usually means adding the user to the "docker" group so that it has access to the Docker socket)
  • internet access to pull the Docker image

Docker out of Docker

If XWiki is also running inside a Docker container then:

  • you need to bind-mount the Docker socket so that XWiki can communicate with the Docker daemon in order to manage the headless Chrome container
  • you should create a Docker network, add the XWiki container to that network and configure XWiki to use it for the headless Chrome container so that they can communicate (XWiki needs to access the Chrome container for remote debugging and the Chrome container needs to be able to load XWiki pages)
    # Tell XWiki which Docker network to use to communicate with the headless Chrome container.
    export.pdf.dockerNetwork=xwiki-network
  • you have to specify in the XWiki configuration the host that the Chrome container can use to access XWiki (usually the network alias of the XWiki container or its IP address):
    # The host that the Chrome container uses to access XWiki.
    export.pdf.xwikiHost=xwiki-container

Note that in this case you can use an older version of Docker because being in the same network means XWiki and Chrome can talk to each other based on their network aliases or IP addresses. We don't need to rely on the magic host-gateway provided by Docker 20.10+.

Reusable Docker Container

If for some reason the machine running XWiki doesn't have internet access but it has Docker installed then you have the option to (re)use an existing Docker container with the headless Chrome web browser:

# Tell the application to reuse an existing Docker container.
export.pdf.chromeDockerContainerReusable=true
# Specify the name of the Docker container to reuse.
export.pdf.chromeDockerContainerName=headless-chrome-pdf-printer

In this case you are responsible for creating the headless Chrome container using a proper image. XWiki will be responsible for starting and stopping the Chrome container as needed. The requirements for this are:

  • Docker must be installed on the machine running XWiki (the servlet engine). No specific version of Docker is needed (from the point of view of XWiki), but you need to make sure that the Chrome container you create (for XWiki to reuse) can access the XWiki instance (specified using the export.pdf.xwikiHost configuration). Be aware that if XWiki runs on the same host as the Docker daemon (rather than inside its own Docker container) then you probably need to:
    • either set export.pdf.xwikiHost=host.docker.internal, if you are on Windows or macOS and have Docker 18.03+
    • or create the Chrome container with --add-host=host.xwiki.internal:host-gateway, if you are on Linux and have Docker 20.10+ (which supports the magic host-gateway)
  • the OS user running XWiki (e.g. "tomcat") must be allowed to use Docker (e.g. on Linux this usually means adding the user to the "docker" group so that it has access to the Docker socket)

If XWiki is also running inside a Docker container then check out the Docker out of Docker section above.

Remote Chrome Instance

If you don't want to rely on Docker, or you don't want to give XWiki access to Docker for security reasons, but you still want to perform the PDF export on the server side then you also have the option to connect to a remote Chrome instance:

# Specify the Chrome host and port so that we can connect for remote debugging.
export.pdf.chromeHost=172.17.0.3
export.pdf.chromeRemoteDebuggingPort=9222
# Specify how the remote Chrome instance can access the XWiki instance in order to load XWiki pages (print preview).
export.pdf.xwikiHost=172.17.0.2

Note that "remote" could also mean local if you use Docker containers like this:

  • run XWiki in a Docker container
  • run headless Chrome in a Docker container
  • put both containers in the same Docker network
  • configure chromeHost and xwikiHost (see above) either using the container IPs or their network aliases
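
Before pointing XWiki at a remote Chrome instance, it can help to check that the remote debugging endpoint is reachable from the machine running XWiki. Chrome serves a small JSON document at /json/version on the remote debugging port. The sketch below assumes Node.js 18+ (for the built-in fetch); the function names are illustrative and the host and port are the example values used above.

```javascript
// Build the URL of Chrome's DevTools version endpoint.
function devToolsVersionUrl(chromeHost, port = 9222) {
  return 'http://' + chromeHost + ':' + port + '/json/version';
}

// Resolve to the browser version string reported by Chrome,
// or reject if the endpoint cannot be reached.
async function checkChrome(chromeHost, port) {
  const response = await fetch(devToolsVersionUrl(chromeHost, port));
  if (!response.ok) {
    throw new Error('Chrome answered with HTTP ' + response.status);
  }
  const info = await response.json();
  return info.Browser;
}

// Usage (values taken from the example configuration above):
// checkChrome('172.17.0.3', 9222).then(console.log, console.error);
```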

User Browser

The last option is to generate the PDF using the user's web browser, on the client side. Obviously this has the downside that different users (with different web browsers or different versions of the same web browser) can get different results.

Since 14.4.3 and 14.6, you can opt for client-side PDF generation using the available global configuration:

# Use the user's browser to generate the PDF instead of a headless Chrome browser instance on the server-side (Docker).
export.pdf.serverSide=false

The PDF export job request also has a property to force the client-side generation for a custom export:

#set ($pdfExportJobRequest = $services.export.pdf.createRequest())
## Tell the PDF export job we want to generate the PDF on the client side.
#set ($discard = $pdfExportJobRequest.setServerSide(false))
## The PDF export job will only render the XWiki pages on the server side. Once the job is done you'll have to redirect
## the user to the print preview page with the job id in the query string.
#set ($pdfExportJob = $services.export.pdf.execute($pdfExportJobRequest))

Configuration Options

The following configuration options can be set from xwiki.properties:

# The Docker image used to create the Docker container running the headless Chrome web browser.
export.pdf.chromeDockerImage=zenika/alpine-chrome:latest

# The name of the Docker container running the headless Chrome web browser. This is especially useful when reusing an
# existing container.
export.pdf.chromeDockerContainerName=headless-chrome-pdf-printer

# Specify if the Docker container running the headless Chrome web browser can be reused across XWiki restarts. When
# false, the container is removed each time XWiki is stopped or restarted.
export.pdf.chromeDockerContainerReusable=false

# The name or id of the Docker network to add the Chrome Docker container to; this is useful when XWiki itself runs
# inside a Docker container and you want to have the Chrome container in the same network in order for them to
# communicate. The default value "bridge" represents the default Docker network.
export.pdf.dockerNetwork=bridge

# The host running the headless Chrome web browser, specified either by its name or by its IP address. This allows you
# to use a remote Chrome instance, running on a separate machine, rather than a Chrome instance running in a Docker
# container on the same machine; defaults to empty value, meaning that by default the PDF export is done using the
# Chrome instance running in the specified Docker container.
export.pdf.chromeHost=

# The port number used for communicating with the headless Chrome web browser.
export.pdf.chromeRemoteDebuggingPort=9222

# The host name or IP address that the headless Chrome browser should use to access the XWiki instance (i.e. the print
# preview page); defaults to "host.xwiki.internal" which means the host running the Docker daemon; if XWiki runs itself
# inside a Docker container then you should use the assigned network alias, provided both containers (XWiki and Chrome)
# are in the same Docker network.
export.pdf.xwikiHost=host.xwiki.internal

# [Since 14.4.3]
# [Since 14.6RC1]
# Whether the PDF export should be performed server-side, e.g. using a headless Chrome web browser running inside a
# Docker container, or client-side, using the user's web browser instead; defaults to server-side PDF generation
export.pdf.serverSide=true
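
As a summary of how these options relate to the export modes described earlier, here is an illustrative JavaScript sketch that parses xwiki.properties-style text and names the resulting mode. It is not XWiki's code: the function names and the precedence between the options are assumptions inferred from the descriptions on this page.

```javascript
// Parse "key=value" lines, ignoring blank lines and # comments.
function parseProperties(text) {
  const properties = {};
  for (const line of text.split('\n')) {
    const trimmed = line.trim();
    if (trimmed === '' || trimmed.startsWith('#')) {
      continue;
    }
    const index = trimmed.indexOf('=');
    if (index > 0) {
      properties[trimmed.slice(0, index).trim()] = trimmed.slice(index + 1).trim();
    }
  }
  return properties;
}

// Name the export mode implied by the export.pdf.* options (assumed precedence).
function exportMode(properties) {
  if (properties['export.pdf.serverSide'] === 'false') {
    return 'user browser';
  }
  if (properties['export.pdf.chromeHost']) {
    return 'remote Chrome instance';
  }
  if (properties['export.pdf.chromeDockerContainerReusable'] === 'true') {
    return 'reusable Docker container';
  }
  return 'managed Docker container';
}
```

With no export.pdf.* options set, exportMode reports the managed Docker container, matching the default described in the Export Modes section.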

Script Service

The application provides a script service that can be used to perform custom PDF exports:

## Create a PDF export job request based on the current servlet request.
#set ($pdfExportJobRequest = $services.export.pdf.createRequest())

## Customize the PDF export job request:
#set ($discard = $pdfExportJobRequest.setDocuments($documentReferenceList))
#set ($discard = $pdfExportJobRequest.setTemplate($templateDocumentReference))
#set ($discard = $pdfExportJobRequest.setWithCover(true))
#set ($discard = $pdfExportJobRequest.setWithToc(false))
#set ($discard = $pdfExportJobRequest.setWithHeader(true))
#set ($discard = $pdfExportJobRequest.setWithFooter(false))
#set ($discard = $pdfExportJobRequest.setServerSide(true))

## Trigger the PDF export job and wait for it to finish.
#set ($pdfExportJob = $services.export.pdf.execute($pdfExportJobRequest))
#set ($discard = $pdfExportJob.join())

## Get the PDF file reference from the job status.
#set ($pdfExportJobStatus = $pdfExportJob.status)
#set ($pdfFileReference = $pdfExportJobStatus.getPDFFileReference())
#if ($services.resource.temporary.exists($pdfFileReference))
  #set ($pdfFileURL = $services.resource.temporary.getURL($pdfFileReference))

  ## Redirect the user to the generated PDF file.
  #set ($discard = $response.sendRedirect($pdfFileURL))
#end

Prerequisites & Installation Instructions

We recommend using the Extension Manager to install this extension (make sure that the text "Installable with the Extension Manager" is displayed at the top right of this page to know whether this extension can be installed with the Extension Manager). Note that installing extensions while offline is currently not supported, and you would need to use a complex manual method.

You can also use the following manual method, which is useful if this extension cannot be installed with the Extension Manager or if you're using an old version of XWiki that doesn't have the Extension Manager:

  1. Log in to the wiki with a user having Administration rights
  2. Go to the Administration page and select the Import category
  3. Follow the on-screen instructions to upload the downloaded XAR
  4. Click on the uploaded XAR and follow the instructions
  5. You'll also need to install all dependent Extensions that are not already installed in your wiki

Dependencies

Dependencies for this extension (org.xwiki.platform:xwiki-platform-export-pdf-ui 14.6):
