A Guide to Gotenberg

Gotenberg is an open-source, Docker-based API for converting files to PDF.

Starting Up Gotenberg

# Run container in the foreground
docker run --rm -p 3000:3000 gotenberg/gotenberg:7
# Run container in the background
docker run --rm -d -p 3000:3000 gotenberg/gotenberg:7

View console output of Gotenberg

docker container ls
docker logs -f <containerId>

API Usage

Health Check

http://localhost:3000/health
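
The health endpoint can also be polled from application code, for example before submitting conversion jobs. The following is a minimal sketch using the JDK's built-in java.net.http client (Java 11+); the class name GotenbergHealth, the 5-second timeouts, and the decision to treat only HTTP 200 as "up" are assumptions made for illustration.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

final class GotenbergHealth {

    // Builds the health-check URI from a base URL such as "http://localhost:3000".
    static URI healthUri(String baseUrl) {
        String trimmed = baseUrl.endsWith("/")
                ? baseUrl.substring(0, baseUrl.length() - 1)
                : baseUrl;
        return URI.create(trimmed + "/health");
    }

    // Returns true when the endpoint answers with HTTP 200.
    // Treating every other status as "not ready" is an assumption of this sketch.
    static boolean isUp(String baseUrl) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(5))
                .build();
        HttpRequest request = HttpRequest.newBuilder(healthUri(baseUrl))
                .timeout(Duration.ofSeconds(5))
                .GET()
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        return response.statusCode() == 200;
    }
}
```

Calling GotenbergHealth.isUp("http://localhost:3000") requires a running container; a connection failure surfaces as an IOException rather than a false return value.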

Chromium

The Chromium module interacts with the Chromium browser to convert HTML documents to PDF.

Converting a URL to PDF

curl \
--request POST 'http://localhost:3000/forms/chromium/convert/url' \
--form 'url="https://my.url"' \
-o my.pdf
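
The same request can be built in Java without third-party libraries. The JDK's java.net.http client has no multipart helper, so this sketch assembles the multipart/form-data body by hand for the single url field. The class name ChromiumUrlRequest and the fixed boundary string are assumptions; a production client would typically generate a random boundary that cannot occur in the field value.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

final class ChromiumUrlRequest {

    // Arbitrary boundary chosen for this sketch; it must not appear in the field value.
    static final String BOUNDARY = "gotenberg-example-boundary";

    // Encodes a single text field as a complete multipart/form-data body.
    static String multipartTextField(String boundary, String name, String value) {
        return "--" + boundary + "\r\n"
                + "Content-Disposition: form-data; name=\"" + name + "\"\r\n"
                + "\r\n"
                + value + "\r\n"
                + "--" + boundary + "--\r\n";
    }

    // Builds (but does not send) the POST request for the URL conversion route.
    static HttpRequest build(String baseUrl, String pageUrl) {
        String body = multipartTextField(BOUNDARY, "url", pageUrl);
        return HttpRequest.newBuilder(URI.create(baseUrl + "/forms/chromium/convert/url"))
                .header("Content-Type", "multipart/form-data; boundary=" + BOUNDARY)
                .POST(HttpRequest.BodyPublishers.ofString(body, StandardCharsets.UTF_8))
                .build();
    }
}
```

To write the result to disk, the request could be sent with HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofFile(Path.of("my.pdf"))), checking the status code before trusting the file contents.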

Converting local HTML files to PDF

The uploaded HTML file must be named index.html.

curl \
--request POST 'http://localhost:3000/forms/chromium/convert/html' \
--form 'files=@"/file/path/to/index.html"' \
-o my.pdf

LibreOffice

The LibreOffice module interacts with LibreOffice to convert documents to PDF.

The module can convert files with the following extensions to PDF:

`.bib` `.doc` `.xml` `.docx` `.fodt` `.html` `.ltx` `.txt` `.odt` `.ott` `.pdb` `.pdf` `.psw` `.rtf` `.sdw` `.stw` `.sxw` `.uot` `.vor` `.wps` `.epub` `.png` `.bmp` `.emf` `.eps` `.fodg` `.gif` `.jpg` `.met` `.odd` `.otg` `.pbm` `.pct` `.pgm` `.ppm` `.ras` `.std` `.svg` `.svm` `.swf` `.sxd` `.sxw` `.tiff` `.xhtml` `.xpm` `.fodp` `.potm` `.pot` `.pptx` `.pps` `.ppt` `.pwp` `.sda` `.sdd` `.sti` `.sxi` `.uop` `.wmf` `.csv` `.dbf` `.dif` `.fods` `.ods` `.ots` `.pxl` `.sdc` `.slk` `.stc` `.sxc` `.uos` `.xls` `.xlt` `.xlsx` `.tif` `.jpeg` `.odp`

Converting local document files to PDF

Relative file path

curl \
--request POST 'http://localhost:3000/forms/libreoffice/convert' \
--form 'files=@"test.docx"' \
-o test.pdf

Absolute file path

curl \
--request POST 'http://localhost:3000/forms/libreoffice/convert' \
--form 'files=@"D:\My Workspace\test.docx"' \
-o test.pdf

Multiple files

curl \
--request POST 'http://localhost:3000/forms/libreoffice/convert' \
--form 'files=@"/path/to/file.docx"' \
--form 'files=@"/path/to/file.xlsx"' \
-o my.zip
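
When several files are sent, the response is a ZIP archive rather than a single PDF, so a client has to unpack it. A minimal sketch using only java.util.zip follows; the class name ZipResponse is hypothetical, and the entry names inside the archive are whatever the server chose, so the code makes no assumption about them.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

final class ZipResponse {

    // Unpacks a ZIP archive held in memory into a map of entry name -> file bytes,
    // preserving the order in which entries appear in the archive.
    static Map<String, byte[]> extract(byte[] zipBytes) throws IOException {
        Map<String, byte[]> files = new LinkedHashMap<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!entry.isDirectory()) {
                    // readAllBytes() stops at the end of the current entry.
                    files.put(entry.getName(), zip.readAllBytes());
                }
            }
        }
        return files;
    }
}
```

Holding every entry in memory is a simplification; for large batches, streaming each entry to disk would be the safer choice.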

Sending a request with Java OkHttp

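// Requires OkHttp (okhttp3) and Apache Commons IO on the classpath.
import java.io.IOException;
import java.io.InputStream;
import java.net.URLConnection;
import java.util.concurrent.TimeUnit;

import okhttp3.MediaType;
import okhttp3.MultipartBody;
import okhttp3.OkHttpClient;
import okhttp3.Request;
import okhttp3.RequestBody;
import okhttp3.Response;
import org.apache.commons.io.IOUtils;

private static final String GOTENBERG_SERVICE_HOST = "http://xxx.xxx.xxx.xxx:3000";

private static final String CONVERT_TO_PDF_URI = "/forms/libreoffice/convert";

// Shared client; converting large documents may need a longer read timeout.
private static final OkHttpClient OKHTTP_CLIENT = new OkHttpClient.Builder()
        .connectTimeout(20, TimeUnit.SECONDS)
        .readTimeout(15, TimeUnit.SECONDS)
        .writeTimeout(15, TimeUnit.SECONDS)
        .build();

public static byte[] convertToPdfByGotenberg(InputStream inputStream, String fileName) throws IOException {
    // Fall back to a generic media type when the extension is not recognized.
    String mediaType = URLConnection.guessContentTypeFromName(fileName);
    if (mediaType == null) {
        mediaType = "application/octet-stream";
    }
    // The document is sent in a multipart form field named "files".
    MultipartBody body = new MultipartBody.Builder()
            .setType(MultipartBody.FORM)
            .addFormDataPart("files", fileName,
                    RequestBody.create(MediaType.parse(mediaType),
                            IOUtils.toByteArray(inputStream)))
            .build();
    Request request = new Request.Builder()
            .url(GOTENBERG_SERVICE_HOST + CONVERT_TO_PDF_URI)
            .post(body)
            .build();
    try (Response response = OKHTTP_CLIENT.newCall(request).execute()) {
        // Without this check, an error body would be returned as if it were a PDF.
        if (!response.isSuccessful()) {
            throw new IOException("Gotenberg conversion failed: HTTP " + response.code());
        }
        return response.body().bytes();
    }
}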

PDF Engines

Merge multiple PDF files into a single PDF file.

curl \
--request POST 'http://localhost:3000/forms/pdfengines/merge' \
--form 'files=@"/path/to/pdf1.pdf"' \
--form 'files=@"/path/to/pdf2.pdf"' \
--form 'files=@"/path/to/pdf3.pdf"' \
--form 'files=@"/path/to/pdf4.pdf"' \
-o my.pdf

Webhook

The Webhook module provides a middleware that allows you to upload the output file from multipart/form-data routes to the destination of your choice.

curl \
--request POST 'http://localhost:3000/forms/chromium/convert/url' \
--header 'Gotenberg-Webhook-Extra-Http-Headers: {"MyHeader": "MyValue"}' \
--header 'Gotenberg-Webhook-Url: https://my.webhook.url' \
--header 'Gotenberg-Webhook-Method: PUT' \
--header 'Gotenberg-Webhook-Error-Url: https://my.webhook.error.url' \
--header 'Gotenberg-Webhook-Error-Method: POST' \
--form 'url="https://my.url"'

The middleware reads the following headers:

  • Gotenberg-Webhook-Url - the callback URL to use (required).
  • Gotenberg-Webhook-Error-Url - the callback URL to use when an error occurs (required).
  • Gotenberg-Webhook-Method - the HTTP method to use (POST, PATCH, or PUT; default POST).
  • Gotenberg-Webhook-Error-Method - the HTTP method to use when an error occurs (POST, PATCH, or PUT; default POST).
  • Gotenberg-Webhook-Extra-Http-Headers - the extra HTTP headers to send to both URLs (JSON format).
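
Because two of these headers are required and the method headers accept only three values, it can help to validate them on the client before sending the request. The sketch below builds the header map from the rules listed above; the class name WebhookHeaders and the choice to omit optional headers when their argument is null are assumptions of this example, not part of Gotenberg.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

final class WebhookHeaders {

    // Methods listed as accepted in the header descriptions above.
    private static final Set<String> ALLOWED_METHODS = Set.of("POST", "PATCH", "PUT");

    // Builds the webhook headers. method, errorMethod and extraHeadersJson may be
    // null, in which case the header is omitted and the server default applies.
    static Map<String, String> build(String url, String errorUrl, String method,
                                     String errorMethod, String extraHeadersJson) {
        requirePresent(url, "Gotenberg-Webhook-Url");
        requirePresent(errorUrl, "Gotenberg-Webhook-Error-Url");

        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Gotenberg-Webhook-Url", url);
        headers.put("Gotenberg-Webhook-Error-Url", errorUrl);
        if (method != null) {
            headers.put("Gotenberg-Webhook-Method", requireAllowed(method));
        }
        if (errorMethod != null) {
            headers.put("Gotenberg-Webhook-Error-Method", requireAllowed(errorMethod));
        }
        if (extraHeadersJson != null) {
            // Passed through unchanged; this sketch does not validate the JSON.
            headers.put("Gotenberg-Webhook-Extra-Http-Headers", extraHeadersJson);
        }
        return headers;
    }

    private static void requirePresent(String value, String header) {
        if (value == null || value.isBlank()) {
            throw new IllegalArgumentException(header + " is required");
        }
    }

    private static String requireAllowed(String method) {
        if (!ALLOWED_METHODS.contains(method)) {
            throw new IllegalArgumentException("Unsupported webhook method: " + method);
        }
        return method;
    }
}
```

Each entry of the returned map can then be copied onto the outgoing request, for example with Request.Builder.addHeader in OkHttp.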

Common Errors

Error: file name is too long

{"level":"error","ts":1649232587.172472,"logger":"api","msg":"create request context: copy to disk: create local file: open /tmp/9e10e36d-c5a9-4623-9fac-92db4a0d0982/xxx.doc: file name too long","trace":"de0b5ce2-5a99-406e-a61d-4abb65ef0294","remote_ip":"xxx.xxx.xxx.xxx","host":"xxx.xxx.xxx.xxx:3000","uri":"/forms/libreoffice/convert","method":"POST","path":"/forms/libreoffice/convert","referer":"","user_agent":"okhttp/4.9.3","status":500,"latency":13161094736,"latency_human":"13.161094736s","bytes_in":6983097,"bytes_out":21}

Solutions

Shorten the file name before uploading the file.
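
One way to apply this on the client is to truncate the base name while keeping the extension, since the extension is what the conversion route validates. The sketch below assumes a limit of 255 bytes, which is the usual per-name limit on Linux filesystems; whether that exact limit applies inside your container is an assumption, and the class name FileNames is hypothetical. Length is measured in UTF-8 bytes rather than characters, and truncation never splits a multi-byte character.

```java
import java.nio.charset.StandardCharsets;

final class FileNames {

    // Assumed limit: 255 bytes per file name, as on common Linux filesystems.
    static final int MAX_NAME_BYTES = 255;

    // Returns fileName unchanged if it fits, otherwise truncates the base name
    // so that base name plus extension fit within maxBytes (UTF-8 encoded).
    static String shorten(String fileName, int maxBytes) {
        if (utf8Length(fileName) <= maxBytes) {
            return fileName;
        }
        int dot = fileName.lastIndexOf('.');
        String base = dot > 0 ? fileName.substring(0, dot) : fileName;
        String ext = dot > 0 ? fileName.substring(dot) : "";
        int budget = maxBytes - utf8Length(ext);
        if (budget <= 0) {
            throw new IllegalArgumentException("Extension alone exceeds the limit");
        }
        StringBuilder kept = new StringBuilder();
        int used = 0;
        for (int i = 0; i < base.length(); ) {
            int codePoint = base.codePointAt(i);
            int size = utf8Length(new String(Character.toChars(codePoint)));
            if (used + size > budget) {
                break;
            }
            kept.appendCodePoint(codePoint);
            used += size;
            i += Character.charCount(codePoint);
        }
        return kept + ext;
    }

    private static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }
}
```

A call such as FileNames.shorten(fileName, FileNames.MAX_NAME_BYTES) would be made just before the name is passed to the multipart form.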

Error: file name contains non-ASCII characters

{"level":"error","ts":1649234692.9638329,"logger":"api","msg":"convert to PDF: unoconv PDF: unix process error: wait for unix process: exit status 6","trace":"28d9a196-10e5-4c7d-af6a-178494f49cd1","remote_ip":"xxx.xxx.xxx.xxx","host":"xxx.xxx.xxx.xxx:3000","uri":"/forms/libreoffice/convert","method":"POST","path":"/forms/libreoffice/convert","referer":"","user_agent":"okhttp/4.9.3","status":500,"latency":130617774,"latency_human":"130.617774ms","bytes_in":11559,"bytes_out":21}

Solutions

Encode the file name with URLEncoder.encode(fileName, "UTF-8") before adding it to the form data.
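
A small helper keeps the encoding in one place. This sketch uses the URLEncoder.encode(String, Charset) overload available since Java 10, which avoids the checked UnsupportedEncodingException thrown by the String-based overload; the class name SafeFileName is hypothetical.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class SafeFileName {

    // Percent-encodes a file name so that it contains only ASCII characters.
    // Dots are left untouched, so the extension survives; spaces become '+'.
    static String encode(String fileName) {
        return URLEncoder.encode(fileName, StandardCharsets.UTF_8);
    }
}
```

Note that each non-ASCII byte becomes three characters, so an encoded name is longer than the original and may need shortening to avoid the file-name-too-long error described earlier.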

Error: file extension is not supported

{"level":"error","ts":1649234777.9096093,"logger":"api","msg":"validate form data: no form file found for extensions: [.bib .doc .xml .docx .fodt .html .ltx .txt .odt .ott .pdb .pdf .psw .rtf .sdw .stw .sxw .uot .vor .wps .epub .png .bmp .emf .eps .fodg .gif .jpg .jpeg .met .odd .otg .pbm .pct .pgm .ppm .ras .std .svg .svm .swf .sxd .sxw .tif .tiff .xhtml .xpm .odp .fodp .potm .pot .pptx .pps .ppt .pwp .sda .sdd .sti .sxi .uop .wmf .csv .dbf .dif .fods .ods .ots .pxl .sdc .slk .stc .sxc .uos .xls .xlt .xlsx]","trace":"10e6ffd8-2d00-4374-b6fb-0c4ff6af3043","remote_ip":"xxx.xxx.xxx.xxx","host":"xxx.xxx.xxx.xxx:3000","uri":"/forms/libreoffice/convert","method":"POST","path":"/forms/libreoffice/convert","referer":"","user_agent":"okhttp/4.9.3","status":400,"latency":3885166,"latency_human":"3.885166ms","bytes_in":11551,"bytes_out":449}

Solutions

Make sure every file in the form data has one of the supported extensions listed in the error message.
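
Checking the extension on the client avoids a round trip that is certain to fail. The sketch below copies the extension list from the error message above; the class name SupportedExtensions is hypothetical, and the exact, case-sensitive comparison is a conservative assumption, since the log does not show how the server treats upper-case extensions.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class SupportedExtensions {

    // Extensions copied from the error message. A HashSet is used because the
    // list contains a repeated entry (.sxw), which Set.of would reject.
    private static final Set<String> EXTENSIONS = new HashSet<>(Arrays.asList((
            ".bib .doc .xml .docx .fodt .html .ltx .txt .odt .ott .pdb .pdf .psw "
            + ".rtf .sdw .stw .sxw .uot .vor .wps .epub .png .bmp .emf .eps .fodg "
            + ".gif .jpg .jpeg .met .odd .otg .pbm .pct .pgm .ppm .ras .std .svg "
            + ".svm .swf .sxd .sxw .tif .tiff .xhtml .xpm .odp .fodp .potm .pot "
            + ".pptx .pps .ppt .pwp .sda .sdd .sti .sxi .uop .wmf .csv .dbf .dif "
            + ".fods .ods .ots .pxl .sdc .slk .stc .sxc .uos .xls .xlt .xlsx"
    ).split(" ")));

    // Returns true when the file name ends with one of the listed extensions.
    // The comparison is exact (case-sensitive), which is an assumption.
    static boolean isSupported(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0) {
            return false;
        }
        return EXTENSIONS.contains(fileName.substring(dot));
    }
}
```

A caller could reject the upload early when SupportedExtensions.isSupported(fileName) returns false, instead of waiting for the HTTP 400 response.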
