pdftopdf: Added pdftopdf-form-flattening option to select form flattening method.
tillkamppeter committed Jan 14, 2019
1 parent 86288fa commit 8022c03
Showing 5 changed files with 130 additions and 46 deletions.
49 changes: 47 additions & 2 deletions README
@@ -28,7 +28,7 @@ INTRODUCTION
by both CUPS and this package. Then the filters of this package
should be used.

For compiling and using this package CUPS, libqpdf (8.1.0 or
For compiling and using this package CUPS, libqpdf (8.3.0 or
newer), libjpeg, libpng, libtiff, freetype, fontconfig, liblcms
(liblcms2 recommended), libavahi-common, libavahi-client, libdbus,
and glib are needed. It is highly recommended, especially if
@@ -845,9 +845,54 @@ non-landscape pages are rotated instead.
Note: Some pages might end up 180 degree rotated (instead of 0 degree).
Those should probably be rotated manually before binding the pages together.

3) Method of flattening interactive PDF forms and annotations.

Some PDF files (like application forms) contain interactive forms
which the user can fill in inside a PDF viewer like evince. The filled
in data is not integrated in each page of the PDF file but stored in
an extra layer. Due to this the data gets lost when applying
manipulations like scaling or N-up to the pages. To prevent the loss
of the data pdftopdf flattens the form before doing the
manipulations. This means the PDF will be converted into a static PDF
file with the data being integral part of the pages.

The same flattening is needed for annotations in PDF files.

By default the actual flattening work is done by QPDF, as QPDF is also
doing everything else in pdftopdf. This way no external utilities need
to be called and so extra piping between processes and extra PDF
interpreter runs are avoided which makes the filtering process faster.

As the new QPDF-based form flattening has not yet been tested with
thousands of PDF files and has not yet been available to actual users,
it may still contain bugs. To let users work around possible bugs in
QPDF's form flattening, we have introduced an option to switch back to
the old flattening by the external tools pdftocairo or Ghostscript.

The selection of the method is done by the "pdftopdf-form-flattening"
option, setting it to "auto", "qpdf", "pdftocairo", "ghostscript",
"gs", "internal" or "external":

Per-job: lpr -o pdftopdf-form-flattening=pdftocairo ...
Per-queue default: lpadmin -p printer -o pdftopdf-form-flattening-default=gs
Remove default: lpadmin -p printer -R pdftopdf-form-flattening-default

If the option is not supplied, pdftopdf uses QPDF; the settings "auto"
and "internal" also select QPDF. "external" auto-selects from the two
external utilities, trying pdftocairo first and, on failure,
Ghostscript. If the selected utility fails, the form stays unflattened,
so the filled-in data will possibly not get printed.

Native PDF Printer / JCL Support
--------------------------------

Note that for most modern native PDF printers JCL is not needed any
more as they are controlled via IPP. For these the PPD files get
auto-generated by the support of CUPS and cups-filters for driverless
IPP printing.

pdftopdf will emit JCL when provided with a PPD file that includes the
"*JCLToPDFInterpreter:" keyword.

@@ -915,7 +960,7 @@ License

pdftopdf is released under the MIT license.

The required libqpdf is available under version 2 of the Artistic License,
The required libqpdf is available under version 2.0 of the Apache License,
e.g. here: https://github.com/qpdf/qpdf
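
The option values listed in the README hunk above could be mapped to a single flattening method roughly as sketched below. This is only an illustration under stated assumptions: the `FlattenMethod` enum and `parseFlattenMethod()` are hypothetical names that do not exist in cups-filters, and the commit itself keeps separate `int` flags in `pdftopdf.cc` rather than an enum.

```cpp
#include <cassert>
#include <cstdio>
#include <strings.h>  // strcasecmp (POSIX)

// Hypothetical enum for illustration; the commit uses separate int flags.
enum class FlattenMethod { Qpdf, Pdftocairo, Ghostscript, ExternalAuto };

// Maps a "pdftopdf-form-flattening" value to a method, case-insensitively.
// A missing (NULL) or unrecognized value falls back to QPDF, which is the
// default described in the README text.
static FlattenMethod parseFlattenMethod(const char *val)
{
  if (!val)
    return FlattenMethod::Qpdf;
  if (strcasecmp(val, "qpdf") == 0 || strcasecmp(val, "internal") == 0 ||
      strcasecmp(val, "auto") == 0)
    return FlattenMethod::Qpdf;
  if (strcasecmp(val, "external") == 0)
    return FlattenMethod::ExternalAuto;
  if (strcasecmp(val, "pdftocairo") == 0)
    return FlattenMethod::Pdftocairo;
  if (strcasecmp(val, "ghostscript") == 0 || strcasecmp(val, "gs") == 0)
    return FlattenMethod::Ghostscript;
  fprintf(stderr,
          "WARNING: Invalid value for \"pdftopdf-form-flattening\": \"%s\"\n",
          val);
  return FlattenMethod::Qpdf;
}
```

With this sketch, `parseFlattenMethod("gs")` and `parseFlattenMethod("Ghostscript")` yield the same method, and an invalid value only produces a warning while QPDF stays selected.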


97 changes: 67 additions & 30 deletions filter/pdftopdf/pdftopdf.cc
@@ -1026,7 +1026,7 @@ int main(int argc,char **argv)
//param.mirror=true;
//param.reverse=true;
//param.numCopies=3;
if (!proc1->loadFilename("in.pdf")) return 2;
if (!proc1->loadFilename("in.pdf",1)) return 2;
param.dump();
if (!processPDFTOPDF(*proc1,param)) return 3;
emitComment(*proc1,param);
@@ -1063,30 +1063,61 @@ int main(int argc,char **argv)
param.dump();
#endif

/* Check with which method we will flatten interactive PDF forms
and annotations so that they get printed also after page
manipulations (scaling, N-up, ...). Flattening means to
integrate the filled in data and the printable annotations into
the pages themselves instead of holding them in an extra
layer. Default method is using QPDF, alternatives are the
external utilities pdftocairo or Ghostscript, but these make
the processing slower, especially due to extra piping of the
data between processes. */
int qpdf_flatten = 1;
int pdftocairo_flatten = 0;
int gs_flatten = 0;
int external_auto_flatten = 0;
const char *val;
if ((val = cupsGetOption("pdftopdf-form-flattening", num_options, options)) != NULL) {
if (strcasecmp(val, "qpdf") == 0 || strcasecmp(val, "internal") == 0 ||
strcasecmp(val, "auto") == 0) {
qpdf_flatten = 1;
} else if (strcasecmp(val, "external") == 0) {
qpdf_flatten = 0;
external_auto_flatten = 1;
} else if (strcasecmp(val, "pdftocairo") == 0) {
qpdf_flatten = 0;
pdftocairo_flatten = 1;
} else if (strcasecmp(val, "ghostscript") == 0 || strcasecmp(val, "gs") == 0) {
qpdf_flatten = 0;
gs_flatten = 1;
} else
fprintf(stderr,
"WARNING: Invalid value for \"pdftopdf-form-flattening\": \"%s\"\n", val);
}

cupsFreeOptions(num_options,options);

std::unique_ptr<PDFTOPDF_Processor> proc(PDFTOPDF_Factory::processor());

FILE *tmpfile = NULL;
if (argc==7) {
if (!proc->loadFilename(argv[6])) {
if (!proc->loadFilename(argv[6],qpdf_flatten)) {
ppdClose(ppd);
return 1;
}
} else {
tmpfile = copy_stdin_to_temp();
if ((!tmpfile)||
(!proc->loadFile(tmpfile,WillStayAlive))) {
(!proc->loadFile(tmpfile,WillStayAlive,qpdf_flatten))) {
ppdClose(ppd);
return 1;
}
}

/* The input file contains a PDF form. To not loose the data filled
into the form during our further manipulations we need to flatten
the form, meaning that we integrate the filled in data into the
pages themselves instead of holding them in an extra layer */
if (proc->hasAcroForm()) {
/* If the input file contains a PDF form and we opted for not
using QPDF for flattening the form, we pipe the PDF through
pdftocairo or Ghostscript here */
if (!qpdf_flatten && proc->hasAcroForm()) {
/* Prepare the input file for being read by the form flattening
process */
FILE *infile = NULL;
@@ -1119,22 +1150,26 @@ int main(int argc,char **argv)
const char *command;
cups_array_t *args;
/* Choose the utility to be used and create its command line */
/* Try pdftocairo first, the preferred utility for form-flattening */
command = CUPS_POPPLER_PDFTOCAIRO;
args = cupsArrayNew(NULL, NULL);
cupsArrayAdd(args, strdup(command));
cupsArrayAdd(args, strdup("-pdf"));
cupsArrayAdd(args, strdup("-"));
cupsArrayAdd(args, strdup(buf));
/* Run the pdftocairo form flattening process */
rewind(infile);
int status = sub_process_spawn (command, args, infile);
cupsArrayDelete(args);
if (status == 0)
flattening_done = 1;
else {
error("Unable to execute pdftocairo for form flattening!");
/* Try Ghostscript, currently the only working alternative */
if (pdftocairo_flatten || external_auto_flatten) {
/* Try pdftocairo first, the preferred utility for form-flattening */
command = CUPS_POPPLER_PDFTOCAIRO;
args = cupsArrayNew(NULL, NULL);
cupsArrayAdd(args, strdup(command));
cupsArrayAdd(args, strdup("-pdf"));
cupsArrayAdd(args, strdup("-"));
cupsArrayAdd(args, strdup(buf));
/* Run the pdftocairo form flattening process */
rewind(infile);
int status = sub_process_spawn (command, args, infile);
cupsArrayDelete(args);
if (status == 0)
flattening_done = 1;
else
error("Unable to execute pdftocairo for form flattening!");
}
if (flattening_done == 0 &&
(gs_flatten || external_auto_flatten)) {
/* Try Ghostscript */
command = CUPS_GHOSTSCRIPT;
args = cupsArrayNew(NULL, NULL);
cupsArrayAdd(args, strdup(command));
@@ -1157,11 +1192,12 @@ int main(int argc,char **argv)
cupsArrayDelete(args);
if (status == 0)
flattening_done = 1;
else {
else
error("Unable to execute Ghostscript for form flattening!");
error("No suitable utility for flattening filled PDF forms available, no flattening performed. Filled in content will not be printed.");
rewind(infile);
}
}
if (flattening_done == 0) {
error("No suitable utility for flattening filled PDF forms available, no flattening performed. Filled in content will possibly not be printed.");
rewind(infile);
}
/* Clean up */
if (infile != tmpfile)
@@ -1170,12 +1206,13 @@ int main(int argc,char **argv)
if (flattening_done) {
rewind(outfile);
unlink(buf);
if (!proc->loadFile(outfile,TakeOwnership)) {
if (!proc->loadFile(outfile,TakeOwnership,0)) {
error("Unable to create a PDF processor on the flattened form!");
return 1;
}
}
}
} else if (qpdf_flatten)
fprintf(stderr, "DEBUG: PDF interactive form and annotation flattening done via QPDF\n");

/* TODO
// color management
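
The external flattening path in the hunks above tries pdftocairo first and then Ghostscript, depending on which flags are set. A minimal sketch of that ordering follows. It is an assumption-laden illustration and not the filter's code: `Flattener`, `candidates()` and `flattenWithFallback()` are hypothetical names, and each utility is replaced by a callable that returns 0 on success in place of a real `sub_process_spawn()` call.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for one external utility: a label plus a callable
// that returns 0 on success, like the exit status of a spawned process.
struct Flattener {
  std::string name;
  std::function<int()> run;
};

// Builds the ordered candidate list from the three flags used in the diff:
// pdftocairo is queued first, Ghostscript second.
static std::vector<Flattener> candidates(bool pdftocairo_flatten,
                                         bool gs_flatten,
                                         bool external_auto_flatten,
                                         const Flattener &cairo,
                                         const Flattener &gs)
{
  std::vector<Flattener> list;
  if (pdftocairo_flatten || external_auto_flatten)
    list.push_back(cairo);
  if (gs_flatten || external_auto_flatten)
    list.push_back(gs);
  return list;
}

// Runs the candidates in order and returns the name of the first one that
// succeeds, or an empty string if all of them failed (form stays unflattened).
static std::string flattenWithFallback(const std::vector<Flattener> &list)
{
  for (const Flattener &f : list)
    if (f.run() == 0)
      return f.name;
  return "";
}
```

In this sketch, "external" mode with a failing pdftocairo falls through to Ghostscript, while an explicit "pdftocairo" selection that fails yields no flattening at all, which matches the behavior the README describes.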
4 changes: 2 additions & 2 deletions filter/pdftopdf/pdftopdf_processor.h
@@ -119,8 +119,8 @@ class PDFTOPDF_Processor { // abstract interface
virtual ~PDFTOPDF_Processor() {}

// TODO: ... qpdf wants password at load time
virtual bool loadFile(FILE *f,ArgOwnership take=WillStayAlive) =0;
virtual bool loadFilename(const char *name) =0;
virtual bool loadFile(FILE *f,ArgOwnership take=WillStayAlive,int flatten_forms=1) =0;
virtual bool loadFilename(const char *name,int flatten_forms=1) =0;

// TODO? virtual bool may_modify/may_print/?
virtual bool check_print_permissions() =0;
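
The interface change above adds `flatten_forms` with a default of 1, so existing callers that pass no flag keep the QPDF flattening. The following sketch illustrates that design choice with a simplified, hypothetical interface (`Processor`, `RecordingProcessor` and `load()` are invented for this example and are not the cups-filters classes). One point worth noting: C++ resolves default arguments from the static type of the call expression, so keeping the same default in the base class and the override, as the commit does, avoids surprises.

```cpp
#include <cassert>

// Hypothetical, simplified interface modeled on the changed signatures.
struct Processor {
  virtual ~Processor() {}
  virtual bool load(const char *name, int flatten_forms = 1) = 0;
};

// Hypothetical implementation that only records which flag it received.
struct RecordingProcessor : Processor {
  int last_flatten = -1;  // -1 means load() has not been called yet

  bool load(const char *name, int flatten_forms = 1) override
  {
    last_flatten = flatten_forms;
    return name != nullptr;
  }
};
```

A caller holding a `Processor &` that calls `load("in.pdf")` gets `flatten_forms == 1`, while the external path can pass 0 explicitly to skip the QPDF flattening.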
20 changes: 11 additions & 9 deletions filter/pdftopdf/qpdf_pdftopdf_processor.cc
@@ -381,7 +381,7 @@ void QPDF_PDFTOPDF_Processor::error(const char *fmt,...) // {{{

// TODO? try/catch for PDF parsing errors?

bool QPDF_PDFTOPDF_Processor::loadFile(FILE *f,ArgOwnership take) // {{{
bool QPDF_PDFTOPDF_Processor::loadFile(FILE *f,ArgOwnership take,int flatten_forms) // {{{
{
closeFile();
if (!f) {
@@ -416,12 +416,12 @@ bool QPDF_PDFTOPDF_Processor::loadFile(FILE *f,ArgOwnership take) // {{{
error("loadFile with MustDuplicate is not supported");
return false;
}
start();
start(flatten_forms);
return true;
}
// }}}

bool QPDF_PDFTOPDF_Processor::loadFilename(const char *name) // {{{
bool QPDF_PDFTOPDF_Processor::loadFilename(const char *name,int flatten_forms) // {{{
{
closeFile();
try {
@@ -431,20 +431,22 @@ bool QPDF_PDFTOPDF_Processor::loadFilename(const char *name) // {{{
error("loadFilename failed: %s",e.what());
return false;
}
start();
start(flatten_forms);
return true;
}
// }}}

void QPDF_PDFTOPDF_Processor::start() // {{{
void QPDF_PDFTOPDF_Processor::start(int flatten_forms) // {{{
{
assert(pdf);

QPDFAcroFormDocumentHelper afdh(*pdf);
afdh.generateAppearancesIfNeeded();
if (flatten_forms) {
QPDFAcroFormDocumentHelper afdh(*pdf);
afdh.generateAppearancesIfNeeded();

QPDFPageDocumentHelper dh(*pdf);
dh.flattenAnnotations(an_print);
QPDFPageDocumentHelper dh(*pdf);
dh.flattenAnnotations(an_print);
}

pdf->pushInheritedAttributesToPage();
orig_pages=pdf->getAllPages();
6 changes: 3 additions & 3 deletions filter/pdftopdf/qpdf_pdftopdf_processor.h
@@ -34,8 +34,8 @@ class QPDF_PDFTOPDF_PageHandle : public PDFTOPDF_PageHandle {

class QPDF_PDFTOPDF_Processor : public PDFTOPDF_Processor {
public:
virtual bool loadFile(FILE *f,ArgOwnership take=WillStayAlive);
virtual bool loadFilename(const char *name);
virtual bool loadFile(FILE *f,ArgOwnership take=WillStayAlive,int flatten_forms=1);
virtual bool loadFilename(const char *name,int flatten_forms=1);

// TODO: virtual bool may_modify/may_print/?
virtual bool check_print_permissions();
@@ -61,7 +61,7 @@ class QPDF_PDFTOPDF_Processor : public PDFTOPDF_Processor {
private:
void closeFile();
void error(const char *fmt,...);
void start();
void start(int flatten_forms);
private:
std::unique_ptr<QPDF> pdf;
std::vector<QPDFObjectHandle> orig_pages;