Extracting reusable document components for variable data printing
Bagley, Steven R.; Brailsford, David F.; Ollis, James A.
Abstract
Variable Data Printing (VDP) has brought new flexibility and dynamism to the printed page. Each printed instance of a specific class of document can now have different degrees of customized content within the document template.
This flexibility comes at a cost. If every printed page is potentially different from all others, it must be rasterized separately, which is a time-consuming process. Technologies such as PPML (Personalized Print Markup Language) attempt to address this problem by dividing the bitmapped page into components that can be cached at the raster level, thereby speeding up the generation of page instances.
A large number of documents are stored in Page Description Languages at a higher level of abstraction than the bitmapped page. Much of this content could be reused within a VDP environment provided that separable document components can be identified and extracted. These components then need to be individually rasterisable so that each high-level component can be related to its low-level (bitmap) equivalent. Unfortunately, the unstructured nature of most Page Description Languages makes it difficult to extract content easily.
This paper outlines the problems encountered in extracting component-based content from existing page description formats, such as PostScript, PDF and SVG, and how the differences between the formats affect the ease with which content can be extracted. The techniques are illustrated with reference to a tool called COG Extractor, which extracts content from PDF and SVG and prepares it for reuse.
Citation
Bagley, S. R., Brailsford, D. F., & Ollis, J. A. Extracting reusable document components for variable data printing. Presented at ACM Symposium on Document Engineering
Field | Value |
---|---|
Conference Name | ACM Symposium on Document Engineering |
End Date | Aug 31, 2007 |
Publication Date | Jan 1, 2007 |
Deposit Date | Jul 23, 2008 |
Publicly Available Date | Jul 23, 2008 |
Peer Reviewed | Peer Reviewed |
DOI | https://doi.org/10.1145/1284420.1284435 |
Keywords | PostScript, PDF, SVG, graphic objects, Content Extraction, Variable Data Printing. |
Public URL | https://nottingham-repository.worktribe.com/output/1017172 |
Publisher URL | http://doi.acm.org/10.1145/1284420.1284435 |