PDF/UA Technical Implementation Guide: Understanding ISO 14289-1 (PDF/UA-1)

1.0 Statement of Purpose

ISO 14289-1 is a terse document. This Guide is intended to provide implementers with information and examples to illuminate ISO 14289-1 beyond the text of the standard itself. While informative rather than normative, this document reflects the understandings and intentions of the US Committee for PDF/UA, lead authors of ISO 14289-1.

1.01 Document Version

This version of the PDF/UA-1 Technical Implementation Guide is designated 1.2a as approved by the US Committee for PDF/UA on October 3, 2013.

1.1 Expectations

For the most part, PDF/UA simply serves to require many elements that are optional in ISO 32000-1. Chapter 14 - the "Interchange" chapter of the PDF Reference - is the fundamental starting point for any reading of ISO 14289.

Within Chapter 14, the relevant sections are:

  • 14.6 Marked Content
  • 14.7 Logical Structure
  • 14.8 Tagged PDF
  • 14.9 Accessibility Support

It's important to understand what this Guide does not do:

  • This Guide does not substitute for ISO 14289-1; it simply provides software developers with additional information beyond the text of the International Standard.
  • This Guide is not a "how to" for tagging PDF documents.

1.2 Target Audience

  • Assistive Technology (AT) developers and vendors.
  • Software developers interested in writing and processing PDF files.
  • Those evaluating PDF processing software and AT.
  • Those interested in learning how PDF/UA looks forward to new features in PDF 2.0.

2.0 How to Use this Guide

Each section of this Guide should be read side-by-side with PDF/UA. This Guide is not intended to be an exhaustive expansion on the normative text of PDF/UA. Where the language of the Standard itself is suitably clear as written, no further explanation is offered.

2.1 Introduction to Concepts

In PDF, page content is represented by a sequence of graphics objects of type text, path, image or smooth shade, encoded in content streams (see ISO 32000-1, 8.2 Graphics Objects for a detailed explanation) and drawn one after another on a virtual canvas the size of the page. In many cases the order of the graphics objects on a given page does not reflect semantic aspects of the page content such as the intended reading order. To indicate the logical order of page content in a PDF, further data structures are required. Three such structures together provide this mechanism: Marked Content, Logical Structure and Tagged PDF.

The Marked Content mechanism provides a means of identifying sequences of graphics objects within a content stream. Logical Structure enables a document to contain a tree describing the logical hierarchy of content within the document. The Logical Structure uses the Marked Content mechanism to identify the content belonging to a given node or leaf in the tree (e.g. Heading or Paragraph). Tagged PDF makes it possible to apply semantic typing to content items identified by Logical Structure. ISO 32000-1, 14.8 Tagged PDF establishes a number of mandatory rules and optional recommendations for how to use these semantic types for content items.

Much of PDF/UA is aimed at ensuring that Tagged PDF rules are required and that optional mechanisms in the Logical Structure are mandated. The technical background for this model will be discussed in a separate document.

2.2 Introduction to Sections

The provisions in PDF/UA are grouped into three sections reflecting three different areas of conformance. This guide addresses them in sequence.

2.3 General Admonition

Many of the provisions in ISO 14289-1 are difficult or impossible to assess by software means alone. Validation often requires human interaction at some stage in authoring or validation processing. Examples include:

  • Assessing the correctness of a document's logical structure.
  • Determining the sufficiency of alternative text.

3.0 Relating PDF/UA to WCAG 2.0

The WAI's Web Content Accessibility Guidelines 2.0 (WCAG 2.0) provide normative text suitable to guiding development of accessible content in the web context. The scope of WCAG 2.0 is broad, encompassing files of many types, technical specifications, structural requirements and writing requirements, each assigned a distinct "Level" for assessing conformance.

Where PDF/UA conforming PDF documents are utilized in a web context, the WCAG 2.0 model is broadly applicable to PDF, though lacking critical technical detail. For those wishing to assess the conformance of a PDF/UA document vis-a-vis WCAG 2.0 success criteria, the document "Achieving WCAG 2.0 with PDF/UA" by AIIM's US Committee for PDF/UA, the original ISO Member Body developing ISO 14289, provides the authoritative mapping between PDF/UA provisions and WCAG 2.0 Success Criteria.

4.0 Notation

This section in ISO 14289-1 does not contain any provisions that are discussed in this Technical Implementation Guide.

5.0 Version Identification

In ISO 14289-1:2012 the 2nd sentence in the first paragraph is erroneous, and should not have been included. This will be corrected in a subsequent document.

To claim PDF/UA conformance, the rdf:Description element in the document's XMP metadata must include properties corresponding to the values in Table 1. An example follows.

Example

<x:xmpmeta xmlns:x="adobe:ns:meta/"
     x:xmptk="Adobe XMP Core 5.4-c005 78.147326, 2012/08/23-13:03:03">
     <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
           <rdf:Description
                xmlns:pdfuaid="http://www.aiim.org/pdfua/ns/id/"
                xmlns:dc="http://purl.org/dc/elements/1.1/"
                ... other namespace declarations ...     
           >
                <pdfuaid:part>1</pdfuaid:part> <!-- PDF/UA Declaration -->
                <dc:title>
                     <rdf:Alt>
                           <rdf:li xml:lang="x-default">The Document's Title</rdf:li>
                     </rdf:Alt>
                </dc:title>
                ... other RDF entries ...
           </rdf:Description>
     </rdf:RDF>
</x:xmpmeta>
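
A reader or validator can look for this declaration by parsing the XMP packet. The following Python sketch is illustrative only and is not part of ISO 14289-1: it assumes the XMP packet has already been extracted from the PDF as a string, and the function name pdfua_part and the sample packet are hypothetical. It checks both the element form shown above and the equivalent attribute form that XMP serializers may produce.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
PDFUAID_NS = "http://www.aiim.org/pdfua/ns/id/"


def pdfua_part(xmp_packet):
    """Return the declared PDF/UA part as a string, or None if absent."""
    root = ET.fromstring(xmp_packet)
    for desc in root.iter(f"{{{RDF_NS}}}Description"):
        # Element form: <pdfuaid:part>1</pdfuaid:part>
        elem = desc.find(f"{{{PDFUAID_NS}}}part")
        if elem is not None and elem.text:
            return elem.text.strip()
        # Attribute form: <rdf:Description pdfuaid:part="1" ...>
        attr = desc.get(f"{{{PDFUAID_NS}}}part")
        if attr:
            return attr.strip()
    return None


# Hypothetical sample packet, reduced to the PDF/UA declaration only.
SAMPLE = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description rdf:about=""
        xmlns:pdfuaid="http://www.aiim.org/pdfua/ns/id/">
      <pdfuaid:part>1</pdfuaid:part>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>"""

print(pdfua_part(SAMPLE))  # prints 1
```

The presence of the declaration only shows that the file claims conformance; it says nothing about whether the file actually conforms.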

 

6.0 Conformance Requirements

This section in ISO 14289-1 does not contain any provisions that are discussed in this Technical Implementation Guide.

7.0 File Format Specifications

Section 7 of ISO 14289-1 details the file-format specifications for PDF files that may claim conformance with PDF/UA.

7.1 General

7.1 is the core of PDF/UA; it contains most of the essential concepts pertaining to page-content.

Paragraphs 1 and 2

The following elements of a page's content should always be understood as "Artifacts".

  • Repeating headers or footers, even if entered by a user.
  • Slide backgrounds, even if user-defined.
  • Page numbers.
  • Usually, any items that tend to repeat page after page.
  • Bates numbers. (NOTE: ISO 32000-2 will introduce a new type of artifact to contain Bates numbers.)

Artifacts can never be part of the logical structure tree. PDF/UA-1 therefore requires that all page content be either contained in the logical structure (the document's semantic structure) or marked as an Artifact.

"Semantically appropriate tags" refers to the choice of structure element type used to represent the document's semantic structure, including section and subsection heading, paragraph, list, table, figure and other elements. The term "semantically appropriate" implies that the tags are chosen so as to accurately represent the author's intent.

Regarding page headers and footers

Content marked as artifact is generally not exposed by existing implementations, though we expect that as adoption of PDF/UA progresses this situation will change. For now, information that appears only in content marked as artifact must occasionally be exposed by some other means. Options include:

  • Not marking such information as an artifact and including such content in the structure tree.
  • Including such information in the document's metadata.

Common use cases requiring such treatment include:

  • Form numbers
  • "Mouseprint" - Copyright, trademark attributions and legal disclaimers, typically found at the back of documents.
  • Section identification (if not otherwise included in the structure tree). Typically, information that repeats on subsequent pages should only be present in the structure tree in the first instance.
  • Dictionary-style headings, for example, when presenting the first and last definition entries on the page or spread. Example: a phone-book

"Logical reading order" refers to (in order):

  1. The sequence of structure elements as it appears in the logical structure tree.
  2. Within each structure element the logical order is the order of marked content sequences referenced in the K array.
  3. Within each marked content sequence, the order in which the content is drawn (after ReversedChars tag has been taken into account).
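
The first two levels of this ordering amount to a depth-first walk of the structure tree. The following Python sketch illustrates this under simplifying assumptions: the StructElem class and the content_by_mcid mapping are hypothetical stand-ins for data a real PDF library would supply, all content is on a single page, and ReversedChars handling within a marked content sequence is not modeled.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class StructElem:
    tag: str
    # Stand-in for the K array: child structure elements or integer MCIDs.
    kids: List[Union["StructElem", int]] = field(default_factory=list)


def reading_order(elem, content_by_mcid):
    """Yield (tag, text) pairs by walking the K arrays depth-first."""
    for kid in elem.kids:
        if isinstance(kid, StructElem):
            yield from reading_order(kid, content_by_mcid)
        else:
            yield elem.tag, content_by_mcid[kid]


# The body text is drawn first (MCID 0) and the title second (MCID 1),
# but the structure tree places the heading before the paragraph.
content_by_mcid = {0: "Body text", 1: "Title"}
tree = StructElem("Document", [StructElem("H1", [1]), StructElem("P", [0])])

print(list(reading_order(tree, content_by_mcid)))
# prints [('H1', 'Title'), ('P', 'Body text')]
```

The point of the example is that the drawing order (MCID 0 before MCID 1) plays no role above the level of a single marked content sequence.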

[Figure: Visual representation of the relationship between logical order, reading order and the physical view of PDF.]

ActualText Semantics

Note that authors may intend ActualText with zero characters to be processed as if there were no content.

EXAMPLE: If a minus "-" sign is used to visually represent a hyphen and it is enclosed by an ActualText entry with a string of zero length, the minus sign "goes away"; that is, it is processed as if it were not there.

Paragraph 3

Examples of "functionally equivalent" standard types include:

  • The tag Heading 2 should probably map to the standard structure type H2.
  • The tag Normal generally maps to the standard structure type P.
  • The tag Chart generally maps to the standard structure type Figure.

Paragraph 4

No additional information provided in this Guide.

Paragraph 5

While feasible, it's a rare PDF file that flashes at all. Nevertheless flashing cannot be completely excluded since JavaScript actions may create a flashing effect, or a movie embedded in the PDF file may include flashing effects.

Paragraph 6

Whether visual effects have been used to deliver information is often impossible to ascertain programmatically. Developers should consult appropriate accessibility standards pertaining to the use of color, contrast, layout and similar features of content. See in particular the success criteria associated with WCAG 2.0 Guideline 1.4, and the resources identified by the W3C's WAI in this regard.

Paragraphs 7-9

No additional information provided in this Guide.

Paragraph 10

There are many routes to a PDF file via conversion from a scanned image.

A scanned page is not a figure (in semantic terms), but simply a representation of a page's content by means of an image XObject (or, in principle, an inline image). Tagging that image with a Figure tag and, possibly, placing the OCR-derived or rekeyed text representation of the page's content into the Alt attribute of that Figure tag violates the spirit of PDF/UA.

Where areas of a scanned page represent content that is usually perceived as an image (e.g. a photo or drawing), the scanned page must be segmented such that each of these areas can be tagged with a Figure tag and the Alt attribute populated as necessary.

Where areas of a scanned page represent text content it is a common practice to programmatically overlay these areas with invisible text that transports the text content in these areas, preferably true to the (approximate) position of each character as represented in the underlying image.

7.2 Text

Paragraph 1

ISO 32000-1 specifies that the page content order in tagged PDF shall follow logical reading order but acknowledges that there are cases in which it's not always possible to do so. There are common use-cases in which it is impossible for the logical reading order (as represented by the logical structure tree) to correspond to the content order (as represented by the sequence of graphics objects in a page's content stream). Examples include:

  • Page content such as tables, lists or headings that span facing pages in a spread.
  • An article starts or continues on a page and ends on another non-contiguous page.
  • Two or more newspaper articles all start or continue on one page and end on another page.

In such cases, the logical structure (the tags) is what allows the correct reading order of the content to be determined.

Not all PDFs are created with tagging in mind. If a tool aiming to convert existing files to conforming PDF/UA files encounters non-artifact graphics objects that cross semantic boundaries, such objects must be adjusted to enable correct tagging.

EXAMPLE: a single text run (TJ operator) may include text belonging to two or more logical cells (<TD>) in a table. For correct tagging, it's necessary to "break" such a text run into discrete text runs such that a separate TD tag may be used to contain each text run, as appropriate.

The order of content within a tag (i.e., the sequence in which Marked Content elements appear in the K array) must be in logical reading order for the file to conform. Accordingly, should the order of content within a given marked-content sequence not reflect the logical reading order, conformance with PDF/UA requires that such shortcomings be addressed. For example, the logical reading order in such cases may be corrected by splitting the tag structure into smaller pieces.

Regarding the use of Marked Content attributes in the Structure Tree

A common mistake is to use the Lang, Alt, ActualText or E attributes directly within a Marked Content Sequence that identifies content contained in the structure tree (that is, a sequence with an MCID). These attributes are intended to allow the language of a given piece of text to be overridden, or to provide alternatives to the content within the sequence. They can still be used for this purpose, but only within Marked Content Sequences that are not directly included in the logical structure tree. Note also that equivalents to these attributes are available as entries within a structure element dictionary.

An example of appropriate usage for these attributes, in the context of logical structure, is provided in ISO 32000-1, 14.9.2.3, Example 2. The following is incorrect:

BT
 /P << /MCID 0 /Lang (en-US) >> BDC
   (Some Text) Tj
 EMC
ET

When encountering a PDF that misuses these attributes, a consuming application can choose to process them on any Marked Content sequence. A PDF writer, however, should never write these attributes in this way.

If a content sequence has any of these four entries, such entries should override structure element attributes for that sequence of content.
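
A checking tool could flag this misuse by scanning decoded content streams. The following Python sketch is a rough illustration rather than a real content-stream parser: it assumes the stream has already been decoded to text, it only recognizes property lists written inline as dictionaries (not those referenced by name from the resource dictionary), and it does not handle nested dictionaries. The function name misused_attributes is hypothetical.

```python
import re

OVERRIDE_KEYS = {"Lang", "Alt", "ActualText", "E"}

# A tag name, an inline property dictionary, then the BDC operator.
BDC_RE = re.compile(r"/(\w+)\s*<<((?:(?!>>).)*)>>\s*BDC", re.DOTALL)
# Literal strings such as (en-US); removed so their text is not read as keys.
STRING_RE = re.compile(r"\((?:\\.|[^\\()])*\)")


def misused_attributes(content_stream):
    """Return (tag, [keys]) for sequences that combine MCID with override keys."""
    findings = []
    for tag, props in BDC_RE.findall(content_stream):
        keys = set(re.findall(r"/(\w+)", STRING_RE.sub("", props)))
        if "MCID" in keys:
            bad = sorted(keys & OVERRIDE_KEYS)
            if bad:
                findings.append((tag, bad))
    return findings


INCORRECT = """BT
 /P << /MCID 0 /Lang (en-US) >> BDC
   (Some Text) Tj
 EMC
ET"""

print(misused_attributes(INCORRECT))  # prints [('P', ['Lang'])]
```

A sequence such as /Span << /Lang (en-US) >> BDC, which carries no MCID, is not reported by this sketch.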

Paragraph 4

An example of stretchable characters would be large curly braces commonly used in mathematics.

7.3 Graphics

Paragraph 1, Bullet 1

Graphics that do not represent "meaningful content" include registration and Optical Mark Recognition (OMR) symbols.

Using both ActualText and Alt text on the same content is not recommended, as it places conforming readers in a confusing situation: a common industry implementation has Alt text take precedence over ActualText, which may conflict with the author's intent. For PDF processors encountering such files, it should be possible for AT to select between the ActualText and Alt text representations.

Paragraph 6

Any graphics objects, including but not limited to bitmap images, may be used individually or in combination with other objects to represent a semantic figure, irrespective of the type of graphics object(s) used to render the Figure.

Do not represent text by means of path objects, inline images or image XObjects unless necessary. Where it cannot be avoided that text is represented by non-text graphics objects, ActualText or Alt text attributes may have to be used, or invisible text may be overlaid to represent the text (see 7.1 regarding scanned images and OCR).

7.4 Headings

In PDF, headings are a key means of navigation for many AT users. In order to offer a realistic possibility of reliable navigation in longer and more complex documents, PDF/UA conformance requires that heading levels not be skipped.

7.4.1 - 7.4.3 refer to weakly structured documents. Look to 7.4.4 for guidance on strongly structured documents.

In most authoring applications heading levels are typically associated with specific typographical styles. While it's commonplace for authors to confuse structural and style commands, structure elements must be selected based on the role of headings within the document's structure, not based on the appearance of a given set of styles.

If relevant information is available, use the optional "T" key (ISO 32000-1, Table 323) to characterize the content inside the respective structure element.

Not all AT implementations yet recognize the concept of "strongly structured" documents. However, PDF/UA-conforming implementations must support both strongly structured and weakly structured documents.

7.4.2 Numbered Headings

No additional information provided in this Guide.

7.4.3 Additional Headings

Although this section defines the nomenclature for heading levels above H6 (Hn), these are not standard structure types and therefore Hn tags must (PDF/UA-1 7.1, paragraph 1) be role-mapped to a standard structure type.

For documents claiming PDF/UA conformance, processors encountering heading level 7 (H7) or higher - regardless of role-map - should treat such tags as a heading, with the appropriate level.

Writers must use H7, H8, etc. exclusively for content that represents a heading on level 7, 8 and above.

Readers and AT should always assume when encountering an Hn tag that the content inside it corresponds to heading level n; for example, the content inside an H7 tag corresponds to heading level 7.
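
The treatment of Hn tags, together with the requirement in 7.4 that heading levels not be skipped, can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical, the input is assumed to be the sequence of structure element tag names in logical reading order, and treating the first heading as required to be level 1 is an assumption of this sketch.

```python
import re

HEADING_RE = re.compile(r"^H([1-9][0-9]*)$")


def heading_level(tag):
    """Return n for a tag named Hn (H1, H2, ... H7, H8, ...), else None."""
    match = HEADING_RE.match(tag)
    return int(match.group(1)) if match else None


def skipped_levels(tags):
    """Return (previous_level, level) pairs where a heading level was skipped."""
    problems = []
    previous = 0
    for tag in tags:
        level = heading_level(tag)
        if level is None:
            continue  # not a numbered heading (e.g. P, or H in strong structure)
        if level > previous + 1:
            problems.append((previous, level))
        previous = level
    return problems


print(heading_level("H7"))  # prints 7
print(skipped_levels(["H1", "P", "H2", "H4", "H2", "H7"]))
# prints [(2, 4), (2, 7)]
```

Moving back up to a higher-level heading (H4 followed by H2 in the example) is not a skip; only descending by more than one level is reported.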

7.4.4 Unnumbered Headings

No additional information provided in this Guide.

7.5 Tables

Tables may require a variety of structure attributes as determined by the semantics of the content. Thus, while optional in ISO 32000-1, table attributes are required in a PDF/UA-1 conforming document when the content includes such instances.

While ISO 14289-1:2012 requires the "Scope" attribute for all TH cells even where header IDs are used to associate TD cells with the respective TH cells, in such cases the Scope attribute does not actually add any value.

In ISO 32000-1, the Note to Table 337 says "Lookup is heuristic". This provision, however, can lead to incompatible behavior by AT. The recursive lookup mentioned in Table 349 is ambiguous in that the table headers might be associated only with a row, only with a column, or with both. ISO 32000-1 gives no algorithm for the case in which table header cell IDs and table data cell IDs are not present.

If Table Headers are not specified

Where table header cell IDs and table data cell IDs are not specified, the following algorithm, accepted for ISO 32000-2, should be used by developers implementing ISO 14289-1:

To find headers for any data or header cell, search left/up from the cell's position to find row/column header cells. The search in a given direction stops when any of these conditions is reached:

  • The edge of the table is reached.
  • A data cell is found after a header cell.
  • A header cell has the Headers attribute set; the headers that are specified are appended to the row/column list that is being built.

When a header cell is found in the search and the (implicit or explicit) Scope of the header cell is either Both or Row/Column, the header cell is appended to the end of the list of row/column headers, resulting in a list of headers ordered from most specific to most general.

NOTE: This algorithm works for languages with different intrinsic directionality of the script (such as right-to-left) because the structure always reflects the reading order of the table.
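
The search described above can be sketched in Python as follows. This is one reading of the algorithm, not a reference implementation. It assumes the table has already been expanded into a rectangular grid with one Cell per position (row and column spans resolved), that every header cell already carries a Scope (explicit, or assigned by the default rules given under "If Scope is not identified"), and that a header cell with the Headers attribute is itself appended before its listed headers. The Cell class and find_headers function are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Cell:
    kind: str                 # "TH" or "TD"
    label: str = ""
    scope: str = "Both"       # "Row", "Column" or "Both" (meaningful for TH)
    headers: List[str] = field(default_factory=list)  # explicit Headers IDs


def find_headers(grid, row, col):
    """Return row headers, then column headers, most specific first."""
    found = []
    # Search left for row headers, then up for column headers.
    for d_row, d_col, wanted in ((0, -1, "Row"), (-1, 0, "Column")):
        r, c, seen_header = row + d_row, col + d_col, False
        while r >= 0 and c >= 0:              # stop at the edge of the table
            cell = grid[r][c]
            if cell.kind == "TD":
                if seen_header:               # data cell after a header cell
                    break
            else:
                seen_header = True
                if cell.scope in (wanted, "Both"):
                    found.append(cell.label)
                if cell.headers:              # explicit Headers end the search
                    found.extend(cell.headers)
                    break
            r, c = r + d_row, c + d_col
    return found


grid = [
    [Cell("TH", "Region", "Both"), Cell("TH", "Q1", "Column"), Cell("TH", "Q2", "Column")],
    [Cell("TH", "North", "Row"), Cell("TD", "10"), Cell("TD", "20")],
    [Cell("TH", "South", "Row"), Cell("TD", "30"), Cell("TD", "40")],
]

print(find_headers(grid, 2, 2))  # prints ['South', 'Q2']
```

In this sketch the returned list mixes header labels with any unresolved IDs taken from a Headers attribute; a real implementation would resolve those IDs to the cells they identify.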

If Table Headers are specified

Because ISO 32000-1 conforming PDF files do not include a specified order for table header IDs, readers of such documents have no guarantee that the following algorithm works. Writers of PDF files, however, should use the order specified below (accepted for ISO 32000-2):

The order of IDs in the Headers array shall be row IDs followed by column IDs. The row and column IDs shall be ordered from the most specific to the most general. For any cells with an ID listed as a Header, such cells shall have a Scope identified.

If Scope is not identified

In the event that Scope is not identified then the assumed value for the Scope should be determined based on the following algorithm (accepted for ISO 32000-2):

If the header cell is in the first row and the first column, the Scope is assumed to be Both;
otherwise, if it is in the first row, the Scope is assumed to be Column;
otherwise, if it is in the first column, the Scope is assumed to be Row;
otherwise, the Scope is assumed to be Both.

These assumptions are used by the Table Header Finding algorithm.
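
Expressed as code, the default Scope rules above reduce to a short function. The following Python sketch assumes zero-based row and column indices for the header cell; the function name implicit_scope is hypothetical.

```python
def implicit_scope(row, col):
    """Assumed Scope for a header cell that does not specify one."""
    if row == 0 and col == 0:
        return "Both"
    if row == 0:
        return "Column"
    if col == 0:
        return "Row"
    return "Both"


print(implicit_scope(0, 3))  # prints Column
print(implicit_scope(2, 0))  # prints Row
```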

Note that these algorithms work for languages with different intrinsic directionality of the script, such as right-to-left, because the structure always reflects the reading order of the table.

Summary Attribute

Writers and readers should implement this feature, as AT users find such content very useful.

Regularity of Tables

Tables are regular when the number of logical cells is equal in each row (after accounting for rowspan and colspan attributes).

While not required by PDF/UA-1 (or WCAG 2.0), regularity of tables can be important to assistive technology because regularity is a strong indicator of proper semantics in tabular content. If an irregular table occurs, the Headers array should be checked to verify, minimally, that each TD cell is associated with at least one heading.
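
A regularity check can be sketched as follows. This Python example is illustrative only: it assumes each row is given as a list of (rowspan, colspan) pairs, one per cell actually present in that row, and the function names are hypothetical. It compares the effective width of each row after carrying down cells that span from earlier rows; it does not check whether a rowspan extends past the last row.

```python
def row_widths(rows):
    """Return the effective number of logical columns in each row."""
    widths = []
    pending = []  # (further_rows, colspan) for cells spanning down from above
    for row in rows:
        widths.append(sum(cs for _, cs in pending) + sum(cs for _, cs in row))
        # Drop spans that end on this row; keep the rest for the next row.
        pending = [(n - 1, cs) for n, cs in pending if n > 1]
        pending += [(rs - 1, cs) for rs, cs in row if rs > 1]
    return widths


def is_regular(rows):
    """True when every row has the same effective width."""
    return len(set(row_widths(rows))) <= 1


# First cell spans two rows, so the second row needs only two cells.
spanning = [[(2, 1), (1, 1), (1, 1)], [(1, 1), (1, 1)]]
# Second row is one column short.
ragged = [[(1, 1), (1, 2)], [(1, 1), (1, 1)]]

print(row_widths(spanning), is_regular(spanning))  # prints [3, 3] True
print(row_widths(ragged), is_regular(ragged))      # prints [3, 2] False
```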

7.6 Lists

In an unordered list without Lbl elements (which is permissible) the value of ListNumbering should be "None". In such cases the graphics objects representing the list's labels should not be contained in the logical structure and should be marked as Artifacts.

The ListNumbering attribute may be useful even without Lbl tags in the list. If ListNumbering has a value of UpperRoman, for example, conforming PDF/UA readers must make this ListNumbering attribute available to users and AT.

Conforming PDF/UA writers should ensure that content contained in Lbl tags matches the ListNumbering attribute if present.

If a list is an ordered list the ListNumbering attribute must be present with a suitable value.

If it's not possible to establish a perfect match for ListNumbering in ordered lists, the value "Decimal" may be used.

Depending on whether the content of a Lbl element can be readily represented in Unicode, it may be necessary to add information to enable mapping to Unicode, for example by means of a ToUnicode table, or by adding an ActualText attribute.

7.7 Math

ISO 32000-1 does not include any specific physical representation of mathematical expressions or formulas in the content of a page. For the purpose of tagged PDF, a mathematical expression or formula must be tagged as a Formula and carry an Alt attribute that represents the content of the formula as plain text.

In the case of one graphics object or a combination of graphics objects (as defined in ISO 32000-1 8.2) used to represent math, such graphics object(s) must be tagged in a conforming manner (see ISO 32000-1 Table 340) by a Formula tag. Even when the formula is represented by an inline image or image XObject, a Figure tag must not be used to represent the formula; a Formula tag must be used instead.

Appropriate Alt text must be supplied. It is important to recognize that different audiences benefit from different ways of speaking a math equation. If the target audience includes people who are blind and are not familiar with the subject matter, it is important to make sure that the math is spoken unambiguously. This typically requires that any two-dimensional notation, such as a fraction, have a starting and an ending phrase (e.g., "Fraction x plus y over x minus y end fraction"). These extra words are not commonly used when speaking math. For those with learning disabilities such as dyslexia, it is beneficial if the content items inside a formula are each tagged, so that each part of the formula can be highlighted separately as the formula is read.

In the absence of any information about the document's audience, an unambiguous reading should be favored.

In math, certain "stretched" characters may be composed of multiple instances of characters that together represent a single "stretched" character. Examples include tall parentheses, arrows and integrals. Such objects must be represented via an ActualText attribute (ISO 32000-1 14.9.4).

NOTE: Looking forward to ISO 32000-2 and 14289-2

It is expected that ISO 32000-2 will provide a new primary method that allows for dynamic generation of accessible math based on user preferences.

ISO 14289-2 will align with ISO 32000-2 to address MathML. The following are anticipated specifications in ISO 14289-2:

  • All mathematical expressions that can be represented by MathML may be represented as MathML occurring inside a <Formula> tag.
  • All nonlinear mathematical expressions (i.e., those that use two dimensional notations such as fractions and superscripts) shall be represented by MathML.
  • All MathML shall use presentation MathML tags; content MathML may be included via PDF's facility for attached files.
  • All MathML tag names occur in the MathML namespace.
  • The requirements regarding mapping of characters to Unicode shall apply to math as set forth in ISO 32000-1:2008, 9.10.2 and 14.8.3.4, as well as in 6.12 of this standard.

7.8 Page headers and footers

If the content stream includes page numbers, they should be considered as part of a running header or footer and thus marked as artifacts of type Pagination with a subtype of Header or Footer, as appropriate.
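
For illustration, the following Python sketch shows the marked-content wrapper a writer might emit around a page number. It is a minimal example, not a definitive implementation: the function name pagination_artifact is hypothetical, and the sample content (including the font resource name F1) is invented for the example.

```python
def pagination_artifact(content, subtype):
    """Wrap content stream operators as a Pagination artifact."""
    if subtype not in ("Header", "Footer"):
        raise ValueError("subtype must be 'Header' or 'Footer'")
    return (
        f"/Artifact << /Type /Pagination /Subtype /{subtype} >> BDC\n"
        f"{content}\n"
        "EMC"
    )


# Hypothetical operators drawing the page number "12" near the page bottom.
page_number = "BT /F1 9 Tf 300 20 Td (12) Tj ET"
print(pagination_artifact(page_number, "Footer"))
```

Content wrapped in this way is not referenced from the logical structure tree.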

To facilitate page-based navigation for end users, the contents of the PageLabel tree should match the logical pagination of the document whenever possible.

7.9 Notes and references

No additional information provided in this Guide.

7.10 Optional content

Artifacts aside, in conforming files page content contained in optional content is always present in the logical structure tree regardless of the current visibility state of such page content.

Optional content should be used with care, as the activation and deactivation of optional content, when presented through AT, may confuse some users.

7.11 Embedded files

Applicable accessibility standards may include (but are not limited to): ISO 40500 (WCAG 2.0), ISO 29500 (OpenXML, Annex J), ISO 26300 (ODF 1.1 Accessibility Guidelines 1.0), ISO 9241-171, and certain standards referenced by ISO/IEC TR 29138-2:2009.

In the context of multimedia, "file attachments" means any media data embedded in the PDF.

7.12 Article threads

Article threads are not required for a PDF/UA conforming document.

7.13 Digital signatures

Signature form fields are annotations, and must conform with the provisions for digital signature annotations as specified in ISO 32000-1:2008 Table 252.

7.14 Non Interactive Forms

PrintField attributes are characteristically useful for completed and flattened (and thus non-interactive) PDF forms where the appearance of interactive form fields has been included in the content stream of the page.

7.15 XFA

Dynamic XFA, which does not use widget annotations for form fields, is a distinct technology from "classical" PDF. PDF/UA does not support dynamic XFA. Those interested in supporting Dynamic XFA should refer to the Adobe XML Forms Architecture (XFA): Listing of Specifications http://partners.adobe.com/public/developer/xml/index_arch.html. Interested users may refer to Adobe's documentation at http://www.adobe.com/accessibility/products/livecycle/pdf/LiveCycle8_2AccessibilityGuidelines.pdf.

For static XFA documents, which do use form fields in the form of widget annotations, there are no specific requirements.

7.16 Security

No additional information provided in this Guide.

7.17 Navigation

PDF/UA conforming writers that do not already create bookmarks are encouraged to create bookmarks based on the heading structure of the document (and possibly, other structure elements as well).
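
One way to derive a bookmark hierarchy from headings is to nest each heading under the most recent heading of a lower level. The following Python sketch shows that idea only; it assumes the headings have already been collected as (level, title) pairs in logical reading order, and the nested dictionaries it returns are a hypothetical stand-in for real outline items.

```python
def build_outline(headings):
    """Nest (level, title) pairs into a tree of bookmark-like dictionaries."""
    root = {"title": None, "level": 0, "children": []}
    stack = [root]
    for level, title in headings:
        # Close every open entry at the same or a deeper level.
        while stack[-1]["level"] >= level:
            stack.pop()
        node = {"title": title, "level": level, "children": []}
        stack[-1]["children"].append(node)
        stack.append(node)
    return root["children"]


outline = build_outline([
    (1, "Introduction"),
    (2, "Scope"),
    (2, "Audience"),
    (1, "Usage"),
])

for item in outline:
    print(item["title"], [child["title"] for child in item["children"]])
# prints:
# Introduction ['Scope', 'Audience']
# Usage []
```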

Where a document contains a Table of Contents, PDF/UA conforming writers must create TOC/TOCI structures as opposed to simply using lists.

Where a document contains a Table of Contents, PDF/UA conforming writers are encouraged to create Link structure elements including link annotations for each TOCI.

7.18 Annotations

Popup annotations do not carry any content on their own, but are part of the parent annotation and cannot be tagged on their own. Popup annotations are implicitly included in the structure tree through their parent annotation, and should not be directly included.

NOTE: A pending revision for PDF/UA-1 includes this adjustment.

Representation of annotation context should proceed in the following fashion:

Alt keys in Annot and Link structure elements and Contents keys in annotation dictionaries

If an Annot or Link structure element contains an Alt key, the content inside the Annot or Link structure element might be hidden for at least some AT.

If instead the Annot or Link structure element does not contain an Alt key, but the dictionary for an annotation referenced in an Annot or Link structure element contains a Contents key, the contents inside the Annot or Link structure element will be available to AT. In addition, an alternative description for the annotation or link will be available through the Contents key.

Depending on the type of annotation and how it's used, either solution may be more suitable.

If the Contents key has a non-empty value, AT should present that value, unless the Annot or Link structure element encapsulating the annotation has an Alt key and AT decides to present the contents of the Alt key instead of the content inside the Annot or Link structure element.
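
One possible reading of this rule, expressed as a Python sketch, is shown below. It is an interpretation offered for illustration, not normative behavior: the function name and the at_prefers_alt flag (standing for AT's own decision to present the Alt key) are hypothetical, and None is returned where neither value is available.

```python
def annotation_description(alt, contents, at_prefers_alt=False):
    """Choose the description AT presents for an annotation, or None."""
    if alt and at_prefers_alt:
        return alt        # AT chose the Alt key of the structure element
    if contents:
        return contents   # non-empty Contents key of the annotation
    return None           # nothing suitable; other means would be needed


print(annotation_description(None, "Link to the order form"))
# prints Link to the order form
```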

NOTE 1: If the file is not PDF/UA conforming and the Contents key is null, empty or missing, the AT may retrieve the immediate context of the Link annotation by means of the text or other content contained in the same structure element parent, or through other suitable means (for example, the link's target's title, or the text corresponding to the destination of the link).

NOTE 2: For historical reasons, it's not uncommon for PDF creators to use an Alt key on an Annot tag.

Form Fields

The usage advice given for annotations, above, applies equally to form fields except that instead of the Contents key an alternate description is provided by the TU key.

If the TU key has a non-empty value, AT should present that value, unless the Form structure element encapsulating the form field has an Alt key and AT decides to present the contents of the Alt key instead of the content inside the Form structure element.

NOTE: If the file is not PDF/UA conforming and the TU key is null, empty or missing, AT may retrieve the immediate context of the form field by means of the text contained in the same structure element parent, or through other suitable means, for example, the name of the form field.

7.18.2 Annotation Types

No additional information provided in this Guide.

7.18.3 Tab order

No additional information provided in this Guide.

7.18.4 Forms

There are three types of forms in PDF:

  • Fillable interactive forms: These pages include widget annotations at least one of which is not read-only.
  • Non-fillable interactive forms: All widget annotations on these pages are read-only.
  • Non-interactive forms: These pages have the appearance of forms but do not contain widget annotations.
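
The first two types can be distinguished programmatically from the field flags of the widgets on a page. The following Python sketch assumes the effective Ff value of each widget's field has already been resolved (including inheritance) and relies on the ReadOnly flag being bit position 1 of Ff in ISO 32000-1; the function name is hypothetical. Whether a page without widgets has the appearance of a form cannot be determined this way and generally requires human judgment.

```python
READ_ONLY = 1  # bit position 1 of the Ff field flags


def classify_form_page(widget_field_flags):
    """Classify a page from the Ff values of its widget annotations."""
    if not widget_field_flags:
        return "no widgets"
    if all(flags & READ_ONLY for flags in widget_field_flags):
        return "non-fillable interactive"
    return "fillable interactive"


print(classify_form_page([0, 1]))  # prints fillable interactive
print(classify_form_page([1, 3]))  # prints non-fillable interactive
print(classify_form_page([]))      # prints no widgets
```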

A non-interactive form cannot conform to PDF/UA unless it conforms with 7.14.

7.18.5 Links

The required "alternative description" should reflect the purpose of the link. See WCAG 2.0 Success Criterion 2.4.4 and Technique G91.

PDF/UA-1 ERRATA: The first sentence of 7.18.5 includes an incorrect reference. The reference should be to "14.8.4.4.2 Link Element".

7.18.6 Media

No additional information provided in this Guide.

7.18.6.1 General

No additional information provided in this Guide.

7.18.6.2 Media clip data

No additional information provided in this Guide.

7.18.7 File Attachments

No additional information provided in this Guide.

7.18.8 PrinterMark

Paragraph 1 includes an error: it's not possible to mark PrinterMark annotations as Incidental Artifacts in PDF. The text should be read as specifying that PrinterMark annotations should be considered as Incidental Artifacts.

7.19 Actions

Actions are the main means of performing interactions. While PDF/UA doesn't really limit the use of actions, an accessible document should not surprise users with actions that change context without adequate warning.

Where a PDF/UA conforming PDF document makes use of actions, AT users must be advised appropriately each time an action changes the focus, state or visible contents of the PDF document.

Any chosen notification style must be accessible. In a JavaScript action, the announcement may occur by way of a JavaScript dialog.

EXAMPLE: If a button triggers a Hide action for some other annotation, this change of the document’s state must be made available to AT. In general, this will only be possible through execution of a JavaScript action, which may (for example) display a JavaScript dialog.

NOTE: WCAG 2.0 Success Criterion 3.2.1 provides detailed information on this topic.

7.20 XObjects

If a form XObject contains marked content sequences referenced by the logical structure tree, this form XObject can only be used once in the PDF via the Do operator. If a form XObject does not contain any marked content sequences referenced by the logical structure tree, it can be used multiple times in the PDF through the Do operator, and each Do operator may belong to a structure element.
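
A checker could test the rule above roughly as follows. This is a sketch under stated assumptions: the function name is hypothetical, form XObjects are identified by caller-supplied strings (for example an object number, since resource names may differ from page to page), and the caller has already determined which form XObjects contain marked content sequences referenced by the logical structure tree.

```python
from collections import Counter
from typing import Iterable, List, Set


def reused_structured_xobjects(
    do_invocations: Iterable[str],
    structured: Set[str],
) -> List[str]:
    """Return form XObjects that break the single-use rule.

    do_invocations -- one identifier per Do operator found in the
                      document's content streams
    structured     -- identifiers of form XObjects whose marked content
                      is referenced by the logical structure tree
    """
    counts = Counter(do_invocations)
    # Only structured XObjects are restricted to a single Do invocation.
    return sorted(name for name in structured if counts[name] > 1)
```

With do_invocations of ["Fm1", "Fm2", "Fm1", "Fm2"] and structured equal to {"Fm1"}, only "Fm1" is reported. "Fm2" may be drawn repeatedly because it contains no marked content referenced by the structure tree.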

7.21 Fonts

Due to the requirement in Section 7.2 that all characters shall map to Unicode, it is important to only use fonts that contain encoding information that can be mapped to Unicode or provide for Unicode mapping in some other way.

NOTE: Some older PostScript Type 1 fonts may lack suitable encoding information.

7.21.1 General

No additional information provided in this Guide.

7.21.2 Font Types

No additional information provided in this Guide.

7.21.3 Composite Fonts

No additional information provided in this Guide.

7.21.4 Embedding

Subset embedding is recommended when file size is of special importance.

7.21.5 Font Metrics

No additional information provided in this Guide.

7.21.6 Character encodings

No additional information provided in this Guide.

7.21.7 Unicode character maps

No additional information provided in this Guide.

7.21.8 Use of .notdef glyph

No additional information provided in this Guide.

Conforming Reader Specifications (Section 8)

8.1 General

Users generally do not need or desire to interact with artifacts. However, there are cases in which such interaction is desirable, and therefore the conforming Reader must possess the capability to report on the artifacts present on the page.

EXAMPLE: Page headers and footers, in particular the printed page numbers, are artifacts of possible interest to users.

8.2 Text

No additional information provided in this Guide.

8.3 Tables

See the table processing algorithm.

8.4 Optional Content

No additional information provided in this Guide.

8.5 File attachments and embedded files

No additional information provided in this Guide.

8.6 Digital signatures

No additional information provided in this Guide.

8.7 Actions

Note that WCAG 2.0 requires that actions resulting in a change of context be announced prior to the event. Examples of actions causing visible or focus changes include:

  • Go to a page view.
  • Open a file.
  • Show or hide annotations.
  • Submit a form.
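
One way a reader might prepare such announcements is to look up the action type (the S key of the action dictionary) before the action is executed. The sketch below is an assumption-laden illustration: the mapping of the examples above to the ISO 32000-1 action types GoTo, GoToR, Launch, Hide and SubmitForm is this sketch's own and is not exhaustive, and the function name and message wording are hypothetical.

```python
from typing import Mapping, Optional

# Illustrative, non-exhaustive mapping from action type to its effect.
CONTEXT_CHANGING_ACTIONS = {
    "GoTo": "go to a page view",
    "GoToR": "go to a page view in another file",
    "Launch": "open a file",
    "Hide": "show or hide annotations",
    "SubmitForm": "submit a form",
}


def announcement_for(action: Mapping[str, str]) -> Optional[str]:
    """Return a message to present before the action runs, if any."""
    effect = CONTEXT_CHANGING_ACTIONS.get(action.get("S", ""))
    if effect is None:
        # Not covered by this illustrative table; a real reader would
        # need to consider every action type it supports.
        return None
    return f"Activating this control will {effect}."
```

The message is produced before execution, which matches the requirement that the change of context be announced prior to the event.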

8.8 Metadata

No additional information provided in this Guide.

8.9 Navigation

The requirement in list item 4 is primarily aimed at the need to override hardwired zoom levels.

Blank pages should be recognized (i.e., the user should be made aware of a blank page).

8.10 Annotations

8.10.1 General

This sentence should not be read to imply that only conforming AT are entitled to this data!

8.10.2 Forms

No additional information provided in this Guide.

8.10.3 Media

No additional information provided in this Guide.

Conforming AT Specifications (Section 9)

9.1 General

In lists (see 7.6): If no Lbl structure elements are present AND either the value of ListNumbering is equal to "None" OR the ListNumbering key is not present, implementations should fall back to their default list item representations.
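
The fallback condition for lists can be expressed as a small predicate. This sketch treats a missing ListNumbering key the same as the value "None", which is the default value of that attribute in ISO 32000-1. The function and parameter names are hypothetical.

```python
from typing import Optional


def use_default_list_items(has_lbl: bool,
                           list_numbering: Optional[str]) -> bool:
    """Decide whether AT falls back to its default list item rendering.

    has_lbl        -- True if the list contains any Lbl structure elements
    list_numbering -- value of the ListNumbering attribute, or Python
                      None if the key is missing (assumed equivalent to
                      the PDF name "None", the attribute's default)
    """
    if has_lbl:
        # Explicit labels exist, so no fallback is needed.
        return False
    return list_numbering is None or list_numbering == "None"
```

For a list without Lbl elements and with ListNumbering set to "Decimal", the predicate returns False, leaving the AT free to derive labels from the declared numbering style.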

9.2 Optional Content

Where a PDF document contains optional content reflecting multiple representations of the content, AT must make it possible for users to switch between available representations.

9.3 Navigation

No additional information provided in this Guide.

Additional Documents

Achieving WCAG 2.0 with PDF/UA