Purpose

This document explains the behavior and impact of dynamic fields, specifically the Table of Contents (TOC), cross-references, bookmarks, hyperlinks, and page numbers, during Word-to-PDF conversion using Aspose.Words in our backend services (the convert-to-PDF Lambda).

 

It includes an in-depth analysis of field update logic, performance implications, and actual results from test conversions.

 

Background

When Word documents are converted to PDF, they may contain dynamic fields such as:

 

  • Table of Contents (or another dynamic table, such as a Table of Figures)
  • Cross-references
  • Internal bookmarks
  • Page numbers
  • External hyperlinks

 

These fields are generated and controlled through Word’s field system (e.g., TOC, REF, HYPERLINK, PAGE). By default, their contents are not updated unless the user explicitly triggers an update (e.g., right-click > "Update Field" in Word). This leads to two possible approaches during PDF conversion: refresh every field before saving, as production currently does, or preserve the field contents as authored and let the exporter update only page numbers, as the next release will.

 

 

Current Behavior in Production (29.3.0)

 

In our current implementation, we explicitly call:

doc.UpdateFields();

 

This refreshes all dynamic fields before saving the document to PDF. We also use:

new PdfSaveOptions { UpdateFields = false };

 

This ensures:

 

  • TOC text (headings) and page numbers are regenerated
  • Cross-references are fully re-evaluated
  • Any broken or outdated references get surfaced as ## Error if the source no longer exists
  • Page number fields update based on final pagination

 

This results in the PDF reflecting the most up-to-date field state, regardless of whether the original .doc/.docx had those fields manually updated.

 

This behavior now applies not only to the Table of Contents but also to other dynamic reference tables: Table of Figures, Table of Authorities, and Index.

 

Bibliographies are not automatically updated during conversion.
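The production flow described above can be sketched roughly as follows. This is a minimal illustration, not the actual Lambda code: the class name, method name, and file-path parameters are hypothetical, and the real service may work with streams rather than paths. Only `Document.UpdateFields()`, `PdfSaveOptions.UpdateFields`, and `Document.Save()` are taken from the article and the Aspose.Words API.

```csharp
using Aspose.Words;
using Aspose.Words.Saving;

// Hypothetical wrapper illustrating the 29.3.0 behavior.
public static class PdfConversion2930
{
    // inputPath and outputPath are illustrative; the real handler is not shown in this article.
    public static void Convert(string inputPath, string outputPath)
    {
        var doc = new Document(inputPath);

        // Explicitly re-evaluate the document's fields before export.
        doc.UpdateFields();

        // Ask the exporter not to run its own field update while saving.
        var options = new PdfSaveOptions { UpdateFields = false };

        doc.Save(outputPath, options);
    }
}
```

With this configuration the PDF is built from the refreshed field results, which is why it can differ from what the author last saw in Word.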

 

 

Problem Statement

While technically correct, this behavior has led to client confusion. Users often upload Word documents without manually updating fields like TOC or cross-references. After conversion:

 

  • The PDF shows updated TOC entries and page numbers
  • But the original Word document (as downloaded later) shows outdated or mismatched content

 

This discrepancy has triggered multiple Zendesk tickets from clients expecting WYSIWYG fidelity (PDF should match what they saw in Word).

 

Changes in the Next Release (29.4.0)

To align with user expectations and reduce confusion, we remove the explicit call to:

doc.UpdateFields();

 

And change the save options flag to true:

new PdfSaveOptions { UpdateFields = true };

 

This approach allows:

 

  • Page numbers to be updated automatically during export for the Table of Contents and Table of Figures. For the Table of Authorities and Index, page numbers are not automatically updated.
  • Only TOC page numbers to surface as "Error! Bookmark not defined." if the source no longer exists.
  • All other fields (TOC text, captions inside the Table of Figures, cross-reference labels, etc.) to remain as-is, matching the state in the uploaded .doc/.docx file.
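For comparison, the 29.4.0 configuration might look like the sketch below. As before, the class name, method name, and path parameters are hypothetical placeholders; the only changes relative to the production flow are the ones this section describes: no explicit `doc.UpdateFields()` call, and `UpdateFields = true` on the save options.

```csharp
using Aspose.Words;
using Aspose.Words.Saving;

// Hypothetical wrapper illustrating the 29.4.0 behavior.
public static class PdfConversion2940
{
    // inputPath and outputPath are illustrative; the real handler is not shown in this article.
    public static void Convert(string inputPath, string outputPath)
    {
        var doc = new Document(inputPath);

        // No explicit doc.UpdateFields() call: field results stay as saved in the uploaded file.

        // Let the exporter perform its own field update during save.
        // Per the test results in this article, this refreshes page numbers
        // (TOC, Table of Figures, headers/footers) but not TOC text or captions.
        var options = new PdfSaveOptions { UpdateFields = true };

        doc.Save(outputPath, options);
    }
}
```

Keeping the two variants side by side makes the trade-off explicit: the first favors fresh content, the second favors parity with the uploaded Word file.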

 

Test Results

Several files were used for validation:

 

Document 1: TOC and other cross-references

📄 Original Word Document – Dynamic_TOC_Test_Document.docx
📄 Converted PDF – MG-1-1 Mihai Test doc 1 (v1.0).pdf

 

Contents of the Test Document

  • A manually-inserted TOC
  • Section headings across 5 pages
  • One cross-reference
  • One bookmark
  • One external hyperlink
  • Header and Footer with dynamic page numbers

 

Document 2: Table of Figures

📄 Original Word Document – Document Figures.docx

📄 Converted PDF – MIG2-127 Document Figures (v1.0).pdf

 

Contents of the Test Document

  • Captions for the images in the document (References → Insert Caption)
  • A manually-inserted Table of Figures (not updated)

 

Document 3: Table of Authorities

📄 Original Word Document – Document Table of Authorities.docx

📄 Converted PDF – MIG2-128 Document Table Authorities (v1.0).pdf

 

Contents of the Test Document

  • Sequences of text marked as citations (References → Insert citation in Word)
    • To identify them easily in the document: go to Home → click the paragraph icon (¶), then search for “\ c”.
    • The text marked as a citation is the text between { }
  • A manually inserted Table of Authorities (not updated)

 

Document 4: Index

📄 Original Word Document – Document Index.docx

📄 Converted PDF – MIG2-130 Document Index (v1.0).pdf

 

Contents of the Test Document

  • Sequences of text marked as index entries (References → Mark Entry in Word)
    • To identify them easily in the document: go to Home → click the paragraph icon (¶), then search for “\ c”.
    • The text marked as an index entry is the text between { }
  • A manually inserted Index (not updated)

 

Observed Behavior in PDF

| Feature | Result | Explanation |
|---|---|---|
| TOC Text | ✅ Reflects original doc/docx TOC | TOC is preserved 1:1 with Word in the PDF |
| TOC Page Numbers | ✅ Correct | Updated using PdfSaveOptions.UpdateFields = true |
| Captions inside Table of Figures | ✅ Reflects original doc/docx Table of Figures | Table of Figures is preserved 1:1 with Word in the PDF |
| Table of Figures Page Numbers | ✅ Correct | |
| Table of Authorities text (non-clickable) | ✅ Reflects values in the original doc/docx | Non-clickable references |
| Table of Authorities Page Numbers | ✅ Reflects values in the original doc/docx | Non-clickable references |
| Index text | ✅ Reflects values in the original doc/docx | Non-clickable references |
| Index Page Numbers | ✅ Reflects values in the original doc/docx | Non-clickable references |
| Cross-reference Text | ✅ Updated | Reference to section is resolved correctly |
| Cross-reference Link | ✅ Clickable | Internal navigation works |
| Bookmark Targeting | ✅ Correct | Clicking anchor navigates as expected |
| External Hyperlink | ✅ Clickable in browser | URL retained properly |
| Footer Page Numbers | ✅ Accurate | Page field rendered per PDF layout |
| Header Page Numbers | ✅ Accurate | Page field rendered per PDF layout |


Benefits of Change

| Area | Before (With Update Fields) | After (Without Update Fields) |
|---|---|---|
| Visual Match with Word | ❌ PDF may differ | ✅ 1:1 match |
| Client Understanding | ❌ Confusing | ✅ Aligned |
| Field Consistency | ✅ Always fresh | 📋 Depends on author |
| Broken Field Warnings | ❌ Possible ## Error | ✅ Preserved as-is |
| Performance | ⏳ Slower on large docs | ⚡ Faster |


Known Limitations & Considerations

  • If the user forgets to update TOC or references manually in Word, the PDF will preserve those stale values.
     

    If a section (e.g., Section Five) is removed, the TOC page numbers are still updated because of this option:

    new PdfSaveOptions { UpdateFields = true };

    The removed section's entry then appears in the table of contents as “Error! Bookmark not defined.” This is intentional.
     

    If complete 1:1 parity is required, the flag must be set to false; however, the full impact of page numbers not updating is not yet known. In such cases, Microsoft Word itself does not let you update only the page numbers; it offers only a full TOC update.

  • Bookmarks and hyperlinks do not require field updates and continue to function correctly.
  • Cross-references with broken anchors may remain silently incorrect, the same as in production (29.3.0).
  • Headers/footers that use PAGE or NUMPAGES fields are updated correctly.
  • Behavior in collaborative editing appears to match that of the Microsoft Word (Office 365) application. If discrepancies appear, it is up to Microsoft to keep the applications in sync with each other.
  • Page numbers inside the Table of Authorities and Index do not update automatically. However, unlike the Table of Contents and Table of Figures, they are not clickable, so they are less likely to cause confusion or errors when users navigate the document.