When non-HTML documents become a problem

Although the web has long supported non-HTML documents (PDF, Word, etc.), that doesn’t mean publishing them is a good idea. They can make it difficult for users to find what they need. Non-HTML documents are often not written to support a good web experience: they are created to be printed, presented, and so on.

The Government of Canada excels at making native-format documents available on the web. Google lists millions of PDFs on gc.ca sites, and far more pages, since most PDFs run to more than one page and some to hundreds. On one hand, it is good that public information is made available. On the other, the government should be concerned about the user experience: users generally expect to use web content the way they use HTML pages. Still, PDFs and other non-HTML documents sometimes need to be made available so that clients or citizens can see what was printed or presented. Web teams have to strike a balance between findability and ease of publishing, so PDFs keep being published.

As a web manager, how do you decide when to convert files on your website into HTML? When should web content creators limit the number of non-HTML versions? If you have lots of PDFs, where do you begin migrating to HTML? Here are some issues to be aware of when making your decisions. Non-HTML content becomes a problem when:

Users need to get around within the non-HTML content


  • Non-HTML documents often have no global navigational links
  • If there is a table of contents it may not be hyperlinked, or may be several scrolls down
  • The first screen in a print-oriented document is a cover page and may have no links at all
  • Consequently, scrolling and the Find utility are the principal means of getting around
  • The most common way out of a non-HTML document is the Back button or closing the window altogether
  • TIP: Turn the table of contents entries into working links, and add way-finding links to help users reach key spots within the non-HTML document


Users need to arrive deep within non-HTML content


  • Links to non-HTML documents rarely target a place inside the document, so clicking a URL to a PDF usually takes a user to the first page rather than deep-linking into a specific place within the PDF
  • Searchers scan for familiar words when they arrive, but are confused when the words are not the first things they see
  • Users can’t point other users to content, or get back to the exact page they just visited
  • TIP: Put key links on the first screen, and other visible places in the non-HTML document
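
For PDFs specifically, one partial workaround for deep linking is a page fragment on the URL. Adobe's PDF open parameters describe a `#page=N` form, and many browser PDF viewers honor it, though support varies and other non-HTML formats have no direct equivalent. The following is a minimal Python sketch, not a definitive approach; the helper name `pdf_page_link` and the example URL are assumptions made for illustration.

```python
from urllib.parse import urldefrag


def pdf_page_link(pdf_url: str, page: int) -> str:
    """Return pdf_url with a '#page=N' fragment appended.

    Hypothetical helper: whether the link actually opens at that page
    depends on the PDF viewer the visitor happens to use.
    """
    if page < 1:
        raise ValueError("page numbers start at 1")
    # Drop any fragment already present so only one '#...' part remains.
    base, _old_fragment = urldefrag(pdf_url)
    return f"{base}#page={page}"


# Example with a placeholder URL (not a real document).
link = pdf_page_link("https://example.org/annual-report.pdf", 12)
print(link)  # https://example.org/annual-report.pdf#page=12
```

Links built this way can be placed on the HTML pages that point into a long PDF, so that users land on the relevant page when their viewer supports the fragment and on the first page otherwise.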


Users visit the non-HTML content frequently


  • Even when an HTML equivalent is provided, users often arrive at the PDF instead: non-HTML documents tend to rank high in search results because they contain more words and more instances of the search terms
  • But the HTML equivalent is often what searchers really need, while the PDF is just the overall document
  • TIP: Link to the HTML version or updates from deep in the non-HTML version



© 2012-7 Neo Insight. All Rights Reserved