[Word] FYI: Word 2007 XML Viewer
David A. Gray
dagray at p6c.com
Sun Feb 7 01:06:37 CST 2010
One of the secrets to the transportability of Microsoft Word documents is
that, from the moment of inception, even a blank document contains, among
other things, the entire collection of styles, as they existed in the
template to which it was attached when it was created. This applies to all
documents, because every document is based on some template, even if one is
not nominated at inception or subsequently attached. By default, all new
documents are based on NORMAL.DOT and therefore inherit its style sheet.
All styles are copied, even the unused ones, so that any subsequent user of
the document can use them, exactly as they existed in the template that its
creator attached.
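
For anyone who wants to see the styles that travel with a file, here is a minimal Python sketch. It assumes the document was saved in the Word 2007 .docx format (a ZIP package whose style definitions live in the word/styles.xml part), not the older binary .DOC format. The function name list_styles is my own invention, and the sketch simply lists whatever styles the package stores.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the parts of a .docx package.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def list_styles(docx):
    """Return (type, styleId) pairs for every style stored in a .docx.

    `docx` may be a file path or a binary file-like object.
    list_styles is a hypothetical helper, not part of any Word API.
    """
    with zipfile.ZipFile(docx) as package:
        root = ET.fromstring(package.read("word/styles.xml"))
    return [
        (style.get(f"{{{W_NS}}}type"), style.get(f"{{{W_NS}}}styleId"))
        for style in root.iter(f"{{{W_NS}}}style")
    ]
```

Calling list_styles("report.docx") (the file name is only an example) returns a list of pairs such as ("paragraph", "Normal").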
This raises another important point: although the styles accompany the
document, the fonts that they name do not, so any font that doesn't exist on
the machine on which it is opened will be substituted. In 1997, I had a very nasty
experience with this, when I sent a document to a client, so that they could
print it on a color laser printer for inclusion in a software manual. The
document used a custom TrueType font, RRKeyFonts, which, of course, they
didn't have. Since I didn't tell Word to embed TrueType fonts, Word
substituted Wingdings, with disastrous results. I didn't catch the error
until after the printing, and we had to throw away the entire batch. Moral:
If you use unusual fonts in your document, and you plan to send it to anyone
who may not have them, be sure to embed them before your final save.
Things used to get really interesting when you passed around a document that
was based on a custom template, such as the CALMSDS.DOT that I developed for
one of my clients. Word 2000 would try to find the template, using a UNC
path to the attached template, which Word stored in the document. If this
UNC path was inaccessible, as would be the case when you sent the document
outside your company, there could be a delay of up to several minutes, while
Word waited for the attempt to connect to the invalid share to time out and
fail. At that point, Word would attach the document to the local NORMAL.DOT,
and all was right with the world, unless the outside user needed some of the
other features of the original attached template, such as macros or custom
toolbars, to interact with the document.
Shortly before Word 2002 shipped, I opened a paid support incident with
Microsoft, and we spent many hours, over the course of several weeks,
identifying and documenting this problem. Fortunately, Microsoft fixed the
problem in Word 2002, which was the version that my client was about to
adopt, as part of a worldwide upgrade from Windows 95 and Office 97 to
Windows XP and Office XP.
BTW, if you want an easy way to see all these style sheets, save any Word
document in HTML format, and open the resulting file in a text editor. You
will see a massive Cascading Style Sheet in its HEAD section, with an entry
for every paragraph and character style defined in the underlying template.
This is part of the reason that HTML documents saved from Microsoft Word are
so large.
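
If you would rather count the entries than scroll through them, the following Python sketch pulls the selectors out of the STYLE block of a saved HTML file. It is a rough illustration, not a real CSS parser: it assumes the style sheet is a flat list of rules, and the names StyleCollector and style_selectors are mine.

```python
import re
from html.parser import HTMLParser


class StyleCollector(HTMLParser):
    """Collect the raw text found inside <style> elements."""

    def __init__(self):
        super().__init__()
        self._in_style = False
        self.css = []

    def handle_starttag(self, tag, attrs):
        if tag == "style":
            self._in_style = True

    def handle_endtag(self, tag):
        if tag == "style":
            self._in_style = False

    def handle_data(self, data):
        if self._in_style:
            self.css.append(data)


def style_selectors(html_text):
    """Return the selector text of each rule in the page's style sheet."""
    collector = StyleCollector()
    collector.feed(html_text)
    collector.close()
    css = "".join(collector.css)
    # Drop CSS comments and the HTML comment markers that wrap the sheet.
    css = re.sub(r"/\*.*?\*/", "", css, flags=re.S)
    css = css.replace("<!--", "").replace("-->", "")
    # Whatever precedes an opening brace is treated as a selector.
    return [s.strip() for s in re.findall(r"([^{}]+)\{", css) if s.strip()]
```

Read the saved file into a string, pass it to style_selectors, and take len() of the result to get a rough count of the rules that Word wrote.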
David Gray, MBA, Chief Wizard
WizardWrx - Making software magic since 1985
V: +1 (817) 812-3041
C: +1 (817) 298-0867
TZ: USA Central
E: dagray at wizardwrx.com
W: www.wizardwrx.com
3971 North O'Connor Road
Irving, TX 75062-7640
USA
Tell me what you need, and I'll conjure it.
-----Original Message-----
From: word-bounces at dcomp.com [mailto:word-bounces at dcomp.com] On Behalf Of
Ron Solecki
Sent: Saturday, 06 February, 2010 21:59
To: DailyWordTips
Subject: [Word] FYI: Word 2007 XML Viewer
I found an interesting add-on to display XML code:
http://wordsourceviewer.codeplex.com/
It looks to be in early development, so it may not be fully functional yet.
It adds a new group, "Source View", to the Developer tab. To use it:
1- Open your doc
2- Navigate to the Developer tab
3- Click on the "Source Pane" button to display the new task pane
4- Click on the "Copy to Source Pane" button
You will now see the underlying XML code for the document. Even with a
brand new, blank doc, the viewer showed me 1400+ lines of XML code.
Granted, XML is a very "verbose" language, but by looking at this you can
get an idea of the underlying complexity of Word documents. Most (all?) of
this formatting "stuff" was also present, in a different, "binary" form, in
the old DOC format, which explains why the old DOC format was subject to
corruption.
The source code displayed does not match the example screen capture on the
site. Strangely enough, I could not find the document text.
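
As a cross-check that does not depend on the add-in, here is a small Python sketch that looks inside a .docx file directly. It assumes the Word 2007 package layout, in which the body text is stored in w:t elements in the word/document.xml part. The helper names docx_parts and docx_paragraphs are made up for this example, and I cannot say which part the add-in itself displays.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_parts(docx):
    """Map each part name in the package to its uncompressed size in bytes."""
    with zipfile.ZipFile(docx) as package:
        return {info.filename: info.file_size for info in package.infolist()}


def docx_paragraphs(docx):
    """Return the text of each paragraph found in word/document.xml."""
    with zipfile.ZipFile(docx) as package:
        root = ET.fromstring(package.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(f"{{{W_NS}}}p"):
        # A paragraph's text is split across runs; join the w:t pieces.
        pieces = [t.text or "" for t in para.iter(f"{{{W_NS}}}t")]
        paragraphs.append("".join(pieces))
    return paragraphs
```

Both functions accept a file path or a binary file-like object. docx_parts shows how the package is divided up, and docx_paragraphs should turn up the document text if it is stored where this sketch assumes.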