Edit OOXML with VBA – Cool Code

For Office users, the closest thing to a “programming language of the people” is VBA. It’s not too hard to get started, there are gobs of help information from a good search, and the results are immediate. But VBA’s abilities haven’t expanded as its environment has changed. This has become abundantly clear with current versions of Office, where task panes and the Windows-version Backstage haven’t been included in the VBA object model. Many want to edit OOXML with VBA, but Microsoft prefers to shuffle you off to the Open XML SDK programmed with C# to do that job.

Fortunately, we’re on the case at Brandwares. We collaborated with programmer Jan Karel Pieterse to develop a PowerPoint version of his macro set that edits Excel OOXML. We’re making this freely available as a download so you can get the benefit of this.

Let me apologize to my macOS readers. I really try to provide solutions that work cross-platform, but this macro set relies on Windows system calls.

I’ll be honest, this isn’t the most elegant OOXML editing solution. The macro set unzips the OOXML to its component files, gives you the opportunity to edit the XML using VBA string manipulation, then rezips the OOXML to a usable PowerPoint file. The unzip/rezip operations are fairly slow, especially with large files. It’s not something you can use in a real-time editing situation.

One of Brandwares’s specialties is converting legacy presentations to new themes/templates. Often, there are OOXML mismatches that make reused slides retain artifacts or formatting from the old decks. We solve these issues with macro convertors that take a folder full of old decks and transform them into new presentations with new branding. This macro set is great for that.

Edit OOXML with VBA: a Peek Under the Hood

THe file contains 3 VBA modules and 1 class module. Module modConvert is the only one you need to modify. In it, Sub MainVBAOperations does the actual work of opening files, saving as a work file, calling the XML process, saving the modified file and deleting the work file. This is also where you would do any additional VBA processing. As one example, after you modify the XML of a slide master or layout, you have to reset the slide based on it to display the changes. MainVBAOperations is where you would do this.

The other Sub is ProcessXML. Here’s where you unzip the file being modified, open different XML parts for find and replace processes, followed by a rezip of all files back to a working file. The sample code in this module shows a typical revision to the idx numbers of placeholders, a common requirement of legacy presentation conversions and one that can’t be done with the PowerPoint interface.

The module modDisplay, by Shyam Pillai, provides the PowerPoint equivalent of the Application.Screenupdating command that exists in Word VBA. Useful to prevent the screen flashing and jumping as files are processed, it also helps speed code execution. modUNC by Randy Birch, assists with file management.

Jan Karel Pieterse wrote the class module clsEditOpenXML that does the heavy lifting of unzipping and rezipping the document to be modified and reading and writing the XML.

As noted in the code, You are free to use this code within your own applications, add-ins, documents etc but you are expressly forbidden from selling or otherwise distributing this source code without prior consent. This includes both posting free demo projects made from this code as well as reproducing the code in text or html format.

Converting Legacy Presentations

We use often use this macro set to update old (legacy) presentations with a new design. Successful updating requires meeting 5 criteria, please read this article for more details: Legacy Slides – Best Practices. As noted on that page, the 5th requirement is that placeholder idx numbers in the OOXML must match on the old and new layouts. There’s nothing in PowerPoint’s interface that allows you to set idx number, but this macro set allows you to do just that. The pre- and post-processing sections of the macros allow you to set the the other 4 parameters for each slide layout. Click here to download it.

The following advice is particular to presentation conversion. It’s routine that slide masters and layouts will be changed in that process. Then, to apply those changes to the actual slides in a presentation, the slides must be reset, as if you pressed the Home>Reset button in a presentation. Resetting slides wipes out character-based formatting. If a user has applied bold or italic or an underline to particular text, that all will disappear. It’s important to notify your client of this. To make an exact update would require a painstaking construction of a multi-dimensional array for each placeholder on each slide that would record all character-based formatting, then restore it after the update, for which you would have to charge many times as much as for the basic conversion work.

Brandwares is a world leader in presentation updating and conversion. We’re available for presentation assessments, to identify potential problems. We have multiple techniques for seamless re-use of legacy presentations. Contact us when you’re redesigning to ensure your new template will reuse your old slides without a hiccup.

Multiple Slide Masters: Beware! – Best Practices

One of my areas of expertise is to do presentation assessments for designers, finding any potential problems in their work before sending their file to their client. Something I see fairly often is multiple slide masters in a template or theme, where they are used as an organizational tool. Typically, one master will be for title slides, another for section headers, and so on. This is a poor practice, but it goes on because designers never have to actually use the files they create. If they did, the problem would soon become evident. Here’s a typical example:

Three Slide Masters

Multiple Slide Masters – How They Work

If I asked you “How does PowerPoint know which slide master to use when there is more than one?”, what would your answer be?

Well, I’ve done the research, so I can let you know. When you paste slides into a presentation, PowerPoint looks at the slide immediately preceding the paste position and uses the slide master for that slide. If there are no slides in the presentation, then PowerPoint uses the first slide master. This has major consequences for the format of the pasted slides. If the slide master of the preceding slide has only title layouts, only the pasted title slides will find their rightful layout. All the rest of the slides will bring their old layout with them. So now the slide master for titles has a whole bunch of non-title layouts. Oops. So much for your organizational scheme. Following are the original slides, followed by what they look like when pasted after slides based on different masters:

4 slides based on 3 slide masters
4 slides from 3 masters
Pasted after slide based on first master
Pasted after first slide
Pasted after slide based on second master
Pasted after second slide
Pasted after slide based on third master
Pasted after third slide

If the pasted slides are from legacy presentations, where the intention was to update them to the new look, you get a total fail. Not only do they not update, they’ve ruined the new deck with added old layouts. The next time the user adds a new slide and tries to find a layout in that slide master, they get a psychedelic soup of designs.

But wait! It gets worse! For some bizarre reason, designers have a bad habit of deleting all the placeholders on the slide master, then building the design in the layouts. Ouch! The placeholders on the slide master are parents to the child placeholders on the layouts. They are the source for font and paragraph formatting in layout placeholders, as well as the default placeholder positioning. When you paste slides so they use that blank slide master, formatting goes haywire. The slide layouts that get imported with the pasted slide have no source to tell them how to display content. PowerPoint makes a guess and always guesses wrong. Bold text becomes regular, font sizes changes, pictures disappear. It’s a nightmare, and one that designers just don’t understand, can’t explain to their clients, and have no idea how to fix.

A Better Approach

Just use a single slide master. In 99% of presentations, one slide master is all you need. That slide master should be formatted as closely as possible to look like a typical content slide. Then for layouts that have different formatting, right-click on the layout background and choose Format background. In the Format Background task pane, check the option for Hide background graphics. Then place the alternate graphics on that layout.

About the placeholders: keep them! Deleting the default placeholders does nothing helpful and can cause problems. If someone pastes a slide with a footer and your layout doesn’t have a footer placeholder, the footer text will reposition itself, often at the very left edge of the slide, because the content has been orphaned. If your design doesn’t include a footer, move the placeholder off the bottom of the layout! This makes a pasted footer simply disappear instead of looking goofy.

Multiple Slide Masters – When and How to Use Them

So far, there are only 2 uses I’ve found a genuine need for multiple slide masters. One is when a presentation requires multiple color themes. Some organizations have divisions that use code colors: a key color that is used only in that group. This situation was easier in PowerPoint 2003 and earlier, because you could easily apply different color schemes to groups of slides. Modern versions have only 1 color theme per slide master. If you change the color theme, all the slides change. Here’s how to handle that situation:

Under the slide master, create the complete set of layouts needed for the presentation. Set the code color in Accent 1 if it’s a predominant element of the client branding. If it’s a subtle design change, Accent 6 is better. Get the client to sign off, so the design is not going to need a lot of future revisions. Then copy and paste the slide master for each code color required. Finally, revise the color theme for each slide master, changing only the accent color. Following this method, every slide master has every layout. The worst that can happen is someone pastes in the wrong place and the code color is off. Easily fixed.

The other time you need multiple masters when a client has rolled out a new template, then realizes they need to handle legacy slides. Then you create a second slide master with layouts that match the new look but have a layout structure that is compatible with the old decks. At the beginning of the deck create 2 sample slides, one based on the new slide master and another based on the legacy slide master. Place a notice to the user just off the edge of the slide master for the first slide: “New slides only! Don’t paste old slides!” and on the second “Paste old slides here! No new ones!”. Then users are less likely to paste slides in the wrong spot and get weird results.

OOXML Hacking: Replacing Fonts and Languages in PowerPoint

Replacing odd fonts and errant languages in PowerPoint is not something that always works well in the program. Problems with replacing double-byte fonts for Asian languages have been documented on other pages, but designers also have problems replacing Mac-only formats like AAT (Apple Advanced Typography) fonts.

There’s a similar problem with languages. While language tags are scattered all over in XML, you can only reach some of them with PowerPoint’s built-in Review>Language>Set Proofing Language (Windows) or Tools>Language (macOS) command. When files are moved between computers, it’s very easy for the file to have a mix of language tags.

In Windows, there are some free macros available to make replacing languages easier in PowerPoint, but there’s no equivalent for Macs. So here’s my go-to method to fix both problems with not much more than a good text editor, like NotePad++ in Windows and BBEdit on Mac. If you’re new to this site, please read my introductory articles OOXML Hacking: An Introduction and, if you’re on a Mac, OOXML Hacking: Editing in macOS.

Referring to the macOS article, a patient Mac user can use BBEdit 11 or better to open a presentation, select each file in turn and get the job done. But your time will be better spent by creating a network or USB disk that doesn’t create .DS_Store files. It’s a little time-consuming the first time, but if you’re someone who creates PowerPoint files on a regular basis, you’ll soon recover that time. The nice thing about this technique is that it always works, 100% of the time, even with double-byte fonts.

Replacing Fonts in PowerPoint

Unwanted fonts can be introduced when you’re doing design experiments, collaborating with clients, or when you convert a KeyNote presentation to PowerPoint. In all these cases, you may end up with fonts you cannot remove, or even with a deck you cannot save. Start by making a list of all fonts that need to be eliminated. You can get the names by choosing Home>Replace>Replace Fonts (Windows) or Edit>Find>Replace Fonts (macOS menu). Then expand the PowerPoint file to XML parts.

Now fire up your text editor. Both BBEdit and NotePad++ include utilities to find and replace in files. With both, you point the find at a folder full of expanded XML files from your presentation. In the Find field, look for typeface=”Font Name”, where the font name is the name listed in Home>Replace>Replace Fonts (Windows) or Edit>Find>Replace Fonts (macOS menu). The Replace field should also be typeface=”Font Name”, but here the font name is the font to match the rest of the deck. Executing the Find and Replace should display a list of files where changes were made. Both programs automatically save the changed files, so all you have to do is close the text editor and rezip the XML back to a presentation.

Windows users can take a shortcut with this process. PowerPoint for Windows has a save format called PowerPoint XML Presentation (*.xml). This saves the entire presentation as one big XML file. Open that file in plain old NotePad, choose Edit>Replace and do the same replacement as above. No fancy text editor needed! Then open the XML file in PowerPoint and save to a normal presentation format for your client.

Replacing Languages in PowerPoint

PowerPoint puts language tags all over the place. Here’s a random sample showing one paragraph with one word that contains 2 different language tags!:

    <a:rPr lang="en-US" dirty="0">
      <a:latin typeface="Arial Black" panose="020B0A04020102020204" pitchFamily="34" charset="0"/>
  <a:endParaRPr lang="en-CA" dirty="0">
    <a:latin typeface="Arial Black" panose="020B0A04020102020204" pitchFamily="34" charset="0"/>

The technique is nearly identical to font replacement, you just search on a different tag. In the example above, we would find lang=”en-CA” and replace it with lang=”en-US” to create a uniform U.S. English presentation. Here at Brandwares, we do a lot of international work, so files can have language tags from all over. A standard part of file finalization is replacing language tags with the target language for the client.

When you’re working with OOXML files, multifile find and replace is a very useful technique to solve all kinds of problems. As another example, here’s my article on fixing broken color themes with similar techniques: OOXML Hacking: Repairing Color Themes. Mastering multifile find and replace can save you hours over manual repairs to PowerPoint files.

Is your problem more complex? You just can’t get the result you want with find and replace? That’s why we’re here, to help you get your work done faster! Shoot me an email at production@brandwares.com, I’ll get you going in a jiffy.

Office Automation with AppleScript Reference – Cool Code

For Office for Mac users, AppleScript remains a useful tool in scripting Office documents. VBA for Mac has poor links to the operating suystem. In Windows, a programmer can easily make system calls to the OS. In macOS, the analogous system calls are not documented by either Apple or Microsoft. A few intrepid programmers have posted about macOS equivalents, but the information is sporadic and very hard to find. macOS’s built-in AppleScript can solve many of these issues, so I’m posting the best available Office AppleScript Reference.

This is old Microsoft documentation from 2008 that they’ve removed from their site. I don’t guarantee this, or offer support, I’m just posting it to make it available to interested programmers.

Office AppleScript Reference: MacScript vs AppleScriptTask

Ron de Bruin is a European programmer and MVP (Microsoft Valued Professional) who has written lots of good information about programming Office on a Mac. While he writes mostly about Excel, VBA is so similar across Office that most of the information also applies to Word and PowerPoint. Here’s his page on using the AppleScriptTask command in Office 2016 and later to call AppleScripts from a VBA macro: AppleScriptTask in Mac Office 2016 or higher.

Office AppleScript Reference: Conditional Compilation

To create a macro that can run on different versions of Office for Mac, or on both Windows and Mac, you’ll need to use conditional compilation. This uses #If – #Else – #End if statements to run a set of commands only in certain conditions. Here is Microsoft’s page on conditional statements for different Mac versions: Differentiate between Office for Mac versions at compile time. This MS reference page shows all available conditional compiler constants: Compiler constants.

Click here to download: Office AppleScript Reference.

Another useful reference for AppleScript automation of Office is the online article Moving from Microsoft Office VBA to AppleScript by Paul Berkowitz. This is a full-length book with comparisons between VBA and AppleScript, plus many real-world example scripts. It was originally published in 2007, in anticipation of Office 2008. Unfortunately, the downloadable PDF they promise is no longer available.

OOXML Hacking: Protected Area Exceptions in Word

Microsoft doesn’t have a catchy name for this feature, but I’ll try to describe it. In Word for Windows, you can select document text, then apply Read-only protection, but with Exceptions. By default, the exception is Everyone. If we untangle the word logic, this means that the document becomes read-only except that everyone can edit the selected text. It’s a far better solution than the old protection for forms.

When this type of protection is applied, the selected areas become shaded in light yellow as a visual cue that the highlighted text remains editable. Users of Word 2016 for Mac and 2019 for Mac (including Microsoft 365 subscribers) can use these documents, but can’t produce them. At least, until now.

Light yellow areas indicate editable text
Protected area exceptions in Word

Word pros will look at the square bracket and think it’s a bookmark. It’s not. Microsoft reused the bookmark character to show a Permission Range. To add these editable ranges on a Mac, we’re going to create one manually.

To start, apply Read-only protection to the document. Use Tools>Protect Document, then check Protect document for. Click on Read only, then on OK. Save the file.

Now open the file in your XML editor. OOXML Tools in the Chrome browser is fine for this job. Open document.xml inside the word folder.

Just before the text that you want to be editable, insert a line like this:

<w:permStart w:id="883447734" w:edGrp="everyone"/>

At the end of the editable text, insert this XML:

<w:permEnd w:id="883447734"/>

The beginning and end tag must have the same 9-digit random number. Each pair of tags in a document must have a different random number.

Here is a whole paragraph marked as editable:

<w:permStart w:id="783447734" w:edGrp="everyone"/>
<w:p w14:paraId="5B68C6A9">
    <w:t>To make your document look professionally produced, Word provides header, footer, cover page, and text box designs that complement each other.</w:t>
<w:permEnd w:id="783447734"/>

Here, just one word is editable:

professionally produced, <w:permStart w:id="983447734" w:edGrp="everyone"/>Word<w:permEnd w:id="983447734"/> provides header,

It’s that simple. Save the file and distribute to users.

For any of our tips that seem too complicated, Brandwares is available to do it for you. We teach the pros!

Fix It with VBA

Instead of hand-editing the XML, then applying Read-only protection, you can use a bit of VBA to solve this problem. Select the text that should remain editable, then run this macro:

Sub CreatePAE()
  Selection.Editors.Add wdEditorEveryone   ActiveDocument.Protect Password:="", NoReset:=False, Type:=wdAllowOnlyReading, UseIRM:=False, EnforceStyleLock:=False End Sub

If you have multiple areas that should remain editable, run the macro, then unprotect the document (Tools>Protect Document>Uncheck Protect document for option), select new text, then re-run the macro. Leave the document protected before distributing it.

Text Effects? Don’t! – Best Practices

The Best Practice is to NOT use Text Effects in Office. Ever.

That could have been my shortest article ever, but I guess I should explain the reasons. I’m referring to the graduated fills and lines, the glows, reflections, shadows and 3-D effects you can add to text. In the past these effects have caused some problems with ordinary shapes, but with text, they’re a disaster.

Text Effects: Bad in PDFS

Clearly, it’s not enough for most users that these effects are visually hideous. That just a natural result of the low value we assign to arts education. In many years of working with competent graphic artists, I’ve never been asked to create any template that uses these effects. Designers understand the need for restraint, users don’t. And so we get the appalling appearance of most Word and PowerPoint documents.

But the functional problem with these effects is how they affect PDFs created from Word, PowerPoint and Excel. Microsoft has no clue how to export true PostScript of the fancy effects. So they adopt a simplistic approach: flatten them to graphics. Unfortunately, this means the text vanishes, leaving behind only a pretty picture. Well, not even that. All kinds of PDF functions are impaired: Text to speech is impossible, accessibility goes right out the window, reimporting the PDF to Office is brain-dead.

I tested PDFs created in 3 ways: saving to PDF in Office, printing to Acrobat and printing to the Microsoft Print to PDF print driver that comes with Windows 10. When saving to PDF, all text with applied effects was flattened. When printing to either Acrobat or Microsoft Print to PDF, Gradient Fills, Gradient Lines and 3D Effects were flattened, while Shadows, Reflections and Glows remained as live text.

The moral is clear: when the client asks for Text Effects, just say NO!

Every AutoShape – Cool Code

There are more AutoShapes in Office than appear in the user interface. Over time, Microsoft has quietly added to the collection stored in Office. Many of the more recent shapes are used in SmartArt files, while others have no current use that I can detect. But for anyone who hacks XML or codes VBA will find this week’s download or every autoshape a useful reference.

These are all the shapes that can be used in a piece of SmartArt. This is a deep topic: SmartArt XML is a programming language with a Frankenstein syntax. But the starting point for any language is to get the names right. SmartArt and VBA both reference the AutoShapes collection in Office, but they use different names for the same objects. Many of the same shapes are seen in the Shapes dropdown of Office programs, but those names are also different. The main source of the VBA names are from this page: MsoAutoShapeType enumeration (Office), while the XML names come from this out-of-date listing: SmartArt AutoShapes.

So, for my own sanity, I created a Word document showing an example of each shape, along with it’s name in XML, in VBA and in the user interface. This has smoothed out my workflow, and it may help you as well.

Listing of every AutoShape

For any non-interface shapes, you can insert them into a document with code analogous to this. For PowerPoint:

Sub MakeShape()
  ActivePresentation.Slides(1).Shapes.AddShape msoShapeTrapezoid, 24, 24, 144, 144
End Sub

Change the bolded word to the shape name in the VBA MsoShapeType Name column.

Some shapes have specialized VBA commands, like callouts:

Sub MakeShape()
  ActivePresentation.Slides(1).Shapes.AddCallout msoCalloutTwo, 24, 24, 144, 144
End Sub

Callouts led me to a discovery about legacy versions. msoCalloutOne gives exactly the same result as msoCalloutTwo in current versions of Office.That seemed odd, so I ran the same macro in PowerPoint 2003 (I still have it installed for its macro recorder). In that version, msoCalloutOne creates a callout with a vertical leading line that can be moved up and down, but not at an angle. Presumably, MS found that useless and deprecated it.

Download the Word document showing every autoshape here.

OOXML Hacking: Open XML Explained

We’re proud to be hosting a free e-book Open XML Explained. Finding readable explanations of how Office Open XML works isn’t easy. Outside of the very dense published ECMA standards, most of the information exists only on the web in help pages and blog posts. Back in 2007, a developer in the Netherlands, Wouter van Vugt, published a well written PDF that goes into details about the XML formats used in Word, Excel and PowerPoint. This was originally posted on the openxmldeveloper.org website run by OOXML genius Eric White.

Eric’s site closed down some years ago and much of the content disappeared. But I’ve been able to track down a copy of Wouter’s book and have his permission to give it a new home here at Brandwares. Download Open XML Explained.

The e-book covers Excel, Word and PowerPoint in depth, covering otherwise obscure topics like table formatting, bullets and charts. It’s an excellent companion to my book, covering the theory behind my practical techniques. OOXML hackers will appreciate the explanations of all the major XML parts in an Office file, while coders will find this a useful resource in learning how to programmatically create and modify documents, workbooks and presentations.

Open XML Explained

I should note that Open XML Explained was written in 2007, so while it covers all the OOXML basics, it doesn’t have any information about newer features like the Backstage or SuperThemes. Reader David Wiggins was kind enough to find the sample files that accomapied the book. The download page has a link to them as well. Enjoy!

OOXML Hacking: Buy the Book

SOLD OUT! The paper copies of the book are all gone, but the ebook version (with an additional 40 pages of new content) is available here.

After years of original research, you can finally buy the book! Filled with unique information not found anywhere else, online or in print, this manual shows you how to build SuperThemes 3 different ways, how to create custom Effects Themes, how to edit the Ribbon in macOS, and much more!

OOXML Hacking: buy the book

The book expands on many of the brief articles on this site, putting them in logical order and expanding the number of examples. Author John Korchok explains how Office Open XML files work, shows you where to find each XML part and how each part can be modified. With these tools, you can provide unique services to your clients or users that you can’t find at the average Office template service bureau. To give you a better idea of what it covers, here is the Table of Contents:

Table of Contents 1
Table of Contents 2
Table of Contents 3

All techniques are covered in both Windows and macOS. The book includes a link to a downloadable text file with all the hyperlinks, XML and VBA listings, so you don’t have to retype anything from the printed page. At this time, only print copies are available, ebook versions will be here in several months. To buy the book, click here.