Paul Kiddie

Recovering a corrupted Word 2007 document

August 14, 2009

I had the fright of my life this morning when I opened one of my thesis chapters and was presented with this message:

corrupted1

I pressed OK and was then asked whether I wanted to recover the document:

corrupted2

I chose to recover it, only to be presented with the earlier error message again, with Word making no attempt to recover the document at all:

corrupted3

At this point I was concerned I’d lost a whole day’s work. What was even more interesting was that the file timestamp was wrong: it showed 6:10pm, even though I had saved over the file later that evening, at 9:10pm. I work on an external USB hard disk and always try to safely remove the hardware, so I was at a bit of a loss as to the cause.

So, knowing that a .docx is simply a zip file containing a number of XML files and other content, I changed the extension from .docx to .zip and tried to open it in Windows Explorer. The zip handler in Explorer couldn’t open it, so I concluded that the zip container itself must be corrupted. Not good!
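The same check can be done without renaming anything. The following is a minimal sketch, assuming Python 3 and its standard zipfile module; the function name check_docx_container is my own invention for illustration, and it only tests the zip container, not whether Word will accept the XML inside.

```python
import zipfile
import zlib


def check_docx_container(path):
    """Return (ok, detail) describing whether the file opens as a zip archive."""
    # is_zipfile looks for the end-of-central-directory record near the end of the file.
    if not zipfile.is_zipfile(path):
        return False, "not a readable zip archive"
    try:
        with zipfile.ZipFile(path) as zf:
            # testzip reads every member and returns the first name that fails, or None.
            bad = zf.testzip()
            if bad is not None:
                return False, f"member failed its integrity check: {bad}"
            # The main body of a Word document normally lives in this member.
            if "word/document.xml" not in zf.namelist():
                return False, "word/document.xml is missing"
    except (zipfile.BadZipFile, zlib.error) as exc:
        return False, f"damaged archive: {exc}"
    return True, "container looks intact"
```

Calling check_docx_container("chapter.docx") (a hypothetical file name) returns a pair such as (False, "not a readable zip archive"), which is roughly the situation Explorer was reporting here.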

A quick search for zip recovery turned up a great, free tool called zip-repair from DiskInternals. I gave the tool the corrupted document with the zip extension and it recovered all of the files, although some were in better shape than others:

zip-repair-docx
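I don’t know how zip-repair works internally, so the following is not its method. It is only a rough sketch of one common salvage idea: ignore the (possibly missing) central directory at the end of the file and scan for the local file headers that precede each member. It assumes Python 3; the function name salvage_zip_members is made up for illustration, and it handles only stored and deflated members.

```python
import struct
import zlib

LOCAL_HEADER_SIG = b"PK\x03\x04"
# 30-byte local file header: signature, version, flags, method, time, date,
# CRC-32, compressed size, uncompressed size, name length, extra length.
LOCAL_HEADER = struct.Struct("<4sHHHHHIIIHH")


def salvage_zip_members(data):
    """Scan raw bytes for local file headers; return {name: (content, crc_ok)}.

    crc_ok is True or False when the header carries a CRC, and None when the
    CRC lives in a trailing data descriptor that this sketch does not parse.
    """
    recovered = {}
    pos = data.find(LOCAL_HEADER_SIG)
    while pos != -1:
        next_search = pos + 4
        header = data[pos:pos + LOCAL_HEADER.size]
        if len(header) < LOCAL_HEADER.size:
            break  # truncated header at the very end of the file
        (_sig, _ver, flags, method, _time, _date,
         crc, comp_size, _size, name_len, extra_len) = LOCAL_HEADER.unpack(header)
        name_start = pos + LOCAL_HEADER.size
        body_start = name_start + name_len + extra_len
        name = data[name_start:name_start + name_len].decode("utf-8", "replace")
        has_descriptor = bool(flags & 0x08)
        content = None
        if method == 8:  # deflate: the raw stream marks its own end
            inflater = zlib.decompressobj(-zlib.MAX_WBITS)
            body = data[body_start:]
            try:
                content = inflater.decompress(body)
            except zlib.error:
                content = None  # damaged stream; give up on this member
            else:
                if inflater.eof:
                    # Skip the bytes we consumed so they are not rescanned.
                    next_search = body_start + len(body) - len(inflater.unused_data)
        elif method == 0 and not has_descriptor:  # stored, size known up front
            content = data[body_start:body_start + comp_size]
        if content is not None:
            crc_ok = None if has_descriptor else (zlib.crc32(content) == crc)
            recovered[name] = (content, crc_ok)
        pos = data.find(LOCAL_HEADER_SIG, next_search)
    return recovered
```

The recovered members could then be written into a fresh archive with zipfile.ZipFile(..., "w") and the result renamed to .docx. Members whose crc_ok is False, or that are missing altogether, would correspond to the files that came back "in worse shape". A byte sequence that merely looks like a header can also produce bogus entries, so treat the output with suspicion.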

I then changed the extension on the recovered file back to .docx and tried to open it again in Word 2007. I was initially greeted with the same error message saying the file was corrupted, but this time the recovery attempt succeeded and I was presented with the document, content intact.

Some of the figures from Visio were broken and the formatting of the document was fairly non-existent, but it had still preserved headers and inline cross-references, including bibliographic references, which were the more time-consuming things to recreate. So I was able to just cut and paste the relevant material and import the bibliography XML from JabRef, and complete data loss was averted.

Perhaps this is something for Microsoft to look at including in Office 2010: try to repair the zip file before working on the content? Hope this helps someone and, of course, your mileage may vary.


👋 I'm Paul Kiddie, a software engineer working in London. I'm currently working as a Principal Engineer at trainline.