There seems to be something wrong with the new HPSF property parsing and CodePageString. For files such as "TestVisioWithCodepage.vsd" (taken from the Tika test suite, and created with Visio), HPSF decides that at least one property which apparently isn't a codepage string is one. What then happens is that the string "Page-1\0c" is passed through the null termination check in CodePageString, and blows up.
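To illustrate the failure mode, here is a minimal sketch of how a strict null-termination check can reject a value like "Page-1\0c". This is not the actual CodePageString code: the class and method names are hypothetical, and it assumes "\0c" means a NUL character followed by a stray 'c'. A lenient variant that cuts at the first NUL is shown for comparison.

```java
public class NullTerminationSketch {

    // Hypothetical strict check: the last character must be NUL,
    // otherwise the value is rejected.
    static String strict(String decoded) {
        int end = decoded.length() - 1;
        if (end < 0 || decoded.charAt(end) != '\0') {
            throw new IllegalStateException("String is not null-terminated");
        }
        return decoded.substring(0, end);
    }

    // Hypothetical lenient check: cut at the first NUL and ignore
    // whatever follows it.
    static String lenient(String decoded) {
        int nul = decoded.indexOf('\0');
        return nul >= 0 ? decoded.substring(0, nul) : decoded;
    }

    public static void main(String[] args) {
        // Assumed shape of the problem value: NUL followed by a trailing 'c'.
        String value = "Page-1\0c";
        System.out.println("lenient: " + lenient(value));
        try {
            strict(value);
        } catch (IllegalStateException e) {
            System.out.println("strict check failed: " + e.getMessage());
        }
    }
}
```

With the assumed input, the strict variant throws because the last character is 'c' rather than NUL, while the lenient variant returns "Page-1".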
Added a unit test in r1722755 which tries to reproduce this. It seems to work now, so I'm closing this old bug as WORKSFORME for now.