Created attachment 37678 [details] Integration test patch The TestAllFiles class is a bit non-deterministic, i.e. the several try catches and various expected failure lists are confusing. I've changed the implementation so it's now based on a test list which takes into account which ... - file, - file handler - test case, - password - and exception/message ... is processed. To have comparable exceptions, I've removed the wrapping IOException/POIXMLException. Furthermore I've activated concurrent processing, which cuts the test time on my machine by 50% - only active for the integration tests. As this would affect our current regression test comparability, I stash it here until we've released POI 5.0.0 ... and maybe you want to give it a review.
Thanks for the suggestion, yes, sound like a good idea to clean up the integration-testing a bit and also the speedup sounds very good as it is becoming a bit long-running already. I will review the impact on the mass-testing, the following items are used currently: * reuse the excludes via TestAllFiles.SCAN_EXCLUDES * a way to lookup matching handlers based on file-extension via POIFileScanner * use the interface FileHandler for executing the test for each file * use BaseIntegrationTest.test() for executing tests for each file None of the logic in TestAllFiles itself is needed, so refactoring it should be possible as long as the basic FileHandler implementations are kept as-is. Removing POIFileScanner would need some other way of getting a list of matching documents. Keeping this in one place would be nice to not duplicate this logic in both projects. Not sure how much impact the changed Exception-handling will have, this will need a closer look. But let's keep this until after the 5.0.0 release so we do not introduce more changes right now.
Applied via r1885538 I've removed POIFileScanner as it's not need for the internal integration tests. To lookup the handler, you can copy the logic of TestAllFiles.allfiles. If this doesn't work, POIFileScanner can be reverted.
POIFileScanner can easily be moved to the regression-test project, but it is now complicated to use the integration-test functionality to find a matching handler for an arbitrary file, i.e. "TestAllFiles.HANDLERS.get(TestAllFiles.getExtension(file))" does not work any more. I would like to re-use this "knowledge" from poi about the relation of extensions to file-handlers.
I refactored the TestAllFiles class and extracted some classes. Have a look at TestAllFiles.allfiles - the StressMap is used to retrieve the FileHandlers. It must be initialized with the mapping file before. For the mass test, the "Exceptions" sheet doesn't need to be provided.