After upgrading poi-scratchpad to 4.1.1, I'm getting this exception when running HMEFContentsExtractor.main:
Exception in thread "main" java.lang.IllegalArgumentException: Unknown type 72 / 0x0048 - CLS ID GUID @ 16
	at org.apache.poi.hmef.attribute.MAPIAttribute.getLength(MAPIAttribute.java:204)
	at org.apache.poi.hmef.attribute.MAPIAttribute.create(MAPIAttribute.java:170)
	at org.apache.poi.hmef.attribute.TNEFMAPIAttribute.<init>(TNEFMAPIAttribute.java:41)
	at org.apache.poi.hmef.attribute.TNEFAttribute.create(TNEFAttribute.java:71)
	at org.apache.poi.hmef.HMEFMessage.processMessage(HMEFMessage.java:99)
	at org.apache.poi.hmef.HMEFMessage.process(HMEFMessage.java:81)
	at org.apache.poi.hmef.HMEFMessage.<init>(HMEFMessage.java:66)
	at org.apache.poi.hmef.extractor.HMEFContentsExtractor.<init>(HMEFContentsExtractor.java:74)
	at org.apache.poi.hmef.extractor.HMEFContentsExtractor.main(HMEFContentsExtractor.java:58)
This worked fine with v4.1.0.
OS: Ubuntu Linux 19.10
Java: 13.0.1
Thanks for the report. Could you provide a sample file? It helps us debug the issue and will help us build a regression corpus.
Created attachment 36894
Unparseable winmail.dat
This doesn't parse using 4.1.1, but works fine using 4.1.0.
Are you sure it worked with 4.1.0? I tried it quickly, but it also failed with 4.1.0 and there were no changes in the HMEF area which would cause such a regression.
It works in my project when I downgrade, and it stops working when I upgrade again. I'll try to make a separate Maven project to reproduce it.
Created attachment 36895
screenshot of Run-configuration in IDEA
This project: https://github.com/andreak/tnefextractorfail fails using the run-configuration in the attached screenshot. Changing the poi-scratchpad version to 4.1.0 with this property <version.apache-poi-scratchpad>4.1.0</version.apache-poi-scratchpad> makes it work again.
FWIW, jtnef parses it fine: https://www.freeutils.net/source/jtnef/
Are you able to reproduce this, i.e. make it work with 4.1.0 and not 4.1.1? Let me know if there's anything more I can do.
Thanks for the details, "git bisect" identifies the following commit as causing this regression:
162f69655fc9146c94dfc4b4e101cbaf46255356 is the first bad commit
commit 162f69655fc9146c94dfc4b4e101cbaf46255356
Date:   Wed Apr 17 20:18:29 2019 +0000
    #github-143 - MAPIType.isFixedLength: not true in case of length > 8
See also r1857708
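To illustrate why that commit subject matches the reported exception, here is a minimal standalone sketch. It is not POI's code: the class MapiLengthSketch, the PropType type, and both length methods are hypothetical names made up for this example, and the "strict" rule is only my reading of the commit subject. The only facts taken from the report are the type id 0x0048, its name "CLS ID GUID", its 16-byte length, and the exception text.

```java
import java.util.Locale;

public class MapiLengthSketch {
    /** Hypothetical stand-in for a MAPI property type; length -1 means variable length. */
    public static final class PropType {
        final int id;
        final String name;
        final int length;

        PropType(int id, String name, int length) {
            this.id = id;
            this.name = name;
            this.length = length;
        }

        /** Assumed rule from the commit subject: a declared length above 8 is not "fixed". */
        boolean isFixedLengthStrict() {
            return length != -1 && length <= 8;
        }

        @Override
        public String toString() {
            // Same layout as the type description in the reported exception message.
            return String.format(Locale.ROOT, "%d / 0x%04X - %s @ %d", id, id, name, length);
        }
    }

    // A 4-byte integer type and the 16-byte GUID type from the stack trace.
    public static final PropType LONG = new PropType(0x0003, "Long", 4);
    public static final PropType CLS_ID = new PropType(0x0048, "CLS ID GUID", 16);

    /** Rejects every type the strict rule does not consider fixed length. */
    public static int lengthStrict(PropType t) {
        if (t.isFixedLengthStrict()) {
            return t.length;
        }
        throw new IllegalArgumentException("Unknown type " + t);
    }

    /** Accepts every type that declares a length, including 16-byte GUIDs. */
    public static int lengthTolerant(PropType t) {
        if (t.length != -1) {
            return t.length;
        }
        throw new IllegalArgumentException("Variable-length type " + t);
    }

    public static void main(String[] args) {
        System.out.println(lengthTolerant(CLS_ID));
        try {
            lengthStrict(CLS_ID);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Under these assumptions, a 4-byte type passes both checks, while the 16-byte GUID passes only the tolerant one. Whether the eventual fix takes this shape is a separate question; the sketch only shows how a "length <= 8" condition can turn a known 16-byte type into an "Unknown type" error.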
Glad you found it, looking forward to next release:-)
Fixed via r1870692. May I use your example winmail.dat in our corpus, or can you provide an anonymized one? (This would help to keep the extraction valid.)
Use it, it's fine.
Added the example file via r1872480 and optimized a few unit tests.