Bug 66029

Summary: poi 5.0 generated xlsx file RUN TO EMAIL APPENDS .TXT TO OUTPUT
Product: POI
Reporter: Rakhi Barayanan <rakhi.narayanan>
Component: XSSF
Assignee: POI Developers List <dev>
Status: REOPENED ---    
Severity: regression    
Priority: P1    
Version: unspecified   
Target Milestone: ---   
Hardware: PC   
OS: All   
Attachments: With poi 5.0
POI 3.17
Format of both files during sharing

Description Rakhi Barayanan 2022-04-25 06:02:21 UTC
We generate Excel files in our organization using POI. After upgrading from POI 3.17 to POI 5.0, RUN TO EMAIL appends .txt to the .xlsx extension of the output. That is, instead of .xlsx the result appears as .xlsx.txt.

The attachment was removed from this email message because it matched a
blocked file type extension

The same setup works fine with POI 3.17, with no changes to the mail server or OS.

Could you please have a look?
Comment 1 PJ Fanning 2022-04-25 08:08:14 UTC
Hi Rakhi - I think this may be better suited to stackoverflow.com - more users looking at questions there.

You haven't given us a reproducible test case, so anything at the moment would be guesswork.

It might be a good idea to test the code changes yourself by adding temp code to write the xlsx out to disk so you can check what you get there.
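A minimal sketch of such a temporary check, using only the JDK. It assumes the generated workbook is already available as a byte array (for example, captured from the stream the workbook is written to before it is handed to the mail code). The class and method names are hypothetical and not part of POI.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical debugging helper, not part of POI.
public class XlsxDump {

    // Writes the generated bytes to a temporary .xlsx file and returns its path.
    static Path dumpToDisk(byte[] xlsxBytes) throws IOException {
        Path file = Files.createTempFile("poi-debug-", ".xlsx");
        Files.write(file, xlsxBytes);
        return file;
    }

    // Lists the ZIP entry names of the file.
    // Throws a ZipException (an IOException) if the file is not a readable ZIP.
    static List<String> listEntries(Path file) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipFile zip = new ZipFile(file.toFile())) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                names.add(entries.nextElement().getName());
            }
        }
        return names;
    }
}
```

If the dumped file opens in Excel and the entry list looks like a normal xlsx package, the generation step is probably fine and the problem lies later in the pipeline.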
Comment 2 Rakhi Barayanan 2022-04-25 09:26:27 UTC
The same code works fine with POI 3.17. The issue is with POI 5.0.

Has anyone faced this before, or is there a similar bug?
Comment 3 Andreas Beeker 2022-04-25 10:59:17 UTC
Of course POI doesn't change the filename, so it is something else. My guess is, you receive an error when generating the file and the .xlsx.txt is not a valid file.
Please check your dependencies and use a dependency manager like Maven or Gradle. If you do, remove all POI-related dependencies like ooxml-schema*.jar and only reference poi-ooxml.
Then check the output file and any exceptions - if it's still invalid, provide the exceptions and the list of dependencies.
Comment 4 Rakhi Barayanan 2022-04-25 12:01:47 UTC
Thanks for the suggestion. We will try this.
One more observation with the latest POI (5.0): even though the output shows as type .xlsx, it is generated in binary format. I am attaching both the POI 3.17 and POI 5.0 outputs.
Comment 5 Rakhi Barayanan 2022-04-25 12:02:26 UTC
Created attachment 38260 [details]
With poi 5.0
Comment 6 Rakhi Barayanan 2022-04-25 12:03:10 UTC
Created attachment 38261 [details]
POI 3.17
Comment 7 Rakhi Barayanan 2022-04-25 12:11:32 UTC
Created attachment 38262 [details]
Format of both files during sharing
Comment 8 Nick Burch 2022-04-25 12:19:58 UTC
Without any code, there's very little we're going to be able to do to help you. "RUN TO EMAIL APPENDS" is not a built-in bit of Apache POI code.

If you can share some code, someone here may have some time to look and see your mistake, but no promises - we're all volunteers here. You may have more luck on stackoverflow as there are more people there, but they will also need code, and will insist on it being a small sensible snippet not a huge code dump.

Otherwise, you will need to talk to whoever you pay for support, as that certainly isn't any of us who are doing this for free...
Comment 9 Rakhi Barayanan 2022-04-25 12:48:40 UTC
Hi Nick,

Please ignore the part about Run to email. My concern is that the code worked well and generated a proper xlsx with 3.17, whereas with POI 5.0, even though the generated file appears to be xlsx, you can see it is in binary format when you try to share it. I have added the attachments here; you can try it and see.
I think I may be missing some dependency. Has anyone here faced a similar issue?
Comment 10 Axel Howind 2022-04-25 12:57:02 UTC
Without seeing any code, there is no way anyone can reproduce this. To me it seems rather to be a problem with the program that sends the email and not with POI. Also you should NEVER send excel files containing sensitive information like password hashes and the corresponding salt to a public bug tracker.

I am really sure the problem is within the code that actually sends the email, and not within POI, as I am really sure that the file name is not generated by POI. You should post that code to a site like stackoverflow and try to get a solution there.
Comment 11 Rakhi Barayanan 2022-04-26 13:23:28 UTC
Thanks for looking into this.
Unfortunately I may not be able to post the code here; it is against the organisation's policy.
My application code is the same with POI 3.17 and POI 5.0.
But the xlsx file generated using POI 5.0 is in binary format when shared (please see the attachments). Just replacing the binaries with the old POI 3.17 ones in the same environment makes everything work.
Has any new encoding functionality been added after POI 3.17, or can anyone point out a change that may lead to this?
Or, if I am missing any dependency jars, please help me with that.
Comment 12 Axel Howind 2022-04-28 10:25:36 UTC
As I understand this, the problem is that the files generated with POI 5 are displayed as "Binary" instead of "Excel Spreadsheet" in your in-house application. Since Excel does not expose this problem (I can open both without any error using MS Excel), this is rather a problem with your application code.

You should check how your application determines the file format. It should be updated accordingly. Both attached files start with the required PK-zip header (504B0304). Whatever your application does to determine the file type from there on is unknown to us POI developers if you cannot share the code. If you use a library for file type detection, open an issue with that library's maintainers.
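For reference, the header check mentioned above can be sketched with the JDK alone. The class and method names are hypothetical. Note that this signature only identifies a ZIP container; it cannot by itself distinguish an xlsx file from any other ZIP archive.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration only.
public class ZipMagic {

    // ZIP local file header signature "PK\003\004", i.e. bytes 50 4B 03 04.
    private static final byte[] SIGNATURE = {0x50, 0x4B, 0x03, 0x04};

    // Returns true if the data starts with the ZIP local file header signature.
    static boolean startsWithZipHeader(byte[] data) {
        if (data.length < SIGNATURE.length) {
            return false;
        }
        for (int i = 0; i < SIGNATURE.length; i++) {
            if (data[i] != SIGNATURE[i]) {
                return false;
            }
        }
        return true;
    }

    // Same check for a file on disk; reads only the first four bytes.
    static boolean startsWithZipHeader(Path file) throws IOException {
        byte[] head = new byte[SIGNATURE.length];
        try (InputStream in = Files.newInputStream(file)) {
            int off = 0;
            while (off < head.length) {
                int n = in.read(head, off, head.length - off);
                if (n < 0) {
                    return false; // file is shorter than four bytes
                }
                off += n;
            }
        }
        return startsWithZipHeader(head);
    }
}
```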
Comment 13 Rakhi Barayanan 2022-05-03 11:39:02 UTC
Hi All,

The fix for Bug 59747 - xlsx file does not conform to bit patterns used by common file type detection software (https://bz.apache.org/bugzilla/show_bug.cgi?id=59747) was checked in to POI 4.0.0.
Reverting this change will solve our issue.
Could you please let us know what this fix is for and why it is affecting our system?

Regards,
Rakhi
Comment 14 Rakhi Barayanan 2022-05-05 04:32:53 UTC
Could we get an update on this?
Comment 15 Nick Burch 2022-05-05 07:22:43 UTC
We still cannot reproduce your problem without any code. We still remain entirely volunteers.

Without some code showing the problem (ideally as a small, self contained, failing junit unit test case), or some sensible way for us to reproduce the issue, there is very little we can do to help. Outputs don't really assist much if the problem is a bug in the code generating it...
Comment 16 Axel Howind 2022-05-06 09:08:52 UTC
The result of analysis is that this is not a bug in POI. The files you attached are both valid Excel files and can both be opened without problems using original MS Excel.

You have yourself hinted at the apparent cause of the incompatibility with your software: apparently your detection algorithm relies on an undocumented implementation detail, that is the order of zip file entries. That order has been changed to *increase* compatibility with third party software (https://bz.apache.org/bugzilla/show_bug.cgi?id=59747). We will not revert that change as that would mean introducing an incompatibility with a widely used tool for file detection.

As has been stated before, the correct approach is to fix the detection algorithm that you are using. As you cannot share the code with us, there is nothing more we can do for you. If you rely on some external library for the detection, please open an issue with that library. If the detection algorithm is developed by your company, you have to fix it yourself. You should be able to get help on stack overflow, but people there will probably request to see your current code as well.
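As an illustration of a detection approach that does not depend on entry order, here is a minimal JDK-only sketch. It is a heuristic under stated assumptions, not the algorithm of any particular detection library: it assumes an xlsx package can be recognised by the presence of a `[Content_Types].xml` entry plus at least one entry under `xl/`, wherever they appear in the archive. The class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical, order-independent sniffer for illustration only.
public class XlsxSniffer {

    // Scans every ZIP entry instead of inspecting only the first one.
    static boolean looksLikeXlsx(byte[] data) throws IOException {
        boolean hasContentTypes = false;
        boolean hasSpreadsheetPart = false;
        try (ZipInputStream zin = new ZipInputStream(new ByteArrayInputStream(data))) {
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                String name = entry.getName();
                if (name.equals("[Content_Types].xml")) {
                    hasContentTypes = true;
                }
                // Assumption: spreadsheet parts live under the conventional "xl/" folder.
                if (name.startsWith("xl/")) {
                    hasSpreadsheetPart = true;
                }
            }
        }
        return hasContentTypes && hasSpreadsheetPart;
    }
}
```

Because the scan covers the whole archive, the result is the same whichever entry the writing library happens to emit first.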
Comment 17 Axel Howind 2022-05-06 09:10:30 UTC
This works as intended so closing as WONTFIX.
Comment 18 shashidhar 2023-01-29 05:07:24 UTC
We are also able to reproduce this issue after switching from POI 3 to POI 5. If we export the data in Excel format and download the file, we see no issue with the exported file. The problem occurs only when we export the Excel file and attach it to an email programmatically.
Comment 19 Dominik Stadler 2023-01-29 15:43:23 UTC
I think the analysis from the previous comments still stands. We do not plan to change anything in Apache POI as far as we could analyze your issue.

From the previous analysis by Axel: "As has been stated before, the correct approach is to fix the detection algorithm that you are using."
Comment 20 shashidhar 2023-01-29 22:39:07 UTC
Hi, we implemented Excel export and emails with the Excel file as an attachment 4 years ago using the POI 3 libraries. After switching from POI 3 to POI 5 we started noticing that the attached Excel file is broken because the file is renamed to .xlsx.txt. We also noticed that this does not happen when we use the Gmail email server; it happens only with the Microsoft Exchange email server. We are 100% sure this issue is caused by the library upgrade.
Comment 21 shashidhar 2023-01-29 23:14:58 UTC
(In reply to shashidhar from comment #20)
> Hi, we implemented Excel export and emails with the Excel file as an
> attachment 4 years ago using the POI 3 libraries. After switching from POI 3
> to POI 5 we started noticing that the attached Excel file is broken because
> the file is renamed to .xlsx.txt. We also noticed that this does not happen
> when we use the Gmail email server; it happens only with the Microsoft
> Exchange email server. We are 100% sure this issue started happening due to
> the POI library upgrade.
Comment 22 shashidhar 2023-01-30 05:52:00 UTC
We resolved the issue temporarily by switching the Excel sheet generation class from SXSSFWorkbook to XSSFWorkbook.
Comment 23 Rakhi Barayanan 2023-05-24 04:05:37 UTC
Hi team,

We can see standalone="no" in the XML behind the Excel output from the latest POI. This is also causing corruption issues in third-party applications when they access the file.

Is there a way to remove this?
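For context, standalone="no" is a legal value in an XML declaration, and a conforming parser reads the document the same way with either value. The JDK-only sketch below is one way to check how a consumer's parser treats the two forms; it does not show how to change what POI writes, which this thread does not establish. The class name and the sample `<workbook/>` documents are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical check for illustration only.
public class StandaloneCheck {

    // Parses an XML string with the JDK's default DOM parser.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        return factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) throws Exception {
        String no = "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"no\"?><workbook/>";
        String yes = "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?><workbook/>";
        // Both documents parse; only the standalone flag differs.
        System.out.println(parse(no).getXmlStandalone());
        System.out.println(parse(yes).getXmlStandalone());
    }
}
```

If a third-party application rejects the standalone="no" form, that points to its XML handling rather than to an invalid file.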