Bug 53360

Summary: SXSSF removes characters before escaped Unicode control character
Product: POI
Reporter: Martin Andersson <martin.andersson>
Component: SXSSF
Assignee: POI Developers List <dev>
Status: RESOLVED FIXED
Severity: normal
CC: martin.andersson
Priority: P2    
Version: 3.8-FINAL   
Target Milestone: ---   
Hardware: All   
OS: All   
Attachments: Fix for SheetDataWriter and test case

Description Martin Andersson 2012-06-05 06:46:57 UTC
Created attachment 28885
Fix for SheetDataWriter and test case

SXSSF replaces Unicode control characters with '?' but fails to write out the preceding characters.

A cell containing "value\u0019" should be written as "value?" but is currently written as "?".

Attached is a fix and a test case to prove it.
Comment 1 Martin Andersson 2012-07-16 12:55:50 UTC
This issue has been unanswered for a few weeks now. Is there anything more you need to review the patch?

The patch is only three lines of code and it makes the SXSSF API consistent with the XSSF API when it comes to escaping cell string values.

When SXSSF writes string cells it iterates over the string looking for characters that need escaping. The valid characters are kept in a buffer and are flushed when an escaped character is written. The flushing part is missing for Unicode control characters. That's what this patch adds.
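For readers without the attachment, here is a minimal, self-contained sketch of the buffer-and-flush escaping described above. It is not the actual SheetDataWriter code: the class EscapeSketch, the methods writeEscaped and escape, and the exact set of characters treated as control characters (anything below 0x20 except tab, line feed and carriage return) are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical illustration only; names and structure are not taken from POI.
class EscapeSketch {

    // Writes s to out, escaping XML special characters and replacing
    // control characters with '?'. Runs of valid characters are tracked
    // by the index 'last' and flushed before every replacement.
    static void writeEscaped(Writer out, String s) throws IOException {
        char[] chars = s.toCharArray();
        int last = 0; // start of the pending run of valid characters
        for (int i = 0; i < chars.length; i++) {
            char c = chars[i];
            String replacement = null;
            switch (c) {
                case '<': replacement = "&lt;"; break;
                case '>': replacement = "&gt;"; break;
                case '&': replacement = "&amp;"; break;
                case '"': replacement = "&quot;"; break;
                default:
                    // Assumed definition of a control character for this sketch.
                    if (c < ' ' && c != '\t' && c != '\n' && c != '\r') {
                        replacement = "?";
                    }
            }
            if (replacement != null) {
                // Flush the pending valid characters first. Skipping this
                // step for control characters is the behaviour reported here.
                if (i > last) {
                    out.write(chars, last, i - last);
                }
                out.write(replacement);
                last = i + 1;
            }
        }
        // Flush whatever valid characters remain after the last replacement.
        if (last < chars.length) {
            out.write(chars, last, chars.length - last);
        }
    }

    // Convenience wrapper that returns the escaped text as a String.
    static String escape(String s) {
        StringWriter out = new StringWriter();
        try {
            writeEscaped(out, s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("value\u0019"));
    }
}
```

If the flush before the replacement is skipped in the control-character branch, the pending run "value" is never written and only "?" reaches the output, which matches the symptom in the description.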

We have used the patch in production for five weeks now without any incidents.

Please let me know if you need anything more.
Comment 2 Yegor Kozlov 2012-07-16 12:59:38 UTC
I will give my feedback within a few days. 

Thanks for your patience.

Yegor
Comment 3 Yegor Kozlov 2012-07-16 15:32:53 UTC
patch applied in r1362093

Regards,
Yegor