Bug 53360 - SXSSF removes characters before escaped Unicode control character
Summary: SXSSF removes characters before escaped Unicode control character
Alias: None
Product: POI
Classification: Unclassified
Component: SXSSF
Version: 3.8-FINAL
Hardware: All All
Importance: P2 normal
Target Milestone: ---
Assignee: POI Developers List
Depends on:
Reported: 2012-06-05 06:46 UTC by Martin Andersson
Modified: 2012-07-16 15:32 UTC

Fix for SheetDataWriter and test case (1.81 KB, text/plain)
2012-06-05 06:46 UTC, Martin Andersson

Description Martin Andersson 2012-06-05 06:46:57 UTC
Created attachment 28885
Fix for SheetDataWriter and test case

SXSSF replaces Unicode control characters with '?' but fails to write out the preceding characters.

A cell containing "value\u0019" should be "value?" but is now "?".

Attached is a fix and a test case to prove it.
Comment 1 Martin Andersson 2012-07-16 12:55:50 UTC
This issue has been unanswered for a few weeks now. Is there anything more you need to review the patch?

The patch is only three lines of code, and it makes the SXSSF API consistent with the XSSF API when it comes to escaping cell string values.

When SXSSF writes string cells, it iterates over the string looking for characters that need escaping. The valid characters are kept in a buffer, which is flushed whenever an escaped character is written. That flush is missing for Unicode control characters, and this patch adds it.
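
To illustrate the buffering and flushing described above, here is a minimal, self-contained sketch. It is not the actual SheetDataWriter code or the attached patch: the class EscapeSketch and its methods are hypothetical names, and the set of escaped characters is a simplifying assumption (for example, tabs and line breaks are passed through unchanged here). The point it shows is that the pending run of plain characters has to be written out before every replacement, including the '?' used for control characters.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;

// Hypothetical illustration class; it is not part of POI.
class EscapeSketch {

    // Writes s to out, escaping a few XML special characters and
    // replacing control characters with '?'.
    static void writeEscaped(Writer out, String s) throws IOException {
        char[] chars = s.toCharArray();
        int last = 0; // start of the run of plain characters not yet written
        for (int i = 0; i < chars.length; i++) {
            char c = chars[i];
            String replacement;
            if (c == '<') {
                replacement = "&lt;";
            } else if (c == '>') {
                replacement = "&gt;";
            } else if (c == '&') {
                replacement = "&amp;";
            } else if (c == '"') {
                replacement = "&quot;";
            } else if (c < ' ' && c != '\t' && c != '\n' && c != '\r') {
                // Assumption: control characters are replaced, as in the report.
                replacement = "?";
            } else {
                continue; // plain character, stays in the pending run
            }
            // Flush the pending plain characters first. Skipping this step
            // for the control-character case is what drops the preceding text.
            if (i > last) {
                out.write(chars, last, i - last);
            }
            out.write(replacement);
            last = i + 1;
        }
        // Write whatever plain characters remain after the last replacement.
        if (chars.length > last) {
            out.write(chars, last, chars.length - last);
        }
    }

    // Convenience wrapper that returns the escaped text as a String.
    static String escape(String s) {
        StringWriter out = new StringWriter();
        try {
            writeEscaped(out, s);
        } catch (IOException e) {
            throw new IllegalStateException(e); // StringWriter does not throw in practice
        }
        return out.toString();
    }
}
```

With this sketch, EscapeSketch.escape("value\u0019") returns "value?". If the flush before the replacement is removed for the control-character branch, the same call yields only "?", which matches the behaviour reported in this bug.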

We have used the patch in production for five weeks now without any incidents.

Please let me know if you need anything more.
Comment 2 Yegor Kozlov 2012-07-16 12:59:38 UTC
I will give my feedback within a few days. 

Thanks for your patience.

Comment 3 Yegor Kozlov 2012-07-16 15:32:53 UTC
patch applied in r1362093