50715 – Specifying encoding of <sql> task doesn't result in filtering out malformed input

Status:	NEW

Alias:	None

Product:	Ant
Classification:	Unclassified
Component:	Core tasks
Version:	1.8.2
Hardware:	PC Windows XP

Importance:	P2 normal
Target Milestone:	---
Assignee:	Ant Notifications List

URL:
Keywords:

Depends on:
Blocks:

Reported:	2011-02-03 09:26 UTC by Grzegorz Oledzki
Modified:	2011-02-03 09:26 UTC
CC List:	0 users

Description Grzegorz Oledzki 2011-02-03 09:26:19 UTC

Observed problem:
If you specify the 'encoding' attribute of the <sql> task, you would expect the task to fail when the input contains bytes that are illegal in that encoding. The current behavior is instead to replace them with U+FFFD REPLACEMENT CHARACTER, which is useless in the database world.
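
To make the observed behavior concrete, here is a minimal standalone sketch that is not taken from the Ant sources. The class and method names (LenientDecodeDemo, decodeLeniently) are hypothetical. It only shows that an InputStreamReader built from an encoding name substitutes U+FFFD for a byte that is not valid US-ASCII rather than failing:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;

// Hypothetical demo class; not part of Ant.
class LenientDecodeDemo {

    // Decodes the bytes with the (InputStream, String) constructor,
    // which replaces malformed input instead of reporting it.
    static String decodeLeniently(byte[] bytes, String encoding) {
        StringBuilder out = new StringBuilder();
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(bytes), encoding)) {
            int c;
            while ((c = reader.read()) != -1) {
                out.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // 0xE9 is outside the 7-bit US-ASCII range.
        byte[] input = {'a', (byte) 0xE9, 'b'};
        String decoded = decodeLeniently(input, "US-ASCII");
        // The invalid byte has become U+FFFD; no exception was thrown.
        System.out.println(decoded.indexOf('\uFFFD') >= 0);
    }
}
```

In an SQL script, such a substitution would presumably end up in the statement sent to the database without any warning.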

Background:
Stack Overflow question: http://stackoverflow.com/questions/4886460/why-does-us-ascii-encoding-accept-non-us-ascii-characters

Suggested solution:
When encoding is specified, construct the InputStreamReader with an explicit 'CharsetDecoder' configured with CodingErrorAction.REPORT, instead of passing just the encoding name.

Semi-patch:
reader = (encoding == null) ? new InputStreamReader(is) : new InputStreamReader(is, encoding);

could be replaced with:

if (encoding == null) {
   reader = new InputStreamReader(is);
} else {
   // Report malformed input instead of replacing it with U+FFFD.
   CharsetDecoder decoder = Charset.forName(encoding).newDecoder();
   decoder.onMalformedInput(CodingErrorAction.REPORT);
   reader = new InputStreamReader(is, decoder);
}
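
The effect of the suggested change can be checked outside Ant with a small self-contained sketch. Everything named here (StrictReaderDemo, openStrictReader, decodesCleanly) is hypothetical and only mirrors the semi-patch; setting onUnmappableCharacter to REPORT as well is an extra assumption that goes beyond what the report proposes:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;

// Hypothetical demo class; not part of Ant.
class StrictReaderDemo {

    // Mirrors the semi-patch: with an encoding, use a decoder that reports errors.
    static Reader openStrictReader(InputStream is, String encoding) {
        if (encoding == null) {
            return new InputStreamReader(is); // platform default, lenient
        }
        CharsetDecoder decoder = Charset.forName(encoding).newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                // Assumption: also report unmappable characters.
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        return new InputStreamReader(is, decoder);
    }

    // Returns true if every byte decodes, false if the decoder reports an error.
    static boolean decodesCleanly(byte[] bytes, String encoding) throws IOException {
        try (Reader reader = openStrictReader(new ByteArrayInputStream(bytes), encoding)) {
            while (reader.read() != -1) {
                // Drain the reader; only decoding errors matter here.
            }
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(decodesCleanly(new byte[] {'a', 'b', 'c'}, "US-ASCII"));
        System.out.println(decodesCleanly(new byte[] {'a', (byte) 0xE9, 'b'}, "US-ASCII"));
    }
}
```

With a reporting decoder, read() throws a CharacterCodingException (an IOException subclass) at the first bad byte, so the <sql> task would have a chance to fail the build instead of passing U+FFFD on to the database.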