Bug 2525 - Leading zero-length string splitted by RE
Summary: Leading zero-length string splitted by RE
Alias: None
Product: Regexp
Classification: Unclassified
Component: Other
Version: unspecified
Hardware: PC All
Importance: P3 normal
Target Milestone: ---
Assignee: Jakarta Notifications Mailing List
Depends on:
Reported: 2001-07-09 18:50 UTC by arlou
Modified: 2005-03-20 17:06 UTC
0 users


Description arlou 2001-07-09 18:50:37 UTC
When I use an RE, say "a*b", to split a string like "aaabxyz", I'd expect only 1
part to come out, but there are 2 parts, the first of which is a zero-length
string. I wonder whether something is missing in RE.split.
Comment 1 graham 2001-08-28 15:34:00 UTC
This seems to be consistent with the behavior of the split() method in general,
specifically when the pattern matches at the very first character of a
String instance.

Since split() returns an array of Strings, when this particular condition holds
(the first character of the String is matched by the pattern in the RE
instance when split() is called), the first element of the returned array will
be an empty String.
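
A minimal sketch of the behavior described above. It uses the JDK's
java.util.regex.Pattern as a stand-in, on the assumption that its split()
handles this case the same way; it does not call org.apache.regexp.RE, and
the class name LeadingEmptySplit is hypothetical.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical demo class; uses java.util.regex, not org.apache.regexp.RE.
public class LeadingEmptySplit {

    // Splits input around every match of regex (limit 0, the JDK default).
    static String[] split(String regex, String input) {
        return Pattern.compile(regex).split(input);
    }

    public static void main(String[] args) {
        // The match "aaab" starts at index 0, so the text before it is "".
        System.out.println(Arrays.toString(split("a*b", "aaabxyz"))); // [, xyz]

        // A match at the very end would leave a trailing "", which is dropped.
        System.out.println(Arrays.toString(split("a*b", "xyzaaab"))); // [xyz]
    }
}
```

With this stand-in, a match at the front yields a leading empty String while
a match at the end yields no trailing one, which is the asymmetry raised in
this comment.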

As a result, callers must either strip any matching characters from the front
of the String instance before calling split(), or discard the empty String
element from the returned array once split() has been called.  If matching
characters are encountered at the end of the String instance, they seem to be
either ignored or removed from the array before the array is returned.
Shouldn't this either be consistent, or should split() contain options for
splitting on concurrent matches to yield empty String elements (so that this
condition can be
Comment 2 Michael McCallum 2001-09-08 15:54:10 UTC
If no one has any objections, I will change this bug to INVALID.
Graham's comment seems to show that the behaviour is correct.

Perhaps some notes in the docs?
Comment 3 graham 2002-03-22 17:50:11 UTC
Sorry for the LATE additional comments!

I have no objections as long as this case is documented.  Taking the behavior
into account when using "split()" has caused no problems with my implementation
so far (I discard the first empty String element), but it would have helped if
I had known about it in advance.
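
One way the workaround mentioned above could look, as a sketch only. It again
uses java.util.regex.Pattern rather than org.apache.regexp.RE, and the class
and method names (SplitWithoutLeadingEmpty, splitDroppingLeadingEmpty) are
hypothetical, not part of any library.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical helper; illustrates dropping the leading empty element.
public class SplitWithoutLeadingEmpty {

    // Splits input around matches of regex, then removes the first element
    // if it is empty. Note: an empty input also produces an empty array,
    // because the JDK returns {""} when nothing matches.
    static String[] splitDroppingLeadingEmpty(String regex, String input) {
        String[] parts = Pattern.compile(regex).split(input);
        if (parts.length > 0 && parts[0].isEmpty()) {
            return Arrays.copyOfRange(parts, 1, parts.length);
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(
                splitDroppingLeadingEmpty("a*b", "aaabxyz"))); // [xyz]
        System.out.println(Arrays.toString(
                splitDroppingLeadingEmpty("a*b", "xyzbq")));   // [xyz, q]
    }
}
```

Filtering the result keeps the input untouched; the alternative of stripping
matching characters from the front before splitting would need a second pass
with an anchored pattern.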
Comment 4 Vadim Gritsenko 2004-01-31 00:23:19 UTC
Javadoc updated to reflect this behavior.