I created a file of comma-separated values and used OpenOffice to write an Excel file. Profiling smaller data sets with Java hprof, I found that over 30% of the time was spent in SSTDeserializer.addToStringTable and the methods it called, mostly creating the exception thrown in put. Nothing else was over 5%. The percentage of time spent in addToStringTable increased as my data set got larger.

    static public void addToStringTable( BinaryTree strings, Integer integer,
                                         UnicodeString string )
    {
        if ( string.isRichText() )
            string.setOptionFlags( (byte) ( string.getOptionFlags() & ( ~8 ) ) );
        if ( string.isExtendedText() )
            string.setOptionFlags( (byte) ( string.getOptionFlags() & ( ~4 ) ) );

        boolean added = false;
        while ( added == false )
        {
            try
            {
                strings.put( integer, string );
                added = true;
            }
            catch ( Exception ignore )
            {
                string.setString( string.getString() + " " );
            }
        }
    }

Of course, if you really expect that the values might be the same, a different data structure should be used, such as a plain hash map.
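To illustrate why the cost might grow with the data set, here is a minimal sketch, not POI code. It assumes that put rejects a value that is already present, which is what the retry loop above suggests. UniqueValueMap, addWithRetry, exceptionsForDuplicates and addToHashMap are hypothetical names made up for this example. UniqueValueMap is only a stand-in for BinaryTree, and plain String replaces UnicodeString.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class StringTableSketch {

    /** Hypothetical stand-in for a map that refuses duplicate values (assumed behaviour). */
    static class UniqueValueMap {
        private final Map<Integer, String> byKey = new HashMap<>();
        private final Set<String> values = new HashSet<>();

        void put(Integer key, String value) {
            if (values.contains(value)) {
                throw new IllegalArgumentException("duplicate value");
            }
            byKey.put(key, value);
            values.add(value);
        }
    }

    /** Mirrors the retry loop from the report; returns how many exceptions were thrown. */
    static int addWithRetry(UniqueValueMap strings, Integer index, String value) {
        int thrown = 0;
        boolean added = false;
        while (!added) {
            try {
                strings.put(index, value);
                added = true;
            } catch (IllegalArgumentException e) {
                thrown++;
                value = value + " "; // pad with a space and try again
            }
        }
        return thrown;
    }

    /** Adds the same string n times under distinct indexes and counts the exceptions. */
    static int exceptionsForDuplicates(int n) {
        UniqueValueMap strings = new UniqueValueMap();
        int total = 0;
        for (int i = 0; i < n; i++) {
            total += addWithRetry(strings, i, "same");
        }
        return total;
    }

    /** Alternative: a plain HashMap keyed by index accepts repeated values without exceptions. */
    static Map<Integer, String> addToHashMap(int n) {
        Map<Integer, String> strings = new HashMap<>();
        for (int i = 0; i < n; i++) {
            strings.put(i, "same");
        }
        return strings;
    }

    public static void main(String[] args) {
        System.out.println("exceptions thrown: " + exceptionsForDuplicates(100));
        System.out.println("hash map entries:  " + addToHashMap(100).size());
    }
}
```

Under these assumptions, the k-th copy of a repeated string has to fail k-1 times before a padded variant is accepted, so n identical strings cost n(n-1)/2 exceptions in total. That would be consistent with the share of time in addToStringTable rising as the file grows, although the real BinaryTree may behave differently.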
Created attachment 13486
The version created by OpenOffice
Created attachment 13487
The original. I cut 100 lines off the file to create the 900 records.
It should not contain any rich text.
This is now corrected in SVN (and has been for quite some time). Previously we didn't understand rich text, so the exception handling in addToStringTable was used to append a space to the end of the string to preserve its uniqueness. Now that rich text handling is correct, that block of code is gone, i.e. we don't use exceptions at that low level, so the code is much faster. Marking as fixed.

Jason

*** This bug has been marked as a duplicate of 25039 ***