Bug 50665

Summary: [Question] formattingruns in HSSFRichTextString
Product: POI
Reporter: peterpham
Component: HSSF
Assignee: POI Developers List <dev>
Status: RESOLVED INVALID    
Severity: minor    
Priority: P2    
Version: unspecified   
Target Milestone: ---   
Hardware: Macintosh   
OS: All   
Attachments: POI test

Description peterpham 2011-01-27 00:00:19 UTC
Created attachment 26558
POI test

I want to extract a cell's text value and its formatting (bold, italic, etc.) using org.apache.poi.hssf.usermodel.HSSFRichTextString.

The methods numFormattingRuns() and getIndexOfFormattingRun() should help me identify the different styles used in the text.

However, it seems that numFormattingRuns() always returns 0 unless at least two different formats are used in the text.

For example:
Example 1: (Text with no format)
Apache POI is rock

Result: 
HSSFRichTextString.numFormattingRuns() = 0

=================================

Example 2: (Whole text is bold)
Apache POI is rock

Result: 
HSSFRichTextString.numFormattingRuns() = 0

================================= 

Example 3: (word "POI" is bold)
Apache POI is rock

Result: 
HSSFRichTextString.numFormattingRuns() = 2
HSSFRichTextString.getIndexOfFormattingRun(0) = 7
HSSFRichTextString.getIndexOfFormattingRun(1) = 10

=================================

Example 4: (word "Apache" is bold)
Apache POI is rock

Result: 
HSSFRichTextString.numFormattingRuns() = 1
HSSFRichTextString.getIndexOfFormattingRun(0) = 6


=====================================================

It appears that getIndexOfFormattingRun() only returns a character index at a point where the format changes from the initial format. Is this the correct behavior? It seems very different from the way getCharacterRuns works in the HWPF model.
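
For illustration, here is a minimal sketch of how the results above could be turned back into text segments. It does not call POI at all: the class and method names are hypothetical, and the run start indices are assumed to be the values returned by getIndexOfFormattingRun(0..n-1) in ascending order. It also assumes, based only on the four results above, that the text before the first formatting run is an implicit initial segment that carries the cell's own font.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of POI: rebuilds text segments from run start indices.
class FormattingRunSketch {

    // runStarts: assumed to be the values of getIndexOfFormattingRun(i), ascending.
    static List<String> split(String text, int[] runStarts) {
        List<String> segments = new ArrayList<>();
        int start = 0; // the implicit initial segment always begins at character 0
        for (int runStart : runStarts) {
            if (runStart > start) {
                // close the segment that ends where the next run begins
                segments.add(text.substring(start, runStart));
            }
            start = runStart;
        }
        if (start < text.length()) {
            // the last run extends to the end of the text
            segments.add(text.substring(start));
        }
        return segments;
    }

    public static void main(String[] args) {
        String text = "Apache POI is rock";
        System.out.println(split(text, new int[] {7, 10})); // indices from Example 3
        System.out.println(split(text, new int[] {6}));     // index from Example 4
        System.out.println(split(text, new int[] {}));      // Examples 1 and 2: no runs
    }
}
```

With the indices from Example 3 this gives three segments ("Apache ", "POI", " is rock"), with Example 4 two segments ("Apache", " POI is rock"), and with no runs the whole text as a single segment. That would be consistent with Example 2 reporting 0 runs even though the whole text is bold.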
Comment 1 peterpham 2011-01-27 00:37:19 UTC
Also, is there a different way to extract text and its formatting from an MS Excel file? 
An example would be very much appreciated.
Comment 2 Nick Burch 2011-01-27 04:27:34 UTC
Bugzilla is not the place to ask questions. Please use the mailing list instead.