Bug 47498

Summary: HyperlinkRecord truncates URLs
Product: POI
Reporter: Ibrahim Damlaj <ibrahim.damlaj>
Component: HSSF
Assignee: POI Developers List <dev>
Status: RESOLVED FIXED    
Severity: major    
Priority: P2    
Version: 3.5-dev   
Target Milestone: ---   
Hardware: All   
OS: All   
Attachments: diff -r -u poi-3.5-beta6 poi-edit
Test file

Description Ibrahim Damlaj 2009-07-08 07:33:11 UTC
Created attachment 23942
diff -r -u poi-3.5-beta6 poi-edit

Hi,

While testing POI's hyperlink functionality, I noticed that URL hyperlinks are parsed incorrectly at the Record level. They are truncated by 11 characters from the right (12 characters if we count the null terminator).
Reproducibility: 100%


FIX:
A quick look at HyperlinkRecord reveals the problem:

In the constructor
public HyperlinkRecord(RecordInputStream)
[...]
if ((_linkOpts & HLINK_TARGET_FRAME) != 0) { 
  int nChars = fieldSize/2;
  _address = in.readUnicodeLEString(nChars);
} else {
  int nChars = (fieldSize - TAIL_SIZE)/2;
  _address = in.readUnicodeLEString(nChars);
  _uninterpretedTail = readTail(URL_TAIL, in);
}
[...]


It should be (as far as I understand):

if ((_linkOpts & HLINK_TARGET_FRAME) == 0) {
  int nChars = fieldSize/2;
  _address = in.readUnicodeLEString(nChars);
} else {
  int nChars = (fieldSize - TAIL_SIZE)/2;
  _address = in.readUnicodeLEString(nChars);
  _uninterpretedTail = readTail(URL_TAIL, in);
}


I've attached a patch against 3.5-beta6.
Comment 1 Nick Burch 2009-07-08 07:35:46 UTC
Any chance you could upload a file that has a truncated hyperlink, so we've got something to use for a unit test?
Comment 2 Ibrahim Damlaj 2009-07-08 07:56:28 UTC
Created attachment 23943
Test file
Comment 3 Ibrahim Damlaj 2009-07-08 08:05:44 UTC
Hey Nick,

Thanks for the quick reply.
I've attached the test file you requested.


Best Regards,
Ibrahim
Comment 4 Yegor Kozlov 2009-07-12 00:29:20 UTC
Fixed in r793281. The problem was indeed in HyperlinkRecord, although the fix is a bit more complicated than you suggested. HyperlinkRecord didn't properly handle all combinations of link options, which determine whether a 24-byte record tail is present. This unexpected 24-byte tail resulted in truncation by 11 characters: 24 bytes is 12 two-byte characters, i.e. the 11 lost characters plus the null terminator.

Regards,
Yegor
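
For readers who want to check the arithmetic behind the 11-character truncation, here is a minimal standalone sketch. It is not the POI code and does not use any POI classes: the class name, the helper methods encodeField and readChars, and the sample URL are hypothetical, and TAIL_SIZE = 24 is assumed from the 24-byte tail mentioned in comment 4. It only shows what happens when a null-terminated UTF-16LE string is read with (fieldSize - TAIL_SIZE)/2 characters even though no tail follows it.

```java
import java.nio.charset.StandardCharsets;

// Illustrative sketch only; not taken from the POI sources.
class HyperlinkTruncationSketch {
    // Assumed value, based on the 24-byte tail described in comment 4.
    static final int TAIL_SIZE = 24;

    // Encode a URL the way the field is described in this report:
    // UTF-16LE characters followed by a null terminator.
    static byte[] encodeField(String url) {
        return (url + "\0").getBytes(StandardCharsets.UTF_16LE);
    }

    // Decode nChars two-byte characters and drop anything from the
    // first null terminator onwards, if one was read.
    static String readChars(byte[] field, int nChars) {
        String s = new String(field, 0, nChars * 2, StandardCharsets.UTF_16LE);
        int nul = s.indexOf('\0');
        return nul >= 0 ? s.substring(0, nul) : s;
    }

    public static void main(String[] args) {
        String url = "http://poi.apache.org/spreadsheet/index.html"; // sample input
        byte[] field = encodeField(url);
        int fieldSize = field.length;

        // Reading the whole field recovers the URL.
        String full = readChars(field, fieldSize / 2);
        // Subtracting a tail that is not present drops 12 characters:
        // 11 visible ones plus the null terminator.
        String cut = readChars(field, (fieldSize - TAIL_SIZE) / 2);

        System.out.println(full);
        System.out.println(cut);
        System.out.println(url.length() - cut.length()); // prints 11
    }
}
```

With the 44-character sample URL the field is 90 bytes long, so the second read covers only (90 - 24)/2 = 33 characters, which matches the symptom described in the original report.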