Issue 127692 - Truncation of target field in hyperlink in .doc documents (similar to 120676)
Summary: Truncation of target field in hyperlink in .doc documents (similar to 120676)
Status: UNCONFIRMED
Alias: None
Product: Writer
Classification: Application
Component: save-export
Version: 4.1.5
Hardware: All macOS 10.13
Importance: P5 (lowest) Normal
Target Milestone: ---
Assignee: AOO issues mailing list
QA Contact:
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-02-06 23:03 UTC by Chris Brossard
Modified: 2018-12-01 21:10 UTC
CC List: 2 users

See Also:
Issue Type: DEFECT
Latest Confirmation in: ---
Developer Difficulty: ---


Description Chris Brossard 2018-02-06 23:03:18 UTC
When I create a document with Writer and insert a hyperlink with the following text in the "Target" and "Text" fields:

https://play.google.com/store/search?q=%22chris%20brossard%22&c=apps&hl=en

and then save the document in Microsoft Word 97/2000/XP format (.doc) and reopen the .doc file in OpenOffice Writer, the "Target" field of the hyperlink in the .doc file is truncated to:

https://play.google.com/store/search?q=
Comment 1 Tomislav Plavcic 2018-12-01 16:39:59 UTC
I have managed to reproduce this issue under both configurations mentioned below.

Configuration 1:
Ubuntu Bionic (18.04) Linux
Apache OO: 4.2.0 developer build
Build: AOOm1(Build:9800) - Rev. 1846402

Configuration 2:
Mac OS X Mojave 10.14.1
Apache OO: 4.1.6 release

My steps to reproduce:
1. Create a new document.
2. Use Insert->Hyperlink and paste the link provided above, https://play.google.com/store/search?q=%22chris%20brossard%22&c=apps&hl=en, into the Target and Text fields.
3. Click Apply and Close.
4. Save the document as a Microsoft Word 97/2000/XP (.doc) file.
5. Close the document and then reopen the saved file.
6. Position the cursor on the link, choose Edit->Hyperlink, and observe that the Target field has been truncated.

Some follow-up testing:
1. Saving to the native OpenOffice file format works as expected.
2. Saving to Word 95 format also works as expected (so only later Word file versions seem affected).
3. I tried different combinations of "%22" and "%20" in the URL, and the issue seems to appear only with "%22", which decodes to a double-quote character.
For example, with the URL "https://play.google.com/store/search?q=%22chris%22" the bug reproduces, but with "https://play.google.com/store/search?q=chris%20brossard" it works fine.
So it seems that "%22" in the hyperlink target is the problem when saving to a Microsoft Word 97/2000/XP (.doc) file.

I ran the originally provided URL through an online URL decoder (such as https://www.url-encode-decode.com/) and it decodes correctly, so the URL should be valid.
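
The same decoding check can be done locally instead of with an online tool. A minimal sketch in Python, assuming the standard library's urllib.parse is an acceptable stand-in for the online decoder (it is not what was used in the test above):

```python
from urllib.parse import unquote, urlsplit, parse_qs

# The URL from the original report.
URL = "https://play.google.com/store/search?q=%22chris%20brossard%22&c=apps&hl=en"

# Percent-decode the whole string: %22 becomes a double quote, %20 a space.
decoded = unquote(URL)

# Split the URL and parse its query string into a dict of decoded values.
parts = urlsplit(URL)
query = parse_qs(parts.query)

print(decoded)
print(query["q"][0])
```

The URL splits and decodes cleanly, and the "q" parameter decodes to the search phrase wrapped in double quotes, which is consistent with the URL itself being valid.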
Comment 2 Vicente Garcia 2018-12-01 21:10:50 UTC
I was able to reproduce the failure with:
-- Apache OpenOffice 4.2.0 / Operating system: Ubuntu 16.04.2 LTS
-- Apache OpenOffice 4.1.6 / Operating system: Windows 7 Professional

I would add to the above report that:
-- Following the steps in the report, if a double-quote character (" or its encoded form %22) appears anywhere in the link, the link is truncated just before its first occurrence.

-- If the document with the problematic link is saved via "Save as" with file type "Microsoft Word 97/2000/XP (.doc)" and then opened with Microsoft Word 2003 or Microsoft Office 365, the links are complete.
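
The truncation rule described in the first point above can be written down as a small model, which may help when checking further test links. This is a sketch of the observed symptom only, not of the OpenOffice export or import code; the function name predicted_target is hypothetical.

```python
def predicted_target(url: str) -> str:
    """Return the Target value expected after the .doc round trip.

    Models the reported symptom: the link is cut just before the first
    literal double quote or the first "%22", whichever comes first.
    """
    cut_points = [i for i in (url.find('"'), url.find("%22")) if i != -1]
    return url[:min(cut_points)] if cut_points else url


# The URL from the original report is cut right after "q=".
print(predicted_target(
    "https://play.google.com/store/search?q=%22chris%20brossard%22&c=apps&hl=en"
))
```

For the URL from the original report the model yields the same truncated string that the reporter observed, and a link containing only "%20" comes back unchanged.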

I was not able to reproduce the failure:
-- When the document is saved in other formats, such as Microsoft Word 2003 *.xml, DocBook *.xml, or OpenOffice 1.0 *.sxw.

-- After reading RFC 3986 - Uniform Resource Identifier (URI): Generic Syntax
(https://www.ietf.org/rfc/rfc3986.txt), I tried links that included characters that are reserved or not allowed according to RFC 3986. For example:
    gen-delims  = ":" / "/" / "?" / "#" / "[" / "]" / "@"
    sub-delims  = "!" / "$" / "&" / "'" / "(" / ")"  / "*" / "+" / "," / ";" / "="
    unreserved  = ALPHA / DIGIT / "-" / "." / "_" / "~"
    not allowed  =  <>\^`{|}
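
Test links for the character classes listed above could be generated along these lines. This is only a sketch: the base URL, the helper name build_test_links, and the choice to percent-encode each character are assumptions for illustration, not a record of the links actually used in this testing.

```python
from urllib.parse import quote

# Character classes as listed above (unreserved limited to its punctuation).
GEN_DELIMS = ":/?#[]@"
SUB_DELIMS = "!$&'()*+,;="
UNRESERVED_SAMPLE = "-._~"
NOT_ALLOWED = "<>\\^`{|}"

# Hypothetical base link; any search-style URL would do.
BASE = "https://example.com/search?q="


def build_test_links(chars: str) -> dict:
    """Map each character to a link containing its percent-encoded form."""
    # safe="" makes quote() encode everything except unreserved characters.
    return {ch: BASE + "a" + quote(ch, safe="") + "b" for ch in chars}


for name, chars in [("gen-delims", GEN_DELIMS), ("sub-delims", SUB_DELIMS),
                    ("unreserved", UNRESERVED_SAMPLE), ("not allowed", NOT_ALLOWED)]:
    for ch, link in build_test_links(chars).items():
        print(name, repr(ch), link)
```

Each generated link can then be pasted into the Target field and put through the same save and reopen steps as the original URL.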