Issue 122900

Summary: PDF import of SCAN with combined picture + text contents: Some text wrongly in foreground
Product: Draw
Reporter: Rainer Bielefeld <rainerbielefeld_ooo_qa>
Component: open-import
Assignee: AOO issues mailing list <issues>
Status: UNCONFIRMED ---
QA Contact: Edwin Sharp <elish>
Severity: Minor
Priority: P3
CC: delest.hagar, issues, rainerbielefeld_ooo_qa
Version: 4.0.0
Keywords: needmoreinfo
Target Milestone: ---
Hardware: PC
OS: Windows 7
Issue Type: DEFECT
Latest Confirmation in: ---
Developer Difficulty: ---
Attachments:
  Description: Sample document
  Flags: none

Description Rainer Bielefeld 2013-07-30 18:07:12 UTC
When I import an OCR-scanned PDF whose pages combine a page-filling picture with text (electric wiring schematics, ...) into Draw, some text boxes very often appear in front of the page-filling picture instead of behind it.

I see this with "AOO 4.0.0-Dev – German UI / German locale [AOO400m3(Build:9702) - Rev. 1503709 (2013-07-17)]" on German WIN7 Home Premium (64bit), common 4.0-dev user profile + "PDF Import for Apache OpenOffice" extension.
<http://code.google.com/a/apache-extras.org/p/aoo-pdf-import/>

That's not a new problem. I have seen it for years, also with LibreOffice, so I think the problem was already in the Oracle PDF import extension.

I can contribute sample PDF documents that may be used for debugging, but they are too confidential to be published here.
Comment 1 Hagar Delest 2013-08-05 19:44:23 UTC
PDF is not a format intended for editing, so reverse engineering it is very difficult.

I don't think a correct import filter will ever be made (except the one from Adobe, of course).
Comment 2 Edwin Sharp 2013-08-06 06:39:31 UTC
Please provide instructions for installing aoo-pdf-import on Win 7.
Apologies, but I'm not familiar with git.
Trying to confirm...
Comment 3 Rainer Bielefeld 2013-08-06 15:02:28 UTC
@Edwin:
1. Menu 'Tools -> Extension Manager -> Link Download more ... from Internet'
   The Extensions repository will open in the browser
2. Search for "aoo-pdf-import"
   or go directly to
   <http://extensions.services.openoffice.org/en/project/pdf-import-apache-openoffice>
3. Download extension
4. AOO Menu 'Tools -> Extension Manager -> Add -> Browse for downloaded
   Extension -> Install'
   > Import will be available (I can't remember whether you have to
     restart AOO before you can use it)
Comment 4 Edwin Sharp 2013-08-06 16:01:11 UTC
Thanks!
Download OK
Restart OpenOffice
Scan a document and save as PDF
Start Draw
Open PDF -> General Error. General input/output error.
?
Comment 5 Edwin Sharp 2013-08-08 08:37:01 UTC
Please provide the explanation needed to reproduce this!
Comment 6 Rainer Bielefeld 2013-09-04 05:39:39 UTC
Created attachment 81445
Sample document

Steps to reproduce with "AOO 4.1.0-Dev – English UI / German locale - [AOO410m1(Build:9750) - Rev. 1516435 2013-08-24]" on German WIN7 Home
Premium (64bit), own separate user profile:

0. Install Extension "PDF Import for Apache OpenOffice"
   <http://extensions.openoffice.org/en/project/pdf-import-apache-openoffice>
   if not already installed for your OOo
1. From AOO Start Center, open attached sample document (PDF)
   Expected: all OCR text behind the full-page picture
   Actual: at the bottom-most, right-most visible text line "E-Mail:" I see
           some text box text in front of the picture, which becomes invisible
           when I bring the picture to the front (click somewhere in a white
           area in the middle of the document; when the full-page picture is
           selected, use menu 'Modify -> Arrange -> Bring to Front').
           All real OCR text should be behind the picture.
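
The expected/actual condition above can be written as a small check over a page's stacking order. This is only a sketch: the `Shape` record, its `kind` labels and the sample page are hypothetical stand-ins for illustration, not the Draw/UNO API and not the contents of the attached sample document (only the "E-Mail:" line is taken from the report).

```python
from dataclasses import dataclass


@dataclass
class Shape:
    kind: str      # "picture" or "text" (hypothetical labels)
    z_order: int   # assumption: higher value = closer to the foreground
    text: str = ""


def text_in_front_of_picture(shapes):
    """Return the text shapes stacked above the topmost picture."""
    pictures = [s for s in shapes if s.kind == "picture"]
    if not pictures:
        return []
    top = max(p.z_order for p in pictures)
    return [s for s in shapes if s.kind == "text" and s.z_order > top]


# Made-up page: one OCR text box behind the scan, one wrongly in front of it.
page = [
    Shape("text", 0, "Name:"),
    Shape("picture", 1),
    Shape("text", 2, "E-Mail:"),
]

offenders = text_in_front_of_picture(page)
print([s.text for s in offenders])
```

With a correct import the returned list would be empty, since every OCR text box would sit below the full-page picture.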