Issue 122900 - PDF import of SCAN with combined picture + text contents: Some text wrongly in foreground
Alias: None
Product: Draw
Classification: Application
Component: open-import
Version: 4.0.0
Hardware: PC Windows 7
Importance: P3 Minor
Target Milestone: ---
Assignee: AOO issues mailing list
QA Contact: Edwin Sharp
Keywords: needmoreinfo
Depends on:
Reported: 2013-07-30 18:07 UTC by Rainer Bielefeld
Modified: 2013-09-04 05:39 UTC
CC List: 3 users

See Also:
Issue Type: DEFECT
Latest Confirmation in: ---
Developer Difficulty: ---

Sample document (335.93 KB, application/download)
2013-09-04 05:39 UTC, Rainer Bielefeld

Description Rainer Bielefeld 2013-07-30 18:07:12 UTC
When I import an OCR-scanned PDF whose contents combine a page-filling picture with text (electrical wiring diagrams, ...) into Draw, very often some text boxes appear in front of the page-filling picture instead of behind it.

I see this with "AOO 4.0.0-Dev – German UI / German locale  [AOO400m3(Build:9702)  -  Rev. 1503709 (2013-07-17) ]" on German WIN7 Home Premium (64bit), common 4.0-dev user profile, plus the
"PDF Import for Apache OpenOffice" extension.

This is not a new problem. I have seen it for years, also with LibreOffice, so I think the problem was already present in the Oracle PDF import extension.

I can contribute sample PDF documents that could be used for debugging, but they are too confidential to be published here.
Comment 1 Hagar Delest 2013-08-05 19:44:23 UTC
PDF is not a format intended to be edited. So the reverse engineering of that format is very difficult.

I don't think a correct import filter will ever be made (except the one from Adobe, of course).
Comment 2 Edwin Sharp 2013-08-06 06:39:31 UTC
Please provide instructions for installing aoo-pdf-import on Win 7.
Apologies, but I'm not familiar with git.
Trying to confirm...
Comment 3 Rainer Bielefeld 2013-08-06 15:02:28 UTC
1. Menu 'Tools -> Extension Manager -> link Download more ... from Internet'
   Extensions repository will be opened in Browser
2. Search for "aoo-pdf-import"
   or go directly to
3. Download extension
4. AOO Menu 'Tools -> Extension Manager -> Add -> Browse for downloaded 
   Extension -> Install'
   > Import will be available (I can't remember whether you will have to 
     restart AOO before you can use it)
Comment 4 Edwin Sharp 2013-08-06 16:01:11 UTC
Download OK
Restart OpenOffice
Scan a document and save as PDF
Start Draw
Open PDF -> General Error. General input/output error.
Comment 5 Edwin Sharp 2013-08-08 08:37:01 UTC
Please provide the explanation needed to reproduce this!
Comment 6 Rainer Bielefeld 2013-09-04 05:39:39 UTC
Created attachment 81445
Sample document

Steps to reproduce with "AOO 4.1.0-Dev – English  UI / German locale - [AOO410m1(Build:9750)  -  Rev. 1516435  2013-08-24]" on German WIN7 Home
Premium (64bit), own separate user profile:

0. Install Extension "PDF Import for Apache OpenOffice"
   if not already installed for your OOo
1. From AOO Start Center, open attached sample document (PDF)
   Expected: all OCR text behind the full page picture
   Actual: in the bottom-most, right-most visible text line ("E-Mail:") I see
           some text box text in front of the picture. It becomes invisible
           when I bring the picture to the front (click somewhere in a white
           area in the middle of the document to select the full-page picture,
           then use menu 'Modify -> Arrange -> Bring to Front').
           All real OCR text should be behind the picture.
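
The expected/actual check above can be expressed as a small sketch. This is only an illustration under assumptions, not code from the PDF import extension: the `Shape` class and the function `text_in_front_of_picture` are hypothetical names, and the page is modelled as a plain list of shapes ordered from back to front.

```python
from dataclasses import dataclass


@dataclass
class Shape:
    """Hypothetical stand-in for an imported page object."""
    kind: str       # "picture" or "text"
    text: str = ""  # text content, empty for pictures


def text_in_front_of_picture(shapes):
    """Return the text shapes stacked above the first picture.

    `shapes` is assumed to be ordered back to front (index 0 is the
    bottom-most object). An empty result means all text is behind
    the picture, which is the expected outcome of the import.
    """
    picture_index = next(
        (i for i, shape in enumerate(shapes) if shape.kind == "picture"),
        None,
    )
    if picture_index is None:
        return []
    return [s for s in shapes[picture_index + 1:] if s.kind == "text"]


# Made-up page mirroring the report: one text box ends up in front.
page = [Shape("text", "Name:"), Shape("picture"), Shape("text", "E-Mail:")]
for shape in text_in_front_of_picture(page):
    print("Text in front of picture:", shape.text)
```

With the made-up page above, only the "E-Mail:" text box is reported, matching the observed behaviour in the sample document.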