Issue 127122 - problem with bibliography index "std::bad_alloc"
Summary: problem with bibliography index "std::bad_alloc"
Status: UNCONFIRMED
Alias: None
Product: Writer
Classification: Application
Component: open-import
Version: 3.3.0 or older (OOo)
Hardware: PC Linux 64-bit
Importance: P5 (lowest) Normal
Target Milestone: ---
Assignee: AOO issues mailing list
QA Contact:
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-09-19 10:36 UTC by yury_t
Modified: 2018-07-02 05:17 UTC
CC List: 1 user

See Also:
Issue Type: DEFECT
Latest Confirmation in: ---
Developer Difficulty: ---


Attachments
The file without content (28.79 KB, application/vnd.oasis.opendocument.text)
2017-10-04 20:53 UTC, mroe
big file (just the bibliography) (56.07 KB, application/vnd.oasis.opendocument.text)
2018-07-02 05:17 UTC, yury_t

Description yury_t 2016-09-19 10:36:31 UTC
I have a big file with lots of formulas (obfuscated, attached) that freezes Writer on opening, at roughly 80% progress.
LibreOffice had the same issue up to and including 4.2 series (see also https://bugs.documentfoundation.org/show_bug.cgi?id=88140).
Then some changes LO folks introduced in the 4.3 series made the issue go away (the fix wasn't intentional, it seems).
The issue shows up here in both the AOO 4.1 and 4.2.0-dev series.
Comment 1 yury_t 2016-09-19 10:39:50 UTC
I was going to attach the ODT file that triggers the issue, but it is 4 MB in size and Bugzilla won't let it through. What should I do?
Comment 2 yury_t 2016-09-23 04:04:29 UTC
Well, the big file in question is available at:
https://bugs.documentfoundation.org/attachment.cgi?id=114621
Comment 3 eds92 2016-09-26 01:20:49 UTC
I have been able to replicate this bug with the given document. It loads to about 80%, as yury_t stated, then crashes with a "bad allocation" error message. However, I have not been able to replicate this with any other documents of similar size.

Opening the file in different programs:
   Microsoft Office: recover document prompt and successful repair
   Notepad: gibberish
   WordPad: opens fine with a warning that the format is not fully supported

Environment:
   Windows 10 Pro (version 1607),
   AMD FX(tm)-8350 8-core processor (4.00 GHz),
   32 GB RAM,
   64-bit system

Attempts:
Test 1: open given file in open office writer
File Size: 3.76MB
Steps: 
1. double clicked file to open
Expected outcome: load file
Actual outcome: Writer crashed at 80%

Test 2: open given file in windows (to see if it is corrupted)
File Size: 569KB
Steps: 
1. right clicked and open with microsoft office
2. when prompted to recover hit yes.
Expected outcome: load file
Actual outcome: file is corrupted but able to be recovered. Repaired style 1 to recover the document.

Test 3: open recovered file in open office
File Size: 569KB
Steps: 
1. right clicked and open with open office
Expected outcome: load file
Actual outcome: file is able to be loaded. I noticed that the style was in Russian (Титульный лист) and the font was a non-standard one (XITS). Not sure whether that style had caused any issues with the document.

Test 4: change the style to default and the font to Adobe Arabic, then reload
File Size: 2.17MB
Steps: 
1. select all
2. change style to default and font to Adobe Arabic
3. save file and close
4. open file
Expected outcome: load file and no real effect
Actual outcome: file is able to be loaded, but the size of the document increased from 569KB to 2.17MB

Test 5: double file size to be equal or greater than original document and reload
File Size: 4.36MB
Steps: 
1. select all
2. copy and paste (after deselecting)
3. save file and close
4. open file
Expected outcome: load file and no real effect
Actual outcome: file loaded, no change

Test 6: create new document of size greater than 4MB with formulas and reload
File Size: 6.12MB
Steps: 
1. add in "X" and nonsensical formulas for about a page
2. copy and paste to 105 pages
3. save file and close
4. open file
Expected outcome: load file
Actual outcome: file loaded without issue

Conclusion: 
 If I had to guess, I would say that either the document got corrupted when it was last saved or that the bug is related to a style or format that was not replicated in my own test document and got removed by Microsoft's recovery function.
Comment 4 yury_t 2016-09-26 15:32:30 UTC
I am unsure what those tests were supposed to mean.
That file, like all succeeding versions of the original that was obfuscated to produce it, opens in LibreOffice 4.3 series and newer.
And I'm fairly sure it's the sheer complexity of the structure that triggers the behaviour: one page of random content times 300 is not complex enough (it was tried).
So, is there any hope of fixing this in AOO?
Comment 5 yury_t 2017-10-04 15:42:53 UTC
Hey guys, one year having passed, just checked with 

AOO420m1(Build:9800)  -  Rev. 1811013
2017-10-04_04:15:13 - Rev. 1811038

The issue is still there. Loading the big file progresses to 75-80%, then AOO freezes, and shortly afterwards the system load goes to about 100%.

Is there any hope of solving this? You guys are crafty with Git and stuff, you could replicate the solution of the similar issue in LibreOffice (link in #1), or trace the differences in the relevant modules between the LO 4.2.* and 4.3.* series.

I (still) can't attach a 4 MB file to this Bugzilla, so you'd have to rely on the attachments in the page linked in #1.
Comment 6 mroe 2017-10-04 17:23:27 UTC
(In reply to yury_t from comment #2)
> Well, the big file in question is available at:
> https://bugs.documentfoundation.org/attachment.cgi?id=114621

This file was created with an older version of LibreOffice. And that version of LibreOffice can't open its own file. So what does this have to do with Apache OpenOffice?
IMHO Nothing!

As you stated, a newer version can read this file. Fine.
Now it would be a good goal, if LibreOffice would create files in a standard way, so other ODF-applications can also read these files.
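
For anyone who wants to check which application last saved a given ODT, a minimal sketch follows. It assumes the file is a normal ODF package containing a meta.xml with a meta:generator element (both come from the OpenDocument specification); the function name read_generator is made up for illustration.

```python
import zipfile
import xml.etree.ElementTree as ET

# ODF metadata namespace, as defined by the OpenDocument specification.
META_NS = "urn:oasis:names:tc:opendocument:xmlns:meta:1.0"


def read_generator(odt):
    """Return the meta:generator string of an ODT, or None if it is absent.

    `odt` may be a file path or a binary file-like object.
    (Hypothetical helper, not part of AOO or LO.)
    """
    with zipfile.ZipFile(odt) as archive:
        root = ET.fromstring(archive.read("meta.xml"))
    node = root.find(f".//{{{META_NS}}}generator")
    return node.text if node is not None else None
```

Calling read_generator("bigfile.odt") (file name assumed) returns the generator string, which normally names the product and version that wrote the file.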
Comment 7 mroe 2017-10-04 19:21:10 UTC
Amendment:

If you can create such a huge file with 1290(!) (math) objects inside with Apache OpenOffice and AOO saves this file without errors but can not reload it, feel free to reopen this issue as an AOO bug report.
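
The object count can be re-checked without opening the document in either office suite. This is a rough sketch, assuming the embedded objects appear as draw:object elements in content.xml (the usual ODF representation); it counts every embedded object, not only Math formulas, so treat the result as an upper bound on the formula count. The name count_embedded_objects is hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

# ODF drawing namespace, as defined by the OpenDocument specification.
DRAW_NS = "urn:oasis:names:tc:opendocument:xmlns:drawing:1.0"


def count_embedded_objects(odt):
    """Count draw:object elements in content.xml of an ODT.

    `odt` may be a file path or a binary file-like object.
    The whole content.xml is parsed into memory, which is fine for
    a few megabytes of XML.
    """
    with zipfile.ZipFile(odt) as archive:
        root = ET.fromstring(archive.read("content.xml"))
    return sum(1 for _ in root.iter(f"{{{DRAW_NS}}}object"))
```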
Comment 8 yury_t 2017-10-04 20:18:31 UTC
(In reply to mroe from comment #7)
> Amendment:
> 
> If you can create such a huge file with 1290(!) (math) objects inside with
> Apache OpenOffice and AOO saves this file without errors but can not reload
> it, feel free to reopen this issue as an AOO bug report.

I don't understand this position.

If both branches (LO and AOO), based on common codebase, fail in the same manner, processing the same input, then it's only logical to pick up the (principle of the) solution from one branch to another.

I assure you, this file just exposes in AOO the same weakness that LO had prior to 4.3 series.
Comment 9 mroe 2017-10-04 20:53:09 UTC
Created attachment 86219
The file without content

I've opened your file with LO 5 and deleted the content in it.
LO can reopen it, but AOO can't. So it has nothing to do with the size or the number of objects in it, only with HOW LO saves the file.

Please ask the LO people to find out what is wrong inside the written file!

Again: Create a file with AOO, which AOO is unable to open after a successful save!
Comment 10 mroe 2017-10-04 20:54:23 UTC
To reopen this issue as a valid bug, please:

You have LO and you have AOO.
Simply copy the content from LO to AOO and create the indexes. Save the file. Look if AOO can reopen the file or not.
Comment 11 yury_t 2017-10-05 05:30:07 UTC
(In reply to mroe from comment #10)
> To reopen this issue as a valid bug, please:
> 
> You have LO and you have AOO.
> Simply copy the content from LO to AOO and create the indexes. Save the
> file. Look if AOO can reopen the file or not.

AOO can't even paste the content, throws "std::bad_alloc" messagebox at me and aborts.

To me, it looks like there are issues in AOO coming from the old (pre-4 series) codebase that it shares with LO.

(I remember now I've seen the "std::bad_alloc" in LO console too, in context of opening this same big file, before the issue was fixed in LO 4.3 series).
Comment 12 mroe 2017-10-05 12:13:18 UTC
Deleting the bibliography index lets AOO open the (big) file.
Maybe there is a developer who wants to look at this problem.

I set the Version to 3.3.0 or older because the file was created with a version earlier than 4.0. I don't know, if such a problem file can be created with AOO, so I reset the issue back as UNCONFIRMED.
Comment 13 yury_t 2017-10-05 12:46:37 UTC
Well, thank you for letting this squat on your BZ.

The file in question grew to its breaking in LO 3.6.7.2 (early 2015), but was created, I think, in fairly low 3.* versions (late 2012), might be even before 'the Schism'.

It was several years' work, and what would I know about 'should' and 'should not' (do with OO of either extraction)?
Comment 14 yury_t 2017-12-07 17:55:55 UTC
Some minor followup on this:

Minding the #c12, I've deleted the bibliography in LO 5, saved and opened the original (not obfuscated) bigfile in AOO 4.1.4 - success!

Still in AOO 4.1.4, added bibliography, saved - success!

Tried to open in 4.2.0 AOO420m1(Build:9800)  -  Rev. 1817003 // 2017-12-05_04:15:45 - Rev. 1817150 : after progressbar finished 100% and disappeared, I've got msgbox 'General error', but after dismissing it I've got the file on screen, opened.

Then, tried to save this file (still in 4.2.0). Progressbar finished, got msgbox 'Error saving the document ... // General error'. However, the file is already there, okay by the superficial look.

And when this file in its turn is opened, the same thing with 'general error' happens (like described two paragraphs back).

Would you comment?
Comment 15 yury_t 2017-12-07 18:28:09 UTC
However, the AOO415m1(Build:9789)  -  Rev. 1816477 // 2017-12-03_12:47:08 - Rev. 1817032 used in the same scenario doesn't show those msgboxes.
Comment 16 yury_t 2017-12-30 08:34:38 UTC
Re-checked with:

AOO420m1(Build:9800)  -  Rev. 1819451
2017-12-29_04:16:37 - Rev. 1819466

-- shows msgbox 'General error' after the doc load completion but before the writer screen opens. Dismissing the msgbox gets to the doc opened okay (? at least, I see no immediate problems). Re-saving the doc goes like in #14.

AOO415m1(Build:9789)  -  Rev. 1817496
2017-12-24_12:46:45 - Rev. 1819218

-- opens and re-saves without complaints.
Comment 17 Matthias Seidel 2017-12-30 11:46:15 UTC
(In reply to yury_t from comment #16)
> Re-checked with:
> 
> AOO420m1(Build:9800)  -  Rev. 1819451
> 2017-12-29_04:16:37 - Rev. 1819466
> 
> -- shows msgbox 'General error' after the doc load completion but before the
> writer screen opens. Dismissing the msgbox gets to the doc opened okay (? at
> least, I see no immediate problems). Re-saving the doc goes like in #14.
> 
> AOO415m1(Build:9789)  -  Rev. 1817496
> 2017-12-24_12:46:45 - Rev. 1819218
> 
> -- opens and re-saves without complaints.

The message box in 4.2.0 is not related to your problem.

See i127315
Comment 18 yury_t 2018-06-16 17:57:04 UTC
A few days ago, I had to process my big file in LO.

The big file had had its big problematic bibliography index removed and re-done in AOO, and up to that moment had been opening in AOO nicely.

After I worked with the file in LO, I tried to open it again in AOO 4.2. Guess what: the issue was back. Opening gets to about 75% in the progress bar, CPU usage maxes out, and after I kill the process, it sometimes shows me a message box about "bad std alloc" or something.

I verified this by removing the bibliography in LO and saving and opening in AOO -- opens okay, so the point of failure is the bibliography somehow.

The LO 5.4 has its saving format set to 1.2 extended in compatibility mode.

Would attaching just the bibliography part of the contents XML help? I still can't publish the complete file.
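
In case it helps to describe the bibliography without publishing the file, here is a small sketch that only reports numbers. It assumes the index is stored as text:bibliography in content.xml and that citations are text:bibliography-mark elements, as the OpenDocument specification defines them; summarize_bibliography and the dictionary keys are invented for this example, and the byte size is only approximate.

```python
import zipfile
import xml.etree.ElementTree as ET

# ODF text namespace, as defined by the OpenDocument specification.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def summarize_bibliography(odt):
    """Report how large the bibliography part of an ODT is.

    `odt` may be a file path or a binary file-like object.
    Nothing from the document text is returned, only counts.
    """
    with zipfile.ZipFile(odt) as archive:
        root = ET.fromstring(archive.read("content.xml"))

    # Each generated bibliography index is one text:bibliography element.
    indexes = list(root.iter(f"{{{TEXT_NS}}}bibliography"))
    # Citations in the running text are text:bibliography-mark elements.
    marks = sum(1 for _ in root.iter(f"{{{TEXT_NS}}}bibliography-mark"))
    # Paragraphs inside the indexes (title plus generated entries).
    paragraphs = sum(
        1 for index in indexes for _ in index.iter(f"{{{TEXT_NS}}}p")
    )
    # Approximate serialized size of the index elements, in bytes.
    size = sum(len(ET.tostring(index)) for index in indexes)

    return {
        "indexes": len(indexes),
        "marks": marks,
        "index_paragraphs": paragraphs,
        "index_bytes": size,
    }
```

Comparing these numbers for a copy saved by LO and a copy whose index was rebuilt in AOO might show whether the two suites write the index differently.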
Comment 19 yury_t 2018-07-02 05:17:11 UTC
Created attachment 86443
big file (just the bibliography)

This is the big file in question, cut down to just the bibliography index. The bibliography was rebuilt in LO 5.4.7.2, which is set to save in ODF 1.2 extended format (compatibility mode).
On opening, it freezes AOO 4.2, with CPU and memory usage maxing out.
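
When re-testing new builds against this attachment, it can help to separate "hangs" from "fails quickly" automatically. The sketch below is a generic timeout wrapper, not anything AOO provides: the command that makes an office build load the file is left to the caller (it is not specified anywhere in this issue), and classify_load is a made-up name.

```python
import subprocess


def classify_load(command, timeout_s):
    """Run `command` and classify the outcome.

    Returns "ok" for exit code 0, "failed" for any other exit code,
    and "hung" if the process is still running after `timeout_s`
    seconds (subprocess.run kills the child in that case).
    """
    try:
        result = subprocess.run(
            command,
            timeout=timeout_s,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        return "hung"
    return "ok" if result.returncode == 0 else "failed"
```

Note that only the directly started process is killed on timeout; if the launcher starts a separate worker process, that one may have to be cleaned up by hand.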