Issue 75547 - AutoCorrect: make table detection more sensitive
Summary: AutoCorrect: make table detection more sensitive
Alias: None
Product: Writer
Classification: Application
Component: editing
Version: OOo 2.1
Hardware: All All
Importance: P3 Trivial
Target Milestone: ---
Assignee: AOO issues mailing list
QA Contact:
Keywords: needmoreinfo
Depends on: 75524
Reported: 2007-03-20 10:34 UTC by tuharsky
Modified: 2017-05-20 10:12 UTC

See Also:
Latest Confirmation in: ---
Developer Difficulty: ---


Description tuharsky 2007-03-20 10:34:40 UTC
This is another option for the Bad Document Correction Tool (Issue 75524).
Its purpose is to detect and reformat textual tables.

Some users create a simple textual table by just typing the values and
separating (and also aligning) them with TABs.
The problem is that users tend to use several TABs in a row to form columns
(to "put values under each other"), and when the font or other formatting
changes, perhaps just because the document is opened on another machine with a
different printer installed, the table breaks.
Users then tend to restore the broken layout by adding or deleting TAB
characters, which is not a good solution.

The tool should detect such a textual table.
1, If the table was created with groups of spaces, it should already have been
caught by Issue 75545 (Multiple spacebars removal). However, in case the
option from Issue 75545 wasn't used, rule 4 from 75545 should be applied
separately anyway.

2, Now, if there is such a table, it uses TABs to separate the values.

3, We can start table detection. We could consider a line to be part of a table
if all of the following rules match:
 a, there is a line break before the line
 b, the line ends with a line break
 c, the line fits on a single text line, or at most wraps onto a second text
line; in that case we still consider it a single line (delimited by the two
line breaks)
 d, there are at least 2 "islands of symbols" separated by TABs in the line

4, If there are several such lines directly below one another, it is probably a table.

5, We consider the "islands of symbols" to represent table cells.

6, We define columns for internal purposes, simply by the number of "cells" in each line.

7, For each column, we define the "column width" as the length of the longest
"cell" content in the column.
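
The detection rules 3 to 7 above could be sketched roughly as follows. This is only an illustration on plain text, not Writer code: the function names (split_cells, find_tables, column_widths) and the min_rows parameter are made up for this example, each list item is assumed to be one paragraph delimited by line breaks (rules a and b), and rule c (wrapping onto a second text line) is ignored because it depends on the actual layout. Column width is measured in characters.

```python
import re


def split_cells(line):
    """Split a line on runs of TABs and drop empty pieces (rule 5)."""
    return [c.strip() for c in re.split(r"\t+", line) if c.strip()]


def find_tables(lines, min_rows=2):
    """Return (start_index, rows) for every run of consecutive table-like lines.

    A line is table-like if it contains a TAB and at least two
    "islands of symbols" (rule 3d). min_rows is an assumed threshold
    for "several such lines" (rule 4).
    """
    tables, start, rows = [], None, []
    for i, line in enumerate(list(lines) + [""]):  # sentinel flushes the last run
        cells = split_cells(line)
        if "\t" in line and len(cells) >= 2:
            if start is None:
                start = i
            rows.append(cells)
        else:
            if len(rows) >= min_rows:
                tables.append((start, rows))
            start, rows = None, []
    return tables


def column_widths(rows):
    """Width of each column = length of its longest cell (rules 6 and 7)."""
    n_cols = max(len(r) for r in rows)
    return [
        max((len(r[k]) for r in rows if k < len(r)), default=0)
        for k in range(n_cols)
    ]


if __name__ == "__main__":
    text = ["Intro text", "Name\t\tPrice", "Apple\t\t\t1.20", "Watermelon\t3", "", "End"]
    for start, rows in find_tables(text):
        print(start, rows, column_widths(rows))
```

Treating a run of TABs as a single separator is a deliberate simplification: it matches the habit described above of stacking TABs to align values, but it would also merge intentionally empty cells.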
Comment 1 tuharsky 2007-03-20 11:01:19 UTC
Now we can consider what to do with the detected table.

We could first define a text division and, inside it, set the TAB distances
according to the columns' widths. Once the multiple TABs are removed, the table
should be "nice and ready".

Another possibility is to create a regular new table (with visible borders) in
the document and place the values into it.
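
The first option (one TAB per cell plus TAB distances derived from the column widths) could look roughly like this. It is a sketch under assumptions: tab_stops, normalize_rows and the gap parameter are hypothetical names, positions are counted in characters rather than in real document lengths, and the rows are assumed to be already split into cells.

```python
def tab_stops(widths, gap=2):
    """Start position of every column after the first, in characters.

    widths: longest cell length per column; gap: assumed spacing
    between columns.
    """
    stops, pos = [], 0
    for w in widths[:-1]:
        pos += w + gap
        stops.append(pos)
    return stops


def normalize_rows(rows):
    """Rebuild each line with exactly one TAB between neighbouring cells."""
    return ["\t".join(cells) for cells in rows]


if __name__ == "__main__":
    rows = [["Name", "Price"], ["Apple", "1.20"], ["Watermelon", "3"]]
    print(tab_stops([10, 5]))
    print(normalize_rows(rows))
```

In a real document the stops would have to be expressed in the document's length units and depend on the font, so the character counts here only stand in for measured text widths.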
Comment 2 michael.ruess 2007-03-20 11:04:14 UTC
Reassigned to SBA.
Comment 3 michael.ruess 2009-07-24 11:00:44 UTC
Reassigned to requirements.
Comment 4 Edwin Sharp 2014-03-21 14:50:05 UTC
Please attach an example.
Comment 5 Edwin Sharp 2014-04-03 10:31:51 UTC
No info from author.