Bug 49687 - [PATCH] Complex Script Support
[PATCH] Complex Script Support
Status: CLOSED FIXED
Product: Fop - Now in Jira
Classification: Unclassified
Component: general
trunk
All All
: P2 major
: ---
Assigned To: fop-dev
:
Depends on:
Blocks:
  Show dependency tree
 
Reported: 2010-08-02 05:36 UTC by Glenn Adams
Modified: 2012-02-27 18:14 UTC (History)
1 user (show)



Attachments
Patch to FOP trunk revision 981406. (119.40 KB, patch)
2010-08-02 05:36 UTC, Glenn Adams
Details | Diff
Simple test Arabic input FO file. (2.20 KB, text/plain)
2010-08-02 05:38 UTC, Glenn Adams
Details
Simple test Arabic output PDF file. (26.87 KB, application/pdf)
2010-08-02 05:39 UTC, Glenn Adams
Details
Patch to FOP trunk revision 981406. (118.41 KB, patch)
2010-08-02 21:08 UTC, Glenn Adams
Details | Diff
Simple test Arabic output PDF file. (47.44 KB, application/pdf)
2010-08-03 04:45 UTC, Glenn Adams
Details
Patch to revision 987110 of branch Temp_ComplexScripts. (110.42 KB, patch)
2010-08-19 06:14 UTC, Glenn Adams
Details | Diff
Patch to revision 1002949 of branch Temp_ComplexScripts. (99.68 KB, patch)
2010-09-30 02:02 UTC, Glenn Adams
Details | Diff
Patch to revision 1076316 of branch Temp_ComplexScripts. (198.17 KB, patch)
2011-03-02 13:31 UTC, Glenn Adams
Details | Diff
Patch to revision 1136002 of branch Temp_ComplexScripts. (362.14 KB, patch)
2011-06-15 16:32 UTC, Glenn Adams
Details | Diff
Patch to revision 1183594 of branch Temp_ComplexScripts (381.98 KB, patch)
2011-10-15 09:58 UTC, Glenn Adams
Details | Diff
list of Gujarati words and sentences (14.77 KB, text/plain)
2011-10-20 03:55 UTC, Dilip Shah
Details
Patch to revision 1198396 of trunk. (730.68 KB, patch)
2011-11-07 03:46 UTC, Glenn Adams
Details | Diff
Patch to revision 1236835 of trunk. (747.15 KB, patch)
2012-01-27 20:27 UTC, Glenn Adams
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Glenn Adams 2010-08-02 05:36:47 UTC
Created attachment 25825 [details]
Patch to FOP trunk revision 981406.

This patch adds support for complex scripts, starting with Arabic. Included in this patch are:

* support for XSL FO 1.1 Section 5.8 Unicode BIDI Processing
* generic support for TrueType and OpenType advanced typography tables
* support for OpenType GSUB table (partial)
* support for OpenType GPOS table (partial)
* support for Arabic specific script processing

This patch should be considered preliminary support for the above features. Not all aspects of these features are implemented or tested. In particular, only three Microsoft Arabic fonts have been (partially) tested:

* Simplified Arabic (simpo.ttf)
* Simplified Arabic Bold (simbdo.ttf)
* Arial Unicode MS (arialuni.ttf)

Certain important functions needed to adequately support Arabic are not yet in place. In particular, the following either do not work or work only partially:

* writing-mode relative properties
* glyph positioning of non-spacing marks
* arabic fonts that require use of multiple, context, or chained context subtable types

Notwithstanding the above limitations, basic Arabic content is adequately processed and formatted provided that the above fonts are used without non-spacing marks, and with explicit text-align="right" at the fo:block level (or above). Note also that only simple inline content has been tested thus far in bidi block contexts, namely #PCDATA and fo:inline; support for the remaining inline FOs in a bidi context must await further improvements to this patch.

Because this patch involves modification or addition of nearly 150 files in the core sources, I recommend that a temporary branch be created in order to permit further maturation, and, in particular, to allow me time to add test suites, enhancements, and bug fixes prior to merging into the main development trunk.

Both findbugs and checkstyle have been run against the patch with no new warnings or errors; PMD has not been run. Actually some checkstyle warnings in the trunk have been fixed by this patch. Note also the addition of findbugs-exclude.xml and checkstyle-suppressions.xml at the top-level directory, which are used to filter blessed warnings.

Note further that one existing layout engine test file has been disabled due to dependencies on an incorrect prior implementation of the writing-mode property:

simple-page-master_writing-mode_rl_region-body_writing-mode-lr.xml

I will be adding a complete set of tests on the writing-mode property before I propose merging with trunk. Three other tests were modified slightly to account for defaulting of the offset attribute on word and space elements in the XML area tree output. 

This patch takes the form of a diff file produced by using the 'svn diff' command against revision 981406 of the trunk.

Regards,
Glenn Adams
Comment 1 Glenn Adams 2010-08-02 05:38:37 UTC
Created attachment 25826 [details]
Simple test Arabic input FO file.
Comment 2 Glenn Adams 2010-08-02 05:39:21 UTC
Created attachment 25827 [details]
Simple test Arabic output PDF file.
Comment 3 Vincent Hennebert 2010-08-02 07:02:11 UTC
Hi Glenn,

Thanks for your patch. I've just created a temporary branch for processing it:
https://svn.apache.org/repos/asf/xmlgraphics/fop/branches/Temp_ComplexScripts

I'll apply your patch to the branch soon (or Simon if he beats me to it). Please create your diff files from this branch from now on.

More later.

Thanks,
Vincent
Comment 4 Pascal Sancho 2010-08-02 08:10:17 UTC
(In reply to comment #3)

Hi,
I tried to apply patch against the new Vincent's Temp_ComplexScripts branch, but I get the following error during build:
 

compile-java:
    [mkdir] Created dir: D:\bin\java\fop\Temp_ComplexScripts\build\classes
    [javac] Compiling 1316 source files to D:\bin\java\fop\Temp_ComplexScripts\build\classes
    [javac] D:\bin\java\fop\Temp_ComplexScripts\src\java\org\apache\fop\fonts\type1\AFMFile.java:59: generics are not supported in -source 1.4
    [javac] (use -source 5 or higher to enable generics)
    [javac]     private List<AFMCharMetrics> charMetrics = new java.util.ArrayList<AFMCharMetrics>();
    [javac]                 ^
    [javac] 1 error


Pascal
Comment 5 Glenn Adams 2010-08-02 20:50:42 UTC
You need to create a build-local.properties file at the top-level of FOP tree, containing, e.g., the following:

javac.source = 1.5
javac.target = 1.5
javac.fork = on
junit.fork = on

Note that the 1.5 dependency below was not introduced by this patch.

Glenn

(In reply to comment #4)
> (In reply to comment #3)
> 
> Hi,
> I tried to apply patch against the new Vincent's Temp_ComplexScripts branch,
> but I get the following error during build:
> 
> 
> compile-java:
>     [mkdir] Created dir: D:\bin\java\fop\Temp_ComplexScripts\build\classes
>     [javac] Compiling 1316 source files to
> D:\bin\java\fop\Temp_ComplexScripts\build\classes
>     [javac]
> D:\bin\java\fop\Temp_ComplexScripts\src\java\org\apache\fop\fonts\type1\AFMFile.java:59:
> generics are not supported in -source 1.4
>     [javac] (use -source 5 or higher to enable generics)
>     [javac]     private List<AFMCharMetrics> charMetrics = new
> java.util.ArrayList<AFMCharMetrics>();
>     [javac]                 ^
>     [javac] 1 error
> 
> 
> Pascal
Comment 6 Glenn Adams 2010-08-02 20:57:58 UTC
Ah, I take that back. It looks like this got inadvertently included from an early merge from trunk of some new AFM code that used generics, but prior to that dependency being backed out.

I'll fix this in my next patch to the new branch, that is, unless whoever applies this current patch to the branch takes care of it first. In the mean time, you can use my suggested work around.

G.

(In reply to comment #5)
> 
> Note that the 1.5 dependency below was not introduced by this patch.
Comment 7 Glenn Adams 2010-08-02 21:08:30 UTC
Created attachment 25830 [details]
Patch to FOP trunk revision 981406.

This replacement patch removes the erroneously introduced JDK 1.5 (generics) dependency.
Comment 8 Glenn Adams 2010-08-02 21:10:00 UTC
I decided not to wait, so here is a replacement patch that fixes the 1.5 dependency. Let me know if there are further build problems on 1.4.

G.

(In reply to comment #7)
> Created an attachment (id=25830) [details]
> Patch to FOP trunk revision 981406.
> 
> This replacement patch removes the erroneously introduced JDK 1.5 (generics)
> dependency.
Comment 9 Pascal Sancho 2010-08-03 03:14:45 UTC
(In reply to comment #8)
Glenn, build is OK, now.
I didn't made Junit tests, I have to check my config before.
Thx.
Comment 10 Glenn Adams 2010-08-03 03:59:31 UTC
For further documentation and issue tracking for this patch, see the following:

https://skynav.trac.cvsdude.com/fop

Regards,
Glenn
Comment 11 Glenn Adams 2010-08-03 04:45:56 UTC
Created attachment 25831 [details]
Simple test Arabic output PDF file.
Comment 12 Pascal Sancho 2010-08-03 04:49:43 UTC
(In reply to comment #8)

Glenn, I do not get Character shaping when applying patch (against either trunk r981406 or Temp_ComplexScripts). The output is the same as FOP 1.0 without patch.

I've tested it with Arial UNI (I didn't get Simplified Arabic).
--
Pascal
Comment 13 Pascal Sancho 2010-08-04 03:32:34 UTC
(In reply to comment #12)
Works fine now (with Glenn help)
Thx
Comment 14 Dharmesh Rana 2010-08-10 01:38:43 UTC
(In reply to comment #11)
> Created an attachment (id=25831) [details]
> Simple test Arabic output PDF file.

Hi Glenn,

Can you please let me know in detail, how this Arabic patch that you have created to be applied and run?

Thanks,
Dharmesh Rana
Comment 15 Glenn Adams 2010-08-10 01:49:34 UTC
It would be best to wait until the patch is committed to the Temp_ComplexScripts branch that has been created for it, then you will need to build it prior to running it. You will also need one or more of the (presently) supported fonts, about which see https://skynav.trac.cvsdude.com/fop/wiki/SupportedFonts.

I have no control over the schedule for the committing of the patch, as it will have to be performed by an FOP committer. But you can express your interest in having them do that on the following mail list: <fop-user@xmlgraphics.apache.org>. If you are impatient, you can apply the patch yourself, but you will need the appropriate skills to do that.

Regards,
Glenn

(In reply to comment #14)
> (In reply to comment #11)
> > Created an attachment (id=25831) [details] [details]
> > Simple test Arabic output PDF file.
> 
> Hi Glenn,
> 
> Can you please let me know in detail, how this Arabic patch that you have
> created to be applied and run?
> 
> Thanks,
> Dharmesh Rana
Comment 16 Glenn Adams 2010-08-19 06:14:14 UTC
Created attachment 25912 [details]
Patch to revision 987110 of branch Temp_ComplexScripts.

This patch file replaces the prior patch against trunk. It is intended to be applied to revision 987110 of branch Temp_ComplexScripts as follows:

$ cd .../xmlgraphics/fop/branches/Temp_ComplexScripts
$ svn update -r 987110 .
$ gzcat .../fop-i18n.arabic-patch-1-update-1.diff.tgz | patch -p1
$ svn add `svn status . | grep '?' | awk '{print $2;}'`

This patch has been verified to build and run junit tests without regression, with 4 compiler warnings (none new), 0 checkstyle warnings, and 971 findbugs bugs (none new).

The intent of this patch update is to merge recent changes to the trunk, as well as fix three problems recently reported with the Arabic shape selection and bidi logic.

Note well that this patch update does NOT address the following outstanding issues:

* support for correct treatment and placement of non-spacing marks
* support for certain advanced GSUB table types
* support for GPOS table types
* support for inline FOs in bidi conexts other than fo:inline, fo:bidi-override, and fo:character
* correct treatment of corresponding properties in right-to-left contexts
* incorrect justification of arabic text with text-align="justify"

These latter issues will be addressed in subsequent patches to the Temp_ComplexScripts branch.

Regards,
Glenn
Comment 17 Simon Pepping 2010-08-19 15:48:11 UTC
Committed revision 987282
Comment 18 akarmoun 2010-09-22 06:36:43 UTC
Hello,

we want to test the fix for arabic characters. What is the svn checkout URL for this fop revision in trunk? We've tested with current trunk of 22.09.2010. How can I install the patch? what#s the svn checkout base?

Thanks!

Jens and Abde
Comment 19 Simon Pepping 2010-09-22 08:02:12 UTC
Summary of web links
====================

This work is available in a git repository, see http://github.com/skynavga/fop. The repository URLs are: http://github.com/skynavga/fop.git and git://github.com/skynavga/fop.git

This work is available on the subversion branch Temp_ComplexScripts. The repository URL is: http://svn.apache.org/repos/asf/xmlgraphics/fop/branches/Temp_ComplexScripts

Binary snapshots of this work are available at http://people.apache.org/~spepping/. Updated snapshots will be provided whenever there is a new patch.

The git repository is most up to date; the subversion repository waits for the submission of patches.
Comment 20 Glenn Adams 2010-09-30 02:02:28 UTC
Created attachment 26104 [details]
Patch to revision 1002949 of branch Temp_ComplexScripts.

This patch is to be applied to revision 1002949 of branch Temp_ComplexScripts as follows:

cd ${FOP}/branches/Temp_ComplexScripts
svn update -r 1002949 --force
gzcat ${DOWNLOADS}/fop-i18n.arabic-patch-2.diff.gz | patch -p0
svn rm src/java/org/apache/fop/fonts/GlyphUtils.java
svn add src/java/org/apache/fop/fonts/GlyphSubstitutionState.java
svn add src/java/org/apache/fop/fonts/GlyphClassMapping.java
svn add src/java/org/apache/fop/fonts/GlyphPositioningState.java
svn add src/java/org/apache/fop/fonts/GlyphCoverageMapping.java
svn add src/java/org/apache/fop/fonts/GlyphProcessingState.java
svn add src/java/org/apache/fop/fonts/GlyphClassTable.java
svn add src/java/org/apache/fop/fonts/GlyphDefinitionTable.java
svn add src/java/org/apache/fop/fonts/GlyphMappingTable.java
svn add src/java/org/apache/fop/fonts/GlyphDefinitionSubtable.java
svn add src/java/org/apache/fop/fonts/GlyphDefinition.java
svn add src/java/org/apache/fop/fonts/ScriptContextTester.java
svn add src/java/org/apache/fop/fonts/GlyphTester.java
svn commit ...

This patch completes the first round of functional support for OpenType GDEF/GSUB/GPOS advanced typographic tables as employed by selected Arabic fonts.

Regards,
Glenn
Comment 21 Dmitrij Golubev 2010-10-06 12:55:34 UTC
Hi! 

It is work great for whole document.
Could you implement similar for 

<fo:block-container writing-mode="rl-tb">

</fo:block-container>

I my project a lot of text contain mixed text (arabic, cyrilic, latin), therefore I have a big problem with FOP.
Comment 22 Dmitrij Golubev 2010-10-06 14:32:01 UTC
find problem: 
1. Background color for inline element does not drawed
<fo:inline background-color="#f3b53f" >

2. Possible not in right place 
if in fop.xconf add <auto-detect/> in section <fonts>

SEVERE: SVG graphic could not be rendered. Reason: java.lang.UnsupportedOperationException: unsupported device table del
ta format: 0
java.lang.UnsupportedOperationException: unsupported device table delta format: 0
        at org.apache.fop.fonts.truetype.TTFFile.readPosDeviceTable(TTFFile.java:3239)
        at org.apache.fop.fonts.truetype.TTFFile.readPosAnchor(TTFFile.java:3588)
        at org.apache.fop.fonts.truetype.TTFFile.readMarkToBasePosTableFormat1(TTFFile.java:3711)
        at org.apache.fop.fonts.truetype.TTFFile.readMarkToBasePosTable(TTFFile.java:3766)
        at org.apache.fop.fonts.truetype.TTFFile.readGPOSSubtable(TTFFile.java:4649)
        at org.apache.fop.fonts.truetype.TTFFile.readLookupTable(TTFFile.java:4721)
Comment 23 Imad Daou 2010-11-04 18:17:31 UTC
Dear all,

I was using FOP for more than 1 year and it's a great tool for printing and creating PDF files and now I have problem printing ARABIC Text.

I downloaded and tested the FO-PDF converted and it rendered the text correctly, but this does not fix my problem since I was using the PrintRenderer to directly print an FO file.

My question is, when can I expect a new patch or release of FOP with Arabic print-rendering support.

Best regards,
Imad Daou
Comment 24 Glenn Adams 2010-11-05 11:13:34 UTC
I cannot determine from your email whether you are using this new patch or not. In any case, only the IF, AT, and PDF renderers are presently supported by this patch, and I do not have a schedule for when the remaining renderers will be supported.

Regards,
Glenn

(In reply to comment #23)
> Dear all,
> 
> I was using FOP for more than 1 year and it's a great tool for printing and
> creating PDF files and now I have problem printing ARABIC Text.
> 
> I downloaded and tested the FO-PDF converted and it rendered the text
> correctly, but this does not fix my problem since I was using the PrintRenderer
> to directly print an FO file.
> 
> My question is, when can I expect a new patch or release of FOP with Arabic
> print-rendering support.
> 
> Best regards,
> Imad Daou
Comment 25 Glenn Adams 2010-11-05 11:16:24 UTC
I cannot determine from your email whether you are using this new patch or not. In any case, only the IF, AT, and PDF renderers are presently supported by this patch, and I do not have a schedule for when the remaining renderers will be supported.

Regards,
Glenn

(In reply to comment #23)
> Dear all,
> 
> I was using FOP for more than 1 year and it's a great tool for printing and
> creating PDF files and now I have problem printing ARABIC Text.
> 
> I downloaded and tested the FO-PDF converted and it rendered the text
> correctly, but this does not fix my problem since I was using the PrintRenderer
> to directly print an FO file.
> 
> My question is, when can I expect a new patch or release of FOP with Arabic
> print-rendering support.
> 
> Best regards,
> Imad Daou
Comment 26 Virginijus Kandrotas 2011-02-04 11:53:07 UTC
Dear Glenn,

thank you for this patch. 

I am testing it and have following problem: Arabic letter in the words looks not connected together (separated), can you please explain this behavior ?

Best regards,

Virginijus
Comment 27 Glenn Adams 2011-02-04 13:47:24 UTC
(In reply to comment #26)
> Dear Glenn,
> 
> thank you for this patch. 
> 
> I am testing it and have following problem: Arabic letter in the words looks
> not connected together (separated), can you please explain this behavior ?
> 
> Best regards,
> 
> Virginijus

please use the i18n.arabic branch at git://github.com/skynavga/fop.git for the latest code; also, please restrict your fonts to those specified described under "Supported Fonts" at https://skynav.trac.cvsdude.com/fop/wiki/ComplexScripts

regards, glenn
Comment 28 Glenn Adams 2011-03-02 13:31:48 UTC
Created attachment 26721 [details]
Patch to revision 1076316 of branch Temp_ComplexScripts.


This patch is to be applied to revision 1076316 of branch Temp_ComplexScripts
as follows:

cd ${FOP}/branches/Temp_ComplexScripts
svn update -r 1076316 --force
svn revert -R .
svn status # check and remove any unexpected changes prior to patching
gzcat ${DOWNLOADS}/fop-i18n.arabic-patch-3.diff.gz | patch -p0
ant codegen-unicode-bidi # needed to generate test data to be committed
svn add test/java/org/apache/fop/complexscripts
svn add src/java/org/apache/fop/fonts/AdvancedTypographicTableFormatException.java
svn add src/codegen/complexscripts
svn add src/codegen/unicode/java/org/apache/fop/text/bidi/GenerateBidiTestData.java
svn commit ...

This patch completes the next update of support for OpenType
GDEF/GSUB/GPOS advanced typographic tables as employed by selected Arabic
fonts. This patch includes the following:

* bug fixes
* new codegen-unicode-bidi target (used to generate bidi test data above)
* new junit-complexscripts target

N.B. the use of "ant codegen-unicode-bidi" above, which is needed to generate test data to be committed.

Regards,
Glenn
Comment 29 Simon Pepping 2011-03-05 03:44:43 UTC
(In reply to comment #28)
> Created an attachment (id=26721) [details]
> Patch to revision 1076316 of branch Temp_ComplexScripts.

Thanks. Committed. I note the following problem: In the build target junit-compile-java, the attribute value memoryMaximumSize="1024m" was added, but the ant build reports: [javac] Since fork is false, ignoring memoryMaximumSize setting.

Simon
Comment 30 Vincent Hennebert 2011-03-22 10:15:54 UTC
Glenn,

I had a brief look at your branch and have the following questions and comments:

You implemented the BIDI algorithm in a separate BidiUtil class. Why didn't you integrate it into the layout engine? It seems that what is done there is a layout task, therefore should be put in the layout engine classes. FO tree objects are now manipulated by both the layout engine and that class, and I'm seriously concerned about the maintenance issues that that may create. 

Also, AFAIU a whole data structure is created there that will exist in parallel with the Knuth elements created by the layout engine. In addition to the memory issues, isn't there a risk of discrepancy between the two? That may cause hard-to-find bugs.

How feasible is it to run the BIDI algorithm on a subset of the FO tree? If I'm right, ATM it is launched on a whole page-sequence. When we refactor the code so that layout can be started before a whole page-sequence has been parsed, in order to avoid the infamous memory issue that a lot of users are running into, will that code allow to do it?

I have a concern with some metrics. That BidiUtil class is more than 1700 lines. There are a few other new classes that are more than 1000 lines long, and TTFFile now is more than 5500 lines (!). What's your plan to break them down into more manageable chunks? Also, there are now 72 classes in the o.a.f.fonts package.

Also, many variables have a two- or three-letter name, which makes it difficult to understand what their purpose is, especially if a lot of them are involved at the same time. I realise there usually is a comment explaining what the variable does when it's defined, but it would be easier if the variable were carrying the comment with itself by having a longer name.

Thanks,
Vincent
Comment 31 Glenn Adams 2011-03-22 10:59:03 UTC
inline

(In reply to comment #30)
> Glenn,
> 
> I had a brief look at your branch and have the following questions and
> comments:
> 
> You implemented the BIDI algorithm in a separate BidiUtil class. Why didn't you
> integrate it into the layout engine? It seems that what is done there is a
> layout task, therefore should be put in the layout engine classes. FO tree
> objects are now manipulated by both the layout engine and that class, and I'm
> seriously concerned about the maintenance issues that that may create. 

BidiUtil is defined in o.a.f.layoutmgr, therefore it is "integrated into the layout engine". I considered placing it in o.a.f.util or in o.a.f.text.bidi; however, since it is only referenced by o.a.f.layoutmgr code, I decided that was the best home for it at this time.

However, having said that, it is not performing layout and knows nothing about areas or geometries whatsoever. It's functionality is invoked in two places:

(1) in PageSequenceLayoutManager.activateLayout(), in order to resolve bidi levels of each delimited text range, and
(2) in LineLayoutManager.add{Inline,Block}Area, after completing line area construction;

See XSL-FO Section 5.8 for more information. I'm open to any concrete suggestions about a better point of integration, but I don't see any at present that is consistent with 5.8.
 
> Also, AFAIU a whole data structure is created there that will exist in parallel
> with the Knuth elements created by the layout engine. In addition to the memory
> issues, isn't there a risk of discrepancy between the two? That may cause
> hard-to-find bugs.

There is no relationship between the bidi levels created by BidiUtil and Knuth elements used for line breaking.

> 
> How feasible is it to run the BIDI algorithm on a subset of the FO tree? If I'm
> right, ATM it is launched on a whole page-sequence. When we refactor the code
> so that layout can be started before a whole page-sequence has been parsed, in
> order to avoid the infamous memory issue that a lot of users are running into,
> will that code allow to do it?

I'm not sure what you mean by "ATM". The semantics of XSL-FO 5.8 as embodied by the present implementation will have to be taken into account by such a refactoring, as I'm sure will many other aspects of the current FOP implementation.
 
> I have a concern with some metrics. That BidiUtil class is more than 1700
> lines. There are a few other new classes that are more than 1000 lines long,
> and TTFFile now is more than 5500 lines (!). What's your plan to break them
> down into more manageable chunks? Also, there are now 72 classes in the
> o.a.f.fonts package.

I have no such plan in the near to medium term. There is no technical reason to refactor TTFFile, though I agree it is on the long side, and could probably use a general refactoring in the long term. 

> Also, many variables have a two- or three-letter name, which makes it difficult
> to understand what their purpose is, especially if a lot of them are involved
> at the same time. I realise there usually is a comment explaining what the
> variable does when it's defined, but it would be easier if the variable were
> carrying the comment with itself by having a longer name.

Symbol length is a trade off between readability and other factors, such as line length and typing time. I prefer short symbols. Perhaps it is my training in mathematics, electrical engineering, and having started programming with APL that leads me in that direction. The meaning of short symbols should be readily inferable given sufficient knowledge of the problem domain.

Best,
Glenn
Comment 32 Andreas L. Delmelle 2011-06-10 19:11:54 UTC
I just tried checking it out, and got two types of compile error when trying to build:

    [javac] ... org/apache/fop/layoutmgr/BidiUtil.java:42: cannot find symbol
    [javac] symbol  : class Viewport
    [javac] location: package org.apache.fop.area.inline
    [javac] import org.apache.fop.area.inline.Viewport;
    [javac]                                  ^

    [javac] ... org/apache/fop/layoutmgr/BidiUtil.java:107: reference to collectRuns is ambiguous, both method collectRuns(java.util.List,java.util.List) in org.apache.fop.layoutmgr.BidiUtil and method collectRuns(Viewport,java.util.List) in org.apache.fop.layoutmgr.BidiUtil match
    [javac]         List runs = collectRuns ( la.getInlineAreas(), new Vector() );
    [javac]                     ^
    ...
    [javac] 13 errors

Replacing all references with InlineViewport fixes the problem (committed in rev 1134414)

First look: hope the CS suppressions aren't there to stay... ;-P

(more detailed review to follow)
Comment 33 Glenn Adams 2011-06-10 20:44:11 UTC
I believe this is a side-effect of Simon's incremental merges from trunk into the Temp_ComplexScripts branch. I'm doing my work in my dev repo at git://github.com/skynavga/fop.git, from which I irregularly post a new patch for merger into Temp_ComplexScripts. I've got a couple of bugs I'm working on, after which I expect to submit a new patch before the end of June.

Regarding CS suppressions, I expect they will remain in the source I authored since I use somewhat different style conventions than were used by the pre-existing code. For example, I use:

* longer lines
* a few methods with greater than 7 parameters
* slightly different spacing conventions

When I've modified pre-existing files, then I kept to the existing checkstyle conventions; however, with some new files, I have varied a bit. I don't have any plans to change those new files at this time. This does not affect functionality.

G.

(In reply to comment #32)
> I just tried checking it out, and got two types of compile error when trying to
> build:
> 
>     [javac] ... org/apache/fop/layoutmgr/BidiUtil.java:42: cannot find symbol
>     [javac] symbol  : class Viewport
>     [javac] location: package org.apache.fop.area.inline
>     [javac] import org.apache.fop.area.inline.Viewport;
>     [javac]                                  ^
> 
>     [javac] ... org/apache/fop/layoutmgr/BidiUtil.java:107: reference to
> collectRuns is ambiguous, both method
> collectRuns(java.util.List,java.util.List) in org.apache.fop.layoutmgr.BidiUtil
> and method collectRuns(Viewport,java.util.List) in
> org.apache.fop.layoutmgr.BidiUtil match
>     [javac]         List runs = collectRuns ( la.getInlineAreas(), new Vector()
> );
>     [javac]                     ^
>     ...
>     [javac] 13 errors
> 
> Replacing all references with InlineViewport fixes the problem (committed in
> rev 1134414)
> 
> First look: hope the CS suppressions aren't there to stay... ;-P
> 
> (more detailed review to follow)
Comment 34 Andreas L. Delmelle 2011-06-12 11:54:08 UTC
(In reply to comment #31)
> (In reply to comment #30)
> > I had a brief look at your branch and have the following questions and
> > comments:
> > 
> > You implemented the BIDI algorithm in a separate BidiUtil class. Why didn't you
> > integrate it into the layout engine? It seems that what is done there is a
> > layout task, therefore should be put in the layout engine classes. FO tree
> > objects are now manipulated by both the layout engine and that class, and I'm
> > seriously concerned about the maintenance issues that that may create. 
> 
> BidiUtil is defined in o.a.f.layoutmgr, therefore it is "integrated into the
> layout engine". I considered placing it in o.a.f.util or in o.a.f.text.bidi;
> however, since it is only referenced by o.a.f.layoutmgr code, I decided that
> was the best home for it at this time.

Just to weigh in here: 
On the one hand, I believe that the layout *managers* should actually remain quite BIDI-agnostic. Ultimately, to a layout manager, it should only matter how wide/heigh a given text-fragment will be and how its own child areas should be stacked. In what direction text should be rendered, is hardly its concern. The TextLM, maybe... but that is already quite large.

That said, Vincent's point is a valid one. The observation certainly does not mean that more LMs shouldn't *use* the BIDI functionality. Almost on the contrary...

Thinking out loud, maybe a singleton that is made available in the top level LM would be more appropriate. That would allow potential optimizations, in that BIDI is then not applied to a page-sequence as a whole, but gradually, as layout progresses.

> However, having said that, it is not performing layout and knows nothing about
> areas or geometries whatsoever. It's functionality is invoked in two places:
> 
> (1) in PageSequenceLayoutManager.activateLayout(), in order to resolve bidi
> levels of each delimited text range, and
> (2) in LineLayoutManager.add{Inline,Block}Area, after completing line area
> construction;
> 
> See XSL-FO Section 5.8 for more information. I'm open to any concrete
> suggestions about a better point of integration, but I don't see any at present
> that is consistent with 5.8.

I am even thinking that the first part (resolving embedding levels) can be handled entirely in the FO tree. Places like finalizeNode() come to mind. That way, RTF output (or more generally:  formats not using the AreaTreeHandler and the layout engine) would also be able to benefit from it.
Also, I notice 'blind' traversal of the ancestry of a node. 'Blind' means that I am wondering what happens with BIDI processing in retrieved fo:markers. The spec is not entirely clear about it, but given that writing-mode is involved, I would expect the behavior to be similar to property-resolution, where the base context comes from the parent of the fo:retrieve-marker, not the actual parent node in the source document.
Integrating the step in the FO tree would make this relatively straightforward to solve. The first step would be skipped during initial parsing, and instead executed as part of the resolveRetrieveMarker() logic that is invoked during layout of static content.

<snip />

> > How feasible is it to run the BIDI algorithm on a subset of the FO tree? If I'm
> > right, ATM it is launched on a whole page-sequence. When we refactor the code
> > so that layout can be started before a whole page-sequence has been parsed, in
> > order to avoid the infamous memory issue that a lot of users are running into,
> > will that code allow to do it?
> 
> I'm not sure what you mean by "ATM". The semantics of XSL-FO 5.8 as embodied by
> the present implementation will have to be taken into account by such a
> refactoring, as I'm sure will many other aspects of the current FOP
> implementation.

XSL-FO doesn't go so far as specifying that this all has to be applied to the page-sequence as a whole. That would be an implementation detail. As long as the result is compliant, it shouldn't matter whether, architecturally, we process complete page-sequences or separate blocks. I still firmly believe there is a possibility to switch from endPageSequence() to endBlock(). Only blocks that are direct children of a fo:flow should be used as logical boundaries to trigger layout (and, if possible, rendering) of all content up to that point. The proposed approach should work too, preferably without too much refactoring, as it would be the idea of purging all finished blocks.
Comment 35 Glenn Adams 2011-06-12 13:28:35 UTC
(In reply to comment #34)
> (In reply to comment #31)
> > (In reply to comment #30)
> > > I had a brief look at your branch and have the following questions and
> > > comments:
> > > 
> > > You implemented the BIDI algorithm in a separate BidiUtil class. Why didn't you
> > > integrate it into the layout engine? It seems that what is done there is a
> > > layout task, therefore should be put in the layout engine classes. FO tree
> > > objects are now manipulated by both the layout engine and that class, and I'm
> > > seriously concerned about the maintenance issues that that may create. 
> > 
> > BidiUtil is defined in o.a.f.layoutmgr, therefore it is "integrated into the
> > layout engine". I considered placing it in o.a.f.util or in o.a.f.text.bidi;
> > however, since it is only referenced by o.a.f.layoutmgr code, I decided that
> > was the best home for it at this time.
> 
> Just to weigh in here: 
> On the one hand, I believe that the layout *managers* should actually remain
> quite BIDI-agnostic. Ultimately, to a layout manager, it should only matter how
> wide/heigh a given text-fragment will be and how its own child areas should be
> stacked. In what direction text should be rendered, is hardly its concern. The
> TextLM, maybe... but that is already quite large.

One point where LMs are exposed to bidi is that the rendering system assumes a left to right rendering direction of areas irrespective of bidi, while start/end edges vary according to bidi context.

For example, the start/end indent traits on generated areas varies with bidi levels. Also, certain progression directions, such as table column progression, list label versus list body, region start versus region end, etc., all vary by bidi level. In some cases this variation must be accounted for in an LM.

> That said, Vincent's point is a valid one. The observation certainly does not
> mean that more LMs shouldn't *use* the BIDI functionality. Almost on the
> contrary...
> 
> Thinking out loud, maybe a singleton that is made available in the top level LM
> would be more appropriate. That would allow potential optimizations, in that
> BIDI is then not applied to a page-sequence as a whole, but gradually, as
> layout progresses.
> 
> > However, having said that, it is not performing layout and knows nothing about
> > areas or geometries whatsoever. It's functionality is invoked in two places:
> > 
> > (1) in PageSequenceLayoutManager.activateLayout(), in order to resolve bidi
> > levels of each delimited text range, and
> > (2) in LineLayoutManager.add{Inline,Block}Area, after completing line area
> > construction;
> > 
> > See XSL-FO Section 5.8 for more information. I'm open to any concrete
> > suggestions about a better point of integration, but I don't see any at present
> > that is consistent with 5.8.
> 
> I am even thinking that the first part (resolving embedding levels) can be
> handled entirely in the FO tree. Places like finalizeNode() come to mind. That
> way, RTF output (or more generally:  formats not using the AreaTreeHandler and
> the layout engine) would also be able to benefit from it.
> Also, I notice 'blind' traversal of the ancestry of a node. 'Blind' means that
> I am wondering what happens with BIDI processing in retrieved fo:markers. The
> spec is not entirely clear about it, but given that writing-mode is involved, I
> would expect the behavior to be similar to property-resolution, where the base
> context comes from the parent of the fo:retrieve-marker, not the actual parent
> node in the source document.
> Integrating the step in the FO tree would make this relatively straightforward
> to solve. The first step would be skipped during initial parsing, and instead
> executed as part of the resolveRetrieveMarker() logic that is invoked during
> layout of static content.
> 
> <snip />
> 
> > > How feasible is it to run the BIDI algorithm on a subset of the FO tree? If I'm
> > > right, ATM it is launched on a whole page-sequence. When we refactor the code
> > > so that layout can be started before a whole page-sequence has been parsed, in
> > > order to avoid the infamous memory issue that a lot of users are running into,
> > > will that code allow to do it?
> > 
> > I'm not sure what you mean by "ATM". The semantics of XSL-FO 5.8 as embodied by
> > the present implementation will have to be taken into account by such a
> > refactoring, as I'm sure will many other aspects of the current FOP
> > implementation.
> 
> XSL-FO doesn't go so far as specifying that this all has to be applied to the
> page-sequence as a whole. That would be an implementation detail. As long as
> the result is compliant, it shouldn't matter whether, architecturally, we
> process complete page-sequences or separate blocks. I still firmly believe
> there is a possibility to switch from endPageSequence() to endBlock(). Only
> blocks that are direct children of a fo:flow should be used as logical
> boundaries to trigger layout (and, if possible, rendering) of all content up to
> that point. The proposed approach should work too, preferably without too much
> refactoring, as it would be the idea of purging all finished blocks.

It may be readily feasible to change the bidi level resolution process from end of page sequence to end of block. I'll give that a try to see what it breaks, and if it doesn't break anything that can't be readily fixed, i'll move the point of resolution invocation to end block processing.

G.
Comment 36 Glenn Adams 2011-06-15 16:32:07 UTC
Created attachment 27161 [details]
Patch to revision 1136002 of branch Temp_ComplexScripts.

This patch is to be applied to revision 1136002 of branch Temp_ComplexScripts
as follows:

cd ${FOP}/branches/Temp_ComplexScripts
svn update -r 1136002 --force
svn revert -R .
svn status # check and remove any unexpected changes prior to patching
gzcat ${DOWNLOADS}/fop-i18n.arabic-patch-4.diff.gz | patch -p0
svn add test/java/org/apache/fop/complexscripts/arabic
svn add test/java/org/apache/fop/complexscripts/scripts
svn add test/java/org/apache/fop/complexscripts/fonts
svn add test/java/org/apache/fop/complexscripts/util
svn add test/layoutengine/standard-testcases/page-number-citation_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/inline_background-color_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/table-column_column-progression-lr.xml
svn add test/layoutengine/standard-testcases/basic-link_internal-desination-next-page.xml
svn add test/layoutengine/standard-testcases/basic-link_height_multi-child_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/instream-foreign-object_display-align_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/basic-link_for_toc_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/basic-link_height_multi-line_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/basic-link_height_baseline-shift_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/basic-link_internal-desination-same-page-after_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/table-column_column-progression-rl.xml
svn add test/layoutengine/standard-testcases/basic-link_height_inline-child_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/basic-link_height_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/instream-foreign-object_basic_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/character_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/basic-link_external-destination_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/instream-foreign-object_border_padding_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/list-block_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/basic-link_internal-desination-same-page-after.xml
svn add test/layoutengine/standard-testcases/basic-link_internal-desination-next-page_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/basic-link_internal-desination-same-page-before.xml
svn add test/layoutengine/standard-testcases/basic-link_external-destination_writing-mode_rl_2.xml
svn add test/layoutengine/standard-testcases/block_text-align_4.xml
svn add test/layoutengine/standard-testcases/basic-link_background-image_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/basic-link_internal-desination-same-page-before_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/block_text-align_5.xml
svn add test/resources/complexscripts
svn add src/java/org/apache/fop/fonts/IndicScriptProcessor.java
svn add src/java/org/apache/fop/fonts/DevanagariScriptProcessor.java
svn rm --force src/codegen/complexscripts
svn commit ...

This patch includes the following:

* bug fixes
* new layout engine tests for right-to-left writing mode
* new generic indic script processing support
* new devanagari script processing support

See milestone "Patch 4" at http://skynav.trac.cvsdude.com/fop/report/6 for further details.

Regards,
Glenn
Comment 37 Simon Pepping 2011-06-15 18:55:43 UTC
Committed in revisions 1136144 and 1136146. In rev. 1136144 I forgot the additions and deletions, so that the code of that revision does not compile.
Comment 38 Simon Pepping 2011-06-22 18:31:13 UTC
The code contains a number of Unicode properties, e.g. in util.CharUtilities and DevanagariScriptProcessor. It would be better if all Unicode properties would be collected in one class. Perhaps it is even possible to generate that class from the Unicode Database files. You already do that with class BidiClassUtils. This would make it much easier to update FOP to newer versions of Unicode.

Some of the info in util.CharUtilities is also provided by Character.UnicodeBlock.of(int codePoint). Why do you not use Java's Unicode support or ICU?
Comment 39 Glenn Adams 2011-06-22 19:29:51 UTC
(In reply to comment #38)
> The code contains a number of Unicode properties, e.g. in util.CharUtilities
> and DevanagariScriptProcessor. It would be better if all Unicode properties
> would be collected in one class. Perhaps it is even possible to generate that
> class from the Unicode Database files. You already do that with class
> BidiClassUtils. This would make it much easier to update FOP to newer versions
> of Unicode.

The Unicode Database files do not have the specialized properties needed to perform script the script processing found in DevanagariScriptProcessor, etc. Furthermore, there are context sensitive properties that are not simple table lookup.

> Some of the info in util.CharUtilities is also provided by
> Character.UnicodeBlock.of(int codePoint). Why do you not use Java's Unicode
> support or ICU?

Neither Java's Unicode support or ICU are sufficient to provide the necessary logic, e.g., determination of script membership for the purpose of performing complex script rendering. To give an example, script membership is not the same as unicode block membership.

G.
Comment 40 Glenn Adams 2011-10-15 09:58:00 UTC
Created attachment 27784 [details]
Patch to revision 1183594 of branch Temp_ComplexScripts
Comment 41 Glenn Adams 2011-10-15 10:00:16 UTC
This patch is to be applied to revision 1183594 of branch Temp_ComplexScripts
as follows:

cd ${FOP}/branches/Temp_ComplexScripts
svn update -r 1183594 --force
svn revert -R .
svn status # check and remove any unexpected changes prior to patching
gzcat ${DOWNLOADS}/fop-i18n.arabic-patch-5.diff.gz | patch -p0
ant resgen-complexscripts clean
svn add test/java/org/apache/fop/complexscripts/gsub
svn add test/java/org/apache/fop/complexscripts/gdef
svn add test/java/org/apache/fop/complexscripts/gpos
svn add test/java/org/apache/fop/complexscripts/arabic/GenerateArabicTestData.java
svn add test/java/org/apache/fop/complexscripts/arabic/ArabicTestCase.java
svn add test/java/org/apache/fop/complexscripts/arabic/ArabicTestConstants.java
svn add test/java/org/apache/fop/layoutengine/LayoutEngineTestUtils.java
svn add test/java/org/apache/fop/util/NumberConverterTestCase.java
svn add test/layoutengine/standard-testcases/region-body_column-count_1_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/page-sequence-force-page-count-odd.xml
svn add test/layoutengine/standard-testcases/region-body_column-count_2_writing-mode_rl.xml
svn add test/resources/complexscripts/arab/data
svn propdel svn:mime-type test/resources/complexscripts/arab/data/arab-001.txt
svn add src/java/org/apache/fop/util/NumberConverter.java
svn rm --force test/java/org/apache/fop/complexscripts/arabic/ArabicScriptTestSuite.java
svn commit ...

This patch includes the following:

* bug fixes
* new layout engine tests for right-to-left writing mode
* new test utility for using TTX files in testing advanced typographic tables
* new test cases for GDEF/GSUB/GPOS advanced typographic tables
* new test cases for ~85000 arabic word forms against four fonts
* new implementation of number formatter for number to string conversion, which
  adds support for arabic, hebrew, thai, and kana scripts

See milestone "Patch 5" at http://skynav.trac.cvsdude.com/fop/report/6 for
further details.

Regards,
Glenn
Comment 42 Simon Pepping 2011-10-18 08:29:31 UTC
Applied. I had problems with test/resources/complexscripts/arab/data/. Please, check the result.
Comment 43 Dilip Shah 2011-10-20 03:55:40 UTC
Created attachment 27822 [details]
list of Gujarati words and sentences

As per my exchange with Glenn, I've attached a UTF-8 encoded file that contains Gujarati words and sentences. Actually, it's an output of a multiple choice quiz from my application. If the data is expected in some other form, let me know. Also, let me know if you need more of such data.
Comment 44 Glenn Adams 2011-10-20 13:03:06 UTC
(In reply to comment #43)
> Created attachment 27822 [details]
> list of Gujarati words and sentences
> 
> As per my exchange with Glenn, I've attached a UTF-8 encoded file that contains
> Gujarati words and sentences. Actually, it's an output of a multiple choice
> quiz from my application. If the data is expected in some other form, let me
> know. Also, let me know if you need more of such data.

Thanks. I'll let you know when I've got the Gujarati support working and have tried out these word forms. By the way, for Arabic script, i have approximately 85,000 word forms which represents a significant cross section of a number of corpuses. I would hope to have similar number of word forms for other scripts.

I prefer word forms only for Indic scripts rather than phrases or sentences, since the latter do not typically use any whitespace between words. So I might ask you to manually segment your data into word forms only.

G.
Comment 45 Glenn Adams 2011-11-07 03:46:58 UTC
Created attachment 27906 [details]
Patch to revision 1198396 of trunk.
Comment 46 Glenn Adams 2011-11-07 03:56:53 UTC
Re: Created attachment 27906 [details]
Patch to revision 1198396 of trunk.

This patch is to be applied to revision 1198396 of trunk
as follows:

cd ${FOP}/trunk
svn update -r 1198396 --force
svn revert -R .
svn status # check and remove any unexpected changes prior to patching
gzcat ${DOWNLOADS}/fop-complexscripts-trunk-patch-1.diff.gz | patch -p1
ant clean
ant codegen-unicode-bidi
ant resgen-complexscripts
ant clean
svn add src/codegen/unicode/java/org/apache/fop/text/bidi
svn add src/java/org/apache/fop/fonts/AdvancedTypographicTableFormatException.java
svn add src/java/org/apache/fop/fonts/ArabicScriptProcessor.java
svn add src/java/org/apache/fop/fonts/DefaultScriptProcessor.java
svn add src/java/org/apache/fop/fonts/DevanagariScriptProcessor.java
svn add src/java/org/apache/fop/fonts/DiscontinuousAssociationException.java
svn add src/java/org/apache/fop/fonts/GlyphClassMapping.java
svn add src/java/org/apache/fop/fonts/GlyphClassTable.java
svn add src/java/org/apache/fop/fonts/GlyphContextTester.java
svn add src/java/org/apache/fop/fonts/GlyphCoverageMapping.java
svn add src/java/org/apache/fop/fonts/GlyphCoverageTable.java
svn add src/java/org/apache/fop/fonts/GlyphDefinition.java
svn add src/java/org/apache/fop/fonts/GlyphDefinitionSubtable.java
svn add src/java/org/apache/fop/fonts/GlyphDefinitionTable.java
svn add src/java/org/apache/fop/fonts/GlyphMappingTable.java
svn add src/java/org/apache/fop/fonts/GlyphPositioning.java
svn add src/java/org/apache/fop/fonts/GlyphPositioningState.java
svn add src/java/org/apache/fop/fonts/GlyphPositioningSubtable.java
svn add src/java/org/apache/fop/fonts/GlyphPositioningTable.java
svn add src/java/org/apache/fop/fonts/GlyphProcessingState.java
svn add src/java/org/apache/fop/fonts/GlyphSequence.java
svn add src/java/org/apache/fop/fonts/GlyphSubstitution.java
svn add src/java/org/apache/fop/fonts/GlyphSubstitutionState.java
svn add src/java/org/apache/fop/fonts/GlyphSubstitutionSubtable.java
svn add src/java/org/apache/fop/fonts/GlyphSubstitutionTable.java
svn add src/java/org/apache/fop/fonts/GlyphSubtable.java
svn add src/java/org/apache/fop/fonts/GlyphTable.java
svn add src/java/org/apache/fop/fonts/GlyphTester.java
svn add src/java/org/apache/fop/fonts/IncompatibleSubtableException.java
svn add src/java/org/apache/fop/fonts/IndicScriptProcessor.java
svn add src/java/org/apache/fop/fonts/Positionable.java
svn add src/java/org/apache/fop/fonts/ScriptContextTester.java
svn add src/java/org/apache/fop/fonts/ScriptProcessor.java
svn add src/java/org/apache/fop/fonts/Substitutable.java
svn add src/java/org/apache/fop/layoutmgr/BidiUtil.java
svn add src/java/org/apache/fop/text/bidi
svn add src/java/org/apache/fop/traits/Direction.java
svn add src/java/org/apache/fop/traits/WritingMode.java
svn add src/java/org/apache/fop/traits/WritingModeTraits.java
svn add src/java/org/apache/fop/traits/WritingModeTraitsGetter.java
svn add src/java/org/apache/fop/traits/WritingModeTraitsSetter.java
svn add src/java/org/apache/fop/util/BidiConstants.java
svn add src/java/org/apache/fop/util/NumberConverter.java
svn add test/java/org/apache/fop/complexscripts
svn add test/java/org/apache/fop/util/NumberConverterTestCase.java
svn add test/layoutengine/standard-testcases/basic-link_background-image_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/basic-link_external-destination_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/basic-link_external-destination_writing-mode_rl_2.xml
svn add test/layoutengine/standard-testcases/basic-link_for_toc_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/basic-link_height_baseline-shift_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/basic-link_height_inline-child_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/basic-link_height_multi-child_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/basic-link_height_multi-line_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/basic-link_height_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/basic-link_internal-desination-next-page.xml
svn add test/layoutengine/standard-testcases/basic-link_internal-desination-next-page_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/basic-link_internal-desination-same-page-after.xml
svn add test/layoutengine/standard-testcases/basic-link_internal-desination-same-page-after_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/basic-link_internal-desination-same-page-before.xml
svn add test/layoutengine/standard-testcases/basic-link_internal-desination-same-page-before_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/block_text-align_4.xml
svn add test/layoutengine/standard-testcases/block_text-align_5.xml
svn add test/layoutengine/standard-testcases/character_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/inline_background-color_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/instream-foreign-object_basic_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/instream-foreign-object_border_padding_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/instream-foreign-object_display-align_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/list-block_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/page-number-citation_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/region-body_column-count_1_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/region-body_column-count_2_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/table-column_column-progression-lr.xml
svn add test/layoutengine/standard-testcases/table-column_column-progression-rl.xml
svn add test/resources/complexscripts
svn commit ...

This patch includes the following:

* merge from Temp_ComplexScripts branch as updated to current trunk changes
* bug fix for bidi content inside fo:static-content

Regards,
Glenn
Comment 47 Glenn Adams 2011-11-13 22:57:35 UTC
Comment on attachment 27906 [details]
Patch to revision 1198396 of trunk.

This patch should no longer be used. Most all complex script code has been extensively reorganized into a distinct package, org.apache.fop.complexscripts, now available at http://github.com/skynavga/fop (see the i18n branch)

It is uncertain as to if or when this work will be merged into the FOP for future release by Apache. In the mean time, I expect to continue development of new features and bug fixes of complex scripts and other I18N related features in the repository listed above.

I will also be regularly merging from the Apache FOP trunk into this repository in order to keep it closely aligned to the main Apache FOP development branch.
Comment 48 Glenn Adams 2012-01-27 20:27:02 UTC
Created attachment 28220 [details]
Patch to revision 1236835 of trunk.

This patch is to be applied to revision 1236835 of trunk as follows:

cd ${FOP}/trunk
svn status # check and remove any unexpected changes prior to patchingsvn update -r 1236835 --force
svn revert -R .
svn status # check and remove any unexpected changes prior to patching
gzcat ${DOWNLOADS}/fop-complexscripts-patch-2.diff.gz | patch -p1
ant clean
ant codegen-unicode-bidi
ant resgen-complexscripts
ant clean
svn add src/codegen/unicode/java/org/apache/fop/complexscripts
svn add src/java/org/apache/fop/complexscripts
svn add src/java/org/apache/fop/traits/Direction.java
svn add src/java/org/apache/fop/traits/WritingMode.java
svn add src/java/org/apache/fop/traits/WritingModeTraits.java
svn add src/java/org/apache/fop/traits/WritingModeTraitsGetter.java
svn add src/java/org/apache/fop/traits/WritingModeTraitsSetter.java
svn add test/java/org/apache/fop/complexscripts
svn add test/layoutengine/standard-testcases/basic-link_background-image_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/basic-link_external-destination_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/basic-link_external-destination_writing-mode_rl_2.xml
svn add test/layoutengine/standard-testcases/basic-link_for_toc_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/basic-link_height_baseline-shift_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/basic-link_height_inline-child_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/basic-link_height_multi-child_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/basic-link_height_multi-line_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/basic-link_height_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/basic-link_internal-desination-next-page.xml
svn add test/layoutengine/standard-testcases/basic-link_internal-desination-next-page_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/basic-link_internal-desination-same-page-after.xml
svn add test/layoutengine/standard-testcases/basic-link_internal-desination-same-page-after_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/basic-link_internal-desination-same-page-before.xml
svn add test/layoutengine/standard-testcases/basic-link_internal-desination-same-page-before_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/block-container_start-indent_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/block_text-align_4.xml
svn add test/layoutengine/standard-testcases/block_text-align_5.xml
svn add test/layoutengine/standard-testcases/character_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/inline_background-color_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/instream-foreign-object_basic_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/instream-foreign-object_border_padding_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/instream-foreign-object_display-align_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/leader-alignment_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/leader_leader-pattern_dots_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/leader_leader-pattern_rule_writing-system_rl.xml
svn add test/layoutengine/standard-testcases/leader_leader-pattern_space_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/leader_leader-pattern_use-content_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/list-block_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/page-number-citation_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/region-body_column-count_1_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/region-body_column-count_2_writing-mode_rl.xml
svn add test/layoutengine/standard-testcases/table-column_column-progression-lr.xml
svn add test/layoutengine/standard-testcases/table-column_column-progression-rl.xml
svn add test/resources/complexscripts
svn propdel svn:mime-type test/resources/complexscripts/arab/data/arab-001.txt
svn commit ...

This patch replaces (obsoletes) all previous patches on this bug and includes the following:

Bug Fixes

* fo:block-container start-indent in bidi context (http://skynav.trac.cvsdude.com/fop/ticket/64)
* fo:static-content fails to process bidirectional levels (http://skynav.trac.cvsdude.com/fop/ticket/70)
* ignores standard kerning when font has GSUB but not GPOS (http://skynav.trac.cvsdude.com/fop/ticket/71)
* fo:leader not fully supported in bidi context (http://skynav.trac.cvsdude.com/fop/ticket/74)

New Features

* preliminary support for gujarati script (http://skynav.trac.cvsdude.com/fop/ticket/48)
* preliminary support for gurmukhi script (http://skynav.trac.cvsdude.com/fop/ticket/49)

Refactoring and Cleanup

* move new complex script related code into separate package hierarchy at o.a.f.complexscripts
  - the only exception is the retention of the four classes o.a.f.traits.WritingMode*.java
* refactor large complex script related classes into smaller classes, e.g., BidiUtil.java has been divided into 6 classes
* migrate advanced typography table parsing from TTFFile to o.a.f.complexscripts.fonts.OTFAdvancedTypographicTableReader
* migrate FO node specific functionality into FO class hierarchy, e.g., see FONode.collectDelimitedTextRanges
* migrate area tree node specific functionality into area class hierarchy, e.g., see InlineArea.collectInlineRuns

Regards,
Glenn
Comment 49 Glenn Adams 2012-02-27 18:13:50 UTC
Added complex script support (bidi, shaping, etc) at revision 1293736.