Bug 45062 - Table cells text is wrong
Summary: Table cells text is wrong
Alias: None
Product: POI
Classification: Unclassified
Component: HWPF
Version: 3.0-dev
Hardware: PC Linux
Importance: P1 critical
Target Milestone: ---
Assignee: POI Developers List
Duplicates: 45167
Depends on:
Reported: 2008-05-22 06:55 UTC by dnapoletano
Modified: 2008-09-23 04:32 UTC

Table reading test class (95.00 KB, application/msword)
2008-05-22 06:55 UTC, dnapoletano

Description dnapoletano 2008-05-22 06:55:18 UTC
Created attachment 21989
Table reading test class

The table-related text() method seems to be screwed up. Given a table cell, TableCell.text() returns not only the cell's own text but also part of the text of the neighbouring cells. Take a sample 3x3 table whose cell texts are marked "CELLij", with i=row and j=column: if the top left cell is empty, the returned texts are as follows (from Eclipse console output):


Only the last cell's text seems to be right.

The simple test class I've used is:

package org.apache.poi.hwpf;

import java.io.*;

import javax.swing.JFileChooser;
import javax.swing.JOptionPane;

import org.apache.poi.hwpf.usermodel.*;

public class QuickTest {
  public QuickTest() {
  }

  public static void main(String[] args) {
    try {
      // Let the user pick the Word document to inspect
      JFileChooser jfc = new JFileChooser();
      int esito = jfc.showOpenDialog(null);
      if (esito != JFileChooser.APPROVE_OPTION) {
        JOptionPane.showMessageDialog(null, "No file selected");
      } else {
        String percorso = jfc.getSelectedFile().getAbsolutePath();
        HWPFDocument doc = new HWPFDocument(new FileInputStream(percorso));
        Range r = doc.getRange();
        for (int i = 0; i < r.numParagraphs(); i++) {
          Paragraph p = r.getParagraph(i);
          // The isInTable() check and the dumpTab() call are missing from the
          // posted listing and are reconstructed here (assumption)
          if (p.isInTable()) {
            Table t = r.getTable(p);
            int cl = numCol(t);
            System.out.println("Found " + t.numRows() + "x" + cl + " table");
            dumpTab(t);
            // Skip the remaining paragraphs of this table
            i += t.numParagraphs() - 1;
          }
        }
      }
    } catch (Exception er) {
      er.printStackTrace();
    }
  }

  // Returns the largest cell count found in any row of the table
  private static int numCol(Table t) {
    int col = 0;
    for (int i = 0; i < t.numRows(); i++) {
      if (t.getRow(i).numCells() > col) {
        col = t.getRow(i).numCells();
      }
    }
    return col;
  }

  // Prints the text of every cell, row by row
  private static void dumpTab(Table t) {
    for (int i = 0; i < t.numRows(); i++) {
      TableRow tr = t.getRow(i);
      for (int j = 0; j < tr.numCells(); j++) {
        TableCell tc = tr.getCell(j);
        System.out.println("CELL[" + i + "][" + j + "]=" + tc.text());
      }
    }
  }
}


Sample test doc attached
Comment 1 Thomas Martin 2008-06-09 04:43:06 UTC
*** Bug 45167 has been marked as a duplicate of this bug. ***
Comment 2 Thomas Martin 2008-06-09 05:03:19 UTC
As a fix, the old constructor settings for TableCell need to be put back (end+1 -> end).

This works fine (in TableRow.java):
_cells[cellIndex] = new TableCell(start, end, this, levelNum,
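
To illustrate why a one-character difference in the end offset matters, here is a toy model. It is not HWPF code: a row is modelled as a single string in which each cell ends with the cell mark (character 7), and the hypothetical cellTexts helper reads `extra` characters past that mark. The class name, the helper and the sample row are all assumptions made up for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: this is NOT HWPF code. A table row is one string in which
// every cell ends with the cell mark (character code 7).
class CellRangeSketch {
    static final char CELL_MARK = '\7';

    // Returns the text of each cell, reading `extra` characters past its cell mark.
    static List<String> cellTexts(String rowText, int extra) {
        List<String> cells = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < rowText.length(); i++) {
            if (rowText.charAt(i) == CELL_MARK) {
                int end = i + 1; // exclusive end, just past the cell mark
                int readTo = Math.min(end + extra, rowText.length());
                cells.add(rowText.substring(start, readTo));
                start = end;
            }
        }
        return cells;
    }

    // Makes the cell mark printable so the output is easy to read.
    static String visible(String s) {
        return s.replace(CELL_MARK, '|');
    }

    public static void main(String[] args) {
        String row = "\7CELL01\7CELL02\7"; // three cells, the first one empty
        for (int extra = 0; extra <= 1; extra++) {
            System.out.println("extra=" + extra);
            for (String c : cellTexts(row, extra)) {
                System.out.println("  [" + visible(c) + "]");
            }
        }
    }
}
```

With extra = 0 each cell gets exactly its own text; with extra = 1 every cell except the last picks up the first character of its neighbour.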
Comment 3 Nick Burch 2008-06-28 11:40:23 UTC
Any chance you could upload a simple file that shows the problem? I'd like to commit a test at the same time as your fix, so we know it won't get broken again in the future. Your code looks like a good basis for a test, it just needs a file to drive it!
Comment 4 dnapoletano 2008-08-05 07:46:25 UTC
The test code can be tried with the same Word document attached to bug 44292 ([PATCH] TableCell skip its last Paragraphs), which, given to the QuickTest class I posted, gives this output:

Found 1x3 table
CELL[0][0]=One paragraph is ok
CELL[0][1]=First para is ok
Second paragraph is skipped
CELL[0][2]=One paragraph is ok

This shows that bugs 44292 and 45062 (this one) no longer appear with the trunk version of the source code: table cell text is *right* and TableCell *does not skip* its last paragraphs.

JUnit docet.
Comment 5 Nick Burch 2008-09-21 12:32:05 UTC
Just re-tested with svn trunk, and hwpf works properly
Comment 6 dnapoletano 2008-09-23 04:32:06 UTC
(In reply to comment #5)
> Just re-tested with svn trunk, and hwpf works properly

The problem seems to lie in the way HWPF handles the paragraphs and character runs of table cells. In TestProblems.java (the JUnit testcase related to this bug and bug 44292) I tried adding these lines (to the checks on the first table cell):

//A - the *already existing* tests on first table cell
assertEquals(1, cell.numParagraphs());
assertEquals("One paragraph is ok\7", cell.getParagraph(0).text());
//A end

//B - mine, added
assertEquals(1, cell.numCharacterRuns());
assertEquals("One paragraph is ok\7", cell.getCharacterRun(0).text());
//B end

//C - mine, added
assertEquals("One paragraph is ok\7", cell.text());
//C end
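
A note on the "\7" in these assertions: in Java it is an octal escape for the single character with code 7, which Word's binary format uses as the end-of-cell mark, so the expected strings are one character longer than the visible text. A minimal sketch follows; stripCellMark is a hypothetical helper written for this example, not a POI method.

```java
// Minimal sketch: the Java literal "\7" is one character with code 7,
// the mark that ends a table cell in Word's binary format.
class CellMarkSketch {
    static final char CELL_MARK = '\7';

    // Hypothetical helper (not part of POI): drops one trailing cell mark, if present.
    static String stripCellMark(String text) {
        if (!text.isEmpty() && text.charAt(text.length() - 1) == CELL_MARK) {
            return text.substring(0, text.length() - 1);
        }
        return text;
    }

    public static void main(String[] args) {
        String raw = "One paragraph is ok\7";
        System.out.println(raw.length());                       // visible text plus the mark
        System.out.println((int) raw.charAt(raw.length() - 1)); // code of the last character
        System.out.println(stripCellMark(raw));
    }
}
```

A comparison that only cares about the visible text could strip the mark first; the assertions above keep it, which also checks that the cell range ends exactly on the mark.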


1) If "end+1" is used in TableCell.java, blocks "A" and "C" pass the test but "B" fails: in this case the paragraph list and text() work correctly, but character run retrieval does not.

2) If "end" is used in TableCell.java, block "B" passes and "A" and "C" fail: in this case the character run list is retrieved correctly, but paragraph retrieval and text() are not.
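
One hypothetical way two lookups can disagree about the same end offset is sketched below. This is a toy model, not HWPF's lookup code, and it is only a guess at the kind of mismatch involved: spans are half-open [start, end) offsets, one lookup counts spans that overlap the cell range, and the other counts spans that end strictly before the range end. The class name, both helpers and the offsets are assumptions invented for the illustration.

```java
// Toy model only: NOT HWPF's lookup code. Spans are half-open [start, end) offsets.
class SpanLookupSketch {
    // Counts spans that overlap the half-open range [rangeStart, rangeEnd).
    static int countOverlapping(int[][] spans, int rangeStart, int rangeEnd) {
        int n = 0;
        for (int[] s : spans) {
            if (s[0] < rangeEnd && s[1] > rangeStart) n++;
        }
        return n;
    }

    // Counts spans that start inside the range and end strictly before rangeEnd.
    static int countStrictlyInside(int[][] spans, int rangeStart, int rangeEnd) {
        int n = 0;
        for (int[] s : spans) {
            if (s[0] >= rangeStart && s[1] < rangeEnd) n++;
        }
        return n;
    }

    public static void main(String[] args) {
        // Two neighbouring cells: the first covers offsets 0..19, the second 20..35.
        int[][] spans = { {0, 20}, {20, 36} };
        for (int end = 20; end <= 21; end++) {
            System.out.println("end=" + end
                + " overlapping=" + countOverlapping(spans, 0, end)
                + " strictlyInside=" + countStrictlyInside(spans, 0, end));
        }
    }
}
```

In this model the first cell should yield exactly one span. The overlapping lookup gets that with end = 20 and returns two spans with end = 21, while the strict lookup returns none with end = 20 and one with end = 21, so no single end value satisfies both. If something similar happens between the paragraph and character run lookups, the fix would be to make both use the same convention rather than to pick one value of end.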