Summary: | Extract Tables from word document | ||
---|---|---|---|
Product: | POI | Reporter: | ahmed <ayah683> |
Component: | HWPF | Assignee: | POI Developers List <dev> |
Status: | RESOLVED WORKSFORME | ||
Severity: | normal | ||
Priority: | P2 | ||
Version: | unspecified | ||
Target Milestone: | --- | ||
Hardware: | PC | ||
OS: | Windows XP | ||
Attachments: | word document file |
Ahmed, the first table is placed inside a textbox, not in the "main" text. If you need its content, you have to navigate into the textbox part of the document and extract the data from there. The second and last tables are extracted correctly. Sergey
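A minimal sketch of what navigating into the textbox part could look like, assuming the POI version in use exposes `HWPFDocument.getMainTextboxRange()` and `TableIterator`. The class name `TextboxTables` and the helpers `cleanCellText` and `printTables` are hypothetical names chosen for illustration. It walks the tables of the main text and of the textbox range separately:

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.usermodel.Range;
import org.apache.poi.hwpf.usermodel.Table;
import org.apache.poi.hwpf.usermodel.TableCell;
import org.apache.poi.hwpf.usermodel.TableIterator;
import org.apache.poi.hwpf.usermodel.TableRow;

public class TextboxTables {

    // Hypothetical helper: drops trailing control characters (code points
    // below U+0020), such as the cell mark HWPF leaves at the end of cell text.
    static String cleanCellText(String raw) {
        int end = raw.length();
        while (end > 0 && raw.charAt(end - 1) < ' ') {
            end--;
        }
        return raw.substring(0, end);
    }

    // Hypothetical helper: prints every table found in the given range,
    // one line per row, with cells separated by tabs.
    static void printTables(String label, Range range) {
        TableIterator tables = new TableIterator(range);
        int tableIdx = 0;
        while (tables.hasNext()) {
            Table table = tables.next();
            System.out.println(label + ", table " + tableIdx++);
            for (int rowIdx = 0; rowIdx < table.numRows(); rowIdx++) {
                TableRow row = table.getRow(rowIdx);
                StringBuilder rowText = new StringBuilder();
                for (int colIdx = 0; colIdx < row.numCells(); colIdx++) {
                    TableCell cell = row.getCell(colIdx);
                    if (colIdx > 0) {
                        rowText.append('\t');
                    }
                    rowText.append(cleanCellText(cell.text()));
                }
                System.out.println("  row " + rowIdx + ": " + rowText);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Path to the .doc file is taken from the command line.
        try (InputStream in = new FileInputStream(args[0])) {
            HWPFDocument doc = new HWPFDocument(in);
            // Tables that are part of the "main" text.
            printTables("main text", doc.getRange());
            // Tables placed inside textboxes live in a separate document part.
            printTables("textbox", doc.getMainTextboxRange());
        }
    }
}
```

Run it with the path to the attached document as the only argument. Whether the first table shows up under the textbox range depends on how the document stores it, so treat this as a starting point rather than a confirmed fix.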
Created attachment 28793
word document file

I used POI 3.8 to extract tables from a Word document, but I can't get all the tables in the document. I wrote this code to do it:

    public static void main(String[] args) {
        String fileName = "C:\\fjn3312r.doc";
        try {
            InputStream fis = new FileInputStream(fileName);
            POIFSFileSystem fs = new POIFSFileSystem(fis);
            HWPFDocument doc = new HWPFDocument(fs);
            Range range = doc.getRange();
            int tblNameIdx = 0;
            for (int i = 0; i < range.numParagraphs(); i++) {
                Paragraph tablePar = range.getParagraph(i);
                String parText = tablePar.text();
                try {
                    // Skip paragraphs that contain only whitespace.
                    Pattern pattern = Pattern.compile("[\\s]*", Pattern.CASE_INSENSITIVE);
                    Matcher matcher = pattern.matcher(parText);
                    if (matcher.matches()) {
                        continue;
                    }
                } catch (Exception e) {
                    e.printStackTrace();
                }
                // The condition is missing from the code as posted; isInTable()
                // is assumed here because of the else branch below.
                if (tablePar.isInTable()) {
                    // The last paragraph seen before the table is treated as its name.
                    Paragraph tableName = range.getParagraph(tblNameIdx);
                    System.out.println("Table name=====>>" + tableName.text());
                    Table table = range.getTable(tablePar);
                    for (int rowIdx = 0; rowIdx < table.numRows(); rowIdx++) {
                        TableRow row = table.getRow(rowIdx);
                        BorderCode bc = row.getVerticalBorder();
                        i = i + 1; // advance past the row-end paragraph
                        row.text();
                        String rowText = "";
                        for (int colIdx = 0; colIdx < row.numCells(); colIdx++) {
                            TableCell cell = row.getCell(colIdx);
                            rowText = rowText + "\t" + cell.getParagraph(0).text();
                            i = i + 1; // advance past this cell's paragraph
                        }
                        System.out.println("Row----" + rowIdx + " ===>>" + rowText);
                    }
                    i = i - 1; // the for loop increments i again
                } else {
                    tblNameIdx = i;
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
    } // end of the enclosing class (its declaration is not included in the report)