Commit
Merge branch 'master' into dependabot/maven/org.locationtech.jts-jts-core-1.19.0
jazzido authored Jul 17, 2024
2 parents 3afa050 + c831cf6 commit d225366
Showing 16 changed files with 544 additions and 464 deletions.
11 changes: 8 additions & 3 deletions .github/workflows/tests-windows.yml
@@ -1,4 +1,4 @@
name: Java CI
name: Java CI (Windows)

on: [push]

@@ -7,9 +7,14 @@ jobs:
runs-on: windows-latest

steps:
- uses: actions/checkout@v2
# https://github.com/actions/checkout/issues/135#issuecomment-602171132
- name: Set git to use LF
run: |
git config --global core.autocrlf false
git config --global core.eol lf
- uses: actions/checkout@v3
- name: Set up JDK 11
uses: actions/setup-java@v2
uses: actions/setup-java@v3
with:
java-version: '11'
distribution: 'adopt'
6 changes: 3 additions & 3 deletions .github/workflows/tests.yml
@@ -1,15 +1,15 @@
name: Java CI

on: [push]
on: [push, pull_request]

jobs:
build:
runs-on: ubuntu-latest

steps:
- uses: actions/checkout@v2
- uses: actions/checkout@v3
- name: Set up JDK 11
uses: actions/setup-java@v2
uses: actions/setup-java@v3
with:
java-version: '11'
distribution: 'adopt'
43 changes: 42 additions & 1 deletion README.md
@@ -9,7 +9,7 @@ tabula-java [![Build Status](https://travis-ci.org/tabulapdf/tabula-java.svg?bra

Download a version of the tabula-java's jar, with all dependencies included, that works on Mac, Windows and Linux from our [releases page](../../releases).

## Usage Examples
## Commandline Usage Examples

`tabula-java` provides a command line application:

@@ -81,6 +81,47 @@ JVM start-up time is a lot of the cost of the `tabula` command, so if you're try
- writing your own program in any JVM language (Java, JRuby, Scala) that imports tabula-java.
- waiting for us to implement an API/server-style system (it's on the [roadmap](https://github.com/tabulapdf/tabula-api))

## API Usage Examples

A simple Java code example which extracts all rows and cells from all tables of all pages of a PDF document:

```java
// Imports needed by this snippet:
import java.io.InputStream;
import java.util.List;
import org.apache.pdfbox.pdmodel.PDDocument;
import technology.tabula.ObjectExtractor;
import technology.tabula.Page;
import technology.tabula.PageIterator;
import technology.tabula.RectangularTextContainer;
import technology.tabula.Table;
import technology.tabula.extractors.SpreadsheetExtractionAlgorithm;

InputStream in = this.getClass().getResourceAsStream("my.pdf");
try (PDDocument document = PDDocument.load(in)) {
    SpreadsheetExtractionAlgorithm sea = new SpreadsheetExtractionAlgorithm();
    PageIterator pi = new ObjectExtractor(document).extract();
    while (pi.hasNext()) {
        // iterate over the pages of the document
        Page page = pi.next();
        List<Table> tables = sea.extract(page);
        // iterate over the tables of the page
        for (Table table : tables) {
            List<List<RectangularTextContainer>> rows = table.getRows();
            // iterate over the rows of the table
            for (List<RectangularTextContainer> cells : rows) {
                // print all column-cells of the row plus linefeed
                for (RectangularTextContainer content : cells) {
                    // Note: Cell.getText() uses \r to concat text chunks
                    String text = content.getText().replace("\r", " ");
                    System.out.print(text + "|");
                }
                System.out.println();
            }
        }
    }
}
```
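
The row-printing step in the example above can be tried without a PDF at hand. The following is a minimal sketch that uses only the Java standard library: `RowFormatter` and `formatRow` are hypothetical names, not part of tabula-java, and the sample strings stand in for the values that `getText()` would return for each cell.

```java
import java.util.List;

// Hypothetical helper, not part of tabula-java: formats already-extracted cell texts.
final class RowFormatter {
    private RowFormatter() {}

    // Replaces the \r separators inside each cell text with spaces and
    // terminates every cell with '|', mirroring the example above.
    static String formatRow(List<String> cellTexts) {
        StringBuilder line = new StringBuilder();
        for (String text : cellTexts) {
            line.append(text.replace("\r", " ")).append('|');
        }
        return line.toString();
    }

    public static void main(String[] args) {
        // Sample data standing in for the text of each extracted cell.
        List<List<String>> rows = List.of(
                List.of("Name", "Unit\rprice"),
                List.of("Widget", "4.20"));
        for (List<String> row : rows) {
            System.out.println(formatRow(row));
        }
    }
}
```

With the sample data this prints `Name|Unit price|` and `Widget|4.20|`. Like the example above, it leaves a trailing `|` after the last cell of each row.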


For more detailed information, check the Javadoc.
The Javadoc API documentation can be generated (see also the '_Building from Source_' section) via

```
mvn javadoc:javadoc
```

which writes the HTML files to the ```target/site/apidocs/``` directory.

## Building from Source

Clone this repo and run: