[feature] Add word crop general orientation to output #1546

Merged
merged 8 commits into from
Apr 12, 2024

Contributor

@felixdittrich92 felixdittrich92 commented Apr 12, 2024

This PR:

  • Adds the result of the crop orientation classifier to the output, so users can work with the general word orientation

Any feedback is welcome 🤗

Note: feature request from: @mllife

@felixdittrich92 felixdittrich92 added the labels topic: documentation, type: enhancement, module: models, module: io, ext: api, ext: docs Apr 12, 2024
@felixdittrich92 felixdittrich92 added this to the 0.9.0 milestone Apr 12, 2024
@felixdittrich92 felixdittrich92 self-assigned this Apr 12, 2024
Contributor Author

felixdittrich92 commented Apr 12, 2024

The API will fail until this is merged into main, because it installs python-doctr from the main branch:

python-doctr = {git = "https://github.com/mindee/doctr.git", extras = ['tf'], branch = "main" }

But it works:
KIE:

[
  {
    "name": "vertical.png",
    "orientation": {
      "value": null,
      "confidence": null
    },
    "language": {
      "value": null,
      "confidence": null
    },
    "dimensions": [
      632,
      1508
    ],
    "predictions": [
      {
        "class_name": "words",
        "items": [
          {
            "value": "Consolidated",
            "geometry": [
              0.0556640625,
              0.06892182555379744,
              0.1806640625,
              0.1364962420886076
            ],
            "confidence": 0.64,
            "crop_orientation": {
              "value": 0,
              "confidence": null
            }
          },

OCR:

[
  {
    "name": "6.jpg",
    "orientation": {
      "value": null,
      "confidence": null
    },
    "language": {
      "value": null,
      "confidence": null
    },
    "dimensions": [
      4624,
      3468
    ],
    "items": [
      {
        "blocks": [
          {
            "geometry": [
              0.25390625,
              0.2880859375,
              0.4075520833333333,
              0.392578125
            ],
            "lines": [
              {
                "geometry": [
                  0.30598958333333337,
                  0.2958984375,
                  0.31380208333333337,
                  0.3017578125
                ],
                "words": [
                  {
                    "value": "A",
                    "geometry": [
                      0.30598958333333337,
                      0.2958984375,
                      0.31380208333333337,
                      0.3017578125
                    ],
                    "confidence": 0.17,
                    "crop_orientation": {
                      "value": 0,
                      "confidence": null
                    }
                  }
                ]
              },
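
For readers consuming this response, a minimal sketch of walking the nesting shown above (items → blocks → lines → words) and collecting each word's crop_orientation. The helper name `iter_word_orientations` is hypothetical and not part of doctr or its API; the sample data is trimmed from the OCR output quoted above.

```python
from typing import Any, Dict, Iterator, List, Tuple


def iter_word_orientations(
    response: List[Dict[str, Any]],
) -> Iterator[Tuple[Any, str, Any, Any]]:
    """Yield (file name, word value, orientation value, orientation confidence).

    Hypothetical helper: it only assumes the JSON layout quoted in this PR.
    """
    for doc in response:
        for page in doc.get("items", []):
            for block in page.get("blocks", []):
                for line in block.get("lines", []):
                    for word in line.get("words", []):
                        # Fall back to an empty dict if the key is absent
                        orientation = word.get("crop_orientation") or {}
                        yield (
                            doc.get("name"),
                            word["value"],
                            orientation.get("value"),
                            orientation.get("confidence"),
                        )


# Sample trimmed from the OCR response above (JSON null becomes None)
sample = [
    {
        "name": "6.jpg",
        "items": [
            {
                "blocks": [
                    {
                        "lines": [
                            {
                                "words": [
                                    {
                                        "value": "A",
                                        "confidence": 0.17,
                                        "crop_orientation": {
                                            "value": 0,
                                            "confidence": None,
                                        },
                                    }
                                ]
                            }
                        ]
                    }
                ]
            }
        ],
    }
]

rows = list(iter_word_orientations(sample))
```

With straight pages the orientation confidence is null in the output above, so a consumer should not assume it is always a number.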

With assume_straight_pages=False:

"words": [
                  {
                    "value": "temtemnestpe.secines",
                    "geometry": [
                      0.1366126537322998,
                      0.4366913139820099,
                      0.2622371315956116,
                      0.40304186940193176,
                      0.27151674032211304,
                      0.4225289523601532,
                      0.14589223265647888,
                      0.4561783969402313
                    ],
                    "confidence": 0,
                    "crop_orientation": {
                      "value": 90,
                      "confidence": 0.79
                    }
{
            "value": "TLOUTSLAPS",
            "geometry": [
              0.24755072593688965,
              0.4060332179069519,
              0.31999316811561584,
              0.389607310295105,
              0.3279702067375183,
              0.4093964695930481,
              0.2555277347564697,
              0.425822377204895
            ],
            "confidence": 0,
            "crop_orientation": [
              90,
              0.98

@felixdittrich92 felixdittrich92 marked this pull request as ready for review April 12, 2024 12:33
@felixdittrich92 felixdittrich92 merged commit dcaae42 into mindee:main Apr 12, 2024
66 of 68 checks passed
@felixdittrich92 felixdittrich92 deleted the word-orient-out branch April 12, 2024 14:50