Multi page table data not extracting #7

Open
ursawant opened this issue Oct 29, 2021 · 0 comments
ursawant commented Oct 29, 2021

I have a two-page PDF. Page 1 contains a page title, some intro text, and the start of a table, which continues on page 2. The table headers are repeated on both pages. When I try to extract this PDF, the extractor returns an empty Tables array, [].
The result array is:

[ [ 'Email Id: abc@gmail.com\nA BC\nFLAT NO 222 LAND 24 E\nCITY CENTRE \nNR TELEPHONE EXCHANGE\nMUMBAI - 400014\nMaharashtra\nIndia\nMobile: +12121212121,
'Some intro text intro text\nintro text intro textintro textintro textintro text intro textintro text.\nintro text.' ],
[ '', '' ],
[ 'ValueSr. No.\n(INR)\nName Unit Date Amount\n(INR)',
'' ] ]

Here the table headers are Sr. No., Value, Unit, Date, and Amount.
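
To make the symptom concrete, here is a minimal Python sketch (not the extractor's own code) that scans a result array like the one above for a cell containing all of the expected header names. The helper name `find_header_cells` is hypothetical, and the first row of the sample is shortened placeholder text rather than the full output.

```python
# Header names as listed in this report.
EXPECTED_HEADERS = ["Sr. No.", "Value", "Unit", "Date", "Amount"]


def find_header_cells(rows, headers=EXPECTED_HEADERS):
    """Return (row_index, col_index) for every cell that contains all headers."""
    hits = []
    for r, row in enumerate(rows):
        for c, cell in enumerate(row):
            if all(h in cell for h in headers):
                hits.append((r, c))
    return hits


# Shortened stand-in for the result array; only the last row is verbatim.
result = [
    ["page title and address text", "intro text"],
    ["", ""],
    ["ValueSr. No.\n(INR)\nName Unit Date Amount\n(INR)", ""],
]

print(find_header_cells(result))  # [(2, 0)]
```

All five headers land in a single text cell, which suggests the header row was read as plain text rather than as table columns.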

If I split the PDF into two separate files, Page1.pdf and page2.pdf, one per page, Page1.pdf still shows the same issue, whereas page2.pdf extracts correctly.
My observation is that on page 1 the table has no closing border (a horizontal line or similar) at the bottom, which seems to break detection, so the table data is not extracted.
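
The observation above can be illustrated with a toy model. This is only an assumption about how a ruling-line-based extractor might behave, not this project's actual code: rows are taken as the bands between consecutive horizontal lines, so a last row with no line beneath it is dropped. The helpers `rows_between_lines` and `close_open_table` are hypothetical, and the coordinates are made up.

```python
def rows_between_lines(line_ys):
    """Pair consecutive horizontal rulings into (top, bottom) row bands."""
    ys = sorted(line_ys)
    return list(zip(ys, ys[1:]))


def close_open_table(line_ys, last_text_y, pad=2.0):
    """Append a synthetic closing ruling below the last text line if none exists."""
    ys = sorted(line_ys)
    if not ys or ys[-1] < last_text_y:
        ys.append(last_text_y + pad)
    return ys


# y grows downward. Page 1 has rulings at 100..160, but the last row's text
# sits at y=170 and the table runs off the page without a closing line.
page1_lines = [100.0, 120.0, 140.0, 160.0]
print(len(rows_between_lines(page1_lines)))  # 3 bands; the row at y=170 is lost

closed = close_open_table(page1_lines, last_text_y=170.0)
print(rows_between_lines(closed)[-1])  # (160.0, 172.0)
```

If the extractor does work this way, adding a closing line to page 1 (or synthesizing one before cell detection) would be one way to confirm the hypothesis.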

Any comments and suggestions will be appreciated.

Thanks in advance.
