Visually Salient Text (VST)

Visual saliency is the distinct quality that makes some items stand out from their surroundings and grab our attention.

This project uses deep learning to extract salient text from an image, using a state-of-the-art Vision Transformer architecture.

The model is the Visual Saliency Transformer, trained on a synthetically generated dataset built around textual saliency considerations. The dataset consists of images in the style of news articles, memes, advertisements and other images commonly found on the internet. Text saliency models can be used to filter out noise in text-rich environments and to improve OCR quality in the wild.

Examples of Text Saliency used with EasyOCR

Raw Text: TAROT PREDICTS HUNG HOUSE INDIA TODAY IN UTTAR PRADESH 540 INDIA EXCLUSIVE TODAY MAN WHO SPOKE T0 SAIFULLAH DURING ENCOUNTER JALu Msn B PM Iop SheeLA BaJaJ, TarOT CarD READER Mt indiatoday-in NeWS LUCKNOW ENCOUNTER FLASH Saifullah died in exchange 0f fire Pm

Salient Text: TAROT PREDICTS HUNG HOUSE INDIA IN UTTAR PRADESH INDIA TODAY NeWS LUCKNOW ENCOUNTER FLASH Saifullah died in exchange of fire

Raw text: NEED TO LOSE 30 POUNDS? TRY SENSA FREEI SENSA" is clinically proven to help you lose 30 Ibs without dieting or spending all your time working out: Just sprinkle on your food; eat and lose weight! GET A GYM BODY WIthout GOING TO THE GYM NO COUNTING CALORIES NO STIMULANTS NO PILLS for Doesn t taste of the your foodl Try SENSA'FREEI Mfll SensaOftercom /OKer (8001750-6971 VoI | CLINICALLY PROVEN: 100% SATISFACTION GUARANTEED: SENSA eocicgdodged Ca Cla npmnd nn nandtan GNCLVWcll Oeoi S S ne nite Deantat #op hanehroloadceoh A A dtatd CCdedoDado nolcamnatndndniot enn GPECIa< OKI 6 change ncadans SENSA CL

Salient Text: 30 POUNDS? TRY SENSA FREEI GET A GYM BODY Try SENSA'FREEI

Usage of VST

Directory Structure of Key Components

VisuallySalientText
├── VST_DEMO.ipynb
├── Models
│   ├── PretrainedModels
│   │   └── 80.7_T2T_ViT_t_14.pth.tar***
│   ├── Checkpoints
│   │   └── RGB_VST.pth***
│   └── Decoder.py, Transformer.py, ...
├── Data
│   ├── OCSD
│   │   ├── OCSD-TR     (training set)
│   │   │   ├── OCSD-TR-Image
│   │   │   │   └── img0.jpg, img1.jpg, ...
│   │   │   ├── OCSD-TR-Mask
│   │   │   │   └── img0.png, img1.png, ...
│   │   │   └── OCSD-TR-Contour
│   │   │       └── img0.png, img1.png, ...
│   │   ├── OCSD-TE     (testing set)
│   │   │   ├── images
│   │   │   │   └── img0.jpg, img1.jpg...
...

(***) Create the PretrainedModels and Checkpoints directories, then download the corresponding model weights into them.
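As a quick sanity check before running anything, the two weight locations marked *** in the tree can be created and verified from Python. This is only a sketch: `prepare_weight_dirs` is a hypothetical helper that is not part of the repository, and it does not download anything; it assumes the relative paths shown in the tree.

```python
from pathlib import Path

# Relative paths taken from the directory tree (without the *** markers).
REQUIRED_WEIGHTS = (
    Path("Models/PretrainedModels/80.7_T2T_ViT_t_14.pth.tar"),
    Path("Models/Checkpoints/RGB_VST.pth"),
)


def prepare_weight_dirs(repo_root):
    """Create the weight directories and return the weight files still missing.

    Hypothetical helper, not part of the repository.
    """
    root = Path(repo_root)
    missing = []
    for rel_path in REQUIRED_WEIGHTS:
        target = root / rel_path
        # Create PretrainedModels/ or Checkpoints/ if it does not exist yet.
        target.parent.mkdir(parents=True, exist_ok=True)
        if not target.is_file():
            missing.append(target)
    return missing
```

Calling `prepare_weight_dirs(".")` from the repository root returns the list of weight files that still need to be downloaded.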

The directory structure shown here is for the Optical Character Saliency Dataset (OCSD), but it also works for any dataset with Image-Mask-Contour formatted directories.
Because optical characters are small and convoluted, the contour masks are largely unnecessary for text saliency and can be replaced with a copy of the saliency masks.
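Replacing the contour masks with copies of the saliency masks can be done with a few lines of Python. The sketch below is an assumption about how one might do it, not code from the repository: `copy_masks_as_contours` is a hypothetical helper, and it assumes the masks are PNG files that should keep the same file names in the contour directory.

```python
import shutil
from pathlib import Path


def copy_masks_as_contours(mask_dir, contour_dir):
    """Copy every PNG saliency mask into the contour directory.

    Hypothetical helper, not part of the repository. File names are kept
    unchanged so each contour file matches its mask by name.
    Returns the number of files copied.
    """
    mask_dir, contour_dir = Path(mask_dir), Path(contour_dir)
    contour_dir.mkdir(parents=True, exist_ok=True)
    copied = 0
    for mask_path in sorted(mask_dir.glob("*.png")):
        shutil.copy2(mask_path, contour_dir / mask_path.name)
        copied += 1
    return copied
```

For the OCSD layout above, this would be called as `copy_masks_as_contours("Data/OCSD/OCSD-TR/OCSD-TR-Mask", "Data/OCSD/OCSD-TR/OCSD-TR-Contour")`.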

Saliency Inference / Testing

For images in the directory Data/Dataset/images/ (for example Data/Dataset/images/image0.jpg):

$ python VST.py --test_paths Dataset/

Saliency Mask Visualization (Overlay)

Refer to PredictionHeatmapVisualization.ipynb
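The notebook is not reproduced here. For readers who only want the idea, an overlay can be sketched as an alpha blend of a tint colour over the image, weighted by the predicted mask. The function below is an illustrative assumption (NumPy only, red tint, mask at the same resolution as the image) and may differ from what the notebook actually does.

```python
import numpy as np


def overlay_saliency(image, mask, alpha=0.5, color=(255, 0, 0)):
    """Blend a saliency mask over an RGB image.

    image: uint8 array of shape (H, W, 3).
    mask:  array of shape (H, W) with values in [0, 255].
    alpha: maximum opacity of the tint where the mask is fully salient.
    """
    image = np.asarray(image, dtype=np.float32)
    # Per-pixel blend weight in [0, alpha], broadcast over the colour channels.
    weight = (np.asarray(mask, dtype=np.float32) / 255.0)[..., None] * alpha
    tint = np.asarray(color, dtype=np.float32)
    blended = image * (1.0 - weight) + tint * weight
    return np.clip(np.rint(blended), 0, 255).astype(np.uint8)
```

Pixels where the mask is 0 are left unchanged, and fully salient pixels are tinted with opacity `alpha`.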

Saliency-OCR Integration with EasyOCR

For images in the directory Data/Dataset/images/ (for example Data/Dataset/images/image0.jpg) and masks in the directory Predictions/Dataset/RGB_VST/:

$ python SalOCR.py --imagefilepath Data/Dataset/images/ --maskfilepath Predictions/Dataset/RGB_VST/

Text output is written to TextOutput/ in JSON format.
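The idea behind combining a saliency mask with OCR can be sketched as follows: keep only the detections whose bounding box is mostly covered by salient pixels. This is a minimal illustration, not the logic of SalOCR.py. The detections mimic the `(bbox, text, confidence)` tuples that EasyOCR's `readtext` returns, while the helper names, the 128 threshold and the 0.5 coverage cutoff are assumptions chosen for the example. The mask is assumed to have the same resolution as the image.

```python
def salient_fraction(mask, bbox, threshold=128):
    """Fraction of pixels in the bbox's bounding rectangle that are salient.

    mask: 2D grid indexed as mask[y][x] with values in [0, 255]
          (a nested list or a NumPy array both work).
    bbox: four (x, y) corner points, as in EasyOCR detections.
    """
    height, width = len(mask), len(mask[0])
    xs = [int(x) for x, _ in bbox]
    ys = [int(y) for _, y in bbox]
    # Clamp the axis-aligned rectangle to the mask bounds.
    x0, x1 = max(min(xs), 0), min(max(xs), width)
    y0, y1 = max(min(ys), 0), min(max(ys), height)
    area = (x1 - x0) * (y1 - y0)
    if area <= 0:
        return 0.0
    salient = sum(
        1 for y in range(y0, y1) for x in range(x0, x1) if mask[y][x] >= threshold
    )
    return salient / area


def filter_salient_text(detections, mask, min_fraction=0.5):
    """Keep the text of detections whose box is mostly covered by the mask."""
    return [
        text
        for bbox, text, _confidence in detections
        if salient_fraction(mask, bbox) >= min_fraction
    ]


# Toy 4x8 mask: the left half is salient, the right half is not.
toy_mask = [[255] * 4 + [0] * 4 for _ in range(4)]
toy_detections = [
    ([(0, 0), (4, 0), (4, 4), (0, 4)], "HEADLINE", 0.98),
    ([(4, 0), (8, 0), (8, 4), (4, 4)], "watermark", 0.61),
]
print(filter_salient_text(toy_detections, toy_mask))  # ['HEADLINE']
```

A coverage cutoff is used instead of requiring full overlap because predicted masks rarely align exactly with OCR boxes; the right value depends on the data.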

Training

$ python VST.py --Training True --Testing False

reidenong/VisuallySalientText
