max-efort/slimpy

slimpy

A simple library that provides custom string slicing and matching.

slimpy provides a convenient way to search for and identify strings that may contain unexpected characters. It is useful when you expect a particular string but the text contains it with a few mismatched characters, for example text extracted by OCR tools such as pytesseract. In fact, that use case is the reason this library was created.

Installation

pip install slimpy

Example

Suppose there is a script that extracts text from an image and there is an expected word to be present in the extracted text:

expected_word = "character"
extracted_text = "This sentence has one typo word that has two mismatch oharaoter"
expected_word in extracted_text
>>> False

# We can use this library to handle this kind of situation
from slimpy import Fragment, REM

word_Fragment = Fragment(expected_word)
matching = REM()
matching.set_reference(extracted_text)
match = matching.perform_matching(word_Fragment)
match
>>> 
Fragmented string: character
Pattern match    : .{0,1}hara.*?ter
List of match    : ['oharaoter']

match.match
>>> oharaoter
match.pattern
>>> .{0,1}hara.*?ter

That's it! As mentioned earlier, slimpy searches for and identifies strings that may contain unexpected characters. For more information, see the documentation inside doc.
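
To see why the reported pattern finds the misspelled word, it can be tried directly with Python's standard `re` module. This is only a sanity check of the pattern string shown in the example output; it is not slimpy's internal implementation, and how slimpy derives the pattern from the fragment is not covered here.

```python
import re

# Pattern and text copied from the example above.
pattern = r".{0,1}hara.*?ter"
extracted_text = "This sentence has one typo word that has two mismatch oharaoter"

# ".{0,1}" tolerates one wrong leading character ("o" instead of "c"),
# and ".*?" tolerates the wrong character between "hara" and "ter".
matches = re.findall(pattern, extracted_text)
print(matches)
```

The fixed pieces `hara` and `ter` anchor the match, while the wildcard parts absorb the characters that OCR got wrong.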
