Added Notification & Finder Modules + UnitTests
Signed-off-by: Guilherme Bacellar Moralez <guibacellar@gmail.com>
guibacellar committed Oct 1, 2023
1 parent 35ea745 commit d7981fd
Showing 21 changed files with 512 additions and 20 deletions.
22 changes: 11 additions & 11 deletions README.md
@@ -84,38 +84,38 @@ data_path=/usr/TEx/
Run the first two commands to configure and sync TEx, then run the last one to start the listener module.

```bash
-python -m TEx connect --config /usr/my_TEx_config.config
-python -m TEx load_groups --config /usr/my_TEx_config.config
-python -m TEx listen --config /usr/my_TEx_config.config
+python3 -m TEx connect --config /usr/my_TEx_config.config
+python3 -m TEx load_groups --config /usr/my_TEx_config.config
+python3 -m TEx listen --config /usr/my_TEx_config.config
```

<!-- Command Line -->
## Command Line

### Connect to Telegram Servers
```bash
-python -m TEx connect --config CONFIGURATION_FILE_PATH
+python3 -m TEx connect --config CONFIGURATION_FILE_PATH
```
* **config** > Required - Created Configuration File Path

### Update Groups List (Optional, but Recommended)
```bash
-python -m TEx load_groups --config CONFIGURATION_FILE_PATH --refresh_profile_photos
+python3 -m TEx load_groups --config CONFIGURATION_FILE_PATH --refresh_profile_photos
```

* **config** > Required - Created Configuration File Path
* **refresh_profile_photos** > Optional - If present, forces the download and update of all channel members' profile photos

### List Groups
```bash
-python -m TEx list_groups --config CONFIGURATION_FILE_PATH
+python3 -m TEx list_groups --config CONFIGURATION_FILE_PATH
```

* **config** > Required - Created Configuration File Path

### Listen Messages (Start the Message Listener)
```bash
-python -m TEx listen --config CONFIGURATION_FILE_PATH --group_id 1234,5678
+python3 -m TEx listen --config CONFIGURATION_FILE_PATH --group_id 1234,5678
```

* **config** > Required - Created Configuration File Path
@@ -125,7 +125,7 @@ python -m TEx listen --config CONFIGURATION_FILE_PATH --group_id 1234,5678
### Download Messages (Download since first message for each group)
Scrape Messages from Telegram Server
```bash
-python -m TEx download_messages --config CONFIGURATION_FILE_PATH --group_id 1234,5678
+python3 -m TEx download_messages --config CONFIGURATION_FILE_PATH --group_id 1234,5678
```

* **config** > Required - Created Configuration File Path
@@ -135,7 +135,7 @@ python -m TEx download_messages --config CONFIGURATION_FILE_PATH --group_id 1234
### Generate Report
Generate HTML Report
```bash
-python -m TEx report --config CONFIGURATION_FILE_PATH --report_folder REPORT_FOLDER_PATH --group_id * --around_messages NUM --order_desc --limit_days 3 --filter FILTER_EXPRESSION_1,FILTER_EXPRESSION_2,FILTER_EXPRESSION_N
+python3 -m TEx report --config CONFIGURATION_FILE_PATH --report_folder REPORT_FOLDER_PATH --group_id * --around_messages NUM --order_desc --limit_days 3 --filter FILTER_EXPRESSION_1,FILTER_EXPRESSION_2,FILTER_EXPRESSION_N
```
* **config** > Required - Created Configuration File Path
* **report_folder** > Optional - Defines the Report Files Folder
@@ -149,7 +149,7 @@ python -m TEx report --config CONFIGURATION_FILE_PATH --report_folder REPORT_FOL
### Export Downloaded Files
Export Downloaded Files by MimeType
```bash
-python -m TEx export_file --config CONFIGURATION_FILE_PATH --report_folder REPORT_FOLDER_PATH --group_id * --filter * --limit_days 3 --mime_type text/plain
+python3 -m TEx export_file --config CONFIGURATION_FILE_PATH --report_folder REPORT_FOLDER_PATH --group_id * --filter * --limit_days 3 --mime_type text/plain
```
* **config** > Required - Created Configuration File Path
* **report_folder** > Optional - Defines the Report Files Folder
@@ -161,7 +161,7 @@ python -m TEx export_file --config CONFIGURATION_FILE_PATH -report_folder REPORT
### Export Texts
Export Messages (Texts) using Regex finder
```bash
-python -m TEx export_text --config CONFIGURATION_FILE_PATH --order_desc --limit_days 3 --regex REGEX --report_folder REPORT_FOLDER_PATH --group_id *
+python3 -m TEx export_text --config CONFIGURATION_FILE_PATH --order_desc --limit_days 3 --regex REGEX --report_folder REPORT_FOLDER_PATH --group_id *
```
* **config** > Required - Created Configuration File Path
* **report_folder** > Optional - Defines the Report Files Folder
4 changes: 2 additions & 2 deletions TEx/core/mapper/telethon_channel_mapper.py
@@ -18,12 +18,12 @@ def to_database_dict(channel: Channel, target_phone_numer: str) -> Dict:
'fake': channel.fake,
'gigagroup': getattr(channel, 'gigagroup', False),
'has_geo': getattr(channel, 'has_geo', False),
-'participants_count': channel.participants_count,
+'participants_count': getattr(channel, 'participants_count', 0),
'restricted': channel.restricted,
'scam': channel.scam,
'group_username': channel.username,
'verified': channel.verified,
-'title': channel.title,
+'title': getattr(channel, 'title', ''),
'source': target_phone_numer
}
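
The `getattr` defaults introduced in this hunk can be illustrated with a minimal sketch. It is not part of the commit: `safe_channel_fields` is a hypothetical helper, and `SimpleNamespace` stands in for a Telethon `Channel`. One caveat is that the default applies only when the attribute is missing; an attribute that exists but holds `None` is returned as `None`.

```python
from types import SimpleNamespace
from typing import Any, Dict


def safe_channel_fields(channel: Any) -> Dict[str, Any]:
    """Hypothetical helper mirroring the two getattr() calls in the hunk."""
    return {
        # Falls back to 0 only when the attribute does not exist at all.
        'participants_count': getattr(channel, 'participants_count', 0),
        # Falls back to '' only when the attribute does not exist at all.
        'title': getattr(channel, 'title', ''),
    }


# Stand-in objects; a real Telethon Channel is assumed to behave the same way
# for plain attribute access.
without_count = SimpleNamespace(title='News')
with_none_count = SimpleNamespace(participants_count=None, title='News')

print(safe_channel_fields(without_count))
print(safe_channel_fields(with_none_count))
```

If `None` values also need a fallback, something like `getattr(channel, 'participants_count', None) or 0` would cover both cases, at the cost of treating every falsy value the same way.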

4 changes: 2 additions & 2 deletions TEx/database/__init__.py
@@ -11,5 +11,5 @@ def __setitem__(self, key, value, cache_setitem=Cache.__setitem__) -> None: # t
super().__setitem__(key, value, cache_setitem) # type: ignore


-GROUPS_CACHE: NoneSupportedTTLCache = NoneSupportedTTLCache(maxsize=256, ttl=60)
-USERS_CACHE: NoneSupportedTTLCache = NoneSupportedTTLCache(maxsize=2048, ttl=60)
+GROUPS_CACHE: NoneSupportedTTLCache = NoneSupportedTTLCache(maxsize=256, ttl=300)
+USERS_CACHE: NoneSupportedTTLCache = NoneSupportedTTLCache(maxsize=2048, ttl=300)
1 change: 1 addition & 0 deletions TEx/finder/__init__.py
@@ -0,0 +1 @@
"""TEx Finder Modules."""
10 changes: 10 additions & 0 deletions TEx/finder/base_finder.py
@@ -0,0 +1,10 @@
"""Base Class for All Finders."""
import abc


class BaseFinder:
    """Base Finder Class."""

    @abc.abstractmethod
    async def find(self, raw_text: str) -> bool:
        """Apply Find Logic."""
59 changes: 59 additions & 0 deletions TEx/finder/finder_engine.py
@@ -0,0 +1,59 @@
"""Finder Engine."""
from configparser import ConfigParser
from typing import Dict, List

from telethon.events import NewMessage

from TEx.finder.regex_finder import RegexFinder
from TEx.notifier.notifier_engine import NotifierEngine


class FinderEngine:
    """Primary Finder Engine."""

    def __init__(self) -> None:
        """Initialize Finder Engine."""
        self.is_finder_enabled: bool = False
        self.rules: List[Dict] = []
        self.notification_engine: NotifierEngine = NotifierEngine()

    def __is_finder_enabled(self, config: ConfigParser) -> bool:
        """Check if Finder Module is Enabled."""
        return (
            config.has_option('FINDER', 'enabled') and config['FINDER']['enabled'] == 'true'
        )

    def __load_rules(self, config: ConfigParser) -> None:
        """Load Finder Rules."""
        rules_sections: List[str] = [item for item in config.sections() if 'FINDER.RULE.' in item]

        for sec in rules_sections:
            if config[sec]['type'] == 'regex':
                self.rules.append({
                    'id': sec,
                    'instance': RegexFinder(config=config[sec]),
                    'notifier': config[sec]['notifier']
                })

    def configure(self, config: ConfigParser) -> None:
        """Configure Finder."""
        self.is_finder_enabled = self.__is_finder_enabled(config=config)
        self.__load_rules(config=config)
        self.notification_engine.configure(config=config)

    async def run(self, message: NewMessage.Event) -> None:
        """Execute the Finder with Raw Text."""
        if not self.is_finder_enabled:
            return

        for rule in self.rules:
            is_found: bool = await rule['instance'].find(raw_text=message.raw_text)

            if is_found:

                # Run the Notification Engine
                await self.notification_engine.run(
                    notifiers=rule['notifier'].split(','),
                    message=message,
                    rule_id=rule['id']
                )
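
To make the configuration that `FinderEngine` expects more concrete, here is a minimal sketch, not taken from the commit. The section and option names (`FINDER`, `enabled`, `FINDER.RULE.*`, `type`, `regex`, `notifier`) are the ones the engine reads above. The rule name, the regex and the notifier names `NOTIFIER_A` and `NOTIFIER_B` are made up, because the notifier configuration format is not visible in this part of the diff.

```python
from configparser import ConfigParser
from typing import Dict, List

# Hypothetical configuration. Only the section/option names come from the code above.
SAMPLE_CONFIG = r"""
[FINDER]
enabled=true

[FINDER.RULE.LeakedDatabase]
type=regex
regex=leak(ed)?\s+database
notifier=NOTIFIER_A,NOTIFIER_B
"""


def load_rule_summaries(config: ConfigParser) -> List[Dict]:
    """Select rule sections the same way FinderEngine does ('FINDER.RULE.' in name)."""
    return [
        {
            'id': sec,
            'type': config[sec]['type'],
            'regex': config[sec]['regex'],
            'notifiers': config[sec]['notifier'].split(','),
        }
        for sec in config.sections()
        if 'FINDER.RULE.' in sec
    ]


config = ConfigParser()
config.read_string(SAMPLE_CONFIG)

# The engine compares against the exact string 'true', so 'True' or 'yes'
# would leave the finder disabled.
enabled = config.has_option('FINDER', 'enabled') and config['FINDER']['enabled'] == 'true'
rules = load_rule_summaries(config)

print(enabled, rules)
```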
23 changes: 23 additions & 0 deletions TEx/finder/regex_finder.py
@@ -0,0 +1,23 @@
"""Regex Finder."""
import re
from configparser import SectionProxy

from TEx.finder.base_finder import BaseFinder


class RegexFinder(BaseFinder):
    """Regex Based Finder."""

    def __init__(self, config: SectionProxy) -> None:
        """Initialize RegEx Finder."""
        self.regex: re.Pattern = re.compile(config['regex'], flags=re.IGNORECASE | re.MULTILINE)

    async def find(self, raw_text: str) -> bool:
        """Apply Find Logic."""
        if not raw_text or len(raw_text) == 0:
            return False

        if len(self.regex.findall(raw_text)) > 0:
            return True

        return False
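
Two points about the finder contract may be worth a sketch, neither of which is part of the commit. First, `abc.abstractmethod` is only enforced when the class uses `abc.ABCMeta`, for example by inheriting from `abc.ABC`; `BaseFinder` as written does not, so it can still be instantiated. Second, `re.Pattern.search` stops at the first match, whereas `findall` collects every match before the length check. The names `StrictBaseFinder` and `SearchRegexFinder` are hypothetical, and the finder here takes a plain pattern string instead of a `SectionProxy` to stay self-contained.

```python
import abc
import asyncio
import re


class StrictBaseFinder(abc.ABC):
    """Hypothetical variant of BaseFinder; inheriting abc.ABC enforces find()."""

    @abc.abstractmethod
    async def find(self, raw_text: str) -> bool:
        """Apply Find Logic."""


class SearchRegexFinder(StrictBaseFinder):
    """Hypothetical regex finder that receives the pattern string directly."""

    def __init__(self, pattern: str) -> None:
        # Same flags as RegexFinder in the commit.
        self.regex = re.compile(pattern, flags=re.IGNORECASE | re.MULTILINE)

    async def find(self, raw_text: str) -> bool:
        if not raw_text:
            return False
        # search() returns at the first match instead of collecting all of them.
        return self.regex.search(raw_text) is not None


def base_is_instantiable() -> bool:
    """Return True if the abstract base can be instantiated directly."""
    try:
        StrictBaseFinder()
    except TypeError:
        return False
    return True


finder = SearchRegexFinder(r'^leak')
print(asyncio.run(finder.find('first line\nLEAK on second line')))
print(base_is_instantiable())
```

With `re.MULTILINE`, `^` matches at the start of every line, which is why the pattern in the usage lines matches text on the second line.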
2 changes: 1 addition & 1 deletion TEx/logging.conf
@@ -14,7 +14,7 @@ keys=simpleFormatter
#######################

[logger_root]
-level=DEBUG
+level=INFO
handlers=consoleHandler

[logger_sqlalchemy]
8 changes: 8 additions & 0 deletions TEx/modules/telegram_messages_listener.py
@@ -14,6 +14,7 @@
from TEx.core.media_handler import UniversalTelegramMediaHandler
from TEx.database.telegram_group_database import TelegramGroupDatabaseManager, TelegramMessageDatabaseManager, \
TelegramUserDatabaseManager
+from TEx.finder.finder_engine import FinderEngine

logger = logging.getLogger()

@@ -28,6 +29,7 @@ def __init__(self) -> None:
self.group_ids: List[int] = []
self.media_handler: UniversalTelegramMediaHandler = UniversalTelegramMediaHandler()
self.target_phone_number: str = ''
+self.finder: FinderEngine = FinderEngine()

async def __handler(self, event: NewMessage.Event) -> None:
"""Handle the Message."""
@@ -72,6 +74,9 @@ async def __handler(self, event: NewMessage.Event) -> None:
values['from_id'] = None
values['from_type'] = None

+# Execute Finder
+await self.finder.run(message=message)

# Add to DB
TelegramMessageDatabaseManager.insert(values)

@@ -130,6 +135,9 @@ async def run(self, config: ConfigParser, args: Dict, data: Dict) -> None:
self.data_path = config['CONFIGURATION']['data_path']
self.target_phone_number = config['CONFIGURATION']['phone_number']

+# Set Finder
+self.finder.configure(config=config)

# Update Module Group Filtering Info
if args['group_id'] and args['group_id'] != '*':
self.group_ids = [int(group_id) for group_id in args['group_id'].split(',')]
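
The group filter parsing in the context lines above can be sketched as a small standalone function. This is an illustration only: `parse_group_ids` is a hypothetical name, and treating an empty list as "no filtering" is an assumption based on `self.group_ids` defaulting to `[]` in `__init__`.

```python
from typing import List, Optional


def parse_group_ids(raw: Optional[str]) -> List[int]:
    """Parse a --group_id value such as '1234,5678'.

    Returns an empty list for a missing value or '*'.
    """
    if not raw or raw == '*':
        return []
    return [int(group_id) for group_id in raw.split(',')]


print(parse_group_ids('1234,5678'))
print(parse_group_ids('*'))
```

As in the original expression, a non-numeric entry raises `ValueError` from `int()`.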
Original file line number Diff line number Diff line change
@@ -78,7 +78,7 @@ async def run(self, config: ConfigParser, args: Dict, data: Dict) -> None:
)

def __filter_groups(self, args: Dict, source: List[TelegramGroupReportFacadeEntity]) -> List[TelegramGroupReportFacadeEntity]:
-"""Apply Filter on Gropus."""
+"""Apply Filter on Groups."""
groups: List[TelegramGroupReportFacadeEntity] = []

# Filter Groups
@@ -121,6 +121,9 @@ async def __export_data(self, args: Dict, config: ConfigParser, group: TelegramG
souce_media_path: str = os.path.join(config['CONFIGURATION']['data_path'], 'media', str(media[0].group_id), media[0].file_name)
destination_media_path: str = os.path.join(report_root_folder, f'{media[0].group_id}_{media[0].file_name}')

+if not os.path.exists(souce_media_path):
+continue

# Compute Source File Hash
file_hash: str = ''
with open(souce_media_path, "rb") as f:
1 change: 1 addition & 0 deletions TEx/notifier/__init__.py
@@ -0,0 +1 @@
"""Notifier Modules."""
50 changes: 50 additions & 0 deletions TEx/notifier/discord_notifier.py
@@ -0,0 +1,50 @@
"""Discord Notifier."""
from configparser import SectionProxy

from discord_webhook import DiscordEmbed, DiscordWebhook
from telethon.events import NewMessage

from TEx.notifier.notifier_base import BaseNotifier


class DiscordNotifier(BaseNotifier):
    """Basic Discord Notifier."""

    def __init__(self) -> None:
        """Initialize Discord Notifier."""
        super().__init__()
        self.url: str = ''

    def configure(self, url: str, config: SectionProxy) -> None:
        """Configure the Notifier."""
        self.url = url
        self.configure_base(config=config)

    async def run(self, message: NewMessage.Event, rule_id: str) -> None:
        """Run Discord Notifier."""
        # Check and Update Deduplication Control
        is_duplicated, duplication_tag = self.check_is_duplicated(message=message.raw_text)
        if is_duplicated:
            return

        # Run the Notification Process.
        webhook = DiscordWebhook(
            url=self.url,
            rate_limit_retry=True
        )

        embed = DiscordEmbed(
            title=f'**{message.chat.title}** ({message.chat.id})',
            description=message.raw_text
        )

        embed.add_embed_field(name="Rule", value=rule_id, inline=False)
        embed.add_embed_field(name="Message ID", value=str(message.id), inline=False)
        embed.add_embed_field(name="Group Name", value=message.chat.title, inline=True)
        embed.add_embed_field(name="Group ID", value=message.chat.id, inline=True)
        embed.add_embed_field(name="Message Date", value=str(message.date), inline=False)
        embed.add_embed_field(name="Tag", value=duplication_tag, inline=False)

        # add embed object to webhook
        webhook.add_embed(embed)
        webhook.execute()
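
The deduplication check that `run` calls first (`check_is_duplicated`, defined in `notifier_base.py`) hashes the message text and remembers the hash for a configured number of minutes. The sketch below illustrates that idea with the standard library only. It swaps `cachetools.TTLCache` for a plain dict of expiry times, so it has no `maxsize` eviction, and the `DedupWindow` class and its injectable `clock` parameter are hypothetical names added for illustration and testing.

```python
import hashlib
import time
from typing import Callable, Dict, Tuple


class DedupWindow:
    """Hypothetical stand-in for the TTLCache-based check in BaseNotifier."""

    def __init__(self, minutes: int, clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl_seconds: float = minutes * 60
        self.clock = clock
        self.seen: Dict[str, float] = {}  # tag -> expiry time

    def check_is_duplicated(self, message: str) -> Tuple[bool, str]:
        """Return (is_duplicated, tag) for the given message text."""
        if not message:
            return False, ''

        # MD5 is used only as a fingerprint here, not for security.
        tag = hashlib.md5(message.encode('UTF-8')).hexdigest()  # nosec

        now = self.clock()
        expiry = self.seen.get(tag)
        if expiry is not None and now < expiry:
            # Still inside the window: report a duplicate, do not extend the window.
            return True, tag

        # First sighting, or the previous entry expired: start a new window.
        self.seen[tag] = now + self.ttl_seconds
        return False, tag


# Usage with a fake clock so the window can be advanced deterministically.
fake_now = [0.0]
window = DedupWindow(minutes=1, clock=lambda: fake_now[0])
print(window.check_is_duplicated('hello'))  # first sighting
print(window.check_is_duplicated('hello'))  # duplicate inside the window
fake_now[0] = 61.0
print(window.check_is_duplicated('hello'))  # window expired
```

Because the tag is derived from the raw text alone, identical messages posted in different groups share a tag and are suppressed together within the window; that matches the behaviour of the hashing shown in the commit.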
40 changes: 40 additions & 0 deletions TEx/notifier/notifier_base.py
@@ -0,0 +1,40 @@
"""Base Class for All Notifiers."""
import abc
import hashlib
from configparser import SectionProxy
from typing import Optional, Tuple

from cachetools import TTLCache
from telethon.events import NewMessage


class BaseNotifier:
    """Base Notifier."""

    def __init__(self) -> None:
        """Initialize the Base Notifier."""
        self.cache: Optional[TTLCache] = None

    def configure_base(self, config: SectionProxy) -> None:
        """Configure Base Notifier."""
        self.cache = TTLCache(maxsize=4096, ttl=int(config['prevent_duplication_for_minutes']) * 60)

    def check_is_duplicated(self, message: str) -> Tuple[bool, str]:
        """Check if Message is Duplicated on Notifier."""
        if not message or self.cache is None:
            return False, ''

        # Compute Deduplication Tag
        tag: str = hashlib.md5(message.encode('UTF-8')).hexdigest()  # nosec

        # If Found, Return True
        if self.cache.get(tag):
            return True, tag

        # Otherwise, Just Insert and Return False
        self.cache[tag] = True
        return False, tag

    @abc.abstractmethod
    async def run(self, message: NewMessage.Event, rule_id: str) -> None:
        """Run the Notification Process."""