Commit

Fix typos and improve wording
paskal authored and umputun committed Nov 23, 2024
1 parent ac97942 commit adc7198
Showing 2 changed files with 6 additions and 6 deletions.
2 changes: 1 addition & 1 deletion app/events/listener.go
@@ -104,7 +104,7 @@ func (l *TelegramListener) Do(ctx context.Context) error {
if l.DisableAdminSpamForward {
adminForwardStatus = "disabled"
}
-log.Printf("[DEBUG] admin handler created, spam forvarding %s, %+v", adminForwardStatus, l.adminHandler)
+log.Printf("[DEBUG] admin handler created, spam forwarding %s, %+v", adminForwardStatus, l.adminHandler)

u := tbapi.NewUpdate(0)
u.Timeout = 60
10 changes: 5 additions & 5 deletions site/docs/index.md
@@ -24,7 +24,7 @@ TG-Spam's spam detection algorithm is multifaceted, incorporating several criter
- **Stop Words Comparison**: Messages are compared against a curated list of stop words commonly found in spam.
- **OpenAI Integration**: TG-Spam may optionally use OpenAI's GPT models to analyze messages for spam patterns.
- **Emoji Count**: Messages with an excessive number of emojis are scrutinized, as this is a common trait in spam messages.
-- **Meta checks**: TG-Spam can optionalsly check the message for the number of links and the presence of images. If the number of links is greater than the specified limit, or if the message contains images but no text, it will be marked as spam.
+- **Meta checks**: TG-Spam can optionally check the message for the number of links and the presence of images. If the number of links is greater than the specified limit, or if the message contains images but no text, it will be marked as spam.
- **Automated Action**: If a message is flagged as spam, TG-Spam takes immediate action by deleting the message and banning the responsible user.
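
The meta checks described in the list above can be illustrated with a minimal Go sketch. It is not tg-spam's implementation: the `metaInput` type, the `isMetaSpam` function, and the convention that a negative limit disables the link check are all assumptions made for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// metaInput is a hypothetical, simplified view of a message;
// it is not a type from tg-spam.
type metaInput struct {
	Text      string
	Links     int
	HasImages bool
}

// isMetaSpam applies the two documented rules: more links than the limit,
// or images with no text. A negative linksLimit disables the link check
// (an assumption of this sketch).
func isMetaSpam(m metaInput, linksLimit int) bool {
	if linksLimit >= 0 && m.Links > linksLimit {
		return true
	}
	return m.HasImages && strings.TrimSpace(m.Text) == ""
}

func main() {
	fmt.Println(isMetaSpam(metaInput{Text: "check these", Links: 3}, 2))            // true: 3 links > limit 2
	fmt.Println(isMetaSpam(metaInput{HasImages: true}, 2))                          // true: image without text
	fmt.Println(isMetaSpam(metaInput{Text: "hello", Links: 1, HasImages: true}, 2)) // false
}
```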

TG-Spam can also run as a server, providing a simple HTTP API to check messages for spam. This is useful for integration with other tools, not related to Telegram. For more details see [Running with webapi server](#running-with-webapi-server) section below. In addition, it provides WEB UI to perform some useful admin tasks. For more details see [WEB UI](#web-ui) section below. All the spam detection modules can be also used as a library. For more details see [Using tg-spam as a library](#using-tg-spam-as-a-library) section below.
@@ -44,7 +44,7 @@ All the configuration is done via environment variables or command line argument
There are some mandatory parameters what has to be set:

- `--telegram.token=, [$TELEGRAM_TOKEN]` - telegram bot token. See below how to get it.
-- `--telegram.group=, [$TELEGRAM_GROUP]` - group name/id. This can be a group name (for public groups it will lookg like `mygroup`) or group id (for private groups it will look like `-123456789`). To get the group id you can use [this bot](https://t.me/myidbot) or others like it.
+- `--telegram.group=, [$TELEGRAM_GROUP]` - group name/id. This can be a group name (for public groups it will look like `mygroup`) or group id (for private groups it will look like `-123456789`). To get the group id you can use [this bot](https://t.me/myidbot) or others like it.

As long as theses two parameters are set, the bot will work. Don't forget to add the bot to the group as an admin, otherwise it will not be able to delete messages and ban users.

@@ -64,7 +64,7 @@ By default, the bot reports back to the group with the message `this is spam` an

There are 4 files used by the bot to detect spam:

-- `spam-samples.txt` - list of spam samples. Each line in this file is a full text of spam message with removed EOL. I.e. the orginal message represented as a single line. EOLs can be replaced by spaces
+- `spam-samples.txt` - list of spam samples. Each line in this file is a full text of spam message with removed EOL. I.e. the original message represented as a single line. EOLs can be replaced by spaces
- `ham-samples.txt` - list of ham (non-spam) samples. Each line in this file is a full text of ham message with removed EOL
- `exclude-tokens.txt` - list of tokens to exclude from spam detection, usually common words. Each line in this file is a single token (word), or a comma-separated list of words in dbl-quotes.
- `stop-words.txt` - list of stop words to detect spam right away. Each line in this file is a single phrase (can be one or more words). The bot checks if any of those phrases are present in the message and if so, it marks the message as spam.
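
As an illustration of the one-message-per-line format that the list above describes for `spam-samples.txt` and `ham-samples.txt`, here is a minimal Go sketch that turns a multi-line message into a single line. The `toSampleLine` helper is hypothetical and not part of tg-spam. It also collapses every run of whitespace, not only EOLs, into one space, which goes slightly beyond what the docs require.

```go
package main

import (
	"fmt"
	"strings"
)

// toSampleLine is a hypothetical helper: it splits the message on any
// whitespace (including \n and \r\n) and rejoins the words with single
// spaces, so the result fits on one line of a samples file.
func toSampleLine(msg string) string {
	return strings.Join(strings.Fields(msg), " ")
}

func main() {
	msg := "Earn money fast!\r\nWrite me in DM\n\nLimited offer"
	fmt.Println(toSampleLine(msg)) // Earn money fast! Write me in DM Limited offer
}
```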
@@ -85,7 +85,7 @@ This check uses provides samples files and active by default. The bot compares t

**Stop Words Comparison**

-If stop words file is present, the bot will check the message for the presence of any of the phrases in the file. The bot is enabled as long as `stop-words.txt` file is present in samples directory and not empty.
+If stop words file is present, the bot will check the message for the presence of the phrases in the file. The bot is enabled as long as `stop-words.txt` file is present in samples directory and not empty.
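
A phrase-presence check of this kind can be sketched in Go as follows. This is an illustration only, not the bot's code: `containsStopPhrase` is a hypothetical name, and case-insensitive substring matching is an assumption, since the docs do not say how phrases are matched.

```go
package main

import (
	"fmt"
	"strings"
)

// containsStopPhrase reports whether msg contains any of the given phrases.
// Matching is case-insensitive substring search, which is an assumption of
// this sketch. Empty phrases (e.g. blank lines in the file) are skipped.
func containsStopPhrase(msg string, phrases []string) bool {
	msg = strings.ToLower(msg)
	for _, p := range phrases {
		p = strings.ToLower(strings.TrimSpace(p))
		if p != "" && strings.Contains(msg, p) {
			return true
		}
	}
	return false
}

func main() {
	// phrases would normally come from stop-words.txt, one phrase per line
	phrases := []string{"write me in dm", "passive income", ""}
	fmt.Println(containsStopPhrase("Great PASSIVE income, join now", phrases)) // true
	fmt.Println(containsStopPhrase("See you at the meetup", phrases))          // false
}
```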

**Combot Anti-Spam System (CAS) integration**

@@ -450,7 +450,7 @@ services:
## Getting spam samples from CAS
-CAS provide an API to get spam samples, which can be used to creata a set of spam samples for the bot. Provided [`cas-export.sh`](https://raw.githubusercontent.com/umputun/tg-spam/master/cas-export.sh) script automate the process and result (`messages.txt`) can be used as a base for `spam-samples.txt` file. The script requires `jq` and `curl` to be installed and running it will take a long time.
+CAS provide an API to get spam samples, which can be used to create a set of spam samples for the bot. Provided [`cas-export.sh`](https://raw.githubusercontent.com/umputun/tg-spam/master/cas-export.sh) script automate the process and result (`messages.txt`) can be used as a base for `spam-samples.txt` file. The script requires `jq` and `curl` to be installed and running it will take a long time.

```bash
curl -s https://raw.githubusercontent.com/umputun/tg-spam/master/cas-export.sh > cas-export.sh