This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Uniformize spam-checker API, part 2: check_event_for_spam #12808

Merged
merged 9 commits into from
May 23, 2022
1 change: 1 addition & 0 deletions changelog.d/12808.feature
@@ -0,0 +1 @@
Update to `check_event_for_spam`. Deprecate the current callback signature, replace it with a new signature that is both less ambiguous (replacing booleans with explicit allow/block) and more powerful (ability to return explicit error codes).
27 changes: 17 additions & 10 deletions docs/modules/spam_checker_callbacks.md
@@ -11,22 +11,29 @@ The available spam checker callbacks are:
### `check_event_for_spam`

_First introduced in Synapse v1.37.0_
_Signature extended to support Allow and Code in Synapse v1.60.0_
_Boolean and string return value types deprecated in Synapse v1.60.0_

```python
async def check_event_for_spam(event: "synapse.events.EventBase") -> Union[bool, str]
async def check_event_for_spam(event: "synapse.module_api.EventBase") -> Union["synapse.module_api.ALLOW", "synapse.module_api.errors.Codes", str, bool]
```

Called when receiving an event from a client or via federation. The callback must return
either:
- an error message string, to indicate the event must be rejected because of spam and
give a rejection reason to forward to clients;
- the boolean `True`, to indicate that the event is spammy, but not provide further details; or
- the booelan `False`, to indicate that the event is not considered spammy.
Called when receiving an event from a client or via federation. The callback must return either:
- `synapse.module_api.ALLOW`, to allow the operation. Other callbacks
  may still decide to reject it.
- a `synapse.module_api.errors.Codes` value, to reject the operation with an error code. In case
  of doubt, `synapse.module_api.errors.Codes.FORBIDDEN` is a good error code.
- (deprecated) a `str`, to reject the operation and specify an error message. Note that clients
  typically will not localize the error message to the user's preferred locale.
- (deprecated) `False`, which behaves as `ALLOW`. Deprecated as confusing, since some
  callbacks expect `True` to allow and others expect `True` to reject.
- (deprecated) `True`, which behaves as `synapse.module_api.errors.Codes.FORBIDDEN`. Deprecated as
  confusing, since some callbacks expect `True` to allow and others expect `True` to reject.

If multiple modules implement this callback, they will be considered in order. If a
callback returns `False`, Synapse falls through to the next one. The value of the first
callback that does not return `False` will be used. If this happens, Synapse will not call
any of the subsequent implementations of this callback.
callback returns `synapse.module_api.ALLOW`, Synapse falls through to the next one. The value of the
first callback that does not return `synapse.module_api.ALLOW` will be used. If this happens, Synapse
will not call any of the subsequent implementations of this callback.
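
The fall-through rule can be illustrated with a minimal, self-contained sketch. It does not import Synapse: `Allow`, `FORBIDDEN`, `run_callbacks` and the three callbacks are stand-ins invented for illustration, and events are plain dictionaries rather than `EventBase` objects.

```python
import asyncio
from enum import Enum
from typing import Awaitable, Callable, List, Union


class Allow(Enum):
    """Stand-in for the singleton behind `synapse.module_api.ALLOW`."""

    ALLOW = "allow"


# Stand-in for an error code such as `Codes.FORBIDDEN`.
FORBIDDEN = "M_FORBIDDEN"

Result = Union[Allow, str]
Callback = Callable[[dict], Awaitable[Result]]

# Records which callbacks actually ran, in order.
calls: List[str] = []


async def allow_everything(event: dict) -> Result:
    calls.append("allow_everything")
    return Allow.ALLOW


async def reject_links(event: dict) -> Result:
    calls.append("reject_links")
    if "http://" in event.get("body", ""):
        return FORBIDDEN
    return Allow.ALLOW


async def never_reached_for_spam(event: dict) -> Result:
    calls.append("never_reached_for_spam")
    return Allow.ALLOW


async def run_callbacks(callbacks: List[Callback], event: dict) -> Result:
    # Hypothetical dispatcher: the first result that is not ALLOW is used,
    # and the remaining callbacks are skipped.
    for callback in callbacks:
        result = await callback(event)
        if result is not Allow.ALLOW:
            return result
    return Allow.ALLOW


chain = [allow_everything, reject_links, never_reached_for_spam]
spam_result = asyncio.run(run_callbacks(chain, {"body": "buy now http://spam.example"}))
calls_for_spam = list(calls)
```

For the spammy event, only the first two callbacks run. The third is skipped because `reject_links` already returned a value other than `ALLOW`.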

### `user_may_join_room`

29 changes: 29 additions & 0 deletions docs/upgrade.md
@@ -177,7 +177,36 @@ has queries that can be used to check a database for this problem in advance.

</details>

## SpamChecker API's `check_event_for_spam` has a new signature.

The previous signature has been deprecated.

Whereas `check_event_for_spam` callbacks used to return `Union[str, bool]`, they should now return `Union["synapse.module_api.ALLOW", "synapse.module_api.errors.Codes"]`.

This is part of an ongoing refactoring of the SpamChecker API to make it less ambiguous and more powerful.

If your module implements `check_event_for_spam` as follows:

```python
async def check_event_for_spam(event):
    if ...:
        # Event is spam
        return True
    # Event is not spam
    return False
```

you should rewrite it as follows:

```python
async def check_event_for_spam(event):
    if ...:
        # Event is spam, mark it as forbidden (you may use some more precise error
        # code if it is useful).
        return synapse.module_api.errors.Codes.FORBIDDEN
    # Event is not spam, mark it as `ALLOW`.
    return synapse.module_api.ALLOW
```
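
If rewriting a callback by hand is not practical right away, one possible stopgap is to wrap it. The sketch below is self-contained and does not import Synapse: `Allow`, `FORBIDDEN`, `adapt_legacy_callback` and `legacy_check` are hypothetical names used only for illustration. It treats any falsy legacy result as "not spam", mirroring the truthiness check the old dispatcher used, and it discards legacy error message strings in favour of an error code.

```python
import asyncio
from enum import Enum
from typing import Awaitable, Callable, Union


class Allow(Enum):
    """Stand-in for the singleton behind `synapse.module_api.ALLOW`."""

    ALLOW = "allow"


# Stand-in for `synapse.module_api.errors.Codes.FORBIDDEN`.
FORBIDDEN = "M_FORBIDDEN"

LegacyCallback = Callable[[dict], Awaitable[Union[bool, str]]]
NewCallback = Callable[[dict], Awaitable[Union[Allow, str]]]


def adapt_legacy_callback(legacy: LegacyCallback) -> NewCallback:
    """Wrap a bool/str callback so it returns ALLOW or an error code."""

    async def adapted(event: dict) -> Union[Allow, str]:
        result = await legacy(event)
        if not result:
            # `False` (or any falsy value) meant "not spam".
            return Allow.ALLOW
        # `True` or an error message string both meant "spam".
        return FORBIDDEN

    return adapted


async def legacy_check(event: dict) -> Union[bool, str]:
    # Hypothetical legacy callback mixing both deprecated return styles.
    body = event.get("body", "")
    if "casino" in body:
        return "No gambling adverts, please"
    return "viagra" in body


new_check = adapt_legacy_callback(legacy_check)
results = [
    asyncio.run(new_check({"body": body}))
    for body in ("hello", "casino night", "cheap viagra")
]
```

Mapping every rejection to a single code loses whatever detail the legacy message carried, so a real migration would probably pick more precise codes where they exist.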

# Upgrading to v1.59.0

4 changes: 1 addition & 3 deletions synapse/api/errors.py
@@ -270,9 +270,7 @@ class UnrecognizedRequestError(SynapseError):
"""An error indicating we don't understand the request you're trying to make"""

def __init__(
self,
msg: str = "Unrecognized request",
errcode: str = Codes.UNRECOGNIZED,
self, msg: str = "Unrecognized request", errcode: str = Codes.UNRECOGNIZED
):
super().__init__(400, msg, errcode)

49 changes: 39 additions & 10 deletions synapse/events/spamcheck.py
@@ -27,9 +27,10 @@
Union,
)

from synapse.api.errors import Codes
from synapse.rest.media.v1._base import FileInfo
from synapse.rest.media.v1.media_storage import ReadableFileWrapper
from synapse.spam_checker_api import RegistrationBehaviour
from synapse.spam_checker_api import Allow, Decision, RegistrationBehaviour
from synapse.types import RoomAlias, UserProfile
from synapse.util.async_helpers import delay_cancellation, maybe_awaitable
from synapse.util.metrics import Measure
@@ -40,9 +41,19 @@

logger = logging.getLogger(__name__)


CHECK_EVENT_FOR_SPAM_CALLBACK = Callable[
["synapse.events.EventBase"],
Awaitable[Union[bool, str]],
Awaitable[
Union[
Allow,
Codes,
# Deprecated
bool,
# Deprecated
str,
]
],
]
USER_MAY_JOIN_ROOM_CALLBACK = Callable[[str, str, bool], Awaitable[bool]]
USER_MAY_INVITE_CALLBACK = Callable[[str, str, str], Awaitable[bool]]
@@ -244,7 +255,7 @@ def register_callbacks(

    async def check_event_for_spam(
        self, event: "synapse.events.EventBase"
    ) -> Union[bool, str]:
    ) -> Union[Decision, str]:
        """Checks if a given event is considered "spammy" by this server.

        If the server considers an event spammy, then it will be rejected if
@@ -255,18 +266,36 @@ async def check_event_for_spam(
            event: the event to be checked

        Returns:
            True or a string if the event is spammy. If a string is returned it
            will be used as the error message returned to the user.
            - on `ALLOW`, the event is considered good (non-spammy) and should
              be let through. Other spamcheck filters may still reject it.
            - on `Codes`, the event is considered spammy and is rejected with a specific
              error message/code.
            - on `str`, the event is considered spammy and the string is used as error
              message. This usage is generally discouraged as it doesn't support
              internationalization.
        """
        for callback in self._check_event_for_spam_callbacks:
            with Measure(
                self.clock, "{}.{}".format(callback.__module__, callback.__qualname__)
            ):
                res: Union[bool, str] = await delay_cancellation(callback(event))
                if res:
                    return res

        return False
                res: Union[Decision, str, bool] = await delay_cancellation(
                    callback(event)
                )
                if res is False or res is Allow.ALLOW:
                    # This spam-checker accepts the event.
                    # Other spam-checkers may reject it, though.
                    continue
                elif res is True:
                    # This spam-checker rejects the event with deprecated
                    # return value `True`
                    return Codes.FORBIDDEN
                else:
                    # This spam-checker rejects the event either with a `str`
                    # or with a `Codes`. In either case, we stop here.
                    return res

        # No spam-checker has rejected the event, let it pass.
        return Allow.ALLOW

async def user_may_join_room(
self, user_id: str, room_id: str, is_invited: bool
5 changes: 3 additions & 2 deletions synapse/federation/federation_base.py
@@ -15,6 +15,7 @@
import logging
from typing import TYPE_CHECKING

import synapse
from synapse.api.constants import MAX_DEPTH, EventContentFields, EventTypes, Membership
from synapse.api.errors import Codes, SynapseError
from synapse.api.room_versions import EventFormatVersions, RoomVersion
@@ -98,9 +99,9 @@ async def _check_sigs_and_hash(
)
return redacted_event

result = await self.spam_checker.check_event_for_spam(pdu)
spam_check = await self.spam_checker.check_event_for_spam(pdu)

if result:
if spam_check is not synapse.spam_checker_api.Allow.ALLOW:
logger.warning("Event contains spam, soft-failing %s", pdu.event_id)
# we redact (to save disk space) as well as soft-failing (to stop
# using the event in prev_events).
11 changes: 6 additions & 5 deletions synapse/handlers/message.py
@@ -23,6 +23,7 @@

from twisted.internet.interfaces import IDelayedCall

import synapse
from synapse import event_auth
from synapse.api.constants import (
EventContentFields,
@@ -885,11 +886,11 @@ async def create_and_send_nonmember_event(
event.sender,
)

spam_error = await self.spam_checker.check_event_for_spam(event)
if spam_error:
if not isinstance(spam_error, str):
spam_error = "Spam is not permitted here"
raise SynapseError(403, spam_error, Codes.FORBIDDEN)
spam_check = await self.spam_checker.check_event_for_spam(event)
if spam_check is not synapse.spam_checker_api.Allow.ALLOW:
raise SynapseError(
403, "This message has been rejected as probable spam", spam_check
)

ev = await self.handle_new_client_event(
requester=requester,
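
On the client path, the hunk above turns any result other than `ALLOW` into a 403 error and passes the spam checker's return value in the error-code position. A minimal sketch of that mapping follows. It does not import Synapse: `SketchError` is a hypothetical stand-in for `SynapseError`, `raise_if_spam` is an invented helper, and error codes are plain strings.

```python
from enum import Enum
from typing import Optional, Union


class Allow(Enum):
    """Stand-in for `synapse.spam_checker_api.Allow`."""

    ALLOW = "allow"


class SketchError(Exception):
    """Hypothetical stand-in for `SynapseError` (status, message, error code)."""

    def __init__(self, http_status: int, msg: str, errcode: str):
        super().__init__(msg)
        self.http_status = http_status
        self.msg = msg
        self.errcode = errcode


def raise_if_spam(spam_check: Union[Allow, str]) -> None:
    # Anything other than the ALLOW singleton is treated as a rejection,
    # and the returned value is used as the error code.
    if spam_check is not Allow.ALLOW:
        raise SketchError(403, "Rejected as probable spam", spam_check)


caught: Optional[SketchError] = None
try:
    raise_if_spam("M_FORBIDDEN")
except SketchError as error:
    caught = error

allowed_result = raise_if_spam(Allow.ALLOW)
```

Under this shape, a deprecated `str` return value (an error message) would also end up in the error-code position, which may be worth keeping in mind while modules still use the old signature.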
5 changes: 5 additions & 0 deletions synapse/module_api/__init__.py
@@ -35,6 +35,7 @@
from twisted.internet import defer
from twisted.web.resource import Resource

from synapse import spam_checker_api
from synapse.api.errors import SynapseError
from synapse.events import EventBase
from synapse.events.presence_router import (
@@ -139,13 +140,17 @@

PRESENCE_ALL_USERS = PresenceRouter.ALL_USERS

ALLOW = spam_checker_api.Allow.ALLOW
# Singleton value used to mark a message as permitted.

__all__ = [
"errors",
"make_deferred_yieldable",
"parse_json_object_from_request",
"respond_with_html",
"run_in_background",
"cached",
"ALLOW",
"UserID",
"DatabasePool",
"LoggingTransaction",
2 changes: 2 additions & 0 deletions synapse/module_api/errors.py
@@ -15,6 +15,7 @@
"""Exception types which are exposed as part of the stable module API"""

from synapse.api.errors import (
Codes,
InvalidClientCredentialsError,
RedirectException,
SynapseError,
@@ -24,6 +25,7 @@
from synapse.storage.push_rule import RuleNotFoundException

__all__ = [
"Codes",
"InvalidClientCredentialsError",
"RedirectException",
"SynapseError",
27 changes: 26 additions & 1 deletion synapse/spam_checker_api/__init__.py
@@ -12,13 +12,38 @@
# See the License for the specific language governing permissions and
# limitations under the License.
from enum import Enum
from typing import Union

from synapse.api.errors import Codes


class RegistrationBehaviour(Enum):
"""
Enum to define whether a registration request should allowed, denied, or shadow-banned.
Enum to define whether a registration request should be allowed, denied, or shadow-banned.
"""

ALLOW = "allow"
SHADOW_BAN = "shadow_ban"
DENY = "deny"


# We define the following singleton enum rather than a string to be able to
# write `Union[Allow, ..., str]` in some of the callbacks for the spam-checker
# API, where the `str` is required to maintain backwards compatibility with
# previous versions of the API.
class Allow(Enum):
"""
Singleton to allow events to pass through in SpamChecker APIs.
"""

ALLOW = "allow"


Decision = Union[Allow, Codes]
"""
Union to define whether a request should be allowed or rejected.

To accept a request, return `ALLOW`.

To reject a request without any specific information, use `Codes.FORBIDDEN`.
"""
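
The comment above `Allow` explains why it is a singleton enum rather than a string: the deprecated `str` return type has to stay distinguishable from "allow". A small self-contained sketch of that property follows. `Allow` is redefined locally so the snippet runs without Synapse, and `describe` is a hypothetical helper.

```python
from enum import Enum
from typing import Union


class Allow(Enum):
    """Local copy of the singleton enum, for illustration only."""

    ALLOW = "allow"


def describe(result: Union[Allow, str]) -> str:
    # An identity check against the singleton cannot be confused with a
    # legacy error message, even one whose text happens to be "allow".
    if result is Allow.ALLOW:
        return "allowed"
    return f"rejected: {result}"


allowed = describe(Allow.ALLOW)
rejected = describe("allow")
```

Because `Allow` is a plain `Enum` and not a `str` subclass, `Allow.ALLOW` does not compare equal to the string `"allow"`, so both `is` and `==` checks keep the two cases apart.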