Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Django 5 deprecates CiCharField. #489

Closed
terjekv opened this issue Mar 26, 2023 · 7 comments · Fixed by #513
Closed

Django 5 deprecates CiCharField. #489

terjekv opened this issue Mar 26, 2023 · 7 comments · Fixed by #513

Comments

@terjekv
Copy link
Collaborator

terjekv commented Mar 26, 2023

See
https://adamj.eu/tech/2023/02/23/migrate-django-postgresql-ci-fields-case-insensitive-collation/ for more information.

hostpolicy.HostPolicyAtom.name: (fields.W905) django.contrib.postgres.fields.CICharField is deprecated. Support for it (except in historical migrations) will be removed in Django 5.1.
HINT: Use CharField(db_collation="…") with a case-insensitive non-deterministic collation instead.
hostpolicy.HostPolicyRole.name: (fields.W905) django.contrib.postgres.fields.CICharField is deprecated. Support for it (except in historical migrations) will be removed in Django 5.1.
HINT: Use CharField(db_collation="…") with a case-insensitive non-deterministic collation instead.
mreg.Cname.name: (fields.W905) django.contrib.postgres.fields.CICharField is deprecated. Support for it (except in historical migrations) will be removed in Django 5.1.
HINT: Use CharField(db_collation="…") with a case-insensitive non-deterministic collation instead.
mreg.ForwardZone.email: (fields.W906) django.contrib.postgres.fields.CIEmailField is deprecated. Support for it (except in historical migrations) will be removed in Django 5.1.
HINT: Use EmailField(db_collation="…") with a case-insensitive non-deterministic collation instead.
mreg.ForwardZone.name: (fields.W905) django.contrib.postgres.fields.CICharField is deprecated. Support for it (except in historical migrations) will be removed in Django 5.1.
HINT: Use CharField(db_collation="…") with a case-insensitive non-deterministic collation instead.
mreg.ForwardZone.primary_ns: (fields.W905) django.contrib.postgres.fields.CICharField is deprecated. Support for it (except in historical migrations) will be removed in Django 5.1.
HINT: Use CharField(db_collation="…") with a case-insensitive non-deterministic collation instead.
mreg.ForwardZoneDelegation.name: (fields.W905) django.contrib.postgres.fields.CICharField is deprecated. Support for it (except in historical migrations) will be removed in Django 5.1.
HINT: Use CharField(db_collation="…") with a case-insensitive non-deterministic collation instead.
mreg.Host.contact: (fields.W906) django.contrib.postgres.fields.CIEmailField is deprecated. Support for it (except in historical migrations) will be removed in Django 5.1.
HINT: Use EmailField(db_collation="…") with a case-insensitive non-deterministic collation instead.
mreg.Host.name: (fields.W905) django.contrib.postgres.fields.CICharField is deprecated. Support for it (except in historical migrations) will be removed in Django 5.1.
HINT: Use CharField(db_collation="…") with a case-insensitive non-deterministic collation instead.
mreg.HostGroup.name: (fields.W905) django.contrib.postgres.fields.CICharField is deprecated. Support for it (except in historical migrations) will be removed in Django 5.1.
HINT: Use CharField(db_collation="…") with a case-insensitive non-deterministic collation instead.
mreg.Label.name: (fields.W905) django.contrib.postgres.fields.CICharField is deprecated. Support for it (except in historical migrations) will be removed in Django 5.1.
HINT: Use CharField(db_collation="…") with a case-insensitive non-deterministic collation instead.
mreg.Mx.mx: (fields.W905) django.contrib.postgres.fields.CICharField is deprecated. Support for it (except in historical migrations) will be removed in Django 5.1.
HINT: Use CharField(db_collation="…") with a case-insensitive non-deterministic collation instead.
mreg.NameServer.name: (fields.W905) django.contrib.postgres.fields.CICharField is deprecated. Support for it (except in historical migrations) will be removed in Django 5.1.
HINT: Use CharField(db_collation="…") with a case-insensitive non-deterministic collation instead.
mreg.Naptr.replacement: (fields.W905) django.contrib.postgres.fields.CICharField is deprecated. Support for it (except in historical migrations) will be removed in Django 5.1.
HINT: Use CharField(db_collation="…") with a case-insensitive non-deterministic collation instead.
mreg.Naptr.service: (fields.W905) django.contrib.postgres.fields.CICharField is deprecated. Support for it (except in historical migrations) will be removed in Django 5.1.
HINT: Use CharField(db_collation="…") with a case-insensitive non-deterministic collation instead.
mreg.ReverseZone.email: (fields.W906) django.contrib.postgres.fields.CIEmailField is deprecated. Support for it (except in historical migrations) will be removed in Django 5.1.
HINT: Use EmailField(db_collation="…") with a case-insensitive non-deterministic collation instead.
mreg.ReverseZone.name: (fields.W905) django.contrib.postgres.fields.CICharField is deprecated. Support for it (except in historical migrations) will be removed in Django 5.1.
HINT: Use CharField(db_collation="…") with a case-insensitive non-deterministic collation instead.
mreg.ReverseZone.primary_ns: (fields.W905) django.contrib.postgres.fields.CICharField is deprecated. Support for it (except in historical migrations) will be removed in Django 5.1.
HINT: Use CharField(db_collation="…") with a case-insensitive non-deterministic collation instead.
mreg.ReverseZoneDelegation.name: (fields.W905) django.contrib.postgres.fields.CICharField is deprecated. Support for it (except in historical migrations) will be removed in Django 5.1.
HINT: Use CharField(db_collation="…") with a case-insensitive non-deterministic collation instead.
mreg.Srv.name: (fields.W905) django.contrib.postgres.fields.CICharField is deprecated. Support for it (except in historical migrations) will be removed in Django 5.1.
HINT: Use CharField(db_collation="…") with a case-insensitive non-deterministic collation instead.

@terjekv
Copy link
Collaborator Author

terjekv commented May 29, 2023

One solution if we want to keep the idea itself (a case-insensitive field) is to migrate to using collations: https://adamj.eu/tech/2023/02/23/migrate-django-postgresql-ci-fields-case-insensitive-collation/

However, an equally valid solution would be to use case-sensitive fields and ensure that we use lower() or similar when we interact with the field. This would also allow us to use other database backends for testing (and production).

@terjekv
Copy link
Collaborator Author

terjekv commented May 31, 2023

Also note that we use CI-fields for email addresses:

email = pgfields.CIEmailField()
contact = pgfields.CIEmailField(blank=True)

However, https://datatracker.ietf.org/doc/html/rfc5321#section-2.4 states:

Verbs and argument values (e.g., "TO:" or "to:" in the RCPT command
and extension name keywords) are not case sensitive, with the sole
exception in this specification of a mailbox local-part (SMTP
Extensions may explicitly specify case-sensitive elements). That is,
a command verb, an argument value other than a mailbox local-part,
and free form text MAY be encoded in upper case, lower case, or any
mixture of upper and lower case with no impact on its meaning. The
local-part of a mailbox MUST BE treated as case sensitive.
Therefore, SMTP implementations MUST take care to preserve the case
of mailbox local-parts. In particular, for some hosts, the user
"smith" is different from the user "Smith".
However, exploiting the
case sensitivity of mailbox local-parts impedes interoperability and
is discouraged. Mailbox domains follow normal DNS rules and are
hence not case sensitive.

(My emphasis.)

@oyvindhagberg
Copy link
Contributor

It seems switching to case-sensitive fields is the cleanest solution. However, we must ensure that we don't break anything by doing so. Perhaps there was a reason a case-insensitive data type was chosen in the first place. But overall, it looks like these fields could be case-sensitive (like the email address).

@terjekv
Copy link
Collaborator Author

terjekv commented Jun 27, 2023

This came up again due to #492. To have django-filter automatically create filters and lookups for fields, one uses a structure such as:

class HostPolicyAtomFilterSet(rest_filters.FilterSet):
    class Meta:
        model = HostPolicyAtom
        fields = "__all__"

The fields = "__all__"-construct checks the field type and ensures that the appropriate lookups (exact, contains, gt, etc) are made available for the field. However, for our LCI-fields, this automation breaks, seemingly as these field types are not known to django-filter. Specifying the lookup types manually works:

class HostPolicyAtomFilterSet(rest_filters.FilterSet):
    class Meta:
        model = HostPolicyAtom
        fields = {
            'name': ['exact', 'regex', 'contains'],
        }

But, if we do this for one field, we have to do it for all fields in the model, so for this model:

class HostPolicyRole(HostPolicyComponent):

    name = LCICharField(max_length=64, unique=True, validators=[_validate_role_name])
    atoms = models.ManyToManyField(HostPolicyAtom, related_name='roles')
    hosts = models.ManyToManyField(Host, related_name='hostpolicyroles')
    labels = models.ManyToManyField(Label, blank=True, related_name='hostpolicyroles')

    class Meta:
        db_table = 'hostpolicy_role'
        ordering = ('name',)

We would have to specify something like:

class HostPolicyRoleFilterSet(rest_filters.FilterSet):
    class Meta:
        model = HostPolicyRole
        fields = {
            'name': ['exact', 'regex', 'contains'],
            'atoms__name': ['exact', 'regex', 'contains'],
            'hosts__name': ['exact', 'regex', 'contains'],
            'labels__name': ['exact', 'regex', 'contains'],
        }

If we want other filters for other fields in related models, we'd have to be explicit about them.

Moving to case sensitive fields would also avoid issues such as these (from https://adamj.eu/tech/2023/02/23/migrate-django-postgresql-ci-fields-case-insensitive-collation/)

Some of Django’s lookups, like icontains, map to a SQL LIKE. PostgreSQL does not allow such queries with nondeterministic collations—you will see this error if such a query is attempted:

terjekv added a commit to terjekv/mreg that referenced this issue Jun 27, 2023
  - Requires manual lookup declarations.
  - See unioslo#489 (comment)
@oyvindhagberg
Copy link
Contributor

Very good work here, @terjekv! Looks like we should move to case sensitive fields sooner rather than later.

@oyvindkolbu
Copy link
Collaborator

As nearly everything in DNS is case insenstive moving things to case sensitive fields will be painfull. The idea to use lower() in on strings before matches is much more work than you think.

Also the idea of case sensitive emails is just there in theory. No (sensible) server will allow both User@example.org and user@example.org. And remember that user@Example.org and user@example.org are equivalent as the domain part of an email address is case insensitive.

@terjekv
Copy link
Collaborator Author

terjekv commented Jun 28, 2023

The main solution for case-insensitive user input matching would be to inject a manager for the searches. Something down the route of this:

class CaseInsensitiveNameManager(models.Manager):
    def get_queryset(self):
        return super().get_queryset().annotate(
            lower_name=models.functions.Lower('name'),
        )

    def filter(self, *args, **kwargs):
        kwargs = {k.replace('name', 'lower_name'): v for k, v in kwargs.items()}
        return super().filter(*args, **kwargs)

I think that's the cleanest solution? For DNS we probably have to do something like this, unless we want to put the burden onto users, which would be cruel. Then again, this entire logic gets painful when doing IDNA in general, especially if we are realistically supporting ThaiURL or similar schemes. Meh.

For email fields, there is the annoyance that as much as case sensitive local parts are a theoretical possibility, it is in the RFC standard, so it feels pretty iffy to pretend it can't happen. Even parsing an email is painful, so I'm not sure what to do, unless we choose to allow the way the user inputs the email address to be used as-is, and let the email servers sort it out.

If we wish for the email entries to be authoritatively tied to a specific contact (ie, we want hostmaster@domain to have hostmaster@domain.com and no other email address), we should maybe consider adding a contact model and associate contacts to relevant entries and resolve emails from said contact.

terjekv added a commit to terjekv/mreg that referenced this issue Aug 18, 2023
  - Requires manual lookup declarations.
  - See unioslo#489 (comment)
terjekv added a commit to terjekv/mreg that referenced this issue Aug 28, 2023
  - Requires manual lookup declarations.
  - See unioslo#489 (comment)
terjekv added a commit to terjekv/mreg that referenced this issue Sep 18, 2023
  - Requires manual lookup declarations.
  - See unioslo#489 (comment)
terjekv added a commit to terjekv/mreg that referenced this issue Nov 29, 2023
  - Requires manual lookup declarations.
  - See unioslo#489 (comment)
terjekv added a commit to terjekv/mreg that referenced this issue Dec 5, 2023
  - Requires manual lookup declarations.
  - See unioslo#489 (comment)
pederhan pushed a commit that referenced this issue May 21, 2024
* Fixes #487 by implementing option 5 in the issue.

Supports: python3.7+, django 3.2+. Tests included for this range.
Breaking changes: None

This implementation uses django-filter as intended. It sets a default filter backend and uses standard filter_class models where possible.

Exceptions take place in the following views:

HostList. Relies on manipulating the queryset, and said manipulation is also used in the filter-less HostDetail.
*Zone*List. The abstraction model for these views is based around propagating a filterset into the parent class.
Concerns:

This patch creates explicit mapping filters for JSONField and CIDRField. It is unclear how well-tested these are.

* Update pyproject.toml fix package building (for tox)

  - Required as we intriduced the toml file for ruff.
  - Sets some default keys.
  - Make tox.ini use the lint and coverage environments.
  - Make tox runs that use pythons and django show versions for both.

* A few cleanup fixes.

  - Uses filterset_class correctly.
  - Applies the filter via MregMixin.
  - Test basic filtering in hostgroups.

* Hack support CI-fields.

  - Requires manual lookup declarations.
  - See #489 (comment)

* Rebase to master, clean up imports a bit.

* Cleanup after rebase.

* Remove unused variable.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants