Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes #2900 - Checks slur regex to see if it is too permissive #3146

Merged
merged 16 commits into from
Jun 27, 2023

Conversation

ninanator
Copy link
Contributor

@ninanator ninanator commented Jun 16, 2023

WHAT

  • Adds validation to check if the slur regex is too permissive (i.e would match all text)
    • Adds two new error keys depending on whether the slur regex:
      • is too permissive (permissive_regex)
      • couldn't compile (invalid_regex)
  • Adds a validation check for the site name, since I saw we were checking the site description length but not the name length
    • Adds two new error keys:
      • site_name_length_overflow
      • site_name_required

lemmy-translations PR here: LemmyNet/lemmy-translations#65

PROOF

Updating site name

OLD - Attempting to clear site name

Old behavior for clearing site name; site now has name of empty string

(Your site now has a name of empty string, which seems not great).

NEW - Attempting to clear site name

New behavior for clearing site name; returns error

OLD - Site name is too long

Old behavior for site name too long; site name does not change

(Your change had no effect, which is confusing)

NEW - Site name is too long

New behavior for site name too long; returns error

Updating the site with different regex patterns

Screen.Recording.2023-06-17.at.3.39.04.PM.mov

…e along with small validation organization
@ninanator
Copy link
Contributor Author

@dessalines @Nutomic - It looks like my build is failing since I'm using features from 1.70.0 while Woodpecker is pinned to 1.67.0. I'm happy to adjust to what is available from 1.67.0, but just wanted to ask if you both had an opinion on what Rust version we should be using to develop locally?

Would you be OK with bumping to the 1.70.0 image if everything builds fine in the pipeline? I don't know if version upgrades in Rust are nerve-wracking or not, so I figured I'd ask. Thank you!

crates/api_common/src/utils/site_utils.rs Outdated Show resolved Hide resolved
crates/api_common/src/utils/site_utils.rs Outdated Show resolved Hide resolved
let icon = diesel_option_overwrite_to_url(&data.icon)?;
let banner = diesel_option_overwrite_to_url(&data.banner)?;
// Check that the slur regex compiles, and returns the regex if valid...
let slur_regex = build_and_check_regex(&local_site.slur_filter_regex.as_deref())?;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perfect.

crates/api_crud/src/site/create.rs Outdated Show resolved Hide resolved

if let Some(Some(desc)) = &description {
if let Some(desc) = &data.description {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing that scares me about this change, is that the diesel_option_overwrite function has a special case: it considers sending empty string as a set the DB item to null. In which case, this check shouldn't be run.

By removing that, it now runs these checks on empty strings, which might fail.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


if let Some(desc) = &data.description {
site_description_length_check(desc)?;
check_slurs_opt(&data.description, &slur_regex)?;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same concern as above, and this also should probably be changed to if let Some(Some(... .

I hope it won't fail on empty strings.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had started moving the validation portions into a function; based on this and your previous comments I'll just go ahead and do that! It'll let me write more targeted unit tests and show things should continue to work as expected 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// may match against any string text. To keep it simple, we'll match the regex
// against an innocuous string - a single number - which should help catch a regex
// that accidentally matches against all strings.
if regex.is_match("1") {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be problematic if there is ever a community that rejects numbers; however, I feel like this would work for the majority of cases. Not sure if there's a better way to check that wont get complicated quickly.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems fine, a slur filter which matches a single character is almost certainly wrong.


/// Checks the site name length, the limit as defined in the DB.
pub fn site_name_length_check(name: &str) -> LemmyResult<()> {
min_max_length_check(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed that the site name is required in the database, so I added a check to also validate a minimum site name length. Otherwise the changes are just moved from when it lived in site_utils.rs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, thanks.

check_slurs_opt(&edit_site.description, &slur_regex)?;
}

if let Some(listing_type) = &edit_site.default_post_listing_type {
Copy link
Contributor Author

@ninanator ninanator Jun 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this check also be done when creating a site?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its probably a good idea, for consistency, although lemmy-ui doesn't show nearly as many fields when creating the site, as for updating it.

let enabled_federation_with_private_instance = edit_site.federation_enabled == Some(true)
&& edit_site.private_instance.unwrap_or(private_instance);

if enabled_private_instance_with_federation || enabled_federation_with_private_instance {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here - should this check also be done when creating a site?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably a good idea for consistency.

federation_enabled: bool,
private_instance: bool,
edit_site: &EditSite,
) -> LemmyResult<()> {
Copy link
Contributor Author

@ninanator ninanator Jun 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Following comments are questions I have, that I can address in a subsequent PR:

Do we need a check to make sure that the site does exist (opposite of the local_site.site_setup in create)? Not immediately clear to me if this would also work as a way to create the initial site.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really necessary IMO, because it never sets that site_setup field in the form builder.

}
}

let invalid_payloads = [(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't add cases for functions from validation.rs because the unit tests in that file should cover those cases. Let me know if you think otherwise, and I can add more test cases here!

}

is_valid_body_field(&data.sidebar)?;
validate_create_payload(local_site.site_setup, local_site.slur_filter_regex, data)?;

let application_question = diesel_option_overwrite(&data.application_question);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to move this into the validation fn in the next PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good

@ninanator
Copy link
Contributor Author

ninanator commented Jun 17, 2023

@dessalines - Thank you for the initial feedback! I think this is ready to review again at your leisure. For additional proof, I've added some screenshots and a video of this working in the MR description. I also created a PR for the translations at LemmyNet/lemmy-translations#65.

@ninanator ninanator requested a review from dessalines June 17, 2023 21:05
@ninanator ninanator force-pushed the lemmy-2900-check-slur-regex branch from 9181f8f to 33eff7e Compare June 18, 2023 14:37
@@ -600,21 +606,22 @@ mod tests {

let read_comment_views_no_person = CommentQuery::builder()
.pool(pool)
.sort(Some(CommentSortType::Hot))
.sort(Some(CommentSortType::Old))
Copy link
Contributor Author

@ninanator ninanator Jun 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was the main change required to get the tests to not be flaky on both main and this branch. It seems like the hot ranking doesn't always update in time for the sort order Hot to pick the first comment on the post.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep that's perfectly fine,we should be using New or Old for sorting for the tests, to be predictable.

Copy link
Member

@dessalines dessalines left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great, and thank you for adding all these tests too!

}

is_valid_body_field(&data.sidebar)?;
validate_create_payload(local_site.site_setup, local_site.slur_filter_regex, data)?;

let application_question = diesel_option_overwrite(&data.application_question);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good

federation_enabled: bool,
private_instance: bool,
edit_site: &EditSite,
) -> LemmyResult<()> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really necessary IMO, because it never sets that site_setup field in the form builder.

check_slurs_opt(&edit_site.description, &slur_regex)?;
}

if let Some(listing_type) = &edit_site.default_post_listing_type {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its probably a good idea, for consistency, although lemmy-ui doesn't show nearly as many fields when creating the site, as for updating it.

let enabled_federation_with_private_instance = edit_site.federation_enabled == Some(true)
&& edit_site.private_instance.unwrap_or(private_instance);

if enabled_private_instance_with_federation || enabled_federation_with_private_instance {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably a good idea for consistency.

@@ -600,21 +606,22 @@ mod tests {

let read_comment_views_no_person = CommentQuery::builder()
.pool(pool)
.sort(Some(CommentSortType::Hot))
.sort(Some(CommentSortType::Old))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep that's perfectly fine,we should be using New or Old for sorting for the tests, to be predictable.


/// Checks the site name length, the limit as defined in the DB.
pub fn site_name_length_check(name: &str) -> LemmyResult<()> {
min_max_length_check(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, thanks.

@ninanator
Copy link
Contributor Author

@dessalines I did (what I think is) the last step and updated the submodules 🎉 Let me know if there's anything else I need to do! Thank you!

}
}
},
);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find it pretty confusing to loop through tests like this. Would be much clearer to have a helper function and call that explicitly with each set of params.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Nutomic Trying to understand your feedback here - are you saying you'd rather have a test per case, and call

match validate_update_payload(local_site_slur_filter_regex.clone(), false, true, edit_site)
{
  Ok(_) => {
    panic!(
    "Got Ok, but validation should have failed with error: {} for invalid_payloads.nth({})",
    expected_err, idx
  )
  }
  Err(error) => {
    assert!(
      error.message.eq(&Some(String::from(expected_err))),
      "Got Err {:?}, but should have failed with message: {} for invalid_payloads.nth({})",
      error.message,
      expected_err,
      idx
    )
  }
}

as the helper function?

Copy link
Member

@Nutomic Nutomic Jun 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that you write it out, it doesnt seem to make much difference. Anyway the errors are being logged so it should be fine. Just resolve the conflicts and we can merge.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had started cleaning up the validation some more in anticipation of putting up a second PR, but at this point I just pushed it all up. I also updated the tests so that there's now a reason included in the tuples, which should make it easier to understand each case and not need to make a separate test for each one.

@ninanator ninanator requested a review from Nutomic June 27, 2023 03:46
@Nutomic Nutomic enabled auto-merge (squash) June 27, 2023 09:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants