Discussion of automated spam detection techniques (formerly spam dashboard discussion) #2377
In this URL scheme, we could use `TypeaheadService.new.users(keyword)` (using this code: https://github.com/publiclab/plots2/blob/master/app/services/typeahead_service.rb#L15-L25)
Let's start with a single-keyword system, then follow up with a multiple-keyword system. The new rich search may automatically incorporate multiple keywords, so let's also see how that works. A sketch of the single-keyword version is below.
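A minimal sketch of what that could look like, assuming a hypothetical `spam#profiles` route and controller; only `TypeaheadService#users` comes from the linked code:

```ruby
# config/routes.rb -- hypothetical route for a single-keyword profile search
get 'spam/profiles/:keyword', to: 'spam#profiles'

# app/controllers/spam_controller.rb (sketch; controller and view are assumed)
class SpamController < ApplicationController
  def profiles
    # Reuse the existing typeahead lookup for one keyword;
    # TypeaheadService#users is the method linked above.
    @users = TypeaheadService.new.users(params[:keyword])
  end
end
```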
Not to derail, but to add more information: here are examples of non-offensive single words that are present in thousands of spam profiles on our site, but aren't sole criteria for banning a profile. Even working through a supportive interface such as publiclab.org/spam/profiles/keyword would still require a massive investment of manual time: payday
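To gauge that scale, a one-off query could count how common a single term is; this sketch assumes a `User` model with a plain-text `bio` column (model and column names are assumptions, not confirmed schema):

```ruby
# Count profiles whose bio mentions one of the example terms above.
keyword = 'payday'
matches = User.where('bio LIKE ?', "%#{keyword}%")
puts "#{matches.count} profiles mention #{keyword.inspect}"
```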
Adding to the list:
Could we achieve this through a route like
This is a little like what @skilfullycurled has recommended on keywords. Just flagging the similarity.
@steviepubliclab, I completely agree with you. Two thoughts:
Perhaps instead of seeing the keywords, we could just give a number or score? I think the difficult thing about seeing all of the keywords is that there could be a lot of them. We could also fold other factors into the score. Some that I've found that I think would be pretty easy catches (see the sketch below): a. Spam users have bios (we wouldn't know otherwise unless they post)
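A rough sketch of that scoring idea; the signals and weights here are illustrative placeholders, not tuned values:

```ruby
# Hypothetical spam score: a weighted sum of per-profile signals.
SPAM_KEYWORDS = %w[payday].freeze # extend with the keyword lists above

def spam_score(user)
  score = 0
  # (a) above: spam accounts tend to fill in a bio
  score += 1 if user.bio.present?
  # weight keyword hits in the bio more heavily than the bare bio signal
  score += 2 * SPAM_KEYWORDS.count { |kw| user.bio.to_s.downcase.include?(kw) }
  # further signals (link counts, posting behavior, ...) would be added here
  score
end
```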
Hi, this seems like a nice project that could potentially be part of GSoC. We could workshop how to develop minimal and more fleshed-out versions of this, ranging from simply adding some kind of "score" marker to a list of recent profiles, to a more complete system with bulk spamming tools, different filtering methods, and a more flexible "spam dashboard".
For a truly automated system, we would want to include a pretty rigorous section on identifying false positives and negatives, tuning the filter, manually evaluating the results, etc. On the other hand, just adding some filtering tools to support manual banning could be done faster, and would be a good stepping stone on the way to a more automated system, since the same methods could be used to "collect" batches for evaluation and, eventually, automated spam moderation. Another relatively easy filter would be Akismet integration, using WordPress's pretty sophisticated spam filter: https://github.com/jonahb/akismet/ So there are a number of different possible sub-projects to be considered and prioritized, which could make up a nice Summer of Code project. Thanks!
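For reference, that gem exposes roughly this interface (sketched from its README; the sample inputs and the idea of feeding a profile bio through a comment check are assumptions):

```ruby
require 'akismet'

# Placeholder inputs -- in the app these would come from the signup request.
api_key    = ENV['AKISMET_API_KEY']
user_ip    = '203.0.113.7'
user_agent = 'Mozilla/5.0'
bio_text   = 'Get fast payday loans now!'

client = Akismet::Client.new(api_key, 'https://publiclab.org')
# check returns a pair of booleans: [spam?, blatantly_spam?]
spam, blatant = client.check(user_ip, user_agent, type: 'comment', text: bio_text)
puts 'flag profile for review' if spam || blatant
```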
Benjamin, I wonder if you are working on a paper or anything from your analysis of spam? I really appreciated your points a) b) c) d) from your comment above, especially d), which I instantly recognized as a decisive pattern but would not have named on my own!
@barry, I recently took a very good class which inadvertently made me question the utility of the vast majority of papers. I think it was meant to make you more critical of research, but I took it a bit to heart. So no, I don't have any plans to write any papers at the moment. Admittedly, this is a bit of a problem, but I'll figure it out. As for spam in general, we'd need a novel angle, which could be some sort of examination of community-designed classification. I think this is an interesting area because right now, machine learning classifiers are developed in an all-purpose manner, the goal of which is ideally to create something perfectly predictive. As people consider the flaws and biases of machine learning, an interesting question (I think) is: what if these programs were designed and tuned by the community, and they were the ones who decided which biases and trade-offs they feel comfortable with?
Regarding your comment @jywarren, I'm now leaning heavily towards your idea of building some tools which help moderators do their job with less overhead instead of trying to remove the job entirely.
Yes, this would all be a part of the training process, but this brings up a good point which makes me think we need to take a step back: instead of asking how we can end the human cost of spam moderation, we should probably be asking where the pain points are. This is what is nice about the question @ebarry is asking here: she has a specific thing that would help her.

Classification may not lessen the amount of effort it promises to. The classifier will need to be continually re-evaluated to avoid The Parable of Google Flu, in which Google's flu predictions worked amazingly until people's search habits changed and then they didn't anymore. Ideally, when a person is a false positive, they would contact us, we'd un-ban them, and that would be the human manual evaluation. However, we do not email people if they are marked as spam (for good reason), so it would rely on the user logging in again to find out. That will work technically, but it won't really be a good "Welcome to Public Lab!" experience, because they probably won't think to log in until a lot of time has passed and they're wondering why they still haven't been approved. So that may not reduce the labor involved in moderation, since the determination of the classifier would need to be confirmed anyway.

My experience with Akismet is similar, but that was for personal email; I'm not sure how user-friendly it is on a WordPress site. But it does give a score, which is nice.
Agreed. This seems like a more prudent way to go.
Oh wow, this is a really interesting subject! I'd love to contribute to building this system 👍 |
@Uzay-G, I'm really glad you chimed in on this and the ML thread at #4660, because it made me realize that there are two parts to this.
Okay. The classification is back on! @Uzay-G, I'm going to ping you on #5450 and you can see if you're interested in that aspect (and it's okay if you're not). The nice thing about that issue is that you can get started and work on it whenever you want, however you want. You don't have to wait for planning or anything, because it's not integrated with the website; it's just a dataset from the website.
@jywarren @skilfullycurled This seems quite interesting! Would love to contribute to this ❤️ and take it up as part of my proposal. ✌️ Keep me in the loop! 😄 |
@skilfullycurled I have been a bit busy with GCI but I will be happy to work on this when I have time 👍 |
Hi all! I'm going to rename this to narrow in on automated spam detection techniques, as the bulk moderation and filtering UI work is complete or nearly complete in the spam2 project in #7885 -- thanks! |
@jywarren here, just stepping in to try to organize some of the ideas here. I see a few different "groupings" --

- having a 'users' tab as part of our spam dashboard with ban/approve etc., similar to other tabs -- COMPLETE in GSoC '20: "Spam Management Dashboard" Project Planning #7885

(3) above is very big and would need significant planning, UI, etc., but 1 is relatively simple and 2 is a medium-sized project. Thanks, all!
Please describe the problem (or idea)
This idea is about extending the Spam Page to assist moderators in banning spam profiles.
The current page at https://publiclab.org/spam/ is only visible to moderators, so here's a screenshot:
What the current spam page achieves is an efficient listing of content so that moderators can more quickly "ban" without multiple page reloads.
This idea is to add a tab for viewing profiles, and to allow searching within profile bio content via URL:
https://publiclab.org/spam/people/_keyword_
or
https://publiclab.org/spam/profiles/_keyword_
An outstanding question I have: most of my profile moderation depends on having more than one offensive keyword present. Can this sort of URL query take multiple keywords?
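One way it could, sketched below: a comma-separated list in a single URL segment, split in the controller. The route, the separator, and the AND semantics are all assumptions for discussion, as is a `User` model with a `bio` column:

```ruby
# config/routes.rb -- e.g. /spam/profiles/payday,loan
get 'spam/profiles/:keywords', to: 'spam#profiles'

# app/controllers/spam_controller.rb (sketch)
class SpamController < ApplicationController
  def profiles
    keywords = params[:keywords].to_s.split(',')
    # AND logic: every keyword must appear in the bio, matching the
    # "more than one offensive keyword" pattern described above.
    # Note: LIKE wildcards in the keywords are not escaped in this sketch.
    @users = keywords.reduce(User.all) do |scope, kw|
      scope.where('bio LIKE ?', "%#{kw}%")
    end
  end
end
```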
Thanks for thinking about this!!!