Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

✨ Track record/account delete and update data in subject status #2804

Open
wants to merge 3 commits into
base: ozone-account-events
Choose a base branch
from

Conversation

foysalit
Copy link
Contributor

@foysalit foysalit commented Sep 9, 2024

This PR updates moderation subject status with 3 new properties recordUpdatedAt recordDeletedAt and recordStatus. as account/record events are received, these help moderators query subject statuses with delete/update time range and record status.

type: 'string',
format: 'datetime',
description:
'Last update timestamp of the record the subject is associated with',
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like the language here is quite convoluted. would love suggestions to make it clearer.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps something along the lines of: "Timestamp at which the record or account was last updated."

@@ -11127,7 +11145,7 @@ export const schemaDict = {
type: 'object',
description:
'Logs account status related events on a repo subject. Normally captured by automod from the firehose and emitted to ozone for historical tracking.',
required: ['timestamp', 'status', 'active'],
required: ['timestamp', 'active'],
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: if this PR doesn't make it into the main branch, we'd wanna port this change over to the parent PR.

// When deactivated accounts are re-activated, we receive the event with just the active flag set to true
// so we want to make sure that the recordStatus is not set to an outdated value
if (currentStatus?.recordStatus !== 'active' && event.meta?.active) {
status.recordStatus = 'active'
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

default for recordStatus is null, may be we should fallback to that? although, doing this is maybe helpful if we want to see which accounts reactivated.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd recommend reflecting the exact hosting status here (where active is either null or 'active', slight preference for the latter), and then if we want to keep tract of reactivations we do so with a separate field such as reactivatedAt. That allows the status to not be dependent on past statuses, and if we want to know if something is active we don't need to check for multiple different values.


if (event.action === 'tools.ozone.moderation.defs#accountEvent') {
const status: Partial<ModerationSubjectStatusRow> = {
recordStatus: `${event.meta?.status}`,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on the types here, as written this could end-up containing the string 'undefined'.

}

if (event.meta?.tombstone) {
status.recordStatus = 'tombstoned'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see a couple values for recordStatus in here that aren't reflected in the lexicon.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(P.S. is this separate from 'deleted'? Historically we have used the term "tombstoned" in a couple different ways, and I'm not sure which one is meant here.)

Comment on lines +134 to +148
"recordUpdatedAt": {
"type": "string",
"format": "datetime",
"description": "Last update timestamp of the record the subject is associated with"
},
"recordDeletedAt": {
"type": "string",
"format": "datetime",
"description": "Timestamp referencing when the record the subject is associated with was deleted"
},
"recordStatus": {
"type": "string",
"description": "Status of the record the subject is associated with. Statuses are different when the subject references an account vs. a record",
"knownValues": ["takendown", "suspended", "deleted", "deactivated"]
},
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These names work alright if the subject is a record, but if the subject is a repository or chat message, using the term "record" here doesn't seem like the best fit. Would there be any issue with using something like subjectUpdatedAt/subjectDeletedAt instead?

Comment on lines +144 to +148
"recordStatus": {
"type": "string",
"description": "Status of the record the subject is associated with. Statuses are different when the subject references an account vs. a record",
"knownValues": ["takendown", "suspended", "deleted", "deactivated"]
},
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is quite a complex field, I think there are a few potential issues with it:

  1. It's called "record status" but it may contain the hosting status of an account.
  2. The hosting statuses of records and accounts are mixed together, which could make it seem like a record can be marked as takendown here, which isn't the case.
  3. Potential confusion between the ozone status and the hosting status: the terms "takendown" and "suspended" are also terms used by ozone overlapping with the takendown and suspendedUntil fields.

Here's one idea: we could combine the three new fields into something more like this (using typescript just to illustrate the concept):

type SubjectStatusView = {
  // ...
  hosting?: AccountHosting | RecordHosting
}

type AccountHosting = {
  $type: 'tools.ozone.moderation.defs#accountHosting',
  status: 'active' | 'takendown' | 'suspended' | 'deleted' | 'deactivated'
  createdAt: Date
  updatedAt: Date
  deletedAt?: Date
  suspendedAt?: Date
  deactivatedAt?: Date
  takendownAt?: Date
  reactivatedAt?: Date
}

type RecordHosting = {
  $type: 'tools.ozone.moderation.defs#recordHosting'
  status: 'active' | 'deleted'
  createdAt: Date
  updatedAt: Date
  deletedAt?: Date
}

This also gives us a way to make certain fields required together if we want, e.g. if hosting is present then hosting.createdAt can always be present too.

@@ -28,6 +28,31 @@
"format": "datetime",
"description": "Search subjects reviewed after a given timestamp"
},
"recordDeletedAfter": {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My qualm with these fields is just that they are named "record" when they really may apply to any subject, e.g. an account rather than a record.

Comment on lines +4 to +15
await db.schema
.alterTable('moderation_subject_status')
.addColumn('recordStatus', 'varchar')
.execute()
await db.schema
.alterTable('moderation_subject_status')
.addColumn('recordDeletedAt', 'varchar')
.execute()
await db.schema
.alterTable('moderation_subject_status')
.addColumn('recordUpdatedAt', 'varchar')
.execute()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd recommend distinguishing these as e.g. hostingStatus, hostingDeletedAt, etc.

type: 'string',
format: 'datetime',
description:
'Last update timestamp of the record the subject is associated with',
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps something along the lines of: "Timestamp at which the record or account was last updated."

Comment on lines +142 to +144
const timestamp = event.meta?.timestamp
? `${event.meta?.timestamp}`
: event.createdAt
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The types here are a little funky to wrangle. To avoid ending up with a string like 'true' here, could we go with something more explicit like:

Suggested change
const timestamp = event.meta?.timestamp
? `${event.meta?.timestamp}`
: event.createdAt
const timestamp = typeof event.meta?.timestamp === 'string'
? event.meta.timestamp
: event.createdAt

// When deactivated accounts are re-activated, we receive the event with just the active flag set to true
// so we want to make sure that the recordStatus is not set to an outdated value
if (currentStatus?.recordStatus !== 'active' && event.meta?.active) {
status.recordStatus = 'active'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd recommend reflecting the exact hosting status here (where active is either null or 'active', slight preference for the latter), and then if we want to keep tract of reactivations we do so with a separate field such as reactivatedAt. That allows the status to not be dependent on past statuses, and if we want to know if something is active we don't need to check for multiple different values.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants