Page MenuHomePhabricator

Reverted edit tag applied to protection entries in page history
Closed, ResolvedPublicBUG REPORT

Description

Steps to Reproduce:

  1. Edit a wiki page
  2. Apply protection to that page
  3. Revert the edit made in 1

Actual Results:
The "reverted" tag is applied to both edit 1 and the protection entry in the page history.
When a page is protected, the page history shows a log of the protection along with an empty diff. If protection occurs between an edit and a revert, the page protection gets marked as "Reverted" despite the page still being protected. This is misleading and undesirable behavior.

Expected Results:
The history entry noting page protection should not be marked as reverted, only the prior edit.

Example:
You can see an example of the undesired behavior at this revision and surrounding history. There was a brief discussion about it on EnWikipedia's technical village pump.

Event Timeline

DannyS712 subscribed.

Likely affects all null revisions, which include protection and page moves
@Ostrzyciel any reason not to exclude all null revisions in general? If so, the fix should be pretty simply to avoid adding such tags in the future:

in RevertedTagUpdate::markAsReverted

$revLookup = MediaWikiServices::getInstance()->getRevisionLookup();
$revisionRecord = $revLookup->getRevisionById( $revisionId );
$priorRev = $revLookup->getPreviousRevision( $revisionRecord );
if ( $priorRev && $revisionRecord->hasSameContent( $priorRev ) ) {
	// Edit has the same content as the parent revision, therefore a null edit with nothing to revert, early return
	return;
}
// Continue to call ChangeTags::addTags

though removing the tag from existing null revisions that are marked as reverted would probably require scanning all revisions with that tag, which would be expensive for a script (and more expensive the longer it takes to be fixed)

Hmmm, haven't thought of that.

Thanks @DannyS712 for the suggestion, I think this would do, I'll make a patch for this soon.

I don't know about the script, I have zero experience running scripts on Wikipedia and I don't know what are the performance limits there. I can do some poking around with DB indexes to see if we can optimize something here.

Change 633701 had a related patch set uploaded (by Ostrzyciel; owner: Ostrzyciel):
[mediawiki/core@master] Reverted tag: do not mark null revisions as reverted

https://gerrit.wikimedia.org/r/633701

Change 633701 merged by jenkins-bot:
[mediawiki/core@master] Reverted tag: do not mark null revisions as reverted

https://gerrit.wikimedia.org/r/633701

Tagging user notice for change going forward
Should a maintenance script be added for removing the tag from old null revisions?

Should a maintenance script be added for removing the tag from old null revisions?

I probably won't have time to implement it, too many other matters right now, sorry, but I did some snooping on how it could be implemented. The problem is – enwiki's Special:Tags says there already over half a million reverted tags… which is a lot, wow. Of course the trouble would be to sift through these to obtain only null revisions marked with the reverted tag.

Finding revisions with the mw-reverted tag is simple, we just have to join with the change_tag and change_tag_def tables, these are all indexed of course. We can also filter the revisions by timestamp easily, but that still leaves us with half a million rows on enwiki alone. Unfortunately (for us) diff size is not saved in the revision table, so we can't simply filter null revisions. I can see two possible strategies for finding them:

  • Retrieve each revision marked with the reverted tag and then use RevisionStore's method for getting the previous revision, then compare their SHA1s. If they match, we've got a null revision. IIRC, this would mean a single DB query for each revision with the mw-reverted tag. I… don't see a simple way could batch these queries, apart from retrieving entire pages' histories, which would be probably worse.
  • Perform a join with the comment table and add a LIKE condition over the comment_text field. Protection and move revisions (I think) have some pre-defined comments that we could look for. This way the scan would be performed on the DB which would probably speed things up a little. Note: I haven't tested this and it may be a horrible idea.

For the user-notice entry in Tech News, is this accurate? If not, a rewritten or simpler/clearer draft would be appreciated :)

There was a problem with the [[Special:Tag|Change Tags]]. The software would apply the "{{int:Tag-mw-reverted}}" tag to any null edits such as page-protection changes that came after a reverted edit. This has now been fixed for new edits. [1]

For the user-notice entry in Tech News, is this accurate? If not, a rewritten or simpler/clearer draft would be appreciated :)

Looks good. :)

Just noting: @Tacsipacsi has helpfully rewritten the version in the draft to be more accurate (and especially thankfully, did so before I sent the 1st email to translators-l@!). That new version looks good to me, so I will leave it like that.
For future reference, it is always appreciated if people write out a draft entry for Tech News at the same time as they tag a task with User-notice. That saves us editors a major headache of trying to deduce what a task is all about. Thanks in advance! ;-)

@Quiddity Thanks for the ping, but that was actually T202989’s item (which I’ve been involved with), not this one’s.

Grah! It's been a long week. Thanks/sorry!