
"Pages to date" not loading with "daily" metric
Closed, Resolved · Public · BUG REPORT

Description

Steps to replicate the issue (include links if applicable):

What happens?: I tried to view the number of pages and articles on Turkish Wikipedia separately with the "daily" granularity. Regardless of the time range selected ("1 month", "3 months", "6 months", "1 year", "2 years", "all time"), it gets stuck on the "loading metrics" screen. "Monthly" works fine; "daily" does not.

What should have happened instead?:

Software version (skip for WMF-hosted wikis like Wikipedia):

Other information (browser name/version, screenshots, etc.):

Event Timeline

Restricted Application added a subscriber: Aklapper.

I reproduced the error. My limited JavaScript skills tell me the problem doesn't come from loading the data but from somewhere later in the stack (I stepped through the code in the browser and could see the backend API call for the data succeed).

Yep, looks like an infinite recursive call :) I'll debug when we prioritize this.

For an ugly workaround, one could open the console and click on the URL the app tries to load, in this case: https://wikimedia.org/api/rest_v1/metrics/edited-pages/new/tr.wikipedia.org/all-editor-types/all-page-types/daily/1980010100/2022071100

Oh! Looking at my own workaround I realized another problem: the start date in all the calls above is 1980, even though in my example the Time Selector was set to "Last Two Years". So this is a bug in the Time Selector.
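To make the URL segments easier to read, here is a minimal sketch of how such a request could be assembled. The segment order and the YYYYMMDDHH timestamps are taken from the URL above; the helper names (toTimestamp, buildUrl, twoYearsBefore) are made up for illustration and are not Wikistats code.

```javascript
// Sketch only: the base path and segment order are copied from the URL quoted above.
const BASE = 'https://wikimedia.org/api/rest_v1/metrics/edited-pages/new';

// Format a Date as YYYYMMDDHH in UTC, matching the timestamps in that URL.
function toTimestamp(date) {
  const pad = (n) => String(n).padStart(2, '0');
  return (
    String(date.getUTCFullYear()) +
    pad(date.getUTCMonth() + 1) +
    pad(date.getUTCDate()) +
    pad(date.getUTCHours())
  );
}

// Join the segments in the same order as the quoted URL (hypothetical helper).
function buildUrl({ project, editorType, pageType, granularity, start, end }) {
  return [
    BASE,
    project,
    editorType,
    pageType,
    granularity,
    toTimestamp(start),
    toTimestamp(end),
  ].join('/');
}

// What a "Last Two Years" start date would be, relative to the end date.
function twoYearsBefore(end) {
  const start = new Date(end.getTime());
  start.setUTCFullYear(start.getUTCFullYear() - 2);
  return start;
}
```

With an end date of 2022-07-11, twoYearsBefore gives a start timestamp of 2020071100, whereas the request above was sent with 1980010100.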

Hi @Milimetric, can this API return just "article" pages, not all page types?


Yes, @Nevmit, the page type parameter controls what kind of pages you're getting stats for. So, data for year=2022:

Thank you @Milimetric. So, how long until the real problem is solved?

I'm not sure, @Nevmit. We triage again on July 25th, and if this is prioritized then, it could be fixed that week, but more likely the next at the earliest. So early August?

Aklapper added a subscriber: EChetty.

@EChetty: Please keep/add valid code project tags such as Data-Engineering-Wikistats which allow finding tasks related to code bases, not to end up in a big unmaintainable pile of only some-team-in-some-organization tasks. Thanks a lot. :)

Hi @Milimetric and @EChetty, I guess the problem is still not fixed.

I'm going to go out of process here and try to fix this. It's not really ok for a production bug to sit around this long.

Change 833065 had a related patch set uploaded (by Milimetric; author: Milimetric):

[analytics/wikistats2@master] Avoid infinite loop when loading pages-to-date

https://gerrit.wikimedia.org/r/833065

Change 833065 merged by Milimetric:

[analytics/wikistats2@master] Avoid infinite loop when loading pages-to-date

https://gerrit.wikimedia.org/r/833065

Mentioned in SAL (#wikimedia-analytics) [2022-09-19T22:28:06Z] <milimetric> Wikistats: improved build a little and deployed fix to T312717

For the record, this was an interesting bug. I'll describe here for anyone interested.

The code was doing something like (get all items).length === 0 to check whether the API returned no results. There's a perfectly simple way to just check the size, and doing so fixes the problem. But why, all of a sudden and with no code changes, was the "get all items" version failing? It was a "max stack depth exceeded" bug. Analysis showed that to return a result, crossfilter was sorting the items. It uses recursive quicksort to do so, and that was fine until recently. But this metric (pages to date) is "cumulative". That means we have to fetch data from the beginning of time and keep a tally up to the start date the user is interested in. With all that daily data, the array passed to quicksort grew every day. With the growth came ever deeper recursion until, I guess, it recently exceeded the max stack depth and stopped working.
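A rough sketch of the two emptiness checks described above, for illustration only. MetricStore, allSorted and size are hypothetical stand-ins, not crossfilter's API or the actual Wikistats code; the point is just that the first check sorts every row before looking at the length, while the second never sorts at all.

```javascript
// Hypothetical stand-in for a crossfilter-backed store; all names are illustrative.
class MetricStore {
  constructor(rows) {
    this.rows = rows;
    this.sortCalls = 0; // counts how often the expensive path runs
  }

  // Expensive: copies and sorts every row just to hand them back.
  allSorted() {
    this.sortCalls += 1;
    return [...this.rows].sort((a, b) => a.timestamp - b.timestamp);
  }

  // Cheap: no copying, no sorting, no recursion.
  size() {
    return this.rows.length;
  }
}

// The "(get all items).length === 0" style of check.
const isEmptySlow = (store) => store.allSorted().length === 0;

// The "just check the size" style of check.
const isEmptyFast = (store) => store.size() === 0;
```

Both checks give the same answer, but only the first one does work proportional to the amount of data, which is what made it sensitive to the ever-growing daily series.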

(Also, I vaguely remember from school that quicksort has bad performance on mostly sorted data, and this data was mostly sorted, so maybe that's a contributing factor to the deep stack)
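To illustrate the hunch above, here is a toy recursive quicksort that reports how deep it recursed. It uses a naive last-element pivot, which is an assumption made for the demonstration; crossfilter's own sort is different and may choose pivots more carefully, so this shows the general failure mode rather than the exact one.

```javascript
// Toy quicksort (Lomuto partition, last element as pivot). NOT crossfilter's code.
// Sorts arr in place and returns the deepest recursion level reached.
function quicksortDepth(arr, lo = 0, hi = arr.length - 1, depth = 1) {
  if (lo >= hi) return depth; // zero or one element: nothing left to sort

  const pivot = arr[hi];
  let i = lo;
  for (let j = lo; j < hi; j++) {
    if (arr[j] < pivot) {
      [arr[i], arr[j]] = [arr[j], arr[i]];
      i++;
    }
  }
  [arr[i], arr[hi]] = [arr[hi], arr[i]]; // move the pivot to its final position

  const left = quicksortDepth(arr, lo, i - 1, depth + 1);
  const right = quicksortDepth(arr, i + 1, hi, depth + 1);
  return Math.max(left, right);
}
```

On already sorted input every partition is maximally lopsided, so the depth of this toy version equals the number of items: one extra stack frame for every extra day of data. Input that is not sorted splits more evenly and stays shallower.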

Thank you for the troubleshooting and the description @Milimetric!

@Milimetric, it looks like you squashed this bug last year. Is that true? If it is, let's close the ticket.

Yes, there will be a bunch of these, especially in the in-betweens as we changed managers and task closing processes (our usual process was to just leave them open for managers to review).