Set up data storage to collect loosely structured data from CI
Closed, Declined · Public

Description

The Release-Engineering-Team has decided to start collecting as much data as possible from CI, Code Health, Incidents, etc. To that end, we need to provision some storage for this data.

We don't expect the data to be very large because we will filter and summarize up-front rather than storing bulk logs or artifacts. I would expect the data to be quite a bit less than 10 gigabytes.

After a couple of discussions within the team, we are leaning towards using Elasticsearch for the back end, since its query capabilities should make it easy to retrieve the data in a usable format for reporting and analysis.
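
To illustrate what "filter and summarize up-front" could look like, here is a minimal sketch, not an agreed design. It reduces raw per-test results to one small summary document and serializes it in the newline-delimited JSON layout that the Elasticsearch bulk endpoint accepts. The field names, the `ci-summaries` index name, and both helper functions are hypothetical and only serve the example.

```python
import json
from collections import Counter


def summarize_build(job_name, build_id, test_results):
    """Reduce raw per-test results to one small summary document.

    `test_results` is assumed to be a list of dicts with a "status"
    ("pass" or "fail") and a "seconds" duration; this shape is hypothetical.
    """
    counts = Counter(result["status"] for result in test_results)
    return {
        "job": job_name,
        "build_id": build_id,
        "tests_total": len(test_results),
        "tests_passed": counts.get("pass", 0),
        "tests_failed": counts.get("fail", 0),
        "duration_seconds": round(sum(r["seconds"] for r in test_results), 3),
    }


def to_bulk_ndjson(index_name, docs):
    """Serialize documents as a bulk request body.

    Each document is preceded by an action line, and the body must end
    with a newline.
    """
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(doc, sort_keys=True))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    raw = [
        {"status": "pass", "seconds": 1.5},
        {"status": "fail", "seconds": 0.25},
        {"status": "pass", "seconds": 2.0},
    ]
    summary = summarize_build("example-job", 42, raw)
    print(to_bulk_ndjson("ci-summaries", [summary]), end="")
```

Only the summary is stored, so each build contributes a document of a few hundred bytes rather than its full logs, which is consistent with the size estimate above.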

Event Timeline

Restricted Application added a subscriber: Aklapper.

@EBernhardson: Would it be reasonable to store this data on the search cluster? We wanted to ask for your blessing first, so that we can avoid setting up a separate Elasticsearch cluster for this tiny use case. So the question is whether you think it is reasonable and won't be a burden on the Discovery-Search team.

For CI logs and test results, there is the older T78705, which has a lot of context. It is a subset of this task (T211904).

@mmodell: EBernhardson is out through the end of the year and really needs to weigh in on this.

@EBjune: Thanks for the heads-up. I definitely want EBernhardson to weigh in. This is just exploratory work and it can wait.

EBjune triaged this task as Medium priority (Jan 3 2019, 6:11 PM).
mmodell raised the priority of this task from Medium to High (Nov 4 2020, 6:27 PM).
thcipriani lowered the priority of this task from High to Low.

Closing this one again since we have no plan to work on it.