
GitLab uses 'real name' as username (rather than 'shell name' or a user-specified name)
Closed, Resolved; Public

Description

Wikitech policy (used to?) states that the username should be your real name. In my case, that means my username is 'Merlijn van Deen' while my 'shell name' is 'valhallasw' - the nickname I use all over Wikimedia-space.

GitLab takes the LDAP username, removes spaces (and other characters?), and uses that as username: https://gitlab.wikimedia.org/MerlijnvanDeen

This has two issues:

  • This is not the username I want :-)
  • What happens with names that end up in the same simplified username?

Would it be possible to let people specify their own username? (I assume this is standard GitLab...).
An alternative could be to use the 'shell name' field, although a) that might still not be what people want, and b) probably requires significant modifications in the LDAP provider.

Event Timeline

brennen added subscribers: jbond, thcipriani, brennen.

For clarity: GitLab currently follows our Gerrit configuration here in using the LDAP CN value - which for most recently-created users is set to a "real name", but that's far from a universal in practice. The main difference is that projects can live under a user's namespace in GitLab, so it shows up in URLs.

Would it be possible to let people specify their own username? (I assume this is standard GitLab...).

Probably not, or at least not with a scope limited to GitLab. We'll definitely continue using developer accounts, not managing account creation separately through GitLab.

An alternative could be to use the 'shell name' field, although a) that might still not be what people want, and b) probably requires significant modifications in the LDAP provider.

This might be possible; I'm not actually entirely sure. Assuming it is, though, the question is probably whether most users would prefer it, and how consistent it is with other uses of the system. In general, I expect to use "Brennen Bearnes" to log in to most services around here, except for shell sessions, so I didn't find doing the same for GitLab jarring. I don't think we're entirely consistent about this, though, and I acknowledge the reasons people might prefer a shell username.

Anyhow, if we were going to change it, one complication is that we've already got a set of users (and projects) under the current scheme, and I'm not sure how practical it would be to migrate them.

cc: @jbond, @thcipriani

Change 714382 had a related patch set uploaded (by Jbond; author: John Bond):

[operations/gitlab-ansible@master] gitlab cas: update uid field to use uid not CN

https://gerrit.wikimedia.org/r/714382

An alternative could be to use the 'shell name' field, although a) that might still not be what people want, and b) probably requires significant modifications in the LDAP provider.

This might be possible; I'm not actually entirely sure. Assuming it is, though, the question is probably whether most users would prefer it, and how consistent it is with other uses of the system. In general, I expect to use "Brennen Bearnes" to log in to most services around here, except for shell sessions, so I didn't find doing the same for GitLab jarring. I don't think we're entirely consistent about this, though, and I acknowledge the reasons people might prefer a shell username.

This is fairly easy to change; however, I'm not sure how the change would affect current users.

It also uses the LDAP username poorly - I ended up with GergTisza, apparently. Forcing people to remember in which exact way GitLab distorts their name (and others' names, if they want to ping someone) seems like a rather annoying UX choice. And what happens if two people's names only differ in non-ASCII characters? What happens with usernames not written in Latin script?

I agree the shell name would be a more reasonable choice.
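
The collision concern raised above can be made concrete with a small sketch. The character rule here is an assumption chosen only because it reproduces the two usernames mentioned in this task ("MerlijnvanDeen", "GergTisza"); it is not GitLab's actual sanitization code, and `simplify_username` and `collisions` are hypothetical helper names.

```ruby
# Hypothetical approximation of the mangling described in this task: drop every
# character that is not an ASCII letter, digit, underscore, dot or hyphen.
# This is NOT GitLab's real implementation.
def simplify_username(display_name)
  display_name.gsub(/[^A-Za-z0-9_.-]/, "")
end

# Group display names by the simplified username they collapse to, and keep
# only the groups that contain more than one name (i.e. the collisions).
def collisions(display_names)
  display_names
    .group_by { |name| simplify_username(name) }
    .select { |_username, names| names.size > 1 }
end

puts simplify_username("Merlijn van Deen")
puts simplify_username("Gergő Tisza")
p collisions(["Gergő Tisza", "Gergé Tisza", "Merlijn van Deen"])
```

Under this assumed rule, two names that differ only in a non-ASCII character collapse to the same username, and a name written entirely in a non-Latin script would collapse to an empty string.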

brennen edited projects, added GitLab; removed GitLab (Initialization).

This is fairly easy to change; however, I'm not sure how the change would affect current users.

As mentioned on that patch, I tried it on the gitlab-test box and got an error to the effect that my e-mail was already taken, when logging in for a pre-existing user.

I experimented more with it today. Setting gitlab_rails['omniauth_auto_link_user'] = ["cas3"] got rid of the "Sign-in failed because Email has already been taken." error, and allowed me to log in as my existing user ("BrennenBearnes"). Docs for that setting here:

https://docs.gitlab.com/ee/integration/omniauth.html#automatically-link-existing-users-to-omniauth-users

auto_link_user seems to link accounts based on e-mail address, so maybe that would let us make the switch to uid for people newly logging in without disrupting existing accounts, and we could figure out renaming those separately.

However, after configuring a fresh installation at https://gitlab-test.wmcloud.org/ with the settings from the patch, I still wind up with a username of "BrennenBearnes" for a newly created user. Even if I also set "name_key" => 'uid' - the "name" field winds up getting shell name, but username still seems to be based on CN.

You can see this with a test user here:

At this point I'm not exactly sure where CN is even referenced.

Made some progress here, I think. Looks like nickname_key is the missing ingredient. Updated commit message on John's patch from above with more details.

I think we can likely apply that, which should unblock T288162, and then deal with moving existing usernames separately.
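
For readers unfamiliar with the settings named in the comments above, a gitlab.rb fragment along these lines shows where they would sit. This is a sketch, not the merged patch: the URL is a placeholder, only the keys quoted in this task are shown, and the option that switches the uid field itself (the subject of the patch title) is deliberately left out because its exact name is not given here.

```ruby
# /etc/gitlab/gitlab.rb (fragment) -- illustrative only, not the contents of
# change 714382.

# Link an incoming cas3 login to an existing GitLab user (setting quoted above).
gitlab_rails['omniauth_auto_link_user'] = ["cas3"]

gitlab_rails['omniauth_providers'] = [
  {
    "name" => "cas3",
    "args" => {
      "url" => "https://cas.example.org",  # placeholder, not the real CAS server
      # Per the comment above, nickname_key is what drives the GitLab username;
      # pointing it at the LDAP uid attribute yields the shell name.
      "nickname_key" => "uid"
    }
  }
]
```

Changes to gitlab.rb on an Omnibus installation take effect after `gitlab-ctl reconfigure`.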

Change 714382 merged by Brennen Bearnes:

[operations/gitlab-ansible@master] gitlab cas: uid instead of CN; add nickname_key

https://gerrit.wikimedia.org/r/714382

Mentioned in SAL (#wikimedia-releng) [2021-09-23T15:54:14Z] <brennen> gitlab1001: brief downtime to apply [[gerrit:714382|gitlab cas: uid instead of CN; add nickname_key]] for T288392

Mentioned in SAL (#wikimedia-operations) [2021-09-23T15:54:27Z] <brennen> gitlab1001: brief downtime to apply [[gerrit:714382|gitlab cas: uid instead of CN; add nickname_key]] for T288392

Mentioned in SAL (#wikimedia-operations) [2021-09-23T16:09:15Z] <brennen> gitlab1001: reverting [[gerrit:714382|gitlab cas: uid instead of CN; add nickname_key]] for T288392, as existing user logins are broken.

Change 723230 had a related patch set uploaded (by Brennen Bearnes; author: Brennen Bearnes):

[operations/gitlab-ansible@master] Revert "gitlab cas: uid instead of CN; add nickname_key"

https://gerrit.wikimedia.org/r/723230

No love on this one on gitlab1001. I get a 422 with "Sign-in failed because Email has already been taken." when logging in with an existing user.


Googling on that error again led to this issue: A user which is created in Gitlab after SAML sign in cannot sign in again:

The problems seems that "urn:oasis:names:tc:SAML:2.0:nameid-format:transient" changes from each SAML response and the value is stored as extern_uid. An existing user (same email address) with a different extern_uid as the result of the changing nameid cannot login after sign-up.

extern_uid seems like a hint. Googling for "gitlab extern_uid" finds a snippet called Update extern_uid help script, which points to an identities table.

# gitlab-ansible-test:
User.find_by_username("RandoMcRandomface").identities.first.extern_uid
=> "rando"

# gitlab-1001:
User.find_by_username("BrennenBearnes").identities.first.extern_uid
=> "Brennen Bearnes"

So I think the production configuration is storing CN (or cas:user) in identities.extern_uid, while both versions of the patch store uid, which is why my testing didn't surface this.

A double-check on this in SQL seems to confirm:

# Production (gitlab1001):
gitlabhq_production=# select extern_uid from identities;
     extern_uid      
---------------------
 Brennen Bearnes
 Ahmon Dancy
 Thcipriani
...

# gitlab-ansible-test:
gitlabhq_production=# select extern_uid from identities;
 extern_uid 
------------
 rando
 brennen
 jbond
 jelto
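
The failure mode described above can be illustrated with a toy model. It assumes, as a simplification, that sign-in looks up an identity by provider and extern_uid and otherwise tries to create a new user, and that e-mail addresses must be unique; it ignores omniauth_auto_link_user. The class and method names are hypothetical and none of this is GitLab code.

```ruby
# Toy model, not GitLab code: identities are matched on (provider, extern_uid),
# and e-mail addresses must be unique across users.
User = Struct.new(:username, :email, :identities)
Identity = Struct.new(:provider, :extern_uid)

class ToyDirectory
  def initialize
    @users = []
  end

  # Store a new user, refusing duplicate e-mail addresses.
  def add(user)
    if @users.any? { |u| u.email == user.email }
      raise ArgumentError, "Email has already been taken"
    end
    @users << user
    user
  end

  # Sign in: reuse the user whose identity matches, otherwise try to create one.
  def sign_in(provider:, extern_uid:, email:, username:)
    existing = @users.find do |u|
      u.identities.any? { |i| i.provider == provider && i.extern_uid == extern_uid }
    end
    existing || add(User.new(username, email, [Identity.new(provider, extern_uid)]))
  end
end

dir = ToyDirectory.new
# First login under the old configuration: extern_uid holds the CN.
user = dir.sign_in(provider: "cas3", extern_uid: "Brennen Bearnes",
                   email: "b@example.org", username: "BrennenBearnes")

# After the patch the provider sends the uid, so no identity matches and the
# attempt to create a second user with the same e-mail fails.
begin
  dir.sign_in(provider: "cas3", extern_uid: "brennen",
              email: "b@example.org", username: "brennen")
rescue ArgumentError => e
  puts "Sign-in failed because #{e.message}."
end

# Rewriting the stored extern_uid makes the same login resolve to the old user.
user.identities.first.extern_uid = "brennen"
same = dir.sign_in(provider: "cas3", extern_uid: "brennen",
                   email: "b@example.org", username: "brennen")
puts same.equal?(user)
```

In this model, updating the stored extern_uid is enough to reconnect an existing account, which is the hypothesis the plan below sets out to test.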

I'm going to:

  • reset gitlab-ansible-test data
  • apply the production version of the config there
  • log in
  • confirm that the login breaks if I apply the patch
  • see if the login can be made to work by updating extern_uid in the database

If it does, then I can probably write a script to do updates for all existing users in production and re-apply the patch.

It also looks like extern_uid can be modified through the API, which may be safer if the operation is supposed to have any side-effects (though something about that being a simple field while there can be many identities gives me pause):

https://docs.gitlab.com/ee/api/users.html#user-modification

Seems like this works. Applied production config, logged in as "Rando McRandomface", applied patch, login was broken, ran the following:

#!/bin/bash
curl -s --header "PRIVATE-TOKEN: $GITLAB_TOKEN" -X PUT "https://gitlab-test.wmcloud.org/api/v4/users/2" -d provider=cas3 -d extern_uid=rando

Login now works.

A script for resetting both extern_uid and username: set-usernames - a quick and dirty utility for changing usernames in bulk.

I'm getting the file of mappings from e-mail to shell uid by querying against LDAP on mwmaint1002 with a script like so:

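#!/usr/bin/perl

use strict;
use warnings;
use 5.10.0;

# Read one e-mail address per line; print "mail,uid" for each.
while (my $mail = <>) {
  chomp $mail;
  # When a uid is found, the command output supplies the trailing newline.
  print "$mail," . `ldapsearch -x mail=$mail | grep uid: | cut -f2 -d' '`;
  sleep 1;  # pause between LDAP queries
}

./query.pl < mails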

Where mails is just one address per line, queried from the GitLab API.
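
As a rough sketch of how the "mail,uid" output could feed the bulk update, the following builds the same kind of request as the curl call shown earlier, one per user. It is not the set-usernames script itself: `parse_mappings` and `update_request` are hypothetical helpers, the base URL is a placeholder, and nothing is sent over the network. The `provider` and `extern_uid` parameters come from the curl example; `username` is the field that the same Users API endpoint accepts for renaming.

```ruby
require "uri"

# Parse "mail,uid" lines as produced by the LDAP lookup. Lines with no uid
# (for example when the lookup found nothing) are skipped.
def parse_mappings(text)
  text.each_line.map(&:strip).reject(&:empty?).each_with_object({}) do |line, map|
    mail, uid = line.split(",", 2)
    map[mail] = uid unless uid.nil? || uid.empty?
  end
end

# Build the URL and form body for PUT /api/v4/users/:id, setting both the
# identity's extern_uid and the username to the shell uid.
def update_request(base_url, user_id, uid, provider: "cas3")
  url = "#{base_url}/api/v4/users/#{user_id}"
  body = URI.encode_www_form(provider: provider, extern_uid: uid, username: uid)
  [url, body]
end

mappings = parse_mappings("rando@example.org,rando\nmissing@example.org,\n")
p mappings
url, body = update_request("https://gitlab.example.org", 2, mappings["rando@example.org"])
puts "PUT #{url}"
puts body
```

Keeping request construction separate from sending makes it possible to print and review every planned change before any account is touched.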

I tested against gitlab-test and found that projects even redirect correctly after a user is renamed. I'm planning to do this migration on Monday.

Change 724160 had a related patch set uploaded (by Brennen Bearnes; author: Brennen Bearnes):

[operations/gitlab-ansible@master] Revert "Revert "gitlab cas: uid instead of CN; add nickname_key""

https://gerrit.wikimedia.org/r/724160

Change 724160 abandoned by Brennen Bearnes:

[operations/gitlab-ansible@master] Revert "Revert "gitlab cas: uid instead of CN; add nickname_key""

Reason:

Original revert never merged.

https://gerrit.wikimedia.org/r/724160

Mentioned in SAL (#wikimedia-operations) [2021-09-27T20:06:08Z] <brennen> gitlab1001: ~1hr downtime to attempt migration of usernames to shell uid (T288392)

brennen claimed this task.

This looks to have worked. As an example, @Tgr is now at https://gitlab.wikimedia.org/tgr with an un-mangled display name of "Gergő Tisza". I *think* old URLs will redirect cleanly, though I'll double-check whether this holds for git remotes. Thanks everybody for your help and patience on this; I'll send mail to the usual places mentioning the migration.

Change 723230 abandoned by Brennen Bearnes:

[operations/gitlab-ansible@master] Revert "gitlab cas: uid instead of CN; add nickname_key"

Reason:

These changes now applied, revert not necessary.

https://gerrit.wikimedia.org/r/723230