April 23, 2016

Gone In Six Characters: Short URLs Considered Harmful for Cloud Services

[This is a guest post by Vitaly Shmatikov, professor at Cornell Tech and once upon a time my adviser at the University of Texas at Austin. — Arvind Narayanan.]

TL;DR: short URLs produced by bit.ly, goo.gl, and similar services are so short that they can be scanned by brute force.  Our scan discovered a large number of Microsoft OneDrive accounts with private documents.  Many of these accounts are unlocked and allow anyone to inject malware that will be automatically downloaded to users’ devices.  We also discovered many driving directions that reveal sensitive information for identifiable individuals, including their visits to specialized medical facilities, prisons, and adult establishments.

URL shorteners such as bit.ly and goo.gl perform a straightforward task: they turn long URLs into short ones, consisting of a domain name followed by a 5-, 6-, or 7-character token.  This simple convenience feature turns out to have an unintended consequence.  The tokens are so short that the entire set of URLs can be scanned by brute force.  The actual, long URLs are thus effectively public and can be discovered by anyone with a little patience and a few machines at her disposal.
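To get a feel for why these tokens can be brute-forced, the following back-of-the-envelope sketch counts the possible tokens of each length. It assumes a 62-character alphabet (a-z, A-Z, 0-9) and a purely hypothetical query rate; neither figure is stated in this post.

```python
# Rough size of the short-URL token space.
# Assumption: tokens use a 62-character alphabet (a-z, A-Z, 0-9).
ALPHABET_SIZE = 62


def token_space(length: int, alphabet_size: int = ALPHABET_SIZE) -> int:
    """Number of distinct tokens of the given length."""
    return alphabet_size ** length


def days_to_scan(length: int, queries_per_second: float) -> float:
    """Days needed to try every token at a fixed, hypothetical query rate."""
    return token_space(length) / queries_per_second / 86_400


if __name__ == "__main__":
    for n in (5, 6, 7):
        # 10,000 queries/s is an illustrative rate, not a measured one.
        print(f"{n} chars: {token_space(n):,} tokens, "
              f"{days_to_scan(n, 10_000):,.1f} days at 10,000 queries/s")
```

Each extra character multiplies the search space by 62, which is why longer tokens quickly move exhaustive scanning out of reach.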

Today, we are releasing our study, 18 months in the making, of what URL shortening means for the security and privacy of cloud services.  We did not perform a comprehensive scan of all short URLs (as our analysis shows, such a scan would have been within the capabilities of a more powerful adversary), but we sampled enough to discover interesting information and draw important conclusions.  Our study focused on two cloud services that directly integrate URL shortening: Microsoft OneDrive cloud storage (formerly known as SkyDrive) and Google Maps.  In both cases, whenever a user wants to share a link to a document, folder, or map with another user, the service offers to generate a short URL – which, as we show, unintentionally makes the original URL public.

OneDrive.

OneDrive generates short URLs for documents and folders using the 1drv.ms domain.  This is a “branded short domain” operated by Bitly and uses the same tokens as bit.ly. Therefore, any scan of bit.ly short URLs automatically discovers 1drv.ms URLs.  In our sample scan of 100,000,000 bit.ly URLs with randomly chosen 6-character tokens, 42% resolved to actual URLs.  Of those, 19,524 URLs lead to OneDrive/SkyDrive files and folders, most of them live.  But this is just the beginning.
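As a rough illustration of what such a sample implies, the sketch below scales the sample counts quoted above to the full 6-character token space. This extrapolation is not from the study itself: it assumes a 62-character alphabet and that OneDrive links are spread uniformly across tokens, so treat the result as an order-of-magnitude guess only.

```python
# Scale sample counts to the full token space (illustrative estimate only).
SAMPLE_SIZE = 100_000_000   # random 6-character tokens scanned (from the post)
ONEDRIVE_HITS = 19_524      # sampled tokens leading to OneDrive/SkyDrive (from the post)
ALPHABET_SIZE = 62          # assumption: a-z, A-Z, 0-9
TOKEN_LENGTH = 6


def sample_fraction(sample_size: int, token_length: int,
                    alphabet_size: int = ALPHABET_SIZE) -> float:
    """Fraction of the whole token space covered by the sample."""
    return sample_size / alphabet_size ** token_length


def extrapolate(hits: int, sample_size: int, token_length: int,
                alphabet_size: int = ALPHABET_SIZE) -> float:
    """Expected hits in the full space, assuming hits are uniformly distributed."""
    return hits / sample_size * alphabet_size ** token_length


if __name__ == "__main__":
    frac = sample_fraction(SAMPLE_SIZE, TOKEN_LENGTH)
    est = extrapolate(ONEDRIVE_HITS, SAMPLE_SIZE, TOKEN_LENGTH)
    print(f"sample covers {frac:.3%} of the 6-character space")
    print(f"extrapolated OneDrive links in the full space: {est:,.0f}")
```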

OneDrive URLs have a predictable structure.  From the URL to a single shared document (“seed”), one can construct the root URL and automatically traverse the account, discovering all files and folders shared under the same capability as the seed document or without a capability. For example, suppose you obtain a short URL such as http://1drv.ms/1xNOWV7, which resolves to https://onedrive.live.com/?cid=48…48&id=48…48!115&ithint=folder,xlsx&authkey=!A..q4.  First parse the URL and extract the cid and authkey parameters.  Then, construct the root URL for the account as https://onedrive.live.com/?cid=48…48&authkey=!A...q4. From the root URL, it is easy to automatically discover URLs of other shared files and folders in the account (note: the following traversal methodology no longer works as of March 2016). To find individual files, parse the HTML code of the page and look for <a> elements with href attributes containing &app=, &v=, /download.aspx?, or /survey?. To find other folders, look for links that start with https://onedrive.live.com/ and contain the account’s cid.
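The "parse, then rebuild" step amounts to ordinary query-string manipulation. The minimal sketch below shows one way to do it; the function name root_url and the placeholder cid and authkey values are invented for illustration and are not taken from the study. As noted above, the traversal itself stopped working in March 2016.

```python
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit


def root_url(resolved_url: str) -> str:
    """Keep only the cid (and authkey, if present) from a resolved share URL."""
    parts = urlsplit(resolved_url)
    query = parse_qs(parts.query)
    params = {"cid": query["cid"][0]}
    # The authkey is the capability; links shared without one simply lack it.
    if "authkey" in query:
        params["authkey"] = query["authkey"][0]
    # safe="!" keeps the "!" characters readable instead of percent-encoding them.
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(params, safe="!"), ""))


if __name__ == "__main__":
    # Hypothetical placeholder values, not a real account.
    seed = ("https://onedrive.live.com/?cid=CID0&id=CID0!115"
            "&ithint=folder,xlsx&authkey=!AKEY")
    print(root_url(seed))
```

The point of the sketch is how little the construction requires: nothing in the root URL is secret beyond what the seed URL already reveals.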

The traversal-augmented scan yielded URLs to 227,276 publicly accessible OneDrive documents, including tens of thousands of PDF and Word files, spreadsheets, media files, and executable binaries.  A similar scan of 100,000,000 random 7-character bit.ly tokens yielded URLs to 1,105,146 publicly accessible OneDrive documents.  We did not download their contents, but just from the metadata it is obvious that many of them contain private or sensitive information.

Around 7% of the OneDrive folders discovered in this fashion allow writing.  This means that anyone who randomly scans bit.ly URLs will find thousands of unlocked OneDrive folders and can modify existing files in them or upload arbitrary content, potentially including malware.  Microsoft’s virus scanning for OneDrive accounts is trivial to evade (for example, it fails to discover even the test EICAR virus if the attacker goes to the trouble of compressing it).  Furthermore, OneDrive “synchronizes” account contents across the user’s OneDrive clients.  Therefore, the injected malware will be automatically downloaded to all of the user’s machines and devices running OneDrive.

Google Maps.

Before September 2015, short goo.gl/maps URLs used 5-character tokens.  Our sample random scan of these URLs yielded 23,965,718 live links, of which 10% were for maps with driving directions.  These include directions to and from many sensitive locations: clinics for specific diseases (including cancer and mental diseases), addiction treatment centers, abortion providers, correctional and juvenile detention facilities, payday and car-title lenders, gentlemen’s clubs, etc.  The endpoints of driving directions often contain enough information (e.g., addresses of single-family residences) to uniquely identify the individuals who requested the directions. For instance, when analyzing one such endpoint, we uncovered the address, full name, and age of a young woman who shared directions to a Planned Parenthood facility. Conversely, by starting from a residential address and mapping all addresses appearing as the endpoints of the directions to and from the initial address, one can create a map of who visited whom.

Fine-grained data associated with individual residential addresses can be used to infer interesting information about the residents. We conjecture that one of the most frequently occurring residential addresses in our sample is the residence of a geocaching enthusiast. He or she shared directions to hundreds of locations around Austin, Texas, as shown in the picture, many of them specified as GPS coordinates. We have been able to find some of these coordinates in a geocaching database.

It is also worth mentioning that there is a rich literature on inferring information about individuals from location data. For example, Crandall et al. inferred social ties between people based on their co-occurrence in a geographic location, Isaacman et al. inferred important places in people’s lives from location traces, and de Montjoye et al. observed that 95% of individuals can be uniquely identified given only 4 points in a high-resolution location dataset.

What happened when we told them.

We made several attempts to report the security and privacy risks of short OneDrive URLs to Microsoft’s Security Response Center (MSRC).  After an email exchange that lasted over two months, “Brian” informed us on August 1, 2015, that the ability to share documents via short URLs “appears by design” and “does not currently warrant an MSRC case.”  As of March of 2016, the URL shortening option is no longer available in the OneDrive interface, and the account traversal methodology described above no longer works.  After we contacted MSRC again, they denied that these changes had anything to do with our previous report and reiterated that the issues we discovered do not qualify as a security vulnerability.

As of this writing, all previously generated short OneDrive URLs remain vulnerable to scanning and malware injection.

We reported the privacy risks of short Google Maps URLs to the Google Security Team.  They responded immediately.  All newly generated goo.gl/maps URLs have 11- or 12-character tokens, and Google deployed defenses to limit the scanning of the existing URLs.

How cloud services should use URL shorteners.

Use longer tokens in short URLs.  Warn users that shortening a URL may expose the content behind the original URL to unintended third parties.  Use your own resolver and tokens, not bit.ly.  Detect and limit scanning, and consider techniques such as CAPTCHAs to separate human users from automated scanners.  Finally, design better APIs so that leakage of a single URL does not compromise every shared URL in the account.
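As a minimal sketch of the first recommendation, the snippet below generates longer tokens from a cryptographically secure random source and compares the resulting search space with that of short tokens. The 62-character alphabet, the default length of 12, and the function names are assumptions chosen for illustration, not a description of any particular service.

```python
import math
import secrets
import string

# Assumption: a 62-character alphabet (a-z, A-Z, 0-9).
ALPHABET = string.ascii_letters + string.digits


def new_token(length: int = 12) -> str:
    """Draw a random token using the OS-backed CSPRNG via the secrets module."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def entropy_bits(length: int, alphabet_size: int = len(ALPHABET)) -> float:
    """Bits of entropy in a uniformly random token of the given length."""
    return length * math.log2(alphabet_size)


if __name__ == "__main__":
    print("example token:", new_token())
    for n in (5, 6, 7, 12):
        print(f"{n:2d} chars: about {entropy_bits(n):.1f} bits")
```

Longer tokens raise the cost of scanning but do not turn a link into an access control mechanism, which is why the remaining recommendations (user warnings, scan detection, and APIs that limit the damage from a single leaked URL) still matter.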

Comments

  1. It would be interesting to look into the claims made with Google Photos Share URLs as well. I would have thought goo.gl URLs were “smartly” protected from crawlers as well. Now I wonder.

    http://www.theverge.com/2015/6/23/8830977/google-photos-security-public-url-privacy-protected

    • The short URLs that Google Photos uses for sharing have 16+ character tokens.

      • Danny Tuppeny says:

        This is one of the reasons I’m looking to move away from Google Photos. Previously we could securely share family albums with family based on Google Accounts, but now the whole thing relies on URLs being kept secret. They might be HTTPS, and they might be long, but once they’re out, they’re out. How do you share them? SMS? Email? Hangouts? Are they opened on shared devices? I expect better of Google; we could do secure sharing before, why can’t we now? :(

  2. Rui Marques says:

    That kind of attitude is what frustrates many white hats, leading them to do some sort of “don’t say I didn’t warn you” kind of thing. I know that would be what I would want to do. Great work, guys.

  3. If the problem was actually real and you really did make Microsoft and Google do something about it, thank you!

  4. Dubops Guy says:

    If you are sharing something private, common sense dictates that you use a one-time password with your URL. Don’t we all know by now that 5 characters and short URLs aren’t security?

  5. Greg Hall says:

    Why would I want a short URL in the first place? All the other person will do is click on it, so who cares if it’s not short?

    The only legit reason I can see for this is to keep it within the bounds of a tweet or a single SMS message, either of which can generally hold most complete URLs anyway.

    Maybe I’m missing something, but I’ve always felt that short URLs serve no real purpose. But clearly they are a security risk; it’s ridiculous for anyone to try to deny it.

  6. Braxton Jumper says:

    I would probably use the URL or verbal.

  7. Here’s a clear demonstration of the problem, concerning Droplr … http://soundly.me/a/droplr-drive-by/

  8. The reason they said it wasn’t a security vulnerability is because it isn’t. It’s an obscurity vulnerability. If you rely entirely on URL obfuscation to secure your data, you will be pwned. The traversal method certainly makes discovery easier, but if you require authentication to get to data, it doesn’t matter.

    That said, Microsoft and others should use this report to better inform users that sharing links does not protect their data.

    • I cannot speak to OneDrive specifically, but I have seen services (such as DropBox) suggesting that open but unhindered access is “share with people with whom I have shared the link.” If a user who uncritically follows your interface would rely on the security of something merely obscure, I do think it a security concern if it is not awfully obscure.

  9. ᖺᕦʟƿᙢᕦᘜᘢᖇᘴ says:

    One man’s entropy is another’s irony

  10. Public URLs are public, what a surprise. The only security issue is if someone understands short URLs as protection. They’re – like the name says – meant to shorten URLs, not for protection. If you want protected data, use crypto/passwords.
