SPIKE: using Google file picker with institution-only PDFs and Team Drive #632

Closed
2 tasks
klemay opened this issue May 21, 2019 · 10 comments
Labels: Google Drive (Issues with our Google Drive integration), Spike

Comments

@klemay
Contributor

klemay commented May 21, 2019

Note: this is being re-added to the "To Review" column on the Product backlog board because it was added to a sprint and then removed - we need to re-prioritize it among our other work.

User story:
As a teacher at Example University, I want to use the Hypothesis LMS app in Sakai to assign readings to my students. The PDFs for these readings reside in a Team Drive, and the university library mandates that these PDFs are set so that "anyone at Example University with the link can view," rather than "anyone with the link can view."

Currently, when a teacher selects a PDF from Google Drive, we set the privacy of that file to "anyone with the link can view." If you go back and manually change the privacy of the file to "anyone at [institution] with the link can view," launching the assignment will take you to a Drive login page.

We need to look into:

  • Does the Google Drive API allow for creation of a temporary public link which could be used by our app (the way the Canvas Files API currently works in our app)? If not, is it possible for our Picker to set the privacy of the link to "anyone at [institution] can view" by default?
  • What will it take to get our file picker to support Team Drives? (This might be helpful: https://developers.google.com/drive/api/v3/enable-shareddrives)
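On the second half of the first question: as far as I can tell, the Drive API v3 `permissions.create` call accepts a permission of type `domain`, which corresponds to "anyone at [institution] with the link can view". The sketch below only builds the request description; `buildSharePermissionRequest`, the `ShareScope` type, and the `example.edu` domain are hypothetical names used for illustration, and `supportsAllDrives=true` is included on the assumption that the file may live in a Shared drive.

```typescript
// Hypothetical helper: describes the Drive API v3 `permissions.create` request
// that shares a file either with anyone or with a single domain.
type ShareScope =
  | { kind: "anyone" }
  | { kind: "domain"; domain: string };

interface PermissionRequest {
  method: "POST";
  url: string;
  body: {
    role: "reader";
    type: "anyone" | "domain";
    domain?: string;
    // false means "with the link" rather than discoverable through search.
    allowFileDiscovery: boolean;
  };
}

function buildSharePermissionRequest(
  fileId: string,
  scope: ShareScope
): PermissionRequest {
  const url =
    "https://www.googleapis.com/drive/v3/files/" +
    encodeURIComponent(fileId) +
    "/permissions?supportsAllDrives=true";

  const body: PermissionRequest["body"] =
    scope.kind === "domain"
      ? {
          role: "reader",
          type: "domain",
          domain: scope.domain, // assumption: the institution's Google domain
          allowFileDiscovery: false,
        }
      : { role: "reader", type: "anyone", allowFileDiscovery: false };

  return { method: "POST", url, body };
}
```

Note that a domain-restricted link on its own would not avoid the Drive login page described above, since the viewer still has to be signed in to an account in that domain.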
@seanh
Collaborator

seanh commented May 22, 2019

@klemay Can we have a screenshot of what happens if a PDF does have "Downloading, printing, and copying" enabled, but is set to "anyone at Example University can view"?

@seanh
Collaborator

seanh commented May 22, 2019

I believe this is a screenshot of what happens if:

  1. The PDF's share link in Google is set to anyone with the link can view, but
  2. Disable options to download... is checked

[Screenshot: Screen Shot 2019-05-21 at 6 33 01 PM]

seanh added the Spike, Google Drive, and lms labels May 22, 2019
klemay self-assigned this Jun 11, 2019
@klemay
Contributor Author

klemay commented Jun 13, 2019

Ok, this might be helpful so documenting here.

I created an assignment using the Google Picker for a PDF in my drive that was not shared with anyone. After the assignment was created, that PDF's settings were changed to "Anyone with the link can view." I went back and manually set the PDF to "anyone at Hypothesis who has the link can view." When I went to re-launch the assignment, this is what I ran into:

[Screenshot: Screen Shot 2019-06-13 at 12 02 15 PM]

@klemay
Contributor Author

klemay commented Jun 13, 2019

Another potential complication - the partner institution is using a Shared drive for these files, which would require changes to our picker, it looks like:

https://developers.google.com/drive/api/v3/enable-shareddrives

klemay changed the title from "SPIKE: using Google file picker with institution-only PDFs" to "SPIKE: using Google file picker with institution-only PDFs and Team Drive" Jun 14, 2019
seanh removed the lms label Jul 10, 2019
@robertknight
Member

robertknight commented Aug 21, 2019

The solution I had in mind here is to make Google Drive files work similarly to Canvas files, with the difference that I don't think we are able to serve the PDF download URL through Via, so we will instead need to load the content into PDF.js another way.

A sketch of the changes this would involve:

  1. Change the Google Drive assignment selection flow so that it records the file ID and the fact that it came from Google Drive in the LTI launch URL, but doesn't make the file publicly accessible. In other words, it would generate an LTI launch URL similar to the one it generates for Canvas files.

  2. Change the assignment launch flow for Google Drive files so that instead of directly rendering a PDF in Via, it instead:

    • Gets the user's authorization to access their files in Google Drive
    • Fetches the file content
    • Serves the file content inside PDF.js, eg. using a blob URL or by passing a data buffer directly to PDF.js

    This would involve changes to the BasicLtiLaunchApp mini-SPA in the client and we'd also need to either give Via the ability to load content from a file buffer (eg. transferred from the LMS app to Via via postMessage) or add PDF.js directly to the LMS app.

What this will allow is for any file that is shared with the student's Google account, eg. by being shared with "anyone at [institution]", to be used as an assignment without needing public sharing.

A completely different approach would be to fetch the file content at the time the teacher chooses the assignment, store it somewhere on our backend, and serve it when the student launches the assignment. This approach obviously comes with copyright complications since the content would end up, at least for a period of time, on our servers. For that reason I'm inclined to go with the first option outlined above.
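A minimal sketch of the launch-time fetch in the first option, assuming the student has already granted an OAuth access token with a Drive read scope. `fetchDriveFileBytes`, `driveMediaUrl`, and the injected `fetchImpl` are hypothetical names; the endpoint is the Drive API v3 `files.get` call with `alt=media`. Whether the browser can call it directly, or needs to go through the Google API JS client, is one of the things a spike would have to confirm.

```typescript
// Minimal shapes so the sketch does not depend on DOM typings.
interface MinimalResponse {
  ok: boolean;
  status: number;
  arrayBuffer(): Promise<ArrayBuffer>;
}

type FetchLike = (
  url: string,
  init: { headers: Record<string, string> }
) => Promise<MinimalResponse>;

// Drive API v3 download URL. `supportsAllDrives` covers files in Shared drives.
function driveMediaUrl(fileId: string): string {
  return (
    "https://www.googleapis.com/drive/v3/files/" +
    encodeURIComponent(fileId) +
    "?alt=media&supportsAllDrives=true"
  );
}

// Fetch the PDF bytes as the signed-in user. Nothing is stored on our servers.
async function fetchDriveFileBytes(
  fileId: string,
  accessToken: string,
  fetchImpl: FetchLike
): Promise<Uint8Array> {
  const response = await fetchImpl(driveMediaUrl(fileId), {
    headers: { Authorization: "Bearer " + accessToken },
  });
  if (!response.ok) {
    // For example a 403 or 404 when the file is not shared with this account.
    throw new Error("Drive download failed with status " + response.status);
  }
  return new Uint8Array(await response.arrayBuffer());
}
```

The resulting bytes could then be handed to PDF.js, eg. with `pdfjsLib.getDocument({ data: bytes })`, or wrapped in a `Blob` and exposed through `URL.createObjectURL`. Injecting `fetchImpl` is only there to keep the sketch testable; in the app it would be the browser's `fetch`.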

@klemay
Contributor Author

klemay commented Aug 21, 2019

@robertknight thanks for this - the implementation you've outlined makes sense. I'm going to close this spike and open an issue based on your comment so it can be prioritized among our other work.

@robertknight
Member

Hi @klemay - I still think it would be worth doing a spike based upon my suggestion above to check that it does actually work and that there aren't any major issues I've failed to consider.

@seanh
Collaborator

seanh commented Aug 22, 2019

FYI, I think we are going to need a standalone CORS proxy server for PDFs. Currently this is Via:

  • Google Drive and other file hosts don't include the necessary CORS headers to allow JS to download their PDFs, IIRC, so I don't think the blob or data URL thing will work. Even if it can work for some file hosts, we know it won't work for all hosts that we want to support, and it'd be nicer to have one solution for PDFs not two
  • The LMS app shouldn't download PDFs to its server and re-host them IMO. The LMS app would spend a long time on these downloads, and the browser would have to wait for the whole download to our hard drive to complete before it could start downloading the file from us. It would also require persistent storage attached to the LMS app's app servers (which, while possible, is not something any of our apps currently have), bring storage space issues and the need to delete older files, and raise copyright issues. It makes more sense to do Via-style proxying without persistence or caching on our servers. (This is why we changed Jon's original LTI app, which did download PDFs to its hard drive, to use Via instead.) We should just implement a better, faster, simpler version of Via for PDFs
  • A better Via for PDFs is also a first step towards a better Via generally, and one day we're gonna have to do something about the problem of Via

@robertknight
Member

> Google Drive and other file hosts don't include the necessary CORS headers to allow JS to download their PDFs, IIRC, so I don't think the blob or data URL thing will work.

I've just checked and this is not a problem for Google Drive. See Slack discussion. The Google API JS client is able to fetch file content from Google Drive files. It does so in an elaborate way that doesn't involve making a CORS request, but the end result is that the frontend app can fetch data from Google Drive without needing any server-side help from a proxy.

We will however still need a proxy for serving PDFs from arbitrary URLs entered in the "Enter URL" dialog in the file picker.
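Putting the two cases together, the launch code would need to pick a loading strategy per assignment. A rough sketch, where the `DocumentSource` shape, `chooseLoader`, and the proxy URL format are all hypothetical and not taken from the existing app or from Via:

```typescript
// Hypothetical description of what an assignment's launch URL recorded.
type DocumentSource =
  | { kind: "google_drive"; fileId: string }
  | { kind: "url"; url: string };

type Loader =
  // The browser fetches the content itself, as the signed-in Google user.
  | { strategy: "client_fetch"; fileId: string }
  // A server-side proxy fetches the PDF and serves it to our frontend.
  | { strategy: "proxy"; proxiedUrl: string };

function chooseLoader(source: DocumentSource, proxyBase: string): Loader {
  switch (source.kind) {
    case "google_drive":
      return { strategy: "client_fetch", fileId: source.fileId };
    case "url":
      // Assumption: the proxy takes the target URL as an encoded query value.
      return {
        strategy: "proxy",
        proxiedUrl: proxyBase + encodeURIComponent(source.url),
      };
  }
}
```

For example, `chooseLoader({ kind: "url", url: "https://example.com/paper.pdf" }, "https://proxy.example/pdf?url=")` would route through the proxy, while a `google_drive` source never touches our servers.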

@klemay
Contributor Author

klemay commented Aug 22, 2019

@robertknight sorry for getting a little over-eager with closing this issue. I've converted the new issue to a spike: #928
