Wednesday, 22 April 2015

How-to: Protect your web downloads. Not.


I started this under the banner “Web-accessible e-publications NOT as PDFs” in light of the membership products discussion that happened this morning. Delivering web-accessible e-publications (our bread-and-butter) that are NOT downloadable and NOT printable for certain classes of registered web users is as much of a minefield today as it was fifteen years ago, in a previous life.

This is all about balancing ease of access and customer service against the admin/IT cost of providing material via a robust delivery mechanism. Before you ask, I have no recommendation at this point.

Restricted PDFs
The PDF format was designed to be the universal, 'access-anywhere' format; the ubiquity of standalone readers and web-browser plugins for reading PDFs made it so.

We currently put our PDFs behind the login wall, relying on the user account profile to determine whether each one is a free download or has to be purchased through the shopping cart. We apply no further restrictions on opening or printing.

Depending on the web-browser, once the download link is available, clicking on it triggers the 'Save As' function, offers 'Open in Adobe Reader' or opens the file in a browser PDF plug-in.

In Adobe Reader and pretty much every browser PDF plugin, the PDF can be downloaded by pressing the 'save' button.

While the Adobe InDesign security options for PDF files include disabling, or setting a password for, opening and printing, RiPFAC doesn't apply PDF passwords or un-locking codes. They are a burden to administer, as you have to give the user the un-lock code - either unique to each document, which is a pain, or generic, which becomes pointless.

This sort of "protection" only works when all PDF viewers respect the settings - the three I use don't! In any event, PDF passwords are not that difficult to bypass. So it’s not fail-safe; it simply makes things a little more difficult for the casual pirate. That doesn't stop me screen-printing individual pages off a high-resolution display and putting the legible images in a Word document or another PDF.
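To show why the settings depend on the viewer's good manners: in the standard PDF security handler, print and copy restrictions are stored as a bit field (the /P entry of the encryption dictionary) that the viewer is expected to honour. The sketch below decodes that field. The bit positions are from the PDF specification as I understand it; the names PdfPermission, allowed_actions and the two "viewer" functions are my own, purely for illustration.

```python
from enum import IntFlag


class PdfPermission(IntFlag):
    """User-access bits of the /P entry (bit 1 is the lowest-order bit)."""
    PRINT = 1 << 2                # bit 3
    MODIFY = 1 << 3               # bit 4
    COPY = 1 << 4                 # bit 5
    ANNOTATE = 1 << 5             # bit 6
    FILL_FORMS = 1 << 8           # bit 9
    EXTRACT_ACCESSIBLE = 1 << 9   # bit 10
    ASSEMBLE = 1 << 10            # bit 11
    PRINT_HIGH_QUALITY = 1 << 11  # bit 12


def allowed_actions(p_value: int) -> set[str]:
    """Decode a signed 32-bit /P value into the names of the permitted actions."""
    bits = p_value & 0xFFFFFFFF  # view the signed value as an unsigned bit field
    return {perm.name for perm in PdfPermission if bits & perm.value}


def polite_viewer_may_print(p_value: int) -> bool:
    """A well-behaved viewer checks the flag before enabling its print button."""
    return "PRINT" in allowed_actions(p_value)


def careless_viewer_may_print(p_value: int) -> bool:
    """A viewer that ignores the flag (hypothetical): nothing in the file stops it."""
    return True
```

The point is that the restriction is a request to the viewer, not a lock on the content: once the pages can be rendered, printing is only withheld by software that chooses to withhold it.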

Flipping Book format
We use Flipping Book for 'look inside' previews on the RiP site currently, as per Amazon books.

We have further decided, for one of our affiliate organisations, to create Flipping Book versions of 15 whole documents, which we placed on hidden pages for its members.

We can set Flipping Book to disallow saving and printing; this relies on the Adobe Flash format. The alternative 'basic HTML' version for older web-browsers and those with Flash disabled - which includes lo-tech local authorities and paranoid IT departments - bypasses some of those settings, so it is clearly less than ideal. In addition, we would have to create full Flipping Book editions for all the resources in our catalogue.

Digital Rights Management
Our main user profile types fall into Partners/free downloads, or non-partners/purchased downloads.

The current Individual Subscribers are set up for purchased downloads and get an automatic 50% discount applied in the shopping cart (we are about to amend this for new Individual Subscribers).

On our main web site, each publication has its own page with a 'download' option (purchased or free, as per profile) and a 'purchase hardcopy' option.
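As a rough sketch of the profile rules just described (not the code our site actually runs), the decision behind the 'download' option could look like the following. The names Profile, DownloadOffer and download_offer are hypothetical; the only figures taken from the text are free downloads for Partners and the 50% Individual Subscriber discount.

```python
from dataclasses import dataclass
from decimal import Decimal
from enum import Enum


class Profile(Enum):
    PARTNER = "partner"
    NON_PARTNER = "non-partner"
    INDIVIDUAL_SUBSCRIBER = "individual-subscriber"


# Discount currently applied in the cart for Individual Subscribers.
SUBSCRIBER_DISCOUNT = Decimal("0.50")


@dataclass(frozen=True)
class DownloadOffer:
    free: bool       # True: show a direct download link
    price: Decimal   # amount to charge via the shopping cart otherwise


def download_offer(profile: Profile, list_price: Decimal) -> DownloadOffer:
    """Decide what the 'download' option on a publication page should offer."""
    if profile is Profile.PARTNER:
        return DownloadOffer(free=True, price=Decimal("0.00"))
    if profile is Profile.INDIVIDUAL_SUBSCRIBER:
        discounted = (list_price * (1 - SUBSCRIBER_DISCOUNT)).quantize(Decimal("0.01"))
        return DownloadOffer(free=False, price=discounted)
    # Non-partners pay the full list price.
    return DownloadOffer(free=False, price=list_price)
```

Note that everything here is decided before the file leaves the server; once the PDF has been downloaded, the profile no longer has any say in what happens to it.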

Should we wish to lock down Individual Subscriber access to 'read online only', with no download (free OR purchased) and no print option, we would need to choose from:
  • use a non-standard PDF viewer embedded in the site - probably built with Flash.
  • create Flipping Book versions for ALL documents, with the caveat above.
  • host ALL documents externally on a proprietary third-party Digital Book site such as Scribd.com (uses HPub format I think). However, third party sites like Scribd are commercial and charge for the service.
  • deploy another form of Digital Rights Management (DRM) to lock down documents, such as Adobe LiveCycle Rights Management server; you can design custom policies and set the expiration date. Once this policy is applied to the PDF, end users won't be able to open the PDF after the expiry date ('self-destruct' option). You can also limit the number of devices or limit the number of times it can be opened.
  • bodge together a JavaScript solution to "cover" the PDF pages with an opaque watermark after a set period. Not a perfect method; if JavaScript is turned off or the document is opened in a third-party viewer, the content will be hidden. Unfortunately many people have JavaScript turned off...
  • deploy a DRM solution that "phones home" to validate access (as well as other features above). 

We then get into cryptographic DRM with digital fingerprinting, certificates and the like. There are a number of them out there, none of them particularly cheap. Adobe has LiveCycle Policy Server, but it is very expensive and intended for governments and large corporations. Alternatives include FileOpen (which the British Library uses) and LockLizard.
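To make the "phones home" idea concrete, here is a minimal sketch of the kind of check a licence server might run each time a reader asks to open a protected document: expiry date, device limit and open count, as listed above. It is not the API of LiveCycle, FileOpen or LockLizard; Policy, UsageRecord and request_open are names invented for the example, and a real product would add encryption, authentication and tamper resistance on top.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class Policy:
    expires_on: date   # last day on which the document may be opened
    max_devices: int   # distinct devices allowed per user
    max_opens: int     # total number of opens allowed per user


@dataclass
class UsageRecord:
    """What the server remembers about one user and one document."""
    devices: set[str] = field(default_factory=set)
    opens: int = 0


def request_open(policy: Policy, usage: UsageRecord,
                 device_id: str, today: date) -> tuple[bool, str]:
    """Server-side decision made when the reader 'phones home'."""
    if today > policy.expires_on:
        return False, "expired"
    if device_id not in usage.devices and len(usage.devices) >= policy.max_devices:
        return False, "device limit reached"
    if usage.opens >= policy.max_opens:
        return False, "open limit reached"
    # Access granted: record the device and count this open.
    usage.devices.add(device_id)
    usage.opens += 1
    return True, "ok"
```

Every refusal in that function is also a potential support call, which is where much of the running cost of DRM comes from.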

User permissions added to resource pages
Should we add a third format of 'protected PDF' (alongside 'standard PDF' and Flipping Book preview), we would then need to add that to the RiP website, link it to the user account type, so that the user would see a link to the protected document - in place of, or in addition to, the 'purchase download' option.

If no downloads, then what?
Removing the download option entirely means having the content as soft copy in HTML text format. On our micro-sites, the HTML page IS the content. Highly searchable, highly discoverable. You then have to lay out and format it separately from the Adobe InDesign publication, supported by site stylesheets and templates.

Watch-lists, reading lists and in-site bookmarks might all be great features to add on to the user profile at this point, as per the DeepDyve journal aggregator.

While it is possible to 'protect' flat HTML pages from copy-paste, this involves various scripting technologies and plug-ins, many of which are easily circumvented or require a minimum version of web-browser with appropriate security permissions to work. Local Authority IT? See above...

Questions following...
Bear in mind the following:
  • Anyone who purchases or downloads our PDFs currently has no restriction on usage, copying, redistribution, or selecting text to copy and paste.
  • We have no way of knowing how many times a downloaded document is then passed around multiple readers. 
  • Other than the 5 documents for which we issued take-down notices against publicly accessible sites in the last 6 months, I see no further instances of our documents in the wild. This does not include Local Authority intranets, internal networks, sector forums or special interest sites or any site with content behind a login wall not scanned by the search engines.

The questions are as follows:
  • how much of a threat is unauthorised redistribution of our content? 
  • how much effort and money are we prepared to invest in an attempt to prevent it? 
  • how much inconvenience are we willing to put customers through, and will that cost us more in the long run?



Did I mention how much I hate the whole notion of DRM? How I hate the idea of paying to implement and run DRM? How much hassle you endure supporting a user base that hates DRM?



The boot is firmly on the other foot, however. We have valuable content that we want to protect, as that is what pays our salaries. Like I said at the beginning, I have no recommendation now, just as I had none fifteen years ago. Welcome to the digital age. RC


Image credit: Portcullis by Kevin King on imgbuddy.
