Caching vs Copyright

One of the top requests we get for NewsBreak is that we download and cache the web pages that the “Read More” links go to. The reason people want this is that many news feeds don’t provide the full story. They just give you a teaser. If you’re offline this means you can’t read the entire story without going back online. Annoying, I know, especially for those feeds that only give you like one line of the story.

So why don’t we do it? I’m glad you asked! Read on for the full story…
How a Newsfeed Works
Before I go into the details I’ll offer the world’s shortest lesson on RSS. No technical details here. Just the nuts and bolts of where the content of a news feed comes from. When you subscribe to a newsfeed you get a specific bunch of content. We, as the news reader, have no control over this content. The provider determines how much you get, whether images are included, etc.

So if the provider only wants you to read the first two lines, all we can provide you with are the first two lines. If the provider sends the entire article, we give you the entire article! Unfortunately, most providers only send one or two lines of a story and then include a link to the whole thing.
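For the curious, here is a rough sketch of what that looks like from a reader's side. It is written in Python purely for illustration; it is not NewsBreak's code, and the feed contents, the example.com URLs, and the read_items name are all made up. It assumes a plain RSS 2.0 feed where each item carries a title, a description, and a link.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed, invented for illustration; not any real provider's output.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example News</title>
    <link>https://example.com/</link>
    <description>Made-up feed for illustration</description>
    <item>
      <title>Teaser-only story</title>
      <link>https://example.com/stories/1</link>
      <description>Just the first line of the story.</description>
    </item>
  </channel>
</rss>"""


def read_items(feed_xml):
    """Return (title, text, link) for each item, exactly as the provider sent it."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        # Whatever the provider put in <description> is all the text we get:
        # a one-line teaser here, but it could just as well be the full article.
        text = item.findtext("description", default="")
        # The "Read More" target. The reader only displays it; it does not fetch it.
        link = item.findtext("link", default="")
        items.append((title, text, link))
    return items


for title, text, link in read_items(SAMPLE_FEED):
    print(title)
    print(text)
    print("Read More:", link)
```

The point of the sketch is that the reader never sees more than what sits inside each item. Getting the rest of the story means following the link to the provider's site.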

The Economics of Newsfeeds
Why do they do this? Another excellent question! The fact of the matter is that these sites typically make money when people actually visit them. Unique visits, click-throughs, ad clicks, and other typical “web money makers” provide the source site with income. If they send you the entire story, you never visit the site, and they never make their money! For many sites the news feed isn’t so much a service as it is a teaser to get you to visit the actual site!

The Trouble with Caching: Part I
This brings us to the problem. If we start going out to the links in a news feed and grabbing the entire page that each one leads to, we’ve just killed the potential income that the source site hoped to make. When you read the article offline, not only can this bypass their metrics (their means of measuring hits) but it may even eliminate the ads completely. Needless to say, the providers really, really, really don’t approve of this!

Technical Note: Yep…this is a simplification but the general point is valid. The sites want you to visit, not download their pages.

The Trouble with Caching: Part II
Here is the really ugly one though. Copyright. If we go out, grab an entire web page and copy it to your device, we’ve just duplicated copyrighted content.

Is this legal?
Maybe it is, but maybe it isn’t. To my knowledge this hasn’t been played out in court yet (not in the situation we are in, anyhow).

Do I want Ilium Software to be the company CNN, Disney, Time-Warner, or others send their lawyers after to resolve this issue?
Do I need to answer this one? 🙂

And this isn’t just a legal issue. We’re in the business of selling software. Copyright is a big deal to us. There is no way we are going to do anything that even comes close to violating the copyright of someone else. It goes against our core beliefs as individuals and as a company.

The Moral of the Story
For now, I don’t intend to put page caching on our list of new features for the next version of NewsBreak. Anything is possible of course and we never close the door completely on a good idea. At the same time, until we hear something that solidly invalidates the problems listed above, it just isn’t something we feel comfortable doing.

9 thoughts on “Caching vs Copyright”

  1. Johan

    Ok, so no page caching for now, but what about image caching? NewsBreak already grabs the images, just not when it downloads feeds. As long as you put it in as an option, most people would be happy. When is the next version of NewsBreak expected? Oh, and I read the post in NewsBreak!

  2. spmwinkel

    Image caching and ‘mark all as read’ when leaving a feed are the only two things on my wishlist, so with one of the two coming, I’m happy!

    “There is no way we are going to do anything that even comes close to violating the copyright of someone else.” Very good principle. :thumbsup:

  3. Alan

    There is absolutely no difference, technical, practical, legal, or moral between a ‘download’ and a ‘visit’. They are the exact same thing. Also, making a ‘copy’ of copyrighted material as part of the process of displaying it is what every browser does. You might as well suggest that looking at a copyrighted web page with a web browser is a legal gray area. The only difference between a ‘pre-fetched’ and a ‘downloaded when I click on the link’ page is that in the latter case the probability that I will look at the page is equal to one, and in the former case it is less than one. How can the probability of my looking at it or not affect the ethics of the situation? Just my opinion of course. Cheers.

  4. Marc Post author

    I wish that was all there was to it. Here are the two things that must be considered when talking about copyright:

    1) Will the user profit from the copying of the page?
    2) Will the owner of the page lose profit as a result of the copying?

    1: Will the User Profit?
    YOU won’t profit but WE will. If we cache webpages will we sell more software? We probably would. Thus, we would profit by copying the copyrighted work of another.

    2: Will the Owner Lose Profit?
    Very possibly, yes. If you are viewing pages offline with no direct internet connection, the chances of the site measuring your visit, getting an ad click, etc. all diminish. The site would lose profit as a result of our page caching.

    So just on those two points alone, page caching becomes a real danger for us. Whether or not other people do it is secondary to whether or not it is legal.

  5. Clinton Fitch

    Great explanation and post Marc. While I really like the idea of complete downloading of an RSS feed and caching it on my device, I must admit that it never occurred to me that there could be a copyright issue with the content.



  6. Phil

    Copyright notwithstanding, I would like to see the ability to mark a single selected article as unread! It is an easy way to keep articles for later review (at least until you refresh again).

    You can currently mark all as read or as unread. Reading it marks it as read automatically.

    Completing the object abstraction by offering a way to “unread” single articles would be most helpful to me.

  7. Marc Post author

    Already there! Tap and hold on an article in the Headline View (where you see all the headlines for one channel).

  8. Doug

    I think Marc raises some excellent points in this article. As an attorney and an online writer myself, I can say there is definitely a concern with someone accessing or using your work without you getting compensated appropriately.

    There is definitely a HUGE gray area in terms of what you can and cannot access from the web and how you access it. And we, the end users, are not the ones who will ultimately pay for a mistake in this area; it is Ilium and the developers who will be held responsible. While I would love to have fully downloaded articles, I would not ask Ilium to offer something like that with such legal implications. If they decided to offer it, I would presume that they had fully researched the law or entered into an agreement with various sites à la AvantGo.

