A proper use of technical measures by authors and publishers

An article in the UK Times describes an attempt to create a technology called the "Automated Content Access Protocol", which would tag content so that search engines know what is intended to be done with it. While using such a technology is both appropriate and necessary, I believe it is inappropriate for legacy publishers to create yet another format when standards for this already exist. They should be publishing documentation on "best practices" and participating in the existing bodies working on this problem, not coming up with their own incompatible formats.

To tell robots whether they should or should not index their content, publishers should use the Robots Exclusion Standard. It documents two different methods: a file called "robots.txt" for site-wide settings, and information that can be put inside individual HTML files.
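As a rough illustration of the two methods, the sketch below uses Python's standard library to check a robots.txt rule and a robots meta tag. The rules, the page, the "ExampleBot" user agent and the example.com URLs are all made up for the example; a real robot would fetch both from the site it is visiting.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser

# Hypothetical site-wide rules: keep every robot out of /archive/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /archive/
"""

# Hypothetical page that opts out of indexing on its own.
PAGE_HTML = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in any <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta" and (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            self.directives.update(part.strip().lower() for part in content.split(","))


def site_allows(user_agent, url):
    """Site-wide method: ask the robots.txt rules whether the URL may be fetched."""
    rules = RobotFileParser()
    rules.parse(ROBOTS_TXT.splitlines())
    return rules.can_fetch(user_agent, url)


def page_allows_indexing(html):
    """Per-page method: a page is indexable unless its meta tag says noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" not in parser.directives
```

With these sample inputs, site_allows("ExampleBot", "https://example.com/archive/2006.html") is refused while a URL outside /archive/ is allowed, and page_allows_indexing(PAGE_HTML) reports that the page has opted out.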

While robots will follow these files, there is also an implied license for the "no membership required" part of the Internet. The exact terms of this implied license seem to be up for debate, but authors can indicate to robots and to people, in a variety of ways, that they want to offer more or fewer permissions.

If authors want to offer fewer permissions, they need to set up a system that requires a person to log in before accessing the content. Many existing software packages provide this type of feature, and many of them are FLOSS, so they are free to install. If the content isn't expensive, simple systems that require a cookie can be used. If the content is more expensive, making it more likely that an attacker will try to access it without paying, then cryptography can be used.
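To make the cookie idea concrete, here is a minimal sketch of a membership cookie that is signed so it cannot simply be forged. It is not taken from any particular package: the function names, the cookie layout and the way the secret is generated are assumptions for illustration, and a real system would also need a login step, expiry and HTTPS.

```python
import hashlib
import hmac
import secrets

# Assumption: a secret known only to the server. It is generated here for the
# demo; a real site would load a persistent key from its configuration.
SECRET_KEY = secrets.token_bytes(32)


def sign(member_id):
    """Return a hex HMAC-SHA256 signature of the member id."""
    return hmac.new(SECRET_KEY, member_id.encode("utf-8"), hashlib.sha256).hexdigest()


def issue_cookie(member_id):
    """Build the cookie value handed to a member after a successful login."""
    return f"{member_id}:{sign(member_id)}"


def verify_cookie(cookie):
    """Return the member id if the signature is valid, otherwise None."""
    member_id, separator, signature = cookie.rpartition(":")
    if not separator:
        return None
    # Compare as bytes and in constant time to avoid leaking timing information.
    if hmac.compare_digest(signature.encode("utf-8"), sign(member_id).encode("utf-8")):
        return member_id
    return None
```

A page handler would call verify_cookie on each request and serve the content only when it returns a member id. More expensive content would justify stronger measures than this sketch shows.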

Whether or not an author is using a "membership required" access system, there are various emerging metadata standards that allow authors to indicate exactly what copyright license they are offering their work under. Creative Commons currently documents license Embedding Specifications for a large number of file formats. While they document how to indicate a Creative Commons license in a standard way that can be automatically understood by software, the format is extensible and can be used to indicate other licensing options as well. A number of search engines already use this license embedding system to let people search by license option.
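As an illustration of how software can read an embedded license, the sketch below looks for links marked rel="license", one convention Creative Commons has documented for HTML pages. The sample page and the function name are made up for the example, and other file formats embed the license differently.

```python
from html.parser import HTMLParser

# Hypothetical page carrying a machine-readable license link.
SAMPLE_PAGE = (
    '<p>This article is available under a '
    '<a rel="license" href="https://creativecommons.org/licenses/by-sa/4.0/">'
    'CC BY-SA 4.0</a> license. See also the '
    '<a href="https://example.com/about">about page</a>.</p>'
)


class LicenseLinkParser(HTMLParser):
    """Collects the href of every <a> or <link> whose rel includes "license"."""

    def __init__(self):
        super().__init__()
        self.license_urls = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        href = attributes.get("href")
        if "license" in rel_values and href:
            self.license_urls.append(href)


def find_licenses(html):
    """Return the license URLs a page declares, in document order."""
    parser = LicenseLinkParser()
    parser.feed(html)
    return parser.license_urls
```

Because the license is identified by a URL rather than by a fixed list of names, the same check works for licenses other than the Creative Commons ones.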

