Proposed Standard
WD-countmethod-19980421
The proposed counting methodology has the following design goals:
The IAB MMTF has already published definitions of such terms as "ad requests" and "ad clicks" [IAB97]. These definitions have been invaluable to ensure that companies using the defined concepts and metrics for advertising are consistent. That document also states: "It is important to note that for true comparability to exist, we need to define both the concepts and the metrics themselves as well as the methodology sites should use to generate those metrics."[IAB97] This proposal is an attempt to define a standard methodology, so the resulting reports can be truly comparable. This proposal also attempts to address the objections raised in the Q&A section titled "Why are images not a comparable measure?"[IAB97]. In particular, by defining the methodology clearly, we hope to mitigate the effects of caching and other "environmental" factors on the comparability of the counts, resulting in a measure that is both implementable by ad networks, and potentially more accurate than ad requests.
For simplicity, this proposal only addresses measuring ads that use clickable image media, including GIFs, animated GIFs, and JPGs. These ads constitute the vast majority of the advertising on the Internet today. No attempt is made in this document to define a methodology that can measure HTML ads such as banner forms, Java ads, JavaScript ads, or embeddable media such as Shockwave; these may be defined in a later revision. Further, this proposal does not attempt to take into account client-side counting by offline browsers, or filtering of hits from non-human browsers such as spiders and robots. Finally, this proposal does not define the user action required to measure an ad download; in particular, it makes no distinction between an ad download that results from a user-initiated event and one that results from a timed refresh.
A valid ad impression may only be counted when an ad counter receives a request for an ad image from a browser. This image request must be the result of an IMG tag in the page HTML. In response, the ad counter must return a location redirect, specifying the location of a file or other program that will deliver the image media. The location redirect must take the following form:
302 Moved Temporarily
Location: http://www.site.com/ad.gif
Expires: Mon, 01 Jan 1990 00:00:00 GMT
Pragma: no-cache
Cache-Control: no-cache
A valid ad click may only be recorded when an ad counter receives a click request from a browser. This click request must be the result of a user clicking on an Anchor tag in the page HTML. In response, the ad counter must return a location redirect, specifying the location of the destination for the ad. The location redirect must take the following form:
302 Moved Temporarily
Location: http://www.advertiser.com/index.html
Expires: Mon, 01 Jan 1990 00:00:00 GMT
Pragma: no-cache
Cache-Control: no-cache
The response in both cases must be a location redirect - implementations which respond with a valid page or a status code other than 302 may NOT count a valid impression or click. The response may NOT contain a Last-Modified header, as it may encourage caching. Other headers, including Set-Cookie, may be used if desired.
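The rules above lend themselves to a mechanical check. The following Python sketch is illustrative only and is not part of the proposal: the function name redirect_problems and the constant REQUIRED_HEADERS are hypothetical, and the sketch assumes that the three cache-defeating headers must carry exactly the values shown in the sample responses. A real checker might reasonably accept any Expires date in the past.

```python
# Header values copied from the sample responses; treating them as exact
# requirements is an assumption of this sketch.
REQUIRED_HEADERS = {
    "expires": "Mon, 01 Jan 1990 00:00:00 GMT",
    "pragma": "no-cache",
    "cache-control": "no-cache",
}


def redirect_problems(status, headers):
    """Return the reasons a response does not match the required form.

    `headers` is a list of (name, value) pairs. An empty result means the
    response looks like a countable location redirect.
    """
    # HTTP header names are case-insensitive, so normalise them first.
    seen = {name.lower(): value.strip() for name, value in headers}
    problems = []
    if status != 302:
        problems.append("status is %d, not 302" % status)
    if not seen.get("location"):
        problems.append("missing Location header")
    for name, expected in REQUIRED_HEADERS.items():
        if seen.get(name) != expected:
            problems.append("missing or unexpected %s header" % name)
    if "last-modified" in seen:
        problems.append("Last-Modified header is not allowed")
    return problems
```

Additional headers such as Set-Cookie pass through this check untouched, which matches the allowance made in the text.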
The URLs that are used to make the IMG SRC and A HREF requests may take the form of any valid URL (see [RFC1738]). The URLs that are redirected to may also be any valid URL. If a single ad counter is used to count both impressions and clicks, obviously it needs to have a way to distinguish one request from the other. In many cases, the ad counter functionality will be included in a more full-featured ad server, which chooses the appropriate image, or determines the correct ad destination.
The methodology requires the ability to defeat caching on a location redirect, in order to count accurately and efficiently. There are several ways that caching can occur, most commonly either in the browser or in an intermediate proxy server. There are also several mechanisms for defeating caching, including response headers, and URL construction techniques. These and other issues are discussed in this section.
The mechanism chosen here to defeat caching is to use the headers "Expires", "Pragma", and "Cache-Control". These methods, taken together, should defeat most browser and proxy caches. In addition, the omission of the "Last-Modified" header should help prevent caching. The "Set-Cookie" header was also considered, as it often prevents proxy caches from storing documents, but it also has social implications that make its use often undesirable. One option that may have fewer implications is to set an invalid cookie; this may have the desired effect of defeating the cache, while not actually setting a cookie in the browser.
There have been reports of browser bugs causing incompatibilities with this counting methodology. The most common problem is that in most versions of Netscape, animated GIFs set to rotate continuously will only rotate once, and then stop. Some versions of Netscape may also continuously re-request animated GIFs. These problems must be confirmed, and workarounds discovered, if they exist.
Another method for defeating caching is to modify the request URL, either by constructing a new URL for each request, or masquerading as a CGI. For example, image or click request URLs may contain unique IDs, query strings, ISMAP modifiers, or other valid URL components (see [RFC1738]) designed to reduce caching. These techniques are not recommended by this methodology for two reasons. First, the cache-defeating headers should prevent the vast majority of caching, so these URL techniques should not have a material effect on the accuracy of the counting. Second, most ad delivery systems already have well-defined URL schemes for ad requests, and we do not wish to require everyone to change their tags.
The methodology has been designed to handle multiple cascading ad counters. For example, a small site may accept local advertising, but send the rest of its inventory to a larger ad network. The network, in turn, may accept advertising from a large advertiser who serves their own ads. In this case there are at least three ad counters that will be involved in each ad delivery. This methodology has been designed so that the numbers produced by the counters should be within 5% of one another.
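The 5% target can be checked once each counter in the cascade has reported its total. The text does not say what the 5% is measured against, so the Python sketch below makes an assumption: it compares the gap between the largest and smallest count with the largest count. The function name within_tolerance is hypothetical.

```python
def within_tolerance(counts, tolerance=0.05):
    """Return True if the smallest count is within `tolerance` of the largest.

    `counts` holds one non-negative total per ad counter in the cascade.
    Measuring the gap relative to the largest count is an assumption of
    this sketch, not something the methodology specifies.
    """
    if not counts:
        raise ValueError("need at least one count")
    highest = max(counts)
    if highest == 0:
        return True  # every counter recorded nothing, so they agree
    return (highest - min(counts)) / highest <= tolerance
```

For example, totals of 1000, 980, and 955 from a site, a network, and an advertiser differ by 4.5% of the largest total and would pass, while totals of 1000 and 940 would not.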
The HTTP 1.1 standard [RFC2068] recommends that servers responding to a request with a status 302 (location redirect) include an entity body that contains a short hypertext note with a hyperlink to the new URL. For maximum efficiency, and because nearly all browsers will automatically follow a 302, this method does not encourage an entity body in the 302 response. One open issue is whether or not a "Content-Length: 0" header should be included, so browsers do not wait for the entity body to be transmitted.
A suggestion has been made that the measurement would be slightly more accurate if the counting were performed after the redirect was known to have transmitted successfully. This, however, is much more difficult to implement with today's web servers, and would probably result in fewer sites counting this way. Also, because the redirect is a relatively small response, the likelihood of successful transmission is high, so the difference should be small.
<HTML><HEAD><TITLE>Ad Tester</TITLE></HEAD><BODY>
<H1>Ad Tester</H1>
<A HREF="http://ad.counter.com/cgi-bin/adcounter.cgi?DEST=http%3A%2F%2Fwww.advertiser.com%2Findex.html">
<IMG SRC="http://ad.counter.com/cgi-bin/adcounter.cgi?IMAGE=http%3A%2F%2Fwww.site.com%2Fad.gif"></A>
</BODY></HTML>
When this page is downloaded by a browser, the browser will make a subsequent request to "adcounter.cgi?IMAGE" (assuming images are turned on). This hypothetical adcounter.cgi program would retrieve the correct image URL (in this case, from the query string), record an ad impression, and issue the following response:
Status: 302 Moved Temporarily
Location: http://www.site.com/ad.gif
Expires: Mon, 01 Jan 1990 00:00:00 GMT
Pragma: no-cache
Cache-Control: no-cache
The browser will receive this response, and automatically retrieve "ad.gif" and display it. Note that "ad.gif" may in fact be retrieved from an intermediate proxy cache, or even the browser's own cache. However, because the redirect has been marked non-cachable, the next browser to retrieve the page will also make a call through to adcounter.cgi.
When the user clicks on the ad, the browser will make a request to "adcounter.cgi?DEST". In this case, the ad counter will retrieve the appropriate destination URL (in this case, from the query string again), record an ad click, and issue the following response:
Status: 302 Moved Temporarily
Location: http://www.advertiser.com/index.html
Expires: Mon, 01 Jan 1990 00:00:00 GMT
Pragma: no-cache
Cache-Control: no-cache
Again, the browser will receive the response and retrieve the proper destination page. Because the location redirect defeats caching, click counting should not be subject to cache discrepancies.
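The behavior of the hypothetical adcounter.cgi described above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name handle, the in-memory counts dictionary, and the 400 response for unrecognized or malformed requests are all assumptions of the sketch. Only the IMAGE and DEST parameters and the 302 response form come from the example.

```python
from urllib.parse import parse_qs

# In-memory tallies; a real counter would write to a log or database.
counts = {"impressions": 0, "clicks": 0}

NO_CACHE_HEADERS = (
    "Expires: Mon, 01 Jan 1990 00:00:00 GMT\n"
    "Pragma: no-cache\n"
    "Cache-Control: no-cache\n"
)

BAD_REQUEST = "Status: 400 Bad Request\n\n"  # assumption: not in the proposal


def handle(query_string):
    """Record an impression or a click and return the CGI redirect response."""
    params = parse_qs(query_string)  # also decodes the %-escaped target URL
    if "IMAGE" in params:
        kind, target = "impressions", params["IMAGE"][0]
    elif "DEST" in params:
        kind, target = "clicks", params["DEST"][0]
    else:
        return BAD_REQUEST  # neither kind of request, so nothing is counted
    if "\r" in target or "\n" in target:
        return BAD_REQUEST  # refuse targets that would inject extra headers
    counts[kind] += 1
    # No Last-Modified header and no entity body, only the redirect headers.
    return (
        "Status: 302 Moved Temporarily\n"
        "Location: " + target + "\n" + NO_CACHE_HEADERS + "\n"
    )
```

Calling handle("IMAGE=http%3A%2F%2Fwww.site.com%2Fad.gif") increments the impression tally and returns the first response shown above; the DEST form does the same for clicks. The count is recorded before the response is sent, in line with the earlier discussion of counting before transmission is confirmed.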
Note that multiple ad counters may be involved in counting ad impressions or clicks. The first ad counter, for example, may redirect to another ad counter with a response like this:
Status: 302 Moved Temporarily
Location: http://www.othercounter.com/cgi-bin/adcounter.cgi?IMAGE=http%3A%2F%2Fwww.site.com%2Fad.gif
Expires: Mon, 01 Jan 1990 00:00:00 GMT
Pragma: no-cache
Cache-Control: no-cache
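A Location value like the one above can be built by wrapping the final URL once per counter, escaping it each time. The Python sketch below assumes that every counter in the chain accepts its target in a query-string parameter, as the hypothetical adcounter.cgi does; real ad counters may use entirely different URL schemes. The function name chain_redirect is hypothetical.

```python
from urllib.parse import quote


def chain_redirect(counters, final_url, param="IMAGE"):
    """Wrap final_url in one counter URL per entry, outermost counter first.

    `counters` lists the counter script URLs in the order the browser will
    visit them. Each counter is assumed to read its target from `param`.
    """
    url = final_url
    # Work from the innermost counter outwards, escaping the URL each time
    # so that it survives as a single query-string value.
    for counter in reversed(counters):
        url = counter + "?" + param + "=" + quote(url, safe="")
    return url
```

With a single counter, "http://www.othercounter.com/cgi-bin/adcounter.cgi", and the final URL "http://www.site.com/ad.gif", this produces the Location value shown in the response above. With more counters the inner URLs are escaped more than once, and each counter removes one layer of escaping when it reads its query string.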