Internet Advertising Technology Council
WD-adreport-19970505
Internet Advertising Report Format
IATC Working Draft WD-adreport-19970505
-
Latest version:
-
http://www.basswoodassoc.com/standards/WD-adreport.html
-
This version:
-
http://www.basswoodassoc.com/standards/WD-adreport.html-19970505
-
Author:
-
Tom Shields <tom.shields@basswoodassoc.com>
Status of this document
This is an IATC Working Draft for review by IATC members and other interested
parties. It is a draft document and may be updated, replaced or obsoleted
by other documents at any time. It is inappropriate to use IATC Working
Drafts as reference material or to cite them as other than "work in progress".
Note: since working drafts are subject to frequent change, you are
advised to reference the above URL, rather than the URLs for working drafts
themselves.
Abstract
This document presents a standard format for reporting internet ad performance
information. The format is extensible, portable, and easy to parse
into a variety of data stores. The proposal is motivated by the need
to capture and compare ad performance information independent of the software
used to perform the ad delivery.
Introduction
Advertisers and agencies who plan internet ad campaigns require ad performance
reports that are comparable across web sites. Because many web sites
use different ad delivery software, agencies are currently expending considerable
manual effort reconciling these reports. One fixed format report
is not enough, because different campaigns require reporting of different
kinds of information. This proposal defines a self-describing report
format that is extensible and flexible enough to report many kinds of ad
performance information, but simple and strict enough to be easily parsed
and converted into data stores for analysis and comparison.
This proposal is broken into two major sections. The first is a description
of the file format itself, with emphasis on the encoding and ordering of
information for easy parsing. The second part is a "taxonomy" of
standard headers, columns, and report templates, to ensure the utility
and comparability of the information.
The report format has the following design goals:
-
Simple to get started with one of several standard templates
-
Extensible to add new metadata, columns, and complete report templates
-
Expressive to handle a variety of data types
-
Complete to address the immediate needs of advertisers, agencies, and
ad delivery systems
-
Robust to handle character escaping issues
-
Efficient to provide varying levels of detail and aggregation
-
Self-describing to permit generic tools for analysis and conversion
This proposal makes no attempt to define the performance measures used
by advertisers, agencies, and ad delivery systems, except to provide for
the reporting of them. This proposal explicitly does not address
the issue of transmission of this data, for example from a web site to
an advertiser or agency. This format does not define presentation
information, and is designed to be readable, but not camera ready.
This work is distantly based on the W3C Extended
Log File Format [Hallam96] working draft.
Format
An advertising report file contains a sequence of lines containing
US-ASCII (ISO-8859-1) characters terminated by either the sequence LF or
CRLF. Report generators should follow the line termination convention for
the platform on which they are executed. Analyzers and converters must
accept either form. Each line may contain either a directive or
an entry. Directives record metadata about the report. Entries consist
of a sequence of fields relating to an ad event or series of events.
Fields are units of information, separated by whitespace.
Report file parsers should be tolerant of errors. If an entry or directive
contains corrupt data or is terminated unexpectedly the parser should resynchronize
using the end of line marker and continue to parse the following lines.
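The line handling described above can be sketched as follows. This is a minimal illustration, not a reference implementation; the function name classify_lines is hypothetical. It accepts both LF and CRLF terminators, skips blank lines, and treats every line as an independent unit so that a damaged line never affects the lines after it.

```python
def classify_lines(text):
    """Return a list of (kind, line) pairs, where kind is 'directive' or 'entry'."""
    result = []
    # Splitting on LF and stripping a trailing CR accepts both LF and CRLF.
    for raw in text.split("\n"):
        line = raw.rstrip("\r")
        if not line.strip():
            continue  # blank lines carry no fields and are skipped
        # Only a leading '#' marks a directive; a quoted first field such as
        # "#1" begins with a double quote and is therefore an entry.
        kind = "directive" if line.startswith("#") else "entry"
        result.append((kind, line))
    return result


if __name__ == "__main__":
    sample = '#IARF: Version=1.0\r\n1997-04-01 "Ford Explorer" "Sports section" 10253 843\n'
    for kind, line in classify_lines(sample):
        print(kind, line)
```

Because each line is classified on its own, resynchronizing after an error amounts to moving on to the next element of the list.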
Directives
Directives are lines beginning with the '#' character. No whitespace
is permitted between the '#' and the directive name. Directive data takes
the form of attribute value pairs. Defined attributes may occur in
any order. If an attribute occurs more than once, the last occurrence
is taken as the proper value. Parsers should ignore unknown directives,
unknown attributes, and directives with parse errors. Every conforming
report file must begin with the directive IARF, and contain other required
directives as described below.
<directive> = "#" <name> ":" *<nvpair> <end-of-line>
<name> = *[ <alnum> | "-" ]
<nvpair> = <whitespace> <name> "=" <string>
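One possible reading of the directive grammar, sketched with regular expressions. The function name parse_directive is hypothetical, and two choices here are assumptions rather than requirements of this draft: attribute names are folded to lower case, and values are returned in raw form (quoted strings keep their quotes and escapes).

```python
import re

# '#', the directive name, ':', then the remainder of the line.
_DIRECTIVE = re.compile(r'^#([A-Za-z0-9-]+):(.*)$')
# Whitespace, attribute name, '=', then a quoted string or a run of
# characters containing neither whitespace nor a double quote.
_PAIR = re.compile(r'[ \t]+([A-Za-z0-9-]+)=("(?:[^"]|"")*"|[^ \t"]+)')


def parse_directive(line):
    """Return (name, attributes) or None if the line is not a usable directive."""
    match = _DIRECTIVE.match(line)
    if match is None:
        return None  # callers are expected to ignore such lines
    name, rest = match.group(1).lower(), match.group(2)
    attributes = {}
    for key, value in _PAIR.findall(rest):
        # A later occurrence overwrites an earlier one (last one wins).
        attributes[key.lower()] = value
    return name, attributes


if __name__ == "__main__":
    print(parse_directive('#Site: Name="Content Provider" Domain=www.content.com GMT-Offset=-8'))
```

Text that does not match the name=value shape is skipped rather than treated as fatal, in keeping with the advice that parsers ignore what they cannot interpret.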
Entries
Entries are analogous to database rows. An entry consists of a sequence
of fields, separated by whitespace. All lines not beginning with
'#' are considered entries. If the first field of an entry begins
with '#', the field must be surrounded by double quotes ("") to distinguish
it from a directive. Entries are not required in report files, although
a report file with no entries may not convey much information. If
an entry contains fewer than the required number of fields as defined by
the Fields directive, the entire entry should be ignored. If an entry
contains extra fields, the extra fields should be ignored. Blank
lines should be ignored, because by definition they will not have enough
fields in them. No line continuation character is defined - entries must
be completely contained on one line.
<entry> = [ <field> *[ <whitespace> <field> ]] <end-of-line>
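The field-count rules for entries might be applied as in the sketch below. The name split_entry is hypothetical, and the list of field names is assumed to come from the most recent Format directive. Fields are left in raw form; quoted strings are not unescaped here.

```python
import re

# A field is either a quoted string (with "" allowed inside) or a run of
# non-whitespace characters.
_FIELD = re.compile(r'"(?:[^"]|"")*"|[^ \t]+')


def split_entry(line, field_names):
    """Map field names to raw field text, or return None if the entry is too short."""
    tokens = _FIELD.findall(line)
    if len(tokens) < len(field_names):
        return None  # too few fields (including blank lines): ignore the entry
    # zip() stops at the shorter sequence, which drops any extra fields.
    return dict(zip(field_names, tokens))


if __name__ == "__main__":
    names = ["start-date", "ad-name", "placement", "impressions", "clicks"]
    print(split_entry('1997-04-01 "Ford Explorer" "Sports section" 10253 843', names))
```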
Fields
Fields are individual data entities analogous to database columns in a
row. Fields conform to one of the following data types, according
to the field definition. Fields without definitions should be considered
as data type <string>. Fields may not contain ASCII control
characters, or characters outside the US-ASCII set, unless they are encoded
as described below. Because fields are separated by whitespace, fields
must consist of at least one non-whitespace character.
<field> = <integer> | <fixed> | <uri> | <date> | <time> | <string>
<type-identifier> = "integer" | "fixed" | "uri" | "date" | "time" | "string"
<digit> = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9"
<lowalpha> = "a" | "b" | "c" | "d" | "e" | "f" | "g" | "h" | "i" |
"j" | "k" | "l" | "m" | "n" | "o" | "p" | "q" | "r" |
"s" | "t" | "u" | "v" | "w" | "x" | "y" | "z"
<hialpha> = "A" | "B" | "C" | "D" | "E" | "F" | "G" | "H" | "I" |
"J" | "K" | "L" | "M" | "N" | "O" | "P" | "Q" | "R" |
"S" | "T" | "U" | "V" | "W" | "X" | "Y" | "Z"
<punct> = "$" | "-" | "_" | "." | "+" | "!" | "*" | "(" | ")" |
"{" | "}" | "|" | "&" | "^" | "~" | "[" | "]" | "," |
"<" | ">" | "#" | "%" | ";" | "/" | "'" | "`" | "=" |
"?" | ":" | "@"
<reserved> = <"> | "\"
<wchar> = <tab> | <space>
<whitespace> = <wchar> *<wchar>
<alnum> = <lowalpha> | <hialpha> | <digit>
<version> = <digit> *<digit> "." <digit> *<digit> *[ "." <digit> *<digit> ]
Integer
<integer> = [ "-" ] <digit> *<digit>
Integers are represented as a sequence of digits, base 10.
Fixed
<fixed> = [ "-" ] <digit> *<digit> [ "." *<digit> ]
Fixed-point numbers are represented with at least one digit to the left of the
decimal point, and arbitrary precision to the right.
URI
A URI is as specified by [RFC1738];
relative URIs are as specified by [RFC1808].
URIs cannot by definition include whitespace or ASCII control characters,
therefore they do not need to be escaped or enclosed in quotes.
Date
<date> = 4*<digit> "-" 2*<digit> "-" 2*<digit>
Dates are recorded in the format YYYY-MM-DD where YYYY, MM and
DD stand for the numeric year, month and day respectively. All dates are
specified in GMT, unless the GMT-Offset attribute of the Site directive
is used. This format is chosen to assist collation.
Time
<time> = 2*<digit> ":" 2*<digit> [ ":" 2*<digit> [ "." *<digit> ]]
Times are recorded in the form HH:MM, HH:MM:SS or HH:MM:SS.S where HH is
the hour in 24 hour format, MM is minutes and SS is seconds. All times
are specified in GMT, unless the GMT-Offset attribute of the Site directive
is used.
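A sketch of how a date and time pair could be normalized to GMT. The function name parse_timestamp is hypothetical. The sign convention is an assumption, since this draft does not spell it out: the recorded local time is taken to equal GMT plus GMT-Offset hours, so GMT-Offset=-8 means the site clock runs eight hours behind GMT.

```python
from datetime import datetime, timedelta, timezone


def parse_timestamp(date_text, time_text="00:00:00", gmt_offset_hours=0):
    """Combine a <date> and a <time> field into a timezone-aware GMT datetime."""
    year, month, day = (int(part) for part in date_text.split("-"))
    parts = time_text.split(":")
    hour, minute = int(parts[0]), int(parts[1])
    # Seconds are optional and may carry a fractional part (HH:MM:SS.S).
    seconds = float(parts[2]) if len(parts) > 2 else 0.0
    site_zone = timezone(timedelta(hours=gmt_offset_hours))
    local = datetime(year, month, day, hour, minute, tzinfo=site_zone)
    local += timedelta(seconds=seconds)
    return local.astimezone(timezone.utc)


if __name__ == "__main__":
    print(parse_timestamp("1997-04-01", "12:00", gmt_offset_hours=-8))
```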
String
<string> = <wstring> | <qstring>
<wstring> = <alnum> *[ <alnum> | <punct> ]
<qstring> = <"> *<schar> <">
<schar> = <alnum> | <punct> | <wchar> | <xchar>
<xchar> = <""> | <xescape>
<xescape> = "\x" 2*<hexdigit>
<hexdigit> = <digit> | "A" | "B" | "C" | "D" | "E" | "F"
Strings are rendered in either unquoted or quoted/escaped form. Unquoted
form may only be used for strings that do not contain whitespace, reserved,
or control characters, and may be parsed by simply scanning for the end
of field separator (whitespace). Quoted strings must be parsed and
characters unescaped appropriately. Empty strings may be represented
by <"">.
The character escaping rules are designed to be easy to parse by a variety
of conversion tools, and expressive enough to encode any characters.
The basic rule is any character that is either a special character (backslash
or double quote) or not printable (such as control characters) may be encoded
using the form "\xFF", where FF stands for the two uppercase hexadecimal digits of the
character's code. There is a special case designed for legacy conversion
tools: the double quote may also be escaped by doubling the character,
and should be escaped in this fashion when possible.
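The escaping rules could be implemented along the following lines. The names quote and unquote are hypothetical. This sketch always emits the quoted form, escapes the double quote by doubling, and escapes the backslash, control characters (including tab), and non-ASCII characters as \xFF. Characters whose code does not fit in two hexadecimal digits are rejected, which is an assumption about how to handle a case this draft leaves open.

```python
import re

_XESCAPE = re.compile(r'\\x([0-9A-F]{2})')


def quote(text):
    """Render text as a quoted string."""
    out = []
    for ch in text:
        code = ord(ch)
        if code > 0xFF:
            raise ValueError("character does not fit in a two-digit \\x escape")
        if ch == '"':
            out.append('""')  # preferred escape for the double quote
        elif ch == "\\" or not (32 <= code < 127):
            out.append("\\x%02X" % code)
        else:
            out.append(ch)
    return '"' + "".join(out) + '"'


def unquote(field):
    """Return the value of a field; unquoted fields are returned unchanged."""
    if not (len(field) >= 2 and field.startswith('"') and field.endswith('"')):
        return field
    body = field[1:-1].replace('""', '"')
    return _XESCAPE.sub(lambda m: chr(int(m.group(1), 16)), body)


if __name__ == "__main__":
    encoded = quote('Say "hi" \\ now')
    print(encoded)
    print(unquote(encoded))
```

A backslash in the data is always written as \x5C, so a decoded backslash can never be mistaken for the start of another escape.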
Taxonomy
Directive Names
The following standard directives are defined. New directives not
approved by the IATC should be preceded with the string "X-" as in "X-New-Directive".
Directive names are case-insensitive. Arbitrary comments can be inserted
in the file using the Remark directive.
-
IARF: Version=<version>
-
The IARF directive is required and must appear exactly once as the
first line of the report file. The version attribute defines the version
of the report format. This draft defines version 1.0.
-
Format: Fields="<field-identifier> [<field-identifier> [...]]"
Template=<name>
-
The Format directive controls the field ordering of the entries that
follow. This directive must appear in the report file before any
entry lines are encountered. Format directives may occur anywhere
within the report file; each takes effect for the entries that follow
it. The Fields and Template attributes are complementary:
either or both may be present, and at least one is required. See
below for a list of standard field identifiers and templates. Non-standard
field identifiers may be typed using the "Field-Info" directive.
-
Field-Info: Name=<field-identifier> Type=<type-identifier>
Header=<string>
-
Optional additional information for each field; this directive is never
required. Standard fields defined below are strictly typed, and unknown
fields should be considered type "string". The Name attribute refers to
a field-identifier specified in the Format directive. Type identifiers
convey field type information; allowed values are listed above, and are
case-insensitive. The Header attribute is a string representing column
header label information.
-
Site: Name=<string> Domain=<domain> GMT-Offset=<integer>
Email=<address> Contact=<string>
-
The Site directive conveys information specific to the site that generated
the data. Name is a human-readable name; Domain refers to the primary
site domain name. GMT-Offset represents the offset in hours for all
dates and times in both entries and directives. This directive may occur
anywhere within the file and the GMT-Offset takes effect for all directives
and entries following it. If the GMT-Offset attribute does not occur,
all dates and times are assumed to be GMT. The Email and Contact
attributes are for representing contact info regarding this report file.
-
Advertiser: Name=<string> Brand=<string> Campaign=<string>
-
The Advertiser directive is for advertiser-specific information.
The Name, Brand, and Campaign attributes are self-explanatory.
-
Agency: Name=<string> Insertion-Order=<string>
-
The Agency directive may be used for ad agency information. The
Name attribute identifies the agency. The Insertion-Order attribute
represents the agency-assigned insertion order number represented by the
report. Multiple insertion orders may be reported on by using multiple
Agency directives.
-
Flight: Name=<string> Start-Date=<date> Start-Time=<time>
End-Date=<date> End-Time=<time> Impression-Guarantee=<integer>
-
The Flight directive represents flight information. Multiple flights
may be reported in one report file by using multiple Flight directives.
The Name represents an identifier for the flight. Impression-Guarantee
represents the number of impressions guaranteed over the duration of the
flight.
-
Created: Report-Date=<date> Report-Time=<time>
Vendor=<name> Version=<version> OS=<name>
OS-Version=<version>
-
The Created directive is used for information about how and when the
report was created. Report-Date and Report-Time represent the date
and time when the report was run. Vendor and Version represent the
software used to create the report. OS and OS-Version are used to
indicate the platform this software was running on. This directive may
occur only once, and should occur near the top.
-
Remark or Rem: <text>
-
Comment information. Data recorded by this directive should be ignored
by analysis tools.
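As an illustration of how the Type attribute of a Field-Info directive might drive conversion, the sketch below maps type identifiers to Python converters. The names CONVERTERS and convert are hypothetical. Only "integer" and "fixed" are converted; the remaining types, and any unknown type, are kept as strings, which mirrors the rule that unknown fields are treated as type "string".

```python
from decimal import Decimal

# Decimal keeps the arbitrary precision that the <fixed> type allows.
CONVERTERS = {"integer": int, "fixed": Decimal}


def convert(value, type_identifier="string"):
    """Convert raw field text according to a case-insensitive type identifier."""
    converter = CONVERTERS.get(type_identifier.lower(), str)
    return converter(value)


if __name__ == "__main__":
    print(convert("10253", "Integer"))
    print(convert("0.0825", "fixed"))
    print(convert("banner", "string"))
```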
Field Identifiers
The Fields attribute of the Format directive lists a sequence of field identifiers
specifying the semantics and data type recorded in the fields of each entry.
Field identifiers may be one of the following case-insensitive strings:
-
start-date - <date>
-
Date beginning the entry. If this field is not present, interpretation
of the entry date is system-specific.
-
start-time - <time>
-
Time beginning the entry. If this field is not present, tools
should default to 00:00:00.
-
end-date - <date>
-
Date the entry ends. If this field is not present, tools should
default to the same as the start-date.
-
end-time - <time>
-
Time the entry ends. If this field is not present, tools should
default to 23:59:59.999.
-
ad-name - <string>
-
The name of the ad banner.
-
ad-id - <string>
-
The unique server-assigned ID of the ad banner.
-
ad-media-filename - <string>
-
The filename of the ad media.
-
ad-click-url - <uri>
-
The clickthrough URL for the ad banner.
-
placement - <string>
-
The name of the placement - may be a page URL, section of the site,
run of site, keyword, etc.
-
impressions - <integer>
-
The number of impressions recorded.
-
insertions - <integer>
-
The number of insertions recorded.
-
clicks - <integer>
-
The number of clicks recorded.
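The defaults given above for absent start-time, end-date, and end-time fields could be applied as in this sketch. The name apply_defaults is hypothetical, and the entry is assumed to be a dictionary keyed by lower-case field identifier. No default is supplied for start-date, since its interpretation is left system-specific.

```python
def apply_defaults(entry):
    """Return a copy of the entry with the standard time-span defaults filled in."""
    filled = dict(entry)
    filled.setdefault("start-time", "00:00:00")
    if "start-date" in filled:
        # end-date defaults to the same day as start-date
        filled.setdefault("end-date", filled["start-date"])
    filled.setdefault("end-time", "23:59:59.999")
    return filled


if __name__ == "__main__":
    print(apply_defaults({"start-date": "1997-04-01", "impressions": "10253"}))
```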
Standard Templates
In order to promote report standardization and enable simpler report analysis
tools, a set of standard templates is defined. Agencies may therefore
require reports to conform to both the standard format, and one (or more)
of the standard templates. Analysis tools may interpret a limited
number of templates, instead of requiring enough generality to interpret
the entire spectrum of report types. A Template attribute should
be interpreted exactly the same as the associated Fields attribute; report
generators are encouraged to include both attributes, for maximum flexibility.
Templates not approved by the IATC should be preceded by the string "X-"
as in "X-New-Template". The following standard case-insensitive template
names are defined:
-
Template=basic <=> Fields="start-date ad-name placement impressions
clicks"
-
This template is the most common and most basic set of report criteria.
Impression and click counts are aggregated daily, and reported for each
ad and placement combination.
-
Template=adinfo <=> Fields="start-date start-time ad-name ad-media-filename
ad-click-url placement impressions insertions clicks"
-
This template is an example of another set of report criteria.
Impression, insertion and click counts are aggregated daily, and more ad
info is included in the report.
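The equivalence between template names and field lists can be captured in a small lookup table, as sketched here. The names TEMPLATES and resolve_fields are hypothetical, and the attribute dictionary is assumed to use lower-case keys with raw values. Preferring an explicit Fields attribute when both attributes are present is a design choice of this sketch; for a conforming report the two agree.

```python
TEMPLATES = {
    "basic": "start-date ad-name placement impressions clicks",
    "adinfo": ("start-date start-time ad-name ad-media-filename "
               "ad-click-url placement impressions insertions clicks"),
}


def resolve_fields(attributes):
    """Return the list of field identifiers for a Format directive, or None."""
    fields = attributes.get("fields")
    if fields is not None:
        # Raw values may still carry their surrounding double quotes.
        return fields.strip('"').lower().split()
    template = attributes.get("template", "").lower()
    if template in TEMPLATES:
        return TEMPLATES[template].split()
    return None  # unknown or experimental template without a Fields list


if __name__ == "__main__":
    print(resolve_fields({"template": "Basic"}))
```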
Example
The following is an example file in the report format:
#IARF: Version=1.0
#Format: Template=basic Fields="start-date ad-name placement impressions clicks"
#Site: Name="Content Provider" Domain=www.content.com GMT-Offset=-8
#Advertiser: Name=Ford Campaign="Explore the world"
#Agency: Name="Funky Agency" Insertion-Order=11783
#Flight: Start-Date=1997-04-01 End-Date=1997-04-30 Impression-Guarantee=1000000
#Created: Report-Date=1997-04-03 Vendor=NetGravity Version=3.0
1997-04-01 "Ford Explorer" "Sports section" 10253 843
1997-04-01 "Ford Explorer" "Keyword: outdoors" 2543 85
1997-04-01 "Ford Taurus" "Entertainment section" 84922 1024
1997-04-02 "Ford Explorer" "Sports section" 10765 682
#Remark: This IARF stuff is pretty cool!
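To show how a Format directive governs the entries that follow it, the sketch below reads a shortened copy of the example above into a list of rows. The name read_report is hypothetical. Only the Fields attribute is honored, all other directives are skipped, and field values are left in raw form; these simplifications are assumptions made to keep the example short.

```python
import re

_FIELD = re.compile(r'"(?:[^"]|"")*"|[^ \t]+')
_FIELDS_ATTR = re.compile(r'[ \t]Fields="([^"]*)"', re.IGNORECASE)


def read_report(text):
    """Return entries as dictionaries keyed by the field list in effect."""
    fields, rows = None, []
    for raw in text.split("\n"):
        line = raw.rstrip("\r")
        if line.lower().startswith("#format:"):
            match = _FIELDS_ATTR.search(line)
            if match:
                fields = match.group(1).lower().split()  # applies from here on
        elif line.startswith("#") or fields is None:
            continue  # other directives, or entries before any Format
        else:
            tokens = _FIELD.findall(line)
            if len(tokens) >= len(fields):
                rows.append(dict(zip(fields, tokens)))
    return rows


REPORT = """#IARF: Version=1.0
#Format: Template=basic Fields="start-date ad-name placement impressions clicks"
1997-04-01 "Ford Explorer" "Sports section" 10253 843
1997-04-01 "Ford Taurus" "Entertainment section" 84922 1024
#Remark: This IARF stuff is pretty cool!
"""

if __name__ == "__main__":
    for row in read_report(REPORT):
        print(row["ad-name"], row["impressions"], row["clicks"])
```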
The next example uses a slightly more complex template.
#IARF: Version=1.0
#Format: Template=adinfo Fields="start-date start-time ad-name ad-media-filename ad-click-url placement impressions insertions clicks"
#Site: Name="Content Provider" Domain=www.content.com GMT-Offset=-8
#Advertiser: Name=Microsoft Campaign="Try Java"
#Agency: Name="Funky Agency" Insertion-Order=11784
#Flight: Start-Date=1997-04-01 End-Date=1997-04-30 Impression-Guarantee=1000000
#Created: Report-Date=1997-04-03 Vendor=NetGravity Version=3.0
1997-04-01 00:00 "MS Image ad" msimage.gif http://www.microsoft.com/ "Javaless browsers" 10253 0 843
1997-04-01 00:00 "MS Java ad" msjava.class http://www.microsoft.com/ "Java capable" 0 2543 175
1997-04-01 12:00 "MS Image ad" msimage.gif http://www.microsoft.com/ "Javaless browsers" 11874 0 735
1997-04-01 12:00 "MS Java ad" msjava.class http://www.microsoft.com/ "Java capable" 0 2278 156
The last example includes some experimental directives, attributes, and
fields.
#IARF: Version=1.0
#Format: Fields="start-date start-time ad-name x-ad-size placement impressions clicks"
#Field-Info: Name=x-ad-size Type=string Header="Ad Size"
#Advertiser: Name="The Big Little Co" Campaign="Size wise"
#Agency: Name="Funky Agency" Insertion-Order=11785 Contact="John Smith"
#X-Site-Sizes: Allowed="banner button"
1997-04-01 08:00 "Big Ad" banner "Sports" 10253 843
1997-04-01 08:00 "Little Ad" button "Sports" 2543 85
1997-04-01 09:00 "Big Ad" banner "Sports" 84922 1024
1997-04-01 09:00 "Little Ad" button "Sports" 10765 682
1997-04-01 10:00 "Big Ad" banner "Sports" 10158 729
1997-04-01 10:00 "Little Ad" button "Sports" 3097 79
1997-04-01 11:00 "Big Ad" banner "Sports" 96123 1083
1997-04-01 11:00 "Little Ad" button "Sports" 11074 701
Acknowledgements.
Steve Goldberg pushed the IATC to develop this format. Paul Nakada contributed
the idea of attribute value pairs for directives.
References.
-
[Hallam96]
-
P. Hallam-Baker, B. Behlendorf, Extended
Log File Format, March 1996
-
[RFC1808]
-
R. Fielding, Relative
Uniform Resource Locators, June 1995
-
[RFC1738]
-
T. Berners-Lee, L. Masinter, Uniform
Resource Locators (URL), December 1994
$Id: WD-adreport-19970505.html,v 1.3 1999/02/19 01:33:21 ts Exp $