Provided by: awffull_3.10.2-10_amd64

NAME

       AWFFull - A Webalizer Fork, Full o' features

DESCRIPTION

       awffull.conf is the configuration file for awffull(1). awffull.conf is a standard ASCII(7) text file
       that may be created or edited using any standard editor.

       Blank lines and lines that begin with a pound sign ('#') are ignored.

       Any other lines are considered to be configuration lines, and have the form ‘Keyword  Value’,  where  the
       ‘Keyword’ is one of the currently available configuration keywords, and ‘Value’ is the value to assign to
       that particular option.

       Any  text  found  after  the  keyword up to the end of the line is considered the keyword's value, so you
       should not include anything after the actual value on the line that is not actually  part  of  the  value
       being assigned. The file sample.conf provided with the distribution contains lots of useful documentation
       and examples as well.

       Some ‘Keywords’ will accept a second value. In those situations, the first value may be enclosed in
       double quotes (") to allow for whitespace.

       Keywords are Case Insensitive. Values are Case Sensitive, with some gotchas: See Ignore* for details.
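
       The line format described above can be sketched in a few lines of Python. This is only an
       illustration of the rules in this section, not AWFFull's actual parser; the function name, the sample
       paths, and the decision to strip surrounding whitespace are assumptions.

```python
def parse_config(text):
    """Return (keyword, value) pairs from awffull.conf-style text (illustrative only)."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()  # assumption: surrounding whitespace is not significant
        # Blank lines and lines that begin with '#' are ignored.
        if not line or line.startswith("#"):
            continue
        # Everything after the keyword, up to the end of the line, is the value.
        parts = line.split(None, 1)
        keyword = parts[0].lower()                   # keywords are case insensitive
        value = parts[1] if len(parts) > 1 else ""   # values keep their case
        entries.append((keyword, value))
    return entries


# Hypothetical sample; the paths are placeholders.
sample = """\
# where the reports go
OutputDir /var/www/stats

logfile   /var/log/apache2/access.log.gz
"""
for keyword, value in parse_config(sample):
    print(keyword, "=", value)
```

       Because the value runs to the end of the line, a trailing comment on a configuration line would
       become part of the value.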

WILDCARDS

       Wildcards within AWFFull are a little non-standard and may cause some confusion.

       Wildcards are only valid within the Value of certain keywords.

       A Value can have either a leading or trailing '*' to signify a wildcard  character.  If  no  wildcard  is
       found,  a  match  can occur anywhere in the string. Given a string ‘www.yourmama.com’, the values ‘your’,
       ‘*mama.com’ and ‘www.your*’ will all match.

       Thus the use of the wildcard signifies that the other end of the Value is anchored at  the  Beginning  or
       End of a field to be searched against.

       eg.  A  Value  of  ‘Bot*’  implies  that  the field (probably UserAgent in this case) MUST start with the
       letters Bot. Or in the case of a Hostname ‘*.gov.au’ implies a match ONLY against  Australian  Government
       hostnames.
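
       The matching rule can be expressed as a small Python sketch. It mirrors the description above (a
       leading '*' anchors the Value at the end of the field, a trailing '*' anchors it at the beginning,
       and no wildcard means a substring match); the function name is an assumption and this is not
       AWFFull's own code.

```python
def wildcard_match(pattern, field):
    """Illustrative AWFFull-style wildcard test (case sensitive)."""
    if pattern.startswith("*"):
        # Leading wildcard: the rest of the pattern must sit at the END of the field.
        return field.endswith(pattern[1:])
    if pattern.endswith("*"):
        # Trailing wildcard: the rest of the pattern must sit at the START of the field.
        return field.startswith(pattern[:-1])
    # No wildcard: the pattern may occur anywhere in the field.
    return pattern in field


for pattern in ("your", "*mama.com", "www.your*"):
    print(pattern, wildcard_match(pattern, "www.yourmama.com"))  # each prints True
```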

RUN OPTIONS

       The  Run Options are the generic ones that tell AWFFull where stuff is and how to generally operate. Some
       of these can modify the results that AWFFull will produce.

       OutputDir
              OutputDir is where you want to put the output files. This should be a full path name,
              however  relative  ones  might  work  as  well.  If  no output directory is specified, the current
              directory will be used.

       LogFile
              LogFile defines the web server log file to use. If not specified here or on the command line,
              input  will  default  to STDIN. If the log filename ends in '.gz' (ie: a gzip compressed file), it
              will be decompressed on the fly as it is being read.

       LogType
              LogType defines the log type being processed. Normally, AWFFull expects  a  CLF  or  Combined  web
              server  log  as input. Using this option, you can process ftp logs as well (xferlog as produced by
              wu-ftpd and others), or Squid native logs. Values can be 'auto', 'clf', 'combined', 'ftp',
              'domino' or 'squid', with 'auto' the default. The 'auto' value means that AWFFull will try to
              work out what log format you are sending to it. If it cannot, AWFFull will immediately exit.

       GeoIP  GeoIP  enables  or  disables  the  use  of  the  GeoIP  capability  for more accurate detection of
              countries. Default is ‘no’. NOTE! Do not enable GeoIP if you analyse files that have  had  the  IP
              Address translated to a Fully Qualified Host Name. Use either raw IP Addresses and GeoIP, or Names
              and disable GeoIP. ie. Don't use GeoIP AND DNShistory.

       GeoIPDatabase
              GeoIPDatabase  is  the location of the GeoIP database file. Default is /usr/share/GeoIP/GeoIP.dat,
              which is where a default GeoIP install will put it. Note that the database is updated monthly. For
              the details see: ⟨http://www.maxmind.com/app/geoip_country⟩

       Incremental
              Incremental processing allows multiple partial log files to be  used  instead  of  one  huge  one.
              Useful  for  large  sites that have to rotate their log files more than once a month. AWFFull will
              save its internal state before exiting, and restore it the next time it is run, in order to continue
              processing  where  it  left  off.  This  mode also causes AWFFull to scan for and ignore duplicate
              records (records already processed by  a  previous  run).  See  the  README  file  for  additional
              information.  The  value may be 'yes' or 'no', with a default of 'no'. The file awffull.current is
              used to store the current state data, and is located  in  the  output  directory  of  the  program
              (unless  changed  with  the  IncrementalName  option  below).  Please read at least the section on
              Incremental processing in the README file before you enable this option.

       TimeMe TimeMe allows you to force the display of timing information at the end of processing. A value  of
              'yes' will force the timing information to be displayed. A value of 'no' has no effect.

       IgnoreHist
              IgnoreHist  should not be used in a standard configuration, but it is here because it is useful in
              certain analysis situations. If the history file is ignored, the main ‘index.html’ file will  only
              report on the current log file's contents. Incremental data (if present) is still processed. Useful
              when  you  want to reproduce the reports from scratch, for example. USE WITH CAUTION! Valid values
              are ‘yes’ or ‘no’. Default is ‘no’.

       IncrementalName
              IncrementalName allows you to specify the filename for saving  the  incremental  data  in.  It  is
              similar  to  the  HistoryName option where the name is relative to the specified output directory,
              unless an absolute filename is specified. The default is a file named  ‘awffull.current’  kept  in
              the  normal  output  directory.  If you don't specify Incremental as 'yes' then this option has no
              meaning.

       HistoryName
              HistoryName allows you to specify the name of the history file produced by  AWFFull.  The  history
              file  keeps  the  data  for  up to 12 months worth of logs, used for generating the main HTML page
              (index.html). The default is a file named awffull.hist, stored in the specified output  directory.
              If  you  specify  just  the  filename  (without  a  path), it will be kept in the specified output
              directory. Otherwise, the path is relative to the output directory, unless absolute (leading /).
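
              The path rule shared by HistoryName and IncrementalName (relative to the output directory
              unless absolute) happens to match how Python joins POSIX paths, so it can be illustrated
              as follows. The function name and directory names are assumptions for the example only.

```python
import posixpath


def resolve_state_file(output_dir, name="awffull.hist"):
    """Resolve a history or incremental file name against the output directory."""
    # posixpath.join discards output_dir when name is absolute (leading '/').
    return posixpath.join(output_dir, name)


print(resolve_state_file("/var/www/stats"))                   # /var/www/stats/awffull.hist
print(resolve_state_file("/var/www/stats", "old/site.hist"))  # /var/www/stats/old/site.hist
print(resolve_state_file("/var/www/stats", "/tmp/site.hist")) # /tmp/site.hist
```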

ANALYSIS OPTIONS

       These are the basic analysis options that one can and should modify to start fine tuning AWFFull  against
       a given website.

       PageType
              PageType  lets  you  tell  AWFFull what types of URL's you consider a 'page'. Most people consider
              html and cgi documents as pages, while not images and audio files.  If  no  types  are  specified,
              defaults will be used ('htm', 'html', 'cgi' and HTMLExtension if different for web logs, 'txt' for
              ftp  logs).  Putting  the  more likely page types first in the list should increase the speed of a
              run.

              Do Not Use Wildcards Here. It will not work.

       NotPageType
              NotPageType is the direct and incompatible opposite of PageType. You can use one set or the other,
              but not both. PageType specifies what *is* a Page, NotPageType specifies what *isn't*,  and  hence
              by implication, everything else is a page. Neither method is more or less correct than the
              other; use whichever is more accurate for *your* site. As a general rule, do not add the "."
              or use any wildcards: there are some assumed internal optimisations that may otherwise break.
              Those who understand PCREs would do well to examine the source of parser.c if they wish to
              extract greater flexibility from these settings.

       FoldSeqErr
              FoldSeqErr  forces  AWFFull  to  ignore sequence errors. This is useful for Netscape and other web
              servers that cache the writing of  log  records  and  do  not  guarantee  that  they  will  be  in
              chronological order. The use of the FoldSeqErr option will cause out of sequence log records to be
              treated  as  if  they  had  the same time stamp as the last valid record. The default action is to
              ignore out of sequence log records.
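
              The difference between the default behaviour and FoldSeqErr can be sketched as below.
              Timestamps are plain integers here and the function name is an assumption; the sketch only
              illustrates the policy described above, not AWFFull's implementation.

```python
def apply_sequence_policy(timestamps, fold_seq_err=False):
    """Return the timestamps that would be counted, in log order."""
    counted = []
    last_valid = None
    for ts in timestamps:
        if last_valid is not None and ts < last_valid:
            # Out-of-sequence record.
            if fold_seq_err:
                # Treated as if it carried the time stamp of the last valid record.
                counted.append(last_valid)
            # Default: the record is ignored.
            continue
        last_valid = ts
        counted.append(ts)
    return counted


log_times = [10, 20, 15, 30]
print(apply_sequence_policy(log_times))                     # [10, 20, 30]
print(apply_sequence_policy(log_times, fold_seq_err=True))  # [10, 20, 20, 30]
```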

       SearchEngine
              The SearchEngine keywords allow specification of search engines and their  query  strings  on  the
              URL. These are used to locate and report what search strings are used to find your site. The first
              word  is  a  substring  to  match in the referrer field that identifies the search engine, and the
              second is the URL variable used by that search engine to define its search terms.

       VisitTimeout
              VisitTimeout allows you to set the default timeout for a visit (sometimes called a 'session'). The
              default is 30 minutes, which should be fine for most sites. Visits are determined  by  looking  at
              the  time  of  the  current  request,  and the time of the last request from the site. If the time
              difference is greater than the VisitTimeout value, it is considered a new visit, and visit  totals
              are incremented. Value is the number of seconds before timeout (default=1800=30min).
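
              As a rough illustration of the rule, the following sketch counts visits for a single site
              from its request times in seconds. It assumes the times are already in chronological
              order; the function name is hypothetical.

```python
def count_visits(request_times, visit_timeout=1800):
    """Count visits for one site, given its request times in seconds."""
    visits = 0
    last = None
    for t in request_times:
        # A gap strictly greater than the timeout starts a new visit.
        if last is None or t - last > visit_timeout:
            visits += 1
        last = t
    return visits


# Gaps of 600 s and 1800 s stay within one visit; the 1801 s gap starts a second.
print(count_visits([0, 600, 2400, 4201]))  # 2
```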

       TrackPartialRequests
              TrackPartialRequests is used to track 206 (Partial Content) codes. This gives two additional
              columns in the Top URLs tables. The first, alongside "Hits", counts the number of partial
              requests. The second, alongside "Volume", counts the volume in partial requests. This option
              is of most use to those with lots of PDFs.

       MangleAgents
              MangleAgents allows you to specify how much, if any, AWFFull should mangle user agent names.
              This  allows  several  levels of detail to be produced when reporting user agent statistics. There
              are six levels that can be specified, which define different levels of detail suppression. Level 5
              shows only the browser name (MSIE or Mozilla) and the major version number. Level 4 adds the minor
              version number (single decimal place). Level 3 displays the minor version to two  decimal  places.
              Level  2  will add any sub-level designation (such as Mozilla/3.01Gold or MSIE 3.0b). Level 1 will
              attempt to also add the system type if it is specified. The default Level 0 displays the full user
              agent field without modification and produces the greatest amount of detail. User agent names that
              can't be mangled will be left unmodified.

       AssignToCountry
              AssignToCountry allows a form of override to force given domains to a specified country.  Use  the
              standard  2 letter country codes. Can also use org, com, net and so on, if more appropriate.  With
              judicious use of AllSites, GroupSite and 'whois', this  can  cover  the  majority  of  your  users
              without too much effort.

       IndexAlias
              AWFFull  normally  strips  the  string  'index.'  off the end of URL's in order to consolidate URL
              totals. For example, the URL /somedir/index.html is turned into /somedir/ which is really the same
              URL. This option allows you to specify additional strings to treat in the same way. You don't need
              to specify 'index.' as it is always scanned for  by  AWFFull,  this  option  is  just  to  specify
              _additional_  strings  if  needed. If you don't need any, don't specify any as each string will be
              scanned for in EVERY log record... A bunch of them will degrade performance. Also, the  string  is
              scanned   for   anywhere   in   the   URL,   so   a   string   of   'home'   would  turn  the  URL
              /somedir/homepages/brad/home.html into just /somedir/ which is probably not what was intended.
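
              The pitfall described above is easier to see in code. This sketch truncates a URL at the
              first occurrence of 'index.' or of any additional alias; treating the built-in 'index.'
              the same way as the extra aliases is a simplifying assumption, and the function name is
              hypothetical.

```python
def strip_index_alias(url, aliases=()):
    """Cut the URL at the first occurrence of 'index.' or an extra alias."""
    for alias in ("index.",) + tuple(aliases):
        pos = url.find(alias)  # scanned for anywhere in the URL
        if pos != -1:
            return url[:pos]
    return url


print(strip_index_alias("/somedir/index.html"))                          # /somedir/
print(strip_index_alias("/somedir/homepages/brad/home.html", ["home"]))  # /somedir/
```

              The second call shows why a short alias such as 'home' is risky: the match inside
              'homepages' wins, and everything after /somedir/ is lost.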

       IgnoreIndexAlias
              The opposite (in a way) of IndexAlias is  IgnoreIndexAlias.   This  will  STOP  any  URL  variable
              stripping, as well as ignoring the default "index." setting, or any that you set above.

IGNORE* OPTIONS

       The  Ignore* keywords allow you to completely ignore, or filter away, log records based on hostname, URL,
       user agent, referrer or user name. Use the same syntax as the Hide* keywords, where the value can have  a
       leading or trailing wildcard '*'.

       IgnoreURL
              Filters  out traffic accessing certain URLs. eg You may wish to avoid seeing traffic that accesses
              administration functions, thus "IgnoreURL /admin*". URLs are case sensitive.

       IgnoreSite
              Ignore sites that visit this website. Ignore by what is presented to awffull - name or IP Address.
              Sites are lowercased prior to filtering, so if Ignore'ing by name, do use a lowercased Value.

       IgnoreReferrer
              Ignore specified referrers. Very useful for filtering away SPAM Referrers. Referrers are partially
              case sensitive. \o/ The host portion is lowercased; the URI is case sensitive.

       IgnoreUser
              Ignore specified users. User names are lowercased prior to filtering.

       IgnoreAgent
              Agents are case sensitive.

INCLUDE* OPTIONS

       The Include* keywords allow you to force the inclusion of log records based on hostname, URL, user agent,
       referrer or user name. The Include* keywords take precedence over the Ignore* keywords.

       Note: Using Ignore/Include combinations to  selectively  process  parts  of  a  web  site  is  _extremely
       inefficient_!!!  Avoid doing so if possible ie: grep or gawk the records to a separate file if you really
       want that kind of report.
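
       The precedence rule can be sketched as follows. The helper repeats the wildcard behaviour from the
       WILDCARDS section; both function names and the sample URLs are assumptions, and the sketch ignores
       the lowercasing that some Ignore* keywords apply.

```python
def _matches(pattern, field):
    # Same wildcard rule as in the WILDCARDS section.
    if pattern.startswith("*"):
        return field.endswith(pattern[1:])
    if pattern.endswith("*"):
        return field.startswith(pattern[:-1])
    return pattern in field


def keep_record(field, ignore_patterns, include_patterns):
    """Decide whether a record survives Ignore*/Include* filtering."""
    # Include* takes precedence: a matching record is always kept.
    if any(_matches(p, field) for p in include_patterns):
        return True
    # Otherwise the record is dropped if any Ignore* value matches.
    return not any(_matches(p, field) for p in ignore_patterns)


ignore = ["/admin*"]
include = ["/admin/public*"]
print(keep_record("/admin/public/info.html", ignore, include))  # True
print(keep_record("/admin/secret.html", ignore, include))       # False
print(keep_record("/products.html", ignore, include))           # True
```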

       IncludeURL

       IncludeSite

       IncludeReferrer

       IncludeUser

       IncludeAgent

SEGMENTING OPTIONS

       Segmenting is a bit like the Ignore* and Include* keywords. Where it differs is  in  "remembering".  Such
       that,  as  a  ‘session’  (or ‘visit’) moves away from the original entry condition, that session is still
       tracked. So if you segment on a referral from Google, only sessions that were referred to the analysed
       website, from Google, will be tracked. Even as that same session accesses other pages within the website.

       eg. Google -> Site Page 1 -> Site Page 2 -> Site Page 3

       Whereas Ignore/Include would only filter the first interaction. eg.  Google -> Site Page 1

       By  "session"  (or  ‘visit’)  it  is  meant  that  the time limitation of a session (typically 30 minutes
       timeout) will impact. So in the above example from Google, if the last step  (from  Page  2  to  Page  3)
       occurred  31+  minutes after the Page 1 to Page 2 transition, then this final step would NOT be included.
       The trail would be:

       Google -> Site Page 1 -> Site Page 2

       Please do be aware that currently AWFFull uses IP Addresses to determine the continuation of a given
       session. This will be most flawed if you have a user population that sits behind corporate firewalls
       or ISP Proxies, to mention two major problem areas.
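
       The "remembering" behaviour can be illustrated with a short sketch. It keys sessions on IP address
       and applies a 30 minute timeout, as described above; the record layout, the exact-match test on the
       referrer host, and all names and addresses are assumptions made for the example.

```python
def segment_by_referrer(requests, referrer_host, timeout=1800):
    """Keep pages from sessions that entered via referrer_host.

    requests: time-ordered tuples of (ip, time_in_seconds, referrer_host, page).
    """
    last_seen = {}  # ip -> time of the last request kept for a tracked session
    kept = []
    for ip, t, ref, page in requests:
        in_session = ip in last_seen and t - last_seen[ip] <= timeout
        if ref == referrer_host or in_session:
            # Entry condition met, or an already tracked session continues.
            last_seen[ip] = t
            kept.append(page)
        else:
            # The session timed out or never qualified: stop tracking this IP.
            last_seen.pop(ip, None)
    return kept


trail = [
    ("192.0.2.10", 0, "www.google.com", "/page1"),
    ("192.0.2.10", 300, "www.example.com", "/page2"),
    ("192.0.2.10", 2160, "www.example.com", "/page3"),  # 31 minutes after /page2
]
print(segment_by_referrer(trail, "www.google.com"))  # ['/page1', '/page2']
```

       A plain Ignore/Include filter on the referrer would have kept only /page1.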

       Why do Segmenting?

       ⟨http://judah.webanalyticsdemystified.com/2007/11/a-few-tips-on-web-analytics-segmentation.html⟩

       ‘Segment analysis will tell you different things about your audience than you will realize from  studying
       overall population metrics.’

       ‘The goal of segmentation is to maximize future value of that segment by optimizing your marketing mix.’

       With apologies to Judah for mixing his phrase order around.  :-)

       SegCountry
              Segment  by  Country:  Only  track  sessions  that come from the following countries. This will be
              determined by:

              1.  Use of AssignToCountry overrides

              2.  GeoIP lookups if so configured and enabled

              3.  Hostname TLD. eg .au

       The third option is generally going to be the worst for accuracy. eg. We have  plenty  of  Australian  IP
       addresses that otherwise resolve to .com or .net etc.

       It is strongly advised to enable GeoIP if you wish to use this option.
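
       Assuming the three sources are consulted in the order listed, country resolution might look like the
       sketch below. The override table, the geoip_lookup callable and the function name are all
       hypothetical stand-ins; no real GeoIP API is used.

```python
def resolve_country(host, overrides=None, geoip_lookup=None):
    """Pick a country code for a host (illustrative ordering only)."""
    # 1. AssignToCountry-style overrides: domain -> country code.
    for domain, country in (overrides or {}).items():
        if host == domain or host.endswith("." + domain):
            return country
    # 2. GeoIP lookup, if configured and enabled (hypothetical callable).
    if geoip_lookup is not None:
        country = geoip_lookup(host)
        if country:
            return country
    # 3. Fall back to the hostname TLD, e.g. 'au'. Raw IP addresses have none.
    tld = host.rsplit(".", 1)[-1]
    if "." in host and tld.isalpha():
        return tld.lower()
    return None


print(resolve_country("proxy.example.com", overrides={"example.com": "au"}))  # au
print(resolve_country("www.example.au"))                                      # au
print(resolve_country("203.0.113.7"))                                         # None
```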

       SegReferer
              Segment  by  Referer:  Only  track  sessions that originated from the following referers. NOTE!!!!
              SegReferer only works against the HOST name. Not the full URL.

DISPLAY OPTIONS

       The Display Options modify the resulting output that AWFFull  produces.  Things  like  HTML  Headers  and
       Footers to add on every page. These options don't change the numbers that AWFFull will calculate, but may
       change which ones appear, giving the illusion of a numerical change.

       ReportTitle
              ReportTitle  is  the  text to display as the title. The hostname (unless blank) is appended to the
              end of this string (separated with a space) to generate the final full title  string.  Default  is
              (for English) ‘Usage Statistics for’.

       HostName
              HostName  defines  the hostname for the report. This is used in the title, and is prepended to the
              URL table items. This allows clicking on URL's in the report to go to the proper location  in  the
              event you are running the report on a 'virtual' web server, or for a server different than the one
              the  report resides on. If not specified here, or on the command line, AWFFull will try to get the
              hostname via a uname system call. If that fails, it will default to ‘localhost’.

       IndexMonths
              This option controls how much history to display on the front summary page. The value is
              given in months. eg: To display the last 5 years: 5 x 12 = 60

       DailyStats
              DailyStats allows the daily statistics table to be disabled - not displayed. Values may  be  ‘yes’
              or ‘no’. Default is ‘yes’ - do display the Daily Statistics table.

       HourlyStats
              HourlyGraph and HourlyStats allow the hourly statistics graph and statistics table to be
              disabled (not displayed). Values may be "yes" or "no". Default is "yes".

       CSSFilename
              CSSFilename is used to set the name of the CSS file to use in conjunction with the generated html.
              An existing file is not overwritten, so feel free to make your own changes to the default file. The
              default is awffull.css.

       FlagsLocation
              FlagsLocation  will  enable the display of country flag pictures in the country table. The path is
              that for a webserver, not file system. Can be relative or complete.  The  trailing  slash  is  not
              necessary. The default location is not set and hence will not be used.

       YearlySubtotals
              YearlySubtotals  will  display the subtotal for a given year in the main page. This is in addition
              to the Grand Total of all years.

       GroupShading
              The GroupShading allows grouped rows to be shaded in the report. Useful if you have lots of groups
              and individual records that intermingle in the report, and you want  to  differentiate  the  group
              records a little more. Value can be ‘yes’ or ‘no’, with ‘yes’ being the default.

       GroupHighlight
              GroupHighlight  allows  the group record to be displayed in BOLD. Can be either ‘yes’ or ‘no’ with
              the default being ‘yes’.

       HTMLExtension
              HTMLExtension allows you to specify the filename  extension  to  use  for  generated  HTML  pages.
              Normally, this defaults to "html", but can be changed for sites who need it (like for PHP embedded
              pages).

       UseHTTPS
              UseHTTPS  should be used if the analysis is being run on a secure server, and links to urls should
              use ‘https://’ instead of the default ‘http://’. If you need this, set it  to  ‘yes’.  Default  is
              ‘no’. This only changes the behaviour of the ‘Top URLs’ table.

       Top*   The  various  ‘Top’  options  below  define  the  number  of entries for each table. Tables may be
              disabled by using zero (0) for the value.

       TopURLs
              The most accessed URLs or Resources by number of requests (hits). Includes both Pages and  Images,
              for example. Defaults to 30 URLs.

       TopKURLs
              The greatest volume generating URLs. Defaults to 10 URL's.

       TopEntry
              The  most  accessed  initial URLs within a complete Visit. Will also display Single Access counts,
              Stickiness ratio and Popularity ratio. Defaults to 10 URLs.

       TopExit
              The most accessed last URLs within a complete Visit. ie: The last page recorded of a  Visit.  Also
              displays the Popularity ratio.  Defaults to 10 URLs.

       Top404Errors
              The most seen error requests and a corresponding referring URL. Defaults to 0, ie not shown.

       TopSites
              Those Sites that have accessed the most Pages. Default is 30 Sites.

       TopKSites
              Those Sites that have downloaded the greatest Volume. Default is 10 Sites.

       TopReferrers
              Those local and remote URLs that refer the most requests.  Default is 30 Referrers.

       TopSearch
              Those  words  and  phrases  used  at  remote  Search Engines to direct traffic here. Default is 20
              Phrases.

       TopUsers
              Those logged in users who most use the site. Default is 20 Users.

       TopAgents
              The Browser Agents that are busiest against this site. Default is 15 Agents.

       TopCountries
              A view of all traffic against this site via country.

       All*   The All* keywords allow the display of all the below measures.  If enabled, a separate  HTML  page
              will  be created, and a link will be added to the bottom of the appropriate "Top" table. There are
              a couple of conditions for this to occur. First, there must be more items than  will  fit  in  the
              "Top"  table  (otherwise  it  would  just  be  duplicating what is already displayed). Second, the
              listing will only show those items that are normally visible, which means it  will  not  show  any
              hidden  items.  Grouped  entries will be listed first, followed by individual items. The value for
              these keywords can be either 'yes' or 'no', with the default being  'no'.  Please  be  aware  that
              these  pages  can  be  quite  large  in  size, particularly the sites page, and separate pages are
              generated for each month, which can consume quite a lot of disk space depending on the traffic  to
              your site.

       AllURLs
              All accessed URLs

       AllEntryPages
              All Pages that initialised a Visit

       AllExitPages
              All the last or exit pages in all Visits.

       All404Errors
              All ErrorRequests and the corresponding referral URLs.

       AllSites
              All remote sites that accessed this website.

       AllReferrers
              All local and remote referring URLs

       AllSearchStr
              All Remote Search Engine words and Phrases used to refer traffic here.

       AllUsers
              All users who logged into this website.

       AllAgents
              All Browser Agents used to access this site. Useful for identifying robots.

       GMTTime
              GMTTime  allows  reports  to  show GMT (UTC) time instead of local time. Default is to display the
              time the report was generated in the timezone of the local machine,  such  as  EDT  or  PST.  This
              keyword  allows  you  to  have  times displayed in UTC instead. Use only if you really have a good
              reason, since it will probably screw up the reporting periods by however  many  hours  your  local
              time zone is off of GMT.

       HTMLPre
              HTMLPre defines HTML code to insert at the very beginning of the file. Default is the DOCTYPE line
              shown below. Max line length is 80 characters, so use multiple HTMLPre lines if you need more.

       HTMLHead
              HTMLHead defines HTML code to insert within the <HEAD></HEAD> block, immediately after the <TITLE>
              line. Maximum line length is 80 characters, so use multiple lines if needed.

       HTMLBody
              HTMLBody defines the HTML code to be inserted, starting with the <BODY> tag. If not specified, the
              default  is shown below.  If used, you MUST include your own <BODY> tag as the first line. Maximum
              line length is 80 char, use multiple lines if needed.

       HTMLPost
              HTMLPost defines the HTML code to insert immediately before the first <HR> on the document,  which
              is  just  after  the title and "summary period"-"Generated on:" lines. If anything, this should be
              used to clean up in case an image was inserted with HTMLBody. As with HTMLHead, you can define  as
              many  of  these as you want and they will be inserted in the output stream in order of appearance.
              Max string size is 80 characters. Use multiple lines if you need to.

       HTMLTail
              HTMLTail defines the HTML code to insert at the bottom of each HTML document, usually to include a
              link back to your home page or insert a small graphic. It is inserted as a table data element (ie:
              <TD> your code here </TD>) and is right aligned with the page.  The  maximum  string  size  is  80
              characters.

       HTMLEnd
              HTMLEnd  defines  the HTML code to add at the very end of the generated files. It defaults to what
              is shown below. If used, you MUST specify the </BODY> and </HTML> closing tags as the last  lines.
              The maximum string length is 80 characters.

GRAPHING OPTIONS

       As  distinct  from  the  general  Display Options, the Graphing Options focus on manipulating the various
       graphs produced.

       CountryGraph
              CountryGraph allows the usage by country graph to be disabled.   Values  can  be  'yes'  or  'no',
              default is 'yes'.

       DailyGraph
              DailyGraph  determines if the daily statistics graph will be displayed or not. Values may be "yes"
              or "no". Default is "yes" - do display the daily graph.

       HourlyGraph
              HourlyGraph determines if the hourly statistics graph will be displayed or not. Values may be "yes"
              or "no". Default is "yes" - do display the hourly graph.

       TopURLsbyHitsGraph
              Display a pie chart of the top URLs by HITS

       TopURLsbyVolGraph
              Display a pie chart of the top URLs by VOLUME

       TopExitPagesGraph
              Display Top Exit Pages Pie Chart. Values may be ‘hits’ or ‘visits’ or "no". Default is "no"

              ‘hits’ means order the graph by hits

              ‘visits’ means order the graph by visits

       TopEntryPagesGraph
              Display Top Entry Pages Pie Chart. Values may be ‘hits’ or ‘visits’ or "no". Default is "no"

              ‘hits’ means order the graph by hits

              ‘visits’ means order the graph by visits

       TopSitesbyPagesGraph
              Display a pie chart of the Top Sites by Page Impressions

       TopSitesbyVolGraph
              Display a pie chart of the Top Sites by Volume

       TopAgentsGraph
              Display a pie chart of the Top User Agents (by pages)

       GraphLegend
              GraphLegend allows the color coded legends to be turned on or off in the graphs.  The  default  is
              for  them  to  be  displayed. This only toggles the color coded legends, the other legends are not
              changed. If you think they are hideous and ugly, say 'no' here :)

       GraphLines
              GraphLines allows you to have index lines drawn behind the graphs. Anything other than  "no"  will
              enable the lines.

       Graph*X and Graph*Y
              The following Graph*X and Graph*Y options are used to modify the sizes of the created charts. The
              default settings are shown. The defaults are also the minimum settings.

              #define GRAPH_INDEX_X  512   /* px. Default X size (512) */
              #define GRAPH_INDEX_Y  256   /* px. Default Y size (256) */
              #define GRAPH_DAILY_X  512   /* px. Daily X size (512) */
              #define GRAPH_DAILY_Y  400   /* px. Daily Y size (400) */
              #define GRAPH_HOURLY_X 512   /* px. Hourly X size (512) */
              #define GRAPH_HOURLY_Y 400   /* px. Hourly Y size (400) */
              #define GRAPH_PIE_X    512   /* px. Pie X size (512) */
              #define GRAPH_PIE_Y    300   /* px. Pie Y size (300) */

       GraphIndexX
              The main chart on the front page. Summary of all Months.  Default is 512 pixels.

       GraphIndexY
              Default is 256 pixels.

       GraphDailyX
              The Day by Day Summary graph at the start of each Month's Summary. Default is 512 pixels.

       GraphDailyY
              Default is 400 pixels.

       GraphHourlyX
              The Hourly Average graph within each Month's Summary. Default is 512 pixels.

       GraphHourlyY
              Default is 400 pixels.

       GraphPieX
              All pie charts are the same size. Default is 512 pixels.

       GraphPieY
              Default is 300 pixels.

       Graph and Table Colours
              The  custom  bar  graph  and pie Colours can be overridden with these options. Declare them in the
              standard hexadecimal way - as per HTML but without the '#'. If none are given, you  will  get  the
              default AWFFull colors.
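
              For readers unfamiliar with the notation, this sketch decodes such a six digit hexadecimal
              value into its red, green and blue components. The function name is an assumption and the
              validation is deliberately minimal.

```python
def parse_color(value):
    """Convert a colour such as 'FF8000' (no leading '#') to an (r, g, b) tuple."""
    if len(value) != 6:
        raise ValueError("expected six hexadecimal digits, got %r" % value)
    # Two hex digits per channel: red, green, blue.
    return tuple(int(value[i:i + 2], 16) for i in (0, 2, 4))


print(parse_color("FF8000"))  # (255, 128, 0)
print(parse_color("00805C"))  # (0, 128, 92)
```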

       ColorHit
              Default value is ‘00805C’ (dark green)

       ColorFile
              Default value is ‘0000FF’ (blue)

       ColorSite
              Default value is ‘FF8000’ (orange)

       ColorKbyte
              Default value is ‘FF0000’ (red)

       ColorPage
              Default value is ‘00E0FF’ (cyan)

       ColorVisit
              Default value is ‘FFFF00’ (yellow)

       PieColor1
              Default value is ‘00805C’ (dark green)

       PieColor2
              Default value is ‘0000FF’ (blue)

       PieColor3
              Default value is ‘FF8000’ (orange)

       PieColor4
              Default value is ‘FF0000’ (red)

GROUP* OPTIONS

       The  Group*  keywords  permit  the  grouping  of similar objects as if they were one. Grouped records are
       displayed in the ‘Top’ tables and can optionally be displayed in bold and/or  shaded.  Groups  cannot  be
       hidden,  and are not counted in the main totals. The Group* options do not hide the individual items that
       are members of the Group. If you wish to hide the records that match - so just  the  grouping  record  is
       displayed  -  follow with an identical Hide* keyword with the same value. Or use the single GroupAndHide*
       keyword that matches, instead of the Group* and Hide* combination.

       Group* keywords may have an optional label which will be displayed instead of the keyword's value. The
       label should be separated from the value by at least one white-space character, such as a space or tab.

       The Hide*, Group*, Ignore* and Include* keywords change the way Sites, URLs, Referrers, User Agents and
       User names are handled. The Ignore* keywords cause AWFFull to ignore matching records completely, as if
       they did not exist, so they are not counted in the main site totals. The Hide* keywords prevent matching
       items from being displayed in the 'Top' tables, but the items are still counted in the main totals. The
       Group* keywords behave as described above: a group is displayed alongside its members unless an
       identical Hide* keyword, or the equivalent GroupAndHide* keyword, is also given.

       The value can have either a leading or trailing '*' wildcard character. If no wildcard is found, a  match
       can  occur  anywhere in the string. Given a string ‘www.yourmama.com’, the values ‘your’, ‘*mama.com’ and
       ‘www.your*’ will all match.
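
       A minimal sketch of Group* lines using these matching rules. The ‘/images/’ value and all of the labels
       are illustrative only, and the label is assumed to follow the value after white-space as described
       above:

           # Leading '*': the value is anchored at the end of the host name
           GroupSite       *.gov.au        Australian Government
           # Trailing '*': the value is anchored at the start of the field
           GroupAgent      Bot*            Robots
           # No wildcard: matches anywhere in the URL
           GroupURL        /images/        Images
           # Hide the members so that only the grouping record is shown
           HideURL         /images/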

       GroupURL

       GroupSite

       GroupReferrer

       GroupUser

       GroupAgent

       GroupDomains
              The GroupDomains keyword allows you to group individual host names into their respective  domains.
              The  value  specifies  the  level  of grouping to perform, and can be thought of as 'the number of
              dots' that will be displayed. For example, if a visiting host  is  named  cust1.tnt.mia.uu.net,  a
              domain grouping of 1 will result in just "uu.net" being displayed, while a 2 will result in
              "mia.uu.net". The default value of zero disables this feature. Domains will only be grouped if
              they do not match any existing "GroupSite" records, which allows overriding this feature with your
              own if desired.
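
       For instance, using the host from the description above, the following sketch groups hosts to one dot
       while keeping a hand-written group for one domain. The GroupSite value and label are hypothetical:

           # cust1.tnt.mia.uu.net is reported under "uu.net"
           GroupDomains    1
           # Hosts matching a GroupSite record are left to that record instead
           GroupSite       *.example.net   Example customers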

HIDE* OPTIONS

       The  Hide*  keywords  will prevent things from being displayed in the 'Top' tables. The hidden items will
       still be counted in the main totals.

       HideURL
              Hide URL matching name.

       HideSite
              Hide site matching name.

       HideReferrer
              Hide referrer matching name.

       HideUser
              Hide user names matching name.

       HideAgent
              Hide user agents matching name.

       HideAllSites
              HideAllSites allows forcing individual sites to be hidden in  the  report.  This  is  particularly
              useful when used in conjunction with the "GroupDomains" feature, but could be useful in other
              situations as well, such as when you only want  to  display  grouped  sites  (with  the  GroupSite
              keywords...).  The  value  for  this  keyword  can be either 'yes' or 'no', with 'no' the default,
              allowing individual sites to be displayed.
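
       A sketch of typical Hide* usage. The values are hypothetical and should be replaced with names from
       your own logs:

           # Keep noisy entries out of the 'Top' tables; they still count in the totals
           HideURL         /images/
           HideReferrer    www.my_example.site
           # Show only grouped sites, here the domains produced by GroupDomains
           GroupDomains    1
           HideAllSites    yes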

GROUPANDHIDE* OPTIONS

       All the Hide* and Group* "name" options can be combined in a single configuration line, e.g.
       GroupAndHideURL. If you start using the Group* options you will find that you tend to match every
       Group* option with a corresponding Hide* option. The GroupAndHide* options simply remove this
       duplication.

       GroupAndHideURL

       GroupAndHideSite

       GroupAndHideReferrer

       GroupAndHideUser

       GroupAndHideAgent
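
       Assuming the GroupAndHide* keywords take the same value and optional label as Group*, the two forms
       below should be equivalent; a real configuration would use one or the other. The URL and label are
       hypothetical:

           # Long form: group, then hide the members
           GroupURL        /cgi-bin/*      CGI Scripts
           HideURL         /cgi-bin/*

           # Short form
           GroupAndHideURL /cgi-bin/*      CGI Scripts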

DATA DUMP OPTIONS

       The Dump* keywords allow the dumping of Sites, URLs, Referrers, User Agents, User names and Search
       strings to separate tab delimited text files, suitable for  import  into  most  database  or  spreadsheet
       programs.

       DumpPath
              DumpPath  specifies  the  path to dump the files. If not specified, it will default to the current
              output directory. Do not use a trailing slash ('/').

       DumpHeader
              The DumpHeader keyword specifies if a header record should be written to the file. A header record
              is the first record of the file, and contains the labels for each field written.  Normally,  files
              that  are  intended  to  be  imported  into a database system will not need a header record, while
              spreadsheets usually do. Value can be either 'yes' or 'no', with 'no' being the default.

       DumpExtension
              DumpExtension allows you to specify the dump filename extension to use. The default is "tab", but
              some programs are picky about the filenames they use, so you may change it here (for example, some
              people may prefer to use "csv").

       DumpURLs

       DumpEntryPages

       DumpExitPages

       DumpSites

       DumpReferrers

       DumpSearchStr

       DumpUsers

       DumpAgents

       DumpCountries
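
       A sketch of a dump configuration. The path is hypothetical, and the individual Dump* keywords are
       assumed to take 'yes' or 'no' as they do in Webalizer, from which AWFFull is forked:

           # No trailing slash on the path
           DumpPath        /var/tmp/awffull-dumps
           DumpHeader      yes
           DumpExtension   csv
           DumpURLs        yes
           DumpSites       yes

       As described above, DumpExtension only sets the filename extension; the files are still written as tab
       delimited text.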

EXAMPLES

       Sample Extract of a configuration file:

       # The 'auto' value means that AWFFull will try and work out what log format
       # you are sending to it. If no joy, AWFFull will immediately exit.

       LogType        auto

       # OutputDir is where you want to put the output files.  This should
       # be a full path name, however relative ones might work as well.
       # If no output directory is specified, the current directory will be used.

       OutputDir      .

       Minimal configuration file:

       # Sample *MINIMAL* AWFFull configuration file
       #
       # The below settings are the only ones you *really* need to worry about
       # when configuring AWFFull. See the sample.conf file for all options if
       # the below only serves to whet your appetite.
       #
       # See awffull(1) or sample.conf for full explanations.

       # We can do a little bit each day, or hour...
       Incremental             yes

       # Your server name to display
       HostName                www.my_example.site

       ##---------------------------
       # Use PageType OR NotPageType
       # I personally prefer NotPageType - YMMV!
       PageType                htm
       PageType                html
       PageType                php
       #PageType               pl
       #PageType               cfm
       #PageType               pdf
       #PageType               txt
       #PageType               cgi
       ### OR! ---------------------
       #NotPageType            gif
       #NotPageType            css
       #NotPageType            js
       #NotPageType            jpg
       #NotPageType            ico
       #NotPageType            png
       ##---------------------------

       # Should always fold in Sequence Errors. Logs can be messy...
       FoldSeqErr              yes

       # If you want to see the country flags, uncomment the following.
       # This is the, possibly relative, URL where the flag files are located.
       #FlagsLocation          flags

SEE ALSO

       awffull(1)

BUGS

       None currently known. YMMV....

       Report    bugs   to   ⟨https://bugs.launchpad.net/awffull⟩,   or   use   the   email   discussion   list:
       <awffull@stedee.id.au>

NOTES

       In case it is not obvious: AWFFull is a play/pun on the word ‘awful’, and is pronounced the same way. Yes
       it was deliberate.

REFERENCES

       [1] Web Site Measurement Hacks. Eric T. Peterson (and others).  O'Reilly. ISBN 0-596-00988-7.

                                                   2008-Dec-13                                   awffull.conf(5)