Posts Tagged web log analysis

Web Log Analysis Tutorial – Lesson 2: Basic concept of web log analysis

Table of Contents

  1. Hits and Visits
  2. Page views
  3. Bandwidth
  4. Web Spider
  5. Stolen object
  6. Unique Visitors
  7. Session
  8. Referrer
  9. Bounce rate

I. Hits and Visits

Every request that a web server answers is written to the log as one entry and counted as one “Hit”. Hits can be requests for pages, images, animations, audio, video, downloads, PDF or Word documents or anything else that you allow visitors to access. When a web browser loads a page, it also loads all the components referenced by that page. For example, if a web page contains 5 images, a visit to that page will generate 6 “Hits” on the web server: one hit for the web page and 5 hits for the images.

A unique visitor is identified by IP address or cookie. By default, a visit (session) ends when the visitor has been inactive for more than 30 minutes. A single unique visitor can therefore come to your web site twice and be reported as two visits.

If the visitor leaves the web site and comes back more than 30 minutes later, Nihuo Web Log Analyzer will report 2 visits. If the visitor comes back within 30 minutes, Nihuo Web Log Analyzer will still report 1 visit.
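
As a rough illustration, the 30-minute rule can be sketched in Python as follows. The function name and the sample timestamps are made up for this example; this is not how Nihuo Web Log Analyzer is implemented internally, only a minimal sketch of the idea.

```python
from datetime import datetime, timedelta

# Assumed default, matching the 30-minute rule described above.
SESSION_TIMEOUT = timedelta(minutes=30)

def count_visits(request_times, timeout=SESSION_TIMEOUT):
    """Count visits for ONE visitor, given the times of that visitor's requests."""
    visits = 0
    last_seen = None
    for ts in sorted(request_times):
        # A new visit starts on the first request, or after a gap
        # longer than the timeout.
        if last_seen is None or ts - last_seen > timeout:
            visits += 1
        last_seen = ts
    return visits

# Hypothetical request times for a single visitor.
times = [
    datetime(2024, 1, 1, 9, 0),
    datetime(2024, 1, 1, 9, 10),   # 10 minutes later: same visit
    datetime(2024, 1, 1, 10, 0),   # 50 minutes later: new visit
]
print(count_visits(times))
```

With these sample times the visitor is counted as one unique visitor but two visits.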

II. Page views

A page is any file or content delivered by a web server that would generally be considered a web document. This includes HTML pages (.html, .htm, .shtml) and script-generated pages (.cgi, .asp, .cfm, etc.). Image files (.jpeg, .gif, .png), JavaScript (.js) and style sheets (.css) are generally not considered to be pages.

A page view (PV), or page impression, is a request to load a single page of an Internet site. On the World Wide Web, a page request typically results from a web surfer clicking on a link on another HTML page that points to the page in question. This should be contrasted with a hit, which refers to a request for any file from a web server. There may therefore be many hits per page view, since a page can be made up of multiple files.
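
The difference between hits and page views can be sketched with a small Python example. The extension list follows the examples given above; treating paths without an extension (such as "/") as pages is an assumption of this sketch, and the request list is invented.

```python
import posixpath
from urllib.parse import urlsplit

# Extensions treated as pages, taken from the examples in the text above.
PAGE_EXTENSIONS = {".html", ".htm", ".shtml", ".cgi", ".asp", ".cfm"}

def is_page(request_url):
    """Return True if the requested URL looks like a page rather than an object."""
    path = urlsplit(request_url).path           # drop any ?query part
    ext = posixpath.splitext(path)[1].lower()
    # Assumption: extensionless paths such as "/" or "/about/" are pages.
    return ext == "" or ext in PAGE_EXTENSIONS

# Hypothetical requests caused by two page loads.
requests = [
    "/index.html", "/img/a.png", "/img/b.gif",
    "/style.css", "/app.js", "/cgi-bin/search.cgi?q=log",
]
hits = len(requests)
page_views = sum(1 for url in requests if is_page(url))
print(hits, page_views)
```

Here six hits correspond to only two page views.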

III. Bandwidth

Bandwidth is a measure of the traffic on a site, expressed in kilobytes of data transferred. If you are billed for bandwidth usage on a monthly basis, you can see an estimate of the amount of bandwidth your web site used in the General Statistics report.
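
A bandwidth estimate is essentially the sum of the "bytes sent" field over all log entries. The Python sketch below assumes that this field has already been extracted from each log line as text, that "-" marks a response without a body (as in Apache logs), and that one kilobyte means 1024 bytes. The function name and sample values are hypothetical.

```python
def total_kilobytes(byte_fields):
    """Sum the bytes-sent fields of many log entries and return kilobytes."""
    # "-" means no bytes were sent for that request, so it is skipped.
    total_bytes = sum(int(field) for field in byte_fields if field != "-")
    return total_bytes / 1024  # assumption: 1 KB = 1024 bytes

# Hypothetical bytes-sent values from three log entries.
print(total_kilobytes(["2048", "-", "1024"]))
```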

IV. Web Spider

A web spider, also known as a crawler or robot, is a program used by search engines. It travels across the internet and scans web pages so that they can be included in the search engine’s index. All activity caused by web spiders is also recorded in the web log files.
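
Spiders usually identify themselves in the User-Agent field of the log, so a simple way to spot them is to look for typical keywords. The Python sketch below is only a heuristic: the keyword list is an assumption for illustration, it is far from complete, and a spider that pretends to be a normal browser will not be detected.

```python
# Assumed keyword list; real analyzers use much longer, maintained lists.
SPIDER_HINTS = ("bot", "crawler", "spider", "slurp")

def looks_like_spider(user_agent):
    """Return True if the User-Agent string contains a typical spider keyword."""
    ua = user_agent.lower()
    return any(hint in ua for hint in SPIDER_HINTS)

googlebot = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
firefox = "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/115.0"
print(looks_like_spider(googlebot), looks_like_spider(firefox))
```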

V. Stolen object

The Stolen object report reveals cases in which your images and other non-page objects have been embedded in, or directly linked to by, pages on other web sites. This does NOT mean that the files have been stolen in any legal sense. It does, however, mean that your content is being displayed, heard or shown outside the context of your own web pages.

For example, if an outside site places this code in a popular web page:

<img src="http://www.yoursite.com/yourpicture.jpg">

Then your image will be displayed thousands of times, possibly without any attribution or permission on your part. This report is extremely valuable in identifying such situations.
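
In the log, such requests show up as hits on your objects whose referrer is a page on somebody else’s site. A minimal Python sketch of that check is shown below. The host names and sample entries are invented, and a real report would also need to decide how to treat requests without a referrer; here they are simply not counted.

```python
from urllib.parse import urlsplit

# Assumption: the host names under which your own site is reachable.
OWN_HOSTS = {"www.yoursite.com", "yoursite.com"}

def is_hotlinked(referrer, own_hosts=OWN_HOSTS):
    """Return True if the referrer is a page on a foreign web site."""
    if not referrer or referrer == "-":
        return False                      # no referrer logged: cannot tell
    host = urlsplit(referrer).hostname    # None if the referrer is not a URL
    return host is not None and host not in own_hosts

# Hypothetical (requested object, referrer) pairs from a log.
entries = [
    ("/yourpicture.jpg", "http://www.othersite.example/popular.html"),
    ("/yourpicture.jpg", "http://www.yoursite.com/gallery.html"),
    ("/yourpicture.jpg", "-"),
]
stolen = [(obj, ref) for obj, ref in entries if is_hotlinked(ref)]
print(stolen)
```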

VI. Unique Visitors

The number of distinct individuals who visit a web site during a specific period of time. The same person visiting twice within that period is only counted once.
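
In code, counting unique visitors amounts to counting distinct visitor identifiers. The short Python sketch below assumes the IP address is used as the identifier (a cookie value would work the same way); the addresses are sample values.

```python
def unique_visitors(visitor_ids):
    """Count distinct visitors, given one identifier per request."""
    return len(set(visitor_ids))

# Hypothetical client IPs from three requests; one visitor came twice.
ips = ["192.0.2.10", "192.0.2.11", "192.0.2.10"]
print(unique_visitors(ips))
```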

VII. Session

A period of interaction between a visitor’s browser and a particular web site, ending when the browser is closed or shut down, or when the user has been inactive on that site for a specified period of time.

For the purpose of Nihuo Web Log Analyzer reports, a session is considered to have ended if the user has been inactive on the site for 30 minutes. You can change this setting in the Option dialog.

VIII. Referrer

An HTTP referrer, or simply referrer, is anything online that drives visits and visitors to your Web site. This can include:

  • search engines
  • blogs
  • link lists
  • banner ads
  • email
  • affiliate links
  • links built into software

Technically, even offline sources like print ads or references in books or magazines are referrers, but these aren’t specifically captured in the server referrer log. When a Web developer uses the term “referrer”, she means those sites or services that are recorded in the Web server logs.
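
A referrer report is typically built by grouping the logged referrer URLs by host name. The Python sketch below shows one possible way to do this. The label used for missing referrers and the sample URLs are assumptions made for the example.

```python
from collections import Counter
from urllib.parse import urlsplit

def referrer_hosts(referrers):
    """Count how many requests each referring host sent."""
    counts = Counter()
    for ref in referrers:
        # "-" or an empty value means that no referrer was logged.
        host = urlsplit(ref).hostname if ref and ref != "-" else None
        counts[host or "(direct or unknown)"] += 1
    return counts

# Hypothetical referrer values taken from four log entries.
referrers = [
    "http://www.google.com/search?q=log",
    "http://www.google.com/search?q=analyzer",
    "-",
    "http://blog.example/post",
]
counts = referrer_hosts(referrers)
print(counts.most_common())
```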

IX. Bounce rate

Bounce rate essentially represents the percentage of visitors who enter a site and then “bounce” away to a different site, rather than continuing on to other pages within the same site.

The formula used to calculate bounce rate is:

Bounce Rate = Total Number of Single-Page Visitors / Total Number of Visitors

A bounce occurs when a web site visitor only views a single page on a website, that is, the visitor leaves a site without visiting any other pages before a specified session-timeout occurs. There is no industry-standard minimum or maximum time by which a visitor must leave in order for a bounce to occur. Rather, this is determined by the session timeout of the analytics tracking software.
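
The formula can be checked with a few lines of Python. The sketch assumes that the number of pages viewed in each visit has already been counted; the function name and the sample numbers are invented.

```python
def bounce_rate(pages_per_visit):
    """Return the share of visits in which exactly one page was viewed."""
    if not pages_per_visit:
        return 0.0                      # avoid dividing by zero
    single_page = sum(1 for pages in pages_per_visit if pages == 1)
    return single_page / len(pages_per_visit)

# Four hypothetical visits that viewed 1, 3, 1 and 5 pages.
print(bounce_rate([1, 3, 1, 5]))
```

Two of the four visits viewed a single page, so the bounce rate in this example is 0.5, or 50%.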

Web Log Analysis Tutorial – Lesson 1 : Getting Started with Nihuo Web Log Analyzer

Table of Contents

  1. Introduction
  2. Download and install
  3. Creating your 1st analysis task
  4. Web Log Format
  5. Related learning resources

I. Introduction

This tutorial is your starting point for learning web log analysis. It
shows you some of the things you can discover about your visitors
through analysis of your web site logs. It uses the Windows version of
Nihuo Web Log Analyzer to provide examples of reports, but the knowledge
gained can be applied to the Linux version and to any other traffic
analysis tool.

II. Download and install

If you have not downloaded Nihuo Web Log Analyzer, please download and
install the latest version from
http://www.loganalyzer.net/download.html before proceeding with this
tutorial.

III. Creating your 1st analysis task

1. Where can I find my IIS log files?

To determine where your IIS log files are stored, follow these steps
on your server:

  1. Go to Start -> Control Panel -> Administrative Tools
  2. Run Internet Information Services (IIS).
  3. Find your Web site under the tree on the left.
  4. If your server is IIS 7:
    1. Click the Logging icon on the right.
    2. At the bottom of the Logging page, you will see a box that
      contains the log file directory.
  5. If your server is IIS 6:
    1. Right-click on the Web site and choose Properties.
    2. On the Web site tab, you will see an option near the bottom that
      says “Active Log Format”. Click on the Properties button.
    3. At the bottom of the General Properties tab, you will see a box
      that contains the log file directory and the log file name.

2. Where can I find my Apache access log files?

The location and content of the access log are controlled by the
CustomLog directive. Default Apache access log file locations:

  • RHEL / Red Hat / CentOS / Fedora Linux Apache access log file
    location - /var/log/httpd/access_log
  • Debian / Ubuntu Linux Apache access log file location -
    /var/log/apache2/access.log
  • FreeBSD Apache access log file location -
    /var/log/httpd-access.log

To find the exact Apache log file location, you can use the grep command:

  • grep CustomLog /usr/local/etc/apache22/httpd.conf
  • grep CustomLog /etc/apache2/apache2.conf
  • grep CustomLog /etc/httpd/conf/httpd.conf

Sample output (one or more CustomLog directives such as the following):

CustomLog "/var/log/httpd-access.log" common

CustomLog "/var/log/httpd-access.log" combined
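
To give an idea of what a log written in the combined format contains, the Python sketch below splits one line into its fields with a regular expression. The pattern is a simplified assumption that also accepts common-format lines (which have no referrer and user agent), and it does not handle quotes escaped inside a field. The sample line follows the well-known example from the Apache documentation.

```python
import re
from datetime import datetime

# Simplified pattern for the common and combined log formats (assumption:
# no escaped quotes inside the request, referrer or user agent).
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<bytes>\S+)'
    r'(?: "(?P<referer>[^"]*)" "(?P<agent>[^"]*)")?'
)

def parse_line(line):
    """Return a dict of log fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    # Convert e.g. "10/Oct/2000:13:55:36 -0700" into a datetime object.
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    return entry

sample = ('127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
          '"GET /apache_pb.gif HTTP/1.0" 200 2326 '
          '"http://www.example.com/start.html" '
          '"Mozilla/4.08 [en] (Win98; I ;Nav)"')
entry = parse_line(sample)
print(entry["host"], entry["status"], entry["bytes"], entry["referer"])
```

For a line in common format, the "referer" and "agent" entries are None, which is why referrer and browser reports need the combined format.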

3. How to create my first analysis task?

Please see the online step-by-step Flash tutorial at http://loganalyzer.net/log-analysis-tutorial/creating-project.html.

IV. Web Log Format

It is critical to set up your web server logging in a format that allows
Nihuo Web Log Analyzer to properly interpret the data and produce fully
detailed reporting.

1. Apache

By default, Apache generally logs in what’s called common log format,
and also provides an option to log in a more detailed format known as
NCSA extended/combined log format. For optimal reporting, Nihuo strongly
recommends the NCSA extended/combined format. NCSA custom log formats
can be analyzed by Nihuo Web Log Analyzer too.

2. Microsoft Internet Information Server (IIS)

Nihuo Web Log Analyzer can provide very basic reporting if your IIS log
files have, at the very least, the following fields:

  • date
  • time
  • c-ip
  • cs-uri-stem
  • sc-status
  • sc-bytes

However, this minimal logging does not provide enough information for
Referral and Browser reporting. Therefore it is advisable to set more
detailed logging properties for your IIS server.

For more detailed reports, please log the following fields in your IIS
log files:

  • c-ip
  • cs-method
  • cs-host
  • cs-uri-stem
  • cs-uri-query
  • sc-status
  • sc-bytes
  • time-taken
  • cs(Referer)
  • cs(User-Agent)
  • cs(Cookie)
  • cs-username
  • date
  • time
  • s-ip
  • s-port
  • sc-win32-status
  • sc-substatus
  • s-sitename
  • s-computername
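
IIS writes the names of the logged fields into a "#Fields:" header line of each log file, so a program can find out which of the fields above are available. The Python sketch below reads such a file into dictionaries keyed by field name. The function name and the sample log lines are invented for illustration, and only the minimal field set is shown.

```python
def parse_w3c_log(lines):
    """Yield one dict per request line, keyed by the names in #Fields."""
    fields = []
    for raw in lines:
        line = raw.strip()
        if line.startswith("#Fields:"):
            # The header lists the field names, separated by spaces.
            fields = line[len("#Fields:"):].split()
        elif line and not line.startswith("#") and fields:
            yield dict(zip(fields, line.split()))

# Hypothetical log content using the minimal field set.
sample_log = [
    "#Software: Microsoft Internet Information Services 7.5",
    "#Fields: date time c-ip cs-uri-stem sc-status sc-bytes",
    "2024-01-01 09:00:00 192.0.2.10 /index.html 200 5120",
    "2024-01-01 09:00:01 192.0.2.10 /logo.png 200 2048",
]
rows = list(parse_w3c_log(sample_log))
total_bytes = sum(int(row["sc-bytes"]) for row in rows)
print(len(rows), total_bytes)
```

Because the field names come from the header, the same code keeps working when more fields are enabled in the IIS logging properties.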

V. Related learning resources
