Web Analytics

Web analytics is the measurement, collection, analysis, and reporting of web data to understand and optimize web usage. Web analytics is not just a process for measuring web traffic; it can also be used as a tool for business and market research and to assess and improve website effectiveness. Web analytics applications can also help companies measure the results of traditional print or broadcast advertising campaigns, for example by estimating how traffic to a website changes after a new advertising campaign launches. Web analytics provides information about the number of visitors to a website and the number of page views, and it can be used to build user behavior profiles. It helps gauge traffic and popularity trends, which is useful for market research.


Basic steps of the web analytics process

Most web analytics processes come down to four essential stages or steps:

* Collection of data: This stage is the collection of the basic, elementary data. Usually, these data are counts of things. The objective of this stage is to gather the data.
* Processing of data into information: This stage usually takes counts and turns them into ratios, although there may still be some counts. The objective of this stage is to take the data and shape it into information, specifically metrics.
* Developing KPIs: This stage focuses on taking the ratios (and counts) and infusing them with business strategies; the results are referred to as key performance indicators (KPIs). KPIs often deal with conversion aspects, but not always; it depends on the organization.
* Formulating online strategy: This stage is concerned with the online goals, objectives, and standards for the organization or business. These strategies are usually related to making money, saving money, or increasing market share.

Another essential function that analysts develop for website optimization is experimentation:

* Experiments and testing: A/B testing is a controlled experiment with two variants in online settings, such as web development. The goal of A/B testing is to identify and suggest changes to web pages that increase or maximize the effect of a statistically tested result of interest.

Each stage impacts or can impact (i.e., drives) the stage preceding or following it. So, sometimes the data that is available for collection impacts the online strategy. Other times, the online strategy affects the data collected.
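As an illustration of what a "statistically tested result" can look like in an A/B test, the sketch below compares the conversion rates of two page variants with a two-proportion z-test. This is only one of several possible tests, and the function name and sample counts are assumptions made for the example, not part of any particular analytics product.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates.

    conv_a / n_a: conversions and visitors for variant A (hypothetical inputs)
    conv_b / n_b: conversions and visitors for variant B
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled conversion rate under the null hypothesis of "no difference".
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Illustrative numbers only: 100 of 1000 visitors convert on A, 150 of 1000 on B.
z, p = two_proportion_z_test(100, 1000, 150, 1000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests the difference between the variants is unlikely to be due to chance alone; the threshold to act on is a business decision rather than something the test itself provides.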


Web analytics technologies

There are at least two categories of web analytics, ''off-site'' and ''on-site'' web analytics.

* Off-site web analytics refers to web measurement and analysis regardless of whether a person owns or maintains a website. It includes the measurement of a website's ''potential'' audience (opportunity), share of voice (visibility), and buzz (comments) happening on the Internet as a whole.
* On-site web analytics, the more common of the two, measures a visitor's behavior once ''on a specific website''. This includes its drivers and conversions; for example, the degree to which different landing pages are associated with online purchases. On-site web analytics measures the performance of a specific website in a commercial context. This data is typically compared against key performance indicators for performance and is used to improve the audience response to a website or marketing campaign. Google Analytics and Adobe Analytics are the most widely used on-site web analytics services, although new tools are emerging that provide additional layers of information, including heat maps and session replay.

Historically, web analytics has been used to refer to on-site visitor measurement. However, this meaning has become blurred, mainly because vendors are producing tools that span both categories. Many different vendors provide on-site web analytics software and services. There are two main technical ways of collecting the data. The first and traditional method, ''server log file analysis'', reads the logfiles in which the web server records file requests by browsers. The second method, ''page tagging'', uses JavaScript embedded in the webpage to make image requests to a third-party analytics-dedicated server whenever a webpage is rendered by a web browser or, if desired, when a mouse click occurs. Both collect data that can be processed to produce web traffic reports.


Web analytics data sources

The fundamental goal of web analytics is to collect and analyze data related to web traffic and usage patterns. The data mainly comes from four sources:

# Direct HTTP request data: comes directly from HTTP request messages (HTTP request headers).
# Network-level and server-generated data associated with HTTP requests: not part of an HTTP request, but required for successful request transmission; for example, the IP address of a requester.
# Application-level data sent with HTTP requests: generated and processed by application-level programs (such as JavaScript, PHP, and ASP.Net), including sessions and referrals. These are usually captured by internal logs rather than public web analytics services.
# External data: can be combined with on-site data to help augment the website behavior data described above and interpret web usage. For example, IP addresses are usually associated with geographic regions and internet service providers; other external data include e-mail open and click-through rates, direct mail campaign data, sales, lead history, or other data types as needed.


Web server log file analysis

Web servers record some of their transactions in a log file. It was soon realized that these log files could be read by a program to provide data on the popularity of the website. Thus arose web log analysis software.

In the early 1990s, website statistics consisted primarily of counting the number of client requests (or ''hits'') made to the web server. This was a reasonable method initially, since each website often consisted of a single HTML file. However, with the introduction of images in HTML, and websites that spanned multiple HTML files, this count became less useful. The first true commercial log analyzer was released by IPRO in 1994 (Brian Clifton and Omega Digital Media Ltd, ''Web Traffic Data Sources and Vendor Comparison'').

Two units of measure were introduced in the mid-1990s to gauge more accurately the amount of human activity on web servers. These were ''page views'' and ''visits'' (or ''sessions''). A ''page view'' was defined as a request made to the web server for a page, as opposed to a graphic, while a ''visit'' was defined as a sequence of requests from a uniquely identified client that expired after a certain amount of inactivity, usually 30 minutes.

The emergence of search engine spiders and robots in the late 1990s, along with web proxies and dynamically assigned IP addresses for large companies and ISPs, made it more difficult to identify unique human visitors to a website. Log analyzers responded by tracking visits by cookies, and by ignoring requests from known spiders.

The extensive use of web caches also presented a problem for log file analysis. If a person revisits a page, the second request will often be retrieved from the browser's cache, and so no request will be received by the web server. This means that the person's path through the site is lost. Caching can be defeated by configuring the web server, but this can result in degraded performance for the visitor and a bigger load on the servers.
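The difference between hits, page views, and visits can be made concrete with a small log parser. The sketch below is a simplified illustration, not how any particular log analyzer works. It assumes log lines in the widely used "combined" log format, assumes entries arrive in chronological order, treats a hypothetical list of file suffixes as "graphics and other assets", and identifies a client by IP address plus User-Agent string.

```python
import re
from datetime import datetime, timedelta

# Assumed input: "combined" log format lines, in chronological order.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
    r'(?: "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)")?'
)
# Hypothetical list of suffixes that count as assets rather than pages.
ASSET_SUFFIXES = (".png", ".jpg", ".jpeg", ".gif", ".css", ".js", ".ico", ".svg")
SESSION_TIMEOUT = timedelta(minutes=30)


def summarize(lines):
    """Count hits, page views, and visits in an iterable of log lines."""
    hits = page_views = visits = 0
    last_seen = {}  # client -> time of that client's latest page view
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue  # skip lines that do not look like log entries
        hits += 1  # every logged request is a hit
        path = match.group("path").split("?", 1)[0].lower()
        if path.endswith(ASSET_SUFFIXES):
            continue  # a graphic or other asset, not a page
        page_views += 1
        when = datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z")
        client = (match.group("ip"), match.group("agent"))
        previous = last_seen.get(client)
        # A new visit starts on first sight or after 30 minutes of inactivity.
        if previous is None or when - previous > SESSION_TIMEOUT:
            visits += 1
        last_seen[client] = when
    return {"hits": hits, "page_views": page_views, "visits": visits}


SAMPLE = [
    '203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326 "-" "ExampleBrowser/1.0"',
    '203.0.113.5 - - [10/Oct/2023:13:55:37 +0000] "GET /logo.png HTTP/1.1" 200 512 "/index.html" "ExampleBrowser/1.0"',
    '203.0.113.5 - - [10/Oct/2023:13:58:00 +0000] "GET /about.html HTTP/1.1" 200 1800 "/index.html" "ExampleBrowser/1.0"',
    '203.0.113.5 - - [10/Oct/2023:15:00:00 +0000] "GET /index.html HTTP/1.1" 200 2326 "-" "ExampleBrowser/1.0"',
]
print(summarize(SAMPLE))
```

In the sample, four requests yield three page views (the image is excluded) and two visits, because the last request arrives more than 30 minutes after the previous one. A production analyzer would also filter known spiders and prefer cookies over IP address and User-Agent where available.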


Page tagging

Concerns about the accuracy of log file analysis in the presence of caching, and the desire to be able to perform web analytics as an outsourced service, led to the second data collection method, page tagging or "web beacons".

In the mid-1990s, web counters were commonly seen; these were images included in a web page that showed the number of times the image had been requested, which was an estimate of the number of visits to that page. In the late 1990s, this concept evolved to include a small invisible image instead of a visible one, and, by using JavaScript, to pass along with the image request certain information about the page and the visitor. This information can then be processed remotely by a web analytics company, and extensive statistics generated.

The web analytics service also manages the process of assigning a cookie to the user, which can uniquely identify them during their visit and in subsequent visits. Cookie acceptance rates vary significantly between websites and may affect the quality of data collected and reported.

Collecting website data using a third-party data collection server (or even an in-house data collection server) requires an additional DNS lookup by the user's computer to determine the IP address of the collection server. On occasion, delays in completing successful or failed DNS lookups may result in data not being collected.

With the increasing popularity of Ajax-based solutions, an alternative to the use of an invisible image is to implement a call back to the server from the rendered page. In this case, when the page is rendered on the web browser, a piece of JavaScript code calls back to the server and passes information about the client that can then be aggregated by a web analytics company.
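To make the image-request mechanism more tangible, the sketch below models the data flow in Python: how page and visitor details could be packed into the query string of an invisible-image request, and how a collection server could unpack them. In practice the request is built by JavaScript in the browser. The endpoint `collector.example`, the parameter names (`page`, `ref`, `vid`, `scr`), and both function names are hypothetical.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical collection endpoint serving a 1x1 invisible image.
COLLECTOR = "https://collector.example/pixel.gif"


def build_beacon_url(page, referrer, visitor_id, screen):
    """Build the image URL a page tag might request (illustrative only)."""
    params = {"page": page, "ref": referrer, "vid": visitor_id, "scr": screen}
    return COLLECTOR + "?" + urlencode(params)


def parse_beacon_url(url):
    """Recover the tagged values on the collection server side."""
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    # Each parameter appears once in this sketch, so keep the first value.
    return {key: values[0] for key, values in query.items()}


url = build_beacon_url("/pricing?plan=pro", "", "abc123", "1920x1080")
print(url)
print(parse_beacon_url(url))
```

The visitor identifier here stands in for the cookie value that the analytics service assigns; details such as screen size are examples of information a script can read in the browser but that a plain server log would not contain.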


Logfile analysis vs page tagging

Both logfile analysis programs and page tagging solutions are readily available to companies that wish to perform web analytics. In some cases, the same web analytics company will offer both approaches. The question then arises of which method a company should choose. There are advantages and disadvantages to each approach.


Advantages of logfile analysis

The main advantages of log file analysis over page tagging are as follows:

* The web server normally already produces log files, so the raw data is already available. No changes to the website are required.
* The data is on the company's servers and is in a standard, rather than a proprietary, format. This makes it easy for a company to switch programs later, use several different programs, and analyze historical data with a new program.
* Log files contain information on visits from search engine spiders, which generally are excluded from the analytics tools using JavaScript tagging. (Some search engines might not even execute JavaScript on a page.) Although these should not be reported as part of human activity, it is useful information for search engine optimization.
* Log files require no additional DNS lookups or TCP slow starts. Thus there are no external server calls that can slow page load speeds or result in uncounted page views.
* The web server reliably records every transaction it makes, e.g. serving PDF documents and content generated by scripts, and does not rely on the visitors' browsers cooperating.


Advantages of page tagging

The main advantages of page tagging over log file analysis are as follows:

* Counting is activated by opening the page (given that the web client runs the tag scripts), not requesting it from the server. If a page is cached, it will not be counted by server-based log analysis. Cached pages can account for up to one-third of all page views, which can negatively impact many site metrics.
* Data is gathered via a component ("tag") in the page, usually written in JavaScript. It is typically used in conjunction with a server-side scripting language (such as PHP) to manipulate and (usually) store the data in a database.
* The script may have access to additional information on the web client or on the user, not sent in the query, such as visitors' screen sizes and the price of the goods they purchased.
* Page tagging can report on events that do not involve a request to the web server, such as interactions within Flash movies, partial form completion, and mouse events.
* The page tagging service manages the process of assigning cookies to visitors; with log file analysis, the server has to be configured to do this.
* Page tagging is available to companies that do not have access to their web servers.
* Lately, page tagging has become a standard in web analytics.


Economic factors

Logfile analysis is almost always performed in-house. Page tagging can be performed in-house, but it is more often provided as a third-party service. The economic difference between these two models can also be a consideration for a company deciding which to purchase.

* Logfile analysis typically involves a one-off software purchase; however, some vendors are introducing maximum annual page views with additional costs to process additional information. In addition to commercial offerings, several open-source logfile analysis tools are available free of charge.
* For logfile analysis, data must be stored and archived, and the stored data often grows large quickly. Although the cost of hardware to do this is minimal, the overhead for an IT department can be considerable.
* For logfile analysis, software needs to be maintained, including updates and security patches.
* Complex page tagging vendors charge a monthly fee based on volume, i.e. the number of page views collected per month.

Which solution is cheaper to implement depends on the amount of technical expertise within the company, the vendor chosen, the amount of activity seen on the websites, the depth and type of information sought, and the number of distinct websites needing statistics.

Regardless of the vendor solution or data collection method employed, the cost of web visitor analysis and interpretation should also be included, that is, the cost of turning raw data into actionable information. This can come from the use of third-party consultants, the hiring of an experienced web analyst, or the training of a suitable in-house person. A cost-benefit analysis can then be performed. For example, what revenue increase or cost savings can be gained by analyzing the web visitor data?


Hybrid methods

Some companies produce solutions that collect data through both log files and page tagging and can analyze both kinds. By using a hybrid method, they aim to produce more accurate statistics than either method on its own. An early hybrid solution was produced in 1998 by Rufus Evison.


Geolocation of visitors

With IP geolocation, it is possible to track visitors' locations. Using an IP geolocation database or API, visitors can be geolocated to city, region, or country level. IP Intelligence, or Internet Protocol (IP) Intelligence, is a technology that maps the Internet and categorizes IP addresses by parameters such as geographic location (country, region, state, city and postcode), connection type, Internet Service Provider (ISP), proxy information, and more. The first generation of IP Intelligence was referred to as geotargeting or geolocation technology. This information is used by businesses for online audience segmentation in applications such as online advertising, behavioral targeting, content localization (or website localization), digital rights management, personalization, online fraud detection, localized search, enhanced analytics, global traffic management, and content distribution.
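A lookup against an IP geolocation database can be sketched as a search for the network range that contains the visitor's address. The table below is entirely made up for illustration: it uses address ranges reserved for documentation, and the country codes, city names, and ISP names are placeholders. Real databases are far larger and are typically queried through a vendor's library or API.

```python
import ipaddress

# Hypothetical geolocation table: (network range, location attributes).
GEO_TABLE = [
    (ipaddress.ip_network("192.0.2.0/24"),
     {"country": "AA", "city": "Example City", "isp": "Example ISP One"}),
    (ipaddress.ip_network("198.51.100.0/24"),
     {"country": "BB", "city": "Sample Town", "isp": "Example ISP Two"}),
]


def geolocate(ip_text):
    """Return location attributes for an IP address, or None if unknown."""
    address = ipaddress.ip_address(ip_text)
    for network, location in GEO_TABLE:
        if address in network:
            return location
    return None


print(geolocate("192.0.2.77"))
print(geolocate("10.0.0.1"))
```

The linear scan keeps the sketch short; a real lookup over many ranges would more likely use a sorted structure or a prefix tree.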


Click analytics

Click analytics, also known as clickstream, is a special type of web analytics that gives special attention to clicks. Commonly, click analytics focuses on on-site analytics. An editor of a website uses click analytics to determine the performance of his or her particular site with regard to where the users of the site are clicking.

Click analytics may happen in real time or in "unreal" time, depending on the type of information sought. Typically, front-page editors on high-traffic news media sites will want to monitor their pages in real time to optimize the content. Editors, designers, or other stakeholders may analyze clicks over a wider time frame to help them assess the performance of writers, design elements, advertisements, and so on.

Data about clicks may be gathered in at least two ways. Ideally, a click is "logged" when it occurs, and this method requires some functionality that picks up relevant information when the event occurs. Alternatively, one may assume that a page view is the result of a click, and therefore log a simulated click that led to that page view.
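The second approach, treating each page view as the result of a click, can be sketched by pairing every page view in a session with the one before it. The function names and sample paths are assumptions for illustration, and the sketch ignores cases where a page view does not follow from a click on the previous page, such as a typed address or a bookmark.

```python
from collections import Counter


def infer_clicks(page_views):
    """Pair each page view with the previous one as a simulated click."""
    return list(zip(page_views, page_views[1:]))


def most_common_clicks(sessions, top=3):
    """Count simulated (from_page, to_page) clicks across sessions."""
    counts = Counter()
    for page_views in sessions:
        counts.update(infer_clicks(page_views))
    return counts.most_common(top)


# Hypothetical sessions, each an ordered list of viewed pages.
sessions = [
    ["/", "/news"],
    ["/", "/news", "/sport"],
]
print(most_common_clicks(sessions, top=1))
```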


Customer lifecycle analytics

Customer lifecycle analytics is a visitor-centric approach to measuring that falls under the umbrella of lifecycle marketing. Page views, clicks and other events (such as API calls, access to third-party services, etc.) are all tied to an individual visitor instead of being stored as separate data points. Customer lifecycle analytics attempts to connect all the data points into a marketing funnel that can offer insights into visitor behavior and website optimization.


Other methods

Other methods of data collection are sometimes used. Packet sniffing collects data by sniffing the network traffic passing between the web server and the outside world. Packet sniffing involves no changes to the web pages or web servers. Integrating web analytics into the webserver software itself is also possible. Both these methods claim to provide better real-time data than other methods.


On-site web analytics metrics

There are no globally agreed definitions within web analytics as the industry bodies have been trying to agree on definitions that are useful and definitive for some time. The main bodies who have had input in this area have been the IAB (Interactive Advertising Bureau),
JICWEBS JICWEBS (the Joint Industry Committee for Web Standards) was created by the UK and Ireland media industry to ensure independence and comparability of measurement on the web. In effect JICWEBS works in partnership with the ABCe, with JICWEBS decidi ...
(The Joint Industry Committee for Web Standards in the UK and Ireland), and The DAA (Digital Analytics Association), formally known as the WAA (Web Analytics Association, US). However, many terms are used in consistent ways from one major analytics tool to another, so the following list, based on those conventions, can be a useful starting point: *
Bounce rate Bounce rate is an Internet marketing term used in web traffic analysis. It represents the percentage of visitors who enter the site and then leave ("bounce") rather than continuing to view other pages within the same site. Bounce rate is calculated ...
- The percentage of visits that are single-page visits and without any other interactions (clicks) on that page. In other words, a single click in a particular session is called a bounce. * Click path - the chronological sequence of page views within a visit or session. *
Hit Hit means to strike someone or something. Hit or HIT may also refer to: Arts, entertainment and media Fictional entities * Hit, a fictional character from '' Dragon Ball Super'' * Homicide International Trust, or HIT, a fictional organization ...
- A request for a file from the webserver. Available only in log analysis. The number of hits received by a website is frequently cited to assert its popularity, but this number is extremely misleading and dramatically overestimates popularity. A single web-page typically consists of multiple (often dozens) of discrete files, each of which is counted as a hit as the page is downloaded, so the number of hits is really an arbitrary number more reflective of the complexity of individual pages on the website than the website's actual popularity. The total number of visits or page views provides a more realistic and accurate assessment of popularity. *
Page view In web analytics and website management, a pageview or page view, abbreviated in business to PV and occasionally called page impression, is a request to load a single HTML file ( web page) of an Internet site. On the World Wide Web, a page request ...
- A request for a file, or sometimes an event such as a mouse click, that is defined as a page in the setup of the web analytics tool. In page tagging, it is an occurrence of the script being run. In log analysis, a single page view may generate multiple hits, as all the resources required to view the page (images, .js and .css files) are also requested from the webserver.
* Visitor/unique visitor/unique user - The uniquely identified client that is generating page views or hits within a defined period of time (e.g. day, week or month). A uniquely identified client is usually a combination of a machine (one's desktop computer at work, for example) and a browser (Firefox on that machine). The identification is usually via a persistent cookie that has been placed on the computer by the site page code. An older method, used in log file analysis, is the unique combination of the computer's IP address and the User-Agent (browser) information provided to the web server by the browser. It is important to understand that the "visitor" is not the same as the human being sitting at the computer at the time of the visit, since an individual human can use different computers or, on the same computer, different browsers, and will be seen as a different visitor in each circumstance. Increasingly, but still somewhat rarely, visitors are uniquely identified by Flash LSOs (Local Shared Objects), which are less susceptible to privacy enforcement.
* Visit/session - A visit or session is defined as a series of page requests or, in the case of tags, image requests from the same uniquely identified client. A unique client is commonly identified by an IP address or a unique ID that is placed in the browser cookie. A visit is considered ended when no requests have been recorded in some number of elapsed minutes. A 30-minute limit ("time out") is used by many analytics tools but can, in some tools (such as Google Analytics), be changed to another number of minutes. Analytics data collectors and analysis tools have no reliable way of knowing whether a visitor has looked at other sites between page views; a visit is considered one visit as long as the events (page views, clicks, whatever is being recorded) are no more than 30 minutes apart. Note that a visit can consist of a single page view or thousands. A unique visit session can also be extended if the time between page loads indicates that a visitor has been viewing the pages continuously.
* Active time/engagement time - Average amount of time that visitors spend actually interacting with content on a web page, based on mouse moves, clicks, hovers, and scrolls. Unlike session duration and page view duration/time on page, this metric can accurately measure the length of engagement in the final page view, but it is not available in many analytics tools or data collection methods.
* Average page depth/page views per average session - Page depth is the approximate "size" of an average visit, calculated by dividing the total number of page views by the total number of visits.
* Average page view duration - Average amount of time that visitors spend on an average page of the site.
* Click - "refers to a single instance of a user following a hyperlink from one page in a site to another".
* Event - A discrete action or class of actions that occur on a website. A page view is a type of event.
Events also encapsulate clicks, form submissions, keypress events, and other client-side user actions.
* Exit rate/% exit - A statistic applied to an individual page, not a website. The percentage of visits seeing a page where that page is the final page viewed in the visit.
* First visit/first session - (also called 'Absolute Unique Visitor' in some tools) A visit from a uniquely identified client that has theoretically not made any previous visits. Since the only way of knowing whether the uniquely identified client has been to the site before is the presence of a persistent cookie, or a digital fingerprint, that was received on a previous visit, the 'First Visit' label is not reliable if the site's cookies have been deleted since the previous visit.
* Frequency/session per unique - Frequency measures how often visitors come to a website in a given time period. It is calculated by dividing the total number of sessions (or visits) by the total number of unique visitors during a specified time period, such as a month or year. Sometimes it is used interchangeably with the term "loyalty."
* Impression - The most common definition of "impression" is an instance of an advertisement appearing on a viewed page. Note that an advertisement can be displayed on a viewed page below the area actually displayed on the screen, so most measures of impressions do not necessarily mean an advertisement has been viewable.
* New visitor - A visitor that has not made any previous visits. This definition creates a certain amount of confusion (see common confusions below), and is sometimes substituted with analysis of first visits.
* Page time viewed/page visibility time/page view duration - The time a single page (or a blog or ad banner) is on the screen, measured as the calculated difference between the time of the request for that page and the time of the next recorded request. If there is no next recorded request, then the viewing time of that instance of that page is not included in reports.
* Repeat visitor - A visitor that has made at least one previous visit. The period between the last and current visit is called visitor recency and is measured in days.
* Return visitor - A unique visitor with activity consisting of a visit to a site during a reporting period and where the unique visitor visited the site prior to the reporting period. The individual is counted only once during the reporting period.
* Session duration/visit duration - Average amount of time that visitors spend on the site each time they visit.
It is calculated as the sum of the durations of all sessions divided by the total number of sessions. This metric can be complicated by the fact that analytics programs cannot measure the length of the final page view.
* Single page visit/singleton - A visit in which only a single page is viewed (this is not a 'bounce').
* Site overlay - A report technique in which statistics (clicks) or hot spots are superimposed, by physical location, on a visual snapshot of the web page.
* Click-through rate - The ratio of users who click on a specific link to the total number of users who view a page, email, or advertisement. It is commonly used to measure the success of an online advertising campaign for a particular website as well as the effectiveness of email campaigns.
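As a rough illustration of how several of the definitions above fit together, the following sketch groups page views into visits using a 30-minute inactivity timeout and then derives a few of the listed metrics. The event tuples, the function names `sessionize` and `summarize`, and the sample data are all assumptions made for this example; real analytics tools differ in their details.

```python
from collections import defaultdict

SESSION_TIMEOUT_S = 30 * 60  # 30-minute inactivity limit (configurable in some tools)


def sessionize(events, timeout_s=SESSION_TIMEOUT_S):
    """Group (client_id, timestamp_s, page) page views into visits.

    A new visit starts whenever the gap since the same client's
    previous page view is longer than the timeout.
    """
    by_client = defaultdict(list)
    for client_id, ts, page in events:
        by_client[client_id].append((ts, page))

    visits = []
    for client_id, views in by_client.items():
        views.sort()
        current = [views[0]]
        for prev, nxt in zip(views, views[1:]):
            if nxt[0] - prev[0] > timeout_s:
                visits.append((client_id, current))
                current = []
            current.append(nxt)
        visits.append((client_id, current))
    return visits


def summarize(visits):
    """Derive a few metrics from the list of visits."""
    page_views = sum(len(views) for _, views in visits)
    unique_visitors = len({client_id for client_id, _ in visits})
    # Duration is measured from the first to the last request of a visit;
    # the time spent on the final page is unknown and counts as zero.
    durations = [views[-1][0] - views[0][0] for _, views in visits]
    return {
        "visits": len(visits),
        "unique_visitors": unique_visitors,
        "page_views_per_visit": page_views / len(visits),
        "visits_per_unique": len(visits) / unique_visitors,
        "avg_visit_duration_s": sum(durations) / len(visits),
        "single_page_visits": sum(1 for _, v in visits if len(v) == 1),
    }


# Hypothetical sample data: (client id, time in seconds, page)
events = [
    ("a", 0, "/home"),
    ("a", 600, "/pricing"),
    ("a", 600 + 1801, "/home"),  # more than 30 minutes later: a new visit
    ("b", 100, "/home"),
    ("b", 160, "/docs"),
    ("b", 400, "/docs/api"),
]
stats = summarize(sessionize(events))
print(stats)
```

Note that the single-page visit contributes a duration of zero, which is one way the "final page view" limitation mentioned above pulls the average visit duration down.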


Off-site web analytics

Off-site web analytics is based on open data analysis, social media exploration, and share of voice on web properties. It is usually used to understand how to market a site by identifying the keywords tagged to this site, either from social media or from other websites.


Common sources of confusion in web analytics


The hotel problem

The hotel problem is generally the first problem encountered by a user of web analytics. The problem is that the unique visitors for each day in a month do not add up to the same total as the unique visitors for that month. To an inexperienced user this appears to be a fault in whatever analytics software they are using. In fact, it is a simple property of the metric definitions.

The way to picture the situation is by imagining a hotel. The hotel has two rooms (Room A and Room B). Suppose the hotel has two unique users each day over three days. The sum of the totals with respect to the days is therefore six. During the period each room has had two unique users. The sum of the totals with respect to the rooms is therefore four. Actually, only three visitors have been in the hotel over this period. The problem is that a person who stays in a room for two nights is counted twice if they are counted once on each day, but only once if the total for the period is looked at. Any software for web analytics will sum these correctly for the chosen time period, which leads to the apparent discrepancy when a user tries to compare the totals.
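The arithmetic of the hotel problem can be reproduced in a few lines. The sketch below uses a hypothetical occupancy (the guest names and the `uniques_by` helper are assumptions chosen only to match the counts described above) and shows that summing per-day or per-room unique counts does not give the number of unique guests for the whole period.

```python
# Hypothetical occupancy: (day, room, guest), one guest per room per day.
stays = [
    ("day1", "A", "John"), ("day1", "B", "Mark"),
    ("day2", "A", "John"), ("day2", "B", "Anne"),
    ("day3", "A", "Mark"), ("day3", "B", "Anne"),
]


def uniques_by(stays, key_index):
    """Count distinct guests within each group (per day or per room)."""
    groups = {}
    for row in stays:
        groups.setdefault(row[key_index], set()).add(row[2])
    return {key: len(guests) for key, guests in groups.items()}


per_day = uniques_by(stays, 0)    # unique guests on each day
per_room = uniques_by(stays, 1)   # unique guests in each room
sum_of_days = sum(per_day.values())
sum_of_rooms = sum(per_room.values())
period_total = len({guest for _, _, guest in stays})

print(per_day, per_room)
print(sum_of_days, sum_of_rooms, period_total)
```

Uniqueness is evaluated within each reporting bucket, so a guest who appears in several buckets is counted once per bucket; only deduplicating over the whole period gives the true total.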


Analytics poisoning

As the internet has matured, the proliferation of automated bot traffic has become an increasing problem for the reliability of web analytics. As bots traverse the internet, they render web documents in ways similar to organic users, and as a result may incidentally trigger the same code that web analytics tools use to count traffic. This incidental triggering of web analytics events reduces the interpretability of the data and weakens the inferences drawn from it. IPM provided a proof of concept of how Google Analytics, as well as its competitors, is easily triggered by common bot deployment strategies.


Web analytics methods


Problems with cookies

Historically, vendors of page-tagging analytics solutions have used third-party cookies sent from the vendor's domain instead of the domain of the website being browsed. Third-party cookies can handle visitors who cross multiple unrelated domains within the company's site, since the cookie is always handled by the vendor's servers. However, third-party cookies in principle allow tracking an individual user across the sites of different companies, allowing the analytics vendor to collate the user's activity on sites where they provided personal information with their activity on other sites where they thought they were anonymous. Although web analytics companies deny doing this, other companies, such as those supplying banner ads, have done so. Privacy concerns about cookies have therefore led a noticeable minority of users to block or delete third-party cookies. In 2005, some reports showed that about 28% of Internet users blocked third-party cookies and 22% deleted them at least once a month. Most vendors of page-tagging solutions have now moved to provide at least the option of using first-party cookies (cookies assigned from the client subdomain).

Another problem is cookie deletion. When web analytics depend on cookies to identify unique visitors, the statistics are dependent on a persistent cookie to hold a unique visitor ID. When users delete cookies, they usually delete both first- and third-party cookies. If this is done between interactions with the site, the user will appear as a first-time visitor at their next interaction point. Without a persistent and unique visitor ID, conversions, click-stream analysis, and other metrics dependent on the activities of a unique visitor over time cannot be accurate.

Cookies are used because IP addresses are not always unique to users and may be shared by large groups or proxies. In some cases, the IP address is combined with the user agent in order to identify a visitor more accurately if cookies are not available. However, this only partially solves the problem, because users behind a proxy server often have the same user agent. Other methods of uniquely identifying a user are technically challenging and would limit the trackable audience or would be considered suspicious. Cookies reach the lowest common denominator without using technologies regarded as spyware.
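The fallback from cookies to an IP address and user agent combination, and its weakness behind a shared proxy, can be sketched as follows. The `visitor_key` function, the use of a SHA-256 hash to combine the two values, and the sample requests are illustrative assumptions, not a description of any particular vendor's method.

```python
import hashlib


def visitor_key(cookie_id, ip, user_agent):
    """Return an identifier for the visitor behind a request.

    Prefer the persistent cookie ID; if no cookie is available, fall back
    to a hash of the IP address and user agent (an assumed scheme).
    """
    if cookie_id:
        return "cookie:" + cookie_id
    digest = hashlib.sha256(f"{ip}|{user_agent}".encode("utf-8")).hexdigest()
    return "fp:" + digest[:16]


# Hypothetical requests from four different people.
requests = [
    {"cookie_id": "v-1001", "ip": "203.0.113.7", "user_agent": "Firefox"},
    # Two people behind the same proxy, same browser, no cookies:
    {"cookie_id": None, "ip": "198.51.100.20", "user_agent": "Chrome"},
    {"cookie_id": None, "ip": "198.51.100.20", "user_agent": "Chrome"},
    # A third person behind the same proxy with a different browser:
    {"cookie_id": None, "ip": "198.51.100.20", "user_agent": "Safari"},
]

keys = [visitor_key(r["cookie_id"], r["ip"], r["user_agent"]) for r in requests]
unique_visitors = len(set(keys))
print(unique_visitors)
```

In this sample the two cookieless users with the same proxy address and browser collapse into a single identifier, so four people are reported as three unique visitors.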


Secure analytics (metering) methods

Third-party information gathering is subject to any network limitations and security measures that are applied. Countries, service providers and private networks can prevent site visit data from going to third parties. All the methods described above (and some other methods not mentioned here, like sampling) have the central problem of being vulnerable to manipulation (both inflation and deflation). This means these methods are imprecise and insecure (in any reasonable model of security). This issue has been addressed in several papers, but to date the solutions suggested in these papers remain theoretical.


See also

* List of web analytics software
* Mobile Web Analytics
* Online video analytics
* Post-click marketing
* Web log analysis software
* Web mining
* Web traffic

