A click path or clickstream is the sequence of hyperlinks one or more website visitors follows on a given site, presented in the order viewed. A visitor's click path may start within the website or at a separate third-party website, often a search engine results page, and it continues as a sequence of successive webpages visited by the user. Click path tracking can take call data and match it to ad sources, keywords, and/or referring domains in order to capture where visits originate.
Clickstream analysis is useful for web activity analysis, software testing, market research, and for analyzing employee productivity.
Information storage
While navigating the World Wide Web, a "user agent" (web browser) makes requests to another computer, known as a web server, every time the user selects a hyperlink. Most web servers store information about the sequence of links that a user "clicks through" while visiting the websites that they host in log files for the site operator's benefit. The information of interest can vary and may include the information downloaded, the webpage visited previously, the webpage visited afterwards, and the time spent on each page. The information is most useful when the client/user is identified, which can be done through website registration or record matching through the client's Internet service provider (ISP). Storage can also occur in a router, proxy server, or ad server.
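As a rough illustration of how a click path might be rebuilt from server log files, the sketch below parses a few log lines and groups the requested pages by client. It assumes the Apache "combined" log format, uses made-up sample lines and hostnames, and treats the client address as the visitor identity, which is a simplification (several users can share one address). The names `LOG_LINES`, `LINE_RE`, and `click_paths` are hypothetical.

```python
import re
from collections import defaultdict
from datetime import datetime

# Hypothetical sample lines, assumed to be in the Apache "combined" log format.
LOG_LINES = [
    '203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET /home HTTP/1.1" 200 2326 "https://search.example/?q=shoes" "Mozilla/5.0"',
    '203.0.113.5 - - [10/Oct/2023:13:56:02 +0000] "GET /products HTTP/1.1" 200 5120 "https://shop.example/home" "Mozilla/5.0"',
    '198.51.100.7 - - [10/Oct/2023:13:56:10 +0000] "GET /home HTTP/1.1" 200 2326 "-" "Mozilla/5.0"',
    '203.0.113.5 - - [10/Oct/2023:13:57:40 +0000] "GET /cart HTTP/1.1" 200 1044 "https://shop.example/products" "Mozilla/5.0"',
]

# One named group per field of the assumed log format.
LINE_RE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

def click_paths(lines):
    """Group requested paths by client address, ordered by request time."""
    hits = defaultdict(list)
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed format
        when = datetime.strptime(match["time"], "%d/%b/%Y:%H:%M:%S %z")
        hits[match["host"]].append((when, match["path"], match["referrer"]))

    paths = {}
    for host, visits in hits.items():
        visits.sort(key=lambda visit: visit[0])  # order by timestamp
        paths[host] = {
            "entry_referrer": visits[0][2],  # where the click path started
            "pages": [path for _, path, _ in visits],
        }
    return paths

paths = click_paths(LOG_LINES)
```

In this sketch the first visitor's path starts at a search results page (the entry referrer) and continues through three pages on the site, mirroring the description of a click path given earlier.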
Data analysis
Data mining, column-oriented DBMS, and integrated OLAP systems can be used in conjunction with clickstreams to better record and analyze this data.
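To make the idea concrete, the following minimal sketch stores a handful of clickstream events column by column and sums a measure over chosen dimensions, in the spirit of an OLAP roll-up. It is plain Python rather than a real column-oriented DBMS or OLAP engine, and the sample data, column names, and the `rollup` function are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical clickstream events in a columnar layout:
# position i in every list belongs to the same event.
columns = {
    "page": ["/home", "/products", "/home", "/cart", "/products", "/home"],
    "country": ["DE", "DE", "US", "DE", "US", "US"],
    "seconds_on_page": [12, 40, 8, 65, 30, 5],
}

def rollup(columns, dimensions, measure):
    """Sum `measure` for each combination of values of `dimensions`."""
    totals = Counter()
    keys = zip(*(columns[name] for name in dimensions))
    for key, value in zip(keys, columns[measure]):
        totals[key] += value
    return dict(totals)

by_page = rollup(columns, ["page"], "seconds_on_page")
by_country = rollup(columns, ["country"], "seconds_on_page")
by_page_and_country = rollup(columns, ["page", "country"], "seconds_on_page")
```

Only the columns named in a query are read, which is the property that makes columnar layouts attractive for this kind of aggregation over large clickstream tables.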
Privacy
Use of clickstream data can raise privacy concerns, especially since some Internet service providers have resorted to selling users' clickstream data as a way to enhance revenue. There are 10-12 companies that purchase this data, typically for about $0.40/month per user. While this practice may not directly identify individual users, it is often possible to indirectly identify specific users, an example being the AOL search data scandal. Most consumers are unaware of this practice and its potential for compromising their privacy. In addition, few ISPs publicly admit to this practice.
As the world of online shopping grows, it is becoming easier for the privacy of individuals to be exploited. There have been many cases of email addresses, phone numbers, and other personal information being stolen from shoppers, clients, and others to be used by third parties. These third parties can range from advertisers to hackers. Some consumers benefit from this by receiving more targeted advertising and deals, but most are harmed by the lack of privacy. As technology advances, consumers are increasingly at risk of losing privacy.
Applications
Clickstreams can be used to show users where they have been and let them easily return to a page they have already visited, a function that is already incorporated in most browsers. A clickstream can show the specific time and position at which individuals browsed and left the website, all the web pages they viewed, the time they spent on each page, and which pages are viewed most frequently. There is abundant information to analyze: site operators can examine a visitor's clickstream together with other statistics such as visit length, search keywords, ISP, country, and browser. This process gives operators a detailed picture of their visitors.
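Two of the measures mentioned above, time spent on each page and the most frequently viewed pages, can be sketched as follows. The events, visitor identifiers, and function names are hypothetical, and the time on a page is estimated as the gap until the same visitor's next click, so the last page of each visit gets no duration.

```python
from collections import Counter

# Hypothetical events: (visitor id, seconds since start of observation, page).
EVENTS = [
    ("v1", 0, "/home"),
    ("v1", 30, "/products"),
    ("v1", 90, "/cart"),
    ("v2", 10, "/home"),
    ("v2", 25, "/products"),
]

def time_on_pages(events):
    """Estimate seconds per page view as the gap to the visitor's next click."""
    by_visitor = {}
    for visitor, ts, page in sorted(events, key=lambda e: (e[0], e[1])):
        by_visitor.setdefault(visitor, []).append((ts, page))

    durations = []
    for visitor, clicks in by_visitor.items():
        # Pair each click with the following one; the final click has no pair.
        for (ts, page), (next_ts, _) in zip(clicks, clicks[1:]):
            durations.append((visitor, page, next_ts - ts))
    return durations

def most_viewed(events, n=3):
    """Return the n most frequently viewed pages with their view counts."""
    return Counter(page for _, _, page in events).most_common(n)

durations = time_on_pages(EVENTS)
top_pages = most_viewed(EVENTS, 2)
```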
Webmasters can gain insight into what visitors on their site are doing by using the clickstream. This data itself is "neutral" in the sense that any dataset is neutral. The data can be used in various scenarios, one of which is marketing. Additionally, any webmaster, researcher, blogger, or person with a website can learn how to improve their site.
The growing e-commerce industry has made it necessary to tailor offerings to the needs and preferences of consumers. Click path data can be used to personalize product offerings. By using previous click path data, websites can predict what products the user is likely to purchase. Click path data can contain information about the user's goals, interests, and knowledge and therefore can be used to predict their future actions and decisions. By using statistical models, websites can potentially increase their operating profits by streamlining results based on what the user is most likely to purchase.
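One simple statistical model of this kind is a first-order Markov chain, which estimates the probability of the next page from the current page alone. The sketch below is only an illustration under that assumption, not the method used by any particular site; the sessions and the names `fit_transitions` and `predict_next` are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical past click paths, one list of pages per session.
SESSIONS = [
    ["/home", "/products", "/cart", "/checkout"],
    ["/home", "/products", "/reviews"],
    ["/home", "/search", "/products", "/cart"],
]

def fit_transitions(sessions):
    """Count how often each page is followed directly by each other page."""
    counts = defaultdict(Counter)
    for pages in sessions:
        for current, following in zip(pages, pages[1:]):
            counts[current][following] += 1
    return counts

def predict_next(counts, page):
    """Return (most likely next page, estimated probability), or None if unseen."""
    followers = counts.get(page)
    if not followers:
        return None
    next_page, n = followers.most_common(1)[0]
    return next_page, n / sum(followers.values())

model = fit_transitions(SESSIONS)
prediction = predict_next(model, "/products")
```

With these sample sessions, `/products` is followed by `/cart` in two of three cases, so the model predicts `/cart` with an estimated probability of two thirds. A real system would need far more data and would likely condition on more than the current page.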
Analyzing the data of clients who visit a company website can be important for remaining competitive. This analysis can be used to generate two findings for the company. The first is an analysis of a user's clickstream while using a website to reveal usage patterns, which in turn gives a heightened understanding of customer behaviour. This use of the analysis creates a user profile that aids in understanding the types of people who visit a company's website. As discussed in Van den Poel & Buckinx (2005), clickstream analysis can be used to predict whether a customer is likely to purchase from an e-commerce website. Clickstream analysis can also be used to improve customer satisfaction with the website and with the company itself. This can generate a business advantage and can be used to assess the effectiveness of advertising on a web page or site.
Implications
Most websites store data about visitors to the site through click path. The information is typically used to improve the website and deliver personalized and more relevant content. In addition, the data results can not only be used by a designer to review, improve or redesign their website, but can also be used to model a user's browsing behaviour. In the online world of e-commerce, information collected through click path allows advertisers to construct personal profiles and use them to individually target consumers much more effectively than ever before; as a result, advertisers create more relevant advertising and efficiently spend advertising dollars. Meanwhile, in the wrong hands click path data poses a serious threat to personal privacy.
Unauthorized clickstream data collection is considered to be spyware. However, authorized clickstream data collection comes from organizations that use opt-in panels to generate market research, using panelists who agree to share their clickstream data with other companies by downloading and installing specialized clickstream collection agents.
Challenges
The number of paths a user can potentially take greatly increases with the number of pages on that particular website. Many path analysis tools are too linear and do not account for the complexity of internet usage. In most cases, fewer than 5% of users follow the most common path. However, even if all users followed the same path, there would still be no way to tell which page is the most influential in determining behavior. Even in more linear forms of path analysis, where analysts can see where most customers drop off the website, the "why?" factor is still missed. The main challenge of path analysis lies in the fact that it tries to regulate and force users to follow a certain path, when in reality users are very diverse and have specific preferences and opinions.
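The two linear measures discussed above, the share of users on the most common path and the drop-off along a fixed sequence of pages, can be sketched as follows. The paths and function names are hypothetical, and the sample is far too small to reflect the 5% figure; the sketch also shares the limitation described above, since it counts where visitors leave but says nothing about why.

```python
from collections import Counter

# Hypothetical click paths, one tuple of pages per visitor.
PATHS = [
    ("/home", "/products", "/cart"),
    ("/home", "/products", "/cart"),
    ("/home", "/search", "/products"),
    ("/home", "/products"),
    ("/blog", "/home"),
    ("/home",),
]

def most_common_path_share(paths):
    """Return the most frequent complete path and the fraction of visitors on it."""
    path, n = Counter(paths).most_common(1)[0]
    return path, n / len(paths)

def drop_off(paths, funnel):
    """Count how many paths begin with each successive prefix of `funnel`."""
    reached = []
    for depth in range(1, len(funnel) + 1):
        prefix = tuple(funnel[:depth])
        reached.append(sum(1 for p in paths if tuple(p[:depth]) == prefix))
    return reached

top_path, share = most_common_path_share(PATHS)
reached = drop_off(PATHS, ["/home", "/products", "/cart"])
```

Here `reached` lists how many visitors got as far as each step, so the difference between neighbouring entries is the number who dropped off at that step.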
See also
* Keystroke logging
* Phorm
* Real-time Marketing
* Software Asset Management
* Click tracking