Digging for Web data
- By Beth Archibald Tang
- Jul 20, 2000
My hometown was built on the wealth of what was taken out. In the early
days, the town, the bank and the stores owed their existence to coal. With
the benefit of ever-improving machinery, miners dug great quantities of
the coal beneath the hills of West Virginia. It was a tough, dangerous,
dirty job, but the miners were among the best-paid workers. The lesson is
that only through the hard work of extracting the coal was it possible to
fuel the steel mills in Pittsburgh and to keep running the trains that
delivered the manufactured goods.
Data and information are the new energy sources. In the world of Web sites,
it is important to locate and use data to make informed decisions about
design, marketing, long-term strategies and information technology purchases.
Fortunately, without even getting your fingernails dirty, you can dig for
a treasure trove of information from the comfort of your ergonomically correct
chair.
One way to dig for data about a Web site is to use data mining, which
is the process of finding patterns hidden in extremely large tables of
data. Depending on your needs, finding the patterns could involve artificial
intelligence or statistical analysis software.
To a lesser extent, data mining also makes use of Web site analysis tools.
Such tools are limited to providing some facts about the visitors to the site
and some data about their activities. Full-scale data mining, however, will
provide insight into trends among the visitors to your site.
New to data mining? It can be a daunting task. For a start, you may want
to answer such questions as:
* What browsers do users have? What versions do they use?
* How many unique visitors did our site have this month? How long did they
stay?
* Who is referring visitors to us? What search words did they use to find
the Web site?
* What are the top 10 downloaded documents? What are the least requested
documents?
* Were any out-of-the-ordinary problems encountered? Do we have a
lot of 404 errors?
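As a rough illustration of how the questions above map onto raw log data, the following Python sketch tallies browsers, hosts, referrers, search words, popular documents and 404 errors. It assumes the server writes the common "combined" log format and that search engines pass the query in a parameter named q. The sample lines, the host names and the summarize function are hypothetical and are not taken from any particular analysis product. Counting distinct hosts only approximates unique visitors, because proxies and shared machines blur the picture.

```python
import re
from collections import Counter
from urllib.parse import urlsplit, parse_qs

# Assumption: the server writes the "combined" log format.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def summarize(lines):
    """Tally a few basic facts from an iterable of log lines."""
    hosts = set()
    agents = Counter()
    referrers = Counter()
    search_words = Counter()
    documents = Counter()
    not_found = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed format
        hit = match.groupdict()
        hosts.add(hit["host"])
        agents[hit["agent"]] += 1
        if hit["referrer"] not in ("", "-"):
            referrers[hit["referrer"]] += 1
            # Assumption: the search phrase travels in a "q" parameter.
            query = parse_qs(urlsplit(hit["referrer"]).query)
            for phrase in query.get("q", []):
                search_words[phrase] += 1
        if hit["status"] == "404":
            not_found[hit["path"]] += 1
        elif hit["status"].startswith("2"):
            documents[hit["path"]] += 1
    return {
        "unique_hosts": len(hosts),  # a proxy for unique visitors, not exact
        "browsers": agents.most_common(10),
        "referrers": referrers.most_common(10),
        "search_words": search_words.most_common(10),
        "top_documents": documents.most_common(10),
        "not_found": not_found.most_common(10),
    }


# Hypothetical sample data, for illustration only.
SAMPLE_LINES = [
    '192.0.2.1 - - [20/Jul/2000:10:00:00 -0400] "GET /index.html HTTP/1.0" '
    '200 1043 "-" "Mozilla/4.7 [en] (WinNT; I)"',
    '192.0.2.1 - - [20/Jul/2000:10:00:09 -0400] "GET /reports/annual.pdf HTTP/1.0" '
    '200 88211 "http://www.example.com/index.html" "Mozilla/4.7 [en] (WinNT; I)"',
    '192.0.2.7 - - [20/Jul/2000:10:02:30 -0400] "GET /reports/annual.pdf HTTP/1.0" '
    '200 88211 "http://search.example.org/?q=annual+report" '
    '"Mozilla/4.0 (compatible; MSIE 5.0; Windows 98)"',
    '192.0.2.7 - - [20/Jul/2000:10:03:02 -0400] "GET /old-page.html HTTP/1.0" '
    '404 210 "http://www.example.com/index.html" '
    '"Mozilla/4.0 (compatible; MSIE 5.0; Windows 98)"',
]

summary = summarize(SAMPLE_LINES)
for name, value in summary.items():
    print(name, value)
```

To find the least requested documents, the same documents counter could be sorted in ascending order. Documents that were never requested will not appear in the log at all, so they have to be found by comparing the log against a listing of the site's files.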
Simple tools may be best when just starting out. Before jumping into data
mining, try analyzing the Web site data in the log files. This assumes that
you have graduated beyond using a hit counter, and that your site employs
some sort of Web analysis tool, such as WebTrends Corp. software.
Using Web site analysis tools can help with marketing and design strategies
as well as help advise your Web design team. The team needs to know if links
are dead ends and be able to answer the above questions. By knowing if users
are encountering an overabundance of problems (via miscellaneous error messages),
the Web team will be able to distinguish between activity because of miscoded
links and activity that possibly is a sign of hacking. The informed designer
also will know if users' paths through a site are more convoluted than need
be.
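One hedged way to separate miscoded links from possible probing is to look at where each 404 error came from. The Python sketch below assumes the log has already been parsed into (host, path, status, referrer) tuples. The site address, the triage_not_found function and the threshold of three missing pages are all hypothetical choices made for illustration. It is a simple heuristic for flagging entries worth a closer look, not an intrusion detector.

```python
from collections import defaultdict

# Assumption: the site's own address, used to recognize internal referrers.
SITE_PREFIX = "http://www.example.com/"


def triage_not_found(hits, probe_threshold=3):
    """Split 404 errors into likely broken links and possible probing.

    hits is an iterable of (host, path, status, referrer) tuples.
    """
    # Missing path -> pages on our own site that link to it.
    broken_links = defaultdict(set)
    # Host -> distinct missing paths requested without an internal referrer.
    missing_by_host = defaultdict(set)
    for host, path, status, referrer in hits:
        if status != 404:
            continue
        if referrer.startswith(SITE_PREFIX):
            # A page of ours points at something that does not exist.
            broken_links[path].add(referrer)
        else:
            missing_by_host[host].add(path)
    # A host guessing at many nonexistent paths deserves a second look.
    suspects = {
        host: sorted(paths)
        for host, paths in missing_by_host.items()
        if len(paths) >= probe_threshold
    }
    return dict(broken_links), suspects


# Hypothetical sample data, for illustration only.
SAMPLE_HITS = [
    ("192.0.2.7", "/old-page.html", 404, "http://www.example.com/index.html"),
    ("198.51.100.9", "/cgi-bin/phf", 404, "-"),
    ("198.51.100.9", "/cgi-bin/test-cgi", 404, "-"),
    ("198.51.100.9", "/scripts/admin.pl", 404, "-"),
    ("192.0.2.1", "/index.html", 200, "-"),
]

broken, suspects = triage_not_found(SAMPLE_HITS)
print("Links to repair:", broken)
print("Hosts to review:", suspects)
```

The first result tells the design team which pages carry dead links. The second is only a starting point for whoever handles site security, since an outdated bookmark or an external site's stale link can also produce 404 errors with no internal referrer.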
Knowledge can arise from the careful study of your log files' data. The
information can then be used to map a data mining strategic plan for more
Look to all the individuals (in a small organization) or to departmental
representatives (in larger agencies) to comment on the strategic plan and
to ensure that it meshes with the overall mission as well as other existing
business or marketing plans. Doing so will ensure that all stakeholders
are accounted for in the plan and that the most can be achieved with the
limited resources. The review also can help you decide if log analysis suits
your needs or if you need to invest in data mining software.
The more customer-driven your site (that is, if it performs e-commerce or
delivers e-services), the more likely you will want to do a comprehensive
analysis of your Web server logs and then transition into studying customer
transactions and trends.
Tang, a member of the Federal Web Business Council, is an associate in the
Information Technology Group at Caliber Associates, Fairfax, Va.