International Journal of Web Information Systems

ISSN: 1744-0084

Online from: 2005

Subject Area: Information and Knowledge Management

2010 Awards for Excellence



Article citation: (2010) "2010 Awards for Excellence", International Journal of Web Information Systems, Vol. 6 Iss: 4



Article Type: 2010 Awards for Excellence
From: International Journal of Web Information Systems, Volume 6, Issue 4

Outstanding paper International Journal of Web Information Systems

A new approach to web users clustering and validation: a divergence-based scheme

Vassiliki A. Koutsonikola, Sophia G. Petridou, Athena I. Vakali, Georgios I. Papadimitriou
Department of Informatics, Aristotle University of Thessaloniki, Thessaloniki, Greece

Purpose – Web users’ clustering is an important mining task since it contributes to identifying usage patterns, which benefits a wide range of applications that rely on the web. The purpose of this paper is to examine the use of Kullback-Leibler (KL) divergence, an information-theoretic distance, as an alternative option for measuring distances in web users clustering.
Design/methodology/approach – KL-divergence is compared with other well-known distance measures and clustering results are evaluated using a criterion function, validity indices, and graphical representations. Furthermore, the impact of noise (i.e. occasional or mistaken page visits) is evaluated, since it is imperative to assess whether a clustering process exhibits tolerance in noisy environments such as the web.
Findings – The proposed KL-clustering approach performs similarly to the other distance measures under both synthetic and real data workloads. Moreover, when extra noise is imposed on real data, the approach deteriorates less than most of the other conventional distance measures.
Practical implications – The experimental results show that a probabilistic measure such as KL divergence is quite efficient in noisy environments and thus constitutes a good alternative for the web users clustering problem.
Originality/value – This work is inspired by the usage of divergence in clustering of biological data and it is introduced by the authors in the area of web clustering. According to the experimental results presented in this paper, KL divergence can be considered as a good alternative for measuring distances in noisy environments such as the web.
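
As a rough illustration of the distance measure named in the abstract, the following minimal sketch computes a KL-based distance between two users' page-visit profiles. The abstract does not give the paper's exact formulation, so everything specific here is an assumption made for illustration: the function names, the additive smoothing constant, the symmetrized form of the divergence, and the sample visit counts.

```python
import math


def visit_distribution(counts, smoothing=1.0):
    """Turn raw page-visit counts into a probability distribution.

    Additive smoothing (an assumption of this sketch) keeps every
    probability positive, so the logarithm below is always defined.
    """
    smoothed = [c + smoothing for c in counts]
    total = sum(smoothed)
    return [s / total for s in smoothed]


def kl_divergence(p, q):
    """Standard KL divergence KL(p || q) = sum_i p_i * log(p_i / q_i), in nats."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def symmetric_kl(p, q):
    """Average of KL(p || q) and KL(q || p).

    KL divergence is not symmetric by itself; averaging both directions
    is one common way to obtain a symmetric distance-like quantity.
    """
    return 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))


# Hypothetical visit counts over four pages for three users.
user_a = visit_distribution([10, 0, 5, 1])
user_b = visit_distribution([9, 1, 6, 0])
user_c = visit_distribution([0, 12, 0, 8])

print("a vs b:", symmetric_kl(user_a, user_b))
print("a vs c:", symmetric_kl(user_a, user_c))
```

In this toy data, users a and b visit mostly the same pages and so lie closer together than users a and c. A clustering algorithm could use such a quantity wherever it would otherwise use, for example, Euclidean distance.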

Keywords: Cluster analysis, Data mining, Internet, User studies

www.emeraldinsight.com/10.1108/17440080910983583

This article originally appeared in International Journal of Web Information Systems Volume 5 Number 3, 2009, pp. 348-37. Editors: Dr Ismail Khalil Ibrahim and Dr David Taniar



© Emerald Group Publishing Limited