Mapping Scientific Frontiers: The Quest for Knowledge Visualization

Tony Cawkell (CITECH Ltd, Iver, UK)

Journal of Documentation

ISSN: 0022-0418

Article publication date: 1 June 2003



Citation

Cawkell, T. (2003), "Mapping Scientific Frontiers: The Quest for Knowledge Visualization", Journal of Documentation, Vol. 59 No. 3, pp. 364-369. https://doi.org/10.1108/00220410310472554

Publisher: Emerald Group Publishing Limited

Copyright © 2003, MCB UP Limited


The fascinating subject of visualisation (note that an “s”, rather than the US “z” of the book's title, is used in this review) has been defined as “making visible, especially to one's mind, things not visible to the eye”. Chen says “the goal of information visualisation is to reveal invisible patterns from abstract data. It brings new insights, not merely pretty pictures”.

Previous work by Edward Tufte

It was the three books by Tufte (1983, 1990, 1997) which aroused this reviewer's interest. Incidentally, since a review of visualisation was published in 2001 (Cawkell, 2001), another comprehensive book has appeared, making available in print information previously available only from a Web site (Dodge and Kitchin, 2001).

Tufte dwells on the intrinsic and aesthetic aspects of the topic, as well as on its merits for representing information concisely. Others, such as Chen, are primarily interested in the second aspect, particularly in the representation and display of data made possible by computer software.

Some remarks about Tufte's books from my earlier article are summarised here. Tufte understands the nuances of a visual display. A design strategy is required which “sharpens the information resolution, the resolving power, of paper and video screen … Visual displays are simultaneously a wideband and a perceiver‐controllable channel”, he writes. These words summarise the requirements perfectly. Continuing: “at every screen are two powerful information‐processing capabilities, human and computer. Yet all communication between the two must pass through the low‐resolution narrow‐band video display terminal which chokes off fast, precise, and complex communication”.

Compare, for instance, the ease with which information provided in a newspaper like The Times may be scanned, read or ignored, with the poor performance of a screen and its associated electronics and software attempting to provide these functions – an unfortunate fact in view of the general use of screens for information display.

Tufte describes Minard's “Carte Figurative” showing the route taken by Napoleon's army to Moscow and back. The width of the track on the map denotes the reduction of the army from the over 400,000 men who left France to the 10,000 who returned to Paris. Tufte thinks that this diagram “may well be the best statistical graphic ever drawn”.

Another of Tufte's examples concerns the cholera epidemic which occurred in the Broad Street neighbourhood of London in 1854, as portrayed by Dr John Snow. Snow suspected that there might be a link between impure water and cholera, although it was believed at the time that the disease was borne on the air. He drew a street map of the area marked with dots representing the number of deaths at specific locations. The locations of community pump‐wells were also marked on the map. He noted clusters of dots near the Broad Street pump, and only a few dots near other pumps, suggesting that this pump was the source of the epidemic.

Computers, photo‐realism and movies

Before returning to Chen's new book, here is another digression covering an area of (I hope) general interest.

Small computers with special software are able to provide some stunning visualisation effects. The Atari 1040 ST machine came out early in 1986, to be followed by the Amiga, but today machines from Apple are considered to excel. Software is available for them for:

  • Modelling. The process of representing the objects which will appear in an image. It probably started in engineering applications, where parts were drawn and corresponding software-created wire-frame representations were produced from those drawings.

  • Rendering. A technique which seems to have been introduced by Appel in the late 1960s (Appel, 1968). The purpose of rendering is to make a computer-generated image look as realistic as a photograph. After rendering, a hitherto lifeless picture is enhanced with shadows, reflective light effects, highlights and surface texture definition. Rendering requires that the paths of light rays be tracked as they are reflected or diffused from objects in an image, eventually producing a pixel (picture element) of a particular colour; this is a computationally intensive operation (see Rademacher, 2000). For example, in a modest image of 500 × 500 pixels, the rays forming a quarter of a million pixels must be examined. Light paths from a light source are simulated as they pass via the surfaces of modelled objects to generate the colour of the pixel observed. In practice, only the light rays which form the observed pixels are traced, not the rays radiated in all directions from the source; a minimal sketch of the idea follows this list.

  • Z buffering is a technique associated with rendering. It determines, for each pixel, which surface is nearest and therefore visible, so that objects in front correctly hide those behind them instead of appearing transparent. Early rendering was done using transputers in parallel, an inexpensive way of achieving the very large amount of processing required in a reasonable time. Rendering operations may now be carried out on many types of personal computer and are used in computer games systems, often using software from Pixar.
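To make the ray-tracing idea above concrete, here is a minimal, self-contained Python sketch. It is purely illustrative and not taken from Appel, Chen or any of the software mentioned: the scene (a single sphere, one light, an assumed camera and image size) is invented, and each pixel's brightness is found by tracing one ray from the eye and applying simple diffuse shading.

```python
# Purely illustrative ray-casting sketch (not from Appel, Chen or any package named above).
# One ray per pixel is traced from the eye into a tiny invented scene containing a single
# sphere; hits are shaded with simple diffuse lighting and printed as ASCII characters.
import math

WIDTH, HEIGHT = 80, 40                         # assumed "modest image" size for a text demo
SPHERE_CENTRE, SPHERE_RADIUS = (0.0, 0.0, 3.0), 1.0
LIGHT_DIR = (0.577, 0.577, -0.577)             # unit vector pointing towards the light

def intersect_sphere(origin, direction):
    """Distance along the ray to the sphere, or None if the ray misses it."""
    oc = [origin[i] - SPHERE_CENTRE[i] for i in range(3)]
    b = 2.0 * sum(oc[i] * direction[i] for i in range(3))
    c = sum(x * x for x in oc) - SPHERE_RADIUS ** 2
    disc = b * b - 4.0 * c                     # direction is unit length, so a = 1
    if disc < 0.0:
        return None
    t = (-b - math.sqrt(disc)) / 2.0
    return t if t > 0.0 else None

def shade(point):
    """Lambertian (diffuse) shading: brightness follows the surface normal and the light."""
    n = [(point[i] - SPHERE_CENTRE[i]) / SPHERE_RADIUS for i in range(3)]
    return max(0.0, sum(n[i] * LIGHT_DIR[i] for i in range(3)))

for y in range(HEIGHT):
    row = ""
    for x in range(WIDTH):
        # Only the rays that form the observed pixels are traced, as noted in the text.
        dx = (x - WIDTH / 2) / (WIDTH / 2)
        dy = -(y - HEIGHT / 2) / (HEIGHT / 2)
        length = math.sqrt(dx * dx + dy * dy + 1.0)
        ray = (dx / length, dy / length, 1.0 / length)
        t = intersect_sphere((0.0, 0.0, 0.0), ray)
        if t is None:
            row += " "                         # background: the ray hits nothing
        else:
            hit = tuple(ray[i] * t for i in range(3))
            row += " .:-=+*#%@"[int(shade(hit) * 9)]
    print(row)
```

A production renderer adds reflection, refraction and texture at the same intersection step; this sketch shows only why the operation is computationally intensive: every pixel requires at least one intersection test against the scene.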

Pixar and Lucasfilm

The visualisation concept, embracing as it does advanced graphics, means that organisations like Pixar, spun off from Lucasfilm Ltd, leaders in film animation and special effects, require a special mention. George Lucas joined forces with Francis Ford Coppola and made a film called THX 1138, which failed at the box office but included some extraordinary special effects techniques. His next film, American Graffiti, a low-cost production, was a huge box office success. It was followed by the Star Wars series, made by Lucasfilm Ltd at Elstree, UK, one of the major pre-war British film studios. The Star Wars movies took special effects to new levels of realism. As the Star Wars series progressed, Pixar's realism effects improved and featured in movies such as Toy Story, Jurassic Park and A Bug's Life.

Understanding visualisation

A visualisation display is often cartographically devised; that is, it is based on a map. This is the case for the two examples from Tufte given above: both Minard's and Dr Snow's graphics are map-based. Both men had the opportunity to add further symbols from other data if they had possessed it. For instance, the French lost a lot of men when they crossed the frozen Berezina river. Assuming that some heavy artillery disappeared through the ice as well, these losses could be represented by a large symbol on one side of the river and a small one on the other. An excellent example of an uncluttered presentation is provided by Harry Beck's map of the London Underground system.

To understand what Chen is talking about it is useful to be able to think mathematically – not my métier. It seems that computer-based visualisation techniques – undoubtedly important for representing information – are not generally understood. I will attempt to explain them without the use of mathematics.

Chen's work

Chen completed his first book (Chen, 1999) when working on a funded research project at Brunel University. The book being reviewed here has been published following his move to Drexel University, Philadelphia, USA. In the first book a number of applications of visualisation are discussed, while the second is devoted to science mapping and scientific frontiers.

Chen writes: “Mapping scientific frontiers involves several disciplines from the philosophy and sociology of science to Information Science, Scientometrics, and Information Visualisation … One must transcend disciplinary boundaries so that each contributing approach can fit it into the context”.

The earlier chapters introduce the growth of scientific knowledge accompanied by a mention of the sociology and structure of science in which the names of Kuhn, Price, Merton and Bernal come to mind. Information about “Scientific Frontiers” – areas characterised by intense research activity – is available on the Web. It includes such activities as computers and robotics, genomics, neuroscience and quantum technology. In the UK's “Foresight” programme, areas such as cognition in living systems and self‐organising systems have been discussed as topics requiring a special research effort.

Following these chapters are: “4. Enabling techniques for science mapping”; “5. On the shoulders of the giants”; “6. Tracing competing paradigms”; and “7. Tracking latent domain knowledge”.

In a suitable computer-based visualisation display, the data from which it is devised may be processed to present them in a way which is not evident from the data themselves. For example, dots representing articles about similar aspects of a subject are closely spaced, while those which are less related are displayed at a greater distance from each other. Different strands of a subject are shown as lines of interconnected dots. The result is a network structure of spidery appearance. For such structures to convey useful information, a variety of software has become available, much of which is discussed by Chen. The process requires that suitable terms be used to describe the articles, so that processing produces a collection of articles (nodes) showing significant interconnections without crossovers, spaced from each other according to subject similarity.
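As an illustration of this process (and only an illustration: the systems Chen discusses are far more sophisticated), the following Python sketch turns term descriptions of a few invented articles into a two-dimensional layout in which similar articles sit close together. The article texts, and the choice of TF-IDF weighting, cosine similarity and multidimensional scaling, are assumptions made for the sake of the example.

```python
# Illustrative sketch only: not the software discussed by Chen. A few invented article
# descriptions are turned into TF-IDF term vectors, pairwise subject similarity is
# computed, and multidimensional scaling places similar articles close together in 2D.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.manifold import MDS

articles = [                                   # hypothetical article descriptions
    "citation analysis of the scientific literature",
    "co-citation patterns and maps of science",
    "ray tracing algorithms for realistic image rendering",
    "rendering, shading and texture in computer graphics",
]

tfidf = TfidfVectorizer().fit_transform(articles)       # terms describing each article
similarity = cosine_similarity(tfidf)                   # 1 = identical terms, 0 = none shared

dissimilarity = np.clip(1.0 - similarity, 0.0, None)    # MDS wants distances, not similarities
np.fill_diagonal(dissimilarity, 0.0)

# Multidimensional scaling: closely related articles become closely spaced dots.
coords = MDS(n_components=2, dissimilarity="precomputed",
             random_state=0).fit_transform(dissimilarity)

for text, (x, y) in zip(articles, coords):
    print(f"({x:6.2f}, {y:6.2f})  {text}")
```

Run on these four invented descriptions, the two citation-related articles land near each other and away from the two graphics-related ones, which is exactly the spidery spacing-by-similarity effect described above.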

Chapters 4 to 7 provide examples of multi‐dimensional scaling leading to the use of Pathfinder and other systems. Chen writes “Citation analysis takes into account one of the most crucial indicators of scholarship: citations. Citation analysis has a unique position in the history of science mapping because several widely used methods have been developed to extract citation patterns from the scientific literature and these citation patterns can provide insightful knowledge of an invisible college”. He provides a number of examples of Henry Small's work on mapping scientific frontiers.
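The co-citation idea underlying such maps can be shown with a toy count: two articles are considered related when later papers cite them together. The reference lists below are invented, and the sketch is not Small's or Chen's method, merely the raw counting step on which such analyses build.

```python
# Toy co-citation count (illustrative; the reference lists are invented).
# Two articles are co-cited when a later paper cites both; the more often this happens,
# the more strongly related the pair is taken to be.
from itertools import combinations
from collections import Counter

# Each entry is the reference list of one citing paper (hypothetical identifiers).
reference_lists = [
    ["Small 1973", "Price 1965", "Kuhn 1962"],
    ["Small 1973", "Price 1965"],
    ["Kuhn 1962", "Merton 1968", "Price 1965"],
]

cocitation = Counter()
for refs in reference_lists:
    # Every unordered pair cited together in the same paper is one co-citation.
    for a, b in combinations(sorted(refs), 2):
        cocitation[(a, b)] += 1

for (a, b), count in cocitation.most_common():
    print(f"{a}  +  {b}: co-cited {count} time(s)")
```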

Pathfinder network scaling eliminates redundant links (McGreevy, 1995). Spatial layout is determined by a graph-drawing algorithm. Pathfinder is “an automated method of calculating a relational metric based on proximity-weighted co-occurrence among terms in domain text”. Chen uses article co-citation data (Small, 1999) to extract patterns from the scientific literature and produce a Pathfinder 3D rendered landscape. The beauty of the system is that a very large network may be displayed on a computer screen; the user may zoom in to any selected part to make it legible. The system may include a pop-up feature for any node/article so that the article itself may be retrieved.
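For readers who want to see what “eliminating redundant links” amounts to, here is a hedged sketch of the most common Pathfinder variant, PFNET(r = infinity, q = n - 1), in which a direct link survives only if no indirect path offers a smaller maximum link weight. It is an illustrative reimplementation from the published description, not code from the book or from McGreevy.

```python
# Minimal sketch of Pathfinder network scaling, PFNET(r = inf, q = n - 1).
# A direct link between two nodes is kept only if no indirect path is "shorter",
# where the length of a path is taken to be its largest link weight.
INF = float("inf")

def pathfinder(weights):
    """weights[i][j] is the link distance between nodes i and j (INF if no link)."""
    n = len(weights)
    # Floyd-Warshall with "max" in place of "+": minimax[i][j] becomes the smallest
    # possible maximum link weight over all paths from i to j.
    minimax = [row[:] for row in weights]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = max(minimax[i][k], minimax[k][j])
                if via_k < minimax[i][j]:
                    minimax[i][j] = via_k
    # Redundant links are those whose direct weight is beaten by some indirect path.
    return [
        [weights[i][j] if i != j and weights[i][j] <= minimax[i][j] else INF
         for j in range(n)]
        for i in range(n)
    ]

# Toy example: the direct 0-2 link (weight 5) is redundant because the path 0-1-2
# never uses a link heavier than 3, so only the 0-1 and 1-2 links survive.
w = [
    [INF, 2,   5  ],
    [2,   INF, 3  ],
    [5,   3,   INF],
]
for row in pathfinder(w):
    print(["-" if x == INF else x for x in row])
```

Pruning links in this way is what turns a dense co-citation matrix into the sparse, legible network that can then be laid out and rendered as a landscape.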

It is not always clear from the book which software is most suitable for particular applications, nor how effective the various examples given are for browsing or information retrieval. An example is given of determining the similarity of images using data obtained by QBIC processing, based on image colour and texture descriptive terms. However, this provides less information than data derived from word terms. The usefulness of a network created in this manner is limited by the inability of QBIC to identify wanted images in a real-world collection, rather than by any shortcomings of the Pathfinder system, particularly in regard to matching images containing objects of interest against objects contained in an image database (Cawkell, 2000).

However, this is a minor criticism. This excellent book, suggesting as it does new methods for studying science and containing many illustrations in colour, will be of great interest both to visualisation students and to all those interested in studying the sociology of science.

References

Appel, A. (1968), “Some techniques for shading machine renderings of solids”, Proceedings of the AFIPS Spring Joint Computer Conference, pp. 37-45.

Cawkell, T. (2000), “Image indexing and retrieval by content”, Information Services and Use, Vol. 20, pp. 49-58.

Cawkell, T. (2001), “Progress in visualisation”, Journal of Information Science, Vol. 27 No. 6, pp. 427-38.

Chen, C. (1999), Information Visualisation and Virtual Environments, Springer-Verlag, London.

Dodge, M. and Kitchin, R. (2001), Atlas of Cyberspace, Addison-Wesley, Reading, MA.

McGreevy, M.W. (1995), A Relational Metric, Its Application to Domain Analysis and an Example Analysis and Model of a Remote Sensing Domain, NASA TM‐110358 Ames Research Center, Moffett Field, CA.

Rademacher, P. (2000), Ray Tracing: Graphics for the Masses, available at: www.cs.unc.edu/~rademach/xroads-rt/rtarticle

Small, H. (1999), “Visualizing science by citation mapping”, Journal of the American Society for Information Science, Vol. 50 No. 9, pp. 799-813.

Tufte, E.R. (1983), The Visual Display of Quantitative Information, Graphics Press, Cheshire, CT.

Tufte, E.R. (1990), Envisioning Information, Graphics Press, Cheshire, CT.

Tufte, E.R. (1997), Visual Explanations: Images and Quantities, Evidence and Narrative, Graphics Press, Cheshire, CT.
