In brief, I am a philosopher of perception turned information designer, and I favor iconographic approaches to visualization—e.g., glyph plots—in part because I believe they conduce to data perception, the pursuit of which is the organizing principle of my research. Somewhat less briefly:
Two Levels of Perceptual Content
In my dissertation, I argue for a distinction between two levels of perceptual content. We can illustrate the difference between them using the Necker cube:
At the first level, we have the hard imposition of the lines; at the second, at least three depth interpretations—facing side high, facing side low, and flat. The visual experience of depth is quite unlike seeing the lines: it is not so obviously tied to point locations in the visual field, and it has an almost kinetic component to it. Moreover, the representation of depth, although related to the lines, is underdetermined by them. After all, we can switch between depth interpretations without changing the lines.
The stated goal of all information visualization is to make abstract information accessible to human perception. The above distinction between levels of content should therefore give us hope, because it suggests that sensory channels can be remapped: if the relationship between high- and low-level content is not one-one, it might be possible to induce arbitrary associations between sensory stimuli and abstract states of affairs. This is conjecture, of course: all we know for sure from the example is that concrete, local states of affairs sometimes share sensory stimuli. And although it is quite commonplace for visual marks to stand in for abstract states of affairs—e.g., when red dots and blue dots represent Republican votes and Democratic votes—such cases typically involve at most three or so data dimensions. Thus, the better part of our response to these stimuli—how we interpret them, what judgments we make, how we feel about them—has nothing to do with the visual marks themselves and everything to do with our existing knowledge.
This is not to say that the involvement of our existing knowledge is a problem. Human bias is hard-won and supremely important. It makes the difference between an ignorant machine and a nuanced, human expert who is sensitive to context and knows when something is surprising or significant. But if our information channels are low-bandwidth, we can never hope to deploy human bias across the vast expanse of the data universe, and we will have to send the machines in our stead. Alternatively, if we wish to look for ourselves, our channels must be wide, and we must read a great deal more into our visual marks.
It's worth pointing out that the present dilemma will not arise in every case. If our data are images, for example, we have ready-made stimuli of suitable complexity, and there is no need for remapping. Images can be visualized, we might say, “directly”. Nonetheless, few have recognized this, and a great deal of image analysis is either done by machine or done serially on small subsets of images. In the Cultural Analytics Lab, we plot large collections of images together on digital canvases to be looked at by human experts. I believe this approach to be crucially important for the humanistic sciences, and in the Digital Humanities Lab at Yale, I'm building an accessible software library for direct visualization designed to work inside computational notebooks. Direct visualization is conceptually simple but extraordinarily powerful, and it offers a kind of ideal to be sought after in the tougher cases.
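The core of this approach can be sketched in a few lines. The snippet below is a minimal illustration, not the lab's actual pipeline: it sorts a collection of images by a single visual feature (here, mean brightness, a stand-in for whatever feature an analyst chooses) and assigns each image a position on a grid canvas, where a human expert could then inspect the whole collection at once. The function name `grid_layout` and its parameters are my own for illustration.

```python
import numpy as np

def grid_layout(images, n_cols, cell=64, key=lambda im: im.mean()):
    """Sort images by a visual feature (default: mean brightness) and
    assign each one an (x, y) canvas position, filling rows left to right."""
    order = sorted(range(len(images)), key=lambda i: key(images[i]))
    positions = {}
    for rank, idx in enumerate(order):
        row, col = divmod(rank, n_cols)
        positions[idx] = (col * cell, row * cell)
    return positions

# Toy data: three uniform grayscale "images" of differing brightness.
imgs = [np.full((8, 8), v) for v in (0.9, 0.1, 0.5)]
pos = grid_layout(imgs, n_cols=2)
# The darkest image (index 1) is placed first on the canvas, at (0, 0).
```

In a notebook, the returned positions would feed a rendering step (e.g., matplotlib's `OffsetImage` placed at each coordinate); the layout logic itself is the conceptually simple part.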
Of course, if our data are abstract, direct visualization is not an option. It is helpful to notice, however, that images—as used in our visualizations—are special cases of glyphs. Large collections of glyphs can be organized using the same sorting principles we apply to images. Moreover, the success of direct visualization demonstrates the impressive bandwidth of the visual channel and suggests that visualizations using suitably designed high-dimensional glyphs will fare similarly, provided we are able to cultivate a similar level of expertise in their apprehension. This alone would be reason enough to support the spread of glyph visualization. But even independently of the rather ambitious goal of data perception, there are good reasons to use unit-based methods. For one, if you give each data point a distinct visual mark, the user can select it. This was part of the justification for using glyphs in Wintour. More importantly, however, unit-based visualizations guard against the evils of aggregation. It is nearly always better to see a distribution in full than to see it binned, and computer technology has made the former as easy to produce as the latter.
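The evil of aggregation can be made concrete: binning is lossy, so visibly different samples can produce identical histograms. The toy example below (the data are invented for illustration) shows two samples that a histogram cannot tell apart, even though a unit-based plot marking every point would immediately reveal that one is tightly clustered with an outlier and the other evenly spread.

```python
import numpy as np

bins = [0.0, 0.5, 1.0]
a = np.array([0.10, 0.11, 0.12, 0.90])  # a tight cluster plus one outlier
b = np.array([0.20, 0.30, 0.45, 0.55])  # spread fairly evenly

# Both samples bin to the same counts: three points low, one point high.
ha, _ = np.histogram(a, bins=bins)
hb, _ = np.histogram(b, bins=bins)
# ha and hb are both [3, 1], yet a and b are plainly different data.
```

A unit chart or strip plot, which gives each of the eight points its own mark, preserves exactly the structure the histogram throws away.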