Why is SVG going to be REALLY BIG?

David Dailey

Professor
Slippery Rock University Department of Computer Science

David Dailey has been working with SVG since about 2003. He previously held appointments at Williams College, Vassar, and the Universities of Alaska, Tulsa, and Wyoming. In addition to SVG, his research interests include graph theory, semantics and cognition.

Abstract

Why is SVG going to be REALLY BIG? There are all the ordinary reasons that SVG will be big. Most of us involved in the SVG community can recite them by heart: it is an open standard; it uses XML; the graphics are scalable; server-side processing is minimized with much of the processing being off-loadable to the client; it works in concert with JavaScript, XML DOM, CSS, AJAX and HTML; browser support is almost universal (with a shrinking list of exceptions); SMIL is utterly cool; et cetera. But why is it going to be REALLY BIG? Well, it has to do with the nature of communication itself. Communication has suffered a number of setbacks over the years, not the least of which was the advent of the alphabet. SVG offers the possibility of expanding the bandwidth of communication in terms of the rate at which humans can produce and consume information. In short, it offers a paradigm shift for which the Internet is just the first of a series of developments that will constitute a revolution in human communication. It will be argued that SVG, in that revolution, is actually more important than HTML and that future historians will look upon the twenty-year radius of the present as the time that the World Wide Web and SVG came into existence. HTML will be largely forgotten.

Consider the relationship between language and thought. Thought is non-linear. It is cross-referential. It is sometimes verbal (or lexical), and sometimes it is not. We often experience the phenomenon of having a thought first and only afterward attempting to put that thought into words. That is, not all thoughts are composed of words, though many can be translated into words. While differing theories about the evolution of human language exist, some linguists postulate that human communication, prior to spoken language, may have been more gestural and less auditory. Regardless of this, there is good evidence that humans did develop an oral tradition prior to the development of written language. There is also evidence that many of the written languages were ideographic and spatial, rather than alphabetic and sequential. The first of our mistakes in developing written language may have been to pattern our writing after our speech, rather than patterning it after our underlying thoughts. Given the advances in printing and distribution of manuscripts that the Internet has afforded, perhaps we no longer need to be bound by the shortcomings that historical accident has infused into our written expression. SVG is key in making the next step.

Table of Contents

Major advances in the technology of communication.
Why the development of speech was a mistake
Why the development of the alphabet was a bigger mistake
How HTML is a linear medium like speech and the alphabet.
How thought is non-linear.
How SVG is non-linear.
How graph theory plus semantics plus pictures can convey thoughts in a cross cultural way.
How SVG can get where we need to go quicker than HTML can
Bibliography

Major advances in the technology of communication.

The SVG community's artist friend, Jerry Maddox, has worked with SVG for many years. He has encouraged me, perhaps in not so many words, to make this talk and paper a bit provocative. Certainly, at least, I think he would hope that a bit of historical breadth be brought to the table by another senior member of the SVG community. As such I will lay into this paper large slabs of speculation salted with morsels of hyperbole, for your reading enjoyment and consternation.

A.S. Diamond, writing on the history and origin of language, speculates like many scholars on the emergence of language as we stepped out of the primordial ooze as a species. Just what was it that first brought us to both raise our eyebrows knowingly and grunt at the same time?

One of the most readable and entertaining accounts comes from Lincoln Barnett's Treasure of our Tongue [Barnett1965], a celebrated account of popular linguistics by a non-linguist. He summarizes the competing theories of the day (which appear not that different from the competing theories of this day) as the bow-wow (onomatopoeic), pooh-pooh (mammalian anatomy), yo-he-ho (differentiated grunts), ta-ta (a Darwinian idea based on co-evolution of gesture and grunt), ding-dong (attributed to Max Muller but seeming to have a bit of Jung thrown in), or sing-song (again due to Darwin: speech began as song). My own view is perhaps a fusion of more than one of these, but like Darwin's is based on observation. In my case it comes from observing humans rather than reptiles. As such, I present these ideas more for the sake of contemplation than persuasion.

In the good old days when humans were out camping and grilling caribou and asparagus over an open fire and eating blueberries and honey for dessert, our hands were largely free for communicating. If we were not eating or hunting or preparing food or tools, our hands were able to do many things. We could make mudras, play charades, point, and use multi-finger control on our audience's visual displays. There are times, certainly, that the pointer is much richer than a keyboard and times when a body-full of pointers is richer than six keyboards. In fact, this theory of language holds that the oral tradition was not really just oral. It was oro-gestural! Whether the oral or the gestural came first is rather irrelevant. Gestures work quite well for certain communication even when there is no shared spoken language, as is well known to the international tourist.

At some time, humans developed speech. In the next section I argue that this was a step backwards in the evolution of the species, but there is a plenitude of historical and comparative linguistic evidence to suggest that happen it did, and that it happened long before writing systems emerged.

One Julian Jaynes, while a faculty member at Princeton, made quite a name for himself by positing that a major landmark in human neurological evolution happened between the writing of the Odyssey (a largely oral tradition finally recorded) and the Iliad. He saw the difference as signaling the development of the corpus callosum and its remarkable ability to prevent the left half from knowing what the right half is doing. At any rate, the transition from an oral tradition to a written one happened sometime during the early development of writing systems. However, with the exception of a few ideological systems, most writing systems have sought to encode not what ideas mean but rather what words have been used to say those ideas. We have chosen the spoken word itself as the unit of speech to serve as the basis for our writing system.

The most troubling aspect of this is that speech itself is a low bandwidth channel compared to vision, and when choosing to make our ideas visible, most cultures have turned with their writing system (whether through alphabet, syllabary, or ideography) to speech for inspiration. The idea is that two people within the same language group would be able to "read writing" and come up with the same words in conveying said idea to yet another person. However, speech is slow compared to vision, and when we choose speech (with all of the imperfection of its mapping to ideas) as the basis of a new visual form of communication, we have chosen a very flawed metaphor for ideas themselves. Writing had its advantages though, since ideas could travel farther than either gestures or speech. And writing became a way to hold the new empires together.

After the development of oral language, followed by written language, the next major transition usually posited by scholars tends to be the advent of the printing press. No longer was the "word" controlled by just the monarchies or the Church, each with their own agenda, but suddenly words were free to migrate openly through much of Europe (there was, after all, a Renaissance going on). This led to all manner of upheaval of the status quo. History sort of ambled along for another 500 years with minor and major squabbles and wars consuming much of Europe and Asia, until suddenly one day with no warning nor prior art, the Internet was given us as some sort of mysterious and divine act. With the Internet came new ways of packaging and broadcasting information, and new ways of distributing and accessing it. The Internet is usually seen as the next major technology of writing and its implications are predicted to be every bit as profound as the three previous advances in the technology of communication.

Why the development of speech was a mistake

So back to the pre-oral days of human communication: that period when the gesture was worth a thousand grunts. We humans were in a pre-literate day of oro-gestural communication, getting along quite well as hunters and gatherers. Then one day, some pharaoh or Emperor Cuzco in a distant land teaches his followers to cultivate and pick vegetables. Suddenly the hands that are the mainstay of human communication are stripped of their millennia-old role of communicating poetry and folklore, and assigned instead to the inglorious job of picking vegetables for the rulers of the newly emerging empires. The empires become so bloated on the overabundance of sugar beets and barley, so opulent and corpulent, that their thirst for new riches and new sources of gluttony pushes the empire further and further until at last writing is invented to bring messages from point A to B. In the earlier days experiments with gestural writing (pictograms) were tried, but since the peasants (the majority after all) have lost the ability to gesture (which is just as well for keeping their communication bandwidth low), even the nobility starts to lose the ability to communicate except in the painstaking manner of grunts that have been codified as speech.

Almost 30 years ago, a student of mine who was doing an independent study with me on the linguistics of American Sign Language (a topic she knew far more about than I) believed it was time for the professor to actually see what ASL looked like when it was spoken. We went to a party for the hearing impaired, and it made an enduring impression. Beyond the experience of having her translate to me the rich ideas being expressed with such ease and facility in ASL, I also saw something done which would not have been feasible using speech. Three people, standing in a triangle, were all "speaking" at the same time. Furthermore, they were all smiling and responding to one another as though they understood. When hearing people speak, not even two people can talk at the same time and still "hear" each other, but here were three people, apparently signing and understanding all at the same time. (For those who do not know it, ASL only resorts to an alphabet when words that have no gesture need to be conveyed -- things like personal names, product names, animals never before encountered, or infrared radiation -- much like the things that my semantic primitives did not encode well.) I asked my friend if this were common, and she said it was.

I reasoned as follows. First, the overall bandwidth (in terms of bits per second) of the visual system is probably far greater than that of the auditory system. (See Edward Tufte's comments on the visual system, which purportedly handles about 10 megabits per second: http://www.edwardtufte.com/bboard/q-and-a-fetch-msg?msg_id=0002NC ) Additionally, the rate at which we process visual information in terms of identifying, labeling and understanding what we see is probably far greater than the capacity of the auditory system. Speech can be processed and understood at a rate several orders of magnitude lower than the rate at which the retina conveys information to the brain, though it is not obvious how we might measure the meaningful comprehension of visual information. Nevertheless, given the human's well developed visual cortex, it stands to reason that we might indeed be able to process considerably more information visually than auditorially. The notion, therefore, that speech is an "advance" over gesture as a means of communicating can be held somewhat suspect.

Thus seen, the development of speech is not a glorious triumph of the human intellect, but a miserable setback caused by the greed of monarchical agribusiness. Humans used to tell good stories, until their hands got too busy picking vegetables.

Why the development of the alphabet was a bigger mistake

Okay, so a setback occurs. It might not have been so bad, if we did not then stoop a step further. Most of our cultures then decided to invent writing (to ensure that messages could be sent intact across the vast agri-empire) and, adding insult to injury, we patterned our writing not after our thoughts (which were far richer than our grunts) but after the audible grunts themselves, known as speech. Ah, eh, iii, oh, oooo! Yabadabadoo!

And to think we actually wrote this nonsense down and called it literature and then came to revere literature as though it had something to do with the human spirit. The selling of the alphabet was the greatest boondoggle ever. It slowed down the speed of communication between all humans, hence providing an inexorable momentum for preserving the status quo.

I mentioned that this would be provocative for the mere sake of provocation. The reader is under no more obligation to believe any of this than she is to believe the QWERTY keyboard is the best way to communicate with fellow humans!

How HTML is a linear medium like speech and the alphabet.

HTML at its core is (with the exception of the <table> element and possibly other elements related to spatial arrangement) 1.5 dimensional: that is, its fundamental metaphor consists of written speech (text), with occasional embedded belches of multimedia (<object>, <img>, <audio>, <video>) plus graph theoretic cross-references that provide a modest foray into translinearity.

For alphabet A={a1, a2, ... , an } and graphics G={G1, G2, ...Gk} (where A intersect G is empty) we may represent a typical text as

Figure 1. The text -- a linear sequence of characters with occasional pictures.

For vocabulary V={w1, w2, ... , wn }, including graphics G and anaphora H={h1, h2, ...hk} (where G and H are subsets of V), we may represent a typical hypertext as linear text with internal and external links. An example of anaphora might include H={this, that, which, he, her}.

Figure 2. The hypertext -- words and pix in sequence with references.
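The two representations above can be made concrete with a small sketch. It is only one possible reading of Figures 1 and 2, not a reconstruction of them: the class name Hypertext, the function is_linear, and the sample tokens are hypothetical names introduced for illustration. A text is modeled as a sequence of tokens drawn from the vocabulary (with graphics such as "G1" mixed in), and a hypertext adds reference edges, here an anaphor pointing back at its antecedent.

```python
from dataclasses import dataclass, field


@dataclass
class Hypertext:
    """Hypothetical model: a linear token sequence plus reference edges."""
    tokens: list                              # words and graphics, in reading order
    links: set = field(default_factory=set)   # (source index, target index) pairs

    def add_link(self, source: int, target: int) -> None:
        """Record a reference from one token position to another."""
        size = len(self.tokens)
        if not (0 <= source < size and 0 <= target < size):
            raise IndexError("link endpoints must be token positions")
        self.links.add((source, target))

    def reading_order_edges(self) -> set:
        """Edges implied by the sequence alone: each token leads to the next."""
        return {(i, i + 1) for i in range(len(self.tokens) - 1)}

    def edges(self) -> set:
        """All edges: the linear backbone plus the cross-references."""
        return self.reading_order_edges() | self.links


def is_linear(doc: Hypertext) -> bool:
    """True when the document has no edges beyond its reading order."""
    return not (doc.links - doc.reading_order_edges())


if __name__ == "__main__":
    # "G1" stands for an embedded graphic; "it" is an anaphor for "dog".
    doc = Hypertext(["the", "dog", "G1", "barked", "and", "it", "ran"])
    print(is_linear(doc))    # plain text: only the reading-order chain
    doc.add_link(5, 1)       # "it" refers back to "dog"
    print(is_linear(doc))    # the reference makes the structure non-linear
    print(len(doc.edges()))  # six sequence edges plus one reference
```

Under this reading, plain text is a path graph, and the hypertext is the same path with a sparse set of extra edges laid over it, which is one way to picture the "modest foray into translinearity" described above.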

Table 1.

1972 Experiment A: mapping semantic inferential content
Figure 4. Notes from notebook showing web of thought.

1972 Experiment B: mapping semantic inferential content
Figure 5. View of notebook showing web of thoughts.