Donald A. Norman
I am a newcomer to the area of television. At Apple, I head the
research group -- The Advanced Technology Group -- and like my
colleagues in other information technology companies, I am helping
today's computer technology move into the 21st Century, where
information and social interaction will be pervasive, simpler, and
more responsive to social needs. We are in the midst of an
interesting revolution, one that I am sure historians 200 years from
now will call one of the more profound technological changes in
written history. This revolution is really about social interaction,
collaboration, and access to knowledge. It isn't about telephones or
computers or television.
One of the standard design exercises in my field is that of the
scenario: you imagine some people with as much precision as possible
about their gender, age, job, family, everyday problems. Then you try
to see how these people would behave with the new technologies under
discussion. Scenarios are important because they let us examine how
the technology will really work in context from the very first,
formative stages of the idea. Let me give you two simplified examples
in which I will not elaborate upon the people or their lifestyles,
but simply upon the kinds of technology we are envisioning.
This scenario was about baseball, but it could just as easily have been
about gardening, or cooking, or news. It could even be about
advertisements. Suppose that advertisers always made available more
information about their products. Some of the interesting scenarios
that we have developed have advertising playing the role of providing
needed product information, so much so that people would request it.
Now imagine that, businesses of the world: consumers who beg you to
give more information about your products.
In fact, people generally do want more information about products
when they are ready to buy. Advertising is annoying because, in the
current mode of television, it interrupts the flow of events and,
moreover, is often not relevant. Newspaper and magazine
advertisements don't interfere nearly as much because the nature of
printed matter is such that advertisements are easy to skip when they
are not relevant, and equally easy to concentrate upon when they are.
TV doesn't have this property because of the sequential nature of the
information flow. As a result, ads are more useful and less annoying
in magazines and newspapers than on TV. Some of our scenarios show
how this pattern could change to the point that consumers might even
pay to see television advertisements.
Now look at another scenario -- school homework:
The basic scenario here is simple, although a bit optimistic: my
teen-age children would never work quite so cooperatively with their
parents. I present it here to show the potential of the marriage
between TV and the NII for education, in this case both to let the
family search for relevant information and to let them view
retrieved television clips.
Note how both scenarios capture the essence of our vision for future
television and information services: effortless traveling back and
forth. The basic scenarios, by the way, are possible today on any
computer hooked to the World Wide Web. What isn't possible is the
smooth integration with commercial television. These scenarios
require that viewers be able to direct their searches across the
Internet for relevant material, implying low bandwidth data
transmission from the home and text on the screen that is easy to
read. To make this possible, we need close to the same quality text
display now found on GUI-based computers, which is a far cry from
what today's TV set can display. And finally, there is the ability to
transfer text and images from one document to another.
In both of the scenarios and in any situation where a viewer requests
more information, the viewer is learning. This is the ideal situation
for education: allowing people to learn because they want to learn,
because the information being requested and read is interesting and
relevant to their concerns. Imagine when students can immediately
look up information on topics that interest them, and then display,
read, scan, and print out relevant findings. No longer is access
limited to textbooks or even the local library: the library has
become international.
I could present another dozen scenarios. The range of possibilities
is enormous, from education for elementary school, high school, and
college, for self-learning and business education. For home
improvement, for travel, hobbies, and entertainment. For work and for
play.
Note that we still maintain the distinction between "television" and
"computers." We do not believe that either will supplant the other.
To us, "television" and "computers" are primarily distinguished by
usage. When people watch television, they are in "receive" mode --
receiving more information than they are providing. TV sets will have
large screens and will typically be viewed by a group of people from
a distance with only a limited amount of interaction. Their main
function will be entertainment and education. Computers, on the other
hand, are much more interactive in function. When people use
computers, they will be generating much of the material -- writing
letters or essays, drawing, financial activities, or communicating
with others. Computers will therefore be smaller than television
sets, meant to be used by one or two people at a time rather than the
groups that congregate around television sets, with considerable
interaction with the material. Technically, however, the television
set and the computer will be very similar. Both will have
communication ports to the external world, both will have
high-resolution display screens, both will have CPU chips and
considerable local storage and working memory. But they will look
different because they will be constructed for very different usage
patterns.
To those of us in the computer business, these scenarios make
great sense. In fact, the only question is when, not if. Most of us
have given the HDTV developments only passing interest. Most people
don't know what ATV stands for, or what the Grand Alliance is. I did
a good deal of learning in getting ready for the talk. Mind you, we
know that HDTV/ATV will be important for us, but we assume,
therefore, that the natural regulatory process will play itself out
and then we will build upon the result. The National Information
Infrastructure (NII) and advanced television (ATV) seem like a
natural marriage.
Hah. Blind faith is not a good way to proceed.
I look at what is happening and I am amazed. Um, whatever happened
to the information revolution? It never got to the Grand Alliance. I
think about those scenarios, and I don't see them happening. The
standards process, as written, makes it very, very difficult. Let me
share with you my analysis of why: Basically, it is because of
neglect of the requirements of the information world. We assumed that
information from the NII -- the world of computers -- could share the
screen with information from ATV -- the world of television.
Nope.
The great promise of Advanced TV for the NII comes from some of its
technological components:
To my great surprise, none of these seem to be issues in ATV.
Instead, the emphasis has been on entertainment and on the commercial
model of current broadcast TV, one in which entertainment titles
contain little text, and the sending of data, if it is to be done at
all, is one-way, separate from the TV service. Two way interaction,
if done at all, is very asymmetric, with the viewer perhaps selecting
catalog items or shows, perhaps sending back a credit card number and
purchase choice. The notion of symmetric interaction, where some
viewers (such as a real estate firm, a high school, or even a private
individual) might publish their own information for others, seems
neglected.
I am very worried by what I see. The problem is two-fold. First of
all, the members of the Grand Alliance, on the whole, are dominated
by the television industry, yet we are in the midst of a
technological revolution that is merging television with
communication with computers. What happened to representatives from
those other industries?
Second, the assumptions are all based on the cost model of current
broadcasting and production, transmission, and reception facilities.
Great care is taken to minimize cost. But cost minimization also
means a slackening of quality.
You know, standards such as Advanced Television are going to be with
us a very long time. We will never be able to change some of its
conventions. Yet what is costly today will be cheap tomorrow. The
standards will still be here -- the cost will be gone -- and the
American public will be the loser.
Let me discuss my issues separately.
Digital signals that convey picture, sound, text, and
data.
There are a number of issues that are important to the
computer industry in order for advanced television to be able to play
an important role in the emerging National Information
Infrastructure. Technically, they deal with:
In this talk I concentrate only upon the first four items because
from the computer graphics point of view, one thing is clear: you
have to have a clean, coherent image. This allows wonderful
processing effects to take place, enhancing the experience and the
understanding by the viewers.
But now, in the interest of economy, the picture is taken apart and
shown in segments (half now, the other half later -- interlacing),
and each element of the picture is distorted out of shape, wider
horizontally than it is deep, vertically -- non-square pixels. Think
about that for a moment. Interlacing is a technology whose time has
long since passed. Interlace is a technology of the 1930s. It was
essential to the success of NTSC, for it allowed a possible picture
in the days when it was simply not possible to send a cohesive
(progressively scanned) picture. But times have changed. We don't
need interlacing anymore. Worse, it gets in the way. It degrades the
image quality considerably. It makes computer processing and
augmentation of the image extremely difficult, and, in this age of
digital compression and MPEG-2, it doesn't even save much
bandwidth.
Now, I realize that the proposals are for a family of standards, some
of which meet our purposes quite well, others of which do not. Thus,
interlacing is only present in a few of the formats; the rest are
progressive scan. The problem is that the lowest quality images are
apt to dominate for reasons of (false) economy. Let interlacing in
and it will be difficult or impossible to eradicate. If the baseball
game is transmitted with interlaced scan, our scenario fails. It
won't help that there is an existing standard that calls for
progressive scan if the interlaced scan is allowed and used.
Non-square pixels are an accident. They don't do anyone any good, but
they certainly make life difficult for the computer graphics
processing. To say nothing of diminishing the quality of text that is
displayed on the screen.
Why are these such problems? Just imagine the difficulties. Suppose
the TV is displaying a person running from right to left. The odd
rows of the interlaced image show the person at a different location
than the even rows. If you wanted your TV set to capture the image so
you could print it out, you would have to settle for half an image
(only the even or the odd lines). If you tried to take advantage of
the full resolution of the picture and capture all the lines of the
image, it would be a mishmash of images because of the interlacing.
Note that if the image had been transmitted by progressive scan,
there would be no problem: the entire image would be consistent with
itself, and you would see a better quality picture and it would be
easier to capture a high quality image, using all of the lines.
Sound complex? It is. It's a mess. For everyday cinema or video, none
of this matters. For scientific or business visualization, it makes a
big difference.
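The capture problem described above can be sketched in a few lines. This is a toy illustration only, not any real video format: the frame size, the `render` and `weave` helpers, and the bar positions are all assumptions made up for the example. It draws a vertical bar moving from right to left, then compares a frame woven from two fields taken at different instants with a frame in which every row comes from the same instant.

```python
# Toy sketch: interlaced versus progressive capture of a moving object.
# All names and sizes here are hypothetical, chosen only for illustration.

WIDTH, HEIGHT = 12, 6


def render(bar_x):
    """Return a full frame (list of row strings) with a 2-pixel bar at bar_x."""
    row = "".join("#" if bar_x <= x < bar_x + 2 else "." for x in range(WIDTH))
    return [row for _ in range(HEIGHT)]


def weave(first_frame, second_frame):
    """Interlaced capture: even rows from one instant, odd rows from the next."""
    return [first_frame[y] if y % 2 == 0 else second_frame[y]
            for y in range(HEIGHT)]


def is_consistent(frame):
    """True when every row shows the object in the same place."""
    return len(set(frame)) == 1


# The bar moves from right to left between the two field times.
first_instant = render(bar_x=8)
second_instant = render(bar_x=4)

interlaced_capture = weave(first_instant, second_instant)
progressive_capture = first_instant  # every row comes from the same instant

for row in interlaced_capture:
    print(row)
print("interlaced consistent: ", is_consistent(interlaced_capture))
print("progressive consistent:", is_consistent(progressive_capture))
```

In the woven frame the bar appears at two different horizontal positions on alternating rows, which is the "mishmash" a full-resolution capture of an interlaced picture would produce. Keeping only the even or only the odd rows restores consistency, at the price of half the vertical resolution.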
Interlacing saves money at the camera end, and then a little bit in
transmission bandwidth. But it creates problems in production and in
the display if there is to be computer enhancement or computer
generated imagery added to the picture. With MPEG-2 compression,
there is little saving with interlacing because any two adjacent
lines of a progressively scanned image are apt to be very similar, so
it gets maximum advantage of the compression.
When the computer industry first brought out Graphical User
Interfaces about 15 years ago, the clear monitor of choice was the
television set, for it was mass produced, inexpensive, and readily
available. But it was simply not possible to use commercially
available sets. It simply is not possible to present high-quality text on
NTSC (or PAL or SECAM, for that matter).
If you take a look at what has transpired, the computer industry has
had to go to progressive scan, ever smaller pixel sizes, larger
screens, and faster refresh rates.
The human eye is remarkably sensitive to flicker, especially at the
periphery. The human eye is designed to be motion sensitive -- slight
motion in the periphery gets your attention. Flicker seems like
motion to the neural circuits. This is a special problem with big
screens (or to be more precise, with wide viewing angles). Apple soon
discovered that people want big screens. As a result, we have had to
go to refresh rates greater than 60 and often 70 frames/second.
The film industry recognizes this too. By historical accident, film
is taken at the rate of 24 frames/second, but if it is projected that
way, especially with the wide screen that is so popular today, it
flickers badly -- enough to create headaches and nausea. So the best
quality movie projectors break the beam of light for each picture
frame twice, meaning that each picture frame is projected three times
so that the eye sees 72 flashes per second, thus minimizing
brightness flicker.
Television has the added problem that it should really be using high
frame rates to avoid brightness flicker, but because it is
interlaced, even at these rates, there is interlace flicker when the
object being projected moves between the transmission of the two
parts of the interlace.
Interlacing is bad. High frame rates are good. Alas, we are stuck
with our history, in which there are three major existing TV
standards (PAL, SECAM, and NTSC) and one primary commercial
motion picture standard: PAL and SECAM use 25 frames/second
(interlaced), NTSC uses approximately 30 frames/second (interlaced),
and movies use 24 frames/second.
Faster scan rates are needed to capture movement. Rapid display rates
are needed to eliminate flicker. The existing rates of 24 (film), 25
(European TV) and 30 (NTSC) frames/second are too slow for both
motion capture and flicker prevention.
Our studies show that it is possible to produce a reasonable
compromise with 24 frames/second of progressively scanned images,
because that allows easy display of 48 or 72 images/second and
relatively simple conversion from 25 and 30 frames/second by schemes
already in wide use. When large screen displays are used, it is easy
to show each frame three times, yielding 72 images/second.
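The arithmetic behind these display rates is simple enough to check directly. The sketch below is illustrative only: the function names are made up, and the 60-per-second flicker threshold is an assumption taken from the "greater than 60" figure mentioned earlier in this talk, not a measured constant.

```python
# Sketch: how many times must each frame be shown to exceed a flicker threshold?
# FLICKER_THRESHOLD is an assumed value for illustration, not a standard.

FLICKER_THRESHOLD = 60  # images/second the display rate should exceed

SOURCE_RATES = {
    "film": 24,
    "PAL/SECAM": 25,
    "NTSC (approx.)": 30,
}


def display_rate(frames_per_second, shows_per_frame):
    """Images/second seen by the eye when each frame is shown several times."""
    return frames_per_second * shows_per_frame


def min_shows_per_frame(frames_per_second, threshold=FLICKER_THRESHOLD):
    """Smallest whole number of showings whose display rate exceeds threshold."""
    return threshold // frames_per_second + 1


for name, fps in SOURCE_RATES.items():
    shows = min_shows_per_frame(fps)
    print(f"{name}: {fps} frames/s x {shows} = {display_rate(fps, shows)} images/s")
```

For 24 frames/second, showing each frame twice gives 48 images/second and showing it three times gives 72, the two display rates discussed above.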
High resolution screens
It must be possible to
put text on the screen that can actually be read. On my small
computer screen, I can read a page of printed text that looks almost
as clear and precise as the printed page itself -- not quite as good,
but it is getting there rapidly.
On my big, expensive home TV, I'm lucky if I can read 12 lines of
text. Movie titles scroll illegibly across the screen. Here is where
the number of lines in the picture makes a big difference. For data,
we need more, not less. But we can deal with numbers such as 640 wide
by 480 deep. This is a small screen, but workable if the image that
is presented has square pixels, progressive scan, and is refreshed
frequently enough.
Conclusion: we need at least 640 by 480 pixels, displayed at least 24
frames/second with progressive scan, and refreshed at a 48 or 72
images/second rate. Note that a 640 by 480 image can easily be
stretched by a factor of 4/3 to yield a screen aspect ratio of 16:9
or stretched by 3/2 to yield an aspect ratio of 2:1, the movie
industry's choice.
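The stretch arithmetic can be checked with exact fractions. This is a small worked example, not part of any proposal; the helper name `stretched_aspect` is made up for illustration. A 640 by 480 image has a 4:3 aspect ratio, so widening it by 4/3 gives (4/3) x (4/3) = 16/9, and widening it by 3/2 gives (4/3) x (3/2) = 2.

```python
from fractions import Fraction


def stretched_aspect(width, height, horizontal_stretch):
    """Aspect ratio after widening a width x height image by the given factor."""
    return Fraction(width, height) * horizontal_stretch


def as_ratio(aspect):
    """Format an exact fraction as 'W:H'."""
    return f"{aspect.numerator}:{aspect.denominator}"


base = Fraction(640, 480)
wide = stretched_aspect(640, 480, Fraction(4, 3))
cinema = stretched_aspect(640, 480, Fraction(3, 2))

print("base:  ", as_ratio(base))
print("x 4/3: ", as_ratio(wide))
print("x 3/2: ", as_ratio(cinema))
```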
Two way interaction between source and viewer
Current NTSC is barely capable of transmitting data.
Basically, there are those 24 left-over lines in the vertical
blanking interval. Not much room.
What we need is a standard that allows the transmission of data, if
necessary by freezing the image on the screen and then using the
normal image transmission period for data. In one frame, we could
transmit a lot of data, and the viewer might never note that a single
frame had been frozen one extra cycle.
What forms of data? Who knows? That's the whole point about
technological revolutions: you don't know what is going to
happen.
Therefore the data transmission standard must have a flexible,
self-defining structure that allows data formats that have not yet been
invented to be sent in some future year. The current proposal is not
sufficiently robust in its structure to deliver computer code or data
with sufficient accuracy to be useful. It needs a layer of error
correction.
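One way to picture a "self-defining" structure is a stream of records that each carry a type tag and a length, so that a receiver can skip record types it has never heard of. The sketch below is purely hypothetical: the record layout, the tag names, and the functions are invented for illustration and are not taken from the ATV proposal. It also uses a CRC-32 checksum, which only detects corruption; a real error-correction layer would need a correcting code on top of this.

```python
import struct
import zlib

# Hypothetical record layout (an assumption for illustration):
#   4-byte type tag | 4-byte big-endian payload length | payload | 4-byte CRC-32
HEADER = struct.Struct(">4sI")
TRAILER = struct.Struct(">I")


def encode_record(type_tag: bytes, payload: bytes) -> bytes:
    """Wrap a payload with its type tag, length, and a CRC-32 checksum."""
    if len(type_tag) != 4:
        raise ValueError("type tag must be exactly 4 bytes")
    body = HEADER.pack(type_tag, len(payload)) + payload
    return body + TRAILER.pack(zlib.crc32(body))


def decode_records(stream: bytes, known_types):
    """Return (tag, payload) pairs for known types, skipping unknown ones."""
    records = []
    offset = 0
    while offset < len(stream):
        type_tag, length = HEADER.unpack_from(stream, offset)
        end = offset + HEADER.size + length
        body = stream[offset:end]
        (expected_crc,) = TRAILER.unpack_from(stream, end)
        if zlib.crc32(body) != expected_crc:
            raise ValueError(f"corrupted record at offset {offset}")
        if type_tag in known_types:
            records.append((type_tag, body[HEADER.size:]))
        # Unknown types are stepped over using the length field alone.
        offset = end + TRAILER.size
    return records


stream = (
    encode_record(b"TEXT", b"hello")
    + encode_record(b"FUTR", b"\x00\x01\x02")  # a format not yet invented
    + encode_record(b"TEXT", b"world")
)
print(decode_records(stream, known_types={b"TEXT"}))
```

Because the length travels with every record, a receiver built today can pass over a format defined years later without misreading the records that follow it.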
Once again: the ATV standards are apt to be with us for 50 years, and
by then, technology will be very different. NTSC was invented before
the notion of digital data, before the computer, before the
transistor. The world has changed a lot since then.
The transmission has to allow for two-way interaction because small
and large businesses, high schools, elementary schools, colleges, and
universities are all going to want to generate and transmit their own
information.
We want a world in which one television set provides the image for
everyone on the NII.
I haven't been able to discover why we have 50 Hz power in some
parts of the world and 60 Hz power here, but that accidental decision
dominated the technical choices in our original television standards
of NTSC, SECAM, and PAL and seems impossible to remove today.
Similarly, the use of 24 frames/second in movie theaters is equally
arbitrary, and the fact that it is synonymous with neither the 25 nor
the 30 frames/second of TV is equally bizarre. Interlacing was a
technological hack, now elevated into a world-wide standard. Can we
get rid of it?
Standards are forever, because once established, they simplify and
dominate the lives of millions, even billions. The "qwerty" keyboard
seems forever with us, as does the English system of measurement, at
least in the United States. Which is the correct side of the road on
which to drive? Right or left? Obviously, it doesn't matter, as long
as everyone does the same, but wouldn't the world be better for
automobile manufacturers and drivers if everyone had agreed upon the
same standard? Think it would be possible to agree upon a single,
world-wide standard now? No way.
I am sure each of you has your own experience with old technologies
and methods that your industry is forced to maintain because, once
upon a time, long ago in the past, it was the standard and today,
there are far too many people who depend upon that ancient,
antiquated method.
These are examples of standards that stay with us for a long time,
causing inconvenience and expense. I worry about the impact of the
ATV proposals in this way -- how many old standards are we
perpetuating for yet another century?
But what of other standards that lock us in to inappropriate
methods, that prevent advances? This is where I really worry. This is
where I fear that insufficient thought has been given in the ATV
process.
Standards based on costs are dangerous. The costs drop, while the
standards stay.
We are in the midst of an information revolution in which the
fields of entertainment, communications, and computation converge. If
things go well, it will be possible to merge the strengths of each
field, to create services and experiences not possible today. Many
activities stand to benefit -- education, business, personal
interaction, fun, entertainment, and recreation. On the other hand,
if we do any one of the areas wrong, then the expected convergence
will not occur: television, communication, and computation will go
their separate ways, co-existing but not co-supporting.
The chance to establish critically important technological standards
does not often occur. Today we are at that point. Are we to take the
easy way, emphasizing the least common denominator, doing what we can
get away with rather than what is best? Will we let the temporary
expediency of costs lock deficient standards into place for a large
fraction of the next century?
The proposed "Grand Alliance" standards for Advanced TV are very
close to being satisfactory for the NII as well as for TV. In fact,
to make them work only requires the deletion of some of the
alternative formats. We are not asking for radical change. Indeed, we
ask for simplification of the family of standards to one that will
guarantee compatibility between the world of television and the world
of information. Today's standards do not do that because they allow
for inappropriate transmission standards in the guise of a temporary
efficiency in costs. It is these inappropriate standards that we must
eliminate.
I am here to urge you to do the right thing, not the cheap
thing. Recognize that the Advanced Television Standards are really
the Advanced Information Services Standards that will allow Advanced
Television to become a major, central part of the National
Information Infrastructure. Intelligent choices can lead us to great
societal advances. Standards are forever. Costs are temporary. What
is expedient and expensive today will be foolish and inexpensive
later. Let us do the right thing.