Toppan's entry into packaged multimedia includes CD-ROM and DVD products. In an area flooded by technical standards, Toppan's efforts are aimed at improving product quality within the confines of those standards. For example, a troubling aspect of MPEG encoding is that motion pictures can suffer from "jitter," in which the image jumps slightly in position from frame to frame. Toppan demonstrated a system that removes jitter from MPEG-encoded video. It also has techniques for improving the shading and tone of JPEG images to make them more realistic.
The network library system being developed at NTT provides multimedia services based on a broadband ATM network. The network is served by hi-fi music, MPEG-1, MPEG-2 and digital library servers. Processing engines for voice recognition, search, Japanese/English translation and text-to-speech are provided. A key component in this network is a super-high-definition display with a resolution of 2048 x 2048 pixels at 24 bits/pixel, operating at 60 frames/sec for video. The network library is being used for doctors' viewing of medical images, sightseeing tours, teleconferences and on-the-fly machine translation between Japanese and English.
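To give a sense of scale, the display figures quoted above can be turned into a raw data rate. The short sketch below assumes uncompressed video, which the source does not state; the function name and the unit conversions are illustrative only.

```python
def raw_video_bit_rate(width, height, bits_per_pixel, frames_per_second):
    """Return the uncompressed video data rate in bits per second."""
    return width * height * bits_per_pixel * frames_per_second


# Figures quoted for the NTT super-high-definition display.
WIDTH, HEIGHT = 2048, 2048
BITS_PER_PIXEL = 24
FRAMES_PER_SECOND = 60

# Size of one uncompressed frame in bytes (assumption: no padding or overhead).
frame_bytes = WIDTH * HEIGHT * BITS_PER_PIXEL // 8
rate = raw_video_bit_rate(WIDTH, HEIGHT, BITS_PER_PIXEL, FRAMES_PER_SECOND)

print(f"One frame: {frame_bytes / 2**20:.0f} MiB")
print(f"Raw video rate: {rate / 1e9:.2f} Gbit/s")
```

Under these assumptions each frame occupies 12 MiB and the raw stream is roughly 6 Gbit/s, which suggests why a broadband network and compressed formats such as MPEG-2 matter for a system of this kind.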
Electronic commerce is viewed as one of the promising opportunities of the 21st century. Major concerns in making it feasible are guaranteeing security, protecting copyrights, and maintaining the timeliness of transactions. The WTEC team saw two especially interesting demonstrations illustrating how electronic money can be securely moved between interested parties and how copyrights can be protected in the sale and distribution of digital objects. In the electronic money demonstration, a smart card is used to make purchases from anywhere, as long as one is connected to the network. When digital objects are marketed over the network, sellers need to ensure that their copyrights are protected. NTT's InfoProtect project demonstrates the secure distribution of images. The owner of the digital content first creates a partial (semi-disclosed) image and its descrambling key. The descrambling key is registered with the system center and the partial image is transmitted to the potential buyer. The buyer inspects the scrambled image, decides whether to purchase, and buys the descrambling key via InfoKey, a secure key transmission protocol developed at NTT. The key is then used to descramble the image. The buyer's ID is embedded in the image using digital watermarking, providing protection against copyright violation.
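The flow described above (semi-disclosed image, descrambling key, buyer-ID watermark) can be illustrated with a toy sketch. This is not NTT's InfoProtect or InfoKey method, whose details the source does not give. It assumes an 8-bit grayscale image held as bytes, scrambles only the six low-order bits of each pixel with a SHA-256-derived keystream so that a coarse preview remains visible, and hides the buyer ID in least-significant bits. All names and parameters are hypothetical, and the scheme is far too simple for real copyright protection.

```python
import hashlib

LOW_BITS_MASK = 0x3F  # assumption: scramble the 6 low bits, leave 2 high bits visible


def keystream(key: bytes, n: int) -> bytes:
    """Derive n pseudo-random bytes from key using SHA-256 in counter mode."""
    out = bytearray()
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:n])


def scramble(pixels: bytes, key: bytes) -> bytes:
    """XOR the low-order bits of each pixel with the keystream."""
    ks = keystream(key, len(pixels))
    return bytes(p ^ (k & LOW_BITS_MASK) for p, k in zip(pixels, ks))


# XOR is its own inverse, so descrambling is the same operation.
descramble = scramble


def embed_buyer_id(pixels: bytes, buyer_id: int, id_bits: int = 32) -> bytes:
    """Write buyer_id, least-significant bit first, into the LSBs of the first pixels."""
    if len(pixels) < id_bits:
        raise ValueError("image too small to hold the buyer ID")
    out = bytearray(pixels)
    for i in range(id_bits):
        out[i] = (out[i] & 0xFE) | ((buyer_id >> i) & 1)
    return bytes(out)


def extract_buyer_id(pixels: bytes, id_bits: int = 32) -> int:
    """Read back the buyer ID embedded by embed_buyer_id."""
    return sum((pixels[i] & 1) << i for i in range(id_bits))


# Toy walk-through with made-up data.
original = bytes(range(256))                  # stand-in for image pixels
key = b"hypothetical-descrambling-key"        # registered with the "center"
preview = scramble(original, key)             # semi-disclosed image sent to the buyer
restored = descramble(preview, key)           # after the buyer obtains the key
marked = embed_buyer_id(restored, 0xC0FFEE)   # copy delivered to this buyer
recovered_id = extract_buyer_id(marked)
```

In this sketch the preview keeps the two high-order bits of every pixel, so the buyer sees a coarse version of the picture before purchase, while the watermarked copy differs from the original only in the least-significant bits of the first 32 pixels.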
The high-presence video teleconference system demonstrated at NTT is centered around two large projection displays (each measuring 110 inches diagonally). The resolution is four times that of high-definition TV and enables interaction with life-sized humans. The quality of display performance was demonstrated using 2D monocular and stereo still images. The monocular images were viewed at a resolution of 6 million pixels/frame and the stereo pairs each had about 3 million pixels/image, giving excellent quality to the stereo images. Although the system as a whole is expensive, key components of the display technology have been commercialized. With sound localization, an enhanced multimedia presentation is possible, with applications to remote museums and education.
At Keio University, the WTEC team saw a demonstration of a virtual tour of the monastery of San Marco in Florence, Italy. The tour was displayed on three flat screens using back projection. About 1,000 photographs taken at the monastery were used with a 3D modeling package to create the tour. The building and surroundings were all synthesized, whereas the artwork was all photographed. This required 200 MB of storage. On the virtual tour it is possible to zoom in on the many works of art. The tour is controlled using a joystick.
The team also saw Toppan's Virtual Reality Gallery, which consists of a portion of a spherical screen in an auditorium giving a horizontal visual range of 150 degrees, so that the viewer is enveloped by the image displayed by a digital projection system with a resolution of 3,500 x 1,000 lines. Toppan demonstrated a virtual reality tour of the Sistine Chapel that was created by taking still photographs from 50 different vantage points throughout the chapel, digitizing them and using them to create a 3D digital model. The viewer is able to move around within the chapel by means of a hand-held joystick.
ATR demonstrated an agent interface that serves as a tour guide. This generic approach will be useful for exploring cyberspace. Other ongoing research efforts in this laboratory include communication by mental images (including kansei processing), virtual reality, art and technology (including interactive environments), understanding of emotions from music and people's voices, and human communications science. This chapter will now briefly elaborate on the concept of kansei computing, which appears to be similar to the concept of affective computing put forward by Picard (1997).
Kansei computing is defined as computing that relates to, arises from, or is influenced by human characteristics such as sensibility, perception, affection or subjectivity. A national research project at ATR sponsored by the Ministry of Education during 1992-1995 gave impetus to efforts to give computers human-like responsiveness. One medium of kansei information is the face, which can express subtle emotions. There are two aspects to what is perceptible from a facial expression. One deals with image engineering issues of accurate, robust face recognition and expression algorithms. The other deals with human-science aspects of mimicking human information processing. For example, to respond to queries such as "who looks similar here?" or "who is the most senior?" a system should incorporate ideas from human information processing. The emphasis of this group is on combining psychophysics and image engineering, which will guide the design of human-computer interface (HCI) systems. For further information, refer to the HCI report.