Site: Advanced Telecommunications Research Institute International (ATR)
2-2 Hikaridai, Seika-cho, Soraku-gun
Kyoto 619-02, Japan
Date Visited: 27 March 1998
WTEC Attendees: R. Chellappa (report author), B. Brown-Davis, R. Larsen, J. Mendel, H. Morishita, R. Reddy
Hosts:
ATR, a private company, was established in 1986 with support from industry, academia and government under the Japan Key Technologies Center (KTC) initiative with a mandate to serve as a major center of basic telecommunications R&D. ATR also fosters national and international research collaborations through an invited researcher program, workshops and seminars. The research activities of ATR are carried out in four research laboratories:
Transitioning of research into products is done under the auspices of one or more of the following laboratories:
Findings are widely disseminated through domestic and international conferences and journals. ATR International's capital of ¥22.03 billion comes from investments by 140 companies. The total annual research budget is about ¥8 billion, divided among the basic research laboratories. Seventy percent of the research budget comes from the government and the rest from private companies.
The total number of employees is 295, of whom 235 are involved in research. The 235 researchers comprise 53 invited international researchers, 38 invited domestic researchers, 118 researchers transferred from other companies, and 26 staff (in-house) researchers. The distribution of researchers among the four basic research laboratories and the headquarters is as follows:
The hosts were from the Media Integration and Communications, Interpreting Telecommunications, and Human Information Processing Research Laboratories. The site report discusses research projects that were selected from ongoing research efforts.
The activities of the Human Information Processing Lab were presented by Dr. Shigeru Akamatsu. The focus is on multi-modal interactions between perception and production through kansei. Kansei computing is defined as computing that relates to, arises from, or is influenced by human characteristics such as sensibility, perception, affection or subjectivity. A national research project sponsored by the Ministry of Education during 1992-1995 gave impetus to efforts to give computers human-like responsiveness.
One of the media of kansei information is the face, as it is able to express subtle emotions. There are two aspects to what is perceptible from a facial expression. One deals with the image engineering issues of accurate, robust algorithms for face and expression recognition. The other deals with the human-science aspects of mimicking human information processing. For example, to respond to queries such as "Who looks similar here?" or "Who is the most senior?" one should incorporate ideas from human information processing. The emphasis of this group is on combining psychophysics and image engineering. Annual symposia with this emphasis are held on face processing, with invited lectures from domestic and international researchers. Dr. Michael Lyons demonstrated a gender/expression recognition system using the Gabor wavelet-based graph matching technique developed by Dr. C. von der Malsburg and associates. Dr. Lyons is investigating the possibility of using discriminant analysis for this problem.
The research activities of the Interpreting Telecommunications Research Laboratories (ITL), supported by the Japan Key Technology Center and by major Japanese companies, started in March 1993. The main focus is basic research on multilingual speech translation technologies. As part of this seven-year effort, ITL researchers plan to develop key component technologies such as speech recognition, language translation and speech synthesis. ITL is a core member of the international Consortium for Speech Translation Advanced Research (C-STAR), and its researchers are taking part in a joint experiment in multilingual speech translation. Languages under consideration are Japanese, Korean, English and German. The WTEC team saw a demonstration of multilingual chat translation in the domain of travel-related conversations. The prototype system demonstrated combines examples and rules in a unified framework; it handles two-way translation between Japanese and English, Korean or German, and outputs synthesized speech in these languages. The example-based approach is trained using a large number of spoken expressions, while the rule-based approach makes full use of linguistic rules. Multilingual translation is accomplished in an incremental fashion. Example-based translation is done using a semantic similarity measure between words. Rule-based translation relies on manipulation of syntactic rules, lexicons and dependency structure rules.
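The example-based step described above can be illustrated with a minimal sketch. It is not ATR's implementation: the toy thesaurus, the example patterns, and the names semantic_distance and translate are all hypothetical, and the distance measure (one minus the shared fraction of a word's path through the thesaurus) is only one plausible reading of "semantic similarity between words". The sketch picks the stored example whose words are semantically closest to the input and reuses that example's target pattern.

```python
from typing import Dict, List, Tuple

# Hypothetical toy thesaurus: each word maps to its path of categories
# from the most general class down to the most specific one.
THESAURUS: Dict[str, Tuple[str, ...]] = {
    "hotel": ("place", "building", "lodging"),
    "inn": ("place", "building", "lodging"),
    "kyoto": ("place", "region", "city"),
    "tokyo": ("place", "region", "city"),
    "conference": ("event", "gathering"),
    "meeting": ("event", "gathering"),
    "fee": ("abstract", "money"),
    "price": ("abstract", "money"),
}

# Hypothetical example base for a two-word source pattern "X <particle> Y":
# each entry pairs the source words with an English target pattern.
EXAMPLES: List[Tuple[Tuple[str, str], str]] = [
    (("kyoto", "conference"), "{y} in {x}"),
    (("hotel", "fee"), "{y} for {x}"),
]


def semantic_distance(a: str, b: str) -> float:
    """Return a distance in [0, 1]; 0 for identical words, 1 if unrelated."""
    if a == b:
        return 0.0
    path_a, path_b = THESAURUS.get(a), THESAURUS.get(b)
    if path_a is None or path_b is None:
        return 1.0  # unknown words are treated as maximally distant
    shared = 0
    for cat_a, cat_b in zip(path_a, path_b):
        if cat_a != cat_b:
            break
        shared += 1
    # The "+ 1" counts the word itself as the final level of its path,
    # so two different words in the same class are close but not identical.
    return 1.0 - shared / (max(len(path_a), len(path_b)) + 1)


def translate(x: str, y: str) -> str:
    """Reuse the target pattern of the example nearest to the input (x, y)."""
    (_, _), pattern = min(
        EXAMPLES,
        key=lambda ex: semantic_distance(x, ex[0][0])
        + semantic_distance(y, ex[0][1]),
    )
    return pattern.format(x=x, y=y)


if __name__ == "__main__":
    print(translate("tokyo", "meeting"))  # nearest example: kyoto / conference
    print(translate("inn", "price"))      # nearest example: hotel / fee
```

In this sketch, an input that never appears in the example base is still handled, because the retrieval depends on word similarity and not on exact matches.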
Natural-sounding speech synthesis using a concatenation-based scheme was also demonstrated. More details on the multilingual translation algorithms developed at ATR may be found in Sumita and Iida 1992, Furuse and Iida 1996, Wakita et al. 1997, and Mima et al. 1997. Related work on parallel implementations using a massively parallel associative processor and a CM-2 Connection Machine is described in Sumita et al. 1993 and in Sumita et al. 1994. A comparison of implementations of example-based retrieval schemes on serial, MIMD and SIMD architectures is given in Sumita et al. 1994. Given the relationship between architecture and response time, an appropriate architecture can be chosen to meet a given response-time constraint. Related work using associative parallel processors may also be found in Sumita et al. 1995.
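The idea of selecting an architecture from a response-time constraint can be sketched with a simple cost model. Everything here is an assumption made for illustration and is not taken from the cited papers: the linear model (a fixed parallel overhead plus the time to scan each processor's share of the example base), the function names, and every number in the usage lines are hypothetical.

```python
import math
from typing import Optional


def response_time(n_examples: int, per_example_s: float,
                  processors: int = 1, overhead_s: float = 0.0) -> float:
    """Assumed model: fixed overhead plus a scan of one processor's share."""
    share = math.ceil(n_examples / processors)
    return overhead_s + share * per_example_s


def smallest_processor_count(n_examples: int, per_example_s: float,
                             overhead_s: float, limit_s: float,
                             max_processors: int = 1024) -> Optional[int]:
    """Return the smallest power-of-two processor count meeting limit_s.

    Returns 1 if a serial machine (no parallel overhead) is fast enough,
    and None if no configuration up to max_processors meets the limit.
    """
    if response_time(n_examples, per_example_s) <= limit_s:
        return 1
    processors = 2
    while processors <= max_processors:
        t = response_time(n_examples, per_example_s, processors, overhead_s)
        if t <= limit_s:
            return processors
        processors *= 2
    return None


if __name__ == "__main__":
    # Hypothetical workload: 100,000 examples at 10 microseconds each,
    # with 10 ms of communication overhead on a parallel machine.
    for limit in (2.0, 0.1, 0.005):
        print(limit, smallest_processor_count(100_000, 1e-5, 0.01, limit))
```

Under such a model, a loose constraint is met by a serial machine, a tighter one needs more processors, and a constraint below the fixed overhead cannot be met by adding processors at all.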
An agent interface that serves as a tour guide was demonstrated. This generic approach will be useful for exploring cyberspace. Other ongoing research efforts in this laboratory are communication by mental images, including kansei processing; virtual reality; art and technology, including interactive environments; understanding of emotions from music and people's voices; and human communications science. Due to time constraints, the site visit team could not see demonstrations of these projects. See the WTEC Panel Report on Human-Computer Interaction (http://itri.loyola.edu/hci/) for more details concerning ATR's HCI research. The forthcoming WTEC report on the Japanese Key Technologies Center program will also include further information on ATR (see http://itri.loyola.edu).
Furuse, O. and H. Iida. 1996. Incremental translation utilizing constituent boundary patterns. In Proc. 16th Intl. Conf. on Comp. Ling. Copenhagen, Denmark. Aug: 412-417.
Mima, H., O. Furuse, Y. Wakita, and H. Iida. 1997. Multi-lingual spoken dialog translation system using transfer-driven machine translation. In Proc. Machine Translation Summit VI. San Diego, CA. Nov: 148-155.
Sumita, E. and H. Iida. 1992. Example-based transfer of Japanese adnominal particles into English. IEICE Trans. Inf. and Syst. Vol. E75-D. July: 585-594.
Sumita, E., K. Oi, O. Furuse, and H. Iida. 1993. Example-based machine translation on massively parallel processors. Proc. IJCAI-93. Chambery, France. Sept: 1283-1288.
Sumita, E., N. Nisiyama and H. Iida. 1994. The relationship between architectures and example-retrieval times. In Proc. 12th National Conf. on Artif. Intell. Seattle, WA. Sept.
Sumita, E., K. Oi, O. Furuse, H. Iida, and T. Higuchi. 1995. Example-based machine translation using associative processor. Jl. Natural Language Processing. Vol. 2: 27-48.
Wakita, Y., J. Kawai, and H. Iida. 1997. Correct parts extraction from speech recognition results using semantic distance calculation, and its application to speech translation. In Proc. Spoken Language Translation. Madrid, Spain. July.