Site: Advanced Telecommunications Research
Institute International (ATR)
2-2 Hikaridai Seika-cho Soraku-gun
Kyoto 619-02, Japan
Date Visited: 27 March 1998
WTEC Attendees: R. Chellappa (report author), B. Brown-Davis, R. Larsen, J. Mendel, H. Morishita, R. Reddy
ATR, a private company, was established in 1986 with support from industry, academia and government under the Japan Key Technologies Center (KTC) initiative, with a mandate to serve as a major center of basic telecommunications R&D. ATR also fosters national and international research collaborations through an invited researcher program, workshops and seminars. The research activities of ATR are carried out in four research laboratories:
Transitioning of research into products is done under the auspices of one or more of the following laboratories:
Findings are widely disseminated through domestic and international conferences and journals. ATR International's capital is ¥22.03 billion, the result of investments from 140 companies. The total annual research budget is about ¥8 billion, divided among the basic research laboratories. Seventy percent of the research budget comes from the government, the rest from private companies.
The total number of employees is 295, of whom 235 are involved in research. The 235 researchers comprise 53 invited international researchers, 38 invited domestic researchers, 118 researchers transferred from other companies, and 26 staff (in-house) researchers. The distribution of researchers among the four basic research laboratories and the headquarters is as follows:
The hosts were from the Media Integration and Communications, Interpreting Telecommunications, and Human Information Processing Research Laboratories. The site report discusses research projects that were selected from ongoing research efforts.
The activities of the Human Information Processing Lab were presented by Dr. Shigeru Akamatsu. The focus is on multi-modal interactions between perception and production through kansei. Kansei is defined as computing that relates to, arises from, or is influenced by human characteristics such as sensibility, perception, affection or subjectivity. A national research project sponsored by the Ministry of Education during 1992-1995 gave impetus to efforts to give computers human-like responsiveness.
One of the media of kansei information is the face, as it is able to express subtle emotions. There are two aspects to what is perceptible from a facial expression. One deals with image engineering issues of accurate, robust face recognition and expression algorithms. The other deals with human-science aspects of mimicking human information processing. For example, to respond to queries such as "who looks similar here?" or "who is the most senior?" one should incorporate ideas from human information processing. The emphasis of this group is on combining psychophysics and image engineering. Annual symposia on face processing that emphasize this combination are held, with invited lectures from domestic and international researchers. Dr. Michael Lyons demonstrated a gender/expression recognition system using the Gabor wavelet-based graph matching technique developed by Dr. C. von der Malsburg and associates. Dr. Lyons is investigating the possibility of using discriminant analysis for this problem.
The research activities in the Interpreting Telecommunications Research Laboratories (ITL), supported by the Japan Key Technologies Center and major Japanese companies, started in March 1993. The main focus is on basic research on multilingual speech translation technologies. As part of this seven-year effort, researchers in ITL plan to develop key component technologies such as speech recognition, language translation and speech synthesis. ITL is a core member of the international Consortium for Speech Translation Advanced Research (C-STAR). As part of this involvement, researchers are taking part in a joint experiment in multilingual speech translation technologies. Languages under consideration are Japanese, Korean, English and German. The WTEC team saw a demonstration of multilingual chat translation in the domain of travel-related conversations. The prototype system demonstrated uses a combination of examples and rules in a unified framework; it handles two-way translation between Japanese and English, Korean or German, and outputs synthesized speech in these languages. The example-based approach is trained using a large number of spoken expressions, while the rule-based approach makes full use of linguistic rules. Multilingual translation is accomplished in an incremental fashion. Example-based translation is done using a semantic similarity measure between words. Rule-based translation relies on manipulation of syntactic rules, lexicons and dependency structure rules.
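The idea of example-based translation driven by a semantic similarity measure between words can be illustrated with a minimal sketch. It is not ATR's implementation: the toy thesaurus, the example base, the distance formula and the function names (word_distance, expression_distance, translate) are all assumptions made for illustration. The sketch assumes that words sharing a deeper thesaurus category are semantically closer, and that an input phrase is translated with the pattern of its nearest stored example.

```python
# Hypothetical toy thesaurus: each word maps to a path from the root
# category down to its most specific category. Not ATR data.
THESAURUS = {
    "kyoto": ("concrete", "place", "city"),
    "tokyo": ("concrete", "place", "city"),
    "hotel": ("concrete", "place", "building"),
    "conference": ("abstract", "event", "meeting"),
    "workshop": ("abstract", "event", "meeting"),
    "fee": ("abstract", "money", "charge"),
    "price": ("abstract", "money", "charge"),
}

# Hypothetical example base: a (modifier, head) noun pair and the
# target-language pattern that was used to translate it.
EXAMPLES = [
    (("kyoto", "conference"), "{head} in {mod}"),
    (("conference", "fee"), "{head} for the {mod}"),
    (("hotel", "price"), "{head} of the {mod}"),
]


def word_distance(a, b):
    """Distance in [0, 1]: 0 for identical words, 1 for unrelated words."""
    if a == b:
        return 0.0
    path_a, path_b = THESAURUS.get(a), THESAURUS.get(b)
    if path_a is None or path_b is None:
        return 1.0  # unknown words are treated as maximally distant
    shared = 0
    for x, y in zip(path_a, path_b):
        if x != y:
            break
        shared += 1
    # The +1 keeps distinct words in the same category above distance 0.
    return 1.0 - shared / (max(len(path_a), len(path_b)) + 1)


def expression_distance(query, example):
    """Average word distance between two equal-length word tuples."""
    return sum(word_distance(q, e) for q, e in zip(query, example)) / len(query)


def translate(mod, head):
    """Reuse the translation pattern of the nearest stored example."""
    query = (mod, head)
    _, pattern = min(EXAMPLES, key=lambda ex: expression_distance(query, ex[0]))
    return pattern.format(mod=mod, head=head)


if __name__ == "__main__":
    print(translate("tokyo", "workshop"))
    print(translate("workshop", "price"))
```

In this sketch, an input that appears in no stored example is still translated by analogy with its closest neighbor. The cost of the approach is a linear scan over the example base for every input.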
Natural-sounding speech synthesis using a concatenation-based scheme was demonstrated. More details on multilingual translation algorithms developed at ATR may be found in the following references: Sumita and Iida 1992, Furuse and Iida 1996, Wakita et al. 1997, Mima et al. 1997. Related work on parallel implementations using a massively parallel associative processor and a CM-2 Connection Machine is described in Sumita et al. 1993 and in Sumita et al. 1994. A comparison of implementations of example-based retrieval schemes using serial, MIMD and SIMD architectures is in Sumita et al. 1994. Using the relationship between architectures and response times, an appropriate architecture may be designed depending on the response time constraint. Related work using associative parallel processors may also be found in Sumita et al. 1995.
An agent interface that serves as a tour guide was demonstrated. This generic approach will be useful for exploring cyberspace. Other ongoing research efforts in this laboratory are communication by mental images, including kansei processing; virtual reality; art and technology, including interactive environments; understanding of emotions from music and people's voices; and human communications science. Due to time constraints, the site visit team could not see demonstrations of these projects. See the WTEC Panel Report on Human-Computer Interaction (http://itri.loyola.edu/hci/) for more details concerning ATR's HCI research. The forthcoming WTEC report on the Japanese Key Technologies Center program will also include further information on ATR (see http://itri.loyola.edu).
Furuse, O. and H. Iida. 1996. Incremental translation utilizing constituent boundary patterns. In Proc. 16th Intl. Conf. on Comp. Ling. Copenhagen, Denmark. Aug: 412-417.
Mima, H., O. Furuse, Y. Wakita, and H. Iida. 1997. Multi-lingual spoken dialog translation system using transfer-driven machine translation. In Proc. Machine Translation Summit VI. San Diego, CA. Nov: 148-155.
Sumita, E. and H. Iida. 1992. Example-based transfer of Japanese adnominal particles into English. IEICE Trans. Inf. and Syst. Vol. E75-D. July: 585-594.
Sumita, E., K. Oi, O. Furuse, and H. Iida. 1993. Example-based machine translation on massively parallel processors. In Proc. IJCAI-93. Chambery, France. Sept: 1283-1288.
Sumita, E., N. Nisiyama and H. Iida. 1994. The relationship between architectures and example-retrieval times. In Proc. 12th National Conf. on Artif. Intell. Seattle, WA. Sept.
Sumita, E., K. Oi, O. Furuse, H. Iida, and T. Higuchi. 1995. Example-based machine translation using associative processor. Jl. Natural Language Processing. Vol. 2: 27-48.
Wakita, Y., J. Kawai, and H. Iida. 1997. Correct parts extraction from speech recognition results using semantic distance calculation, and its application to speech translation. In Proc. Spoken Language Translation. Madrid, Spain. July.