This section summarizes the overall impression of the text and IR research being done in Japan and then compares it to the work being done in the United States.
The first observation is that the Japanese community of computer and information scientists working in the IR and text-related areas is smaller than the comparable communities in the United States and Europe. As a result, Japanese research in these areas tends to follow directions and initiatives begun in the United States. Individual projects are of good quality and are producing interesting technology, but progress has been somewhat impeded by the lack of a Japanese version of TREC or equivalent test collections. Although the value of recall/precision measurements is hotly debated in the IR community, there is no doubt that the culture of experiment and comparison in IR and TREC has led to significant improvements in both the understanding and performance of text access techniques. Efforts to develop test collections for Japanese have recently resulted in a Call for Participation for IREX (Japanese Information Retrieval and Extraction Exercise, http://cs.nyu.edu/cs/projects/proteus/irex). IREX is organized by a committee of people from Japanese companies and universities, and is modeled on the TIPSTER and TREC programs. In addition, because TREC has made Chinese collections available, there have been a large number of recent papers on Chinese text retrieval.
Text-related research in Japan covers essentially the same areas as the United States, although there continues to be a strong emphasis on indexing techniques and speed. The differences that arose from the language-dependent aspects of Japanese text are rapidly disappearing.
Japanese companies appear to be focusing on developing the best commercial Asian language search systems for applications in Japanese, Chinese and Korean. There is, however, considerable competition even in this area: substantial research and development on Chinese IR is underway in China, Singapore, Taiwan and Hong Kong, and Korea has a substantially longer history of IR research than Japan. One general criticism is that there seems to be too much reinvention of basic IR technology in Japan. Nearly every group visited was developing its own search engine (or engines). Licensing of U.S. search engines with Japanese capability, such as Verity or Infoseek, is limited but may increase as it is demonstrated that search technology is essentially language-independent.
Current Japanese research and text search techniques do not offer significant benefits for English applications. The research is complementary to that being done in the United States, and the results tend to be incremental in nature. As the community of researchers in this area increases, however, we may expect to see more innovation and exploration of new ideas.
A number of groups in Japan are studying information visualization, architectures for scalable IR systems, and the application of natural language processing (NLP) techniques to IR. These are areas that could have a significant impact on the development of text-based systems. For example, the use of NLP techniques for IR has been studied in the United States for some time because of the obvious potential benefits of a system that "understands" the query better than a word-based system. Despite those potential benefits, research using quantitative evaluation based on test collections such as TREC has never demonstrated any retrieval effectiveness improvements from NLP. On the other hand, there is some evidence that language-based techniques may work better in Japanese than in English (Fujii 1997), and this may lead to a better understanding of text retrieval in general. Information visualization is another area where the opportunity exists for substantial innovation and synergy between Japanese research groups. An example of a visualization interface being developed and deployed by IBM Japan is shown in Figure 5.1.
In conclusion, the WTEC panelists' view was that text-related research in Japan has been lagging behind that of the United States and Europe, but that substantial recent investments by companies and universities in this area mean that this gap is rapidly narrowing. One should expect to see substantially more new techniques and research directions originating in Japan in the near future.

Fig. 5.1. IBM information outlining: search, extraction, categorization, and abstraction.