Site: Keio University
2-15-45 Mita, Minato-ku,
Date Visited: 25 March 1998
WTEC Attendees: J. M. Mendel (report author), R. Chellappa, B. Davis-Brown, L. Goldberg, R. Larsen, R. Reddy
This section was extracted from the typewritten remarks of Professor Takamiya and Mr. Iwai, who generously provided them to the panel at our request.
The Humanities Media Interface Project (HUMI) was launched at Keio University in Spring 1996, with the aim, among others, of digitizing major rare books and manuscripts--Western, Japanese and Chinese--in the Keio collection, including the Keio Gutenberg Bible. (Obtaining the Gutenberg Bible was remotely connected with Keio University's founding president Yukichi Fukuzawa, who saw the Gutenberg Bible on his visit to St. Petersburg as early as 1862.) The HUMI Project has been supported by the Education Ministry, the Information-Technology Promotion Agency (IPA), which is attached to the Ministry of International Trade and Industry, and Keio University.
The Keio University Library has a very large collection of rare books, including 8,000 Western rare books. Project participants seem to have a very progressive view of digitization of books, namely that, once digitized, the book can be reassembled any way a person wants. The Keio Gutenberg Bible has played a very important role in the HUMI Project. It was acquired not just for possession of an important article of Western cultural heritage, but because Keio University believes that modern research libraries should possess works significant enough to be digitized for the benefit of today's scholars and for the greater goal of preserving these treasures for posterity without further decay.
Prof. Takamiya summarized the reason for the HUMI Project. He pointed out that, "Digitization means more than just creating a passable facsimile on a computer screen. It is an opportunity for transcending the confines of the traditional format, with its bound pages. Once digitized, every component can be unbound and rebound in an infinite number of ways. The book becomes a new entity in 'cyberspace', perhaps more vivid than ever possible in the real world, where rare books are often inaccessible. In the worlds of virtual reality we can re-experience it in a personal way. This means that digitized rare books, including the Gutenberg Bible, will never become forgotten relics of past wisdom. They will come alive every time someone has access to them. This, then, is the raison d'etre of the HUMI Project."
According to Prof. Takamiya, "The HUMI Project aims to digitize manuscripts and rare books, process them, research them, and provide online access to multimedia representations. Data and results will be transmitted via high-speed networks and the Internet. The global academic community will thus be able to use this material for education and research. In terms of digitization, there are two roles for the HUMI Project: (a) to establish the foundation of digital technicalities from a viewpoint of research in humanities, and (b) to explore the possibility of producing what should be called digital bibliology by applying digital imaging techniques to history of the book, information management, and pedagogical presentation."
As an inter-faculty initiative organized by Keio University, the HUMI Project is envisioned as a first step in the establishment of a digital research library based on the rare book collection at Keio. Bibliographical analysis of the rare books and manuscripts has been conducted by members of the English Department, led by Prof. Takamiya; non-destructive testing has been performed at two research laboratories in the Faculty of Physics and Technology under the supervision of Profs. Ozawa and Inoue, respectively; and virtual reality applications have been developed under the guidance of Prof. Okude of the Faculty of Environmental Information. Technical aspects of the HUMI project have also been supported by various firms forming a consortium.
The HUMI Project began its activities by taking advantage of the Japanese government's request for participation in the electronic library pilot project. This governmental project has its origin in the fact that Japan was nominated as one of the main promoters of a global electronic library at an international summit conference. In 1995, the government established the Center for Information Infrastructure (CII) at Keio University's Shonan-Fujisawa Campus as a part of the activities conducted by the Ministry of International Trade and Industry's Information Technology Promotion Agency (IPA). The Electronic Library Pilot Project began by digitizing the resources of the National Diet Library, including 10 million pages from Japanese rare books and other materials. Many of these digital resources have already been opened up to the public through the Internet.
The Keio University HUMI Project began its partnership with CII in 1997, and provided digital images of the Keio University collection, which included both oriental and Western rare books.
In March 1997 project members successfully digitized a complete set of images of the Keio Gutenberg Bible (about 650 images). The group used a digital camera jointly developed by NTT (Nippon Telegraph & Telephone) and Olympus Optical Company. Since this camera is an experimental one-shot 3-CMD model, it took only a few seconds to acquire a full color high-resolution image (2,048 x 2,048 pixels). With this camera and a special book cradle developed by the HUMI Project, the team also successfully digitized the Cambridge University Library copy of the Gutenberg Bible (2 volumes, about 1,300 images) within four days in November 1998.
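The image figures above allow a rough storage estimate. The following is a minimal sketch, not a description of the project's actual storage format: the pixel dimensions and image counts come from the report, while the bit depth (8 bits per channel, three color channels, no compression) and all function names are assumptions made for illustration.

```python
# Rough storage estimate for uncompressed page images.
# Pixel dimensions and image counts come from the report; the bit depth
# (8 bits per channel, 3 color channels) is an assumption.

WIDTH_PX = 2048
HEIGHT_PX = 2048
CHANNELS = 3            # assumed: red, green, blue
BYTES_PER_CHANNEL = 1   # assumed: 8 bits per channel


def bytes_per_image(width=WIDTH_PX, height=HEIGHT_PX,
                    channels=CHANNELS, depth_bytes=BYTES_PER_CHANNEL):
    """Size in bytes of one uncompressed image."""
    return width * height * channels * depth_bytes


def collection_bytes(image_count):
    """Size in bytes of image_count uncompressed images."""
    return image_count * bytes_per_image()


if __name__ == "__main__":
    for label, count in [("Keio copy", 650), ("Cambridge copy", 1300)]:
        gib = collection_bytes(count) / 2**30
        print(f"{label}: {count} images, about {gib:.1f} GiB uncompressed")
```

Under these assumptions each image occupies 12 MiB, so the 650 images of the Keio copy come to roughly 7.6 GiB before compression. A higher bit depth or a compressed format would change the figure accordingly.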
Prof. Matsuda described the online catalog (OLC) of Western manuscripts and rare books in the Keio University Library. He emphasized that the catalog offers non-traditional access points (in addition to author and similar information) and that it is not static but can change constantly. Different experts can add to the index, based on their interpretation of an item, and the index is easily updateable. There are no current plans for collaboration within Japan on the OLC and digitization in the area of rare books and manuscripts. To date, the project has digitized 5,000 pages in about two years. High quality and resolution are emphasized. The project managers are considering repeated re-digitization as higher-quality digitization equipment becomes available.
The WTEC team members then toured three photo labs in the old library, in which project staff members are experimenting with different camera techniques ranging from high-speed digital cameras to slower, but higher-resolution line-scanning cameras. One laboratory contained a very high-speed digital camera, an NTT-Olympus prototype that takes 5 sec/page to get a large image onto a display. Curvature of the page is a problem. Bleed-through from the back of a page (which is actually present because the rare manuscript was originally written and illustrated on both sides) needs to be removed digitally, if the viewer so desires. This camera is used to copy an entire book very quickly. In a second laboratory, a Dicomed digital camera back with a Mamiya RZ67 camera is used to digitize Western illustrated books. The camera has a viewfinder and takes about 2 minutes per page. In the third laboratory there is a Kodak Professional PCD Scanner 4045 which is being used to scan 4 x 5 and 6 x 7 films.
The team members then went to the new library where they were given a demonstration of a virtual tour of the monastery of San Marco in Florence, Italy. This is in Prof. Okude's laboratory. Unfortunately, he was traveling; however, his student, Ms. Tomoko Ushiyama, gave the team a wonderful presentation. The tour was displayed on three flat screens using back projection; each screen has its own projector. No glasses were required. One thousand photographs were taken at the monastery. These were then used with a 3D modeling package to create the tour. Buildings and surroundings were all synthesized, whereas the artwork was all photographed. This required 200 MB of storage. On the virtual tour it is possible to zoom in on the many works of art. The tour is controlled using a joystick. This is a wonderful example of how digital information can be used for education and learning about artwork at a location that most people will not have the opportunity to visit.
A digital Gutenberg Bible was demonstrated. It was pointed out that today almost no one can read or touch this kind of rare book; but, in the virtual reality environment, researchers, even young students, can access the Bible directly. The human interface of turning the pages lets researchers learn intuitively. One can see the Bible as close up as possible, and pages can actually be turned. The Bible can be opened and closed, and we can look at its cover. Signatures of its past owners can be found, so we can get to know who kept this Bible in the past. It's possible to tear a page and see several pages at one time.
WTEC's hosts then described two technical problems for their project:
Finally, the WTEC team had a short question and answer period with the Keio University hosts. On the question of university/industry collaborations, the hosts said that they had invited computer companies to join a consortium, and 20-25 joined. Hitachi has been very helpful; NTT provided the digital camera (they want to be able to share the results of the HUMI research just for publicity purposes); and Hitachi provided digital imaging systems for removing stains and processing very high-resolution images. Keio University has excellent connections with companies; its graduates now occupy very high management positions in the 20-25 companies and are very supportive of the project's work.
On the question of making the virtual reality space available to others, it was stated that the space will be made available to researchers, and it is not going to be used just for demonstrations.
On the question of what lessons were learned and can be shared from their experiences, project team members stated that international collaboration would be very useful on such a project. In addition, the two weeks it took to scan the 600 pages of the Gutenberg Bible scales up, so that their experience in doing this can be used to help estimate costs of other projects.
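The scaling remark above can be read as a simple linear extrapolation. The sketch below assumes that "two weeks" means 14 calendar days and that scanning time grows in direct proportion to page count; both are assumptions for illustration, and the function name is hypothetical.

```python
# Linear extrapolation of scanning time from one reference project.
# Reference figures come from the hosts' remarks; treating "two weeks"
# as 14 calendar days and assuming strictly linear scaling are assumptions.

REFERENCE_PAGES = 600
REFERENCE_DAYS = 14


def estimated_days(pages, ref_pages=REFERENCE_PAGES, ref_days=REFERENCE_DAYS):
    """Estimate scanning days for a project of the given page count."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    return pages * ref_days / ref_pages


if __name__ == "__main__":
    for pages in (600, 1200, 5000):
        print(f"{pages} pages: about {estimated_days(pages):.0f} days")
```

Such an estimate only holds for comparable equipment and handling procedures; a faster camera or a different book cradle would call for a different reference rate.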
Answers to a large collection of questions that were sent ahead of the panel's visit are provided below. The questions were circulated among members of the HUMI Project and were then compiled and transmitted by Kenji Umeto, Secretary, HUMI Project, Keio University (firstname.lastname@example.org).
[Okude] = Naohito Okude, Professor, Faculty of Environmental Information, Keio University
[Hosono] = Kimio Hosono, Professor, School of Library Science, Keio University
[Shibukawa] = Masatoshi Shibukawa, Professor, Faculty of Environmental Information, Keio University
[Armour] = Andrew Armour, Associate Professor, Faculty of Letters, Keio University
[Iwai] = Shigeaki Iwai, Lecturer, Faculty of Letters, Keio University
1. Please describe your long term vision or scenario for:
Digital information technology offers the most extraordinary opportunities to teach and study the liberal arts in new ways. Digitization of the liberal arts drastically democratizes them. The people who developed computer literacy perceived it as a device of democratization from its inception. This democratization is the most powerful influence of digital technology on modern thinking.
In the academic environment, digitized and printed information should coexist. The roles that printed information such as academic journals has played cannot be completely taken over by digitized versions in the near future. Digital information is not necessarily reliable in terms of its quality, stability and durability.
The envisioned digital library could be considered a logistics system for supplying any necessary information to ordinary people, which would differ thoroughly from what we now call a 'library'.
The current library, though useful, is not able to provide all the information concerning people's everyday life (personal, domestic, professional, or social). This, however, is the goal of the digital library: it must enable people to "live" using the digital network, in which all the digitized, organized, and united information can be retrieved. It is not predictable when and how such a system will be realized; its dynamics would be a harbinger of a social change. The "library" has progressed for 5,000 years, and the realization of the digital one will still need some more years, though it will come true before the quincentenary of the Gutenberg revolution. This view is based on the statements of Fukuzawa Yukichi ("Knowledge develops courage," 1879), P. Butler ("Books are one social mechanism for preserving the racial memory and the library one social apparatus for transferring this to the consciousness of living individuals," 1933), and P. Barker (his scenario from "Polymedia libraries" through "Electronic libraries" to "Digital libraries," 1996).
The role of the nineteenth century library as the custodian of physical printed materials will remain, but the digital libraries will become distributed information managers of the links to other digital libraries. A grand distributed global digital library is the dream and the final goal of the digital libraries' endeavor.
Digital libraries could be defined in several ways, such as networked information resources, digitization of traditional libraries (i.e., integration of digital collection and the systems for utilizing it), computer systems emulating fundamental library functions, etc. If they are recognized as digitization of traditional ones, they may not become popular in the near future because of copyright issues, the difficulty of establishing inter-organizational management policies, the instability of methods and technologies to capture and represent digital contents, etc.
In addition to the frequently discussed copyright issues, we will have to face several kinds of managerial ones. One example is deciding which materials in a library's collection should be digitized (i.e., priority issues). As long as we limit the aims of digitization to research using a particular IT, or to feasibility studies of it, the issues may not be so tough. If we seek, however, to digitize works on an operational basis, the situation will change drastically. In this case, the following issues must be settled adequately, which is not at all easy: (1) Who is responsible for making decisions in terms of selection, processing, maintenance and management of materials that are to be digitized? (2) How can we carry out cooperative digitization activities with other institutions in order to avoid duplication and establish a network to share the products among them? (3) Where should a digital collection be preserved and archived as the last resort for academic research and studies?
In addition, many issues are left unsolved in terms of managing digital information provided by publishers, such as electronic journals. Issue (3) above also applies here.
2. How do national and international intellectual property laws and commercial regulations or practices affect development, deployment, and utilization of digital information?
Technologies including code and electronic watermark might be effective against the illegal use of intellectual property to a certain degree. We, however, are pessimistic over the development of technologies that can exterminate illegal use, especially when considering the social opinion that knowledge and information are common property of mankind on the one hand, and the existence of a genuine interest in deciphering itself on the other hand. We, for the present, endeavor to establish a proper standard for the license contract with social consent, and a system to monitor compliance with the contract with technological assistance.
It is problematic (both to social/cultural development and that of business) to insist on ownership of the creativity, regarding it as property, to be concerned only about its illegal use, and to seek the way to solve the problem only through technologies. It is far more important to foster a common opinion that a proper royalty ought to be paid for valuable information whether it is a property or not.
3. Please explain how your organization sees the relationship between digital library and electronic commerce.
a. What are the economic or business models that apply to digital library in Japan?
Education and learning is a lifelong pursuit. Within a few decades, people in Japan will come to the university in broken times and take more than four years to graduate; more years to study, and more study. This fragmented and discontinuous pattern is more the exception than the norm now, but students in the future will attend in broken times, often at more than one institution. People will want to study and learn more in the future. This knowledge consumer market is the digital libraries' business domain.
b. How have those models influenced the directions digital library technologies have taken and will take in the future?
Learning has always been a people-to-people process. The digital library technology will promote a computer-mediated people-to-people learning process. Technology will be required to expand the libraries' traditional areas, such as information retrieval and distance learning, to the new frontier of information work application to assist the distributed constructionism learning process, using the network system.
Keio University, as an academic institute, has no need to relate the digital library to electronic commerce. Yet present higher education, a public enterprise though it is, could vie with broadcasting and newspapers in their fields, if it is able to provide lifelong education.
In this sense, education in universities ought to take the economic and marketing model of the mass media as an example. Since the electronic trade which deals in intellectual property will more and more become dominant in such enterprises, the digital library, which is to support the future digital university, will probably provide the information service on the basis of the electronic trade itself.
4. Which sectors of the information technology economy (consumer goods, information services, hardware, business computing, educational technology, etc.) will be the main beneficiaries of future Japanese digital library technologies?
When the digital library realizes the prospects offered in the answer to item A.1, it will produce far-reaching benefits to every concerned area. As a university, however, we hope that the digital library will grow beneficial to academic research and education.
Educational and "research" technology.
5. How do you see internetworking, the convergence of communication and computation, and new distribution technologies for digital data as changing the nature of digital libraries?
If the prospects offered in the answer to item A.1 prove to be right, the development of digital information technology as well as the creation of information contents determines the function and structure of the digital library. But we should note that the rate of development of internetworking, the convergence of communication and computation, and new distribution technologies for digital data is closely connected with the demand of society and people for them.
When people and organizations all have computers and all these computers are interconnected, they will buy, sell and freely exchange information and information services. The digital libraries will become distributed information managers of the links to other digital libraries.
6. What are the main trends in content creation technologies?
a. How would you characterize the various market segments for content creation technologies (publishing, entertainment, consumer electronics, education, business, government)?
The real market for digital technology is not the "information market" but the "information work" market. The technologies for information work let a person or a computer program take in information, transform it, and send it out. Today's content creation technologies do not fulfill these conditions.
b. Do you see the need for specialized content creation and management technologies for the separate sectors?
No. What we need are interactive technologies for information work in general.
c. Which sectors do you think will drive the industry in 5 years? 10 years?
Education and learning will be the huge market when the distributed digital libraries and the information work technologies are available to the content creators.
Here we cannot enumerate all of the segments because of limited space, but it can be said that in Japan digital contents have recently been created in various fields including news, library, museum, and the medical industry. The Database Register (the Ministry of International Trade and Industry, annual) reports the details.
I myself strongly feel it necessary to construct the image information database as an enterprise of a public sector; for we now reach the point where we should reconsider the information of the past with the assistance of graphic images, and there must be a great amount of such graphic images. Nonetheless, graphic images, which have not been regarded as an information medium, are neither collected nor organized, and are therefore unavailable. Books indeed pass on to the next generation some of the past information, but not all. As texts with graphic images could probably convey information in full, preservation of the graphic images of the past would provide a new perspective for the present and the future.
In the future, the public and business sectors will cooperate in, or compete for, content creation, and so will industries and companies within the business sectors; but it is the marketability, that is, people's needs, that will determine the direction.
1. What are the public policy drivers of digital library in Japan?
a. How are ministries and agencies tasked and funded to implement these policies?
The National Center for Science Information System (NACSIS), the Information Technology Promotion Agency (IPA), and the National Diet Library (NDL) are taking initiatives to promote digital libraries.
NACSIS is distributing electronic journal articles directly to scholars and researchers, not via university libraries. These journals are limited to the ones published by learned societies. NACSIS's main aim is to provide academic information to end-users as effectively and efficiently as possible via the network.
IPA, which is an extra-governmental body of the Ministry of International Trade and Industry, has focused its emphasis on the technological aspect of digitization and has financially supported R&D projects carried out by computer/network companies.
NDL has tried to digitize its unique collection to make clear problems such as copyright, user-interface, efficiency of operation, etc. The project has been directed at operational systems.
b. How does the government stimulate or partner with industry in the definition, standardization or commercialization of new DL technologies?
IPA has a strong direct partnership with industry and tries to support technological development related to digital libraries since such technologies have wider influences on other fields. On the other hand, the role of the Ministry of Education is indirect regarding technologies, since it financially supports university libraries as a whole when they intend to construct digital libraries. So far, at least three national university libraries have embarked on digitization projects.
2. What are the current public sector priorities and programs (education, health, social services, the arts and culture, etc.) for digital libraries?
Since the core of digital libraries is the "contents" themselves that are to be digitized and/or utilized, the main governmental activities or programs should be to create and foster a good environment where large volumes of digital information can be easily created and disseminated. Thus it is vital to establish new copyright laws or revise existing ones and strengthen network infrastructure. Introducing new concepts, atmosphere, customs and institutions to encourage digitization activities is also required.
[Shibukawa] (Answer to B.1 & B.2)
The national policy of Japan for founding the digital library is complicated under the conflicting jurisdictions of the Ministry of International Trade and Industry, the Ministry of Education, Science and Culture (MESC), and the Ministry of Posts and Telecommunications. Only MITI has secured the source of revenue and carried out a plan for promoting the digital library.
Although it is the MESC that controls academic research and education in universities, it only directs universities to develop the "electronic library" function as an improvement of the university library services. It has, however, supported model digital libraries in a few national universities (e.g., Nara Institute of Science and Technology).
As a private university, Keio University participates in the Demonstrative Experiment in the Pilot Electronic Library, which the Association for Promoting the Information Enterprise (a division of the MITI) runs in partnership with the National Diet Library and commercial publishers. Financially supported by MITI, it aims at becoming an incubator of digital information technologies, library technologies in particular; but its technological level, based on the graphic image database, is no higher than that of the digital library of Nara Institute of Science and Technology (though the system was to be improved in 1998). In any case, the Japanese government has no clear policy on the digital library.
3. What is the expected role of digital library technology for public, school, research, technical libraries and museums in Japan?
The expected roles are many. Following are examples: (1) saving the space that printed collections have occupied; (2) strengthening a library collection that is short in volume and variety; (3) expanding the service areas that are physically limited to the inside of a library and to the registered users of the library (this implies that not only will the service be provided to remote users, but also to new customers that formerly were not allowed to receive it); (4) increasing the variety of information that can be searched; and (5) increasing the service menu (e.g., electronic reference service and online full-text document delivery).
According to Barker's scenario (see the answer to A.1), it is the "traditional" library that will develop from the "book library" through the present "polymedia library" to the "electronic library," and this will also be the case with the museum. University libraries and national museums will play the role of leaders. The digital information and database produced by each library or museum will be rapidly organized.
Such digital information, however, is not composed of new intellectual content, but a legacy of the past, so to speak. New contents have been provided by publishers, newspaper publishing companies, and broadcasting stations. "Electronic publishing" will therefore take the leadership in the development from the "electronic library" to the "digital library" and the "digital museum". In addition, educational and academic research institutes, especially universities that are proposing the "digital university" projects, will play an important role in creative activities.
4. In what ways do you see the traditional skills of librarians, archivists, curators, and information specialists as being utilized or changed by the presence of increasing amounts of digital information?
These skills are utilized fully for the cataloging, indexing and searching of digital information. Since digital information appears in different representation forms, easy and adequate identification of each item is crucial. Discussions about metadata imply the importance of know-how in traditional cataloging practice. Indexing and searching techniques, having been developed in the library and information science field, are also fundamental for managing digital information.
Following Butler's opinion on the raison d'etre of the library (see the answer to A.1), it can be said that the library, as a device to convey information, must change in accordance with the change of the form of information from "book" to "digital material." Librarians, archivists, curators, and information specialists must also adapt themselves to the change. They ought to develop and acquire the professional skills to provide people with necessary information about books (museum pieces or art objects), digital contents, and computers at their command. Of course, this does not negate the present skills, with which librarians have long administered intellectual contents, since the age of the Alexandrian library or earlier.
Hereafter, however, professional education needs a new curriculum that goes beyond the traditional framework of "book and book library." It is most important to acquire the skills to produce digital information and databases and to manage the "cyber collections" which will come into existence on the cyber network. Such skills are also necessary for the librarians in active service.
1. Please discuss your approach to digital collection development end-to-end including:
a. Capture
We have been trying the following methods for capturing digital images of rare books: (1) Kodak PhotoCD Imaging Workstation and analog films (4 x 5, 6 x 7, 35 mm); (2) crossfield drum scanner and analog film (4 x 5); (3) scanning camera (Dicomed Field Pro with Sinar 4 x 5 & Mamiya 6 x 7 on Windows NT); (4) one shot one CCD digital camera (Kodak DCS460); (5) three shots one CCD digital camera (Leaf/Scitex with Mamiya 6 x 7); and (6) one shot three CCD digital camera (NTT/Olympus SHD View-2 beta version).
In making a comparatively low resolution (approx. 2,048 x 2,048 pixels) but high-quality "digital facsimile" of rare books, we use "(6) one shot three CCD camera," whose advantage is a good balance between capturing speed and quality. At present, the master copy should be produced from big size analog film.
b. Catalog
Descriptive cataloging practice for printed books can be applied widely but should be expanded to include information about the technologies and/or methods used for capturing and representing digital information. Discussions about metadata are indispensable.
c. Index
Indexing of digital information is extremely difficult if we expect high retrieval performance, since targets to be indexed are too dispersed to specify and standardize. In particular, index terms that enable us to get access to digital information from its subjects or contents are difficult to determine. A possible way, although its performance is limited, may be to compile a special thesaurus consisting of controlled index terms and to use it as a guide.
d. Representation
I use virtual reality (VR) technology for representation. Bit-mapped, graphics-based supercomputers can run high-speed graphics that track human movement. Immersion, interactivity and information intensity are the three main characteristics of this technology. In the next 10 years, we can expect a widespread and growing experience of virtual reality in a variety of everyday educational and learning environments.
While much of the humanities research community still has ears only for information engineering professionals who speak of being digital, a growing number of humanities scholars are beginning to look at the complex tradeoffs and theoretical shortcomings of the vision of computer professionals. Some scholars are starting to fight digital technology with a Luddite passion.
My approach makes a premise of balance. Virtual reality representation of human artifacts mediates the merger of virtual reality with humanities scholarship. A holographic and multidisciplinary reality will be possible using three-dimensional imaging.
e. Search
Being digital in a research library requires designing a post-Gutenbergian research model of humanities. Contrary to a general assumption that hypermedia obliterates the past, digital technology is radically reconfiguring our understanding of history. Digital technology forces the recognition that texts are not higher than images. Computers rid us of the assumption that sensory messages are incompatible with reflection. Once digitized, fleeting images become available to anyone who "reads" them on a graphic computer. Imaging becomes a rich and fascinating mode for communicating ideas. Diverse phenomenological performances, whether drawings, gestures, sounds, or scents, will be rescued from the past by scripturalist professions.
To conduct image searches professionally in the humanities, serious training in visual proficiency is needed. Image search is an activity of focusing on transdisciplinary problems across multiple and linear disciplines in arts, graphics, film, video, or media production as well as their different histories.
Searching Japanese texts does not pose severe physical problems. However, since there are several ways to divide Japanese texts (sentences) into words, there are many alternatives for search terms. This means exact-match methods are not very effective. Sophisticated fuzzy or approximate matching mechanisms for Japanese texts should be developed.
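One common way to sidestep the word-division problem is to match on overlapping character n-grams rather than on words. The following is a minimal sketch of that idea, not a description of any system used at Keio; the function names, the bigram size, the threshold, and the sample titles are all assumptions chosen for illustration.

```python
def char_ngrams(text, n=2):
    """Return the set of overlapping character n-grams in text (whitespace removed)."""
    cleaned = "".join(text.split())
    if len(cleaned) < n:
        return {cleaned} if cleaned else set()
    return {cleaned[i:i + n] for i in range(len(cleaned) - n + 1)}


def containment(query, document, n=2):
    """Fraction of the query's n-grams that also occur in the document."""
    query_grams = char_ngrams(query, n)
    if not query_grams:
        return 0.0
    return len(query_grams & char_ngrams(document, n)) / len(query_grams)


def fuzzy_search(query, documents, threshold=0.5, n=2):
    """Return (score, document) pairs at or above threshold, best match first."""
    scored = [(containment(query, doc, n), doc) for doc in documents]
    hits = [pair for pair in scored if pair[0] >= threshold]
    return sorted(hits, key=lambda pair: pair[0], reverse=True)


# Illustrative titles only (assumed sample data).
titles = ["慶應義塾大学図書館", "東京大学", "グーテンベルク聖書"]
results = fuzzy_search("慶應大学", titles)
```

Because no word boundaries are needed, the abbreviated query 慶應大学 still retrieves 慶應義塾大学図書館, which an exact match on the query string would miss.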
There are "keywords" or "shapes" as access keys for image retrieval. In terms of keywords it is helpful to develop a list of keywords that not only describe objects, phenomena or events but also represent human feeling such as "passion," "peace," "violence," etc. On the other hand, searching by shapes needs pattern matching mechanisms that still have a way to go in their development.
f. Other technologies that are necessary for managing digital collections:
Technologies are needed that convert a particular digital collection into another irrespective of the lapse of time. Since technologies will change drastically as time goes by, digital information produced in past years must be easily converted to the newest version or accessed by the newest system.
Technologies for distributed libraries are desperately needed. Each library offers its collections in electronic form. To users, the collection of worldwide distributed libraries must look like one uniform library.
2. What are your current technologies and methodologies for preservation and archiving of digital information?
Texture mapping and 3D real-time computer graphics.
For high-resolution rare book images, we use disk array, DAT tape, CD-R, magneto-optical disk, and DVD-RAM.
3. How do you deal with the many and frequently changing representation formats for digital data? What formats do you currently use?
As regards formats, the problems we face are fundamentally no different from those faced by any enterprise today. We thus tend to choose the most commonly accepted formats for binary data: Microsoft Word, RTF, TIFF, JPEG, etc. It could be argued that, in fact, there are fewer problems posed today by "many and frequently changing representation formats" than 10 years ago; there is, for example, less interest in format-conversion utilities than there used to be. Our concerns are perhaps more about the preservation of information, whether it be accented European characters (not available in Shift-JIS) or fine image detail (JPEG lossy compression is only used for Web delivery). Provided that this information is stored in one of today's common formats, we assume that it will still be accessible after, say, 10 years, when conversion to a new format may be called for.
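The character-preservation concern can be checked mechanically before text is committed to a given encoding. The sketch below uses Python's standard `shift_jis` codec to report which characters would be lost; the helper names and sample strings are assumptions for illustration, and it is not a tool described by the respondents.

```python
def can_encode(text, encoding):
    """Return True if every character of text is representable in encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def unencodable_chars(text, encoding):
    """List the characters of text, in order, that encoding cannot represent."""
    return [ch for ch in text if not can_encode(ch, encoding)]


# Japanese text survives Shift-JIS; accented Latin letters do not.
japanese_ok = can_encode("日本語", "shift_jis")
lost = unencodable_chars("Dvořák café", "shift_jis")
unicode_ok = can_encode("Dvořák café", "utf-8")
```

Running such a check over a collection before storage shows which records need a Unicode-based encoding to avoid silent loss of accented characters.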
4. How can technical materials be made useful to both experts and to the average citizen?
a. What do we need to do to make digital information useful for other communities?
b. How can collections of historical records or of scientific images be arranged in order to promote use by scholars and school children?
An answer to these questions would perhaps require clear definitions of the terms "technical materials" and "made useful." However, the average citizen might be assumed to have an interest, say, in local history and be prepared to sit in front of a computer monitor for, say, 20 minutes in order to satisfy his/her curiosity. The interface should obviously be as intuitive and thus as invisible as possible. Fortunately, the growth of the Internet is rapidly leading to a familiarity with (if not a consensus on) such interfaces. The qualities one would look for in the presentation of such information are clarity, simplicity, visual appeal, etc., and therefore "technical materials" would be kept at a different level, which the user would be free to access by, say, clicking on a button labeled "Tell me more" (a common technique). There could be several layers, into which the user could "drill down." Of course, the scholar would require some kind of shortcut to jump to these more detailed layers.
Alternatively, the user could initially identify him/herself by logging on as "Student" or "Expert." Fortunately, these problems are also faced by businesses (not everyone in an enterprise is equally well informed about all topics), and we can reasonably hope for the appearance of new tools and approaches from both academic and commercial communities. Of especial interest to us is the rapid evolution of computer-based encyclopedias (such as the Encyclopedia Britannica, with its natural language search facility), some of which must similarly cater to different levels of expertise.
5. Are you doing retrospective digital collections of historical material or are you focusing primarily on creation of new materials? Are you converting bibliographic data as it relates to historical digital materials?
[No Answer Provided.]
1. What are the expected breakthrough technologies in the areas of automated cataloging, indexing, search, and analysis of digital multimedia content?
As regards textual materials, we can perhaps say that future progress in the area of automatic indexing will be incremental, as there are already many powerful tools available, with the ability to conduct "fuzzy" searches, proximity searches, etc. The search tools available on the Internet are continually being refined and made available for use locally or over intranets. What still leaves room for improvement is OCR, especially for difficult fonts or, eventually, hand-written materials such as collections of letters; a breakthrough technology in this area is sorely needed. A compromise for the interim is some form of pattern-matching, though here too the tools are as yet somewhat rudimentary.
Far into the future, we might hope for pattern-matching software of such sophistication (and involving considerable "expert knowledge") that it could be used for indexing and accessing image collections.
Enhancement of capabilities of networking, VR interface design, object-oriented database[s].
2. What are the emerging technologies for creating, administering, searching and providing access to virtual or federated collections?
For centuries, the world's libraries have used virtually the same technology for acquiring, storing, and organizing their collections. In contrast, the Web, surely key to any "virtual or federated collection," is evolving so fast that it is not uncommon for a technology to be superseded before it has had time to be adopted. While the new opportunities presented by such developments as Java, Dynamic HTML, FlashPix and Digimarc are widely welcomed, there is no denying the danger of investing significant time and resources in something that may be superseded in a few years or even sooner.
Internet technologies come and go, but among them Java, championed by Sun Microsystems, looks set to play a key role in defining the future of the Internet, even though its viability has come into question.
Clearer perhaps is the future of HTML, the lingua franca of the Web. This has long been recognized as being incapable of furnishing a foundation for the future development of the Web. At the same time, SGML has proved too difficult for most people to implement, leading to a compromise solution known as XML, or Extensible Markup Language. But this is not a compromise in the sense of "falling between two stools." XML has many advantages: it is based on existing international standards; is fully extensible and does not suffer from tag limitations; is internationalized (based on Unicode); offers simpler system administration of Web sites, and so on. These advantages, combined with the possibility that XML services may soon be made available at the operating-system level, make this a very attractive course for future development of any digital library/museum projects.
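Two of the advantages listed for XML, free choice of tag names and Unicode-based internationalization, can be seen in a very small record. The sketch below uses Python's standard `xml.etree.ElementTree` module; the element names, attributes, and shelfmark are hypothetical and do not reflect any schema used by the HUMI Project.

```python
import xml.etree.ElementTree as ET


def build_record(title_en, title_ja, shelfmark):
    """Build a small catalogue record using project-defined (hypothetical) tags."""
    record = ET.Element("rareBook", attrib={"shelfmark": shelfmark})
    ET.SubElement(record, "title", attrib={"lang": "en"}).text = title_en
    ET.SubElement(record, "title", attrib={"lang": "ja"}).text = title_ja
    return record


# Serialize to UTF-8 bytes, then parse the bytes back as a consumer would.
record = build_record("Gutenberg Bible", "グーテンベルク聖書", "HYPOTHETICAL-001")
xml_bytes = ET.tostring(record, encoding="utf-8")
parsed = ET.fromstring(xml_bytes)

# Collect the titles by language code after the round trip.
titles = {element.get("lang"): element.text for element in parsed.findall("title")}
```

The Japanese title survives the round trip unchanged, and the `rareBook` element needed no prior registration in a fixed tag set, which is the contrast with HTML drawn above.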
The Internet itself is suffering growing pains, which may be partially alleviated by the Active Node Transport System (ANTS) currently under development. This is an active network architecture that in effect will perform like a meta-protocol allowing for spontaneously generated protocols and will make the network as flexible as XML.
Java and CORBA. Object-oriented database[s].
3. What new technologies are being developed especially for the preservation and archiving of multimedia digital information? Is there movement toward common preservation technologies and methodologies that serve business, academia, government and consumer oriented instances of digital library?
Once information is digital, questions of whether or not it is "multimedia" or business/academia/government-oriented are irrelevant from the point of view of conservation. The only real issues are (1) is the physical medium sufficiently stable (cf. acid paper and early film stock), and (2) will it be possible to access ("read") the information in the future? Librarians are constantly aware of the problems posed by a particular medium (such as the 5.25" floppy disk) falling into disuse, so they take steps to transfer digital data to new media (such as DVD). This process is never-ending, as we will never find the perfect storage medium; there will always be room for improvement. The format of the data is less of a problem; in theory, any format can be converted at any time in the future, though there may be some cost involved. Format conversion can be put off (till funds are available, for instance); media conversion cannot be delayed.
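Since media conversion recurs indefinitely, each transfer is worth verifying bit for bit. The following is a minimal sketch of one way to do that with a checksum comparison; it is an illustration rather than a procedure reported by the respondents, and the function names, file name, and use of SHA-256 are assumptions.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def migrate(source, destination):
    """Copy a file to new storage and confirm the copy is bit-identical."""
    before = sha256_of(source)
    shutil.copyfile(source, destination)
    after = sha256_of(destination)
    if before != after:
        raise IOError(f"fixity check failed for {source}")
    return after


# Temporary directories stand in for the old and new media.
with tempfile.TemporaryDirectory() as old_medium, tempfile.TemporaryDirectory() as new_medium:
    original = Path(old_medium) / "page_001.tif"  # hypothetical file name
    original.write_bytes(b"stand-in for high-resolution image data")
    copy = Path(new_medium) / original.name
    digest = migrate(original, copy)
    copy_matches = copy.read_bytes() == original.read_bytes()
```

Storing the digest alongside the file lets the same comparison be repeated at the next migration, whatever the new medium turns out to be.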
Rather than preservation, we have to find technologies that will serve our needs as regards security (including intellectual property rights), distribution, and image scalability.
Preservation technology is now reaching maturity. What we have to care about are distribution technology and image scalability technology.
4. What emerging technologies and data formats are most likely to enhance interoperability of digital data at all levels:
Database schemata are the most important. We have to incorporate distributed object-oriented technology into global digital library projects. The next most important topic is language: Unicode has yet to be widely accepted, and already its shortcomings are being criticized. We can, however, hope for some "Super Unicode" to emerge at some time in the not too distant future. Printer control languages, compression algorithms, document and page description formats are all ephemeral and of little consequence when looking at the larger picture. They can and must be left to commercial interests and the competition of the open market.
Database schemata are the most important. We have to incorporate distributed object-oriented technology into the global digital library project.
5. What is the expected volume of data in digital libraries in the next 2, 5 and ten-year timeframes?
Each library's data volume should remain as small as possible.
a. What technologies are you developing or depending on to manage such volumes of multimedia data?
Object-oriented database technology is needed.
b. How critical is the need for such technologies?
Without this distributed object-oriented technology, the digital library has no future for the scholars and the people who use the libraries for their creative activities.
6. What relationships do you see between information appliances and digital library?
The digital library's data structure should be isolated from any specific hardware and device. Under this condition, information appliances are very useful; however, if these appliances require us to accept a specific data format, they should not be used for a research environment.
7. Do the requirements for digital libraries imply any specific requirements as to capacity, coverage, quality of service, or standardization of national and international communications infrastructure?
They imply an open architecture and deployment of distributed object-oriented technology. Libraries around the world should all communicate with each other and contribute to a consolidation of diverse human knowledge and experience.
8. What are the key technologies regarding multi-lingual representation, search, and cataloging of digital data?
Keep ISO 10646-1 UCS. Make clear the distinction between the codes for (1) everyday communication and (2) special purposes. Use the distributed network system to provide the group (2) code. Then, whenever people have to or want to use a special character set, they can obtain the code from the net and see the representation on the screen. Instead of creating a huge standard character library and carrying it within the computer, distribute the code, creating it and using it only when it is needed. This approach is the same as that of distributed digital libraries. Letters as well as knowledge belong to infinite databases, so it is impossible to create a single universal repository.
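The proposal amounts to a small local table for everyday characters plus on-demand retrieval, with caching, for special-purpose ones. The sketch below illustrates only that lookup pattern; the class, the in-memory stand-in for a networked registry, and the sample entries are all hypothetical and are not part of ISO 10646 or of any existing service.

```python
class CharacterResolver:
    """Resolve character codes locally when possible, otherwise fetch on demand."""

    def __init__(self, everyday, fetch_remote):
        self._everyday = dict(everyday)    # group (1): carried on every machine
        self._fetch_remote = fetch_remote  # group (2): obtained from the network
        self._cache = {}
        self.remote_requests = 0

    def resolve(self, code):
        """Return the description for code, fetching it at most once."""
        if code in self._everyday:
            return self._everyday[code]
        if code not in self._cache:
            self.remote_requests += 1
            self._cache[code] = self._fetch_remote(code)
        return self._cache[code]


# Hypothetical registry; a real system would query a networked service instead.
REMOTE_REGISTRY = {0xE000: "variant character form (illustrative entry)"}


def fetch_from_registry(code):
    """Stand-in for a network request to a distributed character registry."""
    return REMOTE_REGISTRY[code]


resolver = CharacterResolver({0x0041: "LATIN CAPITAL LETTER A"}, fetch_from_registry)
first = resolver.resolve(0xE000)
second = resolver.resolve(0xE000)  # served from the cache, no second request
```

The everyday table never touches the network, while a special character costs one request the first time it is needed and none afterwards.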
Machine translation systems and multi-lingual thesauri are promising technologies.
9. What innovative, multi-modal interfaces seem most appropriate for digital libraries?
Computer-human interface should be a central research agenda for digital libraries. Besides keyboards and mice, trackballs and joysticks move an object on a computer screen; and there are many other interface devices developed, e.g., gloves, helmets, glasses, bodysuits, and so on. These multi-modal interfaces, however, are not only immature but also unintelligent. Future interfaces will be intelligent and will mediate communication between man and distributed computer networks, and will be more responsive to researchers' wants and needs. By lowering the threshold for researchers to engage the data in the digital libraries, new multi-modal intelligent interfaces can span the continuum from passive reception of research data to active creation of research results.
10. Please share with us your views on the role of standards in the evolution of digital library capabilities. How, specifically, is your organization involved in utilizing or defining standards that specifically affect digital libraries?
The digital library should conform to various types of standards: some academic (continuing established practices), some administrative (defining good management), and some technical (ensuring that data can be exchanged and shared). Academic standards can be and are being applied successfully in the Information Age, although problems remain, particularly as a result of the transient nature of digital information. Administrative standards can be similarly based on accepted principles, with necessary adjustment. In this area, lessons can be learned from the world of commerce. It has been pointed out that in an information-based economy, selling a "product" (data), often with minimal distribution costs, actually results in the seller gaining more data (information regarding the buyer). The same applies to a digital library and its resources.
Technical standards are perhaps the most difficult to cope with: they are always changing and their adoption usually involves considerable cost. While some academic institutions, such as Keio University, are trying to contribute toward future standards, it is more realistic to think in terms of selecting, testing and perhaps finding new applications for standards that will be set by large corporate interests.
Standards should be considered from the technological and bibliographical points of view. In terms of the former, because of rapid advancement and resulting obsolescence of information technologies, it is doubtful how long a particular standard can continue to function. Thus, it seems better to develop very powerful and efficient conversion software or techniques in order to transfer digital information among different systems. At any rate, the life expectancy of technical standards seems to be shorter than the bibliographical ones.
Bibliographical standards, in which Dublin Core could be included, should definitely be established to assure easy and effective retrieval and use of particular digital information. They may include information about technologies used for digitization as well as descriptions of contents.
[Shibukawa] (Answers to D.1-10)
As Keio University has not yet decided on an electronic or a digital library, we cannot answer these questions in practical terms. Under the present situation, however, we have two problems in digitizing Japanese rare books, one of which lies in translation. While it is desirable to translate all lines in Japanese works into English, we have only bilingual bibliographical and explanatory notes. We need more translators and sufficient funding to produce bilingual full texts. Automatic translation systems, which are not available yet, will be of great use to us. The other problem concerns the difficulty of making reference to Japanese works. It is hard to segment and classify parts of speech automatically in Japanese texts. Therefore, compiling an index for reference demands a lot of work.
1. In what ways will digital libraries fundamentally change the ways in which children (K-12, college) are educated? What are the major obstacles to making this happen?
2. Will digital libraries increase the costs of education (K-12, college)? If so, who will pay for this?
[Shibukawa] (Answers to E.1-2)
Any systematization of intellectual information based on digitization will change the way such information is used at every level of education. Even if the cost is covered by tax or commercial profits, everyone has to bear it. Therefore, the cost should be shared only by users under mutual agreement. Moreover, it is important to infer how the market for digitized information will grow.