ACM SIGMOD Japan Chapter Lecture (Tutorial)

"Search Engines for Indian Languages"

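Speaker: Dr. TV Prabhakar (Department of Computer Science and Engineering, Indian Institute of Technology, Kanpur, India)
Date and time: Friday, September 22, 2000, 13:30-14:30
Venue: Institute of Industrial Science, The University of Tokyo
Organizer: ACM SIGMOD Japan Chapter
Chapter Chair: Masaru Kitsuregawa (喜連川 優)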

Abstract

There is a great need for search engines for web documents
written in languages other than English. In this talk, we describe the
design issues of a Search Engine for Indian Languages. After introducing
Indian-language technologies for the web, we describe the
implementation of two Search Engines for Indian Languages, one for
documents in ISCII and the other for documents in Unicode. The software
allows full-text indexing and searching of a database of documents
written in any Brahmi-based Indian language. The search engines gather
HTML documents from the web, index and compress them, and then search
them for the given keywords. The main features of the search engines
are phonetic tolerance, morphological analysis, compression and
indexing, leading and trailing substring matches for keywords, and
search through compressed documents. Performance results show that the search
engine achieves a compression of almost 80 percent and has an
appreciable precision and recall.

Participation: Open to everyone. Admission is free.