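ACM SIGMOD Japan Chapter Lecture (Tutorial)

"Clustering Web Documents"

Speaker: Prof. Mukesh Mohania (IBM India)
Date and time: Monday, November 26, 2001, 17:30 to 18:30
Venue: Institute of Industrial Science, The University of Tokyo, Komaba Campus, Building E, west side, 5th floor, Conference Rooms A and B (Ew-501, 502)
Organizer: ACM SIGMOD Japan Chapter
Chapter Chair: Masaru Kitsuregawa (喜連川 優)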

Abstract

Users are increasingly relying on search engines to obtain
useful information from the web. It is becoming more and more
difficult for users to find relevant information as a large number of
documents are returned as a result of a search. Hence, in order to
make the search more effective, it is necessary to categorize documents
into sets (i.e., clusters) based on some subject or similarity. A way to cluster
documents based on relative similarity between them will be explored
in this talk. The documents are scanned and important keywords or
document representatives are obtained from each document. Weights are
assigned to these keywords based on their location in the document,
their frequency, and various other factors. We will then discuss the
Row-Column Iterative Algorithm, which is applied to the set of N
documents to form clusters based on the relative similarity of
documents. We will also discuss some ongoing research projects.
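
As a rough illustration of the ideas in the abstract, the sketch below weights keywords by where they appear (title or body) and how often, then groups documents whose weighted keyword vectors are similar. It is not the Row-Column Iterative Algorithm, which the abstract does not specify; the grouping step is a simple greedy threshold scheme swapped in for illustration. The weights, the stopword list, the threshold, and all function names are assumptions made for this example.

```python
import math
import re
from collections import Counter

# Hypothetical location weights: a keyword in the title counts more than one in the body.
TITLE_WEIGHT = 3.0
BODY_WEIGHT = 1.0
# Tiny illustrative stopword list (an assumption, not from the talk).
STOPWORDS = {"a", "an", "the", "and", "for", "up"}

def tokenize(text):
    """Lowercase the text and keep alphabetic words that are not stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]

def keyword_weights(title, body):
    """Weight each keyword by its frequency, scaled by where it occurs."""
    weights = Counter()
    for word in tokenize(title):
        weights[word] += TITLE_WEIGHT
    for word in tokenize(body):
        weights[word] += BODY_WEIGHT
    return weights

def cosine_similarity(a, b):
    """Cosine similarity between two sparse keyword-weight vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def greedy_clusters(vectors, threshold=0.3):
    """Assign each document to the first cluster whose seed (first member)
    is at least `threshold` similar; otherwise start a new cluster."""
    clusters = []
    for i, vec in enumerate(vectors):
        for cluster in clusters:
            if cosine_similarity(vec, vectors[cluster[0]]) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters

# Made-up example documents as (title, body) pairs.
docs = [
    ("Database index", "an index speeds up database queries"),
    ("Database queries", "the optimizer picks an index for database queries"),
    ("Baking bread", "knead the dough and bake the bread"),
]
vectors = [keyword_weights(title, body) for title, body in docs]
clusters = greedy_clusters(vectors)
print(clusters)  # prints [[0, 1], [2]]
```

The two database documents share heavily weighted title keywords and end up together, while the unrelated document forms its own cluster. A greedy scheme like this depends on document order and on the chosen threshold, so it should be read only as a baseline for thinking about similarity-based grouping.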

Participation: Anyone is welcome to attend. Admission is free.