Corpus Based Method of Transforming Nominalized Phrases into Clauses for Text Mining Application (Special Issue on Text Processing for Information Access)
Abstract
Nominalization is a linguistic phenomenon in which events usually described by clauses are expressed as noun phrases. Extracting event structures is an important task in text mining applications. To achieve this, clauses are parsed and the argument structures of the main verbs are extracted from the parse results. This kind of preprocessing has been common in past research. To extract event structures from nominalized phrases as well, we need a technique for transforming nominalized phrases into clauses. In this paper, we propose a corpus-based method for this transformation. The proposed method first enumerates the possible predicate/argument structures for a nominalized phrase (a noun phrase) and then ranks them by the frequency of each argument in the corpus. An algorithm based on this method was evaluated on a corpus of 24,626 aviation safety reports in English and achieved 78% accuracy in transformation. The algorithm was also evaluated by applying it in a text mining application that extracts events and their cause-effect relations from the texts, where it improved the application's performance.
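The enumerate-then-rank idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the verb lookup table, the two-role inventory, the toy corpus triples, and the scoring rule (summing the corpus counts of each argument) are all assumptions made here for illustration, and every function name is hypothetical.

```python
from collections import Counter
from itertools import permutations

# Hypothetical lookup from a nominalized head noun to its base verb.
NOMINALIZATION_TO_VERB = {"inspection": "inspect", "failure": "fail"}

# Assumed role inventory; the paper's actual set of argument roles may differ.
ROLES = ("subject", "object")


def count_arguments(corpus_clauses):
    """Count each (verb, role, noun) triple seen in already-parsed clauses."""
    counts = Counter()
    for verb, role, noun in corpus_clauses:
        counts[(verb, role, noun)] += 1
    return counts


def enumerate_candidates(head, modifiers):
    """List every way of assigning the modifiers to distinct argument roles."""
    verb = NOMINALIZATION_TO_VERB[head]
    return [
        (verb, tuple(zip(roles, modifiers)))
        for roles in permutations(ROLES, len(modifiers))
    ]


def rank_candidates(candidates, counts):
    """Order candidates by the summed corpus frequency of their arguments."""
    def score(candidate):
        verb, assignment = candidate
        return sum(counts[(verb, role, noun)] for role, noun in assignment)
    return sorted(candidates, key=score, reverse=True)


# Toy corpus of (verb, role, noun) triples, invented for this example.
corpus = [
    ("inspect", "object", "engine"),
    ("inspect", "object", "engine"),
    ("inspect", "object", "engine"),
    ("inspect", "subject", "mechanic"),
    ("inspect", "subject", "engine"),
]
counts = count_arguments(corpus)

# "engine inspection": head noun "inspection", modifier "engine".
ranked = rank_candidates(enumerate_candidates("inspection", ["engine"]), counts)
best = ranked[0]
print(best)
```

In this toy setting, "engine" appears more often as the object of "inspect" than as its subject, so the reading corresponding to a clause like "(someone) inspects the engine" is ranked first.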
- Paper published by the Institute of Electronics, Information and Communication Engineers (IEICE)
- 2003-09-01
Authors
- Tokunaga Takenobu
  Department of Computer Science, Tokyo Institute of Technology
- Terada A
  Department of Computer Science, Tokyo Institute of Technology
- Terada Akira
  Department of Computer Science, Tokyo Institute of Technology
- Terada Akira
  Department of Chemistry, Faculty of Science, Osaka City University
Related papers
- A STUDY ON INTERPRETING SPATIAL CONSTRAINTS FOR AUTONOMOUS AGENTS (International Workshop on Advanced Image Technology 2005)
- Incorporating Probabilistic Parsing into an LR Parser : LR Table Engineering (4)
- Integration of Morphological and Syntactic Analysis based on LR Parsing Algorithm
- Word sequences for second language acquisition: technical report (Thought and Language)
- Corpus Based Method of Transforming Nominalized Phrases into Clauses for Text Mining Application (Special Issue on Text Processing for Information Access)
- RESONANCE-ENHANCED MULTIPHOTON ELECTRON DETACHMENT (REMPED) SPECTRUM OF C_5^-, C_8^-, C_9^-
- Comparative study of generating referring expressions in situated dialogues
- Synthesis of Naphthoquinone Derivatives. Paper XI. A Fusarubin Isomer.
- One-pot ortho hydroxylations of 2-(1-hydroxyalkyl)-naphthalenes and (1-hydroxyalkyl)benzenes.
- Synthesis of 5,8-dihydroxy-2-methoxy-1,4-naphthoquinone derivatives. A major naphthoquinone moiety of some of naphthoquinone antibiotics.
- Synthesis of shikalkin (±shikonin) and related compounds.
- Monosubstituted Tetramethoxynaphthalenes
- On synthesis of naphthoquinone derivatives. Paper IX. A synthetic route of 1,4,8-trimethoxy-2-naphthalenecarbaldehyde via Duff formylation of 4,8-dimethoxy-1-naphthol.
- Fries rearrangement of 1-methoxy-4-(4-methyl-2- and -3-pentenoyloxy)benzenes.