Fast String Searching in a Character Lattice (Special Issue on Document Analysis and Recognition)
スポンサーリンク
概要
- 論文の詳細を見る
This paper presents an algorithm for string searching in a character lattice. A character lattice, which is obtained through a character recognition process, is a general and flexible data structure that represents many hypothesized strings in a document image. In this paper, the authors propose a simple and efficient algorithm; it consists of a single loop of some set-operations and scans the character lattice only once. The authors also describe two actual implementations of the algorithm; one uses Bit-Arrays and the other a Trie. Owing to its bit parallelism, the Bit-Array approach is able to search for a single pattern faster than the Trie approach, and is easily extended to complex matchings such as an approximate one. It is suited for document retrieval systems that need to search for a keyword as fast as possible. A hashed compact version of the character lattice is also useful to increase the speed of the search for a single pattern. In contrast, the Trie approach is able to search for a large number of patterns simultaneously, and is suited for document understanding systems that need to extract words from the character lattice. The experimental results have shown that both approaches achieve high performance.
- 社団法人電子情報通信学会の論文
- 1994-07-25
著者
-
Ikeda Katsuo
Faculty Of Engineering Kyoto University
-
Minoh Michihiko
Faculty of Engineering, Kyoto University
-
Minoh Michihiko
Faculty Of Engineering Kyoto University
-
Senda Shuji
Faculty of Engineering, Kyoto University
-
Senda Shuji
Faculty Of Engineering Kyoto University
関連論文
- Propagation Characteristics of Boolean Functions and Their Balancedness
- Relationships among Nonlinearity Criteria of Boolean Functions
- Complexity of Boolean Functions Satisfying the Propagation Criterion
- Extraction of inclined Character Strings from Unformed Document Images Using the Confidence Value of a Character Recognizer
- Fast String Searching in a Character Lattice (Special Issue on Document Analysis and Recognition)