Speeding up String Searching Algorithms for Nonuniform Texts
スポンサーリンク
概要
- 論文の詳細を見る
The string searching problem is to find all occurrences of a pattern in a text string. All string searching algorithms can be generally described as a cycle of two-step processing: the checking step, which determines whether the pattern matches with a substring of the text, and the skipping step, which determines the location of next substring possibly matching with the pattern. The fewer the number of character comparisons in the checking step and the farther the jump in the skipping step, the faster the algorithm is. Most of previous work focused on improvement of jump distance in the skipping step. In this paper, we focus on the other important aspect: reducing the number of character comparisons in the checking step based on the feature that the distribution of characters is always nonuniform in natural languages. According to our observation that the character which least occurs in the text is the character which is the most difficult to be matched with in the pattern, we propose a simple method, called cycle method, to reduce the number of character comparisons in the checking step without using any prior knowledge. This method can be used to improve most of the existing algorithms as a general speeding up technique. The empirical experiment shows that performance of improved algorithms is better than that of original algorithms on both the number of character comparisons and the running time.
- 一般社団法人情報処理学会の論文
- 1997-07-15
著者
-
ISHII Naohiro
Department of Intelligence and Computer Science, Nagoya Institute of Technology
-
Ishii N
Nagoya Inst. Technol.
-
Ishii Naohiro
The Author Is With The Department Of Intelligence And Computer Science Nagoya Institute Of Technolog
-
Ishii Naohiro
Department Of Intelligence And Computer Science Nagoya Institute Of Technology
-
Ishii Naohiro
The Authors Are With The Department Of Intelligence And Computer Science Nagoya Institute Of Technol
-
LIU Zhibin
Faculty of Biological Science Shaanxi Normal University
-
Liu Z
Faculty Of Biological Science Shaanxi Normal University
-
Liu Zhibin
Department Of Intelligence And Computer Science Nagoya Institute Of Technology
-
DU XIAOYONG
Department of Intelligence and Computer Science, Nagoya Institute of Technology
-
Du Xiaoyong
Department Of Intelligence And Computer Science Nagoya Institute Of Technology
-
Ishii Naohiro
Department Of Electrical Engineering And Computer Science Nagoya Institute Of Technology
関連論文
- The Completeness of Order-Sorted Term Rewriting Systems Is Preserved by Currying
- An Energy-Efficient Initialization Protocol for Wireless Sensor Networks with No Collision Detection
- Termination of Order-Sorted Rewriting with Non-minimal Signatures
- Enhanced Look-Ahead Scheduling Technique to Overlap Communication with Computation
- Accuracy of the Minimum Time Estimate for Programs on Heterogeneous Machines
- A Lookahead Heuristic for Heterogeneous Multiprocessor Scheduling with Communication Costs (Special Issue on Parallel and Distributed Supercomputing)
- An Improved Increase over the Minimum Execution Time of a Parallel Program
- Cuticular hydrocarbons in workers of the slave-making ant Polyergus samurai and its slave, Formica japonica(Hymenoptera:Formicidae)
- Morphometric Comparisons of Three Subspecies of Locusta migratoria Linnaeus (Orthoptera : Acrididae; Acridinae, Oedipodini) in China
- Doubly-Logarithmic Energy-Efficient Initialization Protocols for Single-Hop Radio Networks(Special Section on Discrete Mathematics and Its Applications)
- An Efficient Method for Computing All Reducts
- A Hybrid Method of Feature Subset Selection
- Improving Performance of the k-Nearest Neighbor Classifier by Combining Feature Selection with Feature Weighting
- Speeding up String Searching Algorithms for Nonuniform Texts
- On Relationships between Decomposable Programs and Rule Commutative Programs
- Efficient Evaluation of One-directional Cycle-recursive Formulas
- A Performance Measure for the Scheduling of Typed Task Systems with Communication Costs
- Decomposable Programs Revised
- Determining Feature Weight of Pattern Classification by Using Rough Genetic Algorithm and Fuzzy Similarity Measure
- Fast On-line String Searching
- Reliability of a Mobile Communication System with Network Congestion
- Optimizing Linear Recursive Formulas by Detaching Isolated Variables