In computer science, the Raita algorithm is a

string searching algorithm In computer science, string-searching algorithms, sometimes called string-matching algorithms, are an important class of string algorithms that try to find a place where one or several strings (also called patterns) are found within a larger strin ...

which improves the performance of Boyer–Moore–Horspool algorithm. This algorithm preprocesses the string being searched for the pattern, which is similar to

Boyer–Moore string-search algorithm In computer science, the Boyer–Moore string-search algorithm is an efficient string-searching algorithm that is the standard benchmark for practical string-search literature. It was developed by Robert S. Boyer and J Strother Moore in 1977. ...

. The searching pattern of particular sub-string in a given string is different from Boyer–Moore–Horspool algorithm. This algorithm was published by Timo Raita in 1991.RAITA T., 1992, Tuning the Boyer–Moore–Horspool string searching algorithm, Software - Practice & Experience, 22(10):879-88

/ref>

Description

Raita algorithm searches for a pattern "P" in a given text "T" by comparing each character of pattern in the given text. Searching will be done as follows. Window for a text "T" is defined as the length of "P". # First, last character of the pattern is compared with the rightmost character of the window. # If there is a match, first character of the pattern is compared with the leftmost character of the window. # If they match again, it compares the middle character of the pattern with middle character of the window. If everything in the pre-check is successful, then the original comparison starts from the second character to last but one. If there is a mismatch at any stage in the algorithm, it performs the bad character shift function which was computed in pre-processing phase. Bad character shift function is identical to the one proposed in Boyer–Moore–Horspool algorithm. A modern formulation of a similar pre-check is found in , a linear/quadratic string-matcher, in libc++ and libstdc++. Assuming a well-optimized version of , not skipping characters in the "original comparison" tends to be more efficient as the pattern is likely to be aligned.

C Code for Raita algorithm

#include #include #define ALPHABET_SIZE (1 << CHAR_BITS) /* typically 256 */ /* Preprocessing: the BMH bad-match table. */ static inline void preBmBc(char *pat, size_t lpat, ptrdiff_t bmBc[]) void RAITA(char *pat, size_t lpat, char *s, size_t n)

Example

Pattern: abddb Text: Pre- Processing stage: a b d 4 3 1 Attempt 1: ....b Shift by 4 (bmBc Comparison of last character of pattern to rightmost character in the window. It's a mismatch and shifted by 4 according to the value in pre-processing stage. Attempt 2: A.d.B Shift by 3 (bmBc Here last and first character of the pattern are matched but middle character is a mismatch. So the pattern is shifted according to the pre-processing stage. Attempt 3: ABDDB Shift by 3 (bmBc We found exact match here but the algorithm continues until it can't move further. Attempt 4: ....b Shift by 4 (bmBc At this stage, we need to shift by 4 and we can't move the pattern by 4. So, the algorithm terminates. Letters in capital letter are exact match of the pattern in the text.

Complexity

# Pre-processing stage takes O(m) time where "m" is the length of pattern "P". # Searching stage takes O(mn) time complexity where "n" is the length of text "T".

References

{{Reflist

External links