
Longest repeated substring problem


A suffix tree for the string ATCGATCGA$

In computer science, the longest repeated substring problem is the problem of finding the longest substring of a string that occurs at least twice.

This problem can be solved in linear time and space by building a suffix tree for the string (with a special end-of-string symbol such as '$' appended) and finding the deepest internal node in the tree with more than one child. Depth is measured by the number of characters traversed from the root. The string spelled by the edges from the root to such a node is a longest repeated substring.

The problem of finding the longest substring with at least k occurrences can be solved by first preprocessing the tree to count the number of leaf descendants of each internal node, and then finding the deepest node with at least k leaf descendants. To avoid overlapping repeats, check that the sorted list of suffix lengths for the leaves below the candidate node has no two consecutive elements that differ by less than the length of the candidate substring.
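As a minimal sketch of the counting idea, the Python below uses an uncompressed suffix trie rather than a true suffix tree, so it runs in quadratic time and space instead of linear. Each trie node keeps a count of the suffixes that pass through it, which plays the role of the leaf-descendant count and makes the '$' terminator unnecessary in this simplified form. The function name `longest_repeated_substring` and its parameter `k` are illustrative assumptions, not part of any library.

```python
def longest_repeated_substring(text: str, k: int = 2) -> str:
    """Return a longest substring of `text` that occurs at least `k` times.

    Overlapping occurrences are allowed. Quadratic-time sketch built on an
    uncompressed suffix trie (an assumption made for brevity; a real suffix
    tree would bring this down to linear time).
    """
    # A node is a dict holding its children and the number of suffixes
    # that pass through it (equal to the occurrences of its string).
    root = {"children": {}, "count": 0}
    best_depth, best_start = 0, 0

    for start in range(len(text)):
        node = root
        # Insert the suffix text[start:] one character at a time.
        for depth, ch in enumerate(text[start:], start=1):
            node = node["children"].setdefault(ch, {"children": {}, "count": 0})
            node["count"] += 1
            # Depth is the number of characters traversed from the root.
            if node["count"] >= k and depth > best_depth:
                best_depth, best_start = depth, start

    return text[best_start:best_start + best_depth]
```

For instance, `longest_repeated_substring("banana")` returns `"ana"`, and a string with no repeated character, such as `"abc"`, yields the empty string.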

In the figure with the string "ATCGATCGA$", the longest substring that repeats at least twice is "ATCGA".
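Note that the two occurrences of "ATCGA" in this string start at positions 0 and 4 and therefore share one character. To illustrate the non-overlapping variant, here is a brute-force sketch that does not use a suffix tree at all: it tries candidate lengths from longest to shortest and accepts a substring only when two of its occurrences start at least that many positions apart. The function name `longest_nonoverlapping_repeat` is hypothetical, and the approach is chosen for clarity, not efficiency.

```python
def longest_nonoverlapping_repeat(text: str) -> str:
    """Return a longest substring with two non-overlapping occurrences.

    Brute-force sketch (assumption: clarity over speed); not the
    suffix-tree method described in the article.
    """
    n = len(text)
    # Two disjoint occurrences of length L need 2 * L <= n.
    for length in range(n // 2, 0, -1):
        first_seen = {}  # substring -> leftmost start position
        for start in range(n - length + 1):
            piece = text[start:start + length]
            earliest = first_seen.setdefault(piece, start)
            # The occurrences at `earliest` and `start` are disjoint when
            # their start positions differ by at least `length`.
            if start - earliest >= length:
                return piece
    return ""
```

On the figure's string without the terminator, `longest_nonoverlapping_repeat("ATCGATCGA")` returns `"ATCG"`, one character shorter than the overlapping answer.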
