IEICE globals.ieice.org Site

Author Search Result

[Author] Shirou MARUYAMA(3hit)

1-3hit

Scalable Detection of Frequent Substrings by Grammar-Based Compression
Masaya NAKAHARA Shirou MARUYAMA Tetsuji KUBOYAMA Hiroshi SAKAMOTO

PAPER

Vol:
E96-D No:3
Page(s):
457-464
A scalable pattern discovery by compression is proposed. A string is representable by a context-free grammar deriving the string deterministically. In this framework of grammar-based compression, the aim of the algorithm is to output as small a grammar as possible. Beyond that, the optimization problem is approximately solvable. In such approximation algorithms, the compressor based on edit-sensitive parsing (ESP) is especially suitable for detecting maximal common substrings as well as long frequent substrings. Based on ESP, we design a linear time algorithm to find all frequent patterns in a string approximately and prove several lower bounds to guarantee the length of extracted patterns. We also examine the performance of our algorithm by experiments in biological sequences and other compressible real world texts. Compared to other practical algorithms, our algorithm is faster and more scalable with large and repetitive strings.
Context-Sensitive Grammar Transform: Compression and Pattern Matching
Shirou MARUYAMA Youhei TANAKA Hiroshi SAKAMOTO Masayuki TAKEDA

PAPER

Vol:
E93-D No:2
Page(s):
219-226
A framework of context-sensitive grammar transform for speeding-up compressed pattern matching (CPM) is proposed. A greedy compression algorithm with the transform model is presented as well as a Knuth-Morris-Pratt (KMP)-type compressed pattern matching algorithm. The compression ratio is a match for gzip and Re-Pair, and the search speed of our CPM algorithm is almost twice faster than the KMP-type CPM algorithm on Byte-Pair-Encoding by Shibata et al., and in the case of short patterns, faster than the Boyer-Moore-Horspool algorithm with the stopper encoding by Rautio et al., which is regarded as one of the best combinations that allows a practically fast search.
A Space-Saving Approximation Algorithm for Grammar-Based Compression
Hiroshi SAKAMOTO Shirou MARUYAMA Takuya KIDA Shinichi SHIMOZONO

PAPER

Vol:
E92-D No:2
Page(s):
158-165
A space-efficient approximation algorithm for the grammar-based compression problem, which requests for a given string to find a smallest context-free grammar deriving the string, is presented. For the input length n and an optimum CFG size g, the algorithm consumes only O(g log g) space and O(n log*n) time to achieve O((log*n)log n) approximation ratio to the optimum compression, where log*n is the maximum number of logarithms satisfying log log log n > 1. This ratio is thus regarded to almost O(log n), which is the currently best approximation ratio. While g depends on the string, it is known that g=Ω(log n) and for strings from k-letter alphabet [12].

Author Search Result

[Author] Shirou MARUYAMA(3hit)

Scalable Detection of Frequent Substrings by Grammar-Based Compression

Context-Sensitive Grammar Transform: Compression and Pattern Matching

A Space-Saving Approximation Algorithm for Grammar-Based Compression

Latest Issue

FlyerIEICE has prepared a flyer regarding multilingual services. Please use the one in your native language.

Links

Call for Papers

Submit to IEICE Trans.

Transactions NEWS

Popular articles