Data Compression by Context Sorting

Hidetoshi YOKOO; Masaharu TAKAHASHI


Summary:

This paper proposes a new lossless data compression method based on a context sorting algorithm. Every symbol in the data can be predicted by taking its immediately preceding characters, or context, into account. The context sorting algorithm sorts the set of all previous contexts to find the context most similar to the current one. It then predicts the next symbol by sorting the previous symbol-context pairs in order of context similarity. The codeword for the next symbol represents the rank of that symbol in the sorted sequence. The compression performance is evaluated both analytically and empirically. Although the proposed method operates character by character and uses no probability distribution to make a prediction, its compression performance is comparable to that of the best-known data compression utilities.
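
The rank-based prediction described in the summary can be illustrated with a small sketch. This is not the authors' implementation: the summary does not define the similarity measure, the tie-breaking rule, or how unseen symbols are handled, so the sketch assumes that similarity is the length of the common suffix of two contexts (capped by a hypothetical `limit` parameter), that ties go to the more recent context, and that a new symbol is sent as an escape pair. All function names are hypothetical, and the sorting is done naively at every step rather than with an efficient data structure.

```python
def common_suffix_len(data, i, j, limit=16):
    """Length of the common suffix of data[:i] and data[:j], capped at limit."""
    n = 0
    while n < limit and n < j and n < i and data[i - 1 - n] == data[j - 1 - n]:
        n += 1
    return n


def ranked_candidates(data, i):
    """Distinct symbols seen before position i, most similar context first."""
    # Each earlier position j is a (context, symbol) pair: data[:j] -> data[j].
    # Assumed ordering: longer common suffix first, then more recent first.
    order = sorted(range(i), key=lambda j: (-common_suffix_len(data, i, j), -j))
    seen, candidates = set(), []
    for j in order:
        if data[j] not in seen:
            seen.add(data[j])
            candidates.append(data[j])
    return candidates


def encode(text):
    """Replace each symbol by its rank in the similarity-sorted candidate list."""
    out = []
    for i, sym in enumerate(text):
        cands = ranked_candidates(text, i)
        if sym in cands:
            out.append(cands.index(sym))
        else:
            # Assumed escape mechanism: rank past the end plus the literal symbol.
            out.append((len(cands), sym))
    return out


def decode(codes):
    """Rebuild the text by repeating the same ranking on the decoded prefix."""
    data = []
    for code in codes:
        if isinstance(code, tuple):
            data.append(code[1])
        else:
            data.append(ranked_candidates(data, len(data))[code])
    return "".join(data)


if __name__ == "__main__":
    ranks = encode("abababab")
    print(ranks)
    print(decode(ranks))
```

On repetitive input the ranks quickly become 0, because the most similar earlier context is followed by the symbol that actually occurs next. Small ranks are what a subsequent code for the integers would exploit; the choice of that code is outside this sketch.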

Publication: IEICE TRANSACTIONS on Fundamentals Vol.E79-A No.5 pp.681-686

Publication Date: 1996/05/25


Type of Manuscript: PAPER

Category: Information Theory and Coding Theory


Hidetoshi YOKOO, Masaharu TAKAHASHI, "Data Compression by Context Sorting" in IEICE TRANSACTIONS on Fundamentals, vol. E79-A, no. 5, pp. 681-686, May 1996.
URL: https://globals.ieice.org/en_transactions/fundamentals/10.1587/e79-a_5_681/_p

@ARTICLE{e79-a_5_681,
author={Hidetoshi YOKOO and Masaharu TAKAHASHI},
journal={IEICE TRANSACTIONS on Fundamentals},
title={Data Compression by Context Sorting},
year={1996},
volume={E79-A},
number={5},
pages={681-686},
month={May},}

TY - JOUR
TI - Data Compression by Context Sorting
T2 - IEICE TRANSACTIONS on Fundamentals
SP - 681
EP - 686
AU - Hidetoshi YOKOO
AU - Masaharu TAKAHASHI
PY - 1996
JO - IEICE TRANSACTIONS on Fundamentals
VL - E79-A
IS - 5
JA - IEICE TRANSACTIONS on Fundamentals
Y1 - May 1996
ER -