This paper proposes a new lossless data compression method based on a context sorting algorithm. Every symbol in the data can be predicted by taking its immediately preceding characters, or context, into account. The context sorting algorithm sorts the set of all previous contexts to find the one most similar to the current context. It then predicts the next symbol by sorting the previous symbol-context pairs in order of context similarity. The codeword for the next symbol represents the rank of that symbol in the sorted sequence. The compression performance is evaluated both analytically and empirically. Although the proposed method operates character by character and uses no probability distribution to make a prediction, its compression performance is comparable to that of the best-known data compression utilities.
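The rank-based prediction described in the abstract can be illustrated with a small Python sketch. This is not the authors' algorithm: the abstract does not define the similarity measure, the tie-breaking rule, or how unseen symbols are handled, so the sketch assumes (hypothetically) that similarity is the length of the common suffix of two contexts, that ties go to the more recent occurrence, and that a first-time symbol is sent as an escape rank plus a literal. It also recomputes the sort at every position, which is far slower than any practical implementation would be.

```python
def common_suffix_len(text, i, j, limit=16):
    """Length of the common suffix of text[:i] and text[:j], capped at limit.

    Assumption: suffix match length stands in for "context similarity".
    """
    n = 0
    while n < limit and n < i and n < j and text[i - 1 - n] == text[j - 1 - n]:
        n += 1
    return n


def ranked_candidates(history):
    """Distinct past symbols, most similar preceding context first."""
    i = len(history)
    # Position j holds symbol history[j], whose context is history[:j].
    # Sort by similarity to the current context history[:i]; on ties,
    # prefer the more recent position (an assumed tie-breaking rule).
    order = sorted(range(i), key=lambda j: (-common_suffix_len(history, i, j), -j))
    seen, ranked = set(), []
    for j in order:
        if history[j] not in seen:
            seen.add(history[j])
            ranked.append(history[j])
    return ranked


def encode(text):
    """Turn text into (rank, literal) pairs; literal is None unless escaped."""
    codes = []
    for i, sym in enumerate(text):
        ranked = ranked_candidates(text[:i])
        if sym in ranked:
            codes.append((ranked.index(sym), None))
        else:
            # Escape (assumed convention): rank equals the candidate count,
            # followed by the literal symbol.
            codes.append((len(ranked), sym))
    return codes


def decode(codes):
    """Rebuild the text by repeating the encoder's ranking at each step."""
    history = ""
    for rank, literal in codes:
        ranked = ranked_candidates(history)
        history += ranked[rank] if literal is None else literal
    return history
```

On repetitive input most ranks come out small, so a real coder would presumably map the ranks to variable-length codewords that favor small values. The decoder works because it can rebuild the same ranking from the symbols it has already recovered.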
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Hidetoshi YOKOO, Masaharu TAKAHASHI, "Data Compression by Context Sorting," in IEICE TRANSACTIONS on Fundamentals, vol. E79-A, no. 5, pp. 681-686, May 1996.
URL: https://globals.ieice.org/en_transactions/fundamentals/10.1587/e79-a_5_681/_p
@ARTICLE{e79-a_5_681,
author={Hidetoshi YOKOO and Masaharu TAKAHASHI},
journal={IEICE TRANSACTIONS on Fundamentals},
title={Data Compression by Context Sorting},
year={1996},
volume={E79-A},
number={5},
pages={681-686},
abstract={This paper proposes a new lossless data compression method, which utilizes a context sorting algorithm. Every symbol in the data can be predicted by taking its immediately preceding characters, or context, into account. The context sorting algorithm sorts a set of all the previous contexts to find the most similar context to the current one. It then predicts the next symbol by sorting previous symbol-context pairs in an order of context similarity. The codeword for the next symbol represents the rank of the symbol in this sorted sequence. The compression performance is evaluated both analytically and empirically. Although the proposed method operates character by character, with no probability distribution used to make a prediction, it has comparable compression performance to the best known data compression utilities.},
keywords={},
doi={},
ISSN={},
month={May},}
TY - JOUR
TI - Data Compression by Context Sorting
T2 - IEICE TRANSACTIONS on Fundamentals
SP - 681
EP - 686
AU - Hidetoshi YOKOO
AU - Masaharu TAKAHASHI
PY - 1996
DO -
JO - IEICE TRANSACTIONS on Fundamentals
SN -
VL - E79-A
IS - 5
JA - IEICE TRANSACTIONS on Fundamentals
Y1 - May 1996
AB - This paper proposes a new lossless data compression method, which utilizes a context sorting algorithm. Every symbol in the data can be predicted by taking its immediately preceding characters, or context, into account. The context sorting algorithm sorts a set of all the previous contexts to find the most similar context to the current one. It then predicts the next symbol by sorting previous symbol-context pairs in an order of context similarity. The codeword for the next symbol represents the rank of the symbol in this sorted sequence. The compression performance is evaluated both analytically and empirically. Although the proposed method operates character by character, with no probability distribution used to make a prediction, it has comparable compression performance to the best known data compression utilities.
ER -