Net Resources on Data Compression and Information Theory
Data Compression is the art of squeezing
a large amount of information into a small space. Compression algorithms
are widely used in applications that depend on storing and transmitting
data efficiently, including the Internet, fax machines, cell phones, DVD
players, database applications, and satellite communications.
Information Theory provides the scientific
basis for data compression and for other technologies such as encryption,
error correction, pattern recognition, signal processing, and similarity
comparisons. Information theory has also been widely influential
in research on theoretical computer science, the physics of computation,
fiber optics, psychology, and even music and molecular biology.
Overviews and Tutorials
-
Information theory and first principles
-
Text compression
-
Mathematical advances in compression
-
JPEG, MPEG and other standards for multimedia compression
Seminal Papers and literature surveys
-
C. E. Shannon, "A
Mathematical Theory of Communication," Bell System Technical Journal, vol.
27, pp. 379-423 and 623-656, July and October, 1948. (the original
paper on information theory)
-
Kolmogorov Complexity or algorithmic complexity -- see G. Chaitin, "An Invitation to Algorithmic Information Theory," DMTCS'96 Proceedings, 1997.
-
Huffman, the original paper on Huffman coding
-
Witten, Neal, and Cleary, "Arithmetic Coding for Data Compression," Comm.
ACM, 30(6):520-541, June 1987 (long considered the standard for arithmetic
coding)
-
Moffat, Neal, and Witten, "Arithmetic
Coding Revisited," ACM Transactions on Information Systems, 16(3):256-294,
July 1998 (modern improvements to the speed of arithmetic coding)
-
Wavelets for compression
-
R. M. Gray and D. L. Neuhoff, "Quantization," invited paper, IEEE Transactions on Information Theory, October 1998. (a comprehensive overview of scalar and vector quantization)
-
Fractal compression
Source Code/Implementation Notes
Other Web Directories
-
Dr. Dobb's Journal data
compression page -- a snazzy, well-organized page that links some of
the best sites for programmers seeking to understand and implement compression
algorithms
-
The Data Compression Library
-- a good directory organized by topic; lets users search, rate
sites, and submit new sites
-
Links2go compression page -- many
excellent links, little organization
-
Dr. Ross's Compression Crypt
-- stylish, high-quality overview from a former compression researcher
-
Arimura's
compression bookmarks -- compression links clearly organized by subject
areas
-
Compression
Pointers -- alphabetical list of people in the field, plus companies
and research projects
-
A nice overview
of data compression, emphasizing Huffman coding, Markov modeling, and
images (in a really long file)
Companies and Organizations
Books
-
D. Slepian, editor, Key Papers in the Development of Information Theory. New York: IEEE Press, 1974.
-
Managing Gigabytes: Compressing and Indexing Documents and Images,
by Ian H. Witten, Alistair Moffat,
and Timothy C. Bell, Morgan Kaufmann,
San Francisco, 1999.
-
K. Sayood, Introduction to Data Compression, Second Edition
-
W. Weaver and C. E. Shannon, The Mathematical Theory of Communication,
Urbana, Illinois: University of Illinois Press, 1949, republished in paperback
1963.
Researchers
Some Frequently Asked Questions on Data Compression
-
Can all data sets be compressed? How far can a data set be compressed?
-
Can any compression algorithm achieve the maximum possible compression
rate for a text string?
-
How can a string of symbols be compressed with less than one bit per symbol?
-
Why isn't arithmetic coding used more often?
-
What is the best compression method to use for text, images, music, speech,
video, or 3D graphics?
-
Are these algorithms patented?
-
What are open research topics in compression?
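As a rough illustration of the first and third questions above, the following minimal sketch estimates the zero-order (memoryless) entropy of a byte string and compares it with what a general-purpose compressor achieves. The function names `entropy_bits_per_symbol` and `compressed_ratio` are hypothetical helpers introduced here, and zlib is used only as a convenient stand-in for "a compressor"; it is not suggested by the page itself. The zero-order entropy is a limit only for coders that treat symbols as independent, so real compressors that model context can do better on structured data.

```python
import math
import random
import zlib
from collections import Counter


def entropy_bits_per_symbol(data: bytes) -> float:
    """Empirical zero-order entropy of the byte frequencies, in bits per symbol."""
    n = len(data)
    counts = Counter(data)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def compressed_ratio(data: bytes) -> float:
    """Compressed size divided by original size, using zlib at maximum effort."""
    return len(zlib.compress(data, 9)) / len(data)


# A heavily skewed two-symbol source: 99 'a' bytes for every 'b'.
# Its entropy is well under one bit per symbol.
skewed = (b"a" * 99 + b"b") * 100

# Pseudo-random bytes (fixed seed, for repeatability): close to 8 bits per
# symbol, so a lossless compressor cannot be expected to shrink them.
rng = random.Random(0)
noise = bytes(rng.getrandbits(8) for _ in range(4096))

if __name__ == "__main__":
    for name, data in (("skewed", skewed), ("noise", noise)):
        print(name,
              "entropy (bits/symbol):", round(entropy_bits_per_symbol(data), 3),
              "zlib ratio:", round(compressed_ratio(data), 3))
```

The skewed source shows how an average of less than one bit per symbol is possible when one symbol dominates, while the random bytes show that not every data set can be compressed: a lossless coder that shortens some inputs must lengthen others.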