We developed a very low bit-rate video compression algorithm that combines hierarchical motion compensation, based on multiscale image segmentation, with residual coding. The proposed algorithm outperforms the H.261-like coder by 3 dB and H.263 version 2 by 1 dB. These gains come from the use of image segmentation and reversed motion prediction. The proposed region-based reversed motion compensation strategy regulates the size and number of regions used by pruning a multiscale segmentation of the video frames. Since the regions used for motion compensation are obtained by segmenting the previously decoded frame, the shape of the regions need not be transmitted to the decoder. Furthermore, the hierarchical motion compensation strategy involves two stages: an initial, coarse, region-level motion field is refined into a dense motion field that provides pixel-level motion vectors. The refinement procedure does not require any additional information to be transmitted. We also developed a residual coding technique for coding the displaced frame difference after segmentation-based motion compensation. It exploits the fact that the energy of the residual resulting from motion compensation is concentrated in a priori predictable positions. This residual coding technique can also be extended to improve the performance of coders that use a block-based motion compensation strategy.
Results
We compare our coder with a generic block-based coder of the kind used in
the H.261 and H.263 standards. All performance comparisons are made on the
luminance (Y) component of the video frames. To keep the comparison
objective, both coders use the same quantization strategy for the DCT
coefficients and the same Huffman codes for motion vectors and DCT
coefficients. The bit budget per coded frame was held (approximately) fixed
at 1280 bits for both coders. This corresponds to a bit-rate of 9.6 kbps if
every fourth frame is coded and 38.4 kbps if all the frames are coded.
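The two bit-rates quoted above follow from the per-frame budget by simple arithmetic. The sketch below reproduces them; the source frame rate of 30 frames per second is an assumption (it is not stated in the text, but it is the value that yields both quoted figures), and the function name is purely illustrative.

```python
def channel_bitrate_kbps(bits_per_frame, source_fps, frame_skip):
    """Bit-rate in kbps when one frame out of every `frame_skip` is coded."""
    coded_fps = source_fps / frame_skip
    return bits_per_frame * coded_fps / 1000.0


BITS_PER_FRAME = 1280  # per-frame budget quoted in the text
SOURCE_FPS = 30        # assumption: frame rate is not stated in the text

every_fourth = channel_bitrate_kbps(BITS_PER_FRAME, SOURCE_FPS, 4)
all_frames = channel_bitrate_kbps(BITS_PER_FRAME, SOURCE_FPS, 1)
print(every_fourth, all_frames)
```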
We also present results comparing our residual coding scheme with the usual block DCT based coding scheme. Our scheme transmits an overhead of 1 bit per coded block. Such a coder always performs better than the baseline block DCT scheme. The following figure shows the improvement (in dB PSNR) over the generic coder for AC-coefficient quantization step sizes of 16 and 32.
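For reference, improvements in dB PSNR of the kind reported here can be computed as in the following minimal sketch. It assumes 8-bit luminance frames (peak value 255) stored as lists of rows; the function name and the tiny test frames are illustrative and are not taken from the coder itself.

```python
import math


def psnr_db(reference, decoded, peak=255.0):
    """PSNR in dB between two equally sized luminance frames (lists of rows)."""
    count = 0
    squared_error = 0.0
    for ref_row, dec_row in zip(reference, decoded):
        for ref_pixel, dec_pixel in zip(ref_row, dec_row):
            squared_error += (ref_pixel - dec_pixel) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(peak * peak / mse)


# Hypothetical 1x2 frames: the gain of one coder over another is the
# difference of their PSNR values against the same original frame.
original = [[100, 100]]
baseline_decoded = [[110, 90]]   # mean squared error 100
proposed_decoded = [[105, 95]]   # mean squared error 25
gain_db = psnr_db(original, proposed_decoded) - psnr_db(original, baseline_decoded)
```

Halving the per-pixel error quarters the mean squared error, which corresponds to a gain of about 6 dB.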
Downloadable Papers
1. Seung Chul Yoon, Krishna Ratakonda and Narendra Ahuja, "Low Bit-Rate Video Coding with Implicit Multiscale Segmentation," IEEE Trans. on Circuits and Systems for Video Technology, vol. 9, no. 7, pp. 1115-1129, October 1999. [Abstract][Download Paper]
2. Seung Chul Yoon, Krishna Ratakonda and Narendra Ahuja, "Region based Video Coding using a Multiscale Image Segmentation," Proc. IEEE Int. Conf. on Image Proc. (ICIP'97), vol. 2, pp. 510-513, Santa Barbara, 1997. [Abstract][Download Paper]
3. Krishna Ratakonda, Seung Chul Yoon and Narendra Ahuja, "Video Compression: Coding the Displaced Frame Difference," Proc. IEEE Int. Conf. on Image Proc. (ICIP'97), vol. 1, pp. 353-356, Santa Barbara, 1997. [Abstract][Download Paper]
Contact Information
Seung-Chul Yoon
Address:
1614 Beckman Institute
405 N. Mathews Avenue, Urbana IL 61801, USA.
Phone: (217) 333-1869 / 244-4392
Email: scyoon@vision.ai.uiuc.edu
Homepage: http://vision.ai.uiuc.edu/scyoon/
Title: Low Bit-Rate Video Coding with Implicit Multiscale Segmentation
Abstract: This paper discusses a multiscale segmentation based video
compression algorithm aimed at very low bit-rate applications such as video
teleconferencing and video phones. We introduce novel techniques for
multiscale segmentation based motion compensation and residual coding. Our
region based forward motion compensation strategy (forward in the sense that
motion vectors point from the previous frame to the current frame) regulates
the size and number of regions used, by pruning a multiscale segmentation of
video frames. Since
regions used for motion compensation are obtained by segmenting the previously
decoded frame, the shape of the regions need not be transmitted to the
decoder. Furthermore, our hierarchical motion compensation strategy refines
an initial region level, coarse motion field to obtain a dense motion field
which provides pixel level motion vectors. The refinement procedure does
not require any additional information to be transmitted. This motion compensation
technique effectively addresses the problem of dealing with "holes" and
"overlapping regions" which are inherent to forward motion compensation.
Residual coding is performed using a novel method which exploits the fact
that the energy of the residual resulting from motion compensation is concentrated
in a priori predictable positions. We show that this residual coding technique
can also be extrapolated to improve the performance of coders using a block
based motion compensation strategy. A fusion of these concepts leads to a
gain of 2-3 dB in peak signal-to-noise ratio, apart from significant perceptual
improvement, over a generic video coding algorithm using a block based motion
compensation strategy (such as H.261 or H.263) for a variety of test sequences.
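The "holes" and "overlapping regions" mentioned in this abstract can be illustrated with a toy example. The sketch below projects each pixel of a segmented previous frame forward along its region's motion vector and counts how many source pixels land on each target position: a count of 0 is a hole, a count above 1 is an overlap. The label map, the integer per-region vectors, and the function name are hypothetical; the sketch only demonstrates the problem and does not show how the paper resolves it.

```python
def forward_project(labels, motion):
    """Count how many previous-frame pixels land on each current-frame pixel.

    labels: 2-D list of region ids from segmenting the previous decoded frame.
    motion: dict mapping region id -> (dy, dx) integer displacement.
    """
    height, width = len(labels), len(labels[0])
    hits = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            dy, dx = motion[labels[y][x]]
            target_y, target_x = y + dy, x + dx
            # Pixels that move outside the frame are simply dropped.
            if 0 <= target_y < height and 0 <= target_x < width:
                hits[target_y][target_x] += 1
    return hits


# Hypothetical segmentation with two regions; region 1 moves one pixel left.
labels = [[0, 0, 1, 1],
          [0, 0, 1, 1]]
motion = {0: (0, 0), 1: (0, -1)}

hits = forward_project(labels, motion)
holes = sum(row.count(0) for row in hits)
overlaps = sum(1 for row in hits for count in row if count > 1)
```

In this example the moving region uncovers the rightmost column (holes) and lands on top of the stationary region's boundary column (overlaps).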
Title: Region based Video Coding using a Multiscale Image Segmentation
Abstract: This paper proposes a novel region-based video coding technique
using a multiscale image segmentation method, thus obtaining better quality
at the same bit rate. In most previous region-based video coding
techniques, occlusion caused degradation in terms of both PSNR and perceptual
video quality. We propose a new motion estimation and compensation algorithm
which solves occlusion related problems effectively. The proposed motion
estimation and compensation is a two stage procedure: the first stage uses
a coarse motion model while the second stage uses a dense motion model.
The coarse motion model generates region level motion vectors which are
then fine tuned by the dense motion model which produces pixel level motion
vectors. A fusion of these concepts leads to a gain of 2-3 dB in PSNR over
the block-based algorithm for a variety of test sequences using a fully
functional video coder.
Title: Video Compression: Coding the Displaced Frame Difference
Abstract: Popular techniques employed to code the displaced frame
difference (DFD) treat it no differently from an ordinary image for coding
purposes. Since the DFD is generated by the process of motion compensation,
such methods do not fully exploit the underlying redundancies. This paper
proposes a DFD coding method which exploits such redundancies while incurring
negligible information overhead. The key idea is to predict locations of
high DFD concentration which occupy small portions of the image and use this
predicted information (which is also available to the decoder without additional
information transmission) to improve the quality of the decoded image. Two
key features of the proposed approach are its compatibility with any transform
based DFD coding scheme and negligible information overhead. Tests with a
fully functional video coder show the efficacy of the proposed approach.
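As background for this abstract, the displaced frame difference is the current frame minus its motion-compensated prediction. The following minimal sketch forms a DFD for block-based motion compensation and measures its energy. The block size, the border clamping, the vector convention (the current pixel at (y, x) is predicted from the previous frame at (y + dy, x + dx)), and all names are assumptions made for illustration; the prediction of high-energy DFD locations proposed in the paper is not specified here and is not shown.

```python
def displaced_frame_difference(current, previous, motion, block=2):
    """DFD for block-based motion compensation.

    motion[(block_row, block_col)] = (dy, dx). Blocks without an entry use
    zero motion. References outside the frame are clamped to the border.
    """
    height, width = len(current), len(current[0])
    dfd = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            dy, dx = motion.get((y // block, x // block), (0, 0))
            ref_y = min(max(y + dy, 0), height - 1)
            ref_x = min(max(x + dx, 0), width - 1)
            dfd[y][x] = current[y][x] - previous[ref_y][ref_x]
    return dfd


def energy(frame):
    """Sum of squared sample values."""
    return sum(value * value for row in frame for value in row)


# Hypothetical frames: the current frame is the previous one shifted right.
previous = [[10, 20, 30, 40],
            [10, 20, 30, 40]]
current = [[10, 10, 20, 30],
           [10, 10, 20, 30]]

energy_no_mc = energy(displaced_frame_difference(current, previous, {}))
energy_mc = energy(displaced_frame_difference(
    current, previous, {(0, 0): (0, -1), (0, 1): (0, -1)}))
```

With correct motion vectors the residual energy drops to zero in this toy case; in real sequences some residual remains and is what the DFD coder must encode.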