CPSC 538a  - Topics in Computer Systems (Winter 2005)

  

Paper                      : No-Reference Metrics for Video Streaming Applications  by Venkatesh Babu Radhakrishnan, Ajit Bopardikar, Andrew Perkis, Odd Inge Hillestad  , International Packet Video Workshop (PV 2004)

 

Presented on         : February 21, 2005  by Bhavana

Paper Summary    :   Developing accurate metrics for quantifying perceived video quality is an active research topic. Traditionally, such metrics have been based on comparing the received video with the original video source. Such schemes are called full-reference metrics; examples are mean squared error (MSE) and PSNR. These schemes have been criticized for correlating poorly with the actual perceived video quality. No-reference (NR) metrics are a newer approach in which the quality of a video frame is estimated by measuring the structural distortion in the frame itself instead of comparing it with the original source.

This paper describes two no-reference quality assessment metrics for video streaming applications. The first is the blockiness metric, which quantifies the blockiness artifact introduced into a video frame by a block-based video compression scheme such as MPEG-4. The second is the packet-loss metric, which quantifies the distortion artifact introduced when concealment techniques are applied to a video frame at the location of a lost or corrupted macroblock. The authors claim that both metrics are computationally cheap and can be used effectively to monitor streaming video and/or to provide feedback to the streaming server so that it can adjust its sending rate.

NR Blockiness Metric :  The basic idea is that blockiness becomes visible when there is not enough spatial activity around a block edge to mask the edge gradient. The proposed scheme divides each edge of each 8 x 8 block into three segments of length 6 each (the segments must overlap, since an edge is only 8 pixels long). It then calculates the gradient and the standard deviation along each segment. Next, it counts the number of blocks for which at least one such segment has a standard deviation less than a given threshold ε and a gradient greater than a given threshold τ. The final metric, the blockiness measure βF of a frame, is the ratio of this count to the total number of blocks in that frame. The authors also ran experiments to validate the metric. They calculated it for the same video coded at different bit rates (i.e. different compression ratios and hence different amounts of blockiness). As the bit rate decreased, compression increased and the blockiness metric increased with it, indicating an increase in the perceived blockiness of the video.
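The counting step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the summary does not say exactly how the gradient and standard deviation are computed, so the sketch assumes the gradient is the mean absolute step across the edge and the activity is the larger of the standard deviations on the two sides of the edge. The names blockiness, segment_is_blocky and block_edges, and the threshold values used in the example, are hypothetical.

```python
from statistics import pstdev

BLOCK = 8    # block size named in the summary
SEG_LEN = 6  # segment length named in the summary


def segment_is_blocky(inner, outer, eps, tau):
    """inner/outer: pixel values just inside and just outside one edge segment."""
    # Assumption: gradient = mean absolute step across the edge.
    gradient = sum(abs(a - b) for a, b in zip(inner, outer)) / len(inner)
    # Assumption: activity = larger per-side standard deviation along the edge.
    activity = max(pstdev(inner), pstdev(outer))
    return activity < eps and gradient > tau


def block_edges(frame, top, left):
    """Yield (inner, outer) pixel lines for every edge of a block that has a neighbour."""
    h, w = len(frame), len(frame[0])
    bottom, right = top + BLOCK - 1, left + BLOCK - 1
    rows = range(top, top + BLOCK)
    cols = range(left, left + BLOCK)
    if top > 0:
        yield [frame[top][c] for c in cols], [frame[top - 1][c] for c in cols]
    if bottom + 1 < h:
        yield [frame[bottom][c] for c in cols], [frame[bottom + 1][c] for c in cols]
    if left > 0:
        yield [frame[r][left] for r in rows], [frame[r][left - 1] for r in rows]
    if right + 1 < w:
        yield [frame[r][right] for r in rows], [frame[r][right + 1] for r in rows]


def blockiness(frame, eps, tau):
    """Fraction of 8 x 8 blocks with at least one low-activity, high-gradient segment."""
    h, w = len(frame), len(frame[0])
    flagged = total = 0
    for top in range(0, h - BLOCK + 1, BLOCK):
        for left in range(0, w - BLOCK + 1, BLOCK):
            total += 1
            if any(
                segment_is_blocky(inner[s:s + SEG_LEN], outer[s:s + SEG_LEN], eps, tau)
                for inner, outer in block_edges(frame, top, left)
                for s in range(BLOCK - SEG_LEN + 1)  # offsets 0, 1, 2: three segments
            ):
                flagged += 1
    return flagged / total if total else 0.0


# A flat frame has no edge gradient; a frame of flat tiles is maximally blocky.
flat = [[128] * 16 for _ in range(16)]
tiles = [[100 if (r // 8 + c // 8) % 2 == 0 else 140 for c in range(16)]
         for r in range(16)]
print(blockiness(flat, eps=2.0, tau=10.0))   # 0.0
print(blockiness(tiles, eps=2.0, tau=10.0))  # 1.0
```

Measuring activity separately on each side of the edge is a design choice of this sketch: pooling both sides would count the edge step itself as spatial activity and hide exactly the artifact being looked for.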

NR Packet Loss Metric :  Here, the authors assume a simple concealment scheme at the decoder that replaces a lost or corrupted macroblock with the corresponding macroblock from the previous frame. The assumption is that the length of the resulting distortion artifact is proportional to the distortion caused in the video frame by packet loss. In the scheme, two strength vectors are calculated for each macroblock, one measuring strength across the macroblock and the other within and near the macroblock edge. Next, these strength vectors are converted into binary vectors by setting each value to one if the strength exceeds a given threshold and to zero otherwise. The length of the distortion for each macroblock is then the sum of the differences between these two binary vectors if that sum exceeds a given threshold, and zero otherwise. The final metric F is the sum, over all macroblocks, of the squared distortion lengths. Squaring gives longer distortion artifacts more weight than shorter ones. For simulation, the authors used NTT DoCoMo’s packet loss model and calculated the metric for video frames at different PLRs (packet loss ratios). The NR packet loss metric increased as the packet loss increased.
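The thresholding and squaring steps can be sketched as below. This is a hedged illustration only: the summary does not define how the two strength vectors are computed, so the sketch takes them as given inputs, and it assumes that "sum of the differences" means the sum of absolute element-wise differences between the two binary vectors. The function names and threshold values are hypothetical.

```python
def binarize(strengths, strength_threshold):
    """1 where the strength exceeds the threshold, 0 otherwise."""
    return [1 if s > strength_threshold else 0 for s in strengths]


def distortion_length(across, within, strength_threshold, length_threshold):
    """Distortion length of one macroblock from its two strength vectors."""
    a = binarize(across, strength_threshold)
    b = binarize(within, strength_threshold)
    # Assumption: differences are summed as absolute values.
    length = sum(abs(x - y) for x, y in zip(a, b))
    return length if length > length_threshold else 0


def packet_loss_metric(macroblocks, strength_threshold, length_threshold):
    """Sum of squared distortion lengths over (across, within) vector pairs."""
    return sum(
        distortion_length(across, within, strength_threshold, length_threshold) ** 2
        for across, within in macroblocks
    )


# One 16-pixel artifact versus two 8-pixel artifacts (made-up strength values).
long_mb = ([30] * 16, [2] * 16)
half_mb = ([30] * 8 + [2] * 8, [2] * 16)
tiny_mb = ([30, 30] + [2] * 14, [2] * 16)   # below the length threshold

print(packet_loss_metric([long_mb, tiny_mb], 10, 4))  # 256
print(packet_loss_metric([half_mb, half_mb], 10, 4))  # 128
```

The two calls cover the same total artifact length of 16, but the single long artifact scores 256 against 128 for the two shorter ones, which is the extra weight that squaring is meant to give.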

NR techniques for estimating video quality are useful when the original reference is not available or is too expensive to send as side information. However, these techniques do not take into account the perceptual characteristics of the human visual system (HVS). Moreover, in the presence of strong de-blocking filters, which reduce blockiness at the cost of increased blurriness, the blockiness metric might fail to capture the perceived loss in quality due to blur and noise. The NR techniques could benefit from incorporating HVS models into their analysis. Some researchers have also suggested using natural scene statistics models for NR metrics. It remains to be seen how these metrics behave with non-block-based codecs such as wavelet-based codecs.

Class Discussion  :

1.      The NR blockiness metric depends on the idea that in a region of high spatial activity the human eye cannot perceive fine detail, so blockiness artifacts tend to be masked.

2.      Buck found the NR packet loss metric completely useless, because it targets artifacts caused by packet loss, which do not give acceptable video quality anyway, so measuring their quality is of no use. Also, the technique is oversimplified: it does not take into account many characteristics of the encoder and targets only one simple concealment strategy. It is also not known how this metric will behave in the presence of strong de-blocking filters.

3.      Regarding PSNR, Buck said that although PSNR is known to behave poorly in some situations (for example, if the picture is shifted by a few pixels, the PSNR value degrades drastically while the actual perceived quality is hardly affected), more sophisticated metrics based on HVS models have nevertheless failed to outperform PSNR. Because of this, VQEG had to postpone its standardization process and wait for the technology to mature.

4.      The proposed NR blockiness metric seems likely to behave well at low bit rates, where blockiness dominates the other artifacts. At high bit rates, where users also expect detail and clarity, it might not correlate well with the perceived video quality, because other artifacts such as blur and noise become increasingly dominant.
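The shift example from point 3 can be reproduced with a small sketch. It computes PSNR, a full-reference metric, between a synthetic striped image and a copy shifted right by one pixel. The image, the wrap-around shift and the function names are assumptions made for illustration; they are not taken from the paper or the discussion.

```python
import math


def psnr(reference, test, peak=255):
    """PSNR in dB between two equally sized 8-bit greyscale images (lists of rows)."""
    diffs = [
        (a - b) ** 2
        for ref_row, test_row in zip(reference, test)
        for a, b in zip(ref_row, test_row)
    ]
    mse = sum(diffs) / len(diffs)
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)


def shift_right(image, pixels=1):
    """Shift every row right, wrapping around so the image size is unchanged."""
    return [row[-pixels:] + row[:-pixels] for row in image]


# Vertical stripes, 4 pixels black then 4 pixels white, 16 x 16 image.
stripes = [([0] * 4 + [255] * 4) * 2 for _ in range(16)]
shifted = shift_right(stripes, 1)

print(psnr(stripes, stripes))            # inf: identical images
print(round(psnr(stripes, shifted), 2))  # 6.02: a one-pixel shift
```

A viewer would see the same stripes in both images, yet a single-pixel shift takes the PSNR from unbounded down to about 6 dB, because PSNR compares pixels position by position. The stripe pattern is deliberately sharp-edged; smoother content would lose less.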

References              :

*       No Reference Image and Video Quality Assessment http://live.ece.utexas.edu/research/quality/nrqa.htm

*      Objective Video Quality Assessment http://www.cns.nyu.edu/~zwang/files/papers/QA_hvd_bookchapter.pdf

*      Perceptual Video Quality and Blockiness Metrics for Multimedia Streaming Applications http://www.stefan.winkler.net/Publications/wpmc2001.pdf

                                                                                                                                                      

Slides                     :  Powerpoint