Understanding Video Quality Metrics
Introduction
Video quality metrics are objective methods of assessing the quality of a processed video and can detect artifacts such as blockiness, blurriness, and mosquito noise. This article introduces the metrics provided by our video quality testing tool, VQLab. For an introduction to video quality assessment, see “Objective Video Quality Assessment”.
The three metrics described below are computed for every frame of a video. The average of the computed values is an indication of the overall quality of the video.
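As a rough illustration of this averaging step, the sketch below reduces a list of per-frame scores to a single value. The function name `summarize_scores` is hypothetical and is not part of VQLab. Reporting the minimum and maximum alongside the mean is an extra added here for illustration.

```python
from statistics import fmean

def summarize_scores(per_frame_scores):
    """Reduce per-frame metric values to an overall summary.

    The mean is the overall indication described above; min and max are
    illustrative extras that show how far single frames deviate from it.
    """
    scores = list(per_frame_scores)
    if not scores:
        raise ValueError("at least one frame score is required")
    return {
        "mean": fmean(scores),
        "min": min(scores),
        "max": max(scores),
    }

# Hypothetical per-frame PSNR values (in dB) for a three-frame clip.
summary = summarize_scores([30.0, 40.0, 20.0])
print(summary["mean"], summary["min"], summary["max"])
```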
PSNR
PSNR (Peak Signal-to-Noise Ratio) is a statistical method of estimating differences between samples based on per-pixel comparison. It is widely used in the industry for its simplicity and surprisingly good video quality assessment. Because it is based on computing pixel differences, PSNR fails to capture structured or localized errors and cannot differentiate between types of errors: errors with different impact on a human observer can have the same PSNR. Despite these drawbacks, PSNR can be used for fast detection of low quality and, in combination with other metrics, can increase the correlation of objective scores with subjective ones.
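A minimal sketch of the usual PSNR definition for a single frame follows. It assumes 8-bit samples (peak value 255) and frames given as flat sequences of pixel values. The function name `psnr` and this frame representation are assumptions made for illustration, not VQLab's interface.

```python
import math

def psnr(reference, processed, peak=255.0):
    """PSNR in dB between two equally sized frames (flat pixel sequences)."""
    if len(reference) != len(processed) or len(reference) == 0:
        raise ValueError("frames must be non-empty and the same size")
    # Mean squared error over all pixel pairs.
    mse = sum((r - p) ** 2 for r, p in zip(reference, processed)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(peak ** 2 / mse)

reference = [100, 100, 100, 100]
spread_error = [110, 90, 110, 90]    # small error on every pixel
local_error = [120, 100, 100, 100]   # one large, localized error

print(psnr(reference, spread_error))
print(psnr(reference, local_error))
```

Both distorted frames have a mean squared error of 100, so both score about 28.13 dB, even though one error is spread over the frame and the other is concentrated in a single pixel. This is the limitation described above.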
CZD
CZD (Czekanowski Distance), like PSNR, is a per-pixel quality metric: it estimates quality by measuring differences between pixels. Described in the literature as being “useful for comparing vectors with strictly non-negative elements”, it measures the similarity among different samples. PSNR and CZD are more sensitive to noise than other metrics. CZD correlates better with subjective quality assessment than PSNR, mostly because noise in darker areas of the picture has a bigger impact on the value of the metric than noise in brighter areas (see Weber's Law of Just Noticeable Differences). The HVS (Human Visual System) behaves in a similar way, being more sensitive to changes in the darker areas of a picture.
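The sketch below uses the general vector form of the Czekanowski distance, 1 - 2 * sum(min(x, y)) / sum(x + y), applied to flat sequences of non-negative pixel values. How VQLab applies the distance to pixels and color components is not stated in this article, so the function name `czekanowski_distance` and the frame layout are assumptions.

```python
def czekanowski_distance(reference, processed):
    """Czekanowski distance between two non-negative sample vectors.

    0.0 means the vectors are identical; larger values mean they differ more.
    """
    if len(reference) != len(processed) or len(reference) == 0:
        raise ValueError("vectors must be non-empty and the same size")
    if any(v < 0 for v in reference) or any(v < 0 for v in processed):
        raise ValueError("samples must be non-negative")
    total = sum(r + p for r, p in zip(reference, processed))
    if total == 0:
        return 0.0  # both vectors are all zeros, treat as identical
    overlap = sum(min(r, p) for r, p in zip(reference, processed))
    return 1.0 - 2.0 * overlap / total

# The same absolute noise (+5) applied to a dark and a bright region.
dark = czekanowski_distance([10, 10, 10, 10], [15, 15, 15, 15])
bright = czekanowski_distance([200, 200, 200, 200], [205, 205, 205, 205])
print(dark, bright)
```

The dark region scores about 0.2 and the bright region about 0.012, so identical noise weighs far more in dark areas. PSNR would give both cases the same value, since it only looks at the absolute pixel differences.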
SSIM
SSIM (Structural Similarity) is a more complex metric, based on properties of the Human Visual System (HVS), that is starting to replace PSNR as the most widely used metric. It estimates the quality of the video much better (it has a higher correlation with subjective assessment) by extracting luminance, contrast and structure information from the frames and comparing those estimated values instead of directly comparing pixels. The metric is based on the fact that the HVS is more sensitive to structural changes in videos than to luminance and contrast changes.
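To make the comparison of luminance, contrast and structure concrete, here is a simplified sketch of the SSIM formula evaluated over a single window. Practical implementations compute it over many small local windows and average the results; treating the whole input as one window is a simplification made here. The function name `ssim_single_window`, the 8-bit peak value and the commonly used constants 0.01 and 0.03 are assumptions, not a description of VQLab's implementation.

```python
def ssim_single_window(reference, processed, peak=255.0):
    """Simplified SSIM over one window, given as flat pixel sequences."""
    n = len(reference)
    if n == 0 or n != len(processed):
        raise ValueError("windows must be non-empty and the same size")
    # Small constants that stabilize the divisions.
    c1 = (0.01 * peak) ** 2
    c2 = (0.03 * peak) ** 2
    # Luminance is estimated by the mean, contrast by the variance,
    # and structure by the covariance (population statistics).
    mu_x = sum(reference) / n
    mu_y = sum(processed) / n
    var_x = sum((x - mu_x) ** 2 for x in reference) / n
    var_y = sum((y - mu_y) ** 2 for y in processed) / n
    cov_xy = sum((x - mu_x) * (y - mu_y)
                 for x, y in zip(reference, processed)) / n
    luminance = (2 * mu_x * mu_y + c1) / (mu_x ** 2 + mu_y ** 2 + c1)
    contrast_structure = (2 * cov_xy + c2) / (var_x + var_y + c2)
    return luminance * contrast_structure

reference = [50, 100, 150, 200]
brighter = [60, 110, 160, 210]    # uniform luminance shift, structure intact
scrambled = [200, 50, 150, 100]   # same pixel values, structure destroyed

print(ssim_single_window(reference, brighter))
print(ssim_single_window(reference, scrambled))
```

The uniformly brighter window stays close to 1, while the scrambled window scores far lower, reflecting the emphasis on structural changes over luminance changes.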