Robust Specularity Removal from Hand-held Videos
Abstract
Specular reflections arise when one records a photo or video through a transparent glass medium or from opaque surfaces such as plastics, ceramics, polyester and human skin; the observed image is well described as the superposition of a transmitted layer and a reflection layer. These specular reflections often confound algorithms developed for image analysis, computer vision and pattern recognition. To obtain a pure diffuse reflection component, the specularity (highlights) must be removed. To handle this problem, a novel and robust algorithm is formulated. The contributions of this work are three-fold.
First, the smoothness of the video, along with its temporal coherence and illumination changes, is preserved by reducing the flickering and the jagged edges caused by hand-held video acquisition and by the homography transformation, respectively.
Second, the algorithm is designed to be more potent than state-of-the-art algorithms: the region of interest is selected automatically in all frames, the computational time and complexity are reduced by operating on the luminance (Y) channel alone, and the objective function, formulated using structural priors, is decomposed by the Augmented Lagrange Multiplier (ALM) method with Alternating Direction Minimizing (ADM).
Third, a quantity metric is devised that objectively quantifies the amount of specularity in each frame of a hand-held video. The proposed specularity removal algorithm is compared against existing state-of-the-art algorithms using this newly developed metric. Experimental results demonstrate that the developed algorithm has superior performance in terms of computation time, quality and accuracy.
Table of Contents
List of Tables
List of Algorithms
1. Introduction
1.1 Importance of Specularity Removal
1.2 Thesis Framework
1.3 Video Data Collection
1.4 Challenges in Separation
1.5 Contributions
1.6 Thesis Outline
2. Background
2.1 Illumination
2.2 Surface Reflectance
2.2.1 Specular Reflection
2.2.2 Diffuse Reflection
2.3 Color and Color Space
2.3.1 RGB Color Space
2.3.2 YUV Color Space
2.4 RGB to YUV Conversion
2.5 YUV to RGB Conversion
3. Automatic ROI Detection
3.1 Harris Corner Detection
3.1.1 Mathematical Formulation
3.1.2 Limitations of the Harris Corner Detector
3.2 Tracking using the KLT Algorithm
3.2.1 Template Image T(x)
3.2.2 Optimization Problem and Solution Algorithm
4. Specularity Removal Algorithm
4.1 Problem Formulation
4.1.1 First Structural Prior
4.1.2 Second Structural Prior
4.2 Optimization Problem
4.3 Solution Algorithm
4.3.1 Solution Algorithm for Specularity Removal
4.3.2 Determining the Optimum Number of Frames
4.4 Planar Homography Transformation
4.4.1 RANSAC Method for Homography Estimation
4.4.2 Limitations of the Homography Transformation
4.5 Final Image to Video Sequence Conversion
5. Quantity Metrics
6. Experimental Results
6.1 Quality Evaluation Metrics
6.2 Implementation Details
7. Conclusion
7.1 Future Scope
Bibliography
List of Figures
Figure 1.1 Specular reflections through glass, an opaque surface [19], human skin [1] and the iris, respectively
Figure 1.2 (a) Colonoscope-generated specular highlight [2], (b) Iris detection, (c) Number plate recognition
Figure 1.3 Main schema of the thesis
Figure 1.4 (a) Book placed on an opaque surface, (b) Photo frame, (c) Book in high illumination condition
Figure 2.1 Natural and artificial sources of illumination
Figure 2.2 Bidirectional Reflectance Distribution Function
Figure 2.3 Mechanism of Surface Reflection
Figure 2.4 Specular Reflection and Diffuse Reflection
Figure 2.5 (a) Additive color mixing, (b) Subtractive color mixing [25]
Figure 2.6 RGB Color Space
Figure 2.7 R, G, B channels for an Image
Figure 2.8 (a) Original RGB image, (b) Y channel, (c) U channel, (d) V channel
Figure 3.1 Basic idea of a “flat”, “edge” and “corner”
Figure 3.2 Classification of the points based on Eigen values
Figure 3.3 Corner Response Map
Figure 3.4 Tracking a point (x,y)
Figure 3.5 Distinct Warp functions for different transformations [33]
Figure 3.6 Illustration of Algorithm 1
Figure 4.1 Plot of the number of input frames with respect to the energy in the transmission layer and reflection layer
Figure 4.2 Planar Homography transformation
Figure 4.3 Mapping of points during planar homography
Figure 4.4 Non-homogeneous coordinates
Figure 4.5 Projective transformation example
Figure 4.6 Jagged edges due to the projective transform
Figure 5.1 Quantity metric for specularity measurement on a frame-by-frame basis
Figure 5.2 Quantity metric for specularity measurement
Figure 5.3 GUI for specularity measurement (Low Specularity)
Figure 5.4 GUI for specularity measurement (High Specularity)
Figure 6.1 Visual display of results: (a) Original Frame – Y, (b) Transmission layer – T and (c) Specular Reflection Layer – E recovered by our method for different video sequences
Figure 6.2 Quality metrics for amount of specularity
Figure 6.3 Computational time for different numbers of frames for the Y channel and RGB channels
List of Algorithms
Algorithm 1 Automatic detection of the ROI of the first frame
Algorithm 2 ROI Selection
Algorithm 3 Specularity Removal
Algorithm 4 Images to Video Sequence
1. Introduction
1.1 Importance of Specularity Removal
The algorithms developed for computer vision, image analysis and pattern recognition tasks assume that the surfaces of objects are purely diffuse. When recording a photo or video through a transparent surface or from an opaque one, part of the image or video can be occluded by highlights, and these algorithms become erroneous. To obtain the pure diffuse reflection, the specularity needs to be accurately removed.
Figure 1.1 shows the specular reflections that arise when capturing images through or from inhomogeneous materials such as human skin, glass, plastics and other opaque surfaces.
Figure 1.1 Specular reflections through glass, opaque surface [19], human skin [1] and iris respectively.
Due to the growth of computer vision applications, many specular reflection separation algorithms have been formulated. Specularity elimination requires no camera calibration or other a priori information regarding the scene.
Figure 1.2 shows some of the domains in which specular reflection obstructs computer vision tasks.
Figure 1.2 (a) Colonoscope generated specular highlight [2], (b) Iris detection, (c) Number plate recognition
1.2 Thesis Framework
Figure 1.3 shows the process schema of this thesis, which proposes a pipeline for specular reflection separation from a video. Given an input video sequence, the process ends with an output video sequence free from specularity. To achieve this goal, the input video sequence is processed through the stages shown in the flowchart. Specular reflection separation has mostly been performed on either multiple images or single images, and it is a highly ill-posed problem because the number of unknowns to be recovered is twice the number of given inputs.
Figure 1.3 Main schema of the thesis
1.3 Video Data Collection
Since specular reflection separation from a video is a novel idea, no datasets were available. I therefore collected the video data using an iPhone 6S, in .MOV format; the data were recorded in our university library. Figure 1.4 shows the collected video data; for the reader's understanding, two frames from each recording are displayed.
Figure 1.4 (a) Book placed on an opaque surface, (b) Photo frame, (c) Book in high illumination condition
1.4 Challenges in Separation
Many algorithms have been formulated to separate the transmission layer and the reflection layer; they can be grouped into two classes:
- Single-Image Methods
- Multi-Image Methods
Single-Image Methods: These methods use only a single image to separate the layers. They do not require several images of the same scene taken from different angles, which saves time and effort in the image acquisition process. Work based on these methods can be grouped into two categories according to approach.
The first category is color space analysis, in which analysis of the color space aids the separation of highlights from object colors. Klinker et al. [4] classified color pixels as matte (evincing only body reflection), highlight (both body and specular reflection) and clipped (behaving as a specularity when the reflected light exceeds the dynamic range of the camera) [3]. The two-dimensional diagram approach, which is superior in terms of computational time, operates mainly in UV-space: the RGB data is first converted to UV-space and then to h-space, and the h-space is processed with a morphological filter [5,6]. In the approach of Bajcsy et al., a novel S color space was introduced [7]. The Klinker et al. approach, the two-dimensional diagram approach and the Bajcsy et al. approach together constitute color space analysis.
The second category is neighborhood analysis, in which local operations are performed and color information is utilized in the separation process. The specular-free image approach, PDE approaches, inpainting techniques, color-information-plus-classifier methods and Fresnel-reflection-coefficient methods all employ neighborhood analysis. The specular-free image approach generates a pseudo-diffuse component (specular-free image) [8,9,10,20,21] and then categorizes each pixel as specular or diffuse; if a pixel is labelled specular, a specular-to-diffuse algorithm is applied to produce a more diffuse component. Hui-Liang Shen and Zhi-Huan Zheng [19] proposed an efficient method that generates a pseudo-chromaticity image and then categorizes the pixels using a chromaticity threshold (Tc) and a percentile threshold (Tp). In PDE approaches, the main concept is to iteratively erode the specular component at each pixel [11,12]. Inpainting techniques build on the process of reconstructing lost or deteriorated parts of an image; Tan et al. [13] observed that highlight pixels contain useful information for guiding the inpainting process. Using color information and a classifier, Tappen et al. [14] assumed that the input image is the product of shading and reflectance images and applied a logarithm to both sides to make the model additive; the resulting derivatives are classified by a grey-scale classifier into diffuse and reflection components. Specularity detection based on the Fresnel reflection coefficient [15] exploits the observation that the Fresnel coefficient changes with wavelength, which affects the color of a specular highlight, so a segmentation step is performed using the mean-shift segmentation algorithm.
Multiple-Image Methods: These methods use the information contained in an image sequence of the same scene captured from different angles or under varying lighting. Lin et al. [16] presented a technique that identifies specular pixels in an image sequence through color histogram differences and then finds stereo correspondences for those pixels; once identified, the specular pixels are removed by associating them with their corresponding diffuse points in other images. Multi-flash methods are exploited by Feris et al. [17] and Agrawal et al. [18], which utilize multiple snapshots of the same scene from the same viewpoint while varying one input variable, the flash position.
Although many algorithms have been formulated, they are limited by the conditions of their applicability. Most techniques rely on a specific reflection model and assume that the specularity varies inconsiderably with wavelength. Sparse blind separation algorithms with spatial shifts [22,11] assume that the motions are uniform translations. RASL [23] has the drawback that the visual quality of the recovered layers is not guaranteed. Xiaojie Guo et al. [24] proposed a promising method for decomposing superimposed images, but it fails when the superimposed image is blurred.
1.5 Contributions
A robust specular reflection separation method requires that all specular reflection components be removed and the diffuse reflection recovered, with an algorithm that has lower computational complexity, an optimized implementation and greater accuracy than state-of-the-art algorithms.
The contribution of this work is to extend the algorithms developed for images or image sequences to a video with camera motion and illumination changes. The algorithm is formulated to have less computational complexity and higher accuracy.
Since a video contains more than 250 frames, an optimum number of frames, and the best frames to work on, must be found. This is determined using the energy in the transmitted layer, the energy in the reflection layer and the computation time for varying numbers of frames (5, 7, 10, 12, 14, 15, 16, 17, 18, 19, 20 and 21). A threshold is reached showing that 15 frames is the optimum number for accurate separation of the specular reflection from the given video sequence.
At the end of the thesis, the best 15 frames are determined, which effectively and efficiently separate the specular reflection in the video sequence; the resulting video appears to be of the same quality (to the human eye and by the devised quantity metric) as the video obtained using frames 1:17:250. Every frame of the acquired video must contain the four-sided polygonal region of interest, since the correspondence between frames is exploited to remove the highlights. In the future, this algorithm and the quantity metric could be extended to videos that have no such correspondences.
1.6 Thesis Outline
The thesis is outlined in the flowchart in Figure 1.3. Chapter 2 provides the basic physical and mathematical background, discussing specular reflection, diffuse reflection, color and color spaces (mainly the RGB-to-YUV and YUV-to-RGB conversion algorithms). Chapters 3 to 5 delineate the proposed method for specular reflection separation: Chapter 3 presents the automatic detection and tracking of the region of interest; Chapter 4 presents the core algorithm for separating the layers (superimposed image decomposition) in detail, along with the homography transformation, its limitations and the final image-sequence-to-video conversion; Chapter 5 covers the novel quantity metric and the GUI developed for side-by-side evaluation of the specularity. Chapter 6 reports the experimental results, implementation details and quality evaluation metrics. Chapter 7 concludes the work by presenting its limitations and future research directions.
2. Background
This chapter provides the basic physical and mathematical background for this thesis. Sec. 2.1 covers illumination, a main influence on computer vision. On it depends the surface reflectance explained in Sec. 2.2, which has two major classes: specular reflection (Sec. 2.2.1) and diffuse reflection (Sec. 2.2.2). Sec. 2.3 introduces color and color spaces, with Sec. 2.3.1 on the RGB color space and Sec. 2.3.2 on the YUV color space. The RGB-to-YUV and YUV-to-RGB conversion algorithms are then explained in Sec. 2.4 and Sec. 2.5 respectively.
2.1 Illumination
Illumination is defined as the amount of source light incident on the scene. Without illumination, a vision system is useless. There are two types of illumination: natural illumination provided by the sun, which varies with the time of day, and artificial illumination from man-made light sources. The illumination of a scene is denoted i(x,y) and satisfies 0 < i(x,y) < ∞.
Figure 2.1 Natural and artificial sources of illumination
2.2 Surface Reflectance
The intensity of an image depends on illumination and reflectance. Reflectance, denoted r(x,y), is the amount of light reflected by the objects in the scene:
f(x,y) = i(x,y) r(x,y)
where 0 < i(x,y) < ∞ and 0 < r(x,y) < 1, with 0 corresponding to total absorption and 1 to total reflectance, and f(x,y) is the image intensity.
Figure 2.2 shows a basic case of surface reflectance, in which a beam of light strikes a surface at an angle Θi with respect to the normal and the outgoing beam leaves at an angle Θr.
Figure 2.2 Bidirectional Reflectance Distribution Function1
1 https://www.cs.cmu.edu/afs/cs/academic/class/15462-f09/www/lec/lec8.pdf
Two main points can be considered
- The outgoing beam does not have the same energy and wavelength as that of the incoming beam.
- Θi may or may not be equal to Θr .
Surfaces are divided into two main classes: specular surfaces and diffuse (Lambertian) surfaces.
Figure 2.3 shows the mechanism of surface reflection; the image intensity is the additive sum of body reflection and surface reflection.
Figure 2.3 Mechanism of Surface Reflection1
1 https://www.cs.cmu.edu/afs/cs/academic/class/15462-f09/www/lec/lec8.pdf
2.2.1 Specular Reflection
Specular reflection is due to the bouncing of light from a shiny, highly polished surface such as a mirror, where parallel rays of light bounce off at the same angle. This is the main phenomenon that allows us to see our face in a mirror.
2.2.2 Diffuse Reflection
Diffuse reflections occur on rough surfaces such as paper, cloth and asphalt. These reflections obey the laws of reflection, but because of the roughness of the surface the normal varies along the surface; the normals are not parallel as they are in the specular case. Diffuse reflection contributes more to identifying an object than specular reflection does.
Figure 2.4 Specular Reflection and Diffuse Reflection1
1 http://help.autodesk.com/
Figure 2.4 shows specular and diffuse reflection and the impact of each on a surface.
2.3 Color and Color Space
Color is an important descriptor for human visual perception. Achromatic light is the light seen on a black-and-white television, while chromatic light spans approximately 400 to 700 nm of the electromagnetic spectrum. Chromatic colors have hue, saturation and intensity, and are perceived through the stimulation of the cone cells in the human eye [25]. Approximately 65% of all cones are sensitive to red light, 33% to green and 2% to blue; on this basis red, green and blue are designated the primary colors. The primary colors can be added to produce the secondary colors of light: magenta (red plus blue), cyan (green plus blue) and yellow (red plus green). Mixing all three primaries of light produces white, while mixing all three secondary pigments produces black [26].
Figure 2.5 (a) shows additive color mixing, (b) shows subtractive color mixing.
Figure 2.5 (a) Additive color mixing, (b) Subtractive color mixing [25]
The main purpose of a color space is to specify colors in some standard way. Most color models are oriented either toward hardware (such as color monitors and printers) or toward applications where color manipulation is the goal (such as computer graphics for animation) [26]. There are innumerable color spaces, used according to their application:
- RGB color space for color monitors and video cameras
- CMY and CMYK for color printing
- HSI in human perception
- YUV for digital television
2.3.1 RGB Color Space
This color space is based on a Cartesian coordinate system. In this model each color appears as a combination of the three primary colors (red, green and blue). Figure 2.6 shows the RGB color space: red, green and blue are at three corners of the cube, and cyan, magenta and yellow occupy three of the others. White is at the coordinate (1,1,1) and black at the origin (0,0,0).
Figure 2.6 RGB Color Space1
Images represented in the RGB color space consist of three component images, one for each primary color, which are combined to produce the full-color image. Figure 2.7 shows the three channels for an image of a bird.
1 http://eprints.grf.unizg.hr/2318/1/Z617_Komugovi%C4%87_Ana.pdf
Figure 2.7 R, G, B channels for an Image1
2.3.2 YUV Color Space
In this color space the Y component is the brightness of the color, also known as luminance, while the U and V components represent the color (chroma). The primary advantage of this color space is that some of the chroma information can be discarded to reduce bandwidth, which makes it a common basis for video compression standards such as the widely used MPEG-2. The chroma subsampling rate used here is 4:2:2. Figure 2.8 shows the YUV decomposition of an image.
1 http://triplelift.com/2013/07/02/the-complexity-of-image-analysis-part-2-colors/
Figure 2.8 (a) Original RGB image, (b) Y channel, (c) U channel, (d) V channel1
1 https://en.wikipedia.org/wiki/YUV
2.4 RGB to YUV Conversion
YUV is often used interchangeably with YCbCr, and different equations are used for the RGB-to-YUV conversion depending on the application. For digital component video, the YCbCr color format is used. For SDTV (standard-definition TV), the conversion is given by the following equations (the standard BT.601 form, as in [27]):

Y  = 0.257 R + 0.504 G + 0.098 B + 16
Cb = −0.148 R − 0.291 G + 0.439 B + 128
Cr = 0.439 R − 0.368 G − 0.071 B + 128

Here the range of Y is [16, 235] and that of Cb and Cr is [16, 240]. There is a need for full-range Y, Cb and Cr; the following equations convert RGB values to full-range YCbCr:

Y  = 0.299 R + 0.587 G + 0.114 B
Cb = −0.169 R − 0.331 G + 0.500 B + 128
Cr = 0.500 R − 0.419 G − 0.081 B + 128
The possible ranges of values for luminance and chrominance reserve some footroom and headroom, which is necessary to provide space for overshooting [27]. The offsets [0, 128, 128] are referred to as the clamping values; omitting them causes artifacts such as negative luminance, e.g. in combination with analog video equipment.
After the conversion from RGB to YUV, the chroma channels are subsampled to obtain YUV 4:2:2. Bicubic interpolation is used to reduce the size of the U and V channels, so that the Y channel has the same size as the video frame while the chroma channels are half the size of the Y channel.
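To make the two steps concrete, the following MATLAB sketch performs the full-range conversion and the chroma subsampling described above; the file name and variable names are illustrative assumptions, not the thesis code.

    % Sketch of Sec. 2.4: full-range RGB -> YCbCr with 4:2:2 chroma subsampling.
    rgbFrame = im2double(imread('frame001.png'));   % one video frame, values in [0,1]
    R = rgbFrame(:,:,1);  G = rgbFrame(:,:,2);  B = rgbFrame(:,:,3);

    % Full-range conversion (the offsets of 128 are rescaled to the [0,1] range)
    Y  =  0.299*R + 0.587*G + 0.114*B;
    Cb = -0.169*R - 0.331*G + 0.500*B + 128/255;
    Cr =  0.500*R - 0.419*G - 0.081*B + 128/255;

    % 4:2:2 subsampling: chroma channels at half size, by bicubic interpolation
    [h, w] = size(Y);
    Cb422 = imresize(Cb, [h, floor(w/2)], 'bicubic');
    Cr422 = imresize(Cr, [h, floor(w/2)], 'bicubic');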
2.5 YUV to RGB Conversion
This is one of the stages in Figure 1.3, in which the YUV color space is converted back to the RGB color space. For standard-definition TV (SDTV) applications, the YUV-to-RGB conversion is given by the following equations (the standard BT.601 form, as in [27]):

R = 1.164 (Y − 16) + 1.596 (Cr − 128)
G = 1.164 (Y − 16) − 0.392 (Cb − 128) − 0.813 (Cr − 128)
B = 1.164 (Y − 16) + 2.017 (Cb − 128)

In these equations the Y channel ranges over [16, 235] and the Cb and Cr channels over [16, 240]. To work with full-range Y, Cb and Cr, the following equations convert full-range YCbCr values back to RGB:

R = Y + 1.402 (Cr − 128)
G = Y − 0.344 (Cb − 128) − 0.714 (Cr − 128)
B = Y + 1.772 (Cb − 128)

To convert from the YUV 4:2:2 color space to the RGB color space, the first step is to upsample the chroma channels (U and V) by a factor of 2 so that they are the same size as the Y channel. Bicubic interpolation is employed for this, after which the conversion equations are applied to the YUV channels.
The matrix used in this conversion is the inverse of the matrix used for the RGB-to-YUV conversion in Sec. 2.4. Typically, this full-range color format is used for JPEG images. Processes 2 and 4 in Figure 1.3 employ the full-range conversion.
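Continuing the sketch from Sec. 2.4 (and reusing its illustrative variable names), the inverse step upsamples the chroma channels and applies the full-range inverse equations:

    % Sketch of Sec. 2.5: upsample chroma to the Y size, then YCbCr -> RGB.
    Cb = imresize(Cb422, size(Y), 'bicubic');   % bicubic upsampling by factor 2
    Cr = imresize(Cr422, size(Y), 'bicubic');

    R = Y + 1.402*(Cr - 128/255);
    G = Y - 0.344*(Cb - 128/255) - 0.714*(Cr - 128/255);
    B = Y + 1.772*(Cb - 128/255);
    rgbOut = cat(3, R, G, B);                   % reassembled RGB frame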
3. Automatic ROI Detection
The primary step is to automatically determine the region of interest, which is specified by four corner coordinates in a frame. The approach is first to detect the four corner points of the region of interest in the first frame and then to use the KLT tracking algorithm to find the corresponding corner points in all the other frames.
To detect the four corner points in the first frame, Harris corner detection is applied to the binary version of the first frame.
3.1 Harris Corner Detection
Corners are good feature points because they are stable across changes of viewpoint. They can be identified by examining the intensity values within a window: for a point to be a corner, there must be a large change in intensity along all directions. Figure 3.1 shows the basic idea of a corner [38].
Figure 3.1 Basic idea of a “flat”, “edge” and “corner”
3.1.1 Mathematical Formulation
The change of intensity for a shift [u,v] is given by the standard Harris formulation

E(u,v) = Σ_{x,y} w(x,y) [ I(x+u, y+v) − I(x,y) ]²

where w(x,y) is a window function and I is the image intensity.
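The detection-and-tracking pipeline of this chapter can be sketched in MATLAB with the Computer Vision Toolbox as follows; the input file name, the choice of the four strongest corners as the ROI corners, and the tracker settings are illustrative assumptions rather than the thesis implementation.

    % Sketch of Ch. 3: Harris corners on the binarized first frame,
    % then KLT tracking of the four ROI corner points through the video.
    v = VideoReader('library_book.mov');            % hypothetical input video
    firstFrame = rgb2gray(readFrame(v));
    bw = imbinarize(firstFrame);                    % binary version of frame 1

    corners = detectHarrisFeatures(im2single(bw));  % Harris corner detection
    pts = corners.selectStrongest(4).Location;      % assume 4 strongest = ROI corners

    tracker = vision.PointTracker('MaxBidirectionalError', 2);  % KLT tracker
    initialize(tracker, pts, firstFrame);
    while hasFrame(v)
        frame = rgb2gray(readFrame(v));
        [pts, valid] = tracker(frame);              % ROI corners in this frame
    end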
    end
    Γ = Γ + ΔΓ^t
end
Save the 15 output images (Iaf) and replace the corresponding originals with them.
for i = 1 to 250 do
    Mask out the region of interest from the current frame, leaving the ROI black and its background undisturbed (I1)
    Form an image with a black background and the ROI filled with Iaf (I2)
    Compute I3 = I2 ∘ Γi
    Final image I4 = I1 + I3
end
Combine the Y, U and V channels and convert the image sequence to an RGB video.
Output: Optimal solution (T* = T^t, E* = E^t) and the video after specularity removal
ΔΓ^{t+1} = argmin_{ΔΓ} Φ( Z1^t , Y∘Γ + Σ_{i=1}^{n} Ji ΔΓ εi εi^T − T^{t+1} − M^{t+1} )
         = Σ_{i=1}^{n} Ji^† ( T^{t+1} + M^{t+1} − Y∘Γ − Z1^t/μ^t ) εi εi^T        (28)

where Ji^† represents the Moore-Penrose pseudoinverse of Ji.
The Lagrange multipliers are updated as

Z1^{t+1} = Z1^t + μ^t ( V^{t+1} − T^{t+1} − M^{t+1} )
Z2^{t+1} = Z2^t + μ^t ( M^{t+1} − E^{t+1} − N^{t+1} )
Z3^{t+1} = Z3^t + μ^t ( L^{t+1} − T^{t+1} )
Z4^{t+1} = Z4^t + μ^t ( P^{t+1} − D T^{t+1} )
Z5^{t+1} = Z5^t + μ^t ( Q^{t+1} − D E^{t+1} )        (29)
The inner loop of Algorithm 3 ends when ||V^{t+1} − T^{t+1} − M^{t+1}||_F ≤ δ ||Y∘Γ||_F with δ = 10^-6, or when the maximum number of iterations is reached. The outer loop terminates when the change in the objective function value falls below a tolerance or the maximum number of iterations is reached.
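As a concrete illustration of the multiplier updates in (29) and the inner-loop test, here is a minimal MATLAB sketch; it assumes the primal blocks T, E, M, N, V, L, P, Q, the warped input YG = Y∘Γ, a derivative operator D (as a function handle) and the penalty mu have already been produced by the preceding update steps.

    % One dual-variable update of Eq. (29) plus the inner-loop stopping test.
    Z1 = Z1 + mu*(V - T - M);
    Z2 = Z2 + mu*(M - E - N);
    Z3 = Z3 + mu*(L - T);
    Z4 = Z4 + mu*(P - D(T));
    Z5 = Z5 + mu*(Q - D(E));

    delta = 1e-6;                                    % tolerance from the text
    innerConverged = norm(V - T - M, 'fro') <= delta*norm(YG, 'fro');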
4.3.2 Determining the Optimum Number of Frames
Figure 4.1 Plot of the number of input frames with respect to the energy in the transmission layer and reflection layer
To determine the optimum number of frames for the specularity removal process, the number of input frames is plotted against the energy in the transmission layer and in the reflection layer, as shown in Figure 4.1. For the specularity to be removed accurately, the energy in the transmission layer should be very high and that in the reflection layer very low. From Figure 4.1, N = 15 is the threshold point at which the energy values in the transmission layer and reflection layer become nearly equal, so 15 frames are chosen for the specularity removal process.
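A sketch of how such a sweep might be scripted, assuming a hypothetical wrapper function specularityRemoval around Algorithm 3 that returns the recovered layers for a given number of input frames of a loaded video:

    % Sweep the number of input frames and record the layer energies (Fig. 4.1).
    % specularityRemoval is a hypothetical wrapper around Algorithm 3.
    counts = [5 7 10 12 14 15 16 17 18 19 20 21];
    eT = zeros(size(counts));  eE = zeros(size(counts));
    for k = 1:numel(counts)
        [T, E] = specularityRemoval(video, counts(k));
        eT(k) = sum(abs(T(:)));          % energy in the transmission layer
        eE(k) = sum(abs(E(:)));          % energy in the reflection layer
    end
    plot(counts, eT, '-o', counts, eE, '-s')
    xlabel('Number of input frames'), ylabel('Layer energy')
    legend('Transmission', 'Reflection')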
4.4 Planar Homography Transformation
The planar homography is a geometric transformation with 8 degrees of freedom. It arises in two cases [35].
Case 1: Images of a plane viewed under arbitrary camera motion.
Case 2: Images of an arbitrary 3D scene viewed by a camera rotating and/or zooming about its optic center.
These two cases are illustrated in the Figure 4.2.
Figure 4.2 Planar Homography transformation
The points in a planar homography are mapped as shown in Figure 4.3. A point (x,y) is represented in homogeneous coordinates as (x,y,1); its corresponding point (x1, x2, x3) represents the point (x1/x3, x2/x3).
Figure 4.3 Mapping of points during planar homography
Equivalently, X′ = HX, where H is a 3×3 non-singular homogeneous matrix. The non-homogeneous coordinates are given in Figure 4.4.
Figure 4.4 Non-homogeneous coordinates
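A minimal MATLAB illustration of this mapping, with an arbitrary example matrix H:

    % Map a point through a 3x3 homography: X' = H X (example values only).
    H = [ 1.10  0.02   5;
          0.01  0.95  -3;
          1e-4  2e-4   1 ];
    X  = [40; 25; 1];            % point (40,25) in homogeneous coordinates
    Xp = H*X;                    % homogeneous image of the point
    xy = Xp(1:2)/Xp(3);          % non-homogeneous result (x1/x3, x2/x3)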
4.4.1 RANSAC Method for Homography Estimation
In this work, the RANSAC method is used to estimate the homography matrix for all the frames. The RANSAC procedure has the following steps.
Step 1. Extract features
Step 2. Compute a set of potential matches
Step 3. do
Step 3.1 Select a minimal sample (4 matches suffice for a homography) (generate hypothesis)
Step 3.2 Compute the solution(s) for H
Step 3.3 Determine the inliers (verify hypothesis)
until a large enough set of the matches are inliers
Step 4. Compute H based on all inliers
Step 5. Look for additional matches
Step 6. Refine H based on all correct matches
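In MATLAB this hypothesize-and-verify loop is available off the shelf; the sketch below uses SURF features on two grayscale frames frame1 and frame2, which is an illustrative choice (estimateGeometricTransform implements MSAC, a RANSAC variant).

    % Estimate a projective homography between two frames with RANSAC/MSAC.
    p1 = detectSURFFeatures(frame1);   p2 = detectSURFFeatures(frame2);
    [f1, v1] = extractFeatures(frame1, p1);
    [f2, v2] = extractFeatures(frame2, p2);
    pairs = matchFeatures(f1, f2);                       % potential matches
    m1 = v1(pairs(:,1));   m2 = v2(pairs(:,2));
    [tform, in1, in2] = estimateGeometricTransform(m1, m2, 'projective');
    H = tform.T;                                         % 3x3 homography matrix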
Figure 4.5 shows an example of a projective homography transformation, where (x2,y2) is the transformed point corresponding to (x1,y1).
Figure 4.5 Projective transformation Example
4.4.2 Limitations of the Homography Transformation
When an image is warped under a projective transform, the smooth edges of the original image turn into jagged edges, as shown in Figure 4.6.
Figure 4.6 Jagged edges due to Projective transform
The left side shows the original image; the right side is the warped image, which is prone to jagged edges. These jagged edges need to be smoothed with an appropriate filter, and a Gaussian filter is the most suitable: in MATLAB, imgaussfilt(A,sigma) filters image A with a 2-D Gaussian smoothing kernel whose standard deviation is sigma. Another way to smooth the edges is to apply an edge detector (Sobel, Canny, Prewitt, Roberts, etc.) with a threshold, dilate the detected edges, apply Gaussian filtering to that image and finally replace the edges with the smoothed ones. The former is faster and simpler to implement, so it is used in this work.
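A sketch of this smoothing step, with an illustrative sigma value:

    % Warp a frame with the estimated homography, then smooth the jagged
    % edges introduced by the projective transform.
    warped   = imwarp(frame, tform);          % projective warp (jagged edges)
    sigma    = 1.5;                           % illustrative smoothing strength
    smoothed = imgaussfilt(warped, sigma);    % 2-D Gaussian filter, as in the text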
4.5 Final Image to Video Sequence Conversion
This is the last step in the flowchart. The input to this stage is the 250 specularity-free images obtained from Algorithm 3, which are converted to a video sequence using Algorithm 4.
Algorithm 4: Images to Video Sequence
Input: 250 frames output from Algorithm 3
Create a VideoWriter object and specify its frame rate = 30
Set the time for each image = 1 sec
Open the VideoWriter
for i = 1 to 250 do
    convert image i to a frame
    write this frame to the video
end
Close the writer object
Play the video sequence
Output: A video sequence with the name specified in the creation step
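A direct MATLAB sketch of Algorithm 4, assuming the 250 specularity-free images are held in a cell array frames and using an illustrative output file name:

    % Algorithm 4 sketch: assemble the specularity-free frames into a video.
    writer = VideoWriter('specular_free.avi');   % name specified at creation
    writer.FrameRate = 30;
    open(writer);
    for i = 1:250
        writeVideo(writer, im2uint8(frames{i})); % convert image i and write it
    end
    close(writer);
    implay('specular_free.avi');                 % play the resulting video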
5. Quantity Metrics
The amount of specularity in a video is determined by comparing the video before specularity removal with the video obtained after specularity removal. The quantity metric is computed on a frame-by-frame basis: the difference between corresponding frames of the two videos is taken, which is the specularity component of that frame, and the energy of the difference frame is then calculated by summing the intensity values at every location.
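A minimal MATLAB sketch of this metric, assuming the corresponding frames of the two videos are held in cell arrays before and after, and using the absolute difference as the per-pixel intensity of the specularity component:

    % Quantity metric: per-frame specularity energy and its mean over the video.
    n = numel(before);
    energy = zeros(n, 1);
    for i = 1:n
        d = im2double(before{i}) - im2double(after{i});  % specularity component
        energy(i) = sum(abs(d(:)));                      % energy of the difference
    end
    videoScore = mean(energy);             % single value for the whole video
    plot(energy), xlabel('Frame'), ylabel('Specularity energy')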
Figure 5.1 shows the amount of specularity for all the acquired videos on a frame-by-frame basis.
Figure 5.1 Quantity metrics for Specularity Measurement Frame-by-Frame basis
The mean value over all frames is taken to obtain a single value per video. Figure 5.2 shows the amount of specularity for the whole video.
Figure 5.2 Quantity metrics for Specularity measurement
To examine the amount of specularity in each frame in more depth, a GUI displaying the frame and the graph simultaneously was developed. Figure 5.3 shows the developed GUI.
Figure 5.3 GUI for Specularity measurement (Low Specularity)
The GUI has a panel for displaying the video frames, a pop-up menu for selecting a video acquired with a hand-held device, and axes displaying a graph of the amount of specularity in each frame. The color of the graph depicts the amount of specularity present: blue represents frames of low specularity and red represents frames of high specularity. Figure 5.4 shows a frame having high specularity.
Figure 5.4 GUI for Specularity measurement (High Specularity)
6. Experimental Results
This chapter covers the experimental results, the quality evaluation metrics and the implementation details of this work. The method is evaluated on high-quality videos; Figure 6.1 displays the results for the video sequences of a book placed on an opaque surface, a photo frame and a book under high illumination, respectively. The region of interest is bounded by the yellow boxes in all frames.
Figure 6.1 Visual display of results (a) Original Frame – Y (b) Transmission layer – T and (c) Specular Reflection Layer – E recovered by our method for different video sequences
6.1 Quality Evaluation Metrics
The amount of specularity can be determined from the energy in the reflection layer, which is the sum of the intensity values at each pixel. Since there are no prior papers on specularity removal from video sequences, the comparison of the various algorithms is performed on images. A color image can be decomposed into a transmission layer and a reflection layer; the reflection layer is derived from the original image by subtracting the transmission layer from it.
For evaluation purposes, three different methods are compared with ours:
- Chromaticity-based separation of reflection components in a single image [21]
- Single Image Layer Separation using Relative Smoothness [36]
- Exploiting Reflection Change for Automatic Reflection Removal [37]
The amount of specularity before and after specularity removal is plotted for all the above-mentioned methods. This plot reveals that the proposed method is superior to the other methods in terms of performance.
Figure 6.2 Quality Metrics for Amount of Specularity
6.2 Implementation Details
In Algorithm 3, the parameters are λ1 = 0.3/√(wh), λ2 = 50/√(wh), λ3 = 1/√(wh), λ4 = 5/√(wh), λ5 = 50/√(wh) and λ6 = 50/√(wh), where w and h are the width and height of the region of interest. The specularity removal algorithm is applied to the Y channel; after the region of interest has been replaced by the specular-free result, the Y, U and V channels are recombined to form the original video sequence without specularity. The experiments are conducted in MATLAB on a PC running 32-bit Windows 10 with an Intel Core i5 processor, a 1 TB hard disk and 8 GB of RAM. The main advantages of the algorithm over state-of-the-art algorithms are that it preserves illumination changes and temporal coherence while minimizing computational complexity. Figure 6.3 compares the computation time of the algorithm presented in this work (using the Y channel) with that of the state-of-the-art algorithms (using the RGB channels).
Figure 6.3 Computational time for different number of frames for Y channel and RGB channel
7. Conclusion
This thesis has proposed a novel and robust method for specularity removal from hand-held video. Hand-held videos are prone to flickering and illumination changes; to deal with this, a noise term is included in the problem formulation to preserve the smoothness of the video, and automatic selection of the region of interest for all frames is introduced. Unlike existing solutions for images, which operate on the RGB channels, this work evaluates the algorithm on the Y channel.
The experimental results show that, compared with state-of-the-art algorithms, the proposed algorithm achieves higher accuracy and speed.
7.1 Future Scope
The best 15 frames are to be selected for efficient and effective specularity removal; the frames are selected such that there is very little correlation between them. Using the best 15 frames in the above algorithm, the video obtained appears to be of the same quality (to the human eye and by the devised quantity metric) as the video sequence obtained using frames 1:17:250. Every frame of the acquired video must contain the four-sided polygonal region of interest, since the correspondence between frames is exploited to remove the highlights. In the future, this algorithm and the quantity metric could be extended to videos that have no such correspondences.
Bibliography
[1] Biometrics.idealtest.org. (2017). Biometrics Ideal Test. [online] Available at: “CASIA-FaceV5, http://biometrics.idealtest.org/” [Accessed 9 Mar. 2017].
[2] Jorge Bernal, F. Javier Sánchez, Cristina Rodríguez de Miguel and Gloria Fernández-Esparrach. Building up the Future of Colonoscopy – A Synergy between Clinicians and Computer Scientists.
[3] Klinker, G.J., Shafer, S.A. and Kanade, T., 1987, June. Using a color reflection model to separate highlights from object color. In Proc. ICCV (Vol. 87, pp. 145-150).
[4] Klinker, G.J., Shafer, S.A. and Kanade, T., 1988. The measurement of highlights in color images. International Journal of Computer Vision, 2(1), pp.7-32.
[5] Schlüns, K. and Teschner, M., 1995, January. Fast separation of reflection components and its application in 3d shape recovery. In Color and Imaging Conference (Vol. 1995, No. 1, pp. 48-51). Society for Imaging Science and Technology.
[6] Schluns, K. and Koschan, A., 2000, October. Global and local highlight analysis in color images. In Proc. 1st Int. Conf. Color Graphics Image Processing (pp. 300-304).
[7] Bajcsy, R., Lee, S.W. and Leonardis, A., 1996. Detection of diffuse and specular interface reflections and inter-reflections by color image segmentation. International Journal of Computer Vision, 17(3), pp.241-272.
[8] Tan, R.T. and Ikeuchi, K., 2005. Separating reflection components of textured surfaces using a single image. IEEE transactions on pattern analysis and machine intelligence, 27(2), pp.178-193.
[9] Tan, R.T. and Ikeuchi, K., 2005. Illumination color and intrinsic surface properties physical-based color analysis from a single image. Transactions of Information Processing Society of Japan 46 (2005), 17–40.
[10] Yoon, K.J., Choi, Y. and Kweon, I.S., 2006, October. Fast separation of reflection components using a specularity-invariant image representation. In Image Processing, 2006 IEEE International Conference on (pp. 973-976). IEEE.
[11] Mallick, S.P., Zickler, T., Belhumeur, P.N. and Kriegman, D.J., 2006, May. Specularity removal in images and videos: A PDE approach. In European Conference on Computer Vision (pp. 550-563). Springer Berlin Heidelberg.
[12] Mallick, S.P., Zickler, T., Belhumeur, P. and Kriegman, D., 2006, July. Dichromatic separation: specularity removal and editing. In ACM SIGGRAPH 2006 Sketches (p. 166). ACM.
[13] Quan, L. and Shum, H.Y., 2003, October. Highlight removal by illumination-constrained inpainting. In Computer Vision, 2003. Proceedings. Ninth IEEE International Conference on (pp. 164-169). IEEE.
[14] Tappen, M.F., Freeman, W.T. and Adelson, E.H., 2005. Recovering intrinsic images from a single image. IEEE Trans. Pattern Anal. Mach. Intell., 27(9), pp.1459-1472.
[15] Angelopoulou, E., 2007, October. Specular highlight detection based on the Fresnel reflection coefficient. In Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on (pp. 1-8). IEEE.
[16] Lin, S., Li, Y., Kang, S.B., Tong, X. and Shum, H.Y., 2002, May. Diffuse-specular separation and depth recovery from image sequences. In European conference on computer vision (pp. 210-224). Springer Berlin Heidelberg.
[17] Feris, R., Raskar, R., Tan, K.H. and Turk, M., 2004, October. Specular reflection reduction with multi-flash imaging. In Computer Graphics and Image Processing, 2004. Proceedings. 17th Brazilian Symposium on (pp. 316-321). IEEE.
[18] Agrawal, A., Raskar, R., Nayar, S.K. and Li, Y., 2005. Removing photography artifacts using gradient projection and flash-exposure sampling. ACM Transactions on Graphics (TOG), 24(3), pp.828-835.
[19] Shen, H.L. and Zheng, Z.H., 2013. Real-time highlight removal using intensity ratio. Applied optics, 52(19), pp.4483-4493.
[20] Shen, H.L. and Cai, Q.Y., 2009. Simple and efficient method for specularity removal in an image. Applied optics, 48(14), pp.2711-2719.
[21] Shen, H.L., Zhang, H.G., Shao, S.J. and Xin, J.H., 2008. Chromaticity-based separation of reflection components in a single image. Pattern Recognition, 41(8), pp.2461-2469.
[22] Gai, K., Shi, Z. and Zhang, C., 2012. Blind separation of superimposed moving images using image statistics. IEEE transactions on pattern analysis and machine intelligence, 34(1), pp.19-32.
[23] Peng, Y., Ganesh, A., Wright, J., Xu, W. and Ma, Y., 2012. RASL: Robust alignment by sparse and low-rank decomposition for linearly correlated images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(11), pp.2233-2246.
[24] Guo, X., Cao, X. and Ma, Y., 2014. Robust separation of reflection from multiple images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 2187-2194).
[25] Color space. (2017, March 7). In Wikipedia, The Free Encyclopedia. Retrieved 06:00, March 9, 2017, from https://en.wikipedia.org/w/index.php?title=Color_space&oldid=769021088.
[26] Gonzalez, Rafael C and Richard E Woods. Digital Image Processing. 3rd ed. Reading, Mass.: Addison-Wesley, 1998. Print.
[27] “Color Conversion – Equasys Gmbh”. Equasys.de. N.p., 2017. Web. 9 Mar. 2017, from http://www.equasys.de/colorconversion.html.
[28] Lucas, B.D. and Kanade, T., 1981. An iterative image registration technique with an application to stereo vision.
[29] Carlo Tomasi and Takeo Kanade. Detection and Tracking of Point Features. Carnegie Mellon University Technical Report CMU-CS-91-132, April 1991.
[30] Jianbo Shi and Carlo Tomasi. Good Features to Track. IEEE Conference on Computer Vision and Pattern Recognition, pages 593-600, 1994.
[31] Stan Birchfield. Derivation of Kanade-Lucas-Tomasi Tracking Equation. Unpublished, January 1997.
[32] Lin, Z., Chen, M., Wu, L. and Ma, Y., 2009. The augmented Lagrange multiplier method for exact recovery of corrupted low-rank matrices. Technical Report UILU-ENG-09-2215, UIUC.
[33] Szeliski.org. (2017). Computer Vision: Algorithms and Applications. [online] Available at: http://szeliski.org/Book/ [Accessed 9 Mar. 2017].
[34] Baker, S. and Matthews, I., 2004. Lucas-kanade 20 years on: A unifying framework. International journal of computer vision, 56(3), pp.221-255.
[35] Capel, D. and Zisserman, A., 2003. Computer vision applied to super resolution. IEEE Signal Processing Magazine, 20(3), pp.75-86.
[36] Li, Y. and Brown, M.S., 2014. Single image layer separation using relative smoothness. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 2752-2759).
[37] Li, Y. and Brown, M.S., 2013. Exploiting reflection change for automatic reflection removal. In Proceedings of the IEEE International Conference on Computer Vision (pp. 2432-2439).
[38] “Matching with Invariant Features”, Darya Frolova, Denis Simakov, The Weizmann Institute of Science, March 2004.