
Robust Specularity Removal from Hand-held Videos

Info: 11919 words (48 pages) Dissertation
Published: 13th Dec 2019


Abstract

Specular reflection exists when one tries to record a photo or video through a transparent glass medium or opaque surfaces such as plastics, ceramics, polyester and human skin, which can be well described as the superposition of the transmitted layer and the reflection layer. These specular reflections often confound the algorithms developed for image analysis, computer vision and pattern recognition. To obtain a pure diffuse reflection component, specularity (highlights) needs to be removed. To handle this problem, a novel and robust algorithm is formulated. The contributions of this work are three-fold.

First, the smoothness of the video along with the temporal coherence and illumination changes are preserved by reducing the flickering and jagged edges caused by hand-held video acquisition and homography transformation respectively.

Second, this algorithm is designed to be more potent than the state of art algorithms by introducing automaticity in the selection of the region of interest for all the frames, reducing the computational time and complexity by utilizing the luminance (Y) channel and exploiting the Augmented Lagrange Multiplier (ALM) with Alternating Direction Minimizing (ADM) for the decomposition of the objective function which is formulated by applying the structural priors.

Third, a quantity metrics is devised, which objectively quantifies the amount of specularity in each frame of a hand-held video. The proposed specularity removal algorithm is compared against the existing state of art algorithms using the newly-developed quantity metrics. Experimental results demonstrate that the developed algorithm has superior performance in terms of computation time, quality and accuracy.

Table of Contents

List of Tables

List of Algorithms

1. Introduction
   1.1 Importance of Specularity Removal
   1.2 Thesis Framework
   1.3 Video Data Collection
   1.4 Challenges in Separation
   1.5 Contributions
   1.6 Thesis Outline

2. Background
   2.1 Illumination
   2.2 Surface Reflectance
       2.2.1 Specular Reflection
       2.2.2 Diffuse Reflection
   2.3 Color and Color Space
       2.3.1 RGB Color Space
       2.3.2 YUV Color Space
   2.4 RGB to YUV Conversion
   2.5 YUV to RGB Conversion

3. Automatic ROI Detection
   3.1 Harris Corner Detection
       3.1.1 Mathematical Formulation
       3.1.2 Limitations of the Harris Corner Detector
   3.2 Tracking using the KLT Algorithm
       3.2.1 Template Image T(x)
       3.2.2 Optimization Problem and Solution Algorithm

4. Specularity Removal Algorithm
   4.1 Problem Formulation
       4.1.1 First Structural Prior
       4.1.2 Second Structural Prior
   4.2 Optimization Problem
   4.3 Solution Algorithm
       4.3.1 Solution Algorithm for Specularity Removal
   4.4 Planar Homography Transformation
       4.4.1 RANSAC Method for Homography Estimation
       4.4.2 Limitations of the Homography Transformation
   4.5 Final Image to Video Sequence Conversion

5. Quantity Metrics

6. Experimental Results
   6.1 Quality Evaluation Metrics
   6.2 Implementation Details

7. Conclusion
   7.1 Future Scope

Bibliography

List of Figures

Figure 1.1 Specular reflections through glass, an opaque surface [19], human skin [1] and the iris respectively

Figure 1.2 (a) Colonoscope-generated specular highlight [2], (b) Iris detection, (c) Number plate recognition

Figure 1.3 Main schema of the thesis

Figure 1.4 (a) Book placed on an opaque surface, (b) Photo frame, (c) Book in a high-illumination condition

Figure 2.1 Natural and artificial sources of illumination

Figure 2.2 Bidirectional Reflectance Distribution Function

Figure 2.3 Mechanism of surface reflection

Figure 2.4 Specular reflection and diffuse reflection

Figure 2.5 (a) Additive color mixing, (b) Subtractive color mixing [25]

Figure 2.6 RGB color space

Figure 2.7 R, G, B channels for an image

Figure 2.8 (a) Original RGB image, (b) Y channel, (c) U channel, (d) V channel

Figure 3.1 Basic idea of a "flat", an "edge" and a "corner"

Figure 3.2 Classification of points based on eigenvalues

Figure 3.3 Corner response map

Figure 3.4 Tracking a point (x, y)

Figure 3.5 Distinct warp functions for different transformations [33]

Figure 3.6 Illustration of Algorithm 1

Figure 4.1 Number of input frames versus energy in the transmission layer and the reflection layer

Figure 4.2 Planar homography transformation

Figure 4.3 Mapping of points during planar homography

Figure 4.4 Non-homogeneous coordinates

Figure 4.5 Projective transformation example

Figure 4.6 Jagged edges due to the projective transform

Figure 5.1 Quantity metrics for specularity measurement on a frame-by-frame basis

Figure 5.2 Quantity metrics for specularity measurement

Figure 5.3 GUI for specularity measurement (low specularity)

Figure 5.4 GUI for specularity measurement (high specularity)

Figure 6.1 Visual display of results: (a) original frame Y, (b) transmission layer T, and (c) specular reflection layer E recovered by our method for different video sequences

Figure 6.2 Quality metrics for amount of specularity

Figure 6.3 Computational time for different numbers of frames for the Y channel and the RGB channels

List of Algorithms

Algorithm 1 Automatic detection of the ROI of the first frame

Algorithm 2 ROI Selection

Algorithm 3 Specularity Removal

Algorithm 4 Images to Video Sequence

1. Introduction

1.1 Importance of Specularity Removal

The algorithms developed for computer vision, image analysis and pattern recognition assume that the surfaces of objects are purely diffuse. When recording a photo or video through a transparent surface or from an opaque one, part of the image or video can be occluded by highlights, and these algorithms become erroneous. To obtain the pure diffuse reflection, the specularity needs to be accurately removed.

Figure 1.1 shows the specular reflections that arise when capturing an image through or from inhomogeneous materials such as human skin, glass, plastics and opaque surfaces.


Figure 1.1 Specular reflections through glass, opaque surface [19], human skin [1] and iris respectively.

Due to the growth of computer vision applications, many specular reflection separation algorithms have been formulated. Specularity elimination requires no camera calibration or other a priori information about the scene.

Figure 1.2 shows some of the domains in which specular reflection obstructs the computer vision tasks.


Figure 1.2 (a) Colonoscope generated specular highlight [2], (b) Iris detection, (c) Number plate recognition

 

1.2 Thesis Framework

Figure 1.3 shows the process schema of this thesis, which proposes a way to separate specular reflections from a video. Given an input video sequence, the end of the process yields an output video sequence free from specularity. To achieve this goal, the input video sequence passes through several stages, as shown in the flowchart. Specular reflection separation has mostly been performed on either single images or image sequences, and it is a highly ill-posed problem, since the number of unknowns to be recovered is twice the number of given inputs.


Figure 1.3 Main schema of the thesis

 

 

1.3 Video Data Collection

Since specular reflection separation from video is a novel idea, no datasets were available. The video data were therefore collected using an iPhone 6S, in .MOV format, and recorded in our university library. Figure 1.4 shows the collected video data; for the reader's understanding, two frames from each video are displayed.


Figure 1.4 (a) Book placed on an opaque surface, (b) Photo frame, (c) Book in high illumination condition

 

1.4 Challenges in Separation

Many algorithms have been formulated to separate the transmission layer and the reflection layer; they can be grouped into two classes:

  1. Single-Image Methods
  2. Multi-Image Methods

Single-Image Methods: These methods use only a single image to separate the layers. They do not require several images of the same scene taken from different angles, which saves time and effort in the image acquisition process. Papers based on these methods fall into two categories according to their approach.

The first category is color space analysis, in which analysis of a color space aids the separation of highlights from object colors. Klinker et al. [4] classified color pixels as matte (evincing only body reflection), highlight (both body and specular reflection) or clipped (behaving as a specularity when the reflected light exceeds the dynamic range of the camera) [3]. The two-dimensional diagram approach, which is superior in terms of computation time, operates mainly in UV-space: the RGB data are first converted to UV-space and then to h-space, and the h-space is processed with a morphological filter [5, 6]. In the approach of Bajcsy et al., a novel S color space was introduced [7]. Together, the Klinker et al. approach, the two-dimensional diagram approach and the Bajcsy et al. approach constitute color space analysis.

The second category is neighborhood analysis, in which local operations are performed and color information is used in the separation process. The specular-free image approach, PDE approaches, inpainting techniques, color information with a classifier, and Fresnel reflection coefficient methods all deploy neighborhood analysis. The specular-free image approach generates a pseudo-diffuse component (specular-free image) [8, 9, 10, 20, 21] and then categorizes the pixels as specular or diffuse; if a pixel is labelled specular, a specular-to-diffuse algorithm is applied to produce a more diffuse component. Hui-Liang Shen and Zhi-Huan Zheng [19] proposed an efficient method that generates a pseudo-chromaticity image and then categorizes pixels using a chromaticity threshold (Tc) and a percentile threshold (Tp). In PDE approaches, the main idea is to iteratively erode the specular component at each pixel [11, 12]. Inpainting techniques build on the process of reconstructing lost or deteriorated parts of an image; Tan et al. [13] observed that highlight pixels contain useful information for guiding the inpainting process. Using color information and a classifier, Tappen et al. [14] assumed the input image to be the product of shading and reflectance images and applied a logarithm to both sides to make the model additive; the derivatives are then classified by a grey-scale classifier into diffuse and reflection components. Specularity detection based on the Fresnel reflection coefficient [15] builds on the observation that the Fresnel coefficient changes with wavelength, which affects the color of a specular highlight, so a segmentation step is performed using the mean-shift segmentation algorithm.

Multi-Image Methods: These methods use the information contained in an image sequence of the same scene captured from different angles or under varying illumination. Lin et al. [16] presented a technique that uses a sequence of images to identify specular pixels by color histogram differences and then finds stereo correspondences for those pixels; once identified, the specular pixels are removed by associating them with their corresponding diffuse points in other images. Multi-flash methods are exploited by Feris et al. [17] and Agarwal et al. [18], which use multiple snapshots of the same scene from the same viewpoint while varying an input variable, namely the flash position.

Although many algorithms have been formulated, they are limited by the conditions of their applicability. Most techniques rely on a specific reflection model and assume that the specularity varies negligibly with wavelength. Sparse blind separation algorithms with spatial shifts [22, 11] assume the motions are uniform translations. RASL [23] has the drawback that the visual quality of the layers is not guaranteed. Xiaojie Guo et al. [24] proposed a promising method for decomposing superimposed images, but it fails when the superimposed image is blurred.

1.5 Contributions

A robust specular reflection separation method requires that all specular reflection components are removed and the diffuse reflection part is recovered, with an algorithm that has lower computational complexity, an optimized implementation and greater accuracy than the state-of-the-art algorithms.

The contribution of this work is to extend the algorithms prevailing for images or image sequences to a video with camera motion and illumination changes. The algorithm is formulated to have lower computational complexity and higher accuracy.

Since a video contains more than 250 frames, the optimal number of frames, and the best frames to work on, must be found. This is determined using the energy in the transmitted layer, the energy in the reflection layer and the computation time for different numbers of frames (5, 7, 10, 12, 14, 15, 16, 17, 18, 19, 20 and 21). A threshold is reached showing that 15 frames are the optimal number for accurately separating the specular reflection from the given video sequence.

At the end of the thesis, the best 15 frames are determined, which effectively and efficiently separate the specular reflection in the video sequence, while the output video appears to be of the same quality (to the human eye and by the devised quantity metrics) as the video sequence obtained using frames 1:17:250. Every frame of the acquired video should contain the four-sided polygonal region of interest, since the correspondence between frames is exploited to remove the highlights. In future, this algorithm and the quantity metrics can aid videos that have no correspondences.

1.6 Thesis outline

The thesis is outlined in the flowchart of Figure 1.3. Chapter 2 provides the basic physical and mathematical background, discussing specular reflection, diffuse reflection, color and color spaces (mainly the RGB-to-YUV and YUV-to-RGB conversion algorithms). Chapters 3 to 5 delineate the proposed method for specular reflection separation: Chapter 3 presents the automatic detection and tracking of the region of interest; Chapter 4 presents the core algorithm for separating the layers (superimposed image decomposition) in detail, along with the homography transformation, its limitations and the final image-sequence-to-video conversion; Chapter 5 covers the novel quantity metrics and the GUI developed for side-by-side evaluation of specularity. Chapter 6 reports the experimental results, implementation details and quality evaluation metrics. Chapter 7 concludes the work by presenting its limitations and the scope for future research in this direction.

2. Background

This chapter provides the basic physical and mathematical background for this thesis. Sec. 2.1 covers illumination, the main influence on computer vision; on it depends the surface reflectance explained in Sec. 2.2, which has two major classes, specular reflection (Sec. 2.2.1) and diffuse reflection (Sec. 2.2.2). Sec. 2.3 introduces color and color spaces, with Sec. 2.3.1 on the RGB color space and Sec. 2.3.2 on the YUV color space. The RGB-to-YUV and YUV-to-RGB conversion algorithms are then explained in Sec. 2.4 and Sec. 2.5 respectively.

 

2.1 Illumination

Illumination is defined as the amount of source light incident on the scene. A scene without illumination renders a vision system useless. There are two types of illumination: natural illumination provided by the sun, which varies with the time of day, and artificial illumination from artificial light sources. The illumination of a scene is denoted i(x, y) and satisfies 0 < i(x, y) < ∞.


Figure 2.1 Natural and artificial sources of illumination

2.2 Surface Reflectance

The intensity of an image depends on illumination and reflectance. Reflectance is the fraction of light reflected by the object in the scene, denoted r(x, y):

f(x, y) = i(x, y) r(x, y)

where 0 < i(x, y) < ∞ and 0 < r(x, y) < 1, with 0 corresponding to total absorption and 1 to total reflectance, and f(x, y) is the image intensity.

Figure 2.2 shows a basic case of surface reflectance, where a beam of light strikes a surface at an angle Θi with respect to the normal and the outgoing beam leaves at an angle Θr.


Figure 2.2 Bidirectional Reflectance Distribution Function1

1https://www.cs.cmu.edu/afs/cs/academic/class/15462-f09/www/lec/lec8.pdf

Two main points can be considered

  1. The outgoing beam does not have the same energy and wavelength as that of the incoming beam.
  2. Θi may or may not be equal to Θr .

Surfaces are divided into two main classes: specular surfaces and diffuse (Lambertian) surfaces.

Figure 2.3 shows the mechanism of surface reflection; the image intensity is the additive sum of body reflection and surface reflection.


Figure 2.3 Mechanism of Surface Reflection1

 

1https://www.cs.cmu.edu/afs/cs/academic/class/15462-f09/www/lec/lec8.pdf

2.2.1 Specular Reflection

Specular reflection is due to light bouncing off a shiny surface, such as a highly polished mirror, where parallel rays of light bounce off at the same angle. This is the main phenomenon that lets us see our face in a mirror.

2.2.2 Diffuse Reflection

Diffuse reflections occur on rough surfaces such as paper, cloth and asphalt. These reflections obey the laws of reflection, but due to the roughness of the surface the normal varies along the surface; the normals are not parallel as in the case of specular reflection. Diffuse reflection contributes more than specular reflection to identifying an object.


Figure 2.4 Specular Reflection and Diffuse Reflection1

 

1http://help.autodesk.com/

Figure 2.4 shows specular and diffuse reflection and the impact of each on a surface.

2.3 Color and Color Space

Color is an important descriptor for human visual perception. Achromatic light is the light that appears on a black-and-white television, while chromatic light spans approximately 400 to 700 nm of the electromagnetic spectrum. Chromatic colors have hue, saturation and intensity, and are perceived through the stimulation of the cone cells in the human eye [25]. Approximately 65% of all cones are sensitive to red light, 33% to green light and 2% to blue light. On this basis red, green and blue are designated the primary colors. The primary colors of light can be added to produce the secondary colors: magenta (red plus blue), cyan (green plus blue) and yellow (red plus green). Mixing all the primaries produces white light, while mixing all the secondary colors as pigments (subtractive mixing) produces black [26].

Figure 2.5 (a) shows additive color mixing, (b) shows subtractive color mixing.


Figure 2.5 (a) Additive color mixing, (b) Subtractive color mixing [25]

The main purpose of a color space is to specify colors in a standard way. Most color spaces are oriented either toward hardware (color monitors and printers) or toward applications that require color manipulation (such as computer graphics for animation) [26]. Innumerable color spaces exist and are chosen according to their application:

  1. RGB color space for color monitors and video cameras
  2. CMY and CMYK for color printing
  3. HSI in human perception
  4. YUV for digital television

2.3.1 RGB Color Space

This color space is based on a Cartesian coordinate system. In this model each color appears as a combination of the three primary colors (red, green and blue). Figure 2.6 shows the RGB color space. Red, green and blue are at three corners of the cube; cyan, magenta and yellow occupy three other corners. White is at the coordinate (1, 1, 1) and black is at the origin (0, 0, 0).


Figure 2.6 RGB Color Space1

Images represented in the RGB color space consist of three component images, one for each primary color, which are combined to produce the final image. Figure 2.7 shows the three channels for an image of a bird.

1http://eprints.grf.unizg.hr/2318/1/Z617_Komugovi%C4%87_Ana.pdf


Figure 2.7 R, G, B channels for an Image1

2.3.2 YUV Color Space

In this color space the Y component is the brightness of the color, also known as luminance, while the U and V components represent the color (chroma). The primary advantage of this color space is that some chroma information can be discarded to reduce bandwidth, which is exploited by common video compression standards such as MPEG-2. The chroma subsampling rate used here is 4:2:2. Figure 2.8 shows the YUV channels for an image.

1http://triplelift.com/2013/07/02/the-complexity-of-image-analysis-part-2-colors/
Figure 2.8 (a) Original RGB image, (b) Y channel, (c) U channel, (d) V channel1

1https://en.wikipedia.org/wiki/YUV
 2.4 RGB to YUV Conversion

YUV is often used interchangeably with YCbCr, and different equations are used for the RGB-to-YUV conversion depending on the application. For digital component video the YCbCr color format is used. For SDTV (standard-definition TV) the conversion is given by the following equations.

Y  =  0.257 R + 0.504 G + 0.098 B + 16
Cb = -0.148 R - 0.291 G + 0.439 B + 128
Cr =  0.439 R - 0.368 G - 0.071 B + 128

Here the range of Y is [16, 235] and that of Cb and Cr is [16, 240]. Full-range Y, Cb and Cr are needed in this work. The following equations convert RGB values to full-range YCbCr.

Y  =  0.299 R + 0.587 G + 0.114 B
Cb = -0.169 R - 0.331 G + 0.500 B + 128
Cr =  0.500 R - 0.419 G - 0.081 B + 128

The limited range of values for luminance and chrominance reserves some footroom and headroom, which provides space for overshooting [27]. The offsets [0, 128, 128] are referred to as the clamping values; omitting them causes artifacts such as negative luminance to pop up, e.g. in combination with analog video equipment.

After the conversion from RGB to YUV, the chroma channels are subsampled to obtain YUV 4:2:2. Bicubic interpolation is used to reduce the size of the U and V channels. The Y channel then has the same size as the video frame, while each chroma channel has half the size of the Y channel.
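Assuming the standard full-range BT.601 coefficients, the conversion and the horizontal chroma reduction can be sketched in Python (the thesis implementation is in MATLAB; function names are illustrative, and simple pair-averaging stands in for the bicubic interpolation used in the thesis):

```python
def rgb_to_ycbcr_full(r, g, b):
    """Full-range BT.601 RGB -> YCbCr for 8-bit values."""
    y  =  0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.169 * r - 0.331 * g + 0.500 * b + 128.0
    cr =  0.500 * r - 0.419 * g - 0.081 * b + 128.0
    return y, cb, cr

def subsample_422(channel):
    """Halve a chroma channel horizontally (4:2:2) by averaging pixel pairs.

    The thesis uses bicubic interpolation; averaging is a simpler stand-in.
    """
    return [[(row[2 * i] + row[2 * i + 1]) / 2.0
             for i in range(len(row) // 2)] for row in channel]
```

As a sanity check, white (255, 255, 255) maps to Y = 255 with Cb = Cr = 128, and black maps to (0, 128, 128), matching the clamping values discussed above.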

2.5 YUV to RGB Conversion

This is one of the stages in Figure 1.3, in which the YUV color space is converted back to the RGB color space. For standard-definition TV (SDTV) applications the YUV-to-RGB conversion is given by the following equations.

R = 1.164 (Y - 16) + 1.596 (Cr - 128)
G = 1.164 (Y - 16) - 0.813 (Cr - 128) - 0.391 (Cb - 128)
B = 1.164 (Y - 16) + 2.018 (Cb - 128)

In these equations the Y channel ranges over [16, 235] and the Cb and Cr channels over [16, 240]. For full-range data, the following equations convert full-range YCbCr values to RGB.

R = Y + 1.402 (Cr - 128)
G = Y - 0.344 (Cb - 128) - 0.714 (Cr - 128)
B = Y + 1.772 (Cb - 128)

When converting from the YUV 4:2:2 color space to the RGB color space, the first step is to enlarge the chroma channels (U and V) by a factor of 2 so that they match the size of the Y channel. Bicubic interpolation is employed for this, and the conversion equations are then applied to the YUV channels.

The matrix used for this conversion is the inverse of the matrix used for the RGB-to-YUV conversion in Sec. 2.4. Typically, this full-range color format is used for JPEG images. The processes labelled 2 and 4 in Figure 1.3 employ the full-range conversion.
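Assuming the standard full-range inverse coefficients, the backward conversion can be sketched in Python (illustrative names; the thesis implementation is in MATLAB):

```python
def ycbcr_full_to_rgb(y, cb, cr):
    """Full-range BT.601 YCbCr -> RGB for 8-bit values (inverse of Sec. 2.4)."""
    r = y + 1.402 * (cr - 128.0)
    g = y - 0.344 * (cb - 128.0) - 0.714 * (cr - 128.0)
    b = y + 1.772 * (cb - 128.0)
    return r, g, b
```

For example, the full-range triple (255, 128, 128) maps back to white (255, 255, 255), and (0, 128, 128) maps back to black.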

3. Automatic ROI Detection

The primary step is to automatically determine the region of interest, which is specified by four corner coordinates in a frame. The approach is first to detect the four corner points of the region of interest in the first frame and then to use the KLT tracking algorithm to locate the corner points in all subsequent frames.

To detect the four corner points in the first frame, Harris corner detection was applied to the binary version of the first frame.

3.1 Harris Corner detection

Corners are good feature points because they are stable over changes of viewpoint. They can be identified by examining the intensity values within a window: for a point to be a corner, there must be a large change in intensity along all directions. Figure 3.1 shows the basic idea [38].

Figure 3.1 Basic idea of a “flat”, “edge” and “corner”

3.1.1 Mathematical Formulation

The change of intensity for a shift [u, v] is given as

E(u, v) = Σx,y w(x, y) [ I(x + u, y + v) − I(x, y) ]²

where w(x, y) is the window function and I denotes the image intensity.
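The Harris corner response derived from this formulation, R = det(M) − k·trace(M)², can be sketched in pure Python (the thesis implementation is in MATLAB; the flat 3×3 window, the synthetic test image and all names are illustrative — a practical detector would use a Gaussian window and non-maximum suppression):

```python
def harris_response(img, k=0.04):
    """Harris corner response R = det(M) - k * trace(M)^2 over a 3x3 window.

    img: grayscale image as a list of rows of floats; k = 0.04 is a typical value.
    """
    h, w = len(img), len(img[0])
    Ix = [[0.0] * w for _ in range(h)]
    Iy = [[0.0] * w for _ in range(h)]
    # central-difference gradients
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            Ix[y][x] = (img[y][x + 1] - img[y][x - 1]) / 2.0
            Iy[y][x] = (img[y + 1][x] - img[y - 1][x]) / 2.0
    R = [[0.0] * w for _ in range(h)]
    for y in range(2, h - 2):
        for x in range(2, w - 2):
            a = b = c = 0.0  # structure tensor M = [[a, b], [b, c]]
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    gx, gy = Ix[y + dy][x + dx], Iy[y + dy][x + dx]
                    a += gx * gx
                    b += gx * gy
                    c += gy * gy
            R[y][x] = (a * c - b * b) - k * (a + c) ** 2
    return R
```

On a synthetic image containing a bright square, the response is strongly positive at the square's corners, negative along its edges, and zero in flat regions, matching the "flat"/"edge"/"corner" classification of Figures 3.1 and 3.2.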

        end
        Γ = Γ + ΔΓt
    end
    Save the 15 output images, replacing the input ROIs with the 15 output images (Iaf).
    for i = 1 to 250 do
        Mask out the region of interest from the current frame, leaving the ROI black
        and the background undisturbed (I1).
        Form an image with a black background and the ROI taken from Iaf (I2).
        Compute I3 = I2 ∘ Γi.
        Final image I4 = I1 + I3.
    end
    Combine the Y, U and V channels and convert the image sequence to an RGB video.

Output: Optimal solution (T* = Tt, E* = Et) and the video after specularity removal

The update of the transformation increment is

ΔΓt+1 = argminΔΓ Φ( Z1t , Y ∘ Γ + Σi=1..n Ji ΔΓ εi εiT − Tt+1 − Mt+1 )

      = Σi=1..n Ji† ( Tt+1 + Mt+1 − Y ∘ Γ − Z1t / μt ) εi εiT                (28)

where Ji† represents the Moore–Penrose pseudoinverse of Ji. The Lagrange multipliers are updated as

Z1t+1 = Z1t + μt ( Vt+1 − Tt+1 − Mt+1 )
Z2t+1 = Z2t + μt ( Mt+1 − Et+1 − Nt+1 )
Z3t+1 = Z3t + μt ( Lt+1 − Tt+1 )
Z4t+1 = Z4t + μt ( Pt+1 − D Tt+1 )
Z5t+1 = Z5t + μt ( Qt+1 − D Et+1 )                                           (29)

The inner loop of Algorithm 3 ends when ||Vt+1 − Tt+1 − Mt+1||F ≤ δ ||Y ∘ Γ||F with δ = 10⁻⁶, or when the maximum number of iterations is reached. The outer loop terminates when the change in the objective function value falls below a threshold or the maximum number of iterations is reached.
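The inner-loop stopping test can be written down directly; a pure-Python sketch (matrices as lists of rows; the function names are illustrative, not from the thesis code):

```python
import math

def frob_norm(M):
    """Frobenius norm of a matrix stored as a list of rows."""
    return math.sqrt(sum(v * v for row in M for v in row))

def inner_loop_converged(V, T, M, Y_warped, delta=1e-6):
    """Check || V - T - M ||_F <= delta * || Y o Gamma ||_F."""
    h, w = len(V), len(V[0])
    residual = [[V[i][j] - T[i][j] - M[i][j] for j in range(w)]
                for i in range(h)]
    return frob_norm(residual) <= delta * frob_norm(Y_warped)
```

Scaling the tolerance by ||Y ∘ Γ||F makes the criterion invariant to the overall intensity level of the warped input frames.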

4.3.2 Determining the optimum number of Frames


Figure 4.1 Plot of number of input frames with respect to energy in transmission layer and reflection layer

To determine the optimal number of frames for the specularity removal process, the number of input frames is plotted against the energy in the transmission layer and the reflection layer, as shown in Figure 4.1. For the specularity to be removed accurately, the energy in the transmission layer should be very high and that in the reflection layer very low. In Figure 4.1, N = 15 is the threshold point, beyond which the energy values in the transmission and reflection layers change very little, so 15 frames are chosen for the specularity removal process.

4.4. Planar Homography Transformation

A planar homography is a geometric transformation with 8 degrees of freedom. It arises in two cases [35].

Case 1: Images of a plane viewed under arbitrary camera motion.

Case 2: Images of an arbitrary 3D scene viewed by a camera rotating and/or zooming about its optic center.

These two cases are illustrated in the Figure 4.2.

Figure 4.2 Planar Homography transformation

The points in a planar homography are mapped as shown in Figure 4.3. A point (x, y) is represented in homogeneous coordinates as (x, y, 1); its corresponding point (x1, x2, x3) represents the point (x1/x3, x2/x3).

Figure 4.3 Mapping of points during planar homography

Equivalently X′ = HX, where H is a 3×3 non-singular homogeneous matrix. The non-homogeneous coordinates are given in Figure 4.4.

Figure 4.4 Non-homogeneous coordinates
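The mapping X′ = HX followed by division by the third homogeneous coordinate can be sketched in Python (illustrative helper, not thesis code):

```python
def apply_homography(H, point):
    """Map (x, y) -> (x1/x3, x2/x3) under a 3x3 homography H (list of rows)."""
    x, y = point
    x1 = H[0][0] * x + H[0][1] * y + H[0][2]
    x2 = H[1][0] * x + H[1][1] * y + H[1][2]
    x3 = H[2][0] * x + H[2][1] * y + H[2][2]
    return (x1 / x3, x2 / x3)
```

With a pure translation H the third coordinate stays 1 and the mapping is affine; a nonzero entry in the bottom row (a projective component) makes the division by x3 nontrivial.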

4.4.1 RANSAC method for Homography estimation

In this work, we use the RANSAC method to estimate the homography matrix for all the frames. The RANSAC algorithm has the following steps.

Step 1. Extract features

Step 2. Compute a set of potential matches

Step 3. do

Step 3.1 select a minimal sample (i.e. 4 matches)         (generate hypothesis)

Step 3.2 compute solution(s) for H

Step 3.3 determine inliers (verify hypothesis)

until a large enough set of the matches become inliers

Step 4. Compute H based on all inliers

Step 5. Look for additional matches

Step 6. Refine H based on all correct matches
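The hypothesize-and-verify loop of Steps 3–6 can be sketched in Python. Note this is not the thesis code: a full homography hypothesis needs a minimal sample of 4 matches and a DLT solve, so for brevity the same loop is shown with a pure-translation model (minimal sample of a single match); all names are illustrative:

```python
import random

def ransac_translation(matches, n_iters=200, tol=1.0, seed=0):
    """RANSAC hypothesize-and-verify loop with a translation model.

    matches: list of ((x1, y1), (x2, y2)) putative correspondences.
    Returns the refined (dx, dy) model and its inlier set.
    """
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(n_iters):
        (x1, y1), (x2, y2) = rng.choice(matches)   # minimal sample: 1 match
        dx, dy = x2 - x1, y2 - y1                  # generate hypothesis
        inliers = [m for m in matches              # verify hypothesis
                   if abs(m[1][0] - m[0][0] - dx) <= tol
                   and abs(m[1][1] - m[0][1] - dy) <= tol]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = (dx, dy), inliers
    # refine on all inliers (least squares for a translation = mean offset)
    n = len(best_inliers)
    dx = sum(m[1][0] - m[0][0] for m in best_inliers) / n
    dy = sum(m[1][1] - m[0][1] for m in best_inliers) / n
    return (dx, dy), best_inliers
```

Swapping the translation model for a 4-point DLT homography fit changes only the "generate hypothesis" line; the sampling, inlier counting and refinement structure stay the same.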

Figure 4.5 shows an example of Projective homography transformation, where (x2,y2) is the transformed point for (x1,y1).


Figure 4.5 Projective transformation Example

4.4.2 Limitations of the Homography Transformation

When warping an image to do projective transform, the smooth edges in the original images turn to jagged edges as shown in Figure 4.6.


Figure 4.6 Jagged edges due to Projective transform

The left side shows the original image; the right side shows the warped image, which is prone to jagged edges. These jagged edges need to be smoothed with an appropriate filter, and a Gaussian filter is best suited. In MATLAB, imgaussfilt(A, sigma) filters image A with a 2-D Gaussian smoothing kernel whose standard deviation is sigma. Another method is to apply an edge detector (Sobel, Canny, Prewitt, Roberts, etc.) with a threshold, dilate the edges, apply Gaussian filtering to that image, and finally replace the edges with the smoothed ones. The former approach is faster and simpler to implement, so it is used in this work.
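A separable Gaussian smoothing of this kind can be sketched in pure Python (a rough stand-in for MATLAB's imgaussfilt, assuming replicate-border padding; names are illustrative):

```python
import math

def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian kernel with radius about 3 * sigma."""
    radius = max(1, int(round(3 * sigma)))
    vals = [math.exp(-(i * i) / (2.0 * sigma * sigma))
            for i in range(-radius, radius + 1)]
    total = sum(vals)
    return [v / total for v in vals]

def gaussian_blur(img, sigma):
    """Separable 2-D Gaussian smoothing with replicated borders."""
    k = gaussian_kernel(sigma)
    r = len(k) // 2
    h, w = len(img), len(img[0])
    # horizontal pass, then vertical pass
    horiz = [[sum(k[j + r] * img[y][min(max(x + j, 0), w - 1)]
                  for j in range(-r, r + 1)) for x in range(w)]
             for y in range(h)]
    return [[sum(k[j + r] * horiz[min(max(y + j, 0), h - 1)][x]
                 for j in range(-r, r + 1)) for x in range(w)]
            for y in range(h)]
```

Because the kernel is normalized, flat regions are unchanged, while a hard step (a jagged edge) is replaced by a smooth monotone ramp.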

4.5. Final image to video sequence conversion

This is the last step in the flowchart. The input to this stage is the 250 specularity-free images obtained from Algorithm 3; these 250 images are converted to a video sequence using Algorithm 4.

Algorithm 4: Images to Video sequence

Input: 250 frames output from Algorithm 3

Create a VideoWriter object and specify its frame rate = 30

Set the display time for each image = 1 sec

Open the VideoWriter

for i= 1 to 250 do

convert image to frame

write this frame to a video

end

close the writer object

play the video sequence

Output: A video sequence with the name specified in the creation step
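Algorithm 4 uses MATLAB's VideoWriter; the structure (open a writer, loop over frames, write each one, close the writer) can be mirrored with a minimal stdlib Python sketch that writes raw interleaved RGB bytes. The file name and helper below are illustrative; producing a playable container (e.g. MP4) would require a video library, and the frame rate of a raw stream must be supplied to the player separately.

```python
def write_raw_video(path, frames, width, height):
    # Each frame must be a bytes object of width*height*3 RGB samples.
    # A raw .rgb stream carries no header, so the frame rate (30 fps in
    # Algorithm 4) is metadata the player needs to be told.
    with open(path, 'wb') as writer:              # open the "writer object"
        for frame in frames:                      # for i = 1 to N do
            if len(frame) != width * height * 3:
                raise ValueError('frame size mismatch')
            writer.write(frame)                   # write this frame
    # the with-block closes the writer, as in the last step of Algorithm 4
```

Usage: `write_raw_video('out.rgb', frames, w, h)` after collecting the 250 specular-free frames.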

5. Quantity Metrics

The amount of specularity in a video is determined by comparing the video before specularity removal with the video obtained after specularity removal. The quantity metric is computed on a frame-by-frame basis: the difference between corresponding frames of the two videos is the specularity component of each frame, and the energy of this difference frame is calculated by summing the intensity values at every location.
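The metric can be stated compactly (an illustrative Python/NumPy sketch; the function names are ours, and the absolute value is our choice for robustness when the difference is not guaranteed non-negative):

```python
import numpy as np

def specularity_energy(frame_before, frame_after):
    # Energy of the difference frame: sum of absolute intensity
    # differences over every pixel location.
    diff = np.abs(np.asarray(frame_before, float) - np.asarray(frame_after, float))
    return diff.sum()

def video_specularity(frames_before, frames_after):
    # Per-frame energies (Figure 5.1) and their mean, the single
    # per-video score (Figure 5.2).
    per_frame = [specularity_energy(b, a)
                 for b, a in zip(frames_before, frames_after)]
    return per_frame, sum(per_frame) / len(per_frame)
```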

Figure 5.1 shows the amount of specularity for all the acquired videos on a frame-by-frame basis.

Figure 5.1 Quantity metrics for Specularity Measurement Frame-by-Frame basis

The mean over all frames gives a single value per video. Figure 5.2 shows the amount of specularity for each whole video.


Figure 5.2 Quantity metrics for Specularity measurement

To examine the amount of specularity in each frame more closely, a GUI that displays the frame and the graph simultaneously was developed. Figure 5.3 shows the developed GUI.


Figure 5.3 GUI for Specularity measurement (Low Specularity)

The GUI has a panel that displays the video frames, a pop-up menu to select a video acquired from a hand-held device, and axes that display the graph of the amount of specularity in each frame. The color of the graph depicts the amount of specularity present in the frame: blue represents frames with low specularity and red represents frames with high specularity. Figure 5.4 shows a frame with high specularity.


Figure 5.4 GUI for Specularity measurement (High Specularity)

6. Experimental Results

This chapter covers the experimental results, quality evaluation metrics and implementation details of this work. The method is evaluated on high-quality videos. Figure 6.1 displays the results for video sequences of an opaque surface, a photo frame and a book, respectively, under high illumination. The region of interest is bounded by the yellow boxes in all the frames.


Figure 6.1 Visual display of results (a) Original Frame – Y (b) Transmission layer – T and (c) Specular Reflection Layer – E recovered by our method for different video sequences

 

6.1. Quality evaluation metrics

The amount of specularity can be determined from the energy in the reflection layer, i.e. the sum of the intensity values over all pixels. Since there is no prior work on specularity removal from video sequences, the quality evaluation is performed on images in order to compare the various algorithms. A color image can be decomposed into a transmission layer and a reflection layer; the reflection layer is obtained by subtracting the transmission layer from the original image.

For evaluation, three existing methods are compared with ours:

  1. Chromaticity-based separation of reflection components in a single image [21]
  2. Single Image Layer Separation using Relative Smoothness [36]
  3. Exploiting Reflection Change for Automatic Reflection Removal [37]

The amount of specularity before and after specularity removal is plotted for all the above-mentioned methods. The plot shows that the proposed method outperforms the others.

Figure 6.2 Quality Metrics for Amount of Specularity

6.2. Implementation Details

In Algorithm 3, the parameters are λ1 = 0.3/√(w·h), λ2 = 50/√(w·h), λ3 = 1/√(w·h), λ4 = 5/√(w·h), λ5 = 50/√(w·h) and λ6 = 50/√(w·h), where w and h are the width and height of the region of interest. The algorithm is applied on the Y channel; after the region of interest is replaced with the specular-free result, the Y, U and V channels are recombined to form the original video sequence without specularity. The experiments are conducted in MATLAB on a PC running the Windows 10 32-bit operating system with an Intel Core i5 processor, a 1 TB hard disk and 8 GB of RAM. The main advantage of the algorithm over the state-of-the-art algorithms is that it preserves illumination changes and temporal coherence while minimizing computational complexity. Figure 6.3 compares the algorithm presented in this thesis (using the Y channel) against the state-of-the-art algorithms (using the RGB channels).
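The scaling of the weights with the size of the region of interest can be illustrated with a small Python sketch (the function name and dictionary labels are ours, not the thesis code):

```python
import math

def regularization_weights(w, h):
    # Each weight is a fixed numerator (taken from the thesis) divided
    # by sqrt(w*h), so larger regions of interest get smaller weights.
    s = math.sqrt(w * h)
    numerators = {'l1': 0.3, 'l2': 50, 'l3': 1, 'l4': 5, 'l5': 50, 'l6': 50}
    return {name: num / s for name, num in numerators.items()}
```

This 1/√(w·h) normalization keeps the regularization terms comparable across regions of interest of different sizes.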

Figure 6.3 Computational time for different number of frames for Y channel and RGB channel

7. Conclusion

This thesis has proposed a novel and robust method for specularity removal from hand-held video. Hand-held videos are prone to flickering and illumination changes; to deal with this, a noise term is included in the problem formulation to preserve the smoothness of the video, and the region of interest is selected automatically for all the frames. Unlike the existing solutions for images, which operate on the RGB channels, this thesis evaluates the algorithm on the Y channel.

The experimental results show that, compared to the state-of-the-art algorithms, the proposed algorithm achieves higher accuracy and speed.

7.1. Future Scope

For efficient and effective specularity removal, the best 15 frames are selected such that there is very little correlation between them. Using these 15 frames in the above algorithm, the resulting video appears to be of the same quality (both to the human eye and by the devised quantity metric) as the video sequence obtained by using frames 1:17:250. Every frame of the acquired video must contain the four-sided polygonal region of interest, since the correspondence between frames is exploited to remove the highlights. In the future, this algorithm and the quantity metric could be extended to videos that have no such correspondences.

Bibliography

[1] Biometrics.idealtest.org. (2017). Biometrics Ideal Test. [online] Available at: “CASIA-FaceV5, http://biometrics.idealtest.org/” [Accessed 9 Mar. 2017].

[2] Jorge Bernal, F. Javier Sánchez, Cristina Rodríguez de Miguel and Gloria Fernández-Esparrach. Building up the Future of Colonoscopy – A Synergy between Clinicians and Computer Scientists.

[3] Klinker, G.J., Shafer, S.A. and Kanade, T., 1987, June. Using a color reflection model to separate highlights from object color. In Proc. ICCV (Vol. 87, pp. 145-150).

[4] Klinker, G.J., Shafer, S.A. and Kanade, T., 1988. The measurement of highlights in color images. International Journal of Computer Vision, 2(1), pp.7-32.

[5] Schlüns, K. and Teschner, M., 1995, January. Fast separation of reflection components and its application in 3d shape recovery. In Color and Imaging Conference (Vol. 1995, No. 1, pp. 48-51). Society for Imaging Science and Technology.

[6] Schluns, K. and Koschan, A., 2000, October. Global and local highlight analysis in color images. In Proc. 1st Int. Conf. Color Graphics Image Processing (pp. 300-304).

[7] Bajcsy, R., Lee, S.W. and Leonardis, A., 1996. Detection of diffuse and specular interface reflections and inter-reflections by color image segmentation. International Journal of Computer Vision, 17(3), pp.241-272.

[8] Tan, R.T. and Ikeuchi, K., 2005. Separating reflection components of textured surfaces using a single image. IEEE transactions on pattern analysis and machine intelligence, 27(2), pp.178-193.

[9] Tan, R.T. and Ikeuchi, K., 2005. Illumination color and intrinsic surface properties physical-based color analysis from a single image. Transactions of Information Processing Society of Japan 46 (2005), 17–40.

[10] Yoon, K.J., Choi, Y. and Kweon, I.S., 2006, October. Fast separation of reflection components using a specularity-invariant image representation. In Image Processing, 2006 IEEE International Conference on (pp. 973-976). IEEE.

[11] Mallick, S.P., Zickler, T., Belhumeur, P.N. and Kriegman, D.J., 2006, May. Specularity removal in images and videos: A PDE approach. In European Conference on Computer Vision (pp. 550-563). Springer Berlin Heidelberg.

[12] Mallick, S.P., Zickler, T., Belhumeur, P. and Kriegman, D., 2006, July. Dichromatic separation: specularity removal and editing. In ACM SIGGRAPH 2006 Sketches (p. 166). ACM.

[13] Quan, L. and Shum, H.Y., 2003, October. Highlight removal by illumination-constrained inpainting. In Computer Vision, 2003. Proceedings. Ninth IEEE International Conference on (pp. 164-169). IEEE.

[14] Tappen, M.F., Freeman, W.T. and Adelson, E.H., 2005. Recovering intrinsic images from a single image. IEEE Trans. Pattern Anal. Mach. Intell., 27(9), pp.1459-1472.

[15] Angelopoulou, E., 2007, October. Specular highlight detection based on the Fresnel reflection coefficient. In Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on (pp. 1-8). IEEE.

[16] Lin, S., Li, Y., Kang, S.B., Tong, X. and Shum, H.Y., 2002, May. Diffuse-specular separation and depth recovery from image sequences. In European conference on computer vision (pp. 210-224). Springer Berlin Heidelberg.

[17] Feris, R., Raskar, R., Tan, K.H. and Turk, M., 2004, October. Specular reflection reduction with multi-flash imaging. In Computer Graphics and Image Processing, 2004. Proceedings. 17th Brazilian Symposium on (pp. 316-321). IEEE.

[18] Agrawal, A., Raskar, R., Nayar, S.K. and Li, Y., 2005. Removing photography artifacts using gradient projection and flash-exposure sampling. ACM Transactions on Graphics (TOG), 24(3), pp.828-835.

[19] Shen, H.L. and Zheng, Z.H., 2013. Real-time highlight removal using intensity ratio. Applied optics, 52(19), pp.4483-4493.

[20] Shen, H.L. and Cai, Q.Y., 2009. Simple and efficient method for specularity removal in an image. Applied optics, 48(14), pp.2711-2719.

[21] Shen, H.L., Zhang, H.G., Shao, S.J. and Xin, J.H., 2008. Chromaticity-based separation of reflection components in a single image. Pattern Recognition, 41(8), pp.2461-2469.

[22] Gai, K., Shi, Z. and Zhang, C., 2012. Blind separation of superimposed moving images using image statistics. IEEE transactions on pattern analysis and machine intelligence, 34(1), pp.19-32.

[23] Peng, Y., Ganesh, A., Wright, J., Xu, W. and Ma, Y., 2012. RASL: Robust alignment by sparse and low-rank decomposition for linearly correlated images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(11), pp.2233-2246.

[24] Guo, X., Cao, X. and Ma, Y., 2014. Robust separation of reflection from multiple images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 2187-2194).

[25] Color space. (2017, March 7). In Wikipedia, The Free Encyclopedia. Retrieved 06:00, March 9, 2017, from https://en.wikipedia.org/w/index.php?title=Color_space&oldid=769021088.

[26] Gonzalez, Rafael C and Richard E Woods. Digital Image Processing. 3rd ed. Reading, Mass.: Addison-Wesley, 1998. Print.

[27] “Color Conversion – Equasys Gmbh”. Equasys.de. N.p., 2017. Web. 9 Mar. 2017, from http://www.equasys.de/colorconversion.html.

[28] Lucas, B.D. and Kanade, T., 1981. An iterative image registration technique with an application to stereo vision.

[29] Carlo Tomasi and Takeo Kanade. Detection and Tracking of Point Features. Carnegie Mellon University Technical Report CMU-CS-91-132, April 1991.

[30] Jianbo Shi and Carlo Tomasi. Good Features to Track. IEEE Conference on Computer Vision and Pattern Recognition, pages 593-600, 1994.

[31] Stan Birchfield. Derivation of Kanade-Lucas-Tomasi Tracking Equation. Unpublished, January 1997.

[32] Lin, Z., Chen, M., Wu, L. and Ma, Y., 2009. The augmented Lagrange multiplier method for exact recovery of corrupted low-rank matrices. Technical Report UILU-ENG-09-2215, UIUC.

[33] Szeliski.org. (2017). Computer Vision: Algorithms and Applications. [online] Available at: http://szeliski.org/Book/ [Accessed 9 Mar. 2017].

[34] Baker, S. and Matthews, I., 2004. Lucas-kanade 20 years on: A unifying framework. International journal of computer vision, 56(3), pp.221-255.

[35] Capel, D. and Zisserman, A., 2003. Computer vision applied to super resolution. IEEE Signal Processing Magazine, 20(3), pp.75-86.

[36] Li, Y. and Brown, M.S., 2014. Single image layer separation using relative smoothness. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 2752-2759).

[37] Li, Y. and Brown, M.S., 2013. Exploiting reflection change for automatic reflection removal. In Proceedings of the IEEE International Conference on Computer Vision (pp. 2432-2439).

[38] “Matching with Invariant Features”, Darya Frolova, Denis Simakov, The Weizmann Institute of Science, March 2004.
