Detection of Facial Occlusions by Skin-Color based and Local-Minimum based Feature Extractor
(91 5 ) 1.. 3. i
ii
Abstract The frequently-encountered problems in human face detection and recognition are complex background and occlusion of human face features. In this thesis, we adopt dynamic information of video sequences to resolve the complex background problem. The speed of face detection can be drastically decreased due to the reduced in the size of searching window. In addition to the extraction of face features, we can also determine whether face features are occluded. Two strategies are proposed in this thesis to achieve the goal. The first strategy is to use skin color information by analyzing the skin color distribution in YCbCr color system. Non-skin color regions can be eliminated by utilizing the analyzed skin color distribution information. Then, face features are paired based on the geometric constraints. If the pairing is failure, the face skin color distribution ratios obtained from statistical results are utilized to determine if occlusion do occur. If the first strategy fails in determining whether human faces are occluded, then the second strategy is employed. The second strategy is implemented based on the facts that the gray values of eyes and mouths are lower than the other parts in human faces and there will contain more edge information in the vicinity of eyes and mouths. The candidates of eyes and mouths can be extracted according to the aforementioned two facts. Then, all possible candidates are paired. If the pairing is failure, the determination of occlusion/non-occlusion can be accomplished by judging the distribution of these candidates. Experiments were conducted on various video images. Experimental results verify the feasibility and validity of our proposed approach in determining the occlusion/non-occlusion of face features. iii
YCbCr iv
... v...vii... 1 1.1... 1 1.... 1.3... 3 1.4... 4... 6.1... 7.... 10..1...10.....1.3... 16.3.1...16.3....17... 19 3.1... 19 3.1.1...19 3.1....0 3.1.3...1 3.... 5 v
3..1...5 3.....6 3.3... 7 3.3.1...7 3.4... 8 3.4.1...9... 31 4.1... 33 4.1.1...33 4.1....35 4.... 36 4.3... 37 4.3.1...37 4.3....37... 38 5.1... 38 5.1.1...38 5.1....39 5.1.3...39 5.... 39... 45 6.1... 45 6.... 46... 48 vi
1.1...5.1...6. (a) I i-1 (b) I i (c) (d) (c)...8.3 (a) I t - 1 (b) I t (c) (d) (c) (e) I i (f) (d) (e)...9.4 1....10.5...1.6 (a) (b)sobel (c) (d) (c) (b)...13.7 (a) (b)...14.8 (a)~(g) (h)...15.9...16.10...18 3.1...0 3. (a) (b) (a) (c) (b) (d) (c)...5 3.3 (a) (b) vii
(c) (d)...6 3.4...8 3.5 (a)(b)(c) (d)(e)(f) (g)(h)(i)...30 4.1 (a)(b)(c) (d)(e)...3 4. (a) (b) (c) a/ (d)...34 4.3 (a) (b) (c) (a) (b) (d) (c)...34 4.4...35 4.5 (a) (b) (a) (c) (b)...36 4.6...37 5.1...41 5....4 5.3...43 5.4...43 5.5...44 5.6...44 viii
ix
1
Pentland et al.[1] rincipal component analysis, PCA) Sung Poggio[] Rowley et al.[3] Jeng et al.[4] Sobottka Pitas[5] Chen et al.[6] PCA(principle component analysis) [7][8] SVM(support vector machine) [9][10][11]
1.1 3
4
(1) () (3) (1) () (3) off-line (4) (1) () (3). (4). (5). 1.1 5
.1 (1) () (3).1 6
1.. [1]. D ( x, y) = I ( x, y) I 1( x, y) i i i if Di ( x, y) > threshold, ( x, y) otherwise, ( x, y) backword foreword (.1-1) (.1-) 7
(a) (b) (c). (a) I i-1 (b) I i (c) (d) (c).3 (d) 8
(a) (b) (c) (d) (e) (f).3 (a) I t - 1 (b) I t (c) (d) (c) (e) I i (f) (d) (e) 9
1. 1. x y a + = (1.a ) 1 (.-1) 1..4 1. 10
1 1 1 1 x 1 a x y + = 1 y = b a (1.a ) bx y' = = 1 x 1 a 1 a bx a = 1 x a b a x 4 = 1 x a ( a + b a 4 )x = 1 a x = y = b a + b a b + (.-) ( ±, ± ) + + a a b a b b 11
.5 bp4(x 4,y 4 ) bp1(x 1,y 1 ) (0,0) bp3(x3,y3) bp(x,y).5 y x x = ± 1, ±,..., ± (.-3) = b 1 x bp a y x = a 1 y = ± 1, ±,..., ± bp (.-4) b (.-3) (.-4) 11 pixels 50 pixels 40 y Sobel [13].6 1
Sobel Edge Detection: a 1 a a 3 a 4 a 5 a 6 a 7 a 8 a 9 a < > a b = ( ak bk ) 9 k = 1 b 1 b b 3 b 4 b 5 b 6 b 7 b 8 b 9 b -1 - -1 1 1 S1-1 1 - -1 1 S (x-1,y-1) (x,y-1) (x+1,y-1) (x-1,y) (x,y) (x+1,y) (x-1,y+1) (x,y+1) (x+1,y+1) I3x3 Regular Sobel : G(x,y) = 1 I3x3) + (S I3x3) ( S (.-5) Fast Sobel: G(x,y) = ( S1 I3x3) + (S I3x3) (.-6) (a) (b) (c) (d).6 (a) (b)sobel (c) (d) (c) (b) 13
.7(b) 1 0 M ( x k, y k n m ) = i= 1 j = 1 [ E nxm ( i, j ) I( x k + i, y k + j )], for( x k, y k ) I * M ( xk, yk ) s = max{ } N N 1 ( x k, y k ) S * x 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 1 0 1 1 0 0 0 E 0 0 1 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 1 1 0 1 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 E n x m y I.7 (a) (b) 14
(a) (b) (c) (d) (e) (f) (g).8 (a)~(g) (h) (h) 15
.9 bp4(x 4,y 4 ) bp1(x 1,y 1 ) (0,0) template bp3(x 3,y 3 ) bp(x,y ).9 16
[14][15] 1. NCC (Normalized Cross-Correlation) NCC( x, y) B 1 B 1 j= 0 i= 0 I ( x + i, y + t I t T j) T ( i, j) (.-7). SAD (Sum of Absolute Differences) SAD( x, y) B 1 B 1 j= 0 i= 0 I ( x + i, y + t j) T ( i, j) (.-8) 3. SSD (Sum of Squared Differences) B 1 B 1 SSD ( x, y) ( I ( x + i, y + j) T ( i, j) ) (.-9) j= 0 i= 0 t NCC NCC SAD SSD NCC SSD SAD SAD TemplateScore TotalScore = EllipseScore we + ( 1 ) wt N t (.-10) w e w t N t 4a 4b ( a m ~ ( a + ) ) m a m.10 17
.10 18
19
RGB HSV YCbCr XYZ YIQ YES YCbCr YCbCr YCbCr YUV luminance(y) blueness(cb) redness(cr) YCbCr RGB Y C C b r = 0.57 0.148 0.439 0.504 0.91 0.368 0.098 R 0.439 G 0.071 B + 16 18 18 (3.1-1) YCbCr (look-up table) 1. 3.1 3.1. YCbCr Cb Cr 0
3.(a) 3. YCbCr 3.(b) (Dilation)& (Erosion) A B a=(a 1,a ) b=(b 1,b ) A x=(x 1,x ) (A) x A) x = { c c = a + x, a A} A A c { x x A} ( (3.1-) = (3.1-3) A B A B { x x A x B} A B =, c = A B (3.1-4) A B A B A B = U ( A) (3.1-5) A B b B b 1
A B A B AΘB B = I ( A) (3.1-6) A B A B b B b (4-connected components) r t p : (scanning) p = 0 p = 1 r=t= 0 p r=1 t=0 p r r=0 t=0 p t r=1 t=1 r p r t r=1 t=1 r p r t r : (merging classes) t r
(8-connected components) q r s t p : (scanning) p = 0 p = 1 q=r=s=t= 0 p (q, r, s, t) 1 p 1 (q, r, s, t) 1 p 1 1 : (merging classes) (Convex Hull) Q : TOP(S) : S S NEXT-TO-TOP(S) : S S Graham[16] 1. let p 0 be the point in Q with the minimum y-coordinate, or the leftmost such point in case of a tie. let p 1, p,..., pm be the remaining points in Q, 3
sorted by polar angle in counterclockwise order around p 0 (if more than point has the same angle, remove all but the one that is farthest from p 0 ) 3. top[s] 0 4. PUSH (p 0, S) 5. PUSH (p 1, S) 6. PUSH (p, S) 7. for i 3 to m 8. do while the angle formed by points NEXT-TO-TOP(S), TOP(S), and p i makes a nonleft turn 9. do POP(S) 10. PUSH(S,p i ) 11. return S 3.(c) 3.(d) 4
(a) (b) (c) (d) 3. (a) (b) (a) (c) (b) (d) (c) (3.1-1) YCbCr Cb Cr 5
3.3(b) 3.6(c) 3.6(b) 3.6(d) (a) (b) (c) (d) 3.3 (a) (b) (c) (d) 6
3. C C x j x k m 3 m = x j if max x j x k, = j = 1 j k e 1,e 3.4 1,,3 (3.3-1) m p 7
e 1 m p e m 3.4 m p e1m e e = e e 1 1 (0.5 0.) (3.3-) m p e1 (0.5 + 0.) e e (3.3-3) 1 e 1 e e 1 e 0. m p m ( 1 0.3 ) ( 1 + 0.3 ) (3.3-4) e1e 1:1 0.3 8
3.5 Eye_Region Mouth_Region 15 Eye_Region 0.6585 0.16953 Mouth_Region 0.68707 0.11784 9
3.5 Eye_Region Mouth_Region (g)eye_region: 519/964, Mouth_Region: 04/964 (h)eye_region: 1973/310, Mouth_Region: 1968/310 (i)eye_region: 809/968, Mouth_Region: 60/968 (a) (b) (c) (d) (e) (f) (g) (h) (i) 3.5 (a)(b)(c) (d)(e)(f) (g)(h)(i) 30
(Mean-Value Theorem) f(x) [a, b] (a, b) (a, b) f ( b ) f ( a ) c f ' ( c ) = b a (Bolzano s Theorem) f(x) [a, b] f(a) f(b) < 0 [a, b] c f(c) = 0 f(x) f [a, b] (a, b) (a, b) c f(c) < f(a) f(c) < f(b) (a, b) p f (p) = 0 f(p) (a, b) f ( c ) f ( a ) (a, c) p 1 f ' ( p1 ) = c a f ( b ) f ( c ) (c, b) p f ' ( p ) = b c f(c) < f(a) f(c) < f(b) a < c < b f (p 1 ) < 0 f (p ) > 0 (p 1, p ) 31
p f (p) = 0 f ((p 1 ) < 0 f ((p ) > 0 f(p) (p 1, p ) a < p 1 < p < b f(p) (a, b) 4.1 (a) (b) (c) (d) (e) 4.1 (a)(b)(c) (d)(e) 3
a/ 4. (b+a) 4.3(a) 3a V 4.3(b) 3x3 H V 4.3(c) H 3a ( b+ a ) 3 a ( b+a ) 3 a ( b+ a ) H 3a ( b+ a ) 3a ( b+ a ) C 3a ( b+a ) C 3 a ( b+ a ) 3x3 C 3 a ( b+ a ) 33
4.3(d) (a) (b) (c) (d) 4. (a) (b) (c) a/ (d) (a) (b) (c) (d) 4.3 (a) (b) (c) (a) (b) (d) (c) 34
4.4 I II I e 1 II e o e 1 e a 0. < e1e 3 < 3a 0.6 (4.1-1) oe 1 oe < threshold (4.1-) 1 e 1e ox > threshold, for any x on the x coordinate (4.1-3) 4.1-1 4.1-4.1-3 e 1 e e 1 e o 4.4 35
4.5(a) m 1 m 0.a < m m 1 < 1.5a (4.-1) m 1m ox > threshold, for any x on the x coordinate (4.-) 4.-1 4.- o m 1 m (a) (b) (c) 4.5 (a) (b) (a) (c) (b) 36
3.3 3.4 4.6 4.6 37
Windows 000 VC Intel Pentium 1.6 GHz (CPU) 56MB 5.1 38
5. 5.3 5.4 5.5 5.6 5.1(b) 5.3 5.5 39
40
(a) 5.1 (b) 41
(a) (b) 5. 4
5.3 5.4 43
5.5 5.6 44
YCbCr n (n 3 ) n 5 frames/sec 45
(m 1 ) (m m 1 m 46
47
[1] M. Turk and A. Pentland, Eigenfaces for recognition, Journal of Cognitive Neuro-science, vol.3, no.1, pp.71-86, 1991. [] K. K. Sung and T. Poggio, Example-based learning for view-based human face detection, in Proc. Image Understanding Workshop, pp. 843-850, Monterey, Calif., Nov. 1994. [3] H. A. Rowley, S. Baluja, and T. Kanade, Human face detection in visual scenes, Tech. Rep. CMU-CS-95-158R, Carnegie Mellon University, 1995. [4] S. H. Jeng, H. Y. Mark Liao, C. C. Han, M. Y. Chern, and Y. T. Liu, An efficient approach for facial feature detection using geometrical face model, to appear in Pattern Recognition, 1997. [5] K. Sobottka and I. Pitas, Extraction of facial regions and features using color and shape information, in Proc. 13 th International Conference on Pattern Recognition, pp.41-45, Vienna, Austria, Aug. 1996. [6] H. Wu, Q. Chen, and M. Yachida, A fuzzy-theory-based face detector, in Proc. 13 th International Conference on Pattern Recognition, pp.406-410, Vienna, Austria, Aug. 1996. [7] B. Moghaddam, and A. Pentland, Probabilistic Visual Learning for Object Representation, IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 19, no. 7, pp. 696-710, July 1997. [8] B. Moghaddam, W. Wahid, and A. Pentland, Beyond Eigenfaces: Probabilistic Matching for Face Recognition, IEEE International Conference on Automatic Face and Gesture Recognition, pp. 1-35, 1998. [9] C. Cortes, V. Vapnik, Support vector networks, Machine Learning, Vol. 48
0,pp. 73-97, 1995. [10] E. Osuna, Support Vector Machines: Training and Applications, Ph.D. thesis, Dept. of EECS, Massachusetts Institute of Technology, 1998. [11] E. Osuna, R. Freund, and F. Girosi, Training Support Vector Machines: an Application to Face Detection, Proc. Computer Vision and Pattern Recognition, pp. 17-19, 1997. [1] D. Gutchess, M. Trajkovic, E. Cohen-Solal, D. Lyons, A. K. Jain, A Background Model Initialization Algorithm for Video Surveillance, IEEE, 001. [13] R.M. Haralick, and L. G. Shapiro, Computer and Robot Vision, Vo1.1, Addison-Wesley Inc., 199. [14] R. Brunelli, and T. Poggio, Template Matching: Matched Spatial Filters And Beyond, MIT AI Memo 1549,July 1995. [15] M.S. Lew, N. Sube, T.S. Huang, Improving visual matching, IEEE Conference on Computer Vision and Pattern Recognition, Vol., pp. 58-65, 000. [16] H. Thomas, E. Charles, and L. Ronald, Introduction To Algorithms, pp. 898-90, 1989. 49