Video Segmentation through Multiscale Analysis of Natural Textures

 

Miguel Alemán-Flores and Luis Álvarez-León
Departamento de Informática y Sistemas
Universidad de Las Palmas de Gran Canaria
35017, Las Palmas, SPAIN
maleman@dis.ulpgc.es, lalvarez@dis.ulpgc.es
serdis.dis.ulpgc.es/~maleman, serdis.dis.ulpgc.es/~lalvarez

   

Abstract

Segmenting a video sequence into coherent scenes requires analyzing the features that reveal where a transition occurs. Textures are very helpful for this purpose, since a change of scene is usually associated with a change in the textures. Furthermore, analyzing the textures at different scales provides more information than the features that can be extracted from a single scale. A standard multiscale texture analysis would require adjusting the scales before comparing the textures. However, when analyzing video sequences, this step can be simplified by assuming that all frames have the same resolution. Finally, combining different time intervals improves the results by reducing the number of false transitions.

 

1. Texture Analysis

Orientation histograms are extracted using the structure tensor to estimate the orientation and magnitude of the gradient at every point of the textured region. An energy function measures the difference between two textures by means of the Fourier coefficients of their histograms. Normalizing the histograms and minimizing the energy function provide size and rotation invariance. The most similar textures in a database can be extracted by selecting the comparisons that produce the lowest energy values.

 

Examples of natural texture images (left) and groups of them extracted using histogram comparison (right)
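
The histogram extraction and comparison described in this section can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the exact energy function is not reproduced in this text, so the sketch assumes a sum of squared differences between the first Fourier coefficients, minimized over circular shifts of one histogram. The function names, the box window used to build the structure tensor, and the parameters `n_bins`, `radius` and `n_coeffs` are all hypothetical.

```python
import cmath
import math


def orientation_histogram(image, n_bins=16, radius=1):
    """Magnitude-weighted histogram of structure-tensor orientations in [0, pi)."""
    h, w = len(image), len(image[0])
    # Central-difference gradients at interior pixels (borders stay at zero).
    gx = [[0.0] * w for _ in range(h)]
    gy = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx[y][x] = (image[y][x + 1] - image[y][x - 1]) / 2.0
            gy[y][x] = (image[y + 1][x] - image[y - 1][x]) / 2.0
    hist = [0.0] * n_bins
    for y in range(1 + radius, h - 1 - radius):
        for x in range(1 + radius, w - 1 - radius):
            # Structure tensor: gradient products summed over a box window
            # (assumption; a Gaussian window is a common alternative).
            jxx = jyy = jxy = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    a, b = gx[y + dy][x + dx], gy[y + dy][x + dx]
                    jxx += a * a
                    jyy += b * b
                    jxy += a * b
            magnitude = math.sqrt(jxx + jyy)
            if magnitude == 0.0:
                continue
            # Dominant gradient orientation, folded into [0, pi).
            theta = (0.5 * math.atan2(2.0 * jxy, jxx - jyy)) % math.pi
            # Bins are centered on multiples of pi / n_bins.
            hist[int(theta / math.pi * n_bins + 0.5) % n_bins] += magnitude
    return hist


def texture_energy(hist_a, hist_b, n_coeffs=4):
    """Assumed energy: squared distance between Fourier coefficients,
    minimized over circular shifts (rotations) of the second histogram."""
    n = len(hist_a)

    def coeffs(hist):
        total = float(sum(hist))  # normalization removes the region size
        return [sum(v / total * cmath.exp(-2j * math.pi * k * j / n)
                    for j, v in enumerate(hist))
                for k in range(n_coeffs)]

    a, b = coeffs(hist_a), coeffs(hist_b)
    # Shifting a histogram by s bins multiplies coefficient k by exp(-2*pi*i*k*s/n).
    return min(
        sum(abs(a[k] - b[k] * cmath.exp(-2j * math.pi * k * s / n)) ** 2
            for k in range(n_coeffs))
        for s in range(n)
    )
```

Under these assumptions, two histograms that differ only by a circular shift (a rotated texture) or by a constant factor (a larger region) yield an energy close to zero, which is the invariance the text refers to.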

 

2. Multiscale Analysis of Natural Textures
Comparing textures at different scales increases the discrimination capability of the technique we use to measure the differences between them [1]. If we use a Gaussian multiscale analysis, the standard deviations of the filters used in the comparison must first be adjusted. The adjusting ratio is obtained from the decrease of φ(I0,Ω,t), which characterizes the resolution of the texture and allows adjusting the scales for the multiscale analysis.

Energy evolution in the comparison of two images of the same texture at different resolutions (left) and two images of different textures (right)
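
As a rough illustration of the energy evolution across scales, the sketch below smooths two images with Gaussian filters of increasing standard deviation and evaluates a caller-supplied energy at each scale. It is only a sketch under stated assumptions: `gaussian_smooth`, `energy_evolution` and the `energy` callable are hypothetical names, the kernel is truncated at three standard deviations with replicated borders, and the computation of the adjusting ratio from φ(I0,Ω,t) is not reproduced, so `ratio` is simply passed in. For video frames with the same resolution, `ratio` would be 1.

```python
import math


def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian weights, truncated at about 3 sigma."""
    radius = max(1, int(math.ceil(3.0 * sigma)))
    weights = [math.exp(-(d * d) / (2.0 * sigma * sigma))
               for d in range(-radius, radius + 1)]
    total = sum(weights)
    return [wt / total for wt in weights]


def gaussian_smooth(image, sigma):
    """Separable Gaussian smoothing with replicated borders."""
    if sigma <= 0.0:
        return [row[:] for row in image]
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    h, w = len(image), len(image[0])

    def clamp(i, n):
        return min(max(i, 0), n - 1)

    # Horizontal pass, then vertical pass.
    rows = [[sum(kernel[k + radius] * image[y][clamp(x + k, w)]
                 for k in range(-radius, radius + 1))
             for x in range(w)] for y in range(h)]
    return [[sum(kernel[k + radius] * rows[clamp(y + k, h)][x]
                 for k in range(-radius, radius + 1))
             for x in range(w)] for y in range(h)]


def energy_evolution(image_a, image_b, energy, scales, ratio=1.0):
    """Energy between the two images at each scale.

    The second image is smoothed with ratio * sigma, where ratio plays the
    role of the adjusting ratio (assumed to be estimated elsewhere).
    """
    return [energy(gaussian_smooth(image_a, s), gaussian_smooth(image_b, ratio * s))
            for s in scales]
```

In practice, `energy` would be a comparison of orientation histograms; any function that takes two images and returns a number can be plugged in to inspect how the energy evolves with the scale.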

 

 

 

3. Results

 

We have extracted the scene transitions in a video sequence by comparing frames that are n frames apart. A new scene starts when the energy exceeds a threshold. However, some false transitions appear because fast changes within a scene also produce high energies. Using multiscale analysis and comparing at different time intervals improve the results by increasing the specificity of the discrimination.
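
The thresholding step, and one possible way of combining two time intervals, can be sketched as follows. The combination rule is not specified in this text, so the sketch assumes that a transition detected at the long interval is kept only if one of the short-interval comparisons inside the same span also exceeds the threshold. The names `detect_transitions` and `confirm_transitions`, the `energy` callable, and the assumption that the short interval divides the long one are all hypothetical.

```python
def detect_transitions(frames, energy, interval, threshold):
    """Indices i where energy(frames[i], frames[i + interval]) exceeds the threshold."""
    return [i for i in range(0, len(frames) - interval, interval)
            if energy(frames[i], frames[i + interval]) > threshold]


def confirm_transitions(frames, energy, threshold, long_interval=10, short_interval=5):
    """Keep a long-interval transition only if a short-interval comparison
    inside the same span also exceeds the threshold (assumed rule)."""
    confirmed = []
    for start in detect_transitions(frames, energy, long_interval, threshold):
        last = start + long_interval - short_interval
        for s in range(start, last + 1, short_interval):
            if energy(frames[s], frames[s + short_interval]) > threshold:
                confirmed.append(start)
                break
    return confirmed
```

With this rule, a gradual change that accumulates enough energy over 10 frames but stays below the threshold over every 5-frame step is discarded, while an abrupt cut is still detected at both intervals.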

 

Examples of the initial and final frames of different scenes extracted from a video sequence using multiscale texture analysis

 

True transitions (TT) and false transitions (FT) extracted using original scale analysis with 10-frame interval (OSA 10), multiscale analysis with 10-frame interval (MSA 10) and multiscale analysis with the combination of 10-frame and 5-frame intervals (MSA 10-5) (transitions compared with [2]).

Distribution of the adjusting ratio for the comparison of the textures in a database of natural scenes.
Mean=0.92, Std. deviation=0.06

4. Conclusion


We have presented a new approach to video sequence segmentation based on a multiscale comparison of natural textures. Orientation histograms, extracted using the structure tensor, describe the distribution of the orientations across a textured region. Their multiscale analysis has produced quite satisfactory results, since the visual similarity or difference between two textures is detected much more reliably from the evolution of the energies obtained when comparing the histograms at different scales. The need for a high sensitivity produces a decrease in the specificity. However, comparing at different scales and using different temporal intervals reduces the number of normal changes misclassified as transitions without missing the true transitions. The promising results confirm the usefulness of multiscale texture analysis for the segmentation of video sequences.
 
[1] Alemán-Flores, M., Álvarez-León, L.: Texture Classification through Multiscale Orientation Histogram Analysis. Lecture Notes in Computer Science, Springer Verlag 2695 (2003) 479-493
[2] Bescós, J.: Shot Transitions Ground Truth for the MPEG7 Content Set. Technical Report 2003/06. Universidad Autónoma de Madrid (2003)
International Conference on Image Analysis and Recognition
Porto (Portugal) - 2004