Video Segmentation through Multiscale Analysis of Natural Textures
Miguel Alemán-Flores and Luis Álvarez-León
Departamento de Informática y Sistemas
Universidad de Las Palmas de Gran Canaria
35017, Las Palmas, SPAIN
maleman@dis.ulpgc.es, lalvarez@dis.ulpgc.es
serdis.dis.ulpgc.es/~maleman, serdis.dis.ulpgc.es/~lalvarez
Abstract
Segmenting a video sequence into coherent scenes requires analyzing the
features that reveal where a transition occurs. Textures are well suited to
this purpose, since a change of scene is usually associated with a change in
the textures. Furthermore, analyzing the textures in a given environment at
several scales provides more information than the features extracted at a
single scale. A standard multiscale texture analysis would require adjusting
the scales before comparing the textures. However, when analyzing video
sequences, this step can be simplified by assuming that all frames have the
same resolution. Finally, combining different time intervals improves the
results by reducing the number of false transitions.
Orientation histograms are extracted using the structure tensor to estimate
the orientation and magnitude of the gradient at every point of the textured
region. An energy function measures the difference between two textures by
means of the Fourier coefficients of their histograms. The normalization of
the histograms and the minimization of the energy function provide size and
rotation invariance. The most similar textures in a database are extracted
by selecting the comparisons that produce the lowest energy values.
Examples of natural texture images (left) and groups of them extracted using histogram comparison (right)
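The histogram comparison described above can be sketched in Python as follows. The energy function itself is not reproduced in this text, so the sketch uses a stand-in: the squared distance between the low-order Fourier coefficients of two normalized histograms, minimized over a rotation angle. The function names, the number of coefficients and the angular sampling are assumptions made for illustration, and the histograms are assumed to be circular lists of non-negative bin weights.

```python
import cmath
import math

def normalize(hist):
    """Scale a histogram so its bins sum to 1 (removes the effect of region size)."""
    total = sum(hist)
    return [h / total for h in hist]

def fourier_coeffs(hist, n_coeffs):
    """First n_coeffs complex DFT coefficients of a circular histogram."""
    n = len(hist)
    return [
        sum(h * cmath.exp(-2j * math.pi * k * i / n) for i, h in enumerate(hist))
        for k in range(n_coeffs)
    ]

def energy(hist_a, hist_b, n_coeffs=8, n_angles=180):
    """Stand-in energy (assumption): minimum over rotations of the squared
    distance between the Fourier coefficients of the normalized histograms."""
    fa = fourier_coeffs(normalize(hist_a), n_coeffs)
    fb = fourier_coeffs(normalize(hist_b), n_coeffs)
    best = float("inf")
    for step in range(n_angles):
        alpha = 2 * math.pi * step / n_angles
        # A circular shift by a fraction alpha / (2*pi) of the histogram period
        # multiplies the k-th coefficient by exp(-1j * k * alpha).
        e = sum(
            abs(a - b * cmath.exp(-1j * k * alpha)) ** 2
            for k, (a, b) in enumerate(zip(fa, fb))
        )
        best = min(best, e)
    return best

# Toy data (hypothetical): a histogram, a shifted copy, a rescaled copy, a flat one.
hist = [2 + math.cos(2 * math.pi * i / 36) for i in range(36)]
rotated = hist[-5:] + hist[:-5]      # same shape, shifted by 5 bins
rescaled = [3 * h for h in hist]     # same shape, larger region
flat = [1.0] * 36                    # no dominant orientation
```

In this toy setting, `energy(hist, rotated)` and `energy(hist, rescaled)` are numerically zero, while `energy(hist, flat)` is clearly positive, so ranking database entries by increasing energy would place the shifted and rescaled copies first.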
2. Multiscale Analysis of Natural Textures
Comparing textures at different scales increases the discrimination
capability of the technique used to measure the differences between them
[1]. With a Gaussian multiscale analysis, the standard deviations of the
filters used in the comparison must first be adjusted. The adjusting ratio
is obtained from the decrease of φ(I0,Ω,t), which characterizes the
resolution of the texture and allows adjusting the scales for the
multiscale analysis.
Energy evolution in the comparison of two images of the same texture at different resolutions (left) and two images of different textures (right)
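Why the standard deviations must be adjusted can be illustrated with a one-dimensional toy example. The sketch below does not compute φ(I0,Ω,t), which is not defined in this text. It only shows, under the assumption of a periodic cosine "texture" sampled at two resolutions, that Gaussian smoothing affects both versions equally once the standard deviation is scaled by the resolution ratio. All names and values are hypothetical.

```python
import math

def gaussian_kernel(sigma):
    """Sampled Gaussian truncated at 4*sigma and normalized to sum to 1."""
    radius = int(math.ceil(4 * sigma))
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth_periodic(signal, sigma):
    """Circular convolution of a 1D signal with a Gaussian of std. deviation sigma."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(signal)
    return [
        sum(kernel[j] * signal[(i + j - radius) % n] for j in range(len(kernel)))
        for i in range(n)
    ]

def amplitude(signal):
    """Half of the peak-to-peak range of the signal."""
    return (max(signal) - min(signal)) / 2.0

n = 256
# The same pattern seen at two resolutions: 16 and 32 samples per period.
low_res = [math.cos(2 * math.pi * i / 16) for i in range(n)]
high_res = [math.cos(2 * math.pi * i / 32) for i in range(n)]

sigma = 2.0
ratio = 2.0  # resolution ratio between the two versions (known here by construction)

amp_low = amplitude(smooth_periodic(low_res, sigma))
amp_high_adjusted = amplitude(smooth_periodic(high_res, ratio * sigma))
amp_high_unadjusted = amplitude(smooth_periodic(high_res, sigma))
```

With the adjusted standard deviation, both versions are attenuated by nearly the same factor, so features extracted at corresponding scales remain comparable. Without the adjustment, the higher-resolution version keeps a noticeably larger amplitude at the same scale.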
3. Results
We have extracted the scene transitions in a video sequence by comparing
frames that are n frames apart. A new scene starts when the energy exceeds
a threshold. However, fast changes within a scene also produce high
energies and therefore some false transitions. Multiscale analysis and the
comparison over different time intervals improve the results by increasing
the specificity of the discrimination.
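The thresholding step and the use of two time intervals can be sketched as follows. How the 10-frame and 5-frame comparisons are combined is not specified in this text. The rule used here, which keeps a long-interval detection only if a short-interval comparison inside the same window also exceeds the threshold, is an assumption, as are the function names, the scalar frame descriptors and the toy energy.

```python
def detect_transitions(descriptors, energy, threshold, interval):
    """Frame indices i where energy(frame i - interval, frame i) exceeds threshold."""
    return [
        i for i in range(interval, len(descriptors), interval)
        if energy(descriptors[i - interval], descriptors[i]) > threshold
    ]

def combine_intervals(descriptors, energy, threshold, long_iv=10, short_iv=5):
    """Assumed combination rule: keep a long-interval detection only if a
    short-interval detection falls inside the same window."""
    long_hits = detect_transitions(descriptors, energy, threshold, long_iv)
    short_hits = detect_transitions(descriptors, energy, threshold, short_iv)
    return [
        end for end in long_hits
        if any(end - long_iv < s <= end for s in short_hits)
    ]

def toy_energy(a, b):
    """Toy energy for scalar descriptors: absolute difference."""
    return abs(a - b)

def toy_descriptor(i):
    """Hypothetical sequence: abrupt cut at frame 23, slow drift in frames 41-50."""
    if i < 23:
        return 0.0
    if i <= 40:
        return 10.0
    if i <= 50:
        return 10.0 + 0.6 * (i - 40)
    return 16.0

frames = [toy_descriptor(i) for i in range(61)]

single = detect_transitions(frames, toy_energy, threshold=5.0, interval=10)
combined = combine_intervals(frames, toy_energy, threshold=5.0)
# single   -> [30, 50]  (the drift is flagged as a false transition)
# combined -> [30]      (only the abrupt cut is kept)
```

In this toy sequence the drift changes the descriptor enough over 10 frames to exceed the threshold but not over 5 frames, whereas the abrupt cut exceeds it at both intervals.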
Examples of the initial and final frames of different scenes extracted from a video sequence using multiscale texture analysis
True transitions (TT) and false transitions (FT) extracted using original scale analysis with 10-frame interval (OSA 10), multiscale analysis with 10-frame interval (MSA 10) and multiscale analysis with the combination of 10-frame and 5-frame intervals (MSA 10-5) (transitions compared with [2]).
Distribution of the adjusting ratio for the comparison of the textures in a database of natural scenes. Mean = 0.92, standard deviation = 0.06
4. Conclusion
We have presented a new approach to video sequence segmentation based on a
multiscale comparison of natural textures. Orientation histograms,
extracted using the structure tensor, describe the distribution of the
orientations across a textured region. Their multiscale analysis has
produced satisfactory results, since the visual similarity or difference
between two textures is detected more reliably from the evolution of the
energies across scales than from a single-scale comparison. The need for a
high sensitivity decreases the specificity. However, comparing at different
scales and over different temporal intervals reduces the number of normal
changes misclassified as transitions without missing the true transitions.
These promising results support the usefulness of multiscale texture
analysis for the segmentation of video sequences.
[1] Alemán-Flores, M., Álvarez-León, L.: Texture Classification through Multiscale Orientation Histogram Analysis. Lecture Notes in Computer Science, Springer Verlag 2695 (2003) 479-493
[2] Bescós, J.: Shot Transitions Ground Truth for the MPEG7 Content Set. Technical Report 2003/06. Universidad Autónoma de Madrid (2003)
International Conference on Image Analysis and Recognition
Porto (Portugal) - 2004