HD-03: Big Data from Earth Observation: analytics, mining, semantics

Mihai Datcu

Sunday, July 26
13:30 - 17:30


There is sustained interest in a global understanding of the processes governing the Earth, which requires observing large areas over long periods of time with a broad variety of Earth Observation (EO) sensors. The collected EO data volumes are increasing immensely; with current EO technologies they will continue to grow rapidly, and the horizon lies beyond zettabytes of data. The challenge is to explore these data and to deliver focused information and knowledge in a timely manner and in a simple, understandable format.

Big Data from Space, specifically Earth Observation imagery, poses challenges beyond the prevalent “3Vs” (volume, velocity, variety), and these challenges emerge from its very particular nature:

  • sources are sensors and instruments,
  • nature is spatio-temporal,
  • meaning is quantitative-physical and qualitative-semantic,
  • perception is beyond the human bounds,
  • understanding is contextual, in synergy with multi-sensor, in-situ, geo- and other sources of information.

Therefore, the goal of the tutorial is to present specific leading-edge concepts, methods and algorithms for information content exploration and intelligence extraction from Big Data provided by EO sensors and other related sources, and to offer an interdisciplinary view of methods in signal processing, machine learning and communication theory on a semantic level. To cope with large and complex data volumes in a time-effective manner, all these methods must also be simple.

An important particularity of Big Data in EO is its “instrument” nature: the sensors measure meaningful physical parameters. Consequently, the semantic aspects are much broader and much more difficult to formalize. The “meaning” of the data comprises many facets. The data contain quantitative information about physical, geometrical and spatio-temporal patterns, or other types of attributes of the observed scene. The instrument (sensor) and image formation parameters make the data understandable, in synergy with their spatio-temporal and geo-information context. Moreover, the “non-visual” nature of EO images dissociates them from leading-edge active learning and other visual interfaces, which affects a broad variety of the algorithms and methods for accessing the data content.

Therefore, we need new approaches for the representation of information, information processing, data compression, dimensionality reduction, sparsity analysis, and learning or adaptation in order to handle intelligence, significance and knowledge.

The tutorial begins with an overview of the area of Big Data, mainly the fields of Data Mining, Knowledge and Data Discovery, Visual Analytics and image search engines, intended to position the Big Data Intelligence field for EO applications.
The presentation then covers the following topics:

1. EO sensor data and metadata modeling; extraction of specific and meaningful descriptors; dictionary learning, compressive sampling, bag-of-words, feature coding and feature pooling for multispectral, SAR and Satellite Image Time Series (SITS) data.

2. Selected in-situ and geo-information sources (text, sensor, pictures, vector): their characterization, description and synergetic use.

3. Big Data intelligence and semantic representation, Bayesian learning, latent semantic annotation, kernel-based and boosting cascaded learning, deep learning, learning with context and multiple heterogeneous sources.

4. Compression-based pattern recognition, parameter and model-free descriptors, dictionary-based data analysis, similarity metrics, data indexing and clustering.

5. Visual data mining; visualization of multispectral, complex-valued SAR and SITS images; dimensionality reduction; visual data analytics; immersive visual data mining.

6. Big and heterogeneous data mining, spatio-temporal pattern recognition, semantic learning and automated annotation, search engines, scalability.

7. Applications for very high and high resolution multispectral and SAR data, use cases, and validation procedures.

The methods introduced will be demonstrated on selected real scenarios using TerraSAR-X, TanDEM-X and Sentinel-1 data, optical multispectral images such as Landsat or WorldView, and Satellite Image Time Series. The prospects for applying Big Data mining to Sentinel-1 and Sentinel-2 will also be discussed.


Mihai Datcu received the M.S. and Ph.D. degrees in electronics and telecommunications from the University Politehnica of Bucharest (UPB), Bucharest, Romania, in 1978 and 1986, respectively, and the title Habilitation à diriger des recherches in computer science from University Louis Pasteur, Strasbourg, France, in 1999. Since 1993, he has been a Scientist with the German Aerospace Center (DLR), Oberpfaffenhofen, Germany, where he is currently a Senior Scientist and an Image Analysis Research Group Leader in the Remote Sensing Technology Institute (IMF). He is developing algorithms for analyzing very high resolution synthetic aperture radar (SAR) and interferometric SAR data. He is engaged in research related to information theoretical aspects and semantic representations in advanced communication systems. Since 2011, he has also been leading the Immersive Visual Information Mining Research Laboratory, Munich Aerospace Faculty, Munich, Germany. He has held Visiting Professor appointments with the University of Oviedo, Oviedo, Spain; University Louis Pasteur; the International Space University, Strasbourg; the University of Siegen, Germany; the University of Camerino, Italy; and the Swiss Center for Scientific Computing (CSCS), Manno. From 1992 to 2002, he held an extended Invited Professor assignment with the Swiss Federal Institute of Technology, ETH Zurich. Since 2001, he has initiated and led the Competence Centre on Information Extraction and Image Understanding for Earth Observation at Telecom ParisTech, Paris, a collaboration of DLR with the French Space Agency (CNES). He has been a Professor and holder of the DLR-CNES Chair at Telecom ParisTech. Since 1981, he has been a Professor with the Faculty of Electronics, Telecommunications and Information Technology (ETTI), UPB, working on signal/image processing and electronic speckle interferometry, where he has been the Director of the Research Center for Spatial Information since 2011.
He and his team have developed, and continue to develop, the operational image information mining (IIM) processor in the payload ground segment systems for the German missions TerraSAR-X and TanDEM-X, and for the ESA Sentinel-1 and Sentinel-2 missions. He is the author of more than 200 scientific publications, among them about 60 journal papers, and a book on number theory. His interests are in information and complexity theory, stochastic processes, Bayesian inference, and IIM. Prof. Datcu is a member of the European Image Information Mining Coordination Group and the Data Archiving and Distribution Technical Committee of the IEEE Geoscience and Remote Sensing Society. He is an IEEE Fellow.