The LLP-Co method learns visual patterns and automatically clusters large volumes of data, with applications in identifying deforested areas and characteristics of plant biodiversity in forest regions
An ambitious project led by Professor Dário Oliveira of the School of Applied Mathematics at Fundação Getulio Vargas (FGV EMAp) and his students is applying satellite imagery and artificial intelligence to measure deforestation levels and map details of the Amazon Rainforest’s biodiversity. The research employs a methodology that Oliveira developed during his stay at the University of Wisconsin-Madison (USA) and that received an award at the VIII FGV Research and Innovation Symposium. This methodology enables the automatic classification of agricultural crops using a machine learning system that relies on satellite imagery for mapping.
At the core of this approach is the Learning from Label Proportions with Prototypical Contrastive Clustering (LLP-Co) tool, which organizes large datasets captured by drones and satellites. These platforms can survey the rainforest and collect large-scale data, even in areas with dense vegetation cover. The system processes, labels, and classifies this data automatically, while learning from researcher supervision and validation.
LLP-Co can "learn" which types of crops appear in each section of the image—even without detailed instructions on which section corresponds to which crop. | Source: Reproduction/ La Rosa; Oliveira; Ghamisi, 2022.
How does the model work?
LLP-Co recognizes patterns by considering various characteristics, such as colors and historical data from the observed region. It employs prototype-based contrastive learning: images are processed and divided into "bags" (sets of small pixel blocks), which are then grouped into clusters (collections of elements sharing similar characteristics) without the need for individual sample labeling. As the method's name indicates, training relies on the proportion of each class within a bag rather than on a label for every sample.
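To make the idea of learning from bags more concrete, the sketch below shows one common way a bag-level signal can be scored: the per-block predictions in a bag are averaged and compared with the known class proportions of that bag. This is an illustrative assumption, not the authors' implementation; the function names, the cross-entropy form of the loss, and the toy numbers are all hypothetical.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax: turns per-block scores into class probabilities."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def bag_proportion_loss(logits, bag_proportions, eps=1e-12):
    """Cross-entropy between known bag proportions and the mean prediction.

    logits: array of shape (n_blocks, n_classes) for ONE bag (hypothetical scores).
    bag_proportions: array of shape (n_classes,) that sums to 1.
    """
    probs = softmax(logits)          # per-block class probabilities
    predicted = probs.mean(axis=0)   # estimated class proportions of the bag
    return float(-np.sum(bag_proportions * np.log(predicted + eps)))

# Hypothetical bag: 4 pixel blocks, 2 crop classes, known to be split 50/50.
known_proportions = np.array([0.5, 0.5])

# Predictions that assign two blocks to each class match the bag proportions.
balanced_logits = np.array([[4.0, 0.0], [4.0, 0.0], [0.0, 4.0], [0.0, 4.0]])
# Predictions that assign every block to class 0 contradict them.
one_sided_logits = np.array([[4.0, 0.0]] * 4)

matching = bag_proportion_loss(balanced_logits, known_proportions)
mismatched = bag_proportion_loss(one_sided_logits, known_proportions)
print(matching < mismatched)
```

No individual block carries a label here: only the bag-level proportions enter the loss, which is lower when the averaged predictions agree with them.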
According to the research, the LLP-Co method has outperformed traditional unsupervised learning approaches, such as Swapping Assignments Between Views (SwAV). Remote sensing image classification is a time-consuming and labor-intensive task, but systems like LLP-Co significantly reduce the need for human effort in this process.
With the help of drones, researchers can capture images and classify them using the LLP-Co system. | Photo: Envato
The system's applications in forest monitoring enable greater accuracy in tracking deforestation progress, allowing for more effective public policy management. Additionally, the identification and classification of plant species can reveal surprising details about local biodiversity, opening new avenues for research focused on environmental conservation.
Versatility of applications
Professor Dário Oliveira’s research is at the forefront of Data Science. Through international collaborations, his work spans multiple disciplines and has been published in highly respected scientific journals. In the Social Sciences field, Oliveira contributed to a study that explored the relationship between socioeconomic indicators and local environmental conditions, using satellite imagery processed by artificial intelligence. The findings enable applications in urban planning, social inequality monitoring, and the development of more effective public policies to support community growth.
Another study examines the connection between urban expansion and land surface temperature (LST) increase. Using LST time series extracted from satellite images, the researchers applied clustering methods to identify spatiotemporal patterns in three cities: Kolkata (India), São Paulo (Brazil), and Munich (Germany). The results revealed that urban growth significantly contributes to rising LST, particularly in recently urbanized peripheral areas. Additionally, the study showed that the migration of the LST center of mass over time reflects the direction of urban expansion.
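The notion of an LST center of mass can be illustrated with a small sketch: each pixel of a temperature grid is weighted by how warm it is, and the weighted average position is tracked from one date to the next. The study's exact definition is not given here, so the weighting scheme (temperature above the grid minimum), the function name, and the synthetic 5x5 grids below are assumptions for illustration only.

```python
import numpy as np

def lst_center_of_mass(lst_grid):
    """Temperature-weighted centroid (row, col) of a 2D LST grid.

    Weights are temperatures above the grid minimum (an assumed convention),
    so the coolest pixels contribute nothing to the centroid.
    """
    weights = lst_grid - lst_grid.min()
    total = weights.sum()
    if total == 0:
        # Uniform grid: fall back to the geometric center.
        height, width = lst_grid.shape
        return np.array([(height - 1) / 2, (width - 1) / 2])
    rows, cols = np.indices(lst_grid.shape)
    return np.array([(rows * weights).sum() / total,
                     (cols * weights).sum() / total])

# Synthetic example (kelvin): a hot urban core, then a new hot spot to the east.
grid_early = np.full((5, 5), 290.0)
grid_early[2, 2] = 300.0

grid_late = grid_early.copy()
grid_late[2, 4] = 300.0

# Displacement of the centroid between the two dates, as (rows, cols).
shift = lst_center_of_mass(grid_late) - lst_center_of_mass(grid_early)
print(shift)
```

In this toy case the centroid moves one column to the east and stays on the same row, mirroring the direction in which the synthetic hot area expanded.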
Illustration of how deep learning models can predict socioeconomic outcomes related to local environmental conditions | Photo: Reproduction/ Obadic et al., 2024.
Dário Oliveira also contributed to research that proposed a new method for improving the synthesis of extreme climate data using variational autoencoders (VAEs). The VAEs were trained to recognize climate patterns from historical precipitation data, enabling the prediction or generation of new climate scenarios. The results indicate that this technique can enhance climate forecasting models and environmental risk analysis. This approach also helped correct biases in traditional models, ensuring that rare events were neither underestimated nor overlooked. The methodology has potential applications beyond climate studies, including natural disaster prediction, climate monitoring, and urban planning.
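For readers unfamiliar with VAEs, the sketch below shows the two ingredients of the standard VAE training objective: a reconstruction term and a Kullback-Leibler term that keeps the latent codes close to a standard normal distribution, together with the sampling step used to draw latent codes. It is a generic textbook formulation written with NumPy, not the model from the study; the function names, the squared-error reconstruction term, and the toy arrays are assumptions.

```python
import numpy as np

def kl_to_standard_normal(mu, log_var):
    """KL divergence from N(mu, exp(log_var)) to N(0, I), one value per sample."""
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var, axis=1)

def reparameterize(mu, log_var, rng):
    """Draw latent codes z = mu + sigma * noise, with noise ~ N(0, I)."""
    noise = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * noise

def vae_loss(x, x_reconstructed, mu, log_var):
    """Squared-error reconstruction term plus KL term, averaged over the batch."""
    reconstruction = np.sum((x - x_reconstructed) ** 2, axis=1)
    return float(np.mean(reconstruction + kl_to_standard_normal(mu, log_var)))

# Toy batch: 2 hypothetical precipitation vectors with 4 values each,
# encoded into a 3-dimensional latent space.
rng = np.random.default_rng(0)
x = np.array([[0.0, 1.5, 3.0, 0.2], [2.0, 0.0, 0.1, 4.0]])
mu = np.zeros((2, 3))
log_var = np.zeros((2, 3))

z = reparameterize(mu, log_var, rng)       # latent samples, shape (2, 3)
perfect = vae_loss(x, x, mu, log_var)      # perfect reconstruction, zero KL
print(z.shape, perfect)
```

Once such a model is trained, new scenarios are typically generated by sampling latent codes from the standard normal prior and passing them through the decoder, which is omitted here.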
The work of Professor Dário Oliveira and his team at FGV EMAp marks a significant advancement in artificial intelligence and remote sensing techniques for complex data analysis. The versatility of this method allows its application across multiple fields, broadening the use of geospatial data to better understand urban, climate, and socioeconomic phenomena.
Through these innovations, FGV EMAp plays a key role in bridging data science and sustainability, offering groundbreaking solutions for environmental monitoring and strategic decision-making in natural resource management.