Urban Growth Boundaries Prediction Using Stable Diffusion and ControlNet: A Case Study of Chengdu
-
07.2024 - ongoing
Abstract
Urban Growth Boundaries (UGBs) are important tools for mitigating development problems such as uncontrolled sprawl and regional ecological threats, but predicting UGBs remains difficult due to the complex nature of urban development. Researchers in urban planning have already developed models to predict urban growth using rule-based methods such as Cellular Automata (CA) and, more recently, deep-learning-based methods such as the Multi-Layer Perceptron (MLP) (Wang et al., 2023), Convolutional Neural Networks (CNN), and Long Short-Term Memory (LSTM) neural networks (Zhou et al., 2023). These models are either computationally expensive (CA) or, as rasterized approaches, fail to achieve sufficient numerical accuracy (MLP-LSTM).
This study seeks to improve the accuracy and adaptability of urban growth prediction and to offer finer control over prediction models through multimodal machine learning. Specifically, we use text to represent urban growth data, and images to represent both the urban growth boundary ‘conditions’ and the ‘result image’, which shows actual urban growth in rasterized form. We then fit the state-of-the-art multimodal latent diffusion model Stable Diffusion, together with ControlNet (Zhang and Agrawala, 2023), to the dataset in order to sample from the joint distribution of urban planning indicators, boundary conditions, and actual growth.
For the case study, we selected Chengdu, the economic capital of southwest China, because of its rapid yet stable growth and consistent urban planning policy during 2005-2015.
We then segmented the urban sprawl GIS data of Chengdu for 2005-2015 into a paired dataset of 2160 urban patch data points. Each data point represents urban growth within a fixed time window (1 year in our case) and consists of three parts used for Stable Diffusion and ControlNet training: textual information, a conditioning image, and a target image. The textual information includes urban planning indicators such as road length, population and GDP growth rate, mixed land use, the color coding of different land uses and, more importantly, urban development policies. The conditioning image contains several constraints on urban growth, such as the limits of UGBs, water bodies, urban green spaces, and road networks. The target images are the ground-truth UGBs that we aim to predict. The models are evaluated using Mean Squared Error (MSE) for result similarity and a self-developed metric that measures the accuracy of the UGBs.
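The three-part data point and the MSE evaluation described above can be illustrated with a minimal sketch. It is not the study's actual pipeline: the names UrbanPatch, build_prompt and mean_squared_error, the prompt format, the indicator keys, and the assumption that rasters hold pixel values scaled to [0, 1] are all hypothetical choices made for illustration, and the numbers are toy values.

```python
from dataclasses import dataclass
from typing import Dict, List

# Assumption: a raster is a 2D grid of pixel values scaled to [0, 1].
Raster = List[List[float]]


@dataclass
class UrbanPatch:
    """One hypothetical training sample: text, conditioning image, target image."""
    prompt: str
    conditioning: Raster
    target: Raster


def build_prompt(indicators: Dict[str, float], policy: str) -> str:
    """Serialise planning indicators and a policy note into one text prompt.

    The "name: value" format is an illustrative assumption.
    """
    parts = [f"{name}: {value:g}" for name, value in sorted(indicators.items())]
    parts.append(f"policy: {policy}")
    return ", ".join(parts)


def mean_squared_error(predicted: Raster, target: Raster) -> float:
    """Pixel-wise mean squared error between two rasters of identical shape."""
    same_rows = len(predicted) == len(target)
    if not same_rows or any(len(p) != len(t) for p, t in zip(predicted, target)):
        raise ValueError("rasters must have the same shape")
    squared = [
        (p - t) ** 2
        for p_row, t_row in zip(predicted, target)
        for p, t in zip(p_row, t_row)
    ]
    if not squared:
        raise ValueError("rasters must not be empty")
    return sum(squared) / len(squared)


# Toy 2x2 example; real patches would be full-resolution images.
patch = UrbanPatch(
    prompt=build_prompt(
        {"road_length_km": 12.5, "gdp_growth_rate": 0.08},
        "hypothetical policy note",
    ),
    conditioning=[[0.0, 1.0], [1.0, 1.0]],
    target=[[0.0, 1.0], [0.0, 1.0]],
)
predicted = [[0.0, 1.0], [1.0, 1.0]]  # stand-in for a generated image
print(patch.prompt)
print(mean_squared_error(predicted, patch.target))  # prints 0.25
```

Under this assumed scaling, an MSE of 0 means the generated raster matches the target exactly, and the toy prediction above differs from its target in one of four pixels.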
The main findings are: (1) Our model generated accurate UGBs with an MSE loss of 0.05, demonstrating high reliability and controllability. (2) Models with political information integrated into the text prompt produced better results, with an MSE loss of 0.04, and allow more control over the context of different areas. The study highlights the value of integrating more urban planning indicators, especially political information, into UGB prediction models, so that Stable Diffusion and ControlNet can learn and control complex latent relationships.
Keywords
Urban Growth Boundaries Prediction, Deep Learning, Latent Diffusion Model, ControlNet, Multimodal Machine Learning