COMPUTERS ENVIRONMENT AND URBAN SYSTEMS

Causal effects of mobility intervention policies on intracity flows during the COVID-19 pandemic: The moderating role of zonal locations in the transportation networks
Niu C and Zhang W
Many studies have investigated the impact of mobility restriction policies on the change of intercity flows during the outbreak of COVID-19, whereas only a few have highlighted intracity flows. By using the mobile phone trajectory data of approximately three months, we develop an interrupted time series quasi-experimental design to estimate the abrupt and gradual effects of mobility intervention policies during the pandemic on intracity flows of 491 neighborhoods in Shenzhen, China, with a focus on the role of urban transport networks. The results show that the highest level of public health emergency response caused an abrupt decline of 4567 trips, followed by a gradual increase of 34 trips per day. The effectiveness of the second return-to-work order (RtW2) was found to be clearly larger than that of the first return-to-work order (RtW1) as a mobility restoration strategy. The causal effects of mobility intervention policies are heterogeneous across zonal locations in varying urban transport networks. The declining effect of health emergency response and rebounding effect of RtW2 are considerably large in better-connected neighborhoods with metro transit, as well as in those close to the airport. These findings provide new insights into the identification of pandemic-vulnerable hotspots in the transport network inside the city, as well as of crucial neighborhoods with increased adaptability to mobility interventions during the onset and decline of COVID-19.
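The abrupt and gradual effects estimated by an interrupted time series design can be illustrated with a segmented regression. The following is a minimal sketch, not the authors' model: it assumes a single intervention day `t0`, a linear pre-intervention trend, and ordinary least squares, and the function name `fit_its` and the synthetic series are hypothetical. The level and slope changes used to build the synthetic series borrow the magnitudes reported in the abstract purely for illustration.

```python
import numpy as np

def fit_its(y, t0):
    """OLS fit of y_t = b0 + b1*t + b2*D_t + b3*(t - t0)*D_t, with D_t = 1 for t >= t0."""
    y = np.asarray(y, dtype=float)
    t = np.arange(len(y), dtype=float)
    post = (t >= t0).astype(float)  # policy indicator D_t
    # Columns: intercept, pre-existing trend, level change, slope change.
    X = np.column_stack([np.ones_like(t), t, post, (t - t0) * post])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    names = ["baseline", "pre_trend", "abrupt", "gradual"]
    return dict(zip(names, beta))

# Synthetic, noise-free daily trip counts (assumed values, not the study's data).
days = np.arange(60, dtype=float)
post = (days >= 30).astype(float)
trips = 20000.0 + 10.0 * days - 4567.0 * post + 34.0 * (days - 30) * post

effects = fit_its(trips, t0=30)
print(effects["abrupt"], effects["gradual"])
```

Here `abrupt` is the immediate level change on the intervention day and `gradual` is the change in the daily trend afterwards. A real analysis would also need to handle autocorrelation, seasonality, and multiple policy dates.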
How regularly do people visit service places?
Zhong S and Bian L
It is often believed that regularities are embedded in mobile behaviors. Highly regular mobile behaviors, such as daily commutes between home and workplace, have been actively investigated in the context of health risks. Less regular mobile behaviors, such as visits to service places (e.g., supermarkets and healthcare facilities), have not received much attention. This study explores the regularity in service place visits using a deep learning method and the effect of place type on the stability of recurring visits using an entropy assessment. Results reveal both periodic and bursty visit behaviors to service places. The periodic visits are prominent on the weekly and bi-weekly scales, and the bursty visits dominate the multi-day scales. Service place type indeed affects the stability of recurring visits, and certain place types have the strongest effect. The research findings substantially expand the knowledge of mobile behaviors and are valuable in informing both visitor-based and place-based health risks.
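One simple way to picture an entropy assessment of visit stability is the Shannon entropy of the gaps between consecutive visits. The sketch below is an assumption for illustration only; the abstract does not specify the entropy measure used, and the function name `interval_entropy` and the example visit days are hypothetical.

```python
import math
from collections import Counter

def interval_entropy(visit_days):
    """Shannon entropy (bits) of the gaps between consecutive visit days."""
    if len(visit_days) < 2:
        raise ValueError("need at least two visits to form a gap")
    gaps = [b - a for a, b in zip(visit_days, visit_days[1:])]
    counts = Counter(gaps)
    n = len(gaps)
    # Sum p * log2(1/p) over the observed gap lengths.
    return sum((c / n) * math.log2(n / c) for c in counts.values())

weekly_visits = [0, 7, 14, 21, 28]   # perfectly periodic: every gap is 7 days
bursty_visits = [0, 1, 2, 10, 24]    # gaps of 1, 1, 8 and 14 days

weekly_h = interval_entropy(weekly_visits)
bursty_h = interval_entropy(bursty_visits)
print(weekly_h, bursty_h)
```

Under this measure a lower value indicates a more stable recurring pattern, so the periodic sequence scores lower than the bursty one.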
Information propagation on cyber, relational and physical spaces about covid-19 vaccine: Using social media and splatial framework
Yin F, Crooks A and Yin L
With the advent of social media, human dynamics studied in purely physical space have been extended to that of a cyber and relational context. However, connections and interactions between these hybrid spaces have not been sufficiently investigated. The "space-place (splatial)" framework proposed in recent years allows capturing human activities in the hybrid of spaces. This study applies the framework to examine the information propagation between cyber, relational, and physical spaces through a case study of Covid-19 vaccine debates in New York State (NYS). Here, the physical space represents the regional boundaries and locations of social media (i.e., Twitter) users in NYS, the relational space indicates the social networks of these NYS users, and the cyber space captures the larger conversational context of the vaccination debate. Our results suggest that the Covid-19 vaccine debate is not polarized across all three spaces as compared to that of other vaccines. However, the rate of users with a pro-vaccine stance decreases from physical to relational and cyber spaces. We also found that while users from different spaces interact with each other, they also engage in local communications with users from the same region or same space, and distance-based and boundary-confined clusters exist in cyber and relational space communities. These results based on the framework not only shed light on the vaccination debates but also help to define and elucidate the relationships between the three spaces. The intense interactions between spaces suggest incorporating people's relational network and cyber presence in physical place-making.
Structural changes in intercity mobility networks of China during the COVID-19 outbreak: A weighted stochastic block modeling analysis
Zhang W, Gong Z, Niu C, Zhao P, Ma Q and Zhao P
This study focuses on a mesoscale perspective to examine the structural and spatial changes in the intercity mobility networks of China across three phases: before, during and after the Wuhan lockdown due to the outbreak of COVID-19. Taking advantage of mobility big data from Baidu Maps, we introduce the weighted stochastic block model (WSBM) to measure and compare mesoscale structures in the three mobility networks. The results reveal significant changes to volume and structure of the intercity mobility networks. Particularly, WSBM results show that the intercity network transformed from a typical core-periphery structure in the normal phase, to a hybrid and asymmetric structure with mixing core-peripheries and local communities in the lockdown phase, and to a multi-community structure with nested core-peripheries during the post-lockdown phase. These changes suggest that the outbreak of COVID-19 and the travel restrictions deconstructed the original hierarchy of the intercity mobility network in China, making the network more locally or regionally fragmented, even at the recovery stage. This study provides new empirical and methodological insights into understanding mobility network dynamics under the impact of COVID-19, helping assess the emergency-induced impact as well as the recovery process of the mobility network.
Imputation of missing time-activity data with long-term gaps: A multi-scale residual CNN-LSTM network model
Eum Y and Yoo EH
Despite the increasing availability and spatial granularity of individuals' time-activity (TA) data, the missing data problem, particularly long-term gaps, remains a major limitation of TA data as a primary source of human mobility studies. In the present study, we propose a two-step imputation method to address the missing TA data with long-term gaps, based on both efficient representation of TA patterns and high regularity in TA data. The method consists of two steps: (1) the continuous bag-of-words word2vec model to convert daily TA sequences into a low-dimensional numerical representation to reduce complexity; (2) a multi-scale residual Convolutional Neural Network (CNN)-stacked Long Short-Term Memory (LSTM) model to capture multi-scale temporal dependencies across historical observations and to predict the missing TAs. We evaluated the performance of the proposed imputation method using the mobile phone-based TA data collected from 180 individuals in western New York, USA, from October 2016 to May 2017, with a 10-fold out-of-sample cross-validation method. We found that the proposed imputation method achieved excellent performance with 84% prediction accuracy, which led us to conclude that the proposed imputation method was successful at reconstructing the sequence, duration, and spatial extent of activities from incomplete TA data. We believe that the proposed imputation method can be applied to impute incomplete TA data with relatively long-term gaps with high accuracy.
Towards the automated large-scale reconstruction of past road networks from historical maps
Uhl JH, Leyk S, Chiang YY and Knoblock CA
Transportation infrastructure, such as road or railroad networks, represents a fundamental component of our civilization. For sustainable planning and informed decision making, a thorough understanding of the long-term evolution of transportation infrastructure such as road networks is crucial. However, spatially explicit, multi-temporal road network data covering large spatial extents are scarce and rarely available prior to the 2000s. Herein, we propose a framework that employs increasingly available scanned and georeferenced historical map series to reconstruct past road networks, by integrating abundant, contemporary road network data and color information extracted from historical maps. Specifically, our method uses contemporary road segments as analytical units and extracts historical roads by inferring their existence in historical map series based on image processing and clustering techniques. We tested our method on over 300,000 road segments representing more than 50,000 km of the road network in the United States, extending across three study areas that cover 42 historical topographic map sheets dated between 1890 and 1950. We evaluated our approach by comparison to other historical datasets and against manually created reference data, achieving F-1 scores of up to 0.95, and showed that the extracted road network statistics are highly plausible over time, i.e., following general growth patterns. We demonstrated that contemporary geospatial data integrated with information extracted from historical map series open up new avenues for the quantitative analysis of long-term urbanization processes and landscape changes far beyond the era of operational remote sensing and digital cartography.
Early warning of COVID-19 hotspots using human mobility and web search query data
Yabe T, Tsubouchi K, Sekimoto Y and Ukkusuri SV
COVID-19 has disrupted the global economy and well-being of people at an unprecedented scale and magnitude. To contain the disease, an effective early warning system that predicts the locations of outbreaks is of crucial importance. Studies have shown the effectiveness of using large-scale mobility data to monitor the impacts of non-pharmaceutical interventions (e.g., lockdowns) through population density analysis. However, predicting the locations of potential outbreak occurrence is difficult using mobility data alone. Meanwhile, web search queries have been shown to be good predictors of the disease spread. In this study, we utilize a unique dataset of human mobility trajectories (GPS traces) and web search queries with common user identifiers (> 450 K users), to predict COVID-19 hotspot locations beforehand. More specifically, web search query analysis is conducted to identify users with high risk of COVID-19 contraction, and social contact analysis is further performed on the mobility patterns of these users to quantify the risk of an outbreak. Our approach is empirically tested using data collected from users in Tokyo, Japan. We show that by integrating COVID-19 related web search query analytics with social contact networks, we are able to predict COVID-19 hotspot locations 1-2 weeks beforehand, compared to just using social contact indexes or web search data analysis. This study proposes a novel method that can be used in early warning systems for disease outbreak hotspots, which can assist government agencies to prepare effective strategies to prevent further disease spread. Human mobility data and web search query data linked with common IDs are used to predict COVID-19 outbreaks. High risk social contact index captures both the contact density and COVID-19 contraction risks of individuals. Real world data was collected from 200 K individual users in Tokyo during the COVID-19 pandemic. Experiments showed that the index can be used for microscopic outbreak early warning.
Associations between mobility and socio-economic indicators vary across the timeline of the Covid-19 pandemic
Long JA and Ren C
Covid-19 interventions are greatly affecting patterns of human mobility. Changes in mobility during Covid-19 have differed across socio-economic gradients during the first wave. We use fine-scale network mobility data in Ontario, Canada to study the association between three different mobility measures and four socio-economic indicators throughout the first and second wave of Covid-19 (January to December 2020). We find strong associations between mobility and the socio-economic indicators and that relationships between mobility and other socio-economic indicators vary over time. We further demonstrate that understanding how mobility has changed in response to Covid-19 varies considerably depending on how mobility is measured. Our findings have important implications for understanding how mobility data should be used to study interventions across space and time. Our results support that Covid-19 non-pharmaceutical interventions have resulted in geographically disparate responses to mobility and quantifying mobility changes at fine geographical scales is crucial to understanding the impacts of Covid-19.
How did micro-mobility change in response to COVID-19 pandemic? A case study based on spatial-temporal-semantic analytics
Li A, Zhao P, Haitao H, Mansourian A and Axhausen KW
Cities worldwide adopted lockdown policies in response to the outbreak of coronavirus disease 2019 (COVID-19), significantly influencing people's travel behavior. In particular, micro-mobility, an emerging mode of urban transport, is profoundly shaped by this crisis. However, there is limited research devoted to understanding the rapidly evolving trip patterns of micro-mobility in response to COVID-19. To fill this gap, we analyze the changes in micro-mobility usage before and during the lockdown period, exploiting high-resolution micro-mobility trip data collected in Zurich, Switzerland. Specifically, docked bike, docked e-bike, and dockless e-bike are evaluated and compared from the perspective of space, time and semantics. First, the spatial and temporal analysis results uncover that the number of trips decreased remarkably during the lockdown period. The striking difference between the normal and lockdown period is the decline in the peak hours of workdays. Second, the origin-destination flows are used to construct spatially embedded networks. The results suggest that the origin-destination pairs remain similar during the lockdown period, while the number of trips between each origin-destination pair is reduced due to the COVID-19 pandemic. Finally, the semantic analysis is conducted to uncover the changes in trip purpose. It is revealed that the proportions of Home, Park, and Grocery activities increase, while the proportions of Leisure and Shopping activities decrease during the lockdown period. The above results can help planners and policymakers make better evidence-based policies regarding micro-mobility in the post-pandemic society.
Evaluation of Heuristics for the P-Median Problem: Scale and Spatial Demand Distribution
Gwalani H, Tiwari C and Mikler AR
The objective of the p-median problem is to identify p source locations and map them to n destinations while minimizing the average distance between destinations and corresponding sources. Several heuristic algorithms have been developed to solve this general class of facility location problems. In this study, we add to the current literature in two ways: (1) we present a thorough evaluation of existing classic heuristics and (2) we investigate the effect of spatial distribution of destination locations, and the number of sources and destinations on the performance of these algorithms for varying problem sizes using synthetic and real datasets. The performance of these algorithms is evaluated using the objective function value, time taken to achieve the solution, and the stability of the solution. The sensitivity of existing algorithms to the spatial distribution of destinations and scale of the problem with respect to the three metrics is analyzed in the paper. The utility of the study is demonstrated by evaluating these algorithms to select the locations of ad-hoc clinics that need to be set up for resource distribution during a bio-emergency. We demonstrate that interchange algorithms achieve good quality solutions with respect to both the execution time and cost function values, and they are more stable for clustered distributions.
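The interchange family of heuristics mentioned above can be sketched as a vertex-substitution search: start from any p sources and keep swapping a chosen source for an unchosen candidate whenever the swap lowers the total assignment distance. This is a minimal illustration under stated assumptions, not the implementation evaluated in the paper; the function names, the first-improvement swap rule, and the one-dimensional toy data are all hypothetical. Minimizing the total distance is equivalent to minimizing the average distance because the number of destinations is fixed.

```python
def total_cost(dist, medians):
    """Sum over destinations of the distance to the nearest chosen source."""
    return sum(min(row[j] for j in medians) for row in dist)

def interchange(dist, p):
    """Vertex-substitution heuristic; dist[d][s] is destination d to candidate s."""
    n_sites = len(dist[0])
    medians = list(range(p))          # arbitrary starting solution
    best = total_cost(dist, medians)
    improved = True
    while improved:
        improved = False
        for i in range(p):
            for cand in range(n_sites):
                if cand in medians:
                    continue
                # Try replacing the i-th source with the candidate site.
                trial = medians[:i] + [cand] + medians[i + 1:]
                cost = total_cost(dist, trial)
                if cost < best:
                    medians, best, improved = trial, cost, True
    return sorted(medians), best

# Toy data (assumed): two clusters of points on a line, every point is a candidate.
coords = [0, 1, 2, 10, 11, 12]
dist = [[abs(a - b) for b in coords] for a in coords]

medians, cost = interchange(dist, p=2)
print(medians, cost)
```

The search stops at a local optimum, so the result can depend on the starting solution; this is one reason the stability of solutions is a useful evaluation metric alongside cost and run time.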
Incorporating space and time into random forest models for analyzing geospatial patterns of drug-related crime incidents in a major U.S. metropolitan area
Xia Z, Stewart K and Fan J
The opioid crisis has hit American cities hard. Research on spatial and temporal patterns of drug-related activities, including detecting and predicting clusters of crime incidents involving particular types of drugs, is useful for distinguishing hot zones where drugs are present, which in turn can provide a basis for assessing and providing related treatment services. In this study, we investigated spatiotemporal patterns of more than 52,000 reported incidents of drug-related crime at block group granularity in Chicago, IL between 2016 and 2019. We applied a space-time analysis framework and machine learning approaches to build a model using training data that identified whether certain locations and built environment and sociodemographic factors were correlated with drug-related crime incident patterns, and established the top contributing factors that underlay the trends. Space and time, together with multiple driving factors, were incorporated into a random forest model to analyze these changing patterns. We accommodated both spatial and temporal autocorrelation in the model learning process to assist with capturing the changes over time and tested the capabilities of the space-time random forest model by predicting drug-related activity hot zones. We focused particularly on crime incidents that involved heroin and synthetic drugs as these have been key drug types that have highly impacted cities during the opioid crisis in the U.S.
Chinese tourists in Nordic countries: An analysis of spatio-temporal behavior using geo-located travel blog data
Zheng Y, Mou N, Zhang L, Makkonen T and Yang T
Geo-located travel blogs, a new data source, enable more detailed analysis of tourists' spatio-temporal behavior. Taking Chinese tourists in Nordic countries as the research object, this paper focuses on their behavior, seasonal patterns and complex network effects by using geo-located travel blog data collected from Qunar.com. The results show that: (1) Chinese tourists visiting Nordic countries are often experienced in traveling. The local climate during the cold season does not prevent them from pursuing the aurora scenery. (2) The travel behavior of Chinese tourists is spatially heterogeneous. The network analysis reveals that Iceland shows stronger community independence and a stronger small-world effect than the other Nordic countries. (3) During the warm season, Chinese tourists choose a variety of destinations, while in the cold season, they tend to choose destinations with higher chances of spotting the northern lights. These results provide helpful information for the tourism management departments of Nordic countries to improve their marketing and development efforts directed at Chinese tourists.
Impact of extreme weather events on urban human flow: A perspective from location-based service data
Chen Z, Gong Z, Yang S, Ma Q and Kan C
This study investigates the impact of extreme weather events on urban human flow disruptions using location-based service data obtained from Baidu Map. Utilizing the 2018 Typhoon Mangkhut as an example, the spatial and temporal variations of urban human flow patterns in Shenzhen are examined using GIS and spatial flow analysis. In addition, the variation of human flow by different urban functions (e.g. transport, recreational, institutional, commercial and residential related facilities) is also examined through an integration of flow data and point-of-interest (POI) data. The study reveals that urban flow patterns varied substantially before, during, and after the typhoon. Specifically, urban flows were found to have reduced by 39% during the disruption. Conversely, 56% of flows increased immediately after the disruption. In terms of functional variation, the assessment reveals that fundamental urban functions, such as industrial (work) and institutional (education) related trips, experienced less disruption, whereas the typhoon event appears to have had a relatively larger negative influence on recreational related trips. Overall, the study provides implications for planners and policy makers to enhance urban resilience to disasters through a better understanding of the urban vulnerability to disruptive events.
Advancing Scenario Planning through Integrating Urban Growth Prediction with Future Flood Risk Models
Kim Y and Newman G
High uncertainty about future urbanization and flood risk conditions limits the ability to increase resiliency in traditional scenario-based urban planning. While scenario planning integrating urban growth prediction modeling is becoming more common, these models have not been effectively linked with future flood plain changes due to sea level rise. This study advances scenario planning by integrating urban growth prediction models with flood risk scenarios. The Land Transformation Model, a land change prediction model using a GIS based artificial neural network, is used to predict future urban growth scenarios for Tampa, Florida, USA, and future flood risks are then delineated based on the current 100-year floodplain using NOAA sea level rise scenarios. A multi-level evaluation using three urban prediction scenarios (business as usual, growth as planned, and resilient growth) and three sea level rise scenarios (low, high, and extreme) is conducted to determine how prepared Tampa's current land use plan is to support increasingly resilient development in light of sea level rise. Results show that the current land use plan (growth as planned) decreases flood risk at the city scale but not always at the neighborhood scale, when compared to no growth regulations (business as usual). However, flood risk when growing according to the current plan is significantly higher when compared to all future growth residing outside of the 100-year floodplain (resilient growth). Understanding the potential effects of sea level rise depends on understanding the probabilities of future development options and extreme climate conditions.
Economic and technical assessment of rooftop solar photovoltaic potential in Brownsville, Texas, U.S.A
Mangiante MJ, Whung PY, Zhou L, Porter R, Cepada A, Campirano E, Licon D, Lawrence R and Torres M
Localized assessment of solar energy economic feasibility will benefit the structuring of residential solar energy deployment globally. In the U.S., growing interest in rooftop residential solar among city managers has spurred the development of photovoltaic (PV) feasibility maps of the technical and economic solar potential within cities. The City of Brownsville, Texas was interested in evaluating solar feasibility for the city but lacked information to make informed policy decisions on PV development. This paper presents novel and systems approaches for determining the technical and economic feasibility of solar development for homes in Brownsville using LiDAR and local information. Residential technical and economic potential was assessed by optimizing the internal rate of return (IRR) and an average residential building demand profile to determine ideal size and placement of solar arrays. Results showed that residential structures in Brownsville have the technical potential to generate approximately 11% of the total energy provided by the local utility; however, average IRR was only 2.9% with a payback period of over 15 years. Five neighborhoods in the City of Brownsville were identified with spatially clustered homes that had relatively higher IRRs compared with other areas in the city. Despite the high technical potential, modeled results indicate that prospective homeowners interested in solar development may require additional incentives to improve the economic feasibility of PV in Brownsville. This study provides a demonstration of an interdisciplinary systems approach and methodology that can be adopted internationally to evaluate the feasibility of solar development in other areas.
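The two economic indicators reported above, internal rate of return and payback period, can be illustrated with a short calculation. This is a minimal sketch with hypothetical cash flows (an assumed upfront cost, constant annual savings, and a 25-year horizon), not the paper's model or data. The IRR is found by bisection on the net present value, and the payback period shown is the simple, undiscounted one.

```python
def npv(rate, cashflows):
    """Net present value of yearly cash flows, with cashflows[0] at year 0."""
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cashflows))

def irr(cashflows, lo=-0.99, hi=1.0, tol=1e-10):
    """Rate at which NPV is zero, found by bisection on [lo, hi]."""
    f_lo, f_hi = npv(lo, cashflows), npv(hi, cashflows)
    if f_lo * f_hi > 0:
        raise ValueError("NPV does not change sign on [lo, hi]")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        f_mid = npv(mid, cashflows)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return (lo + hi) / 2.0

def simple_payback(upfront_cost, annual_saving):
    """Years needed for undiscounted savings to repay the upfront cost."""
    return upfront_cost / annual_saving

# Hypothetical system: 15,000 upfront, 1,000 saved per year for 25 years.
upfront, saving, years = 15000.0, 1000.0, 25
flows = [-upfront] + [saving] * years

rate = irr(flows)
payback = simple_payback(upfront, saving)
print(rate, payback)
```

A long payback period and a low IRR go together: when yearly savings are small relative to the upfront cost, the discount rate that balances them is correspondingly small.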
Annually modelling built-settlements between remotely-sensed observations using relative changes in subnational populations and lights at night
Nieves JJ, Sorichetta A, Linard C, Bondarenko M, Steele JE, Stevens FR, Gaughan AE, Carioli A, Clarke DJ, Esch T and Tatem AJ
Mapping urban features/human built-settlement extents at the annual time step has a wide variety of applications in demography, public health, sustainable development, and many other fields. Recently, while more multitemporal urban features/human built-settlement datasets have become available, issues still exist in remotely-sensed imagery due to spatial and temporal coverage, adverse atmospheric conditions, and expenses involved in producing such datasets. Remotely-sensed annual time-series of urban/built-settlement extents that cover more than specific local areas or city-based regions therefore do not yet exist. Moreover, while a few high-resolution global datasets of urban/built-settlement extents exist for key years, the observed date often deviates many years from the assigned one. These challenges make it difficult to increase temporal coverage while maintaining high fidelity in the spatial resolution. Here we describe an interpolative and flexible modelling framework for producing annual built-settlement extents. We use a combined technique of random forest and spatio-temporal dasymetric modelling with open source subnational data to produce annual 100 m × 100 m resolution binary built-settlement datasets in four test countries located in varying environmental and developmental contexts for test periods of five-year gaps. We find that in the majority of years, across all study areas, the model correctly identified between 85 and 99% of pixels that transition to built-settlement. Additionally, with few exceptions, the model substantially outperformed a model that gave every pixel equal chance of transitioning to built-settlement in each year. This modelling framework shows strong promise for filling gaps in cross-sectional urban features/built-settlement datasets derived from remotely-sensed imagery, provides a base upon which to create urban feature/built-settlement extent projections, and enables further exploration of the relationships between urban/built-settlement area and population dynamics.
Using Multiple Scale Space-Time Patterns in Variance-Based Global Sensitivity Analysis for Spatially Explicit Agent-Based Models
Kang JY and Aldstadt J
Sensitivity analysis (SA) in spatially explicit agent-based models (ABMs) has emerged to address some of the challenges associated with model specification and parameterization. For spatially explicit ABMs, the comparison of spatial or spatio-temporal patterns has been advocated to evaluate models. Nevertheless, less attention has been paid to understanding the extent to which parameter values in ABMs are responsible for mismatch between model outcomes and observations. In this paper, we propose the use of multiple scale space-time patterns in variance-based global sensitivity analysis (GSA). A vector-borne disease transmission model was used as the case study. Input factors used in GSA include one related to the environment (introduction rates), two related to interactions between agents and environment (level of herd immunity, mosquito population density), and one that defines agent state transition (mosquito extrinsic incubation period). The results show parameters related to interactions between agents and the environment have great impact on the ability of a model to reproduce observed patterns, although the magnitudes of such impacts vary by space-time scales. Additionally, the results highlight the time-dependent sensitivity to parameter values in spatially explicit ABMs. The GSA performed in this study helps in identifying the input factors that need to be carefully parameterized in the model to implement ABMs that well reproduce observed patterns at multiple space-time scales.
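The idea behind variance-based global sensitivity analysis can be shown with a first-order Sobol index, which measures the share of output variance explained by one input alone. The sketch below is an illustration under stated assumptions, not the study's workflow: a toy linear function stands in for the agent-based model, inputs are independent and uniform on [0, 1], and the index is estimated with a standard pick-and-freeze Monte Carlo scheme. The names `first_order_sobol` and `toy_model` are hypothetical. For this toy function the analytical indices are 0.8 and 0.2.

```python
import numpy as np

def first_order_sobol(model, n_inputs, n_samples=200_000, seed=0):
    """Pick-and-freeze estimate of first-order Sobol indices for uniform inputs."""
    rng = np.random.default_rng(seed)
    A = rng.random((n_samples, n_inputs))   # two independent sample matrices
    B = rng.random((n_samples, n_inputs))
    f_a, f_b = model(A), model(B)
    var_y = np.var(np.concatenate([f_a, f_b]))
    indices = []
    for i in range(n_inputs):
        AB = A.copy()
        AB[:, i] = B[:, i]                  # column i from B, the rest from A
        # Estimates Var(E[Y | X_i]) / Var(Y).
        indices.append(float(np.mean(f_b * (model(AB) - f_a)) / var_y))
    return indices

def toy_model(x):
    """Stand-in for a simulation output: Y = 2*X1 + X2."""
    return 2.0 * x[:, 0] + x[:, 1]

s1, s2 = first_order_sobol(toy_model, n_inputs=2)
print(s1, s2)
```

For a spatially explicit model, the scalar output would be replaced by a measure of agreement between simulated and observed space-time patterns at a given scale, and the indices recomputed per scale.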
Using population surfaces and spatial metrics to track the development of deprivation landscapes in Glasgow, Liverpool, and Manchester between 1971 and 2011
Stewart JL, Livingston M, Walsh D and Mitchell R
Measuring change in the spatial arrangement of deprivation over time, and making international, inter-city comparisons, is technically challenging. Meeting these challenges offers a means of furthering understanding and providing new insights into the geography of urban poverty and deprivation. In this paper, we introduce a novel approach to mapping and analysing spatio-temporal patterns of household deprivation, assessing the distribution at the landscape level. The approach we develop has advantages over existing techniques because it is applicable in situations where i) conventional approaches based on choropleth mapping are not feasible due to boundary change and/or ii) where spatial relationships at a landscape level are of interest. Through the application of surface mapping techniques to disaggregate census count data, and by applying spatial metrics commonly used in ecology, we were able to compare the development of the spatial arrangement of deprivation between 1971 and 2011 in three UK cities of particular interest: Glasgow, Manchester and Liverpool. Applying three spatial metrics - spatial extent, patch density, and mean patch size - revealed that over the 40 year period household deprivation has been more spatially dispersed in Glasgow. This novel approach has enabled an analysis of deprivation distributions over time which is less affected by boundary change and which accurately assesses and quantifies the spatial relationships between those living with differing levels of deprivation. It thereby offers a new approach for researchers working in this area.
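The three spatial metrics named above can be illustrated on a small binary raster in which 1 marks a deprived cell. The definitions used here are assumptions in the style of common landscape-ecology metrics, not necessarily those of the paper: patches are groups of 4-connected cells, spatial extent is the number of deprived cells, patch density is patches per cell of the whole grid, and mean patch size is cells per patch. The function names and the toy grid are hypothetical.

```python
from collections import deque

def patch_sizes(grid):
    """Sizes of 4-connected patches of truthy cells, found by breadth-first search."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    sizes = []
    for r in range(rows):
        for c in range(cols):
            if not grid[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue, size = deque([(r, c)]), 0
            while queue:
                y, x = queue.popleft()
                size += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and grid[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            sizes.append(size)
    return sizes

def landscape_metrics(grid):
    """Spatial extent, patch density and mean patch size of a binary grid."""
    sizes = patch_sizes(grid)
    total_cells = len(grid) * len(grid[0])
    extent = sum(sizes)
    return {
        "extent": extent,
        "patch_density": len(sizes) / total_cells,
        "mean_patch_size": extent / len(sizes) if sizes else 0.0,
    }

# Toy surface (assumed): 1 = deprived cell, 0 = not deprived.
grid = [
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 0, 1],
    [0, 1, 0, 0],
]
metrics = landscape_metrics(grid)
print(metrics)
```

Comparing these values across census years on a common grid avoids the boundary-change problem that affects choropleth-based comparisons. A more dispersed pattern would typically show up as higher patch density and smaller mean patch size.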
A large-scale location-based social network to understanding the impact of human geo-social interaction patterns on vaccination strategies in an urbanized area
Luo W, Gao P and Cassels S
Cities play an important role in fostering and amplifying the transmission of airborne diseases (e.g., influenza) because of dense human contacts. Before an outbreak of airborne diseases within a city, how to determine an appropriate containment area for effective vaccination strategies is unknown. This research treats airborne disease spreads as geo-social interaction patterns, because viruses transmit among different groups of people over geographical locations through human interactions and population movement. Previous research argued that an appropriate scale identified through human geo-social interaction patterns can provide great potential for effective vaccination. However, little work has been done to examine the effectiveness of such vaccination at large scales (e.g., city) that are characterized by spatially heterogeneous population distribution and movement. This article therefore aims to understand the impact of geo-social interaction patterns on effective vaccination in the urbanized area of Portland, Oregon. To achieve this goal, we simulate influenza transmission on a large-scale location-based social network to 1) identify human geo-social interaction patterns for designing effective vaccination strategies, and 2) evaluate the efficacy of different vaccination strategies according to the identified geo-social patterns. The simulation results illustrate the effectiveness of vaccination strategies based on geo-social interaction patterns in containing the epidemic outbreak at the source. This research can provide evidence to inform public health approaches to determine effective scales in the design of disease control strategies.
Identifying residential neighbourhood types from settlement points in a machine learning approach
Jochem WC, Bird TJ and Tatem AJ
Remote sensing techniques are now commonly applied to map and monitor urban land uses to measure growth and to assist with development and planning. Recent work in this area has highlighted the use of textures and other spatial features that can be measured in very high spatial resolution imagery. Far less attention has been given to using geospatial vector data (i.e. points, lines, polygons) to map land uses. This paper presents an approach to distinguish residential settlement types (regular vs. irregular) using an existing database of settlement points locating structures. Nine data features describing the density, distance, angles, and spacing of the settlement points are calculated at multiple spatial scales. These data are analysed alone and with five common remote sensing measures on elevation, slope, vegetation, and nighttime lights in a supervised machine learning approach to classify land use areas. The method was tested in seven provinces of Afghanistan (Balkh, Helmand, Herat, Kabul, Kandahar, Kunduz, Nangarhar). Overall accuracy ranged from 78% in Kandahar to 90% in Nangarhar. This research demonstrates the potential to accurately map land uses from even the simplest representation of structures.
Spatiotemporal aggregation for temporally extensive international microdata
Kugler TA, Manson SM and Donato JR
We describe a strategy for regionalizing subnational administrative units in conjunction with harmonizing changes in unit boundaries over time that can be applied to provide small-area geographic identifiers for census microdata. The availability of small-area identifiers blends the flexibility of individual microdata with the spatial specificity of aggregate data. Regionalizing microdata by administrative units poses a number of challenges, such as the need to aggregate individual scale data in a way that ensures confidentiality and issues arising from changing spatial boundaries over time. We describe a regionalization and harmonization strategy that creates units that satisfy spatial and other constraints while maximizing the number of units in a way that supports policy and research use. We describe this regionalization strategy for three test cases of Malawi, Brazil, and the United States. We test different algorithms and develop a semi-automated strategy for regionalization that meets data restrictions, computation, and data demands from end users.