Quantifying “Political Islands” with Persistent Homology

Recent political discourse in the U.S. has made much ado about a growing demographic and political divide between urban and rural communities. Moreover, cities and metropolitan areas have long been described as “islands of blue in a sea of red,” prompting lively discussions of the potential implications of such political geography. Does something about urban living cause people to shift their political views? Are Democrats self-segregating into cities? Or is there another explanation? Additional concerns arise, as one may wonder whether the concentration of blue (Democratic) voters in dense urban areas makes it harder to avoid gerrymandering. How can we locate these political “islands” to better study them and identify trends in their formation?

These are just a few of the questions about America’s political gap that pervade mainstream discussion. U.S. politics has become increasingly polarized over the last several years, impacting policy debate and electoral behavior and in turn leading to partisan gridlock.

We use tools from topological data analysis (TDA) to identify political islands at large scale. To this end, we have proposed several methods for applying TDA to geospatial data and applied them to 2016 precinct-level election data¹ from the state of California [1].

When searching for political islands, we can imagine that we are looking for gaps in a “sea” of regions with similar electoral preferences. Homology, a tool from algebraic topology that characterizes topological spaces based on their “holes,” is well-suited to finding these types of gaps [4]. A technique known as persistent homology (PH) enables us to locate these holes in data across a variety of scales. This is useful because of variations in physical size between urban and rural precincts. It also allows us to quantify the strength of the differences in opinion between precincts.

To apply PH [5], we need to transform our data into a suitable topological space. In our case, this space takes the form of simplicial complexes [4], which use simplices as building blocks and permit computational tractability for the study of PH. In our recent work, we developed two methods of doing so that yield different interpretations with respect to our original quest to pinpoint political islands [1].

<strong>Figure 1.</strong> Construction of a filtered simplicial complex for California’s Imperial County, with nodes, edges, and faces colored by the order in which we add them to the filtration. The filtered simplicial complex is strictly increasing. We add only the darkest red precincts initially, and we add the lighter red precincts in order of preference, from strongest to weakest. Figure courtesy of [1].

We first introduced an adjacency method that utilizes the network of electoral precincts as its basic structure. Each precinct in this network is a node, with voting information attached to it; each network adjacency is an edge. We focused on the 2016 U.S. presidential election and considered the preference of voters for presidential candidates Hillary Clinton and Donald Trump. An edge exists between two nodes if the precincts that are associated with those nodes share a boundary. If three nodes are connected by all possible pairwise edges, we add a 2-simplex to the simplicial complex. Figure 1 illustrates this process. To exploit the power of PH, we construct a sequence of these simplicial complexes called a “filtered simplicial complex.” We begin only with those precincts that possess the largest percentage of votes for Trump, and we continue to add precincts in order of decreasing preference. We then compute the PH of the entire sequence to track holes as they appear and disappear. More explicitly, we record the strength of preference at which a hole first forms, as well as the voting percentages at which we fill in all missing precincts. This allows us to determine the polarization strength between a voting island and its surrounding precincts. Holes that last longer indicate stronger polarization than those that rapidly appear and disappear.
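The adjacency construction described above can be sketched in a few lines of Python. This is an illustrative toy, not the code used in [1]: the function name `adjacency_filtration`, the input format, and the four-precinct data are all assumptions made for the example. A precinct's filtration value is taken to be 100 minus its Trump percentage, so that the strongest Trump precincts enter first; an edge or 2-simplex enters as soon as all of its vertices are present.

```python
from itertools import combinations

def adjacency_filtration(trump_pct, adjacency):
    """Build a toy filtered simplicial complex from a precinct network.

    trump_pct: dict mapping precinct label -> percentage of votes for Trump
               (hypothetical input format).
    adjacency: iterable of (u, v) pairs of precincts that share a boundary.
    Returns a list of (filtration_value, simplex) pairs in order of entry.
    """
    # Precincts with the strongest Trump preference enter first, so a
    # precinct's filtration value is 100 minus its Trump percentage.
    value = {p: 100 - pct for p, pct in trump_pct.items()}
    edges = {frozenset(pair) for pair in adjacency}

    simplices = [(value[p], (p,)) for p in trump_pct]
    # An edge enters once both of its endpoints are present.
    for edge in edges:
        u, v = sorted(edge)
        simplices.append((max(value[u], value[v]), (u, v)))
    # A 2-simplex enters when all three pairwise edges exist.
    for a, b, c in combinations(sorted(trump_pct), 3):
        if {frozenset((a, b)), frozenset((a, c)), frozenset((b, c))} <= edges:
            simplices.append((max(value[a], value[b], value[c]), (a, b, c)))
    # Sort by value, then dimension, so faces always precede their cofaces.
    simplices.sort(key=lambda item: (item[0], len(item[1]), item[1]))
    return simplices

# Toy data: four made-up precincts, not real election results.
toy_pct = {"A": 80, "B": 70, "C": 60, "D": 30}
toy_adjacency = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
for value, simplex in adjacency_filtration(toy_pct, toy_adjacency):
    print(value, simplex)
```

In this toy, precinct A enters first (value 20), the triangle on A, B, and C appears at value 40 when C arrives, and the weakly Trump-leaning precinct D joins last at value 70. The sketch adds every precinct eventually; whether the sequence should stop at precincts that Trump won is a modeling choice that it does not address.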

<strong>Figure 2.</strong> Construction of a filtered simplicial complex using a level-set approach on California’s Imperial County. The initial simplicial complex consists of a triangulation of a map of all red precincts. We then evolve this surface outward and color the simplices based on the order in which they enter the filtered simplicial complex. Figure courtesy of [1].

Our second technique, based on level sets, tracks a voting island’s physical size and uses a county map as its basic structure. For example, consider a map of all precincts in a county that voted for Trump. We triangulate this map by projecting it onto a triangular grid to form a simplicial complex. Any grid cells that are contained entirely within the map’s boundaries become a 2-simplex. Using a level-set method for front propagation, we then evolve the map’s boundaries outward until we fill all of the grid cells. By adding 2-simplices as we fill cells (see Figure 2), we form a filtered simplicial complex, to which we apply PH. The resulting PH computation reveals the number of time steps required to fill a given hole. Because we evolve the entire boundary of a map with the same normal velocity, it takes longer to fill in larger holes than small ones. This allows us to track the geographical size of a given political island.
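To see why larger holes take longer to fill, consider the following rough Python sketch. It is a deliberate simplification rather than the level-set method of [1]: it uses square cells instead of a triangular grid, and it replaces front propagation with a breadth-first search in which the boundary advances by one cell per time step. The function name `fill_times` and the toy grid are hypothetical.

```python
from collections import deque

def fill_times(grid):
    """Return the time step at which each grid cell is filled.

    grid: list of equal-length strings, where '#' marks a cell inside the
          initial (red) map and '.' marks an empty cell.
    Cells of the initial map get time 0; every other cell gets the number
    of one-cell steps the advancing boundary needs to reach it.
    """
    rows, cols = len(grid), len(grid[0])
    time = [[None] * cols for _ in range(rows)]
    queue = deque()
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == "#":
                time[r][c] = 0
                queue.append((r, c))
    # Multi-source breadth-first search: a crude stand-in for evolving the
    # whole boundary outward at the same speed.
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and time[nr][nc] is None:
                time[nr][nc] = time[r][c] + 1
                queue.append((nr, nc))
    return time

# Toy map with a one-cell hole (left) and a 3x3 hole (right).
toy_grid = [
    "#########",
    "#.##...##",
    "####...##",
    "####...##",
    "#########",
]
times = fill_times(toy_grid)
print("small hole filled at step", times[1][1])
print("large hole filled at step", max(times[r][c] for r in (1, 2, 3) for c in (4, 5, 6)))
```

The one-cell hole closes after a single step, whereas the center of the 3x3 hole is not reached until step 2. Recording the step at which each cell enters gives a filtration in which the lifetime of a hole reflects its geographical size.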

As an illustration of our methods in the context of real data, consider Tulare County in California. This county is home to Sequoia National Park and is known for the historical black farming community of Allensworth. Tulare is a strongly Republican (red) county, with a few blue and purple cities dispersed throughout. It also houses Visalia, a very large red city. Our adjacency method (see Figure 3a) captures many loops—mostly around blue and light red islands—whereas the level-set approach (see Figure 3b) successfully captures blue islands. The bar length of a given feature in the “barcodes” [4, 5] in Figure 3 corresponds to polarization strength in the adjacency construction and to hole size in the level-set construction.
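To make the meaning of bar length concrete, the sketch below computes a barcode for a small hand-built filtration with the standard column-reduction algorithm over the field with two elements. It is an illustration under stated assumptions, not the software behind Figure 3: the function name `persistence_intervals` and the toy filtration (four red precincts N, E, S, W that ring a blue precinct C, with filtration value equal to 100 minus the Trump percentage) are invented for this example.

```python
def persistence_intervals(filtration):
    """Compute persistence intervals by column reduction over Z/2.

    filtration: list of (value, simplex) pairs, where each simplex is a
                sorted tuple of vertex labels and every face appears
                before its cofaces.
    Returns a list of (dimension, birth, death) triples; death is None
    for features that never die. Zero-length intervals are omitted.
    """
    index = {simplex: i for i, (_, simplex) in enumerate(filtration)}
    # Column j holds the indices of the codimension-1 faces of simplex j.
    columns = []
    for _, simplex in filtration:
        if len(simplex) == 1:
            columns.append(set())
        else:
            columns.append({index[simplex[:k] + simplex[k + 1:]]
                            for k in range(len(simplex))})
    pivot_owner = {}  # lowest nonzero row of a reduced column -> column index
    intervals = []
    for j, col in enumerate(columns):
        # Add earlier reduced columns (mod 2) until the pivot is new.
        while col and max(col) in pivot_owner:
            col ^= columns[pivot_owner[max(col)]]
        if col:
            i = max(col)
            pivot_owner[i] = j  # simplex j kills the feature born at simplex i
            birth, death = filtration[i][0], filtration[j][0]
            if death > birth:
                intervals.append((len(filtration[i][1]) - 1, birth, death))
    # Simplices that create a feature that nothing ever kills.
    for j, col in enumerate(columns):
        if not col and j not in pivot_owner:
            intervals.append((len(filtration[j][1]) - 1, filtration[j][0], None))
    return intervals

# Toy filtration (made-up values): ring N-E-S-W around a central precinct C.
toy_filtration = [
    (20, ("N",)), (25, ("E",)), (25, ("E", "N")),
    (30, ("S",)), (30, ("E", "S")),
    (35, ("W",)), (35, ("N", "W")), (35, ("S", "W")),
    (70, ("C",)), (70, ("C", "E")), (70, ("C", "N")),
    (70, ("C", "S")), (70, ("C", "W")),
    (70, ("C", "E", "N")), (70, ("C", "E", "S")),
    (70, ("C", "N", "W")), (70, ("C", "S", "W")),
]
print(persistence_intervals(toy_filtration))
```

The output contains one connected component that never dies and one loop with birth value 35 and death value 70. The loop forms when the last ring precinct enters and closes only when the central precinct is added, so its bar length of 35 percentage points measures how strongly the island differs from its surroundings in this toy example.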

<strong>Figure 3.</strong> Finding political islands in Tulare County, Calif. <strong>3a.</strong> Features of Tulare County with the adjacency construction. The top panel depicts a feature map that highlights the generators of our persistent homology (PH) computation’s longest-persisting features. The bottom two panels show the barcode for this PH computation, where each bar in the code represents one feature. The left endpoint indicates the scale at which a feature is born, and the right endpoint indicates the scale at which it dies. All six loops (and highlighted bars) capture medium- to dark-red precincts that surround the blue precincts. <strong>3b.</strong> Feature map and barcode from the level-set construction on Tulare County. All features capture blue precincts that are surrounded by red precincts. Figure courtesy of [1].

We also tabulated our computational results and compared them with those from existing Vietoris–Rips simplicial constructions [1]. These comparisons indicate that our methods run at speeds equal to or faster than those of the standard techniques. Our methods also yield more interpretable PH results for our quest to find political islands.

In the future, we hope to apply our methods to a longitudinal study of California precincts (and other map-based electoral data). Our adjacency and level-set constructions will be useful for applications beyond the analysis of voting islands, and we are currently utilizing them to study additional spatial systems, such as urban and biological structures. We hope that our work will inspire other researchers to begin or continue using topological tools to pursue problems in spatial networks and other spatial systems [2, 3].

¹ The data were provided by the Los Angeles Times data visualization team.

At the 2019 SIAM Conference on Applications of Dynamical Systems, which took place last year in Snowbird, Utah, Michelle Feng described a project from the 2018 Voting Rights Data Institute that used an adjacency-based construction of demographic data to examine racial segregation in cities. The presentation is available from SIAM either as slides with synchronized audio or as a PDF of slides only.

Update: Michelle Feng won a 2021 SIAM Student Paper Prize for her paper with Mason Porter entitled "Persistent Homology of Geospatial Data: A Case Study with Voting," which appeared in Vol. 63, Issue 1 of SIAM Review in 2021. That paper, on which this article is based, is freely available online until August 20, 2023.

References

[1] Feng, M., & Porter, M.A. (2019). Persistent homology of geospatial data: a case study with voting. Preprint, arXiv:1902.05911.
[2] Katifori, E., & Magnasco, M.O. (2012). Quantifying loopy network architectures. PLoS ONE, 7(6), e37994.
[3] Motta, F.C., Neville, R., Shipman, P.D., Pearson, D.A., & Bradley, R.M. (2018). Measures of order for nearly hexagonal lattices. Phys. D: Nonlin. Phenom., 380-381, 17–30.
[4] Otter, N., Porter, M.A., Tillmann, U., Grindrod, P., & Harrington, H.A. (2017). A roadmap for the computation of persistent homology. EPJ Data Sci., 6, 17.
[5] Topaz, C. (2016). Topological data analysis: One applied mathematician’s heartwarming story of struggle, triumph, and ultimately, more struggle. SIAM Dynamical Systems Web. Retrieved from https://dsweb.siam.org/The-Magazine/Article/topological-data-analysis.

About the Authors

Michelle Feng

Student, University of California, Los Angeles

Michelle Feng is a graduate student in the Department of Mathematics at the University of California, Los Angeles. Her research interests include topological data analysis, network science, and mathematical applications to political and social science.

Mason A. Porter

Professor, University of California, Los Angeles

Mason A. Porter is a professor in the Department of Mathematics at the University of California, Los Angeles. His research interests include the theory, methods, and applications of complex systems, nonlinear systems, and networks.
