Spaces Liminal & Latent

Chad Eby

Stable Diffusion FLUX image of an empty hospital corridor; a liminal space from latent space

Latent Space

Latent space, as understood in generative artificial intelligence work, refers to a high-dimensional mathematical construct that serves as a space of transformation between inputs of human intent and unruly—sometimes uncanny—generative output. Because of the particular way that latent spaces have developed as a space- and computation-saving abstraction, they are largely opaque from the outside; each one a black box but brimming inside with a riotous multitude of unrealized potentialities.

Unlike the step-by-step logic of traditional imperative computer programs, generative AI models that employ latent space remain largely inscrutable, existing as complex, mathematical abstractions that defy un-augmented human comprehension. The imaginary of latent space stands in as an undiscovered country; a largely impenetrable territory known only from the fragmentary texts, images, and sounds smuggled out through computationally intensive decoding or from arduous and incomplete nascent attempts at analysis and mapping.

Specifically for image generation, a latent space will contain data abstracted from digitized images paired with matching text descriptions. Enormous beyond human imagination, each latent space is like a secret garden overgrown with vector encodings gathered from the probabilistic patterns of literal billions of text-image pairs. The latent space is where an encoded prompt may be transformed into a particular image out of the myriad possible ones.1

The architecture of latent spaces varies across different types of generative techniques. Variational Autoencoders (VAEs) explicitly construct their latent space as a probabilistic distribution, typically a multi-dimensional Gaussian, allowing them to generate novel outputs by sampling from different regions. In contrast, Generative Adversarial Networks (GANs) implicitly learn a latent space through an antagonistic interplay between generator and discriminator networks. While often less structured than VAE latent spaces, the dance of GANs can nonetheless produce high-fidelity output. Diffusion models also employ autoencoders, but add a probabilistic denoising method, effectively traversing a path through latent space from pure noise to a coherent image. This more structured iterative process allows diffusion models to generate believable images while maintaining some degree of interpretability of their latent representations, offering the possibility to steer image generation modestly toward preferred outcomes.2
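The VAE sampling described above can be sketched in a few lines of NumPy via the reparameterization trick, in which the encoder's predicted mean and variance are combined with fresh Gaussian noise to pick a point in latent space. The four-dimensional latent and the specific values here are purely illustrative toys, not drawn from any real model:

```python
import numpy as np

def reparameterize(mu, log_var, rng):
    """Sample a latent vector z ~ N(mu, sigma^2), the Gaussian
    a VAE encoder predicts, using the reparameterization trick."""
    sigma = np.exp(0.5 * log_var)      # log-variance -> standard deviation
    eps = rng.standard_normal(mu.shape)  # fresh unit Gaussian noise
    return mu + sigma * eps

rng = np.random.default_rng(0)

# A toy 4-dimensional latent distribution "predicted" by an encoder.
mu = np.array([0.0, 1.0, -0.5, 2.0])
log_var = np.array([0.0, -1.0, 0.5, -2.0])

z = reparameterize(mu, log_var, rng)  # one sampled point in latent space
```

Decoding `z` through the generator network would then yield one concrete output among the myriad the distribution makes possible; resampling `eps` wanders to a neighboring possibility.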

Efforts to visualize and interpret latent spaces have become a significant area of research in the field of AI interpretability. Techniques like t-SNE and UMAP allow for the projection of high-dimensional latent spaces into two or three dimensions, providing a look into the organization of learned features.3 These visualizations are necessarily imperfect, offering only small glimpses into the complex relationships encoded within a latent space. Latent space interpolation has become a powerful tool for exploring the generative capabilities of these models. By smoothly transitioning between different vector coordinates in latent space, we can observe how a model’s encoding of features and concepts evolves. This technique has been used to create mesmerizing visual effects, such as morphing between different faces or objects, and, increasingly, to produce coherent video clips.
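The smooth transitions described above are commonly implemented as spherical rather than straight-line interpolation, since high-dimensional Gaussian latents concentrate near a hypersphere and a straight chord would cut through an implausibly low-norm region. A minimal sketch follows; the 512-dimensional vectors are an arbitrary stand-in for a real model’s latents:

```python
import numpy as np

def slerp(z0, z1, t):
    """Spherical linear interpolation between two latent vectors,
    keeping intermediate points at a plausible distance from the origin."""
    u0 = z0 / np.linalg.norm(z0)
    u1 = z1 / np.linalg.norm(z1)
    omega = np.arccos(np.clip(np.dot(u0, u1), -1.0, 1.0))  # angle between them
    if np.isclose(omega, 0.0):          # nearly parallel: fall back to lerp
        return (1 - t) * z0 + t * z1
    return (np.sin((1 - t) * omega) * z0 + np.sin(t * omega) * z1) / np.sin(omega)

rng = np.random.default_rng(7)
a, b = rng.standard_normal(512), rng.standard_normal(512)

# Eight waypoints along the arc from a to b; decoding each in turn
# would yield the familiar morphing effect.
path = [slerp(a, b, t) for t in np.linspace(0.0, 1.0, 8)]
```

Decoding each waypoint through the model’s decoder produces the morphing sequences mentioned above; the endpoints reproduce the two source latents exactly.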

Liminal Space

Liminal space, like latent space, is similarly understood as a space of transition, ambiguity, and transformation. The adjective “liminal” derives from the Latin līmen, a word that referred to literal doorway thresholds. In the nineteenth century, the word was used by psychologists to indicate a lower limit—a metaphorical threshold beyond which a sensation is too faint to be perceived.4 Only later did the word come to mean a general transitional or intermediate state, and by coupling “liminal” to “space,” current usage restores a measure of the original term’s architectural connections.

In liminal spaces we encounter the unsettling in-betweenness of not being quite where we were, but also not quite having arrived at where we intended to be. It is in this sense that liminal spaces are associated with the middle stage of a rite of passage; they encompass a disorienting zone of uncertainty.5

Latent and liminal spaces conceptually overlap as being zonic or spatial in structure, but more importantly they share a similar functional role as a place of ambiguity, potentiality, transition, translation, and transformation. In addition to the structural and functional similarities, latent space may be argued to share an additional characteristic of liminal space: it is also haunted. The spectral turn toward “hauntology” in the Derridean sense acts as a vehicle to further explore the intersections of liminal and latent space and may offer new ways to think about interacting with generative AI.6

The Spectral Turn

The generative capacities of modern AI systems arise from their ability to navigate the liminal, in-between corridors of latent space; the high-dimensional vector fields, where the essential features and patterns of the training data are distilled, organized, and interrelated in complex, convoluted ways. In these not-quite-spaces, generative models begin to channel the spectral influences of their pasts, transmuting them into startlingly original and sometimes deeply uncanny new forms.

Latent space models, called “checkpoints” in Stable Diffusion, are like vast graveyards filled with the encoded bodies of images paired with their textual epitaphs. It is in this sense that latent space becomes a site of digital necromancy, where the ghosts of the past are summoned to birth strange new forms. Projects like Deep Dream, which leverage the latent representations of convolutional neural networks to generate hallucinatory imagery of slug-dogs and orientalist pagodas, are emblematic of this early generative image aesthetic. The uncanny qualities of these outputs speak to a deeper unease with the machine’s ability to channel the spectral influences of its training data, conjuring visions that exist in a liminal zone between the familiar and the alien.7

The hauntological dimension of latent space speaks to an ontological unease that permeates our relationship with generative AI technologies. Just as liminal spaces complicate place, stability, and identity, the latent spaces of AI systems act to challenge some of our most basic understandings of creativity, authorship, collaboration, agency—and even representation itself. The ghostly presences that haunt the outputs of these models—the echoes of data past, the long traces of encoded biases, and the virtual potentialities of billions of unrealized futures—nudge us to consider the limits of our own agency and the spectral influences that increasingly modulate our technologically-mediated making.

Semantic Probes

Unlike more engineering-oriented projects intended to map the contours of latent space in a direct way, my Alphamerics project explores latent spaces through a series of images generated using Stable Diffusion models. The images are generated by sending near-empty prompts (single uppercase alphabetic characters A–Z and numerals 0–9), all sharing a single seed, as “probes” into a latent space to see what may come back. A full run of these prompts returns a grid of 36 images that often share strong visual similarities in composition and color palette, because the seed value sets the initial conditions for the distribution of noise, but that may at the same time vary wildly in subject matter and theme from image to image within the same seed series. Nearly all of the images feel a little haunted.

Figure 1: An “alphameric” grid for the seed 4096

While a completely empty prompt is the most neutral vehicle for probing latent spaces, the inevitable drawback is that there is only one empty prompt per seed, and since the seed determines initial conditions, it isn’t possible to “walk the seed” by prompting. This led me to single-character prompts as the next best solution, since I suspected they might be tokenized in such a way as to fall beneath the threshold of semantic content. Recent investigations I’ve made with a wider range of checkpoints and encoders (especially with the new Flux models) point to there being more semantic content in single-character prompts generally than I first believed, and, specifically, significantly more in some single-character prompts than others; “Q” and “X” seem to be particularly potent signifiers, perhaps related to their relatively infrequent use in English words.
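The seed’s role as the fixer of initial conditions can be illustrated schematically: with the seed held constant, the starting noise a diffusion model denoises is identical no matter which character is sent as a prompt, so only the prompt’s (slight) semantic pull differentiates the 36 images in a grid. The NumPy generator and tensor shape below merely stand in for a real model’s latent noise tensor:

```python
import numpy as np

def initial_noise(seed, shape=(4, 64, 64)):
    """Stand-in for a diffusion model's starting latent: the seed alone
    fixes this noise; the prompt only steers the denoising path afterward."""
    return np.random.default_rng(seed).standard_normal(shape)

noise_a = initial_noise(4096)  # would be paired with the prompt "A"
noise_q = initial_noise(4096)  # would be paired with the prompt "Q"

# Same seed, different prompts: identical initial conditions.
same_start = np.array_equal(noise_a, noise_q)
```

This is why images in one seed series share composition and palette while diverging in subject: the shared noise anchors the low-frequency structure, and each character prompt haunts it differently.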

Figure 2: An “alphameric” grid for the seed 32768

These approaches have revealed some useful insights into how generative models encode and associate information, but they also highlight the vast gulf between human-interpretable concepts and the abstract, high-dimensional representations learned by neural networks.

Large Language Models

The spectral potentialities of generative AI extend beyond the visuals smuggled out of latent space. In the domain of natural language processing, Large Language Models (LLMs) like GPT-3 have demonstrated a remarkable capacity to generate human-like text, from creative fiction to persuasive essays. This fluency poses a haunting proposition: that a language model performs metaphorically as a medium, channeling the ghostly influences of its training corpus like a statistical Ouija board to produce novel combinations of words and ideas that feel disturbingly lifelike.

The hauntological quality of large language models speaks to deeper questions of authorship and authenticity. As generative AI systems demonstrate an uncanny ability to mimic and recombine human modes of expression, the arguments for human exceptionalism in terms of individual creativity and originality may be called into question, not to mention the legitimacy and appropriateness of the use of generative AI itself. We find ourselves confronted with the spectral presence of the symbolic machine, a frothy, ghostly entity that can seemingly conduct the voices of the past to craft compelling new narratives—a feat that challenges our understanding of what it means to be a writer, a thinker, or a creative individual in the 21st century.

The implications of this hauntological dimension of the latent space extend beyond the realm of aesthetics and language, casting a long shadow over the practical applications of generative AI. In fields like scientific research, medical diagnostics, and financial forecasting, latent space representations are becoming essential tools for extracting meaningful insights from complex, high-dimensional data. Yet, the opacity of these latent spaces, and the spectral influences that nip and worry at their generative outputs, raise important questions about the reliability, transparency, and accountability of these systems.

The ethical implications of latent space representations in AI systems are manifold. Issues of safety, bias and fairness in AI have already been widely discussed, but the latent space adds a new dimension to these concerns. The complex, distributed nature of these representations makes it challenging to identify and mitigate dangers or biases, as they may be encoded in subtle, non-linear ways across multiple dimensions of a latent space.

As we entrust ever greater decision-making power to generative AI models, we should remain sensitive to the unsettling reality that the latent spaces at the heart of these technologies are haunted by the ghosts of their pasts. The biases, blind spots, virtual potentialities/lost futures encoded within these representations have the power to profoundly shape the trajectories of our institutions, our policies, and our collective futures. The stakes here are high, and the need to develop a more nuanced understanding of the latent space and its spectral qualities is vital.

Haunting as Resistance

Due to the liminal nature of latent space, the unchained ghosts we summon there are radically polysemous; they offer to show us not only the traces of our past inequities and encodings of present dominant narratives, but through their perversely polymorphic tendency they point to representations of recombinant possibilities that differ from either—not just lost futures, but the most fundamental raw material for preferred ones.

In this way, latent space may be seen as a Pandora’s box: unleashing evil upon the world, but also containing hope (or, more cynically, “deceptive expectation”) simply as a demonstration of the condition of possibility for something different from what is now; an endless well of disruption. The caveat to this claim is that the ghosts must be truly unchained—the enormous costs of training generative AI models are often undertaken as investments by profit-seeking entities, who are then compelled to limit the inputs and outputs of such systems to avoid alienating advertisers or investors. This is a topic for a different paper, but I will say that as a site of resistance, open-source software under local control is essential.

Conclusion

By embracing the hauntological dimensions of latent space, we may find new pathways for navigating the ontological uncertainties of the technological present we now inhabit. This confrontation with the spectral nature of latent spaces may require new modes of thinking and new frameworks for analysis. Rather than viewing these generative models as neutral, deterministic tools, we might more profitably understand them as complex, liminal entities—repositories of virtual potentiality that are inextricably linked to the spectral influences of their training data and the broader socio-cultural and economic contexts in which they are embedded. By variously dispelling or partnering with the ghosts that haunt latent space, we may use these strange constructs as a fulcrum to lever into place visions of our preferred futures.

Notes

  1. Zhang, Chenshuang, Chaoning Zhang, Mengchun Zhang, and In So Kweon. “Text-to-image diffusion models in generative ai: A survey.” arXiv preprint arXiv:2303.07909 (2023). ↩︎
  2. Kingma, Diederik P., and Max Welling. “An introduction to variational autoencoders.” Foundations and Trends in Machine Learning 12, no. 4 (2019): 307-392. ↩︎
  3. Arora, Sanjeev, Wei Hu, and Pravesh K. Kothari. “An analysis of the t-sne algorithm for data visualization.” In Conference on learning theory, pp. 1455-1462. PMLR, 2018. ↩︎
  4. Oxford English Dictionary, s.v. “liminal (adj.), sense 3,” July 2023, https://doi.org/10.1093/OED/6646919567. ↩︎
  5. Andrews, Hazel, and Les Roberts, eds. Liminal Landscapes. New York: Taylor & Francis, 2012. ↩︎
  6. Derrida, Jacques. Specters of Marx : the state of the debt, the work of mourning, and the New international. United Kingdom: Routledge, 1994. ↩︎
  7. Mordvintsev, Alexander, Christopher Olah, and Mike Tyka. “Deepdream-a code example for visualizing neural networks.” Google Research 2, no. 5 (2015). ↩︎