Category Archives: Social Network Analysis

Ekphrasis as an LDA Network in NodeXL

In an earlier post, I mentioned the value of visualizations as a means for exploring topic modeling data. That particular example used a small model of 276 poems labeled “ekphrastic” out of a much larger collection. At that point, I was still struggling with how to read the data, which felt overwhelming. How could I organize the relationships between topics and documents in such a way as to see salient connections produced by the model? The intermediate solution was to break the model down into groups of 3 topics and create bar graphs charting the likelihood that each document contained language from each topic. That solution worked in the short term, because it helped me to discover that one topic was highly likely to appear within a particular volume of ekphrastic verse: John Hollander’s The Gazer’s Spirit.

Still, what I wanted was an impressionistic overview of the documents’ association with all of the topics. The first 40 or so attempts at this process were a dismal failure. Partly because it was a learning process and partly because the results frequently resembled the much-maligned “hairball,” what I produced was completely incomprehensible. However, from August 20th to 24th I attended the Summer Social Webshop on Technology-Mediated Social Participation, funded by the NSF, the Social Media Research Foundation, and Grand. There, I met Marc Smith, who began developing NodeXL, a social media network analysis tool built to work with Microsoft Excel, while he worked for Microsoft Research. Marc, who now leads the Social Media Research Foundation and Connected Action, generously took time to demonstrate how to import my topic modeling data into NodeXL so that I could generate graphs that are more elegant and streamlined than any I’ve been able to produce to this point. The results aren’t just beautiful: they’re useful.

So, what are those results? They include unimodal and bimodal network graphs that visualize connections between documents and other documents, between topics and other topics, and between documents and topics, all drawn from an LDA model created in MALLET. Using NodeXL’s algorithms, I am able to cluster groups with stronger ties in grid areas, assign them unique colors, and demonstrate the degree of probability the model calculates as a connection between nodes (either documents or topics, depending on the graph). The real power of NodeXL, though, is that in the future I can make my data public through the NodeXL gallery, and you can download my network graph and play with it yourself. The data isn’t quite there yet, but that’s what’s coming.
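For readers who want to try something similar, here is a minimal sketch of how a bimodal (document-to-topic) edge list might be assembled before it goes into a graphing tool. It assumes the model's output has already been read into a mapping from each poem to its topic proportions. The poem names, the proportions, the 0.15 threshold, the column headers, and the function names are all hypothetical placeholders, and this is not the import procedure Marc demonstrated, only the general shape of the data.

```python
import csv
import io

# Hypothetical document-topic proportions (one list of topic weights per poem).
doc_topics = {
    "Poem A": [0.70, 0.20, 0.10],
    "Poem B": [0.05, 0.15, 0.80],
    "Poem C": [0.34, 0.33, 0.33],
}


def bimodal_edges(doc_topics, threshold=0.15):
    """Return (document, topic, weight) edges for proportions >= threshold."""
    edges = []
    for doc, proportions in doc_topics.items():
        for topic_index, weight in enumerate(proportions):
            if weight >= threshold:
                edges.append((doc, f"Topic {topic_index}", weight))
    return edges


def edges_to_csv(edges):
    """Serialize the edge list as CSV text with placeholder column headers."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["Vertex 1", "Vertex 2", "Weight"])
    writer.writerows(edges)
    return buffer.getvalue()


edges = bimodal_edges(doc_topics)
print(edges_to_csv(edges))
```

Dropping the weakest connections with a threshold is one simple way to avoid the “hairball”; the cutoff here is arbitrary and would need tuning against a real model.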

In the meantime, I’ll offer the following image of a network graph that I had hoped to produce with my earlier post about The Gazer’s Spirit. Though the topic label is small, Topic 3 can be seen in the top left-hand corner of the network diagram. The width and color of the edges in the diagram (meaning the lines) are determined by the model’s estimation of how much of each topic is in each poem. If a line is thicker and lighter, the model estimates that a large portion of the poem draws its language from the corresponding topic. Similarly, the thinner and darker a line is, the lower the probability that the poem includes language from the corresponding topic.
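The mapping from a topic proportion to a line's appearance can be sketched in a few lines. This is an illustration only: the width range, the gray scale, and the function names are assumptions of mine, not NodeXL's actual settings.

```python
def edge_width(weight, min_width=1.0, max_width=10.0):
    """Linearly map a topic proportion in [0, 1] to a line width."""
    clamped = max(0.0, min(1.0, weight))
    return min_width + clamped * (max_width - min_width)


def edge_gray(weight):
    """Map a proportion to a gray level from 0 (dark) to 255 (light)."""
    clamped = max(0.0, min(1.0, weight))
    return round(clamped * 255)


# A poem strongly tied to a topic gets a thicker, lighter line than a weak tie.
print(edge_width(0.9), edge_gray(0.9))
print(edge_width(0.1), edge_gray(0.1))
```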

Table 1: Ekphrastic Dataset – 276 poems and 15 topics

Topic 3 (in the top left-hand corner) is primarily composed of connections to poems from The Gazer’s Spirit and is characterized by language that reflects a kind of courtship, including archaic references (thy, thee, thou) and the language of love (er, beauty, grace, eyes, heaven, divine, hand, love). This makes sense in the context of existing knowledge about Hollander’s volume. The collection reads very much like a tribute to painting and the visual arts by poetry, and the language of desire is prevalent throughout. Moreover, both W.J.T. Mitchell and James A.W. Heffernan, two prominent theorists in the ekphrastic tradition, insist that the language of love and desire is a strong, if not dominant, discourse across all of ekphrasis based on a canon of poems mostly included in The Gazer’s Spirit. One might assume, then, that there would be strong connections between a topic composed of the language of courtship, love, and desire and most of the poems in the collection; however, only a few of the poems with a statistically significant portion of their language from Topic 3 are not also in The Gazer’s Spirit: “The Picture of Little T.C. in a Prospect of Flowers,” “The Art of Poetry [excerpt],” “Ozymandias,” “Canto I,” and “My Last Duchess.” Of those poems, none are by female poets.

Poems with highest proportion of Topic 3

The Temeraire (Supposed to Have Been Suggested to an Englishman of the Old Order by the Flight of the Monitor and Merrimac) by Herman Melville

To my Worthy Friend Mr. Peter Lilly: on that Excellent Picture of His majesty, and the Duke of York, drawne by him at Hampton-Court by Sir Richard Lovelace

From The Testament of Beauty, Book III by Robert Bridges

For Spring By Sandro Botticelli (In the Accademia of Florence) by Dante Gabriel Rossetti

To the Statue on the Capitol: Looking Eastward at Dawn by John James Piatt

The Poem of Jacobus Sadoletus on the Statue of Laocoon by Jacobus Sadoleto

To the Fragment of a Statue of Hercules, Commonly Called the Torso by Samuel Rogers

The Last of England by Ford Madox Brown

On the Group of the Three Angels Before the Tent of Abraham, by Rafaelle, in the Vatican by Washington Allston

Death’s Valley To accompany a picture; by request. “The Valley of the Shadow of Death,” from the painting by George Inness by Walt Whitman

Elegiac Stanzas Suggested by a Picture of Peele Castle, in a Storm, Painted by Sir George Beaumont by William Wordsworth

On the Medusa of Leonardo da Vinci in the Florentine Gallery by Percy B. Shelley

The Mind of the Frontispiece to a Book by Ben Jonson

Venus de Milo by Charles-Rene Marie Leconte de Lisle

The City of Dreadful Night by James Thomson

Sonnet by Pietro Aretino

For “Our Lady of the Rocks” By Leonardo da Vinci by Dante Gabriel Rossetti

Mona Lisa by Edith Wharton

Ode on a Grecian Urn by John Keats

The National Painting by Joseph Rodman Drake

The “Moses” of Michael Angelo by Robert Browning

Hiram Powers’ Greek Slave by Elizabeth Barrett Browning

From Childe Harold’s Pilgrimage, canto 4 by George Gordon Byron

The Picture of Little T. C. in a Prospect of Flowers by Andrew Marvell

Before the Mirror (Verses written under a Picture)Inscribed to J. A. Whistler by Algernon Charles Swinburne

For Venetian Pastoral By Giorgione (In the Louvre) by Dante Gabriel Rossetti

The Art of Poetry [excerpt] by Nicolas Boileau-Despreaux

Ozymandias by Percy B. Shelley

The Iliad, Book XVIII, [The Shield of Achilles] by Homer

Canto I by Dante Alighieri

The Hunter in the Snow by William Carlos Williams

Tiepolo’s Hound by Derek Walcott

St. Eustace by Derek Mahon

Three for the Mona Lisa by John Stone

My Last Duchess by Robert Browning

Table 2: Ekphrastic Dataset 15 Topic Model, Topic 3 Highlighted

The only remaining topic that includes the word love fairly high in the key word distribution is Topic 4, which includes the following terms: portrait, monument, foreman, felt, woman, monuments, box, press, bacall, detail, young, thick, crimson, instrument, hotel, compartment, picked, cornell, Europe, lovers. As you can see from the network diagram below, none of the poems with high probabilities of containing Topic 3 are included in the Topic 4 distribution.

Table 3: Ekphrastic Dataset 15 Topic Model, Topic 4 Highlighted

Equally interesting, poems with the highest proportion of Topic 4 are also authored by female poets. Certainly, more poems by men include significant proportions of Topic 4 than poems by women include significant proportions of Topic 3; however, there are striking and salient points to be made about the contrasting networks:

Poems with highest proportion of Topic 4

“Utopia Parkway” after Joseph Cornell’s Penny Arcade Portrait of Lauren Bacall, 1945–46 by Lynda Hull

Canvas and Mirror by Evie Shockley

Portrait of Madame Monet on Her Deathbed by Mary Rose O’Reilley

Internal Monument by G. C. Waldrup

The Uses of Distortion by Caroline Crumpacker

Joseph Cornell, with Box by Michael Dumanis

Drawing Wildflowers by Jorie Graham

The Eye Like a Strange Balloon Mounts Toward Infinity by Mary Jo Bang

Visiting the Wise Men in Cologne by J.P. White

Rhyme by Robert Pinsky

The Street by Stephen Dobyns

The Portrait by Stanley Kunitz

“Picture of a 23-Year-Old Painted by His Friend of the Same Age, an Amateur” by C.P. Cavafy

Portrait in Georgia by Jean Toomer

For the Poem Paterson [1. Detail] by William Carlos Williams

The Dance by William Carlos Williams

Late Self-Portrait by Rembrandt by Jane Hirshfield

Sea Life in St. Mark’s Square by Mary O’Donnell

Washington’s Monument, February, 1885 by Walt Whitman

Still Life by Jorie Graham

Still Life by Tony Hoagland

The Family Photograph by Vona Groarke

The Corn Harvest by William Carlos Williams

Portrait of a Lady by T. S. Eliot

Portrait d’une Femme by Ezra Pound

This impressionistic overview of the ekphrastic dataset, prompted by exploring a network graph of the relationships between topics and poems, is a first step. It is enough, perhaps, to formulate a new hypothesis about the difference between “love” and “lovers” in ekphrastic poetry, or to lend further support to the growing sense that there is a much broader range of kinds of attraction and kinship (a range inclusive of both competitive and kindred discourses) than previous theorizations of the genre have taken into account. The network visualization does more than suggest that there are two very different discourses regarding love and affection in ekphrastic verse: it also suggests possible poems to read closely to see what those differences might be and whether they are worth pursuing further. Through the use of networked relationships between topics and documents, we begin with lists of poems in which the discourse of affinity, affection, and desire, as courtship or as partnership, can be further explored through close readings.

Meeting Edward Tufte’s claim that evidence should be both beautiful and useful, the NodeXL network diagrams of LDA data are a step toward developing methods of evaluating and exploring models of figurative language that do not necessarily fit the same criteria for models of non-figurative texts.

Curating a Network of Wood and Would


In my current research, I argue that Elizabeth Bishop's poem "The Monument" represents a more democratic attitude toward aesthetic objects than what we see from her contemporaries Robert Lowell and John Berryman. Eschewing the "tutelary" relationship between poet as teacher and reader as student, Bishop offsets her own position of power as the artist-creator by including the voice of a resistant, reluctant onlooker whose interrogations about what it is they are supposed to be looking at position the reader as the monument's curator, one who must select between descriptions, views, depth, and purposes for the monument. The monument's physical presence is brought into being collaboratively between the two speakers who create its shape and its potential and the reader who must parse and prioritize the verbal network of the poem. Furthermore, Bishop, who started writing "The Monument" in her Key West notebooks (see Barbara Page) after just having read Wallace Stevens' Owl's Clover, enters into public discourse about the relevance of public monuments (and by extension art) in a social, political, and economic climate in which people are suffering, nations are warring, and the realities of daily life seem to negate the place and purpose of art. Sounds familiar, no?

Bishop responds by arguing that monuments (and painting, and poetry, and sculpture) are significant because they are sites of "commemoration"–an interesting word choice because the word "commemorate" requires communal activity. Unlike her friend Robert Lowell, who uses the bronze relief by Augustus Saint-Gaudens in Boston Common to establish historical connection and significance for himself as an artist, Bishop imagines the monument as evolving to purpose rather than deriving from it.

Visualizing the poem as a network demonstrates how the speakers' relationship to one another builds out of their discursive description of the monument. In the networks below, each speaker is related to the monument through questions and description (recorded as the statements they make in the poem). I have characterized those statements either as grounding the monument in physical space, with tangible attributes, insisting on its materiality [represented by blue lines], or as imagining the monument's potential or possibility through questions, equivalences (is it this or that statements), and statements that intimate a metaphysical presence for the monument [in orange].

The networks are formed by matching each speaker up with each statement made in the poem. As a result, this is a "bimodal" network: one which takes actors and text and studies the relationship between them. The relationship, represented by a line, is further characterized as advancing the monument's status as either "wood" (a physical object belonging to the material, and therefore "real," world) or the word's homophone "would" (a representation of possibility, potential, and by association the "imaginary" life of the mind).
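As a rough illustration of the data behind such a network, here is a minimal sketch of a bimodal edge list in which every edge joins a speaker to a statement and carries a "wood" or "would" label. The speakers, statement identifiers, and labels below are placeholders rather than a transcription of my markup of the poem, and the helper names are hypothetical.

```python
from collections import Counter

# Placeholder edges: (speaker, statement id, kind). Not taken from the poem.
edges = [
    ("Speaker 1", "s1", "wood"),
    ("Speaker 1", "s2", "would"),
    ("Speaker 2", "s3", "would"),
    ("Speaker 2", "s4", "would"),
    ("Speaker 1", "s5", "wood"),
]

# Edge colors follow the scheme described above: blue for wood, orange for would.
COLORS = {"wood": "blue", "would": "orange"}


def tally(edges):
    """Count how many statements of each kind each speaker makes."""
    counts = Counter()
    for speaker, _statement, kind in edges:
        counts[(speaker, kind)] += 1
    return counts


def colored(edges):
    """Replace each edge's kind with the color used to draw it."""
    return [(speaker, statement, COLORS[kind]) for speaker, statement, kind in edges]


print(tally(edges))
print(colored(edges))
```

Tallying the labels per speaker gives a quick check on whether one voice leans toward the material and the other toward the possible before any graph is drawn.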

The reader as curator, then, must choose among the wood and the would–between the multiply rendered descriptions of the monument’s physical presence and its imagined potential, which is a new beginning itself, the shape of which could be poetry, painting, statue or monument, depending on choices the reader makes.

Preparing texts for network visualization


When I presented at MSA 13 earlier this month, I was unsatisfied with my methods for creating network visualizations of texts. I knew that preprocessing automatically would not work yet, since I have yet to identify precisely how I want to designate nodes across larger bodies of poems. What I’ve been looking for is a way to mark texts up descriptively, using some form of markup language (XML, TEI), that would be uniform enough to render data that could be meaningfully displayed, and then to find a visualization software package with an algorithm that would “work” the way I wanted it to. The problem, of course, is that when you’re a rogue DH scholar out in the world borrowing tools and using whatever tends to fall your way, you’re not going to be sure about how each tool works (unless you have a CS or social science degree that includes learning about network algorithms, which I do not have), and this is going to detract from the validity of how and what you say about your object of study. On the flip side, tools and text analysis software are becoming more widely available, and so doing what I’ve done, which is to say Googling “discourse network tool” and finding Philip Leifeld’s “Discourse Network Analyzer,” is actually possible. What is remarkable about how DNA, a GUI text processing software, works is that it is designed as an interpretive tool to mark texts up in XML so that they can be displayed using free network visualizing software such as Visone, Ucinet, or Netdraw. The designed purpose of Leifeld’s DNA software is to collect articles on a topic area and to use those articles to create network visualizations of agreement and disagreement between individuals and groups. For example, the sample dataset used for a tutorial on the software comes from someone at the University of Maryland named Dana R. Fisher (I have no idea who she is… but I’m definitely going to look her up!), who marked up articles, testimony, and other texts about climate change. Essentially, she could input each text into the DNA software and create a basic XML document with very minimal encoding (document type, author, dates, title) and then use DNA to select portions of text that create a “statement” about climate change. By tagging the speaker, the organization the speaker is affiliated with, the content type (a restricted list of terms created by the user to describe the topic being discussed), and whether or not the speaker agreed or disagreed with the topic, she could create networks of statements made about climate change that also included the individuals involved in the climate change debate and their organizations. Such a visualization helps us to understand how much any two groups (say, the Senate and the EPA) agree with one another, to identify the issues on which they agree and disagree, and also to understand affiliations (which speakers are affiliated with which climate change debates).
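To make the tagging scheme concrete, here is a minimal sketch of what a set of coded statements and a derived agreement network might look like as plain data. It is not DNA's file format or its algorithm: the `Statement` record, the `shared_positions` helper, and every speaker, organization, and category name are hypothetical stand-ins for the four things being tagged (speaker, organization, category, agreement).

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Statement:
    speaker: str
    organization: str
    category: str
    agrees: bool  # True if the speaker agrees with the category, False otherwise


# Hypothetical coded statements.
statements = [
    Statement("Speaker A", "Org 1", "category x", True),
    Statement("Speaker B", "Org 2", "category x", True),
    Statement("Speaker C", "Org 2", "category x", False),
    Statement("Speaker A", "Org 1", "category y", False),
    Statement("Speaker C", "Org 2", "category y", False),
]


def shared_positions(statements):
    """Map each pair of speakers to the categories on which they take the same stance."""
    stance = {(s.speaker, s.category): s.agrees for s in statements}
    speakers = sorted({s.speaker for s in statements})
    categories = sorted({s.category for s in statements})
    pairs = {}
    for a, b in combinations(speakers, 2):
        shared = [
            c for c in categories
            if (a, c) in stance and (b, c) in stance and stance[(a, c)] == stance[(b, c)]
        ]
        if shared:
            pairs[(a, b)] = shared
    return pairs


print(shared_positions(statements))
```

Each key in the result is a potential edge between two speakers, and the list of shared categories could serve as that edge's weight or label.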

This isn’t *exactly* what I had in mind, but it’s really darn close. The power of this particular piece of software is that I can be in charge of what constitutes an article (a poem), what constitutes a speaker (the poetic speaker, the author, the third person omniscient… all of them), and the “content” to be described. Granted, the “organization” classification is less helpful to me, but in the instance of “The Venus Hottentot (1825)” I could use this feature to differentiate between speakers in the first section of the poem and those in the second. Using the software this way does not begin to utilize its real power, which is to read topics and speakers over large corpora of texts in similar ways. For now, I’m looking at one poem; however, I can see that in the future I could take this poem and situate it in a larger public discourse about black female subjectivity. I could import, for example, Sander Gilman’s article “Black Bodies, White Bodies: Toward an Iconography of Female Sexuality in Late Nineteenth-Century Art, Medicine, and Literature,” which we know Elizabeth Alexander read before writing the poem. We could also bring in articles by Sadiah Qureshi on “Displaying Sara Baartman” or Terri Francis’s “I and I: Elizabeth Alexander’s Collective First-Person Voice, the Witness and the Lure of Amnesia,” or chapters from Deborah Willis’s Hottentot Venus 2010 and demonstrate how Alexander’s poem participates in a larger act of social recovery.

There are, as with any tool, limitations, though. So far, the only way to create the visualizations is using the speakers, organizations, and categories with directional lines indicating agreement or disagreement. I have not found a way of creating networks of “statements.” In other words, I have not found a way to pull a category and then visualize the network of statements about that category and how they relate to each speaker; however, I have only begun the process of creating visualizations. Another complication is that I have only found ways to associate a statement with one category. I’m fairly certain I can find a workaround for that, but for the moment, that’s not worked out; however, I will say that having to choose between regular category designations (ones of my own creation) did make me very attuned to my assumptions about the text. That process helped me to realize how my visualizations of these networks will always be limited and reminded me that I need to make those limitations transparent when I write about what the visualization actually visualizes.

In the meantime, even though I am not teaching right now, I’m really excited about what this kind of software could mean for my students. In the English 101 courses at the University of Maryland, students write three linked assignment papers on a self-selected research topic. These are position papers, where the student must make purposeful arguments for what he or she believes in and respond to the discourse of the field in which the selected debate is ongoing. We generally assign an annotated bibliography as the first part of that linked assignment as a way of getting students to read the work and then to explain who agrees with whom on particular points and who disagrees. The hard part of this assignment is that each entry is generally 2 paragraphs long, the bibliography includes only 8-10 sources, and getting the students to actually compare arguments, identifying points of agreement and disagreement, is difficult. However, if the assignment were to use the Discourse Network Analyzer to import each article and then go through each article tagging “statements,” “speakers,” “organizations,” and “categories” (for example, are the speakers arguing that a particular action should be taken or that one event causes another…) as well as “agreement” or “disagreement” with that statement, students might begin to see how their readings create a network of ideas, and by understanding who agrees and what they agree upon, each student might be better able to situate him or herself within the discourse of that issue. It’s an intriguing idea to me, and at some point when I’m teaching again, I think I’m going to make use of this technology.