Network Analysis 101

The advent of the internet, and especially of its more socially connected Web 2.0 variant, has ushered in a golden age for the concept of the network.  The interconnected world we now live in has changed not only the way we study computers and the internet, but the very way we envision the world and humanity’s place in it, as Thomas Fisher has argued.  The digital technologies that we are learning to use in this class are tightly linked to these new understandings, making network analysis a powerful addition to the Digital Humanist’s toolkit.  According to Fisher,

The increasingly weblike way of seeing the world … has profound implications for how and in what form we will seek information. The printed book offers us a linear way of doing so. We begin at the beginning—or maybe at the end, with the index—and work forward or backward through a book, or at least parts of it, to find the information we need. Digital media, in contrast, operate in networked ways, with hyperlinked texts taking us in multiple directions, social media placing us in multiple communities, and geographic information systems arranging data in multiple layers. No one starting place, relationship, or layer has privilege over any other in such a world.

Small Network

To study this world, it can therefore be helpful to privilege not the people, places, ideas or things that have traditionally occupied humanistic scholarship, but the relationships between them.  Network analysis, at root, is the study of the relationships between discrete objects, which are represented as graphs of nodes or vertices (the things) and edges (the relationships between those things).  This is a very active area of research that emerged from mathematics but is being explored in a wide array of disciplines, resulting in a vast literature.  (Scott Weingart offers a gentle introduction for the non-tech-savvy in his Networks Demystified series, and you can get a sense of the scope from the Wikipedia entry on Network Theory.)  As hackers, we are not going to get too deep into the mathematical underpinnings and will rely mostly on software platforms that make network visualization relatively easy, but it is important to have a basic understanding of what these visualizations actually mean in order to use them critically and interpret them correctly.


Exercise: Your (analog) social network

The basics of visualizing a network are fairly intuitive and can be done with pen and paper.

  • Draw a simple diagram of your own social network in high school including
    • 10-12 people as nodes and
    • your relationship to them as edges
  • Put yourself at the center and then place other people around you.
    • Start with your close friends and then expand out to include friends of friends, people you know through clubs or activities, etc.
  • Draw lines to connect these people to yourself
  • Now draw lines to connect them to each other.
    • How many have relationships that do not run through you?
  • As undifferentiated lines, these are probably not very informative, so code the lines to indicate the nature of each relationship

What takeaways emerge from your diagram?

Are there connections that surprised you or figures that emerge as more central to your network than you had realized?


Exercise: Your (digital) social network

The relationships you just drew can be expressed in a simple data model as a “triple” composed of a subject, a predicate, and an object.  My relationship to my friend Chris, for instance, can be expressed as a triple in the following format:

Austin — is friends with — Chris

subject — predicate — object

Your whole network can be represented this way as a set of triples, which allows for easily readable data storage and ready network visualization.  Many DH projects make extensive use of the RDF (Resource Description Framework) specifications for modeling large sets of data as an RDF graph of triples.  For our small example, we are going to recast our personal network as a set of triples and visualize it as a digital network using Google’s Fusion Tables application.
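The triple model can be sketched in a few lines of Python.  This is only an illustration, not part of the Fusion Tables workflow: every name other than Austin and Chris, and the "is teammates with" relation, are invented sample data.

```python
# Each triple is (subject, predicate, object).
# Hypothetical sample data: only the Austin/Chris triple comes from the text.
triples = [
    ("Austin", "is friends with", "Chris"),
    ("Austin", "is friends with", "Dana"),
    ("Chris", "is friends with", "Dana"),
    ("Dana", "is teammates with", "Eli"),
]

# Every subject and object is a node; collect each name once.
nodes = sorted({name for subject, _, obj in triples for name in (subject, obj)})

# Each triple is one edge between two nodes; the predicate labels the edge.
edges = [(subject, obj, predicate) for subject, predicate, obj in triples]

print(nodes)
for source, target, label in edges:
    print(f"{source} -> {target} ({label})")
```

Notice that the nodes never have to be listed separately: they fall out of the triples themselves.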

NB: As stated before, Fusion Tables is sadly being discontinued effective December 3, 2019. To do a full project in network analysis, we would want to use a dedicated application like Gephi or NodeXL (see below). But since this is just a quick demo, we will use Fusion Tables today.

Fusion Tables is an experimental platform for data visualization that Google developed to allow spreadsheet data to be quickly visualized in any number of ways, from traditional bar and line charts to maps and network visualizations.  For now, we are going to focus on its Network Graph capabilities.


Our first step will be to populate a Google Sheet with triples representing our own network data, and then import it into Fusion Tables and visualize it.

  • Launch Google Drive and create a new sheet with the following three columns: Person A, Relation, and Person B
    • Go through your hand-drawn diagram and translate each network relationship into a triple following the model above

(One word of caution — there are two types of relationship that can be expressed here: mutual and unreciprocated.  “Is friends with” or “is a sibling of” would be mutual relationships that produce an undirected graph.  Directed graphs map one-sided relationships like “is the parent of,” “is the student of” or “is in love with” by drawing a directional arrow for the edge.  Both are possible and can be used, but you should be aware of the distinction as you draw up your triples and stick to one or the other.)
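To make the distinction concrete, here is a minimal Python sketch that stores the same pairs both ways.  The function name build_adjacency and the sample names are hypothetical, chosen only for illustration.

```python
def build_adjacency(pairs, directed):
    """Map each node to the set of nodes its edges point to."""
    adjacency = {}
    for source, target in pairs:
        adjacency.setdefault(source, set()).add(target)
        # Make sure the target is listed as a node even if nothing leaves it.
        adjacency.setdefault(target, set())
        if not directed:
            # A mutual relationship is recorded in both directions.
            adjacency[target].add(source)
    return adjacency

# Hypothetical sample pairs.
pairs = [("Ana", "Ben"), ("Ben", "Cal")]

undirected = build_adjacency(pairs, directed=False)  # e.g. "is friends with"
directed = build_adjacency(pairs, directed=True)     # e.g. "is the parent of"

print(undirected)
print(directed)
```

In the undirected version Cal is linked back to Ben; in the directed version Cal has no outgoing edges at all, which is why mixing the two kinds of relationship in one graph can mislead.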

This data model is unlike a relational database in that you will be repeating names in order to express all of the relationships in the graph.

  • Try to connect each person or node with at least two others
  • Make sure you are logged in and save your sheet
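If you would rather generate the table than type it, the sketch below builds the same three-column layout as CSV text using Python's standard csv module.  The helper name triples_to_csv and the sample rows are assumptions for illustration; the header names follow the column names given above.

```python
import csv
import io

def triples_to_csv(triples):
    """Return CSV text with one header row and one row per triple."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Person A", "Relation", "Person B"])
    writer.writerows(triples)
    return buffer.getvalue()

# Hypothetical sample data; note that names repeat across rows.
sample = [
    ("Austin", "is friends with", "Chris"),
    ("Austin", "is friends with", "Dana"),
    ("Chris", "is friends with", "Dana"),
]

csv_text = triples_to_csv(sample)
print(csv_text)
```

The printed text can be saved as a .csv file and opened in a spreadsheet application.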

Import your data into Fusion Tables


Go to the Fusion Tables start page, click on Google Spreadsheets and import your data, checking the Export box if you wish to make the data public and downloadable. 

  • A window should open showing your data table.  Click the red plus sign to add a new chart of the type “Network graph,” then change the options to Show the Link Between your Person A and Person B columns.
  • Congratulations!  You have just made a graph of your social network.  Explore the limited options and apply some filters, then click and drag around the graph to see how you can change the visualization.

This is a simple assignment that only has basic relationships in it. If you have more data points associated with each node, you can do more complex calculations.
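Even with nothing but the list of relationships, one simple calculation is possible: counting how many edges touch each node, a measure usually called degree.  The sketch below assumes an undirected network and uses invented sample names; it is not something Fusion Tables does for you.

```python
from collections import Counter

def degree_counts(pairs):
    """Count how many edges touch each node in an undirected network."""
    degrees = Counter()
    for person_a, person_b in pairs:
        degrees[person_a] += 1
        degrees[person_b] += 1
    return degrees

# Hypothetical sample network with "Me" at the center.
pairs = [
    ("Me", "Ana"),
    ("Me", "Ben"),
    ("Me", "Cal"),
    ("Ana", "Ben"),
    ("Cal", "Dee"),
]

degrees = degree_counts(pairs)
for name, degree in degrees.most_common():
    print(name, degree)
```

Nodes with a high degree are one rough indicator of who is central to the network, which is a quick way to check the impressions you formed from your hand-drawn diagram.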

Google Senate Network example

Exercise Two:

Now that you know the basics of what a network graph is and how to create a rudimentary one, let’s explore some much more sophisticated network analysis DH projects.  With your neighbors, explore one or more of the following projects:

As you explore the project, consider the following questions about the nature of this network analysis:

  • What (or who) are the nodes and what are the edges?
  • How are the relationships characterized and categorized?
  • What interactions does the project allow?
    • How do these interactions affect the project’s effectiveness and/or your engagement?
  • How was the project created?
    • See if you can dig around in the documentation and discover what tools or data manipulation steps produced the outcome you see.
    • Does the project combine network analysis with any other information or technique, like spatial analysis or text mining?

Resources

  • The main software DHers use for Network Analysis is Gephi, which is a free download and has a Quick Start Tutorial.
  • The other popular option is a plugin for MS Excel called NodeXL, which we have used in previous iterations of this class to make a social network of Carleton’s early days.

Assignment Tutorial Blog Post (Due Friday, 3/8)

For this assignment, create a step-by-step tutorial as a blog post demonstrating a particular technique, tool, or other helpful how-to discovery you’ve made over the past several weeks in this course.

Pick a DH tool that we haven’t discussed yet and figure out an interesting use case for it (or, vice versa, pick a use case and figure out a potentially viable DH tool or methodology).  You can highlight a technique that you have discovered in class, or in the preparation of your projects, as long as it is not one we’ve all covered together already.  If you’re stuck for ideas, the Dirt Digital Research Tools directory offers an extensive list of software for academic uses.

Once you have an idea, create an online tutorial for the rest of us and the wider world to start paying forward what you’ve learned in the course and becoming the “local computer expert.”  For examples, you can look at some of the posts for this class, think back on all those SketchUp resources you’ve looked through, or see the software posts on the Profhacker blog.

Your tutorial blog post should include:

  • An introductory paragraph explaining clearly
    • what the tool or technique is and
    • why or in what context it would be useful
  • A step-by-step walkthrough of how to accomplish a specific task using the tool that contains
    • At least 5 steps 
    • EITHER screenshots illustrating the steps where appropriate
    • OR a screencast video in which you record your actions while speaking about the process into a microphone
  • Links to at least two further resources like the software’s documentation or other tutorials around the web

For screen capture software, if you Google “how to create tutorials screenshot” you’ll be overwhelmed with options.