Data Visualization with Gephi

Gephi is open-source visualization software that is especially useful for link and network analysis. It supports in-depth analysis and lets you create and customize your own visuals. Gephi represents data points as nodes and the links between them as lines called edges. Many examples online analyze website pathways, social media, and biological networks, among other things. For this tutorial, I decided to do a network analysis of The Guardian’s Top 100 Best-Selling Books of All Time. Other datasets for Gephi exist on this site, and for simplicity’s sake, I would recommend following along using one of those files. This basic tutorial goes over how to set up Gephi, import a CSV file, and build a visualization of the data.

Step 1: Download the Software

The Gephi software is available on the Gephi website. Click Download Free and choose the appropriate download for your computer. Then open the downloaded file and follow the prompts to install and open the software.

Now is a good time to download the CSV file you will be working with. The Top 100 Books file I am using is from the Guardian website. Click DATA: Download the full spreadsheet and save it to your computer.

Step 2: Import CSV file into Gephi

Go to File > Open > New Project

Select your CSV file. Choose the Separator (usually a comma) and the format you want to import the file as (Adjacency list for this file).
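To make the adjacency list format more concrete: each row starts with one node, and the remaining cells on that row are the nodes it connects to. Here is a minimal Python sketch of how a two-column table could be reshaped into that form. The sample rows and the names `to_adjacency_lines` and `write_adjacency_csv` are made up for illustration and are not taken from the Guardian file, so treat this as one possible approach rather than the way the tutorial's file was prepared.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows (title, publisher); not from the Guardian spreadsheet.
rows = [
    ("Book A", "Publisher X"),
    ("Book B", "Publisher X"),
    ("Book C", "Publisher Y"),
]


def to_adjacency_lines(records):
    """Group titles under their publisher: one output row per publisher."""
    groups = defaultdict(list)
    for title, publisher in records:
        groups[publisher].append(title)
    # First cell is the source node, the rest are the nodes it links to.
    return [[publisher, *titles] for publisher, titles in groups.items()]


def write_adjacency_csv(lines, handle):
    """Write the adjacency rows as comma-separated text."""
    csv.writer(handle).writerows(lines)


adjacency = to_adjacency_lines(rows)
buffer = io.StringIO()  # swap for open("books.csv", "w", newline="") to save a file
write_adjacency_csv(adjacency, buffer)
print(buffer.getvalue())
```

Each printed row is one publisher followed by its books, which is the shape the Adjacency list import option expects.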

Click Next, then finish any import settings. I kept the time representation at Intervals for this file. Click Finish. The next screen will show any potential problems with your CSV file, and it also indicates how many nodes and edges there are in the file. Click OK.
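If you want to sanity-check the node and edge counts in that report, you can count them yourself before importing. The sketch below assumes an adjacency-list CSV and treats every link as undirected and unique. Gephi's own totals depend on the import settings you chose (for example, directed versus undirected), so small differences are possible. The function name `count_nodes_and_edges` and the sample data are made up for illustration.

```python
import csv
import io


def count_nodes_and_edges(handle):
    """Count distinct nodes and distinct undirected edges in an adjacency-list CSV."""
    nodes = set()
    edges = set()
    for row in csv.reader(handle):
        cells = [cell.strip() for cell in row if cell.strip()]
        if not cells:
            continue  # skip blank lines
        source, targets = cells[0], cells[1:]
        nodes.add(source)
        for target in targets:
            nodes.add(target)
            # frozenset ignores direction, so A-B and B-A count once.
            edges.add(frozenset((source, target)))
    return len(nodes), len(edges)


# Hypothetical sample data; replace with open("your_file.csv", newline="").
sample = io.StringIO("Publisher X,Book A,Book B\nPublisher Y,Book C\n")
node_count, edge_count = count_nodes_and_edges(sample)
print(node_count, edge_count)
```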

Step 3: Navigating the Main Screen

After importing your CSV file, you should be taken to the main screen. Each import gets its own window, called a Workspace. The main area is called the Graph module. In this area, you can zoom (slide two fingers up and down on a trackpad) and move the visualization (click and drag with two fingers). If you lose your visualization, you can click the magnifying glass to reset the view.

The initial arrangement of the nodes is random; to fix this, we need to adjust the layout in the Layout module.

Step 4: Customize the Layout

Navigate to the Layout module (bottom left corner) and choose a layout. There are many options; shown here are the Fruchterman Reingold and Force Atlas layouts. All layouts are customizable, so you can edit what your graph looks like. For this tutorial, I will use the Force Atlas layout.

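Within the Force Atlas layout, you can turn on the adjust by size option and change the repulsion strength. Adjusting by size keeps nodes from overlapping, and a higher repulsion strength pushes nodes further apart, which makes the visualization easier to read and more aesthetically pleasing. Check adjust by size, set the repulsion strength to your desired value (I chose 1000.0), and click Run to apply the layout.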
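To get a feel for what a repulsion setting does, here is a toy force-directed step in Python. This is not Gephi's Force Atlas implementation. It is a simplified sketch in which every pair of nodes pushes apart with a force proportional to `repulsion`, and connected nodes pull together. The function name `layout_step`, its parameters, and the sample coordinates are all assumptions made for illustration.

```python
import math


def layout_step(positions, edges, repulsion=1000.0, attraction=0.01, step=0.1):
    """Move every node once: all pairs repel, linked pairs attract."""
    forces = {name: [0.0, 0.0] for name in positions}
    names = list(positions)

    # Repulsion between every pair, falling off with squared distance.
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            dx = positions[a][0] - positions[b][0]
            dy = positions[a][1] - positions[b][1]
            dist = math.hypot(dx, dy) or 1e-9  # avoid division by zero
            push = repulsion / (dist * dist)
            fx, fy = push * dx / dist, push * dy / dist
            forces[a][0] += fx
            forces[a][1] += fy
            forces[b][0] -= fx
            forces[b][1] -= fy

    # Attraction along each edge, growing with distance.
    for a, b in edges:
        dx = positions[b][0] - positions[a][0]
        dy = positions[b][1] - positions[a][1]
        forces[a][0] += attraction * dx
        forces[a][1] += attraction * dy
        forces[b][0] -= attraction * dx
        forces[b][1] -= attraction * dy

    return {
        name: (positions[name][0] + step * forces[name][0],
               positions[name][1] + step * forces[name][1])
        for name in positions
    }


def distance(positions, a, b):
    """Straight-line distance between two nodes."""
    return math.hypot(positions[a][0] - positions[b][0],
                      positions[a][1] - positions[b][1])


start = {"Book A": (0.0, 0.0), "Book B": (1.0, 0.0)}
weak = layout_step(start, edges=[], repulsion=10.0)
strong = layout_step(start, edges=[], repulsion=1000.0)
print(distance(weak, "Book A", "Book B"), distance(strong, "Book A", "Book B"))
```

Running one step with a larger `repulsion` leaves the two sample nodes further apart, which is the same general effect you see when you raise the repulsion strength in the Layout module.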

Step 5: Customize the Appearance

An all-black visual is hard to read and not very practical. In the Appearance module, click Nodes > Ranking, then choose your preferred color scheme or create your own. Click Apply to see it in the Graph module.
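A ranking colors each node according to a numeric value, such as how many connections it has. As a rough illustration of the idea (not Gephi's actual code), the Python sketch below counts each node's connections and blends linearly between two colors. The names `degree_counts` and `rank_to_hex`, the sample edges, and the two RGB colors are all assumptions chosen for the example.

```python
from collections import Counter


def degree_counts(edges):
    """Count how many edges touch each node."""
    degrees = Counter()
    for source, target in edges:
        degrees[source] += 1
        degrees[target] += 1
    return degrees


def rank_to_hex(value, low, high, start=(255, 255, 204), end=(0, 104, 55)):
    """Blend linearly from the start color (lowest value) to the end color (highest)."""
    t = 0.0 if high == low else (value - low) / (high - low)
    rgb = [round(s + t * (e - s)) for s, e in zip(start, end)]
    return "#{:02x}{:02x}{:02x}".format(*rgb)


# Hypothetical edges; not taken from the Guardian file.
edges = [
    ("Publisher X", "Book A"),
    ("Publisher X", "Book B"),
    ("Publisher Y", "Book C"),
]
degrees = degree_counts(edges)
low, high = min(degrees.values()), max(degrees.values())
colors = {node: rank_to_hex(d, low, high) for node, d in degrees.items()}
for node, color in colors.items():
    print(node, degrees[node], color)
```

Nodes with the most connections end up with the darker end of the scale, so the better-connected nodes stand out at a glance.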

Step 6: Add Labels

In order to understand what the graph is showing, you can add labels to the nodes. Use the toolbar at the bottom of the Graph module to add labels, adjust to node size, and adjust the text size. You can also use this toolbar to adjust the size of the edges.

Once all of this is done, you can see your data in a clear visualization. The networks and links between data points (nodes) are represented by lines (edges). Here is one section of the CSV file I used. In this graph, you can see the connections between the books published at Random House.

More Resources

Here are some guides and instructions that helped me create this tutorial, as well as a link to the CSV data set and the Gephi website.

Gephi’s Quickstart Guide

Martin Grandjean’s Introduction to Gephi

Dataset

Gephi Website

8 replies on “Data Visualization with Gephi”

Wow! I tried to do a Gephi tutorial for this assignment and just could not get the software to work for me. Using your tutorial, the things I wanted to do were actually made possible. I think you could be a bit more clear on how to create or where to acquire datasets that would be good for making networks. Overall, I really enjoyed your tutorial and am hoping that somehow Gephi will stop being permanently messed up on my computer– not even deleting the app and re-installing it will fix it. But I now know that when I need it I can easily install it on the lab computer and follow your tutorial to import data.

Great tutorial, really quick and easy way of visualizing data, it seemed a bit daunting at first but got the hang of it quite quickly

This tutorial is really helpful for someone like me who isn’t very competent with coding and what’s at the heart of computers. We are using Gephi for our final project, and have debated about the appearance of our network. You highlighted a lot of the basic components that we are using, and it made it easy to locate them on the screen and see what they affect in the end rather than trying to guess ourselves.

This was a clear and easy tutorial for me to follow. Since this was my first time using it, I really appreciated you providing everything (including the files) I needed in order to learn how to use the program.

Your tutorial was very clear! Having also done a tutorial on Gephi, but focusing more on the data analysis techniques, it was really helpful to see the various ways in which you can make the data easier to visualize. I think it would’ve been helpful to see how the data you’re taking as input was formatted, just so it was easier to see the correlation between what you start with and what you end up with.

I tried to do this tutorial but I couldn’t manage to get my CSV file into Gephi. The option to open my CSV file in Gephi just never came up. I think this is more of a personal technological issue though, as it appears everyone else had success making their plot.

I appreciated how you kept the user in mind for this tutorial: what does someone want to get out of Gephi? Well, they want the data to look good and be coherent visually, and I’m glad that you guided us through making adjustments to color (away from the homogeneous, hard-to-read black). I also appreciated how you highlighted each of your own choices throughout and explained why you would want to use the tool as well as each toggle option in the way that you do. And, you brought me back to that reading on nodes so many weeks ago and helped me connect the theoretical with the technical!
