CS 190C LAB 12: Friday, April 10, 2009


Here is a tutorial on Cytoscape that will walked through today. Here are the graph and node attributes used.

In-Lab Problem

Today's in-lab project is similar to the movie-matching game called six degrees of Kevin Bacon. We are providing you a data file that represents an undirected graph where there exists an edge between an actor and a movie if that actor was in that movie. Thus, your problem is to leverage NetworkX such that you can perform lookups of the shortest paths between actors.

The skeleton code has a function that you can use to import the graph.

The \t (TAB) character is the separator between actor and movie. Also, note that ordering doesn't matter here, since the graph is undirected.

Then, create a procedure separation that leverages the imported graph to find the shortest path between two actors. Prompt for two actors names, then use the NetworkX function shortest_path in the following manner:

 networkx.path.shortest_path(G, 'Bloom, Orlando', 'Bacon, Kevin')

This will return a list of nodes. Remember, you need to make sure that both actors are in the graph before calling the shortest path function. If either actor is not in the graph, print an error message and return. This type of path will be a list returned to you by the shortest path function of the following type:

 actor -> movie -> actor -> movie -> ... -> actor

Parse the list returned in the following way, and output the results:

 Burns, George was in "Movie Movie" with Wallach, Eli 
 Wallach, Eli was in "Mystic River" with Bacon, Kevin 
 The connection number of Burns, George and Bacon, Kevin is 2.

Connection number is defined as:

  • Your connection number to yourself is 0.
  • If you were in a movie with someone, your connection to them is 1.
  • If you were never in a movie with someone, your connection to them is the number of movies in a shortest path between you and them.

You can use Cytoscape to verify and view the graph. Also, there is an online version of this game which uses the complete data set of IMDB. We are using a reduced dataset.

Data sets




Here is a Cytoscape session file that may help you visualize the graphs: lab12.zip


import networkx as nx
def import_graph():
    fname = raw_input("Enter the filename of the graph: ");
    G = nx.Graph()
    f = open(fname)
    for edge in f:
        a,b = edge.split('\t')
        G.add_edge(a.strip(), b.strip())     
    return G
if __name__ == '__main__':
    G = import_graph()
    actor1 = raw_input('Enter the name of the first actor: ')
    actor2 = raw_input('Enter the name of the second actor: ')
    path = nx.path.shortest_path(G, actor1, actor2)
    num = 0 # connection number
    if len(path) > 1:
        num = len(path) / 2
        for i in xrange(0, len(path) - 1, 2):
            print path[i], 'was in', path[i+1], 'with', path[i+2]
    print 'The connection number of', actor1, 'to', actor2, 'is', num
cs190c/lab12_09.txt · Last modified: 2009/04/29 23:01 (external edit)
Recent changes RSS feed Creative Commons License Donate Powered by PHP Valid XHTML 1.0 Valid CSS Driven by DokuWiki