Lab 12: In-Lab Problem

Today's in-lab project is similar to the movie-matching game Six Degrees of Kevin Bacon. We provide a data file that represents an undirected graph in which an edge connects an actor and a movie if that actor appeared in that movie. Your task is to use NetworkX to look up the shortest paths between actors.

The following is a function that you can use to import the graph.

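import networkx

def import_graph():
    # Prompt for the data file; each line holds one "actor<TAB>movie" pair.
    fname = input("Enter the filename of the graph: ")
    G = networkx.Graph()
    with open(fname) as f:
        for edge in f:
            # Strip the trailing newline, then split on the tab separator.
            a, b = edge.rstrip("\n").split('\t')
            # Connect the actor and the movie (the graph is undirected).
            G.add_edge(a, b)
    return G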

The \t character is the separator between actor and movie. Also, note that ordering doesn't matter here, since the graph is undirected.
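To see the file format and the undirected edges in action without a data file, here is a minimal sketch that builds the same kind of graph from a few in-memory lines. The helper name graph_from_lines and the sample list are assumptions made for illustration; the actor and movie names are taken from the sample output later in this lab.

```python
import networkx as nx

def graph_from_lines(lines):
    """Build an undirected actor-movie graph from tab-separated lines.

    Hypothetical helper for illustration; it mirrors the file-reading loop
    but takes an iterable of strings instead of a filename.
    """
    G = nx.Graph()
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        actor, movie = line.split("\t")
        G.add_edge(actor, movie)
    return G

# Four actor/movie pairs in the same format as the data file.
sample = [
    "Burns, George\tMovie Movie\n",
    "Wallach, Eli\tMovie Movie\n",
    "Wallach, Eli\tMystic River\n",
    "Bacon, Kevin\tMystic River\n",
]
G = graph_from_lines(sample)

# In an undirected graph the order of the endpoints does not matter.
print(G.has_edge("Burns, George", "Movie Movie"))
print(G.has_edge("Movie Movie", "Burns, George"))
```

Both lookups report the same edge, which is why the order of actor and movie on a line is irrelevant.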

Then, create a procedure named separation that uses the imported graph to find the shortest path between two actors. Prompt for the two actors' names, then call the NetworkX function shortest_path in the following manner:

 networkx.shortest_path(G, 'Bloom, Orlando', 'Bacon, Kevin')

This returns a list of nodes. Make sure that both actors are in the graph before calling the shortest path function; if either actor is not in the graph, print an error message and return. Because edges only connect actors to movies, the returned list alternates between actors and movies:
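The membership check described above could be sketched as follows. The function name find_path and the wording of the error messages are assumptions, not part of the lab; handling NetworkXNoPath (raised when the two actors are both present but not connected) is an extra precaution the lab text does not require.

```python
import networkx as nx

def find_path(G, source, target):
    """Return the shortest path as a list of nodes, or None on failure.

    Hypothetical helper: checks that both names are nodes in G before
    asking NetworkX for the path.
    """
    for name in (source, target):
        if name not in G:
            print("Error: %s is not in the graph." % name)
            return None
    try:
        return nx.shortest_path(G, source, target)
    except nx.NetworkXNoPath:
        # Both actors exist, but no chain of movies links them.
        print("No connection between %s and %s." % (source, target))
        return None

# Small hand-built graph using the names from this lab's sample output.
G = nx.Graph()
G.add_edges_from([
    ("Burns, George", "Movie Movie"),
    ("Wallach, Eli", "Movie Movie"),
    ("Wallach, Eli", "Mystic River"),
    ("Bacon, Kevin", "Mystic River"),
])

path = find_path(G, "Burns, George", "Bacon, Kevin")
print(path)
```

In the separation procedure itself, the two names would come from prompts rather than from literals.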

 actor -> movie -> actor -> movie -> ... -> actor

Parse the returned list and output the results in the following format:

 Burns, George was in "Movie Movie" with Wallach, Eli 
 Wallach, Eli was in "Mystic River" with Bacon, Kevin 
 The connection number of Burns, George and Bacon, Kevin is 2.
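One possible way to turn the alternating list into the "was in ... with" lines is to step through it two nodes at a time. This is only a sketch; describe_path is a hypothetical name, and the path literal stands in for the list that shortest_path would return.

```python
def describe_path(path):
    """Return one 'actor was in "movie" with actor' line per movie in path.

    Assumes path alternates actor, movie, actor, ..., and ends on an actor.
    """
    lines = []
    # Positions 0, 2, 4, ... hold actors; the movie linking two actors
    # sits between them.
    for i in range(0, len(path) - 2, 2):
        actor, movie, costar = path[i], path[i + 1], path[i + 2]
        lines.append('%s was in "%s" with %s' % (actor, movie, costar))
    return lines

path = ["Burns, George", "Movie Movie", "Wallach, Eli",
        "Mystic River", "Bacon, Kevin"]
for line in describe_path(path):
    print(line)
```

A path containing a single actor (an actor looked up against themselves) produces no lines.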

Connection number is defined as:

  • Your connection number to yourself is 0.
  • If you were in a movie with someone, your connection to them is 1.
  • If you were never in a movie with someone, your connection to them is the number of movies in a shortest path between you and them.
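Since a shortest path alternates actor, movie, actor, the number of movies on it can be computed from its length alone. A minimal sketch, assuming the path starts and ends on an actor (connection_number is a hypothetical name):

```python
def connection_number(path):
    """Count the movies on an actor -> movie -> ... -> actor path.

    A path with k movies has k + 1 actors, so 2k + 1 nodes in total.
    """
    return (len(path) - 1) // 2

path = ["Burns, George", "Movie Movie", "Wallach, Eli",
        "Mystic River", "Bacon, Kevin"]
print("The connection number of %s and %s is %d."
      % (path[0], path[-1], connection_number(path)))
```

The same formula covers all three cases in the definition: a one-node path gives 0, a path through a single shared movie gives 1, and longer paths give the number of movies along the way.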

You can use Cytoscape to view the graph and verify your results. There is also an online version of this game that uses the complete IMDb data set; this lab uses a reduced data set.

Data sets

Second hour

  • Answer questions about Project 4, part 1.
  • Go over Project 4, part 2 in detail; draw a graph of all actions to be taken for a synthetic and for a biological data set.
cs190c/lab12.txt · Last modified: 2008/07/24 12:13 by seh