Lab 10: First Hour

  • In-lab problem:
    • Read the file dictionary.txt and build a Python dictionary that contains each word in the file. Then prompt the user for a file to spell check, and print each word in that file that is not in the dictionary. The words you print should be unique and in alphabetical order. Use the word frequency program as a template for splitting the file into words. Write your solution as a function spell_check that takes no parameters and has no return value. You need not include the function word_frequency in the file you submit.
    • Sample input files: constitution.txt and mobydick.txt.

A solution to the dictionary problem:

#     Program to spell_check words in a text file.
#     Further illustrates Python dictionaries
import string
def spell_check():
    # Load the dictionary 
    text = open('dictionary.txt','r').read()
    text = string.lower(text)
    for ch in '!"#$%&()*+,-./:;<=>?@[\\]^_`{|}~':
        text = string.replace(text, ch, ' ')
    words = string.split(text)
    # construct a dictionary of the given words
    dictionary = {}
    for w in words:
        dictionary[w] = True
    print "Dictionary loaded."
    # get the sequence of words from the file
    fname = raw_input("File to analyze: ")
    text = open(fname,'r').read()
    text = string.lower(text)
    for ch in '!"#$%&()*+,-./:;<=>?@[\\]^_`{|}~1234567890':
        text = string.replace(text, ch, ' ')
    words = string.split(text)
    # construct a dictionary of words not found in dictionary
    orphans = {}
    for w in words:
        if w not in dictionary:
            orphans[w] = True
    # print the unique unknown words in alphabetical order
    for w in sorted(orphans):
        print w
if __name__ == '__main__':
    spell_check()
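
The same idea can also be sketched with sets instead of dictionaries whose values are all True. This is only an illustration, not part of the assigned solution: the helper names tokenize and misspelled_words and the constant STRIP_CHARS are made up for this example, and it assumes the same characters are stripped from both the dictionary and the user's text. It is written so that it runs under both Python 2 and Python 3.

```python
import string

# Characters replaced by spaces before splitting: all ASCII punctuation
# except the apostrophe (so "don't" stays one word), plus the digits.
# Assumption: the same set is used for the dictionary and the user's text.
STRIP_CHARS = string.punctuation.replace("'", "") + string.digits


def tokenize(text):
    """Lowercase the text, blank out STRIP_CHARS, and split into words."""
    text = text.lower()
    for ch in STRIP_CHARS:
        text = text.replace(ch, ' ')
    return text.split()


def misspelled_words(known_words, text):
    """Return the unique words of text not in known_words, sorted."""
    return sorted(set(tokenize(text)) - set(known_words))


if __name__ == '__main__':
    known = tokenize("the quick brown fox jumps over the lazy dog")
    for w in misspelled_words(known, "The quik brown fox, the lazy dgo!"):
        print(w)
```

A set difference gives uniqueness for free, and sorted() supplies the alphabetical order the problem asks for.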

Second Hour

  • Address problems students had with the ideal gas simulation, and answer questions about the Ising spin simulation.
  • Make sure NetworkX and Cytoscape are installed and work properly on lab machines. Run the following piece of code with input file ebi.txt:
from networkx import *
import pylab as P
G = DiGraph()
G.add_edges_from([tuple(s.split()) for s in open('ebi.txt')])
print 'Number of nodes:', G.order()
C1 = component.strongly_connected_components(G)
print 'Number of strongly connected components:', len(C1)
for t in C1:
    if len(t) > 1:
        print "a strongly connected component of size > 1: ", len(t)
# Build the subgraph induced by the first component in the list
Gscc = subgraph(G, C1[0])
print "number of nodes in Gscc: ", Gscc.order()
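
To sanity-check the NetworkX output on a small graph, it can help to compute strongly connected components by hand. The following is a minimal pure-Python sketch of Kosaraju's algorithm, not the method NetworkX itself uses. The function names parse_edges and scc_kosaraju are hypothetical, and the input is assumed to hold one whitespace-separated "source target" pair per line, as the split() call above suggests for ebi.txt.

```python
from collections import defaultdict


def parse_edges(lines):
    """Turn lines like 'a b' into (source, target) pairs; skip other lines."""
    edges = []
    for line in lines:
        parts = line.split()
        if len(parts) == 2:
            edges.append((parts[0], parts[1]))
    return edges


def scc_kosaraju(edges):
    """Return strongly connected components as sorted lists, largest first."""
    graph = defaultdict(list)    # forward adjacency lists
    reverse = defaultdict(list)  # adjacency lists with every edge reversed
    nodes = set()
    for u, v in edges:
        graph[u].append(v)
        reverse[v].append(u)
        nodes.update((u, v))

    # Pass 1: iterative DFS on the forward graph, recording each node
    # when all of its successors have been explored (finishing order).
    visited = set()
    order = []
    for start in sorted(nodes):
        if start in visited:
            continue
        visited.add(start)
        stack = [(start, iter(graph[start]))]
        while stack:
            node, neighbours = stack[-1]
            for nxt in neighbours:
                if nxt not in visited:
                    visited.add(nxt)
                    stack.append((nxt, iter(graph[nxt])))
                    break
            else:
                order.append(node)
                stack.pop()

    # Pass 2: DFS on the reversed graph, taking start nodes in decreasing
    # finishing order; each search collects exactly one component.
    assigned = set()
    components = []
    for start in reversed(order):
        if start in assigned:
            continue
        assigned.add(start)
        component = []
        stack = [start]
        while stack:
            node = stack.pop()
            component.append(node)
            for nxt in reverse[node]:
                if nxt not in assigned:
                    assigned.add(nxt)
                    stack.append(nxt)
        components.append(sorted(component))
    components.sort(key=len, reverse=True)
    return components


if __name__ == '__main__':
    sample = ["a b", "b c", "c a", "c d"]
    comps = scc_kosaraju(parse_edges(sample))
    print(len(comps))
    print(comps[0])
```

In the sample graph the cycle a, b, c forms one component and d sits alone, so the sketch reports two components with the three-node one first. Both passes use an explicit stack rather than recursion so that large edge lists do not hit Python's recursion limit.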
cs190c/lab10.txt · Last modified: 2008/07/24 12:13 by seh