For today's lab, we will be investigating the benefits of using an efficient algorithm. We will be searching datasets ranging in size from 100 to 10,000 elements.
sizes = range(100,10000, 100)
queries = range(1000)
For each list size:
From this information, you can compare the time it takes for the queries in a linear search versus the same queries on the dictionary.
def binary_search(L, key): ### This function assumes a sorted list! first = 0 last = len(L)-1 current = (first + last) // 2 while first != last: current = (first + last) // 2 if key == L[current]: return True if key < L[current]: last = current - 1 if key > L[current]: first = current + 1 return L[current] == key
p1 = plot(sizes, list_times, 'rD') p2 = plot(sizes, sorted_times, 'gD') p3 = plot(sizes, dict_times, 'bD') title('List vs Dictionary') xlabel('Data set size') ylabel('Running time') legend(('List', 'Dictionary', 'Sorted List'), 2) show()
Graph your results of data set size (x-axis) versus time (y-axis). There should be three plots, one for the random list (t2-t1), one for the sorted list (t3-t2), and one for the dictionary (t4-t3).
What is the general trend of the three plots?
from pylab import plot, show, ylabel, xlabel, title, legend from random import randint,shuffle from time import clock sizes = range(100,10000, 100) queries = range(1000) ltimes = [] btimes = [] dtimes = [] for size in sizes: r = range(size) shuffle(r) dlist = r ddict = {} start = clock() for q in queries: if q in dlist: pass end = clock() ltimes.append(end - start) start = clock() slist = sorted(r) for q in queries: found = False first = 0 last = len(slist)-1 while first != last and not found: current = (first + last) // 2 if q == slist[current]: found = True if q < slist[current]: last = current - 1 if q > slist[current]: first = current + 1 end = clock() btimes.append(end - start) start = clock() for d in dlist: ddict[d] = None for q in queries: if q in ddict: pass end = clock() dtimes.append(end - start) p1 = plot(sizes, ltimes, 'rD') p2 = plot(sizes, dtimes, 'bD') p3 = plot(sizes, btimes, 'gD') title('List vs Dictionary') xlabel('Data set size') ylabel('Running time') legend(('List', 'Dictionary', 'Sorted List'), 2) show()