Hello, dear friend, you can consult us at any time if you have any questions, add WeChat: daixieit



Program 1

Iteration and Major Data Types:

List, Tuple, Set, and Dict (and Open for files)

ICS-33: Intermediate Programming

Introduction This programming assignment is designed to ensure that you know how to use combinations of Python's most

important data types to model and compactly write code that solves a wide variety of different programming    problems. The kind of abstract thinking that goes into modeling solutions to these programming problems with these data types (and iteration over them) is critical to your development as computer scientists.

There are five parts to this assignment. In each you will be asked to write a module that contains a few functions and a script at the bottom, which ties these functions together to solve the problem.

You should download the program1 project folder and use it to create an Eclipse project. The project folder    contains files for all the modules in which to write your functions and scripts; it also contains all the data files that you need to test/debug your modules; finally, it contains all the batch self-check files I will use when        grading your programs. In your modules, you may import additional standard/courselib modules and you may write additional helper functions.

I recommend that you work on this assignment in pairs, and I recommend that you work with someone in your lab section (so that you have 4 hours each week of scheduled time together). These are just recommendations. Try to find someone who lives near you, with similar programming skills, and work habits/schedule: e.g., talk about whether you prefer to work mornings, nights, or weekends; what kind of commitment you will make to  submit program early.

Only one student should submit all parts of the the assignment, but both students' UCInetID and name     should appear in a comment at the top of each submitted .py file. A special grading program reads this         information. The format is a comment starting with Submitter and Partner (when working with a partner),  followed by a colon, followed by the student's UCInetID (in all lower-case), followed by the student's name in parentheses (last name, comma, first name -capitalized appropriately). If you omit this information, or do not follow this exact form, it will require extra work for us to grade your program, so we will deduct points.

For example if Romeo Montague (whose UCInetID is romeo1) submitted a program that he worked on with his partner Juliet Capulet (whose UCInetID is jcapulet) the comment at the top of each .py file would appear

as:

# Submitter: romeo1(Montague, Romeo)

# Partner : jcapulet(Capulet, Juliet)

# We certify that we worked cooperatively on this programming

# assignment, according to the rules for pair programming

If you do not know what the terms cooperatively and/or rules for pair programming mean, please read about Pair Programming before starting this assignment. Please turn in each program as you finish it, so that I can    more accurately assess the progress of the class as a whole during this assignment.

Print this document and carefully read it, marking any parts that contain important detailed information that      you find (for review before you turn in the files). The code you write should be as elegant and compact as         possible, using appropriate Python idioms. You should familiarize yourselves with the safe_open function in    the goody module and all the functions in the prompt module, both of which you should have installed in your courselib folder as part of the Eclipse/Python installation. Recall how to use the sep and end parameters in the print function.

Reread the section on Time Management from Programming Assignment 0 before starting this assignment.

IMPORTANT 1: Before starting this assignment, download the xrefproject folder which contains a small       Python script xref.py that produces a cross-reference of all the words (converted to lower case) in a file (where words appear with spaces between them: see xrefin.txt for an example): the words are listed in alphabetic        order followed by a set (i.e., no duplicates) of the line numbers it appears on (listed in increasing numeric         order). Before solving the problems in this programming assignment, ensure you understand all the details of   how this program works: look at features and functions like safe_open, defaultdict (and how it is used),

enumerate, rstrip and lower, split and join, sorted, for loops with two (unpacked) indexes, the two                comprehensions (in the call to max and join), and format. These are the building blocks for many parts of this assignment; explore and experiment with this code to understand how all the parts work together to achieve the desired result. Run this code on more complicated data files.

IMPORTANT 2: This assignment has 5 parts: pairs should work on each part together, not split them up and   do them separately. Parts 1-3 are going to be worth 12 points each; parts 4-5 are to be worth 7 points each. This skewing of points towards the simpler parts means students finishing the first 3 parts correctly will yield a 72% average; those finishing the first 4 parts correctly will have an 86% average; but to get an A on this assignment requires solving all parts correctly. I strongly recommend finishing the first part by the weekend, and then         finishing another part every few days.

In the past, many students waited until the last few days and then tried to write all their solutions: that is a        recipe for learning little and getting a poor grade (or worse, cheating and getting caught; remember that I'm      going to be running MOSS on all the parts of this assignment, checking for very similar solutions). So, now I   am grading only 2 parts submitted the day before the due date or later (and only 1 part submitted on the due     date). You can submit as many parts as you want earlier than the day before the due date. That is if you submit all 5 parts on the day before the due date, I will grade only 2 parts; if you submit all parts on the due date I will grade only 1 part. If you submit all 5 parts two days before the due date, I will grade all parts and you will get  two extra credit points. In the worst case, to have me grade all parts, you must submit 3 parts two days before  the due date, 1 part a day before the due date, and 1 part on the due date. So, start early on this assignment and submit your work well before the due date.

IMPORTANT 3: I will mostly grade all these programs automatically, using the batch self-check files            provided in the download. Use the driver program (explored in Programming Assignment #0) to run the         batch-self check files in this assignment; debug any errors that they produce. But the TAs (with some               automated tools) will also look at/run the code in some of your scripts: so the scripts need to follow exactly     what is shown in the Sample Interactions part for each problem. I suggest testing your code first to match the scripts; when those results are correct, test it using the batch self-check files. Finally, if a submitted Python      module contains even one syntax error or bad import, it will fail all its batch self-checks; ensure that you submit modules with no syntax or bad import errors (Python sometimes adds strange imports at the top of your file; ensure that all your imports are used and are reasonable).



Problem

#1: Influencers


Problem Summary:

Write the required functions and script that prompts the user for the name of a file representing a graph; reads the file (storing   the graph in a dictionary); prints the graph/dictionary in a special form; computes an approximation to the minimmum influencers (a special set of node names in the graph); repeatedly prompts the user for a set of node names in the graph            (rejecting sets that specify any node name not in the graph); computes and prints all the nodes that are "influenced" (see below) by the node names in the set.

We can use a graph to describe friendships or professional relationships, as Facebook and LinkedIn do. Each node in a              friendship graph is the name of a person, and that person has edges (arrows) leading from his/her node to the nodes of all          his/her friends. In this model, friendship is symmetric (goes both ways): if a is b's friend (there is an arrow from a to b), then b is a's friend (there is an arrow from b to a). For simplicity, we will represent these two arrows by one double arrow: a single     line with arrowheads at both ends. Such a graph, with all double-arrow edges, is called undirected. The undirected graph         below shows all the friendships among the names a-k: note tha there k is edgeless, meaning he/she has no friends.

So, for example, in the above graph c has 4 friends: a, b, d, and g. Likewise g has 3 friends: c, h, and j. Of course, because c has g as a friend, g has c as a friend: see the double arrow. Note that h has just 1 friend: g. Finally k has 0 (no) friends.

Assume that people in a friendship graph can influence their friends: specifically, assume in this example that if half or more of a person's friends like a song, then that person will decide (be influenced) to like the song too. For example, c has 4 friends, so if 2 or more of c's friends like a song, then c will be influenced to like it too; it doesn't matter which of c's friends like the    song, so long as 2 or more of his/her friends like it.

How does this rule apply if a person has an odd number of friends? For example, g has 3 friends, so half for g's friend would be 3/2 = 1.5 friends; in this case, we round upward to an integer and require 2 or more of g's 3      friends to like the song in order to influence g to like it.

In fact Python's math.ceil (ceiling) function computes and returns the smallest integer >= to its argument: so,       calling math.ceil(1.5) returns 2, and calling math.ceil(2) also returns 2. Therefore, math.ceil(#friends/2),            whether #friends is even or odd, correctly computes the minimum number of friends (an int) needed to influence that person. Remember ceil: you will use it in your code; it is already imported into the module you will               download, into which you will write your code.

Let's see how a few people's influence can spread through this friendship graph. To start, let's suppose that that person b and e like a song. Here is how they can influence their friends, illustrated and explained below.


Program 1



Computing which friends are influenced by b and e:

1. At the start, only b and e like a song; if a person likes a song, we will write a * after their name in the graphs above, so the node names b* and e* appear in the first graph.

2. a now likes the song, because 1 of a's 2 friends (>= half) like the song: b.

3. c now likes the song, because 2 of c's 4 friends (>= half) like the song: a and b.

4. d now likes the song, because 2 of d's 3 friends (>= half) like the song: c and e.

5. f now likes the song, because 1 of f's 1 friends (>= half) like the song: d.

As this point, no other people are influenced by enough of their friends to like the song; specifically, g needs 2 of his/her 3  friends to to inluence him/her to like the song, but only 1 of his/her friend likes it. So b and e were not influential enough to make everyone in the graph like a song, but they did influence a, c, d, and f.

In fact, no pair of people wields enough influence for everyone in this graph to like a song (note that no one can influence k to like it). But if d, g, and k like a song, they can influence everyone else to like it too (check that this statement is true, to ensure that you understand the influencing process). This is the minimum number of people needed to influence everyone in this         friendship graph. In this problem, we will learn how to represent friendship graphs in Python and implement an algorithm that  computes a small set of nodes that can influence all the others. If we wanted to create a viral marketing campaign to promote    say a new song (or any belief), we would concentrate our efforts on this set, because they could convince everyone else in their social network.

Input and Output:

We will store this graph in a dictionary: each person's/node's name will be a key (str) whose associated value is a set of str of the people/node names of his/her friends.

Read a file where each line is either one node name (a person with no identifiable friends) or a pair of node names                    (representing an undirected friendship edge) in the graph. Note that node names may be any number of characters (not just the single characters used in this example), separated by one semicolon character. Build a dictionary whose keys are str node        names, and whose associated values are sets of str node names that are friends. We annotate this dictionary as {str:{str}}.

For example, the input file graph1.txt contains the following lines (which could appear in this order, or any other order, and still produce the same dictionary). By convention, we will put all friendless names at the end of the file.

a;b

a;c

b;c

c;d

d;e

d;f

c;g

g;h

g;j

i;j

k

which represent the original friendship graph we examined above.

Print the graph, one node name per line followed by the set of all his/her friend's node names (alphabetically). This graph above would print as

Graph: person -> [sorted friends of person]

a -> ['b', 'c']

b -> ['a', 'c']

c -> ['a', 'b', 'd', 'g']

d -> ['c', 'e', 'f']

e -> ['d']

f -> ['d']

g -> ['c', 'h', 'j']

h -> ['g']

i -> ['j']

j -> ['g', 'i']

k -> []

Note that the node names must be sorted alphabetically; and, the set of associated node names must appear in a list whose       values are sorted alphabetically: we use a list, because it makes no sense to talk about sorted sets. Note that because node k appears in the input file on a line by itself, it has no friends. Remember that friendship is symmetric! So the line a;b means b is a's friend and a is b's friend; this means the node name b appears in node a's associated set and the node name a appears in node b's associated set.

There are multiple data files for this program: graph1.txt (shown above), graph2.txt and graph3.txt; test/debug your program on the first file; when you are done, test it on the remaining files. Draw the graph represented by each file to ensure that your    code correctly prints it and computes the set of influencer nodes (which you can do by eyeballing the graphs: they are small).

Here is a description of the Influencer algorithm for computing a small set of nodes that can influence every node in the graph. It is guaranteed to produce a small set of nodes with the desired property, but it will not necessarily compute the smallest such set in all cases (that is a much harder problem). You must implement this Influencer algorithm, as it is described below.

It is fairly straightforward to specify the Influencer algorithm, which is straightforward to implement in Python. But, first you  must understand these English instructions, and only then can you carefully translate them into Python code. You should hand- simulate this algorithm using the data above, and verify that it produces the results that you expect, before coding it in Python.

1. Make a dictionary (I'll call it infl_dict here) whose keys are all the node names in the graph and whose associated values are 3-lists. Initially, each 3-list stores

at index 0: the number of friends of the node minus the number of friends needed to influence the node (this

second value is the ceil of half the number of its friends); but, if a node contains no friends, the value stored at index 0 should be -1.

at index 1: the number of friends of the node

at index 2: the node name itself (duplicating the key)

Index 0 for a node specifies its number of friends that can be removed while still allowing the node to be influenced by its remaining friends. For example Index 0 for g is computed as 3-ceil(3/2) = 1 meaning that node c can have 1 of its

friends removed and still be influenced by its (2) remaining friends.

If Index 0 stores a negative value, it cannot be influenced by its friends, so it must be part of the returned set of influencers: see the last pair of lines below.

2. Repeat the following process until termination:

a. Create a list of the 3-tuple values currently stored as 3-list values in infl_dict, but only if their index 0 values are non-negative; these are candidates for removal from the infl_dict, since they can still be influenced by their        friends.

b. Terminate the Influencer algorithm if there are no values in this list; otherwise ...

c. Find the smallest 3-tuple in this list: use the min function (not via sorting). Because 3-tuples are unique, the minimum 3-tuple is unique.

d. Remove the specified node name (see index 2) from infl_dict.

e. For every friend of this node in the graph that is also still in infl_dict, decreeasing its index 0 and index 1 values by 1 in its associated 3-list.

Upon terminating, the keys remaining in infl_dict represent a small (but not necessarily the smallest) set of influencers for the entire friendship graph. Note that although infl_dict changes, the original dictionary storing the friendship graph remains        unchanged.

Read these instructions carefully, a few times. Hand simulate these instructions to ensure that you understand the Influencer    algorithm; use the data above, which is automatically traced in the example below. Do not attempt to write any Python code to solve this problem until you understand this algorithm and can apply it to the data specified above. Eventually you will write   your Python code to produce such a trace conditionally.

Here is a trace (see the 2nd parameter to the find_influencers function described below, which activiates the trace) for the friendship graph specified above. The order of values in the dictionaries and lists are arbitrary; I have written these data    structures on multiple lines for formating purposes..


influencer dictionary = {'a': [1, 2, 'a'], 'b': [1, 2, 'b'], 'c': [2, 4, 'c'], 'd': [1, 3, 'd'], 'e': [0, 1, 'e'], 'f': [0, 1, 'f'], 'g': [1, 3, 'g'], 'h': [0, 1, 'h'], 'j': [1, 2, 'j'], 'i': [0, 1, 'i'], 'k': [-1, 0, 'k']}

removal candidates = [(1, 2, 'a'), (1, 2, 'b'), (2, 4, 'c'), (1, 3, 'd'), (0, 1, 'e'),

(0, 1, 'f'), (1, 3, 'g'), (0, 1, 'h'), (1, 2, 'j'), (0, 1, 'i')]

(0, 1, 'e') is the smallest candidate

Removing e as key from influencer dictionary; decrementing every friend's value

influencer dictionary = {'a': [1, 2, 'a'], 'b': [1, 2, 'b'], 'c': [2, 4, 'c'], 'd': [0, 2, 'd'], 'f': [0, 1, 'f'], 'g': [1, 3, 'g'], 'h': [0, 1, 'h'], 'j': [1, 2, 'j'],

'i': [0, 1, 'i'], 'k': [-1, 0, 'k']}

removal candidates = [(1, 2, 'a'), (1, 2, 'b'), (2, 4, 'c'), (0, 2, 'd'), (0, 1, 'f'),

(1, 3, 'g'), (0, 1, 'h'), (1, 2, 'j'), (0, 1, 'i')]

(0, 1, 'f') is the smallest candidate

Removing f as key from influencer dictionary; decrementing every friend's value

influencer dictionary = {'a': [1, 2, 'a'], 'b': [1, 2, 'b'], 'c': [2, 4, 'c'], 'd': [-1, 1, 'd'], 'g': [1, 3, 'g'], 'h': [0, 1, 'h'], 'j': [1, 2, 'j'], 'i': [0, 1, 'i'], 'k': [-1, 0, 'k']}

removal candidates = [(1, 2, 'a'), (1, 2, 'b'), (2, 4, 'c'), (1, 3, 'g'), (0, 1, 'h'),

(1, 2, 'j'), (0, 1, 'i')]

(0, 1, 'h') is the smallest candidate

Removing h as key from influencer dictionary; decrementing every friend's value

influencer dictionary = {'a': [1, 2, 'a'], 'b': [1, 2, 'b'], 'c': [2, 4, 'c'], 'd': [-1, 1, 'd'], 'g': [0, 2, 'g'], 'j': [1, 2, 'j'], 'i': [0, 1, 'i'], 'k': [-1, 0, 'k']}

removal candidates = [(1, 2, 'a'), (1, 2, 'b'), (2, 4, 'c'), (0, 2, 'g'), (1, 2, 'j'),

(0, 1, 'i')]

(0, 1, 'i') is the smallest candidate

Removing i as key from influencer dictionary; decrementing every friend's value

influencer dictionary = {'a': [1, 2, 'a'], 'b': [1, 2, 'b'], 'c': [2, 4, 'c'], 'd': [-1, 1, 'd'], 'g': [0, 2, 'g'], 'j': [0, 1, 'j'], 'k': [-1, 0, 'k']}

removal candidates = [(1, 2, 'a'), (1, 2, 'b'), (2, 4, 'c'), (0, 2, 'g'), (0, 1, 'j')]

(0, 1, 'j') is the smallest candidate

Removing j as key from influencer dictionary; decrementing every friend's value

influencer dictionary = {'a': [1, 2, 'a'], 'b': [1, 2, 'b'], 'c': [2, 4, 'c'], 'd': [-1, 1, 'd'],

'g': [-1, 1, 'g'], 'k': [-1, 0, 'k']}

removal candidates = [(1, 2, 'a'), (1, 2, 'b'), (2, 4, 'c')]

(1, 2, 'a') is the smallest candidate

Removing a as key from influencer dictionary; decrementing every friend's value

influencer dictionary = {'b': [0, 1, 'b'], 'c': [1, 3, 'c'], 'd': [-1, 1, 'd'], 'g': [-1, 1, 'g'], 'k': [-1, 0, 'k']}

removal candidates = [(0, 1, 'b'), (1, 3, 'c')]


(0, 1, 'b') is the smallest candidate

Removing b as key from influencer dictionary; decrementing every friend's value

influencer dictionary = {'c': [0, 2, 'c'], 'd': [-1, 1, 'd'], 'g': [-1, 1, 'g'], 'k': [-1, 0, 'k']} removal candidates = [(0, 2, 'c')]

(0, 2, 'c') is the smallest candidate

Removing c as key from influencer dictionary; decrementing every friend's value

influencer dictionary = {'d': [-2, 0, 'd'], 'g': [-2, 0, 'g'], 'k': [-1, 0, 'k']}

removal candidates = []

When the algorithm terminates, the small influencer set is {'d', 'g', 'k'}: the node names remaining in the infl_dict whose associated 3-list stores a negative number at index 0.

Now, repeatedly prompt the user for a set of node names in the graph (until the user enters done) and compute and print all the nodes these influence (as well as the percentage of nodes in the graph influenced): the default value for the prompt should be   all the nodes computed above, which when entered should influence 100% of the nodes in the friendship graph (if they were    computed correctly). Reject any set that contains a node name that is not a key in the graph.

Call the function all_influenced to compute this value: it uses the friendship graph and the set of node names the user enters to compute all the influenced nodes in the friendship graph as follows.

1. Create a dictionary associating every node in the graph with a bool value telling whether it is currently influenced:          initiallyl True if and only if it is in the set supplied to the influencers parameter; also compute the number of keys in this dictionary whose associated value is True (which initially is the length of the set).

2. Repeat the following process until termination:

a. For every item in the dictionary produced in Step 1, if it is not yet influenced (its associated bool is False) check to see whether it has enough friends (who are already influenced) to influence it as well; if so, change its              associated dictionary value to True.

You will need to treat friendless nodes specially: they are never influenced because they have no friends.

b. If the number of currently influenced nodes has not changed (from the previous time this value was computed), terminate by returning a set of all the dictionary keys whose associated value is True: all the influenced nodes.

An example interaction (processing the graph above) might be

Select a subset of persons (or enter done to stop)[{'d', 'g', 'k'}]: {'b','d'}

People influenced by selected subset (54.54545454545455% of graph) = {'c', 'b', 'a', 'f', 'e', 'd'}

Select a subset of persons (or enter done to stop)[{'d', 'g', 'k'}]: {'a','x'}

Entry Error: '{'a','x'}';

Please enter a legal String

Select a subset of persons (or enter done to stop)[{'d', 'g', 'k'}]:

People influenced by selected subset (100.0% of graph) = {'h', 'c', 'b', 'a', 'i', 'f', 'j', 'e', 'd', 'g', 'k'}

Select a subset of persons (or enter done to stop)[{'d', 'g', 'k'}]: done

Functions and Script:

Write the following functions and script. I am providing line counts for these function bodies not as requirements, but to indicate the lengths of well-written Pythonic code.

read_graph has an open (file) parameter; it returns the dictionary representing the undirected friendship graph (body is

11 lines).