CS 131 – Problem Set 3 – Part 1
Hello, dear friend, you can consult us at any time if you have any questions, add WeChat: daixieit
CS 131 – Problem Set 3 – Part 1
Problem 1. [10 points] For parts a and b, you will be given a set described in set builder notation. Translate this into a set in roster notation by naming the “first” 5 elements in the set if possible (assuming it were ordered in some logical manner), and describe the set in a concise description (one sentence maximum).
For parts c and d, you will be given a roster notation of a set and you must describe it using set builder notation and a short description. While there may exist multiple correct responses, you will not need to make use of set operations ( ∪ , ∩ , \, etc.) to answer these.
a) {2j | j ∈ Z} ∩ {nm | n ∈ N ∧ m ∈ Z+ ∧ n = m}
Solution. {nm | n ∈ N ∧ m ∈ Z+ ∧ n = m} = {1, 4, 9, 16, 25, ··· }
{2j | j ∈ Z} ∩ {nm | n ∈ N ∧ m ∈ Z+ ∧ n = m} = { · · · , −4, −2, 0, 2, 4, ··· } ∩ {1, 4, 9, 16, 25, ··· } =
{4, 16, 36, 64, 100, ··· }. All positive even perfect squares.
b) {log | x ∈ N} − Z+, where as an example log = log = 2
Solution.
{log | x ∈ N} = {log log log log log } = {log10(1) , log log log log } =
{0, 1, 2, 3, 4, 5, ··· } = N
Therefore, {log1(1)0(0)x | x ∈ N} − Z+ = N − Z+ = {0}. A set of single element 0
c) {1, 4, 16, 64, 256, ··· }
Solution. {22i | i ∈ N} Even powers of 2
d) {−5, 15, −25, 35, −45, ··· }
Solution. {(− 1)i · 5(2i − 1) | i ∈ Z+ } Alternating negative/positive odd multiples of 5. Alternative solution: {(− 1)i+1 · 5(2i + 1) | i ∈ N}
Problem 2. [10 points]
a) Let A, B, and C be sets in a common universe. Using set identities, prove that
(A − B) ∩ (A − C) = A − (B ∪ C)
Solution.
(A − B) ∩ (A − C) = (A ∩ B) ∩ (A ∩ C) Set difference
************= A ∩ B ∩ C Idempotent and Commutative laws
************= A ∩ B ∪ C De Morgan’s law
************= A — (B O C) Set difference
b) Let A, B, and C be sets in a common universe. Using set builder notation, prove that (A — C) U (B — C) = (A U B) — C
Solution.
(A — C) U (B — C) = {x|x e (A — C) V (x e B — C)} = {x|x e A V x C V x e B V x C)} = {x|x e A V x e B V x C} = {x|x e (A U B) V x C}
= {x|x e (A U B) — C}
= (A U B) — C
Definition of intersection Definition of set difference
Idempotent and Commutative laws in logic Definition of intersection Definition of set difference
Problem 3. [5 points]
a) Write a Python function range1(A), given a list A of integer numbers, tests the following proposition:
3x e A,either x is greater than 10 or x is less than 20 but not both.
Solution.
def range1(A):
for x in A:
if ((x > 10) ^ (x < 20)):
return True
return False
# alternative
def range1(A):
for x in A:
if ((x > 10) and not (x < 20)) or (not (x > 10) and (x < 20)): return True
return False
b) Write a Python function range2(A), given a list A of integer numbers, tests the following proposition:
Ax e A,either x is greater than 10 or x is less than 20 but not both.
Solution.
def range2(A):
for x in A:
if not ((x > 10) ^ (x < 20)):
return False
return True
#alternative
def range2(A):
for x in A:
if not (((x > 10) and not (x < 20)) or (not (x > 10) and (x < 20))): return False
return True
Note 1: Place both functions in a single python file and make sure to name your python file p3.py for the auto-grader to recognize and compile the file. Also, do not include any print statements or header documentation in the file. Use function names as required by each section .
Note 2: Although we have Python set datatype (https://www.w3schools.com/python/python_ sets.asp), we use Python list datatype (https://www.w3schools.com/python/python_lists. asp) to represent sets for simplicity.
Problem 4. [15 points] Express A B as a quantified statement
a) using only A over the universal set.
Solution. We can start with A = B:
(A = B) 三 A 天 B V B 天 A 三 Ax(x e A 一 x e B) V Ax(x e B 一 x e A) Also we can use a bi-implication as follows:
(A = B) 三 Ax(x e B 一 x e A)
Now we can apply a negation on the above statements to find A B
( )
(A B) 三 - (Ax(x e A 一 x e B) V Ax(x e B 一 x e A))
(A B) 三 -Ax(x e A 一 x e B) ^ -Ax(x e B 一 x e A)
(A B) 三 -Ax(x e B 一 x e A)
b) using only A over a restricted domain.
Solution. First we define A = B using domain restriction:
(A = B) 三 Ax e Ax e B V Ax e Bx e A
Now we can apply a negation on the above statements to find A B
(A B) 三 - (Ax e Ax e B V Ax e Bx e A) 三 -Ax e Ax e B ^ -Ax e Bx e A c) using only 3 over the universal set.
Solution.
(A B) ≡ ∃x¬ (x ∈ A → x ∈ B) ∧ (x ∈ B → x ∈ A)
(A B) ≡ ∃x (¬ (x ∈ A → x ∈ B) ∨ ¬ (x ∈ B → x ∈ A))
(A B) ≡ ∃x¬ (x ∈ A → x ∈ B) ∨ ∃x¬ (x ∈ B → x ∈ A)
(A B) ≡ ∃x¬ (x ∈ B ↔ x ∈ A)
d) using only ∃ over a restricted domain.
Solution. Applying De Morang’s law on the result of part b:
(A B) ≡ ∃x ∈ A ¬x ∈ B ∨ ∃x ∈ B ¬x ∈ A ≡ ∃x ∈ Ax B ∨ ∃x ∈ Bx A
e) Write a Python function notEqual(A,B) that takes two lists where each list represents a set. The function outputs true if the two sets are not equal each other. Submit a python file p4.py for the auto-grader to recognize and compile the file. Also, do not include any print statements or header documentation in the file.
Solution. Your function should satisfy the following quantified statement which is ¬ (A = B).
¬ (A = B) ≡ ¬ (A ⊆ B ∧ B ⊆ A)
≡ ¬ (A ⊆ B) ∨ ¬ (B ⊆ A)
≡ (A ̸⊆ B) ∨ (B ̸⊆ A)
≡ ∃x(x ∈ A ∧ x B) ∨ ∃x(x ∈ B ∧ x A)
Note that this is equivalent with the answers of parts a - d.
Definition of equality in sets DeMorgan’s law
Definition of negation in sets Definition of not a subset
def notEqual(A, B):
for a in A:
if a not in B:
return True
for b in B:
if b not in A:
return True
return False
# alternative 1
def notSubset (A, B):
for a in A:
if a not in B:
return True
return False
def notEqual(A, B):
return notSubset(A, B) or notSubset(B,A)
# alternative 2
def subset(A,B):
for a in A:
if a not in B:
return False
return True
def Equal(A, B):
return Subset(A, B) and Subset(B,A)
def notEqual(A, B):
return not Equal(A, B)
Problem 5. [10 points + 10 Bonus points] Suppose we have the following two sentences:
• It is interesting that she has interested in interesting philosophy.
• Is he interested? No, but philosophers have been.
We want to find the similarity degree of the above two strings. First, we need to group the words using their roots. For example, “philosopher” and “philosophy” will both become “philosophy”, “interesting” and “interested” will both become “interest”, “have” and “has” will both become “have”.
In the first method, we make two sets for the two strings:
Words in string 1: {it, is, interest, that, she, have, in, philosophy}
Words in string 2: {is, he, interest, no, but, philosophy, have, been} Now, we draw a Venn diagram using the two sets.
Then, we use the following formula to find a similarity score between the two strings:
Similarity1(A,B) =
For example, for the above two strings Similarity1(A,B) = = 0.333
In the second method, first we list all words and then find frequency of each word in each strings.
Then, we use the following equation to find similarity score between the two strings:
Similarity2(A,B) = 对 Ai · Bi
^对 Ai(2)^对 Bi(2)
Here 对 means summation. For example, for the above two strings, we will have:
Similarity2(A,B) = = 0.530
a) Write a Python function similarity1(A, B) that takes two lists A and B . Each list rep- resents set of words in one string. The output of the function is the similarity score between the two strings using the algorithm explained above for similarity1.
Test case 1:
• It is interesting that she has interested in interesting philosophy.
• Is he interested? No, but philosophers have been.
>>> A = ['it', 'that', 'she', 'in', 'is', 'interest', 'philosophy', 'have'] >>> B = ['interest', 'philosophy', 'have', 'he', 'no', 'but', 'been', 'is'] >>> similarity1(A, B)
0.333
Test case 2:
• Like trees in a forest, humans share a root system.
• Human activity is killing most of the trees on the planet.
>>> A = ['like', 'tree', 'in', 'a', 'forest', 'human', 'share', 'root', 'system'] >>> B = ['human', 'activity', 'is', 'kill', 'most', 'of', 'tree', 'on', 'the', 'planet'] >>> similarity1(A, B)
0.117
b) Write a Python function similarity2(A, B) that takes two lists A and B . Each list includes frequency of each word in the two strings. The output of the function is similarity score between the two strings using the algorithm explained above for similarity2.
Test case 1:
• It is interesting that she has interested in interesting philosophy.
• Is he interested? No, but philosophers have been.
>>> A = [1, 1, 3, 1, 1, 1, 1, 1, 0, 0, 0, 0]
>>> B = [0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1]
>>> similarity2(A, B)
0.530
Test case 2:
• Like trees in a forest, humans share a root system.
• Human activity is killing most of the trees on the planet.
>>> A = [1, 1, 1, 2, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
>>> B = [0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 1, 1, 1, 2, 1, 1, 1]
>>> similarity2(A, B)
0.160
c) Test your functions using the following strings and analyze the outputs. In particular, explain the key difference between the two similarity algorithms.
Test case 1:
• happiness is not a gift
• happiness is a gift
Test case 2:
• happiness is not a gift
• happiness is a happy gift
Test case 3:
• happiness is not a gift
• happiness is a happy happy gift
Test case 4:
• happiness is not a gift
• happiness is a happy happy happy gift
Hint 1: You do not need to take care of roots. Instead, test your functions using the above test cases. In similarity1, lists include words in the two strings. In similarity2, lists include numbers which represent frequency of words involved in the problem.
Hint 2: Note that ordering of the elements of the lists in similarity1 does not matter but it matters in similarity2.
Hint 3: Place both functions in a single python file and make sure to name your python file p5.py for the auto-grader to recognize and compile the file. The answer to part c should be added to p5.py as a comment using comment tag. Solution.
Part a:
def similarity1 (A, B):
sizeU = len(B)
sizeI = 0
for x in A:
if x in B:
sizeI += 1
else :
sizeU += 1
return sizeI / sizeU
Part b:
def similarity2 (A, B):
sumxx = 0
sumxy = 0
sumyy = 0
for i in range(len(A)):
x = A[i]
y = B[i]
sumxx += x*x
sumyy += y*y
sumxy += x*y
return sumxy/(sumxx*sumyy)**0.5
Part c: Similarity1 returns similarity score 0.8 for all four cases in part c while Similarity2 returns 0.894, 0.801, 0.714 and 0.649 for the four tests respectively.
• similarity1 takes only unique set of words for each string while similarity2 does not. This means that if you repeat the word “happy” in one string several times, similarity2 changes but similarity1 does not. For example, if the word “happy”is repeated in the second string
50 times, similarity2 drops to 0.381 but similarity1 remains at 0.8.
• similarity1 is good for cases where duplication does not matter, similarity2 is good for cases where duplication matters while analyzing text similarity. For two product descriptions, it will be better to use similarity1 as repetition of a word does not reduce their similarity.
FYI: Similarity1 is called Jaccard Similarity and similarity2 is called Cosine Similarity. For more information about text similarities read: https://towardsdatascience.com/overview-of-text-similarity-metrics-3397c4601f50
2023-06-09