Hello, dear friend, you can consult us at any time if you have any questions, add WeChat: daixieit

CS 131 – Problem Set 3 – Part 1

Problem 1.   [10 points] For parts a and b, you will be given a set described in set builder notation. Translate this into a set in roster notation by naming the “first” 5 elements in the set if possible (assuming it were ordered in some logical manner), and describe the set in a concise description (one sentence maximum).

For parts c and d, you will be given a roster notation of a set and you must describe it using set builder notation and a short description. While there may exist multiple correct responses, you will not need to make use of set operations ( ∪ , ∩ , \, etc.) to answer these.

a)  {2j | j ∈ Z} ∩ {nm | n ∈ N ∧ m Z+ n = m}

Solution.   {nm | n ∈ N ∧ m Z+ n = m} = {1, 4, 9, 16, 25, ··· }

{2j | j Z} ∩ {nm | n Z+ n = m} = { · · · , −4−2, 0, 2, 4, ··· } ∩ {1, 4, 9, 16, 25, ··· } =

{4, 16, 36, 64, 100, ··· }. All positive even perfect squares.

b)  {log | x N} − Z+, where as an example log  = log  = 2

Solution.

{log | x ∈ N} = {log log  log  log  log  } = {log10(1) , log  log  log  log  } =

{0, 1, 2, 3, 4, 5, ··· } = N

Therefore, {log1(1)0(0)x   | x N} − Z+ = N Z+ = {0}. A set of single element 0

c)  {1, 4, 16, 64, 256, ··· }

Solution.   {22i  | i ∈ N} Even powers of 2

d)  {−5, 15, −25, 35, −45, ··· }

Solution.   {(− 1)i · 5(2i − 1) | i Z+ } Alternating negative/positive odd multiples of 5. Alternative solution: {(− 1)i+1 · 5(2i + 1) | i ∈ N}

Problem 2.   [10 points]

a) Let A, B, and C be sets in a common universe. Using set identities, prove that

(A  B) ∩ (A C) = A − (C)

Solution.

 

(A − B) ∩ (A − C) = (  B) ∩ (A ∩ C) Set difference

************= A C Idempotent and Commutative laws

************= A ∩ B ∪ C De Morgan’s law

************= A — (B O C) Set difference

b)  Let A, B, and C be sets in a common universe. Using set builder notation, prove that (A — C) U (B — C) = (A U B) — C

Solution.

(A — C) U (B — C) = {x|x e (A — C) V (x e B — C)}        = {x|x e A V x  C V x e B V x  C)} = {x|x e A V x e B V x  C}              = {x|x e (A U B) V x  C}

= {x|x e (A U B) — C}

= (A U B) — C

Definition of intersection Definition of set difference

Idempotent and Commutative laws in logic Definition of intersection Definition of set difference

Problem 3.   [5 points]

a)   Write a Python function range1(A), given a list A of integer numbers, tests the following proposition:

3x e A,either x is greater than 10 or x is less than 20 but not both.

Solution.

def  range1(A):

for  x  in  A:

if  ((x  >  10)  ^  (x  <  20)):

return  True

return  False

#  alternative

def  range1(A):

for  x  in  A:

if  ((x  >  10)  and  not  (x  <  20))  or  (not  (x  >  10)  and  (x  <  20)): return  True

return  False

b)   Write a Python function range2(A), given a list A of integer numbers, tests the following proposition:

Ax e A,either x is greater than 10 or x is less than 20 but not both.

Solution.

def  range2(A):

for  x  in  A:

if  not  ((x  >  10)  ^  (x  <  20)):

return  False

return  True

#alternative

def  range2(A):

for  x  in  A:

if  not  (((x  >  10)  and  not  (x  <  20))  or  (not  (x  >  10)  and  (x  <  20))): return  False

return  True

Note 1: Place both functions in a single python file and make sure to name your python file p3.py for the auto-grader to recognize and compile the file. Also, do not include any print statements or header documentation in the file. Use function names as required by each section .

Note 2: Although we have Python set datatype (https://www.w3schools.com/python/python_ sets.asp), we use Python list datatype (https://www.w3schools.com/python/python_lists. asp) to represent sets for simplicity.

Problem 4.   [15 points] Express A  B as a quantified statement

a)  using only A over the universal set.

Solution.   We can start with A = B:

(A = B) 三 A  B V B  Ax(x e A  x e B) V Ax(x e B  x e A) Also we can use a bi-implication as follows:

(A = B) 三 Ax(x e B  x e A)

Now we can apply a negation on the above statements to find A  B

(                                                        )

(A  B) 三 - (Ax(x e A  x e B) V Ax(x e B  x e A))

(A  B) 三 -Ax(x e A  x e B) ^ -Ax(x e B  x e A)

(A  B) 三 -Ax(x e B  x e A)

b)  using only A over a restricted domain.

Solution.   First we define A = B using domain restriction:

(A = B) 三 Ax e Ax e B  V Ax e Bx e A

Now we can apply a negation on the above statements to find A  B

(A  B) 三 - (Ax e Ax e B  V Ax e Bx e A) -Ax e Ax e B  ^ -Ax e Bx e A c)  using only 3 over the universal set.

Solution.

(A  B) ≡ ∃x¬  (x B) (x A)

(A  B) ≡ ∃x (¬ (x B)  ¬ (x B x A))

(A  B) ≡ ∃x¬ (x ∈ A → x ∈ B) ∨ ∃x¬ (x ∈ B → x ∈ A)

(A  B) ≡ ∃x¬ (x ∈ B ↔ x ∈ A)

d)  using only ∃ over a restricted domain.

Solution.   Applying De Morang’s law on the result of part b:

(A  B) ≡ ∃x ∈ A ¬x ∈ B  ∨ ∃x ∈ B ¬x ∈ A ≡ ∃x ∈ Ax  B  ∨ ∃x ∈ Bx  A

e)  Write a Python function notEqual(A,B) that takes two lists where each list represents a set. The function outputs true if the two sets are not equal each other.  Submit a python file p4.py for the auto-grader to recognize and compile the file. Also, do not include any print statements or header documentation in the file.

Solution.   Your function should satisfy the following quantified statement which is ¬ (A = B).

¬ (A = B) ≡ ¬ (A ⊆ B ∧ B ⊆ A)

≡ ¬ (A ⊆ B) ∨ ¬ (B ⊆ A)

≡ (A ̸⊆ B) ∨ (B ̸⊆ A)

x(x  B) ∨ ∃x(x  A)

Note that this is equivalent with the answers of parts a - d.

Definition of equality in sets DeMorgan’s law

Definition of negation in sets Definition of not a subset

def  notEqual(A,  B):

for  a  in  A:

if  a  not  in  B:

return  True

for  b  in  B:

if  b  not  in  A:

return  True

return  False

#  alternative  1

def  notSubset  (A,  B):

for  a  in  A:

if  a  not  in  B:

return  True

return  False

def  notEqual(A,  B):

return  notSubset(A,  B)  or  notSubset(B,A)

#  alternative  2

def  subset(A,B):

for  a  in  A:

if  a  not  in  B:

return  False

return  True

def  Equal(A,  B):

return  Subset(A,  B)  and  Subset(B,A)

def  notEqual(A,  B):

return  not  Equal(A,  B)

Problem 5.   [10 points + 10 Bonus points] Suppose we have the following two sentences:

• It is interesting that she has interested in interesting philosophy.

• Is he interested? No, but philosophers have been.

We want to find the similarity degree of the above two strings. First, we need to group the words using their roots.  For example,  “philosopher” and “philosophy” will both become “philosophy”, “interesting” and  “interested” will both become  “interest”,  “have” and  “has” will both become “have”.

In the first method, we make two sets for the two strings:

Words  in  string  1:  {it,  is,  interest,  that,  she,  have,  in,  philosophy}

Words  in  string  2:  {is,  he,  interest,  no,  but,  philosophy,  have,  been} Now, we draw a Venn diagram using the two sets.

 

Then, we use the following formula to find a similarity score between the two strings:

Similarity1(A,B) =

For example, for the above two strings Similarity1(A,B) =  = 0.333

In the second method, first we list all words and then find frequency of each word in each strings.

 

Then, we use the following equation to find similarity score between the two strings:

Similarity2(A,B) =         Ai · Bi          

^ Ai(2)^ Bi(2)

Here means summation. For example, for the above two strings, we will have:

Similarity2(A,B) =                                                                                                                             = 0.530

a)   Write a Python function similarity1(A,  B) that takes two lists A and B .  Each list rep- resents set of words in one string.  The output of the function is the similarity score between the two strings using the algorithm explained above for similarity1.

Test case 1:

• It is interesting that she has interested in interesting philosophy.

• Is he interested? No, but philosophers have been.

>>>  A  =  ['it',  'that',  'she',  'in',  'is',  'interest',  'philosophy',  'have'] >>>  B  =  ['interest',  'philosophy',  'have',  'he',  'no',  'but',  'been',  'is'] >>>  similarity1(A,  B)

0.333

Test case 2:

• Like trees in a forest, humans share a root system.

• Human activity is killing most of the trees on the planet.

>>>  A  =  ['like',  'tree',  'in',  'a',  'forest',  'human',  'share',  'root',  'system']              >>>  B  =  ['human',  'activity',  'is',  'kill',  'most',  'of',  'tree',  'on',  'the',  'planet'] >>>  similarity1(A,  B)

0.117

b)  Write a Python function similarity2(A,  B) that takes two lists A and B . Each list includes frequency of each word in the two strings. The output of the function is similarity score between the two strings using the algorithm explained above for similarity2.

Test case 1:

• It is interesting that she has interested in interesting philosophy.

• Is he interested? No, but philosophers have been.

 

>>>  A  =  [1,  1,  3,  1,  1,  1,  1,  1,  0,  0,  0,  0]

>>>  B  =  [0,  1,  1,  0,  0,  1,  0,  1,  1,  1,  1,  1]

>>>  similarity2(A,  B)

0.530

Test case 2:

• Like trees in a forest, humans share a root system.

• Human activity is killing most of the trees on the planet.

>>>  A  =  [1,  1,  1,  2,  1,  1,  1,  1,  1,  0,  0,  0,  0,  0,  0,  0,  0]

>>>  B  =  [0,  1,  0,  0,  0,  1,  0,  0,  0,  1,  1,  1,  1,  2,  1,  1,  1]

>>>  similarity2(A,  B)

0.160

c)  Test your functions using the following strings and analyze the outputs. In particular, explain the key difference between the two similarity algorithms.

Test case 1:

• happiness is not a gift

• happiness is a gift

Test case 2:

• happiness is not a gift

• happiness is a happy gift

Test case 3:

• happiness is not a gift

• happiness is a happy happy gift

Test case 4:

• happiness is not a gift

• happiness is a happy happy happy gift


Hint 1:  You do not need to take care of roots. Instead, test your functions using the above test cases.  In similarity1, lists include words in the two strings.  In similarity2, lists include numbers which represent frequency of words involved in the problem.

Hint  2:   Note that ordering of the elements of the lists in similarity1 does not matter but it matters in similarity2.

Hint  3:   Place both functions in a single python file and make sure to name your python file p5.py for the auto-grader to recognize and compile the file.  The answer to part c should be added to p5.py as a comment using comment tag.                                                                  Solution. 

Part a:

def  similarity1  (A,  B):

sizeU  =  len(B)

sizeI  =  0

for  x  in  A:

if  x  in  B:

sizeI  +=  1

else :

sizeU  +=  1

return  sizeI  /  sizeU

Part b:

def  similarity2  (A,  B):

sumxx  =  0

sumxy  =  0

sumyy  =  0

for  i  in  range(len(A)):

x  =  A[i]

y  =  B[i]

sumxx  +=  x*x

sumyy  +=  y*y

sumxy  +=  x*y

return  sumxy/(sumxx*sumyy)**0.5

Part c: Similarity1 returns similarity score 0.8 for all four cases in part c while Similarity2 returns 0.894, 0.801, 0.714 and 0.649 for the four tests respectively.

• similarity1 takes only unique set of words for each string while similarity2 does not.  This means that if you repeat the word “happy” in one string several times, similarity2 changes but similarity1 does not. For example, if the word “happy”is repeated in the second string

50 times, similarity2 drops to 0.381 but similarity1 remains at 0.8.

• similarity1 is good for cases where duplication does not matter, similarity2 is good for cases where duplication matters while analyzing text similarity.  For two product descriptions, it will be better to use similarity1 as repetition of a word does not reduce their similarity.


FYI: Similarity1 is called Jaccard Similarity and similarity2 is called Cosine Similarity. For more information about text similarities read:                                                                                         https://towardsdatascience.com/overview-of-text-similarity-metrics-3397c4601f50