
ASSIGNMENT 1 (Weightage 15%)

AUGUST 2022 SEMESTER (BLOCK 1)

Principles of AI

ITS70304

Assessment Criteria

Assessment Task | Weightage | MLO Assessed | Formative/Summative | Assessment Instrument | Topics | Week | MQF 2.0
Task 1 Part I   | 8%        | 1            | Formative           | N/A                   | 1,2    | 1,2  | C1
Task 1 Part II  | 37%       | 1            | Formative           | Google Colaboratory   | 1,2    | 1,2  | C2

MLO 1: Demonstrate knowledge and the principles of AI.

C1 = Knowledge & Understanding, C2 = Cognitive Skills, C3A = Practical Skills, C3B = Interpersonal Skills, C3C = Communication Skills, C3D = Digital Skills, C3E = Numeracy Skills, C3F = Leadership, Autonomy & Responsibility, C4A = Personal Skills, C4B = Entrepreneurial Skills, C5 = Ethics & Professionalism

Assessment Questions

Taylor’s University (TU) is a well-established educational institution, both in Malaysia and worldwide. It has been ranked the number 1 private university in Southeast Asia for 3 years in a row. Having leapt 349 ranks since 2019 to #284 in the QS World University Rankings 2023, TU is now placed among the top 1% of the most influential universities globally. TU is also ranked #53 in Asia in the QS Asia University Rankings 2022, up 36 places from the previous year and over 140 places over the past seven years.

These successes are a testament to its commitment to providing quality education for its students. More importantly, the honour belongs not just to the institution, but to the entire Taylor’sphere™ community, encompassing dedicated lecturers and researchers, industry partners, alumni, and most of all, the students and their parents.

Taylor’s vision for the future has always been clear: achieving balanced excellence, being recognised as a leading private university locally and globally, and equipping students to graduate in demand. One of the important criteria behind TU’s current ranking is student employability. Within six months of graduation, the university is required to determine how many of its graduates are employed.

In 2021, around 232 students from the first batch graduated from TU. TU’s Exam Unit has prepared these datasets and has been asked to provide an analysis to top management. Your job is to provide a data analysis of these datasets. Please refer to the data file Student_Employability.csv. Your analysis must use machine learning to support the facts about TU’s ranking.

Part I: Knowledge and Understanding (TOTAL 8 marks)

1. Discuss three (3) differences between Artificial Intelligence and Machine Learning. (3 marks)

2. Knowledge Representation and Reasoning (KR, KRR) represents real-world information in a form a computer can understand, and then uses this knowledge to solve complex real-life problems such as communicating with human beings in natural language. Knowledge representation in AI is not just about storing data in a database; it allows a machine to learn from that knowledge and behave intelligently like a human being. Your job in this assignment is to represent knowledge to TU’s top management using the given dataset. Draw a diagram and explain the six (6) components of the Cycle of Knowledge Representation in AI to show the interaction of an AI system with the real world and the components involved in showing intelligence. (5 marks)

Part II: Cognitive Skills (Exploratory and Analytical) (TOTAL 37 marks)

Attached are two (2) datasets containing records of Student Employability and Student Salary. These datasets are raw and need to be pre-processed before they can be fed into an AI prediction model. Pre-process the datasets with Python programming on Google Colab.

(Demonstrate broad and coherent theoretical and technical knowledge and comprehension; add comments where necessary.)

1. Load the first dataset (Student_Employability.csv) into a Pandas DataFrame. (1 mark)

Using Data Pre-processing and Data Cleansing techniques, find the following information:

a)  Number of rows and columns (1 mark)

b)  Determine the datatype of each column (1 mark)

c)  Find missing value of each column (1 mark)

d)  Remove all rows with missing data (1 mark)

e)   Compute the mean value of the age column (2 marks)

f)   Impute missing data in the age column with the mean value (2 marks)

g)  Two columns, gender and grade, contain categorical data. Impute the missing categorical data with the mode of each column (3 marks)
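The steps in Question 1 can be illustrated with a minimal sketch. It runs on a small made-up table rather than the real file, so the column names (student_id, age, gender, grade, employed) and all values are assumptions; on the actual data, pd.read_csv would be pointed at Student_Employability.csv instead.

```python
import io

import pandas as pd

# Hypothetical stand-in for Student_Employability.csv; the real column
# names and values may differ.
csv_text = """student_id,age,gender,grade,employed
1,22,F,A,Yes
2,,M,B,No
3,24,,A,Yes
4,23,M,,Yes
5,25,M,A,No
"""
df = pd.read_csv(io.StringIO(csv_text))

n_rows, n_cols = df.shape        # (a) number of rows and columns
col_types = df.dtypes            # (b) datatype of each column
missing = df.isna().sum()        # (c) count of missing values per column

# (d) dropna returns a new frame; df itself is left untouched so that the
# imputation steps below still have missing values to fill.
complete_rows = df.dropna()

mean_age = df["age"].mean()      # (e) mean skips missing values by default

imputed = df.copy()
imputed["age"] = imputed["age"].fillna(mean_age)            # (f) mean imputation
for col in ["gender", "grade"]:                             # (g) mode imputation
    # mode() returns a Series because ties are possible; take the first entry
    imputed[col] = imputed[col].fillna(imputed[col].mode()[0])

print(n_rows, n_cols)
print(missing)
print(imputed)
```

Dropping rows and imputing are alternative treatments of the same gaps, which is why the sketch keeps the dropped version in a separate variable instead of overwriting df.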

2. Load the second dataset (Student_Salary.csv) into a Pandas DataFrame. (1 mark)

Using Data Pre-processing and Data Cleansing techniques, find the following information:

a)  Number of rows and columns (1 mark)

b)  Determine the datatype of each column (1 mark)

c)   Compute the mean value of the salary column (2 marks)

d)  Compute the median value of the salary column (2 marks)

e)  Find the mode of the city column (2 marks)

f)   Merge the Student_Employability DataFrame with the Student_Salary DataFrame (2 marks)

g)  Once merged, separate the categorical columns from the DataFrame (2 marks)

h)  Find the frequency distribution for gender, grade, employed and city (2 marks)
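As a rough illustration of the steps in Question 2, the sketch below uses two tiny hand-made tables. The shared key student_id, the other column names and every value are assumptions made for the example; the real files may be keyed and named differently.

```python
import pandas as pd

# Hypothetical miniature versions of the two files.
employability = pd.DataFrame({
    "student_id": [1, 2, 3, 4],
    "age": [22, 23, 24, 23],
    "gender": ["F", "M", "M", "F"],
    "grade": ["A", "B", "A", "A"],
    "employed": ["Yes", "No", "Yes", "Yes"],
})
salary = pd.DataFrame({
    "student_id": [1, 2, 3, 4],
    "salary": [3000, 0, 3500, 4000],
    "city": ["City A", "City B", "City A", "City C"],
})

mean_salary = salary["salary"].mean()       # (c)
median_salary = salary["salary"].median()   # (d)
city_mode = salary["city"].mode()[0]        # (e) first entry in case of ties

# (f) inner join on the assumed shared key
merged = employability.merge(salary, on="student_id", how="inner")

# (g) split by dtype: everything that is not numeric is treated as categorical
categorical = merged.select_dtypes(exclude="number")
numeric = merged.select_dtypes(include="number")

# (h) frequency distribution of each categorical column
frequencies = {
    col: merged[col].value_counts()
    for col in ["gender", "grade", "employed", "city"]
}

print(mean_salary, median_salary, city_mode)
print(list(categorical.columns))
print(frequencies["city"])
```

Selecting by dtype treats an identifier such as student_id as numeric, so on real data it may need to be excluded by hand before any statistics are computed.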

3. Perform a simple Exploratory Data Analysis (EDA) on the following attributes:

•   Age (1 mark)

•   Gender (1 mark)

•   Grade (1 mark)

•   Employed (1 mark)

•   Salary (1 mark)

•   City (1 mark)
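One possible shape for a simple EDA over these attributes is sketched below: summary statistics for the numeric columns and relative frequencies for the categorical ones. The table is made up for illustration, and the column names are assumptions.

```python
import pandas as pd

# Hypothetical merged table; names and values are illustrative only.
merged = pd.DataFrame({
    "age": [22, 23, 24, 23],
    "gender": ["F", "M", "M", "F"],
    "grade": ["A", "B", "A", "A"],
    "employed": ["Yes", "No", "Yes", "Yes"],
    "salary": [3000, 0, 3500, 4000],
    "city": ["City A", "City B", "City A", "City C"],
})

# Numeric attributes: count, mean, std, min, quartiles and max
numeric_summary = merged[["age", "salary"]].describe()

# Categorical attributes: share of each category
category_share = {
    col: merged[col].value_counts(normalize=True)
    for col in ["gender", "grade", "employed", "city"]
}

# A simple cross-attribute view: average salary per employment status
mean_salary_by_employed = merged.groupby("employed")["salary"].mean()

print(numeric_summary)
print(category_share["grade"])
print(mean_salary_by_employed)
```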

4. Data can be visualized in many forms. Plot a scatter diagram to visualize the relationship between Age and Salary (4 marks)
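A scatter diagram of this kind might be produced with matplotlib roughly as follows. The data points are invented for the example and the column names age and salary are assumptions; in Google Colab the figure is rendered when the cell finishes running.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data; the real values come from the merged datasets.
data = pd.DataFrame({
    "age": [22, 23, 24, 23, 25],
    "salary": [3000, 3200, 3500, 3100, 4000],
})

fig, ax = plt.subplots(figsize=(6, 4))
ax.scatter(data["age"], data["salary"], alpha=0.7)
ax.set_xlabel("Age (years)")
ax.set_ylabel("Salary")
ax.set_title("Age vs. salary (hypothetical data)")

# Pearson correlation as a numeric companion to the plot
correlation = data["age"].corr(data["salary"])
print(round(correlation, 3))
```

Labelling both axes and adding a title also addresses the submission requirement that all figures be labelled properly.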

Submission Requirements

1.   Font type                     : Times New Roman

2.   Font size                     : 12

3.   Line spacing               : 1.5

4.   Alignment                   : Justify Text

5.   Document type           : .pdf, .ipynb

6.  Number of pages        : 5 – 20 pages (do not exceed the page limit)

7.   Your full report should consist of the following:

a)   Cover page (Name, ID, Date, Signature, Score)

b)  Marking Rubrics & Declaration (attach as second page in the report)

c)  Report of your answer script

d)  Appendixes (line spacing = 1.0)

•   List of references (APA format)

•   Python script

•   Turnitin similarity score report (the percentage of similarity from each source must be shown)

8.   Start each question on a separate page (Part I and Part II)

9.   Label all figures and tables properly.

10. File naming conventions: StudentName_Assignment1

Notes:

•   Students are not allowed to transcribe directly (copy and paste) any material from another source into their submission.

•   Start each question on a new page (Part I and Part II).

•   Answer in the form of a short essay (50 to 200 words) and print out the relevant Python program outputs.

•   All process/functions must be clearly explained.

•   Include in-text citation to support your answers and add the list of references at the end of your report (APA format). The list of references is to be alphabetized by the first author's last name, or (if no author is listed) the organization or title.

•   The Turnitin similarity limit for this module is 20% overall and less than 1% from any single source, excluding program source code.