
COM00166M

Department of Computer Science

Applied Artificial Intelligence

SUMMATIVE ASSESSMENT BRIEF


Author: Saul Cross

Assessment type: Summative assignment

Weighting: 100%

Release: Week 3

Deadline: Monday following Week 8, 13:00 (UK time) *

* If this date falls on a UK public holiday or a University of York closure day, the submission date will change. Please check the submission point in the ‘Assignments’ area of the module in Canvas for the exact submission deadline.

I. Module Learning Outcomes

The module learning outcomes for this module are as follows:

MLO 1. Select and apply appropriate AI algorithms and methodologies, with consideration for optimisation and scale to meet business objectives and performance targets.

MLO 2. Critically evaluate AI methodologies through experimental design, exploratory modelling, and hypothesis testing.

MLO 3. Critically analyse techniques for the extraction of data from systems, ensuring standards of data quality and consistency for processing by AI systems.

MLO 4. Identify and discuss appropriate application areas and problems for current AI techniques, such as: neural networks, deep learning, genetic algorithms and local search approaches.

This assessment addresses all the module learning outcomes listed above.

II. Assessment Background/Scenario

You are working as an AI engineer at a reputable organisation. You have a client from the green energy sector who needs to be able to evaluate the impact of various green and sustainable energy sources on CO2 emissions in the UK. They have acquired data from two independent sources, the first covering CO2 emissions by country over time with detail of the contribution of various fossil fuels and other activities, while the second charts energy consumption in a 30-year series with a detailed breakdown of the contribution from renewable energy sources. The challenge is to make a meaningful connection between the two sets of data in order to build a predictive model to demonstrate the impact on CO2 emissions of replacing non-renewable energy sources with renewable energy sources. You have carried out an initial data exploration and found:

· The data provided is taken from the two independent sources and is significantly imbalanced.

· The data contains both numeric and nominal attributes.

· There are a few missing values in the data.
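As a rough illustration of connecting the two sources and handling the missing values noted above, the sketch below joins two small record lists on shared keys and fills a gap with the column mean. The column names (`country`, `year`, `co2_mt`, `renewable_share`) and all values are invented placeholders, not the schema or contents of the datasets provided, and mean imputation is only one of several possible choices.

```python
# Placeholder records: names and numbers are invented for illustration only.
emissions = [
    {"country": "United Kingdom", "year": 2018, "co2_mt": 100.0},
    {"country": "United Kingdom", "year": 2019, "co2_mt": None},  # missing value
    {"country": "United Kingdom", "year": 2020, "co2_mt": 90.0},
]
energy = [
    {"country": "United Kingdom", "year": 2018, "renewable_share": 0.10},
    {"country": "United Kingdom", "year": 2019, "renewable_share": 0.20},
    {"country": "United Kingdom", "year": 2021, "renewable_share": 0.30},
]


def inner_join(left, right, keys):
    """Keep only rows whose key values appear in both sources."""
    index = {tuple(r[k] for k in keys): r for r in right}
    joined = []
    for row in left:
        match = index.get(tuple(row[k] for k in keys))
        if match is not None:
            joined.append({**row, **match})
    return joined


def impute_mean(rows, column):
    """Replace missing entries in `column` with the mean of the present ones."""
    present = [r[column] for r in rows if r[column] is not None]
    mean = sum(present) / len(present)
    return [{**r, column: mean if r[column] is None else r[column]} for r in rows]


merged = impute_mean(inner_join(emissions, energy, ("country", "year")), "co2_mt")
for row in merged:
    print(row)
```

Only the years present in both sources survive the join, so the choice of join type directly affects how much data remains for modelling and should be justified in the report.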

You’ve had a meeting with your client and have agreed to model the data using artificial intelligence techniques - namely, supervised learning and feature selection optimisation. Feature selection is important because it removes irrelevant attributes and helps reduce computation cost. You are expected to construct two robust models and present a report to your client, following the guidelines below:

· Design and build a supervised learning model on the full data.

· Use optimisation techniques (learned in this module) to find a subset of relevant features.

· Design and build a supervised learning model on the derived subset of features.

· Critically evaluate the two learning models (with and without feature selection).

· Evaluate the robustness of the generated models by applying appropriate validation techniques (and identifying a suitable subset of data for validation).
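The validation step in the final point above could, for example, use k-fold cross-validation. The sketch below only generates the train/test index splits; the function name `k_fold_indices` and the values of `k` and `seed` are illustrative assumptions. For time-ordered data, a chronological hold-out may be more appropriate than shuffled folds.

```python
import random


def k_fold_indices(n_rows, k=5, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)       # shuffle once, reproducibly
    folds = [indices[i::k] for i in range(k)]  # k roughly equal folds
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


splits = list(k_fold_indices(10, k=5))
for train_idx, test_idx in splits:
    print("train:", sorted(train_idx), "test:", sorted(test_idx))
```

Each row appears in exactly one test fold, so averaging a model's score over the folds gives an estimate that does not depend on a single arbitrary split.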

While setting the parameters of the optimisation methods, pay special attention to selecting an appropriate fitness function (evaluation criteria). The fitness function plays an important role in the relative success or failure of the potential solutions and setting the direction of the search.
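To make the role of the fitness function concrete, the following is a minimal sketch of wrapper-style feature selection by hill climbing. It rests on several assumptions: the data is synthetic, a 1-nearest-neighbour classifier stands in for the supervised learner purely to keep the example dependency-free, and the `penalty` per selected feature is an arbitrary illustrative value. It is not a prescribed solution to the assessment.

```python
import random


def loo_accuracy(rows, labels, mask):
    """Leave-one-out accuracy of 1-nearest-neighbour on the features kept by mask."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # an empty feature set cannot predict anything
    correct = 0
    for i, row in enumerate(rows):
        nearest = min(
            (k for k in range(len(rows)) if k != i),
            key=lambda k: sum((row[j] - rows[k][j]) ** 2 for j in cols),
        )
        correct += labels[nearest] == labels[i]
    return correct / len(rows)


def fitness(rows, labels, mask, penalty=0.01):
    """Reward accuracy and charge a small (assumed) cost per selected feature."""
    return loo_accuracy(rows, labels, mask) - penalty * sum(mask)


def hill_climb(rows, labels, iterations=100, seed=0):
    """Flip one feature at a time; keep the change only if fitness improves."""
    rng = random.Random(seed)
    current = [1] * len(rows[0])  # start from the full feature set
    current_fit = fitness(rows, labels, current)
    for _ in range(iterations):
        candidate = current.copy()
        j = rng.randrange(len(candidate))
        candidate[j] = 1 - candidate[j]
        candidate_fit = fitness(rows, labels, candidate)
        if candidate_fit > current_fit:
            current, current_fit = candidate, candidate_fit
    return current, current_fit


# Synthetic data: feature 0 separates the classes, features 1-3 are noise.
data_rng = random.Random(1)
labels = [i % 2 for i in range(30)]
rows = [
    [label * 5 + data_rng.random()] + [data_rng.uniform(0, 10) for _ in range(3)]
    for label in labels
]

best_mask, best_fit = hill_climb(rows, labels)
print("selected mask:", best_mask, "fitness:", round(best_fit, 3))
```

Changing the `penalty` term shifts the search towards smaller or larger subsets, which is one way the fitness function sets the direction of the search. Because hill climbing accepts only improving moves, it can stop at a local optimum; simulated annealing, tabu search and genetic algorithms address this in different ways.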

The full datasets will be provided once this assessment is open, together with some additional documentation, including the acknowledgement/referencing details for the datasets which will also be of use in your secondary research.

Please note that you may use any tools to develop your models, e.g. WEKA, Java/TensorFlow, Python/scikit-learn, etc., but you must state which tools were used and why. Your source files must be presented in a form that can be opened/reviewed by the assessor, e.g. .arff (WEKA), .csv (for raw/source data), or specific source code (dependent on the language, if programming).

III. Assessment Tasks

You are required to design and build solutions, and to write a report based on the discussed scenario and the data provided. You should clearly draw on the current literature and use examples from your work throughout this module as supporting evidence for the approach. Your report should provide an initial executive summary and consist of five clear sections - one for each task outlined below. Further formatting details are also given below.

Overall Academic Quality (10% weighting)

Covered by all tasks.

Executive Summary (10% weighting)
(suggested - 300 words)

Overview/summary of the report which should at least contain:

1. What was achieved/undertaken.

2. What processes were applied.

3. What tools were used and why.

4. What the results demonstrated.

5. What should be reconsidered in future.

Task 1 (10% weighting) - Introduction
(suggested - 500 words)

In the context of the scenario provided, your introduction should at least contain:

1. A brief description of the business problem and its significance to the relevant sector.

2. Background information on the field of AI.

3. A description of the link between the business problem and the field of AI.

4. A brief description of the proposed solution.

Task 2 (20% weighting) - Literature Review
(suggested - 700 words)

Given the scenario above, research and identify the main areas of investigation which the research community is currently tackling. Consider the following questions:

1. What are the current ‘problem’ areas and how is AI helping to solve these problems?

2. What techniques have been developed to effectively address those business problems?

3. What tools have been used and what are the selection criteria for them?

4. How are these techniques being evaluated in the context of the ‘problem’?

5. Critically evaluate various approaches/solutions presented in the literature.

Present a discussion around these questions and consider how current research could potentially change or improve your solution to the given scenario.

Task 3 (20% weighting) - Research Design
(suggested - 500 words)

Given the scenario above, design and discuss the potential modelling solution(s), and strongly justify the techniques selected in the context of the ‘problem in hand’. You must select one supervised learning algorithm and one optimisation algorithm to complete this task.

Your report should clearly cover the following:

1. Any assumptions you are making about the given scenario.

2. Any pre-processing you would undertake to make the data fit for purpose.

3. Which optimisation techniques you would apply for feature selection and why (the techniques covered in this module are: hill climbing, simulated annealing, tabu search and genetic algorithms).

4. Which supervised learning techniques you would apply and why (the techniques covered in this module are: artificial neural networks, decision trees, naïve Bayes and support vector machines).

5. An evaluation of the techniques applied in terms of the accuracy of their results (or any other suitable evaluation measure).

6. Algorithmic parameters should be adequately stated and discussed.

Task 4 (20% weighting) - Experimental Results and Analysis
(suggested - 700 words)

After carrying out the modelling of the data provided, both with and without feature selection, this section must at least cover the following points:

1. Present your findings in a clear and concise manner.

2. Discuss your results in the context of the selected optimisation algorithm and supervised learning technique.

3. Discuss how these results can help the business in evaluating the impact of green energy on CO2 emissions.

4. Your arguments should also be supported by the relevant literature.

5. Present the limitations (if any) of your approach in a clear and concise manner.

Task 5 (10% weighting) - Conclusion
(suggested - 300 words)

Your conclusion must at least cover the following points:

1. A summary of the main points.

2. A discussion of the significance of your results.

3. Any recommendation(s) resulting from your analysis.

IV. Deliverables

Your assignment should be laid out following the formatting guidelines that are specified in the ‘Submission Formatting’ page in Canvas.

You should submit the following to the Canvas submission point:

· A zipped file containing your solution source files (not executables), in .zip format ONLY

· A separate file containing your report, in either .docx or .pdf ONLY

If you are submitting multiple files, you must upload all files simultaneously to ensure that they are marked as a single submission. If you want to resubmit one component of your work, you need to re-upload all other files at the same time: every submission must include all of the deliverables listed in the assessment brief.

Your report should be no more than 3,000 words in total. For each task contributing to your report, we have provided guidelines above on the suggested word counts for each section; however, it is your choice how to use the word count limit across the whole of this report. Exceeding the word count will result in any work beyond the word count being disregarded when assessing.

Appendices may be used but should not exceed 5 additional pages and all content must be referred to and discussed in the main body of the text. Any content that is not cross-referenced within the report will NOT be considered.

Appendices should only be used for supportive information, such as over-large figures or tables of data. They are not a device to incorporate material that would otherwise cause you to exceed the word limit. These are not included in the word count.

Your references should come after any appendices and are also not included in the word count.

Referencing

You are required to use the IEEE referencing style for citing books, articles, and all other sources (like websites) used in your assignment.

Good referencing is essential in order to meet the standards of academic integrity set by the University. All of your sources must be acknowledged, regardless of whether you included direct quotes or not. Visit your Academic Integrity Tutorial module in Canvas for additional guidance on effective referencing.

V. Marking Criteria

Overall Academic Quality
Learning Outcome: 1, 2, 3 & 4
Section/Task: All Tasks
Criteria: Clear and coherent across all tasks with appropriate, relevant and effective referencing and citation.
Available marks: 10

Executive summary
Learning Outcome: 1, 2, 3 & 4
Section/Task: N/A
Criteria: Full and clear overview of report.
Available marks: 10

Introduction
Learning Outcome: 1
Section/Task: Task 1
Criteria: Clear description of the problem and its significance to the sector. Discussion of relevant background/underpinning theory in AI. Alignment of the problem to the field of AI. Description of the solution(s) presented.
Available marks: 10

Literature Review
Learning Outcome: 4
Section/Task: Task 2
Criteria: Examines current, relevant problem areas. Considers and evaluates proposed solutions. Includes a critical evaluation of any and all relevant approaches/solutions under review.
Available marks: 20

Research Design
Learning Outcome: 1, 2, 3 & 4
Section/Task: Task 3
Criteria: To cover pre-processing techniques, selection of features and optimisation techniques, evaluation of selected techniques and discussion of algorithmic parameters, accompanied by a clear and justified rationale in each case.
Available marks: 20

Experimental Results and Analysis
Learning Outcome: 2 & 4
Section/Task: Task 4
Criteria: A clear and concise presentation of the results to include mapping of results to the problem; reference to relevant literature and discussion of the solutions in context.
Available marks: 20

Conclusion
Learning Outcome: 1, 2, 3 & 4
Section/Task: Task 5
Criteria: A clear and concise presentation of findings with consideration of limitations and further development.
Available marks: 10

TOTAL: 100

NOTE: Failure to submit the required elements (the report and/or the supporting zip file containing your solution source files) will result in a grade of ZERO. Work NOT submitted in the requested formats (.docx or .pdf for the report and source files in an appropriate and readable format, e.g. .arff, .csv or relevant source code) may NOT be considered.

VI. Marking Criteria: Grade breakdown

Overall Academic Quality 10%

0-39% (Fail): The documentation is poorly structured and flows badly, with little evidence of the effective use of referencing and citation.

40-49% (Compensatable fail): The documentation is not well structured and does not flow well, with very limited/ineffective use of referencing and citation.

50-59% (Pass): The documentation has a clear structure and is both legible and coherent but could flow better and make much better use of referencing and citation.

60-69% (Merit): The documentation is well structured with a clear and coherent flow and uses referencing and citation effectively to support the discussion.

70-100% (Distinction): The documentation is very well structured and flows well, making good use of referencing and citation throughout to support the discussion.

Executive Summary 10%

0-39% (Fail): The executive summary is significantly incomplete.

40-49% (Compensatable fail): The executive summary lacks clarity, providing only limited coverage of the report.

50-59% (Pass): The executive summary is reasonably clear and provides basic coverage for most of the key areas.

60-69% (Merit): The executive summary is clear, giving a good coverage of the key areas of the report.

70-100% (Distinction): The executive summary is both clear and concise with full coverage of the report.

Introduction 10%

0-39% (Fail): The introduction is very limited with little or no theoretical underpinning, an undefined or poorly defined project and no clear alignment to the field of artificial intelligence.

40-49% (Compensatable fail): The introduction is limited with little theoretical underpinning, the project is not clearly defined, and there is limited relevance to the field of artificial intelligence.

50-59% (Pass): The introduction is basic with limited theoretical underpinning, definition of the project and reference to relevant concepts in the field of artificial intelligence.

60-69% (Merit): The introduction gives a clear and concise coverage of the required points with the most relevant theoretical underpinning covered, a well-defined project, and a well-supported discussion of the relevant concepts in the field of artificial intelligence.

70-100% (Distinction): The introduction provides a very clear and concise coverage of the required points with most relevant theoretical underpinning covered, a very well-defined project and a very well-supported discussion of relevant concepts in the field of artificial intelligence.

Literature Review 20%

0-39% (Fail): Coverage is very limited with little relevance in the sources selected and is not a good match for the subject matter under review, with very limited referencing and citation.

40-49% (Compensatable fail): There is limited coverage and the sources selected could be more relevant as they are not a good match to the subject matter under review; referencing and citation are present but limited.

50-59% (Pass): There is basic coverage from mostly relevant sources that do match the subject matter under review to an extent, but referencing and citation could be much more thoroughly and effectively used.

60-69% (Merit): There is some detailed and effective coverage based on relevant sources and this is a good match to the subject matter under discussion that makes effective use of referencing and citation.

70-100% (Distinction): A very detailed yet concise literature review that identifies a good range of relevant sources and aligns well with the subject under discussion, with very effective use of referencing and citation seen throughout.

Research Design 20%

0-39% (Fail): The research design element is very limited, with little valid coverage of the dataset/subset selection, pre-processing, selection criteria for the methods applied, feature selection and parameterisation.

40-49% (Compensatable fail): The research design element is limited, with limited coverage of the dataset/subset selection, pre-processing, selection criteria for the methods applied, feature selection and parameterisation, and lacks rationale or justification for the decisions made.

50-59% (Pass): The research design element is basic, with some coverage of the dataset/subset selection, pre-processing, selection criteria for the methods applied, feature selection and parameterisation; with limited rationale or justification for the decisions made.

60-69% (Merit): The research design element is effective, with reasonable coverage of the dataset/subset selection, pre-processing, selection criteria for the methods applied, feature selection and parameterisation, and some rationale or justification for many of the decisions made.

70-100% (Distinction): The research design element is both clear and effective, with a good level of coverage of the dataset/subset selection, pre-processing, selection criteria for the methods applied, feature selection and parameterisation, and clear rationale or justification for most of the decisions made.