COMP8410 Data Mining S1 2021
Assignment 1
Maximum marks: 100
Weight: 15% of the total marks for the course
Length: Maximum of 8 pages excluding cover sheet, bibliography and appendices.
Layout: A4. At least 11 point type size. Use of typeface, margins and headings consistent with a professional style.
Submission deadline: 9am, Monday 15th March
Submission mode: Electronic, PDF via Wattle, file-name includes u-number
Estimated time: 15 hours
Penalty for lateness: 100% after the deadline has passed
First posted: 22nd Feb, 9am
Last modified: 22nd Feb, 9am
Questions to: Wattle Discussion Forum
This assignment specification may be updated to reflect clarifications and modifications after it is first issued.
In this assignment, you are required to submit a single essay in the form of a single PDF file with a file-name that includes your University u-number ID. The first page must have a clearly identified title and author, identified by both name and university u-number. You may also attach supporting information (appendices) in the same PDF file. Appendices will not be marked but may be treated as supporting information to your essay.
This is a single-person assignment and must be completed on your own. You must use quality reference material and carefully reference all the material that you use via in-text citations. Any material that you quote should have the source clearly referenced. It is unacceptable to present any portion of another author's work as your own. Anyone found doing so will be penalised in marks. In addition, CECS plagiarism procedures apply.
It is strongly suggested that you start working on the assignment right away. You can submit as many times as you wish. Only the most recent submission at the due date will be assessed.
Task
You are to write a well-researched essay that critically evaluates the ethics and social impact of a data mining project.
1. Select a Data Mining project and describe it.
You are asked to select a data mining project from your workplace. This could be a past, completed project, a current, active project, or a future project in planning stages. You may select a scientific project, but the project must raise sufficient genuine ethical questions for you to have something to write about in the assignment. For example, the project may use data corresponding to attributes of individual people or organisations that could be privacy-sensitive, or the mining results could entrench bias against those people or organisations. The project must involve data mining or analytics; simple data collection and release, whether intentional or not, is not sufficient.
If it is difficult for you to find any such project (for example, if you are not employed, or you cannot share sufficient information about a workplace project), then you may use a real-world project related to predictive policing, that is, using data mining or statistical analytics to predict the participants, locations, and/or times of criminal activities in order to allocate policing resources. There is an abundance of information available on specific predictive policing projects to be found in magazine and newspaper publications and in research papers. Remember to cite carefully.
On-campus and online: You are expected to choose the predictive policing option here, although you may choose the workplace project option if you prefer.
In your essay you will need to describe the project in terms of its aims, its methods, the source and nature of the data it uses, the authority for the organisation’s access to the data, and the expected use and impact of any results obtained. For the impact you should consider not only how the results are planned to be used, but also how they otherwise could be or have been used. In every case, you will need to consider whether the data was provided with consent, whether it is or could be seen to be of a personal nature, and whether the outcomes of the data mining will contribute to social improvement or improved services to consumers or the public. You will also need to describe any other aspects of the project that are necessary for you to address the other aspects of your essay.
For a workplace project, you are encouraged to attach non-confidential background material, written by others, concerning the project about which you write, where this may help to support the information provided in your essay. This should be clearly marked as an appendix and its source and status identified.
2. Consider the ethical aspects of the project.
The Australian Computer Society (ACS) Code of Professional Conduct 2014 is expected to be applied by all Computing Professionals in Australia. It sets out 6 values but stresses the primacy of the public interest as the overriding value. In 2017, the US Branch of the Association for Computing Machinery (ACM), recognizing the ubiquity and far-reaching impact of algorithms in daily lives, issued a Statement on Algorithmic Transparency and Accountability including 7 principles designed to address potentially harmful social discrimination due to bias. In 2018, the Australian Government Office of the Australian Information Commissioner released the Guide to Data Analytics and the Australian Privacy Principles (APP). The research community has been addressing the principle of explanation, and its work is surveyed in Du, Liu and Hu (2020), “Techniques for Interpretable Machine Learning”, Communications of the ACM 63(1).
You are asked to discuss the ethical aspects of your data mining project with particular reference to all of the ACS Code, the US ACM Statement (including the 7 Principles) and the APP. You must consider the privacy of individuals where personal information is involved, such as credit card transactions, health care records, personal financial records, biological traits, criminal or justice investigations, ethnicity or lifestyle choices.
You may need to address complex issues, like whether the potential cost to a few may be outweighed by the benefit to many. You are not expected to provide simple, one-directional answers. While your project may raise many ethical issues, paying attention to the page limit, you are advised to broadly introduce those that you recognise but then to focus your discussion more deeply on some particular issue(s) you choose.
3. Recommend how the project should, could, or should have managed ethical issues related to data mining.
You are expected to form an opinion on the appropriate measures to put in place to address the ethical issues you have identified. You must place your opinion in the context of technological solutions available to address ethical issues in data mining. However, you are not asked to consider those methods in detail; a light coverage of the expected benefits of the approach is sufficient. The Du et al paper will assist you with technical approaches to some ethical issues you may encounter. Other potential technical approaches are summarised in the course notes for Week 1. You are also specifically required to go beyond such technical solutions alone to consider procedural, governance or educational approaches to managing ethical issues.
While you are asked to provide your own point of view of measures that could be taken, you are also asked to explicitly critique alternative views, such as, perhaps, the measures that were put in place when the project was conducted, or measures that relate to the project that you can discover from the literature or Web sources. Alternatively, you could interview colleagues in your workplace (but not students of this course) in order to gain alternative points of view about what measures could be taken that are ethically acceptable. You may also interview other people that are potentially affected by the results of the project. Consider attaching a transcript, recording or extracts from the interviews as appendices to your essay – such material, where relevant, will be considered as evidence of your research for the essay.
You are free to conclude that ethical considerations would recommend against the project going ahead, but any conclusion you make must be supported by a well-reasoned argument.
General Comments
An abstract or executive summary is not required. A cover sheet is optional and does not contribute to the page count. No particular layout is specified, but you should follow a professional style and use no smaller than 11 point typeface and stay within the maximum specified page count. It is a strict maximum: long-winded or irrelevant content within the limit will be penalised and text beyond the limit will be treated as non-existent. Page margins, heading sizes, paragraph breaks and so forth are not specified but a professional style must be maintained. Appendices may be used and do not contribute to the page count, but appendices may be only quickly scanned or used for reference and will not be specifically marked.
Your essay is expected to be a well-researched piece of critical writing. You may find this resource from Sydney University helpful for understanding what is expected in critical writing, noting that critical writing necessarily includes elements of descriptive, analytical, and persuasive writing as well.
http://sydney.edu.au/stuserv/learning_centre/help/analysing/an_distinguishTypes.shtml.
You should pay close attention to references in order to demonstrate the research component of your essay, to support your argument with expert opinion and evidence, and to appropriately attribute the work of others, including all reference documents made available to you (but not this assignment specification itself). No particular referencing style is required. However, you are expected to reference conventionally, conveniently, and consistently. Your references should be sufficient to unambiguously identify the source, to describe the nature of the source, and to retrieve the source in online and (if possible) traditional publisher formats.
An assessment rubric is provided. The rubric will be used to mark your assignment. You are advised to use it to supplement your understanding of what is expected for the assignment and to direct your effort towards the most rewarding parts of the work.
Your assignment submission will be treated confidentially, but it will be available to ANU staff involved in the course for the purposes of marking. Please respect your employer’s expectations of confidentiality in your assignment. If you cannot share sufficient information about your project in order to address the assignment questions, then please choose a different project or take the alternative option given above.
Assessment Rubric
This rubric will be used to mark your assignment. You are advised to use it to supplement your understanding of what is expected for the assignment and to direct your effort towards the most rewarding parts of the work. Your assignment will be marked out of 100, and marks will be scaled back to contribute to the defined weighting for assessment of the course.
Review Criteria
Overall holistic evaluation of the report (Max Mark: 20)
Exemplary (17-20): Highly original and very interesting. Excellent, detailed and relevant discussion that develops and enhances the reader's understanding of the topic. Very clear key message argued throughout.
Excellent (14-16): Interesting with some originality. Relevant discussion of sufficient detail to allow the reader to develop a clear understanding of the topic. Clear key message and associated conclusion.
Good (12-13): Interesting but lacking originality. Although relevant, discussion sometimes lacks sufficient detail to allow the reader to develop a consistent understanding of the topic. Identifiable key message and associated conclusion.
Acceptable (10-11): Not very interesting or original. Discussion is not always relevant nor sufficiently detailed to enable the reader to develop an understanding of the topic. Difficult to be certain what the key message is and how the conclusion relates to it.
Unsatisfactory (0-9): Boring and mundane. Discussion lacks detail, is mostly irrelevant and doesn't help the reader to develop an understanding of the topic. No discernible key message or conclusion.
Communication, Structure and Presentation (Max Mark: 10)
Exemplary (9-10): Exemplary use of language enhancing the quality of the submission. Very well ordered with logical and clear structure supported by appropriate headings and sub-headings. All use of others' ideas and materials acknowledged. References are all included and are formatted consistently and appropriately. Diagrams and/or images are ideally suited to the points where they are used.
Excellent (7-8): Very good use of language. Well-ordered and logical. Headings and sub-headings help to clarify text. All use of others' ideas and material is acknowledged. All references are included, though with some minor inconsistency of in-text citation or formatting. Diagrams and/or images are used effectively.
Good (6): Reasonable but needs some revision. Mostly well-ordered and logical, most supported by headings and sub-headings. All use of others' ideas and material is acknowledged. Some references are missing and there are occasional inconsistencies of in-text citation and formatting. Diagrams and/or images improve readability.
Acceptable (5): Poor writing or spelling, needs significant revision. Visual presentation not of professional quality. Order is not always logical and is sometimes confusing. Headings are simply those of the questions posed. All use of others' ideas and material is acknowledged, though sometimes inconsistently. Missing references and inconsistent in-text citation and formatting. Diagrams and/or images are not well selected or incompletely explained or poorly labelled.
Unsatisfactory (0-4): Very difficult to understand. Order is confusing and not always logical. Headings and sub-headings do little to help clarify the text. Not all use of others' ideas and material is acknowledged. Missing in-text citations, i.e. plagiarism. References in the bibliography not used in the text. Poorly and inconsistently formatted. Diagrams and/or images detract from the key messages.
Project Description (Max Mark: 20)
Exemplary (17-20): The project basics are given: aims, methods, data source, data nature, authority, expected impact, and a creative analysis of alternative possible uses of mining results. The scope of the project introduces clear and richly variable challenges around ethical considerations. Project description is supported by evidence.
Excellent (14-16): Most of the project basics are given: aims, methods, data source, data nature, authority, expected impact, and some alternative possible uses of mining results. Project description is supported by evidence.
Good (12-13): The project description provides adequate context for the discussion concerning ethical aspects, although some key elements could be expanded to support richer ethical discussion. Project description is linked to verifiable statements.
Acceptable (10-11): Project description is barely adequate for the purpose.
Unsatisfactory (0-9): Key elements of the project description are missing or insufficiently explained.
Ethical aspects raised (Max Mark: 30)
Exemplary (27-30): A broad range of potential ethical issues are raised. Issues raised address every type: biased decisions, individual privacy, and public interest or quality of life. Discussion of ethical issues by linking to the ACM Statement, ACS Code of Conduct and Australian Privacy Principles demonstrates a mature understanding of professional ethics. Analysis of the issues demonstrates an understanding of the complexity in balancing alternative viewpoints.
Excellent (23-26): At least 3 distinct ethical issues raised and clearly explained with reference to the project. Potential ethical issues raised address at least 2 out of 3 of biased decisions, privacy, and public interest or quality of life. Discussion of ethical issues linked to many of the ACM Statement, ACS Code of Conduct and Australian Privacy Principles, at the item level. Pros and cons for various viewpoints identified throughout.
Good (19-22): At least 3 distinct ethical issues raised and clearly explained with reference to the project. Issues raised are discussed in the context of the ACM Statement, ACS Code of Conduct, and the Australian Privacy Principles. Some issues are presented from more than one viewpoint.
Acceptable (15-18): At least 2 distinct ethical issues are raised and discussed in the project context. There is a cursory attempt to relate the issues to the ACM Statement, the ACS Code of Conduct and/or Australian Privacy Principles but the analysis is shallow. Some alternative viewpoints are recognised, but only lightly.
Unsatisfactory (0-14): Ethical issues may be raised but are not adequately discussed in the context of the project (How would they occur? Who could be affected? And so forth). Unclear whether the relevance and purpose of the ACM Statement, ACS Code of Conduct and the Guide to Data Analytics and the Australian Privacy Principles have been fully understood. Generally a failure to recognise alternative viewpoints.
Recommendation on how to manage ethical aspects (Max Mark: 20)
Exemplary (17-20): Some technological solutions identified and explained for addressing specific ethical concerns in the project. Procedural, governance and educational approaches to managing ethical issues identified and contextualised for application in the project. Surprising or creative ideas. Balanced presentation of alternative measures that were or could be taken. Opinion is persuasively supported by argument.
Excellent (14-16): Some relevant technical approaches to ethical concerns described. Some procedural, governance or educational approaches to managing ethical issues identified. Balanced presentation of alternative approaches that differ from the recommended approach. Opinion is clear and consistent with argument.
Good (12-13): A few technical approaches identified but not clear that they are important for the project in question. A few procedural, governance or educational approaches to managing ethical issues identified but not clear whether they are relevant. Alternative approaches to recommended approach given but not well defended. Opinion clear but rationale missing.
Acceptable (10-11): A few technical, procedural, governance or educational approaches to managing ethical issues identified, but not clear that they are important for the project. Management approaches not well tied to project context. Poor presentation and analysis of defensible alternatives. Recommendation given.
Unsatisfactory (0-9): Scant description or range of procedural, governance, educational or technical approaches offered, demonstrating ineffective research. Pros and cons for various approaches to managing ethical issues not (or barely) presented. Recommendation unclear or incomplete.
2021-02-26