University of Chester – Postgraduate Programmes
Assignment Specification
Faculty of Science and Engineering
Department of Computer Science

Module No: CO7405
Module Title: Principles of Data Science
Assessment No: 2
Weighting: 70%
Submission Date: 19 Jan 2023 at 17:30
Feedback due by: 16 Feb 2023
Assignment Title: Programming assignment
Learning Outcomes Assessed: 2. Develop and critically evaluate software tools in Python or other appropriate languages/tools for processing and visualising data.

Submission Information
For projects that include programming code: the TurnItIn submission box will have multiple parts. You must submit to the appropriate part:
· A PDF file with all programming code from your project (in a monospace font), followed by a reference list in APA format.
· A ZIP file containing the project.
Both files must be named with your assessment number (J number), e.g. J123456.pdf and J123456.zip. Files submitted in an incorrect format will usually be marked as zero. All components must be submitted to avoid receiving a mark of zero. Any late work penalties for assignments will be calculated using the latest submission date/time.

Extensions
Extensions should be requested through the online system available on the Registry services pages on Portal. Late work is penalised at the rate of 5% per day or part thereof.

Academic Integrity
Referencing code
Code adapted from third parties must be clearly referenced using comments to denote the start and end of the adapted code. You must also include an APA format reference in the PDF file.
Example of referenced code
//code adapted from Thomson, 2012
if (someCharacter == 'z' || someCharacter == 'Z') {
    someCharacter -= 25;
} else {
    someCharacter += 1;
}
//end of adapted code
Example of reference entry in PDF file
Thomson, C. (2012). Rot-13 function in Java?. Retrieved from http://stackoverflow.com/questions/8981296/rot-13-function-in-java
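Since this assignment is written in Python, the same convention can be followed with `#` comments. The sketch below is only a Python rendering of the example above; the function name `shift_letter` and its parameter are illustrative and do not come from the cited source.

```python
def shift_letter(some_character):
    """Shift a letter one place along the alphabet, wrapping z back to a."""
    # code adapted from Thomson, 2012
    if some_character in ("z", "Z"):
        shifted = chr(ord(some_character) - 25)
    else:
        shifted = chr(ord(some_character) + 1)
    # end of adapted code
    return shifted
```

The two marker comments enclose only the adapted lines, so the surrounding code is still clearly your own work.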
Assignment Brief
Task 1
You are to create and critically evaluate a software tool made in Python for processing and visualising data.
You have been hired as a data scientist to help the business understand some new house price data the company has acquired. You are expected to create a Python program that will show any trends, insights, and interesting information from the data.
To delve deeper into the data, you are expected to produce some form of prediction. This can be based on any of the data (e.g. house prices by area, average number of days on the market vs lot size, number of bathrooms vs sales days).
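As one possible reading of "some form of prediction", the sketch below fits a straight line by ordinary least squares and uses it to estimate a new value. It is not a required method: the helper `fit_line` is hypothetical, and the numbers are made up for illustration rather than taken from the Moodle datasets.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    if s_xx == 0:
        raise ValueError("x values must not all be identical")
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = s_xy / s_xx
    return slope, mean_y - slope * mean_x


# Made-up illustrative values, NOT taken from the assignment data.
lot_sizes = [1000, 1500, 2000, 2500]
days_on_market = [30, 40, 50, 60]

slope, intercept = fit_line(lot_sizes, days_on_market)

# Estimate days on the market for a hypothetical lot size of 3000.
predicted_days = slope * 3000 + intercept
```

A single-variable line is the simplest case; a stronger submission would justify the choice of variables and report how well the model fits.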
You are to create a Python tool that will complete the extract, transform, load (ETL) process using the data. You must cleanse the data and produce outputs. You will use the current house price data from the CSV files on Moodle.
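The three ETL stages can be sketched as three small functions. This is a minimal outline that uses only the standard library; the column names `CITY` and `PRICE` and the sample rows are assumptions made for illustration, and the real files may be laid out differently.

```python
import csv
import io


def extract(source):
    """Extract: read every row of a CSV file-like object into a list of dicts."""
    return list(csv.DictReader(source))


def transform(rows):
    """Transform: drop rows without a usable price and convert prices to floats."""
    cleaned = []
    for row in rows:
        # "PRICE" is a hypothetical column name.
        raw = (row.get("PRICE") or "").replace("$", "").replace(",", "").strip()
        try:
            price = float(raw)
        except ValueError:
            continue  # skip rows whose price is missing or not numeric
        cleaned.append({**row, "PRICE": price})
    return cleaned


def load(rows, target):
    """Load: write the cleaned rows to a CSV file-like object."""
    if not rows:
        return
    writer = csv.DictWriter(target, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)


# Made-up sample standing in for one of the data files.
sample_csv = '''CITY,PRICE
Exampleville,"$1,200,000"
Exampleville,
Exampleville,950000
'''

cleaned_rows = transform(extract(io.StringIO(sample_csv)))
output = io.StringIO()
load(cleaned_rows, output)
```

In a real tool, `extract` would be given an opened data file and `load` an opened output file instead of the in-memory buffers used here.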
Notes:
· The datasets are on Moodle (under the assessment tab), named San_Francisco.csv and New_York.csv.
· Make sure to explore multiple columns in the data to find trends or interesting patterns within.
· Additional formatting of visualisations will gain extra marks.
· More complex visualisation and deeper understanding of the data methods will gain extra marks.
· You are expected to use exception handling.
· Add comments to your code.
· All diagrams must have an index and labels.
· All the above must use Python only. You are not to use external programs or websites to create the data you present. Therefore, your Python must generate all of the above.
· Redfin has an API (https://pypi.org/project/redfin/), which may help improve prediction results.
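The exception handling and commenting notes above could be met in many ways. The sketch below shows one of them, assuming the data files sit in the working directory; the helper names `load_rows` and `to_number` are hypothetical.

```python
import csv


def load_rows(path):
    """Return the rows of a CSV file, or an empty list if it cannot be read."""
    try:
        with open(path, newline="", encoding="utf-8") as handle:
            return list(csv.DictReader(handle))
    except FileNotFoundError:
        # The most likely failure: wrong file name or wrong folder.
        print(f"Could not find {path}; check the file name and folder.")
    except (OSError, csv.Error, UnicodeDecodeError) as error:
        # Any other problem opening or parsing the file.
        print(f"Could not read {path}: {error}")
    return []


def to_number(text, default=None):
    """Convert text to a float, returning `default` for blank or invalid values."""
    try:
        return float(text)
    except (TypeError, ValueError):
        return default


rows = load_rows("San_Francisco.csv")
```

Catching specific exception types, rather than a bare `except`, keeps unexpected bugs visible while still handling the failures that are anticipated.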
You are expected to produce a report of your findings and predictions that will be shown to the senior management team.
Task 2
You are to create a short report critically evaluating your created tool.
You should evaluate the effectiveness of your tool and how well it displayed the results. You should comment on improvements that can be backed up by research and examples.
This section of the report should be no more than 1000 words.
Assessment Criteria
70%+ will be awarded for:
• Demonstrating in-depth knowledge
• Showing excellent knowledge of the topic area
• Excellent command and understanding of areas covered
• A very sophisticated critical evaluation and program, with evidence of prediction.
60-70% will be awarded for:
• Demonstrating extensive knowledge
• Showing good knowledge of the topic area
• Sound command, understanding, and usage of relevant tools
• A sound critical evaluation and program.
50-60% will be awarded for:
• Demonstrating some knowledge
• Showing relevant knowledge of the topic area
• Showing good command and understanding of areas covered
• A critical evaluation and program.
Answers that fall below the criteria for a pass will receive a failure mark.