

COMP5310 Project Stage 2

Develop and evaluate a predictive model

Due: 11:59PM on 15th of May 2025 (Week 11)

This assignment is worth 25% of the final mark of the unit of study.

GROUPS

This stage is done with the same group members you worked with for Stage 1. However, under exceptional circumstances, an alternative group may be created by the tutor when a group is reduced in size due to members discontinuing this unit. If this applies to you, please email the unit coordinator maryam.khaniannajafab[email protected] or the TAs, [email protected] or w[email protected], to discuss this.

Note: Each member of the group is required to complete individual tasks, but the project will be submitted as a combined effort. The final project will be marked as a whole, with both individual and group components contributing to the final grade. All assessments will be based on the single, submitted document.

Dispute resolution

If, during the course of the assignment work, there is a dispute among group members that you can’t resolve or that will impact your group’s capacity to complete the task well, you need to inform the unit coordinator [email protected] or the TAs: [email protected] or [email protected]. Make sure that your email specifies the group name and is explicit about the difficulty; also, make sure this email is copied to all group members (including anyone you are complaining about).

We need to know about problems in time to help fix them, so set early deadlines for group members, and deal with non-performance promptly (don’t wait till a few days before the work is due to complain that someone is not delivering on their tasks). If necessary, the coordinator will split a group and leave anyone who didn’t participate effectively in a group by themselves (they will need to achieve all the outcomes on their own). This option is only available up until Thursday Week 9, which is the last day with time to resolve the issue before the due date. For any group issues that arise after this time, you will need to try to resolve the problem on your own, and you will continue to be treated as a single group.

If someone doesn’t provide the material required for the report, or their material is not of the agreed standard, you should still have the report show what that person did. Their section of the report may be empty if they don’t produce anything, or it may have material but not enough. In such cases, please put a “Note to marker” on the front page of the report, which describes the circumstances. That way, we can consider how best to apply the marking scheme. Note that it is not expected or sensible for other members to do the work that someone failed to deliver.

PROJECT

Overview

The objective of Stage 2 of the project is to build a robust predictive model using the clean dataset obtained in Stage 1. This stage will involve advanced predictive modelling techniques, as well as thorough model evaluation and optimisation processes.

Important Notes:

1. You MUST work in the same groups you worked on during Stage 1.

2. Further cleaning of the dataset, addition of previously dropped columns, and/or removal of columns are permitted if you wish.

3. Each member must use a different modelling technique to develop their predictive model.

4. Changing the target variable and research question is also permitted, if the group chooses to do so.

DELIVERABLES

Report

The report must have a maximum of 3 pages for each individual section and maximum of 3 pages for the group section (including both group components 1 and 2) for a group of 2, and a maximum of 4 pages for the group section for a group of 3. You must use the high-level headings, as provided below, to indicate the different sections and sub-sections of the report.

You must use line spacing of at least 1.15pt, margins of at least 1.8cm, and body font size of at least 10pt. The goal is to convey the problem clearly and concisely.

The report should be in PDF format, named using the following convention: “GroupX_A2_Report.pdf”, where X is your group number. DO NOT SUBMIT A FOLDER THAT IS NAMED GroupX_A2_Report. It must have a front page that gives the group number, and the list of members involved (giving their SIDs AND unikeys, NOT their names).

The body of the report must have a structure as follows:

Group Component 1

The report must begin with a group section including:

1. Topic and research question: Describe the research problem comprehensively, emphasizing its significance in the domain. All members must agree upon and aim to answer the same research question. Clearly articulate the research question and highlight its implications for various stakeholders. Discuss how addressing this question could lead to actionable insights or improvements in decision-making for the stakeholders.

2. Dataset: Provide a detailed overview of the dataset and discuss any challenges, class imbalances, and/or biases present in the data and how they might impact the modelling process.

3. Setup

3.1. Modelling agreements: Identify an attribute that you will all make predictions about and agree on at least two measures of success for the predictive models you will be producing. These measures should go beyond standard accuracy metrics and may include area under the receiver operating characteristic curve (AUC-ROC), F1-score, precision-recall curves, etc. Explain the rationale behind these measures and their suitability for the research question.
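As a minimal sketch of how such measures might be computed, assuming a binary target and scikit-learn (the brief prescribes neither), the following uses synthetic imbalanced data and a placeholder logistic regression in place of the Stage 1 dataset and a real model. F1 is computed from hard class labels, while AUC-ROC and average precision (a one-number summary of the precision-recall curve) need predicted scores.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score, f1_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced binary data standing in for the Stage 1 dataset (assumption).
X, y = make_classification(n_samples=1000, n_features=10, weights=[0.9, 0.1], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Placeholder model; any classifier exposing predict and predict_proba would do here.
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

y_pred = model.predict(X_test)               # hard class labels, used by F1
y_score = model.predict_proba(X_test)[:, 1]  # positive-class probabilities, used by AUC

metrics = {
    "f1": f1_score(y_test, y_pred),
    "roc_auc": roc_auc_score(y_test, y_score),
    "average_precision": average_precision_score(y_test, y_score),
}
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With a 90/10 class split, a model that always predicts the majority class would reach about 90% accuracy while scoring an F1 of zero, which is one reason measures beyond accuracy are worth agreeing on.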

3.2. Data division: Describe the process of how you divided your data into training, validation (if applicable), and test sets. Explain the rationale behind the data division, considering strategies like temporal validation or stratified sampling.
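One possible way to produce a stratified division is sketched below, assuming scikit-learn and a 60/20/20 train/validation/test ratio. Both the ratio and the synthetic data are illustrative assumptions, not requirements of the brief.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic features and an imbalanced binary target (about 20% positives), for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = (rng.random(1000) < 0.2).astype(int)

# Step 1: hold out 20% of the rows as the final test set, preserving class proportions.
X_trainval, X_test, y_trainval, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Step 2: split the remaining 80% into training and validation sets.
# 0.25 of the remaining 80% equals 20% of the full dataset.
X_train, X_val, y_train, y_val = train_test_split(
    X_trainval, y_trainval, test_size=0.25, stratify=y_trainval, random_state=42
)

for label, part in [("train", y_train), ("val", y_val), ("test", y_test)]:
    print(f"{label}: n={len(part)}, positive rate={part.mean():.3f}")
```

Stratifying keeps the positive rate nearly identical across the three sets. For time-ordered data, a chronological cutoff (or scikit-learn's `TimeSeriesSplit`) would be the analogous choice, since random shuffling would let the model train on rows that come after the ones it is tested on.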

Individual Component

The report must include a dedicated individual section for each group member. Each section should clearly state the member's Unikey to identify their individual contribution (THIS IS A UNIKEY: ABCD1234). DO NOT PROVIDE STUDENT ID OR STUDENT NAME TO IDENTIFY ANY OF THE SECTIONS.

1. Predictive model

Note: Each member must choose a different predictive modelling technique.

1.1. Model Description: Name and describe your technique, discuss the assumptions underlying this technique, and critically evaluate their validity in the context of the dataset. Highlight the strengths and limitations of the chosen technique and justify its suitability for the research question and dataset characteristics. Modelling techniques not covered in the tutorial sessions, such as neural networks (CNNs, LSTMs, RNNs, GANs, etc.) or bagging and gradient boosting techniques (GBM, XGBoost, LightGBM, CatBoost, AdaBoost, etc.), are preferred.

1.2. Model Algorithm: Provide a detailed explanation of the algorithm powering your chosen technique, including its underlying principles, such as (but not limited to) mathematical equations, hyperparameters, and potential variations. Using pseudocode or flowchart diagrams, provide the step-by-step execution of the algorithm. (You can type the pseudocode in Jupyter Notebook and put the screenshot of the pseudocode here. You cannot put the screenshot of the pseudocode in the appendix. If you do, it will not be marked.) If you choose to draw a flowchart, you can create it in any online tool or software and attach its screenshot here. You must put the screenshot of the flowchart diagram here, in the main report. If you put it in the appendix, it will not be marked.
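To illustrate the level of step-by-step detail this sub-section points at, here is a hedged sketch of one technique named above, gradient boosting, reduced to its simplest form: squared-error regression with shallow trees. The function names `fit_gradient_boosting` and `predict_gradient_boosting` are hypothetical, and library implementations such as XGBoost or LightGBM add regularisation, subsampling, and other refinements not shown here.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor


def fit_gradient_boosting(X, y, n_rounds=50, learning_rate=0.1, max_depth=2):
    """Fit squared-error gradient boosting; return the initial constant and the trees."""
    init = float(np.mean(y))          # Step 1: start from the best constant prediction
    pred = np.full(len(y), init)
    trees = []
    for _ in range(n_rounds):
        residuals = y - pred          # Step 2: negative gradient of 0.5 * (y - pred)^2
        tree = DecisionTreeRegressor(max_depth=max_depth, random_state=0)
        tree.fit(X, residuals)        # Step 3: fit a weak learner to the residuals
        pred += learning_rate * tree.predict(X)  # Step 4: take a small step along it
        trees.append(tree)
    return init, trees


def predict_gradient_boosting(X, init, trees, learning_rate=0.1):
    """Sum the initial constant and the shrunken contribution of every tree."""
    pred = np.full(X.shape[0], init)
    for tree in trees:
        pred += learning_rate * tree.predict(X)
    return pred


# Toy regression problem: a noisy sine wave (illustrative data only).
rng = np.random.default_rng(0)
X_demo = rng.uniform(-3, 3, size=(200, 1))
y_demo = np.sin(X_demo[:, 0]) + rng.normal(scale=0.1, size=200)

init, trees = fit_gradient_boosting(X_demo, y_demo)
baseline_mse = float(np.mean((y_demo - init) ** 2))
boosted_mse = float(np.mean((y_demo - predict_gradient_boosting(X_demo, init, trees)) ** 2))
print(f"baseline MSE: {baseline_mse:.3f}, boosted MSE: {boosted_mse:.3f}")
```

The four commented steps map directly onto the boxes of a flowchart or the lines of a pseudocode listing, with `n_rounds`, `learning_rate`, and `max_depth` as the hyperparameters to discuss.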

1.3. Model Development: Describe the process of building the predictive model, including advanced data preprocessing techniques such as feature scaling, dimensionality reduction (e.g., Principal Component Analysis), or feature engineering. Discuss the selection of model-specific functions and hyperparameters, providing theoretical justification and empirical validation. Also identify the Python functions and parameters you selected and explain what they mean.

Note: You don’t have to include the code in the report, as you will submit it separately.
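For the preprocessing steps mentioned in 1.3, one common arrangement is to chain scaling, PCA, and the model in a single scikit-learn `Pipeline`, so that the scaler and PCA are fitted on the training data only. The sketch below assumes scikit-learn, synthetic data, and a placeholder classifier; the 95% explained-variance threshold is an illustrative choice.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data standing in for the cleaned Stage 1 dataset (assumption).
X, y = make_classification(n_samples=500, n_features=20, n_informative=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

pipeline = Pipeline([
    ("scale", StandardScaler()),                 # zero mean, unit variance per feature
    ("pca", PCA(n_components=0.95)),             # keep enough components for 95% of variance
    ("clf", LogisticRegression(max_iter=1000)),  # placeholder model
])

# fit() learns the scaling statistics and PCA directions from the training rows only;
# predict() then applies those same fitted transforms to the test rows.
pipeline.fit(X_train, y_train)

n_kept = pipeline.named_steps["pca"].n_components_
test_f1 = f1_score(y_test, pipeline.predict(X_test))
print(f"components kept: {n_kept} of {X.shape[1]}, test F1: {test_f1:.3f}")
```

Keeping the transforms inside the pipeline avoids leaking test-set statistics into training, and it lets the whole chain be passed to cross-validation or a hyperparameter search as one estimator.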

2. Model Evaluation and Optimization

2.1. Model Evaluation: Perform a comprehensive evaluation of your model's performance using the agreed-upon measures of success. Interpret the results in the context of the research question and dataset characteristics, considering factors such as class imbalance, noise, and interpretability. Discuss the implications of the evaluation metrics and identify potential areas for improvement.

2.2. Model Optimisation: Explore advanced optimisation techniques to further enhance your model's performance, explaining your choices clearly. This may involve hyperparameter tuning using techniques like grid search.
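A grid search of the kind mentioned in 2.2 might look like the following sketch, which assumes scikit-learn's `GridSearchCV` with a gradient boosting classifier. The grid values, the F1 scoring choice, and the synthetic training data are all illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold

# Synthetic training data for illustration; in practice this would be the training split only.
X_train, y_train = make_classification(
    n_samples=300, n_features=8, weights=[0.8, 0.2], random_state=0
)

# Every combination in the grid is tried: 2 x 2 x 2 = 8 candidates.
param_grid = {
    "n_estimators": [50, 100],
    "learning_rate": [0.05, 0.1],
    "max_depth": [2, 3],
}

search = GridSearchCV(
    estimator=GradientBoostingClassifier(random_state=0),
    param_grid=param_grid,
    scoring="f1",  # optimise an agreed measure of success rather than accuracy
    cv=StratifiedKFold(n_splits=3, shuffle=True, random_state=0),
)
search.fit(X_train, y_train)

print("best parameters:", search.best_params_)
print(f"best cross-validated F1: {search.best_score_:.3f}")
```

Because the search scores each candidate by cross-validation on the training data, the test set stays untouched until the final evaluation. For larger grids, `RandomizedSearchCV` is a cheaper alternative from the same module.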

Group Component 2

Finally, a second group section at the end of the report should include:

1. Discussion: Engage in a critical discussion on the strengths and limitations of each modelling technique employed by group members. Compare and contrast the performance of various models quantitatively and qualitatively. Reflect on the broader implications of model selection for addressing the research question effectively.

2. Conclusion: Synthesize the findings from individual model evaluations and provide a recommendation on the most effective predictive model for answering the research question. Justify your recommendation based on empirical evidence, theoretical considerations, and domain knowledge. Propose potential avenues for future research, including data collection strategies, model refinement techniques, and interdisciplinary collaborations.
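For the quantitative side of the comparison, one hedged approach is to score every member's model on identical cross-validation folds with the agreed measures. The sketch below assumes scikit-learn; the three candidate models and the synthetic data are placeholders for whatever the group actually built.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_validate

# Synthetic data for illustration only.
X, y = make_classification(n_samples=400, n_features=10, weights=[0.8, 0.2], random_state=0)

# Placeholder models, one per hypothetical group member.
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "gradient_boosting": GradientBoostingClassifier(n_estimators=50, random_state=0),
}

# A fixed random_state makes every model see exactly the same folds.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scoring = ["f1", "roc_auc"]

comparison = {}
for name, model in candidates.items():
    scores = cross_validate(model, X, y, cv=cv, scoring=scoring)
    comparison[name] = {metric: float(scores[f"test_{metric}"].mean()) for metric in scoring}

for name, row in comparison.items():
    print(f"{name:20s} f1={row['f1']:.3f} roc_auc={row['roc_auc']:.3f}")
```

Sharing the folds and metrics means differences in the table reflect the models rather than the data split, which gives the recommendation in the Conclusion a like-for-like empirical basis.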

