The University of Sydney Page 1 
QBUS6860 
Visual Data Analytics 
Weekly Assignment 6 
Dr Demetris Christodoulou 
Discipline of Accounting 
MEAFA Research Group 
http://sydney.edu.au/business/research/meafa 
o The dataset sailor_performance.xlsx, provided on Canvas > 
Datasets used in lectures and assignments, contains data on the 
performance of young sailors from a sailing club. This is a real 
dataset, but for confidentiality reasons we protect the identity of 
the club and the sailors. All sailor names are therefore fictional. 
o The dataset holds observations for a specific sailing class where 
young men compete only against other young men, and young women 
compete only against other young women. That is, competition is 
separated by gender. 
o Sailing performance is measured by the variable rank. This is the 
target variable, and the sailing club wants to understand how it is 
determined. We need to help the sailing club discover any 
potential determinants. The next slide gives more information about 
how rank is specified. 
o Performance is measured using variable rank. This is the standing  
rank of a sailor at the end of every given month. This rank indicates  
the national rank for this class of young sailors across clubs. 
o The lower the rank, the better the performance, i.e. the 
no. 1 sailor is the best sailor in this class. Rank is therefore a 
relative measure of performance across all competing sailors. 
o The rank can only be determined at the completion of a race and it  
remains unchanged between races. Note that a race usually lasts  
many days.  
o Females compete in female-only races and males compete in 
male-only races. This means that two sailors of the same gender 
cannot share the same rank in any given month, whereas a female 
and a male sailor can. 
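The same-gender rule above can be turned into a simple data check before any graph is built. The following is a minimal sketch in Python with pandas, not part of the required Tableau workflow. The toy rows, the "F"/"M" gender coding and the helper name shared_ranks are assumptions made for illustration; only the column names come from the brief.

```python
import pandas as pd

# Hypothetical toy rows using the column names described in the brief.
# The real data would be read from sailor_performance.xlsx instead.
toy = pd.DataFrame({
    "name":   ["A", "B", "C", "D", "E"],
    "gender": ["F", "F", "M", "M", "M"],   # assumed coding
    "year":   [2020] * 5,
    "month":  [1] * 5,
    "rank":   [1, 2, 1, 3, 3],             # D and E share rank 3 on purpose
})

def shared_ranks(df):
    """Return every row whose rank is held by more than one sailor
    of the same gender in the same year and month."""
    keys = ["gender", "year", "month", "rank"]
    return df[df.duplicated(subset=keys, keep=False)]

violations = shared_ranks(toy)
print(violations)
```

In this toy data only sailors D and E are flagged. Sailors A and C both hold rank 1 but are of different gender, so they are not a violation.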
o In addition to rank, we are given the following information that 
could help us investigate what drives performance: 
– name: the name and information about the sailor 
– year: the calendar year when observations were made.  
– month: the month of the year when observations were made. 
– date_joined: the date that the sailor joined the club.  
– gender: the gender of the sailor.  
– training_days: the number of training days that the sailor had  
for each month. 
– race_days: the number of competitive race days for each  
month.  
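Some of these variables become more useful once combined. As a hedged illustration, the sketch below (Python with pandas) merges year and month into one monthly date and derives how long a sailor has been with the club from date_joined. It assumes that month is stored as a number from 1 to 12 and that date_joined is a proper date; the toy rows and the new column names period and tenure_months are hypothetical, and the same derivations could be written as Tableau calculated fields.

```python
import pandas as pd

# Hypothetical toy rows; only the column names come from the brief.
toy = pd.DataFrame({
    "name":  ["A", "A", "B"],
    "year":  [2020, 2020, 2020],
    "month": [1, 2, 2],                      # assumed numeric, 1 to 12
    "date_joined": pd.to_datetime(
        ["2019-11-15", "2019-11-15", "2020-02-01"]
    ),
})

# Combine year and month into one date (first day of the month).
toy["period"] = pd.to_datetime(toy[["year", "month"]].assign(day=1))

# Whole calendar months between joining the club and the observed month.
toy["tenure_months"] = (
    (toy["period"].dt.year - toy["date_joined"].dt.year) * 12
    + (toy["period"].dt.month - toy["date_joined"].dt.month)
)

print(toy[["name", "period", "tenure_months"]])
```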
o The graph objective is concerned with the analysis of 
“Drivers of sailor performance”. The sailing club wants to know what 
drives individual performance in terms of sailor rank, given the 
data provided. 
o We do not know the answer to this question in advance, and we 
suspect that any of the provided data may help explain 
performance. 
o You are required to adopt an exploratory approach in order to 
discover what drives sailor performance. You are required to 
produce a suitable EDA and appropriate data graphs that would be 
presented to the sailing club to help it understand what drives 
sailor performance. 
o There is no need to do extra research on this topic, and you do not 
need to describe the data generating process. The data has been 
recorded by hand by the sailing coach at the end of each month 
using the spreadsheet provided. 
o However, it is important that you validate the data properties. 
o You are required to analyse the graph objective with Tableau 
using 2 data graphs.  
o These graphs do not need to be interactive, but do use any form of 
interactivity if it helps you encode the data. That is, I will 
accept both interactive and non-interactive graphs. 
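Because the data was recorded by hand, validating its properties usually means counting a few kinds of possible slips before graphing. The sketch below (Python with pandas) shows one way such checks might look. The toy rows and the function name validate are hypothetical. The rule that training_days plus race_days should not exceed the length of the month is an assumption, since a training day and a race day might coincide in practice.

```python
import calendar
import pandas as pd

# Hypothetical toy rows with deliberate recording errors.
toy = pd.DataFrame({
    "year":          [2020, 2020, 2020],
    "month":         [1, 2, 13],        # 13 is not a valid month
    "training_days": [10, 25, 5],
    "race_days":     [4, 10, None],     # one missing value
})

def validate(df):
    """Count simple data-entry problems; returns a dict of issue counts."""
    issues = {}
    issues["missing_values"] = int(df.isna().sum().sum())

    valid_month = df["month"].between(1, 12)
    issues["month_out_of_range"] = int((~valid_month).sum())

    days = df[["training_days", "race_days"]]
    issues["negative_days"] = int((days < 0).any(axis=1).sum())

    # Assumed rule: training and race days together fit within the month.
    checked = df[valid_month]
    month_len = pd.Series(
        [calendar.monthrange(int(y), int(m))[1]
         for y, m in zip(checked["year"], checked["month"])],
        index=checked.index, dtype="float64",
    )
    total = checked["training_days"].fillna(0) + checked["race_days"].fillna(0)
    issues["more_days_than_month"] = int((total > month_len).sum())
    return issues

print(validate(toy))
```

Any non-zero count is a prompt to inspect the affected rows, not proof of an error.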
o You will be evaluated on how successfully you work with this 
dataset, and on your ability to apply basic EDA methods to learn 
important insights about this data. 
Hint: not every EDA tool is useful with this data. It is important that  
you first spend sufficient time to understand the data properties  
before you start analysing the data using EDA.  
o You will be evaluated on the quality of the data report and data 
management, the correct choice of visual implantations and retinal 
variables, the appropriate application of graph identification and 
graph enhancement tools, and the decoding discussion. 
o You are required to submit on Canvas a Word or PDF report, the 
Tableau .twbx file and the Excel file of the managed data.