University of Aberdeen 
School of Natural and Computing Sciences 
Department of Computing Science 
2024 – 2025 
Programming assignment – Individually Assessed (no teamwork) 
Title: JC4001 – Distributed Systems
Note: This assignment accounts for 30% of your total mark for the course.
Learning Outcomes 
On successful completion of this component a student will have demonstrated the ability to:
• Understand the principles of federated learning (FL) in distributed systems and how it 
differs from centralized machine learning. 
• Implement a basic federated learning system in a distributed setting for image classification on the MNIST dataset.
• Simulate a federated learning environment in distributed systems where multiple clients 
independently train models and the server aggregates them. 
• Explore the effects of model aggregation and compare with centralized training. 
• Evaluate the performance of the FL model under different conditions, such as non-IID data distributions and a varying number of clients.
Information on Plagiarism and Collusion: The source code and your report may be submitted for a plagiarism check. Please refer to the slides available at MyAberdeen for more information about avoiding plagiarism before you start working on the assessment. The use of large language models, such as ChatGPT, for writing the code or the report can also be considered plagiarism. In addition, submitting work similar to another student's can be considered collusion. Also read the following information provided by the university:
https://www.abdn.ac.uk/sls/online-resources/avoiding-plagiarism/
Introduction 
In this assignment, your task is to build a federated learning (FL) algorithm in a distributed system. FL is a distributed approach to training machine learning models, designed to preserve local data privacy by training models without a centralized dataset. As shown in Fig. 1, the FL structure consists of two parts. The first part is an edge server responsible for model aggregation. The second part is a set of devices, each holding a local dataset used to update its own local model. Each device then transmits its updated local model to the edge server, which aggregates the local models into a global model.
 
 
Figure 1. Illustration of the FL structure. 
 
 
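For orientation only, the sketch below shows one possible way to map the structure in Fig. 1 onto Python code: devices hold private local data and produce local model updates, and an edge server aggregates them once per communication round. All class and method names here are illustrative assumptions rather than requirements, and the training and aggregation details are deliberately left as placeholders; Parts 2 and 3 ask you to fill them in.

# Illustrative skeleton only; names are assumptions, not requirements.
class Device:
    def __init__(self, local_dataset):
        self.local_dataset = local_dataset            # private data, never sent to the server

    def local_update(self, global_model):
        updated_model = global_model                  # placeholder: train a copy on self.local_dataset
        return updated_model, len(self.local_dataset)

class EdgeServer:
    def aggregate(self, updates):
        models, sizes = zip(*updates)                 # placeholder: weighted averaging goes here (Part 3)
        return models[0]

def communication_round(server, devices, global_model):
    # Each device updates locally and transmits only its model, not its data.
    updates = [device.local_update(global_model) for device in devices]
    return server.aggregate(updates)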
General Guidance and Requirements 
Your assignment code and report must conform to the requirements given below and include the required content outlined in each section. You must supply a written report, along with the corresponding code, covering all of the distinct sections/subtasks and providing a full, critical, and reflective account of the work undertaken.
This assignment can be done in Python/PyCharm on your own device. If you work on your own device, 
then be sure to move your files to MyAberdeen regularly, so that we can run the application and 
mark it. 
Note that it is your responsibility to ensure that your code runs on Python/PyCharm. By default, 
your code should run by directly clicking the “run” button. If your implementation uses some other 
command to start the code, it must be mentioned in the report. 
Submission Guideline. After you finish your assignment, please compress all your files into a single archive and submit it in MyAberdeen (Content -> Assignment Submit -> View Instructions -> Submission (Drag and drop files here)).
Part 1: Understanding Federated Learning [5 points] 
1. Read the Research Paper: You should read a foundational paper on federated learning, such 
as Communication-Efficient Learning of Deep Networks from Decentralized Data by 
McMahan et al. (2017). 
2. Summary Task: Write a 500-word summary explaining the key components of federated 
learning (client-server architecture, data privacy, and challenges like non-IID data). [5 points] 
 
 
Part 2: Centralized Learning Baseline [15 points] 
1. Implement Centralized Training: You should implement a simple neural network, trained with a centralized approach, to classify the digits in the MNIST dataset. This will serve as a baseline (an illustrative sketch follows this list).
o Input: MNIST dataset. [5 points] 
o Model: A basic neural network with several hidden layers. [5 points] 
o Task: Train the model and evaluate its accuracy. [5 points] 
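For illustration only, a minimal centralized baseline might look like the following sketch. It assumes PyTorch and torchvision are installed (the assignment does not prescribe a framework), and the architecture and hyperparameters are arbitrary choices, not requirements.

import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Simple fully connected network for 28x28 MNIST digits.
class MLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),
            nn.Linear(28 * 28, 128), nn.ReLU(),
            nn.Linear(128, 64), nn.ReLU(),
            nn.Linear(64, 10),
        )

    def forward(self, x):
        return self.net(x)

def main():
    transform = transforms.ToTensor()
    train_set = datasets.MNIST("data", train=True, download=True, transform=transform)
    test_set = datasets.MNIST("data", train=False, download=True, transform=transform)
    train_loader = DataLoader(train_set, batch_size=64, shuffle=True)
    test_loader = DataLoader(test_set, batch_size=256)

    model = MLP()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    loss_fn = nn.CrossEntropyLoss()

    for epoch in range(5):  # a handful of epochs is enough for a baseline
        model.train()
        for images, labels in train_loader:
            optimizer.zero_grad()
            loss = loss_fn(model(images), labels)
            loss.backward()
            optimizer.step()

    # Evaluate accuracy on the held-out test set.
    model.eval()
    correct = 0
    with torch.no_grad():
        for images, labels in test_loader:
            correct += (model(images).argmax(dim=1) == labels).sum().item()
    print(f"Test accuracy: {correct / len(test_set):.4f}")

if __name__ == "__main__":
    main()

The final test accuracy reported by this baseline is the reference point used again in Part 5.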
 
Part 3: Federated Learning Implementation [30 points] 
1. Simulate Clients: Split the MNIST dataset into several partitions to represent data stored locally at different clients. Implement a Python class that simulates clients, each holding a subset of the data (an illustrative sketch covering this part's subtasks follows the list). [10 points]
o Task: Implement a function to partition the data in both IID (independent and 
identically distributed) and non-IID ways. 
2. Model Training on Clients: Modify the centralized neural network code so that each client 
trains its model independently using its local data. [5 points] 
3. Server-Side Aggregation: Implement a simple parameter server that aggregates model 
updates sent by clients. Use the Federated Averaging (FedAvg) algorithm: [10 points] 
o Each client sends its model parameters to the server after training on local data. 
o The server aggregates these parameters (weighted by the number of samples each 
client has) and updates the global model. 
4. Communication Rounds: Implement a loop where clients train their local models and the 
server aggregates them over multiple communication rounds. [5 points] 
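The following sketch illustrates one way the subtasks above could fit together. It again assumes PyTorch and reuses the kind of model and MNIST dataset objects from the Part 2 sketch; the shard-based non-IID split, the function names, and the hyperparameters are illustrative assumptions, not a prescribed implementation. The aggregation step implements the FedAvg rule described in point 3: the global weights are the sample-count-weighted average of the client weights, i.e. w = sum over clients k of (n_k / n) * w_k.

import copy
import random
import torch
from torch.utils.data import DataLoader, Subset

def partition_mnist(dataset, num_clients, iid=True):
    # Returns a list of index lists, one per client.
    if iid:
        indices = list(range(len(dataset)))
        random.shuffle(indices)
        return [indices[i::num_clients] for i in range(num_clients)]
    # Non-IID: sort by label and hand each client two label-homogeneous shards,
    # so every client sees only a few digit classes.
    by_label = sorted(range(len(dataset)), key=lambda i: int(dataset.targets[i]))
    shard_size = len(dataset) // (2 * num_clients)
    shards = [by_label[i * shard_size:(i + 1) * shard_size] for i in range(2 * num_clients)]
    random.shuffle(shards)
    return [shards[2 * i] + shards[2 * i + 1] for i in range(num_clients)]

def local_train(model, dataset, indices, epochs=1, lr=0.1):
    # Client-side update: train a copy of the global model on local data only.
    local = copy.deepcopy(model)
    loader = DataLoader(Subset(dataset, indices), batch_size=32, shuffle=True)
    opt = torch.optim.SGD(local.parameters(), lr=lr)
    loss_fn = torch.nn.CrossEntropyLoss()
    local.train()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss_fn(local(x), y).backward()
            opt.step()
    return local.state_dict(), len(indices)

def fedavg(updates):
    # FedAvg: weighted average of client state_dicts, w = sum_k (n_k / n) * w_k.
    total = sum(n for _, n in updates)
    avg = copy.deepcopy(updates[0][0])
    for key in avg:
        avg[key] = sum(sd[key].float() * (n / total) for sd, n in updates)
    return avg

def run_federated(global_model, dataset, num_clients=10, rounds=20, iid=True):
    # Communication rounds: every client trains locally, the server aggregates.
    parts = partition_mnist(dataset, num_clients, iid)
    for r in range(rounds):
        updates = [local_train(global_model, dataset, idx) for idx in parts]
        global_model.load_state_dict(fedavg(updates))
    return global_model

A run would then construct the Part 2 model and the MNIST training set and call run_federated with the desired number of clients, rounds, and iid flag; evaluating the global model on the test set after each round gives the curves needed for Part 4.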
 
Part 4: Experimentation and Analysis [20 points]
1. Experiment 1 - Impact of Number of Clients: [10 points] 
o Vary the number of clients (e.g., 5, 10, 20) and evaluate the accuracy of the final 
federated model. 
o Plot the training accuracy and loss over communication rounds for each case (a minimal plotting sketch is given after this list).
2. Experiment 2 - Non-IID Data: [10 points] 
o Modify the data distribution across clients to simulate a non-IID scenario (where 
clients have biased or skewed subsets of the data). 
o Compare the performance of the federated learning model when clients have IID 
data vs. non-IID data. Plot the accuracy and loss over communication rounds for 
both cases. 
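A minimal way to produce the requested plots, assuming matplotlib is installed and that you record the global model's test accuracy (or loss) after every communication round into one list per configuration, might be the hypothetical helper below.

import matplotlib.pyplot as plt

def plot_metric(histories, ylabel, filename):
    # histories: {"5 clients": [value_round1, value_round2, ...], "10 clients": [...], ...}
    for label, values in histories.items():
        plt.plot(range(1, len(values) + 1), values, label=label)
    plt.xlabel("Communication round")
    plt.ylabel(ylabel)
    plt.legend()
    plt.savefig(filename)
    plt.close()

The same helper can be reused for Experiment 2 by passing the IID and non-IID histories as the two entries of the dictionary.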
Part 5: Performance Comparison with Centralized Learning [5 points] 
• Compare the federated learning model (both IID and non-IID) to the centralized learning 
baseline in terms of: 
o Final accuracy 
o Number of epochs/communication rounds needed to converge 
 
Requirements and Marking Criteria for the Project Report [25 points] 
You should write a report. Your report should describe the overall design of your federated learning system in a distributed setting, as well as the challenges faced while implementing federated learning. The marking criteria for the report are as follows:
• Structure and completeness (all the aspects are covered) [5 points]. 
• Clarity and readability (the language is understandable) [5 points]. 
• Design explained [5 points]. 
• Challenges discussed [5 points]. 
• References to the sources [5 points]. 
 
Submission 
You should submit the code and the report in MyAberdeen, using the Assignment Submit link in MyAberdeen for the coursework assignment. The deadline is 22 December 2024. Please do not submit later than the deadline.