Distributed Training With Local Updates

Distributed machine learning allows models to be trained on decentralised data residing on various devices, such as mobile phones or IoT devices. However, these edge devices usually have limited communication bandwidth for transferring the initial global model and the local gradient updates. Limited bandwidth is one of the major bottlenecks that hinder applying federated learning (FL) in practice. We would like to leverage the permutation invariance of neural networks to let these devices start learning from locally initialised models, rather than from a single global model, and to enable partial parameter updates. This approach exploits update locality and should considerably reduce bandwidth usage. The goal of the thesis is to implement the approach and compare it with state-of-the-art distributed machine learning implementations.
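As a rough illustration of the two ideas above, the sketch below (PyTorch assumed) shows that permuting the hidden units of a small MLP leaves its outputs unchanged, and how a client could transmit only the largest-magnitude parameter changes as a partial update. The helpers `permute_hidden` and `top_k_delta` are hypothetical names introduced here for illustration, not part of any existing library or of the thesis code.

    import torch
    import torch.nn as nn

    torch.manual_seed(0)

    class MLP(nn.Module):
        def __init__(self, d_in=8, d_hidden=16, d_out=4):
            super().__init__()
            self.fc1 = nn.Linear(d_in, d_hidden)
            self.fc2 = nn.Linear(d_hidden, d_out)

        def forward(self, x):
            return self.fc2(torch.relu(self.fc1(x)))

    def permute_hidden(model: MLP, perm: torch.Tensor) -> None:
        """Permute hidden units in place; the network's function is unchanged."""
        model.fc1.weight.data = model.fc1.weight.data[perm]
        model.fc1.bias.data = model.fc1.bias.data[perm]
        model.fc2.weight.data = model.fc2.weight.data[:, perm]

    def top_k_delta(old: dict, new: dict, k: int) -> dict:
        """Keep only the k largest-magnitude parameter changes (partial update)."""
        delta = {n: new[n] - old[n] for n in old}
        flat = torch.cat([d.flatten().abs() for d in delta.values()])
        threshold = flat.topk(k).values.min()
        return {n: torch.where(d.abs() >= threshold, d, torch.zeros_like(d))
                for n, d in delta.items()}

    # Permutation invariance: reordering hidden units does not change outputs.
    model = MLP()
    x = torch.randn(5, 8)
    y_before = model(x)
    permute_hidden(model, torch.randperm(16))
    y_after = model(x)
    print((y_before - y_after).abs().max().item())  # ~0, up to float error

    # Partial update: keep only the 32 largest parameter changes after a
    # (stand-in) local training step, instead of sending the full state dict.
    old_state = {n: p.detach().clone() for n, p in model.named_parameters()}
    with torch.no_grad():
        for p in model.parameters():
            p.add_(0.01 * torch.randn_like(p))  # stand-in for local training
    new_state = {n: p.detach().clone() for n, p in model.named_parameters()}
    sparse_update = top_k_delta(old_state, new_state, k=32)

In a real setting, permutation alignment would be used to match independently initialised client models before aggregation, and the sparse deltas would be what clients actually communicate.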


Student Target Groups:

  • Students in ICE
  • Students in Computer Science

Thesis Type:

  • Master Thesis / Master Project

Goal and Tasks:

  • Literature review on distributed and federated training, on training and aggregating models from different initializations, and on partial model updates
  • Implement distributed model training with provided optimizations
  • Compare the obtained performance to vanilla baselines (the code will be available) and to non-distributed training
  • Report obtained results in a written report and an oral presentation

Recommended Prior Knowledge:

  • Good knowledge of deep neural networks and interest in optimization and distributed training
  • Programming skills in Python
  • Prior experience with deep learning frameworks is desirable (preferably PyTorch)

Start:

  • a.s.a.p.

Contact: