We study attacks and defenses for distributed machine learning systems
Data is essential to training high-quality machine learning (ML) models. Federated learning (FL) is a distributed ML approach that enables collaborative training of models on data distributed across personal devices, without disclosing device-local training data. Unfortunately, FL's design is susceptible to security and privacy attacks. Client nodes play a more central role in the computation than in traditional distributed ML: each client maintains its own local data and computes model updates on it, rather than merely computing gradients on data sent out by the server. This autonomy allows client nodes to mount a variety of attacks.
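As a concrete illustration, the sketch below shows one round of a federated-averaging-style protocol: each client trains on its own local data, and the server only ever sees model updates. The function names and the toy linear model are hypothetical simplifications; real FL deployments add client sampling, secure aggregation, and failure handling.

```python
# Minimal sketch of federated averaging (hypothetical names; the "model"
# is linear regression so the example stays self-contained and runnable).
import numpy as np

def local_update(global_weights, local_data, lr=0.1):
    """Client-side step: compute a model update on device-local data only."""
    X, y = local_data
    preds = X @ global_weights
    grad = X.T @ (preds - y) / len(y)   # mean-squared-error gradient
    return global_weights - lr * grad   # updated local weights

def fed_avg(client_updates, client_sizes):
    """Server-side step: average client weights, weighted by data size.
    Raw training data never leaves the clients."""
    total = sum(client_sizes)
    return sum(w * (n / total) for w, n in zip(client_updates, client_sizes))

rng = np.random.default_rng(0)
global_w = np.zeros(3)
clients = [(rng.normal(size=(20, 3)), rng.normal(size=20)) for _ in range(5)]

for _ in range(10):  # training rounds
    updates = [local_update(global_w, d) for d in clients]
    global_w = fed_avg(updates, [len(d[1]) for d in clients])
```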
We have been considering systems that provide alternatives to FL, as well as attacks and defenses for FL-based systems. For example, we previously created FoolsGold, a defense against model poisoning attacks. In a poisoning attack, the attacker includes bad data in the client training set, which causes the trained model to make wrong predictions on certain input classes. In our ongoing work we are considering how an attacker may use the clients they control to hurt the training process more generally. For example, clients that repeatedly connect and disconnect may lead to poor model quality (known as malicious node churn). We also study how these strategies compromise existing FL defenses against security and privacy attacks.
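To make the poisoning threat concrete, here is a minimal sketch of a label-flipping attack that a malicious client could mount on its own local training set before computing updates. The class IDs and function name are hypothetical, and this is an illustration of the attack, not our FoolsGold code (which counters such attacks on the server side by examining the similarity of client updates).

```python
# Hypothetical label-flipping poisoner: relabels every local example of a
# source class as a target class, so the shared model learns to
# misclassify the source class. Class IDs here are placeholders.
import numpy as np

def poison_labels(y, source_class=1, target_class=7):
    """Return a copy of the label vector with source_class flipped."""
    y_poisoned = y.copy()
    y_poisoned[y == source_class] = target_class
    return y_poisoned

y = np.array([0, 1, 1, 2, 1, 0])
print(poison_labels(y))   # -> [0 7 7 2 7 0]
```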
The Systopia lab is supported by a number of government and industrial sources, including Cisco Systems, the Communications Security Establishment Canada, Intel Research, the Natural Sciences and Engineering Research Council of Canada (NSERC), Network Appliance, the Office of the Privacy Commissioner of Canada, and the National Science Foundation (NSF).