We conduct research on a variety of topics, including operating systems, distributed systems, security, data provenance, program analysis, and much more!
We use computational caching as well as novel data structures and algorithms to produce provably optimal solutions to real-world NP-hard problems. Our focus is on interpretable, certifiably optimal models.
View Interpretable Machine Learning Projects
Rule lists are easily interpretable models for classification tasks: ordered sequences of if-then rules. We develop custom discrete optimization techniques to learn these interpretable models while remaining competitive with common black-box approaches.
We create optimal decision trees for classification tasks. We optimize an objective that includes a penalty on the number of leaves in a tree to favor models that are sparse and interpretable. Sparse decision trees function as an interpretable model with greater flexibility and power than rule lists (the latter being a special case of the former).
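The leaf-penalized objective described for sparse decision trees can be illustrated with a minimal sketch. The exact objective used in these projects is not stated here; the form below (misclassification rate plus a per-leaf penalty) and the names `sparse_tree_objective` and `leaf_penalty` are assumptions chosen for illustration.

```python
from typing import Sequence


def sparse_tree_objective(
    y_true: Sequence[int],
    y_pred: Sequence[int],
    num_leaves: int,
    leaf_penalty: float,
) -> float:
    """Return misclassification rate plus leaf_penalty * num_leaves.

    Assumed form of a sparsity-regularized objective: lower is better,
    and every extra leaf must "pay for itself" in reduced error.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    errors = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    return errors / len(y_true) + leaf_penalty * num_leaves


# Hypothetical predictions from two candidate trees on four samples.
labels = [1, 0, 1, 1]
stump_predictions = [1, 0, 0, 1]   # 2 leaves, one mistake
deeper_predictions = [1, 0, 1, 1]  # 4 leaves, no mistakes

stump_score = sparse_tree_objective(labels, stump_predictions, 2, 0.15)
deeper_score = sparse_tree_objective(labels, deeper_predictions, 4, 0.15)
```

With a penalty of 0.15 per leaf, the two-leaf tree scores lower than the four-leaf tree even though it makes one more mistake, which is how a penalty on leaves favors sparse models.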
We investigate new design principles in datacenter system design and operation. Our work covers broad areas such as Network Function chain allocation and acceleration, disaggregated datacenters, and using programmable devices to accelerate data analytics workloads.
View Networked and Distributed Systems Projects
Datacenter scheduling of Spot virtual machines
Datacenter scheduling of virtual machines with network guarantees
We use dataplane programmable switches to improve goodput in Network Function chain deployments.
Systems that handle users' data must consider the privacy of that data. We build systems that provide privacy guarantees in the face of various types of threats and also study individual privacy expectations using mixed-methods approaches.
View Privacy Projects
This research looks at data disclosures due to side channels in cloud services.
Who do we want to share our information with and why? This project looks at the emotional and social dynamics of privacy and trust in intimate relationships.
We are designing a smart cache that uses synthetic data to allow users to issue more differentially private queries before depleting their privacy budget.
We study transaction source privacy and transaction content privacy in blockchain systems.
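To make the idea of a privacy budget in the smart cache project above concrete, here is a minimal sketch. It does not reproduce the synthetic-data cache described above: it swaps in a much simpler exact-match cache of noisy answers, and the class `BudgetedQueryCache` and its methods are hypothetical. It only shows why serving a repeated query from a cache spends no additional budget.

```python
import random


class BudgetedQueryCache:
    """Answers noisy counting queries under a fixed epsilon budget.

    Illustrative assumption: a repeated (threshold, epsilon) query is
    served from the cache, so it consumes no additional budget.
    """

    def __init__(self, data, total_epsilon, seed=None):
        self._data = list(data)
        self.remaining_epsilon = float(total_epsilon)
        self._cache = {}
        self._rng = random.Random(seed)

    def _laplace(self, scale):
        # The difference of two i.i.d. exponentials is Laplace-distributed.
        rate = 1.0 / scale
        return self._rng.expovariate(rate) - self._rng.expovariate(rate)

    def count_at_least(self, threshold, epsilon):
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        key = (threshold, epsilon)
        if key in self._cache:
            return self._cache[key]  # reuse: no new budget spent
        if epsilon > self.remaining_epsilon + 1e-12:
            raise RuntimeError("privacy budget exhausted")
        true_count = sum(1 for x in self._data if x >= threshold)
        # A counting query has sensitivity 1, so the noise scale is 1/epsilon.
        noisy = true_count + self._laplace(1.0 / epsilon)
        self.remaining_epsilon -= epsilon
        self._cache[key] = noisy
        return noisy


cache = BudgetedQueryCache([1, 5, 9, 12], total_epsilon=1.0, seed=0)
first = cache.count_at_least(5, 0.5)
repeat = cache.count_at_least(5, 0.5)  # served from the cache
```

After these two calls only 0.5 of the budget has been spent, because the second call reuses the stored noisy answer instead of querying the data again.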
We investigate techniques and applications that make data more valuable, from capturing data provenance (a formal history of how data came to be in its present form) to deriving provenance to developing applications that use data provenance. We build tools to facilitate scientific reproducibility and we investigate ways to build systems that are more accountable to the people who develop and use them.
View Provenance Projects
We offer visual insight into the anomalies triggered by a detection system to assist in performing root-cause analysis.
We develop systems that facilitate computational reproducibility for scientific and ML workflows.
We explore the use of system provenance for security.
We research, develop, evaluate, and deploy tools and systems designed to ensure that system/network security missions can be accomplished successfully despite cyber attacks. We also develop advanced algorithms and techniques for processing big datasets from a range of sources, such as IoT devices and network traffic.
View Security Projects
We offer techniques for building intrusion detection systems for Cyber-Physical Systems (CPSes).
This research looks at data disclosures due to side channels in cloud services.
We offer visual insight into the anomalies triggered by a detection system to assist in performing root-cause analysis.
We study attacks and defenses for distributed machine learning systems.
We explore the use of system provenance for security.
We investigate the use of machine learning techniques to design systems that can self-optimize.
View Self-optimizing Systems Projects
Tuneful is an extension for Spark that optimizes workload configurations starting from a zero-knowledge setting. The more workloads a cluster executes, the better it becomes at executing them. To achieve this, we leverage multi-task Gaussian processes, similarity analysis, and significance analysis.
Auto-tuning complex system configurations for high performance
Complex software systems are difficult to design, implement, tune, debug, and understand. We are working on techniques to address these challenges.
View Software Engineering for Systems Projects
We compile Go-based distributed systems from specifications written in a variant of PlusCal.
We investigate emerging storage technology such as non-volatile RAM and ultra-dense hard disk drives to match technology characteristics with file system capabilities. We also explore new approaches to namespace management, data organization, and data analytics.
View Storage Projects
Zoned storage devices, such as flash drives and shingled magnetic disks, are divided into units called zones, in which all writes must be sequential. We develop techniques to improve I/O performance on these devices.
We are redesigning file system implementation structures to enable synthesis of file system components and automated file system assembly.
We develop systems and algorithms for large scale graph processing and explore novel applications for graph-structured data.
We benchmark Intel Optane Persistent Memory to understand file allocation policies that optimize performance for applications that use persistent memory.
Traditionally, file systems have been implemented in kernel mode for efficiency and security, albeit at substantially higher development complexity. User mode file systems have existed for decades and suffer from lower efficiency. Why? How can we improve them?
The hierarchical filesystem namespace originated in the 1950s and was modeled after filing cabinets. While it has served us adequately for decades, we explore alternative namespace paradigms that could make it easier to find and present information to users.
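The sequential-write constraint of zoned storage devices mentioned above can be sketched with a toy in-memory model. The `Zone` class, its method names, and its block-granularity offsets are illustrative assumptions, not a real device interface.

```python
class Zone:
    """Toy model of one zone: writes are only accepted at the write pointer."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.write_pointer = 0
        self._blocks = [None] * capacity

    def write(self, offset, block):
        # A zone rejects any write that does not land exactly at the pointer.
        if offset != self.write_pointer:
            raise ValueError("non-sequential write")
        if self.write_pointer >= self.capacity:
            raise ValueError("zone is full")
        self._blocks[offset] = block
        self.write_pointer += 1

    def read(self, offset):
        # Only blocks below the write pointer hold valid data.
        if not 0 <= offset < self.write_pointer:
            raise ValueError("offset has not been written")
        return self._blocks[offset]

    def reset(self):
        # Overwriting in place is impossible; the whole zone is reset instead.
        self.write_pointer = 0
        self._blocks = [None] * self.capacity


zone = Zone(capacity=2)
zone.write(0, b"a")
zone.write(1, b"b")
```

In this model, updating block 0 requires resetting the zone and rewriting it from the start, which is the kind of cost that I/O techniques for zoned devices have to work around.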
We investigate novel techniques for specifying system components and tractably synthesizing provably correct implementations.
View Synthesizing System Software Projects
Shellac synthesizes compiler rules from specification to implementation languages.
Grand unified theory of isolation mechanisms in an operating system.
Tinkertoy is a set of modular operating system components from which one can assemble a custom IoT system.
We are redesigning file system implementation structures to enable synthesis of file system components and automated file system assembly.
COMET synthesizes reactive hardware and software from UNITY protocol specifications.
We develop systems, machine learning models, and algorithms for different tasks on large-scale graphs and explore novel applications for graph-structured data.
View Graph-structured Data Projects
Systopia lab is supported by a number of government and industrial sources, including Cisco Systems, the Communications Security Establishment Canada, Intel Research, the Natural Sciences and Engineering Research Council of Canada (NSERC), Network Appliance, the Office of the Privacy Commissioner of Canada, and the National Science Foundation (NSF).