eScholarship
Open Access Publications from the University of California

UC Riverside Electronic Theses and Dissertations

Towards Reliable Learning Systems: Efficient, Secure, and Generalizable Generative Models

No data is associated with this publication.
Creative Commons 'BY-NC' version 4.0 license
Abstract

Human beings inherently tend to learn skills that generalize well across different environments. The ability to infer and interpret one's surroundings efficiently, with proper reasoning, distinguishes humans from other living beings. Moreover, humans can handle scenarios in which they are presented with incorrect information. Real-world machine learning models should likewise be able to learn from the underlying data distribution, utilize information under changing conditions, and remain robust against adversaries attempting to manipulate their decisions, all while being compute-efficient. This thesis principally focuses on understanding how robust features can be extracted from underlying data distributions to train models better, on exploring the extent to which model decision-making is brittle, and on reducing the design cost of multi-task models so that they can be used in diverse scenarios with optimal performance. The first work addresses the problem of generating videos from latent noise vectors, without any reference input frames. We developed a method that jointly optimizes the input latent space, the weights of a recurrent neural network, and a generator through non-adversarial learning. The second work examines cross-domain unsupervised video anomaly detection, where no target-domain training data are available. The goal is to give end-users a system that works "out of the box," avoiding laborious model tuning. In the third work, we leveraged the open-source pre-trained vision-language model CLIP (Contrastive Language-Image Pre-training) to craft adversarial attacks on multi-object scenes. The motivation is to exploit the semantics encoded in the language space along with the visual space.
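The joint latent-and-weights optimization in the first work can be illustrated with a toy sketch. Everything here is an illustrative assumption rather than the thesis implementation: a plain linear generator stands in for the RNN-plus-generator pipeline, and a per-video latent code and the generator weights are updated together by gradient descent on a reconstruction loss, with no discriminator involved.

```python
# Toy sketch (hypothetical) of non-adversarial joint optimization:
# a latent code z and generator weights W are trained together to
# reconstruct the data, instead of using an adversarial objective.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(16,))          # one flattened "video", for illustration
z = rng.normal(size=(4,))           # per-video latent code, learned jointly
W = 0.1 * rng.normal(size=(16, 4))  # toy linear generator

def loss(W, z):
    r = x - W @ z                   # reconstruction residual
    return float(r @ r)

lr = 0.01
initial = loss(W, z)
for _ in range(2000):
    r = x - W @ z                   # shared residual for both gradients
    gW = 2 * np.outer(r, z)         # -grad of ||x - Wz||^2 w.r.t. W
    gz = 2 * W.T @ r                # -grad of ||x - Wz||^2 w.r.t. z
    W = W + lr * gW                 # update generator weights...
    z = z + lr * gz                 # ...and the latent code, jointly
final = loss(W, z)
print(f"reconstruction loss: {initial:.3f} -> {final:.3f}")
```

Because both the code and the weights are free variables of the same reconstruction objective, no adversarial game is needed; the latent space is shaped directly by the data.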
To represent the relationships between different objects in natural scenes, we designed an attack approach that demonstrates the utility of the CLIP model as an attacker's tool for training formidable perturbation generators for multi-object scenes. In the fourth work, we proposed a method for deploying multi-task machine learning models on diverse hardware platforms that satisfies multiple hardware efficiency constraints (e.g., storage, latency) while keeping training cost to a minimum. In particular, we present a methodology for learning slimmable multi-task models whose filters can be switched according to user constraints, without much performance degradation.
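The switchable-filter idea behind the fourth work can be sketched roughly as follows. This is a hypothetical minimal example, not the proposed method: a single full-width weight matrix is stored, and a deployment-time width setting selects how many of its filters to execute, trading accuracy for storage and latency.

```python
# Hypothetical sketch of a slimmable ("switchable") layer: one full
# weight matrix is kept, and only a leading fraction of its output
# filters is used at inference, depending on a user-chosen width.
import numpy as np

rng = np.random.default_rng(1)
W = rng.normal(size=(8, 4))         # full layer: 8 output filters, 4 inputs
x = rng.normal(size=(4,))

def slim_forward(W, x, width):
    """Run the layer using only the first `width` fraction of filters."""
    k = max(1, int(W.shape[0] * width))
    return W[:k] @ x                # narrower output, fewer multiply-adds

full = slim_forward(W, x, 1.0)      # all 8 filters (maximum accuracy)
half = slim_forward(W, x, 0.5)      # first 4 filters (cheaper deployment)
print(full.shape, half.shape)
```

The narrow configuration is a strict prefix of the wide one, so a single trained model can serve several latency or storage budgets without retraining.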


This item is under embargo until May 4, 2025.