Subspaces of Deep Neural Networks
eScholarship
Open Access Publications from the University of California

UC Riverside Electronic Theses and Dissertations

No data is associated with this publication.
Creative Commons 'BY-NC-ND' version 4.0 license
Abstract

Deep Neural Networks are now the most prominent tool in Machine Learning, with a wide array of societal applications ranging from computer vision and natural language processing to drug discovery and materials engineering. With deep learning increasingly pervasive in daily life, especially in areas of critical importance, it becomes ever more necessary to understand the mechanisms and dynamics behind how these models work. This is important not only for engendering confidence in the public at large but also for ensuring fairness and accountability.

This thesis focuses on developing approaches that help us visualise and compare representations inside deep neural networks. We start with a coupled factorisation framework in which we embed features, neurons and input instances into a common shared space. This enables us to compute direct comparisons and correlations between the three entities and to visualise a network by analysing the subspaces in this common projected space. We build upon this by utilising tools from the subspace clustering literature to develop a more architecture-agnostic framework for comparing the representations of a neural network. In doing so, we discover that deeper and more overparameterised networks have large blocks of layers that are similar to each other. For a network of a given depth, this phenomenon also occurs when the network is trained on much less data, and earlier in its training. Additionally, we discover that earlier layers of the network converge to their final state more quickly than later layers.

Next, we perform a similar study for networks in the domain of time series classification, focusing on how different architectural choices affect the way these networks learn. We take ResNets and Temporal Convolutional Networks (TCNs), which are architecturally identical except for the causality of the convolution filters used.
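The kind of subspace-based comparison of layer representations described above can be sketched with principal angles between the spans of two layers' activation matrices. This is a minimal illustration, not the thesis's exact measure; the function name and the choice of principal angles as the similarity are assumptions for the example.

```python
import numpy as np

def layer_subspace_affinity(A, B, k=10):
    """Affinity between two layers' rank-k representation subspaces.

    A, B: (n_samples, n_units) activation matrices from two layers,
    evaluated on the same inputs (widths may differ). We compare the
    top-k principal subspaces in sample space via principal angles --
    a standard linear-algebra tool; the thesis's actual measure may
    differ (hypothetical sketch).
    """
    # Orthonormal bases for the top-k left singular subspaces
    # of the mean-centred activations.
    Ua, _, _ = np.linalg.svd(A - A.mean(axis=0), full_matrices=False)
    Ub, _, _ = np.linalg.svd(B - B.mean(axis=0), full_matrices=False)
    # Singular values of Ua_k^T Ub_k are the cosines of the
    # principal angles between the two subspaces.
    cos = np.linalg.svd(Ua[:, :k].T @ Ub[:, :k], compute_uv=False)
    return float(np.mean(cos ** 2))  # 1.0 => identical subspaces
```

Computing this affinity for every pair of layers yields a layer-by-layer similarity matrix; large similar "blocks" of layers would then show up as bright diagonal blocks in that matrix.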
We demonstrate that this difference in causality gives the two architectures very different dynamics when learning representations in their internal layers; this is the first such study for univariate time series networks. Finally, we analyse how neural networks evolve self-expressive structures in their internal representations by comparing networks trained under different loss regimes. We demonstrate the predictive power of these structures compared to linear probes trained on representations in the same space. We also use the information encoded in the self-expressive structures to differentiate between networks that generalise and networks that memorise, observing that the effects of memorisation tend to appear in the final layers of a network, and we examine how these effects change when different non-linear activations are used over neurons.
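A self-expressive structure, in the subspace clustering sense, writes each representation as a linear combination of the other representations in the same layer. The following is a minimal ridge-regularised sketch under that standard formulation; the function name, the regulariser, and the solver choice are assumptions, and the thesis's actual formulation (e.g. a sparsity penalty) may differ.

```python
import numpy as np

def self_expressive_coeffs(X, lam=1e-2):
    """Self-expressive coefficient matrix C with zero diagonal.

    X: (d, n) matrix whose n columns are representations. For each
    column j we solve the ridge problem
        min_c ||x_j - X_{-j} c||^2 + lam * ||c||^2,
    so x_j is expressed using only the *other* columns. Points lying
    in the same subspace tend to receive larger mutual weights.
    """
    d, n = X.shape
    C = np.zeros((n, n))
    G = X.T @ X  # Gram matrix, reused for every column
    for j in range(n):
        idx = [i for i in range(n) if i != j]
        A = G[np.ix_(idx, idx)] + lam * np.eye(n - 1)
        b = X[:, idx].T @ X[:, j]
        # Ridge solution restricted to the other columns.
        C[idx, j] = np.linalg.solve(A, b)
    return C
```

The pattern of large entries in C (e.g. after symmetrisation, fed to spectral clustering) is the kind of information one could use to compare layers or to probe whether a network's representations fall into clean subspaces.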


This item is under embargo until October 18, 2025.