Enhanced Register Data-Flow Techniques for High-Performance, Energy-Efficient GPUs