We aim to render scenes on 3D displays in real time from image data, targeting media such as streaming multi-view video. The conventional approach would be to first reconstruct 3D geometry from the image data and then render the resulting geometric and texture data; however, this is too slow for real-time use and lacks robustness for complex scenes. Instead, we take an image-based approach built on appropriate filtering. Our method interpolates the limited light field data of a 3D scene to simulate depth of field, allowing us to render new stereo-pair views from camera-array data in real time. Users can thus roam in front of a 3D display and still experience seamless, natural viewing. We produce these images by convolving the 2D image data with an elliptical Gaussian. By shearing this line-like Gaussian to minimize variance over an area of interest, the application determines the depth of a user-designated pixel. This lets users change focus in real time, so that the shallow depth of field does not feel restrictive; instead, they can attend to the details of their current interest. Our method can be combined with eye trackers and video data for minimally invasive surgery, video conferencing, and other remotely controlled interactive visual applications, for an enhanced, realistic viewing experience.
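To make the two core operations concrete, the sketch below illustrates one plausible reading of them for a horizontal (1D) camera array: refocusing by shifting each camera image along a sheared line in ray space and averaging under Gaussian aperture weights (approximating convolution with a sheared elliptical Gaussian), and depth estimation by searching for the shear that minimizes variance over a patch around a user-designated pixel. The names `refocus`, `estimate_shear`, and `sigma_aperture`, as well as the shear search range, are illustrative assumptions, not taken from the paper.

```python
# A minimal sketch, assuming a horizontal 1D camera array (a flatland light
# field) stored as a stack of grayscale images; all names here are
# hypothetical, not from the paper.
import numpy as np
from scipy.ndimage import shift as nd_shift

def refocus(images, offsets, shear, sigma_aperture=1.0):
    """Synthesize a view focused at the depth implied by `shear`.

    Translating each camera image by shear * offset follows a sheared line
    in ray space; averaging under Gaussian aperture weights then
    approximates convolving the light field with a sheared elliptical
    Gaussian, producing synthetic depth of field.
    """
    weights = np.exp(-0.5 * (np.asarray(offsets) / sigma_aperture) ** 2)
    weights /= weights.sum()
    out = np.zeros(images[0].shape, dtype=np.float64)
    for img, off, w in zip(images, offsets, weights):
        out += w * nd_shift(img.astype(np.float64), (0.0, shear * off), order=1)
    return out

def estimate_shear(images, offsets, y, x, half=4,
                   shears=np.linspace(-2.0, 2.0, 41)):
    """Estimate depth at a user-designated interior pixel (y, x) by picking
    the shear that minimizes sample variance over the area of interest."""
    best_shear, best_var = None, np.inf
    for s in shears:
        # Resample every camera at this candidate shear and cut out the
        # patch around the designated pixel.
        patch = np.stack([
            nd_shift(img.astype(np.float64), (0.0, s * off), order=1)
                    [y - half:y + half + 1, x - half:x + half + 1]
            for img, off in zip(images, offsets)
        ])
        v = patch.var(axis=0).mean()  # rays agree (low variance) when in focus
        if v < best_var:
            best_shear, best_var = s, v
    return best_shear
```

The variance criterion reflects the observation that rays from an in-focus scene point agree across cameras, so the correct shear collapses the spread of samples; in the interactive setting, an eye tracker would simply supply the patch center (y, x).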