Augmented Magic Mirror

Augmented Magic Mirror is a research project I made during my master’s program. There are several DIY instructions on how to build a smart mirror, sometimes also called a magic mirror. Smart mirrors can display digital 2D content, like your calendar or the weather forecast, on a mirror surface. My idea was to take it to the next level by displaying 3D content, that is, enhancing the real mirror image with virtual objects. The augmentation does not just overlay the two images; interposition as a depth cue is also taken into account. Real objects, like the user’s hand, therefore occlude virtual objects and vice versa.

The project is available on GitHub. Below is a short clip with the project in action.

Description

To display the 3D content, the virtual content is merged with the real mirror image. For this to look right, the virtual rendering must match the perspective of the viewer. Depth perception is also important, so that the viewer perceives the virtual objects at the correct distance. To match the perspectives, head tracking is used with the help of a Kinect V2 sensor. Adjusting the perspective as the viewer moves is also known as motion parallax. This is a depth cue in itself, as seen in the clip below. In a first milestone I therefore used only motion parallax as a depth cue. After testing, I observed that motion parallax works quite well for monocular viewers, like cameras. For a binocular human viewer, however, the depth effect isn’t convincing enough. In a second milestone I added a stereoscopic view using NVIDIA 3D Vision 2 glasses. After testing again, I concluded that stereopsis and motion parallax together are sufficient to merge both images and create a convincing depth perception.
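Matching the rendering to the viewer's perspective is commonly done with an off-axis (asymmetric) view frustum whose edges pass through the screen borders as seen from the tracked eye. The sketch below is a minimal illustration of that idea and not the project's actual code: the `Frustum` struct, the function name `OffAxisFrustum`, and the coordinate convention (origin at the screen center, x to the right, y up, z pointing toward the viewer, all in meters) are assumptions.

```cpp
#include <cassert>

// Bounds of an asymmetric view frustum, measured on the near plane.
struct Frustum {
    float left, right, bottom, top, zNear, zFar;
};

// Computes the frustum for an eye at (eyeX, eyeY, eyeZ) looking through a
// screen of the given size. Assumed convention: origin at the screen center,
// x right, y up, z toward the viewer, units in meters.
Frustum OffAxisFrustum(float eyeX, float eyeY, float eyeZ,
                       float screenWidth, float screenHeight,
                       float zNear, float zFar)
{
    assert(eyeZ > 0.0f && zNear > 0.0f && zFar > zNear);

    // Similar triangles: project the screen borders onto the near plane.
    const float scale = zNear / eyeZ;

    Frustum f;
    f.left   = (-0.5f * screenWidth  - eyeX) * scale;
    f.right  = ( 0.5f * screenWidth  - eyeX) * scale;
    f.bottom = (-0.5f * screenHeight - eyeY) * scale;
    f.top    = ( 0.5f * screenHeight - eyeY) * scale;
    f.zNear  = zNear;
    f.zFar   = zFar;
    return f;
}
```

The resulting bounds have the same meaning as the arguments of DirectXMath's `XMMatrixPerspectiveOffCenterLH`, and would be combined with a view matrix that places the camera at the eye position. For a stereoscopic view, one would call the function once per eye, shifting the eye position by half the interpupillary distance.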

The head tracking using a Kinect V2 sensor is also available on GitHub as a separate project. Below is a clip of the head tracking without the mirror.

The occlusion works both ways. For virtual objects to hide a real object, I simply render them. How opaque they appear depends on the light in the room and the monitor brightness, but it is sufficient in most situations. To occlude virtual objects, I make use of the Kinect’s depth image. The depth image can be used to compare the distance between the mirror and a real object with the distance between the mirror and a virtual object. To simplify that comparison, I reconstruct a surface mesh from the depth image. That mesh is then rendered as a black object, so that the z-buffering of the rendering pipeline does the comparison.
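The depth comparison that the z-buffer performs for the black mesh can be written out per pixel. The following is a minimal CPU-side sketch of that comparison, not the project's rendering code: the `Rgb` struct, the function name `ComposeWithOcclusion`, and the convention that a real depth of 0 means "no measurement" are assumptions. Black is used for occluded pixels because a black monitor pixel emits no light, so the real reflection stays visible there.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Rgb {
    std::uint8_t r, g, b;
};

// Per-pixel occlusion: keep the virtual color where the virtual surface is
// closer to the mirror than the real one, otherwise output black.
// Depths are distances from the mirror in meters. A real depth of 0 is
// treated as "no measurement" (an assumption of this sketch).
std::vector<Rgb> ComposeWithOcclusion(const std::vector<Rgb>& virtualColor,
                                      const std::vector<float>& virtualDepth,
                                      const std::vector<float>& realDepth)
{
    assert(virtualColor.size() == virtualDepth.size());
    assert(virtualColor.size() == realDepth.size());

    const Rgb black{0, 0, 0};
    std::vector<Rgb> out(virtualColor.size(), black);

    for (std::size_t i = 0; i < out.size(); ++i) {
        const bool realMissing = realDepth[i] <= 0.0f;
        if (realMissing || virtualDepth[i] < realDepth[i]) {
            out[i] = virtualColor[i];  // virtual object is in front
        }
        // Otherwise the pixel stays black and the real object shows through.
    }
    return out;
}
```

Rendering the reconstructed mesh in black into the same depth buffer as the virtual scene gives the same result on the GPU without an explicit loop.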

Setup

For the augmented magic mirror I followed these DIY instructions. Basically, it is just a glass or acrylic pane with a semi-transparent foil. That semi-transparent mirror is then placed on a monitor screen. For my setup I placed the Kinect V2 sensor beneath the augmented magic mirror, though you can also place it above. The closer it is to the augmented magic mirror, the better. Be careful if you place the sensor behind the semi-transparent mirror: the sensor uses infrared light for its depth camera, and some materials are opaque to infrared. For the stereoscopic view you can use any device that is compatible with the Windows 10 3D display mode. I used the NVIDIA 3D Vision 2 glasses. Place the IR emitter somewhere next to the augmented magic mirror. It just needs to be within range and in direct line of sight of the glasses. Also keep in mind that a 3D-enabled monitor is required for the stereoscopic view.

Augmented Magic Mirror setup.

The software is a C++ application targeting Windows. It uses the official Microsoft Kinect SDK v2.0, which is only available for Windows. The application can render using either DirectX 11 or DirectX 12. I started off implementing the rendering with DirectX 12. When adding stereo support, however, I ran into an incompatibility between DirectX 12 and the 3D display mode. To solve that, I had to reimplement the rendering using DirectX 11. The stereoscopic rendering uses the Windows API and does not use the NVIDIA SDK.

Conclusion

For the physical setup I made two observations. First, the mirror and the screen need to be the same size. Otherwise one image exceeds the other and is cut off, which breaks the illusion. Second, the setup needs proper mounting and calibration. In the beginning I used two clamps to mount the mirror pane on the monitor. As my pane is acrylic, it bent under the pressure, which resulted in a distorted mirror image. Later on I used tape to prevent this. Even small deviations break the illusion.

As already mentioned, stereopsis is mandatory for proper depth perception. For the project I only had access to the shutter glasses; other stereo systems that do not require glasses unfortunately weren’t available. The glasses cause several issues. First of all, they limit usability: users have to put them on before using the mirror. The semi-transparent mirror combined with the shutter glasses also reduces the perceived brightness, so the overall scene is very dark for the user. For a proper stereoscopic and motion parallax view, the software needs to detect the exact eye position. However, the glasses are opaque to the Kinect, so it cannot detect the exact position and an approximation is used instead. In addition, the glasses are quite bulky. The head tracking also fails every now and then, interrupting the experience.

The occlusion is implemented only in a very basic, naive form. The reconstruction simply uses each pixel as a vertex and creates a face between three adjacent vertices. This results in scattered edges and fragments. Because the Kinect is placed next to the augmented magic mirror, its perspective differs from the viewer’s. This difference causes incorrect occlusion at object edges. It also introduces a shadow issue: the viewer can see a region of the real scene that is hidden from the Kinect sensor, as in the image below. One proposed solution is to use multiple Kinect sensors, which would cover different perspectives of the scene in front of the mirror. The depth images could then be merged into a point cloud that covers edges beyond the user’s perspective, as well as shadows. Currently, however, hardware and SDK limitations allow only one Kinect sensor to be used.

Shadow and scatter issue of the depth mesh.
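The per-pixel reconstruction described above can be sketched as an index buffer over the depth image grid, with one vertex per pixel and two triangles per 2x2 block of neighbors. This is a minimal illustration under stated assumptions, not the project's implementation: the function name `TriangulateDepthGrid`, the use of 0 for missing depth samples, and the `maxJump` parameter are hypothetical. Skipping triangles that span a large depth jump is one possible way to reduce stretched faces at object edges; it is not something the project is known to do.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Builds a triangle index list over a width x height depth grid, using one
// vertex per pixel (vertex index = y * width + x).
// depth holds distances in meters; 0 marks a missing sample (assumption).
// Triangles that touch a missing sample, or whose depth range exceeds
// maxJump, are skipped. Pass a very large maxJump to keep every valid face.
std::vector<std::uint32_t> TriangulateDepthGrid(const std::vector<float>& depth,
                                                std::size_t width,
                                                std::size_t height,
                                                float maxJump)
{
    assert(depth.size() == width * height);

    std::vector<std::uint32_t> indices;
    if (width < 2 || height < 2) {
        return indices;
    }

    auto usable = [&](std::uint32_t a, std::uint32_t b, std::uint32_t c) {
        const float da = depth[a], db = depth[b], dc = depth[c];
        if (da <= 0.0f || db <= 0.0f || dc <= 0.0f) {
            return false;  // at least one vertex has no measurement
        }
        const float lo = std::min({da, db, dc});
        const float hi = std::max({da, db, dc});
        return hi - lo <= maxJump;
    };

    for (std::size_t y = 0; y + 1 < height; ++y) {
        for (std::size_t x = 0; x + 1 < width; ++x) {
            const std::uint32_t tl = static_cast<std::uint32_t>(y * width + x);
            const std::uint32_t tr = tl + 1;
            const std::uint32_t bl = static_cast<std::uint32_t>((y + 1) * width + x);
            const std::uint32_t br = bl + 1;

            // Two triangles per grid cell; the winding order is arbitrary
            // here and must match the renderer's culling settings.
            if (usable(tl, bl, tr)) indices.insert(indices.end(), {tl, bl, tr});
            if (usable(tr, bl, br)) indices.insert(indices.end(), {tr, bl, br});
        }
    }
    return indices;
}
```

The indices refer to a vertex buffer that holds one 3D position per depth pixel. A filter like this only hides fragments; it does not address the shadow regions, which come from the sensor's differing viewpoint.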