Occasionally we get downtime on the Emerging Experiences team — but not very much. After a recent project we decided to play with some of the technology we have lying around. One of our favorite toys for 2012 is the high-end projector (pick yours up for Christmas now!). With a powerful projector you can create great interactive experiences on glass, on store windows, on mist, on water and so on. When a projector is pointed at a flat surface, it creates the illusion of depth. Lately we have been playing with the idea of shining a projector on surfaces that already have depth. Next step — hooking this up to an array of Kinect sensors to see what projecting on 3D surfaces that interact with 3D depth cameras will look like.
Archive for November, 2011
Ever since Microsoft started leaking details about the upcoming version of its flagship product, Windows 8, there has been a firestorm of controversy among Microsoft’s faithful. Many Silverlight application developers and publishers feel they have been willfully misled into investing in a technology that Microsoft is now apparently abandoning. Many IT pros dislike and fear the retraining effort the new Start Screen and other Windows shell changes will require. Finally, many ASP.NET web developers don’t see how Windows 8 relates to them, despite the fact that Microsoft is adding “WinJS”, a runtime that allows web developers to leverage their existing skills to build native applications. On the ground, it may seem like things are going badly for Windows 8, but with a little developer ingenuity and a lot more communication and documentation from Microsoft, Windows 8 could be the product that saves Microsoft from becoming a victim of its own success.
Take the Start Screen, for example. In order to finally enable OEMs to build devices that can truly be considered a “Tablet PC”, Microsoft has to provide a way for users to launch applications. One might be tempted to think that the Start Menu in Windows 7 could be adapted to serve this purpose, but fingers are just not good at tapping on small or densely packed icons. Making the Start Menu a full-screen experience is really the only way to get enough space for a truly usable, touch-optimized experience. We in the Emerging Experiences group have known this for years, as practically every touch-based application we have built has been a full-screen app. On top of this, the Start Screen’s animations are extremely fluid and natural, so to us it seems like a natural platform from which to launch our showcase applications.
To give a little background about ourselves and our applications: we are a technology-agile group, which means we use whatever technology creates the best experience for our customers. Many of our apps are built using WPF, but we also have apps built using Flash. Obviously, attempting to port the Flash apps would not be a good strategy, but even after a brief investigation I quickly decided that attempting to port all of our WPF applications was a non-starter as well. The Metro APIs are far too different, and who knows whether, after porting the WPF applications, I would even end up with apps that worked? The solution, it was decided, was to leave the existing showcase applications as they were and simply create live tiles for them so that they could be invoked.
The problem with this solution is that it is not possible to really take advantage of the Live Tile infrastructure from a Win32 app. In a Metro-style (WinRT) application you can supply different resolution images for the tiles by altering the AppX manifest, but Win32 applications don’t have AppX manifests. It might seem trivial to simply create a WinRT application that, upon launch, invokes one of our showcase applications, and to use the WinRT app’s AppX manifest to customize the Live Tile, but unfortunately the relationship between WinRT and Win32 is significantly more complex than that. First of all, WinRT applications can call some Win32 APIs, but they explicitly cannot create new processes; this is part of Microsoft’s security model for WinRT apps. On top of that, even though WinRT apps can call many Win32 APIs, many of those calls either fail outright or fail to have the desired result. Clearly this is an area where Microsoft can do a much better job of providing documentation.
To work around these limitations, I decided to create a WPF application that lives in the system tray as a notification icon. The entire purpose of this WPF application is to listen for network calls and then launch and activate the requested application. At that point our WinRT “launcher” application was simply responsible for initiating the network call and then closing itself down.
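To make the architecture concrete, here is a minimal sketch of what the tray application’s listener loop might look like. This is illustrative only — the port number, the one-line message format, and the `Normalize` helper are assumptions, not our actual protocol:

```csharp
// Hypothetical sketch of the tray application's listener loop.
// The port and the "one line = one app path" message format are assumptions.
using System.Diagnostics;
using System.IO;
using System.Net;
using System.Net.Sockets;

class LaunchListener
{
    const int Port = 9050; // illustrative port, not the real one

    // Trim whitespace and surrounding quotes from the incoming request line.
    public static string Normalize(string line) =>
        line == null ? null : line.Trim().Trim('"');

    public static void Listen()
    {
        var listener = new TcpListener(IPAddress.Loopback, Port);
        listener.Start();
        while (true)
        {
            // The WinRT "launcher" tile connects and sends the path
            // of the showcase application it wants started.
            using (var client = listener.AcceptTcpClient())
            using (var reader = new StreamReader(client.GetStream()))
            {
                string appPath = Normalize(reader.ReadLine());
                if (!string.IsNullOrEmpty(appPath))
                    Process.Start(appPath); // launch and activate the showcase app
            }
        }
    }
}
```

The WinRT side is then trivial: open a socket to localhost, write one line, and exit.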
While this worked beautifully in the debugger, I was surprised to find that it did not work once the applications were freed from the debugger. Sure, the Launcher application still made a network call to the WPF application and the WPF application still launched the showcase application, but the showcase application was never displayed. The problem, it turns out, is that the Win32 function “SetForegroundWindow” on which my WPF application was indirectly relying behaves differently if the calling application is being debugged. Clearly the Windows shell makes use of a facility to show the desktop when the user clicks on the Desktop tile in the Start Screen, but when I asked Microsoft about this and SetForegroundWindow, I was essentially told that this was by design and that only the end user should control which window has focus. I understand the wisdom of this decision, but this answer didn’t get me any closer to being able to launch our showcase applications from nice looking Live Tiles.
While I wouldn’t propose that developers do this in production applications, Windows 8 isn’t a production OS itself, and I still hold out hope that Microsoft will make this whole endeavor moot by the time they release Windows 8. With that disclaimer in effect, the way I solved this problem was to create a third Windows Forms application whose sole responsibility is to run CDB, the command-line debugger, and automate it to launch and attach to the WPF application. Because the WPF application has a debugger attached, it is now able to use the SetForegroundWindow API, and the entire system works as expected. In fact, by not creating a window in the Windows Forms application and by launching CDB without a console window, the entire hack is invisible to the user and everything works transparently.
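The CDB step above can be sketched as follows. This is a simplified illustration, not our actual code: it assumes `cdb.exe` (from the Debugging Tools for Windows) is on the PATH, and that attaching with `-g -G` (skip the initial and final breakpoints) is all that is needed to keep the target running under the debugger:

```csharp
// Hypothetical sketch of the hidden "debugger host": launch CDB,
// attach it to the running WPF tray application, and let it run.
// Assumes cdb.exe is on the PATH; the method names are illustrative.
using System.Diagnostics;

class DebuggerHost
{
    // -p <pid> attaches to a running process; -g and -G skip the
    // initial and final breakpoints so the target just keeps running.
    public static string BuildCdbArguments(int pid) => $"-g -G -p {pid}";

    public static Process AttachCdb(int wpfAppPid)
    {
        var startInfo = new ProcessStartInfo
        {
            FileName = "cdb.exe",
            Arguments = BuildCdbArguments(wpfAppPid),
            UseShellExecute = false,
            CreateNoWindow = true // no console window: the hack stays invisible
        };
        return Process.Start(startInfo);
    }
}
```

Because `CreateNoWindow` suppresses CDB’s console and the Windows Forms host never shows a window of its own, nothing about the workaround is visible to the user.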
Over the last few weeks Amnesia Razorfish had the opportunity to collaborate with the University of Sydney and Publicis Mojo for an innovative take on how near field communication can transform tasks in our daily lives. For this concept designed for food courts Stephen Davis created a Brand Table prototype that not only simplifies the process of displaying menus of restaurants on your phone, but also allows you to place your order and pay instantly.
Steve has a nice write-up on his website about the benefits of this approach, and we were happy to see it picked up by a number of online publications, including TechCrunch and Engadget.
With the widespread adoption of NFC on the horizon, we’ll hopefully see concepts like this becoming reality soon.
As we approach the one year anniversary of the Kinect launch, Microsoft has announced that the Kinect for PC Commercial SDK will be released in early 2012 (http://majornelson.com/2011/10/31/xbox-360-celebrates-one-year-anniversary-of-the-kinect-effect/). More than 200 businesses worldwide, including Toyota, Houghton Mifflin Harcourt and Razorfish, are involved in a pilot program to explore the commercial possibilities of the Kinect.
Until now, most companies working with the Kinect have been working within the constraints of a research license for the Kinect SDK. Consequently, the applications that corporations have been working on have been restricted to tightly held private projects or, at most, proof-of-concept projects visible only as demo reels on the Internet. While most people are at least aware of the Kinect technology, the terms of the research license have relegated it to being an afterthought or something only understood at a distance – a nice-to-have.
The recent announcement of the timeline for the commercial license implicitly green-lights these projects to make preparations for releasing Kinect-enabled applications for everyday use. Over the next year we can expect to see the Kinect become a ubiquitous part of our daily environments, just as prevalent as interactive kiosks are today. The spread of the Kinect beyond the living room may be as dramatic as the proliferation of smart phones or tablets – one day no one knew what they were and, the next, everyone seemed to have one. In boardrooms across America, the question will no longer be whether to have a Kinect strategy but what that strategy is.
As the Kinect becomes more prevalent in our daily lives, its possibilities and limitations will undergo much closer scrutiny. The potential offered by a mass-produced device that provides a video camera, an infrared depth camera and a four-microphone array with beamforming capabilities is vast. The technology can be taken in multiple directions, including computer vision in robotics, 3D modeling with multiple linked devices, inexpensive augmented reality, hands-free interactive experiences, speech-recognition-based in-store assistance and innovative computer-assisted learning.
Microsoft’s visionary strategy in designing the Kinect revolves around off-loading processing to the operating system rather than building it solely into the hardware. This means that complex scenarios not currently supported by the Xbox can be made viable through improved software and the processing power of computers and video cards, the prices of which are constantly falling. Microsoft’s Kinect technology is, in other words, scalable: advancing it does not require improving the Kinect hardware itself but, instead, simply improving the software that processes the data streamed by the Kinect.
This all leads to the inevitable question – what is the future of the Kinect? After a year, what are second-generation Kinect applications going to look like? The answer depends on where Microsoft takes Kinect software going forward. The current research version of the Kinect SDK beta shows its roots in gaming. The visual processing, depth processing and even acoustical models are tied to the limitations and optimizations required for the Xbox 360 gaming system. They all work best in a room about the size of a living room and even begin to have trouble in small apartments. The microphone array seems to work well in standard rooms, for which it has been painstakingly optimized to deal with surround-sound speakers and audio reflections off furniture, but appears to have trouble in large spaces.
Strikingly, even though the depth camera is capable of 640 x 480 resolution, the current SDK only provides access to 320 x 240 image streams. Likewise, the Kinect SDK does not provide depth data for objects within 800 mm (about 2 ½ feet) of the Kinect sensor, even though the camera does capture this information.
There are clearly performance reasons for setting these limitations. However, part of the problem also appears to be that the Kinect’s USB connection is a bottleneck, throttled for the particular USB controller configuration requirements of the Xbox. As the Kinect moves out of the living room and into the real world, it makes sense to leave behind the restrictions imposed by tying the Kinect SDK to the Xbox. If we can use improved software running on improved hardware to boost the capabilities of Kinect for PC applications, it would be a shame to have a gaming infrastructure be the main showstopper.
Nowhere is this more clear than when we consider using the Kinect in the office. As a Kinect developer, I have to slide my chair back and away from my monitor whenever I want to debug a piece of code. Fortunately I don’t work in a cubicle and have some open space behind me. I am also fortunate that my chair has wheels and I have the code – slide – code routine down pat. However I don’t see anyone wanting to use a Kinect-enabled business application in this way. Unlike the living room, which is the natural space of our home lives, the office environment of our work lives is generally cramped and close to the screen with just enough room for a keyboard between us and our monitor. We are always within two and a half feet of the objects we work with.
Yet the workspace is one of the chief places we want to see our Kinects working. Instead of large arm movements, we would like to wave our hands or snap our fingers to make things happen on our screens. We want Minority Report writ small. To achieve this, in turn, we need to move beyond skeletal tracking and start enabling fine finger tracking.
Along the same lines, for larger movements, the skeletal tracking capabilities of the Kinect only work with the full body. At the office, sitting in our office chairs, we typically never see anything below the waist. Even skeletal tracking, then, needs to be modified to take this into account and to support partial skeleton tracking at the software level.
As the Kinect is being allowed to travel beyond our living rooms with the upcoming release of the commercial Kinect SDK, the software that allows developers to build applications for the Kinect needs to cut its strong dependence on gaming scenarios. This is the natural future for a technology that is maturing. This is where the Kinect is headed – not only out into the world but also up in our faces. We want and need to get closer to the Kinect.