Cognitive Services And Artificial Intelligence: How Microsoft Pix Works

We asked representatives of the Strategic Technologies Department at Microsoft Russia to tell us how the new Pix app works and which services were used to create it.

Professional photographers know the feeling of taking a million shots in pursuit of a perfect one, when it is essential to capture the moment because a split second later it is gone forever. Most of us also know the wish to feel like a pro and get a unique, perfect shot with a smartphone, which is always at hand but lacks some functions of a professional camera. A Microsoft research team proposed a solution to this problem and developed Microsoft Pix, an iPhone app that uses artificial intelligence to adjust the settings (ISO, exposure, focus) for the best possible shot. In this article we look at it from both a user’s and a developer’s perspective.

How does it work from a user’s perspective?

For a user, Microsoft Pix is a very simple app that requires no additional actions. All you need to do is take a photo; everything else happens inside the app.

Where do ideas for such apps come from?

Such ideas arise from the problems that complicate our lives. Josh Weisberg, a lead manager in the Computational Photography Group at Microsoft Research, said that his wife’s dissatisfaction with the quality of children’s photos taken with a smartphone prompted him to create the app.

You will find several interesting features in the app, for example automatic photo stabilization, gallery synchronization, and open/closed-eye detection. It also has a set of filters and photo editing tools; all changes are stored, so you can always undo them and return to the original photo. Microsoft Pix can shoot both photos and video.

The principle behind Microsoft Pix is quite simple: the app captures a series of ten shots and then uses artificial intelligence to choose the best ones (up to three). Before deleting the “unsuccessful” images, the app analyzes the data from the whole series and uses it to de-noise the selected shots and choose the best exposure and tone. The whole process takes up to one second. While adjusting the look of the photo, the app also determines whether there was any motion in the shot; if so, you can create a short animation, a “live image”.
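
The selection step can be sketched in a few lines of Python. This is only an illustration, not Microsoft’s algorithm: the article does not say how Pix ranks frames, so the sharpness and exposure measures, the function names (`sharpness`, `exposure_penalty`, `pick_best`) and the weighting are all assumptions. Frames are plain lists of rows of grayscale values from 0 to 255.

```python
from statistics import mean, pvariance


def sharpness(frame):
    """Variance of horizontal neighbor differences; higher means crisper edges.

    Assumed stand-in for a real focus measure. Rows need at least two pixels.
    """
    diffs = [row[i + 1] - row[i] for row in frame for i in range(len(row) - 1)]
    return pvariance(diffs)


def exposure_penalty(frame, target=128):
    """Distance of the mean brightness from a mid-gray target (hypothetical)."""
    pixels = [p for row in frame for p in row]
    return abs(mean(pixels) - target)


def pick_best(burst, keep=3, exposure_weight=1.0):
    """Return the indices of up to `keep` highest-scoring frames, best first."""
    scores = [sharpness(f) - exposure_weight * exposure_penalty(f) for f in burst]
    ranked = sorted(range(len(burst)), key=lambda i: scores[i], reverse=True)
    return ranked[:keep]
```

Given a burst made of a dark flat frame, a mid-gray flat frame and a crisp checkerboard, `pick_best(burst, keep=2)` ranks the checkerboard first and the mid-gray frame second. A real app would work on camera buffers and would likely add face-based cues such as open eyes to the score.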

How does it work from a developer’s perspective?

For developers, Microsoft Pix is an app built on existing tools and enhancements to them. Below is a short description of the three most interesting algorithms, technologies and services used to develop it:

  1. Face recognition was implemented with Microsoft Cognitive Services, a set of intelligent cloud APIs that recognize and interpret input expressed through natural means of communication. For example, the app relies on the Face API, which detects faces and returns face rectangles and face attributes, including facial features and pose. The Emotion API helped recognize people’s emotions with an artificial intelligence algorithm based on facial expression patterns. However, the developers trimmed the API functionality, because the app does not need to detect nuanced emotions; basic cues such as open or closed eyes or a smile are enough. Microsoft Cognitive Services can be used for free; the free tier is limited to 30,000 transactions per month.
  2. The algorithm that counters the “shaky hands” effect was first implemented in 2012 in Cliplets, a separate, also free app for looping animations. It helped solve the most common problem of GIF animation: a very noticeable jump from the end back to the beginning. The program isolates all moving objects and determines their speed, then adjusts the motion of each object so that the movement at the beginning and at the end coincide. In other words, some objects speed up and others slow down. Microsoft Pix uses the Cliplets algorithms to replace a tripod. If you are interested in using such functions, you can find more details in the paper “Automated Video Looping with Progressive Dynamism”.
  3. The algorithm for creating a “live image” is described in a research paper, but its original version was very slow. For use in Microsoft Pix, special classifiers were added to detect and choose the element to loop. This step now takes less than 50 milliseconds, and image processing takes less than 2 seconds.
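
The idea of hiding the jump at the loop seam, from item 2, can be illustrated with a much simpler stand-in than the per-object optimization described there: search for a start frame and a loop length such that the frame that would follow the loop’s last frame looks as much as possible like the start frame. The function names and the squared-difference measure are assumptions made for this sketch; it is not the Cliplets or Pix implementation.

```python
def frame_distance(a, b):
    """Sum of squared pixel differences between two equally sized frames."""
    return sum((pa - pb) ** 2 for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def best_loop(frames, min_period=2):
    """Return (start, period) for the loop frames[start:start + period].

    The loop wraps from its last frame back to frames[start], so the seam is
    least visible when frames[start + period] resembles frames[start].
    """
    best = None
    for start in range(len(frames)):
        for period in range(min_period, len(frames) - start):
            cost = frame_distance(frames[start], frames[start + period])
            # Strict comparison keeps the earliest candidate on ties.
            if best is None or cost < best[0]:
                best = (cost, start, period)
    if best is None:
        raise ValueError("not enough frames to form a loop")
    return best[1], best[2]
```

This brute-force search compares every candidate pair of frames, which is fine for a ten-shot burst but would need pruning for longer clips. It treats the frame as a whole, whereas the approach described in item 2 lets different moving objects loop at their own pace.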

Using the algorithms and cloud technologies described above, you can develop a similar app yourself. Besides, the Microsoft Pix app is part of another app developed by Josh Weisberg with the Microsoft team: Microsoft Hyperlapse for iOS. But that is another story.
