Towards the Holodeck Experience: Seeking Life-Like Interaction with Virtual Reality
By now, most of us are well aware of the market buzz around the topics of virtual and augmented reality. Many of us, at some point or another, have donned the bulky, head-mounted gear and tentatively stepped into the experience to check it out for ourselves. And, depending on how sophisticated your setup is (and how much it costs), your mileage will vary. Ironically, some research suggests that it’s the baby boomers who are more likely to be “blown away” by virtual reality, while the millennials are more likely to respond with an ambivalent “meh”. And this brings us to the ultimate question simmering in the minds of a whole lot of people: is virtual reality here to stay?
It’s a great question.
Certainly, the various incarnations of 3D viewing over the last half-century suggest that we are not happy with something: our current viewing conditions are not good enough, or something isn’t quite right with the way we consume video today.
What do you want to see?
Let’s face it: the way we consume video today is not the way our eyes were built to record visual information, especially in the real world. Looking into the real world (which, by the way, is not what you are doing right now), your eyes capture much more information than the color and intensity of light reflected off the objects in a scene. In fact, the Human Visual System (HVS) is designed to pick up on many visual cues, and these cues are extremely difficult to replicate in both current-generation display technology and content.
Displays and content? Yes. Alas, it is a two-part problem. But let’s first get back to the issue of visual cues.
What your brain expects you to see
Consider this: for those of us with the gift of sight, the HVS provides roughly 90% of the information we absorb every day, and as a result, our brains are well tuned to the various laws of physics and the corresponding patterns of light. Put more simply, we recognize when something just doesn’t look the way it should, or when there is a mismatch between what we see and what we feel or do. These mismatches in sensory signals are where our visual cues come into play.
Here are some cues that are most important:
- Vergence distance is the distance the brain perceives when the muscles of the eyes move to focus at a physical location, or focal plane. When that focal plane is at a fixed distance from our eyes (say, the screen in your VR headset), the brain is literally not expecting you to detect large changes in distance. After all, your eye muscles are fixed on something that is physically attached to your face, i.e. the screen. But when the visual content is produced so as to simulate the illusion of depth (especially large changes in depth), the brain recognizes a mismatch between the distance information it is getting from our eyes and the distance it is trained to receive in the real world based on where our eyes are physically focused. The result? Motion sickness and/or a slew of other unpleasantries.
- Motion parallax: As you, the viewer, physically move (say, walk through a room in a museum), objects that are physically closer to you should move more quickly across your field of view (FOV), while objects that are farther away should move more slowly.
- Horizontal and vertical parallax: Objects in the FOV should appear different when viewed from different angles, reflecting changes in visual angle as your horizontal and vertical position changes.
- Motion-to-photon latency: It is really unpleasant when you are wearing a VR headset and the visual content doesn’t change right away to accommodate the movements of your head. This lag is called “motion-to-photon” latency. To achieve a realistic experience, motion-to-photon latency must be less than 20ms, which means that service providers, e.g. cable operators, will need to design networks that can deterministically support extremely low latency. After all, from the time you move your head, a lot of things need to happen: signaling the head motion, identifying the content consistent with that motion, fetching that content if it is not already available to the headset, and so on.
- Support for occlusions, including the filling of “holes”: As you move through, or across, a visual scene, objects in front of or behind other objects should block each other, or begin to reappear, consistent with your movements.
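To make the 20ms requirement concrete, here is a minimal latency-budget sketch. The pipeline stages mirror the ones described above, but every timing value is an illustrative assumption, not a measurement of any real headset or network.

```python
# Hypothetical motion-to-photon budget. Stage timings are assumptions
# chosen for illustration; real systems will differ.
BUDGET_MS = 20.0  # commonly cited comfort threshold

pipeline_ms = {
    "sense head motion (IMU sampling + fusion)": 2.0,
    "signal motion / fetch content over the network": 8.0,
    "render the new viewpoint": 7.0,
    "display scan-out (photons reach the eye)": 2.0,
}

total = sum(pipeline_ms.values())
for stage, ms in pipeline_ms.items():
    print(f"{stage}: {ms:4.1f} ms")
print(f"total: {total} ms ({'within' if total <= BUDGET_MS else 'over'} budget)")
```

The takeaway: the network leg gets only a slice of the 20ms budget, which is why deterministic low latency matters so much to operators.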
It’s no wonder…
Given all of these huge demands placed on the technology by our brains, it’s no wonder that current VR is not quite there yet. But, what will it take to get there? How far does the technology still have to go? Will there ever be a real holodeck? If “yes”, when? Will it be something that we experience in our lifetimes?
The holodeck first appeared in Star Trek: The Next Generation in 1987. It was a virtual reality environment that used holographic projections to make it possible to interact physically with the virtual world.
Fortunately, there are a lot of positive signs indicating that we might just get to see a holodeck sometime soon. Of course, that is not a promise, but let’s say that there is evidence that content production, distribution, and display are all making significant strides. How, you say?
Capturing and displaying light fields
Light fields are 3D volumes of light, as opposed to the ordinary 2D planes of light that are commonly distributed from legacy cameras to legacy displays. When the HVS captures light in the natural world (i.e. not from a 2D display), it does so by capturing light from a 3D space: a volume of light reflected from the objects in our field of view. That volume of light contains the information necessary to trigger the all-important visual cues for our brains, allowing us to experience visual information in a way that is natural to us.
So, in a nutshell, not only does there need to be a way to capture that volume of light, but there also needs to be a way to distribute it over a network (e.g. a cable network), and there needs to be a display at the end of the network capable of reproducing the volume of light from the digital signal that was sent. A piece of cake, right?
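A back-of-the-envelope sketch shows why distributing a volume of light is so much harder than distributing a plane of light. The numbers below (a UHD image per view, a 16x16 grid of viewpoints in a two-plane parameterization) are assumptions chosen purely for illustration:

```python
# Compare one conventional 2D frame with an uncompressed light field
# stored as a grid of viewpoints. All sizes are raw (pre-compression).
BYTES_PER_SAMPLE = 3  # 8-bit RGB

def frame_bytes(width, height):
    """Raw size of a single 2D frame."""
    return width * height * BYTES_PER_SAMPLE

def light_field_bytes(width, height, views_u, views_v):
    """Raw size of a light field: one (width x height) view per grid point."""
    return frame_bytes(width, height) * views_u * views_v

plain = frame_bytes(3840, 2160)                 # one UHD frame
field = light_field_bytes(3840, 2160, 16, 16)   # 16x16 viewpoint grid

print(f"2D UHD frame:      {plain / 1e6:.1f} MB")
print(f"16x16 light field: {field / 1e9:.1f} GB ({field // plain}x larger)")
```

Even before motion and frame rate enter the picture, the viewpoint grid multiplies the payload by hundreds, which is why compression and distribution work is so important.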
Believe it or not
There is evidence of significant progress on all fronts. For example, at the F8 conference earlier this year, Facebook unveiled its light field cameras and corresponding workflow. Lytro is also a key player in the light field ecosystem with its production-grade light field cameras.
On the display side, there are Light Field Lab and Ostendo, both with the mission of making in-home viewing with light field displays, i.e. displays that are capable of projecting a volume of light, a reality.
On the distribution front, both MPEG and JPEG have projects underway to make the compression and distribution of light field content possible. And, by the way, what is the digital format for that content? Check out this news from MPEG’s 119th meeting in Torino:
At its 119th meeting, MPEG issued Draft Requirements to develop a standard to define a scene representation media container suitable for interchange of content for authoring and rendering rich immersive experiences. Called Hybrid Natural/Synthetic Scene (HNSS) data container, the objective of the standard will be to define a scene graph data representation and the associated container for media that can be rendered to deliver photorealistic hybrid scenes, including scenes that obey the natural flows of light, energy propagation and physical kinematic operations. The container will support various types of media that can be rendered together, including volumetric media that is computer generated or captured from the real world.
This latest work is motivated by contributions submitted to MPEG by CableLabs, OTOY, and Light Field Lab.
Hmmmm … reading the proverbial tea-leaves, maybe we are not so far away from that holodeck experience after all.
Subscribe to our blog to read more about virtual reality and more CableLabs innovations.
Gadgets Galore in our Comfortable Living Room Lab
Someone once said, “Seeing is Believing.” At CableLabs, we believe that “Doing is Knowing.” With this in mind, we crafted a space in our Sunnyvale, CA facility that we call the Moveable Experience Lab (aka “MEL”).
The MEL simulates a consumer living room. The space has been intentionally designed as a “safe place” to form your own ideas and opinions about the experiences influencing consumers’ habits today.
Everything in the lab is meant to be touched, used, and experimented with, similar to a Discovery Museum for adults. Haven’t tried the Oculus Rift, VR, or 360° content? We’ve got it. How about the Amazon Echo? We’ve got it. HBO GO, Xfinity X1, Google Glass, smart watches, and UHD TVs… check, check, check, and check. The good news? Unlike crowded conference floors or company demos, visitors are not rushed to take their turn or to see only a limited aspect of the experience. Visitors are encouraged to take as much time with any technology they are interested in so that they can get a full sense of what might be the big “wow” associated with any one of these consumer tech experiences. Further, CableLabs personnel are frequently on hand to offer their own insights, or just to converse about the technology, without the pressure of press or others waiting for you to form an “official” opinion. Sometimes, just talking about what you’ve observed helps ensure that the learning sticks, or could inspire you to have new ideas about new possibilities.
Right now, MEL contains several experience exhibits:
Battle for the Living Room
Experiences for the Sony 65” UHD Display
- Comcast Xfinity X1 with Voice Remote: Demonstrates personalization (guide preferences), recommendations, and voice navigation via the Comcast Voice Remote.
- Apple TV: Demonstrates über-aggregation and an ultra-simplified remote.
- Amazon Fire TV: Features an elegant voice remote and a categorized discovery guide (with recommendations based on viewing habits).
- PlayStation 4 (with 2 gaming controllers) featuring PlayStation Now: An ecosystem for games, movies, music, etc.
The Evolution of 3D Experiences in Head Mounted Displays/Mobile Devices
This exhibit shows the progression of virtual reality in the head-mounted display form factor. This is probably one of the fastest-growing areas of technology and, by all accounts, is poised to become a significant consumer trend.
- Google Cardboard (used with any smartphone): Google Cardboard was handed out free last year at Google I/O as an example of a low-cost system to encourage interest and development in VR and VR applications.
- Google Glass: 2nd Generation: The last beta generation of Google Glass. Google Glass was a large-scale experiment meant to “free people from looking down at their cell phones — and look up.”
- Oculus Rift (version 2, non-functional): Represents the state of development in 2012.
- Samsung Gear VR headset (low-cost headset, runs on an Android phone): Demonstrates the growing number of 3D-interactive titles and games as well as 360° content. As a $99 accessory to a smartphone, this product is expected to be the tipping point that brings virtual reality experiences to the masses.
Evolution of the Smart Watch
Like the mobile OS battles, Apple and Google are set to compete for the wrists of consumers — and life-style brand status. Visitors to the MEL lab can gain a sense of how quickly the wearable market has been populated and is evolving.
- 2012 – MetaWatch: The first smartwatch of the latest wave of smartwatches.
- 2013–14 – Android Wear watches (built from the same SDK, different brand UX; requires an Android phone to work): Smartwatches with apps, activity tracking, mobile notifications, and a marketplace of 4,000+ apps.
- Motorola Moto 360 – Round watch: The first round interface (note the UX).
- Samsung Gear: Samsung’s interpretation of the Android smartwatch SDK.
- 2015 – Apple Watch (requires an iPhone to work): A watch aimed at the luxury experience, featuring a high-end display, premium materials, Apple Pay, and 8,500+ apps.
Tablet Experiences (Apps)
- Watchable: Comcast’s new, cross-platform video service that curates a selection of the best content from popular online video networks and shows in an easy-to-use experience, with no cable subscription required. It specifically curates digital content that will appeal to younger audiences than standard cable does.
- HBO Now: OTT viewing experience, available to anyone who wants to run HBO on a mobile device.
- HBO Go: OTT viewing experience, available to HBO’s premium cable subscribers.
- Xfinity TV Remote: If you don’t want to use the remote control and your tablet is handier, there’s an app for that!
- Unstaged: American Express – 360 experiences
We believe that having access to a space like the MEL helps CableLabs employees form opinions on consumer technology that are based on facts as well as hands-on experience with real products. These experiences ground our insights and set the stage for innovation beyond what is already on the market.
The next time you are in Sunnyvale, make some time to stop by and experience MEL. No appointment is necessary, but be forewarned…sudden attacks of binge-watching have been known to afflict visitors and you might just be inspired to create a product on your own!
Mickie Calkins is the Director of Co-Innovation & Prototyping in the Strategy & Innovation Group at CableLabs.
Active Story: How Virtual Reality is Changing Narratives
I love story; it’s why I got into filmmaking. I love the archetypal form of the written story, the many compositional techniques used to achieve a visual story, and the layers of association that tell story with sound. But most of all, I love story because of its capacity to teach. Telling a story to teach an idea, it has been argued, is the most fundamental technology for social progress.
When the first filmmakers were experimenting with shooting a story, they couldn’t foresee the many creative camera angles used today: their films more resembled filmed stage plays. After decades of experimentation, filmmakers discovered more and more about what they could do with the camera, lighting, editing, sound, and special effects, and how a story could be better told as a result. Through much trial and error, film matured into its own discipline.
Like film over a century ago, VR is facing unknown territory and needs creative vision balanced with hard testing. If VR is to be used as a platform for story, it’s going to have to overcome some major problems. As a recent cover story from Time reported, gaming and cinema story practices don’t work in virtual reality.
VR technology is beckoning storytellers to evolve, but there’s one fundamental problem standing in the way: the audience is now in control of the camera. When the viewer is given this power, as James Cameron points out, the experience becomes a gaming environment, which calls for a different method to create story.
Cameron argues that cinematic storytelling can’t work in VR: it’s simply a gaming environment with the goggles replacing the POV camera and joystick. He’s correct, and as long as the technology stays that way, VR story will be no different from gaming. But there’s an elephant in the room that Cameron is not considering: the VR wearable is smart. The wearable is much more than just a camera and joystick; it’s also a recording device for what the viewer is doing and experiencing. And this changes everything.
If storytelling is going to work in VR, creatives need to make a radical paradigm shift in how story is approached. VisionaryVR has come up with a partial solution to the problem, using smart zones in the VR world to guide viewing and controlling the play of content, but story is still in its old form. There’s not a compelling enough reason to move from the TV to the headset in this case. But they’re on the right track.
Enter artificial intelligence.
The ICT at USC has produced a prototype of a virtual conversation with Holocaust survivor Pinchas Gutter (video). To populate the AI, Pinchas was asked two thousand questions about his experiences. His answers were filmed on the ICT LightStage, which also recorded full 3D video of him. The final product, although not perfect, delivers the feeling of an authentic conversation with Pinchas about his experience in the Holocaust. ICT has used a similar “virtual therapist” to treat soldiers with PTSD, and when the technology is used alongside real treatment, they’ve achieved an 80% recovery rate, which is incredible. In terms of AI applications, these are very compelling use cases.
The near future of VR will include character scripting using this kind of natural language application. Viewers are no longer going to be watching characters; they are going to be interacting with them and talking to them. DreamWorks will script Shrek with five thousand answers about what it’s like when people make fun of him, and when kids use it, parents will be better informed about how to talk to their kids. Perhaps twenty leading minds will contribute to the natural language of a single character on a single topic. That’s an exciting proposition, but how does it help story work in VR?
Gathering speech, behavioral, and emotive analytics from the viewer, and applying AI responses to them, is an approach that turns story on its head. The producer, instead of controlling the story from beginning to end, follows the lead of the viewer. As the viewer reveals an area of interest, the media responds in a way that is appropriate to that story. This might sound like gaming, but it’s different when dealing with authentic human experience. Shooting zombies is entertaining on the couch, but in the VR world, the response is to flee. If VR story is going to succeed, it must be built on top of real-world, genuine experience. Working out how to do this, I believe, is the paradigm shift of story that VR is calling for.
If this is the future for VR, it will entirely change how media is produced, and a new era of story will develop. I’m proposing the term “Active Story” as a name for this approach to VR narratives. In this blog, I’ll be expanding on this topic as VR is produced and tested at CableLabs and as our findings reveal new insights.
Eric Klassen is an Associate Media Engineer in the Client Application Technologies group at CableLabs.
Who Will Win The Race For Virtual Reality Content?
This is the second of a series of posts on Virtual Reality technology by Steve and others at CableLabs. You can find the first post here.
In the recent study we performed at CableLabs, we asked what would stop people from buying a virtual reality headset. High on the list was the availability of content. Setting aside VR gaming, people didn’t want to spend money on a device that only had a few (or a few hundred) pieces of content. So there has to be lots of it. Who is creating content, and how quickly will it show up?
Leave it to the Professionals?
Three companies have emerged as leaders in the space of professional content production:
- JauntVR who just secured another $66M investment
- Immersive Media who worked on the Emmy award-winning Amex Unstaged: Taylor Swift Experience
- NextVR who seemingly wants to become the “Netflix of VR.”
Each company is developing content with top-tier talent, each working to establish brand equity and to develop critical intellectual property. Another thing they have in common is that they have developed their own cameras to capture the content, yet they all say they are not in the business of building cameras.
JauntVR Neo Camera
The new Jaunt Neo camera appears to have 16 sensors around its periphery, which we might expect to yield 16K x 8K resolution. This is far in excess of what the Samsung GearVR will decode and display (limited to 4K x 2K at 30fps, or lower resolution at 60fps), but it shows the type of quality experience we might expect in the future. The NextVR camera uses six 4K RED professional cameras, each of which is a rather expensive proposition before you even buy lenses.
But there seems to be a problem of scaling up. To get to thousands of pieces of professional content, we will have to wait years for these three companies, and any new entrants, to scale up their staff and equipment and to engage in and produce enough interesting projects.
Or will consumer-generated content win the day?
On the flip side, we have seen the rate at which regular video is created and uploaded to YouTube (300 hours uploaded per minute). All we need is for affordable consumer-grade cameras to show up. The first cameras are already here, though their resolution leaves something to be desired. Brand-name cameras like the Ricoh Theta and the Kodak SP360 are affordable consumer-grade 360° cameras at around $300, but they only shoot HD video. When the 1920 pixels of HD video are stretched to fill the full horizon, all that can be seen in the VR headset is less than a quarter of them, or around 380 pixels. Most of the content we used in consumer testing was 4000x2000 pixels, and even that showed up as fairly low resolution. There are some startups entering the scene (like Bublcam and Giroptic), but these are not shipping yet and don’t advance beyond HD video for the full sphere of what you see. Perhaps more interesting in terms of quality are cameras coming from Sphericam and Nokia that promise 4K video at 60 frames per second, and an updated Kodak camera that supports 4K. These cameras reach a little beyond what the Samsung GearVR headset can decode (4K at 30fps), but they are also a little beyond the consumer price point, entering the market at $1,500. Oddly missing from the consumer arena is GoPro, but I think we can reasonably predict a consumer-grade 360° camera from them in late 2015 or early 2016 that is a little more manageable than the 16-GoPro Google Jump camera or the 6-camera rigs that hobbyists have been using to create the early content.
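The “around 380 pixels” figure follows directly from the geometry of equirectangular video: the source width is spread around the full 360° horizon, so only the slice inside the headset’s horizontal field of view is on screen at any moment. Here is a quick sketch, assuming an effective horizontal FOV of about 71° (the value that reproduces the figure above; actual headset FOVs vary):

```python
# How many horizontal source pixels actually land inside the headset's
# field of view, for an equirectangular 360-degree video.
def visible_width(source_width_px, fov_degrees):
    """Horizontal pixels visible at once, given the headset's FOV."""
    return source_width_px * fov_degrees / 360.0

for width in (1920, 4000):
    px = visible_width(width, 71)  # ~71 deg effective FOV (assumed)
    print(f"{width}-px-wide source -> ~{px:.0f} px across the view")
```

Even a 4000-pixel-wide source yields well under 800 pixels across the view, which matches the “fairly low resolution” impression from our testing.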
YouTube and Facebook 360 are already here
Facebook launched support for 360° videos in timelines just last week. YouTube launched 360° support in March and, in six months, already has about 14,000 spherical videos, with few mainstream cameras available. I strongly suspect that user-generated content will massively outnumber professionally produced 360° content. Our studies suggested that while consumers guessed they would be willing to wear a VR headset for 1-2 hours, short-form content would be favored. This matches the predominant YouTube content length well. While Samsung has created its own video browsing app (MilkVR), a YouTube VR application is sure to emerge soon, allowing us all to spend hours consuming 360° videos in a VR headset.
What Content Appeals?
Shooting interesting 360° content is hard. There is nowhere to hide the camera operator. Everything is visible, and traditional video production focuses on a “stage” that the director populates with the action he or she wants you to see. We had a strong inkling that live sports and live music concerts would be compelling types of content, and might be experiences people would value enough to pay for. We tried several different types of content when we tested VR with consumers, trying to cover the bases of sports, music, travel, and “storytelling”: fiction in short or long form. We were surprised to see the travel content we chose emerge as the clear winner, leaving all the other content categories in the dust on a VR headset. We gained lots of insights into the difficulty of producing good VR video content (like not moving the camera and not doing frequent scene cuts), and into the difficulty of creating great storytelling content that really leverages the 360° medium. Let’s just say I am not expecting vast libraries of professionally produced 360° movies any time soon.
The ingenuity of the population in turning YouTube into the video library of everything is likely to play out here too, with lots of experimentation and new types of content becoming popular. The general public, through trying lots of different things, is most likely to find the answers for us. For that to happen, mass adoption of VR headsets has to take place, and consumer 360 cameras have to become cheap and desirable. Samsung announced the “general public” version of their headset at the Oculus Connect 2 last week, with a price point of $99 over the cost of their flagship phones. Oculus expects to ship over 1 million headsets in their first year of retail availability. Cameras are already out there, inexpensive, and getting better. In 12 months I expect over 100,000 360° videos to be published on YouTube.
Is the killer app for VR… TV?
Netflix just launched their app on the GearVR. Here at my desk I can watch movies and TV episodes on a VR headset, sitting in a ski lodge with views out the window of ski runs. I’m sitting in front of a (virtual) 105” TV, enjoying an episode of Sherlock. The screen quality is pretty grainy, but I quickly forget that and become engrossed in the episode -- until my boss taps me on the shoulder and tells me to get back to work. I thought this was a pretty dumb use case for the technology, but I was wrong. Dead wrong. The only problem is that I can’t see the popcorn to pick it up and get it into a mouth I also can’t see. There is lots to ponder here.
Steve Glennon is a Principal Architect in the Advanced Technology Group at CableLabs.
Is Virtual Reality the Next Big Thing?
In the Advanced Technology Group at CableLabs I get to play with all the best toys, trying to work out what is a challenge or an opportunity, and what is a flop waiting to be unwrapped. 3D is a great case study. I first got to work with 3D TV in 1995 (in a prior life), with an early set of shutter glasses and a PC display. Back then it faced the chicken and egg problem - no TVs, and no content. Which would come first?
I still remember the first 3D video I saw in 1995. It was high quality, high frame rate, and beautiful, but not compelling. I still didn’t find it compelling in 2010 when the 3D TV/content logjam was broken.
I remember the first time I saw Virtual Reality video at CES in January 2014. My reactions were starkly different. The first VR Video I saw was grainy, blurry, and a little jerky. And it was incredibly compelling. I was hooked.
You can’t talk about VR - you have to experience it
Lots of people talk about VR - they “get it” in their heads. You see video playing, and you can look all around. And you can look stupid with a headset on. But you don’t really get it until you experience some good content in a good headset.
This doesn’t convey what it feels like to see elephants up close. Everyone loves elephants, and with a headset you feel like you are meeting one for the first time, in its natural habitat.
And this doesn’t convey the reaction of stepping back to avoid getting your feet wet in icy cold Icelandic water.
So we have been doing a couple of things. We have been showing the headsets to people at the CableLabs conferences, where we got almost half the attendees to try on a headset and experience it first hand. We also wanted to put some market research behind it, so we designed some consumer panels to get regular consumer feedback across a representative demographic cross-section.
VR Is Only For Gamers. Or Not.
A lot of the early focus for the Oculus Rift (and Crystal Cove) headset has been around gamers and gaming. It is the perfect add-on for the gamer, where every frame is rendered in real time and you can go beyond “looking around” on the PC screen to just looking around. I had the pleasure of getting a tour and demo of the HTC Vive headset at Valve’s headquarters in Bellevue, and there you can walk around in addition to looking around. It is the closest thing to a Holodeck that I have ever experienced. I think we can reasonably expect these to go mainstream within the gaming community.
But the consumption of 360° video has much more importance to cable, as mentioned by Phil McKinney in this Variety interview, because of the need for high bandwidth to deliver the content. We’ll look at that in more detail in a future blog post.
Rather than use our own tech-geek, early-adopter attitudes to evaluate this, we wanted to get a representative sample of the general population and ask their opinions. So that’s what we did. With some excellent 360° video content from Immersive Media, a former Innovation Showcase participant, we asked regular people with regular jobs: teachers, nurses, students, physical therapists, personal trainers, and a roadie. We considered several different display technologies to compare against, and showed them the content on the Samsung GearVR. Here’s what they told us:
- 57% “had to have it”.
- 88% could see themselves using a head-mounted display within 3 years.
- Only 11% considered the headset to be either uncomfortable or very uncomfortable.
- 96% of those who were cost-sensitive would consider a purchase at a $200 price point (the GearVR is a $200 add-on to the Samsung Galaxy Note 4 or Galaxy S6 phone).
So this seems overwhelmingly positive. There is the novelty factor to take into account, but we were surprised by how few expressed any discomfort and how positively regular people described the experience.
“VR is the most Important Development since the iPhone”
I had the distinct pleasure of spending time on stage with Robert Scoble during his keynote at the recent CableLabs Summer Conference. We discussed the state of the art in head-mounted displays, 360° cameras and content (we’ll talk more about that in later blog posts) but he expressed this sentiment (paraphrased): “Virtual reality is the most important technology development since the original iPhone”. I hadn’t thought about it that way, and now I agree with him. This is not just hot and sexy, a passing fad. It has massive potential to transform lots of what we do, and we can all expect incredible developments in this space.
I want to meet Palmer Luckey
Palmer Luckey created his first prototype head-mounted display at age 18. Four years later he sold Oculus to Facebook for $2 billion, having dropped out of school to build more prototypes. I didn’t see the Facebook deal coming, and didn’t understand it. Now I get it. I want to meet him and thank him for transforming our lives in ways we cannot imagine. We just haven’t witnessed it yet. We will explore more in the next few posts.
Steve Glennon is a Principal Architect in the Advanced Technology Group at CableLabs.