CMU Video Conference System Gets 3D From Cheap Webcams
Hesham writes "Carnegie Mellon University's HCI Institute just released details on
their "why-didn't-I-think-of-that-style" 3D video conferencing application. Considering how stale development has been in this field, this research seems like a nice, solid step toward immersive telepresence. I was really disappointed with the "state-of-the-art" systems demoed at CES this year — they are all still just a flat, square video stream. Hardly anything new. What is really cool about this project is that the researchers avoided building custom hardware no one is ever going to buy, and explored what could be done with just the generic webcams everyone already has. The result is a software-only solution, meaning all the big players (AIM, Skype, MSN, etc.) could release this as a simple software update. 'Enable 3D' checkbox, anyone? YouTube video here. Behind the scenes, it relies on a clever illusory trick (motion
parallax) and head-tracking (a la Johnny Lee's Wiimote stuff — same
lab, HCII). It was just presented at IEEE International
Symposium on Multimedia in December."
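For the curious, the motion-parallax trick the summary describes can be sketched in a few lines: split the scene into a near layer and a far layer, and shift the near layer more than the far one as the tracked head moves. This is only a toy illustration of the principle (plain NumPy, with made-up layer images and a made-up `depth_ratio` parameter; none of this is the CMU code):

```python
import numpy as np

def parallax_composite(background, foreground, mask, head_x, depth_ratio=0.3):
    # The near layer (the person) shifts with the tracked head; the far
    # layer (the room) shifts by only a fraction of that, which is the
    # whole motion-parallax illusion.
    fg_shift = int(head_x)
    bg_shift = int(head_x * depth_ratio)
    frame = np.roll(background, -bg_shift, axis=1).copy()
    fg = np.roll(foreground, -fg_shift, axis=1)
    m = np.roll(mask, -fg_shift, axis=1)
    frame[m] = fg[m]  # paste the shifted person over the shifted room
    return frame

# Toy 8x8 grayscale scene: flat grey room, bright "person" in the middle.
bg = np.full((8, 8), 50, dtype=np.uint8)
person = np.zeros((8, 8), dtype=np.uint8)
person[2:6, 3:5] = 200
frame = parallax_composite(bg, person, person > 0, head_x=2)
```

With `head_x=2`, the person slides two columns while the background barely moves, which is all the depth cue the effect needs.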
Re: (Score:1)
It's not gonna give you a true 3D sensation since the image will appear identical to both eyes.
It's basically using your cam to track your head, then using software to munge the incoming images from the other person's two cameras as if your head were at that spot between the two, kind of like setting the fade and balance for audio, but for video.
But it'll still be a flat image.
But "it's very neat!"
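The fade-and-balance analogy can be made concrete: a per-pixel crossfade between the two incoming camera frames, weighted by where the tracker thinks your head is. This is only an illustrative guess at the simplest possible approach, not the actual implementation:

```python
import numpy as np

def blend_views(left_frame, right_frame, head_pos):
    # head_pos works like an audio balance knob: 0.0 = pure left camera,
    # 1.0 = pure right camera, 0.5 = an even mix of the two.
    w = float(np.clip(head_pos, 0.0, 1.0))
    mixed = (1.0 - w) * left_frame.astype(np.float32) \
          + w * right_frame.astype(np.float32)
    return mixed.astype(np.uint8)

left = np.full((4, 4), 100, dtype=np.uint8)   # stand-in for camera 1
right = np.full((4, 4), 200, dtype=np.uint8)  # stand-in for camera 2
mid = blend_views(left, right, 0.5)           # head halfway between cameras
```

A plain crossfade like this ghosts badly on misaligned features, which is presumably why the real system warps the images rather than just mixing them.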
Re: (Score:2)
Agreed. The next step is to make a virtual mannequin head and map the face onto that. (with a very small number of knobs for fitting size and orientation) Like that Disney ride with the ghosts.
And after that, a few tricks to change the virtual viewpoint so it looks like you're looking at the camera and not the picture of the other person.
Re: (Score:2)
No. The next step is to move the technology into a FPS. Imagine actually being able to look around the corner by...looking around the corner.
Plus, my wife will no longer be able to laugh at me for leaning around in my chair when I play.
2.5D, not 3D (Score:5, Insightful)
For what it's worth, I really don't care for this effect at all. I am not denigrating its inventors in the slightest; this is a novel (read: low-cost) approach, and I am sure some people would enjoy having this in their iChat/AIM/Skype. To me, it's the equivalent of Apple's Photo Booth filters (fisheye, inverted colors, etc.) -- a cheap parlor trick that seems nifty for about 5 seconds and then becomes seriously distracting. True 3D has its own issues with distraction and visual anomalies (leading to headaches, etc.). Even the best 3D cinematographers around have to be very careful to avoid these issues (for instance, Vince Pace, who shoots 3D for James Cameron (Titanic, Terminator, etc.), has plenty of headache-inducing scenes in his demo reel, and this is a guy with state-of-the-art facilities who has as much knowledge as anyone about how to do stereoscopic cinematography). Frankly, I think video conferencing is best left 2D, and any effort toward improving it should be spent increasing framerate/resolution (and reducing lag and dropped frames).
Re: (Score:3, Interesting)
I'm with you - while my inner geek wants to give the developers credit and is impressed, the result is not something I'd want to actually use short of screwing around with it for a few minutes.
Even if it were improved to the point where it was "perfect", it would still be just a cool trick and not a killer feature.
Re: (Score:2)
One thing I found was that the blackness around the edges was annoying... it gave the impression of 2.5D-ness to someone who would otherwise have considered it 3D. I'm talking Mr. LameO suddenly installing 3DChat-2009 and immediately recognizing it for a "parlor" trick.
One small thing that would go a long way toward alleviating this would be cropping off the edges on both sides (only in the viewing windows)... it would make for a much more realistic experience.
Also, I disagree that it's something you wouldn't want... if
Re: (Score:2)
You don't think it would still look like "Viewmaster" 3-D, even if they trimmed the edges? To me it was very obvious that a flat person was being moved around on a flat background, making things more cartoonish rather than more realistic.
But maybe you are right and I would like the effect if it were polished.
Re: (Score:2)
With the background being a good distance away (> 10 or 15 feet) and the person within 2 feet, an object moving around a flat background is a good approximation of real 3D.
I've a very good hunch that in a blind test, a real 3D environment and this 2.5D effect would prove pretty much indistinguishable.
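A quick back-of-envelope check of that hunch, using the similar-triangles rule that a point's apparent sideways shift is roughly (head motion) / (distance to the point). The distances below are just the parent's example numbers:

```python
# Apparent sideways shift of a point when the viewer sways is roughly
# (head motion) / (distance to the point), by similar triangles.
head_motion_ft = 0.5  # viewer sways half a foot

def apparent_shift(depth_ft):
    return head_motion_ft / depth_ft

person_shift = apparent_shift(2.0)    # person at 2 ft
bg_near = apparent_shift(10.0)        # nearest background point
bg_far = apparent_shift(15.0)         # farthest background point

# Depth variation *within* the background contributes only this much
# parallax spread, versus the person's much larger shift:
spread = bg_near - bg_far
```

The background's internal spread works out to under 7% of the person's shift, so flattening the background into a single plane really should be hard to spot.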
Re: (Score:3, Insightful)
Why not just use 2 webcams a red/blue filters and a camera on the other end?
It'll be slightly annoying wearing the glasses, but it'll be much more 'real' than what this appears to be. Set the cameras eye width apart for realism or farther to make the effect more predominant.
Re: (Score:2)
glasses, glasses on the other end...
Re: (Score:2)
Cause with Red/Blue glasses you only see black and white. I would take the colour video over the 2.5D/3D effect any day.
Re: (Score:1)
Re: (Score:1)
Re:2.5D, not 3D (Score:5, Informative)
First off, the image would be an ugly red/blue mess. Secondly, even if you used one of the more advanced shutter-glasses or polarized 3D techniques, you'd still end up looking at someone wearing goofy 3D glasses obscuring eye contact. Don't get me wrong, I have no problem with wearing 3D glasses when playing games or watching a movie, but not when I'm trying to converse, face to face, with someone.
Re: (Score:3, Funny)
Red/blue contacts.
Re: (Score:1)
I've seen a couple of RealD movies recently and I liked the effects, and I wonder if there is any reason they couldn't make contacts like that? I've never worn contacts (and my wife didn't like hers before she had laser surgery), so it might not be worth it, but I think it could be cool for certain uses.
Also, does anyone know if they can make games, etc using this tech? I tried to find something about it the other day and everything was about the red/blue system and that frankly sucks compared to the more
Re: (Score:1)
Nike actually has MaxSight contacts [see2020now.com] that act like shades. Shouldn't be too hard to make something similar with red/blue, and it'd look even more freaky.
Dunno if polarisation lenses can be done as contacts, since those have to be exactly right, rotation-wise.
Re: (Score:2)
Dunno if polarisation lenses can be done as contacts, since those have to be exactly right, rotation-wise.
Most modern polarized monitors use circular polarization, which could be easily implemented in contacts. Linear polarization was an issue even with fixed glasses, as a little tilt of the head would blend the views together and make you feel sick.
Re: (Score:2)
As the other poster mentioned, modern 3D movie glasses from RealD use circular polarization. I'm not entirely familiar with how that works, but it sounds like it might do the trick. Also, my understanding is that in the case of costume/club contact lenses, like cat's eyes, they make minor changes to the shape of the contact lenses in order to keep them in the right orientation. I think that would also be needed in order to make contact lenses that correct some vision problems like astigmatism, but I could be wr
Re: (Score:2)
Who the hell uses 'red blue' 3d techniques anymore?
The DM is not always right.
Re: (Score:2)
Not many people. It was, however, included as a feature in the newer nVidia stereo 3d drivers. It was also, coincidentally, what the OP suggested and what I was responding to.
this was released at CES (Score:2)
and I can't believe no one else has mentioned it
http://www.minoru3d.com/ [minoru3d.com]
it comes with red-blue glasses for the purchaser to send to people that they intend to use it with.
Re: (Score:2)
http://www.minoru3d.com/ [minoru3d.com]
I can't believe I'm the first to know this...
released at CES THIS year... it uses red/blue and comes with crappy glasses
Re: (Score:2, Offtopic)
Unless it's fractal. Actually, that's the definition.
About a million miles off topic, admittedly, but there you go...
Re: (Score:1)
Re: (Score:2)
What is a complex number?
Math is all kinds of weird.
Re: (Score:3, Informative)
Games with faked 3D are known as "2.5D" -- most notably, most side-scrolling fighting adventure games, like the Teenage Mutant Ninja Turtles series for the NES.
It's not pure 2D like the Mario/Metroid NES/SNES games, but it's not pure 3D either.
Re: (Score:2)
Pretty much any 2D game that uses some sort of trickery to emulate 3D gameplay is 2.5D.
I submit this review as evidence of the aforementioned NES/SNES games being considered 2.5D by the gaming industry:
http://www.escapistmagazine.com/videos/view/zero-punctuation/222-XBLA-Double-Bill [escapistmagazine.com]
Doom was a "3D game" in that all of the brush work was actually drawn in 3D, even though all of the entities were sprites. Each point on the map had only one height value, but the point is that different points on the map could h
Re: (Score:2)
Someone should read about fractals [rice.edu].
Re: (Score:1)
There was no real up/down... you could never pass under a bridge; you could only move on a 2D plane.
But this plane was deformed to give you the illusion that you were moving in 3D... clever!... and, yes, 2.5D.
Re: (Score:2)
Not really. Think about games like Doom: you moved on a plane that happens to be deformed in three dimensions.
There was no real up/down...
Yes there was.
you could never pass under a bridge,
Yes. A map design and rendering engine limitation.
And you could only move on a 2D plane.
Not exactly. You could move freely in a 3D space limited to being an arrangement of connected non-overlapping rooms, with different floor and ceiling heights. But it was possible to freely move in 3D within that space, within the confines o
Re: (Score:1)
You could *never* dodge a projectile... that's why you never need to aim up/down.
Ceiling effect was done in a similar way.
Doom was just an extension to Wolfenstein, but the rendering was similar, some sort of "ray trace" on a plane.
I insist, try to find *any* map where you could cross under *and* over a bridge.
I built a couple of deathmatch maps of my own, and that's how I remember it... it's been a while, though, so I could still be proved wrong.
Do Joh
Re: (Score:2)
Sorry to say, but you were fooled (just like all of us).
No, actually I wasn't.
You could *never* dodge a projectile...
Of course you could. Find a monster with a slow moving projectile... an imp, or those things that threw the green balls, or the big floating heads... and find a steep staircase. They'd fire at you, and it would come right at you at whatever your elevation was (auto aim), and you could trivially dodge it simply by moving forwards down staircase, and the projectile would fly harmlessly over your
Re:2.5D, not 3D (Score:5, Insightful)
I agree with you: having this kind of 2.5D experience is neat but not particularly useful.
But I wonder if this software could be adapted to do something else... One of the things that most people dislike about webcam-conferencing is that the other person is never looking "at" you. They are looking on their screen at an image of you, so they are not looking directly at their camera, and so on your end they seem to be looking away from you. (And they see you looking away from them, too.)
While this may seem trivial, it is actually a significant roadblock to person-to-person tele-communication. People rely on body language and eye contact to gauge each other's moods, to really "connect". Webcam-conferencing forces us to violate social conventions (like looking into people's eyes), which can be anywhere from subconsciously bothersome to somewhat distracting, or even perceived as insulting.
So what I would like is a multi-camera system that uses similar kinds of interpolation to rebuild the image of the person so that they are looking directly at the camera. So if I put one webcam on either side of my screen, they can combine their images to create a shifted image where I am looking directly at the viewer on the other end.
Though it is a rather small and subtle addition to tele-conferencing, I believe it would have a bigger impact than what TFA seems to be showing. I think it would make the interaction "more real."
Re: (Score:3, Interesting)
Or, just put the stream of the conferenced person just below/above and centred on the camera. I've operated Access Grid a couple of times and this is the first thing that I do.
Re: (Score:2, Insightful)
I did a study about this gaze problem and a possible algorithmic solution, for a videoconf specialist about one year ago.
My conclusion:
no algorithm was/is/will be suitable to combine any point of view with any other point of view. Consider an object occluded in each actual point of view but not occluded in your virtual view (the combination of the two actual views): there is no solution but to guess, the areas to guess can be very wide, and the situation is very frequent (if two objects are near, like your hand and your body f
Re:2.5D, not 3D (Score:4, Informative)
Geometric view interpolation is not unknown in the labs right now, and in some cases is being researched for exactly the reason you suggest. As another poster suggested, there are certainly some cases where the interpolation will break down. (Put a hand in front of each webcam at the side of your monitor, and it won't interpolate two palms to look like your face, for example.) Another one is that anything transparent makes it impossible to estimate the depth at a particular point, because there are actually two depth values there. So the smoke from your cigarette, which is an amorphous volume of semi-transparency through which you can see a window, the schmutz on the window, a reflection on the window, and something through the window, will just ruin any chance of doing the interpolation properly. When you try to shift the pixel correctly to accommodate the view shift, you get like seven different answers for what direction it is supposed to go.
Still, look up The Foundry's "Ocula" system for 3D cinematography. It's a shipping commercial product that does a lot of strong magic with stereoscopic imagery on a daily basis. (Which I would have assumed was currently impossible.)
It's too slow to be used for real-time conferencing; you let it cook overnight to compute disparity maps offline for a single shot, or a handful of shots. It needs to be at least an order of magnitude faster to be practical for real-time work. Thankfully, there are a lot of researchers trying to figure out clever hacks to speed up these sorts of things, and a lot of engineers figuring out ways to build stonking GPUs to run OpenCL in a year or two. Expect stereo stuff to become mainstream somewhere around 2011-2012 would be my guess.
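For anyone wondering what "computing disparity maps" involves, here is a deliberately naive one-scanline block-matching sketch (the window size and search range are arbitrary choices, and real systems like Ocula are far more sophisticated). It also shows why transparency breaks things: the method bakes in the assumption of exactly one depth per pixel.

```python
import numpy as np

def disparity_1d(left_row, right_row, window=3, max_disp=5):
    # For each pixel in the left scanline, slide a small patch across the
    # right scanline and keep the horizontal offset (disparity) with the
    # lowest sum-of-squared-differences cost. One depth value per pixel
    # is assumed -- exactly the assumption that transparency violates.
    n = len(left_row)
    half = window // 2
    disp = np.zeros(n, dtype=int)
    for x in range(half, n - half):
        patch = left_row[x - half:x + half + 1].astype(np.float32)
        best_cost, best_d = np.inf, 0
        for d in range(max_disp + 1):
            if x - d - half < 0:
                break
            cand = right_row[x - d - half:x - d + half + 1].astype(np.float32)
            cost = float(np.sum((patch - cand) ** 2))
            if cost < best_cost:
                best_cost, best_d = cost, d
        disp[x] = best_d
    return disp

# A feature at x=6 in the left view appears at x=4 in the right view,
# so its disparity should come out as 2.
left = np.zeros(12, dtype=np.uint8)
left[6] = 255
right = np.zeros(12, dtype=np.uint8)
right[4] = 255
d = disparity_1d(left, right)
```

Dense versions of this per-pixel search over full frames are exactly the part that needs to get an order of magnitude faster for real-time use.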
Re: (Score:2)
My first idea in response to this was to put the camera somehow behind the display. Maybe by having a translucent display or perhaps there is some technology out there in which the display emitters could also be used as detectors.
So I jump on to Google and it turns out Apple has already patented [appleinsider.com] my idea. How did that pass the test of novelty and non-obviousness for a patent claim?
Re: (Score:2)
So what I would like is a multi-camera system that uses similar kinds of interpolation to rebuild the image of the person so that they are looking directly at the camera. So if I put one webcam on either side of my screen, they can combine their images to create a shifted image where I am looking directly at the viewer on the other end.
Sounds overly complicated. Why not just put the camera behind the screen, so the user is actually looking directly at the camera, rather than faking it?
Re: (Score:2)
Not to rain on your parade, but there's just no way this would work with current full-sized monitor tech.
A CRT would need to have, at the very least, some optics embedded into the tube so that the camera itself could remain outside, and then you're interfering with the beam no matter where you put those optics. Besides, CRT's are pretty much obsolete except for a few corner cases.
An LCD is out because you'd have to poke a small hole in the backlight reflector and diffusing layers for the camera to see out
Re: (Score:2)
I think getting the whole "eye contact" thing worked out would be much more useful as a way of making the experience feel more "natural". I am used to looking at people's faces and having them look at mine when I chat with someone in person. Video chatting requires you to either get accustomed to people looking over your head/off to the side the whole time, or only watch the video with your peripheral vision so that the person you are chatting with
Re: (Score:3, Insightful)
Very cool, and I like the fact that they give the webcam a double function. The 2.5D effect against a static background is indeed only a novelty, but I see there is some confusion between 3D and stereo vision.
I agree with most writers that stereo vision induces headaches, which are simply due to the fact that each eye sees a different image, which gives your brain a depth cue, yet your eyes' focal point (on your screen) conflicts with that depth cue, thus resulting in a headache. It is unavoidable with normal screen
The tech is cool, sure.. (Score:5, Insightful)
Re: (Score:2)
I kept thinking of Stevie Wonder, and after watching that I understand why only he can move like that; it's extremely visually disconcerting and headache-inducing.
HEX
Re: (Score:1)
Did any body else read that as... (Score:1)
Brilliant (Score:1)
Re: (Score:2)
[citation needed] or else corps will try to patent it.
same as wii head tracking vr (Score:1, Redundant)
Re: (Score:2)
i thought that failed if you have two eyes?
Re: (Score:1)
Johnny Lee didn't invent this; it's been done tons of times before he even started his PhD thesis on using the Wiimote.
Look at the paper "3D display based on motion parallax using non-contact 3d measurement of head position"
All Dr. Lee did was a much simpler demo with a 3D box and 2D sprites, using a Wiimote instead of a camera, and now everyone worships him like he's so amazing for it.
If you did some research into it you'd realise that his demo sucks and if you read his paper he doesn't go into any detail a
Re: (Score:2)
Duh, of course he didn't invent anything, he just hooked a wiimote up to a PC and used it to provide positioning for a camera in a virtual scene. Nothing special there.
The reason it wowed everyone is A) because nobody at Nintendo thought to demo it first, and B) it let everyone at home do the same thing for way cheaper than before.
Re:Game control? (Score:4, Informative)
John Carmack prototyped this a few years back. His conclusion at the time was that there was too much lag in the system to make it really useful.
Re: (Score:2)
Yeah, FPSs that wish to implement leaning have established a convention by now of using Q and E for that purpose. Call of Duty was one of the first games I played to implement it.
Re: (Score:2, Interesting)
Much better/clever implementation than for video conferencing.
Come on... be honest, everyone has done that unconsciously in Counter-Strike... even without a webcam
Re: (Score:2)
The lag wasn't due to CPU speed - it was due to cumulative delays in the webcam itself, the USB bus, and only a tiny bit of image processing. I think his analysis was done on his .plan proto-blog, way back. I have no idea where it might be archived these days.
I know that even today, when capturing video from a USB camera, I can see a noticeable delay between when I move an object and when I see it moving on the screen, so I don't think that much changed since then. The only video capture setup I'm aware of
Re: (Score:2)
My setup isn't that cheap, I would guess that most people's setups would be cheaper, and I don't see how you can speculate on the amount of lag I'm experiencing when all I said was that it was noticeable, which it is.
Anyway - if you want to implement it, go right ahead. Don't let some guy on Slashdot stop you.
Re: (Score:2)
1. Cite? 2. Because as everyone knows, as time goes on, CPU doesn't get faster and RAM doesn't get bigger.
Re: (Score:2)
Sorry, can't find the place where I saw this. The closest I came was this:
http://doom-ed.com/blog/1999/11 [doom-ed.com]
This is an archive of his old .plan updates in blog form. I know that the actual .plan updates are archived somewhere on www.bluesnews.com, but I can't figure out where they are. That post just mentions that he started working on it, but there's no followup there. I do remember reading a followup somewhere else some time later, and he mentioned the latency issue.
The latency had nothing to do with the CPU
Re: (Score:1)
Re: (Score:2)
There's even an open source work-a-like: http://www.free-track.net/english/ [free-track.net]
Bandwidth reduction? (Score:3, Interesting)
I wonder if a more practical use would be to use the technique for video bandwidth reduction. If you know where the person is, you could concentrate video bandwidth on the face region, while keeping the rest of the "video" relatively static. No point in continuously compressing and sending boring background. Of course many codecs already do temporal compression that gives a similar effect, but this might increase the efficiency for video chat.
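As a toy illustration of that idea (not how any real codec works), one could keep full precision inside a detected face box and quantize everything else coarsely before encoding: fewer distinct levels in the background means fewer bits after entropy coding. The face box here is just assumed to come from the head tracker:

```python
import numpy as np

def roi_quantize(frame, face_box, bg_step=32, face_step=2):
    # Quantize the background to coarse levels (cheap to encode) while
    # the face box keeps near-full precision. face_box is (top, bottom,
    # left, right), assumed known from the head tracker.
    t, b, l, r = face_box
    out = (frame // bg_step) * bg_step           # coarse everywhere
    out[t:b, l:r] = (frame[t:b, l:r] // face_step) * face_step
    return out

frame = np.arange(64, dtype=np.uint8).reshape(8, 8)  # stand-in image
coded = roi_quantize(frame, (2, 6, 2, 6))
```

Background pixels collapse to multiples of 32 while the face region keeps nearly all its detail, which is the same bit-allocation trade-off the parent describes.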
Great (Score:1)
Now I can see everyone's zits in 3d.
Not good enough (Score:1)
The reason we'd be moving left to right would be to see something which isn't in the frame.
An idea I have been thinking about for a while is to have the remote camera move when the user on the other side does. This would be much more convenient than having to ask the person to keep adjusting the camera angle to see outside the frame.
I worked on that too. Look at these vids... (Score:4, Interesting)
Inspired by Johnny Lee's stuff, I pulled some old code out over a year ago and turned it into a decent engine that handles multiple screens and head tracking (TrackIR) to achieve the motion parallax effect. As with all 3D effects, it needs to be seen in person, but the following videos give you a good idea.
Have a look at these demo videos and you can even download a demo:
My first test
http://nz.youtube.com/watch?v=X8PevTuEWlg [youtube.com]
More accurate tracking
http://nz.youtube.com/watch?v=yf1hu6GLmf0 [youtube.com]
Multi screen study
http://nz.youtube.com/watch?v=ZBdtPz2V_vY [youtube.com]
Engine complete
http://nz.youtube.com/watch?v=ku76aHq3pps [youtube.com]
Download Demo
http://vandinther.googlepages.com/virtualwindow [googlepages.com]
Re: (Score:2)
I just ran your demo, quite nice although a click-drag on the head (instead of fly) would be more educational.
Since head tracking has a common solution, there's no need for IR (although its precision is better). You should open-source this and get it connected to standard head tracking. It'd be quite nauseating, even with the lag, but that's a compliment in this area.
DrunkChat (tm) (Score:2)
Ma Ma Ma Max (Score:2)
The Wonder cam arrives. (Score:1)
Not to mention some schmuck in the US will soon sue because it made them puke from motion sickness.
Why not just use two webcams? (Score:1)
In fact I've seen web cam kits with 2 in the package.
That would let you have true parallax, AND would have the benefit of making it appear that you are looking at the viewer.
Solves the two main problems I see being discussed here for an extra $29.95 or so.
Plus, it would make cool things like 3D position tracking possible (think Minority Report).