Monday, February 21, 2005

Video

My family went on vacation last week. Let's say that I would like to share highlights of my vacation with you in a 10-minute video.

If I want to share snapshots with you, the technology that we have today makes it pretty easy -- almost trivial. Video, on the other hand, is incredibly hard today. Here are the steps that I would have to go through (let's call this the "normal process"):
  1. I would shoot my raw footage with my camcorder.
  2. I would then have to upload the footage into a laptop. If I shot an hour of raw footage, it would take more than an hour to upload it because I have to play the tape back through the FireWire connection at the normal tape play speed. I also have to break the tape up into shots or segments as I go.
  3. Uploading an hour of footage would require that I have about 20 GB of free hard disk space on my laptop -- it has only been in the last two years that a "normal" laptop would have that much free space.
  4. Then I would have to open a video editing program and sort through all of the raw footage. I would snip out the little pieces of footage that I wanted to share with you and then connect them together. This page discusses the most rudimentary aspects of video editing. I could spend anywhere from an hour to a week doing this depending on how elaborate I want to get in the edited piece.
  5. Then I would render the file to create the final output. This process would yield a 3 to 4 GB AVI file.
  6. Today there is no good way for me to get a 3 to 4 GB file to you in any finite amount of time. Therefore I would have to encode the file to try to shrink it. I could shrink it to anywhere between 10 MB and 100 MB depending on the quality I would like for the file to have when you watch it.
  7. Then I would have to get to a place where I can upload the encoded file to the internet. Today this means that I would need to find a WiFi Hotspot. I would probably upload the file to a Web site, although I might upload it to BitTorrent to try to reduce my bandwidth costs. If I use BitTorrent, I've cut the potential audience way down, but it's the only choice I have if my home video is popular.
  8. Then you download it to your machine (either directly from my Web site or through BitTorrent) so you can watch it.
That is sad. Painfully sad. As you can see, it is a total pain in the butt for me to share home video with you today. Most likely it would take me a full day (possibly a lot more than a full day depending on how much editing I want to do) to create and transmit a 10-minute video to you.
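
To put rough numbers on the transfer problem in steps 6 and 7, here is a minimal sketch that estimates how long each file size mentioned above would take to upload. The 384 kbps uplink speed is purely an assumption for illustration (it is not a figure from this post), and the function name is made up for the example.

```python
def transfer_time_s(size_mb, link_kbps):
    """Seconds needed to move size_mb megabytes over a link of link_kbps kilobits/s."""
    bits = size_mb * 8 * 1_000_000       # decimal megabytes -> bits
    return bits / (link_kbps * 1_000)    # kilobits/s -> bits/s

UPLINK_KBPS = 384  # assumed upload speed; substitute whatever your connection offers

files = [
    ("rendered AVI (3 GB)", 3000),
    ("encoded, high quality (100 MB)", 100),
    ("encoded, low quality (10 MB)", 10),
]

for label, size_mb in files:
    minutes = transfer_time_s(size_mb, UPLINK_KBPS) / 60
    print(f"{label}: about {minutes:.0f} minutes")
```

Under that assumed speed the rendered file takes many hours while the encoded versions take minutes, which is why the encoding step is unavoidable.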

With today's technology, there is a way for me to strip down this process a bit. I could do the following (let's call this the "streamlined process"):
  1. I could shoot the video on my digital camera. The camera will automatically create encoded MPG snippets, and I can choose between two different resolutions.
  2. I would upload the snippets to the laptop.
  3. Then the last two steps are identical -- I have to find a WiFi hotspot, upload the MPG snippets to either my Web site or BitTorrent, and you have to download them.
By doing this I completely lose the ability to edit -- there is no way to hook pieces of video together, or to edit out what I don't like, or to do anything like a voice-over or a wipe. Therefore, the resulting video would come to you in a package of little snippets largely unedited.

Neither Disney nor Sea World had WiFi hotspots that I could find. So if I wanted to get the video to you in any sort of "instantaneous" amount of time, I would probably have had to leave the park and get back to the hotel to find a decent Internet connection.

I actually have a demo here of the "MPG snippet" approach. Here are two MPG files that I recorded with my digital camera. The first is in 160 x 120 format, and the second is in 640 x 480 format.

You can see that Snippet 1 is a mere postage stamp of a video. It is almost useless because it is so small. However, it is one tenth the size of Snippet 2. Snippet 2 actually has enough resolution that you can blow it up to full-screen size.

You will also note that both snippets are only 5 seconds long. That means that a 10-minute video at "good" (640 x 480) resolution would take about 240 MB. That is going to be too expensive for me to serve to the "general public", forcing me to go the BitTorrent route. A better encoder on the laptop can reduce the size, but then I have to go through the whole "normal process" rather than using the "streamlined process".
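
The arithmetic behind that 240 MB figure can be sketched as follows. The 2 MB size for a 5-second 640 x 480 snippet is an assumption, taken from the size a commenter quotes for Whale Snippet 2 in the comments below. The smaller figure uses the "one tenth the size" ratio mentioned above, and the function name is just for illustration.

```python
def video_size_mb(duration_s, snippet_size_mb=2.0, snippet_length_s=5.0):
    """Scale the size of one short snippet up to a longer video of duration_s seconds."""
    return duration_s / snippet_length_s * snippet_size_mb

ten_minutes = 10 * 60  # seconds

# 640 x 480: assumed ~2 MB per 5-second snippet
print(video_size_mb(ten_minutes))        # 240.0

# 160 x 120: roughly one tenth the size per snippet
print(round(video_size_mb(ten_minutes, snippet_size_mb=0.2), 1))
```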

Sad.

And now let's talk about one other problem that isn't even on the radar today. When I was sitting in the stands watching the whale show, here is what I was looking at:


[Note that I've reduced the 2600 x 2000 pixel photo that I originally
took down to 400 x 300 pixels here, again to reduce bandwidth costs.]


What I showed you in the video snippets above was a tiny portion of the full scene I was looking at. In an ideal universe, the original video would have shot a complete 180-degree (or possibly even 240-degree) swath of the scene at some kind of immense resolution like 20,000 pixels by 5,000 pixels. Then you, as the viewer, would choose where you want to focus your attention and see that part of the scene at good resolution. You might use a headset like this one so that you can focus your attention on a part of the scene in a completely natural way.

This is how humans normally "watch" an event like a whale show -- each person in the audience chooses where to focus his or her attention, rather than allowing a video producer to have total control. In other words, each audience member in the stands does his or her own editing. We do not even consider offering an option like this today -- anywhere -- because we have no way to implement it. The bandwidth requirements are simply too massive for today's technology to begin handling.

And let's not even get into the fact that a human sitting in the stands gets the additional benefit of a binocular view. To be even more realistic, we would want to be shooting (or artificially generating) two streams of 180-degree video -- one for each eye.
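
To get a feel for how massive those bandwidth requirements are, here is a rough sketch of the uncompressed data rate for a 20,000 x 5,000 video stream. The 30 frames per second and 24 bits per pixel are assumptions chosen for illustration, not figures from this post, and a real system would compress the stream heavily.

```python
def raw_bitrate_gbps(width, height, fps=30, bits_per_pixel=24, streams=1):
    """Uncompressed video data rate in gigabits per second."""
    return width * height * bits_per_pixel * fps * streams / 1e9

# The "good" snippet resolution, for comparison
print(raw_bitrate_gbps(640, 480))

# One panoramic 20,000 x 5,000 stream
print(raw_bitrate_gbps(20_000, 5_000))             # 72.0

# Two streams, one for each eye
print(raw_bitrate_gbps(20_000, 5_000, streams=2))  # 144.0
```

Even if compression shrank this a hundredfold, the result would still be far beyond anything a wireless connection can carry today.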

In an ideal universe, what would happen is that I would shoot the video in this 180-degree mode, which means that you as the viewer get to do all of the "editing" in a completely natural way. That video would not record onto tape (video tape is sad...) -- the camera would send the video stream directly to the wireless Internet in real time, and you could view it in real time or later. And I would be able to serve it to you wirelessly for free, because bandwidth is so plentiful it is free.

When you think about all of this, you realize how truly pathetic video is today. We think that "HDTV" is cutting edge, but we have not even started to scratch the surface when it comes to realistic video viewing. People in 2050 will look back at our plasma screens of today and laugh out loud. They will have the same feeling that we have when we look at a Kinetoscope.

[The state of the art in video today is sad, yes, but here it is mostly for technological reasons. It is different from the kind of sadness discussed in the last post, where the problems could all be prevented because they are completely under human control. It simply will take time for us to develop the technology that will begin to solve the video problem. It will take a while, for example, to develop 20,000 x 5,000 (100 megapixel) image sensors. It will take a while before we have wireless connections that can handle 100 megapixel real-time video streaming for free. And so on...]

9 Comments:

At 6:06 AM, Anonymous Anonymous said...

Even though it is only 5 seconds long, the "whale snippet 2" is fairly cool. They are amazing animals. Video tells you so much more than a photo.

 
At 6:47 AM, Anonymous Anonymous said...

Watching a football game on TV in 180-degree "surround view" like you describe would be awesome! You need enough resolution to focus in on the ball and the person carrying it with plenty of detail.

 
At 7:55 AM, Anonymous Anonymous said...

I was thinking along similar lines when looking through a gallery of amazing gigapixel photographs.

How "sad" will our megapixel photos look once everyone has a gigapixel cam in his cellphone...

These gigapixel photos also seem to make zooming and careful framing more or less redundant. Photograph the whole thing, zoom and choose frame at home.

If you had that as video you'd definitely need your editing software to help you a lot more... or maybe there is no way to do that, maybe editing a 10 minute clip will always take one whole day? Maybe being an editor is a "safe from robotic replacement" job?

Gigapixel gallery found via wired.com article.

 
At 8:38 AM, Blogger Marshall Brain said...

> Maybe being an editor is a "safe from robotic replacement" job?

This is a very interesting thought. It could go in a couple of different directions depending on the time frame.

In the short term, editing HAS to get easier. It is so hard right now... For example, let's say that I have an hour of raw footage, and I know that somewhere in that footage Aunt Edna says something funny, and I want to find that shot. It is like finding a needle in a haystack. If I have 20 hours of raw footage, it can be impossible to find it. Eventually, editing software will get to the point where we can say, in English, without ever touching a keyboard, "find me the part where Aunt Edna is talking about her dog," and the editing software will take you there instantly. Eventually the editing software will get so fluid and easy to use that we can talk through the creation of a 10-minute video in just a few minutes.

Once it gets that easy, however, robots will be getting smart enough to do ALL the editing, and then humans will be out of that job too.

 
At 11:09 AM, Anonymous Anonymous said...

Shamu!

 
At 11:16 AM, Anonymous Anonymous said...

> Video tells you so much more than a photo.

Little 5-second snips, rather than 10-minute-long videos, are what the camera designer had in mind. The snips work well. But a 2 meg "snapshot" like Whale Snippet 2 is big. Email systems reject messages over 1 meg.

 
At 4:36 AM, Anonymous Anonymous said...

Regarding the editing robots...

two more points that come to mind.

1) Apparently most of the jobs in editing are already gone. When I took an editing class in 1999 our teacher told us about the way he had learned - and what exactly assistant editors did all day long: sort film clips, sync the sound, document what is on the clips, bring them to the head editor, get them back from the editor, sometimes undo cuts that the head editor made and file the clips again. So apparently most of the work had to do with handling and filing thousands of lengths of celluloid - all that work is gone (and that's good in my view) with the digital editing where only the final cut is done in celluloid. Apparently every editor on major films had several assistants of that kind. Would be interesting to see statistics on that.

2) Of course editing is an art - and as such probably not prone to easy robotic replacement. Actually my teacher pointed that out: the filming and lighting come from painting and photography, acting and directing from the theatre, the writing from literature. In that way editing is the "real" moviemaking art. The technology has come a long way, seeing that hobbyists can work with editing systems on their laptops that not so long ago would have cost many tens or hundreds of thousands of dollars. Then again, typewriters and word processors have been around for a long time - I don't think being a good writer has become all that much easier either.

Cheers, Björn

 
At 3:09 PM, Anonymous Anonymous said...

If someone would invent an easy way to edit/merge the little MPEG snippets and also a way to shrink the size of the encoded file, that would help. Has anyone invented such a thing? Can they include it in Windows XP for free?

 
At 3:18 PM, Anonymous Anonymous said...

Wait till you learn about "phased array optics"!

 
