Cohen joined Facebook in Fall 2015 as Director of Facebook’s Computational Photography Research team, which was formed to explore new ways to share photos and videos online. Cohen is also an affiliate professor at the University of Washington and a member of the advisory board of the University of Washington’s AR/VR Reality Lab.
Cohen sat down to share details about his contributions to the field of computational photography, his impactful work at Facebook, and his experience in academia. He also described how his work relates to art and computer vision.
Q: Your undergraduate degrees are in both art and civil engineering. What led you to begin a career in computational photography? What initially inspired you to apply art to computation?
Michael F. Cohen: I have always had one foot in artistic endeavors and one foot in technology. My first degree was in Art, but shortly thereafter I went on to study Civil Engineering. I never became an engineer; rather, I first entered the field of computer graphics intending to continue studies related to civil engineering. Soon after, I found the more artistic applications to be more exciting. I often referred to computer graphics as “imagination amplification.” It lets us imagine what could be and produce imagery that represents what could be through algorithms. This is not so far from engineering, where one tests structures through algorithms and produces imagery of deformations, etc. In fact, much of the math is the same.
Q: Looking back on your career to date, what are some of the projects you’re most proud of?
MFC: I am probably most proud of the people whose careers I have helped along the way. These include my undergraduate and graduate students and work colleagues.
That said, the projects that I am most proud of would include:
Simulating the Cornell Box with Radiosity (1985): The box (which I built) has become a famous artifact used in a lot of research since the 1980s.
The Lumigraph (1996): This work established a whole area of research that rides the line between computer graphics and computer vision.
The Moment Camera (2005): This laid out the basic ideas for combining images/photos to create imagery that more closely resembles our memories than actual photos. A lot of modern computational photography is based on these ideas.
3D Photos (at Facebook): This allows people to capture, in a single shot, an image that exhibits aspects of 3D. It is also laying the groundwork for what we are calling “reactive media,” where the viewer participates actively in viewing the media.
Q: What has surprised you most about the impact of your work? What milestones are you most proud of from your career? Are there one or two examples that stand out in particular?
MFC: One thing that has taken some time to get used to is the delay between the original research and when impact is realized. For example, the Moment Camera work from 2005 has only now become commonplace in cameras.
Also, in 1998 I won the Computer Graphics Achievement Award at SIGGRAPH for my work in realistic image synthesis, for an idea called radiosity. So, it’s nice to know that 20 years later I’m still in the field and recognized for the totality of the work over my career with the Steven A. Coons Award this year.
Q: Which of your team’s research projects have you been most excited to see applied on Facebook or Instagram?
MFC: 3D Photos, which I mentioned earlier, is certainly the most exciting project. This product was launched in Fall 2018, and anyone can see 3D photos in their Facebook News Feed, although they can currently be created only on certain devices, like a smartphone with a dual-lens camera. Another exciting project is 360 photo and video viewing on Facebook.
There are also less-obvious quality improvements and stabilization that simply make media look better. One example is the Facebook camera, where we’ve improved the quality of the imagery in low-light situations. On Instagram, we’ve improved the Boomerang feature and stabilized the video so that it is less shaky when you view it.
Q: How do you balance your time between industry and academia? How do those two roles complement each other?
MFC: Although I have been in industrial research and development for some time, I have always tried to keep active in academia. I had taught at Cornell and Princeton before coming to Facebook. I now have dual affiliations, which allows me to both work at Facebook and serve as an affiliate professor at some institutions, such as the University of Washington. I serve as a graduate student adviser and/or simply collaborate on projects. Michael Abrash and I also serve as board members for the AR/VR center at UW.
I think it is important to stay involved in academia both to act as a teacher and also to be a lifelong learner from the students. When you’re a graduate student adviser, the natural progression is to start out working with your students and helping suggest projects they might work on, maybe help them learn the relevant technology and literature. Over the process of doing a PhD, those roles get reversed, so by the end of any successful PhD, that student is teaching you all about what they’re doing because they’ve essentially become the world expert in that field.
Q: Can you talk about some of the different ways computational photography research has had an impact more broadly? Are there areas that you feel might not be as widely recognized?
MFC: There’s been a tremendous impact in the movie industry, particularly with CG. I contributed to the creation of the area of research called image-based rendering, which combines techniques from computer graphics and computer vision. The idea is that you can point a camera at a scene, record it, learn something about the shape, the color, and the textures in the scene, and finally use that data to render new images from different directions. For example, each of the 3D photos that one may see on Facebook is taken in a single shot with a camera, and yet you are able to move it and see it from slightly different points of view.
Relatedly, computer graphics in movies are created by somebody modeling the geometry and the textures of what they want to create, but they use real textures and real geometry that are captured from the scene. Often when you go to the movies, what you see is a computer graphics character placed in the same setting as a real scene or real actors, who have to act in front of a green screen with environments that may be partially CG and partially real. It’s really hard to align everything into one scene in the end, and the only way to do so is to use technology from computer vision. This is really a marriage of computer graphics and computer vision.
It may be hard to believe, but only a few years ago we debated when the first computer graphics would appear in a movie such that you could not tell if what you were looking at was real or CG. Of course, now this question seems silly, as almost everything we see in action movies is CG and you have no chance of knowing what is real or not.
There are also many artists whose particular genre can be thought of as computational photography, like Jason Salavon, for example. His work is actually very similar to what we’ve been doing, and he’s just using it to express himself artistically. Also, artificial intelligence itself is now being used to create art. If you show an AI system thousands of examples of art, you can then ask it to create a new piece of art. This brings up a lot of interesting questions like who the artist is. Is it the person who wrote the algorithm or the person who pushed the button and ran the algorithm?
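Editorial illustration: the image-based rendering idea described in this answer (record a scene, recover something about its shape, then render it from a new direction) can be sketched in a few lines of Python. This is only a toy under stated assumptions, not the method used for Facebook's 3D Photos: it warps a single image row, assumes a per-pixel depth value is already known, and uses hypothetical names (`reproject_row`, `camera_shift`, `focal`) chosen for this example.

```python
def reproject_row(colors, depths, camera_shift, focal=1.0):
    """Forward-warp one image row to a horizontally shifted viewpoint.

    Assumption for this sketch: moving the camera right by `camera_shift`
    moves each pixel left by focal * camera_shift / depth pixels, so near
    pixels move more than far ones (parallax).
    """
    width = len(colors)
    out = [None] * width              # None marks holes with no source pixel
    zbuf = [float("inf")] * width     # nearest depth written to each target
    for x, (color, depth) in enumerate(zip(colors, depths)):
        disparity = focal * camera_shift / depth
        target = x - round(disparity)
        # Keep only the nearest surface when several pixels land together.
        if 0 <= target < width and depth < zbuf[target]:
            out[target] = color
            zbuf[target] = depth
    return out


# A near foreground pixel ("fg", depth 1) in front of a far background (depth 4).
row = ["bg", "bg", "fg", "bg", "bg"]
depth = [4.0, 4.0, 1.0, 4.0, 4.0]
print(reproject_row(row, depth, camera_shift=1.0))
```

In the shifted view the foreground pixel slides over the background and leaves a hole (`None`) where previously hidden background would appear. Filling such holes plausibly is one of the hard parts of rendering new views from a single shot.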
Q: What trends are you seeing develop now in computer graphics research? What might this field look like five years from now? What problems still need to be solved before we get there?
MFC: The biggest new trend is the use of deep networks to solve old CG problems. One of the most fun areas to watch is the development of deep net artists, networks that can simulate what an artist does to some extent. This enlivens the old debate about “what is art?”
The biggest remaining problems are in the area of enabling people to more fully express themselves visually. This means figuring out how to get traditional algorithms and deep learning to work together, while always keeping the human in the loop.
Q: What would you say to researchers who are just starting their careers and are considering focusing on computational photography?
MFC: In terms of what you need to succeed, clearly one needs to understand both traditional (geometric) computer graphics and computer vision, as well as new AI methodologies (deep nets). Also, never underestimate the value of getting through math classes. Out of everything I could have done to make my career easier, I would have taken more math early on.
At the same time, a broad understanding of human perception (and art) still forms a great basis for the kind of work coming up. Without always keeping in mind the role and needs of the end user, it is hard to do impactful work. For me, my first art degree was helpful in cultivating that understanding of human perception because this is what you do implicitly in art.
When you go through an art degree and learn how to draw, what you’re really learning is how to see. In drawing classes, the first thing they’d probably ask you to do is to draw the negative space. If I want to draw a chair, I draw the spaces between the chair legs and not the actual chair legs, for example. That’s really an exercise in learning how to see. You also learn about color, how color is produced, what the effect of color is, and things like that.
That relates to the entire field because, first of all, in computer vision you’re teaching a computer how to see. With computer graphics, you’re teaching it how to create imagery. If nothing else, artistic training helps you understand how to debug algorithms and understand what the algorithm is actually doing. It’s essentially trying to simulate a human-like process. Fields like vision science, psychology, human-computer interaction, and design are also helpful.