Trends in Computer Vision with Amir Zamir
Automatic TRANSCRIPT
All right, everyone. We are here for AI Rewind, our second annual walk-through of the top trends and developments in machine learning and AI. And this time I am with Amir Zamir. Amir is an assistant professor of computer science at the Swiss Federal Institute of Technology, or EPFL. In fact, Amir at this current moment, for another three days, is a postdoc affiliated with Stanford and UC Berkeley, where he was when we first spoke with him back in July of 2018. Amir, welcome back to the TWIML AI Podcast.
Thanks. Great to be here.
Yeah, I'm really excited to dig into this conversation about what's new and your take on 2019 from a computer vision perspective. To get started, why don't we just take broad brushstrokes: what's your take on 2019?
Thanks for having me. It was another exciting year, in my opinion, for the field of AI, including computer vision. The conferences are expanding, so there's more talent coming in, more energy. I think CVPR 2018 was roughly six thousand, and the number of attendees in 2019 was about ten thousand. That was six months ago, so we'll see how it's going to be in 2020, but that's the expansion in size.
How does that growth rate compare to NeurIPS, which is the one that gets a lot of headlines?
Right. I came back from NeurIPS roughly two weeks ago, and I believe it was thirteen thousand. But again, they are about six months apart, so we will see how CVPR will be. I think they are roughly the same size, maybe NeurIPS somewhat bigger, because it generally includes many different areas, not just vision: NLP and so on. Many people go there, myself included. But yeah, they are huge conferences, big enough that you can't find your friends anymore.
Are the paper submissions at CVPR growing as quickly as at NeurIPS?
Yes. I don't know the number for this year; the area chairs should know it. But all in all, what I'm sure about is that there was a really sharp expansion, and to some extent that presents a problem for us academics, because we have to find reviewers for all these papers and so on. But that's fine; that's a good problem to have. The overall outcome is positive, albeit with some variance. The growth was in a way disproportionate, because for the past four or five years there has been huge interest, and we have a lot more young talent in the field. For the same reason, because they're young and new, we don't have as many seasoned reviewers to evaluate the papers, so to some extent the review quality has a little more variance in it compared to years ago. But again, like I said, that's a good problem to have, because the field is generating more results, so a little bit of variance is, in my opinion, acceptable.
So the field is growing dramatically, as are other areas in AI. What else is happening in vision?
Right. In more technical terms, I think we see a few trends that are not exclusive to 2019, but I think some of them matured in 2019. One meta-trend that I see for sure is that we see a lot of mixing of areas, like vision plus something else. Vision plus graphics is one; a phenomenal example of that is all these image synthesis pipelines, either GAN-based or whatnot, the pipelines that essentially generate an image. That, in a way, is a graphics problem, because graphics is about generating something good-looking that you put on your screen, like rendering things, and vision was the inverse of that problem: you already have an image and you want to understand it. But these areas got blended together. And I think the first time in vision that I saw a reasonably working example was the paper
pix2pix, a few years ago, and after that it became really popular, with CycleGAN and so on. Those papers actually came primarily from the vision community, and of course the graphics community worked a lot on it too. So that's the mix of vision plus graphics; I guess we'll discuss that later and get into more details. Vision plus robotics is expanding; I think it's one of the areas to watch for sure. Then there is vision plus the adversarial and robustness literature. I think that's something that not many of us saw coming, but it actually makes sense. Basically, there was a line of research on making machine learning systems more robust to adversarial examples, which were posing concerns to people. It turns out that if you have more robust algorithms for processing visual data, they are more useful for just processing non-adversarial content as well. For example, if you have an image synthesis pipeline, it works better if it was robustified, even if you don't mess with the input anymore. We can discuss that in more detail going forward, but I think generally the trend of mixing different areas with vision is increasingly popular. And I think there's actually a healthy reason for this. I think it's a realization of the fact that vision is in service of some downstream goal. It's a very powerful skill, but we don't usually observe the world for the purpose of just observing it, just understanding what's going on. We usually have an intent in mind. When I get up in the morning and open my eyes, I intend to get out of bed and safely navigate myself out of the bedroom. So vision is a very practical skill, and so we cannot really treat it independently.
The research we do on vision cannot be separated from these downstream skills, so vision plus X is, to me, a realization of that fact. That is especially true in the context of robotics: the reason we mix vision and robotics together is that our robots need to have a complex understanding of the world, acquired through their cameras. So whatever the vision pipeline outputs, it should be in a way curated to best support the downstream goal of the robot. For instance, if I have a robot in my home that can detect all the objects and do all sorts of complex things, I don't really care how the vision pipeline works. If the downstream goal of the robot, whatever it is, making the beds or doing laundry, works just fine, the vision can be whatever it wants to be. So it is really an intertwined pipeline, and I'm actually happy to see that these areas are mixing together, because we can now do a more premeditated design in our research and do vision in a way that is more useful toward downstream goals. There are some caveats to that story, like art: when I observe a painting, I'm looking at it and appreciating it, and I don't necessarily intend to do something with it. But generally speaking, vision is a very practical skill, and the mixing of areas is, in my opinion, a realization of that.
Is there also an implication that vision has reached a level of maturity, meaning the core vision tasks have reached a level of maturity or performance such that we can now even consider moving on to real-world types of things and incorporating these other areas? Like, we've solved enough of core vision to then mix it with these other fields?
Yes and no. I would be really hesitant to say that we have solved enough core vision problems, or that we have solved them well. Let's take an example.
The simplest example, probably the longest-running problem in vision, is, say, object detection. We're not at a point where we can say with high confidence that yes, we can detect an object under varying lighting conditions, in different contexts, and so on and so forth. But at the same time, a huge amount of progress has been made. As to whether there's a way to make them useful, the answer is yes. And that's why many people who are not vision experts are actually using vision pipelines and APIs