Listen: Responsible AI in Practice with Sarah Bird - #322
"S. slash. Azer Sir. All right onto the show art. Everyone I am here in Sunny Orlando. Actually it's not all that sunny today Kinda gray and gray and rainy but It is still sunny. Orlando right how could it not be At Microsoft Christoph Tonight and I've got the wonderful pleasure of being seated with Sarah Bird. Sarah is a principal program manager for azure machine learning platform. Sarah Welcome to the Tuomo. podcast asked thank you. I'm excited to be here absolutely. I am really excited about this conversation. We're about to have on responsible. Ai Before we do that I I love to hear a little bit more about your background. You've got a very enviable position. Kind of at the nexus of research and product and Tech Strategy How did you create that? Well I started my career in research I did my PhD machine. Learning systems at Berkeley. And and I I loved creating the the basic technology but then I wanted to take it to the next step and I wanted to have people who really used it and I found that when You take sort of research into production. There's a lot more innovation that happens and so I really Since graduating in have styled my career around living at that intersection of research and product and taking some of the great cutting edge ideas and figuring out how we can get them in the hands of people as soon as possible and so My role now is is specifically focused on trying to do this for as your machine learning and responsibly I as one of the great new areas that there's a ton of innovation research and people need it right now and so we're at working to try to make that possible That's fantastic and so between Your Grad Work at Berkeley and Microsoft. What was the path? So I was in Jon. 
Langford's group in Microsoft Research and was working on a system for contextual bandits, trying to make it easier for people to use those in practice, because a lot of the time when people were trying to deploy that type of algorithm, the systems infrastructure would actually get in the way: you wouldn't be able to get the features to the point of decision, or the logging would not work and it would break the algorithm. And so we designed a system that made it correct by construction, so it's easy for people to go and plug it in. This has actually turned into the Personalizer cognitive service now, but through that experience I learned a lot about actually working with customers and doing this in production, and so I decided that I wanted to have more of that in my career. And so I spent a year as a technical adviser, which is a great role in Microsoft where you work for an executive, advise them, and help work on special projects, and it enables you to see both the business and strategy side of things, as well as all the operational things, how you run orgs, and then of course the technical things, and I realized that mix is very interesting. And so after that I joined Facebook, and my role was at the intersection of FAIR, Facebook AI Research, and AML, which was the applied machine learning group, with this role of specifically trying to take research into production and accelerate the rate of innovation. So I started the ONNX project as part of that, enabling us to solve a tooling gap where it was difficult to get models from one framework to another, and then also worked on PyTorch to make it more production ready. And since then I've been working in AI ethics. Yeah. If we weren't going to be focused on ethics and responsible AI today,
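The "correct by construction" logging idea Sarah describes can be sketched in a few lines. This is a minimal, illustrative epsilon-greedy contextual bandit in plain Python, not the Decision Service or Personalizer API; every name below is invented. The point it demonstrates is that the propensity of the chosen action is recorded at the moment of decision, together with the context, so downstream off-policy learning cannot be silently broken by a missing or mismatched log:

```python
import random

class EpsilonGreedyBandit:
    """Toy contextual bandit that logs (context, action, propensity) at decision time."""

    def __init__(self, n_actions, epsilon=0.1, seed=0):
        self.n_actions = n_actions
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.totals = [0.0] * n_actions   # cumulative reward per action
        self.counts = [0] * n_actions     # pulls per action
        self.log = []                     # (context, action, propensity) triples

    def choose(self, context):
        # Greedy action by empirical mean reward (0.0 for unseen actions).
        greedy = max(
            range(self.n_actions),
            key=lambda a: self.totals[a] / self.counts[a] if self.counts[a] else 0.0,
        )
        explore = self.rng.random() < self.epsilon
        action = self.rng.randrange(self.n_actions) if explore else greedy
        # Probability that this action is chosen under the current policy:
        # exploration contributes epsilon/n to every arm; exploitation adds
        # (1 - epsilon) to the greedy arm.
        propensity = self.epsilon / self.n_actions
        if action == greedy:
            propensity += 1.0 - self.epsilon
        self.log.append((context, action, propensity))  # logged before reward arrives
        return action

    def reward(self, action, r):
        self.counts[action] += 1
        self.totals[action] += r
```

With triples logged this way, inverse-propensity-score estimators can later evaluate how a different policy would have performed on the same traffic, which is exactly what breaks when the logging is bolted on after the fact.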
we would be going deep into Personalizer, which was Microsoft's Decision Service, and the whole contextual bandits thing. Really interesting topic, not least because, you know, we talk a lot about reinforcement learning and whether it's useful, and while there's this deep reinforcement learning game-playing thing, this is reinforcement learning that people are getting a lot of use out of in a lot of different contexts. Yeah, yeah. It's actually... when it works, right, it doesn't work in all cases, but when it works, it works really well. It's the kind of thing where you get the numbers back and you're like, can this be true, right? And so I think it's a really exciting technology going forward, and there are a lot of cases where people are using it successfully now, but I think there will be a lot more in the future. Awesome, awesome. Well, we'll have to take a rain check on that aspect of the conversation and kind of segue over to the responsible AI piece. And, you know, I've been thinking a lot about a tweet that I saw by Rachel Thomas, who is a former guest on the podcast, a longtime friend of the show, and currently heads the USF Center for Applied Data Ethics, and she was lamenting that there are a lot of people out there talking about ethics like it's a solved problem. Do you think it's a solved problem? No, absolutely not. I didn't think so. I think these are fundamentally hard and difficult problems whenever we have a new technology, and so I think we're always going to be having the AI ethics conversation. This is not something that we're going to solve and have it go away. Okay. But what I do think we have now is a lot more tools and techniques and best practices to help people start the journey of doing things responsibly. And so I think the reality is, there are many things people could be doing right now
that they're not, and so I feel like there's an urgency to get some of these tools into people's hands so that we can do that, so I think we can quickly go a lot farther than we have right now. In my conversations with folks that are working on this and thinking about the role that responsible AI plays in the way they, quote unquote, do AI, do machine learning, a lot of people get stopped at the very beginning. Who should own this? Where does it live? Is it a research kind of function, or is it a product function, or is it more of a compliance kind of thing, like a chief data officer or chief security officer kind of function, one of those executive functions where oversight, or compliance is the better word? What do you see folks doing, and do you have any thoughts on successful patterns of where it should live? Yeah, I think in the models that we've been using, we're thinking a lot about the transition to security, for example, and I think the reality is, it's not one person's job or one function. Everybody now has to think about security. Even your basic software developers have to know and think about it when they're designing. However, there are people who are experts in it and handle the really challenging problems, and of course the legal and compliance pieces are in there as well. And so I think we're seeing the same thing, where we really need every role to come together and do this. And so one of the patterns we are seeing is that part of the challenge with responsible AI in technology is that we've designed technology to abstract things away and enable you to just focus on your little problem, and this has led to a ton of innovation. However, the whole idea of responsible AI is that you actually need to pick your head up. You need to have this larger context.
You need to think about the application in the real world, you need to think about the implications, and so we break a little bit of our pattern of "my problem is just this little box." And so we're finding that user research and design, for example, are already trained and equipped to think carefully about the people element in that, and so it's really great to bring them into more conversations as we're developing technology. So that's one pattern that we find adds a lot of value. In my conversation with Jordan Edwards, your colleague, many of his answers were "all of the above." This one is an all-of-the-above response as well. I think doing machine learning in practice takes a lot of different roles, as Jordan was talking about in operationalizing things, and then responsible AI just adds an extra layer, and more roles, on top of that. One of the challenges that kind of naturally evolves when everyone has to be thinking about something is that, you know, it's a lot harder, right? The developer is kind of trained as a developer, and now they have to start thinking about this security thing, and it's changing so quickly, and the best practices are evolving all the time, and it's hard to stay on top of that. If we were to replicate that same kind of model in responsible AI, which sounds like the right thing to do, how do we support the people that are kind of on the ground trying to do this? Yeah, and I think it's definitely a challenge, because the end result can't be that every individual person has to know the state of the art in every area of responsible AI, and so one of the ways that we're trying to do this is, as much as possible, build it into our processes and our tooling, right, so that you can say, okay,
well, you should have a fairness metric for your model, and you can talk to experts about what that fairness metric should be, but you should know the requirement that you should have a fairness metric, for example. And so we are starting with that process layer, and then in Azure Machine Learning we've built tools that enable you to easily enact that process. And so the foundational piece is the MLOps story that Jordan was talking about, where we actually enable you to have a process that's reproducible, that's repeatable, so you can say, before this model goes into production, I know that it's passed these validation tests, and I know that a human looked at it and said it looks good. And if it's out in production and there's an error or some sort of issue that arises, you can go back, you can recreate that model, you can debug the error. And so that's the real foundational piece for all of it. And then on top of that, we're trying to give data scientists more tools to analyze the models themselves. And there's no magic button in here. It's not just that we can run a test and we can tell you everything you want to know, but there are lots of great algorithms out there in research that help you better understand your model, like SHAP or LIME, which are common interpretability ones, and so we've created a toolkit called InterpretML, which is an open source toolkit you can use anywhere.
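As a concrete, deliberately simplified illustration of that process layer, a fairness metric can be wired in as a pre-deployment validation gate. The functions below are hypothetical sketches, not the Azure Machine Learning or Fairlearn API: they compute the demographic parity difference, the gap in positive-prediction rates across groups, and fail the gate when it exceeds an agreed threshold.

```python
def demographic_parity_difference(predictions, groups):
    """Max gap in positive-prediction rate between any two groups."""
    rates = {}
    for pred, group in zip(predictions, groups):
        pos, total = rates.get(group, (0, 0))
        rates[group] = (pos + (1 if pred == 1 else 0), total + 1)
    per_group = [pos / total for pos, total in rates.values()]
    return max(per_group) - min(per_group)

def fairness_gate(predictions, groups, threshold=0.1):
    """Return (passed, gap): the kind of check a release pipeline could run
    before a model is allowed into production."""
    gap = demographic_parity_difference(predictions, groups)
    return gap <= threshold, gap

# Example: group "a" gets positive predictions at 3/4, group "b" at 1/4,
# so the gap is 0.5 and the gate fails at a 0.1 threshold.
preds  = [1, 0, 1, 1, 0, 1, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
passed, gap = fairness_gate(preds, groups, threshold=0.1)
```

Which fairness metric is appropriate, and what threshold, is exactly the kind of question Sarah says you take to the experts; the process requirement is only that some such gate exists and is recorded with the model.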
The idea is that it enables you to easily use a variety of these algorithms to explain your model behavior, explore it, and see if there are any issues. And so we've also built that into our machine learning process, so that if I build a model, I can easily generate explanations for that model, and when I've deployed it in production, I can also deploy an explainer with it, so individual predictions can be explained while it's running, so I can understand if I think it's doing the right thing and if I want to trust it, for example. It strikes me that there's a bit of a catch-22 here, in the sense that the only way we could possibly do this is by putting tools in the hands of the folks that are working on these problems, data scientists and machine learning engineers, but the tools by their very nature kind of abstract them away from the problem, and allow them, if not encourage them, to think less deeply about what's going on underneath, right? How do we address that? Yeah, I think... Do you agree with that, first of all? I completely agree with that, and it's a challenge that we have in all of these cases, where we want to give them the tool to help them and to have more insight, but it's easy for people to then just use it as a shortcut. And so in a lot of cases we're being very thoughtful about the design of the tool, making sure that it is helping you surface insights, but it's not saying "this is the answer," because I think when you start doing that, like if you have something that flags and says "this is the problem," then people really start relying on that. And maybe someday we will have techniques where we have that level of confidence and we can do it, but right now we really don't. And so I think a lot of it is making sure that we design the tools in a way that encourages this mindset of exploration and deeper understanding of your models and what's going on, and not just, oh,
this is just another compliance test I have to pass; I just run this test, it says green, and I go. You alluded to this earlier in the conversation, but it seems appropriate here as well, and it's maybe a bit of a tangent, but so much of pulling all these pieces together is kind of a user experience and design problem. Any thoughts on that? Is that something that you've dug into or studied a lot, or do other folks worry about that here? It's not in my background, but to me it's an essential part of the function of actually making these technologies usable. And particularly when you take something as complex as an algorithm and you're trying to make that abstracted and usable for people, the design is a huge part of the story. And so what we're finding in responsible AI is that when you think about this even more, a lot of our guidelines are saying be more thoughtful and include more careful design. For example, people are tempted to say, well, this is the data I have, so this is the model I can build, and so I'm going to put it in my application that way, and then if it has too much inaccuracy, you spend a lot of resources trying to make the model more accurate, where you could have just had a more elegant..."
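Sarah's earlier point about deploying an explainer alongside a model, so that individual predictions can be explained at serving time, can be sketched for the simplest possible case: a linear model, where SHAP-style local attributions reduce to each coefficient times the feature's deviation from its mean. Everything here is illustrative plain Python; it is not the InterpretML or Azure Machine Learning API, and all names are invented:

```python
class ExplainedLinearModel:
    """Toy model that returns per-feature contributions with every prediction."""

    def __init__(self, coefs, feature_names, feature_means):
        self.coefs = coefs
        self.names = feature_names
        self.means = feature_means

    def predict_with_explanation(self, x):
        # For a linear model, the local attribution of feature i relative to
        # the training-data baseline is coef_i * (x_i - mean_i).
        contributions = {
            name: coef * (xi - mean)
            for name, coef, xi, mean in zip(self.names, self.coefs, x, self.means)
        }
        base = sum(c * m for c, m in zip(self.coefs, self.means))  # baseline score
        score = base + sum(contributions.values())
        return score, contributions

model = ExplainedLinearModel(
    coefs=[2.0, -1.0],
    feature_names=["income", "debt"],
    feature_means=[3.0, 1.0],
)
score, why = model.predict_with_explanation([4.0, 2.0])
# score = 2*4 - 1*2 = 6.0; "income" pushed the score up, "debt" pulled it down
```

Serving the explanation with the prediction, rather than as an offline afterthought, is what lets someone look at a single live decision and ask whether they trust it.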