Trends in Natural Language Processing with Nasrin Mostafazadeh
Sam: All right, everyone, welcome back to our AI Rewind 2019 series. In this episode we'll be covering NLP, and I've got the pleasure of being on the line with Nasrin Mostafazadeh. She is a senior research scientist at Elemental Cognition. Nasrin, welcome back to the podcast.

Nasrin: Thanks, Sam. Glad to be back. Thanks for having me.

Sam: Definitely glad to be speaking with you again. We last spoke back in August of 2018, when we talked about contextual modeling in language and vision and some of your research. This time we'll be reviewing your thoughts on the most important papers and developments, more broadly, in the field that you work in, natural language processing, in 2019. I'll have folks refer back to that previous episode for a little bit more about you, your background, and what you're working on. But to get this conversation started, why don't we begin with your broad take on 2019 in NLP? Was it a big year for NLP?

Nasrin: Sure. I think 2019 was actually an exciting year. These large pre-trained models have been stretched in various different directions, and, slowly but surely, the community has started to think about the problems they have, their weaknesses, their blind spots. Regarding the paradigm shifts we're seeing in NLP, now that we're into 2020 I can reflect back on the decade. It started back in 2015 or 2016 or so, when various tasks started to get tackled by a relatively straightforward approach: you would just encode the input text, which could be looked at as a sequence of words, characters, etc., and then use attention to look back into the encoded representation while trying to predict something for the task,
which could be a sequence of tokens, and so on. Chris Manning, who is one of the pioneers of our field, had this belief he called the "BiLSTM hegemony": basically, no matter what the task is, if you throw a BiLSTM at it and use attention to attend back to the important encodings of the input, you can achieve the state of the art. Note this isn't referring to the "Attention Is All You Need" paper; that paper is more recent, and that was when the transformers came into the picture, which shows how fast the field is moving. Through 2017, as I said, the consensus in NLP was that you could achieve state of the art if you just threw a BiLSTM with attention at the task; that was the recipe. Back then, I remember, when I gave talks I would conclude that, although that had been true for a host of different benchmarks, for tasks that required vast amounts of background knowledge and reasoning, these models did not yet achieve state of the art or near-human performance. Fast forward just one year: in 2018 we had ELMo, deep contextualized word representations, which took this one more step forward of building these large language models that happened to be contextualized, pre-trained on a very large corpus and then fine-tuned on downstream tasks, which started beating lots and lots of different state of the arts and establishing brand new ones. And the test I had in mind, when I was personally criticizing the claim that by throwing attention at a particular benchmark you necessarily achieve state of the art, was commonsense reasoning tasks, which is something I personally am very passionate about.
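As a minimal sketch of the encode-then-attend recipe described above, assuming a toy numpy setup: the encoder states here are hand-written stand-ins for what a BiLSTM would actually produce, and the learned query/key projections of real systems are omitted.

```python
import numpy as np

def attend(query, encoder_states):
    """Dot-product attention: weight each encoded input position by its
    similarity to the decoder's query, then return the weighted sum as
    the context used when predicting the next output."""
    scores = encoder_states @ query                 # one score per position
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                        # softmax over positions
    context = weights @ encoder_states              # blended representation
    return context, weights

# Toy encoder output: 3 input positions, hidden size 2.
states = np.array([[1.0, 0.0],
                   [0.0, 1.0],
                   [1.0, 1.0]])
context, weights = attend(np.array([1.0, 0.0]), states)
```

The query would normally come from the decoder's hidden state; here it is a fixed vector, so the first and third positions (which align with it) receive the most weight.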
It happens to be my main line of research, and the particular task was a story task, which I talked about the last time we spoke: specifically the Story Cloze Test. Given a sequence of four sentences which form a coherent, very short story, the task is to choose between two alternative endings to that story. It was designed to evaluate systems' commonsense reasoning capabilities. What happened is that mid-2017 or so, the "Attention Is All You Need" paper came out, the transformer paper that you just mentioned a minute or two ago. That paper enabled the advent of all these very large pre-trained transformer models that could actually establish the state of the art in various commonsense reasoning tasks, one being GPT-1. The GPT-1 paper came out around 2018; it was a generative pre-trained transformer, a very large language model that the OpenAI folks trained on a very large, diverse corpus and then fine-tuned on small datasets. And actually, the dataset where, for me, the most amazing progress happened was Story Cloze, the benchmark I really cared about. Notably, they got around 86 or so percent accuracy, which was far better than the previous numbers people had reported on the test set. That really changed my personal mindset: I started believing that these models may seem to be doing pattern recognition at scale, and may not be doing reasoning, connecting the dots, and all these sorts of things that we care about and label as reasoning.
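A hedged sketch of how a language model can serve as a Story Cloze scorer: score the context paired with each candidate ending and keep the likelier one. The toy add-one-smoothed unigram model below is purely illustrative; GPT-1 used a fine-tuned transformer, not anything like this scorer.

```python
import math
from collections import Counter

# Toy "training corpus" standing in for large-scale pre-training data.
corpus = ("the team practiced hard the team won the game "
          "so they celebrated together").split()
counts = Counter(corpus)
total = len(corpus)
vocab = len(counts)

def log_prob(text):
    # Add-one-smoothed unigram log-probability of a word sequence.
    return sum(math.log((counts[w] + 1) / (total + vocab))
               for w in text.split())

def choose_ending(context, ending_a, ending_b):
    # Score each full story (context + ending); keep the likelier one.
    a = log_prob(context + " " + ending_a)
    b = log_prob(context + " " + ending_b)
    return ending_a if a >= b else ending_b

best = choose_ending("the team won the game",
                     "so they celebrated together",
                     "qux zyx flurm blick")
```

Because the second candidate's words never appear in the corpus, the model assigns it lower probability and the coherent ending wins.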
Still, if you train them in the right way, give these models a chance of being trained on the right datasets and fine-tune them right, they are capable of doing knowledge transfer. I think that set the ground for us to move into 2019, where we had more and more of these very large pre-trained models that you could fine-tune on various downstream tasks and establish state of the art, even on the very challenging core NLP tasks, challenging tests such as Story Cloze itself, commonsense reasoning, etc. So I think this has been the main exciting thing about 2019: we could see this wasn't just a glimpse, wasn't just a one-time thing that these models could perform well; it continued into 2019. There's more to say about the downsides of these models, but yeah, I'm very excited to see where we're going with this paradigm shift into 2020.
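One way to picture the pretrain-then-fine-tune recipe discussed here, as a sketch under toy assumptions: a frozen "pre-trained" encoder (below just a random projection, not a real language model) supplies features, and only a small task head is trained on the downstream data.

```python
import numpy as np

rng = np.random.default_rng(0)

W_pretrained = rng.normal(size=(10, 4))        # frozen "pre-trained" encoder

def encode(x):
    # Frozen features; in the real recipe this is the pre-trained LM.
    return np.tanh(x @ W_pretrained)

# Tiny downstream task: binary labels from a toy rule on the raw input.
X = rng.normal(size=(64, 10))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

# Fine-tune only the task head (logistic regression) on top.
feats = encode(X)
w = np.zeros(4)
for _ in range(500):
    p = 1 / (1 + np.exp(-(feats @ w)))         # predicted probabilities
    w -= 0.5 * feats.T @ (p - y) / len(y)      # gradient step on log loss

accuracy = ((feats @ w > 0) == (y == 1)).mean()
```

Full fine-tuning, as in GPT-1, also updates the encoder weights; freezing them, as ELMo-style feature extraction often did, keeps this sketch short.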
Nasrin Mostafazadeh
Senior Research Scientist
United States