"zahren fraser" Discussed on Code Story

...their activity in the background and then surface issues when we find them. And so the data science piece was kind of off on its own, building some stuff, and came up with a pretty hacked-together version of what we have today. It was simple: give it some data, and it would return back and say yes or no, it was an issue or it wasn't, and then we would take action based on the response. So yeah, it was just a website. There were no native apps or anything. And I think getting from nothing to something that somebody could click through took probably like two months, maybe. Now, having something that we were proud of took a lot longer.

What sort of tech did you use to build that initial two-month build?

So early on, for me, the standard stack was Rails for the back end. I'd done a lot of open source work with Sidekiq, which is a background job processing library, and so for me that was pretty natural to bring in. The interesting thing about Bark is that a lot of the work goes on in the background. So we have very few worries about keeping a website up, and the website traffic actually doesn't really indicate our success, which is a little bit the opposite of a lot of typical web apps. So the website needs are minimal. And then we're always in the background looking for stuff and analyzing it. So we have these massive clusters of servers now doing background work, and just these small little web servers. It's really kind of a different mindset. But it's a standard Rails stack. Redis was involved for the Sidekiq backing. And then the data science piece was, and still is, Python. It was Python 2, fronted with a small Flask API, a JSON API. And at the time, on the back end, it was just some kind of basic NLP libraries that are in Python.
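The request and response shape described above (hand the data science service some text, get back a yes or no, then act on it) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the company's actual service: the function names, the JSON fields `text`, `flagged`, and `score`, the placeholder word list, and the threshold are all hypothetical, and the keyword scorer merely stands in for a real model. It uses only the Python standard library rather than a web framework.

```python
import json

# Hypothetical placeholder vocabulary; a real service would call a trained model.
FLAG_WORDS = {"hate", "stupid", "loser"}
THRESHOLD = 0.2  # assumed cut-off, chosen only for this example


def score_text(text: str) -> float:
    """Return the fraction of words that appear in the placeholder list."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    words = [w for w in words if w]
    if not words:
        return 0.0
    return sum(w in FLAG_WORDS for w in words) / len(words)


def handle_classify(request_body: str) -> str:
    """Take a JSON request like {"text": "..."} and return a JSON verdict."""
    payload = json.loads(request_body)
    score = score_text(payload.get("text", ""))
    return json.dumps({"flagged": score >= THRESHOLD, "score": round(score, 3)})


def process_message(text: str) -> str:
    """What a background job might do: ask the classifier, then act on the answer."""
    response = json.loads(handle_classify(json.dumps({"text": text})))
    return "alert_parent" if response["flagged"] else "no_action"
```

In a setup like the one described, the caller of `process_message` would be a background worker rather than a web request, which is why web traffic says little about how much work the system is doing.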
And either hand-labeled it or farmed it out to somebody else to kind of put human labels on it. But finding a data set, that was the challenging piece for us. We said, hey, we're going to create a cyberbullying classifier. We need data. We need data, one, that says this is not cyberbullying, so the negatives are really important. Then we need data to say this is cyberbullying. Unfortunately, or fortunately, at the time, and maybe more so today, you had a lot of data sets that were focused on sentiment, so like analysis of comments on YouTube, or Wikipedia articles, or reviews on Amazon, that type of thing. So you'd have sentiment analysis that might take a bunch of reviews and categorize them into certain buckets. You'd have this data, and ultimately you need to do either supervised or unsupervised learning on it. And so that was the challenge: how do we put together the first version of a classifier that can be trained on data that we didn't have? And so ultimately you go to places, four and a half years ago, that you felt to be toxic, where there was a likelihood some kind of cyberbullying was going on. YouTube comments on large videos come to mind, so we scraped a bunch of stuff from there. Message boards, Reddit at the time; there were certain ones that were more abusive than others. And yeah, we started labeling those things. We started off doing it ourselves, so we built a tool where we'd look at something and say yes, this is bullying, or no, it's not. We tried some Mechanical Turk stuff. It was largely internal employees that helped us build that first data set. And so we wanted to monitor for a lot of things, but out of the gates you can't go out with a lot of things because it's just too overwhelming. Introducing a single classifier to do one thing really well is really difficult, and then you try to do that across...
You know, at this point we have about nine that are really good, and then subtle ones for about another fifteen. So we knew what we wanted long term, and whittling that down to the MVP version of that was like, okay, let's focus on cyberbullying. And then you think about cyberbullying and how it relates to hate speech. Well, hate speech is typically like a subset of cyberbullying. Often you'll see cyberbullying remnants, and something might involve hate speech, and so you're kind of giving these pieces of text multiple labels. And I think one of the early concessions we made was this. Let's say you're exchanging messages with a friend and there are fifty messages back and forth, you each sent twenty-five. We would analyze each line separately, without context. So you'd look at a line that says, like, "Hey, what's up," and we would analyze that. The next line, joking around, "Hey, what are you doing this weekend?" or something. And then you might get to a line that says, like, "I hate you," and we'd say, whoa, that might be cyberbullying. There's very little context. We didn't really understand at the time that "I hate you" might have come before a next message that was, you know, a smiley face emoji or something. We knew that we wanted that, but that was a lot harder, and so we started off with just this kind of single-message approach to analyzing. It was easier to go out of the gates with. But then we found quickly that for a parent, when we did surface an issue, it would be like "I hate you," and they'd be like, well, what happened? What caused my son or my daughter to say that, or what caused the other person to say that? Was it a joke? They very quickly said to us, I don't understand what this is about. Can you give me the messages? And our ethos at the start of the company was very focused on privacy for the child. The other products in the market were focused
on just giving the parents the entire social media feed and letting them sift through it. So it's really just emailing you, every night, a list of a thousand-some messages the kid might have been involved in. So it took a lot of the parent's time, and it didn't really offer the kid any privacy. The kids could talk about something and the parent would know everything about it. So we set out to do this in a way that, hey, we're going to only surface the issues. The child gets their privacy. A parent gets involved when we believe, based on what the classifiers are saying, there's an issue. So we just said, hey, we're going to display the five messages before and the five messages after for the parent to see, so they can understand maybe what led to it and then what was said after.

Right, that's interesting. It's interesting to hear about the early days too. You were kind of pioneering this early machine learning, data science based solution, where there weren't a ton of frameworks out there for this. There might have been some NLP, natural language processing, stuff that you could take advantage of, maybe, but there certainly weren't any frameworks out there for "hey, this is cyberbullying or not" or "this is hate speech or not." And so you essentially had to build that by using structured and unstructured learning in the machine learning space. It must have been an epic mountain to climb to build it, but also an incredibly exciting product to be working on, when you see the computer make decisions like that and you're like, wow, this is amazing.

At the time, I really don't think I knew how long it would take to be good at it, and I don't know that that would have changed anything. I think I thought we'd get better at it faster. Four and a half years ago, you didn't have a lot of the tools.
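The "five messages before and five messages after" idea mentioned earlier in this answer can be sketched as a simple slice around the flagged message. This is a minimal illustration, not the product's actual code: the function name, its parameters, and the choice to clamp the window at the start and end of the conversation are assumptions.

```python
from typing import List


def context_window(messages: List[str], flagged_index: int, radius: int = 5) -> List[str]:
    """Return the flagged message plus up to `radius` messages on each side."""
    if not 0 <= flagged_index < len(messages):
        raise IndexError("flagged_index is outside the conversation")
    # Clamp the window so it never runs past either end of the conversation.
    start = max(0, flagged_index - radius)
    end = min(len(messages), flagged_index + radius + 1)
    return messages[start:end]


# Hypothetical fifty-message conversation where message 20 was flagged.
conversation = [f"message {i}" for i in range(50)]
shown_to_parent = context_window(conversation, 20)
```

Only the slice is shown to the parent; the rest of the conversation stays private, which matches the stated goal of surfacing issues without handing over the whole feed.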
There was no real mention of neural net stuff. I mean, there probably was in smaller communities, but it certainly wasn't mainstream. It was like, we're not sure about this; it was an approach probably being pitched in academia, but it hadn't really caught on. And so you didn't have these tools out of the box to give you a model that worked pretty close. Now there are pre-trained models that you can get off the shelf that work really well. You throw some data at it, it will get pretty close, and you don't need a PhD in data science to get you there. It was quite different then. So we were using a lot of kind of old approaches that worked well. A lot of the tactics came out of internet search, you know, distances between similar types of words, and some of those kind of shallow learning techniques, leveraging TF-IDF, that type of thing, that you would commonly see to understand: is this word, or this phrase or sentence, similar to the other phrase or sentence that I know to be cyberbullying, even though it's not quite the same? And so we started off with very, very basic approaches. It's obviously gotten far more complex now, but it has also been fueled by the accelerated development of technology in data science, and tools, and TensorFlow, and just these massive companies doing incredible research and making this easier and more approachable for everyone. I mean.
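The similarity question raised above (is this phrase close to one already known to be cyberbullying?) can be sketched with TF-IDF weighted cosine similarity. The transcript is garbled where the technique is named, so treating it as TF-IDF is an assumption, and this is not the team's actual implementation. The function names, the simple tokenizer, the smoothed IDF formula, and the example phrases are all chosen only for illustration.

```python
import math
from collections import Counter
from typing import Dict, List


def tokenize(text: str) -> List[str]:
    """Lowercase and strip basic punctuation; a deliberately naive tokenizer."""
    tokens = [w.strip(".,!?'\"").lower() for w in text.split()]
    return [t for t in tokens if t]


def tfidf_vectors(docs: List[str]) -> List[Dict[str, float]]:
    """Build one sparse TF-IDF vector per document, using a smoothed IDF."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical phrases: one labeled example and two new messages to compare.
known, similar, unrelated = tfidf_vectors([
    "I hate you so much",
    "I really hate you",
    "what are you doing this weekend",
])
```

With these phrases, `cosine(known, similar)` comes out higher than `cosine(known, unrelated)`, which is the kind of signal a shallow approach like this relies on. It also shows the limitation discussed earlier: each phrase is scored on its own words, with no awareness of the surrounding conversation.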

