Donald Trump, Sergei Markov, Chad discussed on City Arts and Lectures


Sergei Monakhov, of Friedrich Schiller University in Germany, has tried to understand the trolls by adopting the mind of one.
I started with a kind of mental experiment, trying to imagine what I would do were I a paid Internet troll. My boss would tell me to post some number of tweets all saying the same thing, but with different words. So he would tell me: your task today is to write 1,000 tweets, or 10,000 tweets, all saying that Donald Trump is great. Actually, that would be a hard task. I would have only a limited number of target words, "President", "Trump", "great" and some synonyms of "great", but the number of times that I have to post is very big. I have to repeat myself as little as possible, so that my fake Internet accounts wouldn't be compromised and I wouldn't be identified as a paid Internet troll.
Right. So it means that words appear in these tweets in a subtly but nonetheless detectably different way to how they might in just everyday, natural, less troll-like communication?
Yeah, I have an analogy. You could think about it like this. Suppose that one day you come home from your work. You go to the bathroom and you find your boss sitting there. That's a very unusual thing to find in your bathroom. And then it gets worse: you go down to your car and on the back seat you see an old schoolmate who moved to Japan like 20 years ago. You start questioning your world at this point. What does it mean? How can it be? It's a mess. And that madness comes about because the associations are all wrong. The places are all familiar, the people are all familiar, but the association between those people and places is now completely disrupted. That's exactly what this is all about. The words are the same, the contexts are the same, but everything is mixed up. You know, there's a famous saying: you shall know a word by the company it keeps. It means that in natural languages, words are defined by their contexts. So words belong to their contexts.
They don't move all over the place in some crazy manner. When we speak about troll tweets, we can say that with words in troll tweets, there are no close friends and there are no bitter enemies. All the words are just good acquaintances.
Which brings us to the whole point of your research. So how do we identify the trolls? You're homing in on the context of language. It's a linguistic approach, which contrasts with other approaches that have looked at, for instance, geographical location, or maybe the links that trolls post in their tweets, or maybe the use of hashtags, those kinds of things. You're taking this linguistic approach and, crucially, you're saying that it's a quicker way, and that's very important because one wants to track these people down and close their accounts. This is a quicker way of finding them. How much quicker is this approach?
Way quicker. We only need 50 tweets to start making reliable predictions. By reliable I mean those with 86% accuracy, with only 50 tweets. And with every increase in the number of tweets, the accuracy of predictions goes up. With still as few as 200 tweets, the model is almost unbeatable: it's 99.99%.
And given that trolls are turning out tweets incredibly quickly, hundreds, maybe thousands of tweets a day, within a few hours, possibly, your algorithm could kick in and say, that's a troll account, and flag it up, maybe to be closed down.
So it's very fast. It's very accurate. Actually, I don't remember anyone, even with millions of tweets to analyze, coming even close to this precision score.
That's Sergei Monakhov. So, Bill Thompson, we're back on algorithms again. Now we can throw algorithms at this troll problem, and that's what's going on here, but it's very much about looking at the language, the construction of language, and how words associate with each other.
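Editor's sketch: the idea that words keep characteristic company can be made concrete by counting which pairs of words share a tweet. The Python below is a toy illustration under stated assumptions, not the researcher's algorithm. The function names `pair_counts` and `repeated_pair_share` and the sample tweets are hypothetical, and no threshold for flagging an account is implied.

```python
from collections import Counter
from itertools import combinations


def pair_counts(tweets):
    """Count how many tweets each unordered word pair appears in.

    Assumption: words are split on whitespace and lowercased; a real
    system would need proper tokenisation and stop-word handling.
    """
    counts = Counter()
    for tweet in tweets:
        # Sort the distinct words so each pair has one canonical order.
        words = sorted(set(tweet.lower().split()))
        counts.update(combinations(words, 2))
    return counts


def repeated_pair_share(tweets):
    """Share of distinct word pairs that occur in more than one tweet."""
    counts = pair_counts(tweets)
    if not counts:
        return 0.0
    repeated = sum(1 for c in counts.values() if c > 1)
    return repeated / len(counts)


# Hypothetical sample, for illustration only.
sample = ["trump is great", "trump is superb", "weather is nice"]
print(pair_counts(sample)[("is", "trump")])  # pair seen in two tweets
print(repeated_pair_share(sample))           # 1 repeated pair out of 8
```

A statistic like this could be compared between a suspect account and a reference set of ordinary tweets. Which values count as suspicious would have to be learned from labelled data.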
Indeed, it's algorithms all the way down, really, basically resting on the back of a giant turtle floating through space, clearly. And in this case, I think it's fascinating. Just the basis in human psychology that Sergei's work rests on really fascinates me: an understanding that there are limits to how much repetition a person can produce, and how much variation you can get into your style when you're asked to do the same task many, many times. Therefore you can use that to create a fingerprint, and that then allows you to identify enough similarity to be able to categorise, with a good degree of certainty, in this case tweets or language coming from the same source. And obviously it's really important that we deal with this sort of problem. Our social media is overwhelmed with this sort of malware, this misinformation, these people who are trying to confuse not just political discourse but almost any debate that happens online, to the point where the usefulness of these services is becoming limited. You can spend a lot of your time, if you're a heavy user of Twitter or any of the other social media, trying to block people trying to troll you and stuff like that. But that just takes away from useful time in your life. So I'm really pleased to see some significant and properly grounded work that might make it easier for the social media players to identify and take down this malicious content.
Bill, thank you for that. Well, finally, the power of data saving lives by predicting which regions are most vulnerable to COVID-19 outbreaks. That's the idea behind the Africa COVID-19 Community Vulnerability Index. It's the first tool of its kind across the continent to have mapped out, region by region, where the risk areas are. Now, the index doesn't predict where outbreaks are most likely; instead it charts how well these regions are likely to cope with the disease.
And it's a huge processing exercise, pulling together data on age, epidemiology, population density and countries' health systems. The index comes from the Surgo Foundation, a body that combines behavioral science and artificial intelligence to improve lives. The co-founder and executive director is Sema Sgaier.
This index, what it does is that it takes 48 countries and 751 regions across those 48 countries in Africa, and it looks at their ability to weather not only the health impact of this pandemic but also the social and economic impact, which is absolutely essential.
So what kind of things does it look at?
Many different data points, and those include things like old age, which we know is absolutely critical for the impact of COVID. It looks at the strength of the health system. It looks at underlying socioeconomic conditions. It looks at fragility, housing type, transport. It's evidence-based. It looks at all the factors that we know are critical for the impact of this pandemic.
Are there a few really big findings, maybe possibly surprising findings, that really jump out?
So, one thing we're seeing is that the regions that are most highly vulnerable to the impact of this pandemic are actually concentrated in three countries: the Democratic Republic of Congo, Mali and Chad. When we look across countries, while they may have similar levels of vulnerability, the reasons why they may be vulnerable can be quite different. For example, South Africa is highly vulnerable due to the underlying chronic conditions that are at high levels in its population. Whereas when we look at a country like Chad, the high levels of vulnerability may be due to socioeconomic factors. And then if we look at another country like Cameroon, Cameroon is highly vulnerable because of the underlying fragility.
And so what's really important about this index is not only does it tell you which countries or regions are highly vulnerable, but it also tells you why, so that we can be much more precise in our response to this pandemic.
And of course, this is about data. It's about a lot of processing, really, to bring a whole lot of information sources together so that you can get these insights from the index. Can you give me a sense of the data challenge here?
For us to be able to know what's happening with this pandemic, we need good levels of testing. Testing levels across the continent vary enormously, but they're quite low. And so, given the fact that we don't have good data in terms of where this virus is and how it's going, we have to rely on a lot of other data sources to be able to make these kinds of predictions. And so we used many different sources of well-validated data. One big data source is called the DHS survey. This is a survey that's been happening for the last 20 years across the continent, and it collects information on a lot of different variables. We've used data from the World Bank and the WHO. So many different data points and data sources came together to actually construct this pretty data-rich index.
How do you even go about processing all that data? Are there off-the-shelf software tools you could use to pull out the insights that you need, or have you had to develop some of your own tools, maybe even AI, to really make sense of it all?
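Editor's sketch: an index that reports both how vulnerable a region is and why can be pictured as a set of theme scores that are averaged into one overall score, with the highest-scoring theme named as the main driver. The Python below is a minimal illustration of that idea only. It is not the Surgo Foundation's methodology; the region names, indicator names and numbers are hypothetical placeholders, and it assumes every indicator is oriented so that a higher value means more vulnerable.

```python
def min_max(values):
    """Rescale a list of numbers to the 0..1 range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def vulnerability(regions, themes):
    """Score each region overall and name its highest-scoring theme.

    regions: {region name: {indicator name: raw value}}
    themes:  {theme name: [indicator names]}
    Assumption: equal weights for indicators and for themes.
    """
    names = list(regions)
    scaled = {name: {} for name in names}
    for indicators in themes.values():
        for ind in indicators:
            column = min_max([regions[name][ind] for name in names])
            for name, value in zip(names, column):
                scaled[name][ind] = value
    result = {}
    for name in names:
        theme_scores = {
            theme: sum(scaled[name][i] for i in inds) / len(inds)
            for theme, inds in themes.items()
        }
        result[name] = {
            "overall": sum(theme_scores.values()) / len(theme_scores),
            "driver": max(theme_scores, key=theme_scores.get),
            "themes": theme_scores,
        }
    return result


# Hypothetical placeholder data, not real measurements.
themes = {
    "age": ["share_over_65"],
    "health_system": ["beds_gap"],
    "fragility": ["conflict_events"],
}
regions = {
    "region_a": {"share_over_65": 10, "beds_gap": 2, "conflict_events": 0},
    "region_b": {"share_over_65": 5, "beds_gap": 8, "conflict_events": 1},
    "region_c": {"share_over_65": 0, "beds_gap": 5, "conflict_events": 10},
}
for name, score in vulnerability(regions, themes).items():
    print(name, round(score["overall"], 2), score["driver"])
```

Keeping the per-theme scores alongside the overall score is what lets two regions with similar totals be told apart by their main driver.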
