Martin Casado and Peter Wang discussed on a16z


Automatic TRANSCRIPT

And I come from a physics, computational physics background, and we've both kind of been pushed into this data, all this data science, and I don't know if that is coincidence or if we have an affinity for it. Before we get into that, though, there's this kind of competing view of the world, which basically says SQL can do everything. This is one where we spent a lot of time actually looking at the data science, or the data, landscape, and it feels like there are two worlds. There's, like, the data warehouse maximalists: we'll stick all the data in the data warehouse, and then we're going to do SQL, and then we're going to have some extensions to SQL, like you see popping up in BigQuery or whatever, and that can do everything that needs to be done. And, oh, by the way, if someone's using Python and R, all they're doing is basic regressions, so we can just make that a simple extension and we're done. And then there's the other view of the world, which I like to call the Hadoop refugees, which is: actually, we do hardcore computation, and we need R and Python because the stuff we do is very sophisticated. I know you're squarely on one side of those, but I wonder, do you think there's a convergence that happens? Do these stay two worlds? Does one become irrelevant? Like, what happens there?
Just because you oppose extremism doesn't make you an extremist, right? <unk> I see this world, and it's the old yarn about, I guess, I don't know, there are so many variants of this. Alan Perlis, a great computer scientist, has some really great quotes, some irreverent ones, about these kinds of things. But I would say that, to the idea that everything can be expressed in SQL: which SQL, with how many extensions? Because at the end of the day, with how many extensions on extensions, and Multicorn on your Postgres?
<unk> kernel, I guess you're doing SQL, but you're running a Python script. No, that doesn't really count. And frankly, a lot of stuff that runs in Access and VB in this world isn't SQL. I think if you choose to look at the world through a particular lens, you can choose to count everything else as rounding errors. But if you take off those lenses, you see a much more diverse landscape, and I think that's where, for me, I see the space for SQL. I understand the reasons why it has evolved into a particular kind of animal. Like, the shark is still the best predatory fish in the ocean, but it's not the biggest predator in the world. And I think there's something to that: if you're in the ocean, you're going to be basically shark-like if you're going to eat a lot of fish. So if you're in that business data analytics world, especially because a lot of business data looks like fish, it's evolved to look like food for the sharks. So that's kind of the way it is. But what Hadoop opened up, back in 2012 I called it the Hadoop battering ram, as in: we're not going to win at the Hadoop gate. We'll let the vendors go and fight against the Teradatas, the Oracles, all the classical data warehouse guys. Let them do that thing. Once they've battered down the door, we're going to come flooding in with all sorts of heterogeneous approaches to data science and analytics, things that are hard to ask in SQL. And moreover, there's a term I use which I don't hear used very often. Obviously there's shadow IT, which is used, but there's a shadow data management that's a far more serious and dangerous problem. When I was at an investment bank, they had a million-dollar Oracle database sitting somewhere, and it was too slow to actually run the analytics they needed. So they had an instance of this Oracle database that cost a million bucks, and the only query they ran was a big full table dump into a CSV, and then they took the CSV.
And they did everything else with it, and it was Python scripts, Java crap, a bunch of other stuff. So if you're a data manager, if you're in the data management practice, you say, well, we have another big old million-dollar instance stood up, our data management techniques are great. <unk> Potemkin village, right? But then when you actually go and ask the developers, hey, where's the source data for this stuff, where's prod data coming from? They're like, oh yeah, this file share, backslash backslash something or other. And I'm like, that file? What about the database? Don't touch the database. <unk> Right? So there's this kind of stuff going on, and everybody listening knows what I'm talking about. That shadow data management is absolutely a pernicious problem, and data science is just eating it alive, because to ask the question you want to ask, you have to integrate data together. Master data management is about silos <unk>.
And with all this stuff you've hit on something which I just think is so germane to what we're here to talk about, which is: there are clearly problem domains which SQL is totally fine for, right? (Yep.) And you can argue there are problem domains, like any sort of hardcore statistics, which it's just not very good for. And the point of us being on this podcast is actually to talk about, like, listen, these are kind of new types of companies and new types of workloads, and they're around processing data. And I hear you that this shadow data management is a real issue, and you can make an argument about why that exists: because people are stupid, or they don't want to do good workflows, or it's, like, literally we don't have the tooling to deal with data in the right way. One question that I have, that I would love to hash out with you, is: are we seeing a fundamental shift in workload that requires a fundamentally new set of tools?
And a fundamentally new type of company? Or is this just more of a transition, where we can kind of put the old tools into service? And I just want to be a little bit more specific, which is: in the past you had your toolkit of systems approaches and software systems, and you'd kind of pull them out and apply them to the problem. SQL is one of them. And we kind of understood how the systems behave, and we kind of understood how the companies built around them behave. As an investor looking at a lot of data companies, they just don't look the same: the types of tools they use, the type of operational practice they use. The one that you pointed out was a great one, which is that data becomes a primitive that you want to actually apply software techniques to, in a way, but we don't have the tools to do that. And then we've written posts about how margin structures look a lot different, how the way you go at your company is different. And so, do you think this mess is because data scientists don't have formal CS training, or do you think this is an entirely different problem domain, and we should actually look at what the future looks like for that, and development tools, etc.?
This is at the heart of what we're talking about. This is absolutely the heart, and I will try to start from the top, which is this concept that every baby, every child, is born and raised thinking their childhood is normal, right? You think of your childhood as the normal thing. So you have developers coming online in the late 2000s, let's say, and they think this is the world. Even me, as a professional starting in '99: well, this was just what there is. The more you start researching history and looking back, you realize, you know what, what we're building in this industry is layers: it's frozen accident on top of frozen accident on top of frozen accident. Very, very few times do people make principled, intentional, revolutionary shifts, right?
It's basically band-aids on a substrate. Okay. So starting from the top, what I would say is that there is no law, there was nothing carved in stone that Moses brought down from the mountain that said all information systems must be deconstructed into hardware and software and data. There's no such thing. It was information systems, full stop. The fact that we had different cost structures for innovation in hardware versus software versus networking, and so forth, has led to different rates of innovation in different places, things like that. And so when a business steps in and says, okay, what's on the shelf that I can use to accelerate my business processes, then it makes sense, because it's this thing and that thing. Like when you buy a car: you buy the car and then you put CDs in the car. You don't go buy a car with a CD prefect
