Build Your Personal Search Engine With Datasette


Let's start at the foundation of this recent work you've been doing. In some sense it's a natural progression, right, from the journalism side of things, where the origin came from. So tell us about Datasette.
Datasette, on its website, I call it a multi-tool for exploring and publishing data. Basically it's a web application which you can point at a SQLite relational database, and it gives you pages where you can browse the tables and run queries. It lets you type in custom SQL queries and run them against your database. It lets you customize the templates. It lets you get data back out as JSON, so you can use it for API integrations. And it lets you publish the whole thing on the internet really easily. So it's a lot, and one of the biggest challenges I've had is: how do I turn this into a bite-size description that really helps people understand what the software does? I'm at the point now where, if I can get somebody on a video chat, I can do a fifteen-minute demo and at the end of it they're going, "I totally get this, this is amazing." But that's not how you sell software. It doesn't scale well.
Yeah. Well, let me see if I can, with my limited exposure to it and knowing somewhat where we're going. You have this data source that's pretty ubiquitous, or can become ubiquitous through some sort of ETL, with SQLite.
Right, SQLite is everywhere. What's beautiful about it is there's no "please set up the server and make it not run as root and then put it on your network" and so on. SQLite boasts that it's the most widely distributed database in the world, which it is, because it runs on every phone. My watch has SQLite tracking my steps. Every iPhone app, every Android app, every laptop.
They're all running it, right down to your phone. That's crazy.
It's a file format. A SQLite database is a single .db file on disk, which, like you said, makes it so convenient.
Because I didn't have to ask a sysadmin to set me up with Postgres or anything like that. I just create a file on my laptop and that's my database.
Yeah, and it's even built into Python, right? It just comes with Python.
Yeah, exactly.
So that's super cool. And it's great that we have this data format, that if we have data in there, or you could do an API call and then jam the data in there, something to get it into that format, which is great. You can explore that with Beekeeper Studio or some data visualization tool or a SQL management studio. But that doesn't work for journalists. That doesn't work for getting it on the internet. That doesn't give you the transformations. In some sense I see it almost as a really advanced web-based data IDE, but user friendly.
Yeah, but the emphasis is absolutely on publishing, on getting it online. And then it's on being web-native: everything in Datasette can be got out as JSON as well as HTML. It can give you the content as CSV too. It lets you paste a SQL query in and get it in a query string, so you can bookmark queries, all of that kind of stuff. I think that's really the key idea: how do you take relational databases and make them as web-native as possible, and as cheap and inexpensive to host and run as possible, so you can take any data that fits in a SQL database, which is almost everything, and stick it online so that people can both explore it and start integrating with it as well.
Another key idea in Datasette is its plugin system. I've actually written over fifty for it now, which add all sorts of different things: different output formats to get your data out as an Atom feed or an iCal feed, plugins for visualizations that plot the data on a map or give you charts, line graphs, and so on. Just this morning I released an authentication plugin that supports the IndieAuth authentication mechanism.
So you can use IndieAuth to log in and password-protect your data, all of these different things. And honestly, having a plugin system is so much fun, because I can come up with a terrible idea for a feature and I can build it as a plugin, and it doesn't matter that it's just an awful idea that nobody should ever have implemented, because I'm not causing any harm to the core project.
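
The two ideas from the conversation, a database that is just a file and a SQL query carried in a URL, can be sketched in a few lines of Python. This is a minimal illustration, not code from the episode: the `steps` table, the sample rows, the helper names, and the `http://localhost:8001` address are all assumptions made for the example. The URL shape (`/<database>.json?sql=...`) follows the pattern Datasette uses for custom queries, but check it against the version you are running.

```python
import sqlite3
import tempfile
from pathlib import Path
from urllib.parse import urlencode


def build_steps_db(path):
    """Create a single-file SQLite database with Python's built-in sqlite3 module."""
    conn = sqlite3.connect(str(path))
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "CREATE TABLE IF NOT EXISTS steps (day TEXT PRIMARY KEY, count INTEGER)"
        )
        # Hypothetical sample data, purely for illustration.
        conn.executemany(
            "INSERT OR REPLACE INTO steps (day, count) VALUES (?, ?)",
            [("2021-01-01", 4200), ("2021-01-02", 9100)],
        )
    conn.close()


def query_url(base_url, database, sql, fmt="json"):
    """Build a bookmarkable URL that carries the SQL in the query string.

    The base URL and database name are assumptions; fmt can be "json" or "csv".
    """
    return f"{base_url}/{database}.{fmt}?{urlencode({'sql': sql})}"


# No server to set up: the whole database is one file on disk.
db_path = Path(tempfile.mkdtemp()) / "steps.db"
build_steps_db(db_path)

conn = sqlite3.connect(str(db_path))
rows = conn.execute("SELECT day, count FROM steps ORDER BY day").fetchall()
conn.close()

url = query_url("http://localhost:8001", "steps", "select day, count from steps")
print(rows)
print(url)
```

If Datasette is installed, running `datasette steps.db` against a file like this serves it locally, and a URL of the kind built above can then be bookmarked or shared.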
