Although managing data in a relational database has plenty of benefits, such databases are rarely used in day-to-day work with small to medium scale datasets. But why is that? Why do we see an awful lot of data stored in static files in CSV or JSON format, even though such files are hard to query and update incrementally?
The answer is that programmers are lazy, and thus they tend to prefer the easiest solution they find. And in Python, a database isn’t the simplest solution for storing a bunch of structured data. This is what dataset is going to change!
dataset provides two key functions that make using SQL databases in Python a breeze:
A simple data loading script using dataset might look like this:
    import dataset

    db = dataset.connect('sqlite:///:memory:')

    table = db['sometable']
    table.insert(dict(name='John Doe', age=37))
    table.insert(dict(name='Jane Doe', age=34, gender='female'))

    john = table.find_one(name='John Doe')
Here is similar code, without dataset.
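For comparison, a minimal sketch of the same steps using only Python's standard-library sqlite3 module might look like the following. It is an illustration written for this comparison, not necessarily the code the original page linked to. The column types, the `id` primary key, and the explicit `ALTER TABLE` step are assumptions about how one would reproduce the same result by hand.

```python
import sqlite3

# Connect to an in-memory SQLite database, as in the dataset example.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # allow access to columns by name

# Without dataset, the table schema has to be declared up front
# (column names and types here are assumptions for illustration).
conn.execute(
    "CREATE TABLE sometable (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)"
)
conn.execute(
    "INSERT INTO sometable (name, age) VALUES (?, ?)", ("John Doe", 37)
)

# The second row carries a new field, so the column must be added by hand
# before the insert can succeed.
conn.execute("ALTER TABLE sometable ADD COLUMN gender TEXT")
conn.execute(
    "INSERT INTO sometable (name, age, gender) VALUES (?, ?, ?)",
    ("Jane Doe", 34, "female"),
)
conn.commit()

# Look up a single row by name.
john = conn.execute(
    "SELECT * FROM sometable WHERE name = ?", ("John Doe",)
).fetchone()
```

Here the schema has to be written out and altered manually, and every query is a hand-built SQL string. Those are the steps the dataset version leaves out.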
dataset is written and maintained by Friedrich Lindenberg, Gregor Aisch and Stefan Wehrmeyer. Its code is largely based on two earlier libraries, sqlaload and datafreeze. And of course, we’re standing on the shoulders of giants.