Excel or SQL? Storing Data in Spreadsheets vs Relational Databases

Just about all of us have used spreadsheets at some point or another. Whether its been for keeping track of expenses, marking observations in a science class, or anything in between, Excel and other programs for creating and editing spreadsheets can be incredibly useful for holding relatively small amounts of well structured data, and performing limited analysis on them.

However, what if your data table ends up being thousands, or even millions, or lines? If you’ve ever tried to open a CSV file in Excel with a lot of data, you have probably noticed that it starts to lag enough to become practically unusable, or even crashes. And as far as pulling out useful information from spreadsheets apart from eyeballing a small amount of data? You can do some useful arithmetic on the columns and such, but you are fairly limited in what you can do and the kind of relationships between data that you can represent, especially without digging more into the advanced features.

Enter the relational database. If you aren’t particularly tech-savvy or haven’t studied any computer programming, you may not have encountered the term “relational database,” although you may be familiar with the idea of the “database” in general. In essence, a relational database consists of of tables (think rows and columns, just like excel spreadsheets). However, rather than being stored in a user interface with cells you can see and manipulate (although there are some programs that allow you to view database tables in a similar manner), the data in a relational database is accessed using a language called SQL. While learning and using SQL is certainly trickier than navigating a conventional spreadsheet interface, it allows you much more power towards manipulating your data – especially when there are a lot. In addition, relational database platforms are engineered to store far and away more data than your everyday spreadsheet, so if you are dealing with any large volume of data it will be much more manageable. And finally, another significant advantage of relational databases is their ability to allow you to represent relationships between data. A database can, and almost always does, store more than one table. Relationships can be formed between the data in these different tables. Consider, for example, that you are storing a database of authors and their works. You might have two tables – an authors table and a works table – where each author is identified by an ID number, and that ID number is referenced in a column of the works table in order to indicate by which author the work was created. Try modeling just that simple relationship in a spreadsheet, and it will start to get tedious quickly. Start trying to model far more complex relationships between data, and it may become basically impossible.

In short, there are pros to the conventional spreadsheet in terms of storing data. They are extremely quick and easy for smaller sets of simple data, and they are far more accessible for most people as they require a lot less technical knowledge to use. However, the relational database takes the cake for almost any case in which the data is large or complex.

Author: Aidan

Leave a Reply

Your email address will not be published. Required fields are marked *