Flat Data vs. Relational Data: Which is Right for Your Purpose?

When creating a dataset, it’s important to plan ahead. What type of information will you store? Who will access it? What type of performance do you need from a database? These are important questions to ask before creating a system for your data.

Stephen Ramsay addresses some of these concerns in his “Introduction to Databases” tutorial. Ramsay discusses the rise of computerized databases, the modern version of hard-copy indexes that have existed “since the middle ages.” These indexes somewhat resemble flat databases, such as Excel workbooks and other simple spreadsheet models. As computing power has advanced and data has grown rapidly, relational databases have offered more efficient organization and retrieval of information.

Relational databases are advantageous in that they allow entities to be tied together through identifying values called keys. Keys link one aspect of one entity to another. For example, if you’re creating a database of movies, Martin Scorsese might appear in multiple entries of the “directors” category. With a relational database, the director is stored once, and a matching key associates each of his movies with that single entry. Keys make for quick traversal, insertion, and deletion of data.
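To make the movie example concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names, column names, and the two film titles are assumptions chosen for illustration; they do not come from Ramsay's tutorial.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE directors (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE movies (
        id          INTEGER PRIMARY KEY,
        title       TEXT NOT NULL,
        director_id INTEGER NOT NULL REFERENCES directors(id)
    );
""")

# The director's name is stored once; each movie points to it by key.
conn.execute("INSERT INTO directors (id, name) VALUES (1, 'Martin Scorsese')")
conn.executemany(
    "INSERT INTO movies (title, director_id) VALUES (?, ?)",
    [("Taxi Driver", 1), ("Goodfellas", 1)],
)

# Follow the key from each movie back to its director.
rows = conn.execute("""
    SELECT movies.title, directors.name
    FROM movies
    JOIN directors ON movies.director_id = directors.id
    ORDER BY movies.title
""").fetchall()

for title, name in rows:
    print(f"{title} (directed by {name})")
```

Because the name lives in a single row of the directors table, correcting its spelling there updates every movie that points to it.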

Adding metadata brings a number of benefits to a relational database. With metadata, you can declare relationships, maintain consistency in your data, and construct a clear lineage of your data. However, it is important to determine all the fields and relations for your data before diving into data entry.

In many ways, relational databases are a great solution to storing data, but they don’t come without complications. One concern is who holds administrative privileges over the data. Which users should be granted permissions to insert, delete, or reorganize the data? This becomes an issue when your database is integrated into an online platform and might be accessed by thousands of contributors.

Relational databases have many advantages over their ancestors, but they have their drawbacks as well. Here’s a comparison between the two database designs.

Relational databases

Pros

  • Quick retrieval of data
  • Optimal for network solutions
  • Reduce redundancies
  • Popular tools such as MySQL and PostgreSQL are open source

Cons

  • Relations force you to make assumptions about your data (especially in humanistic inquiry)
  • Deleting entities in the database could cause null references or other issues with primary and foreign keys
  • Require some comfort with writing code-like statements, such as SQL queries
  • Possible fragmentation of data with heavy use
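The deletion concern in the list above can be sketched with Python's built-in sqlite3 module. This is only one way a database can respond; the schema is a hypothetical one, and SQLite enforces foreign keys only once the `foreign_keys` pragma is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite leaves foreign key enforcement off unless this pragma is set.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical schema: table and column names are illustrative assumptions.
conn.executescript("""
    CREATE TABLE directors (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE movies (
        id          INTEGER PRIMARY KEY,
        title       TEXT NOT NULL,
        director_id INTEGER NOT NULL REFERENCES directors(id)
    );
    INSERT INTO directors (id, name) VALUES (1, 'Martin Scorsese');
    INSERT INTO movies (title, director_id) VALUES ('Taxi Driver', 1);
""")

# Deleting a director who is still referenced would orphan the movie row,
# so the database rejects the statement.
try:
    conn.execute("DELETE FROM directors WHERE id = 1")
    blocked = False
except sqlite3.IntegrityError as error:
    blocked = True
    print("Delete rejected:", error)

remaining = conn.execute("SELECT COUNT(*) FROM directors").fetchone()[0]
print("Directors still stored:", remaining)
```

Without the pragma, the same delete would succeed and leave a movie pointing at a director that no longer exists, which is the kind of key problem the list describes.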

Flat data structures

Pros

  • Usually an easy format to read
  • Use delimited values (such as commas or tabs) for simple storage

Cons

  • Redundancies
  • Navigation through data is limited to rows and columns
  • Usually used in offline systems
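As a rough illustration of the redundancy drawback, the sketch below reads a small comma-delimited file with Python's csv module. The rows are made-up sample data, and the misspelling in the second row is deliberate: because a flat file repeats the director's name on every row, nothing stops the copies from drifting apart.

```python
import csv
import io
from collections import Counter

# Made-up sample data; the second spelling is intentionally wrong.
flat_data = """title,director
Taxi Driver,Martin Scorsese
Goodfellas,Martin Scorcese
"""

# Each row becomes a dictionary keyed by the header line.
rows = list(csv.DictReader(io.StringIO(flat_data)))

# Count how many rows use each spelling of the director's name.
counts = Counter(row["director"] for row in rows)

for name, count in counts.items():
    print(f"{name}: {count} row(s)")
```

The file is easy to read and edit by hand, but the same person now appears as two different directors, and fixing that means checking every row.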

In other news, I’ve just launched my new website, colehanson.net!

Author: Cole
