Why do you need a database to store your data? Why can’t you just use files on disk?

Efficiency Reason

If you have a little bit of data, just putting it in files and managing it yourself is fine.

If you have a lot of data, you will run into efficiency issues. How will you find a specific piece of data in your large number of files, or in a few giant files? Will you load each file and search all of it, one file at a time, to find the piece of info you need (e.g. the address of someone named abdullah)? Or will you have an “overlord” file that tells you which part of which file contains certain information (for example, information for “abdullah” is all in file “abdullah.xml”)? If you go down the route of having your own overlord file, what happens when the overlord file gets big and unwieldy? Will you have an overlord-overlord file? You can see the issues that arise when you have a ton of data and you want to efficiently access certain bits of it without doing a whole bunch of (slow) disk IO.

A database management system basically handles this stuff for you. It kind of creates its own “overlord” files, and it has “overlord-overlord” files too, but it keeps its overlord files from getting huge and unwieldy. It does a lot of work that you would otherwise have done yourself, to ensure that when you want a specific piece of info, you can find it with just a few disk reads (in other words, it knows where on disk that piece of info lives).
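To make the “overlord” idea concrete, here is a minimal sketch of it in Python. It is not how any real database does it. The record format (one `name,address` line per person), the sample data, and the function names `build_offset_index` and `lookup` are all made up for illustration. The index maps each name to the byte offset where that person's record starts, so a lookup can seek straight there instead of reading the whole file.

```python
import io

def build_offset_index(data_file):
    """Scan the file once and remember the byte offset where each record starts."""
    index = {}
    data_file.seek(0)
    while True:
        offset = data_file.tell()
        line = data_file.readline()
        if not line:
            break
        # Hypothetical record format: b"name,address\n"
        name = line.split(b",", 1)[0].decode()
        index[name] = offset
    return index

def lookup(data_file, index, name):
    """Jump directly to the record using the index; no full scan needed."""
    offset = index.get(name)
    if offset is None:
        return None
    data_file.seek(offset)
    return data_file.readline().decode().rstrip("\n")

# In-memory stand-in for a data file on disk (made-up records).
data = io.BytesIO(
    b"abdullah,12 Elm Street\n"
    b"betty,34 Oak Avenue\n"
    b"carlos,56 Pine Road\n"
)
index = build_offset_index(data)
print(lookup(data, index, "betty"))
```

Note that this toy index lives entirely in memory as a dict. Once it gets too big for that, you are back to the “overlord-overlord” problem, which is exactly what a DBMS solves for you.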

How the efficiency is achieved

The basic unit of storage in an RDBMS is a “file”. A file contains records (rows) and index entries (key-value pairs). The index is implemented using a B-tree, which is much wider than a binary search tree (BST): each node holds many keys, so the tree is shorter, has fewer levels, and needs fewer disk reads per lookup.
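A quick back-of-the-envelope sketch of why “wider means shorter”. The helper `min_levels` is made up for this example, and it assumes a completely full, perfectly balanced tree in which a node with `branching` children holds `branching - 1` keys. A branching factor of 2 stands in for a balanced BST, and 256 is just an illustrative B-tree fanout (real fanouts depend on page and key sizes).

```python
def min_levels(num_keys, branching):
    """Fewest levels a full, balanced tree needs to hold num_keys keys.

    A full tree with h levels and the given branching factor holds
    branching**h - 1 keys (each node stores branching - 1 keys).
    """
    levels = 0
    capacity = 0
    while capacity < num_keys:
        levels += 1
        capacity = branching ** levels - 1
    return levels

keys = 1_000_000
print(min_levels(keys, 2))    # balanced BST: 20 levels
print(min_levels(keys, 256))  # wide B-tree-like node: 3 levels
```

If you assume that each level costs roughly one disk read, that is about 20 reads versus 3 to find one key among a million.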

file
  record
  record
  record
  index
  index
file
  record
  record
  index
file
  record
  index

Other Reasons

Databases have convenience concepts such as:

  • users (different users have different roles)
  • roles (different roles have different permissions)
  • permissions (what tables are accessed? write access? read access?)
  • encryption
  • backups