RIF: REBOL Indexed Files

This is a somewhat longer blog posting today. Sorry about that, but this is an important topic, and I wanted to make sure to give you the whole picture.

It is common for REBOL applications to make use of flat (linear, unstructured) files to implement databases. IOS and AltME programs are examples, and there are others. These programs store their data as a single REBOL block that contains "records" that are sub-blocks. REBOL handles such files quite well due to its ability to LOAD and SAVE data in REBOL format (a powerful feature that does not get mentioned enough to new users, in my opinion).

This method works fine for databases that need fewer than 10,000 records (which actually includes 80-90% of the typical databases in the world, because most databases are things like contact databases, file lists, email boxes, code modules, etc.). In addition, this method has the advantage that you don't need to figure out and maintain an interface to a database management system (DBMS), like MySQL, SQL Server, Oracle, PeopleSoft, or others. That can take a lot of time and energy, it requires more complex installation, and it also makes your software less portable.

But then there are those applications (the AltME and IOS systems are good examples) where you want a bit more than flat files can provide. It's not that flat files don't do the job, but for efficiency reasons there are times when you don't want to load a three-megabyte file (like a message group) to get a 100-byte message. That's a reality. Most of our applications get away with doing that because the load time is so fast, and we just close our eyes and ignore the RAM that was needed to hold the data. On the other hand, we don't want to use a DBMS because that's more than we really need, and we also want easy installation and no headaches.

So, what's a good solution? That's the idea I call REBOL Indexed Files, or RIF for short. RIF is the concept of a REBOL block series implemented as a file structure. Essentially, RIF lets you access (read and write) any sub-block value that is stored in the file. The block is indexed by record number, so it is quickly accessible using a file seek. The records themselves would be variable length, allowing you to store blocks that contain any type of value, including large strings, objects, source code, or even images and sounds.
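To make the seek-by-record-number idea concrete, here is a minimal sketch in Python (not REBOL, and not the actual RIF design). Everything in it is an assumption made for illustration: the `IndexedFile` name, the two-file layout (a data file plus a fixed-width index of offset and length pairs), and JSON as a stand-in for REBOL's own data format.

```python
import json
import os
import struct
import tempfile

# Hypothetical index entry: 8-byte offset and 4-byte length per record.
ENTRY = struct.Struct("<QI")


class IndexedFile:
    """Illustrative only: variable-length records behind a fixed-width index."""

    def __init__(self, path):
        self.data_path = path + ".dat"
        self.index_path = path + ".idx"
        for p in (self.data_path, self.index_path):
            if not os.path.exists(p):
                open(p, "wb").close()

    def __len__(self):
        # Every index entry has the same size, so the count is a division.
        return os.path.getsize(self.index_path) // ENTRY.size

    def _store(self, record):
        # Append the serialized record and report where it landed.
        blob = json.dumps(record).encode("utf-8")
        with open(self.data_path, "ab") as data:
            offset = data.seek(0, os.SEEK_END)
            data.write(blob)
        return offset, len(blob)

    def append(self, record):
        offset, length = self._store(record)
        with open(self.index_path, "ab") as index:
            index.write(ENTRY.pack(offset, length))
        return len(self) - 1

    def read(self, n):
        if not 0 <= n < len(self):
            raise IndexError(n)
        # One seek into the index, one seek into the data: no full load.
        with open(self.index_path, "rb") as index:
            index.seek(n * ENTRY.size)
            offset, length = ENTRY.unpack(index.read(ENTRY.size))
        with open(self.data_path, "rb") as data:
            data.seek(offset)
            return json.loads(data.read(length).decode("utf-8"))

    def write(self, n, record):
        if not 0 <= n < len(self):
            raise IndexError(n)
        # Store the new bytes at the end, then repoint index entry n.
        offset, length = self._store(record)
        with open(self.index_path, "r+b") as index:
            index.seek(n * ENTRY.size)
            index.write(ENTRY.pack(offset, length))


with tempfile.TemporaryDirectory() as folder:
    db = IndexedFile(os.path.join(folder, "messages"))
    db.append(["carl", "RIF draft posted"])
    db.append(["reader", "Looks useful"])
    db.write(0, ["carl", "RIF draft posted (edited)"])
    print(db.read(1))  # ['reader', 'Looks useful']
```

In this sketch, modifying a record appends the new bytes and leaves the old ones behind as dead space. That keeps the example short, but a real design would have to decide how to reclaim that space.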

The benefits of RIFs are simply that you can efficiently load a single record without loading the entire database; you can modify a single record; it's platform independent (as is REBOL); and you don't need an external DBMS. Also, RIFs would allow records themselves to be stored in REBOL binary format, REBin (more later).

In addition to RIFs providing a series of records, they would also provide a single "global block" that can be used to hold a small amount of non-record data, such as the database name, access controls, lock tags, REBin symbol tables, or whatever other "meta data" you might need for your DB application.
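One possible way to picture the global block, again as a hedged Python sketch rather than the real RIF layout: reserve a fixed-size region at the start of the file and keep the metadata there, so it can be rewritten without touching the records that follow. The 4096-byte `HEADER_SIZE`, the function names, and the JSON encoding are all assumptions for illustration.

```python
import json
import struct

HEADER_SIZE = 4096            # assumed space reserved for the global block
LENGTH = struct.Struct("<I")  # byte length of the stored metadata


def write_global(path, meta):
    """Store meta in the reserved region at the start of the file."""
    blob = json.dumps(meta).encode("utf-8")
    if len(blob) > HEADER_SIZE - LENGTH.size:
        raise ValueError("global block too large for the reserved header")
    header = (LENGTH.pack(len(blob)) + blob).ljust(HEADER_SIZE, b"\0")
    try:
        f = open(path, "r+b")   # existing file: keep the record data
    except FileNotFoundError:
        f = open(path, "w+b")   # new file: start with just the header
    with f:
        f.seek(0)
        f.write(header)


def read_global(path):
    """Read the metadata back without reading any records."""
    with open(path, "rb") as f:
        (size,) = LENGTH.unpack(f.read(LENGTH.size))
        return json.loads(f.read(size).decode("utf-8"))
```

Because the region has a fixed size, updating something like a lock tag is a single small write at offset zero, which matches the "small amount of non-record data" described above.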

So, that's the RIF concept in a nutshell. Note that this is not a relational database system, and in fact, it's not even an RDB table because there are no keys other than the implied record index. Of course, it would be possible for you to implement a relational database by creating multiple RIFs, if you want to do that.

This last point brings up the question: why not implement an RDBMS as a standard part of REBOL (at the native C level)? It turns out that I've considered that idea many times. The problem is that most RDBMS implementations are very complex, and their code is quite large. That's the nature of the beast, especially if you want to store the entire DB in a single file. If there were a relational DB that required only 30-50 KB of code, was OS portable, and had a free usage license, I'd strongly consider it. But, so far, that gem is not to be found.


Updated 7-Mar-2024 - Copyright Carl Sassenrath - WWW.REBOL.COM