30 August 2010

Were you a Lego kid? Or Meccano?

If you were, you’ve gotta like Unix (in this case, Linux) ’coz that’s what it’s all about.

Case in point is an app I’m working on (supposedly enhancing, in practice damn near rewriting), written in C, which stores its data in flat files. No indexes. It just writes a gazillion struct whatever; records straight into the data file.

The current problem is that as data tots up, it gets very slow, since it processes each data file serially when reporting, & if there are cross-references it serially scans the related (thankfully small) data file for each record to fetch the required text.
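To make that concrete, here’s a rough sketch of what this style of storage & lookup looks like. The struct category layout, the function names & the file path are all made up for illustration; the real app’s records aren’t shown here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record layout, standing in for one of the app's structs. */
struct category {
    int  id;
    char name[32];
};

/* Append one raw struct to the end of the flat file: no index, no header.
 * Returns 0 on success, -1 on any I/O error. */
int append_category(const char *path, const struct category *rec)
{
    assert(path != NULL && rec != NULL);
    FILE *fp = fopen(path, "ab");
    if (fp == NULL)
        return -1;
    size_t n = fwrite(rec, sizeof *rec, 1, fp);
    if (fclose(fp) != 0)
        return -1;
    return n == 1 ? 0 : -1;
}

/* Serial lookup: read records from the start until the id matches.
 * Returns 0 and fills *out when found, -1 if not found or on error.
 * Called once per report line, this is where the time goes. */
int find_category(const char *path, int id, struct category *out)
{
    assert(path != NULL && out != NULL);
    FILE *fp = fopen(path, "rb");
    struct category rec;
    int found = -1;

    if (fp == NULL)
        return -1;
    while (fread(&rec, sizeof rec, 1, fp) == 1) {
        if (rec.id == id) {
            *out = rec;
            found = 0;
            break;
        }
    }
    fclose(fp);
    return found;
}
```

Every lookup reopens the file & reads from the top, so a report over N records with a cross-reference file of M records costs roughly N×M reads.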

There’s a background problem too: multiple users running the same app (which must all be done on the same computer) have no clear way of avoiding collisions. The inevitable result is that the second (or any later) user requiring access to a record must wait until the earlier user(s) have finished with it, & if the first user crashes, the unlock() never happens, so they wait forever (somebody has to reboot the “server” PC).

This data access system will be the second against the wall when the revolution comes (the first is printing, currently hard-wired to a specific device-node behind which rests a specific model of dot-matrix impact printer; the third... well, does printf("\33[%d;%dH",r,c) look familiar to anybody?), & I aim to replace it by twisting this app’s arm into using an SQL database.

This has the advantages of speed, better multi-user interfaces, multiple apps running on multiple PCs able to work on one (potentially remote) database at the same time, & the ability to back up just the data, potentially using a commercial backup system.

Setting up the many required SQL TABLEs was looking a bit tedious, but this construction-set operating system is making it very easy.

I’ve written a small gawk script which is fed each source file, to extract the struct definitions & translate each into a CREATE TABLE SQL statement.
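The gawk script itself isn’t reproduced here, but the mapping it performs can be sketched in C. This is only an illustration of the idea, not the actual script: struct field, enum ctype & make_create_table() are names invented for the example, & the type mapping (int to INTEGER, double to DOUBLE PRECISION, char[N] to VARCHAR(N-1)) is an assumption about what a sensible translation might be.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical description of one struct member, as a parser might emit it. */
enum ctype { C_INT, C_DOUBLE, C_CHAR_ARRAY };

struct field {
    const char *name;
    enum ctype  type;
    size_t      len;    /* array length; only used for C_CHAR_ARRAY */
};

/* Build a CREATE TABLE statement in buf from a list of struct members.
 * Returns 0 on success, -1 if buf is too small or a field is unusable. */
int make_create_table(char *buf, size_t size, const char *table,
                      const struct field *f, size_t nfields)
{
    assert(buf != NULL && table != NULL && f != NULL);
    int n = snprintf(buf, size, "CREATE TABLE %s (", table);
    if (n < 0 || (size_t)n >= size)
        return -1;
    size_t used = (size_t)n;

    for (size_t i = 0; i < nfields; i++) {
        const char *sep = (i + 1 < nfields) ? ", " : "";
        switch (f[i].type) {
        case C_INT:
            n = snprintf(buf + used, size - used, "%s INTEGER%s",
                         f[i].name, sep);
            break;
        case C_DOUBLE:
            n = snprintf(buf + used, size - used, "%s DOUBLE PRECISION%s",
                         f[i].name, sep);
            break;
        case C_CHAR_ARRAY:
            if (f[i].len < 2)
                return -1;
            /* Assumes the array holds a NUL-terminated string. */
            n = snprintf(buf + used, size - used, "%s VARCHAR(%zu)%s",
                         f[i].name, f[i].len - 1, sep);
            break;
        default:
            return -1;
        }
        if (n < 0 || (size_t)n >= size - used)
            return -1;
        used += (size_t)n;
    }

    n = snprintf(buf + used, size - used, ");");
    if (n < 0 || (size_t)n >= size - used)
        return -1;
    return 0;
}
```

Given a struct with int id; char name[32]; double price; this would produce CREATE TABLE category (id INTEGER, name VARCHAR(31), price DOUBLE PRECISION); which both PostgreSQL & MySQL should accept.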

Another script extracts references to each struct within invocations of the write() function, matching the file-handle variable against the same variable name used in an open() call to find the data file’s name.

This can then be used to create on-the-fly C apps (compile with make, then run) which extract the data from the files & morph it into INSERT INTO SQL statements.
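One of those generated converters might look something like the sketch below. Again, struct category & the function names are hypothetical, standing in for whatever the scripts actually emit. The quoting simply doubles single quotes, which is standard SQL but not a complete defence for every engine, so a real converter would more likely lean on the database library’s own escaping.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record layout, standing in for one of the app's structs. */
struct category {
    int  id;
    char name[32];
};

/* Write src as an SQL string literal into dst, doubling single quotes.
 * Reads at most srclen bytes, stopping early at a NUL.
 * Returns 0 on success, -1 if dst is too small. */
static int sql_quote(char *dst, size_t size, const char *src, size_t srclen)
{
    size_t o = 0;
    if (size < 3)
        return -1;
    dst[o++] = '\'';
    for (size_t i = 0; i < srclen && src[i] != '\0'; i++) {
        if (o + 4 > size)   /* up to 2 chars + closing quote + NUL */
            return -1;
        if (src[i] == '\'')
            dst[o++] = '\'';
        dst[o++] = src[i];
    }
    dst[o++] = '\'';
    dst[o] = '\0';
    return 0;
}

/* Format one record as an INSERT statement. Returns 0 or -1. */
int format_insert(char *buf, size_t size, const char *table,
                  const struct category *rec)
{
    assert(buf != NULL && table != NULL && rec != NULL);
    char quoted[2 * sizeof rec->name + 3];
    if (sql_quote(quoted, sizeof quoted, rec->name, sizeof rec->name) != 0)
        return -1;
    int n = snprintf(buf, size, "INSERT INTO %s (id, name) VALUES (%d, %s);",
                     table, rec->id, quoted);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Read every record from `in` and print one INSERT per record to `out`.
 * Returns the number of records converted, or -1 on a formatting error. */
long convert_file(FILE *in, FILE *out, const char *table)
{
    struct category rec;
    char line[256];
    long count = 0;

    while (fread(&rec, sizeof rec, 1, in) == 1) {
        if (format_insert(line, sizeof line, table, &rec) != 0)
            return -1;
        fprintf(out, "%s\n", line);
        count++;
    }
    return count;
}
```

The output of convert_file() is plain text, so it can be piped straight into psql or the mysql client, which is rather the point of a construction-set operating system.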

All that’s left for hand-coding is to translate the C write() calls into invocations of SQL statements that achieve the same thing.

The data manipulation is so very basic (LEFT OUTER JOIN? in your wildest dreams!) that the SQL engine underlying this doesn’t really matter, but I will do interface layers for PostgreSQL & MySQL to ensure that I’ve untangled enough details to make adding Oracle or whatever simple if it becomes necessary.
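One plausible shape for such an interface layer is a small struct of function pointers that the app calls instead of talking to any one database library directly. Everything named below (struct db_backend, the log backend, delete_category()) is invented for this sketch. A real PostgreSQL or MySQL layer would presumably wrap calls like PQexec() or mysql_query() behind the same three functions, while the stand-in here just records what it was asked to run.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical engine-neutral interface the app would code against. */
struct db_backend {
    const char *name;
    int  (*connect)(void *ctx, const char *conninfo);
    int  (*exec)(void *ctx, const char *sql);
    void (*disconnect)(void *ctx);
    void *ctx;              /* backend-private state */
};

/* Stand-in backend: remembers the last statement rather than running it. */
struct log_ctx {
    int  connected;
    int  statements;
    char last_sql[256];
};

static int log_connect(void *ctx, const char *conninfo)
{
    struct log_ctx *log = ctx;
    (void)conninfo;         /* a real backend would parse this */
    log->connected = 1;
    return 0;
}

static int log_exec(void *ctx, const char *sql)
{
    struct log_ctx *log = ctx;
    if (!log->connected)
        return -1;
    snprintf(log->last_sql, sizeof log->last_sql, "%s", sql);
    log->statements++;
    return 0;
}

static void log_disconnect(void *ctx)
{
    struct log_ctx *log = ctx;
    log->connected = 0;
}

/* Zero the context and return a backend wired to the log functions. */
struct db_backend make_log_backend(struct log_ctx *ctx)
{
    struct db_backend db = { "log", log_connect, log_exec,
                             log_disconnect, ctx };
    memset(ctx, 0, sizeof *ctx);
    return db;
}

/* Application code only sees db_backend, never the engine behind it. */
int delete_category(const struct db_backend *db, int id)
{
    assert(db != NULL && db->exec != NULL);
    char sql[128];
    int n = snprintf(sql, sizeof sql,
                     "DELETE FROM category WHERE id = %d;", id);
    if (n < 0 || (size_t)n >= sizeof sql)
        return -1;
    return db->exec(db->ctx, sql);
}
```

If the app only ever goes through functions like delete_category(), swapping engines means writing one new set of three functions, not hunting through the whole source tree.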
