Details
Suggestion
Resolution: Unresolved
P4: Low
Description
Problem: storage grows continuously, takes too much space
Storing many terabytes of data in millions of small files is both inefficient and slow. Merely recursing through the whole storage directory takes a huge amount of time, which makes it impossible to schedule maintenance operations.
Suggested Improvement
If the webserver could serve storage files from multiple read-only storage roots, we could archive all the old files in highly compressed and indexed archival formats. For example, squashfs is a very efficient read-only filesystem that the kernel can mount directly. For Coin, a mounted archive would just be another directory to read files from. The only functionality Coin needs is the ability to read from multiple roots.
Take for example testresults. The webserver process runs as:
webserver -storage-root=/data/www/testresults/logs
We could implement a parameter -readonly-root that can be repeated several times to add more read-only storage locations. For example:
webserver -storage-root=/data/www/testresults/logs \
    -readonly-root=/data/www/testresults/logs-archive-01 \
    -readonly-root=/data/www/testresults/logs-archive-02
The logs-archive-* paths are just directories with files. The important part is that Coin never tries to write to them.
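To make the idea concrete, here is a minimal sketch in Go of a repeatable -readonly-root flag and a lookup that falls back from the writable root to the read-only roots. It is not taken from the Coin source: the names rootList, multiRoot and newMultiRoot and the -addr flag are hypothetical, and the sketch assumes the webserver serves files through something like net/http's FileSystem interface.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
	"io/fs"
	"log"
	"net/http"
)

// rootList (hypothetical name) collects every occurrence of a repeatable
// command-line flag. It implements the standard flag.Value interface.
type rootList []string

func (r *rootList) String() string { return fmt.Sprint([]string(*r)) }

func (r *rootList) Set(v string) error {
	*r = append(*r, v)
	return nil
}

// multiRoot (hypothetical name) implements http.FileSystem by trying each
// root in order and returning the first file that exists. It only ever
// opens files for reading, so read-only roots are never written to.
type multiRoot []http.FileSystem

func (m multiRoot) Open(name string) (http.File, error) {
	for _, root := range m {
		f, err := root.Open(name)
		if err == nil {
			return f, nil
		}
		if !errors.Is(err, fs.ErrNotExist) {
			return nil, err // a real error (e.g. permissions): do not mask it
		}
	}
	return nil, fs.ErrNotExist
}

// newMultiRoot puts the writable storage root first, so fresh files take
// precedence over archived copies with the same path.
func newMultiRoot(storageRoot string, readonlyRoots []string) multiRoot {
	m := multiRoot{http.Dir(storageRoot)}
	for _, dir := range readonlyRoots {
		m = append(m, http.Dir(dir))
	}
	return m
}

func main() {
	storageRoot := flag.String("storage-root", "", "writable storage directory")
	var readonlyRoots rootList
	flag.Var(&readonlyRoots, "readonly-root", "read-only storage directory (may be repeated)")
	// -addr is an assumption made for this sketch only.
	addr := flag.String("addr", "localhost:8080", "listen address")
	flag.Parse()

	if *storageRoot == "" {
		log.Fatal("-storage-root is required")
	}
	http.Handle("/", http.FileServer(newMultiRoot(*storageRoot, readonlyRoots)))
	log.Fatal(http.ListenAndServe(*addr, nil))
}
```

In this sketch a request is answered from the first root that contains the file, so a squashfs image mounted at logs-archive-01 behaves like any other directory. Directory listings are not merged across roots; a real implementation would have to decide how to handle that case.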
Benefits
- Keep ci-working-dir lean and fast to work with
- Save disk space
- Possibility of putting the old data on different disk volumes, which is quite useful since we have huge amounts of old data.
We could then also figure out an automated way of periodically (let's say monthly) archiving old data from the "storage-root" to a "readonly-root", but that can be described in a new ticket.