Raise your hand if this has happened to you too: an application that had been running on a server for years ended up filling the whole disk and blocking the entire server.
Too many files or no more space on disk
It is a pretty common problem with applications that generate output on disk: one way or another, they also leave temporary files or logs spread across one or more directories. Sometimes you don't notice until it's too late, when a directory already holds tens of thousands of small or empty files, or there is no space left on disk.
These are two sides of the same problem: a directory with too many files becomes impossible to navigate, and a full disk causes weird application behavior and crashes, sometimes even making it impossible to log into the server.
When a new file is generated, copy it over there
Another common request I get, still related to files generated by applications, is: as soon as a new file is generated, please copy it over there for that other application to process. And not only that: if the file is named this way, move it here; if it is named that other way, move it there; and so on...
To handle these kinds of requests without having to periodically check every server by hand, and to play around with file handling in Go, I created Dirkeeper, a small command-line utility that can be used directly or inside scripts to automate these kinds of tasks.
The tool is quite simple at the moment and has a few commands:
    Directory management utilities

    Usage:
      dirkeeper [command]

    Available Commands:
      cleanold    clean old files
      completion  Generate the autocompletion script for the specified shell
      help        Help about any command
      match       match and process files
      watch       watch for new files and process them based on config rules

    Flags:
      -h, --help   help for dirkeeper

    Use "dirkeeper [command] --help" for more information about a command.
cleanold - clean old files
    clean old files

    Usage:
      dirkeeper cleanold [flags]

    Flags:
      -d, --directory strings   List of directories to cleanup
          --dry-run             Only check for old files without deleting
      -h, --help                help for cleanold
          --max-age int         Maximum age of the file in days
The cleanold command is pretty simple: give it a comma-separated list of directories and the maximum number of days since a file's creation, and dirkeeper will delete all the files older than the given age.
You can also specify the --dry-run flag to only list the matching files without deleting them.
match - execute task on matching files
    match and process files

    Usage:
      dirkeeper match [flags]

    Flags:
      -a, --action string      Action to execute
          --dest-dir string    Destination directory
      -d, --directory string   Base directory
          --dry-run            Do not execute action
      -h, --help               help for match
          --max-age int        Min file age in minutes
          --pattern strings    List of file name patterns
          --prefix strings     List of file name prefixes
          --suffix strings     List of file name suffixes
The match command checks the given directory for files matching the given rules and executes an action when a match is found.
The directory to check is specified with the --directory flag, which is mandatory.
The matching rules can be specified using one or more of the --prefix, --suffix and --pattern flags, where --prefix and --suffix accept comma-separated lists of file name prefixes and suffixes, while --pattern accepts a comma-separated list of regular expressions that the file name has to match.
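The three kinds of rules can be sketched in Go as follows. This is an illustration under assumptions, not Dirkeeper's source: the rule type and the newRule and matches names are hypothetical, a file is treated as matching when any single criterion matches, and regular expressions are not anchored unless the pattern itself says so.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// rule holds the three kinds of file name criteria: prefixes, suffixes and
// regular expressions. (Hypothetical type, for illustration only.)
type rule struct {
	prefixes []string
	suffixes []string
	patterns []*regexp.Regexp
}

// newRule compiles the pattern strings and fails on the first invalid one.
func newRule(prefixes, suffixes, patterns []string) (rule, error) {
	r := rule{prefixes: prefixes, suffixes: suffixes}
	for _, p := range patterns {
		re, err := regexp.Compile(p)
		if err != nil {
			return rule{}, fmt.Errorf("invalid pattern %q: %w", p, err)
		}
		r.patterns = append(r.patterns, re)
	}
	return r, nil
}

// matches reports whether name satisfies at least one criterion.
func (r rule) matches(name string) bool {
	for _, p := range r.prefixes {
		if strings.HasPrefix(name, p) {
			return true
		}
	}
	for _, s := range r.suffixes {
		if strings.HasSuffix(name, s) {
			return true
		}
	}
	for _, re := range r.patterns {
		if re.MatchString(name) {
			return true
		}
	}
	return false
}

func main() {
	// Illustrative values only.
	r, err := newRule([]string{"report_"}, []string{".tmp"}, []string{`^RY59A.*`})
	if err != nil {
		panic(err)
	}
	for _, name := range []string{"report_2024.csv", "cache.tmp", "RY59A001.dat", "notes.txt"} {
		fmt.Printf("%-16s matches=%v\n", name, r.matches(name))
	}
}
```

Compiling the patterns once up front means an invalid regular expression is reported before any file is touched.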
In addition to these rules, the --max-age flag can be specified to set, despite its name, the minimum age of the file in minutes before the action is executed. This is useful when a file takes some time to be generated, because it prevents working on a partial file.
When one of the matching rules is triggered, the action specified with the --action flag is executed. Valid actions are copy, move, delete and copy-delete. The latter is useful when a direct move is not allowed by the system, for example when working on remotely mounted directories.
watch - keep watching for new files and execute tasks on matching rules
    watch for new files and process them based on config rules

    Usage:
      dirkeeper watch [flags]

    Flags:
      -c, --config string   Config file
          --debug           Enable debug log
          --frequency int   Watch frequency in seconds (default 10)
      -h, --help            help for watch
The watch command is similar to the match one, but it runs periodically, with a frequency in seconds specified by the --frequency flag.
The other main difference is that the watch command takes a config file as input, passed with the --config flag, so that it's not required to specify all the parameters on the command line.
The config file is in YAML format. Here is an example:
    watch:
      # Dry run indicates if the action should be executed or only logged
      dryRun: false
      # Can have a list of input directories to watch
      directories:
        # The path of the directory to watch
        - name: "/test/input"
          # The list of rules to apply
          rules:
            # The action to execute for every matching file, can be copy, move or delete
            - action: "move"
              pattern:
                # The list of patterns to match, as regular expressions on file name
                - "RY59A.*"
              prefix:
                # The list of prefixes to match
              suffix:
                # The list of suffixes to match
              # The destination directory for the copy or move actions
              destination: "/tmp/test/outputA"
            - action: "delete"
              pattern:
                # Regular expression of file name
                - "RY59B.*"
              destination: "/tmp/test/outputB"
Contributions and suggestions are welcome
I created this little utility for my customers, but it is open source and available on GitHub for anyone to use and extend.
If you have ideas for additional commands or find bugs, you can open an issue directly on GitHub or contact me on Twitter at @fabiomarininet.