Handy bash commands
- Top 20 files by size
- Concatenate multiple files skipping n rows
- Compress last X days data in one archive
- Compress files older than X days individually
- Merging multiple lines
- Find latest files recursively
- Get rows starting and ending between a known range
- More coming soon
Print total size of top 20 big files in a directory
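# ls -l prints a "total" line first; skip it so that 20 entries are summed (sizes in bytes)
$ \ls -lS | tail -n +2 | head -n 20 | awk '{ total += $5 } END { print total }'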
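The one-liner above also counts directory entries. A minimal sketch that sums regular files only, assuming GNU find (for -printf); the function name top_files_total and its arguments are hypothetical:

```shell
# Hypothetical helper: total size in bytes of the N largest regular files
# directly inside DIR (defaults: current directory, N=20).
top_files_total() {
  local dir="${1:-.}" n="${2:-20}"
  # %s is the file size in bytes (GNU find); "total + 0" prints 0 for an empty directory
  find "$dir" -maxdepth 1 -type f -printf '%s\n' \
    | sort -nr \
    | head -n "$n" \
    | awk '{ total += $1 } END { print total + 0 }'
}
```

Usage: top_files_total /var/log 20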
Concatenate multiple files skipping N rows
# -n +2 starts output at line 2, i.e. skips 1 row per file; to skip N rows use -n +(N+1)
$ tail -q -n +2 *.tsv > /tmp/output.tsv
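To skip an arbitrary number of rows, the offset can be computed instead of hard-coded. A small sketch, assuming GNU tail (for -q); concat_skip is a hypothetical name:

```shell
# Hypothetical helper: concatenate files, dropping the first N lines of each.
concat_skip() {
  local n="$1"
  shift
  # tail -n +K starts printing at line K, so skipping N lines means K = N + 1
  tail -q -n "+$((n + 1))" "$@"
}
```

Usage: concat_skip 1 *.tsv > /tmp/output.tsv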
Compress last X days data in one archive
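# compress last 5 days data in an archive
# ${HOSTNAME} needs braces, otherwise bash expands the (unset) variable HOSTNAME_last5days
# the file list goes to tar via -T so a long list cannot be split into several tar runs that overwrite the archive
find . -mtime -5 ! -type d -print0 | tar -czvPf "/tmp/files_${HOSTNAME}_last5days.tar.gz" --null -T -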
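The same idea as a reusable function, shown only as a sketch: it assumes GNU find and GNU tar, and the name archive_recent and its parameters are hypothetical. The output archive should live outside the directory being searched, so that it is not picked up by find.

```shell
# Hypothetical helper: archive regular files under DIR modified within the last DAYS days.
archive_recent() {
  local dir="$1" days="$2" out="$3"
  # -print0 with --null -T - passes file names safely, even with spaces in them
  find "$dir" -type f -mtime "-$days" -print0 | tar -czf "$out" --null -T -
}
```

Usage: archive_recent . 5 "/tmp/files_${HOSTNAME}_last5days.tar.gz", then tar -tzf on the archive to list what was stored.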
Compress files older than X days individually
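# compress files more than 30 days old in current dir (non recursive). Preserves timestamp
# (-maxdepth 0 would test only "." itself, so -maxdepth 1 is used to reach the files inside it)
find . -maxdepth 1 -type f -mtime +30 ! -name '*.gz' -exec gzip {} \;
# uncompress files in current dir
find . -maxdepth 1 -type f -name '*.gz' -mtime +30 -exec gunzip {} \;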
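A parameterised sketch of the compress step, assuming GNU find and gzip; gzip_old is a hypothetical name, and the defaults mirror the 30-day example:

```shell
# Hypothetical helper: gzip regular files directly inside DIR that are older than DAYS days.
gzip_old() {
  local dir="${1:-.}" days="${2:-30}"
  # -maxdepth 1 keeps the search non-recursive; files that are already .gz are skipped
  find "$dir" -maxdepth 1 -type f -mtime "+$days" ! -name '*.gz' -exec gzip {} +
}
```

Usage: gzip_old /var/log/myapp 30 (the path is only an illustration). Running the find part with -print instead of -exec first shows which files would be touched.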
Merging multiple lines
# merge 2 lines from input_file with a comma
paste -d',' - - < input_file
# merge 3 lines from input file with a space
paste -d' ' - - - < input_file
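Each "-" argument makes paste read one more line from standard input per output row, so the number of dashes sets how many lines are merged. A sketch that builds the dashes from a count; merge_lines is a hypothetical name:

```shell
# Hypothetical helper: merge every N lines of FILE into one row, joined by DELIM.
merge_lines() {
  local n="$1" delim="$2" file="$3" i=0
  # Rebuild the positional parameters as N copies of "-"
  set --
  while [ "$i" -lt "$n" ]; do
    set -- "$@" -
    i=$((i + 1))
  done
  paste -d "$delim" "$@" < "$file"
}
```

Usage: merge_lines 2 ',' input_file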
Find latest files recursively
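# $1 is the directory to search (e.g. when saved as a script); prints the 10 most recently modified files
$ find "$1" -type f -print0 | xargs -0 stat --format '%Y :%y %n' | sort -nr | cut -d: -f2- | head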
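An alternative sketch that lets find print the modification time itself instead of calling stat, assuming GNU find (for -printf); latest_files is a hypothetical name. It prints paths only, newest first, and does not handle file names that contain newlines.

```shell
# Hypothetical helper: list the N most recently modified files under DIR, newest first.
latest_files() {
  local dir="${1:-.}" n="${2:-10}"
  # %T@ is the modification time in seconds since the epoch, %p the path
  find "$dir" -type f -printf '%T@ %p\n' \
    | sort -nr \
    | head -n "$n" \
    | cut -d' ' -f2-
}
```

Usage: latest_files /var/log 5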
Get rows starting and ending between a known range
awk 'NR>=3775803 && NR <=6436803' filename.log > subset-file.log
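On a large log the command above keeps reading to the end of the file even after the last wanted row. A sketch that stops early; line_range and its parameters are hypothetical names:

```shell
# Hypothetical helper: print lines START..END (inclusive) of FILE, then stop reading.
line_range() {
  local start="$1" end="$2" file="$3"
  # Exit as soon as the range is passed; the bare pattern NR >= s prints matching lines
  awk -v s="$start" -v e="$end" 'NR > e { exit } NR >= s' "$file"
}
```

Usage: line_range 3775803 6436803 filename.log > subset-file.log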