Introduction
If you've ever had to deal with a Linux server that ran out of disk space, you know it's important not to run into the same situation again. In many cases, a server runs out of disk space because of huge session files or an error log that has grown to hundreds of gigabytes.
However, sometimes you might get the famous "No space left on device" error message without knowing what is taking up the disk space. With a few simple Bash commands, you can quickly check which folders and files consume the most disk space on your system.
Script summary
The first step you can perform if you experience this issue is to check the disk space usage on your server using df. The command displays the amount of disk space available on the file system containing each file name argument.
To get an idea of what the command prints, here is an example. We will also add the -h (or --human-readable) argument to get a more readable output.
$ df -h
Filesystem Size Used Avail Use% Mounted on
devtmpfs 2.0G 0 2.0G 0% /dev
tmpfs 2.0G 0 2.0G 0% /dev/shm
tmpfs 2.0G 209M 1.8G 11% /run
tmpfs 2.0G 0 2.0G 0% /sys/fs/cgroup
/dev/xvda 30G 12G 16G 43% /
tmpfs 400M 0 400M 0% /run/user/0
As you can see, the disk space usage is totally fine and we're far from running into issues for the moment. However, this can change quickly if we take a large backup of our application or website, if, as already mentioned, the session folder grows dramatically, or if we modify our code and trigger an error that gets logged, in which case the error_log can grow pretty quickly as well.
With the df command we can quickly check whether the disk space was exceeded on the / partition, or, if for example /tmp or /var are separate partitions, whether the space there was exceeded because the assigned amount was not sufficient or because one of the bad scenarios we've already mentioned happened.
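As a sketch of how such a check could be automated (the function name, the mount points and the 90% default threshold are illustrative examples, not part of the original commands), a small helper can parse the df output and warn when a filesystem crosses a limit:

```shell
#!/bin/bash
# Hypothetical helper: warn when a filesystem's usage crosses a threshold.
# The 90% default and the mount points below are examples only.
check_usage() {
    local mount="$1" limit="${2:-90}"
    local used
    # df -P guarantees one data line per filesystem; column 5 is the Use% value
    used=$(df -P "$mount" | awk 'NR==2 {gsub("%", ""); print $5}')
    if [ "$used" -ge "$limit" ]; then
        echo "WARNING: $mount is at ${used}% (limit ${limit}%)"
    else
        echo "OK: $mount is at ${used}%"
    fi
}

check_usage /
check_usage /tmp
```

If the path is not a mount point, df reports the filesystem that contains it, so the helper still gives a meaningful answer.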
Another useful command we can use to locate file space usage is du. With it you can quickly check which directories consume the most disk space on your server. An example command would be the following:
du -ch --max-depth=2 / 2>/dev/null | sort -rh | head -15
Where -c produces a grand total of the disk usage and -h is again for human-readable output.
We also apply the --max-depth option to specify how many levels below the starting point, in this case the / partition, the command should report on.
We then specify that we want to perform the check on the / directory, and we filter out any errors with 2>/dev/null so they are not printed to the console.
Finally, we sort the output with sort (-h compares human-readable sizes and -r reverses the result of comparisons, so the largest entries come first) and display the first 15 rows with the head command.
An example output of the command would be:
$ du -ch --max-depth=2 / 2>/dev/null | sort -rh | head -15
14G total
14G /
4.7G /usr
4.3G /var
2.2G /usr/local
1.3G /var/lib
1.3G /root
In order to find the largest files on our system, we will use the find command.
find can be pretty heavy sometimes and can put noticeable load on your server, so we would like to use the nice command as well in order to make sure everything goes smoothly. In short, nice runs a program with a modified scheduling priority. Niceness values range from -20 (most favourable to the process) to 19 (least favourable to the process).
We can also use ionice instead of nice in order to set a different type of priority. This program sets or gets the I/O scheduling class and priority for a program.
Then with a simple find command, we search for files larger than 100MB and also estimate their file space usage with du. Example commands are:
nice -n 19 find / ! -path "/proc/*" -type f -size +100M -exec du -hs {} \; | sort -hr
ionice -c 3 find / ! -path "/proc/*" -type f -size +100M -exec du -hs {} \; 2>/dev/null | sort -hr
The script
Now we want to combine the commands into a simple Bash script instead of running them as standalone one-liners. Of course, you could also squeeze the whole thing into one big one-liner, save it in your .bashrc and use it with an alias; whether you do that or practise your Bash and put the commands in a simple script is entirely up to you.
We can print the date at the beginning of the script for visibility (you can also run the script as a cron job and have it email you the disk space usage once a week, or set a trigger if the usage is above 90%) and clear the screen for a cleaner output, although neither is a must. We can also define the path for the scan in a variable just for practice; this makes the script slightly longer, so you can simply hard-code the path as "/" instead of saving it in a variable.
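For the cron job idea, a weekly entry could look like the following sketch (the script path and email address are hypothetical, and mailing the output assumes a working local mail setup):

```shell
# Hypothetical crontab entry (edit with `crontab -e`):
# run the report every Monday at 07:00 and mail the output.
0 7 * * 1 /root/disk-report.sh | mail -s "Weekly disk usage report" admin@example.com
```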
The final script will look like this:
#!/bin/bash
## Define the path for the scans
DIR_TO_CHECK='/'
clear
date
printf "======================================================\n"
# Get a report for the file system usage
df -h
printf "======================================================\n"
printf "LARGEST DIRECTORIES\n"
printf "\n"
# Get a report for the most disk space consuming directories in the / partition
du -ch --max-depth=2 "${DIR_TO_CHECK}" 2>/dev/null | sort -rh | head -15
printf "======================================================\n"
printf "LARGEST FILES\n"
printf "\n"
# Get a list for all files larger than 100MB in the / partition
ionice -c 3 find "${DIR_TO_CHECK}" ! -path "/proc/*" -type f -size +100M -exec du -hs {} \; 2>/dev/null | sort -hr
printf "=======================================================\n"
Conclusion
With a few simple commands, you can create a really useful script to check the largest files and directories on your server. I personally keep the two commands in Bash aliases and execute them separately whenever I need to check which files are taking most of the disk space, but you may find it useful to have them in a Bash script or in a one-liner:
clear ; printf "LARGEST DIRECTORIES\n"; printf "\n"; du -ah --max-depth=2 / 2>/dev/null | sort -rh | head -20 | grep '[0-9]G'; printf "\n"; printf "LARGEST FILES\n" ; ionice -c 3 find / ! -path "/proc/*" -type f -size +100M -exec du -hs {} \; 2>/dev/null | sort -hr
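If you prefer the alias route, a sketch could look like this (the alias names are examples); add the lines to your ~/.bashrc and run `source ~/.bashrc` to activate them:

```shell
# Hypothetical aliases for the two checks; the names are examples.
alias bigdirs='du -ch --max-depth=2 / 2>/dev/null | sort -rh | head -15'
alias bigfiles='ionice -c 3 find / ! -path "/proc/*" -type f -size +100M -exec du -hs {} \; 2>/dev/null | sort -hr'
```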
You can also check the man pages for the commands we used in our script:
man df
man du
man find
man sort
man head
man ionice
man grep
This is pretty much how you can quickly find the most disk-space-consuming directories and files on your server or computer. The script and the commands can be easily modified, and people with more Bash experience can make them look even better, so any thoughts on this are more than welcome.
Please feel free to share if you also use basic scripts to monitor the disk space usage on your servers.
Support
If you've enjoyed reading this post or learned something new and would like to support me in publishing more content like this, you can buy me a coffee:
Thank you!