Generic Linux System Debugging
Following is a list of commands I use for day-to-day system debugging. These are very generic commands; they do not assume any knowledge of what the server is running, whether it is a database server, a web server, a backup server, and so on. I use almost the same set of tools to monitor most of our servers as well.
I am assuming we are on RHEL or a RHEL derivative, although the same or similar tools are available on Debian-based systems as well.
First install some of the necessary packages (you should remove some of these once the troubleshooting session is over)
yum install htop traceroute screen telnet sysstat iptraf-ng
Check overall server health
Check system resources:
Check CPU load:
w
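The load averages that w prints are easier to judge relative to the number of CPUs. Here is a minimal sketch, assuming a Linux host with /proc/loadavg and nproc from coreutils; load_per_cpu is a hypothetical helper name, not a standard tool.

```shell
# Hypothetical helper (not a standard command): load average per CPU.
# $1 = load average, $2 = number of CPUs
load_per_cpu() {
    awk -v load="$1" -v cpus="$2" 'BEGIN { printf "%.2f\n", load / cpus }'
}

# Usage on a live Linux host (assumes /proc/loadavg and nproc exist):
#   load_per_cpu "$(cut -d' ' -f1 /proc/loadavg)" "$(nproc)"
```

A value that stays well above 1.00 suggests there are more runnable (or uninterruptible) tasks than CPUs to serve them.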
Check memory:
free -m
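If you want a single number instead of reading the free table, something like the following sketch may help. It assumes a kernel that exposes MemAvailable in /proc/meminfo (3.14 or later); mem_available_pct is a hypothetical helper name.

```shell
# Hypothetical helper: percentage of memory still available to applications.
# $1 = meminfo-format file (optional, defaults to /proc/meminfo)
mem_available_pct() {
    awk '/^MemTotal:/     { total = $2 }
         /^MemAvailable:/ { avail = $2 }
         END { if (total > 0) printf "%d\n", avail * 100 / total }' "${1:-/proc/meminfo}"
}

# Usage on a live Linux host:
#   mem_available_pct
```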
Total number of processes (the count includes the ps header line):
ps aux | wc -l
Check free disk space:
df -h
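To pick out only the filesystems that are close to full, the df output can be filtered. This is a rough sketch that assumes the POSIX output format of df -P and mount points without spaces; full_filesystems is a hypothetical helper name.

```shell
# Hypothetical helper: print mount points at or above a usage threshold.
# $1 = threshold in percent; reads `df -P` output on stdin
full_filesystems() {
    awk -v limit="$1" 'NR > 1 {
        use = $5
        sub(/%/, "", use)                 # strip the trailing percent sign
        if (use + 0 >= limit) print $6, $5
    }'
}

# Usage:
#   df -P | full_filesystems 90
```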
Look at changes in pattern: context switches, I/O rate changes. You should know the output of vmstat very well.
vmstat 1 5
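As one example of reading vmstat, the sketch below averages the context-switch column over the sampled interval. It assumes the default procps vmstat layout, where cs is the 12th column, and it skips the first data row because that row reports averages since boot; avg_context_switches is a hypothetical helper name.

```shell
# Hypothetical helper: average context switches per second from vmstat output.
# Assumes the default procps layout (cs is column 12); reads vmstat output on stdin.
avg_context_switches() {
    # NR > 3 skips the two header lines and the since-boot row
    awk 'NR > 3 { sum += $12; n++ } END { if (n) printf "%d\n", sum / n }'
}

# Usage:
#   vmstat 1 5 | avg_context_switches
```

Run it during normal operation to get a baseline, then compare against the value you see during an incident.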
Check disk usage of each top-level directory (substitute any file or directory path):
du -sh /*
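When a directory has many entries, sorting the du output by size makes the big consumers obvious. A small sketch, assuming du -k style input (size in kilobytes, then the path); largest_entries is a hypothetical helper name.

```shell
# Hypothetical helper: keep only the N largest entries.
# $1 = number of entries to print; reads `du -k` style lines on stdin
largest_entries() {
    sort -rn | head -n "$1"
}

# Usage (-x keeps du on one filesystem, -s summarizes each argument):
#   du -xsk /var/* 2>/dev/null | largest_entries 5
```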
To find out which ports are open as seen from outside the server, run this from another machine (-Pn skips host discovery; older nmap releases spelled it -P0):
nmap -Pn <ip>
To check TCP network reachability:
tcptraceroute <ip>
To check bandwidth usage by host or traffic type, use these tools:
iptraf-ng
tcpdump
Wondering what a file does? What its type is? Where it came from?
To check the type of a file execute this:
file <filename>
To check which package installed this file, execute this:
rpm -qf <path to a file>
To list everything that package installed, execute this:
rpm -ql <name of package>
99% of the problems are a resource crunch (disk, memory, CPU, I/O, etc.) caused by one or more processes.
Following is a set of commands that can help you nail down the process.
To list all the programs that are listening on a TCP or UDP port, execute this:
netstat -tulpn
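The netstat table can be condensed into a short protocol/port/program list. This is only a sketch: it assumes the usual net-tools column layout (UDP rows have no State column) and program names without spaces; listening_ports is a hypothetical helper name.

```shell
# Hypothetical helper: condense `netstat -tulpn` output to "proto port pid/program".
# Reads netstat output on stdin.
listening_ports() {
    awk '$1 ~ /^(tcp|udp)/ {
        n = split($4, addr, ":")          # port is the text after the last colon
        prog = ($1 ~ /^tcp/) ? $7 : $6    # udp rows have no State column
        print $1, addr[n], prog
    }' | sort -u
}

# Usage (run as root, otherwise the program column shows "-"):
#   netstat -tulpn | listening_ports
```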
To nail down a process from its behavior:
To find the process bound to a port, execute any of these:
lsof -i :<port>
netstat -tulpn | grep <port>
fuser <port>/<protocol>
To find the process that is using a file, execute this:
fuser <filename>
Once the process causing the crunch is known
To list resource usage of an individual process, execute this:
ps -p <pid> -o comm,args,pcpu,pmem,rss
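When the crunch comes from many processes of the same program (a pool of web server workers, say), it can help to total memory per command name. A rough sketch follows; rss_by_command is a hypothetical helper name, and since RSS counts shared pages once per process, the totals overstate real usage.

```shell
# Hypothetical helper: total resident memory (kB) per command name, largest first.
# Reads "rss command" pairs on stdin; only the first word of the command is used.
rss_by_command() {
    awk '{ rss[$2] += $1 }
         END { for (c in rss) printf "%d %s\n", rss[c], c }' | sort -rn
}

# Usage (the trailing = suppresses the ps column headers):
#   ps -e -o rss= -o comm= | rss_by_command | head
```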
To check syscall profiles for a program/executable:
strace -c <executable file name>
To attach to a running process and check its syscall details:
strace -p <pid>
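The summary table from strace -c can be long, so here is a small sketch that keeps only the syscalls with the largest share of time. It assumes the usual summary layout, with "% time" as the first column and the syscall name as the last; top_syscalls is a hypothetical helper name.

```shell
# Hypothetical helper: show the N syscalls with the largest share of time.
# $1 = number of rows; reads `strace -c` summary text on stdin
top_syscalls() {
    # keep rows whose first field is numeric, drop the final "total" row
    awk '$1 ~ /^[0-9.]+$/ && $NF != "total" { print $1, $NF }' | sort -rn | head -n "$1"
}

# Usage (-o writes the summary to a file instead of stderr):
#   strace -c -o /tmp/summary.txt <executable file name>
#   top_syscalls 5 < /tmp/summary.txt
```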