// Linux Guide · Essential Skills

Linux Commands for DevOps Engineers: Essential Guide with Real Examples

📅 Updated April 2026 · ⏱ 10 min read · 🏷 Linux · DevOps · SRE · Shell
👨‍💻
master.devops
Practising DevOps Engineer with deep hands-on experience in Kubernetes, AWS, CI/CD, and SRE. Every guide is written from real production work.

Linux is the operating system behind every server, container, and Kubernetes node in production, and every DevOps engineer spends hours each week in a terminal. In my day-to-day work at a large enterprise, I use Linux to debug production issues, manage servers, write automation scripts, and investigate performance problems. This guide covers the commands and concepts that appear most often in DevOps interviews and real on-call situations.

File System Navigation and Permissions

# Navigate and explore
pwd                                     # print working directory
ls -lah                                 # long listing with hidden files and human-readable sizes
find /var/log -name "*.log" -mtime -1   # find logs modified in last 24h
find / -perm /4000 2>/dev/null          # find setuid files (security audit)
du -sh /var/log/*                       # disk usage per entry
df -hT                                  # disk free with filesystem type
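When a disk-full alert comes in, the commands above combine into a standard drill. A sketch, with illustrative paths and thresholds (assumes GNU du/sort; -xdev keeps find on one filesystem):

```shell
# Sketch: find what is eating the disk — biggest directories first, then big files.
# /var and the 100M threshold are examples, not fixed rules.
du -xh /var 2>/dev/null | sort -rh | head -10
find /var -xdev -type f -size +100M -exec ls -lh {} \; 2>/dev/null
```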

File Permissions — chmod, chown, umask

Linux permissions use a three-group model: owner, group, others. Each group has three bits: read (4), write (2), execute (1). chmod 755 means owner=7(rwx), group=5(r-x), others=5(r-x).

# Octal notation
chmod 755 script.sh              # rwxr-xr-x
chmod 600 ~/.ssh/id_rsa          # rw------- (SSH key must be 600)
chmod 644 /etc/nginx/nginx.conf  # rw-r--r--

# Symbolic notation
chmod u+x script.sh              # add execute for owner
chmod g-w file.txt               # remove write from group
chmod o=r file.txt               # set others to read-only
chmod -R 755 /var/www/html       # recursive

# Change ownership
chown appuser:appgroup app.jar
chown -R nginx:nginx /var/www

# umask — default permission mask
umask 022   # files created as 644, dirs as 755
umask 027   # more restrictive: files 640, dirs 750
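You can sanity-check the octal model in seconds. A minimal sketch (assumes GNU coreutils' stat -c; on BSD/macOS the flag differs):

```shell
# Minimal check of the octal model (GNU coreutils `stat -c`)
f=$(mktemp)
chmod 640 "$f"        # owner=rw(4+2), group=r(4), others=0
stat -c '%a %A' "$f"  # prints: 640 -rw-r-----
rm -f "$f"
```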

Process Management

# View processes
ps aux                          # all processes with CPU and memory
ps aux | grep java              # filter for Java processes
top                             # live viewer (press 1 for per-CPU, M for memory sort)
htop                            # interactive top (install separately)

# Kill processes
kill -15 PID                    # SIGTERM — graceful shutdown (try this first)
kill -9 PID                     # SIGKILL — force kill (last resort)
kill -9 $(lsof -t -i:8080)      # kill process on port 8080
pkill -f "java.*api"            # kill by process name pattern

# Process priority
nice -n 10 ./heavy-script.sh    # start with lower priority
renice -n 5 -p PID              # change running process priority

# Background jobs
nohup ./long-script.sh &        # run detached from terminal
./script.sh > output.log 2>&1 & # redirect stdout+stderr, background
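The SIGTERM-then-SIGKILL advice can be wrapped in a small helper for runbooks. A sketch (graceful_kill and the timeout default are made up for illustration, not a standard command):

```shell
# Sketch: try SIGTERM first, escalate to SIGKILL only if the process survives.
graceful_kill() {
  local pid=$1 timeout=${2:-10}
  kill -15 "$pid" 2>/dev/null || return 0   # SIGTERM; process may already be gone
  for _ in $(seq "$timeout"); do
    kill -0 "$pid" 2>/dev/null || return 0  # kill -0 asks "does the PID still exist?"
    sleep 1
  done
  kill -9 "$pid"                            # SIGKILL as a last resort
}
```

Usage: graceful_kill 1234 30 gives the process 30 seconds to shut down cleanly before forcing it.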

Systemd — Managing Services

Systemd is the init system on nearly every modern Linux distribution. In DevOps work, you use systemd to manage long-running services, investigate service failures, and view structured logs.

# Service management
systemctl start nginx
systemctl stop nginx
systemctl restart nginx
systemctl reload nginx      # reload config without restart (if supported)
systemctl status nginx      # detailed status with recent log lines
systemctl enable nginx      # start on boot
systemctl disable nginx

# View logs with journalctl
journalctl -u nginx                      # all logs for nginx service
journalctl -u nginx -f                   # follow (tail -f equivalent)
journalctl -u nginx --since "1 hour ago"
journalctl -u nginx -p err               # errors only
journalctl --disk-usage                  # how much disk logs use
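For context, the service systemctl manages is defined by a unit file. A minimal illustrative example (the name myapp, the paths, and the user are made up): save it as /etc/systemd/system/myapp.service, then run systemctl daemon-reload before enabling it.

```ini
# Hypothetical unit file: /etc/systemd/system/myapp.service
[Unit]
Description=My App
After=network.target

[Service]
ExecStart=/opt/myapp/run.sh
Restart=on-failure
User=appuser

[Install]
WantedBy=multi-user.target
```

Restart=on-failure is the usual production choice: systemd restarts the service on crashes but not on a clean stop.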

Networking Commands

# Ports and connections
ss -tulnp                       # listening ports with process names (modern netstat)
ss -tulnp | grep :8080          # which process is on port 8080?
netstat -tulnp                  # older equivalent (deprecated but still common)
lsof -i :8080                   # processes using port 8080

# IP and routing
ip addr show                    # interface IPs (replaces ifconfig)
ip route show                   # routing table
ip link show                    # interface status

# DNS debugging
dig api.company.com             # full DNS response
dig +short api.company.com      # IP only
dig @8.8.8.8 api.company.com    # query specific DNS server
nslookup api.company.com
cat /etc/resolv.conf            # which DNS servers this machine uses

# HTTP testing
curl -v https://api.company.com/health
curl -H "Authorization: Bearer $TOKEN" -X POST https://api/data -d '{"key":"val"}'
wget -O- https://api.company.com/health

# Network path tracing
traceroute api.company.com
mtr api.company.com             # interactive traceroute
ping -c 4 api.company.com
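If a script (rather than your eyes) needs the owning process, the ss output can be parsed with awk. A sketch — port_owner is a made-up helper, and the field positions assume iproute2's ss column layout (Netid State Recv-Q Send-Q Local:Port Peer:Port Process):

```shell
# Sketch: extract the process column from `ss -tulnp` output for one port.
# Field 5 is the local address:port; the last field is the owning process.
port_owner() {
  awk -v p=":$1" '$5 ~ p"$" {print $NF}'
}

# Against a live system: ss -tulnp | port_owner 8080
# Against a captured line (the java/pid values are illustrative):
echo 'tcp LISTEN 0 128 0.0.0.0:8080 0.0.0.0:* users:(("java",pid=1234,fd=45))' \
  | port_owner 8080
```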

Log Analysis — grep, awk, sed

# grep — search
grep "ERROR" /var/log/app.log
grep -i "error" app.log                  # case-insensitive
grep -r "NullPointerException" /var/log/
grep -c "ERROR" app.log                  # count matching lines
grep -v "DEBUG" app.log                  # exclude DEBUG lines
grep -A 5 "FATAL" app.log                # 5 lines after match
grep -B 3 "FATAL" app.log                # 3 lines before match

# Count errors in last hour from timestamped logs
awk '/2026-04-13 1[5-6]:/ && /ERROR/' app.log | wc -l

# Extract specific fields
awk '{print $1, $4, $9}' /var/log/nginx/access.log   # IP, timestamp, status
awk -F: '{print $1}' /etc/passwd                     # usernames only

# Top 10 IPs hitting your server
awk '{print $1}' /var/log/nginx/access.log | sort | uniq -c | sort -rn | head -10

# sed — stream editor
sed -i 's/old-value/new-value/g' config.yaml   # replace in file
sed -n '100,200p' large.log                    # print lines 100-200
sed '/^#/d' config.txt                         # delete comment lines
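A pattern that comes up constantly on call is turning raw matches into a rate. A sketch — error_rate is a made-up helper; field 9 is the status code in the default Nginx/Apache access log format:

```shell
# Sketch: percentage of 5xx responses in a common-format access log
error_rate() {
  awk '$9 ~ /^5/ {err++}
       END {if (NR) printf "%.2f%% 5xx (%d/%d)\n", 100*err/NR, err, NR}' "$1"
}
# usage: error_rate /var/log/nginx/access.log
```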

Performance Analysis

# Memory
free -h                         # RAM and swap summary
cat /proc/meminfo               # detailed memory info
vmstat 1 5                      # system stats every 1s, 5 times

# CPU
top                             # press 1 for per-core view
mpstat -P ALL 1                 # per-CPU stats
grep "model name" /proc/cpuinfo | head -1

# Disk I/O
iostat -xz 1                    # extended I/O stats per device
iotop                           # per-process I/O (like top for disk)

# Load average
uptime                          # 1min, 5min, 15min load averages
# Load average > number of CPU cores = system is overloaded
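The load-average rule of thumb is easy to script for alerting. A sketch (Linux-specific: reads /proc/loadavg and uses coreutils nproc):

```shell
# Sketch: flag overload when the 1-minute load exceeds the core count
load1=$(cut -d' ' -f1 /proc/loadavg)
cores=$(nproc)
if awk -v l="$load1" -v c="$cores" 'BEGIN {exit !(l > c)}'; then
  echo "overloaded: 1-min load $load1 exceeds $cores cores"
else
  echo "ok: 1-min load $load1 on $cores cores"
fi
```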

Interview Q&A

Q1: What does chmod 777 do and why is it dangerous?
chmod 777 gives read, write, and execute permission to owner, group, and all other users. It means any user on the system can modify or execute the file. This is dangerous because: any compromised process can modify the file, a web application running as www-data can write malicious code, and it violates the principle of least privilege. In production, files should typically be 644 (readable by all, writable by owner only) and scripts 755 (executable by all, writable by owner only). Config files with credentials should be 600 (only owner can read/write).
Q2: How do you find which process is using port 8080?
ss -tulnp | grep :8080 on modern systems. Or lsof -i :8080. Both show the PID and process name. Add sudo to see processes owned by other users. Once you have the PID, use ps -p PID -o cmd to see the full command with arguments. If you want to kill it: kill -15 $(lsof -t -i:8080) — try SIGTERM (15) first, then SIGKILL (9) only if the process does not respond.
Q3: What is a zombie process and how do you handle it?
A zombie process (shown as Z in ps output) is a process that has finished execution but whose entry in the process table has not been cleaned up — because the parent process has not called wait() to read the exit status. Zombies consume no CPU or memory, just a process table slot. They cannot be killed with kill -9 (they are already dead). The solution is to fix the parent process to properly collect child exit codes. If the parent is not fixable, killing the parent causes its zombie children to be adopted by init (PID 1), which does collect exit codes.
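You can manufacture a short-lived zombie to see this yourself. A sketch: the subshell exec's into sleep, so nothing ever wait()s on the inner child until sleep exits.

```shell
# Sketch: create a zombie for ~2 seconds, then observe it.
( /bin/true & exec sleep 2 ) &   # `true` exits; its parent (now `sleep`) never reaps it
sleep 1
ps -eo pid,ppid,stat,comm | awk '$3 ~ /^Z/'   # the `true` child shows with STAT Z
wait                             # clean up the background sleep
```

When sleep exits, the zombie is reparented to init and reaped, exactly as described above.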

Text Processing — grep, awk, sed

These three tools are the bread and butter of log analysis and data extraction in any Linux environment. In production SRE work you will use them daily — parsing Nginx access logs, extracting error rates, transforming configuration files, and building ad-hoc monitoring scripts.

# grep — search text
grep "ERROR" /var/log/app.log            # find lines with ERROR
grep -i "error" /var/log/app.log         # case-insensitive
grep -r "TODO" /app/src/                 # recursive in directory
grep -c "ERROR" /var/log/app.log         # count matching lines
grep -v "DEBUG" /var/log/app.log         # exclude DEBUG lines
grep -E "ERROR|WARN" /var/log/app.log    # regex: ERROR or WARN
grep -B2 -A5 "OutOfMemoryError" app.log  # 2 lines before, 5 after

# awk — field processing
awk '{print $1, $7}' access.log          # print column 1 and 7
awk -F: '{print $1}' /etc/passwd         # colon delimiter, print usernames
awk '$9 == 500' access.log               # lines where field 9 is 500 (HTTP 500s)
awk '{sum+=$10} END {print sum}' log     # sum column 10 (bytes sent)

# sed — stream editor
sed 's/foo/bar/g' file.txt               # replace foo with bar globally
sed -i 's/localhost/db.prod.internal/g' config.yaml   # in-place edit
sed -n '50,100p' large.log               # print lines 50-100
sed '/^#/d' config.conf                  # delete comment lines
Real production use: top 10 most common HTTP 500 URLs in an Nginx access log:
grep " 500 " /var/log/nginx/access.log | awk '{print $7}' | sort | uniq -c | sort -rn | head -10

Performance Analysis

When a service is slow or a node is under pressure, these commands help you pinpoint the bottleneck within seconds. Kubernetes nodes are simply Linux servers — the same tools apply for debugging pod-level and node-level performance issues.

# CPU and memory
top           # live view — press 1 for per-core, M to sort by mem
vmstat 1 10   # 10 snapshots at 1s intervals: CPU, swap, IO
free -h       # memory summary with human units

# Disk I/O
iostat -xz 1  # disk utilisation per device (await = latency ms)
iotop         # per-process disk I/O (like top for disk)
lsblk         # block devices and mount points

# Network
iftop -i eth0 # live bandwidth per connection
ss -s         # socket statistics summary (connections by state)
Interview tip: When asked "how do you debug a slow server?", answer in layers: CPU (top/vmstat) → Memory (free) → Disk I/O (iostat) → Network (ss, iftop) → Application logs (journalctl, grep). Interviewers want to see a systematic approach.
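That layered checklist can be captured as a one-shot snapshot script for the start of an incident. A sketch (command availability varies by distro: ss needs iproute2, free and uptime come from procps):

```shell
# Sketch: one-pass triage snapshot in the CPU → memory → disk → network order
echo "== Load ==";    uptime
echo "== Memory ==";  free -h | awk 'NR<=2'
echo "== Disk ==";    df -h / | tail -1
echo "== Sockets =="
command -v ss >/dev/null && ss -s | head -3 || echo "ss not installed"
```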

Shell Scripting Fundamentals

Shell scripting is how DevOps engineers automate repetitive tasks — health checks, log rotation, deployment helpers, and on-call runbooks. A well-written Bash script saves an entire team hours per week.

#!/bin/bash
set -euo pipefail                  # exit on error, undefined var, pipe failure

ENV=${1:-staging}                  # first arg, default to staging
TIMESTAMP=$(date +%Y%m%d_%H%M%S)

check_health() {
  local URL=$1
  local STATUS
  STATUS=$(curl -s -o /dev/null -w "%{http_code}" "$URL/health" || echo "000")  # 000 if unreachable
  [[ "$STATUS" == "200" ]] && echo "OK: $URL" || echo "FAIL: $URL returned $STATUS"
}

for HOST in api.staging payment.staging; do
  check_health "https://${HOST}.internal"
done
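A companion pattern worth adding to scripts like this is an EXIT trap, so temp files are cleaned up even when set -e aborts the script midway. A sketch (run_with_tmpdir is a made-up helper, not a standard command):

```shell
# Sketch: run a command in a throwaway temp dir, cleanup guaranteed via trap.
# The ( ) body makes the function a subshell, so the EXIT trap fires when it returns.
run_with_tmpdir() (
  set -euo pipefail
  workdir=$(mktemp -d)
  trap 'rm -rf "$workdir"' EXIT   # fires on normal exit and on set -e failures
  cd "$workdir"
  "$@"
)

# usage: run_with_tmpdir tar xzf /tmp/release.tgz   (archive path is illustrative)
```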

Cron Jobs and Scheduling

Cron is the standard scheduler on Linux systems. DevOps engineers use cron for log rotation, backup scripts, health checks, certificate renewal, and metric collection.

# Cron syntax: minute hour day month weekday command
0 2 * * *      /opt/scripts/backup.sh          # daily at 2am
*/15 * * * *   /opt/scripts/health-check.sh    # every 15 minutes
0 0 * * 0      /opt/scripts/weekly-report.sh   # every Sunday midnight

# Always redirect output to avoid silent failures
0 * * * *      /opt/script.sh >> /var/log/script.log 2>&1
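One pitfall the table above does not show: a job that runs longer than its interval piles up overlapping copies. flock (from util-linux) prevents that. A sketch with illustrative lock and script paths:

```shell
# Crontab entry: skip this run if the previous one still holds the lock
# */5 * * * * /usr/bin/flock -n /tmp/health-check.lock /opt/scripts/health-check.sh

# Demo of the same flag outside cron:
flock -n /tmp/demo.lock -c 'echo "lock acquired"'   # -n: fail fast instead of queueing
```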
Common mistake: Cron runs with a minimal environment — $PATH is not your interactive shell's PATH. Always use absolute paths (/usr/bin/python3, not python3). When cron jobs fail silently, check journalctl -u cron (Debian/Ubuntu) or journalctl -u crond (RHEL family).

Linux Interview Questions & Answers

Q: What is the difference between a hard link and a symbolic link?
A hard link is another directory entry pointing to the same inode — both point to the same data on disk. Deleting one does not affect the other. Hard links cannot span filesystems or point to directories. A symbolic link (symlink) is a special file containing a path to another file. If the target is deleted, the symlink breaks. Symlinks can cross filesystems and point to directories. In DevOps, symlinks are commonly used to manage versioned binaries (/usr/local/bin/python → python3.11).
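The difference is easy to demonstrate end to end. A sketch that runs in a throwaway temp directory:

```shell
# Sketch: hard link survives deletion of the original; symlink breaks.
cd "$(mktemp -d)"
echo "data" > original.txt
ln original.txt hard.txt        # second directory entry, same inode
ln -s original.txt soft.txt     # file containing the path "original.txt"
rm original.txt
cat hard.txt                    # prints: data
cat soft.txt 2>/dev/null || echo "symlink is broken"
```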
Q: What does set -euo pipefail mean?
set -e exits immediately if any command returns non-zero. set -u treats undefined variables as errors — prevents bugs where a misspelled variable silently evaluates to empty. set -o pipefail makes a pipeline fail if any command in it fails, not just the last one. Without pipefail, false | true would succeed. Every production Bash script should start with this combination.
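A two-line demonstration makes the pipefail difference concrete:

```shell
# Without pipefail, the pipeline's exit code is the last command's (true -> 0).
bash -c 'false | true; echo "exit: $?"'                       # prints: exit: 0
# With pipefail, any failing stage fails the whole pipeline.
bash -c 'set -o pipefail; false | true; echo "exit: $?"'      # prints: exit: 1
```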
Q: A server is responding slowly. What is your diagnostic process?
I follow a layered approach: 1) Check CPU with top or vmstat 1 — is any process at 100%? 2) Check memory with free -h — is the server swapping? Swapping kills performance. 3) Check disk I/O with iostat -xz 1 — is await (disk latency) high? 4) Check network with ss -s — thousands of CLOSE_WAIT connections indicate connection pool exhaustion. 5) Check application logs with journalctl -u app -p err. Each layer narrows the root cause.
