IT Problems & Solutions

Step-by-step fixes for the most common IT problems. Real commands, real solutions — no fluff.

200+ Problems Solved
9 Topic Areas
500+ Commands & Examples

🌐

Networking Problems

28 problems
High Cannot connect to the internet — all pings fail
Problem
No internet access. ping 8.8.8.8 times out. Browser shows "No internet connection".
Diagnosis
ip addr show            # check whether the interface has an IP
ip route show           # check that a default gateway exists
ping 192.168.1.1        # ping the gateway; if this fails, the problem is local
ping 8.8.8.8            # if the gateway answers but this fails, suspect the router's uplink or the ISP
cat /etc/resolv.conf    # check configured DNS servers
Solution — Step by Step
  1. Restart network interface: sudo ip link set eth0 down && sudo ip link set eth0 up
  2. Request new IP via DHCP: sudo dhclient -r && sudo dhclient eth0
  3. Add Google DNS if resolv.conf is empty: echo "nameserver 8.8.8.8" | sudo tee /etc/resolv.conf
  4. Add missing default route: sudo ip route add default via 192.168.1.1
  5. If still failing, restart networking service: sudo systemctl restart NetworkManager
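The diagnosis order above (gateway first, then a public IP, then DNS) can be sketched as a small decision function. This is a minimal sketch rather than a complete tool: the names classify_fault and probe are hypothetical helpers, and 192.168.1.1 in the usage line is the example gateway from the steps above.

```shell
# classify_fault GATEWAY_OK INTERNET_OK DNS_OK
# Each argument is "yes" or "no" (hypothetical helper, not a standard tool).
classify_fault() {
    if [ "$1" != "yes" ]; then
        echo "local: check interface, DHCP lease, and default route"
    elif [ "$2" != "yes" ]; then
        echo "upstream: check the router's WAN link or the ISP"
    elif [ "$3" != "yes" ]; then
        echo "dns: check /etc/resolv.conf"
    else
        echo "ok"
    fi
}

# probe CMD...: prints "yes" if the command succeeds, "no" otherwise.
probe() {
    if "$@" >/dev/null 2>&1; then echo yes; else echo no; fi
}
```

Possible usage: classify_fault "$(probe ping -c 2 192.168.1.1)" "$(probe ping -c 2 8.8.8.8)" "$(probe nslookup google.com)"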
Tags: networking, dhcp, dns, linux
High DNS resolution failing — domain not found
Problem
Websites fail with "server not found" but IP addresses work fine. nslookup google.com returns SERVFAIL.
Diagnosis
nslookup google.com            # test DNS resolution
nslookup google.com 8.8.8.8    # test against Google DNS directly
cat /etc/resolv.conf           # see configured DNS servers
resolvectl status              # check systemd-resolved (systemd-resolve --status on older releases)
Solution
  1. Set reliable DNS servers: sudo nano /etc/resolv.conf → add nameserver 8.8.8.8 and nameserver 1.1.1.1
  2. Flush DNS cache: sudo resolvectl flush-caches (sudo systemd-resolve --flush-caches on older systemd releases)
  3. On Windows: ipconfig /flushdns
  4. Restart DNS resolver: sudo systemctl restart systemd-resolved
Tags: dns, networking, resolv.conf
Medium High network latency and packet loss
Problem
Slow network, video calls drop, ping shows >200ms or packet loss.
Diagnosis
ping -c 50 8.8.8.8         # check packet loss %
mtr --report 8.8.8.8       # traceroute and ping combined
iperf3 -c iperf.he.net     # test bandwidth
ss -tulpn                  # list listening sockets and the processes that own them
nethogs                    # per-process bandwidth monitor
Solution
  1. Identify bandwidth hogs: sudo nethogs eth0 — kill greedy processes
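  2. Check for duplex mismatch: ethtool eth0. If it reports half duplex, force full duplex: sudo ethtool -s eth0 speed 1000 duplex full
  3. Enable jumbo frames for LAN: sudo ip link set eth0 mtu 9000 (only if every switch and host on the path supports MTU 9000; a mismatch causes dropped packets)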
  4. If Wi-Fi: switch to 5GHz band or move closer to AP
  5. If VPS: contact provider — could be noisy neighbor on shared host
Tags: latency, packet-loss, mtr
Medium SSH connection refused on port 22
Problem
ssh user@host returns "Connection refused" or times out.
Diagnosis
nc -zv host 22          # test if port 22 is open
nmap -p 22 host         # port scan
# on the server:
systemctl status sshd   # is the SSH daemon running?
ss -tlnp | grep 22      # is it listening?
ufw status              # is the firewall blocking it?
Solution
  1. Start SSH daemon: sudo systemctl start sshd && sudo systemctl enable sshd (the unit is named ssh on Debian/Ubuntu)
  2. Allow port in firewall: sudo ufw allow 22/tcp
  3. If port was changed, connect with: ssh -p 2222 user@host
  4. Check SSH config: sudo sshd -T | grep port
Tags: ssh, firewall, port-22
Medium SSL certificate error in browser
Problem
"Your connection is not private" / NET::ERR_CERT_AUTHORITY_INVALID or certificate expired warning.
Diagnosis
openssl s_client -connect domain.com:443 </dev/null 2>/dev/null | openssl x509 -noout -dates
# Shows notBefore and notAfter dates (</dev/null stops s_client from waiting on stdin)
curl -vI https://domain.com 2>&1 | grep -E "expire|SSL|cert"
Solution
  1. Check expiry: if expired, renew with sudo certbot renew --force-renewal
  2. Test that automatic renewal works: sudo certbot renew --dry-run
  3. If no renewal timer or cron job is installed, add one: 0 12 * * * certbot renew --quiet
  4. If self-signed cert: replace with Let's Encrypt free cert
  5. Check system clock — wrong date causes SSL failures: timedatectl status
Tags: ssl, https, certbot, certificate
Low VPN connected but no internet access
Problem
VPN shows connected but browsing fails. Also known as "VPN tunnel all traffic" issue.
Solution
  1. Check routing table: ip route show — look for 0.0.0.0/0 via VPN interface
  2. Enable split tunneling in VPN client settings to allow non-VPN traffic
  3. Add DNS bypass: set DNS to 8.8.8.8 in network settings while VPN is on
  4. On OpenVPN: remove redirect-gateway def1 from config if you don't want all traffic routed
Tags: vpn, routing, split-tunnel
High Port 80/443 not accessible from outside
Problem
Web server runs locally but external users can't reach the site.
Diagnosis
ss -tlnp | grep -E ":(80|443) "   # is nginx/apache listening? (matches the port, not any string containing 80)
sudo ufw status                    # firewall rules
curl -I http://localhost           # local test
# From outside:
curl -I http://YOUR_IP
Solution
  1. Open firewall: sudo ufw allow 80/tcp && sudo ufw allow 443/tcp
  2. Check server binds to 0.0.0.0 not 127.0.0.1 in nginx/apache config
  3. If cloud server (AWS/DO): add inbound rules in security group / firewall panel
  4. Restart web server: sudo systemctl restart nginx
Tags: nginx, firewall, ufw, web-server
🔒

Security Problems

30 problems
High Server getting brute-forced via SSH
Problem
Hundreds of failed SSH login attempts in auth.log. Server is being scanned/attacked.
Diagnosis
sudo grep "Failed password" /var/log/auth.log | tail -20
# Top source IPs. $11 is the IP for existing usernames; on "invalid user" lines the IP is $13.
sudo grep "Failed password" /var/log/auth.log | awk '{print $11}' | sort | uniq -c | sort -rn | head -10
Solution
  1. Install fail2ban: sudo apt install fail2ban && sudo systemctl enable fail2ban
  2. Disable password auth — use keys only: in /etc/ssh/sshd_config set PasswordAuthentication no
  3. Change SSH port: Port 2222 in sshd_config (reduces noise significantly)
  4. Allow only your IP: sudo ufw allow from YOUR_IP to any port 22
  5. Reload SSH: sudo systemctl reload sshd
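Because the IP address sits in a different field depending on whether the username exists, a count keyed on the word after "from" may be more robust than a fixed awk column. The sketch below assumes OpenSSH-style auth.log lines; top_failed_ips is a hypothetical helper name, not part of any package.

```shell
# top_failed_ips: reads auth.log lines on stdin and prints "COUNT IP",
# highest count first. Hypothetical helper; assumes lines of the form
# "... Failed password for [invalid user] NAME from IP port N ssh2".
top_failed_ips() {
    awk '/Failed password/ {
             for (i = 1; i < NF; i++) {
                 if ($i == "from") { count[$(i + 1)]++; break }
             }
         }
         END { for (ip in count) print count[ip], ip }' | sort -rn
}
```

Possible usage: sudo cat /var/log/auth.log | top_failed_ips | head -10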
Tags: ssh, brute-force, fail2ban, hardening
High Website defaced or injected with malware
Problem
Site shows unexpected content, redirects visitors, or Google marks it as dangerous.
Diagnosis
find /var/www -name "*.php" -newer /var/www/index.php -ls   # recently changed files
grep -r "eval(base64_decode" /var/www/                      # common malware pattern
grep -r "iframe" /var/www/ --include="*.html"               # injected iframes
Solution
  1. Take site offline immediately to protect visitors
  2. Restore from last known clean backup
  3. Change all passwords: FTP, database, CMS admin, hosting panel
  4. Update CMS/plugins to latest versions — patch the entry point
  5. Add WAF: Cloudflare free plan blocks most attacks
  6. Request Google reconsideration after cleanup
Tags: malware, defacement, cms, recovery
High Ransomware encrypted files on server
Problem
Files renamed with unknown extension (.locked, .crypt). Ransom note left on server.
Solution
  1. Immediately isolate the server — disconnect from network
  2. Do NOT pay the ransom — no guarantee of decryption
  3. Identify the ransomware family at nomoreransom.org — may have free decryptor
  4. Restore from offline backup (why you should always have air-gapped backups)
  5. Report to CISA (US) or your national cybercrime unit
  6. After restore: patch the entry vector, enable EDR, set up immutable backups
Tags: ransomware, incident-response, backup
Medium API keys accidentally pushed to GitHub
Problem
AWS keys, API tokens or passwords committed to a public or private repository.
Solution
  1. Revoke the exposed key IMMEDIATELY — do this before anything else
  2. Remove from Git history: git filter-repo --path secrets.env --invert-paths
  3. Force push: git push origin --force --all
  4. Add to .gitignore: echo ".env" >> .gitignore
  5. Use environment variables or a secrets manager (AWS Secrets Manager, HashiCorp Vault)
  6. Enable GitHub secret scanning to catch this automatically in future
Tags: api-keys, git, secrets, github
Medium Users logging in from suspicious locations
Problem
Account accessed from unknown country/IP. Possible credential compromise.
Solution
  1. Force password reset for affected account immediately
  2. Invalidate all existing sessions
  3. Enable MFA — even if password is leaked, attacker can't log in
  4. Check for other accounts using same password — change those too
  5. Review audit logs for what the attacker accessed or changed
  6. Set up geo-blocking or anomaly detection alerts
Tags: account-takeover, mfa, incident
Low Missing security headers on web application
Problem
securityheaders.com gives F grade. Missing CSP, HSTS, X-Frame-Options etc.
Solution — Add to Nginx config
add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;
add_header X-Content-Type-Options "nosniff" always;
add_header X-Frame-Options "SAMEORIGIN" always;
add_header X-XSS-Protection "1; mode=block" always;
add_header Referrer-Policy "strict-origin-when-cross-origin" always;
add_header Content-Security-Policy "default-src 'self'; script-src 'self' 'unsafe-inline' https://www.googletagmanager.com; style-src 'self' 'unsafe-inline';" always;
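After reloading Nginx, one way to confirm the headers are actually being sent is to inspect a live response. The sketch below is an assumption-laden helper (missing_headers is a hypothetical name) that reads response headers on stdin and lists which of the headers above are absent.

```shell
# missing_headers: reads HTTP response headers on stdin (for example from
# curl -sI) and prints each expected security header that is absent.
# Header names are matched case-insensitively.
missing_headers() {
    headers=$(cat)
    for name in Strict-Transport-Security X-Content-Type-Options \
                X-Frame-Options Referrer-Policy Content-Security-Policy; do
        if ! printf '%s\n' "$headers" | grep -qi "^$name:"; then
            echo "$name"
        fi
    done
}
```

Possible usage: curl -sI https://domain.com | missing_headers (no output means all five headers were found).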
Tags: security-headers, nginx, csp, hsts
🐧

Linux Problems

28 problems
High Disk 100% full — server unresponsive
Problem
df -h shows 100% usage. Applications crash, logs stop writing, server may become unreachable.
Find the culprit
df -h                                           # which partition is full
du -sh /* 2>/dev/null | sort -rh | head -10     # top directories by size
du -sh /var/log/* | sort -rh | head -10         # often logs are the issue
find / -name "*.log" -size +100M 2>/dev/null    # big log files
Solution
  1. Clear old logs: sudo journalctl --vacuum-size=100M
  2. Remove old kernels: sudo apt autoremove --purge
  3. Clean apt cache: sudo apt clean
  4. Find and delete large temp files: sudo find /tmp -type f -size +50M -delete
  5. Truncate large log file (safe): sudo truncate -s 0 /var/log/syslog
  6. Set log rotation: configure /etc/logrotate.conf
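To catch a filling disk before it reaches 100%, the df check above can be turned into a threshold test suitable for cron. This is a minimal sketch: full_mounts is a hypothetical helper, the 90% threshold in the usage line is an arbitrary example, and mount points containing spaces are not handled.

```shell
# full_mounts THRESHOLD: reads `df -P` output on stdin and prints
# "MOUNTPOINT USE%" for every filesystem at or above THRESHOLD percent.
full_mounts() {
    awk -v limit="$1" 'NR > 1 {
        use = $5
        sub(/%/, "", use)                 # strip the trailing percent sign
        if (use + 0 >= limit + 0) print $6, use "%"
    }'
}
```

Possible usage: df -P | full_mounts 90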
Tags: disk-full, logs, storage, linux
High CPU at 100% — server crawling
Diagnosis
top                              # see what's eating CPU (press P to sort by CPU)
ps aux --sort=-%cpu | head -10   # top CPU processes
htop                             # visual alternative
iotop                            # if it's I/O wait, not CPU
Solution
  1. Kill runaway process: kill PID (use kill -9 PID only if it ignores the default SIGTERM)
  2. Lower a process's priority: renice -n 10 -p PID
  3. Check for crypto miners: ps aux | grep -i "minerd\|xmrig\|cryptonight"
  4. If it's a web process: check for slow database queries or infinite loops in code
  5. Set CPU limits with cgroups or systemd: CPUQuota=50% in service file
Tags: cpu, performance, top, process
High Permission denied errors on files/directories
Problem
bash: ./script.sh: Permission denied or cannot write to a directory you should own.
Diagnosis
ls -la /path/to/file    # check permissions and owner
whoami                  # current user
id                      # groups you're in
stat /path/to/file      # full permission details
Solution
  1. Make script executable: chmod +x script.sh
  2. Change ownership: sudo chown user:group /path
  3. Fix web directory permissions: sudo chown -R www-data:www-data /var/www
  4. Set correct directory perms: chmod 755 /dir (dirs need execute bit)
  5. Add user to group: sudo usermod -aG groupname username (log out and back in)
Tags: permissions, chmod, chown, linux
Medium Service fails to start after reboot
Diagnosis
systemctl status servicename                      # see error message
journalctl -u servicename -n 50                   # last 50 log lines for service
journalctl -u servicename --since "1 hour ago"
Solution
  1. Enable auto-start: sudo systemctl enable servicename
  2. Fix config errors shown in journal output
  3. Check dependencies: systemctl list-dependencies servicename
  4. If port conflict: ss -tlnp | grep PORT — kill conflicting process
  5. Reload daemon after editing service file: sudo systemctl daemon-reload
Tags: systemd, service, boot, journalctl
Medium Out of memory — OOM killer killing processes
Diagnosis
free -h                          # see available RAM
dmesg | grep -i "oom\|killed"    # OOM killer log
ps aux --sort=-%mem | head -10   # top memory processes
Solution
  1. Add swap (quick fix): sudo fallocate -l 2G /swapfile && sudo chmod 600 /swapfile && sudo mkswap /swapfile && sudo swapon /swapfile
  2. Make swap permanent: add /swapfile none swap sw 0 0 to /etc/fstab
  3. Tune swappiness: sudo sysctl vm.swappiness=10
  4. Find memory leaks — restart the leaking service nightly via cron
  5. Upgrade RAM or move to larger server
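To notice memory pressure before the OOM killer acts, the figure that free -h reports as available can be read directly from /proc/meminfo and expressed as a percentage. The helper name mem_available_pct is hypothetical; the MemTotal and MemAvailable fields are standard on current Linux kernels.

```shell
# mem_available_pct: reads /proc/meminfo-style text on stdin and prints
# MemAvailable as a whole-number percentage of MemTotal.
mem_available_pct() {
    awk '/^MemTotal:/     { total = $2 }
         /^MemAvailable:/ { avail = $2 }
         END { if (total > 0) printf "%d\n", avail * 100 / total }'
}
```

Possible usage: mem_available_pct < /proc/meminfo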
Tags: memory, oom, swap, ram
☁️

Cloud Problems

24 problems
High AWS bill unexpectedly high — cost spike
Problem
AWS/GCP/Azure bill 10x higher than expected. Often caused by forgotten resources or data transfer costs.
Solution
  1. Go to AWS Cost Explorer → identify the service causing the spike
  2. Common culprits: NAT Gateway data transfer, unused EC2 instances, S3 request spikes, Elastic IPs not attached
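  3. Set billing alarm (billing metrics live in us-east-1 and must first be enabled in Billing preferences; the 100 USD threshold and SNS_TOPIC_ARN are placeholders): aws cloudwatch put-metric-alarm --region us-east-1 --alarm-name billing-alarm --namespace AWS/Billing --metric-name EstimatedCharges --dimensions Name=Currency,Value=USD --statistic Maximum --period 21600 --evaluation-periods 1 --threshold 100 --comparison-operator GreaterThanThreshold --alarm-actions SNS_TOPIC_ARN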
  4. Enable AWS Budgets — get email before threshold reached
  5. Use Reserved Instances or Savings Plans for predictable workloads (saves 40-70%)
  6. Delete idle resources: use AWS Trusted Advisor to find them
Tags: aws, cost, billing, cloud
High EC2 instance unreachable after security group change
Problem
Locked yourself out of EC2 — SSH and HTTP both blocked after mis-configuring security group.
Solution
  1. Go to AWS Console → EC2 → Security Groups
  2. Find the security group → Edit Inbound Rules
  3. Add rule: Type SSH, Port 22, Source: My IP (or 0.0.0.0/0 temporarily)
  4. For future: use AWS Systems Manager Session Manager — SSH without port 22
  5. Never remove all rules in one batch — always add new rule before removing old
Tags: ec2, security-group, aws, lockout
Medium S3 bucket public — data exposed
Problem
S3 bucket accidentally made public exposing files. Noticed via security scan or data breach.
Solution
  1. Block public access immediately: AWS Console → S3 → Bucket → Permissions → Block all public access → Enable
  2. Or via CLI: aws s3api put-public-access-block --bucket BUCKET --public-access-block-configuration "BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true"
  3. Audit what was accessed: enable S3 server access logging and CloudTrail
  4. Enable S3 Block Public Access at account level to prevent future mistakes
Tags: s3, aws, data-exposure, security
Medium Kubernetes pod stuck in CrashLoopBackOff
Diagnosis
kubectl get pods                      # see status
kubectl describe pod POD_NAME         # events and error details
kubectl logs POD_NAME --previous      # logs from crashed container
kubectl logs POD_NAME -c CONTAINER    # specific container logs
Solution
  1. Read crash logs — usually reveals the root cause immediately
  2. Common causes: missing environment variables, wrong image, misconfigured liveness probe
  3. Check resource limits. OOMKilled means the container exceeded its memory limit: kubectl describe pod POD_NAME | grep -A5 Limits
  4. Test image locally: docker run --rm IMAGE_NAME
  5. Fix config then: kubectl rollout restart deployment/DEPLOYMENT_NAME
Tags: kubernetes, k8s, crashloop, pods
⚙️

DevOps Problems

24 problems
High Docker container exits immediately after start
Diagnosis
docker ps -a                   # see exited containers
docker logs CONTAINER_ID       # see what it printed before dying
docker inspect CONTAINER_ID    # full config and exit code
docker run -it IMAGE /bin/sh   # run interactively to debug
Solution
  1. Exit code 1 = application error — check logs for crash message
  2. Exit code 137 = killed with SIGKILL (128 + 9), most often by the OOM killer: increase the --memory limit
  3. Missing CMD/ENTRYPOINT — add to Dockerfile: CMD ["python", "app.py"]
  4. Missing env vars — add -e VAR=value or use --env-file .env
  5. Permission issue on mounted volumes — check :z flag on SELinux systems
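The exit-code reading above follows a general rule: codes above 128 mean the process died from signal (code minus 128), and 125 to 127 are reserved by docker run. A small sketch of that mapping, where explain_exit is a hypothetical helper name:

```shell
# explain_exit CODE: prints a likely meaning for a container exit code.
explain_exit() {
    code=$1
    case "$code" in
        0)   echo "clean exit" ;;
        125) echo "docker run itself failed" ;;
        126) echo "command found but not executable" ;;
        127) echo "command not found" ;;
        *)
            if [ "$code" -gt 128 ]; then
                # 137 -> signal 9 (SIGKILL), 143 -> signal 15 (SIGTERM)
                echo "killed by signal $((code - 128))"
            else
                echo "application error"
            fi
            ;;
    esac
}
```

Possible usage: explain_exit "$(docker inspect --format '{{.State.ExitCode}}' CONTAINER_ID)". The same inspect output also carries .State.OOMKilled, which distinguishes an OOM kill from another source of SIGKILL.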
Tags: docker, container, devops
High CI/CD pipeline failing on every push
Problem
GitHub Actions / Jenkins pipeline red on every commit. Tests pass locally but fail in CI.
Solution
  1. Read the full CI log — don't just see "failed", read WHY
  2. Common: missing secrets — add to GitHub Settings → Secrets and variables
  3. Environment mismatch — CI uses different Node/Python version: pin it in workflow node-version: '20'
  4. Missing test dependencies: add install step before test step
  5. Reproduce locally: act tool runs GitHub Actions locally
  6. Port conflicts between parallel jobs: use dynamic ports or sequential jobs
Tags: ci-cd, github-actions, jenkins, pipeline
Medium Git merge conflicts blocking deployment
Solution
  1. See all conflicted files: git diff --name-only --diff-filter=U (git status | grep "both modified" misses add/delete conflicts)
  2. Open each file — resolve between <<<<<<< and >>>>>>> markers
  3. Use visual tool: git mergetool (opens vimdiff or configured tool)
  4. After resolving: git add . && git commit
  5. Prevent conflicts: merge main into feature branches frequently, keep PRs small
Tags: git, merge, conflict, devops
Medium Nginx 502 Bad Gateway error
Diagnosis
sudo tail -f /var/log/nginx/error.log   # see the actual error
systemctl status gunicorn               # is backend running?
curl http://localhost:8000              # test backend directly
ss -tlnp | grep 8000                    # is backend listening on expected port?
Solution
  1. 502 = nginx can't reach backend — start the backend service
  2. Check proxy_pass port matches where backend actually runs
  3. Increase timeouts if backend is slow: proxy_read_timeout 300; in nginx config
  4. Check unix socket permissions if using socket instead of port
  5. Reload after config fix: sudo nginx -t && sudo systemctl reload nginx
Tags: nginx, 502, proxy, backend
🗄️

Database Problems

24 problems
High MySQL queries extremely slow — full table scans
Diagnosis
EXPLAIN SELECT * FROM users WHERE email = '[email protected]';
-- Look for "type: ALL" = full table scan, needs an index
SHOW FULL PROCESSLIST;                    -- see running queries
SHOW VARIABLES LIKE 'slow_query_log%';    -- check whether the slow query log is enabled
Solution
  1. Add index on searched column: CREATE INDEX idx_email ON users(email);
  2. Enable slow query log: SET GLOBAL slow_query_log = 1; SET GLOBAL long_query_time = 1;
  3. Analyze with EXPLAIN — every query plan should show index use, not ALL
  4. Use composite index for multi-column WHERE: CREATE INDEX idx_name ON table(col1, col2);
  5. Run ANALYZE TABLE to update statistics: ANALYZE TABLE users;
Tags: mysql, index, performance, slow-query
High Database connection pool exhausted
Problem
"Too many connections" error. Application throws connection errors under load.
Diagnosis
SHOW STATUS LIKE 'Threads_connected';
SHOW VARIABLES LIKE 'max_connections';
SHOW FULL PROCESSLIST;    -- see all open connections
Solution
  1. Increase max connections: SET GLOBAL max_connections = 500; (also set max_connections in my.cnf, because SET GLOBAL is lost on restart)
  2. Add connection pooling: use PgBouncer (PostgreSQL) or ProxySQL (MySQL)
  3. Find connection leaks in code — ensure connections are closed after use
  4. Kill idle connections: KILL CONNECTION_ID;
  5. Set wait_timeout: SET GLOBAL wait_timeout = 60;
Tags: mysql, postgresql, connections, pool
High Accidentally deleted important data — no backup
Problem
DELETE FROM users; or DROP TABLE orders; without WHERE clause or backup.
Solution
  1. Stop writes immediately to prevent overwriting deleted data
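  2. MySQL: replay binary logs up to just before the destructive statement (without --stop-datetime the replay re-runs the DELETE or DROP): mysqlbinlog --start-datetime="2026-07-03 08:00:00" --stop-datetime="TIME_JUST_BEFORE_DELETE" /var/lib/mysql/mysql-bin.000001 | mysql -u root -p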
  3. PostgreSQL: if WAL archiving was on, use Point-in-Time Recovery
  4. Check if any replica has the data — promote replica before it syncs the delete
  5. Lesson: always use transactions, always have backups, test restores regularly
Tags: data-recovery, backup, binlog, disaster
Medium Deadlock errors in database logs
Diagnosis
SHOW ENGINE INNODB STATUS\G    -- MySQL deadlock info
-- Look for the "LATEST DETECTED DEADLOCK" section
Solution
  1. Always access tables in the same order across all transactions
  2. Keep transactions short — don't hold locks while doing non-DB work
  3. Use lower isolation level if possible: READ COMMITTED instead of REPEATABLE READ
  4. Add retry logic in application for deadlock errors (error code 1213 in MySQL)
  5. Use SELECT ... FOR UPDATE only when you'll actually update
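The retry step above can be sketched for scripts that drive the mysql client. This is an illustration under assumptions, not a drop-in tool: retry_on_deadlock and the RETRY_DELAY variable are hypothetical names, and the helper only recognizes a deadlock by the "ERROR 1213" text the client prints. The command being retried should contain the whole transaction, since MySQL rolls back the deadlocked transaction.

```shell
# retry_on_deadlock MAX CMD...: runs CMD, retrying up to MAX attempts when
# its output mentions MySQL error 1213 (deadlock). Any other failure is
# reported immediately. RETRY_DELAY (seconds between attempts) defaults to 1.
retry_on_deadlock() {
    max=$1
    shift
    attempt=1
    while :; do
        if out=$("$@" 2>&1); then
            printf '%s\n' "$out"
            return 0
        fi
        case "$out" in
            *"ERROR 1213"*) ;;                        # deadlock: retry below
            *) printf '%s\n' "$out" >&2; return 1 ;;  # other error: give up
        esac
        if [ "$attempt" -ge "$max" ]; then
            printf '%s\n' "$out" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "${RETRY_DELAY:-1}"
    done
}
```

Possible usage: retry_on_deadlock 3 mysql -u root -p DATABASE -e "SOURCE transaction.sql" (DATABASE and transaction.sql are placeholders).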
Tags: deadlock, transactions, mysql, innodb
🖥️

Hardware Problems

22 problems
High Server overheating — thermal throttling
Diagnosis
sensors                                     # CPU/GPU temps (install: apt install lm-sensors)
sudo sensors-detect                         # first-time setup
cat /sys/class/thermal/thermal_zone*/temp   # raw temp in millidegrees Celsius
dmesg | grep -i "thermal\|throttl"          # kernel thermal events
Solution
  1. Clean dust from fans and heatsinks — compressed air every 6-12 months
  2. Replace thermal paste on CPU — degrades after 3-5 years
  3. Ensure proper airflow — hot exhaust not recirculating back as intake
  4. Check fan speeds: sensors — if fans at 0 RPM they've failed
  5. For servers: check data center ambient temp and cooling unit function
Tags: thermal, cpu, cooling, hardware
High Hard drive failing — S.M.A.R.T. errors
Diagnosis
sudo smartctl -a /dev/sda           # full SMART report
sudo smartctl -H /dev/sda           # overall health assessment
dmesg | grep -i "error\|ata\|I/O"   # kernel disk errors
Solution
  1. If SMART shows "FAILED" — backup ALL data immediately, drive will die soon
  2. Watch Reallocated_Sector_Ct: a non-zero raw value, especially one that keeps rising, is a serious warning sign
  3. Set up monitoring: sudo apt install smartmontools && sudo systemctl enable smartd
  4. Check disk health weekly: add to cron smartctl -H /dev/sda | mail -s "Disk Health" [email protected]
  5. Replace drive before it fails — RAID is not a backup
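For the weekly check, the reallocated-sector count can be pulled out of the attribute table that smartctl -A prints for ATA drives, where the raw value is the tenth column. The helper name reallocated_sectors is hypothetical, and NVMe drives report health differently, so this sketch applies to SATA/ATA disks only.

```shell
# reallocated_sectors: reads `smartctl -A` output on stdin and prints the raw
# value of attribute 5 (Reallocated_Sector_Ct), which is the tenth column.
# Prints nothing if the attribute is not present.
reallocated_sectors() {
    awk '$2 == "Reallocated_Sector_Ct" { print $10 }'
}
```

Possible usage: sudo smartctl -A /dev/sda | reallocated_sectors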
Tags: smart, disk, hdd, failure
Medium RAM causing random crashes and blue screens
Diagnosis
# Linux: run memtest86 from boot
# Or: sudo apt install memtester
sudo memtester 1G 1    # test 1GB of RAM, 1 pass
# Windows: mdsched.exe → Restart and check for problems
Solution
  1. Run memtest86+ overnight — any errors = faulty RAM
  2. If multiple sticks: remove one at a time to isolate the bad stick
  3. Reseat RAM sticks — remove and firmly push back in
  4. Try different RAM slots — could be motherboard slot fault
  5. Replace faulty DIMM — RAM is relatively inexpensive
Tags: ram, memory, memtest, crash
📋

Compliance Problems

20 problems
High GDPR violation — user data processed without consent
Problem
Tracking users, storing personal data, or sending emails without proper GDPR consent mechanism.
Solution
  1. Add cookie consent banner before loading any tracking (GA4, Meta Pixel, etc.)
  2. Create and publish Privacy Policy and Cookie Policy pages
  3. Implement consent management platform (CMP): Cookiebot, OneTrust, or open-source Klaro
  4. Document your legal basis for processing: consent, legitimate interest, contract, etc.
  5. Add "Delete My Data" mechanism for users to exercise right to erasure
  6. Appoint DPO if processing sensitive data at scale
Tags: gdpr, privacy, consent, compliance
High PCI DSS — storing plain-text card data
Problem
Card numbers, CVVs, or full track data stored in database in plain text — massive PCI violation.
Solution
  1. Delete all stored card data immediately
  2. Use a payment processor (Stripe, Braintree) — never touch raw card data
  3. Tokenize: processor gives you a token to charge later, card number never hits your server
  4. Scope reduction: if you don't store/transmit card data, PCI scope drops to SAQ A (simplest)
  5. Never store CVV under any circumstances — prohibited by PCI DSS
Tags: pci-dss, payments, compliance, stripe
Medium Audit log gaps — missing activity records
Problem
Security audit finds incomplete logs — who accessed what data and when is not tracked.
Solution
  1. Enable database audit logging: MySQL audit plugin or PostgreSQL pgaudit extension
  2. Log all admin actions: who changed what config, when, from which IP
  3. Ship logs to immutable storage: attacker can't delete what they can't reach
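  4. Set retention: PCI DSS requires at least 12 months of audit log history, with the most recent 3 months immediately available; SOC 2 sets no fixed period, so define one in your own policy (1 year is a common choice)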
  5. Use structured logging (JSON) for easy querying and alerting
Tags: audit-logs, soc2, compliance, monitoring