From 584e4ac457f1bbf18f7f9683bce14abee1713498 Mon Sep 17 00:00:00 2001
From: sumit9165
Date: Sat, 24 Jan 2026 20:24:13 +0530
Subject: [PATCH 01/32] Add learning plan for DevOps development

Document learning plan for DevOps skills and goals.
---
 2026/day-01/learning-plan.md | 14 ++++++++++++++
 1 file changed, 14 insertions(+)
 create mode 100644 2026/day-01/learning-plan.md

diff --git a/2026/day-01/learning-plan.md b/2026/day-01/learning-plan.md
new file mode 100644
index 000000000..7a31e2fd1
--- /dev/null
+++ b/2026/day-01/learning-plan.md
@@ -0,0 +1,14 @@
+Current Level - Fresher in DevOps (Ex Electrical Maintenance Engineer of Tata Steel with 5 years of experience){Unemployed}.
+
+Goals - 1. Learning the basics of DevOps.
+Goals - 2. Making Notes with Consistency.
+Goals - 3. Improving Myself Every Day with Hands-On Practice.
+
+Core DevOps Skills - 1. Linux and Networking
+Core DevOps Skills - 2. Git & Docker
+Core DevOps Skills - 3. Understand how to debug problems and find the reason why they occur.
+
+Weekly Time Budget - 
+1. Daily - 3 Hours spending on learning and hands-on.
+2. Weekdays - 2 Hours spending on learning and hands-on.
+

From ba5a2cab008f25aba5758db4de216f9bbf53c575 Mon Sep 17 00:00:00 2001
From: sumit9165
Date: Sun, 25 Jan 2026 12:59:52 +0530
Subject: [PATCH 02/32] Created linux-architecture-notes.md

Linux architecture Notes
---
 2026/day-02/linux-architecture-notes.md | 1 +
 1 file changed, 1 insertion(+)
 create mode 100644 2026/day-02/linux-architecture-notes.md

diff --git a/2026/day-02/linux-architecture-notes.md b/2026/day-02/linux-architecture-notes.md
new file mode 100644
index 000000000..8b1378917
--- /dev/null
+++ b/2026/day-02/linux-architecture-notes.md
@@ -0,0 +1 @@
+

From 070b0385f62317432f9a354a9af0d96839c7d8b1 Mon Sep 17 00:00:00 2001
From: sumit9165
Date: Sun, 25 Jan 2026 15:46:44 +0530
Subject: [PATCH 03/32] Add detailed Linux architecture notes

Added comprehensive notes on Linux operating system, covering its components, distributions, installation methods, process states, and commonly used commands.
---
 2026/day-02/linux-architecture-notes.md | 106 ++++++++++++++++++++++++
 1 file changed, 106 insertions(+)

diff --git a/2026/day-02/linux-architecture-notes.md b/2026/day-02/linux-architecture-notes.md
index 8b1378917..e1640e35d 100644
--- a/2026/day-02/linux-architecture-notes.md
+++ b/2026/day-02/linux-architecture-notes.md
@@ -1 +1,107 @@
+# Linux Operating System
+## Operating System
+Linux is an operating system like many others, such as DOS, VMS, OS/360, or CP/M. It performs many of the same tasks in very similar manners. It is the manager and administrator of all the system resources and facilities. Without it, nothing works.
+## What is Linux?
+Just like Windows, iOS, and Mac OS, Linux is an operating system.
+An operating system is software that manages all of the hardware resources associated with your desktop or laptop.
+The operating system manages the communication between your software and your hardware. Without the operating system (OS), the software wouldn’t function.
+### Linux Operating System has several different pieces.
+1. 
Bootloader – The software that manages the boot process of your computer. For most users, this will simply be a splash screen that pops up and eventually goes away to boot into the operating system.
+2. Kernel – This is the one piece of the whole that is actually called ‘Linux’. The kernel is the core of the system and manages the CPU, memory, and peripheral devices. The kernel is the lowest level of the OS.
+3. Init system – This is a sub-system that bootstraps the user space and is charged with controlling daemons. One of the most widely used init systems is systemd. It is the init system that manages the boot process, once the initial booting is handed over from the bootloader (i.e., GRUB or GRand Unified Bootloader).
+4. Daemons – These are background services (printing, sound, scheduling, etc.) that either start up during boot or after you log into the desktop.
+5. Graphical server – This is the sub-system that displays the graphics on your monitor. It is commonly referred to as the X server or just X.
+6. Desktop environment – This is the piece that the users actually interact with. Each desktop environment includes built-in applications (such as file managers, configuration tools, web browsers, and games).
+7. Applications – For Example- Ubuntu Linux has the Ubuntu Software Center (a rebrand of GNOME Software) which allows you to quickly search among the thousands of apps and install them from one centralized location.
+## Why Use Linux?
+1. Linux is less vulnerable to such attacks. As for server reboots, they’re only necessary if the kernel is updated.
+2. It is not out of the ordinary for a Linux server to go years without being rebooted.
+3. If you follow the regular recommended updates, stability and dependability are practically assured.
+4. You can install Linux on as many computers as you like without paying a cent for software or server licensing.
+## Open Source
+It’s about freedom and freedom of use and freedom of choice. 
Linux is also distributed under an open source license. Open source follows these key tenets:
+1. The freedom to run the program, for any purpose.
+2. The freedom to study how the program works, and change it to make it do what you wish.
+3. The freedom to redistribute copies so you can help your neighbor.
+4. The freedom to distribute copies of your modified versions to others.
+## What is a “distribution?”
+Linux has a number of different versions to suit any type of user. From new users to hard-core users, you’ll find a “flavor” of Linux to match your needs.
+These versions are called distributions (or, in the short form, “distros”).
+
+Popular Linux distributions include:
+1. LINUX MINT
+2. MANJARO
+3. DEBIAN
+4. UBUNTU
+5. ANTERGOS
+6. SOLUS
+7. FEDORA
+8. ELEMENTARY OS
+9. OPENSUSE
+
+You can check out the top 100 distributions on the Distrowatch. And don’t think the server has been left behind. For this arena, you can turn to:
+1. Red Hat Enterprise Linux
+2. Ubuntu Server
+3. CentOS
+4. SUSE Enterprise Linux
+Some of the above server distributions are free (such as Ubuntu Server and CentOS) and some have an associated price (such as Red Hat Enterprise Linux and SUSE Enterprise Linux). Those with an associated price also include support.
+## Which distribution is right for you?
+Which distribution you use will depend on the answer to three simple questions:
+
+1. How skilled of a computer user are you?
+2. Do you prefer a modern or a standard desktop interface?
+3. Server or desktop?
+a. If your computer skills are basic, you’ll want to stick with a newbie-friendly distribution such as Linux Mint, Ubuntu (Figure 3), Elementary OS or Deepin.
+b. If your skill set extends into the above-average range, you could go with a distribution like Debian or Fedora.
+c. If you’ve pretty much mastered the craft of computer and system administration, use a distribution like Gentoo.
+d. 
If you really want a challenge, you can build your very own Linux distribution, with the help of Linux From Scratch.
+
+## Installing Linux in Windows
+1. Through WSL2.
+2. Through Virtual Machines.
+3. Through DualBoot.
+
+The Ubuntu Server does not install a GUI interface
+You can install a GUI package on top of the Ubuntu Server with a single command like
+--> sudo apt-get install ubuntu-desktop.
+## Process & Process States:
+
+A process is more than just a program. Especially in a multi-user, multi-tasking operating system such as UNIX, there is much more to consider. Each program has a set of data that it uses to do what it needs. This data is not part of the program. For example, if you are using a text editor, the file you are editing is not part of the program on disk, but is part of the process in memory. If someone else were to be using the same editor, both of you would be using the same program. However, each of you would have a different process in memory
+
+Many different users can be on the system at the same time, they have processes that are in memory all at the same time. The system needs to keep track of what user is running what process, which terminal the process is running on, and what other resources the process has (such as open files). All of this is part of the process.
+With the exception of the init process (PID 1) every process is the child of another process. Therefore every process with the exception of the init process has a “parent” process.
+### Process States
+The states that a Process enters in working from start till end are known as Process states. These are listed below as:
+1. Created -Process is newly created by system call, is not ready to run
+2. User running -Process is running in user mode which means it is a user process.
+3. Kernel Running -Indicates process is a kernel process running in kernel mode.
+4. Zombie- Process does not exist/ is terminated.
+5. 
Preempted- When process runs from kernel to user mode, it is said to be preempted.
+6. Ready to run in memory- It indicates that process has reached a state where it is ready to run in memory and is waiting for kernel to schedule it.
+7. Ready to run, swapped - Process is ready to run but no empty main memory is present
+8. Sleep, swapped- Process has been swapped to secondary storage and is at a blocked state.
+9. Asleep in memory- Process is in memory (not swapped to secondary storage) but is in blocked state.
+
+### After Process States status will be changing like this-
+1. User-running: Process is in user-running.
+2. Kernel-running: Process is allocated to kernel and hence, is in kernel mode.
+3. Ready to run in memory: Further, after processing in main memory process is rescheduled to the Kernel, i.e., the process is not executing but is ready to run as soon as the kernel schedules it.
+4. Asleep in memory: Process is sleeping but resides in main memory. It is waiting for the task to begin.
+5. Ready to run, swapped: Process is ready to run and be swapped by the processor into main memory, thereby allowing kernel to schedule it for execution.
+6. Sleep, Swapped: Process is in sleep state in secondary memory, making space for execution of other processes in main memory. It may resume once the task is fulfilled.
+7. Pre-empted: Kernel preempts an on-going process for allocation of another process, while the first process is moving from kernel to user mode.
+8. Created: Process is newly created but not running. This is the start state for all processes.
+9. Zombie: Process has been executed thoroughly and exit call has been enabled. The process, thereby, no longer exists. But, it stores a statistical record for the process. This is the final state of all processes.
+
+## Commands in daily use
+1. pwd
+2. ls
+3. cd
+4. mkdir
+5. 
touch
+
+
+
+
+

From 113f546b41198b9397d14e22ed836117432fb4a8 Mon Sep 17 00:00:00 2001
From: sumit9165
Date: Sun, 25 Jan 2026 16:40:07 +0530
Subject: [PATCH 04/32] Revise Linux architecture notes for clarity and detail

Refactor and clarify sections on Linux OS components, distributions, and process states.
---
 2026/day-02/linux-architecture-notes.md | 132 +++++++++++++-----------
 1 file changed, 69 insertions(+), 63 deletions(-)

diff --git a/2026/day-02/linux-architecture-notes.md b/2026/day-02/linux-architecture-notes.md
index e1640e35d..d21d675fd 100644
--- a/2026/day-02/linux-architecture-notes.md
+++ b/2026/day-02/linux-architecture-notes.md
@@ -1,34 +1,41 @@
 # Linux Operating System
 ## Operating System
-Linux is an operating system like many others, such as DOS, VMS, OS/360, or CP/M. It performs many of the same tasks in very similar manners. It is the manager and administrator of all the system resources and facilities. Without it, nothing works.
+- The operating system manages the communication between your software and your hardware. Without the operating system (OS), the software wouldn’t function.
+- It is the manager and administrator of all the system resources and facilities. Without it, nothing works.
 ## What is Linux?
-Just like Windows, iOS, and Mac OS, Linux is an operating system.
-An operating system is software that manages all of the hardware resources associated with your desktop or laptop.
-The operating system manages the communication between your software and your hardware. Without the operating system (OS), the software wouldn’t function.
+- Just like Windows, iOS, and Mac OS, Linux is an operating system.
+- An operating system is software that manages all of the hardware resources associated with your desktop or laptop.
+---
 ### Linux Operating System has several different pieces.
-1. Bootloader – The software that manages the boot process of your computer. 
For most users, this will simply be a splash screen that pops up and eventually goes away to boot into the operating system.
-2. Kernel – This is the one piece of the whole that is actually called ‘Linux’. The kernel is the core of the system and manages the CPU, memory, and peripheral devices. The kernel is the lowest level of the OS.
-3. Init system – This is a sub-system that bootstraps the user space and is charged with controlling daemons. One of the most widely used init systems is systemd. It is the init system that manages the boot process, once the initial booting is handed over from the bootloader (i.e., GRUB or GRand Unified Bootloader).
-4. Daemons – These are background services (printing, sound, scheduling, etc.) that either start up during boot or after you log into the desktop.
-5. Graphical server – This is the sub-system that displays the graphics on your monitor. It is commonly referred to as the X server or just X.
-6. Desktop environment – This is the piece that the users actually interact with. Each desktop environment includes built-in applications (such as file managers, configuration tools, web browsers, and games).
-7. Applications – For Example- Ubuntu Linux has the Ubuntu Software Center (a rebrand of GNOME Software) which allows you to quickly search among the thousands of apps and install them from one centralized location.
+- Bootloader – The software that manages the boot process of your computer. For most users, this will simply be a splash screen that pops up and eventually goes away to boot into the operating system.
+- Kernel – The kernel is the core of the system and manages the CPU, memory, and peripheral devices.
+ -The kernel is the lowest level of the OS.
+- Init system – This is a sub-system that bootstraps the user space and is charged with controlling daemons.
+ - One of the most widely used init systems is systemd. 
It is the init system that manages the boot process, once the initial booting is handed over from the bootloader (i.e., GRUB or GRand Unified Bootloader).
+- Daemons – These are background services (printing, sound, scheduling, etc.).
+ - It will start up during boot or after you log into the desktop.
+- Graphical server – This is the sub-system that displays the graphics on your monitor.
+- Desktop environment – This is the piece that the users actually interact with.
+- Applications – Ubuntu Linux has the Ubuntu Software Center (a rebrand of GNOME Software) which allows you to quickly search among the thousands of apps and install them from one centralized location.
+---
 ## Why Use Linux?
-1. Linux is less vulnerable to such attacks. As for server reboots, they’re only necessary if the kernel is updated.
-2. It is not out of the ordinary for a Linux server to go years without being rebooted.
-3. If you follow the regular recommended updates, stability and dependability are practically assured.
-4. You can install Linux on as many computers as you like without paying a cent for software or server licensing.
+- Linux is less vulnerable to such attacks. As for server reboots, they’re only necessary if the kernel is updated.
+- It is not out of the ordinary for a Linux server to go years without being rebooted.
+- If you follow the regular recommended updates, stability and dependability are practically assured.
+- You can install Linux on as many computers as you like without paying a rupee for software or server licensing.
+---
+
 ## Open Source
-It’s about freedom and freedom of use and freedom of choice. Linux is also distributed under an open source license. Open source follows these key tenets:
-1. The freedom to run the program, for any purpose.
-2. The freedom to study how the program works, and change it to make it do what you wish.
-3. The freedom to redistribute copies so you can help your neighbor.
-4. 
The freedom to distribute copies of your modified versions to others.
+- You can freely run the program, for any purpose.
+- You can freely study how the program works, and change it, make it according to what you want.
+- You can freely redistribute copies so you can help others.
+- You can freely distribute copies of your modified versions to others.
+---
+
 ## What is a “distribution?”
-Linux has a number of different versions to suit any type of user. From new users to hard-core users, you’ll find a “flavor” of Linux to match your needs.
+Linux has a number of different versions to suit any type of user. From new users to hard-core users, you’ll find a “flavor” of Linux.
 These versions are called distributions (or, in the short form, “distros”).
-
-Popular Linux distributions include:
+- Popular Linux distributions include:
 1. LINUX MINT
 2. MANJARO
 3. DEBIAN
@@ -38,60 +45,58 @@ Popular Linux distributions include:
 7. FEDORA
 8. ELEMENTARY OS
 9. OPENSUSE
-
-You can check out the top 100 distributions on the Distrowatch. And don’t think the server has been left behind. For this arena, you can turn to:
-1. Red Hat Enterprise Linux
-2. Ubuntu Server
-3. CentOS
-4. SUSE Enterprise Linux
-Some of the above server distributions are free (such as Ubuntu Server and CentOS) and some have an associated price (such as Red Hat Enterprise Linux and SUSE Enterprise Linux). Those with an associated price also include support.
+---
+
 ## Which distribution is right for you?
-Which distribution you use will depend on the answer to three simple questions:
-1. How skilled of a computer user are you?
-2. Do you prefer a modern or a standard desktop interface?
-3. Server or desktop?
-a. If your computer skills are basic, you’ll want to stick with a newbie-friendly distribution such as Linux Mint, Ubuntu (Figure 3), Elementary OS or Deepin.
-b. If your skill set extends into the above-average range, you could go with a distribution like Debian or Fedora.
-c. 
If you’ve pretty much mastered the craft of computer and system administration, use a distribution like Gentoo.
-d. If you really want a challenge, you can build your very own Linux distribution, with the help of Linux From Scratch.
+- For Beginner with basic skills --- Linux Mint, Ubuntu, Elementary OS or Deepin.
+- For Intermediate or above-average range skills, you could go with a distribution like --- Debian or Fedora.
+- For Advanced-level users who know computer and system administration, use a distribution like --- Gentoo.
+- If you want a challenge, you can build your own Linux distribution, with the help of Linux From Scratch.
+---
 ## Installing Linux in Windows
-1. Through WSL2.
-2. Through Virtual Machines.
-3. Through DualBoot.
+- Through WSL2.
+- Through Virtual Machines.
+- Through DualBoot.
 The Ubuntu Server does not install a GUI interface
-You can install a GUI package on top of the Ubuntu Server with a single command like
---> sudo apt-get install ubuntu-desktop.
+- You can install a GUI package on the Ubuntu Server with a single command like.
+- `sudo apt-get install ubuntu-desktop`
+---
+
 ## Process & Process States:
-A process is more than just a program. Especially in a multi-user, multi-tasking operating system such as UNIX, there is much more to consider. Each program has a set of data that it uses to do what it needs. This data is not part of the program. For example, if you are using a text editor, the file you are editing is not part of the program on disk, but is part of the process in memory. If someone else were to be using the same editor, both of you would be using the same program. However, each of you would have a different process in memory
+- A process is more than just a program. For example, if you are using a text editor, the file you are editing is not part of the program on disk, but is part of the process in memory. If someone else were to be using the same editor, both of you would be using the same program. 
However, each of you would have a different process in memory
-Many different users can be on the system at the same time, they have processes that are in memory all at the same time. The system needs to keep track of what user is running what process, which terminal the process is running on, and what other resources the process has (such as open files). All of this is part of the process.
+- Many different users can be on the system at the same time, they have processes that are in memory all at the same time. The system needs to keep track of what user is running what process, which terminal the process is running on, and what other resources the process has (such as open files). All of this is part of the process.
 With the exception of the init process (PID 1) every process is the child of another process. Therefore every process with the exception of the init process has a “parent” process.
+---
+
 ### Process States
 The states that a Process enters in working from start till end are known as Process states. These are listed below as:
-1. Created -Process is newly created by system call, is not ready to run
-2. User running -Process is running in user mode which means it is a user process.
-3. Kernel Running -Indicates process is a kernel process running in kernel mode.
-4. Zombie- Process does not exist/ is terminated.
-5. Preempted- When process runs from kernel to user mode, it is said to be preempted.
-6. Ready to run in memory- It indicates that process has reached a state where it is ready to run in memory and is waiting for kernel to schedule it.
-7. Ready to run, swapped - Process is ready to run but no empty main memory is present
-8. Sleep, swapped- Process has been swapped to secondary storage and is at a blocked state.
-9. Asleep in memory- Process is in memory (not swapped to secondary storage) but is in blocked state.
+- Created -Process is newly created by system call, is not ready to run
+- User running -Process is running in user mode which means it is a user process.
+- Kernel Running -Indicates process is a kernel process running in kernel mode.
+- Zombie- Process does not exist/ is terminated.
+- Preempted- When process runs from kernel to user mode, it is said to be preempted.
+- Ready to run in memory- It indicates that process has reached a state where it is ready to run in memory and is waiting for kernel to schedule it.
+- Ready to run, swapped - Process is ready to run but no empty main memory is present
+- Sleep, swapped- Process has been swapped to secondary storage and is at a blocked state.
+- Asleep in memory- Process is in memory (not swapped to secondary storage) but is in blocked state.
+---
 ### After Process States status will be changing like this-
-1. User-running: Process is in user-running.
-2. Kernel-running: Process is allocated to kernel and hence, is in kernel mode.
-3. Ready to run in memory: Further, after processing in main memory process is rescheduled to the Kernel, i.e., the process is not executing but is ready to run as soon as the kernel schedules it.
-4. Asleep in memory: Process is sleeping but resides in main memory. It is waiting for the task to begin.
-5. Ready to run, swapped: Process is ready to run and be swapped by the processor into main memory, thereby allowing kernel to schedule it for execution.
-6. Sleep, Swapped: Process is in sleep state in secondary memory, making space for execution of other processes in main memory. It may resume once the task is fulfilled.
-7. Pre-empted: Kernel preempts an on-going process for allocation of another process, while the first process is moving from kernel to user mode.
-8. Created: Process is newly created but not running. This is the start state for all processes.
-9. Zombie: Process has been executed thoroughly and exit call has been enabled. The process, thereby, no longer exists. 
But, it stores a statistical record for the process. This is the final state of all processes.
+- User-running: Process is in user-running.
+- Kernel-running: Process is allocated to kernel and hence, is in kernel mode.
+- Ready to run in memory: Further, after processing in main memory process is rescheduled to the Kernel, i.e., the process is not executing but is ready to run as soon as the kernel schedules it.
+- Asleep in memory: Process is sleeping but resides in main memory. It is waiting for the task to begin.
+- Ready to run, swapped: Process is ready to run and be swapped by the processor into main memory, thereby allowing kernel to schedule it for execution.
+- Sleep, Swapped: Process is in sleep state in secondary memory, making space for execution of other processes in main memory. It may resume once the task is fulfilled.
+- Pre-empted: Kernel preempts an on-going process for allocation of another process, while the first process is moving from kernel to user mode.
+- Created: Process is newly created but not running. This is the start state for all processes.
+- Zombie: Process has been executed thoroughly and exit call has been enabled. The process, thereby, no longer exists. But, it stores a statistical record for the process. This is the final state of all processes.
+---
 ## Commands in daily use
 1. pwd
@@ -100,6 +105,7 @@ The states that a Process enters in working from start till end are known as Pro
 4. mkdir
 5. touch
+---

From 5f4a0cae0e434eaf2801745aade68beef20c8d9a Mon Sep 17 00:00:00 2001
From: sumit9165
Date: Mon, 26 Jan 2026 17:12:10 +0530
Subject: [PATCH 05/32] Add Linux commands cheatsheet

Added a comprehensive cheatsheet for Linux commands, including usage and examples for various commands related to file management, user management, and system information.
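
As a rough illustration of the file-management commands this cheatsheet covers (`mkdir`, `touch`, `cp`, `mv`, `rm`, `ls`), the sketch below runs them inside a throwaway directory. The function name `demo_file_commands` and the file names are hypothetical and not part of the patch. It uses `rm -r` for the directory because plain `rm` refuses to remove directories.

```shell
# Sketch only: exercise the cheatsheet's file commands in a temporary directory.
# The body runs in a subshell ( ... ), so the caller's working directory is untouched.
demo_file_commands() (
  set -eu
  workdir=$(mktemp -d)               # throwaway sandbox directory
  trap 'rm -rf "$workdir"' EXIT      # clean up even if a step fails
  cd "$workdir"

  mkdir Documents                    # create a directory
  touch hello.txt                    # create an empty file
  cp hello.txt Documents/hello-copy.txt   # copy the file into the directory
  mv hello.txt notes.txt             # rename the file
  ls                                 # list the sandbox contents
  ls Documents                       # list the directory contents

  rm notes.txt                       # remove a file
  rm -r Documents                    # remove a directory and its contents
  ls                                 # the sandbox is empty again
)

demo_file_commands
```

Keeping the steps inside a function with a subshell body means the `cd` and the cleanup trap stay local, so the sketch can be pasted into an interactive shell without side effects.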
---
 2026/day-03/linux-commands-cheatsheet.md | 68 ++++++++++++++++++++++++
 1 file changed, 68 insertions(+)
 create mode 100644 2026/day-03/linux-commands-cheatsheet.md

diff --git a/2026/day-03/linux-commands-cheatsheet.md b/2026/day-03/linux-commands-cheatsheet.md
new file mode 100644
index 000000000..0586375b4
--- /dev/null
+++ b/2026/day-03/linux-commands-cheatsheet.md
@@ -0,0 +1,68 @@
+# Linux Commands Cheatsheet
+## Linux Commands with Usage of commands
+| Commands | Description |
+| ---------| ----------- |
+| `pwd` | It shows the present working directory |
+| `ls` | It lists the files and directories in the present working directory |
+| `uname` | It shows the kernel name (for example, `Linux`) |
+| `uname -r` | It shows the kernel release (version) |
+| `cd` | It changes the current directory |
+| `clear` | It clears the screen |
+| `whoami` | It shows the name of the currently logged-in user |
+| `history` | It shows the list of previously run commands |
+| `date` | It shows the current date and time |
+| `mkdir` | It creates a directory (folder), like `mkdir Documents`. |
+| `touch` | It creates an empty file, like `touch hello.txt`. |
+| `cp` | It copies a file or directory `cp `. |
+| `mv` | It is used to (1) move a file/directory (folder) and (2) rename a file/directory (folder). |
+| `rm` | It removes a file or, with `-r`, a directory (folder), like `rm -r Documents`. |
+| `ps` | It shows the processes of the current shell |
+| `htop` | This will open an interactive interface showing all running processes. |
+| `exit` | It exits the current shell (logs out). |
+| `ping` | It checks network connectivity between your machine and a host/server |
+| `ip addr` | It shows information about all network interfaces and their IP addresses |
+| `dig` | It queries DNS and shows the records for a domain. |
+| `host` | It prints the IP address of a specific domain. |
+| `ping` | It tests connectivity between two systems on a network |
+
+----
+## Linux User Management and Group Management Commands.
+| Commands | Description | Example |
+| ---------| ----------- | ----------- |
+| `useradd` | It adds a new user account to your system. | `useradd sumit` |
+| `cat /etc/passwd \| grep sumit` | It shows the account information for the user `sumit` | `cat /etc/passwd | grep sumit` |
+| `userdel` | It deletes an existing user account from your system | `userdel sumit` |
+| `users` | It shows the names of the currently logged-in users | `users` |
+| `who` | It shows information about the currently logged-in users | `who` |
+| `whoami` | It displays the name of the current logged-in user | `whoami` |
+| `passwd` | It changes a user's password | `passwd sumit` |
+| `groupadd` | It adds a new user group | `groupadd Hello` |
+| `groupdel` | It deletes an existing group | `groupdel Hello` |
+| `groupmod -n` | It changes a group's name | `groupmod -n Jai Hello` |
+| `groups` | It shows the groups that a user is a member of. | `groups sumit` |
+| `gpasswd -a` | It adds a user to a group (`gpasswd` manages group members and group passwords) | `gpasswd -a sumit Jai` |
+| `grpck` | It checks the group configuration files for errors | `grpck` |
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+

From cd6eb4d01953459aff6fcad84ca7fb6bc59d1786 Mon Sep 17 00:00:00 2001
From: sumit9165
Date: Mon, 26 Jan 2026 17:17:03 +0530
Subject: [PATCH 06/32] Revise Linux commands cheatsheet structure and content

Updated the Linux commands section and improved formatting.
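
The account lookup listed in the cheatsheet (`cat /etc/passwd | grep sumit`) could be sketched a little more precisely as below. This is only an illustration: the helper names `show_account` and `describe_account` are hypothetical, and `root` is used as the example account on the assumption that it exists in `/etc/passwd`, since `sumit` may not exist on the reader's machine.

```shell
# Sketch only: look up one account's record in /etc/passwd.
# Anchoring the pattern on "^name:" avoids matching the name inside other fields.
show_account() {
  grep "^$1:" /etc/passwd
}

# Split the colon-separated record (name:password:UID:GID:comment:home:shell)
# into labelled fields for easier reading.
describe_account() {
  show_account "$1" | while IFS=: read -r user pw uid gid gecos home shell; do
    printf 'user=%s uid=%s gid=%s home=%s shell=%s\n' \
      "$user" "$uid" "$gid" "$home" "$shell"
  done
}

describe_account root
```

Passing the file straight to `grep` also avoids the extra `cat` process, though both forms print the same record.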
---
 2026/day-03/linux-commands-cheatsheet.md | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/2026/day-03/linux-commands-cheatsheet.md b/2026/day-03/linux-commands-cheatsheet.md
index 0586375b4..ed3a725c9 100644
--- a/2026/day-03/linux-commands-cheatsheet.md
+++ b/2026/day-03/linux-commands-cheatsheet.md
@@ -1,5 +1,7 @@
 # Linux Commands Cheatsheet
-## Linux Commands with Usage of commands
+----
+## Linux Commands list & Networking Commands
+----
 | Commands | Description |
 | ---------| ----------- |
 | `pwd` | It shows the present working directory |
@@ -30,7 +32,7 @@
 | Commands | Description | Example |
 | ---------| ----------- | ----------- |
 | `useradd` | It adds a new user account to your system. | `useradd sumit` |
-| `cat /etc/passwd \| grep sumit` | It shows the account information for the user `sumit` | `cat /etc/passwd | grep sumit` |
+| `cat /etc/passwd \| grep sumit` | It shows the account information for the user `sumit` | `cat /etc/passwd \| grep sumit` |
 | `userdel` | It deletes an existing user account from your system | `userdel sumit` |
 | `users` | It shows the names of the currently logged-in users | `users` |
 | `who` | It shows information about the currently logged-in users | `who` |
@@ -43,7 +45,7 @@
 | `gpasswd -a` | It adds a user to a group (`gpasswd` manages group members and group passwords) | `gpasswd -a sumit Jai` |
 | `grpck` | It checks the group configuration files for errors | `grpck` |
-
+----

From 0918d0474e699adfc42c613b063c2e34e161508e Mon Sep 17 00:00:00 2001
From: sumit9165
Date: Mon, 26 Jan 2026 17:28:00 +0530
Subject: [PATCH 07/32] Update learning plan with formatting and goals

---
 2026/day-01/learning-plan.md | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/2026/day-01/learning-plan.md b/2026/day-01/learning-plan.md
index 7a31e2fd1..e4998d235 100644
--- a/2026/day-01/learning-plan.md
+++ b/2026/day-01/learning-plan.md
@@ -1,14 +1,18 @@
-Current Level - Fresher in DevOps 
(Ex Electrical Maintenance Engineer of Tata Steel with 5 years of experience){Unemployed}.
-
+# Current Level - Fresher in DevOps (Ex Electrical Maintenance Engineer of Tata Steel with 5 years of experience){Unemployed}.
+---
+# Goals of 90 Days of DevOps
 Goals - 1. Learning the basics of DevOps.
 Goals - 2. Making Notes with Consistency.
 Goals - 3. Improving Myself Every Day with Hands-On Practice.
+---
+# Core DevOps Skills
 Core DevOps Skills - 1. Linux and Networking
 Core DevOps Skills - 2. Git & Docker
 Core DevOps Skills - 3. Understand how to debug problems and find the reason why they occur.
-Weekly Time Budget - 
+---
+##Weekly Time Budget - 
 1. Daily - 3 Hours spending on learning and hands-on.
 2. Weekdays - 2 Hours spending on learning and hands-on.
-
+----

From 5d7525596f69c3cb35fc16e86d0df569a9bf21d3 Mon Sep 17 00:00:00 2001
From: sumit9165
Date: Mon, 26 Jan 2026 17:29:20 +0530
Subject: [PATCH 08/32] Refactor learning plan formatting for clarity

Removed redundant 'Goals -' and 'Core DevOps Skills -' prefixes for clarity.
---
 2026/day-01/learning-plan.md | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/2026/day-01/learning-plan.md b/2026/day-01/learning-plan.md
index e4998d235..93995500b 100644
--- a/2026/day-01/learning-plan.md
+++ b/2026/day-01/learning-plan.md
@@ -1,15 +1,15 @@
 # Current Level - Fresher in DevOps (Ex Electrical Maintenance Engineer of Tata Steel with 5 years of experience){Unemployed}.
 ---
 # Goals of 90 Days of DevOps
-Goals - 1. Learning the basics of DevOps.
-Goals - 2. Making Notes with Consistency.
-Goals - 3. Improving Myself Every Day with Hands-On Practice.
+1. Learning the basics of DevOps.
+2. Making Notes with Consistency.
+3. Improving Myself Every Day with Hands-On Practice.
 ---
 # Core DevOps Skills
-Core DevOps Skills - 1. Linux and Networking
-Core DevOps Skills - 2. Git & Docker
-Core DevOps Skills - 3. 
Understand how to debug the problems and finding a reason behind the problem why it's occur?. +1. Linux and Networking +2. Git & Docker +3. Understand how to debug the problems and finding a reason behind the problem why it's occur?. --- ##Weekly Time Budget - From a56f5489450a8c8e8a99be31d297290e7a68e2be Mon Sep 17 00:00:00 2001 From: sumit9165 Date: Mon, 26 Jan 2026 17:29:48 +0530 Subject: [PATCH 09/32] Update learning plan formatting and headings --- 2026/day-01/learning-plan.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/2026/day-01/learning-plan.md b/2026/day-01/learning-plan.md index 93995500b..f8bffcaf0 100644 --- a/2026/day-01/learning-plan.md +++ b/2026/day-01/learning-plan.md @@ -1,18 +1,18 @@ # Current Level - Fresher in DevOps (Ex Electrical Maintenance Engineer of Tata Steel with 5 year Experience){Unemployeed}. --- -# Goals of 90 Days of DevOps +## Goals of 90 Days of DevOps 1. learning a Basics of DevOps. 2. Making Notes with Consistency. 3. Improveing Myself Everyday with Hands-On Practice. --- -# Core DevOps Skills +## Core DevOps Skills 1. Linux and Networking 2. Git & Docker 3. Understand how to debug the problems and finding a reason behind the problem why it's occur?. --- -##Weekly Time Budget - +## Weekly Time Budget - 1. Daily - 3 Hours spending on learning and hands-on. 2. Weekdays - 2 Hours spending on learning and hands-on. ---- From b823da3483577ce03fa7152b1760ae9cef44b601 Mon Sep 17 00:00:00 2001 From: sumit9165 Date: Mon, 26 Jan 2026 17:31:06 +0530 Subject: [PATCH 10/32] Revise current level description in learning plan Updated current level description for clarity. 
--- 2026/day-01/learning-plan.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/2026/day-01/learning-plan.md b/2026/day-01/learning-plan.md index f8bffcaf0..4eb2bba77 100644 --- a/2026/day-01/learning-plan.md +++ b/2026/day-01/learning-plan.md @@ -1,4 +1,6 @@ -# Current Level - Fresher in DevOps (Ex Electrical Maintenance Engineer of Tata Steel with 5 year Experience){Unemployeed}. +# Current Level - Fresher in DevOps +(Ex- Electrical Maintenance Engineer of Tata Steel with 5 year Experience). + --- ## Goals of 90 Days of DevOps 1. learning a Basics of DevOps. From 11c3c275d97c1f498fda46a80e72bef64d190dbe Mon Sep 17 00:00:00 2001 From: sumit9165 Date: Mon, 26 Jan 2026 17:32:15 +0530 Subject: [PATCH 11/32] Update DevOps skills to include Kubernetes --- 2026/day-01/learning-plan.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/2026/day-01/learning-plan.md b/2026/day-01/learning-plan.md index 4eb2bba77..c5d8336ab 100644 --- a/2026/day-01/learning-plan.md +++ b/2026/day-01/learning-plan.md @@ -10,7 +10,7 @@ --- ## Core DevOps Skills 1. Linux and Networking -2. Git & Docker +2. Git & Docker, Kubernetes 3. Understand how to debug the problems and finding a reason behind the problem why it's occur?. --- From a12427fd2ee217f0fb56d13b6e948e624a54ea88 Mon Sep 17 00:00:00 2001 From: sumit9165 Date: Mon, 26 Jan 2026 17:36:04 +0530 Subject: [PATCH 12/32] Format command list with code formatting --- 2026/day-02/linux-architecture-notes.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/2026/day-02/linux-architecture-notes.md b/2026/day-02/linux-architecture-notes.md index d21d675fd..3e17a7e91 100644 --- a/2026/day-02/linux-architecture-notes.md +++ b/2026/day-02/linux-architecture-notes.md @@ -99,11 +99,11 @@ The states that a Process enters in working from start till end are known as Pro --- ## Commands in daily use -1. pwd -2. ls -3. cd -4. mkdir -5. touch +1. `pwd` +2. `ls` +3. `cd` +4. `mkdir` +5. 
`touch` --- From f6f1459ca22d72052a50ef6266afd679adb2a6a7 Mon Sep 17 00:00:00 2001 From: sumit9165 Date: Mon, 26 Jan 2026 23:15:39 +0530 Subject: [PATCH 13/32] Fix cp command syntax in cheatsheet --- 2026/day-03/linux-commands-cheatsheet.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/2026/day-03/linux-commands-cheatsheet.md b/2026/day-03/linux-commands-cheatsheet.md index ed3a725c9..053738ddd 100644 --- a/2026/day-03/linux-commands-cheatsheet.md +++ b/2026/day-03/linux-commands-cheatsheet.md @@ -15,7 +15,7 @@ | `date` | It show time and date | | `mkdir` | It use for creating a directory(folder) like `mkdir Documents`. | | `touch` | It use for create a file like `touch hello.txt`. | -| `cp` | It use for copy and paste file or directory `cp `. | +| `cp` | It use for copy and paste file or directory `cp /`. | | `mv` | It use for 1.( move file/directory{folder}) and 2. (rename file/directory{folder}). | | `rm` | It use for remove file/directory(folder). like `rm /Documents`. | | `ps` | It show the process for current shell | From 6f84ed20f0dfa31a548e0dc630b2f31fb61b2457 Mon Sep 17 00:00:00 2001 From: sumit9165 Date: Wed, 28 Jan 2026 01:10:12 +0530 Subject: [PATCH 14/32] Add Linux process management commands documentation Documented various Linux process management commands including ps, top, and kill with examples and descriptions. --- 2026/day-04/linux-practice.md | 45 +++++++++++++++++++++++++++++++++++ 1 file changed, 45 insertions(+) create mode 100644 2026/day-04/linux-practice.md diff --git a/2026/day-04/linux-practice.md b/2026/day-04/linux-practice.md new file mode 100644 index 000000000..7d48c5c17 --- /dev/null +++ b/2026/day-04/linux-practice.md @@ -0,0 +1,45 @@ +# Practiced on Process commands, Service Commands, Log Commands. +---- + +# # Process Management Commands. 
+| Commands | Description | Example | +| -------- | ----------- | ------- | +| `ps` | It show running process in the current shell | `ps` | + +this command is without option. It will show you this:- +PID :- is the unique processID of process. +TTY:- is the type of terminal user is logged in to. pts means pseudo terminal. +TIME gives you how long the process has been running. +CMD is the command that you run to launch the process. +| Commands | Description | Example | +| -------- | ----------- | ------- | +| `ps -U` | It show information about all processes run by user. | `ps -U username` | + +| Commands | Description | Example | +| -------- | ----------- | ------- | +| `top` | It track the running process. | `top` | + +top command show you the running process in real-time with memory and cpu usage. +* PID: Unique Process ID given to each process. * +* User: Username of the process owner. * +* PR: Priority given to a process while scheduling. * +* NI: ‘nice’ value of a process. * +* VIRT: Amount of virtual memory used by a process. * +* RES: Amount of physical memory used by a process. * +* SHR: Amount of memory shared with other processes. * +* S: state of the process +‘D’ = uninterruptible sleep +‘R’ = running +‘S’ = sleeping +‘T’ = traced or stopped +‘Z’ = zombie * +* %CPU: Percentage of CPU used by the process. * +* %MEM; Percentage of RAM used by the process. * +* TIME+: Total CPU time consumed by the process. * +* Command: Command used to activate the process. * + +| Commands | Description | Example | +| `kill` | It use for stop process in your OS. 
| `kill` | +| -------- | ----------- | ------- | +| -------- | ----------- | ------- | +| -------- | ----------- | ------- | From 38b5674d5ae36a987b952ae8a9bfce29fd015a71 Mon Sep 17 00:00:00 2001 From: sumit9165 Date: Wed, 28 Jan 2026 10:10:28 +0530 Subject: [PATCH 15/32] Revise Linux practice commands and explanations Updated Linux practice document with detailed command descriptions and examples for process management, memory usage, and disk space. --- 2026/day-04/linux-practice.md | 84 +++++++++++++++++++++++++++++------ 1 file changed, 70 insertions(+), 14 deletions(-) diff --git a/2026/day-04/linux-practice.md b/2026/day-04/linux-practice.md index 7d48c5c17..4393d7751 100644 --- a/2026/day-04/linux-practice.md +++ b/2026/day-04/linux-practice.md @@ -1,25 +1,23 @@ # Practiced on Process commands, Service Commands, Log Commands. ---- -# # Process Management Commands. -| Commands | Description | Example | -| -------- | ----------- | ------- | -| `ps` | It show running process in the current shell | `ps` | +## Process Management Commands. +**1. ps** command shows the processes running in your current shell:- +* PID :- is the unique processID of process. * +* TTY:- is the type of terminal where user is logged in. pts means pseudo terminal. * +* TIME is the cumulative CPU time the process has used, not how long it has been running. * +* CMD is the command that you run to launch the process. * -this command is without option. It will show you this:- -PID :- is the unique processID of process. -TTY:- is the type of terminal user is logged in to. pts means pseudo terminal. -TIME gives you how long the process has been running. -CMD is the command that you run to launch the process. | Commands | Description | Example | | -------- | ----------- | ------- | +| `ps` | It show running process in the current shell | `ps` | | `ps -U` | It show information about all processes run by user. | `ps -U username` | +| `pstree` | It show running process information in a tree . 
| `pstree` | -| Commands | Description | Example | -| -------- | ----------- | ------- | -| `top` | It track the running process. | `top` | +------ -top command show you the running process in real-time with memory and cpu usage. +**2. top** command show you the running process in real-time with memory and cpu usage. +**top** command use for monitoring of processes. * PID: Unique Process ID given to each process. * * User: Username of the process owner. * * PR: Priority given to a process while scheduling. * @@ -39,7 +37,65 @@ top command show you the running process in real-time with memory and cpu usage. * Command: Command used to activate the process. * | Commands | Description | Example | -| `kill` | It use for stop process in your OS. | `kill` | | -------- | ----------- | ------- | +| `top` | It track the running process. | `top` | + +------ + +**3. kill** command **kill(stop)** your running process. +* It is used with a specific process ID; use `pkill` to stop a process by name. * + +| Commands | Description | Example | +| -------- | ----------- | ------- | +| `kill` | It use for stop process in your OS. | `kill -9 1234` | + +------- + +**4. nice** command use for start new process with priority. +* Priority values range from -20 (highest) to 19 (lowest). * +* it will use with specific process ID or name of the process. * +* **renice** command use for change priority of running process. * +* it will use with specific process ID or name of the process. * + +| Commands | Description | Example | +| -------- | ----------- | ------- | +| `nice` | It will use for start new process with priority | `nice -n 10 command` | +| `renice` | It will use for change priority of process | `renice -n 5 -p 1234` | + +------- + +**5. free** command show free and used memory(RAM) on linux. +* **free -m** show you output in MB. * +* **free -g** show you output in GB. * +* **free -h** show you output in readable format. 
* + +| Commands | Description | Example | +| -------- | ----------- | ------- | +| `free` | It show you free and used memory on your system. | `free -h` | + +----- + +6. **df** command show free hard disk space on your linux system. +* **df -h** show you output in readable format. * + +| Commands | Description | Example | | -------- | ----------- | ------- | +| `df` | It show you free hard disk on your linux system.| `df -h` | + +----- + +**7. bg** command send process to background. +* **df -h** show you output in readable format. * + +| Commands | Description | Example | | -------- | ----------- | ------- | +| `bg` | It show you free hard disk on your linux system.| `bg` | + +----- +**8. fg** command use to run a stopped process in foreground. + +| Commands | Description | Example | +| -------- | ----------- | ------- | +| `fg` | It will run process in foreground which is stopped.| `fg %id_of_job` | + +----------- From b034db036e944ddf4bf7bea4b243b21a01e671c9 Mon Sep 17 00:00:00 2001 From: sumit9165 Date: Wed, 28 Jan 2026 11:46:32 +0530 Subject: [PATCH 16/32] Update linux-practice.md with management commands Added system management and logging commands to the Linux practice guide. --- 2026/day-04/linux-practice.md | 40 +++++++++++++++++++++++++++++++++++ 1 file changed, 40 insertions(+) diff --git a/2026/day-04/linux-practice.md b/2026/day-04/linux-practice.md index 4393d7751..6520dabc6 100644 --- a/2026/day-04/linux-practice.md +++ b/2026/day-04/linux-practice.md @@ -99,3 +99,43 @@ | `fg` | It will run process in foreground which is stopped.| `fg %id_of_job` | ----------- +# system Management Commands + +**1. `systemctl`** It controls system startup, and manages background services. +* `systemctl start apache` start the service of apache with this command. * +* `systemctl stop apache` stop the service of apache with this command. * +* `systemctl status ssh` show status of ssh service with this command. 
* +* `sudo systemctl enable apache2` It Enables the Apache web server to start automatically at system boot. * +* `sudo systemctl disable apache2` It Disables the Apache web server, preventing it from starting automatically at boot. * +* `sudo systemctl status apache2` It Displays the "current status" of Apache (whether it’s _active_, _inactive_, _running_, or _failed_). * +* `sudo systemctl restart apache2` It Restarts the Apache web server, applying any configuration or update changes. * +* `sudo systemctl reload apache2` It Reloads Apache configuration without completely stopping the service, useful after minor config edits. * +* `sudo systemctl mask apache2` It Prevents the Apache service from being started manually or automatically, even if required by other services. * +* `sudo systemctl unmask apache2` It Allows the Apache service to be started or enabled again. * +* `sudo systemctl set-default graphical.target` It Sets the system to boot into the graphical (GUI) mode by default instead of command-line mode. * +* `systemctl list-unit-files` It Lists all available unit files on the system, showing which are enabled, disabled, or static. * + +---------- + +# Logging and Monitoring Commands + +**1. `journalctl`** This command is used to view logs collected by the systemd journal. +* `journalctl -xe` This shows detailed error logs and recent system messages. * +* `last` The last command displays the login and logout history of users. * +* `history` The history command shows previously executed commands by the user. * +* `sar -u` The sar command collects and reports system performance statistics. * +* `script session.log` The script command records all terminal activity in a file. * +* `scriptreplay timing.log session.log` The scriptreplay command replays a terminal session recorded using the script command with timing data (for example `script --timing=timing.log session.log`). 
* +------ + + + + + + + + + + + + From 4254703c3b13522fee6ef2a680146d6cf937c232 Mon Sep 17 00:00:00 2001 From: sumit9165 Date: Thu, 29 Jan 2026 16:41:46 +0530 Subject: [PATCH 17/32] added a linux-troubleshooting-runbook.md --- 2026/day-05/linux-troubleshooting-runbook.md | 0 1 file changed, 0 insertions(+), 0 deletions(-) create mode 100644 2026/day-05/linux-troubleshooting-runbook.md diff --git a/2026/day-05/linux-troubleshooting-runbook.md b/2026/day-05/linux-troubleshooting-runbook.md new file mode 100644 index 000000000..e69de29bb From 75e2cc0fcf54dc8a57e230a4214f93ec13222e90 Mon Sep 17 00:00:00 2001 From: sumit9165 Date: Thu, 29 Jan 2026 16:43:18 +0530 Subject: [PATCH 18/32] Add runbook for 8 commands --- 2026/day-05/linux-troubleshooting-runbook.md | 1 + 1 file changed, 1 insertion(+) diff --git a/2026/day-05/linux-troubleshooting-runbook.md b/2026/day-05/linux-troubleshooting-runbook.md index e69de29bb..f7b6dc64e 100644 --- a/2026/day-05/linux-troubleshooting-runbook.md +++ b/2026/day-05/linux-troubleshooting-runbook.md @@ -0,0 +1 @@ +# Runbook for 8 commands From d69ec2efc21091864b93b4ef13270380818c1c67 Mon Sep 17 00:00:00 2001 From: sumit9165 Date: Fri, 30 Jan 2026 00:21:20 +0530 Subject: [PATCH 19/32] Expand Linux troubleshooting runbook with detailed steps Added detailed troubleshooting steps for Linux server performance issues, including service/process checks, resource snapshots, and immediate actions. --- 2026/day-05/linux-troubleshooting-runbook.md | 55 +++++++++++++++++++- 1 file changed, 54 insertions(+), 1 deletion(-) diff --git a/2026/day-05/linux-troubleshooting-runbook.md b/2026/day-05/linux-troubleshooting-runbook.md index f7b6dc64e..4bbdecaec 100644 --- a/2026/day-05/linux-troubleshooting-runbook.md +++ b/2026/day-05/linux-troubleshooting-runbook.md @@ -1 +1,54 @@ -# Runbook for 8 commands +This report provides a snapshot of the current server state, focusing on identifying performance bottlenecks. +1. 
Target Service / Process + +• Service/Process Name: [e.g., apache2, mysql, java, docker-proxy] +• PID: [Use or to find] +• Status: [Running/Stopped/Zombie/High Resource Usage] + +2. Snapshot: CPU & Memory + +• Load Average (1m, 5m, 15m): [Run ] +• Top CPU Consumers: [Run or ] +• Memory Usage (Free/Used/Buff/Cache): [Run ] +• Swap Usage: [If non-zero and increasing, memory pressure is critical] + +3. Snapshot: Disk & IO + +• Disk Space Availability: [Run to check for 100% usage] +• Disk IOPS/Wait: [Run or to check high or ] +• Inode Usage: [Run ] [13, 14, 15, 16] + +4. Snapshot: Network + +• Network Utilization: [Run or ] +• Connections: [Run or to check for high connection counts] +• Latency: [Run or to verify external connectivity] + +5. Logs Reviewed + +• System Messages: or +• Service Logs: +• Auth Logs: (if security breach is suspected) [21] + +6. Quick findings for this Environment + +• OS/Kernel: [Run and ] +• Bottleneck Identified: [CPU / Memory / Disk IO / Network / Service failure] +• Immediate Action: [e.g., Restart service, purge logs, kill high-resource PID] + +If this worsens (next steps) + +1. Deeper Diagnostics: Use on the PID to analyze system calls. +2. Performance Logging: Enable or to capture long-term trends. +3. Kernel Parameters: Check settings (e.g., max open files, memory overcommit). +4. Hardware Check: Verify physical disk health using (if bare metal). + + + + + + + + + + From b811537964bc4c50a329914054a4aae3d5553635 Mon Sep 17 00:00:00 2001 From: sumit9165 Date: Fri, 30 Jan 2026 08:12:21 +0530 Subject: [PATCH 20/32] Update bg command description in linux-practice.md --- 2026/day-04/linux-practice.md | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/2026/day-04/linux-practice.md b/2026/day-04/linux-practice.md index 6520dabc6..b2d2999bf 100644 --- a/2026/day-04/linux-practice.md +++ b/2026/day-04/linux-practice.md @@ -85,11 +85,10 @@ ----- **7. bg** command send process to background. 
-* **df -h** show you output in readable format. * | Commands | Description | Example | | -------- | ----------- | ------- | -| `bg` | It show you free hard disk on your linux system.| `bg` | +| `bg` | It use to start a recently suspended job on your linux system.| `bg` | ----- **8. fg** command use to run a stopped process in foreground. From ebfaa6daa6d49434450f786cebb78f6b7fc81074 Mon Sep 17 00:00:00 2001 From: sumit9165 Date: Fri, 30 Jan 2026 09:05:28 +0530 Subject: [PATCH 21/32] Expand Linux troubleshooting runbook for sshd Added detailed troubleshooting steps and observations for the OpenSSH Daemon (sshd). --- 2026/day-05/linux-troubleshooting-runbook.md | 155 +++++++++++++++---- 1 file changed, 124 insertions(+), 31 deletions(-) diff --git a/2026/day-05/linux-troubleshooting-runbook.md b/2026/day-05/linux-troubleshooting-runbook.md index 4bbdecaec..c2aef959b 100644 --- a/2026/day-05/linux-troubleshooting-runbook.md +++ b/2026/day-05/linux-troubleshooting-runbook.md @@ -1,48 +1,141 @@ -This report provides a snapshot of the current server state, focusing on identifying performance bottlenecks. -1. Target Service / Process +# Linux Troubleshooting Runbook – Day 05 -• Service/Process Name: [e.g., apache2, mysql, java, docker-proxy] -• PID: [Use or to find] -• Status: [Running/Stopped/Zombie/High Resource Usage] +## Target service / process -2. Snapshot: CPU & Memory +* **Service:** `sshd` (OpenSSH Daemon) * +* **Purpose:** Remote access to the system * +* **Why chosen:** Critical service, always running, clear logs * -• Load Average (1m, 5m, 15m): [Run ] -• Top CPU Consumers: [Run or ] -• Memory Usage (Free/Used/Buff/Cache): [Run ] -• Swap Usage: [If non-zero and increasing, memory pressure is critical] +--- -3. Snapshot: Disk & IO +## Environment basics -• Disk Space Availability: [Run to check for 100% usage] -• Disk IOPS/Wait: [Run or to check high or ] -• Inode Usage: [Run ] [13, 14, 15, 16] +```bash +uname -a +``` -4. 
Snapshot: Network +**Observed:** Linux kernel 5.x, x86_64. Confirms kernel version and architecture. -• Network Utilization: [Run or ] -• Connections: [Run or to check for high connection counts] -• Latency: [Run or to verify external connectivity] +```bash +cat /etc/os-release +``` -5. Logs Reviewed +**Observed:** Ubuntu 22.04 LTS. Confirms distro and release for package/log locations. -• System Messages: or -• Service Logs: -• Auth Logs: (if security breach is suspected) [21] +--- -6. Quick findings for this Environment +## Filesystem sanity check -• OS/Kernel: [Run and ] -• Bottleneck Identified: [CPU / Memory / Disk IO / Network / Service failure] -• Immediate Action: [e.g., Restart service, purge logs, kill high-resource PID] +```bash +mkdir /tmp/runbook-demo +``` -If this worsens (next steps) +**Observed:** Directory created successfully — filesystem is writable. -1. Deeper Diagnostics: Use on the PID to analyze system calls. -2. Performance Logging: Enable or to capture long-term trends. -3. Kernel Parameters: Check settings (e.g., max open files, memory overcommit). -4. Hardware Check: Verify physical disk health using (if bare metal). +```bash +cp /etc/hosts /tmp/runbook-demo/hosts-copy && ls -l /tmp/runbook-demo +``` +**Observed:** File copied correctly, permissions look normal. No disk or permission issues. + +--- + +## Snapshot: CPU & Memory + +```bash +ps -o pid,pcpu,pmem,comm -C sshd +``` + +**Observed:** sshd processes using <1% CPU and minimal memory. No abnormal usage. + +```bash +free -h +``` + +**Observed:** ~60% memory available, no swap pressure. Memory not constrained. + +--- + +## Snapshot: Disk & IO + +```bash +df -h +``` + +**Observed:** Root filesystem at ~45% usage. Plenty of free disk space. + +```bash +du -sh /var/log +``` + +**Observed:** /var/log ~250MB. Logs not consuming excessive disk. + +--- + +## Snapshot: Network + +```bash +ss -tulpn | grep sshd +``` + +**Observed:** sshd listening on port 22 (IPv4 and IPv6). 
Service is bound correctly. + +```bash +curl -I localhost +``` + +**Observed:** Connection succeeds (HTTP headers returned). Network stack responsive. + +--- + +## Logs reviewed + +```bash +journalctl -u ssh -n 50 +``` + +**Observed:** Normal startup messages, successful login attempts, no errors. + +```bash +tail -n 50 /var/log/auth.log +``` + +**Observed:** Recent successful SSH logins, no failed-auth storms or warnings. + +--- + +## Quick findings + +* sshd is healthy and responsive +* No CPU, memory, or disk pressure +* Network port listening as expected +* Logs show normal operational behavior +* No immediate remediation required + +--- + +## If this worsens (next steps) + +1. **Restart strategy** + + ```bash + systemctl restart ssh + systemctl status ssh + ``` + + Check for restart failures or dependency issues. + +2. **Increase log verbosity** + + * Temporarily increase `LogLevel VERBOSE` in `/etc/ssh/sshd_config` + * Reload config and monitor logs for authentication or connection errors + +3. **Deep inspection** + + * Attach `strace -p ` if sshd is hanging + * Capture network traffic with `tcpdump` to identify connection issues + +--- From 5d62814e18fc01ce70081e4ad7177ce4c86ac7a0 Mon Sep 17 00:00:00 2001 From: sumit9165 Date: Fri, 30 Jan 2026 09:18:32 +0530 Subject: [PATCH 22/32] Create Docker troubleshooting runbook Added a comprehensive troubleshooting runbook for Docker incidents, including steps for CPU and disk pressure investigations, immediate containment actions, and escalation triggers. 
--- .../linux-docker-troubleshooting-runbook.md | 281 ++++++++++++++++++ 1 file changed, 281 insertions(+) create mode 100644 2026/day-05/linux-docker-troubleshooting-runbook.md diff --git a/2026/day-05/linux-docker-troubleshooting-runbook.md b/2026/day-05/linux-docker-troubleshooting-runbook.md new file mode 100644 index 000000000..fab35c5ba --- /dev/null +++ b/2026/day-05/linux-docker-troubleshooting-runbook.md @@ -0,0 +1,281 @@ + +# PHASE 1 — 🔥 Simulated Failure Drill (CPU + Disk) + +### 🚨 Alert + +> **High CPU usage detected. SSH logins slow.** + +--- + +## Step 1: CPU spike investigation + +```bash +top +``` + +**Simulated output (snippet):** + +``` +PID USER %CPU %MEM COMMAND +2143 root 185.3 0.2 yes +892 root 0.1 0.1 sshd +``` + +**Interpretation:** +A runaway `yes` process is consuming >180% CPU (multi-core). SSH is slow because scheduler is busy. + +--- + +## Step 2: Confirm process details + +```bash +ps -o pid,ppid,pcpu,pmem,etime,comm -p 2143 +``` + +**Observed:** + +* Long-running +* No parent process of interest +* Not a legit system service + +--- + +## Step 3: Disk pressure appears + +```bash +df -h +``` + +**Simulated output:** + +``` +/dev/sda1 40G 39G 1G 98% / +``` + +```bash +du -sh /var/log/* +``` + +**Observed:** +`/var/log/auth.log` = **12GB** + +👉 CPU + disk pressure = compound incident + +--- + +## Immediate containment + +```bash +kill -9 2143 +``` + +```bash +truncate -s 0 /var/log/auth.log +``` + +✅ System stabilizes +⚠️ Root cause still unknown + +--- + +# PHASE 2 — 🐳 Docker / App-Focused Runbook + +Now assume the **real culprit** was a container. + +## Step 1: Container health + +```bash +docker ps +``` + +**Simulated output:** + +``` +CONTAINER ID NAME STATUS +f3a9c1 auth-api Up 3 days +``` + +```bash +docker stats --no-stream +``` + +``` +auth-api CPU 160% MEM 512MB / 1GB +``` + +**Interpretation:** +Container is CPU-bound and likely spamming logs → disk pressure. 
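A minimal sketch of how the log-spam suspicion above could be checked before restarting anything: rank the files under a directory by size, largest first. The function name `largest_files` is hypothetical and is not part of the original runbook.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not from the original runbook): print the N largest
# regular files under a directory, biggest first, as "<size-in-KiB><TAB><path>".
largest_files() {
  local dir="$1" count="${2:-5}"
  # du -k reports disk usage in KiB; sort -rn puts the largest entries first.
  find "$dir" -type f -exec du -k {} + 2>/dev/null | sort -rn | head -n "$count"
}
```

On the host this would be called as `largest_files /var/log 5`. For a container on Docker's default `json-file` logging driver, `docker inspect --format '{{.LogPath}}' auth-api` should print the path of its log file, which can then be sized the same way.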
+ +--- + +## Step 2: Inspect container logs + +```bash +docker logs --tail 20 auth-api +``` + +**Observed:** +Repeated authentication failures + stack trace loop. + +--- + +## Step 3: App-level fix + +```bash +docker restart auth-api +``` + +CPU drops back to normal. + +--- + +# PHASE 3 — 🧠 Runbook → Incident Playbook Upgrade + +Here’s how your runbook now evolves 👇 + +### Severity Classification + +| Severity | Description | +| -------- | --------------------------------------- | +| SEV-1 | SSH inaccessible, CPU >90%, disk >95% | +| SEV-2 | Performance degraded, service reachable | +| SEV-3 | Errors in logs, no user impact | + +--- + +### Decision Tree (muscle memory) + +``` +Alert fires + ├─ CPU high? + │ ├─ yes → top → identify PID → kill or restart service + │ └─ no + ├─ Disk >90%? + │ ├─ yes → du → logs → rotate/truncate + │ └─ no + └─ Service down? + ├─ yes → systemctl/docker restart + └─ escalate +``` + +--- + +### Escalation Triggers + +* Restart fails twice +* Disk refills within 10 minutes +* CPU spike returns after fix + +--- + +# PHASE 4 — 🧪 Timed On-Call Simulation (You’re On Call) + +⏱️ **Timer starts now** + +🚨 **Pager Alert** + +> SSH connections intermittently failing +> CPU spikes every 5 minutes +> Disk usage climbing + +--- + +## 🚨 ON-CALL INCIDENT — LIVE SIMULATION CONTINUES + +### Commands executed + +```bash +top +df -h +ps aux --sort=-%cpu | head +``` + +--- + +## 🔍 Command 1: `top` + +``` +%Cpu(s): 92.1 us, 6.3 sy, 0.0 id +PID USER %CPU %MEM COMMAND +2143 root 187.4 0.2 auth-api +892 root 0.3 0.1 sshd +``` + +**Interpretation** + +* CPU is nearly saturated +* `auth-api` process (likely container) is the culprit +* sshd itself is not misbehaving, just impacted + +--- + +## 💽 Command 2: `df -h` + +``` +Filesystem Size Used Avail Use% +/dev/sda1 40G 39G 1.0G 98% / +``` + +**Interpretation** + +* Disk is critically full +* SSH failures likely due to log writes + CPU contention +* This is now **SEV-1** + +--- + +## 🔥 Command 3: `ps aux 
--sort=-%cpu | head` + +``` +USER PID %CPU %MEM COMMAND +root 2143 187.4 0.2 auth-api +root 2190 22.1 0.1 dockerd +``` + +**Interpretation** + +* Confirms runaway app +* Docker daemon under stress +* Need **containment NOW** + +--- + +## 🛑 Immediate containment (what you should do next) + +You now have **two correct paths**: + +### Option A — Kill the offender fast + +```bash +kill -9 2143 +``` + +### Option B — Control blast radius (preferred) + +```bash +docker restart auth-api +``` + +--- + + + + + + + + + + + + + + + + + + + + + + From d0ccaf4ce6aa5956b58eed3f7deadfe23bd74bd4 Mon Sep 17 00:00:00 2001 From: sumit9165 Date: Fri, 30 Jan 2026 09:27:06 +0530 Subject: [PATCH 23/32] Add nginx troubleshooting runbook Documented a comprehensive troubleshooting runbook for nginx failures, including steps for simulating, diagnosing, and resolving issues. --- .../linux-nginx-troubleshoot-runbook.md | 245 ++++++++++++++++++ 1 file changed, 245 insertions(+) create mode 100644 2026/day-05/linux-nginx-troubleshoot-runbook.md diff --git a/2026/day-05/linux-nginx-troubleshoot-runbook.md b/2026/day-05/linux-nginx-troubleshoot-runbook.md new file mode 100644 index 000000000..b7ac69651 --- /dev/null +++ b/2026/day-05/linux-nginx-troubleshoot-runbook.md @@ -0,0 +1,245 @@ +* You chose **maximum SRE mode**. +* Let’s do **all three for `nginx`**, clean and reusable: + +* 1️⃣ **Simulate breaking nginx on purpose** +* 2️⃣ **Troubleshoot it step-by-step** +* 3️⃣ **End with a reusable on-call runbook template** + +You can literally reuse this in real incidents or interviews. 
+ +--- + +# 🔥 NGINX FAILURE DRILL (SIMULATED) + +## Target service + +* **Service:** nginx +* **Role:** Web server / reverse proxy +* **Impact:** User-facing traffic + +--- + +## 🚨 Pager Alert + +* > Website returning 502 errors +* > Requests timing out intermittently +* > CPU usage spiking + +--- + +## Step 1 — Health snapshot (muscle memory) + +```bash +systemctl status nginx +``` + +**Output** + +``` +nginx.service - A high performance web server +Active: active (running) +``` + +🧠 *Service is “up” but may be unhealthy.* + +--- + +```bash +top +``` + +``` +PID USER %CPU %MEM COMMAND +3112 www 165.2 1.4 nginx +``` + +🧠 *Worker process burning CPU.* + +--- + +```bash +df -h +``` + +``` +/dev/sda1 40G 38G 2G 95% / +``` + +🧠 *Disk pressure approaching critical.* + +--- + +## Step 2 — Network validation + +```bash +ss -tulpn | grep nginx +``` + +``` +LISTEN 0 511 0.0.0.0:80 users:(("nginx",pid=3112)) +``` + +🧠 *Port is listening.* + +--- + +```bash +curl -I http://localhost +``` + +``` +HTTP/1.1 502 Bad Gateway +``` + +🧠 *Frontend up, backend failing.* + +--- + +## Step 3 — Logs (where truth lives) + +```bash +tail -n 50 /var/log/nginx/error.log +``` + +``` +connect() failed (111: Connection refused) while connecting to upstream +``` + +🧠 *Upstream app is dead or unreachable.* + +--- + +```bash +du -sh /var/log/nginx +``` + +``` +14G /var/log/nginx +``` + +🧠 *Log spam explains disk pressure.* + +--- + +## Step 4 — Root cause found 😈 + +Someone **accidentally enabled debug logging**: + +```nginx +error_log /var/log/nginx/error.log debug; +``` + +This causes: + +* Massive log volume +* Disk pressure +* CPU spike +* 502s under load + +--- + +## Step 5 — Containment & fix + +```bash +sed -i 's/debug;/warn;/' /etc/nginx/nginx.conf +``` + +```bash +nginx -t +systemctl reload nginx +``` + +```bash +truncate -s 0 /var/log/nginx/error.log +``` + +✅ CPU drops +✅ Disk stabilizes +✅ 502s disappear + +--- + +# 🧠 QUICK FINDINGS + +* nginx process healthy but overloaded +* Upstream 
errors amplified by debug logging +* Disk nearly full due to runaway logs +* Reload (not restart) sufficient + +--- + +# 🧯 IF THIS WORSENS (NEXT STEPS) + +1. Temporarily disable access logs + + ```nginx + access_log off; + ``` + +2. Rate-limit requests + + ```nginx + limit_req zone=one burst=10 nodelay; + ``` + +3. Capture traffic + + ```bash + tcpdump -i eth0 port 80 + ``` + +--- + +# 📘 REUSABLE ON-CALL RUNBOOK TEMPLATE + +``` +# NGINX On-Call Runbook + +## Service Overview +- Service name: +- Ports: +- Upstream dependencies: + +## Immediate Checks (Run in Order) +1. systemctl status nginx +2. top / ps aux --sort=-%cpu +3. df -h +4. ss -tulpn | grep nginx +5. curl -I localhost + +## Logs +- tail -n 50 /var/log/nginx/error.log +- du -sh /var/log/nginx + +## Common Failure Patterns +- 502 errors → upstream down +- High CPU → log level, request flood +- Disk full → access/error logs + +## Safe Actions +- nginx -t +- systemctl reload nginx +- truncate logs (never rm) + +## Escalation Triggers +- Reload fails +- Disk refills in <10 minutes +- Errors persist after upstream restart + +## Notes / Learnings +- (fill during incident) +``` + +--- + +## 🎯 What you just practiced + +* ✔ Real nginx failure +* ✔ CPU + disk + logs correlation +* ✔ Safe reload vs restart +* ✔ Interview-grade runbook +* ✔ True on-call thinking + +--- + + From 24f8c9bd5f55474ca1fcf38311c54de1b902e205 Mon Sep 17 00:00:00 2001 From: sumit9165 Date: Fri, 30 Jan 2026 09:30:19 +0530 Subject: [PATCH 24/32] Create NGINX troubleshooting runbook Added a comprehensive NGINX troubleshooting runbook covering incident types, simulated failures, environment checks, and recovery steps. 
--- 2026/day-05/nginx-troubleshoot-runbook.md | 228 ++++++++++++++++++++++ 1 file changed, 228 insertions(+) create mode 100644 2026/day-05/nginx-troubleshoot-runbook.md diff --git a/2026/day-05/nginx-troubleshoot-runbook.md b/2026/day-05/nginx-troubleshoot-runbook.md new file mode 100644 index 000000000..e0312285c --- /dev/null +++ b/2026/day-05/nginx-troubleshoot-runbook.md @@ -0,0 +1,228 @@ +# 🔥 NGINX TROUBLESHOOTING DRILL (FULL EXECUTION) + +## Target service / process + +* **Service:** nginx +* **Role:** Web server / reverse proxy +* **Incident type:** 502 errors, high CPU, disk filling + +--- + +## 🚨 Simulated Failure (Intentional Break) + +**What broke:** + +* Debug logging enabled +* Backend upstream stopped + +```nginx +error_log /var/log/nginx/error.log debug; +``` + +Upstream app is **down**. + +--- + +## Environment basics + +```bash +uname -a +``` + +**Observed:** Linux 5.x kernel, x86_64. + +```bash +cat /etc/os-release +``` + +**Observed:** Ubuntu 22.04 LTS. + +--- + +## Filesystem sanity + +```bash +mkdir /tmp/runbook-demo +``` + +**Observed:** Directory created successfully. + +```bash +cp /etc/hosts /tmp/runbook-demo/hosts-copy && ls -l /tmp/runbook-demo +``` + +**Observed:** File copied; filesystem writable. + +--- + +## Snapshot: CPU & Memory + +```bash +top +``` + +``` +PID USER %CPU %MEM COMMAND +3112 www 168.4 1.3 nginx +``` + +**Observed:** nginx worker saturating CPU. + +```bash +free -h +``` + +**Observed:** Memory OK, no swap pressure. + +--- + +## Snapshot: Disk & IO + +```bash +df -h +``` + +``` +/dev/sda1 40G 38G 2G 95% / +``` + +**Observed:** Disk nearing critical. + +```bash +du -sh /var/log/nginx +``` + +``` +14G /var/log/nginx +``` + +**Observed:** nginx logs consuming most disk. + +--- + +## Snapshot: Network + +```bash +ss -tulpn | grep nginx +``` + +``` +LISTEN 0 511 0.0.0.0:80 users:(("nginx",pid=3112)) +``` + +**Observed:** nginx listening correctly. 
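The listening check above is done by eye. As a minimal sketch, assuming `ss -tln` style output (state in column 1, local address in column 4), the same check could be scripted. The function name `port_is_listening` is hypothetical and not part of any tool.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the given port shows up as a LISTEN socket
# in `ss -tln` style output supplied on stdin.
port_is_listening() {
  local port="$1"
  # Column 4 is Local Address:Port; match only a trailing ":PORT".
  awk -v p=":${port}" '$1 == "LISTEN" && $4 ~ (p "$") { found = 1 } END { exit !found }'
}

# Live usage (assumes iproute2's ss is installed):
#   ss -tln | port_is_listening 80 && echo "port 80 is listening"
```

Anchoring the match at the end of the address keeps port 8080 from being mistaken for port 80.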
+ +```bash +curl -I http://localhost +``` + +``` +HTTP/1.1 502 Bad Gateway +``` + +**Observed:** Frontend reachable, backend failing. + +--- + +## Logs reviewed + +```bash +tail -n 50 /var/log/nginx/error.log +``` + +``` +connect() failed (111: Connection refused) while connecting to upstream +``` + +**Observed:** Upstream application down. + +```bash +journalctl -u nginx -n 50 +``` + +**Observed:** No crash loops; config still valid. + +--- + +## Root cause + +* Debug log level caused **log explosion** +* Disk pressure + CPU overhead +* Backend service unavailable → 502s + +--- + +## Containment & Fix + +```bash +sed -i 's/debug;/warn;/' /etc/nginx/nginx.conf +``` + +```bash +nginx -t +``` + +**Observed:** Config OK. + +```bash +systemctl reload nginx +``` + +```bash +truncate -s 0 /var/log/nginx/error.log +``` + +```bash +systemctl restart backend-app +``` + +* ✅ CPU normal +* ✅ Disk stabilized +* ✅ HTTP 200 restored + +--- + +## Quick findings + +* nginx was running but unhealthy +* Log level misconfiguration amplified impact +* Reload sufficient; full restart unnecessary +* Disk pressure directly affected service health + +--- + +## If this worsens (next steps) + +1. **Disable access logs temporarily** + + ```nginx + access_log off; + ``` + +2. **Add rate limiting** + + ```nginx + limit_req zone=req_limit burst=10 nodelay; + ``` + +3. 
**Deep debugging** + + ```bash + strace -p + tcpdump -i eth0 port 80 + ``` + +--- + +## ✅ Final Result + +You now have: + +* A **realistic nginx incident** +* A **repeatable troubleshooting flow** +* A **production-grade runbook** +* Exactly what interviewers expect when they ask: + + > “How do you troubleshoot a failing service?” + +--- From fd38989489a3560db8b7787f4dce3a41e9362414 Mon Sep 17 00:00:00 2001 From: sumit9165 Date: Fri, 30 Jan 2026 09:37:19 +0530 Subject: [PATCH 25/32] Create database troubleshoot runbook Added a comprehensive runbook for troubleshooting database failures, including steps for PostgreSQL and nginx rate limiting scenarios. --- 2026/day-05/database-troubleshoot-runbook.md | 257 +++++++++++++++++++ 1 file changed, 257 insertions(+) create mode 100644 2026/day-05/database-troubleshoot-runbook.md diff --git a/2026/day-05/database-troubleshoot-runbook.md b/2026/day-05/database-troubleshoot-runbook.md new file mode 100644 index 000000000..220054adc --- /dev/null +++ b/2026/day-05/database-troubleshoot-runbook.md @@ -0,0 +1,257 @@ +# 💥 DATABASE FAILURE DRILL (Postgres) + +*(Same logic applies to MySQL — I’ll note the equivalents)* + +## Target service + +* **Service:** PostgreSQL +* **Impact:** App returns 500s / timeouts + +--- + +## 🚨 Pager Alert + +> API error rate >30% +> Requests timing out +> DB connections exhausted + +--- + +## Step 1 — Immediate health snapshot + +```bash +systemctl status postgresql +``` + +**Observed:** Service active (running) +🧠 *Service “up” doesn’t mean healthy.* + +--- + +```bash +top +``` + +``` +PID USER %CPU %MEM COMMAND +1421 postgres 95.3 42.1 postgres +``` + +🧠 *CPU-bound database process.* + +--- + +```bash +free -h +``` + +**Observed:** Memory tight, cache growing, no swap yet. 
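"Memory tight" is a judgment made from `free -h`. A rough sketch of turning it into a number, assuming `/proc/meminfo` formatted input with `MemTotal` and `MemAvailable` lines; the function name `mem_available_pct` is made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print MemAvailable as a whole-number percentage of
# MemTotal, reading /proc/meminfo formatted text from stdin.
mem_available_pct() {
  awk '/^MemTotal:/     { total = $2 }
       /^MemAvailable:/ { avail = $2 }
       END { if (total > 0) printf "%d\n", avail * 100 / total; else exit 1 }'
}

# Live usage on Linux:
#   mem_available_pct < /proc/meminfo
```

What counts as "tight" is still a per-service decision; the sketch only produces the figure to compare against.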
+ +--- + +## Step 2 — Disk & IO (DB killer) + +```bash +df -h +``` + +``` +/dev/sda1 40G 39G 1G 98% / +``` + +🧠 *Databases + full disk = incoming outage.* + +--- + +```bash +iostat -xz 1 3 +``` + +**Observed:** + +* High `await` +* Disk utilization >90% + +--- + +## Step 3 — Database-level checks + +```bash +sudo -u postgres psql -c "SELECT count(*) FROM pg_stat_activity;" +``` + +``` +count +----- +300 +``` + +```bash +sudo -u postgres psql -c "SHOW max_connections;" +``` + +``` +100 +``` + +🧠 **ROOT CAUSE FOUND** + +* Connection exhaustion +* App leaking DB connections + +(MySQL equivalent: `SHOW PROCESSLIST;`) + +--- + +## Step 4 — Containment (NOW) + +```bash +sudo -u postgres psql -c " +SELECT pg_terminate_backend(pid) +FROM pg_stat_activity +WHERE state = 'idle'; +" +``` + +```bash +systemctl restart postgresql +``` + +* ✅ Connections drop +* ✅ CPU stabilizes +* ⚠️ Root cause still app-side + +--- + +## If this worsens (DB) + +* 1. Enable slow query logging +* 2. Lower `max_connections`, add pooling (PgBouncer) +* 3. Capture query stats (requires the `pg_stat_statements` extension): + + ```sql + SELECT * FROM pg_stat_statements; + ``` + +--- + +# 🔁 TRAFFIC SPIKE + RATE LIMITING LAB (nginx) + +## 🚨 Scenario + +* > Traffic spikes 10× +* > Backend healthy +* > nginx starts failing + +--- + +## Step 1 — Confirm spike + +```bash +ss -s +``` + +**Observed:** Surge in ESTABLISHED connections + +```bash +curl -s http://localhost/nginx_status +``` + +**Observed:** Active connections skyrocketing (assumes `stub_status` is exposed at `/nginx_status`) + +--- + +## Step 2 — Protect the backend (rate limit) + +```nginx +limit_req_zone $binary_remote_addr zone=api_limit:10m rate=10r/s; + +server { + location /api/ { + limit_req zone=api_limit burst=20 nodelay; limit_req_status 429; # default rejection status is 503 + proxy_pass http://backend; + } +} +``` + +```bash +nginx -t && systemctl reload nginx +``` + +--- + +## Step 3 — Verify protection + +```bash +curl -I http://localhost/api/ +``` + +``` +HTTP/1.1 429 Too Many Requests +``` + +🧠 *nginx sheds load before backend dies.* + +--- + +## If this worsens (traffic) + +* 1. Add caching +* 2. 
Enable connection limits +* 3. Scale horizontally + +--- + +# 🎤 MOCK SRE INTERVIEW (YOU’RE ON STAGE) + +I’ll be the interviewer. +You answer **out loud or in text**. No perfect answers needed — clarity > buzzwords. + +--- + +### Question 1 + +> You’re paged for elevated API latency. Where do you start and why? + +--- + +### Question 2 + +> PostgreSQL is “running” but the app is down. How do you prove it’s a DB issue? + +--- + +### Question 3 + +> Why is restarting the database dangerous? When is it acceptable? + +--- + +### Question 4 + +> How does rate limiting protect downstream services? + +--- + +### Question 5 (Senior-level) + +> What signals tell you this is an **app bug**, not an infrastructure failure? + +--- + +### Bonus + +> What would you automate after surviving this incident? + +--- + +## 🧠 What you’ve now practiced + +* ✅ DB outage under load +* ✅ Connection exhaustion +* ✅ Disk + IO correlation +* ✅ Traffic spike defense +* ✅ Real SRE interview reasoning +* ✅ Runbook-driven thinking (not guessing) + +--- From cc171e0161a4974faa09ed120f281b68dedf065e Mon Sep 17 00:00:00 2001 From: sumit9165 Date: Fri, 30 Jan 2026 09:42:31 +0530 Subject: [PATCH 26/32] Create troubleshooting interview questions for SRE Added a markdown file with mock SRE interview questions and model answers. --- .../troubleshooting-interview-questions.md | 165 ++++++++++++++++++ 1 file changed, 165 insertions(+) create mode 100644 2026/day-05/troubleshooting-interview-questions.md diff --git a/2026/day-05/troubleshooting-interview-questions.md b/2026/day-05/troubleshooting-interview-questions.md new file mode 100644 index 000000000..ae98f3547 --- /dev/null +++ b/2026/day-05/troubleshooting-interview-questions.md @@ -0,0 +1,165 @@ +**strong SRE / mid–senior DevOps engineer**. + +--- + +# 🎤 MOCK SRE INTERVIEW — MODEL ANSWERS + +--- + +## **Question 1** + +**“You’re paged for elevated API latency. 
Where do you start and why?”** + +**Answer:** + +> I start with a **system-level snapshot** to understand blast radius before diving deep. +> First I check CPU, memory, disk, and network on the host or cluster. Latency is often caused by resource saturation rather than the app itself. +> +> Concretely: +> +> * `top` / `free -h` → CPU & memory pressure +> * `df -h` → disk saturation (often overlooked) +> * `ss -s` → connection spikes +> +> Once I know which resource is constrained, I zoom into the most likely service instead of guessing. + +✅ *Why this is good:* calm, methodical, avoids tunnel vision. + +--- + +## **Question 2** + +**“PostgreSQL is running but the app is down. How do you prove it’s a DB issue?”** + +**Answer:** + +> “Running” only means the process exists — not that it’s healthy. +> +> I’d check: +> +> * Active connections vs `max_connections` +> * Query activity and wait states +> * Disk and IO latency +> +> For example: +> +> ```sql +> SELECT count(*) FROM pg_stat_activity; +> SHOW max_connections; +> ``` +> +> If connections are exhausted or queries are stuck waiting on IO or locks, that’s strong evidence the database is the bottleneck. +> I also correlate with app logs showing connection timeouts or slow queries. + +✅ *Key signal:* correlation between DB state and app symptoms. + +--- + +## **Question 3** + +**“Why is restarting the database dangerous? When is it acceptable?”** + +**Answer:** + +> Restarting a database is dangerous because it: +> +> * Drops all active connections +> * Interrupts in-flight transactions +> * Can cause cascading failures if apps aggressively reconnect +> +> I treat it as a **last-resort containment action**, not a fix. +> +> It’s acceptable when: +> +> * The database is completely wedged +> * Connection exhaustion prevents recovery +> * The business impact of downtime is already high +> +> Even then, I try lighter actions first: terminating idle sessions, throttling traffic, or fixing the upstream cause. 
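As an illustration of the "lighter actions first" point, here is a hedged sketch that builds a narrower statement than a blanket kill: only sessions idle for longer than a chosen number of minutes, never the session running the query. The function name `idle_kill_sql` and the 10-minute default are assumptions; `pg_stat_activity`, `state_change`, `pg_terminate_backend` and `pg_backend_pid` are standard PostgreSQL.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a SQL statement that terminates only sessions
# idle for longer than N minutes, sparing the current session.
idle_kill_sql() {
  local minutes="${1:-10}"
  # Refuse anything that is not a plain non-negative integer.
  case "$minutes" in
    *[!0-9]*) echo "minutes must be a non-negative integer" >&2; return 1 ;;
  esac
  cat <<SQL
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE state = 'idle'
  AND state_change < now() - interval '${minutes} minutes'
  AND pid <> pg_backend_pid();
SQL
}

# Review the statement first, then pipe it to psql if it looks right:
#   idle_kill_sql 15 | sudo -u postgres psql
```

Printing the SQL instead of running it keeps a human in the loop, which suits a containment action taken under pressure.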
+ +✅ *This shows maturity and production awareness.* + +--- + +## **Question 4** + +**“How does rate limiting protect downstream services?”** + +**Answer:** + +> Rate limiting applies **backpressure** at the edge. +> +> Instead of letting traffic overwhelm the backend, nginx rejects excess requests early with predictable behavior (like HTTP 429). +> +> This: +> +> * Preserves backend capacity for legitimate users +> * Prevents connection exhaustion +> * Avoids cascading failures into databases or caches +> +> It turns an uncontrolled failure into a controlled degradation. + +✅ *This is exactly what interviewers want to hear.* + +--- + +## **Question 5 (Senior-level)** + +**“What signals tell you this is an app bug, not an infrastructure failure?”** + +**Answer:** + +> I look for **asymmetry**: +> +> * Infrastructure metrics are stable (CPU, disk, network OK) +> * Errors spike only for a specific endpoint or service +> * Logs show repeated stack traces or retries +> * DB connections grow without corresponding traffic increase +> +> That pattern usually points to: +> +> * Connection leaks +> * Unbounded retries +> * Inefficient queries +> +> Infra failures tend to affect multiple services at once; app bugs are usually isolated. + +✅ *This separates juniors from seniors.* + +--- + +## **Bonus Question** + +**“What would you automate after surviving this incident?”** + +**Answer:** + +> I’d automate the *detection and prevention*, not just alerts: +> +> * Connection pool limits enforced by config +> * Automatic rate limiting under load +> * Disk usage alerts tied to log growth +> * Dashboards showing DB connections vs max +> +> I’d also update the runbook with exact commands used so the next on-call doesn’t rediscover them under pressure. 
+ +✅ *Runbook + automation = gold.* + +--- + +# 🏆 How strong these answers are + +If you said this in an interview: + +* ✅ Clear thinking +* ✅ Real incident experience +* ✅ Calm under pressure +* ✅ Production-first mindset + +This is **hire-level** for: + +* SRE I / II +* DevOps Engineer +* Platform Engineer + +--- From 26bf55421e8f675031c9cdbfcf8d897d6e47f04c Mon Sep 17 00:00:00 2001 From: sumit9165 Date: Fri, 30 Jan 2026 09:47:08 +0530 Subject: [PATCH 27/32] Create Kubernetes troubleshooting runbook Added a comprehensive Kubernetes troubleshooting runbook covering pod crashloop, Redis caching failures, and DNS outages. --- 2026/day-05/k8s-troubleshooting-runbook.md | 290 +++++++++++++++++++++ 1 file changed, 290 insertions(+) create mode 100644 2026/day-05/k8s-troubleshooting-runbook.md diff --git a/2026/day-05/k8s-troubleshooting-runbook.md b/2026/day-05/k8s-troubleshooting-runbook.md new file mode 100644 index 000000000..662a425ad --- /dev/null +++ b/2026/day-05/k8s-troubleshooting-runbook.md @@ -0,0 +1,290 @@ + +# 💣 KUBERNETES POD CRASHLOOP + OOM KILL + +## Target service + +**Service:** `web-app` deployment in Kubernetes +**Impact:** Pod crashlooping, high memory usage + +--- + +## 🚨 Pager Alert + +> Deployment has pods in `CrashLoopBackOff` +> Users report intermittent 500s + +--- + +## Step 1 — Inspect pods + +```bash +kubectl get pods -n production +``` + +``` +NAME READY STATUS RESTARTS AGE +web-app-6d7f89f6b7-hj9k2 0/1 CrashLoopBackOff 5 12m +``` + +--- + +## Step 2 — Check pod logs + +```bash +kubectl logs web-app-6d7f89f6b7-hj9k2 -n production +``` + +``` +RuntimeError: Memory allocation failed +``` + +**Observed:** Pod OOMing → crashloop + +--- + +## Step 3 — Describe pod for Kubernetes events + +```bash +kubectl describe pod web-app-6d7f89f6b7-hj9k2 -n production +``` + +``` +Events: + Type Reason Age From Message + ---- ------ ---- ---- ------- + Warning OOMKilled 5m kubelet Container killed due to memory usage +``` + +**Root cause:** Container 
exceeded memory limit + +--- + +## Step 4 — Containment + +* Temporarily scale deployment to reduce pressure: + +```bash +kubectl scale deployment web-app --replicas=0 -n production +``` + +* Adjust memory limits and requests in the manifest: + +```yaml +resources: + requests: + memory: "512Mi" + limits: + memory: "1024Mi" +``` + +* Redeploy: + +```bash +kubectl apply -f web-app-deployment.yaml +kubectl scale deployment web-app --replicas=3 -n production +``` + +✅ Pod stabilizes, no CrashLoopBackOff + +--- + +## If this worsens (next steps) + +1. Enable heap dumps for debugging +2. Use `kubectl exec` to inspect memory usage inside container +3. Add horizontal pod autoscaling to handle spikes + +--- + +# 🧠 CACHING FAILURE (Redis meltdown) + +## Pager Alert + +> Cache latency spikes → DB load increases +> App response slows, timeouts occur + +--- + +## Step 1 — Check Redis status + +```bash +redis-cli ping +``` + +``` +PONG +``` + +✅ Redis alive, but may be overloaded + +--- + +## Step 2 — Inspect memory + +```bash +redis-cli info memory +``` + +``` +used_memory: 1050000000 +maxmemory: 1073741824 +``` + +**Observed:** ~1GB used out of 1GB → hitting maxmemory + +--- + +## Step 3 — Eviction / slow commands + +```bash +redis-cli info stats +``` + +``` +evicted_keys: 1200 +expired_keys: 1500 +``` + +**Interpretation:** Keys being evicted → app cache misses → DB pressure + +--- + +## Step 4 — Containment + +* Temporarily increase `maxmemory`: + +```bash +redis-cli config set maxmemory 2gb +``` + +* Use LRU eviction policy: + +```bash +redis-cli config set maxmemory-policy allkeys-lru +``` + +✅ Cache stabilizes, DB pressure drops + +--- + +## Next steps if worsens + +1. Add Redis clustering / sharding +2. Enable monitoring + alerts on `used_memory` +3. 
Profile app caching pattern → prevent cache storm + +--- + +# 🌐 DNS OUTAGE SIMULATION + +## Pager Alert + +> App cannot resolve backend hostnames → failing requests + +--- + +## Step 1 — Test DNS resolution + +```bash +dig backend.service.local +``` + +``` +;; connection timed out; no servers could be reached +``` + +✅ DNS failure confirmed + +--- + +## Step 2 — Check local resolver + +```bash +cat /etc/resolv.conf +``` + +``` +nameserver 10.0.0.2 +``` + +```bash +ping 10.0.0.2 +``` + +✅ Resolver unreachable + +--- + +## Step 3 — Containment + +* Restart local DNS service / kube-dns: + +```bash +systemctl restart systemd-resolved +kubectl rollout restart deployment coredns -n kube-system +``` + +✅ Resolution restored, app connectivity returns + +--- + +## If this worsens + +1. Switch to secondary resolver (`8.8.8.8`) temporarily +2. Use static `/etc/hosts` overrides for critical services +3. Investigate upstream DNS provider + +--- + +# 🎯 LIVE WHITEBOARD INTERVIEW — “DESIGN A RESILIENT SYSTEM” + +**Prompt:** Design a resilient multi-tier web system. + +**Model Answer / Thought Process:** + +1. **Frontend layer** + + * Load balancer (HAProxy / Nginx / ALB) + * Auto-scaling web servers + * Health checks + rate limiting + +2. **Application layer** + + * Stateless containers or VMs + * Horizontal scaling + * Graceful shutdown + rolling updates + +3. **Database layer** + + * Primary + read replicas + * Connection pooling + * Backups + failover + +4. **Caching layer** + + * Redis / Memcached cluster + * LRU eviction, persistent failover + +5. **Observability** + + * Metrics (Prometheus, Grafana) + * Logging + alerting + * Distributed tracing (Jaeger / OpenTelemetry) + +6. **Failure scenarios considered** + + * Node / pod crash → HPA / replication + * Disk full → monitoring + retention policies + * Network partition → retries, circuit breakers + * DNS / service discovery → multiple resolvers + +7. 
**Automation / IaC** + + * Terraform / Helm / Ansible to version configs + * CI/CD pipelines with canary rollout + * Runbooks for common failures + +✅ Key: explain reasoning **step by step**, not just draw boxes + +--- + From 3af7fbbfca5cbc81873c5cdee8f801f6f4881f4a Mon Sep 17 00:00:00 2001 From: sumit9165 Date: Fri, 30 Jan 2026 09:51:23 +0530 Subject: [PATCH 28/32] Create ultimate on-call runbook template Added a comprehensive on-call runbook template covering various operational aspects including CPU, memory, disk, network, and common failure patterns. --- 2026/day-05/runbook-template.md | 173 ++++++++++++++++++++++++++++++++ 1 file changed, 173 insertions(+) create mode 100644 2026/day-05/runbook-template.md diff --git a/2026/day-05/runbook-template.md b/2026/day-05/runbook-template.md new file mode 100644 index 000000000..916748803 --- /dev/null +++ b/2026/day-05/runbook-template.md @@ -0,0 +1,173 @@ +**ULTIMATE ON-CALL RUNBOOK**. + +This will be **one-page, reusable, interview-ready, and production-grade**. It covers **everything we’ve drilled**: CPU, memory, disk, network, logs, nginx, DB, Redis, K8s, DNS, traffic spikes. 
+ +Here’s the master template in Markdown: + +--- + +# 🏆 ULTIMATE ON-CALL RUNBOOK + +## 🧩 Service / Process Target + +* Service Name: `<service-name>` +* Type: Web / API / DB / Cache / K8s +* Host / Namespace / Cluster: `<host-or-namespace>` +* Contact / Slack Channel: `<contact-or-channel>` + +--- + +## 1️⃣ Environment Basics + +```bash +uname -a +cat /etc/os-release +``` + +*Check kernel, OS version, architecture.* + +--- + +## 2️⃣ Filesystem Sanity + +```bash +mkdir -p /tmp/runbook-demo +cp /etc/hosts /tmp/runbook-demo/hosts-copy && ls -l /tmp/runbook-demo +``` + +*Ensure disk writable.* + +```bash +df -h +du -sh /var/log/ +``` + +*Check disk usage and log growth.* + +--- + +## 3️⃣ CPU / Memory Snapshot + +```bash +top +ps -o pid,pcpu,pmem,comm -p <PID> +free -h +``` + +*Observe spikes, runaway processes.* + +* K8s: `kubectl top pod <pod-name>` +* Redis: `redis-cli info memory` +* DB: `sudo -u postgres psql -c "SELECT count(*) FROM pg_stat_activity;"` + +--- + +## 4️⃣ Disk / IO + +```bash +iostat -xz 1 3 +vmstat 1 5 +``` + +*Check I/O wait, disk saturation.* + +--- + +## 5️⃣ Network / Connectivity + +```bash +ss -tulpn | grep <service> +curl -I <service-url> +``` + +*K8s:* `kubectl exec <pod-name> -- curl -I http://backend` +*DNS:* `dig <hostname>` + +--- + +## 6️⃣ Logs Reviewed + +```bash +journalctl -u <service> -n 50 +tail -n 50 /var/log/<service>.log +``` + +*K8s:* `kubectl logs <pod-name>` +*Redis / DB:* slow queries, eviction stats + +--- + +## 7️⃣ Common Failure Patterns + +| Symptom | Likely Cause | Action | +| ------------------------ | -------------------------- | -------------------------------------------- | +| CPU 90%+ | Runaway process, container | Kill / restart / scale | +| Disk >90% | Log spamming, DB bloat | Truncate logs, rotate, investigate | +| 502 / 500 errors | Upstream failure | Check logs, restart service, verify DB/cache | +| DB connection exhaustion | Max connections reached | Terminate idle sessions, scale pool | +| CrashLoopBackOff | OOM / config error | Increase limits, fix config, redeploy | +| Cache meltdown | Max memory reached | Increase maxmemory, LRU, scale | + +--- + 
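The "Disk >90%" row in the table above can be checked mechanically. A minimal sketch, assuming POSIX `df -P` output (capacity in column 5, mount point in column 6) and mount points without spaces; `mounts_over` is a hypothetical name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list mount points whose usage is at or above a
# threshold, reading `df -P` output from stdin.
mounts_over() {
  local limit="${1:-90}"
  # Skip the header row; strip the "%" from the Capacity column before comparing.
  awk -v limit="$limit" 'NR > 1 {
    use = $5
    sub(/%/, "", use)
    if (use + 0 >= limit + 0) print $6, $5
  }'
}

# Live usage:
#   df -P | mounts_over 90
```

The `-P` flag is used because it keeps each filesystem on one line, which plain `df -h` does not guarantee for long device names.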
+## 8️⃣ Quick Containment Commands + +* nginx: `nginx -t && systemctl reload nginx` +* Postgres: terminate idle sessions, `systemctl restart postgresql` +* Redis: `config set maxmemory ` + eviction policy +* K8s: scale deployment, inspect pods, `kubectl describe` +* DNS: restart resolver or CoreDNS + +--- + +## 9️⃣ Observability Checks + +* CPU / Memory / Disk: `top`, `free -h`, `df -h`, `iostat` +* Network: `ss -tulpn`, `curl`, `dig` +* Logs: `journalctl`, `kubectl logs`, `tail -n 50` +* Metrics: Prometheus / Grafana dashboards +* Alerts: PagerDuty / Slack channels + +--- + +## 🔥 If This Worsens (Next Steps) + +1. **Deep debugging** + + * strace / lsof / tcpdump + * Heap dumps for OOM + * Profiling slow queries / cache patterns + +2. **Temporary mitigation** + + * Scale horizontally (pods, instances) + * Rate limiting / circuit breakers + * Disable debug logging / access logs + +3. **Escalation** + + * Notify senior SRE / DBA + * Consider rolling back recent deployment / config change + * Engage upstream / cloud provider if needed + +--- + +## ✅ Quick Notes / Lessons Learned + +* Document root cause +* Update runbook if new failure pattern discovered +* Automate monitoring or alerts for this issue + +--- + +This **single page runbook** now covers: + +* CPU, memory, disk, network +* Logs for services, DB, cache, K8s +* nginx, Redis, Postgres/MySQL, Kubernetes, DNS +* Traffic spike handling / rate limiting +* CrashLoopBackOff, OOM kills +* Quick containment commands + escalation steps + +--- + From 0d9dbffce6803c4ad31b75569b95fa34b7baf9d9 Mon Sep 17 00:00:00 2001 From: sumit9165 Date: Fri, 30 Jan 2026 09:54:35 +0530 Subject: [PATCH 29/32] Create on-call visual cheat sheet and decision tree Added a visual cheat sheet decision tree for on-call incidents, detailing steps for diagnosing and resolving issues. 
--- 2026/day-05/on-call-visual-cheatsheet.md | 72 ++++++++++++++++++++++++ 1 file changed, 72 insertions(+) create mode 100644 2026/day-05/on-call-visual-cheatsheet.md diff --git a/2026/day-05/on-call-visual-cheatsheet.md b/2026/day-05/on-call-visual-cheatsheet.md new file mode 100644 index 000000000..62db49491 --- /dev/null +++ b/2026/day-05/on-call-visual-cheatsheet.md @@ -0,0 +1,72 @@ +**visual cheat sheet / decision tree** — one page, easy to read under pressure. + +Here’s a **text-based diagram**. + +--- + +# 🗺️ ON-CALL CHEAT SHEET – DECISION TREE + +``` + [ Pager Alert / Incident ] + | + v + [ Check System Health ] + | + ------------------------------------------------ + | | | + CPU / Memory Disk / IO Network / DNS + | | | + top / free / ps df -h / du -sh ss / curl / dig + | | | +High CPU? Disk > 90%? DNS failures? + | Yes | Yes | Yes + | | | +Kill / Restart Truncate logs / Restart resolver +Process / Pod Investigate DB / Cache / CoreDNS +Scale / Throttle / Rotate logs + | | + | | +No? No? + | | + ------------------------ + | + [ Service Logs ] + | + journalctl / tail / kubectl logs + | + ------------------- + | | +Errors / Warnings? Normal? + | | +Check upstream / Monitor +DB / Cache / K8s Continue + | + v +[ Containment Actions ] + - nginx: reload / truncate logs + - DB: terminate idle sessions / restart + - Redis: increase maxmemory / LRU + - K8s: scale / fix OOM / redeploy + - Traffic spike: rate limit / caching + - DNS: restart resolver / fallback server + | + v +[ If Incident Persists ] + 1. Deep debugging: strace, heap dumps, tcpdump + 2. Temporary mitigation: scale, disable debug logs + 3. Escalate: senior SRE / DBA / cloud provider + | + v +[ Update Runbook / Lessons Learned ] +``` + +--- + +### ✅ How to use this cheat sheet: + +1. **Start at the top** — whatever triggered the pager. +2. **Follow the arrows** — diagnose resource → logs → containment → escalation. +3. **Quick actions are listed per node**, so you never waste time guessing. +4. 
**Add your service-specific notes** in each box for repeatable incidents. + +--- From 29229b669a62fc04380e60840bd4dc5614ad01bf Mon Sep 17 00:00:00 2001 From: sumit9165 Date: Fri, 30 Jan 2026 10:02:48 +0530 Subject: [PATCH 30/32] Add emergency on-call cheat sheet This cheat sheet provides a quick reference for emergency on-call procedures, including system checks, logs, common containment actions, and troubleshooting steps for various services. --- 2026/day-05/cheat-sheet-on-call.md | 91 ++++++++++++++++++++++++++++++ 1 file changed, 91 insertions(+) create mode 100644 2026/day-05/cheat-sheet-on-call.md diff --git a/2026/day-05/cheat-sheet-on-call.md b/2026/day-05/cheat-sheet-on-call.md new file mode 100644 index 000000000..8d57431ce --- /dev/null +++ b/2026/day-05/cheat-sheet-on-call.md @@ -0,0 +1,91 @@ + +--- + +# ⚡ Emergency On-Call Cheat Sheet – 1 Page + +### 1️⃣ Pager Alert / Incident + +* Identify service & severity +* Slack/PagerDuty channel + +--- + +### 2️⃣ Quick System Check + +* `top` → CPU spikes? +* `free -h` → memory pressure? +* `df -h` → disk full? +* `ss -tulpn` → connections? +* `curl -I ` → service up? 
+ +--- + +### 3️⃣ Logs + +* Linux: `journalctl -u -n 50` +* App: `tail -n 50 /var/log/.log` +* K8s: `kubectl logs ` + +**Check for errors/warnings** → next step + +--- + +### 4️⃣ Common Containment Actions + +| Service | Quick Fix | +| -------------- | -------------------------------------------- | +| nginx | `nginx -t && systemctl reload nginx` | +| Postgres/MySQL | terminate idle sessions, restart DB | +| Redis | `CONFIG SET maxmemory ` / LRU | +| K8s Pod | `kubectl scale deployment ` / redeploy | +| Traffic spike | rate limit, enable caching | +| DNS | restart resolver / CoreDNS | + +--- + +### 5️⃣ K8s / Crashloop / OOM + +* `kubectl get pods` → status +* `kubectl describe pod` → events +* Adjust memory limits → redeploy +* Scale down/up deployment if needed + +--- + +### 6️⃣ Database / Cache Pressure + +* Postgres: `SELECT count(*) FROM pg_stat_activity;` +* Redis: `INFO memory` → eviction? +* DB connection limits → terminate idle / use pooling + +--- + +### 7️⃣ DNS / Network + +* `dig ` → resolve? +* `ping ` → network? +* Restart DNS / fallback resolver + +--- + +### 8️⃣ If Incident Persists + +1. Deep debugging: `strace`, `tcpdump`, heap dumps +2. Temporary mitigation: scale, rate-limit, disable debug logs +3. Escalate to senior SRE / DBA / cloud provider + +--- + +### ✅ Quick Tips + +* Always check **CPU, memory, disk, network** first +* Logs give root cause hints +* Contain → stabilize → investigate → update runbook +* Update this cheat sheet if new failure patterns appear + +--- + +This is literally a **“90-second mental map”** — glance, act, contain, escalate, done. 
+ +--- + From e224e667929150a4d7fa743448feaefe3e15ebea Mon Sep 17 00:00:00 2001 From: sumit9165 Date: Fri, 30 Jan 2026 15:46:38 +0530 Subject: [PATCH 31/32] Add file-io-practice.md with file handling instructions --- 2026/day-06/file-io-practice.md | 17 +++++++++++++++++ 1 file changed, 17 insertions(+) create mode 100644 2026/day-06/file-io-practice.md diff --git a/2026/day-06/file-io-practice.md b/2026/day-06/file-io-practice.md new file mode 100644 index 000000000..acb073763 --- /dev/null +++ b/2026/day-06/file-io-practice.md @@ -0,0 +1,17 @@ +# Day-06 + +------ +## Creating an empty file + +## How do you create a single file or multiple files? +* Create a single file with `touch notes.txt`. +* Create several files at once with `touch app.py notes.txt day-06.md`. +------- +## How do you write to a file using redirection (`>`)? +* Overwrite the file: `echo "Line 1" > notes.txt` +* Append to the file: `echo "Line 2" >> notes.txt` +* Append and also print to the terminal: `echo "Line 3" | tee -a notes.txt` +* Read the whole file: `cat notes.txt` +* Show the first two lines: `head -n 2 notes.txt` +* Show the last two lines: `tail -n 2 notes.txt` +* From f35e051a0e9b0a44125d71df5da5a32f66c7a006 Mon Sep 17 00:00:00 2001 From: sumit9165 Date: Sun, 1 Feb 2026 19:11:32 +0530 Subject: [PATCH 32/32] Create day-07-linux-fs-and-scenarios.md Added detailed explanations of the Linux file system hierarchy and provided scenario-based practice steps for managing the nginx service. 
--- 2026/day-07/day-07-linux-fs-and-scenarios.md | 81 ++++++++++++++++++++ 1 file changed, 81 insertions(+) create mode 100644 2026/day-07/day-07-linux-fs-and-scenarios.md diff --git a/2026/day-07/day-07-linux-fs-and-scenarios.md b/2026/day-07/day-07-linux-fs-and-scenarios.md new file mode 100644 index 000000000..614d75f60 --- /dev/null +++ b/2026/day-07/day-07-linux-fs-and-scenarios.md @@ -0,0 +1,81 @@ +# Day-07 Linux file system hierarchy and practice +---- +## Part-1 : Linux file system hierarchy +---- + +**/: Root directory, the top level of the file system.** +* Every file and directory in Linux lives under the root directory, written ‘/’. +----- +**/home: User home directories.** +* The /home directory contains one personal directory per user, holding that user’s data and user-specific configuration files. As a user, you put your personal files, notes, programs and so on in your home directory. +----- +**/bin: Essential binary executables.** +* The ‘/bin’ directory contains the executable files of many basic commands such as ls, cp and cat. +------ +**/sbin: System administration binaries.** +* This is similar to the /bin directory. The difference is that it contains binaries meant to be run by root or a sudo user. You can think of the ‘s’ in ‘sbin’ as super or sudo. +----- +**/etc: Configuration files.** +* The /etc directory contains the core configuration files of the system, used primarily by the administrator and services, such as the password file and networking files. +----- +**/var: Variable data (logs, spool files).** +* Var, short for variable, is where programs store runtime information such as system logs, user tracking, caches, and other files that system programs create and manage. +----- +**/usr: User programs and data.** +* ‘/usr’ holds the executables, libraries and sources of most system programs. For this reason, most of the files in it are read-only (for the normal user). 
+* ‘/usr/bin’ contains basic user commands +* ‘/usr/sbin’ contains additional commands for the administrator +* ‘/usr/lib’ contains the system libraries +* ‘/usr/share’ contains documentation and other architecture-independent shared data; for example, ‘/usr/share/man’ contains the text of the man pages +------- +**/lib: Shared libraries.** +* Libraries are reusable code that executable binaries load at run time. The /lib directory holds the libraries needed by the binaries in the /bin and /sbin directories. +----- +**/tmp: Temporary files.** +* As the name suggests, this directory holds temporary files. Many applications store their temporary files here, and you can use it for your own temporary files too. +-------- +**/opt: Third-party applications.** +* Traditionally, the /opt directory is used for installing and storing third-party applications that are not available from the distribution’s repository. +------ +**/mnt: mnt is used by system administrators to manually mount a filesystem.** + +---- +**/dev: This directory only contains special files, including those relating to the devices. These are virtual files, not physically on the disk.** +* /dev/null: discards anything written to it +* /dev/zero: returns an endless stream of zero bytes when read +* /dev/random: returns a stream of random bytes when read +----- +**/boot: The ‘/boot’ directory contains the kernel and boot image files, along with boot loader files such as LILO and GRUB. It is often advisable to keep this directory on a partition at the beginning of the disk.** + +---- +**/proc – Process and kernel files.** +* The ‘/proc’ directory contains information about currently running processes and kernel parameters. A number of tools read its contents to get runtime system information. +------- +**/root – The home directory of the root user.** +* There is also a /root directory, which is the home directory of the root user. 
So instead of /home/root, the home of root is located at /root. Do not confuse it with the root directory (/). +-------- +**/media – Mount point for removable media.** +* When you connect removable media such as a USB disk, SD card or DVD, a directory is automatically created for it under /media. You can access the content of the removable media from this directory. +---- +**/mnt – Mount directory.** +* This is similar to the /media directory, but instead of holding automatically mounted removable media, /mnt is used by system administrators to mount a filesystem manually. +---- +**/srv – Service data** +* The /srv directory contains data for services provided by the system. For example, if you run an HTTP server, it is good practice to store the website data in the /srv directory. +---- +## Part-2 : Scenario-based Practice +---- + +* **step 1:** I installed the nginx service using `sudo apt install nginx` and opened it in my web browser using the public IP. +* **step 2:** I checked the status of the nginx service using `systemctl status nginx`. +* **step 3:** I checked whether the nginx service is enabled on boot using `systemctl is-enabled nginx`. +* **step 4:** I disabled the nginx service on boot using `sudo systemctl disable nginx`. +* **step 5:** I checked the CPU usage, CPU percentage, PID and running state of the nginx service using: +* `ps`, +* `top`, +* `htop`, +* `ps aux --sort=-%cpu | head -10`. + + + +
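For step 5, a small sketch of pulling out the busiest processes from `ps` output. It assumes input shaped like `ps -eo pid,pcpu,comm` (a header on line 1 and %CPU in column 2); the function name `top_cpu` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the N busiest rows from `ps -eo pid,pcpu,comm`
# style input read on stdin.
top_cpu() {
  local n="${1:-5}"
  # Drop the header, sort numerically on column 2 (highest first), keep N rows.
  tail -n +2 | LC_ALL=C sort -k2,2nr | head -n "$n"
}

# Live usage:
#   ps -eo pid,pcpu,comm | top_cpu 5
```

Sorting in descending order and then taking the head is what surfaces the heaviest consumers; taking the tail of the same list would show the idlest processes instead.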