RHCSA (032): Season 2 - Scenario 7: Archive and Regex Parsing

Episode 032 Executive Summary

In this Archive and Regex Parsing RHCSA lab scenario, I start performing the lab after a 6-minute introduction by the podcast hosts. You will get the most out of this lab if you listen to the entire show and then practice the lab several times, either along with me or by yourself. Today we are diving into the Essential Tools domain, focusing on the hidden complexities of archiving and regular expressions. These foundational skills underpin almost every advanced operational task on the exam.

The deceptive part of this scenario is absolute versus relative path archiving. When you pass absolute paths to tar, GNU tar strips the leading slash from the member names by default, so the files later extract relative to the current working directory instead of to their original locations. If the task requires the absolute paths to be preserved, that default silently breaks the requirement and costs you the points on exam day. The operational goal here is to search a directory for files containing a specific critical error string and archive only the matching files while preserving their exact paths.
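To see the difference for yourself without touching system directories, here is a minimal sketch that archives the same file twice in a scratch directory. The `work` directory and the file name `app.log` are hypothetical stand-ins created only for this demonstration, and the sketch assumes GNU tar as shipped with RHEL.

```shell
set -eu

# Hypothetical scratch area so nothing under /var/log is touched
work=$(mktemp -d)
mkdir -p "$work/src/logs"
echo "FATAL_OOM: demo entry" > "$work/src/logs/app.log"

# Default behavior: the leading slash is stripped from the member name
# (the warning tar prints about this is discarded here)
tar -cf "$work/default.tar" "$work/src/logs/app.log" 2>/dev/null

# With -P (--absolute-names) the member name keeps its leading slash
tar -cPf "$work/absolute.tar" "$work/src/logs/app.log"

# List both archives; -P on the listing shows names exactly as stored
default_name=$(tar -tPf "$work/default.tar")
absolute_name=$(tar -tPf "$work/absolute.tar")
echo "default : $default_name"
echo "with -P : $absolute_name"
```

The first name printed should lack the leading slash and the second should keep it, which is the whole trap in one screen of output.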

The core challenge forces you to use regex to identify the correct files, pipe that output into an archive utility, and ensure the resulting tarball is deposited securely into a designated functional user directory. We will rely on grep, tar, and xargs to build a bulletproof one-liner. We will then verify the archive contents to guarantee the file tree remains completely intact without extracting it prematurely.

Keywords: RHEL 10, RHCSA EX200, tar archive, regular expressions, absolute paths, grep parsing, xargs, system administration

EPISODE 032: Archive and Regex Parsing
* Season: 2 | Difficulty: High
* Objectives: Primary 1.3, 1.6, 1.8; Secondary 9.1
* Lab Focus: Archiving, Regex, File Management
* URL: https://djere.com/rhcsa-032-season-2-scenario-7-archive-regex.html

***

### 1. SCENARIO BRIEF (THE PROBLEM)
A legacy application is writing unstructured debug logs into a shared directory. You need to identify all log files containing a specific "FATAL_OOM" string, archive only those files into a compressed tarball, and store the archive in the home directory of a newly created backup user. The archive must retain the exact original directory structure of the logs so the engineering team can extract them safely into a sandbox environment later.

***

### 2. TASK ANALYSIS (THE "WHY")
* 1.3 (Grep/Regex): Required to scan inside text files across a directory and return only the names of files containing the target string.
* 1.6 (Archive/Tar): Needed to bundle the resulting files into a single, compressed artifact while managing path structures.
* 1.8 (File Management): Ensures the artifact is moved to the correct location with appropriate ownership.
* 9.1 (Create/Modify Users): Introduces a secondary domain objective by requiring a dedicated functional user for backup storage.
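The search string in this scenario is a plain literal, so the regex side of objective 1.3 is easy to overlook. As a small sketch of the difference, the example below compares a fixed-string search with an anchored extended regex. The scratch directory and the `app-05.log` file are hypothetical additions that are not part of the lab.

```shell
set -eu

# Hypothetical scratch directory with two sample logs
demo=$(mktemp -d)
echo "FATAL_OOM: Memory limit exceeded at 0x00A" > "$demo/app-02.log"
echo "INFO: previous FATAL_OOM alert cleared" > "$demo/app-05.log"

# -F treats the pattern as a fixed string: both files contain it
literal_hits=$(grep -rlF "FATAL_OOM" "$demo" | sort)

# -E with ^ anchors the match to the start of a line: only real error lines
anchored_hits=$(grep -rlE '^FATAL_OOM: ' "$demo" | sort)

echo "literal match:"
printf '%s\n' "$literal_hits"
echo "anchored match:"
printf '%s\n' "$anchored_hits"
```

If the exam wording says "lines beginning with" rather than "containing", the anchored form is the one that returns the right file list.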

***

### 3. SOLUTION STEPS

#### Step 1: Environment Setup (Root Only)
# Verify tar is installed and install it silently if missing to ensure tool availability
rpm -q tar >/dev/null 2>&1 || dnf install -y tar

# Create the target functional user for storing the backups
useradd backup-manager

# Create the log directory structure using the parent flag to avoid missing directory errors
mkdir -p /var/log/legacy-app

# Generate dummy log files with varying content to simulate a production environment
echo "INFO: System started normally" > /var/log/legacy-app/app-01.log
echo "FATAL_OOM: Memory limit exceeded at 0x00A" > /var/log/legacy-app/app-02.log
echo "WARN: High CPU usage detected" > /var/log/legacy-app/app-03.log
echo "FATAL_OOM: Out of memory killer invoked" > /var/log/legacy-app/app-04.log

#### Step 2: Core Implementation (Execute as root)
# Use grep with -r to search recursively, -l to output only file names, and pipe to xargs tar
# The -c flag creates the archive, -z applies gzip compression, and -f specifies the filename
# The -P flag forces tar to use absolute paths, preventing the default removal of the leading slash
grep -rl "FATAL_OOM" /var/log/legacy-app/ | xargs tar -czPf /home/backup-manager/oom-logs.tar.gz

# Change the ownership of the newly created archive to the backup user
# The chown command uses the user:group format to secure the artifact for the target account
chown backup-manager:backup-manager /home/backup-manager/oom-logs.tar.gz

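# Technical Breakdown: grep isolates the files containing the error string. xargs takes that list and passes it to tar as arguments. The capital -P flag (--absolute-names) disables tar's default stripping of the leading slash, so member names are stored as absolute paths.
# Pro-Tip: Without -P, member names are stored without the leading slash and extract relative to your current working directory, which will fail an exam requirement dictating absolute path restoration. GNU tar also strips leading slashes at extraction time unless -P is given again, so restoring files to their original absolute locations needs -P on both ends.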
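The one-liner above works for the lab files, but it can misbehave when file names contain spaces or when grep matches nothing, because GNU xargs still runs tar once with no file arguments. A hedged variant, assuming GNU grep and GNU xargs, is sketched below in a scratch directory. The `scratch` path and the `app 02.log` name are hypothetical and chosen to exercise the space-in-filename case.

```shell
set -eu

# Hypothetical scratch layout standing in for /var/log/legacy-app
scratch=$(mktemp -d)
logdir="$scratch/legacy-app"
archive="$scratch/oom-logs.tar.gz"
mkdir -p "$logdir"
echo "INFO: System started normally" > "$logdir/app-01.log"
echo "FATAL_OOM: Memory limit exceeded" > "$logdir/app 02.log"
echo "FATAL_OOM: Out of memory killer invoked" > "$logdir/app-04.log"

# -Z makes grep end each file name with a NUL byte instead of a newline
# -0 makes xargs split on NUL; -r skips running tar when there is no input
grep -rlZ "FATAL_OOM" "$logdir" | xargs -0 -r tar -czPf "$archive"

# List the members exactly as stored, sorted for a stable order
members=$(tar -tPf "$archive" | sort)
printf '%s\n' "$members"
```

With a very large match list, xargs may split the arguments across several tar runs, each overwriting the archive. In that case `tar --null -T -` reading the same NUL-delimited list from standard input is an alternative worth knowing.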

#### Step 3: Verification (The "Proof of Work")
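# Use tar with the -t flag to list the contents of the archive without extracting it
# The -f flag specifies the target file, and -P makes tar print member names exactly as stored (depending on the GNU tar version, a listing without -P may show the names with the leading slash stripped)
# Sorting gives a predictable order, since grep -r returns files in directory order
tar -tPf /home/backup-manager/oom-logs.tar.gz | sort
* EXPECTED: /var/log/legacy-app/app-02.log
* EXPECTED: /var/log/legacy-app/app-04.log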
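The scenario brief mentions that engineering will later extract the archive into a sandbox. As a rough sketch of what that looks like, the example below builds an archive with absolute member names and extracts it with `-C` into a scratch directory. All paths here are hypothetical, and GNU tar is assumed. Because `-P` is not given at extraction time, tar strips the leading slash and recreates the tree underneath the sandbox instead of overwriting the originals.

```shell
set -eu

# Hypothetical scratch layout: a source log and an empty sandbox
scratch=$(mktemp -d)
src="$scratch/legacy-app"
sandbox="$scratch/sandbox"
mkdir -p "$src" "$sandbox"
echo "FATAL_OOM: Out of memory killer invoked" > "$src/app-04.log"

# Store the member with its absolute path, as in the lab
tar -czPf "$scratch/oom-logs.tar.gz" "$src/app-04.log"

# Extract without -P: the leading slash is dropped, so the full
# directory tree is rebuilt below the sandbox directory
tar -xzf "$scratch/oom-logs.tar.gz" -C "$sandbox" 2>/dev/null

restored="$sandbox$src/app-04.log"
ls -l "$restored"
```

Adding `-P` to the extract command would instead write the file back to its original absolute location, which is what you want for a restore and what you do not want in a sandbox.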

***

### 4. COMPREHENSIVE CLEANUP (ZERO-TRACE)
# Recursively remove the log directory and delete the functional user along with their home directory
rm -rf /var/log/legacy-app
userdel -r backup-manager
