operator shade · role Network & System Admin · status available · → open ticket
Targeting Network / System Administration roles

Network & System Admin
in deliberate motion.

Hands-on experience in production environments: diagnosing LAN outages, securing compromised infrastructure, maintaining Windows/Linux servers, and deploying containerized services. Production IT at UW Continuum College and ProGranite Surfaces from 2022 to 2025, then a homelab that formalized everything: dual Proxmox hosts, 30+ Docker containers, TrueNAS ZFS storage, multi-vendor OSPF routing, and AI-agent operations. Today: actively pursuing CCNA 200-301 and RHCSA/RHCE on a deliberate, methodical path into network and system administration.

// the deliberate practice
Project Hermes is an always-on homelab spanning dual Proxmox VE hosts, 30+ Docker containers, TrueNAS ZFS RAIDZ1 storage, a Traefik reverse proxy with Cloudflare DNS-01 challenges, Authentik SSO, and CrowdSec WAF. It is a practice environment where real incident reports are written and the discipline of infrastructure operations is deliberate, not incidental.
// the operator instinct
At ProGranite, a complete LAN failure brought a stone fabrication facility to its knees. A switching loop took down every workstation, every CNC machine, every phone line. I diagnosed it bottom-up through the OSI model: identified the loop, isolated the broadcast storm, and redesigned the network with VLAN segmentation. That mindset (calm, methodical, one OSI layer at a time) is the same one I bring to every incident.
14 yrs · Tech-curious since I built my first PC
50+ devices · Supported across roles
35 VLANs · Homelab network segmentation design
30+ containers · In production-like operation
CCNA 200-301 · Target: End 2026
shade@hermes:~ — systemctl status homelab
shade@hermes:~$ systemctl status homelab
● homelab.target - Project Hermes Homelab Stack
     Loaded: loaded (/etc/systemd/system/homelab.target; enabled)
     Active: active (running) since Thu 2026-06-11 08:00:00 PDT; 2h 14min ago
   Main PID: N/A (target)
      Tasks: 47 (limit: 4915)
        CPU: 2min 34.7s
     CGroup: /system.slice/homelab.target
             ├─proxmox1.service · PVE Node 1 · Ryzen 5 5600G · 80GB
             ├─proxmox2.service · PVE Node 2 · Ryzen 5600GT · 64GB
             ├─truenas.service · TrueNAS Scale · 43TB ZFS RAIDZ1
             └─docker.service · 30+ containers · Traefik + Authentik
shade@hermes:~$ _
Project Hermes full topology — Proxmox cluster with VLAN segmentation
project_hermes · the practice environment
How I got here

Infrastructure has been the through-line.
Now deliberately the career.

Not the standard help-desk-to-sysadmin story. Three things to know about how I got here — and why I built this portfolio the way I did.

01

Hands-on before I had the title.

IT support at UW Continuum College gave me Tier 1 & 2 discipline. Then ProGranite Surfaces put me in front of production infrastructure — a flat LAN that failed catastrophically, shared hosting that got compromised, Windows and Linux servers that needed maintenance, and a network that I redesigned from scratch with VLAN segmentation. The title was "IT Support Specialist." The work was already system administration.

→ where I came from
02

The homelab is where I prove it.

Project Hermes is the formalization of everything: dual Proxmox VE hypervisors, TrueNAS Scale with 43TB ZFS RAIDZ1, 30+ Docker containers behind Traefik with Authentik SSO and CrowdSec WAF, 35 VLANs segmented across a Ubiquiti stack. Built to deliberately deepen infrastructure skills — and to document how I actually think and troubleshoot. Two documented incident RCAs with full investigation paths.

→ the engineering work
03

CCNA, RHCSA, and the deliberate path.

CCNA 200-301 targeting end of 2026. RHCSA/RHCE on the same timeline. The credentials aren't the destination — they're the structured path I chose to formalize what hands-on production work made me good at. Targeting Network & System Administration roles where I can grow with a team, handle real production incidents, and keep building infrastructure that stays up.

→ get in touch
Engineering · ProGranite Incidents

Two incident reports.
Real production, real investigation, real engineering judgment.

These are two incidents I investigated and resolved in a production environment at ProGranite Surfaces, written in real incident-report format: environment, symptoms, hypotheses ruled out, investigation path, root cause, validation, lessons. Skim the summaries. Read the investigations when you want depth.

// methodology

Calm. Methodical. Bottom-up through the OSI model, every time.

// the philosophy that came out of a catastrophic production outage
PGR-IR-01 · P2 · LAN / SWITCHING · RESOLVED
SCOPE · Complete LAN failure, switching loop, network redesign

Complete LAN failure at a stone fabrication facility — switching loop diagnosis and network redesign.

A stone fabrication facility's entire LAN collapsed during production hours. Every workstation lost connectivity, CNC machines went offline, and phone lines dropped simultaneously. The initial assumption was an upstream ISP or firewall failure — multiple services failing at once usually points to the edge. The actual failure was inside the switching fabric: a switching loop had formed when an unmanaged switch was connected to two wall jacks fed by the same switch stack, creating a Layer 2 broadcast storm that saturated every link on the network. Resolution required physically locating the loop source, breaking the broadcast storm by disabling ports until the loop resolved, then redesigning the flat network with proper VLAN segmentation to prevent recurrence. The redesign was eventually implemented on a UniFi/UDM Pro stack with isolated VLANs for data, voice, production equipment, and guest access.

// environment
  • Facility: ProGranite Surfaces — stone fabrication plant, ~15,000 sq ft
  • Switching: Two HP ProCurve switches in stack configuration feeding ~30 desk drops + 15 production-area drops
  • Edge/WAN: UDM Pro (pre-outage: basic flat config, no VLAN segmentation)
  • Production equipment: CNC stone-cutting machines on wired Ethernet, VoIP phones, workstation PCs
  • Network design (pre-incident): Single flat /24 subnet, all devices on the same broadcast domain
// symptoms
  • Complete loss of network connectivity across the entire facility — no device could reach the gateway, internet, or other devices
  • VoIP phones displayed "No Network" — no registration, no dial tone
  • CNC machines stopped receiving job files and reported network errors
  • All workstations lost mapped drives, internet, and inter-device communication
  • Activity LEDs on the switch stack were lit near-continuously on every port, consistent with sustained ~100% utilization
  • UDM Pro dashboard showed no client activity despite dozens of devices being powered on
// hypotheses ruled out
  • ISP outage — UDM Pro WAN link showed carrier, modem was sync'd to upstream
  • UDM Pro failure/reboot — device was online, web UI accessible from a directly-connected laptop, LAN-side showed no DHCP activity
  • Switch stack failure — both switches were powered, ports showed link but no traffic forwarded successfully
  • DNS/DHCP server failure — UDM Pro was serving both, and a statically-addressed laptop on the same switch stack could not reach the gateway
  • Cable plant issue — too many concurrent failures across independent drops for a physical cabling problem
// investigation path
  1. Establish baseline connectivity from the edge. Connected a laptop directly to the UDM Pro LAN port. DHCP worked, internet was reachable. The WAN edge was healthy — the failure was in the switching fabric.
  2. Check switch health and utilization. Accessed the HP ProCurve CLI via serial console. Port utilization was pegged at 99–100% on every active port. Broadcast counters were increasing at thousands per second. This is the signature of a broadcast storm from a switching loop.
  3. Identify the loop carrier. Disabled downlink ports one at a time, observing link utilization on the remaining ports. When a specific office-area port was disabled, utilization on all other ports dropped to near-zero within seconds. Network connectivity returned immediately for all remaining ports.
  4. Trace the offending port. The disabled port connected to a wall jack in the main office. That jack fed a desk where an unmanaged 5-port switch had been connected — and that switch was plugged into two wall jacks, both terminating at the same HP ProCurve stack. Broadcast frames from one port traveled out through the switch and came back in through the second port, creating a perfect Layer 2 loop.
  5. Break the loop physically. Removed the redundant patch cable at the wall. Re-enabled the switch port. Network stabilized immediately.
  6. Confirm full recovery. All devices reconnected via DHCP. VoIP phones registered. CNC machines resumed communication. No further broadcast storm activity.
  7. Assess the underlying vulnerability. The flat /24 network design meant any loop anywhere brought down everything. STP was not enabled on the HP ProCurve stack (or was disabled by default on the affected ports). There was no VLAN isolation, no broadcast containment strategy.
  8. Design the remediation. Proposed and later implemented a redesigned network: UDM Pro with VLAN segmentation isolating production equipment (VLAN 10), office workstations (VLAN 20), VoIP (VLAN 30), and guest/IoT (VLAN 40). STP enabled on all managed switch ports. No unmanaged switches without explicit authorization and proper uplink cabling.
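The counter check in step 2 boils down to a rate calculation: read the cumulative broadcast counter twice and divide the difference by the elapsed time. The sketch below is illustrative only. The port labels, the readings, and the 1,000 frames-per-second threshold are invented for the example and are not data from the incident.

```python
from dataclasses import dataclass


@dataclass
class CounterSample:
    port: str              # hypothetical port label
    seconds: float         # when the counter was read
    broadcast_frames: int  # cumulative broadcast counter at that time


def broadcast_rates(before, after):
    """Return broadcast frames per second for each port present in both samples."""
    earlier = {s.port: s for s in before}
    rates = {}
    for sample in after:
        prev = earlier.get(sample.port)
        if prev is None or sample.seconds <= prev.seconds:
            continue  # no usable baseline for this port
        delta = sample.broadcast_frames - prev.broadcast_frames
        rates[sample.port] = delta / (sample.seconds - prev.seconds)
    return rates


def storm_ports(rates, threshold_fps=1000.0):
    """Ports whose broadcast rate meets or exceeds the assumed threshold."""
    return sorted(port for port, fps in rates.items() if fps >= threshold_fps)


# Invented readings taken 10 seconds apart; B7 is a quiet port for contrast.
before = [
    CounterSample("A1", 0.0, 10_000),
    CounterSample("A2", 0.0, 12_000),
    CounterSample("B7", 0.0, 500),
]
after = [
    CounterSample("A1", 10.0, 60_000),
    CounterSample("A2", 10.0, 71_000),
    CounterSample("B7", 10.0, 520),
]

rates = broadcast_rates(before, after)
print(rates)
print(storm_ports(rates))
```

Repeating the same measurement after each port is disabled mirrors step 3: the port whose removal drops every other rate back to baseline is the one carrying the loop.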

RC-1 (immediate): An unmanaged desktop switch was connected to two wall jacks both terminating at the HP ProCurve stack, creating a physical Layer 2 switching loop. Broadcast frames circulated indefinitely, producing a broadcast storm that saturated all switch uplinks and prevented legitimate traffic from being forwarded.

RC-2 (contributing): The flat /24 network design with no VLAN segmentation meant there was no broadcast domain isolation. A loop originating in the office could take down CNC machines on the other side of the facility. STP was either not enabled or not covering the unmanaged switch segment, which had no STP capability.
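To make the broadcast-domain point concrete, here is a minimal sketch of a segmented address plan. The VLAN IDs and roles come from the remediation step above; the `10.0.<id>.0/24` addressing and the sample host addresses are assumptions made for the example, not the addressing used at the facility.

```python
import ipaddress

# VLAN IDs and roles as described in the remediation plan.
VLANS = {
    10: "production equipment",
    20: "office workstations",
    30: "VoIP",
    40: "guest/IoT",
}


def build_plan(vlans):
    """Map each VLAN ID to a /24 (hypothetical 10.0.<id>.0/24 scheme)."""
    return {vid: ipaddress.ip_network(f"10.0.{vid}.0/24") for vid in vlans}


def vlan_for(address, plan):
    """Return the VLAN ID whose subnet contains the address, or None."""
    ip = ipaddress.ip_address(address)
    for vid, network in plan.items():
        if ip in network:
            return vid
    return None


def same_broadcast_domain(a, b, plan):
    """Two hosts share a broadcast domain only if they sit in the same VLAN."""
    va, vb = vlan_for(a, plan), vlan_for(b, plan)
    return va is not None and va == vb


plan = build_plan(VLANS)
for vid, network in plan.items():
    print(f"VLAN {vid}  {network}  broadcast {network.broadcast_address}  {VLANS[vid]}")

# A CNC machine and an office PC (both addresses invented).
print(same_broadcast_domain("10.0.10.15", "10.0.20.30", plan))
```

In the flat /24 every pair of hosts would return True here, which is why a loop in the office could reach the CNC machines.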

// remediation
  • Removed the redundant patch cable that created the loop — immediate restoration of connectivity
  • Enabled and verified STP (RSTP) on all managed switch ports
  • Implemented VLAN segmentation design on the UDM Pro: production equipment, office workstations, VoIP, and guest networks isolated into separate VLANs with inter-VLAN firewall rules
  • Documented the incident and created a written policy: no unmanaged switches without explicit approval and proper single-uplink cabling
// validation
  • All devices reconnected via DHCP within minutes of breaking the loop
  • VoIP phones registered and made test calls successfully
  • CNC machines resumed receiving job files over the network
  • Broadcast counters returned to normal baseline levels on all switch ports
  • VLAN segmentation design was implemented as a permanent architectural fix — no recurrence of broadcast storms in the redesigned network
// lessons
  • A switching loop produces symptoms that look exactly like a total WAN failure. Every device offline, no internet, no inter-device communication — it's easy to blame the ISP when the real problem is inside your switching fabric.
  • Flat networks amplify any single failure into a facility-wide outage. VLAN segmentation isn't just best practice — it's blast radius containment.
  • Unmanaged switches are risk vectors. They have no STP, no loop protection, and no management visibility. They belong in environments where you control both ends of every cable.
  • Bottom-up OSI model diagnosis works. Start with the physical layer, then L2 switching, then L3 routing. If serial console to the switch shows broadcast storms at L2, you don't need to look at the firewall.
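The bottom-up habit can be written down as an ordered list of checks that stops at the first failing layer. This is a toy sketch: the `observed` values stand in for real probes and are hypothetical, chosen to resemble the outage (links up, broadcast rate abnormal).

```python
from typing import Callable, List, Optional, Tuple

Check = Tuple[str, Callable[[], bool]]


def first_failing_layer(checks: List[Check]) -> Optional[str]:
    """Run checks lowest layer first and return the first layer that fails."""
    for layer, check in checks:
        if not check():
            return layer
    return None


# Hypothetical observations standing in for real probes.
observed = {
    "link_up": True,
    "broadcast_rate_normal": False,
    "gateway_reachable": False,
}

checks = [
    ("L1 physical", lambda: observed["link_up"]),
    ("L2 switching", lambda: observed["broadcast_rate_normal"]),
    ("L3 routing", lambda: observed["gateway_reachable"]),
]

print(first_failing_layer(checks))
```

Because the walk stops at L2, the unreachable gateway is treated as a consequence rather than a separate fault to chase.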
// skills demonstrated
LAN troubleshooting · Switching loop diagnosis · VLAN segmentation · UniFi / UDM Pro · Network redesign · Broadcast storm containment · Production outage response
PGR-IR-02 · P2 · SECURITY / INCIDENT RESPONSE · RESOLVED
SCOPE · WordPress compromise, dual backdoor, defense-in-depth recovery

WordPress compromise on shared hosting — backdoor removal and defense-in-depth recovery.

A WordPress site hosted on shared Hostinger/CageFS infrastructure was compromised. The site displayed a defacement page redirecting visitors to a spam domain, and the hosting provider sent a malware detection alert. Initial response — replacing core files and resetting credentials — appeared to resolve the issue. But the compromise persisted through the cleanup. A second infection vector was discovered: PHP backdoor files embedded in the uploads directory using obfuscated base64_decode and eval() payloads, designed to survive a standard file replacement. Complete remediation required a systematic defense-in-depth approach: forensic file analysis, backdoor file removal, credential rotation, WAF deployment, file integrity monitoring setup, and a hardened WordPress configuration.

// environment
  • Hosting: Hostinger shared hosting with CageFS filesystem isolation
  • Application: WordPress (standard install, several third-party plugins and themes)
  • Access: Admin credentials, SFTP access to public_html
  • Detection: Hostinger security scan flagged malicious file modifications; site visitors saw a spam redirect
  • Backup status: Pre-compromise backups existed in Hostinger's auto-backup system
// symptoms
  • Site homepage replaced with a defacement page — visitors redirected to an external spam domain
  • Hostinger security panel flagged "Malicious code detected" with specific file paths
  • WordPress admin dashboard accessible but showed altered site health indicators
  • Standard WordPress file permissions appeared modified on several directories
  • Google Search Console alerts about site compromise
// hypotheses ruled out
  • Simple site defacement via known vulnerability — replacing core WordPress files alone did not resolve the issue; the compromise returned
  • Credential compromise (admin password guess) — password reset alone was insufficient; the compromise persisted after credential rotation
  • Plugin vulnerability only — disabling all plugins and using a default theme did not stop the reinfection
  • Hosting provider server compromise — other sites on the same shared hosting account were not affected, suggesting a site-specific infection
// investigation path
  1. Confirm initial compromise and scope. Logged into Hostinger panel — confirmed malware alert on specific PHP files. Site was redirecting to spam-domain[.]com. The defacement was the visible symptom, but the vector was unknown.
  2. Attempt standard recovery. Replaced all WordPress core files with fresh downloads, reset all admin credentials, updated plugins. Site appeared clean for ~2 hours — then returned to defacement state. Something was reinfecting the site.
  3. Deep file inspection. Ran a full recursive grep search across public_html for common backdoor signatures: base64_decode, eval() with encoded strings, preg_replace with /e modifier, system(), exec(), shell_exec(). Found two files in wp-content/uploads/2023/ containing obfuscated PHP — hidden among legitimate uploaded images.
  4. Analyze the backdoor payload. The files used a layered obfuscation chain: base64_decode(str_rot13(gzinflate(...))) executed via eval(). The decoded payload checked for specific GET/POST parameters as a trigger — without the trigger key, the file appeared benign. This meant automated scanners could easily miss it.
  5. Check for additional persistence mechanisms. Found no cron-based reinfection, no .htaccess modifications (in this case), no database-stored payloads. The backdoors were purely file-based and depended on being included or accessed directly.
  6. Determine the entry vector. Evidence pointed to a compromised plugin (outdated version with known CVE) as the initial entry point. The attackers uploaded the backdoor files through the compromised plugin's unrestricted file upload capability.
  7. Complete file-level cleanup. Removed both backdoor files. Ran a comprehensive sweep of all PHP files for encoded strings, suspicious file permissions, and unexpected modifications. Verified against known-good hashes from original plugin/theme downloads.
  8. Implement defense-in-depth. Deployed Cloudflare WAF with OWASP rule set and WordPress-specific rate limiting. Installed Wordfence Security with real-time file integrity monitoring and firewall. Removed the vulnerable plugin entirely. Restricted upload directory execution (where hosting allowed). Set up automated offsite backups with 7-day retention.
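The recursive signature search from step 3 can be sketched in Python. This is not the command that was run during the incident: it covers only a subset of the signatures listed above, the file names in the demo are invented, and the extra "php-in-uploads" flag is an added heuristic. A match is a triage lead, not proof of compromise.

```python
import re
import tempfile
from pathlib import Path

PHP_SUFFIXES = {".php", ".phtml"}

# Subset of the signatures named in step 3, matched against raw bytes.
SIGNATURES = {
    "base64_decode": re.compile(rb"base64_decode\s*\("),
    "eval": re.compile(rb"\beval\s*\("),
    "gzinflate": re.compile(rb"gzinflate\s*\("),
    "str_rot13": re.compile(rb"str_rot13\s*\("),
    "command_exec": re.compile(rb"\b(?:shell_exec|system|exec)\s*\("),
}


def scan_tree(root):
    """Return {relative path: [matched signature names]} for PHP files under root."""
    root = Path(root)
    findings = {}
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix.lower() not in PHP_SUFFIXES:
            continue
        relative = path.relative_to(root)
        data = path.read_bytes()
        hits = [name for name, pattern in SIGNATURES.items() if pattern.search(data)]
        # Added heuristic: executable PHP inside an uploads directory is suspicious.
        if "uploads" in relative.parts:
            hits.append("php-in-uploads")
        if hits:
            findings[relative.as_posix()] = hits
    return findings


def demo():
    """Build a throwaway tree with one clean file, one image, one suspicious file."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        uploads = root / "wp-content" / "uploads" / "2023"
        uploads.mkdir(parents=True)
        (root / "index.php").write_bytes(b"<?php echo 'hello'; ?>")
        (uploads / "photo.jpg").write_bytes(b"\xff\xd8\xff placeholder image bytes")
        (uploads / "thumb-cache.php").write_bytes(
            b"<?php eval(base64_decode(str_rot13(gzinflate($d)))); ?>"
        )
        return scan_tree(root)


findings = demo()
for path, hits in findings.items():
    print(path, hits)
```

Only the file under uploads is reported; the clean `index.php` and the image are skipped.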

RC-1 (initial compromise): Vulnerable third-party WordPress plugin with an unrestricted file upload CVE allowed attackers to upload PHP backdoor files disguised as legitimate media uploads into wp-content/uploads/.

RC-2 (persistence): The backdoor files used layered obfuscation (base64_decode + str_rot13 + gzinflate + eval) to evade signature-based detection. They were hidden among thousands of legitimate uploaded files and activated only when specific HTTP parameters were present, making them invisible to casual inspection and standard automated scanners.
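The layered chain can be unwound safely by reversing each step and printing the result instead of executing it. The sketch below packs a harmless placeholder string the same way (base64, then ROT13, then raw DEFLATE, which is the format PHP's gzinflate() reads) and decodes it again. The sample payload is invented; the real backdoor contents are not reproduced here.

```python
import base64
import codecs
import zlib


def pack_like_dropper(source: bytes) -> bytes:
    """Encode a benign sample as base64, then ROT13, then raw DEFLATE."""
    b64_text = base64.b64encode(source).decode("ascii")
    rotated = codecs.encode(b64_text, "rot_13")
    compressor = zlib.compressobj(wbits=-15)  # negative wbits = raw DEFLATE stream
    return compressor.compress(rotated.encode("ascii")) + compressor.flush()


def decode_layers(blob: bytes) -> bytes:
    """Undo gzinflate, then str_rot13, then base64_decode. Nothing is executed."""
    inflated = zlib.decompress(blob, wbits=-15).decode("ascii")
    unrotated = codecs.decode(inflated, "rot_13")
    return base64.b64decode(unrotated)


sample = b"<?php /* harmless placeholder payload */ ?>"
blob = pack_like_dropper(sample)
decoded = decode_layers(blob)

print(len(blob), "packed bytes")
print(decoded.decode("ascii"))
```

ROT13 only touches letters, so the digits and `+/=` characters of the base64 text pass through unchanged, which is why the two steps compose cleanly.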

// remediation
  • Removed two obfuscated PHP backdoor files from wp-content/uploads/2023/
  • Replaced all WordPress core files with fresh downloads from wordpress.org
  • Updated/removed all vulnerable plugins and themes
  • Reset all credentials (admin, SFTP, database, hosting panel)
  • Deployed Cloudflare WAF with WordPress-specific security rules
  • Installed and configured Wordfence Security with real-time file integrity monitoring
  • Established automated offsite backup pipeline with verified restore testing
// validation
  • Site returned to normal operation — no further defacement, no redirects
  • Wordfence file integrity scans confirmed no unexpected file modifications
  • Hostinger security panel showed clean status
  • Google Search Console re-verified site ownership and flagged the site as clean
  • Cloudflare WAF logged blocked exploit attempts in the days following cleanup, confirming the WAF was intercepting ongoing attack traffic
  • File integrity monitoring established as an ongoing operational practice — monthly audit cadence implemented
// lessons
  • File replacement alone is not incident response. Until you find and remove the persistence mechanism, you're treating symptoms, not the infection.
  • Backdoor files in upload directories are a standard WordPress attack pattern. Upload directories are writable by design and easy to hide files in among legitimate uploads.
  • Obfuscated PHP (base64_decode + eval chains) defeats naive grep searches. You have to decode and understand the payload, not just match strings.
  • Defense-in-depth is what keeps a cleaned site clean. WAF, file integrity monitoring, restricted file permissions, regular updates, and verified backups — any one layer can fail, but multiple layers make reinfection harder.
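File integrity monitoring reduces to comparing a current hash manifest against a known-good one. The sketch below shows that comparison with SHA-256 on a throwaway directory. It is a generic illustration, not how Wordfence implements its scans, and the file names are invented.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Return {relative path: sha256} for every file under root."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare(known_good, current):
    """Classify differences between two manifests."""
    return {
        "modified": sorted(
            k for k in known_good if k in current and known_good[k] != current[k]
        ),
        "added": sorted(k for k in current if k not in known_good),
        "removed": sorted(k for k in known_good if k not in current),
    }


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "wp-includes").mkdir()
        (root / "index.php").write_text("<?php // core file\n")
        (root / "wp-includes" / "version.php").write_text("<?php // version\n")
        baseline = build_manifest(root)

        # Simulate tampering: one edited file and one unexpected new file.
        (root / "index.php").write_text("<?php // core file, altered\n")
        (root / "wp-includes" / "cache.php").write_text("<?php // unexpected\n")
        return compare(baseline, build_manifest(root))


report = demo()
print(report)
```

The "added" list is the part that would have surfaced the upload-directory backdoors, since those were new files rather than edits to existing ones.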
Real-World Hands-On · Where I came from

Before the homelab,
there was a production network to fix.

The lab work didn't appear out of nowhere. It's the formalization of years of production IT support, infrastructure troubleshooting, and operational discipline I developed across two roles.

2023 – 2025

IT Support Specialist · ProGranite Surfaces · Seattle, WA

Production IT in a stone fabrication facility — maintaining the LAN, servers, CNC equipment connectivity, and end-user infrastructure for a ~30-person shop. This is where the incident reports in this portfolio come from.

  • LAN & network — diagnosed and resolved a complete LAN failure (switching loop/broadcast storm, PGR-IR-01), redesigned the flat network with VLAN segmentation on UniFi/UDM Pro stack (VLANs for production, office, VoIP, guest)
  • server administration — Windows Server maintenance (AD user management, file shares, permissions), Linux server upkeep (Ubuntu/CentOS for internal tools), SQL database backup and verification
  • containerization — Docker deployment of management tools and internal web applications
  • security — WordPress compromise response (dual backdoor removal, WAF deployment, file integrity monitoring — PGR-IR-02), firewall rule management on UDM Pro, VPN access configuration for remote vendors
  • hardware — Workstation provisioning (Windows 10/11), printer/plotter setup and maintenance, CNC machine network connectivity, VoIP phone deployment, structured cabling and patch panel organization
2022 – 2023

IT Support · UW Continuum College · Seattle, WA

Tier 1 & 2 technical support in a higher-education environment, supporting faculty, staff, and classroom technology across the UW Continuum College campus.

  • endpoint support — Windows and Mac workstation configuration, deployment, and troubleshooting for faculty and administrative staff
  • classroom technology — AV equipment setup and maintenance, classroom computer troubleshooting, lecture capture system support
  • documentation — created and maintained troubleshooting guides, knowledge base articles, and standard operating procedures
  • enterprise applications — supported university-wide enterprise applications, Office 365, Active Directory account management, and asset inventory tracking
Project Hermes · Operations & Monitoring

The way I'd want a data center
to be operated.

Infrastructure visibility via Proxmox, Docker health checks, Uptime Kuma monitoring, and automated cron-based diagnostics. Practiced on hardware I own — ready for production environments where uptime matters.

2 Proxmox hosts
12 Active VMs
30+ Docker containers
43TB ZFS RAIDZ1 raw
Hermes health dashboard

Composite health view

Proxmox cluster health, Docker container status, storage utilization, network latency — on a dark dashboard tuned for at-a-glance reading.

Monitoring

Hypervisors

Dual Proxmox VE 8.x nodes — proxmox1 (Ryzen 5 5600G, 80GB RAM) and proxmox2 (Ryzen 5600GT, 64GB RAM). ZFS root, 10 GbE backbone.

Storage

TrueNAS Scale (2 instances). Primary pool: 3× Seagate Exos X18 16TB in RAIDZ1 = 32TB usable. Secondary: 1TB zvol for backup staging. NFS shares to both Proxmox hosts.
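The capacity figures are easier to compare once the units are explicit. One reading that reconciles "43TB raw" with "32TB usable" is that the raw figure is 48 TB expressed in binary units (TiB); that interpretation is an assumption, not something stated on this page. The arithmetic, ignoring ZFS metadata and padding overhead:

```python
TB = 10**12   # decimal terabyte, as printed on drive labels
TIB = 2**40   # binary tebibyte, as many storage UIs report


def raidz_capacity(disks, disk_tb, parity=1):
    """Return (raw_bytes, usable_bytes) for a single RAIDZ vdev, before overhead."""
    raw = disks * disk_tb * TB
    usable = (disks - parity) * disk_tb * TB
    return raw, usable


raw, usable = raidz_capacity(disks=3, disk_tb=16, parity=1)

print(f"raw:    {raw / TB:.0f} TB = {raw / TIB:.1f} TiB")
print(f"usable: {usable / TB:.0f} TB = {usable / TIB:.1f} TiB")
# raw:    48 TB = 43.7 TiB
# usable: 32 TB = 29.1 TiB
```

RAIDZ1 spends one disk's worth of space on parity, so a three-disk vdev keeps two thirds of its raw capacity.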

Automation & Tooling

Scripting isn't optional
in a modern infrastructure role.

Python, bash, and systemd, running 24/7 in containers and cron jobs I built and maintain. Automated health checks, security auditing, monitoring, and backup orchestration — all real, all operating in the homelab.

python · systemd · ai-agent ● running

Hermes Agent — AI-Powered Operations

Custom AI agent running cron-driven health checks across the homelab: automated VM diagnostics, security audit orchestration, container health verification, and incident report drafting. Interfaces with the Proxmox API and Docker socket for real-time infrastructure awareness.

  • Python · systemd service · auto-restart
  • Proxmox VE API + Docker SDK
  • Cron health checks · 24/7 operation
  • Automated security audit pipeline
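A container health check of this kind can be sketched without the Docker SDK by parsing the JSON lines that `docker ps --all --format '{{json .}}'` prints. This swaps in the CLI output for the SDK named above, and it is not the agent's actual code. The field names (`Names`, `State`, `Status`) are assumed to match recent Docker CLI versions, and the sample rows are invented.

```python
import json


def unhealthy_containers(ps_json_lines):
    """Return (name, reason) pairs for containers that are stopped or unhealthy."""
    problems = []
    for line in ps_json_lines.splitlines():
        line = line.strip()
        if not line:
            continue
        row = json.loads(line)
        name = row.get("Names", "?")
        state = row.get("State", "")
        status = row.get("Status", "")
        if state != "running":
            problems.append((name, f"state={state}"))
        elif "(unhealthy)" in status:
            problems.append((name, "health check failing"))
    return problems


# Invented sample standing in for real `docker ps` output.
SAMPLE = "\n".join(
    json.dumps(row)
    for row in [
        {"Names": "traefik", "State": "running", "Status": "Up 3 days (healthy)"},
        {"Names": "authentik-server", "State": "running", "Status": "Up 3 days (unhealthy)"},
        {"Names": "crowdsec", "State": "exited", "Status": "Exited (1) 5 minutes ago"},
    ]
)

problems = unhealthy_containers(SAMPLE)
for name, reason in problems:
    print(f"{name}: {reason}")
```

Keeping the parser separate from the command that produces the text makes it easy to test against captured output before wiring it into cron.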
bash · cron · audit ● running

Weekly Security Audit Pipeline

Automated weekly security audit covering: WordPress file integrity check against known-good hashes, full recursive malware scan of web roots (base64_decode/eval/gzinflate signature matching), SSH configuration audit (PermitRootLogin, password auth, key-only enforcement), and log review summary. Results logged and alerted via Telegram.

  • Bash + grep · cron weekly
  • WordPress hash verification
  • PHP backdoor signature matching
  • Automated report · Telegram alert
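The SSH portion of the audit can be illustrated in a few lines. The pipeline described above is Bash and grep; this Python sketch is a stand-in, not that script. It ignores `Match` blocks and `Include` directives, and the policy of requiring every setting to be stated explicitly (rather than relying on sshd defaults) is an assumption of the example.

```python
# Settings the audit expects to see explicitly (keys lowercased for comparison).
REQUIRED = {
    "permitrootlogin": "no",
    "passwordauthentication": "no",
    "pubkeyauthentication": "yes",
}


def parse_sshd_config(text):
    """Return {keyword: value}, keeping the first occurrence as sshd does."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        key, value = parts
        settings.setdefault(key.lower(), value.strip().lower())
    return settings


def audit(text, required=REQUIRED):
    """Return {keyword: observed value} for every setting that misses the policy."""
    settings = parse_sshd_config(text)
    return {
        key: settings.get(key, "(unset)")
        for key, wanted in required.items()
        if settings.get(key) != wanted
    }


# Hypothetical sshd_config excerpt.
SAMPLE = """
# Authentication
PermitRootLogin yes
PasswordAuthentication no
"""

failures = audit(SAMPLE)
print(failures)
```

An empty result means every required setting is present with the expected value; anything else is a candidate line for the weekly report.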
uptime-kuma · docker · webhook ● running

Infrastructure Monitoring & Alerting

Uptime Kuma monitoring 21 endpoints across the homelab — Proxmox hosts, TrueNAS web UI, Docker containers, Traefik dashboard, Authentik SSO, CrowdSec API, and external services. Push notification alerts via Telegram webhook on status changes. Custom notification templates with severity-based formatting.

  • 21 monitored endpoints
  • Telegram webhook notifications
  • Docker · auto-update
  • SSL certificate expiry checks
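Uptime Kuma sends Telegram notifications on its own, but the same kind of severity-formatted alert can be produced from a custom script through the Telegram Bot API `sendMessage` method. The sketch below only builds the request and does not send it. The token, chat ID, service name, and severity prefixes are placeholders invented for the example.

```python
import json
import urllib.request

# Hypothetical severity labels; adjust to taste.
SEVERITY_PREFIX = {
    "down": "[CRITICAL]",
    "degraded": "[WARNING]",
    "up": "[RECOVERED]",
}


def format_alert(service, status, detail=""):
    """Build a one-line alert with a severity prefix."""
    prefix = SEVERITY_PREFIX.get(status, "[INFO]")
    text = f"{prefix} {service} is {status}"
    return f"{text}: {detail}" if detail else text


def build_telegram_request(token, chat_id, text):
    """Prepare (but do not send) a Bot API sendMessage request."""
    url = f"https://api.telegram.org/bot{token}/sendMessage"
    body = json.dumps({"chat_id": chat_id, "text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_telegram_request(
    "TOKEN_PLACEHOLDER",
    "CHAT_ID_PLACEHOLDER",
    format_alert("traefik", "down", "HTTP check timed out"),
)

print(request.full_url)
print(request.data.decode("utf-8"))
# Sending would be: urllib.request.urlopen(request, timeout=10)
```

Separating formatting from delivery keeps the message templates testable without network access or a real bot token.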
bash · cron · nfs · zfs ● scripted

NFS Mount Recovery & Backup Orchestration

Five-minute cron job checking NFS mount health across Proxmox hosts — automatically remounts if dropped, logs failures. ZFS snapshot pipeline: automated daily snapshots on TrueNAS datasets with 14-day retention. Proxmox backup job targeting VM backups to ZFS storage with verified restore capability.

  • 5-min cron · health check + auto-repair
  • ZFS snapshots · daily · 14-day retention
  • Proxmox backup jobs · verified restores
  • NFS mount diagnostics · alert on failure
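The detection half of the mount check can be sketched by parsing `/proc/mounts`-style text and listing expected NFS mount points that are absent. The real job is a Bash cron script; this Python version is a stand-in, and the host name and paths are invented. It only detects missing mounts, not stale handles on mounts that are still listed.

```python
def nfs_mounts(proc_mounts_text):
    """Return {mount point: source} for NFS entries in /proc/mounts-style text."""
    found = {}
    for line in proc_mounts_text.splitlines():
        fields = line.split()
        # Fields: source, mount point, filesystem type, options, dump, pass
        if len(fields) >= 3 and fields[2] in ("nfs", "nfs4"):
            found[fields[1]] = fields[0]
    return found


def missing_mounts(expected, proc_mounts_text):
    """Return the expected mount points that are not currently mounted via NFS."""
    present = nfs_mounts(proc_mounts_text)
    return [mount_point for mount_point in expected if mount_point not in present]


# Hypothetical paths; a real run would read open("/proc/mounts").read().
EXPECTED = ["/mnt/pve/vmstore", "/mnt/pve/backups"]
SAMPLE = """\
/dev/sda2 / ext4 rw,relatime 0 0
truenas.lan:/mnt/tank/vmstore /mnt/pve/vmstore nfs4 rw,relatime,vers=4.2 0 0
tmpfs /run tmpfs rw,nosuid,nodev 0 0
"""

missing = missing_mounts(EXPECTED, SAMPLE)
print(missing)
```

A cron wrapper would then attempt a remount for each entry in the list and log or alert if the list is still non-empty afterwards.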
Open to opportunities

Hire the operator.
The certifications follow.

Targeting Network & System Administration roles in the Seattle area. Open to remote where appropriate. CCNA 200-301 and RHCSA/RHCE in deliberate pursuit — targeting end of 2026. A live homelab with dual Proxmox hosts, 30+ containers, and 43TB ZFS storage that I can demo on a call. Two full incident reports from real production environments ready to walk through in detail. And the operator instinct that comes from fourteen years of being the tech-curious person in every room.

// operator Andrew Shutov
whoami
shade@hermes ~ whoami --verbose
name      Andrew Shutov
role      Network & System Admin
target    Net/Sys Admin roles
location  Seattle, WA
cert      CCNA 200-301 · Target: End 2026
cert+     RHCSA/RHCE · Target: End 2026
lab       Project Hermes · dual Proxmox
status    available
shade@hermes ~ _