Why Logs Are Your Most Underutilized Security Asset
Every application generates logs. Authentication attempts, API calls, database queries, error messages, configuration changes, and file access events all leave traces in your log files. These logs contain the forensic evidence of attacks in progress, but most organizations treat them as operational debugging tools rather than security intelligence sources.
The gap between generating logs and actually analyzing them for security threats is enormous. Studies show that the average organization takes 197 days to identify a data breach. The evidence was almost certainly in the logs from day one, but no one was looking at it, or the volume was so overwhelming that the signal was lost in the noise.
The challenge is not a lack of data; it is a lack of intelligent analysis. A moderately busy web application generates millions of log entries per day. No human team can read them all. The question becomes: how do you build a system that can process this volume, identify the patterns that matter, and alert you before an attack succeeds?
Types of Log-Based Attacks
Brute Force Attacks
Brute force attacks are the most straightforward pattern to detect in logs because they produce a high volume of failed authentication events in a short time. An attacker trying to guess a password will generate hundreds or thousands of failed login attempts, each recorded as a log entry.
Here is what a brute force attack looks like in application logs:
```
2026-03-15 14:22:01 WARN Auth - Failed login: user=admin ip=198.51.100.47 reason=invalid_password
2026-03-15 14:22:01 WARN Auth - Failed login: user=admin ip=198.51.100.47 reason=invalid_password
2026-03-15 14:22:02 WARN Auth - Failed login: user=admin ip=198.51.100.47 reason=invalid_password
2026-03-15 14:22:02 WARN Auth - Failed login: user=admin ip=198.51.100.47 reason=invalid_password
2026-03-15 14:22:03 WARN Auth - Failed login: user=admin ip=198.51.100.47 reason=invalid_password
... [500 more entries in 3 minutes]
2026-03-15 14:25:17 INFO Auth - Successful login: user=admin ip=198.51.100.47
```
A simple threshold rule detects this easily: "Alert if more than 10 failed login attempts from the same IP within 5 minutes." But sophisticated attackers have adapted.
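A threshold rule like this can be implemented as a sliding window of failure timestamps per source IP. The sketch below uses only the Python standard library; the window and threshold values mirror the rule above and should be tuned to your own traffic.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 300   # 5-minute window, as in the rule above
THRESHOLD = 10         # alert after more than 10 failures

class BruteForceDetector:
    """Sliding-window count of failed logins per source IP."""

    def __init__(self, window=WINDOW_SECONDS, threshold=THRESHOLD):
        self.window = window
        self.threshold = threshold
        self.failures = defaultdict(deque)  # ip -> failure timestamps

    def record_failure(self, ip, ts):
        """Record one failed login; return True if an alert should fire."""
        q = self.failures[ip]
        q.append(ts)
        # Evict failures that have aged out of the window
        while q and ts - q[0] > self.window:
            q.popleft()
        return len(q) > self.threshold
```

Feeding each `WARN Auth - Failed login` event into `record_failure` as it is parsed gives an alert on the eleventh failure within any five-minute span.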
Credential Stuffing
Credential stuffing is a more sophisticated variant of brute force that uses stolen username/password pairs from previous data breaches. Instead of trying many passwords against one account, the attacker tries one stolen credential pair against many accounts. This distributes the failed attempts across thousands of different usernames, making per-account threshold rules ineffective.
```
2026-03-15 14:22:01 WARN Auth - Failed login: user=john.smith@email.com ip=203.0.113.12
2026-03-15 14:22:01 WARN Auth - Failed login: user=jane.doe@email.com ip=203.0.113.12
2026-03-15 14:22:02 WARN Auth - Failed login: user=bob.jones@email.com ip=203.0.113.12
2026-03-15 14:22:02 INFO Auth - Successful login: user=alice.w@email.com ip=203.0.113.12
2026-03-15 14:22:03 WARN Auth - Failed login: user=mike.chen@email.com ip=203.0.113.12
2026-03-15 14:22:03 WARN Auth - Failed login: user=sarah.k@email.com ip=203.0.113.12
```
Notice the pattern: one attempt per user, high velocity across different accounts, and a success rate of about 1-3% (which is typical for credential stuffing using breach data). Each individual user sees at most one failed attempt, so per-user threshold rules would never fire. Detection requires analyzing the aggregate pattern across all users from a given IP or IP range.
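Detecting this aggregate pattern means counting distinct usernames per source IP rather than failures per user. A minimal sketch, assuming time-ordered events and illustrative threshold values:

```python
from collections import defaultdict

def detect_credential_stuffing(events, window=60, min_users=20):
    """Flag IPs that attempt logins against many distinct users
    in a short window.

    events: time-ordered (timestamp, ip, user, success) tuples.
    window / min_users are illustrative and need tuning per site.
    """
    attempts = defaultdict(list)  # ip -> [(ts, user), ...]
    flagged = set()
    for ts, ip, user, success in events:
        # Keep only attempts still inside the sliding window
        recent = [(t, u) for t, u in attempts[ip] if ts - t <= window]
        recent.append((ts, user))
        attempts[ip] = recent
        # The tell: many *distinct* usernames, regardless of outcome
        if len({u for _, u in recent}) >= min_users:
            flagged.add(ip)
    return flagged
```

Because the rule keys on distinct usernames per IP, a single user retrying a forgotten password never trips it, while a stuffing run across thousands of accounts does.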
Slow and Low Attacks
Slow and low attacks are specifically designed to evade threshold-based detection rules. Instead of sending 1,000 requests per minute, the attacker sends 1 request every 2 minutes, staying well below any reasonable rate limit. Over the course of days or weeks, they can test thousands of credentials or enumerate sensitive data without triggering a single alert.
These attacks are invisible to simple rule-based detection because each individual time window looks normal. Detection requires long-term behavioral analysis: comparing current patterns against weeks of historical baselines to identify statistically significant deviations that only become apparent over extended periods.
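One simple form of that long-term analysis is a z-score test on daily aggregates: each day's count looks unremarkable on its own, but the accumulated deviation from the historical baseline is statistically significant. A stdlib-only sketch with an illustrative threshold:

```python
import statistics

def is_slow_low_anomaly(history, today, z_threshold=3.0):
    """Flag a statistically significant deviation from baseline.

    history: daily failed-login counts for a user or IP over past weeks.
    today:   today's count. Every individual day can sit far below any
             rate limit and still be anomalous in aggregate.
    """
    if len(history) < 14:
        return False  # need at least two weeks of clean baseline
    mean = statistics.mean(history)
    stdev = statistics.pstdev(history) or 1.0  # guard constant baselines
    return (today - mean) / stdev > z_threshold
```

An attacker sending one attempt every two minutes still produces dozens of failures per day; against a baseline of near-zero failures, that deviation stands out even though no five-minute window ever looks abnormal.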
Data Exfiltration
Data exfiltration is often the ultimate goal of an attack, yet it is one of the hardest patterns to detect because it mimics legitimate user behavior. An attacker who has compromised a valid user account will access data using normal API endpoints and legitimate credentials. The difference is in the pattern of access.
```
# Normal user behavior (baseline):
# - 20-30 customer record lookups per day
# - All within business hours
# - Records from assigned territory

# Exfiltration pattern:
2026-03-15 02:14:01 INFO API - GET /api/customers?page=1&size=100 user=compromised_user
2026-03-15 02:14:03 INFO API - GET /api/customers?page=2&size=100 user=compromised_user
2026-03-15 02:14:05 INFO API - GET /api/customers?page=3&size=100 user=compromised_user
... [200 more pages downloaded in 10 minutes at 2 AM]
```
Key indicators of data exfiltration visible in logs include:
- Unusual access volume — A user who normally views 20 records per day suddenly downloads 10,000 records.
- Off-hours activity — Data access at 2 AM from a user who has never worked past 6 PM.
- Geographic anomalies — Access from a country where the user has never logged in before.
- Sequential access patterns — Paginating through entire database tables rather than searching for specific records.
- Response size anomalies — API responses that are significantly larger than the user's historical average.
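These indicators can be combined into a simple additive risk score per access session. The field names, weights, and thresholds below are illustrative, not a production scoring model:

```python
def exfiltration_score(user_baseline, activity):
    """Additive risk score from the indicators above (illustrative weights).

    user_baseline: e.g. {"daily_records": 25, "work_hours": (8, 18),
                         "countries": {"US"}}
    activity:      e.g. {"records": 10000, "hour": 2, "country": "RO",
                         "sequential_pages": 200}
    """
    score = 0
    if activity["records"] > 10 * user_baseline["daily_records"]:
        score += 40  # unusual access volume
    start, end = user_baseline["work_hours"]
    if not (start <= activity["hour"] < end):
        score += 20  # off-hours activity
    if activity["country"] not in user_baseline["countries"]:
        score += 20  # geographic anomaly
    if activity["sequential_pages"] > 50:
        score += 20  # sequential table-scan access pattern
    return score
```

The point of the additive structure is that no single indicator is conclusive; a user working late scores 20, while the 2 AM paginated bulk download from a new country in the example above scores 100.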
Rule-Based Detection vs. AI-Powered Analysis
The Limits of Static Rules
Traditional log analysis relies on static rules: predefined conditions that trigger alerts when met. Common examples include "alert on 5 failed logins in 5 minutes" or "alert when admin panel is accessed from a non-whitelisted IP." These rules are effective against known, well-characterized attack patterns but fail in several critical scenarios:
- Adaptive attackers — Once attackers know the threshold, they stay just below it. If your rule triggers at 10 failed logins per minute, the attacker will try 9.
- Unknown attack patterns — Rules can only detect what they were written to detect. Novel attack techniques that do not match existing rule patterns are invisible.
- Rule explosion — As you try to cover more scenarios, the rule set balloons beyond what any team can maintain. Organizations with 500+ rules struggle with maintenance, conflicts between rules, and diminishing returns.
- Alert fatigue — Overly sensitive rules generate thousands of alerts per day, most of which are false positives. Security teams stop investigating alerts when 95% are noise.
AI-Powered Anomaly Detection
Machine learning-based detection works fundamentally differently from static rules. Instead of defining what "bad" looks like, it learns what "normal" looks like and flags deviations. This approach has several advantages:
- Adaptive baselines — The model continuously updates its understanding of normal behavior. If your application traffic naturally increases during a holiday sale, the model adjusts its baseline rather than generating hundreds of false positives.
- Multi-dimensional analysis — Humans write rules that check one or two dimensions (failed logins + time window). ML models analyze dozens of dimensions simultaneously: time of day, request frequency, response sizes, user agent strings, geographic location, API endpoint distribution, and error rates.
- Detects slow/low attacks — By comparing behavior across weeks of data, ML models can identify patterns that are statistically impossible to detect within a single time window.
- Reduces false positives — Context-aware models understand that 100 failed logins during a load test are normal but 10 failed logins at 3 AM from a foreign IP are suspicious.
Best practice: Use rules and AI together, not one or the other. Rules are excellent for known, high-confidence patterns (such as login from a known-malicious IP). AI excels at detecting unknown, subtle patterns that rules would miss. The combination provides both reliability and adaptability.
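A common way to structure this hybrid is a two-stage evaluator: deterministic rules fire first on known-bad signals, and a model score handles everything else. The sketch below stubs the model as any callable returning 0.0 to 1.0; the score thresholds are illustrative:

```python
def evaluate_event(event, blocklist, anomaly_score):
    """Hybrid evaluation: high-confidence rules first, model second.

    event:         dict describing one log event (must include "ip")
    blocklist:     set of known-malicious IPs (a rule-based signal)
    anomaly_score: callable(event) -> float in [0, 1]; in practice a
                   trained model, here any scoring function
    """
    # Stage 1: deterministic rules for known, high-confidence patterns
    if event["ip"] in blocklist:
        return ("alert", "rule:known_malicious_ip")
    # Stage 2: model score for everything the rules don't cover
    score = anomaly_score(event)
    if score > 0.9:
        return ("alert", f"ml:anomaly_score={score:.2f}")
    if score > 0.7:
        return ("review", f"ml:anomaly_score={score:.2f}")
    return ("ok", None)
```

Keeping the rule stage first preserves the reliability of known detections even if the model misbehaves, while the model stage catches the subtle patterns no rule was written for.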
SIEM Limitations and Why You Need More
Security Information and Event Management (SIEM) platforms have been the standard tool for log-based security analysis for over a decade. However, modern application security challenges have exposed significant limitations in traditional SIEM approaches:
Volume and Cost
SIEM platforms typically charge based on data ingestion volume (gigabytes per day). As applications grow and generate more logs, SIEM costs grow proportionally. Organizations often face a choice between comprehensive logging (expensive) and selective logging (incomplete). This creates dangerous gaps where attack evidence falls into the logs you chose not to collect.
Correlation Complexity
Most SIEMs require analysts to write correlation rules in proprietary query languages. Correlating events across multiple log sources (web server logs, application logs, database audit logs, authentication logs) requires expertise in both the query language and the attack patterns being searched for. Many organizations lack the specialized staff to write and maintain these correlations effectively.
Alert Triage Burden
A typical enterprise SIEM generates 500 to 10,000 alerts per day. Security Operations Center (SOC) analysts must triage each alert, investigate the context, and determine whether it represents a real threat. When false positive rates exceed 90%, which is common in poorly tuned SIEMs, analysts become desensitized and critical alerts are missed.
Building Behavioral Baselines
Effective anomaly detection requires a solid understanding of what "normal" looks like for your specific application. Behavioral baselines capture the typical patterns of user behavior, system behavior, and traffic patterns that characterize your application during normal operation.
What to Baseline
- Authentication patterns — Typical login times, geographic locations, devices, and success/failure ratios for each user and for the application as a whole.
- API usage patterns — Which endpoints each user or role typically accesses, at what frequency, and with what response sizes.
- Error rates — Normal error rates by endpoint, by time of day, and by user segment. A sudden spike in 500 errors or a specific error type can indicate an attack in progress.
- Data access volumes — How much data each user typically reads or exports. This is critical for detecting data exfiltration.
- Session behavior — Typical session durations, pages per session, and navigation patterns. Automated attackers have very different session profiles than human users.
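Building these baselines from historical logs amounts to aggregating per-user statistics. A minimal sketch, assuming events have already been parsed into tuples (the field names are illustrative):

```python
from collections import defaultdict
import statistics

def build_baselines(events):
    """Aggregate clean historical logs into per-user baselines.

    events: iterable of (user, hour_of_day, records_accessed) tuples
            drawn from a known-clean collection period.
    Returns per-user typical hours and record-volume statistics,
    suitable for later z-score style anomaly checks.
    """
    per_user = defaultdict(lambda: {"hours": [], "records": []})
    for user, hour, records in events:
        per_user[user]["hours"].append(hour)
        per_user[user]["records"].append(records)
    baselines = {}
    for user, obs in per_user.items():
        baselines[user] = {
            "typical_hours": (min(obs["hours"]), max(obs["hours"])),
            "mean_records": statistics.mean(obs["records"]),
            "stdev_records": statistics.pstdev(obs["records"]),
        }
    return baselines
```

The same aggregation pattern extends to the other dimensions listed above (endpoints, error rates, session metrics); each becomes another per-user distribution to compare live traffic against.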
Baseline Collection Period
A meaningful baseline requires at least two to four weeks of clean data, covering weekdays, weekends, and ideally a month-end processing period. The baseline should be collected during a period when no known incidents are occurring. If your historical data includes undetected attacks, those patterns will be incorporated into the baseline as "normal," reducing detection effectiveness.
Attack Chain Correlation
Individual log events rarely tell the full story of an attack. A failed login is just a failed login. But a failed login followed by a successful login from a different country, followed by a password change, followed by mass data download, followed by account deletion — that is a complete account takeover attack chain.
Attack chain correlation connects related events across time and across log sources to reconstruct the full narrative of an attack. This is where log analysis transcends simple monitoring and becomes true security intelligence.
Example: Reconstructing an Account Takeover
```
# Phase 1: Reconnaissance (Day 1-3)
Auth log: 3 failed logins for user=target@company.com from ip=198.51.100.0/24
Web log:  Password reset page visited from ip=198.51.100.12

# Phase 2: Credential Compromise (Day 4)
Email log: Password reset email sent to user=target@company.com
Auth log:  Password reset completed from ip=198.51.100.12 (not user's usual IP)
Auth log:  Successful login from ip=198.51.100.12

# Phase 3: Persistence (Day 4, +5 minutes)
API log: POST /api/account/mfa/disable user=target@company.com
API log: POST /api/account/email/change user=target@company.com
API log: POST /api/account/api-keys/create user=target@company.com

# Phase 4: Data Exfiltration (Day 4-5)
API log: GET /api/reports/financial?range=all user=target@company.com [2.3 GB]
API log: GET /api/customers/export user=target@company.com [890 MB]

# Phase 5: Covering Tracks (Day 5)
API log: DELETE /api/account/audit-log user=target@company.com
Auth log: Password changed from ip=198.51.100.12
```
Each individual event in this chain could appear benign. People forget passwords. People disable MFA temporarily. People download reports. But the sequence, timing, and context make the malicious intent unmistakable. Effective correlation engines evaluate the entire chain and assign a composite risk score that is far higher than the sum of individual event scores.
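One way to implement that composite scoring is to track how far a user's event stream progresses through a known chain within a time horizon, and score superlinearly in the number of matched phases. The phase names and scoring formula below are illustrative, not a standard taxonomy:

```python
# Hypothetical phase labels for the takeover chain shown above
ACCOUNT_TAKEOVER_CHAIN = [
    "failed_logins", "password_reset", "new_ip_login",
    "mfa_disabled", "bulk_export",
]

def chain_risk(events, chain=ACCOUNT_TAKEOVER_CHAIN, horizon=5 * 24 * 3600):
    """Score one user's events against a known attack chain.

    events: time-ordered (timestamp, event_type) tuples for one user.
    Matches chain phases in order within the horizon; the quadratic
    score makes a full chain worth far more than its parts.
    """
    idx, first_ts = 0, None
    for ts, etype in events:
        if idx < len(chain) and etype == chain[idx]:
            if first_ts is None:
                first_ts = ts  # chain clock starts at the first phase
            if ts - first_ts <= horizon:
                idx += 1
    return idx ** 2 * 10  # superlinear: 1 phase = 10, all 5 = 250
```

A single matched phase (someone resetting a forgotten password) scores 10 and is ignorable; the full five-phase sequence scores 250, far above the sum of the individual events, which is exactly the property the prose above calls for.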
Real-Time Alerting Best Practices
The gap between detection and response determines the impact of a security incident. A credential stuffing attack detected in 30 seconds can be blocked with zero accounts compromised. The same attack detected after 4 hours may result in thousands of compromised accounts.
Tiered Alerting Strategy
| Tier | Response Time | Examples | Action |
|---|---|---|---|
| Critical (P1) | Under 5 minutes | Active data exfiltration, admin account compromise, mass account takeover | Automated blocking + immediate page to on-call |
| High (P2) | Under 30 minutes | Credential stuffing in progress, anomalous API usage, privilege escalation attempt | Automated rate limiting + alert to security team |
| Medium (P3) | Under 4 hours | Brute force from single IP, unusual geographic login, elevated error rates | Queue for analyst review during business hours |
| Low (P4) | Next business day | Minor anomalies, informational findings, policy violations | Add to daily security digest report |
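In code, the tier table above reduces to a lookup from detection type to (tier, action), with a safe default so unclassified detections are reviewed rather than dropped. The detection and action names are illustrative:

```python
# Illustrative mapping from detection type to (tier, response action),
# mirroring the tier table above
TIERS = {
    "data_exfiltration":     ("P1", "page_oncall_and_block"),
    "admin_compromise":      ("P1", "page_oncall_and_block"),
    "credential_stuffing":   ("P2", "rate_limit_and_alert"),
    "brute_force_single_ip": ("P3", "queue_for_review"),
    "minor_anomaly":         ("P4", "daily_digest"),
}

def route_alert(detection_type):
    """Map a detection to its tier and response action. Unknown
    detection types default to analyst review, never silence."""
    return TIERS.get(detection_type, ("P3", "queue_for_review"))
```

The deliberate design choice is the fallback: a detection your routing table has never seen lands in the analyst queue instead of vanishing.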
Automated Response Actions
For high-confidence, high-severity detections, automated response actions dramatically reduce the window of exposure:
- Dynamic rate limiting — When credential stuffing is detected, automatically apply aggressive rate limits to the offending IP ranges.
- Session termination — When an account takeover is detected, immediately invalidate all active sessions for the affected user.
- Temporary account lockout — When a compromised account is identified, lock it and require identity verification before re-enabling access.
- Enrichment and escalation — Automatically enrich alerts with context (IP reputation, geographic data, user history) and escalate to the appropriate team.
Caution: Automated response actions can cause denial-of-service if triggered by false positives. Always test automated responses thoroughly in staging environments and start with conservative thresholds. An attacker who can trigger your automated lockout system can effectively lock out legitimate users.
Putting It All Together
Effective security log analysis is not about any single technique. It is the combination of comprehensive log collection, intelligent analysis (both rules and AI), behavioral baselining, attack chain correlation, and tiered alerting that transforms raw log data into actionable security intelligence. Organizations that invest in this capability dramatically reduce their mean time to detection and mean time to response, cutting the cost and impact of security incidents by orders of magnitude.
Detect These Issues Automatically with Security Factor 365
Security Factor 365's log intelligence engine uses AI-powered anomaly detection and attack chain correlation to find threats that rules-based systems miss. From brute force to data exfiltration, get real-time visibility into your application's security posture.
Enable Smart Log Analysis