Security Analysis Guide

Code Scalpel detects security vulnerabilities using taint tracking and static analysis. This guide covers how to detect vulnerabilities, analyze security flows, and integrate security scanning into your workflow.

Security Analysis Overview

Code Scalpel detects vulnerabilities by tracking how untrusted data (taint sources) flows to dangerous operations (sinks).

graph LR
    A[User Input<br/>Taint Source] --> B[Data Processing]
    B --> C[Database Query<br/>Sink]

    style A fill:#ff6b6b
    style C fill:#ffd93d
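To make the source-to-sink idea concrete, here is a toy taint tracker built on Python's `ast` module. It is only a sketch of the general technique, not Code Scalpel's implementation: the `SOURCES` and `SINKS` sets and every function name are assumptions chosen for illustration, and it handles only straight-line, top-level assignments.

```python
import ast

# Hypothetical source and sink call names, chosen to match the examples in this guide.
SOURCES = {"request.args.get"}
SINKS = {"cursor.execute"}


def call_name(node: ast.AST) -> str:
    """Return a dotted name such as 'cursor.execute' for a call node, else ''."""
    if isinstance(node, ast.Call):
        return ast.unparse(node.func)
    return ""


def names_in(node: ast.AST) -> set[str]:
    """Collect every variable name that appears inside an expression."""
    return {n.id for n in ast.walk(node) if isinstance(n, ast.Name)}


def find_tainted_sinks(source_code: str) -> list[int]:
    """Return line numbers where a tainted variable reaches a sink call."""
    tainted: set[str] = set()
    hits: list[int] = []
    for stmt in ast.parse(source_code).body:
        if isinstance(stmt, ast.Assign):
            value = stmt.value
            # Taint starts at a source call and spreads through assignments.
            if call_name(value) in SOURCES or names_in(value) & tainted:
                for target in stmt.targets:
                    if isinstance(target, ast.Name):
                        tainted.add(target.id)
        elif isinstance(stmt, ast.Expr) and call_name(stmt.value) in SINKS:
            # A sink call is reported when any argument mentions a tainted name.
            if any(names_in(arg) & tainted for arg in stmt.value.args):
                hits.append(stmt.lineno)
    return hits


SAMPLE = '''user_id = request.args.get('id')
query = f"SELECT * FROM users WHERE id = {user_id}"
cursor.execute(query)
'''

print(find_tainted_sinks(SAMPLE))  # [3]
```

A real analyzer also has to model branches, function calls, and sanitizers, which this sketch deliberately leaves out.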

Supported Vulnerability Types

| Vulnerability | CWE | Severity | Detection |
| --- | --- | --- | --- |
| SQL Injection | CWE-89 | Critical | Taint tracking |
| Command Injection | CWE-78 | Critical | Taint tracking |
| XSS | CWE-79 | High | Taint tracking |
| Path Traversal | CWE-22 | High | Taint tracking |
| LDAP Injection | CWE-90 | High | Taint tracking |
| NoSQL Injection | CWE-943 | High | Taint tracking |
| SSRF | CWE-918 | High | Taint tracking |
| Hardcoded Secrets | CWE-798 | Medium | Pattern matching |

Single-File Security Scan

Basic Usage

# AI prompt
"Scan api/views.py for security vulnerabilities"

The AI will use security_scan:

{
  "file_path": "api/views.py",
  "confidence_threshold": 0.7
}

Understanding Results

{
  "vulnerabilities": [
    {
      "type": "SQL_INJECTION",
      "severity": "CRITICAL",
      "cwe": "CWE-89",
      "line": 45,
      "function": "get_user",
      "source": "user_id (request.args.get)",
      "sink": "cursor.execute(query)",
      "taint_flow": [
        {"line": 42, "code": "user_id = request.args.get('id')"},
        {"line": 44, "code": "query = f\"SELECT * FROM users WHERE id = {user_id}\""},
        {"line": 45, "code": "cursor.execute(query)"}
      ],
      "confidence": 0.95,
      "remediation": "Use parameterized queries: cursor.execute('SELECT * FROM users WHERE id = ?', (user_id,))"
    }
  ],
  "summary": {
    "critical": 1,
    "high": 0,
    "medium": 0,
    "low": 0,
    "total": 1
  }
}
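If you post-process results yourself, a few lines of Python are enough to turn each finding into a readable summary. The sketch below assumes the field names shown in the example result (`severity`, `type`, `cwe`, `line`, `confidence`, `taint_flow`); the `describe` helper is a hypothetical name, and the sample is a shortened copy of that result.

```python
import json

# Sample shaped like the result above; field names are taken from that example.
RESULT_JSON = """
{
  "vulnerabilities": [
    {
      "type": "SQL_INJECTION",
      "severity": "CRITICAL",
      "cwe": "CWE-89",
      "line": 45,
      "taint_flow": [
        {"line": 42, "code": "user_id = request.args.get('id')"},
        {"line": 45, "code": "cursor.execute(query)"}
      ],
      "confidence": 0.95
    }
  ]
}
"""


def describe(finding: dict) -> str:
    """Format one finding as a header line followed by its taint flow steps."""
    header = (
        f"{finding['severity']} {finding['type']} ({finding['cwe']}) "
        f"at line {finding['line']}, confidence {finding['confidence']:.2f}"
    )
    steps = [f"  L{step['line']}: {step['code']}" for step in finding["taint_flow"]]
    return "\n".join([header, *steps])


report = json.loads(RESULT_JSON)
for finding in report["vulnerabilities"]:
    print(describe(finding))
```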

Adjusting Confidence Threshold

# Higher threshold = fewer false positives, but some real issues may be filtered out
"Scan api/views.py with high confidence threshold (0.9)"

# Lower threshold = more findings to review, including more false positives
"Scan api/views.py with low confidence threshold (0.5)"

Cross-File Security Analysis

Why Cross-File Analysis?

Many vulnerabilities span multiple files:

# routes.py
@app.route('/search')
def search():
    query = request.args.get('q')  # Source: user input
    results = db.search(query)      # Passes to another file
    return render(results)

# db.py
def search(term):
    # Sink: SQL execution
    cursor.execute(f"SELECT * FROM items WHERE name LIKE '%{term}%'")

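Single-file analysis misses this because routes.py contains only the source and db.py contains only the sink, so neither file looks vulnerable on its own. Cross-file analysis follows the tainted value through the db.search call and reports the full flow.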
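Once such a flow is reported, one way to fix the sink in db.py is to bind the search term as a parameter. This is a minimal sketch, assuming SQLite's `?` placeholder style; the in-memory `items` table exists only to make the example runnable, and `search` takes a connection argument here (unlike the original) so it can run on its own.

```python
import sqlite3


def search(conn: sqlite3.Connection, term: str) -> list[tuple]:
    """Search items by name with the user-supplied term bound as a parameter."""
    # The wildcards are added to the bound value, never to the SQL text itself.
    pattern = f"%{term}%"
    cursor = conn.execute("SELECT name FROM items WHERE name LIKE ?", (pattern,))
    return cursor.fetchall()


# Throwaway database used only for this demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (name TEXT)")
conn.executemany("INSERT INTO items VALUES (?)", [("red apple",), ("green pear",)])

print(search(conn, "apple"))        # [('red apple',)]
print(search(conn, "' OR '1'='1"))  # []
```

Note that `%` and `_` inside the term still act as LIKE wildcards. That affects which rows match, not whether SQL can be injected.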

Running Cross-File Scan

# AI prompt
"Run a cross-file security scan on the entire project"

The AI uses cross_file_security_scan:

{
  "project_root": "./src",
  "entry_points": ["routes.py:search", "api.py:handle_request"],
  "max_depth": 5
}

Cross-File Results

{
  "vulnerabilities": [
    {
      "type": "SQL_INJECTION",
      "severity": "CRITICAL",
      "source_file": "routes.py",
      "source_line": 15,
      "sink_file": "db.py",
      "sink_line": 23,
      "cross_file_flow": [
        {"file": "routes.py", "line": 15, "code": "query = request.args.get('q')"},
        {"file": "routes.py", "line": 16, "code": "results = db.search(query)"},
        {"file": "db.py", "line": 22, "code": "def search(term):"},
        {"file": "db.py", "line": 23, "code": "cursor.execute(f\"SELECT...{term}...\")"}
      ]
    }
  ],
  "taint_entry_points": [
    {"file": "routes.py", "function": "search", "source": "request.args.get"}
  ],
  "mermaid_diagram": "graph TD\n..."
}
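When reviewing a long cross-file flow, it can help to collapse the steps into per-file hops. The sketch below assumes the `cross_file_flow` shape from the example result (a list of steps with `file`, `line`, and `code`); `file_hops` is an illustrative helper, not a Code Scalpel function.

```python
from itertools import groupby

# Steps copied from the cross_file_flow example above (last code string shortened).
FLOW = [
    {"file": "routes.py", "line": 15, "code": "query = request.args.get('q')"},
    {"file": "routes.py", "line": 16, "code": "results = db.search(query)"},
    {"file": "db.py", "line": 22, "code": "def search(term):"},
    {"file": "db.py", "line": 23, "code": "cursor.execute(...)"},
]


def file_hops(flow: list[dict]) -> list[tuple[str, list[int]]]:
    """Collapse consecutive steps in the same file into (file, [lines]) hops."""
    return [
        (file, [step["line"] for step in steps])
        for file, steps in groupby(flow, key=lambda step: step["file"])
    ]


for file, lines in file_hops(FLOW):
    print(f"{file}: lines {lines}")
```

For the sample flow this prints one hop for routes.py (lines 15 and 16) followed by one hop for db.py (lines 22 and 23).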

Tier Limits for Cross-File Scan

| Tier | Max Depth | Max Modules | Timeout |
| --- | --- | --- | --- |
| Community | 3 | 50 | 60s |
| Pro | 10 | 200 | 180s |
| Enterprise | Unlimited | Unlimited | 600s |
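Note that the request example earlier asks for `max_depth: 5`, which exceeds the Community limit of 3. This guide does not say whether an over-limit request is clamped or rejected, so the sketch below simply assumes clamping to show how the limits compare. The dictionary layout and the `effective_depth` name are illustrative.

```python
from typing import Optional

# Limits copied from the table above; None stands for "Unlimited".
TIER_LIMITS: dict[str, dict[str, Optional[int]]] = {
    "community": {"max_depth": 3, "max_modules": 50, "timeout_s": 60},
    "pro": {"max_depth": 10, "max_modules": 200, "timeout_s": 180},
    "enterprise": {"max_depth": None, "max_modules": None, "timeout_s": 600},
}


def effective_depth(tier: str, requested_depth: int) -> int:
    """Return the depth a scan would use if the tier cap were applied as a clamp."""
    cap = TIER_LIMITS[tier]["max_depth"]
    return requested_depth if cap is None else min(requested_depth, cap)


print(effective_depth("community", 5))   # 3
print(effective_depth("pro", 5))         # 5
print(effective_depth("enterprise", 50)) # 50
```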

Dependency Vulnerability Scanning

Scanning Dependencies

# AI prompt
"Scan my project dependencies for known vulnerabilities"

The AI uses scan_dependencies:

{
  "project_root": "./",
  "include_dev": true,
  "scan_vulnerabilities": true
}

Results

{
  "dependencies": [
    {
      "name": "requests",
      "version": "2.25.0",
      "vulnerabilities": [
        {
          "id": "CVE-2023-32681",
          "severity": "HIGH",
          "description": "Proxy-Authorization header leaked",
          "fixed_version": "2.31.0"
        }
      ]
    }
  ],
  "summary": {
    "total_packages": 45,
    "vulnerable_packages": 1,
    "critical_vulns": 0,
    "high_vulns": 1
  }
}
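A report like this can be reduced to a list of upgrade targets. The following sketch assumes the result shape shown above and plain dotted version numbers; `parse_version` and `upgrades_needed` are hypothetical helpers, and the `flask` entry is invented only to show a package without findings.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Turn '2.31.0' into (2, 31, 0); only handles plain dotted integers."""
    return tuple(int(part) for part in version.split("."))


def upgrades_needed(report: dict) -> dict[str, str]:
    """Map each vulnerable package to the lowest version that fixes all findings."""
    needed: dict[str, str] = {}
    for dep in report["dependencies"]:
        fixes = [vuln["fixed_version"] for vuln in dep["vulnerabilities"]]
        if fixes:
            # The highest fixed_version covers every listed vulnerability.
            needed[dep["name"]] = max(fixes, key=parse_version)
    return needed


REPORT = {
    "dependencies": [
        {
            "name": "requests",
            "version": "2.25.0",
            "vulnerabilities": [
                {"id": "CVE-2023-32681", "severity": "HIGH", "fixed_version": "2.31.0"}
            ],
        },
        # Hypothetical clean package, added for contrast.
        {"name": "flask", "version": "3.0.0", "vulnerabilities": []},
    ]
}

print(upgrades_needed(REPORT))  # {'requests': '2.31.0'}
```

For versions with pre-release or local suffixes, a dedicated version parser would be a safer choice than this tuple comparison.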

Security Workflows

Pre-Commit Security Check

# .pre-commit-config.yaml
repos:
  - repo: local
    hooks:
      - id: code-scalpel-security
        name: Security Scan
        entry: code-scalpel scan --security
        language: system
        types: [python]
        pass_filenames: true

CI/CD Security Gate

# GitHub Actions
jobs:
  security:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      - name: Install Code Scalpel
        run: pip install codescalpel-pro  # or enterprise

      - name: Security Scan
        run: |
          code-scalpel scan --security ./src --format json > security.json

      - name: Check Results
        run: |
          python -c "
          import json
          import sys
          with open('security.json') as f:
              data = json.load(f)
          criticals = data['summary']['critical']
          if criticals > 0:
              print(f'Found {criticals} critical vulnerabilities!')
              sys.exit(1)
          "

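The inline check can also live in a small script, which is easier to test and to extend to other severities. The sketch below is one possible shape, assuming the JSON written by the scan step has the same `summary` keys as the result shown under Understanding Results; `blocking_count` and `exit_code` are illustrative names, not part of Code Scalpel.

```python
import json


def blocking_count(summary: dict, blocking: tuple[str, ...] = ("critical",)) -> int:
    """Sum the findings in every severity level that should fail the build."""
    return sum(int(summary.get(level, 0)) for level in blocking)


def exit_code(report_text: str, blocking: tuple[str, ...] = ("critical",)) -> int:
    """Return 1 if the report contains blocking findings, otherwise 0."""
    summary = json.loads(report_text)["summary"]
    return 1 if blocking_count(summary, blocking) else 0


SAMPLE = '{"summary": {"critical": 1, "high": 0, "medium": 0, "low": 0, "total": 1}}'

print(exit_code(SAMPLE))                     # 1
print(exit_code(SAMPLE, blocking=("high",))) # 0
```

In a workflow step, the script would read security.json and pass the result of `exit_code` to `sys.exit`, so a nonzero value fails the job.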
Weekly Security Audit

# .github/workflows/security-audit.yml
name: Weekly Security Audit

on:
  schedule:
    - cron: '0 9 * * 1'  # Mondays at 09:00 UTC (GitHub Actions schedules run in UTC)
  workflow_dispatch:

jobs:
  audit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Full Security Scan
        run: |
          pip install codescalpel-enterprise
          code-scalpel scan --security --cross-file ./src > audit.json
          code-scalpel scan --dependencies ./requirements.txt >> audit.json

      - name: Create Issue
        uses: peter-evans/create-issue-from-file@v5
        with:
          title: Weekly Security Audit
          content-filepath: audit.json

Compliance & Audit

Enterprise teams can pair security scans with governance enforcement to support SOC 2, HIPAA, GDPR, and PCI-DSS review workflows.

  • Use code_policy_check to evaluate code against compliance-oriented rulesets.
  • Use verify_policy_integrity before audits to confirm governance files have not been tampered with.
  • Combine cross_file_security_scan results with CI artifact retention for audit evidence.

Custom Security Rules

Adding Custom Sinks

For Pro and Enterprise tiers, you can define custom sinks in governance.yaml:

# .code-scalpel/governance.yaml
security:
  custom_sinks:
    - name: "audit_log"
      patterns:
        - "logger.audit(*)"
        - "AuditLog.write(*)"
      sensitivity: "HIGH"
      reason: "Audit logs may contain sensitive data"

    - name: "payment_processor"
      patterns:
        - "payment.process(*)"
        - "stripe.charge(*)"
      sensitivity: "CRITICAL"
      reason: "Payment data requires extra scrutiny"

Adding Custom Sources

security:
  custom_sources:
    - name: "websocket_input"
      patterns:
        - "ws.receive(*)"
        - "socket.recv(*)"
      taint_level: "HIGH"

    - name: "queue_message"
      patterns:
        - "queue.get(*)"
        - "redis.lpop(*)"
      taint_level: "MEDIUM"

Custom Sanitizers

security:
  sanitizers:
    - name: "html_escape"
      patterns:
        - "bleach.clean(*)"
        - "escape_html(*)"
      neutralizes:
        - "XSS"

    - name: "sql_param"
      patterns:
        - "sqlalchemy.text(*).bindparams(**)"
      neutralizes:
        - "SQL_INJECTION"
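The patterns in these rules look like shell-style globs over call expressions. This guide does not specify the exact matching semantics, so the sketch below is only an approximation using Python's `fnmatch`, with patterns copied from the custom_sinks example; `matching_rule` is a hypothetical helper.

```python
from fnmatch import fnmatchcase
from typing import Optional

# Patterns copied from the custom_sinks example above.
CUSTOM_SINKS = {
    "audit_log": ["logger.audit(*)", "AuditLog.write(*)"],
    "payment_processor": ["payment.process(*)", "stripe.charge(*)"],
}


def matching_rule(call_text: str, rules: dict[str, list[str]]) -> Optional[str]:
    """Return the name of the first rule with a pattern matching the call text."""
    for name, patterns in rules.items():
        # fnmatchcase treats '*' as "any characters"; parentheses match literally.
        if any(fnmatchcase(call_text, pattern) for pattern in patterns):
            return name
    return None


print(matching_rule("logger.audit(user.email)", CUSTOM_SINKS))      # audit_log
print(matching_rule("stripe.charge(amount, token)", CUSTOM_SINKS))  # payment_processor
print(matching_rule("logger.info('started')", CUSTOM_SINKS))        # None
```

Trying a pattern against a few representative call strings like this is a quick sanity check before committing a new rule, although the scanner's own matcher may behave differently.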

Interpreting Confidence Scores

Score Ranges

| Score | Meaning | Action |
| --- | --- | --- |
| 0.9-1.0 | Very likely vulnerability | Fix immediately |
| 0.7-0.9 | Probable vulnerability | Review and fix |
| 0.5-0.7 | Possible vulnerability | Investigate |
| <0.5 | Unlikely vulnerability | Low priority |

Factors Affecting Confidence

  • Direct data flow: Higher confidence
  • Complex transformations: Lower confidence
  • Unknown function calls: Lower confidence
  • Sanitizer presence: Lower confidence
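The score ranges above can be applied mechanically when triaging many findings. The ranges share their boundary values (0.9 and 0.7 each appear in two rows), so this sketch assumes a boundary score belongs to the higher band; `triage` is an illustrative name.

```python
def triage(confidence: float) -> str:
    """Map a confidence score to the action listed in the score ranges table."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if confidence >= 0.9:
        return "Fix immediately"
    if confidence >= 0.7:
        return "Review and fix"
    if confidence >= 0.5:
        return "Investigate"
    return "Low priority"


for score in (0.95, 0.9, 0.7, 0.3):
    print(score, triage(score))
```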

Best Practices

1. Scan Early, Scan Often

# During development
"Scan this function I just wrote for security issues"

# Before commit
"Quick security check on my changes"

# Before PR
"Full security scan of the modified files"

2. Fix Critical First

Prioritize by severity:

  1. Critical: SQL Injection, Command Injection
  2. High: XSS, Path Traversal, SSRF
  3. Medium: Hardcoded secrets, Information disclosure
  4. Low: Minor issues, best practice violations

3. Use Parameterized Queries

# Bad
query = f"SELECT * FROM users WHERE id = {user_id}"
cursor.execute(query)

# Good
query = "SELECT * FROM users WHERE id = ?"
cursor.execute(query, (user_id,))
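The difference is easy to observe with an in-memory SQLite database. This is a self-contained demonstration, assuming a throwaway `users` table and a classic `1 OR 1=1` payload; nothing here is specific to Code Scalpel.

```python
import sqlite3

# Throwaway database with two users.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "alice"), (2, "bob")])

payload = "1 OR 1=1"  # attacker-controlled value

# Bad: the payload becomes part of the SQL text, so the WHERE clause is always true.
unsafe_rows = conn.execute(
    f"SELECT name FROM users WHERE id = {payload}"
).fetchall()

# Good: the payload is bound as a single value and compared against id as data.
safe_rows = conn.execute(
    "SELECT name FROM users WHERE id = ?", (payload,)
).fetchall()

print(len(unsafe_rows))  # 2
print(len(safe_rows))    # 0
```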

4. Validate All Input

# Validate before use
def get_user(user_id: str):
    if not user_id.isdigit():
        raise ValueError("Invalid user ID")

    # Now safe to use
    return db.query(User).filter(User.id == int(user_id)).first()
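One detail worth knowing: `str.isdigit()` also accepts some non-ASCII characters such as the superscript `²`, which `int()` then rejects. A stricter variant, sketched below with a hypothetical `parse_user_id` helper and an assumed upper bound of 18 digits, accepts only ASCII digits and returns the parsed integer.

```python
import re

# One to eighteen ASCII digits; the length bound is an assumption for illustration.
_USER_ID = re.compile(r"[0-9]{1,18}")


def parse_user_id(raw: str) -> int:
    """Return raw as an int, or raise ValueError unless it is only ASCII digits."""
    if not _USER_ID.fullmatch(raw):
        raise ValueError("Invalid user ID")
    return int(raw)


for candidate in ["42", "²", "1 OR 1=1"]:
    try:
        print(parse_user_id(candidate))
    except ValueError:
        print(f"rejected {candidate!r}")
```

Returning the integer from the validator means callers never handle the raw string again after the check.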

5. Use Framework Protections

# Flask - Use escape for XSS
from markupsafe import escape

@app.route('/hello/<name>')
def hello(name):
    return f"Hello, {escape(name)}!"

# Django - Use ORM instead of raw SQL
User.objects.filter(id=user_id)  # Safe
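To see what escaping does to hostile input without installing Flask, the standard library's `html.escape` can stand in for `markupsafe.escape`. The two are not identical (markupsafe returns a `Markup` object, for example), so treat this as an illustration of the effect only; `greet` is a hypothetical function.

```python
from html import escape


def greet(name: str) -> str:
    """Build a greeting with HTML special characters in the name escaped."""
    return f"Hello, {escape(name)}!"


print(greet("<script>alert(1)</script>"))
# Hello, &lt;script&gt;alert(1)&lt;/script&gt;!
```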

Next Steps