22 Tools for Surgical Code Operations

Give your AI agent deep understanding of your codebase. Parse structure, track security flows, refactor safely, and verify everything with zero hallucination.

22 MCP Tools: Complete toolkit for AI agents
6 Categories: Analysis, security, graphs & more
4 Languages: Python, JS/TS, Java, JSX/TSX

See it in action

Real examples of what your AI agent does differently with Code Scalpel.

"Add a new field to the users table"

Your agent uses get_project_map to find the User model, get_symbol_references to find all usages, security_scan to check implications, and update_symbol to apply changes safely.

Result: Safe change, fully verified, completely auditable. Zero broken builds.

"Fix this security vulnerability"

Your agent uses cross_file_security_scan to trace data flow, symbolic_execute to generate test cases, applies the fix with update_symbol, and verifies with simulate_refactor.

Result: Vulnerability closed. All paths verified safe. Compliance proof generated.

"Refactor this monolithic file"

Your agent uses analyze_code to understand structure, get_call_graph to map dependencies, extract_code to pull out functions, and simulate_refactor to verify behavior preservation.

Result: Clean, testable functions. Every reference updated. Behavior preserved.

What you get at each tier

Every tool works at every tier. Higher tiers increase limits and unlock advanced features.

Tool | Community (Free) | Pro | Enterprise
Analysis & Context
analyze_code | ✅ Full | ✅ Full | ✅ Full
crawl_project | ✅ Full | ✅ Full | ✅ Full
get_file_context | ✅ Full | ✅ Full | ✅ Full
get_symbol_references | ✅ Full | ✅ Full | ✅ Full
Extraction & Modification
extract_code | ✅ Single-file | ✅ + Cross-file deps | ✅ + Cross-file deps
update_symbol | ✅ Full | ✅ Full | ✅ Full
rename_symbol | ✅ Single-file | ✅ + Cross-file refs | ✅ + Cross-file refs
Security
security_scan | ✅ 10 paths | ✅ 100 paths | ✅ Unlimited
unified_sink_detect | ✅ Full | ✅ Full | ✅ Full
cross_file_security_scan | ✅ Basic | ✅ 100 files | ✅ Unlimited
scan_dependencies | ✅ Full | ✅ Full | ✅ Full
type_evaporation_scan | ✅ Full | ✅ Full | ✅ Full
Graph Analysis
get_call_graph | ✅ 3 depth / 50 nodes | ✅ 50 depth / 500 nodes | ✅ Unlimited
get_project_map | ✅ 100 files | ✅ 1,000 files | ✅ Unlimited
get_graph_neighborhood | ✅ 2 hops / 50 nodes | ✅ 5 hops / 200 nodes | ✅ Unlimited
get_cross_file_dependencies | ✅ 2 depth | ✅ 5 depth | ✅ Unlimited
Symbolic Execution
symbolic_execute | ✅ 10 paths / depth 3 | ✅ 100 paths / depth 10 | ✅ Unlimited
generate_unit_tests | ✅ Basic | ✅ Full coverage | ✅ Full coverage
simulate_refactor | ✅ Single-file | ✅ + Cross-file | ✅ + Cross-file
Policy & Governance
validate_paths | ✅ Full | ✅ Full | ✅ Full
verify_policy_integrity | ✅ Cryptographic
code_policy_check | ✅ OWASP basics | ✅ Multi-standard | ✅ + Custom OPA rules

Complete Tool Reference

All 22 tools, organized by what they do.

✨ All 22 tools available at all tiers

Community (free), Pro, and Enterprise tiers all have access to the complete toolkit. Higher tiers increase limits (max paths, depth, files) and add advanced features (cross-file analysis, custom rules, compliance reporting), but every tool works in every tier.

Analysis & Context (4 tools)

Understand code structure without guessing. Parse files, inventory projects, and get quick context.

analyze_code – Parse code structure (functions, classes, imports)

What it does: Performs real Abstract Syntax Tree (AST) parsing to extract complete code structure with precision. Returns every function, class, method, import statement, decorator, and cyclomatic complexity metric. Guaranteed accuracy—no fragile regex patterns or guesswork.

When agents use it: Before any code modification, to map dependencies, identify complexity hotspots (functions over 10 lines with 5+ branches), find all callable functions/classes, or plan safe refactoring strategies.

Languages: Python, JavaScript, TypeScript, Java, JSX, TSX
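To make the idea of AST-based structure extraction concrete, here is a minimal sketch using Python's standard ast module. It is an illustration only, not Code Scalpel's implementation: the summarize function, the SAMPLE source, and the branch-counting rule used as a complexity estimate are all assumptions made for this example.

```python
import ast

SAMPLE = '''
import os
from pathlib import Path

class Cart:
    def total(self, items):
        if not items:
            return 0
        return sum(i for i in items if i > 0)

def main():
    return Cart().total([1, 2])
'''

# Node types counted as decision points (a rough cyclomatic-complexity proxy).
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler,
                ast.BoolOp, ast.IfExp, ast.comprehension)


def summarize(source: str) -> dict:
    """Return imports, classes, and per-function line/complexity for one module."""
    summary = {"imports": [], "classes": [], "functions": {}}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            summary["imports"] += [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            summary["imports"].append(node.module)
        elif isinstance(node, ast.ClassDef):
            summary["classes"].append(node.name)
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            branches = sum(isinstance(n, BRANCH_NODES) for n in ast.walk(node))
            summary["functions"][node.name] = {
                "line": node.lineno,
                "complexity": 1 + branches,
            }
    return summary


print(summarize(SAMPLE))
```

Because the parser works on syntax rather than text patterns, a function name that appears inside a string or comment never shows up in the summary.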

crawl_project – Inventory all files in a project

What it does: Recursively scans your project directory and catalogs every file. Returns normalized paths, file types (Python, JavaScript, config, docs), and file sizes, and respects .gitignore and skips .git to avoid noise. Produces a complete project inventory in seconds.

When agents use it: On first pass to understand project shape, locate configuration files (setup.py, package.json, .env), find test directories, or identify which directories matter for analysis (skipping node_modules, dist, .git).
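A project inventory of this kind can be sketched with os.walk. The example below is a simplified illustration under stated assumptions: the crawl function and the SKIP_DIRS set are hypothetical names, and the sketch prunes a fixed list of directories rather than parsing .gitignore.

```python
import os
from pathlib import Path

# Assumed skip list for the sketch; a real crawler would also read .gitignore.
SKIP_DIRS = {".git", "node_modules", "dist", "__pycache__"}


def crawl(root: str) -> list:
    """Walk root and return one record per file, skipping noisy directories."""
    inventory = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk never descends into skipped directories.
        dirnames[:] = sorted(d for d in dirnames if d not in SKIP_DIRS)
        for name in sorted(filenames):
            path = Path(dirpath, name)
            inventory.append({
                "path": path.relative_to(root).as_posix(),  # normalized, '/'-separated
                "suffix": path.suffix,
                "size": path.stat().st_size,
            })
    return inventory
```

Pruning dirnames in place is what keeps the walk cheap: the contents of node_modules are never listed at all.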

get_file_context – Quick file overview without full read

What it does: Provides a lightweight file preview: size, line count, first 10 lines (imports/headers), last 10 lines (EOF markers), MIME type, and encoding. Perfect for triage without the cost of parsing 10 MB files.

When agents use it: Before expensive operations, to determine if a file is relevant (headers show if it's a config file, test file, or utility). Saves tokens by filtering out non-code files first.
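The preview described above can be approximated in one streaming pass. This is a minimal sketch, assuming UTF-8 text; the preview function and its return keys are hypothetical, and MIME/encoding detection is left out.

```python
from collections import deque
from pathlib import Path


def preview(path: str, n: int = 10) -> dict:
    """Return size, line count, and the first/last n lines without parsing the file."""
    head, tail, count = [], deque(maxlen=n), 0
    with open(path, encoding="utf-8", errors="replace") as fh:
        for count, line in enumerate(fh, start=1):
            text = line.rstrip("\n")
            if count <= n:
                head.append(text)
            tail.append(text)  # deque(maxlen=n) keeps only the last n lines
    return {
        "size": Path(path).stat().st_size,
        "lines": count,
        "head": head,
        "tail": list(tail),
    }
```

Memory use stays bounded by n regardless of file size, which is the point of triaging before a full parse.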

get_symbol_references – Find all usages of a symbol

What it does: Uses AST-based symbol tracking to find every reference to a function, class, method, or variable across your entire codebase. Returns exact file locations, line numbers, and context (is it a definition, call, or reference?). No false positives from string matches.

When agents use it: Before renaming (to update all callers), before deleting (to confirm nothing depends on it), or to understand impact radius of a change ("If I modify this function, what else breaks?").
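The difference between AST-based reference tracking and string matching can be shown in a few lines. This sketch covers a single Python module only; find_references and SAMPLE are hypothetical names for the example, not the tool's API.

```python
import ast

SAMPLE = '''
def calc_total(items):
    return sum(items)

label = "calc_total"   # a string, not a reference
handler = calc_total
print(calc_total([1, 2]))
'''


def find_references(source: str, symbol: str) -> list:
    """Return sorted (line, kind) pairs: definition, call, or reference."""
    tree = ast.parse(source)
    # Name nodes that sit in the callee position of a Call.
    call_targets = {id(n.func) for n in ast.walk(tree) if isinstance(n, ast.Call)}
    refs = []
    for node in ast.walk(tree):
        is_def = isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
        if is_def and node.name == symbol:
            refs.append((node.lineno, "definition"))
        elif isinstance(node, ast.Name) and node.id == symbol:
            kind = "call" if id(node) in call_targets else "reference"
            refs.append((node.lineno, kind))
    return sorted(refs)


print(find_references(SAMPLE, "calc_total"))
```

The string literal on the label line is never reported, which is the false positive a plain text search would produce.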

Code Extraction & Modification

extract_code – Surgical extraction of functions/classes by name

What it does: Surgically extracts a specific function, class, or method from a file with exact source code and line numbers. Optionally resolves and includes dependencies (imports, parent classes) so the extracted code is self-contained and visible.

When agents use it: To get current implementation before making changes. Agents extract → modify → use update_symbol to apply. Works like copying code but with automatic context inclusion.

Languages: Python, JavaScript, TypeScript, Java, JSX, TSX
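Extraction by name with exact line numbers can be sketched with ast.get_source_segment from the Python standard library. The extract function and SAMPLE below are hypothetical and handle one Python file; dependency resolution is not shown.

```python
import ast

SAMPLE = '''import re

def validate_email(value):
    return bool(re.match(r"[^@]+@[^@]+", value))

def other():
    pass
'''


def extract(source: str, name: str):
    """Return the exact source and line span of the named function or class."""
    for node in ast.walk(ast.parse(source)):
        is_def = isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
        if is_def and node.name == name:
            return {
                "code": ast.get_source_segment(source, node),
                "start": node.lineno,
                "end": node.end_lineno,
            }
    return None  # symbol not found


print(extract(SAMPLE, "validate_email"))
```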

Pro: Cross-file deps

update_symbol – Safely replace functions/classes in files

What it does: Precisely replaces a function, class, or method in a source file with new code. Preserves all surrounding code exactly, validates Python/JavaScript syntax before writing, and creates automatic .bak backups. Zero risk of mangling nearby code.

When agents use it: After modifying extracted code. Extract → edit locally → update_symbol applies changes. Syntax validation prevents broken deployments.

Use case: "Update the validate_email function to use RFC 5322 pattern" → extract_code → modify → update_symbol
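The replace, validate, back up, then write sequence can be sketched as follows. This is an assumption-laden illustration for Python files only: replace_symbol and update_file are hypothetical names, the replacement is expected at the same indentation as the original, and decorators above the definition are left untouched.

```python
import ast
import shutil
from pathlib import Path


def replace_symbol(source: str, name: str, new_code: str) -> str:
    """Swap the named definition for new_code, leaving all other lines as they were."""
    for node in ast.walk(ast.parse(source)):
        is_def = isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
        if is_def and node.name == name:
            lines = source.splitlines(keepends=True)
            updated = (
                "".join(lines[: node.lineno - 1])
                + new_code.rstrip("\n") + "\n"
                + "".join(lines[node.end_lineno:])
            )
            ast.parse(updated)  # raises SyntaxError before anything is written
            return updated
    raise LookupError(f"symbol {name!r} not found")


def update_file(path: str, name: str, new_code: str) -> None:
    """Validate the edit in memory, keep a .bak copy, then write the new file."""
    target = Path(path)
    updated = replace_symbol(target.read_text(encoding="utf-8"), name, new_code)
    shutil.copy2(target, target.with_name(target.name + ".bak"))
    target.write_text(updated, encoding="utf-8")
```

Validation runs on the full updated text before the backup or the write, so a syntactically broken replacement never touches the file on disk.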

rename_symbol – Rename functions/classes throughout codebase

What it does: Renames a function, class, or method with surgical precision. Automatically updates all references across the codebase (imports, calls, type hints) using AST-based analysis to avoid string-match errors.

When agents use it: Refactoring tasks: "Rename calc_total to calculate_total with all callers updated," enforce naming conventions, or make code more semantic.

Tier differences: Community renames in single file; Pro/Enterprise add cross-file reference updates and automatic import fixing.

Security

security_scan – Detect vulnerabilities using taint analysis

What it does: Taint-tracking analysis that traces untrusted data (user input, environment variables, network calls) through your code to identify how it reaches dangerous operations (SQL queries, system commands, file writes). Maps complete attack paths with evidence. Detects SQL injection, XSS, command injection, path traversal, SSRF, and hardcoded secrets, each with a confidence score.

When agents use it: Before deployment for a security audit, or when a user asks "Is this code secure?". Returns a taint flow visualization: User input → sanitize? → SQL execute = vulnerability or safe.

Example: User input → f-string → SQL execute = CRITICAL vulnerability detected with remediation advice.
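The "user input → f-string → SQL execute" flow can be reproduced with a deliberately small taint tracker. This sketch is not the tool's analysis: it only follows straight-line, top-level assignments, and the SOURCES and SINKS sets and the find_tainted_sinks function are assumptions chosen for the example.

```python
import ast

SOURCES = {"input"}    # calls whose result is treated as untrusted
SINKS = {"execute"}    # method names that must not receive tainted data

SAMPLE = '''
name = input()
query = f"SELECT * FROM users WHERE name = '{name}'"
cursor.execute(query)
cursor.execute("SELECT 1")
'''


def names_in(node):
    """All variable names read anywhere inside node."""
    return {n.id for n in ast.walk(node) if isinstance(n, ast.Name)}


def is_source_call(node):
    return any(
        isinstance(n, ast.Call) and isinstance(n.func, ast.Name) and n.func.id in SOURCES
        for n in ast.walk(node)
    )


def find_tainted_sinks(source: str) -> list:
    """Return (line, tainted_names) for each sink call fed by untrusted data."""
    tainted, findings = set(), []
    for stmt in ast.parse(source).body:
        if isinstance(stmt, ast.Assign):
            # Taint propagates through any expression that reads a tainted name.
            if is_source_call(stmt.value) or names_in(stmt.value) & tainted:
                tainted |= {t.id for t in stmt.targets if isinstance(t, ast.Name)}
        for call in (n for n in ast.walk(stmt) if isinstance(n, ast.Call)):
            if isinstance(call.func, ast.Attribute) and call.func.attr in SINKS:
                hit = set().union(*(names_in(a) for a in call.args)) & tainted
                if hit:
                    findings.append((call.lineno, sorted(hit)))
    return findings


print(find_tainted_sinks(SAMPLE))
```

The constant query on the last line is not reported, because no tainted name reaches it.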

Pro: 100 paths

unified_sink_detect – Polyglot sink detection across languages

What it does: Finds dangerous operations (SQL execute, eval, system calls, file operations, DOM manipulation) in Python (os.system, sqlite3.execute), JavaScript (innerHTML, fetch), TypeScript, and Java (Runtime.exec), with a confidence score for each detection.

When agents use it: Quick surface-level security audit without full taint analysis. Answer: "Show me all potentially dangerous operations in this file."

Languages: Python (os.system, eval), JavaScript (innerHTML, eval), TypeScript, Java (Runtime.exec)
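Sink detection without taint tracking amounts to matching call targets against a list of known dangerous operations. The sketch below does this for Python only; the SINKS table, its labels, and the find_sinks function are hypothetical, and confidence scoring is omitted.

```python
import ast

# Assumed sink table for the sketch: dotted call name -> short label.
SINKS = {
    "os.system": "command execution",
    "eval": "dynamic code evaluation",
}

SAMPLE = '''
import os
os.system("ls " + user_dir)
result = eval(expr)
print("eval is only text here")
'''


def dotted_name(node):
    """Turn Name/Attribute chains such as os.system into 'os.system'."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return None


def find_sinks(source: str) -> list:
    """Return sorted (line, call_name, label) for every call to a known sink."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            name = dotted_name(node.func)
            if name in SINKS:
                hits.append((node.lineno, name, SINKS[name]))
    return sorted(hits)


print(find_sinks(SAMPLE))
```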

cross_file_security_scan – Track taint flow across file boundaries

What it does: Extends single-file security analysis to multi-file attack chains. Traces tainted data flowing across module boundaries: routes.py (accepts user input) → services.py (processes data) → db.py (executes SQL). Detects vulnerabilities that local analysis misses. Generates Mermaid visualizations showing complete taint flow.

When agents use it: Full-project security audits. Catches cross-module vulnerabilities where input validation happens in one file but is bypassed in another.

Tier limits: Pro tracks across up to 100 files; Enterprise is unlimited (Community: basic cross-file). Includes Mermaid flow diagrams showing the taint path.

scan_dependencies – Scan dependencies for CVEs

What it does: Scans requirements.txt, package.json, pyproject.toml, and go.mod against the OSV (Open Source Vulnerabilities) database. Returns CVE IDs, CVSS severity scores, affected version ranges, and direct links to security advisories. Works for the Python (PyPI), JavaScript (npm), Go, Rust, and Java ecosystems.

When agents use it: Supply chain security: "Are my dependencies safe?" Returns actionable patch versions and remediation steps.

Ecosystems: Python (PyPI), JavaScript (npm), Go, Rust
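For readers curious what an OSV lookup involves, here is a minimal sketch that turns pinned requirements into queries for OSV's public query endpoint. It is not how the tool is implemented: parse_pins, osv_payload, and query_osv are hypothetical helpers, only exact "name==version" pins are handled, and query_osv needs network access.

```python
import json
import urllib.request

OSV_QUERY_URL = "https://api.osv.dev/v1/query"


def parse_pins(requirements_text: str) -> list:
    """Extract (name, version) pairs from lines of the form name==version."""
    pins = []
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" in line:
            name, version = line.split("==", 1)
            pins.append((name.strip(), version.strip()))
    return pins


def osv_payload(name: str, version: str, ecosystem: str = "PyPI") -> dict:
    """Request body for a single package/version query."""
    return {"package": {"name": name, "ecosystem": ecosystem}, "version": version}


def query_osv(name: str, version: str) -> list:
    """Network call: return the advisories OSV lists for this exact version."""
    request = urllib.request.Request(
        OSV_QUERY_URL,
        data=json.dumps(osv_payload(name, version)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response).get("vulns", [])
```

A caller would loop over parse_pins(text) and call query_osv for each pair; unpinned or range-based requirements would need version resolution first.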

type_evaporation_scan – Detect Type System Evaporation

What it does: Detects type system vulnerabilities at API boundaries. A TypeScript frontend declares {age: number, isAdmin: boolean} but backend receives raw JSON without validation. Attacker sends {age: 30, isAdmin: "yes"} (truthy string) and becomes admin. Identifies these full-stack type confusion vulnerabilities.

When agents use it: Full-stack projects with TypeScript frontend + Python/Node backend. Verifies backend properly validates and type-checks all API inputs.

Example vuln: Frontend: isAdmin: boolean → Backend: if data.get('isAdmin') → Attacker sends isAdmin="yes" (truthy string) = admin access granted
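The example vulnerability is easy to reproduce in a Python backend. The two handlers below are hypothetical: one uses the truthiness check from the example, the other rejects anything that is not an actual boolean.

```python
import json


def is_admin_unsafe(data: dict) -> bool:
    # Truthiness check: any non-empty string, including "yes", passes.
    return bool(data.get("isAdmin"))


def is_admin_strict(data: dict) -> bool:
    # Enforce the type the frontend declared instead of trusting the JSON.
    value = data.get("isAdmin", False)
    if not isinstance(value, bool):
        raise TypeError(f"isAdmin must be a boolean, got {type(value).__name__}")
    return value


payload = json.loads('{"age": 30, "isAdmin": "yes"}')
print(is_admin_unsafe(payload))  # the attacker's string is accepted as admin
try:
    is_admin_strict(payload)
except TypeError as exc:
    print("rejected:", exc)
```

The TypeScript declaration isAdmin: boolean exists only at compile time on the client; nothing enforces it on the wire, so the check has to be repeated where the JSON is received.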

Graph Analysis

get_call_graph – Build call graphs showing function relationships

What it does: Builds a complete call graph: every function and which other functions it calls. Detects entry points (main(), CLI handlers, API routes), identifies circular dependencies, and shows execution depth and breadth. Outputs Mermaid diagrams showing the entire call chain from entry to leaf functions. Includes confidence scoring (100% certain vs. dynamic dispatch).

When agents use it: Understand execution paths (user request → which functions run?), find dead code, trace taint from input source to output sink, or plan refactoring impact zones.

Tier limits: Community 3 depth / 50 nodes; Pro 50 depth / 500 nodes; Enterprise unlimited. Path queries, k-hop traversal, and confidence scoring are available at all tiers with different limits.
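A call graph with Mermaid output can be sketched for a single Python module as shown here. The build_call_graph and to_mermaid functions are hypothetical; the sketch resolves only direct calls by simple name and ignores methods, dynamic dispatch, and confidence scoring.

```python
import ast

SAMPLE = '''
def main():
    order = load()
    save(total(order))

def total(order):
    return sum(order)

def load():
    return [1, 2]

def save(value):
    print(value)
'''


def build_call_graph(source: str) -> dict:
    """Map each function to the sorted names of module-level functions it calls."""
    tree = ast.parse(source)
    functions = [n for n in ast.walk(tree)
                 if isinstance(n, (ast.FunctionDef, ast.AsyncFunctionDef))]
    defined = {fn.name for fn in functions}
    graph = {}
    for fn in functions:
        called = {
            c.func.id for c in ast.walk(fn)
            if isinstance(c, ast.Call) and isinstance(c.func, ast.Name)
        }
        graph[fn.name] = sorted(called & defined)  # drop builtins such as sum
    return graph


def to_mermaid(graph: dict) -> str:
    """Render the graph as a Mermaid flowchart, one edge per line."""
    lines = ["graph TD"]
    for caller in sorted(graph):
        for callee in graph[caller]:
            lines.append(f"    {caller} --> {callee}")
    return "\n".join(lines)


print(to_mermaid(build_call_graph(SAMPLE)))
```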

Pro: 50 depth, 500 nodes

get_project_map – Generate comprehensive project structure map

What it does: Generates a high-level architecture map of your entire codebase: package structure, exposed functions/classes, entry points, complexity hotspots (cyclomatic complexity > 10), import graph visualization, circular dependencies, size metrics. Returns Mermaid-compatible diagrams and JSON for analysis.

When agents use it: Onboarding to new codebase ("What does this project do?"), identifying code to refactor (high complexity), finding test entry points, or planning modularization.

Use case: Onboarding to new codebase, finding complexity hotspots (functions with cyclomatic complexity > 10).

Pro: 1,000 files

get_graph_neighborhood – Extract k-hop neighborhood subgraph

What it does: Extracts a focused subgraph from the full call graph. Choose a center node (e.g., the process_payment function) and a depth (e.g., 2 hops), then get the functions that call it, the functions it calls, or both. Prevents graph explosion on large codebases by limiting scope.

When agents use it: Impact analysis ("What does delete_user call?"), focused refactoring ("Which functions touch this module?"), or security analysis ("What calls execute_sql?").

Directions: outgoing (callees), incoming (callers), both (full neighborhood)
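The three directions map naturally onto a breadth-first search with a hop limit. The sketch below assumes the call graph is already available as a dictionary of caller to callees; the neighborhood function and the GRAPH data are hypothetical.

```python
from collections import deque

# Hypothetical call graph: caller -> list of callees.
GRAPH = {
    "api": ["process_payment"],
    "process_payment": ["charge", "log"],
    "charge": ["execute_sql"],
    "execute_sql": [],
}


def neighborhood(graph: dict, center: str, hops: int, direction: str = "both") -> dict:
    """Return {node: distance} for every node within `hops` of `center`."""
    reverse = {}
    for caller, callees in graph.items():
        for callee in callees:
            reverse.setdefault(callee, []).append(caller)

    def neighbors(node):
        out = graph.get(node, []) if direction in ("outgoing", "both") else []
        inc = reverse.get(node, []) if direction in ("incoming", "both") else []
        return [*out, *inc]

    seen = {center: 0}
    queue = deque([center])
    while queue:
        node = queue.popleft()
        if seen[node] == hops:
            continue  # hop limit reached: do not expand further
        for nxt in neighbors(node):
            if nxt not in seen:
                seen[nxt] = seen[node] + 1
                queue.append(nxt)
    return seen


print(neighborhood(GRAPH, "process_payment", 1, "outgoing"))
print(neighborhood(GRAPH, "process_payment", 1, "incoming"))
```

The hop limit is what keeps the result small on large codebases: nodes at the boundary are recorded but never expanded.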

Pro: 5 hops, 200 nodes

get_cross_file_dependencies – Analyze cross-file dependency chains

What it does: Recursively resolves all dependencies a function/class needs from other files (imports, parent classes, used utilities). Returns combined source code with all dependencies inlined, a dependency graph, circular dependency detection, and Mermaid visualization of the dependency tree. Includes confidence decay scoring (direct deps = high confidence, transitive deps = lower).

When agents use it: Get full context for a function (all imports resolved and visible), detect circular imports, plan module extraction ("Can we move this function to a new file?"), or understand hidden dependencies.

Output: Dependency graph, combined code (all deps in one string), confidence decay scoring, Mermaid diagram.
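Circular dependency detection, one of the outputs listed above, is commonly done with a depth-first search that tracks the current path. This sketch assumes the file-level import graph has already been built; find_cycle and the sample data are hypothetical, and confidence decay is not modeled.

```python
def find_cycle(deps: dict):
    """Return one dependency cycle as a list of nodes, or None if the graph is acyclic."""
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on the current path, finished
    color, stack = {}, []

    def visit(node):
        color[node] = GREY
        stack.append(node)
        for nxt in deps.get(node, []):
            state = color.get(nxt, WHITE)
            if state == GREY:
                # nxt is already on the current path: that slice is the cycle.
                return stack[stack.index(nxt):] + [nxt]
            if state == WHITE:
                found = visit(nxt)
                if found:
                    return found
        stack.pop()
        color[node] = BLACK
        return None

    for node in deps:
        if color.get(node, WHITE) == WHITE:
            found = visit(node)
            if found:
                return found
    return None


imports = {
    "routes.py": ["services.py"],
    "services.py": ["db.py"],
    "db.py": ["services.py"],   # circular import back into services.py
}
print(find_cycle(imports))
```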

Pro: 5 depth

Symbolic Execution

symbolic_execute – Explore execution paths with Z3 solver

What it does: Uses the Z3 theorem prover to explore all possible execution paths through a function. Sends symbolic values (not concrete numbers) through if/else branches and computes path constraints. Discovers edge cases and crash conditions: "If quantity < 0 then ValueError," "If total == 0 then ZeroDivisionError," etc. Returns all paths with concrete example inputs.

When agents use it: Answer "What edge cases am I missing?" or find hard-to-reach code paths. Generates vulnerability-triggering inputs and crash proofs.

Example: Function with 5 if/else branches → Tool explores all 5 paths and generates example inputs for each.
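To illustrate what "path constraints with concrete example inputs" means, the sketch below writes out the constraints for a small function by hand and finds a witness input for each path. Note the substitution: a solver such as Z3 would derive and solve these constraints symbolically, whereas this standard-library sketch uses a bounded brute-force search. The share function, the PATHS table, and find_witnesses are all hypothetical.

```python
from itertools import product


def share(quantity, total):
    if quantity < 0:
        raise ValueError("quantity must be non-negative")
    if total == 0:
        raise ZeroDivisionError("total must be non-zero")
    return quantity / total


# One hand-written constraint per execution path through share().
PATHS = {
    "ValueError": lambda q, t: q < 0,
    "ZeroDivisionError": lambda q, t: q >= 0 and t == 0,
    "returns": lambda q, t: q >= 0 and t != 0,
}


def find_witnesses(paths: dict, domain=range(-2, 3)) -> dict:
    """For each path, return the first (quantity, total) pair satisfying its constraint."""
    witnesses = {}
    for label, constraint in paths.items():
        witnesses[label] = next(
            ((q, t) for q, t in product(domain, repeat=2) if constraint(q, t)),
            None,  # no witness inside the bounded domain
        )
    return witnesses


print(find_witnesses(PATHS))
```

Each witness doubles as a test input: calling share with it exercises exactly the path it was found for.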

Pro: 100 paths, depth 10

generate_unit_tests – Generate tests from symbolic paths

What it does: Leverages symbolic execution to automatically generate comprehensive unit tests. Explores all function branches and generates test cases for each path: happy path, error conditions (ValueError, ZeroDivisionError, TypeError), edge cases (empty inputs, negative numbers, None values). Outputs ready-to-run pytest, unittest, or hypothesis tests with 100% code path coverage.

When agents use it: User says "Write tests for this function" or for TDD workflows. Ensures every code branch is tested, including error handling.

Output: Complete test file with test cases for all branches, error cases (ValueError, ZeroDivisionError), and edge cases.

Pro: Full coverage

simulate_refactor – Verify behavior preservation before applying changes

What it does: Simulates applying a code change and verifies behavior preservation through comparative symbolic execution. Passes original and refactored code to Z3, explores all paths in both, and confirms they produce identical results for all inputs. Returns is_safe (boolean), differences (if any), and path comparison analysis.

When agents use it: Before update_symbol applies changes. Critical safety check: "This refactor preserves 100% of behavior" or "Warning: This changes behavior when quantity == 0".

Output: is_safe (boolean), differences (if any), path_comparison (original vs new paths), recommendation.
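The shape of this output can be illustrated with a much weaker stand-in for symbolic comparison: running both versions over a bounded set of inputs and recording where they disagree. This sketch is differential testing, not a proof over all inputs, and the compare and outcome helpers and both average functions are hypothetical.

```python
def average_original(total, quantity):
    return total / quantity


def average_refactored(total, quantity):
    if quantity == 0:
        return 0          # behavior change: the original raises here
    return total / quantity


def outcome(fn, args):
    """Normalize a call into ('ok', value) or ('raised', exception name)."""
    try:
        return ("ok", fn(*args))
    except Exception as exc:
        return ("raised", type(exc).__name__)


def compare(original, refactored, inputs) -> dict:
    """Report every input on which the two versions behave differently."""
    differences = []
    for args in inputs:
        before, after = outcome(original, args), outcome(refactored, args)
        if before != after:
            differences.append({"input": args, "original": before, "refactored": after})
    return {"is_safe": not differences, "differences": differences}


inputs = [(t, q) for t in range(3) for q in range(3)]
report = compare(average_original, average_refactored, inputs)
print(report["is_safe"])
for diff in report["differences"]:
    print(diff)
```

Every reported difference has quantity == 0, which matches the kind of warning quoted above.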

Pro: Cross-file

Policy & Governance

validate_paths – Validate file paths before operations

What it does: Validates file paths before all operations: confirms files exist, are readable/writable, fall within project boundaries (no path traversal escape), and detects suspicious symlinks pointing outside the project. Essential security layer for sandboxed deployments.

When agents use it: Every agent operation on files is guarded by path validation first. Prevents attacks, enforces sandbox boundaries, and confirms file accessibility before expensive parsing.

Checks: File exists, is readable/writable, within project boundaries, not a symlink outside project.
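The boundary check is the subtle part: a path has to be resolved (symlinks followed, ".." collapsed) before it is compared against the project root. A minimal sketch with pathlib, where the validate function and its return keys are assumptions:

```python
import os
from pathlib import Path


def validate(path: str, project_root: str) -> dict:
    """Resolve path against the project root and report whether it stays inside it."""
    root = Path(project_root).resolve()
    # resolve() follows symlinks and collapses "..", so escapes become visible.
    resolved = (root / path).resolve()
    inside = resolved == root or root in resolved.parents
    return {
        "path": str(resolved),
        "exists": resolved.exists(),
        "readable": os.access(resolved, os.R_OK),
        "inside_project": inside,
    }
```

Comparing raw strings instead (for example with startswith) would accept "project/../secrets" and would miss a symlink that points outside the project.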

verify_policy_integrity – Cryptographic policy verification

What it does: Cryptographic policy verification: generates SHA-256 hashes of governance configuration files (limits.toml, security-policy.yaml, .code-scalpel config) and verifies against a trusted manifest. Detects if policies have been tampered with or downgraded.

When agents use it: Compliance audits in Enterprise environments. Answer: "Have our security/governance policies been modified?" with cryptographic proof.

Pro/Enterprise limits: Advanced cryptographic verification and remote manifest support available in higher tiers.
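Hash-based tamper detection can be sketched with hashlib. The manifest format here (relative path mapped to an expected SHA-256 hex digest) and the sha256_file and verify functions are assumptions for the example; how the manifest itself is protected is out of scope.

```python
import hashlib
from pathlib import Path


def sha256_file(path) -> str:
    """Hex SHA-256 digest of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def verify(manifest: dict, root: str) -> dict:
    """Compare each policy file under root against its expected digest."""
    report = {}
    for relative_path, expected in manifest.items():
        target = Path(root, relative_path)
        if not target.is_file():
            report[relative_path] = "missing"
        elif sha256_file(target) != expected:
            report[relative_path] = "modified"
        else:
            report[relative_path] = "ok"
    return report
```

The check is only as trustworthy as the manifest, which is why the manifest needs to live somewhere the policy files' editors cannot also change.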

code_policy_check – Check code against compliance standards

What it does: Audits code against multi-standard compliance frameworks: OWASP Top 10 (hardcoded secrets, eval usage), SOC 2 (audit logging, access controls), ISO 27001 (data handling), HIPAA (PII handling), PCI-DSS. Also enforces org-specific style guides. Returns violations with severity (CRITICAL/HIGH/MEDIUM/LOW) and remediation guidance.

When agents use it: Pre-commit checks, CI/CD gates, compliance reporting. Answer: "Does this code violate our security/compliance standards?"

Enterprise limits: Adds custom OPA (Open Policy Agent) rules and compliance reporting for organization-specific policies.

Start Using Code Scalpel

All 22 tools included free. Install in 30 seconds.