Tool Reference

Complete guide to Code Scalpel's 22 MCP tools for surgical code operations

Overview

Code Scalpel provides 22 specialized MCP tools for AI-assisted code operations. All tools are available at all tiers (Community, Pro, Enterprise) with tier-based limits and progressive feature enhancement.

🔑 Key Principles

Token Efficiency: 99% reduction vs traditional file reading (50 tokens vs 10,000+)
AST-Based Accuracy: Zero hallucination via real parsers (Python ast, tree-sitter)
Multi-Language: Python (full AST), JavaScript/TypeScript/Java (tree-sitter/heuristic)
Tier Scaling: All tools available; limits expand by tier (Community → Pro → Enterprise)

Tool Categories

📊 Code Analysis (4 tools)

Understand code structure, complexity, and architecture without reading full files.

analyze_code

All Tiers

Purpose: Static code structure analysis via AST parsing. Returns functions, classes, imports, and complexity metrics without hallucination risk.

Token Savings: ~200 tokens vs ~5,000+ for full file read

Languages: Python (ast), JavaScript/TypeScript (tree-sitter), Java

Performance: <100ms for files <1K LOC

Use Cases: Understanding code before extraction/modification, identifying functions/classes, complexity assessment
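As a rough illustration of AST-based structure analysis, the sketch below uses Python's standard ast module to list functions, classes, and imports without any text heuristics. It is not Code Scalpel's implementation; the summarize function and the SOURCE sample are hypothetical names chosen for this example.

```python
import ast

# Hypothetical sample input; any Python source string works.
SOURCE = '''
import os
from pathlib import Path

class Greeter:
    def greet(self, name):
        if name:
            return f"Hello, {name}"
        return "Hello"

def main():
    print(Greeter().greet("world"))
'''

def summarize(source: str) -> dict:
    """Collect function, class, and imported module names from Python source."""
    tree = ast.parse(source)
    summary = {"functions": [], "classes": [], "imports": []}
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            summary["functions"].append(node.name)
        elif isinstance(node, ast.ClassDef):
            summary["classes"].append(node.name)
        elif isinstance(node, ast.Import):
            summary["imports"].extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            # Relative imports such as "from . import x" have module=None.
            summary["imports"].append(node.module or "")
    return summary

summary = summarize(SOURCE)
print(summary)
```

Because the names come from parsed syntax nodes, the summary can only contain symbols that actually exist in the file.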

get_file_context

All Tiers

Purpose: Token-efficient "strategic glance" at a file. Returns structured summary including functions, classes, imports, complexity, and security warnings.

Token Savings: 20-50x reduction (~50-150 tokens vs ~5,000+)

Security: Full security warning support (CWE mapping) in all tiers

Performance: ~50ms average response time

Use Cases: Assessing file relevance before extraction, checking for specific imports, identifying security hotspots quickly

Pro Tier: Semantic summarization, intent extraction, code smells detection, maintainability index

Enterprise Tier: Compliance flags (HIPAA, GDPR, PCI-DSS), PII/secret redaction, CODEOWNERS integration

crawl_project

Community: 100 files | Pro: 1,000 files | Enterprise: Unlimited

Purpose: Project-wide analysis providing comprehensive structure, complexity metrics, and code intelligence across all files. Bird's-eye view of entire codebase.

Performance: >100 files/second, <500MB for 10K files

Community: File inventory, basic metrics, hotspot identification (100 file limit)

Pro: Complexity hotspots, architectural layer detection, dependency mapping (1000 files)

Enterprise: Monorepo support, historical trends, compliance scanning (unlimited)

Use Cases: Understanding overall project structure, identifying complexity hotspots, risk assessment, framework detection, dependency analysis

get_project_map

Community: 100 files | Pro: Enhanced | Enterprise: Full

Purpose: Architectural reconnaissance engine generating comprehensive project structure visualizations. Shows packages, modules, complexity hotspots, and architectural patterns.

Performance: <10s for 1K files

Visualizations: Mermaid (all tiers), City Map (Enterprise), Force Graph (Enterprise)

Community: Package/module hierarchy, entry points, circular imports (100 files, 50 modules)

Pro: Complexity hotspots, architectural layer detection, smart context

Enterprise: Compliance flags, Git ownership/churn analysis, technical debt scoring

✂️ Code Extraction (3 tools)

Surgically extract specific code elements by name with maximum token efficiency.

extract_code

Community: Single-file | Pro: Cross-file depth=1 | Enterprise: Unlimited

Purpose: Primary code retrieval tool. Surgically extracts specific functions/classes/methods by name. Agent sends ~50 tokens, receives ~200 (vs 10,000+ for full file).

Token Efficiency: 99% savings (50 vs 10,000+ tokens)

Languages: Python (full AST), JavaScript/TypeScript/Java (tree-sitter)

Performance: <50ms extraction time

Features: Extract by symbol name (no line numbers), context-aware (includes decorators/docstrings), React metadata support (JSX/TSX)

Community: Single-file extraction with intra-file dependencies

Pro/Enterprise: Cross-file dependency resolution (Python), depth=1 imports resolved
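Extraction by symbol name, with decorators and docstrings included, can be approximated with the standard ast module. The sketch below is an assumption about the general technique, not the tool's actual code; extract_symbol and SOURCE are hypothetical names, and it requires Python 3.8+ for end_lineno.

```python
import ast

# Hypothetical file contents used only for this example.
SOURCE = '''\
import functools

@functools.lru_cache(maxsize=None)
def fib(n):
    """Return the n-th Fibonacci number."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)

def unrelated():
    return "not needed"
'''

def extract_symbol(source: str, name: str) -> str:
    """Return the source text of the first def/class called `name`."""
    tree = ast.parse(source)
    lines = source.splitlines()
    wanted = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)
    for node in ast.walk(tree):
        if isinstance(node, wanted) and node.name == name:
            # Start at the earliest decorator so decorators travel with the symbol.
            start = min([node.lineno] + [d.lineno for d in node.decorator_list])
            return "\n".join(lines[start - 1 : node.end_lineno])
    raise KeyError(name)

snippet = extract_symbol(SOURCE, "fib")
print(snippet)
```

Only the requested symbol is returned; the rest of the file never enters the agent's context.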

get_symbol_references

Community: 10 files, 50 refs | Pro: Unlimited | Enterprise: + Risk Scoring

Purpose: Find all references to a symbol (function, class, variable) across the project for safe refactoring and impact analysis. AST-based accuracy eliminates false positives.

Accuracy: Zero hallucination via AST parsing (excludes comments/strings)

Performance: <100ms for typical projects (<100 files)

Community: Accurate reference finding (10 files, 50 references max)

Pro: Reference categorization (call, import, read, write), unlimited files/refs

Enterprise: Risk scoring, CODEOWNERS integration, team coordination

Use Cases: Pre-refactoring impact analysis, understanding call sites, safe symbol changes
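To see why AST-based lookup excludes comments and strings, compare it with a plain text search. This is a minimal single-file sketch using Python's ast module; find_references, SOURCE, and the "read"/"write"/"definition" labels are assumptions made for illustration and do not reflect the tool's real output format.

```python
import ast

# Hypothetical sample: "charge" appears in code, in a comment, and in a string.
SOURCE = '''\
def charge(amount):
    return amount * 1.2

# charge is mentioned in this comment
label = "charge"
total = charge(10)
'''

def find_references(source: str, symbol: str) -> list[tuple[int, str]]:
    """Return (line, kind) pairs for real uses of `symbol` in the syntax tree."""
    refs = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name) and node.id == symbol:
            kind = "write" if isinstance(node.ctx, (ast.Store, ast.Del)) else "read"
            refs.append((node.lineno, kind))
        elif isinstance(node, (ast.FunctionDef, ast.ClassDef)) and node.name == symbol:
            refs.append((node.lineno, "definition"))
    return sorted(refs)

refs = find_references(SOURCE, "charge")

# Naive substring search for comparison: it also matches the comment and the string.
text_hits = [i for i, line in enumerate(SOURCE.splitlines(), 1) if "charge" in line]

print(refs)       # AST-based references
print(text_hits)  # text-based matches
```

The text search reports four lines, while the AST walk reports only the definition and the call, which is the distinction that matters before a rename.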

get_cross_file_dependencies

Community: Depth=1, 50 files | Pro: Depth=5, 500 files | Enterprise: Unlimited

Purpose: Analyze import/require statements and trace dependency chains across file boundaries with confidence scoring. Gathers complete code context for AI-assisted editing.

Confidence Scoring: Exponential decay (0.9^depth) shows reliability of deep chains

Languages: Python, JavaScript, TypeScript

Features: Circular import detection, combined code output, Mermaid diagrams

Community: Depth=1 (immediate imports), 50 files max

Pro: Transitive resolution (A→B→C chains), depth=5, 500 files

Enterprise: Architectural enforcement, unlimited depth/files, coupling metrics
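The 0.9^depth confidence decay mentioned above is easy to tabulate. The sketch assumes that depth 0 is the starting file (confidence 1.0) and that each additional import hop multiplies confidence by 0.9; chain_confidence is a hypothetical helper name.

```python
def chain_confidence(depth: int, base: float = 0.9) -> float:
    """Confidence for a dependency found `depth` import hops from the start file."""
    return base ** depth

# Assumed convention: depth 0 is the file being analyzed.
table = {depth: round(chain_confidence(depth), 3) for depth in range(6)}

for depth, confidence in table.items():
    print(f"depth={depth}  confidence={confidence}")
```

At the Pro limit of depth=5, confidence has dropped to about 0.59, which is why deep chains deserve a second look before they are trusted as context.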

✏️ Code Modification (2 tools)

Surgical, safe code modification with automatic validation and backup.

update_symbol

All Tiers (No Restrictions)

Purpose: Primary write tool for surgical code modification. Replace specific functions/classes/methods with new code while preserving surrounding context. Atomic write with backup.

Safety: Automatic syntax validation before writing, atomic operations prevent corruption

Backup: Creates .bak files by default (rollback safety)

Languages: Python (v1.0), JS/TS/Java (planned v2.0)

Performance: <100ms for patch application

Features: Decorator awareness, automatic indentation handling, zero hallucination risk

Use Cases: Applying generated fixes, updating single functions, adding methods to classes, refactoring implementation
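The safety properties listed above (syntax validation before writing, a .bak backup, and an atomic write) follow a common pattern that can be sketched with the standard library. This is an illustration of the pattern under stated assumptions, not the tool's implementation: safe_write is a hypothetical name, it replaces a whole file rather than a single symbol, and it validates Python only.

```python
import ast
import os
import shutil
import tempfile

def safe_write(path: str, new_source: str, backup: bool = True) -> None:
    """Validate, back up, then atomically replace a Python file."""
    # 1. Validate syntax before touching the disk; raises SyntaxError if invalid.
    ast.parse(new_source)

    # 2. Keep a rollback copy next to the original.
    if backup and os.path.exists(path):
        shutil.copy2(path, path + ".bak")

    # 3. Write to a temp file in the same directory, then swap it into place.
    #    os.replace is atomic when source and destination share a filesystem.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(new_source)
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if the write or the swap failed.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Because validation happens first, a syntactically broken patch never reaches the file, and a crash mid-write leaves the original intact.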

rename_symbol

Community: Single-file | Pro: Cross-file (bounded) | Enterprise: Org-wide

Purpose: Safely rename functions, classes, or methods while automatically updating all references. AST-based transformation ensures syntactic correctness and preserves formatting.

Accuracy: AST-based (won't rename in strings/comments), identifier validation

Languages: Python, JavaScript, TypeScript, JSX

Community: Single-file renames with local reference updates

Pro: Cross-file propagation with import statement updates (bounded by limits)

Enterprise: Repository-wide and multi-repo renames, audit trails, approval workflows

🔀 Code Flow Analysis (3 tools)

Understand function relationships, call graphs, and execution paths.

get_call_graph

Community: Depth=3, 50 nodes | Pro: Depth=50, 500 nodes | Enterprise: Unlimited

Purpose: Generate static call graphs showing function-to-function relationships, entry points, and circular dependencies. Primary architecture visualization tool.

Languages: Python, JavaScript, TypeScript

Performance: <500ms for 1K functions

Precision: >90% (Community), >95% (Pro), >98% (Enterprise)

Features: Entry point discovery, circular dependency detection, Mermaid diagrams

Use Cases: Impact analysis (what breaks if I change this?), dead code detection, security auditing (trace to sinks), architecture validation
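A static call graph for a single Python file can be approximated in a few lines with the ast module. The sketch below is a simplification made for illustration: it only tracks top-level functions and direct calls by bare name (no methods, imports, or dynamic dispatch), and build_call_graph and SOURCE are hypothetical names.

```python
import ast

# Hypothetical module used only for this example.
SOURCE = '''\
def load():
    return parse(read())

def read():
    return "1,2"

def parse(text):
    return text.split(",")

def orphan():
    return load()
'''

def build_call_graph(source: str) -> dict:
    """Map each top-level function to the bare names it calls."""
    graph = {}
    for func in ast.parse(source).body:
        if isinstance(func, ast.FunctionDef):
            callees = {
                node.func.id
                for node in ast.walk(func)
                if isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
            }
            graph[func.name] = sorted(callees)
    return graph

graph = build_call_graph(SOURCE)

# Functions that nobody else calls are candidate entry points (or dead code).
called = {callee for callees in graph.values() for callee in callees}
entry_points = sorted(name for name in graph if name not in called)

print(graph)
print(entry_points)
```

Here orphan is the only function nothing else calls, so it shows up either as an entry point or as a dead-code candidate, depending on how the project invokes it.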

symbolic_execute

Community: 3 paths, 10 loop iterations | Pro: 10 paths, 100 iterations | Enterprise: Unlimited

Purpose: Formal verification and path exploration powered by Z3 Theorem Prover. Treats variables as mathematical symbols to solve for inputs that trigger specific code paths.

Mathematical Certainty: Uses SMT (Satisfiability Modulo Theories) to prove code reachability

Solver Engine: Microsoft Z3

Primary Language: Python (v1.0), JS/Java planned

Features: Corner case discovery, zero hallucination test generation, unreachable code detection

Use Cases: Determining how to reach specific code, generating test inputs, verifying critical business logic, proving code is unreachable

get_graph_neighborhood

Community: k=1 | Pro: k=5 | Enterprise: Unlimited

Purpose: Extract a localized subgraph around a specific function to avoid the "Exploding Graph Problem". Surgically explore dependency chains without loading the entire codebase.

Memory Safety: Prevents OOM errors via node limits and truncation protection

Performance: <100ms for neighborhood extraction

Community: k=1 (immediate neighbors only)

Pro: k=5 hops, semantic intelligence (functionally similar nodes)

Enterprise: Unlimited hops, Cypher-like query language for complex traversals
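A k-hop neighborhood with a node cap is essentially a bounded breadth-first search. The sketch below assumes a call graph stored as a dictionary of outgoing edges (callees only); the CALLS data, the neighborhood function, and the max_nodes parameter are hypothetical and only illustrate how hop limits and truncation keep memory bounded.

```python
from collections import deque

# Hypothetical call graph: function name -> functions it calls.
CALLS = {
    "handler": ["validate", "save"],
    "validate": ["parse"],
    "save": ["connect", "serialize"],
    "parse": [],
    "connect": ["retry"],
    "serialize": [],
    "retry": [],
}

def neighborhood(graph, center, k, max_nodes=100):
    """Breadth-first search that stops after k hops or max_nodes nodes.

    Returns (hops_by_node, truncated), where truncated is True if the
    node cap prevented at least one neighbor from being added.
    """
    hops = {center: 0}
    queue = deque([center])
    truncated = False
    while queue:
        node = queue.popleft()
        if hops[node] == k:
            continue  # do not expand beyond k hops
        for callee in graph.get(node, []):
            if callee in hops:
                continue
            if len(hops) >= max_nodes:
                truncated = True
                break
            hops[callee] = hops[node] + 1
            queue.append(callee)
    return hops, truncated

one_hop, _ = neighborhood(CALLS, "handler", k=1)
two_hop, _ = neighborhood(CALLS, "handler", k=2)
capped, was_truncated = neighborhood(CALLS, "handler", k=5, max_nodes=3)
print(one_hop, two_hop, capped, was_truncated)
```

With k=1 only the immediate callees are returned, and the truncated flag tells the caller when the cap, rather than the graph itself, ended the search.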

🔒 Security Analysis (5 tools)

Comprehensive security vulnerability detection with taint analysis and dependency scanning.

security_scan

Community: 50 findings | Pro: Unlimited + Remediation | Enterprise: + Custom Rules

Purpose: Primary single-file SAST engine. Identifies SQL injection, XSS, and command injection in Python via taint analysis. For JavaScript/TypeScript/Java, it provides sink detection.

Precision: Python taint analysis tracks data flow from sources to sinks, recognizes sanitizers

Performance: <200ms per file average

Community: Big Four vulnerabilities (SQL, XSS, Command Injection, Path Traversal), 50 findings max

Pro: Extended detection (NoSQL, LDAP, Secrets, JWT), unlimited findings, remediation hints

Enterprise: Custom rules, compliance reporting, organization-wide patterns

cross_file_security_scan

Community: 10 modules, depth=3 | Pro: 100 modules, depth=10 | Enterprise: Unlimited

Purpose: Detect vulnerabilities that span multiple files by tracking tainted data flow across module boundaries. Answers "How does untrusted data flow through my entire system?"

Multi-File Taint Tracking: Follows data from entry (routes.py) → storage (models.py) → execution (db.py)

Languages: Python (full), others (heuristic context)

Community: 10 modules, depth=3, basic cross-file flows

Pro: 100 modules, depth=10, architectural visibility, compliance ready

Enterprise: Unlimited modules/depth, microservice boundary tracking

unified_sink_detect

Community: 50 sinks | Pro: Unlimited + Context | Enterprise: + Risk Scoring

Purpose: Polyglot detection of dangerous "sinks" (functions where untrusted data, once it reaches them, leads to vulnerabilities). Confidence scoring and CWE mapping reduce false positives.

Languages: Python, JavaScript, TypeScript, Java

Sink Types: eval, exec, subprocess, innerHTML, document.write, Runtime.exec, etc.

Confidence Scoring: High (0.9-1.0), Medium (0.6-0.8), Low (0.0-0.5) based on context

Community: 50 sinks max, basic signature matching

Pro: Unlimited sinks, context-aware analysis (rules out safe usages), framework support

Enterprise: Risk scoring, compliance mapping (GDPR, PCI-DSS), remediation advice

scan_dependencies

Community: 50 dependencies | Pro: Unlimited + Reachability | Enterprise: + Compliance

Purpose: Software Composition Analysis (SCA) to identify security risks in project dependencies. Scans manifests against OSV database with typosquatting detection and license compliance.

Multi-Language: Python (pip), JavaScript (npm), Java (Maven/Gradle)

Vulnerability DB: Google OSV (aggregates GHSA, PySEC, NVD)

Community: CVE detection, transitive dependencies, 50 deps max

Pro: Reachability analysis (is vulnerable code imported?), typosquatting detection, license compliance, supply chain risk scoring

Enterprise: Policy enforcement, SOC2/ISO 27001 reports, SBOM generation (CycloneDX), custom registry support

type_evaporation_scan

All Tiers

Purpose: Detect type safety violations where type hints are present but can be bypassed at runtime. Focuses on Python's dynamic type system vulnerabilities.

Detection: Casts to Any, getattr/setattr calls, and __dict__ manipulation that bypass type hints

Languages: Python (primary), TypeScript (planned)

Use Cases: Type safety enforcement, runtime integrity validation, security boundary verification
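The bypass patterns named above (casts, getattr/setattr, __dict__ access) can be located syntactically. The sketch below is a deliberately coarse approximation using Python's ast module: it flags every matching call or attribute access without checking whether the use is actually unsafe. find_bypasses and SAMPLE are hypothetical names, and the real tool's rules are not documented here.

```python
import ast

# Hypothetical code under review; each of the last three lines sidesteps
# the declared "balance: int" annotation without raising at runtime.
SAMPLE = '''\
from typing import Any, cast

class Account:
    balance: int = 0

acct = Account()
setattr(acct, "balance", "oops")
acct.__dict__["balance"] = None
raw = cast(Any, acct)
'''

SUSPECT_CALLS = {"getattr", "setattr", "cast"}

def find_bypasses(source: str) -> list:
    """Return (line, pattern) pairs for constructs that can evade type hints."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in SUSPECT_CALLS:
                hits.append((node.lineno, node.func.id))
        elif isinstance(node, ast.Attribute) and node.attr == "__dict__":
            hits.append((node.lineno, "__dict__"))
    return sorted(hits)

hits = find_bypasses(SAMPLE)
print(hits)
```

A production scanner would need data-flow context to separate legitimate dynamic access from genuine type evaporation; this sketch only shows where such checks would start.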

📋 Policy & Compliance (2 tools)

Enforce organizational standards, best practices, and regulatory compliance.

code_policy_check

Community: Style | Pro: + Best Practices | Enterprise: + Compliance

Purpose: Unified enforcement of coding standards, best practices, security patterns, and compliance requirements. Single interface for style, security, and regulatory auditing.

Performance: <500ms per file

Patterns: 50+ (Community/Pro/Enterprise combined)

Community: Style guides (PEP8, ESLint), basic patterns

Pro: Best practices (type hints, docstrings), security patterns (hardcoded secrets, SQL injection)

Enterprise: Compliance auditing (HIPAA, SOC2, GDPR, PCI-DSS), custom rules, audit trails, PDF certification generation

verify_policy_integrity

Enterprise Only

Purpose: Cryptographic guardian of the governance model. Ensures policy definitions (allowlists, denylists, compliance rules) haven't been tampered with. Fail-closed security.

Algorithm: HMAC-SHA256 & SHA-256 cryptographic verification

Security Model: Fail-closed (deny by default on any anomaly)

Bypass Resistance: Defeats chmod attacks—verifies cryptographic content, not file permissions

Features: Tamper detection (single-bit changes), manifest signature validation, audit-ready error reporting

Use Cases: Pre-flight check before AI agents, policy drift auditing, compliance artifact verification in CI/CD
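HMAC-SHA256 verification with fail-closed behavior can be sketched with Python's hmac and hashlib modules. The function names, the secret handling, and the example policy below are assumptions for illustration; the tool's actual manifest format and key management are not described in this reference.

```python
import hashlib
import hmac

def sign_policy(policy_bytes: bytes, secret: bytes) -> str:
    """Compute the HMAC-SHA256 signature of a policy file's contents."""
    return hmac.new(secret, policy_bytes, hashlib.sha256).hexdigest()

def policy_is_intact(policy_bytes, secret, expected_signature) -> bool:
    """Return True only if the signature matches; any error counts as failure."""
    try:
        actual = sign_policy(policy_bytes, secret)
        # compare_digest runs in constant time to avoid timing side channels.
        return hmac.compare_digest(actual, expected_signature)
    except Exception:
        return False  # fail closed: anomalies deny rather than allow

# Hypothetical secret and policy content for demonstration only.
secret = b"example-secret"
policy = b'{"deny": ["rm -rf"]}'
signature = sign_policy(policy, secret)

tampered = bytes([policy[0] ^ 1]) + policy[1:]  # flip a single bit

print(policy_is_intact(policy, secret, signature))
print(policy_is_intact(tampered, secret, signature))
print(policy_is_intact(policy, secret, None))
```

The check depends only on file contents and the secret, so changing file permissions has no effect on the result, and a missing or malformed signature is treated the same as a mismatch.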

🧪 Testing & Validation (2 tools)

Automated test generation and pre-flight safety validation for code changes.

generate_unit_tests

Community: 5 tests | Pro: 20 tests + Parametrized | Enterprise: Unlimited + Bug Reproduction

Purpose: Automatically create comprehensive unit tests using symbolic execution. Explores all execution paths to generate concrete test cases with specific input values.

Method: Z3-powered symbolic execution (zero hallucination)

Frameworks: pytest (all tiers), unittest (Pro+)

Performance: <5s per function

Community: 5 test cases max, pytest framework, path coverage

Pro: 20 test cases, data-driven parametrized tests, unittest support

Enterprise: Unlimited tests, bug reproduction from crash logs

simulate_refactor

All Tiers

Purpose: Pre-flight safety validator for code modifications. Performs "dry run" analysis to detect security regressions, breaking changes, and behavioral inconsistencies before applying changes.

Performance: <1s for typical refactors (<500 LOC)

Languages: Python, JavaScript, TypeScript, Java

Analysis: Syntax validation, security differential (compares security_scan results), structural comparison (signature changes, removed symbols), semantic analysis (Pro+)

Safety Gate: Prevents introducing SQL injection or XSS during refactoring

Enterprise: Compliance impact analysis, automated rollback strategies

🔧 Utility (1 tool)

Path validation and file system operations with Docker awareness.

validate_paths

Community: 100 paths | Pro/Enterprise: Unlimited

Purpose: Pre-flight checklist for filesystem operations. Validates file paths are accessible, exist, and safe before expensive operations. Docker-aware with volume mount suggestions.

Performance: <10ms for 100 paths

Docker Intelligence: Detects container execution, suggests missing volume mounts

Community: Batch validation (100 paths), basic existence checks

Pro/Enterprise: Unlimited paths. Enterprise adds workspace boundary enforcement and audit logging for violations

Use Cases: Fail fast before operations, turn "File Not Found" into actionable DevOps guidance, security enforcement

Tier Comparison Summary

Feature	Community	Pro	Enterprise
All 22 Tools	✅ Available	✅ Available	✅ Available
security_scan findings	50 max	Unlimited	Unlimited + Custom
symbolic_execute paths	3 paths	10 paths	Unlimited
crawl_project files	100 files	1,000 files	Unlimited
extract_code dependencies	Single-file	Cross-file depth=1	Unlimited depth
generate_unit_tests	5 tests	20 tests	Unlimited
Remediation Suggestions	❌	✅	✅
Compliance Reporting	❌	❌	✅ (HIPAA, SOC2, GDPR, PCI-DSS)
Custom Policies	❌	❌	✅

Common Integration Patterns

Typical Workflow: Analyze → Extract → Modify → Verify

1. Understand Structure: analyze_code or get_file_context
2. Extract Code: extract_code with symbol name
3. Verify Safety: security_scan on extracted code
4. Check Impact: get_symbol_references to find all call sites
5. Simulate Changes: simulate_refactor with proposed modifications
6. Apply Changes: update_symbol if simulation passes
7. Validate: Re-run security_scan and tests
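Since these are MCP tools, each workflow step corresponds to a JSON-RPC "tools/call" request. The sketch below builds the sequence of requests for the steps above. The envelope (jsonrpc, method, params.name, params.arguments) follows the MCP convention, but every argument name (file_path, target_name, code, new_code), the example path, and the symbol are hypothetical placeholders; consult the tool schemas for the real parameters.

```python
import json
from itertools import count

_request_ids = count(1)

def tool_call(name: str, **arguments) -> dict:
    """Build an MCP-style JSON-RPC request that invokes one tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }

# Hypothetical file, symbol, and argument names throughout.
path, symbol = "app/billing.py", "charge"
patched = "def charge(amount):\n    return round(amount * 1.2, 2)\n"

workflow = [
    tool_call("get_file_context", file_path=path),
    tool_call("extract_code", file_path=path, target_name=symbol),
    tool_call("security_scan", file_path=path),
    tool_call("get_symbol_references", target_name=symbol),
    tool_call("simulate_refactor", file_path=path, new_code=patched),
    tool_call("update_symbol", file_path=path, target_name=symbol, new_code=patched),
    tool_call("security_scan", file_path=path),
]

for request in workflow:
    print(json.dumps(request))
```

In practice an agent would inspect each response before sending the next request, and would skip update_symbol when simulate_refactor reports a regression.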

Getting Started

Ready to use these tools? Check out our documentation for installation instructions, configuration guides, and detailed usage examples.
