FAQ — Code Scalpel

General

What is Code Scalpel?

Code Scalpel is an MCP (Model Context Protocol) server that gives AI assistants surgical code analysis and modification tools. Instead of having the assistant read entire files and guess line numbers, Code Scalpel enables precise extraction, analysis, and modification by symbol name.

What AI assistants work with Code Scalpel?

Code Scalpel works with any MCP-compatible AI assistant:

Claude (Desktop and API)
VS Code GitHub Copilot (via MCP extension)
Cursor (built-in MCP support)
Any MCP client (custom integrations)

What languages are supported?

Language	Analysis	Extraction	Security
Python	✅	✅	✅
JavaScript	✅	✅	✅
TypeScript	✅	✅	✅
Java	✅	✅	✅
JSX/TSX	✅	✅	✅

Is Code Scalpel free?

Yes! The Community tier is free forever and includes:

Code analysis and extraction
Basic security scanning
Symbol updates and renames
Path validation

Pro and Enterprise tiers add advanced features. See Tier Comparison.

Installation

How do I install Code Scalpel?

Via pip:

pip install codescalpel

Via Docker:

docker pull ghcr.io/codescalpel/code-scalpel

Do I need Python installed?

For pip installation, yes (Python 3.10+). For Docker, no: everything is containerized.

Tools & Features

Why use extract_code instead of reading the file?

Traditional file reading has problems:

Issue	File Reading	extract_code
Token cost	High (entire file)	Low (just the symbol)
Precision	Line number guessing	Name-based lookup
Fragility	Breaks on reformat	Resilient
Dependencies	Manual tracking	Automatic
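
To illustrate why name-based lookup is more robust than line numbers, here is a minimal sketch using Python's standard ast module. It is not Code Scalpel's implementation; the helper name extract_symbol and the sample source are assumptions made for the example.

```python
import ast
from typing import Optional


def extract_symbol(source: str, name: str) -> Optional[str]:
    """Return the source text of the function or class called `name`, if any."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        is_definition = isinstance(
            node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)
        )
        if is_definition and node.name == name:
            # get_source_segment slices the original text using the node's position
            return ast.get_source_segment(source, node)
    return None


SOURCE = '''\
import math


def area(radius):
    return math.pi * radius ** 2


def perimeter(radius):
    return 2 * math.pi * radius
'''

snippet = extract_symbol(SOURCE, "perimeter")
# Shifting the code down by a few lines changes line numbers, not the lookup result
shifted_snippet = extract_symbol("\n\n\n" + SOURCE, "perimeter")
```

Only the requested symbol comes back, so the caller spends tokens on one function rather than the whole file, and the lookup survives edits that move the function.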

What is symbolic execution?

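Symbolic execution explores the possible paths through code by treating variables as symbols instead of concrete values. Each path is described by the branch conditions that must hold to reach it, which makes it possible to find inputs that exercise that path.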
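
As a toy illustration of the idea (not Code Scalpel's engine), the sketch below lists the path conditions of a small function by hand and searches for one concrete input per path. Real symbolic execution engines derive these conditions automatically and usually pass them to a constraint solver; the names classify, PATH_CONDITIONS, and find_witnesses are assumptions made for the example.

```python
def classify(x: int) -> str:
    """Function under analysis: three distinct paths."""
    if x > 10:
        if x % 2 == 0:
            return "big-even"
        return "big-odd"
    return "small"


# Path conditions written by hand: one predicate per path through classify()
PATH_CONDITIONS = {
    "big-even": lambda x: x > 10 and x % 2 == 0,
    "big-odd": lambda x: x > 10 and x % 2 != 0,
    "small": lambda x: x <= 10,
}


def find_witnesses(conditions, candidates):
    """Pick the first candidate input that satisfies each path condition."""
    witnesses = {}
    for label, condition in conditions.items():
        for value in candidates:
            if condition(value):
                witnesses[label] = value
                break
    return witnesses


witnesses = find_witnesses(PATH_CONDITIONS, range(-5, 20))
```

Each witness is an input that drives execution down exactly one path, which is the raw material for generated test cases.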

Tiers & Licensing

What's included in each tier?

Feature	Community	Pro	Enterprise
Analysis tools	✅	✅	✅
Extraction tools	✅	✅	✅
Basic security	✅	✅	✅
Symbolic execution	✅ Limited	✅	✅
Cross-file security	✅ Limited	✅	✅
Compliance checking	✅ Basic	✅	✅

How do licenses work?

Pro and Enterprise licenses are JWT files. Point Code Scalpel at the license file via an environment variable:

export CODE_SCALPEL_LICENSE_PATH=/path/to/license.jwt

Security & Privacy

Does Code Scalpel send my code anywhere?

No. All analysis runs locally. Code Scalpel does not transmit code to external servers and works completely offline.

Is Code Scalpel safe to use in production?

Yes. Code Scalpel never executes analyzed code, creates backups before modifications, and validates syntax before writing.

Can I use Code Scalpel offline?

Yes. Analysis runs locally without network access, and license validation is performed offline via cryptographic JWT verification. Tools that query external services, such as scan_dependencies (OSV API), still need a connection.

Online checks: License revocation status is checked online every 24 hours. If your network is unavailable, a 48-hour grace period applies, so with a valid license you can work offline for up to 48 hours between checks.

Can I trust `scan_dependencies` results if my network is down?

No. If the OSV API is unreachable (for example, a timeout or an HTTP 500 error), the tool returns success=False with an error message.

Fail-closed design: In security tools, a network error is not a clean bill of health. We return an error to prevent false confidence rather than silently passing with zero findings.

For air-gapped environments: Consider local vulnerability databases or manual CVE tracking.
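
A caller can mirror the fail-closed behavior when interpreting results. The sketch below assumes the result is a dict; only the success field comes from the description above, while the error and findings keys and the interpret_scan helper are hypothetical.

```python
def interpret_scan(result: dict) -> str:
    """Map a scan result to 'unknown', 'vulnerable', or 'clean' (fail-closed)."""
    # A failed scan is never treated as clean, even though it has no findings
    if not result.get("success", False):
        return "unknown"
    # "findings" is an assumed key name for the list of reported vulnerabilities
    return "vulnerable" if result.get("findings") else "clean"


network_down = {"success": False, "error": "OSV API unreachable", "findings": []}
clean_scan = {"success": True, "findings": []}
```

Treating a missing success field as a failure keeps malformed or partial responses from passing a security gate.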

Does `security_scan` understand Flask/Express middleware?

Not automatically. The taint tracker only clears taint when it sees a call matching known sanitizers (or ones you add via config). If global sanitization happens in middleware that isn't invoked in the analyzed call graph, the data stays tainted when it reaches your controller.

Solution: Register middleware sanitizers in SANITIZER_PATTERNS config so the tool recognizes them as cleaning operations.

Tier Limits & Behavior

What happens if my license expires?

All tools include a 7-day grace period for expired licenses, managed by the validation/authentication pipeline during MCP server boot. After the grace period:

Pro/Enterprise tools revert to Community tier limits
You retain full access to Community tier features
No data loss or tool failures
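
The fallback rule can be sketched as follows. This is an illustration of the behavior described above, not the server's code; the effective_tier helper, the tier strings, and the choice to treat the last instant of the grace period as still licensed are assumptions.

```python
from datetime import datetime, timedelta, timezone

GRACE_PERIOD = timedelta(days=7)  # grace period stated in this FAQ


def effective_tier(licensed_tier: str, expires_at: datetime, now: datetime) -> str:
    """Return the tier whose limits apply at time `now`."""
    # Within the license term or the grace period: keep the licensed tier
    if now <= expires_at + GRACE_PERIOD:
        return licensed_tier
    # After the grace period: Community limits apply, tools keep working
    return "community"


expiry = datetime(2025, 1, 1, tzinfo=timezone.utc)
during_grace = effective_tier("pro", expiry, expiry + timedelta(days=3))
after_grace = effective_tier("pro", expiry, expiry + timedelta(days=8))
```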

What happens if I exceed my tier's file limit in a monorepo?

When a project exceeds the limit, files are selected lexicographically (alphabetically by path) after ignored directories are filtered out, and only files within the limit are included. For example, in a monorepo with 10,001 files on Pro tier (1,000 file limit):

All files are discovered and filtered (excluding .venv, node_modules, etc.)
Remaining files are sorted alphabetically by full path
The first 1,000 files after sorting are included
services/auth/ will be included before services/payments/

Why alphabetical? Deterministic behavior is more valuable than heuristic "importance" scoring. You get predictable, reproducible results across runs.
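
The selection steps above can be sketched in a few lines. The select_files helper and the IGNORED_DIRS set are assumptions for illustration; the real ignore list is longer than the two directories named here.

```python
from pathlib import PurePosixPath

# Directories named in this FAQ; the actual ignore list may contain more
IGNORED_DIRS = {".venv", "node_modules"}


def select_files(paths, limit):
    """Filter ignored directories, sort by full path, keep the first `limit`."""
    kept = [
        path
        for path in paths
        if not IGNORED_DIRS.intersection(PurePosixPath(path).parts)
    ]
    return sorted(kept)[:limit]


discovered = [
    "services/payments/api.py",
    "node_modules/lib/index.js",
    "services/auth/login.py",
    ".venv/lib/site.py",
    "apps/web/main.py",
]
selected = select_files(discovered, limit=2)
```

Because the result depends only on the set of paths, two runs over the same tree always select the same files.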

Can Community users access Pro features with a workaround?

No. The MCP server determines tier limits during validation/authentication at boot time. Pro/Enterprise feature limits are enforced server-side, making it impossible to access higher-tier features without a valid license. If no license (or an invalid license) is presented, the server defaults to Community features and limits automatically.

Tool Behavior Details

Why doesn't `get_call_graph` link my `x.save()` call to all classes with a `save()` method?

To prevent graph explosion, get_call_graph relies on best-effort type inference (advanced_resolution=True). If the type of x cannot be determined (no tracked assignment and no annotation), the call is left unlinked rather than connected by hundreds of false edges to every save() method in your codebase.

Design principle: We prefer "missing a link" over "creating 1,000 false links."
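
The trade-off can be shown with a deliberately small resolver built on Python's ast module. It is a sketch of the principle, not the algorithm behind get_call_graph: it ignores scopes, and the resolve_method_calls helper and sample classes are assumptions.

```python
import ast


def resolve_method_calls(source: str) -> dict:
    """Link `var.method()` calls to a class only when the variable's type is known."""
    tree = ast.parse(source)
    class_names = {n.name for n in ast.walk(tree) if isinstance(n, ast.ClassDef)}
    var_types = {}
    for node in ast.walk(tree):
        # Assignment tracking: x = ClassName(...)
        if isinstance(node, ast.Assign) and isinstance(node.value, ast.Call):
            func = node.value.func
            if isinstance(func, ast.Name) and func.id in class_names:
                for target in node.targets:
                    if isinstance(target, ast.Name):
                        var_types[target.id] = func.id
        # Annotation: def f(x: ClassName)
        elif isinstance(node, ast.arg) and isinstance(node.annotation, ast.Name):
            if node.annotation.id in class_names:
                var_types[node.arg] = node.annotation.id
    edges = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute):
            receiver = node.func.value
            if isinstance(receiver, ast.Name):
                owner = var_types.get(receiver.id)
                call = f"{receiver.id}.{node.func.attr}"
                # Unknown receiver type: leave the call unlinked (None)
                edges[call] = f"{owner}.{node.func.attr}" if owner else None
    return edges


SOURCE = '''\
class User:
    def save(self): ...

class Order:
    def save(self): ...

def handler(payload):
    u = User()
    u.save()
    payload.save()

def audit(order: Order):
    order.save()
'''

edges = resolve_method_calls(SOURCE)
```

The untyped payload.save() call stays unlinked instead of pointing at both User.save and Order.save.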

Does `rename_symbol` handle string literals?

No. Renaming operates only on AST NAME tokens (identifiers in code). Occurrences in string literals (e.g., logging.info("Calling function_x")) or comments are never modified. This avoids collateral damage: renaming a function called test will not rewrite a SQL query string such as "SELECT * FROM test".
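
The distinction between identifiers and text can be demonstrated with Python's tokenize module, which reports strings and comments as separate token types. This is a simplified sketch, not the rename_symbol implementation: it renames every matching NAME token regardless of scope, and the rename_identifier helper is an assumption.

```python
import io
import tokenize


def rename_identifier(source: str, old: str, new: str) -> str:
    """Rename NAME tokens equal to `old`; strings and comments are untouched."""
    lines = source.splitlines(keepends=True)
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    hits = [t for t in tokens if t.type == tokenize.NAME and t.string == old]
    # Apply edits from the end of the file backwards so columns stay valid
    for token in sorted(hits, key=lambda t: t.start, reverse=True):
        row, col = token.start
        line = lines[row - 1]
        lines[row - 1] = line[:col] + new + line[col + len(old):]
    return "".join(lines)


SOURCE = '''\
def test():
    query = "SELECT * FROM test"  # reads the test table
    return query


test()
'''

renamed = rename_identifier(SOURCE, "test", "run_check")
```

The definition and the call site are renamed, while the SQL string and the comment keep the word test.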

Can I filter out test files from `get_symbol_references` results?

Not automatically. The tool walks all *.py files under the root, skipping only common directories like .venv, node_modules, build, and dist. References in tests/ are included in the results, which are truncated to the first 100 hits with a warning.

Workaround: Post-filter results by checking if file_path contains tests/ or test_.
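
A post-filter along these lines would do it. The sketch assumes each reference is a dict with a file_path key, as mentioned in the workaround; the line key and the drop_test_references helper are hypothetical.

```python
def drop_test_references(references):
    """Keep only references whose path does not look like a test file."""
    return [
        ref
        for ref in references
        if "tests/" not in ref["file_path"] and "test_" not in ref["file_path"]
    ]


references = [
    {"file_path": "app/models.py", "line": 12},
    {"file_path": "tests/test_models.py", "line": 8},
    {"file_path": "app/test_helpers.py", "line": 3},
]
production_only = drop_test_references(references)
```

Note that the substring check is coarse: it also drops files such as app/test_helpers.py, so tighten the pattern if your project uses test_ in non-test file names.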

If I import `numpy as np`, will searching for `numpy.array` find `np.array` usages?

No. Symbol search is string-based only: it matches AST node names equal to symbol_name and does not resolve import aliases, so a search for numpy.array will not match np.array.

Options:

Search for array (broader, may include false positives)
Search for np.array if you know the alias
Run multiple searches for common aliases
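
If you would rather not guess aliases, you can collect them from the file's imports first and search for each resulting name. The sketch below uses the standard ast module; the alias_search_terms helper is an assumption and is not part of Code Scalpel.

```python
import ast


def alias_search_terms(source: str, module: str, attr: str) -> list:
    """List the names under which `module.attr` may appear in `source`."""
    terms = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # import numpy / import numpy as np
            for alias in node.names:
                if alias.name == module:
                    terms.add(f"{alias.asname or alias.name}.{attr}")
        elif isinstance(node, ast.ImportFrom) and node.module == module:
            # from numpy import array / from numpy import array as arr
            for alias in node.names:
                if alias.name == attr:
                    terms.add(alias.asname or alias.name)
    return sorted(terms)


SOURCE = "import numpy as np\nfrom numpy import array as arr\n"
terms = alias_search_terms(SOURCE, "numpy", "array")
```

Each returned term can then be passed as symbol_name in a separate search.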

Refactoring & Testing

If I ask to generate tests twice for the same function, will the output be identical?

Yes, bit-identical. The generator uses no randomness, UUIDs, or timestamps. Deterministic symbolic execution produces sequential test IDs derived from path IDs with names like test_{function_name}_path_{path_id}.

Critical for CI/CD: This enables caching and regression testing in automated pipelines.
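
The naming scheme itself is easy to reproduce. The sketch below only illustrates why names derived from path IDs are stable across runs; the build_test_names helper is an assumption, not the generator's code.

```python
def build_test_names(function_name: str, path_ids) -> list:
    """Derive test names from path IDs, with no randomness or timestamps."""
    return [f"test_{function_name}_path_{path_id}" for path_id in sorted(path_ids)]


first_run = build_test_names("divide", [2, 0, 1])
second_run = build_test_names("divide", [0, 1, 2])
```

Since the output depends only on the function name and its path IDs, a CI cache keyed on the generated file stays valid until the analyzed code changes.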

Does `simulate_refactor` guarantee behavioral equivalence?

No. The tool checks syntax validity, security issues (taint analysis), and structural changes (added/removed functions, classes, imports). It does not verify behavioral equivalence (for example, it will not flag a change from uuid.uuid4() to uuid.uuid1()), run your tests, or reason about non-deterministic functions.

Why? Behavioral equivalence requires test execution or full symbolic execution, which is out of scope for a fast simulation tool.
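
The uuid example shows the gap clearly: a structural comparison sees no difference between the two versions. The sketch below approximates a structural check with Python's ast module; the structural_summary helper is an assumption and is much simpler than what simulate_refactor does.

```python
import ast


def structural_summary(source: str) -> dict:
    """Summarize functions, classes, and imports defined in `source`."""
    tree = ast.parse(source)  # raises SyntaxError for invalid code
    nodes = list(ast.walk(tree))
    imports = []
    for node in nodes:
        if isinstance(node, ast.Import):
            imports.extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            imports.append(node.module or "")
    return {
        "functions": sorted(
            n.name
            for n in nodes
            if isinstance(n, (ast.FunctionDef, ast.AsyncFunctionDef))
        ),
        "classes": sorted(n.name for n in nodes if isinstance(n, ast.ClassDef)),
        "imports": sorted(imports),
    }


BEFORE = "import uuid\n\ndef new_id():\n    return str(uuid.uuid4())\n"
AFTER = "import uuid\n\ndef new_id():\n    return str(uuid.uuid1())\n"

same_structure = structural_summary(BEFORE) == structural_summary(AFTER)
```

Both versions parse and share the same functions and imports, yet uuid1() and uuid4() generate identifiers differently, so only running tests would reveal the behavioral change.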

Docker & Deployment

How does `validate_paths` handle Docker volume mappings?

The tool returns container-resolved paths, not original host paths. When given a Windows host path like C:\Users\Project, the resolver tries mount translations:

/mnt/c/Users/Project (WSL2)
/c/Users/Project (Git Bash)
Environment-driven mappings
Workspace root detection

If a translation resolves, the container path appears in the accessible list. If not, the original path remains in the inaccessible list, along with Docker mounting suggestions.
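
The first two translations can be sketched with pathlib. This only generates candidate paths; it does not check whether they exist in the container, and the candidate_container_paths helper is an assumption rather than the resolver's code.

```python
from pathlib import PureWindowsPath


def candidate_container_paths(host_path: str) -> list:
    """Translate a Windows host path into WSL2-style and Git Bash-style candidates."""
    path = PureWindowsPath(host_path)
    drive = path.drive.rstrip(":").lower()
    if not drive:
        # No drive letter (relative path): nothing to translate
        return []
    rest = "/".join(path.parts[1:])
    return [f"/mnt/{drive}/{rest}", f"/{drive}/{rest}"]


candidates = candidate_container_paths(r"C:\Users\Project")
```

A resolver would then test each candidate for existence inside the container and report the first match.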

Does the MCP server work in environments without `.git` folders?

Yes. Git-dependent features (like ownership tracking in get_project_map) gracefully degrade. The tool runs git rev-parse --is-inside-work-tree with a 5-second timeout. If it's not a repo or Git isn't installed:

Returns owner="unknown" with reason "Not a git repository" or "Git not installed"
Skips blame/log calls
No hanging or crashes
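
The detection step can be approximated with subprocess. The command, the 5-second timeout, and the two reason strings come from the description above; the detect_git helper, its return shape, and the "Git timed out" reason are assumptions.

```python
import subprocess


def detect_git(root: str):
    """Return None inside a Git work tree, else an owner='unknown' fallback."""
    try:
        proc = subprocess.run(
            ["git", "rev-parse", "--is-inside-work-tree"],
            cwd=root,
            capture_output=True,
            text=True,
            timeout=5,  # never hang on a slow or broken Git setup
        )
    except FileNotFoundError:
        # Raised when the git executable is missing (or `root` does not exist)
        return {"owner": "unknown", "reason": "Git not installed"}
    except subprocess.TimeoutExpired:
        # Assumed reason string; the FAQ does not say what a timeout reports
        return {"owner": "unknown", "reason": "Git timed out"}
    if proc.returncode != 0 or proc.stdout.strip() != "true":
        return {"owner": "unknown", "reason": "Not a git repository"}
    return None
```

When a fallback is returned, the caller can skip blame and log calls entirely and still produce a project map.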
