Known Limitations & Edge Cases

We believe in radical transparency. Every tool has trade-offs, and we document ours upfront so you can make informed decisions.

🎯 Why This Page Exists

Code Scalpel is built for precision, not magic. We document limitations because we believe users deserve to know exactly what our tools can and cannot do. This page represents our commitment to engineering honesty over marketing hype.

Tool-Specific Limitations

cross_file_security_scan

"Middleware Blindness"

Global middleware sanitizers (e.g., Express body parsers, Flask request validators) are invisible to the taint tracker unless explicitly registered in your configuration.

Why: The tool analyzes call graphs from entry points. If sanitization happens in framework middleware that is not invoked in the analyzed code path, the tracker never sees it, so the data is still treated as tainted when it reaches your controller and may be reported as a false positive.

Workaround: Register middleware sanitizers in SANITIZER_PATTERNS config, or ensure sanitizer calls appear in the analyzed call graph.
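The second option, making the sanitizer visible in the analyzed call graph, can look like the sketch below. It assumes a hypothetical handler named render_comment and uses the standard library's html.escape as the sanitizer. Whether a given function is recognized as a sanitizer still depends on your configuration.

```python
import html


def sanitize_comment(raw: str) -> str:
    """Escape HTML metacharacters so the value is safe to embed in markup."""
    return html.escape(raw, quote=True)


def render_comment(raw_comment: str) -> str:
    """Hypothetical controller: the sanitizer call sits in this code path,
    so it does not rely on framework middleware being analyzed."""
    safe = sanitize_comment(raw_comment)
    return f"<p>{safe}</p>"
```

The cost is some duplication if the middleware already sanitizes the same field, so this fits best on the few handlers where taint findings matter most.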

High Impact

get_call_graph

"Duck Typing Blindness"

To prevent graph explosion, the tool does not link an x.save() call to every save() method in your codebase.

Why: In Python's duck-typed world, linking x.save() to all classes with a save() method would create thousands of false edges. We prefer "missing a link" over "creating 1,000 false links."

Behavior: With advanced_resolution=True, we track simple type assignments. Without type information, the call remains unlinked.
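The difference between a resolvable and an unresolvable call can be sketched as follows. The class and function names are hypothetical, and exactly which assignment patterns advanced_resolution=True recognizes is an assumption here, not a documented guarantee.

```python
class Invoice:
    def save(self) -> str:
        return "invoice saved"


class Draft:
    def save(self) -> str:
        return "draft saved"


def persist_known() -> str:
    # Simple assignment from a constructor: the receiver type is evident,
    # so a resolver that tracks assignments could link this to Invoice.save.
    x = Invoice()
    return x.save()


def persist_unknown(x) -> str:
    # No annotation and no local assignment: x could be an Invoice, a Draft,
    # or anything else with save(), so the call is expected to stay unlinked.
    return x.save()
```

If a missing edge matters for your analysis, keeping the construction and the call in the same function, as in persist_known, is the most likely way to make it resolvable.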

Medium Impact

crawl_project

"Stale Ghosts"

Incremental crawl mode does not detect file deletions. Deleted files vanish from results but may linger in the cache until a full re-crawl.

Why: Incremental mode optimizes for speed by only re-scanning changed files. Deletion detection would require full directory traversal, negating the performance benefit.

Workaround: Periodically run full crawls (without incremental_crawl) to clean up stale cache entries.
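Between full crawls, stale entries can also be spotted from outside the tool. The sketch below assumes you can obtain the list of file paths the cache currently holds. The cache format is not described on this page, so cached_paths is a hypothetical input.

```python
from pathlib import Path


def find_stale_entries(cached_paths, root):
    """Return cached paths (relative to root) whose files no longer exist."""
    root = Path(root)
    return sorted(p for p in cached_paths if not (root / p).is_file())
```

A non-empty result is a signal that a full crawl is due.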

Low Impact

generate_unit_tests

"Unmockable Logic"

Generated tests for code relying on C-extensions, hardware, or complex external state (databases) will likely fail at runtime because the tool does not auto-mock these dependencies.

Why: Symbolic execution operates on pure Python AST. It cannot model numpy arrays, torch tensors, or hardware I/O.

Behavior: Tests contain concrete input values but call the actual function. No mocking is inserted, so any mocks must be added by hand.
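Adding the mock by hand is usually a small change. A minimal sketch, assuming a hypothetical Sensor class that stands in for a hardware-backed dependency, using unittest.mock from the standard library:

```python
from unittest import mock


class Sensor:
    """Stand-in for a hardware-backed dependency."""

    def read(self) -> float:
        raise RuntimeError("no sensor attached")


def is_overheating(sensor: Sensor, threshold: float) -> bool:
    return sensor.read() > threshold


def test_is_overheating_above_threshold():
    # Added by hand: replace the hardware call with a fixed reading so the
    # concrete input (80.0) can be checked without real hardware.
    with mock.patch.object(Sensor, "read", return_value=90.0):
        assert is_overheating(Sensor(), 80.0)
```

Without the patch, the same call raises RuntimeError, which is the kind of runtime failure an unmodified generated test would hit.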

Medium Impact

Security & Safety Edge Cases

security_scan

Dead Code Vulnerabilities

Vulnerabilities in never-called functions are flagged with the same severity as reachable code.

Why: Static analysis cannot guarantee a function is truly "dead" (reflection, dynamic imports, future changes). We prioritize completeness over noise reduction.

Use Case: Useful for legacy code audits, but may create noise in actively maintained codebases.

Low Impact

scan_dependencies

Fail-Closed on API Timeouts

If the OSV API is unreachable (for example, a timeout or an HTTP 500 response), the tool returns success=False with an error message.

Why: In security tools, a network error is not a clean bill of health. We fail-closed to prevent false confidence.

Note: Requires outbound connectivity to OSV API. For air-gapped environments, consider local vulnerability databases.
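On the consuming side, a CI gate should treat success=False as a failed check, not as an empty result. The sketch below assumes the result is available as a dictionary. Only the success flag and the error message come from this page. The "error" and "vulnerabilities" key names are hypothetical.

```python
def dependency_gate(result: dict) -> int:
    """Map a scan result to a CI exit code: 0 passes, non-zero blocks."""
    if not result.get("success", False):
        # The scan did not complete (e.g. OSV unreachable): treat this as a
        # failure, never as "no vulnerabilities found".
        print(f"scan incomplete: {result.get('error', 'unknown error')}")
        return 2
    if result.get("vulnerabilities"):
        return 1
    return 0
```

Using a distinct exit code for "scan incomplete" lets a pipeline retry network failures without hiding real findings.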

High Impact

type_evaporation_scan

Generics with any

The scan cannot detect the any type when it appears inside generic type arguments (e.g., List<any>, Promise<any>).

Why: The detector inspects "as" type assertions and call patterns but does not traverse generic type arguments.

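Workaround: Manually review generic type declarations. TypeScript's noImplicitAny (part of strict mode) catches implicit any only, so an explicitly written Promise<any> needs a lint rule such as @typescript-eslint/no-explicit-any.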
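To narrow down what to review by hand, a rough text search can list candidate lines. This is a sketch and not part of the tool: it is regex-based, so it also matches inside comments and string literals and does not understand TypeScript syntax.

```python
import re

# Matches an identifier followed by <...> whose innermost argument list
# contains the bare word `any`, e.g. Promise<any> or Map<string, any>.
GENERIC_ANY = re.compile(r"\b\w+<[^<>]*\bany\b[^<>]*>")


def find_generic_any(source: str):
    """Return (line_number, matched_text) pairs for manual review."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in GENERIC_ANY.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits
```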

Medium Impact

unified_sink_detect

Simple Obfuscation Evasion

String splitting like "ex" + "ec" instead of "exec" evades detection.

Why: Matching is literal (AST call names for Python, substring search for others). Deobfuscation requires symbolic execution, which is not applied at this layer.

Note: Python comments are ignored because matching is AST-based. JavaScript/TypeScript comments may trigger false positives because substring search also matches text inside comments.
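The effect of literal matching on Python code can be reproduced with a few lines of the standard ast module. This is a simplified stand-in for illustration, not the tool's implementation. The sample strings are only parsed, never executed.

```python
import ast


def find_exec_calls(source: str):
    """Return line numbers of calls whose callee is the bare name `exec`."""
    tree = ast.parse(source)
    return [
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id == "exec"
    ]


direct = "exec(user_input)"
obfuscated = 'getattr(__builtins__, "ex" + "ec")(user_input)'
commented = "# exec(user_input)\nprint('ok')"

direct_hits = find_exec_calls(direct)          # the literal call is found
obfuscated_hits = find_exec_calls(obfuscated)  # the split string is missed
commented_hits = find_exec_calls(commented)    # comments never reach the AST
```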

Low Impact

Performance Boundaries

validate_paths

Hanging on Unresponsive Mounts

The tool can hang indefinitely on unresponsive NFS/SMB mounts because it has no timeout mechanism.

Why: The tool calls Path.exists() synchronously. If the OS-level stat call hangs, the tool waits indefinitely.

Workaround: Ensure mount points are responsive before validation, or use OS-level timeouts (for NFS, mount -o soft,timeo=5; timeo is measured in tenths of a second).
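If you pre-check paths in your own code before handing them to the tool, a timeout can be approximated with a daemon thread. This is a sketch of a caller-side guard, not a feature of validate_paths, and the function name is hypothetical.

```python
import threading
from pathlib import Path


def exists_with_timeout(path, timeout: float = 2.0):
    """Return True/False if the check finishes in time, or None on timeout."""
    result = {}

    def probe():
        try:
            result["exists"] = Path(path).exists()
        except OSError:
            result["exists"] = False

    # Daemon thread: if the stat call never returns, the thread stays blocked
    # but does not keep the interpreter alive at exit.
    worker = threading.Thread(target=probe, daemon=True)
    worker.start()
    worker.join(timeout)
    if worker.is_alive():
        return None
    return result["exists"]
```

A None result means the mount did not answer in time. The blocked thread cannot be cancelled from Python, so this limits how long the caller waits without freeing the underlying OS call.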

Medium Impact

symbolic_execute

10 Iteration Loop Limit

Loops are explored for at most 10 iterations by default. Paths that require more than 10 iterations are not fully explored.

Why: Symbolic execution can generate infinite paths for unbounded loops. The "fuel" limit prevents hanging on while True: constructs.

Configurable: max_loop_iterations can be adjusted, but higher values increase execution time exponentially.
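A small function shows what "not fully explored" means in practice. In this hypothetical example, the True branch is only reachable after 15 loop iterations, so an explorer limited to 10 would be expected to report only the False path.

```python
def exceeds_budget(n: int) -> bool:
    total = 0
    for i in range(n):
        total += i
    # Reaching this branch needs total > 100, i.e. at least 15 iterations
    # (0 + 1 + ... + 14 = 105), which is beyond a 10-iteration limit.
    if total > 100:
        return True
    return False
```

Raising max_loop_iterations to 15 or more should make the branch reachable, at the execution-time cost described above.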

Medium Impact

analyze_code

File Size Limits

Enforces tier-based byte limits with a hard cap of 110MB per file.

Why: Even a 10MB single-line minified file can cause parser slowdowns or memory issues.

Behavior: Files exceeding the limit return an error before parsing begins. No truncation or chunking.

Low Impact

Design Trade-Offs

Lexicographic File Selection (Pro Tier Limits)

When Pro tier file limits are exceeded (e.g., 1,000 files in get_project_map), files are selected lexicographically (alphabetically by path), not by "importance."

Why: Deterministic behavior is more valuable than heuristic "importance" scoring. Alphabetical selection is predictable and reproducible.

Result: In a monorepo, services/auth/ will be included before services/payments/.
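The selection rule amounts to a sort followed by a cut. A minimal sketch with hypothetical paths and a limit of 2 instead of 1,000:

```python
def select_files(paths, limit):
    """Keep the first `limit` paths in lexicographic order."""
    return sorted(paths)[:limit]


paths = [
    "services/payments/api.py",
    "services/auth/login.py",
    "services/auth/tokens.py",
    "services/search/index.py",
]
selected = select_files(paths, limit=2)
# selected == ["services/auth/login.py", "services/auth/tokens.py"]
```

Python's sorted compares strings by code point, so uppercase letters sort before lowercase ones. Whether the tool orders mixed-case paths the same way is an assumption. If a directory must be analyzed, pointing the tool at that subtree is more reliable than depending on its position in the ordering.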

Medium Impact

No Behavioral Equivalence Checking

simulate_refactor checks syntax, structure, and security, but not behavioral equivalence.

Example: Refactoring uuid.uuid4() to uuid.uuid1() returns SAFE because no syntax/security issues are detected, even though the behavior differs: uuid4() is random, while uuid1() is derived from the timestamp and the host's node ID.

Why: Behavioral equivalence requires test execution or full symbolic execution, which is out of scope for a fast simulation tool.
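The difference in the uuid example is easy to confirm by running the code, which is the kind of check simulate_refactor does not perform:

```python
import uuid

random_id = uuid.uuid4()  # built from random bits
time_id = uuid.uuid1()    # built from a timestamp and a node identifier

# Both are valid UUID objects, so a syntax or structure check sees no
# problem, but the version field (and what the value reveals) differs.
versions = (random_id.version, time_id.version)
```

A unit test that asserts on the version, or on any other property the caller relies on, covers the gap for refactors like this one.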

Medium Impact

What We're Working On

🚧 Active Improvements

rename_symbol Keyword Validation: Moving identifier validation to the top of the function to prevent partial application on invalid names (e.g., renaming to reserved keywords like yield).
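The validate-first ordering can be sketched with the standard keyword module. The function names are hypothetical and the text replacement is a naive placeholder for the real rename logic.

```python
import keyword


def is_valid_new_name(name: str) -> bool:
    """True if `name` is a legal, non-reserved Python identifier."""
    return name.isidentifier() and not keyword.iskeyword(name)


def rename_all(sources: dict, old: str, new: str) -> dict:
    """Hypothetical rename: validate before touching any file."""
    if not is_valid_new_name(new):
        # Rejecting here means no file is modified on an invalid name.
        raise ValueError(f"invalid target name: {new!r}")
    # Naive textual replace stands in for the real, scope-aware rename.
    return {path: text.replace(old, new) for path, text in sources.items()}
```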

scan_dependencies Fail-Closed: Ensuring API timeouts return success=False instead of success=True with a warning note.

Questions about a specific limitation? Ask in our GitHub Discussions

Last Updated: January 2026