Your First Analysis¶
Now that you have Code Scalpel installed, let's run your first analysis and understand what the tools tell you.
What We'll Cover¶
- Running `analyze_code` to understand code structure
- Interpreting the JSON response
- Using `security_scan` to find vulnerabilities
- Understanding tier limitations
Sample Code¶
We'll analyze this Python file throughout this guide:
import os
import sqlite3
from typing import Optional


def get_user(user_id: str) -> Optional[dict]:
    """Fetch user from database."""
    conn = sqlite3.connect("users.db")
    cursor = conn.cursor()

    # WARNING: SQL Injection vulnerability!
    query = f"SELECT * FROM users WHERE id = '{user_id}'"
    cursor.execute(query)

    result = cursor.fetchone()
    conn.close()
    return result


class UserService:
    """Service for user operations."""

    def __init__(self, db_path: str = "users.db"):
        self.db_path = db_path

    def create_user(self, name: str, email: str) -> int:
        """Create a new user."""
        conn = sqlite3.connect(self.db_path)
        cursor = conn.cursor()
        cursor.execute(
            "INSERT INTO users (name, email) VALUES (?, ?)",
            (name, email)
        )
        conn.commit()
        user_id = cursor.lastrowid
        conn.close()
        return user_id
Step 1: Analyze Code Structure¶
Ask your AI assistant:
Prompt
"Use Code Scalpel's analyze_code tool to analyze this code structure"
(paste the code above)
Understanding the Response¶
Code Scalpel returns a structured JSON response:
{
  "data": {
    "functions": [
      {
        "name": "get_user",
        "line_start": 6,
        "line_end": 18,
        "parameters": [
          {"name": "user_id", "type": "str"}
        ],
        "return_type": "Optional[dict]",
        "docstring": "Fetch user from database.",
        "complexity": 2
      }
    ],
    "classes": [
      {
        "name": "UserService",
        "line_start": 20,
        "line_end": 37,
        "docstring": "Service for user operations.",
        "methods": [
          {"name": "__init__", "line": 24},
          {"name": "create_user", "line": 27}
        ],
        "bases": []
      }
    ],
    "imports": [
      {"module": "os", "line": 1},
      {"module": "sqlite3", "line": 2},
      {"module": "typing", "names": ["Optional"], "line": 3}
    ]
  },
  "tier_applied": "community",
  "duration_ms": 45
}
What Each Field Means¶
| Field | Description |
|---|---|
| `functions` | All top-level functions with signatures and location |
| `classes` | All classes with methods and inheritance info |
| `imports` | All import statements for dependency tracking |
| `tier_applied` | Which tier was used (affects limits) |
| `duration_ms` | How long the analysis took, in milliseconds |
Why This Matters¶
Unlike traditional code search, this gives the AI assistant:
- Exact line numbers → No guessing where code is
- Type information → Accurate parameter and return types
- Structure hierarchy → Methods belong to their classes
- Complexity metrics → Identify complex functions
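If you want to consume this response from a script rather than through the assistant, the structure above maps directly onto plain dictionaries. The following is a minimal sketch that assumes the response shape shown in this guide; `list_symbols` is a hypothetical helper, not part of Code Scalpel.

```python
import json

# Trimmed copy of the sample analyze_code response shown earlier in this guide.
RESPONSE = """
{
  "data": {
    "functions": [
      {"name": "get_user", "line_start": 6, "line_end": 18, "complexity": 2}
    ],
    "classes": [
      {
        "name": "UserService",
        "line_start": 20,
        "line_end": 37,
        "methods": [
          {"name": "__init__", "line": 24},
          {"name": "create_user", "line": 27}
        ]
      }
    ]
  },
  "tier_applied": "community"
}
"""


def list_symbols(response: dict) -> list[str]:
    """Return one 'name @ location' entry per function, class, and method."""
    data = response["data"]
    lines = []
    for fn in data.get("functions", []):
        lines.append(f"{fn['name']} @ {fn['line_start']}-{fn['line_end']}")
    for cls in data.get("classes", []):
        lines.append(f"{cls['name']} @ {cls['line_start']}-{cls['line_end']}")
        for method in cls.get("methods", []):
            # Methods are nested under their class, so qualify the name.
            lines.append(f"{cls['name']}.{method['name']} @ {method['line']}")
    return lines


symbols = list_symbols(json.loads(RESPONSE))
print("\n".join(symbols))
```

Because methods are nested under their class in the response, the hierarchy is preserved without any extra parsing.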
Step 2: Security Scanning¶
Now let's find vulnerabilities. Ask your AI assistant:
Prompt
"Use Code Scalpel's security_scan to check this code for vulnerabilities"
Understanding Security Results¶
{
  "data": {
    "vulnerabilities": [
      {
        "type": "SQL_INJECTION",
        "severity": "HIGH",
        "confidence": 0.95,
        "line": 12,
        "code": "query = f\"SELECT * FROM users WHERE id = '{user_id}'\"",
        "message": "User input 'user_id' flows into SQL query without sanitization",
        "cwe": "CWE-89",
        "remediation": "Use parameterized queries: cursor.execute('SELECT * FROM users WHERE id = ?', (user_id,))"
      }
    ],
    "summary": {
      "total": 1,
      "high": 1,
      "medium": 0,
      "low": 0
    },
    "taint_sources": [
      {"variable": "user_id", "line": 6, "type": "function_parameter"}
    ],
    "taint_sinks": [
      {"function": "cursor.execute", "line": 13, "sink_type": "sql_execution"}
    ]
  },
  "tier_applied": "community",
  "duration_ms": 78
}
Key Security Concepts¶
Taint Analysis tracks how untrusted data flows through your code:
- Taint Sources: Where untrusted data enters (function parameters, user input, file reads)
- Taint Sinks: Where data becomes dangerous (SQL queries, shell commands, file writes)
- Taint Flow: The path data takes from source to sink
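In our example, `user_id` enters as a function parameter (line 6), is interpolated into the SQL string (line 12), and reaches the `cursor.execute` sink (line 13).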
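The source-to-sink idea can be illustrated with a toy tracker built on Python's standard `ast` module. This is only a sketch of the concept under simplifying assumptions (parameters are the only sources, `execute` is the only sink, only top-level statements are followed); it is not how Code Scalpel implements its analysis, and `find_taint_flows` is a hypothetical name.

```python
import ast

SOURCE = '''
def get_user(user_id):
    query = f"SELECT * FROM users WHERE id = '{user_id}'"
    cursor.execute(query)

def create_user(name):
    cursor.execute("INSERT INTO users (name) VALUES (?)", (name,))
'''

SINKS = {"execute"}  # method names treated as SQL sinks (assumption)


def names_in(node: ast.AST) -> set[str]:
    """Collect every variable name that appears inside an expression."""
    return {n.id for n in ast.walk(node) if isinstance(n, ast.Name)}


def find_taint_flows(source: str) -> list[tuple[str, int]]:
    """Return (function, line) pairs where tainted data forms the SQL text."""
    flows = []
    for func in ast.walk(ast.parse(source)):
        if not isinstance(func, ast.FunctionDef):
            continue
        # Source: every parameter starts out tainted.
        tainted = {arg.arg for arg in func.args.args}
        for stmt in func.body:
            # Flow: assigning from a tainted expression taints the target.
            if isinstance(stmt, ast.Assign) and names_in(stmt.value) & tainted:
                for target in stmt.targets:
                    if isinstance(target, ast.Name):
                        tainted.add(target.id)
            # Sink: a tainted value passed as the first argument of execute().
            for call in ast.walk(stmt):
                if (
                    isinstance(call, ast.Call)
                    and isinstance(call.func, ast.Attribute)
                    and call.func.attr in SINKS
                    and call.args
                    and names_in(call.args[0]) & tainted
                ):
                    flows.append((func.name, call.lineno))
    return flows


flows = find_taint_flows(SOURCE)
print(flows)
```

Only `get_user` is reported: in `create_user` the SQL text is a constant and the parameter travels separately as a bound value, so no tainted data reaches the query string itself.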
Vulnerability Details¶
| Field | Description |
|---|---|
| `type` | Vulnerability category (SQL_INJECTION, XSS, etc.) |
| `severity` | Risk level (HIGH, MEDIUM, LOW) |
| `confidence` | How certain the detection is (0.0-1.0) |
| `cwe` | Common Weakness Enumeration ID |
| `remediation` | How to fix the issue |
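Applying the suggested remediation to `get_user` might look like the sketch below. To keep the example self-contained it takes an open connection as an argument and runs against an in-memory database with a made-up `users` schema; both are assumptions for illustration, not part of the original sample file.

```python
import sqlite3
from typing import Optional


def get_user(conn: sqlite3.Connection, user_id: str) -> Optional[tuple]:
    """Fetch a user row, passing user_id as a bound parameter."""
    # The ? placeholder keeps user_id out of the SQL text entirely.
    cursor = conn.execute("SELECT * FROM users WHERE id = ?", (user_id,))
    return cursor.fetchone()


# Demonstration against an in-memory database (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id TEXT PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users VALUES ('1', 'Ada')")

found = get_user(conn, "1")
injected = get_user(conn, "1' OR '1'='1")
print(found, injected)
conn.close()
```

With a bound parameter, the injection attempt is compared literally against the `id` column and matches nothing, whereas the f-string version would have turned it into an always-true condition.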
Step 3: Extract Specific Code¶
Want to extract just one function? Ask:
Prompt
"Use extract_code to get just the get_user function"
Response¶
{
  "data": {
    "source": "def get_user(user_id: str) -> Optional[dict]:\n    \"\"\"Fetch user from database.\"\"\"\n    conn = sqlite3.connect(\"users.db\")\n    cursor = conn.cursor()\n\n    # WARNING: SQL Injection vulnerability!\n    query = f\"SELECT * FROM users WHERE id = '{user_id}'\"\n    cursor.execute(query)\n\n    result = cursor.fetchone()\n    conn.close()\n    return result",
    "line_start": 6,
    "line_end": 18,
    "dependencies": ["sqlite3", "typing.Optional"],
    "token_estimate": 156
  },
  "tier_applied": "community",
  "duration_ms": 23
}
Why This Matters¶
Token efficiency: instead of reading an entire file (which can run to roughly 10,000 tokens for a large module), the AI processes only the code it needs (about 156 tokens here). This:
- Saves API costs
- Reduces context window usage
- Focuses on relevant code
- Reduces the risk of hallucinating code that doesn't exist
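To get a feel for the saving, the sketch below pulls one function out of a small module with Python's standard `ast` module and compares rough sizes. It illustrates the idea only and is not Code Scalpel's extraction logic; `extract_function`, `rough_tokens`, and the four-characters-per-token heuristic are all assumptions.

```python
import ast

FILE_SOURCE = '''import sqlite3


def get_user(user_id):
    conn = sqlite3.connect("users.db")
    return conn.execute("SELECT * FROM users WHERE id = ?", (user_id,)).fetchone()


def unrelated_helper():
    return [n * n for n in range(1000)]
'''


def extract_function(source: str, name: str) -> str:
    """Return the exact source text of the top-level function `name`."""
    for node in ast.parse(source).body:
        if isinstance(node, ast.FunctionDef) and node.name == name:
            segment = ast.get_source_segment(source, node)
            if segment is not None:
                return segment
    raise LookupError(f"function {name!r} not found")


def rough_tokens(text: str) -> int:
    """Crude estimate: about four characters per token (assumption)."""
    return max(1, len(text) // 4)


snippet = extract_function(FILE_SOURCE, "get_user")
print(rough_tokens(snippet), "of", rough_tokens(FILE_SOURCE), "tokens")
```

The gap grows with file size: the extracted function stays the same length no matter how much unrelated code surrounds it.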
Step 4: Understanding Tool Responses¶
All Code Scalpel tools return a consistent envelope:
{
  "data": { ... },               // Tool-specific results
  "tier_applied": "community",   // Which tier was used
  "duration_ms": 45,             // Processing time
  "error": null                  // Error message if failed
}
Successful Response¶
{
  "data": {
    "functions": [...],
    "classes": [...]
  },
  "tier_applied": "community",
  "duration_ms": 45
}
Error Response¶
{
  "data": null,
  "error": {
    "code": "FILE_NOT_FOUND",
    "message": "File '/path/to/missing.py' does not exist",
    "suggestions": ["Check the file path", "Use validate_paths first"]
  },
  "tier_applied": "community"
}
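Because every tool shares this envelope, client code can handle success and failure in one place. The following is a minimal sketch that assumes the envelope fields shown above; `unwrap` and `ToolError` are hypothetical names, not part of Code Scalpel.

```python
from typing import Any


class ToolError(Exception):
    """Raised when a tool response carries an error object."""

    def __init__(self, code: str, message: str, suggestions: list[str]):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.suggestions = suggestions


def unwrap(response: dict[str, Any]) -> Any:
    """Return response['data'], or raise ToolError if the call failed."""
    error = response.get("error")
    if error:
        raise ToolError(
            error.get("code", "UNKNOWN"),  # placeholder default (assumption)
            error.get("message", ""),
            error.get("suggestions", []),
        )
    return response["data"]


ok = {
    "data": {"functions": [], "classes": []},
    "tier_applied": "community",
    "duration_ms": 45,
}
failed = {
    "data": None,
    "error": {
        "code": "FILE_NOT_FOUND",
        "message": "File '/path/to/missing.py' does not exist",
        "suggestions": ["Check the file path", "Use validate_paths first"],
    },
    "tier_applied": "community",
}

print(unwrap(ok))
try:
    unwrap(failed)
except ToolError as exc:
    print(exc, exc.suggestions)
```

Using `response.get("error")` covers both a missing `error` key, as in the successful response, and an explicit `null`, as in the generic envelope.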
Common Patterns¶
Analyze → Extract → Modify¶
A typical workflow:
1. `analyze_code` - Understand what's in the file
2. `extract_code` - Get a specific function or class
3. `update_symbol` - Replace it with the fixed version
Security Workflow¶
1. `security_scan` - Find vulnerabilities in a file
2. `cross_file_security_scan` - Check whether issues span files
3. `extract_code` - Get the vulnerable code
4. `simulate_refactor` - Verify the fix doesn't break behavior