Putting It Together¶
Time: 15 minutes | Tools: All beginner tools | Tier: Community
Complete a real-world workflow combining analysis, extraction, updates, and security scanning.
What You'll Learn¶
- Combining multiple tools
- A real-world refactoring workflow
- Best practices for safe changes
The Scenario¶
You've inherited a codebase with a security vulnerability. Your task:
- Understand the code structure
- Find the vulnerability
- Extract the problematic function
- Fix it safely
- Verify the fix
The Code¶
Create user_service.py:
# user_service.py
import sqlite3
import hashlib
DATABASE = "app.db"
def get_connection():
"""Get database connection."""
return sqlite3.connect(DATABASE)
def hash_password(password):
"""Hash a password using MD5."""
return hashlib.md5(password.encode()).hexdigest()
def create_user(username, password):
"""Create a new user."""
conn = get_connection()
cursor = conn.cursor()
hashed = hash_password(password)
query = f"INSERT INTO users (username, password) VALUES ('{username}', '{hashed}')"
cursor.execute(query)
conn.commit()
conn.close()
def get_user(username):
"""Get user by username."""
conn = get_connection()
cursor = conn.cursor()
query = f"SELECT * FROM users WHERE username = '{username}'"
cursor.execute(query)
result = cursor.fetchone()
conn.close()
return result
def authenticate(username, password):
"""Authenticate a user."""
user = get_user(username)
if user is None:
return False
hashed = hash_password(password)
return user[2] == hashed # password is third column
Step 1: Analyze the Code¶
"Analyze user_service.py and summarize its structure"
Tool call:
Result summary:
| Type | Name | Lines | Complexity |
|---|---|---|---|
| Function | get_connection | 8-10 | 1 |
| Function | hash_password | 12-14 | 1 |
| Function | create_user | 16-25 | 1 |
| Function | get_user | 27-35 | 1 |
| Function | authenticate | 37-44 | 2 |
Now you know the structure.
Step 2: Security Scan¶
"Scan user_service.py for security vulnerabilities"
Tool call:
Vulnerabilities found:
| # | Type | Severity | Function | Line |
|---|---|---|---|---|
| 1 | SQL Injection | Critical | create_user | 22 |
| 2 | SQL Injection | Critical | get_user | 32 |
| 3 | Weak Hash | Medium | hash_password | 14 |
Three issues to fix!
Step 3: Extract Functions to Fix¶
"Extract the create_user function"
Tool call:
{
"tool": "extract_code",
"parameters": {
"file_path": "user_service.py",
"target_type": "function",
"target_name": "create_user"
}
}
Extracted code:
def create_user(username, password):
"""Create a new user."""
conn = get_connection()
cursor = conn.cursor()
hashed = hash_password(password)
query = f"INSERT INTO users (username, password) VALUES ('{username}', '{hashed}')"
cursor.execute(query)
conn.commit()
conn.close()
Step 4: Fix the SQL Injection¶
"Update create_user to use parameterized queries"
AI creates fixed version and calls:
{
"tool": "update_symbol",
"parameters": {
"file_path": "user_service.py",
"target_type": "function",
"target_name": "create_user",
"new_code": "def create_user(username, password):\n \"\"\"Create a new user.\"\"\"\n conn = get_connection()\n cursor = conn.cursor()\n \n hashed = hash_password(password)\n query = \"INSERT INTO users (username, password) VALUES (?, ?)\"\n cursor.execute(query, (username, hashed))\n conn.commit()\n conn.close()"
}
}
Result: ✅ Updated with backup at user_service.py.bak
Now fix get_user the same way:
"Also fix the get_user function"
Step 5: Fix the Weak Hash¶
"Update hash_password to use bcrypt instead of MD5"
Fixed version:
def hash_password(password):
"""Hash a password using bcrypt."""
import bcrypt
return bcrypt.hashpw(password.encode(), bcrypt.gensalt()).decode()
Wait! This changes behavior—existing passwords won't match. We need to handle migration. Let's create a safer version:
def hash_password(password, legacy=False):
"""Hash a password.
Args:
password: The password to hash.
legacy: If True, use MD5 for comparison with old hashes.
"""
if legacy:
import hashlib
return hashlib.md5(password.encode()).hexdigest()
import bcrypt
return bcrypt.hashpw(password.encode(), bcrypt.gensalt()).decode()
And update authenticate to handle both:
def authenticate(username, password):
"""Authenticate a user."""
user = get_user(username)
if user is None:
return False
stored_hash = user[2]
# Check if it's a legacy MD5 hash (32 chars)
if len(stored_hash) == 32:
return stored_hash == hash_password(password, legacy=True)
# Modern bcrypt comparison
import bcrypt
return bcrypt.checkpw(password.encode(), stored_hash.encode())
Step 6: Verify the Fixes¶
"Rescan user_service.py for security issues"
Result:
| Vulnerabilities | Before | After |
|---|---|---|
| SQL Injection | 2 | 0 |
| Weak Hash | 1 | 0* |
*Legacy support still uses MD5 but only for existing records.
The Complete Workflow¶
graph TD
A[analyze_code] --> B[security_scan]
B --> C{Issues found?}
C -->|Yes| D[extract_code]
D --> E[AI fixes code]
E --> F[update_symbol]
F --> G[security_scan]
G --> H{Verify fixed}
H -->|Yes| I[Done!]
H -->|No| D
C -->|No| I Best Practices¶
1. Always Analyze First¶
Understand the code before changing it.
2. Scan Before and After¶
Security scan before changes, then verify after.
3. Extract, Don't Guess¶
Use extract_code to get exact current code.
4. Keep Backups¶
Let Code Scalpel create backups (create_backup: true).
5. Small Changes¶
Update one function at a time, verify each change.
Final Code¶
After all fixes:
# user_service.py (fixed)
import sqlite3
DATABASE = "app.db"
def get_connection():
"""Get database connection."""
return sqlite3.connect(DATABASE)
def hash_password(password, legacy=False):
"""Hash a password.
Args:
password: The password to hash.
legacy: If True, use MD5 for comparison with old hashes.
"""
if legacy:
import hashlib
return hashlib.md5(password.encode()).hexdigest()
import bcrypt
return bcrypt.hashpw(password.encode(), bcrypt.gensalt()).decode()
def create_user(username, password):
"""Create a new user."""
conn = get_connection()
cursor = conn.cursor()
hashed = hash_password(password)
query = "INSERT INTO users (username, password) VALUES (?, ?)"
cursor.execute(query, (username, hashed))
conn.commit()
conn.close()
def get_user(username):
"""Get user by username."""
conn = get_connection()
cursor = conn.cursor()
query = "SELECT * FROM users WHERE username = ?"
cursor.execute(query, (username,))
result = cursor.fetchone()
conn.close()
return result
def authenticate(username, password):
"""Authenticate a user."""
user = get_user(username)
if user is None:
return False
stored_hash = user[2]
# Check if it's a legacy MD5 hash (32 chars)
if len(stored_hash) == 32:
return stored_hash == hash_password(password, legacy=True)
# Modern bcrypt comparison
import bcrypt
return bcrypt.checkpw(password.encode(), stored_hash.encode())
Congratulations!¶
You've completed the beginner tutorials. You can now:
- ✅ Analyze code structure
- ✅ Extract functions safely
- ✅ Update code without breaking things
- ✅ Find and fix security vulnerabilities
Next Steps¶
Continue to Intermediate Tutorials to learn:
- Multi-file analysis
- Call graph exploration
- Cross-file security scanning
- Advanced workflows