Symbolic Execution Tools

Symbolic execution tools use Z3 constraint solving to explore code paths, generate tests, and verify that refactors preserve behavior.

Tools in This Category

| Tool | Description | Tier |
| --- | --- | --- |
| symbolic_execute | Z3-based path exploration | Pro |
| generate_unit_tests | Auto-generate test cases | Pro |
| simulate_refactor | Verify refactors preserve behavior | Pro |

symbolic_execute

Explore all possible execution paths using Z3 constraint solving.

What It Does

symbolic_execute treats variables as symbolic values and explores all paths:

  • Path enumeration: All possible execution paths
  • Constraint solving: What inputs reach each path
  • Edge case discovery: Division by zero, None access, bounds
  • Test case generation: Concrete values for each path

When AI Agents Use This

  • Finding edge cases humans miss
  • Understanding conditional logic
  • Generating comprehensive tests
  • Debugging complex branching

Quick Reference

| Property | Value |
| --- | --- |
| Tier | Pro |
| Languages | Python |
| Token Cost | ~200-800 tokens |

Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| code | string | ✓ | Python code to analyze |
| max_paths | int | | Max paths to explore (default: 100) |
| max_depth | int | | Max recursion depth (default: 10) |

Example Usage

Prompt:

"What are all the execution paths through this function?"

Tool Call:

{
  "tool": "symbolic_execute",
  "parameters": {
    "code": "def classify(x):\n    if x < 0:\n        return 'negative'\n    elif x == 0:\n        return 'zero'\n    else:\n        return 'positive'"
  }
}

Prompt:

"Find edge cases in this validation function"

Tool Call:

{
  "tool": "symbolic_execute",
  "parameters": {
    "code": "def validate(age, income):\n    if age < 18:\n        return False\n    if age >= 65 and income < 30000:\n        return True\n    if income >= 50000:\n        return True\n    return False",
    "max_paths": 50
  }
}

Response Format

{
  "data": {
    "paths": [
      {
        "id": 1,
        "constraints": ["x < 0"],
        "result": "'negative'",
        "concrete_example": {"x": -5},
        "path_trace": ["line 2: if x < 0 → True", "line 3: return 'negative'"]
      },
      {
        "id": 2,
        "constraints": ["x == 0"],
        "result": "'zero'",
        "concrete_example": {"x": 0},
        "path_trace": ["line 2: if x < 0 → False", "line 4: elif x == 0 → True", "line 5: return 'zero'"]
      },
      {
        "id": 3,
        "constraints": ["x > 0"],
        "result": "'positive'",
        "concrete_example": {"x": 42},
        "path_trace": ["line 2: if x < 0 → False", "line 4: elif x == 0 → False", "line 7: return 'positive'"]
      }
    ],
    "total_paths": 3,
    "coverage": {
      "lines_covered": [2, 3, 4, 5, 6, 7],
      "branches_covered": ["2:T", "2:F", "4:T", "4:F"]
    },
    "warnings": []
  },
  "tier_applied": "pro",
  "duration_ms": 156
}

Supported Types

| Type | Support Level |
| --- | --- |
| int | Full |
| bool | Full |
| str | Basic (length, equality) |
| float | Basic (comparisons) |
| list | Limited (length, index) |
| dict | Not supported |

Tier Differences

| Feature | Community | Pro | Enterprise |
| --- | --- | --- | --- |
| Available | ✗ | ✓ | ✓ |
| Max paths | — | 100 | Unlimited |
| Max depth | — | 10 | Unlimited |
| String constraints | ✗ | | |
| Custom types | ✗ | | |

Full deep dive


generate_unit_tests

Automatically generate unit tests from symbolic execution.

What It Does

generate_unit_tests creates test cases for each execution path:

  • Path coverage: Test for every branch
  • Edge cases: Tests for boundary conditions
  • Pytest/unittest: Output in standard formats
  • Data-driven: Optionally generate parameterized tests
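As a rough illustration of what unittest-format, data-driven output might look like for the `divide` example used below (the target function is inlined here so the sketch is self-contained; the actual generated code may differ):

```python
# Illustrative shape of data-driven test output in unittest format.
# The function under test is inlined to keep the example runnable.
import unittest

def divide(a, b):
    if b == 0:
        raise ValueError('Cannot divide by zero')
    return a / b

class TestDivide(unittest.TestCase):
    def test_divide_paths(self):
        # One (inputs, expected) row per execution path
        cases = [(10, 2, 5.0), (-10, 2, -5.0), (7, 7, 1.0)]
        for a, b, expected in cases:
            with self.subTest(a=a, b=b):
                self.assertEqual(divide(a, b), expected)

    def test_divide_by_zero(self):
        # Edge case discovered from the `b == 0` branch
        with self.assertRaises(ValueError):
            divide(10, 0)
```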

When AI Agents Use This

  • Bootstrapping test suites
  • Ensuring branch coverage
  • Creating regression tests
  • Finding untested paths

Quick Reference

| Property | Value |
| --- | --- |
| Tier | Pro |
| Languages | Python |
| Token Cost | ~200-600 tokens |

Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| code | string | ✓* | Python code to test |
| file_path | string | ✓* | Path to file to test |
| function_name | string | | Specific function to test |
| framework | string | | "pytest" or "unittest" (default: "pytest") |
| data_driven | bool | | Generate parameterized tests (default: false) |
| crash_log | string | | Generate test for crash scenario |

*Provide either code or file_path.

Example Usage

Prompt:

"Generate tests for this function"

Tool Call:

{
  "tool": "generate_unit_tests",
  "parameters": {
    "code": "def divide(a, b):\n    if b == 0:\n        raise ValueError('Cannot divide by zero')\n    return a / b",
    "framework": "pytest"
  }
}

Prompt:

"Generate tests for the validate function in validators.py"

Tool Call:

{
  "tool": "generate_unit_tests",
  "parameters": {
    "file_path": "src/validators.py",
    "function_name": "validate",
    "data_driven": true
  }
}

Response Format

{
  "data": {
    "test_code": "import pytest\nfrom module import divide\n\ndef test_divide_normal():\n    assert divide(10, 2) == 5.0\n\ndef test_divide_negative():\n    assert divide(-10, 2) == -5.0\n\ndef test_divide_by_zero():\n    with pytest.raises(ValueError, match='Cannot divide by zero'):\n        divide(10, 0)",
    "test_count": 3,
    "paths_covered": 3,
    "framework": "pytest",
    "imports_needed": ["pytest", "module.divide"]
  },
  "tier_applied": "pro",
  "duration_ms": 234
}

Tier Differences

| Feature | Community | Pro | Enterprise |
| --- | --- | --- | --- |
| Available | ✗ | ✓ | ✓ |
| Basic test generation | ✗ | ✓ | ✓ |
| Parameterized tests | ✗ | | |
| Crash log tests | ✗ | | |
| Custom frameworks | ✗ | | |
| Coverage targeting | ✗ | | |

Full deep dive


simulate_refactor

Verify that a code change preserves behavior.

What It Does

simulate_refactor compares original and refactored code:

  • Behavior analysis: Do both versions behave the same?
  • Path comparison: Same execution paths?
  • Side effect detection: Any new side effects?
  • Safety verdict: Safe to apply or not
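The comparison can be sketched as a toy differential check in plain Python: run both versions on the same inputs and compare results, including raised exceptions. This is a simplified stand-in using sampled concrete inputs, whereas simulate_refactor works symbolically.

```python
# Toy differential check: a simplified, concrete-input stand-in for
# the symbolic comparison simulate_refactor performs.
import itertools

def original(a, b):
    result = a + b
    return result

def refactored(a, b):
    return a + b

def outcome(fn, args):
    """Return ('ok', value) or ('raise', exception type) for one call."""
    try:
        return ("ok", fn(*args))
    except Exception as exc:
        return ("raise", type(exc))

# Compare both versions over a small grid of sample inputs
samples = list(itertools.product([-2, -1, 0, 1, 2], repeat=2))
differences = [args for args in samples
               if outcome(original, args) != outcome(refactored, args)]
is_safe = not differences  # no differences observed → behavior preserved
```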

When AI Agents Use This

  • Before applying refactors
  • Verifying optimizations
  • Checking bug fixes don't change behavior
  • Validating AI-suggested changes

Quick Reference

| Property | Value |
| --- | --- |
| Tier | Pro |
| Languages | Python |
| Token Cost | ~100-400 tokens |

Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| original_code | string | ✓ | Original code |
| new_code | string | ✓* | Refactored code |
| patch | string | ✓* | Unified diff patch |
| strict_mode | bool | | Fail on any difference (default: false) |

*Provide either new_code or patch.

Example Usage

Prompt:

"Is this refactor safe?"

Tool Call:

{
  "tool": "simulate_refactor",
  "parameters": {
    "original_code": "def add(a, b):\n    result = a + b\n    return result",
    "new_code": "def add(a, b):\n    return a + b"
  }
}

Prompt:

"Verify this patch is safe to apply"

Tool Call:

{
  "tool": "simulate_refactor",
  "parameters": {
    "original_code": "def add(a, b):\n    result = a + b\n    return result",
    "patch": "--- a/file.py\n+++ b/file.py\n@@ -1,3 +1,2 @@\n def add(a, b):\n-    result = a + b\n-    return result\n+    return a + b"
  }
}

Response Format

{
  "data": {
    "is_safe": true,
    "behavior_preserved": true,
    "paths_original": 1,
    "paths_new": 1,
    "differences": [],
    "side_effects_added": [],
    "side_effects_removed": [],
    "verdict": "SAFE: Refactor preserves all behavior",
    "confidence": 0.95
  },
  "tier_applied": "pro",
  "duration_ms": 189
}

Safety Verdicts

| Verdict | Meaning |
| --- | --- |
| SAFE | Behavior preserved, safe to apply |
| UNSAFE | Behavior changed, review differences |
| UNKNOWN | Could not determine (complex code) |
| SIDE_EFFECTS | New side effects introduced |
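As a toy illustration of how an UNSAFE verdict arises, plain Python is enough to show a boundary change (a simplified stand-in, not the tool's solver-based analysis):

```python
# A refactor that looks equivalent but changes behavior at one boundary.
def original(age):
    return age >= 18

def refactored(age):
    return age > 18  # subtle off-by-one change

# A differential sweep exposes the single input where the two differ
counterexamples = [age for age in range(0, 30)
                   if original(age) != refactored(age)]
verdict = "UNSAFE" if counterexamples else "SAFE"
```

A symbolic check finds the same divergence without sampling, by solving for an input where the two return values differ.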

Tier Differences

| Feature | Community | Pro | Enterprise |
| --- | --- | --- | --- |
| Available | ✗ | ✓ | ✓ |
| Behavior comparison | ✗ | ✓ | ✓ |
| Side effect detection | ✗ | | |
| Patch format support | ✗ | | |
| Strict mode | ✗ | | |
| Custom assertions | ✗ | | |

Full deep dive


Common Workflows

Safe Refactoring

1. extract_code("file.py", "function", "target")
   → Get current code

2. [AI refactors the code]

3. simulate_refactor(original, refactored)
   → Verify behavior preserved

4. update_symbol("file.py", "function", "target", new_code)
   → Apply only if safe
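Stitched together in code, the steps above might look like the following; `client.call`, `ai_refactor`, and the parameter names are hypothetical, since the real client API is not specified here.

```python
# Hypothetical orchestration of the Safe Refactoring workflow.
# `client.call` and `ai_refactor` are assumed helpers, not a real API.
def safe_refactor(client, ai_refactor, path, name):
    # Step 1: get the current code
    original = client.call("extract_code", file_path=path,
                           symbol_type="function", symbol_name=name)["code"]
    # Step 2: AI rewrites the code
    refactored = ai_refactor(original)
    # Step 3: verify behavior is preserved
    report = client.call("simulate_refactor",
                         original_code=original, new_code=refactored)
    # Step 4: apply only if safe
    if report["is_safe"]:
        client.call("update_symbol", file_path=path,
                    symbol_type="function", symbol_name=name,
                    new_code=refactored)
    return report
```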

Test-Driven Bug Fix

1. generate_unit_tests(file_path="buggy.py")
   → Create tests capturing current behavior

2. [AI fixes the bug]

3. simulate_refactor(original, fixed)
   → Verify only the bug is fixed

4. generate_unit_tests(code=fixed_code)
   → Generate tests for fixed version