Codegen integrates natively with large language models (LLMs) through the codebase.ai(…) method, which lets you use an LLM to generate, modify, and analyze code.

Configuration

Before using AI capabilities, you need to provide an OpenAI API key via codebase.set_ai_key(…):

# Set your OpenAI API key
codebase.set_ai_key("your-openai-api-key")
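
Hard-coding the key, as in the placeholder above, makes it easy to commit a secret by accident. One common alternative is to read it from the environment. The sketch below is only an illustration: the helper load_ai_key and the variable name OPENAI_API_KEY are assumptions chosen for this example, not part of Codegen.

```python
import os


def load_ai_key(env=os.environ, var_name="OPENAI_API_KEY"):
    """Return the API key stored in `var_name`, or raise if it is missing or blank.

    `load_ai_key` and the default variable name are hypothetical; use whatever
    name your environment already defines.
    """
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {var_name} environment variable before running the codemod"
        )
    return key
```

With this helper, the configuration call becomes codebase.set_ai_key(load_ai_key()).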

Calling Codebase.ai(…)

The Codebase.ai(…) method takes three key arguments:

result = codebase.ai(
    prompt="Your instruction to the AI",
    target=symbol_to_modify,  # Optional: The code being operated on
    context=additional_info   # Optional: Extra context from static analysis
)
  • prompt: Clear instruction for what you want the AI to do
  • target: The symbol (function, class, etc.) being operated on - its source code will be provided to the AI
  • context: Additional information you want to provide to the AI, which you can gather using GraphSitter’s analysis tools

Codegen does not automatically provide any context to the LLM. The model does not “understand” your codebase; it sees only the prompt, the target, and the context you pass in.

The context parameter can include:

  • A single symbol (its source code will be provided)
  • A list of related symbols
  • A dictionary mapping descriptions to symbols/values
  • Nested combinations of the above
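
To make "nested combinations" concrete, the sketch below flattens such a structure into labeled text. It is not how Codegen serializes context internally (that is not described here). FakeSymbol and render_context are hypothetical stand-ins used only to show the shapes the parameter accepts.

```python
from dataclasses import dataclass


@dataclass
class FakeSymbol:
    """Stand-in for a symbol; only `name` and `source` are assumed here."""
    name: str
    source: str


def render_context(value, indent=0):
    """Flatten a symbol, list, dict, or plain value into indented text."""
    pad = "  " * indent
    if isinstance(value, FakeSymbol):
        # A single symbol contributes its source code
        return "\n".join(pad + line for line in value.source.splitlines())
    if isinstance(value, dict):
        # A dictionary contributes "description:" followed by the nested value
        parts = []
        for label, item in value.items():
            parts.append(f"{pad}{label}:")
            parts.append(render_context(item, indent + 1))
        return "\n".join(parts)
    if isinstance(value, (list, tuple)):
        # A list contributes each element in order
        return "\n".join(render_context(item, indent) for item in value)
    # Anything else (for example a style hint) is rendered as a string
    return pad + str(value)
```

For example, render_context({"style": "concise", "related": [helper]}) yields a "style:" entry followed by a "related:" entry containing the helper's source.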

How Context Works

The AI doesn’t automatically know about your codebase. Instead, you can provide relevant context by:

  1. Using GraphSitter’s static analysis to gather information:
function = codebase.get_function("process_data")
context = {
    "call_sites": function.call_sites,      # Where the function is called
    "dependencies": function.dependencies,   # What the function depends on
    "parent": function.parent,              # Class/module containing the function
    "docstring": function.docstring,        # Existing documentation
}
  2. Passing this information to the AI:
result = codebase.ai(
    "Improve this function's implementation",
    target=function,
    context=context  # AI will see the gathered information
)

Common Use Cases

Code Generation

Generate new code or refactor existing code:

# Break up a large function
function = codebase.get_function("large_function")
new_code = codebase.ai(
    "Break this function into smaller, more focused functions",
    target=function
)
function.edit(new_code)

# Generate a test
my_function = codebase.get_function("my_function")
test_code = codebase.ai(
    f"Write a test for the function {my_function.name}",
    target=my_function
)
my_function.insert_after(test_code)
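
LLMs often wrap code replies in a Markdown code fence. Whether codebase.ai already removes such fences is not stated here, so the sketch below assumes it might not. strip_code_fence is a hypothetical helper, not a Codegen function. It unwraps a reply that consists of exactly one fenced block and returns anything else unchanged.

```python
FENCE = "`" * 3  # Markdown code-fence marker


def strip_code_fence(reply: str) -> str:
    """Return the body of a reply wrapped in one code fence, else the reply as-is."""
    lines = reply.strip().splitlines()
    wrapped = (
        len(lines) >= 2
        and lines[0].startswith(FENCE)   # opening fence, optionally with a language tag
        and lines[-1].strip() == FENCE   # closing fence
    )
    if wrapped:
        return "\n".join(lines[1:-1])
    return reply
```

Usage would look like function.edit(strip_code_fence(new_code)).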

Documentation

Generate and format documentation:

# Generate docstrings for a class
class_def = codebase.get_class("MyClass")
for method in class_def.methods:
    docstring = codebase.ai(
        "Generate a docstring describing this method",
        target=method,
        context={
            "class": class_def,
            "style": "Google docstring format"
        }
    )
    method.set_docstring(docstring)

Code Analysis and Improvement

Use AI to analyze and improve code:

# Improve function names
for function in codebase.functions:
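    answer = codebase.ai(
        "Does this function name clearly describe its purpose? Answer yes/no",
        target=function
    )
    # Normalize the reply: models often add whitespace or punctuation ("No.")
    if answer.strip().lower().startswith("no"):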
        new_name = codebase.ai(
            "Suggest a better name for this function",
            target=function,
            context={"call_sites": function.call_sites}
        )
        function.rename(new_name)
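
Free-text replies are not guaranteed to be a bare "yes"/"no" or a bare identifier, so it can help to normalize them before acting on them. The helpers below are a minimal sketch. is_no and clean_identifier are hypothetical names, and the identifier check assumes Python naming rules.

```python
import keyword
import re


def is_no(answer: str) -> bool:
    """True when the reply starts with the word 'no', ignoring case and punctuation."""
    match = re.match(r"\W*(yes|no)\b", answer, flags=re.IGNORECASE)
    return bool(match) and match.group(1).lower() == "no"


def clean_identifier(suggestion: str):
    """Return the suggestion as a usable Python identifier, or None if it is not one."""
    candidate = suggestion.strip().strip("`'\"")
    if candidate.endswith("()"):
        candidate = candidate[:-2]  # drop call parentheses such as "parse()"
    if candidate.isidentifier() and not keyword.iskeyword(candidate):
        return candidate
    return None
```

In the loop above, the rename would then run only when clean_identifier(new_name) returns a value.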

Contextual Modifications

Make changes informed by context you gather explicitly:

# Refactor a class method
method = codebase.get_class("MyClass").get_method("target_method")
new_impl = codebase.ai(
    "Refactor this method to be more efficient",
    target=method,
    context={
        "parent_class": method.parent,
        "call_sites": method.call_sites,
        "dependencies": method.dependencies
    }
)
method.edit(new_impl)

Best Practices

  1. Provide Relevant Context

    # Good: Providing specific, relevant context
    summary = codebase.ai(
        "Generate a summary of this method's purpose",
        target=method,
        context={
            "class": method.parent,              # Class containing the method
            "usages": list(method.usages),       # How the method is used
            "dependencies": method.dependencies,  # What the method depends on
            "style": "concise"
        }
    )
    
    # Bad: Missing context that could help the AI
    summary = codebase.ai(
        "Generate a summary",
        target=method  # AI only sees the method's code
    )
    
  2. Gather Comprehensive Context

    # Gather relevant information before AI call
    def get_method_context(method):
        return {
            "class": method.parent,
            "call_sites": list(method.call_sites),
            "dependencies": list(method.dependencies),
            "related_methods": [m for m in method.parent.methods
                              if m.name != method.name]
        }
    
    # Use gathered context in AI call
    new_impl = codebase.ai(
        "Refactor this method to be more efficient",
        target=method,
        context=get_method_context(method)
    )
    
  3. Handle AI Limits

    # Set custom AI request limits for large operations
    codebase.set_session_options(max_ai_requests=200)
    
  4. Review Generated Code

    # Generate and review before applying
    new_code = codebase.ai(
        "Optimize this function",
        target=function
    )
    print("Review generated code:")
    print(new_code)
    if input("Apply changes? (y/n): ").lower() == 'y':
        function.edit(new_code)
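
    A manual review can be backed by two cheap automated checks: confirm that the reply parses, and show a diff instead of the whole body. This sketch assumes the target is Python source. syntax_error_in and preview_diff are hypothetical helpers built on the standard library, not Codegen APIs.

```python
import ast
import difflib


def syntax_error_in(code: str):
    """Return a short message if `code` is not valid Python, else None."""
    try:
        ast.parse(code)
    except SyntaxError as exc:
        return f"line {exc.lineno}: {exc.msg}"
    return None


def preview_diff(old: str, new: str) -> str:
    """Return a unified diff between the current and the generated source."""
    lines = difflib.unified_diff(
        old.splitlines(),
        new.splitlines(),
        fromfile="current",
        tofile="generated",
        lineterm="",
    )
    return "\n".join(lines)
```

    Here, old would be the function's current source and new the AI reply. Skip the edit when syntax_error_in(new) returns a message.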
    

Limitations and Safety

  • The AI doesn’t automatically know about your codebase - you must provide relevant context
  • AI-generated code should always be reviewed
  • Each codemod execution is limited to 150 AI requests by default; raise the limit with set_session_options:
    codebase.set_session_options(max_ai_requests=200)
    

You can also use codebase.set_session_options to raise the execution time limit and the number of transactions allowed in a session, which helps with larger or more complex tasks. Adjust the max_seconds and max_transactions parameters to suit your needs:

codebase.set_session_options(max_seconds=300, max_transactions=500)
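
When a codemod loops over many symbols, it can be useful to track how much of the AI request limit remains rather than find out partway through a run. How Codegen behaves when the limit is reached is not described here. The wrapper below is a hypothetical sketch that counts calls on the caller's side and stops at a limit you choose.

```python
class AIRequestBudget:
    """Wrap an AI callable and refuse to exceed a chosen number of calls."""

    def __init__(self, ask, limit):
        self._ask = ask      # for example codebase.ai
        self.limit = limit   # keep this at or below max_ai_requests
        self.used = 0

    @property
    def remaining(self):
        """Number of calls still allowed."""
        return self.limit - self.used

    def __call__(self, prompt, **kwargs):
        if self.used >= self.limit:
            raise RuntimeError(f"AI request budget of {self.limit} exhausted")
        self.used += 1
        return self._ask(prompt, **kwargs)
```

For example, ask = AIRequestBudget(codebase.ai, limit=150) can replace direct calls to codebase.ai, and a loop can check ask.remaining before starting each symbol.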