Parsing Codebases
The primary entrypoint to programs leveraging Codegen is the Codebase class.
Local Codebases
Construct a Codebase by passing in a path to a local git repository or any subfolder within it. The path must be within a git repository (i.e., somewhere in the parent directory tree must contain a .git folder).
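The "somewhere in the parent directory tree" requirement can be checked before handing a path to Codegen. The following is a minimal standard-library sketch; find_git_root is a hypothetical helper written for illustration and is not part of Codegen.

```python
from pathlib import Path
from typing import Optional


def find_git_root(path: str) -> Optional[Path]:
    """Return the nearest directory at or above `path` that contains .git, else None."""
    start = Path(path).resolve()
    # Check the path itself first, then each parent up to the filesystem root.
    for candidate in (start, *start.parents):
        if (candidate / ".git").exists():
            return candidate
    return None
```

If find_git_root returns None for a path, that path is not inside a git repository, so it does not meet the requirement described above.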
By default, Codegen will automatically infer the programming language of the codebase and parse all files in the codebase. You can override this by passing the language parameter with a value from the ProgrammingLanguage enum.
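A minimal sketch of both modes follows. It assumes Codebase is importable from the top-level codegen package and that the language parameter also accepts a plain string such as "python" in place of a ProgrammingLanguage member; parse_codebase is a hypothetical wrapper, so check these details against your installed version.

```python
from typing import Optional


def parse_codebase(path: str, language: Optional[str] = None):
    """Parse `path`, inferring the language unless one is given explicitly."""
    # Lazy import: requires the codegen package (import path is an assumption).
    from codegen import Codebase

    if language is None:
        # No override: Codegen infers the language of the codebase.
        return Codebase(path)
    # Explicit override via the `language` parameter
    # (assumed to accept a string as well as a ProgrammingLanguage value).
    return Codebase(path, language=language)
```

For example, parse_codebase("path/to/git/repo") relies on inference, while parse_codebase("path/to/git/repo", language="python") forces the language.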
The initial parse may take a few minutes for large codebases. This pre-computation enables constant-time operations afterward. Learn more here.
Remote Repositories
To fetch and parse a repository directly from GitHub, use the from_repo function.
Remote repositories are cloned to the /tmp/codegen/{repo_name} directory by default. The clone is shallow by default for better performance.
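As a rough sketch, the call and the resulting default clone location might look like this. It assumes from_repo is exposed on Codebase and takes the repository as an "owner/name" string; "fastapi/fastapi" is only an example value, and default_clone_dir and parse_remote are hypothetical helpers, not Codegen functions.

```python
from pathlib import PurePosixPath


def default_clone_dir(repo_full_name: str, tmp_dir: str = "/tmp/codegen") -> PurePosixPath:
    """Mirror the documented default location: /tmp/codegen/{repo_name}."""
    # Assumes an "owner/name" identifier; the last segment is the repo name.
    repo_name = repo_full_name.rstrip("/").split("/")[-1]
    return PurePosixPath(tmp_dir) / repo_name


def parse_remote(repo_full_name: str):
    """Clone and parse a GitHub repository with Codegen's defaults."""
    # Lazy import: requires the codegen package (import path is an assumption).
    from codegen import Codebase

    return Codebase.from_repo(repo_full_name)
```

With these assumptions, parse_remote("fastapi/fastapi") would leave its clone at the path returned by default_clone_dir("fastapi/fastapi").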
Configuration Options
You can customize the behavior of your Codebase instance by passing a CodebaseConfig
object. This allows you to configure secrets (like API keys) and toggle specific features:
CodebaseConfig and SecretsConfig allow you to configure:
- config: Toggle specific features like language engines, dependency management, and graph synchronization
- secrets: API keys and other sensitive information needed by the codebase
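A sketch of passing both objects is shown here. The config and secrets keyword names come from the list above; the module paths, the debug flag, and the openai_api_key field are assumptions that may differ between Codegen versions, and parse_with_config is a hypothetical wrapper.

```python
import os


def parse_with_config(path: str):
    """Parse `path` with one feature flag toggled and an API key supplied."""
    # Lazy imports: require the codegen package; module paths are assumptions.
    from codegen import Codebase
    from codegen.configs.models.codebase import CodebaseConfig
    from codegen.configs.models.secrets import SecretsConfig

    # `debug` is an assumed flag name; consult the source for the real flags.
    config = CodebaseConfig(debug=True)
    # Read the key from the environment instead of hard-coding it
    # (`openai_api_key` is an assumed field name).
    secrets = SecretsConfig(openai_api_key=os.environ.get("OPENAI_API_KEY"))
    return Codebase(path, config=config, secrets=secrets)
```

Keeping the key in an environment variable avoids committing secrets alongside the code that configures the codebase.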
For a complete list of available feature flags and configuration options, see the source code on GitHub.
Advanced Initialization
For more complex scenarios, Codegen supports an advanced initialization mode using ProjectConfig. This allows for fine-grained control over:
- Repository configuration
- Base path and subdirectory filtering
- Multiple project configurations
Here’s an example:
For more details on advanced configuration options, see the source code on GitHub.
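To make the idea of base path and subdirectory filtering concrete, here is a small standard-library sketch. It does not use ProjectConfig or any Codegen API: select_files and its parameters are hypothetical names, and Codegen's actual filtering rules may differ.

```python
from pathlib import PurePosixPath
from typing import Iterable, List, Optional


def select_files(
    files: Iterable[str],
    base_path: str,
    subdirectories: Optional[List[str]] = None,
) -> List[str]:
    """Keep files under `base_path`, optionally narrowed to some subdirectories."""
    base = PurePosixPath(base_path)
    allowed = [PurePosixPath(s) for s in (subdirectories or [])]
    selected = []
    for name in files:
        path = PurePosixPath(name)
        # Drop anything outside the base path.
        if not path.is_relative_to(base):
            continue
        # If subdirectories are given, the file must sit inside one of them.
        if allowed and not any(path.is_relative_to(s) for s in allowed):
            continue
        selected.append(name)
    return selected
```

In this sketch, all paths are written relative to the repository root, so a subdirectory entry repeats the base path prefix (for example "src/app/util" under a base path of "src").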
Supported Languages
Codegen currently supports: