Files and Directories
Codegen provides three primary abstractions for working with your codebase’s file structure:
- File - Represents a file in the codebase (e.g. README.md, package.json, etc.)
- SourceFile - Represents a source code file (e.g. Python, TypeScript, React, etc.)
- Directory - Represents a directory in the codebase
SourceFile is a subclass of File that provides additional functionality for source code files.
Accessing Files and Directories
You typically access files from the codebase object with two APIs:
- codebase.get_file(…) - Get a file by its path
- codebase.files - Enables iteration over all files in the codebase
Directory exposes similar APIs for accessing its files and subdirectories.
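As a minimal sketch of the two access patterns above, the helpers below take an already-constructed codebase object (how it is created is outside this section) and rely only on codebase.files and codebase.get_file. The helper names are illustrative and not part of Codegen.

```python
def count_files(codebase):
    """Count the files yielded by iterating over codebase.files."""
    return sum(1 for _ in codebase.files)


def describe_file(codebase, path):
    """Fetch one file by its path and report which class represents it."""
    file = codebase.get_file(path)
    return f"{path} -> {type(file).__name__}"
```

Both helpers accept any object that offers the same two members, which keeps them easy to try against a small stand-in before pointing them at a real codebase.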
Differences between SourceFile and File
- File - a general purpose class that represents any file in the codebase including non-code files like README.md, .env, .json, image files, etc.
- SourceFile - a subclass of File that provides additional functionality for source code files written in languages supported by the codegen-sdk (Python, TypeScript, JavaScript, React).
Most use cases involve only SourceFile objects, since these contain code that the codegen-sdk can parse and manipulate. When you need to work with non-code files, use the File class.
By default, the codebase.files property only returns SourceFile objects. To include non-code files, pass the extensions='*' argument.
When getting a file with codebase.get_file, files ending in .py, .js, .ts, .jsx, .tsx are returned as SourceFile objects while other files are returned as File objects.
Furthermore, you can use the isinstance function to check if a file is a SourceFile.
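The isinstance check can be sketched as follows. This page does not show the import path for SourceFile, so the helper takes the class as a parameter instead of importing it; split_source_and_other is a hypothetical name used only for illustration.

```python
def split_source_and_other(files, source_file_cls):
    """Partition files into (source files, other files) using isinstance."""
    source, other = [], []
    for file in files:
        # SourceFile is a subclass of File, so this is True only for
        # files that were loaded as parseable source code.
        if isinstance(file, source_file_cls):
            source.append(file)
        else:
            other.append(file)
    return source, other
```

A typical call would pass the files returned with extensions='*' together with the SourceFile class.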
Currently, the codebase object can only parse source code files of one language at a time. This means that if you want to work with both Python and TypeScript files, you will need to create two separate codebase objects.
Accessing Code
SourceFiles and Directories provide several APIs for accessing and iterating over their code.
See, for example:
- .functions (SourceFile / Directory) - All Functions in the file/directory
- .classes (SourceFile / Directory) - All Classes in the file/directory
- .imports (SourceFile / Directory) - All Imports in the file/directory
- .get_function(...) (SourceFile / Directory) - Get a specific function by name
- .get_class(...) (SourceFile / Directory) - Get a specific class by name
- .get_global_var(...) (SourceFile / Directory) - Get a specific global variable by name
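A short sketch of how the accessors listed above might be combined. The container argument stands for either a SourceFile or a Directory; summarize_code and lookup are hypothetical helper names, and passing the name as the single positional argument to get_function and get_class is an assumption.

```python
def summarize_code(container):
    """Count the functions, classes and imports of a SourceFile or Directory."""
    return {
        "functions": len(list(container.functions)),
        "classes": len(list(container.classes)),
        "imports": len(list(container.imports)),
    }


def lookup(container, function_name, class_name):
    """Fetch one function and one class by name from the same container."""
    return container.get_function(function_name), container.get_class(class_name)
```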
Working with Non-Code Files (README, JSON, etc.)
By default, Codegen focuses on source code files (Python, TypeScript, etc). However, you can access all files in your codebase, including documentation, configuration, and other non-code files like README.md, package.json, or .env:
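One way to list only the non-code files, sketched under two assumptions: that codebase.files can be called with extensions='*', and that each file exposes its path as a filepath attribute. The extension tuple mirrors the source-file extensions named earlier on this page.

```python
SOURCE_EXTENSIONS = (".py", ".js", ".ts", ".jsx", ".tsx")


def non_code_file_paths(codebase):
    """Return the paths of files that are not source code."""
    # extensions='*' asks for every file, not just SourceFile objects.
    all_files = codebase.files(extensions="*")
    # `filepath` is an assumed attribute name for a file's path.
    return [f.filepath for f in all_files
            if not f.filepath.endswith(SOURCE_EXTENSIONS)]
```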
You can also filter for specific file types:
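Filtering by file type might look like the sketch below. It assumes the extensions argument also accepts a list of extensions with leading dots; the two extension lists and the helper name are illustrative choices.

```python
DOC_EXTENSIONS = [".md", ".mdx"]
CONFIG_EXTENSIONS = [".json", ".yaml", ".toml"]


def docs_and_config(codebase):
    """Return (documentation files, configuration files) as two lists."""
    # Assumption: `extensions` takes a list such as [".md", ".mdx"].
    docs = list(codebase.files(extensions=DOC_EXTENSIONS))
    config = list(codebase.files(extensions=CONFIG_EXTENSIONS))
    return docs, config
```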
Directory provides similar methods for accessing files and subdirectories.
Raw Content and Metadata
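A minimal sketch of reading a file's raw text and basic metadata. The attribute names content, filepath, name and extension are assumptions here, since this page does not list them; check them against the API reference before relying on them.

```python
def file_summary(file):
    """Collect raw-content size and basic metadata for one file."""
    # All four attribute names below are assumed, not confirmed by this page.
    content = file.content
    return {
        "path": file.filepath,
        "name": file.name,
        "extension": file.extension,
        "characters": len(content),
    }
```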
Editing Files Directly
Files themselves are Editable objects, just like Functions and Classes.
Learn more about the Editable API.
This means they expose many useful operations, including:
- File.search - Search for all functions named “main”
- File.edit - Edit the file
- File.replace - Replace all instances of a string with another string
- File.insert_before - Insert text before a specific string
- File.insert_after - Insert text after a specific string
- File.remove - Remove a specific string
You can often make bulk modifications with the .edit(...) or .replace(...) method.
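A sketch of two bulk edits built on .replace(...) and .edit(...). The argument shapes are assumptions (replace taking the old and new strings, edit taking the full new text), as is the content attribute used to read the current text; the helper names are illustrative.

```python
def replace_text(file, old, new):
    """Replace every occurrence of `old` with `new` in the file."""
    # Assumed call shape: replace(old, new).
    file.replace(old, new)


def prepend_header(file, header):
    """Rewrite the file so that it starts with a header line."""
    # Assumed: `content` holds the current text and edit() takes the new text.
    file.edit(header + "\n" + file.content)
```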
Most useful operations will have bespoke APIs that handle edge cases, update references, etc.
Moving and Renaming Files
Files can be manipulated through methods like File.update_filepath(), File.rename(), and File.remove().
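The three operations can be sketched as thin wrappers. The argument shapes are assumptions: update_filepath is given the new path and rename the new name. The has_external_usages flag is a hypothetical value supplied by the caller, since this page does not describe how to compute it.

```python
def relocate(file, new_path):
    """Move a file to a new path."""
    file.update_filepath(new_path)  # assumed to take the new path


def rename_file(file, new_name):
    """Rename a file in place."""
    file.rename(new_name)  # assumed to take the new name


def remove_if_unused(file, has_external_usages):
    """Remove a file only when the caller reports no external usages."""
    if has_external_usages:
        return False
    file.remove()
    return True
```

Guarding the removal behind an explicit flag keeps the potentially breaking step a deliberate decision by the caller.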
Removing files is a potentially breaking operation. Only remove files if they have no external usages.
Directories
Directories expose a similar API to the File class, with the addition of the subdirectories property.
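A small sketch of inspecting a directory through its files and subdirectories members. It assumes both can be iterated directly; whether nested entries are included in either is not specified on this page, so the counts are reported as returned.

```python
def directory_overview(directory):
    """Count the files and subdirectories a Directory reports."""
    return {
        # Assumption: both members are directly iterable.
        "files": len(list(directory.files)),
        "subdirectories": len(list(directory.subdirectories)),
    }
```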
Removing directories is a potentially destructive operation. Only remove directories if they have no external usages.