Docgen
osa_tool.operations.codebase.docstring_generation.docgen
DocGen
Bases: object
Utility class for generating and inserting Python docstrings with an LLM.
The class formats parsed code structures, requests documentation for classes and methods, extracts clean docstring text from model output, and writes generated docstrings back into source files.
__init__(config_manager)
Initializes a DocGen instance.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| config_manager | ConfigManager | Configuration manager instance. | required |
context_extractor(method_details, structure, function_index=None, generated_docstrings=None)
Extracts context for the functions called by a method, using the 'method_calls' field of method_details.
Parameters:
- method_details: A dictionary containing details about the method, including the 'method_calls' list.
- structure: A dictionary representing the code structure (used for fallback search).
- function_index: Optional index built by osa_treesitter.build_function_index() for fast O(1) lookup.
- generated_docstrings: Optional dict mapping node_id to generated docstring (from topological sort).
Returns: A string containing the context of called functions in the format: "Function {function_name} (from {file}) {source_code} Args: {arguments} Return: {return_type} "
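The lookup order described above (index first, structure scan as a fallback) can be sketched as follows. This is only an illustration: the helper name `build_context` and the dictionary keys (`name`, `file`, `source_code`, `arguments`, `return_type`) are assumptions made for the example, not the actual OSA data model, and the `generated_docstrings` argument is left out.

```python
from typing import Optional


def build_context(
    method_details: dict,
    structure: dict,
    function_index: Optional[dict] = None,
) -> str:
    """Hypothetical sketch: collect context for every called function."""
    parts = []
    for name in method_details.get("method_calls", []):
        # Fast path: O(1) dictionary lookup when an index is available.
        info = (function_index or {}).get(name)
        if info is None:
            # Fallback: linear scan over every file in the structure.
            for file, functions in structure.items():
                match = next((f for f in functions if f["name"] == name), None)
                if match is not None:
                    info = {**match, "file": file}
                    break
        if info is None:
            continue  # Unknown call (for example a builtin): nothing to add.
        parts.append(
            f"Function {info['name']} (from {info['file']})\n"
            f"{info['source_code']}\n"
            f"Args: {info['arguments']}\n"
            f"Return: {info['return_type']}\n"
        )
    return "\n".join(parts)
```

Calls that cannot be resolved are skipped silently in this sketch; a real implementation may prefer to log them.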
count_tokens(prompt)
Counts the number of tokens in a given prompt using a specified model.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| prompt | str | The text for which to count the tokens. | required |
Returns:
| Type | Description |
|---|---|
| int | The number of tokens in the prompt. |
create_mkdocs_git_workflow(repository_url, path)
Generates a YAML workflow that deploys the documentation for the chosen Git hosting service.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| repository_url | str | URI of the Python project's repository on GitHub. | required |
| path | str | The path to the root directory of the Python project. | required |
Returns:
| Type | Description |
|---|---|
| None | The method generates a workflow for the MkDocs documentation of the current project. |
extract_pure_docstring(gpt_response)
staticmethod
Extracts only the docstring from the GPT response while keeping triple quotes. Handles common formatting issues such as Markdown blocks, extra indentation, and missing closing quotes.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| gpt_response | str | Full response string from the LLM. | required |
Returns:
| Type | Description |
|---|---|
| str | A properly formatted Python docstring with triple quotes. |
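A minimal sketch of this kind of cleanup, using only the standard library, might look like the following. The function name `extract_docstring` and the exact normalization rules are assumptions for illustration; the real method may handle more cases or format its output differently.

```python
import re
import textwrap


def extract_docstring(response: str) -> str:
    """Hypothetical sketch: pull one triple-quoted docstring out of LLM text."""
    # Drop Markdown fence markers (three backticks plus an optional language tag).
    text = re.sub(r"`{3}[a-zA-Z]*\n?", "", response)
    start = text.find('"""')
    if start == -1:
        # No quotes at all: treat the whole reply as the docstring body.
        body = text
    else:
        end = text.find('"""', start + 3)
        # A missing closing quote means the body runs to the end of the reply.
        body = text[start + 3 :] if end == -1 else text[start + 3 : end]
    # Normalize indentation: strip the first line, dedent the remainder.
    first, _, rest = body.strip("\n").partition("\n")
    body = (first.strip() + "\n" + textwrap.dedent(rest)).strip()
    return f'"""\n{body}\n"""'
```

The sketch always returns a multi-line docstring with the quotes on their own lines, which keeps the output predictable even when the model reply is truncated.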
format_structure_openai(structure)
staticmethod
Formats the structure of Python files as a readable string.
This method iterates over the given 'structure' dictionary and builds a formatted string describing each file, its classes, and its functions, along with details such as line number, arguments, return type, source code, and docstrings where available.
Returns:
| Type | Description |
|---|---|
| str | A formatted string representing the structure of the Python files. |
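To make the idea concrete, here is a small sketch of rendering such a dictionary as text. The schema used here (a list of entries per file with `type`, `name`, `start_line`, `docstring`, and `methods` keys) and the function name `format_structure` are assumptions for the example; the structure produced by the actual parser may differ.

```python
def format_structure(structure: dict) -> str:
    """Hypothetical sketch: render a parsed-code dictionary as readable text."""
    lines = []
    for filename, items in structure.items():
        lines.append(f"File: {filename}")
        for item in items:
            kind = item.get("type", "function").capitalize()
            lines.append(f"  - {kind}: {item['name']}, line {item.get('start_line', '?')}")
            if item.get("docstring"):
                lines.append(f"      Docstring: {item['docstring']}")
            # Classes carry a list of methods; plain functions have none.
            for method in item.get("methods", []):
                args = ", ".join(method.get("arguments", []))
                lines.append(
                    f"      - Method: {method['name']}({args}) -> {method.get('return_type')}"
                )
    return "\n".join(lines)
```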
format_with_black(filename)
staticmethod
Formats the Python source file at the given path using the black code formatter.
Parameters:
- filename: The path to the Python source code file to be formatted.
Returns:
| Type | Description |
|---|---|
| None | None |
generate_class_documentation(class_details, semaphore)
async
Generate documentation for a class.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| class_details | list | A list of dictionaries containing method names and their docstrings. | required |
| semaphore | Semaphore | Synchronization primitive that limits the degree of concurrency to avoid overloading the API. | required |
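The role of the semaphore can be illustrated with a short asyncio sketch. Everything here is hypothetical: `fake_llm_call` stands in for a real model request, and `document` and `document_all` are illustrative names rather than part of DocGen.

```python
import asyncio


async def fake_llm_call(name: str) -> str:
    """Stand-in for a real model request (hypothetical)."""
    await asyncio.sleep(0.01)
    return f"Docstring for {name}"


async def document(name: str, semaphore: asyncio.Semaphore) -> str:
    # At most `limit` coroutines can be inside this block at the same time.
    async with semaphore:
        return await fake_llm_call(name)


async def document_all(names: list, limit: int = 2) -> list:
    semaphore = asyncio.Semaphore(limit)
    # gather() preserves the order of the inputs in its result list.
    return await asyncio.gather(*(document(n, semaphore) for n in names))
```

Sharing one semaphore across all tasks is what bounds the total number of in-flight requests; creating a new semaphore per call would not limit anything.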
generate_documentation_mkdocs(path, files_info, modules_info)
Generates MkDocs documentation for a Python project based on the provided path.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| path | str | The path to the root directory of the Python project. | required |
Returns:
| Type | Description |
|---|---|
| None | The method generates MkDocs documentation for the project. |
generate_method_documentation(method_details, semaphore, context_code=None)
async
Generate documentation for a single method.
insert_cls_docstring_in_code(source_code, class_name, generated_docstring)
staticmethod
Inserts or replaces a class-level docstring for a given class name.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| source_code | str | The full source code string. | required |
| class_name | str | Name of the class to update. | required |
| generated_docstring | str | The new docstring (raw response from LLM). | required |
Returns:
| Type | Description |
|---|---|
| str | Updated source code with the inserted or replaced class docstring. |
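One way to implement an insert-or-replace of this kind is with the standard `ast` module, sketched below. This is not necessarily how DocGen does it: the helper `insert_class_docstring` is hypothetical, it expects an already cleaned docstring (including triple quotes) rather than a raw LLM response, and it assumes the class body starts on its own line. It needs Python 3.8+ for `end_lineno`.

```python
import ast


def insert_class_docstring(source_code: str, class_name: str, docstring: str) -> str:
    """Hypothetical sketch: insert or replace a class docstring using ast."""
    tree = ast.parse(source_code)
    lines = source_code.splitlines()
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef) and node.name == class_name:
            first = node.body[0]
            indent = " " * first.col_offset
            new_lines = [indent + ln if ln else ln for ln in docstring.splitlines()]
            if ast.get_docstring(node, clean=False) is not None:
                # Replace the lines occupied by the existing docstring.
                lines[first.lineno - 1 : first.end_lineno] = new_lines
            else:
                # Insert before the first statement, including its decorators.
                decorators = getattr(first, "decorator_list", [])
                start = min([first.lineno] + [d.lineno for d in decorators])
                lines[start - 1 : start - 1] = new_lines
            return "\n".join(lines) + "\n"
    return source_code  # Class not found: leave the source untouched.
```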
insert_docstring_in_code(source_code, method_details, generated_docstring, class_method=False)
staticmethod
Inserts or replaces a method-level docstring in the provided source code, using the method's body from method_details['source_code'] to locate the method. Handles multi-line signatures, decorators, async definitions, and existing docstrings.
strip_docstring_from_body(body)
staticmethod
Strips the docstring from a method's body.
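As an illustration, removing a leading docstring can be done with the standard `ast` module. The helper name `strip_docstring` is hypothetical, and the sketch assumes that `body` holds the full source of a single function or method (signature included), which may not match what the real method receives.

```python
import ast
import textwrap


def strip_docstring(body: str) -> str:
    """Hypothetical sketch: remove the leading docstring from a function's source."""
    # Dedent first so that indented methods parse; line numbers are unchanged.
    tree = ast.parse(textwrap.dedent(body))
    func = tree.body[0] if tree.body else None
    if not isinstance(func, (ast.FunctionDef, ast.AsyncFunctionDef)):
        return body
    if ast.get_docstring(func, clean=False) is None:
        return body  # Nothing to strip.
    doc = func.body[0]
    lines = body.splitlines()
    # Delete every line spanned by the docstring expression.
    del lines[doc.lineno - 1 : doc.end_lineno]
    return "\n".join(lines)
```

If the docstring is the only statement in the function, the result is a bare signature, which is fine for text comparison but is no longer valid Python on its own.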
summarize_submodules(project_structure, rate_limit=20)
async
Recursively traverses the parsed structure of a Python codebase and generates a short summary for each directory (submodule).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| project_structure | dict[str, Any] | A dictionary representing the parsed structure of the Python codebase. The dictionary keys are filenames and the values are lists of dictionaries representing classes and their methods. | required |
| rate_limit | int | Maximum number of concurrent requests to the API. | 20 |
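The interplay between per-directory grouping and `rate_limit` can be sketched as below. Note that this example deliberately swaps the recursive traversal for a simpler flat group-by-parent-directory pass, and that `fake_summary` and `summarize_dirs` are hypothetical names with a stand-in for the actual LLM request.

```python
import asyncio
from collections import defaultdict
from pathlib import PurePosixPath


async def fake_summary(directory: str, files: list) -> str:
    """Stand-in for an LLM request (hypothetical)."""
    await asyncio.sleep(0)
    return f"{directory}: {len(files)} file(s)"


async def summarize_dirs(project_structure: dict, rate_limit: int = 20) -> dict:
    # Group file paths by their parent directory (the "submodule").
    by_dir = defaultdict(list)
    for filename in project_structure:
        by_dir[str(PurePosixPath(filename).parent)].append(filename)

    # One shared semaphore caps the number of concurrent requests.
    semaphore = asyncio.Semaphore(rate_limit)

    async def bounded(directory: str, files: list) -> tuple:
        async with semaphore:
            return directory, await fake_summary(directory, files)

    pairs = await asyncio.gather(*(bounded(d, f) for d, f in by_dir.items()))
    return dict(pairs)
```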
update_class_documentation(class_details, semaphore)
async
Update documentation for a class.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| class_details | list | A list of dictionaries containing method names and their docstrings. | required |
| semaphore | Semaphore | Synchronization primitive that limits the degree of concurrency to avoid overloading the API. | required |
update_method_documentation(method_details, semaphore, context_code=None, class_name=None)
async
Update documentation for a single method.