Docgen

osa_tool.operations.codebase.docstring_generation.docgen

DocGen

Bases: object

Utility class for generating and inserting Python docstrings with an LLM.

The class formats parsed code structures, requests documentation for classes and methods, extracts clean docstring text from model output, and writes generated docstrings back into source files.

__init__(config_manager)

Initializes a DocGen instance.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| config_manager | ConfigManager | Configuration manager instance. | required |

context_extractor(method_details, structure, function_index=None, generated_docstrings=None)

Extracts context for the functions called by a method, using the method_calls field of the given method_details.

Parameters:

- method_details: A dictionary containing details about the method, including the 'method_calls' list.
- structure: A dictionary representing the code structure (for fallback search).
- function_index: Optional index built by osa_treesitter.build_function_index() for fast O(1) lookup.
- generated_docstrings: Optional dict mapping node_id to generated docstring (from topological sort).

Returns:

A string containing the context of the called functions, in the format: "Function {function_name} (from {file}) {source_code} Args: {arguments} Return: {return_type}"
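The index lookup and output format described for context_extractor can be sketched roughly as follows. This is an illustration, not the library's implementation: the helper name build_call_context and the layout of the index (a dict keyed by function name with file, source_code, arguments, and return_type fields) are assumptions made for the example, and the line breaks in the output are a guess.

```python
from typing import Any


def build_call_context(
    method_details: dict[str, Any],
    function_index: dict[str, dict[str, Any]],
) -> str:
    """Render context for every called function found in the index.

    Hypothetical helper: the index layout is assumed, not taken from osa_tool.
    """
    parts = []
    for call in method_details.get("method_calls", []):
        info = function_index.get(call)  # O(1) dictionary lookup
        if info is None:
            # Unknown call: a fuller version could fall back to a structure search.
            continue
        parts.append(
            f"Function {call} (from {info['file']})\n"
            f"{info['source_code']}\n"
            f"Args: {info['arguments']}\n"
            f"Return: {info['return_type']}\n"
        )
    return "\n".join(parts)


# Example data, invented for the sketch.
index = {
    "load": {
        "file": "io.py",
        "source_code": "def load(path): ...",
        "arguments": ["path"],
        "return_type": "dict",
    }
}
context = build_call_context({"method_calls": ["load", "missing"]}, index)
```

Calls that are absent from the index are skipped here, so the returned context only mentions functions whose source is known.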

count_tokens(prompt)

Counts the number of tokens in a given prompt using a specified model.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| prompt | str | The text for which to count the tokens. | required |

Returns:

| Type | Description |
| --- | --- |
| int | The number of tokens in the prompt. |

create_mkdocs_git_workflow(repository_url, path)

Generates a .yaml workflow that deploys the documentation on the chosen Git hosting service.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| repository_url | str | URI of the Python project's repository on GitHub. | required |
| path | str | The path to the root directory of the Python project. | required |

Returns:

| Type | Description |
| --- | --- |
| None | The method generates a workflow for the MkDocs documentation of the current project. |

extract_pure_docstring(gpt_response) staticmethod

Extracts only the docstring from the GPT response while keeping triple quotes. Handles common formatting issues like Markdown blocks, extra indentation, and missing closing quotes.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| gpt_response | str | Full response string from the LLM. | required |

Returns:

| Type | Description |
| --- | --- |
| str | A properly formatted Python docstring with triple quotes. |
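A rough sketch of the cleanup that extract_pure_docstring describes is given below. It is not the method's actual code: the function name extract_docstring and the specific rules (drop Markdown fence lines, dedent, close an unterminated docstring) are assumptions chosen to mirror the three formatting issues the description names.

```python
import textwrap

FENCE = "`" * 3  # Markdown code-fence marker


def extract_docstring(response: str) -> str:
    """Return the first triple-quoted docstring found in an LLM response.

    Illustrative only; the real method may apply different rules.
    """
    # Drop Markdown fence lines, then remove common leading indentation.
    lines = [ln for ln in response.splitlines() if not ln.strip().startswith(FENCE)]
    text = textwrap.dedent("\n".join(lines)).strip()

    start = text.find('"""')
    if start == -1:
        # No quotes at all: wrap the whole text in triple quotes.
        return f'"""\n{text}\n"""'

    end = text.find('"""', start + 3)
    if end == -1:
        # Missing closing quotes: close the docstring ourselves.
        return text[start:].rstrip() + '\n"""'
    return text[start : end + 3]


fenced = f'{FENCE}python\n    """\n    Add two numbers.\n    """\n{FENCE}'
unclosed = 'Here is the docstring:\n"""Add two numbers.'
clean = extract_docstring(fenced)
repaired = extract_docstring(unclosed)
```

Text before the opening quotes, such as a chatty preamble from the model, is discarded because the slice starts at the first triple quote.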

format_structure_openai(structure) staticmethod

Formats the structure of Python files in a readable string format.

This method iterates over the given 'structure' dictionary and builds a formatted string that describes each file, its classes, and its functions, including details such as line number, arguments, return type, source code, and docstring, if available.

Returns:

| Type | Description |
| --- | --- |
| str | A formatted string representing the structure of the Python files. |
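To make the idea concrete, here is a small sketch of turning a parsed structure into readable text. The item layout (type, name, start_line, docstring, methods, arguments, return_type keys) and the function name format_structure are assumptions for illustration; the actual dictionary produced by the parser may differ.

```python
from typing import Any


def format_structure(structure: dict[str, list[dict[str, Any]]]) -> str:
    """Render a parsed-structure dict as readable text (assumed item layout)."""
    lines = []
    for filename, items in structure.items():
        lines.append(f"File: {filename}")
        for item in items:
            kind = item.get("type", "function").capitalize()
            lines.append(f"  - {kind}: {item['name']}, line {item.get('start_line', '?')}")
            if item.get("docstring"):
                lines.append(f"      Docstring: {item['docstring']}")
            for method in item.get("methods", []):
                args = ", ".join(method.get("arguments", []))
                lines.append(
                    f"      - Method: {method['name']}({args}) -> {method.get('return_type')}"
                )
    return "\n".join(lines)


# Example input, invented for the sketch.
structure = {
    "calc.py": [
        {
            "type": "class",
            "name": "Calc",
            "start_line": 1,
            "docstring": "A calculator.",
            "methods": [
                {"name": "add", "arguments": ["self", "a", "b"], "return_type": "int"}
            ],
        }
    ]
}
report = format_structure(structure)
```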

format_with_black(filename) staticmethod

Formats a Python source code file using the black code formatter.

Parameters:

- filename: The path to the Python source code file to be formatted.

Returns:

| Type | Description |
| --- | --- |
| None | None |

generate_class_documentation(class_details, semaphore) async

Generate documentation for a class.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| class_details | list | A list of dictionaries containing method names and their docstrings. | required |
| semaphore | Semaphore | Synchronization primitive that limits the degree of concurrency to avoid overloading the API. | required |
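The role of the semaphore parameter can be illustrated with a short asyncio sketch. The coroutine fake_llm_call is a stand-in invented for this example (it sleeps instead of calling a model), and document_all is a hypothetical driver; neither is part of DocGen.

```python
import asyncio


async def fake_llm_call(name: str, semaphore: asyncio.Semaphore, stats: dict[str, int]) -> str:
    """Stand-in for an LLM request; records how many calls run at once."""
    async with semaphore:  # at most `limit` coroutines are inside this block together
        stats["running"] += 1
        stats["peak"] = max(stats["peak"], stats["running"])
        await asyncio.sleep(0.01)  # simulate network latency
        stats["running"] -= 1
    return f"docstring for {name}"


async def document_all(names: list[str], limit: int) -> tuple[list[str], int]:
    """Schedule one task per name while sharing a single semaphore."""
    semaphore = asyncio.Semaphore(limit)
    stats = {"running": 0, "peak": 0}
    results = await asyncio.gather(*(fake_llm_call(n, semaphore, stats) for n in names))
    return list(results), stats["peak"]


results, peak = asyncio.run(document_all([f"Class{i}" for i in range(10)], limit=3))
```

All ten tasks are created up front, but the shared semaphore keeps the number of simultaneous requests at or below the limit, which is the behavior the parameter description refers to.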

generate_documentation_mkdocs(path, files_info, modules_info)

Generates MkDocs documentation for a Python project based on the provided path.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| path | str | The path to the root directory of the Python project. | required |

Returns:

| Type | Description |
| --- | --- |
| None | The method generates MkDocs documentation for the project. |

generate_method_documentation(method_details, semaphore, context_code=None) async

Generate documentation for a single method.

insert_cls_docstring_in_code(source_code, class_name, generated_docstring) staticmethod

Inserts or replaces a class-level docstring for a given class name.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| source_code | str | The full source code string. | required |
| class_name | str | Name of the class to update. | required |
| generated_docstring | str | The new docstring (raw response from LLM). | required |

Returns:

| Type | Description |
| --- | --- |
| str | Updated source code with the inserted or replaced class docstring. |
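One possible way to insert or replace a class docstring is to locate the class with the standard ast module and splice lines, as sketched below. This is an assumption about technique, not a description of how the method is implemented. The helper name insert_class_docstring is hypothetical, and the sketch expects an already cleaned docstring, space indentation, and a class body that starts on its own line.

```python
import ast


def insert_class_docstring(source: str, class_name: str, docstring: str) -> str:
    """Insert or replace the docstring of `class_name` (line-splicing sketch)."""
    tree = ast.parse(source)
    lines = source.splitlines()
    for node in ast.walk(tree):
        if not (isinstance(node, ast.ClassDef) and node.name == class_name):
            continue
        first = node.body[0]
        indent = " " * first.col_offset  # reuse the indentation of the class body
        new_lines = [indent + ln if ln.strip() else ln for ln in docstring.splitlines()]
        if ast.get_docstring(node, clean=False) is not None:
            # Replace the lines occupied by the existing docstring.
            lines[first.lineno - 1 : first.end_lineno] = new_lines
        else:
            # Insert before the first statement, including any decorators on it.
            decorators = getattr(first, "decorator_list", [])
            start = min([first.lineno] + [d.lineno for d in decorators])
            lines[start - 1 : start - 1] = new_lines
        return "\n".join(lines) + "\n"
    return source  # class not found: leave the source unchanged


source = "class Greeter:\n    def hi(self):\n        return 'hi'\n"
inserted = insert_class_docstring(source, "Greeter", '"""Says hello."""')
replaced = insert_class_docstring(inserted, "Greeter", '"""Greets people."""')
```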

insert_docstring_in_code(source_code, method_details, generated_docstring, class_method=False) staticmethod

Inserts or replaces a method-level docstring in the provided source code, using the method's body from method_details['source_code'] to locate the method. Handles multi-line signatures, decorators, async definitions, and existing docstrings.

strip_docstring_from_body(body) staticmethod

Strips the docstring from a method's body.
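As an illustration of what stripping a docstring from a method body can look like, the sketch below parses the source with ast and deletes the lines the docstring occupies. The function name strip_docstring and the ast-based approach are assumptions for the example; the actual method may work differently, for instance with string matching.

```python
import ast
import textwrap


def strip_docstring(body: str) -> str:
    """Remove the leading docstring from a function's source, if present."""
    # Dedent only for parsing, so indented methods are accepted;
    # line numbers still match the original text.
    tree = ast.parse(textwrap.dedent(body))
    if not tree.body or not isinstance(tree.body[0], (ast.FunctionDef, ast.AsyncFunctionDef)):
        return body
    func = tree.body[0]
    if ast.get_docstring(func, clean=False) is None:
        return body  # nothing to strip
    doc_node = func.body[0]
    lines = body.splitlines()
    del lines[doc_node.lineno - 1 : doc_node.end_lineno]
    return "\n".join(lines)


with_doc = 'def add(a, b):\n    """Add two numbers."""\n    return a + b'
without_doc = "def sub(a, b):\n    return a - b"
stripped = strip_docstring(with_doc)
unchanged = strip_docstring(without_doc)
```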

summarize_submodules(project_structure, rate_limit=20) async

Recursively traverses the given parsed structure of a Python codebase and generates a short summary for each directory (submodule).

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| project_structure | dict[str, Any] | A dictionary representing the parsed structure of the Python codebase. The dictionary keys are filenames and the values are lists of dictionaries representing classes and their methods. | required |
| rate_limit | int | Maximum number of concurrent requests to the provided API. | 20 |
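The recursive traversal over directories can be sketched as below. This is a simplified, synchronous illustration under stated assumptions: the helpers build_tree and summarize are hypothetical, filenames are assumed to be POSIX-style relative paths, and a plain string stands in for the LLM-generated summary. The async execution and rate limiting of the real method are left out.

```python
from pathlib import PurePosixPath


def build_tree(filenames):
    """Nest file paths into a directory tree; files map to None, directories to dicts."""
    tree = {}
    for name in filenames:
        node = tree
        parts = PurePosixPath(name).parts
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = None
    return tree


def summarize(tree, prefix=""):
    """Recursively collect one summary per directory (placeholder for an LLM call)."""
    summaries = {}
    for name, child in tree.items():
        if child is None:
            continue  # files are leaves and get no summary of their own
        path = f"{prefix}{name}"
        files = sorted(k for k, v in child.items() if v is None)
        subdirs = sorted(k for k, v in child.items() if v is not None)
        summaries[path] = f"files={files}, submodules={subdirs}"
        summaries.update(summarize(child, prefix=f"{path}/"))
    return summaries


# Keys mimic a parsed project structure; values are irrelevant to the traversal.
project_structure = {"pkg/a.py": [], "pkg/b.py": [], "pkg/sub/c.py": []}
summaries = summarize(build_tree(project_structure))
```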

update_class_documentation(class_details, semaphore) async

Update documentation for a class.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| class_details | list | A list of dictionaries containing method names and their docstrings. | required |
| semaphore | Semaphore | Synchronization primitive that limits the degree of concurrency to avoid overloading the API. | required |

update_method_documentation(method_details, semaphore, context_code=None, class_name=None) async

Update documentation for a single method.