Documentation Generator

osa_tool.operations.codebase.docstring_generation.docgen.DocGen

Bases: object

A utility class for generating Python docstrings with OpenAI's GPT model. Its methods generate docstrings for a class or a single method, format the structure of Python files, count the tokens in a prompt, extract the docstring from the model's response, insert a generated docstring into source code, and process a Python file by generating and inserting its missing docstrings.

__init__(config_manager)

Initializes a DocGen instance.

Parameters:

Name Type Description Default
config_manager

Configuration loader instance

required

context_extractor(method_details, structure, function_index=None, generated_docstrings=None)

Extracts the context of the functions called by a method, using the 'method_calls' field of method_details.

Parameters:

- method_details: A dictionary containing details about the method, including 'method_calls' list.
- structure: A dictionary representing the code structure (for fallback search).
- function_index: Optional index built by osa_treesitter.build_function_index() for fast O(1) lookup.
- generated_docstrings: Optional dict mapping node_id to generated docstring (from topological sort).

Returns: A string containing the context of called functions in the format: "Function {function_name} (from {file}) {source_code} Args: {arguments} Return: {return_type} "
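As a rough illustration of the lookup-and-format step described above, the sketch below resolves each entry of 'method_calls' through a dictionary index and renders it in the documented layout. The function name build_context, the index field names ('file', 'source_code', 'arguments', 'return_type'), and the line breaks between fields are assumptions made for this example; it is not the actual DocGen implementation.

```python
def build_context(method_details: dict, function_index: dict) -> str:
    """Assemble a context string for every call listed in method_details.

    Hypothetical helper: function_index maps a function name to a dict
    with 'file', 'source_code', 'arguments' and 'return_type' keys.
    """
    parts = []
    for call in method_details.get("method_calls", []):
        info = function_index.get(call)  # constant-time dictionary lookup
        if info is None:
            continue  # unknown callee: simply skipped in this sketch
        parts.append(
            f"Function {call} (from {info['file']})\n"
            f"{info['source_code']}\n"
            f"Args: {info['arguments']}\n"
            f"Return: {info['return_type']}\n"
        )
    return "\n".join(parts)
```

Callees missing from the index are skipped here; the real method is documented as falling back to a search over 'structure' instead.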

count_tokens(prompt)

Counts the number of tokens in a given prompt using a specified model.

Parameters:

Name Type Description Default
prompt str

The text for which to count the tokens.

required

Returns:

Type Description
int

The number of tokens in the prompt.

create_mkdocs_git_workflow(repository_url, path)

Generates a YAML workflow that deploys the documentation for the chosen Git hosting service.

Parameters:

Name Type Description Default
repository_url str

URI of the Python project's repository on GitHub.

required
path str

The path to the root directory of the Python project.

required

Returns:

Type Description
None

None. The method generates a workflow for the MkDocs documentation of the current project.

extract_pure_docstring(gpt_response) staticmethod

Extracts only the docstring from the GPT response while keeping triple quotes. Handles common formatting issues like Markdown blocks, extra indentation, and missing closing quotes.

Parameters:

Name Type Description Default
gpt_response str

Full response string from LLM.

required

Returns:

Type Description
str

A properly formatted Python docstring string with triple quotes.
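The cleanup cases listed above (Markdown blocks, extra indentation, missing closing quotes) could be handled roughly as follows. This is a minimal sketch using only the standard library; the function name extract_docstring and the exact fallback rules are assumptions for illustration and may differ from what extract_pure_docstring actually does.

```python
import re
import textwrap

FENCE = "`" * 3  # Markdown code-fence marker (three backticks)
QUOTES = '"' * 3  # Python triple-quote delimiter


def extract_docstring(response: str) -> str:
    """Return a triple-quoted docstring found in an LLM response."""
    # Drop Markdown fence lines, with or without a language tag.
    lines = [ln for ln in response.splitlines() if not ln.strip().startswith(FENCE)]
    # Remove indentation shared by all lines, then surrounding whitespace.
    text = textwrap.dedent("\n".join(lines)).strip()

    # Case 1: a complete docstring with opening and closing quotes.
    match = re.search(QUOTES + r"(.*?)" + QUOTES, text, re.DOTALL)
    if match:
        return QUOTES + match.group(1) + QUOTES

    # Case 2: opening quotes only, so append the missing closing quotes.
    start = text.find(QUOTES)
    if start != -1:
        return text[start:].rstrip() + "\n" + QUOTES

    # Case 3: no quotes at all, so wrap the whole text.
    return QUOTES + "\n" + text + "\n" + QUOTES
```

The three cases are tried in order, so a well-formed response is returned unchanged apart from the stripped fence lines and surrounding chatter.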

format_structure_openai(structure) staticmethod

Formats the structure of Python files in a readable string format.

Iterates over the 'structure' dictionary and builds a formatted string describing each file, its classes, and its functions, including line numbers, arguments, return types, source code, and docstrings where available.

Parameters:

Name Type Description Default
structure dict

A dictionary containing details of the Python files structure. The dictionary should

required

Returns:

Type Description
str

A formatted string representing the structure of the Python files.

format_with_black(filename) staticmethod

Formats the Python source code file at the given path using the black code formatter.

Parameters:

Name Type Description Default
filename

The path to the Python source code file to be formatted.

required

Returns:

Type Description

None

generate_class_documentation(class_details, semaphore) async

Generate documentation for a class.

Parameters:

Name Type Description Default
class_details list

A list of dictionaries containing method names and their docstrings.

required
semaphore Semaphore

Synchronization primitive that limits the degree of concurrency to avoid overloading the API.

required
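To illustrate how a semaphore bounds concurrency in this kind of method, the sketch below runs several simulated requests under an asyncio.Semaphore and records the peak number in flight. The names fake_llm_call and document_all are hypothetical, and the sleep stands in for a real API request; none of this is taken from the DocGen source.

```python
import asyncio


async def fake_llm_call(name: str, semaphore: asyncio.Semaphore, stats: dict) -> str:
    """Stand-in for a real API request; records peak concurrency."""
    async with semaphore:  # at most `limit` coroutines run this block at once
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
        await asyncio.sleep(0.01)  # simulate network latency
        stats["active"] -= 1
        return f"docstring for {name}"


async def document_all(names: list, limit: int = 2):
    """Run one simulated request per name, at most `limit` at a time."""
    semaphore = asyncio.Semaphore(limit)
    stats = {"active": 0, "peak": 0}
    results = await asyncio.gather(
        *(fake_llm_call(n, semaphore, stats) for n in names)
    )
    return results, stats["peak"]


if __name__ == "__main__":
    results, peak = asyncio.run(document_all(["A", "B", "C", "D", "E"]))
    print(peak, results)
```

Sharing one semaphore across all calls is what enforces a global limit; creating a new semaphore per call would not restrict anything.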

generate_documentation_mkdocs(path, files_info, modules_info)

Generates MkDocs documentation for a Python project based on provided path.

Parameters:

Name Type Description Default
path str

The path to the root directory of the Python project.

required

Returns:

Type Description
None

None. The method generates MkDocs documentation for the project.

generate_method_documentation(method_details, semaphore, context_code=None) async

Generate documentation for a single method.

insert_cls_docstring_in_code(source_code, class_name, generated_docstring) staticmethod

Inserts or replaces a class-level docstring for a given class name.

Parameters:

Name Type Description Default
source_code str

The full source code string.

required
class_name str

Name of the class to update.

required
generated_docstring str

The new docstring (raw response from LLM).

required

Returns:

Type Description
str

Updated source code with the inserted or replaced class docstring.
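One possible way to implement insert-or-replace for a class docstring is to locate the class with the standard ast module and edit the source by line number. The sketch below assumes a single-line docstring passed as plain text (not a raw LLM response) and a class body that starts on its own line; the name insert_class_docstring is hypothetical and the approach may differ from the one DocGen uses.

```python
import ast


def insert_class_docstring(source: str, class_name: str, docstring: str) -> str:
    """Insert or replace the docstring of class_name in source."""
    tree = ast.parse(source)
    lines = source.splitlines()
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef) and node.name == class_name:
            first = node.body[0]  # first statement of the class body
            indent = " " * first.col_offset
            new_line = f'{indent}"""{docstring}"""'
            if ast.get_docstring(node, clean=False) is not None:
                # Replace the lines occupied by the existing docstring.
                lines[first.lineno - 1 : first.end_lineno] = [new_line]
            else:
                # Insert above the first statement, including its decorators.
                decorators = getattr(first, "decorator_list", [])
                start = min([first.lineno] + [d.lineno for d in decorators])
                lines.insert(start - 1, new_line)
            break
    return "\n".join(lines) + "\n"
```

If no class with the given name exists, the source is returned unchanged apart from a normalized trailing newline.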

insert_docstring_in_code(source_code, method_details, generated_docstring, class_method=False) staticmethod

Inserts or replaces a method-level docstring in the provided source code, using the method's body from method_details['source_code'] to locate the method. Handles multi-line signatures, decorators, async definitions, and existing docstrings.

strip_docstring_from_body(body) staticmethod

Strips the docstring from a method's body.
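A minimal sketch of docstring stripping, assuming 'body' holds the full source of one function (signature included). It uses the standard ast module to find the lines the docstring occupies and deletes them; the name strip_docstring is hypothetical and the real method may work differently, for example with regular expressions.

```python
import ast
import textwrap


def strip_docstring(body: str) -> str:
    """Return the source of a function with its docstring removed."""
    source = textwrap.dedent(body)  # tolerate an indented method definition
    func = ast.parse(source).body[0]
    if ast.get_docstring(func) is None:
        return source  # nothing to strip
    doc = func.body[0]  # the docstring is the first statement of the body
    lines = source.splitlines()
    del lines[doc.lineno - 1 : doc.end_lineno]
    return "\n".join(lines)
```

For example, a two-statement function whose first statement is a docstring comes back as just the signature line followed by the remaining statement.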

summarize_submodules(project_structure, rate_limit=20) async

Recursively traverses the parsed structure of a Python codebase and generates a short summary for each directory (submodule).

Parameters:

Name Type Description Default
project_structure

A dictionary representing the parsed structure of the Python codebase. The dictionary keys are filenames and the values are lists of dictionaries representing classes and their methods.

required
rate_limit int

Maximum number of concurrent requests to the provided API.

20

update_class_documentation(class_details, semaphore) async

Update documentation for a class.

Parameters:

Name Type Description Default
class_details list

A list of dictionaries containing method names and their docstrings.

required
semaphore Semaphore

Synchronization primitive that limits the degree of concurrency to avoid overloading the API.

required

update_method_documentation(method_details, semaphore, context_code=None, class_name=None) async

Update documentation for a single method.