18 changes: 17 additions & 1 deletion apiCall.py
@@ -8,7 +8,23 @@


def generate_code_tree(file_path: str, content: str, modified_lines: List[int]) -> Dict[str, CodeTree]:
"""Generate a code tree for a file with modified lines."""
"""Generate a code tree for a file with modified lines.

This function sends a POST request to a specified API endpoint to
generate a code tree based on the provided file path, content, and
modified lines. It constructs the necessary data and headers for the
request and handles any exceptions that may occur during the process. If
the request is successful, it returns the JSON response containing the
code tree; otherwise, it returns an empty dictionary.

Args:
file_path (str): The path to the file for which the code tree is generated.
content (str): The content of the file as a string.
modified_lines (List[int]): A list of line numbers that have been modified.

Returns:
Dict[str, CodeTree]: A dictionary representing the generated code tree.
"""

url = "https://production-gateway.snorkell.ai/api/v1/hook/file/generate/codetree"
data = {
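The behaviour described in the `generate_code_tree` docstring (POST the file data, return the decoded JSON, fall back to an empty dictionary on any failure) can be sketched as below. This is a minimal illustration, not the code from `apiCall.py`: the payload keys, the `post_json` helper, the `transport` parameter, and the function name `generate_code_tree_sketch` are all assumptions, since the diff elides the request body and headers.

```python
import json
import urllib.request
from typing import Any, Callable, Dict, List

# A transport takes (url, payload) and returns the decoded JSON response.
Transport = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def post_json(url: str, payload: Dict[str, Any], timeout: float = 10.0) -> Dict[str, Any]:
    """POST `payload` as JSON and decode the JSON response (standard library only)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def generate_code_tree_sketch(
    file_path: str,
    content: str,
    modified_lines: List[int],
    url: str,
    transport: Transport = post_json,
) -> Dict[str, Any]:
    """Return the service response, or {} if the request fails for any reason."""
    # Hypothetical payload keys: the real request body is not visible in the diff.
    payload = {
        "file_path": file_path,
        "content": content,
        "modified_lines": modified_lines,
    }
    try:
        return transport(url, payload)
    except Exception as exc:  # deliberately broad, mirroring "handles any exceptions"
        print(f"Code tree request failed: {exc}")
        return {}
```

Passing the transport as a parameter keeps the fallback path easy to exercise without network access; a caller that wants the real request simply omits the argument.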
36 changes: 34 additions & 2 deletions dualEncoder.py
@@ -68,7 +68,24 @@ def encode_batch(
encoder: SentenceTransformer,
batch_size: int = 8
) -> np.ndarray:
"""Encode texts in batches."""
"""Encode a list of texts in batches using a specified encoder.

This function takes a list of text strings and encodes them using the
provided SentenceTransformer encoder. Texts are processed in batches of
`batch_size` to bound memory usage. A progress bar is displayed during
encoding, and the output is returned as a NumPy array.

Args:
texts (List[str]): A list of text strings to be encoded.
encoder (SentenceTransformer): The encoder used to transform the texts.
batch_size (int, optional): The number of texts to process in each batch.
Defaults to 8.

Returns:
np.ndarray: A NumPy array containing the encoded representations of the input texts.
"""
return encoder.encode(
texts,
batch_size=batch_size,
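To make the batching described in the `encode_batch` docstring concrete, here is a small standard-library sketch of what "encode in batches" amounts to. It is an illustration only: in the PR the batching is delegated to `encoder.encode` through its `batch_size` argument, whereas `iter_batches`, `encode_in_batches`, and the `encode_fn` callable below are hypothetical names introduced for this example.

```python
from typing import Callable, Iterator, List, Sequence

Vector = List[float]


def iter_batches(items: Sequence[str], batch_size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of `items`, each at most `batch_size` long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def encode_in_batches(
    texts: Sequence[str],
    encode_fn: Callable[[Sequence[str]], List[Vector]],
    batch_size: int = 8,
) -> List[Vector]:
    """Encode `texts` one batch at a time and concatenate the results in order."""
    vectors: List[Vector] = []
    for batch in iter_batches(texts, batch_size):
        # Only one batch is handed to the encoder at a time, which is what
        # keeps peak memory proportional to batch_size rather than len(texts).
        vectors.extend(encode_fn(batch))
    return vectors
```

With 20 input texts and the default `batch_size` of 8, `encode_fn` is called three times, on batches of 8, 8, and 4 texts.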
@@ -78,7 +95,22 @@ def encode_batch(


def index_repository(self, repo_path: str, docs_path: str, force_update: bool = False):
"""Index all Python files using both encoders."""
"""Index all Python files using both encoders.

This function scans the specified repository path for Python and other
specified file types, collects their code and documentation, and
generates embeddings for both. It checks if an index already exists and
can skip the indexing process if not forced to update. The function
processes each file, extracts methods and classes, and prepares them for
encoding. Finally, it saves the generated index to a JSON file.

Args:
repo_path (str): The path to the repository containing the code files.
docs_path (str): The path to the documentation files (not currently used in this
implementation).
force_update (bool, optional): A flag indicating whether to force an update of the index.
Defaults to False.
"""


# external_docs = self.load_documentation(docs_path)
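The control flow described in the `index_repository` docstring (reuse an existing index unless `force_update` is set, otherwise scan the files, extract functions and classes, and save the result as JSON) can be sketched as follows. This is a simplified stand-in under stated assumptions: it covers only `*.py` files, records names instead of computing embeddings, and the function name `index_repository_sketch` and the `index_path` parameter are hypothetical, since the method body is not shown in the diff.

```python
import ast
import json
from pathlib import Path
from typing import Dict, List


def index_repository_sketch(
    repo_path: str,
    index_path: str,
    force_update: bool = False,
) -> Dict[str, List[str]]:
    """Map each Python file under `repo_path` to the functions and classes it defines."""
    index_file = Path(index_path)

    # Reuse the saved index unless the caller explicitly asks for a rebuild.
    if index_file.exists() and not force_update:
        return json.loads(index_file.read_text(encoding="utf-8"))

    index: Dict[str, List[str]] = {}
    for source_file in sorted(Path(repo_path).rglob("*.py")):
        try:
            tree = ast.parse(source_file.read_text(encoding="utf-8"))
        except (SyntaxError, UnicodeDecodeError):
            continue  # skip files that cannot be read or parsed as Python
        names = [
            node.name
            for node in ast.walk(tree)
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
        ]
        index[source_file.relative_to(repo_path).as_posix()] = names

    # Persist the index so later calls can skip the scan.
    index_file.write_text(json.dumps(index, indent=2), encoding="utf-8")
    return index
```

A second call with the same `index_path` returns the saved index without rescanning, so files added in the meantime only appear after a call with `force_update=True`.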