Merged
2 changes: 1 addition & 1 deletion Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion Cargo.toml
@@ -1,6 +1,6 @@
[package]
name = "git-internal"
-version = "0.5.0"
+version = "0.6.0"
edition = "2024"
license = "MIT"
description = "Git-Internal is a high-performance Rust library for encoding and decoding Git internal objects and Pack files."
4 changes: 4 additions & 0 deletions src/errors.rs
@@ -90,6 +90,10 @@ pub enum GitError {
#[error("Not a valid agent task object: {0}")]
InvalidTaskObject(String),

/// Malformed intent object.
#[error("Not a valid agent intent object: {0}")]
InvalidIntentObject(String),

/// Malformed tool invocation object.
#[error("Not a valid agent tool invocation object: {0}")]
InvalidToolInvocationObject(String),
7 changes: 6 additions & 1 deletion src/internal/object/blob.rs
@@ -105,7 +105,10 @@ impl Blob {
#[cfg(test)]
mod tests {
use super::*;
-use crate::hash::{HashKind, set_hash_kind_for_test};
+use crate::{
+    hash::{HashKind, set_hash_kind_for_test},
+    internal::object::ObjectTrait,
+};

/// Test creating a Blob from content string
#[test]
@@ -117,6 +120,8 @@ mod tests {
blob.id.to_string(),
"5dd01c177f5d7d1be5346a5bc18a569a7410c2ef"
);
let hash_from_trait = blob.object_hash().unwrap();

Test Coverage: Good addition but could be more thorough

The new test checks that object_hash() matches the stored id. Consider also testing:

  1. Hash computation for different content sizes (empty, small, large)
  2. Consistency across multiple calls (deterministic hashing)
  3. Different hash kinds (SHA-1 vs SHA-256)

Enhancement suggestion:

#[test]
fn test_blob_hash_consistency() {
    set_hash_kind_for_test(HashKind::Sha1);
    let blob1 = Blob::from_content("test data");
    let hash1 = blob1.object_hash().unwrap();
    let hash2 = blob1.object_hash().unwrap();
    assert_eq!(hash1, hash2, "Hash should be deterministic");
    
    // Test with SHA-256
    set_hash_kind_for_test(HashKind::Sha256);
    let blob2 = Blob::from_content("test data");
    let hash3 = blob2.object_hash().unwrap();
    assert_ne!(hash1.to_string(), hash3.to_string(), "Different algorithms produce different hashes");
}

assert_eq!(hash_from_trait.to_string(), blob.id.to_string());
}

/// Test creating a Blob from content string using SHA-256 hash algorithm.
23 changes: 22 additions & 1 deletion src/internal/object/context.rs
@@ -44,6 +44,15 @@ pub enum SelectionStrategy {
pub enum ContextItemKind {
/// A regular file in the repository.
File,
/// A URL (web page, API endpoint, etc.).
Url,
/// A free-form text snippet (e.g. doc fragment, note).
Snippet,
/// Command or terminal output.
Command,
/// Image or other binary visual content.

API Design: Consider a closed enum instead of the Other(String) catch-all

The Other(String) variant allows arbitrary strings, which makes it harder to:

  • Validate and reason about valid context item kinds
  • Maintain backward compatibility when new kinds are added
  • Perform exhaustive pattern matching
Suggested change
/// Image or other binary visual content.
Other(String),

Recommendation:
Unless there's a strong need for extensibility, consider either:

  1. Making this a closed enum (remove Other) and add new variants as needed
  2. If extensibility is required, document the expected format/conventions for Other values

Example:

/// Other context item kind with a documented format.
/// Should follow the pattern: "provider:kind" (e.g., "github:issue", "jira:ticket")
Other(String),
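
If the Other(String) variant stays, the convention could be checked in code rather than only documented. A minimal sketch, assuming the "provider:kind" format from the example above; `is_valid_other` and `validate_kind` are hypothetical helpers, and the enum here is a std-only mirror of the PR's type without its serde derives:

```rust
/// Std-only mirror of the PR's `ContextItemKind` (serde derives omitted).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ContextItemKind {
    File,
    Url,
    Snippet,
    Command,
    Image,
    Other(String),
}

/// Hypothetical check for the "provider:kind" convention:
/// exactly one ':' with a non-empty part on each side.
pub fn is_valid_other(value: &str) -> bool {
    match value.split_once(':') {
        Some((provider, kind)) => {
            !provider.is_empty() && !kind.is_empty() && !kind.contains(':')
        }
        None => false,
    }
}

/// Hypothetical validation hook: only `Other` values can be rejected.
pub fn validate_kind(kind: &ContextItemKind) -> Result<(), String> {
    match kind {
        ContextItemKind::Other(value) if !is_valid_other(value) => Err(format!(
            "invalid Other kind {value:?}: expected \"provider:kind\""
        )),
        _ => Ok(()),
    }
}
```

A check like this could run wherever a ContextItem is constructed, so malformed Other values are rejected before they are serialized.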

Image,
Other(String),
}

/// Context item describing a single input.
@@ -52,6 +61,11 @@ pub struct ContextItem {
pub kind: ContextItemKind,
pub path: String,
pub content_id: IntegrityHash,
/// Optional preview/summary of the content (for example, first 200 characters).
/// Used for display without loading the full content via `content_id`.
/// Should be kept under 500 characters for performance.
Copilot AI Feb 17, 2026

The documentation recommends keeping content_preview under 500 characters for performance, but there's no validation or enforcement of this limit in the ContextItem::new method or anywhere else. Consider adding validation to enforce this limit, or make it clear that this is just a guideline. If it's a hard requirement for performance, it should be enforced in code.

Suggested change
/// Should be kept under 500 characters for performance.
/// For performance, it is recommended (but not enforced) to keep this under 500 characters.
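
If the limit were enforced rather than only documented, one option is to truncate when the preview is built. A minimal sketch, assuming the limit counts Unicode scalar values (`char`s) and reading "under 500" as "at most 500"; `MAX_PREVIEW_CHARS`, `make_preview`, and `validate_preview` are hypothetical names, not part of this PR:

```rust
/// Assumed limit, taken from the doc comment under review.
pub const MAX_PREVIEW_CHARS: usize = 500;

/// Builds a preview by keeping at most `MAX_PREVIEW_CHARS` characters.
/// Iterating over `chars()` avoids cutting a multi-byte character in half.
pub fn make_preview(content: &str) -> String {
    content.chars().take(MAX_PREVIEW_CHARS).collect()
}

/// Rejects previews that exceed the limit instead of truncating them.
pub fn validate_preview(preview: &str) -> Result<(), String> {
    let len = preview.chars().count();
    if len > MAX_PREVIEW_CHARS {
        Err(format!(
            "content_preview has {} characters; the limit is {}",
            len, MAX_PREVIEW_CHARS
        ))
    } else {
        Ok(())
    }
}
```

Truncating silently suits display-only data; returning an error suits callers who want to know their preview was too long. Either way the doc comment and the behavior would agree.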

#[serde(default)]

Missing Documentation: What is content_preview for?

The new content_preview field lacks documentation explaining:

  • What should be stored here (truncated content? summary?)
  • When should it be populated vs left as None?
  • What's the expected length/format?
  • How does it relate to the actual content referenced by content_id?
Suggested change
#[serde(default)]
#[serde(default)]
pub content_preview: Option<String>,

Recommendation:

/// Optional preview/summary of the content (e.g., first 200 chars).
/// Used for display purposes without loading the full content via content_id.
/// Should be kept under 500 characters for performance.
#[serde(default)]
pub content_preview: Option<String>,

pub content_preview: Option<String>,
}

impl ContextItem {
@@ -68,6 +82,7 @@ impl ContextItem {
kind,
path,
content_id,
content_preview: None,
})
}
}
@@ -150,7 +165,13 @@ impl ObjectTrait for ContextSnapshot {
}

fn get_size(&self) -> usize {
- serde_json::to_vec(self).map(|v| v.len()).unwrap_or(0)
+ match serde_json::to_vec(self) {
+     Ok(v) => v.len(),
+     Err(e) => {
+         tracing::warn!("failed to compute ContextSnapshot size: {}", e);
+         0
+     }
+ }
}

fn to_data(&self) -> Result<Vec<u8>, GitError> {
8 changes: 7 additions & 1 deletion src/internal/object/decision.rs
@@ -180,7 +180,13 @@ impl ObjectTrait for Decision {
}

fn get_size(&self) -> usize {
- serde_json::to_vec(self).map(|v| v.len()).unwrap_or(0)
+ match serde_json::to_vec(self) {
+     Ok(v) => v.len(),
+     Err(e) => {
+         tracing::warn!("failed to compute Decision size: {}", e);
+         0
+     }
+ }
}

fn to_data(&self) -> Result<Vec<u8>, GitError> {
8 changes: 7 additions & 1 deletion src/internal/object/evidence.rs
@@ -184,7 +184,13 @@ impl ObjectTrait for Evidence {
}

fn get_size(&self) -> usize {
- serde_json::to_vec(self).map(|v| v.len()).unwrap_or(0)
+ match serde_json::to_vec(self) {
+     Ok(v) => v.len(),
+     Err(e) => {
+         tracing::warn!("failed to compute Evidence size: {}", e);
+         0
+     }
+ }
}

fn to_data(&self) -> Result<Vec<u8>, GitError> {
169 changes: 169 additions & 0 deletions src/internal/object/intent.rs
@@ -0,0 +1,169 @@
use std::fmt;

use serde::{Deserialize, Serialize};
use uuid::Uuid;

use crate::{
    errors::GitError,
    hash::ObjectHash,
    internal::object::{
        ObjectTrait,
        integrity::IntegrityHash,
        types::{ActorRef, Header, ObjectType},
    },
};

#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Eq)]
#[serde(rename_all = "snake_case")]
pub enum IntentStatus {
    Draft,
    Active,
    Completed,
    Cancelled,
}

impl IntentStatus {
    pub fn as_str(&self) -> &'static str {
        match self {
            IntentStatus::Draft => "draft",
            IntentStatus::Active => "active",
            IntentStatus::Completed => "completed",
            IntentStatus::Cancelled => "cancelled",
        }
    }
}

impl fmt::Display for IntentStatus {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.as_str())
    }
}

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct Intent {
    #[serde(flatten)]
    header: Header,
    content: String,
    parent_id: Option<Uuid>,
    root_id: Option<Uuid>,
    task_id: Option<Uuid>,
    result_commit_sha: Option<IntegrityHash>,

API Design: Consider builder pattern for complex initialization

The Intent struct has many optional fields that would typically be set after creation. The current API requires multiple setter calls:

let mut intent = Intent::new(repo_id, actor, "content")?;
intent.set_parent_id(Some(parent));
intent.set_task_id(Some(task));
intent.set_status(IntentStatus::Active);

Suggestion: Consider a builder pattern for cleaner API:

impl Intent {
    pub fn builder(repo_id: Uuid, created_by: ActorRef) -> IntentBuilder {
        IntentBuilder::new(repo_id, created_by)
    }
}

pub struct IntentBuilder { /* ... */ }
impl IntentBuilder {
    pub fn content(mut self, content: impl Into<String>) -> Self { /* ... */ }
    pub fn parent_id(mut self, id: Uuid) -> Self { /* ... */ }
    pub fn task_id(mut self, id: Uuid) -> Self { /* ... */ }
    pub fn build(self) -> Result<Intent, String> { /* ... */ }
}

// Usage:
let intent = Intent::builder(repo_id, actor)
    .content("Refactor login")
    .parent_id(parent_id)
    .task_id(task_id)
    .build()?;

This is optional but improves ergonomics for types with many optional fields.
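
The builder above leaves its method bodies elided. A minimal runnable sketch of the same idea, under loud assumptions: `u128` stands in for `Uuid`, the `Header`, `ActorRef`, and repo id are omitted, and the field layout is a simplification rather than the PR's actual `Intent`:

```rust
/// Simplified stand-in for the PR's `IntentStatus`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum IntentStatus {
    Draft,
    Active,
}

/// Simplified stand-in for the PR's `Intent` (no header, `u128` instead of `Uuid`).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Intent {
    pub content: String,
    pub parent_id: Option<u128>,
    pub task_id: Option<u128>,
    pub status: IntentStatus,
}

/// Hypothetical builder: optional fields default to `None`.
#[derive(Debug, Default)]
pub struct IntentBuilder {
    content: Option<String>,
    parent_id: Option<u128>,
    task_id: Option<u128>,
}

impl IntentBuilder {
    pub fn new() -> Self {
        Self::default()
    }

    pub fn content(mut self, content: impl Into<String>) -> Self {
        self.content = Some(content.into());
        self
    }

    pub fn parent_id(mut self, id: u128) -> Self {
        self.parent_id = Some(id);
        self
    }

    pub fn task_id(mut self, id: u128) -> Self {
        self.task_id = Some(id);
        self
    }

    /// Fails if the one required field, `content`, was never set.
    pub fn build(self) -> Result<Intent, String> {
        let content = self
            .content
            .ok_or_else(|| "content is required".to_string())?;
        Ok(Intent {
            content,
            parent_id: self.parent_id,
            task_id: self.task_id,
            status: IntentStatus::Draft,
        })
    }
}
```

In this sketch the required-field check sits in build(), so a missing content surfaces as an error at the end of the chain rather than as a panic.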

    status: IntentStatus,
}

impl Intent {
    pub fn new(
        repo_id: Uuid,
        created_by: ActorRef,
        content: impl Into<String>,
    ) -> Result<Self, String> {
        Ok(Self {
            header: Header::new(ObjectType::Intent, repo_id, created_by)?,
            content: content.into(),
            parent_id: None,
            root_id: None,
            task_id: None,
            result_commit_sha: None,
            status: IntentStatus::Draft,
        })
    }

    pub fn header(&self) -> &Header {
        &self.header
    }

    pub fn content(&self) -> &str {
        &self.content
    }

    pub fn parent_id(&self) -> Option<Uuid> {
        self.parent_id
    }

    pub fn root_id(&self) -> Option<Uuid> {
        self.root_id
    }

    pub fn task_id(&self) -> Option<Uuid> {
        self.task_id
    }

    pub fn result_commit_sha(&self) -> Option<&IntegrityHash> {
        self.result_commit_sha.as_ref()
    }

    pub fn status(&self) -> &IntentStatus {
        &self.status
    }

    pub fn set_parent_id(&mut self, parent_id: Option<Uuid>) {
        self.parent_id = parent_id;
    }

    pub fn set_root_id(&mut self, root_id: Option<Uuid>) {
        self.root_id = root_id;
    }

    pub fn set_task_id(&mut self, task_id: Option<Uuid>) {
        self.task_id = task_id;
    }

    pub fn set_result_commit_sha(&mut self, sha: Option<IntegrityHash>) {
        self.result_commit_sha = sha;
    }

    pub fn set_status(&mut self, status: IntentStatus) {
        self.status = status;
    }
}

impl fmt::Display for Intent {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "Intent: {}", self.header.object_id())
    }
}

impl ObjectTrait for Intent {
    fn from_bytes(data: &[u8], _hash: ObjectHash) -> Result<Self, GitError>
    where
        Self: Sized,

Error Handling: Wrong error variant used

The from_bytes method uses GitError::InvalidObjectInfo for deserialization errors, but there's a specific GitError::InvalidIntentObject variant added in this PR (line 94 of errors.rs).

Suggested change
Self: Sized,
serde_json::from_slice(data).map_err(|e| GitError::InvalidObjectInfo(e.to_string()))

Recommended fix:

serde_json::from_slice(data).map_err(|e| GitError::InvalidIntentObject(e.to_string()))

This provides better error specificity and uses the error variant that was specifically added for this type.
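
To illustrate why the specific variant matters to callers, here is a minimal std-only sketch. The enum is a two-variant stand-in for the crate's `GitError`, and `intent_content_from_bytes` is a hypothetical function that uses UTF-8 decoding in place of `serde_json::from_slice`:

```rust
/// Two-variant stand-in for the crate's `GitError`.
#[derive(Debug, PartialEq, Eq)]
pub enum GitError {
    InvalidObjectInfo(String),
    InvalidIntentObject(String),
}

/// Hypothetical decoder: UTF-8 validation stands in for JSON deserialization.
/// Failures are mapped to the intent-specific variant, as the review recommends.
pub fn intent_content_from_bytes(data: &[u8]) -> Result<String, GitError> {
    std::str::from_utf8(data)
        .map(|s| s.to_owned())
        .map_err(|e| GitError::InvalidIntentObject(e.to_string()))
}

/// A caller can now tell intent failures apart from generic ones.
pub fn describe(err: &GitError) -> &'static str {
    match err {
        GitError::InvalidIntentObject(_) => "malformed intent",
        GitError::InvalidObjectInfo(_) => "malformed object",
    }
}
```

With the generic variant, the `describe` match could not distinguish a bad intent from any other bad object without inspecting the message string.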

    {
        serde_json::from_slice(data).map_err(|e| GitError::InvalidIntentObject(e.to_string()))
    }

    fn get_type(&self) -> ObjectType {
        ObjectType::Intent
    }


Security Issue: Avoid unwrap_or with zero for size calculation

The get_size() method uses unwrap_or(0), which can hide serialization errors and return an incorrect size. This goes against the repository's coding convention of avoiding unsafe operations and returning proper Result types.

Suggested change
fn get_size(&self) -> usize {
serde_json::to_vec(self).map(|v| v.len()).unwrap_or(0)

Recommended fix:
Since ObjectTrait::get_size() returns usize, consider either:

  1. Changing the trait to return Result<usize, GitError> (breaking change)
  2. Documenting that this is a best-effort size and logging errors
  3. Using a cached size field updated during serialization

This same issue appears in other object types - consider a holistic fix across the codebase.
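
Options 1 and 2 can coexist without breaking the trait. A minimal sketch, assuming a simplified trait with `String` errors; `SizedObject`, `try_get_size`, `Good`, and `Bad` are hypothetical names, and `eprintln!` stands in for `tracing::warn!` so the block stays std-only:

```rust
/// Simplified stand-in for the size-related part of `ObjectTrait`.
pub trait SizedObject {
    fn to_data(&self) -> Result<Vec<u8>, String>;

    /// Option 1: a fallible size that propagates serialization errors.
    fn try_get_size(&self) -> Result<usize, String> {
        self.to_data().map(|data| data.len())
    }

    /// Option 2: best-effort size that logs the error and falls back to 0.
    fn get_size(&self) -> usize {
        match self.try_get_size() {
            Ok(len) => len,
            Err(e) => {
                eprintln!("failed to compute object size: {e}");
                0
            }
        }
    }
}

/// Object whose serialization always succeeds.
pub struct Good(pub Vec<u8>);

/// Object whose serialization always fails.
pub struct Bad;

impl SizedObject for Good {
    fn to_data(&self) -> Result<Vec<u8>, String> {
        Ok(self.0.clone())
    }
}

impl SizedObject for Bad {
    fn to_data(&self) -> Result<Vec<u8>, String> {
        Err("serialization failed".to_string())
    }
}
```

Adding a provided method like try_get_size would not be a breaking change, so existing get_size callers keep working while new callers can opt into the Result.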

    fn get_size(&self) -> usize {
        match serde_json::to_vec(self) {
            Ok(v) => v.len(),
            Err(e) => {
                tracing::warn!("failed to compute Intent size: {}", e);
                0
            }
        }
    }


Consistency Issue: Same error handling as from_bytes

The to_data method also uses InvalidObjectInfo when it should use InvalidIntentObject for consistency.

Suggested change
serde_json::to_vec(self).map_err(|e| GitError::InvalidObjectInfo(e.to_string()))

Recommended fix:

serde_json::to_vec(self).map_err(|e| GitError::InvalidIntentObject(e.to_string()))

    fn to_data(&self) -> Result<Vec<u8>, GitError> {
        serde_json::to_vec(self).map_err(|e| GitError::InvalidIntentObject(e.to_string()))
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_intent_creation() {
        let repo_id = Uuid::from_u128(0x0123456789abcdef0123456789abcdef);
        let actor = ActorRef::human("jackie").expect("actor");
        let intent = Intent::new(repo_id, actor, "Refactor login flow").expect("intent");

        assert_eq!(intent.header().object_type(), &ObjectType::Intent);
        assert_eq!(intent.status(), &IntentStatus::Draft);
        assert!(intent.parent_id().is_none());
        assert!(intent.root_id().is_none());
        assert!(intent.task_id().is_none());
    }
}
10 changes: 10 additions & 0 deletions src/internal/object/mod.rs
@@ -6,6 +6,7 @@ pub mod context;
pub mod decision;
pub mod evidence;
pub mod integrity;
pub mod intent;
pub mod note;
pub mod patchset;
pub mod plan;
@@ -58,4 +59,13 @@ pub trait ObjectTrait: Send + Sync + Display {
fn get_size(&self) -> usize;

fn to_data(&self) -> Result<Vec<u8>, GitError>;

/// Computes the object hash from serialized data.
///
/// Default implementation serializes the object and computes the hash from that data.
/// Override only if you need custom hash computation or caching.
fn object_hash(&self) -> Result<ObjectHash, GitError> {

Good Addition: Default object_hash() implementation

This is a nice improvement that reduces boilerplate! All types implementing ObjectTrait now get a default hash computation.

Minor suggestion: Consider adding a doc comment explaining when this might need to be overridden:

/// Computes the object hash from serialized data.
/// 
/// Default implementation serializes the object and computes hash from the data.
/// Override only if you need custom hash computation or caching.
fn object_hash(&self) -> Result<ObjectHash, GitError> {
    let data = self.to_data()?;
    Ok(ObjectHash::from_type_and_data(self.get_type(), &data))
}
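
As an illustration of the "caching" case named in that doc comment, here is a minimal std-only sketch of an override that memoizes the hash. Everything in it is an assumption for demonstration: `HashedObject` is a simplified stand-in for `ObjectTrait`, a `u64` from `DefaultHasher` replaces `ObjectHash` (it is not SHA-1 or SHA-256), and `CachedBlob` is a hypothetical type:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Stand-in digest; the real crate would hash the type and data with SHA-1 or SHA-256.
fn digest(data: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    data.hash(&mut hasher);
    hasher.finish()
}

/// Simplified stand-in for `ObjectTrait` with the same default-method shape.
pub trait HashedObject {
    fn to_data(&self) -> Result<Vec<u8>, String>;

    /// Default: serialize, then hash the serialized bytes.
    fn object_hash(&self) -> Result<u64, String> {
        let data = self.to_data()?;
        Ok(digest(&data))
    }
}

/// Hypothetical immutable object that caches its hash after the first call.
pub struct CachedBlob {
    data: Vec<u8>,
    cached: OnceLock<u64>,
    serializations: AtomicUsize,
}

impl CachedBlob {
    pub fn new(data: Vec<u8>) -> Self {
        Self {
            data,
            cached: OnceLock::new(),
            serializations: AtomicUsize::new(0),
        }
    }

    /// Number of times `to_data` has run; used to observe the cache.
    pub fn serializations(&self) -> usize {
        self.serializations.load(Ordering::Relaxed)
    }
}

impl HashedObject for CachedBlob {
    fn to_data(&self) -> Result<Vec<u8>, String> {
        self.serializations.fetch_add(1, Ordering::Relaxed);
        Ok(self.data.clone())
    }

    /// Override: reuse the cached hash when present, otherwise compute and store it.
    fn object_hash(&self) -> Result<u64, String> {
        if let Some(hash) = self.cached.get() {
            return Ok(*hash);
        }
        let hash = digest(&self.to_data()?);
        Ok(*self.cached.get_or_init(|| hash))
    }
}
```

A cache like this is only sound while the object cannot change after construction; a type with setters would need to invalidate the cached value on every mutation.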

let data = self.to_data()?;
Ok(ObjectHash::from_type_and_data(self.get_type(), &data))
}
}