Skip to content

Conversation

@ishaksebsib
Copy link
Contributor

@ishaksebsib ishaksebsib commented Oct 21, 2025

Description

Problem

  • Analyzer Issue: The type validator incorrectly grouped IS_IN with other boolean operations, expecting scalar arguments instead of arrays. This caused compilation errors for queries like nodes::WHERE(_::{id}::IS_IN(node_ids)) when node_ids is [ID].
  • Runtime Panic: after parsing, queries using IS_IN to filter by internal node IDs caused a panic: "Value is not an id" at value.rs:1591. This happened
    because check_property("id") returned Value::String (stringified UUID), but IS_IN's is_in method requires Value::Id for ID operations (via IntoPrimitive<ID>).
  • Root Cause: Type mismatch in both analysis (expecting scalars for arrays) and runtime (wrong Value variant for IDs).
  • Impact: IS_IN with arrays failed at compile-time or runtime, preventing valid queries from working.

What We Solved

  • Analyzer Fix: Separated IS_IN validation in traversal_validation.rs to expect Type::Array(Box<Type::Scalar(ft))) for arguments, ensuring array element compatibility.
  • Runtime Fix: Updated check_property("id") in filterable.rs to return Value::Id(ID::from(self.id)) instead of Value::String(...), aligning with IntoPrimitive<ID> expectations.
  • Result: IS_IN now correctly validates and executes with arrays, resolving panics while preserving ID string serialization in responses.

The fixes ensure type safety for array operations and proper handling of reserved properties.

Related Issues

None

Checklist when merging to main

  • No compiler warnings (if applicable)
  • Code is formatted with rustfmt
  • No useless or dead code (if applicable)
  • Code is easy to understand
  • Doc comments are used for all functions, enums, structs, and fields (where appropriate)
  • All tests pass
  • Performance has not regressed (assuming change was not to fix a bug)
  • Version number has been updated in helix-cli/Cargo.toml and helixdb/Cargo.toml

Additional Notes

None

Greptile Overview

Updated On: 2025-10-21 12:34:05 UTC

Summary

This PR fixes IS_IN operator to work correctly with array arguments by addressing type validation at compile-time and value handling at runtime.

Changes:

  • Analyzer (traversal_validation.rs): Separated IS_IN from scalar boolean operations, now correctly expects Type::Array(Box<Type::Scalar(ft))) and validates array element types match the field type
  • Runtime (filterable.rs, vector.rs): Changed check_property("id") for Node, Edge, and HVector to return Value::Id(ID::from(self.id)) instead of Value::String, ensuring compatibility with IntoPrimitive<ID> trait used by Value::is_in() method

Impact:
Enables queries like nodes::WHERE(_::{id}::IS_IN(node_ids)) to compile and execute without panics, resolving both the analyzer's incorrect type expectations and the runtime's value variant mismatch.

Important Files Changed

File Analysis

Filename Score Overview
helix-db/src/utils/filterable.rs 5/5 Changed check_property("id") to return Value::Id(ID::from(self.id)) for both Node and Edge, fixing runtime panics in IS_IN operations
helix-db/src/helix_engine/vector_core/vector.rs 5/5 Updated HVector's check_property("id") to return Value::Id(ID::from(self.id)), ensuring consistency with Node and Edge implementations
helix-db/src/helixc/analyzer/methods/traversal_validation.rs 5/5 Separated IS_IN validation from scalar boolean operations to correctly expect Type::Array arguments with scalar elements, fixing compile-time type checking

Sequence Diagram

sequenceDiagram
    participant Query as HQL Query
    participant Analyzer as traversal_validation
    participant Runtime as check_property
    participant IsIn as Value is_in
    participant IntoPrim as IntoPrimitive

    Query->>Analyzer: nodes WHERE id IS_IN node_ids
    
    alt Before Analyzer Fix
        Note over Analyzer: IS_IN grouped with scalar ops
        Analyzer->>Analyzer: Expect Type Scalar
        Analyzer-->>Query: Error Expected scalar got array
    end
    
    alt After Analyzer Fix
        Note over Analyzer: IS_IN separated from scalar ops
        Analyzer->>Analyzer: Check IS_IN arg type
        Analyzer->>Analyzer: Expect Type Array of Scalar
        Analyzer->>Analyzer: Validate array element type
        Analyzer-->>Query: Compilation succeeds
    end
    
    Query->>Runtime: Execute filter by ID
    Runtime->>Runtime: check_property id
    
    alt Before Runtime Fix
        Runtime-->>IsIn: Value String uuid_string
        IsIn->>IntoPrim: into_primitive ID
        IntoPrim-->>IsIn: panic Value is not an id
    end
    
    alt After Runtime Fix
        Runtime-->>IsIn: Value Id ID from self id
        IsIn->>IntoPrim: into_primitive ID
        IntoPrim-->>IsIn: Returns ID reference
        IsIn-->>Query: Returns filter result
    end
Loading

@ishaksebsib ishaksebsib marked this pull request as ready for review October 21, 2025 09:37
Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additional Comments (1)

  1. helix-db/src/utils/filterable.rs, line 230 (link)

    logic: Edge's id property returns Value::String (via Value::from(self.uuid())), while Node now returns Value::Id. This inconsistency will cause the same panic if IS_IN is used on edge IDs.

2 files reviewed, 1 comment

Edit Code Review Agent Settings | Greptile

@ishaksebsib
Copy link
Contributor Author

@greptile

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 files reviewed, no comments

Edit Code Review Agent Settings | Greptile

xav-db added a commit that referenced this pull request Oct 29, 2025
…roperty handling

## Changes

### 1. Added Reserved Property Support
- Created `ReservedProp` enum for built-in properties (id, label, version, etc.)
- Added `Step::ReservedPropertyAccess` variant to generate direct field access
- Reserved properties now return correct Value types (e.g., Value::Id for id field)

### 2. Updated Property Access Generator
- Modified `gen_property_access()` to identify reserved vs user-defined properties
- Reserved properties use direct struct field access
- User-defined properties continue using `get_property()` HashMap lookup

### 3. Enhanced Boolean Operation Optimization
- Updated `WhereRef` Display implementation for reserved property optimization
- Generates `Value::Id(ID::from(val.id))` for id field in filters
- Maintains `get_property()` optimization for user-defined properties

### 4. Fixed IS_IN Validation
- Separated IS_IN from scalar boolean operations
- IS_IN now correctly expects `Type::Array` arguments with scalar elements
- Added `get_reserved_property_type()` helper for type validation
- Updated Node/Edge/Vector validation to check reserved properties first

### 5. Testing
- Added IS_IN test with both reserved (id) and user-defined (field) properties
- Verified correct code generation and type safety

## Impact
Resolves the issues from original PR #666:
- IS_IN with arrays now validates at compile-time
- Reserved property access returns correct Value types for IS_IN operations
- Queries like `WHERE(_::{id}::IS_IN(node_ids))` now compile and execute correctly

Co-Authored-By: ishaksebsib <[email protected]>
xav-db added a commit that referenced this pull request Oct 29, 2025
#674)

## Summary

This PR integrates the fixes from PR #666 into the current
`arena-implementation` branch. PR #666 originally fixed IS_IN validation
and runtime handling for reserved properties, but the current
implementation has diverged significantly from when that PR was created.

## Problem

**Analyzer Issue**: The type validator incorrectly grouped `IS_IN` with
other boolean operations, expecting scalar arguments instead of arrays.
This caused compilation errors for queries like
`nodes::WHERE(_::{id}::IS_IN(node_ids))` when `node_ids` is `[ID]`.

**Runtime Issue**: The current implementation removed `filterable.rs`
and its special handling for reserved properties (id, label, etc.).
Reserved properties are now struct fields, not in the properties
HashMap, so `get_property("id")` returns `None`.

**Root Cause**: 
1. IS_IN validation expects scalars instead of arrays
2. Reserved properties lost special handling during refactoring
3. Type mismatch between `Value::String` and `Value::Id` for ID
operations

## Solution

### 1. Added Reserved Property Support
- Created `ReservedProp` enum for built-in properties: Id, Label,
Version, FromNode, ToNode, Deleted, Level, Distance, Data
- Added `Step::ReservedPropertyAccess` variant to generate direct field
access
- Reserved properties now return correct Value types (e.g., `Value::Id`
for id field)

### 2. Updated Property Access Generator
- Modified `gen_property_access()` in `analyzer/utils.rs` to identify
reserved vs user-defined properties
- Reserved properties use direct struct field access:
`Value::Id(ID::from(val.id))`
- User-defined properties continue using `get_property()` HashMap lookup

### 3. Enhanced Boolean Operation Optimization
- Updated `WhereRef` Display implementation in `traversal_steps.rs` for
reserved properties
- Generates optimized filter code with correct Value types
- Example: `Value::Id(ID::from(val.id)).is_in(&data.node_ids)`

### 4. Fixed IS_IN Validation
- Separated IS_IN from scalar boolean operations in
`traversal_validation.rs`
- IS_IN now correctly expects `Type::Array(Box<Type::Scalar(ft)))`
arguments
- Added `get_reserved_property_type()` helper for type-safe validation
- Updated Node/Edge/Vector validation to check reserved properties first

## Files Changed

- `helix-db/src/helixc/generator/traversal_steps.rs` - Added
ReservedProp enum and Display logic
- `helix-db/src/helixc/analyzer/utils.rs` - Updated
gen_property_access()
- `helix-db/src/helixc/analyzer/methods/traversal_validation.rs` - Fixed
IS_IN validation and reserved property checking
- `hql-tests/tests/is_in/` - Added test cases for IS_IN with reserved
and user-defined properties

## Testing

Added comprehensive IS_IN tests:
```hql
QUERY GetNodes (fields: [String]) =>
    node <- N<MyNode>::WHERE(_::{field}::IS_IN(fields))
    RETURN node

QUERY GetNodesByID (node_ids: [ID]) =>
    node <- N<MyNode>::WHERE(_::{id}::IS_IN(node_ids))
    RETURN node
```

**Verified**:
- ✅ Queries compile successfully
- ✅ Correct code generation for reserved properties (direct field
access)
- ✅ Correct code generation for user-defined properties (get_property)
- ✅ Type safety maintained for array operations

## Generated Code Example

For reserved property `id`:
```rust
.filter_ref(|val, txn|{
    if let Ok(val) = val {
        Ok(Value::Id(ID::from(val.id)).is_in(&data.node_ids))
    } else {
        Ok(false)
    }
})
```

For user-defined property `field`:
```rust
.filter_ref(|val, txn|{
    if let Ok(val) = val {
        Ok(val
        .get_property("field")
        .map_or(false, |v| v.is_in(&data.fields)))
    } else {
        Ok(false)
    }
})
```

## Impact

- IS_IN operations now work correctly with arrays at compile-time and
runtime
- Reserved properties return proper Value types for IS_IN comparisons
- Queries like `WHERE(_::{id}::IS_IN(node_ids))` compile and execute
without panics
- No breaking changes to existing functionality

Co-Authored-By: ishaksebsib <[email protected]>
@xav-db xav-db mentioned this pull request Oct 29, 2025
8 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant