Implement Future backend as primary API (refactor direct API as wrapper) #1

@scttfrdmn

Overview

Implement a complete Future backend as the primary API for staRburst. The direct API (starburst_map) will be refactored as a convenience wrapper around the Future backend.

Architecture Decision

Future-first approach: All features and functionality will be built around the Future backend. This provides:

  • Clean integration with the Future ecosystem (furrr, future.apply, targets)
  • Consistent async/parallel execution model
  • Easier maintenance (single execution path)
  • Better composability with existing R parallel tools

Current Status

  • ✅ Direct API (starburst_map, starburst_cluster) working and validated
  • ✅ AWS infrastructure fully functional (Fargate, Docker, S3, IAM)
  • ⏳ Basic StarburstFuture class exists in R/future-starburst.R
  • ⏳ Needs complete implementation

Implementation Plan

Phase 1: Core Future Backend

  • Implement complete run() method for Future execution
  • Implement resolved() method for checking completion
  • Implement result() method for retrieving results
  • Handle Future backend registration with plan(future_starburst)
  • Support sequential task resolution
  • Propagate worker errors to the caller and handle them properly
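
The run/resolved/result lifecycle listed above can be illustrated with a small toy that does not depend on the future package at all. This is only a sketch under stated assumptions: every name in it (new_toy_future, run_toy, resolved_toy, result_toy) is hypothetical and is neither staRburst nor future API. The real StarburstFuture would submit a Fargate task in run() rather than evaluate locally.

```r
# Toy lifecycle sketch; all names are hypothetical, not staRburst or future API.
new_toy_future <- function(expr_fun) {
  f <- new.env()
  f$expr_fun <- expr_fun
  f$status <- "created"
  structure(f, class = "toy_future")
}

# run: launch the work once and capture either a value or an error
run_toy <- function(f) {
  stopifnot(identical(f$status, "created"))
  f$status <- "running"
  # A real backend would submit a remote task here; the toy evaluates locally
  f$outcome <- tryCatch(
    list(value = f$expr_fun(), error = NULL),
    error = function(e) list(value = NULL, error = e)
  )
  f$status <- "finished"
  invisible(f)
}

# resolved: check for completion without retrieving the result
resolved_toy <- function(f) identical(f$status, "finished")

# result: return the value, or re-signal the captured error to the caller
result_toy <- function(f) {
  if (identical(f$status, "created")) run_toy(f)
  if (!is.null(f$outcome$error)) stop(f$outcome$error)
  f$outcome$value
}

ok <- new_toy_future(function() sum(1:10))
run_toy(ok)
ok_value <- result_toy(ok)

bad <- new_toy_future(function() stop("worker failed"))
run_toy(bad)
bad_message <- tryCatch(result_toy(bad), error = conditionMessage)
```

The point of the toy is the error path: the failure is captured where the work runs and only re-raised when the caller asks for the result, which is the behaviour the last bullet asks for.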

Phase 2: Refactor Direct API as Wrapper

  • Refactor starburst_map() to use Future backend internally
  • Refactor starburst_cluster() to create Future plan
  • Ensure backward compatibility
  • Maintain simple API for users who don't want Future concepts
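
One possible shape for such a wrapper is sketched below. It assumes the future and future.apply packages are installed; starburst_map_sketch and its backend argument are hypothetical names, and future::sequential stands in for the proposed future_starburst plan so the example runs without AWS. A real wrapper would presumably also forward workers, cpu and memory to plan().

```r
library(future)
library(future.apply)

# Hypothetical wrapper: `backend` stands in for the proposed future_starburst plan.
starburst_map_sketch <- function(.x, .f, ..., backend = future::sequential) {
  # Switch plans for the duration of the call, then restore the caller's plan
  old_plan <- future::plan(backend)
  on.exit(future::plan(old_plan), add = TRUE)
  # Single execution path: the direct API only delegates to the Future ecosystem
  future.apply::future_lapply(.x, .f, ...)
}

squares <- starburst_map_sketch(1:5, function(i) i^2)
```

Restoring the previous plan on exit keeps the wrapper from silently changing how the rest of the user's session runs, which matters for users who never call plan() themselves.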

Phase 3: Testing & Documentation

  • Add tests for Future API compatibility
  • Test with furrr, future.apply, targets
  • Update all documentation to show Future-first examples
  • Add migration guide from v0.1.0 direct API
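
A compatibility test along these lines might look like the following sketch. It assumes testthat, future and future.apply are installed, and it uses plan(sequential) as a stand-in so it can run anywhere; an actual suite would switch to the proposed plan(future_starburst, ...) and would likely need to be skipped when AWS credentials are unavailable.

```r
library(testthat)
library(future)
library(future.apply)

# Stand-in plan; a real suite would use the proposed future_starburst plan here
plan(sequential)

test_that("future_lapply matches base lapply", {
  xs <- 1:10
  expect_equal(future_lapply(xs, sqrt), lapply(xs, sqrt))
})

test_that("worker errors surface when the value is collected", {
  f <- future(stop("boom"))
  expect_error(value(f), "boom")
})
```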

Benefits

  • Ecosystem compatibility: plan(future_starburst) becomes a drop-in replacement for future::plan(multisession)
  • Composability: Works with existing furrr code: future_map(), future_pmap(), etc.
  • Targets integration: Compatible with targets and other Future-aware packages
  • Single execution path: Easier to maintain and extend
  • Clean architecture: Future provides the abstraction, staRburst provides the AWS backend
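
The drop-in claim rests on the fact that Future-based code never names its backend. The sketch below shows this with built-in pieces only, assuming future and future.apply are installed; slow_square is a hypothetical placeholder workload, and plan(sequential) is used so the example runs anywhere. Under the proposal, only the plan() line would change.

```r
library(future)
library(future.apply)

# Hypothetical placeholder workload
slow_square <- function(x) {
  Sys.sleep(0.01)
  x^2
}

# Any plan works here, e.g. plan(multisession); the calling code stays the same
plan(sequential)

# High-level map-style call
res_apply <- future_sapply(1:4, slow_square)

# Lower-level calls that every backend has to support
f <- future(slow_square(3))
while (!resolved(f)) Sys.sleep(0.01)
res_value <- value(f)
```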

API Examples

Future API (Primary)

library(furrr)
plan(future_starburst, workers = 50, cpu = 4, memory = "8GB")

# Just works with furrr
results <- future_map(1:1000, expensive_function)

Direct API (Convenience wrapper)

# Still works, but internally uses Future backend
results <- starburst_map(1:1000, expensive_function, workers = 50)

Estimated Effort

Medium to large: roughly 1,000 lines of code over 3-4 weeks

  • Weeks 1-2: Core Future backend implementation
  • Week 3: Refactor direct API as wrapper
  • Week 4: Testing, documentation, refinement
