Releases: chaoticgoodcomputing/flowthru

v0.15.0

03 May 22:00

Release 0.15.0

We've added compile-time NewType generation, expanded schema validation diagnostics, and built a comprehensive meta-test framework to catch schema decay, missing conformance tests, and diagnostic drift — keeping your pipelines correct before they run.

What's New

  • NewType Source Generator for Columns: You can now use [FlowthruColumn("TypeName", typeof(BackingType))] to generate NewType record structs at compile-time, eliminating manual boilerplate. Two new diagnostics — FT1003 (invalid backing type) and FT1004 (conflicting backing types across uses) — catch schema mistakes immediately during the build, not at runtime.
  • FlowthruConfig Validation: New diagnostics FT3001 and FT3002 enforce that your configuration class is partial and that all IItem properties carry the [ConfigSection] attribute, catching common wiring mistakes early.
  • Format Extension Conformance Tests: Added conformance fixtures for CSV, Excel, and Parquet serializers to verify round-trip behavior on complex row shapes (PositionalRecord, OptionalEnum). These fixtures are automatically discovered and validated by the meta-test framework.
  • Meta-Test Framework for Diagnostics: A comprehensive suite of meta-tests under scripts/_test/ ensures your pipeline infrastructure stays correct:
    • conformance-presence: Verifies every feature declared in the capability matrix has a matching conformance test fixture, so claimed support is actually tested.
    • dead-schemas: Finds unused schema declarations, helping you keep test data clean as your fixtures evolve.
    • dead-fixtures: Identifies unused test fixture files to prevent bitrot and confusion.
    • diagnostic-id-registration: Validates that all diagnostic IDs (FT*) are properly registered in AnalyzerReleases.Unshipped.md, catching ID collisions and drift.
    • project-mirror: Ensures your project structure stays consistent across test projects.
  • Format Capability Matrix: We've added auto-generated documentation showing which row-shape features each format extension supports on top of the universal baseline (CLR primitives, BCL scalars, nullable types, enum mapping). See docs/reference/extensions/capability-matrix.md.
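These notes do not show what the generator emits, so the following is only a hand-written sketch of the kind of NewType record struct boilerplate that [FlowthruColumn] is described as eliminating. The names CustomerId, OrderId, and NewTypeDemo are hypothetical, and the real generated code may look different.

```csharp
using System;

// Hand-written NewType wrappers: two distinct types over the same backing type.
// Hypothetical names for illustration; not the generator's actual output.
public readonly record struct CustomerId(int Value)
{
    public override string ToString() => Value.ToString();
}

public readonly record struct OrderId(int Value)
{
    public override string ToString() => Value.ToString();
}

public static class NewTypeDemo
{
    // Accepts only CustomerId; passing an OrderId here is a compile-time error,
    // even though both wrap an int.
    public static string Describe(CustomerId id) => $"customer #{id}";

    // Record structs get value equality, so == compares the wrapped values.
    public static bool SameCustomer(CustomerId a, CustomerId b) => a == b;
}
```

For example, NewTypeDemo.Describe(new CustomerId(42)) returns "customer #42", while NewTypeDemo.Describe(new OrderId(42)) does not compile.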

Bug Fixes

  • Diagnostic Check Soft-Skip: The row-features inventory check now gracefully handles missing scratch documents in CI environments, since docs/scratch/ is gitignored. The check still fires locally where drift is most likely to be caught.
  • Coverage Upload Dependency: The coverage:upload task now depends on coverage:sync, fixing CI failures when the test step exits early before coverage flags are prepared.

🚀 Features

  • finish data compat testing layer and refactors (f2e3b2c7)
  • wrap up data feature matrix & optimize example runs (e7e6b9c3)

🩹 Fixes

  • phase 1 of diagnostic setup (1b27a9f7)
  • set up second phase on data extension surface (f9dc25a5)
  • add metatest compat matrix for storage extensions (f1a4a99f)
  • resolve scratch doc reference (c5c48456)

❤️ Thank You

  • Spencer Elkington

v0.14.0

01 May 15:59

Release 0.14.0

The last two releases introduced a complete service dependency injection system for steps and stronger controls over post-run metadata collection. Steps now inject external services directly into their factories, with full FUnit testing support and sidecar pre-flight inspection.

What's New

  • Service Dependency Injection for Steps: Steps now declare service dependencies via factory parameters (e.g., Create(IMyService svc)) — the runtime tracks these dependencies for metadata and pre-flight validation. FUnitContext exposes an IServiceCollection so test code can register stubs, and [FUnitStubContainerAttribute]-marked classes auto-register themselves on test construction. The metadata layer collapses identical service types into single nodes in Mermaid diagrams, so the DAG clearly shows which steps share dependencies.
    • SimpleEffectsExample Starter: A minimal effects-as-step example where four steps share a single time-service dependency. Demonstrates factory injection, AddFlowthruInspect<T> for pre-flight service probes, and the recommended [FUnitStubContainer] pattern for unit testing. See examples/starter/SimpleEffectsExample.
    • FUnit DI Enhancements: FUnitContext.GetRequiredService<T>() for ergonomic test-time resolution, automatic stub container discovery, and full service lifetime control.
  • Diagnostics Metadata Extension: A new opt-in metadata system lets you collect post-run summaries without materializing entire datasets. The [Diagnostics] extension adds RowCountProvider, OutputExistenceProvider, RunSummaryProvider, and StepTimingProvider — each checks IItem.HasEfficientCount before attempting to count rows, so providers skip items that would require full materialization. Wire it up via UseMetadata(options => options.Diagnostics(...)) and consume the resulting metadata in your result formatter. See Flowthru.Extensions.Metadata.Diagnostics.
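The factory-injection idea from the Service Dependency Injection item above can be sketched in plain C#. This sketch deliberately uses no Flowthru types: ITimeService, FixedTimeService, and StampStep are hypothetical names, and the way Flowthru itself registers and resolves services is not shown here.

```csharp
using System;
using System.Globalization;

// Hypothetical service contract that several steps could share.
public interface ITimeService
{
    DateTimeOffset Now { get; }
}

// Test stub: a clock frozen at a known instant.
public sealed class FixedTimeService : ITimeService
{
    public FixedTimeService(DateTimeOffset now) => Now = now;
    public DateTimeOffset Now { get; }
}

public static class StampStep
{
    // The factory declares its dependency as a parameter and returns the step body.
    // Anything that calls Create can see which service the step needs.
    public static Func<string, string> Create(ITimeService time)
        => input => input + "@" + time.Now.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture);
}
```

Because the dependency arrives through the factory, a test can pass a stub directly: StampStep.Create(new FixedTimeService(new DateTimeOffset(2024, 1, 2, 0, 0, 0, TimeSpan.Zero)))("row") returns "row@2024-01-02".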

Bug Fixes

  • EFCore Runtime Shape Validation: The EFCore extension now uses built-in EFCore utilities for runtime shape validation across all EF-supported frameworks, fixing compatibility issues and improving consistency between environments.
  • Extension Testing Coverage: Additional test kits for conformance testing across extension combinations, resolving miscellaneous edge cases in EFCore and CSV nullability handling.
  • Zero-Arity Steps: Fixed initialization of parameterless step factories, and ensured consistent patterns for file-based dataset loading across all storage adapters.

❤️ Thank You

  • Spencer Elkington

v0.12.2

29 Apr 18:41

This patch resolves two issues caught during our testing improvements:

  1. CSVs can now round-trip null values. By default, the extension treats an empty string as null (for example, ,, in the plaintext represents a null value), and an additional CSV Item Factory surface allows overrides (N/A, NA, etc.).
  2. Fixed an issue in 0.12.1's EFCore shape validation caused by provider-specific quirks in nullability reporting. EFCore+PGSQL should now report nullability accurately during pre-flight.
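The null-token convention in item 1 can be illustrated with a small standalone sketch. This is not the extension's implementation: CsvNullSketch and its methods are hypothetical, and the naive comma split ignores quoting and escaping, which a real CSV reader has to handle.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

public static class CsvNullSketch
{
    // Write: a null field is emitted as the null token (empty string by default).
    public static string WriteRow(IEnumerable<string?> fields, string nullToken = "")
        => string.Join(",", fields.Select(f => f ?? nullToken));

    // Read: a field equal to any configured null token becomes null again.
    // With no tokens supplied, only the empty string counts as null.
    public static string?[] ReadRow(string line, params string[] nullTokens)
    {
        var tokens = nullTokens.Length == 0 ? new[] { "" } : nullTokens;
        return line.Split(',')
                   .Select(f => tokens.Contains(f) ? (string?)null : f)
                   .ToArray();
    }
}
```

With the defaults, WriteRow(new[] { "a", null, "c" }) produces "a,,c", and ReadRow("a,,c") yields "a", null, "c". Passing overrides, as in ReadRow("a,N/A,c", "N/A", "NA"), maps the N/A field to null instead. In this sketch an empty string and a null are indistinguishable once written with the default token.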

🩹 Fixes

  • test fixtures & CSV nullability fix (80a6a4f6)
  • resolved nullability bug for PGSQL on EFCore shape validator (0cb460d9)

❤️ Thank You

  • Spencer Elkington

v0.12.1

29 Apr 09:22

This minor update extends the EFCore extension's pre-flight checks. In addition to checking whether a write target is accessible, the extension now verifies that the shape of the target table in the database matches the FlowthruSchema, so mismatched columns are caught during pre-flight.

🩹 Fixes

  • efcore forward look at query shapes (f531dede)

❤️ Thank You

  • Spencer Elkington

v0.12.0

28 Apr 22:44

Release 0.12.0

This release adds XML as a first-class data format for catalog entries and extends the internal coverage analytics pipeline.

What's New

  • XML Extension: You can now read and write XML data in your catalog using Flowthru.Extensions.Xml. XmlItemFactory.Single.Xml<T>() creates a catalog entry backed by a single XML file, and ItemFactory.Enumerable.XmlDocuments<T>() reads all *.xml files in a directory, yielding each as an XmlDocument<T> that carries the source file name. Files follow System.Xml.Serialization conventions — decorate your schema type with [XmlRoot], [XmlElement], and [XmlAttribute] as you normally would.
    • FlowthruCoverage Example: A new advanced example demonstrates reading a directory of Cobertura XML coverage reports, processing them through a multi-stage pipeline, and generating a Python-based coverage heatmap. See FlowthruCoverage.
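Since the XML extension is described as following System.Xml.Serialization conventions, a schema type can be exercised with the standard XmlSerializer on its own. The sketch below uses only System.Xml.Serialization and does not call any Flowthru API; CoverageReport, PackageEntry, and XmlSketch are hypothetical names, and the element layout is invented for illustration rather than taken from the Cobertura format.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Hypothetical schema type decorated with the standard serialization attributes.
[XmlRoot("report")]
public class CoverageReport
{
    [XmlAttribute("version")]
    public string Version { get; set; } = "";

    // [XmlElement] on a list emits repeated <package> elements with no wrapper.
    [XmlElement("package")]
    public List<PackageEntry> Packages { get; set; } = new();
}

public class PackageEntry
{
    [XmlAttribute("name")]
    public string Name { get; set; } = "";

    [XmlElement("lineRate")]
    public double LineRate { get; set; }
}

public static class XmlSketch
{
    // Serialize the report to an XML string.
    public static string Serialize(CoverageReport report)
    {
        var serializer = new XmlSerializer(typeof(CoverageReport));
        using var writer = new StringWriter();
        serializer.Serialize(writer, report);
        return writer.ToString();
    }

    // Parse an XML string back into the schema type.
    public static CoverageReport Deserialize(string xml)
    {
        var serializer = new XmlSerializer(typeof(CoverageReport));
        using var reader = new StringReader(xml);
        return (CoverageReport)serializer.Deserialize(reader)!;
    }
}
```

Round-tripping a value through XmlSketch.Serialize and XmlSketch.Deserialize is a quick way to check that the attributes on a schema type produce the document shape you expect before wiring the type into a catalog entry.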

Bug Fixes

  • Template Testing Stability: Template integration tests had intermittent failures under certain build conditions. These tests now run more reliably.
  • Test Output Caching: A cache key issue caused stale test outputs to be used in some scenarios. This is resolved.
  • FUnit Discovery in Examples: The examples project was incorrectly being scanned for FUnit tests, which produced spurious discovery results. Discovery is now scoped correctly.

🩹 Fixes

  • remove funit discovery from examples project (11b41d80)
  • code coverage for examples (27f468b4)
  • resolve test output cache issue (c8d6a965)
  • better coverage analytics (5858b558)
  • less flakey template testing (f94a8de7)
  • updated inputs for coverage (d0646912)
  • unignore missing coverage analysis flow (d9f2d7cb)

❤️ Thank You

  • Spencer Elkington

v0.11.0

24 Apr 21:49

Release 0.11.0

Before executing your flow, Flowthru now inspects all write destinations — checking that directories exist, tables are schema-compatible, database connections are valid, and permissions allow writes. Problems surface during pre-flight, not hours into a run.

What's New

  • Destination Validation in Pre-Flight: All catalog outputs are now inspected before your flow starts. InspectTarget() runs against each storage adapter — file adapters verify parent directories and write permissions; database adapters confirm table existence and schema compatibility. Read-only adapters skip the check trivially. If a destination is invalid, the flow stops before any step executes, not after wasting compute time reaching a write step. Validation can be disabled per-flow via ValidationOptions.SkipTargetInspection() if needed.

🚀 Features

  • destination-based inspection during pre-flight (2dfb6682)

❤️ Thank You

  • Spencer Elkington

v0.10.0

24 Apr 17:01

Release 0.10.0

What's New

  • EFCore Bulk Extension: You can now perform high-performance bulk insert, upsert, and truncate-then-insert operations directly from your Flowthru steps. The extension uses each database provider's native bulk-load paths (Npgsql binary COPY for PostgreSQL, BULK INSERT for SQL Server, etc.), eliminating the row-by-row overhead of standard EF Core inserts. When writing large datasets to databases, you can swap a standard saveFunc for BulkSave.Insert(), BulkSave.TruncateAndInsert(), or BulkSave.InsertOrUpdate() in your catalog items, with optional configuration for batch size, timeout, and ordering behavior.

Bug Fixes

  • Project File Path References: Relative paths in C# project files have been corrected across the codebase to ensure consistent build behavior.
  • Consistent Testing Patterns: Testing patterns have been standardized across the application, giving clear-cut standards to anyone who wants to write new Flowthru extensions and push them upstream.

🩹 Fixes

  • relative paths within CS project files (0883bb27)
  • resolve slnx issue (5dbae0d9)

❤️ Thank You

  • Spencer Elkington

v0.9.0

23 Apr 20:25

Release 0.9.0

When a Flowthru pipeline fails at runtime, you now get structured error reports with pre-populated GitHub issue URLs — making it vastly easier to diagnose problems and report bugs back to the team.

What's New

  • Runtime Exception Reporting: When a flow encounters a runtime error, you now get an automatic error report that classifies the failure (possible Flowthru bug vs. external factors like network or filesystem issues) and generates a pre-populated GitHub issue URL with full context: stack trace, environment details, flow name, and which step failed. This cuts debugging time and makes bug reports complete before you file them.
  • Shallow Inspection Performance: Significant performance improvements when Flowthru inspects catalog metadata across all storage adapters. JSON, EFCore, GraphQL, and Parquet serializers are all faster at inspecting schema and structure without loading full datasets. If you're using flows with many catalog items, you'll notice faster flow startup times.
    • Parquet IO Optimization: Parquet serialization now includes configurable options for fine-tuning behavior on different hardware and data scales.

Bug Fixes

  • Flow Configuration: Fixed configuration catalog generation across all example flows.
  • Spark Compatibility: Temporarily removed pending Databricks branch integration — will be restored once the integration is complete.
  • Example Configurations: Resolved configuration issues in KedroSpaceflightsCustom and template test dependencies.

🚀 Features

  • runtime exception reporting addition (6a4f30db)

🩹 Fixes

  • performance resolution for shallow inspection across extensions (78e4c3a1)
  • performance improvements on parquet IO (8aa35955)
  • resolve flow config issues (e20a9144)
  • temp remove spark compat pending databricks branch integration (65b1aee3)
  • resolve KedroSpaceflightsCustom config settings (a654c16e)
  • resolve brittle template test pack dependency (c0038c9d)

❤️ Thank You

  • Spencer Elkington

v0.8.0

23 Apr 16:51

This release brings configuration management into Flowthru's type-safe ecosystem, and adds HTTP data pulling with optional storage-based caching.

What's New

  • Configuration as Typed Catalog: Configuration sections are now bound as strongly-typed catalog items, so steps can depend on configuration the same way they depend on data. Use the [FlowthruConfig] attribute and [ConfigSection] properties to wire configuration sections (from appsettings.json or environment variables) directly into your step parameters. This eliminates string-based configuration lookups and moves configuration errors to compile-time.
    • Idiomatic .NET Integration: Flowthru now uses the standard Microsoft.Extensions.Configuration system rather than custom configuration plumbing. Extensions are configured through the standard DI container, and all examples have been migrated. See KedroSpaceflights.Custom for a complete example of configuration as catalog in action.
  • HTTP Provider with Storage Caching: You can now pull data over HTTP using the HttpStorageMediumProvider from the new HTTP extension. Optionally cache downloaded files locally to avoid repeated HTTP calls; specify a cache directory and maximum age, and cached files are automatically validated and refreshed. RetailDataSplitFlow demonstrates HTTP pulling with caching.
  • Enhanced Metadata Surface: The metadata provider API now has better visibility into flow exports and improved default provider behavior, making it easier to inspect and document your pipelines after execution.

🚀 Features

  • pull over HTTP, storage cache (8d430203)
  • better metadata surface and default provider behavior (d9df524f)
  • migrate to dotnet IConfiguration (d3c8e50d)
  • extensions now use idiomatic C# config system (77100c0c)
  • configuration as catalog (ed8e3cb4)

🩹 Fixes

  • resolve issue with GitHub pre-releases (95b15373)
  • resolve CI orphaned head issue (ec5c7237)
  • resolve issues with manual dispatch deployments hitting NX versioning clothesline. (07c1da41)

❤️ Thank You

  • Spencer Elkington

v0.6.2

20 Apr 19:13

🩹 Fixes

  • increase performance of shallow inspection on GQL+EFCore (11e848a5)
  • richer metadata on flow exports (3a8c7de4)

❤️ Thank You

  • Spencer Elkington