31 changes: 31 additions & 0 deletions TODOS.md
@@ -0,0 +1,31 @@
# TODOs

## CI/CD pipeline for integration packages

**What:** Add GitHub Actions workflows to build, test, and publish `strands-agents-moss` and `semantic-kernel-moss` to PyPI.

**Why:** Neither integration package has a CI/CD pipeline. Code without distribution is code nobody can use. Users currently have no way to `pip install` these packages from PyPI.

**Context:** Both packages use setuptools with `pyproject.toml`. Tests use pytest + pytest-asyncio. Linting uses ruff. A single reusable workflow could cover both packages since they share the same build system and test tooling. Consider matrix strategy for Python 3.10-3.13.

**Depends on:** Nothing. Can be done independently.
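A rough sketch of what the reusable workflow could look like (the file name, input name, and steps are assumptions, and the PyPI publish job is omitted; publishing could hang off a release tag, e.g. via `pypa/gh-action-pypi-publish`):

```yaml
# .github/workflows/package-ci.yml (hypothetical name)
name: package-ci
on:
  workflow_call:
    inputs:
      package-dir:
        required: true
        type: string
jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.10", "3.11", "3.12", "3.13"]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}
      - run: pip install -e "${{ inputs.package-dir }}[dev]"
      - run: ruff check "${{ inputs.package-dir }}"
      - run: pytest "${{ inputs.package-dir }}/tests"
```

Each package's workflow would then call this with its own directory.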

## .NET Semantic Kernel plugin design doc

**What:** Write a design document for a .NET version of the Moss Semantic Kernel plugin.

**Why:** The GitHub issue (#82) lists .NET as a stretch goal. Semantic Kernel has strong .NET adoption in enterprise shops. A design doc captures the approach (NuGet package, IKernelPlugin interface, C# async patterns) without committing to implementation.

**Context:** The Python plugin (`semantic-kernel-moss`) is the reference implementation. The .NET version would follow the same pattern: single `Search` kernel function, constructor-configured `MossClient`, pre-loaded index. Key decisions: whether to use the .NET Moss SDK (if it exists) or wrap the Python SDK, and how to handle the async index loading lifecycle in C#.

**Depends on:** Python plugin shipped and validated by users.

## Streaming/batched ingest for large tables

**What:** Change shared `ingest.py` to pass iterables directly or batch in chunks instead of `list(source)`.

**Why:** Current `list()` call loads entire dataset into memory. For 1M+ row tables, that's 1GB+ RAM.

**Context:** The connector `__iter__` already streams correctly. The bottleneck is the shared `ingest.py` doing eager collection. Fix would be ~5 lines but affects all connectors (SQLite, MongoDB, DynamoDB).

**Depends on:** Checking if `MossClient.create_index()` accepts iterables.
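The chunking half of that fix can be sketched with the stdlib alone; how each chunk then reaches Moss depends on the `create_index()` question above, so only the generator is shown:

```python
from collections.abc import Iterable, Iterator
from itertools import islice
from typing import TypeVar

T = TypeVar("T")


def batched(source: Iterable[T], size: int = 1000) -> Iterator[list[T]]:
    """Yield fixed-size chunks without materializing the whole source."""
    it = iter(source)
    # islice pulls at most `size` items per pass; the walrus stops on empty
    while chunk := list(islice(it, size)):
        yield chunk
```

Peak memory then scales with `size`, not with the table, as long as the downstream call accepts per-chunk writes.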
43 changes: 43 additions & 0 deletions packages/moss-data-connector/README.md
@@ -0,0 +1,43 @@
# moss-data-connector

Folder holding the database-connector packages. Each subfolder is its own pip-installable package.

## Layout

```
moss-data-connector/
├── _template/ # copy-me starting point for a new connector
├── moss-connector-sqlite/ # SQLite source (stdlib, no driver)
├── moss-connector-mongodb/ # MongoDB source (requires pymongo)
└── moss-connector-dynamodb/ # DynamoDB source (requires boto3)
```


## Caller shape

```python
from moss import DocumentInfo
from moss_connector_sqlite import SQLiteConnector, ingest

source = SQLiteConnector(
    database="my.db",
    query="SELECT id, title, body FROM articles",
    mapper=lambda r: DocumentInfo(id=str(r["id"]), text=r["body"], metadata={"title": r["title"]}),
)

await ingest(source, project_id="...", project_key="...", index_name="articles")
```


## Available connectors

| Package | Source | Extra driver |
| ---------------------------------------------------- | -------- | ------------ |
| [`moss-connector-sqlite`](moss-connector-sqlite) | SQLite | — |
| [`moss-connector-mongodb`](moss-connector-mongodb) | MongoDB | `pymongo` |
| [`moss-connector-dynamodb`](moss-connector-dynamodb) | DynamoDB | `boto3` |

## Adding a new connector

See [`_template/README.md`](_template/README.md).

10 changes: 10 additions & 0 deletions packages/moss-data-connector/_template/.gitignore
@@ -0,0 +1,10 @@
build/
dist/
*.egg-info/
__pycache__/
*.py[cod]
.venv/
.pytest_cache/
.ruff_cache/
.mypy_cache/
.env
49 changes: 49 additions & 0 deletions packages/moss-data-connector/_template/README.md
@@ -0,0 +1,49 @@
# moss-connector-template

Starting point for a new connector. Not a real package; don't install it.

## To create a new connector

```bash
cd packages/moss-data-connector
cp -r _template moss-connector-<source>
cd moss-connector-<source>
```

Then:

1. Open `pyproject.toml` and replace every `TODO` (name, description, keywords, Source URL, driver deps). The package name is `moss-connector-<source>`, the Python module is `moss_connector_<source>`.
2. Open `src/connector.py` and:
- Rename `TemplateConnector` → `<Source>Connector`.
- Add your source-specific config to `__init__`.
- Implement `__iter__` (connect, pull rows, `yield self.mapper(row)`).
3. Update `src/__init__.py` to re-export your renamed class.
4. Rename `tests/test_template.py` → `tests/test_<source>.py` and fill in.
5. Add a live integration test in `tests/test_integration_<source>_moss.py` if you can (see sqlite/mongodb for patterns).
6. Update this package's README with install + usage snippets (see `moss-connector-sqlite/README.md` for shape).
7. Add a row to `packages/moss-data-connector/README.md`.
8. Open a PR.

## Rules

- **One source per package.** Don't combine.
- **Declare your driver as a main dependency** in `pyproject.toml` and import it at the top of the module.
- **No retries or rate-limit logic in `ingest.py`.** If a connector needs it, put it in the connector's own code.

## Caller shape (what users write against your connector)

```python
from moss import DocumentInfo
from moss_connector_<source> import <Source>Connector, ingest

source = <Source>Connector(
    # your config here
    mapper=lambda r: DocumentInfo(
        id=str(r["id"]),
        text=r["body"],
        metadata={"title": r["title"]},
    ),
)

await ingest(source, project_id="...", project_key="...", index_name="articles")
```
60 changes: 60 additions & 0 deletions packages/moss-data-connector/_template/pyproject.toml
@@ -0,0 +1,60 @@
[project]
# TODO: rename to "moss-connector-<source>"
name = "moss-connector-template"
version = "0.0.1"
description = "TODO: short description of the source this connector reads from."
readme = "README.md"
requires-python = ">=3.10,<3.15"
license = { text = "BSD-2-Clause" }
authors = [{ name = "InferEdge Inc.", email = "contact@moss.dev" }]
# TODO: update keywords
keywords = ["moss", "connectors", "ingest"]
classifiers = [
    "Development Status :: 3 - Alpha",
    "Intended Audience :: Developers",
    "License :: OSI Approved :: BSD License",
    "Programming Language :: Python :: 3",
    "Programming Language :: Python :: 3.10",
    "Programming Language :: Python :: 3.11",
    "Programming Language :: Python :: 3.12",
    "Programming Language :: Python :: 3.13",
    "Topic :: Database",
]
dependencies = [
    "moss>=1.0.0",
    # TODO: add your source's driver, e.g. "psycopg[binary]>=3.1"
]

[project.optional-dependencies]
dev = [
    "pytest>=8.0.0",
    "pytest-asyncio>=0.23.0",
    "python-dotenv>=1.0.0",
    "ruff>=0.5.0",
]

[project.urls]
Homepage = "https://github.com/usemoss/moss"
Repository = "https://github.com/usemoss/moss"
# TODO: update the Source path
Source = "https://github.com/usemoss/moss/tree/main/packages/moss-data-connector/moss-connector-template"

[build-system]
requires = ["setuptools>=61.0"]
build-backend = "setuptools.build_meta"

# Flat layout: src/ itself IS the package.
# TODO: rename to "moss_connector_<source>" to match your package name.
[tool.setuptools]
packages = ["moss_connector_template"]
package-dir = { "moss_connector_template" = "src" }

[tool.ruff]
line-length = 100
target-version = "py310"

[tool.ruff.lint]
select = ["E", "W", "F", "I", "B", "UP"]

[tool.pytest.ini_options]
asyncio_mode = "auto"
10 changes: 10 additions & 0 deletions packages/moss-data-connector/_template/src/__init__.py
@@ -0,0 +1,10 @@
"""Template connector package.

Copy this directory to `packages/moss-data-connector/moss-connector-<source>/`,
then rename `TemplateConnector` in `connector.py` to `<Source>Connector`.
"""

from .connector import TemplateConnector
from .ingest import ingest

__all__ = ["TemplateConnector", "ingest"]
33 changes: 33 additions & 0 deletions packages/moss-data-connector/_template/src/connector.py
@@ -0,0 +1,33 @@
"""Connector class goes here. Rename both the file's class and the module's
host package (`moss_connector_template` → `moss_connector_<source>`).
"""

from __future__ import annotations

from typing import Any, Callable, Iterator

from moss import DocumentInfo


class TemplateConnector:
    """Yield one `DocumentInfo` per row from your source.

    `mapper` turns one row dict into a `DocumentInfo`; the caller decides
    which keys become id / text / metadata / embedding.
    """

    def __init__(
        self,
        # TODO: add your source-specific config here (connection string, query, etc.)
        mapper: Callable[[dict[str, Any]], DocumentInfo],
    ) -> None:
        self.mapper = mapper

    def __iter__(self) -> Iterator[DocumentInfo]:
        # TODO: connect to your source, pull rows, and for each one:
        #   yield self.mapper(row_as_dict)
        # Don't pre-filter columns; the caller's mapper decides what to use.
        # Import your driver *inside* this method, not at module top, so
        # importing the package never fails just because the driver isn't
        # installed.
Comment on lines +30 to +32
🟡 Template connector comment contradicts template README import rule

The template connector.py comment on lines 30-32 says to import the driver inside __iter__, but the template README.md at packages/moss-data-connector/_template/README.md:30 explicitly states the opposite rule: "Declare your driver as a main dependency in pyproject.toml and import it at the top of the module." All three actual connectors (SQLite, MongoDB, DynamoDB) follow the README rule and import at the top. A new contributor copying this template would see contradictory advice, and if they follow the code comment, their connector would diverge from the established pattern and the authoritative rules.

Suggested change

    - # Import your driver *inside* this method, not at module top, so
    - # importing the package never fails just because the driver isn't
    - # installed.
    + # Import your driver at the top of the module (see README rules).
    + # Declare it as a main dependency in pyproject.toml so pip installs it.
    + #

        raise NotImplementedError
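Filled in for a concrete source, the template's shape might look like the following sqlite3-based sketch. The `DocumentInfo` stub here is only a stand-in for the real `moss.DocumentInfo` (its exact fields are an assumption), so the example stays self-contained:

```python
import sqlite3
from dataclasses import dataclass, field
from typing import Any, Callable, Iterator


@dataclass
class DocumentInfo:
    """Stand-in for moss.DocumentInfo; the real class's fields may differ."""
    id: str
    text: str
    metadata: dict[str, Any] = field(default_factory=dict)


class SQLiteConnector:
    """The template filled in for a stdlib sqlite3 source."""

    def __init__(
        self,
        database: str,
        query: str,
        mapper: Callable[[dict[str, Any]], DocumentInfo],
    ) -> None:
        self.database = database
        self.query = query
        self.mapper = mapper

    def __iter__(self) -> Iterator[DocumentInfo]:
        conn = sqlite3.connect(self.database)
        conn.row_factory = sqlite3.Row  # rows expose keys like a dict
        try:
            for row in conn.execute(self.query):
                # hand the full row to the caller's mapper, no pre-filtering
                yield self.mapper(dict(row))
        finally:
            conn.close()
```

sqlite3 is stdlib, so importing it at module top follows the README's driver rule for free.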
22 changes: 22 additions & 0 deletions packages/moss-data-connector/_template/src/ingest.py
@@ -0,0 +1,22 @@
"""Copy rows into a Moss index."""

from __future__ import annotations

from collections.abc import Iterable

from moss import DocumentInfo, MossClient, MutationResult


async def ingest(
    source: Iterable[DocumentInfo],
    project_id: str,
    project_key: str,
    index_name: str,
    model_id: str | None = None,
) -> MutationResult | None:
    """Copy every `DocumentInfo` from `source` into a fresh Moss index."""
    docs = list(source)
    if not docs:
        return None
    client = MossClient(project_id, project_key)
    return await client.create_index(index_name, docs, model_id=model_id)
35 changes: 35 additions & 0 deletions packages/moss-data-connector/_template/tests/test_template.py
@@ -0,0 +1,35 @@
"""Template unit test. Rename to test_<source>.py and adapt."""

from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any
from unittest.mock import patch

import pytest # noqa: F401

from moss import DocumentInfo # noqa: F401

# TODO: update these imports to match your renamed package.
# from moss_connector_<source> import <Source>Connector, ingest


@dataclass
class FakeMossClient:
    """Records create_index calls without hitting the network."""

    calls: list[dict[str, Any]] = field(default_factory=list)

    async def create_index(self, name, docs, model_id=None):
        self.calls.append({"name": name, "docs": list(docs), "model_id": model_id})


# Example test, adapt to your source. See moss-connector-sqlite/tests/test_sqlite.py
# for a worked example that uses a real stdlib driver + fake MossClient.
#
# async def test_<source>_ingest():
#     fake_moss = FakeMossClient()
#     with patch("moss_connector_<source>.ingest.MossClient", return_value=fake_moss):
#         source = <Source>Connector(..., mapper=lambda r: DocumentInfo(...))
#         await ingest(source, "fake_id", "fake_key", "idx")
#     assert fake_moss.calls[0]["name"] == "idx"
10 changes: 10 additions & 0 deletions packages/moss-data-connector/moss-connector-dynamodb/.gitignore
@@ -0,0 +1,10 @@
build/
dist/
*.egg-info/
__pycache__/
*.py[cod]
.venv/
.pytest_cache/
.ruff_cache/
.mypy_cache/
.env
80 changes: 80 additions & 0 deletions packages/moss-data-connector/moss-connector-dynamodb/README.md
@@ -0,0 +1,80 @@
# moss-connector-dynamodb

DynamoDB source connector for Moss. Scans an entire table (with optional filters) and ingests items into a Moss search index.

## Install

```bash
pip install moss-connector-dynamodb
```

Pulls `boto3` as a dependency. Uses the standard boto3 credential chain (env vars, shared credentials file, IAM role, etc.).

## Usage

```python
import asyncio
from moss import DocumentInfo
from moss_connector_dynamodb import DynamoDBConnector, ingest

async def main():
    source = DynamoDBConnector(
        table_name="articles",
        mapper=lambda item: DocumentInfo(
            id=str(item["id"]),
            text=item["body"],
            metadata={"title": item["title"]},
        ),
        region_name="us-east-1",
        scan_kwargs={  # optional
            "FilterExpression": "#s = :val",
            "ExpressionAttributeNames": {"#s": "status"},
            "ExpressionAttributeValues": {":val": "published"},
        },
    )

    result = await ingest(
        source,
        project_id="your_project_id",
        project_key="your_project_key",
        index_name="articles",
    )
    print(f"copied {result.doc_count} items")

asyncio.run(main())
```

DynamoDB items come back as dicts with Python types (Decimal for numbers, etc.). Handle type coercion in your mapper.
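One way to do that coercion, as a sketch: the rules below (whole-number Decimals become `int`, everything else `float`) are an assumption about what your index expects, so adjust them to your schema:

```python
from decimal import Decimal
from typing import Any


def coerce(value: Any) -> Any:
    """Flatten DynamoDB Decimal values into plain int/float, recursively."""
    if isinstance(value, Decimal):
        # to_integral_value() detects whole numbers without string round-trips
        return int(value) if value == value.to_integral_value() else float(value)
    if isinstance(value, dict):
        return {k: coerce(v) for k, v in value.items()}
    if isinstance(value, list):
        return [coerce(v) for v in value]
    return value
```

A mapper can then call `coerce(item)` once, before building the `DocumentInfo`.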

For large tables, `ingest()` loads all items into memory before indexing. Consider batching for tables with 100K+ rows.

### Local development

Pass `endpoint_url` to target DynamoDB Local or localstack:

```python
source = DynamoDBConnector(
    table_name="articles",
    mapper=my_mapper,
    endpoint_url="http://localhost:8000",
)
```

## Layout

```
src/
├── __init__.py # re-exports DynamoDBConnector and ingest
├── connector.py # DynamoDBConnector class
└── ingest.py # ingest() - keep in sync with the other connector packages
```

## Tests

```bash
pip install -e ".[dev]"
pytest tests/test_dynamodb.py -v # mocked Moss + mocked boto3
pytest tests/test_integration_dynamodb_moss.py -v -s # live Moss + real DynamoDB
```

The live integration test requires `DYNAMODB_TABLE`, `AWS_REGION`, `MOSS_PROJECT_ID`, and `MOSS_PROJECT_KEY` env vars. Optionally set `DYNAMODB_ENDPOINT_URL` for DynamoDB Local.