Cookbook with Firecrawl #200 #206

Open
AnandKrishnamoorthy1 wants to merge 6 commits into usemoss:main from AnandKrishnamoorthy1:feat/firecrawl-cookbook

AnandKrishnamoorthy1 commented May 4, 2026

Pull Request Checklist

Please ensure that your PR meets the following requirements:

  • I have read the CONTRIBUTING guide.
  • I have updated the documentation (if applicable).
  • My code follows the style guidelines of this project.
  • I have performed a self-review of my own code.
  • I have added tests that prove my fix is effective or that my feature works.
  • New and existing unit tests pass locally with my changes.

Description

Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context.

  • Added Firecrawl + Moss cookbook example: Complete Jupyter notebook demonstrating URL crawling (Firecrawl) → markdown normalization → Moss semantic indexing & querying, with a prepare-once/query-many workflow for efficiency.
  • Motivation: Users need a clear, runnable example of integrating Firecrawl's web scraping with Moss's semantic search; this cookbook provides end-to-end setup, helpers, and sample queries on real documentation sites.
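
The prepare-once/query-many flow described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the notebook's code: `CrawledPage`, `pages_to_docs`, and `ToyIndex` are hypothetical names, and the keyword-overlap index is only a stand-in for Moss's semantic index so that the example runs without Firecrawl or Moss credentials. The document shape (`id`, `text`, `metadata`) follows the fragment quoted in the review comments further down.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CrawledPage:
    """Hypothetical stand-in for one page returned by a crawler."""
    url: str
    markdown: str
    title: Optional[str] = None


def pages_to_docs(pages):
    """Convert crawled pages into index-ready dicts, skipping empty pages."""
    docs = []
    for index, page in enumerate(pages):
        text = (page.markdown or "").strip()
        if not text:
            continue  # nothing to index for this page
        docs.append({
            "id": f"firecrawl-{index}",
            "text": text,
            "metadata": {"source_url": page.url, "title": page.title or ""},
        })
    return docs


class ToyIndex:
    """In-memory keyword index; an assumed placeholder for a semantic index."""

    def __init__(self, docs):
        self.docs = docs
        # Pre-tokenize once so repeated queries stay cheap.
        self.tokens = {d["id"]: set(d["text"].lower().split()) for d in docs}

    def query(self, question, top_k=3):
        terms = set(question.lower().split())
        scored = [(len(terms & self.tokens[d["id"]]), d) for d in self.docs]
        ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
        return [doc for score, doc in ranked if score > 0][:top_k]


def prepare_knowledge_base(pages):
    """Expensive step: normalize pages and build the index exactly once."""
    docs = pages_to_docs(pages)
    if not docs:
        raise RuntimeError("No markdown content to index.")
    return ToyIndex(docs)


if __name__ == "__main__":
    pages = [
        CrawledPage("https://example.com/a", "# Install\nRun the installer.", "Install"),
        CrawledPage("https://example.com/b", "", None),
        CrawledPage("https://example.com/c", "# Query\nSend a query to the index.", "Query"),
    ]
    index = prepare_knowledge_base(pages)      # prepare once
    for question in ["run the installer", "send a query"]:
        hits = index.query(question)           # query many
        print(question, "->", [h["metadata"]["source_url"] for h in hits])
```

Separating preparation from querying means the crawl and indexing cost is paid a single time, while each later question only touches the already-built index.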

Fixes #200

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

Copilot AI review requested due to automatic review settings May 4, 2026 03:08

CLAassistant commented May 4, 2026

CLA assistant check
All committers have signed the CLA.

devin-ai-integration[bot]

This comment was marked as resolved.

Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Copilot AI left a comment

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 8 comments.


"metadata": {},
"outputs": [],
"source": [
"#pip install firecrawl-py moss python-dotenv"
Comment on lines +109 to +111
" id=f\"firecrawl-{index}\",\n",
" text=page.markdown,\n",
" metadata={\"source_url\": page.url, \"title\": page.title or \"\"},\n",
Comment on lines +155 to +165
"async def prepare_knowledge_base(urls: list[str], limit: int = 10) -> tuple[MossClient, str]:\n",
" validate_configuration(urls)\n",
" crawled_pages = crawl_urls(urls, limit=limit)\n",
" documents = crawled_pages_to_moss_docs(crawled_pages)\n",
"\n",
" if not documents:\n",
" raise RuntimeError(\"Firecrawl returned no markdown content to index.\")\n",
"\n",
" index_name = f\"firecrawl-cookbook-{uuid.uuid4().hex[:8]}\"\n",
" client = MossClient(MOSS_PROJECT_ID, MOSS_PROJECT_KEY)\n",
"\n",
Comment on lines +53 to +55
├──> Markdown Normalization
│ (clean text, remove chrome)
Comment on lines +11 to +13
# Optional: default index name used by the notebook
MOSS_INDEX_NAME=firecrawl-demo

Comment thread examples/cookbook/firecrawl/.env.example Outdated
Comment on lines +289 to +291
"display_name": "Python [conda env:base] *",
"language": "python",
"name": "conda-base-py"
" print(f\" {item.text[:200].strip()}\")\n",
"\n",
"\n",
"# Build knowledgebase and query it in one step\n",
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
devin-ai-integration[bot]

This comment was marked as resolved.

@yatharthk2
Collaborator

Hi @AnandKrishnamoorthy1, thank you for working on this PR. Were you able to resolve the AI comments?

AnandKrishnamoorthy1 commented May 9, 2026

> Hi @AnandKrishnamoorthy1, thank you for working on this PR. Were you able to resolve the AI comments?

@yatharthk2 Yes, I have. Let me know if you have any other comments; otherwise, you can merge and close this PR.

Development

Successfully merging this pull request may close these issues.

[Feature]: Cookbook with Firecrawl

4 participants