# MCP Server Tools Testing Guide

This guide explains how to test the Model Context Protocol (MCP) server tools in the ipfs_datasets_py library.

## Overview of MCP Tools

The ipfs_datasets_py library exposes its functionality through a Model Context Protocol (MCP) server, which makes the library's features accessible through a standard protocol interface.

Based on our analysis, the MCP server includes the following tool categories:

1. **dataset_tools (4 tools)**
   - `load_dataset`
   - `save_dataset`
   - `process_dataset`
   - `convert_dataset_format`

2. **ipfs_tools (2 tools)**
   - `get_from_ipfs`
   - `pin_to_ipfs`

3. **vector_tools (2 tools)**
   - `create_vector_index`
   - `search_vector_index`

4. **graph_tools (1 tool)**
   - `query_knowledge_graph`

5. **audit_tools (2 tools)**
   - `record_audit_event`
   - `generate_audit_report`

6. **security_tools (1 tool)**
   - `check_access_permission`

7. **provenance_tools (1 tool)**
   - `record_provenance`

8. **web_archive_tools (6 tools)**
   - `create_warc`
   - `index_warc`
   - `extract_dataset_from_cdxj`
   - `extract_text_from_warc`
   - `extract_links_from_warc`
   - `extract_metadata_from_warc`

9. **cli (1 tool)**
   - `execute_command`

10. **functions (1 tool)**
    - `execute_python_snippet`

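For scripted checks, the categories listed above can be written down as a plain mapping. This is only a sketch: `EXPECTED_TOOLS` and `count_tools` are hypothetical names chosen for illustration, not something the library is known to define, and the contents are copied from the list above.

```python
# Hypothetical registry mirroring the tool list in this guide.
# The library itself is not known to expose a structure like this.
EXPECTED_TOOLS = {
    "dataset_tools": [
        "load_dataset", "save_dataset",
        "process_dataset", "convert_dataset_format",
    ],
    "ipfs_tools": ["get_from_ipfs", "pin_to_ipfs"],
    "vector_tools": ["create_vector_index", "search_vector_index"],
    "graph_tools": ["query_knowledge_graph"],
    "audit_tools": ["record_audit_event", "generate_audit_report"],
    "security_tools": ["check_access_permission"],
    "provenance_tools": ["record_provenance"],
    "web_archive_tools": [
        "create_warc", "index_warc", "extract_dataset_from_cdxj",
        "extract_text_from_warc", "extract_links_from_warc",
        "extract_metadata_from_warc",
    ],
    "cli": ["execute_command"],
    "functions": ["execute_python_snippet"],
}


def count_tools(registry):
    """Return the total number of tools across all categories."""
    return sum(len(tools) for tools in registry.values())


if __name__ == "__main__":
    for category, tools in EXPECTED_TOOLS.items():
        print(f"{category}: {len(tools)} tool(s)")
    print(f"total: {count_tools(EXPECTED_TOOLS)}")
```

Keeping the list in one place like this makes it easier to notice when the documented counts and the actual tool set drift apart.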
## Testing Approaches

There are several ways to test the MCP tools:

### 1. Using the MCP Server Test Script

The existing `test_mcp_server.py` file in the MCP server directory can be used to test the server and its tools. This script starts the MCP server and tests the tools through the Model Context Protocol.

```bash
python ipfs_datasets_py/mcp_server/test_mcp_server.py
```

### 2. Testing MCP Tool Coverage

The `test_mcp_api_coverage.py` script checks whether all expected library features are exposed as MCP tools.

```bash
python test_mcp_api_coverage.py
```

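The internals of `test_mcp_api_coverage.py` are not shown in this guide. As a rough sketch of the idea only, a coverage check boils down to comparing the tool names you expect with the names that are actually available. The function and variable names below are hypothetical, and the `available` set is hard-coded; in practice it would come from wherever the server registers its tools.

```python
def find_missing_tools(expected, available):
    """Return {category: [missing tool names]} for categories with gaps."""
    missing = {}
    for category, tools in expected.items():
        gaps = [name for name in tools if name not in available]
        if gaps:
            missing[category] = gaps
    return missing


# Two categories taken from the overview list in this guide
expected = {
    "ipfs_tools": ["get_from_ipfs", "pin_to_ipfs"],
    "vector_tools": ["create_vector_index", "search_vector_index"],
}

# Hard-coded stand-in for the set of tools the server actually exposes
available = {"get_from_ipfs", "pin_to_ipfs", "create_vector_index"}

if __name__ == "__main__":
    print(find_missing_tools(expected, available))
    # prints {'vector_tools': ['search_vector_index']}
```

An empty result means every expected tool was found.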
### 3. Direct Tool Testing

You can test individual tools directly by importing them and calling their functions. Note that many of these functions are asynchronous and must be run in an async context.

Example for testing a dataset tool:

```python
import asyncio
from ipfs_datasets_py.mcp_server.tools.dataset_tools import load_dataset

async def test_load_dataset():
    result = await load_dataset(source="path/to/dataset.json", format="json")
    print(result)

# asyncio.run creates an event loop and runs the coroutine to completion
asyncio.run(test_load_dataset())
```

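Because some tools are asynchronous and others may not be, a small helper can hide the difference when you call tools ad hoc. The sketch below uses only the standard library; `run_tool` and the two fake tools are hypothetical names for illustration and stand in for real tools from the library.

```python
import asyncio
import inspect


def run_tool(tool, *args, **kwargs):
    """Call a tool and return its result, whether it is sync or async."""
    result = tool(*args, **kwargs)
    if inspect.iscoroutine(result):
        # An async tool returns a coroutine; run it to completion.
        # asyncio.run cannot be used from inside a running event loop.
        return asyncio.run(result)
    return result


# Hypothetical stand-ins for real tools
async def fake_async_tool(source):
    return {"status": "success", "source": source}


def fake_sync_tool(source):
    return {"status": "success", "source": source}


if __name__ == "__main__":
    print(run_tool(fake_async_tool, source="a.json"))
    print(run_tool(fake_sync_tool, source="b.json"))
```

Inside code that is already async, `await` the tool directly instead of going through a helper like this.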
### 4. Mock-Based Unit Testing

For proper unit testing, use mocks to avoid dependencies on external services such as IPFS. Here's an example approach:

```python
import unittest
from unittest.mock import patch, MagicMock

# IsolatedAsyncioTestCase (Python 3.8+) awaits `async def` test methods.
# On a plain unittest.TestCase the coroutine is never awaited, so the
# assertions would not run.
class DatasetToolsTest(unittest.IsolatedAsyncioTestCase):
    @patch('ipfs_datasets_py.mcp_server.tools.dataset_tools.load_dataset.datasets')
    async def test_load_dataset(self, mock_datasets):
        from ipfs_datasets_py.mcp_server.tools.dataset_tools import load_dataset

        # Set up mock
        mock_dataset = MagicMock()
        mock_datasets.load_dataset.return_value = mock_dataset

        # Call function
        result = await load_dataset("test_dataset", format="json")

        # Assertions
        self.assertEqual(result["status"], "success")
        mock_datasets.load_dataset.assert_called_once_with("test_dataset", format="json")

if __name__ == "__main__":
    unittest.main()
```

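The example above needs the library installed in order to run. As a self-contained illustration of the same pattern, the sketch below tests a hypothetical async tool, `pin_content`, against a mocked client. Neither `pin_content` nor the client's `pin` method is part of ipfs_datasets_py; they are stand-ins for illustration, and only `unittest.IsolatedAsyncioTestCase` and `unittest.mock.AsyncMock` come from the standard library.

```python
import unittest
from unittest.mock import AsyncMock


async def pin_content(client, cid):
    """Hypothetical stand-in tool: pins a CID through an async client."""
    try:
        await client.pin(cid)
    except ConnectionError as exc:
        return {"status": "error", "message": str(exc)}
    return {"status": "success", "cid": cid}


class PinContentTest(unittest.IsolatedAsyncioTestCase):
    async def test_success(self):
        # AsyncMock makes client.pin awaitable without a real service
        client = AsyncMock()
        result = await pin_content(client, "example-cid")
        self.assertEqual(result["status"], "success")
        client.pin.assert_awaited_once_with("example-cid")

    async def test_connection_error(self):
        client = AsyncMock()
        client.pin.side_effect = ConnectionError("daemon not reachable")
        result = await pin_content(client, "example-cid")
        self.assertEqual(result["status"], "error")
        self.assertIn("daemon not reachable", result["message"])


def run_suite():
    """Load and run the test case, returning the unittest result object."""
    suite = unittest.TestLoader().loadTestsFromTestCase(PinContentTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_suite()
```

Testing the failure path alongside the success path is a deliberate choice here: with external services mocked out, error handling is cheap to cover.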
## Implementing New Tests

To implement tests for currently untested tools:

1. **Identify untested tools**: Run the API coverage test to see which tools need testing.
2. **Create test files**: Create a test file for each tool category (e.g., `test_web_archive_tools.py`).
3. **Implement unit tests**: Write tests that mock external dependencies and verify each tool's functionality.
4. **Run tests**: Execute the tests to ensure they pass.

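The first two steps above can be sketched as a small script that reports which category test files do not exist yet. It assumes the `test_<category>.py` naming suggested by the `test_web_archive_tools.py` example; `missing_test_files` is a hypothetical helper, and the directory is a temporary one created for the demonstration.

```python
import tempfile
from pathlib import Path


def missing_test_files(categories, tests_dir):
    """Return the expected test file names that do not exist in tests_dir."""
    tests_dir = Path(tests_dir)
    expected = [f"test_{category}.py" for category in categories]
    return [name for name in expected if not (tests_dir / name).exists()]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Pretend one category already has a test file
        (Path(tmp) / "test_dataset_tools.py").touch()
        print(missing_test_files(["dataset_tools", "web_archive_tools"], tmp))
        # prints ['test_web_archive_tools.py']
```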
## Testing Web Archive Tools Example

Here's a detailed example for testing web archive tools:

```python
import shutil
import tempfile
import unittest
from pathlib import Path
from unittest.mock import patch, MagicMock

# IsolatedAsyncioTestCase awaits the async test methods; a plain TestCase would not.
class WebArchiveToolsTest(unittest.IsolatedAsyncioTestCase):
    def setUp(self):
        # Use a unique temporary directory rather than a fixed /tmp path
        self.test_dir = Path(tempfile.mkdtemp(prefix="web_archive_test_"))
        self.warc_path = self.test_dir / "test.warc"
        self.cdxj_path = self.test_dir / "test.cdxj"

    def tearDown(self):
        shutil.rmtree(self.test_dir, ignore_errors=True)

    async def test_create_warc(self):
        # If the tool module binds WebArchiveProcessor at import time
        # (from ... import WebArchiveProcessor), patch that name in the
        # tool's own module instead.
        with patch('ipfs_datasets_py.web_archive_utils.WebArchiveProcessor') as mock_class:
            # Set up mock
            mock_processor = MagicMock()
            mock_class.return_value = mock_processor
            mock_processor.create_warc.return_value = str(self.warc_path)

            # Import tool (do this inside the test to keep patch context)
            from ipfs_datasets_py.mcp_server.tools.web_archive_tools import create_warc

            # Call function
            result = await create_warc(
                url="https://example.com",
                output_path=str(self.warc_path)
            )

            # Assertions
            self.assertEqual(result["status"], "success")
            self.assertEqual(result["warc_path"], str(self.warc_path))
            mock_processor.create_warc.assert_called_once()

def run_tests():
    # The standard runner is enough: the test case drives the event loop itself
    suite = unittest.TestLoader().loadTestsFromTestCase(WebArchiveToolsTest)
    result = unittest.TextTestRunner(verbosity=2).run(suite)

    print(f"Ran {result.testsRun} tests with {len(result.errors)} errors and {len(result.failures)} failures")

if __name__ == "__main__":
    run_tests()
```

## Conclusion

Thorough testing of the MCP tools helps confirm that library features are properly exposed through the Model Context Protocol. By following the approaches in this guide, you can verify that the MCP server correctly implements the interface to the ipfs_datasets_py library functionality.