Bump version to v0.2.0 (#60)

* Refactor RemoteTool to support OpenAPI style spec
* Refactor base tool, remote tool and tool server to support OpenAPI spec.
* Update all tools to new style annotation
* Fix wrappers and server
* Update requirements
* Remove mmengine and opencv requirements
* Add missing file
* Update requirements and langchain
* Fix remote tool to support schema.AllOf
* Update code style to increase line width
* Support general file type inputs and outputs
* Update `RemoteTool` and add docstring.
* Add Gradio interface
* Fix bugs
* Support Lagent 0.2.0 and update gradio web ui
* Fix internlm2 agent in webui
* Make function tools usable in the server.
* Move gradio webui to `webui` folder.
* Add Chinese WebUI Readme
* Update docs
* Update search tool
* Bump version to v0.2.0
InternLM · Feb 2, 2024 · 83b1552
1 parent 96fbb97
Showing 154 changed files with 6,770 additions and 2,590 deletions.
diff --git a/.dev_scripts/generate_readme.py b/.dev_scripts/generate_readme.py
@@ -59,8 +59,7 @@
 
 def parse_args():
     parser = argparse.ArgumentParser(description=prog_description)
-    parser.add_argument(
-        'tools', type=str, nargs='+', help='The tool class to generate.')
+    parser.add_argument('tools', type=str, nargs='+', help='The tool class to generate.')
     args = parser.parse_args()
     return args
 

diff --git a/.gitignore b/.gitignore
@@ -119,3 +119,10 @@ venv.bak/
 
 # generated image, audio or video files.
 generated/
+
+# gradio demo
+webui/logs/
+webui/custom_tools/
+webui/installer_files/
+webui/generated/
+webui/*.yml
diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
@@ -4,6 +4,7 @@ repos:
     rev: 6.1.0
     hooks:
       - id: flake8
+        args: ['--exclude', 'webui/*']
   - repo: https://github.com/PyCQA/isort
     rev: 5.12.0
     hooks:
@@ -12,6 +13,7 @@ repos:
     rev: v0.40.2
     hooks:
       - id: yapf
+        args: ['--exclude', 'webui']
   - repo: https://github.com/pre-commit/pre-commit-hooks
     rev: v4.5.0
     hooks:

diff --git a/README.md b/README.md
@@ -47,8 +47,8 @@ pip install agentlego
 Some tools require extra packages; please check the tool's readme file and confirm that all requirements are
 satisfied.
 
-For example, if we want to use the `ImageCaption` tool. We need to check the **Set up** section of
-[readme](agentlego/tools/image_text/README.md#ImageCaption) and install the requirements.
+For example, to use the `ImageDescription` tool, check the **Set up** section of its
+[readme](agentlego/tools/image_text/README.md#ImageDescription) and install the requirements.
 
 ```bash
 pip install -U openmim
@@ -62,7 +62,7 @@ from agentlego import list_tools, load_tool
 
 print(list_tools())  # list tools in AgentLego
 
-image_caption_tool = load_tool('ImageCaption', device='cuda')
+image_caption_tool = load_tool('ImageDescription', device='cuda')
 print(image_caption_tool.description)
 image = './examples/demo.png'
 caption = image_caption_tool(image)
@@ -88,9 +88,9 @@ caption = image_caption_tool(image)
 
 **Image-processing related**
 
-- [ImageCaption](agentlego/tools/image_text/README.md#ImageCaption): Describe the input image.
+- [ImageDescription](agentlego/tools/image_text/README.md#ImageDescription): Describe the input image.
 - [OCR](agentlego/tools/ocr/README.md#OCR): Recognize the text from a photo.
-- [VisualQuestionAnswering](agentlego/tools/vqa/README.md#VisualQuestionAnswering): Answer the question according to the image.
+- [VQA](agentlego/tools/vqa/README.md#VQA): Answer the question according to the image.
 - [HumanBodyPose](agentlego/tools/image_pose/README.md#HumanBodyPose): Estimate the pose or keypoints of human in an image.
 - [HumanFaceLandmark](agentlego/tools/image_pose/README.md#HumanFaceLandmark): Estimate the landmark or keypoints of human faces in an image.
 - [ImageToCanny](agentlego/tools/image_canny/README.md#ImageToCanny): Extract the edge image from an image.
@@ -100,8 +100,7 @@ caption = image_caption_tool(image)
 - [TextToBbox](agentlego/tools/object_detection/README.md#TextToBbox): Detect specific objects described by the given text in the image.
 - Segment Anything series
   - [SegmentAnything](agentlego/tools/segmentation/README.md#SegmentAnything): Segment all items in the image.
-  - [SegmentClicked](agentlego/tools/segmentation/README.md#SegmentClicked): Segment the masked region in the image.
-  - [ObjectSegmenting](agentlego/tools/segmentation/README.md#ObjectSegmenting): Segment the certain objects in the image according to the given object name.
+  - [SegmentObject](agentlego/tools/segmentation/README.md#SegmentObject): Segment specific objects in the image according to the given object name.
 
 **AIGC related**
 

diff --git a/README_zh-CN.md b/README_zh-CN.md
@@ -46,7 +46,7 @@ pip install agentlego
 
 Some tools require extra packages; please check the tool's readme file and confirm that all requirements are satisfied.
 
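-For example, if we want to use the `ImageCaption` tool, we need to check the **Set up** section of the tool's [readme](agentlego/tools/image_text/README.md#ImageCaption) and install the required packages.
+For example, if we want to use the `ImageDescription` tool, we need to check the **Set up** section of the tool's [readme](agentlego/tools/image_text/README.md#ImageDescription) and install the required packages.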
 
 ```bash
 pip install -U openmim
@@ -60,7 +60,7 @@ from agentlego import list_tools, load_tool
 
 print(list_tools())  # list tools in AgentLego
 
-image_caption_tool = load_tool('ImageCaption', device='cuda')
+image_caption_tool = load_tool('ImageDescription', device='cuda')
 print(image_caption_tool.description)
 image = './examples/demo.png'
 caption = image_caption_tool(image)
@@ -86,9 +86,9 @@ caption = image_caption_tool(image)
 
 **Image-processing related**
 
-- [ImageCaption](agentlego/tools/image_text/README.md#ImageCaption): Describe the input image.
+- [ImageDescription](agentlego/tools/image_text/README.md#ImageDescription): Describe the input image.
 - [OCR](agentlego/tools/ocr/README.md#OCR): Recognize text from a photo.
-- [VisualQuestionAnswering](agentlego/tools/vqa/README.md#VisualQuestionAnswering): Answer the question according to the image.
+- [VQA](agentlego/tools/vqa/README.md#VQA): Answer the question according to the image.
 - [HumanBodyPose](agentlego/tools/image_pose/README.md#HumanBodyPose): Estimate the pose or keypoints of the human body in an image and draw the pose image.
 - [HumanFaceLandmark](agentlego/tools/image_pose/README.md#HumanFaceLandmark): Identify the keypoints of human faces in an image and draw an image with the keypoints.
 - [ImageToCanny](agentlego/tools/image_canny/README.md#ImageToCanny): Extract the edge image from an image.
@@ -98,8 +98,7 @@ caption = image_caption_tool(image)
 - [TextToBbox](agentlego/tools/object_detection/README.md#TextToBbox): Detect the given objects in the image.
 - Segment Anything series tools
   - [SegmentAnything](agentlego/tools/segmentation/README.md#SegmentAnything): Segment all objects in the image.
-  - [SegmentClicked](agentlego/tools/segmentation/README.md#SegmentClicked): Segment the object in the specified region of the image.
-  - [ObjectSegmenting](agentlego/tools/segmentation/README.md#ObjectSegmenting): Segment specific objects in the image according to the given object name.
+  - [SegmentObject](agentlego/tools/segmentation/README.md#SegmentObject): Segment specific objects in the image according to the given object name.
 
 **AIGC related**
 

diff --git a/agentlego/__init__.py b/agentlego/__init__.py
@@ -1,4 +1,5 @@
 from .apis.tool import list_tools, load_tool
 from .search import search_tool
+from .version import __version__  # noqa: F401, F403
 
 __all__ = ['load_tool', 'list_tools', 'search_tool']
diff --git a/agentlego/apis/tool.py b/agentlego/apis/tool.py
@@ -1,21 +1,30 @@
 import importlib
 import inspect
+from typing import Optional, Union
 
 import agentlego.tools
 from agentlego.tools import BaseTool
+from agentlego.tools.func import _FuncToolType
 from agentlego.utils.cache import load_or_build_object
 
 NAMES2TOOLS = {}
 
 
-def register_all_tools(module):
+def extract_all_tools(module):
     if isinstance(module, str):
         module = importlib.import_module(module)
 
+    tools = {}
     for k, v in module.__dict__.items():
-        if (isinstance(v, type) and issubclass(v, BaseTool)
-                and (v is not BaseTool)):
-            NAMES2TOOLS[k] = v
+        if (isinstance(v, type) and issubclass(v, BaseTool) and (v is not BaseTool)):
+            tools[k] = v
+        elif isinstance(v, _FuncToolType):
+            tools[k] = v
+    return tools
+
+
+def register_all_tools(module):
+    NAMES2TOOLS.update(extract_all_tools(module))
 
 
 register_all_tools(agentlego.tools)
@@ -39,15 +48,15 @@ def list_tools(with_description=False):
         ...     print(name, description)
     """
     if with_description:
-        return list((name, cls.DEFAULT_TOOLMETA.description)
+        return list((name, cls.get_default_toolmeta().description)
                     for name, cls in NAMES2TOOLS.items())
     else:
         return list(NAMES2TOOLS.keys())
 
 
 def load_tool(tool_type: str,
-              name: str = None,
-              description: str = None,
+              name: Optional[str] = None,
+              description: Optional[str] = None,
               device=None,
               **kwargs) -> BaseTool:
     """Load a configurable callable tool for different task.
@@ -56,7 +65,7 @@ def load_tool(tool_type: str,
         tool_type (str): tool name for specific task. You can find more
             description about supported tools by
             :func:`~agentlego.apis.list_tools`.
-        override_name (str | None): The name to override the default name.
+        name (str | None): The name to override the default name.
             Defaults to None.
         description (str): The description to override the default description.
             Defaults to None.
@@ -72,28 +81,25 @@ def load_tool(tool_type: str,
     Examples:
         >>> from agentlego import load_tool
         >>> # load tool with tool name
-        >>> tool, meta = load_tool('object detection')
-        >>> # load a specific model
-        >>> tool, meta = load_tool(
-        >>>     'object detection', model='rtmdet_l_8xb32-300e_coco')
+        >>> tool = load_tool('GoogleSearch', with_url=True)
     """
     if tool_type not in NAMES2TOOLS:
         # Using ValueError to show error msg cross lines.
         raise ValueError(f'{tool_type} is not supported now, the available '
                          'tools are:\n' + '\n'.join(NAMES2TOOLS.keys()))
 
-    tool_type = NAMES2TOOLS[tool_type]
-    if 'device' in inspect.getfullargspec(tool_type).args:
+    constructor: Union[type, _FuncToolType] = NAMES2TOOLS[tool_type]
+    if 'device' in inspect.getfullargspec(constructor).args:
         kwargs['device'] = device
 
-    if name or description:
-        tool_obj = tool_type(**kwargs)
+    if name or description or isinstance(constructor, _FuncToolType):
+        tool_obj = constructor(**kwargs)
         if name:
             tool_obj.name = name
         if description:
             tool_obj.description = description
     else:
         # Only use the cache when no attribute is overridden,
         # so that the cached tool is never modified.
-        tool_obj = load_or_build_object(tool_type, **kwargs)
+        tool_obj = load_or_build_object(constructor, **kwargs)
     return tool_obj
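
The split of `register_all_tools` into `extract_all_tools` plus a registry update can be illustrated with a minimal, self-contained sketch. The `BaseTool` and `FuncTool` classes below are hypothetical stand-ins (the real `BaseTool` and `_FuncToolType` live in `agentlego.tools` and are not reproduced here), and the `demo_tools` module is built on the fly purely for illustration.

```python
import types


class BaseTool:
    """Hypothetical stand-in for agentlego.tools.BaseTool."""


class FuncTool:
    """Hypothetical stand-in for _FuncToolType: wraps a plain function."""

    def __init__(self, func):
        self.func = func


def extract_all_tools(module):
    """Collect class-based and function-based tools defined in a module."""
    tools = {}
    for name, value in vars(module).items():
        # Class tools: any subclass of BaseTool except BaseTool itself.
        if isinstance(value, type) and issubclass(value, BaseTool) and value is not BaseTool:
            tools[name] = value
        # Function tools: instances of the wrapper type.
        elif isinstance(value, FuncTool):
            tools[name] = value
    return tools


NAMES2TOOLS = {}


def register_all_tools(module):
    """Merge the tools found in a module into the global registry."""
    NAMES2TOOLS.update(extract_all_tools(module))


class Echo(BaseTool):
    pass


# Assemble a throwaway module holding one class tool, one function tool,
# the base class itself, and an unrelated helper.
demo = types.ModuleType('demo_tools')
demo.BaseTool = BaseTool
demo.Echo = Echo
demo.shout = FuncTool(lambda text: text.upper())
demo.helper = len

register_all_tools(demo)
print(sorted(NAMES2TOOLS))
```

Keeping the extraction step free of side effects means a caller can inspect the tools of any module without touching the global registry, which is presumably useful for the server-side function tools mentioned in the commit message.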
diff --git a/agentlego/parsers/default_parser.py b/agentlego/parsers/default_parser.py
@@ -1,31 +1,27 @@
 from typing import Tuple
 
-from agentlego.types import CatgoryToIO, IOType
+from agentlego.types import AudioIO, File, ImageIO, IOType
 from .base_parser import BaseParser
 
 
 class DefaultParser(BaseParser):
-    agent_cat2type = {
-        'image': 'path',
-        'text': 'string',
-        'audio': 'path',
-        'int': 'int',
-        'bool': 'bool',
-        'float': 'float',
+    agent_type2format = {
+        ImageIO: 'path',
+        AudioIO: 'path',
+        File: 'path',
     }
 
     def parse_inputs(self, *args, **kwargs) -> Tuple[tuple, dict]:
-        for arg, arg_name in zip(args, self.tool.parameters):
-            kwargs[arg_name] = arg
+        for arg, p in zip(args, self.tool.inputs):
+            kwargs[p.name] = arg
 
         parsed_kwargs = {}
         for k, v in kwargs.items():
-            if k not in self.tool.parameters:
+            p = self.tool.arguments.get(k)
+            if p is None:
                 raise TypeError(f'Got unexpected keyword argument "{k}".')
-            p = self.tool.parameters[k]
-            tool_type = CatgoryToIO[p.category]
-            if not isinstance(v, tool_type):
-                tool_input = tool_type(v)
+            if not isinstance(v, p.type):
+                tool_input = p.type(v)
             else:
                 tool_input = v
             parsed_kwargs[k] = tool_input
@@ -35,18 +31,27 @@ def parse_inputs(self, *args, **kwargs) -> Tuple[tuple, dict]:
     def parse_outputs(self, outputs):
         if isinstance(outputs, tuple):
             assert len(outputs) == len(self.toolmeta.outputs)
+            parsed_outs = []
+            for out in outputs:
+                format = self.agent_type2format.get(type(out))
+                if isinstance(out, IOType) and format:
+                    out = out.to(format)
+                parsed_outs.append(out)
+            parsed_outs = tuple(parsed_outs)
+        elif isinstance(outputs, dict):
+            parsed_outs = {}
+            for k, out in outputs.items():
+                format = self.agent_type2format.get(type(out))
+                if isinstance(out, IOType) and format:
+                    out = out.to(format)
+                parsed_outs[k] = out
         else:
-            assert len(self.toolmeta.outputs) == 1
-            outputs = [outputs]
+            format = self.agent_type2format.get(type(outputs))
+            if isinstance(outputs, IOType) and format:
+                outputs = outputs.to(format)
+            parsed_outs = outputs
 
-        parsed_outs = []
-        for tool_output, out_category in zip(outputs, self.toolmeta.outputs):
-            agent_type = self.agent_cat2type[out_category]
-            if isinstance(tool_output, IOType):
-                tool_output = tool_output.to(agent_type)
-            parsed_outs.append(tool_output)
-
-        return parsed_outs[0] if len(parsed_outs) == 1 else tuple(parsed_outs)
+        return parsed_outs
 
     def refine_description(self) -> str:
         """Refine the tool description by replacing the input and output
@@ -61,12 +66,40 @@ def refine_description(self) -> str:
         """
 
         inputs_desc = []
-        for p in self.tool.parameters.values():
-            type_ = self.agent_cat2type[p.category]
-            default = f', Defaults to {p.default}' if p.optional else ''
-            inputs_desc.append(f'{p.name} ({p.category} {type_}{default})')
-        inputs_desc = 'Args: ' + ', '.join(inputs_desc)
+        for p in self.tool.inputs:
+            desc = f'{p.name}'
+            format = self.agent_type2format.get(p.type, p.type.__name__)
+            if p.description:
+                format += f', {p.description}'
+            if p.optional:
+                format += f'. Optional, Defaults to {p.default}'
+            desc += f' ({format})'
+            inputs_desc.append(desc)
+        if len(inputs_desc) > 0:
+            inputs_desc = 'Args: ' + '; '.join(inputs_desc)
+        else:
+            inputs_desc = 'No argument.'
+
+        outputs_desc = []
+        for p in self.tool.outputs:
+            format = self.agent_type2format.get(p.type, p.type.__name__)
+            if p.name and p.description:
+                desc = f'{p.name} ({format}, {p.description})'
+            elif p.name:
+                desc = f'{p.name} ({format})'
+            elif p.description:
+                desc = f'{format} ({p.description})'
+            else:
+                desc = f'{format}'
+            outputs_desc.append(desc)
+        if len(outputs_desc) > 0:
+            outputs_desc = 'Returns: ' + '; '.join(outputs_desc)
+        else:
+            outputs_desc = 'No returns.'
 
-        description = f'{self.toolmeta.description} {inputs_desc}'
+        description = ''
+        if self.toolmeta.description:
+            description += f'{self.toolmeta.description}\n'
+        description += f'{inputs_desc}\n{outputs_desc}'
 
         return description
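
The new `parse_outputs` keys the conversion on the runtime type of each output rather than on a category string, and it accepts tuples, dicts, and single values. A minimal sketch of that dispatch follows; the `IOType` and `ImageIO` classes here are simplified, hypothetical stand-ins for the ones in `agentlego.types`, the `to` method only returns a tagged string for demonstration, and the length check against `toolmeta.outputs` from the diff is omitted.

```python
class IOType:
    """Hypothetical stand-in for agentlego.types.IOType."""

    def __init__(self, value):
        self.value = value

    def to(self, fmt):
        # The real method converts between representations; this sketch
        # only tags the value with the requested format.
        return f'{fmt}:{self.value}'


class ImageIO(IOType):
    """Hypothetical stand-in for agentlego.types.ImageIO."""


# Map each IO type to the format the agent expects to receive.
AGENT_TYPE2FORMAT = {ImageIO: 'path'}


def convert(out):
    """Convert a single output if its type has a registered agent format."""
    fmt = AGENT_TYPE2FORMAT.get(type(out))
    if isinstance(out, IOType) and fmt:
        return out.to(fmt)
    return out


def parse_outputs(outputs):
    """Apply the conversion while preserving the container shape."""
    if isinstance(outputs, tuple):
        return tuple(convert(o) for o in outputs)
    if isinstance(outputs, dict):
        return {k: convert(v) for k, v in outputs.items()}
    return convert(outputs)


print(parse_outputs((ImageIO('a.png'), 3)))
print(parse_outputs({'image': ImageIO('b.png'), 'score': 0.5}))
print(parse_outputs('plain text'))
```

Plain Python values such as strings and numbers pass through unchanged because they have no entry in the mapping, which is why the rewritten `agent_type2format` in the diff only lists the file-like types.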