Bump version to v0.2.0 #60

Merged
merged 22 commits into from
Feb 2, 2024
3 changes: 1 addition & 2 deletions .dev_scripts/generate_readme.py
@@ -59,8 +59,7 @@

def parse_args():
parser = argparse.ArgumentParser(description=prog_description)
parser.add_argument(
'tools', type=str, nargs='+', help='The tool class to generate.')
parser.add_argument('tools', type=str, nargs='+', help='The tool class to generate.')
args = parser.parse_args()
return args

7 changes: 7 additions & 0 deletions .gitignore
@@ -119,3 +119,10 @@ venv.bak/

# generated image, audio or video files.
generated/

# gradio demo
webui/logs/
webui/custom_tools/
webui/installer_files/
webui/generated/
webui/*.yml
2 changes: 2 additions & 0 deletions .pre-commit-config.yaml
@@ -4,6 +4,7 @@ repos:
rev: 6.1.0
hooks:
- id: flake8
args: ['--exclude', 'webui/*']
- repo: https://github.com/PyCQA/isort
rev: 5.12.0
hooks:
@@ -12,6 +13,7 @@ repos:
rev: v0.40.2
hooks:
- id: yapf
args: ['--exclude', 'webui']
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v4.5.0
hooks:
13 changes: 6 additions & 7 deletions README.md
@@ -47,8 +47,8 @@ pip install agentlego
Some tools requires extra packages, please check the readme file of the tool, and confirm all requirements are
satisfied.

For example, if we want to use the `ImageCaption` tool. We need to check the **Set up** section of
[readme](agentlego/tools/image_text/README.md#ImageCaption) and install the requirements.
For example, if we want to use the `ImageDescription` tool. We need to check the **Set up** section of
[readme](agentlego/tools/image_text/README.md#ImageDescription) and install the requirements.

```bash
pip install -U openmim
@@ -62,7 +62,7 @@ from agentlego import list_tools, load_tool

print(list_tools()) # list tools in AgentLego

image_caption_tool = load_tool('ImageCaption', device='cuda')
image_caption_tool = load_tool('ImageDescription', device='cuda')
print(image_caption_tool.description)
image = './examples/demo.png'
caption = image_caption_tool(image)
@@ -88,9 +88,9 @@ caption = image_caption_tool(image)

**Image-processing related**

- [ImageCaption](agentlego/tools/image_text/README.md#ImageCaption): Describe the input image.
- [ImageDescription](agentlego/tools/image_text/README.md#ImageDescription): Describe the input image.
- [OCR](agentlego/tools/ocr/README.md#OCR): Recognize the text from a photo.
- [VisualQuestionAnswering](agentlego/tools/vqa/README.md#VisualQuestionAnswering): Answer the question according to the image.
- [VQA](agentlego/tools/vqa/README.md#VQA): Answer the question according to the image.
- [HumanBodyPose](agentlego/tools/image_pose/README.md#HumanBodyPose): Estimate the pose or keypoints of human in an image.
- [HumanFaceLandmark](agentlego/tools/image_pose/README.md#HumanFaceLandmark): Estimate the landmark or keypoints of human faces in an image.
- [ImageToCanny](agentlego/tools/image_canny/README.md#ImageToCanny): Extract the edge image from an image.
@@ -100,8 +100,7 @@ caption = image_caption_tool(image)
- [TextToBbox](agentlego/tools/object_detection/README.md#TextToBbox): Detect specific objects described by the given text in the image.
- Segment Anything series
- [SegmentAnything](agentlego/tools/segmentation/README.md#SegmentAnything): Segment all items in the image.
- [SegmentClicked](agentlego/tools/segmentation/README.md#SegmentClicked): Segment the masked region in the image.
- [ObjectSegmenting](agentlego/tools/segmentation/README.md#ObjectSegmenting): Segment the certain objects in the image according to the given object name.
- [SegmentObject](agentlego/tools/segmentation/README.md#SegmentObject): Segment the certain objects in the image according to the given object name.

**AIGC related**

11 changes: 5 additions & 6 deletions README_zh-CN.md
@@ -46,7 +46,7 @@ pip install agentlego

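Some tools require extra packages; please check the tool's readme file and confirm that all requirements are satisfied.

For example, if we want to use the `ImageCaption` tool, we need to check the **Set up** section of the tool's [readme](agentlego/tools/image_text/README.md#ImageCaption) and install the required software.
For example, if we want to use the `ImageDescription` tool, we need to check the **Set up** section of the tool's [readme](agentlego/tools/image_text/README.md#ImageDescription) and install the required software.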

```bash
pip install -U openmim
@@ -60,7 +60,7 @@ from agentlego import list_tools, load_tool

print(list_tools()) # list tools in AgentLego

image_caption_tool = load_tool('ImageCaption', device='cuda')
image_caption_tool = load_tool('ImageDescription', device='cuda')
print(image_caption_tool.description)
image = './examples/demo.png'
caption = image_caption_tool(image)
@@ -86,9 +86,9 @@ caption = image_caption_tool(image)

**Image-processing related**

- [ImageCaption](agentlego/tools/image_text/README.md#ImageCaption): Describe the input image.
- [ImageDescription](agentlego/tools/image_text/README.md#ImageDescription): Describe the input image.
- [OCR](agentlego/tools/ocr/README.md#OCR): Recognize text from a photo.
- [VisualQuestionAnswering](agentlego/tools/vqa/README.md#VisualQuestionAnswering): Answer the question according to the image.
- [VQA](agentlego/tools/vqa/README.md#VQA): Answer the question according to the image.
- [HumanBodyPose](agentlego/tools/image_pose/README.md#HumanBodyPose): Estimate the pose or keypoints of the human body in an image, and draw the human pose image.
- [HumanFaceLandmark](agentlego/tools/image_pose/README.md#HumanFaceLandmark): Identify the keypoints of human faces in an image, and draw an image with the keypoints.
- [ImageToCanny](agentlego/tools/image_canny/README.md#ImageToCanny): Extract the edge image from an image.
@@ -98,8 +98,7 @@ caption = image_caption_tool(image)
- [TextToBbox](agentlego/tools/object_detection/README.md#TextToBbox): Detect the given objects in the image.
- Segment Anything series of tools
- [SegmentAnything](agentlego/tools/segmentation/README.md#SegmentAnything): Segment all objects in the image.
- [SegmentClicked](agentlego/tools/segmentation/README.md#SegmentClicked): Segment the object in the specified region of the image.
- [ObjectSegmenting](agentlego/tools/segmentation/README.md#ObjectSegmenting): Segment specific objects in the image according to the given object name.
- [SegmentObject](agentlego/tools/segmentation/README.md#SegmentObject): Segment specific objects in the image according to the given object name.

**AIGC related**

1 change: 1 addition & 0 deletions agentlego/__init__.py
@@ -1,4 +1,5 @@
from .apis.tool import list_tools, load_tool
from .search import search_tool
from .version import __version__ # noqa: F401, F403

__all__ = ['load_tool', 'list_tools', 'search_tool']
40 changes: 23 additions & 17 deletions agentlego/apis/tool.py
@@ -1,21 +1,30 @@
import importlib
import inspect
from typing import Optional, Union

import agentlego.tools
from agentlego.tools import BaseTool
from agentlego.tools.func import _FuncToolType
from agentlego.utils.cache import load_or_build_object

NAMES2TOOLS = {}


def register_all_tools(module):
def extract_all_tools(module):
if isinstance(module, str):
module = importlib.import_module(module)

tools = {}
for k, v in module.__dict__.items():
if (isinstance(v, type) and issubclass(v, BaseTool)
and (v is not BaseTool)):
NAMES2TOOLS[k] = v
if (isinstance(v, type) and issubclass(v, BaseTool) and (v is not BaseTool)):
tools[k] = v
elif isinstance(v, _FuncToolType):
tools[k] = v
return tools


def register_all_tools(module):
NAMES2TOOLS.update(extract_all_tools(module))


register_all_tools(agentlego.tools)
@@ -39,15 +48,15 @@ def list_tools(with_description=False):
... print(name, description)
"""
if with_description:
return list((name, cls.DEFAULT_TOOLMETA.description)
return list((name, cls.get_default_toolmeta().description)
for name, cls in NAMES2TOOLS.items())
else:
return list(NAMES2TOOLS.keys())


def load_tool(tool_type: str,
name: str = None,
description: str = None,
name: Optional[str] = None,
description: Optional[str] = None,
device=None,
**kwargs) -> BaseTool:
"""Load a configurable callable tool for different task.
@@ -56,7 +65,7 @@ def load_tool(tool_type: str,
tool_name (str): tool name for specific task. You can find more
description about supported tools by
:func:`~agentlego.apis.list_tools`.
override_name (str | None): The name to override the default name.
name (str | None): The name to override the default name.
Defaults to None.
description (str): The description to override the default description.
Defaults to None.
@@ -72,28 +81,25 @@ def load_tool(tool_type: str,
Examples:
>>> from agentlego import load_tool
>>> # load tool with tool name
>>> tool, meta = load_tool('object detection')
>>> # load a specific model
>>> tool, meta = load_tool(
>>> 'object detection', model='rtmdet_l_8xb32-300e_coco')
>>> tool, meta = load_tool('GoogleSearch', with_url=True)
"""
if tool_type not in NAMES2TOOLS:
# Using ValueError to show error msg cross lines.
raise ValueError(f'{tool_type} is not supported now, the available '
'tools are:\n' + '\n'.join(NAMES2TOOLS.keys()))

tool_type = NAMES2TOOLS[tool_type]
if 'device' in inspect.getfullargspec(tool_type).args:
constructor: Union[type, _FuncToolType] = NAMES2TOOLS[tool_type]
if 'device' in inspect.getfullargspec(constructor).args:
kwargs['device'] = device

if name or description:
tool_obj = tool_type(**kwargs)
if name or description or isinstance(constructor, _FuncToolType):
tool_obj = constructor(**kwargs)
if name:
tool_obj.name = name
if description:
tool_obj.description = description
else:
# Only enable cache if no overrode attribution
# to avoid the cached tool is changed.
tool_obj = load_or_build_object(tool_type, **kwargs)
tool_obj = load_or_build_object(constructor, **kwargs)
return tool_obj
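
Reviewer note: the `agentlego/apis/tool.py` change above splits tool discovery (`extract_all_tools`) from registration (`register_all_tools`). The sketch below shows that pattern in isolation so it runs without AgentLego installed. `BaseTool` and `Echo` here are hypothetical stand-ins, the `_FuncToolType` branch from the diff is left out, and the module is built on the fly; none of this is the project's actual code.

```python
import types


class BaseTool:
    """Hypothetical stand-in for agentlego.tools.BaseTool."""


class Echo(BaseTool):
    """A toy tool used only for this illustration."""


def extract_all_tools(module):
    """Collect every BaseTool subclass found in the module namespace."""
    tools = {}
    for name, value in vars(module).items():
        is_tool_class = (isinstance(value, type)
                         and issubclass(value, BaseTool)
                         and value is not BaseTool)
        if is_tool_class:
            tools[name] = value
    return tools


# Build a throwaway module so the sketch has something to scan.
demo_module = types.ModuleType('demo_tools')
demo_module.BaseTool = BaseTool
demo_module.Echo = Echo
demo_module.helper = lambda: None  # not a tool class, so it is skipped

# Registration is a plain dict update on top of the extraction step.
NAMES2TOOLS = {}
NAMES2TOOLS.update(extract_all_tools(demo_module))
print(sorted(NAMES2TOOLS))
```

Because extraction returns a fresh dict and does not write into the global registry, it can presumably be reused to inspect other modules without registering what it finds.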
95 changes: 64 additions & 31 deletions agentlego/parsers/default_parser.py
@@ -1,31 +1,27 @@
from typing import Tuple

from agentlego.types import CatgoryToIO, IOType
from agentlego.types import AudioIO, File, ImageIO, IOType
from .base_parser import BaseParser


class DefaultParser(BaseParser):
agent_cat2type = {
'image': 'path',
'text': 'string',
'audio': 'path',
'int': 'int',
'bool': 'bool',
'float': 'float',
agent_type2format = {
ImageIO: 'path',
AudioIO: 'path',
File: 'path',
}

def parse_inputs(self, *args, **kwargs) -> Tuple[tuple, dict]:
for arg, arg_name in zip(args, self.tool.parameters):
kwargs[arg_name] = arg
for arg, p in zip(args, self.tool.inputs):
kwargs[p.name] = arg

parsed_kwargs = {}
for k, v in kwargs.items():
if k not in self.tool.parameters:
p = self.tool.arguments.get(k)
if p is None:
raise TypeError(f'Got unexcepted keyword argument "{k}".')
p = self.tool.parameters[k]
tool_type = CatgoryToIO[p.category]
if not isinstance(v, tool_type):
tool_input = tool_type(v)
if not isinstance(v, p.type):
tool_input = p.type(v)
else:
tool_input = v
parsed_kwargs[k] = tool_input
@@ -35,18 +31,27 @@ def parse_inputs(self, *args, **kwargs) -> Tuple[tuple, dict]:
def parse_outputs(self, outputs):
if isinstance(outputs, tuple):
assert len(outputs) == len(self.toolmeta.outputs)
parsed_outs = []
for out in outputs:
format = self.agent_type2format.get(type(out))
if isinstance(out, IOType) and format:
out = out.to(format)
parsed_outs.append(out)
parsed_outs = tuple(parsed_outs)
elif isinstance(outputs, dict):
parsed_outs = {}
for k, out in outputs.items():
format = self.agent_type2format.get(type(out))
if isinstance(out, IOType) and format:
out = out.to(format)
parsed_outs[k] = out
else:
assert len(self.toolmeta.outputs) == 1
outputs = [outputs]
format = self.agent_type2format.get(type(outputs))
if isinstance(outputs, IOType) and format:
outputs = outputs.to(format)
parsed_outs = outputs

parsed_outs = []
for tool_output, out_category in zip(outputs, self.toolmeta.outputs):
agent_type = self.agent_cat2type[out_category]
if isinstance(tool_output, IOType):
tool_output = tool_output.to(agent_type)
parsed_outs.append(tool_output)

return parsed_outs[0] if len(parsed_outs) == 1 else tuple(parsed_outs)
return parsed_outs

def refine_description(self) -> str:
"""Refine the tool description by replacing the input and output
@@ -61,12 +66,40 @@ def refine_description(self) -> str:
"""

inputs_desc = []
for p in self.tool.parameters.values():
type_ = self.agent_cat2type[p.category]
default = f', Defaults to {p.default}' if p.optional else ''
inputs_desc.append(f'{p.name} ({p.category} {type_}{default})')
inputs_desc = 'Args: ' + ', '.join(inputs_desc)
for p in self.tool.inputs:
desc = f'{p.name}'
format = self.agent_type2format.get(p.type, p.type.__name__)
if p.description:
format += f', {p.description}'
if p.optional:
format += f'. Optional, Defaults to {p.default}'
desc += f' ({format})'
inputs_desc.append(desc)
if len(inputs_desc) > 0:
inputs_desc = 'Args: ' + '; '.join(inputs_desc)
else:
inputs_desc = 'No argument.'

outputs_desc = []
for p in self.tool.outputs:
format = self.agent_type2format.get(p.type, p.type.__name__)
if p.name and p.description:
desc = f'{p.name} ({format}, {p.description})'
elif p.name:
desc = f'{p.name} ({format})'
elif p.description:
desc = f'{format} ({p.description})'
else:
desc = f'{format}'
outputs_desc.append(desc)
if len(outputs_desc) > 0:
outputs_desc = 'Returns: ' + '; '.join(outputs_desc)
else:
outputs_desc = 'No returns.'

description = f'{self.toolmeta.description} {inputs_desc}'
description = ''
if self.toolmeta.description:
description += f'{self.toolmeta.description}\n'
description += f'{inputs_desc}\n{outputs_desc}'

return description
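
Reviewer note: to make the new `refine_description` output format easier to picture, here is a minimal standalone sketch of the same string-building logic. The `Parameter` dataclass and the `describe` function are hypothetical stand-ins for the tool metadata used in the diff, and the `agent_type2format` lookup (which maps `ImageIO`, `AudioIO` and `File` to `'path'`) is replaced by the plain type name for simplicity.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class Parameter:
    """Hypothetical stand-in for one input or output entry of a tool."""
    type: type
    name: Optional[str] = None
    description: Optional[str] = None
    optional: bool = False
    default: Any = None


def describe(tool_desc: str, inputs: List[Parameter],
             outputs: List[Parameter]) -> str:
    """Build a description in the 'Args: ...' / 'Returns: ...' layout."""
    inputs_desc = []
    for p in inputs:
        fmt = p.type.__name__  # the diff looks up agent_type2format first
        if p.description:
            fmt += f', {p.description}'
        if p.optional:
            fmt += f'. Optional, Defaults to {p.default}'
        inputs_desc.append(f'{p.name} ({fmt})')
    if inputs_desc:
        args_line = 'Args: ' + '; '.join(inputs_desc)
    else:
        args_line = 'No argument.'

    outputs_desc = []
    for p in outputs:
        fmt = p.type.__name__
        if p.name and p.description:
            outputs_desc.append(f'{p.name} ({fmt}, {p.description})')
        elif p.name:
            outputs_desc.append(f'{p.name} ({fmt})')
        elif p.description:
            outputs_desc.append(f'{fmt} ({p.description})')
        else:
            outputs_desc.append(fmt)
    if outputs_desc:
        returns_line = 'Returns: ' + '; '.join(outputs_desc)
    else:
        returns_line = 'No returns.'

    description = ''
    if tool_desc:
        description += f'{tool_desc}\n'
    description += f'{args_line}\n{returns_line}'
    return description


text = describe(
    'Add two numbers.',
    inputs=[
        Parameter(int, 'a', 'first addend'),
        Parameter(int, 'b', optional=True, default=0),
    ],
    outputs=[Parameter(int, description='the sum')],
)
print(text)
```

Under these assumptions the description becomes three lines (tool summary, arguments, returns), whereas the previous version produced a single line with only the arguments.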