【InternLM实战营---第六节课作业】

一、学习笔记

https://blog.csdn.net/weixin_45609124/article/details/138093110

二、基础作业

  • 完成 AgentLego WebUI 使用
    参考官方教程,下面记录下非 InterStudio 开发机环境配置

    1. 创建初始环境

      conda create -n agent
      conda activate agent
      conda install python=3.10
      conda install pytorch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2 pytorch-cuda=11.8 -c pytorch -c nvidia
      
    2. 安装 Lagent 和 AgentLego

      cd /root/agent
      conda activate agent
      git clone https://gitee.com/internlm/lagent.git
      cd lagent && git checkout 581d9fb && pip install -e . && cd ..
      git clone https://gitee.com/internlm/agentlego.git
      cd agentlego && git checkout 7769e0d && pip install -e . && cd ..
      
    3. 准备其他

      conda activate agent
      pip install lmdeploy==0.3.0
      
      #准备 Tutorial
      cd /root/agent
      git clone -b camp2 https://gitee.com/internlm/Tutorial.git
      
    4. 使用 LMDeploy 启动模型API服务

      conda activate agent
      lmdeploy serve api_server /root/share/new_models/Shanghai_AI_Laboratory/internlm2-chat-7b \
                                  --server-name 127.0.0.1 \
                                  --model-name internlm2-chat-7b \
                                  --cache-max-entry-count 0.1
      

      服务启动

    5. 启动并使用 Lagent Web Demo

      conda activate agent
      cd /root/agent/lagent/examples
      streamlit run internlm2_agent_web_demo.py --server.address 127.0.0.1 --server.port 7860
      

      web demo

    6. 效果展示
      触发论文搜索的能力
      论文搜索

  • 完成 AgentLego 直接使用部分

    1. 文件及环境准备

      #先下载 demo 文件
      cd /root/agent
      wget http://download.openmmlab.com/agentlego/road.jpg
      
      # 安装相关依赖
      conda activate agent
      pip install openmim==0.3.9
      mim install mmdet==3.3.0
      
    2. 脚本准备

      import re
      
      import cv2
      from agentlego.apis import load_tool
      
      # load tool
      tool = load_tool('ObjectDetection', device='cuda')
      
      # apply tool
      visualization = tool('/root/agent/road.jpg')
      print(visualization)
      
      # visualize
      image = cv2.imread('/root/agent/road.jpg')
      
      preds = visualization.split('\n')
      pattern = r'(\w+) \((\d+), (\d+), (\d+), (\d+)\), score (\d+)'
      
      for pred in preds:
          name, x1, y1, x2, y2, score = re.match(pattern, pred).groups()
          x1, y1, x2, y2, score = int(x1), int(y1), int(x2), int(y2), int(score)
          cv2.rectangle(image, (x1, y1), (x2, y2), (0, 255, 0), 1)
          cv2.putText(image, f'{name} {score}', (x1, y1), cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 255, 0), 1)
      
      cv2.imwrite('/root/agent/road_detection_direct.jpg', image)
      
    3. 执行脚本并查看效果

      python /root/agent/direct_use.py
      

      执行脚本
      图片对比

三、进阶作业

  • 完成 AgentLego WebUI 使用
    1. 启动API server

      conda activate agent
      lmdeploy serve api_server /root/share/new_models/Shanghai_AI_Laboratory/internlm2-chat-7b \
                                  --server-name 127.0.0.1 \
                                  --model-name internlm2-chat-7b \
                                  --cache-max-entry-count 0.1
      
    2. 启动 AgentLego WebUI

      cd /root/agent/agentlego/webui
      python one_click.py
      
    3. 访问AgentLego WebUI并进行配置
      agent
      tools

    4. 最终效果展示
      效果展示

  • 用 AgentLego 自定义工具
    1. 创建工具文件
      touch /root/agent/agentlego/agentlego/tools/magicmaker_image_generation.py
      脚本内容如下:

      import json
      import requests
      
      import numpy as np
      
      from agentlego.types import Annotated, ImageIO, Info
      from agentlego.utils import require
      from .base import BaseTool
      
      
      class MagicMakerImageGeneration(BaseTool):
      
          default_desc = ('This tool can call the api of magicmaker to '
                          'generate an image according to the given keywords.')
      
          styles_option = [
              'dongman',  # 动漫
              'guofeng',  # 国风
              'xieshi',   # 写实
              'youhua',   # 油画
              'manghe',   # 盲盒
          ]
          aspect_ratio_options = [
              '16:9', '4:3', '3:2', '1:1',
              '2:3', '3:4', '9:16'
          ]
      
          @require('opencv-python')
          def __init__(self,
                       style='guofeng',
                       aspect_ratio='4:3'):
              super().__init__()
              if style in self.styles_option:
                  self.style = style
              else:
                  raise ValueError(f'The style must be one of {self.styles_option}')
              
              if aspect_ratio in self.aspect_ratio_options:
                  self.aspect_ratio = aspect_ratio
              else:
                  raise ValueError(f'The aspect ratio must be one of {aspect_ratio}')
      
          def apply(self,
                    keywords: Annotated[str,
                                        Info('A series of Chinese keywords separated by comma.')]
              ) -> ImageIO:
              import cv2
              response = requests.post(
                  url='https://magicmaker.openxlab.org.cn/gw/edit-anything/api/v1/bff/sd/generate',
                  data=json.dumps({
                      "official": True,
                      "prompt": keywords,
                      "style": self.style,
                      "poseT": False,
                      "aspectRatio": self.aspect_ratio
                  }),
                  headers={'content-type': 'application/json'}
              )
              image_url = response.json()['data']['imgUrl']
              image_response = requests.get(image_url)
              image = cv2.cvtColor(cv2.imdecode(np.frombuffer(image_response.content, np.uint8), cv2.IMREAD_COLOR),cv2.COLOR_BGR2RGB)
              return ImageIO(image)
      
    2. 注册新工具
      修改 /root/agent/agentlego/agentlego/tools/init.py 文件

      from .base import BaseTool
      from .calculator import Calculator
      from .func import make_tool
      from .image_canny import CannyTextToImage, ImageToCanny
      from .image_depth import DepthTextToImage, ImageToDepth
      from .image_editing import ImageExpansion, ImageStylization, ObjectRemove, ObjectReplace
      from .image_pose import HumanBodyPose, HumanFaceLandmark, PoseToImage
      from .image_scribble import ImageToScribble, ScribbleTextToImage
      from .image_text import ImageDescription, TextToImage
      from .imagebind import AudioImageToImage, AudioTextToImage, AudioToImage, ThermalToImage
      from .object_detection import ObjectDetection, TextToBbox
      from .ocr import OCR
      from .scholar import *  # noqa: F401, F403
      from .search import BingSearch, GoogleSearch
      from .segmentation import SegmentAnything, SegmentObject, SemanticSegmentation
      from .speech_text import SpeechToText, TextToSpeech
      from .translation import Translation
      from .vqa import VQA
      + from .magicmaker_image_generation import MagicMakerImageGeneration
      
      __all__ = [
          'CannyTextToImage', 'ImageToCanny', 'DepthTextToImage', 'ImageToDepth',
          'ImageExpansion', 'ObjectRemove', 'ObjectReplace', 'HumanFaceLandmark',
          'HumanBodyPose', 'PoseToImage', 'ImageToScribble', 'ScribbleTextToImage',
          'ImageDescription', 'TextToImage', 'VQA', 'ObjectDetection', 'TextToBbox', 'OCR',
          'SegmentObject', 'SegmentAnything', 'SemanticSegmentation', 'ImageStylization',
          'AudioToImage', 'ThermalToImage', 'AudioImageToImage', 'AudioTextToImage',
          'SpeechToText', 'TextToSpeech', 'Translation', 'GoogleSearch', 'Calculator',
      -     'BaseTool', 'make_tool', 'BingSearch'
      +     'BaseTool', 'make_tool', 'BingSearch', 'MagicMakerImageGeneration'
      ]
      
    3. 服务启动
      要重启上面的api server 和 AgentLego WebUI

    4. 效果展示
      记得要和配置agent和tools在这里插入图片描述

  • 14
    点赞
  • 23
    收藏
    觉得还不错? 一键收藏
  • 1
    评论
评论 1
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值