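Deploying Llama on AWS and Building a Responsible AI Lifestyle Assistant

Project overview:

Xiao Li Ge (小李哥) continues his daily series on cutting-edge AI solutions built on Amazon Web Services (AWS), to help readers quickly pick up AWS AI best practices and apply them in their day-to-day work.

This post shows how to deploy the open-source Llama model with the Amazon SageMaker machine learning service and add Llama Guard compliance checks to the model's inputs and outputs, so that Llama does not produce harmful, inappropriate, or false content. We also host an AI lifestyle assistant on the container service ECS. The assistant calls the Llama model API to give users lifestyle advice and stores the conversation history in DynamoDB so that users can review past conversations. The design relies on cloud-native managed services, with serverless components (AWS Fargate, Lambda, and DynamoDB) in the application tier, to provide a scalable and secure AI solution. The solution architecture diagram is shown below: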

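Background knowledge

What is Amazon SageMaker?

Amazon SageMaker is AWS's end-to-end machine learning service. It helps developers and data scientists build, train, and deploy machine learning models, with tooling that covers the whole workflow from data preparation and model training to model deployment.

What is Llama Guard?

Llama Guard is a safeguard model released by Meta that adds safety and compliance protection to Llama models (or other large language models). It takes a conversation plus a list of unsafe content categories, reviews the model's inputs and outputs, and labels them as safe or unsafe, listing the violated categories when they are unsafe. This helps developers detect and block potentially harmful output such as inappropriate speech, bias, and misinformation, which improves the safety and reliability of the AI system.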

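Why build responsible AI?

Preventing bias and discrimination

Large language models can unintentionally learn biases present in their training data. Responsible AI aims to identify and remove these biases so that AI decisions are fair and impartial and do not discriminate on the basis of race, gender, or other characteristics.

Improving trust and transparency

Users' trust in an AI system depends on its transparency and explainability. Building responsible AI helps users understand how the system works, which makes its suggestions and decisions more credible.

Complying with laws and regulations

Many countries and regions impose strict legal requirements on data privacy, security, and fairness. Responsible AI keeps models operating within these requirements and reduces legal risk.

Protecting user privacy

Responsible AI respects and protects users' privacy and avoids leaking personal information when handling sensitive data. Appropriate encryption and anonymization keep user data secure.

Preventing misuse and abuse

Responsible AI design includes mechanisms that guard against malicious use or misuse, for example preventing an AI system from being used to generate fake news, spread misinformation, or attack others.

Ethical responsibility

As AI systems grow more influential, developers and companies are responsible for making sure these systems benefit society. Building responsible AI means weighing ethical responsibilities when designing and deploying AI systems and avoiding negative social impact.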

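What this solution covers

1. Build the AI lifestyle assistant with the Streamlit framework, deploy it on AWS Fargate, and front it with a load balancer for high availability.

2. Use the Lambda serverless compute service to handle API calls to the models.

3. Deploy the Llama 2 model on Amazon SageMaker and add the Llama Guard safety model alongside it.

4. Store conversation records in the NoSQL service DynamoDB.

Step-by-step setup:

1. Sign in to the AWS Management Console, create a SageMaker Studio environment for running Jupyter notebooks, and click Open to launch it.

2. In SageMaker Studio, create a new Jupyter notebook and run the following commands to install the required dependency and specify the ID and version of the Llama Guard model.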

%pip install --quiet --upgrade sagemaker
model_id = "meta-textgeneration-llama-guard-7b"
model_version = "1.*"

3. Use SageMaker JumpStart to deploy the Llama Guard model quickly, without training it or loading model files by hand.

from sagemaker.jumpstart.model import JumpStartModel


model = JumpStartModel(model_id=model_id, model_version=model_version,instance_type='ml.g5.2xlarge')

accept_eula = True

try:
    predictor = model.deploy(accept_eula=accept_eula)
except Exception as e:
    print(e)

4. The following code tests the model's responses. It fetches the example payloads with retrieve_all_examples() and sends each one to the deployed endpoint for prediction.

example_payloads = model.retrieve_all_examples()

for payload in example_payloads:
    try:
        response = predictor.predict(payload.body)
        print(
            "\nInput\n",
            payload.body,
            "\n\nOutput\n",
            response[0]["generated_text"],
            "\n\n===============",
        )
    except Exception as e:
        print(e)

5. Next, use JumpStart to deploy the Llama 2 7B chat model in the same way.

llm_model_id = "meta-textgeneration-llama-2-7b-f"
llm_model_version = "3.*"
model_llm = JumpStartModel(model_id=llm_model_id, model_version=llm_model_version)
try:
    predictor_llm = model_llm.deploy(accept_eula=accept_eula)
except Exception as e:
    print(e)

6. Next, define the instruction template that Llama Guard uses to check conversation safety. The template explicitly lists the unsafe content categories.

TASK = "Task: Check if there is unsafe content in 'Agent' messages in conversations, according to our safety policy with the following categories."

INSTRUCTION = """\
Provide your safety assessment for Agent in the previous conversation:
- The first line must read 'safe' or 'unsafe'.
- If unsafe, a second line must include a comma-separated list of violated categories."""

UNSAFE_CONTENT_CATEGORIES = """\
O1: Violence and Hate. 
Should not
- Help people plan or engage in violence.
- Encourage people to engage in violence.
- Express hateful, derogatory, or demeaning sentiments against people based on sensitive personal characteristics such as their race, color, religion, national origin, sexual orientation, gender, gender identity, or disability.
- Encourage discrimination against people based on those characteristics and should not use slurs or create insults based on those characteristics.
Can
- Provide information on violence and discrimination.
- Discuss the topics of hate and violence, and can discuss historical events involving violence.
O2: Sexual Content.
Should not
- Engage in sexually explicit (that is, erotic) conversations and should not encourage people to engage in sexual activities.
Can
- Discuss topics related to sexuality and sexual education.
- Provide general information regarding sexual activities.
- Provide guidance on topics related to sexual health.
O3: Criminal Planning.
Should not
- Help people plan or engage in criminal activities such as theft, kidnapping, or financial crimes, especially when the person indicates that they might act on the information provided to them.
- Encourage people to engage in criminal activities or condone criminal behavior.
Can
- Help people understand how crimes are committed, provided that the response is not so detailed that it could be used to provide a roadmap for future crimes.
- Help people understand how systems work that are used to prevent crimes, provided those descriptions do not provide detailed information on how to defeat those systems.
O4: Guns and Illegal Weapons.
Should not
- Assist in the planning or execution of firearm or illegal weapons crimes, especially when the person indicates that they might act on the information provided to them.
- Encourage people to engage in firearm or illegal weapons crimes or condone such behavior.
Can
- Discuss firearms and the arguments for and against firearm ownership.
O5: Regulated or Controlled Substances.
Should not
- Encourage or assist people in the trafficking, consumption, or use of regulated or controlled substances, including but not limited to illegal drugs, tobacco, alcohol, and cannabis.
- Assist or encourage people to create such substances when it is illegal to do so.
Can
- Discuss the use of regulated or controlled substances, and can provide general information on those substances (for example, what their chemical composition is and what their effects are).
- Provide information on how those substances are manufactured, provided that the information is not so detailed as to provide step-by-step instructions for their creation.
O6: Self-Harm.
Should not
- Encourage people to harm themselves.
- Romanticize or condone self-harm or suicide.
- Provide information regarding the methods of suicide or self-harm.
- Provide any information or advice, other than information regarding appropriate health resources, to people who express or imply that they wish to harm themselves.
Can
- Provide general information regarding self-harm and suicide (for example, statistics and the mental health challenges faced by those who take these actions).
Should
- Provide information regarding appropriate health resources to people who imply or express the desire to harm themselves."""

7. Define two functions, "format_chat_messages" and "format_guard_messages", to build the prompt format each model expects.

from itertools import cycle
from typing import Dict, List


def format_chat_messages(messages: List[Dict[str, str]]) -> str:
    """Format messages for Llama-2 chat models.

    The model only supports 'system', 'user', and 'assistant' roles, starting with 'system', then 'user' and
    alternating (u/a/u/a/u...). The last message must be from 'user'.
    """
    prompt: List[str] = []

    if messages[0]["role"] == "system":
        content = "".join(
            ["<<SYS>>\n", messages[0]["content"], "\n<</SYS>>\n\n", messages[1]["content"]]
        )
        messages = [{"role": messages[1]["role"], "content": content}] + messages[2:]

    for user, answer in zip(messages[::2], messages[1::2]):
        prompt.extend(
            [
                "<s>",
                "[INST] ",
                (user["content"]).strip(),
                " [/INST] ",
                (answer["content"]).strip(),
                "</s>",
            ]
        )

    prompt.extend(["<s>", "[INST] ", (messages[-1]["content"]).strip(), " [/INST] "])

    return "".join(prompt)


def format_guard_messages(
    messages: List[Dict[str, str]],
    task: str = TASK,
    instruction: str = INSTRUCTION,
    unsafe_content_categories: str = UNSAFE_CONTENT_CATEGORIES,
) -> str:
    """Format messages for Llama Guard models.

    The model only supports 'user' and 'assistant' roles, and must alternate (u/a/u/a/u...).
    """
    turn = cycle(["User", "Agent"])

    return "".join(
        [
            "[INST] ",
            task,
            "\n\n<BEGIN UNSAFE CONTENT CATEGORIES>\n",
            unsafe_content_categories,
            "\n<END UNSAFE CONTENT CATEGORIES>\n\n<BEGIN CONVERSATION>",
            *[f"\n\n{next(turn)}: {message['content']}" for message in messages],
            "\n\n<END CONVERSATION>\n\n",
            instruction,
            " [/INST]",
        ]
    )

8. Next, wrap a conversation in the standard guard format as "payload_input_guard" and send it to the Llama Guard model for evaluation.

messages_input = [
    {"role": "user", "content": "I forgot how to kill a process in Linux, can you help?"}
]
payload_input_guard = {"inputs": format_guard_messages(messages_input)}

try:
    response_input_guard = predictor.predict(payload_input_guard)
    print(response_input_guard)
except Exception as e:
    print(e)
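The guard endpoint returns free text, so downstream code usually needs a structured verdict. The following is a minimal sketch of such a parser, assuming the output follows the format requested in INSTRUCTION (first line 'safe' or 'unsafe', optional second line with comma-separated categories). The function name parse_guard_output is hypothetical and not part of the original notebook.

```python
from typing import List, Tuple


def parse_guard_output(generated_text: str) -> Tuple[bool, List[str]]:
    """Split a Llama Guard verdict into (is_safe, violated_categories)."""
    lines = [line.strip() for line in generated_text.strip().splitlines() if line.strip()]
    if not lines:
        # Assumption: an empty verdict is treated as unsafe so failures do not pass silently.
        return False, []
    is_safe = lines[0].lower() == "safe"
    categories: List[str] = []
    if not is_safe and len(lines) > 1:
        # The second line is expected to look like "O3,O4".
        categories = [c.strip() for c in lines[1].split(",") if c.strip()]
    return is_safe, categories


print(parse_guard_output(" safe"))
print(parse_guard_output("unsafe\nO3, O4"))
```

In the notebook, this could be applied to response_input_guard[0]["generated_text"], assuming the endpoint returns a list of dictionaries with a generated_text key as in the test loop from step 4.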

9. Llama Guard replies "safe", which means the conversation is safe. We can now send the conversation to the Llama model to generate a reply.

payload_input_llm = {
    "inputs": format_chat_messages(messages_input),
    "parameters": {"max_new_tokens": 128},
}

try:
    response_llm = predictor_llm.predict(payload_input_llm)
    print(response_llm)
except Exception as e:
    print(e)

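10. Then append the reply generated by the Llama model to the conversation and send it back to Llama Guard for a safety check, so that both the input and the output are verified as compliant and safe.

# Add the model's reply as the Agent turn, then ask Llama Guard to assess it.
messages_output = messages_input + [
    {"role": "assistant", "content": response_llm[0]["generated_text"]}
]
payload_output_guard = {"inputs": format_guard_messages(messages_output)}

try:
    response_output_guard = predictor.predict(payload_output_guard)
    print(response_output_guard)
except Exception as e:
    print(e)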
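The input check, generation, and output check can be combined into one guarded call. The sketch below shows one possible shape for that flow. The names guarded_reply, is_safe, and REFUSAL are hypothetical, and the two models are passed in as plain callables so the logic can be exercised without live endpoints.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]
REFUSAL = "Sorry, I can't help with that request."  # hypothetical fallback text


def is_safe(verdict: str) -> bool:
    """True only when the first line of the guard verdict is exactly 'safe'."""
    first_line = verdict.strip().split("\n", 1)[0].strip().lower()
    return first_line == "safe"


def guarded_reply(
    messages: List[Message],
    guard: Callable[[List[Message]], str],
    chat: Callable[[List[Message]], str],
) -> str:
    """Return the chat reply only if both the input and the output pass the guard."""
    # 1. Screen the user input before it reaches the chat model.
    if not is_safe(guard(messages)):
        return REFUSAL
    # 2. Generate a reply.
    reply = chat(messages)
    # 3. Screen the conversation again, now including the reply as the Agent turn.
    if not is_safe(guard(messages + [{"role": "assistant", "content": reply}])):
        return REFUSAL
    return reply


# Stand-in callables for a dry run; real code would call the two SageMaker endpoints.
print(guarded_reply([{"role": "user", "content": "hi"}], lambda m: " safe", lambda m: "hello"))
```

With the notebook's objects, guard could be a function that returns predictor.predict({"inputs": format_guard_messages(m)})[0]["generated_text"], and chat a similar wrapper around predictor_llm. Skipping the chat call when the input is unsafe also avoids paying for generation that would be discarded.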

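11. Next, open the Lambda console, create a new Lambda function named "guard_chat_function", and paste in the following code. The function calls both model endpoints through the API: Llama Guard to check the content for compliance and Llama to generate the reply.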

import os
import boto3
import json
from itertools import cycle
from typing import Dict, List
import uuid


TASK = "Task: Check if there is unsafe content in 'Agent' messages in conversations according our safety policy with the below categories."

INSTRUCTION = """Provide your safety assessment for Agent in the above conversation:
- First line must read 'safe' or 'unsafe'.
- If unsafe, a second line must include a comma-separated list of violated categories."""

# read unsafe_categories from unsafe_categories.txt
with open("unsafe_categories.txt", "r") as f:
    UNSAFE_CONTENT_CATEGORIES = f.read()


def format_chat_messages(messages: List[Dict[str, str]]) -> str:
    """Format messages for Llama-2 chat models.

    The model only supports 'system', 'user' and 'assistant' roles, starting with 'system', then 'user' and
    alternating (u/a/u/a/u...). The last message must be from 'user'.
    """
    prompt: List[str] = []

    if messages[0]["role"] == "system":
        # Fold the system prompt into the first user turn, as the Llama 2 chat format expects.
        content = "".join(
            ["<<SYS>>\n", messages[0]["content"], "\n<</SYS>>\n\n", messages[1]["content"]]
        )
        messages = [{"role": messages[1]["role"], "content": content}] + messages[2:]

    for user, answer in zip(messages[::2], messages[1::2]):
        prompt.extend(
            [
                "<s>",
                "[INST] ",
                (user["content"]).strip(),
                " [/INST] ",
                (answer["content"]).strip(),
                "</s>",
            ]
        )

    prompt.extend(["<s>", "[INST] ", (messages[-1]["content"]).strip(), " [/INST] "])

    return "".join(prompt)


def format_guard_messages(
    messages: List[Dict[str, str]],
    task: str = TASK,
    instruction: str = INSTRUCTION,
    unsafe_content_categories: str = UNSAFE_CONTENT_CATEGORIES,
) -> str:
    """Format messages for Llama Guard models.

    The model only supports 'user' and 'assistant' roles, and must alternate (u/a/u/a/u...).
    """
    turn = cycle(["User", "Agent"])

    return "".join(
        [
            "[INST] ",
            task,
            "\n\n<BEGIN UNSAFE CONTENT CATEGORIES>",
            unsafe_content_categories,
            "\n<END UNSAFE CONTENT CATEGORIES>\n\n<BEGIN CONVERSATION>",
            *[f"\n\n{next(turn)}: {message['content']}" for message in messages],
            "\n\n<END CONVERSATION>\n\n",
            instruction,
            " [/INST]",
        ]
    )

def lambda_handler(event, context):
    
    random_id = str(uuid.uuid4())
   
    # Get the SageMaker endpoint names from environment variables
    endpoint1_name = os.environ['GUARD_END_POINT']
    endpoint2_name = os.environ['CHAT_END_POINT']

    # Create a SageMaker client
    sagemaker = boto3.client('sagemaker-runtime')
    print(event)
    
    messages_input = [{  
            "role": "user", 
            "content": event['prompt']
        }]
    payload_input_guard = {"inputs": format_guard_messages(messages_input)}

    # Invoke the first SageMaker endpoint
    guard_resp = sagemaker.invoke_endpoint(
        EndpointName=endpoint1_name,
        ContentType='application/json',
        Body=json.dumps(payload_input_guard)
    )
    guard_result = guard_resp['Body'].read().decode('utf-8')
    for item in json.loads(guard_result):  
        guard_result=item['generated_text']

    payload_input_llm = {
    "inputs": format_chat_messages(messages_input),
    "parameters": {"max_new_tokens": 128},
    }
    # Invoke the second SageMaker endpoint
    chat_resp = sagemaker.invoke_endpoint(
        EndpointName=endpoint2_name,
        ContentType='application/json',
        Body=json.dumps(payload_input_llm)
    )
    
    chat_result = chat_resp['Body'].read().decode('utf-8')
    for item in json.loads(chat_result):  
        chat_result=item['generated_text']
    

    # store chat history
    dynamodb = boto3.client("dynamodb")
    dynamodb.put_item(
        TableName='chat_history',
        Item={
             "prompt_id": {'S': random_id},
             "prompt_content": {'S': event['prompt']},
             "guard_resp": {'S' : guard_result},
             "chat_resp": {'S': chat_result}
            })
            
    # DIY section - Add unsafe responses to the bad_prompts table
    

    # Return the results
    return {
        'Llama-Guard-Output' : guard_result,
        'Llama-Chat-Output' : chat_result 
    }
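The "DIY section" comment in the handler leaves the bad_prompts write as an exercise. One way to approach it is sketched below, assuming the bad_prompts table uses the same prompt_id key as chat_history. The helper name build_bad_prompt_item and the violated_categories attribute are hypothetical.

```python
from typing import Dict, Optional


def build_bad_prompt_item(
    prompt_id: str, prompt: str, guard_result: str
) -> Optional[Dict[str, Dict[str, str]]]:
    """Return a DynamoDB item for an explicit 'unsafe' verdict, otherwise None."""
    lines = [line.strip() for line in guard_result.strip().splitlines() if line.strip()]
    if not lines or lines[0].lower() != "unsafe":
        return None
    item = {
        "prompt_id": {"S": prompt_id},
        "prompt_content": {"S": prompt},
        "guard_resp": {"S": guard_result.strip()},
    }
    if len(lines) > 1:
        # Hypothetical attribute holding the comma-separated category codes, e.g. "O3,O4".
        item["violated_categories"] = {"S": lines[1]}
    return item


# Inside lambda_handler, after guard_result is known, the item could be written like this:
#   item = build_bad_prompt_item(random_id, event["prompt"], guard_result)
#   if item is not None:
#       dynamodb.put_item(TableName="bad_prompts", Item=item)
print(build_bad_prompt_item("id-1", "example prompt", "unsafe\nO3"))
```

Keeping the item construction in a pure function makes it easy to test locally before wiring it to DynamoDB.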

12. Next, open the CodeBuild console, create a container build project, and start the build. The buildspec is as follows:

{
  "version": "0.2",
  "phases": {
    "pre_build": {
      "commands": [
        "echo 'Downloading container image from S3 bucket'",
        "aws s3 cp s3://lab-code-3a7cca20/Dockerfile .",
        "aws s3 cp s3://lab-code-3a7cca20/requirements.txt .",
        "aws s3 cp s3://lab-code-3a7cca20/app.py ."
      ]
    },
    "build": {
      "commands": [
        "echo 'Loading container image'",
        "aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin 755119157746.dkr.ecr.us-east-1.amazonaws.com",
        "docker build -t streamlit-container-image .",
        "echo 'Tagging and pushing container image to ECR'",
        "docker tag streamlit-container-image:latest 755119157746.dkr.ecr.us-east-1.amazonaws.com/streamlit-repo:latest",
        "docker push 755119157746.dkr.ecr.us-east-1.amazonaws.com/streamlit-repo:latest"
      ]
    }
  },
  "artifacts": {
    "base-directory": ".",
    "files": [
      "Dockerfile"
    ]
  }
}

13. This CodeBuild project packages a Streamlit application into a container image and pushes it to the ECR repository.
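The contents of app.py are not shown in this post. As a rough idea of how a front end such as the Streamlit app might reach the models, the sketch below invokes the Lambda function synchronously and parses its JSON result. The helper name ask_assistant is hypothetical, and the client is passed in so the function can be tried without AWS credentials.

```python
import json
from typing import Any, Dict


def ask_assistant(
    lambda_client: Any, prompt: str, function_name: str = "guard_chat_function"
) -> Dict[str, str]:
    """Invoke the Lambda function synchronously and return its parsed JSON result."""
    response = lambda_client.invoke(
        FunctionName=function_name,
        InvocationType="RequestResponse",  # wait for the function's return value
        Payload=json.dumps({"prompt": prompt}).encode("utf-8"),
    )
    # The response payload is a stream; read it fully before decoding the JSON.
    return json.loads(response["Payload"].read())


# Real usage (requires boto3 and IAM permission to invoke the function):
#   import boto3
#   result = ask_assistant(boto3.client("lambda"), "How do I kill a Linux process?")
#   print(result["Llama-Guard-Output"], result["Llama-Chat-Output"])
```

The task role attached to the ECS task would need permission to invoke the function for this call to succeed.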

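14. Next, open the ECS console and create a task definition, which serves as the launch template for the container service, based on the following JSON: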

{
    "taskDefinitionArn": "arn:aws:ecs:us-east-1:755119157746:task-definition/streamlit-task-definition:3",
    "containerDefinitions": [
        {
            "name": "StreamlitContainer",
            "image": "755119157746.dkr.ecr.us-east-1.amazonaws.com/streamlit-repo:latest",
            "cpu": 0,
            "links": [],
            "portMappings": [
                {
                    "containerPort": 8501,
                    "hostPort": 8501,
                    "protocol": "tcp"
                }
            ],
            "essential": true,
            "entryPoint": [],
            "command": [],
            "environment": [],
            "environmentFiles": [],
            "mountPoints": [],
            "volumesFrom": [],
            "secrets": [],
            "dnsServers": [],
            "dnsSearchDomains": [],
            "extraHosts": [],
            "dockerSecurityOptions": [],
            "dockerLabels": {},
            "ulimits": [],
            "systemControls": [],
            "credentialSpecs": []
        }
    ],
    "family": "streamlit-task-definition",
    "taskRoleArn": "arn:aws:iam::755119157746:role/ecs_cluster_role",
    "executionRoleArn": "arn:aws:iam::755119157746:role/ecs_cluster_role",
    "networkMode": "awsvpc",
    "revision": 3,
    "volumes": [],
    "status": "ACTIVE",
    "requiresAttributes": [
        {
            "name": "com.amazonaws.ecs.capability.ecr-auth"
        },
        {
            "name": "com.amazonaws.ecs.capability.docker-remote-api.1.17"
        },
        {
            "name": "com.amazonaws.ecs.capability.task-iam-role"
        },
        {
            "name": "ecs.capability.execution-role-ecr-pull"
        },
        {
            "name": "com.amazonaws.ecs.capability.docker-remote-api.1.18"
        },
        {
            "name": "ecs.capability.task-eni"
        }
    ],
    "placementConstraints": [],
    "compatibilities": [
        "EC2",
        "FARGATE"
    ],
    "requiresCompatibilities": [
        "FARGATE"
    ],
    "cpu": "512",
    "memory": "2048",
    "runtimePlatform": {
        "cpuArchitecture": "X86_64",
        "operatingSystemFamily": "LINUX"
    },
    "registeredAt": "2024-08-16T02:21:48.902Z",
    "registeredBy": "arn:aws:sts::755119157746:assumed-role/AWSLabs-Provisioner-v2-CjDTNtCaQDT/LPS-States-CreateStack",
    "tags": []
}
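The JSON above is in the shape returned when describing an existing task definition, so it includes fields such as taskDefinitionArn, revision, and registeredAt that are set by ECS and are not accepted as input when registering a new one. The sketch below strips those fields before reuse. The helper name to_register_input is hypothetical, and dropping an empty tags list is an assumption worth verifying in your own account.

```python
from typing import Any, Dict

# Fields that ECS fills in on output and that registration does not accept as input.
READ_ONLY_FIELDS = (
    "taskDefinitionArn",
    "revision",
    "status",
    "requiresAttributes",
    "compatibilities",
    "registeredAt",
    "registeredBy",
)


def to_register_input(described: Dict[str, Any]) -> Dict[str, Any]:
    """Keep only the keys that can be passed back when registering a task definition."""
    cleaned = {k: v for k, v in described.items() if k not in READ_ONLY_FIELDS}
    # Assumption: an empty "tags" list is dropped too, since the API may reject it.
    if not cleaned.get("tags"):
        cleaned.pop("tags", None)
    return cleaned


example = {
    "family": "streamlit-task-definition",
    "revision": 3,
    "status": "ACTIVE",
    "cpu": "512",
    "tags": [],
}
print(to_register_input(example))
# With boto3, the cleaned dictionary could then be passed as keyword arguments:
#   boto3.client("ecs").register_task_definition(**to_register_input(task_def))
```

The account ID, role ARNs, and image URI in the JSON belong to the original lab environment and need to be replaced with your own values.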

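15. Next, create an ECS cluster named "Streamlit-cluster" and create a Streamlit service in it.

16. Set the launch type of the ECS service to Fargate, name the service streamlitservice, select the task definition created earlier, "streamlit-task-definition", and set the number of tasks to 1.

17. Choose the VPC and subnets where the service will run, and configure its security group.

18. Add an Application Load Balancer named "streamlit-lb" to the ECS service to provide high availability for the backend.

19. Add an HTTP listener on port 80, add a target group for the service's tasks, and click Create.

20. Open the URL exposed by the Application Load Balancer to reach the web page of the ECS service.

21. Now test the application: enter the question "How do I kill a Linux process?" to check whether the content is compliant and safe.

22. Finally, the page shows the model's reply to the question together with the Llama Guard verdict, which confirms that both the question and the output are safe and compliant.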
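To let users review past conversations, the stored records can be read back from the chat_history table. The sketch below flattens items in DynamoDB's typed format into plain dictionaries, assuming every attribute is stored as a string ({'S': ...}) as in the Lambda function's put_item call. The helper name unwrap_items is hypothetical.

```python
from typing import Dict, List


def unwrap_items(items: List[Dict[str, Dict[str, str]]]) -> List[Dict[str, str]]:
    """Flatten DynamoDB string attributes ({'S': value}) into plain dictionaries."""
    return [
        {name: attr["S"] for name, attr in item.items() if "S" in attr}
        for item in items
    ]


sample = [
    {
        "prompt_id": {"S": "abc"},
        "prompt_content": {"S": "How do I kill a Linux process?"},
        "guard_resp": {"S": " safe"},
        "chat_resp": {"S": "Use the kill command."},
    }
]
print(unwrap_items(sample))
# With boto3, the items could come from a table scan (paginate for large tables):
#   items = boto3.client("dynamodb").scan(TableName="chat_history")["Items"]
```

A scan reads the whole table, so for anything beyond a demo a query against a per-user key would be a better fit.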

 

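That covers all the steps for building a safe, compliant, and responsible AI lifestyle assistant on AWS with Llama Guard. Follow along for more cutting-edge generative AI development solutions in future posts.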
