GPT-4 との対話を通して EC2 インスタンスを起動する様子

はじめに

本ブログでは OpenAI の提供する Chat Completion API の Function Calling を使用して、対話形式で EC2 インスタンスの起動に必要なパラメータを抽出する方法について紹介します。冒頭の Gif では、以下の対話が行われています。

■ User prompt: I want to run a new EC2 instance.
□ Assistant prompt: Sure, I can help you with that. Can you please let me know the following details?

1. The AMI ID. This is the ID of the Amazon Resource Image that you want to use.
2. The Instance type. Available options are: `t2.micro`, `t2.small`, `t3.micro` & `t3.small`.
3. The name of the key pair that will be used to connect this EC2 instance.

■ User prompt: AMI ID is ami-012261b9035f8f938. Instance type is t2.micro.
□ Assistant prompt: Got it. To proceed, I would also need the name of the key pair that will be used to connect to this EC2 instance. Could you please provide that?

■ User prompt: kondo

一連の会話から、GPT が必要情報を抽出できたと判断した場合、EC2 インスタンスを起動するための確認プロンプトを表示します。これは GPT からの返答ではなく、必要パラメータを反映させて Python 上で構成したプロンプトです。

You are trying to start an instance.
Below is information about the instance you are trying to launch.
--------------------------------
AMI ID: ami-012261b9035f8f938
Instance type: t2.micro
Kay pair name: kondo
--------------------------------
proceed? (y/n): y

ユーザーが y を入力しエンターを押下すると、EC2 (boto3) の run_instances API が実行されるといった流れです。

How to call functions with chat models

Function Calling とは

いきなりですが、OpenAI のドキュメント Function calling にも以下の記載があるように、Function Calling は特定の関数を AI が呼び出す機能ではありません。

In an API call, you can describe functions and have the model intelligently choose to output a JSON object containing arguments to call one or many functions. The Chat Completions API does not call the function; instead, the model generates JSON that you can use to call the function in your code.

API 呼び出しでは、関数を記述して、1つまたは複数の関数を呼び出すための引数を含む JSON オブジェクトを出力するようにモデルにインテリジェントに選択させることができます。Chat Completions API は関数を呼び出しません。代わりに、モデルはあなたのコードで関数を呼び出すために使用できる JSON を生成します。

Function Calling は、呼び出したい関数で必要となるパラメータを GPT が抽出してくれる機能です。

実装

ライブラリのインポートと openai クライアントの定義

外部ライブラリは boto3 と openai のみ使用します。また、OpenAI の API を実行するクライアントを openai として定義します。

import json
import os
  
import boto3
from openai import OpenAI
 
 
openai = OpenAI()

対話を行う chat 関数

以下の chat 関数を定義し、ユーザーとの対話を行います。

def chat(messages, tools=None, tool_choice=None, model=os.getenv('CHAT_MODEL')):
    # 1. Chat completion API に渡す引数を作成する
    arguments = {
        'model': model,
        'messages': messages
    }
    if tools is not None:
        arguments['tools'] = tools
    if tool_choice is not None:
        arguments['tool_choice'] = tool_choice
 
    # 2. ユーザーの初回の投稿を送信する
    response = chat_completion(arguments)
    assistant_message = response.choices[0].message
 
    # 3. tool_calls に値が設定されるまで（API に渡す引数が確定するまで）ユーザーと対話する
    tool_calls = assistant_message.tool_calls
    assistant_content = assistant_message.content
    while tool_calls is None:
        print(f"□ Assistant prompt: {assistant_content}")
        messages.append({
            'role': 'assistant',
            'content': assistant_message.content
        })
 
        user_input = input('■ User prompt: ')
        messages.append({
            'role': 'user',
            'content': user_input
        })
        response = chat_completion(arguments)
        assistant_message = response.choices[0].message
        assistant_content = response.choices[0].message.content
        tool_calls = response.choices[0].message.tool_calls    
 
    # 4. tool_calls が設定されたら、ユーザーに確認を求める
    tool_calls = response.choices[0].message.tool_calls
    tool_calls_arguments = json.loads(
        assistant_message.tool_calls[0].function.arguments)
    user_input = input(
        f"\nYou are trying to start an instance.\n"
        f"Below is information about the instance you are trying to launch.\n"
        f"--------------------------------\n"
        f"AMI ID: {tool_calls_arguments['image_id']}\n"
        f"Instance type: {tool_calls_arguments['instance_type']}\n"
        f"Kay pair name: {tool_calls_arguments['key_name']}\n"
        f"--------------------------------\n"
        f"proceed? (y/n): "
    )
 
    # 5. ユーザーの了承後、EC2 インスタンスを起動する
    if user_input == 'y':
        ec2 = boto3.client('ec2', region_name=os.getenv('AWS_REGION'))
        response = ec2.run_instances(
            ImageId=tool_calls_arguments['image_id'],
            InstanceType=tool_calls_arguments['instance_type'],
            KeyName=tool_calls_arguments['key_name'],
            MinCount=1,
            MaxCount=1
        )
        print(response)

1. Chat completion API に渡す引数を作成する

Chat completion を使用する際に必要なパラメータとして以下を定義しています。tools, tool_choice に関してはオプションのパラメータです。関数の入力に左記パラメータが存在すれば、これを arguments に設定します。

パラメータ	概要
model	使用するモデルの ID
messages	これまでの会話を構成するメッセージのリスト
tools	モデルが呼び出すツールのリスト。現時点（2023.11.17）、ツールとしては関数のみがサポートされる
tool_choice	モデルによって呼び出される関数を指定

Chat completion で使用されるパラメータについては API reference Create chat completion を参照するようにしてください。

2. ユーザーの初回の投稿を送信する

Chat Completion API を実行するプログラムは、後述（Chat Completion API を実行する関数の定義）の chat_completion で共通化しています。ここでは chat_completion を呼び出し、ユーザーの初回の投稿を行います。また、GPT からの返答を assistant_message として定義します。assistant_message には単に返答文字列だけではなく、Function Calling の結果も含まれます。レスポンスオブジェクトの詳細については The chat completion object を参照するようにしてください。

3. tool_calls に値が設定されるまで（API に渡す引数が確定するまで）ユーザーと対話する

GPT のレスポンスオブジェクトには、tool_calls が含まれます。tool_calls は Function Calling が完了した場合に、実行対象として定義する関数名やその引数のオブジェクトが含まれます。Function Calling が未完了の状態では None が設定されます。

while 文の箇所では、Function Calling が完了状態となるまでユーザーと対話を継続する処理が記述されています。重ねての記述になりますが、Function Calling は特定の関数を実行するのではありません。「完了状態」とは関数を実行するために必要な引数が得られた、と GPT が判断した状態です。

4. tool_calls が設定されたら、ユーザーに確認を求める

ここでは Function Calling で得られたパラメータを用いて、特定の関数を実行するか否かの最終確認を行います。GPT が必ず正しい引数を設定する保証がないため、重要度の高い処理を実行する際に人間の目による確認を差し込んでいます。

5. ユーザーの了承後、EC2 インスタンスを起動する

ユーザーの確認を取ったのち、EC2 インスタンスを起動します。前項のプログラムで引数を格納した tool_calls_arguments から、インスタンス起動に必要な情報（AMI Image ID, インスタンスタイプ, キーペア名）を抽出して run_instances を実行します。

Chat Completion API を実行する関数の定義

Chat Completion API を呼び出すプログラムを chat_completion 関数として共通化しています。

def chat_completion(arguments):
    try:
        response = openai.chat.completions.create(**arguments)
        return response
    except Exception as e:
        print('Unable to generate ChatCompletion response')
        print(f'Exception: {e}')
        return e

chat 関数の実行

if __name__ == '__main__':
    # 1. ユーザーのプロンプトを messages に格納する
    user_prompt = input('■ User prompt: ')

    messages = []
    messages.append({
        'role': 'system',
        'content': "Don't make assumptions about what values to plug into functions. Ask for clarification if a user request is ambiguous."
    })
    messages.append({
        'role': 'user',
        'content': user_prompt
    })
 
    # 2. Function Calling に関数を設定する
    tools = []
    run_instances = {
        'type': 'function',
        'function': {
            'name': 'run_instances',
            'description': 'Run a new EC2 instance.',
            'parameters': {
                'type': 'object',
                'properties': {
                    'image_id': {
                        'type': 'string',
                        'description': 'The ID of the Amazon Resource Image (AMI). An AMI ID is required to launch an instance and must be specified here or in a launch template.'
                    },
                    'instance_type': {
                        'type': 'string',
                        'description': 'Instance family defined by AWS.',
                        'enum': ['t2.micro', 't2.small', 't3.micro', 't3.small']
                    },
                    'key_name': {
                        'type': 'string',
                        'description': 'The name of the non-empty key pair that will connect to the EC2 instance created here.'
                    }
                },
                'required': ['image_id', 'instance_type', 'key_name']
            },
        }
    }
    tools.append(run_instances)
 
    chat(messages, tools=tools)

1. ユーザーのプロンプトを messages に格納する

Chat Completion API で対話を行うためにメッセージを格納しています。システムプロンプトには以下の文言を記述しています。

Don't make assumptions about what values to plug into functions. Ask for clarification if a user request is ambiguous.

関数にどのような値を差し込むべきかについて、勝手に決めつけないこと。ユーザーの要求があいまいな場合は、説明を求めること。

How to call functions with chat models に記載の内容と同様です。ユーザーが入力していない必要な値を、GPT 側で勝手に作成しないこと、不明瞭な場合はユーザーに確認することを伝えています。

2. Function Calling に関数を設定する

run_instances で定義しているのが Function Calling に設定する関数です。指定のスキーマに従って関数を定義します。基本的にはパラメータ名、パラメータの型、パラメータの説明を記載し、GPT にどのようなパラメータをユーザーに要求するのかを伝えます。required キーを使用することで、必須なパラメータを伝えることもできます。ここではインスタンスの起動に必要な3つのパラメータ全てを必須としています。

さいごに

Function Calling を利用してユーザーとの対話から EC2 インスタンスの起動に必要なパラメータを抽出し、実際に起動してみました。モデルやプロンプトの内容によって、うまく引数が設定されないケースがありました。今回は GPT-4 を利用しており、高確率で意図した振る舞いを見せてくれました。今後はこの辺りの振る舞いについても調査してみたいと思います。