5013. Large Models - Calling the API - Multimodal Calls

OpenAI API official documentation
Tongyi Qianwen (Qwen) API reference

Test that the connection works

curl https://api.openai.com/v1/models \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "OpenAI-Organization: YOUR_ORG_ID" \
  -H "OpenAI-Project: $PROJECT_ID"
from openai import OpenAI

client = OpenAI(
    api_key=api_key,              # required
    base_url=base_url,            # required
    organization='YOUR_ORG_ID',   # optional
    project='$PROJECT_ID',        # optional
)

Example of sending an API request

curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "Say this is a test!"}],
    "temperature": 0.7
  }'

Here the URL is the address of the deployed service, the Content-Type header declares the data format, model selects the model to call, messages carries the conversation content, and temperature controls sampling randomness. Note that inline # comments cannot be placed in this command: they would break the shell line continuations and make the JSON payload invalid.

After sending the above request to the model, you will receive a complete JSON response in the following format:

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1677858242,
  "model": "gpt-4o-mini",
  "usage": {
    "prompt_tokens": 13,
    "completion_tokens": 7,
    "total_tokens": 20,
    "completion_tokens_details": {
      "reasoning_tokens": 0,
      "accepted_prediction_tokens": 0,
      "rejected_prediction_tokens": 0
    }
  },
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "\n\nThis is a test!"
      },
      "logprobs": null,
      "finish_reason": "stop",
      "index": 0
    }
  ]
}
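Once that response has been parsed into a Python dict, the useful fields can be pulled out directly. A small sketch; the helper name is illustrative:

```python
def extract_reply(resp: dict) -> tuple[str, int]:
    """Return the assistant's message text and the total token count."""
    message = resp["choices"][0]["message"]["content"].strip()
    total_tokens = resp["usage"]["total_tokens"]
    return message, total_tokens


# A trimmed-down version of the response shown above
resp = {
    "usage": {"total_tokens": 20},
    "choices": [{"message": {"role": "assistant",
                             "content": "\n\nThis is a test!"}}],
}
print(extract_reply(resp))  # ('This is a test!', 20)
```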

Chat

Creating a conversation

from openai import OpenAI

client = OpenAI()

completion = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "developer", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"}
    ]
)

print(completion.choices[0].message)
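For multi-turn chat, the messages list carries the whole history, and each exchange is appended before the next call. A small sketch of that bookkeeping; the helper is illustrative, not part of the SDK:

```python
def add_turn(history: list, user_text: str, assistant_text: str) -> list:
    """Append one user/assistant exchange to the running message history."""
    history.append({"role": "user", "content": user_text})
    history.append({"role": "assistant", "content": assistant_text})
    return history


history = [{"role": "developer", "content": "You are a helpful assistant."}]
add_turn(history, "Hello!", "Hi! How can I help?")
# The next request would send `history` plus the new user message.
print(len(history))  # 3
```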

Conversation with an image upload

from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What's in this image?"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg",
                    }
                },
            ],
        }
    ],
    max_tokens=300,
)

print(response.choices[0])
curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "text",
            "text": "What'\''s in this image?"
          },
          {
            "type": "image_url",
            "image_url": {
              "url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"
            }
          }
        ]
      }
    ],
    "max_tokens": 300
  }'
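Besides a public URL, the image_url field also accepts a base64-encoded data URL, which is how a local image file is sent. A minimal sketch of building one; the helper name is illustrative:

```python
import base64


def image_data_url(image_bytes: bytes, mime: str = "image/jpeg") -> str:
    """Encode raw image bytes as a data URL usable in image_url.url."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{b64}"


# Usage: read a local file and pass the result in place of the https URL
# with open("photo.jpg", "rb") as f:
#     url = image_data_url(f.read())
print(image_data_url(b"abc"))  # data:image/jpeg;base64,YWJj
```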

Using audio with the API

Text to speech

from pathlib import Path
from openai import OpenAI

client = OpenAI()

speech_file_path = Path(__file__).parent / "speech.mp3"
response = client.audio.speech.create(
    model="tts-1",
    voice="alloy",
    input="The quick brown fox jumped over the lazy dog."
)
# Write the audio to disk (deprecated in newer SDK versions in favor of
# client.audio.speech.with_streaming_response)
response.stream_to_file(speech_file_path)

Parameter descriptions

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | Required | One of the available TTS models: tts-1 or tts-1-hd |
| input | string | Required | The text to generate audio for. The maximum length is 4096 characters. |
| voice | string | Required | The voice to use when generating the audio. Supported voices are alloy, ash, coral, echo, fable, onyx, nova, sage and shimmer. Previews of the voices are available in the Text to speech guide. |
| response_format | string | Optional | Defaults to mp3. The format of the generated audio. Supported formats are mp3, opus, aac, flac, wav, and pcm. |
| speed | number | Optional | Defaults to 1. The speed of the generated audio. Select a value from 0.25 to 4.0. |
| Returns | - | - | The audio file content. |
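The limits in the table can be checked client-side before spending a request. A small sketch under those documented limits; the function is illustrative:

```python
def validate_tts_request(text: str, speed: float = 1.0) -> None:
    """Raise ValueError if the request violates the documented TTS limits."""
    if not text:
        raise ValueError("input must be non-empty")
    if len(text) > 4096:
        raise ValueError("input exceeds the 4096-character limit")
    if not 0.25 <= speed <= 4.0:
        raise ValueError("speed must be between 0.25 and 4.0")


validate_tts_request("The quick brown fox jumped over the lazy dog.")  # ok
```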

Speech to text

from openai import OpenAI

client = OpenAI()

audio_file = open("speech.mp3", "rb")
transcript = client.audio.transcriptions.create(
    model="whisper-1",
    file=audio_file
)
print(transcript.text)

Translation (into English)

from openai import OpenAI

client = OpenAI()

audio_file = open("speech.mp3", "rb")
translation = client.audio.translations.create(
    model="whisper-1",
    file=audio_file
)
print(translation.text)