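On December 14, Google announced on its official website that the APIs for Gemini Pro and Gemini Pro Vision are now open for free, with Chinese language support.
Gemini Pro belongs to Gemini, the large language model family Google released a few days earlier. It offers strong performance and low energy consumption, and it can generate text and code, summarize content, perform semantic retrieval, and more. It supports a 32K context window (the next version is expected to be larger) and 38 languages.
Bard, Google's ChatGPT-like chat product, currently runs on the Gemini Pro model.
Gemini Pro Vision is Google's newly released multimodal model. It can recognize images supplied by the user, and its API is likewise free.
Google's developer platform has also published a simple workflow, so without further ado, let's try it out.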
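As with ChatGPT, you need an API key.
Sign in to Google AI Studio, then choose "Create API key in new project".
Then follow the tutorial "Gemini API: Quickstart with Python | Google AI for Developers" to write the code.
At the moment this API key works with both Gemini Pro and Gemini Pro Vision. I tried running the demo locally, but it failed with a connection error.
So we need a workaround: run the demo in an environment that Google itself hosts and authenticates, such as Google Colab. If you don't have access yet, you can sign up for it yourself.
Next, create a new notebook in Google Colab and install the required dependencies.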
! pip3 install --upgrade --user google-cloud-aiplatform
!pip install google-generativeai==0.3.1
!pip install python-dotenv
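Because the install above pins google-generativeai to 0.3.1, it can be worth confirming which version the runtime actually picked up before going further. The following is a minimal sketch using only the standard library; the helper name installed_version is made up for this post and is not part of any Google SDK.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Should match the pinned version if the install above succeeded.
    print(installed_version("google-generativeai"))
```

If this prints None, the package is not visible to the current interpreter and the install cell needs another look.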
Then authenticate.
import sys

# Additional authentication is required for Google Colab
if "google.colab" in sys.modules:
    # Authenticate user to Google Cloud
    from google.colab import auth
    auth.authenticate_user()
Configure the API key.
import pathlib
import textwrap
import google.generativeai as genai
# Replace the placeholder with your own key; never publish a real key
genai.configure(api_key='your api key')
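Hardcoding a key in a notebook makes it easy to leak when the notebook is shared. As a hedged alternative, the sketch below reads the key from the GEMINI_API_KEY environment variable (the name used elsewhere in this post). The helper load_api_key and its env parameter are illustrative and not part of google-generativeai.

```python
import os


def load_api_key(env=None, var_name="GEMINI_API_KEY"):
    """Return the API key stored under `var_name`, or raise if it is unset or blank."""
    env = os.environ if env is None else env
    key = (env.get(var_name) or "").strip()
    if not key:
        raise ValueError(f"{var_name} is not set")
    return key


if __name__ == "__main__":
    try:
        api_key = load_api_key()
        print("API key loaded")  # deliberately does not print the key itself
        # genai.configure(api_key=api_key)  # pass it on once genai is imported
    except ValueError as err:
        print(err)
```

The env parameter exists only so the function can be exercised with a plain dict instead of the real environment.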
Next comes a simple chat setup.
import os

import google.generativeai as genai
from dotenv import load_dotenv


def main():
    # Load variables from a local .env file, if one exists
    load_dotenv()
    # Alternatively, read the key from the environment:
    # api_key = os.getenv("GEMINI_API_KEY")
    api_key = 'your api key'
    if not api_key:
        raise ValueError(
            "API key not found. Please set your GEMINI_API_KEY in the environment.")
    genai.configure(api_key=api_key)

    # Sampling parameters for text generation
    generation_config = {
        "temperature": 0.7,
        "top_p": 1,
        "top_k": 1,
        "max_output_tokens": 2048,
    }
    # BLOCK_NONE turns off blocking for each harm category
    safety_settings = {
        "HARM_CATEGORY_HARASSMENT": "BLOCK_NONE",
        "HARM_CATEGORY_HATE_SPEECH": "BLOCK_NONE",
        "HARM_CATEGORY_SEXUALLY_EXPLICIT": "BLOCK_NONE",
        "HARM_CATEGORY_DANGEROUS_CONTENT": "BLOCK_NONE",
    }
    model = genai.GenerativeModel(
        'gemini-pro', generation_config=generation_config, safety_settings=safety_settings)
    # Start a multi-turn chat session with an empty history
    chat = model.start_chat(history=[])

    while True:
        user_input = input("User: ").strip()
        # Type "exit" or "quit" to leave the loop
        if user_input.lower() in {"exit", "quit"}:
            break
        try:
            # stream=True yields the reply chunk by chunk
            response = chat.send_message(user_input, stream=True)
            for chunk in response:
                print(chunk.text)
                print("=" * 80)
        except Exception as e:
            print(f"An error occurred: {e}")


if __name__ == "__main__":
    main()
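The chat loop above prints each streamed chunk as it arrives but never keeps the full reply. If you also want the complete text, for logging for example, one option is to accumulate the chunks. This is a minimal sketch: collect_stream is a hypothetical helper, and it assumes only that each chunk exposes a .text attribute, as the chunks in the loop above do. The demo uses stand-in objects, not real API responses.

```python
from types import SimpleNamespace


def collect_stream(chunks, on_chunk=None):
    """Join the .text of every streamed chunk into one string.

    If `on_chunk` is given, it is called with each piece of text as it arrives.
    """
    parts = []
    for chunk in chunks:
        text = chunk.text
        if on_chunk is not None:
            on_chunk(text)
        parts.append(text)
    return "".join(parts)


if __name__ == "__main__":
    # Stand-in chunks; with the real API you would pass the streaming response.
    fake_chunks = [SimpleNamespace(text="Hello, "), SimpleNamespace(text="world")]
    full_text = collect_stream(fake_chunks, on_chunk=lambda t: print(t, end="", flush=True))
    print()
```

In the chat loop this would look like full_text = collect_stream(response, on_chunk=print).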
The answers are fairly complete, and overall it holds its own against ChatGPT. Next, let's use Gemini Pro Vision and see how well Gemini understands images.
import google.generativeai as genai
import PIL.Image

# Image previously uploaded to the Colab runtime
img_vision = PIL.Image.open('/content/d040f28e.jpg')


def setup():
    try:
        gemini_api_key = 'your api key'
        # Do not print the key; just check that one was provided
        if not gemini_api_key:
            raise Exception("GEMINI_API_KEY is not set")
        genai.configure(api_key=gemini_api_key)
    except Exception as exception:
        raise Exception("An error occurred during setup:", exception) from exception


def generate_text(img):
    try:
        model = genai.GenerativeModel(model_name='gemini-pro-vision')
        # The Chinese prompt means "Write a story about this picture"
        response = model.generate_content(['写一个故事关于这图片', img])
        return response.text
    except Exception as exception:
        raise Exception("An error occurred during text generation:", exception) from exception


if __name__ == "__main__":
    try:
        setup()
        response = generate_text(img_vision)
        print(response)
    except Exception as exception:
        print(f"An error occurred: {exception}")
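We asked it to write a story based on this picture.
The result is as follows:
The narration is clear overall: the time, place, characters, and events are all well described.
If you hit a 400 error during your own tests, such as "An error occurred: 400 Unknown error trying to retrieve streaming response. Please retry with `stream=False` for more details.", try clearing your browser cache.
If you want to try it yourself, go ahead and tinker with it, and if you found this helpful, a like, share, and follow would be appreciated.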
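The error message quoted above itself suggests retrying with stream=False. A hedged sketch of that idea follows. send_with_fallback and FakeChat are names invented for this post; the only assumptions are that the chat object has send_message(prompt, stream=...) as used earlier, and that responses and chunks expose .text. With a real chat session, a stream that fails partway may leave the session in an awkward state, so starting a fresh chat before retrying may be simpler.

```python
from types import SimpleNamespace


def send_with_fallback(chat, prompt):
    """Try a streaming request first; on any error, retry once with stream=False."""
    try:
        response = chat.send_message(prompt, stream=True)
        return "".join(chunk.text for chunk in response)
    except Exception as err:
        print(f"Streaming failed ({err}); retrying with stream=False")
        return chat.send_message(prompt, stream=False).text


class FakeChat:
    """Stand-in for a chat session: streaming always fails, non-streaming works."""

    def send_message(self, prompt, stream=False):
        if stream:
            raise RuntimeError("400 Unknown error trying to retrieve streaming response")
        return SimpleNamespace(text=f"echo: {prompt}")


if __name__ == "__main__":
    print(send_with_fallback(FakeChat(), "hello"))
```

Whether the non-streaming retry succeeds depends on the underlying cause, but its error message is usually more informative than the streaming one.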