Providing a user project to a Google Cloud Storage bucket

潮易

Published 2024-08-15 06:54:25


Article link: https://blog.csdn.net/wangbadan121/article/details/141205126


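To store data in a Google Cloud Storage (GCS) bucket and bill the requests to a specific user project, you first need a Google Cloud project and a bucket. The user project is the project charged for a request; it must be supplied when the bucket has Requester Pays enabled. Detailed steps and code samples follow: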

### Step 1: Create a Google Cloud project
1. Sign in to the Google Cloud console (https://console.cloud.google.com/) and select or create a project.
2. In the navigation menu, find "APIs & Services" and open it.
3. Select "Dashboard", then click "Enable APIs and Services".
4. Search for "Cloud Storage API" and enable it.

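### Step 2: Create a bucket
1. Return to the Google Cloud console home page, open the "Navigation Menu" on the left, and select "Cloud Storage" -> "Buckets".
2. Click the "CREATE BUCKET" button.
3. In the creation form, set the bucket name, location, access control, and other options. Keep the bucket private; do not grant public access.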
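
Bucket creation fails if the name breaks the Cloud Storage naming rules, so it can help to check a candidate name locally first. The sketch below covers only a simplified subset of those rules: 3 to 63 characters; lowercase letters, digits, hyphens, underscores, and dots; a letter or digit at each end; no `goog` prefix; no `google` substring. It deliberately rejects the longer dotted names that Cloud Storage also allows. `is_plausible_bucket_name` is a hypothetical helper, not part of any Google library.

```python
import re

# Lowercase letters, digits, hyphens, underscores and dots,
# starting and ending with a letter or digit.
_NAME_PATTERN = re.compile(r"^[a-z0-9][a-z0-9._-]*[a-z0-9]$")


def is_plausible_bucket_name(name: str) -> bool:
    """Return True if name passes a simplified subset of the GCS naming rules."""
    # Simplification: dotted names may be longer in practice; this sketch caps at 63.
    if not 3 <= len(name) <= 63:
        return False
    # Names may not start with "goog" or contain "google".
    if name.startswith("goog") or "google" in name:
        return False
    return bool(_NAME_PATTERN.match(name))


print(is_plausible_bucket_name("my-data-bucket"))  # True
print(is_plausible_bucket_name("My_Bucket"))       # False (uppercase letters)
```

A name that passes this check can still be rejected by the service, for example because bucket names are globally unique and the name is already taken.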

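### Step 3: Install the Google Cloud SDK (optional)
1. Install the Google Cloud SDK on your local machine (https://cloud.google.com/sdk/docs/install).
2. Run `gcloud auth login` and follow the prompts to authenticate. For the Python client library used below, also run `gcloud auth application-default login` so that Application Default Credentials are available.
3. Enable the Cloud Storage API: `gcloud services enable storage-component.googleapis.com --project [YOUR_PROJECT_ID]`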

### Uploading data with a user project in Python
```python
from google.cloud import storage


def upload_to_storage(bucket_name, source_file_name, destination_blob_name,
                      user_project='YOUR_PROJECT_ID'):
    """Upload a local file to the given GCS bucket, billing requests to user_project."""
    # Create a Google Cloud Storage client
    client = storage.Client()

    # Reference the bucket and name the project to bill for requests.
    # The user project is set on the bucket handle, not in blob metadata.
    bucket = client.bucket(bucket_name, user_project=user_project)

    # Prepare the blob (file) to upload
    blob = bucket.blob(destination_blob_name)

    # Upload the file to GCS
    with open(source_file_name, "rb") as source_file:
        blob.upload_from_file(source_file)

    print(f"File {source_file_name} uploaded to {destination_blob_name} with user project {user_project}.")


# Replace these with your own bucket name, source file path, and destination object name
bucket_name = 'YOUR_BUCKET_NAME'
source_file_name = 'path/to/your/local/file.txt'
destination_blob_name = 'folder1/folder2/file.txt'

upload_to_storage(bucket_name, source_file_name, destination_blob_name)
```
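
The user project travels with each request rather than being stored on the object. As an illustration, the sketch below builds the kind of JSON API object URL that carries a `userProject` query parameter. `object_metadata_url` is a hypothetical helper, the bucket, object, and project names are placeholders, and in normal use the client library builds such requests for you.

```python
from urllib.parse import quote, urlencode


def object_metadata_url(bucket_name, object_name, user_project=None):
    """Build a JSON API object URL, optionally naming the project to bill."""
    # The object name is percent-encoded as one path segment, so "/" becomes "%2F".
    url = (
        "https://storage.googleapis.com/storage/v1/b/"
        f"{quote(bucket_name, safe='')}/o/{quote(object_name, safe='')}"
    )
    if user_project:
        # The billing project is passed as a query parameter on the request.
        url += "?" + urlencode({"userProject": user_project})
    return url


print(object_metadata_url("YOUR_BUCKET_NAME", "folder1/folder2/file.txt", "YOUR_PROJECT_ID"))
```

Because the parameter belongs to the request, the same object can be read with different billing projects by different callers.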

### Test case
```python
def test_upload_to_storage():
    bucket_name = 'your-test-bucket'
    source_file_name = 'tests/data/test.txt'
    destination_blob_name = 'folder1/test.txt'

    try:
        # Assumes the bucket exists and the user project is set to the current project
        upload_to_storage(bucket_name, source_file_name, destination_blob_name)
        print("Upload test passed.")
    except Exception as e:
        print(f"Test failed with error: {e}")

test_upload_to_storage()
```
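
The test above needs a real bucket and valid credentials. For a check that runs offline, one option is to pass the client in as an argument and substitute a `unittest.mock.MagicMock`. The sketch below assumes a hypothetical variant of the upload function, `upload_with_client`, which is not part of the original code. It only verifies that the user project is forwarded to `client.bucket(...)`, not that Cloud Storage accepts the request.

```python
import os
import tempfile
from unittest.mock import MagicMock


def upload_with_client(client, bucket_name, source_file_name,
                       destination_blob_name, user_project):
    """Hypothetical variant of the upload function that takes the client as an argument."""
    bucket = client.bucket(bucket_name, user_project=user_project)
    blob = bucket.blob(destination_blob_name)
    with open(source_file_name, "rb") as source_file:
        blob.upload_from_file(source_file)
    return blob


def test_upload_forwards_user_project():
    fake_client = MagicMock()  # stands in for storage.Client()

    # Create a small temporary file to act as the upload source.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"hello")
    try:
        blob = upload_with_client(fake_client, "your-test-bucket", tmp.name,
                                  "folder1/test.txt", "YOUR_PROJECT_ID")
    finally:
        os.remove(tmp.name)

    # The user project must be passed when the bucket handle is created.
    fake_client.bucket.assert_called_once_with(
        "your-test-bucket", user_project="YOUR_PROJECT_ID")
    fake_client.bucket.return_value.blob.assert_called_once_with("folder1/test.txt")
    blob.upload_from_file.assert_called_once()


test_upload_forwards_user_project()
```

Unlike the `try`/`except` version, a failed assertion here raises an error, so a test runner such as pytest would report the failure instead of printing a message.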

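### Example: a large AI model application
Suppose we have a large natural language processing task that uses Google Cloud Storage as its data store. By passing `user_project='YOUR_PROJECT_ID'` to `client.bucket(...)`, we make sure that the model's requests to GCS are billed to that project.

**Scenario:** In a text classification task, each text sample is assigned a category label. This involves storing large amounts of labeled data in GCS and then classifying it quickly and efficiently with an AI model.

**Sample code:**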
```python
from uuid import uuid4

from google.cloud import storage


def classify_texts(bucket_name, texts, user_project='YOUR_PROJECT_ID'):
    """Classify each text and upload the result to GCS, billing requests to user_project."""
    client = storage.Client()
    # The user project is set on the bucket handle, not in blob metadata
    bucket = client.bucket(bucket_name, user_project=user_project)

    for text in texts:
        # Classify the text
        category = classify(text)  # classify is assumed to be defined elsewhere

        # Build the destination blob name, e.g. 'folder1/classification/{uuid}.txt'
        destination_blob_name = f'folder1/classification/{uuid4().hex}.txt'
        blob = bucket.blob(destination_blob_name)

        # Write the classification result to GCS
        with blob.open("w") as file:
            file.write(category)

        print(f"Classification result for text '{text}' saved at {destination_blob_name}.")
```

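In this example, each text sample is classified and the result is saved as an object in GCS. Setting the user project does not remove charges; it determines which project is billed for the requests, which is what a Requester Pays bucket requires.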
