
Uploading Data to Integrated Data Lake

This section describes how to upload data to Integrated Data Lake.

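Prerequisites

Which method to use depends on your requirements. You can upload data to Integrated Data Lake with either of the following methods:

  1. Generating signed URLs
  2. Cross-account access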

Generating Signed URLs

To use this method, follow these steps:

  1. Generate a signed URL for uploading the object.

Endpoint:

POST /generateUploadObjectUrls
Content-Type: application/json

Sample request:

{
  "paths": [
    {
      "path": "myfolder/mysubfolder/myobject.objext"
    }
  ]
}

Sample response:

{
    "objectUrls":[
        {
            "signedUrl":"https://datalake-integ-dide2-5234525690573.s3.eu-central-1.amazonaws.com/data/ten%3Ddide2/myfolder/mysubfolder/myobject.objext?X-Amz-Security-Token=Awervzdg23452xvbxd3434ddg&X-Amz-SignedHeaders=host&X-Amz-Expires=7200&X-Amz-Credentials=ASIATCES50453sdf&X-Amz-Signature=2e2342sfgsdfgsdgh",
            "path":"myfolder/mysubfolder/myobject.objext"
        }
    ]
}
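
The request and response above can be handled with a few small helpers. The following is a minimal Python sketch, not an official client: the function names are hypothetical, the sample URL is a shortened placeholder, and sending the request (including authentication and the API base URL) is left out because the source does not describe it.

```python
from urllib.parse import parse_qs, urlsplit


def build_upload_url_request(paths):
    """Build the JSON body for POST /generateUploadObjectUrls."""
    return {"paths": [{"path": p} for p in paths]}


def signed_urls_by_path(response_body):
    """Map each object path in the response to its signed URL."""
    return {e["path"]: e["signedUrl"] for e in response_body["objectUrls"]}


def expiry_seconds(signed_url):
    """Read the validity period (X-Amz-Expires, in seconds) from a signed URL."""
    query = parse_qs(urlsplit(signed_url).query)
    return int(query["X-Amz-Expires"][0])


# Shortened placeholder response; a real signed URL carries more parameters.
sample_response = {
    "objectUrls": [
        {
            "signedUrl": "https://bucket.example/data/myobject.objext"
                         "?X-Amz-Expires=7200&X-Amz-Signature=placeholder",
            "path": "myfolder/mysubfolder/myobject.objext",
        }
    ]
}

urls = signed_urls_by_path(sample_response)
for path, url in urls.items():
    print(path, "is valid for", expiry_seconds(url) // 60, "minutes")
```

Reading `X-Amz-Expires` from the URL is one way to check how long a signed URL remains usable before a new one has to be generated.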
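  2. Use the signed URL to upload one or more objects to the target directory. The URL is valid for 120 minutes. Once it has expired, you need to generate a new signed URL.

Endpoint: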

POST https://datalake-integ-dide2-5234525690573.s3.eu-central-1.amazonaws.com/data/ten%3Ddide2/myfolder/mysubfolder/myobject.objext?X-Amz-Security-Token=Awervzdg23452xvbxd3434ddg&X-Amz-SignedHeaders=host&X-Amz-Expires=7200&X-Amz-Credentials=ASIATCES50453sdf&X-Amz-Signature=2e2342sfgsdfgsdgh
X-Amz-Content-sha256: UNSIGNED-PAYLOAD
Content-Type: text/plain
X-Amz-acl: bucket-owner-full-control

Sample request:

This is sample text in the file being uploaded.
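
As a rough illustration of the upload step, the sketch below prepares the HTTP request with Python's standard library. The function name `build_upload_request` is hypothetical. The HTTP method defaults to the one shown in the endpoint listing; it is kept as a parameter because presigned S3 object URLs are commonly signed for PUT, so the method accepted by a given URL may differ. The request is only constructed here, not sent.

```python
import urllib.request


def build_upload_request(signed_url, payload, method="POST",
                         content_type="text/plain"):
    """Prepare an upload request for a signed URL (does not send it)."""
    headers = {
        "Content-Type": content_type,
        "X-Amz-Content-sha256": "UNSIGNED-PAYLOAD",
        "X-Amz-acl": "bucket-owner-full-control",
    }
    return urllib.request.Request(
        signed_url, data=payload, headers=headers, method=method
    )


# Placeholder URL; use the signedUrl value returned by the service instead.
request = build_upload_request(
    "https://bucket.example/data/myobject.objext?X-Amz-Expires=7200",
    b"This is sample text in the file being uploaded.",
)
print(request.get_method(), request.full_url)

# To actually send it (requires a valid, unexpired signed URL):
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```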

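Cross-Account Access

Use this method if you need continuous access to the directory you upload to. For example, suppose you have an AWS account that hosts an application, and this application needs ongoing access to an IDL directory. Cross-account access is useful in this case.

To use this method, follow these steps:

  1. Create the cross account that is to be granted access.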
POST /crossAccounts
Content-Type: application/json

Sample request:

{
  "name": "testCrossAccount",
  "accessorAccountId": "960568630345",
  "description": "Cross Account Access for Testing",
  "subtenantId": "204a896c-a23a-11e9-a2a3-2a2ae2dbcce4"
}

Sample response:

{
  "id": "20234sd34a23a-11e9-a2a3-2a2sdfw34ce4",
  "name": "testCrossAccount",
  "accessorAccountId": "960568630345",
  "description": "Cross Account Access for Testing",
  "timestamp": "2019-09-06T21:23:32.000Z",
  "subtenantId": "204a896c-a23a-11e9-a2a3-2a2ae2dbcce4",
  "eTag": 1
}
  2. After the cross account has been created, create a cross-account access to grant the required permission on the required prefix.

POST /crossAccounts/20234sd34a23a-11e9-a2a3-2a2sdfw34ce4/accesses
Content-Type: application/json

Sample request:

{
  "description": "Access to write to mysubfolder",
  "path": "myfolder/mysubfolder",
  "permission": "WRITE"
}

Sample response:

{
  "id": "781c8b90-c7b6-4b1c-993c-b51a00b35be2",
  "description": "Access to write to mysubfolder",
  "storageAccount": "dlbucketname",
  "storagePath": "data/ten=tenantname/myfolder/mysubfolder",
  "path": "myfolder/mysubfolder",
  "permission": "WRITE",
  "status": "ENABLED",
  "timestamp": "2019-11-04T19:19:25.866Z",
  "eTag": 1
}
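
The two request bodies and the access endpoint path from the steps above can be assembled as follows. This is a minimal Python sketch with hypothetical helper names. It assumes that `storageAccount` in the access response names the S3 bucket and that `storagePath` is the key prefix inside it, which the sample values suggest but the source does not state explicitly.

```python
def cross_account_payload(name, accessor_account_id, description, subtenant_id):
    """Build the JSON body for POST /crossAccounts."""
    return {
        "name": name,
        "accessorAccountId": accessor_account_id,
        "description": description,
        "subtenantId": subtenant_id,
    }


def access_endpoint(cross_account_id):
    """Path of the accesses endpoint for an existing cross account."""
    return f"/crossAccounts/{cross_account_id}/accesses"


def access_payload(path, permission, description):
    """Build the JSON body for creating a cross-account access."""
    return {"description": description, "path": path, "permission": permission}


def s3_target(access_response):
    """Return (bucket, prefix); assumes storageAccount is the bucket name."""
    return access_response["storageAccount"], access_response["storagePath"]


# Values taken from the sample requests and responses in this section.
endpoint = access_endpoint("20234sd34a23a-11e9-a2a3-2a2sdfw34ce4")
body = access_payload("myfolder/mysubfolder", "WRITE",
                      "Access to write to mysubfolder")
bucket, prefix = s3_target({
    "storageAccount": "dlbucketname",
    "storagePath": "data/ten=tenantname/myfolder/mysubfolder",
})
print(endpoint, body, bucket, prefix)
```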
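  3. Once access has been granted, you can use the corresponding permission to upload data to the required prefix via the AWS CLI or an AWS SDK.

Use the following command to upload a file to the S3 bucket: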

$ aws s3 cp myobject.objext s3://tgsbucket

upload: ./myobject.objext to s3://tgsbucket/myobject.objext
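
For the SDK route, an upload with boto3 (the AWS SDK for Python) might look like the sketch below. The helper names are hypothetical, the bucket and prefix are the sample values from the access response, and passing the `bucket-owner-full-control` ACL is an assumption carried over from the signed URL method. It also assumes that AWS credentials for the accessor account are already configured.

```python
import os
import posixpath


def object_key(storage_path, filename):
    """Join the granted prefix and the file name into an S3 object key."""
    return posixpath.join(storage_path, filename)


def upload_file(local_path, bucket, storage_path,
                acl="bucket-owner-full-control"):
    """Upload a local file below the granted prefix and return its key."""
    import boto3  # third-party package, imported only when uploading

    key = object_key(storage_path, os.path.basename(local_path))
    boto3.client("s3").upload_file(
        local_path, bucket, key, ExtraArgs={"ACL": acl}
    )
    return key


key = object_key("data/ten=tenantname/myfolder/mysubfolder", "myobject.objext")
print(key)

# Example call (requires boto3, credentials, and a granted WRITE access):
# upload_file("myobject.objext", "dlbucketname",
#             "data/ten=tenantname/myfolder/mysubfolder")
```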


Last update: January 6, 2020