AWS Glue code examples for the SDK for .NET

Overview

Shows how to use the AWS SDK for .NET to work with AWS Glue.

AWS Glue is a scalable, serverless data integration service that makes it easy to discover, prepare, and combine data for analytics, machine learning, and application development.

⚠ Important

  • Running this code might result in charges to your AWS account. For more details, see AWS Pricing and Free Tier.
  • Running the tests might result in charges to your AWS account.
  • We recommend that you grant your code least privilege. At most, grant only the minimum permissions required to perform the task. For more information, see Grant least privilege.
  • This code is not tested in every AWS Region. For more information, see AWS Regional Services.

Code examples

Prerequisites

For prerequisites, see the README in the dotnetv3 folder.

Get started

Single actions

Code excerpts that show you how to call individual service functions.

Scenarios

Code examples that show you how to accomplish a specific task by calling multiple functions within the same service.

Run the examples

Instructions

For general instructions to run the examples, see the README in the dotnetv3 folder.

Some projects might include a settings.json file. Before compiling the project, you can change the values in this file to match your own account and resources. Alternatively, add a settings.local.json file with your local settings; it is loaded automatically when the application runs.

After the example compiles, you can run it from the command line. To do so, navigate to the folder that contains the .csproj file and run the following command:

dotnet run

Alternatively, you can run the example from within your IDE.

Hello AWS Glue

This example shows you how to get started using AWS Glue.

Get started with crawlers and jobs

This example shows you how to do the following:

  • Create a crawler that crawls a public Amazon S3 bucket and generates a database of CSV-formatted metadata.
  • List information about databases and tables in your AWS Glue Data Catalog.
  • Create a job to extract CSV data from the S3 bucket, transform the data, and load JSON-formatted output into another S3 bucket.
  • List information about job runs, view transformed data, and clean up resources.

This scenario requires the following scaffold resources:

  • An S3 bucket that can contain the Python ETL job script and receive output data.
  • An AWS Identity and Access Management (IAM) role that can be assumed by AWS Glue. The role must grant read-write access to the S3 bucket and standard rights needed by AWS Glue.

You can deploy and destroy these resources by using the AWS Cloud Development Kit (AWS CDK). To do this, run cdk deploy or cdk destroy in the /resources/cdk/glue_role_bucket folder.

After the AWS CDK script reports the names of the bucket and the IAM role that it created, open the settings.json file and fill in the BucketName, RoleName, and ScriptURL values.

Also copy the Python script flight_etl_job_script.py from /aws-doc-sdk-examples/python/example_code/glue/flight_etl_job_script.py to the S3 bucket.
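
One way to do the copy is with the AWS CLI, sketched below. The AWS CLI is an assumption here, since this README does not name a tool; the bucket name is a placeholder, and the local path assumes that you run the commands from the root of a cloned aws-doc-sdk-examples repository.

```shell
# Placeholder bucket name; replace with the name reported by the AWS CDK script.
BUCKET_NAME="bucket-name-from-cdk-script"
# Assumed local path, relative to the root of a cloned aws-doc-sdk-examples repository.
SCRIPT_PATH="python/example_code/glue/flight_etl_job_script.py"
# The S3 URL of the uploaded script; this is the value for ScriptURL in settings.json.
SCRIPT_URL="s3://${BUCKET_NAME}/$(basename "$SCRIPT_PATH")"

if command -v aws >/dev/null 2>&1; then
  # Upload the script; report a failure instead of stopping the shell.
  aws s3 cp "$SCRIPT_PATH" "$SCRIPT_URL" \
    || echo "Upload failed; check the path, bucket name, and credentials." >&2
else
  echo "AWS CLI not found; upload the script another way, for example through the S3 console." >&2
fi

echo "ScriptURL: $SCRIPT_URL"
```

The printed URL is the value to enter for ScriptURL.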

Example:

"BucketName": "bucket-name-from-cdk-script",
"CrawlerName": "any-name-for-crawler",
"RoleName": "role-name-from-cdk-script",
"SourceData": "s3://crawler-public-us-east-1/flight/2016/csv",
"DbName": "example-flights-db",
"Cron": "cron(15 12 * * ? *)",
"ScriptURL": "s3://bucket-name-from-cdk-script/flight_etl_job_script.py",
"JobName": "glue-mvp-job"
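
If you prefer to keep your own values out of settings.json, the same keys could go into a settings.local.json file instead. The following is a minimal sketch that writes such a file from the command line. It assumes that the local file sits next to settings.json and uses the same flat key layout as the example above; check the settings.json in the project for the exact structure. All values are placeholders.

```shell
# Write a local settings file in the current folder (assumed to be the folder
# that contains settings.json). Replace the placeholder values with the names
# reported by the AWS CDK script.
cat > settings.local.json <<'EOF'
{
  "BucketName": "bucket-name-from-cdk-script",
  "CrawlerName": "any-name-for-crawler",
  "RoleName": "role-name-from-cdk-script",
  "SourceData": "s3://crawler-public-us-east-1/flight/2016/csv",
  "DbName": "example-flights-db",
  "Cron": "cron(15 12 * * ? *)",
  "ScriptURL": "s3://bucket-name-from-cdk-script/flight_etl_job_script.py",
  "JobName": "glue-mvp-job"
}
EOF
```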

Tests

⚠ Running tests might result in charges to your AWS account.

To find instructions for running these tests, see the README in the dotnetv3 folder.

Additional resources


Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.

SPDX-License-Identifier: Apache-2.0