@educational-technology-collective/etc-jupyterlab-aws-api-s3-handler v0.0.5

License: BSD-3-Clause

etc-jupyterlab-aws-api-s3-handler

The JupyterLab AWS API S3 Handler provides a JupyterLab service that can be used to send log messages to a specified AWS REST API Gateway and its associated S3 Bucket.

The extension provides a Service Token named IAWSAPIGatewayHandler. Once the extension is installed, it can be injected into a consumer extension like this:

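import { JupyterFrontEnd, JupyterFrontEndPlugin } from '@jupyterlab/application';
// Assumption: the token is exported from the package's main entry point.
import { IAWSAPIGatewayHandler } from '@educational-technology-collective/etc-jupyterlab-aws-api-s3-handler';

const extension: JupyterFrontEndPlugin<void> = {
  id: 'the-unique-id-of-your-extension',
  autoStart: true,
  requires: [IAWSAPIGatewayHandler],
  activate: (app: JupyterFrontEnd, AWSAPIGatewayHandler: IAWSAPIGatewayHandler): void => {}
};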

AWSAPIGatewayHandler is a class. The constructor takes an object that specifies the URL endpoint, the name of the bucket, and the bucket path where the log messages will be stored.

new AWSAPIGatewayHandler({
  url: "https://example.com",
  bucket: "name-of-bucket",
  path: "arbitrary-bucket-path"
});
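
As a rough illustration of how these three options could relate to the request URL format shown in the Summary section, the sketch below joins them into a single endpoint. The helper buildEndpoint and the HandlerOptions interface are hypothetical and not part of the extension; how the handler composes the URL internally is an assumption.

```typescript
// Hypothetical types and helper: not part of the extension's API.
interface HandlerOptions {
  url: string;    // API Gateway endpoint
  bucket: string; // S3 Bucket name
  path: string;   // folder-like path inside the bucket
}

// Join the options into <url>/<bucket>/<path>, trimming stray slashes
// so that no segment is doubled or left empty.
function buildEndpoint({ url, bucket, path }: HandlerOptions): string {
  const trim = (s: string): string => s.replace(/^\/+|\/+$/g, "");
  return [url.replace(/\/+$/, ""), trim(bucket), trim(path)]
    .filter((segment) => segment.length > 0)
    .join("/");
}

const endpoint = buildEndpoint({
  url: "https://example.com",
  bucket: "name-of-bucket",
  path: "arbitrary/bucket/path"
});
console.log(endpoint);
// prints "https://example.com/name-of-bucket/arbitrary/bucket/path"
```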

To use the handler, we first need to configure the necessary AWS resources.

Setting up the AWS API Gateway and S3 Bucket

The AWS configuration consists of the following components:

  1. REST API Gateway
  2. Execution Role
  3. An Inline Policy for the Execution Role
  4. S3 Bucket

We will explain how to create each of these resources in the following instructions.

Summary

These instructions explain how to create an AWS API Gateway that can write objects to an S3 Bucket at an arbitrary path. The path of the object in the S3 Bucket will be the part of the path specified in the URL that follows the bucket name segment. A URL may look like this:

https://example.com/name-of-bucket/arbitrary/bucket/path

A POST request to the above URL will create a new S3 object in the S3 Bucket named name-of-bucket, inside the arbitrary/bucket/path S3 folder. We will configure the API Gateway to accept POST HTTP requests; hence, the server will name the resource. The name of the resource will be the server timestamp concatenated with a server-generated UUID.
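
To make the URL layout concrete, here is a minimal sketch that splits such a URL into the bucket name and the folder path. The helper parseObjectLocation is hypothetical and only mirrors the layout described above; it is not how the API Gateway itself is implemented, and it does not attempt to reproduce the server-generated object name.

```typescript
// Hypothetical helper that mirrors the URL layout described above.
interface ObjectLocation {
  bucket: string; // first path segment
  folder: string; // remaining segments, joined with "/"
}

function parseObjectLocation(url: string): ObjectLocation {
  // Drop empty segments produced by leading or trailing slashes.
  const segments = new URL(url).pathname.split("/").filter((s) => s !== "");
  if (segments.length === 0) {
    throw new Error("URL must contain at least a bucket name segment");
  }
  const [bucket, ...rest] = segments;
  return { bucket, folder: rest.join("/") };
}

const parsed = parseObjectLocation(
  "https://example.com/name-of-bucket/arbitrary/bucket/path"
);
console.log(parsed.bucket); // prints "name-of-bucket"
console.log(parsed.folder); // prints "arbitrary/bucket/path"
```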

Following these instructions will result in an API that permits objects to be sent to an S3 Bucket using unauthenticated HTTP POST requests. It will not allow objects to be retrieved from the S3 Bucket. To retrieve objects from the S3 Bucket, additional methods may be added to the API Gateway, or S3 Sync can be used.
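
A client request against such an API might be assembled as in the sketch below. The helper buildLogRequest, the JSON content type, and the example payload are assumptions for illustration; the extension's own request format may differ.

```typescript
// Minimal description of a fetch-style request; declared locally so the
// sketch does not depend on DOM or Node typings.
interface LogRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Hypothetical helper: serialize a log message as JSON and target the
// gateway URL. No Authorization header is set because the API described
// here accepts unauthenticated POST requests.
function buildLogRequest(url: string, message: unknown): LogRequest {
  return {
    url,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" }, // assumed content type
      body: JSON.stringify(message)
    }
  };
}

const request = buildLogRequest(
  "https://example.com/name-of-bucket/arbitrary/bucket/path",
  { event: "cell_executed" } // hypothetical payload
);
console.log(request.init.body); // prints {"event":"cell_executed"}
```

In a browser or a recent Node.js runtime, the request could then be sent with fetch(request.url, request.init).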

We will create a Role that contains a Policy that permits a Trusted Entity (e.g., an API Gateway) that is assigned the Role to PUT objects into an S3 Bucket. Then we will assign this Role to an API Gateway. This assignment will permit the API Gateway to forward payloads to the S3 Bucket.

Instructions

Create the S3 Bucket

We want to create an S3 Bucket that will store objects that it receives from the API Gateway. The S3 Bucket should be configured with the default secure permissions. Once the S3 Bucket is created, open the bucket and select the Properties tab. Copy the S3 Bucket ARN. This ARN will be used when we create the Inline Policy.

Create the Role

Create a Role. For the "use case" select "API Gateway". Accept all the defaults and name the Role.

Create the Inline Policy

Open the newly created Role and create an Inline Policy as shown below. Click the JSON button in order to open the JSON editor. Replace the ARN in the Resource list with the ARN of the S3 Bucket. Make sure the "/*" is appended to the ARN string.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:PutObject"
            ],
            "Resource": [
                "arn:aws:s3:::s3-bucket-name/*"
            ]
        }
    ]
}

The above Inline Policy allows the specified Action on the specified Resource (i.e., the S3 Bucket). The PutObject Action will permit the entity that is assigned the Role to use the PUT method in order to write objects to the S3 Bucket specified by the ARN in the Resource list.
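
If you prefer to generate the policy document rather than edit it by hand, a small helper along these lines could fill in the Resource entry. The function buildPutObjectPolicy is a hypothetical convenience, not part of the extension or of any AWS SDK; it only guards against forgetting the trailing "/*".

```typescript
// Hypothetical helper: build the Inline Policy document for a bucket ARN.
function buildPutObjectPolicy(bucketArn: string) {
  // s3:PutObject applies to objects inside the bucket, so the Resource
  // must end with "/*"; append it only if it is missing.
  const resource = /\/\*$/.test(bucketArn) ? bucketArn : `${bucketArn}/*`;
  return {
    Version: "2012-10-17",
    Statement: [
      { Effect: "Allow", Action: ["s3:PutObject"], Resource: [resource] }
    ]
  };
}

const policy = buildPutObjectPolicy("arn:aws:s3:::s3-bucket-name");
// Print JSON that can be pasted into the Inline Policy JSON editor.
console.log(JSON.stringify(policy, null, 4));
```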

Create the API Gateway

Choose the REST API type. Accept the defaults, name the API, and click Create API. Select Actions > Create Resource. We want to capture the path of the URL so that we can use it to create the resource in the S3 Bucket. We accomplish this by creating a "greedy path parameter." Complete the New Child Resource form as shown in the image below.

[Image: New Child Resource]

This will result in a new resource, /{bucket-path+}, and two new methods, ANY and OPTIONS.
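
The effect of a greedy path parameter can be modeled with a few lines of code. The sketch below is a simplified model, not API Gateway's actual matching logic, and captureGreedyPath is a hypothetical name: the parameter captures everything after the leading slash, including any further slashes, and does not match the bare root path.

```typescript
// Simplified model of a greedy path parameter such as /{bucket-path+}.
function captureGreedyPath(requestPath: string): string | null {
  // Require at least one character after the leading slash.
  const match = /^\/(.+)$/.exec(requestPath);
  return match ? match[1] : null;
}

console.log(captureGreedyPath("/name-of-bucket/arbitrary/bucket/path"));
// prints "name-of-bucket/arbitrary/bucket/path"
console.log(captureGreedyPath("/"));
// prints null
```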

[Image: Resources]

For this configuration we want to restrict our methods to just POST and OPTIONS. Click the ANY option and click Actions > Delete Method.

We want to create a POST method under the {bucket-path+} resource so that the API Gateway can receive POST HTTP requests. Click the {bucket-path+} resource. Click Actions > Create Method. Select POST. Click the checkmark. Complete the POST - Setup fields as shown in the image below:

[Image: POST Setup]

Next, we will set up the integration between the API Gateway and the S3 Bucket. The Integration Request configuration can be accessed by clicking on the POST method under Resources and then clicking Integration Request in the POST-Method Execution

Requirements

  • JupyterLab >= 3.0

Install

To install the extension, clone the repository, open the repo directory, and execute:

pip install .

Uninstall

To remove the extension, execute:

pip uninstall etc-jupyterlab-aws-api-s3-handler

Troubleshoot

If you are seeing the frontend extension, but it is not working, check that the server extension is enabled:

jupyter server extension list

If the server extension is installed and enabled, but you are not seeing the frontend extension, check that the frontend extension is installed:

jupyter labextension list

Contributing

Development install

Note: You will need NodeJS to build the extension package.

The jlpm command is JupyterLab's pinned version of yarn that is installed with JupyterLab. You may use yarn or npm in lieu of jlpm below.

# Clone the repo to your local environment
# Change directory to the etc-jupyterlab-aws-api-s3-handler directory
# Install package in development mode
pip install -e .
# Link your development version of the extension with JupyterLab
jupyter labextension develop . --overwrite
# Server extension must be manually installed in develop mode
jupyter server extension enable etc-jupyterlab-aws-api-s3-handler
# Rebuild the extension's TypeScript source after making changes
jlpm run build

You can watch the source directory and run JupyterLab at the same time in different terminals to watch for changes in the extension's source and automatically rebuild the extension.

# Watch the source directory in one terminal, automatically rebuilding when needed
jlpm run watch
# Run JupyterLab in another terminal
jupyter lab

With the watch command running, every saved change will immediately be built locally and available in your running JupyterLab. Refresh JupyterLab to load the change in your browser (you may need to wait several seconds for the extension to be rebuilt).

By default, the jlpm run build command generates the source maps for this extension to make it easier to debug using the browser dev tools. To also generate source maps for the JupyterLab core extensions, you can run the following command:

jupyter lab build --minimize=False

Development uninstall

# Server extension must be manually disabled in develop mode
jupyter server extension disable etc-jupyterlab-aws-api-s3-handler
pip uninstall etc-jupyterlab-aws-api-s3-handler

In development mode, you will also need to remove the symlink created by the jupyter labextension develop command. To find its location, run jupyter labextension list to see where the labextensions folder is located. Then remove the symlink named etc-jupyterlab-aws-api-s3-handler within that folder.