AWS EMR Add Git Repository

How can a Git-based repository be added?

  1. Open the EMR console at https://console.aws.amazon.com/elasticmapreduce/.
  2. Choose Git repositories, then choose Add repository.
  3. For Repository name, enter a unique name for the repository in EMR. The name can contain only alphanumeric characters, hyphens, and underscores.
  4. For Git repository URL, enter the URL of the repository. For a CodeCommit repository, this is the URL shown when you choose Clone URL and then Clone HTTPS, for example https://git-codecommit.us-west-2.amazonaws.com/v1/repos/FirstCodeCommitRepoName.
  5. For Branch, enter the name of the branch.
  6. For Git credentials, choose an option based on the table below. The repository can be authenticated with either a Git username and password or a personal access token (PAT). EMR Notebooks reads Git credentials from secrets stored in AWS Secrets Manager.

 

Option | What it is for
Create a new secret | Associates Git credentials with a new secret in Secrets Manager. If you access the repository with a Git username and password, choose Username and password, enter the Secret name, then enter the Username and Password. If you access it with a personal access token, choose Personal access token (PAT), enter the Secret name to use in Secrets Manager, then enter the token.
Use an existing AWS secret | Uses credentials that are already saved as a secret. Choose the secret name from the list. If the secret holds a Git username and password, it must have the format {"gitUsername": "XYUserName", "gitPassword": "XYPassword"}.
Use a public repository without credentials | Accesses a public repository.

  7. Finally, choose Add repository.
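
The repository name rule and the secret format described in the steps above can be checked locally before you open the console. The following is a minimal Python sketch, not part of EMR: the function names `is_valid_repo_name` and `build_git_secret` are hypothetical, and the regular expression is only one reading of "alphanumeric characters, hyphens, and underscores".

```python
import json
import re

# Assumption: "alphanumeric" is read as ASCII letters and digits.
_REPO_NAME_RE = re.compile(r"[A-Za-z0-9_-]+")


def is_valid_repo_name(name: str) -> bool:
    """Return True if name uses only letters, digits, hyphens, and underscores."""
    return _REPO_NAME_RE.fullmatch(name) is not None


def build_git_secret(username: str, password: str) -> str:
    """Serialize Git credentials in the gitUsername/gitPassword secret format."""
    return json.dumps({"gitUsername": username, "gitPassword": password})


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(is_valid_repo_name("my-emr_repo1"))
    print(build_git_secret("XYUserName", "XYPassword"))
```

The string returned by `build_git_secret` is the value you would paste into Secrets Manager, or pass as `SecretString` to the boto3 Secrets Manager `create_secret` call if you prefer to script that part. That call needs AWS credentials, so it is left out of the sketch.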

 

How to Update or Delete a Git Repository?

To update a Git-based repository:

  1. On the Git repositories page, choose the repository you want to update.
  2. On the repository page, choose Edit repository.
  3. Update the Git credentials.

To delete a Git repository:

  1. On the Git repositories page, choose the repository you want to delete.
  2. On the repository page, select every notebook that is linked to the repository, then choose Unlink notebook.
  3. Still on the repository page, choose Delete.
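
The delete procedure implies an ordering: every notebook is unlinked first, and only then is the repository deleted. The sketch below models that ordering with a small in-memory class. `GitRepo`, `unlink`, `delete`, and `unlink_all_then_delete` are hypothetical names for illustration, not EMR API calls, and the refusal to delete while notebooks remain linked is an assumption inferred from the steps above.

```python
from dataclasses import dataclass, field


@dataclass
class GitRepo:
    """Hypothetical in-memory stand-in for a repository entry in the console."""
    name: str
    linked_notebooks: set = field(default_factory=set)
    deleted: bool = False

    def unlink(self, notebook_id: str) -> None:
        # Removing a link leaves the notebook itself untouched.
        self.linked_notebooks.discard(notebook_id)

    def delete(self) -> None:
        # Assumption: deletion is refused while any notebook is still linked.
        if self.linked_notebooks:
            raise RuntimeError("unlink all notebooks before deleting the repository")
        self.deleted = True


def unlink_all_then_delete(repo: GitRepo) -> None:
    """Mirror the console steps: unlink every notebook, then delete."""
    for notebook_id in list(repo.linked_notebooks):
        repo.unlink(notebook_id)
    repo.delete()
```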

What are the different Repository Statuses, and what do they mean?

A Git repository can have any of the following statuses.

Status | Meaning
Linking | The repository is being linked to the notebook. The notebook cannot be stopped while linking is in progress.
Linked | The repository is linked to the notebook and connected to the remote repository.
Link Failed | The repository could not be linked to the notebook. Try again.
Unlinking | The repository is being unlinked from the notebook. The notebook cannot be stopped while unlinking is in progress. Unlinking only disconnects the notebook from the remote repository; it does not delete the notebook's code.
Unlink Failed | The repository could not be unlinked from the notebook. Try again.
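
As a compact restatement of the status table, the sketch below encodes the five statuses in Python. The names `RepoStatus`, `blocks_notebook_stop`, and `should_retry` are hypothetical and only mirror the descriptions in the table; they are not an EMR API.

```python
from enum import Enum


class RepoStatus(Enum):
    LINKING = "Linking"
    LINKED = "Linked"
    LINK_FAILED = "Link Failed"
    UNLINKING = "Unlinking"
    UNLINK_FAILED = "Unlink Failed"


# Statuses during which the table says the notebook cannot be stopped.
_IN_PROGRESS = {RepoStatus.LINKING, RepoStatus.UNLINKING}
# Statuses for which the table says to try again.
_FAILED = {RepoStatus.LINK_FAILED, RepoStatus.UNLINK_FAILED}


def blocks_notebook_stop(status: RepoStatus) -> bool:
    """True while a link or unlink operation is in progress."""
    return status in _IN_PROGRESS


def should_retry(status: RepoStatus) -> bool:
    """True when the last link or unlink attempt failed."""
    return status in _FAILED
```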

 

How to link a Git-based repository to an EMR notebook?

You can link a repository to an EMR notebook once the notebook has the Ready status. There are two ways to do it; use whichever suits you.

First Way:

  1. In the Notebooks list, choose the notebook you want to update.
  2. On the Notebook page, under Git repositories, choose Link new repository.
  3. In the Link Git repository to the notebook window, select one or more repositories from the list, then choose Link repository.

Second Way:

  1. On the Git repositories page, choose the repository you want to link to the notebook.
  2. In the EMR notebooks list, choose Link new notebook to link the chosen repository to an existing notebook.

 

How to unlink a Git repository from an EMR notebook?

There are two ways to unlink a repository; use whichever suits you.

First Way:

  1. In the Notebooks list, choose the notebook you want to update.
  2. In the Git repositories list, select the repository you want to unlink from the notebook, then choose Unlink repository.

Second Way:

  1. On the Git repositories page, choose the repository you want to update.
  2. In the EMR notebooks list, select the notebook you want to unlink from the repository, then choose Unlink notebook.

You can also use Amazon EMR to create a cluster.

