AWS Kinesis Create a Data Stream

Amazon Kinesis Data Streams is a real-time stream management service that delivers streaming data to AWS data stores and applications. It is highly scalable, so you do not have to worry about ingesting high volumes of streaming data in real time.

Kinesis Data Streams ingests stream data such as logs, telemetry, clickstreams, or video feeds from many data producers. Sources can include, for example, mobile devices, web applications, IoT devices, and social media platforms.

Kinesis Data Streams allows low-latency, real-time analytics on streaming data. Data can be analyzed as it is ingested using the Kinesis Data Streams API. Each record carries a partition key that determines which shard receives it, so choose partition keys carefully: well-distributed keys let multiple producers write to a single stream and improve data retrieval time.
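
To make the role of partition keys concrete, here is a minimal Java sketch of how a key can be mapped to a shard. Kinesis Data Streams hashes each partition key with MD5 into a 128-bit integer and routes the record to the shard whose hash key range contains that value. The class name PartitionKeySketch, its method names, and the assumption that the shards split the hash space into equal-width ranges are illustrative only; a real stream reports the actual hash key range of each shard.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of the AWS SDK.
final class PartitionKeySketch {
    // Size of the 128-bit hash key space: 2^128.
    static final BigInteger HASH_SPACE = BigInteger.ONE.shiftLeft(128);

    /** MD5-hashes the partition key and reads the digest as an unsigned 128-bit integer. */
    static BigInteger hashKey(String partitionKey) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(partitionKey.getBytes(StandardCharsets.UTF_8));
            return new BigInteger(1, digest); // signum 1 keeps the value non-negative
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    /** Index of the shard that owns the key, ASSUMING shardCount equal-width hash ranges. */
    static int shardIndexFor(String partitionKey, int shardCount) {
        if (shardCount < 1) {
            throw new IllegalArgumentException("shardCount must be >= 1");
        }
        // index = floor(hash * shardCount / 2^128), always in [0, shardCount)
        return hashKey(partitionKey)
                .multiply(BigInteger.valueOf(shardCount))
                .divide(HASH_SPACE)
                .intValue();
    }

    public static void main(String[] args) {
        for (String key : new String[] {"device-1", "device-2", "device-3"}) {
            System.out.println(key + " -> shard " + shardIndexFor(key, 4));
        }
    }
}
```

Because the mapping is deterministic, every record with the same partition key lands on the same shard. A key with few distinct values therefore concentrates traffic on a few shards.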

Kinesis Data Firehose integrates with Kinesis Data Streams, so streaming data can be loaded automatically into data stores such as Amazon S3, Amazon Redshift, or Amazon Elasticsearch Service. Kinesis Video Streams, another service in the Kinesis family, enables real-time streaming of video data from connected devices for processing, analysis, and playback.

You can also analyze stream data in real time for use cases such as fraud detection, IoT data processing, social media sentiment analysis, and log analysis. Streaming data can also be stored in an S3 data lake and processed with tools like Amazon EMR, Amazon Athena, or Amazon Redshift.

You can create a Kinesis data stream using the Kinesis Data Streams console, the AWS CLI, or the Kinesis Data Streams API.

AWS Kinesis Create a Data Stream through console:

  1. Log in to the AWS Management Console, then open the Kinesis console at https://console.aws.amazon.com/kinesis.
  2. From the navigation bar, use the Region selector at the top right to select a Region.
    AWS Kinesis Create a Data Stream - Region Selector


  3. Click on Create data stream.
    AWS Kinesis Create a Data Stream - Create Data Stream


  4. On the Create Kinesis stream page, enter a stream name and the number of shards you need. Then choose Create Kinesis stream.

    AWS Kinesis Create a Data Stream - Stream name and number of shards


The Kinesis streams page shows the status of the stream:

Status = Creating (while the stream is being created)

Status = Active (once the stream is ready for use)

  5. Click your stream’s name. The Stream Details page shows a summary of the stream configuration along with monitoring information.

 

AWS Kinesis Create a Data Stream through CLI:

With the AWS CLI, you can create a stream directly with the create-stream command, for example: aws kinesis create-stream --stream-name <stream-name> --shard-count <shard-count>

 

 

AWS Kinesis Create a Data Stream with API:

Go over the below steps for creating a Kinesis data stream.

First: AWS Kinesis Build a Data Streams Client

Before you can work with Kinesis data streams, you need to build a client object. The Java code below instantiates a client builder, uses it to set the Region, the credentials, and the client configuration, and then builds the client object.

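// Imports assume the AWS SDK for Java 1.x.
import com.amazonaws.services.kinesis.AmazonKinesis;
import com.amazonaws.services.kinesis.AmazonKinesisClientBuilder;

// regionName, credentialsProvider, and config are defined elsewhere by the application.
AmazonKinesisClientBuilder clientBuilder = AmazonKinesisClientBuilder.standard();
clientBuilder.setRegion(regionName);
clientBuilder.setCredentials(credentialsProvider);
clientBuilder.setClientConfiguration(config);
AmazonKinesis client = clientBuilder.build();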

Second: AWS Kinesis Create a Data Stream

Once the Kinesis Data Streams client exists, you can create a stream, either through the Kinesis Data Streams console or programmatically. To create a stream programmatically, instantiate a CreateStreamRequest object, then give the stream a name and set the number of shards it will use.

CreateStreamRequest createStreamRequest = new CreateStreamRequest();
createStreamRequest.setStreamName( myStreamName );
createStreamRequest.setShardCount( myStreamSize );

The stream name identifies the stream. The name is scoped to the AWS account used by the application, and also to the Region:

  • Two streams in two different AWS accounts can have the same name.
  • Two streams in the same account but in two different Regions can have the same name.
  • Two streams in the same account and the same Region cannot have the same name.

The throughput of a stream is a function of its number of shards. More shards provide more provisioned throughput, and also increase the cost charged by AWS.
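
As a rough illustration of how throughput scales with the shard count, the Java sketch below estimates the capacity of a stream and the number of shards a workload needs. It assumes the commonly documented per-shard limits of 1 MiB/s or 1,000 records/s for writes and 2 MiB/s for reads. Check these figures against the current AWS quotas before relying on them. The class and method names are hypothetical.

```java
// Hypothetical capacity calculator, not part of the AWS SDK.
final class ShardCapacitySketch {
    // Assumed per-shard limits; verify against the current AWS quotas.
    static final int WRITE_KIB_PER_SEC_PER_SHARD = 1024;     // 1 MiB/s ingest
    static final int WRITE_RECORDS_PER_SEC_PER_SHARD = 1000; // 1,000 records/s ingest
    static final int READ_KIB_PER_SEC_PER_SHARD = 2048;      // 2 MiB/s read

    static long writeCapacityKiBPerSec(int shards) {
        return (long) shards * WRITE_KIB_PER_SEC_PER_SHARD;
    }

    static long readCapacityKiBPerSec(int shards) {
        return (long) shards * READ_KIB_PER_SEC_PER_SHARD;
    }

    /** Smallest shard count that covers the expected write bandwidth, read bandwidth, and record rate. */
    static int shardsNeeded(long writeKiBPerSec, long readKiBPerSec, long recordsPerSec) {
        long byWrite = ceilDiv(writeKiBPerSec, WRITE_KIB_PER_SEC_PER_SHARD);
        long byRead = ceilDiv(readKiBPerSec, READ_KIB_PER_SEC_PER_SHARD);
        long byRecords = ceilDiv(recordsPerSec, WRITE_RECORDS_PER_SEC_PER_SHARD);
        // A stream always has at least one shard.
        return (int) Math.max(1, Math.max(byWrite, Math.max(byRead, byRecords)));
    }

    // Ceiling division for non-negative operands.
    static long ceilDiv(long a, long b) {
        return (a + b - 1) / b;
    }

    public static void main(String[] args) {
        // 3000 KiB/s in, 4000 KiB/s out, 500 records/s: write bandwidth dominates.
        System.out.println(shardsNeeded(3000, 4000, 500)); // prints 3
    }
}
```

The resulting value could be passed as the shard count of the CreateStreamRequest shown above.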



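After the createStreamRequest object is configured, call createStream on the client. Then wait for the stream to reach the ACTIVE state before using it. The describeStream call can throw ResourceNotFoundException while the stream is not yet visible, so it is wrapped in a try/catch block.

client.createStream( createStreamRequest );
DescribeStreamRequest describeStreamRequest = new DescribeStreamRequest();
describeStreamRequest.setStreamName( myStreamName );

// Wait up to 10 minutes for the stream to become ACTIVE.
long startTime = System.currentTimeMillis();
long endTime = startTime + ( 10 * 60 * 1000 );
while ( System.currentTimeMillis() < endTime ) {
  // Pause 20 seconds before each status check.
  try {
    Thread.sleep(20 * 1000);
  }
  catch ( Exception e ) {}

  try {
    DescribeStreamResult describeStreamResponse = client.describeStream( describeStreamRequest );
    String streamStatus = describeStreamResponse.getStreamDescription().getStreamStatus();
    if ( streamStatus.equals( "ACTIVE" ) ) {
      break;
    }
    //
    // sleep for one second
    //
    try {
      Thread.sleep( 1000 );
    }
    catch ( Exception e ) {}
  }
  // The stream is not visible yet; keep polling.
  catch ( ResourceNotFoundException e ) {}
}
if ( System.currentTimeMillis() >= endTime ) {
  throw new RuntimeException( "Stream " + myStreamName + " never went active" );
}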
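
The polling logic above can also be factored into a small reusable helper. The sketch below is one possible shape, not the SDK's own waiter. StreamStatusPoller and waitUntilActive are hypothetical names, and the status source is a plain Supplier so the example runs without the AWS SDK. In real code the supplier would wrap client.describeStream(...) and return the stream status string.

```java
import java.util.function.Supplier;

// Hypothetical polling helper; the Supplier stands in for a describeStream call.
final class StreamStatusPoller {
    /**
     * Polls statusSource until it reports "ACTIVE" or the timeout elapses.
     * Returns true if ACTIVE was observed, false on timeout or interruption.
     */
    static boolean waitUntilActive(Supplier<String> statusSource,
                                   long timeoutMillis,
                                   long pollIntervalMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (true) {
            String status;
            try {
                status = statusSource.get();
            } catch (RuntimeException notVisibleYet) {
                // Treated as "not ready yet", like ResourceNotFoundException in the loop above.
                // Real code should catch only the specific exception it expects.
                status = null;
            }
            if ("ACTIVE".equals(status)) {
                return true;
            }
            if (System.currentTimeMillis() >= deadline) {
                return false;
            }
            try {
                Thread.sleep(pollIntervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Fake status source: CREATING twice, then ACTIVE.
        final String[] statuses = {"CREATING", "CREATING", "ACTIVE"};
        final int[] calls = {0};
        Supplier<String> fake = () -> statuses[Math.min(calls[0]++, statuses.length - 1)];
        System.out.println(waitUntilActive(fake, 5000, 10)); // prints true
    }
}
```

Unlike the original loop, this version checks the status before the first sleep and stops when the thread is interrupted instead of swallowing the interruption.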

 

How to update a data stream through the console?

  1. Go to the Kinesis console using the following link https://console.aws.amazon.com/kinesis/.
  2. From the navigation bar, use the Region selector at the top right to select a Region.
    AWS Kinesis Create a Data Stream - Region Selector


  3. Choose a stream name from the list of streams. The Stream Details page shows a summary of the stream configuration along with monitoring information.
  4. To change the number of shards, click Edit in the Shards section, then enter the new shard count.
  5. To enable server-side encryption of data records, click Edit in the Server-side encryption section. Select a KMS key to use as the master key for encryption, or use the default Kinesis-managed master key, aws/kinesis. If you enable encryption for a stream and use your own KMS master key, make sure that your producer and consumer applications have permission to access the KMS master key you chose.
  6. To change the data retention period, click Edit in the Data retention period section, then enter the new retention period.
  7. To enable custom metrics, click Edit in the Shard level metrics section, then select the metrics for your stream.
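
When changing the shard count (step 4), keep in mind that a single resize is limited. The Java sketch below assumes the documented UpdateShardCount restriction that one call can neither scale above double nor below half of the current shard count. Other quotas, such as the number of resizes per day, are not modelled. The class and method names are hypothetical.

```java
// Hypothetical planner for stepping a stream's shard count toward a target.
final class ShardCountUpdateSketch {
    /** True if target lies within [half, double] of the current count (assumed per-call limit). */
    static boolean isAllowedInOneCall(int current, int target) {
        if (current < 1 || target < 1) {
            return false;
        }
        return target <= 2L * current && 2L * target >= current;
    }

    /** Next shard count to request when moving from current toward desired, one call at a time. */
    static int nextStep(int current, int desired) {
        if (current < 1 || desired < 1) {
            throw new IllegalArgumentException("shard counts must be >= 1");
        }
        if (isAllowedInOneCall(current, desired)) {
            return desired;
        }
        // Otherwise move as far as one call allows: double when growing, halve (rounded up) when shrinking.
        return desired > current ? current * 2 : (current + 1) / 2;
    }

    public static void main(String[] args) {
        int count = 4;
        while (count != 20) {
            count = nextStep(count, 20);
            System.out.println("resize to " + count); // 8, then 16, then 20
        }
    }
}
```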

 

How to Update a Stream with API?

To update stream details through the API, use the following operations:

  • DecreaseStreamRetentionPeriod: Shortens the retention period.
  • UpdateShardCount: Changes the shard count.
  • EnableEnhancedMonitoring: Turns on enhanced (shard-level) monitoring.
  • AddTagsToStream: Adds tags to the stream. To learn more about tagging, see the AWS tagging guidelines.
  • StopStreamEncryption: Disables server-side encryption for the stream.
  • RemoveTagsFromStream: Deletes existing tags from the stream.
  • DisableEnhancedMonitoring: Turns off enhanced monitoring.
  • IncreaseStreamRetentionPeriod: Lengthens the retention period.
  • StartStreamEncryption: Enables server-side encryption for the stream.
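
Because increasing and decreasing the retention period are separate operations, calling code has to pick one based on the current and desired values. The Java sketch below shows that decision only. It assumes retention is expressed in hours with a 24-hour minimum and an 8,760-hour (365-day) maximum. RetentionUpdateSketch and operationFor are hypothetical names, and the method returns the operation name rather than calling AWS.

```java
// Hypothetical helper that picks the retention API operation to call.
final class RetentionUpdateSketch {
    static final int MIN_HOURS = 24;   // assumed minimum (and default) retention
    static final int MAX_HOURS = 8760; // assumed maximum: 365 days

    /** Name of the API operation to call, or "NONE" when the retention period is unchanged. */
    static String operationFor(int currentHours, int desiredHours) {
        if (desiredHours < MIN_HOURS || desiredHours > MAX_HOURS) {
            throw new IllegalArgumentException(
                    "retention must be between " + MIN_HOURS + " and " + MAX_HOURS + " hours");
        }
        if (desiredHours > currentHours) {
            return "IncreaseStreamRetentionPeriod";
        }
        if (desiredHours < currentHours) {
            return "DecreaseStreamRetentionPeriod";
        }
        return "NONE";
    }

    public static void main(String[] args) {
        System.out.println(operationFor(24, 168)); // prints IncreaseStreamRetentionPeriod
        System.out.println(operationFor(168, 24)); // prints DecreaseStreamRetentionPeriod
    }
}
```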
