[Spark By Example] SparkSession

The following sample code (in Python and C#) shows how to use SparkSession.

SparkSession

  • SparkSession has been the entry point to a Spark application since Spark 2.0.
  • SparkSession combines the separate contexts (SparkContext, SQLContext, HiveContext, …) into a single entry point.
  • You can create as many SparkSessions as you want.
  • In a Spark shell, such as the PySpark shell, a SparkSession object (named “spark”) is created for you.
  • In an application, you need to create the SparkSession object yourself.
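
The points above can be sketched in PySpark as follows. This is a minimal sketch, not the code from the full post: the application name, the `local[*]` master, and the helper names `create_session` and `demo` are assumptions chosen for illustration, and running `demo()` requires a local PySpark installation.

```python
def create_session(app_name="spark-by-example"):
    """Create (or reuse) a SparkSession for a standalone application."""
    # Imported lazily so this module can be loaded where PySpark is absent.
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder
        .appName(app_name)      # hypothetical application name
        .master("local[*]")     # assumption: run locally on all cores
        .getOrCreate()          # returns the existing session if there is one
    )


def demo():
    """Show that several sessions can coexist on one SparkContext."""
    spark = create_session()

    # newSession() returns another SparkSession with its own SQL
    # configuration and temporary views, backed by the same SparkContext.
    other = spark.newSession()
    print("Spark version:", spark.version)
    print("Shared context:", spark.sparkContext is other.sparkContext)

    # The session also acts as the entry point for DataFrames and SQL.
    df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "label"])
    df.show()

    spark.stop()
```

In a script, calling `demo()` starts a local session, prints the version, and stops the session again. In the PySpark shell the first step is unnecessary because `spark` already exists.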

[AWS Lab] Lambda – Environment Variables

In this lab, we will learn how to use environment variables in a Lambda function.

  • Overview
  • S3
    • Create a bucket
    • Upload an image file
  • Lambda – Function
    • Get the bucket name and the image name from environment variables
    • Retrieve the object properties
    • Python, boto3
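
A handler along the lines of this outline might look like the sketch below. The environment variable names `BUCKET_NAME` and `IMAGE_NAME` are hypothetical, as is the optional `s3_client` parameter, which is added only so the function can be exercised without AWS credentials. Using `head_object` to read the properties is one reasonable choice, not necessarily what the lab itself does.

```python
import os


def lambda_handler(event, context, s3_client=None):
    """Read bucket and object names from the environment and return metadata."""
    # Hypothetical variable names; set them in the function's configuration.
    bucket = os.environ["BUCKET_NAME"]
    key = os.environ["IMAGE_NAME"]

    if s3_client is None:
        # boto3 is available in the Lambda Python runtime; imported lazily
        # so the module can also be loaded where boto3 is not installed.
        import boto3

        s3_client = boto3.client("s3")

    # head_object fetches the object's metadata without downloading its body.
    response = s3_client.head_object(Bucket=bucket, Key=key)

    return {
        "bucket": bucket,
        "key": key,
        "content_type": response.get("ContentType"),
        "content_length": response.get("ContentLength"),
        "last_modified": str(response.get("LastModified")),
    }
```

Keeping the names in environment variables means the same code can point at a different bucket or image by changing the function configuration rather than redeploying.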

[AWS Lab] Lambda Invocation via Polling – DynamoDB Stream

In this lab, we will learn how a Lambda function can be invoked via polling.

  • Overview
  • Source: DynamoDB Stream
    • Create a DynamoDB Table
    • Enable the Stream
  • Target: Lambda – Function
    • Log the received message
    • Do not return any value
    • Configure the trigger
    • Python
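
The target function described above could be sketched as follows. It assumes the standard DynamoDB Streams event shape, a `Records` list whose items carry `eventID`, `eventName`, and a `dynamodb` payload. The log format is an illustrative choice, and the trigger itself (the event source mapping between the stream and the function) is configured outside this code.

```python
import json
import logging

logger = logging.getLogger()
logger.setLevel(logging.INFO)


def lambda_handler(event, context):
    """Log each stream record delivered by the poller; return nothing."""
    for record in event.get("Records", []):
        # eventName is INSERT, MODIFY, or REMOVE for DynamoDB Streams.
        logger.info(
            "eventID=%s eventName=%s",
            record.get("eventID"),
            record.get("eventName"),
        )
        # The 'dynamodb' section holds the keys and, depending on the
        # stream view type, the old and/or new item images.
        logger.info("change=%s", json.dumps(record.get("dynamodb", {}), default=str))
    # No return statement: the function only logs what it received.
```

With polling, the Lambda service reads batches from the stream and invokes the function with them, so the handler receives a list of records rather than a single message.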