How to create AWS Transfer for SFTP Custom Identity Provider for Active Directory

Active Directory is often used in the corporate world to authenticate and authorize users at scale. SFTP is another popular protocol for data exchange, integration, and ETL (Extract, Transform, Load) processes. Integrating the two can be very beneficial for an organization. Let's see how it can be done using AWS.

What is Active Directory?

Developed by Microsoft, Active Directory is a service that stores information about various network resources and, where applicable, maps them to physical network addresses. Data in Active Directory is represented as a set of objects with attributes and relations between them. Those relations form a tree data structure:

(Image: Active Directory example)
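
An object's place in that tree is encoded in its distinguished name (DN). The snippet below, a sketch with a hypothetical user entry, shows how a DN decomposes into one relative component per level of the tree:

```javascript
// Hypothetical DN for a user object: each comma-separated component
// (RDN) is one level of the directory tree, read from leaf to root.
const dn = "cn=jdoe,ou=Users,dc=sftpdemo,dc=example,dc=com";

// Split into individual relative distinguished names (RDNs).
const rdns = dn.split(",");
console.log(rdns);
// [ 'cn=jdoe', 'ou=Users', 'dc=sftpdemo', 'dc=example', 'dc=com' ]
```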

Active Directory is accessed using the LDAP protocol. LDAP (Lightweight Directory Access Protocol) is an open standard protocol supported by many platforms and languages.

The Idea

We are going to use AWS Transfer for SFTP with custom authentication, configured to allow uploading files to S3 via SFTP using Active Directory credentials:

(Image: Active Directory and SFTP diagram)

We have the following steps involved:

  1. The client initiates an SFTP transfer.
  2. AWS Transfer for SFTP, configured to use a custom identity provider, sends a request to AWS API Gateway.
  3. AWS API Gateway invokes our custom AWS Lambda function.
  4. The AWS Lambda function verifies the authentication information against AWS Directory Service.
  5. If the authentication information is correct, data is transferred from/to S3.

Everything except step (4) is standard behavior provided and managed by AWS. We only need to implement the authentication itself.
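
On success, the identity provider must respond with HTTP 200 and a body naming an IAM role for the session; any other response means access is denied. A minimal sketch of that contract (the role ARN is a placeholder):

```javascript
// Minimal shape of a successful custom-identity-provider response.
// Role is required; a scoped-down Policy and a HomeDirectory can
// optionally be added to the body.
function buildResponse(roleArn) {
  return {
    statusCode: 200, // any non-200 response denies the login
    body: JSON.stringify({ Role: roleArn })
  };
}

const response = buildResponse("arn:aws:iam::123456789012:role/sftpUserRole");
console.log(response.statusCode, response.body);
```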

It looks rather simple, but there is a catch: AWS Directory Service can only be used inside a VPC. So if our Lambda function needs to validate credentials against Active Directory, it must also run inside a VPC.

Let's see how we can describe such infrastructure.

Infrastructure

In our company, even for demo projects we try to use infrastructure as code. This way it's easier both to share the code and to delete unneeded resources. We use tools like AWS CloudFormation, Serverless, or Terraform. In this tutorial, I'll go with Serverless + CloudFormation because it's easier to work with AWS Lambda this way.

Let's start with the VPC:

    vpc:
      Type: AWS::EC2::VPC
      Properties:
        CidrBlock: "10.0.0.0/16"
        EnableDnsHostnames: true
        EnableDnsSupport: true

We are also going to create two subnets:

    subnet1:
      Type: AWS::EC2::Subnet
      Properties:
        VpcId:
          Ref: vpc
        CidrBlock: 10.0.1.0/24
        AvailabilityZone:
          Fn::Select:
            - 0
            - Fn::GetAZs: ""
    subnet2:
      Type: AWS::EC2::Subnet
      Properties:
        VpcId:
          Ref: vpc
        CidrBlock: 10.0.2.0/24
        AvailabilityZone:
          Fn::Select:
            - 1
            - Fn::GetAZs: ""

As you can see, we don't provide Availability Zone (AZ) names. Instead, we select from the dynamic list of available AZs, since they depend on the region.

Now it's time to declare our AD itself:

    activeDirectory:
      Type: AWS::DirectoryService::SimpleAD
      Properties:
        Name: sftpdemo.example.com
        Password: ${env:AD_ADMIN_PASSWORD}
        Size: "Small"
        VpcSettings:
          SubnetIds:
            - Ref: subnet1
            - Ref: subnet2
          VpcId:
            Ref: vpc

The ${env:AD_ADMIN_PASSWORD} variable is resolved from the AD_ADMIN_PASSWORD environment variable when the "sls deploy" command is executed. It will be the password for the AD Administrator user.

Here is also the definition of the AWS Lambda function in serverless.yml:

    authorize:
        handler: handler.authorize
        role: lambdaExecutionRole
        vpc:
          securityGroupIds:
            - Ref: authorizerSecurityGroup
          subnetIds:
            - Ref: subnet1
            - Ref: subnet2
        events:
          - http:
              path: /servers/{serverId}/users/{user}/config
              method: GET
              authorizer: aws_iam
        environment:
          SFTP_USER_ROLE_ARN: !GetAtt sftpUserRole.Arn
          BUCKET_ARN:
            Fn::Join:
              - ":"
              - - "arn"
                - Ref: "AWS::Partition"
                - "s3::"
                - ${env:TRANSFER_BUCKET_NAME}
          LDAP_DIRECTORY_NAME: sftpdemo
          LDAP_DNS_NAME: sftpdemo.example.com
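
The Fn::Join in BUCKET_ARN simply glues the ARN pieces together with colons. The same operation in plain JavaScript, with a hypothetical bucket name and the aws partition substituted for AWS::Partition:

```javascript
// Mirrors the Fn::Join used for BUCKET_ARN: joining these parts with
// ":" yields a standard S3 bucket ARN, arn:aws:s3:::<bucket-name>.
const parts = ["arn", "aws", "s3::", "my-transfer-bucket"];
const bucketArn = parts.join(":");
console.log(bucketArn); // arn:aws:s3:::my-transfer-bucket
```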

You may be wondering why I'm using sftpdemo.example.com (a non-existent domain) as the LDAP server hostname. We can do this because I'm going to create a Route 53 private hosted zone inside our VPC. It will have a record pointing to our Active Directory IPs, so we can reach the directory via the sftpdemo.example.com hostname:

    privateHostedZone:
      Type: AWS::Route53::HostedZone
      Properties:
        Name: "example.com"
        VPCs:
          - VPCId:
              Ref: vpc
            VPCRegion: !Ref "AWS::Region"
    adRecordSet:
      Type: "AWS::Route53::RecordSet"
      Properties:
        HostedZoneId:
          Ref: privateHostedZone
        Name: sftpdemo.example.com.
        Type: A
        TTL: "900"
        ResourceRecords: !GetAtt activeDirectory.DnsIpAddresses
      DependsOn: privateHostedZone

The next step is to create the AWS Transfer for SFTP server. You can read our earlier blog post about it; since then, CloudFormation (and thus Serverless) has added support for the AWS Transfer server resource.

    sftpServer:
      Type: AWS::Transfer::Server
      Properties:
        IdentityProviderDetails:
          InvocationRole:
            Fn::GetAtt: ["transferInvocationRole", "Arn"]
          Url:
            Fn::Join:
              - ""
              - - https://
                - Ref: ApiGatewayRestApi
                - .execute-api.
                - Ref: "AWS::Region"
                - .amazonaws.com/${self:provider.stage}/
        IdentityProviderType: "API_GATEWAY"
        LoggingRole:
          Fn::GetAtt: ["transferLoggingRole", "Arn"]
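
Given this configuration, AWS Transfer takes the assembled Url and appends the path from our HTTP event when a user logs in, passing the user's password in a Password header. A sketch of the resulting request URL (the API ID, region, stage, server ID, and username are hypothetical placeholders):

```javascript
// How AWS Transfer invokes the identity provider: the configured Url
// plus /servers/{serverId}/users/{user}/config.
const apiId = "abc123def4"; // placeholder API Gateway ID
const region = "eu-west-1"; // placeholder region
const stage = "dev";        // placeholder deployment stage

const baseUrl = `https://${apiId}.execute-api.${region}.amazonaws.com/${stage}/`;
const requestPath = "servers/s-1234567890abcdef0/users/jdoe/config";
const requestUrl = baseUrl + requestPath;
console.log(requestUrl);
```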

To provision the infrastructure, execute the following command:

    sls deploy

Now we have all the infrastructure required and can proceed with our custom AWS Lambda function code.

Active Directory Authentication in AWS Lambda

To perform the actual authentication, we are going to use the same approach as in our previous blog post about AWS Transfer for SFTP custom identity providers. The only difference is that we perform an LDAP bind instead of an AWS Cognito authenticateUser API call. Here is the related code fragment:

    // ldapjs is assumed to be declared as a dependency in package.json.
    const ldap = require("ldapjs");

    // Connect to the directory through its private hosted zone DNS name.
    const client = ldap.createClient({
      url: `ldap://${process.env.LDAP_DNS_NAME}`
    });

    exports.handler = async function(event) {
      return new Promise((resolve, reject) => {
        const username = event.pathParameters.user;
        const password = event.headers.Password;

        console.log(`Performing authentication for user ${username}`);

        client.bind(`${username}@${process.env.LDAP_DIRECTORY_NAME}`, password, err => {
          if (err) {
            reject(err);
          } else {
            const response = {
              headers: {
                "Access-Control-Allow-Origin": "*",
                "Content-Type": "application/json"
              },
              // getSftpPolicy builds the response body for AWS Transfer
              body: getSftpPolicy(username),
              statusCode: 200
            };
            resolve(response);
          }
        });
      });
    };
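
The getSftpPolicy helper is not shown above; here is a minimal sketch of what it could return, scoping the session to the transfer bucket using the environment variables declared in serverless.yml. The exact policy statements are an assumption for illustration, not the original implementation:

```javascript
// Hypothetical sketch of getSftpPolicy: builds the response body AWS
// Transfer expects, with a scoped-down policy restricting the session
// to the transfer bucket.
function getSftpPolicy(username) {
  const bucketArn = process.env.BUCKET_ARN;
  return JSON.stringify({
    Role: process.env.SFTP_USER_ROLE_ARN,
    Policy: JSON.stringify({
      Version: "2012-10-17",
      Statement: [
        {
          // Allow listing the bucket itself
          Effect: "Allow",
          Action: ["s3:ListBucket"],
          Resource: [bucketArn]
        },
        {
          // Allow reading and writing objects inside the bucket
          Effect: "Allow",
          Action: ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
          Resource: [`${bucketArn}/*`]
        }
      ]
    })
  });
}

// Example with placeholder environment values:
process.env.SFTP_USER_ROLE_ARN = "arn:aws:iam::123456789012:role/sftpUserRole";
process.env.BUCKET_ARN = "arn:aws:s3:::my-transfer-bucket";
console.log(getSftpPolicy("jdoe"));
```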

A very important detail here is that the bind call requires the DN in the format <username>@<ldap-directory-netbios-name>. Otherwise, the bind call will fail.

Conclusion

The flexibility of AWS Transfer for SFTP custom identity providers allows implementing authentication and authorization against virtually any data source. If an organization uses Active Directory, it's possible to give Active Directory users access to AWS S3 via SFTP with fine-grained access control.

© 2016 - 2019 AgileVision sp. z o.o.