
Scaling Rails: Docker & AWS Beanstalk

Scaling a Rails project can still be quite an interesting task.

We always want deployment to be as easy as possible for developers. At the same time, we want to keep infrastructure costs down while staying ready for a sudden, significant increase in requests per minute (RPM).

In this article we show how to automate the deployment of a Rails application with Sidekiq and AnyCable to AWS Elastic Beanstalk using GitLab CI.

Amazon AWS has always been our preferred infrastructure provider. We love to use AWS EC2 for our applications, AWS RDS for databases, and AWS ElastiCache as a key-value store.

Terminology

Due to NDA and security-related restrictions we are unable to use actual project and client names, so here is some terminology we will use in this article:

  • X-Project – this will be the name of our project;

  • x-project.com – the main domain;

  • git://x-project.com/x-project-server – the repository with the Rails application;

  • git://x-project.com/x-project-ci-image – the repository with the Docker images;

To make the code easier to explore, we pulled all the config files out into a separate repository: https://github.com/jetrockets/rails-elastic-beanstalk. This repository contains two directories:

  • x-project-ci-image contains files that were used in git://x-project.com/x-project-ci-image;

  • x-project-server contains the files that were added to our Rails application from git://x-project.com/x-project-server.

Inside x-project-server/eb/production we have the configs for all three of our Elastic Beanstalk projects. An Elastic Beanstalk project usually consists of two main components (a small example follows the list):

  • .elasticbeanstalk directory with the global configuration files for the project and the Elastic Beanstalk environment: its name, instance types, environment variables, and other environment settings;

  • .ebextensions directory, which contains a collection of configuration files for the instances. These files define instructions that are applied automatically when a new instance is created or an existing one is updated.
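To make this more concrete, here is a minimal sketch of what a .elasticbeanstalk/config.yml might look like. The environment and application names are illustrative; the real files live in the repository linked above.


# .elasticbeanstalk/config.yml (illustrative sketch)

branch-defaults:
  master:
    environment: x-project-production-rails
global:
  application_name: x-project
  default_platform: Multi-container Docker running on 64bit Amazon Linux
  default_region: us-east-1
  profile: default
  workspace_type: Application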

Use Case

Usually we deploy our servers on EC2 using Ansible scripts and Capistrano, which is a common approach. However, this approach may lead to a state that Martin Fowler referred to as SnowflakeServer. Over time, the configuration of servers starts to have too many dependencies and it becomes very difficult to replicate quickly. That's why even if you're not using Docker on EC2 machines, it's vital to make all changes to the server by modifying the Ansible scripts.

This ensures that if necessary, you can deploy a new environment relatively quickly.

One of our recent projects had to be built for a high load that was difficult to predict: a chat for online stores with enhanced retail sales capabilities. It was a great opportunity to try Elastic Beanstalk. To understand why, let's first look at what it is.

AWS Elastic Beanstalk is an orchestration service on top of other AWS resources. From a preset configuration it can quickly set up several application instances, link them to a database or a cache, and put a load balancer in front of them. With Elastic Beanstalk you can run application code on several platforms: Ruby (Puma or Passenger), Python, Java, and of course Docker containers. The Docker container option suits us, so let's explore it further.

Architecture

Our application was built using the following tech stack:

  • A Rails application (web and Sidekiq);

  • AnyCable for a websocket chat;

  • Embeddable Widgets in HTML and JS which are integrated into client e-commerce platforms;

The project required at least two environments: Production (for the live system) and Integration (for demo and testing purposes).

Let’s start with the Rails app.

Configuration

By default, Rails 6 stores credentials in separate files specific to each environment; to learn more, see the Rails Guides. We opted to use environment variables for managing settings, because this approach can provide certain advantages when working with AWS. For this, we used the dotenv-rails gem.


# git://x-project.com/x-project-server/Gemfile



gem 'dotenv-rails'

Environment variables are loaded from the .env file in the application root, which is convenient for local development. We also keep an .env.sample file in the project, which serves as a template for all project settings.

In our experience, .env files are a more convenient, clear, and consistent way to configure Rails than Rails credentials. You can define the settings in a file, copy a finished file onto a server, or pass the settings directly on the command line when starting Rails.
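As an illustration, an .env.sample for this setup might look like the sketch below. The variable names mirror the build arguments that show up later in the CI configuration; the values are placeholders.


# .env.sample (illustrative sketch)

RAILS_ENV=development
DATABASE_URL=postgres://localhost:5432/x_project_development
REDIS_URL=redis://localhost:6379/0
SECRET_KEY_BASE=
S3_ACCESS_KEY_ID=
S3_SECRET_KEY=
S3_REGION=us-east-1
S3_BUCKET=
WEBHOOKS_API_URL=
WIDGET_SRC=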

Container Registry

Amazon offers a container management tool – Amazon Elastic Container Registry. Amazon ECR is a fully managed Docker container registry that makes it easy for developers to store, deploy, and manage Docker container images.

Getting Ready

To work with git, we use a self-hosted version of GitLab, so we wanted to integrate GitLab & GitLab CI into the AWS infrastructure.

We decided to store all application settings in AWS. This means the GitLab runner should not have direct access to these settings, so we built a separate Docker image that receives the AWS keys at build time. To do this, the following Dockerfile is added to a separate repository.


# git://x-project.com/x-project-ci-image/aws-utils/Dockerfile



FROM docker:latest     



ARG AWS_ACCESS_KEY_ID

ARG AWS_SECRET_ACCESS_KEY

ARG AWS_REGION



RUN apk add --no-cache python py-pip python-dev build-base libressl-dev musl-dev libffi-dev git

RUN pip install awsebcli awscli s3cmd



RUN  aws configure set aws_access_key_id $AWS_ACCESS_KEY_ID

RUN  aws configure set aws_secret_access_key $AWS_SECRET_ACCESS_KEY 

RUN  aws configure set region $AWS_REGION 

The values for AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY & AWS_REGION are taken from GitLab environment variables at build time. To make this work we created a user in Amazon IAM with the required permissions; in our case we attached AWSElasticBeanstalkFullAccess (to work with EC2, S3, RDS, and other services), AWSElasticBeanstalkMulticontainerDocker (to work with Docker images) & ECRFullAccess (a custom policy with full ECR access). As a result, the built container can talk to the AWS API with these permissions, and the main application is built inside it.

To make it easier to run tests and release the app, we build an intermediate Docker image that is used to run RSpec and to build the images that run Rails & Sidekiq. This image already has everything we need: Node.js & Yarn for compiling JS & CSS, and the PostgreSQL client for connecting to RDS. We also need to serve static assets, so we add Nginx. We use Ruby 2.6.5, since Rails (at the time of writing) still generates a large number of warnings with Ruby 2.7.0.




# git://x-project.com/x-project-ci-image/app-image/Dockerfile



ARG RUBY_VERSION_IMAGE=2.6.5

FROM ruby:${RUBY_VERSION_IMAGE}-slim-buster



ARG APP_PATH="/app"

ARG APP_USER=app

ARG NODE_VERSION=12

ARG POSTGRES_VERSION=12



RUN adduser --disabled-password --home ${APP_PATH} --gecos '' ${APP_USER}



RUN mkdir -p ${APP_PATH} && \

    chown ${APP_USER}. ${APP_PATH}



# Install packages

RUN apt-get update -qq && \

    apt-get install -yq --no-install-recommends \

    curl \

    gnupg2 \

    lsb-release \

    gcc \

    g++ \

    git \

    openssh-client \

    libxml2-dev \

    libxslt-dev \

    libjemalloc2 \

    make \

    nginx \

    pkg-config \

    file \

    imagemagick



# Nginx settings

RUN touch /run/nginx.pid && \

    chown -R ${APP_USER}. /run/nginx.pid && \

    chown -R ${APP_USER}. /var/log/nginx && \

    chown -R ${APP_USER}. /var/lib/nginx && \

    chown -R ${APP_USER}. /etc/nginx/conf.d && \

    chown -R ${APP_USER}. /usr/share/nginx && \

    unlink /etc/nginx/sites-enabled/default



# Forward request logs to Docker log collector

RUN ln -sf /dev/stdout /var/log/nginx/access.log \

 && ln -sf /dev/stderr /var/log/nginx/error.log



# Add Node.js repo

RUN curl -fsSL https://deb.nodesource.com/gpgkey/nodesource.gpg.key | apt-key add - && \

    echo "deb https://deb.nodesource.com/node_${NODE_VERSION}.x $(lsb_release -s -c) main" | tee /etc/apt/sources.list.d/nodesource.list && \

    echo "deb-src https://deb.nodesource.com/node_${NODE_VERSION}.x $(lsb_release -s -c) main" | tee -a /etc/apt/sources.list.d/nodesource.list



# Add yarn repo

RUN curl -fsSL https://dl.yarnpkg.com/debian/pubkey.gpg | apt-key add - && \

    echo "deb https://dl.yarnpkg.com/debian/ stable main" | tee /etc/apt/sources.list.d/yarn.list



# Install packages

RUN apt-get update -qq && \

    apt-get install -qy --no-install-recommends \

    nodejs \

    yarn



RUN curl -fsSL https://www.postgresql.org/media/keys/ACCC4CF8.asc | apt-key add - &&\

    echo "deb http://apt.postgresql.org/pub/repos/apt/ $(lsb_release -sc)-pgdg main" | tee /etc/apt/sources.list.d/postgresql.list



# Install dependencies for the pg gem

RUN apt-get update -qq && \

    apt-get install -qy --no-install-recommends \

    libpq-dev \

    postgresql-client-${POSTGRES_VERSION}



COPY --chown=${APP_USER}:${APP_USER} . ${APP_PATH}



USER ${APP_USER}



ENV GEM_HOME=${APP_PATH}/bundle \

    BUNDLE_PATH=${APP_PATH}/bundle \

    BUNDLE_BIN=${APP_PATH}/bundle/bin \

    PATH=${APP_PATH}/bin:${APP_PATH}/bundle/bin:/usr/local/bin:$PATH



WORKDIR ${APP_PATH}



# Install dependencies

# Ruby 2.6.5 ships with bundler 1.17.2, so install a newer version

RUN gem install bundler -v '2.1.4'



RUN bundle install --jobs=$(nproc)

RUN yarn install --check-files

Both images are built with the following CI script.


# git://x-project.com/x-project-ci-image/.gitlab-ci.yml



variables:

   RUBY_VERSION_IMAGE: "2.6.5"

   NODE_VERSION: "12"

   APP_PATH: "/app"

   APP_USER: "app"



stages: 

  - build



build_image:

  # Docker image with pre-installed docker package 

  image: docker:latest

  stage: build

  before_script:

    # login to git://x-project.com/x-project-ci-image and pull images from the registry

    - docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY

    # Pull the existing images to use them as build cache.

    # The "|| true" keeps the pipeline from failing if the images don't exist yet

    - docker pull $CI_REGISTRY/x-project/x-project-ci-image/app-image:latest || true

    - docker pull $CI_REGISTRY/x-project/x-project-ci-image/aws-utils:latest || true



  script:

    # Build docker image with AWS utils and AWS secrets

    - docker build --cache-from $CI_REGISTRY/x-project/x-project-ci-image/aws-utils:latest

      -t $CI_REGISTRY/x-project/x-project-ci-image/aws-utils

        --build-arg AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID} 

        --build-arg AWS_REGION=${AWS_REGION}

        --build-arg AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY}

        aws-utils



    # Build docker image with pre-installed linux packages, gems and frontend packages    

    - docker build --cache-from $CI_REGISTRY/x-project/x-project-ci-image/app-image:latest 

      -t $CI_REGISTRY/x-project/x-project-ci-image/app-image

        --build-arg APP_PATH=${APP_PATH}

        --build-arg APP_USER=${APP_USER}

        --build-arg NODE_VERSION=${NODE_VERSION}

        --build-arg RUBY_VERSION_IMAGE=${RUBY_VERSION_IMAGE}

        app-image



    # login to git://x-project.com/x-project-ci-image to push images

    - docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY



    # Push images to the git://x-project.com/x-project-ci-image registry

    # See git://x-project.com/x-project-ci-image/container_registry

    - docker push $CI_REGISTRY/x-project/x-project-ci-image/aws-utils:latest    

    - docker push $CI_REGISTRY/x-project/x-project-ci-image/app-image:latest

Everything described above can be illustrated in the following diagram.

(Diagram: the ci-image build pipeline)

Building an App

We use the GitLab runner to build the application image, just as we did for the intermediate images.

Building and deployment of the project begin after the code gets into the master or integration branch. Currently, our pipeline consists of five stages.

(Diagram: the application pipeline)

Let’s skip notify & rspec to take a closer look at how build & deploy are going.


#.gitlab-ci.yml



docker-build:

  image: $BASE_IMAGES_URL:$CI_PROJECT_NAMESPACE-aws-utils

  stage: build

  environment:

    name: $CI_COMMIT_BRANCH

  only:

    - master

    - integration

  script:

    - $(aws ecr get-login --no-include-email --profile default)

    - aws s3 cp s3://s3.environments.x-project.com/.env.$RAILS_ENV .env

    - aws s3 cp s3://s3.environments.x-project.com/$RAILS_ENV.key config/credentials/$RAILS_ENV.key

    - docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY

    - docker pull $BASE_IMAGES_URL:$CI_PROJECT_NAMESPACE-app-image

    - docker build -t $REPOSITORY_URL:$REPOSITORY_TAG

      -t $REPOSITORY_URL:$RAILS_ENV-latest

      --build-arg APP_PATH=$APP_PATH

      --build-arg DATABASE_URL=$DATABASE_URL

      --build-arg RAILS_ENV=$RAILS_ENV

      --build-arg REDIS_URL=$REDIS_URL

      --build-arg S3_ACCESS_KEY_ID=$S3_ACCESS_KEY_ID

      --build-arg S3_BUCKET=$S3_BUCKET

      --build-arg S3_REGION=$S3_REGION

      --build-arg S3_SECRET_KEY=$S3_SECRET_KEY

      --build-arg SECRET_KEY_BASE=$SECRET_KEY_BASE

      --build-arg WEBHOOKS_API_URL=$WEBHOOKS_API_URL

      --build-arg WIDGET_SRC=$WIDGET_SRC .

    - docker push $REPOSITORY_URL:$REPOSITORY_TAG



    # "Hypervisor" settings

    - docker tag $REPOSITORY_URL:$REPOSITORY_TAG $REPOSITORY_URL:$RAILS_ENV-latest

    - docker push $REPOSITORY_URL:$RAILS_ENV-latest

To build the image we use the previously prepared aws-utils image with ready-to-go AWS settings. Then we copy the file with the environment variables from S3 and run the build based on the app-image we built earlier. As a result, we get a universal image that can run rails, sidekiq, & anycable-rails.

To be able to quickly roll back to a previous version of the application without an additional build, we tag images with the environment name and a short commit hash (for example production-2a4521a1). The most recent image for each environment also gets an ENVIRONMENT-latest tag; the same tag is used on the hypervisors (we'll talk about them below). As a result, a rollback to any version stored in the registry takes no more than a minute, as shown in the sketch below.
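For example, a manual rollback could look roughly like the sketch below. This script is not part of the original repository; it simply reuses the same sed trick as the deploy script shown later to point Dockerrun.aws.json at an older tag.


# rollback.sh (illustrative sketch)

#!/bin/sh
# Redeploy an image that is still stored in ECR, without rebuilding anything
OLD_TAG=production-2a4521a1   # any tag that still exists in the registry

cd eb/production/rails
sed -i "s/tag/$OLD_TAG/" Dockerrun.aws.json
eb deploy --profile default --label rollback-$OLD_TAG --timeout 30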

Pushing configuration files to AWS EB

So, we have built the images; now we only need to deliver them to Elastic Beanstalk.


docker-deploy:

  image: $BASE_IMAGES_URL:$CI_PROJECT_NAMESPACE-aws-utils

  stage: deploy

  environment:

    name: $CI_COMMIT_BRANCH

  only:

    - master

    - integration

  script:

    - ./bin/ci-deploy.sh

For convenience, we wrote a small script that runs in the aws-utils image and can therefore work with the AWS API. All that is left to do is run the eb deploy command with the corresponding parameters in the directory where the .elasticbeanstalk configuration is located.

At this stage we already have an image tagged in the RAILS_ENV-SHORT_SHA_COMMIT format; we now need to tell Elastic Beanstalk, via the configuration file, to pull exactly this version. To do that, we use sed to replace the tag placeholder in Dockerrun.aws.json with RAILS_ENV-SHORT_SHA_COMMIT.


# bin/ci-deploy.sh



#!/bin/sh
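# Feed the three deploy commands below (one per heredoc line) to xargs,
# which evals each line in its own shell and runs up to three of them in parallel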

xargs -P 3 -I {} sh -c 'eval "$1"' - {} << "EOF"

cd eb/$RAILS_ENV/rails; sed -i 's/tag/'$RAILS_ENV-$CI_COMMIT_SHORT_SHA'/' Dockerrun.aws.json; eb deploy --profile default --label $LABEL-rails --timeout 30



cd eb/$RAILS_ENV/sidekiq; sed -i 's/tag/'$RAILS_ENV-$CI_COMMIT_SHORT_SHA'/' Dockerrun.aws.json; eb deploy --profile default --label $LABEL-sidekiq --timeout 30



cd eb/$RAILS_ENV/anycable-rails;  sed -i 's/tag/'$RAILS_ENV-$CI_COMMIT_SHORT_SHA'/' Dockerrun.aws.json; eb deploy --profile default --label $LABEL-anycable-rails --timeout 30

EOF

After the image is successfully deployed to Elastic Beanstalk, each container starts according to the definition in Dockerrun.aws.json; a sketch of that file follows.
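For reference, a Dockerrun.aws.json for the rails environment could look roughly like this. It is a sketch of the multi-container format rather than the actual file: the memory limit and port mapping are assumptions, while tag is the literal placeholder that ci-deploy.sh replaces.


# eb/production/rails/Dockerrun.aws.json (illustrative sketch)

{
  "AWSEBDockerrunVersion": 2,
  "containerDefinitions": [
    {
      "name": "app",
      "image": "some-address.dkr.ecr.us-east-1.amazonaws.com/x-project-server:tag",
      "essential": true,
      "memory": 1024,
      "portMappings": [
        { "hostPort": 80, "containerPort": 3000 }
      ]
    }
  ]
}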

Balancers and Autoscaling

Now that we have prepared the environments and configured the deploy process, we are fully ready for production. However, during a final performance check we see WebSocket connection errors.

A couple of hours of digging through the EB components later, we discover that the AWS Application Load Balancer doesn't want to route traffic to the anycable-rails port. The reason is a failed health check: the balancer isn't getting the 200 or 204 response it expects from that port. Looking closer at anycable-rails, it turns out that by default it listens on port 50051 and speaks HTTP/2 (gRPC), while the ALB health check expects plain HTTP. An HTTP health check can be enabled in anycable-rails with the --http-health-port option, but ALB cannot health-check one port while proxying traffic to another. At this point we realize we could proxy traffic to anycable-rails through Nginx and handle the health-check requests there as well.
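Roughly, the idea looks like the sketch below. This is not the final config: the listening port, the health-check port, and the paths are assumptions, and anycable-rails is assumed to expose its plain HTTP health check via --http-health-port.


# nginx.conf (fragment, illustrative sketch)

server {
  # the balancer's target group points here instead of directly at anycable-rails
  listen 50052 http2;

  # health-check requests go to the plain HTTP endpoint of anycable-rails
  location = /health {
    proxy_pass http://127.0.0.1:54321/health;
  }

  # everything else is gRPC traffic for the anycable-rails RPC server
  location / {
    grpc_pass grpc://127.0.0.1:50051;
  }
}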


After reconfiguring Nginx and re-checking, the ALB accepts the health-check response and traffic successfully reaches the container with anycable-rails. However, we immediately run into a new error on the anycable-go side.


context=ws Websocket session initialization failed: rpc error: code = Unavailable

desc = all SubConns are in TransientFailure, latest connection error: connection

closed

So we go back to anycable-rails to carefully check the logs.


"PRI * HTTP/2.0" 400 173 "-" "-"

We suspect the problem is with ALB, which terminates HTTPS on its side but sends requests to the host over plain HTTP. After creating a self-signed certificate and configuring Nginx to listen on the port over HTTP/2 with that certificate and forward gRPC, we get the same error again.


"PRI * HTTP/2.0" 400 173 "-" "-”

We continued tweaking the EB configuration and set up full end-to-end encryption, but unfortunately this didn't help. Finally, we tried to work with anycable-rails directly, without ALB, and it didn't work out either. It became clear that the problem was in ALB, even though according to the documentation ALB supports HTTP/2: ALB terminates HTTP/2 on its side but sends requests to the target over HTTP/1.1, while gRPC requires HTTP/2.

Then we remembered that, apart from ALB, Elastic Beanstalk also supports a second type of load balancer: the Network Load Balancer (NLB), which, unlike ALB, passes traffic through without touching it. We decided to try it out. As a bonus, separating the application and AnyCable environments lets us autoscale them more precisely. As a result, we arrived at the following scheme for our service.

(Diagram: the cunning plan, our final service architecture)
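For the record, the load balancer type is chosen when the environment is created; one way to request an NLB is an option setting like the sketch below (the file name is ours, and the same thing can be done with eb create --elb-type network).


# .ebextensions/00_load_balancer.config (illustrative sketch)

option_settings:
  aws:elasticbeanstalk:environment:
    LoadBalancerType: network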

Hypervisors

Elastic Beanstalk does not assume you will use its EC2 instances the way we are used to: you can, of course, connect via SSH with a key, but setting up fine-grained permissions for that through IAM is problematic. Only administrators should have access to such instances, and only to resolve extraordinary cases.

So the first question is how to give developers access to the rails console.

We decided to create two micro-instances (t2.micro) on EC2, one for each environment, and called them hypervisors. Developers can use these instances to perform various tasks: launching the rails console, connecting to RDS or ElastiCache, viewing logs, and so on.

We installed Docker on both instances and created an IAM user (registry-user) with two policies: AmazonEC2ContainerRegistryReadOnly & CloudWatchReadOnlyAccess. These policies allow the instances to pull Docker images from the registry and read CloudWatch logs. To make these operations easier we wrote two tooling bash scripts.

Tooling

The first script opens the rails console. The integration and production variants differ only in the ECR addresses. This is why it is important that CI creates the integration-latest & production-latest tags: the script uses them to pull the current image.


#!/bin/bash

COMMAND="$@"

if [ -z "$COMMAND" ]; then

  COMMAND="rails c"

fi



$(aws ecr get-login --no-include-email --profile default)

docker pull some-address.dkr.ecr.us-east-1.amazonaws.com/x-project-server:integration-latest

docker run -it --rm some-address.dkr.ecr.us-east-1.amazonaws.com/x-project-server:integration-latest $COMMAND

The second script lets us view logs from different instances in the console. By default Docker sends all logs to stdout, where nobody will ever read them, so we write them to files and stream those files to CloudWatch. The catch is that the multi-container Docker platform does not ship container logs to CloudWatch on its own. To solve this you need to add .ebextensions settings for the CloudWatch agent and configure log volumes in Dockerrun.aws.json.
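As an illustration, the relevant fragment of a container definition might look like the sketch below. It relies on the awseb-logs-&lt;container-name&gt; volumes that the multi-container platform creates on the host under /var/log/containers/; the container path is an assumption based on the Rails default log directory.


# Dockerrun.aws.json (fragment, illustrative sketch)

{
  "name": "app",
  "mountPoints": [
    {
      "sourceVolume": "awseb-logs-app",
      "containerPath": "/app/log"
    }
  ]
}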

Now that all the logs are in CloudWatch, we can build the tooling. To fetch the logs we use awslogs. Below is a sample script for integration.


#!/bin/bash



service=$1

minutes=$2

if [ -z "$minutes" ]; then

  minutes=20

fi



case $service in

  rails)

    awslogs get /aws/elasticbeanstalk/Application-integration/var/log/containers/app/integration.log --start=${minutes}'m ago'

  ;;

  anycable)

    awslogs get /aws/elasticbeanstalk/Anycable-rails-integration/var/log/containers/RPC/integration.log --start=${minutes}'m ago'

  ;;

  nginx)

    awslogs get /aws/elasticbeanstalk/Application-integration/var/log/containers/nginx/access.log --start=${minutes}'m ago'

    echo "↓--- NGNIX ERROR LOGS ---↓"

    awslogs get /aws/elasticbeanstalk/Application-integration/var/log/containers/nginx/error.log --start=${minutes}'m ago'

  ;;

  sidekiq)

    awslogs get /aws/elasticbeanstalk/Sidekiq-integration/var/log/eb-docker/containers/eb-current-app/stdouterr.log --start=${minutes}'m ago'

  ;;

  *)

    echo "Supported serices rails, anycable, nginx, sidekiq."

    echo "Usage ./logs ralis 30"

    echo "Output will be logs from rails env from the last 30 minutes"

    echo "If output is emply, it seems last N (default 20) minuters no logs from app"

    exit 1

  ;;

esac

Key Takeaways

AWS Elastic Beanstalk seems to be the best solution for scalable Rails hosting. After the initial, admittedly complex, configuration, every new project can be deployed to it within an hour.

The most valuable thing is that developers don't need any additional help from a DevOps team to scale their apps. All the information about current application performance is available in the AWS dashboards, and the scaling configuration can be edited directly in the web interface. Another advantage: if your project only has peak load a few days a month (e.g. report generation at the end of the month), you don't pay for idle instances the rest of the month.
