Optimizing your deployment

Before proceeding, you are encouraged to read (or at least skim) the excellent docs Fly.io provides for Scaling Machine CPU and RAM. Getting the right sized machine close to your users may be enough.

Once you have settled on a machine configuration, there are a number of things you can do to get the most out of that machine. For each item below it is worth benchmarking the results of the change with your application because, as they say, your mileage may vary.

Jemalloc

Memory allocation is key to performance, and jemalloc is an alternative malloc implementation.

Enabling it is a matter of adding the following to your Dockerfile:

RUN apt-get install libjemalloc2
ENV LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.2
ENV MALLOC_CONF=dirty_decay_ms:1000,narenas:2,background_thread:true

YJIT

A “Just In Time” compiler can make your application run faster. Starting with Ruby 3.1, YJIT has been included in the official Ruby release sources. Using it requires a build time option and enabling at runtime.

The official Ruby docker images for 3.2.0 and later, as well as Fullstaq Ruby, support YJIT and enabling it can be done with an environment variable:

ENV RUBY_YJIT_ENABLE=1

Both rvm and rbenv will install yjit by default if rustc is available.

Enabling swap

While RAM is always faster, not everything that is loaded into RAM needs equal treatment. Enabling swap can allow less frequently used regions of memory to be moved out to disk (which these days is SSD). See swap_size_mb for configuring for deploys.

Recap

While the number of “one size fits all” deployment optimizations are few and far between, testing out a new configuration is generally a matter of adding a few lines to a Dockerfile and deploying it. This makes it easy to experiment and find the combination that is right for your application.