Elixir - Deployment


guides:

  1. https://elixirforum.com/t/elixir-deployment-tools-general-discussion-blog-posts-wiki/827
  2. https://hackernoon.com/state-of-the-art-in-deploying-elixir-phoenix-applications-fe72a4563cd8
  3. https://dustinfarris.gitbooks.io/phoenix-continuous-deployment/content/
  4. https://jimmy-beaudoin.com/posts/elixir/phoenix-deployment/ (simple, distillery)
  5. https://groups.google.com/forum/#!topic/elixir-lang-talk/zobme8NvlZ4
  6. https://medium.com/@zek/deploy-early-and-often-deploying-phoenix-with-edeliver-and-distillery-part-two-f361ef36aa10
  7. https://habrahabr.ru/post/320096/
  8. http://fletchermoore.me/blog/notes-on-deploying-phoenix/ (simple, edeliver)

official documentation:

  1. https://hexdocs.pm/phoenix/deployment.html
  2. https://github.com/edeliver/edeliver/wiki/Configuration-(.deliver-config)

NOTE: all paths on production host are specified relative to application directory located at $DELIVER_TO/<app_name>/.

configuration

all edeliver hooks: https://github.com/edeliver/edeliver/wiki/Run-additional-build-tasks.

secrets

  1. https://hexdocs.pm/phoenix/deployment.html#handling-of-your-application-secrets

there are 2 alternative approaches to deal with secrets:

assets

  1. https://hexdocs.pm/phoenix/deployment.html#compiling-your-application-assets

compilation of static assets consists of 2 steps:

assets are used

  1. http://blog.plataformatec.com.br/2016/06/deploying-elixir-applications-with-edeliver/
  2. https://github.com/edeliver/edeliver/wiki/Run-additional-build-tasks

assets are not used

If you are not serving or don’t care about assets at all, you can just remove the cache_static_manifest configuration from config/prod.exs.

so if application doesn’t deal with assets remove this line in config/prod.exs:

  config :billing, BillingWeb.Endpoint,
    load_from_system_env: true,
-   url: [host: "example.com", port: 80],
-   cache_static_manifest: "priv/static/cache_manifest.json"
+   url: [host: "example.com", port: 80]

artifacts (resources, resource files - say, YAML or JSON files)

https://elixirforum.com/t/including-data-files-in-a-distillery-release/2813:

The traditional place to put non-code resources that are needed at runtime is the priv folder. All the tools are aware of this convention and preserve proper paths.

https://elixirforum.com/t/is-it-possible-to-include-resource-files-when-packaging-my-project-using-mix-escript/730/4:

“priv” is like “OTP” where its name made sense at the beginning but today it has grown beyond that. All it matters now is that we put in the “priv” directory any artifact that you need in production alongside your code.

https://stackoverflow.com/a/32097896:

Elixir applications care about two directories: 1. ebin (which is where you put compiled code) and 2. priv (auxiliary files that you need to run your software in production, like static files). If you rely on a file that is not in any of those directories, things can break when running in production or building releases.

2 ways to get priv/ directory itself (or the file inside it) at runtime:

for example:

defmodule Neko.Reader do
  @rules_path Application.app_dir(:neko, "priv/rules.yml")
  # ...
end
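note that a module attribute like @rules_path above is evaluated when the module is compiled, so the path it captures is the one on the build machine, which may not exist on production host. a minimal sketch of resolving the path on every call instead, using two standard calls: Application.app_dir/2 and :code.priv_dir/1 (PrivPaths module and its function names are hypothetical, they are not part of neko):

```elixir
defmodule PrivPaths do
  # Both functions resolve the path at call time, so the result points
  # into the directory where the release is actually installed.

  # Raises ArgumentError if `app` is unknown.
  def via_app_dir(app, relative) do
    Application.app_dir(app, Path.join("priv", relative))
  end

  # :code.priv_dir/1 returns a charlist or {:error, :bad_name}.
  def via_code_priv_dir(app, relative) do
    case :code.priv_dir(app) do
      {:error, :bad_name} -> {:error, :unknown_app}
      dir -> {:ok, Path.join(List.to_string(dir), relative)}
    end
  end
end

# :elixir is used here only because it is always available
IO.puts(PrivPaths.via_app_dir(:elixir, "rules.yml"))
IO.inspect(PrivPaths.via_code_priv_dir(:elixir, "rules.yml"))
```

in Neko.Reader this would mean replacing the attribute with a private function, say defp rules_path, that calls Application.app_dir(:neko, "priv/rules.yml").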

however neither way works in config/config.exs because the application is not started yet (and hence unknown) when the config file is being evaluated - in this case just reference priv/ directory directly using a relative path:

# config/config.exs

config :neko, :rules, dir: "priv/rules"

this should work in most cases. however priv/ directory wasn’t found using the relative path when I tried to access it from inside a release task (see https://hexdocs.pm/distillery/guides/running_migrations.html) - a relative path is resolved against the current working directory, which is apparently different there. I had to update the application environment value in the task itself:

new_rules_config =
  :neko
  |> Application.get_env(:rules)
  |> update_in([:dir], &Application.app_dir(:neko, &1))

Application.put_env(:neko, :rules, new_rules_config)

endpoint

url option

set host and port to be used in generated URLs using url option.

config/prod.exs:

  config :billing, BillingWeb.Endpoint,
    load_from_system_env: true,
-   url: [host: "example.com", port: 80]
+   url: [host: "billing.***.com", port: 80]

server option

  1. https://hexdocs.pm/phoenix/Phoenix.Endpoint.html
  2. https://elixirforum.com/t/how-can-i-see-what-port-a-phoenix-app-in-production-is-actually-trying-to-use/5160/10

Runtime configuration

:server - when true, starts the web server when the endpoint supervision tree starts. Defaults to false. The mix phx.server task automatically sets this to true.

config/prod.exs (see also auto-generated comment titled Using releases):

  config :billing, BillingWeb.Endpoint,
    load_from_system_env: true,
-   url: [host: "billing.***.com", port: 80]
+   url: [host: "billing.***.com", port: 80],
+   server: true

if web server is not started you’ll get Connection refused error when trying to send any request to application.

load_from_system_env and http options

Distillery

default generated config will do in most cases (unless you need to set up stage environment, for example).

Distillery config (rel/config.exs)

EVM config (rel/vm.args)

  1. http://erlang.org/doc/man/erl.html

EVM = Erlang VM (aka BEAM).

http://ds.cs.ut.ee/courses/course-files/To303nis%20Pool%20.pdf:

The whole runtime system together with the language, interpreter, memory handling is called the Erlang Runtime System (ERTS), but the virtual machine is often referred to also as the BEAM.

when using Distillery EVM flags are set in rel/vm.args (or any other file set with vm_args setting for specific environment in rel/config.exs).

options set there are passed as is to EVM process:

/home/billing/prod/billing/erts-9.0/bin/beam.smp -Bd\
  -- -root /home/billing/prod/billing\
  -progname home/billing/prod/billing/releases/0.0.1/billing.sh\
  -- -home /home/billing\
  ...
  -name billing_prod@127.0.0.1\
  -setcookie <cookie>\
  -smp auto\
  ...

NOTE: application must be restarted after changing EVM flags.
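a sketch of pointing a specific release environment at its own flags file with vm_args setting (rel/prod.vm.args is a hypothetical file name, and the other settings are just the usual generated defaults):

```elixir
# rel/config.exs (fragment) - assumes Distillery 1.x config DSL

environment :prod do
  set include_erts: true
  set include_src: false
  # hypothetical path: flags from this file are used instead of rel/vm.args
  set vm_args: "rel/prod.vm.args"
end
```

the file itself keeps the same format as rel/vm.args (one flag per line, like -name and -setcookie shown in the process listing above).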

Chef

deployment

install and configure edeliver

inter alia, create config (.deliver/config) manually as instructed in README.

auto-versioning

  1. https://github.com/edeliver/edeliver/wiki/Auto-Versioning

auto-versioning allows appending version metadata to the release version (by default the release version is equal to the application version, as configured in rel/config.exs, while the application version is set in mix.exs):

.deliver/config:

AUTO_VERSION=build-date+git-branch+git-revision

resulting release version (example): neko_0.2.0+20180527-master-6bfed7e (neko is application name here).

NOTE: application version must be incremented manually in mix.exs.

build and deploy release

NOTE: push all changes to github!!! when building new release on build server edeliver fetches repo from github (just like Capistrano).

  1. http://blog.plataformatec.com.br/2016/06/deploying-elixir-applications-with-edeliver/

$ mix edeliver build release
$ mix edeliver deploy release production

or the same in one go:

$ mix edeliver update production

NOTE: edeliver build command doesn’t allow specifying the target environment - it’s set using --mix-env option which defaults to prod:

$ mix edeliver --help
...
--mix-env=<env>   Build with custom mix env $MIX_ENV. Default is 'prod'

restart application

it’s necessary to restart application after deploying (otherwise previous release will still be running).

run migrations

  1. https://github.com/edeliver/edeliver/issues/81

NOTE: migrations should be run after restarting application - otherwise new release is not loaded yet and migrations are not seen (they will be shown as pending afterwards).

$ mix edeliver migrate production

ALWAYS use --version option when running edeliver migrate production down command or else it will roll back all migrations (effectively deleting all data):

$ mix edeliver migrate production down --version=20170728105044

or else just don’t use change/0 functions in migrations - only up/0 and down/0 ones making migrations irreversible.

ping node and check release version

$ mix edeliver ping production
$ mix edeliver version production

rollback release

NOTE: migrations should be rolled back before restarting application - they might not be available once the previous release is loaded.

$ mix edeliver deploy release production --version=<previous_release_version>
$ mix edeliver migrate production down --version=<previous_migration_version>
$ ssh devops@billing sudo systemctl restart billing_prod

release version is defined as AUTO_VERSION in .deliver/config (see auto-versioning section above).

tasks and commands

[local] edeliver tasks

  1. https://hexdocs.pm/edeliver/Mix.Tasks.Edeliver.html

on tasks vs. commands: the wiki and mix edeliver --help use tasks and commands interchangeably, but to be precise edeliver itself is a Mix task while build release is a specific edeliver command (so it’s edeliver build release command rather than edeliver build release task).

but for simplicity I might refer to edeliver commands as tasks as well.

[remote] application commands

console, foreground, start - application boot commands.

[remote] systemd commands

[remote] custom commands

  1. https://hexdocs.pm/distillery/guides/running_migrations.html
  2. https://dockyard.com/blog/2018/08/23/announcing-distillery-2-0
  3. http://blog.plataformatec.com.br/2016/04/running-migration-in-an-exrm-release/

custom commands make it possible to, say, run migrations or build ES indexes directly on production host:

# lib/reika/release_tasks.ex

# https://hexdocs.pm/mix/Mix.Tasks.Release.html#module-one-off-commands-eval-and-rpc
defmodule Reika.ReleaseTasks do
  @start_apps [
    :crypto,
    :ssl,
    :postgrex,
    :ecto_sql,
    :elasticsearch
  ]

  @app :reika
  @repos Application.get_env(@app, :ecto_repos, [])
  @es_cluster Reika.ES.Cluster
  @es_indexes [:reika_shops]

  # https://hexdocs.pm/distillery/guides/running_migrations.html
  def eval_migrate do
    # configuration of apps is not available unless they are loaded
    start_apps()
    start_repos()

    IO.puts("Running migrations...")
    Enum.each(@repos, &run_migrations_for/1)

    stop_apps()
  end

  # https://hexdocs.pm/elasticsearch/distillery.html
  def eval_build_es_indexes do
    # configuration of apps is not available unless they are loaded
    start_apps()
    start_repos()
    start_es_cluster()

    IO.puts("Building ES indexes...")

    Enum.each(@es_indexes, fn es_index ->
      new_es_config =
        @app
        |> Application.get_env(@es_cluster)
        |> update_in(
          [:indexes, es_index, :settings],
          &Application.app_dir(@app, &1)
        )

      Application.put_env(@app, @es_cluster, new_es_config)

      # restart ES Cluster so that configuration is re-read
      GenServer.stop(@es_cluster)
      start_es_cluster()

      Elasticsearch.Index.hot_swap(@es_cluster, es_index)
    end)

    stop_apps()
  end

  # -----------------------------------------------------------------
  # migrations
  # -----------------------------------------------------------------

  defp run_migrations_for(repo) do
    migrations_path = priv_path_for(repo, "migrations")
    Ecto.Migrator.run(repo, migrations_path, :up, all: true)
  end

  defp priv_path_for(repo, filename) do
    app = Keyword.get(repo.config, :otp_app)

    repo_underscore =
      repo
      |> Module.split()
      |> List.last()
      |> Macro.underscore()

    priv_dir = "#{:code.priv_dir(app)}"
    Path.join([priv_dir, repo_underscore, filename])
  end

  # -----------------------------------------------------------------
  # start/stop helpers
  # -----------------------------------------------------------------

  defp start_apps do
    IO.puts("Loading #{@app}..")
    # Load the code for @app, but don't start it
    Application.load(@app)

    IO.puts("Starting apps..")
    Enum.each(@start_apps, &Application.ensure_all_started/1)
  end

  defp start_repos do
    IO.puts("Starting repos..")
    # > Ecto requires a pool size of at least 2 to support
    # > concurrent migrators.
    # > When migrations run, Ecto uses one connection to
    # > maintain a lock and another to run migrations.
    Enum.each(@repos, & &1.start_link(pool_size: 2))
  end

  defp start_es_cluster do
    IO.puts("Starting ES cluster..")
    @es_cluster.start_link()
  end

  defp stop_apps do
    IO.puts("Success!")
    :init.stop()
  end
end

on production host:

$ bin/reika stop
$ bin/reika eval "Elixir.Reika.ReleaseTasks.eval_migrate()"

locations on production host

logging

generally Elixir application log is written to EVM log file:

$ tail -f var/log/erlang.log.1

but since application service is managed by systemd all logs are sent to systemd journal (as configured in systemd service unit):

$ journalctl --no-tail --since yesterday -fu billing_prod

NOTE: don’t use -e and -n options (-e implies -n1000) - they cause some lines not to be printed (IDK why).

when application is started via systemd service unit:

when application is started manually (service is stopped):

log level

colorizing

persistent systemd journal

  1. https://www.freedesktop.org/software/systemd/man/journald.conf.html
  2. https://habrahabr.ru/company/selectel/blog/264731/

/etc/systemd/journald.conf (no changes):

[Journal]
#Storage=auto

with Storage=auto (default value) logs will be persisted on disk if /var/log/journal/ directory exists - so create one with Chef.

view specific boot:

$ journalctl --list-boots
-1 e833ad1ae9f34f89b851d08b9ad55ee0 Wed 2017-08-16 19:03:28 UTC—Wed 2017-08-16 21:54:57 UTC
 0 c4ef341537734dc18235c9e8d2d7a76a Wed 2017-08-16 21:55:35 UTC—Thu 2017-08-17 18:49:56 UTC
$ journalctl -b 0

parameter filtering

  1. https://hexdocs.pm/phoenix/Phoenix.Logger.html
  2. https://docs.appsignal.com/elixir/configuration/parameter-filtering.html

config/prod.exs:

config :phoenix,
  :filter_parameters, ["password", "number", "exp_date"]

formatting

  1. https://hexdocs.pm/logger/Logger.html

config/prod.exs:

# Do not include time - it's provided by systemd journal
config :logger, :console, format: "$metadata[$level] $message\n"
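to see what a line looks like with this pattern without deploying, the pattern can be compiled and applied by hand with Logger.Formatter. a small sketch (the message, timestamp and metadata values are made up):

```elixir
# Compile the same pattern that is passed to the console backend above
pattern = Logger.Formatter.compile("$metadata[$level] $message\n")

# The timestamp is required by format/5 but is not printed because
# the pattern contains neither $date nor $time
timestamp = {{2017, 8, 16}, {21, 55, 35, 0}}
metadata = [request_id: "abc123"]

line =
  pattern
  |> Logger.Formatter.format(:info, "GET /v1/users", timestamp, metadata)
  |> IO.iodata_to_binary()

IO.write(line)
```

unlike this sketch, the console backend prints only the metadata keys listed in its :metadata option.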

hot upgrades

https://hackernoon.com/state-of-the-art-in-deploying-elixir-phoenix-applications-fe72a4563cd8

The downside is that you need to migrate data structures in your application. Deployment is no longer a no-brainer (as it should be in the continuous deployment world).

Simply restart.

https://hexdocs.pm/distillery/walkthrough.html#building-an-upgrade-release

You do not have to use hot upgrades, you can simply do rolling restarts by running stop, extracting the new release tarball over the top of the old, and running start to boot the release.

testing production release locally

  1. https://hexdocs.pm/distillery/terminology.html
  2. https://hexdocs.pm/distillery/walkthrough.html

create production database

$ psql -d postgres
=# CREATE USER billing_prod WITH PASSWORD 'billing_prod';
=# ALTER USER billing_prod CREATEDB;
$ mix ecto.setup
$ sudo ln -s $PWD/config/prod.secret.exs /var/prod.secret.exs

build production release

  1. https://hexdocs.pm/distillery/introduction/installation.html#your-first-release

https://hexdocs.pm/distillery/walkthrough.html#deploying-your-release

The artifact you will want to deploy is the release tarball, which is located at _build/prod/rel/<name>/releases/<version>/<name>.tar.gz.

https://hexdocs.pm/phoenix/deployment.html#putting-it-all-together:

# fetches only dependencies available in prod environment
# (skips the ones declared with `only: :dev` or `only: :test`)
$ mix deps.get --only prod
# compiles project into _build/prod/lib/ directory
$ MIX_ENV=prod mix compile
# builds release in _build/prod/rel/ directory
$ MIX_ENV=prod mix release

by default Distillery uses release environment which matches the value of MIX_ENV (that is Mix.env()):

# rel/config.exs

use Mix.Releases.Config,
  # ...
  # This sets the default environment used by `mix release`
  default_environment: Mix.env()

but it’s possible to use another release environment with --env flag: in this case release will be compiled and built using specified release environment and current Mix environment simultaneously:

# both configurations are used:
#
# - `staging` release environment configuration from rel/config.exs
# - `prod` Mix environment configuration from config/prod.exs
$ MIX_ENV=prod mix release --env=staging

the Mix environment (not the release environment) also determines the location where the project is compiled and built - say, it’s _build/prod/ in case of MIX_ENV=prod.

in all examples I’ve seen MIX_ENV=prod and --env=prod are used together but it’s sufficient to use MIX_ENV=prod only because in this case release environment is automatically set to prod in rel/config.exs.
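a sketch of how a separate staging release environment might be declared next to prod one (staging block and its settings are assumptions for illustration, the rest follows the usual generated config):

```elixir
# rel/config.exs (fragment) - assumes Distillery 1.x config DSL

use Mix.Releases.Config,
  default_release: :default,
  # release environment follows MIX_ENV unless --env is given
  default_environment: Mix.env()

environment :prod do
  set include_erts: true
  set include_src: false
end

# hypothetical extra release environment, selected with `--env=staging`
environment :staging do
  set include_erts: true
  set include_src: false
end

release :billing do
  set version: current_version(:billing)
end
```

with such a config MIX_ENV=prod mix release --env=staging would pick settings from staging block while still compiling with config/prod.exs.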

run production release

$ PORT=4000 _build/prod/rel/billing/bin/billing console

in another terminal:

$ curl -X POST -d '{"user":{"name":"Jane"}}' -H "Content-Type: application/json" http://localhost:4000/v1/users

tips

rerun all migrations in production

I did it once when I accidentally modified an old migration and wanted to run all migrations starting from that one again. in fact it was the first migration so I just dropped all tables including schema_migrations one in psql and ran Reika.ReleaseTasks.migrate() in IEx:

$ bin/billing remote_console
iex> Reika.ReleaseTasks.migrate()

or else run custom migrate command:

$ bin/billing migrate

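the gotcha is that Phoenix application stops after running migrations this way (probably because the release task ends with :init.stop(), like stop_apps/0 above, which shuts down the node it is called on) so make sure to start/restart it afterwards: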

$ sudo systemctl restart billing_prod

troubleshooting

dependency is not included in distillery release

phoenix_expug package has expug dependency but it’s not added to distillery release (raising error at runtime):

$ mix release --verbose
...
=> One or more direct or transitive dependencies are missing from
    :applications or :included_applications, they will not be included
    in the release:

    :expug
    :parse_trans

    This can cause your application to fail at runtime. If you are sure
    that this is not an issue, you may ignore this warning.

systemd journal:

Request: GET /admin/transfers
** (exit) an exception was raised:
    ** (UndefinedFunctionError) function Expug.Runtime.attr/2 is undefined (module Expug.Runtime is not available)
        Expug.Runtime.attr("lang", "en")
        (billing) lib/billing_web/templates/layout/admin.html.pug:2: BillingWeb.LayoutView."admin.html"/1

solution

  1. https://github.com/hashrocket/gatling/issues/24#issuecomment-270044265
  2. https://github.com/bitwalker/distillery/issues/55

probably this is because phoenix_expug uses applications option in mix.exs (application callback) where all dependencies that should be started are listed explicitly. it’s deprecated now and overrides new default behaviour when all dependencies from deps option (project callback) are added to applications implicitly.

while iex -S mix phx.server seems to start all dependencies listed both in applications and deps, distillery apparently adds only applications to the release, treating deps as compile-time dependencies.

one solution is to remove applications altogether so that all deps are added to applications by default though it’s not an option when dealing with external dependencies (unless you fork them).

another solution is to add those missing dependencies (from the output of running mix release --verbose command) to rel/config.exs:

  release :billing do
    set version: current_version(:billing)
    set applications: [
-     :runtime_tools
+     :runtime_tools,
+     expug: :load
    ]
  end
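for comparison, the first solution (no applications key at all) might look like this in mix.exs of the dependency. this is only a sketch: module name, app name and version requirements are made up:

```elixir
# mix.exs - hypothetical package depending on expug
defmodule MyDep.MixProject do
  use Mix.Project

  def project do
    [app: :my_dep, version: "0.1.0", deps: deps()]
  end

  def application do
    # no :applications key here: runtime applications are inferred
    # from deps/0, :extra_applications only lists applications that
    # ship with Erlang/Elixir
    [extra_applications: [:logger]]
  end

  defp deps do
    # made-up requirement, use whatever the package really needs
    [{:expug, ">= 0.0.0"}]
  end
end
```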

logs are truncated in systemd journal

say, we have a very long PARes XML field that used to be truncated in systemd journal all the time.

solution

log message was truncated by Kernel.inspect/2 - not by systemd journal. solution is not to use the former (where it’s possible of course - that is when argument is a string):

- Logger.info("API REQUEST:\n" <> inspect(soap))
+ Logger.info("API REQUEST:\n" <> soap)

NOTE: long request parameter values are still truncated by Phoenix - IDK how to change this behaviour.
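when the value is not a plain string and inspect/2 can't be avoided, truncation can be turned off with Inspect.Opts: :printable_limit applies to strings (4096 by default) and :limit to collections (50 by default). a small sketch with a made-up long string:

```elixir
# A string longer than default :printable_limit
long = String.duplicate("x", 5_000)

# Default options: the string is cut off
truncated = inspect(long)

# No limit: the whole string is kept (plus surrounding quotes)
full = inspect(long, printable_limit: :infinity)

IO.puts("truncated: #{String.length(truncated)} chars")
IO.puts("full: #{String.length(full)} chars")
```

for nested terms both options can be passed together: inspect(term, limit: :infinity, printable_limit: :infinity).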

systemd service is restarted twice

I restart application systemd service after each deploy:

# mix.exs

defp deploy(_) do
  Mix.Task.run("bootleg.build")
  Mix.Task.run("bootleg.deploy")

  Mix.Task.run(
    :cmd,
    ["ssh devops@XXX.XXX.XXX.XX sudo systemctl restart my_app_prod"]
  )
end

the problem is that the service is restarted twice: right after it’s stopped and started for the first time it’s getting stopped again for some reason.

systemd journal:

21:46:26 systemd[1]: Stopping my_app service (prod)...
21:46:28 my_app[23151]: ok
21:46:38 systemd[1]: Stopped my_app service (prod).
21:46:38 systemd[1]: Started my_app service (prod).
21:46:40 systemd[1]: Stopping my_app service (prod)...
21:46:43 my_app[23623]: Node my_app@127.0.0.1 is not running!
21:46:43 systemd[1]: my_app_prod.service: Control process exited, code=exited status=1
21:46:44 my_app[23391]: module=Swarm.Logger [info] [swarm on my_app@127.0.0.1] [tracker:init] started
21:46:44 my_app[23391]: module=Phoenix.Endpoint.Cowboy2Adapter [info] Running MyAppWeb.Endpoint with cowboy 2.6.3 at 0.0.0.0:4000 (http)
21:46:44 my_app[23391]: module=Phoenix.Endpoint.Supervisor [info] Access MyAppWeb.Endpoint at http://XXX.XXX.XXX.XX
21:46:49 my_app[23391]: module=Swarm.Logger [info] [swarm on my_app@127.0.0.1] [tracker:cluster_wait] joining cluster..
21:46:49 my_app[23391]: module=Swarm.Logger [info] [swarm on my_app@127.0.0.1] [tracker:cluster_wait] no connected nodes, proceeding without sync
21:48:13 systemd[1]: my_app_prod.service: State 'stop-sigterm' timed out. Killing.
21:48:13 systemd[1]: my_app_prod.service: Killing process 23391 (beam.smp) with signal SIGKILL.
21:48:13 systemd[1]: my_app_prod.service: Killing process 23543 (erl_child_setup) with signal SIGKILL.
21:48:13 systemd[1]: my_app_prod.service: Killing process 23773 (inet_gethost) with signal SIGKILL.
21:48:13 systemd[1]: my_app_prod.service: Killing process 23774 (inet_gethost) with signal SIGKILL.
21:48:13 systemd[1]: my_app_prod.service: Killing process 23807 (appsignal-agent) with signal SIGKILL.
21:48:13 systemd[1]: my_app_prod.service: Main process exited, code=killed, status=9/KILL
21:48:13 systemd[1]: my_app_prod.service: Failed with result 'exit-code'.
21:48:13 systemd[1]: Stopped my_app service (prod).
21:48:13 systemd[1]: Started my_app service (prod).
21:48:17 my_app[23823]: module=Swarm.Logger [info] [swarm on my_app@127.0.0.1] [tracker:init] started
21:48:17 my_app[23823]: module=Phoenix.Endpoint.Cowboy2Adapter [info] Running MyAppWeb.Endpoint with cowboy 2.6.3 at 0.0.0.0:4000 (http)
21:48:17 my_app[23823]: module=Phoenix.Endpoint.Supervisor [info] Access MyAppWeb.Endpoint at http://XXX.XXX.XXX.XX
21:48:22 my_app[23823]: module=Swarm.Logger [info] [swarm on my_app@127.0.0.1] [tracker:cluster_wait] joining cluster..
21:48:22 my_app[23823]: module=Swarm.Logger [info] [swarm on my_app@127.0.0.1] [tracker:cluster_wait] no connected nodes, proceeding without sync

it looks like Mix.Task.run/2 tries to rerun specified task if the latter times out - that is if no response is received within a set amount of time.

solution

use :os.cmd/1 to execute a shell command instead:

# mix.exs

defp deploy(_) do
  Mix.Task.run("bootleg.build")
  Mix.Task.run("bootleg.deploy")

  :os.cmd('ssh devops@XXX.XXX.XXX.XX sudo systemctl restart my_app_prod')
end
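:os.cmd/1 returns only the command output and drops its exit status, so a failed restart goes unnoticed. if deploy should fail in that case, System.cmd/3 can be used instead since it returns {output, exit_status}. a minimal sketch (DeployShell module and run!/2 are hypothetical names):

```elixir
defmodule DeployShell do
  # Runs an executable with arguments and returns its output,
  # raising if exit status is not zero.
  def run!(cmd, args) do
    case System.cmd(cmd, args, stderr_to_stdout: true) do
      {output, 0} ->
        output

      {output, status} ->
        raise "#{cmd} exited with status #{status}: #{output}"
    end
  end
end

# harmless command just to show the call shape
IO.write(DeployShell.run!("echo", ["ok"]))
```

in deploy/1 above the call would be something like DeployShell.run!("ssh", ["devops@XXX.XXX.XXX.XX", "sudo", "systemctl", "restart", "my_app_prod"]).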