The whoami-based approach is now the only implementation for sync worker routing.
It works with all token types (native Synapse, MAS, etc.) and is automatically
enabled when sync workers exist.
The old map-based approach only worked with native Synapse tokens (syt_<b64>_...)
and would give poor results with MAS or other auth systems.
This adds a new routing mechanism for sync workers that resolves access tokens
to usernames via Synapse's whoami endpoint, enabling true user-level sticky
routing regardless of which device or token is used.
Previously, sticky routing relied on parsing the username from native Synapse
tokens (`syt_<base64 username>_...`), which only works with native Synapse auth
and provides device-level stickiness at best. This new approach works with any
auth system (native Synapse, MAS, etc.) because Synapse handles token validation
internally.
Implementation uses nginx's auth_request module with an njs script because:
- The whoami lookup requires an async HTTP subrequest (ngx.fetch)
- js_set handlers must return synchronously and don't support async operations
- auth_request allows the async lookup to complete, then captures the result
via response headers into nginx variables
The njs script:
- Extracts access tokens from Authorization header or query parameter
- Calls Synapse's whoami endpoint to resolve token -> username
- Caches results in a shared memory zone to minimize latency
- Returns the username via an `X-User-Identifier` header
The username is then used by nginx's upstream hash directive for consistent
worker selection. This leverages nginx's built-in health checking and failover.
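For illustration, the pieces fit together roughly like this in nginx configuration. This is only a sketch; the file name (`sync_whoami.js`), the `/_whoami_lookup` location, the `whoami.resolve` handler, the `$sync_username` variable, the worker hostnames/ports and the sync location regex are all made up for the example, not the playbook's actual names.

```nginx
# Illustrative sketch only - names below are invented for the example.

js_import whoami from sync_whoami.js;
# Shared-memory cache for token -> username lookups, used by the njs script.
js_shared_dict_zone zone=whoami_cache:1m timeout=60s;

upstream synapse_sync_workers {
    # Requests from the same user consistently land on the same worker.
    hash $sync_username consistent;
    server matrix-synapse-worker-sync-0:8008;
    server matrix-synapse-worker-sync-1:8008;
}

server {
    listen 12080;

    # Internal-only location; the njs handler performs the async whoami
    # lookup (ngx.fetch) and answers with an X-User-Identifier header.
    location = /_whoami_lookup {
        internal;
        js_content whoami.resolve;
    }

    location ~ ^/_matrix/client/(r0|v3)/sync$ {
        auth_request /_whoami_lookup;
        # Capture the header returned by the auth subrequest into a variable
        # that the upstream `hash` directive can use.
        auth_request_set $sync_username $upstream_http_x_user_identifier;
        proxy_pass http://synapse_sync_workers;
    }
}
```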
On ansible-core 2.19.0, invoking a macro like this (one which only outputs
something in its `if` block, not in its `else`) resulted in the macro
outputting `None`.
One way to work around it is to add an explicit `else` block which also
outputs something.
A better way to work around it is to only invoke the macro if it
has something to output.
Related to https://github.com/spantaleev/matrix-docker-ansible-deploy/issues/4458
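A minimal illustration of the problem and the preferred workaround, using an invented macro that renders nginx `location` blocks (the macro and variable names are made up for the example):

```jinja
{% macro render_sync_locations(enabled) %}
{% if enabled %}
location /_matrix/client/v3/sync {
    proxy_pass http://synapse_sync_workers;
}
{% endif %}
{% endmacro %}

{# Problematic on ansible-core 2.19.0: when the condition is false,
   this call renders the literal string `None` into the output. #}
{{ render_sync_locations(sync_workers_enabled) }}

{# Preferred workaround: only invoke the macro when it has something to output. #}
{% if sync_workers_enabled %}
{{ render_sync_locations(sync_workers_enabled) }}
{% endif %}
```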
* Enable Internal Admin API Access separately from Public access.
* Add Config variable for Draupnir Hijack command
Also makes the internal admin API activate automatically when this capability is used.
* Apply suggestions from code review
Co-authored-by: Slavi Pantaleev <slavi@devture.com>
* Further Refine Internal Admin API
* Add Non Worker Labels for Internal Admin API
* Variable Rename
* Add validation rules for Internal Synapse admin API
* Add Draupnir Admin API required config validation.
* Override `matrix_synapse_reverse_proxy_companion_container_labels_internal_client_synapse_admin_api_traefik_entrypoints` via group vars
* Wire `matrix_bot_draupnir_admin_api_enabled` to `matrix_bot_draupnir_config_admin_enableMakeRoomAdminCommand` in Draupnir's `defaults/main.yml`
* Remove unnecessary `matrix_bot_draupnir_admin_api_enabled` override from `group_vars/matrix_servers`
The same value is now (more appropriately) defined in Draupnir's `defaults/main.yml` file anyway.
* Add additional condition (`matrix_bot_draupnir_enabled`) for enabling `matrix_synapse_container_labels_internal_client_synapse_admin_api_enabled`
* Use a separate task for validating `matrix_bot_draupnir_admin_api_enabled` when `matrix_bot_draupnir_config_admin_enableMakeRoomAdminCommand`
The other task deals with checking for null and non-blank values and can't handle booleans properly.
---------
Co-authored-by: Slavi Pantaleev <slavi@devture.com>
Since nginx 1.27.3, we can make use of the `resolve` parameter for an `upstream`'s `server`,
to allow DNS resolution to happen continuously at runtime, not just once during startup.
Previously, this was not possible to do in an `upstream` block without
an nginx-plus subscription. Outside of an `upstream` block, we've used
and still use `set $backend ..` workarounds to get DNS resolution at
runtime, but now we can do it in `upstream` as well.
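A sketch of the two approaches (hostnames, ports and the resolver address are placeholders):

```nginx
resolver 127.0.0.11 valid=5s;

# Since nginx 1.27.3: `resolve` keeps re-resolving the DNS name at runtime.
# It needs a `resolver` (and, here, a shared memory `zone` for the upstream).
upstream matrix_synapse {
    zone matrix_synapse 64k;
    server matrix-synapse:8008 resolve;
}

server {
    listen 12080;

    location /_matrix/ {
        proxy_pass http://matrix_synapse;
    }

    # The older workaround, still used outside of `upstream` blocks:
    # putting the hostname in a variable makes nginx resolve it at request time.
    location /some-other-service/ {
        set $backend "some-other-service:8080";
        proxy_pass http://$backend;
    }
}
```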
Until now, most sections were specifying their own values for these options.
For `client_max_body_size`, a value of 25MB was hardcoded in most places.
This was generally OK, but some sections (those generated by the `render_locations_to_upstream` macro)
were not specifying these options and were ending up with nginx's defaults
(likely 1MB for `client_max_body_size`), etc.
From now on:
- we use individual variables for defining these for the Client-Server
and Federation APIs, and apply them once at the `server` level
- we keep auto-determining the `client_max_body_size` for the
Client-Server API based on `matrix_synapse_max_upload_size_mb`
- we keep auto-calculating the `client_max_body_size` for the Federation
API based on the one for the Client API, but now also add a "minimum"
value (`matrix_synapse_reverse_proxy_companion_federation_api_client_max_body_size_mb_minimum: 100`)
to ensure we don't go too low
Fixes https://github.com/spantaleev/matrix-docker-ansible-deploy/issues/4100
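Conceptually, the result in the companion's nginx template looks something like the sketch below. It is simplified: the real template derives these values from the role's variables, and anything here beyond the variable names mentioned above is a guess.

```nginx
# Client-Server API - auto-determined from matrix_synapse_max_upload_size_mb
server {
    client_max_body_size {{ matrix_synapse_max_upload_size_mb }}M;

    # ... individual `location` blocks no longer need to set this themselves
}

# Federation API - based on the Client-Server API value, but never lower than
# matrix_synapse_reverse_proxy_companion_federation_api_client_max_body_size_mb_minimum (100)
server {
    client_max_body_size {{ [matrix_synapse_max_upload_size_mb | int, matrix_synapse_reverse_proxy_companion_federation_api_client_max_body_size_mb_minimum | int] | max }}M;
}
```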
Gzipping certain responses is known to cause problems with QR code logins.
Fixes https://github.com/spantaleev/matrix-docker-ansible-deploy/issues/3749
Gzipping at the synapse-reverse-proxy-companion level, rather than at the
level of the outermost reverse-proxy (Traefik), also sounds non-ideal.
This change only affects setups powered by Synapse workers.
Non-worker setups (and setups powered by other homeservers) were not
having their requests go through synapse-reverse-proxy-companion anyway,
so this change does not affect them.
Future patches may enable response compression support at the Traefik level for
all setups.
After some checking, it seems like there's `/_synapse/client/oidc`,
but no such thing as `/_synapse/oidc`.
I'm not sure why we've been reverse-proxying these paths for so long
(even as far back as the `matrix-nginx-proxy` days), but it's time we
put a stop to it.
The OIDC docs have been simplified. There's no need to ask people to
expose the useless `/_synapse/oidc` endpoint. OIDC requires
`/_synapse/client/oidc` and `/_synapse/client` is exposed by default
already.
This moves the comments from being just in Jinja
to actually ending up in the generated `labels` file,
which makes inspecting the final result easier.
Also, some new lines were added here and there to make the labels
more legible.
The generated file may still include stray newlines due to
various `if` statements yielding content or not, but that's not so ugly
anymore, now that we have proper start/end sections that are visible in
the final `labels` file.
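For illustration, the difference looks roughly like this in a label template (the label names are just examples):

```jinja
{# A Jinja comment like this one never reaches the generated file. #}

# Traefik labels for example-service - start
# (a plain comment like this one *does* end up in the generated `labels` file)
{% if example_service_container_labels_traefik_enabled %}
traefik.enable=true
{% endif %}
# Traefik labels for example-service - end
```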
We'd be adding integration with an internal Traefik entrypoint
(`matrix_playbook_internal_matrix_client_api_traefik_entrypoint`),
so renaming helps disambiguate things.
There's no need for deprecation tasks, because the old names
have only been part of this `bye-bye-nginx-proxy` branch and not used by
anyone publicly.
This also updates validation tasks and documentation, pointing to
variables in the matrix-synapse role which don't exist yet
(e.g. `matrix_synapse_container_labels_client_synapse_admin_api_enabled`).
These variables will be added soon, as Traefik labels are added to the
`matrix-synapse` role. At that point, the `matrix-synapse-reverse-proxy-companion` role
will be updated to also use them.
Switching from doing "post-start" loop hacks to running the container
in 3 steps: `create` + potentially connect to additional networks + `start`.
This way, the container is connected to all its networks from
the very beginning of its life.
* add prometheus-nginxlog-exporter role
* Rename matrix_prometheus_nginxlog_exporter_container_url to matrix_prometheus_nginxlog_exporter_container_hostname
* avoid referencing variables from other roles; hand over info using group_vars/matrix_servers
* fix: stop service when uninstalling
fix: typo
move available archs into a var
fix: text
* fix: prometheus enabled condition
Co-authored-by: ikkemaniac <ikkemaniac@localhost>