From 007421e881da19eddaaab61484a69e7837d23b0d Mon Sep 17 00:00:00 2001 From: Christian Grothoff Date: Thu, 14 Jan 2021 16:14:49 +0100 Subject: update exchange/auditor manuals now that we have taler-auditor-sync --- taler-auditor-manual.rst | 146 +++++++++++++++++++++++++++++++++++++--------- taler-exchange-manual.rst | 36 ++++++++++-- 2 files changed, 151 insertions(+), 31 deletions(-) diff --git a/taler-auditor-manual.rst b/taler-auditor-manual.rst index e506bda6..adfb57a7 100644 --- a/taler-auditor-manual.rst +++ b/taler-auditor-manual.rst @@ -438,26 +438,91 @@ Database -------- The next key step for the auditor is to configure replication of the -*exchange*'s database in-house. The ``taler-exchange-dbinit`` tool can be -used to setup the schema. For replication of the actual SQL data, we refer to -the Postgres manual. We note that asynchronous replication should suffice. - -.. note: - Easy and secure database synchronization between exchange and auditor - is still an open issue the developer team expects to address for Taler v1.0. - -Note that during replication, the only statements that may be performed -are ``INSERT``\ s. ``CREATE`` / ``DELETE`` / ``DROP`` / ``UPDATE`` -are generally not allowed. A -special exception applies when an exchange runs garbage collection on -old data that is no longer relevant from a regulatory point of view. - -While the auditor could just run the garbage collection logic locally as well, -this may interact badly with the standard Postgres synchronization -mechanisms. A good solution for secure (against exchanges deleting arbitrary -data) and convenient (with respect to automatic and timely synchronization) -garbage collection still needs to be developed. - +*exchange*'s database in-house. This should be performed in two steps. + +First, the exchange should use standard Postgres replication features to +enable the auditor to obtain a full copy of the exchange's database. 
+Second, the auditor should use ``taler-auditor-sync`` to make a "trusted" +local copy, ensuring that it never replicates malicious changes. Both +of these steps are described in more detail below. + +We note that as a result of these steps, the auditor will have three +databases: its own production primary database (as configured in +``auditordb-postgres``), its own production copy of the exchange's database +(``exchangedb-postgres``), and a third, untrusted "ingress" copy of the +exchange database. The untrusted database should run as a separate Postgres +instance and is only accessed via ``taler-auditor-sync`` and the replication +mechanism driven by the exchange operator. + + +Ingress replication of the exchange production database +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The full copy can be obtained in various ways with Postgres. It is +possible to use log shipping with streaming replication as described +in https://www.postgresql.org/docs/13/warm-standby.html, or to use +logical replication, as described in +https://www.postgresql.org/docs/13/logical-replication.html. We note +that asynchronous replication should suffice. + +The resulting auditor database should be treated as read-only on the auditor +side. The ``taler-exchange-dbinit`` tool can be used to set up the schema, or +the schema can be replicated using Postgres's standard mechanisms. The same +applies to schema upgrades: if logical replication is used (which does not +replicate schema changes), ``taler-exchange-dbinit`` can be used to migrate +the schema(s) in both the ingress and production copies of the exchange's +database as well. + +For details, we refer to the Postgres manual. + +.. note:: + + Depending on the replication method used, the exchange may perform + unexpected changes to the schema or perform ``UPDATE``, ``DELETE`` or + ``DROP`` operations on the tables.
Hence, the auditor cannot rely upon the + exchange's primary copy to respect schema constraints, especially as we + have to presume that the exchange could act maliciously. Furthermore, it + is unclear to what degree Postgres database replication mechanisms are + robust against a malicious master database. Thus, the auditor should + isolate its primary copy of the exchange database, including the Postgres + process, from its actual operational data. + + +Safe replication of the ingress database into the auditor production database +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Using ``taler-auditor-sync``, the auditor should make a second "safe" copy of +the exchange's ingress database. ``taler-auditor-sync`` basically reads from one +exchange database and inserts all records found into a second exchange +database. If the source database violates invariants, the tool halts with an +error. This way, records violating invariants are never even copied, and in +particular schema changes and deletions or updates are not propagated into the +auditor's production database. + +While ``taler-auditor-sync`` could in theory be run directly against the +exchange's production system, this is likely a bad idea due to the high +latency from the network between auditor and exchange operator. Thus, we +recommend first making an "untrusted" ingress copy of the exchange's +production database using standard Postgres tooling, and then using +``taler-auditor-sync`` to create a second "safe" copy. The "safe" copy used +by the production system should also run under a different UID. + +Before ``taler-auditor-sync`` can be used, the target database must be +initialized with the exchange schema using ``taler-exchange-dbinit``. +Note that running ``taler-auditor-sync`` requires the use of two +configuration files, one specifying the options for accessing the source +database, and a second with the options for accessing the destination +database.
In both cases, likely only the ``[exchangedb]/CONFIG`` option +needs to be changed. + +When the exchange performs garbage collection to ``DELETE`` obsolete records, +this change should be automatically replicated to the auditor's untrusted +ingress database. However, as ``taler-auditor-sync`` tries to be "safe", +it will not replicate those deletions to the auditor's production database. +Thus, it is necessary to (occasionally) run ``taler-exchange-dbinit -g`` on +the auditor's production database to garbage collect old data in the +auditor's production copy. We note that this does not have to be done +at the same time as the exchange runs its garbage collection. .. _Operation: @@ -543,17 +608,37 @@ several categories of failures of different severity: Database upgrades ----------------- -To upgrade the database between Taler versions can be done by -running: +Upgrading the database between Taler versions can be done by running: .. code-block:: console $ taler-auditor-dbinit - -However, the above is the general rule. Please review the -specific release notes to ensure this is correct for the -specific upgrade. - + $ taler-exchange-dbinit + +In any case, it is recommended that exchange and auditor coordinate closely +during schema-changing database upgrades as without coordination the database +replication or ``taler-auditor-sync`` will likely experience problematic +failures.
In general, we recommend: + + * halting the exchange business logic, + * allowing the replication and ``taler-auditor-sync`` to complete + (see also the **-t** option of ``taler-auditor-sync``) + * completing a ``taler-audit`` run against the old schema + * migrating the exchange schema (``taler-exchange-dbinit``) of + the master database, possibly the ingress database and the + auditor's production copy + * migrating the auditor database (``taler-auditor-dbinit``) + * resuming database replication between the exchange's master + database and the auditor's ingress copy + * resuming ``taler-auditor-sync`` + * resuming the regular exchange and auditor business logic + +Regardless, the above is merely the general rule. Please review the specific +release notes to ensure this procedure is correct for the specific upgrade. + + +Database reset +--------------- The auditor database can be reset using: @@ -910,6 +995,13 @@ The current script also rudimentarily tests the auditor's resume logic, by re-starting the auditor once against a database that the auditor has already seen. + +The ``test-revocation.sh`` script performs tests related to the handling of +key revocations. + +The ``test-sync.sh`` script performs tests related to the ``taler-auditor-sync`` +tool. + .. TODO More extensive auditor testing where additional transactions diff --git a/taler-exchange-manual.rst b/taler-exchange-manual.rst index 879f9ad2..c1354d39 100644 --- a/taler-exchange-manual.rst +++ b/taler-exchange-manual.rst @@ -902,10 +902,38 @@ of ``taler-exchange-offline``. Diagnostics =========== -This chapter includes various (very unpolished) sections on specific -topics that might be helpful to understand how the exchange operates, -which files should be backed up. The information may also be helpful for -diagnostics. +This chapter includes various sections on specific topics that might be +helpful to understand how the exchange operates. The information may also be +helpful for diagnostics. + +.. _Internal-audit: + +Internal audits +--------------- + +While an exchange should use an external auditor to attest to regulators that +it is operating correctly, an exchange operator can also use the auditor's +logic to perform internal checks. For this, an exchange operator can generally +follow the auditor guide. However, instead of using ``taler-auditor-sync``, +an internal audit can and likely should be performed either directly against +the production exchange database or against a synchronous copy created using +standard database replication techniques. After all, the exchange operator +runs this for diagnostics and can generally trust its own database to maintain +the database invariants. + +Running the auditor against the original production database (without +using ``taler-auditor-sync``) enables the auditing logic to perform a few +additional checks that can detect inconsistencies. These checks are enabled +by passing the **-i** option to the ``taler-auditor`` command. As always, +the resulting report should be read carefully to see if there are any problems +with the setup. + +Reports are generally created incrementally, with ``taler-auditor`` reporting +only incidents and balance changes that were not covered in previous reports. +While it is possible to reset the auditor database and to restart the audit +from the very beginning, this is generally not recommended as this may be too +expensive. + .. _Database-Scheme: -- cgit v1.2.3