From 0e17395994ab5022c60bf3e018c886bc86630e45 Mon Sep 17 00:00:00 2001
From: Christian Grothoff <christian@grothoff.org>
Date: Sun, 31 Jul 2022 17:22:29 +0200
Subject: -update KYC DD

---
 design-documents/023-taler-kyc.rst | 481 ++++++++++++++++++++++++++++---------
 1 file changed, 363 insertions(+), 118 deletions(-)

(limited to 'design-documents')

diff --git a/design-documents/023-taler-kyc.rst b/design-documents/023-taler-kyc.rst
index b7349bfc..87982ce7 100644
--- a/design-documents/023-taler-kyc.rst
+++ b/design-documents/023-taler-kyc.rst
@@ -42,8 +42,30 @@ Taler needs to run KYC checks in the following circumstances:
 Proposed Solution
 =================
 
-Exchange modifications
-^^^^^^^^^^^^^^^^^^^^^^
+Terminology
+^^^^^^^^^^^
+
+* **Check**: A check establishes a particular attribute of a user, such as their name based on an ID document and liveness, mailing address, phone number, taxpayer identity, etc.
+
+* **Condition**: A condition specifies when KYC is required. Conditions include the *type of operation*, a threshold amount (e.g. above EUR:1000) and possibly a time period (e.g. over the last month).
+
+* **Configuration**: The configuration determines the *legitimization rules*, and specifies which providers offer which *checks* at what *cost*.
+
+* **Cost**: Metric for the business expense for a KYC check at a certain *provider*. Costs are not in any currency; they are simply relative, non-negative values. Costs are considered when multiple choices are allowed by the *configuration*.
+
+* **Expiration**: KYC legitimizations may be outdated. Expiration rules determine when *checks* have to be performed again.
+
+* **Legitimization rules**: The legitimization rules determine under which *conditions* which *checks* must be performed and the *expiration* time period for the *checks*.
+
+* **Logic**: Logic refers to a specific bit of code (realized as an exchange plugin) that enables the interaction with a specific *provider*.  Logic typically requires *configuration* for access control (such as an authorization token) and possibly the endpoint of the specific *provider* implementing the respective API.
+
+* **Provider**: A provider performs a specific set of *checks* at a certain *cost*. Interaction with a provider is performed by provider-specific *logic*.
+
+* **Type of operation**: The operation type determines which Taler-specific operation has triggered the KYC requirement. We support four types of operation: withdraw (by customer), deposit (by merchant), P2P receive (by wallet) and (high) wallet balance.
+
+
+New Endpoints
+^^^^^^^^^^^^^
 
 We introduce a new ``wire_targets`` table into the exchange database. This
 table is referenced as the source or destination of payments (regular deposits
@@ -53,39 +75,71 @@ reference this table.  In this table, we additionally store information
 related to the KYC status of the underlying payto://-URI.
 
 The new ``/kyc-check/`` endpoint is based on the ``wire_targets`` serial
-number. Access is ``authenticated`` by also passing the hash of the
-payto://-URI (weak authentication is acceptable, as the KYC status or the
-ability to initiate a KYC process are not very sensitive).  Given this pair,
-the ``/kyc-check/`` endpoint returns either the (positive) KYC status or
-redirects the client (202) to the current stage of the KYC process.  (The
-endpoint may have to create and store a nonce to be used during
-``/kyc-proof/``, depending on the OAuth variant used.) The redirection is
-offered using an HTTP-redirect for Web-based clients and a JSON body with
-information for triggering a browser-based KYC process using OAuth 2.0.
-
-The OAuth 2.0 process is setup to end at a new ``/kyc-proof/`` endpoint. This
-endpoint then updates the KYC table of the exchange with the legitimization
-status (which is checked using OAuth 2.0). The endpoint also wakes up any
-long-polling ``/kyc-check/`` requests. Naturally, the exchange's OAuth 2.0
-client credentials must be configured apriori with the legitimization service.
+number.  Access is ``authenticated`` by also passing the hash of the
+payto://-URI.  (Weak authentication is acceptable, as the KYC status and the
+ability to initiate a KYC process are not very sensitive.)  Additionally, a
+``type`` argument determines the type of the operation for which the KYC
+status is to be checked.  Finally, the client must specify whether the KYC
+check is for an individual or a business.  Given this quadruplet, the
+``/kyc-check/`` endpoint returns either the (positive) KYC status or redirects
+the client (202) to the next required stage of the KYC process.  The
+redirection must be for an HTTP(S) endpoint to be triggered via a simple HTTP
+GET.
 
-When withdrawing, the exchange checks if the KYC status is acceptable.  If no
-KYC was done and if either the amount withdrawn over the last X days exceeds
-the threshold or the reserve received received a P2P transfer, then a ``202
-Accepted`` is returned which redirects the consumer to the new ``/kyc-check/``
-handler.
+.. Note::
+
+   operation type and individual vs. business are new here, API change!
+
+The specific KYC provider to be executed depends on the configuration (see
+below) which specifies a ``$PROVIDER_ID`` for each authentication procedure.
+For each (enabled) provider, the exchange has a logic plugin which
+(asynchronously) determines the redirect URL for a given wire target. See
+below for a description of the high-level process for different providers.
+
+Upon completion of the process at the KYC provider, the provider must trigger
+a GET request to a new ``/kyc-proof/$PROVIDER_ID/$H_PAYTO`` endpoint.  This
+may be done either by redirecting the browser of the user to that endpoint, or
+by using a webhook (which one is used may depend on the provider).  Once this
+endpoint is triggered, the exchange will pass the received arguments to the
+respective logic plugin.  The logic plugin will then (asynchronously) update
+the KYC status of the user.  The logic plugin should return a human-readable
+HTML page with the KYC result to the user (which will be ignored in case of
+a webhook).
+
+.. Note::
+
+   provider ID is new here, API change!
+
+
+Additionally, a new ``/kyc-webhook/$PROVIDER_ID`` POST endpoint is
+required, as some KYC providers send us the result via POST, and here the
+response does NOT go to the end user's browser.  We again should trigger the
+plugin-specific logic.
+
+.. Note::
+
+   ``/kyc-webhook/`` is new here, new endpoint!
 
-When depositing, the exchange checks the KYC status and if negative, returns an
-additional information field that tells the merchant the ``wire_target_serial``
-number needed to begin the KYC process (this is independent of the amount)
-at the new ``/kyc-check/`` handler.
 
-When tracking deposits, the exchange also adds the ``wire_target_serial`` to
-the reply if the KYC status is negative.
+Legitimization Hooks
+^^^^^^^^^^^^^^^^^^^^
 
-The aggregator is modified to only SELECT deposits where the ``wire_target``
-has the KYC status set to positive (unless KYC is disabled in the exchange
-configuration).
+When withdrawing, the exchange checks if the KYC status is acceptable.  If no
+KYC was done and if either the amount withdrawn over a particular timeframe
+exceeds the threshold or the reserve received a P2P transfer, then a
+``202 Accepted`` is returned which redirects the consumer to the new
+``/kyc-check/`` handler.
+
+When depositing, the exchange checks the KYC status and if negative, returns
+an additional information field that tells the merchant the
+``wire_target_serial`` number needed to begin the KYC process (this is
+independent of the amount) at the new ``/kyc-check/`` handler.  When tracking
+deposits, the exchange also adds the ``wire_target_serial`` to the reply if
+the KYC status is negative.  Furthermore, the aggregator is modified to only
+SELECT deposits where the ``wire_target`` has the KYC status set to positive
+(unless KYC is disabled in the exchange configuration).
+
+FIXME: describe KYC on P2P transfer here.
 
 To allow the wallet to do the KYC check if it is about to exceed a set balance
 threshold, we modify the ``/keys`` response to add an optional field
@@ -95,18 +149,16 @@ If the field is provided, a correct wallet must create a long-term
 account-reserve key pair. This should be the same key that is also used to
 receive wallet-to-wallet payments. Then, before a wallet performs an operation
 that would cause it to exceed the balance threshold in terms of funds held
-from a particular exchange, it must first request the user to complete the KYC
-process.
-
-For that, it should POST to the new ``/wallet-kyc`` endpoint, providing its
-long-term reserve-account public key and a signature requesting permission to
-exceed the account limit.  The exchange will respond with a wire target
-UUID. The wallet can then use this UUID to being the KYC process at
-``/kyc-check/``. The wallet must only proceed to obtain funds exceeding the
-threshold after the KYC process has concluded. While wallets could be "hacked"
-to bypass this measure (we cannot cryptographically enforce this), such
-modifications are a terms of service violation which may have legal
-consequences for the user.
+from a particular exchange, it should first request the user to complete the
+KYC process.  For that, the wallet should POST to the new ``/wallet-kyc``
+endpoint, providing its long-term reserve-account public key and a signature
+requesting permission to exceed the account limit.  The exchange will respond
+with a wire target UUID. The wallet can then use this UUID to begin the KYC
+process at ``/kyc-check/``. The wallet must only proceed to obtain funds
+exceeding the threshold after the KYC process has concluded. While wallets
+could be "hacked" to bypass this measure (we cannot cryptographically enforce
+this), such modifications are a terms of service violation which may have
+legal consequences for the user.
 
 
   ..note::
@@ -115,88 +167,139 @@ consequences for the user.
     instead of setting them to ``finished`` in ``taler-exchange-transfer``.
 
 
-
-Exchange database schema changes
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-Note that there is may be some slight complication in the migration as the
-h_wire in deposits is salted, while the h_payto in the new wire_targets is
-expected to be unsalted. So converting the existing information to create the
-wire_targets table will be tricky!
-
-We can *either* not support a fully automatic migration, or do an "expensive"
-migration with C logic (so not just SQL statements).
-
-Given the other database changes for protocol v9, it was decided to just
-not support any migration this time.
+Configuration Options
+^^^^^^^^^^^^^^^^^^^^^
+
+The configuration specifies a set of providers, one
+per configuration section:
+
+[provider-$PROVIDER_ID]
+# How expensive is it to use this provider?
+# Used to pick the cheapest provider possible.
+COST = NUMBER
+# Which plugin is responsible for this provider?
+LOGIC = PLUGIN_NAME
+# Which checks does this provider provide?
+# List of strings, no specific semantics.
+PROVIDED_CHECKS = SMS GOVID PHOTO
+# Plus additional logic-specific options, e.g.:
+AUTHORIZATION_TOKEN = superdupersecret
+FORM_ID = business_legi_form
+
+The configuration also specifies a set of legitimization
+requirements, one per configuration section:
+
+[legitimization-$RULE_NAME]
+# For which type of user do these legitimization
+# rules apply? Either INDIVIDUAL or BUSINESS.
+USER_TYPE = INDIVIDUAL
+# Operation that triggers this legitimization.
+# Must be one of WITHDRAW, DEPOSIT, P2P-RECEIVE
+# or WALLET-BALANCE.
+OPERATION_TYPE = WITHDRAW
+# Required checks to be performed.
+# List of strings, must individually match the
+# strings in one or more providers' PROVIDED_CHECKS.
+REQUIRED_CHECKS = SMS GOVID
+# How long is the check considered valid?
+EXPIRATION = DURATION
+# Threshold amount above which the legitimization is
+# triggered.  The total must be exceeded in the given
+# timeframe. Can be 'forever'.
+THRESHOLD = AMOUNT
+# Timeframe over which the amount to be compared to
+# the THRESHOLD is calculated.
+# Ignored for WALLET-BALANCE.
+TIMEFRAME = DURATION
+
+.. note::
+
+   The required checks / forms generally depend on whether the
+   user is an individual person or a business. Right now, we
+   cannot tell which one it is! For deposit we may be able to
+   presume it is a business and for the rest we could presume
+   it is individuals, but this is far from assured (e.g. an
+   individual may raise donations for themselves, or a business
+   may have a wallet or receive p2p payments).  Thus, we need
+   a way to be told the type of entity up-front!
+
+
+
+Exchange Database Schema
+^^^^^^^^^^^^^^^^^^^^^^^^
 
 .. sourcecode:: sql
 
-  -- Everything in one big transaction
-  BEGIN;
-  -- Check patch versioning is in place.
-  SELECT _v.register_patch('exchange-TBD', NULL, NULL);
-  --
   CREATE TABLE IF NOT EXISTS wire_targets
   (wire_target_serial_id BIGSERIAL UNIQUE
   ,h_payto BYTEA NOT NULL CHECK (LENGTH(h_payto)=64),
   ,payto_uri STRING NOT NULL
-  ,kyc_ok BOOLEAN NOT NULL DEFAULT (false)
-  ,oauth_username STRING NOT NULL
-  ,PRIMARY KEY (h_wire)
-  );
+  ,PRIMARY KEY (h_payto)
+  ) SHARD BY (h_payto);
   COMMENT ON TABLE wire_targets
     IS 'All recipients of money via the exchange';
   COMMENT ON COLUMN wire_targets.payto_uri
     IS 'Can be a regular bank account, or also be a URI identifying a reserve-account (for P2P payments)';
   COMMENT ON COLUMN wire_targets.h_payto
     IS 'Unsalted hash of payto_uri';
-  COMMENT ON COLUMN wire_targets.kyc_ok
-    IS 'true if the KYC check was passed successfully';
-  COMMENT ON COLUMN wire_targets.oauth_username
-    IS 'Name of the user that was used for OAuth 2.0-based legitimization';
-  --
-  -- NOTE: logic to fill wire_target missing, so this
-  -- CANNOT work if the database contains any data!
-  --
-  ALTER TABLE wire_out
-    ADD COLUMN wire_target_serial_id INT8 NOT NULL REFERENCES wire_targets (wire_target_serial_id),
-    DROP COLUMN wire_target;
-  COMMENT ON COLUMN wire_out.wire_target_serial_id
-    IS 'Identifies the target bank account and KYC status';
-  --
-  ALTER TABLE reserves_in
-    ADD COLUMN wire_source_serial_id INT8 NOT NULL REFERENCES wire_targets (wire_target_serial_id),
-    DROP COLUMN sender_account_details;
-  COMMENT ON COLUMN wire_out.wire_target_serial_id
-    IS 'Identifies the target bank account and KYC status';
-  --
-  ALTER TABLE reserves_close
-    ADD COLUMN wire_source_serial_id INT8 NOT NULL REFERENCES wire_targets (wire_target_serial_id),
-    DROP COLUMN receiver_account;
-  COMMENT ON COLUMN reserves_close.wire_target_serial_id
-    IS 'Identifies the target bank account and KYC status. Note that closing does not depend on KYC.';
-  --
-  ALTER TABLE deposits
-    ADD COLUMN wire_target_serial_id INT8 NOT NULL,
-    ADD COLUMN salt BYTEA NOT NULL CHECK (LENGTH(salt)=64),
-    DROP COLUMN h_wire,
-    DROP COLUMN wire;
-  COMMENT ON COLUMN deposits.wire_target_serial_id
-    IS 'Identifies the target bank account and KYC status';
-  -- Complete transaction
-  --
-  -- FIXME: 512-bit SALT is likely not specified/checked
-  -- anywhere in the code (salt==string), and we probably
-  -- should move to a 128-bit salt anyway!
-  --
-  COMMIT;
 
+  CREATE TABLE IF NOT EXISTS legitimizations
+  (legitimization_serial_id BIGSERIAL UNIQUE
+  ,h_payto BYTEA NOT NULL CHECK (LENGTH(h_payto)=64)
+  ,expiration_time INT8 NOT NULL DEFAULT (0)
+  ,provider_section VARCHAR NOT NULL
+  ,provider_user_id VARCHAR DEFAULT NULL
+  ,provider_legitimization_id VARCHAR DEFAULT NULL
+  ) SHARD BY (h_payto);
+
+  COMMENT ON COLUMN legitimizations.legitimization_serial_id
+    IS 'unique ID for this legitimization process at the exchange';
+  COMMENT ON COLUMN legitimizations.h_payto
+    IS 'foreign key linking the entry to the wire_targets table, NOT a primary key (multiple legitimizations are possible per wire target)';
+  COMMENT ON COLUMN legitimizations.expiration_time
+    IS 'in the future if the respective KYC check was passed successfully';
+  COMMENT ON COLUMN legitimizations.provider_section
+    IS 'Configuration file section with details about this provider';
+  COMMENT ON COLUMN legitimizations.provider_user_id
+    IS 'Identifier for the user at the provider that was used for the legitimization. NULL if provider is unaware.';
+  COMMENT ON COLUMN legitimizations.provider_legitimization_id
+    IS 'Identifier for the specific legitimization process at the provider. NULL if legitimization was not started.';
+
+
+Database API
+------------
+
+This section describes the new DB plugin functions.
+
+* insert_legi (INSERT h_payto, provider_section),
+  returns legitimization_serial_id
+
+* start_legi (UPDATE based on h_payto, provider_section,
+  SETs provider_user_id, provider_legitimization_id)
+
+* confirm_legi (UPDATE based on h_payto, provider_section,
+  SETs expiration_time)
+
+* get_legitimizations (SELECT by h_payto,
+  WHERE NOT expired), returns provider_section list.
+
+Additionally, we have to make:
+
+* changes to the existing wire_targets API
+
+* changes to existing KYC checks in stored procedures
 
 
 Merchant modifications
 ^^^^^^^^^^^^^^^^^^^^^^
 
+A new setting is required where the merchant backend
+can be configured for a business (default) or individual.
+
+.. note::
+
+   This still needs to be done!
+
 We introduce new ``kyc_status``, ``kyc_timestamp`` and ``kyc_serial`` fields
 into a new table with primary keys ``exchange_url`` and ``account``.  This
 status is updated whenever a deposit is created or tracked, or whenever the
@@ -237,24 +340,166 @@ long-poller return with positive news.
 Bank requirements
 ^^^^^^^^^^^^^^^^^
 
-The exchange primarily requires an OAuth 2.0 login page where the user
-can either login (and share an access token that grants access to only
-the username) or register to initiate the KYC process.
+The exchange primarily requires a KYC provider to be operated by the
+bank that offers an endpoint with an API implemented by one of
+the logic plugins (and the respective legitimization configuration).
 
 
-Alternatives
-============
+Logic plugins
+^^^^^^^^^^^^^
 
-We may not need the oauth_username, but it seems saner to store it to
-provide a link to the legitimization resource server.
+The ``$PROVIDER_ID`` is based on the name of the configuration section,
+not on the name of the logic plugin.  Using the configuration section,
+the exchange then determines the logic plugin to use.
 
-We could also store the access token, but that seems slightly more
-dangerous and given the close business relationship is unnecessary.
+This section describes the general API for all of the supported KYC providers,
+as well as some details of how this general API could be implemented by the logic for
+different APIs.
 
-We may want to store some additional "permission level" obtained from the
-resource server to say for which of the operations (see requirements section)
-the legitimization is sufficient.
 
+General KYC Logic Plugin API
+----------------------------
+
+This section provides a sketch of the proposed API for the KYC logic plugins.
+
+* initiation of KYC check (``kyc-check``):
+
+  - inputs:
+    + provider_section (for additional configuration)
+    + h_payto
+  - outputs:
+    + success/provider-failure
+    + provider_user_id (or NULL)
+    + provider_legitimization_id (or NULL)
+
+* KYC status check (``kyc-proof``):
+
+  - inputs:
+    + provider_section (for additional configuration)
+    + h_payto
+    + provider_user_id (or NULL)
+    + provider_legitimization_id (or NULL)
+  - outputs:
+    + success/pending/user-aborted/user-failure/provider-failure status code
+    + HTML response for end-user
+
+* Webhook notification handler (``kyc-webhook``):
+
+  - inputs:
+    + HTTP method (GET/POST)
+    + rest of URL (after provider_section)
+    + HTTP body (if applicable!)
+  - outputs:
+    + success/pending/user-aborted/user-failure/provider-failure status code
+    + h_payto (for DB status update)
+    + HTTP response to be returned to KYC provider
+
+The plugins do not directly interact with the database; the caller sets the
+expiration on ``success`` and also updates ``provider_user_id`` and
+``provider_legitimization_id`` in the tables as required.
+
+
+For the webhook, we need a way to look up ``h_payto`` by other data, so the
+KYC logic plugin API should be provided a method lookup with:
+
+  - inputs:
+    + ``provider_section``
+    + ``provider_legitimization_id``
+  - outputs:
+    + ``h_payto``
+
+
+OAuth 2.0 specifics
+-------------------
+
+In terms of configuration, the OAuth 2.0 logic requires the respective client
+credentials to be configured a priori to enable access to the legitimization
+service.
+
+For the ``/kyc-check/`` endpoint, the OAuth 2.0 logic may need to create and
+store a nonce to be used during ``/kyc-proof/``, depending on the OAuth
+variant used.  This may require another exchange table.  The OAuth 2.0 process
+must then be set up to end at the new ``/kyc-proof/$PROVIDER_ID/`` endpoint.
+
+This ``/kyc-proof/oauth2/`` endpoint must query the OAuth 2.0 server using the
+``code`` argument provided as a query parameter. Based on the result, it then
+updates the KYC table of the exchange with the legitimization status and
+returns a human-readable KYC status page.
+
+The ``/kyc-webhook/`` is not applicable.
+
+
+Persona specifics
+-----------------
+
+We would use the hosted flow. Endpoints return a ``request-id``, which we should
+log for diagnosis.
+
+For ``/kyc-check/``:
+
+* POST to ``/api/v1/accounts`` using ``reference-id`` set to our ``h_payto``.
+  Returns ``id`` (account_id).
+
+* Create ``/verify`` endpoint using ``template-id`` (from configuration),
+  and ``account_id`` (from previous step) and a ``reference-id`` (use
+  the ``legitimization_serial_id`` for the new process). Set
+  ``redirect-uri`` to ``/kyc-proof/$PROVIDER_ID/``.  However, we cannot
+  rely on the user clicking this, so we must also configure a webhook.
+  The request returns a ``verification-id``, which we store under
+  the ``provider_legitimization_id`` in the database.
+
+For ``/kyc-proof/``:
+
+* Use the ``/api/v1/verifications`` endpoint to get the verification
+  status. Requires the ``verification-id`` from the previous step.
+  Results include: created/pending/completed/expired (aborted)/failed.
+
+For ``/kyc-webhook/``:
+
+* The webhook is authenticated using a shared secret, which should
+  be in the configuration.  So all we should have to do is parse
+  the POSTed body to find the status and the ``verification-id`` to
+  look up ``h_payto`` and return the result.
+
+
+KYC AID specifics
+-----------------
+
+For ``/kyc-check/``:
+
+* POST to ``/applicants`` with a type (person or company) to
+  obtain ``applicant_id``. Store that under ``provider_user_id``.
+  ISSUE: *we* need to get the company_name, business_activity_id
+  and registration_country before this somehow!
+
+* Start by creating a form URL via ``/forms/$FORM_ID/urls``,
+  providing our ``h_payto`` as the ``external_applicant_id``,
+  using the ``applicant_id`` from above,
+  and the ``/kyc-proof/$PROVIDER_ID`` for the ``redirect_url``.
+
+* Redirect the customer to the ``form_url`` and
+  store the ``verification_id`` under ``provider_legitimization_id``
+  in the database.
+
+For ``/kyc-proof/``:
+
+* Perform GET ``/verifications/{verification_id}`` to determine
+  and return status.
+
+For ``/kyc-webhook/``:
+
+* For security, we should probably simply trigger the GET on
+  ``/verifications/{verification_id}`` to not trust an unsigned POST
+  to tell us anything for sure.  The result is then returned.
+
+
+
+Alternatives
+============
+
+We could also store the access token (returned by OAuth 2.0), but that seems
+slightly more dangerous and given the close business relationship is
+unnecessary. Furthermore, not all APIs offer this.
 
 
 Drawbacks
-- 
cgit v1.2.3