1 files changed, 152 insertions, 0 deletions
diff --git a/design-documents/033-database.rst b/design-documents/033-database.rst
new file mode 100644
index 00000000..ee37cc32
--- /dev/null
+++ b/design-documents/033-database.rst
@@ -0,0 +1,152 @@
+DD 33: Database Schema and Versioning
+#####################################
+
+Summary
+=======
+
+This document describes how we version database schema in GNU Taler
+and enable migrations.
+
+
+Motivation
+==========
+
+As Taler evolves, it will be necessary to add new tables, change existing
+tables, modify indices and make other changes to the database schema. As
+production systems exist, existing data will need to be migrated to the new
+schema, and we need to do this in a systematic way.  While certain major
+changes may require radical or manual interventions, we should have a
+systematic way of dealing with minor or modest modifications to the schema.
+
+
+Requirements
+============
+
+* The approach should be language-independent
+* Schema migration should be reasonably performant
+* Schema migration must be usable, and in particular safe to use for
+  operators without significant risk
+* We need to support key database features we might need to use,
+  such as partitioning or sharding
+
+
+Proposed Solution
+=================
+
+We use the "versioning.sql" system to store the current set of patches that
+have been applied to the database so far in a "_v" schema. This allows us to
+quickly test which version of the database we are on and which migrations may
+still need to be applied.
+
+For each component, all tables are placed into a SCHEMA named after the
+component.
+
+We then use a set of numbered SQL files that create, alter or drop tables and
+indices (like exchange-0001.sql, exchange-0002.sql, ...) to setup the
+database. However, some setups need additional arguments, such as the number
+of partitions. Those setups are then NOT performed explicitly, but by creating
+stored procedures and registering those stored procedures in a general global
+"master" table to be called from the main setup logic with arguments in a
+particular order under certain conditions.
+
+When setting up a database, there is no point in incrementally defining
+ordinary stored procedures that are used at runtime (not the ones to setup the
+tables we talked about above). Thus, all of the stored procedures used by the
+runtime system are placed in a file "procedures.sql" which is loaded
+last. This makes changes to stored procedures particularly easy, as one simply
+edits "procedures.sql".  Loading "procedures.sql" also does not change "_v".
+
+A "drop.sql" file is created that DROPs the main SCHEMA of the component and
+additionally unregisters all patches from the "_v" schema. The drop script
+is run during tests to fully reset the database.
+
+Exchange details
+^^^^^^^^^^^^^^^^
+
+The exchange uses "exchange_tables" to create the master
+table mentioned above. In "exchange_tables", entries are
+executed in the order of the "table_serial_id". Each
+entry has a "name", which is the name of the affected table
+(or at least the prefix in the case of partitioned or sharded
+tables).  The "version" field stores which "exchange-XXXX.sql"
+file setup the respective table entry, but is for now mostly
+for internal documentation.  The "action" defines both the
+condition under which to run a function.  Specifically,
+actions can be:
+
+* create --- run on the master table and each shard; used to create or alter the main table
+* constrain --- run only on the partitions/shards, or on master if there are no partitions; used to setup constraints like uniqueness that only apply to the lowest levels of the table
+* master -- run only on the master table; used to setup triggers and other constraints that only apply to the master table
+* foreign -- run only on the master table and only if there are no partition; used to setup foreign key constraints that are not supported on partitioned or sharded tables
+
+The "partitioned" field indicates that this table is partitioned and instructs the functions to create partitions (or shards)
+for this table.
+    
+The "by_range" field indicates if the table is partitioned by
+range, which prevents automatic generation of partitions as
+is done if partitioned by hash.
+
+The "finished" field is initially false, but set to TRUE once the respective
+function has been executed.
+
+The main "do_create_tables" function triggers the unfinished actions
+registered in the "exchange_tables" table.  It is given arguments to control
+the number of partitions, the use of partitions and (in the future) the use of
+sharding.
+
+The individual actions use helper functions ("create_partitioned_table",
+"comment_partitioned_table" and "comment_partitioned_column") to facilitate
+the creation of tables and associated comments.  These functions are used so
+that we can only define the schema or comment once, and have it applied to
+tables with names and creation syntax that changes slightly if we use shards
+or partitions.
+
+Some additional logic will be needed to deal nicely with
+sharding (which is currently not supported), as with
+sharing we will need to detach shards, migrate shards, and
+re-attach shards. So this will require additional stored
+procedures to support operations on shards.
+
+
+Merchant details
+^^^^^^^^^^^^^^^^
+
+The merchant does not (yet) need any type of master table, as we do not
+(yet) use any kind of sharding or partitioning.  There are also no
+stored procedures being used by the backend.  Hence, it is simply the
+"versioning.sql"-controlled table creation/alteration sequence
+(merchant-0001.sql, etc.) and the "drop.sql" to reset everything.
+
+
+Alternatives
+============
+
+* We might want to consider storing more meta-data
+  in the database, such as the use of sharding, the
+  names of the shard servers, or even just the number
+  of partitions.
+
+* We could require dumping out the old database and
+  loading it in via some explicit importer during each
+  migration; having migration logic in C would enable more
+  powerful migrations, but dumping and reloading the entire
+  database would also be more expensive. It would have the
+  advantage of basically having the old database around in
+  case of migration trouble, so the cost disadvantage might
+  not be so bad (as admins are likely to make a backup anyway).
+  OTOH, doing the migration with the equivalent of a
+  taler-auditor-sync would require quite a bit more code
+  than the simple ALTER/CREATE statements in an SQL file.
+  
+
+Drawbacks
+=========
+
+* not exactly trival logic
+* some complexity to implement
+
+
+Discussion / Q&A
+================
+
+(This should be filled in with results from discussions on mailing lists / personal communication.)