response-202005.tex - exchange - Base system with REST service to issue digital coins, run by the payment service provider

response-202005.tex (11355B)
      1 \documentclass[11pt]{article}
      2 \oddsidemargin=0in \evensidemargin=0in
      3 \textwidth=6.2in \textheight=8.7in
      4 %\topmargin=-0.2in
      5 
      6 \usepackage[ansinew]{inputenc}
      7 \usepackage{makeidx,amsmath,amssymb,exscale,multicol,epsfig,graphics}
      8 
      9 \begin{document}
     10 \pagestyle{headings}
     11 \title{Preliminary response to the \\ GNU Taler security audit in Q2/Q3 2020}
     12 \author{Christian Grothoff \and Florian Dold}
     13 
     14 \maketitle
     15 
     16 \section{Abstract}
     17 
     18 This is the response to the source code audit report CodeBlau
     19 created for GNU Taler in Q2/Q3 2020.
     20 
     21 \section{Management Summary}
     22 
     23 We thank CodeBlau for their detailed report and thorough analysis. We are
     24 particularly impressed that they reported issues against components that were
     25 not even in-scope, and also that they found an {\em interesting} new corner
     26 case we had not previously considered. Finally, we also find several of their
     27 architectural recommendations to strengthen security to be worthwhile, and
     28 while some were already on our long-term roadmap, we will reprioritize our
     29 roadmap given their recommendations.
     30 
     31 Given our extensive discussions with CodeBlau, we also have the impression
     32 that they really took the time to understand the system, and look forward
     33 to working with CodeBlau as a competent auditor for GNU Taler in the future.
     34 
     35 \section{Issues in the exchange}
     36 
     37 We agree with the issues CodeBlau discovered and both parties believe that
     38 they have all been addressed.
     39 
     40 \section{Issues in the auditor}
     41 
     42 We appreciate CodeBlau's extensive list of checks the Taler auditor performs,
     43 which was previously not documented adequately by us. We agree that the
     44 auditor still needs more comprehensive documentation.
     45 
     46 As for issue \#6416, we agree with the analysis. However, the proposed fix
     47 of making the primary key include the denomination would create other problems,
     48 such as the exchange sometimes not having the denomination key (link, refund)
     49 and the code in various places relying on the assumption of the coin's
     50 public key being unique. Furthermore, allowing coin key re-use may validate
     51 a terrible practice. We thus decided it is better to ``fail early'', and
     52 modified the code to check that the coin public key is ``unique'' during
     53 deposit, refresh and recoup and ensured that the exchange returns a proof
     54 of non-uniqueness in case of a violation. The test suite was extended to
     55 cover the corner case.
     56 
     57 \section{Issues in GNUnet}
     58 
     59 We agree with the issues CodeBlau discovered and both parties believe that
     60 they have all been addressed.
     61 
     62 \section{General remarks on the code}
     63 
     64 We understand that writing the code in another programming language may make
     65 certain checks for the auditor less work to implement. However, our choice of C
     66 is based on the advantages that make it superior to contemporary languages for
     67 our use case:  relatively low complexity of the language (compared to C++);
     68 availability of mature compilers, static and dynamic analysis tools;
     69 predictable performance; access to stable and battle-tested libraries; and
     70 future-proofness due to portability to older systems as well as new platforms.
     71 
     72 We believe creating a parallel implementation in other languages would provide
     73 advantages, especially with respect to avoiding ``the implementation is the
     74 specification''-style issues.  However, given limited resources will not make
     75 this a priority.
     76 
     77 We disagree that all modern software development has embraced the idea that
     78 memory errors are to be handled in ways other than terminating or restarting
     79 the process. Many programming languages (Erlang, Java) hardly offer any other
     80 means of handling out-of-memory situations than to terminate the process. We
     81 also insist that Taler {\em does} handle out-of-memory as it does have code
     82 that terminates the process (we do {\em not} simply ignore the return value
     83 from {\tt malloc()} or other allocation functions!). We simply consider that
     84 terminating the process (which is run by a hypervisor that will restart the
     85 service) is the correct way to handle out-of-memory situations.  We also have
     86 limits in place that should prevent attackers from causing large amounts of
     87 memory to be consumed, and also have code to automatically preemptively
     88 restart the process to guard against memory exhaustion from memory
     89 fragmentation.  Finally, a common problem with abrupt termination may be
     90 corrupted files.  However, the code mostly only reads from files and limits
     91 writing to the Postgres database. Hence, there is no possibility of corrupt
     92 files being left behind even in the case of abnormal termination.
     93 
     94 
     95 \section{More specs and documentation code}
     96 
     97 We agree with the recommendation that the documentation should be improved,
     98 and will try to improve it along the lines recommended by CodeBlau.
     99 
    100 \section{Protocol change: API for uniformly distributed seeds}
    101 
    102 We agree with the suggestion, have made the necessary changes, and both
    103 parties believe that the suggestion has been implemented.
    104 
    105 \section{Reduce code complexity}
    106 
    107 \subsection{Reduce global variables}
    108 
    109 While we do not disagree with the general goal to have few global variables,
    110 we also believe that there are cases where global variables make sense.
    111 
    112 We have already tried to minimize the scope of variables. The remaining few
    113 global variables are largely ``read-only'' configuration data. The report does
    114 not point out specific instances that would be particularly beneficial to
    115 eliminate. As we continue to work on the code, we will of course evaluate
    116 whether the removal of a particular global variable would make the code
    117 cleaner.
    118 
    119 Also, we want to point out that all global variables we introduce
    120 in the exchange are indicated with a prefix {\tt TEH\_} in the code, so they
    121 are easy to identify as such.
    122 
    123 \subsection{Callbacks, type pruning}
    124 
    125 We understand that higher order functions in C can be confusing, but this
    126 is also a common pattern to enable code re-use and asynchronous execution
    127 which is essential for network applications. We do not believe that we
    128 use callbacks {\em excessively}.  Rewriting the code in another language
    129 may indeed make this part easier to understand, alas would have other
    130 disadvantages as pointed out previously.
    131 
    132 \subsection{Initializing structs with memset}
    133 
    134 Using {\tt memset()} first prevents compiler (or valgrind) warnings about
    135 using uninitialized memory, possibly hiding bugs. We also do use struct
    136 initialization in many cases.
    137 
    138 The GNUnet-wrappers are generally designed to be ``safer'' or ``stricter''
    139 variants of the corresponding libc functions, and not merely ``the same''.
    140 Hence we do not believe that renaming {\tt GNUNET\_malloc} is indicated.
    141 
    142 The argument that {\tt memset()}ing first makes the code inherently more
    143 obvious also seems fallacious, as it would commonly result in dead stores,
    144 which can confuse developers and produce false-positive warnings from static
    145 analysis tools.
    146 
    147 \subsection{NULL pointer handling}
    148 
    149 The problem with the ``goto fail'' style error handling is that it rarely
    150 results in specific error handling where diagnostics are created that are
    151 specific to the error. Using this style of programming encourages developers
    152 to create simplistic error handling, which can result in inappropriate error
    153 handling logic and also makes it harder to attribute errors to the specific
    154 cause.
    155 
    156 However, we have no prohibition on using this style of error handling either:
    157 if it is appropriate, develpers should make a case-by-case decision as to how
    158 to best handle a specific error.
    159 
    160 We have made some first changes to how {\tt GNUNET\_free()} works in response
    161 to the report, and will discuss further changes with the GNUnet development
    162 team.
    163 
    164 \subsection{Hidden security assumptions}
    165 
    166 We disagree that the assumptions stated are ``hidden'', as (1) the Taler code
    167 has its own checks to warrant that the requirements of the {\tt
    168   GNUNET\_malloc()} API are satisfied (so enforcement is not limited to the
    169 abstraction layer), and (2) the maximum allocation size limit is quite clearly
    170 specified in the GNUnet documentation.  Also, the GNUnet-functions are not
    171 merely an abstraction layer for portability, but they provided extended
    172 semantics that we rely upon. So it is not like it is possible to swap this
    173 layer and expect anything to continue to work.
    174 
    175 When we use the libjansson library, it is understood that it does not use
    176 the GNUnet operations, and the code is careful about this distinction.
    177 
    178 \subsection{Get rid of boolean function arguments}
    179 
    180 We agree that this can make the code more readable, and have in some places
    181 already changed the code in this way.
    182 
    183 \section{Structural Recommendation}
    184 
    185 \subsection{Least privilege}
    186 
    187 It is wrong to say that GNU Taler has ``no work done'' on privilege separation.
    188 For example, the {\tt taler-exchange-dbinit} tool is the only tool that requires
    189 CREATE, ALTER and DROP rights on database tables, thus enusring that the ``main''
    190 process does not need these rights.
    191 
    192 We also already had the {\tt taler-exchange-keyup} tool responsible for
    193 initializing keys. In response to the audit, we already changed the GNUnet API
    194 to make sure that tools do not create keys as a side-effect of trying to read
    195 non-existent key files.
    196 
    197 We agree with the recommendation on further privilege separation for access
    198 to cryptographic keys, and intend to implement this in the near future.
    199 
    200 \subsection{File system access}
    201 
    202 The auditor helpers actually only read from the file system, only the LaTeX
    203 invocation to compile the final report to PDF inherently needs write
    204 access. We do not predict that we will retool LaTeX.  Also, the file system
    205 access is completely uncritical, as the auditor by design runs on a system
    206 that is separate from the production exchange system.
    207 
    208 Because that system will not have {\em any} crypto keys (not even the one of
    209 the auditor!), CodeBlau is wrong to assume that reading from or writing to the
    210 file system represents a security threat.
    211 
    212 We have started to better document the operational requirements on running the
    213 auditor.
    214 
    215 \subsection{Avoid dlopen}
    216 
    217 Taler actually uses {\tt ltdlopen()} from GNU libtool, which provides
    218 compiler flags to convert the dynamic linkage into static linkage.  For
    219 development, dynamic linkage has many advantages.
    220 
    221 We plan to test and document how to build GNU Taler with only static
    222 linkage, and will recommend this style of deployment for the Taler
    223 exchange for production.
    224 
    225 \subsection{Reduce reliance on PostgreSQL}
    226 
    227 CodeBlau's suggestion to use an append-only transaction logging service in
    228 addition to the PostgreSQL database is a reasonable suggestion for a
    229 production-grade deployment of GNU Taler, as it would allow partial disaster
    230 recovery even in the presence of an attacker that has gained write access to
    231 the exchange's database.
    232 
    233 We are currently still investigating whether the transaction logging should be
    234 implemented directly by the exchange service, or via the database's extensible
    235 replication mechanism.  Any implementation of such an append-only logging
    236 mechanism must be carefully designed to ensure it does not negatively impact
    237 the exchange's availability and does not interfere with serializability of
    238 database transactions.  As such we believe that transaction logging can only be
    239 provided on a best-effort basis.  Fortunately, even a best-effort append-only
    240 transaction log would serve to limit the financial damage incurred by the
    241 exchange in an active database compromise scenario.
    242 
    243 \end{document}
	exchange Base system with REST service to issue digital coins, run by the payment service provider
	Log \| Files \| Refs \| Submodules \| README \| LICENSE