aboutsummaryrefslogtreecommitdiff
path: root/doc/audit/response-202109.tex
diff options
context:
space:
mode:
Diffstat (limited to 'doc/audit/response-202109.tex')
-rw-r--r--doc/audit/response-202109.tex291
1 files changed, 291 insertions, 0 deletions
diff --git a/doc/audit/response-202109.tex b/doc/audit/response-202109.tex
new file mode 100644
index 000000000..43ed6a636
--- /dev/null
+++ b/doc/audit/response-202109.tex
@@ -0,0 +1,291 @@
1\documentclass[11pt]{article}
2\oddsidemargin=0in \evensidemargin=0in
3\textwidth=6.2in \textheight=8.7in
4%\topmargin=-0.2in
5
6\usepackage[ansinew]{inputenc}
7\usepackage{makeidx,amsmath,amssymb,exscale,multicol,epsfig,graphics,url}
8
9\begin{document}
10\pagestyle{headings}
11\title{Final response to the \\ GNU Taler security audit in Q2/Q3 2020}
12\author{Christian Grothoff \and Florian Dold}
13
14\maketitle
15
16\section{Abstract}
17
18This is the response to the source code audit report CodeBlau
19created for GNU Taler in Q2/Q3 2020.
20
21\section{Management Summary}
22
23We thank CodeBlau for their detailed report and thorough analysis. We are
24particularly impressed that they reported issues against components that were
25not even in-scope, and also that they found an {\em interesting} new corner
26case we had not previously considered. Finally, we also find several of their
27architectural recommendations to strengthen security to be worthwhile, and
28while some were already on our long-term roadmap, we will reprioritize our
29roadmap given their recommendations.
30
31Given our extensive discussions with CodeBlau, we also have the impression
32that they really took the time to understand the system, and look forward
33to working with CodeBlau as a competent auditor for GNU Taler in the future.
34
35\section{Issues in the exchange}
36
37We agree with the issues CodeBlau discovered and both parties believe that
38they have all been addressed.
39
40\section{Issues in the auditor}
41
42We appreciate CodeBlau's extensive list of checks the Taler auditor performs,
43which was previously not documented adequately by us. We agree that the
44auditor still needs more comprehensive documentation.
45
46As for issue \#6416, we agree with the analysis. However, the proposed fix
47of making the primary key include the denomination would create other problems,
48such as the exchange sometimes not having the denomination key (link, refund)
49and the code in various places relying on the assumption of the coin's
50public key being unique. Furthermore, allowing coin key re-use may validate
51a terrible practice. We thus decided it is better to ``fail early'', and
52modified the code to check that the coin public key is ``unique'' during
53deposit, refresh and recoup and ensured that the exchange returns a proof
54of non-uniqueness in case of a violation. The test suite was extended to
55cover the corner case.
56
57{\bf Update:} We have now also addressed the (``soft'') exchange online
58signing key revocation issue (\#6161) reported originally by CodeBlau.
59The auditor now checks for key revocations before recording deposit
60confirmations. The impact is very minor, as this will merely prevent
61an adversary controlling an exchange online signing key from submitting
62false claims to the auditor.
63
64
65
66\section{Issues in GNUnet}
67
68We agree with the issues CodeBlau discovered and both parties believe that
69they have all been addressed.
70
71\section{General remarks on the code}
72
73We understand that writing the code in another programming language may make
74certain checks for the auditor less work to implement. However, our choice of C
75is based on the advantages that make it superior to contemporary languages for
76our use case: relatively low complexity of the language (compared to C++);
77availability of mature compilers, static and dynamic analysis tools;
78predictable performance; access to stable and battle-tested libraries; and
79future-proofness due to portability to older systems as well as new platforms.
80
81We believe creating a parallel implementation in other languages would provide
82advantages, especially with respect to avoiding ``the implementation is the
83specification''-style issues. However, given limited resources will not make
84this a priority.
85
86We disagree that all modern software development has embraced the idea that
87memory errors are to be handled in ways other than terminating or restarting
88the process. Many programming languages (Erlang, Java) hardly offer any other
89means of handling out-of-memory situations than to terminate the process. We
90also insist that Taler {\em does} handle out-of-memory as it does have code
91that terminates the process (we do {\em not} simply ignore the return value
92from {\tt malloc()} or other allocation functions!). We simply consider that
93terminating the process (which is run by a hypervisor that will restart the
94service) is the correct way to handle out-of-memory situations. We also have
95limits in place that should prevent attackers from causing large amounts of
96memory to be consumed, and also have code to automatically preemptively
97restart the process to guard against memory exhaustion from memory
98fragmentation. Finally, a common problem with abrupt termination may be
99corrupted files. However, the code mostly only reads from files and limits
100writing to the Postgres database. Hence, there is no possibility of corrupt
101files being left behind even in the case of abnormal termination.
102
103
104\section{More specs and documentation code}
105
106We agree with the recommendation that the documentation should be improved,
107and will try to improve it along the lines recommended by CodeBlau.
108
109\section{Protocol change: API for uniformly distributed seeds}
110
111We agree with the suggestion, have made the necessary changes, and both
112parties believe that the suggestion has been implemented.
113
114\section{Reduce code complexity}
115
116\subsection{Reduce global variables}
117
118While we do not disagree with the general goal to have few global variables,
119we also believe that there are cases where global variables make sense.
120
121We have already tried to minimize the scope of variables. The remaining few
122global variables are largely ``read-only'' configuration data. The report does
123not point out specific instances that would be particularly beneficial to
124eliminate. As we continue to work on the code, we will of course evaluate
125whether the removal of a particular global variable would make the code
126cleaner.
127
128Also, we want to point out that all global variables we introduce
129in the exchange are indicated with a prefix {\tt TEH\_} in the code, so they
130are easy to identify as such.
131
132\subsection{Callbacks, type pruning}
133
134We understand that higher order functions in C can be confusing, but this
135is also a common pattern to enable code re-use and asynchronous execution
136which is essential for network applications. We do not believe that we
137use callbacks {\em excessively}. Rewriting the code in another language
138may indeed make this part easier to understand, alas would have other
139disadvantages as pointed out previously.
140
141{\bf Update:} We introduced additional functions to replace
142variadic calls to functions that cannot be type-checked by
143the compiler (like libjansson's {\tt json\_pack()}) with
144type-safe versions (like the new {\tt GNUNET\_JSON\_PACK()}).
145
146
147\subsection{Initializing structs with memset}
148
149Using {\tt memset()} first prevents compiler (or valgrind) warnings about
150using uninitialized memory, possibly hiding bugs. We also do use struct
151initialization in many cases.
152
153The GNUnet-wrappers are generally designed to be ``safer'' or ``stricter''
154variants of the corresponding libc functions, and not merely ``the same''.
155Hence we do not believe that renaming {\tt GNUNET\_malloc} is indicated.
156
157The argument that {\tt memset()}ing first makes the code inherently more
158obvious also seems fallacious, as it would commonly result in dead stores,
159which can confuse developers and produce false-positive warnings from static
160analysis tools.
161
162\subsection{NULL pointer handling}
163
164The problem with the ``goto fail'' style error handling is that it rarely
165results in specific error handling where diagnostics are created that are
166specific to the error. Using this style of programming encourages developers
167to create simplistic error handling, which can result in inappropriate error
168handling logic and also makes it harder to attribute errors to the specific
169cause.
170
171However, we have no prohibition on using this style of error handling either:
172if it is appropriate, develpers should make a case-by-case decision as to how
173to best handle a specific error.
174
175We have made some first changes to how {\tt GNUNET\_free()} works in response
176to the report, and will discuss further changes with the GNUnet development
177team.
178
179\subsection{Hidden security assumptions}
180
181We disagree that the assumptions stated are ``hidden'', as (1) the Taler code
182has its own checks to warrant that the requirements of the {\tt
183 GNUNET\_malloc()} API are satisfied (so enforcement is not limited to the
184abstraction layer), and (2) the maximum allocation size limit is quite clearly
185specified in the GNUnet documentation. Also, the GNUnet-functions are not
186merely an abstraction layer for portability, but they provided extended
187semantics that we rely upon. So it is not like it is possible to swap this
188layer and expect anything to continue to work.
189
190When we use the libjansson library, it is understood that it does not use
191the GNUnet operations, and the code is careful about this distinction.
192
193\subsection{Get rid of boolean function arguments}
194
195We agree that this can make the code more readable, and have in some places
196already changed the code in this way.
197
198\section{Structural Recommendation}
199
200\subsection{Least privilege}
201
202It is wrong to say that GNU Taler has ``no work done'' on privilege separation.
203For example, the {\tt taler-exchange-dbinit} tool is the only tool that requires
204CREATE, ALTER and DROP rights on database tables, thus enusring that the ``main''
205process does not need these rights.
206
207We also already had the {\tt taler-exchange-keyup} tool responsible for
208initializing keys. In response to the audit, we already changed the GNUnet API
209to make sure that tools do not create keys as a side-effect of trying to read
210non-existent key files.
211
212{\bf Update:} We have now implemented full privilege separation for access to the online
213cryptographic signing keys. Details about the design are documented in the
214section ``Exchange crypto helper design'' at \url{https://docs.taler.net/} of
215Chapter 12.
216
217{\bf Update:} In doing so, we also added a new type of signing key, the
218``security module'' signing key. This is used by the newly separated ``security
219module`` processes to sign the public keys that they guard the private keys
220for. The security module signatures are verified by the new
221``taler-exchange-offline`` tool to ensure that even if the {\tt
222taler-exchange-httpd} process is compromised, the offline signature tool would
223refuse to sign new public keys that do not originate from the security
224module(s). The security module public keys can be given in the configuration,
225or are learned TOFU-style.
226
227
228\subsection{File system access}
229
230The auditor helpers actually only read from the file system, only the LaTeX
231invocation to compile the final report to PDF inherently needs write
232access. We do not predict that we will retool LaTeX. Also, the file system
233access is completely uncritical, as the auditor by design runs on a system
234that is separate from the production exchange system.
235
236Because that system will not have {\em any} crypto keys (not even the one of
237the auditor!), CodeBlau is wrong to assume that reading from or writing to the
238file system represents a security threat.
239
240We have started to better document the operational requirements on running the
241auditor.
242
243{\bf Update:} On the exchange side, we have now moved additional information
244from the file system into the database, in particular information about offline signatures
245(including key revocations) and wire fees. This simplifies the deployment and
246the interaction with offline key signing mechanism. The remaining disk accesses are for
247quite fundamental configuration data (which ports to bind to, configuration to
248access the database, etc.), and of course the program logic itself.
249
250{\bf Update:} We have also restructured the configuration such that only
251the {\tt taler-exchange-transfer} and {\tt taler-exchange-wirewatch} programs
252need to have access to the more sensitive bank account configuration data,
253and so that these processes can run as a separate user.
254
255
256\subsection{Avoid dlopen}
257
258Taler actually uses {\tt ltdlopen()} from GNU libtool, which provides
259compiler flags to convert the dynamic linkage into static linkage. For
260development, dynamic linkage has many advantages.
261
262We plan to test and document how to build GNU Taler with only static
263linkage, and will recommend this style of deployment for the Taler
264exchange for production.
265
266\subsection{Reduce reliance on PostgreSQL}
267
268CodeBlau's suggestion to use an append-only transaction logging service in
269addition to the PostgreSQL database is a reasonable suggestion for a
270production-grade deployment of GNU Taler, as it would allow partial disaster
271recovery even in the presence of an attacker that has gained write access to
272the exchange's database.
273
274We are currently still investigating whether the transaction logging should be
275implemented directly by the exchange service, or via the database's extensible
276replication mechanism. Any implementation of such an append-only logging
277mechanism must be carefully designed to ensure it does not negatively impact
278the exchange's availability and does not interfere with serializability of
279database transactions. As such we believe that transaction logging can only be
280provided on a best-effort basis. Fortunately, even a best-effort append-only
281transaction log would serve to limit the financial damage incurred by the
282exchange in an active database compromise scenario.
283
284{\bf Update:} We have tightened the installation instructions for the
285Taler exchange to guide users towards a more restricted Postgres setup,
286tightening which components of the Exchange need what level of access
287to the exchange database.
288
289
290
291\end{document}