DD 76: Paivana - Fighting AI Bots with GNU Taler
################################################

Summary
=======

This design document describes the architecture of an AI Web firewall using GNU
Taler, as well as new features that are required for the implementation.

Motivation
==========

AI bots are causing enormous amounts of traffic by scraping sites like git
forges. They respect neither robots.txt nor 5xx HTTP responses. Solutions like
Anubis and IP-based blocking no longer work at this point.

Requirements
============

* Must withstand high traffic from bots. Requests made before a payment has
  happened must be *very* cheap, both in terms of response generation and
  database interaction.

Proposed Solution
=================

Architecture:

* paivana-httpd is a reverse proxy that sits between ingress HTTP(S) traffic
  and the protected upstream service.
* paivana-httpd is configured with a particular merchant backend.
* A payment template must be set up in the merchant backend (called ``{template_id}``
  from here on).

Steps:

* Browser visits git.taler.net
* paivana-httpd checks for a Paivana cookie

  * If the cookie is set and valid, the request is reverse-proxied to upstream. *Stop.*
  * Otherwise, a paywall page is rendered, continue.

* The browser (rendering the paywall page) generates a random paivana ID via JS.
* Based on this paivana ID, a ``taler://pay-template/{paivana_backend}/.well-known/paivana/{template_id}?paivana_id={paivana_id}``
  URI is generated and rendered as a QR code and link.
* The browser long-polls on a ``{paivana_backend}/.well-known/paivana/paivanas/{paivana_id}``
  endpoint that returns when an order with the given paivana ID has been paid
  for (regardless of the order ID, which is not known to the browser).
* A wallet now needs to instantiate the pay template and pay for the resulting order
  by talking to the Paivana backend, which proxies the requests to the merchant backend
  and in the process learns the order ID and the payment status change.
  paivana-httpd may also implement the required subset of the merchant backend itself in the future.
* When the long-poller returns and the payment has succeeded, the
  HTTP response sets the Paivana cookie. The browser reloads the page.

The Paivana cookie is computed as ``exp_timestamp || '-' || H(client_ip || paivana_server_secret || exp_timestamp)``.

Problems:

* A smart attacker might still create a lot of orders via the pay-template.

  * Solution A: Don't care, unlikely to happen in the first place.
  * Solution B: Rate-limit template instantiation on a per-IP basis.

Implementation:

* Paivana needs to support extended template instantiation with a ``paivana_id``.
* The Paivana component needs to be specified / implemented.
* Wallet-core needs support for a ``paivana_id`` in pay templates.


Test Plan
=========

* Deploy it for git.taler.net

Definition of Done
==================

N/A

Alternatives
============

Drawbacks
=========

* Requires JavaScript

  * Could be made to work without JS by returning some ``Paivana: ...`` header.

Discussion / Q&A
================
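
To make the cookie construction from the Proposed Solution section concrete, the
following is a minimal sketch of how such a cookie could be minted and checked.
It is not the paivana-httpd implementation. The document does not specify ``H``
or any encoding, so SHA-256, a hex digest, and a decimal Unix timestamp in
seconds are assumptions here, and the function names ``make_cookie`` and
``check_cookie`` are hypothetical.

```python
import hashlib
import hmac
import time
from typing import Optional


def make_cookie(client_ip: str, server_secret: bytes, exp_timestamp: int) -> str:
    """Build exp_timestamp || '-' || H(client_ip || secret || exp_timestamp).

    Assumption: H is SHA-256 with a hex digest; the design document leaves
    the hash function and the encodings open.
    """
    exp = str(exp_timestamp)
    digest = hashlib.sha256(
        client_ip.encode() + server_secret + exp.encode()
    ).hexdigest()
    return f"{exp}-{digest}"


def check_cookie(
    cookie: str,
    client_ip: str,
    server_secret: bytes,
    now: Optional[int] = None,
) -> bool:
    """Return True if the cookie is well-formed, unexpired, and bound to client_ip."""
    if now is None:
        now = int(time.time())
    exp_str, sep, _digest = cookie.partition("-")
    # Reject anything that is not "<ascii digits>-<rest>".
    if not sep or not (exp_str.isascii() and exp_str.isdigit()):
        return False
    exp = int(exp_str)
    if exp <= now:
        return False  # expired
    expected = make_cookie(client_ip, server_secret, exp)
    # Constant-time comparison; no database lookup is needed.
    return hmac.compare_digest(cookie.encode(), expected.encode())
```

Under these assumptions the check needs only one hash computation and no stored
state, which fits the requirement that handling requests before a payment stays
very cheap.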