=====================
Retryable Write Tests
=====================

.. contents::

----

Introduction
============

The YAML and JSON files in this directory tree are platform-independent tests
that drivers can use to prove their conformance to the Retryable Writes spec.

Several prose tests, which are not easily expressed in YAML, are also presented
in this file. Those tests will need to be manually implemented by each driver.

Tests will require a MongoClient created with options defined in the tests.
Integration tests will require a running MongoDB cluster with server version
3.6.0 or later. The ``{setFeatureCompatibilityVersion: 3.6}`` admin command
will also need to have been executed to enable support for retryable writes on
the cluster. Some tests may have more stringent version requirements depending
on the fail points used.

Server Fail Point
=================

onPrimaryTransactionalWrite
---------------------------

Some tests depend on a server fail point, ``onPrimaryTransactionalWrite``, which
allows us to force a network error before the server would return a write result
to the client. The fail point also allows control over whether the server will
successfully commit the write via its ``failBeforeCommitExceptionCode`` option.
Keep in mind that the fail point only triggers for transactional writes (i.e.
write commands including ``txnNumber`` and ``lsid`` fields). See
`SERVER-29606`_ for more information.

.. _SERVER-29606: https://jira.mongodb.org/browse/SERVER-29606

The fail point may be configured like so::

    db.runCommand({
        configureFailPoint: "onPrimaryTransactionalWrite",
        mode: <string|document>,
        data: <document>
    });

``mode`` is a generic fail point option and may be assigned a string or document
value. The string values ``"alwaysOn"`` and ``"off"`` may be used to enable or
disable the fail point, respectively. A document may be used to specify either
``times`` or ``skip``, which are mutually exclusive:

- ``{ times: <integer> }`` may be used to limit the number of times the fail
  point may trigger before transitioning to ``"off"``.
- ``{ skip: <integer> }`` may be used to defer the first trigger of a fail
  point, after which it will transition to ``"alwaysOn"``.

The ``data`` option is a document that may be used to specify options that
control the fail point's behavior. As noted in `SERVER-29606`_,
``onPrimaryTransactionalWrite`` supports the following ``data`` options, which
may be combined if desired:

- ``closeConnection``: Boolean option, which defaults to ``true``. If ``true``,
  the connection on which the write is executed will be closed before a result
  can be returned.
- ``failBeforeCommitExceptionCode``: Integer option, which is unset by default.
  If set, the specified exception code will be thrown and the write will not be
  committed. If unset, the write will be allowed to commit.

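The ``mode`` and ``data`` rules above can be sketched as a small builder that
rejects invalid combinations before anything is sent to the server. This is
only an illustration: ``buildFailPointCommand`` is a hypothetical helper name,
not part of the spec or of any driver API.

```javascript
// Hypothetical helper (an assumption for illustration): builds the
// configureFailPoint command document for onPrimaryTransactionalWrite.
function buildFailPointCommand(mode, data = {}) {
  if (typeof mode === "string") {
    if (mode !== "alwaysOn" && mode !== "off") {
      throw new Error(`unsupported mode string: ${mode}`);
    }
  } else if (mode !== null && typeof mode === "object") {
    const hasTimes = Number.isInteger(mode.times);
    const hasSkip = Number.isInteger(mode.skip);
    // times and skip are mutually exclusive, so exactly one must be present.
    if (hasTimes === hasSkip) {
      throw new Error("mode document must specify exactly one of times or skip");
    }
  } else {
    throw new Error("mode must be a string or a document");
  }
  // configureFailPoint is listed first so it is the first field of the command.
  return { configureFailPoint: "onPrimaryTransactionalWrite", mode, data };
}

// Trigger once, and fail the write before it commits with exception code 1.
const cmd = buildFailPointCommand({ times: 1 }, { failBeforeCommitExceptionCode: 1 });
console.log(JSON.stringify(cmd));
```

The resulting document could then be passed to ``db.runCommand()`` as shown
earlier in this section.
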
failCommand
-----------

Some tests depend on a server fail point, ``failCommand``, which allows the
client to force the server to return an error. Unlike
``onPrimaryTransactionalWrite``, ``failCommand`` does not allow the client to
directly control whether the server will commit the operation (execution of the
write depends on whether the ``closeConnection`` and/or ``errorCode`` options
are specified). See `failCommand <../../transactions/tests#failcommand>`_ in
the Transactions spec test suite for more information.

Disabling Fail Points after Test Execution
------------------------------------------

After each test that configures a fail point, drivers should disable the fail
point to avoid spurious failures in subsequent tests. The fail point may be
disabled like so::

    db.runCommand({
        configureFailPoint: <fail point name>,
        mode: "off"
    });

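One way to make sure the fail point is disabled even when a test fails is to
wrap the test body in ``try``/``finally``. The sketch below assumes a ``db``
object with a synchronous ``runCommand()``, as in the legacy mongo shell;
``withFailPoint`` and ``fakeDb`` are hypothetical names used only for this
illustration.

```javascript
// Hypothetical wrapper: enables a fail point, runs the test body, and always
// disables the fail point afterwards.
function withFailPoint(db, failPoint, testBody) {
  db.runCommand(failPoint);
  try {
    return testBody();
  } finally {
    // Runs whether the test body returned or threw.
    db.runCommand({ configureFailPoint: failPoint.configureFailPoint, mode: "off" });
  }
}

// Stand-in for a real database handle; it only records the commands it receives.
const sentCommands = [];
const fakeDb = {
  runCommand(command) {
    sentCommands.push(command);
    return { ok: 1 };
  },
};

try {
  withFailPoint(
    fakeDb,
    { configureFailPoint: "onPrimaryTransactionalWrite", mode: { times: 1 } },
    () => {
      throw new Error("simulated test failure");
    }
  );
} catch (err) {
  // The failure still propagates to the caller (e.g. the test runner).
}
console.log(sentCommands.map((command) => command.mode));
```
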
Use as Integration Tests
========================

Integration tests are expressed in YAML and can be run against a replica set or
sharded cluster as denoted by the top-level ``runOn`` field. Tests that rely on
the ``onPrimaryTransactionalWrite`` fail point cannot be run against a sharded
cluster because the fail point is not supported by mongos.

The tests exercise the following scenarios:

- Single-statement write operations

  - Each test expecting a write result will encounter at most one network error
    for the write command. Retry attempts should return without error and allow
    the operation to succeed. Observation of the collection state will assert
    that the write occurred at most once.

  - Each test expecting an error will encounter successive network errors for
    the write command. Observation of the collection state will assert that the
    write was never committed on the server.

- Multi-statement write operations

  - Each test expecting a write result will encounter at most one network error
    for some write command(s) in the batch. Retry attempts should return without
    error and allow the batch to ultimately succeed. Observation of the
    collection state will assert that each write occurred at most once.

  - Each test expecting an error will encounter successive network errors for
    some write command in the batch. The batch will ultimately fail with an
    error, but observation of the collection state will assert that the failing
    write was never committed on the server. We may observe that earlier writes
    in the batch occurred at most once.

We cannot test a scenario where the first and second attempts both encounter
network errors but the write does actually commit during one of those attempts.
This is because (1) the fail point only triggers when a write would be committed
and (2) the ``skip`` and ``times`` options are mutually exclusive. That said,
such a test would mainly assert the server's correctness for at-most-once
semantics and is not essential to assert driver correctness.

Test Format
-----------

Each YAML file has the following keys:

- ``runOn`` (optional): An array of server version and/or topology requirements
  for which the tests can be run. If the test environment satisfies one or more
  of these requirements, the tests may be executed; otherwise, this file should
  be skipped. If this field is omitted, the tests can be assumed to have no
  particular requirements and should be executed. Each element will have some or
  all of the following fields:

  - ``minServerVersion`` (optional): The minimum server version (inclusive)
    required to successfully run the tests. If this field is omitted, it should
    be assumed that there is no lower bound on the required server version.

  - ``maxServerVersion`` (optional): The maximum server version (inclusive)
    against which the tests can be run successfully. If this field is omitted,
    it should be assumed that there is no upper bound on the required server
    version.

  - ``topology`` (optional): An array of server topologies against which the
    tests can be run successfully. Valid topologies are "single", "replicaset",
    and "sharded". If this field is omitted, the default is all topologies (i.e.
    ``["single", "replicaset", "sharded"]``).

- ``data``: The data that should exist in the collection under test before each
  test run.

- ``tests``: An array of tests that are to be run independently of each other.
  Each test will have some or all of the following fields:

  - ``description``: The name of the test.

  - ``clientOptions``: Parameters to pass to MongoClient().

  - ``useMultipleMongoses`` (optional): If ``true``, the MongoClient for this
    test should be initialized with multiple mongos seed addresses. If ``false``
    or omitted, only a single mongos address should be specified. This field has
    no effect for non-sharded topologies.

  - ``failPoint`` (optional): The ``configureFailPoint`` command document to run
    to configure a fail point on the primary server. Drivers must ensure that
    ``configureFailPoint`` is the first field in the command. This option and
    ``useMultipleMongoses: true`` are mutually exclusive.

  - ``operation``: Document describing the operation to be executed. The
    operation should be executed through a collection object derived from a
    client that has been created with ``clientOptions``. The operation will have
    some or all of the following fields:

    - ``name``: The name of the operation as defined in the CRUD specification.

    - ``arguments``: The names and values of arguments from the CRUD
      specification.

  - ``outcome``: Document describing the return value and/or expected state of
    the collection after the operation is executed. This will have some or all
    of the following fields:

    - ``error``: If ``true``, the test should expect an error or exception. Note
      that some drivers may report server-side errors as a write error within a
      write result object.

    - ``result``: The return value from the operation. This will correspond to
      an operation's result object as defined in the CRUD specification. This
      field may be omitted if ``error`` is ``true``. If this field is present
      and ``error`` is ``true`` (generally for multi-statement tests), the
      result reports information about operations that succeeded before an
      unrecoverable failure. In that case, drivers may choose to check the
      result object if their BulkWriteException (or equivalent) provides access
      to a write result object.

      - ``errorLabelsContain``: A list of error label strings that the
        error is expected to have.

      - ``errorLabelsOmit``: A list of error label strings that the
        error is expected not to have.

    - ``collection``:

      - ``name`` (optional): The name of the collection to verify. If this isn't
        present then use the collection under test.

      - ``data``: The data that should exist in the collection after the
        operation has been run.

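The ``runOn`` rules described above might be evaluated by a test runner roughly
as follows. This is a sketch under stated assumptions: ``compareVersions``,
``shouldRun``, and the ``env`` object shape are hypothetical, and treating
missing version components as zero (so that ``"4.0"`` equals ``"4.0.0"``) is
one possible interpretation rather than something the text above prescribes.

```javascript
// Compare dotted version strings numerically. Missing components are treated
// as zero (an assumption), so "4.0" compares equal to "4.0.0".
function compareVersions(a, b) {
  const partsA = a.split(".").map(Number);
  const partsB = b.split(".").map(Number);
  for (let i = 0; i < Math.max(partsA.length, partsB.length); i++) {
    const diff = (partsA[i] || 0) - (partsB[i] || 0);
    if (diff !== 0) return diff;
  }
  return 0;
}

// Returns true if the file's tests should run in the given environment.
// env is a hypothetical shape: { serverVersion: "4.0.0", topology: "replicaset" }.
function shouldRun(runOn, env) {
  // An omitted runOn means there are no particular requirements.
  if (runOn === undefined) return true;
  // The environment only needs to satisfy one of the listed requirements.
  return runOn.some((req) => {
    if (req.minServerVersion !== undefined &&
        compareVersions(env.serverVersion, req.minServerVersion) < 0) {
      return false;
    }
    if (req.maxServerVersion !== undefined &&
        compareVersions(env.serverVersion, req.maxServerVersion) > 0) {
      return false;
    }
    // An omitted topology list allows every topology.
    if (req.topology !== undefined && !req.topology.includes(env.topology)) {
      return false;
    }
    return true;
  });
}

const requirements = [{ minServerVersion: "3.6", topology: ["replicaset"] }];
console.log(shouldRun(requirements, { serverVersion: "4.0.0", topology: "replicaset" }));
console.log(shouldRun(requirements, { serverVersion: "4.0.0", topology: "sharded" }));
```
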
Split Batch Tests
=================

The YAML tests specify bulk write operations that are split by command type
(e.g. sequence of insert, update, and delete commands). Multi-statement write
operations may also be split due to ``maxWriteBatchSize``,
``maxBsonObjectSize``, or ``maxMessageSizeBytes``.

For instance, an insertMany operation with five 10 MiB documents executed using
OP_MSG payload type 0 (i.e. entire command in one document) would be split into
five insert commands in order to respect the 16 MiB ``maxBsonObjectSize`` limit.
The same insertMany operation executed using OP_MSG payload type 1 (i.e. command
arguments pulled out into a separate payload vector) would be split into two
insert commands in order to respect the 48 MB ``maxMessageSizeBytes`` limit.

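The arithmetic in this example can be checked with a simple greedy packing
sketch. It is a simplification: ``splitBySize`` is a hypothetical helper, the
48 MB limit is taken to be 48,000,000 bytes, and the size of the command
envelope itself is ignored, so a real driver's accounting will differ in
detail.

```javascript
const MIB = 1024 * 1024;

// Greedily pack document sizes (in bytes) into batches whose combined size
// stays within the given limit. Command overhead is ignored (an assumption).
function splitBySize(docSizes, limitBytes) {
  const batches = [];
  let current = [];
  let currentSize = 0;
  for (const size of docSizes) {
    // Start a new batch when adding this document would exceed the limit.
    if (current.length > 0 && currentSize + size > limitBytes) {
      batches.push(current);
      current = [];
      currentSize = 0;
    }
    current.push(size);
    currentSize += size;
  }
  if (current.length > 0) batches.push(current);
  return batches;
}

const fiveDocs = Array(5).fill(10 * MIB);

// Payload type 0: the whole command must fit in one 16 MiB document.
const type0Batches = splitBySize(fiveDocs, 16 * MIB);
// Payload type 1: documents are bounded by the 48 MB message size instead.
const type1Batches = splitBySize(fiveDocs, 48000000);

console.log(type0Batches.length, type1Batches.length);
```

Under these assumptions the first call yields five single-document batches and
the second yields a batch of four documents followed by a batch of one.
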
Given knowledge of when a driver will split operations, the
``onPrimaryTransactionalWrite`` fail point's ``skip`` option may be used to
control when the fail point first triggers. Once triggered, the fail point will
transition to the ``alwaysOn`` state until disabled. Driver authors should also
note that the server attempts to process all documents in a single insert
command within a single commit (i.e. one insert command with five documents may
only trigger the fail point once). This behavior is unique to insert commands
(each statement in an update and delete command is processed independently).

If testing an insert that is split into two commands, a ``skip`` of one will
allow the fail point to trigger on the second insert command (because all
documents in the first command will be processed in the same commit). When
testing an update or delete that is split into two commands, the ``skip`` should
be set to the number of statements in the first command to allow the fail point
to trigger on the second command.

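The choice of ``skip`` value described above can be written down as a small
function. ``skipForSecondCommand`` is a hypothetical helper for illustration,
and it assumes the first insert command is in fact processed in a single
commit, which the server only attempts to do.

```javascript
// Hypothetical helper: returns the skip value that should let the fail point
// first trigger on the second command of an operation split into two commands.
function skipForSecondCommand(commandName, statementsInFirstCommand) {
  if (commandName === "insert") {
    // All documents in the first insert command are assumed to be processed
    // in one commit, so only one trigger opportunity needs to be skipped.
    return 1;
  }
  if (commandName === "update" || commandName === "delete") {
    // Each statement is processed independently, so skip one per statement.
    return statementsInFirstCommand;
  }
  throw new Error(`unsupported command: ${commandName}`);
}

const insertMode = { skip: skipForSecondCommand("insert", 4) };
const updateMode = { skip: skipForSecondCommand("update", 4) };
console.log(insertMode, updateMode);
```
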
Command Construction Tests
==========================

Drivers should also assert that command documents are properly constructed with
or without a transaction ID, depending on whether the write operation is
supported. `Command Monitoring`_ may be used to check for the presence of a
``txnNumber`` field in the command document. Note that command documents may
always include an ``lsid`` field per the `Driver Session`_ specification.

.. _Command Monitoring: ../../command-monitoring/command-monitoring.rst
.. _Driver Session: ../../sessions/driver-sessions.rst

These tests may be run against both a replica set and a sharded cluster.

Drivers should test that transaction IDs are never included in commands for
unsupported write operations:

* Write commands with unacknowledged write concerns (e.g. ``{w: 0}``)

* Unsupported single-statement write operations

  - ``updateMany()``
  - ``deleteMany()``

* Unsupported multi-statement write operations

  - ``bulkWrite()`` that includes ``UpdateMany`` or ``DeleteMany``

* Unsupported write commands

  - ``aggregate`` with write stage (e.g. ``$out``, ``$merge``)

Drivers should test that transaction IDs are always included in commands for
supported write operations:

* Supported single-statement write operations

  - ``insertOne()``
  - ``updateOne()``
  - ``replaceOne()``
  - ``deleteOne()``
  - ``findOneAndDelete()``
  - ``findOneAndReplace()``
  - ``findOneAndUpdate()``

* Supported multi-statement write operations

  - ``insertMany()`` with ``ordered=true``
  - ``insertMany()`` with ``ordered=false``
  - ``bulkWrite()`` with ``ordered=true`` (no ``UpdateMany`` or ``DeleteMany``)
  - ``bulkWrite()`` with ``ordered=false`` (no ``UpdateMany`` or ``DeleteMany``)

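The two lists above amount to a predicate that a test harness could compare
against the commands observed through command monitoring. The sketch below is
an assumption-laden illustration: ``expectsTxnNumber`` and the shape of the
``operation`` object (``name``, ``writeConcern``, ``requests``) are
hypothetical and not defined by this document.

```javascript
// Operations from the "supported" lists above.
const SUPPORTED_OPERATIONS = new Set([
  "insertOne", "updateOne", "replaceOne", "deleteOne",
  "findOneAndDelete", "findOneAndReplace", "findOneAndUpdate",
  "insertMany",
]);
// Bulk write models that make the whole bulkWrite unsupported.
const UNSUPPORTED_BULK_MODELS = new Set(["UpdateMany", "DeleteMany"]);

// Returns true if the command for this operation is expected to carry a
// txnNumber field. The operation shape is hypothetical.
function expectsTxnNumber(operation) {
  // Unacknowledged writes never include a transaction ID.
  if (operation.writeConcern && operation.writeConcern.w === 0) return false;
  if (SUPPORTED_OPERATIONS.has(operation.name)) return true;
  if (operation.name === "bulkWrite") {
    return !operation.requests.some((req) => UNSUPPORTED_BULK_MODELS.has(req.model));
  }
  // updateMany, deleteMany, aggregate with $out or $merge, and anything else.
  return false;
}

console.log(expectsTxnNumber({ name: "insertOne" }));
console.log(expectsTxnNumber({ name: "updateMany" }));
console.log(expectsTxnNumber({
  name: "bulkWrite",
  requests: [{ model: "InsertOne" }, { model: "DeleteMany" }],
}));
```
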
Prose Tests
===========

The following tests ensure that retryable writes work properly with replica sets
and sharded clusters.

#. Test that retryable writes raise an exception when using the MMAPv1 storage
   engine. For this test, execute a write operation, such as ``insertOne``,
   which should generate an exception. Assert that the error message is the
   replacement error message::

     This MongoDB deployment does not support retryable writes. Please add
     retryWrites=false to your connection string.

   and the error code is 20.

   **Note**: Drivers that rely on ``serverStatus`` to determine the storage engine
   in use MAY skip this test for sharded clusters, since ``mongos`` does not report
   this information in its ``serverStatus`` response.

Changelog
=========

:2019-10-21: Add ``errorLabelsContain`` and ``errorLabelsOmit`` fields to ``result``

:2019-08-07: Add Prose Tests section

:2019-06-07: Mention $merge stage for aggregate alongside $out

:2019-03-01: Add top-level ``runOn`` field to denote server version and/or
             topology requirements for the test file. Removes the
             ``minServerVersion`` and ``maxServerVersion`` top-level fields,
             which are now expressed within ``runOn`` elements.

             Add test-level ``useMultipleMongoses`` field.