=====================
Retryable Write Tests
=====================

.. contents::

----

Introduction
============

The YAML and JSON files in this directory tree are platform-independent tests
that drivers can use to prove their conformance to the Retryable Writes spec.

Several prose tests, which are not easily expressed in YAML, are also presented
in this file. Those tests will need to be manually implemented by each driver.

Tests will require a MongoClient created with options defined in the tests.
Integration tests will require a running MongoDB cluster with server versions
3.6.0 or later. The ``{setFeatureCompatibilityVersion: 3.6}`` admin command
will also need to have been executed to enable support for retryable writes on
the cluster. Some tests may have more stringent version requirements depending
on the fail points used.

Server Fail Point
=================

onPrimaryTransactionalWrite
---------------------------

Some tests depend on a server fail point, ``onPrimaryTransactionalWrite``, which
allows us to force a network error before the server would return a write result
to the client. The fail point also controls whether the server will successfully
commit the write via its ``failBeforeCommitExceptionCode`` option. Keep in mind
that the fail point only triggers for transactional writes (i.e. write commands
including ``txnNumber`` and ``lsid`` fields). See `SERVER-29606`_ for more
information.

.. _SERVER-29606: https://jira.mongodb.org/browse/SERVER-29606

The fail point may be configured like so::

    db.runCommand({
        configureFailPoint: "onPrimaryTransactionalWrite",
        mode: <string|document>,
        data: <document>
    });

``mode`` is a generic fail point option and may be assigned a string or document
value. The string values ``"alwaysOn"`` and ``"off"`` may be used to enable or
disable the fail point, respectively. A document may be used to specify either
``times`` or ``skip``, which are mutually exclusive:

- ``{ times: <integer> }`` may be used to limit the number of times the fail
  point may trigger before transitioning to ``"off"``.
- ``{ skip: <integer> }`` may be used to defer the first trigger of a fail
  point, after which it will transition to ``"alwaysOn"``.

The ``data`` option is a document that may be used to specify options that
control the fail point's behavior. As noted in `SERVER-29606`_,
``onPrimaryTransactionalWrite`` supports the following ``data`` options, which
may be combined if desired:

- ``closeConnection``: Boolean option, which defaults to ``true``. If ``true``,
  the connection on which the write is executed will be closed before a result
  can be returned.
- ``failBeforeCommitExceptionCode``: Integer option, which is unset by default.
  If set, the specified exception code will be thrown and the write will not be
  committed. If unset, the write will be allowed to commit.

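As a rough sketch of how the ``mode`` and ``data`` options above fit together,
the helper below assembles a ``configureFailPoint`` command document. The
function name ``buildFailPointCommand`` and its validation rules are
illustrative assumptions, not part of the spec or of any driver API. It only
builds the document and never contacts a server.

```javascript
// Hypothetical helper (not part of the spec): builds the command document
// described above without sending it anywhere.
function buildFailPointCommand(mode, data) {
  if (typeof mode === "string") {
    if (mode !== "alwaysOn" && mode !== "off") {
      throw new Error('string mode must be "alwaysOn" or "off"');
    }
  } else if (mode !== null && typeof mode === "object") {
    const hasTimes = Number.isInteger(mode.times);
    const hasSkip = Number.isInteger(mode.skip);
    // times and skip are mutually exclusive, so exactly one must be present.
    if (hasTimes === hasSkip) {
      throw new Error("mode document must contain exactly one of times or skip");
    }
  } else {
    throw new Error("mode must be a string or a document");
  }

  // configureFailPoint is written first so that it is the first field.
  const command = {
    configureFailPoint: "onPrimaryTransactionalWrite",
    mode: mode,
  };
  if (data !== undefined) {
    command.data = data;
  }
  return command;
}

// Trigger once with the default data options (closeConnection defaults to
// true on the server), then transition to "off".
const closeOnce = buildFailPointCommand({ times: 1 });

// Skip two triggers, then keep the connection open and prevent the commit.
// The exception code 1 is an arbitrary example value.
const failLater = buildFailPointCommand(
  { skip: 2 },
  { closeConnection: false, failBeforeCommitExceptionCode: 1 }
);
```

The resulting documents could then be passed to whatever mechanism a driver's
test harness uses to run admin commands.
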
failCommand
-----------

Some tests depend on a server fail point, ``failCommand``, which allows the
client to force the server to return an error. Unlike
``onPrimaryTransactionalWrite``, ``failCommand`` does not allow the client to
directly control whether the server will commit the operation (execution of the
write depends on whether the ``closeConnection`` and/or ``errorCode`` options
are specified). See: `failCommand <../../transactions/tests#failcommand>`_ in
the Transactions spec test suite for more information.

Disabling Fail Points after Test Execution
------------------------------------------

After each test that configures a fail point, drivers should disable the fail
point to avoid spurious failures in subsequent tests. The fail point may be
disabled like so::

    db.runCommand({
        configureFailPoint: <fail point name>,
        mode: "off"
    });

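One way to make sure the fail point is always disabled, even when a test
throws, is to wrap the test body in a ``try``/``finally``. The sketch below
assumes two caller-supplied callbacks, ``runAdminCommand`` and ``testBody``;
both names are hypothetical and stand in for whatever a driver's test harness
provides.

```javascript
// Hypothetical test wrapper. `runAdminCommand` and `testBody` are callbacks
// supplied by the caller; nothing here is a real driver API.
async function withFailPoint(runAdminCommand, failPoint, testBody) {
  await runAdminCommand(failPoint);
  try {
    return await testBody();
  } finally {
    // Always disable the fail point, even when the test body throws.
    await runAdminCommand({
      configureFailPoint: failPoint.configureFailPoint,
      mode: "off",
    });
  }
}
```

Because the cleanup runs in ``finally``, a failing assertion in one test should
not leave the fail point active for the next one.
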
Use as Integration Tests
========================

Integration tests are expressed in YAML and can be run against a replica set or
sharded cluster as denoted by the top-level ``runOn`` field. Tests that rely on
the ``onPrimaryTransactionalWrite`` fail point cannot be run against a sharded
cluster because the fail point is not supported by mongos.

The tests exercise the following scenarios:

- Single-statement write operations

  - Each test expecting a write result will encounter at most one network error
    for the write command. Retry attempts should return without error and allow
    the operation to succeed. Observation of the collection state will assert
    that the write occurred at most once.

  - Each test expecting an error will encounter successive network errors for
    the write command. Observation of the collection state will assert that the
    write was never committed on the server.

- Multi-statement write operations

  - Each test expecting a write result will encounter at most one network error
    for some write command(s) in the batch. Retry attempts should return without
    error and allow the batch to ultimately succeed. Observation of the
    collection state will assert that each write occurred at most once.

  - Each test expecting an error will encounter successive network errors for
    some write command in the batch. The batch will ultimately fail with an
    error, but observation of the collection state will assert that the failing
    write was never committed on the server. We may observe that earlier writes
    in the batch occurred at most once.

We cannot test a scenario where the first and second attempts both encounter
network errors but the write does actually commit during one of those attempts.
This is because (1) the fail point only triggers when a write would be committed
and (2) the ``skip`` and ``times`` options are mutually exclusive. That said,
such a test would mainly assert the server's correctness for at-most-once
semantics and is not essential to assert driver correctness.

Test Format
-----------

Each YAML file has the following keys:

- ``runOn`` (optional): An array of server version and/or topology requirements
  for which the tests can be run. If the test environment satisfies one or more
  of these requirements, the tests may be executed; otherwise, this file should
  be skipped. If this field is omitted, the tests can be assumed to have no
  particular requirements and should be executed. Each element will have some or
  all of the following fields:

  - ``minServerVersion`` (optional): The minimum server version (inclusive)
    required to successfully run the tests. If this field is omitted, it should
    be assumed that there is no lower bound on the required server version.

  - ``maxServerVersion`` (optional): The maximum server version (inclusive)
    against which the tests can be run successfully. If this field is omitted,
    it should be assumed that there is no upper bound on the required server
    version.

  - ``topology`` (optional): An array of server topologies against which the
    tests can be run successfully. Valid topologies are "single", "replicaset",
    and "sharded". If this field is omitted, the default is all topologies (i.e.
    ``["single", "replicaset", "sharded"]``).

- ``data``: The data that should exist in the collection under test before each
  test run.

- ``tests``: An array of tests that are to be run independently of each other.
  Each test will have some or all of the following fields:

  - ``description``: The name of the test.

  - ``clientOptions``: Parameters to pass to MongoClient().

  - ``useMultipleMongoses`` (optional): If ``true``, the MongoClient for this
    test should be initialized with multiple mongos seed addresses. If ``false``
    or omitted, only a single mongos address should be specified. This field has
    no effect for non-sharded topologies.

  - ``failPoint`` (optional): The ``configureFailPoint`` command document to run
    to configure a fail point on the primary server. Drivers must ensure that
    ``configureFailPoint`` is the first field in the command. This option and
    ``useMultipleMongoses: true`` are mutually exclusive.

  - ``operation``: Document describing the operation to be executed. The
    operation should be executed through a collection object derived from a
    client that has been created with ``clientOptions``. The operation will have
    some or all of the following fields:

    - ``name``: The name of the operation as defined in the CRUD specification.

    - ``arguments``: The names and values of arguments from the CRUD
      specification.

  - ``outcome``: Document describing the return value and/or expected state of
    the collection after the operation is executed. This will have some or all
    of the following fields:

    - ``error``: If ``true``, the test should expect an error or exception. Note
      that some drivers may report server-side errors as a write error within a
      write result object.

    - ``result``: The return value from the operation. This will correspond to
      an operation's result object as defined in the CRUD specification. This
      field may be omitted if ``error`` is ``true``. If this field is present
      and ``error`` is ``true`` (generally for multi-statement tests), the
      result reports information about operations that succeeded before an
      unrecoverable failure. In that case, drivers may choose to check the
      result object if their BulkWriteException (or equivalent) provides access
      to a write result object.

      - ``errorLabelsContain``: A list of error label strings that the
        error is expected to have.

      - ``errorLabelsOmit``: A list of error label strings that the
        error is expected not to have.

    - ``collection``:

      - ``name`` (optional): The name of the collection to verify. If this isn't
        present then use the collection under test.

      - ``data``: The data that should exist in the collection after the
        operation has been run.

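The ``runOn`` rules described above can be sketched as a small predicate. The
function names and the shape of the ``env`` argument are assumptions made for
illustration. The version comparison treats missing components as ``0``, which
is one possible reading of "inclusive" (under it, a ``maxServerVersion`` of
``4.0`` would exclude ``4.0.3``); a test harness may prefer a different rule.

```javascript
// Compare dotted version strings such as "3.6.0" and "4.0" numerically.
// Missing components are treated as 0 (an assumption of this sketch).
function compareVersions(a, b) {
  const pa = a.split(".").map(Number);
  const pb = b.split(".").map(Number);
  for (let i = 0; i < Math.max(pa.length, pb.length); i++) {
    const diff = (pa[i] || 0) - (pb[i] || 0);
    if (diff !== 0) return diff < 0 ? -1 : 1;
  }
  return 0;
}

// Returns true if the file's tests should run in the given environment.
// `env` is a hypothetical shape:
//   { serverVersion: "4.0.3", topology: "replicaset" }
function shouldRunFile(runOn, env) {
  // An omitted runOn means there are no particular requirements.
  if (runOn === undefined) return true;
  // The environment only needs to satisfy one of the listed requirements.
  return runOn.some((req) => {
    if (req.minServerVersion !== undefined &&
        compareVersions(env.serverVersion, req.minServerVersion) < 0) {
      return false;
    }
    if (req.maxServerVersion !== undefined &&
        compareVersions(env.serverVersion, req.maxServerVersion) > 0) {
      return false;
    }
    // An omitted topology field means all topologies are acceptable.
    if (req.topology !== undefined && !req.topology.includes(env.topology)) {
      return false;
    }
    return true;
  });
}
```
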
Split Batch Tests
=================

The YAML tests specify bulk write operations that are split by command type
(e.g. sequence of insert, update, and delete commands). Multi-statement write
operations may also be split due to ``maxWriteBatchSize``,
``maxBsonObjectSize``, or ``maxMessageSizeBytes``.

For instance, an insertMany operation with five 10 MiB documents executed using
OP_MSG payload type 0 (i.e. entire command in one document) would be split into
five insert commands in order to respect the 16 MiB ``maxBsonObjectSize`` limit.
The same insertMany operation executed using OP_MSG payload type 1 (i.e. command
arguments pulled out into a separate payload vector) would be split into two
insert commands in order to respect the 48 MB ``maxMessageSizeBytes`` limit.

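The arithmetic in this example can be checked with a simple greedy packing
sketch. The function ``countBatches`` is hypothetical, it ignores command and
framing overhead, and it takes 48 MB to mean 48,000,000 bytes; real drivers
have to account for the bytes used by the command itself.

```javascript
const MIB = 1024 * 1024;

// Greedily packs documents (given by size in bytes) into batches whose total
// size stays within `limit`. Command and framing overhead is ignored, which
// is a simplification made for this sketch.
function countBatches(docSizes, limit) {
  let batches = 0;
  let current = 0;
  for (const size of docSizes) {
    if (size > limit) throw new Error("document exceeds limit");
    // Start a new batch for the first document or when the limit is hit.
    if (batches === 0 || current + size > limit) {
      batches += 1;
      current = 0;
    }
    current += size;
  }
  return batches;
}

const docs = [10 * MIB, 10 * MIB, 10 * MIB, 10 * MIB, 10 * MIB];

// Payload type 0: each command must fit in one 16 MiB document.
const type0Commands = countBatches(docs, 16 * MIB); // 5

// Payload type 1: documents are limited by the 48 MB message size.
const type1Commands = countBatches(docs, 48 * 1000 * 1000); // 2
```
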
Noting when a driver might split operations, the ``onPrimaryTransactionalWrite``
fail point's ``skip`` option may be used to control when the fail point first
triggers. Once triggered, the fail point will transition to the ``alwaysOn``
state until disabled. Driver authors should also note that the server attempts
to process all documents in a single insert command within a single commit (i.e.
one insert command with five documents may only trigger the fail point once).
This behavior is unique to insert commands (each statement in an update and
delete command is processed independently).

If testing an insert that is split into two commands, a ``skip`` of one will
allow the fail point to trigger on the second insert command (because all
documents in the first command will be processed in the same commit). When
testing an update or delete that is split into two commands, the ``skip`` should
be set to the number of statements in the first command to allow the fail point
to trigger on the second command.

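The rule for choosing ``skip`` can be written down as a short calculation. The
helper ``skipForCommand`` and the ``commands`` description it takes are
hypothetical. It assumes that every earlier insert command counts exactly once
toward ``skip`` (the text above only says the server attempts a single commit)
and that every update or delete statement counts individually.

```javascript
// Returns the `skip` value that lets the fail point first trigger on the
// command at index `targetIndex`. `commands` is a hypothetical description
// of the split: [{ type: "insert" | "update" | "delete", statements: n }].
function skipForCommand(commands, targetIndex) {
  let skip = 0;
  for (const cmd of commands.slice(0, targetIndex)) {
    // Assumption: an insert command is processed in one commit, while each
    // update or delete statement is processed independently.
    skip += cmd.type === "insert" ? 1 : cmd.statements;
  }
  return skip;
}

// An insert split into two commands: skip the single commit of the first.
const insertSkip = skipForCommand(
  [{ type: "insert", statements: 3 }, { type: "insert", statements: 2 }],
  1
); // 1

// An update split into two commands: skip every statement of the first.
const updateSkip = skipForCommand(
  [{ type: "update", statements: 3 }, { type: "update", statements: 2 }],
  1
); // 3
```
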
Command Construction Tests
==========================

Drivers should also assert that command documents are properly constructed with
or without a transaction ID, depending on whether the write operation is
supported. `Command Monitoring`_ may be used to check for the presence of a
``txnNumber`` field in the command document. Note that command documents may
always include an ``lsid`` field per the `Driver Session`_ specification.

.. _Command Monitoring: ../../command-monitoring/command-monitoring.rst
.. _Driver Session: ../../sessions/driver-sessions.rst

These tests may be run against both a replica set and a sharded cluster.

Drivers should test that transaction IDs are never included in commands for
unsupported write operations:

* Write commands with unacknowledged write concerns (e.g. ``{w: 0}``)

* Unsupported single-statement write operations

  - ``updateMany()``
  - ``deleteMany()``

* Unsupported multi-statement write operations

  - ``bulkWrite()`` that includes ``UpdateMany`` or ``DeleteMany``

* Unsupported write commands

  - ``aggregate`` with write stage (e.g. ``$out``, ``$merge``)

Drivers should test that transaction IDs are always included in commands for
supported write operations:

* Supported single-statement write operations

  - ``insertOne()``
  - ``updateOne()``
  - ``replaceOne()``
  - ``deleteOne()``
  - ``findOneAndDelete()``
  - ``findOneAndReplace()``
  - ``findOneAndUpdate()``

* Supported multi-statement write operations

  - ``insertMany()`` with ``ordered=true``
  - ``insertMany()`` with ``ordered=false``
  - ``bulkWrite()`` with ``ordered=true`` (no ``UpdateMany`` or ``DeleteMany``)
  - ``bulkWrite()`` with ``ordered=false`` (no ``UpdateMany`` or ``DeleteMany``)

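The two lists above can be condensed into a predicate that a test might use to
decide whether a monitored command is expected to carry ``txnNumber``. The
function ``expectsTxnNumber`` and the shape of its ``op`` argument are
assumptions made for this sketch, not an API defined by the spec.

```javascript
const SUPPORTED_SINGLE_STATEMENT = new Set([
  "insertOne", "updateOne", "replaceOne", "deleteOne",
  "findOneAndDelete", "findOneAndReplace", "findOneAndUpdate",
]);

// Returns true if commands for this operation are expected to include a
// transaction ID. `op` is a hypothetical shape:
//   { name, writeConcern?, requests? }
// where `requests` lists the model names of a bulkWrite, for example
// ["InsertOne", "UpdateMany"].
function expectsTxnNumber(op) {
  // Commands with unacknowledged write concerns never include one.
  if (op.writeConcern && op.writeConcern.w === 0) return false;
  // insertMany is supported for both ordered=true and ordered=false.
  if (SUPPORTED_SINGLE_STATEMENT.has(op.name) || op.name === "insertMany") {
    return true;
  }
  if (op.name === "bulkWrite") {
    // Supported only when no UpdateMany or DeleteMany model is present.
    return !(op.requests || []).some(
      (model) => model === "UpdateMany" || model === "DeleteMany"
    );
  }
  // updateMany, deleteMany, aggregate with a write stage, and anything else.
  return false;
}
```

A command monitoring listener could then compare this expectation against the
presence of ``txnNumber`` in each started command.
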
Prose Tests
===========

The following tests ensure that retryable writes work properly with replica sets
and sharded clusters.

#. Test that retryable writes raise an exception when using the MMAPv1 storage
   engine. For this test, execute a write operation, such as ``insertOne``,
   which should generate an exception. Assert that the error message is the
   replacement error message::

    This MongoDB deployment does not support retryable writes. Please add
    retryWrites=false to your connection string.

   and the error code is 20.

   **Note**: Drivers that rely on ``serverStatus`` to determine the storage engine
   in use MAY skip this test for sharded clusters, since ``mongos`` does not report
   this information in its ``serverStatus`` response.

Changelog
=========

:2019-10-21: Add ``errorLabelsContain`` and ``errorLabelsOmit`` fields to ``result``

:2019-08-07: Add Prose Tests section

:2019-06-07: Mention $merge stage for aggregate alongside $out

:2019-03-01: Add top-level ``runOn`` field to denote server version and/or
             topology requirements for the test file. Removes the
             ``minServerVersion`` and ``maxServerVersion`` top-level fields,
             which are now expressed within ``runOn`` elements.

             Add test-level ``useMultipleMongoses`` field.