=====================================
Server Discovery And Monitoring Tests
=====================================

.. contents::

----

The YAML and JSON files in this directory tree are platform-independent tests
that drivers can use to prove their conformance to the
Server Discovery And Monitoring Spec.

Additional prose tests that cannot be represented as spec tests are
described below and MUST be implemented.

Version
-------

Files in the "specifications" repository have no version scheme. They are not
tied to a MongoDB server version.

Format
------

Each YAML file has the following keys:

- description: A textual description of the test.
- uri: A connection string.
- phases: An array of "phase" objects.
  A phase of the test optionally sends inputs to the client,
  then tests the client's resulting TopologyDescription.

Each phase object has the following keys:

- description: (optional) A textual description of this phase.
- responses: (optional) An array of "response" objects. If not provided,
  the test runner should construct the client and perform assertions specified
  in the outcome object without processing any responses.
- applicationErrors: (optional) An array of "applicationError" objects.
- outcome: An "outcome" object representing the TopologyDescription.

A response is a pair of values:

- The source, for example "a:27017".
  This is the address the client sent the "hello" or legacy hello command to.
- A hello or legacy hello response, for example ``{ok: 1, helloOk: true, isWritablePrimary: true}``.
  If the response includes an electionId it is shown in extended JSON like
  ``{"$oid": "000000000000000000000002"}``.
  The empty response ``{}`` indicates a network error
  when attempting to call "hello" or legacy hello.

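The response pair described above can be split into its parts before it is handed to the driver. The following is a minimal sketch, assuming each response has been parsed into a two-element list; the function name ``classify_response`` and the ``kind`` labels are hypothetical and not part of the spec.

```python
from typing import Any, Dict, List, Tuple


def classify_response(response: List[Any]) -> Tuple[str, str, Dict[str, Any]]:
    """Split a [source, reply] pair into (address, kind, reply)."""
    address, reply = response
    # An empty reply document stands for a network error on that address.
    kind = "network_error" if reply == {} else "hello_reply"
    return address, kind, reply


# Hypothetical sample data shaped like the "responses" array of a phase.
responses = [
    ["a:27017", {"ok": 1, "helloOk": True, "isWritablePrimary": True}],
    ["b:27017", {}],
]
classified = [classify_response(r) for r in responses]
```

A test runner would then either feed the reply to the monitor for ``address`` or simulate a network error for it, depending on ``kind``.
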
An "applicationError" object has the following keys:

- address: The source address, for example "a:27017".
- generation: (optional) The error's generation number, for example ``1``.
  When absent this value defaults to the pool's current generation number.
- maxWireVersion: The ``maxWireVersion`` of the connection the error occurs
  on, for example ``9``. Added to support testing the behavior of "not writable primary"
  errors on <4.2 and >=4.2 servers.
- when: A string describing when this mock error should occur. Supported
  values are:

  - "beforeHandshakeCompletes": Simulate this mock error as if it occurred
    during a new connection's handshake for an application operation.
  - "afterHandshakeCompletes": Simulate this mock error as if it occurred
    on an established connection for an application operation (i.e. after
    the connection pool check out succeeds).

- type: The type of error to mock. Supported values are:

  - "command": A command error. Always accompanied by a "response".
  - "network": A non-timeout network error.
  - "timeout": A network timeout error.

- response: (optional) A command error response, for example
  ``{ok: 0, errmsg: "not primary"}``. Present if and only if ``type`` is
  "command". Note the server only returns "not primary" if the "hello" command
  has been run on this connection. Otherwise the legacy error message is returned.

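A test runner may want to check the shape of each "applicationError" before simulating it. The sketch below encodes the key rules listed above, assuming the object has been parsed into a dictionary; ``validate_application_error`` is a hypothetical helper name, and treating every key not marked optional as required is this sketch's reading of the list.

```python
from typing import Any, Dict, List

WHEN_VALUES = {"beforeHandshakeCompletes", "afterHandshakeCompletes"}
TYPE_VALUES = {"command", "network", "timeout"}


def validate_application_error(err: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in one parsed applicationError."""
    problems = []
    # Keys not marked "(optional)" in the list above are treated as required.
    for key in ("address", "maxWireVersion", "when", "type"):
        if key not in err:
            problems.append(f"missing required key: {key}")
    if "when" in err and err["when"] not in WHEN_VALUES:
        problems.append(f"unsupported when: {err['when']}")
    if "type" in err and err["type"] not in TYPE_VALUES:
        problems.append(f"unsupported type: {err['type']}")
    # "response" must be present if and only if the type is "command".
    if ("response" in err) != (err.get("type") == "command"):
        problems.append("response must be present iff type is 'command'")
    return problems
```

An empty list means the object is consistent with the rules above; "generation" is left unchecked because it is optional.
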
In non-monitoring tests, an "outcome" represents the correct
TopologyDescription that results from processing the responses in the phases
so far. It has the following keys:

- topologyType: A string like "ReplicaSetNoPrimary".
- setName: A string with the expected replica set name, or null.
- servers: An object whose keys are addresses like "a:27017", and whose values
  are "server" objects.
- logicalSessionTimeoutMinutes: null or an integer.
- maxSetVersion: absent or an integer.
- maxElectionId: absent or a BSON ObjectId.
- compatible: absent or a bool.

A "server" object represents a correct ServerDescription within the client's
current TopologyDescription. It has the following keys:

- type: A ServerType name, like "RSSecondary".
- setName: A string with the expected replica set name, or null.
- setVersion: absent or an integer.
- electionId: absent, null, or an ObjectId.
- logicalSessionTimeoutMinutes: absent, null, or an integer.
- minWireVersion: absent or an integer.
- maxWireVersion: absent or an integer.
- topologyVersion: absent, null, or a topologyVersion document.
- pool: (optional) A "pool" object.

A "pool" object represents a correct connection pool for a given server.
It has the following keys:

- generation: This server's expected pool generation, like ``0``.

In monitoring tests, an "outcome" contains a list of SDAM events that should
have been published by the client as a result of processing hello or legacy hello
responses in the current phase. Any SDAM events published by the client during its
construction (that is, prior to processing any of the responses) should be
combined with the events published during processing of hello or legacy hello
responses of the first phase of the test. A test MAY explicitly verify events
published during client construction by providing an empty responses array for the
first phase.


Use as unittests
----------------

Mocking
~~~~~~~

Drivers should be able to test their server discovery and monitoring logic without
any network I/O, by parsing hello (or legacy hello) responses and application errors
from the test file and passing them into the driver code. Parts of the client and
monitoring code may need to be mocked or subclassed to achieve this.
`A reference implementation for PyMongo 3.10.1 is available here
<https://github.com/mongodb/mongo-python-driver/blob/3.10.1/test/test_discovery_and_monitoring.py>`_.

Initialization
~~~~~~~~~~~~~~

For each file, create a fresh client object initialized with the file's "uri".

All files in the "single" directory include a connection string with one host
and no "replicaSet" option.
Set the client's initial TopologyType to Single, however that is achieved using the client's API.
(The spec says "The user MUST be able to set the initial TopologyType to Single"
without specifying how.)

All files in the "sharded" directory include a connection string with multiple hosts
and no "replicaSet" option.
Set the client's initial TopologyType to Unknown or Sharded, depending on the client's API.

All files in the "rs" directory include a connection string with a "replicaSet" option.
Set the client's initial TopologyType to ReplicaSetNoPrimary.
(For most clients, parsing a connection string with a "replicaSet" option
automatically sets the TopologyType to ReplicaSetNoPrimary.)

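The directory rules above can be expressed as a small lookup that also sanity-checks the connection string. This is a sketch only: ``initial_topology_type`` is a hypothetical helper, the URI parsing relies on the standard library rather than a driver's own parser, and returning "Unknown" for the "sharded" directory is one of the two choices the text allows.

```python
from urllib.parse import parse_qs, urlsplit


def initial_topology_type(directory: str, uri: str) -> str:
    """Pick the initial TopologyType for a test file in the given directory."""
    parts = urlsplit(uri)
    # Drop any "user:password@" prefix, then split the seed list on commas.
    hosts = parts.netloc.rsplit("@", 1)[-1].split(",")
    options = {key.lower() for key in parse_qs(parts.query)}
    has_replica_set = "replicaset" in options

    if directory == "rs":
        if not has_replica_set:
            raise ValueError("rs tests are expected to set replicaSet")
        return "ReplicaSetNoPrimary"
    if directory == "single":
        if len(hosts) != 1 or has_replica_set:
            raise ValueError("single tests expect one host and no replicaSet")
        return "Single"
    if directory == "sharded":
        # The text allows Unknown or Sharded; this sketch picks Unknown.
        return "Unknown"
    raise ValueError(f"unrecognized test directory: {directory}")
```
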
Set up a listener to collect SDAM events published by the client, including
events published during client construction.

Test Phases
~~~~~~~~~~~

For each phase in the file:

#. Parse the "responses" array. Pass in the responses in order to the driver
   code. If a response is the empty object ``{}``, simulate a network error.

#. Parse the "applicationErrors" array. For each element, simulate the given
   error as if it occurred while running an application operation. Note that
   it is sufficient to construct a mock error and call the procedure which
   updates the topology, e.g.
   ``topology.handleApplicationError(address, generation, maxWireVersion, error)``.

For non-monitoring tests,
once all responses are processed, assert that the phase's "outcome" object
is equivalent to the driver's current TopologyDescription.

For monitoring tests, once all responses are processed, assert that the
events collected so far by the SDAM event listener are equivalent to the
events specified in the phase.

Some fields such as "logicalSessionTimeoutMinutes", "compatible", and
"topologyVersion" were added later and haven't been added to all test files.
If these fields are present, test that they are equivalent to the fields of
the driver's current TopologyDescription or ServerDescription.

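One way to honor the "only if present" rule above is to iterate over the keys the test file actually provides. The sketch below assumes both the expected "server" object and the driver's ServerDescription have been converted to plain dictionaries; ``assert_server_matches`` and that dictionary representation are illustrative, not part of the spec.

```python
from typing import Any, Dict


def assert_server_matches(expected: Dict[str, Any], actual: Dict[str, Any]) -> None:
    """Compare only the fields that the test file specifies."""
    for field, want in expected.items():
        # Fields missing from the test file are never looked at; a field
        # given as null in the file is compared against None.
        got = actual.get(field)
        assert got == want, f"{field}: expected {want!r}, got {got!r}"


# Hypothetical data: the test file omits topologyVersion, so it is ignored.
expected_server = {"type": "RSSecondary", "setName": "rs"}
actual_server = {"type": "RSSecondary", "setName": "rs", "topologyVersion": None}
assert_server_matches(expected_server, actual_server)
```
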
For monitoring tests, clear the list of events collected so far.

Continue until all phases have been executed.

Integration Tests
-----------------

Integration tests are provided in the "integration" directory.

Test Format
~~~~~~~~~~~

The same as the `Transactions Spec Test format
</source/transactions/tests/README.rst#test-format>`_ with the following
additions:

- The ``runOn`` requirement gains a new field:

  - ``authEnabled`` (optional): If True, skip this test if auth is not enabled.
    If False, skip this test if auth is enabled. If this field is omitted,
    this test can be run on clusters with or without auth.

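The ``authEnabled`` rule above reduces to a three-way check. A minimal sketch, assuming the ``runOn`` requirement is a parsed dictionary and that the runner already knows whether the cluster has auth enabled; the function name is hypothetical.

```python
from typing import Any, Dict


def auth_requirement_met(run_on: Dict[str, Any], cluster_auth_enabled: bool) -> bool:
    """Return True if the test may run on this cluster, False to skip it."""
    required = run_on.get("authEnabled")
    if required is None:
        # Field omitted: the test runs with or without auth.
        return True
    return required == cluster_auth_enabled
```
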
Special Test Operations
~~~~~~~~~~~~~~~~~~~~~~~

Certain operations that appear in the "operations" array do not correspond to
API methods but instead represent special test operations. Such operations are
defined on the "testRunner" object and are documented in the
`Transactions Spec Test
</source/transactions/tests/README.rst#special-test-operations>`_.

Additional SDAM-specific test operations are documented here:

configureFailPoint
''''''''''''''''''

The "configureFailPoint" operation instructs the test runner to configure
the given server failpoint on the "admin" database. The runner MUST disable
this failpoint at the end of the test. For example::

      - name: configureFailPoint
        object: testRunner
        arguments:
          failPoint:
            configureFailPoint: failCommand
            mode: { times: 1 }
            data:
                failCommands: ["insert"]
                closeConnection: true

Tests that use the "configureFailPoint" operation do not include
``configureFailPoint`` commands in their command expectations. Drivers MUST
ensure that ``configureFailPoint`` commands do not appear in the list of logged
commands, either by manually filtering it from the list of observed commands or
by using a different MongoClient to execute ``configureFailPoint``.

Note that, similar to the ``tests.failPoint`` field described in the `Transactions
Spec Test format </source/transactions/tests/README.rst#test-format>`_, tests
with ``useMultipleMongoses: true`` will not contain a ``configureFailPoint``
operation.

wait
''''

The "wait" operation instructs the test runner to sleep for "ms"
milliseconds. For example::

      - name: wait
        object: testRunner
        arguments:
          ms: 1000

waitForEvent
''''''''''''

The "waitForEvent" operation instructs the test runner to wait until the test's
MongoClient has published a specific event a given number of times. For
example, the following instructs the test runner to wait for at least one
PoolClearedEvent to be published::

      - name: waitForEvent
        object: testRunner
        arguments:
          event: PoolClearedEvent
          count: 1

Note that "count" includes events that were published while running previous
operations.

If the "waitForEvent" operation is not satisfied after 10 seconds, the
operation is considered an error.

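A possible shape for "waitForEvent" is a recorder that keeps every event for the lifetime of the test and blocks on a condition variable. This is a sketch under assumptions: ``EventRecorder`` is a hypothetical class, events are reduced to their names, and the driver's listener is assumed to call ``publish`` from its monitoring threads.

```python
import threading
from typing import List


class EventRecorder:
    """Collects event names and lets the test runner wait on a count."""

    def __init__(self) -> None:
        self._events: List[str] = []
        self._cond = threading.Condition()

    def publish(self, name: str) -> None:
        with self._cond:
            self._events.append(name)
            self._cond.notify_all()

    def count(self, name: str) -> int:
        with self._cond:
            return self._events.count(name)

    def wait_for_event(self, name: str, count: int, timeout: float = 10.0) -> None:
        # Events from earlier operations stay in the list, so they count too.
        with self._cond:
            satisfied = self._cond.wait_for(
                lambda: self._events.count(name) >= count, timeout=timeout
            )
        if not satisfied:
            raise TimeoutError(f"saw fewer than {count} {name} within {timeout}s")
```

Because the recorder is never cleared between operations, "count" naturally includes events published while earlier operations ran.
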
ServerMarkedUnknownEvent
````````````````````````

The ServerMarkedUnknownEvent may appear as an event in `waitForEvent`_ and
`assertEventCount`_. This event is defined as ServerDescriptionChangedEvent
where newDescription.type is ``Unknown``.

assertEventCount
''''''''''''''''

The "assertEventCount" operation instructs the test runner to assert the test's
MongoClient has published a specific event a given number of times. For
example, the following instructs the test runner to assert that a single
PoolClearedEvent was published::

      - name: assertEventCount
        object: testRunner
        arguments:
          event: PoolClearedEvent
          count: 1

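Since ServerMarkedUnknownEvent is a derived event rather than one the client publishes directly, an event counter has to special-case it. A minimal sketch, assuming captured events are dictionaries with a "name" key and, for description changes, a "newDescription" dictionary; that representation and the function names are hypothetical.

```python
from typing import Any, Dict, List


def event_matches(event_name: str, event: Dict[str, Any]) -> bool:
    """Return True if a captured event counts as event_name."""
    if event_name == "ServerMarkedUnknownEvent":
        # Derived event: a description change whose new type is Unknown.
        return (
            event["name"] == "ServerDescriptionChangedEvent"
            and event["newDescription"]["type"] == "Unknown"
        )
    return event["name"] == event_name


def assert_event_count(events: List[Dict[str, Any]], event_name: str, count: int) -> None:
    actual = sum(1 for event in events if event_matches(event_name, event))
    assert actual == count, f"expected {count} {event_name}, saw {actual}"
```
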
recordPrimary
'''''''''''''

The "recordPrimary" operation instructs the test runner to record the current
primary of the test's MongoClient. For example::

      - name: recordPrimary
        object: testRunner

runAdminCommand
'''''''''''''''

The "runAdminCommand" operation instructs the test runner to run the given
command on the admin database. Drivers MUST run this command on a different
MongoClient from the one used for test operations. For example::

      - name: runAdminCommand
        object: testRunner
        command_name: replSetFreeze
        arguments:
          command:
            replSetFreeze: 0
          readPreference:
            mode: Secondary

waitForPrimaryChange
''''''''''''''''''''

The "waitForPrimaryChange" operation instructs the test runner to wait up to
"timeoutMS" milliseconds for the MongoClient to discover a new primary server.
The new primary should be different from the one recorded by "recordPrimary".
For example::

      - name: waitForPrimaryChange
        object: testRunner
        arguments:
          timeoutMS: 15000

To implement this, drivers can subscribe to ServerDescriptionChangedEvents and wait
for an event where newDescription.type is ``RSPrimary`` and the address is
different from the one previously recorded by "recordPrimary".

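The implementation hint above can be sketched as a scan over captured events plus a polling loop bounded by "timeoutMS". The event representation (dictionaries with a "newDescription" holding "type" and "address"), the ``get_events`` callback, and both function names are assumptions made for illustration; the callback is assumed to return only events published after "recordPrimary".

```python
import time
from typing import Any, Callable, Dict, List, Optional


def find_new_primary(events: List[Dict[str, Any]], recorded_primary: str) -> Optional[str]:
    """Return the address of the first RSPrimary that differs from the recorded one."""
    for event in events:
        description = event["newDescription"]
        if description["type"] == "RSPrimary" and description["address"] != recorded_primary:
            return description["address"]
    return None


def wait_for_primary_change(
    get_events: Callable[[], List[Dict[str, Any]]],
    recorded_primary: str,
    timeout_ms: int,
    poll_interval_s: float = 0.01,
) -> str:
    deadline = time.monotonic() + timeout_ms / 1000.0
    while True:
        address = find_new_primary(get_events(), recorded_primary)
        if address is not None:
            return address
        if time.monotonic() >= deadline:
            raise TimeoutError(f"no new primary within {timeout_ms}ms")
        time.sleep(poll_interval_s)
```
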
startThread
'''''''''''

The "startThread" operation instructs the test runner to start a new thread
with the provided "name". The `runOnThread`_ and `waitForThread`_ operations
reference a thread by its "name". For example::

      - name: startThread
        object: testRunner
        arguments:
          name: thread1

runOnThread
'''''''''''

The "runOnThread" operation instructs the test runner to schedule an operation
to be run on the given thread. runOnThread MUST NOT wait for the scheduled
operation to complete. For example::

      - name: runOnThread
        object: testRunner
        arguments:
          name: thread1
          operation:
            name: insertOne
            object: collection
            arguments:
              document:
                _id: 2
            error: true

waitForThread
'''''''''''''

The "waitForThread" operation instructs the test runner to stop the given
thread, wait for it to complete, and assert that the thread exited without
any errors. For example::

      - name: waitForThread
        object: testRunner
        arguments:
          name: thread1

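The three thread operations above map naturally onto a worker thread fed by a queue. The following is one possible sketch, not a prescribed design: ``RunnerThread`` and its method names are hypothetical, and each scheduled task is assumed to be a zero-argument callable that already checks the operation's own expectations (such as ``error: true``) and raises if they are not met.

```python
import queue
import threading
from typing import Callable, List, Optional


class RunnerThread:
    """A named worker that runs scheduled tasks in order."""

    def __init__(self, name: str) -> None:  # startThread
        self._tasks: "queue.Queue[Optional[Callable[[], None]]]" = queue.Queue()
        self._errors: List[BaseException] = []
        self._thread = threading.Thread(target=self._run, name=name, daemon=True)
        self._thread.start()

    def _run(self) -> None:
        while True:
            task = self._tasks.get()
            if task is None:  # sentinel queued by wait_for_thread
                return
            try:
                task()
            except Exception as exc:
                self._errors.append(exc)

    def run_on_thread(self, task: Callable[[], None]) -> None:  # runOnThread
        # Only enqueue; never block on the task finishing.
        self._tasks.put(task)

    def wait_for_thread(self) -> None:  # waitForThread
        self._tasks.put(None)
        self._thread.join()
        assert not self._errors, f"thread exited with errors: {self._errors!r}"
```

A test runner would keep these objects in a dictionary keyed by the "name" argument so that later operations can look them up.
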
Prose Tests
-----------

The following prose tests cannot be represented as spec tests and MUST be
implemented.

Streaming protocol Tests
~~~~~~~~~~~~~~~~~~~~~~~~

Drivers that implement the streaming protocol (multi-threaded or
asynchronous drivers) must implement the following tests. Each test should be
run against a standalone, replica set, and sharded cluster unless otherwise
noted.

Some of these cases should already be tested with the old protocol; in
that case just verify the test cases succeed with the new protocol.

1.  Configure the client with heartbeatFrequencyMS set to 500,
    overriding the default of 10000. Assert the client processes
    hello and legacy hello replies more frequently (approximately every 500ms).

RTT Tests
~~~~~~~~~

Run the following test(s) on MongoDB 4.4+.

1.  Test that RTT is continuously updated.

    #. Create a client with ``heartbeatFrequencyMS=500``,
       ``appName=streamingRttTest``, and subscribe to server events.

    #. Run a find command to wait for the server to be discovered.

    #. Sleep for 2 seconds. This must be long enough for multiple heartbeats
       to succeed.

    #. Assert that each ``ServerDescriptionChangedEvent`` includes a non-zero
       RTT.

    #. Configure the following failpoint to block hello or legacy hello commands
       for 500ms, which should add extra latency to each RTT check::

         db.adminCommand({
             configureFailPoint: "failCommand",
             mode: {times: 1000},
             data: {
               failCommands: ["hello"], // or the legacy hello command
               blockConnection: true,
               blockTimeMS: 500,
               appName: "streamingRttTest",
             },
         });

    #. Wait for the server's RTT to exceed 250ms. Eventually the average RTT
       should also exceed 500ms but we use 250ms to speed up the test. Note
       that the `Server Description Equality`_ rule means that
       ServerDescriptionChangedEvents will not be published. This test may
       need to use a driver-specific helper to obtain the latest RTT instead.
       If the RTT does not exceed 250ms after 10 seconds, consider the test
       failed.

    #. Disable the failpoint::

         db.adminCommand({
             configureFailPoint: "failCommand",
             mode: "off",
         });

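To see why the 250ms threshold is reached well before the average approaches the 500ms block time, the RTT update can be simulated. This sketch assumes the driver tracks RTT as an exponentially weighted moving average with a weighting factor of 0.2; that formula is not stated in this README and is taken here as an assumption, as are the function names and the 1ms starting value.

```python
ALPHA = 0.2  # assumed weighting factor for the moving average


def update_average_rtt(previous_ms: float, sample_ms: float, alpha: float = ALPHA) -> float:
    """Fold one new RTT sample into the running average."""
    return alpha * sample_ms + (1 - alpha) * previous_ms


def samples_until_exceeds(start_ms: float, sample_ms: float, threshold_ms: float) -> int:
    """Count how many identical samples push the average past the threshold."""
    if sample_ms <= threshold_ms:
        raise ValueError("the average can never exceed a threshold above the samples")
    average, samples = start_ms, 0
    while average <= threshold_ms:
        average = update_average_rtt(average, sample_ms)
        samples += 1
    return samples


# Starting near 1ms, with every check delayed to roughly 500ms:
# the averages run 100.8, 180.64, 244.512, 295.6096.
needed = samples_until_exceeds(start_ms=1.0, sample_ms=500.0, threshold_ms=250.0)
```

Under these assumptions four delayed checks are enough to cross 250ms, which is consistent with the 10 second limit given in the step above.
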
.. Section for links.

.. _Server Description Equality: /source/server-discovery-and-monitoring/server-discovery-and-monitoring.rst#server-description-equality