Incoming messages to the host without connections #5539

Open
nenych opened this issue Jun 14, 2024 · 11 comments
Labels
defect Suspected defect such as a bug or regression

Comments

@nenych

nenych commented Jun 14, 2024

Observed behavior

We have a NATS cluster with 3 nodes, each in a different AZ, and clients connect to the node in their own AZ.
One of the nodes has no client connections, and at one point we found that this node, despite having no connections, was receiving a lot of incoming "orphan" messages from the other nodes:

[Screenshot: CleanShot 2024-06-14 at 17 13 40@2x]

We tried setting "pool_size = -1". The cluster picked up the new configuration, but it did not decrease the incoming traffic.

Routes before the pool_size parameter change
{
"server": {
  "name": "nats-cluster-v4-0",
  "host": "0.0.0.0",
  "id": "NBYFQSZJA3Q42WRHRMDH6SCBQ3YJKBZZXH62P4BDZW73PCMFXZCQVKKS",
  "cluster": "nats-cluster-v4",
  "ver": "2.10.14",
  "jetstream": false,
  "flags": 0,
  "seq": 27816,
  "time": "2024-06-14T09:00:15.936315993Z"
},
"data": {
  "server_id": "NBYFQSZJA3Q42WRHRMDH6SCBQ3YJKBZZXH62P4BDZW73PCMFXZCQVKKS",
  "server_name": "nats-cluster-v4-0",
  "now": "2024-06-14T09:00:15.936260353Z",
  "num_routes": 8,
  "routes": [
    {
      "rid": 6,
      "remote_id": "NCS7TNLPVKAI7YA777FKARHZAEVIUOQ5CKFYXSKP6RDO2ZGE2TSFGE25",
      "remote_name": "nats-cluster-v4-1",
      "did_solicit": true,
      "is_configured": true,
      "ip": "10.202.142.65",
      "port": 6222,
      "start": "2024-06-04T17:45:20.530656125Z",
      "last_activity": "2024-06-04T17:45:20.531406265Z",
      "rtt": "218µs",
      "uptime": "9d15h14m55s",
      "idle": "9d15h14m55s",
      "pending_size": 0,
      "in_msgs": 0,
      "out_msgs": 0,
      "in_bytes": 0,
      "out_bytes": 0,
      "subscriptions": 0,
      "compression": "off"
    },
    {
      "rid": 9,
      "remote_id": "NCS7TNLPVKAI7YA777FKARHZAEVIUOQ5CKFYXSKP6RDO2ZGE2TSFGE25",
      "remote_name": "nats-cluster-v4-1",
      "did_solicit": true,
      "is_configured": true,
      "ip": "10.202.142.65",
      "port": 6222,
      "start": "2024-06-04T17:45:20.578446619Z",
      "last_activity": "2024-06-04T17:45:20.578905629Z",
      "rtt": "207µs",
      "uptime": "9d15h14m55s",
      "idle": "9d15h14m55s",
      "pending_size": 0,
      "in_msgs": 0,
      "out_msgs": 0,
      "in_bytes": 0,
      "out_bytes": 0,
      "subscriptions": 0,
      "compression": "off"
    },
    {
      "rid": 12,
      "remote_id": "NCS7TNLPVKAI7YA777FKARHZAEVIUOQ5CKFYXSKP6RDO2ZGE2TSFGE25",
      "remote_name": "nats-cluster-v4-1",
      "did_solicit": true,
      "is_configured": true,
      "ip": "10.202.142.65",
      "port": 6222,
      "start": "2024-06-04T17:45:20.660375Z",
      "last_activity": "2024-06-14T09:00:15.936291023Z",
      "rtt": "219µs",
      "uptime": "9d15h14m55s",
      "idle": "0s",
      "pending_size": 0,
      "in_msgs": 1045046185,
      "out_msgs": 65460384,
      "in_bytes": 1868687327845,
      "out_bytes": 51359890769,
      "subscriptions": 6252,
      "compression": "off"
    },
    {
      "rid": 8,
      "remote_id": "NC34AP6LAJSBGRNQOZHE5OYXNEAEDITZXBGYQP4USZ3GHMBJP2DGCKO2",
      "remote_name": "nats-cluster-v4-2",
      "did_solicit": true,
      "is_configured": true,
      "ip": "10.202.172.33",
      "port": 6222,
      "start": "2024-06-04T17:45:20.530737765Z",
      "last_activity": "2024-06-04T17:45:20.531492214Z",
      "rtt": "292µs",
      "uptime": "9d15h14m55s",
      "idle": "9d15h14m55s",
      "pending_size": 0,
      "in_msgs": 0,
      "out_msgs": 0,
      "in_bytes": 0,
      "out_bytes": 0,
      "subscriptions": 0,
      "compression": "off"
    },
    {
      "rid": 10,
      "remote_id": "NC34AP6LAJSBGRNQOZHE5OYXNEAEDITZXBGYQP4USZ3GHMBJP2DGCKO2",
      "remote_name": "nats-cluster-v4-2",
      "did_solicit": true,
      "is_configured": true,
      "ip": "10.202.172.33",
      "port": 6222,
      "start": "2024-06-04T17:45:20.604039586Z",
      "last_activity": "2024-06-04T17:45:20.604548357Z",
      "rtt": "308µs",
      "uptime": "9d15h14m55s",
      "idle": "9d15h14m55s",
      "pending_size": 0,
      "in_msgs": 0,
      "out_msgs": 0,
      "in_bytes": 0,
      "out_bytes": 0,
      "subscriptions": 0,
      "compression": "off"
    },
    {
      "rid": 11,
      "remote_id": "NC34AP6LAJSBGRNQOZHE5OYXNEAEDITZXBGYQP4USZ3GHMBJP2DGCKO2",
      "remote_name": "nats-cluster-v4-2",
      "did_solicit": true,
      "is_configured": true,
      "ip": "10.202.172.33",
      "port": 6222,
      "start": "2024-06-04T17:45:20.633688383Z",
      "last_activity": "2024-06-14T09:00:15.931065823Z",
      "rtt": "243µs",
      "uptime": "9d15h14m55s",
      "idle": "0s",
      "pending_size": 0,
      "in_msgs": 33948303,
      "out_msgs": 14153420,
      "in_bytes": 11515003051,
      "out_bytes": 10301579698,
      "subscriptions": 13,
      "compression": "off"
    },
    {
      "rid": 5,
      "remote_id": "NCS7TNLPVKAI7YA777FKARHZAEVIUOQ5CKFYXSKP6RDO2ZGE2TSFGE25",
      "remote_name": "nats-cluster-v4-1",
      "did_solicit": true,
      "is_configured": true,
      "ip": "10.202.142.65",
      "port": 6222,
      "start": "2024-06-04T17:45:20.530645194Z",
      "last_activity": "2024-06-14T09:00:15.936240633Z",
      "rtt": "248µs",
      "uptime": "9d15h14m55s",
      "idle": "0s",
      "pending_size": 0,
      "in_msgs": 27798,
      "out_msgs": 27786,
      "in_bytes": 42873121,
      "out_bytes": 42789916,
      "subscriptions": 48,
      "account": "$SYS",
      "compression": "off"
    },
    {
      "rid": 7,
      "remote_id": "NC34AP6LAJSBGRNQOZHE5OYXNEAEDITZXBGYQP4USZ3GHMBJP2DGCKO2",
      "remote_name": "nats-cluster-v4-2",
      "did_solicit": true,
      "is_configured": true,
      "ip": "10.202.172.33",
      "port": 6222,
      "start": "2024-06-04T17:45:20.530697954Z",
      "last_activity": "2024-06-14T09:00:15.936240633Z",
      "rtt": "311µs",
      "uptime": "9d15h14m55s",
      "idle": "0s",
      "pending_size": 0,
      "in_msgs": 27765,
      "out_msgs": 27775,
      "in_bytes": 42166135,
      "out_bytes": 42464040,
      "subscriptions": 48,
      "account": "$SYS",
      "compression": "off"
    }
  ]
}
}
{
"server": {
  "name": "nats-cluster-v4-1",
  "host": "0.0.0.0",
  "id": "NCS7TNLPVKAI7YA777FKARHZAEVIUOQ5CKFYXSKP6RDO2ZGE2TSFGE25",
  "cluster": "nats-cluster-v4",
  "ver": "2.10.14",
  "jetstream": false,
  "flags": 0,
  "seq": 27874,
  "time": "2024-06-14T09:00:15.936405351Z"
},
"data": {
  "server_id": "NCS7TNLPVKAI7YA777FKARHZAEVIUOQ5CKFYXSKP6RDO2ZGE2TSFGE25",
  "server_name": "nats-cluster-v4-1",
  "now": "2024-06-14T09:00:15.936370711Z",
  "num_routes": 8,
  "routes": [
    {
      "rid": 6,
      "remote_id": "NC34AP6LAJSBGRNQOZHE5OYXNEAEDITZXBGYQP4USZ3GHMBJP2DGCKO2",
      "remote_name": "nats-cluster-v4-2",
      "did_solicit": true,
      "is_configured": true,
      "ip": "10.202.172.33",
      "port": 6222,
      "start": "2024-06-04T17:45:08.358281524Z",
      "last_activity": "2024-06-04T17:45:08.358675434Z",
      "rtt": "380µs",
      "uptime": "9d15h15m7s",
      "idle": "9d15h15m7s",
      "pending_size": 0,
      "in_msgs": 0,
      "out_msgs": 0,
      "in_bytes": 0,
      "out_bytes": 0,
      "subscriptions": 0,
      "compression": "off"
    },
    {
      "rid": 11,
      "remote_id": "NC34AP6LAJSBGRNQOZHE5OYXNEAEDITZXBGYQP4USZ3GHMBJP2DGCKO2",
      "remote_name": "nats-cluster-v4-2",
      "did_solicit": true,
      "is_configured": true,
      "ip": "10.202.172.33",
      "port": 6222,
      "start": "2024-06-04T17:45:08.373206593Z",
      "last_activity": "2024-06-04T17:45:08.373595833Z",
      "rtt": "295µs",
      "uptime": "9d15h15m7s",
      "idle": "9d15h15m7s",
      "pending_size": 0,
      "in_msgs": 0,
      "out_msgs": 0,
      "in_bytes": 0,
      "out_bytes": 0,
      "subscriptions": 0,
      "compression": "off"
    },
    {
      "rid": 13,
      "remote_id": "NC34AP6LAJSBGRNQOZHE5OYXNEAEDITZXBGYQP4USZ3GHMBJP2DGCKO2",
      "remote_name": "nats-cluster-v4-2",
      "did_solicit": true,
      "is_configured": true,
      "ip": "10.202.172.33",
      "port": 6222,
      "start": "2024-06-04T17:45:08.401488062Z",
      "last_activity": "2024-06-14T09:00:15.936183591Z",
      "rtt": "234µs",
      "uptime": "9d15h15m7s",
      "idle": "0s",
      "pending_size": 0,
      "in_msgs": 79617062,
      "out_msgs": 1027217544,
      "in_bytes": 68925666034,
      "out_bytes": 1863523035076,
      "subscriptions": 13,
      "compression": "off"
    },
    {
      "rid": 22,
      "remote_id": "NBYFQSZJA3Q42WRHRMDH6SCBQ3YJKBZZXH62P4BDZW73PCMFXZCQVKKS",
      "remote_name": "nats-cluster-v4-0",
      "did_solicit": true,
      "is_configured": true,
      "ip": "10.202.173.11",
      "port": 52636,
      "start": "2024-06-04T17:45:20.530854683Z",
      "last_activity": "2024-06-04T17:45:20.531027283Z",
      "rtt": "223µs",
      "uptime": "9d15h14m55s",
      "idle": "9d15h14m55s",
      "pending_size": 0,
      "in_msgs": 0,
      "out_msgs": 0,
      "in_bytes": 0,
      "out_bytes": 0,
      "subscriptions": 0,
      "compression": "off"
    },
    {
      "rid": 23,
      "remote_id": "NBYFQSZJA3Q42WRHRMDH6SCBQ3YJKBZZXH62P4BDZW73PCMFXZCQVKKS",
      "remote_name": "nats-cluster-v4-0",
      "did_solicit": true,
      "is_configured": true,
      "ip": "10.202.173.11",
      "port": 52648,
      "start": "2024-06-04T17:45:20.578585482Z",
      "last_activity": "2024-06-04T17:45:20.578741222Z",
      "rtt": "270µs",
      "uptime": "9d15h14m55s",
      "idle": "9d15h14m55s",
      "pending_size": 0,
      "in_msgs": 0,
      "out_msgs": 0,
      "in_bytes": 0,
      "out_bytes": 0,
      "subscriptions": 0,
      "compression": "off"
    },
    {
      "rid": 24,
      "remote_id": "NBYFQSZJA3Q42WRHRMDH6SCBQ3YJKBZZXH62P4BDZW73PCMFXZCQVKKS",
      "remote_name": "nats-cluster-v4-0",
      "did_solicit": true,
      "is_configured": true,
      "ip": "10.202.173.11",
      "port": 52652,
      "start": "2024-06-04T17:45:20.660517999Z",
      "last_activity": "2024-06-14T09:00:15.936183591Z",
      "rtt": "219µs",
      "uptime": "9d15h14m55s",
      "idle": "0s",
      "pending_size": 0,
      "in_msgs": 65460384,
      "out_msgs": 1045046186,
      "in_bytes": 51359890769,
      "out_bytes": 1868687328185,
      "subscriptions": 1286,
      "compression": "off"
    },
    {
      "rid": 5,
      "remote_id": "NC34AP6LAJSBGRNQOZHE5OYXNEAEDITZXBGYQP4USZ3GHMBJP2DGCKO2",
      "remote_name": "nats-cluster-v4-2",
      "did_solicit": true,
      "is_configured": true,
      "ip": "10.202.172.33",
      "port": 6222,
      "start": "2024-06-04T17:45:08.356582374Z",
      "last_activity": "2024-06-14T09:00:12.199988588Z",
      "rtt": "246µs",
      "uptime": "9d15h15m7s",
      "idle": "3s",
      "pending_size": 0,
      "in_msgs": 27777,
      "out_msgs": 27804,
      "in_bytes": 42177782,
      "out_bytes": 42809896,
      "subscriptions": 48,
      "account": "$SYS",
      "compression": "off"
    },
    {
      "rid": 21,
      "remote_id": "NBYFQSZJA3Q42WRHRMDH6SCBQ3YJKBZZXH62P4BDZW73PCMFXZCQVKKS",
      "remote_name": "nats-cluster-v4-0",
      "did_solicit": true,
      "is_configured": true,
      "ip": "10.202.173.11",
      "port": 52630,
      "start": "2024-06-04T17:45:20.530829183Z",
      "last_activity": "2024-06-14T09:00:15.936356571Z",
      "rtt": "337µs",
      "uptime": "9d15h14m55s",
      "idle": "0s",
      "pending_size": 0,
      "in_msgs": 27786,
      "out_msgs": 27798,
      "in_bytes": 42789916,
      "out_bytes": 42873121,
      "subscriptions": 49,
      "account": "$SYS",
      "compression": "off"
    }
  ]
}
}
{
"server": {
  "name": "nats-cluster-v4-2",
  "host": "0.0.0.0",
  "id": "NC34AP6LAJSBGRNQOZHE5OYXNEAEDITZXBGYQP4USZ3GHMBJP2DGCKO2",
  "cluster": "nats-cluster-v4",
  "ver": "2.10.14",
  "jetstream": false,
  "flags": 0,
  "seq": 27816,
  "time": "2024-06-14T09:00:15.936489307Z"
},
"data": {
  "server_id": "NC34AP6LAJSBGRNQOZHE5OYXNEAEDITZXBGYQP4USZ3GHMBJP2DGCKO2",
  "server_name": "nats-cluster-v4-2",
  "now": "2024-06-14T09:00:15.936449387Z",
  "num_routes": 8,
  "routes": [
    {
      "rid": 26,
      "remote_id": "NCS7TNLPVKAI7YA777FKARHZAEVIUOQ5CKFYXSKP6RDO2ZGE2TSFGE25",
      "remote_name": "nats-cluster-v4-1",
      "did_solicit": true,
      "is_configured": true,
      "ip": "10.202.142.65",
      "port": 59576,
      "start": "2024-06-04T17:45:08.358430289Z",
      "last_activity": "2024-06-04T17:45:08.358559849Z",
      "rtt": "255µs",
      "uptime": "9d15h15m7s",
      "idle": "9d15h15m7s",
      "pending_size": 0,
      "in_msgs": 0,
      "out_msgs": 0,
      "in_bytes": 0,
      "out_bytes": 0,
      "subscriptions": 0,
      "compression": "off"
    },
    {
      "rid": 27,
      "remote_id": "NCS7TNLPVKAI7YA777FKARHZAEVIUOQ5CKFYXSKP6RDO2ZGE2TSFGE25",
      "remote_name": "nats-cluster-v4-1",
      "did_solicit": true,
      "is_configured": true,
      "ip": "10.202.142.65",
      "port": 59578,
      "start": "2024-06-04T17:45:08.373324948Z",
      "last_activity": "2024-06-04T17:45:08.373466167Z",
      "rtt": "249µs",
      "uptime": "9d15h15m7s",
      "idle": "9d15h15m7s",
      "pending_size": 0,
      "in_msgs": 0,
      "out_msgs": 0,
      "in_bytes": 0,
      "out_bytes": 0,
      "subscriptions": 0,
      "compression": "off"
    },
    {
      "rid": 28,
      "remote_id": "NCS7TNLPVKAI7YA777FKARHZAEVIUOQ5CKFYXSKP6RDO2ZGE2TSFGE25",
      "remote_name": "nats-cluster-v4-1",
      "did_solicit": true,
      "is_configured": true,
      "ip": "10.202.142.65",
      "port": 59588,
      "start": "2024-06-04T17:45:08.401617335Z",
      "last_activity": "2024-06-14T09:00:15.936316987Z",
      "rtt": "290µs",
      "uptime": "9d15h15m7s",
      "idle": "0s",
      "pending_size": 0,
      "in_msgs": 1027217544,
      "out_msgs": 79617062,
      "in_bytes": 1863523035076,
      "out_bytes": 68925666034,
      "subscriptions": 6252,
      "compression": "off"
    },
    {
      "rid": 29,
      "remote_id": "NBYFQSZJA3Q42WRHRMDH6SCBQ3YJKBZZXH62P4BDZW73PCMFXZCQVKKS",
      "remote_name": "nats-cluster-v4-0",
      "did_solicit": true,
      "is_configured": true,
      "ip": "10.202.173.11",
      "port": 44962,
      "start": "2024-06-04T17:45:20.530866763Z",
      "last_activity": "2024-06-04T17:45:20.531103313Z",
      "rtt": "215µs",
      "uptime": "9d15h14m55s",
      "idle": "9d15h14m55s",
      "pending_size": 0,
      "in_msgs": 0,
      "out_msgs": 0,
      "in_bytes": 0,
      "out_bytes": 0,
      "subscriptions": 0,
      "compression": "off"
    },
    {
      "rid": 31,
      "remote_id": "NBYFQSZJA3Q42WRHRMDH6SCBQ3YJKBZZXH62P4BDZW73PCMFXZCQVKKS",
      "remote_name": "nats-cluster-v4-0",
      "did_solicit": true,
      "is_configured": true,
      "ip": "10.202.173.11",
      "port": 44970,
      "start": "2024-06-04T17:45:20.604194468Z",
      "last_activity": "2024-06-04T17:45:20.604339128Z",
      "rtt": "277µs",
      "uptime": "9d15h14m55s",
      "idle": "9d15h14m55s",
      "pending_size": 0,
      "in_msgs": 0,
      "out_msgs": 0,
      "in_bytes": 0,
      "out_bytes": 0,
      "subscriptions": 0,
      "compression": "off"
    },
    {
      "rid": 32,
      "remote_id": "NBYFQSZJA3Q42WRHRMDH6SCBQ3YJKBZZXH62P4BDZW73PCMFXZCQVKKS",
      "remote_name": "nats-cluster-v4-0",
      "did_solicit": true,
      "is_configured": true,
      "ip": "10.202.173.11",
      "port": 44984,
      "start": "2024-06-04T17:45:20.633770436Z",
      "last_activity": "2024-06-14T09:00:15.930918998Z",
      "rtt": "238µs",
      "uptime": "9d15h14m55s",
      "idle": "0s",
      "pending_size": 0,
      "in_msgs": 14153420,
      "out_msgs": 33948303,
      "in_bytes": 10301579698,
      "out_bytes": 11515003051,
      "subscriptions": 1286,
      "compression": "off"
    },
    {
      "rid": 25,
      "remote_id": "NCS7TNLPVKAI7YA777FKARHZAEVIUOQ5CKFYXSKP6RDO2ZGE2TSFGE25",
      "remote_name": "nats-cluster-v4-1",
      "did_solicit": true,
      "is_configured": true,
      "ip": "10.202.142.65",
      "port": 59560,
      "start": "2024-06-04T17:45:08.356758569Z",
      "last_activity": "2024-06-14T09:00:12.199711305Z",
      "rtt": "323µs",
      "uptime": "9d15h15m7s",
      "idle": "3s",
      "pending_size": 0,
      "in_msgs": 27804,
      "out_msgs": 27777,
      "in_bytes": 42809896,
      "out_bytes": 42177782,
      "subscriptions": 48,
      "account": "$SYS",
      "compression": "off"
    },
    {
      "rid": 30,
      "remote_id": "NBYFQSZJA3Q42WRHRMDH6SCBQ3YJKBZZXH62P4BDZW73PCMFXZCQVKKS",
      "remote_name": "nats-cluster-v4-0",
      "did_solicit": true,
      "is_configured": true,
      "ip": "10.202.173.11",
      "port": 44952,
      "start": "2024-06-04T17:45:20.530884863Z",
      "last_activity": "2024-06-14T09:00:15.936429307Z",
      "rtt": "341µs",
      "uptime": "9d15h14m55s",
      "idle": "0s",
      "pending_size": 0,
      "in_msgs": 27775,
      "out_msgs": 27765,
      "in_bytes": 42464040,
      "out_bytes": 42166135,
      "subscriptions": 49,
      "account": "$SYS",
      "compression": "off"
    }
  ]
}
}
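Each node above reports 8 routes, which is consistent with three pooled routes per peer plus a dedicated route for the $SYS account. A small script can make the pattern easier to see: per peer, how many routes carried no traffic at all and how much came in on the rest. This is only a sketch; `summarize_routes` and `ROUTES` are hypothetical names, and the sample is a trimmed copy of nats-cluster-v4-0's routes to nats-cluster-v4-1 as posted.

```python
from collections import defaultdict

# Trimmed sample: nats-cluster-v4-0's four routes to nats-cluster-v4-1 from the
# dump above. Only the fields used below are kept; values are copied as posted.
ROUTES = [
    {"rid": 6, "remote_name": "nats-cluster-v4-1",
     "in_msgs": 0, "out_msgs": 0, "in_bytes": 0},
    {"rid": 9, "remote_name": "nats-cluster-v4-1",
     "in_msgs": 0, "out_msgs": 0, "in_bytes": 0},
    {"rid": 12, "remote_name": "nats-cluster-v4-1",
     "in_msgs": 1045046185, "out_msgs": 65460384, "in_bytes": 1868687327845},
    {"rid": 5, "remote_name": "nats-cluster-v4-1", "account": "$SYS",
     "in_msgs": 27798, "out_msgs": 27786, "in_bytes": 42873121},
]


def summarize_routes(routes):
    """Group routes by remote server and total their inbound counters."""
    summary = defaultdict(
        lambda: {"routes": 0, "unused": 0, "in_msgs": 0, "in_bytes": 0}
    )
    for route in routes:
        entry = summary[route["remote_name"]]
        entry["routes"] += 1
        entry["in_msgs"] += route["in_msgs"]
        entry["in_bytes"] += route["in_bytes"]
        # A route that never carried a message in either direction.
        if route["in_msgs"] == 0 and route["out_msgs"] == 0:
            entry["unused"] += 1
    return dict(summary)


summary = summarize_routes(ROUTES)
for remote, entry in summary.items():
    print(remote, entry)
```

To run it against a full dump, load the JSON and pass `data["routes"]` instead of the trimmed sample.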
Routes after the change
{
"server": {
  "name": "nats-cluster-v4-1",
  "host": "0.0.0.0",
  "id": "NCS7TNLPVKAI7YA777FKARHZAEVIUOQ5CKFYXSKP6RDO2ZGE2TSFGE25",
  "cluster": "nats-cluster-v4",
  "ver": "2.10.14",
  "jetstream": false,
  "flags": 0,
  "seq": 30506,
  "time": "2024-06-14T15:48:01.505977662Z"
},
"data": {
  "server_id": "NCS7TNLPVKAI7YA777FKARHZAEVIUOQ5CKFYXSKP6RDO2ZGE2TSFGE25",
  "server_name": "nats-cluster-v4-1",
  "now": "2024-06-14T15:48:01.505934692Z",
  "num_routes": 2,
  "routes": [
    {
      "rid": 6264,
      "remote_id": "NBRW5CG3HCOJSGMMXDGDGHPHTFUDLEYFUP3PXFWLNVFZ4PXCVLCQXNCI",
      "remote_name": "nats-cluster-v4-2",
      "did_solicit": true,
      "is_configured": true,
      "ip": "10.202.172.149",
      "port": 52110,
      "start": "2024-06-14T12:46:40.646080193Z",
      "last_activity": "2024-06-14T15:48:01.505911722Z",
      "rtt": "332µs",
      "uptime": "3h1m20s",
      "idle": "0s",
      "pending_size": 0,
      "in_msgs": 53161,
      "out_msgs": 882,
      "in_bytes": 75416839,
      "out_bytes": 335060,
      "subscriptions": 56,
      "compression": "off"
    },
    {
      "rid": 6122,
      "remote_id": "NBYFQSZJA3Q42WRHRMDH6SCBQ3YJKBZZXH62P4BDZW73PCMFXZCQVKKS",
      "remote_name": "nats-cluster-v4-0",
      "did_solicit": true,
      "is_configured": true,
      "ip": "10.202.173.11",
      "port": 52472,
      "start": "2024-06-14T12:34:42.848411331Z",
      "last_activity": "2024-06-14T15:48:01.505957832Z",
      "rtt": "291µs",
      "uptime": "3h13m18s",
      "idle": "0s",
      "pending_size": 0,
      "in_msgs": 682786,
      "out_msgs": 87690088,
      "in_bytes": 690431012,
      "out_bytes": 180559446164,
      "subscriptions": 1502,
      "compression": "off"
    }
  ]
}
}
{
"server": {
  "name": "nats-cluster-v4-0",
  "host": "0.0.0.0",
  "id": "NBYFQSZJA3Q42WRHRMDH6SCBQ3YJKBZZXH62P4BDZW73PCMFXZCQVKKS",
  "cluster": "nats-cluster-v4",
  "ver": "2.10.14",
  "jetstream": false,
  "flags": 0,
  "seq": 28927,
  "time": "2024-06-14T15:48:01.506207613Z"
},
"data": {
  "server_id": "NBYFQSZJA3Q42WRHRMDH6SCBQ3YJKBZZXH62P4BDZW73PCMFXZCQVKKS",
  "server_name": "nats-cluster-v4-0",
  "now": "2024-06-14T15:48:01.506158393Z",
  "num_routes": 2,
  "routes": [
    {
      "rid": 6056,
      "remote_id": "NBRW5CG3HCOJSGMMXDGDGHPHTFUDLEYFUP3PXFWLNVFZ4PXCVLCQXNCI",
      "remote_name": "nats-cluster-v4-2",
      "did_solicit": true,
      "is_configured": true,
      "ip": "10.202.172.149",
      "port": 58960,
      "start": "2024-06-14T12:46:40.646873006Z",
      "last_activity": "2024-06-14T15:48:01.485820224Z",
      "rtt": "402µs",
      "uptime": "3h1m20s",
      "idle": "0s",
      "pending_size": 0,
      "in_msgs": 582011,
      "out_msgs": 1127420,
      "in_bytes": 246896380,
      "out_bytes": 3392144265,
      "subscriptions": 56,
      "compression": "off"
    },
    {
      "rid": 6038,
      "remote_id": "NCS7TNLPVKAI7YA777FKARHZAEVIUOQ5CKFYXSKP6RDO2ZGE2TSFGE25",
      "remote_name": "nats-cluster-v4-1",
      "did_solicit": true,
      "is_configured": true,
      "ip": "10.202.142.65",
      "port": 6222,
      "start": "2024-06-14T12:34:42.848315979Z",
      "last_activity": "2024-06-14T15:48:01.506126813Z",
      "rtt": "266µs",
      "uptime": "3h13m18s",
      "idle": "0s",
      "pending_size": 0,
      "in_msgs": 87690088,
      "out_msgs": 682786,
      "in_bytes": 180559446164,
      "out_bytes": 690431012,
      "subscriptions": 6132,
      "compression": "off"
    }
  ]
}
}
{
"server": {
  "name": "nats-cluster-v4-2",
  "host": "0.0.0.0",
  "id": "NBRW5CG3HCOJSGMMXDGDGHPHTFUDLEYFUP3PXFWLNVFZ4PXCVLCQXNCI",
  "cluster": "nats-cluster-v4",
  "ver": "2.10.14",
  "jetstream": false,
  "flags": 0,
  "seq": 401,
  "time": "2024-06-14T15:48:01.506191192Z"
},
"data": {
  "server_id": "NBRW5CG3HCOJSGMMXDGDGHPHTFUDLEYFUP3PXFWLNVFZ4PXCVLCQXNCI",
  "server_name": "nats-cluster-v4-2",
  "now": "2024-06-14T15:48:01.506164282Z",
  "num_routes": 2,
  "routes": [
    {
      "rid": 5,
      "remote_id": "NCS7TNLPVKAI7YA777FKARHZAEVIUOQ5CKFYXSKP6RDO2ZGE2TSFGE25",
      "remote_name": "nats-cluster-v4-1",
      "did_solicit": true,
      "is_configured": true,
      "ip": "10.202.142.65",
      "port": 6222,
      "start": "2024-06-14T12:46:40.645912846Z",
      "last_activity": "2024-06-14T15:48:01.506142912Z",
      "rtt": "302µs",
      "uptime": "3h1m20s",
      "idle": "0s",
      "pending_size": 0,
      "in_msgs": 882,
      "out_msgs": 53161,
      "in_bytes": 335060,
      "out_bytes": 75416839,
      "subscriptions": 6132,
      "compression": "off"
    },
    {
      "rid": 6,
      "remote_id": "NBYFQSZJA3Q42WRHRMDH6SCBQ3YJKBZZXH62P4BDZW73PCMFXZCQVKKS",
      "remote_name": "nats-cluster-v4-0",
      "did_solicit": true,
      "is_configured": true,
      "ip": "10.202.173.11",
      "port": 6222,
      "start": "2024-06-14T12:46:40.646706926Z",
      "last_activity": "2024-06-14T15:48:01.485658073Z",
      "rtt": "301µs",
      "uptime": "3h1m20s",
      "idle": "0s",
      "pending_size": 0,
      "in_msgs": 1127420,
      "out_msgs": 582011,
      "in_bytes": 3392144265,
      "out_bytes": 246896380,
      "subscriptions": 1502,
      "compression": "off"
    }
  ]
}
}
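To put the counters after the change into perspective, the cumulative `in_msgs` and `in_bytes` of a route can be divided by its `uptime` to get an average inbound rate. The sketch below does that for nats-cluster-v4-0's route to nats-cluster-v4-1 (rid 6038). The helper names are hypothetical, the parser only handles the d/h/m/s uptime strings seen in these dumps, and the result is an average since the route started, not the rate at the moment of the snapshot.

```python
import re

# Matches uptime strings of the form seen in the dumps,
# e.g. "3h13m18s" or "9d15h14m55s".
_UPTIME = re.compile(r"(?:(\d+)d)?(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s)?")


def uptime_seconds(text):
    """Convert a d/h/m/s uptime string to seconds; reject anything else."""
    match = _UPTIME.fullmatch(text)
    if not text or not match:
        raise ValueError(f"unsupported uptime format: {text!r}")
    days, hours, minutes, seconds = (int(g or 0) for g in match.groups())
    return days * 86400 + hours * 3600 + minutes * 60 + seconds


def average_inbound(route):
    """Return (msgs per second, bytes per second) averaged over the uptime."""
    elapsed = uptime_seconds(route["uptime"])
    if elapsed == 0:
        raise ValueError("route uptime is zero, cannot compute a rate")
    return route["in_msgs"] / elapsed, route["in_bytes"] / elapsed


# nats-cluster-v4-0's route to nats-cluster-v4-1 after the change (rid 6038),
# values copied as posted.
route = {"uptime": "3h13m18s", "in_msgs": 87690088, "in_bytes": 180559446164}
msgs_per_s, bytes_per_s = average_inbound(route)
print(f"{msgs_per_s:.0f} msgs/s, {bytes_per_s / 1e6:.1f} MB/s")
```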

Only a node restart fixed it. Situation after the restart (0 connections, ~0 incoming bytes, ~0 outgoing bytes):

[Screenshot: CleanShot 2024-06-14 at 17 25 50@2x]

Expected behavior

Remove the subscription route when there are no consumers to read messages.

Server and client version

Server version: 2.10.14
Client: -

Host environment

GKE host:

OS: Container-Optimized OS from Google
OS version: 109
Architecture: x86-64
CR: containerd

Steps to reproduce

In our case, a few slow consumers appeared in the cluster, and incoming messages then started flowing to the node without connections.

In the logs we could only see messages like:

Logs
nats [7] 2024/06/10 11:05:42.004110 [INF] 10.202.171.13:52058 - cid:3207 - Slow Consumer Detected: WriteDeadline of 4s exceeded with 2 chunks of 4435 total bytes.
nats [7] 2024/06/10 11:11:03.923105 [INF] 10.202.171.13:55020 - cid:3211 - Slow Consumer Detected: WriteDeadline of 4s exceeded with 2 chunks of 535 total bytes.
nats [7] 2024/06/10 11:11:29.156916 [INF] 10.202.171.13:47712 - cid:3212 - Slow Consumer Detected: WriteDeadline of 4s exceeded with 1 chunks of 176 total bytes.
nats [7] 2024/06/10 11:12:13.039384 [INF] 10.202.171.13:36508 - cid:3214 - Slow Consumer Detected: WriteDeadline of 4s exceeded with 44 chunks of 601458 total bytes.
nats [7] 2024/06/10 11:13:49.470739 [INF] 10.202.171.13:36666 - cid:3216 - Slow Consumer Detected: WriteDeadline of 4s exceeded with 14 chunks of 258264 total bytes.
nats [7] 2024/06/10 11:26:42.056093 [INF] 10.202.171.13:33510 - cid:3219 - Slow Consumer Detected: WriteDeadline of 4s exceeded with 2 chunks of 848 total bytes.
nats [7] 2024/06/10 11:31:57.516530 [INF] 10.202.171.13:55710 - cid:3221 - Slow Consumer Detected: WriteDeadline of 4s exceeded with 1 chunks of 69 total bytes.
nats [7] 2024/06/10 11:33:56.144849 [INF] 10.202.171.13:58652 - cid:3223 - Slow Consumer Detected: WriteDeadline of 4s exceeded with 2 chunks of 4319 total bytes.
nats [7] 2024/06/10 11:37:48.260718 [INF] 10.202.171.13:59468 - cid:3227 - Slow Consumer Detected: WriteDeadline of 4s exceeded with 30 chunks of 789327 total bytes.
nats [7] 2024/06/10 11:38:29.813781 [INF] 10.202.171.13:40308 - cid:3229 - Slow Consumer Detected: WriteDeadline of 4s exceeded with 2 chunks of 1659 total bytes.
nats [7] 2024/06/10 15:39:38.161377 [INF] 10.202.171.13:55896 - cid:3330 - Slow Consumer Detected: WriteDeadline of 4s exceeded with 35 chunks of 929026 total bytes.
nats [7] 2024/06/10 15:41:54.142104 [INF] 10.202.171.13:48106 - cid:3335 - Slow Consumer Detected: WriteDeadline of 4s exceeded with 2 chunks of 8820 total bytes.
nats [7] 2024/06/10 17:08:54.921921 [INF] 10.202.171.13:37126 - cid:3409 - Slow Consumer Detected: WriteDeadline of 4s exceeded with 61 chunks of 1927533 total bytes.
nats [7] 2024/06/10 17:16:09.376586 [INF] 10.202.171.13:39610 - cid:3415 - Slow Consumer Detected: WriteDeadline of 4s exceeded with 2 chunks of 5772 total bytes.
nats [7] 2024/06/10 17:18:45.193519 [INF] 10.202.171.13:56224 - cid:3416 - Slow Consumer Detected: WriteDeadline of 4s exceeded with 2 chunks of 2065 total bytes.
nats [7] 2024/06/10 17:28:55.946395 [INF] 10.202.171.13:53416 - cid:3417 - Slow Consumer Detected: WriteDeadline of 4s exceeded with 2 chunks of 13288 total bytes.
nats [7] 2024/06/10 17:38:25.867114 [INF] 10.202.171.13:48406 - cid:3422 - Slow Consumer Detected: WriteDeadline of 4s exceeded with 2 chunks of 16039 total bytes.
nats [7] 2024/06/10 17:38:41.448495 [INF] 10.202.171.13:48482 - cid:3423 - Slow Consumer Detected: WriteDeadline of 4s exceeded with 2 chunks of 18763 total bytes
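The log lines above follow a regular shape, so they can be tallied with a short script, for example to see which client address and connection IDs are affected and how much data was pending each time. This is a sketch only: the regular expression is written against the lines quoted here rather than taken from the server source, and `parse_slow_consumers` is a hypothetical helper.

```python
import re
from collections import Counter

# Pattern written against the "Slow Consumer Detected" lines quoted above.
SLOW_CONSUMER = re.compile(
    r"(?P<ip>\d+\.\d+\.\d+\.\d+):(?P<port>\d+) - cid:(?P<cid>\d+) - "
    r"Slow Consumer Detected: WriteDeadline of (?P<deadline>\S+) exceeded "
    r"with (?P<chunks>\d+) chunks of (?P<bytes>\d+) total bytes"
)

# Three lines copied from the log excerpt above.
SAMPLE = """\
nats [7] 2024/06/10 11:05:42.004110 [INF] 10.202.171.13:52058 - cid:3207 - Slow Consumer Detected: WriteDeadline of 4s exceeded with 2 chunks of 4435 total bytes.
nats [7] 2024/06/10 11:12:13.039384 [INF] 10.202.171.13:36508 - cid:3214 - Slow Consumer Detected: WriteDeadline of 4s exceeded with 44 chunks of 601458 total bytes.
nats [7] 2024/06/10 17:08:54.921921 [INF] 10.202.171.13:37126 - cid:3409 - Slow Consumer Detected: WriteDeadline of 4s exceeded with 61 chunks of 1927533 total bytes.
"""


def parse_slow_consumers(text):
    """Extract one record per slow-consumer line; other lines are skipped."""
    events = []
    for line in text.splitlines():
        match = SLOW_CONSUMER.search(line)
        if match:
            events.append({
                "ip": match["ip"],
                "cid": int(match["cid"]),
                "chunks": int(match["chunks"]),
                "bytes": int(match["bytes"]),
            })
    return events


events = parse_slow_consumers(SAMPLE)
by_ip = Counter(event["ip"] for event in events)
largest = max(events, key=lambda event: event["bytes"])
print(by_ip, "largest pending write:", largest)
```

In the excerpt quoted here, every line comes from the same client address (10.202.171.13) with a different cid each time.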
nenych added the defect label on Jun 14, 2024
@jing-flowdesk

Hello, nice timing. It seems we are having a similar issue with a similar setup. We also have a NATS cluster with 3 nodes, each in a different AZ, and clients connect to the node in their own AZ.

Server and client version
Server version: 2.10.16
Client: -

JetStream is not enabled.
We have set up a queue group for subscribers.

Observed behavior
When there is no subscriber for a specific subject in a queue group, our metrics show about 10 times more messages (sent and received).

Expected behavior
No burst of messages when there is no subscriber

[Screenshot: nats message]
@derekcollison
Member

Can you provide detailed instructions on how we could reproduce this? That would be helpful.

@miloaec

miloaec commented Jun 19, 2024

Hi,
I am working with Jing. We restarted the NATS servers and at the same time moved the pods to a new node pool, so the NATS pods migrated to new VMs. Since this move we have not seen the issue.

We tried to reproduce the same behavior in our dev environment, but so far we have been unable to.

When it occurred, we had a microservice sending messages to a subscriber that is part of a queue group. When we stopped the subscriber, we observed the behavior shown in the graph Jing sent: the number of messages increased about 10x.

When we subscribed again, the number of messages came back to normal. We tested this several times with exactly the same result, but as I said, since we moved to the new nodes we have not been able to reproduce it.

@nenych
Author

nenych commented Jun 19, 2024

@derekcollison I can't reproduce it right now, but we have a node showing this issue, and if there is a way to debug it there, we can do that.

@derekcollison
Member

So you have a node that is showing this behavior that has no client connections and no jetstream assets on that node, correct?

@nenych
Author

nenych commented Jun 20, 2024

Right now it has clients, but it definitely also has additional unroutable traffic (once we fix one issue, this node will again have no clients). We do not use JetStream, so yes, this node has no JetStream.

@derekcollison
Member

Is it possible to check whether the issue is present with the latest release candidate for v2.10.17 (RC.6)?

https://github.com/nats-io/nats-server/releases/tag/v2.10.17-RC.6

@nenych
Author

nenych commented Jun 21, 2024

I have been trying to reproduce this situation, but without luck, so we only see it in production.

@derekcollison
Member

Is production showing the issue now?

@nenych
Author

nenych commented Jun 25, 2024

Yes, it is production, and the issue is present.

@derekcollison
Member

Can we schedule a call to take a look?

Shoot me an email - derek@synadia.com
