
[Bug] The consumer minOffset does not advance until the container is restarted #8327

tongtaodragon opened this issue Jun 24, 2024 · 1 comment
Before Creating the Bug Report

  • I found a bug, not just asking a question, which should be created in GitHub Discussions.

  • I have searched the GitHub Issues and GitHub Discussions of this repository and believe that this is not a duplicate.

  • I have confirmed that this bug belongs to the current repository, not other repositories of RocketMQ.

Runtime platform environment

Red Hat 8.0

RocketMQ version

Broker: 4.8.0
Client SDK: 4.9.3

JDK Version

Oracle JDK 1.8.0_121

Describe the Bug

In our production environment, we have run into this issue several times, but we cannot reproduce it reliably. It happens during upgrades: we run several container instances and perform rolling upgrades. When it occurs, the consumer minOffset of one queue on one broker stays stuck at a fixed value, yet the consumer can still consume later messages successfully, so the number of accumulated messages keeps growing. After we restarted the problematic consumer instance, the issue disappeared.

  1. In the broker log, we did not find any warn or error messages related to this.
  2. In the client log, we did not find any warn or error messages related to this either. Since the number of accumulated messages exceeded 2000, which is larger than maxSpan, some flow-control warnings were logged (see the sketch below).
  3. We compared the cleanExpiredMessage thread between the problematic instance and a normal one; it worked well on both. We found an old issue related to the cleanExpiredMessage thread that can result in this symptom, but it was fixed in SDK 4.9.3.
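For reference, the flow-control warning mentioned in item 2 comes from the pull path: when the offset span of the in-flight messages in a ProcessQueue exceeds consumeConcurrentlyMaxSpan (2000 by default), the client postpones the next pull and logs a warning. A rough sketch of that check, paraphrased from DefaultMQPushConsumerImpl in the 4.x client (names and log text simplified, not the exact source):

    // Paraphrased from DefaultMQPushConsumerImpl.pullMessage (4.x client); simplified.
    if (!this.consumeOrderly) {
        if (processQueue.getMaxSpan() > this.defaultMQPushConsumer.getConsumeConcurrentlyMaxSpan()) {
            // Postpone the next pull and occasionally log the flow-control
            // warning that shows up in the client log.
            this.executePullRequestLater(pullRequest, PULL_TIME_DELAY_MILLS_WHEN_FLOW_CONTROL);
            if ((queueMaxSpanFlowControlTimes++ % 1000) == 0) {
                log.warn("the queue's messages span too long, so do flow control, maxSpan={}",
                    processQueue.getMaxSpan());
            }
            return;
        }
    }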

Steps to Reproduce

We suspect that one consume thread exits abnormally and therefore never sets ConsumeStartTimeStamp on the messages it was handed. We then reproduced the symptom with the hack below (see also the sketch after these steps).

In ConsumeMessageConcurrentlyService.java, we changed the code as follows.

  1. Add one static member as below:

    private static int count = 0;

  2. Add code to the run method of class ConsumeRequest as below:

    @Override
    public void run() {
        if (this.processQueue.isDropped()) {
            log.info("xxxx");
            return;
        }

        // Hack: make the very first ConsumeRequest return without consuming,
        // simulating a consume thread that exits abnormally before
        // ConsumeStartTimeStamp is set and before the result is processed.
        if (count == 0) {
            count++;
            return;
        }
        // ... the original consume logic continues unchanged ...
    }

  3. Recompile the rocketmq-client jar and install it.

  4. Produce some messages to the queue.

  5. Start one consumer instance; the issue can then be observed.
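Why this hack pins minOffset: the push consumer's ProcessQueue keeps in-flight messages in a TreeMap keyed by queue offset, and the offset committed back to the broker is the first key of that map. The batch swallowed by the hack is never passed to processConsumeResult, so its messages are never removed, and the first key, hence the consumer offset and the broker-side minOffset, can never advance even though later messages keep being consumed. A self-contained toy model of that bookkeeping (an illustration, not the real ProcessQueue):

    import java.util.TreeMap;

    // Toy model of ProcessQueue's offset bookkeeping (illustration only).
    class OffsetTracker {
        private final TreeMap<Long, String> inflight = new TreeMap<>();
        private long queueOffsetMax = -1;

        void put(long offset, String msg) {
            inflight.put(offset, msg);
            queueOffsetMax = Math.max(queueOffsetMax, offset);
        }

        // Mirrors the idea behind ProcessQueue.removeMessage: the committable
        // offset is the smallest offset still in flight; if nothing is in
        // flight, it is one past the largest offset ever seen.
        long remove(long offset) {
            inflight.remove(offset);
            return inflight.isEmpty() ? queueOffsetMax + 1 : inflight.firstKey();
        }
    }

    public class StuckOffsetDemo {
        public static void main(String[] args) {
            OffsetTracker q = new OffsetTracker();
            for (long i = 100; i < 105; i++) q.put(i, "msg-" + i);
            // Offset 100 is "lost" by the hacked ConsumeRequest and never removed,
            // so the committable offset stays pinned at 100.
            for (long i = 101; i < 105; i++) {
                System.out.println("committable offset = " + q.remove(i)); // prints 100
            }
        }
    }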

What Did You Expect to See?

After some time, the queue should return to normal: minOffset should keep growing and the number of accumulated messages should drop back to normal. We do not want to restart the consumer instance, since it would consume old messages again.

What Did You See Instead?

minOffset keeps a fixed value and never grows, while the number of accumulated messages keeps growing.

Additional Context

In our case:

  1. clustering consume mode
  2. DefaultMQPushConsumer
  3. concurrent consumption (MessageListenerConcurrently)
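For context, a minimal consumer setup matching the three points above might look like the following (group, topic, and nameserver address are placeholders, not from our environment):

    import java.util.List;
    import org.apache.rocketmq.client.consumer.DefaultMQPushConsumer;
    import org.apache.rocketmq.client.consumer.listener.ConsumeConcurrentlyContext;
    import org.apache.rocketmq.client.consumer.listener.ConsumeConcurrentlyStatus;
    import org.apache.rocketmq.client.consumer.listener.MessageListenerConcurrently;
    import org.apache.rocketmq.common.message.MessageExt;
    import org.apache.rocketmq.common.protocol.heartbeat.MessageModel;

    public class ConsumerSetup {
        public static void main(String[] args) throws Exception {
            DefaultMQPushConsumer consumer = new DefaultMQPushConsumer("demo_consumer_group");
            consumer.setNamesrvAddr("127.0.0.1:9876");          // placeholder address
            consumer.setMessageModel(MessageModel.CLUSTERING);  // clustering consume mode
            consumer.subscribe("DemoTopic", "*");               // placeholder topic
            consumer.registerMessageListener(new MessageListenerConcurrently() {
                @Override
                public ConsumeConcurrentlyStatus consumeMessage(List<MessageExt> msgs,
                                                                ConsumeConcurrentlyContext context) {
                    // Returning CONSUME_SUCCESS lets the client remove the batch
                    // from the ProcessQueue and advance the committed offset.
                    return ConsumeConcurrentlyStatus.CONSUME_SUCCESS;
                }
            });
            consumer.start();
        }
    }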
@tongtaodragon (Author):
The issue was confirmed in the production environment. We took a live dump and searched for a message that had been stuck for several days. In that message's properties, there is no CONSUME_START_TIME property. We still do not know why the consume thread exits abnormally.
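This observation fits the mechanism described above: the expired-message cleaner only resends a message whose consume start timestamp is present and older than the consume timeout, so a message that never received CONSUME_START_TIME is never treated as expired. A rough sketch of the check, paraphrased from ProcessQueue.cleanExpiredMsg in the 4.x client (locking and send-back details omitted, not the exact source):

    // Paraphrased from ProcessQueue.cleanExpiredMsg (4.x client); simplified.
    String startTs = MessageAccessor.getConsumeStartTimeStamp(msgTreeMap.firstEntry().getValue());
    boolean expired = startTs != null && !startTs.isEmpty()
        && System.currentTimeMillis() - Long.parseLong(startTs)
           > pushConsumer.getConsumeTimeout() * 60 * 1000;
    if (!expired) {
        // A message without CONSUME_START_TIME never looks expired, so it is
        // never sent back to the broker and never removed from the TreeMap.
        // Its offset stays the first key forever, pinning minOffset.
        break;
    }
    // otherwise: send the message back to the broker and remove it locally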
