executor: fix update ignore still report dup-key error in statement retry #54495

Merged
merged 4 commits into from
Jul 16, 2024

Conversation

lcwangchao
Collaborator

@lcwangchao lcwangchao commented Jul 8, 2024

What problem does this PR solve?

Issue Number: close #54489

What changed and how does it work?

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No need to test
    • I checked and no code files have been changed.

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

fix some dup-key cases in UpdateRecord
@ti-chi-bot ti-chi-bot bot added release-note size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jul 8, 2024

tiprow bot commented Jul 8, 2024

Hi @lcwangchao. Thanks for your PR.

PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test all.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

codecov bot commented Jul 8, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 56.2582%. Comparing base (0c9a679) to head (1f34c5f).
Report is 72 commits behind head on master.

Additional details and impacted files
@@                Coverage Diff                @@
##             master     #54495         +/-   ##
=================================================
- Coverage   74.7741%   56.2582%   -18.5159%     
=================================================
  Files          1539       1669        +130     
  Lines        361871     615209     +253338     
=================================================
+ Hits         270586     346106      +75520     
- Misses        71638     245607     +173969     
- Partials      19647      23496       +3849     
Flag Coverage Δ
integration 37.1187% <100.0000%> (?)
unit 71.7767% <100.0000%> (-1.9076%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
dumpling 52.9656% <ø> (-2.2339%) ⬇️
parser ∅ <ø> (∅)
br 52.2160% <ø> (+4.3332%) ⬆️
@lcwangchao lcwangchao changed the title executor: fix some dup-key cases in UpdateRecord Jul 8, 2024
if err != nil {
	return err
}
err = t.rebuildIndices(sctx, txn, h, touched, oldData, newData, table.WithCtx(ctx))
Contributor

Why change these lines? The test still passes if these lines are unchanged.

Collaborator Author

@lcwangchao lcwangchao Jul 10, 2024

Because this code is no longer necessary; deleting these lines makes the code clearer. The current LazyCheckKeyNotExists is:

func (s *SessionVars) LazyCheckKeyNotExists() bool {
	if s.StmtCtx.ErrGroupLevel(errctx.ErrGroupDupKey) != errctx.LevelError {
		// This branch means we are in `insert/update ignore`.
		// The executor will handle the dup-key error and ignore it in executor,
		// so we must check the dup-key error in place to make sure the executor can get the error.
		return false
	}
	return s.PresumeKeyNotExists || (s.TxnCtx != nil && s.TxnCtx.IsPessimistic)
}

You can see that if sessVars.TxnCtx.IsPessimistic is true, s.PresumeKeyNotExists || (s.TxnCtx != nil && s.TxnCtx.IsPessimistic) always evaluates to true, so we do not need to set PresumeKeyNotExists.
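The argument above can be checked with a small standalone sketch. The `sessionVars` struct and its boolean fields are hypothetical stand-ins for TiDB's `SessionVars`, `StmtCtx.ErrGroupLevel` and `TxnCtx`; only the branching mirrors the function quoted above.

```go
package main

import "fmt"

// sessionVars is a hypothetical, trimmed-down stand-in for TiDB's SessionVars.
type sessionVars struct {
	DupKeyIsError       bool // stands in for ErrGroupLevel(ErrGroupDupKey) == LevelError
	PresumeKeyNotExists bool
	HasTxnCtx           bool // stands in for s.TxnCtx != nil
	IsPessimistic       bool // stands in for s.TxnCtx.IsPessimistic
}

// lazyCheckKeyNotExists mirrors the branching of the quoted function.
func lazyCheckKeyNotExists(s sessionVars) bool {
	if !s.DupKeyIsError {
		// insert/update ignore: the dup-key check must happen in place.
		return false
	}
	return s.PresumeKeyNotExists || (s.HasTxnCtx && s.IsPessimistic)
}

func main() {
	// In a pessimistic transaction the result does not depend on PresumeKeyNotExists.
	for _, presume := range []bool{false, true} {
		s := sessionVars{
			DupKeyIsError:       true,
			PresumeKeyNotExists: presume,
			HasTxnCtx:           true,
			IsPessimistic:       true,
		}
		fmt.Printf("presume=%v lazy=%v\n", presume, lazyCheckKeyNotExists(s))
	}
}
```

With `IsPessimistic` set, both iterations report the same result, which is why setting `PresumeKeyNotExists` beforehand is redundant in that mode.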

Collaborator Author

reverted diffs in LazyCheckKeyNotExists can also fix this bug...

Member

@jackysp jackysp Jul 15, 2024

Let's go back to the original place. The purpose of tidb_constraint_check_in_place is only to make the lazy unique index check for insert effective. Historically, only inserts had a lazy unique index check. tidb_constraint_check_in_place was actually designed to remove this exclusive behavior for inserts, so that all statements execute consistent in-place checks. Therefore, update statements have never had a lazy check. So even if tidb_constraint_check_in_place is off, updates should still report an error. @lcwangchao @ekexium

Member

@jackysp jackysp Jul 15, 2024

As for why the name was designed this way, that was the product manager's decision at the time. However, I don't think it's necessary to make updates use the lazy check as well. Insert needs the lazy check because it performs more efficiently during bulk loading. Bulk loading should rarely involve updates, right?

Member

@jackysp jackysp Jul 15, 2024

If you want the update to also perform a lazy check, handling it only at

if !sessVars.InTxn() {
	savePresumeKeyNotExist := sessVars.PresumeKeyNotExists
	if !sessVars.ConstraintCheckInPlace && sessVars.TxnCtx.IsPessimistic {
		sessVars.PresumeKeyNotExists = true
	}
	err = t.rebuildIndices(sctx, txn, h, touched, oldData, newData, table.WithCtx(ctx))
	sessVars.PresumeKeyNotExists = savePresumeKeyNotExist
	if err != nil {
		return err
	}
} else {
is not enough. It also requires the 2PC phase to first check for unique constraints like insert before committing. This way, for #54489, there will be no retry during an update; instead, a duplicate entry error reported by TiKV will be received by TiDB, and it will convert it to a warning.

Collaborator Author

I think this is really a discussion about issue #54492. This PR only disables the forced lazy check for pessimistic transactions when the statement is update ignore. For #54492, I think we can leave the comments above there and just close it.

Contributor

@ekexium ekexium left a comment

Consider a pessimistic txn, constraint_check_in_place=off, and error level LevelWarn. In the original code LazyCheckKeyNotExists returns true; in this PR it returns false, right?
If we let the original value of vars.PresumeKeyNotExists be !vars.ConstraintCheckInPlace, we get the following table:

InTxn  constraint_check_in_place  mode  levelError  master, lazy?  PR, lazy?  Equivalent
FALSE  FALSE                      opt   FALSE       TRUE           FALSE      FALSE
FALSE  FALSE                      opt   TRUE        TRUE           TRUE       TRUE
FALSE  FALSE                      pes   FALSE       TRUE           FALSE      FALSE
FALSE  FALSE                      pes   TRUE        TRUE           TRUE       TRUE
FALSE  TRUE                       opt   FALSE       FALSE          FALSE      TRUE
FALSE  TRUE                       opt   TRUE        FALSE          FALSE      TRUE
FALSE  TRUE                       pes   FALSE       FALSE          FALSE      TRUE
FALSE  TRUE                       pes   TRUE        TRUE           TRUE       TRUE
TRUE   FALSE                      opt   FALSE       TRUE           FALSE      FALSE
TRUE   FALSE                      opt   TRUE        TRUE           TRUE       TRUE
TRUE   FALSE                      pes   FALSE       TRUE           FALSE      FALSE
TRUE   FALSE                      pes   TRUE        TRUE           TRUE       TRUE
TRUE   TRUE                       opt   FALSE       FALSE          FALSE      TRUE
TRUE   TRUE                       opt   TRUE        FALSE          FALSE      TRUE
TRUE   TRUE                       pes   FALSE       FALSE          FALSE      TRUE
TRUE   TRUE                       pes   TRUE        TRUE           TRUE       TRUE

Besides, s.PresumeKeyNotExists only makes the reasoning more complex. Since it's only set in one place, I think we should consider removing it.
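For readers who want to reproduce the table, here is a minimal sketch that enumerates the same 16 combinations. It assumes, as the comment does, that PresumeKeyNotExists starts as !ConstraintCheckInPlace. The `masterLazy` formula is inferred from the "master, lazy?" column rather than taken from the TiDB source, and all function and field names are hypothetical.

```go
package main

import "fmt"

// row is one line of the comparison table.
type row struct {
	InTxn, InPlace, Pessimistic, LevelError bool
	Master, PR                              bool
}

// masterLazy is inferred from the "master, lazy?" column, not from the source.
func masterLazy(inPlace, pessimistic, levelError bool) bool {
	presume := !inPlace // assumed initial PresumeKeyNotExists
	if !levelError {
		return presume
	}
	return presume || pessimistic
}

// prLazy follows LazyCheckKeyNotExists as quoted earlier in the thread.
func prLazy(inPlace, pessimistic, levelError bool) bool {
	if !levelError {
		return false
	}
	return !inPlace || pessimistic
}

// buildTable enumerates the combinations in the same order as the table.
func buildTable() []row {
	bools := []bool{false, true}
	var rows []row
	for _, inTxn := range bools {
		for _, inPlace := range bools {
			for _, pes := range bools {
				for _, lvl := range bools {
					rows = append(rows, row{
						InTxn: inTxn, InPlace: inPlace, Pessimistic: pes, LevelError: lvl,
						Master: masterLazy(inPlace, pes, lvl),
						PR:     prLazy(inPlace, pes, lvl),
					})
				}
			}
		}
	}
	return rows
}

func main() {
	fmt.Println("InTxn inPlace mode levelError master PR    equivalent")
	for _, r := range buildTable() {
		mode := "opt"
		if r.Pessimistic {
			mode = "pes"
		}
		fmt.Printf("%-5v %-7v %-4s %-10v %-6v %-5v %v\n",
			r.InTxn, r.InPlace, mode, r.LevelError, r.Master, r.PR, r.Master == r.PR)
	}
}
```

Under these assumptions the two columns differ only when constraint_check_in_place is off and the dup-key error level is not LevelError, and InTxn never affects the outcome.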

@lcwangchao
Collaborator Author

This PR only changes LazyCheckKeyNotExists in pessimistic mode. In optimistic mode, its return value depends on PresumeKeyNotExists. In an update statement, PresumeKeyNotExists is not set anyway, so this PR does not change the return value there. In an insert statement, the PR only changes the return value when the error level is LevelWarn, which only happens for insert ignore; in that case LazyCheckKeyNotExists is not used because BatchCheck is set to true...

Yes, I think PresumeKeyNotExists is boring and we should remove it. But I think we should also remove LazyCheckKeyNotExists and use a new option WithDupKeyCheckMode (default is checkInPlace) to express it instead. Each scenario should compute its DupKeyCheckMode separately.
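To make the proposal concrete, here is a rough sketch of what such an option could look like using Go's functional-options pattern. Only the names WithDupKeyCheckMode and the in-place default come from the comment above; the `DupKeyCheckMode` constants, `createOpts`, `resolveOpts` and the policy in `modeForUpdate` are hypothetical and are not TiDB's actual API.

```go
package main

import "fmt"

// DupKeyCheckMode is a hypothetical enum for how a unique-key conflict is detected.
type DupKeyCheckMode int

const (
	DupKeyCheckInPlace DupKeyCheckMode = iota // default: check when the key is written
	DupKeyCheckLazy                           // defer the check (hypothetical name)
)

// createOpts is a hypothetical option bag for an index/record write.
type createOpts struct {
	dupKeyCheck DupKeyCheckMode
}

// CreateOption mutates the option bag.
type CreateOption func(*createOpts)

// WithDupKeyCheckMode selects the dup-key check mode for one call.
func WithDupKeyCheckMode(m DupKeyCheckMode) CreateOption {
	return func(o *createOpts) { o.dupKeyCheck = m }
}

// resolveOpts applies the options; the zero value means check in place.
func resolveOpts(opts ...CreateOption) createOpts {
	var o createOpts
	for _, apply := range opts {
		apply(&o)
	}
	return o
}

// modeForUpdate is one possible caller-side policy (an assumption, not the
// merged behavior): update ignore always checks in place so the executor
// sees the dup-key error; a plain pessimistic update may check lazily.
func modeForUpdate(ignore, pessimistic bool) DupKeyCheckMode {
	if !ignore && pessimistic {
		return DupKeyCheckLazy
	}
	return DupKeyCheckInPlace
}

func main() {
	o := resolveOpts(WithDupKeyCheckMode(modeForUpdate(true, true)))
	fmt.Println("update ignore, pessimistic, in place:", o.dupKeyCheck == DupKeyCheckInPlace)
}
```

The point of this shape is that each statement type computes its own mode and passes it down explicitly, so the table layer no longer has to consult shared session state.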

@ekexium ekexium requested a review from cfzjywxk July 10, 2024 08:14
@ekexium
Contributor

ekexium commented Jul 10, 2024

So the correctness of this fix depends on how the upper layer uses it. I suggest that we keep LazyCheckKeyNotExists unchanged for optimistic transactions instead of depending on the callers. That seems a safer approach.

@lcwangchao
Collaborator Author

/retest

tiprow bot commented Jul 10, 2024

@lcwangchao: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Contributor

@ekexium ekexium left a comment

Have you verified #20484? Seems the part was introduced to solve that issue.

@lcwangchao
Collaborator Author

lcwangchao commented Jul 12, 2024

Have you verified #20484? Seems the part was introduced to solve that issue.

It seems to be trying to make update use a lazy check... But I don't understand this PR: if a transaction is pessimistic and the statement is not update ignore, LazyCheckKeyNotExists should always return true. If the statement is update ignore, this bug will occur...

@tiancaiamao could you PTAL?

Contributor

@tiancaiamao tiancaiamao left a comment

The current master branch does not work as expected.
If there is no begin, the expected behavior is that index.Create does not need tikvSnapshotGet to check whether the unique key exists.
[screenshot]

If there is a begin, neither the master branch nor this PR uses tikvSnapshotGet under index.Create:

[screenshot]

This PR does not change the behavior, and it simplifies the logic, so LGTM.
As for why the current master branch does not work as expected, we can trace that in a separate thread...

Contributor

@cfzjywxk cfzjywxk left a comment

It's difficult to directly verify correctness for all the combinations of optimistic/pessimistic modes and in-place or lazy checks.

We may need to file a refactoring and test-coverage task to clarify the related code paths, especially since optimistic mode will be deprecated in the future. It could also be treated as one of the sub-tasks of that deprecation.
/cc @lcwangchao @ekexium

ti-chi-bot bot commented Jul 16, 2024

@cfzjywxk: GitHub didn't allow me to request PR reviews from the following users: lcwangchao.

Note that only pingcap members and repo collaborators can review this PR, and authors cannot review their own PRs.

@ti-chi-bot ti-chi-bot bot requested a review from ekexium July 16, 2024 12:56

ti-chi-bot bot commented Jul 16, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cfzjywxk, tiancaiamao

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

ti-chi-bot bot commented Jul 16, 2024

[LGTM Timeline notifier]

Timeline:

  • 2024-07-12 13:53:15.192697164 +0000 UTC m=+16417.183638635: ☑️ agreed by tiancaiamao.
  • 2024-07-16 12:56:16.413604905 +0000 UTC m=+358598.404546359: ☑️ agreed by cfzjywxk.
@ti-chi-bot ti-chi-bot bot merged commit 7a09434 into pingcap:master Jul 16, 2024
22 of 23 checks passed
@lcwangchao lcwangchao deleted the fix_dupkey branch July 17, 2024 01:49
Labels
approved lgtm release-note size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
5 participants