fix(plugins/http-log): improve concurrency when max_batch_size is set to 1 #13332
base: master
Conversation
kong/plugins/http-log/handler.lua (Outdated)

    if (ngx.now() - start_time) > queue_conf.max_retry_time then
      kong.log.err(string.format(
        "could not send entries, giving up after %d retries. 1 queue entries were lost",
As the maximum retry time is configured, it should probably be reported instead of, or in addition to, the number of retries, so that it is clear which knob to turn in case this message is shown.
Does changing the log like this make sense to you?

    kong.log.err(string.format(
      "could not send entries, giving up after %d retries. 1 queue entry was lost. Please consider tuning the configuration parameter max_retry_time",
      retry_count))
    break
I updated the log message to indicate that max_retry_time was exceeded.
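For context, here is a minimal sketch of how the revised give-up check might read; the surrounding retry loop, the send_entries sender, and the retry_delay backoff are assumptions for illustration, not the PR's exact code:

    -- Sketch only: send_entries and retry_delay are hypothetical stand-ins.
    local start_time = ngx.now()
    local retry_delay = 0.1  -- assumed fixed backoff between attempts

    while true do
      local ok, err = send_entries()
      if ok then
        break
      end

      -- Name the configured knob (max_retry_time) in the message, so it
      -- is clear what to tune when the error is logged.
      if (ngx.now() - start_time) > queue_conf.max_retry_time then
        kong.log.err("could not send entries due to max_retry_time exceeded. 1 queue entry was lost")
        break
      end

      ngx.sleep(retry_delay)
    end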
Co-authored-by: Hans Hübner <hans.huebner@gmail.com>
I think the new behavior should be enabled by an explicit configuration parameter (no_queue) rather than implicitly by a certain queue length. This would make it easier to understand.
Co-authored-by: Hans Hübner <hans.huebner@gmail.com>
@@ -0,0 +1,4 @@
    message: |
      **HTTP-Log**: When `queue.max_batch_size` is 1, log entries are now sent in separate, parallel HTTP requests. Previously, they were sent sequentially in FIFO order.
Suggested change:

    - **HTTP-Log**: When `queue.max_batch_size` is 1, log entries are now sent in separate, parallel HTTP requests. Previously, they were sent sequentially in FIFO order.
    + **HTTP-Log**: When `queue.max_batch_size` is explicitly set to `1`, log entries are now sent via concurrent HTTP requests. Previously, they were sent sequentially in FIFO order.
The default value of max_batch_size is 1, so the wording "explicitly set" is not applicable in this case.
kong/plugins/http-log/handler.lua (Outdated)

    end

    if (now() - start_time) > queue_conf.max_retry_time then
      kong.log.err("could not send entries due to max_retry_time exceeded. 1 queue entry was lost")
kong.log.err("could not send entries due to max_retry_time exceeded. 1 queue entry was lost") | |
kong.log.err("could not send this log entry due to max_retry_time exceeded: lost") |
We now reuse the Queue.handle logic, so the suggestion here is not applicable.
kong/plugins/http-log/handler.lua (Outdated)

    local entry = cjson.encode(kong.log.serialize())

    if queue_conf.max_batch_size == 1 then
      local ok, err = timer_at(0, function()
If the TPS is very high, the number of timers would soar. Would that be a problem?
I suppose all the timers would be managed by timer-ng since 3.0. In that case, there will be warning logs from timer-ng.
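For reference, the per-entry timer pattern under discussion looks roughly like the sketch below; it assumes the queue_conf, entry, and queue variables from the diff above, and is not the PR's exact code:

    local timer_at = ngx.timer.at

    if queue_conf.max_batch_size == 1 then
      -- One zero-delay timer per log entry: entries are sent concurrently
      -- instead of being serialized through the queue's FIFO.
      local ok, err = timer_at(0, function(premature)
        if premature then
          return  -- the worker is shutting down; the entry is dropped
        end
        queue:handle({ entry })
      end)
      if not ok then
        kong.log.err("failed to create timer: ", err)
      end
      return
    end

Under high TPS this creates one pending timer per request, which is the source of the concern above about timer pressure.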
kong/plugins/http-log/handler.lua (Outdated)
Could we extract the timer's logic into a local function? Also, this looks very similar to the process_once method of the queue. Perhaps we could expose that logic from there and reuse it, to reduce duplication.
I like the idea of function reuse. So the Queue has to expose a function (e.g. Queue:handle) to be used both internally and externally.

Here is the function reuse approach, PR #13384. However, we have to refactor the Queue a little to expose:

- Queue.create(): creates a Queue instance without starting a background timer. This helps us leverage the OOP of the queue (most of the queue's functions are OOP, such as queue:log(), and it's hard to reuse these without creating a Queue instance :( ).
- Queue:handle(): a function that contains the retry logic.

@samugi @hanshuebner Let me know your thoughts. The function reuse approach has a somewhat wider code change and is a bit more risky, hence it requires more time for a thoughtful review.
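Under that proposal, the handler side might look roughly like this sketch; Queue.create and Queue:handle are the functions proposed above, and their exact signatures here are assumptions rather than the final API:

    local Queue = require "kong.tools.queue"

    -- Queue.create() (proposed) would build a Queue instance without
    -- starting the background timer, so its OOP methods can be reused.
    local queue = Queue.create(queue_conf, send_entries_handler)  -- assumed signature

    local ok, err = ngx.timer.at(0, function(premature)
      if premature then
        return
      end
      -- Queue:handle() (proposed) would run the same send-and-retry logic
      -- the queue loop uses internally, here on a single-entry batch.
      queue:handle({ entry })
    end)
    if not ok then
      kong.log.err("failed to create timer: ", err)
    end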
I left a review comment. I hope my observation above didn't cost you too much time and effort 😸
@@ -189,12 +190,27 @@ function HttpLogHandler:log(conf)

    local queue_conf = Queue.get_plugin_params("http-log", conf, make_queue_name(conf))
    kong.log.debug("Queue name automatically configured based on configuration parameters to: ", queue_conf.name)
    local entry = cjson.encode(kong.log.serialize())

    if queue_conf.max_batch_size == 1 then
I agree with this comment by Hans:

"I think the new behavior should be enabled by an explicit configuration parameter (no_queue) rather than implicitly by a certain queue length. This would make it easier to understand."
I don't have a strong opinion about it. Let's move this to internal discussion, together with the potential backport.
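If the explicit-parameter route were ever taken, the schema change might look something like the sketch below; no_queue is only a proposal from this thread, not an actual http-log parameter:

    -- Hypothetical field in the plugin schema; "no_queue" is a proposal only.
    { no_queue = {
        type = "boolean",
        default = false,
        description = "Send each log entry in its own HTTP request "
                   .. "instead of batching it through the queue.",
    } },

With such a field, the handler could branch on conf.no_queue instead of inferring the behavior from queue_conf.max_batch_size.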
    if premature then
      return
    end
    queue:handle({ entry })
The problem with this approach is that we'd be using the Queue to process entries in parallel, i.e. not what a queue normally does. To properly reuse the code (now in the queue) that processes an entry, we should probably extract it from the Queue module and then refer to it both from the queue and from here.
If this is too complex for the scope of this task, we might want to do it as a separate refactor.
I agree that we should extract the retry logic outside of the Queue in another PR.
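That follow-up refactor could head in the direction of this sketch: pull the per-batch send-and-retry code out of the Queue module into a plain function that both the queue loop and the http-log handler call. All names and signatures here are hypothetical:

    -- Hypothetical shared helper; module name and signature are illustrative.
    local delivery = {}

    -- Send one batch with retry semantics, independent of any Queue
    -- instance, so both the queue loop and the handler can call it.
    function delivery.send_with_retries(handler_fn, conf, entries)
      local start_time = ngx.now()
      while true do
        local ok, err = handler_fn(conf, entries)
        if ok then
          return true
        end
        if (ngx.now() - start_time) > conf.max_retry_time then
          return nil, "max_retry_time exceeded: " .. tostring(err)
        end
        ngx.sleep(conf.initial_retry_delay or 0.01)  -- simplistic backoff
      end
    end

    return delivery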
Summary

Checklist

- A changelog file has been created under changelog/unreleased/kong, or the skip-changelog label has been added to the PR if a changelog is unnecessary (see README.md).

Issue reference

FTI-6022