Issues: huggingface/text-generation-inference
Issues list
Feature request: Add documentation and examples for adding additional API endpoints.
#2321
opened Jul 27, 2024 by
michael-conrad
Possible config mismatch between TGI and transformers (hidden_act vs. hidden_activation)
#2319
opened Jul 26, 2024 by
xenova
4 tasks
Request failed during generation: Server error: Batch ID 408 not found in cache.
#2316
opened Jul 26, 2024 by
leizhao1234
LLama 3/3.1 70B Outputting "!!!!!!"; Shorter Context
#2312
opened Jul 26, 2024 by
mallorbc
2 of 4 tasks
Tools still giving EoF errors on generated JSON
#2310
opened Jul 25, 2024 by
ArjunBhalla98
1 of 4 tasks
How to use the model's checkpoint in a local folder?
#2302
opened Jul 25, 2024 by
zk19971101
1 of 4 tasks
AttributeError: 'NoneType' object has no attribute 'replace'
#2297
opened Jul 24, 2024 by
almersawi
2 of 4 tasks
Meta-Llama-3.1-405B-Instruct-BNB-NF4 - Shard process was signaled to shutdown with signal 11
#2293
opened Jul 24, 2024 by
BugsBuggy
1 of 4 tasks
Latest Docker image fails while initializing gemma2
#2275
opened Jul 22, 2024 by
jorado
2 of 4 tasks
Add support for Flash Attention 3
feature request
New feature or request
#2264
opened Jul 20, 2024 by
RonanKMcGovern
Add support for Mistral-Nemo
new model
Request for integration of new model
#2252
opened Jul 18, 2024 by
shaltielshmid
2 tasks done
Can't start server with small --max-total-tokens, but works fine with big setting
question
Further information is requested
#2246
opened Jul 18, 2024 by
rooooc
max_batch_size limit doesn't work well at queue.next_batch()
#2241
opened Jul 17, 2024 by
AndersWXJY
tool_calls gives sporadic EoF parsing errors
#2240
opened Jul 16, 2024 by
ArjunBhalla98
2 of 4 tasks
Can I somehow change attention type from 'FlashAttention' in the text-server-launcher?
question
Further information is requested
#2239
opened Jul 16, 2024 by
wasifmasood