Pull requests: vllm-project/vllm
[TPU] Support collective communications in XLA devices
ready
tpu
Related to Google TPUs
#6813
opened Jul 26, 2024 by
WoosukKwon
[Misc] Support TPU in initialize_ray_cluster
ready
tpu
Related to Google TPUs
#6812
opened Jul 26, 2024 by
WoosukKwon
[Build/CI][ROCm] Minor simplification to Dockerfile.rocm
ready
rocm
#6811
opened Jul 26, 2024 by
WoosukKwon
[Bugfix] Make defaulting OMP_NUM_THREADS to 1 have the desired impact
#6802
opened Jul 25, 2024 by
tjohnson31415
Draft
[BugFix] skip loading lm_head for llama if word embeddings are tied
#6796
opened Jul 25, 2024 by
prashantgupta24
[Kernel] Increase precision of GPTQ/AWQ Marlin kernel
ready
#6795
opened Jul 25, 2024 by
alexm-neuralmagic
[CI] Reproduce SGLANG benchmark results
nightly-benchmarks
#6794
opened Jul 25, 2024 by
KuntaiDu
[Bugfix] Allow vllm to still work if triton is not installed.
#6786
opened Jul 25, 2024 by
tdoublep
[Bugfix][Model] Jamba assertions and no chunked prefill by default for Jamba
ready
#6784
opened Jul 25, 2024 by
tomeras91
[BugFix][Speculative Decoding] Fixes the generation token numbers with sps
#6782
opened Jul 25, 2024 by
sighingnow
[Bugfix] [Easy] Fixed a bug in the multiprocessing GPU executor.
#6770
opened Jul 25, 2024 by
eaplatanios
[ Kernel ] Add Fused Layernorm + Dynamic-Per-Token Quant Kernels
ready
#6763
opened Jul 24, 2024 by
varun-sundar-rabindranath
[Bugfix][Model] Skip loading lm_head weights if using tie_word_embeddings
#6758
opened Jul 24, 2024 by
tjohnson31415
[Frontend] New allowed_token_ids decoding request parameter
#6753
opened Jul 24, 2024 by
njhill
[Bugfix]: use PretrainedConfig to communicate config objects with trust remote code
#6751
opened Jul 24, 2024 by
tjohnson31415
Draft
[CI/Build] Build wheel in release model when sccache is not enabled
ready
#6710
opened Jul 23, 2024 by
zifeitong
[CI] [nightly benchmark] Do not re-download sharegpt dataset if exists
nightly-benchmarks
ready
#6706
opened Jul 23, 2024 by
cadedaniel