mozilla-ai/lm-evaluation-harness (Public, forked from EleutherAI/lm-evaluation-harness)
Branch: main (all users, all time)
Commit History
Commits on Mar 12, 2024
cli_evaluate calls simple_evaluate with the same verbosity. (EleutherAI#1563) · Wongboo committed Mar 12, 2024 · 49695e8
Commits on Mar 11, 2024
AGIEval (EleutherAI#1359) · haileyschoelkopf and Sparkier committed Mar 11, 2024 · a3e56af
add Arabic EXAMS benchmark (EleutherAI#1498) · khalil-Hennara and lintangsutawika committed Mar 11, 2024 · 4ab0759
Update ifeval.yaml (EleutherAI#1506) · haileyschoelkopf committed Mar 11, 2024 · 282b9e7
Update generate_until_template_yaml (EleutherAI#1546) · haileyschoelkopf committed Mar 11, 2024 · a79a7c3
Commits on Mar 10, 2024
Support jinja templating for task descriptions (EleutherAI#1553) · HishamYahya and haileyschoelkopf committed Mar 10, 2024 · 3bdf25e
Commits on Mar 9, 2024
Fix incorrect max_gen_toks generation kwarg default in code2_text. (EleutherAI#1551) · cosmo3769 committed Mar 9, 2024 · f518228
Add compatibility for vLLM's new Logprob object (EleutherAI#1549) · Yard1 and haileyschoelkopf committed Mar 9, 2024 · 8051d95
Commits on Mar 6, 2024
Update installation commands in openai_completions.py and contributing document and, update wandb_args description (EleutherAI#1536) · naem1023 and haileyschoelkopf committed Mar 6, 2024 · 9e6e240
Cleanup and fixes (Task, Instance, and a little bit of *evaluate) (EleutherAI#1533) · LSinev and haileyschoelkopf committed Mar 6, 2024 · 4ee1b38
update printed num-fewshot ; prevent fewshots from erroneously being used by cot which hardcodes fewshot prompt (EleutherAI#1502) · haileyschoelkopf committed Mar 6, 2024 · 0270505
Update docs on LM.loglikelihood_rolling abstract method (EleutherAI#1532) · haileyschoelkopf committed Mar 6, 2024 · 525b8f5
Adding new task : KorMedMCQA (EleutherAI#1530) · sean0042 committed Mar 6, 2024 · faee1ad
Add WMDP Multiple-choice (EleutherAI#1534) · justinphan3110 and lintangsutawika committed Mar 6, 2024 · 29b2b01
Add EQ-Bench as per EleutherAI#1459 (EleutherAI#1511) · pbevan1 committed Mar 6, 2024 · c5acce0
Commits on Mar 5, 2024
Add a new task GPQA (the part CoT and generative) (EleutherAI#1482) · uanu2002 and haileyschoelkopf committed Mar 5, 2024 · 01108ac
Openllm benchmark (EleutherAI#1526) · baberabb committed Mar 5, 2024 · 8a875e9
Commits on Mar 4, 2024
Fix minor edge cases (EleutherAI#951 EleutherAI#1503) (EleutherAI#1520) · haileyschoelkopf committed Mar 4, 2024 · 292e581
Hotfix: fix TypeError in --trust_remote_code (EleutherAI#1517) · haileyschoelkopf committed Mar 4, 2024 · 4582391
French Bench (EleutherAI#1500) · ManuelFay and haileyschoelkopf committed Mar 4, 2024 · 48476c4
Cleaning up unused unit tests (EleutherAI#1516) · veekaybee committed Mar 4, 2024 · 4eba9cf
Commits on Mar 3, 2024
Setting trust_remote_code to True for HuggingFace datasets compatibility (EleutherAI#1487) · veekaybee committed Mar 3, 2024 · 9516792
Vllm update DP+TP (EleutherAI#1508) · baberabb committed Mar 3, 2024 · e5e35fc
Commits on Mar 1, 2024
modify WandbLogger to accept arbitrary kwargs (EleutherAI#1491) · baberabb committed Mar 1, 2024 · ae79b12
Improve data-parallel request partitioning for VLLM (EleutherAI#1477) · haileyschoelkopf committed Mar 1, 2024 · 27a3da9
always include EOS token in stopsequences if possible (EleutherAI#1480) · haileyschoelkopf committed Mar 1, 2024 · 284dd80
Add multilingual truthfulqa targets (EleutherAI#1499) · jordane95 committed Mar 1, 2024 · d272c19
Commits on Feb 28, 2024
fix duplicated kwargs in some model init (EleutherAI#1495) · lchu-ibm committed Feb 28, 2024 · b177c82
Commits on Feb 27, 2024
Fix AttributeError in huggingface.py When 'model_type' is Missing (EleutherAI#1489) · richwardle and haileyschoelkopf committed Feb 27, 2024 · cc771ec
update name of val split in truthfulqa multilingual (EleutherAI#1488) · haileyschoelkopf committed Feb 27, 2024 · a08eb87
add multilingual mmlu eval (EleutherAI#1484) · jordane95 committed Feb 27, 2024 · 7cd004c
Refactor evaluater.evaluate (EleutherAI#1441) · baberabb and haileyschoelkopf committed Feb 27, 2024 · 5ccd65d
Commits on Feb 26, 2024
Cont metrics (EleutherAI#1475) · lintangsutawika and haileyschoelkopf committed Feb 26, 2024 · 96d185f
Create a means for caching task registration and request building. Ad… (EleutherAI#1372) · inf3rnus and haileyschoelkopf committed Feb 26, 2024 · 1e6c927
Revert "setting trust_remote_code (EleutherAI#1467)" (EleutherAI#1474) · haileyschoelkopf committed Feb 26, 2024 · f6befdb