Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: generate 1wg-charters files via celery #7428

Merged
merged 12 commits into from
May 16, 2024

Conversation

jennifer-richards
Copy link
Member

@jennifer-richards jennifer-richards commented May 16, 2024

The main goal of this PR is to update the 1wg-charters.txt and 1wg-charters-by-acronym.txt files via celery tasks instead of cron jobs. The cron job worked by using wget to retrieve a datatracker URL that generated the files. This was done once per hour and cached for an hour each time using the slowpages cache.

As a result, the slowpages cache is effectively always holding on to the data generated by the last run of bin/hourly and the view itself is always served from that cache except when the cron job regenerates it.

This seems needlessly baroque, so I've refactored the work at generating the charters files entirely out of the views and into a task. The views then simply grab the file the task updates and hand that back in their response.

The new method has a bunch of benefits. It avoids collecting the group data twice (the views gathered the same data twice, the task does it only once). It avoids the small chance that the charter data is occasionally an extra hour out of date depending on system load (the cron job might make its request at very slightly less than an hour since the previously cached value). It entirely moves the purportedly expensive data gathering out of the view.

Important: I did a careful comparison of the output of the old and new code and there are a couple of differences. Most notably, the groups in 1wg-charters.txt are sorted alphabetically by acronym within each area. Before, they were (to my eye) random within the area. This seems like an improvement to me, but it is a change.

Second, there were some \r\n line endings in the old output - I assume they were bleeding in from the charters. These are converted to \n by the new rendering path. This seems like an absolute win to me, but again it's a change.

Also important: We reviewed the \r (aka CR) changes and found it was caused by CRLFs in some old draft titles, not in the charters. Added a clean_whitespace filter to the title in the template to fix this. That replaces any run of ASCII whitespace / control characters (i.e., 0x00-0x1F) with a single space, then strips leading and trailing space. This ends up improving spacing / inappropriate line breaks in a couple dozen draft titles.

@jennifer-richards
Copy link
Member Author

jennifer-richards commented May 16, 2024

Draft right now because I need to update tests

@jennifer-richards jennifer-richards marked this pull request as ready for review May 16, 2024 14:51
Copy link

codecov bot commented May 16, 2024

Codecov Report

Attention: Patch coverage is 92.85714% with 6 lines in your changes are missing coverage. Please review.

Project coverage is 88.92%. Comparing base (187c2c5) to head (a08a02f).
Report is 199 commits behind head on main.

Current head a08a02f differs from pull request most recent head 11b4be5

Please upload reports for the commit 11b4be5 to get more accurate results.

Files Patch % Lines
ietf/group/utils.py 87.75% 6 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #7428      +/-   ##
==========================================
- Coverage   88.98%   88.92%   -0.07%     
==========================================
  Files         291      294       +3     
  Lines       40717    41112     +395     
==========================================
+ Hits        36233    36557     +324     
- Misses       4484     4555      +71     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@rjsparks
Copy link
Member

rjsparks commented May 16, 2024

Acknowledging that there will be ordering and whitespace changes. That might disrupt some people doing diffs for a day, but the tradeoff seems the right one.

…arters

# Conflicts:
#	bin/hourly
#	ietf/utils/management/commands/periodic_tasks.py
@rjsparks rjsparks merged commit a5f44df into ietf-tools:main May 16, 2024
7 checks passed
@jennifer-richards jennifer-richards deleted the 1wg-charters branch May 16, 2024 20:08
@github-actions github-actions bot locked as resolved and limited conversation to collaborators May 22, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
2 participants