Your system is of the form $y'=f(t,y)$, with $y \in \mathbb{R}^m$ your state vector.
Applying Crank-Nicolson to that scheme leads to the following equation for the time step from $t_n$ to $t_{n+1}=t_n + \Delta t$:
$$ y_{n+1} = y_{n} + \frac{\Delta t}{2} \left( f(y_n) + f(y_{n+1}) \right) $$
This corresponds to the following nonlinear problem:
$$O = g(y_{n+1})$$
with:
$$g(y_{n+1}) = y_{n+1} - \frac{\Delta t}{2} f(y_{n+1}) - \underbrace{ ( y_{n} + \frac{\Delta t}{2} f(y_n) )}_{R}$$
The term $R$ does not depend on $y_{n+1}$.
The traditional way si to apply a Newton method, which linearises the problem iteratively to solve the nonlinear problem. Starting from the an initial guess $y_{n+1}^0$ (simplest guess is $y_n$), the $k$-th Newton iteration reads:
$$y_{n+1}^k = y_{n+1}^{k-1} - (\partial_y g)^{-1} g(y_{n+1}^{k-1}) $$
For a full Newton method, the Jacobian is evaluated at $y_{n+1}^k$, and is therefore updated at each iteration.
Very often for ODEs, it is sufficient to perform quasi-Newton iterations, where this Jacobian is only updated if the Newton converges poorly, e.g. if $||y_{n+1}^{k+1} - y_{n+1}^{k}||$ does not decrease quickly enough from one iteration to the next.
The Newton is ensured to converge for a sufficiently small time step.
The other approach you suggested is to linearise $f$ directly in the Crank-Nicolson scheme:
$$f(t_{n+1}, y_{n+1}) \approx f(y_{n}) + (\partial_y f)(y_n) (y_{n+1}-y_{n})$$
In that case, a single Newton iteration suffices, because $g$ is then linear:
$$\left(\mathbb{I} - \dfrac{\Delta t}{2}\partial_y f(y_n)\right) y_{n+1} = \left( \mathbb{I} - \dfrac{\Delta t}{2} \partial_y f(y_n)\right) y_n + \Delta t f(y_n)$$
However, that approach is typically much less robust and stable in the case of strongly nonlinear terms, because the linearisation if too crude of an approximation.
Remarks:
maybe an explicit algorithm can work very well for your problem if this nonlinear term you speak of is not stiff. If you have stiffness issues due to other terms (e.g. diffusive processes), try explicit integrators with extended stability domains (e.g. ROCK4).
in all cases, try to use existing softwares to integrate your system. Coding in Fortran ? Try Radau5, Dopri5. Coding in Matlab, try ode45, ode15s, ode32tb. Coding in Python, try the integrators from Scipy's solve_ivp
package.