But (-i)^2=-1 as well. So we still need a convention to distinguish i from -i.
That’s fairly simple: we restrict the complex phase to the range (-pi, pi], and the principal square root is the one that halves that phase. -1 has phase pi, so its principal square root has phase pi/2, which is i. The other root, -i, has phase -pi/2, which is not half the phase of -1 under this convention, so it is not the principal one.
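If it helps, here is a minimal Python sketch of that convention, using only the standard `cmath` module. The helper name `principal_sqrt` is made up for illustration; the library's own `cmath.sqrt` follows the same branch choice. One caveat: `cmath.phase` can return -pi for inputs with a negative-zero imaginary part, so the range is (-pi, pi] only for ordinary inputs like the ones here.

```python
import cmath


def principal_sqrt(z: complex) -> complex:
    """Halve the phase of z (hypothetical helper, for illustration only)."""
    r, phi = cmath.polar(z)          # modulus and phase of z
    return cmath.rect(r ** 0.5, phi / 2)  # sqrt of modulus, half the phase


root = principal_sqrt(-1)

# The phase of -1 is pi, so the halved phase is pi/2, i.e. the root is i
# (up to floating-point rounding in the real part).
print(cmath.phase(-1))
print(root)

# -i also squares to -1, but its phase is -pi/2, not pi/2.
print(cmath.phase(-1j))
print((-1j) ** 2)
```

Both i and -i square to -1; the phase restriction is only what picks i as "the" square root.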