The answer depends on the voice compression scheme in use.
The most common compression scheme used for networked voice audio is mu-law. It is based on a companding operation that compresses the dynamic range during encoding and expands it again on playback. Each sample is represented by an 8-bit number, and audio is sampled at 8000 samples per second (8 kHz). The resulting data rate is 64 kbit/s. The quality of mu-law encoded audio is good for voice.
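As a rough illustration of the companding idea and of the 64 kbit/s figure, here is a minimal Python sketch. It uses the continuous mu-law curve with mu = 255; deployed codecs typically use a segmented approximation of this curve, so the 8-bit code mapping below is an assumption for illustration only, and all function names are hypothetical.

```python
import math

MU = 255  # companding parameter commonly used for 8-bit mu-law


def mulaw_compress(x: float) -> float:
    """Compress a sample in [-1.0, 1.0] using the continuous mu-law curve."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def mulaw_expand(y: float) -> float:
    """Invert mulaw_compress, restoring the original dynamic range."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


def encode_8bit(x: float) -> int:
    """Map a sample to one of 256 codes (illustrative uniform mapping of the
    compressed value, not the bit layout of any real codec)."""
    y = mulaw_compress(x)
    return min(255, int((y + 1.0) / 2.0 * 256))


def decode_8bit(code: int) -> float:
    """Reconstruct a sample from the centre of the code's interval."""
    y = (code + 0.5) / 256 * 2.0 - 1.0
    return mulaw_expand(y)


# Basic audio data rate: 8000 samples/s * 8 bits/sample
MULAW_SAMPLE_RATE_HZ = 8000
MULAW_BITS_PER_SAMPLE = 8
mulaw_rate_bps = MULAW_SAMPLE_RATE_HZ * MULAW_BITS_PER_SAMPLE  # 64000
```

Because the curve is logarithmic, quiet samples are reconstructed with a much smaller error than a uniform 8-bit quantizer would give, which is why 8 bits per sample is adequate for speech.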
A less common compression scheme for networked audio is CVSD (Continuously Variable Slope Delta). This scheme encodes the difference in signal level between the current sample and the previous sample. The result is a much lower data rate of 16 kbit/s. A number of CVSD variants are in use within the simulation and training industry (Mil-Std-188, CCTT, and CECOM).
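The delta-encoding idea can be sketched as a simplified adaptive delta modulator in Python. This is not an implementation of the Mil-Std-188, CCTT, or CECOM variants: the step sizes, decay factor, and run length are assumed values, the class and function names are hypothetical, and the 16 kHz sample rate is an assumption chosen to match the 16 kbit/s rate at one bit per sample.

```python
class CvsdState:
    """Predictor state. Encoder and decoder must update it identically."""

    def __init__(self, min_step=0.01, max_step=0.2, decay=0.9, run=3):
        self.estimate = 0.0      # running estimate of the signal level
        self.step = min_step     # current slope (step size)
        self.min_step, self.max_step = min_step, max_step
        self.decay, self.run = decay, run
        self.history = []        # most recent output bits

    def advance(self, bit):
        """Adapt the step size from recent bits, then move the estimate."""
        self.history = (self.history + [bit])[-self.run:]
        if len(self.history) == self.run and len(set(self.history)) == 1:
            # A run of identical bits means the estimate is lagging: grow the step.
            self.step = min(self.max_step, self.step + self.min_step)
        else:
            # Otherwise let the step decay back toward its minimum.
            self.step = max(self.min_step, self.step * self.decay)
        self.estimate += self.step if bit else -self.step
        return self.estimate


def cvsd_encode(samples):
    """Emit one bit per sample: 1 if the sample is at or above the estimate."""
    state, bits = CvsdState(), []
    for s in samples:
        bit = 1 if s >= state.estimate else 0
        state.advance(bit)
        bits.append(bit)
    return bits


def cvsd_decode(bits):
    """Rebuild the signal estimate by replaying the same state updates."""
    state = CvsdState()
    return [state.advance(bit) for bit in bits]


# One bit per sample at an assumed 16 kHz sample rate
CVSD_SAMPLE_RATE_HZ = 16000
CVSD_BITS_PER_SAMPLE = 1
cvsd_rate_bps = CVSD_SAMPLE_RATE_HZ * CVSD_BITS_PER_SAMPLE  # 16000
```

Only the direction of change is transmitted, so each sample costs a single bit; the decoder recovers the waveform by applying the same step adaptation to the received bit stream.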
The above data rates are the basic audio data rates. There is additional load from the networking header data that is prepended to each block of data samples. This increases the effective bandwidth for a voice stream such that:
The above numbers are conservative approximations useful for network loading calculations.
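For readers who want to estimate the header overhead for their own network, the calculation can be sketched in Python as follows. The function name is hypothetical, and the inputs in the usage lines are illustrative assumptions, not figures from this answer: 20 ms of audio per packet, and 28 bytes of header per packet (a 20-byte IPv4 header without options plus an 8-byte UDP header), with no link-layer framing or application-level voice header included.

```python
def effective_bandwidth_bps(codec_bps, payload_bytes, header_bytes):
    """Return the on-the-wire bit rate of one voice stream.

    codec_bps     -- basic audio data rate in bits per second
    payload_bytes -- audio bytes carried in each packet
    header_bytes  -- header bytes prepended to each packet
    """
    packets_per_second = codec_bps / 8 / payload_bytes
    return packets_per_second * (payload_bytes + header_bytes) * 8


# Assumed header size: IPv4 (20 bytes, no options) + UDP (8 bytes)
ASSUMED_HEADER_BYTES = 20 + 8

# 20 ms of mu-law audio at 64 kbit/s is 160 bytes of payload
mulaw_effective = effective_bandwidth_bps(64000, 160, ASSUMED_HEADER_BYTES)

# 20 ms of CVSD audio at 16 kbit/s is 40 bytes of payload
cvsd_effective = effective_bandwidth_bps(16000, 40, ASSUMED_HEADER_BYTES)
```

With these assumed values the mu-law stream comes to 75,200 bit/s and the CVSD stream to 27,200 bit/s. The same fixed header is a proportionally larger cost for the lower-rate codec, and smaller packets or additional protocol headers raise the overhead further.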