mracidglee

Well, human voices typically have a fundamental frequency between roughly 70 and 400 Hz, so you can look for spectral peaks in that band. If you want more accuracy than that, you can analyze the spectrum for formants (the resonant peaks of the vocal tract).
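Here's a rough sketch of the spectral-peak check, in case it helps. It uses only the standard library and evaluates the DFT directly on the bins inside the 70-400 Hz band (with real audio you'd more likely use an FFT library). The function name `voice_band_peak`, the Hann window, and the idea of reporting the fraction of energy that falls in the band are my own assumptions for illustration, not a standard recipe, and any threshold you put on that fraction would need tuning on your own recordings.

```python
import cmath
import math


def voice_band_peak(samples, sample_rate, f_lo=70.0, f_hi=400.0):
    """Return (peak_frequency_hz, band_energy_fraction) for one frame.

    Hypothetical helper: finds the strongest DFT bin between f_lo and f_hi
    and reports how much of the frame's energy lies in that band.
    Expects at least two samples.
    """
    n = len(samples)
    # Hann window to reduce spectral leakage from the frame edges.
    windowed = [s * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(samples)]
    # Parseval: the sum of |X[k]|^2 over all bins equals n * sum(x^2).
    total = n * sum(x * x for x in windowed)

    k_lo = math.ceil(f_lo * n / sample_rate)
    k_hi = math.floor(f_hi * n / sample_rate)

    best_k, best_power, band_power = k_lo, 0.0, 0.0
    for k in range(k_lo, k_hi + 1):
        step = -2j * math.pi * k / n
        bin_value = sum(x * cmath.exp(step * i) for i, x in enumerate(windowed))
        power = abs(bin_value) ** 2
        band_power += power
        if power > best_power:
            best_k, best_power = k, power

    # Factor 2 accounts for the mirrored negative-frequency bins of a real signal.
    fraction = 2 * band_power / total if total > 0 else 0.0
    return best_k * sample_rate / n, fraction


if __name__ == "__main__":
    rate, n = 8000, 1024
    # Synthetic 150 Hz tone standing in for a voiced frame.
    frame = [math.sin(2 * math.pi * 150 * i / rate) for i in range(n)]
    print(voice_band_peak(frame, rate))
```

With a 1024-sample frame at 8 kHz the bins are about 7.8 Hz apart, so the reported peak is only accurate to within a bin. A longer frame gives finer resolution at the cost of time resolution.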

If the person isn't articulating - just going "ooooooo", "eeeeeeeee", or "ssssssss" - your job gets much harder.