It's pretty self-explanatory, surely?
In the manufacturer's view,
Music suits music best,
Voice works best for programme material that is mostly voice (e.g. Talk Radio, 5 Live, Radio 4, TV soaps and chat shows).
Cinema is the tonal tweak they think best suits film and action TV viewing.
Standard is the sound of the device with no graphic-equaliser cuts or boosts applied to the bass, midrange or high frequencies.
Was that really your question, or were you looking for some different information? If so, have another go at phrasing the question.