In your example, "flavour enhancer" is followed by a colon and parentheses. Presumably the term covers monosodium glutamate and nucleotide seasoning, but not salt, which falls outside the parentheses. The word "flavouring" is also followed by a colon but has no parentheses, so it is not clear what that term includes.
I don't think you will have much success telling a computer how to handle text that cannot be reliably handled by a human (a fairly stupid human, for that matter).
On second thought, if you ignore the internal hierarchy, you could treat the colon and the parentheses as separators - equal to the comma. That would give you a list of:
- flavour enhancer
- monosodium glutamate
- nucleotide seasoning
- curry powder
- contains turmeric
- sesame oil
- hydrolyzed vegetable protein
- white dextrin
- vitamin B2
which could be further improved by stripping out filler words like "contains", so that "contains turmeric" becomes just "turmeric".
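
For illustration only, here is a rough sketch of the flattening idea in Python (not necessarily the tool you are working in). The function name `flatten_ingredients`, the `FILLER_PREFIXES` tuple and the sample string are all made up for this example; your real label text will differ, and the list of filler words would need to grow as you meet new ones.

```python
import re

# Filler words to drop from the start of an item.
# This list is an assumption; extend it to suit your data.
FILLER_PREFIXES = ("contains",)


def flatten_ingredients(text):
    """Split a label into a flat list, ignoring the internal hierarchy."""
    # Treat commas, colons and both parentheses as equivalent separators.
    parts = re.split(r"[,:()]", text)
    items = []
    for part in parts:
        item = part.strip()
        # Remove a leading filler word such as "contains".
        for word in FILLER_PREFIXES:
            if item.lower().startswith(word + " "):
                item = item[len(word):].strip()
        # Skip the empty fragments left between adjacent separators.
        if item:
            items.append(item)
    return items


# Hypothetical sample text, loosely modelled on the list above.
sample = (
    "flavour enhancer: (monosodium glutamate, nucleotide seasoning), "
    "curry powder (contains turmeric), sesame oil"
)
print(flatten_ingredients(sample))
```

Note that this deliberately throws away the information about which ingredient belongs to which group, so it only helps if a flat list is all you need.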