Tonality is a sound quality metric aimed at identifying and quantifying the strength of tones in a given noise spectrum. Ideally, tonality metrics should align well with the human perception of tones and help the user to differentiate between tones which may be objectionable and those which may not be apparent to the listener.
There are several different approaches to the formulation of tonality, and recent advances have added some new capability for this metric. This article will review two popular forms of the tonality metric and highlight their differences and respective strengths.
This article has the following sections:
1. Introduction
1.1. “Classic” Tonality (Aures/Terhardt)
1.1.1. Formulation & Values
1.1.2. Classic Tonality in Simcenter Testlab: Increasing tone amplitude
1.1.3. Classic Tonality in Simcenter Testlab: Mixing tones and broadband noise
1.2. Psychoacoustic Tonality
1.2.1. Formulation & Values
1.2.2. Psychoacoustic Tonality in Simcenter Testlab Neo (rev 2019.1.2 on)
2. Tonality Comparison
2.1. Threshold of hearing / Sensitivity to loudness
2.2. Human perception
2.3. Frequency information
3. Psychoacoustic Tonality in Simcenter Testlab Neo
3.1. Tonality
3.1.1. Versus Time
3.1.2. Versus Frequency
3.2. Tonality Frequency
3.3. Tonality Map
1. Introduction
In Tonality calculations, a frequency tone is compared to the surrounding background sound. If high enough relative to the background, the sound is considered tonal. The higher the tonal value, the more perceivable the tone.
1.1. “Classic” Tonality (Aures/Terhardt)
So-called “classic” tonality is based on publications by Aures and Terhardt and is designed to evaluate whether tones are present in a noise spectrum. Classic tonality produces a value in “tonality units” (t.u.) on a scale from 0 to 1.
1.1.1. Formulation and Values
The value of t.u. provides an assessment of the relative weight of the tonal components of the noise compared to the rest of the spectrum: 0.0 t.u. represents a noise with no discrete tones, and 1.0 t.u. is defined as a 60 dB sine tone at 1 kHz with no other noise present.
The technique used to identify tones is illustrated in Figure 1. The model first searches the spectrum for spectral lines that are larger than both immediately neighboring lines (Si ± 1); this is Criterion 1. Of the lines fulfilling Criterion 1, only those that are at least 7 dB larger than both the 2nd and 3rd neighboring lines (Si ± 2 and Si ± 3) qualify as pure tones; this is Criterion 2. The seven-line groups with indexes i-3 to i+3 around each line satisfying both criteria are identified as the tonal components of the spectrum.
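The two criteria can be sketched in a few lines of Python. This is only an illustration of the rule as described above, not the Simcenter Testlab implementation: the function names, the example spectrum, and the choice to skip lines that lack three neighbors on each side are all assumptions.

```python
def find_tonal_lines(levels_db, margin_db=7.0):
    """Return indexes of spectral lines that satisfy both criteria.

    levels_db: spectral line levels in dB (illustrative input).
    Lines without three neighbors on each side are skipped (an assumption).
    """
    tonal = []
    for i in range(3, len(levels_db) - 3):
        s = levels_db[i]
        # Criterion 1: larger than both immediate neighbors (i-1, i+1).
        if not (s > levels_db[i - 1] and s > levels_db[i + 1]):
            continue
        # Criterion 2: at least margin_db above the 2nd and 3rd neighbors
        # on both sides (i-3, i-2, i+2, i+3).
        if all(s - levels_db[i + k] >= margin_db for k in (-3, -2, 2, 3)):
            tonal.append(i)
    return tonal


def tonal_groups(levels_db, margin_db=7.0):
    """Seven-line groups (i-3 .. i+3) around each qualifying line."""
    return [list(range(i - 3, i + 4))
            for i in find_tonal_lines(levels_db, margin_db)]


# Hypothetical spectrum: one clear peak at index 5 over a ~40 dB floor.
spectrum = [40, 41, 40, 42, 50, 60, 49, 41, 40, 42, 41, 40]
print(find_tonal_lines(spectrum))  # [5]
print(tonal_groups(spectrum))      # [[2, 3, 4, 5, 6, 7, 8]]
```

Because the test works line by line, changing the spectral resolution changes which lines are neighbors and therefore which peaks qualify.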
Figure 1: Process used by Terhardt/Aures to identify tones in a spectrum. Only tones meeting certain criteria contribute to the tonality value.
Once all tonal components and their neighboring lines have been identified, a new spectrum, free of tonal components, is built by removing the tonal groups. From the two spectra, the fraction of the total loudness due to tonal components is calculated. This fraction is denoted by WN.
An extra weighting function, WT, is determined from the pitch weights of the tonal components relevant to pitch perception. This is a frequency-dependent function such that the perception of tonality is maximal at 700 Hz.
Finally, a constant term C is included to normalize the results, such that a 1 kHz pure sine tone at 60 dB gives a tonality of 1 t.u.
The final value of tonality, K, is given by Equation 1 below.
Equation 1: Tonality Equation
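For reference, the Aures model is commonly quoted as a product of the three terms described above, with an empirical exponent on each weighting. The form below uses this article's symbols. The exponents, and the expression of WN through the loudness N of the full spectrum and the loudness of the spectrum with the tonal groups removed (written here with the illustrative symbol N_rest), follow that commonly cited formulation and should be checked against the original publication.

```latex
K = C \cdot W_T^{0.29} \cdot W_N^{0.79},
\qquad
W_N = 1 - \frac{N_{\mathrm{rest}}}{N}
```

With no tonal components, N_rest equals N, so WN and therefore K are zero, which matches the 0.0 t.u. end of the scale.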
Some notes regarding this method:
As the method is analyzing individual spectral lines, it follows that the spectral resolution can have a considerable effect on identifying tones in the spectrum.
The weighting given to the tonal components in the Aures method, WT, is a function of frequency, with the maximum perception of tonality assigned to 700 Hz. This does not necessarily match the most recent research on human perception.
Only a single tonal component per critical band is assessed. Once a tone is identified in a particular critical band, any other tones present in this band will be missed.
Once all discrete tones have been identified, a search is then made for narrowband noise that may contribute to the subjective impression of tonality. The algorithm looks for noises with elevated levels and a bandwidth smaller than the critical bandwidth at the frequency being investigated. This search can still fail to identify banded tonal noise, as will be shown in Figure 7.
1.1.2. Classic Tonality in Simcenter Testlab: Increasing Tone Amplitude
As the tonality value is based in large part on the ratio of tonal loudness to the loudness of the rest of the spectrum, the louder a tone gets relative to the background, the higher its t.u. value.
Video 1: Classic Tonality on Increasing Tone Amplitude over Constant Broadband
Figure 2. Classic tonality calculated for a tone of increasing magnitude (left); classic tonality vs. time for the same 6 tones (right). The louder the tone gets above the background noise, the more tonality units are calculated.
1.1.3. Classic Tonality in Simcenter Testlab: Mixing Tones and Broadband Noise
Classic tonality is formulated to give an impression of the relative power of the tones in a sound compared to the power of the background noise.
Video 2: Classic Tonality on Mixed Tone Signals
This is illustrated in Figure 3 below, where all three signals are present in the time history.
Figure 3: Classic tonality calculated for a mixture of random noise (left), random noise with tones mixed in (middle), and finally just tones (right).
So, purely random noise with no discrete tones should produce 0.0 t.u. (0% tonal), a mixture of tones and random noise should produce an intermediate value, for example 0.5 t.u. (50% tonal), and pure tones alone should produce a value of 1.0 t.u. (100% tonal).
1.2. Psychoacoustic Tonality
Psychoacoustic tonality was put forth in Annex G of ECMA 74 (2019, 17th Edition). Annex G expands the hearing model approach for specific loudness described in Annex F of ECMA 74 and applies the perception-model basis specifically for tonal noises.
1.2.1. Formulation and Values
The hearing model first transforms incoming sound pressure signals into psychoacoustic loudness. It incorporates several features of human hearing, including the influence of the outer and middle/inner ear, the threshold of hearing, and the frequency scale on which the brain registers frequency information. This scale is known as the Bark scale, and the hearing model expands on previous formulations by breaking the human auditory range into 53 overlapping bands, twice as many as were previously used. These bands are often referred to using the letter z, ranging from z = 0.5 to 26.5 and covering 20 Hz to 20 kHz (in Simcenter Testlab these bands are numbered 0-52). This aspect of the hearing model allows tonality to be calculated in specific frequency bands, as opposed to a single number for an entire spectrum.
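The band numbering can be made concrete with a short sketch. It assumes the 53 bands are spaced 0.5 Bark apart and that the Testlab band numbers 0-52 run in ascending order of z. The function names are hypothetical and no conversion to Hz is attempted.

```python
def bark_band_positions(step=0.5, count=53):
    """Band positions z = 0.5, 1.0, ..., 26.5 (53 values)."""
    return [step * (i + 1) for i in range(count)]


def band_number_to_z(band_number, step=0.5):
    """Map a band number 0-52 to its z value (assumed ascending order)."""
    if not 0 <= band_number <= 52:
        raise ValueError("band number must be in the range 0-52")
    return step * (band_number + 1)


z_values = bark_band_positions()
print(len(z_values), z_values[0], z_values[-1])  # 53 0.5 26.5
print(band_number_to_z(0), band_number_to_z(52))  # 0.5 26.5
```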
Once the signal is translated into specific loudness, the hearing model for tonality separates the tonal components of the signal from the rest via a sliding autocorrelation function. The perceived loudness of the tonal components is then compared to the loudness of broadband components for each critical band, and the prominence of the tones is calculated. The amount of tonality is reported in a new unit for tonality: t.u.HMS, to indicate that it is based on the hearing model. A tone receiving 0.1 t.u.HMS is considered barely perceivable, 0.4 t.u.HMS is where a tone starts to become problematic in terms of annoyance, and 0.8 t.u.HMS and greater will be considered highly problematic and highly perceivable.
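The guideline values above translate into a simple lookup. The sketch below is illustrative only: the function name, the label wording, and the treatment of values exactly on a boundary are assumptions, while the 0.1, 0.4 and 0.8 t.u.HMS levels come from the text.

```python
def describe_tonality(tu_hms):
    """Return a rough perception label for a psychoacoustic tonality value."""
    if tu_hms < 0.1:
        return "below the barely perceivable level"
    if tu_hms < 0.4:
        return "perceivable"
    if tu_hms < 0.8:
        return "problematic"
    return "highly problematic"


for value in (0.05, 0.1, 0.4, 0.8, 15.7):
    print(value, "->", describe_tonality(value))
```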
Video 3: Psychoacoustic Tonality with Threshold on Varying Amplitude Tone with Constant Background
Because the hearing model incorporates several aspects of the perception of loudness, such as the threshold of hearing, loudness-level-dependent masking and frequency-dependent masking, tonality metrics built using the hearing model are much better at accurately identifying multiple-tone effects and the tonal perception of narrow-band noise. Human perception of tones increases with increasing loudness, and psychoacoustic loudness captures this phenomenon. As a result, the amount of tonality calculated for a given sound is theoretically unlimited: the louder the tones, the higher the calculated t.u.HMS value.
1.2.2. Psychoacoustic Tonality in Simcenter Testlab Neo (rev 2019.1.2 on)
Where classic tonality gives an idea of the relative power of tones to broadband noise (as shown in Figure 3), psychoacoustic tonality based on the hearing model assigns a tonality value (in t.u.HMS) that corresponds to how perceivable the tone is, regardless of its magnitude compared to the background noise. If we take the same recording previously analyzed using classical tonality and compute psychoacoustic tonality, we get very different results, as shown in Figure 4.
Figure 4: Psychoacoustic tonality calculated for a mixture of random noise, random noise with tones mixed in, and finally just tones.
As can be seen in Figure 4, the tonality metric based on the hearing model produces values quite different from classical tonality. In the middle, where the tones are mixed on top of broadband noise, the t.u.HMS value (about 15 t.u.HMS) is almost as high as at the end of the recording, where there is no broadband noise (15.7 t.u.HMS). This is due to the level of the tones and their perceptibility. Any tone with a value greater than 0.1 t.u.HMS will be perceivable; the magnitude of the tonality value indicates just how perceivable the tone is. The t.u.HMS values indicate that the tones are almost exactly as perceivable when mixed with broadband noise as they are when there is nothing but tones in the sound.
This trend matches our perception, which is shown in the video below.
Video 4: Psychoacoustic Tonality on Mixed Tone Signals
2. Tonality Comparisons
How does the classic Aures/Terhardt Tonality compare to Psychoacoustic Tonality?
2.1. Sensitivity to Loudness
As has been previously mentioned, one aspect of the hearing model is the incorporation of the sensitivity to perceived loudness. This makes psychoacoustic tonality more realistic compared to classical tonality when it comes to how human beings perceive tones. Using the recorded combination of tones and broadband noise once again, we can see the sensitivity to loudness by scaling the signals and comparing classical tonality to psychoacoustic tonality. In the top pane of Figure 5, we see the original time history signal in red, and a scaled version (amplitude scaled by 0.5) in green. In the lower pane of Figure 5 we see classical tonality calculated for both functions.
Figure 5. Upper: Time histories of original and scaled (reduced) signals. Lower: Classical tonality plotted versus time for both functions. Classic tonality calculates the same values of t.u. for both versions.
The t.u. vs. time shows no difference between these two signals, as classical tonality only considers the relative power of the tones versus the power of the background noise. Since the tones and broadband noise were both scaled the same amount, the t.u. calculation remains constant.
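The arithmetic behind this is easy to check. In the sketch below the RMS values are made-up illustrative numbers, not taken from the recording: scaling both tone and noise by 0.5 lowers each level by about 6 dB but leaves their ratio unchanged, which is why a purely relative metric cannot distinguish the two signals.

```python
import math


def ratio_db(tone_rms, noise_rms):
    """Tone-to-noise ratio in dB from RMS amplitudes."""
    return 20.0 * math.log10(tone_rms / noise_rms)


def level_change_db(scale_factor):
    """Level change in dB caused by scaling an amplitude."""
    return 20.0 * math.log10(scale_factor)


# Illustrative RMS amplitudes (assumed values, arbitrary units).
tone_rms, noise_rms = 0.02, 0.005
scale_factor = 0.5  # same amplitude scaling as in Figure 5

original_ratio = ratio_db(tone_rms, noise_rms)
scaled_ratio = ratio_db(tone_rms * scale_factor, noise_rms * scale_factor)

print(round(original_ratio, 2), round(scaled_ratio, 2))  # 12.04 12.04
print(round(level_change_db(scale_factor), 2))           # -6.02
```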
However, in Figure 6 we see the psychoacoustic tonality based on the hearing model calculates a lower t.u.HMS value for the reduced signal.
Figure 6. Upper: Time histories of original and scaled (reduced) signals. Lower: Psychoacoustic tonality plotted versus time for both functions. The scaled function (green) produces fewer t.u.HMS, which matches human perception.
This is an important aspect of the hearing model, and why it matches our perception. As tones get louder, they become more perceivable as tones to our ear. This effect is captured by the psychoacoustic tonality calculation.
2.2. Narrow-band Tonal Noise
The hearing model also enables psychoacoustic tonality to correctly identify sounds which have a tonal character to them but are not pure tones, for example narrow bands of noise.
Video 5 : Psychoacoustic Tonality on Tone composed of Banded Noise
Classic tonality misses these noises because they are not tonal in their frequency content, even though they have a tonal character to the listener. One such example is shown in Figure 7. This plot shows a band of noise that rises above the background, about 200 Hz wide (3800-4000 Hz).
Figure 7. Narrow bands of broadband noise can have a tonal quality to a listener, despite no tones being present in the frequency spectrum. Classic tonality misses this type of noise, calculating 0.0 t.u. as shown in the legend.
As classical tonality first searches the frequency spectrum for pure tones, it does not find any frequency content that satisfies the tonality criteria. As such, the metric calculates 0.0 tonality units for this spectrum.
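An idealized example shows why the line-based criteria find nothing here. The sketch applies the two pure-tone criteria from section 1.1.1 to made-up spectra: a flat raised band, a raised band with a few dB of ripple, and a single line. It does not model the separate narrowband-noise search, and all names and levels are assumptions chosen for illustration.

```python
def is_pure_tone(levels_db, i, margin_db=7.0):
    """True if line i is a local peak and >= margin_db above lines i±2, i±3."""
    s = levels_db[i]
    is_peak = s > levels_db[i - 1] and s > levels_db[i + 1]
    is_prominent = all(s - levels_db[i + k] >= margin_db
                       for k in (-3, -2, 2, 3))
    return is_peak and is_prominent


def tones_in(levels_db):
    """Indexes of all lines passing both criteria (edges skipped)."""
    return [i for i in range(3, len(levels_db) - 3)
            if is_pure_tone(levels_db, i)]


floor = [40.0] * 10
flat_band = floor + [50.0] * 8 + floor          # raised band, no ripple
rippled_band = floor + [50, 52, 49, 51, 50, 52, 49, 51] + floor
single_tone = floor + [60.0] + floor            # one narrow line

print(tones_in(flat_band))     # []
print(tones_in(rippled_band))  # []
print(tones_in(single_tone))   # [10]
```

In the band spectra every candidate peak has 2nd or 3rd neighbors inside the same band at nearly the same level, so the 7 dB margin is never met.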
However, the noise has a tonal quality to a listener, which psychoacoustic tonality can detect (see Figure 8).
Figure 8. Psychoacoustic tonality calculated versus time for the same narrowband noise. The hearing model tonality is able to identify the tonal nature of the sound, which is around the “problematic” threshold, 0.4 t.u.HMS.
2.3. Frequency information
One of the key features of the hearing model on which psychoacoustic tonality is based is the processing of loudness in terms of overlapping critical bands – this feature treats the sound energy in each critical band separately as it relates to tonality, and as a result one can now extract the frequency information as well as the tonality information (see Figure 9).
Figure 9. Psychoacoustic tonality can also show how much tonality is present at which critical bands. This capability is not part of classical tonality.
In the top part of Figure 9 is the frequency spectrum of a well-known vacuum cleaner, “Brand A”. Several peaks are visible in the frequency spectrum (shown as vertical lines in the top plot). Knowing how much tonality is present in a given sound is the first step; psychoacoustic tonality goes further and shows the user at which frequencies the tonality occurs, and how much. The bottom plot of Figure 9 shows time-averaged psychoacoustic tonality versus frequency. Classical tonality does not provide any information about which frequencies are contributing to the perception of tonality.
Psychoacoustic tonality has several ways of showing the frequency information based on the tonality characteristics of the sound: tonality versus frequency, tonality frequency, and tonality map. These additional methods are covered in detail in the next section.
3. Psychoacoustic Tonality in Simcenter Testlab Neo
Several forms of psychoacoustic tonality based on the hearing model as set forth in Annex G of ECMA 74 (17th Edition, 2019) are available in Simcenter Testlab Neo as of revision 2019.1.2. There are three so-called Methods available for this metric in the Sound Quality section of the Process Designer Method Library as shown in Figure 10.
Figure 10. Psychoacoustic tonality is available in several forms in Simcenter Testlab Neo. The Methods are labeled with “t.u.HMS” to indicate they are formulated using the hearing model.
The following sections take a closer look at each of these methods and their associated settings in Simcenter Testlab. In each case the output of the method is shown using the vacuum cleaner recording discussed previously.
3.1. Tonality
The input to all the psychoacoustic tonality methods is time history data. The Tonality method has two main variants: plot t.u.HMS versus time (or RPM, or other tracking parameter) or versus frequency. In both cases other settings remain the same and are detailed in Figure 11 below.
Figure 11. Settings panel for psychoacoustic tonality methods in Simcenter Testlab Neo.
A. Tracking strategy: Default is Free Run with 75% overlap (fixed value). May also track on time or channel (RPM, etc.), where an increment can be specified.
B. Save single values: Returns the time-averaged value of all the maxima of tonality per time point through the 53 Bark bands.
C. Method: Default is ECMA-74:2019. Small revisions were made to the standard in 2019. The 2018 formulation is also available for historical comparison.
D. Frequency Limits: Defaults are 20 Hz and 20,000 Hz. The calculation can be restricted to a smaller band if desired.
E. Tonality versus: Default is Time (see section 3.1.1). Can also select Frequency (section 3.1.2).
F. Type: Default is Monaural, with each channel processed separately. Can also select Binaural, which takes the maximum tonality of the L and R ear when a stereo time history is fed into the method.
3.1.1. Tonality Versus Time
This method plots the maximum tonality value across all critical bands for each tracking value as shown in Figure 12.
Figure 12. Plot of Tonality method versus Time axis. Single number value is shown at the bottom.
The time average of these values is reported if the “Save single values” checkbox is turned on. This single value is the same regardless of which method is calculated.
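As a rough sketch of the reduction described above, assume the per-band results are available as a small table with one row per time step and one column per band. The three-band table and the function names are hypothetical, and this is not the Simcenter Testlab implementation.

```python
def tonality_vs_time(tonality_map):
    """Maximum tonality across all bands for each time step.

    tonality_map[t][b] is the tonality (t.u.HMS) at time step t, band b.
    """
    return [max(bands) for bands in tonality_map]


def single_value(tonality_map):
    """Time average of the per-time-step maxima."""
    maxima = tonality_vs_time(tonality_map)
    return sum(maxima) / len(maxima)


# Hypothetical data: 3 time steps x 3 bands (a real map has 53 bands).
example_map = [
    [0.0, 0.2, 0.1],
    [0.1, 0.6, 0.3],
    [0.0, 0.4, 0.7],
]

print(tonality_vs_time(example_map))  # [0.2, 0.6, 0.7]
print(round(single_value(example_map), 3))  # 0.5
```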
3.1.2. Tonality Versus Frequency
Changing the “Tonality versus” setting to Frequency plots the time-averaged tonality value for each Bark band, resulting in a tonality spectrum as shown in Figure 13.
Figure 13. Plot of Tonality method versus Frequency axis.
This helps identify which tones (by frequency) need to be addressed to lower the overall tonality.
3.2. Tonality Frequency
Tonality Frequency creates a 2-D plot of Frequency vs. Time (or tracking value) as shown in Figure 14.
Figure 14. Plot of Tonality Frequency method.
The frequency of the largest tone in a sound might change over time. With the Tonality Frequency plot, the frequency of the largest tone can be tracked over time (or versus specified tracking channel).
3.3. Tonality Map
The Tonality Map combines all the data produced by the other tonality methods into one plot as shown in Figure 15.
Figure 15. Output of the Tonality Map method showing value of tonality in each 0.5 Bark band for every time/tracking increment.
It is a 3-D map with Frequency vs. Time (or tracking parameter), with magnitude of tonality in t.u.HMS shown in color.
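One way to see how the map relates to the other outputs is to reduce a small map along each axis. The sketch below is an interpretation of the descriptions in sections 3.1.2 and 3.2, not the Simcenter Testlab implementation. The three-band map, the band frequencies, and the function names are made up for illustration.

```python
def tonality_vs_frequency(tonality_map):
    """Time-averaged tonality for each band (one value per column)."""
    n_steps = len(tonality_map)
    n_bands = len(tonality_map[0])
    return [sum(row[b] for row in tonality_map) / n_steps
            for b in range(n_bands)]


def tonality_frequency(tonality_map, band_frequencies_hz):
    """Frequency of the band with the largest tonality at each time step."""
    track = []
    for row in tonality_map:
        strongest_band = max(range(len(row)), key=row.__getitem__)
        track.append(band_frequencies_hz[strongest_band])
    return track


# Hypothetical data: 3 time steps x 3 bands, with assumed band frequencies.
example_map = [
    [0.0, 0.2, 0.1],
    [0.1, 0.6, 0.3],
    [0.0, 0.4, 0.7],
]
band_frequencies_hz = [500.0, 1000.0, 2000.0]

print([round(v, 3) for v in tonality_vs_frequency(example_map)])
print(tonality_frequency(example_map, band_frequencies_hz))
# [1000.0, 1000.0, 2000.0]
```

In this toy example the strongest band moves from 1000 Hz to 2000 Hz at the last time step, which is the kind of shift the Tonality Frequency plot is meant to reveal.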
Hopefully this explains how to use the Tonality functions of Simcenter Testlab Neo. Check out the Tonal Sound Metric Seminar for more information!