nzilbb.vowels 0.3.1

linguistics
vowels
pca
Author

Joshua Wilson Black

Published

December 1, 2024

The nzilbb.vowels R package has just been released to CRAN. This is the first release on CRAN and contains a few modifications from the version which was available on GitHub and used in Wilson Black et al. (2022).

plot_pc_vs() and pc_flip()

The model-to-PCA workflow for vocalic data can produce results which require a lot of mental ‘axis flipping’. First, the convention is to plot the axes in reverse in vowel space diagrams. Second, when comparing two PCA analyses with one another, one might have positive loadings where the other has negative loadings, even though the sign of a PC is arbitrary.
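To see why the sign of a PC is arbitrary, here is a minimal base R sketch. It uses the built-in USArrests data rather than vowel data, and the `reconstruct()` helper is just for illustration. Negating a component's loadings and scores together leaves the reconstructed data unchanged.

```r
# Fit a PCA on a data set that ships with base R (not vowel data).
pca <- prcomp(USArrests, scale. = TRUE)

# Negate PC1 in both the loadings and the scores.
flipped <- pca
flipped$rotation[, 1] <- -pca$rotation[, 1]
flipped$x[, 1] <- -pca$x[, 1]

# Illustrative helper: rebuild the scaled data from scores and loadings.
reconstruct <- function(p) p$x %*% t(p$rotation)

# TRUE if both orientations describe exactly the same data.
sign_is_arbitrary <- isTRUE(all.equal(reconstruct(pca), reconstruct(flipped)))
sign_is_arbitrary
```

Both orientations are equally valid descriptions of the data, so which one a given analysis returns carries no information.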

To quickly solve the ‘loadings-to-vowel space’ problem, I’ve added a function called plot_pc_vs(), which plots a principal component generated by prcomp, princomp or nzilbb.vowels::pca_test() in the vowel space. For instance:

library(dplyr)
library(ggplot2)
library(nzilbb.vowels)

onze_vowels <- onze_vowels |>
  # Apply Lobanov normalisation.
  lobanov_2() |>
  # Move the normalised values towards the front of the data frame.
  relocate(speaker, vowel, F1_lob2, F2_lob2)

pca_data <- onze_intercepts |>
  select(-speaker)

onze_test <- pca_test(
  pca_data,
  n = 500,
  scale = TRUE,
  variance_confint = 0.95,
  loadings_confint = 0.9
)

plot_pc_vs(onze_vowels, onze_test, pc_no = 1, is_sig = TRUE) +
  coord_fixed()
Figure 1: Significant PC1 loadings for subset of ONZE speakers in vowel space.

The arrows in Figure 1 indicate the relative size of movement, but do not indicate the exact magnitude of movement in the vowel space expected for a specific increase in PC score. How to recover that magnitude will vary from one research project to the next and will depend on how any models have been used. The purpose of plot_pc_vs() is to provide a quick way of getting from PC loadings to movement in the vowel space.
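The general idea of going from loadings to arrows can be sketched in base R. This is not the implementation of plot_pc_vs(). The loadings and vowel means below are made-up numbers, and I am assuming variable names of the form `F1_TRAP`, so that each vowel contributes one F1 loading and one F2 loading.

```r
# Hypothetical PC1 loadings, named <formant>_<VOWEL>.
loadings <- c(F1_TRAP = 0.5, F2_TRAP = -0.1, F1_DRESS = 0.4,
              F2_DRESS = -0.3, F1_KIT = -0.2, F2_KIT = 0.6)

# Hypothetical mean normalised position of each vowel.
means <- data.frame(
  vowel = c("TRAP", "DRESS", "KIT"),
  F1 = c(1.2, 0.1, -0.4),
  F2 = c(0.3, 1.0, 0.2)
)

# Split each variable name into its formant and vowel parts.
parts <- strsplit(names(loadings), "_", fixed = TRUE)
long <- data.frame(
  formant = vapply(parts, `[`, character(1), 1),
  vowel = vapply(parts, `[`, character(1), 2),
  loading = unname(loadings)
)

# One row per vowel, holding its F1 loading and its F2 loading.
f1 <- long[long$formant == "F1", c("vowel", "loading")]
f2 <- long[long$formant == "F2", c("vowel", "loading")]
by_vowel <- merge(f1, f2, by = "vowel", suffixes = c("_F1", "_F2"))
arrows <- merge(means, by_vowel, by = "vowel")

# Arrow end = mean position + loading. The unit scaling is arbitrary,
# which is why arrow lengths are only meaningful relative to each other.
arrows$F1_end <- arrows$F1 + arrows$loading_F1
arrows$F2_end <- arrows$F2 + arrows$loading_F2
arrows
```

Each row gives a start point (the vowel mean) and an end point for one arrow, ready to be drawn with reversed axes.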

pc_flip() allows you to flip a named PC, or to specify a variable whose loading you want to be positive. So, for instance, if you know you want TRAP F1 to have a positive loading on PC1, you’d do the following:

onze_test <- pc_flip(onze_test, pc_no = 1, flip_var = "F1_TRAP")

As with plot_pc_vs(), pc_flip() works with prcomp, princomp or nzilbb.vowels::pca_test().

We can use plot_pc_vs() again to see that this has worked.

plot_pc_vs(onze_vowels, onze_test, pc_no = 1, is_sig = TRUE) +
  coord_fixed()
Figure 2: Significant PC1 loadings for subset of ONZE speakers in vowel space (flipped).
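For a sense of what flipping on a variable involves, here is a base R sketch for prcomp objects only. `flip_to_positive()` is a hypothetical helper, not the package's code, and the data are random numbers with vowel-style column names rather than ONZE data. Whichever orientation the PCA comes in, the result is the same.

```r
set.seed(42)

# Simulated stand-in for per-speaker intercepts (random values).
sim_data <- as.data.frame(matrix(rnorm(200 * 4), ncol = 4))
names(sim_data) <- c("F1_TRAP", "F2_TRAP", "F1_DRESS", "F2_DRESS")

# Hypothetical helper: negate component `pc_no` if the loading of
# `flip_var` on it is negative, otherwise leave the PCA untouched.
flip_to_positive <- function(pca, pc_no, flip_var) {
  if (pca$rotation[flip_var, pc_no] < 0) {
    pca$rotation[, pc_no] <- -pca$rotation[, pc_no]
    pca$x[, pc_no] <- -pca$x[, pc_no]
  }
  pca
}

pca <- prcomp(sim_data, scale. = TRUE)

# A copy with PC1 negated, as a second analysis might have returned it.
negated <- pca
negated$rotation[, 1] <- -negated$rotation[, 1]
negated$x[, 1] <- -negated$x[, 1]

a <- flip_to_positive(pca, pc_no = 1, flip_var = "F1_TRAP")
b <- flip_to_positive(negated, pc_no = 1, flip_var = "F1_TRAP")

# TRUE if both starting orientations end up with identical loadings and scores.
both_agree <- isTRUE(all.equal(a$rotation, b$rotation)) &&
  isTRUE(all.equal(a$x, b$x))
both_agree
```

Fixing the sign of one variable in this way makes two analyses directly comparable.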

mds_test()

In the course of Sheard et al. (2024), we built a testing function for choosing the number of dimensions in Multidimensional Scaling (MDS), by analogy with pca_test(). The function calculates the reduction in stress achieved by adding an additional dimension to an MDS analysis, for both bootstrapped and permuted versions of a similarity matrix. Here’s what it and the plot_mds_test() function look like in practice:

mds_res <- mds_test(
  sim_matrix,
  n_boots = 50,
  n_perms = 50,
  test_dimensions = 5,
  principal = TRUE,
  mds_type = "ordinal",
  spline_degree = 2,
  spline_int_knots = 2
)
plot_mds_test(mds_res)
Figure 3: Plot of MDS test results.

In Figure 3, the black crosses indicate the stress reduction from adding an additional dimension for a given similarity matrix. The red box and whisker plot indicates the reduction across a series of bootstraps and the blue indicates the same for permuted versions of the data. We’ve found that accepting the number of dimensions up to and including the first where the two distributions align seems to perform well, but we are still experimenting! In the case of this (simulated) data, we’d go with two dimensions.
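The quantity behind the black crosses can be sketched in base R. This is an illustration under assumptions, not what mds_test() does internally: it uses classical MDS via stats::cmdscale() rather than ordinal MDS, a standard Stress-1 formula, simulated data, and no bootstrapping or permutation. `stress_for_k()` is a hypothetical helper.

```r
set.seed(1)

# Simulated data: 30 points with two strong dimensions and three
# much weaker noise dimensions.
n <- 30
signal <- matrix(rnorm(n * 2, sd = 3), ncol = 2)
noise <- matrix(rnorm(n * 3, sd = 0.3), ncol = 3)
d <- dist(cbind(signal, noise))

# Stress-1 of a k-dimensional classical MDS solution.
stress_for_k <- function(d, k) {
  fit <- cmdscale(d, k = k)
  d_hat <- dist(fit)
  sqrt(sum((as.vector(d) - as.vector(d_hat))^2) / sum(as.vector(d)^2))
}

max_dims <- 5
stress <- vapply(seq_len(max_dims), function(k) stress_for_k(d, k), numeric(1))

# Reduction in stress from adding each dimension. The first entry is
# measured against a zero-dimensional solution, whose Stress-1 is 1.
stress_reduction <- -diff(c(1, stress))
round(stress_reduction, 3)
```

Almost all of the reduction here comes from the first two dimensions, which is the pattern the test compares against bootstrapped and permuted versions of the data.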

Website

The package now has a pkgdown website at https://nzilbb.github.io/nzilbb_vowels. Have a look to see the rest of the documentation.

References

Sheard, Elena, Jen Hay, Robert Fromont, Joshua Wilson Black, and Lynn Clark. 2024. “Covarying New Zealand Vowels Interact with Speech Rate to Create Social Meaning for NZ Listeners.” Presented at the 19th Conference on Laboratory Phonology, Hanyang University, June 29.
Wilson Black, Joshua, James Brand, Jen Hay, and Lynn Clark. 2022. “Using Principal Component Analysis to Explore Co‐variation of Vowels.” Language and Linguistics Compass 17 (1). https://doi.org/10.1111/lnc3.12479.