Methodology of sociolinguistic research

Statistics in the analysis of phonetic variation: application of the Goldvarb programme, by Josefina Carrera


3. Study variables

To work with the results and obtain the averages and probabilities of occurrence of each phonetic segment, values were assigned to the dependent variable (the pretonic vowel) and to the independent variables (those that define the context in which the dependent variable occurs, i.e. linguistic and extralinguistic factors). The linguistic and extralinguistic variables initially (4) considered most important are as follows:

1) The linguistic variables taken into consideration at the start of the analysis were: type of pretonic vowel, position of stress in each word, consonantal context adjacent to the pretonic vowel, place and manner of articulation of the sounds adjacent to the pretonic vowel, voicing (vocal cord vibration) of the sound following the vowel, quality of the pretonic syllable, etymology of each word, tonic syllable of each word and the sound occupying the pretonic position in the corresponding Spanish word.

2) The extralinguistic variables, directly related to speakers, were: sex, sociocultural status, knowledge of written Catalan, studies and age.

Each of these variables contains different factors (e.g. the variable sex includes the factors female and male) and each factor is represented by a digit or letter (in this case, female is represented by a d and male by an h). Taken together, these variables yield a coding string that characterises each pronunciation analysed. An example of such a coding is:

which is decoded as follows:


a: pretonic sound [a]


a: type of pretonic vowel: absolute initial
2: position of stress: vv'v
c: adjacent context before the vowel: existent
r: place of articulation before the pretonic vowel
4: manner of articulation before the pretonic vowel: liquid
a: place of articulation after the pretonic vowel: alveolar
2: manner of articulation after the pretonic vowel: fricative
s: voicing (vocal cord vibration) of the sound following the vowel: voiceless
t: quality of the pretonic syllable: closed
b: etymology: derivative word with the particle es-
O: tonic syllable of the word
c: initial sound of the corresponding word in Spanish: consonant
h: sex: male
-: sociocultural context: mid-low
1: level of studies: none
n: knowledge of standard Catalan: none
9: age: from 3 to 5 years
f: style of speech: formal
A: linguistic community: Alguaire
088: word code: estisores
36: informant code
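
The decoding above can be mimicked in a few lines of Python. This is only a minimal sketch and not Goldvarb's own token format: it assumes that each factor group occupies one fixed position in the string, in the order of the list above, and that the last five characters hold the three-digit word code and the two-digit informant code. The names FACTOR_GROUPS and decode_token are hypothetical, and the example string is simply assembled from the codes listed above.

```python
# Hypothetical fixed-width layout: one character per factor group, in the
# order of the decoded list, followed by a 3-digit word code and a
# 2-digit informant code. This layout is an assumption for illustration.
FACTOR_GROUPS = [
    "pretonic_sound",         # dependent variable
    "pretonic_vowel_type",
    "stress_position",
    "context_before",
    "place_before",
    "manner_before",
    "place_after",
    "manner_after",
    "voicing_after",
    "syllable_quality",
    "etymology",
    "tonic_syllable",
    "spanish_initial_sound",
    "sex",
    "sociocultural_status",
    "studies",
    "knowledge_of_catalan",
    "age",
    "style",
    "community",
]


def decode_token(token: str) -> dict:
    """Split a fixed-width token string into named factor codes."""
    n = len(FACTOR_GROUPS)
    if len(token) != n + 5:
        raise ValueError(f"expected {n + 5} characters, got {len(token)}")
    fields = dict(zip(FACTOR_GROUPS, token[:n]))
    fields["word_code"] = token[n:n + 3]
    fields["informant_code"] = token[n + 3:]
    return fields


# String assembled from the codes in the list above (assumed layout).
example = decode_token("aa2cr4a2stbOch-1n9fA08836")
print(example["sex"], example["age"], example["word_code"])
```

Keeping the layout in a single list means that adding or dropping a factor group only requires changing FACTOR_GROUPS, not the decoding logic.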

4. Application of the Goldvarb statistical programme

The Goldvarb statistical programme (5), like the Varbrul programme, works in four essential stages. These four stages, which yield the descriptive statistics, are as follows:

a) Entry of the survey results into the token file. This file is the basis of all the analyses, so the results must be entered correctly. A coding string like the one above is entered here for each utterance produced by the survey subjects.

b) Once the token file is complete, the condition file must be set up; it contains the parameters that determine how the results are combined with the dependent and independent variables of the analysis. Depending on how this file is structured, linguistic variation can be explained in different ways. The file is also modified repeatedly during the analysis, since the aim is to find the combination of factors that best explains the variable phenomenon. Thus it may contain variables and factors left unmodified alongside variables and factors that have been regrouped.

c) Once the condition file has been set up, the programme creates an iteration file of independent variables: the "cell file". This step combines all the independent factors (tonic syllable, age of informants, studies, etc.) and relates them to the actual data of the analysis. The user simply selects a series of commands to create these cells. However, each time the conditions are changed, the cell file must be re-created.

d) Lastly, the programme automatically displays the results of combining data and conditions, as percentages and absolute counts.
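
Steps b) and c) can be illustrated with a small Python sketch. It does not reproduce Goldvarb's condition-file syntax: the toy tokens, the CONDITIONS mapping and the helper names apply_conditions and build_cells are all hypothetical, as are the age codes "7" and "8". The sketch only shows the underlying idea, namely that the conditions select and regroup factors, and that a cell is one combination of independent factors together with a count of each variant of the dependent variable.

```python
from collections import Counter, defaultdict

# Toy tokens (invented for illustration): each has a dependent variant
# ("sound") and two independent factor groups.
TOKENS = [
    {"sound": "a", "sex": "h", "age": "9"},
    {"sound": "e", "sex": "h", "age": "9"},
    {"sound": "a", "sex": "d", "age": "8"},
    {"sound": "a", "sex": "d", "age": "9"},
    {"sound": "e", "sex": "d", "age": "7"},
]

# Hypothetical conditions: keep "sex" unchanged; regroup the age codes
# "7" and "8" into one factor "o" (older) and recode "9" as "y" (younger).
CONDITIONS = {
    "sex": None,                            # None = keep the codes as they are
    "age": {"7": "o", "8": "o", "9": "y"},  # regrouped factors
}


def apply_conditions(token, conditions):
    """Return the recoded independent factors of one token as a tuple."""
    recoded = []
    for group, mapping in conditions.items():
        code = token[group]
        recoded.append(code if mapping is None else mapping[code])
    return tuple(recoded)


def build_cells(tokens, conditions, dependent="sound"):
    """Count dependent variants for every combination of independent factors."""
    cells = defaultdict(Counter)
    for token in tokens:
        cells[apply_conditions(token, conditions)][token[dependent]] += 1
    return cells


cells = build_cells(TOKENS, CONDITIONS)
for combination, counts in sorted(cells.items()):
    print(combination, dict(counts))
```

Because the cells depend on the conditions, any change to CONDITIONS means calling build_cells again, which mirrors the need to re-create the cell file.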

In this first phase, then, it is the user who enters the tokens and conditions, with all the necessary changes. The cells and results, however, are calculated by the programme when the appropriate option is chosen. By the end of this first stage, we have the descriptive statistics and the actual frequencies of use for each independent variable, according to the dependent factors.
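
The descriptive output of this first stage amounts to a cross-tabulation. As a rough sketch with invented data (the tokens list and the function name crosstab are hypothetical, and the layout of Goldvarb's actual results display is not reproduced), the following Python computes, for each factor of one independent variable, the count and percentage of every variant of the dependent variable.

```python
from collections import Counter, defaultdict

# Invented (variant, sex) pairs; "d" = female and "h" = male, as in the
# coding described earlier.
tokens = [
    ("a", "d"), ("a", "d"), ("e", "d"),
    ("a", "h"), ("e", "h"), ("e", "h"), ("e", "h"),
]


def crosstab(pairs):
    """Return {factor: {variant: (count, percentage)}} for (variant, factor) pairs."""
    counts = defaultdict(Counter)
    for variant, factor in pairs:
        counts[factor][variant] += 1
    table = {}
    for factor, variants in counts.items():
        total = sum(variants.values())
        table[factor] = {
            v: (n, round(100 * n / total, 1)) for v, n in variants.items()
        }
    return table


table = crosstab(tokens)
for factor in sorted(table):
    print(factor, table[factor])
```

With this toy data, two of the three female tokens show [a] (66.7%), against one of the four male tokens (25.0%).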



