The Zygosity questionnaire was developed by the Norwegian Institute of Public Health (FHI; Folkehelseinstituttet) for its twin registry studies. It is a series of questions probing the similarities between twins, used to determine whether they are mono- or dizygotic.
This note contains a brief description of the algorithm used to determine zygosity during recruitment in the 2000s.
Name | Answer questions about… | Used for |
---|---|---|
Drop | You and your twin were like two drops of water in childhood | Pairs and singles |
Stranger | Strangers had trouble telling the difference when you were children | Pairs and singles |
Eye | Similarity in terms of eye color | Pairs |
Voice | Similarity in terms of voice | Singles |
Dexter | Similarity in dexterity | Pairs and singles |
Belief | What you believe yourself | Pairs and singles |
“Single” twins here means those who have responded alone, i.e. data are available for only one twin in the pair. Similarity questions that are not listed in the table above, e.g. whether family members had problems distinguishing the twins, are not used in the classification.
When the overall zygosity score is calculated, each item is weighted, and the weights depend on whether one or both twins have responded to the questionnaire.
Name | Answer questions about… | Factor single | Factor pair |
---|---|---|---|
Drop | You and your twin were like two drops of water | 1.494 | 2.111 |
Stranger | Strangers had trouble seeing the difference | 0.647 | 0.691 |
Eye | Similarity in terms of eye color | | 0.394 |
Voice | Similarity in terms of voice | 0.347 | |
Dexter | Dexterity Similarity | 0.458 | 0.366 |
Belief | What you believe yourself | 0.417 | 0.481 |
Constant | Constant term in the formula | 0.007 | -0.087 |
“Form value” is the value the answer option has in the data file. “Score value” is the value used in the algorithm when zygocity is calculated.
Variable | Answer option | Form value | Score value |
---|---|---|---|
Drop | Like two drops of water | 1 | 1 |
Drop | Like most siblings | 2 | -1 |
Drop | Don’t know | 3 | 0 |
Stranger | Often | 1 | 1 |
Stranger | Occasionally | 2 | 0 |
Stranger | Never | 3 | -1 |
Stranger | Don’t know | 4 | 0 |
Belief | Monozygotic | 1 | 1 |
Belief | Dizygotic | 2 | -1 |
Belief | Don’t know | 3 | 0 |
Eye, Voice & Dexter | Exactly the same | 1 | 1 |
Eye, Voice & Dexter | Almost like | 2 | 0 |
Eye, Voice & Dexter | Different | 3 | -1 |
Eye, Voice & Dexter | Don’t know | 4 | 0 |
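The mapping from form values to score values in the table above can be sketched in base R as follows. This is only an illustration: `score_map` and `recode_form()` are hypothetical names, not functions from the package, and returning `NA` for missing or out-of-range answers is an assumption.

```r
# Lookup vectors: position = form value, element = score value (from the table).
# `score_map` and `recode_form()` are illustrative names, not package functions.
score_map <- list(
  drop     = c(1, -1, 0),
  stranger = c(1, 0, -1, 0),
  belief   = c(1, -1, 0),
  eye      = c(1, 0, -1, 0),
  voice    = c(1, 0, -1, 0),
  dexter   = c(1, 0, -1, 0)
)

# Recode a vector of form values for one variable into score values.
# Assumption: missing or out-of-range form values become NA.
recode_form <- function(x, variable) {
  map <- score_map[[variable]]
  valid <- !is.na(x) & x %in% seq_along(map)
  out <- rep(NA_real_, length(x))
  out[valid] <- map[x[valid]]
  out
}

recode_form(c(1, 2, 3, NA), "drop")
#> [1]  1 -1  0 NA
recode_form(c(1, 2, 3, 4), "stranger")
#> [1]  1  0 -1  0
```

Keeping the mapping as one lookup vector per variable makes it easy to compare against the table row by row.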
The answer options themselves are never used directly in the calculations, only the score values. In the following, it is these values (-1, 0 or 1) that are used in the algorithms. For example, Drop has the value 1 in the formula when the twin answers that they were like two drops of water.
The higher the absolute value of the final score, the clearer the classification. Answers that reveal greater uncertainty about the similarity (e.g. a larger proportion of “almost” and “don’t know”) pull the score closer to zero.
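For pairs where both twins have answered, the pair’s average is first calculated for each score value: Drop = (Drop1 + Drop2) / 2, and so on, where Drop1 is the score value of the response from twin 1 and Drop2 is the score value of the response from twin 2 in the same pair. The pair score is then the constant term plus the weighted sum of these averages, using the “Factor pair” weights (Eye is used for pairs only):
pair score = -0.087 + 2.111 × Drop + 0.691 × Stranger + 0.394 × Eye + 0.366 × Dexter + 0.481 × Belief
The sign of this “pair score” is then used to determine zygosity in the same way as for “single”: a negative value means dizygotic, a positive value means monozygotic.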
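If only one twin in the pair has responded, the constant term plus the weighted sum of that twin’s score values is calculated, using the “Factor single” weights (Voice is used for singles only):
single score = 0.007 + 1.494 × Drop + 0.647 × Stranger + 0.347 × Voice + 0.458 × Dexter + 0.417 × Belief
The sign of this “single score” is then used to determine the zygosity: a negative value means dizygotic, a positive value means monozygotic.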
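The two scoring rules can be sketched in base R, assuming the score is the constant term plus the weighted sum of score values, as the weights table suggests. All names below (`w_single`, `w_pair`, `linear_score()`, `classify()`) are hypothetical and not part of the package. Eye is treated as a pair-only item and Voice as a single-only item, following the first table. How a score of exactly zero is handled is not stated in the text, so returning `NA` there is an assumption.

```r
# Weights and constants copied from the weights table.
# All object and function names here are illustrative only.
w_single <- c(drop = 1.494, stranger = 0.647, voice = 0.347,
              dexter = 0.458, belief = 0.417)
w_pair   <- c(drop = 2.111, stranger = 0.691, eye = 0.394,
              dexter = 0.366, belief = 0.481)
const_single <- 0.007
const_pair   <- -0.087

# Linear score: constant + sum(weight * score value).
# `scores` is a named numeric vector of score values.
linear_score <- function(scores, weights, constant) {
  constant + sum(weights * scores[names(weights)])
}

# Sign of the score decides the class; exactly zero -> NA (assumption).
classify <- function(score) {
  if (score > 0) "monozygotic" else if (score < 0) "dizygotic" else NA_character_
}

# One responding twin: score values after recoding.
single <- c(drop = 1, stranger = 0, voice = 0, dexter = 1, belief = 1)
s_single <- linear_score(single, w_single, const_single)  # 2.376

# Both twins responded: average the score values item by item first.
twin1 <- c(drop = -1, stranger = 0,  eye = -1, dexter = 0, belief = -1)
twin2 <- c(drop = -1, stranger = -1, eye = 0,  dexter = 0, belief = -1)
pair_avg <- (twin1 + twin2) / 2
s_pair <- linear_score(pair_avg, w_pair, const_pair)      # -3.2215

classify(s_single)
#> [1] "monozygotic"
classify(s_pair)
#> [1] "dizygotic"
```

Because the pair averages can take the values -0.5 and 0.5, disagreement between twins on an item halves that item's contribution instead of cancelling the whole pair.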
By default, the functions assume that columns are named `zygocity_XX`, where `XX` is the zero-padded question number of the inventory (i.e. with a leading zero for numbers below 10, e.g. `09`). You may have column names in another format, but in that case you will need to supply the names of those columns to the functions using tidy-selectors (see the tidyverse packages for this). The columns should adhere to some naming logic that is easy to specify.
The values in the columns should be the number of the answer option that was chosen (i.e. `1`, `2`, or `3`, and for some questions also `4`).
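As a small illustration of the naming convention, zero-padded names can be generated and matched in base R. The number of items (10) and the toy data frame `df` are assumptions made only for this sketch.

```r
# Zero-padded default-style names for a hypothetical 10-item form.
default_cols <- sprintf("zygocity_%02d", 1:10)
default_cols[c(1, 9, 10)]
#> [1] "zygocity_01" "zygocity_09" "zygocity_10"

# Toy data frame (illustrative only) with two such columns.
df <- data.frame(id = 1:2, zygocity_01 = c(1, 2), zygocity_02 = c(3, 1))

# Pick out the questionnaire columns by their shared naming pattern.
zygo_cols <- grep("^zygocity_[0-9]{2}$", names(df), value = TRUE)
zygo_cols
#> [1] "zygocity_01" "zygocity_02"
```

With a consistent prefix like this, a tidy-selector such as `dplyr::starts_with("zygocity_")` selects the same columns.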
`zygo` functions

Currently undocumented…
library(questionnaires)
library(dplyr)

# Example data: five twin pairs, two rows per pair.
# Rows that are entirely NA represent twins who did not respond.
zygo <- tibble(
  id = 1:10,
  twinpair = rep(1:5, each = 2),
  drop = c(1, 2, 3, NA, 2, 2, 1, 1, NA, 2),
  stranger = c(1, 2, 4, NA, 2, 3, 3, 1, NA, 2),
  dexterity = c(1, 1, 3, NA, 2, 2, 1, 2, NA, 1),
  voice = c(2, 2, 3, NA, 2, 2, 1, 1, NA, 1),
  eye = c(2, 2, 2, NA, 2, 2, 1, 1, NA, 2),
  belief = c(1, 1, 2, NA, 2, 2, 1, 1, NA, 2)
)

# Compute the per-item columns, the total score and the zygosity class.
zygo_compute(zygo,
             twin_col = twinpair,
             cols = 3:6,
             recode = FALSE)
#> # A tibble: 10 × 8
#> zygo_eye zygo_drop zygo_stranger zygo_dexterity zygo_voice zygo_belief
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 0.788 2.11 0.691 0.366 NA 0.481
#> 2 0.788 4.22 1.38 0.366 NA 0.481
#> 3 NA 4.48 2.59 1.37 1.04 0.834
#> 4 NA NA NA NA NA NA
#> 5 0.788 4.22 1.38 0.732 NA 0.962
#> 6 0.788 4.22 2.07 0.732 NA 0.962
#> 7 0.394 2.11 2.07 0.366 NA 0.481
#> 8 0.394 2.11 0.691 0.732 NA 0.481
#> 9 NA NA NA NA NA NA
#> 10 NA 2.99 1.29 0.458 0.347 0.834
#> # ℹ 2 more variables: zygo_score <dbl>, zygo_zygocity <chr>
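The per-item columns in this output appear to be each value multiplied by its weight, with the raw form values used as given because `recode = FALSE`. A quick base R check for row 3 (a “single” twin, since row 4 is entirely `NA`) reproduces the printed numbers. This is inferred from the output alone, not from the package documentation, and the object names are illustrative.

```r
# Form values for row 3 of the example data (its co-twin did not respond).
row3 <- c(drop = 3, stranger = 4, dexterity = 3, voice = 3, belief = 2)

# "Factor single" weights from the weights table.
w_single <- c(drop = 1.494, stranger = 0.647, dexterity = 0.458,
              voice = 0.347, belief = 0.417)

# Weighted values, rounded to three significant digits as in the printout.
row3_weighted <- signif(row3 * w_single[names(row3)], 3)
row3_weighted
#>      drop  stranger dexterity     voice    belief 
#>     4.480     2.590     1.370     1.040     0.834
```

Values such as 4.48 lie outside the range that score values of -1, 0 or 1 could produce, which suggests that data holding form values should be converted to score values before scoring.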