library(nettskjema.tsd)
#> You are not on linux TSD.
#> Some functions from this package may not work as intended.FALSE
Currently, nettskjema data sent to TSD are stored in durable/nettskjema-submissions/XXX/, where each nettskjema has its own folder named after the nettskjema id (which can be found in the URL of the nettskjema in the nettskjema portal). Every nettskjema connected to our TSD project sends its data to this location, with one encrypted file per response, in both json and csv format.
At its core, the single function nettskjema_tsd_organise_all_forms() starts and sets up the entire pipeline for you. There are several options, such as specifying custom output locations or rerunning all files, which are described in the function documentation.
By default, data are copied from durable/nettskjema-submissions/XXXX to durable/nettskjema-submissions/cleaned, where the submissions are combined into two files per form:
form-xxxxx.csv (data source is incoming csv files)
form-xxxxx-json.csv (data source is incoming json files)
An optional step can be run to apply edits to data values that have been entered incorrectly, while still retaining the original raw data. We do not recommend ever changing raw data directly; instead, use the editing pipeline provided, which keeps the raw data intact, logs the changes needed and the reasons for them, and produces a working clean data set.
The data are first copied to a safe new location, where we can more easily store them the way we want and continue working on them. The incoming data folder is locked so that we can only copy data out of it, not work in it in any way.
By default, the new data location sits next to the incoming data, in durable/nettskjema-submissions/cleaned. Here, just like in the raw folder, each nettskjema has its own folder named after the nettskjema ID.
-- 177322
   -- attachments/
   -- edits-177322.json
   -- form-177322.csv
   -- form-177322-json.csv
   -- form-177322-json-ed.csv
   -- submissions/
      -- decrypted/
      |  -- form-177322-2020-01-31-18-14-32.json
      |  -- form-177322-2020-02-23-13-44-01.json
      |  -- form-177322-2020-02-24-10-32-22.json
      -- encrypted/
      |  -- form-177322-2020-01-31-18-14-32.json.asc
      |  -- form-177322-2020-02-23-13-44-01.json.asc
      |  -- form-177322-2020-02-24-10-32-22.json.asc
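As a rough illustration of the layout above, the paths for one form can be assembled in base R. This is only a sketch: form_paths() is a hypothetical helper and not part of nettskjema.tsd, and the root is simply the default cleaned location mentioned in this document.

```r
# Hypothetical helper (not part of nettskjema.tsd): build the paths that
# belong to one form, following the folder layout shown above.
form_paths <- function(form_id, root = "durable/nettskjema-submissions/cleaned") {
  # Pass the id as a character string so it is never printed in
  # scientific notation (as.character(100000) gives "1e+05").
  form_dir <- file.path(root, form_id)
  list(
    attachments     = file.path(form_dir, "attachments"),
    edits           = file.path(form_dir, sprintf("edits-%s.json", form_id)),
    csv             = file.path(form_dir, sprintf("form-%s.csv", form_id)),
    json_csv        = file.path(form_dir, sprintf("form-%s-json.csv", form_id)),
    json_csv_edited = file.path(form_dir, sprintf("form-%s-json-ed.csv", form_id)),
    decrypted       = file.path(form_dir, "submissions", "decrypted"),
    encrypted       = file.path(form_dir, "submissions", "encrypted")
  )
}

paths <- form_paths("177322")
print(paths$json_csv_edited)
```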
The submissions folder contains every single-file submission (answer) to the nettskjema. The encrypted files are exactly as they were when they entered TSD, and the decrypted files are the same files, decrypted so that users can read them. Some nettskjema also have an attachments folder, which should contain any attachments submitted with the nettskjema (if the form has the option to add attachments).
Data are encrypted and decrypted with gpg.
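To give an idea of what decrypting a single submission with gpg might involve, the sketch below builds the command-line arguments for one file. gpg_decrypt_args() is a hypothetical helper, not the package's own code, and the relative paths are illustrative. It assumes that the decrypted file keeps the encrypted file's name without the .asc extension, as in the folder listing in this document.

```r
# Hypothetical helper (not part of nettskjema.tsd): build the gpg arguments
# needed to decrypt one submission into the decrypted folder.
gpg_decrypt_args <- function(encrypted_file, decrypted_dir) {
  # Assumption: the output keeps the same file name, minus ".asc"
  out_file <- file.path(decrypted_dir, sub("\\.asc$", "", basename(encrypted_file)))
  c("--batch", "--yes", "--output", out_file, "--decrypt", encrypted_file)
}

args <- gpg_decrypt_args(
  "submissions/encrypted/form-177322-2020-01-31-18-14-32.json.asc",
  "submissions/decrypted"
)
print(args)

# Actually running it requires gpg and the project's private key:
# system2("gpg", shQuote(args))
```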
The data that are encrypted are:
The submission data are combined into data sets for each nettskjema, using both the json and csv files. The two data sources differ slightly, which is why we create files from both. Generally, combining data from the json files is more robust than combining data from the csv files, because the json files do not have issues with possible column mismatches.
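The robustness of the json route comes from every value being stored with its key, so submissions can be aligned by name rather than by column position. The base R sketch below illustrates the idea and is not the package's implementation. It assumes each json file has already been parsed into a named list (for instance with jsonlite::fromJSON()); combine_submissions(), the submission_id field, and all values shown are invented for illustration.

```r
# Hypothetical helper: combine parsed submissions (named lists) into one
# data frame, matching values by key instead of by position.
combine_submissions <- function(submissions) {
  # Union of all keys seen in any submission
  all_cols <- unique(unlist(lapply(submissions, names)))
  rows <- lapply(submissions, function(s) {
    # Start from a row of missing values, then fill in what this
    # submission actually contains
    row <- as.list(rep(NA_character_, length(all_cols)))
    names(row) <- all_cols
    row[names(s)] <- lapply(s, as.character)
    as.data.frame(row, stringsAsFactors = FALSE)
  })
  do.call(rbind, rows)
}

# Made-up submissions: the second has its keys in a different order,
# the third lacks one variable entirely.
subs <- list(
  list(submission_id = "889922", sex = "male", cvlt_a_total = "54"),
  list(cvlt_a_total = "45", submission_id = "872635", sex = "female"),
  list(submission_id = "900001", sex = "female")
)
combined <- combine_submissions(subs)
print(combined)
```

Key order and missing keys do not shift values into the wrong column here, which is the kind of mismatch that position-based csv combining can suffer from.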
Read more about the json data format on digitalocean.
There are also options for creating edited versions of the data, in case incorrect values were submitted to the nettskjema. These edits are stored as json files that specify which values should be changed in the combined data. Edit files are named edits-XXXX.json, where XXXX is the nettskjema form id, and are stored in the same location as the form-xxxx.csv files. The edit jsons are organised by submission id, which is the unique identifier of any nettskjema submission.
{
  "889922": {
    "data": {
      "sex": {
        "value": "female",
        "comment": "incorrectly punched by RA"
      },
      "cvlt_a_total": {
        "value": "45",
        "comment": "incorrectly calculated"
      }
    }
  },
  "872635": {
    "data": {},
    "comment": "double punched"
  }
}
In the case above, the first submission has two variables that were entered incorrectly: sex should be female, and cvlt_a_total should be 45. Each has its own comment explaining why the change is needed. The second submission was entered by mistake, so it should be deleted in its entirety.
This way we can keep easy track of all edits made to the nettskjema data for posterity.
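How an edit file of this shape could be applied to a combined data set is sketched below in base R. It is an illustration under stated assumptions and not the package's own code: apply_edits() and the submission_id column name are hypothetical, the edits are written as an R list mirroring the json above, the raw values are invented, and an entry with an empty data block is treated as a request to delete the submission.

```r
# Hypothetical helper: apply edits (a nested list mirroring the edits json)
# to a combined data frame, returning a new data frame.
apply_edits <- function(data, edits, id_col = "submission_id") {
  for (id in names(edits)) {
    edit <- edits[[id]]
    hit <- data[[id_col]] %in% id
    if (!any(hit)) next
    if (length(edit$data) == 0) {
      # Assumption: an empty "data" block means "drop this submission"
      data <- data[!hit, , drop = FALSE]
      next
    }
    for (variable in names(edit$data)) {
      data[hit, variable] <- edit$data[[variable]]$value
    }
  }
  data
}

# Made-up raw data; only the submission ids come from the example above.
raw <- data.frame(
  submission_id = c("889922", "872635", "900001"),
  sex           = c("male", "female", "male"),
  cvlt_a_total  = c("54", "45", "50"),
  stringsAsFactors = FALSE
)

edits <- list(
  "889922" = list(data = list(
    sex          = list(value = "female", comment = "incorrectly punched by RA"),
    cvlt_a_total = list(value = "45", comment = "incorrectly calculated")
  )),
  "872635" = list(data = list(), comment = "double punched")
)

edited <- apply_edits(raw, edits)
print(edited)
```

Because the function returns a new data frame, raw is left untouched and the edit list doubles as a log of what was changed and why.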
Edits can be applied to all nettskjema data with a single command if needed:
This will create new combined files for the form with the ending -ed, so you can always preserve the intact raw data while also having a working cleaned data set.
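The naming convention can be illustrated with a few lines of base R that write an edited copy next to the unedited combined file. This is a sketch only: edited_path() is a hypothetical helper, the data are invented, and a temporary directory stands in for the form folder.

```r
# Hypothetical helper: derive the "-ed" file name from a combined file name.
edited_path <- function(path) sub("\\.csv$", "-ed.csv", path)

# A temporary directory stands in for the form folder.
out_dir <- tempfile("form-177322-")
dir.create(out_dir)
combined_file <- file.path(out_dir, "form-177322-json.csv")

# Made-up data: the combined file and a corrected copy of it.
combined <- data.frame(submission_id = "889922", sex = "male",
                       stringsAsFactors = FALSE)
edited <- transform(combined, sex = "female")

write.csv(combined, combined_file, row.names = FALSE)
write.csv(edited, edited_path(combined_file), row.names = FALSE)

print(list.files(out_dir))
```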