To make the task easier, and to measure the progress towards the goal, we divide it into steps. The procedure below applies equally to assessment coordinators and the TAF support team.
Five steps to script the analysis from data to output (core assessment) in TAF:
1: Get model to run
Well, unless you are the assessment coordinator :)
Files might be found in the SharePoint Data folder.
Earlier WG reports can be found online.
Being able to run the assessment on a different computer is an important milestone in making the analysis reproducible.
2: Examine the analysis
This is a good time to open and view
(a) the input & output files, and
(b) the last WG report, especially the table section
Do the tables in (a) and (b) look similar?
What kinds of data are used in this assessment, perhaps more than one survey?
Are some data tables in the report not in the model input, or vice versa?
Are the model settings stored in a separate file?
Is it easy to find out which input files the model requires?
In general, TAF should only contain files that are absolutely necessary to run the final assessment.
- all other files are probably best stored outside of TAF
- data files should include all available years and ages, which can be truncated (e.g. in a plus group) in the data script
What is the smallest set of files required to run the final assessment on another computer?
3: Data script
The easiest way to import data into R depends on the data file format:
- simple text files can often be imported using base functions like read.table
- specific file formats can be imported using packages like stockassessment or FLCore
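As a minimal sketch of the first case, the snippet below writes a small whitespace-delimited table to a temporary file and reads it back with read.table. The file contents and the column names (Year, age1 to age3) are made up for illustration; a real data file will likely need different arguments.

```r
# Hypothetical catch-at-age table, written to a temporary file so the
# example is self-contained
lines <- c("Year age1 age2 age3",
           "2001  120   80   30",
           "2002  150   95   42")
datafile <- tempfile(fileext = ".dat")
writeLines(lines, datafile)

# header = TRUE takes the column names from the first line
catage <- read.table(datafile, header = TRUE)
str(catage)
```

Arguments such as sep, skip and na.strings are often what needs adjusting when a file does not import cleanly at the first attempt.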
Some preprocessing of data often occurs before they are fed into a model:
- years or ages might be excluded from the analysis
- ages might be aggregated into a plus group
- survey indices might be combined, the current year’s weights predicted, etc.
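The first two preprocessing steps could look something like the sketch below. The table, its column naming (Year, age1 to age5) and the helper plus_group are all hypothetical, chosen only to illustrate the idea; the actual data script depends on how the assessment data are organized.

```r
# Hypothetical catch-at-age table with ages 1-5 in columns
catage_full <- data.frame(Year = 2001:2003,
                          age1 = c(120, 150, 110),
                          age2 = c(80, 95, 102),
                          age3 = c(30, 42, 51),
                          age4 = c(12, 15, 20),
                          age5 = c(4, 6, 7))

# Collapse all ages >= plus into one plus-group column
# (the plus-group column keeps the name of its youngest age)
plus_group <- function(x, plus) {
  ages <- as.integer(sub("age", "", names(x)[-1]))
  keep <- x[c(TRUE, ages < plus)]
  keep[[paste0("age", plus)]] <- rowSums(x[-1][ages >= plus])
  keep
}

# Exclude the first year, then aggregate ages 4 and older
catage <- plus_group(catage_full[catage_full$Year >= 2002, ], plus = 4)
print(catage)
```

Keeping catage_full untouched and deriving catage from it makes the truncation an explicit, scripted step rather than something done by hand in the data file.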
The data should preferably start in disaggregated form (see ‘Mission Y’ below).
Data that are used in the assessment model should be written as TAF data files in the data folder.
- the icesTAF package provides the function write.taf for this purpose
- write both full datasets (all ages, all years) and truncated datasets, e.g. plus group in catch-at-age as catage_full.csv and catage.csv
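A rough sketch of writing both versions of a table is shown below. To keep it runnable without extra packages it uses base write.csv and a temporary directory in place of the TAF working directory; in a real data script, write.taf from icesTAF would take the place of write.csv. The numbers are made up.

```r
# Hypothetical full catch-at-age table (all ages, all years)
catage_full <- data.frame(Year = 2001:2002,
                          age1 = c(120, 150),
                          age2 = c(80, 95),
                          age3 = c(30, 42),
                          age4 = c(12, 15))

# Truncated table: ages 3 and older combined into a plus group
catage <- catage_full[c("Year", "age1", "age2")]
catage$age3 <- catage_full$age3 + catage_full$age4

# Write both tables into the data folder
# (a temporary directory stands in for the TAF working directory here)
datadir <- file.path(tempdir(), "data")
dir.create(datadir, showWarnings = FALSE, recursive = TRUE)
write.csv(catage_full, file.path(datadir, "catage_full.csv"),
          quote = FALSE, row.names = FALSE)
write.csv(catage, file.path(datadir, "catage.csv"),
          quote = FALSE, row.names = FALSE)
```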
Ideally, the TAF data files are the only files necessary for the input.R script, but sometimes it’s practical to write additional files in the data folder that are not in the TAF file format.
4: Input and model scripts
The model input is data in the format that the model requires, for example:
- text files such as input.dat with many tables, or
- input.RData with many R objects
Ideally, input.R should read the TAF data files created by data.R and create the model input from that, thus guaranteeing that the TAF data files are indeed the data that the model uses.
Sometimes it’s practical to have the input.R script read/copy/move files that are not in the TAF file format.
The input files, containing data in model-specific format, are written in the input folder, ready for the next step.
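The ideal case, where the input script reads only TAF data files, might be sketched as follows. The folder taf_example, the object name catch and the matrix layout are assumptions made for this illustration; the real input format is dictated by the model.

```r
# Stand-in for the TAF working directory, so the sketch is self-contained
taf <- file.path(tempdir(), "taf_example")
dir.create(file.path(taf, "data"), showWarnings = FALSE, recursive = TRUE)
dir.create(file.path(taf, "input"), showWarnings = FALSE)

# In practice data.R has already written this file; it is created here
# with made-up numbers only so the example runs on its own
write.csv(data.frame(Year = 2001:2002, age1 = c(120, 150), age2 = c(80, 95)),
          file.path(taf, "data", "catage.csv"), row.names = FALSE)

# input.R: read the TAF data file ...
catage <- read.csv(file.path(taf, "data", "catage.csv"))

# ... convert it to the (hypothetical) format the model expects,
# here a matrix with years as row names ...
catch <- as.matrix(catage[-1])
rownames(catch) <- catage$Year

# ... and write the model input into the input folder
save(catch, file = file.path(taf, "input", "input.RData"))
```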
In TAF, stock assessment models are either run as:
- R packages, such as stockassessment and FLR, or
- executables, such as ADMB or Fortran applications
R package models return the results into the R session, and those results can be written out as results.RData inside the model folder.
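The pattern could be sketched as below. Note that lm is used purely as a stand-in for a call to a real assessment package, the survey index is made up, and a temporary directory stands in for the TAF working directory.

```r
# A temporary directory stands in for the TAF working directory
modeldir <- file.path(tempdir(), "model")
dir.create(modeldir, showWarnings = FALSE, recursive = TRUE)

# Made-up survey index
index <- data.frame(Year = 2001:2006,
                    Index = c(1.2, 1.5, 1.1, 1.8, 2.0, 1.7))

# Stand-in for the model fit; a real script would call the
# assessment package here instead of lm
fit <- lm(log(Index) ~ Year, data = index)

# Store the fitted object so later scripts can load it
save(fit, file = file.path(modeldir, "results.RData"))
```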
Executable models can be run using the R function system and the output files are stored inside the model folder.
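One way to wrap such a call is sketched below. The executable name model (or model.exe), its location and the helper run_model are hypothetical; command-line arguments and the way the working directory is handled will differ between models.

```r
# Hypothetical executable name; replace with the actual model binary
exe <- file.path("model",
                 if (.Platform$OS.type == "windows") "model.exe" else "model")

run_model <- function(exe, args = character()) {
  if (!file.exists(exe)) {
    message("Executable not found: ", exe)
    return(invisible(NA_integer_))
  }
  # Build the command from the full path before changing directory
  cmd <- paste(c(shQuote(normalizePath(exe)), args), collapse = " ")
  # Run inside the executable's folder so output files are written there
  owd <- setwd(dirname(exe))
  on.exit(setwd(owd))
  system(cmd)  # returns the exit status of the command
}

status <- run_model(exe)
```

Checking the returned status (0 conventionally means success) before moving on to the output step helps catch a failed model run early.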
Model settings are sometimes stored in files, especially in the case of executable models.
5: Output script
Mission X: Report script
Mission Y: Start from disaggregated data
Mission Z: TAF forecast