ASTR schema: Implementation
Source: vignettes/VG.ASTR.Schema.Implementation.Rmd
This vignette outlines how the conventions defined in the ASTR schema were implemented in the ASTR package.
Naming patterns
- Elements, oxides, and isotopes in column names are compared to pre-compiled lists. While we aim to be as exhaustive as possible, we cannot guarantee they are complete for oxides and isotopes. For example, only naturally occurring isotopes are currently supported. If you encounter a missing compound or isotope, please reach out to the package maintainers or open an issue in the ASTR GitHub repo.
- In addition to column names containing a single isotope, element, or oxide, ratios and sums are also supported. The table below gives some examples and the type and unit each will be recognized as.
| Column name | Type | Unit |
|---|---|---|
| 143Nd/144Nd | isotope ratio | unitless |
| d65Cu | isotope ratio | unitless |
| Na2O_wt% | concentration | wtP |
| S_at% | concentration | atP |
| 75As_wt% | concentration | wtP |
| FeOtot_wt% | concentration | wtP |
| FeOtot_errSD% | error | % |
| Ag_ppb | concentration | ng/g |
| Ag_err2D | error | ppb |
| Te_cps | concentration | counts/s |
| Sn_µg/ml | concentration | µg/ml |
| 206Pb/204Pb_errSE | error | unitless |
| Na2O+CaO_ppm | concentration | mg/kg |
| FeOtot/SiO2 | elemental ratio | unitless |
| (Na2O+K2O)/SiO2 | elemental ratio | unitless |
| (Ti/SiO2)/(Ag2O-Fe) | elemental ratio | unitless |
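To illustrate the idea behind these naming patterns, the following minimal sketch splits a column name at its last underscore into an analyte part and a unit or error suffix. The helper `split_column_name()` is hypothetical and uses base R only. It is not the pattern matching ASTR performs, which additionally compares the analyte against the pre-compiled lists of elements, oxides, and isotopes.

```r
# Hypothetical helper, not part of ASTR: split a column name at the last
# underscore into the analyte part and the unit/error suffix.
split_column_name <- function(x) {
  has_suffix <- grepl("_", x, fixed = TRUE)
  analyte <- ifelse(has_suffix, sub("_[^_]*$", "", x), x)
  suffix  <- ifelse(has_suffix, sub("^.*_", "", x), NA_character_)
  data.frame(
    name     = x,
    analyte  = analyte,
    suffix   = suffix,
    # a "/" in the analyte part marks a ratio (isotope or elemental)
    is_ratio = grepl("/", analyte, fixed = TRUE),
    # suffixes starting with "err" mark an error column
    is_error = !is.na(suffix) & startsWith(suffix, "err"),
    stringsAsFactors = FALSE
  )
}

parsed <- split_column_name(c("143Nd/144Nd", "Na2O_wt%", "FeOtot_errSD%",
                              "206Pb/204Pb_errSE", "Na2O+CaO_ppm"))
print(parsed)
```

Splitting off the suffix first means that a slash inside a unit is not mistaken for a ratio of analytes.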
Data import
- Each column in a dataset is recognized as one of the following: isotope ratio, concentration, elemental ratio (i.e. ratio of concentrations), error (i.e. analytical precision), or contextual information.
- Contextual information is additional information about the analytical values such as sample number, geolocation or a group. Columns with such information must be explicitly declared during import, excluding them from pattern recognition.
- One column must be specified as ID column during import to provide a unique identifier for each line in the dataset. To ensure uniqueness of its values, _1, _2, … _n will be added to non-unique values. The original column is preserved.
- Columns without a column header will be removed during import.
- The following notations will be automatically identified and replaced with NA, unless you explicitly define other values in read_ASTR(): NA, N.A., N/A, na, n/a, -, and n.d. Values containing common Excel error messages (#DIV/0!, #VALUE!, #REF!, #NAME?, #NUM!, #N/A, and #NULL!) are also replaced with NA by default.
- Units will be removed from column names because {units} stores them in the column attributes. This allows for clean column names and therefore clean labels, e.g. on plot axes. They can be “shifted” from the column attributes back to the column names by remove_units() with recover_unit_names = TRUE.
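The ID and missing-value rules above can be sketched in base R as follows. This is an illustration under stated assumptions, not the code used by read_ASTR(): the helper `make_unique_ids()` and the example data are hypothetical, whether the first occurrence of a duplicated ID also receives a suffix is an assumption, and exact matching is used for the Excel error messages.

```r
# Hypothetical helper, not ASTR's implementation: append _1, _2, ... to
# every ID value that occurs more than once; unique values stay unchanged.
make_unique_ids <- function(id) {
  dup <- id %in% id[duplicated(id)]
  counter <- ave(seq_along(id), id, FUN = seq_along)
  ifelse(dup, paste0(id, "_", counter), id)
}

# Made-up example data with duplicated sample names and missing-value notations
raw <- data.frame(
  sample = c("A1", "A1", "B2", "C3", "C3"),
  Cu     = c("12.5", "n.d.", "-", "#DIV/0!", "7.1"),
  stringsAsFactors = FALSE
)

na_notations <- c("NA", "N.A.", "N/A", "na", "n/a", "-", "n.d.")
excel_errors <- c("#DIV/0!", "#VALUE!", "#REF!", "#NAME?", "#NUM!",
                  "#N/A", "#NULL!")

clean <- raw
clean$ID <- make_unique_ids(raw$sample)  # the original column is kept
clean$Cu[clean$Cu %in% c(na_notations, excel_errors)] <- NA
clean$Cu <- as.numeric(clean$Cu)
print(clean)
```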
Units
- The package relies on the {units} package, which uses the udunits C library, for handling all SI units (e.g. µg/ml) and relative concentration units (e.g. ppm(m/V)). If the mixture type is not specified, m/m is assumed. The non-SI units wt% and at% are defined as wtP and atP, respectively.
- Because current IUPAC recommendations discourage the use of relative units, relative units are converted to absolute units wherever possible. Data in e.g. ppm can therefore be imported, but they are converted to mg/kg during import. Explicit conversion to ppm is still possible.
- As an exception, the relative unit % will not be converted to its SI unit equivalent during data import. However, unit conversions will treat it as any other relative unit, essentially handling it the same as wt%.
- The unit wt% (weight%) is defined as the relative unit wtP, analogous to e.g. ppm, meaning it is equivalent to “parts per hundred”. The package supports conversion between wt% for elements and wt% for oxides (sometimes referred to as oxide%). Because this is a chemical rather than a mathematical conversion, both have the same unit and the distinction is made based on the chemical formula in the column name (e.g., Fe vs. Fe2O3). This conversion has additional complexity because one element can have multiple oxides. The conversion functions take this into account: different options are offered for choosing the oxide to convert into, and if several oxides convert into the same element, their columns are summarized into a single column per element.
- Conversion to and from at% (atP) is restricted to wt% (wtP). To convert between at% and any other unit, convert to wt% first.
NOTE: Conversions currently support only
concentrations provided as elements or oxides but not as isotopes
(e.g. 204Pb).
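The chemistry behind the oxide and at% conversions can be sketched in base R. The helpers `element_fraction()` and `wt_to_at()` and the example values are hypothetical and are not ASTR functions. The atomic weights are rounded standard values, and reading "summarized" as summing the element contributions of several oxides is an assumption.

```r
# Rounded standard atomic weights in g/mol (extend as needed)
atomic_weight <- c(Fe = 55.845, O = 15.999, Cu = 63.546, Sn = 118.710)

# Mass fraction of an element in an oxide with n_el element atoms
# and n_O oxygen atoms per formula unit.
element_fraction <- function(element, n_el, n_O) {
  m_el <- n_el * atomic_weight[[element]]
  m_el / (m_el + n_O * atomic_weight[["O"]])
}

# Oxide wt% -> element wt%: two iron oxides collapse into one Fe value
fe_from_fe2o3 <- 10 * element_fraction("Fe", n_el = 2, n_O = 3)  # 10 wt% Fe2O3
fe_from_feo   <- 5  * element_fraction("Fe", n_el = 1, n_O = 1)  # 5 wt% FeO
fe_total      <- fe_from_fe2o3 + fe_from_feo                      # assumed: sum

# Element wt% -> at%: divide by atomic weight, then normalise to 100
wt_to_at <- function(wt) {
  moles <- wt / atomic_weight[names(wt)]
  100 * moles / sum(moles)
}

bronze_at <- wt_to_at(c(Cu = 90, Sn = 10))
print(round(c(Fe_from_Fe2O3 = fe_from_fe2o3, Fe_total = fe_total), 3))
print(round(bronze_at, 2))
```

The at% calculation needs mass fractions of all constituents as input, which is consistent with at% conversions being routed through wt%.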
Limit of detection
In as_ASTR(), values marked with a below-detection-limit notation are automatically set to NA. Users who need a more advanced treatment that makes use of the LOD in the ASTR package, e.g. for plotting functions, should implement their own lambda function to redefine bdl_strategy.
Possible substitution methods include dropping the left-censored value by replacing it with NA or 0, substituting LOD/2 or LOD/√2, removing the < sign so that the LOD itself is used, or applying regression models, enhanced censoring calculations, or maximum likelihood estimates (Croghan & Egeghy, 2003; Giskeødegård & Lydersen, 2022; Helsel, 2006).
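The simple substitution methods listed above could look like this in base R. The function `substitute_bdl()` is a hypothetical, standalone sketch. The signature that as_ASTR() expects for bdl_strategy is not described here, so the function would need to be adapted before being passed to it.

```r
# Hypothetical standalone sketch of simple substitution methods for
# left-censored values written as "<LOD". Not the ASTR bdl_strategy API.
substitute_bdl <- function(x,
                           method = c("na", "zero", "half", "sqrt2", "lod")) {
  method <- match.arg(method)
  censored <- grepl("^\\s*<", x)                  # values flagged with "<"
  value <- as.numeric(sub("^\\s*<\\s*", "", x))   # numeric part (LOD if censored)
  replacement <- switch(method,
    na    = NA_real_,          # drop the value
    zero  = 0,                 # replace by zero
    half  = value / 2,         # LOD / 2
    sqrt2 = value / sqrt(2),   # LOD / sqrt(2)
    lod   = value              # remove "<" and keep the LOD itself
  )
  ifelse(censored, replacement, value)
}

x <- c("1.2", "<0.5", "3.4", "< 0.1")
half_sub  <- substitute_bdl(x, "half")
sqrt2_sub <- substitute_bdl(x, "sqrt2")
na_sub    <- substitute_bdl(x, "na")
print(half_sub)
```

Regression-based and maximum likelihood approaches operate on the whole distribution rather than on single values and are therefore not covered by this per-value sketch.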
Output
- Values derived by calculations, such as age model parameters of lead isotope data, are returned as an ASTR object together with the ID column, contextual columns and the input used for the calculation (after unit conversion), but without analytical values not used in the calculation. This keeps datasets from growing unwieldy. Unless a result is clearly another valid analytical value that can be classified according to the ASTR schema, it is classified as contextual information.
Export
- ASTR does not provide a dedicated function to save ASTR objects, e.g. as a csv file. Instead, use the functions already available in R and its packages. Don’t forget to “shift” the units from the column attributes back into the column names with remove_units(df, recover_unit_names = TRUE) before export.
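A csv round trip with base R might look like the sketch below. The data frame `export_df` is a made-up stand-in for the result of remove_units(df, recover_unit_names = TRUE), and its column names are assumptions about what the recovered names look like. Note that read.csv() rewrites names such as Na2O_wt% into syntactically valid R names unless check.names = FALSE is set.

```r
# Stand-in for the output of remove_units(df, recover_unit_names = TRUE):
# a plain data frame whose column names carry the units again (assumed names).
export_df <- data.frame(
  ID         = c("A1", "B2"),
  "Na2O_wt%" = c(1.2, 3.4),
  "S_at%"    = c(0.5, 0.7),
  check.names = FALSE
)

path <- tempfile(fileext = ".csv")
write.csv(export_df, path, row.names = FALSE)

# check.names = FALSE keeps "%" in the column names on re-import
reimported <- read.csv(path, check.names = FALSE)

# With the default, the names are passed through make.names()
mangled <- read.csv(path)

print(names(reimported))
print(names(mangled))
```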
ASTR vs. non-ASTR objects
We do not want to make following the ASTR schema mandatory for using the functions in this package. Therefore, many of the functions not dedicated to the ASTR schema and its implementation also support non-ASTR objects. However, the default values of functions are defined for ASTR objects, and other convenient features, such as on-the-fly unit conversion, are restricted to ASTR objects. Read more about how to work with ASTR objects in this vignette.