3. The Bayesian Process

The use of Bayesian statistics for the interpretation of radiocarbon dates reinforces the need for clear problem definition, the requirement for rigour in sample selection, and the need for explicit consideration of our pre-understandings in interpretation.

Attention to these issues is, however, essential in any programme of scientific dating. Consequently, the iterative approach to sample selection and chronological modelling that has been crafted out of repeated practice over the past twenty-five years is applicable whether or not Bayesian statistics are ultimately used for the interpretation of the data (Fig. 11). This process enables best value to be obtained from any programme of radiocarbon dating.

3.1 Problem definition

The first step in the process is to consider the range of potential archaeological questions that a dating programme could address (Fig. 11). These are, of course, framed within the context of existing knowledge (often summarised in regional or period resource assessments). Key is the need to identify why dating for the artefact, activity or site is required. This factor will determine the precision of dating needed to resolve the question of interest.

Secondly, we must consider whether the question can be resolved at the level of the study being undertaken, or whether we wish to submit samples that will ultimately contribute to wider objectives, such as those identified in regional and period research frameworks.

For example, consider a site of undiagnostic form, lacking any material culture. We think that it could be prehistoric or early medieval, but need to determine to which period the site belongs. Dating to within a few centuries will resolve this issue.

Or, perhaps we have excavated part of an enclosure, and revealed enough of the plan and sufficient associated material culture to be confident we have a Neolithic causewayed enclosure. A recent synthesis has determined that this type of monument was constructed over a period of approximately 150 years between the late 38th and late 36th centuries cal BC (Whittle et al. 2011, figs 14.11–14.12). So, obtaining a few radiocarbon dates that, when calibrated, will place the monument in the mid-fourth millennium cal BC will not tell us anything that we do not already know.

We need a full programme of radiocarbon dating and chronological modelling to produce a chronology that is precise to about half a century, so that we can place our site in the emerging narrative for the appearance and use of this monument type in southern Britain (ibid, figs 14.16 etc).

Or, perhaps we have excavated a pit containing a highly decorated Beaker vessel, accompanied by a large concentration of carbonised plant remains and some articulated animal bone.

Nationally, we may wish to trace the direction of the spread of Beakers across Britain. Regionally, we may wish to know when Beakers first appeared in our region. Obtaining high-quality radiocarbon dates (see sections 3.2.2 and 3.2.3) for one assemblage on our site will tell us its date of deposition to within a century or two, and could place the pottery in the earlier or later part of the national currency of Beaker pottery. But, once several assemblages have been dated from our region, time-transgressive patterns at a much higher resolution will become apparent (cf Jay et al. 2019, fig. 2.1).

3.2 Identifying a pool of suitable samples

Once we have defined the objectives of our proposed dating programme, the next step is to identify a pool of samples that is potentially suitable for dating (Fig. 11).

3.2.1 Retrieving and storing samples

First, of course, it is necessary to retrieve those samples during fieldwork and to store them until they are needed for radiocarbon dating (see section 4.1).

With the advent of AMS, the concept of a radiocarbon sample fundamentally changed. The required sample size is now so small (Table 2) that it is physically possible to obtain a radiocarbon measurement on almost any organic material that is recovered during fieldwork. Consequently, all material should be collected, packaged and stored in a way that does not compromise its potential for radiocarbon dating.


TABLE 2: Guide to optimal sample size for material commonly dated by AMS (for further information you MUST contact your chosen laboratory, as both preferred and, in particular, minimum sample sizes do vary considerably by facility).

Material | Optimal sample size (before pre-treatment) | Comment
Wood (not waterlogged or charred) | 60mg | Single tree-ring from increment borer usually sufficient.
Bone & antler | 2g | Cortical bone (e.g. a long bone) is best.
Calcined bone | 4g | Single fragment of pure white bone is best.
Residues on pottery (pitch, charred food) | 50mg | 1cm² of visible residue is usually adequate.
Charred plant remains & charcoal | 60mg | A single charred cereal grain is usually adequate (c. 10mg).
Waterlogged wood | 5g | 1cm³ is usually adequate.
Waterlogged plant remains | 200mg–5g | The size needed is very variable as it depends on water content; a large macrofossil such as an alder cone is usually viable.
Organic sediment | 3g | 1cm³ is usually datable – but beware (see section 3.2.3)!
Other materials | | Contact your radiocarbon dating laboratory before submitting samples of other materials.

Dry wood

Samples of wood that is not carbonised or waterlogged are usually obtained from standing buildings, either as offcuts when parts of timbers are replaced during repair works or as cores removed by an increment borer during dendrochronology (see English Heritage 1998, section 2.2.4; Fig. 12).

Samples should be clearly labelled, and the presence of the heartwood/sapwood boundary, sapwood and waney edge/bark recorded. They should be stored in cardboard boxes or plastic bags. Samples intended for radiocarbon wiggle-matching should not be glued to wooden laths or marked-up with ink for tree-ring measurement.

Any evidence of past timber treatment should be recorded, and details of the chemicals used obtained if possible.

Bone and antler

Generally, samples of bone and antler may be washed in water, marked using Indian ink or otherwise clearly labelled, dried and stored in plastic bags or cardboard boxes (Baker and Worley 2019, 23–4).

Reconstruction of breaks using glue should be avoided. Samples intended for dating should not be chemically consolidated. Specialist advice should be obtained during the fieldwork stage of a project if consolidation is essential, so a sub-sampling strategy can be devised to retain sufficient unconsolidated material for dating.

Fragile specimens may be wrapped in aluminium foil. Especial care should be taken in recording and recovery of articulating animal bone groups (Baker and Worley 2019, 18), which are likely to be preferred for dating (see section 3.2.2).

Calcined bone

Calcined bone may also be washed in water and stored in plastic bags. Fragile bone should be protected from further fragmentation by storage in acetate boxes.

Surface and absorbed residues from pottery sherds

A variety of surface residues on pottery sherds can be dated by AMS, including carbonised food crusts (Fig. 13), sooting and decoration and repairs undertaken in pitch. Absorbed fatty acids from ceramics can also be dated.

Sherds displaying visible residues, or intended for absorbed lipid analysis, should not be washed. All these residues can be contaminated by the plasticizers used in plastic bags, bubble-wrap and the lids of some types of glass vial. Such sherds should be air-dried, wrapped in aluminium foil, clearly labelled and then stored in acetate boxes or plastic bags.

Especial care should be taken of groups of refitting sherds with ancient breaks, which are likely to be preferred for dating (see section 3.2.2).

Carbonised plant remains (including charcoal)

Carbonised plant remains, including charcoal, are generally recovered by water flotation from bulk sediment samples that have been taken according to an explicit sampling strategy (see Campbell et al. 2011, section 3). Carbonised material should be air-dried, clearly labelled and stored in glass vials, acetate boxes or plastic bags.

If the material is stored in plastic bags, care should be taken to ensure that it is not crushed during storage.

Waterlogged wood

Waterlogged wood is sometimes recovered either during excavation, or during sampling for other waterlogged plant remains or organic sediment. Samples of structural timber fall into two categories: large timbers with ring sequences that have failed to produce dendrochronological dating, which require wiggle-matching (see section 5.6); and short-lived pieces of wood.

Sampling for the first category of material should be as set out for dendrochronology (English Heritage 1998, section 2.2.5; Brunning and Watson 2010, section 3.6.5).

For the second category, ideally six pieces of datable short-lived material should be retained, taken from different elements of the archaeological structure (e.g. a wattle panel).

All samples should be kept wet in a plastic bag (or wrapped in plastic), clearly labelled using waterproof pen on waterproof labels in another plastic bag, and then wrapped in a third plastic bag. They should then be kept in a cold store or fridge until wood identification, dendrochronology and radiocarbon dating can be undertaken (Brunning and Watson 2010, section 3.8). Biocides must not be used.

Waterlogged plant remains

Waterlogged plant remains can be recovered by wet sieving of bulk sediment samples that have been taken according to an explicit sampling strategy (see Campbell et al. 2011, section 3), but they can also be retrieved from samples of organic sediment that have been taken from exposed sections or by coring. Once isolated they should be clearly labelled, stored in a small amount of water in a glass vial or plastic tube and kept in a cold store or fridge until needed for dating.

Samples should not be stored in Industrial Methylated Spirits (IMS) or alcohol.

Organic sediment

Samples from vertical sections of sediment can be taken either by hand excavation, using monolith tins, or by coring (Historic England 2015a; Fig. 14). Care must be taken to ensure that a continuous sequence of sediment is retrieved and that it is not contaminated during recovery.

It is beneficial to take an overlapping series of monolith tins or cores so that samples from undisturbed positions throughout a sequence can be obtained. If possible, a closed-chamber corer (such as a ‘Russian’ or ‘Livingstone’ corer) should be used to take two adjacent cores (no more than 0.2m apart) overlapping by half the length of the core-sections.

‘Gouge’ augers, typically with an open semi-cylindrical chamber, should not be used. However, in situations where this is unavoidable, extreme care should be taken to minimise the possibility of contamination. Similarly, when cores are taken by power augers, the holes are not sleeved and therefore contamination can be an issue.

The outer surface of core samples, which is most likely to have become contaminated during extraction, should be cleaned before packaging and storage. All samples should be located three dimensionally in relation to the local datum points e.g. Ordnance Survey grid (OSGB36; UK onshore) or UTM (WGS84) using appropriate surveying equipment (see Historic England 2015b).

Most organic sediments of Holocene date recovered from England are sufficiently well preserved that datable waterlogged plant macrofossils will be recovered from a 10mm thick slice of sediment obtained from a section, monolith tin or corer. Some sediments, however, are humified to the point that plant macrofossils do not survive. In these circumstances, radiocarbon dating of bulk sediment has to be considered (see section 3.2.3).

The materials described above constitute over 95% of the radiocarbon samples dated from England, although a wide range of other archaeological finds can be dated. These include: marine and freshwater shell, both of which require reservoir corrections (see section 1.6); skin, leather and parchment; hair, wool and horn; insect chitin; ivory; paper; and vegetable resin used as mastic for hafting stone tools.

It is also possible to date the carbon included in the steel component of some ferrous objects, and carbon dioxide fixed from the atmosphere by lime mortar as it sets. Not all laboratories date all these material types, and if you are interested in dating any of them you should obtain specialist advice before submitting samples.

This section considers samples obtained during new fieldwork. Some projects can require dating of materials that have been stored in archaeological and museum archives for many decades (Fig. 15). Such objects may have been chemically conserved, and may have been neither collected nor stored in ideal conditions. This does not necessarily mean that they cannot be successfully dated, but specialist advice in such circumstances is essential.

3.2.2 Archaeological criteria for identifying suitable samples

Once the organic material from the project has been retrieved and catalogued, the next step is to identify samples that are potentially suitable for dating (Fig. 11). This is a complex task that requires both rigorous understanding of some challenging archaeological issues (discussed in this section), and consideration of the wide range of scientific complexities that beset radiocarbon dating (see section 3.2.3).

The association between the datable material and the archaeological activity that is of interest is paramount (Waterbolk 1971). This relationship, between the dated event (e.g. the shedding of an antler) and the target event (e.g. the digging of a Neolithic ditch with that antler), is never known but is inferred from archaeological evidence (such as wear or burning on the antler). The basis of this inference, and its security, must be specifically considered for every potential sample.

The most secure association is when the datable material comes from an object that is of intrinsic interest. In this case, it would not matter if the sample was unstratified. An example is a carbonised food crust adhering to a diagnostic pottery sherd, if the objective of the dating programme is to obtain a chronology for that ceramic type.

Such cases are, however, comparatively rare. It is usually the context of the sample that is of interest: the date of the ditch, or of the site or of the associated material culture. This is even more important if you have a stratigraphic sequence of deposits that you wish to use as prior information in a Bayesian chronological model. Stratigraphy, of course, provides evidence about the relative sequence of contexts.

Radiocarbon dating does not date contexts, it dates samples. So, the calibrated radiocarbon dates can only be constrained using the stratigraphic sequence of contexts if the dated samples were freshly deposited in the contexts from which they were recovered. This is where interpretation of the taphonomy of the datable material comes into play (Fig. 16).

There is no such thing as a perfect sample for radiocarbon dating. All potential samples have their strengths and weaknesses, and a key part of sample selection is to assess the risk of submitting an item for dating. The crucial archaeological interpretation is to establish whether a potential sample is likely to have been residual (or, less frequently, intrusive) in the context from which it was recovered. This can be inferred with varying degrees of confidence.

Archaeological association

There are many types of evidence that can be considered in assessing sample taphonomy, most of which rely on the results of other archaeological analyses and assessments (faunal, geoarchaeological, environmental, etc.). The availability of the wide range of information that is necessary for the selection of samples for radiocarbon dating is a major constraint in timetabling dating in the overall project programme (see section 4.6).

In most studies, dating a sample is a means to date a context. In such cases, the vast majority of samples submitted for radiocarbon dating from England can be included in the following taphonomic categories, which are listed in roughly descending order of reliability:

a) Bones found in articulation and recorded in the ground as such (Fig. 17a). These samples would have been still connected by soft tissue when buried and hence from people or animals that were not long dead.

b) Articulating bones identified as such during specialist analysis (Fig. 17b). These samples could have been articulated in the ground (but not recognised as such) or have only been slightly disturbed before burial. The presence of more than one bone from the same individual provides evidence that such samples are close in age to their contexts. The security of this inference increases as the number of articulating bones increases. Occasionally only one of the bones is present, but the condition of its articular facet suggests that the articulating bone was present in the ground.

c) Bones with refitting unfused epiphyses identified during specialist analysis (Fig. 17c; see b, above).

d) Food residues from groups of refitting pottery sherds or from a group of sherds from a single vessel (Fig. 17d). Carbonised residues on the interior of the vessel probably represent charred food (rather than sooting). As the sherds refit or much of a pot survives, the vessel has a good chance of being in the place where it was originally discarded.

e) Calcined bone from distinct individuals (human or animal) and carbonised plant remains from cremation deposits (Fig. 17e).

f) Wood used in the construction of archaeological structures (e.g. waterlogged hurdles, charred posts, timbers from standing buildings; Fig. 17f).

g) Carbonised plant remains functionally related to the context from which they were recovered (e.g. charcoal from a hearth or kiln; Fig. 17g).

h) Antler tools discarded on the base of ditches and other negative features (Fig. 17h), thought to be functionally related to the digging of the features (e.g. in a flint mine). This inference is most secure when the tine is embedded in the base of the cut, but could be based on use-wear such as battering on the posterior side of the beam/burr/coronet.

i) Waterlogged plant remains from archaeological contexts (e.g. a well). These are probably in the place where they were originally deposited, or they would not have remained waterlogged and survived (Fig. 17i).

j) Single fragments of short-lived carbonised plant material from coherent, often friable or ashy, dumps of charred material: inferred on the basis of their coherence and fragility to be primary disposal events (e.g. ‘placed’ deposits in pits; Fig. 17j).

k) Paired bones (usually from different sides, e.g. left and right ulnae) thought to be from a single individual on the basis of size, morphology, etc. (Fig. 17k; see c, above) but less secure.

l) Grave goods, which must have been in circulation at the time of burial but may have had a history of use before deposition (Fig. 17l).

m) Disarticulated human bones from burial monuments, which are probably functionally related with the site, even if they do not necessarily represent primary deposition (Fig. 17m).

n) Material from ‘occupation’ spreads: samples that can be related to human activity (e.g. cut-marked bone, calcined animal bone) can be more secure than those that might derive from previous natural events (e.g. charcoal) (Fig. 17n).

o) Food residues from single pottery sherds (see d, above) but less secure (Fig. 17o). The inference that the sherd is not residual is based on the fragility of the pottery concerned.

p) Material derived from the postholes of timber buildings; on the basis of experimental archaeology (Reynolds 1995), putatively derived from the occupation of the structure (Fig. 17p).

q) Well-preserved disarticulated animal bones: submitted on the basis that the latest date from a group of measurements should provide a terminus post quem that is (hopefully) not too much earlier than the actual date of interest (e.g. multiple dates from basal fills of field boundaries) (Fig. 17q).

This list is not comprehensive, but it does give an indication of the range of issues that should be considered when assessing the relationship between the target event and the dated event of potential radiocarbon samples. Other material that has a high chance of being residual, such as disarticulated bones from the upper fills of features, or low densities of charred plant material that has been retrieved by processing large environmental samples, is rarely suitable for dating.

The golden rule is that every potential sample should be considered residual unless there is a plausible argument showing that it was freshly deposited in the context from which it was recovered.

Taphonomy of organic sediments

So far, we have considered only those samples that derive from archaeological excavations. Many projects, however, are concerned with samples of bulk organic sediment from sequences used for environmental reconstructions. The taphonomy of the material within these samples can be complex. The question we have to ask is the same, however: is the carbon that will be dated from the sediment in situ, and is it directly related to the past event of interest?

Several fractions may be dated from bulk sediment:

  1. Identifiable waterlogged plant macrofossils; thought to be from plants that grew on or around the sampled site as the sediment accumulated.
  2. ‘Fulvic acid’ fraction of bulk sediment: this is the acid soluble fraction and is often too recent. It is no longer dated routinely, although measurements on this fraction can be found in the literature or undertaken for experimental reasons.
  3. ‘Humic acid’ fraction of bulk sediment: this is the acid insoluble, alkali soluble fraction. It is thought to derive from the decay of plant material that grew on the site as the sediment accumulated.
  4. ‘Humin’ fraction of bulk sediment: this is the acid and alkali insoluble fraction. It is thought to consist of the physical remains of the plant material that grew on the site.
  5. ‘Total organic’ fraction of bulk sediment: this is the solid fraction that remains after the acid soluble fraction (‘fulvic acid’) has been removed. It consists of the ‘humic acid’ and ‘humin’ fractions combined.
  6. Bulk samples of microfossils (e.g. foraminifera, pollen).

There are risks inherent in dating any of these materials. The likelihood that the datable remains in the sediment grew in situ on the wetland surface, or were incorporated from plants growing on the contemporary landsurface, must be assessed by careful consideration of their context.

  • What is the lithology and geomorphology of the site?
  • Are the sediments horizontally bedded?
  • Is the wetland an ombrotrophic mire, or a minerotrophic fen or marsh?
  • If the wetland is fed by run-off, then what else could have washed in?
  • Are there exposed coal measures or peat deposits, for example, farther upstream?
  • If there are plant macrofossils in the profile, what species are present? What are their characteristics?
  • Do they have invasive roots (e.g. Phragmites sp.)? If so, are we sure that the isolated material is not root, or has not been pushed down into earlier sediments from above?
  • What is the organic content of the sediment?
  • What is its pH?

Once more, there are no perfect samples. The object of this deliberation is to select for dating the fraction or material from within a sediment which most accurately reflects the age of its deposition (see section 3.2.3).

Single-entity dating

The imperfection of almost all potential radiocarbon samples brings us to the need for single-entity dating (Ashmore 1999). This is a strategy that minimises the risk that the submitted sample will contain residual or reworked material, by dating material that certainly derives from a single organism (e.g. a single cereal grain).

We can examine this strategy using simple statistics. Consider, for example, a deposit where 1 in 10 of the recovered short-life charred plant remains are residual. Imagine, perhaps, that we have excavated a malting kiln containing charred barley from its final firing. Most of the barley grains will have come from that firing, but a small proportion could derive from previous firings or the clay fabric of the collapsed kiln itself and thus be earlier.

If we date a single grain from this deposit, the radiocarbon date will have a 90 percent chance of dating to the time when the context was formed and a 10 percent chance of being earlier. If we obtain two radiocarbon dates from this deposit, each from a single grain, then there will be a 99 percent chance that at least one of the two dates will relate to the time when the deposit was formed.

If, however, we bulk those two grains together and obtain one radiocarbon date, then there will be a 19 percent chance that at least one of those grains is residual and so the radiocarbon date is earlier than the time when the deposit was formed.

The greater the number of items that are bulked together, the lower the probability that the sample will contain only freshly deposited material. If 10 seeds were to be bulked together for dating from this deposit, then there would be a chance of only about 1 in 3 (0.9 to the power of 10, or roughly 35 percent) that the resultant radiocarbon date would accurately date the formation of the context. Obviously, the scale of the offset will depend on the actual proportion of residual material in a sample and its date in relation to the time when the deposit from which it was recovered was formed.
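The arithmetic behind the kiln example can be checked with a few lines of code. The following is a minimal sketch, assuming that each grain is residual independently of the others with the same probability (0.1 in the example); the function names are illustrative only.

```python
def p_all_fresh(n_items: int, p_residual: float) -> float:
    """Chance that a bulked sample of n_items contains only freshly deposited material."""
    return (1.0 - p_residual) ** n_items


def p_at_least_one_fresh(n_dates: int, p_residual: float) -> float:
    """Chance that at least one of n_dates single-entity dates is on fresh material."""
    return 1.0 - p_residual ** n_dates


P_RESIDUAL = 0.1  # 1 in 10 grains residual, as in the kiln example

# One single-grain date: chance it dates the formation of the context.
print(f"single grain is fresh:           {p_all_fresh(1, P_RESIDUAL):.2f}")
# Two single-grain dates: chance that at least one dates the context.
print(f"at least one of two is fresh:    {p_at_least_one_fresh(2, P_RESIDUAL):.2f}")
# Two grains bulked: chance that the sample contains residual material.
print(f"bulk of 2 contains residual:     {1.0 - p_all_fresh(2, P_RESIDUAL):.2f}")
# Ten grains bulked: chance that the sample is free of residual material.
print(f"bulk of 10 is entirely fresh:    {p_all_fresh(10, P_RESIDUAL):.2f}")
```

Changing P_RESIDUAL shows how quickly the reliability of a bulk sample falls as the proportion of residual material in a deposit rises.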

Not all bulk samples necessarily contain residual material. For example, the dating of multiple single fragments of short-life charcoal from a fired feature, such as a hearth, will often give results that are statistically consistent both with each other and with a measurement on a bulk sample of short-lived material from the same context, although this is not always the case (cf Tintagel Castle, Bayliss and Harry 1997). With the routine availability and increasing precision of AMS, however, the submission of bulk samples where they can be avoided is an unnecessary risk.

There are still, however, a few situations where it could be necessary to submit bulk materials.

  • Food residues from ceramic sherds probably derive from meals that contained several ingredients, and so, by definition, such residues are not single-entities. Carbonised residues probably relate to the last use of the vessel, but lipids can accumulate during the time when the vessel was used.
  • Waterlogged plant macrofossils and bulk sediment are more problematic. Much of the weight of waterlogged material (sometimes 80%) is water, and so to obtain enough carbon for dating even by AMS it is often necessary to bulk together several plant macrofossils (e.g. seeds of the same species).
  • Carbonised plant remains that are too small for single-entity dating (e.g. cereal glume bases).
  • Microfossils (such as foraminifera, pollen and most species of ostracod) again have to be bulked to provide enough carbon for dating even by AMS.
  • Fractions of bulk sediment, by definition, derive from multiple sources.

Figure 18 is a flow diagram that provides a step-by-step guide to assessing the archaeological suitability of potential samples for radiocarbon dating.

3.2.3 Scientific criteria for identifying suitable samples

In the previous section, we have considered the first basic criterion that a sample must meet before it is considered suitable for dating: that it must be securely associated with the archaeological activity that is of interest. There are two other criteria that a potential sample must meet, however, before it can be considered for dating. These are considered in this section.

First, the carbon in the sampled organism must be in equilibrium with the carbon in the atmosphere (or some other well-characterised reservoir) at the time when the organism died.

Old-wood effect

By far the most common source of error of this type is the old-wood effect, where dates are obtained on wood or charcoal from a long-lived tree. The carbon in a tree-ring dates from the year in which that tree-ring was laid down, and so the carbon from the centre of a 300-year-old oak tree will be 300 years old while that tree is still growing — this is why wood dated by dendrochronology can be used to construct a radiocarbon calibration curve.

For this reason, all samples of wood and charcoal should consist of:

  • material from a known position in a tree-ring series (such as rings sampled for wiggle-matching),
  • twigs or the outer rings of the tree (a single year’s growth is optimal, but the number of growth rings to bark should be recorded if a single growth-ring cannot be isolated),
  • if twigs are not available, samples have to be taken from short-lived species (e.g. Corylus avellana) or branch-wood, although in this case a wood-offset of a few decades cannot be ruled out, and sophisticated mathematical approaches will be required to utilise the resulting measurement (see section 2.2.2).

All samples of wood or charcoal must be aged and identified to the highest taxonomic level possible by a suitable specialist before submission for dating.

Figure 19 is a flow diagram that provides a step-by-step guide to assessing the scientific suitability for radiocarbon dating of carbonised plant material that has passed the steps illustrated in Figure 18; and Figure 20 provides a similar flow diagram for assessing the suitability of waterlogged plant material.

Other age-at-death offsets

Sooting on pottery sherds usually derives from the fuel used on domestic hearths during cooking.

If this was wood derived from relatively short-lived material (e.g. branches collected from hedgerows), then there is unlikely to be a significant offset.

If the fuel used was peat, or constructional timber from a recently-demolished building, then more substantial offsets are likely.

Unfortunately, the fuel used on the fire that left a soot deposit on a sherd is rarely known, and so the scale of any potential offset is also unknown. It is for this reason that internal carbonised residues on pottery sherds (which likely derived from carbonised food) are generally preferred for radiocarbon dating, although, of course, external residues that have been chemically characterised as food crusts are also suitable.

Decoration or repairs on pottery vessels in pitch, which is derived from wood resin, provide dates that are in equilibrium with the contemporary atmosphere; but decorations or repairs in bitumen or coal-tar, which are derived from fossil carbon, do not.

Calcined bone can also exhibit an age-at-death offset derived from the incorporation of carbon from the pyre fuel during the cremation process (Snoeck et al. 2014). The scale of offsets of this kind is currently uncertain, as is their prevalence in the past.

Most pairs of measurements on calcined bone and on short-lived charcoal from the same cremation deposit undertaken so far seem to be statistically consistent (Lanting et al. 2001), and so significant age-at-death offsets in prehistoric cremation deposits seem uncommon in practice (but see Olsen et al. 2012).
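Statistical consistency between a pair of measurements of this kind is commonly assessed with a chi-squared test on the radiocarbon ages. The sketch below assumes the test of Ward and Wilson (1978) at the 5% significance level, which the text above does not specify; the ages and errors used are hypothetical and are not taken from any real cremation deposit.

```python
# 5% critical values of the chi-squared distribution, indexed by degrees of freedom.
CHI2_CRIT_5PCT = {1: 3.84, 2: 5.99, 3: 7.81, 4: 9.49}


def consistency_test(ages, errors):
    """Return (pooled mean, test statistic T, consistent at 5%?) for radiocarbon ages in BP."""
    if len(ages) != len(errors) or not 2 <= len(ages) <= 5:
        raise ValueError("supply between 2 and 5 ages with matching errors")
    # Weight each measurement by the inverse of its variance.
    weights = [1.0 / e ** 2 for e in errors]
    pooled = sum(w * a for w, a in zip(weights, ages)) / sum(weights)
    # T is the sum of squared deviations from the pooled mean, scaled by variance.
    t_stat = sum((a - pooled) ** 2 / e ** 2 for a, e in zip(ages, errors))
    degrees_of_freedom = len(ages) - 1
    return pooled, t_stat, t_stat <= CHI2_CRIT_5PCT[degrees_of_freedom]


# Hypothetical pair: calcined bone and short-lived charcoal from one deposit.
pooled, t_stat, consistent = consistency_test([3510, 3470], [30, 30])
print(f"pooled mean {pooled:.0f} BP, T = {t_stat:.2f}, consistent: {consistent}")
```

A value of T above the critical value would suggest that the two samples are not of the same radiocarbon age, which for calcined bone could indicate an offset from the pyre fuel.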

Figure 21 is a flow diagram that provides a step-by-step guide to assessing the scientific suitability for radiocarbon dating of calcined bone samples that have passed the steps illustrated in Figure 18.

Age-at-death offsets can affect bone from older individuals of species that live for some decades. The offset arises from the time it takes carbon from the diet to be incorporated into bone collagen.

As individuals become older, the average difference between the radiocarbon age of the carbon in the bone collagen and the carbon in the contemporary atmosphere becomes greater, particularly in men (Hedges et al. 2007). Given life-expectancy in the past, bone turnover offsets are unlikely to be of practical relevance except for the most high-precision applications.

Other effects that can complicate the relationship between the carbon absorbed by the sampled organism in life and the contemporary atmosphere are isotopic fractionation, which should be dealt with by age-calculation (see section 1.4), and reservoir effects (see section 1.6).

Reservoir effects

As described above, the best policy for dealing with samples that exhibit reservoir effects is avoidance. This means that animal bones should be identified before dating to ensure that they come from a terrestrial mammal.

In England, there will almost always be suitable material of fully-terrestrial origin, which can be dated in preference to a sample from a non-terrestrial reservoir. In those cases where samples from a non-terrestrial reservoir do seem to be the best material available, specialist advice should be sought.

Some information is available about the marine reservoir of English coastal waters, and so samples of local marine origin, such as shellfish and some foraminifera, can be dated, and the resultant measurements calibrated using the internationally agreed marine calibration curve (Marine20) and an appropriate local ΔR correction (see section 1.6). The error on the ΔR correction compounds that on the radiocarbon measurement on the sample itself, so the resultant calibrated date is less precise than would be the case with a measurement on a contemporary terrestrial sample.
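The way these two uncertainties compound can be sketched numerically. The snippet below assumes that the measurement error and the ΔR error are independent and so combine in quadrature; it ignores the uncertainty on the calibration curve itself. The function name and the example values are hypothetical and are not taken from any real sample or published ΔR value.

```python
import math


def combined_marine_error(measurement_error_bp, delta_r_error_bp):
    """Combine two independent 1-sigma errors (both in BP) in quadrature."""
    return math.hypot(measurement_error_bp, delta_r_error_bp)


# Hypothetical values: a measurement error of 30 BP and a delta-R error of 40 BP.
total = combined_marine_error(30.0, 40.0)
print(f"Combined error: ±{total:.0f} BP")
```

With these illustrative inputs the combined error is larger than either component, which is why a marine sample calibrates less precisely than a terrestrial sample measured to the same laboratory error.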

Other samples of marine origin can be more problematic. Most fishing appears to have been in-shore until about AD 1000, and trading in preserved fish was, as far as is known, of modest scale. Consequently, it will usually be valid to calibrate results on earlier fish bone using a local marine correction. For later fish remains, however, which could come from deep-water fisheries or from traded salt-fish, it is sometimes difficult to know which ΔR correction is appropriate.

Similarly, marine mammals can range widely, and it is again difficult to know which ΔR correction is appropriate. Food residues from pottery sherds can also potentially derive from marine sources.

In contrast, little is known about reservoir effects in freshwater and estuarine conditions in England. So, if materials from these reservoirs are selected for dating, it is necessary to measure the local reservoir offset as part of the study. Specialist advice should be sought in these circumstances.

Hard-water offsets can occur, not just in freshwater fish and shells, but also in food residues from pottery sherds. The most common type of material encountered where freshwater reservoirs can be an issue is waterlogged plant macrofossils from submerged plants, for example Potamogeton.

Animals that rely on freshwater resources, such as beaver or waterfowl, can also exhibit a freshwater reservoir effect offset. Again, avoidance is the best policy, and material from fully-terrestrial or emergent plants or from terrestrial animals should be isolated and dated wherever possible. Hard-water offsets can also occur in results on bulk fractions of organic sediments, where the sediments were made up of submerged plants. In this case, the potential presence of an offset can be indicated by an enriched δ13C value.

Correcting for dietary offsets in bone samples is also difficult, largely because of uncertainties in estimating the proportions of different food sources in past diets accurately from stable isotopic values (Fig. 22). This has been done most convincingly where non-terrestrial dietary components are large, or where there is a restricted range of food sources (e.g. Arneborg et al. 1999).

Modest offsets from small (<10%) components of non-terrestrial foods are very difficult to identify and quantify accurately. Dietary offsets in bone apatite derive from whole diet, and so potentially are much lower than those from bone collagen, which derives mainly from the protein component of diet.

In England, significant dietary offsets are rare in human bone before the medieval period, and even then are by no means universal. The presence of a marine component in the diet can be indicated by enriched δ13C and δ15N values, and enriched δ15N values can indicate the presence of a freshwater fish component (although the interpretation of these values is particularly complicated, and there can be other explanations of such values). If elevated δ13C and δ15N values (above c. −18.0‰ and +12.0‰) are encountered when dating human bone, specialist advice should be sought.
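As a rough illustration, this screening rule could be applied to reported stable isotope values as follows. The sketch assumes that a sample is flagged when either value exceeds its approximate threshold, which is a conservative reading of the guidance rather than a stated rule; the function name and the example values are hypothetical.

```python
def needs_specialist_advice(delta13c, delta15n,
                            c13_threshold=-18.0, n15_threshold=12.0):
    """Return True if a human bone sample has elevated stable isotope values.

    Values are in per mil. Assumption: EITHER elevated value triggers the
    flag, because elevated d15N alone may indicate freshwater fish in the diet.
    """
    return delta13c > c13_threshold or delta15n > n15_threshold


# Hypothetical reported values (d13C, d15N) for two burials.
print(needs_specialist_advice(-20.5, 9.8))
print(needs_specialist_advice(-17.2, 13.1))
```

A flag of this kind only prompts a conversation with a specialist; it does not estimate the size of any dietary offset.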

Figure 23 is a flow diagram that provides a step-by-step guide to assessing the scientific suitability for radiocarbon dating of animal and human bone that has passed the steps illustrated in Figure 18.

Natural contamination

The second scientific criterion a sample must meet if it is to be considered suitable for radiocarbon dating is that it must not be contaminated by any other carbon-containing material. This is impossible in practice, as the climate of England is damp and so, at the very least, the organic component of groundwater will have added contaminants to the sample.

The principal contaminants are dissolved carbonates from bedrock, and fulvic and humic acids, which arise from the decay of organic matter in soils. This is why the pretreatment of almost all samples begins with an acid step (see section 1.2), to remove dissolved carbonates (which are of geological age) and fulvic acids (which are usually more recent as they are often mobile in groundwater).

Humic acids are generally less mobile in groundwater than fulvic acids and, as they arise from the decay of organic matter, are frequently of the same age as the sample to be dated. This is often the case, for example, with fragile carbonised plant remains that are only given ‘an acid wash’ in the laboratory. In this case both the carbonised material and the humic acid complexes within it that remain after the acid step are dated. Humic acids are, however, soluble in alkali and can be mobile on alkaline geologies where anomalously young ages can occur (e.g. OxA-11663 from Silbury Hill, Wiltshire; Marshall et al. 2013, table 4.1).

Targeting material for dating from organic sediments

The potential mobility of humic acids is thus a material consideration in choosing the best material to date from organic sediments. This is a complex issue, and there is no single best solution. The choice of material to date from a sediment is affected by its preservation, geology and hydrology.

The material of choice is a single-entity waterlogged terrestrial plant macrofossil (e.g. an alder cone). This is based on the principle that dates on terrestrial plant macrofossils are generally more reliable than those on ‘bulk’ samples of the sediment matrix, as the source of carbon in the former is known and, in a single macrofossil at least, is not made up of heterogeneous material that could be of different ages (Walker et al. 2001).

Waterlogged plant macrofossils are generally fragile and do not usually survive reworking, but they are not entirely unproblematic. It is possible both for earlier material to be in-washed and for later material to be pushed down from above. This can be investigated by dating more than one sample from a key horizon (see section 3.3.2 below).

Phragmites australis is often chosen as a suitable single entity for radiocarbon dating, as it is a marginal aquatic plant whose growth above ground occurs mostly in air rather than in water, and its rhizomes are readily recognisable. However, the rhizomes are far-creeping and the roots often reach to considerable depth, so a Phragmites sp. culm base or rhizome can be considerably younger than the sediment in which it occurs. For this reason, it is preferable to choose horizontally bedded leaves and/or stems, even if they cannot be precisely identified, as there is a better chance they will be the same age as the deposit in which they are found.

Alternatively, taxa such as Schoenoplectus spp. and Cladium mariscus can be used as these have shallower roots and short creeping rhizomes. Caution should also be exercised when using seeds that are dispersed by water. These can travel some distance before being deposited and thus can be reworked. Twigs are generally more robust, and can also survive reworking.

Most radiocarbon laboratories, however, require at least 60mg of waterlogged plant material for dating, and so most waterlogged macrofossils recovered from sediment are too small for dating on their own.

When no macrofossils large enough for single-entity dating are found in a sediment, or where these are so atypical that there must be a concern that they are exogenous to the sediment, a number of macrofossils can be bulked together for dating. This introduces the risks of bulk samples (see section 3.2.2), but again the source of the carbon dated is known. Experience has shown that bulking together a large number of macrofossils of the same kind (e.g. birch seeds) can be better than bulking together the remains of heterogeneous species, as it is more likely that the latter will include intrusive/reworked material.

Strenuous efforts must be made to isolate macrofossils before the dating of bulk sediment is considered (c. 80% of organic sediments of Holocene date from England do contain macrofossils). If identifiable plant macrofossils do not exist within the sediment, it is advisable to sample elsewhere.

If macrofossils are still not preserved, then it will be difficult and expensive to obtain a reliable chronology for the sequence. In these circumstances the importance of the information contained in the deposits must be considered. Will the resources, probably considerable, necessary to provide an accurate chronology for the sediment sequence be justified by the importance of the environmental/geoarchaeological record?

If dating is still considered to be merited, then it is desirable to obtain large bulk sediment samples (which can be homogenised before dating).

Can test pits be dug, or an open section sampled?

If coring is necessary, can a wide-diameter corer with sleeved liners reach the required depth? If this is not possible, then it is necessary to proceed on the basis of the quantity of sediment available.

If the amount of material at a given depth is large enough, it can be split and half sieved in water in an attempt to retrieve macrofossils for dating (the remainder surviving for bulk sediment dating and other analyses).

The addition of chemicals such as calgon, sodium bi-carbonate, sodium hexametaphosphate, tetra sodium pyrophosphate decahydrate and hydrogen peroxide to sediment that is slow to disaggregate while wet sieving — to aid the identification of macrofossils — does not preclude their subsequent accurate radiocarbon dating.

If only a very small amount of material is available, either the 10mm slice above or below the horizon of interest can be sieved to assess the likelihood that macrofossils will be found, or the horizon itself can be sieved in water and the residue retained for bulk dating if no macrofossils are recovered.

In theory, if the rationale for dating the ‘humic acid’ and ‘humin’ fractions of bulk organic sediment outlined above holds true in practice (see section 3.2.2, fractions 3 and 4), then replicate measurements of these fractions on the same sample should usually be statistically consistent (ideally in 19 out of 20 cases).

Reality is illustrated in Figure 24, where only 11 out of every 20 cases produce statistically consistent measurements. There is a clear tendency for the ‘humic acid’ fraction to be younger than the ‘humin’ fraction (on average by 86±4 BP). Where the two fractions of a sample give statistically consistent results, our confidence that the radiocarbon dates reflect the time of sediment accumulation is greater. But this does not tell us which, if either, fraction accurately dates the deposition of the sediment, when the measurements on the two fractions diverge.
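A check of statistical consistency on a humic acid and humin pair can be sketched as follows. This assumes the chi-square test of Ward and Wilson (1978), which is widely used for comparing radiocarbon measurements; the text does not state which test underlies Figure 24. The function name and the measurements are hypothetical.

```python
def consistency_test(measurements, critical_value=3.84):
    """Test whether radiocarbon measurements are statistically consistent.

    measurements: list of (age_bp, error_bp) tuples.
    critical_value: chi-square 5% critical value. The default (3.84) is for
    one degree of freedom, i.e. a PAIR of measurements; larger groups need
    the value for n - 1 degrees of freedom.
    Returns (pooled_mean_bp, test_statistic, is_consistent).
    """
    weights = [1.0 / err ** 2 for _, err in measurements]
    pooled = sum(w * age for w, (age, _) in zip(weights, measurements)) / sum(weights)
    statistic = sum(((age - pooled) / err) ** 2 for age, err in measurements)
    return pooled, statistic, statistic <= critical_value


# Hypothetical humic acid / humin pairs from two levels (age BP, error BP).
print(consistency_test([(4500, 30), (4520, 30)]))
print(consistency_test([(4400, 30), (4520, 30)]))
```

A pair that passes the test supports the view that both fractions date sediment accumulation; a pair that fails shows only that at least one fraction is inaccurate, not which one.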

In selecting which fraction to target for dating, the geology and hydrology of the site are key. If the site is on an alkaline substrate (e.g. chalk), then there is a risk that results on humic acids will be anomalously young (especially if the sediment is early Holocene in age).

Catchments with coal measures or older peat deposits that can be incorporated into sediments through erosion and run-off run the risk that results on the humin fraction can be anomalously old and thus dating the humic acids would be preferable. Whichever approach is adopted, adequate replication is essential (see section 3.3.2). It should not be necessary to date the total organic fraction (i.e. bulk the bulk fractions!) using AMS.

This discussion illustrates the difficulties of dating organic sediments. In most circumstances, accurate dating can be achieved, but it is necessary to carefully consider the context, geomorphology and stratigraphic relationships between replicate measurements in order to construct such chronologies and identify inaccurate dates.

Samples of single waterlogged plant macrofossils are the material of choice and, where necessary, these can be bulked together to provide sufficient material for dating. Figure 25 provides a flow diagram that can aid in these difficult site-specific choices.

Bone diagenesis

The burial environment also degrades bone samples. As collagen decays, its strands untwist and become vulnerable to contamination by humic acids. Laboratory pretreatment aims to retrieve collagen or clean gelatin for dating. This is particularly difficult for samples with low collagen levels, where most of the protein content of the bone has decayed, and so most laboratories utilise methods of assessing whether the protein is sufficiently well preserved for accurate dating (usually C:N ratios, %C, %N or percentage yield by weight).

Generally, bone collagen preservation is higher in cortical bone (e.g. a femur) or in tooth dentine where the protein has been protected by the surrounding enamel.

Collagen preservation of some bones, in England usually those from sites on acid substrates, is simply not adequate for radiocarbon dating. But even on these sites, a small proportion of bones could be better preserved owing to local variations in the burial environment. In these cases, it can be worth pre-screening samples for protein preservation using %N measurements on whole bone (Brock et al. 2010a). This involves drilling a small amount of bone powder (c. 5mg, or a small pinch of salt) from each bone and measuring its %N content in a conventional mass spectrometer. Bones with more than 0.76%N have an 84% chance of successful dating. As no chemical pretreatment is required, costs are modest, and so large numbers of bones can be pre-screened so that the small proportion that are datable can be identified.
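The pre-screening step amounts to a simple filter on the measured %N values. The sketch below uses the 0.76%N threshold quoted above; the function name, bone identifiers and %N values are hypothetical.

```python
def prescreen_bones(percent_n_by_bone, threshold=0.76):
    """Return the IDs of bones whose whole-bone %N exceeds the threshold.

    The result is ordered from best to worst preserved, so that the most
    promising bones can be considered for dating first.
    """
    passed = {bone: n for bone, n in percent_n_by_bone.items() if n > threshold}
    return sorted(passed, key=passed.get, reverse=True)


# Hypothetical %N measurements on bone powder from four bones.
measured = {"bone_A": 0.31, "bone_B": 1.42, "bone_C": 0.80, "bone_D": 0.76}
print(prescreen_bones(measured))
```

Passing the screen improves the odds of a successful date but does not guarantee one, so the laboratory's own quality checks on the extracted collagen still apply.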

Recently an entirely non-destructive technique, near-infrared spectroscopy, has been shown to similarly assess the collagen content of bone samples (Sponheimer et al. 2019).

Dating collagen from charred bone does not usually produce accurate radiocarbon dates. This is because the charring process in effect accelerates the degradation of bone collagen and makes it particularly susceptible to contamination by humic acids. Similarly, bone apatite that has been insufficiently calcined can also produce inaccurate results. This can be assessed on the basis of colour before submission for dating: white calcined bone should be selected in preference to grey or blackened material.

In the dating laboratory a variety of tests can also be employed — the organic content of the sample, the crystallinity index or the splitting factor — to assess the suitability of a sample for accurate dating (Van Strydonck et al. 2010).

Anthropogenic contamination

The contamination so far discussed derives from the natural environment, but we also have to consider anthropogenic sources of contamination. Some of these are unavoidable, such as samples derived from ground contaminated by past industrial uses or timber that has undergone wood treatment during past structural maintenance; others are accidental, such as fuel leaks from on-site generators or water-pumps; and still others are introduced inadvertently by archaeologists during sample retrieval, processing, storage, packaging and conservation.

Obviously, it is better if a sample is not contaminated in the first place. But where such material does need to be dated, the critical factors are the nature of the material to be dated and the type of contaminant present.

Situations where the contaminant is chemically the same as the sample to be dated are the most problematic (for example, modern cigarette ash in carbonised plant material, animal-bone glue coating bones, or algae growing on waterlogged plant remains).

It is also difficult to deal with samples that are contaminated by unspecific cocktails of chemicals, such as IMS. It is, however, often possible to at least attempt to date samples that have undergone consolidation with Polyvinyl Acetate (PVA) or cellulose nitrate (for bones) or Polyethylene Glycol (PEG) (for waterlogged wood).

There will always be more concern about the reliability of a radiocarbon date on a contaminated sample than would be the case for an uncontaminated sample. A larger sample is often required, and the laboratory procedures necessary are aggressive and non-trivial. Where such contamination is suspected, it is essential that as much information as possible is gathered about the chemical(s) that might have been used, and that the proposed analysis is discussed with the radiocarbon dating facility before samples are submitted.

3.3 Statistical simulation and sample selection

Sample selection needs to balance the risks of dating a sample or series of samples, against the probability of achieving the objectives of the dating programme. The aim is to minimise the risk and the cost of the dating programme, while maximising the information gain. Sometimes suitable samples are not available, and the temptation to submit inferior material for dating should be resisted. Dates on such samples almost always mislead more than they inform, and hamper the understanding of past chronologies (e.g. Darvill and Wainwright 2009).

3.3.1 How many samples?

First, you need to estimate how many samples you need to date to achieve the objective of the dating programme to the desired resolution. This is done by running a series of simulations covering a representative range of the likely outcomes of the dating programme.

The following needs to be defined:

  1. the prior information relevant to the problem that can be included in the model (see 2.2.1);
  2. the pool of samples that are potentially suitable for dating (see 3.2), and their relationships to that prior information;
  3. the error terms that are likely to be returned by the selected radiocarbon facility, given the likely age and material of the samples to be submitted; and
  4. a representative range of scenarios for the likely actual calendar dating of the problem under consideration.

For site-based studies a Harris matrix of the samples that are potentially suitable for dating, or a schematic diagram showing them with the site phasing, is often helpful. This information then needs to be combined into a simulation model.

Simulation models vary in complexity. The simplest involves simulating a single date. For example, is it worth dating a carbonised food residue on a pottery sherd that is typologically known to be ‘early Anglo-Saxon’ (c. AD 420–700)?

We can simulate the calibrated date we would get if we submit a single sample from this residue for radiocarbon dating (Fig. 26). In this case, the inputs into the simulation are our expected dating and an anticipated error on an AMS measurement of this age of about ±25 BP.
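The core of such a simulation can be sketched as below: the radiocarbon age expected for the assumed calendar date is read from the calibration curve, and random scatter is added to mimic a real measurement. This is only an illustration of the principle, not the routine used by any particular calibration program. The function name is hypothetical, the curve value and its error must be looked up by the user, and the numbers in the example are placeholders rather than real calibration data. Calibrating the simulated result is not shown.

```python
import math
import random


def simulate_radiocarbon_age(curve_age_bp, curve_error_bp, lab_error_bp, rng):
    """Draw one simulated measurement for a sample of assumed calendar date.

    curve_age_bp, curve_error_bp: calibration-curve radiocarbon age and its
        1-sigma error at the assumed calendar date (supplied by the caller).
    lab_error_bp: anticipated 1-sigma error quoted by the laboratory.
    Returns (simulated_age_bp, lab_error_bp).
    """
    # Assumption: curve and laboratory scatter are independent.
    scatter = math.hypot(curve_error_bp, lab_error_bp)
    return rng.gauss(curve_age_bp, scatter), lab_error_bp


rng = random.Random(1)  # fixed seed so that the sketch is repeatable
# Placeholder curve reading (NOT a real calibration-curve value).
age, error = simulate_radiocarbon_age(1500.0, 10.0, 25.0, rng)
print(f"Simulated result: {age:.0f} ± {error:.0f} BP")
```

Repeating the draw many times for each assumed calendar date shows how much the calibrated outcome can vary purely from statistical scatter.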

This simulation illustrates two points. Firstly, it tells us that we can expect much better precision (to within a century) if the sample is later 6th or 7th century. If the sample is earlier than this, then a radiocarbon date will simply tell us that the sherd was used in the 5th or early 6th century AD. Archaeological judgement will tell us whether this precision is useful for the problem under consideration. Secondly, it shows the risks of submitting single samples. Simply from the expected statistical scatter on radiocarbon dates, some of the time the actual date of a sample will lie on the limits of a calibrated radiocarbon date (e.g. the simulation at AD 460, where the true date actually lies on one of the smaller, later humps of the probability distribution rather than on the larger hump in the early 5th century). Potentially, single dates can mislead.

To take a slightly more complex example, we have a small, single-phase, late Iron Age farmstead and would like to know, to within a century, when it was occupied and for how many generations. Bone is not preserved, so we are reliant on dating carbonised plant remains from a variety of fired features on the farmstead. How many samples do we need to date to obtain the required resolution?

In this case, our prior information is that all the samples derive from a period between when the farmstead was established and when it was abandoned. We have many potential short-life, single-entity samples, which should provide error terms of about ±30 BP by AMS. Say the site was occupied for 40 years in the last decades of the 1st century BC and we take two samples from each of six fired features (i.e. 12 samples in total). We get a model of the form shown in Figure 27.

This tells us that the site was established in 105 cal BC–cal AD 5 (95% probability; start farmstead; Fig. 27). This range covers 110 years and includes the actual date input into the simulation (40 BC). It estimates that the site ended in 45 cal BC–cal AD 65 (95% probability; end farmstead; Fig. 27). This range also covers 110 years and also includes the actual date input into the simulation (1 BC).

In neither case does the model estimate the key parameters to the desired level of precision. So, we add two more simulated dates from another feature and rerun the model to see how far the precision obtained improves.

Ultimately, we can plot the bandwidth of the date range obtained for each key parameter given different numbers of dated samples (Fig. 28). In this case, we can see that the desired level of precision for this application is achieved by obtaining dates on 14 samples.
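Reading the required number of samples off such a plot can be expressed as a simple search. In the sketch below, the widths for 12 samples follow the 110-year ranges quoted above, but the widths for 14 and 16 samples are invented placeholders and are not read from Figure 28. The target of 100 years is an assumed interpretation of ‘to within a century’, and the function name is hypothetical.

```python
def samples_needed(range_widths, target_width):
    """Find the smallest number of samples that meets the target precision.

    range_widths: {number_of_samples: (start_width, end_width)}, giving the
        width in years of the 95% range for each key parameter.
    Returns the smallest number of samples for which EVERY key parameter is
    estimated to within target_width years, or None if none qualifies.
    """
    for n_samples in sorted(range_widths):
        if all(width <= target_width for width in range_widths[n_samples]):
            return n_samples
    return None


# Widths for 14 and 16 samples are placeholders for illustration only.
simulated_widths = {12: (110, 110), 14: (100, 95), 16: (90, 90)}
print(samples_needed(simulated_widths, target_width=100))
```

Returning None when no tested number of samples reaches the target makes explicit the case where more simulations, or a different objective, are needed.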

Of course, we do not actually know that this site was used for 40 years between 40 BC and 1 BC. It could have been used for 40 years between 80 BC and 40 BC, or for 20 years between 70 BC and 50 BC, or for 80 years between 100 BC and 20 BC, etc. So, we need to produce a series of simulations for different scenarios and a series of graphs of the form of Figure 28 (which itself summarises 13 simulation models).

In this case, we might perhaps need to build 150–200 simulation models (which would take an experienced modeller perhaps half a day). This will give us an idea of the variation in the number of samples that we might need to achieve the required precision for this application; perhaps, in the best-case scenario, we would only need 12 samples, but in the worst-case scenario we would need 20.

We now need to consider how to use this information to inform our sampling strategy. Simulation is only a guide to the number of samples that, statistically, are needed to achieve an objective. Archaeological factors also need to feed into the strategy. There might, for example, be seven structures in the farmstead, each of which has a hearth or other fired feature. Perhaps in this case we might suggest that dating two samples from one feature in each building would be sensible. A sampling strategy should be archaeologically representative as well as statistically viable.

Practical considerations also come into play. If minimising costs is paramount, we might submit 12 samples as a first round of dating, obtain and model the results, and then obtain another eight samples, if necessary, in a second round (Fig. 11). This might, if the site actually falls on the most favourable part of the calibration curve investigated by our simulations, save us the cost of eight radiocarbon dates. But it might also save us nothing and extend the post-excavation programme by several months. This might, in itself, be more costly than the potential saving in radiocarbon dating costs, and so it might be most cost-effective to submit 20 samples in the first round.

Generally, at least two rounds of dating are recommended for all but the simplest of applications. Because of the difficulties of dating sequences of organic sediments (see section 3.2.1, section 3.2.2 and section 3.3.2), two rounds of dating are essential in these cases. A preliminary round of dating is needed to demonstrate that a reliable chronology can be obtained from the sediments, and then further dating is needed to construct the chronology. Commissioning of extensive palaeoenvironmental analysis should normally follow the first stage of dating. For complex or large-scale applications, three rounds of dating should be scheduled as optimal.

In other cases, simulation might lead to the decision not to proceed with an intended programme of radiocarbon dating. For example, if this application fell on a different part of the calibration curve, and simulation suggested that no matter how many samples were submitted the maximum precision obtainable was to within 200 years, and we already know the date of the site to this resolution, then there is no point proceeding with the dating. Or, if simulation suggested that we needed between 40 and 50 samples to achieve useful precision, and we only have 30 potential samples, then we cannot proceed. Or we might decide that the archaeological objective of dating this site to within 100 years does not merit the cost of 50 radiocarbon dates.

This scenario might lead us to recast our objective into something that is less ambitious, but achievable; or it might lead us to think about the objective from a different angle.

Do we really want to date the farmstead?

Or is what is important actually the regional typology of ceramic forms?

Could we use the typological series of the pottery to provide constraints on our model?

Sometimes we do not even date what we are interested in to achieve our objective, but rather think laterally. A good example is dating field systems, where often the most effective strategy is to date one set of features that are cut by the field ditches, and another set that overlie the silted-up system. Samples from within the field ditches themselves are often both scarce and residual (Griffiths et al. 2021).

What is important in all these cases is that we have made an informed decision based on what can be achieved given the datable material and prior information that is available.

There is no point in submitting 10 suitable samples for dating to achieve an objective that requires 20 samples. The objective will not have been achieved and resources will have been wasted. But if we decide to reallocate resources within a project to fund 25 radiocarbon dates that simulation suggests are needed to resolve reliably an important archaeological question, we can do so in the knowledge that this expenditure will have a good chance of achieving the required chronology.

The exception to this is where dating is undertaken to contribute to a wider research objective highlighted in a national or regional research agenda. For example, if there is a regional priority to date Beaker pottery, then dating two samples from a pit containing diagnostic Beaker sherds will ultimately contribute to wider understanding (for example, of the time-transgressive nature of the appearance of this ceramic style across England), even though the pit itself is only dated to a resolution of within a few hundred years.

3.3.2 Mitigating risk

So far, we have inhabited a paradise where all samples date the target event intended and all measurements are accurate to within their quoted uncertainty. The real world is not like this. Few radiocarbon samples, and even fewer sampling strategies, are perfect. There is always some element of risk in dating a group of samples, but we aim to minimise this and, if possible, to mitigate it.

This is done by testing the accuracy of the radiocarbon dates obtained, both individually and as a group. Our sampling strategy must consider both the risks posed by archaeological weaknesses in our pool of samples and the risks posed by their scientific complexities. There are a number of methods that we can use as a check on our results:

  1. the coherence of a suite of related radiocarbon dates — are there any clear outliers or misfits? (see 2.2.2);
  2. the compatibility of a series of results with the relative chronological sequence known from archaeological information (such as stratigraphy); and
  3. the consistency of replicate results on the same or similar material.

The first two methods come into play once our radiocarbon results have been reported; replicate samples, however, must be selected as part of the overall sampling strategy. Replication is neither scientific prurience nor an expensive luxury, but rather an essential element of any competent sampling strategy for radiocarbon dating.

There are two types of replicate measurement: multiple samples on different single-entities from the same context or feature, and replicate measurements on the same single-entity. The first mitigates the archaeological risk that the dated samples are residual, reworked or intrusive; the second mitigates the scientific risks of dating certain types of material.

The number of repeat samples that are needed to address archaeological concerns about the dated samples depends directly on how certain the relationship is between the dated event and the target event (see section 3.2.2). Basically, the greater the uncertainty of association, the greater the number of repeat samples needed.

The number of repeat samples is also related to the other checks, if any, that we have on the dates. So, for example, if there is a sequence of ten contexts each containing articulating animal bones, the stratigraphy will check the reliability of the measurements and so replicates will not be required.

If, however, the sequence is ten levels in an organic sediment and the samples are waterlogged plant macrofossils, then two or three replicate pairs of single-entity macrofossils from the same level would be ideal to check for reworking. If the ten samples have no stratigraphic controls but are, for example, the base of basal sedimentation across a region, then replicate macrofossils from a higher proportion of the samples would be needed.

Generally, on archaeological grounds, a modest number of repeat measurements are needed on articulating or refitting samples, and on samples from structural material; but much higher numbers are needed for samples of disarticulated bone or carbonised plant remains, particularly as the putative functional association between the datable material and the context from which it was recovered becomes more uncertain. So, for example, when dating our fictional Iron Age farmstead, two samples might be sufficient from a hearth, but two samples might be needed from each of several postholes of a building. The highest level of repeat sampling is needed in dating organic sediments, particularly when submitting a first set of samples from a sequence to determine whether it can be dated reliably. In this case a good rule of thumb is that samples of waterlogged plant material should be sought every 0.5m through the part of the sequence that is of interest, and that half of the levels dated should have replicate samples.
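The rule of thumb for a first round of dating on an organic sediment sequence can be turned into a simple sampling plan. The sketch below assumes that ‘half of the levels’ is implemented by replicating alternate levels, starting with the uppermost; that choice, the function name and the example depths are all hypothetical.

```python
def sediment_sampling_plan(top_m, bottom_m, interval_m=0.5):
    """List the levels to sample through the part of a sequence of interest.

    Returns a list of (depth_m, replicate) tuples, where replicate is True
    for levels at which a second macrofossil sample should be sought.
    """
    # Small tolerance guards against floating-point error in the division.
    n_levels = int((bottom_m - top_m) / interval_m + 1e-9) + 1
    plan = []
    for i in range(n_levels):
        depth = round(top_m + i * interval_m, 2)
        plan.append((depth, i % 2 == 0))  # assumption: replicate alternate levels
    return plan


# Hypothetical sequence of interest between 1.0m and 3.0m depth.
for depth, replicate in sediment_sampling_plan(1.0, 3.0):
    print(f"{depth:.1f}m: {'two samples' if replicate else 'one sample'}")
```

In practice the plan would be adjusted to where suitable macrofossils actually occur, since they will not be present at every nominal depth.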

Replicate measurements undertaken to address the scientific complexities of dated samples are generally repeat determinations taken on the same sample, or on different fractions of the same sample. Most radiocarbon laboratories have continuing programmes of random replication that are part of their internal quality assurance procedures, and that form part of their protocols for error calculation. Most scientific replicates commissioned by archaeologists are therefore likely to consist either of repeat measurements on different fractions of the same sample, or of measurements on samples that are split and dated by two different radiocarbon facilities.

In this case the degree of replication needed depends on the other checks that are available on the accuracy of the results, on the scientific difficulty of producing reproducible measurements on the material (Bayliss and Marshall 2019, table 1) and on the importance of the application. In an extreme example (see section 5.2), replicate measurements were made by each of two different laboratories.

Generally, some degree of inter-laboratory replication is wise for contaminated or poorly-preserved samples. In larger studies, and where sufficient material is available, it could also be merited for bones and charred food residues on pottery sherds. For some types of sample, for example when dating lime mortar, repeat measurements are an integral part of the dating process. If bulk organic sediment must be dated, then measurement of humic acid and humin replicates should be the norm, at least for the first set of samples from a sequence used to determine whether reliable dating is feasible. In this case replicate measurements should be obtained every 0.5m through the part of the sequence that is of interest.

Replicate measurements might also be needed to check for radiocarbon offsets. Where samples of articulated herbivore bone and articulated human bone occur in the same inhumation, it is important that radiocarbon determinations are obtained on both the human and the animal. This perfect pair will provide a check for a dietary offset in the human bone. Similarly, when dating cremation deposits, replicate measurements on a single-entity, short-life carbonised plant macrofossil and a fragment of white, calcined bone are desirable to check for the incorporation of an old-wood offset from pyre fuel in the calcined bone.

A sampling strategy for radiocarbon dating should consider all the factors discussed in section 3.3. Simulation models will provide an indication of the number of samples that could be needed, given the shape of the relevant portion of the calibration curve and archaeological prior information that can be incorporated in the chronological model. Other things being equal, fewer radiocarbon dates will be needed where there are more archaeological constraints on the model. Identifying this prior information and suitable samples that enable it to be exploited is thus extremely cost-effective.

Theoretical simulation models are helpful, but need to be interpreted intelligently. A sampling strategy also needs to be representative of the archaeological remains that are being dated, and sufficient replication needs to be commissioned to mitigate risks in the archaeological or scientific characteristics of the proposed samples.

Designing an efficient and effective radiocarbon sampling strategy from the mass of datable material from a site is undoubtedly the most difficult and technically demanding step in the Bayesian process. But it is key, and so commissioning a specialist to undertake this work is likely to be worthwhile in all but the simplest of cases.

3.4 Purchasing radiocarbon dates

Once the samples to be dated have been selected, the next step in the Bayesian process is to submit them to a radiocarbon laboratory.

Best practice is to split the samples from a site between two different laboratories. This provides a degree of cross-checking that will ensure the reproducibility and accuracy of the radiocarbon measurements and the resultant chronology. It also mitigates the risks inherent in any complex scientific process, and is essential when high-precision dating is required. In some circumstances, however, this risk will have to be weighed against the practicalities of the project timetable and funding.

You should consider any technical constraints that the samples could impose. Are any of the samples of less routine materials that not all laboratories accept for dating? Are your samples contaminated, or particularly small? Is the quoted precision critical to the success of your dating programme? Information on these issues is often available from laboratory websites, date-lists or publications (see Appendix). For non-routine or contaminated samples, it is certainly worth contacting the laboratory to discuss a potential submission before sending the samples.

Quality is also an essential consideration. The technical procedures used by laboratories should be fully published, and thus accessible to future generations of researchers who need to trace these details. Laboratories should use internationally recognised reference materials (the results of which are sometimes reported along with your results), and take part in the series of international radiocarbon inter-comparison exercises (most recently SIRI; Scott et al. 2017). Laboratories should also have their own, internal quality assurance procedures, the results of which are often published. A full list of radiocarbon laboratories is maintained by the journal Radiocarbon.

Other considerations affecting the choice of radiocarbon laboratory are practical. Most laboratories can provide an indication of the likely timescale for the provision of radiocarbon results on submission of samples. Some will guarantee a turn-around time, and offer ‘express’ services for situations where time is of the essence. Ultimately the reliability of laboratories in producing results within the timescale indicated is best assessed by experience.

Costs are another consideration. These can be found on the relevant laboratory websites, but care must be taken to determine which taxes (e.g. VAT) apply to a particular project. Many laboratories offer bulk discounts (and even loyalty cards!) for ‘persistent customers’, and so it may be worth organising the submission of radiocarbon samples centrally in your organisation to maximise the advantage of such discounts. Express services are typically considerably more expensive.

Care must be taken to determine which associated measurements will be undertaken by each laboratory. For archaeological samples from England, δ13C values should be obtained routinely. You should check which type of δ13C value will be measured and reported by a laboratory (see section 1.4) and determine whether this is a standard part of the reported measurement or whether extra charges apply. If additional associated measurements are required, such as C:N ratios or IRMS δ13C and δ15N values, these might be provided by the dating laboratory (usually at additional cost), but might have to be sourced elsewhere.

It is essential that all permissions for destructive analysis and export/import permits are obtained before samples are despatched. Human remains will require relevant permission (Mays et al. 2013, 4). Certain archaeological objects of more than 50 years in age might require an export licence. Samples from endangered species sent to laboratories outside the UK will require an export permit. Samples destined for laboratories in some countries might require specific import documentation, which will be supplied by the relevant laboratory.

All samples should be fully documented before submission for dating, and laboratories generally have procedures for this purpose. Radiocarbon dates are expensive, and it is worth double-checking that the labelling on the samples and the accompanying documentation is consistent.

Samples should consist of exactly what you want the radiocarbon laboratory to measure, and generally material can be sub-sampled for radiocarbon dating by specialists on the project team. For example, if you have a number of cereal grains from a context, select a large, well-preserved grain, obtain as precise a botanical identification as possible, photograph it, and send that single grain in a glass vial to the laboratory to be dated. Do not send multiple grains, as the laboratory will not necessarily know that you want a single entity to be dated, and might bulk them together for analysis. If the selected grain is too small, the laboratory will contact you for a replacement.

It can be difficult to judge whether waterlogged plant macrofossils are large enough for dating. If there is choice, then the largest terrestrial single entity should be selected. If this is unlikely to make up the weight required by the relevant laboratory, then advice should be sought on how much material is needed. Judgement is required, as the risks of bulking together more than one item for dating (see section 3.2.2) have to be balanced against the risk of a sample failing, or producing an inaccurate result, if insufficient material is supplied.

Some materials are better sub-sampled in the radiocarbon laboratory, however, particularly if specialist knowledge is needed to select the best material for dating. Carbonised food crusts on pottery should generally be left on the sherd. This should be sent to the laboratory, where the residue will be sub-sampled for dating and the object will be returned to the submitter. Sub-sampling intact bones for radiocarbon dating requires specialist drilling equipment (such as that used for sampling for stable isotopic studies; Fig. 29). If this is not available, complete bones can be sent to the radiocarbon dating laboratory, where they will be sub-sampled and then returned to the submitter.

Samples should be stored as described above (see section 3.2.1), although additional packaging is usually required for samples that are to be sent by post (special delivery) or courier. It is important that samples of carbonised material are not crushed (smaller fragments tend to produce lower yields of carbon during laboratory processing), and it is important that glass vials do not break during transport. Generally, packing in bubble-wrap or polystyrene chippings in a sturdy box is optimal.

3.5 Preliminary modelling and additional samples

When the radiocarbon results are reported, they replace the simulated measurements in the simulation model. Almost always the real results will not be quite as anticipated. Most commonly it will be the assessment of the taphonomy of the dated sample that will be in error.

Sometimes samples will be residual or intrusive, and it will be necessary to revisit the chain of inference by which the association of the dated sample to the archaeological event of interest was assessed before its submission for dating.

On other occasions, it is necessary to reconsider the prior archaeological information that has been included in a model. Direct stratigraphic relationships usually prove to be secure, but the criteria on which dated deposits have been phased often require re-evaluation. Occasionally, something will have gone wrong with a radiocarbon measurement in the laboratory and it will be necessary to ask for the technical details of a sample to be reviewed.

Once these problems have been identified and re-modelled in an appropriate way, further simulated dates are added to the existing suite of radiocarbon dates. Once the additional number of samples needed has been determined (see section 3.3.1), further samples are selected from the pool of potential samples that has already been identified (see section 3.2.2 and section 3.2.3) or are chosen because further replication to assess sample taphonomy or laboratory accuracy is required (see section 3.3.2). These are dated and the cycle repeats (Fig. 11).

Ideally, this process repeats until adding more simulated dates does not materially improve the precision of the chronology produced by the model. In practice, however, usually either there is no more money for more samples, or the post-excavation timetable cannot accommodate further rounds of sampling. Occasionally, there will be no further suitable material for dating.

This process is time-consuming; frequently as much staff time is spent in selecting samples and running simulations as is spent in analysis and publication of the final set of results. However, projects where the samples are selected around the model, rather than where the model is grafted onto an existing series of dates, have consistently provided much more precise chronologies and have been much more cost-effective and archaeologically useful.

3.6 Reporting radiocarbon dates and chronological models

The detailed reporting of radiocarbon dates and chronological models is a fundamental part of any programme of radiocarbon dating.

3.6.1 Reporting radiocarbon dates

Details of the radiocarbon measurements, the methods used to produce them and the samples analysed will be essential information for future generations of researchers.

Currently, any synthetic study of English chronology requires considerable research to track down the relevant details. Often the original reporting documentation sent by the radiocarbon facility can be traced in project archives, and radiocarbon laboratories generally do their best to help trace details of past measurements. But accessing primary archives is time-consuming, and over time radiocarbon facilities do close.

There are also potential legal and other barriers to radiocarbon dating laboratories making information available (for example, client confidentiality in perpetuity is a condition for ISO-9000 accreditation).

The following information must be published for each radiocarbon measurement:

1. Details of the facility or facilities that produced the results, and how samples were pretreated, prepared for measurement and dated. References to published papers should be preferred to citation of web addresses (as the archival stability of the latter is currently unproven). This information should be supplied by the radiocarbon dating facilities.

Example: Samples of bulk peat were pretreated using an acid-base-acid protocol (Mook and Waterbolk 1985) and then converted to benzene and dated by liquid scintillation spectrometry at the University of Waikato (Hogg et al. 1987). The other samples were pretreated and combusted as described by Brock et al. (2010b), and then graphitised and dated by AMS at the Oxford Radiocarbon Accelerator Unit (Dee and Bronk Ramsey 2000; Bronk Ramsey et al. 2004a).

2. Details of the radiocarbon results and associated measurements and how these have been calculated.

Example: The results are conventional radiocarbon ages (Stuiver and Polach 1977) and are listed in Table 3. The ages produced at Rijksuniversiteit Groningen have been calculated using the fractionation correction provided by the δ13C (AMS) values, which are not reported. Those produced at SUERC have been calculated using the reported δ13C values measured by conventional mass spectrometry.


TABLE 3: Reporting radiocarbon and stable isotope measurements.


3. Details of the material dated and the context from which it came (see Table 4).

The critical information that will be needed by future researchers, both to recalibrate your radiocarbon results as calibration data are refined and to re-interpret your data to answer new questions, is included under 1–3 and in Tables 3 and 4.


TABLE 4: Reporting sample details.

Sample | 🙂 The Good | 😐 The Bad | 🙁 The Ugly
--- | --- | --- | ---
Sample 1 | ws8, human bone, right femur from adult male, partially articulated skeleton group 7, overlying ws13 in primary mortuary deposit | femur from partially articulated skeleton in primary mortuary deposit | human bone from mortuary deposit
Sample 2 | cattle right ulna articulating with radius from segment 3, F-1672=F44 Context 59; fill of an early recut, stratigraphically later than Sherd Group 265 | articulating animal bone from early recut of Segment 3 | animal bone from ditch
Sample 3 | AB1 (511), wild boar tibia with refitting unfused epiphysis from Pit F7 | juvenile wild boar tibia from Pit F7 | animal bone from Pit 7
Sample 4 | Sherd Group 98, carbonised residue on 1 large body sherd among >10 from a single Neolithic bowl from Segment 2, F1358, Context 1272; lowest fill of recut of segment | carbonised residue on Sherd Group 98 from lowest fill of recut in Segment 2 | carbonised residue on sherd from ditch
Sample 5A | calcined human bone from Cremation Burial [7074] of an adult ?male individual | human bone from Cremation [7074] | Cremation [7074]
Sample 5B | replicate of Sample 5A | human bone from Cremation [7074] | Cremation [7074]
Sample 6 | waterlogged wood, Prunus sp. roundwood sail from Well Lining in [5288] | Prunus sp. from Well Lining [5288] | Well Lining [5288]
Sample 7 | single carbonised hazelnut shell fragment from hearth [293] | hazelnut shell from hearth [293] | hazelnut shell from hearth
Sample 8 | antler pick from bottom of ditch in Cutting 25.2 | antler from ditch in Cutting 25.2 | antler from ditch
Sample 9 | Polygonum aviculare seeds (×20) from fill [691] of plank-lined springhead | waterlogged seeds from springhead | seeds from [691]
Sample 10 | single fragment of charred hazelnut shell from Pit 5025, which contained plain and decorated bowl pottery, struck flint, charcoal, charred plant remains, and animal bone | charred hazelnut shell from Pit 5025 | hazelnut shell from Pit 5025
Sample 11 | AB12 (450), paired dog left and right mandibles from secondary barrow, cutting DX | dog left mandible from secondary barrow | animal bone from secondary barrow
Sample 12 | fragment of one of three interleaved proximal rib fragments from a large mammal found together in outer ditch, Bone Group 115 in top of Layer 111 | large mammal rib from outer ditch, layer 111 | animal bone from outer ditch
Sample 13 | beeswax from lamp accompanying primary burial in Mound 1 | beeswax from primary burial in Mound 1 | beeswax from Mound 1
Sample 14 | ws14, human bone, right femur from adult, possibly female (no articulation demonstrable), from Bone Group Q in third layer of primary mortuary deposit | human femur from Bone Group Q in primary mortuary deposit | human bone from mortuary deposit
Sample 15 | bulk charcoal, Corylus avellana and Pomoideae, from Context 61: gleyed colluvium with lenses of burnt material representing occupation activity that abuts or pre-dates Structure 57 | short-lived charcoal from occupation associated with or pre-dating Structure 57 | charcoal from Context 61
Sample 16 | PT1 AuW1976.217, Vessel 33, carbonised residue adhering to sherd from buried soil west of the midden in square m21 | residue on sherd from buried soil | pottery from buried soil
Sample 17 | single fragment of charcoal, Corylus avellana, from Posthole 346 of Building 4 | Corylus avellana from posthole of Building 4 | charcoal from Building 4
Sample 18 | disarticulated cattle mandible from a layer of fine silt and chalk rubble sealing the layers of phase II, and probably originating as upcast from ditch cleaning of the monument, thus forming the third phase of the bank/rampart | cattle mandible from the third phase of the bank/rampart | animal bone from the bank
Sample 19 | W2, waterlogged Alnus glutinosa roundwood including bark, from prostrate tree on the surface of the peat | W2, waterlogged Alnus sp. from peat | W2, waterlogged wood from peat
Sample 20A | W1, peat (200g), humic acid fraction, from 2cm spit at a depth of 16–18cm from the top of the peat | W1, peat (humic acid fraction) at a depth of 16–18cm from the top of the peat | W1, peat from a depth of 16–18cm
Sample 20B | W2, peat (200g), humin fraction, from 2cm spit at a depth of 16–18cm from the top of the peat | W2, peat (humin fraction) at a depth of 16–18cm from the top of the peat | W2, peat from a depth of 16–18cm
Sample 21 | Bulk sample of carbonised grain from Pit 277, Fill 278. A 4cm thick deposit of carbonised grain covered the pit floor. The grain consisted mainly of spelt and six-row hulled barley. The grain was either burned within the pit or accumulated very rapidly. This sample came from the base of the deposit. | carbonised grain from base of Pit 277, Fill 278 | grain from Pit 277
Sample 22 | bulk sample of oyster shell (Ostrea sp.) from the top of the 3.4m oyster midden | oyster shell from the top of the midden | oyster shell from midden

4. Details of any replicate analyses, including statistical tests on replicate groups of measurements. A brief statement is usually sufficient, although sometimes more extensive discussion may be merited.

Example: Measurements on the humic acid and humin fractions of the large bulk peat sample are statistically consistent (GrN-28276, 1140±50 BP and GrN-28277, 1050±50 BP; T′=1.6, T′(5%)=3.8, ν=1; Ward and Wilson 1978) and so a weighted mean (1095±35 BP) has been calculated before calibration.
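The arithmetic behind a statement of this kind can be sketched as follows. This is a minimal illustration of an error-weighted mean and a chi-square consistency statistic of the kind described by Ward and Wilson (1978), applied to the two measurements quoted above; the function name and the small table of 5% critical values are assumptions made for the example.

```python
import math

# Chi-square critical values at the 5% level, indexed by degrees of freedom.
CHI2_5PCT = {1: 3.84, 2: 5.99, 3: 7.81, 4: 9.49}


def ward_wilson(ages, errors):
    """Return (weighted mean, error on the mean, T', degrees of freedom)."""
    # Each age is weighted by the inverse of its variance.
    weights = [1.0 / e ** 2 for e in errors]
    pooled_mean = sum(w * a for w, a in zip(weights, ages)) / sum(weights)
    pooled_error = math.sqrt(1.0 / sum(weights))
    # T' sums the squared, error-scaled deviations from the weighted mean.
    t_stat = sum((a - pooled_mean) ** 2 / e ** 2 for a, e in zip(ages, errors))
    return pooled_mean, pooled_error, t_stat, len(ages) - 1


# GrN-28276 (1140±50 BP) and GrN-28277 (1050±50 BP), as quoted above.
mean, error, t_stat, dof = ward_wilson([1140, 1050], [50, 50])
consistent = t_stat < CHI2_5PCT[dof]
print(f"{mean:.0f}±{error:.0f} BP; T'={t_stat:.1f}, "
      f"T'(5%)={CHI2_5PCT[dof]}, v={dof}, consistent={consistent}")
```

For these two measurements the sketch gives a weighted mean of 1095±35 BP and T′=1.6, which is below the 5% critical value for one degree of freedom, in agreement with the reported example.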

5. Details of the calibration protocols used, including any reservoir corrections employed. Calibration is an essential step in the use of radiocarbon dating to infer chronology and this information will always be required.

Often, however, calibration is simply part of formal statistical modelling, and where further statistical analysis is undertaken, it might be more appropriate to provide posterior density estimates and Highest Posterior Density intervals (see section 3.6.2), rather than simple calibrated date ranges.

In applications where no further analysis of the radiocarbon dates is undertaken, however — for example, when range finder dates are required — then calibrated radiocarbon dates should be reported.

Example: The quantile ranges of the calibrated dates for the samples given in Table 3 have been calculated using the probability method (Stuiver and Reimer 1993), and are quoted with end points rounded outwards to ten years. They have been calculated using OxCal v4.4 (Bronk Ramsey 2009a) and the current internationally-agreed atmospheric calibration dataset for the northern hemisphere, IntCal20 (Reimer et al. 2020).

The sample of oyster shell (HAR-3464, 1280±80 BP) is from Poole, Dorset, and has been calibrated using the marine dataset of Heaton et al. (2020) and a ΔR value of −179 ± 93 BP calculated from the ten closest marine reservoir datapoints to the site (http://calib.org/marine/; Reimer and Reimer 2017).

Replicate analysis and calibration are needed for the interpretation of the radiocarbon dates that have been obtained as part of a project. Both, however, can be reworked from the details provided under points 1–3 and in Tables 3 and 4.

As described in section 1.6, calibration is not only now usually part of further analysis, but is also periodically refined, so it is essential that the information necessary for revising the calibrated dates and including them in future chronological models is provided.

In simple cases, the reporting of the radiocarbon dates in a project will be completed by the publication of the information in this section. In cases where Bayesian chronological modelling has been undertaken, however, the information in the following section should also be reported.

3.6.2 Reporting Bayesian chronological models

Bayesian chronological models are interpretative constructions. They will be revised, not only as calibration data and statistical methods improve, but also as archaeological understanding develops and new questions are posed. Consequently, the aim of reporting chronological modelling is not just to explain how and why the models presented were constructed, but also to provide sufficient information for the reader to understand the strengths and weaknesses of those models, and for future researchers to analyse them critically and reconstruct them.

Chronological modelling reports should include the following information:

  1. Objectives of the study
    The objectives of the dating programme, including the dating precision needed to achieve these objectives and discussion of how the objectives may have been recast in the light of the available material, prior information, calibration curve, available funding, etc.
  2. Methodology
    This should include a statement of the approach adopted, including details of the radiocarbon calibration data (and any reservoir corrections), statistical methods and software used.
  3. Sampling strategy
    This should include discussion of:
    (a) the pool of potential samples available from the project (see section 3.2.2 and section 3.2.3),
    (b) the available prior information,
    (c) the results of any simulation models (see section 3.3.1),
    (d) any other factors that affected the sampling strategy adopted (see section 3.3.2), and
    (e) the rationale by which these elements have been combined into a strategy.
  4. Details of scientific dates
    Radiocarbon dates should be published as outlined in section 3.6.1. Legacy data might also need to be reported to this level of detail, although reference to relevant publications might be adequate (depending on the quality of the original reporting). Where legacy data are reported in a variety of sources, it can be helpful to provide a table of dates (see Tables 3 and 4), so that readers can assess their quality.

    Details of other scientific dates should be reported in a similar way (see Duller 2008, sections 9–10; English Heritage 2006, 18; English Heritage 1998, sections 2.7–8).
  5. Model definition and description
    It is essential that each model in a publication is explicitly defined so that it can be recreated by readers. Most published models have been created using one of the software packages listed in the Appendix, and can be defined as described in the relevant publication relating to that software (see Case Studies, section 5). Sometimes, it is possible to define simple or variant models in the text. Models that have been constructed using new statistical procedures that have not been published elsewhere will need technical mathematical appendices.

    Chronological models do not, however, simply have to be defined. They must also be justified. The prior information included in the model should be described, and its strengths and weaknesses assessed. Consideration should be given to whether the ‘uninformative’ prior information included is appropriate for the problem at hand, the robustness of the associations between the data and the prior information, and the identification of any outliers or misfits.
  6. Sensitivity analyses
    Having defined and justified a model, it is necessary to assess its strengths and weaknesses (see section 2.2.2). Most usually this is done by varying components of a model to determine how sensitive the modelled chronology is to changes in the interpretations on which the modelling is based.
  7. Recommendations for further work
    Sometimes the assessment of the strengths and weaknesses of the current study, as described above, will indicate that further work is needed.

The posterior density estimates produced by chronological models can be summarised using Highest Posterior Density intervals. These should be cited in italics to distinguish them from calibrated radiocarbon dates. They should be rounded outwards to a resolution that is dependent on that of the calibration curve used and the precision of the posterior distributions. All Highest Posterior Density intervals produced by a model should be rounded to the same resolution, which should not be greater than that of the relevant section of the calibration curve.

In practice, most Highest Posterior Density intervals are rounded outwards to five years, except for those from wiggle-matching on parts of the calibration curve that are interpolated at single-year resolution, which are rounded outwards to the nearest year.
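Rounding outwards can be sketched as below. This is a minimal illustration under stated assumptions: the function name is hypothetical, calendar years are held as signed integers with cal BC years negative (so that earlier dates are always smaller numbers), and no account is taken of conventions about a year zero. The unrounded end points in the usage lines are invented for illustration.

```python
def round_outwards(start, end, resolution=5):
    """Round an interval outwards to a multiple of `resolution` years.

    `start` is the earlier end point and `end` the later one, both as
    signed calendar years (cal BC negative, cal AD positive).
    """
    if start > end:
        raise ValueError("start must be the earlier end point")
    # The earlier end point moves earlier (floor) ...
    rounded_start = (start // resolution) * resolution
    # ... and the later end point moves later (ceiling).
    rounded_end = -((-end) // resolution) * resolution
    return rounded_start, rounded_end


# Hypothetical unrounded interval of 3763-3677 cal BC, rounded to five years.
print(round_outwards(-3763, -3677))        # (-3765, -3675)
# Hypothetical interval of cal AD 1012-1148, rounded to ten years.
print(round_outwards(1012, 1148, 10))      # (1010, 1150)
```

Because rounding is always away from the centre of the interval, the rounded range never excludes any part of the original range.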

3.6.3 Citation of Bayesian chronological models

Discussions of chronology often include comparisons between the dates of different sites and different artefacts. Rarely will all the relevant comparanda be included in the modelling for a project, and so it will often be necessary to cite key parameters from previously published models. It is essential that it is clear precisely which parameter is meant, and from which model it derives. Thus, in addition to the Highest Posterior Density interval, both the parameter name and an exact reference to the published definition of the relevant model should be given.

For example, the building of the outer circuit of the Chalk Hill causewayed enclosure considered below would be cited as ‘3760–3675 cal BC (95% probability; build outer Chalk Hill; Bayliss and Marshall 2022, Fig. 65)’.
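Where many parameters are cited, it can help to assemble citations consistently from their component parts. The following is a minimal sketch only; the function name and argument names are hypothetical, and the italic type recommended for Highest Posterior Density intervals cannot be represented in a plain string.

```python
def cite_parameter(start, end, era, probability, parameter, reference):
    """Build a citation for a parameter from a published chronological model."""
    # The interval comes first, then the probability, the parameter name and
    # the exact reference to the published definition of the model.
    return (f"{start}–{end} {era} "
            f"({probability}% probability; {parameter}; {reference})")


print(cite_parameter(3760, 3675, "cal BC", 95,
                     "build outer Chalk Hill",
                     "Bayliss and Marshall 2022, Fig. 65"))
```

Keeping the parameter name and the model reference as required arguments makes it harder to cite an interval without saying which model it came from.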

KEY FACTS BOX 4
Reporting radiocarbon dates and Bayesian chronological models

For each radiocarbon date the unique laboratory identifier, the conventional radiocarbon age or fraction modern value, and the experimental uncertainty at 1σ must be reported, along with any associated measurements (e.g. δ13C values). Details of the material dated and the context from which it came should be given, and the methods of sample preparation and measurement specified. Replicate analyses should be described, and details of the calibration procedures used should be given.

Reports on Bayesian chronological models should include descriptions of the objectives of the study, the methodology employed, and the sampling strategy adopted. The dated material and radiocarbon measurements should be fully described, as should the prior information included in the modelling. Each model must be explicitly defined so that it can be recreated. Sensitivity analyses are often required to assess the strengths and weaknesses of the models presented.
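The minimum details required for each radiocarbon date can be held in a simple record, which makes omissions easy to spot before publication. This is a sketch under stated assumptions: the class and field names are hypothetical, results reported as fraction modern values are not covered, and the usage line leaves the context blank deliberately to show the check at work.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class RadiocarbonDate:
    lab_id: str                        # unique laboratory identifier
    age_bp: int                        # conventional radiocarbon age (BP)
    error_1sigma: int                  # experimental uncertainty at 1 sigma
    material: str                      # description of the material dated
    context: str                       # context from which the sample came
    delta13c: Optional[float] = None   # associated measurement, if reported

    def missing_details(self) -> List[str]:
        """List the required text fields that have been left blank."""
        required = {"lab_id": self.lab_id,
                    "material": self.material,
                    "context": self.context}
        return [name for name, value in required.items() if not value.strip()]

    def summary(self) -> str:
        """Return the identifier and age in the usual citation form."""
        return f"{self.lab_id}, {self.age_bp}±{self.error_1sigma} BP"


# The oyster shell measurement quoted in section 3.6.1, with no context given.
date = RadiocarbonDate("HAR-3464", 1280, 80, material="oyster shell", context="")
print(date.summary())            # HAR-3464, 1280±80 BP
print(date.missing_details())    # ['context']
```

Details of pretreatment, measurement methods, replicate analyses and calibration still need to be reported in the accompanying text.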