Stratigraphic diagrams using analogue

One of the routine tasks palaeoecologists do is plot data on species composition or geochemical proxies say along a sediment core or stratigraphic sequence. These diagrams are the canonical way of displaying stratigraphic data in this field. An example of a stratigraphic diagram is shown below.

Example of a stratigraphic diagram showing data from the classic Abernethy Forest late glacial pollen sequence

Example of a stratigraphic diagram showing data from the classic Abernethy Forest late glacial pollen sequence

These plots are also a bit of a pain to produce, for various reasons. We want to cram as much information into a single diagram as possible, so when plotting species abundance type data, we use larger panels for abundant taxa and smaller panels for the less abundant taxa. Quite often we want to mix relative data types (e.g. relative abundances or compositional data) with absolute data types (e.g. geochemical proxies, palaeoenvironmental reconstructions, ordination summaries) and therefore only want to scale the sizes of the panels for the relative data types. We might want to display the data in the panels in different ways and add extra information to the diagram such as stratigraphic zones or an additional y-axis scale (dates as well as depths for example).

Because of these special plotting requirements, the task of drawing stratigraphic diagrams has often been performed in specialist software; e.g. Eric Grimm’s Tilia and Tiliagraph are an old example I used during my undergraduate dissertation, now updated for modern computers, or Steve Juggin’s C2 programme. Having produced the figure in the specialist software, a lengthy post-processing process in Illustrator often ensued, to get the diagram looking just right and ready for publication.

Being a Linux user for over a decade now, none of these applications would run on my computers without dropping back out to Windows, and because I use R all the time for my data analysis, it would be great if we could produce these plots using R. So I started writing some code that resulted in the Stratiplot() function in my analogue package. Stratiplot() uses the power of the Lattice graphics package and the odd bit of grid code to achieve pretty reasonable-looking diagrams (IMHO).


The example above was produced with the following code:

require(analogue)
data(abernethy)
Stratiplot(Age ~ . - Depth, data = chooseTaxa(abernethy, n.occ = 5, max.abun = 10),
           type = c("h","l","g"))

As you can see, Stratiplot() has a formula interface (the abernethy data frame contains both an Age and a Depth variable, but we only want to use Age in the plot so we must remove Depth from the RHS of the formula), so works like many other R functions, but there is a standard interface if you are prepared to get the data in the correct format &mdash which isn’t too difficult! The type = c("h","l","g") is an extension of the xyplot() argument of the same name, and exists to define what types of sub-plots are drawn on each of the panels. The available types are:

  • "l" — draws the data as lines,
  • "p" — draws the data as points,
  • "o" — draws the data as both lines and points overplotted,
  • "b" — draws the data as both lines and points,
  • "g" — draws a grid at the tick marks,
  • "h" — draws the data as histogram-like bars, but extending from the y-axis, not the usual x-axis, margin,
  • "smooth" — draws a LOESS smoother through the data,
  • "poly" — draws the data as a filled polygon, which is like type = "l", but the area between 0 and the line, on the x-axis, is filled,

The first five have their usual meaning from the xyplot() function, whilst the last three are either unique to Stratiplot() (type "poly") or have been implemented differently because the data in a stratigraphic diagram move up the diagram and not from left to right (types "h" and smooth). These can be combined in whatever way you wish, as the underlying panel function panel.Stratiplot() tries to plot them in a sensible order so one graphical element doesn’t obscure another element.

In the example we limited the number of taxa that are used in the diagram via the chooseTaxa() function to select only those taxa that were present in at least 5 samples and were at least as abundant as 10% in any one sample. The criteria could have been made “either or” by using type = "OR" in the chooseTaxa() call.

The stratigraphic diagram can be augmented in several ways. One of which is to order the variables by another variable. A usual ploy is to sort the taxa in order of their weighted average “optima” for the y-axis variable, which emphasises the change in species composition over time. This can be done in Stratiplot() using sort = "wa", e.g. this snippet of R code

Stratiplot(Age ~ . - Depth, data = chooseTaxa(abernethy, n.occ = 5, max.abun = 10),
           type = c("h","l","g"), sort = "wa")

produces this version of the figure

Stratigraphic diagram of the Abernethy Forest pollen sequence with taxa sorted by WA of the Age variable

Stratigraphic diagram of the Abernethy Forest pollen sequence with taxa sorted by WA of the Age variable

Stratigraphic zones can also be added to the diagram using the zones argument. By default the zones are illustrated by a legend on the right of the figure and labelled using argument zoneNames. In the code below, we add in the six significant zones in this sequence using five boundaries, Zones, and label the zones “A” to “E”

Zones <- c(7226,9540,9826,11180,11700)
Stratiplot(Age ~ . - Depth, data = chooseTaxa(abernethy, n.occ = 5, max.abun = 10),
           type = c("h","l","g"), sort = "wa", zones = Zones,
           zoneNames = c(LETTERS[1:6]))

to produce this diagram

Stratigraphic diagram of the Abernethy Forest pollen sequence with taxa sorted by WA and six zones superimposed

Stratigraphic diagram of the Abernethy Forest pollen sequence with taxa sorted by WA and six zones superimposed

One of the draw backs of using Lattice to do the heavy-lifting of drawing the panels, is that customisation of the plot after it has been drawn is more tricky than if a solution using base graphics had been used.

Stratiplot() can handle mixtures of relative and absolute data types but this code is very experimental at the moment. I’ll illustrate how to use this feature in a future blog post. This functionality will no doubt be updated in future versions of analogue. I also want to add the option to use a dendrogram representation of the zones (if applicable) instead of the box-like legend now used.

About these ads
This entry was posted in analogue, R. Bookmark the permalink.

19 Responses to Stratigraphic diagrams using analogue

  1. brobar says:

    I’m having a little trouble replicating your result. Rather than plotting the stratigraphy, the frames have text that reads, for example:

    Error using packet 1
    could not find function “hasArg”

    Any idea what’s going on?

    • ucfagls says:

      @brobar – is that on all the examples or one in particular? hasArg is in the methods package which should come with and be loaded in your R. Are you running an up-to-date version of R and the Lattice and grid packages? If not, I can only suggest you update; my package code doesn’t call hasArg() at all so it must be being called from grid or Lattice, but as I said, it should be available on a reasonably new version of R.

      • brobar says:

        Yes, everything is current. I was running your code with “Rscript –save” which was not working, but when I use “R –save” it works fine. Hmmm… I’m missing something here.

        Anyways, thanks for your post, and reply!

        • ucfagls says:

          Ahhh… Ok I get you. ?Rscript gives us the answer – namely, by default Rscript doesn’t load the methods package as this doubles the R start-up time. So stick a require(methods) at the start of the script and the code examples should run just fine.

          This might be a bug somewhere – if code in a package needs another package it should be declared in the dependencies. I’ll track down where the call is coming from and see if this is a bug somewhere else and take it up with the maintainers.

  2. F. Tusell says:

    Great post, thank you very much.

  3. Ben says:

    Hi Gavin, thanks for your excellent contributions to the vegan package, I love this Stratiplot function. I wonder if you could give a commented example using mixtures of relative and absolute data types? I’d like to add some geochemical data to my plots but I’m getting errors like “Error in grid.Call.graphics(L_setviewport, pvp, TRUE)”
    “max(max.abun[REL]) : no non-missing arguments to max; returning -Inf”
    I’d be most grateful for your assistance!

    • ucfagls says:

      Thanks @Ben. There was a bug in the code for absolute data types. What version of analogue are you using? You can try the version from R-Forge using install.packages("analogue", repos="http://R-Forge.R-project.org") as I fixed a few things. This version should make its way to CRAN soon. If you can test the R-Forge version to see if it works, it would help with testing ready for the release. I’ll see about knocking up an example with mixed data types too, but I am busy with work til the end of the week.

      • Ben says:

        Thanks Gavin, the R-Forge version works perfectly (I was using the CRAN version). One tiny quibble is that the zero line of one of the plots does not line up with the solid horizontal line that separates the plots. See the EC plot is this image: http://i.imgur.com/odlra.png Thanks again for your quick and helpful reply, I’m looking forward to future releases!

        • ucfagls says:

          Thanks @Ben. Glad to know the newer code is working OK, if with a few infelicities such as the one you mention. This is due to the automatic finding of axis limits for each panel. This uses the standard automatic pretty limits that Lattice uses. I have some experimental code in the function that was trying to find limits myself, which could be set by the user as an argument. But I haven’t activated the use of this code in the actual plotting, probably because it wasn’t working quite right when I had to take a break from development due to work commitments. I can take another look later next week. Thanks for pointing this out.

  4. John says:

    Hi Gavin, I’m a budding paleoecologist and I was thrilled to find R code for making pollen diagrams – thanks a bunch. I have one question about absolute data. My absolute data (charcoal) is contiguously sampled and the pollen is every 5cm. When I plot them together it will only plot the charcoal for every point where there is also a pollen sample. Is there a way around this? Thanks!

    • ucfagls says:

      Hi John, I presume you have the pollen and charcoal data in the same data frame and that the pollen taxa have values NA for all the samples where you didn’t count them but did measure charcoal? If so, stratiplot() should, or at least was intended to, work. I’ll work up a little test example and try to see if I can reproduce what you are seeing.

      • John says:

        That’s exactly how I have it set up.

        • ucfagls says:

          OK, can you confirm you are using the latest version of analogue, 0.8-0? Will take a look at the code as your use case should be fine. Might take a few days to get back to you as I’m still getting over flu and have a couple of outstanding jobs to deal with.

          • ucfagls says:

            Hi John, I found the problem; the formula method was stripping out the ‘NA’ rows because the `na.action` argument wasn’t set correctly (nor processed appropriately if it had been set) in the code. The default method would have worked fine; the problem was just in how I was processing the data by interpreting the formula.

            I have checked in changes in the code base to r-forge. You’ll need a build of revision 268 or later (version 0.9-5 or later of analogue) to use the fixed code. If you aren’t on Linux you’ll need to wait until r-forge builds a binary package for your OS.

            These changes will make it to CRAN around the end of May as I’m working on the 0.10-0 version of the package at the moment.

            Thanks for reporting the issue and sorry this took a while to track down and fix.

          • John Calder says:

            Aswesome, thanks for the update. And I’m not on a linux OS so I will wait until the package is ready. Thanks again.

          • ucfagls says:

            You’re welcome. No need to wait though; R-forge has a binary build of analogue 0.9-5: https://r-forge.r-project.org/R/?group_id=69 which you can install with

            install.packages("analogue", repos="http://R-Forge.R-project.org")
            

            in R

  5. John says:

    Yes, I am using 0.8-0. I hope you feel better. Thanks again.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Connecting to %s