MfN Micropaleo Lab

Creative Commons License
Johan Renaudie.

Big data analysis in Micropaleontology

NSB database coverage (left: amount of occurrence data for each fossil group over geological time; right: geographical coverage).
[N: calcareous nannofossils; D: diatoms; Df: dinoflagellates; R: radiolarians; F: planktonic foraminifera; black: sites drilled; red: sites in NSB]

NSB Database

The Neptune (NSB) database and the Paleobiology Database (PBDB) are the two largest paleontological databases that synthesize published fossil occurrence data. PBDB covers the Phanerozoic record of invertebrates and vertebrates, but at low temporal (ca. 10 my) and taxonomic (genus-level) resolution. NSB covers only marine plankton from the last 100 my, but at far higher temporal and taxonomic resolution. NSB currently holds ca. 800K occurrence records for ca. 10K species, with an average age resolution of ca. 300,000 years. Like PBDB, NSB is used in paleobiology studies. Unlike PBDB, it also contains detailed geochronologic information for individual sections and biostratigraphic marker species, and is therefore also used by paleoceanographers and other earth scientists. In all, nearly 100 publications cite or have used the system. Lastly, the taxonomic name lists in NSB are linked to other marine plankton taxonomy databases: NSB provides the source names used to maintain the World Register of Marine Species list of Polycystina (Radiolaria), and is linked to Mikrotax, the main online taxonomic catalog of the micropaleontology community.
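As a rough illustration of how occurrence data of this kind can feed a paleobiology study, the sketch below counts distinct species per time bin. The record layout (species name, age in my), the placeholder values, and the `richness_per_bin` helper are assumptions made for this example; they do not reflect the actual NSB schema or any NSB query interface.

```python
from collections import defaultdict

# Hypothetical occurrence records: (species name, age in millions of years).
# Placeholder values for illustration only, not taken from NSB.
occurrences = [
    ("species_A", 0.4),
    ("species_B", 0.9),
    ("species_A", 1.2),
    ("species_C", 1.7),
    ("species_C", 1.8),
    ("species_B", 2.5),
]


def richness_per_bin(records, bin_width_my=1.0):
    """Count distinct species observed in each time bin.

    Returns a dict mapping each bin's lower edge (in my) to its species count.
    """
    species_in_bin = defaultdict(set)
    for species, age_my in records:
        # Assign the record to the bin whose lower edge lies at or below its age.
        bin_start = int(age_my // bin_width_my) * bin_width_my
        species_in_bin[bin_start].add(species)
    return {start: len(names) for start, names in sorted(species_in_bin.items())}


richness = richness_per_bin(occurrences)
print(richness)
```

A bin width well above the average age resolution of the data (ca. 0.3 my for NSB) keeps most records unambiguously inside a single bin; published diversity studies typically add sampling standardization, which this sketch omits.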

NSB was developed over more than 20 years at various institutions, with funding from national science agencies, the European Community, and CEES Oslo. The current implementation, developed and hosted by the MfN, was largely written by Johan Renaudie and Patrick Diver (USA), with community support for taxonomic and geochronologic content (particularly from Jeremy Young, UCL, London, and Brian Huber, Smithsonian, USA).

More information on NSB:

NSB website


Complete list of publications using or discussing NSB as a bibtex file.

Recent research papers using NSB from our group:

Renaudie, J. (2016). Quantifying the Cenozoic marine diatom deposition history: links to the C and Si cycles. Biogeosciences, 13(21):6003–6014.

Wiese, R., Renaudie, J., and Lazarus, D. B. (2016). Testing the accuracy of genus-level data to predict species diversity in Cenozoic marine diatoms. Geology, 44(12):1051–1054.

Lazarus, D., Barron, J., Renaudie, J., Diver, P., and Türke, A. (2014). Cenozoic planktonic marine diatom diversity and correlation to climate change. PloS one, 9(1):e84857.

Papers describing the NSB/Neptune system, methods papers or general reviews of paleobiologic research using the database:

Fenton, I., Woodhouse, A., Aze, T., Lazarus, D., Renaudie, J., Dunhill, A., Young, J., Saupe, E., 2021. Triton, a new species-level database of Cenozoic planktonic foraminiferal occurrences. Scientific Data, 8:160.

Renaudie, J., Lazarus, D.B., Diver, P., 2020. NSB (Neptune Sandbox Berlin): an expanded and improved database of marine planktonic microfossil data and deep-sea stratigraphy. Palaeontologia Electronica, 23(1):a11.

Yasuhara, M., Tittensor, D.P., Hillebrand, H., and Worm, B., 2017, Combining marine macroecology and palaeoecology in understanding biodiversity: microfossils as a model: Biological Reviews, 92:199–215.

Lazarus, D., Weinkauf, M., and Diver, P., 2012, Pacman profiling: a simple procedure to identify stratigraphic outliers in high density deep-sea microfossil data: Paleobiology, v. 38, p. 144–161.

Lazarus, D., 2011, The deep-sea microfossil record of macroevolutionary change in plankton and its study, in Smith, A., and McGowan, A., eds., Comparing the Geological and Fossil Records: Implications for Biodiversity Studies: London, The Geological Society, p. 141–166.

Fils, D., Cervato, C., Reed, J., Diver, P., Tang, X., Bohling, G., and Greer, D., 2009, CHRONOS architecture: Experiences with an open-source services-oriented architecture for geoinformatics: Comp. Geosci., v. 35, p. 774–782.

Bohling, G., 2005, Chronos Age-Depth Plot: A Java application for stratigraphic data analysis: Geosphere, v. 1, p. 78–84.

Spencer-Cervato, C., 1999, The Cenozoic deep sea microfossil record: explorations of the DSDP/ODP sample set using the Neptune database: Palaeontologia Electronica, v. 2.

Lazarus, D.B., Spencer-Cervato, C., Pianka-Biolzi, M., Beckmann, J.P., von Salis, K., Hilbrecht, H., and Thierstein, H.R., 1995, Revised chronology of Neogene DSDP holes from the world ocean: College Station, Ocean Drilling Program, v. 24, ca. 250 p.

Lazarus, D.B., 1994, The Neptune Project - a marine micropaleontology database: Math. Geol., v. 26, p. 817–832.

Lazarus, D.B., 1992, Age Depth Plot and Age Maker: age depth modelling on the Macintosh series of computers: Geobyte, v. 7, p. 7–13.

ReLU6 activation map of MobileNet on a picture of Antarctissa cylindrica (first convolution).

Automatic identification of radiolarian species

Automated identification would significantly improve data quantity and quality in fields such as plankton research and micropaleontology, where enormous numbers of objects are available, particularly in applied studies of environmental and climate change. We are developing a Python-based (TensorFlow) machine learning workflow built on the MobileNet convolutional network. The aim is to identify closely related species of radiolarians from complete species populations (not only ideal specimens), as they are normally identified in standard transmitted-light microscope preparations.
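The activation map shown above comes from a ReLU6 layer, the activation MobileNet applies after its convolutions: a standard ReLU clipped at 6. The sketch below is a plain-Python illustration of that function only; the `relu6` helper and the input values are made up for this example and are not taken from the project's code.

```python
def relu6(x):
    """ReLU6: clamp the input to the range [0, 6]."""
    return min(max(x, 0.0), 6.0)


# Hypothetical pre-activation values from one convolution filter.
pre_activations = [-3.0, 0.5, 4.2, 9.0]
activations = [relu6(v) for v in pre_activations]
print(activations)  # [0.0, 0.5, 4.2, 6.0]
```

Negative responses are zeroed and large responses saturate at 6, so an activation map like the one pictured only distinguishes filter responses within that bounded range.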

Code and Dataset:

The code can be found in this Github repository, and the complete dataset used in the pilot study can be found in this Zenodo repository.


Ryan J. Gray

Veronica Carlsson


Renaudie J., Gray R., Lazarus D.B. 2018. Accuracy of a neural net classification of closely-related species of microfossils from a sparse dataset of unedited images. PeerJ Preprints.