Post by Caleb Aldridge. This is a recent paper written for a directed study course on the philosophy and application of estimation. The thoughts are my own and do not necessarily reflect those of my lab.
“I suppose it is tempting, if the only tool you have is a hammer, to treat everything as if it were a nail.” (Maslow, 1966)
The beginning of the 21st century coincided with the explosion of the Information Age, a period marked by economies based on information and information technology. The shift within the past decade can be likened to the growth of an understory tree racing to fill a gap in the canopy. With this new age has come an expansion in computer sensors, technology, and data collection. The interrelated fields of ecology, conservation biology, and natural resources management, hereafter “ecology” for convenience, traditionally relied exclusively on manual methods to capture data, but new sensors and technologies have changed both how data are collected and how much. Application of data-driven techniques (i.e., machine learning) to handle the deluge of data has ignited interesting discussions regarding scientific reasoning and ecological management (Kelling et al., 2009; Peters et al., 2014). In the following essay I highlight: (1) the history of scientific processes; (2) the balance of understanding and prediction; and (3) the role of philosophy of science in applied ecology.
A Brief History of Scientific Processes
Reasoning typically takes three forms: deduction, induction, and abduction. Deductive logic proceeds from the general to the singular (e.g., all whales are blue; George is a whale; therefore George is blue). Inductive logic proceeds from the sample to the general (e.g., 98% of whales sampled are blue; it is expected that most whales are blue, including George the whale). Abductive logic proceeds from specific to general and seeks the most likely explanation (e.g., Anne, Percy, and Toni are whales and are blue; George is a whale; it is likely that George is also blue). Deduction is the preferred method of reasoning because, when its premises are true, it leaves no room for uncertainty, but ecologists typically are not privileged with complete knowledge of the population and must use inference to gain knowledge about the world.
Both historically and today, two approaches to understanding have been used: hypothesis-driven (H-D) and data-driven (D-D). The H-D approach can be generally characterized as moving from theory or ideas to data (Kell & Oliver, 2003). Advocates for the H-D approach include Robert Boyle, Robert Hooke, Thomas Chamberlin, John Platt, and Karl Popper (Elliott et al., 2016). The D-D approach can be generally characterized as moving from data to ideas (Kell & Oliver, 2003). Advocates for the D-D approach include Francis Bacon, Isaac Newton, William Whewell, and John Stuart Mill (Snyder, 1999; Elliott et al., 2016). The popularity of each approach has waxed and waned since the 17th century (Elliott et al., 2016). Contemporary views favor a more nuanced, iterative approach to filling knowledge gaps: a D-D approach helps discover patterns and generate hypotheses in light of data and theory, and those hypotheses are then tested in an H-D framework, which in turn garners support from additional observation and may imply new hypotheses. This contemporary view aligns with Hilborn and Mangel (1997): “…we bring to the problem whatever techniques–from wherever they come–needed to solve it.”
See Elliott et al. (2016), Kell and Oliver (2003), Peters et al. (2014), Elliott (2012), Kelling et al. (2009), and Fudge (2014) for further details.
N is Only Part of a Study’s Strength
Large amounts of data and methods for discovering patterns can serve well for predicting and for generating hypotheses, but causal inference must be approached more carefully. As Hernán, Hsu, and Healy (2019) state, “…the validity of causal inferences… depends on the adequacy of expert causal knowledge.” Sometimes a theory-based model that supports strong inference will nonetheless predict less accurately than a “wrong” model (Shmueli, 2010; Platt, 1964; Fudge, 2014; Kelling et al., 2009). The best models from a scientific perspective are those that allow for both powerful explanation and powerful prediction.
Three important considerations for any ecologist when balancing between understanding and prediction are: (1) know the objectives of one’s study, (2) know the context of one’s study, and (3) know one’s data, whether they have been or are yet to be collected. Elucidating these will help the ecologist strengthen their study and meet its purpose (Elliott et al., 2016; Nichols & Williams, 2006; Williams, Nichols, & Conroy, 2002).
Management and Knowledge Discovery
The idea of knowing one’s objectives, context, and data is echoed by Nichols and Williams (2006), though they perhaps over-emphasize the necessity of hypotheses in management. (They themselves pull back on the throttle in the “Caveats” section.) Gregory et al. (2012), Conroy and Peterson (2013), and even the latter chapters of Williams, Nichols, and Conroy (2002) take a more nuanced approach, suggesting that the ecologist carefully balance understanding and prediction. This balance is discussed from a modeling perspective by Shmueli (2010).
If one considers that the H-D and D-D approaches compete on efficiency, not just on logic and virtue, one can envision science and management progressing by using the approaches iteratively. (This struggle over efficiency is also a focus in Nichols and Williams (2006) as they contrast targeted and surveillance monitoring.) Indeed, adopting elements of the D-D approach into adaptive management (AM) might prove to be even more efficient. Within the structured decision making (SDM) and AM framework, the need for information would initiate a D-D approach to generate new hypotheses, which would then be considered in light of theory and tested via the H-D approach.
Closing Comments
Both H-D (i.e., deductive and focused more on explanation or parts) and D-D (i.e., inductive/abductive and focused more on prediction or systems) approaches have made large and important contributions to scientific knowledge. Camps have formed around which approach is superior, or whether the other is valid at all. I prefer to view each as necessary to scientific progress and complementary to the other (e.g., Darwin collected specimens from which he inductively inferred evolution by natural selection, which was then tested in a hypothetico-deductive approach). This shifts the focus to the knowledge gap or challenge at hand and the approaches or methods necessary to address it (Elliott et al., 2016; Shmueli, 2010; Kell & Oliver, 2003; Peters et al., 2014; Elliott, 2012; Kelling et al., 2009; Hilborn & Mangel, 1997; Glass & Hall, 2008).
References and Additional Readings
Conroy, M. J. & Peterson, J. T. (2013). Decision making in natural resource management: A structured, adaptive approach. West Sussex, UK: Wiley-Blackwell.
Elliott, K. C. (2012). Epistemic and methodological iteration in scientific research. Studies in History and Philosophy of Science, 43(2): 376–382.
Elliott, K. C., Cheruvelil, K. S., Montgomery, G. M., & Soranno, P. A. (2016). Conceptions of good science in our data-rich world. BioScience, 66(10): 880–889.
Franklin, L. R. (2005). Exploratory experiments. Philosophy of Science, 72(5): 888–899.
Fudge, D. (2014). Fifty years of J. R. Platt’s strong inference. The Journal of Experimental Biology, 217: 1202–1204.
Gelman, A., Carlin, J. B., Stern, H. S., Dunson, D. B., Vehtari, A., & Rubin, D. B. (2014). Bayesian data analysis (3rd ed., pp. 3–28). Boca Raton, FL: CRC Press.
Glass, D. J. & Hall, N. (2008). A brief history of the hypothesis. Cell, 134(3): 378–381.
Gregory, R., Failing, L., Harstone, M., Long, G., McDaniels, T., & Ohlson, D. (2012). Structured decision making: A practical guide to environmental management choices. West Sussex, UK: Wiley-Blackwell.
Hernán, M. A., Hsu, J., & Healy, B. (2019). A second chance to get causal inference right: A classification of data science tasks. Chance, 32(1): 42–49.
Hilborn, R. & Mangel, M. (1997). The ecological detective: Confronting models with data (pp. xi–11). Princeton, NJ: Princeton University Press.
Hobbs, N. T. & Hooten, M. B. (2015). Bayesian models: A statistical primer for ecologists (pp. 3–10). Princeton, NJ: Princeton University Press.
Kell, D. B. & Oliver, S. G. (2003). Here is the evidence, now what is the hypothesis? The complementary roles of inductive and hypothesis-driven science in the post-genomic era. BioEssays, 26(1): 99–105.
Kelling, S., Hochachka, W. M., Fink, D., Riedewald, M., Caruana, R., Ballard, G., & Hooker, G. (2009). Data-intensive science: A new paradigm for biodiversity studies. BioScience, 59(7): 613–620.
Maslow, A. H. (1966). The psychology of science: A reconnaissance (p. 15). New York, NY: Harper & Row.
Nichols, J. D. & Williams, B. K. (2006). Monitoring for conservation. Trends in Ecology and Evolution, 21(12): 668–673.
Peters, D. P. C., Havstad, K. M., Cushing, J., Tweedie, C., Fuentes, O., & Villanueva-Rosales, N. (2014). Harnessing the power of big data: Infusing the scientific method with machine learning to transform ecology. Ecosphere, 5(6): 67.
Platt, J. R. (1964). Strong inference: Certain systematic methods of scientific thinking may produce much more rapid progress than others. Science, 146(3642): 347–353.
Shmueli, G. (2010). To explain or to predict? Statistical Science, 25(3): 289–310.
Snyder, L. J. (1999). Renovating the Novum Organum: Bacon, Whewell and induction. Studies in History and Philosophy of Science, 30(4): 531–557.
Steinle, F. (1997). Entering new fields: Exploratory uses of experimentation. Philosophy of Science, 64(Proceedings): S65–S74.
Williams, B. K., Nichols, J. D., & Conroy, M. J. (2002). Analysis and management of animal populations: Modeling, estimation, and decision making (pp. 11–58). San Diego, CA: Academic Press.