About this course
Advances in technology and information processing are rapidly changing many fields within the plant sciences, animal sciences and ecology, including research, agriculture and conservation. For example, distributed sensor networks now allow the acquisition of huge volumes of data on many relevant aspects, ranging from soil and vegetation characteristics and abiotic conditions such as weather to the behaviour of animals. The availability of unprecedented amounts of data unlocks new potential, but it also creates a major challenge: processing and analysing these data effectively. In the current data-centred digital era, driven by technological change, the volume of data will continue to grow rapidly as the costs of data collection, storage and processing decrease. Fostered by these technological developments, researchers and various branches of business are increasingly embracing data science: a concept that unifies data processing, statistics, artificial intelligence and their related algorithms to extract knowledge from data. As a result, data science is becoming an integral part of decision making in many fields, including precision agriculture, livestock management and nature conservation, because it enables automated prediction and classification (e.g. Is this animal ill? Is this plant a weed? Is this apple ready to pick? When should we harvest?).
To keep up with these technological developments, students need to become acquainted with the accompanying terms, concepts and methodology. This is especially important because data science can require a different approach to using data and conducting science than the approaches students are familiar with. Large volumes of data usually come from various sources, each with its own characteristics, uncertainties and measurement errors. The data from these different sources need to be integrated, and the inherent heterogeneity should be accounted for. Moreover, collected sensor data are generally not immediately fit for analysis, so the raw data must first be pre-processed. After this initial pre-processing, the engineering of informative and discriminating features (i.e., measurable properties of the phenomenon being observed) is a crucial step for creating effective algorithms. Furthermore, the collection of large volumes of data leads to a shift away from frequentist hypothesis testing towards analytics that is more focussed on prediction, classification, pattern recognition or anomaly detection. To this end, machine learning techniques are often used, usually with the help of high-performance computing.
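As a rough illustration of the steps described above (pre-processing raw sensor data, engineering features, then classifying), here is a minimal toy sketch. It is not course material: the course itself works in R, and Python is used here purely for illustration. The animal names, readings, labels and the nearest-centroid classifier are all hypothetical choices made for this example.

```python
from statistics import mean, pstdev

# Hypothetical raw activity readings per animal; None marks a sensor dropout.
raw = {
    "cow_a": [0.9, 1.1, None, 1.0, 0.95, 1.05],
    "cow_b": [0.2, 0.1, 0.15, None, 0.1, 0.2],
    "cow_c": [1.2, 0.8, 1.1, 0.9, None, 1.0],
    "cow_d": [0.1, 0.2, 0.1, 0.15, 0.2, None],
}
# Hypothetical labels for the animals used as training data.
labels = {"cow_a": "healthy", "cow_b": "ill"}


def preprocess(readings):
    """Pre-processing: drop missing values caused by sensor dropouts."""
    return [r for r in readings if r is not None]


def features(readings):
    """Feature engineering: summarise a series as (mean activity, variability)."""
    return (mean(readings), pstdev(readings))


def train_centroids(feats, labels):
    """Training: compute the average feature vector (centroid) per class."""
    by_class = {}
    for name, label in labels.items():
        by_class.setdefault(label, []).append(feats[name])
    return {
        label: tuple(mean(column) for column in zip(*rows))
        for label, rows in by_class.items()
    }


def predict(centroids, x):
    """Classification: return the class whose centroid is closest to x."""
    return min(
        centroids,
        key=lambda c: sum((a - b) ** 2 for a, b in zip(centroids[c], x)),
    )


feats = {name: features(preprocess(r)) for name, r in raw.items()}
centroids = train_centroids(feats, labels)
# Predict only for the animals that were not used for training.
predictions = {
    name: predict(centroids, feats[name]) for name in raw if name not in labels
}
print(predictions)
```

With one labelled animal per class, this is far too little data for a real analysis; the sketch is only meant to show how the stages connect.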
This course covers the main elements of using a data science approach to solving agricultural or ecological problems. The students will be guided through the main conce
Learning outcomes
Explain important concepts in data science needed to solve typical ecological problems
Explain how key features of ecological data influence the selection, training, validation and evaluation of algorithms
Identify and select machine learning algorithms appropriate to specific ecological problems
Create a reproducible workflow (loading raw data, data processing, feature engineering, and machine learning algorithms) to efficiently analyse ecological datasets
Critically evaluate the reliability and adequacy of trained algorithms
Create ecological insight from data using a data science approach
Communicate the key elements and findings of a data science project clearly and concisely
Assessment
- Performance (30%) Acquired skills regarding the application of data science methods to solving ecological problems
- Assignment oral presentation (40%) A group-based examination based on the group work (execution of the project, data analysis, and presentation)
- Written test with open and closed questions (30%) General principles in data science for ecological applications as covered in the lectures
Prerequisites
Experience with programming in R is needed to follow and successfully complete this course. For example, students who have followed a course in which R is used heavily, e.g. CSA40306 Ecological Modelling and Data Analysis in R, will likely have sufficient background knowledge to participate in this course. We strongly urge students without prior experience with programming in R to learn it before the start of the course, either by:
- following the online course 'R programming' on Coursera (https://www.coursera.org/learn/r-programming): this course can be audited for free, and following the first 2 weeks of this course will suffice;
- or studying the free online book 'Hands-On Programming with R' (https://rstudio-education.github.io/hopr/), where parts 1 and 2 provide sufficient prerequisite knowledge.
We advise students who are unsure about their level of R skills to go through the first 2 parts of the online book 'Hands-On Programming with R' (https://rstudio-education.github.io/hopr/). If most elements discussed in these first 2 parts are understood, then the understanding of R programming is sufficient to participate in this course.
We assume a general understanding of ecology, mathematics and statistics. Familiarity with the concept of data science (e.g. INF34306 Data Science Concepts), the application of statistical methods to ecological data (e.g. CSA40306 Ecological Modelling and Data Analysis in R), and algorithms used in data science (e.g. MAT32806 Statistics for Data Scientists; FTE35306 Machine Learning; AIN31306 Deep Learning in Data Science) is helpful but not required.
Resources
- The book R for Data Science by Wickham and Grolemund (available in print or for free online) is used throughout the course, together with a collection of supplied book chapters and journal articles on relevant topics treated in the course.
Additional information
- Contact a coordinator
- Level: Master
- Mode of instruction: on campus
Start dates
9 Mar 2026 to 3 May 2026