Grad Coach

What Is Research Methodology? A Plain-Language Explanation & Definition (With Examples)

By Derek Jansen (MBA) and Kerryn Warren (PhD) | June 2020 (Last updated April 2023)

If you’re new to formal academic research, it’s quite likely that you’re feeling a little overwhelmed by all the technical lingo that gets thrown around. And who could blame you – “research methodology”, “research methods”, “sampling strategies”… it all seems never-ending!

In this post, we’ll demystify the landscape with plain-language explanations and loads of examples (including easy-to-follow videos), so that you can approach your dissertation, thesis or research project with confidence. Let’s get started.

Research Methodology 101

  • What exactly research methodology means
  • What qualitative, quantitative and mixed methods are
  • What sampling strategy is
  • What data collection methods are
  • What data analysis methods are
  • How to choose your research methodology
  • Example of a research methodology

Free Webinar: Research Methodology 101

What is research methodology?

Research methodology simply refers to the practical “how” of a research study. More specifically, it’s about how a researcher systematically designs a study to ensure valid and reliable results that address the research aims, objectives and research questions. In other words, how the researcher went about deciding:

  • What type of data to collect (e.g., qualitative or quantitative data)
  • Who to collect it from (i.e., the sampling strategy)
  • How to collect it (i.e., the data collection method)
  • How to analyse it (i.e., the data analysis methods)

Within any formal piece of academic research (be it a dissertation, thesis or journal article), you’ll find a research methodology chapter or section which covers the aspects mentioned above. Importantly, a good methodology chapter explains not just what methodological choices were made, but also why they were made. In other words, the methodology chapter should justify the design choices by showing that the chosen methods and techniques are the best fit for the research aims, objectives and research questions.

So, it’s the same as research design?

Not quite. As we mentioned, research methodology refers to the collection of practical decisions regarding what data you’ll collect, from whom, how you’ll collect it and how you’ll analyse it. Research design, on the other hand, is more about the overall strategy you’ll adopt in your study. For example, whether you’ll use an experimental design in which you manipulate one variable while controlling others. You can learn more about research design and the various design types here.


What are qualitative, quantitative and mixed methods?

Qualitative, quantitative and mixed methods are different types of methodological approaches, distinguished by their focus on words, numbers or both. This is a bit of an oversimplification, but it’s a good starting point for understanding.

Let’s take a closer look.

Qualitative research focuses on collecting and analysing words (written or spoken) and other textual or visual data, whereas quantitative research focuses on measurement and testing using numerical data. Qualitative analysis can also draw on “softer” data points, such as body language or visual elements.

It’s quite common for a qualitative methodology to be used when the research aims and research questions are exploratory in nature. For example, a qualitative methodology might be used to understand people’s perceptions of an event that took place, or of a political candidate running for president.

In contrast, a quantitative methodology is typically used when the research aims and research questions are confirmatory in nature. For example, a quantitative methodology might be used to measure the relationship between two variables (e.g. personality type and likelihood to commit a crime) or to test a set of hypotheses.

As you’ve probably guessed, the mixed-method methodology attempts to combine the best of both qualitative and quantitative methodologies to integrate perspectives and create a rich picture. If you’d like to learn more about these three methodological approaches, be sure to watch our explainer video below.

What is sampling strategy?

Simply put, sampling is about deciding who (or where) you’re going to collect your data from . Why does this matter? Well, generally it’s not possible to collect data from every single person in your group of interest (this is called the “population”), so you’ll need to engage a smaller portion of that group that’s accessible and manageable (this is called the “sample”).

How you go about selecting the sample (i.e., your sampling strategy) will have a major impact on your study. There are many different sampling methods you can choose from, but the two overarching categories are probability sampling and non-probability sampling.

Probability sampling involves using a completely random sample from the group of people you’re interested in. This is comparable to throwing the names of all potential participants into a hat, shaking it up, and picking out the “winners”. By using a completely random sample, you’ll minimise the risk of selection bias and the results of your study will be more generalisable to the entire population.

Non-probability sampling, on the other hand, doesn’t use a random sample. For example, it might involve using a convenience sample, which means you’d only interview or survey people that you have access to (perhaps your friends, family or work colleagues), rather than a truly random sample. With non-probability sampling, the results are typically not generalisable.
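The practical difference between these two categories can be sketched in a few lines of Python. This is purely illustrative (the participant names and sample size are made up): probability sampling draws at random from the whole population, while convenience sampling just takes whoever is easiest to reach.

```python
import random

# Hypothetical population of 100 potential participants (illustration only)
population = [f"Person {i}" for i in range(1, 101)]

# Probability sampling: every member has an equal chance of being
# selected -- the "names in a hat" approach
random_sample = random.sample(population, 10)

# Non-probability (convenience) sampling: simply take the people
# you happen to have access to, e.g. the first ten on the list
convenience_sample = population[:10]
```

A random sample like this is what supports generalisation to the population; the convenience sample generally doesn’t, because the first ten people on a list are rarely representative of everyone on it.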

To learn more about sampling methods, be sure to check out the video below.

What are data collection methods?

As the name suggests, data collection methods refer to the ways in which you go about collecting the data for your study. Some of the most common data collection methods include:

  • Interviews (which can be unstructured, semi-structured or structured)
  • Focus groups and group interviews
  • Surveys (online or physical surveys)
  • Observations (watching and recording activities)
  • Biophysical measurements (e.g., blood pressure, heart rate, etc.)
  • Documents and records (e.g., financial reports, court records, etc.)

The choice of which data collection method to use depends on your overall research aims and research questions , as well as practicalities and resource constraints. For example, if your research is exploratory in nature, qualitative methods such as interviews and focus groups would likely be a good fit. Conversely, if your research aims to measure specific variables or test hypotheses, large-scale surveys that produce large volumes of numerical data would likely be a better fit.

What are data analysis methods?

Data analysis methods refer to the methods and techniques that you’ll use to make sense of your data. These can be grouped according to whether the research is qualitative  (words-based) or quantitative (numbers-based).

Popular data analysis methods in qualitative research include:

  • Qualitative content analysis
  • Thematic analysis
  • Discourse analysis
  • Narrative analysis
  • Interpretative phenomenological analysis (IPA)
  • Visual analysis (of photographs, videos, art, etc.)

Qualitative data analysis begins with data coding, after which an analysis method is applied. In some cases, more than one analysis method is used, depending on the research aims and research questions. In the video below, we explore some common qualitative analysis methods, along with practical examples.
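To make the idea of coding a little more concrete, here’s a minimal Python sketch of one early step: tallying how often each code appears across coded interview excerpts. The excerpts and codes below are entirely hypothetical, and real qualitative coding is far more iterative than this, but the tally illustrates how coded data can feed into something like thematic analysis.

```python
from collections import Counter

# Hypothetical interview excerpts, each tagged with a code (illustration only)
coded_segments = [
    ("I never felt listened to", "communication"),
    ("The waiting times were far too long", "access"),
    ("Nobody explained the process to me", "communication"),
    ("It was hard to get an appointment", "access"),
    ("Staff were friendly once I got in", "staff attitude"),
]

# Tally how often each code appears -- a first, rough step towards
# spotting candidate themes in the data
code_counts = Counter(code for _, code in coded_segments)
```

Frequency alone doesn’t make a theme, of course; the counts simply point the researcher towards patterns worth interpreting in context.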

Moving on to the quantitative side of things, popular data analysis methods in this type of research include:

  • Descriptive statistics (e.g., means, medians, modes)
  • Inferential statistics (e.g., correlation, regression, structural equation modelling)

Again, the choice of which data analysis method to use depends on your overall research aims and objectives, as well as practicalities and resource constraints. In the video below, we explain some core concepts central to quantitative analysis.

How do I choose a research methodology?

As you’ve probably picked up by now, your research aims and objectives have a major influence on the research methodology . So, the starting point for developing your research methodology is to take a step back and look at the big picture of your research, before you make methodology decisions. The first question you need to ask yourself is whether your research is exploratory or confirmatory in nature.

If your research aims and objectives are primarily exploratory in nature, your research will likely be qualitative and therefore you might consider qualitative data collection methods (e.g. interviews) and analysis methods (e.g. qualitative content analysis). 

Conversely, if your research aims and objectives are looking to measure or test something (i.e. they’re confirmatory), then your research will quite likely be quantitative in nature, and you might consider quantitative data collection methods (e.g. surveys) and analysis methods (e.g. statistical analysis).

Designing your research and working out your methodology is a large topic, which we cover extensively on the blog. For now, however, the key takeaway is that you should always start with your research aims, objectives and research questions (the golden thread). Every methodological choice you make needs to align with those three components.

Example of a research methodology chapter

In the video below, we provide a detailed walkthrough of a research methodology from an actual dissertation, as well as an overview of our free methodology template .


Psst… there’s more (for free)

This post is part of our dissertation mini-course, which covers everything you need to get started with your dissertation, thesis or research project. 




Research Methods: Quantitative, Qualitative, and More (Overview)

  • Quantitative Research
  • Qualitative Research
  • Data Science Methods (Machine Learning, AI, Big Data)
  • Text Mining and Computational Text Analysis
  • Evidence Synthesis/Systematic Reviews
  • Get Data, Get Help!

About Research Methods

This guide provides an overview of research methods, how to choose and use them, and supports and resources at UC Berkeley. 

As Patten and Newhart note in the book Understanding Research Methods, "Research methods are the building blocks of the scientific enterprise. They are the 'how' for building systematic knowledge. The accumulation of knowledge through research is by its nature a collective endeavor. Each well-designed study provides evidence that may support, amend, refute, or deepen the understanding of existing knowledge... Decisions are important throughout the practice of research and are designed to help researchers collect evidence that includes the full spectrum of the phenomenon under study, to maintain logical rules, and to mitigate or account for possible sources of bias. In many ways, learning research methods is learning how to see and make these decisions."

The choice of methods varies by discipline, by the kind of phenomenon being studied and the data being used to study it, by the technology available, and more.  This guide is an introduction, but if you don't see what you need here, always contact your subject librarian, and/or take a look to see if there's a library research guide that will answer your question. 

Suggestions for changes and additions to this guide are welcome! 

START HERE: SAGE Research Methods

Without question, the most comprehensive resource available from the library is SAGE Research Methods. See the online guide to this one-stop collection; some helpful links are below:

  • SAGE Research Methods
  • Little Green Books  (Quantitative Methods)
  • Little Blue Books  (Qualitative Methods)
  • Dictionaries and Encyclopedias  
  • Case studies of real research projects
  • Sample datasets for hands-on practice
  • Streaming video--see methods come to life
  • Methodspace, a community for researchers
  • SAGE Research Methods Course Mapping

Library Data Services at UC Berkeley

Library Data Services Program and Digital Scholarship Services

The LDSP offers a variety of services and tools! From this link, check out pages for each of the following topics: discovering data, managing data, collecting data, GIS data, text data mining, publishing data, digital scholarship, open science, and the Research Data Management Program.

Be sure also to check out the visual guide to where to seek assistance on campus with any research question you may have!

Library GIS Services

Other Data Services at Berkeley

  • D-Lab: supports Berkeley faculty, staff, and graduate students with research in data-intensive social science, including a wide range of training and workshop offerings
  • Dryad: a simple self-service tool for researchers to use in publishing their datasets; it provides tools for the effective publication of and access to research data
  • Geospatial Innovation Facility (GIF): provides leadership and training across a broad array of integrated mapping technologies on campus
  • Research Data Management: a UC Berkeley guide and consulting service for research data management issues

General Research Methods Resources

Here are some general resources for assistance:

  • Assistance from ICPSR (must create an account to access): Getting Help with Data , and Resources for Students
  • Wiley Stats Ref for background information on statistics topics
  • Survey Documentation and Analysis (SDA), a program for easy web-based analysis of survey data

Consultants

  • D-Lab/Data Science Discovery Consultants Request help with your research project from peer consultants.
  • Research data (RDM) consulting Meet with RDM consultants before designing the data security, storage, and sharing aspects of your qualitative project.
  • Statistics Department Consulting Services A service in which advanced graduate students, under faculty supervision, are available to consult during specified hours in the Fall and Spring semesters.

Related Resources

  • IRB / CPHS Qualitative research projects with human subjects often require that you go through an ethics review.
  • OURS (Office of Undergraduate Research and Scholarships) OURS supports undergraduates who want to embark on research projects and assistantships. In particular, check out their "Getting Started in Research" workshops
  • Sponsored Projects Sponsored projects works with researchers applying for major external grants.
  • Last Updated: Apr 3, 2023 3:14 PM
  • URL: https://guides.lib.berkeley.edu/researchmethods


Research Design | Step-by-Step Guide with Examples

Published on 5 May 2022 by Shona McCombes . Revised on 20 March 2023.

A research design is a strategy for answering your research question  using empirical data. Creating a research design means making decisions about:

  • Your overall aims and approach
  • The type of research design you’ll use
  • Your sampling methods or criteria for selecting subjects
  • Your data collection methods
  • The procedures you’ll follow to collect data
  • Your data analysis methods

A well-planned research design helps ensure that your methods match your research aims and that you use the right kind of analysis for your data.

Table of contents

  • Introduction
  • Step 1: Consider your aims and approach
  • Step 2: Choose a type of research design
  • Step 3: Identify your population and sampling method
  • Step 4: Choose your data collection methods
  • Step 5: Plan your data collection procedures
  • Step 6: Decide on your data analysis strategies
  • Frequently asked questions

Before you can start designing your research, you should already have a clear idea of the research question you want to investigate.

There are many different ways you could go about answering this question. Your research design choices should be driven by your aims and priorities – start by thinking carefully about what you want to achieve.

The first choice you need to make is whether you’ll take a qualitative or quantitative approach.

Qualitative research designs tend to be more flexible and inductive , allowing you to adjust your approach based on what you find throughout the research process.

Quantitative research designs tend to be more fixed and deductive , with variables and hypotheses clearly defined in advance of data collection.

It’s also possible to use a mixed methods design that integrates aspects of both approaches. By combining qualitative and quantitative insights, you can gain a more complete picture of the problem you’re studying and strengthen the credibility of your conclusions.

Practical and ethical considerations when designing research

As well as scientific considerations, you need to think practically when designing your research. If your research involves people or animals, you also need to consider research ethics .

  • How much time do you have to collect data and write up the research?
  • Will you be able to gain access to the data you need (e.g., by travelling to a specific location or contacting specific people)?
  • Do you have the necessary research skills (e.g., statistical analysis or interview techniques)?
  • Will you need ethical approval ?

At each stage of the research design process, make sure that your choices are practically feasible.

Step 2: Choose a type of research design

Within both qualitative and quantitative approaches, there are several types of research design to choose from. Each type provides a framework for the overall shape of your research.

Types of quantitative research designs

Quantitative designs can be split into four main types. Experimental and quasi-experimental designs allow you to test cause-and-effect relationships, while descriptive and correlational designs allow you to measure variables and describe relationships between them.

With descriptive and correlational designs, you can get a clear picture of characteristics, trends, and relationships as they exist in the real world. However, you can’t draw conclusions about cause and effect (because correlation doesn’t imply causation ).

Experiments are the strongest way to test cause-and-effect relationships without the risk of other variables influencing the results. However, their controlled conditions may not always reflect how things work in the real world. They’re often also more difficult and expensive to implement.

Types of qualitative research designs

Qualitative designs are less strictly defined. This approach is about gaining a rich, detailed understanding of a specific context or phenomenon, and you can often be more creative and flexible in designing your research.

The table below shows some common types of qualitative design. They often have similar approaches in terms of data collection, but focus on different aspects when analysing the data.

Step 3: Identify your population and sampling method

Your research design should clearly define who or what your research will focus on, and how you’ll go about choosing your participants or subjects.

In research, a population is the entire group that you want to draw conclusions about, while a sample is the smaller group of individuals you’ll actually collect data from.

Defining the population

A population can be made up of anything you want to study – plants, animals, organisations, texts, countries, etc. In the social sciences, it most often refers to a group of people.

For example, will you focus on people from a specific demographic, region, or background? Are you interested in people with a certain job or medical condition, or users of a particular product?

The more precisely you define your population, the easier it will be to gather a representative sample.

Sampling methods

Even with a narrowly defined population, it’s rarely possible to collect data from every individual. Instead, you’ll collect data from a sample.

To select a sample, there are two main approaches: probability sampling and non-probability sampling . The sampling method you use affects how confidently you can generalise your results to the population as a whole.

Probability sampling is the most statistically valid option, but it’s often difficult to achieve unless you’re dealing with a very small and accessible population.

For practical reasons, many studies use non-probability sampling, but it’s important to be aware of the limitations and carefully consider potential biases. You should always make an effort to gather a sample that’s as representative as possible of the population.

Case selection in qualitative research

In some types of qualitative designs, sampling may not be relevant.

For example, in an ethnography or a case study, your aim is to deeply understand a specific context, not to generalise to a population. Instead of sampling, you may simply aim to collect as much data as possible about the context you are studying.

In these types of design, you still have to carefully consider your choice of case or community. You should have a clear rationale for why this particular case is suitable for answering your research question.

For example, you might choose a case study that reveals an unusual or neglected aspect of your research problem, or you might choose several very similar or very different cases in order to compare them.

Step 4: Choose your data collection methods

Data collection methods are ways of directly measuring variables and gathering information. They allow you to gain first-hand knowledge and original insights into your research problem.

You can choose just one data collection method, or use several methods in the same study.

Survey methods

Surveys allow you to collect data about opinions, behaviours, experiences, and characteristics by asking people directly. There are two main survey methods to choose from: questionnaires and interviews.

Observation methods

Observations allow you to collect data unobtrusively, observing characteristics, behaviours, or social interactions without relying on self-reporting.

Observations may be conducted in real time, taking notes as you observe, or you might make audiovisual recordings for later analysis. They can be qualitative or quantitative.

Other methods of data collection

There are many other ways you might collect data depending on your field and topic.

If you’re not sure which methods will work best for your research design, try reading some papers in your field to see what data collection methods they used.

Secondary data

If you don’t have the time or resources to collect data from the population you’re interested in, you can also choose to use secondary data that other researchers already collected – for example, datasets from government surveys or previous studies on your topic.

With this raw data, you can do your own analysis to answer new research questions that weren’t addressed by the original study.

Using secondary data can expand the scope of your research, as you may be able to access much larger and more varied samples than you could collect yourself.

However, it also means you don’t have any control over which variables to measure or how to measure them, so the conclusions you can draw may be limited.

Step 5: Plan your data collection procedures

As well as deciding on your methods, you need to plan exactly how you’ll use these methods to collect data that’s consistent, accurate, and unbiased.

Planning systematic procedures is especially important in quantitative research, where you need to precisely define your variables and ensure your measurements are reliable and valid.

Operationalisation

Some variables, like height or age, are easily measured. But often you’ll be dealing with more abstract concepts, like satisfaction, anxiety, or competence. Operationalisation means turning these fuzzy ideas into measurable indicators.

If you’re using observations , which events or actions will you count?

If you’re using surveys , which questions will you ask and what range of responses will be offered?

You may also choose to use or adapt existing materials designed to measure the concept you’re interested in – for example, questionnaires or inventories whose reliability and validity have already been established.

Reliability and validity

Reliability means your results can be consistently reproduced , while validity means that you’re actually measuring the concept you’re interested in.

For valid and reliable results, your measurement materials should be thoroughly researched and carefully designed. Plan your procedures to make sure you carry out the same steps in the same way for each participant.

If you’re developing a new questionnaire or other instrument to measure a specific concept, running a pilot study allows you to check its validity and reliability in advance.

Sampling procedures

As well as choosing an appropriate sampling method, you need a concrete plan for how you’ll actually contact and recruit your selected sample.

That means making decisions about things like:

  • How many participants do you need for an adequate sample size?
  • What inclusion and exclusion criteria will you use to identify eligible participants?
  • How will you contact your sample – by mail, online, by phone, or in person?

If you’re using a probability sampling method, it’s important that everyone who is randomly selected actually participates in the study. How will you ensure a high response rate?

If you’re using a non-probability method, how will you avoid bias and ensure a representative sample?

Data management

It’s also important to create a data management plan for organising and storing your data.

Will you need to transcribe interviews or perform data entry for observations? You should anonymise and safeguard any sensitive data, and make sure it’s backed up regularly.

Keeping your data well organised will save time when it comes to analysing them. It can also help other researchers validate and add to your findings.

Step 6: Decide on your data analysis strategies

On their own, raw data can’t answer your research question. The last step of designing your research is planning how you’ll analyse the data.

Quantitative data analysis

In quantitative research, you’ll most likely use some form of statistical analysis . With statistics, you can summarise your sample data, make estimates, and test hypotheses.

Using descriptive statistics , you can summarise your sample data in terms of:

  • The distribution of the data (e.g., the frequency of each score on a test)
  • The central tendency of the data (e.g., the mean to describe the average score)
  • The variability of the data (e.g., the standard deviation to describe how spread out the scores are)
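As a minimal sketch, these three descriptive summaries can be computed with Python's standard library (the test scores below are invented example data):

```python
# Descriptive statistics: distribution, central tendency, and variability,
# using only the standard library. The scores are made-up example data.
from collections import Counter
from statistics import mean, stdev

scores = [4, 7, 7, 8, 5, 7, 9, 8, 6, 7]

distribution = Counter(scores)      # frequency of each score
central_tendency = mean(scores)     # the average score
variability = stdev(scores)         # sample standard deviation (spread)

print(distribution[7])              # 4 (the score 7 occurs four times)
print(central_tendency)             # 6.8
print(round(variability, 2))        # 1.48
```

The same three summaries are usually the first thing reported about any quantitative sample, before any inferential tests are run.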

The specific calculations you can do depend on the level of measurement of your variables.

Using inferential statistics , you can:

  • Make estimates about the population based on your sample data.
  • Test hypotheses about a relationship between variables.

Regression and correlation tests look for associations between two or more variables, while comparison tests (such as t tests and ANOVAs ) look for differences in the outcomes of different groups.

Your choice of statistical test depends on various aspects of your research design, including the types of variables you’re dealing with and the distribution of your data.

Qualitative data analysis

In qualitative research, your data will usually be very dense with information and ideas. Instead of summing it up in numbers, you’ll need to comb through the data in detail, interpret its meanings, identify patterns, and extract the parts that are most relevant to your research question.

Two of the most common approaches to doing this are thematic analysis and discourse analysis .

There are many other ways of analysing qualitative data depending on the aims of your research. To get a sense of potential approaches, try reading some qualitative research papers in your field.

Frequently asked questions

A sample is a subset of individuals from a larger population. Sampling means selecting the group that you will actually collect data from in your research.

For example, if you are researching the opinions of students in your university, you could survey a sample of 100 students.

Statistical sampling allows you to test a hypothesis about the characteristics of a population. There are various sampling methods you can use to ensure that your sample is representative of the population as a whole.

Operationalisation means turning abstract conceptual ideas into measurable observations.

For example, the concept of social anxiety isn’t directly observable, but it can be operationally defined in terms of self-rating scores, behavioural avoidance of crowded places, or physical anxiety symptoms in social situations.

Before collecting data , it’s important to consider how you will operationalise the variables that you want to measure.

The research methods you use depend on the type of data you need to answer your research question .

  • If you want to measure something or test a hypothesis , use quantitative methods . If you want to explore ideas, thoughts, and meanings, use qualitative methods .
  • If you want to analyse a large amount of readily available data, use secondary data. If you want data specific to your purposes with control over how they are generated, collect primary data.
  • If you want to establish cause-and-effect relationships between variables , use experimental methods. If you want to understand the characteristics of a research subject, use descriptive methods.

Cite this Scribbr article

If you want to cite this source, you can copy and paste the citation or click the ‘Cite this Scribbr article’ button to automatically add the citation to our free Reference Generator.

McCombes, S. (2023, March 20). Research Design | Step-by-Step Guide with Examples. Scribbr. Retrieved 20 March 2024, from https://www.scribbr.co.uk/research-methods/research-design/

Shona McCombes

Research Methods In Psychology

Saul Mcleod, PhD

Editor-in-Chief for Simply Psychology

BSc (Hons) Psychology, MRes, PhD, University of Manchester

Saul Mcleod, PhD., is a qualified psychology teacher with over 18 years of experience in further and higher education. He has been published in peer-reviewed journals, including the Journal of Clinical Psychology.


Olivia Guy-Evans, MSc

Associate Editor for Simply Psychology

BSc (Hons) Psychology, MSc Psychology of Education

Olivia Guy-Evans is a writer and associate editor for Simply Psychology. She has previously worked in healthcare and educational sectors.


Research methods in psychology are systematic procedures used to observe, describe, predict, and explain behavior and mental processes. They include experiments, surveys, case studies, and naturalistic observations, ensuring data collection is objective and reliable to understand and explain psychological phenomena.


Hypotheses are statements that predict the results of an investigation and can be verified or disproved by it.

There are four types of hypotheses :
  • Null hypotheses (H0) – these predict that no difference will be found in the results between the conditions. Typically these are written ‘There will be no difference…’
  • Alternative hypotheses (Ha or H1) – these predict that there will be a significant difference in the results between the two conditions. This is also known as the experimental hypothesis.
  • One-tailed (directional) hypotheses – these state the specific direction the researcher expects the results to move in, e.g. higher, lower, more, less. In a correlation study, the predicted direction of the correlation can be either positive or negative.
  • Two-tailed (non-directional) hypotheses – these state that a difference will be found between the conditions of the independent variable but do not state the direction of the difference or relationship. Typically these are written ‘There will be a difference…’

All research has an alternative hypothesis (either a one-tailed or two-tailed) and a corresponding null hypothesis.

Once the research is conducted and results are found, psychologists must accept one hypothesis and reject the other. 

So, if a difference is found, the Psychologist would accept the alternative hypothesis and reject the null.  The opposite applies if no difference is found.
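This accept-or-reject decision can be sketched with a simple two-tailed permutation test. This is one of many possible inferential tests, and the scores for the two conditions are invented for illustration:

```python
# Hedged sketch: a two-tailed permutation test. If the observed difference
# between conditions would rarely arise by chance (p < 0.05), we reject the
# null hypothesis; otherwise we retain it. All data here are invented.
import random

group_a = [12, 15, 14, 16, 13, 15]   # e.g., scores in condition A
group_b = [10, 11, 12, 10, 13, 11]   # e.g., scores in condition B

def mean(xs):
    return sum(xs) / len(xs)

observed = abs(mean(group_a) - mean(group_b))   # observed difference: 3.0

random.seed(0)
pooled = group_a + group_b
extreme = 0
n_perms = 10_000
for _ in range(n_perms):
    # Reshuffle the pooled scores into two arbitrary "conditions"
    random.shuffle(pooled)
    diff = abs(mean(pooled[:6]) - mean(pooled[6:]))
    if diff >= observed:
        extreme += 1

p_value = extreme / n_perms
# p_value < 0.05: reject the null, accept the alternative hypothesis.
print(p_value < 0.05)
```

With these clearly separated groups the p-value comes out well below 0.05, so the null hypothesis would be rejected.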

Sampling techniques

Sampling is the process of selecting a representative group from the population under study.


A sample is the participants you select from a target population (the group you are interested in) to make generalizations about.

Representative means the extent to which a sample mirrors a researcher’s target population and reflects its characteristics.

Generalisability means the extent to which findings can be applied to the larger population from which the sample was drawn.

  • Volunteer sample : where participants pick themselves through newspaper adverts, noticeboards or online.
  • Opportunity sampling : also known as convenience sampling , uses people who are available at the time the study is carried out and willing to take part. It is based on convenience.
  • Random sampling : when every person in the target population has an equal chance of being selected. An example of random sampling would be picking names out of a hat.
  • Systematic sampling : when a system is used to select participants. Picking every Nth person from all possible participants. N = the number of people in the research population / the number of people needed for the sample.
  • Stratified sampling : when you identify the subgroups and select participants in proportion to their occurrences.
  • Snowball sampling : when researchers find a few participants, and then ask them to find participants themselves and so on.
  • Quota sampling : when researchers will be told to ensure the sample fits certain quotas, for example they might be told to find 90 participants, with 30 of them being unemployed.

Experiments always have an independent and dependent variable .

  • The independent variable is the one the experimenter manipulates (the thing that changes between the conditions the participants are placed into). It is assumed to have a direct effect on the dependent variable.
  • The dependent variable is the thing being measured, or the results of the experiment.


Operationalization of variables means making them measurable/quantifiable. We must use operationalization to ensure that variables are in a form that can be easily tested.

For instance, we can’t really measure ‘happiness’, but we can measure how many times a person smiles within a two-hour period. 

By operationalizing variables, we make it easy for someone else to replicate our research. Remember, this is important because we can check if our findings are reliable.

Extraneous variables are all variables other than the independent variable that could affect the results of the experiment.

It can be a natural characteristic of the participant, such as intelligence levels, gender, or age for example, or it could be a situational feature of the environment such as lighting or noise.

Demand characteristics are a type of extraneous variable that arises when participants work out the aims of the research study and begin to behave in a certain way as a result.

For example, in Milgram’s research , critics argued that participants worked out that the shocks were not real and they administered them as they thought this was what was required of them. 

Extraneous variables must be controlled so that they do not affect (confound) the results.

Randomly allocating participants to their conditions or using a matched pairs experimental design can help to reduce participant variables. 

Situational variables are controlled by using standardized procedures, ensuring every participant in a given condition is treated in the same way

Experimental Design

Experimental design refers to how participants are allocated to each condition of the independent variable, such as a control or experimental group.
  • Independent design (between-groups design): each participant is selected for only one group. With the independent design, the most common way of deciding which participants go into which group is by means of randomization.
  • Matched participants design: each participant is selected for only one group, but the participants in the two groups are matched for some relevant factor or factors (e.g. ability, sex, age).
  • Repeated measures design (within-groups design): each participant appears in both groups, so that there are exactly the same participants in each group.
  • The main problem with the repeated measures design is that there may well be order effects. Their experiences during the experiment may change the participants in various ways.
  • They may perform better when they appear in the second group because they have gained useful information about the experiment or about the task. On the other hand, they may perform less well on the second occasion because of tiredness or boredom.
  • Counterbalancing is the best way of preventing order effects from disrupting the findings of an experiment, and involves ensuring that each condition is equally likely to be used first and second by the participants.
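Counterbalancing can be sketched as alternating the condition order across participants, so each condition is equally often completed first (a minimal illustration with hypothetical participants):

```python
# Hedged sketch of counterbalancing in a repeated measures design:
# half the participants do condition A then B, half do B then A,
# so order effects are spread evenly across both conditions.
participants = ["P1", "P2", "P3", "P4", "P5", "P6"]  # hypothetical IDs

orders = []
for i, p in enumerate(participants):
    if i % 2 == 0:
        orders.append((p, ["A", "B"]))   # alternate: A first
    else:
        orders.append((p, ["B", "A"]))   # alternate: B first

a_first = sum(1 for _, order in orders if order[0] == "A")
print(a_first)   # 3 of the 6 participants complete condition A first
```

Each condition is now first for exactly half the sample, so practice and fatigue effects affect A and B equally on average.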

If we wish to compare two groups with respect to a given independent variable, it is essential to make sure that the two groups do not differ in any other important way. 

Experimental Methods

All experimental methods involve an IV (independent variable) and a DV (dependent variable).

  • Field experiments are conducted in the everyday (natural) environment of the participants. The experimenter still manipulates the IV, but in a real-life setting. It may be possible to control extraneous variables, though such control is more difficult than in a lab experiment.
  • Natural experiments are when a naturally occurring IV is investigated that isn’t deliberately manipulated, it exists anyway. Participants are not randomly allocated, and the natural event may only occur rarely.

Case studies are in-depth investigations of a person, group, event, or community. They use information from a range of sources, such as the person concerned as well as their family and friends.

Many techniques may be used such as interviews, psychological tests, observations and experiments. Case studies are generally longitudinal: in other words, they follow the individual or group over an extended period of time. 

Case studies are widely used in psychology and among the best-known ones carried out were by Sigmund Freud . He conducted very detailed investigations into the private lives of his patients in an attempt to both understand and help them overcome their illnesses.

Case studies provide rich qualitative data and have high levels of ecological validity. However, it is difficult to generalize from individual cases as each one has unique characteristics.

Correlational Studies

Correlation means association; it is a measure of the extent to which two variables are related. One of the variables can be regarded as the predictor variable with the other one as the outcome variable.

Correlational studies typically involve obtaining two different measures from a group of participants, and then assessing the degree of association between the measures. 

The predictor variable can be seen as occurring before the outcome variable in some sense. It is called the predictor variable, because it forms the basis for predicting the value of the outcome variable.

Relationships between variables can be displayed on a graph or as a numerical score called a correlation coefficient.


  • If an increase in one variable tends to be associated with an increase in the other, then this is known as a positive correlation .
  • If an increase in one variable tends to be associated with a decrease in the other, then this is known as a negative correlation .
  • A zero correlation occurs when there is no relationship between variables.

After looking at the scattergraph, if we want to be sure that a significant relationship does exist between the two variables, a statistical test of correlation can be conducted, such as Spearman’s rho.

The test will give us a score, called a correlation coefficient. This is a value between -1 and +1, and the closer its absolute value is to 1, the stronger the relationship between the variables. The coefficient can be positive (e.g. 0.63) or negative (e.g. -0.63).
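As a minimal sketch, a correlation coefficient can be computed by hand (Pearson's r here; Spearman's rho, mentioned above, applies the same idea to ranked data). The paired measures are invented example data:

```python
# Pearson's correlation coefficient r, computed from first principles.
# The two variables are invented measures from the same five participants.
from math import sqrt

x = [2, 4, 5, 7, 9]         # e.g., hours of revision (predictor variable)
y = [50, 55, 62, 70, 80]    # e.g., test score (outcome variable)

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n

# Sum of co-deviations, divided by the product of the deviation magnitudes
cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))

r = cov / (sd_x * sd_y)
# r lies between -1 and +1; the sign gives the direction of the
# relationship and the magnitude gives its strength.
print(round(r, 2))   # 0.99 (a strong positive correlation)
```

Here r is close to +1: as hours of revision increase, test scores tend to increase too, a strong positive correlation.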


A correlation between variables, however, does not automatically mean that the change in one variable is the cause of the change in the values of the other variable. A correlation only shows if there is a relationship between variables.

Correlation does not always prove causation, as a third variable may be involved. 


Interview Methods

Interviews are commonly divided into two types: structured and unstructured.

In a structured interview, a fixed, predetermined set of questions is put to every participant in the same order and in the same way.

Responses are recorded on a questionnaire, and the researcher presets the order and wording of questions, and sometimes the range of alternative answers.

The interviewer stays within their role and maintains social distance from the interviewee.

In an unstructured interview, there are no set questions, and the participant can raise whatever topics they feel are relevant, in their own way. Follow-up questions are posed in response to the participant’s answers.

Unstructured interviews are most useful in qualitative research to analyze attitudes and values.

Though they rarely provide a valid basis for generalization, their main advantage is that they enable the researcher to probe social actors’ subjective point of view. 

Questionnaire Method

Questionnaires can be thought of as a kind of written interview. They can be carried out face to face, by telephone, or post.

The choice of questions is important because of the need to avoid bias or ambiguity in the questions, ‘leading’ the respondent or causing offense.

  • Open questions are designed to encourage a full, meaningful answer using the subject’s own knowledge and feelings. They provide insights into feelings, opinions, and understanding. Example: “How do you feel about that situation?”
  • Closed questions can be answered with a simple “yes” or “no” or specific information, limiting the depth of response. They are useful for gathering specific facts or confirming details. Example: “Do you feel anxious in crowds?”

Its other practical advantages are that it is cheaper than face-to-face interviews and can be used to contact many respondents scattered over a wide area relatively quickly.

Observations

There are different types of observation methods :
  • Covert observation is where the researcher doesn’t tell the participants they are being observed until after the study is complete. There could be ethical problems around deception and consent with this particular observation method.
  • Overt observation is where a researcher tells the participants they are being observed and what they are being observed for.
  • Controlled : behavior is observed under controlled laboratory conditions (e.g., Bandura’s Bobo doll study).
  • Natural : Here, spontaneous behavior is recorded in a natural setting.
  • Participant : Here, the observer has direct contact with the group of people they are observing. The researcher becomes a member of the group they are researching.  
  • Non-participant (aka “fly on the wall”): The researcher does not have direct contact with the people being observed; participants’ behavior is observed from a distance.

Pilot Study

A pilot study is a small-scale preliminary study conducted in order to evaluate the feasibility of the key steps in a future, full-scale project.

A pilot study is an initial run-through of the procedures to be used in an investigation; it involves selecting a few people and trying out the study on them. It is possible to save time, and in some cases, money, by identifying any flaws in the procedures designed by the researcher.

A pilot study can help the researcher spot any ambiguities (i.e., wording that could be read in more than one way) or confusion in the information given to participants, or problems with the task devised.

Sometimes the task is too hard, and the researcher may get a floor effect: none of the participants can score well or complete the task, so all performances are low.

The opposite effect is a ceiling effect, when the task is so easy that all achieve virtually full marks or top performances and are “hitting the ceiling”.

Research Design

In cross-sectional research, a researcher compares multiple segments of the population at the same time.

Sometimes, we want to see how people change over time, as in studies of human development and lifespan. Longitudinal research is a research design in which data-gathering is administered repeatedly over an extended period of time.

In cohort studies , the participants must share a common factor or characteristic such as age, demographic, or occupation. A cohort study is a type of longitudinal study in which researchers monitor and observe a chosen population over an extended period.

Triangulation means using more than one research method to improve the study’s validity.

Reliability

Reliability is a measure of consistency: if a particular measurement is repeated and the same result is obtained, it is described as reliable.

  • Test-retest reliability : assessing the same person on two different occasions, which shows the extent to which the test produces the same answers.
  • Inter-observer reliability : the extent to which there is an agreement between two or more observers.
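As a small illustration, inter-observer reliability can be quantified as the proportion of observations on which two observers’ codings agree (a simple percentage-agreement measure; the coded data below are invented):

```python
def percent_agreement(obs_a, obs_b):
    """Inter-observer reliability as the proportion of matching codings."""
    matches = sum(a == b for a, b in zip(obs_a, obs_b))
    return matches / len(obs_a)

# Two observers coding the same 10 behaviour samples (invented data)
observer_a = ["hit", "push", "hit", "none", "push", "hit", "none", "hit", "push", "none"]
observer_b = ["hit", "push", "hit", "none", "hit", "hit", "none", "hit", "push", "none"]

print(percent_agreement(observer_a, observer_b))  # 0.9
```

In practice, researchers often use chance-corrected statistics (such as Cohen’s kappa) rather than raw agreement, but the idea is the same: the closer the observers’ records, the more reliable the observation schedule.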

Meta-Analysis

A meta-analysis is built on a systematic review: it involves identifying an aim, searching for research studies that have addressed similar aims/hypotheses, and statistically combining their results.

This is done by looking through various databases, and then decisions are made about what studies are to be included/excluded.

Strengths: Increases the validity of the conclusions, as they are based on a wider range of studies and participants.

Weaknesses: Research designs in studies can vary, so they are not truly comparable.

Peer Review

A researcher submits an article to a journal. The choice of the journal may be determined by the journal’s audience or prestige.

The journal selects two or more appropriate experts (psychologists working in a similar field) to peer review the article without payment. The peer reviewers assess the methods and designs used, the originality and validity of the findings, and the article’s content, structure, and language.

Feedback from the reviewers determines whether the article is accepted. The article may be: accepted as it is, accepted with revisions, sent back to the author to revise and re-submit, or rejected without the possibility of re-submission.

The editor makes the final decision on whether to accept or reject the research report based on the reviewers’ comments/recommendations.

Peer review is important because it prevents faulty data from entering the public domain, provides a way of checking the validity of findings and the quality of the methodology, and is used to assess the research rating of university departments.

Peer review may be an ideal, whereas in practice there are many problems. For example, it slows publication down and may prevent unusual, new work from being published. Some reviewers might use it as an opportunity to prevent competing researchers from publishing their work.

Some people doubt whether peer review can really prevent the publication of fraudulent research.

The advent of the internet means that much more research and academic comment is being published without official peer review than before, though systems are evolving online through which everyone has a chance to offer their opinions and police the quality of research.

Types of Data

  • Quantitative data is numerical data, e.g. reaction time or number of mistakes. It represents how much, how long, or how many there are of something. A tally of behavioral categories and closed questions in a questionnaire collect quantitative data.
  • Qualitative data is virtually any type of information that can be observed and recorded that is not numerical in nature and can be in the form of written or verbal communication. Open questions in questionnaires and accounts from observational studies collect qualitative data.
  • Primary data is first-hand data collected for the purpose of the investigation.
  • Secondary data is information that has been collected by someone other than the person who is conducting the research e.g. taken from journals, books or articles.

Validity means how well a piece of research actually measures what it sets out to, or how well it reflects the reality it claims to represent.

Validity is whether the observed effect is genuine and represents what is actually out there in the world.

  • Concurrent validity is the extent to which a psychological measure relates to an existing similar measure and obtains close results. For example, a new intelligence test compared to an established test.
  • Face validity : does the test measure what it’s supposed to measure ‘on the face of it’? This is assessed by ‘eyeballing’ the measuring instrument or by passing it to an expert to check.
  • Ecological validity is the extent to which findings from a research study can be generalized to other settings / real life.
  • Temporal validity is the extent to which findings from a research study can be generalized to other historical times.

Features of Science

  • Paradigm – A set of shared assumptions and agreed methods within a scientific discipline.
  • Paradigm shift – The result of the scientific revolution: a significant change in the dominant unifying theory within a scientific discipline.
  • Objectivity – When all sources of personal bias are minimised so as not to distort or influence the research process.
  • Empirical method – Scientific approaches that are based on the gathering of evidence through direct observation and experience.
  • Replicability – The extent to which scientific procedures and findings can be repeated by other researchers.
  • Falsifiability – The principle that a theory cannot be considered scientific unless it admits the possibility of being proved untrue.

Statistical Testing

A significant result is one where there is a low probability that chance factors were responsible for any observed difference, correlation, or association in the variables tested.

If our test is significant, we can reject our null hypothesis and accept our alternative hypothesis.

If our test is not significant, we retain our null hypothesis and reject our alternative hypothesis. A null hypothesis is a statement of no effect.

In Psychology, we use p < 0.05 (as it strikes a balance between making a Type I and a Type II error), but p < 0.01 is used in research where an error could cause harm, such as testing a new drug.

A type I error is when the null hypothesis is rejected when it should have been accepted (happens when a lenient significance level is used, an error of optimism).

A type II error is when the null hypothesis is accepted when it should have been rejected (happens when a stringent significance level is used, an error of pessimism).
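The decision rule above can be expressed in a few lines of Python (a sketch, not a substitute for a statistics package; the p-values passed in are hypothetical):

```python
ALPHA = 0.05  # the conventional significance level in psychology

def decide(p_value, alpha=ALPHA):
    """Apply the significance decision rule described above."""
    if p_value < alpha:
        return "reject the null hypothesis (significant)"
    return "retain the null hypothesis (not significant)"

print(decide(0.03))              # reject the null hypothesis (significant)
print(decide(0.20))              # retain the null hypothesis (not significant)
print(decide(0.03, alpha=0.01))  # the stricter level flips the decision to retain
```

Lowering alpha (e.g., from 0.05 to 0.01) makes Type I errors less likely but Type II errors more likely, which is exactly the trade-off described above.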

Ethical Issues

  • Informed consent is when participants are able to make an informed judgment about whether to take part. However, telling participants everything may cause them to guess the aims of the study and change their behavior.
  • To deal with this, we can gain presumptive consent or ask participants to formally indicate their agreement to participate, but this may invalidate the purpose of the study, and it is not guaranteed that the participants would fully understand.
  • Deception should only be used when it is approved by an ethics committee, as it involves deliberately misleading or withholding information. Participants should be fully debriefed after the study but debriefing can’t turn the clock back.
  • All participants should be informed at the beginning that they have the right to withdraw if they ever feel distressed or uncomfortable.
  • However, withdrawal causes bias, as the participants who stay may be more obedient, and some may not withdraw because they have been given incentives or feel they would be spoiling the study. Researchers can offer the right to withdraw data after participation.
  • Participants should all have protection from harm . The researcher should avoid risks greater than those experienced in everyday life and they should stop the study if any harm is suspected. However, the harm may not be apparent at the time of the study.
  • Confidentiality concerns the communication of personal information. Researchers should not record any names but use numbers or false names, though this may not always be sufficient, as it is sometimes possible to work out who the participants were.


What is Research Methodology? Definition, Types, and Examples


Research methodology 1,2 is a structured and scientific approach used to collect, analyze, and interpret quantitative or qualitative data to answer research questions or test hypotheses. A research methodology is like a plan for carrying out research and helps keep researchers on track by limiting the scope of the research. Several aspects must be considered before selecting an appropriate research methodology, such as research limitations and ethical concerns that may affect your research.

The research methodology section in a scientific paper describes the different methodological choices made, such as the data collection and analysis methods, and why these choices were selected. The reasons should explain why the methods chosen are the most appropriate to answer the research question. A good research methodology also helps ensure the reliability and validity of the research findings. There are three types of research methodology—quantitative, qualitative, and mixed-method, which can be chosen based on the research objectives.

What is research methodology?

A research methodology describes the techniques and procedures used to identify and analyze information regarding a specific research topic. It is a process by which researchers design their study so that they can achieve their objectives using the selected research instruments. It includes all the important aspects of research, including research design, data collection methods, data analysis methods, and the overall framework within which the research is conducted. While these points can help you understand what is research methodology, you also need to know why it is important to pick the right methodology.

Why is research methodology important?

Having a good research methodology in place has the following advantages: 3

  • Helps other researchers who may want to replicate your research; the explanations will be of benefit to them.
  • You can easily answer any questions about your research if they arise at a later stage.
  • A research methodology provides a framework and guidelines for researchers to clearly define research questions, hypotheses, and objectives.
  • It helps researchers identify the most appropriate research design, sampling technique, and data collection and analysis methods.
  • A sound research methodology helps researchers ensure that their findings are valid and reliable and free from biases and errors.
  • It also helps ensure that ethical guidelines are followed while conducting research.
  • A good research methodology helps researchers in planning their research efficiently, by ensuring optimum usage of their time and resources.


Types of research methodology

There are three types of research methodology based on the type of research and the data required. 1

  • Quantitative research methodology focuses on measuring and testing numerical data. This approach is good for reaching a large number of people in a short amount of time. This type of research helps in testing the causal relationships between variables, making predictions, and generalizing results to wider populations.
  • Qualitative research methodology examines the opinions, behaviors, and experiences of people. It collects and analyzes words and textual data. This research methodology requires fewer participants but is often more time consuming because the time spent per participant is quite large. This method is used in exploratory research where the research problem being investigated is not clearly defined.
  • Mixed-method research methodology uses the characteristics of both quantitative and qualitative research methodologies in the same study. This method allows researchers to validate their findings, verify if the results observed using both methods are complementary, and explain any unexpected results obtained from one method by using the other method.

What are the types of sampling designs in research methodology?

Sampling 4 is an important part of a research methodology and involves selecting a representative sample of the population to conduct the study, making statistical inferences about them, and estimating the characteristics of the whole population based on these inferences. There are two types of sampling designs in research methodology—probability and nonprobability.

  • Probability sampling

In this type of sampling design, a sample is chosen from a larger population using some form of random selection, that is, every member of the population has an equal chance of being selected. The different types of probability sampling are:

  • Systematic —sample members are chosen at regular intervals. It requires selecting a random starting point and a fixed sampling interval based on the required sample size. Because the selection pattern is predefined, it is the least time consuming.
  • Stratified —researchers divide the population into smaller groups that don’t overlap but represent the entire population. While sampling, these groups can be organized, and then a sample can be drawn from each group separately.
  • Cluster —the population is divided into clusters based on parameters like age, sex, location, etc., and whole clusters are then randomly selected.

  • Nonprobability sampling

In this type of sampling design, members are selected based on nonrandom criteria, such as accessibility or the researcher’s judgment, so not every member of the population has a chance of being included. The different types of nonprobability sampling are:

  • Convenience —selects participants who are most easily accessible to researchers due to geographical proximity, availability at a particular time, etc.
  • Purposive —participants are selected at the researcher’s discretion. Researchers consider the purpose of the study and the understanding of the target audience.
  • Snowball —already selected participants use their social networks to refer the researcher to other potential participants.
  • Quota —while designing the study, the researchers decide how many people with which characteristics to include as participants. The characteristics help in choosing people most likely to provide insights into the subject.
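To make two of these designs concrete, here is a minimal Python sketch of systematic and stratified selection (the population and strata are invented for illustration):

```python
import random

def systematic_sample(population, k):
    """Pick every step-th member after a random start (systematic sampling)."""
    step = len(population) // k
    start = random.randrange(step)
    return population[start::step][:k]

def stratified_sample(population, key, per_stratum):
    """Draw an equal random sample from each non-overlapping stratum."""
    strata = {}
    for member in population:
        strata.setdefault(key(member), []).append(member)
    sample = []
    for group in strata.values():
        sample.extend(random.sample(group, per_stratum))
    return sample

random.seed(1)  # fixed seed so the sketch is repeatable
people = [{"id": i, "age_band": "under 30" if i % 2 else "30 plus"}
          for i in range(100)]

print(len(systematic_sample(people, 10)))                          # 10
print(len(stratified_sample(people, lambda p: p["age_band"], 5)))  # 10, 5 per band
```

Note how the stratified draw guarantees representation from every group, whereas the systematic draw relies on the ordering of the list being unrelated to the characteristic being studied.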

What are data collection methods?

During research, data are collected using various methods depending on the research methodology being followed and the research methods being undertaken. Both qualitative and quantitative research have different data collection methods, as listed below.

Qualitative research 5

  • One-on-one interviews: Helps the interviewers understand a respondent’s subjective opinion and experience pertaining to a specific topic or event
  • Document study/literature review/record keeping: Researchers’ review of already existing written materials such as archives, annual reports, research articles, guidelines, policy documents, etc.
  • Focus groups: Constructive discussions that usually include a small sample of about 6-10 people and a moderator, to understand the participants’ opinion on a given topic.
  • Qualitative observation : Researchers collect data using their five senses (sight, smell, touch, taste, and hearing).

Quantitative research 6

  • Sampling: The most common type is probability sampling.
  • Interviews: Commonly telephonic or done in-person.
  • Observations: Structured observations are most commonly used in quantitative research. In this method, researchers make observations about specific behaviors of individuals in a structured setting.
  • Document review: Reviewing existing research or documents to collect evidence for supporting the research.
  • Surveys and questionnaires: Surveys can be administered both online and offline depending on the requirement and sample size.


What are data analysis methods?

The data collected using the various methods for qualitative and quantitative research need to be analyzed to generate meaningful conclusions. These data analysis methods 7 also differ between quantitative and qualitative research.

Quantitative research involves a deductive method for data analysis where hypotheses are developed at the beginning of the research and precise measurement is required. The methods include statistical analysis applications to analyze numerical data and are grouped into two categories—descriptive and inferential.

Descriptive analysis is used to describe the basic features of different types of data to present it in a way that ensures the patterns become meaningful. The different types of descriptive analysis methods are:

  • Measures of frequency (count, percent, frequency)
  • Measures of central tendency (mean, median, mode)
  • Measures of dispersion or variation (range, variance, standard deviation)
  • Measure of position (percentile ranks, quartile ranks)
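All of these descriptive measures can be computed directly with Python’s standard statistics module; the reaction-time data below are invented:

```python
import statistics

# Reaction times in milliseconds for a small invented sample
times = [310, 295, 320, 310, 305, 340, 310, 300]

print("mean:", statistics.mean(times))              # 311.25
print("median:", statistics.median(times))          # 310.0
print("mode:", statistics.mode(times))              # 310
print("range:", max(times) - min(times))            # 45
print("stdev:", round(statistics.stdev(times), 1))  # 13.8
```

The mean, median, and mode summarize central tendency, while the range and standard deviation summarize how spread out the scores are around it.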

Inferential analysis is used to make predictions about a larger population based on the analysis of the data collected from a smaller population. This analysis is used to study the relationships between different variables. Some commonly used inferential data analysis methods are:

  • Correlation: To understand the relationship between two or more variables.
  • Cross-tabulation: Analyze the relationship between multiple variables.
  • Regression analysis: Study the impact of independent variables on the dependent variable.
  • Frequency tables: To understand the frequency of data.
  • Analysis of variance: To test whether the means of two or more groups differ significantly in an experiment.
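As an illustration of regression analysis, the following sketch fits an ordinary least-squares line for a single independent variable (the study-hours data are invented):

```python
def linear_regression(xs, ys):
    """Ordinary least-squares fit y = a + b*x for one predictor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    a = my - b * mx
    return a, b

hours_studied = [1, 2, 3, 4, 5]
exam_scores = [52, 58, 61, 70, 74]

a, b = linear_regression(hours_studied, exam_scores)
print(f"score = {a:.1f} + {b:.1f} * hours")  # each extra hour predicts ~5.6 more marks
```

The slope b quantifies the impact of the independent variable (hours studied) on the dependent variable (exam score), which is exactly what the regression bullet above describes.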

Qualitative research involves an inductive method for data analysis where hypotheses are developed after data collection. The methods include:

  • Content analysis: For analyzing documented information from text and images by determining the presence of certain words or concepts in texts.
  • Narrative analysis: For analyzing content obtained from sources such as interviews, field observations, and surveys. The stories and opinions shared by people are used to answer research questions.
  • Discourse analysis: For analyzing interactions with people considering the social context, that is, the lifestyle and environment, under which the interaction occurs.
  • Grounded theory: Involves hypothesis creation by data collection and analysis to explain why a phenomenon occurred.
  • Thematic analysis: To identify important themes or patterns in data and use these to address an issue.
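A toy version of content analysis, counting occurrences of predefined concept words in text, can be sketched in a few lines of Python (the transcript snippets and concept list are invented):

```python
from collections import Counter

# Minimal content-analysis sketch: count how often predefined concept
# words appear across interview transcripts (invented snippets).
transcripts = [
    "I felt anxious before the exam, really anxious, but calmer afterwards.",
    "The support from my tutor made me feel calm and confident.",
]

concepts = {"anxious", "calm", "confident"}
words = [w.strip(".,").lower() for t in transcripts for w in t.split()]
counts = Counter(w for w in words if w in concepts)
print(counts)  # Counter({'anxious': 2, 'calm': 1, 'confident': 1})
```

Real content analysis involves a carefully developed coding scheme and reliability checks between coders, but the core move is the same: turning qualitative text into countable categories.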

How to choose a research methodology?

Here are some important factors to consider when choosing a research methodology: 8

  • Research objectives, aims, and questions —these would help structure the research design.
  • Review existing literature to identify any gaps in knowledge.
  • Check the statistical requirements —if data-driven or statistical results are needed then quantitative research is the best. If the research questions can be answered based on people’s opinions and perceptions, then qualitative research is most suitable.
  • Sample size —sample size can often determine the feasibility of a research methodology. For a large sample, less effort- and time-intensive methods are appropriate.
  • Constraints —constraints of time, geography, and resources can help define the appropriate methodology.


How to write a research methodology

A research methodology should include the following components: 3,9

  • Research design —should be selected based on the research question and the data required. Common research designs include experimental, quasi-experimental, correlational, descriptive, and exploratory.
  • Research method —this can be quantitative, qualitative, or mixed-method.
  • Reason for selecting a specific methodology —explain why this methodology is the most suitable to answer your research problem.
  • Research instruments —explain the research instruments you plan to use, mainly referring to the data collection methods such as interviews, surveys, etc. Here as well, a reason should be mentioned for selecting the particular instrument.
  • Sampling —this involves selecting a representative subset of the population being studied.
  • Data collection —involves gathering data using several data collection methods, such as surveys, interviews, etc.
  • Data analysis —describe the data analysis methods you will use once you’ve collected the data.
  • Research limitations —mention any limitations you foresee while conducting your research.
  • Validity and reliability —validity helps identify the accuracy and truthfulness of the findings; reliability refers to the consistency and stability of the results over time and across different conditions.
  • Ethical considerations —research should be conducted ethically. The considerations include obtaining consent from participants, maintaining confidentiality, and addressing conflicts of interest.

Streamline Your Research Paper Writing Process with Paperpal

The methods section is a critical part of a research paper, allowing other researchers to understand your findings and replicate your work when pursuing their own research. However, it is usually also the most difficult section to write. This is where Paperpal can help you overcome writer’s block and create the first draft in minutes with Paperpal Copilot, its secure generative AI feature suite.

With Paperpal you can get research advice, write and refine your work, rephrase and verify the writing, and ensure submission readiness, all in one place. Here’s how you can use Paperpal to develop the first draft of your methods section.  

  • Generate an outline: Input some details about your research to instantly generate an outline for your methods section 
  • Develop the section: Use the outline and suggested sentence templates to expand your ideas and develop the first draft.  
  • Paraphrase and trim: Get clear, concise academic text with paraphrasing that conveys your work effectively and word reduction to fix redundancies.
  • Choose the right words: Enhance text by choosing contextual synonyms based on how the words have been used in previously published work.  
  • Check and verify text: Make sure the generated text showcases your methods correctly, has all the right citations, and is original and authentic.

You can repeat this process to develop each section of your research manuscript, including the title, abstract and keywords. Ready to write your research papers faster, better, and without the stress? Sign up for Paperpal and start writing today!

Frequently Asked Questions

Q1. What are the key components of research methodology?

A1. A good research methodology has the following key components:

  • Research design
  • Data collection procedures
  • Data analysis methods
  • Ethical considerations

Q2. Why is ethical consideration important in research methodology?

A2. Ethical consideration is important in research methodology to assure readers of the reliability and validity of the study. Researchers must clearly mention the ethical norms and standards followed during the conduct of the research and also mention if the research has been cleared by any institutional board. The following 10 points are the important principles related to ethical considerations: 10

  • Participants should not be subjected to harm.
  • Respect for the dignity of participants should be prioritized.
  • Full consent should be obtained from participants before the study.
  • Participants’ privacy should be ensured.
  • Confidentiality of the research data should be ensured.
  • Anonymity of individuals and organizations participating in the research should be maintained.
  • The aims and objectives of the research should not be exaggerated.
  • Affiliations, sources of funding, and any possible conflicts of interest should be declared.
  • Communication in relation to the research should be honest and transparent.
  • Misleading information and biased representation of primary data findings should be avoided.

Q3. What is the difference between methodology and method?

A3. Research methodology is different from a research method, although both terms are often confused. Research methods are the tools used to gather data, while the research methodology provides a framework for how research is planned, conducted, and analyzed. The latter guides researchers in making decisions about the most appropriate methods for their research. Research methods refer to the specific techniques, procedures, and tools used by researchers to collect, analyze, and interpret data, for instance surveys, questionnaires, interviews, etc.

Research methodology is, thus, an integral part of a research study. It helps ensure that you stay on track to meet your research objectives and answer your research questions using the most appropriate data collection and analysis tools based on your research design.


  • Research methodologies. Pfeiffer Library website. Accessed August 15, 2023. https://library.tiffin.edu/researchmethodologies/whatareresearchmethodologies
  • Types of research methodology. Eduvoice website. Accessed August 16, 2023. https://eduvoice.in/types-research-methodology/
  • The basics of research methodology: A key to quality research. Voxco. Accessed August 16, 2023. https://www.voxco.com/blog/what-is-research-methodology/
  • Sampling methods: Types with examples. QuestionPro website. Accessed August 16, 2023. https://www.questionpro.com/blog/types-of-sampling-for-social-research/
  • What is qualitative research? Methods, types, approaches, examples. Researcher.Life blog. Accessed August 15, 2023. https://researcher.life/blog/article/what-is-qualitative-research-methods-types-examples/
  • What is quantitative research? Definition, methods, types, and examples. Researcher.Life blog. Accessed August 15, 2023. https://researcher.life/blog/article/what-is-quantitative-research-types-and-examples/
  • Data analysis in research: Types & methods. QuestionPro website. Accessed August 16, 2023. https://www.questionpro.com/blog/data-analysis-in-research/#Data_analysis_in_qualitative_research
  • Factors to consider while choosing the right research methodology. PhD Monster website. Accessed August 17, 2023. https://www.phdmonster.com/factors-to-consider-while-choosing-the-right-research-methodology/
  • What is research methodology? Research and writing guides. Accessed August 14, 2023. https://paperpile.com/g/what-is-research-methodology/
  • Ethical considerations. Business research methodology website. Accessed August 17, 2023. https://research-methodology.net/research-methodology/ethical-considerations/

Paperpal is a comprehensive AI writing toolkit that helps students and researchers achieve 2x the writing in half the time. It leverages 21+ years of STM experience and insights from millions of research articles to provide in-depth academic writing, language editing, and submission readiness support to help you write better, faster.  

Get accurate academic translations, rewriting support, grammar checks, vocabulary suggestions, and generative AI assistance that delivers human precision at machine speed. Try for free or upgrade to Paperpal Prime starting at US$19 a month to access premium features, including consistency, plagiarism, and 30+ submission readiness checks to help you succeed.  

Experience the future of academic writing – Sign up to Paperpal and start writing for free!  


Types of Research Methods: Examples and Tips


What are research methods?

Research methods are the techniques and procedures used to collect and analyze data in order to answer research questions and test a research hypothesis . There are several different types of research methods, each with its own strengths and weaknesses. 

Common Types of Research Methods

There are several main types of research methods that are employed in academic articles. The type of research method applied depends on the nature of the data to be collected and analyzed, as well as any restrictions or limitations that dictate the study’s resources and methodology. Surveying articles from your target journal and identifying the methods commonly used in these studies is also recommended before choosing a research method or methods.

It’s important to note that research methods can be combined for a more complete understanding of a research question or hypothesis. For example, an experiment can be followed by a survey to gather more information about participants’ attitudes and behaviors.

Overall, the choice of research method depends on the research question, the type of data needed, and the resources available to the researcher.

Data Collection Methods

Data is information collected in order to answer research questions. The kind of data you choose to collect will depend on the nature of your research question and the aims of your study. There are a few main categories of data a researcher can collect.

Quantitative vs qualitative data

Qualitative and quantitative data are two types of data that are often used in research studies. They are different in terms of their characteristics, how they are collected, and how they are analyzed.

Quantitative data is numerical and is collected through methods such as surveys, polls, and experiments. It is often used to measure and describe the characteristics of a large group of people or objects. This data can be analyzed using statistical methods to find patterns and trends.
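As a minimal illustration of the kind of statistical summarising mentioned above, the following Python sketch computes descriptive statistics for a set of invented Likert-scale survey responses (all numbers are hypothetical):

```python
import statistics

# Hypothetical Likert-scale responses (1 = strongly disagree, 5 = strongly agree)
responses = [4, 5, 3, 4, 2, 5, 4, 3, 4, 5]

mean_score = statistics.mean(responses)      # central tendency
median_score = statistics.median(responses)  # middle value
spread = statistics.stdev(responses)         # sample standard deviation

print(f"mean={mean_score:.2f}, median={median_score}, sd={spread:.2f}")
```

In a real study, summaries like these would be the starting point for inferential statistics (e.g. tests of group differences), not the end of the analysis.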

Qualitative data, on the other hand, is non-numerical and is collected through methods such as interviews, observations, and focus groups. It is often used to understand the experiences, attitudes, and perceptions of individuals or small groups. This data is analyzed using methods such as content analysis, thematic analysis, and discourse analysis to identify patterns and themes.

Overall, quantitative data provides a more objective and generalizable understanding of a phenomenon, while qualitative data provides a more subjective and in-depth understanding. Both types of data are important and can be used together to gain a more comprehensive understanding of a topic.

You can also make use of both qualitative and quantitative research methods in your study.

Primary vs secondary data

Primary and secondary research are two different types of research methods that are used in the field of academia and market research. Both primary and secondary sources can be applied in most studies.

Primary research is research that is conducted by the individual or organization themselves. It involves collecting original data through methods such as surveys, interviews, or experiments. The data collected through primary research is specific to the research question and objectives, and is not typically available through other sources.

Secondary research, on the other hand, involves the use of existing data that has already been collected by someone else. This can include data from government reports, academic journals, or industry publications. The advantage of secondary research is that it is typically less time-consuming and less expensive than primary research, as the data has already been collected. However, the data may not be as specific or relevant to the research question and objectives.

The choice between using primary and secondary research will depend on the research question, study budget, and time constraints of the project, as well as the target journal to which you are submitting your manuscript.

Experimental vs descriptive data collection

Experimental data is collected through a controlled experiment, in which the researcher manipulates one or more variables to observe the effect on another variable. The goal of experimental data is to determine cause-and-effect relationships. For example, in a study on the effectiveness of a new drug for treating a certain condition, the researchers would randomly assign participants to either a group that receives the drug or a group that receives a placebo, and then compare the outcomes between the two groups. The data collected in this study would be considered experimental data.

Descriptive data, on the other hand, is data that is collected through observation or surveys and is used to describe the characteristics of a population or phenomenon. The goal of descriptive data is to provide a snapshot of the current state of a certain population or phenomenon, rather than to determine cause-and-effect relationships. For example, in a study on the dietary habits of a certain population, the researchers would collect data on what types of food the participants typically eat and how often they eat them. This data would be considered descriptive data.

In summary, experimental data is collected through a controlled experiment to determine cause-and-effect relationships, while descriptive data is collected through observation or surveys to describe the characteristics of a population or phenomenon.

Descriptive data examples:

  • A survey that asks people about their favorite type of music
  • A census that counts the number of people living in a certain area
  • A poll that asks people about their political affiliation

Experimental data examples:

  • A study comparing the effectiveness of two different medications for treating a certain condition
  • An experiment measuring the effect of different levels of a certain chemical on plant growth
  • A clinical trial comparing the side effects of a new treatment to a standard treatment for a disease
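The randomised-assignment logic behind experimental data can be sketched in a few lines of Python. All participant IDs and outcome numbers below are hypothetical, and the "effect" is simply a difference in group means, not a full statistical test:

```python
import random
import statistics

random.seed(42)  # reproducible assignment for the sketch

# Randomly assign 20 hypothetical participants to treatment or control
participants = [f"P{i:02d}" for i in range(20)]
random.shuffle(participants)
treatment, control = participants[:10], participants[10:]

# Hypothetical outcome scores observed after the intervention
outcomes = {
    pid: random.gauss(60 if pid in treatment else 50, 5) for pid in participants
}

# Compare mean outcomes between the two groups
effect = statistics.mean(outcomes[p] for p in treatment) - statistics.mean(
    outcomes[p] for p in control
)
print(f"treatment n={len(treatment)}, control n={len(control)}, "
      f"mean difference={effect:.1f}")
```

Random assignment is what licenses the cause-and-effect interpretation: it makes the two groups comparable on average before the intervention.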

Examples of Different Data Collection Methods

Prepare your manuscript with professional editing.

To ensure your methods are accurately articulated in your study and that your work is free of language errors, consider receiving professional proofreading services. Wordvice specializes in paper editing and manuscript editing for any kind of academic document.

And to receive a free grammar check for academic writing in real time, try Wordvice.ai and see how it compares to the big names in AI proofreading.


Neurol Res Pract

How to use and assess qualitative research methods

Loraine Busetto

1 Department of Neurology, Heidelberg University Hospital, Im Neuenheimer Feld 400, 69120 Heidelberg, Germany

Wolfgang Wick

2 Clinical Cooperation Unit Neuro-Oncology, German Cancer Research Center, Heidelberg, Germany

Christoph Gumbinger

Associated data: Not applicable.

This paper aims to provide an overview of the use and assessment of qualitative research methods in the health sciences. Qualitative research can be defined as the study of the nature of phenomena and is especially appropriate for answering questions of why something is (not) observed, assessing complex multi-component interventions, and focussing on intervention improvement. The most common methods of data collection are document study, (non-) participant observations, semi-structured interviews and focus groups. For data analysis, field-notes and audio-recordings are transcribed into protocols and transcripts, and coded using qualitative data management software. Criteria such as checklists, reflexivity, sampling strategies, piloting, co-coding, member-checking and stakeholder involvement can be used to enhance and assess the quality of the research conducted. Using qualitative in addition to quantitative designs will equip us with better tools to address a greater range of research problems, and to fill in blind spots in current neurological research and practice.

The aim of this paper is to provide an overview of qualitative research methods, including hands-on information on how they can be used, reported and assessed. This article is intended for beginning qualitative researchers in the health sciences as well as experienced quantitative researchers who wish to broaden their understanding of qualitative research.

What is qualitative research?

Qualitative research is defined as “the study of the nature of phenomena”, including “their quality, different manifestations, the context in which they appear or the perspectives from which they can be perceived” , but excluding “their range, frequency and place in an objectively determined chain of cause and effect” [ 1 ]. This formal definition can be complemented with a more pragmatic rule of thumb: qualitative research generally includes data in form of words rather than numbers [ 2 ].

Why conduct qualitative research?

Because some research questions cannot be answered using (only) quantitative methods. For example, one Australian study addressed the issue of why patients from Aboriginal communities often present late or not at all to specialist services offered by tertiary care hospitals. Using qualitative interviews with patients and staff, it found one of the most significant access barriers to be transportation problems, including some towns and communities simply not having a bus service to the hospital [ 3 ]. A quantitative study could have measured the number of patients over time or even looked at possible explanatory factors – but only those previously known or suspected to be of relevance. To discover reasons for observed patterns, especially the invisible or surprising ones, qualitative designs are needed.

While qualitative research is common in other fields, it is still relatively underrepresented in health services research. The latter field is more traditionally rooted in the evidence-based-medicine paradigm, as seen in “research that involves testing the effectiveness of various strategies to achieve changes in clinical practice, preferably applying randomised controlled trial study designs (...)” [ 4 ]. This focus on quantitative research and specifically randomised controlled trials (RCTs) is visible in the idea of a hierarchy of research evidence, which assumes that some research designs are objectively better than others, and that choosing a “lesser” design is only acceptable when the better ones are not practically or ethically feasible [ 5 , 6 ]. Others, however, argue that an objective hierarchy does not exist and that, instead, the research design and methods should be chosen to fit the specific research question at hand – “questions before methods” [ 2 , 7 – 9 ]. This means that even when an RCT is possible, some research problems require a different design that is better suited to addressing them. Arguing in JAMA, Berwick uses the example of rapid response teams in hospitals, which he describes as “a complex, multicomponent intervention – essentially a process of social change” susceptible to a range of different context factors including leadership or organisation history. According to him, “[in] such complex terrain, the RCT is an impoverished way to learn. Critics who use it as a truth standard in this context are incorrect” [ 8 ]. Instead of limiting oneself to RCTs, Berwick recommends embracing a wider range of methods, including qualitative ones, which for “these specific applications, (...) are not compromises in learning how to improve; they are superior” [ 8 ].

Research problems that can be approached particularly well using qualitative methods include assessing complex multi-component interventions or systems (of change), addressing questions beyond “what works”, towards “what works for whom when, how and why”, and focussing on intervention improvement rather than accreditation [ 7 , 9 – 12 ]. Using qualitative methods can also help shed light on the “softer” side of medical treatment. For example, while quantitative trials can measure the costs and benefits of neuro-oncological treatment in terms of survival rates or adverse effects, qualitative research can help provide a better understanding of patient or caregiver stress, visibility of illness or out-of-pocket expenses.

How to conduct qualitative research?

Given that qualitative research is characterised by flexibility, openness and responsivity to context, the steps of data collection and analysis are not as separate and consecutive as they tend to be in quantitative research [ 13 , 14 ]. As Fossey puts it: “sampling, data collection, analysis and interpretation are related to each other in a cyclical (iterative) manner, rather than following one after another in a stepwise approach” [ 15 ]. The researcher can make educated decisions with regard to the choice of method, how they are implemented, and to which and how many units they are applied [ 13 ]. As shown in Fig. 1, this can involve several back-and-forth steps between data collection and analysis where new insights and experiences can lead to adaptation and expansion of the original plan. Some insights may also necessitate a revision of the research question and/or the research design as a whole. The process ends when saturation is achieved, i.e. when no relevant new information can be found (see also below: sampling and saturation). For reasons of transparency, it is essential for all decisions as well as the underlying reasoning to be well-documented.

Fig. 1 Iterative research process

While it is not always explicitly addressed, qualitative methods reflect a different underlying research paradigm than quantitative research (e.g. constructivism or interpretivism as opposed to positivism). The choice of methods can be based on the respective underlying substantive theory or theoretical framework used by the researcher [ 2 ].

Data collection

The methods of qualitative data collection most commonly used in health research are document study, observations, semi-structured interviews and focus groups [ 1 , 14 , 16 , 17 ].

Document study

Document study (also called document analysis) refers to the review by the researcher of written materials [ 14 ]. These can include personal and non-personal documents such as archives, annual reports, guidelines, policy documents, diaries or letters.

Observations

Observations are particularly useful to gain insights into a certain setting and actual behaviour – as opposed to reported behaviour or opinions [ 13 ]. Qualitative observations can be either participant or non-participant in nature. In participant observations, the observer is part of the observed setting, for example a nurse working in an intensive care unit [ 18 ]. In non-participant observations, the observer is “on the outside looking in”, i.e. present in but not part of the situation, trying not to influence the setting by their presence. Observations can be planned (e.g. for 3 h during the day or night shift) or ad hoc (e.g. as soon as a stroke patient arrives at the emergency room). During the observation, the observer takes notes on everything or certain pre-determined parts of what is happening around them, for example focusing on physician-patient interactions or communication between different professional groups. Written notes can be taken during or after the observations, depending on feasibility (which is usually lower during participant observations) and acceptability (e.g. when the observer is perceived to be judging the observed). Afterwards, these field notes are transcribed into observation protocols. If more than one observer was involved, field notes are taken independently, but notes can be consolidated into one protocol after discussions. Advantages of conducting observations include minimising the distance between the researcher and the researched, the potential discovery of topics that the researcher did not realise were relevant and gaining deeper insights into the real-world dimensions of the research problem at hand [ 18 ].

Semi-structured interviews

Hijmans & Kuyper describe qualitative interviews as “an exchange with an informal character, a conversation with a goal” [ 19 ]. Interviews are used to gain insights into a person’s subjective experiences, opinions and motivations – as opposed to facts or behaviours [ 13 ]. Interviews can be distinguished by the degree to which they are structured (i.e. a questionnaire), open (e.g. free conversation or autobiographical interviews) or semi-structured [ 2 , 13 ]. Semi-structured interviews are characterized by open-ended questions and the use of an interview guide (or topic guide/list) in which the broad areas of interest, sometimes including sub-questions, are defined [ 19 ]. The pre-defined topics in the interview guide can be derived from the literature, previous research or a preliminary method of data collection, e.g. document study or observations. The topic list is usually adapted and improved at the start of the data collection process as the interviewer learns more about the field [ 20 ]. Across interviews, the focus on the different (blocks of) questions may differ and some questions may be skipped altogether (e.g. if the interviewee is not able or willing to answer the questions or for concerns about the total length of the interview) [ 20 ]. Qualitative interviews are usually not conducted in written format as this impedes the interactive component of the method [ 20 ]. In comparison to written surveys, qualitative interviews have the advantage of being interactive and allowing for unexpected topics to emerge and to be taken up by the researcher. This can also help overcome a provider- or researcher-centred bias often found in written surveys, which, by nature, can only measure what is already known or expected to be of relevance to the researcher. Interviews can be audio- or video-taped; but sometimes it is only feasible or acceptable for the interviewer to take written notes [ 14 , 16 , 20 ].

Focus groups

Focus groups are group interviews to explore participants’ expertise and experiences, including explorations of how and why people behave in certain ways [ 1 ]. Focus groups usually consist of 6–8 people and are led by an experienced moderator following a topic guide or “script” [ 21 ]. They can involve an observer who takes note of the non-verbal aspects of the situation, possibly using an observation guide [ 21 ]. Depending on researchers’ and participants’ preferences, the discussions can be audio- or video-taped and transcribed afterwards [ 21 ]. Focus groups are useful for bringing together homogeneous (to a lesser extent heterogeneous) groups of participants with relevant expertise and experience on a given topic on which they can share detailed information [ 21 ]. Focus groups are a relatively easy, fast and inexpensive method to gain access to information on interactions in a given group, i.e. “the sharing and comparing” among participants [ 21 ]. Disadvantages include less control over the process and a lesser extent to which each individual may participate. Moreover, focus group moderators need experience, as do those tasked with the analysis of the resulting data. Focus groups can be less appropriate for discussing sensitive topics that participants might be reluctant to disclose in a group setting [ 13 ]. Moreover, attention must be paid to the emergence of “groupthink” as well as possible power dynamics within the group, e.g. when patients are awed or intimidated by health professionals.

Choosing the “right” method

As explained above, the school of thought underlying qualitative research assumes no objective hierarchy of evidence and methods. This means that each choice of single or combined methods has to be based on the research question that needs to be answered and a critical assessment with regard to whether or to what extent the chosen method can accomplish this – i.e. the “fit” between question and method [ 14 ]. It is necessary for these decisions to be documented when they are being made, and to be critically discussed when reporting methods and results.

Let us assume that our research aim is to examine the (clinical) processes around acute endovascular treatment (EVT), from the patient’s arrival at the emergency room to recanalization, with the aim to identify possible causes for delay and/or other causes for sub-optimal treatment outcome. As a first step, we could conduct a document study of the relevant standard operating procedures (SOPs) for this phase of care – are they up-to-date and in line with current guidelines? Do they contain any mistakes, irregularities or uncertainties that could cause delays or other problems? Regardless of the answers to these questions, the results have to be interpreted based on what they are: a written outline of what care processes in this hospital should look like. If we want to know what they actually look like in practice, we can conduct observations of the processes described in the SOPs. These results can (and should) be analysed in themselves, but also in comparison to the results of the document analysis, especially as regards relevant discrepancies. Do the SOPs outline specific tests for which no equipment can be observed or tasks to be performed by specialized nurses who are not present during the observation? It might also be possible that the written SOP is outdated, but the actual care provided is in line with current best practice. In order to find out why these discrepancies exist, it can be useful to conduct interviews. Are the physicians simply not aware of the SOPs (because their existence is limited to the hospital’s intranet) or do they actively disagree with them or does the infrastructure make it impossible to provide the care as described? Another rationale for adding interviews is that some situations (or all of their possible variations for different patient groups or the day, night or weekend shift) cannot practically or ethically be observed. 
In this case, it is possible to ask those involved to report on their actions – being aware that this is not the same as the actual observation. A senior physician’s or hospital manager’s description of certain situations might differ from a nurse’s or junior physician’s one, maybe because they intentionally misrepresent facts or maybe because different aspects of the process are visible or important to them. In some cases, it can also be relevant to consider to whom the interviewee is disclosing this information – someone they trust, someone they are otherwise not connected to, or someone they suspect or are aware of being in a potentially “dangerous” power relationship to them. Lastly, a focus group could be conducted with representatives of the relevant professional groups to explore how and why exactly they provide care around EVT. The discussion might reveal discrepancies (between SOPs and actual care or between different physicians) and motivations to the researchers as well as to the focus group members that they might not have been aware of themselves. For the focus group to deliver relevant information, attention has to be paid to its composition and conduct, for example, to make sure that all participants feel safe to disclose sensitive or potentially problematic information or that the discussion is not dominated by (senior) physicians only. The resulting combination of data collection methods is shown in Fig.  2 .

Fig. 2 Possible combination of data collection methods

Attributions for icons: “Book” by Serhii Smirnov, “Interview” by Adrien Coquet, FR, “Magnifying Glass” by anggun, ID, “Business communication” by Vectors Market; all from the Noun Project

The combination of multiple data sources as described for this example can be referred to as “triangulation”, in which multiple measurements are carried out from different angles to achieve a more comprehensive understanding of the phenomenon under study [ 22 , 23 ].

Data analysis

To analyse the data collected through observations, interviews and focus groups, these need to be transcribed into protocols and transcripts (see Fig. 3). Interviews and focus groups can be transcribed verbatim, with or without annotations for behaviour (e.g. laughing, crying, pausing) and with or without phonetic transcription of dialects and filler words, depending on what is expected or known to be relevant for the analysis. In the next step, the protocols and transcripts are coded, that is, marked (or tagged, labelled) with one or more short descriptors of the content of a sentence or paragraph [ 2 , 15 , 23 ]. Jansen describes coding as “connecting the raw data with ‘theoretical’ terms” [ 20 ]. In a more practical sense, coding makes raw data sortable. This makes it possible to extract and examine all segments describing, say, a tele-neurology consultation from multiple data sources (e.g. SOPs, emergency room observations, staff and patient interviews). In a process of synthesis and abstraction, the codes are then grouped, summarised and/or categorised [ 15 , 20 ]. The end product of the coding or analysis process is a descriptive theory of the behavioural pattern under investigation [ 20 ]. The coding process is performed using qualitative data management software, the most common packages being NVivo, MAXQDA and ATLAS.ti. It should be noted that these are data management tools which support the analysis performed by the researcher(s) [ 14 ].
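The mechanics of coding described above – tagging text segments with short descriptors so that the raw data becomes sortable – can be illustrated with a toy example. The sources, segments and codes below are invented, and real projects would use dedicated qualitative data management software rather than a hand-rolled dictionary:

```python
from collections import defaultdict

# (source, segment, assigned code) triples produced during coding (hypothetical)
coded_segments = [
    ("interview_01", "The video link to the neurologist failed twice.", "tele-neurology"),
    ("observation_03", "Nurse sets up the tele-consultation cart.", "tele-neurology"),
    ("interview_02", "We lose time waiting for the CT slot.", "delay"),
    ("sop_document", "Consultation via video link within 15 minutes.", "tele-neurology"),
]

# Coding makes raw data sortable: group every segment by its code
by_code = defaultdict(list)
for source, segment, code in coded_segments:
    by_code[code].append((source, segment))

# Extract all segments tagged with one code across data sources
for source, segment in by_code["tele-neurology"]:
    print(f"{source}: {segment}")
```

The grouping step mirrors how tagged segments from SOPs, observations and interviews can be pulled together for a single topic during synthesis and abstraction.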

Fig. 3 From data collection to data analysis

Attributions for icons: see Fig. 2; also “Speech to text” by Trevor Dsouza, “Field Notes” by Mike O’Brien, US, “Voice Record” by ProSymbols, US, “Inspection” by Made, AU, and “Cloud” by Graphic Tigers; all from the Noun Project

How to report qualitative research?

Protocols of qualitative research can be published separately and in advance of the study results. However, the aim is not the same as in RCT protocols, i.e. to pre-define and set in stone the research questions and primary or secondary endpoints. Rather, it is a way to describe the research methods in detail, which might not be possible in the results paper given journals’ word limits. Qualitative research papers are usually longer than their quantitative counterparts to allow for deep understanding and so-called “thick description”. In the methods section, the focus is on transparency of the methods used, including why, how and by whom they were implemented in the specific study setting, so as to enable a discussion of whether and how this may have influenced data collection, analysis and interpretation. The results section usually starts with a paragraph outlining the main findings, followed by more detailed descriptions of, for example, the commonalities, discrepancies or exceptions per category [ 20 ]. Here it is important to support main findings by relevant quotations, which may add information, context, emphasis or real-life examples [ 20 , 23 ]. It is subject to debate in the field whether it is relevant to state the exact number or percentage of respondents supporting a certain statement (e.g. “Five interviewees expressed negative feelings towards XYZ”) [ 21 ].

How to combine qualitative with quantitative research?

Qualitative methods can be combined with other methods in multi- or mixed methods designs, which “[employ] two or more different methods […] within the same study or research program rather than confining the research to one single method” [ 24 ]. Reasons for combining methods can be diverse, including triangulation for corroboration of findings, complementarity for illustration and clarification of results, expansion to extend the breadth and range of the study, explanation of (unexpected) results generated with one method with the help of another, or offsetting the weakness of one method with the strength of another [ 1 , 17 , 24 – 26 ]. The resulting designs can be classified according to when, why and how the different quantitative and/or qualitative data strands are combined. The three most common types of mixed method designs are the convergent parallel design, the explanatory sequential design and the exploratory sequential design. The designs with examples are shown in Fig. 4.

Fig. 4 Three common mixed methods designs

In the convergent parallel design, a qualitative study is conducted in parallel to and independently of a quantitative study, and the results of both studies are compared and combined at the stage of interpretation of results. Using the above example of EVT provision, this could entail setting up a quantitative EVT registry to measure process times and patient outcomes in parallel to conducting the qualitative research outlined above, and then comparing results. Amongst other things, this would make it possible to assess whether interview respondents’ subjective impressions of patients receiving good care match modified Rankin Scores at follow-up, or whether observed delays in care provision are exceptions or the rule when compared to door-to-needle times as documented in the registry. In the explanatory sequential design, a quantitative study is carried out first, followed by a qualitative study to help explain the results from the quantitative study. This would be an appropriate design if the registry alone had revealed relevant delays in door-to-needle times and the qualitative study would be used to understand where and why these occurred, and how they could be improved. In the exploratory sequential design, the qualitative study is carried out first and its results help inform and build the quantitative study in the next step [ 26 ]. If the qualitative study around EVT provision had shown a high level of dissatisfaction among the staff members involved, a quantitative questionnaire investigating staff satisfaction could be set up in the next step, informed by the qualitative study on which topics dissatisfaction had been expressed. Amongst other things, the questionnaire design would make it possible to widen the reach of the research to more respondents from different (types of) hospitals, regions, countries or settings, and to conduct sub-group analyses for different professional groups.

How to assess qualitative research?

A variety of assessment criteria and lists have been developed for qualitative research, ranging in their focus and comprehensiveness [ 14 , 17 , 27 ]. However, none of these has been elevated to the “gold standard” in the field. In the following, we therefore focus on a set of commonly used assessment criteria that, from a practical standpoint, a researcher can look for when assessing a qualitative research report or paper.

Assessors should check the authors’ use of and adherence to the relevant reporting checklists (e.g. Standards for Reporting Qualitative Research (SRQR)) to make sure all items that are relevant for this type of research are addressed [ 23 , 28 ]. Discussions of quantitative measures in addition to or instead of these qualitative measures can be a sign of lower quality of the research (paper). Providing and adhering to a checklist for qualitative research contributes to an important quality criterion for qualitative research, namely transparency [ 15 , 17 , 23 ].

Reflexivity

While methodological transparency and complete reporting is relevant for all types of research, some additional criteria must be taken into account for qualitative research. This includes what is called reflexivity, i.e. sensitivity to the relationship between the researcher and the researched, including how contact was established and maintained, or the background and experience of the researcher(s) involved in data collection and analysis. Depending on the research question and population to be researched this can be limited to professional experience, but it may also include gender, age or ethnicity [ 17 , 27 ]. These details are relevant because in qualitative research, as opposed to quantitative research, the researcher as a person cannot be isolated from the research process [ 23 ]. It may influence the conversation when an interviewed patient speaks to an interviewer who is a physician, or when an interviewee is asked to discuss a gynaecological procedure with a male interviewer, and therefore the reader must be made aware of these details [ 19 ].

Sampling and saturation

The aim of qualitative sampling is for all variants of the objects of observation that are deemed relevant for the study to be present in the sample, “to see the issue and its meanings from as many angles as possible” [ 1 , 16 , 19 , 20 , 27 ], and to ensure “information-richness” [ 15 ]. An iterative sampling approach is advised, in which data collection (e.g. five interviews) is followed by data analysis, followed by more data collection to find variants that are lacking in the current sample. This process continues until no new (relevant) information can be found and further sampling becomes redundant – which is called saturation [ 1 , 15 ]. In other words: qualitative data collection finds its end point not a priori, but when the research team determines that saturation has been reached [ 29 , 30 ].
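
As a purely illustrative sketch (not part of the paper), the stopping rule behind saturation can be expressed as a loop that alternates between collecting a batch of interviews and checking whether the batch contributed any new codes. The function name and the example data below are hypothetical:

```python
# Illustrative sketch of sampling until saturation: process interviews in
# batches and stop once a batch yields no codes that were not seen before.
# `interview_codes` is hypothetical data: the codes found in each interview.

def collect_until_saturation(interview_codes, batch_size=5):
    """Return the accumulated codes and the number of interviews used."""
    seen = set()
    interviews_used = 0
    for start in range(0, len(interview_codes), batch_size):
        batch = interview_codes[start:start + batch_size]
        interviews_used += len(batch)
        new_codes = set().union(*batch) - seen
        if not new_codes:        # saturation: this batch added nothing new
            break
        seen |= new_codes
    return seen, interviews_used

# Hypothetical example: later interviews only repeat earlier themes.
codes = [{"cost"}, {"cost", "trust"}, {"access"}, {"trust"}, {"cost"},
         {"access"}, {"trust", "cost"}, {"cost"}, {"access"}, {"trust"}]
themes, n = collect_until_saturation(codes, batch_size=5)
```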

This is also the reason why most qualitative studies use deliberate instead of random sampling strategies. This is generally referred to as “purposive sampling”, in which researchers pre-define which types of participants or cases they need to include so as to cover all variations that are expected to be of relevance, based on the literature, previous experience or theory (i.e. theoretical sampling) [ 14 , 20 ]. Other types of purposive sampling include (but are not limited to) maximum variation sampling, critical case sampling or extreme or deviant case sampling [ 2 ]. In the above EVT example, a purposive sample could include all relevant professional groups and/or all relevant stakeholders (patients, relatives) and/or all relevant times of observation (day, night and weekend shift).
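
To make the idea concrete, a minimal sketch of purposive sampling might look as follows; the participant records, strata and helper function are hypothetical and only illustrate the principle of deliberately covering pre-defined variants rather than sampling at random:

```python
# Illustrative sketch of purposive (maximum-variation) sampling: pick at
# least one participant per pre-defined stratum instead of drawing at random.

def purposive_sample(pool, strata, key):
    """Return one participant for each stratum value listed in `strata`."""
    sample = []
    for stratum in strata:
        match = next((p for p in pool if key(p) == stratum), None)
        if match is not None:
            sample.append(match)
    return sample

# Hypothetical participant pool; strata cover all relevant professional groups.
pool = [
    {"id": 1, "role": "physician"},
    {"id": 2, "role": "nurse"},
    {"id": 3, "role": "physician"},
    {"id": 4, "role": "patient"},
]
sample = purposive_sample(pool, ["physician", "nurse", "patient"],
                          key=lambda p: p["role"])
```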

Assessors of qualitative research should check whether the considerations underlying the sampling strategy were sound and whether or how researchers tried to adapt and improve their strategies in stepwise or cyclical approaches between data collection and analysis to achieve saturation [ 14 ].

Good qualitative research is iterative in nature, i.e. it goes back and forth between data collection and analysis, revising and improving the approach where necessary. One example of this is pilot interviews, where different aspects of the interview (especially the interview guide, but also, for example, the site of the interview or whether the interview can be audio-recorded) are tested with a small number of respondents, evaluated and revised [ 19 ]. In doing so, the interviewer learns which wording or types of questions work best, or what the best length is for an interview with patients who have trouble concentrating for an extended time. Of course, the same reasoning applies to observations or focus groups, which can also be piloted.

Ideally, coding should be performed by at least two researchers, especially at the beginning of the coding process when a common approach must be defined, including the establishment of a useful coding list (or tree), and when a common meaning of individual codes must be established [ 23 ]. An initial sub-set or all transcripts can be coded independently by the coders and then compared and consolidated after regular discussions in the research team. This is to make sure that codes are applied consistently to the research data.

Member checking

Member checking, also called respondent validation , refers to the practice of checking back with study respondents to see if the research is in line with their views [ 14 , 27 ]. This can happen after data collection or analysis or when first results are available [ 23 ]. For example, interviewees can be provided with (summaries of) their transcripts and asked whether they believe this to be a complete representation of their views or whether they would like to clarify or elaborate on their responses [ 17 ]. Respondents’ feedback on these issues then becomes part of the data collection and analysis [ 27 ].

Stakeholder involvement

In those niches where qualitative approaches have been able to evolve and grow, a new trend has seen the inclusion of patients and their representatives not only as study participants (i.e. “members”, see above) but as consultants to and active participants in the broader research process [ 31 – 33 ]. The underlying assumption is that patients and other stakeholders hold unique perspectives and experiences that add value beyond their own single story, making the research more relevant and beneficial to researchers, study participants and (future) patients alike [ 34 , 35 ]. Using the example of patients on or nearing dialysis, a recent scoping review found that 80% of clinical research did not address the top 10 research priorities identified by patients and caregivers [ 32 , 36 ]. In this sense, the involvement of the relevant stakeholders, especially patients and relatives, is increasingly being seen as a quality indicator in and of itself.

How not to assess qualitative research

The above overview does not include certain items that are routine in assessments of quantitative research. What follows is a non-exhaustive, non-representative, experience-based list of the quantitative criteria often applied to the assessment of qualitative research, as well as an explanation of the limited usefulness of these endeavours.

Protocol adherence

Given the openness and flexibility of qualitative research, it should not be assessed by how well it adheres to pre-determined and fixed strategies – in other words: its rigidity. Instead, the assessor should look for signs of adaptation and refinement based on lessons learned from earlier steps in the research process.

Sample size

For the reasons explained above, qualitative research does not require specific sample sizes, nor does it require that the sample size be determined a priori [ 1 , 14 , 27 , 37 – 39 ]. Sample size can only be a useful quality indicator when related to the research purpose, the chosen methodology and the composition of the sample, i.e. who was included and why.

Randomisation

While some authors argue that randomisation can be used in qualitative research, this is not commonly the case, as neither its feasibility nor its necessity or usefulness has been convincingly established for qualitative research [ 13 , 27 ]. Relevant disadvantages include the negative impact of an overly large sample size as well as the possibility (or probability) of selecting “quiet, uncooperative or inarticulate individuals” [ 17 ]. Qualitative studies do not use control groups, either.

Interrater reliability, variability and other “objectivity checks”

The concept of “interrater reliability” is sometimes used in qualitative research to assess the extent to which the coding approach overlaps between two co-coders. However, it is not clear what this measure tells us about the quality of the analysis [ 23 ]. This means that these scores can be included in qualitative research reports, preferably with some additional information on what the score means for the analysis, but it is not a requirement. Relatedly, it is not relevant for the quality or “objectivity” of qualitative research to separate those who recruited the study participants from those who collected and analysed the data. Experience even shows that it might be better to have the same person or team perform all of these tasks [ 20 ]. First, when researchers introduce themselves during recruitment, this can enhance trust when the interview takes place days or weeks later with the same researcher. Second, when the audio-recording is transcribed for analysis, the researcher conducting the interviews will usually remember the interviewee and the specific interview situation during data analysis. This might be helpful in providing additional context information for interpretation of the data, e.g. on whether something might have been meant as a joke [ 18 ].
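
For readers who do encounter such scores, a minimal sketch of how Cohen's kappa is typically computed for two coders may help interpret them; the code labels below are hypothetical:

```python
# Illustrative sketch of Cohen's kappa for two coders who applied the same
# code list to the same transcript segments. Labels are hypothetical.
from collections import Counter

def cohens_kappa(coder_a, coder_b):
    """Observed agreement between two raters, corrected for chance agreement."""
    assert len(coder_a) == len(coder_b)
    n = len(coder_a)
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    labels = set(coder_a) | set(coder_b)
    # Chance agreement: probability both raters pick the same label at random.
    expected = sum(freq_a[l] * freq_b[l] for l in labels) / n ** 2
    return (observed - expected) / (1 - expected)

a = ["barrier", "barrier", "facilitator", "neutral", "barrier", "neutral"]
b = ["barrier", "facilitator", "facilitator", "neutral", "barrier", "neutral"]
kappa = cohens_kappa(a, b)   # 5/6 observed agreement, 1/3 expected -> 0.75
```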

Not being quantitative research

Being qualitative research instead of quantitative research should not be used as an assessment criterion if it is used irrespectively of the research problem at hand. Similarly, qualitative research should not be required to be combined with quantitative research per se – unless mixed methods research is judged as inherently better than single-method research. In this case, the same criterion should be applied for quantitative studies without a qualitative component.

The main take-away points of this paper are summarised in Table 1. We aimed to show that, if conducted well, qualitative research can answer specific research questions that cannot be adequately answered using (only) quantitative designs. Seeing qualitative and quantitative methods as equal will help us become more aware and critical of the “fit” between the research problem and our chosen methods: I can conduct an RCT to determine the reasons for transportation delays of acute stroke patients – but should I? It also provides us with a greater range of tools to tackle a greater range of research problems more appropriately and successfully, filling in the blind spots on one half of the methodological spectrum to better address the whole complexity of neurological research and practice.

Take-away points

Acknowledgements

Authors’ contributions

LB drafted the manuscript; WW and CG revised the manuscript; all authors approved the final versions.

Funding

No external funding.

Availability of data and materials

Competing interests

The authors declare no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


Diversity of Research Methodology : A Guide to Different Types of Research Methodology

  • December 9, 2021

Research is a systematic process used by scientists in many fields to investigate topics of interest. Researchers run studies either to accept or reject a formulated hypothesis, or simply to explore a topic in depth. The conclusions drawn are mostly for the betterment of society and to expand knowledge about subjects that have received little attention before.

Research as a scientific tool helps researchers measure sample data with minimal bias and greater accuracy. This lets them put forward conclusions with confidence, knowing that the data gathered is legitimate and the results drawn from the study are systematic and statistically sound.

Researchers adopt the methods of research that best suit their study. In this article, we will look at various types of research methodology, classified by their common features and uses.

What is the importance of research methodology?

Research methodology offers a systematic and structured framework that helps researchers gather, analyze, and interpret data to evaluate hypotheses. The various types of research methodology ensure that the results are reliable, meaningful, and valid. 

  • A well-defined methodology offers credibility to your research findings. 
  • It helps establish a clear path for other researchers to replicate the research and validate your result. 
  • Different methodologies are suited for different data types and research goals. Choosing the right one ensures you gather precise and accurate data. 
  • Selecting the proper research methodology helps optimize resource allocation. It ensures that you effectively utilize the time, budget, and resources to achieve the research objective.

17 Types of research methodology

Research types are classified based on their objective, depth of study, data analysis, time, and cost efficiency. Researchers are likely to use various types in combination for their study. 

Types of research methodology based on PURPOSE

01. Theoretical research

Also known as pure research or basic research, theoretical research is used when the researcher wants to gather more information about a particular topic without considering its practical application. Such research is mostly used for documentaries and mathematical formulas, which give a better understanding of the subject.

Example : Social research was conducted to understand the economic mentality of middle-class citizens.

02. Applied research

Applied research works to find a solution to a scientific problem. It mainly addresses STEM fields, such as engineering and medicine, which are closely connected to human lives through their practical applications. There are two types of applied research:

  • Technological applied research aims to enhance the efficiency of the products through betterments in the technological aspects. 
  • Scientific applied research helps to generate a predictive analysis based on the available data, which will be even more helpful in the goods and services sector. 

Example : A business can analyze the customers’ purchase strategies and plan the marketing accordingly.

Types of research methodology based on the DEPTH OF SCOPE

03. Exploratory research 

As the name suggests, exploratory research is used as a preliminary study for the topics about which there is no deeper knowledge explored yet. It acts as a reference to the further in-depth studies that will emerge from this hypothesis. As it is not a deep study, it focuses more on data collection that will explain the causes of the phenomena. 

Example : A study conducted to understand the relationship between millennials and social media usage. 

04. Descriptive research

Descriptive research, too, does not go very deep into a phenomenon: it focuses on identifying the characteristics the phenomenon shows rather than the factors that cause it. Researchers must make sure they do not disturb the observed phenomenon and cause a change in it.

Example : investigating the standard of living in rural and urban areas.

05. Explanatory research 

It is most commonly used to establish a cause-effect relationship between variables. The results from explanatory research can then be generalized beyond the sample studied.

Example : understanding a toddler’s behavior while watching cartoons.

Types of research methodology based on the TYPE OF DATA USED 

06. Qualitative research 

Qualitative research is used to collect, compare, and analyze large descriptive data from the sample. This data is often collected through surveys, interviews, and focus groups where people are allowed to express their opinions and thoughts openly to open-ended questions. The data from qualitative research is usually large and needs data labeling or coding while analyzing it. 

Example : Studying the effects of exercise on health

07. Quantitative research 

Quantitative research collects data through quantitative, close-ended questions, and the data is analyzed using statistical, mathematical, and computerized tools. The data is mostly collected through surveys, and it is in the form of numeric values.

Example : Conducting a survey on the likes and dislikes of the customers regarding clothing. 

Types of research methodology based on MANIPULATION OF VARIABLES

08. Experimental research 

Experimental research starts with replicating a phenomenon in a controlled, scientific environment. Variables are tested for cause-effect relationships on a randomly chosen sample, which is split into two groups – a control group and a treatment group. The result is an established cause-effect relationship between the variables.

Example : Experiment to test a new drug in the market on patients. 
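
The random assignment step described above can be sketched in a few lines; the participant labels, the fixed seed, and the even split are illustrative assumptions, not part of any particular study design:

```python
# Illustrative sketch of random assignment into a control group and a
# treatment group, as used in experimental research.
import random

def randomly_assign(participants, seed=42):
    """Shuffle participants and split them evenly into two groups."""
    rng = random.Random(seed)          # fixed seed only for reproducibility
    shuffled = participants[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]   # (treatment, control)

# Hypothetical sample of eight patients.
treatment, control = randomly_assign([f"patient_{i}" for i in range(8)])
```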

09. Non-experimental research 

Non-experimental research focuses on observing the population in its natural environment. Also known as an observational study, the researcher does not directly intervene in the experiment. Because it relies on observation, it is often used alongside descriptive research.

Example : A study on the effects of a certain education program on a batch of students. 

10. Quasi-experimental research 

It is very similar to experimental research, the only difference being that the sample for the experiment is not selected at random. Since it is a narrowed-down study focusing on a certain kind of population, participants are first screened and then assigned to the treatment or control group.

Example : A study was conducted to know the effects eating more cheese has on bad breath. 

Types of research methodology based on TYPE OF INFERENCE

11. Deductive investigation

It focuses on explaining reality by applying general laws to reach specific conclusions. These conclusions address the research problem and hold true provided the deductive reasoning is applied correctly.

12. Inductive investigation

It is an observational approach that moves from specific observations to generalized results: it collects data from which new theories can be generated.

13. Hypothetical-deductive investigation

It first formulates the hypothesis based on basic observation and then uses deductive investigation to conclude the study, which will, in return, reject or accept the hypothesis. 

Types of research methodology based on TIME OF STUDY

14. Longitudinal study 

Also known as Diachronic research, it observes one event, individual, or group at different times. It aims to track the changes in the subject over time. It is generally used in medical, para-medical, and social fields. 

Example : To study the health of a patient under treatment over several weeks. 

15. Cross-sectional study

Also known as synchronous research, it focuses on observing events, individuals, or groups of subjects at one specific point in time.

Types of research methodology based on INFORMATION SOURCES

16. Primary research

This is research that is done from scratch. Researchers themselves gather data that is specific to their study and is more reliable since it is first-hand information. 

17. Secondary research 

This research builds on someone else’s work. Researchers use available material such as research papers, interviews, and documentaries as sources of data and information. The problem is that there is no guarantee the collected information is reliable, and there is a chance of gathering data that is irrelevant to the research topic at hand.

Advantages of Using Different Types of Research Methodology

Embarking on a research journey involves navigating diverse methodologies, each serving as a unique lens for scrutiny. These approaches not only help achieve research goals but also make the whole process more thorough. Let’s explore the advantages of the various research methodologies:

  • Verification and Fact-Checking: Methodologies act as guardians, providing a structured framework for reliable outcomes.
  • Diverse Perspectives: Varied methodologies present unique angles, unraveling nuanced aspects for a comprehensive analysis.
  • Quantitative Precision with Surveys: Surveys offer numerical efficiency, bypassing intricacies for streamlined, informed results.
  • Qualitative Depth in Case Studies: Qualitative methods, like case studies, dive deep into intricacies, offering insights overlooked by quantitative methods.
  • Validation and Generalization: Employing multiple methods ensures cross-verified, consistent results, allowing findings to be applicable to a broader audience.

In the intricate dance of research, these diverse methodologies play a pivotal role, elevating the quality and reliability of discoveries. Researchers, navigating this ever-expanding landscape, unlock new dimensions of understanding, making their findings impactful.

What factors should you consider when selecting the type of research methodology?

To run a research study, you don’t just need to know how to analyze data; you also need to identify the method that resolves the research problem efficiently and with a high degree of accuracy.

1. Define your research objectives: 

You should have a clear and precise idea of the research problem you intend to resolve. A good research question should be valuable and applicable.

2. Know the research type: 

Learn about the available types of research methodology to identify which categories your research falls under. For example, the quantitative method involves gathering statistical data, while the qualitative method focuses on gathering customers’ voices.

3. Create inclusive surveys: 

To ensure your research findings represent the target population, you must ensure that you create inclusive surveys.

4. Survey questions should align with the research problem: 

The questions you ask should address the intended research objective. The responses to each survey question should help test the hypothesis.

When selecting the right research methodology, it’s important to consider the time available for conducting the research.

In the intriguing realm of research, the different types guide you through the complex landscape of data collection and analysis. The types of research methodology serve as the backbone of acquiring valid data to address problems and test hypotheses. Use the types outlined above, along with the tips on choosing between them, to equip your project with the right methodology and make impactful discoveries.

  • What is research methodology?

Research methodology is a systematic approach used by researchers to plan, conduct, and analyze studies, ensuring reliable and valid results.

  • What is a mixed research method?

In a mixed methods study, researchers collect and analyze both quantitative and qualitative data within the same study.

  • How do I choose the right research methodology?

Consider your research objectives, data type, and available time, focusing on factors like depth of study, manipulation of variables, and type of inference.

  • Can I use multiple research methodologies for a study?

Yes, using a combination of methods enhances verification, fact-checking, and validation, providing more comprehensive and applicable results.

  • What is the difference between research methods and research methodology?

Research methods are the specific tools and procedures used to solve a research problem, while research methodology is the broader framework that evaluates the appropriateness of those methods. The types of methodology guide the selection of a research approach, and the methodology as a whole assesses the suitability of all approaches and procedures employed in the research.

Original research article

Learning scientific observation with worked examples in a digital learning environment

  • 1 Department Educational Sciences, Chair for Formal and Informal Learning, Technical University Munich School of Social Sciences and Technology, Munich, Germany
  • 2 Aquatic Systems Biology Unit, TUM School of Life Sciences, Technical University of Munich, Freising, Germany

Science education often aims to increase learners’ acquisition of fundamental principles, such as learning the basic steps of scientific methods. Worked examples (WE) have proven particularly useful for supporting the development of such cognitive schemas and successive actions in order to avoid using up more cognitive resources than are necessary. Therefore, we investigated the extent to which heuristic WE are beneficial for supporting the acquisition of a basic scientific methodological skill—conducting scientific observation. The current study has a one-factorial, quasi-experimental, comparative research design and was conducted as a field experiment. Sixty-two students at a German university learned about the steps of scientific observation during a course on applying a fluvial audit, in which several sections of a river were classified based on specific morphological characteristics. In the two experimental groups, scientific observation was supported either via faded WE or via non-faded WE, both presented as short videos. The control group did not receive support via WE. We assessed factual and applied knowledge acquisition regarding scientific observation, motivational aspects and cognitive load. The results suggest that WE promoted knowledge application: learners from both experimental groups were able to perform the individual steps of scientific observation more accurately. Fading of WE did not show any additional advantage compared to the non-faded version in this regard. Furthermore, the descriptive results reveal higher motivation and reduced extraneous cognitive load within the experimental groups, but none of these differences were statistically significant. Our findings add to existing evidence that WE may be useful for establishing scientific competences.

1 Introduction

Learning in science education frequently involves the acquisition of basic principles or generalities, whether of domain-specific topics (e.g., applying a mathematical multiplication rule) or of rather universal scientific methodologies (e.g., performing the steps of scientific observation) ( Lunetta et al., 2007 ). Previous research has shown that worked examples (WE) can be considered particularly useful for developing such cognitive schemata during learning, to avoid using more cognitive resources than necessary for learning successive actions ( Renkl et al., 2004 ; Renkl, 2017 ). WE consist of the presentation of a problem, consecutive solution steps and the solution itself. This is especially advantageous in initial cognitive skill acquisition, i.e., for novice learners with low prior knowledge ( Kalyuga et al., 2001 ). With growing knowledge, fading WE can lead from example-based learning to independent problem-solving ( Renkl et al., 2002 ). Preliminary work has shown the advantage of WE in specific STEM domains like mathematics ( Booth et al., 2015 ; Barbieri et al., 2021 ), but fewer studies have investigated their impact on the acquisition of basic scientific competencies that involve heuristic problem-solving processes (scientific argumentation; Schworm and Renkl, 2007 ; Hefter et al., 2014 ; Koenen et al., 2017 ). In the realm of natural sciences, various basic scientific methodologies are employed to acquire knowledge, such as experimentation or scientific observation ( Wellnitz and Mayer, 2013 ). During the pursuit of knowledge through scientific inquiry activities, learners may encounter several challenges and difficulties.
Similar to the hurdles faced in experimentation, where understanding the criteria for appropriate experimental design, including the development, measurement, and evaluation of results, is crucial ( Sirum and Humburg, 2011 ; Brownell et al., 2014 ; Dasgupta et al., 2014 ; Deane et al., 2014 ), scientific observation additionally presents its own set of issues. In scientific observation, e.g., the acquisition of new insights may be somewhat incidental due to spontaneous and uncoordinated observations ( Jensen, 2014 ). To address these challenges, it is crucial to provide instructional support, including the use of WE, particularly when observations are carried out in a more self-directed manner.

For this reason, the aim of the present study was to determine the usefulness of digitally presented WE to support the acquisition of a basic scientific methodological skill—conducting scientific observations—using a digital learning environment. In this regard, this study examined the effects of different forms of digitally presented WE (non-faded vs. faded) on students’ cognitive and motivational outcomes and compared them to a control group without WE. Furthermore, the combined perspective on factual and applied knowledge, as well as on motivational and cognitive aspects, represents a further contribution of the study.

2 Theoretical background

2.1 Worked examples

WE have been commonly used in the fields of STEM education (science, technology, engineering, and mathematics) ( Booth et al., 2015 ). They consist of a problem statement, the steps to solve the problem, and the solution itself ( Atkinson et al., 2000 ; Renkl et al., 2002 ; Renkl, 2014 ). The success of WE can be explained by their impact on cognitive load (CL) during learning, based on assumptions from Cognitive Load Theory ( Sweller, 2006 ).

Learning with WE is considered time-efficient, effective, and superior to problem-based learning (presentation of the problem without demonstration of solution steps) when it comes to knowledge acquisition and transfer (WE-effect, Atkinson et al., 2000 ; Van Gog et al., 2011 ). WE can help especially by reducing extraneous load (caused by the presentation and design of the learning material), which in turn can lead to an increase in germane load (the effort the learner invests to understand the learning material) ( Paas et al., 2003 ; Renkl, 2014 ). With regard to intrinsic load (the difficulty and complexity of the learning material), it is still debated whether it can be altered by instructional design, e.g., by WE ( Gerjets et al., 2004 ). WE have a positive effect on learning and knowledge transfer, especially for novices, as the step-by-step presentation of the solution requires less extraneous mental effort compared to problem-based learning ( Sweller et al., 1998 ; Atkinson et al., 2000 ; Bokosmaty et al., 2015 ). With growing knowledge, WE can lose their advantages (due to the expertise-reversal effect), and scaffolding learning via faded WE might be more successful for knowledge gain and transfer ( Renkl, 2014 ). Faded WE are similar to complete WE but fade out solution steps as knowledge and competencies grow. Faded WE enhance near-knowledge transfer and reduce errors compared to non-faded WE ( Renkl et al., 2000 ).

In addition, the reduction of intrinsic and extraneous CL by WE also has an impact on learner motivation, such as interest ( Van Gog and Paas, 2006 ). Um et al. (2012) showed that there is a strong positive correlation between germane CL and motivational aspects of learning, like satisfaction and emotion. Gupta (2019) mentions a positive correlation between CL and interest. Van Harsel et al. (2019) found that WE positively affect learning motivation, while no such effect was found for problem-solving. Furthermore, learning with WE increases learners’ belief in their competence to complete a task. In addition, fading WE can lead to higher motivation for more experienced learners, while non-faded WE can be particularly motivating for learners without prior knowledge ( Paas et al., 2005 ). In general, fundamental motivational aspects of the learning process, such as situational interest ( Lewalter and Knogler, 2014 ) or motivation-relevant experiences, like basic needs, are influenced by learning environments. At the same time, the use of such environments also depends on motivational characteristics of the learner, such as self-determined motivation ( Deci and Ryan, 2012 ). Therefore, we assume that learning with WE as a relevant component of a learning environment might also influence situational interest and basic needs.

2.1.1 Presentation of worked examples

WE are frequently used in digital learning scenarios ( Renkl, 2014 ). When designing WE, the application via digital learning media can be helpful, as their content can be presented in different ways (video, audio, text, and images), tailored to the needs of the learners, so that individual use is possible according to their own prior knowledge or learning pace ( Mayer, 2001 ). Also, digital media can present relevant information in a timely, motivating, appealing and individualized way and support learning in an effective and needs-oriented way ( Mayer, 2001 ). The advantages of using digital media in designing WE have already been shown in previous studies. Dart et al. (2020) presented WE as short videos (WEV). They report that the use of WEV leads to increased student satisfaction and more positive attitudes. Approximately 90% of the students indicated an active learning approach when learning with the WEV. Furthermore, the results show that students improved their content knowledge through WEV and that they found WEV useful for other courses as well.

Another study ( Kay and Edwards, 2012 ) presented WE as video podcasts. Here, the advantages of WE regarding self-determined learning in terms of learning location, learning time, and learning speed were shown. Learning performance improved significantly after use. The step-by-step, easy-to-understand explanations, the diagrams, and the ability to determine the learning pace by oneself were seen as beneficial.

Multimedia WE can also be enhanced with self-explanation prompts ( Berthold et al., 2009 ). Learning from WE with self-explanation prompts was shown to be superior to other learning methods, such as hypertext learning and observational learning.

In addition to presenting WE in different medial ways, WE can also comprise different content domains.

2.1.2 Content and context of worked examples

Regarding the content of WE, algorithmic and heuristic WE, as well as single-content and double-content WE, can be distinguished ( Reiss et al., 2008 ; Koenen et al., 2017 ; Renkl, 2017 ). Algorithmic WE are traditionally used in the highly structured mathematical–physical field. Here, an algorithm with very specific solution steps is to be learned, for example, in probability calculation ( Koenen et al., 2017 ). In this study, however, we focus on heuristic double-content WE. Heuristic WE in science education comprise fundamental scientific working methods, e.g., conducting experiments ( Koenen et al., 2017 ). Furthermore, double-content WE contain two learning domains that are relevant for the learning process: (1) the learning domain comprises the abstract process or concept that is primarily to be learned, e.g., scientific methodologies like observation (see section 2.2), while (2) the exemplifying domain consists of the content that is necessary to teach this process or concept, e.g., mapping of river structure ( Renkl et al., 2009 ).

Depending on the WE content to be learned, it may be necessary for learning to take place in different settings. This can be a formal or informal learning setting or a non-formal field setting. In this study, the focus is on learning scientific observation (learning domain) through river structure mapping (exemplifying domain), which takes place with the support of digital media in a formal (university) setting, but in an informal context (nature).

2.2 Scientific observation

Scientific observation is fundamental to all scientific activities and disciplines ( Kohlhauf et al., 2011 ). Scientific observation must be clearly distinguished from everyday observation, where observation is purely a matter of noticing and describing specific characteristics ( Chinn and Malhotra, 2001 ). In contrast to this everyday observation, scientific observation as a method of knowledge acquisition can be described as a rather complex activity, defined as the theory-based, systematic and selective perception of concrete systems and processes without any fundamental manipulation ( Wellnitz and Mayer, 2013 ). Wellnitz and Mayer (2013) described the scientific observation process via six steps: (1) formulation of the research question(s), (2) deduction of the null hypothesis and the alternative hypothesis, (3) planning of the research design, (4) conducting the observation, (5) analyzing the data, and (6) answering the research question(s) on this basis. Only through reliable and qualified observation can valid data be obtained that provide solid scientific evidence ( Wellnitz and Mayer, 2013 ).

Since observation activities are not trivial and learners often observe without generating new knowledge or connecting their observations to scientific explanations and thoughts, it is important to provide support at the related cognitive level, so that observation activities can be conducted in a structured way according to pre-defined criteria ( Ford, 2005 ; Eberbach and Crowley, 2009 ). Especially during field-learning experiences, scientific observation is often spontaneous and uncoordinated, whereby random discoveries result in knowledge gain ( Jensen, 2014 ).

To promote successful observing in rather unstructured settings like field trips, instructional support for the observation process seems useful. To guide observation activities, digitally presented WE seem to be an appropriate way to introduce learners to the individual steps of scientific observation using concrete examples.

2.3 Research questions and hypotheses

The present study investigates the effect of digitally presented double-content WE that support the mapping of a small Bavarian river by demonstrating the steps of scientific observation. In this analysis, we focus on the learning domain of the WE and do not investigate the exemplifying domain in detail. Distinct ways of integrating WE in the digital learning environment (faded WE vs. non-faded WE) are compared with each other and with a control group (no WE). The aim is to examine to what extent differences between those conditions exist with regard to (RQ1) learners’ competence acquisition [acquisition of factual knowledge about the scientific observation method (quantitative data) and practical application of the scientific observation method (quantified qualitative data)], (RQ2) learners’ motivation (situational interest and basic needs), and (RQ3) CL. It is assumed (Hypothesis 1) that the integration of WE (faded and non-faded) leads to significantly higher competence acquisition (factual and applied knowledge), significantly higher motivation, and significantly lower extraneous CL as well as higher germane CL during the learning process compared to a learning environment without WE. No differences between the conditions are expected regarding intrinsic CL. Furthermore, it is assumed (Hypothesis 2) that the integration of faded WE leads to significantly higher competence acquisition, significantly higher motivation, and lower extraneous CL as well as higher germane CL during the learning process compared to non-faded WE. Again, no differences between the conditions are expected with regard to intrinsic CL.

The study took place during the field trips of a university course on the application of a fluvial audit (FA) using the German working aid for mapping the morphology of rivers and their floodplains ( Bayerisches Landesamt für Umwelt, 2019 ). FA is the leading fluvial geomorphological tool for application to data collection contiguously along all watercourses of interest ( Walker et al., 2007 ). It is widely used because it is a key example of environmental conservation and monitoring that needs to be taught to students of selected study programs; thus, knowing about the most effective ways of learning is of high practical relevance.

3.1 Sample and design

3.1.1 Sample

The study was conducted with 62 science students and doctoral students of a German university (age M  = 24.03 years; SD  = 4.20; 36 females; 26 males). A total of 37 participants had already conducted a scientific observation and rated their knowledge in this regard at a medium level ( M  = 3.32 out of 5; SD  = 0.88). Seven participants had already conducted an FA and rated their knowledge in this regard at a medium level ( M  = 3.14 out of 5; SD  = 0.90). A total of 25 participants had no experience at all. Two participants had to be excluded from the sample afterward because no posttest results were available.

3.1.2 Design

The study has a one-factorial quasi-experimental comparative research design and was conducted as a field experiment using a pre-/posttest design. Participants were randomly assigned to one of three conditions: no WE ( n  = 20), faded WE ( n  = 20), and non-faded WE ( n  = 20).

3.2 Implementation and material

3.2.1 Implementation

The study started with an online kick-off meeting in which two lecturers informed all students within an hour about the basics of assessing the structural integrity of the study river and about the course of the field trip days to conduct an FA. Afterward, within 2 weeks, students self-studied the FA via Moodle, following the German standard method according to the scoresheets of Bayerisches Landesamt für Umwelt (2019) . This independent preparation using the online documents was a necessary prerequisite for participation in the field days and was checked in the pretest. The preparatory online documents included six short videos and four PDF files on the content, guidance on the German protocol of the FA, general information on river landscapes, information about anthropogenic changes in stream morphology and the scoresheets for applying the FA. In these sheets, the river and its floodplain are subdivided into sections of 100 m in length. Each of these sections is evaluated by assessing 21 habitat factors related to flow characteristics and structural variability. The findings are then transferred into a scoring system for the description of structural integrity from 1 (natural) to 7 (highly modified). Habitat factors have a decisive influence on the living conditions of animals and plants in and around rivers. They include, e.g., variability in water depth, stream width, substratum diversity, or diversity of flow velocities.

3.2.2 Materials

On the field trip days, participants were handed a tablet and a paper-based FA worksheet (last accessed 21st September 2022). 1 This four-page assessment sheet was accompanied by a digital learning environment presented on Moodle that instructed the participants on mapping the water body structure and guided the scientific observation method. All three Moodle courses were identical in structure and design; the only difference was the implementation of the WE. Below, the course without WE is described first. The other two courses have an identical structure but contain additional WE in the form of learning videos.

3.2.3 No worked example

After a short welcome and introduction to the course navigation, the FA started with the description of a short hypothetical scenario: Participants took the role of an employee of an urban planning office that assesses the ecomorphological status of a small river near a Bavarian city. The river was divided into five sections that had to be mapped separately, and the course was structured accordingly. At the beginning of each section, participants had to formulate and write down a research question and corresponding hypotheses regarding the ecomorphological status of the river section; they then had to collect data in this regard via the mapping sheet, evaluate their data, and draw a conclusion. Since this course served as the control group, no WE videos supporting the scientific observation method were integrated. The layout of the course is structured like a book in which it is not possible to scroll back. This is important insofar as participants did not have the possibility to revisit information, in order to keep the conditions comparable as well as distinguishable.

3.2.4 Non-faded worked example

In the course with non-faded WE, three instructional videos are shown for each of the five sections. In each of the three videos, two steps of the scientific observation method are presented so that, finally, all six steps of scientific observation are demonstrated. The mapping of the first section starts after the general introduction (as described above) with the instruction to work on the first two steps of scientific observation: the formulation of a research question and hypotheses. To support this, a video of about 4 min explains the features of scientifically sound research questions and hypotheses. To this aim, a practical example, including explanations and tips, is given regarding the formulation of research questions and hypotheses for this section (e.g., “To what extent do the building development and the closeness of the path to the water body have an influence on the structure of the water body?” Alternative hypothesis: It is assumed that the housing development and the closeness of the path to the water body have a negative influence on the water body structure. Null hypothesis: It is assumed that the housing development and the closeness of the path to the watercourse have no negative influence on the watercourse structure.). Participants should now formulate their own research questions and hypotheses, write them down in a text field at the end of the page, and then skip to the next page. The next two steps of scientific observation, planning and conducting, are explained in a short 4-min video. To this aim, a practical example including explanations and tips is given regarding planning and conducting scientific observation for this section (e.g., “It’s best to go through each evaluation category carefully one by one; that way, you are sure not to forget anything!”). Now, participants were asked to collect data for the first section using their paper-based FA worksheet.
Participants individually surveyed the river and reported their results in the mapping sheet by ticking the respective boxes. After collecting these data, they returned to the digital learning environment to learn how to use them by studying the last two steps of scientific observation: evaluation and conclusion. The third 4-min video explained how to evaluate and interpret the collected data. For this purpose, a practical example with explanations and tips is given regarding evaluating and interpreting data for this section (e.g., “What were the individual points that led to the assessment? Have there been points that were weighted more than others? Remember the introduction video!”). At the end of the page, participants could answer their previously stated research questions and hypotheses by evaluating their collected data and drawing a conclusion. This brings participants to the end of the first mapping section. Afterward, the cycle begins again with the second section of the river that has to be mapped. Again, participants had to conduct the steps of scientific observation, guided by WE videos explaining the steps in slightly different wording or with different examples. A total of five sections are mapped, in which the structure of the learning environment and the videos follow the same procedure.

3.2.5 Faded worked example

The digital learning environment with the faded WE follows the same structure as the version with the non-faded WE. However, in this version, the information in the WE videos is successively reduced. In the first section, all three videos are identical to the version with the non-faded WE. In the second section, the tip at the end was omitted in all three videos. In the third section, the tip and the practical example were omitted. In the fourth and fifth sections, no more videos were presented, only the work instructions.
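The fading schedule described above can be summarized as a simple lookup structure. The sketch below is purely illustrative: the component labels ("explanation", "practical_example", "tip") are our own names for the video elements, not terms from the study's materials.

```python
# Illustrative sketch of the fading schedule; component labels are our own.
FADING_SCHEDULE = {
    1: {"explanation", "practical_example", "tip"},  # identical to non-faded WE
    2: {"explanation", "practical_example"},         # tip omitted
    3: {"explanation"},                              # tip and practical example omitted
    4: set(),                                        # work instructions only, no video
    5: set(),                                        # work instructions only, no video
}

def video_components(section: int) -> set:
    """Return the WE video components shown in a given mapping section (1-5)."""
    return FADING_SCHEDULE[section]
```

In the non-faded condition, every section would map to the full component set instead.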

3.3 Procedure

The data collection took place on four consecutive days on the university campus, with a maximum group size of 15 participants per day. The students were randomly assigned to one of the three conditions (no WE vs. faded WE vs. non-faded WE). After a short introduction to the procedure, the participants were handed the paper-based FA worksheet and one tablet per person. Students scanned the QR code on the first page of the worksheet, which opened the pretest questionnaire; completing it took about 20 min. After completing the questionnaire, the group walked for about 15 min to the nearby small river that was to be mapped. Upon arrival, there was first a short introduction to the digital learning environment and a check that the login (via university account on Moodle) worked. During the next 4 h, the participants individually mapped five segments of the river using the mapping worksheet. They were guided through the steps of scientific observation by the digital learning environment on the tablet. The results of their scientific observation were logged within the digital learning environment. At the end of the digital learning environment, participants were directed to the posttest via a link. After completing the test, the tablets and mapping sheets were returned. Overall, the study took about 5 h per group each day.

3.4 Instruments

In the pretest, sociodemographic data (age and gender), the study domain and the number of study semesters were collected. Additionally, previous scientific observation experience and the estimation of one’s own ability in this regard were assessed. For example, participants were asked whether they had already conducted a scientific observation and, if so, how they rated their abilities on a 5-point scale from very low to very high. Preparation for the FA on the basis of the learning material was also assessed: Participants were asked whether they had studied all six videos and all four PDF documents, with the response options not at all, partially, and completely. Furthermore, a factual knowledge test about scientific observation and questions about self-determination theory were administered. The posttest comprised the same knowledge test as well as additional questions on basic needs, situational interest, measures of CL and questions about the usefulness of the WE. All scales were presented online, and participants reached the questionnaire via QR code.

3.4.1 Scientific observation competence acquisition

For the factual knowledge (quantitative assessment of the scientific observation competence), a single-choice knowledge test with 12 questions was developed and used as pre- and posttest with a maximum score of 12 points. It assesses the learners’ knowledge of the scientific observation method regarding the steps of scientific observation, e.g., formulating research questions and hypotheses or developing a research design. The questions are based on Wahser (2008 , adapted by Koenen, 2014 ) and adapted to scientific observation: “Although you are sure that you have conducted the scientific observation correctly, an unexpected result turns up. What conclusion can you draw?” Each question has four answer options (one of which is correct) and, in addition, one “I do not know” option.

For the applied knowledge (quantified qualitative assessment of the scientific observation competence), students’ scientific observations written in the digital learning environment were analyzed. A coding scheme was used with the following codes: 0 = insufficient (text field is empty or includes only insufficient key points), 1 = sufficient (a research question and no hypotheses or research question and inappropriate hypotheses are stated), 2 = comprehensive (research question and appropriate hypothesis or research question and hypotheses are stated, but, e.g., incorrect null hypothesis), 3 = very comprehensive (correct research question, hypothesis and null hypothesis are stated). One example of a very comprehensive answer regarding the research question and hypothesis is: To what extent does the lack of riparian vegetation have an impact on water body structure? Hypothesis: The lack of shore vegetation has a negative influence on the water body structure. Null hypothesis: The lack of shore vegetation has no influence on the water body structure. Afterward, a sum score was calculated for each participant. Five times, a research question and hypotheses (steps 1 and 2 in the observation process) had to be formulated (5 × max. 3 points = 15 points), and five times, the research questions and hypotheses had to be answered (steps 5 and 6 in the observation process: evaluation and conclusion) (5 × max. 3 points = 15 points). Overall, participants could reach up to 30 points. Since the observation and evaluation criteria in data collection and analysis were strongly predetermined by the scoresheet, steps 3 and 4 of the observation process (planning and conducting) were not included in the analysis.
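The sum-score computation described above can be sketched in a few lines. This is a minimal illustration; the function name and the example code vectors are ours, not part of the study's analysis scripts.

```python
# 0-3 coding scheme for each written response, as described in the text.
CODE_LABELS = {
    0: "insufficient",
    1: "sufficient",
    2: "comprehensive",
    3: "very comprehensive",
}

def applied_knowledge_score(codes):
    """Sum a participant's ten per-section codes (0-3 each).

    Five codes cover research question/hypotheses (steps 1 and 2) and five
    cover evaluation/conclusion (steps 5 and 6), so the maximum is 10 * 3 = 30.
    """
    if len(codes) != 10:
        raise ValueError("expected 10 coded responses per participant")
    if any(c not in CODE_LABELS for c in codes):
        raise ValueError("codes must be integers from 0 to 3")
    return sum(codes)

# Hypothetical participant with mostly comprehensive answers:
score = applied_knowledge_score([3, 2, 3, 2, 3, 1, 2, 3, 2, 3])
```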

All 600 cases (60 participants, each 10 responses to code) were coded by the first author. For verification, 240 cases (24 randomly selected participants, eight from each course) were cross-coded by an external coder. In 206 of the coded cases, the raters agreed. The cases in which the raters did not agree were discussed together, and a solution was found. This results in Cohen’s κ = 0.858, indicating a high to very high level of agreement. This indicates that the category system is clearly formulated and that the individual units of analysis could be correctly assigned.
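For illustration, Cohen's κ for two raters assigning nominal codes to the same items can be computed as below. This is a generic stdlib sketch of the statistic, not the software actually used in the study.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters coding the same items with nominal codes."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must code the same non-empty set of items")
    n = len(rater_a)
    # Observed proportion of agreement
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement, from each rater's marginal code frequencies
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_expected = sum(freq_a[c] * freq_b.get(c, 0) for c in freq_a) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)
```

Kappa corrects raw percentage agreement for the agreement expected by chance, which is why it is preferred over simple agreement rates for coding schemes with few categories.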

3.4.2 Self-determination index

For the calculation of the self-determination index (SDI), Thomas and Müller’s (2011) scale for self-determination was used in the pretest. The scale consists of four subscales: intrinsic motivation (five items; e.g., I engage with the workshop content because I enjoy it; reliability of alpha = 0.87), identified motivation (four items; e.g., I engage with the workshop content because it gives me more options when choosing a career; alpha = 0.84), introjected motivation (five items; e.g., I engage with the workshop content because otherwise I would have a guilty feeling; alpha = 0.79), and external motivation (three items, e.g., I engage with the workshop content because I simply have to learn it; alpha = 0.74). Participants could indicate their answers on a 5-point Likert scale ranging from 1 = completely disagree to 5 = completely agree. To calculate the SDI, the sum of the external regulation styles (introjected and external) is subtracted from the sum of the self-determined regulation styles (intrinsic and identified), with intrinsic and external regulation each weighted twice ( Thomas and Müller, 2011 ).
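A minimal sketch of this computation, assuming the conventional weighting in which the self-determined styles enter positively and intrinsic and external regulation are weighted twice (variable names are ours):

```python
def sdi(intrinsic, identified, introjected, external):
    """Self-determination index from the four subscale means (1-5 Likert).

    Assumed convention: the controlled styles (introjected, external) are
    subtracted from the self-determined styles (intrinsic, identified), with
    intrinsic and external each weighted twice.
    """
    return (2 * intrinsic + identified) - (introjected + 2 * external)

# A learner with high self-determined and low external motivation:
index = sdi(intrinsic=4.6, identified=4.0, introjected=2.0, external=1.5)
```

Positive values indicate predominantly self-determined motivation; negative values indicate predominantly externally regulated motivation.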

3.4.3 Motivation

Basic needs were measured in the posttest with the scale by Willems and Lewalter (2011) . The scale consists of three subscales: perceived competence (four items; e.g., during the workshop, I felt that I could meet the requirements; alpha = 0.90), perceived autonomy (five items; e.g., during the workshop, I felt that I had a lot of freedom; alpha = 0.75), and perceived autonomy regarding personal wishes and goals (APWG) (four items; e.g., during the workshop, I felt that the workshop was how I wish it would be; alpha = 0.93). We added all three subscales to one overall basic needs scale (alpha = 0.90). Participants could indicate their answers on a 5-point Likert scale ranging from 1 = completely disagree to 5 = completely agree.

Situational interest was measured in the posttest with the 12-item scale by Lewalter and Knogler (2014 ; Knogler et al., 2015 ; Lewalter, 2020 ; alpha = 0.84). The scale consists of two subscales: catch (six items; e.g., I found the workshop exciting; alpha = 0.81) and hold (six items; e.g., I would like to learn more about parts of the workshop; alpha = 0.80). Participants could indicate their answers on a 5-point Likert scale ranging from 1 = completely disagree to 5 = completely agree.

3.4.4 Cognitive load

In the posttest, CL scales were used to examine the mental load during the learning process. Intrinsic CL (three items; e.g., this task was very complex; alpha = 0.70) and extraneous CL (three items; e.g., in this task, it is difficult to identify the most important information; alpha = 0.61) were measured with the scales from Klepsch et al. (2017) . Germane CL (two items; e.g., the learning session contained elements that supported me to better understand the learning material; alpha = 0.72) was measured with the scale from Leppink et al. (2013) . Participants could indicate their answers on a 5-point Likert scale ranging from 1 = completely disagree to 5 = completely agree.

3.4.5 Attitudes toward worked examples

To measure how effective participants rated the WE, we used two scales related to the WE videos as instructional support. The first scale from Renkl (2001) relates to the usefulness of WE. The scale consists of four items (e.g., the explanations were helpful; alpha = 0.71). Two items were recoded because they were formulated negatively. The second scale is from Wachsmuth (2020) and relates to the participant’s evaluation of the WE. The scale consists of nine items (e.g., I always did what was explained in the learning videos; alpha = 0.76). Four items were recoded because they were formulated negatively. Participants could indicate their answers on a 5-point Likert scale ranging from 1 = completely disagree to 5 = completely agree.

3.5 Data analysis

An ANOVA was used to test whether prior knowledge and the SDI index differed between the three groups. However, as no significant differences between the conditions were found [prior factual knowledge: F (2, 59) = 0.15, p  = 0.865, η² = 0.00; self-determination index: F (2, 59) = 0.19, p  = 0.829, η² = 0.00], they were not included as covariates in subsequent analyses.

Furthermore, a repeated-measures one-way analysis of variance (ANOVA) was conducted to compare the three treatment groups (no WE vs. faded WE vs. non-faded WE) regarding the increase in factual knowledge about the scientific observation method from pretest to posttest.

A MANOVA (multivariate analysis of variance) was calculated with the three groups (no WE vs. non-faded WE vs. faded WE) as a fixed factor and the dependent variables being the practical application of the scientific observation method (first research question), situational interest, basic needs (second research question), and CL (third research question).

Additionally, to determine differences in applied knowledge among the three groups, Bonferroni-adjusted post-hoc analyses were conducted.
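To make the group comparison concrete, the between-groups F statistic underlying a one-way ANOVA can be sketched in a few lines. This is a stdlib illustration of the statistic itself; the study's analyses were presumably run in standard statistical software, and the example scores below are hypothetical, not the study's data.

```python
def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA across k independent groups."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Between-groups and within-groups sums of squares
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum(sum((x - m) ** 2 for x in g) for g, m in zip(groups, means))
    df_between, df_within = k - 1, n - k
    return (ss_between / df_between) / (ss_within / df_within)

# Three hypothetical groups of applied-knowledge scores:
f_stat = one_way_anova_f([[12, 14, 13], [18, 20, 19], [25, 23, 24]])
```

The F statistic is then compared against the F distribution with (k − 1, n − k) degrees of freedom to obtain a p-value.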

4 Results

The descriptive statistics of the three groups in terms of prior factual knowledge about the scientific observation method and the self-determination index are shown in Table 1 . They revealed only small, non-significant differences between the three groups in terms of factual knowledge.


Table 1 . Means (standard deviations) of factual knowledge tests (pre- and posttest) and self-determination index for the three different groups.

The results of the ANOVA revealed that the overall increase in factual knowledge from pre- to posttest just missed significance [ F (1, 57) = 3.68, p  = 0.060, η² = 0.06]. Furthermore, no significant differences between the groups were found regarding the acquisition of factual knowledge from pre- to posttest [ F (2, 57) = 2.93, p  = 0.062, η² = 0.09].

An analysis of the descriptive statistics showed that the largest differences between the groups were found in applied knowledge (qualitative evaluation) and extraneous load (see Table 2 ).


Table 2 . Means (standard deviations) of dependent variables with the three different groups.

Results of the MANOVA revealed significant overall differences between the three groups [ F (12, 106) = 2.59, p  = 0.005, η² = 0.23]. Significant effects were found for the application of knowledge [ F (2, 57) = 13.26, p  < 0.001, η² = 0.32]. Extraneous CL just missed significance [ F (2, 57) = 2.68, p  = 0.065, η² = 0.09]. There were no significant effects for situational interest [ F (2, 57) = 0.44, p  = 0.644, η² = 0.02], basic needs [ F (2, 57) = 1.22, p  = 0.302, η² = 0.04], germane CL [ F (2, 57) = 2.68, p  = 0.077, η² = 0.09], and intrinsic CL [ F (2, 57) = 0.28, p  = 0.757, η² = 0.01].

Bonferroni-adjusted post-hoc analysis revealed that the group without WE had significantly lower scores in the evaluation of applied knowledge than the group with non-faded WE (p < 0.001, M diff = −8.90, 95% CI [−13.47, −4.33]) and the group with faded WE (p < 0.001, M diff = −7.40, 95% CI [−11.97, −2.83]). No difference was found between the groups with faded and non-faded WE (p = 1.00, M diff = −1.50, 95% CI [−6.07, 3.07]).

The descriptive statistics regarding the perceived usefulness of the WE and participants’ evaluation of the WE revealed that the group with faded WE rated usefulness slightly higher and reported a more positive evaluation than the group with non-faded WE. However, the results of a MANOVA revealed no significant overall differences [F(2, 37) = 0.32, p = 0.732, η² = 0.02] (see Table 3).


Table 3. Means (standard deviations) of perceived usefulness and evaluation of the WE for the two experimental groups.

5 Discussion

This study investigated the use of WE to support students’ acquisition of scientific observation skills. Below, the research questions are answered, and the implications and limitations of the study are discussed.

5.1 Results on factual and applied knowledge

In terms of knowledge gain (RQ1), our findings revealed no significant differences in participants’ results on the factual knowledge test, either across all three groups or specifically between the two experimental groups. These results contradict the related literature, in which WE had a positive impact on knowledge acquisition ( Renkl, 2014 ) and faded WE are considered more effective for knowledge acquisition and transfer than non-faded WE ( Renkl et al., 2000 ; Renkl, 2014 ). A limitation of the study is that participants already scored very high on the pretest, so the intervention was unlikely to yield significant knowledge gains due to ceiling effects ( Staus et al., 2021 ). Yet nearly half of the students reported being novices in the field prior to the study, suggesting that the difficulty of some test items might have been too low. In a further study, it would therefore be important to revise the factual knowledge test, e.g., the difficulty of the distractors.

Nevertheless, with regard to applied knowledge, the results revealed large significant differences: participants in the two experimental groups performed better in conducting the scientific observation steps than participants in the control group. Within the experimental groups, the non-faded WE group performed descriptively better than the faded WE group. However, the absence of significant differences between the two experimental groups suggests that both faded and non-faded double-content WE are suitable for teaching applied knowledge about scientific observation in the learning domain ( Koenen, 2014 ). Furthermore, our results differ from the findings of Renkl et al. (2000) , in which the faded version led to the highest knowledge transfer. Although the non-faded WE performed best in our study, the faded version was also appropriate to improve learning, confirming the findings of Renkl (2014) and Hesser and Gregory (2015) .

5.2 Results on learners’ motivation

Regarding participants’ motivation (RQ2; situational interest and basic needs), no significant differences were found across all three groups or between the two experimental groups. However, descriptive results reveal slightly higher motivation in the two experimental groups than in the control group. In this regard, our results confirm, at a descriptive level, existing literature showing that WE lead to higher learning-relevant motivation ( Paas et al., 2005 ; Van Harsel et al., 2019 ). Additionally, both experimental groups rated the usefulness of the WE as high and evaluated them positively. We therefore assume that even non-faded WE do not lead to over-instruction. Given this descriptive tendency, a larger sample might yield significant results and detect even small effects in future investigations. However, because this study also focused on a comprehensive qualitative data analysis, evaluating a larger sample was not feasible here.

5.3 Results on cognitive load

Finally, CL did not vary significantly across the three groups (RQ3), although the differences in extraneous CL only just missed significance. Descriptively, the control group reported the highest extraneous and lowest germane CL, while the faded WE group showed the lowest extraneous CL and a germane CL similar to that of the non-faded WE group. These results are consistent with Paas et al. (2003) and Renkl (2014) , who report that WE can help to reduce extraneous CL and, in return, lead to an increase in germane CL. Again, these differences were just above the significance level, and it would be advantageous to retest with a larger sample to detect even small effects.

Taken together, our results only partially confirm H1: the integration of WE (both faded and non-faded) led to a higher acquisition of applied knowledge than in the control group without WE, but not to higher factual knowledge, and higher motivation and differences in CL were found at a descriptive level only. The control group provided the basis for comparison with the treatment in order to investigate whether there is an effect at all and, if so, how large it is. This is an important point for assessing whether the effort of implementing WE is justified. Additionally, regarding H2, our results reveal no significant differences between the two WE conditions. We assume that the high complexity of the FA could play a role here: it might be hard to handle, especially for beginners, so learners could benefit from support throughout (i.e., non-faded WE).

In addition to the limitations already mentioned, it must be noted that only one exemplary topic was investigated, and the sample only consisted of students. Since only the learning domain of the double-content WE was investigated, the exemplifying domain could also be analyzed, or further variables like motivation could be included in further studies. Furthermore, the influence of learners’ prior knowledge on learning with WE could be investigated, as studies have found that WE are particularly beneficial in the initial acquisition of cognitive skills ( Kalyuga et al., 2001 ).

6 Conclusion

Overall, the results of the current study suggest a beneficial role for WE in supporting the application of scientific observation steps. A major implication of these findings is that both faded and non-faded WE should be considered, as no general advantage of faded WE over non-faded WE was found. This information can be used to develop targeted interventions aimed at the support of scientific observation skills.

Data availability statement

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Ethics statement

Ethical approval was not required for the study involving human participants in accordance with the local legislation and institutional requirements. Written informed consent to participate in this study was not required from the participants in accordance with the national legislation and the institutional requirements.

Author contributions

ML: Writing – original draft. SM: Writing – review & editing. JP: Writing – review & editing. JG: Writing – review & editing. DL: Writing – review & editing.

Funding

The author(s) declare that no financial support was received for the research, authorship, and/or publication of this article.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/feduc.2024.1293516/full#supplementary-material

1. ^ https://www.lfu.bayern.de/wasser/gewaesserstrukturkartierung/index.htm

Atkinson, R. K., Derry, S. J., Renkl, A., and Wortham, D. (2000). Learning from examples: instructional principles from the worked examples research. Rev. Educ. Res. 70, 181–214. doi: 10.3102/00346543070002181

Barbieri, C. A., Booth, J. L., Begolli, K. N., and McCann, N. (2021). The effect of worked examples on student learning and error anticipation in algebra. Instr. Sci. 49, 419–439. doi: 10.1007/s11251-021-09545-6

Bayerisches Landesamt für Umwelt. (2019). Gewässerstrukturkartierung von Fließgewässern in Bayern – Erläuterungen zur Erfassung und Bewertung. (Water structure mapping of flowing waters in Bavaria - Explanations for recording and assessment) . Available at: https://www.bestellen.bayern.de/application/eshop_app000005?SID=1020555825&ACTIONxSESSxSHOWPIC(BILDxKEY:%27lfu_was_00152%27,BILDxCLASS:%27Artikel%27,BILDxTYPE:%27PDF%27)

Berthold, K., Eysink, T. H., and Renkl, A. (2009). Assisting self-explanation prompts are more effective than open prompts when learning with multiple representations. Instr. Sci. 37, 345–363. doi: 10.1007/s11251-008-9051-z

Bokosmaty, S., Sweller, J., and Kalyuga, S. (2015). Learning geometry problem solving by studying worked examples: effects of learner guidance and expertise. Am. Educ. Res. J. 52, 307–333. doi: 10.3102/0002831214549450

Booth, J. L., McGinn, K., Young, L. K., and Barbieri, C. A. (2015). Simple practice doesn’t always make perfect. Policy Insights Behav. Brain Sci. 2, 24–32. doi: 10.1177/2372732215601691

Brownell, S. E., Wenderoth, M. P., Theobald, R., Okoroafor, N., Koval, M., Freeman, S., et al. (2014). How students think about experimental design: novel conceptions revealed by in-class activities. Bioscience 64, 125–137. doi: 10.1093/biosci/bit016

Chinn, C. A., and Malhotra, B. A. (2001). “Epistemologically authentic scientific reasoning” in Designing for science: implications from everyday, classroom, and professional settings . eds. K. Crowley, C. D. Schunn, and T. Okada (Mahwah, NJ: Lawrence Erlbaum), 351–392.

Dart, S., Pickering, E., and Dawes, L. (2020). Worked example videos for blended learning in undergraduate engineering. AEE J. 8, 1–22. doi: 10.18260/3-1-1153-36021

Dasgupta, A., Anderson, T. R., and Pelaez, N. J. (2014). Development and validation of a rubric for diagnosing students’ experimental design knowledge and difficulties. CBE Life Sci. Educ. 13, 265–284. doi: 10.1187/cbe.13-09-0192

Deane, T., Nomme, K. M., Jeffery, E., Pollock, C. A., and Birol, G. (2014). Development of the biological experimental design concept inventory (BEDCI). CBE Life Sci. Educ. 13, 540–551. doi: 10.1187/cbe.13-11-0218

Deci, E. L., and Ryan, R. M. (2012). “Self-determination theory” in Handbook of theories of social psychology . eds. P. A. M. Van Lange, A. W. Kruglanski, and E. T. Higgins, 416–436.

Eberbach, C., and Crowley, K. (2009). From everyday to scientific observation: how children learn to observe the Biologist’s world. Rev. Educ. Res. 79, 39–68. doi: 10.3102/0034654308325899

Ford, D. (2005). The challenges of observing geologically: third graders’ descriptions of rock and mineral properties. Sci. Educ. 89, 276–295. doi: 10.1002/sce.20049

Gerjets, P., Scheiter, K., and Catrambone, R. (2004). Designing instructional examples to reduce intrinsic cognitive load: molar versus modular presentation of solution procedures. Instr. Sci. 32, 33–58. doi: 10.1023/B:TRUC.0000021809.10236.71

Gupta, U. (2019). Interplay of germane load and motivation during math problem solving using worked examples. Educ. Res. Theory Pract. 30, 67–71.

Hefter, M. H., Berthold, K., Renkl, A., Riess, W., Schmid, S., and Fries, S. (2014). Effects of a training intervention to foster argumentation skills while processing conflicting scientific positions. Instr. Sci. 42, 929–947. doi: 10.1007/s11251-014-9320-y

Hesser, T. L., and Gregory, J. L. (2015). Exploring the Use of Faded Worked Examples as a Problem Solving Approach for Underprepared Students. High. Educ. Stud. 5, 36–46.

Jensen, E. (2014). Evaluating children’s conservation biology learning at the zoo. Conserv. Biol. 28, 1004–1011. doi: 10.1111/cobi.12263

Kalyuga, S., Chandler, P., Tuovinen, J., and Sweller, J. (2001). When problem solving is superior to studying worked examples. J. Educ. Psychol. 93, 579–588. doi: 10.1037/0022-0663.93.3.579

Kay, R. H., and Edwards, J. (2012). Examining the use of worked example video podcasts in middle school mathematics classrooms: a formative analysis. Can. J. Learn. Technol. 38, 1–20. doi: 10.21432/T2PK5Z

Klepsch, M., Schmitz, F., and Seufert, T. (2017). Development and validation of two instruments measuring intrinsic, extraneous, and germane cognitive load. Front. Psychol. 8:1997. doi: 10.3389/fpsyg.2017.01997

Knogler, M., Harackiewicz, J. M., Gegenfurtner, A., and Lewalter, D. (2015). How situational is situational interest? Investigating the longitudinal structure of situational interest. Contemp. Educ. Psychol. 43, 39–50. doi: 10.1016/j.cedpsych.2015.08.004

Koenen, J. (2014). Entwicklung und Evaluation von experimentunterstützten Lösungsbeispielen zur Förderung naturwissenschaftlich experimenteller Arbeitsweisen . Dissertation.

Koenen, J., Emden, M., and Sumfleth, E. (2017). Naturwissenschaftlich-experimentelles Arbeiten. Potenziale des Lernens mit Lösungsbeispielen und Experimentierboxen. (scientific-experimental work. Potentials of learning with solution examples and experimentation boxes). Zeitschrift für Didaktik der Naturwissenschaften 23, 81–98. doi: 10.1007/s40573-017-0056-5

Kohlhauf, L., Rutke, U., and Neuhaus, B. J. (2011). Influence of previous knowledge, language skills and domain-specific interest on observation competency. J. Sci. Educ. Technol. 20, 667–678. doi: 10.1007/s10956-011-9322-3

Leppink, J., Paas, F., Van der Vleuten, C. P., Van Gog, T., and Van Merriënboer, J. J. (2013). Development of an instrument for measuring different types of cognitive load. Behav. Res. Methods 45, 1058–1072. doi: 10.3758/s13428-013-0334-1

Lewalter, D. (2020). “Schülerlaborbesuche aus motivationaler Sicht unter besonderer Berücksichtigung des Interesses. (Student laboratory visits from a motivational perspective with special attention to interest)” in Handbuch Forschen im Schülerlabor – theoretische Grundlagen, empirische Forschungsmethoden und aktuelle Anwendungsgebiete . eds. K. Sommer, J. Wirth, and M. Vanderbeke (Münster: Waxmann-Verlag), 62–70.

Lewalter, D., and Knogler, M. (2014). “A questionnaire to assess situational interest – theoretical considerations and findings” in Poster Presented at the 50th Annual Meeting of the American Educational Research Association (AERA) (Philadelphia, PA)

Lunetta, V., Hofstein, A., and Clough, M. P. (2007). Learning and teaching in the school science laboratory: an analysis of research, theory, and practice. In N. Lederman and S. Abel (Eds.). Handbook of research on science education , Mahwah, NJ: Lawrence Erlbaum, 393–441.

Mayer, R. E. (2001). Multimedia learning. Cambridge University Press.

Paas, F., Renkl, A., and Sweller, J. (2003). Cognitive load theory and instructional design: recent developments. Educ. Psychol. 38, 1–4. doi: 10.1207/S15326985EP3801_1

Paas, F., Tuovinen, J., van Merriënboer, J. J. G., and Darabi, A. (2005). A motivational perspective on the relation between mental effort and performance: optimizing learner involvement in instruction. Educ. Technol. Res. Dev. 53, 25–34. doi: 10.1007/BF02504795

Reiss, K., Heinze, A., Renkl, A., and Groß, C. (2008). Reasoning and proof in geometry: effects of a learning environment based on heuristic worked-out examples. ZDM Int. J. Math. Educ. 40, 455–467. doi: 10.1007/s11858-008-0105-0

Renkl, A. (2001). Explorative Analysen zur effektiven Nutzung von instruktionalen Erklärungen beim Lernen aus Lösungsbeispielen. (Exploratory analyses of the effective use of instructional explanations in learning from worked examples). Unterrichtswissenschaft 29, 41–63. doi: 10.25656/01:7677

Renkl, A. (2014). “The worked examples principle in multimedia learning” in Cambridge handbook of multimedia learning . ed. R. E. Mayer (Cambridge University Press), 391–412.

Renkl, A. (2017). Learning from worked-examples in mathematics: students relate procedures to principles. ZDM 49, 571–584. doi: 10.1007/s11858-017-0859-3

Renkl, A., Atkinson, R. K., and Große, C. S. (2004). How fading worked solution steps works. A cognitive load perspective. Instr. Sci. 32, 59–82. doi: 10.1023/B:TRUC.0000021815.74806.f6

Renkl, A., Atkinson, R. K., and Maier, U. H. (2000). “From studying examples to solving problems: fading worked-out solution steps helps learning” in Proceeding of the 22nd Annual Conference of the Cognitive Science Society . eds. L. Gleitman and A. K. Joshi (Mahwah, NJ: Erlbaum), 393–398.

Renkl, A., Atkinson, R. K., Maier, U. H., and Staley, R. (2002). From example study to problem solving: smooth transitions help learning. J. Exp. Educ. 70, 293–315. doi: 10.1080/00220970209599510

Renkl, A., Hilbert, T., and Schworm, S. (2009). Example-based learning in heuristic domains: a cognitive load theory account. Educ. Psychol. Rev. 21, 67–78. doi: 10.1007/s10648-008-9093-4

Schworm, S., and Renkl, A. (2007). Learning argumentation skills through the use of prompts for self-explaining examples. J. Educ. Psychol. 99, 285–296. doi: 10.1037/0022-0663.99.2.285

Sirum, K., and Humburg, J. (2011). The experimental design ability test (EDAT). Bioscene 37, 8–16.

Staus, N. L., O’Connell, K., and Storksdieck, M. (2021). Addressing the ceiling effect when assessing STEM out-of-school time experiences. Front. Educ. 6:690431. doi: 10.3389/feduc.2021.690431

Sweller, J. (2006). The worked example effect and human cognition. Learn. Instr. 16, 165–169. doi: 10.1016/j.learninstruc.2006.02.005

Sweller, J., Van Merriënboer, J. J. G., and Paas, F. (1998). Cognitive architecture and instructional design. Educ. Psychol. Rev. 10, 251–295. doi: 10.1023/A:1022193728205

Thomas, A. E., and Müller, F. H. (2011). “Skalen zur motivationalen Regulation beim Lernen von Schülerinnen und Schülern. Skalen zur akademischen Selbstregulation von Schüler/innen SRQ-A [G] (überarbeitete Fassung)” in Scales of motivational regulation in student learning. Student academic self-regulation scales SRQ-A [G] (revised version). Wissenschaftliche Beiträge aus dem Institut für Unterrichts- und Schulentwicklung Nr. 5 (Klagenfurt: Alpen-Adria-Universität)

Um, E., Plass, J. L., Hayward, E. O., and Homer, B. D. (2012). Emotional design in multimedia learning. J. Educ. Psychol. 104, 485–498. doi: 10.1037/a0026609

Van Gog, T., Kester, L., and Paas, F. (2011). Effects of worked examples, example-problem, and problem- example pairs on novices’ learning. Contemp. Educ. Psychol. 36, 212–218. doi: 10.1016/j.cedpsych.2010.10.004

Van Gog, T., and Paas, G. W. C. (2006). Optimising worked example instruction: different ways to increase germane cognitive load. Learn. Instr. 16, 87–91. doi: 10.1016/j.learninstruc.2006.02.004

Van Harsel, M., Hoogerheide, V., Verkoeijen, P., and van Gog, T. (2019). Effects of different sequences of examples and problems on motivation and learning. Contemp. Educ. Psychol. 58, 260–275. doi: 10.1002/acp.3649

Wachsmuth, C. (2020). Computerbasiertes Lernen mit Aufmerksamkeitsdefizit: Unterstützung des selbstregulierten Lernens durch metakognitive prompts. (Computer-based learning with attention deficit: supporting self-regulated learning through metacognitive prompts) . Chemnitz: Dissertation Technische Universität Chemnitz.

Wahser, I. (2008). Training von naturwissenschaftlichen Arbeitsweisen zur Unterstützung experimenteller Kleingruppenarbeit im Fach Chemie (Training of scientific working methods to support experimental small group work in chemistry) . Dissertation.

Walker, J., Gibson, J., and Brown, D. (2007). Selecting fluvial geomorphological methods for river management including catchment scale restoration within the environment agency of England and Wales. Int. J. River Basin Manag. 5, 131–141. doi: 10.1080/15715124.2007.9635313

Wellnitz, N., and Mayer, J. (2013). Erkenntnismethoden in der Biologie – Entwicklung und evaluation eines Kompetenzmodells. (Methods of knowledge in biology - development and evaluation of a competence model). Z. Didaktik Naturwissensch. 19, 315–345.

Willems, A. S., and Lewalter, D. (2011). “Welche Rolle spielt das motivationsrelevante Erleben von Schülern für ihr situationales Interesse im Mathematikunterricht? (What role does students’ motivational experience play in their situational interest in mathematics classrooms?). Befunde aus der SIGMA-Studie” in Erziehungswissenschaftliche Forschung – nachhaltige Bildung. Beiträge zur 5. DGfE-Sektionstagung “Empirische Bildungsforschung”/AEPF-KBBB im Frühjahr 2009 . eds. B. Schwarz, P. Nenninger, and R. S. Jäger (Landau: Verlag Empirische Pädagogik), 288–294.

Keywords: digital media, worked examples, scientific observation, motivation, cognitive load

Citation: Lechner M, Moser S, Pander J, Geist J and Lewalter D (2024) Learning scientific observation with worked examples in a digital learning environment. Front. Educ. 9:1293516. doi: 10.3389/feduc.2024.1293516

Received: 13 September 2023; Accepted: 29 February 2024; Published: 18 March 2024.

Copyright © 2024 Lechner, Moser, Pander, Geist and Lewalter. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY) . The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Miriam Lechner, [email protected]


What Is Quantitative Research? | Definition, Uses & Methods

Published on June 12, 2020 by Pritha Bhandari . Revised on June 22, 2023.

Quantitative research is the process of collecting and analyzing numerical data. It can be used to find patterns and averages, make predictions, test causal relationships, and generalize results to wider populations.

Quantitative research is the opposite of qualitative research , which involves collecting and analyzing non-numerical data (e.g., text, video, or audio).

Quantitative research is widely used in the natural and social sciences: biology, chemistry, psychology, economics, sociology, marketing, etc.

  • What is the demographic makeup of Singapore in 2020?
  • How has the average temperature changed globally over the last century?
  • Does environmental pollution affect the prevalence of honey bees?
  • Does working from home increase productivity for people with long commutes?

Table of contents

  • Quantitative research methods
  • Quantitative data analysis
  • Advantages of quantitative research
  • Disadvantages of quantitative research
  • Other interesting articles
  • Frequently asked questions about quantitative research

You can use quantitative research methods for descriptive, correlational or experimental research.

  • In descriptive research , you simply seek an overall summary of your study variables.
  • In correlational research , you investigate relationships between your study variables.
  • In experimental research , you systematically examine whether there is a cause-and-effect relationship between variables.

Correlational and experimental research can both be used to formally test hypotheses , or predictions, using statistics. The results may be generalized to broader populations based on the sampling method used.

To collect quantitative data, you will often need to use operational definitions that translate abstract concepts (e.g., mood) into observable and quantifiable measures (e.g., self-ratings of feelings and energy levels).

Note that quantitative research is at risk for certain research biases , including information bias , omitted variable bias , sampling bias , or selection bias . Be sure that you’re aware of potential biases as you collect and analyze your data to prevent them from impacting your work too much.


Once data is collected, you may need to process it before it can be analyzed. For example, survey and test data may need to be transformed from words to numbers. Then, you can use statistical analysis to answer your research questions .

Descriptive statistics will give you a summary of your data and include measures of averages and variability. You can also use graphs, scatter plots and frequency tables to visualize your data and check for any trends or outliers.
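As a small illustration, Python's standard statistics module covers these summary measures; the ratings below are invented for the example:

```python
import statistics

# Hypothetical ratings (1-5 scale) from one group of respondents.
ratings = [2, 3, 3, 4, 3, 5, 2, 3, 4, 3]

mean = statistics.mean(ratings)      # average
mode = statistics.mode(ratings)      # most frequent value
spread = statistics.stdev(ratings)   # sample standard deviation

print(mean, mode)  # → 3.2 3
```

Plotting the same values (e.g., as a histogram) would then reveal outliers or skew that the summary numbers hide.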

Using inferential statistics , you can make predictions or generalizations based on your data. You can test your hypothesis or use your sample data to estimate the population parameter .
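As an illustration of estimating a population parameter, the sketch below computes an approximate 95% confidence interval for a mean, using the normal critical value z ≈ 1.96; the scores are made up, and for a sample this small a t critical value would give a slightly wider interval:

```python
import math
import statistics

# Hypothetical sample of 25 productivity scores.
sample = [62, 70, 68, 75, 66, 71, 69, 73, 64, 72,
          67, 70, 74, 65, 68, 71, 69, 76, 63, 70,
          72, 66, 68, 74, 67]

n = len(sample)
mean = statistics.mean(sample)
se = statistics.stdev(sample) / math.sqrt(n)   # standard error of the mean

# Normal-approximation 95% confidence interval: mean ± 1.96 * SE.
ci = (mean - 1.96 * se, mean + 1.96 * se)
print(round(mean, 1))  # → 69.2
```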

For example, in a study comparing procrastination between two groups, you would first use descriptive statistics to summarize the data: you find the mean (average) and the mode (most frequent rating) of procrastination in each group, and plot the data to see if there are any outliers.

You can also assess the reliability and validity of your data collection methods to indicate how consistently and accurately your methods actually measured what you wanted them to.

Quantitative research is often used to standardize data collection and generalize findings . Strengths of this approach include:

  • Replication

Repeating the study is possible because of standardized data collection protocols and tangible definitions of abstract concepts.

  • Direct comparisons of results

The study can be reproduced in other cultural settings, times or with different groups of participants. Results can be compared statistically.

  • Large samples

Data from large samples can be processed and analyzed using reliable and consistent procedures through quantitative data analysis.

  • Hypothesis testing

Using formalized and established hypothesis testing procedures means that you have to carefully consider and report your research variables, predictions, data collection and testing methods before coming to a conclusion.

Despite the benefits of quantitative research, it is sometimes inadequate in explaining complex research topics. Its limitations include:

  • Superficiality

Using precise and restrictive operational definitions may inadequately represent complex concepts. For example, the concept of mood may be represented with just a number in quantitative research, but explained with elaboration in qualitative research.

  • Narrow focus

Predetermined variables and measurement procedures can mean that you ignore other relevant observations.

  • Structural bias

Despite standardized procedures, structural biases can still affect quantitative research. Missing data , imprecise measurements or inappropriate sampling methods are biases that can lead to the wrong conclusions.

  • Lack of context

Quantitative research often uses unnatural settings like laboratories or fails to consider historical and cultural contexts that may affect data collection and results.


If you want to know more about statistics , methodology , or research bias , make sure to check out some of our other articles with explanations and examples.

  • Chi square goodness of fit test
  • Degrees of freedom
  • Null hypothesis
  • Discourse analysis
  • Control groups
  • Mixed methods research
  • Non-probability sampling
  • Inclusion and exclusion criteria

Research bias

  • Rosenthal effect
  • Implicit bias
  • Cognitive bias
  • Selection bias
  • Negativity bias
  • Status quo bias

Quantitative research deals with numbers and statistics, while qualitative research deals with words and meanings.

Quantitative methods allow you to systematically measure variables and test hypotheses . Qualitative methods allow you to explore concepts and experiences in more detail.

In mixed methods research , you use both qualitative and quantitative data collection and analysis methods to answer your research question .

Data collection is the systematic process by which observations or measurements are gathered in research. It is used in many different contexts by academics, governments, businesses, and other organizations.

Operationalization means turning abstract conceptual ideas into measurable observations.

For example, the concept of social anxiety isn’t directly observable, but it can be operationally defined in terms of self-rating scores, behavioral avoidance of crowded places, or physical anxiety symptoms in social situations.

Before collecting data , it’s important to consider how you will operationalize the variables that you want to measure.

Reliability and validity are both about how well a method measures something:

  • Reliability refers to the  consistency of a measure (whether the results can be reproduced under the same conditions).
  • Validity   refers to the  accuracy of a measure (whether the results really do represent what they are supposed to measure).

If you are doing experimental research, you also have to consider the internal and external validity of your experiment.
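Internal consistency, one facet of reliability, is often summarized with Cronbach's alpha. A pure-Python sketch with invented item scores, not tied to any particular instrument:

```python
import statistics

def cronbach_alpha(items):
    """Cronbach's alpha for a list of item-score lists (one list per item,
    aligned across respondents):
    alpha = k/(k-1) * (1 - sum(item variances) / variance(total scores))."""
    k = len(items)
    item_vars = sum(statistics.variance(item) for item in items)
    totals = [sum(scores) for scores in zip(*items)]
    return k / (k - 1) * (1 - item_vars / statistics.variance(totals))

# Hypothetical 3-item scale answered by 5 respondents.
items = [
    [4, 3, 5, 2, 4],  # item 1
    [4, 2, 5, 3, 4],  # item 2
    [5, 3, 4, 2, 5],  # item 3
]
print(round(cronbach_alpha(items), 2))  # → 0.89
```

Values around 0.7 or higher are conventionally read as acceptable consistency.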

Hypothesis testing is a formal procedure for investigating our ideas about the world using statistics. It is used by scientists to test specific predictions, called hypotheses , by calculating how likely it is that a pattern or relationship between variables could have arisen by chance.
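A permutation test makes the "arisen by chance" idea concrete: shuffle the group labels many times and count how often the shuffled difference in means is at least as large as the observed one. A small pure-Python sketch with made-up scores:

```python
import random
import statistics

def permutation_p(group_a, group_b, n_perm=10_000, seed=0):
    """Two-sided permutation test for a difference in group means."""
    rng = random.Random(seed)
    observed = abs(statistics.mean(group_a) - statistics.mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(statistics.mean(pooled[:n_a]) - statistics.mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    return hits / n_perm

# Hypothetical scores under two conditions.
a = [12, 15, 14, 16, 13, 15, 14, 17]
b = [10, 12, 11, 13, 12, 10, 11, 12]
p = permutation_p(a, b)
print(p < 0.05)  # → True: a difference this large rarely arises by chance
```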

Cite this Scribbr article

Bhandari, P. (2023, June 22). What Is Quantitative Research? | Definition, Uses & Methods. Scribbr. Retrieved March 20, 2024, from https://www.scribbr.com/methodology/quantitative-research/


Computer Science > Computer Vision and Pattern Recognition

Title: MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training

Abstract: In this work, we discuss building performant Multimodal Large Language Models (MLLMs). In particular, we study the importance of various architecture components and data choices. Through careful and comprehensive ablations of the image encoder, the vision language connector, and various pre-training data choices, we identified several crucial design lessons. For example, we demonstrate that for large-scale multimodal pre-training, using a careful mix of image-caption, interleaved image-text, and text-only data is crucial for achieving state-of-the-art (SOTA) few-shot results across multiple benchmarks, compared to other published pre-training results. Further, we show that the image encoder together with image resolution and the image token count has substantial impact, while the vision-language connector design is of comparatively negligible importance. By scaling up the presented recipe, we build MM1, a family of multimodal models up to 30B parameters, including both dense models and mixture-of-experts (MoE) variants, that are SOTA in pre-training metrics and achieve competitive performance after supervised fine-tuning on a range of established multimodal benchmarks. Thanks to large-scale pre-training, MM1 enjoys appealing properties such as enhanced in-context learning, and multi-image reasoning, enabling few-shot chain-of-thought prompting.


  • Open access
  • Published: 19 March 2024

TacticAI: an AI assistant for football tactics

  • Zhe Wang   ORCID: orcid.org/0000-0002-0748-5376 1   na1 ,
  • Petar Veličković   ORCID: orcid.org/0000-0002-2820-4692 1   na1 ,
  • Daniel Hennes   ORCID: orcid.org/0000-0002-3646-5286 1   na1 ,
  • Nenad Tomašev   ORCID: orcid.org/0000-0003-1624-0220 1 ,
  • Laurel Prince 1 ,
  • Michael Kaisers 1 ,
  • Yoram Bachrach 1 ,
  • Romuald Elie 1 ,
  • Li Kevin Wenliang 1 ,
  • Federico Piccinini 1 ,
  • William Spearman 2 ,
  • Ian Graham 3 ,
  • Jerome Connor 1 ,
  • Yi Yang 1 ,
  • Adrià Recasens 1 ,
  • Mina Khan 1 ,
  • Nathalie Beauguerlange 1 ,
  • Pablo Sprechmann 1 ,
  • Pol Moreno 1 ,
  • Nicolas Heess   ORCID: orcid.org/0000-0001-7876-9256 1 ,
  • Michael Bowling   ORCID: orcid.org/0000-0003-2960-8418 4 ,
  • Demis Hassabis 1 &
  • Karl Tuyls   ORCID: orcid.org/0000-0001-7929-1944 5  

Nature Communications volume  15 , Article number:  1906 ( 2024 ) Cite this article


  • Computational science
  • Information technology

Identifying key patterns of tactics implemented by rival teams, and developing effective responses, lies at the heart of modern football. However, doing so algorithmically remains an open research challenge. To address this unmet need, we propose TacticAI, an AI football tactics assistant developed and evaluated in close collaboration with domain experts from Liverpool FC. We focus on analysing corner kicks, as they offer coaches the most direct opportunities for interventions and improvements. TacticAI incorporates both a predictive and a generative component, allowing the coaches to effectively sample and explore alternative player setups for each corner kick routine and to select those with the highest predicted likelihood of success. We validate TacticAI on a number of relevant benchmark tasks: predicting receivers and shot attempts and recommending player position adjustments. The utility of TacticAI is validated by a qualitative study conducted with football domain experts at Liverpool FC. We show that TacticAI’s model suggestions are not only indistinguishable from real tactics, but also favoured over existing tactics 90% of the time, and that TacticAI offers an effective corner kick retrieval system. TacticAI achieves these results despite the limited availability of gold-standard data, achieving data efficiency through geometric deep learning.


Introduction

Association football, or simply football or soccer, is a widely popular and highly professionalised sport, in which two teams compete to score goals against each other. As each football team comprises up to 11 active players at all times and play unfolds on a very large pitch (also known as a soccer field), scoring goals tends to require a significant degree of strategic team-play. Under the rules codified in the Laws of the Game 1 , this competition has nurtured an evolution of nuanced strategies and tactics, culminating in modern professional football leagues. In today’s play, data-driven insights are a key driver in determining the optimal player setups for each game and developing counter-tactics to maximise the chances of success 2 .

When competing at the highest level, the margins are incredibly tight, and it is increasingly important to be able to capitalise on any opportunity for creating an advantage on the pitch. To that end, top-tier clubs employ diverse teams of coaches, analysts and experts, tasked with studying and devising (counter-)tactics before each game. Several recent methods attempt to improve tactical coaching and player decision-making through artificial intelligence (AI) tools, using a wide variety of data types from videos to tracking sensors, and applying diverse algorithms ranging from simple logistic regression to elaborate neural network architectures. Such methods have been employed to help predict shot events from videos 3 , forecast off-screen movement from spatio-temporal data 4 , determine whether a match is in-play or interrupted 5 , or identify player actions 6 .

The execution of agreed-upon plans by players on the pitch is highly dynamic and imperfect, depending on numerous factors including player fitness and fatigue, variations in player movement and positioning, weather, the state of the pitch, and the reaction of the opposing team. In contrast, set pieces provide an opportunity to exert more control on the outcome, as the brief interruption in play allows the players to reposition according to one of the practiced and pre-agreed patterns, and make a deliberate attempt towards the goal. Examples of such set pieces include free kicks, corner kicks, goal kicks, throw-ins, and penalties 2 .

Among set pieces, corner kicks are of particular importance, as an improvement in corner kick execution may substantially modify game outcomes, and they lend themselves to principled, tactical and detailed analysis. This is because corner kicks tend to occur frequently in football matches (with ~10 corners on average taking place in each match 7 ), they are taken from a fixed, rigid position, and they offer an immediate opportunity for scoring a goal—no other set piece simultaneously satisfies all of the above. In practice, corner kick routines are determined well ahead of each match, taking into account the strengths and weaknesses of the opposing team and their typical tactical deployment. It is for this reason that we focus on corner kick analysis in particular, and propose TacticAI, an AI football assistant for supporting the human expert with set piece analysis, and the development and improvement of corner kick routines.

TacticAI is rooted in learning efficient representations of corner kick tactics from raw, spatio-temporal player tracking data. It makes efficient use of this data by representing each corner kick situation as a graph—a natural representation for modelling relationships between players (Fig.  1 A, Table  2 ), and these player relationships may be of higher importance than the absolute distances between them on the pitch 8 . Such a graph input is a natural candidate for graph machine learning models 9 , which we employ within TacticAI to obtain high-dimensional latent player representations. In the Supplementary Discussion section, we carefully contrast TacticAI against prior art in the area.

figure 1

A How corner kick situations are converted to a graph representation. Each player is treated as a node in a graph, with node, edge and graph features extracted as detailed in the main text. Then, a graph neural network operates over this graph by performing message passing; each node’s representation is updated using the messages sent to it from its neighbouring nodes. B How TacticAI processes a given corner kick. To ensure that TacticAI’s answers are robust in the face of horizontal or vertical reflections, all possible combinations of reflections are applied to the input corner, and these four views are then fed to the core TacticAI model, where they are able to interact with each other to compute the final player representations—each internal blue arrow corresponds to a single message passing layer from ( A ). Once player representations are computed, they can be used to predict the corner’s receiver, whether a shot has been taken, as well as assistive adjustments to player positions and velocities, which increase or decrease the probability of a shot being taken.

Uniquely, TacticAI takes advantage of geometric deep learning 10 to explicitly produce player representations that respect several symmetries of the football pitch (Fig.  1 B). As an illustrative example, we can usually safely assume that under a horizontal or vertical reflection of the pitch state, the game situation is equivalent. Geometric deep learning ensures that TacticAI’s player representations will be identically computed under such reflections, such that this symmetry does not have to be learnt from data. This proves to be a valuable addition, as high-quality tracking data is often limited—with only a few hundred matches played each year in every league. We provide an in-depth overview of how we employ geometric deep learning in TacticAI in the “Methods” section.
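The reflection symmetry described above can be illustrated with a minimal sketch in Python. In TacticAI the four reflected views interact through D2 group convolution layers; here, as a simplification, a generic encoder is averaged over the four views (frame averaging), which already makes the per-player output invariant to horizontal and vertical flips. The `encoder` callable is an assumption, standing in for the full model:

```python
import numpy as np

# D2 group: identity, horizontal flip, vertical flip, and both flips,
# acting on (x, y) player coordinates for a pitch centred at the origin.
REFLECTIONS = [(1.0, 1.0), (-1.0, 1.0), (1.0, -1.0), (-1.0, -1.0)]

def d2_views(coords):
    """Return the four reflected copies of an (N, 2) coordinate array."""
    return [coords * np.array(signs) for signs in REFLECTIONS]

def d2_invariant_embedding(coords, encoder):
    """Average an encoder's outputs over all four views.

    Because the four views of a flipped pitch are a permutation of the
    four views of the original pitch, the averaged output is identical
    under horizontal and/or vertical reflections. `encoder` is any
    function mapping (N, 2) coordinates to (N, d) representations.
    """
    return np.mean([encoder(view) for view in d2_views(coords)], axis=0)
```

The symmetry therefore does not have to be learnt from data: any encoder, even an untrained one, yields reflection-invariant player representations under this scheme.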

From these representations, TacticAI is then able to answer various predictive questions about the outcomes of a corner—for example, which player is most likely to make first contact with the ball, or whether a shot will take place. TacticAI can also be used as a retrieval system—for mining similar corner kick situations based on the similarity of player representations—and a generative recommendation system, suggesting adjustments to player positions and velocities to maximise or minimise the estimated shot probability. Through several experiments within a case study with domain expert coaches and analysts from Liverpool FC, the results of which we present in the next section, we obtain clear statistical evidence that TacticAI readily provides useful, realistic and accurate tactical suggestions.

To demonstrate the diverse qualities of our approach, we design TacticAI with three distinct predictive and generative components: receiver prediction, shot prediction, and tactic recommendation through guided generation, which also correspond to the benchmark tasks for quantitatively evaluating TacticAI. In addition to providing accurate quantitative insights for corner kick analysis with its predictive components, the interplay between TacticAI’s predictive and generative components allows coaches to sample alternative player setups for each routine of interest, and directly evaluate the possible outcomes of such alternatives.

We will first describe our quantitative analysis, which demonstrates that TacticAI’s predictive components are accurate at predicting corner kick receivers and shot situations on held-out test corners and that the proposed player adjustments do not strongly deviate from ground-truth situations. However, such an analysis only gives an indirect insight into how useful TacticAI would be once deployed. We tackle this question of utility head-on and conduct a comprehensive case study in collaboration with our partners at Liverpool FC—where we directly ask human expert raters to judge the utility of TacticAI’s predictions and player adjustments. The following sections expand on the specific results and analysis we have performed.

In what follows, we will describe TacticAI’s components at the minimal level necessary to understand our evaluation. We defer detailed descriptions of TacticAI’s components to the “Methods” section. Note that all error bars reported in this research are standard deviations.

Benchmarking TacticAI

We evaluate the three components of TacticAI on a relevant benchmark dataset of corner kicks. Our dataset consists of 7176 corner kicks from the 2020 to 2021 Premier League seasons, which we randomly shuffle and split into a training (80%) and a test set (20%). As previously mentioned, TacticAI operates on graphs. Accordingly, we represent each corner kick situation as a graph, where each node corresponds to a player. The features associated with each node encode the movements (velocities and positions) and simple profiles (heights and weights) of on-pitch players at the timestamp when the corresponding corner kick was being taken by the attacking kicker (see the “Methods” section), and no information of ball movement was encoded. The graphs are fully connected; that is, for every pair of players, we will include the edge connecting them in the graph. Each of these edges encodes a binary feature, indicating whether the two players are on opposing teams or not. For each task, we generated the relevant dataset of node/edge/graph features and corresponding labels (Tables  1 and 2 , see the “Methods” section). The components were then trained separately with their corresponding corner kick graphs. In particular, we only employ a minimal set of features to construct the corner kick graphs, without encoding the movements of the ball nor explicitly encoding the distances between players into the graphs. We used a consistent training-test split for all benchmark tasks, as this made it possible to benchmark not only the individual components but also their interactions.
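As a concrete illustration, the graph construction described above can be sketched in Python. The feature layout follows the description (positions, velocities, heights and weights as node features; a binary opposing-team indicator per edge), but function and variable names are my own, not the paper's code:

```python
import numpy as np

def build_corner_graph(positions, velocities, heights, weights, team_ids):
    """Build a fully connected graph for one corner kick.

    positions, velocities: (22, 2) arrays at the kick timestamp;
    heights, weights: (22,) arrays of simple player profiles;
    team_ids: (22,) array of 0/1 team labels.
    """
    n = len(team_ids)
    # Node features: position, velocity, height, weight per player.
    node_feats = np.concatenate(
        [positions, velocities, heights[:, None], weights[:, None]], axis=1
    )
    # Fully connected edge list (no self-loops), each edge carrying a
    # single binary feature: are the two players on opposing teams?
    senders, receivers, edge_feats = [], [], []
    for i in range(n):
        for j in range(n):
            if i != j:
                senders.append(i)
                receivers.append(j)
                edge_feats.append([float(team_ids[i] != team_ids[j])])
    return node_feats, np.array(senders), np.array(receivers), np.array(edge_feats)
```

Note that, as in the paper, neither ball movement nor explicit inter-player distances appear anywhere in the features.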

Accurate receiver and shot prediction through geometric deep learning

One of TacticAI’s key predictive models forecasts the receiver out of the 22 on-pitch players. The receiver is defined as the first player touching the ball after the corner is taken. In our evaluation, all methods used the same set of features (see the “Receiver prediction” entry in Table  1 and the “Methods” section). We leveraged the receiver prediction task to benchmark several different TacticAI base models. Our best-performing model—achieving 0.782 ± 0.039 in top-3 test accuracy after 50,000 training steps—was a deep graph attention network 11 , 12 , leveraging geometric deep learning 10 through the use of D₂ group convolutions 13 . We supplement this result with a detailed ablation study, verifying that both our choice of base architecture and group convolution yielded significant improvements in the receiver prediction task (Supplementary Table  2 , see the subsection “Ablation study” in the “Methods” section). Considering that corner kick receiver prediction is a highly challenging task with many factors that are unseen by our model—including fatigue and fitness levels, and actual ball trajectory—we consider TacticAI’s top-3 accuracy to reflect a high level of predictive power, and keep the base TacticAI architecture fixed for subsequent studies. In addition to this quantitative evaluation with the evaluation dataset, we also evaluate the performance of TacticAI’s receiver prediction component in a case study with human raters. Please see the “Case study with expert raters” section for more details.
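For reference, the top-3 accuracy reported here can be computed directly from model scores; a minimal sketch with illustrative names:

```python
import numpy as np

def top_k_accuracy(logits, labels, k=3):
    """Fraction of samples whose true receiver appears in the model's
    top-k scores.

    logits: (B, 22) per-player scores for B corner kicks;
    labels: (B,) indices of the true receivers.
    """
    # Indices of the k highest-scoring players per sample.
    topk = np.argsort(-logits, axis=1)[:, :k]
    return float(np.mean([labels[i] in topk[i] for i in range(len(labels))]))
```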

For shot prediction, we observe that reusing the base TacticAI architecture to directly predict shot events—i.e., directly modelling the probability \({\mathbb{P}}(\,{{\mbox{shot}}}| {{\mbox{corner}}}\,)\)—proved challenging, only yielding a test F1 score of 0.52 ± 0.03 for a GATv2 base model. Note that here we use the F1 score—the harmonic mean of precision and recall—as it is commonly used in binary classification problems over imbalanced datasets, such as shot prediction. However, given that we already have a potent receiver predictor, we decided to use its output to give us additional insight into whether or not a shot had been taken. Hence, we opted to decompose the probability of taking a shot as

$${\mathbb{P}}(\,{{\mbox{shot}}}| {{\mbox{corner}}}\,)=\sum_{{{\mbox{receiver}}}}{\mathbb{P}}(\,{{\mbox{shot}}}| {{\mbox{receiver}}},{{\mbox{corner}}}\,)\,{\mathbb{P}}(\,{{\mbox{receiver}}}| {{\mbox{corner}}}\,)\qquad (1)$$
where \({\mathbb{P}}(\,{{\mbox{receiver}}}| {{\mbox{corner}}}\,)\) are the probabilities computed by TacticAI’s receiver prediction system, and \({\mathbb{P}}(\,{{\mbox{shot}}}| {{\mbox{receiver}}},{{\mbox{corner}}}\,)\) models the conditional shot probability after a specific player makes first contact with the ball. This was implemented by providing an additional global feature to indicate the receiver in the corresponding corner kick (Table  1 ), while the architecture otherwise remained the same as that of receiver prediction (Supplementary Fig.  2 , see the “Methods” section). At training time, we feed the ground-truth receiver as input to the model—at inference time, we attempt every possible receiver, weighing their contributions using the probabilities given by TacticAI’s receiver predictor, as per Eq. ( 1 ). This two-phased approach yielded a final test F1 score of 0.68 ± 0.04 for shot prediction, which encodes significantly more signal than the unconditional shot predictor, especially considering the many unobservables associated with predicting shot events. Just as for receiver prediction, this performance can be further improved using geometric deep learning; a conditional GATv2 shot predictor with D₂ group convolutions achieves an F1 score of 0.71 ± 0.01.
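The receiver-weighted decomposition amounts to a weighted sum of conditional shot probabilities over the 22 candidate receivers. A small sketch, assuming the two probability vectors have already been produced by the respective models:

```python
import numpy as np

def shot_probability(p_receiver, p_shot_given_receiver):
    """Marginalise the conditional shot model over candidate receivers.

    p_receiver: (22,) probabilities P(receiver | corner), summing to 1,
        from the receiver predictor.
    p_shot_given_receiver: (22,) values P(shot | receiver, corner), one
        per candidate, obtained by running the conditional shot model
        once per hypothesised receiver.
    Returns P(shot | corner) as in the decomposition above.
    """
    return float(np.dot(p_receiver, p_shot_given_receiver))
```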

Moreover, we also observe that, even just through predicting the receivers, without explicitly classifying any other salient features of corners, TacticAI learned generalisable representations of the data. Specifically, team setups with similar tactical patterns tend to cluster together in TacticAI’s latent space (Fig.  2 ). However, no clear clusters are observed in the raw input space (Supplementary Fig.  1 ). This indicates that TacticAI can be leveraged as a useful corner kick retrieval system, and we will present our evaluation of this hypothesis in the “Case study with expert raters” section.

figure 2

We visualise the latent representations of attacking and defending teams in 1024 corner kicks using t-SNE. A latent team embedding in one corner kick sample is the mean of the latent player representations on the same attacking ( A – C ) or defending ( D ) team. Given the reference corner kick sample ( A ), we retrieve another corner kick sample ( B ) with respect to the closest distance of their representations in the latent space. We observe that ( A ) and ( B ) are both out-swing corner kicks and share similar patterns of their attacking tactics, which are highlighted with rectangles having the same colours, although they differ in the absolute positions and velocities of the players. All the while, the latent representation of an in-swing attack ( C ) is distant from both ( A ) and ( B ) in the latent space. The red arrows are only used to demonstrate the difference between in- and out-swing corner kicks, not the actual ball trajectories.

Lastly, it is worth emphasising that the utility of the shot predictor likely does not come from forecasting whether a shot event will occur—a challenging problem with many imponderables—but from analysing the difference in predicted shot probability across multiple corners. Indeed, in the following section, we will show how TacticAI’s generative tactic refinements can directly influence the predicted shot probabilities, which then correspond to highly favourable evaluations by our expert raters in the “Case study with expert raters” section.

Controlled tactic refinement using class-conditional generative models

Equipped with components that are able to potently relate corner kicks with their various outcomes (e.g. receivers and shot events), we can explore the use of TacticAI to suggest adjustments of tactics, in order to amplify or reduce the likelihood of certain outcomes.

Specifically, we aim to produce adjustments to the movements of players on one of the two teams, including their positions and velocities, which would maximise or minimise the probability of a shot event, conditioned on the initial corner setup, consisting of the movements of players on both teams and their heights and weights. In particular, although in real-world scenarios both teams may react simultaneously to each other’s movements, in our study we focus on moderate adjustments to player movements, which help to detect players that are not responding to a tactic properly. For this reason, we simplify the process of tactic refinement by generating the adjustments for only one team while keeping the other fixed. The way we train a model for this task is through an auto-encoding objective: we feed the ground-truth shot outcome (a binary indicator) as an additional graph-level feature to TacticAI’s model (Table  1 ), and then have it learn to reconstruct a probability distribution of the input player coordinates (Fig.  1 B, also see the “Methods” section). As a consequence, our tactic adjustment system does not depend on the previously discussed shot predictor—although we can use the shot predictor to evaluate whether the adjustments make a measurable difference in shot probability.

This autoencoder-based generative model is a standalone component, separate from TacticAI’s predictive systems. All three systems share the encoder architecture (without sharing parameters), but use different decoders (see the “Methods” section). At inference time, we can instead feed in a desired shot outcome for the given corner setup, and then sample new positions and velocities for players on one team from this probability distribution. This setup, in principle, allows for flexible downstream use, as human coaches can optimise corner kick setups by generating adjustments conditioned on the specific outcomes of their interest—e.g., increasing shot probability for the attacking team, decreasing it for the defending team (Fig.  3 ) or amplifying the chance that a particular striker receives the ball.
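Inference-time use of the generative component might look as follows. This is a heavily simplified sketch: `decoder` stands in for TacticAI's graph-based decoder and is assumed to return the mean and standard deviation of a Gaussian over one team's player coordinates; the interface is illustrative, not the paper's API:

```python
import numpy as np

def sample_adjustments(decoder, latent, desired_shot, rng, n_samples=5):
    """Sample candidate player setups for one team, conditioned on a
    desired shot outcome (1 = encourage a shot, 0 = suppress it).

    `decoder` maps (latent representation, outcome flag) to a Gaussian
    (mean, std) over (N, 2) player coordinates. We draw several setups
    so a coach can inspect multiple alternatives.
    """
    mean, std = decoder(latent, desired_shot)
    return [rng.normal(mean, std) for _ in range(n_samples)]
```

Each sampled setup could then be scored with the shot predictor, keeping candidates whose predicted shot probability moves in the desired direction.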

figure 3

TacticAI makes it possible for human coaches to redesign corner kick tactics in ways that help maximise the probability of a positive outcome for either the attacking or the defending team by identifying key players, as well as by providing temporally coordinated tactic recommendations that take all players into consideration. As demonstrated in the present example ( A ), for a corner kick in which there was a shot attempt in reality ( B ), TacticAI can generate a tactically-adjusted setting in which the shot probability has been reduced, by adjusting the positioning of the defenders ( D ). The suggested defender positions result in reduced receiver probability for attacking players 2–5 (see bottom row), while the receiver probability of Attacker 1, who is distant from the goalpost, has been increased ( C ). The model is capable of generating multiple such scenarios. Coaches can inspect the different options visually and additionally consult TacticAI’s quantitative analysis of the presented tactics.

We first evaluate the generated adjustments quantitatively, by verifying that they are indistinguishable from the original corner kick distribution using a classifier. To do this, we synthesised a dataset consisting of 200 corner kick samples and their corresponding conditionally generated adjustments. Specifically, for corners without a shot event, we generated adjustments for the attacking team by setting the shot event feature to 1, and vice-versa for the defending team when a shot event did happen. We found that the real and generated samples were not distinguishable by an MLP classifier, with an F 1 score of 0.53 ± 0.05, indicating random chance level accuracy. This result indicates that the adjustments produced by TacticAI are likely similar enough to real corner kicks that the MLP is unable to tell them apart. Note that, in spite of this similarity, TacticAI recommends player-level adjustments that are not negligible—in the following section we will illustrate several salient examples of this. To more realistically validate the practical indistinguishability of TacticAI’s adjustments from realistic corners, we also evaluated the realism of the adjustments in a case study with human experts, which we will present in the following section.

In addition, we leveraged our TacticAI shot predictor to estimate whether the proposed adjustments were effective. We did this by analysing 100 corner kick samples in which threatening shots occurred, and then, for each sample, generated one defensive refinement through setting the shot event feature to 0. We observed that the average shot probability significantly decreased, from 0.75 ± 0.14 for ground-truth corners to 0.69 ± 0.16 for adjustments ( z  = 2.62,  p  < 0.001). This observation was consistent when testing for attacking team refinements (shot probability increased from 0.18 ± 0.16 to 0.31 ± 0.26 ( z  = −4.46,  p  < 0.001)). Moving beyond this result, we also asked human raters to assess the utility of TacticAI’s proposed adjustments within our case study, which we detail next.
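Differences in mean shot probability of the kind reported above can be checked with a standard two-sample z statistic. The paper does not spell out the exact test variant used, so the following plain-Python sketch is one reasonable reading:

```python
import math

def z_statistic(xs, ys):
    """Two-sample z statistic for the difference in means of two lists
    of shot probabilities (e.g. ground-truth corners vs. TacticAI's
    adjusted versions of the same corners)."""
    nx, ny = len(xs), len(ys)
    mx, my = sum(xs) / nx, sum(ys) / ny
    # Unbiased sample variances.
    vx = sum((x - mx) ** 2 for x in xs) / (nx - 1)
    vy = sum((y - my) ** 2 for y in ys) / (ny - 1)
    # Standardised difference in means.
    return (mx - my) / math.sqrt(vx / nx + vy / ny)
```

A large positive z (as in the defensive-refinement comparison) indicates the first group's mean is significantly higher than the second's.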

Case study with expert raters

Although quantitative evaluation with well-defined benchmark datasets was critical for the technical development of TacticAI, the ultimate test of TacticAI as a football tactic assistant is its practical downstream utility being recognised by professionals in the industry. To this end, we evaluated TacticAI through a case study with our partners at Liverpool FC (LFC). Specifically, we invited a group of five football experts: three data scientists, one video analyst, and one coaching assistant. Each of them completed four tasks in the case study, which evaluated the utility of TacticAI’s components from several perspectives; these include (1) the realism of TacticAI’s generated adjustments, (2) the plausibility of TacticAI’s receiver predictions, (3) effectiveness of TacticAI’s embeddings for retrieving similar corners, and (4) usefulness of TacticAI’s recommended adjustments. We provide an overview of our study’s results here and refer the interested reader to Supplementary Figs.  3 – 5 and the  Supplementary Methods for additional details.

We first simultaneously evaluated the realism of the adjusted corner kicks generated by TacticAI, and the plausibility of its receiver predictions. Going through a collection of 50 corner kick samples, we first asked the raters to classify whether a given sample was real or generated by TacticAI, and then they were asked to identify the most likely receivers in the corner kick sample (Supplementary Fig.  3 ).

On the task of classifying real and generated samples, first, we found that the raters’ average F 1 score of classifying the real vs. generated samples was only 0.60 ± 0.04, with individual F 1 scores ( \({F}_{1}^{A}=0.54,{F}_{1}^{B}=0.64,{F}_{1}^{C}=0.65,{F}_{1}^{D}=0.62,{F}_{1}^{E}=0.56\) ), indicating that the raters were, in many situations, unable to distinguish TacticAI’s adjustments from real corners.

The previous evaluation focused on analysing realism detection performance across raters. We also conduct a study that analyses realism detection across samples. Specifically, we assigned ratings for each sample—assigning +1 to a sample if it was identified as real by a human rater, and 0 otherwise—and computed the average rating for each sample across the five raters. Importantly, by studying the distribution of ratings, we found that there was no significant difference between the average ratings assigned to real and generated corners ( z  = −0.34,  p  > 0.05) (Fig.  4 A). Hence, the real and generated samples were assigned statistically indistinguishable average ratings by human raters.

figure 4

In task 1, we tested the statistical difference between the real corner kick samples and the synthetic ones generated by TacticAI from two aspects: ( A.1 ) the distributions of their assigned ratings, and ( A.2 ) the corresponding histograms of the rating values. Analogously, in task 2 (receiver prediction), ( B.1 ) we track the distributions of the top-3 accuracy of receiver prediction using those samples, and ( B.2 ) the corresponding histogram of the mean rating per sample. No statistical difference in the mean was observed in either case (( A.1 ) ( z  = −0.34,  p  > 0.05), and ( B.1 ) ( z  = 0.97,  p  > 0.05)). Additionally, we observed a statistically significant difference between the ratings of different raters on receiver prediction, with three clear clusters emerging ( C ). Specifically, Raters A and E had similar ratings ( z  = 0.66,  p  > 0.05), and Raters B and D also rated in similar ways ( z  = −1.84,  p  > 0.05), while Rater C responded differently from all other raters. This suggests a good level of variety among the human raters with respect to their perceptions of corner kicks. In task 3—identifying similar corners retrieved in terms of salient strategic setups—there were no significant differences among the distributions of the ratings by different raters ( D ), suggesting a high level of agreement on the usefulness of TacticAI’s capability of retrieving similar corners ( F 1,4  = 1.01,  p  > 0.1). Finally, in task 4, we compared the ratings of TacticAI’s strategic refinements across the human raters ( E ) and found that the raters also agreed on the general effectiveness of the refinements recommended by TacticAI ( F 1,4  = 0.45,  p  > 0.05). Note that the violin plots used in B.1 and C – E model a continuous probability distribution and hence assign nonzero probabilities to values outside of the allowed ranges. We only label y-axis ticks for the possible set of ratings.

For the task of identifying receivers, we rated TacticAI’s predictions with respect to a rater as +1 if at least one of the receivers identified by the rater appeared in TacticAI’s top-3 predictions, and 0 otherwise. The average top-3 accuracy among the human raters was 0.79 ± 0.18; specifically, 0.81 ± 0.17 for the real samples, and 0.77 ± 0.21 for the generated ones. These scores closely line up with the accuracy of TacticAI in predicting receivers for held-out test corners, validating our quantitative study. Further, after averaging the ratings for receiver prediction sample-wise, we found no statistically significant difference between the average ratings of predicting receivers over the real and generated samples ( z  = 0.97,  p  > 0.05) (Fig.  4 B). This indicates that TacticAI was equally performant in predicting the receivers of real corners and TacticAI-generated adjustments, and hence may be leveraged for this purpose even in simulated scenarios.

There is a notably high variance in the average receiver prediction rating of TacticAI. We hypothesise that this is due to the fact that different raters may choose to focus on different salient features when evaluating the likely receivers (or even the number of likely receivers). We set out to validate this hypothesis by testing the pair-wise similarity of the predictions by the human raters through running a one-way analysis of variance (ANOVA), followed by a Tukey test. We found that the distributions of the five raters’ predictions were significantly different ( F 1,4  = 14.46,  p  < 0.001), forming three clusters (Fig.  4 C). This result indicates that different human raters—as suggested by their various titles at LFC—may often use very different leads when suggesting plausible receivers. The fact that TacticAI manages to retain a high top-3 accuracy in such a setting suggests that it was able to capture the salient patterns of corner kick strategies, which broadly align with human raters’ preferences. We will further test this hypothesis in the third task—identifying similar corners.
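The one-way ANOVA comparison across raters reduces to an F statistic over per-rater rating lists, which is simple to compute from first principles; a minimal sketch (the raters' actual rating data is not reproduced here):

```python
def one_way_anova_f(groups):
    """One-way ANOVA F statistic across rating groups.

    groups: list of lists of ratings, one inner list per rater.
    Returns the ratio of between-group to within-group mean squares.
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n
    # Between-group sum of squares, df = k - 1.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    # Within-group sum of squares, df = n - k.
    ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (n - k))
```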

For the third task, we asked the human raters to judge 50 pairs of corners for their similarity. Each pair consisted of a reference corner and a retrieved corner, where the retrieved corner was chosen as the nearest neighbour of the reference either in terms of their TacticAI latent-space representations or, as a feature-level heuristic, in terms of the cosine similarities of their raw features (Supplementary Fig. 4) in our corner kick dataset. We scored a rater's judgement of a pair as +1 if they considered the corners presented to be usefully similar, and 0 otherwise. We first computed, for each rater, the recall with which they judged a baseline- or TacticAI-retrieved pair as usefully similar (see the description of Task 3 in the Supplementary Methods). For TacticAI retrievals, the average recall across all raters was 0.59 ± 0.09; for the baseline system, it was 0.36 ± 0.10. Secondly, we assessed the statistical difference between the results of the two methods by averaging the ratings for each reference–retrieval pair, finding that the average rating of TacticAI retrievals is significantly higher than that of the baseline method's retrievals (z = 2.34, p < 0.05). These two results suggest that TacticAI significantly outperforms the feature-space baseline as a method for mining similar corners. This indicates that TacticAI extracts salient features from corners that are not trivial to extract from the input data alone, reinforcing it as a potent tool for discovering opposing-team tactics from available data. Finally, we observed a high level of inter-rater agreement on this task for TacticAI-retrieved pairs (\(F_{1,4} = 1.01,\ p > 0.1\)) (Fig. 4D), suggesting that human raters were largely in agreement in their assessment of TacticAI's performance.
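As a rough illustration of nearest-neighbour retrieval under cosine similarity (the shape of both the feature-level baseline and the latent-space retrieval), consider this sketch. The three-dimensional "corner feature" vectors are invented for illustration and far smaller than any real raw-feature or latent representation.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

def nearest(reference, corpus):
    """Index of the corpus vector most cosine-similar to the reference."""
    return max(range(len(corpus)), key=lambda i: cosine(reference, corpus[i]))

# Hypothetical flattened corner representations (raw features or latent embeddings):
corpus = [[1.0, 0.0, 0.0], [0.7, 0.7, 0.0], [0.0, 1.0, 0.0]]
idx = nearest([0.9, 0.1, 0.0], corpus)
```

Swapping the raw feature vectors for TacticAI's latent embeddings leaves this retrieval loop unchanged, which is why the two systems are directly comparable in the rating study.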

Finally, we evaluated TacticAI's player adjustment recommendations for their practical utility. Specifically, each rater was given 50 tactical refinements together with the corresponding real corner kick setups (see Supplementary Fig. 5, and the "Case study design" section in the Supplementary Methods). The raters were then asked to rate each refinement as saliently improving the tactics (+1), saliently making them worse (−1), or offering no salient differences (0). We calculated the average rating assigned by each of the raters (giving us a value in the range [−1, 1] for each rater). The average of these values across all five raters was 0.7 ± 0.1. Further, for 45 of the 50 situations (90%), the human raters found TacticAI's suggestion to be favourable on average (by majority voting). Both of these results indicate that TacticAI's recommendations are salient and useful to a downstream football club practitioner, and we set out to validate this with statistical tests.

We performed statistical significance testing of the observed positive ratings. First, for each of the 50 situations, we averaged its ratings across all five raters and then ran a t-test to assess whether the mean rating was significantly larger than zero. Indeed, the statistical test indicated that the tactical adjustments recommended by TacticAI were constructive overall (\(t_{49}^{\mathrm{avg}} = 9.20,\ p < 0.001\)). Secondly, we verified that each of the five raters individually found TacticAI's recommendations to be constructive, running a t-test on each rater's ratings individually. For all five raters, the average ratings were found to be above zero with statistical significance (\(t_{49}^{A} = 5.84\), \(t_{49}^{B} = 7.88\), \(t_{49}^{C} = 7.00\), \(t_{49}^{D} = 6.04\), \(t_{49}^{E} = 7.30\); all \(p < 0.001\)). In addition, their ratings shared a high level of inter-rater agreement (\(F_{1,4} = 0.45,\ p > 0.05\)) (Fig. 4E), suggesting a level of practical usefulness that is generally recognised by human experts, even though they represent different backgrounds.
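The per-situation one-sample t-test can be sketched as follows. The ratings below are hypothetical averages in [−1, 1], not the study's data, and a significance decision would still require comparing the statistic against the t distribution with n − 1 degrees of freedom.

```python
from math import sqrt

def one_sample_t(ratings, mu0=0.0):
    """t statistic for H0: mean(ratings) == mu0, with n - 1 degrees of freedom."""
    n = len(ratings)
    mean = sum(ratings) / n
    # Unbiased sample variance (divide by n - 1).
    var = sum((x - mean) ** 2 for x in ratings) / (n - 1)
    return (mean - mu0) / sqrt(var / n)

# Hypothetical per-situation average ratings in [-1, 1]:
ratings = [0.8, 0.6, 1.0, 0.4, 0.8, 0.6, 1.0, 0.8, 0.2, 0.8]
t_stat = one_sample_t(ratings)  # a large positive t supports "constructive overall"
```

With 49 degrees of freedom, as in the study, a t statistic near 9 corresponds to a vanishingly small one-sided p-value.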

Taking all of these results together, we find TacticAI to possess strong components for prediction, retrieval, and tactical adjustments on corner kicks. To illustrate the kinds of salient recommendations by TacticAI, in Fig.  5 we present four examples with a high degree of inter-rater agreement.

Figure 5

These examples are selected from our case study with human experts to illustrate the breadth of tactical adjustments that TacticAI suggests to teams defending a corner. The density of the yellow circles corresponds to the number of times the associated change was recognised as constructive by human experts. Rather than optimising the movement of one specific player, TacticAI can recommend improvements for multiple players in a single generation step, suggesting better positions to block the opposing players, or better orientations to track them more efficiently. Some specific comments from expert raters follow. In A, according to raters, TacticAI suggests more favourable positions for several defenders, and improved tracking runs for several others; further, the goalkeeper is positioned more deeply, which is also beneficial. In B, TacticAI suggests that the defenders furthest from the corner make improved covering runs, which was unanimously deemed useful, with several other defenders also positioned more favourably. In C, TacticAI recommends improved covering runs for a central group of defenders in the penalty box, which was unanimously considered salient by our raters. And in D, TacticAI suggests substantially better tracking runs for two central defenders, along with better positioning for two other defenders in the goal area.

We have demonstrated an AI assistant for football tactics and provided statistical evidence of its efficacy through a comprehensive case study with expert human raters from Liverpool FC. First, TacticAI is able to accurately predict the first receiver after a corner kick is taken as well as the probability of a shot as the direct result of the corner. Second, TacticAI has been shown to produce plausible tactical variations that improve outcomes in a salient way, while being indistinguishable from real scenarios by domain experts. And finally, the system’s latent player representations are a powerful means to retrieve similar set-piece tactics, allowing coaches to analyse relevant tactics and counter-tactics that have been successful in the past.

The broader scope of strategy modelling in football has previously been addressed from various individual angles, such as pass prediction 14, 15, 16, shot prediction 3 or corner kick tactical classification 7. However, to the best of our knowledge, our work stands out by combining and evaluating predictive and generative modelling of corner kicks for tactic development. It also stands out in its application of geometric deep learning, which efficiently incorporates various symmetries of the football pitch for improved data efficiency. Our method incorporates minimal domain knowledge and does not rely on intricate feature engineering, though its factorised design naturally accommodates more intricate feature engineering approaches when such features are available.

Our methodology requires the position and velocity estimates of all players at the time of execution of the corner and subsequent events. Here, we derive these from high-quality tracking and event data, with data availability from tracking providers limited to top leagues. Player tracking based on broadcast video would increase the reach and training data substantially, but would also likely result in noisier model inputs. While the attention mechanism of GATs would allow us to perform introspection of the most salient factors contributing to the model outcome, our method does not explicitly model exogenous (aleatoric) uncertainty, which would be valuable context for the football analyst.

While the empirical study of our method’s efficacy has been focused on corner kicks in association football, it readily generalises to other set pieces (such as throw-ins, which similarly benefit from similarity retrieval, pass and/or shot prediction) and other team sports with suspended play situations. The learned representations and overall framing of TacticAI also lay the ground for future research to integrate a natural language interface that enables domain-grounded conversations with the assistant, with the aim to retrieve particular situations of interest, make predictions for a given tactical variant, compare and contrast, and guide through an interactive process to derive tactical suggestions. It is thus our belief that TacticAI lays the groundwork for the next-generation AI assistant for football.

We devised TacticAI as a geometric deep learning pipeline, further expanded in this section. We process labelled spatio-temporal football data into graph representations, and train and evaluate on benchmarking tasks cast as classification or regression. These steps are presented in sequence, followed by details on the employed computational architecture.

Raw corner kick data

The raw dataset consisted of 9693 corner kicks collected from the 2020–21, 2021–22, and 2022–23 (up to January 2023) Premier League seasons. The dataset was provided by Liverpool FC and comprises four separate data sources, described below.

Our primary data source is spatio-temporal trajectory frames (tracking data), which tracked all on-pitch players and the ball, for each match, at 25 frames per second. In addition to player positions, their velocities are derived from position data through filtering. For each corner kick, we only used the frame in which the kick is being taken as input information.

Secondly, we also leverage event stream data, which annotated the events or actions (e.g., passes, shots and goals) that have occurred in the corresponding tracking frames.

Thirdly, the line-up data for the corresponding games, which recorded the players’ profiles, including their heights, weights and roles, is also used.

Lastly, we have access to miscellaneous game data, which contains the game days, stadium information, and pitch length and width in meters.

Graph representation and construction

We assumed that we were provided with an input graph \(\mathcal{G} = (\mathcal{V}, \mathcal{E})\) with a set of nodes \(\mathcal{V}\) and edges \(\mathcal{E} \subseteq \mathcal{V} \times \mathcal{V}\). Within the context of football games, we took \(\mathcal{V}\) to be the set of 22 players currently on the pitch for both teams, and we set \(\mathcal{E} = \mathcal{V} \times \mathcal{V}\); that is, we assumed all pairs of players have the potential to interact. Further analyses, leveraging more specific choices of \(\mathcal{E}\), would be an interesting avenue for future work.

Additionally, we assume that the graph is appropriately featurised. Specifically, we provide a node feature matrix, \(\mathbf{X} \in \mathbb{R}^{|\mathcal{V}| \times k}\), an edge feature tensor, \(\mathbf{E} \in \mathbb{R}^{|\mathcal{V}| \times |\mathcal{V}| \times l}\), and a graph feature vector, \(\mathbf{g} \in \mathbb{R}^{m}\). The appropriate entries of these objects provide the input features for each node, edge, and graph. For example, \(\mathbf{x}_u \in \mathbb{R}^{k}\) provides the attributes of an individual player \(u \in \mathcal{V}\), such as position, height and weight, and \(\mathbf{e}_{uv} \in \mathbb{R}^{l}\) provides the attributes of a particular pair of players \((u, v) \in \mathcal{E}\), such as their distance and whether they belong to the same team. The graph feature vector, \(\mathbf{g}\), can be used to store global attributes of interest to the corner kick, such as the game time, current score, or ball position. For a simplified visualisation of how a graph neural network would process such an input, refer to Fig. 1A.

To construct the input graphs, we first aligned the four data sources with respect to their game IDs and timestamps and filtered out 2517 invalid corner kicks for which the alignment failed due to missing data (e.g., missing tracking frames or event labels). This filtering yielded 7176 valid corner kicks for training and evaluation. We summarised the exact information used to construct the input graphs in Table 2. In particular, other than player heights (measured in centimetres (cm)) and weights (measured in kilograms (kg)), the players were anonymous to the model. For the cases in which player profiles were missing, we set their heights and weights to defaults of 180 cm and 75 kg, respectively; in total, there were 385 such occurrences out of 213,246 (= 22 × 9693) during data preprocessing. We downscaled the heights and weights by a factor of 100. Moreover, for each corner kick, we zero-centred the positions of on-pitch players and normalised them onto a 10 m × 10 m pitch, with their velocities re-scaled accordingly. For the cases in which the pitch dimensions were missing, we used a standard pitch dimension of 110 m × 63 m as the default.
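The zero-centring and rescaling step described above might look roughly like this sketch; the function name and the restriction to (x, y) position pairs are our own simplifications (velocities would be rescaled with the same factors, without centring).

```python
def normalise_positions(positions, pitch_length=110.0, pitch_width=63.0):
    """Zero-centre player (x, y) positions and rescale onto a 10 m x 10 m pitch.

    110 m x 63 m is the default pitch used when actual dimensions are missing.
    """
    n = len(positions)
    cx = sum(x for x, _ in positions) / n            # centroid x
    cy = sum(y for _, y in positions) / n            # centroid y
    sx, sy = 10.0 / pitch_length, 10.0 / pitch_width # per-axis scale factors
    return [((x - cx) * sx, (y - cy) * sy) for x, y in positions]

# Two players at opposite pitch corners map to opposite corners of the 10 x 10 box:
out = normalise_positions([(0.0, 0.0), (110.0, 63.0)])
```

Centring removes the dependence on absolute pitch coordinates, while the common scale makes corners from differently sized pitches directly comparable.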

We summarised the grouping of the features in Table 1. The actual features used differ between benchmark tasks, as described in more detail in the next section. To focus on modelling the high-level tactics played by the attacking and defending teams, no information about ball movement (neither positions nor velocities) was used to construct the input graphs, other than a binary possession indicator, which is 1 for the corner kick taker and 0 for all other players. Additionally, we do not have access to the players' vertical movement; therefore, only information on the two-dimensional movement of each player is provided in the data. We do, however, acknowledge that such information, when available, would be interesting to consider in a corner kick outcome predictor, given the prevalence of aerial battles in corners.

Benchmark tasks construction

TacticAI consists of three predictive and generative models, corresponding to the three benchmark tasks implemented in this study: (1) receiver prediction, (2) threatening-shot prediction, and (3) guided generation of team positions and velocities (Table 1). The graphs of all the benchmark tasks used the same feature space of nodes and edges, differing only in the global features.

For all three tasks, our models first transform the node features to a latent node feature matrix, \(\mathbf{H} = f_{\mathcal{G}}(\mathbf{X}, \mathbf{E}, \mathbf{g})\), from which we could answer queries: either about individual players, in which case we learned a relevant classifier or regressor over the \(\mathbf{h}_u\) vectors (the rows of \(\mathbf{H}\)), or about the occurrence of a global event (e.g. shot taken), in which case we classified or regressed over the aggregated player vectors, \(\sum_u \mathbf{h}_u\). In both cases, the classifiers were trained using stochastic gradient descent over an appropriately chosen loss function, such as categorical cross-entropy for classifiers and mean squared error for regressors.
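The two kinds of queries over the latent node matrix H can be sketched with toy linear decoders; the dimensions, weights, and function names here are illustrative only, not the model's actual parameters.

```python
def node_logits(H, w, b):
    """Per-player logits: one shared linear map applied to each row h_u of H."""
    return [sum(wi * hi for wi, hi in zip(w, h)) + b for h in H]

def graph_logit(H, w, b):
    """Global-event logit: sum-pool the player vectors, then apply a linear map."""
    pooled = [sum(col) for col in zip(*H)]  # sum_u h_u, one entry per latent dim
    return sum(wi * pi for wi, pi in zip(w, pooled)) + b

# Toy latent rows h_u for three players, each with two latent features:
H = [[0.1, 0.2], [0.3, -0.1], [0.0, 0.4]]
per_player = node_logits(H, [1.0, 1.0], 0.0)   # e.g. receiver scores
global_score = graph_logit(H, [1.0, 1.0], 0.0)  # e.g. shot logit
```

A softmax over the per-player logits gives a receiver distribution, while a sigmoid over the pooled logit gives a shot probability, matching the two query types described above.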

For the different tasks, we extracted the corresponding ground-truth labels from either the event stream data or the tracking data. Specifically: (1) We modelled receiver prediction as a node classification task and labelled the first player to touch the ball after the corner was taken as the target node; this player could be either an attacking or a defending player. (2) Shot prediction was modelled as graph classification. In particular, we considered a next-ball-touch action by the attacking team to be a shot if it was a direct corner, a goal, an aerial attempt, a hit on the goalposts, a shot attempt saved by the goalkeeper, or a shot that missed the target. This yielded 1736 corners labelled as a shot being taken and 5440 corners labelled as no shot being taken. (3) For the guided generation of player positions and velocities, no additional label was needed, as this model relied on a self-supervised reconstruction objective.

The entire dataset was split into training and evaluation sets with an 80:20 ratio through random sampling, and the same splits were used for all tasks.

Graph neural networks

The central model of TacticAI is the graph neural network (GNN) 9, which computes latent representations on a graph by repeatedly combining them within each node's neighbourhood. Here we define a node's neighbourhood, \(\mathcal{N}_u\), as the set of all first-order neighbours of node u, that is, \(\mathcal{N}_u = \{v \mid (v, u) \in \mathcal{E}\}\). A single GNN layer then transforms the node features by passing messages between neighbouring nodes 17, following the notation of related work 10 and the implementation of the CLRS-30 benchmark baselines 18:

where \(\psi : \mathbb{R}^{k} \times \mathbb{R}^{k} \times \mathbb{R}^{l} \times \mathbb{R}^{m} \to \mathbb{R}^{k'}\) and \(\phi : \mathbb{R}^{k} \times \mathbb{R}^{k'} \to \mathbb{R}^{k'}\) are two learnable functions (e.g. multilayer perceptrons), \(\mathbf{h}_u^{(t)}\) are the features of node u after t GNN layers, and \(\bigoplus\) is any permutation-invariant aggregator, such as sum, max, or average. By definition, we set \(\mathbf{h}_u^{(0)} = \mathbf{x}_u\), and iterate Eq. (2) for T steps, where T is a hyperparameter. Then, we let \(\mathbf{H} = f_{\mathcal{G}}(\mathbf{X}, \mathbf{E}, \mathbf{g}) = \mathbf{H}^{(T)}\) be the final node embeddings coming out of the GNN.
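A minimal, scalar-featured sketch of this message-passing update, omitting the edge and graph features fed to ψ for brevity, might read:

```python
def gnn_layer(h, edges, psi, phi, aggregate=sum):
    """One message-passing step: h'_u = phi(h_u, aggregate of psi(h_u, h_v) over v in N(u))."""
    out = {}
    for u in h:
        # First-order in-neighbours of u, per the neighbourhood definition above.
        neighbours = [v for (v, w) in edges if w == u]
        messages = [psi(h[u], h[v]) for v in neighbours]
        out[u] = phi(h[u], aggregate(messages))
    return out

# Toy scalar features on a fully connected 3-node graph (no self-loops),
# mirroring the fully connected player graph used by TacticAI:
h0 = {0: 1.0, 1: 2.0, 2: 3.0}
edges = [(u, v) for u in h0 for v in h0 if u != v]
h1 = gnn_layer(h0, edges, psi=lambda hu, hv: hv, phi=lambda hu, m: hu + m)
```

Because the aggregator is permutation-invariant, relabelling the players leaves the output features unchanged up to the same relabelling, which is the permutation symmetry discussed later in the paper.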

It is well known that Eq. ( 2 ) is remarkably general; it can be used to express popular models such as Transformers 19 as a special case, and it has been argued that all discrete deep learning models can be expressed in this form 20 , 21 . This makes GNNs a perfect framework for benchmarking various approaches to modelling player–player interactions in the context of football.

Different choices of ψ, ϕ and ⨁ yield different architectures. In our case, we utilise a message function that factorises into an attentional mechanism, \(a : \mathbb{R}^{k} \times \mathbb{R}^{k} \times \mathbb{R}^{l} \times \mathbb{R}^{m} \to \mathbb{R}\):

yielding the graph attention network (GAT) architecture 12 . In our work, specifically, we use a two-layer multilayer perceptron for the attentional mechanism, as proposed by GATv2 11 :

where \(\mathbf{W}_1, \mathbf{W}_2 \in \mathbb{R}^{k \times h}\), \(\mathbf{W}_e \in \mathbb{R}^{l \times h}\), \(\mathbf{W}_g \in \mathbb{R}^{m \times h}\) and \(\mathbf{a} \in \mathbb{R}^{h}\) are the learnable parameters of the attentional mechanism, and LeakyReLU is the leaky rectified linear activation function. This mechanism computes a coefficient of interaction (a single scalar value) for each pair of connected nodes (u, v), which is then normalised across all neighbours of u using the softmax function.
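The GATv2-style scoring and softmax normalisation can be sketched with scalar features and scalar weights; the real mechanism uses the weight matrices above and additionally conditions on edge and graph features, which this sketch omits.

```python
from math import exp

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def gatv2_scores(h_u, neighbours, w1, w2, a):
    """Scalar GATv2-style attention: a * LeakyReLU(w1*h_u + w2*h_v), softmax-normalised.

    h_u: feature of the receiving node; neighbours: features h_v of its neighbours.
    """
    raw = [a * leaky_relu(w1 * h_u + w2 * h_v) for h_v in neighbours]
    m = max(raw)                       # shift for numerical stability
    exps = [exp(r - m) for r in raw]
    z = sum(exps)
    return [e / z for e in exps]       # normalised attention coefficients

alphas = gatv2_scores(1.0, [1.0, 2.0, 4.0], w1=0.5, w2=0.5, a=1.0)
```

The coefficients sum to one over the neighbourhood, so each node's update is a convex combination of its neighbours' messages, with "more relevant" players weighted more heavily.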

Through early-stage experimentation, we ascertained that GATs are capable of matching the performance of more generic choices of ψ (such as the MPNN 17) while being more scalable. Hence, we focus our study on the GAT model in this work. More details can be found in the "Ablation study" subsection.

Geometric deep learning

In spite of the power of Eq. ( 2 ), using it in its full generality is often prone to overfitting, given the large number of parameters contained in ψ and ϕ . This problem is exacerbated in the football analytics domain, where gold-standard data is generally very scarce—for example, in the English Premier League, only a few hundred games are played every season.

In order to tackle this issue, we can exploit the immense regularity of data arising from football games. Strategically equivalent game states are also called transpositions, and symmetries such as arriving at the same chess position through different move sequences have been exploited computationally since the 1960s 22 . Similarly, game rotations and reflections may yield equivalent strategic situations 23 . Using the blueprint of geometric deep learning (GDL) 10 , we can design specialised GNN architectures that exploit this regularity.

That is, geometric deep learning is a generic methodology for deriving mathematical constraints on neural networks, such that they will behave predictably when inputs are transformed in certain ways. In several important cases, these constraints can be resolved directly, immediately informing neural network architecture design. For a comprehensive example of point clouds under 3D rotational symmetry, see Fuchs et al. 24.

To elucidate several aspects of the GDL framework at a high level, let us assume that there exists a group of input data transformations (symmetries), \(\mathfrak{G}\), under which the ground-truth label remains unchanged. Specifically, if we let \(y(\mathbf{X}, \mathbf{E}, \mathbf{g})\) be the label given to the graph featurised with \(\mathbf{X}, \mathbf{E}, \mathbf{g}\), then for every transformation \(\mathfrak{g} \in \mathfrak{G}\), the following property holds:

This condition is also referred to as \(\mathfrak{G}\)-invariance. Here, by \(\mathfrak{g}(\mathbf{X})\) we denote the result of transforming \(\mathbf{X}\) by \(\mathfrak{g}\), a concept also known as a group action. More generally, a group action is a function of the form \(\mathfrak{G} \times \mathcal{S} \to \mathcal{S}\) for some state set \(\mathcal{S}\). Note that a single group element \(\mathfrak{g} \in \mathfrak{G}\) can easily produce different actions on different \(\mathcal{S}\); in this case, \(\mathcal{S}\) could be \(\mathbb{R}^{|\mathcal{V}| \times k}\) (for \(\mathbf{X}\)), \(\mathbb{R}^{|\mathcal{V}| \times |\mathcal{V}| \times l}\) (for \(\mathbf{E}\)) and \(\mathbb{R}^{m}\) (for \(\mathbf{g}\)).

It is worth noting that GNNs may also be derived from a GDL perspective if we set the symmetry group \(\mathfrak{G}\) to \(S_{|\mathcal{V}|}\), the permutation group of \(|\mathcal{V}|\) objects. Owing to the design of Eq. (2), its outputs do not depend on the exact permutation of nodes in the input graph.

Frame averaging

A simple mechanism to enforce \(\mathfrak{G}\)-invariance, given any predictor \(f_{\mathcal{G}}(\mathbf{X}, \mathbf{E}, \mathbf{g})\), performs frame averaging across all \(\mathfrak{G}\)-transformed inputs:

This ensures that all \({\mathfrak{G}}\) -transformed versions of a particular input (also known as that input’s orbit) will have exactly the same output, satisfying Eq. ( 5 ). A variant of this approach has also been applied in the AlphaGo architecture 25 to encode symmetries of a Go board.

In our specific implementation, we set \(\mathfrak{G} = D_2 = \{\mathrm{id}, \leftrightarrow, \updownarrow, \leftrightarrow\updownarrow\}\), the dihedral group. Exploiting \(D_2\)-invariance allows us to encode quadrant symmetries: each element of the \(D_2\) group encodes the presence or absence of a vertical or horizontal reflection of the input football pitch. Under these transformations, the pitch is assumed completely symmetric, and hence many predictions, such as which player receives the corner kick, or whether a shot is taken from it, can be safely assumed unchanged. As an example of how to compute the transformed features in Eq. (6), \(\leftrightarrow(\mathbf{X})\) horizontally reflects all positional features of players in \(\mathbf{X}\) (e.g. the coordinates of the player) and negates the x-axis component of their velocity.
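A toy sketch of D2 frame averaging over (x, y, vx, vy) player tuples, under the simplifying assumption that positions and velocities are the only features that need reflecting:

```python
def reflect(players, horiz, vert):
    """Apply a D2 element to (x, y, vx, vy) tuples: optional horizontal/vertical flip.

    A horizontal flip negates x and vx; a vertical flip negates y and vy.
    """
    sx = -1.0 if horiz else 1.0
    sy = -1.0 if vert else 1.0
    return [(sx * x, sy * y, sx * vx, sy * vy) for (x, y, vx, vy) in players]

def frame_average(f, players):
    """Average a predictor f over all four D2-transformed views of the input."""
    views = [reflect(players, h, v) for h in (False, True) for v in (False, True)]
    return sum(f(view) for view in views) / 4.0

# Invariance check with a toy predictor (mean squared distance from the pitch centre):
f = lambda ps: sum(x * x + y * y for (x, y, _, _) in ps) / len(ps)
players = [(1.0, 2.0, 0.5, -0.5), (-3.0, 0.0, 0.0, 1.0)]
```

By construction, `frame_average` returns the same value for an input and any of its reflections, which is exactly the D2-invariance required of receiver and shot predictions.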

Group convolutions

While the frame averaging approach of Eq. (6) is a powerful way to restrict GNNs to respect input symmetries, it arguably misses an opportunity for the different \(\mathfrak{G}\)-transformed views to interact while their computations are being performed. For small groups such as \(D_2\), a more fine-grained approach can be taken, operating over a single GNN layer in Eq. (2), which we will write shortly as \(\mathbf{H}^{(t)} = g_{\mathcal{G}}(\mathbf{H}^{(t-1)}, \mathbf{E}, \mathbf{g})\). The condition that a symmetry-respecting GNN layer needs to satisfy is as follows, for all transformations \(\mathfrak{g} \in \mathfrak{G}\):

that is, it does not matter whether we apply \(\mathfrak{g}\) to the input or to the output of the function \(g_{\mathcal{G}}\); the final answer is the same. This condition is also referred to as \(\mathfrak{G}\)-equivariance, and it has recently proved to be a potent paradigm for developing powerful GNNs over biochemical data 24, 26.

To satisfy \(D_2\)-equivariance, we apply the group convolution approach 13. Therein, views of the input are allowed to directly interact with their \(\mathfrak{G}\)-transformed variants, in a manner very similar to grid convolutions (which are, indeed, a special case of group convolutions, setting \(\mathfrak{G}\) to be the translation group). We use \(\mathbf{H}_{\mathfrak{g}}^{(t)}\) to denote the \(\mathfrak{g}\)-transformed view of the latent node features at layer t. Omitting the \(\mathbf{E}\) and \(\mathbf{g}\) inputs for brevity, and using our previously designed layer \(g_{\mathcal{G}}\) as a building block, we can perform a group convolution as follows:

Here, ∥ is the concatenation operation, joining the two node feature matrices column-wise; \(\mathfrak{g}^{-1}\) is the inverse transformation of \(\mathfrak{g}\) (which must exist, as \(\mathfrak{G}\) is a group); and \(\mathfrak{g}^{-1}\mathfrak{h}\) is the composition of the two transformations.

Effectively, Eq. (8) implies that our \(D_2\)-equivariant GNN needs to maintain a node feature matrix \(\mathbf{H}_{\mathfrak{g}}^{(t)}\) for every \(\mathfrak{G}\)-transformation of the current input, and these views are recombined by invoking \(g_{\mathcal{G}}\) on all pairs related by applying a transformation \(\mathfrak{h}\). Note that all reflections are self-inverses; hence, in \(D_2\), \(\mathfrak{g} = \mathfrak{g}^{-1}\).

It is worth noting that both the frame averaging in Eq. (6) and the group convolution in Eq. (8) are similar in spirit to data augmentation. However, whereas standard data augmentation would only show one view at a time to the model, a frame-averaging/group-convolution architecture exhaustively generates all views and feeds them to the model all at once. Further, group convolutions allow these views to interact explicitly in a way that does not break symmetries. Herein lies the key difference between the two approaches: frame averaging and group convolutions rigorously enforce the symmetries in \(\mathfrak{G}\), whereas data augmentation only provides implicit hints to the model about satisfying them. As a consequence of the exhaustive generation, Eqs. (6) and (8) are only feasible for small groups like \(D_2\); for larger groups, approaches like steerable CNNs 27 may be employed.
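The bookkeeping behind a D2 group-convolution step (one feature object per view, recombined through the group's composition law) can be sketched as follows. The base layer here is a plain sum over re-indexed views, standing in for the shared GATv2 layer, so this shows only the view-management logic, not a trained model.

```python
# D2 encoded as bit pairs (horizontal flip, vertical flip); composition is
# componentwise XOR, and every element is its own inverse.
D2 = [(0, 0), (1, 0), (0, 1), (1, 1)]

def compose(g, h):
    return (g[0] ^ h[0], g[1] ^ h[1])

def group_conv_step(views, base):
    """One D2 group-convolution step over a dict mapping group elements to features.

    The new view for g is produced by the shared base layer applied to all views
    re-indexed by g^{-1} h, which equals compose(g, h) here since every D2
    element is self-inverse.
    """
    return {g: base([views[compose(g, h)] for h in D2]) for g in D2}

# Toy scalar "features" per view; with a symmetric base like sum, each updated
# view receives the same information, trivially preserving equivariance.
views = {(0, 0): 1.0, (1, 0): 2.0, (0, 1): 3.0, (1, 1): 4.0}
new_views = group_conv_step(views, base=sum)
```

In the actual model, `base` would concatenate the paired views and run the shared GATv2 layer on them, so the interaction between views is learned rather than a fixed sum.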

Network architectures

While the three benchmark tasks we are performing have minor differences in the global features available to the model, the neural network models designed for them all have the same encoder–decoder architecture. The encoder has the same structure in all tasks, while the decoder model is tailored to produce appropriately shaped outputs for each benchmark task.

Given an input graph, TacticAI’s model first generates all relevant D 2 -transformed versions of it, by appropriately reflecting the player coordinates and velocities. We refer to the original input graph as the identity view, and the remaining three D 2 -transformed graphs as reflected views.

Once the views are prepared, we apply four group convolutional layers (Eq. (8)) with a GATv2 base model (Eqs. (3) and (4)) as the \(g_{\mathcal{G}}\) function. Specifically, this means that, in Eqs. (3) and (4), every instance of \(\mathbf{h}_u^{(t-1)}\) is replaced by the concatenation \((\mathbf{h}_{\mathfrak{h}}^{(t-1)})_u \parallel (\mathbf{h}_{\mathfrak{g}^{-1}\mathfrak{h}}^{(t-1)})_u\). Each GATv2 layer has eight attention heads and computes four latent features per player overall. Accordingly, once the four group convolutions are performed, we have a representation \(\mathbf{H} \in \mathbb{R}^{4 \times 22 \times 4}\), where the first dimension corresponds to the four views (\(\mathbf{H}_{\mathrm{id}}, \mathbf{H}_{\leftrightarrow}, \mathbf{H}_{\updownarrow}, \mathbf{H}_{\leftrightarrow\updownarrow} \in \mathbb{R}^{22 \times 4}\)), the second dimension corresponds to the players (eleven on each team), and the third corresponds to the 4-dimensional latent vector for each player node in that particular view. How this representation is used by the decoder depends on the specific downstream task, as we detail below.

For receiver prediction, which is a fully invariant function (i.e. reflections do not change the receiver), we perform simple frame averaging across all views, arriving at

and then learn a node-wise classifier over the rows of \(\mathbf{H}^{\mathrm{node}} \in \mathbb{R}^{22 \times 4}\). We further decode \(\mathbf{H}^{\mathrm{node}}\) into a logit vector \(\mathbf{O} \in \mathbb{R}^{22}\) with a linear layer before computing the corresponding softmax cross-entropy loss.

For shot prediction, which is once again fully invariant (i.e. reflections do not change the probability of a shot), we can further average the frame-averaged features across all players to get a global graph representation:

and then learn a binary classifier over \(\mathbf{h}^{\mathrm{graph}} \in \mathbb{R}^{4}\). Specifically, we decode the hidden vector into a single logit with a linear layer and compute the sigmoid binary cross-entropy loss against the corresponding label.

For guided generation (position/velocity adjustments), we generate the player positions and velocities with respect to a particular outcome of interest to the human coaches, predicted over the rows of the hidden feature matrix. For example, the model may adjust the defensive setup to decrease the shot probability of the attacking team. The model output is now equivariant rather than invariant: reflecting the pitch appropriately reflects the predicted positions and velocity vectors. As such, we cannot perform frame averaging, and take only the identity view's features, \(\mathbf{H}_{\mathrm{id}} \in \mathbb{R}^{22 \times 4}\). From this latent feature matrix, we can then learn a conditional distribution from each row, which models the positions or velocities of the corresponding player. To do this, we extend the backbone encoder with a conditional variational autoencoder (CVAE 28, 29). Specifically, for the u-th row of \(\mathbf{H}_{\mathrm{id}}\), \(\mathbf{h}_u\), we first map its latent embedding to the parameters of a two-dimensional Gaussian distribution \(\mathcal{N}(\mu_u, \sigma_u)\), and then sample the coordinates and velocities from this distribution. At training time, we can efficiently propagate gradients through this sampling operation using the reparameterisation trick 28: sample a random value \(\epsilon_u \sim \mathcal{N}(0, 1)\) for each player from the unit Gaussian distribution, and then treat \(\mu_u + \sigma_u \epsilon_u\) as the sample for this player. In what follows, we omit edge features for brevity. For each corner kick sample \(\mathbf{X}\) with the corresponding outcome o (e.g. a binary value indicating a shot event), we extend the standard VAE loss 28, 29 to our case of outcome-conditional guided generation as

where \(h_u\) is the player embedding corresponding to the u-th row of \(\mathbf{H}_{\mathrm{id}}\), and \(\mathbb{KL}\) is the Kullback–Leibler (KL) divergence. Specifically, the first term is the generation loss between the real player input \(x_u\) and the reconstructed sample decoded from \(h_u\) with the decoder \(p_\phi\). Using the KL term, the distribution of the latent embedding \(h_u\) is regularised towards \(p(h_u \mid o)\), which is a multivariate Gaussian in our case.
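To make the sampling step concrete, here is a minimal NumPy sketch of the reparameterisation trick and of the KL term between diagonal Gaussians. All shapes and parameter values below are illustrative placeholders, not the trained model's.

```python
import numpy as np

rng = np.random.default_rng(0)
num_players, dim = 22, 2  # 22 players, 2-D positions (illustrative)

# Per-player Gaussian parameters, as an encoder might produce them.
mu = rng.normal(size=(num_players, dim))
sigma = np.exp(0.1 * rng.normal(size=(num_players, dim)))  # positive std devs

# Reparameterisation trick: sample eps ~ N(0, 1) and treat mu + sigma * eps
# as the sample, so gradients can flow through mu and sigma.
eps = rng.standard_normal((num_players, dim))
z = mu + sigma * eps

# Closed-form KL between the diagonal Gaussians N(mu, sigma^2) and a
# conditional prior N(mu0, sigma0^2) (here a standard Gaussian stand-in
# for the outcome-conditioned prior).
mu0, sigma0 = np.zeros(dim), np.ones(dim)
kl = np.sum(
    np.log(sigma0 / sigma)
    + (sigma**2 + (mu - mu0) ** 2) / (2 * sigma0**2)
    - 0.5
)
```

The KL expression is zero exactly when the two Gaussians coincide, and strictly positive otherwise, which is what drives the latent embeddings towards the outcome-conditioned prior during training.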

A complete high-level summary of the generic encoder–decoder equivariant architecture employed by TacticAI is provided in Supplementary Fig. 2. In the following section, we provide empirical evidence justifying these architectural decisions, through targeted ablation studies on our predictive benchmarks (receiver prediction and shot prediction).

Ablation study

We leveraged the receiver prediction task to evaluate various base model architectures and to directly quantify the contributions of geometric deep learning in this context. We have already seen that geometric deep learning yields better representations of the raw corner kick data, producing separable clusters in the latent space that may correspond to different attacking or defending tactics (Fig. 2). In addition, we hypothesise that these representations also yield better performance on the task of receiver prediction. Accordingly, we ablate several design choices on this task, as framed by the following four questions:

Does a factorised graph representation help? To assess this, we compare it against a convolutional neural network (CNN 30 ) baseline, which does not leverage a graph representation.

Does a graph structure help? To assess this, we compare against a Deep Sets 31 baseline, which only models each node in isolation without considering adjacency information (equivalently, setting each neighbourhood \(\mathcal{N}_u\) to the singleton set {u}).

Are attentional GNNs a good strategy? To assess this, we compare against a message passing neural network (MPNN) 32 baseline, which uses the fully potent GNN layer from Eq. (2) instead of the GATv2.

Does accounting for symmetries help? To assess this, we compare our geometric GATv2 baseline against one which does not utilise D 2 group convolutions but utilises D 2 frame averaging, and one which does not explicitly utilise any aspect of D 2 symmetries at all.
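The D2 symmetries in the fourth question can be made concrete with a short NumPy sketch of frame averaging: averaging an arbitrary (non-invariant) prediction over the four reflections of the pitch yields an output that is exactly invariant by construction. The prediction function below is a stand-in for illustration, not the paper's model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Player coordinates on a pitch centred at the origin (toy data).
coords = rng.normal(size=(22, 2))

# The four elements of the dihedral group D2: identity, horizontal flip,
# vertical flip, and both flips combined.
D2 = [np.diag(d) for d in ([1, 1], [-1, 1], [1, -1], [-1, -1])]

def predict(x):
    # Stand-in scalar prediction head; generally NOT reflection-invariant.
    return np.tanh(x @ np.array([0.3, -0.7])).sum()

# Frame averaging: average the prediction over all four reflected views.
fa_prediction = np.mean([predict(coords @ g) for g in D2])

# Reflecting the input permutes the four views among themselves, so the
# averaged output is unchanged.
flipped = coords @ D2[1]
fa_flipped = np.mean([predict(flipped @ g) for g in D2])
```

Because D2 is a group, composing a reflection with each of its four elements just reorders them, which is why the average is exactly invariant; group convolutions achieve the same end by baking the symmetry into every layer rather than only into the output.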

Each of these models has been trained for a fixed budget of 50,000 training steps. The test top- k receiver prediction accuracies of the trained models are provided in Supplementary Table  2 . As already discussed in the section “Results”, there is a clear advantage to using a full graph structure, as well as directly accounting for reflection symmetry. Further, the usage of the MPNN layer leads to slight overfitting compared to the GATv2, illustrating how attentional GNNs strike a good balance of expressivity and data efficiency for this task. Our analysis highlights the quantitative benefits of both graph representation learning and geometric deep learning for football analytics from tracking data. We also provide a brief ablation study for the shot prediction task in Supplementary Table  3 .

Training details

We train each of TacticAI’s models in isolation, using NVIDIA Tesla P100 GPUs. To minimise overfitting, each model’s learning objective is regularised with an L 2 norm penalty with respect to the network parameters. During training, we use the Adam stochastic gradient descent optimiser 33 over the regularised loss.

All models, including baselines, have been given an equal hyperparameter tuning budget, spanning the number of message passing steps ({1, 2, 4}), initial learning rate ({0.0001, 0.00005}), batch size ({128, 256}) and L 2 regularisation coefficient ({0.01, 0.005, 0.001, 0.0001, 0}). We summarise the chosen hyperparameters of each TacticAI model in Supplementary Table  1 .
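The search space above is small enough to enumerate exhaustively. The sketch below simply spells out the grid from the text; the enumeration itself is our illustration, not a description of the authors' tuning infrastructure.

```python
import itertools

# The tuning grid described in the text.
grid = {
    "message_passing_steps": [1, 2, 4],
    "learning_rate": [0.0001, 0.00005],
    "batch_size": [128, 256],
    "l2_coefficient": [0.01, 0.005, 0.001, 0.0001, 0],
}

# Every configuration in the search space: 3 * 2 * 2 * 5 = 60 candidates.
configs = [dict(zip(grid, values)) for values in itertools.product(*grid.values())]
```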

Data availability

The data collected in the human experiments in this study have been deposited in the Zenodo database under accession code https://zenodo.org/records/10557063, and the processed data used in the statistical analysis and to generate the relevant figures in the main text are available under the same accession code. The input and output data generated and/or analysed during the current study are protected and are not available due to data privacy laws and licensing restrictions. However, contact details of the input data providers are available from the corresponding authors on reasonable request.

Code availability

All the core models described in this research were built with the Graph Neural Network processors provided by the CLRS Algorithmic Reasoning Benchmark 18, and their source code is available at https://github.com/google-deepmind/clrs. We are unable to release our code for this work as it was developed in a proprietary context; however, the corresponding authors are open to answering specific questions concerning re-implementations on request. For general data analysis, we used the following freely available packages: numpy v1.25.2, pandas v1.5.3, matplotlib v3.6.1, seaborn v0.12.2 and scipy v1.9.3. The code for the statistical analysis conducted in this study is available at https://zenodo.org/records/10557063.

The International Football Association Board (IFAB). Laws of the Game (The International Football Association Board, 2023).

Tuyls, K. et al. Game plan: what AI can do for football, and what football can do for AI. J. Artif. Intell. Res. 71, 41–88 (2021).


Goka, R., Moroto, Y., Maeda, K., Ogawa, T. & Haseyama, M. Prediction of shooting events in soccer videos using complete bipartite graphs and players' spatial–temporal relations. Sensors 23, 4506 (2023).


Omidshafiei, S. et al. Multiagent off-screen behavior prediction in football. Sci. Rep. 12 , 8638 (2022).


Lang, S., Wild, R., Isenko, A. & Link, D. Predicting the in-game status in soccer with machine learning using spatiotemporal player tracking data. Sci. Rep. 12 , 16291 (2022).

Baccouche, M., Mamalet, F., Wolf, C., Garcia, C. & Baskurt, A. Action classification in soccer videos with long short-term memory recurrent neural networks. In International Conference on Artificial Neural Networks (eds Diamantaras, K., Duch, W. & Iliadis, L. S.) pages 154–159 (Springer, 2010).

Shaw, L. & Gopaladesikan, S. Routine inspection: a playbook for corner kicks. In Machine Learning and Data Mining for Sports Analytics: 7th International Workshop, MLSA 2020, Co-located with ECML/PKDD 2020, Proceedings, Ghent, Belgium, September 14–18, 2020, Vol. 7, 3–16 (Springer, 2020).

Araújo, D. & Davids, K. Team synergies in sport: theory and measures. Front. Psychol. 7 , 1449 (2016).


Veličković, P. Everything is connected: graph neural networks. Curr. Opin. Struct. Biol. 79 , 102538 (2023).


Bronstein, M. M., Bruna, J., Cohen, T. & Veličković, P. Geometric deep learning: grids, groups, graphs, geodesics, and gauges. arXiv preprint arXiv:2104.13478 (2021).

Brody, S., Alon, U. & Yahav, E. How attentive are graph attention networks? In International Conference on Learning Representations (ICLR, 2022). https://openreview.net/forum?id=F72ximsx7C1 .

Veličković, P. et al. Graph attention networks. In International Conference on Learning Representations (ICLR, 2018). https://openreview.net/forum?id=rJXMpikCZ .

Cohen, T. & Welling, M. Group equivariant convolutional networks. In International Conference on Machine Learning (eds Balcan, M. F. & Weinberger, K. Q.) 2990–2999 (PMLR, 2016).

Honda, Y., Kawakami, R., Yoshihashi, R., Kato, K. & Naemura, T. Pass receiver prediction in soccer using video and players’ trajectories. In 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) 3502–3511 (2022). https://ieeexplore.ieee.org/document/9857310 .

Hubáček, O., Sourek, G. & Železný, F. Deep learning from spatial relations for soccer pass prediction. In MLSA@PKDD/ECML (eds Brefeld, U., Davis, J., Van Haaren, J. & Zimmermann, A.) Vol. 11330, (Lecture Notes in Computer Science, Springer, Cham, 2018).

Sanyal, S. Who will receive the ball? Predicting pass recipient in soccer videos. J. Visual Commun. Image Represent. 78, 103190 (2021).

Gilmer, J., Schoenholz, S. S., Riley, P. F., Vinyals, O. & Dahl, G. E. Neural message passing for quantum chemistry. In International Conference on Machine Learning (Precup, D. & Teh, Y. W.) 1263–1272 (PMLR, 2017).

Veličković, P. et al. The CLRS algorithmic reasoning benchmark. In International Conference on Machine Learning (eds Chaudhuri, K. et al.) 22084–22102 (PMLR, 2022).

Vaswani, A. et al. Attention is all you need. In Advances in Neural Information Processing Systems (eds Guyon, I. et al.) Vol. 30 (Curran Associates, Inc., 2017).

Veličković, P. Message passing all the way up. In ICLR 2022 Workshop on Geometrical and Topological Representation Learning (GTRL, 2022). https://openreview.net/forum?id=Bc8GiEZkTe5 .

Baranwal, A., Kimon, F. & Aukosh, J. Optimality of message-passing architectures for sparse graphs. In Thirty-seventh Conference on Neural Information Processing Systems (2023). https://papers.nips.cc/paper_files/paper/2023/hash/7e991aa4cd2fdf0014fba2f000f542d0-Abstract-Conference.html .

Greenblatt, R. D., Eastlake III, D. E. & Crocker, S. D. The Greenblatt chess program. In Proc. Fall Joint Computer Conference, 801–810 (Association for Computing Machinery, 1967). https://dl.acm.org/doi/10.1145/1465611.1465715.

Schijf, M., Allis, L. V. & Uiterwijk, J. W. Proof-number search and transpositions. ICGA J. 17 , 63–74 (1994).

Fuchs, F., Worrall, D., Fischer, V. & Welling, M. SE(3)-transformers: 3D roto-translation equivariant attention networks. Adv. Neural Inf. Process. Syst. 33 , 1970–1981 (2020).


Silver, D. et al. Mastering the game of Go with deep neural networks and tree search. Nature 529 , 484–489 (2016).


Satorras, V. G., Hoogeboom, E. & Welling, M. E ( n ) equivariant graph neural networks. In International Conference on Machine Learning (eds Meila, M. & Zhang, T.) 9323–9332 (PMLR, 2021).

Cohen, T. S. & Welling, M. Steerable CNNs. In International Conference on Learning Representations (ICLR, 2017). https://openreview.net/forum?id=rJQKYt5ll .

Kingma, D. P. & Welling, M. Auto-encoding variational Bayes. In 2nd International Conference on Learning Representations, ICLR 2014, Banff, AB, Canada, April 14–16, 2014, Conference Track Proceedings (ICLR, 2014). https://openreview.net/forum?id=33X9fd2-9FyZd .

Sohn, K., Lee, H. & Yan, X. Learning structured output representation using deep conditional generative models. In Advances in Neural Information Processing Systems (eds Cortes, C, Lawrence, N., Lee, D., Sugiyama, M. & Garnett, R.) Vol. 28 (Curran Associates, Inc., 2015).

Fernández, J. & Bornn, L. Soccermap: a deep learning architecture for visually-interpretable analysis in soccer. In Machine Learning and Knowledge Discovery in Databases. Applied Data Science and Demo Track: European Conference, ECML PKDD 2020, Ghent, Belgium, September 14–18, 2020, Proceedings, Part V (eds Dong, Y., Ifrim, G., Mladenić, D., Saunders, C. & Van Hoecke, S.) 491–506 (Springer, 2021).

Zaheer, M. et al. Deep sets. In Advances in Neural Information Processing Systems Vol. 30 (eds Guyon, I., et al.) (Curran Associates, Inc., 2017).

Gilmer, J., Schoenholz, S. S., Riley, P. F., Vinyals, O. & Dahl, G. E. Neural message passing for quantum chemistry. In Proc. 34th International Conference on Machine Learning , Vol. 70 of Proceedings of Machine Learning Research, 6–11 Aug 2017 (eds Precup, D. & Whye Teh, Y) 1263–1272 (PMLR, 2017).

Kingma, D. P. & Ba, J. Adam: a method for stochastic optimization. In ICLR (Poster) (eds Bengio, Y. & LeCun, Y.) (International Conference on Learning Representations (ICLR), 2015). https://openreview.net/forum?id=8gmWwjFyLj.

Download references

Acknowledgements

We gratefully acknowledge the support of James French, Timothy Waskett, Hans Leitert and Benjamin Hervey for their extensive efforts in analysing TacticAI’s outputs. Further, we are thankful to Kevin McKee, Sherjil Ozair and Beatrice Bevilacqua for useful technical discussions, and Marc Lanctôt and Satinder Singh for reviewing the paper prior to submission.

Author information

These authors contributed equally: Zhe Wang, Petar Veličković, Daniel Hennes.

Authors and Affiliations

Google DeepMind, 6-8 Handyside Street, London, N1C 4UZ, UK

Zhe Wang, Petar Veličković, Daniel Hennes, Nenad Tomašev, Laurel Prince, Michael Kaisers, Yoram Bachrach, Romuald Elie, Li Kevin Wenliang, Federico Piccinini, Jerome Connor, Yi Yang, Adrià Recasens, Mina Khan, Nathalie Beauguerlange, Pablo Sprechmann, Pol Moreno, Nicolas Heess & Demis Hassabis

Liverpool FC, AXA Training Centre, Simonswood Lane, Kirkby, Liverpool, L33 5XB, UK

William Spearman


University of Alberta, Amii, Edmonton, AB, T6G 2E8, Canada

Michael Bowling



Contributions

Z.W., D. Hennes, L.P. and K.T. coordinated and organised the research effort leading to this paper. P.V. and Z.W. developed the core TacticAI models. Z.W., W.S. and I.G. prepared the Premier League corner kick dataset used for training and evaluating these models. P.V., Z.W., D. Hennes and N.T. designed the case study with human experts and Z.W. and P.V. performed the qualitative evaluation and statistical analysis of its outcomes. Z.W., P.V., D. Hennes, N.T., L.P., M. Kaisers, Y.B., R.E., L.K.W., F.P., W.S., I.G., N.H., M.B., D. Hassabis and K.T. contributed to writing the paper and providing feedback on the final manuscript. J.C., Y.Y., A.R., M. Khan, N.B., P.S. and P.M. contributed valuable technical and implementation discussions throughout the work’s development.

Corresponding authors

Correspondence to Zhe Wang , Petar Veličković or Karl Tuyls .

Ethics declarations

Competing interests.

The authors declare the following competing interests: TacticAI was developed during the course of the Authors' employment at Google DeepMind and Liverpool Football Club, as applicable to each Author.

Peer review

Peer review information.

Nature Communications thanks Rui Luo and the other anonymous reviewer(s) for their contribution to the peer review of this work. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .


About this article

Cite this article.

Wang, Z., Veličković, P., Hennes, D. et al. TacticAI: an AI assistant for football tactics. Nat Commun 15, 1906 (2024). https://doi.org/10.1038/s41467-024-45965-x


Received: 13 October 2023

Accepted: 08 February 2024

Published: 19 March 2024

DOI: https://doi.org/10.1038/s41467-024-45965-x




Sampling Methods – Types, Techniques and Examples


Sampling Methods

Sampling refers to the process of selecting a subset of data from a larger population or dataset in order to analyze or make inferences about the whole population.

In other words, sampling involves taking a representative sample of data from a larger group or dataset in order to gain insights or draw conclusions about the entire group.


Sampling methods refer to the techniques used to select a subset of individuals or units from a larger population for the purpose of conducting statistical analysis or research.

Sampling is an essential part of research because it allows researchers to draw conclusions about a population without collecting data from every member of that population, which can be time-consuming, expensive, or even impossible.

Types of Sampling Methods

Sampling can be broadly categorized into two main categories:

Probability Sampling

This type of sampling is based on the principles of random selection: samples are drawn so that every member of the population has an equal chance of being included. Probability sampling is commonly used in scientific research and statistical analysis, as it provides a representative sample whose results can be generalized to the larger population.

Type of Probability Sampling :

  • Simple Random Sampling: In this method, every member of the population has an equal chance of being selected for the sample. This can be done using a random number generator or by drawing names out of a hat, for example.
  • Systematic Sampling: In this method, the population is first divided into a list or sequence, and then every nth member is selected for the sample. For example, if every 10th person is selected from a list of 100 people, the sample would include 10 people.
  • Stratified Sampling: In this method, the population is divided into subgroups or strata based on certain characteristics, and then a random sample is taken from each stratum. This is often used to ensure that the sample is representative of the population as a whole.
  • Cluster Sampling: In this method, the population is divided into clusters or groups, and then a random sample of clusters is selected. Then, all members of the selected clusters are included in the sample.
  • Multi-Stage Sampling : This method combines two or more sampling techniques. For example, a researcher may use stratified sampling to select clusters, and then use simple random sampling to select members within each cluster.
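The first three probability methods above can be sketched with Python's standard `random` module. The population, strata, and sample sizes below are toy values chosen for illustration.

```python
import random

random.seed(0)
population = list(range(1, 101))  # a toy population of 100 labelled units

# Simple random sampling: every member has an equal chance of selection.
simple = random.sample(population, k=10)

# Systematic sampling: every nth member after a random starting point.
n = 10
start = random.randrange(n)
systematic = population[start::n]

# Stratified sampling: divide into strata, then sample randomly within each.
strata = {"low": population[:50], "high": population[50:]}
stratified = [unit for group in strata.values() for unit in random.sample(group, k=5)]
```

Note how stratified sampling guarantees representation from each stratum (5 units from each half here), whereas simple random sampling could, by chance, over-represent one of them.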

Non-probability Sampling

This type of sampling does not rely on random selection, and it involves selecting samples in a way that does not give every member of the population an equal chance of being included in the sample. Non-probability sampling is often used in qualitative research, where the aim is not to generalize findings to a larger population, but to gain an in-depth understanding of a particular phenomenon or group. Non-probability sampling methods can be quicker and more cost-effective than probability sampling methods, but they may also be subject to bias and may not be representative of the larger population.

Types of Non-probability Sampling :

  • Convenience Sampling: In this method, participants are chosen based on their availability or willingness to participate. This method is easy and convenient but may not be representative of the population.
  • Purposive Sampling: In this method, participants are selected based on specific criteria, such as their expertise or knowledge on a particular topic. This method is often used in qualitative research, but may not be representative of the population.
  • Snowball Sampling: In this method, participants are recruited through referrals from other participants. This method is often used when the population is hard to reach, but may not be representative of the population.
  • Quota Sampling: In this method, a predetermined number of participants are selected based on specific criteria, such as age or gender. This method is often used in market research, but may not be representative of the population.
  • Volunteer Sampling: In this method, participants volunteer to participate in the study. This method is often used in research where participants are motivated by personal interest or altruism, but may not be representative of the population.

Applications of Sampling Methods

Sampling methods are applied across many fields:

  • Psychology : Sampling methods are used in psychology research to study various aspects of human behavior and mental processes. For example, researchers may use stratified sampling to select a sample of participants that is representative of the population based on factors such as age, gender, and ethnicity. Random sampling may also be used to select participants for experimental studies.
  • Sociology : Sampling methods are commonly used in sociological research to study social phenomena and relationships between individuals and groups. For example, researchers may use cluster sampling to select a sample of neighborhoods to study the effects of economic inequality on health outcomes. Stratified sampling may also be used to select a sample of participants that is representative of the population based on factors such as income, education, and occupation.
  • Social sciences: Sampling methods are commonly used in social sciences to study human behavior and attitudes. For example, researchers may use stratified sampling to select a sample of participants that is representative of the population based on factors such as age, gender, and income.
  • Marketing : Sampling methods are used in marketing research to collect data on consumer preferences, behavior, and attitudes. For example, researchers may use random sampling to select a sample of consumers to participate in a survey about a new product.
  • Healthcare : Sampling methods are used in healthcare research to study the prevalence of diseases and risk factors, and to evaluate interventions. For example, researchers may use cluster sampling to select a sample of health clinics to participate in a study of the effectiveness of a new treatment.
  • Environmental science: Sampling methods are used in environmental science to collect data on environmental variables such as water quality, air pollution, and soil composition. For example, researchers may use systematic sampling to collect soil samples at regular intervals across a field.
  • Education : Sampling methods are used in education research to study student learning and achievement. For example, researchers may use stratified sampling to select a sample of schools that is representative of the population based on factors such as demographics and academic performance.

Examples of Sampling Methods

Probability Sampling Methods Examples:

  • Simple random sampling Example : A researcher randomly selects participants from the population using a random number generator or drawing names from a hat.
  • Stratified random sampling Example : A researcher divides the population into subgroups (strata) based on a characteristic of interest (e.g. age or income) and then randomly selects participants from each subgroup.
  • Systematic sampling Example : A researcher selects participants at regular intervals from a list of the population.

Non-probability Sampling Methods Examples:

  • Convenience sampling Example: A researcher selects participants who are conveniently available, such as students in a particular class or visitors to a shopping mall.
  • Purposive sampling Example : A researcher selects participants who meet specific criteria, such as individuals who have been diagnosed with a particular medical condition.
  • Snowball sampling Example : A researcher selects participants who are referred to them by other participants, such as friends or acquaintances.

How to Conduct Sampling Methods

Here are some general steps for conducting sampling:

  • Define the population: Identify the population of interest and clearly define its boundaries.
  • Choose the sampling method: Select an appropriate sampling method based on the research question, characteristics of the population, and available resources.
  • Determine the sample size: Determine the desired sample size based on statistical considerations such as margin of error, confidence level, or power analysis.
  • Create a sampling frame: Develop a list of all individuals or elements in the population from which the sample will be drawn. The sampling frame should be comprehensive, accurate, and up-to-date.
  • Select the sample: Use the chosen sampling method to select the sample from the sampling frame. The sample should be selected randomly, or if using a non-random method, every effort should be made to minimize bias and ensure that the sample is representative of the population.
  • Collect data: Once the sample has been selected, collect data from each member of the sample using appropriate research methods (e.g., surveys, interviews, observations).
  • Analyze the data: Analyze the data collected from the sample to draw conclusions about the population of interest.
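The sample-size step above is often approached with Cochran's formula, n = z²·p(1−p)/e², which gives an initial sample size from a desired confidence level and margin of error. The sketch below implements that standard formula; the default values (z = 1.96 for 95% confidence, p = 0.5 as the most conservative proportion, e = 0.05) are conventional choices, not prescriptions from this article.

```python
import math

def cochran_sample_size(z=1.96, p=0.5, e=0.05):
    """Cochran's formula for an initial sample size: n = z^2 * p(1-p) / e^2.

    z: z-score for the desired confidence level (1.96 ~ 95%),
    p: estimated population proportion (0.5 is the most conservative),
    e: desired margin of error.
    """
    return math.ceil(z**2 * p * (1 - p) / e**2)

n = cochran_sample_size()  # ~385 for 95% confidence and a 5% margin of error
```

Halving the margin of error roughly quadruples the required sample size, which is why precision is expensive.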

When to use Sampling Methods

Sampling methods are used in research when it is not feasible or practical to study the entire population of interest. Sampling allows researchers to study a smaller group of individuals, known as a sample, and use the findings from the sample to make inferences about the larger population.

Sampling methods are particularly useful when:

  • The population of interest is too large to study in its entirety.
  • The cost and time required to study the entire population are prohibitive.
  • The population is geographically dispersed or difficult to access.
  • The research question requires specialized or hard-to-find individuals.
  • The data collected is quantitative and statistical analyses are used to draw conclusions.

Purpose of Sampling Methods

The main purpose of sampling methods in research is to obtain a representative sample of individuals or elements from a larger population of interest, in order to make inferences about the population as a whole. By studying a smaller group of individuals, known as a sample, researchers can gather information about the population that would be difficult or impossible to obtain from studying the entire population.

Sampling methods allow researchers to:

  • Study a smaller, more manageable group of individuals, which is typically less time-consuming and less expensive than studying the entire population.
  • Reduce the potential for data collection errors and improve the accuracy of the results by minimizing sampling bias.
  • Make inferences about the larger population with a certain degree of confidence, using statistical analyses of the data collected from the sample.
  • Improve the generalizability and external validity of the findings by ensuring that the sample is representative of the population of interest.

Characteristics of Sampling Methods

Here are some characteristics of sampling methods:

  • Randomness : Probability sampling methods are based on random selection, meaning that every member of the population has an equal chance of being selected. This helps to minimize bias and ensure that the sample is representative of the population.
  • Representativeness : The goal of sampling is to obtain a sample that is representative of the larger population of interest. This means that the sample should reflect the characteristics of the population in terms of key demographic, behavioral, or other relevant variables.
  • Size : The size of the sample should be large enough to provide sufficient statistical power for the research question at hand. The sample size should also be appropriate for the chosen sampling method and the level of precision desired.
  • Efficiency : Sampling methods should be efficient in terms of time, cost, and resources required. The method chosen should be feasible given the available resources and time constraints.
  • Bias : Sampling methods should aim to minimize bias and ensure that the sample is representative of the population of interest. Bias can be introduced through non-random selection or non-response, and can affect the validity and generalizability of the findings.
  • Precision : Sampling methods should be precise in terms of providing estimates of the population parameters of interest. Precision is influenced by sample size, sampling method, and level of variability in the population.
  • Validity : The validity of the sampling method is important for ensuring that the results obtained from the sample are accurate and can be generalized to the population of interest. Validity can be affected by sampling method, sample size, and the representativeness of the sample.

Advantages of Sampling Methods

Sampling methods have several advantages, including:

  • Cost-Effective : Sampling methods are often much cheaper and less time-consuming than studying an entire population. By studying only a small subset of the population, researchers can gather valuable data without incurring the costs associated with studying the entire population.
  • Convenience : Sampling methods are often more convenient than studying an entire population. For example, if a researcher wants to study the eating habits of people in a city, it would be very difficult and time-consuming to study every single person in the city. By using sampling methods, the researcher can obtain data from a smaller subset of people, making the study more feasible.
  • Accuracy: When done correctly, sampling methods can be very accurate. By using appropriate sampling techniques, researchers can obtain a sample that is representative of the entire population. This allows them to make accurate generalizations about the population as a whole based on the data collected from the sample.
  • Time-Saving: Sampling methods can save a lot of time compared to studying the entire population. By studying a smaller sample, researchers can collect data much more quickly than they could if they studied every single person in the population.
  • Less Bias : Sampling methods can reduce bias in a study. If a researcher were to study the entire population, it would be very difficult to eliminate all sources of bias. However, by using appropriate sampling techniques, researchers can reduce bias and obtain a sample that is more representative of the entire population.

Limitations of Sampling Methods

  • Sampling Error: Sampling error is the difference between a sample statistic and the corresponding population parameter. It arises simply from studying a sample rather than the entire population. Larger samples generally produce smaller sampling error, but some degree of error remains no matter how large the sample is.
  • Selection Bias: Selection bias occurs when the sample is not representative of the population. This can happen if the sample is not selected randomly or if some groups are underrepresented in it, and it can lead to inaccurate conclusions about the population.
  • Non-response Bias: Non-response bias occurs when some members of the sample do not respond to the survey or study. If the non-respondents differ from the respondents in important ways, the resulting sample is biased.
  • Time and Cost: While sampling is generally cost-effective, selecting a sample that is truly representative of the population can still be expensive and time-consuming, depending on the sampling method used.
  • Limited Information: A sample can only provide information about the variables that are actually measured. It may say nothing about other variables that are relevant to the research question but were not measured.
  • Generalization: Findings from a sample can be generalized to the population only to the extent that the sample is representative. If the sample is not representative, it may not be possible to generalize the findings to the population as a whole.
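The sampling-error point above can be made concrete with a small simulation. The sketch below (hypothetical data, Python standard library only; the population values and sample sizes are illustrative assumptions, not from the article) repeatedly draws samples of two different sizes and measures how much the sample means scatter around the true population mean:

```python
import random
import statistics

random.seed(42)

# Hypothetical population: 100,000 "heights" centred on 170 cm
population = [random.gauss(170, 10) for _ in range(100_000)]
true_mean = statistics.mean(population)

def sampling_error_spread(sample_size, trials=500):
    """Standard deviation of the sample means across repeated draws:
    a direct measure of sampling error at this sample size."""
    means = [
        statistics.mean(random.sample(population, sample_size))
        for _ in range(trials)
    ]
    return statistics.stdev(means)

small = sampling_error_spread(25)    # n = 25
large = sampling_error_spread(400)   # n = 400

# Quadrupling the square root of n (25 -> 400) should shrink the
# spread by roughly a factor of 4, but it never reaches zero.
print(f"spread of sample means at n=25:  {small:.2f}")
print(f"spread of sample means at n=400: {large:.2f}")
```

Running this shows the spread at n = 400 is far smaller than at n = 25, yet still positive, which is exactly the limitation described above: larger samples reduce sampling error but never eliminate it.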

About the author


Muhammad Hassan

Researcher, Academic Writer, Web developer


