Methodology tutorial - theory-driven research designs

The educational technology and digital learning wiki
{{Incomplete}}
{{under construction}}

<!-- <pageby nominor="false" comments="false"/> -->


== Research Design for Educational Technologies - Theory-driven research designs ==

This is part of the [[methodology tutorial]] (see its table of contents).
== Introduction ==


Note: There should be links to selected wiki articles!
<div class="tut_goals">
; Learning goals
* Understand the fundamental principles of theory-driven research
* Become familiar with some major approaches
; Prerequisites
* [[Methodology tutorial - empirical research principles]]
; Moving on
* [[Methodology tutorial - quantitative data acquisition methods]]
* [[Methodology tutorial - quantitative data analysis]]
; Level and target population
* Beginners
; Quality / To do:
* More or less OK in spirit. '''But''': some translations are needed, some phrases need to be rewritten, and bullets need to be transformed into real paragraphs.
</div>


== Overview of theory driven research ==

[[Image:empirical-research-elements.png]]


* ''Conceptualizations'': Each research question is formulated as one or more hypotheses. Hypotheses are grounded in theory.
* ''Measures'': These are usually quantitative (e.g. experimental data, survey data, organizational or public "statistics", etc.) and make use of artifacts like surveys or experimental materials.
* ''Analyses &amp; conclusions'': Hypotheses are tested with statistical methods.




=== The scientific ideal ===


Experimental research is the ideal paradigm for empirical research in most natural science disciplines. It aims to ''control physical interactions between variables''.


; Experimentation principle in science


# The study object is completely isolated from '''any''' environmental influence and observed (O<sub>1</sub>)
# A stimulus is applied to the object (X<sub>1</sub>)
# The object’s reactions are observed (O<sub>2</sub>).


We may draw a picture for this:

[[Image:science-experiment.png]]


* O<sub>1</sub> = observation of the non-manipulated object’s state
* X = treatment (stimulus, intervention)
* O<sub>2</sub> = observation of the manipulated object’s state


The effect of the treatment (X) is measured by the difference between O<sub>1</sub> and O<sub>2</sub>.
In other words, an experiment can "prove" (corroborate) that an intervention X will have an effect Y. X and Y are theoretical variables that are operationalized in the following way: X becomes the intervention, and Y becomes quantified measures of the effect.
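For illustration, the O<sub>1</sub> → X → O<sub>2</sub> logic can be sketched as a minimal pre/post computation (the scores below are invented for this sketch):

```python
# Minimal sketch of the O1 -> X -> O2 logic; all numbers are invented.
def treatment_effect(o1: float, o2: float) -> float:
    """Effect of treatment X, measured as the difference between the
    post-treatment observation (O2) and the pre-treatment one (O1)."""
    return o2 - o1

o1 = 12.0   # O1: observation of the non-manipulated object's state
o2 = 17.5   # O2: observation after the treatment X was applied
print(treatment_effect(o1, o2))  # 5.5
```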


=== The simple experiment in human sciences ===


In human sciences (as well as in the life sciences) it is not possible to totally isolate a subject from its environment. Therefore we have to make sure that effects of the environment are either controlled or at least equally distributed over the experimental and control groups. Let's now look at a few strategies...


=== Simple experimentation using a control group ===

Principle:


# Two groups of subjects are chosen randomly (R) within a mother population. This ought to eliminate systematic influence of unknown variables on one group, i.e. we postulate that both groups will be under the influence of the same uncontrolled variables.
# The independent variable (X) is manipulated by the researcher, who puts one group under an experimental condition, i.e. applies a treatment.
# Ideally, subjects should not be aware of the research goals, since they might, for example, consciously or unconsciously want to influence the result.

; Analysis of results
We compare the effects of treatment (stimulus) vs. non-treatment across the two groups. The measure "O" is also called a '''post-test''', since we apply it after the treatment.


{| border="1"
! Treatment
! effect (O)
! non-effect (O)
! Total effect<br/>for a group
!
|-
| treatment: (group X)
| bigger
| smaller
| 100 %
| rowspan="2" | We do a<br/>"vertical" comparison
|-
| non-treatment: (group non-X)
| smaller
| bigger
| 100 %
|}


Analysis questions are formulated in this spirit: What is the probability that treatment X leads to effect O? In the table above we can observe an experimentation effect: the proportion showing the effect is bigger in the experimental (treated) group than in the non-experimental group, and the other way round for the non-effect.
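As a sketch, this "vertical" comparison amounts to comparing the proportion of subjects showing the effect in each group (the counts below are invented; a real analysis would add a significance test, e.g. a chi-square test):

```python
# Sketch of the "vertical" comparison between treatment and control groups.
# All counts are invented for illustration.
def effect_proportion(effect_count: int, no_effect_count: int) -> float:
    """Share of a group's subjects that show the effect O."""
    return effect_count / (effect_count + no_effect_count)

p_treatment = effect_proportion(14, 6)    # treatment group (X): 14 of 20 show the effect
p_control   = effect_proportion(7, 13)    # control group (non-X): 7 of 20 show the effect

print(p_treatment, p_control)  # 0.7 0.35
# An experimentation effect shows up as p_treatment clearly above p_control.
```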


The '''simple experiment with different treatments''' is a slightly different design alternative, but similar in spirit.

Example: Students are first assigned randomly to different lab sessions, each using a different pedagogy (X), and we would like to know if there are different effects at the end (O).


'''Problems with simple experimentation'''

This "post-test only" design is not really optimal, for various reasons:

* '''Selection''': Subjects may not be the same in the different groups. Since samples are typically very small (15-20 per group), this may have an effect.
* '''Reactivity of subjects''': Individuals ask themselves questions about the experiment, which leads to compensatory effects, or they may otherwise change between observations.
* Difficulty to '''control certain variables''' in a real context. Example: A new ICT-supported pedagogy may work better because it stimulates the teacher, because students may increase their attention and amount of work, or simply because experimental groups may be smaller than in "real" conditions and individual students therefore get more attention.

In principle one could test such intervening variables with new experimental conditions, but for each new variable one has to add at least two more experimental groups, which is very costly. Let's now look at a more popular design...

=== The simple experiment with pretests ===

The following design attempts to control the difference that may exist between two experimental groups (i.e. we don't trust randomization, or we can't randomly assign subjects to a group; this is typically the case when we select, for example, two classes in a school setting).

Here is the design:


[[Image:simple-experiment-pretest.png]]


; Analysis
To control for the potential difference between groups, we compare the difference between O2 and O1 with the difference between O4 and O3:
effect = (O2 - O1) versus (O4 - O3).

There is also a disadvantage to this design: the first measure can influence the outcome of the experiment. Example: if X is supposed to increase a pedagogical effect, the O1 and O3 tests themselves could have an effect (students learn by doing the test), so you can’t measure the "pure" effect of X.
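The comparison effect = (O2 - O1) versus (O4 - O3) can be sketched in code as a difference of gains (all scores below are invented):

```python
# Sketch of the pretest/post-test comparison: (O2 - O1) versus (O4 - O3).
# All scores are invented for illustration.
o1, o2 = 10.0, 16.0   # experimental group: pretest O1, treatment X, post-test O2
o3, o4 = 10.5, 12.0   # control group: pretest O3, no treatment, post-test O4

gain_experimental = o2 - o1   # 6.0
gain_control      = o4 - o3   # 1.5

# Comparing the two gains controls for pre-existing differences between groups.
effect = gain_experimental - gain_control
print(effect)  # 4.5
```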
 
This '''experimentation effect''' can be controlled by the '''Solomon design''', which is similar in spirit, but this method requires two extra control groups and is more costly.


[[Image:experiment-solomon-design.png]]


The Solomon design combines the simple experiment design with the pretest design.

We can test, for example, whether O2&gt;O1, O2&gt;O4, O5&gt;O6 and O5&gt;O3.

Final note: simply comparing two different situations is NOT an experiment! The treatment variable X must be simple and uni-dimensional (else you don’t know the precise cause of an effect). We shall come back to this problem below when we discuss quasi-experimental research designs.


There exist even more complicated designs to measure interaction effects of 2 or more treatments, but we shall stop here.

=== The non-experiment: what you should not do ===


Let's now look at bad designs, since they can often be found in the discourse of policy makers or in early drafts of research proposals.

; The (non)experiment without control group nor pretest

This bad design looks like this:


[[Image:non-experiment.png]]

We just look at data (O) after some event (X).


'''Example: A bad discourse on ICT competence of pupils''': “Since we introduced ICT in the curriculum, most of the school’s pupils are good at finding things on the Internet.”


There is a lack of real comparison!

* We don’t compare with what happens in other schools that offer no ICT training. Maybe it is a general trend that pupils have become better at finding things on the Internet, since most households now have computers and Internet access.
* We don’t even know what happened before!


A statement like "Most of the students are good!..." means that you don't compare with what happens in other settings that do not include ICT in their curriculum. It is therefore pretty worthless as proof that the introduction of ICT in schools had any effect.


{| border="1"
! The variable to be explained (O)
! x = ICT in school
! x = no ICT in school
!
|-
| bad at web search
| 10 students
| ???
| rowspan="2" | horizontal comparison<br/>of % ???
|-
| good at web search
| 20 students
| ???
|}




Similarly, we cannot compare with the situation before the intervention:

{| border="1"
! The variable to be explained (O)
! before
! after
!
|-
| bad at web search
| ???
| 10 students
| rowspan="2" | horizontal comparison<br/>of % ???
|-
| good at web search
| ???
| 20 students
|}


Now let's look at another bad design ...


'''Experiments without randomization nor pretest'''

In the following design we have the problem that '''there is no control over the conditions and the evolution of the control group'''.

[[Image:bad-control-group-experiment.png]]

An example: Computer animations used in school A are claimed to be the reason for better grade averages (than in school B). But school A simply may attract pupils from different socio-economic conditions, and such pupils usually have better grades. Or, politically speaking: rich schools have the money to introduce computer animations, and they also attract better learners.


Finally, let's look at the experiment without a control group.


[[Image:no-control-group-experiment.png]]

We don’t know if X is the real cause.


Example: “Since I bought my daughter a lot of video games, she is much better at word processing.” You don’t know if this evolution is "natural" (kids always get better at word processing after using it a few times) or if she learnt it somewhere else. This is "natural evolution" or "statistical regression" of the population.


=== Example of an experimental design ===


The following example is drawn from TECFA's [http://tecfa.unige.ch/maltt MSc MALTT] (Master of Science in Learning and Teaching Technologies):
* Rebetez, C. (2004). Sous quelles conditions l’animation améliore-t-elle l’apprentissage ? [Under which conditions does animation improve learning?] Master Thesis (145 p.), MSc MALTT, TECFA, University of Geneva. [http://tecfa.unige.ch/perso/staf/rebetez/papers/memoire_staf.pdf pdf (2.6 MB)]

The thesis we are presenting was written in French, but the author also wrote a Master thesis in cognitive psychology in English:
* Rebetez, C. (2006). Control and collaboration in multimedia learning: is there a split-interaction? Master Thesis, School of Psychology and Education, University of Geneva. [http://tecfa.unige.ch/perso/staf/rebetez/blog/wp-content/files/ThesisProject_Rebetez.pdf pdf (168 KB)]


''Notice'': This thesis was funded by a real research project, i.e. the student did more than is usually expected for an MA thesis.

The ''big research question'': The research aims to show the influence of the ''continuity of the presentation flow'', of ''collaboration'', and of the ''permanence of previous states'', and to verify the impact of ''individual variables'' such as visual span and mental rotation capacities.

Original: {{quotation|Notre recherche a pour objectif de mettre en évidence l'influence, de la ''continuité du flux'', de la ''collaboration'', de la ''permanence des états antérieurs'', ainsi que de vérifier la portée de ''variables individuelles'' telles que l’empan visuel et les capacités de rotation mentale. (p. 33)}}

The general objective is then further developed over one and a half pages in the thesis. Causalities are discussed in verbal form (pp. 34-40) and then "general" hypotheses are presented over two pages.


Explanatory (independent) variables, i.e. conditions:


# '''Animation''', ''static vs. dynamic condition'': allows students to visualize the transition between states. A static presentation forces a student to imagine the movement of elements.
# '''Permanence''', ''present or absent condition'': if older states of the animation are shown, students have better recall and can therefore more easily build up their model.
# '''Collaboration''', ''present or absent condition'': working together should allow students to create more sophisticated representations.

Operational hypotheses (presented in the methodology chapter):


* Animation

Method (short summary!)


Population: 160 students. All have been tested to check whether they were novices (i.e. lacked the domain knowledge used in the material).

Material:
* The pedagogical material consists of 2 different multimedia contents (geology and astronomy), each one in 2 versions. For the dynamic condition there are 12 animations, for the static condition 12 static pictures.
* Contents of the pedagogical material: "Transit of Venus" made with VRML, "Ocean and mountain building" made with Flash.
* These media were integrated in Authorware (to take measures and to ensure a consistent interface).

Procedure (roughly, step by step):
* Pretest (5 questions)
* Introduction (briefing)
* For the solo condition: paper-folding and Corsi visuo-spatial tests
* Test with the material
* Cognitive load test (NASA-TLX)
* Post-test (17 questions)

Measured dependent variables:
* Number of correct answers in the retention questionnaire.
* Number of correct answers in the inference questionnaire.
* Level of certainty of the answers in both questionnaires.
* Scores on five perceived cognitive load scales (taken from the NASA-TLX).
* Paper-folding test score.
* Visual span score in the Corsi test.
* Time (seconds) and number of uses of the vignettes in the permanence condition.
* Reflection time between presentations (seconds).


== Quasi-experimental designs ==


It is difficult to carry out experiments in real settings, e.g. schools. However, there exist so-called '''quasi-experimental''' designs, which are inspired by experimental design principles (pre- and post-tests, and control groups).


; Advantages
* Can be used in non-experimental situations, i.e. in "real" contexts
* Can be used when the treatment would become too "heavy", i.e. involves more than 2-3 well-defined treatment variables


; Disadvantages
In quasi-experimental situations, you really lack control:
* You don’t know all possible stimuli (causes not due to experimental conditions)
* You can’t randomize (distribute other intervening unknown stimuli evenly over the groups)
* You may lack enough subjects


Nevertheless, quasi-experimental research can help to test all sorts of threats from variables that you cannot control. These are called '''threats to internal validity''' (see below).


; Usage examples in the social sciences
* Evaluation research
* Organizational innovation studies
* Questionnaire design for survey research (think about control variables to test alternative hypotheses)


There exist various designs. Some are easier to conduct, but they lead to less solid (valid) results. Let's examine a few...


=== Interrupted time series design ===

[[Image:interrupted-time-series-design.png]]


; Advantages
You can control (natural) trends somewhat. I.e. when you observe or introduce a treatment, e.g. a pedagogical reform, you may not really be sure whether the reform features themselves had any effect or whether it was something else, like a general trend in the abilities of the student population.

; Problems
You can't control external simultaneous events (X<sub>2</sub> events that happen at the same time as X<sub>1</sub>).
Example concerning the effect of ICT-based pedagogies in the classroom: these pedagogies may have been introduced together with other pedagogical innovations. So which one has an effect on overall performance?

; Practical difficulties
* Sometimes it is not possible to obtain data for past years
* Sometimes you don't have the time to wait long enough (your research ends too early, and decision makers never want to wait for long-term results). Example: ICT-based pedagogies often claim to improve meta-cognitive skills. Do you have tests for year -1, year -2, year -3? Can you wait for year +3? Can you wait even longer, i.e. test the same population when they reach university or jobs where meta-cognitive skills matter more?


'''Examples of time series'''

Now let's have an informal look at '''time series''', i.e. measures that evolve over time and that can corroborate or invalidate hypotheses about an intervention '''X'''.


[[Image:time-series-examples.png]]


O1, O2, etc. are observation data (e.g. yearly); X is the treatment (intervention).


; A. A statistical effect is likely
: Example: "Student drop-out rates are lower since we added forums to the e-learning content server."
: But attention: you don't know if there was ''another intervention'' at the same time.


; B. Likely "straw fire" effect
: Teaching has improved after we introduced X, but then things ''went back to normal''.
: So there is an effect, but after a while the cause "wears out". E.g. the typical motivation boost from ICT introduction in the curriculum may not last.


; E. Delay effect
: Example: high investments in education (may take decades to take effect)


; F. Trend acceleration effect
: Difficult to discriminate from G, i.e. there is some change in the curve, but it may just be a variant of an exponential natural evolution.

; G. Natural exponential evolution
: Same as (C).
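One informal way to probe patterns like these is to fit the pre-intervention trend and compare its extrapolation with what is actually observed after X. Below is a rough sketch with invented yearly data (a real analysis would use a proper segmented-regression model):

```python
# Rough sketch of an interrupted time-series check (invented yearly data):
# fit the pre-intervention trend by least squares, extrapolate it past the
# intervention X, and compare with the observed post-intervention values.

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b

# O1..O4 observed before the intervention X, O5..O8 after it.
pre  = [(1, 50.0), (2, 51.0), (3, 52.0), (4, 53.0)]
post = [(5, 60.0), (6, 61.0), (7, 62.0), (8, 63.0)]

a, b = fit_line([t for t, _ in pre], [y for _, y in pre])
# Average gap between observed post values and the extrapolated pre-trend:
gap = sum(y - (a + b * t) for t, y in post) / len(post)
print(round(gap, 1))  # 6.0
# A large gap suggests an effect (pattern A); a gap near 0 suggests that the
# post-intervention values merely continue the natural trend (pattern C).
```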


=== Threats to internal validity ===


The big question you should ask yourself over and over: What other variables could influence our experiments? Campbell and Stanley (1963) created an initial typology of threats to watch out for:


{| border="1"
! Type of threat
! Definition and example
|-
| history
| Another event than X happens between the measures.<br/>''Example: ICT introduction happened at the same time as the introduction of project-based teaching.''
|-
| maturation
| The object changed "naturally" between the measures.<br/>''Example: Did this course change your awareness of methodology, or was it simply the fact that you started working on your master thesis?''
|-
| testing
| The measure itself had an effect on the object.<br/>''Example: Your pre-intervention interviews had an effect on people (e.g. teachers changed behavior before you invited them to training sessions).''
|-
| instrumentation
| The method used to measure has changed.<br/>''Example: Reading skills are defined differently, e.g. newer tests favor text understanding.''
|-
| statistical regression
| Differences would have evened out naturally.<br/>''Example: A school introduces new disciplinary measures after kids beat up a teacher. Maybe next year such events wouldn’t have happened without any intervention.''
|-
| (auto) selection
| Subjects self-select for the treatment.<br/>''Example: You introduce ICT-based new pedagogies and results are really good (maybe only good teachers participated in these experiments).''
|-
| mortality
| Subjects are not the same.<br/>''Example: A school introduces special measures to motivate "difficult kids". After 2-3 years drop-out rates improve. Maybe the school is situated in an area that shows rapid socio-demographic change (different people).''
|-
| interaction with selection
| Combinatory effects.<br/>''Example: the control group shows a different maturation.''
|-
| directional ambiguity
| Is it the treatment or the subjects?<br/>''Example: Do workers show better output in "flat-hierarchy" / participatory / ICT-supported organizations, or do such organizations attract more active and efficient people?''
|-
| diffusion or treatment imitation
| The treatment spreads to, or is imitated by, the control group.<br/>''Example: An academic unit promotes modern blended learning and attracts good students from a wide geographic area. A control unit may also profit from this effect.''
|-
| compensatory equalization
| The control group observes the experimental group.<br/>''Example: Subjects who don’t receive the treatment react by changing their behavior.''
|}
Let's now have a look at some designs that attempt to control such threats.


=== Non-equivalent control group design ===
[[Image:non-equivalent-control-group-design.png]]


[[Image:icon-thumb-up.png]] Advantages: good at detecting other causes


* If O''2'' - O''1'' is similar to O''4'' - O''3'', we can reject the hypothesis that O''2'' - O''1'' is due to X.


[[Image:icon-thumb-down.png]] Disadvantages and possible problems:
* Bad control of natural tendencies
* Finding (somewhat) equivalent groups is not easy
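The O''2'' - O''1'' versus O''4'' - O''3'' comparison can be sketched in a few lines of Python. All numbers below are invented for illustration (a hypothetical test score measured before and after the intervention in both groups), not data from a real study:

```python
# Difference-in-differences sketch for a non-equivalent control group design.
# O1/O2 are pre/post measures of the treated group, O3/O4 pre/post measures
# of the (non-equivalent) control group. Data are invented for illustration.

def mean(xs):
    return sum(xs) / len(xs)

treated_pre  = [52, 55, 49, 60]   # O1
treated_post = [63, 66, 58, 70]   # O2
control_pre  = [50, 54, 51, 57]   # O3
control_post = [55, 58, 56, 61]   # O4

treated_change = mean(treated_post) - mean(treated_pre)   # O2 - O1
control_change = mean(control_post) - mean(control_pre)   # O4 - O3

# If both groups changed by a similar amount, the change in the treated group
# cannot be attributed to the treatment X.
effect_estimate = treated_change - control_change
print(treated_change, control_change, effect_estimate)
```

If `effect_estimate` is close to zero, the observed improvement is probably a shared trend (history, maturation) rather than an effect of X.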
Line 501: Line 456:
|}
|}


Review Question: Why does student satisfaction improve at the same time for B?


=== Comparative time series ===


One of the most powerful quasi-experimental research designs uses comparative time series, i.e. you combine the [[#Interrupted_time_series_design|interrupted time series design]] with the [[#Non-equivalent_control_group_design|non-equivalent control group design]] presented above.


[[Image:comparative-time-series.png]]
# Watch out for simultaneous interventions at point X.
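The visual reasoning behind an interrupted time series can also be done numerically. The sketch below (with invented observation data) fits a linear trend to the pre-intervention observations and measures how far the post-intervention observations lie above the extrapolated trend; doing the same for a control series would show whether the jump is specific to the treated group:

```python
# Interrupted time series sketch: fit a linear trend to the pre-intervention
# observations, extrapolate it past the intervention point X, and compare the
# actual post-intervention observations with the projection. Data invented.

def linear_fit(ts, ys):
    # ordinary least squares for y = a + b*t
    n = len(ts)
    mt = sum(ts) / n
    my = sum(ys) / n
    b = sum((t - mt) * (y - my) for t, y in zip(ts, ys)) / sum((t - mt) ** 2 for t in ts)
    a = my - b * mt
    return a, b

pre_t,  pre_y  = [1, 2, 3, 4, 5], [10.0, 12.1, 13.9, 16.0, 18.1]   # before X
post_t, post_y = [6, 7, 8], [24.0, 26.2, 28.1]                     # after X

a, b = linear_fit(pre_t, pre_y)
# Mean jump of the post series above the extrapolated pre-intervention trend:
jump = sum(y - (a + b * t) for t, y in zip(post_t, post_y)) / len(post_t)
print(round(b, 2), round(jump, 2))
```

A clearly positive `jump` in the treated series, combined with a near-zero jump in a control series, supports an effect of X (and helps rule out a "straw fire" or natural-trend pattern).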


=== Validity in quasi-experimental design ===
 
Let's now generalize our discussion a little and revisit the causality issue we already addressed in the [[Methodology tutorial - empirical research principles]].
 
There exist four kinds of validity according to Stanley et al.:
 
[[Image:icon-finger-1.png]] ''Internal validity'' concerns ''your research design''
:* You have to show that postulated causes are "real" (as discussed before) and that alternative explanations are wrong.
:* This is the most important validity type.
 
[[Image:icon-finger-2.png]] ''External validity'' .... can you make generalizations ?
:* This is not easy, because you may not be aware of "helpful" variables, e.g. the "good teacher" you worked with, or the fact that things were much easier in your private school ....
:* How can you provide evidence that your successful ICT experiment will be successful in other similar situations, or in situations not that similar ?
 
[[Image:icon-finger-3.png]] ''Statistical validity'' .... are your statistical relations significant ?
:* This is not too difficult for simple analysis
:* Just make sure that you use the right statistics and believe them (see [[Methodology tutorial - quantitative data analysis]])


[[Image:icon-finger-4.png]] ''Construct validity'' .... are your operationalizations sound ?
:* Did you get your dimensions right ?
:* Do your indicators really measure what you want to know ?


'''Important''': This typology is also useful for other settings, e.g. structured qualitative analysis or statistical designs. In most other empirical research designs you '''must''' address these issues.


=== Quasi-experimental thesis example ===
 
Notari, Michele (2003). Scripting Strategies In Computer Supported Collaborative Learning Environments, Master Thesis, [http://tecfa.unige.ch/maltt MSc MALTT] (Master of Science in Learning and Teaching Technologies), TECFA, University of Geneva. [http://tecfa.unige.ch/perso/staf/notari/thesispage.html HTML/PDF]
* This master thesis concerns the design and effects of ICT-supported activity-based pedagogics in a normal classroom setting
* Target: Biology at high-school level (various subjects)


Three research questions were formulated as 'working hypotheses':


* The use of a ''Swiki'' as a collaborative editing tool ''causes no technical'' or comprehension problems (after a short introduction) for high school students without experience in collaborative editing, but with some knowledge of common text-editing software and of searching for information on the Web.
* ''[Pedagogical] scripting which induces students to compare and comment on the work of the whole learning community'' (using a collaborative editing tool) ''leads to better learning performance'' (as assessed by pre- and post-testing) ''than a script leading students to work without such a tool and with little advice and/or opportunity to make comments and compare'' their work with the learning community.
* The ''quality'' of the product of the working groups is ''better'' (longer and more detailed) when students are induced to compare and comment on their work (with a collaborative editing tool) during the learning unit.


; Method
(Summary; quotations from the thesis)


* The whole research took place in a normal curricular class environment. The classes were not aware of a special learning situation or of a deeper evaluation of the output they produced.
* Of course the students asked about the purposes of the tests. We tried to motivate them to perform as well as they could without telling them the real reason for the tests.


; Notes


* This master thesis comprises several quasi-experiments, all in real-world settings.
* Below we just reproduce the settings for one of these.
* Several explanatory variables intervene in the example below (the procedure as a whole was evaluated, not isolated variables as defined by experimentalism).




[[Image:notari-wiki-scripting.png]]
Let's now look into so-called statistical designs, an approach that is typically used in "survey research".


== Statistical designs ==


Statistical designs are also conceptually related to experimental designs:
 
; Statistical designs formulate laws
* There is no interest in individual cases (unless something goes wrong)
* You can test quite a lot of laws (hypotheses) with statistical data (your computer will do the calculations)


; Designs are based on prior theoretical reasoning, because:


* measures are not all that reliable,
* you can not get an "inductive picture" by asking a few dozen closed questions.


The dominant research design is conducted "à la Popper":


# You start by formulating hypotheses (models that contain measurable variables and relations)
# You measure the variables (e.g. with a questionnaire and/or a test)
# You then test relations with statistical tools


The most popular variant in educational technology is so-called "survey research".
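As a toy illustration of the testing step, the sketch below checks a hypothesized relation between two measured variables with a permutation test on Pearson's correlation. The variable names and the data are entirely hypothetical:

```python
# Sketch of the hypothesis-testing step: is there a (linear) relation between
# two measured variables? Here a simple permutation test on the Pearson
# correlation coefficient; data are invented for illustration.
import random

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

hours_online = [2, 5, 1, 7, 4, 6, 3, 8]          # hypothetical measure 1
test_score   = [55, 70, 50, 80, 66, 74, 60, 85]  # hypothetical measure 2

r_obs = pearson(hours_online, test_score)

# Null hypothesis: no relation. Shuffle one variable many times and count how
# often a correlation at least as strong appears by pure chance.
random.seed(0)
shuffled = test_score[:]
count = 0
trials = 2000
for _ in range(trials):
    random.shuffle(shuffled)
    if abs(pearson(hours_online, shuffled)) >= abs(r_obs):
        count += 1
p_value = count / trials
print(round(r_obs, 3), p_value)
```

A small `p_value` means the observed relation is unlikely under the null hypothesis; it says nothing, of course, about the direction of causality.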


=== Introduction to survey research ===


; A typical research plan looks like this:


# Literature review leading to general research questions and/or analysis frameworks
# You may use qualitative methodology to investigate new areas of study
# Definition of hypotheses
# Operationalization of hypotheses, e.g. definition of scales and related questionnaire items
# Definition of the mother population
# Sampling strategies
# Identification of analysis methods


; Implementation (mise en oeuvre)


# Questionnaire building (preferably with input from published scales)
# Analysis


; Writing it up
* Compare results to theory
* Marry good practice of results presentation and discussion, but also make it readable


=== Levels of reasoning within a statistical approach ===


(Just for your information. If it looks too complicated, ignore it.)


=== Typology of internal validity errors ===


[[Image:icon-finger-1-3cm.png|left]] Error of type 1: you believe that a statistical relation is meaningful ... but "in reality" it doesn't exist.
:* In more complicated words: you wrongly reject the null hypothesis (no link between variables)
<br clear="all"/>
[[Image:icon-finger-2-3cm.png|left]] Error of type 2: you believe that a relation does not exist ... but "in reality" it does.
:* E.g. you compute a correlation coefficient and results show that it is very weak. Maybe the relation was non-linear, or another variable causes an interaction effect ...
:* In more complicated words: you wrongly accept the null hypothesis
<br clear="all"/>
[[Image:icon-light-bulb.png|left]] There exist useful statistical methods to diminish these risks
:* See statistical data analysis techniques
:* Think !
<br clear="all"/>
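A small Monte-Carlo simulation makes the type 1 error tangible: two completely unrelated variables will pass a 5% significance threshold in roughly 5% of samples. The 0.632 threshold used below is the standard two-tailed 5% critical value of Pearson's r for samples of size 10 (from published tables); everything else is simulated data:

```python
# Monte-Carlo illustration of the type 1 error: even when two variables are
# completely unrelated, a correlation test looks "significant" about 5% of the
# time if you use the conventional 5% threshold.
import random

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

random.seed(42)
trials, n, r_crit = 4000, 10, 0.632   # 0.632: 5% critical r for n = 10
false_alarms = 0
for _ in range(trials):
    xs = [random.gauss(0, 1) for _ in range(n)]
    ys = [random.gauss(0, 1) for _ in range(n)]  # independent of xs
    if abs(pearson(xs, ys)) > r_crit:
        false_alarms += 1
print(false_alarms / trials)   # close to 0.05
```

This is also why testing "quite a lot of laws" on the same data set inflates the number of spurious findings: each additional test is another 5% chance of a false alarm.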
 
 


=== Survey research examples ===


* See the quantitative data gathering and quantitative analysis modules for some examples


=== Pilot study on the implementation and perceptions of ICT ===


* (Luis Gonzalez, DESS thesis 2004): Main goal: "study factors that favor teachers' use of ICT". The author defines 8 factors and also postulates a few relationships among them.
 
[[Image:teacher-ICT-use-model-gonzalez.png]]
Below we quote from the thesis (and not the research plan):


{{quotationbox| My principal hypothesis postulates the existence of a correlation between the following factors and teachers' use of ICT:
* The ''type of support'' offered by the institution
* Teachers' ''pedagogical competences''
* Teachers' ''technical competences''
* The ''ICT training'' received by teachers
* Teachers' feeling of [[self-efficacy]]
* Teachers' ''perception of technology''
* Teachers' ''perception of ICT's pedagogical usefulness''
* Teachers' ''digitalization'' (rationalization) practices with ICT
}}

{{quotationbox| Secondary hypotheses are:
* Teachers' ''perception of ICT's pedagogical usefulness'' is correlated with their ''pedagogical competences''
* Teachers' ''perception of technology'' is correlated with their ''perception of ICT's pedagogical usefulness''
* Teachers' ''digitalization'' (rationalization) practices with ICT are correlated with their ''perception of technology''
* The ''ICT training'' received by teachers is correlated with their ''pedagogical competences'' and ''technical competences''
* Teachers' ''feeling of self-efficacy'' is correlated with their ''pedagogical competences'' and ''technical competences''
* Teachers' ''digitalization'' (rationalization) practices with ICT are correlated with their ''feeling of self-efficacy''
}}

; Sampling method


* Representative sample of future primary teachers (students), N = 48
* Non-representative sample of primary teachers, N = 38
** All teachers with an email address in Geneva were contacted, i.e. auto-selection (!)
** Note: the questionnaire was very long; some teachers who started it dropped out after a while
* This sort of sampling is ok for a pilot study or a small master thesis.
 
; Questionnaire design
Definition of each "conceptual domain" (see above, i.e. the main factors/variables identified from the literature).

Item sets (questions) and scales were adapted from the literature where possible, e.g.
* L’échelle d’auto-efficacité (Dussault, Villeneuve &amp; Deaudelin, 2001)
* Enquête internationale sur les attitudes, représentations et pratiques des étudiantes et étudiants en formation à la profession enseignante au regard du matériel pédagogique ou didactique, informatisé ou non (Larose, Peraya, Karsenti, Lenoir &amp; Breton, 2000)
* Guide et instruments pour évaluer la situation d’une école en matière d’intégration des TIC (Basque, Chomienne &amp; Rocheleau, 1998).
* Les usages des TIC dans les IUFM : état des lieux et pratiques pédagogiques (IUFM, 2003).


; Data collection
* Data was collected with an on-line questionnaire tool (using the ESP program)


; Purification of the instrument
* For each item set, a factor analysis was performed and indicators were constructed according to the auto-correlation of items (typically the first 2-3 factors were used). Note: if you use fully tested published scales, you don't need to do this!
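Alongside factor analysis, the internal consistency of an item set is commonly checked with Cronbach's alpha. The sketch below uses invented questionnaire answers (not data from this study), on the same 1-4 agreement scale:

```python
# Reliability sketch for one item set (scale): Cronbach's alpha, a common
# companion check to factor analysis. Rows are respondents, columns are the
# items of one scale; answers on a 1-4 agreement scale. Data invented.

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)  # sample variance

answers = [
    [3, 4, 3, 4],
    [2, 2, 3, 2],
    [4, 4, 4, 3],
    [1, 2, 1, 2],
    [3, 3, 4, 4],
    [2, 1, 2, 1],
]

k = len(answers[0])                                  # number of items
item_vars = [variance([row[i] for row in answers]) for i in range(k)]
total_var = variance([sum(row) for row in answers])  # variance of scale totals
# alpha = k/(k-1) * (1 - sum of item variances / variance of the total score)
alpha = k / (k - 1) * (1 - sum(item_vars) / total_var)
print(round(alpha, 2))   # prints 0.92
```

Values above roughly 0.7-0.8 are conventionally read as acceptable reliability; published, fully tested scales usually report this figure already.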


; Example - "perception of pedagogical ICT use"
In the questionnaire this concept is measured by two ''question sets'' (scales).


{{quotationbox| The perception of the pedagogical use of ICT comprises two series of questions, addressing respectively the teachers' degree of agreement with the governmental and scientific discourse on the use of computerized educational resources in education (question 34, 10 items), and the degree of importance attributed to various computerized resources (question 43, 12 items).}}


Here we show one of these two question sets:


Question 34. PUP1 (translated from French): The following statements reflect opinions that are "very present" in governmental as well as "scientific" discourse on the use of computerized educational resources in education. Indicate your degree of agreement with each of them.

(Strongly disagree = 1, Rather disagree = 2, Rather agree = 3, Strongly agree = 4)


[[Image:questionnaire-example-gonzalez.png]]
 
Note: these 10 items and the 12 items from question 43 were later reduced to 3 indicators:

* Var_PUP1 - Degree of importance of mutual-help and collaboration tools for pupils
* Var_PUP2 - Degree of importance of communication tools between pupils
* Var_PUP3 - Agreement on what favors constructivist-type learning


We shall use this example again in the [[Methodology tutorial - quantitative data analysis]].


== Similar comparative systems design ==


This design is popular in comparative public policy analysis. It can be used to compare the educational systems of a few districts, states or countries.

'''Principle''':

[[Image:icon-finger-1.png]] Make sure to have good variance within the “operative variables” (dependent + independent)

[[Image:icon-finger-2.png]] Make sure that no other variable shows variance (i.e. that there are no hidden control variables that may produce effects)

[[Image:similar-systems-design.png]]

In simpler words: select cases that are different with respect to the variables that are of interest to your research, but otherwise similar in all other respects.

E.g. don’t compare a prestige school that does ICT with a normal school that doesn’t do ICT if you want to measure the effect of ICT. Either stick to prestige schools or to "normal" schools; otherwise you can’t tell whether it was ICT that made the difference ...

Advantages and disadvantages of this method:

[[Image:icon-thumb-up.png]] Fewer reliability and construct validity problems

[[Image:icon-thumb-up.png]] Better control of "unknown" variables

[[Image:icon-thumb-down.png]] Worse external validity (impossibility to generalize)

[[Image:icon-thumb-down.png]] Weak or no statistical testing. Most often researchers just compare data but can not provide statistically significant results, since cases are too few.


== Summary of theory-driven designs discussed ==
In this tutorial we presented some important theory-driven research designs, which we summarize in the table below with a few typical use cases. There exist other theory-driven designs, e.g. simulations.


{| border="1"
! rowspan="1" colspan="1" |approach
! rowspan="1" colspan="1" |some usages
|-
| rowspan="1" colspan="1" |[[#Experimental designs|See Experimental designs]]
| rowspan="1" colspan="1" |
* Psycho-pedagogical investigations
* User-interface design
|-
| rowspan="1" colspan="1" |[[#Quasi-experimental_designs|See Quasi-experimental designs]]
| rowspan="1" colspan="1" |
* Instructional designs (as a whole)
* Organizational reform
|-
| rowspan="1" colspan="1" |[[#Statistical_designs|See Statistical designs]]
| rowspan="1" colspan="1" |
* Teaching practice
* Usage patterns
|-
| rowspan="1" colspan="1" |[[#Similar comparative systems design|See Similar comparative systems design]]
| rowspan="1" colspan="1" |
* Public policy analysis
|}

Of course, you can combine these approaches within a research project. You also may use different designs to look at the same question in order to triangulate answers.
== Bibliography ==
* Campbell, D. T., & Stanley, J. C. (1963). "Experimental and Quasi-Experimental Designs for Research on Teaching". In N. L. Gage (Ed.), Handbook of Research on Teaching. Boston: Houghton Mifflin. [http://moodle.technion.ac.il/pluginfile.php/367640/mod_resource/content/1/Donald_T._%28Donald_T._Campbell%29_Campbell,_Julian_Stanley-Experimental_and_Quasi-Experimental_Designs_for_Research-Wadsworth_Publishing%281963%29%20%281%29.pdf PDF]
* Campbell, D. T., & Stanley, J. C. (1966). Experimental and quasi-experimental designs for research. Chicago: Rand McNally. (The revised original; it remains the reference for quasi-experimental research.)
* Cook, T. D., & Campbell, D. T. (1979). Quasi-experimental design. Chicago: Rand McNally.
* Dawson, T. E. (1997, January 23-25). A primer on experimental and quasi-experimental design. Paper presented at the Annual Meeting of the Southwest Educational Research Association, Austin, TX.
* Shadish, W. R., Cook, T. D., & Campbell, D. T. (2002). Experimental and quasi-experimental designs for generalized causal inference. Boston, MA: Houghton Mifflin.
[[Category: research methodologies]]
[[Category:Research methodology tutorials]]

Latest revision as of 18:28, 22 August 2016


Overview of theory driven research

The most important elements of an empirical theory-driven design:

Empirical-research-elements.png

  • Conceptualizations: each research question is formulated as one or more hypotheses. Hypotheses are grounded in theory.
  • Measures: these are usually quantitative (e.g. experimental data, survey data, organizational or public "statistics", etc.) and make use of artifacts like surveys or experimental materials.
  • Analyses & conclusions: hypotheses are tested with statistical methods.

Experimental designs

The scientific ideal

Experimental research is the ideal paradigm for empirical research in most natural science disciplines. It aims to control physical interactions between variables.

Experimentation principle in science
  1. The study object is completely isolated from any environmental influence and observed (O1)
  2. A stimulus is applied to the object (X1)
  3. The object’s reactions are observed (O2).

We may draw a picture for this: Science-experiment.png

  • O1 = observation of the non-manipulated object’s state
  • X = treatment (stimulus, intervention)
  • O2 = observation of the manipulated object’s state.

The effect of the treatment (X) is measured by the difference between O1 and O2.

In other words, an experiment can "prove" (corroborate) that an intervention X will have an effect Y. X and Y are theoretical variables that are operationalized in the following way: X becomes the intervention and Y becomes quantified measures of the effect.

The simple experiment in human sciences

In human sciences (as well as in the life sciences) it is not possible to totally isolate a subject from its environment. Therefore we have to make sure that effects of the environment are either controlled or at least equally distributed over the groups. Let's now look at a few strategies...

Simple experimentation using a control group:

A simple control group design looks like this:

Simple-control-group.png

Principle:

  1. Two groups of subjects are chosen randomly (R) within a mother population. This ought to eliminate systematic influence of unknown variables on one group, i.e. we postulate that both groups will be under the influence of the same uncontrolled variables.
  2. The independent variable (X) is manipulated by the researcher, who will put one group under an experimental condition, i.e. apply a treatment.
  3. Ideally, subjects should not be aware of the research goals, since they might consciously or unconsciously want to influence the result.
Analysis of results

We compare effects of treatment (stimulus) vs. non-treatment across the two groups. The measure "O" is also called a post-test since we apply it after the treatment.

                                 effect (O)   non-effect (O)   total effect for a group
  treatment (group X):           bigger       smaller          100 %
  non-treatment (group non-X):   smaller      bigger           100 %

We do a "vertical" comparison, i.e. we compare each column across the treatment and non-treatment rows.

Analysis questions are formulated in this spirit: what is the probability that treatment X leads to effect O? In the table above we can observe an experimentation effect: the effect in the experimental (treated) group is bigger than in the control group, and vice versa.
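As an illustration, this vertical comparison can be sketched numerically. The scores below are invented for illustration only; a real analysis would add a significance test (e.g. a t-test):

```python
import statistics

# Hypothetical post-test scores (O) for both groups; the numbers
# are invented purely for illustration.
treatment_group = [14, 16, 15, 17, 18, 13, 16, 15]   # received X
control_group = [12, 11, 13, 12, 14, 10, 12, 13]     # no treatment

# The estimated effect of X is the difference between group means
# (the "vertical" comparison described above).
effect = statistics.mean(treatment_group) - statistics.mean(control_group)
print(f"estimated effect of X: {effect:.2f}")   # → estimated effect of X: 3.38
```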

The Simple experiment with different treatments is a slightly different design alternative, but similar in spirit.

Simple-multiple-control-groups.png

Example: students are first assigned randomly to different lab sessions, each using a different pedagogy (X), and we would like to know whether there are different effects at the end (O).

Problems with simple experimentation

This "post-test only" design is not optimal, for various reasons:

  • Selection: subjects may not be the same in the different groups. Since samples are typically very small (15-20 per group), this may have an effect.
  • Reactivity of subjects: individuals ask themselves questions about the experiment, which leads to compensatory effects, or they may otherwise change between observations.
  • Difficulty to control certain variables in a real context. Example: a new ICT-supported pedagogy may work better because it stimulates the teacher, because students increase their attention and amount of work, or simply because experimental groups may be smaller than in "real conditions" and individual students therefore get more attention.

In principle one could test such intervening variables with new experimental conditions, but for each new variable, one has to add at least 2 more experimental groups, something that is very costly. Let's now look at a more popular design...

The simple experiment with pretests

The following design attempts to control for differences that may exist between the 2 experimental groups (i.e. we don't trust randomization, or we can't randomly assign subjects to a group; this is typically the case when we select, for example, two classes in a school setting).

Here is the design:

Simple-experiment-pretest.png

Analysis

To control for a potential difference between the groups, we compare the difference between O2 and O1 with the difference between O4 and O3:

effect = (O2 - O1) versus (O4 - O3)

There are also disadvantages to this design; in particular, the first measure can influence the outcome of the experiment. Example: if X is supposed to increase a pedagogical effect, the O1 and O3 tests themselves could have an effect (students learn by doing the test), so you can't measure the "pure" effect of X.
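The gain-score comparison can be sketched as follows. All scores are invented for illustration; a real analysis would also test whether the difference between the two gains is statistically significant:

```python
import statistics

# Invented pretest (O1, O3) and post-test (O2, O4) scores, for illustration.
o1 = [10, 12, 11, 13]   # experimental group, pretest
o2 = [16, 17, 15, 18]   # experimental group, post-test
o3 = [11, 12, 10, 13]   # control group, pretest
o4 = [12, 13, 11, 14]   # control group, post-test

gain_experimental = statistics.mean(o2) - statistics.mean(o1)   # O2 - O1
gain_control = statistics.mean(o4) - statistics.mean(o3)        # O4 - O3

# The pretest-corrected effect compares the two gains.
effect = gain_experimental - gain_control
print(effect)   # → 4.0
```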

This experimentation effect can be controlled by the Solomon design, which is similar in spirit, but this method requires two extra control groups and is more costly.

Experiment-solomon-design.png

The Solomon design combines the simple experiment design with the pretest design:

We can test for example if O2>O1, O2>O4, O5>O6 and O5>O3

Final Note: Simply comparing 2 different situations is NOT an experiment ! The treatment variable X must be simple and uni-dimensional (else you don’t know the precise cause of an effect). We shall come back to this problem below when we discuss quasi-experimental research designs.

There exist even more complicated designs to measure interaction effects of 2 or more treatments, but we shall stop here.

The non-experiment: what you should not do

Let's now look at bad designs, since they can often be found in the discourse of policy makers or in early drafts of research proposals.


The (non)experiment without control group or pretest

This bad design looks like this:

Non-experiment.png

We just look at data (O) after some event (X).

Example of a bad discourse on pupils' ICT competence: "Since we introduced ICT in the curriculum, most of the school's pupils are good at finding things on the Internet."

There is a lack of real comparison!

  • We don't compare with what happens in other schools that offer no ICT training. Maybe it is a general trend that pupils have become better at finding things on the Internet, since most households now have computers and Internet access.
  • We don't even know what happened before!

A statement like "Most of the students are good! ..." means that you don't compare with what happens in other settings that do not include ICT in their curriculum. It is therefore pretty worthless as proof that the introduction of ICT in schools had any effect.

  The variable to be explained (O)   X = ICT in school   X = no ICT in school
  bad at web search                  10 students         ???
  good at web search                 20 students         ???

Without the right-hand column, the horizontal comparison of percentages is impossible.

"Things have changed ..." means that you are not aware of the situation before the change.

  The variable to be explained (O)   before   after
  bad at web search                  ???      10 students
  good at web search                 ???      20 students

Again, the horizontal comparison of percentages is impossible.

Now let's look at another bad design ...

Experiments without randomization or pretest

In the following design, we have the problem that there is no control over the conditions and the evolution of the control group.

Bad-control-group-experiment.png


An example: computer animations used in school A are claimed to be the reason for better grade averages (than in school B). But school A may simply attract pupils from different socio-economic backgrounds, and these pupils usually have better grades. Or, politically speaking: rich schools have the money to introduce computer animations, and they also attract better learners.

Finally, let's look at the experiment without control group

The experiment without control group

No-control-group-experiment.png

We don't know if X is the real cause.

Example: "Since I bought my daughter a lot of video games, she is much better at word processing." You don't know whether this evolution is "natural" (kids always get better at word processing after using it a few times) or whether she learnt it somewhere else. This is "natural evolution" or "statistical regression" of the population.

Experimental designs example

  • Rebetez, C. (2004). Sous quelles conditions l’animation améliore-t-elle l’apprentissage ? Master Thesis (145p.). Master Thesis, MSc MALTT (Master of Science in Learning and Teaching Technologies), TECFA, University of Geneva. version pdf (2.6Mo).

The thesis we present here was written in French, but the same author also wrote a Master thesis in cognitive psychology in English:

  • Rebetez, C. (2006). Control and collaboration in multimedia learning: is there a split-interaction? Master Thesis, School of Psychology and Education, University of Geneva. pdf (168ko)

Notice: this thesis was funded by a real research project, i.e. the student did more than is usually expected for an MA thesis.

The big research question: the objective of our research is to show the influence of the continuity of the presentation flow, of collaboration, and of the permanence of previous states, and to verify the influence of individual variables such as visual span and mental rotation capacities.

Original “Notre recherche a pour objectif de mettre en évidence l'influence, de la continuité du flux , de la collaboration , de la permanence des états antérieurs, ainsi que de vérifier la portée de variables individuelles telles que l’empan visuel et les capacités de rotation mentale. (p.33)”

The general objective is then further developed over 1 1/2 pages in the thesis. Causalities are discussed in verbal form (pp. 34-40) and then "general" hypotheses are presented over 2 pages.

Explanatory (independent) variables, i.e. conditions

  1. Animation (static vs. dynamic condition): animation allows students to visualize the transitions between states, whereas a static presentation forces a student to imagine the movement of elements.
  2. Permanence (present or absent condition): if older states of the animation remain shown, students have better recall and therefore can more easily build up their model.
  3. Collaboration (present or absent condition): working together should allow students to create more sophisticated representations.

Operational hypotheses (presented in the methodology chapter; translated from the French original):

  • Animation
    • Inference scores as well as retention scores will be higher in the dynamic condition than in the static condition.
    • Perceived cognitive load will be higher in the dynamic condition than in the static condition. Discussion times and certainty levels have no reason to differ between conditions.
  • Permanence
    • Participants in the condition with permanence will obtain better questionnaire results than participants in the condition without permanence. Inference results in particular should show this effect.
    • Perceived cognitive load should not differ between these two conditions. Discussion times and certainty levels should be higher with permanence than without.
    • The influence of permanence will be all the greater if participants are in the dynamic presentation condition.
  • Collaboration
    • Collaboration will have a positive effect on learning, for retention as well as for inference. However, inference should benefit particularly when "grounding" occurs. Participants in duos will therefore obtain better scores than participants working solo.
    • Perceived cognitive load should follow the level of results and be lower in the duo condition than in the solo condition.
    • Discussion times should naturally be longer in the duo condition. Certainty levels should also be higher in the duo condition than in the solo condition.

Method (short summary !)

Population: 160 students. All were tested to check whether they were novices (i.e. lacked the domain knowledge used in the material).

Material:

  • The pedagogical material consists of 2 different multimedia contents (geology and astronomy), each in 2 versions: 12 animations for the dynamic condition and 12 static pictures for the static condition
  • Contents of the pedagogical material: "Transit of Venus" made with VRML, "Ocean and mountain building" made with Flash
  • These media were integrated in Authorware (to take measures and to ensure a consistent interface)

Procedure (roughly, step by step)

  • Pretest (5 questions)
  • Introduction (briefing)
  • For solo condition: paper folding and Corsi visio-spatial tests
  • Test with material
  • Cognitive load test (nasa-tlx)
  • Post-test (17 questions)

Measured dependent variables:

  • Number of correct answers in a retention questionnaire.
  • Number of correct answers in an inference questionnaire.
  • Level of response certainty in both questionnaires.
  • Subjective cognitive load scores (measured with the NASA-TLX test).
  • Paper-folding test score.
  • Visual span test score (Corsi).
  • Time (seconds) and number of vignette uses in the permanence condition.
  • Reflection time between presentations (seconds).

Quasi-experimental designs

It is difficult to carry out experiments in real settings, e.g. schools. However, there exist so-called quasi-experimental designs, which are inspired by experimental design principles (pre- and post-tests, and control groups).

Advantages
  • Can be led in non-experimental situations, i.e. in "real" contexts
  • Can be used when the treatment would become too "heavy" for a true experiment, i.e. involves more than 2-3 well-defined treatment variables.


Disadvantage

In quasi-experimental situations, you really lack control:

  • You don’t know all possible stimuli (causes not due to experimental conditions)
  • You can’t randomize (distribute evenly other intervening unknown stimuli over the groups)
  • You may lack enough subjects

Nevertheless, quasi-experimental research can help you test all sorts of threats from variables that you cannot control. These are called threats to internal validity (see below).

Usage examples in the social sciences
  • Evaluation research
  • Organizational innovation studies
  • Questionnaire design for survey research (think about control variables to test alternative hypothesis)

There exist various designs. Some are easier to conduct, but they lead to less solid (valid) results. Let's examine a few...

Interrupted time series design

Here is a schema of the interrupted time series design, which attempts to control for the effect of possible other events (treatments) on a single experimental group.

Interrupted-time-series-design.png

Advantages

You can control (natural) trends somewhat. I.e. when you observe or introduce a treatment, e.g. a pedagogical reform, you may not be sure whether the reform features themselves had an effect or whether it was something else, like a general trend in the abilities of the student population.

Problems

You can't control external simultaneous events (X2 events that happen at the same time as X1). Example concerning the effect of ICT-based pedagogies in the classroom: these pedagogies may have been introduced together with other pedagogical innovations, so which one had an effect on overall performance?

Practical difficulties
  • Sometimes it is not possible to obtain data for past years
  • Sometimes you don't have the time to wait long enough (your research ends too early, and decision makers never want to wait for long-term results). Example: ICT-based pedagogies often claim to improve meta-cognitive skills. Do you have tests for year -1, year -2, year -3? Can you wait until year +3? Can you wait even longer, i.e. test the same population when they reach university or jobs where meta-cognitive skills matter more?

Examples of time series

Now let's have an informal look at time series, i.e. measures that evolve over time and that can corroborate or invalidate hypotheses about an intervention X.

Time-series-examples.png

O1, O2, etc. are observation data (e.g. yearly), X is the treatment (intervention)

A. A statistical effect is likely.
Example: "Students' drop-out rates are lower since we added forums to the e-learning content server." But attention: you don't know whether there was another intervention at the same time.
B. Likely straw-fire effect.
Teaching improved after we introduced X, but then things went back to normal. So there is an effect, but after a while the cause "wears out". E.g. the typical motivation boost from introducing ICT in the curriculum may not last.
C. Natural trend (an effect is unlikely).
You can control for this error by looking beyond O4 and O5!
D. Confusion between cycle effects and intervention.
Example: the government introduced measures to fight unemployment, but you don't know whether they merely "surf" on a natural business cycle. Control for this by looking at the whole time series.
E. Delay effect.
Example: high investments in education (may take decades to take effect).
F. Trend acceleration effect.
Difficult to discriminate from G, i.e. there is some change in the curve, but it may just be a variant of a natural exponential evolution.
G. Natural exponential evolution.
Same remark as for (C).
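One informal way to check for pattern C (a natural trend) is to fit a line to the pre-intervention observations and compare the post-intervention observations with the extrapolated trend. The numbers below are invented; this is a sketch, not a substitute for a proper interrupted time series analysis:

```python
# Invented yearly observations: four measures before the intervention X
# and three after it (illustration only).
pre = [50, 52, 54, 56]    # O1..O4
post = [70, 72, 74]       # O5..O7

# Fit a least-squares line to the pre-intervention series.
n = len(pre)
x_mean = (n - 1) / 2
y_mean = sum(pre) / n
slope = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(pre)) \
        / sum((x - x_mean) ** 2 for x in range(n))
intercept = y_mean - slope * x_mean

# Deviations of the post-intervention observations from the extrapolated
# trend; values far from zero suggest an effect beyond the natural trend.
deviations = [y - (intercept + slope * (n + i)) for i, y in enumerate(post)]
print(deviations)   # → [12.0, 12.0, 12.0]
```

Here the post-intervention observations sit well above the extrapolated trend line, which points to pattern A rather than pattern C.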

Threats to internal validity

The big question you should ask yourself over and over: what other variables could influence our experiment? Campbell and Stanley (1963) created an initial typology of threats to watch out for:

Each type of threat is listed with a definition and an example:

  • History: another event than X happens between measures. Example: ICT introduction happened at the same time as the introduction of project-based teaching.
  • Maturation: the object changed "naturally" between measures. Example: did this course change your awareness of methodology, or was it simply the fact that you started working on your master thesis?
  • Testing: the measure had an effect on the object. Example: your pre-intervention interviews had an effect on people (e.g. teachers changed behavior before you invited them to training sessions).
  • Instrumentation: the method used to measure has changed. Example: reading skills are defined differently, e.g. newer tests favor text understanding.
  • Statistical regression: differences would have evened out naturally. Example: a school introduces new disciplinary measures after kids beat up a teacher. Maybe next year such events wouldn't have happened even without any intervention.
  • (Auto) selection: subjects self-select for treatment. Example: you introduce new ICT-based pedagogies and results are really good (maybe only good teachers participated in these experiments).
  • Mortality: subjects are not the same. Example: a school introduces special measures to motivate "difficult kids" and after 2-3 years drop-out rates improve. But maybe the school is situated in an area that shows rapid socio-demographic change (different people).
  • Interaction with selection: combinatory effects. Example: the control group shows a different maturation.
  • Directional ambiguity: is it the treatment or the subjects? Example: do workers show better output in "flat-hierarchy" / participatory / ICT-supported organizations, or do such organizations attract more active and efficient people?
  • Diffusion or treatment imitation: example: an academic unit promotes modern blended learning and attracts good students from a wide geographic area; a control unit may also profit from this effect.
  • Compensatory equalization: the control group observes the experimental group. Example: subjects who don't receive the treatment react by changing their behavior.

Let's now have a look at some designs that attempt to control such threats.

Non-equivalent control group design

This design uses comparisons between two similar (but not equivalent) groups.

Non-equivalent-control-group-design.png

Icon-thumb-up.png Advantages: Good at detecting other causes

  • If O2 - O1 is similar to O4 - O3, we can reject the hypothesis that O2 - O1 is due to X.

Icon-thumb-down.png Disadvantages and possible problems:

  • Bad control of natural tendencies
  • Finding (somewhat) equivalent groups is not easy
  • You also may encounter interaction effects between groups, e.g. imitation.

Experimentation and imitation effects

Here is an example of an imitation effect, comparing a course that introduces ICT in the classroom (course A) with one that does not (course B). Results are compared horizontally between the two courses:

                                   course A (introduces ICT)   course B (doesn't)
  Effect 1: costs                  augment                     stable
  Effect 2: student satisfaction   augments                    augments
  Effect 3: deadlines respected    better                      stable

Review question: why does student satisfaction improve at the same time for course B?

Comparative time series

One of the most powerful quasi-experimental research designs uses comparative time series, i.e. you combine the interrupted time series design with the non-equivalent control group design presented above.

Comparative-time-series.png

  1. Compare between groups (situations)
  2. Make series of pre- and post observations (tests)

Difficulties:

  1. Find comparable groups
  2. Find groups with more than just one or a few cases (!)
  3. Find data (in time in particular)
  4. Watch out for simultaneous interventions at point X.

Validity in quasi-experimental design

Let's now generalize our discussion a little and come back to the causality issue we already addressed in the Methodology tutorial - empirical research principles.

There exist four kinds of validity according to Cook and Campbell (1979):

Icon-finger-1.png Internal validity concerns your research design

  • You have to show that postulated causes are "real" (as discussed before) and that alternative explanations are wrong.
  • This is the most important validity type.

Icon-finger-2.png External validity .... can you make generalizations?

  • This is not easy, because you may not be aware of "helpful" variables, e.g. the "good teacher" you worked with, or the fact that things were much easier in your private school ....
  • How can you provide evidence that your successful ICT experiment will be successful in other similar situations, or in situations not that similar?

Icon-finger-3.png Statistical validity .... are your statistical relations significant ?

Icon-finger-4.png Construct validity ... are your operationalizations sound?

  • Did you get your dimensions right?
  • Do your indicators really measure what you want to know?

Important: this typology is also useful for other settings, e.g. structured qualitative analysis or statistical designs. In most other empirical research designs you must address these issues.

Quasi-experimental thesis example

Notari, Michele (2003). Scripting Strategies In Computer Supported Collaborative Learning Environments, Master Thesis, MSc MALTT (Master of Science in Learning and Teaching Technologies), TECFA, University of Geneva. HTML/PDF

  • This master thesis concerns the design and effects of ICT-supported activity-based pedagogics in a normal classroom setting
  • Target: Biology at high-school level (various subjects)

Three research questions formulated as 'working hypotheses':

  • The use of a Swiki as collaborative editing tool causes no technical and comprehensive problems (after a short introduction) for high school students without experience in collaborative editing but with some knowledge of the use of a common text-editing software and the research of information in the Web.
  • [Pedagogical] scripting which induces students to compare and comment on the work of the whole learning community (using a collaborative editing tool) leads to better learning performance (as assessed by pre- and post-testing) than a script leading students to work without such a tool and with little advice or / and opportunity to make comments and compare their work with the learning community.
  • The quality of the product of the working groups is better (longer and more detailed) when students are induced to compare and comment on their work (with a collaborative editing tool) during the learning unit.
Method

(Summary, quotations from thesis)

  • The whole research took place in a normal curricular class environment. The classes were not aware of a special learning situation and a deeper evaluation of the output they produced.
  • We tried to embed the scenarios in an absolutely everyday teaching situation and supposed students to have the same motivational state as in other lessons.
  • To collect data we used questionnaires, observed students while working, and for one set up we asked students to write three tests.
  • Of course the students asked about the purposes of the tests. We tried to motivate them to perform as well as they could without telling them the real reason of the tests.
Notes
  • This master thesis reports several quasi-experiments, all in real-world settings.
  • Below we reproduce the settings for just one of these.
  • Several explanatory variables intervene in the example below (the procedure as a whole was evaluated, not isolated variables as defined by experimentalism).

A sample "experiment" from Notari’s thesis:

Notari-wiki-scripting.png

Let's now look into so-called statistical designs, an approach that is typically used in "survey research".

Statistical designs

Statistical designs are also conceptually related to experimental designs:

Statistical designs formulate laws
  • there is no interest in individual cases (unless something goes wrong)
  • you can test quite a lot of laws (hypotheses) with statistical data (your computer will do the calculations)
Designs are based on prior theoretical reasoning, because
  • measures are not all that reliable,
    • what people tell may not be what they do,
    • what you ask may not measure what you want to observe ...
  • there is statistical over-determination,
    • you can find correlations between a lot of things!
  • you cannot get an "inductive picture" by asking a few dozen closed questions.

The dominant research design is conducted "à la Popper":

  1. You start by formulating hypotheses (models that contain measurable variables and relations)
  2. You measure the variables (e.g. with a questionnaire and/or a test)
  3. You then test relations with statistical tools

The most popular variant in educational technology is so-called "survey research".

Introduction to survey research

A typical research plan looks like this:
  1. Literature review leading to general research questions and/or analysis frameworks
  2. You may use qualitative methodology to investigate new areas of study
  3. Definition of hypotheses
  4. Operationalization of hypotheses, e.g. definition of scales and related questionnaire items
  5. Definition of the mother population
  6. Sampling strategies
  7. Identification of analysis methods
Implementation:
  1. Questionnaire building (preferably with input from published scales)
  2. Test of the questionnaire with 2-3 subjects
  3. Survey (interviews, on-line or written)
  4. Coding and data verification + scale construction
  5. Analysis
Writing it up:
  • Compare results to theory
  • Follow good practice in presenting and discussing results, but also make the text readable
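Scale construction usually includes a check of a composite scale's internal consistency, commonly with Cronbach's alpha. A minimal sketch with invented questionnaire answers (illustration only, not taken from any of the cited studies):

```python
import statistics

# Hypothetical answers of five respondents to three Likert-type items that
# are supposed to form one composite scale (numbers invented for illustration).
responses = [
    [3, 4, 4],
    [2, 2, 3],
    [4, 5, 4],
    [1, 2, 2],
    [3, 3, 4],
]

def cronbach_alpha(rows):
    """Internal consistency of a composite scale (Cronbach's alpha)."""
    k = len(rows[0])  # number of items
    item_variances = [statistics.variance([row[i] for row in rows]) for i in range(k)]
    total_variance = statistics.variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

alpha = cronbach_alpha(responses)
print(round(alpha, 2))   # → 0.94
```

Values above roughly 0.7 are conventionally taken as acceptable consistency for a scale.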

Levels of reasoning within a statistical approach

  • Theoretical level: variables are concepts / categories; the cases considered depend on the scope of your theory.
  • Verbal hypothesis level: variables and values (attributes); cases: the mother population (students, schools, ...); relations: clearly stated causalities or co-occurrences.
  • Operationalization level: dimensions and indicators; cases: good enough sampling; relations: statistical relations between statistical variables (e.g. composite scales, socio-demographic variables).
  • Measurement level: observed indicators (e.g. survey questions); cases: subjects in the sample.
  • Statistics level: measures (e.g. response items to questions) and scales (composite measures); cases: data (numeric variables).

(Just for your information; if it looks too complicated, ignore it.)

Typology of internal validity errors

[[Image:Icon-finger-1-3cm.png]]

Error of type 1: you believe that a statistical relation is meaningful ... but "in reality" it does not exist.

  • In more technical terms: you wrongly reject the null hypothesis (no link between the variables)


[[Image:Icon-finger-2-3cm.png]]

Error of type 2: you believe that a relation does not exist ... but "in reality" it does.

  • E.g. you compute a correlation coefficient and the results show that it is very weak. Maybe the relation was non-linear, or another variable causes an interaction effect ...
  • In more technical terms: you wrongly accept the null hypothesis
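The non-linearity trap is easy to demonstrate. In the sketch below (invented data), one variable is a perfect deterministic function of another, yet the linear correlation coefficient is essentially zero; only a test that matches the true functional form reveals the relation.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation coefficient."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

x = [i / 10 for i in range(-50, 51)]  # symmetric around 0
y = [v ** 2 for v in x]               # perfect U-shaped (non-linear) relation

print(pearson_r(x, y))                    # ~0: a naive linear test "sees" nothing
print(pearson_r([v ** 2 for v in x], y))  # ~1: the relation is actually perfect
```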


[[Image:Icon-light-bulb.png]]

There exist useful statistical methods to reduce these risks:

  • See statistical data analysis techniques
  • Think!


Survey research examples

  • See quantitative data gathering and quantitative analysis modules for some examples

Pilot study on the implementation and perceptions of ICT (Etude pilote sur la mise en oeuvre et les perceptions des TIC)

  • (Luis Gonzalez, DESS thesis, 2004). Main goal: "Study factors that favor teachers' use of ICT". The author defines 8 factors and also postulates a few relationships among them


[[Image:Teacher-ICT-use-model-gonzalez.png]]

Below we quote from the thesis (and not the research plan):


My principal hypothesis postulates the existence of a correlation between the following factors and teachers' use of ICT:

  • The type of support offered by the institutions
  • Teacher's pedagogical competences
  • Teacher's technical competences
  • ICT training received by teachers
  • Teacher's feeling of self-efficacy
  • Teacher's perception of technology
  • Teacher's perception of ICT's pedagogical usefulness
  • Teacher's digitalization (rationalization) practices with ICT

{{quotationbox|Secondary hypotheses are:

  • Teacher's perception of ICT's pedagogical usefulness is correlated with pedagogical competences
  • Teacher's perception of technology is correlated with perception of ICT's pedagogical usefulness
  • Teacher's digitalization (rationalization) practices with ICT are correlated with perception of technology
  • ICT training received by teachers is correlated with teacher's pedagogical competences and technical competences
  • Teacher's feeling of self-efficacy is correlated with teacher's pedagogical competences and technical competences
  • Teacher's digitalization (rationalization) practices with ICT are correlated with feeling of self-efficacy}}
Sampling method
  • Representative sample of future primary teachers (students), N = 48
  • Non-representative sample of primary teachers, N = 38
    • All teachers with an email address in Geneva were contacted; self-selection (!)
    • Note: the questionnaire was very long; some teachers who started it dropped out after a while
  • This sort of sampling is ok for a pilot study or a small master's thesis.
Questionnaire design

Definition of each "conceptual domain" (see above, i.e. main factors/variables identified from the literature).

Item sets (questions) and scales were adapted from the literature where possible, e.g.

  • L’échelle d’auto-efficacité (Dussault, Villeneuve & Deaudelin, 2001)
  • Enquête internationale sur les attitudes, représentations et pratiques des étudiantes et étudiants en formation à la profession enseignante au regard du matériel pédagogique ou didactique, informatisé ou non (Larose, Peraya, Karsenti, Lenoir & Breton, 2000)
  • Guide et instruments pour évaluer la situation d’une école en matière d’intégration des TIC (Basque, Chomienne & Rocheleau, 1998).
  • Les usages des TIC dans les IUFM : état des lieux et pratiques pédagogiques (IUFM, 2003).
Data collection
  • Data was collected with an on-line questionnaire tool (using the ESP program)
Purification of the instrument
  • For each item set, a factor analysis was performed and indicators were constructed according to the inter-correlation of items (typically the first 2-3 factors were used). Notice: if you use fully tested published scales, you don't need to do this!
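To give a flavor of what such a purification step does, here is a minimal sketch: extracting the first principal factor of an item inter-correlation matrix by power iteration. The 3×3 correlation matrix is invented for illustration; a real analysis would use dedicated factor-analysis software.

```python
import math

# Invented inter-correlations between three questionnaire items
R = [
    [1.0, 0.6, 0.5],
    [0.6, 1.0, 0.4],
    [0.5, 0.4, 1.0],
]

def first_eigenvector(m, iters=200):
    """Dominant eigenvector of a symmetric matrix, by power iteration."""
    v = [1.0] * len(m)
    for _ in range(iters):
        w = [sum(m[i][j] * v[j] for j in range(len(m))) for i in range(len(m))]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    return v

v = first_eigenvector(R)
# Rayleigh quotient = dominant eigenvalue (variance explained by the factor)
eigenvalue = sum(v[i] * sum(R[i][j] * v[j] for j in range(3)) for i in range(3))
loadings = [c * math.sqrt(eigenvalue) for c in v]
print([round(l, 2) for l in loadings])  # all three items load strongly on one factor
```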
Example - "perception of pedagogical ICT use"
In the questionnaire this concept is measured by two question sets (scales).


The perception of pedagogical ICT use comprises two series of questions concerning, respectively, the teachers' degree of agreement with governmental and scientific discourse on the use of computerized educational resources in education (question 34, 10 items), and the degree of importance attributed to various computerized resources (question 43, 12 items).

Here we show one of these 2 question sets:

Question 34. PUP1: The following statements reflect opinions that are "strongly present" in governmental as well as "scientific" discourse on the use of computerized educational resources in education. Indicate your degree of agreement with each of them.

(Strongly disagree = 1, Somewhat disagree = 2, Somewhat agree = 3, Strongly agree = 4)

[[Image:Questionnaire-example-gonzalez.png]]

Note: these 10 items and the 12 items from question 43 were later reduced to 3 indicators:

  • Var_PUP1 - Degree of importance of mutual-aid and collaboration tools for pupils
  • Var_PUP2 - Degree of importance of communication tools between pupils
  • Var_PUP3 - Agreement on what favors constructivist learning

We shall use this example again in the [[Methodology tutorial - quantitative data analysis]].

Similar comparative systems design

This design is popular in comparative public policy analysis. It can be used to compare educational systems of a few districts, states or countries.

Principle:

[[Image:Icon-finger-1.png]] Make sure to have good variance within the "operative variables" (dependent + independent)

[[Image:Icon-finger-2.png]] Make sure that no other variable shows variance (i.e. that there are no hidden control variables that may produce effects)

[[Image:Similar-systems-design.png]]

In simpler words: select cases that differ with respect to the variables that are of interest to your research, but that are otherwise similar in all other respects.

E.g. don't select a prestigious school that uses ICT and a "normal" school that doesn't if you want to measure the effect of ICT. Either stick to prestigious schools or to "normal" schools; otherwise, you can't tell whether it was ICT that made the difference ...

Advantages and disadvantages of this method:

[[Image:Icon-thumb-up.png]] Fewer reliability and construct validity problems

[[Image:Icon-thumb-up.png]] Better control of "unknown" variables

[[Image:Icon-thumb-down.png]] Worse external validity (impossible to generalize)

[[Image:Icon-thumb-down.png]] Weak or no statistical testing. Most often researchers just compare data but cannot provide statistically significant results, since there are too few cases.

Summary of theory-driven designs discussed

In this tutorial we presented some important theory-driven research designs, which we summarize in the table below with a few typical use cases. There exist other theory-driven designs, e.g. simulations.

{| class="wikitable"
! approach !! some usages
|-
| See Experimental designs ||
* Psycho-pedagogical investigations
* User-interface design
|-
| See Quasi-experimental designs ||
* Instructional designs (as a whole)
* Social psychology
* Public policy analysis
* Educational reform
* Organizational reform
|-
| See Statistical designs ||
* Teaching practice
* Usage patterns
|-
| See Similar comparative systems design ||
* Public policy analysis
* Comparative education
|}

Of course, you can combine these approaches within a research project. You may also use different designs to look at the same question in order to triangulate answers.

Bibliography

  • Campbell, D. T., & Stanley, J. C. (1963). Experimental and quasi-experimental designs for research on teaching. In N. L. Gage (Ed.), Handbook of research on teaching. Boston: Houghton Mifflin.
  • Campbell, D. T., & Stanley, J. (1966). Experimental and quasi-experimental designs for research. Chicago: Rand McNally. (The revised original; it remains the reference for quasi-experimental research.)
  • Cook, T. D., & Campbell, D. T. (1979). Quasi-experimental design. Chicago: Rand McNally.
  • Dawson, T. E. (1997, January 23–25). A primer on experimental and quasi-experimental design. Paper presented at the Annual Meeting of the Southwest Educational Research Association, Austin, TX.
  • Shadish, W. R., Cook, T. D., & Campbell, D. T. (2002). Experimental and quasi-experimental designs for generalized causal inference. Boston, MA: Houghton Mifflin.