Methodology tutorial - theory-driven research designs

The educational technology and digital learning wiki
Statistical designs are related to experimental designs:

; Statistical designs formulate laws
* there is no interest in individual cases (unless something goes wrong)
* you can test quite a lot of laws (hypotheses) with statistical data (your computer will do the calculations)

; Designs are based on prior theoretical reasoning, because:
* measures are not all that reliable,
* you cannot get an "inductive picture" by asking a few dozen closed questions.


The dominant research design is conducted "à la Popper":

# You start by formulating hypotheses (models that contain measurable variables and relations)
# You measure the variables (e.g. with a questionnaire and/or a test)
# You then test relations with statistical tools

The most popular variant in educational technology is so-called "survey research".


=== Introduction to survey research ===


; A typical research plan looks like this:

# Literature review leading to general research questions and/or analysis frameworks
# You may use qualitative methodology to investigate new areas of study
# Definition of hypotheses
# Operationalization of hypotheses, e.g. definition of scales and related questionnaire items
# Definition of the mother population
# Sampling strategies
# Identification of analysis methods


; Implementation (mise en oeuvre)

# Questionnaire building (preferably with input from published scales)
# Analysis


; Writing it up

* Compare results to theory
* Marry good practice of results presentation and discussion, but also make it readable


=== Levels of reasoning within a statistical approach ===

(Just for your information. If it looks too complicated, ignore it.)


=== Typology of internal validity errors ===


[[Image:fingers-1.png]] Error of type 1: you believe that a statistical relation is meaningful ... but "in reality" it doesn't exist
* In complicated words: you wrongly reject the null hypothesis (no link between variables)

[[Image:fingers-2.png]] Error of type 2: you believe that a relation does not exist ... but "in reality" it does
* E.g. you compute a correlation coefficient and the results show that it is very weak. Maybe the relation was non-linear, or another variable causes an interaction effect ...
* In complicated words: you wrongly accept the null hypothesis

[[Image:fingers-2.png]] There are useful statistical methods to diminish these risks:
* See statistical data analysis techniques
* Think!
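A small simulation makes the two error types concrete (illustrative only: the data are synthetic and the two-sample t-test is hand-rolled from its textbook formula):

```python
# How often does a simple two-sample t-test commit type 1 and type 2 errors?
import random
import statistics

def t_statistic(a, b):
    """Two-sample t statistic with pooled variance."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    pooled = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    return (statistics.mean(a) - statistics.mean(b)) / (pooled * (1 / na + 1 / nb)) ** 0.5

def rejection_rate(true_effect, runs=2000, n=20, threshold=2.02):
    """Fraction of simulated experiments where the null is rejected.
    threshold ~ critical t for alpha = 0.05, df = 38."""
    rng = random.Random(42)
    rejections = 0
    for _ in range(runs):
        control = [rng.gauss(0, 1) for _ in range(n)]
        treated = [rng.gauss(true_effect, 1) for _ in range(n)]
        if abs(t_statistic(treated, control)) > threshold:
            rejections += 1
    return rejections / runs

type1_rate = rejection_rate(true_effect=0.0)  # rejecting a true null (~5%)
power = rejection_rate(true_effect=1.0)       # detecting a real effect
type2_rate = 1 - power                        # missing a real effect
```

With no true effect, rejections happen about 5% of the time (type 1); with a real effect of one standard deviation, the misses are the type 2 errors.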


=== Survey research examples ===

* See quantitative data gathering and quantitative analysis modules for some examples


=== Pilot study on ICT implementation and teacher perceptions ===

* (Luis Gonzalez, DESS thesis 2004): Main goal: "Study factors that favor teachers' use of ICT". The author defines 8 factors and also postulates a few relationships among them.


[[Image:teacher-ICT-use-model-gonzalez.png]]

Below we quote (in translation) from the thesis (and not the research plan):

"My main hypothesis postulates the existence of a correlation between the following factors and teachers' implementation of ICT:

* The '' type of support'' offered by the institutional setting
* Their '' pedagogical competencies''
* Their '' technical competencies''
* The '' training received'', be it initial training or continuing education
* Their '' sense of self-efficacy''
* Their '' perception of technologies''
* Their '' perception of the pedagogical use'' of ICT
* Their rationalization and '' pedagogical digitalization''


My secondary hypotheses are:

1. The perception of pedagogical use is correlated with the teacher's pedagogical competencies.

2. The perception of technologies is correlated with that of pedagogical use.

3. Rationalization and pedagogical digitalization are correlated with the perception of technologies.

4. Training is correlated with pedagogical and technical competencies.

5. The sense of self-efficacy is correlated with pedagogical and technical competencies.

6. Rationalization and pedagogical digitalization are correlated with the sense of self-efficacy."


Sampling method

Questionnaire design


* Definition of each "conceptual domain" (see above, i.e. main factors/variables identified from the literature).
* Create item sets (questions). Scales have been adapted from the literature where possible:
** L'échelle d'auto-efficacité (Dussault, Villeneuve & Deaudelin, 2001)
** Enquête internationale sur les attitudes, représentations et pratiques des étudiantes et étudiants en formation à la profession enseignante au regard du matériel pédagogique ou didactique, informatisé ou non (Larose, Peraya, Karsenti, Lenoir & Breton, 2000)
** Guide et instruments pour évaluer la situation d'une école en matière d'intégration des TIC (Basque, Chomienne & Rocheleau, 1998)
** Les usages des TIC dans les IUFM : état des lieux et pratiques pédagogiques (IUFM, 2003)
* Collect data with an on-line questionnaire (using the ESP program)
* Purification of the instrument: for each item set, a factor analysis was performed and indicators were constructed according to the auto-correlation of items (typically the first 2-3 factors were used).
** Note: if you use fully tested published scales, you don't need to do this!
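As a related, simpler illustration of instrument checking (the thesis itself used factor analysis, not this), here is a stdlib-only computation of Cronbach's alpha, a standard reliability coefficient for an item set; the respondents and answers below are simulated:

```python
# Cronbach's alpha for one item set (simulated Likert answers, 1..4).
import random
import statistics

def cronbach_alpha(items):
    """items: list of per-item response lists (same respondents, same order)."""
    k = len(items)
    totals = [sum(resp) for resp in zip(*items)]          # total score per respondent
    item_var = sum(statistics.variance(it) for it in items)
    return k / (k - 1) * (1 - item_var / statistics.variance(totals))

rng = random.Random(1)
# 100 simulated respondents answering a 5-item scale that taps one
# latent attitude, plus per-item noise.
latent = [rng.gauss(2.5, 0.8) for _ in range(100)]
items = [[min(4, max(1, round(t + rng.gauss(0, 0.5)))) for t in latent]
         for _ in range(5)]

alpha = cronbach_alpha(items)  # values above ~0.7 are usually deemed acceptable
```

If alpha is low, the item set does not measure one coherent construct and should be purified (items dropped or regrouped).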


* In the questionnaire this concept is measured by two '' question sets'' (scales).


{{quotationbox| The perception of the pedagogical use of ICT comprises two sets of questions concerning, respectively, the teachers' degree of agreement with governmental and scientific discourse on the use of computerized educational resources in education (question 34, 10 items), and the degree of importance attributed to various computerized resources (question 43, 12 items).}}


Here we show one of these 2 question sets:


Question 34. PUP1: The following statements reflect opinions that are "very present" in governmental as well as "scientific" discourse on the use of computerized educational resources in education. Indicate your degree of agreement with each of them.

(Strongly disagree = 1, Somewhat disagree = 2, Somewhat agree = 3, Strongly agree = 4)
[[Image:book-research-design-139.png]]


Note: these 10 items and the 12 items from question 43 have later been reduced to 3 indicators:

* Var_PUP1 - Degree of importance of mutual-aid and collaboration tools for pupils
* Var_PUP2 - Degree of importance of communication tools between pupils
* Var_PUP3 - Agreement on what favors constructivist-type learning


== Similar comparative systems design ==


'''Principle''':

[[Image:book-research-design-140.png]] Make sure to have good variance within "operative variables" (dependent + independent)

[[Image:book-research-design-141.png]] Make sure that no other variable shows variance (i.e. that there are no hidden control variables that may produce effects)

[[Image:book-research-design-142.png]]


In simpler words: select cases that differ with respect to the variables that are of interest to your research, but that are otherwise similar in all other respects.

E.g. don't select a prestige school that does ICT and a normal school that doesn't if you want to measure the effect of ICT. Either stick to prestige schools or to "normal" schools; otherwise, you can't tell whether it was ICT that made the difference ...


Advantages and disadvantages of this method:

[[Image:book-research-design-143.png]] fewer reliability and construct validity problems

[[Image:book-research-design-145.png]] worse external validity (possibility to generalize)


== Summary of theory-driven designs discussed ==

Revision as of 16:25, 6 October 2008


Research Design for Educational Technologies - Theory driven research designs

This is part of the methodology tutorial (see its table of contents).

Note: There should be links to selected wiki articles !

Overview of theory driven research

Most important elements of an empirical theory-driven design:

Empirical-research-elements.png

  • Conceptualisations: each research question is formulated as one or more hypotheses, which are grounded in theory.
  • Measures: usually quantitative (e.g. experimental data, survey data, organizational or public "statistics", etc.); they make use of artifacts like surveys or experimental materials.
  • Analyses & conclusions: hypotheses are tested with statistical methods.


Experimental designs

The scientific ideal

Control physical interactions between variables

Experimentation principle in science:

  1. The study object is completely isolated from any environmental influence and observed (O1)
  2. A stimulus is applied to the object (X1)
  3. The object’s reactions are observed (O2).

Science-experiment.png

  • O1 = observation of the non-manipulated object's state
  • X = treatment (stimulus, intervention)
  • O2 = observation of the manipulated object's state

The effect of the treatment (X) is measured by the difference between O1 and O2

The simple experiment in human sciences

It is not possible to totally isolate a subject from the environment. Therefore we have to make sure that effects of the environment are either controlled or at least equally distributed over the treatment and control groups.

Simple experimentation using a control group :

A simple control group design looks like this:

Simple-control-group.png

Principle:

  1. Two groups of subjects are chosen randomly (R) within a mother population:
    • this ought to eliminate systematic influence of unknown variables on one group
  2. Ideally, subjects should not be aware of the research goals
  3. The independent variable (X) is manipulated by the researcher (experimental condition)
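Step 1 above, the random assignment (R), can be sketched in a few lines (illustrative only; the subject IDs are invented):

```python
# Random assignment of subjects to a treatment and a control group.
import random

def randomize(subjects, seed=0):
    """Shuffle and split in half; randomization is what spreads unknown
    variables (roughly) evenly over the two groups."""
    pool = list(subjects)
    random.Random(seed).shuffle(pool)
    half = len(pool) // 2
    return pool[:half], pool[half:]   # (treatment group, control group)

treatment, control = randomize(range(40))  # 40 hypothetical subject IDs
```

Note that with small groups (15-20 subjects) randomization only reduces, and does not eliminate, the risk of unbalanced groups.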

Analysis of results: effects are compared:

{| class="wikitable"
! Treatment !! effect (O) !! non-effect (O) !! Total effect for a group
|-
| treatment (group X) || bigger || smaller || 100 %
|-
| non-treatment (group non-X) || smaller || bigger || 100 %
|}

We do a vertical comparison.

Analysis questions are formulated in this spirit: What is the probability that treatment X leads to effect O ? In the table above we can observe an experimentation effect.
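The vertical comparison above can be made concrete with a chi-square test of independence between treatment and effect. The cell counts below are invented for illustration, and the statistic is computed by hand rather than with a statistics library:

```python
# Chi-square test of independence for a 2x2 treatment/effect table.
def chi_square_2x2(a, b, c, d):
    """Cells: [[a, b], [c, d]] = [[X & O, X & not-O], [non-X & O, non-X & not-O]]."""
    n = a + b + c + d
    # expected cell counts if treatment and effect were independent
    expected = [(a + b) * (a + c) / n, (a + b) * (b + d) / n,
                (c + d) * (a + c) / n, (c + d) * (b + d) / n]
    observed = [a, b, c, d]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

stat = chi_square_2x2(30, 10, 15, 25)   # treatment group shows the effect more often
significant = stat > 3.84               # critical value for df = 1, alpha = 0.05
```

Here the statistic (about 11.4) far exceeds the critical value, so the association between treatment X and effect O is unlikely to be chance.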

The Simple experiment with different treatments is a slightly different design alternative, but similar in spirit.

Simple-multiple-control-groups.png

Example: first, students are assigned randomly to different lab sessions, each using a different pedagogy (X), and we would like to know whether there are different effects at the end (O).

Problems with simple experimentation:

  • Selection: Subjects may not be the same in the different groups
    • Since samples are typically very small (15-20 / group) this may have an effect
  • Reactivity of subjects: Individuals ask themselves questions about the experiment (compensatory effects) or may otherwise change between observations
  • Difficulty to control certain variables in a “real” context
    • Example: A new ICT-supported pedagogy may work better, because it stimulates the teacher, students may increase their attention and work input, groups may be smaller and individuals get more attention.
    • In principle one could test these variables with experimental conditions, but for each new variable, one has to add at least 2 more experimental groups, .....

The simple experiment with pretests:

The following design attempts to control the difference that may exist between 2 experimental groups (i.e. we don't trust randomization or we can't randomly assign subjects to a group, e.g. 2 classes in a school setting).

Simple-experiment-pretest.png

  • To control the potential difference between groups: compare O2 - O1 (difference) with O4 - O3
  • Disadvantage: effects of the first measure on the experiment
    Example: (a) If X is supposed to increase pedagogical effect, the O1 and O3 tests could have an effect (students learn by doing the test), so you can’t measure the "pure" effect of X.
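The pretest-design comparison of O2 - O1 against O4 - O3 amounts to comparing gain scores; here is a minimal sketch with invented test scores:

```python
# Gain-score comparison for the pretest design (all scores invented).
import statistics

o1 = [10, 12, 11, 13, 9, 12, 10, 11]   # experimental group, pretest
o2 = [15, 17, 15, 18, 13, 16, 14, 15]  # experimental group, posttest
o3 = [11, 10, 12, 13, 10, 11, 12, 10]  # control group, pretest
o4 = [12, 11, 13, 14, 11, 12, 13, 11]  # control group, posttest

gain_exp = [post - pre for pre, post in zip(o1, o2)]  # O2 - O1, per subject
gain_ctl = [post - pre for pre, post in zip(o3, o4)]  # O4 - O3, per subject

# If the experimental gain clearly exceeds the control gain, the treatment X
# (rather than pre-existing group differences or the pretest itself) is the
# likely cause; in practice the two gain lists are compared with a t-test.
diff = statistics.mean(gain_exp) - statistics.mean(gain_ctl)
```

Note that both groups took the pretest, so a pure testing effect (learning from O1/O3) would show up in the control gain too and be subtracted out.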

The Solomon design is similar in spirit and better, but it requires two extra control groups:

Experiment-solomon-design.png

  • combines the simple experiment design with the pretest design:
  • and we can test for example: O2>O1, O2>O4, O5>O6, O5>O3

Note: comparing 2 different situations is NOT an experiment ! The treatment variable X must be simple and uni-dimensional (else you don’t know the precise cause of an effect)

There exist even more complicated designs to measure interaction effects of 2 or more treatments, but we shall stop here.

The non-experiment: what you should not do

The (non)experiment without control group nor pretest can be presented like this:

Non-experiment.png

We just look at data (O) after some event (X).

Example: A bad discourse on ICT competence of pupils

“Since we introduced ICT in the curriculum, most of the school’s pupils are good at finding things on the Internet"

There is a lack of real comparison !!

  • We don’t compare: what happens in other schools that offer no ICT training ? (Maybe this is a general trend since more households have computers and Internet access.)
  • We don’t even know what happened before !

"Most of the students are good ! ..." means that you don't compare to what happens in other settings that do not include ICT in their curriculum

{| class="wikitable"
! The variable to be explained (O) !! X = ICT in school !! X = no ICT in school
|-
| bad at web search || 10 students || ???
|-
| good at web search || 20 students || ???
|}

A horizontal comparison of percentages is impossible: we have no data about schools without ICT.

"Things have changed ..." means that you are not aware of the situation before the change.

{| class="wikitable"
! The variable to be explained (O) !! before !! after
|-
| bad at web search || ??? || 10 students
|-
| good at web search || ??? || 20 students
|}

A horizontal comparison of percentages is impossible: we have no data from before the change.

Here is another bad design:

Experiments without randomization or pretest

Bad-control-group-experiment.png

Problem: There is no control over the conditions and the evolution of the control group

  • Example: computer animations used in school A are claimed to be the reason for better grade averages (than in school B)
  • School A may simply attract pupils from different socio-economic conditions who usually show better results.

Finally, let's look at the experiment without control group

No-control-group-experiment.png

We don’t know if X is the real cause

  • Example: "Since I bought my daughter a lot of video games, she is much better at word processing"
  • You don’t know if this evolution is "natural" (kids always get better at word processing after using it a few times) or if she learnt it somewhere else.

Examples of experimental designs

Drawn from TECFA's MSc MALTT (Master of Science in Learning and Teaching Technologies)


Under which conditions does animation favor learning ?

Master (DESS) thesis by Cyril Rebetez, TECFA 2005

Note: Funded by a real research project, i.e. the student did more than usually expected  !

The big research question

(In translation:) "Our research aims to demonstrate the influence of continuity of flow, of collaboration, and of the permanence of prior states, as well as to verify the role of individual variables such as visual span and mental rotation abilities." (p.33)

  • This objective is then developed further over 1 1/2 pages in the thesis. Causalities are discussed in verbal form (p. 34-40) and then "general" hypotheses are presented on 2 pages.

Explanatory (independent) variables, i.e. conditions

  1. Animation, static vs. dynamic condition: allows students to visualize the transition between states. A static presentation forces a student to imagine the movement of elements.
  2. Permanence, present or absent condition: if older states of the animation are shown, students have better recall and can therefore more easily build up their model.
  3. Collaboration, present or absent condition: working together should allow students to create more sophisticated representations.

Operational hypothesis (presented in the methodology chapter):

Quotations from the thesis:

  • Animation
    • Inference scores as well as retention scores will be higher in the dynamic condition than in the static condition.
    • Perceived cognitive load will be higher in the dynamic condition than in the static condition. Discussion times and certainty levels have no reason to differ between the conditions.
  • Permanence
    • Participants in the with-permanence condition will obtain better questionnaire results than participants in the without-permanence condition. Inference results in particular should show this effect.
    • Perceived cognitive load should not differ between these two conditions. Discussion times and certainty levels should be higher with permanence than without.
    • The influence of permanence will be all the greater if participants are in the dynamic presentation condition.
  • Collaboration
    • Collaboration will have a positive effect on learning, for retention as well as for inference. However, inference should benefit particularly in the case of "grounding". Participants in duos will therefore obtain better scores than participants working solo.
    • Perceived cognitive load should follow the level of results and be lower in the duo condition than in the solo condition.
    • Discussion times should naturally be longer in the duo condition. Certainty levels should also rise in the duo condition compared to the solo condition.

Method (short summary !)

  • Population = 160 students
    • All had been tested to check that they were novices (showing a lack of the domain knowledge used in the material)
  • Material
    • Pedagogical material is 2 different multimedia contents (geology and astronomy), each one in 2 versions. For the dynamic condition there are 12 animations, for the static conditions 12 static pictures
    • Contents of pedagogical material: "Transit of Venus" made with VRML, "Ocean and mountain building" made with Flash
    • These media were integrated in Authorware (to take measures and to ensure a consistent interface)
  • Procedure (roughly, step by step)
    • Pretest (5 questions)
    • Introduction (briefing)
    • For solo condition: paper folding and Corsi visio-spatial tests
    • Test with material
    • Cognitive load test (nasa-tlx)
    • Post-test (17 questions)
  • Measured dependent variables:
    • Number of correct answers on the retention questionnaires.
    • Number of correct answers on the inference questionnaires.
    • Level of certainty of the questionnaire answers.
    • Scores on five perceived cognitive-load scales (taken from the NASA-TLX).
    • Score on the paper-folding test.
    • Span score on the Corsi test.
    • Time (sec) and number of uses of the thumbnails in the permanence condition.
    • Reflection time between presentations (sec).

Quasi-experimental designs

Quasi-experimental designs are inspired by experimental design principles (pre- and post tests, and control groups).

Use case and advantages:

  • Are led in non-experimental situations (e.g. real contexts)
  • Are used when the treatment is too "heavy", i.e. involves more than 1-2 well-defined treatment variables.
  • Address all sorts of threats to internal validity (see later)

In quasi-experimental situations, you really lack control:

  • you don’t know all possible stimuli (causes not due to experimental conditions)
  • you can’t randomize (distribute evenly other intervening unknown stimuli over the groups)
  • you may lack enough subjects

Usage examples in social sciences:

  • evaluation research
  • organizational innovation studies
  • questionnaire design (think about control variables to test alternative hypothesis)

There exist various designs. Some are easier to conduct but lead to less solid (valid) results:

Interrupted time series design

Here is a schema of the interrupted time series design, which attempts to control for the effect of possible other events (treatments) on a single experimental group.

Interrupted-time-series-design.png

Advantages:

  • you can control (natural) trends somewhat

Problems:

  • You can't control external simultaneous events (X2 that happen at the same time as X1)
  • Example: ICT-based pedagogies are introduced together with other pedagogical innovations. So which one does have an effect on overall performance ?

Practical difficulties:

  • Sometimes it is not possible to obtain data for past years
  • Sometimes you don't have the time to wait long enough (your research ends too early)
    • Example: ICT-based pedagogies often claim to improve meta-cognitive skills. Do you have tests for year-1, year-2, year-3 ? Can you wait for year+3 ?

Examples of time series

Time-series-examples.png

  • O1, O2, etc. are observation data (e.g. yearly), X is the treatment (intervention)
A. A statistical effect is likely.
   Example: "Students' drop-out rates are lower since we added forums to the e-learning content server."
   But attention: you don't know whether there was another intervention at the same time.
B. Likely "straw fire" effect.
   Teaching improved after we introduced X, but then things went back to normal. So there is an effect, but after a while the cause "wears out": e.g. the typical motivation boost from introducing ICT in the curriculum may not last.
C. Natural trend (unlikely effect).
   You can control this error by looking beyond O4 and O5!
D. Confusion between cycle effects and intervention.
   Example: the government introduced measures to fight unemployment, but you don't know whether they merely "surf" on a natural business cycle. Control this by looking at the whole time series.
E. Delay effect.
   Example: high investments in education (they take decades to take effect).
F. Trend acceleration effect (difficult to discriminate from G).
   Natural exponential evolution: same as (C).
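To separate a pattern like A (likely effect) from C (natural trend), one can fit a trend line to the pre-intervention observations and check how far the post-intervention observations depart from its extrapolation. A stdlib-only sketch on two invented series:

```python
# Interrupted time series: compare post-intervention data with the
# extrapolated pre-intervention trend (all series invented).
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return my - b * mx, b

def post_departure(series, intervention_at):
    """Mean gap between observed post values and the pre-trend extrapolation."""
    a, b = fit_line(list(range(intervention_at)), series[:intervention_at])
    post = series[intervention_at:]
    return sum(y - (a + b * x)
               for x, y in zip(range(intervention_at, len(series)), post)) / len(post)

effect_series = [50, 51, 50, 52, 60, 61, 62]  # jump after the intervention (case A)
trend_series = [50, 52, 54, 56, 58, 60, 62]   # pure natural trend (case C)

jump = post_departure(effect_series, 4)    # large gap: intervention effect likely
no_jump = post_departure(trend_series, 4)  # ~0: the "improvement" is just the trend
```

This is only the level-shift part of the story; it does not by itself rule out simultaneous external events (X2) or cycle effects.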

Threats to internal validity

The big question you should ask yourself over and over: What other variables could influence our experiments ? (Campbell and Stanley, 1963)

{| class="wikitable"
! Type !! Definition and example
|-
| history || Another event than X happens between measures. Example: ICT introduction happened at the same time as the introduction of project-based teaching.
|-
| maturation || The object changed "naturally" between measures. Example: did this course change your awareness of methodology, or was it simply the fact that you started working on your master thesis?
|-
| testing || The measure had an effect on the object. Example: your pre-intervention interviews had an effect on people (e.g. teachers changed behavior before you invited them to training sessions).
|-
| instrumentation || The method used to measure has changed. Example: reading skills are defined differently, e.g. newer tests favor text understanding.
|-
| statistical regression || Differences would have evened out naturally. Example: a school introduces new disciplinary measures after kids beat up a teacher. Maybe next year such events wouldn't have happened without any intervention.
|-
| (auto-)selection || Subjects self-select for treatment. Example: you introduce new ICT-based pedagogies and results are really good (maybe only good teachers participated in these experiments).
|-
| mortality || Subjects are not the same. Example: a school introduces special measures to motivate "difficult kids". After 2-3 years drop-out rates improve. Maybe the school is situated in an area that shows rapid socio-demographic change (different people).
|-
| interaction with selection || Combinatory effects. Example: the control group shows a different maturation.
|-
| directional ambiguity || Example: do workers show better output in "flat-hierarchy" / participatory / ICT-supported organizations, or do such organizations attract more active and efficient people?
|-
| diffusion or treatment imitation || Example: an academic unit promotes modern blended learning and attracts good students from a wide geographic area. A control unit may also profit from this effect.
|-
| compensatory equalization || Example: subjects who don't receive treatment react negatively.
|}

Non-equivalent control group design

This design compares a treatment group with a similar (but not equivalent) control group.

Non-equivalent-control-group-design.png

Advantages: good at detecting other causes

  • If O2 - O1 is similar to O4 - O3, we can reject the hypothesis that O2 - O1 is due to X.

Disadvantages and possible problems:

  • Poor control of natural tendencies
  • Finding (somewhat) equivalent groups is not easy
  • You may also encounter interaction effects between groups, e.g. imitation.
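The O2 - O1 versus O4 - O3 comparison can be sketched numerically. The scores below are invented for illustration; a simple t-test on the gain scores asks whether the treatment gain differs from the control gain.

```python
import numpy as np
from scipy import stats

# Pre/post scores per subject (hypothetical data, not from the text)
treat_pre  = np.array([52, 48, 55, 60, 47, 51])   # O1
treat_post = np.array([61, 55, 63, 68, 52, 60])   # O2
ctrl_pre   = np.array([50, 49, 57, 58, 46, 53])   # O3
ctrl_post  = np.array([53, 50, 59, 61, 47, 55])   # O4

gain_treat = treat_post - treat_pre   # O2 - O1
gain_ctrl  = ctrl_post - ctrl_pre     # O4 - O3

# If the two gain distributions are similar, we cannot attribute O2 - O1 to X.
t, p = stats.ttest_ind(gain_treat, gain_ctrl)
print(gain_treat.mean(), gain_ctrl.mean(), p)
```

A small p-value here only rules out "the gain is shared by both groups"; it does not by itself exclude the other threats listed above (selection, imitation, ...).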

Experimentation and imitation effects

Here is an example of an imitation effect: course A introduces ICT in the classroom, course B doesn't, and results are compared horizontally.

                                Course A (introduces ICT)   Course B (doesn't)
Effect 1: costs                 augment                     stable
Effect 2: student satisfaction  augments                    augments
Effect 3: deadlines respected   better                      stable

Questions:

  • Effect 2: why does student satisfaction improve at the same time for course B?

Validity in quasi-experimental design

There are four kinds of validity according to Stanley et al.:

1. Internal validity concerns your research design

  • You have to show that postulated causes are "real" (as discussed before) and that alternative explanations are wrong.
  • This is the most important validity type.

2. External validity: can you make generalizations?

  • Not easy! You may not be aware of "helpful" variables, e.g. the "good teacher" you worked with or the fact that things were much easier in your private school.
  • How can you provide evidence that your successful ICT experiment will be successful in other similar situations, or in situations not that similar?

3. Statistical validity: are your statistical relations significant?

  • Not too difficult for simple analyses.
  • Just make sure that you use the right statistics and believe them (see the module on data analysis).

4. Construct validity: are your operationalizations sound?

  • Did you get your dimensions right?
  • Do your indicators really measure what you want to know?

This typology is also useful for other settings, e.g. structured qualitative analysis or statistical designs.

Use comparative time series if you can

One of the most powerful quasi-experimental research designs uses comparative time series.

Comparative-time-series.png

  1. Compare between groups (situations)
  2. Make series of pre- and post observations (tests)

Difficulties:

  1. Find comparable groups
  2. Find groups with more than just one or a few cases (!)
  3. Find data (in time in particular)
  4. Watch out for simultaneous interventions at point X.
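The core of the design — comparing level shifts at the intervention point X across groups — can be sketched with invented yearly series:

```python
import numpy as np

# Yearly scores (invented); the intervention X happens between index 3 and 4
treated = np.array([50, 51, 50, 52, 58, 59, 60, 61], dtype=float)
control = np.array([49, 50, 51, 51, 52, 52, 53, 53], dtype=float)

def level_shift(series, cut):
    """Mean of the post-intervention observations minus mean of the pre ones."""
    return series[cut:].mean() - series[:cut].mean()

shift_treated = level_shift(treated, 4)
shift_control = level_shift(control, 4)
print(shift_treated, shift_control)
```

A treated shift clearly larger than the control shift supports the intervention hypothesis; a real analysis would also model trends (e.g. segmented regression) rather than just level means.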

Example

Thesis title: Scripting Strategies In Computer Supported Collaborative Learning Environments

Author: Michele Notari

  • This thesis concerns the design and effects of ICT-supported activity-based pedagogics in a normal classroom setting
  • Target: Biology at high-school level (various subjects)

Three research questions formulated as 'working hypotheses':

  • The use of a Swiki as collaborative editing tool causes no technical and comprehensive problems (after a short introduction) for high school students without experience in collaborative editing but with some knowledge of the use of a common text-editing software and the research of information in the Web.
  • Scripting which induces students to compare and comment on the work of the whole learning community (using a collaborative editing tool) leads to better learning performance (as assessed by pre- and post-testing) than a script leading students to work without such a tool and with little advice or / and opportunity to make comments and compare their work with the learning community.
  • The quality of the product of the working groups is better (longer and more detailed) when students are induced to compare and comment on their work (with a collaborative editing tool) during the learning unit.

Method (Summary, quotations from thesis)

  • The whole research took place in a normal curricular class environment. The classes were not aware of a special learning situation and a deeper evaluation of the output they produced.
  • We tried to embed the scenarios in an absolutely everyday teaching situation and supposed students to have the same motivational state as in other lessons.
  • To collect data we used questionnaires, observed students while working, and for one set up we asked students to write three tests.
  • Of course the students asked about the purposes of the tests. We tried to motivate them to perform as well as they could without telling them the real reason of the tests.

Notes:

  • This master thesis covers several quasi-experiments, all in real-world settings.
  • On the next slide we reproduce the setting for just one of these.
  • Several explanatory variables intervene in the example on the next page (the procedure as a whole was evaluated, not single variables as defined by experimentalism).

A sample "experiment" from Notari’s thesis:

Notari-wiki-scripting.png

Statistical designs

Statistical designs are related to experimental designs:

Statistical designs formulate laws
  • there is no interest in individual cases (unless something goes wrong)
  • you can test quite a lot of laws (hypotheses) with statistical data (your computer will do the calculations)
Designs are based on prior theoretical reasoning, because:
  • measures are not all that reliable,
    • what people tell may not be what they do,
    • what you ask may not measure what you want to observe ...
  • there is statistical over-determination,
    • you can find correlations between a lot of things!
  • you cannot get an "inductive picture" by asking a few dozen closed questions.

The dominant research design is conducted "à la Popper":

  1. You start by formulating hypotheses (models that contain measurable variables and relations)
  2. You measure the variables (e.g. with a questionnaire and/or a test)
  3. You then test relations with statistical tools

The most popular variant in educational technology is so-called "survey research".
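These three steps can be sketched end-to-end with invented data. The hypothesis ("more practice hours go with higher test scores"), the variable names, and the numbers are purely illustrative.

```python
from scipy import stats

# Step 1: hypothesis - practice hours and test scores are positively related.
# Step 2: measured variables (hypothetical questionnaire/test data)
practice_hours = [2, 5, 1, 8, 4, 7, 3, 6]
test_scores    = [55, 64, 50, 80, 62, 74, 58, 69]

# Step 3: test the postulated relation with a statistical tool
r, p = stats.pearsonr(practice_hours, test_scores)
print(r, p)
```

The test only tells you whether the relation is statistically credible; whether it means what your theory says is a construct-validity question.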

Introduction to survey research

A typical research plan looks like this:
  1. Literature review leading to general research questions and/or analysis frameworks
  2. You may use qualitative methodology to investigate new areas of study
  3. Definition of hypotheses
  4. Operationalization of hypotheses, e.g. definition of scales and related questionnaire items
  5. Definition of the mother population
  6. Sampling strategies
  7. Identification of analysis methods
Implementation
  1. Questionnaire building (preferably with input from published scales)
  2. Test of the questionnaire with 2-3 subjects
  3. Survey (interviews, online or written)
  4. Coding and data verification + scale construction
  5. Analysis
Writing it up
  • Compare results to theory
  • Follow good practice in presenting and discussing results, but also make it readable
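Step 4 of the implementation phase (scale construction) usually includes a reliability check before items are summed into a composite scale. Here is a minimal sketch of Cronbach's alpha with invented Likert responses:

```python
import numpy as np

def cronbach_alpha(items):
    """Internal consistency of a scale; items is a subjects x items matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars / total_var)

# Four Likert items answered by six subjects (hypothetical data)
responses = [
    [4, 4, 3, 4],
    [2, 2, 2, 3],
    [3, 3, 3, 3],
    [4, 3, 4, 4],
    [1, 2, 1, 2],
    [3, 4, 3, 3],
]
alpha = cronbach_alpha(responses)
print(alpha)   # values above roughly 0.7 are conventionally considered acceptable
```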

Levels of reasoning within a statistical approach

  • Theoretical level - variables: concepts / categories; cases: depend on the scope of your theory; relations: verbal
  • Hypothesis level - variables: variables and values (attributes); cases: the mother population (students, schools, ...); relations: clearly stated causalities or co-occurrences
  • Operationalization level - variables: dimensions and indicators; cases: good enough sampling; relations: statistical relations between statistical variables (e.g. composite scales, socio-demographic variables)
  • Measure level - variables: observed indicators (e.g. survey questions); cases: subjects in the sample
  • Statistics level - variables: measures (e.g. response items to questions) and scales (composite measures); cases: data (numeric variables)

(Just for your information - if it looks too complicated, ignore it.)

Typology of internal validity errors

Error of type 1: you believe that a statistical relation is meaningful ... but "in reality" it doesn't exist.

  • In complicated words: you wrongly reject the null hypothesis (no link between the variables).

Error of type 2: you believe that a relation does not exist ... but "in reality" it does.

  • E.g. you compute a correlation coefficient and the result is very weak, maybe because the relation was non-linear, or because another variable causes an interaction effect.
  • In complicated words: you wrongly accept the null hypothesis.

There are useful statistical methods to diminish these risks:

  • See the statistical data analysis techniques
  • Think!
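A minimal simulation (invented parameters, not from the text) makes both error types concrete for a two-sample t-test at alpha = 0.05:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha = 0.05
runs = 2000

# Type 1: both groups come from the same population; any rejection is an error.
type1 = sum(
    stats.ttest_ind(rng.normal(0, 1, 30), rng.normal(0, 1, 30)).pvalue < alpha
    for _ in range(runs)
) / runs

# Type 2: a real difference of 0.5 SD exists; failing to reject is an error.
type2 = sum(
    stats.ttest_ind(rng.normal(0, 1, 30), rng.normal(0.5, 1, 30)).pvalue >= alpha
    for _ in range(runs)
) / runs

print(type1, type2)   # type1 sits near alpha; type2 depends on effect and sample size
```

Note how the type 2 rate is large here: with 30 subjects per group and a medium effect, the test misses the real difference roughly half the time, which is why "Think!" matters.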

Survey research examples

  • See quantitative data gathering and quantitative analysis modules for some examples

Etude pilote sur la mise en oeuvre et les perceptions des TIC (pilot study on the implementation and perceptions of ICT)

  • (Luis Gonzalez, DESS thesis 2004). Main goal: "study factors that favor teachers' use of ICT". The author defines 8 factors and also postulates a few relationships among them.


Teacher-ICT-use-model-gonzalez.png

Below we quote from the thesis (and not the research plan), translated from French:

"My main hypothesis postulates the existence of a correlation between the following factors and teachers' implementation of ICT:

  • The type of support offered by the institutional setting
  • Their pedagogical competencies
  • Their technical competencies
  • The training received, whether initial or continuing education
  • Their feeling of self-efficacy
  • Their perception of the technologies
  • Their perception of the pedagogical use of ICT
  • Their pedagogical rationalization and digitalization

My secondary hypotheses are:

1. The perception of pedagogical use is correlated with the teacher's pedagogical competencies.

2. The perception of the technologies is correlated with that of pedagogical use.

3. Pedagogical rationalization and digitalization is correlated with the perception of the technologies.

4. Training is correlated with pedagogical and technical competencies.

5. The feeling of self-efficacy is correlated with pedagogical and technical competencies.

6. Pedagogical rationalization and digitalization is correlated with the feeling of self-efficacy."

Sampling method

  • Representative sample of future primary teachers (students), N = 48
  • Non-representative sample of primary teachers, N = 38
    • All teachers with an email address in Geneva were contacted; auto-selection (!)
    • Note: the questionnaire was very long; some teachers who started it dropped out after a while
  • This sort of sampling is OK for a pilot study

Questionnaire design

  • Definition of each "conceptual domain" (see above, i.e. the main factors/variables identified from the literature).
  • Creation of item sets (questions). Scales were adapted from the literature where possible:
    • L'échelle d'auto-efficacité (Dussault, Villeneuve & Deaudelin, 2001)
    • Enquête internationale sur les attitudes, représentations et pratiques des étudiantes et étudiants en formation à la profession enseignante au regard du matériel pédagogique ou didactique, informatisé ou non (Larose, Peraya, Karsenti, Lenoir & Breton, 2000)
    • Guide et instruments pour évaluer la situation d'une école en matière d'intégration des TIC (Basque, Chomienne & Rocheleau, 1998)
    • Les usages des TIC dans les IUFM : état des lieux et pratiques pédagogiques (IUFM, 2003)
  • Data collection with an on-line questionnaire (using the ESP program)
  • Purification of the instrument: for each item set, a factor analysis was performed and indicators were constructed according to the auto-correlation of items (typically the first 2-3 factors were used).
    • Note: if you use fully tested published scales, you don't need to do this!
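The purification step might look like the following sketch. The responses are simulated (not the thesis data); the point is only that items loading on the same factor can be grouped into one indicator, here using scikit-learn's FactorAnalysis.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(1)
# 60 simulated subjects x 6 Likert-type items; items 0-2 and items 3-5
# are constructed to share one latent trait each
latent = rng.normal(size=(60, 2))
items = np.column_stack(
    [latent[:, 0] + rng.normal(0, 0.5, 60) for _ in range(3)]
    + [latent[:, 1] + rng.normal(0, 0.5, 60) for _ in range(3)]
)

fa = FactorAnalysis(n_components=2, random_state=0).fit(items)
loadings = fa.components_            # 2 factors x 6 items
print(np.round(loadings, 2))
```

Items with high loadings on the same factor would then be averaged (or summed) into one composite indicator, as in the thesis.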

Example regarding the concept "perception of pedagogical ICT use"

  • In the questionnaire this concept is measured by two question sets (scales).


Translated from the thesis: "The perception of the pedagogical use of ICT comprises two series of questions, concerning respectively the teachers' degree of agreement with governmental and scientific discourse on the use of computerized educational resources in education (question 34, 10 items) and the degree of importance attributed to various computerized resources (question 43, 12 items)."

Here we show one of these two question sets (translated):

Question 34. PUP1: The following statements reflect opinions that are "very present" in governmental as well as "scientific" discourse on the use of computerized educational resources in education. Indicate your degree of agreement with each of them.

(Strongly disagree = 1, Rather disagree = 2, Rather agree = 3, Strongly agree = 4)

File:Book-research-design-139.png

Note: these 10 items and the 12 items from question 43 were later reduced to 3 indicators:

  • Var_PUP1 - degree of importance of mutual-aid and collaboration tools for pupils
  • Var_PUP2 - degree of importance of communication tools between pupils
  • Var_PUP3 - agreement on what favors constructivist-type learning

Similar comparative systems design

Principle:

  • Make sure to have good variance within the "operative variables" (dependent + independent)
  • Make sure that no other variable shows variance (i.e. that there are no hidden control variables that may produce effects)

File:Book-research-design-142.png

In simpler words: select cases that differ with respect to the variables that are of interest to your research, but are otherwise similar in all other respects.

E.g. don't select a prestige school that does ICT and a normal school that doesn't do ICT if you want to measure the effect of ICT. Either stick to prestige schools or to "normal" schools; otherwise, you can't tell whether it was ICT that made the difference ...

Advantages and disadvantages of this method:

  • fewer reliability and construct validity problems
  • better control of "unknown" variables
  • worse external validity (possibility to generalize)
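The selection rule above can be sketched as a small filter over a case table. All schools, fields, and values here are invented for illustration: we keep only pairs that differ on the operative variable (ICT use) but match on a control attribute (school type).

```python
# Hypothetical case table
cases = [
    {"school": "A", "type": "prestige", "ict": True,  "outcome": 78},
    {"school": "B", "type": "prestige", "ict": False, "outcome": 71},
    {"school": "C", "type": "normal",   "ict": True,  "outcome": 66},
    {"school": "D", "type": "normal",   "ict": False, "outcome": 64},
    {"school": "E", "type": "prestige", "ict": True,  "outcome": 80},
]

def matched_pairs(cases, vary, match):
    """Pairs of cases that differ on `vary` but are identical on `match`."""
    return [
        (a["school"], b["school"])
        for i, a in enumerate(cases)
        for b in cases[i + 1:]
        if a[vary] != b[vary] and a[match] == b[match]
    ]

pairs = matched_pairs(cases, vary="ict", match="type")
print(pairs)
```

Comparing outcomes only within such pairs (prestige vs. prestige, normal vs. normal) keeps school type from masquerading as an ICT effect.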

Summary of theory-driven designs discussed

Approach and some typical usages:

  • Experimental designs: psycho-pedagogical investigations; user-interface design
  • Quasi-experimental designs: instructional designs (as a whole); social psychology; public policy analysis; educational reform; organizational reform
  • Statistical designs: teaching practice; usage patterns
  • Similar comparative systems design: public policy analysis; comparative education

Of course, you can combine these approaches within a research project. You may also use different designs to look at the same question in order to triangulate answers.