Data Preparation
Once you finish testing participants, you will have data stored in a data file with the file extension .azk. The data in the file(s) are not in a format ready to be used in a statistics package such as SPSS, so some preparation is necessary. The following are the steps you may need to take before you carry out statistical analysis.
Step One: Collapsing the Data
This step is necessary only when you use more than one computer to test participants for the same experiment. When you test participants on more than one computer, your data will be stored in more than one file. Thus it is necessary to collapse them into one file before you prepare them for analysis. A utility program called "UnloadAZK" allows you to do that. Here are the procedures:
1. Copy the data file (.azk file) from Computer A onto a disc and bring the disc to Computer B.
2. Start the "UnloadAZK" program. You will see the dialogue box shown below:
![]()
3. Use the first Browse button to locate the file you want to unload, e.g., the .azk file you copied onto the disc from Computer A. Use the second Browse button to identify the data file you want to add the data to. This may be a data file on Computer B that you use to run the same experiment. Click on Unload. If unloading is successful, you will see a message showing the name of the destination file to which the data were copied.
Now the data on the disc have been copied into the data file on Computer B. The first data file will become a data backup file with the extension .rdb.
Step Two: Formatting the Data
Once the data from all participants are in the same data file (.azk file), you need to prepare the data for statistical analysis. Here are the steps involved:
1. Write an input specification file specifying how the data should be treated and arranged. This is a simple text file with the extension .spc. You can use Notepad to write the file. The following is an example of an .spc file, with explanations taken from the file named dmdxutils_readme that comes with the dmdx utility package (each key word is followed by its specification, and all lines beginning with # are explanations).
title: Priming Experiment 1
# any text appearing after "title:" will be used as a title.
discard_display_errors
# if there was a display error on an item, the data from this trial is discarded.
# any option can be turned off either by deleting the line, or by inserting # at the beginning of the line, e.g., #discard_display_errors
subject_rejection: 20
# an option for specifying the highest percent error rate that will be included.
# In this example, subjects making 20% errors or more over all items being analyzed would be automatically rejected.
data_threshold: 2.0
# any RTs more than 2.0 S.D. units away from the overall mean RT for each subject will be trimmed,
# either by setting it equal to the cutoff value (the default), or by exclusion (see next option).
# Delete the above line or comment it out if you don't want any data trimming.
data_rejection
# any RTs selected by the above data_threshold are excluded rather than trimmed.
low_cutoff: 200
# RTs faster than 200 ms will be discarded
high_cutoff: 1500
# RTs slower than 1500 ms will be discarded
rt_width: 7
rt_precision: 1
# specifies the output format of the .das file. rt_width is the number of columns (default 5), and
# rt_precision is the number of decimals (default 0).
analyze_incorrect_responses
# use this if you want to analyze only the incorrect responses.
# now follows the item assignments.
condition: 1
name: c1
description: Semantically related prime
items: 1-9 11 12
# This allows you to specify a name and a description for each condition, and to specify which items belong
# to this condition. In this example, condition 1 is named "c1", and the description of the condition is
# "Semantically related prime". The items to be included are 1 to 9, 11, and 12.
condition: 2
name: c2
description: Unrelated prime
items: 14-16 18 19-26
# etc., for each remaining condition.
2. Start the data preparation program called "Analyze". The dialogue box looks like this:
![]()
3. Locate the files to be analyzed. Use the first Browse button to locate the item file and the second Browse button to locate the .spc file. Then click on Analyze. If everything goes well, you will see a message telling you that two files have been generated (a .das file and an .ism file) and where they are located.
4. Examine the .ism file, which looks like this:
At the top, you can find information about each subject, including error rate, mean reaction time, and standard deviation.
The body of the file contains the items in each condition, their mean reaction times and error rates, and the mean RT for each condition. This file can serve at least two purposes:
a. You can check whether you have put the right items in the right conditions in the .spc file.
b. You can look at the mean RT for each condition as a quick check to see if there is any effect. If you expect to observe a particular effect but, after testing 5 people on each list, the results show no effect or an effect in the wrong direction, you may consider whether it makes sense to test more people.
5. Examine the .das file, which looks like this (see explanation on the right): ......
There are four matrixes of data, in the following order: subject reaction time, subject error rate, item reaction time, and item error rate. Each row in the first two matrixes represents an individual subject. Each column represents a condition. In this example, there are 6 participants and 4 conditions.
Each row in the last two matrixes represents an item. Each column represents a condition. In this example, there are ten items in each of the four conditions.
For a typical psycholinguistic RT study, you need to do both a subject analysis and an item analysis on both reaction times (RTs) and error rates (ERs).
Use the data in the first matrix for subject RT analysis. Use the data in the second matrix for subject ER analysis. Use the last two matrixes for item analyses.
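To make the trimming options in the .spc file more concrete, here is a minimal Python sketch of what low_cutoff, high_cutoff, data_threshold, and data_rejection describe for one subject's RTs. It is not the code used by Analyze. The function name trim_rts, the order of operations (absolute cutoffs first, then the S.D. threshold), the use of the sample standard deviation, and the treatment of boundary values are all assumptions made for illustration.

```python
from statistics import mean, stdev

def trim_rts(rts, low_cutoff=200, high_cutoff=1500, threshold=2.0, reject=False):
    """Apply absolute cutoffs, then an S.D.-based threshold, to one subject's RTs.

    Hypothetical helper: Analyze may order or compute these steps differently.
    """
    # Absolute cutoffs: discard RTs faster than low_cutoff or slower than high_cutoff.
    kept = [rt for rt in rts if low_cutoff <= rt <= high_cutoff]
    if len(kept) < 2:
        return kept  # too few values to compute a standard deviation

    # S.D. threshold around this subject's mean (sample S.D. is an assumption).
    m, sd = mean(kept), stdev(kept)
    lower, upper = m - threshold * sd, m + threshold * sd

    if reject:
        # Like data_rejection: drop the outlying RTs entirely.
        return [rt for rt in kept if lower <= rt <= upper]
    # Default behaviour described above: set outlying RTs equal to the cutoff value.
    return [min(max(rt, lower), upper) for rt in kept]

# Invented RTs for illustration only.
rts = [150, 500, 520, 480, 510, 490, 505, 495, 1400, 1600]
print(trim_rts(rts))               # 1400 is pulled in to the upper threshold
print(trim_rts(rts, reject=True))  # 1400 is removed instead
```

In both calls, 150 and 1600 are removed by the absolute cutoffs before the S.D. threshold is computed.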
Step Three: Building a Dataset for SPSS
You can't copy and paste the data in the .das file into an SPSS data file directly. What I do is copy them into an Excel spreadsheet first and then open that spreadsheet in SPSS.
You need two Excel data files, one for subject means (the first two matrixes in the .das file) and the other for item means (the last two matrixes in the .das file).
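If you prefer not to copy the numbers by hand, a short script can do the splitting. The following Python sketch assumes that the four matrixes have been saved as plain text, with rows of whitespace-separated numbers and a blank line between matrixes. That layout is an assumption, not a description of the real .das format, so check your own file and adjust the parsing (for example, to skip any header lines). The function names, the c1, c2, ... column names, and the row numbering are hypothetical.

```python
import csv
import io

def split_matrices(text):
    """Split text into matrixes: runs of numeric rows separated by blank lines."""
    matrices, current = [], []
    for line in text.splitlines():
        fields = line.split()
        if fields:
            current.append([float(field) for field in fields])
        elif current:
            matrices.append(current)
            current = []
    if current:
        matrices.append(current)
    return matrices

def matrix_to_csv(matrix, row_label):
    """Return one matrix as CSV text with a header row and numbered rows."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    n_conditions = len(matrix[0])
    writer.writerow([row_label] + [f"c{i}" for i in range(1, n_conditions + 1)])
    for number, row in enumerate(matrix, start=1):
        writer.writerow([number] + row)
    return out.getvalue()

# Invented numbers for illustration only: 2 subjects, 2 items, 2 conditions.
sample = """\
512.3 548.9
498.0 530.2

5.0 10.0
0.0 5.0

505.1 540.7
509.4 536.0

2.5 7.5
2.5 7.5
"""

# Order follows the description above: subject RT, subject ER, item RT, item ER.
subject_rt, subject_er, item_rt, item_er = split_matrices(sample)
print(matrix_to_csv(subject_rt, "subject"))
print(matrix_to_csv(item_rt, "item"))
```

The CSV text can be written to a file and opened in Excel or read by SPSS. The row numbers are only positions in the matrix, so confirm against the .ism file that they match the subjects and items you expect.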
If you have two or more counterbalanced lists (and thus two or more .das files), you need to combine the data from the different .das files into the same data file.
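For the subject means, one way to combine lists is to stack the rows from each list and add a column recording which list each subject came from. The Python sketch below illustrates that idea under the assumption that each list's subject matrix has already been read into rows of numbers. The function name combine_lists and the column layout are hypothetical, and item data from counterbalanced lists may need a different arrangement depending on your design.

```python
def combine_lists(matrices_by_list):
    """Stack per-list subject matrixes into rows of [subject, list, c1, c2, ...]."""
    combined = []
    subject = 0
    for list_number, matrix in enumerate(matrices_by_list, start=1):
        for row in matrix:
            subject += 1  # number subjects consecutively across lists
            combined.append([subject, list_number] + list(row))
    return combined

# Invented subject mean RTs for illustration only.
list_1 = [[500, 520]]
list_2 = [[510, 530], [490, 540]]
for row in combine_lists([list_1, list_2]):
    print(row)
```

Keeping the list number as its own column lets you treat list as a between-subject factor later if your analysis calls for it.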
How you list the data in the data file will depend on the design of your study. Go to the SPSS page to see examples of the different data files required by different designs and thus by different statistical procedures.