Systems Factorial Technology (SFT) is a set of experimental methods and measures designed to address two basic questions pertinent to cognitive psychology: First, when people have to make a decision that involves considering two things, do they deal with those two things at the same time or one after the other? Second, can they deal with those two things more or less efficiently than they could deal with just one thing? The first question is often framed in terms of whether people “process” two factors in serial (one after the other) or parallel (at the same time) fashion. The second question is framed in terms of workload capacity: as we increase your “workload” (the number of factors relevant to your decision), how much does this slow you down or cause you to make more errors?
To make those questions more concrete, consider the following examples:

- You are shopping for a phone and deciding whether to buy a particular model based on both its price and its screen size.
- You glance at your phone and react to what you see there, like an icon signaling a new message or your position on a map.
- You are scanning a field of flowers, looking for one with an unusual color.
The preceding examples are typical of the kind of scenarios to which SFT has been applied in the past. You may have noticed that they are all about how people process external factors, like the icon/map on their phone, the price/screen size of a product, or the color of a flower. Part of the research here at LINCD has been applying ideas from SFT to how people process internal factors, specifically factors having to do with memory. But before we get to that, let’s see how SFT helps us address the two fundamental questions above.
The descriptions above used terms like “processing” and “efficiency”. In SFT, the primary way we measure those things is in terms of response time (RT). RT is the amount of time between when we present something to a participant and when the participant makes a response that reflects the outcome of a decision. Participants usually make their responses by clicking the mouse or hitting a key.
Again, let’s take a concrete example to keep things grounded. The example below shows a simplified version of the “unusual flower” visual search situation described above. Each display shows an array of purple squares called “distractors”; think of these like a bunch of oddly-shaped lavender stalks in a field. In some displays, one or two of those “distractors” are replaced by a “target” that differs slightly in color from the distractors. The job of the participant is to hit a key (e.g., the “J” key) if they see something with an unusual color in the display, and otherwise to hit another key (e.g., the “F” key) indicating that all of the items in the display look the same.
The example displays in the figure above differ according to two factors: the “strength” with which a target object in the upper region of the array differs from the distractor objects; and the “strength” with which a target object in the lower region of the array differs from the distractor objects. Each of those strengths can be high (H) (the color is very different from the distractors), low (L) (the color is somewhat different from the distractors), or null (N) (the color is identical to the distractors). The above represents a double factorial paradigm (DFP) because each possible display is defined by the factorial combination of two factors; with three levels of each factor, there are 3 × 3 = 9 possible display types.
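Since the display types are just all combinations of the two factors’ levels, we can enumerate them directly in R; this is only a quick sketch to show where the nine types come from:

# All nine display types of the double factorial paradigm: every
# combination of three strength levels on each of two factors
expand.grid(
  upper_strength = c("H", "L", "N"),
  lower_strength = c("H", "L", "N")
)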
On each trial of our task, we show a participant one display like one of those above and then wait until they hit a key (“J”) saying either that they see something unusual or a different key (“F”) saying that everything looks the same. For “NN” displays, participants should hit the “F” key, but for all other displays, they should hit the “J” key.
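Here is a short sketch of that response rule in R (the key assignments follow the example above):

# Correct key for a given display: "J" whenever any target is present,
# "F" when everything looks the same
correct_key <- function(upper_strength, lower_strength) {
  if (upper_strength == "N" && lower_strength == "N") "F" else "J"
}

correct_key("H", "N")  # "J": there is a target in the upper region
correct_key("N", "N")  # "F": no targets anywhere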
For the time being, let’s assume that we have a particular participant (call them “Albert”) who is always able to eventually respond correctly to each display. But Albert sometimes takes more time and sometimes less to make his responses. In other words, his RT differs from one trial to the next. Even for displays of the same type (e.g., all the HH displays or all the LN displays), Albert’s RT will not be the same each time. For one thing, displays of the same type are not always identical (the location of the target within each region is not always the same). For another, Albert is not identical from one trial to the next. Just as we wouldn’t expect the same die to come up the same way every time it is rolled, there are all sorts of tiny random differences from one trial to the next that influence Albert’s RT on any given trial.
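To make that idea concrete, here is a minimal sketch of how trial-to-trial RT variability might be simulated. The log-normal distribution and all of the parameter values are illustrative assumptions, not the process that actually generated Albert’s data below:

set.seed(1)

# Hypothetical average (log) speeds for each strength level; more
# negative means faster, so stronger targets yield quicker responses
strength_meanlog <- c(H = -0.2, L = 0.0, N = 0.1)

simulate_trials <- function(upper, lower, n_trials = 5) {
  # Toy assumption: the stronger (faster) of the two targets sets the
  # typical RT, with log-normal trial-to-trial noise around it
  meanlog <- min(strength_meanlog[upper], strength_meanlog[lower])
  data.frame(
    upper_strength = upper,
    lower_strength = lower,
    rt = rlnorm(n_trials, meanlog = meanlog, sdlog = 0.3)
  )
}

simulate_trials("H", "L")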
The data we get from Albert in this task might then look something like this:
subject | upper_strength | lower_strength | rt | correct |
---|---|---|---|---|
Albert | H | H | 0.587 | 1 |
Albert | H | H | 1.208 | 1 |
Albert | H | H | 0.496 | 1 |
Albert | H | H | 1.133 | 1 |
Albert | H | H | 0.683 | 1 |
Albert | H | L | 1.310 | 1 |
Albert | H | L | 0.859 | 1 |
Albert | H | L | 0.893 | 1 |
Albert | H | L | 1.167 | 1 |
Albert | H | L | 0.802 | 1 |
Albert | H | N | 0.870 | 1 |
Albert | H | N | 0.751 | 1 |
Albert | H | N | 1.151 | 1 |
Albert | H | N | 1.396 | 1 |
Albert | H | N | 0.868 | 1 |
Albert | L | H | 1.002 | 1 |
Albert | L | H | 1.042 | 1 |
Albert | L | H | 0.865 | 1 |
Albert | L | H | 1.161 | 1 |
Albert | L | H | 0.923 | 1 |
Albert | L | L | 0.992 | 1 |
Albert | L | L | 1.308 | 1 |
Albert | L | L | 1.450 | 1 |
Albert | L | L | 0.597 | 1 |
Albert | L | L | 0.992 | 1 |
Albert | L | N | 1.533 | 1 |
Albert | L | N | 0.900 | 1 |
Albert | L | N | 1.139 | 1 |
Albert | L | N | 1.145 | 1 |
Albert | L | N | 0.701 | 1 |
Albert | N | H | 0.567 | 1 |
Albert | N | H | 1.237 | 1 |
Albert | N | H | 1.003 | 1 |
Albert | N | H | 0.613 | 1 |
Albert | N | H | 0.802 | 1 |
Albert | N | L | 1.384 | 1 |
Albert | N | L | 0.607 | 1 |
Albert | N | L | 2.762 | 1 |
Albert | N | L | 2.406 | 1 |
Albert | N | L | 1.340 | 1 |
Albert | N | N | 1.719 | 1 |
Albert | N | N | 1.083 | 1 |
Albert | N | N | 1.075 | 1 |
Albert | N | N | 0.852 | 1 |
Albert | N | N | 1.954 | 1 |
Each row represents a single trial of the experiment that Albert completed. The type of display shown on each trial is indicated by the “strength” of the target in the upper and lower regions of the display. Albert’s response time on each trial is given in the “rt” column and is measured in seconds. Finally, Albert’s responses are all correct, so he gets a “1” in the “correct” column for every row.
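If you would like to follow along in R, you can enter data like these by hand. Here is a sketch that builds the first few rows as a tibble; the name `small_data` matches the code below, and the remaining rows from the table would be entered the same way:

library(tibble)

small_data <- tribble(
  ~subject, ~upper_strength, ~lower_strength,    ~rt, ~correct,
  "Albert", "H",             "H",              0.587,        1,
  "Albert", "H",             "H",              1.208,        1,
  "Albert", "H",             "L",              1.310,        1,
  "Albert", "N",             "N",              1.954,        1
  # ...and so on for the rest of the rows in the table above
)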
We have seen that Albert doesn’t always produce the same RT to the same type of display on every trial. What we are really concerned with is the general trend—how quickly does Albert tend to respond to different types of displays? This tendency is a measure of how efficiently Albert can process displays of each type.
A common way to measure central tendency is in terms of the mean or “average”. We can get Albert’s mean RT to each trial type using some R code:
library(dplyr)

# Average RT for each combination of the two strength factors
small_data %>%
  group_by(upper_strength, lower_strength) %>%
  summarize(mean_rt = mean(rt))
## `summarise()` has grouped output by 'upper_strength'. You can override using
## the `.groups` argument.
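Applied to the five trials per condition shown above, the result looks something like this (the exact formatting depends on your version of dplyr):

## # A tibble: 9 × 3
## # Groups:   upper_strength [3]
##   upper_strength lower_strength mean_rt
##   <chr>          <chr>            <dbl>
## 1 H              H                0.821
## 2 H              L                1.01
## 3 H              N                1.01
## 4 L              H                0.999
## 5 L              L                1.07
## 6 L              N                1.08
## 7 N              H                0.844
## 8 N              L                1.70
## 9 N              N                1.34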
Perhaps more revealing would be to make a plot of Albert’s mean RT to each trial type. To do this, we will put one factor (`upper_strength`) on the horizontal axis and use the other factor (`lower_strength`) to choose the color of each point. The height of each point is Albert’s mean RT in each condition:
library(ggplot2)

# Plot mean RT in each condition, with error bars (mean_se computes the
# mean plus or minus one standard error)
small_data %>%
  ggplot(aes(x = upper_strength, color = lower_strength, y = rt)) +
  stat_summary(fun.data = mean_se)
You probably noticed the bars around each point: those are “error bars” and they represent our uncertainty regarding how well we have estimated Albert’s mean RT in each condition from the limited sample that we have. The error bars extend one “standard error” away from the estimated mean, both above and below. You can see that they are pretty wide!
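If you want to see where those error bars come from, here is one way to compute the same quantities by hand, using the usual formula for the standard error of the mean (the standard deviation divided by the square root of the number of trials):

small_data %>%
  group_by(upper_strength, lower_strength) %>%
  summarize(
    mean_rt = mean(rt),
    se_rt   = sd(rt) / sqrt(n())  # standard error of the mean
  )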
One way to make them smaller, that is, to get a better estimate of Albert’s mean RT in each condition, is to collect more data. That means having Albert do more trials of the experiment. Fortunately, since Albert is not real, we can simulate some more data from him so that instead of 5 trials for each type of display, he now does 100 trials. Let’s see what that same plot looks like with more data. I have also added some lines connecting some of the points to make it easier to see some important trends.
# As before, but with lines connecting the points that share a level of
# lower_strength; splitting the grouping on upper_strength == "N" keeps
# the "N" points from being connected to the others
full_data %>%
  ggplot(aes(x = upper_strength, color = lower_strength, y = rt)) +
  stat_summary(
    aes(group = interaction(lower_strength, upper_strength == "N")),
    fun = mean, geom = "line"
  ) +
  stat_summary(fun.data = mean_se)
Plots like this, where we show mean RT as a function of the different “strengths” of the two factors, are the first and most important method for addressing the two questions posed by SFT. In a bit, we will see how these kinds of plots help us distinguish between serial and parallel processing. For now, let’s focus on what we can see from this plot.
These results give a good summary of how efficiently Albert can process elements in the two regions of the display, where efficiency is indicated by faster mean response times. How can we use this information to help us figure out whether Albert processes the two regions in serial or parallel fashion? Or whether he can deal with two targets as efficiently as one?
We refer to serial and parallel processing as two different information processing architectures. But whether someone processes two (or more) things in serial or parallel fashion is not the only thing we need to think about. We also need to think about whether a person has to finish processing each factor before they can respond. For example, in the visual search task Albert was doing, he only needed to respond when he saw anything unusual in the display. In principle, then, as soon as he saw even one target, he could respond. This kind of architecture is called self-terminating because Albert can choose for himself when to terminate processing and commit to a decision. In contrast, there may be situations which require that someone process both factors fully before they can respond; in those situations, we say that a person employs an exhaustive architecture. An example of such a situation might be the phone purchasing scenario above; you would not want to base your decision on just one factor, but would want to process both of them before committing to a decision. Of course, just because you might want to do a task one way, or because there is a “right” way to do it, that doesn’t mean people will actually do it that way!
To summarize, an information processing architecture is defined not just by whether it is serial or parallel, but also by whether the decision rule is self-terminating or exhaustive:
Decision rule | Serial | Parallel |
---|---|---|
Self-terminating | Serial self-terminating | Parallel self-terminating |
Exhaustive | Serial exhaustive | Parallel exhaustive |
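To make the four cells of this table concrete, here is a minimal simulation sketch for a display with a target in both regions. The exponential completion times (and their rates) are purely illustrative assumptions; the point is how each architecture combines the two channels’ finishing times into a single RT:

# Toy completion times for the two "channels", one per region; the
# exponential distribution is an assumption for illustration only
n_sims <- 10000
t_upper <- rexp(n_sims, rate = 2)  # time to process the upper region
t_lower <- rexp(n_sims, rate = 2)  # time to process the lower region

# Serial self-terminating: channels are processed one at a time in a
# random order; with targets in both regions, the first channel
# processed is enough to respond
rt_serial_st <- ifelse(runif(n_sims) < 0.5, t_upper, t_lower)

# Serial exhaustive: one channel after the other, and both must finish
rt_serial_ex <- t_upper + t_lower

# Parallel self-terminating: both channels run at once; respond as soon
# as the first one finishes
rt_parallel_st <- pmin(t_upper, t_lower)

# Parallel exhaustive: both channels run at once; respond only once
# both have finished
rt_parallel_ex <- pmax(t_upper, t_lower)

# Mean RT predicted by each architecture
c(
  serial_st   = mean(rt_serial_st),
  serial_ex   = mean(rt_serial_ex),
  parallel_st = mean(rt_parallel_st),
  parallel_ex = mean(rt_parallel_ex)
)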