Systems Factorial Technology (SFT) is a set of experimental methods and measures designed to address two basic questions pertinent to cognitive psychology: First, when people have to make a decision that involves considering two things, do they deal with those two things at the same time or one after the other? Second, can they deal with those two things more or less efficiently than they could deal with just one thing? The first question is often framed in terms of whether people “process” two factors in serial (one after the other) or parallel (at the same time) fashion. The second question is framed in terms of workload capacity: as we increase your “workload” (the number of factors relevant to your decision), how much does this slow you down or cause you to make more errors?
To make those questions more concrete, consider the following examples:

- You are shopping for a phone and deciding whether to buy a particular model based on both its price and its screen size.
- You glance at your phone and react to what you see there, like an icon signaling a new message or your position on a map.
- You are scanning a field of flowers, looking for one with an unusual color.
The preceding examples are typical of the kind of scenarios to which SFT has been applied in the past. You may have noticed that they are all about how people process external factors, like the icon/map on their phone, the price/screen size of a product, or the color of a flower. Part of the research here at LINCD has been applying ideas from SFT to how people process internal factors, specifically factors having to do with memory. But before we get to that, let’s see how SFT helps us address the two fundamental questions above.
The descriptions above used terms like “processing” and “efficiency”. In SFT, the primary way we measure those things is in terms of response time (RT). RT is the amount of time between when we present something to a participant and when the participant makes a response that reflects the outcome of a decision. Participants usually make their responses by clicking the mouse or hitting a key.
Again, let’s take a concrete example to keep things grounded. The example below shows a simplified version of the “unusual flower” visual search situation described above. Each display shows an array of purple squares called “distractors”; think of these like a bunch of oddly-shaped lavender stalks in a field. In some displays, one or two of those “distractors” are replaced by a “target” that differs slightly in color from the distractors. The job of the participant is to hit a key (e.g., the “J” key) if they see something with an unusual color in the display, and otherwise to hit another key (e.g., the “F” key) indicating that all of the items in the display look the same.
The example displays in the figure above differ according to two factors: the “strength” with which a target object in the upper region of the array differs from the distractor objects; and the “strength” with which a target object in the lower region of the array differs from the distractor objects. Each of those strengths can be high (H) (the color is very different from the distractors), low (L) (the color is somewhat different from the distractors), or null (N) (the color is identical to the distractors). The above represents a double factorial paradigm (DFP) because each possible display is defined by the factorial combination of two factors; with three levels of each factor, there are 3 × 3 = 9 possible display types.
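Since the display types are just all combinations of the two factors’ levels, we can enumerate them directly in R; this is only a quick sketch to show where the nine types come from:

# All nine display types of the double factorial paradigm: every
# combination of three strength levels on each of two factors
expand.grid(
  upper_strength = c("H", "L", "N"),
  lower_strength = c("H", "L", "N")
)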
On each trial of our task, we show a participant one display like one of those above and then wait until they hit a key (“J”) saying either that they see something unusual or a different key (“F”) saying that everything looks the same. For “NN” displays, participants should hit the “F” key, but for all other displays, they should hit the “J” key.
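Here is a short sketch of that response rule in R (the key assignments follow the example above):

# Correct key for a given display: "J" whenever any target is present,
# "F" when everything looks the same
correct_key <- function(upper_strength, lower_strength) {
  if (upper_strength == "N" && lower_strength == "N") "F" else "J"
}

correct_key("H", "N")  # "J": there is a target in the upper region
correct_key("N", "N")  # "F": no targets anywhere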
For the time being, let’s assume that we have a particular participant (call them “Albert”) who is always able to eventually respond correctly to each display. But Albert sometimes takes more time and sometimes less to make his responses. In other words, his RT differs from one trial to the next. Even for displays of the same type (e.g., all the HH displays or all the LN displays), Albert’s RT will not be the same each time. For one thing, displays of the same type are not always identical (the location of the target within each region is not always the same). For another, Albert is not identical from one trial to the next. Just as we wouldn’t expect the same die to come up the same way every time it is rolled, there are all sorts of tiny random differences from one trial to the next that influence Albert’s RT on any given trial.
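To make that idea concrete, here is a minimal sketch of how trial-to-trial RT variability might be simulated. The log-normal distribution and all of the parameter values are illustrative assumptions, not the process that actually generated Albert’s data below:

set.seed(1)

# Hypothetical average (log) speeds for each strength level; more
# negative means faster, so stronger targets yield quicker responses
strength_meanlog <- c(H = -0.2, L = 0.0, N = 0.1)

simulate_trials <- function(upper, lower, n_trials = 5) {
  # Toy assumption: the stronger (faster) of the two targets sets the
  # typical RT, with log-normal trial-to-trial noise around it
  meanlog <- min(strength_meanlog[upper], strength_meanlog[lower])
  data.frame(
    upper_strength = upper,
    lower_strength = lower,
    rt = rlnorm(n_trials, meanlog = meanlog, sdlog = 0.3)
  )
}

simulate_trials("H", "L")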
The data we get from Albert in this task might then look something like this:
subject | upper_strength | lower_strength | rt | correct |
---|---|---|---|---|
Albert | H | H | 0.587 | 1 |
Albert | H | H | 1.208 | 1 |
Albert | H | H | 0.496 | 1 |
Albert | H | H | 1.133 | 1 |
Albert | H | H | 0.683 | 1 |
Albert | H | L | 1.310 | 1 |
Albert | H | L | 0.859 | 1 |
Albert | H | L | 0.893 | 1 |
Albert | H | L | 1.167 | 1 |
Albert | H | L | 0.802 | 1 |
Albert | H | N | 0.870 | 1 |
Albert | H | N | 0.751 | 1 |
Albert | H | N | 1.151 | 1 |
Albert | H | N | 1.396 | 1 |
Albert | H | N | 0.868 | 1 |
Albert | L | H | 1.002 | 1 |
Albert | L | H | 1.042 | 1 |
Albert | L | H | 0.865 | 1 |
Albert | L | H | 1.161 | 1 |
Albert | L | H | 0.923 | 1 |
Albert | L | L | 0.992 | 1 |
Albert | L | L | 1.308 | 1 |
Albert | L | L | 1.450 | 1 |
Albert | L | L | 0.597 | 1 |
Albert | L | L | 0.992 | 1 |
Albert | L | N | 1.533 | 1 |
Albert | L | N | 0.900 | 1 |
Albert | L | N | 1.139 | 1 |
Albert | L | N | 1.145 | 1 |
Albert | L | N | 0.701 | 1 |
Albert | N | H | 0.567 | 1 |
Albert | N | H | 1.237 | 1 |
Albert | N | H | 1.003 | 1 |
Albert | N | H | 0.613 | 1 |
Albert | N | H | 0.802 | 1 |
Albert | N | L | 1.384 | 1 |
Albert | N | L | 0.607 | 1 |
Albert | N | L | 2.762 | 1 |
Albert | N | L | 2.406 | 1 |
Albert | N | L | 1.340 | 1 |
Albert | N | N | 1.719 | 1 |
Albert | N | N | 1.083 | 1 |
Albert | N | N | 1.075 | 1 |
Albert | N | N | 0.852 | 1 |
Albert | N | N | 1.954 | 1 |
Each row represents a single trial of the experiment that Albert completed. The type of display shown on each trial is indicated by the “strength” of the target in the upper and lower regions of the display. Albert’s response time on each trial is given in the “rt” column and is measured in seconds. Finally, Albert’s responses are all correct, so he gets a “1” in the “correct” column for every row.
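If you would like to follow along in R, you can enter data like these by hand. Here is a sketch that builds the first few rows as a tibble; the name `small_data` matches the code below, and the remaining rows from the table would be entered the same way:

library(tibble)

small_data <- tribble(
  ~subject, ~upper_strength, ~lower_strength,    ~rt, ~correct,
  "Albert", "H",             "H",              0.587,        1,
  "Albert", "H",             "H",              1.208,        1,
  "Albert", "H",             "L",              1.310,        1,
  "Albert", "N",             "N",              1.954,        1
  # ...and so on for the rest of the rows in the table above
)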
We have seen that Albert doesn’t always produce the same RT to the same type of display on every trial. What we are really concerned with is the general trend—how quickly does Albert tend to respond to different types of displays? This tendency is a measure of how efficiently Albert can process displays of each type.
A common way to measure central tendency is in terms of the mean or “average”. We can get Albert’s mean RT to each trial type using some R code:
library(dplyr)

# Average RT for each combination of the two strength factors
small_data %>%
  group_by(upper_strength, lower_strength) %>%
  summarize(mean_rt = mean(rt))
## `summarise()` has grouped output by 'upper_strength'. You can override using
## the `.groups` argument.
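Applied to the five trials per condition shown above, the result looks something like this (the exact formatting depends on your version of dplyr):

## # A tibble: 9 × 3
## # Groups:   upper_strength [3]
##   upper_strength lower_strength mean_rt
##   <chr>          <chr>            <dbl>
## 1 H              H                0.821
## 2 H              L                1.01
## 3 H              N                1.01
## 4 L              H                0.999
## 5 L              L                1.07
## 6 L              N                1.08
## 7 N              H                0.844
## 8 N              L                1.70
## 9 N              N                1.34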
Perhaps more revealing would be to make a plot of Albert’s mean RT to each trial type. To do this, we will put one factor (`upper_strength`) on the horizontal axis and use the other factor (`lower_strength`) to choose the color of each point. The height of each point is Albert’s mean RT in each condition:
library(ggplot2)

# Plot mean RT in each condition, with error bars (mean_se computes the
# mean plus or minus one standard error)
small_data %>%
  ggplot(aes(x = upper_strength, color = lower_strength, y = rt)) +
  stat_summary(fun.data = mean_se)
You probably noticed the bars around each point: those are “error bars” and they represent our uncertainty regarding how well we have estimated Albert’s mean RT in each condition from the limited sample that we have. The error bars extend one “standard error” away from the estimated mean, both above and below. You can see that they are pretty wide!
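If you want to see where those error bars come from, here is one way to compute the same quantities by hand, using the usual formula for the standard error of the mean (the standard deviation divided by the square root of the number of trials):

small_data %>%
  group_by(upper_strength, lower_strength) %>%
  summarize(
    mean_rt = mean(rt),
    se_rt   = sd(rt) / sqrt(n())  # standard error of the mean
  )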
One way to make them smaller, that is, to get a better estimate of Albert’s mean RT in each condition, is to collect more data. That means having Albert do more trials of the experiment. Fortunately, since Albert is not real, we can simulate some more data from him so that instead of 5 trials for each type of display, he now does 100 trials. Let’s see what that same plot looks like with more data. I have also added some lines connecting some of the points to make it easier to see some important trends.
# As before, but with lines connecting the points that share a level of
# lower_strength; splitting the grouping on upper_strength == "N" keeps
# the "N" points from being connected to the others
full_data %>%
  ggplot(aes(x = upper_strength, color = lower_strength, y = rt)) +
  stat_summary(
    aes(group = interaction(lower_strength, upper_strength == "N")),
    fun = mean, geom = "line"
  ) +
  stat_summary(fun.data = mean_se)
Plots like this, where we show mean RT as a function of the different “strengths” of the two factors, are the first and most important method for addressing the two questions posed by SFT. In a bit, we will see how these kinds of plots help us distinguish between serial and parallel processing. For now, let’s focus on what we can see from this plot.
These results give a good summary of how efficiently Albert can process elements in the two regions of the display, where efficiency is indicated by faster mean response times. How can we use this information to help us figure out whether Albert processes the two regions in serial or parallel fashion? Or whether he can deal with two targets as efficiently as one?
We refer to serial and parallel processing as two different information processing architectures. But whether someone processes two (or more) things in serial or parallel fashion is not the only thing we need to think about. We also need to think about whether a person has to finish processing each factor before they can respond. For example, in the visual search task Albert was doing, he only needed to respond when he saw anything unusual in the display. In principle, then, as soon as he saw even one target, he could respond. This kind of architecture is called self-terminating because Albert can choose for himself when to terminate processing and commit to a decision. In contrast, there may be situations which require that someone process both factors fully before they can respond; in those situations, we say that a person employs an exhaustive architecture. An example of such a situation might be the phone purchasing scenario above; you would not want to base your decision on just one factor, but would want to process both of them before committing to a decision. Of course, just because you might want to do a task one way, or because there is a “right” way to do it, that doesn’t mean people will actually do it that way!
To summarize, an information processing architecture is defined not just by whether it is serial or parallel, but also by whether the decision rule is self-terminating or exhaustive:
Decision rule | Serial | Parallel |
---|---|---|
Self-terminating | Serial self-terminating | Parallel self-terminating |
Exhaustive | Serial exhaustive | Parallel exhaustive |
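To make the four cells of this table concrete, here is a minimal simulation sketch for a display with a target in both regions. The exponential completion times (and their rates) are purely illustrative assumptions; the point is how each architecture combines the two channels’ finishing times into a single RT:

# Toy completion times for the two "channels", one per region; the
# exponential distribution is an assumption for illustration only
n_sims <- 10000
t_upper <- rexp(n_sims, rate = 2)  # time to process the upper region
t_lower <- rexp(n_sims, rate = 2)  # time to process the lower region

# Serial self-terminating: channels are processed one at a time in a
# random order; with targets in both regions, the first channel
# processed is enough to respond
rt_serial_st <- ifelse(runif(n_sims) < 0.5, t_upper, t_lower)

# Serial exhaustive: one channel after the other, and both must finish
rt_serial_ex <- t_upper + t_lower

# Parallel self-terminating: both channels run at once; respond as soon
# as the first one finishes
rt_parallel_st <- pmin(t_upper, t_lower)

# Parallel exhaustive: both channels run at once; respond only once
# both have finished
rt_parallel_ex <- pmax(t_upper, t_lower)

# Mean RT predicted by each architecture
c(
  serial_st   = mean(rt_serial_st),
  serial_ex   = mean(rt_serial_ex),
  parallel_st = mean(rt_parallel_st),
  parallel_ex = mean(rt_parallel_ex)
)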