Reproducibility of Ranking Visualizations of Correlation Using Weber’s Law by Harrison et al. (2014, TVCG)

Author

Fuling Sun

Published

October 13, 2024

Introduction

My research focuses on information visualization, particularly how visualizations can be designed to align with authors’ communication goals and support readers’ analytic tasks. This paper, along with the follow-up study by Kay and Heer, models how well different visualizations support the perception of correlation, providing quantitative methods to compare and rank them. Replicating this study will contribute to my future research in both theoretical development and tool design, and will strengthen my skills in statistical analysis.

To reproduce the analysis from the paper, I will use the open-source data provided by the authors, follow the analysis pipeline, and implement the process using R. Specifically, I will conduct the following tasks:

  1. Data processing: I will work out the meaning of each variable in the dataset and aggregate the data by experimental condition, as described in Harrison et al.
  2. Statistical analysis: I will analyze both the aggregated data (following Harrison et al.) and the individual judgements (following Kay and Heer), applying techniques such as the Kruskal-Wallis test, Mann-Whitney-Wilcoxon tests, linear models, log-linear models, and Bayesian estimation.
  3. Result comparison: I will compare my results with those reported in the original papers to identify any discrepancies.
  4. Additional analysis: I will propose and conduct further analyses to explore new questions that arise during the process.
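The first two tasks can be sketched in a few lines. The example below is a minimal illustration in Python (the project itself is planned in R) and uses made-up numbers, not the authors' dataset. The variable names, the single "scatterplot" condition, and the JND values are all assumptions. It aggregates hypothetical just-noticeable-difference (JND) judgements per condition and base correlation, fits a linear model to the cell means, and fits a plain log-linear model to the individual judgements. The censoring and Bayesian estimation used by Kay and Heer are deliberately left out.

```python
import math
from collections import defaultdict
from statistics import mean

# Hypothetical individual judgements: (visualization, base correlation r, JND).
# These values are invented for illustration; they are NOT from the dataset.
judgements = [
    ("scatterplot", 0.3, 0.20), ("scatterplot", 0.3, 0.24),
    ("scatterplot", 0.5, 0.15), ("scatterplot", 0.5, 0.17),
    ("scatterplot", 0.7, 0.10), ("scatterplot", 0.7, 0.12),
    ("scatterplot", 0.9, 0.05), ("scatterplot", 0.9, 0.07),
]


def aggregate(rows):
    """Return the mean JND for each (visualization, r) cell."""
    cells = defaultdict(list)
    for vis, r, jnd in rows:
        cells[(vis, r)].append(jnd)
    return {key: mean(values) for key, values in cells.items()}


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Linear model on the aggregated cell means.
cell_means = aggregate(judgements)
keys = sorted(cell_means)
agg_r = [r for (_, r) in keys]
agg_jnd = [cell_means[key] for key in keys]
linear_intercept, linear_slope = fit_line(agg_r, agg_jnd)

# Log-linear model on the individual judgements: log(JND) = a + b * r.
ind_r = [r for (_, r, _) in judgements]
ind_log_jnd = [math.log(jnd) for (_, _, jnd) in judgements]
log_intercept, log_slope = fit_line(ind_r, ind_log_jnd)

print(f"linear:     JND      = {linear_intercept:.3f} + {linear_slope:.3f} * r")
print(f"log-linear: log(JND) = {log_intercept:.3f} + {log_slope:.3f} * r")
```

Running the same two fits on the real data, once per visualization condition, would give the slopes and intercepts to compare against the values reported in the original papers.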

Potential challenges in this project include learning each statistical method and implementing it in R, a language I am not yet fully familiar with. Additionally, since part of the analysis draws on the follow-up study, I will need to understand how the methods differ, their strengths and limitations, and when to apply each one.

Links:

Methods

Power Analysis

Original effect size, power analysis for samples to achieve 80%, 90%, 95% power to detect that effect size. Considerations of feasibility for selecting planned sample size.
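As a rough illustration of the calculation described here, the sketch below estimates the per-group sample size needed for 80%, 90%, and 95% power. It rests on assumptions that are not in the original article: a two-sided comparison of two independent group means, the normal approximation to the t-test, a significance level of 0.05, and a placeholder standardized effect size (`assumed_d = 0.5`). It is written in Python for illustration only.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size_d, power, alpha=0.05):
    """Approximate per-group n for a two-sided, two-sample comparison of means.

    Uses the normal approximation n = 2 * ((z_{1-alpha/2} + z_{power}) / d)^2,
    which slightly underestimates the n required by an exact t-test.
    """
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)
    z_power = standard_normal.inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) / effect_size_d) ** 2)


# Placeholder effect size (Cohen's d); replace with the original study's value.
assumed_d = 0.5

planned = {power: n_per_group(assumed_d, power) for power in (0.80, 0.90, 0.95)}
for power, n in planned.items():
    print(f"power {power:.0%}: about {n} participants per group")
```

The required sample size scales with the inverse square of the effect size, so halving the assumed effect roughly quadruples the sample needed.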

Planned Sample

Planned sample size and/or termination rule, sampling frame, known demographics if any, preselection rules if any.

Materials

All materials - can quote directly from original article - just put the text in quotations and note that this was followed precisely. Or, quote directly and just point out exceptions to what was described in the original article.

Procedure

Can quote directly from original article - just put the text in quotations and note that this was followed precisely. Or, quote directly and just point out exceptions to what was described in the original article.

Analysis Plan

Can also quote directly, though it is less often spelled out effectively for an analysis strategy section. The key is to report an analysis strategy that is as close to the original - data cleaning rules, data exclusion rules, covariates, etc. - as possible.

Clarify the key analysis of interest here. You can also pre-specify additional analyses you plan to do.

Differences from Original Study

Explicitly describe known differences in sample, setting, procedure, and analysis plan from original study. The goal, of course, is to minimize those differences, but differences will inevitably occur. Also, note whether such differences are anticipated to make a difference based on claims in the original article or subsequent published research on the conditions for obtaining the effect.

Methods Addendum (Post Data Collection)

You can comment this section out prior to final report with data collection.

Actual Sample

Sample size, demographics, and data exclusions based on the rules spelled out in the analysis plan.

Differences from pre-data collection methods plan

Any differences from what was described as the original plan, or “none”.

Results

Data preparation

Data preparation following the analysis plan.

Confirmatory analysis

The analyses as specified in the analysis plan.

A side-by-side comparison with the original graph is ideal here.

Exploratory analyses

Any follow-up analyses desired (not required).

Discussion

Summary of Replication Attempt

Open the discussion section with a paragraph summarizing the primary result from the confirmatory analysis and the assessment of whether it replicated, partially replicated, or failed to replicate the original result.

Commentary

Add open-ended commentary (if any) reflecting (a) insights from follow-up exploratory analysis, (b) assessment of the meaning of the replication (or not) - e.g., for a failure to replicate, are the differences between original and present study ones that definitely, plausibly, or are unlikely to have been moderators of the result, and (c) discussion of any objections or challenges raised by the current and original authors about the replication attempt. None of these need to be long.