The audience for this analysis is a real estate investment and pricing strategy team working for a property analytics firm that advises clients on apartment pricing in major urban housing markets. In particular, the team wants to better understand how apartment characteristics relate to whether a unit is located in San Francisco versus New York, since those markets differ substantially in pricing, density, geography, and housing demand.
Specifically, the firm wants to evaluate whether apartment characteristics such as square footage, number of bedrooms, and elevation can meaningfully distinguish apartments located in San Francisco from those in New York. The broader business concern is that if the company relies on an oversimplified model, it may misclassify properties, misinterpret which features matter most, and provide weak recommendations to clients making pricing, investment, or development decisions. The firm therefore needs to determine whether the modeling approach used in the Week 11 lab is sufficient for high-stakes real estate decision-making or whether stronger analyses and more careful interpretation are needed.
The Week 11 lab uses apartment data and focuses on modeling whether an apartment is located in San Francisco within a generalized linear model (GLM) framework. The variables most directly relevant to this problem include beds, square footage, elevation, and transformed versions of size- or price-related measures. Within this scope, the lab can support an initial critique of how well the selected variables and model structure address the classification problem.
However, several assumptions would need to be made before using this analysis in practice. These include assuming that the available apartment features are measured consistently across cities, that the included variables are sufficient to distinguish the two housing markets, and that the sample is reasonably representative of the apartments to which the firm hopes to generalize. If these assumptions do not hold, then the model may produce conclusions that appear statistically reasonable but are not practically reliable.
The objective of this critique is to evaluate whether the Week 11 modeling approach is adequate for a realistic apartment-market decision setting and to identify specific analytical, statistical, and ethical improvements that would make the analysis more credible and useful in practice. Success will be defined by clearly identifying major weaknesses in the original approach, proposing stronger alternatives, and explaining how those improvements would better support business interpretation and responsible use.
The Week 11 lab introduces important generalized linear model ideas, especially maximum likelihood estimation, model comparison, and the addition of explanatory variables. However, in a realistic business setting, the analysis would need to be strengthened before it could support high-stakes decisions about apartment pricing strategy, market segmentation, or location-based recommendations. The critique below focuses on analytical limitations in the lab and proposes improved analyses that would make the results more credible and practically useful.
One of the strongest ideas in the Week 11 lab is the comparison of models using measures such as deviance, the Akaike information criterion (AIC), and the Bayesian information criterion (BIC). That is a useful starting point, since comparing models is better than interpreting one model in isolation. However, the lab does not go far enough in evaluating whether the selected model is actually reliable for classification in practice.
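To make that starting point concrete, here is a minimal sketch in Python with statsmodels. The data are fully simulated, and the column names (elevation, sqft, beds, in_sf) are illustrative stand-ins rather than the lab's actual variables:

```python
# Minimal sketch: comparing two candidate logistic models by deviance,
# AIC, and BIC. All data are simulated; column names are illustrative.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "elevation": rng.uniform(0, 80, n),
    "sqft": rng.normal(1000, 250, n),
    "beds": rng.integers(0, 4, n),
})
# Synthetic "city" label driven by elevation and square footage.
logit = -2 + 0.04 * df["elevation"] + 0.001 * (df["sqft"] - 1000)
df["in_sf"] = rng.binomial(1, 1 / (1 + np.exp(-logit)))

m1 = smf.logit("in_sf ~ elevation", data=df).fit(disp=False)
m2 = smf.logit("in_sf ~ elevation + sqft + beds", data=df).fit(disp=False)

for name, m in [("elevation only", m1), ("elevation + sqft + beds", m2)]:
    # For binary outcomes, residual deviance equals -2 * log-likelihood.
    print(f"{name}: deviance={-2 * m.llf:.1f}, AIC={m.aic:.1f}, BIC={m.bic:.1f}")
```

Whichever model wins on these criteria, the comparison says nothing yet about held-out performance, which motivates the point below.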
A model can outperform another on AIC or deviance and still be poorly suited for business deployment. For example, a model may fit the training data better while still failing to generalize to new apartment listings. In a business context this matters, since the firm cares not only about which model looks strongest on the observed data but also about whether the model would remain useful when applied to future or unseen cases.
An improved analysis would include some form of out-of-sample validation, such as splitting the data into training and testing sets or using cross-validation. This would allow the analyst to evaluate whether performance remains stable when the model is applied beyond the sample on which it was estimated. In practical terms, this would provide stronger evidence that the model is not simply capitalizing on patterns specific to the original dataset.
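A minimal sketch of that idea, again on simulated data with illustrative feature names, using scikit-learn's cross_val_score:

```python
# Minimal sketch: k-fold cross-validation as out-of-sample evidence.
# All data are simulated; feature names are illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 500
X = np.column_stack([
    rng.uniform(0, 80, n),      # elevation
    rng.normal(1000, 250, n),   # square footage
])
p = 1 / (1 + np.exp(-(-2 + 0.04 * X[:, 0] + 0.001 * (X[:, 1] - 1000))))
y = rng.binomial(1, p)

# Each fold is scored on data the model never saw during fitting.
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
print(f"CV accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```

The spread across folds is itself informative: a model whose accuracy swings widely from fold to fold is a weak candidate for deployment.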
The Week 11 lab uses a small set of explanatory variables, such as beds and elevation, to distinguish apartments in San Francisco from those in New York. While that simplification is understandable in a classroom setting, it creates a serious limitation in a real-world business scenario.
Housing markets are shaped by many factors beyond a small number of physical features. Variables such as neighborhood, distance to city center, building age, bathrooms, amenities, transit access, crime rates, and broader market conditions may strongly affect whether a property resembles one market or another. If important variables are omitted, the model may incorrectly attribute too much explanatory power to the few predictors that remain. This creates a classic omitted-variable problem: the coefficients may reflect missing context rather than the true independent contribution of the included variables.
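This distortion is easy to demonstrate by simulation. In the sketch below, a hypothetical transit variable drives both the city label and square footage; omitting it visibly inflates the apparent square-footage coefficient:

```python
# Minimal sketch: omitted-variable bias in a logistic model. A hypothetical
# "transit" feature drives both the label and square footage (simulated).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 2000
transit = rng.normal(0, 1, n)
sqft = 1000 + 200 * transit + rng.normal(0, 100, n)
true_logit = 1.5 * transit + 0.001 * (sqft - 1000)  # true sqft effect: 0.001
label = rng.binomial(1, 1 / (1 + np.exp(-true_logit)))

X_full = sm.add_constant(np.column_stack([sqft, transit]))
X_short = sm.add_constant(sqft)  # transit omitted
full = sm.Logit(label, X_full).fit(disp=False)
short = sm.Logit(label, X_short).fit(disp=False)

print("sqft coefficient, transit included:", round(full.params[1], 4))
print("sqft coefficient, transit omitted: ", round(short.params[1], 4))
```

In the short model, square footage silently absorbs credit for the transit effect it happens to correlate with, which is exactly the distortion described above.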
A stronger analysis would expand the feature set or at least explicitly acknowledge that the current model is only a limited proof of concept. If additional data were unavailable, the critique should state clearly that the conclusions must remain narrow: the lab may show that certain variables help distinguish the two cities in this dataset, but it does not establish that those variables are the most important drivers of market differences in practice.
The Week 11 lab explains model comparison through likelihood, deviance, and information criteria, all of which are statistically important. From a business standpoint, however, these measures are incomplete, since they do not directly answer a practical question: how often does the model correctly classify apartments?
In a real business context, the audience would likely care about evaluation metrics such as accuracy, sensitivity, specificity, precision, and possibly the ROC curve or AUC. These measures help translate the model into practical decision terms. For example, if the model is used to guide pricing or investment strategy, stakeholders need to know not only that one model has a lower AIC than another, but also whether the classification errors are frequent, systematic, or costly.
This issue becomes even more important if the classes are imbalanced or if one type of error is more costly than another. Misclassifying a San Francisco apartment as a New York apartment may have different business consequences than the reverse. Therefore, an improved analysis would evaluate classification quality more directly and would discuss the practical implications of different kinds of error.
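A minimal sketch of that fuller evaluation, computing a confusion matrix, accuracy, sensitivity, specificity, precision, and AUC on a held-out split of simulated data:

```python
# Minimal sketch: practical classification metrics on a held-out split.
# All data are simulated; feature names are illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 1000
X = np.column_stack([
    rng.uniform(0, 80, n),      # elevation
    rng.normal(1000, 250, n),   # square footage
])
p = 1 / (1 + np.exp(-(-2 + 0.04 * X[:, 0] + 0.001 * (X[:, 1] - 1000))))
y = rng.binomial(1, p)

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

# confusion_matrix orders cells as [[tn, fp], [fn, tp]] for labels {0, 1}.
tn, fp, fn, tp = confusion_matrix(y_te, clf.predict(X_te)).ravel()
print(f"accuracy:    {(tp + tn) / len(y_te):.3f}")
print(f"sensitivity: {tp / (tp + fn):.3f}")  # true-positive rate (recall)
print(f"specificity: {tn / (tn + fp):.3f}")  # true-negative rate
print(f"precision:   {tp / (tp + fp):.3f}")
print(f"AUC:         {roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]):.3f}")
```

If one direction of error is costlier than the other, the default 0.5 probability threshold behind predict can also be shifted to trade sensitivity against specificity.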
Generalized linear models often produce coefficients that are mathematically interpretable, but that does not automatically make them easy for stakeholders to understand. In a logistic-regression setting, coefficients are estimated on the log-odds scale; even exponentiating a coefficient to obtain an odds ratio, while more accessible, is still routinely misread as a change in probability.
If a model is presented to a pricing strategy team or investment group, then the analysis should move beyond formal coefficient significance and help explain what the coefficients mean in more accessible terms. For example, it may be more useful to discuss how changes in a variable affect the estimated probability that a listing belongs to one city rather than the other. Without that step, the results risk being technically correct but practically difficult to use.
A stronger version of the analysis would translate the model into interpretable quantities such as predicted probabilities, marginal effects, or example scenarios. This would improve communication and make the model more appropriate for decision support.
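The sketch below illustrates both steps on simulated data with illustrative variable names: coefficients are exponentiated into odds ratios, and predicted probabilities are computed for two hypothetical example listings:

```python
# Minimal sketch: from log-odds coefficients to odds ratios and
# predicted probabilities. Simulated data, illustrative variable names.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "elevation": rng.uniform(0, 80, n),
    "sqft": rng.normal(1000, 250, n),
})
logit = -2 + 0.04 * df["elevation"] + 0.001 * (df["sqft"] - 1000)
df["in_sf"] = rng.binomial(1, 1 / (1 + np.exp(-logit)))

m = smf.logit("in_sf ~ elevation + sqft", data=df).fit(disp=False)

# exp(coefficient) is an odds ratio: the multiplicative change in the
# odds of "San Francisco" per one-unit increase in the predictor.
print(np.exp(m.params).round(3))

# Predicted probabilities for two hypothetical example listings.
examples = pd.DataFrame({"elevation": [5, 60], "sqft": [900, 900]})
print(m.predict(examples).round(3))
```

A statement such as "the model gives these two listings very different probabilities of being in San Francisco" is far easier for a pricing team to act on than a table of log-odds coefficients.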
The lab is strong conceptually, but in a business setting, visual communication matters just as much as mathematical explanation. Some of the analytical ideas in the Week 11 lab, such as comparing candidate models or understanding the contribution of explanatory variables, would benefit from clearer visual support.
For instance, predicted probability curves, class-separation plots, confusion-matrix-style summaries, or side-by-side visual comparisons of model performance would make the analysis easier to interpret. Better visualizations could also help identify whether the model performs well across the full range of the predictors or only in limited parts of the data.
This matters since business audiences rarely act on model summaries alone. They are more likely to trust an analysis when they can see how the model behaves and where its strengths or weaknesses appear.
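As one possible form of that support, the sketch below draws a predicted-probability curve and an ROC curve for a one-predictor model fit to simulated data (matplotlib and scikit-learn assumed available):

```python
# Minimal sketch: a predicted-probability curve and an ROC curve for a
# one-predictor logistic model. Simulated data, illustrative names.
import matplotlib.pyplot as plt
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import RocCurveDisplay

rng = np.random.default_rng(0)
elevation = rng.uniform(0, 80, 1000).reshape(-1, 1)
y = rng.binomial(1, 1 / (1 + np.exp(-(-2 + 0.06 * elevation[:, 0]))))

clf = LogisticRegression().fit(elevation, y)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(9, 4))
grid = np.linspace(0, 80, 200).reshape(-1, 1)
ax1.plot(grid, clf.predict_proba(grid)[:, 1])
ax1.scatter(elevation, y, s=5, alpha=0.2)  # raw 0/1 labels for context
ax1.set(xlabel="elevation", ylabel="P(San Francisco)",
        title="Predicted probability curve")
RocCurveDisplay.from_estimator(clf, elevation, y, ax=ax2)
ax2.set(title="ROC curve")
plt.tight_layout()
plt.show()
```

The left panel makes visible where along the predictor range the model is confident, while the ROC curve summarizes the trade-off between true-positive and false-positive rates across all thresholds.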
Based on the critiques above, five improved analyses would strengthen the Week 11 lab for the proposed business scenario:
Out-of-sample model validation
Evaluate the model on held-out data or through cross-validation in order to determine whether the model generalizes beyond the original sample.
Expanded model evaluation metrics
Supplement deviance, AIC, and BIC with practical classification measures such as accuracy, sensitivity, specificity, precision, and ROC/AUC-style evaluation.
Improved variable strategy
Reassess whether the selected explanatory variables are sufficient, and if possible, incorporate additional housing and location features that better reflect real market structure.
Probability-based interpretation
Translate coefficient results into predicted probabilities or practical case examples so that decision-makers can understand the model more clearly.
Improved visual communication
Use clearer visualizations to show model behavior, classification quality, and where the model performs well or poorly.
Overall, the Week 11 lab provides a strong classroom introduction to generalized linear model comparison, but it is not yet sufficient for real business deployment without stronger validation, broader variable consideration, more practical evaluation metrics, and clearer stakeholder-oriented interpretation.
In a real business setting, the Week 11 lab would require not only stronger statistical analysis but also a careful examination of the ethical and epistemological limits of the modeling process. Even if a model performs well numerically, it may still produce misleading, biased, or socially harmful conclusions if the data and assumptions behind it are not questioned.
One major ethical concern is that a simplified model may encourage decision-makers to believe that a small set of measurable apartment features fully explains differences between San Francisco and New York housing markets. In reality, housing markets are shaped by a much broader set of structural, geographic, historical, and social factors.
If the model relies too heavily on limited predictors such as bedrooms or elevation, it may produce an overly narrow narrative about what “drives” market differences. This matters ethically since simplified models can influence real decisions about investment, pricing, and neighborhood desirability. If those decisions are based on incomplete information, the model may reinforce distorted market views rather than provide responsible guidance.
In a classroom setting, classification errors may seem purely technical. In practice, however, misclassification can affect real stakeholders. A model that incorrectly categorizes apartments or overstates confidence in its predictions could influence client advice, pricing strategy, or investment decisions in ways that create financial or reputational harm.
This is especially important when stakeholders interpret the model as more certain than it really is. If a firm presents the analysis as authoritative without explaining uncertainty, then clients may make high-stakes decisions on the basis of a model that was originally built as a simplified instructional example.
From an epistemological perspective, one key issue is the difference between modeling a pattern and understanding a phenomenon. The Week 11 lab may identify statistical relationships in the observed data, but that does not mean the model fully captures the true structure of urban housing markets.
In other words, the model can provide partial knowledge, but not complete knowledge. It identifies regularities in the available sample under a particular set of assumptions. It does not prove that those regularities are universal, causal, or stable across time and place. This limitation is important since business users may mistake predictive success for genuine explanatory understanding.
Another epistemological issue is that the analysis focuses on variables that are available and quantifiable. However, the most measurable features are not always the most meaningful. Factors such as neighborhood reputation, long-term urban policy, housing discrimination, or informal market dynamics may be highly influential even if they are not included in the dataset.
This creates a risk of measurement bias, where the analysis privileges what is easy to quantify rather than what is most relevant to the actual problem. As a result, the model may appear more objective than it really is.
A final epistemological issue is that whether a model is considered “good” depends on the purpose for which it is used. A model that is acceptable for illustrating GLM concepts in a classroom may be inadequate for real estate investment decisions. This means that knowledge claims about the model are always conditional on context.
For this reason, the business scenario matters. The model should not be judged only by whether it can produce a statistically valid output, but also by whether it produces the right kind of knowledge for the decision at hand. If the goal is high-stakes strategic decision-making, then the threshold for credibility must be much higher than it would be for a classroom demonstration.
Overall, the Week 11 lab is useful as an educational introduction to generalized linear models, but it would require much greater care before being used in a real business setting. Ethical concerns arise from omitted context, possible reinforcement of structural bias, and the real-world consequences of model error. Epistemological concerns arise from the fact that a statistical model captures only a limited version of reality and may privilege what is measurable over what is most meaningful.
Therefore, a responsible critique of this lab must go beyond technical model quality and also consider who is affected, what assumptions are being made, what kinds of knowledge the model can actually provide, and where its conclusions may be incomplete or potentially misleading.