Welcome to ElectraGrid, a national power company responsible for supplying electricity to thousands of homes and businesses spread across a huge region. You have just joined the data science team, and today you’ve been assigned your first technical challenge. ElectraGrid tracks the locations of all its customers, and it also monitors the position of each power hub. Your role is to help the engineering department understand how far customers are from the main hub, whether any of them fall outside the guaranteed service area, and how the overall spread of customers looks from a distance.

The only rule: you must use NumPy and avoid writing loops wherever possible.

This is how real data science teams work—fast, efficient, and vectorised.

Getting Set Up

Begin by importing NumPy and setting the location of the company’s main power hub:

import numpy as np
hub = np.array([0, 0])

To simulate the thousands of customers in the region, generate one thousand random 2D coordinates:

np.random.seed(42)  # Set the seed for reproducibility
customers = np.random.randn(1000, 2) * 20

The expression np.random.randn(1000, 2) * 20 creates 1,000 customer locations in two steps. It first draws 2-dimensional points from a standard normal distribution, where each coordinate has mean 0 and standard deviation 1, so about 95% of values fall between -2 and 2. It then multiplies them by 20, which gives each coordinate a standard deviation of 20. Without this scaling, almost every customer would sit within three or four units of the hub at (0, 0), so all distances would be tiny and the analysis uninteresting. Multiplying by 20 effectively “zooms out” the map so that customers occupy a wider region, with meaningful differences in distance, clearer patterns, and a more believable scenario for analysing coverage and service range.
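As a quick sanity check on the scaling described above, the sketch below regenerates the data and inspects its shape and per-axis spread. The variable name spread is illustrative and not part of the original setup.

```python
import numpy as np

np.random.seed(42)  # same seed as the setup, so the data matches
customers = np.random.randn(1000, 2) * 20

# One row per customer, one column per coordinate (x, y)
print("shape:", customers.shape)

# Sample standard deviation along each axis; should be near 20 after scaling
spread = customers.std(axis=0)
print("std per axis:", spread)
```

If the standard deviations come out near 1 rather than 20, the scaling step was probably skipped.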

You now have everything you need to begin the analysis.

1. Compute Customer Distances

Your first responsibility is to understand the geography. ElectraGrid needs to know how far each customer lies from the main hub at (0, 0). Using NumPy, compute the Euclidean distance from every customer to the hub.

Once you’ve done this, identify:

  • the customer who is closest to the hub, and how far away they are
  • the customer who is farthest from the hub, and their distance

This helps the planning team understand how stretched the current grid might be.
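One possible way to approach this task, sketched under the setup above: subtract the hub from every row using broadcasting, then take the row-wise norm with np.linalg.norm. The names distances, closest_idx and farthest_idx are illustrative choices.

```python
import numpy as np

hub = np.array([0, 0])
np.random.seed(42)
customers = np.random.randn(1000, 2) * 20

# Broadcasting subtracts the hub from every row; axis=1 gives one norm per customer
distances = np.linalg.norm(customers - hub, axis=1)

# argmin / argmax return the index of the nearest and farthest customer
closest_idx = np.argmin(distances)
farthest_idx = np.argmax(distances)

print(f"Closest: customer {closest_idx} at distance {distances[closest_idx]:.2f}")
print(f"Farthest: customer {farthest_idx} at distance {distances[farthest_idx]:.2f}")
```

Writing customers - hub rather than relying on the hub being at the origin keeps the same code valid if the hub is later moved.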

2. Identify Customers Outside Service Area

ElectraGrid promises reliable service for anyone living within 50 units of the main hub. Your job now is to identify the customers who fall outside this guaranteed range. Use NumPy’s Boolean masking (no loops) to:

  • pick out customers beyond 50 units
  • count how many of them there are
  • print the indices of these customers

These are the customers most likely to experience issues—and who may need additional support.
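A minimal sketch of the masking step, assuming the same setup and distance calculation as before. The names service_radius, outside and outside_idx are illustrative.

```python
import numpy as np

hub = np.array([0, 0])
np.random.seed(42)
customers = np.random.randn(1000, 2) * 20
distances = np.linalg.norm(customers - hub, axis=1)

service_radius = 50

# Boolean mask: True where a customer lies beyond the guaranteed range
outside = distances > service_radius

# The mask selects rows directly; True counts as 1 when summed
outside_customers = customers[outside]
n_outside = outside.sum()

# flatnonzero returns the positions of the True entries
outside_idx = np.flatnonzero(outside)

print("Customers outside service area:", n_outside)
print("Their indices:", outside_idx)
```

np.where(outside)[0] would give the same indices; np.flatnonzero is used here only because it returns the array directly.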

3. Nearest Customers to the Hub

Engineers are preparing routine maintenance near the main hub and want to know which customers lie closest. Use np.argsort() to sort customers by their distance from the hub, and then:

  • extract the indices of the ten nearest customers
  • display their coordinates

This gives the field team a quick snapshot of the immediate surrounding area.
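The sorting step might look like the following sketch, again assuming the earlier setup. np.argsort returns the indices that would sort the distances in ascending order, so the first ten entries identify the ten nearest customers. The names order and nearest_idx are illustrative.

```python
import numpy as np

hub = np.array([0, 0])
np.random.seed(42)
customers = np.random.randn(1000, 2) * 20
distances = np.linalg.norm(customers - hub, axis=1)

# Indices of customers ordered from nearest to farthest
order = np.argsort(distances)

# First ten indices are the ten nearest customers
nearest_idx = order[:10]
nearest_coords = customers[nearest_idx]

print("Indices of the ten nearest customers:", nearest_idx)
print("Their coordinates:")
print(nearest_coords)
```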

4. Distance Statistics

If you have time, step back and consider the broader picture. Using the distance values you calculated earlier, work out some basic statistics:

  • mean
  • median
  • minimum
  • maximum
  • standard deviation

If you want to go further, create a histogram to visualise the distribution. Finish with one sentence describing what this spread tells you about ElectraGrid’s customers.
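The statistics above can be sketched as follows, assuming the same setup. To keep the example free of plotting dependencies, the histogram is computed with np.histogram rather than drawn; the choice of 10 bins and the dictionary name stats are arbitrary assumptions.

```python
import numpy as np

hub = np.array([0, 0])
np.random.seed(42)
customers = np.random.randn(1000, 2) * 20
distances = np.linalg.norm(customers - hub, axis=1)

# Summary statistics of the distance distribution
stats = {
    "mean": distances.mean(),
    "median": np.median(distances),
    "min": distances.min(),
    "max": distances.max(),
    "std": distances.std(),
}
for name, value in stats.items():
    print(f"{name:>6}: {value:.2f}")

# Bin counts and bin edges; 10 bins is an arbitrary choice
counts, edges = np.histogram(distances, bins=10)
for count, left, right in zip(counts, edges[:-1], edges[1:]):
    print(f"{left:6.1f} to {right:6.1f}: {count}")
```

To draw the histogram instead of printing it, matplotlib's plt.hist(distances, bins=10) takes the same inputs.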

5. Cosine Similarity Between Customers

Sometimes ElectraGrid needs to understand not only distance, but direction—for example, when planning expansions. Take customer 0 and customer 1. Treat both as direction vectors starting from the hub.

  1. subtract the hub coordinates to get the two direction vectors
  2. convert each to a unit vector
  3. compute the cosine similarity between them

Write one sentence explaining what a high cosine similarity would mean here. (Hint: think about the paths from the hub to each customer.)
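The three steps above can be sketched as follows, assuming the earlier setup. The names v0, v1, u0, u1 and cos_sim are illustrative.

```python
import numpy as np

hub = np.array([0, 0])
np.random.seed(42)
customers = np.random.randn(1000, 2) * 20

# Step 1: direction vectors from the hub to customers 0 and 1
v0 = customers[0] - hub
v1 = customers[1] - hub

# Step 2: divide each vector by its length to get a unit vector
u0 = v0 / np.linalg.norm(v0)
u1 = v1 / np.linalg.norm(v1)

# Step 3: the dot product of unit vectors is the cosine of the angle between them
cos_sim = np.dot(u0, u1)

print(f"Cosine similarity between customer 0 and customer 1: {cos_sim:.3f}")
```

The result always lies between -1 and 1: values near 1 indicate nearly the same direction from the hub, values near 0 roughly perpendicular directions, and values near -1 opposite directions.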

6. Visualising Customer Distribution

If you want to visualise your work, create a scatter plot:

  • plot every customer as a small point
  • mark the hub in a different colour
  • show customers outside the service radius in red
  • draw a circle representing the 50-unit boundary

This offers a helpful “map” of how ElectraGrid’s customers are arranged.
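One way to build this map is sketched below. It assumes matplotlib is installed, which the exercise has not required so far, and the colours, marker sizes and labels are arbitrary choices.

```python
import numpy as np
import matplotlib.pyplot as plt

hub = np.array([0, 0])
np.random.seed(42)
customers = np.random.randn(1000, 2) * 20

radius = 50
distances = np.linalg.norm(customers - hub, axis=1)
outside = distances > radius

fig, ax = plt.subplots(figsize=(7, 7))

# Customers inside the service area as small blue points
ax.scatter(customers[~outside, 0], customers[~outside, 1],
           s=8, color="tab:blue", label="within service area")

# Customers beyond the service radius in red
ax.scatter(customers[outside, 0], customers[outside, 1],
           s=8, color="red", label="outside service area")

# The hub in a different colour and marker, drawn on top
ax.scatter(hub[0], hub[1], s=120, color="black", marker="*",
           label="hub", zorder=3)

# Unfilled circle marking the 50-unit boundary
boundary = plt.Circle(hub, radius, fill=False, linestyle="--", color="gray")
ax.add_patch(boundary)

ax.set_aspect("equal")  # equal axis scales so the circle is not distorted
ax.set_title("ElectraGrid customers and service boundary")
ax.legend()
# plt.show()  # uncomment when running interactively
```

Setting the aspect ratio to "equal" matters here: without it the boundary is drawn as an ellipse whenever the two axes have different scales.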

Bonus: Introducing a Second Hub

ElectraGrid has plans to expand by adding a second hub at (20, 10). Your final challenge is to decide which hub each customer should belong to.

hub2 = np.array([20, 10])

Calculate how far each customer is from both hubs and assign them to whichever one is closer. If you choose to plot this, colour customers by the hub they end up assigned to.
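A possible loop-free approach, sketched under the setup above, is to stack both hubs into one array and let broadcasting produce a 1000 × 2 table of distances. The names hubs, dists and assignment are illustrative.

```python
import numpy as np

hub = np.array([0, 0])
hub2 = np.array([20, 10])
np.random.seed(42)
customers = np.random.randn(1000, 2) * 20

# Stack the hubs: shape (2, 2), one row per hub
hubs = np.stack([hub, hub2])

# (1000, 1, 2) - (1, 2, 2) broadcasts to (1000, 2, 2):
# the difference between every customer and every hub
diffs = customers[:, np.newaxis, :] - hubs[np.newaxis, :, :]

# Norm over the coordinate axis gives a (1000, 2) distance table
dists = np.linalg.norm(diffs, axis=2)

# 0 means the original hub is closer, 1 means the second hub is closer
assignment = np.argmin(dists, axis=1)

counts = np.bincount(assignment, minlength=2)
print("Customers assigned to hub 1:", counts[0])
print("Customers assigned to hub 2:", counts[1])
```

The assignment array can be passed directly as the c argument of a matplotlib scatter call to colour customers by hub. The same code extends to any number of hubs by adding rows to hubs.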
