Welcome to ElectraGrid

Welcome to ElectraGrid, a national power company responsible for supplying electricity to thousands of homes and businesses spread across a huge region. You have just joined the data science team, and today you’ve been assigned your first technical challenge. ElectraGrid tracks the locations of all its customers, and it also monitors the position of each power hub. Your role is to help the engineering department understand how far customers are from the hub, whether any of them fall outside the guaranteed service area, and how the overall spread of customers looks from a distance.

The only rule: you must use NumPy and avoid writing loops wherever possible. Vectorised NumPy operations run in compiled code, so they are typically much faster and more concise than equivalent Python loops.
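To make the rule concrete, here is a minimal sketch that computes each point's distance from a hub twice, once with a plain Python loop and once with a single vectorised call. The `points` and `hub_demo` arrays are small made-up values chosen for illustration; they are not ElectraGrid data.

```python
import numpy as np

# Hypothetical toy data (not from the tutorial): three points and a hub at the origin
points = np.array([[3.0, 4.0], [6.0, 8.0], [0.0, 1.0]])
hub_demo = np.array([0.0, 0.0])

# Loop version: one Python-level iteration per point
loop_distances = []
for x, y in points:
    dx, dy = x - hub_demo[0], y - hub_demo[1]
    loop_distances.append((dx ** 2 + dy ** 2) ** 0.5)

# Vectorised version: broadcasting subtracts the hub from every row at once,
# and axis=1 computes one Euclidean norm per row
vec_distances = np.linalg.norm(points - hub_demo, axis=1)

print(loop_distances)
print(vec_distances)
```

Both versions give distances of 5, 10, and 1 for these points. The vectorised form is the pattern the rest of this exercise relies on.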

Getting Set Up

Begin by importing NumPy and setting the location of the company’s main power hub:

import numpy as np

# Set hub location
hub = np.array([0, 0])

To simulate the thousands of customers in the region, generate one thousand random 2D coordinates:

np.random.seed(42)  # Set the seed for reproducibility
customers = np.random.randn(1000, 2) * 20

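The expression above creates 1,000 customer locations by drawing random 2-dimensional points from a standard normal distribution and multiplying them by 20. Each coordinate therefore has mean 0 and standard deviation 20, which clusters the customers around the hub at the origin while spreading them across a larger, more realistic area.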
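As a quick check on the scaling step, the sketch below regenerates the same array and inspects its shape and per-axis spread. Multiplying standard normal draws by 20 should give each coordinate a sample standard deviation close to 20. The exact sample values depend on the seed and NumPy version, so none are quoted here.

```python
import numpy as np

np.random.seed(42)  # same seed as in the tutorial
customers = np.random.randn(1000, 2) * 20

print(customers.shape)         # (1000, 2): one row per customer, columns are x and y
print(customers.mean(axis=0))  # per-axis sample mean, expected to be near 0
print(customers.std(axis=0))   # per-axis sample standard deviation, expected to be near 20
```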

1. Compute Distances from the Hub

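# customers - hub broadcasts the (2,) hub across all 1000 rows;
# axis=1 takes the Euclidean norm of each row
distances = np.linalg.norm(customers - hub, axis=1)
nearest_index = np.argmin(distances)    # index of the closest customer
farthest_index = np.argmax(distances)   # index of the most distant customer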

print("Nearest customer index:", nearest_index, "Distance:", distances[nearest_index])
## Nearest customer index: 328 Distance: 0.5571516853645208
print("Farthest customer index:", farthest_index, "Distance:", distances[farthest_index])
## Farthest customer index: 104 Distance: 77.74011592029372

2. Service Radius Check

service_radius = 50
outside_mask = distances > service_radius    # boolean array, True where a customer is out of range
outside_indices = np.where(outside_mask)[0]  # integer indices of the True entries

print("Number of customers outside service radius:", len(outside_indices))
## Number of customers outside service radius: 40
print("Indices of customers outside:", outside_indices)
## Indices of customers outside: [ 37  89 104 110 131 142 210 239 272 323 327 334 354 377 381 440 466 530
##  550 580 582 616 645 673 677 726 745 747 769 781 795 797 807 824 875 955
##  963 967 978 985]
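A boolean mask can often be used directly, without converting it to indices first. The following sketch, on a small made-up `demo_distances` array (hypothetical values, not the tutorial's data), shows counting, taking a fraction, and filtering with the mask.

```python
import numpy as np

# Hypothetical distances for five customers
demo_distances = np.array([12.0, 55.0, 49.9, 80.0, 50.0])
demo_radius = 50

# Strict comparison: a customer exactly on the radius counts as inside
mask = demo_distances > demo_radius

n_outside = np.count_nonzero(mask)      # number of True entries
frac_outside = mask.mean()              # True counts as 1, so the mean is a fraction
outside_values = demo_distances[mask]   # boolean indexing keeps only the True positions

print(n_outside, frac_outside, outside_values)
```

Here two of the five customers (distances 55 and 80) lie outside the radius, a fraction of 0.4.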

3. Top 10 Nearest Customers

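# argsort returns indices ordered by increasing distance; keep the first 10
nearest_10_indices = np.argsort(distances)[:10]
nearest_10_coords = customers[nearest_10_indices]  # integer-array indexing selects those rows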

print("Indices of 10 nearest customers:", nearest_10_indices)
## Indices of 10 nearest customers: [328 653 399 201 602 545 914 152 586 513]
print("Coordinates of 10 nearest customers:\n", nearest_10_coords)
## Coordinates of 10 nearest customers:
##  [[ 0.27858584 -0.48250174]
##  [ 0.56362315 -0.18237993]
##  [ 0.56636752  0.59512279]
##  [ 0.10487399  0.93961188]
##  [ 0.97720141  0.81183382]
##  [-1.43202519 -0.74444473]
##  [-1.97176265  0.37699246]
##  [-0.41803188  2.34654767]
##  [ 2.08402208 -1.25186256]
##  [ 0.76006956  2.40062653]]
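A full sort does more work than needed when only the few smallest values matter. One alternative is `np.argpartition`, which finds the k smallest entries without ordering the rest. The sketch below uses made-up random distances and a hypothetical helper name, `k_nearest`, and checks the result against the `argsort` approach.

```python
import numpy as np

def k_nearest(dist, k):
    """Return indices of the k smallest values in dist, ordered nearest first."""
    # argpartition places the k smallest values in the first k slots, in arbitrary order
    candidates = np.argpartition(dist, k - 1)[:k]
    # Sort only those k candidates by their distance
    return candidates[np.argsort(dist[candidates])]

rng = np.random.default_rng(0)  # illustrative data, not the tutorial's customers
demo_distances = rng.uniform(0, 100, size=1000)

top5 = k_nearest(demo_distances, 5)
print(top5)
print(np.array_equal(top5, np.argsort(demo_distances)[:5]))
```

The two approaches should agree whenever the distances contain no ties. For 1,000 customers the difference in speed is negligible, so this mainly matters for much larger arrays.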

4. Basic Distance Statistics

print("Mean distance:", np.mean(distances))
## Mean distance: 24.741228692022887
print("Median distance:", np.median(distances))
## Median distance: 22.96472377994813
print("Minimum distance:", np.min(distances))
## Minimum distance: 0.5571516853645208
print("Maximum distance:", np.max(distances))
## Maximum distance: 77.74011592029372
print("Standard deviation:", np.std(distances))
## Standard deviation: 13.067452958545584
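These figures can be sanity-checked against theory. Because each coordinate is drawn independently from a normal distribution with standard deviation 20 and the hub sits at the origin, the distances follow a Rayleigh distribution with scale sigma = 20. The sketch below is an addition to the exercise that uses the standard Rayleigh formulas to compare simulated and theoretical values. The two should be close but not identical for a sample of 1,000.

```python
import numpy as np

sigma = 20.0
service_radius = 50

np.random.seed(42)  # same seed and setup as the tutorial
customers = np.random.randn(1000, 2) * sigma
distances = np.linalg.norm(customers, axis=1)  # hub is at the origin

# Standard results for a Rayleigh distribution with scale sigma
theory_mean = sigma * np.sqrt(np.pi / 2)            # about 25.07
theory_median = sigma * np.sqrt(2 * np.log(2))      # about 23.55
theory_std = sigma * np.sqrt((4 - np.pi) / 2)       # about 13.10
theory_frac_outside = np.exp(-service_radius ** 2 / (2 * sigma ** 2))  # about 0.044

print("mean:            ", distances.mean(), "vs", theory_mean)
print("median:          ", np.median(distances), "vs", theory_median)
print("std:             ", distances.std(), "vs", theory_std)
print("fraction outside:", (distances > service_radius).mean(), "vs", theory_frac_outside)
```

A theoretical fraction of roughly 0.044 corresponds to about 44 customers in 1,000, which is in line with the 40 found in the service radius check.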

Optional: histogram

import matplotlib.pyplot as plt

plt.hist(distances, bins=30, edgecolor='k')
plt.xlabel("Distance from hub")
plt.ylabel("Number of customers")
plt.title("Distribution of Customer Distances")
plt.show()

5. Directional Analysis (Cosine Similarity)

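# Direction vectors from the hub to customers 0 and 1
vec0 = customers[0] - hub
vec1 = customers[1] - hub

# Normalise to unit length so the dot product equals the cosine of the angle between them
unit_vec0 = vec0 / np.linalg.norm(vec0)
unit_vec1 = vec1 / np.linalg.norm(vec1)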

cos_similarity = np.dot(unit_vec0, unit_vec1)
print("Cosine similarity between customer 0 and 1:", cos_similarity)
## Cosine similarity between customer 0 and 1: 0.13023721636004673
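The same calculation extends to many customers at once. Here is a minimal sketch, using made-up `vectors` (hypothetical values, not the tutorial's customers), that compares every direction with the first one in a single matrix-vector product. It assumes no vector has zero length, since that would cause a division by zero.

```python
import numpy as np

# Hypothetical direction vectors (one per row), measured from a hub at the origin
vectors = np.array([[1.0, 0.0], [0.0, 2.0], [-3.0, 0.0], [1.0, 1.0]])

# Divide each row by its own length; keepdims=True keeps shape (4, 1) so it broadcasts
norms = np.linalg.norm(vectors, axis=1, keepdims=True)
unit_vectors = vectors / norms

# Dot product of every unit vector with the first one
similarities = unit_vectors @ unit_vectors[0]

print(similarities)
```

The result is 1 for the same direction, 0 for a perpendicular one, -1 for the opposite direction, and about 0.707 for a 45 degree angle.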

6. Visualisation of Customers

import matplotlib.pyplot as plt  # needed here if you skipped the optional histogram

plt.figure(figsize=(8, 8))
plt.scatter(customers[:, 0], customers[:, 1], s=10, label='Customers')
plt.scatter(hub[0], hub[1], color='green', s=100, label='Hub')
plt.scatter(customers[outside_mask, 0], customers[outside_mask, 1], color='red', s=10, label='Outside Service Radius')

circle = plt.Circle((hub[0], hub[1]), service_radius, color='blue', fill=False, linestyle='--', label='Service Radius')
plt.gca().add_patch(circle)

plt.xlabel("X coordinate")
plt.ylabel("Y coordinate")
plt.title("ElectraGrid Customer Locations")
plt.legend()
plt.axis('equal')
## (np.float64(-71.14549494990577), np.float64(67.89776431762907), np.float64(-65.6008928186046), np.float64(83.84774993841343))
plt.show()

Bonus: Second Hub Assignment

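hub2 = np.array([20, 10])  # location of the second hub

dist_to_hub2 = np.linalg.norm(customers - hub2, axis=1)
# Assign each customer to the closer hub; ties go to hub 1 because of <=
nearest_hub = np.where(distances <= dist_to_hub2, 1, 2)  # 1 for hub1, 2 for hub2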

print("Customer hub assignments (1=Hub1, 2=Hub2):", nearest_hub)
## Customer hub assignments (1=Hub1, 2=Hub2): [1 2 1 2 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 2 1 2 1 1 2 1 1 2 2
##  1 1 1 1 2 1 2 1 1 1 1 1 1 1 1 1 2 1 1 2 1 1 2 1 2 2 1 1 1 1 1 1 1 2 1 2 1
##  2 1 1 1 2 1 1 2 2 2 1 1 1 2 2 2 1 1 1 1 1 1 1 1 1 1 2 2 1 2 2 2 2 1 1 1 2
##  1 1 1 1 1 1 2 1 1 1 2 1 1 2 1 2 1 2 2 1 1 1 2 1 2 2 1 1 1 1 2 2 1 1 1 1 1
##  2 2 1 2 1 2 1 1 2 2 1 1 1 1 1 2 2 1 1 2 1 1 1 1 1 1 1 2 2 1 1 1 2 1 1 1 1
##  1 2 2 2 2 1 1 1 2 2 1 1 2 1 1 1 1 1 1 1 1 1 1 2 2 2 1 2 1 1 1 1 2 2 1 1 1
##  1 1 1 1 1 1 1 1 2 1 1 2 1 1 1 2 2 2 1 1 1 1 2 2 1 2 1 1 2 1 1 1 1 1 1 1 1
##  1 1 1 1 1 1 1 1 1 1 2 1 2 1 1 1 1 2 1 1 1 2 2 2 1 1 2 1 1 1 1 1 2 1 1 1 1
##  2 2 1 1 1 2 2 1 1 2 1 2 1 1 1 2 2 1 1 1 1 1 2 1 1 1 1 1 1 2 1 2 1 1 1 1 1
##  1 1 1 1 1 1 2 1 1 1 1 2 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2
##  1 1 1 1 1 1 1 1 1 1 1 2 2 2 1 1 2 1 1 1 1 2 2 2 1 1 1 1 1 1 2 1 1 1 2 1 2
##  1 2 1 1 2 1 2 1 1 1 1 2 2 1 1 1 2 1 1 1 2 1 1 1 2 1 1 2 1 1 2 1 1 2 1 2 1
##  2 2 2 2 1 2 1 2 1 1 2 1 2 1 2 1 1 1 1 1 2 1 1 1 1 1 1 1 2 1 1 1 1 1 1 2 2
##  1 2 1 1 1 1 1 1 1 2 1 1 2 1 1 1 1 2 1 2 1 2 2 2 2 2 2 1 1 1 2 2 1 1 1 2 1
##  1 2 2 2 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 2 1 2 1 2 1 2 1 1 2 2 1 2 1 2 1 1 1
##  1 2 1 2 2 2 1 2 1 1 1 2 1 1 2 1 1 1 1 2 1 1 2 1 2 1 1 2 2 2 2 1 1 1 1 2 1
##  1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 2 1 1 2 1 2 1 2 2 1 1 1 1 1 2 1 1 2
##  1 1 1 1 1 1 2 1 2 2 1 1 2 1 1 1 2 1 1 2 1 1 1 1 1 2 1 2 1 1 1 1 1 1 1 1 2
##  1 1 1 1 1 2 1 1 1 1 2 1 1 1 1 1 2 1 1 2 1 2 1 1 1 2 2 2 1 1 2 1 1 2 1 2 1
##  1 1 2 2 1 1 2 2 1 1 2 1 2 1 1 1 2 2 1 1 1 2 2 2 1 2 2 1 1 1 2 1 2 1 1 1 1
##  1 2 2 1 1 2 1 2 2 2 1 1 1 2 2 1 1 1 2 1 2 1 1 2 1 1 1 1 1 1 2 2 1 1 2 1 1
##  1 1 2 1 1 1 1 1 1 1 1 1 1 2 1 2 1 1 1 2 1 2 1 1 1 1 2 1 1 2 2 1 1 1 1 2 1
##  1 1 1 1 1 2 2 1 1 1 2 2 1 1 2 1 2 1 2 1 2 1 1 2 2 2 1 1 1 1 1 1 1 2 1 1 1
##  2 1 2 1 2 1 1 1 1 1 2 1 1 2 1 1 1 1 2 1 2 1 2 2 1 1 2 1 2 2 1 1 2 1 2 1 1
##  1 1 1 1 1 1 2 1 1 1 1 2 2 1 1 1 1 1 2 1 1 1 1 1 1 1 1 2 2 1 1 1 1 1 1 2 2
##  1 1 2 2 1 1 1 1 2 1 1 1 2 1 1 2 1 1 1 1 1 1 1 1 1 1 2 1 1 1 2 2 1 1 1 1 1
##  2 1 2 1 1 2 1 2 1 1 1 1 2 1 1 1 1 2 1 2 1 1 1 1 2 1 2 1 1 1 2 2 1 2 2 2 1
##  1]
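Printing a thousand labels is hard to read, and `np.where` with two branches does not extend beyond two hubs. One possible generalisation, sketched below on made-up toy data (`demo_customers` and `hubs` are hypothetical), builds a customer-by-hub distance matrix with broadcasting, picks the nearest hub with `argmin`, and summarises the result with `np.bincount`.

```python
import numpy as np

# Hypothetical toy data: four customers and two hubs
demo_customers = np.array([[1.0, 1.0], [19.0, 9.0], [10.0, 5.0], [30.0, 30.0]])
hubs = np.array([[0.0, 0.0], [20.0, 10.0]])

# Shapes (4, 1, 2) and (1, 2, 2) broadcast to (4, 2, 2): every customer paired with every hub
diffs = demo_customers[:, None, :] - hubs[None, :, :]
dist_matrix = np.linalg.norm(diffs, axis=2)  # shape (4, 2)

# argmin returns the first minimum, so ties go to the lower-numbered hub
assignment = dist_matrix.argmin(axis=1) + 1  # 1-based hub labels

# Customers per hub; index 0 of bincount is unused because labels start at 1
counts = np.bincount(assignment, minlength=len(hubs) + 1)[1:]

print(assignment)
print(counts)
```

In this toy data the third customer is equidistant from both hubs and is assigned to hub 1, matching the `<=` tie rule used above. Adding a third hub only requires another row in `hubs`.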