*This is a rather long post describing the steps I took in implementing a small project in a domain I’m not especially familiar with, using libraries I’m not especially familiar with. Rather than showing off a polished finished product, it details the steps I took along the way, false starts and all. Nor do I claim that my final version is the best possible version; there are no doubt more efficient and clever ways to do what I did. But I wanted to illustrate my thought process, and how I learned from my mistakes, and I thought it might be useful for other learners to follow along with my journey.*

One of my weird perennial obsessions is geodesic polyhedra. You’ve seen them: EPCOT’s Spaceship Earth is one, as are countless dome-shaped structures. Buckminster Fuller popularized (but did not invent) them, and for a minute or so in the 1960s they were what The Future was supposed to look like, until people figured out they didn’t actually want to live inside a golf ball.

The basic idea is fairly simple: You start with a regular polyhedron whose faces are triangular. This can be either a tetrahedron, an octahedron, or an icosahedron. You can also cheat a little bit by starting with a dodecahedron or cube, and subdividing each face into triangles by drawing lines from the vertices to the center of each face. But my personal favorite (and probably the most common) is to start with an icosahedron. It is helpful to remember that every regular polyhedron has a circumscribed sphere: each vertex is the same distance `r` from a central point, so each vertex lies on a sphere of radius `r`.

The next step is to divide each triangle into some number of smaller equilateral triangles, a process known as *tessellation*. The vertices of these new triangles (apart from those that are shared with the original faces) will lie inside the circumscribed sphere. Thus we need to *project* or *normalize* them: Draw a line from the center of the circumscribed sphere through each triangle vertex, and note where it intersects the sphere. These new points define a new set of triangles that are no longer equilateral, but are nearly so: they’re the closest you can get to equilateral triangular faces on a non-Platonic solid. The more you subdivide each starting triangular face, the more sphere-like the resulting geodesic polyhedron. In fact, if you’re doing 3D graphics programming, a geodesic polyhedron makes a pretty good wireframe model for a sphere.
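The projection step is easy to express in code. Here’s a minimal sketch (the `project_to_sphere` helper is just for illustration, not part of the calculator we’ll build below): scaling a point by `r` divided by its distance from the origin pushes it onto the sphere of radius `r`.

```python
import math

def project_to_sphere(point, r=1.0):
    # Distance of the point from the origin (the sphere's center).
    d = math.sqrt(sum(c * c for c in point))
    # Scaling by r/d moves the point along the line from the origin
    # through the point until it lands on the sphere of radius r.
    return tuple(c * r / d for c in point)

# A midpoint of a subdivided edge lies inside the unit sphere...
p = project_to_sphere((0.5, 0.5, 0.0))
# ...but after projection it is back on the sphere's surface.
print(math.isclose(sum(c * c for c in p), 1.0))  # True
```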

It’s fairly easy to build a physical model of a geodesic polyhedron. You can cut out card stock in the shape of triangles and tape them together, or tie together coffee stirrers that have been cut to the proper length, or any number of other simple methods. The key is to figure out exactly how long each edge should be, and for that, you need a geodesic calculator. There are a number of these for free on the Web, and there have been since at least the mid-’90s; I know because they were among the first web sites I remember visiting. Some current examples are here and here. They allow you to specify the radius of the circumscribed sphere and get the lengths for the struts you need, or specify the length of an individual strut and get the lengths of the other struts and the radius of the sphere.

The problem with these sites is that they aren’t very versatile. Each one is designed for a particular frequency of dome (where *frequency* refers to the number of times each original edge is subdivided), and there don’t seem to be many around for frequencies greater than 3. But what if you wanted to get the lengths for an arbitrary frequency? What if you wanted to start with a different polyhedron than an icosahedron? I decided I should write my own geodesic calculator in Python that would let you do just that. I also decided that, to make it a true learning experience, I wouldn’t research how geodesic calculators had been implemented before. Instead, I would start with just the basic mathematical properties of geodesic polyhedra and figure out how to construct them myself.

Initially, I started by defining some classes to represent 3D vectors and perform basic arithmetic on them. But then I remembered that Python has these great libraries like numpy and pandas for dealing with linear algebra and *n*-dimensional data, so I decided to use them instead. Here’s the story of how I built my geodesic calculator, and the false steps I took along the way.

## Step 1: The Icosahedron

Since the icosahedron is the most popular starting point, let’s do that first. Our first step will be to create a dataframe containing all the points of a regular icosahedron. There’s an elegant way of doing this: we take all cyclic permutations of `(0, ±1, ±φ)` (where `φ` is the Golden Ratio). “Cyclic permutations” here means that we do not change the ordering of the elements, only which one comes first. So the cyclic permutations of the letters ABC would be ABC, BCA, and CAB.
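To make that concrete: a cyclic permutation is just a rotation of the sequence, which can be written with plain slicing. (This throwaway `rotations` helper is only for illustration; the calculator itself uses a generator-based version.)

```python
def rotations(seq):
    # Each rotation starts at a different index and wraps around to the front.
    return [seq[i:] + seq[:i] for i in range(len(seq))]

print(rotations('ABC'))  # ['ABC', 'BCA', 'CAB']
```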

So let’s start by defining `φ`. Python 3’s wonderful Unicode support means we can even name the variable that, though to be honest, unless your keyboard setup allows you to type Greek letters easily, it might be less annoying to name it something like `PHI` instead.

```
from math import sqrt
φ = (1 + sqrt(5)) / 2
```

Now let’s work on the cyclic permutations. Python’s `itertools` library is the usual go-to for all things combinatorics, but for some reason it doesn’t seem to have anything that gets us exactly what we want. No worries; we’ll just define our own `cyclic_permutations` function using the tools that `itertools` does provide.

I’m sure there are many elegant ways to define such a function. I chose to use `cycle()`, which is a generator that yields each element in a sequence and loops back to the beginning when exhausted. Then I looped over the range of the sequence length, each time yielding a tuple the same length as the original sequence, starting at index 1. This has the effect of skipping ahead one element in the cycle each time.

```
from itertools import cycle, islice

def cyclic_permutations(seq):
    seq_cycle = cycle(seq)
    for _ in range(len(seq)):
        yield tuple(islice(seq_cycle, 1, len(seq) + 1))

list(cyclic_permutations('ABCD'))
```

```
[('B', 'C', 'D', 'A'),
 ('C', 'D', 'A', 'B'),
 ('D', 'A', 'B', 'C'),
 ('A', 'B', 'C', 'D')]
```

Now we can put it all together. I’m using a list comprehension to generate all possible combinations of coordinates that fit our criteria, in the form of a list of dictionaries with keys `x`, `y`, and `z`. I can then pass this list to `DataFrame.from_records()` to generate the dataframe.

```
import pandas as pd

ico_points = pd.DataFrame.from_records([
    {'x': x, 'y': y, 'z': z}
    for m in (-1, 1)
    for n in (-1, 1)
    for (x, y, z) in cyclic_permutations((0, 1 * m, φ * n))
])
ico_points
```

|  | x | y | z |
|---|---|---|---|
| 0 | -1.000000 | -1.618034 | 0.000000 |
| 1 | -1.618034 | 0.000000 | -1.000000 |
| 2 | 0.000000 | -1.000000 | -1.618034 |
| 3 | -1.000000 | 1.618034 | 0.000000 |
| 4 | 1.618034 | 0.000000 | -1.000000 |
| 5 | 0.000000 | -1.000000 | 1.618034 |
| 6 | 1.000000 | -1.618034 | 0.000000 |
| 7 | -1.618034 | 0.000000 | 1.000000 |
| 8 | 0.000000 | 1.000000 | -1.618034 |
| 9 | 1.000000 | 1.618034 | 0.000000 |
| 10 | 1.618034 | 0.000000 | 1.000000 |
| 11 | 0.000000 | 1.000000 | 1.618034 |

There’s one problem, though: It’s best to work with a unit icosahedron, meaning one whose points are all exactly one unit from the origin. The nifty Golden Ratio-derived formula we’re using doesn’t do that. Instead, it generates an icosahedron whose edges all have a length of two units. We can verify this using the `norm()` function from `numpy.linalg`, which, when applied to a set of Cartesian coordinates, will tell us its distance from the origin. (We’ll need to use `axis=1` to tell `.apply()` that we want to apply the function across the columns of each row rather than down the rows of each column; otherwise we’ll get three results that treat each column like a single twelve-dimensional coordinate.)

```
from numpy import linalg
ico_points.apply(linalg.norm, axis=1)
```

```
0 1.902113
1 1.902113
2 1.902113
3 1.902113
4 1.902113
5 1.902113
6 1.902113
7 1.902113
8 1.902113
9 1.902113
10 1.902113
11 1.902113
dtype: float64
```

As expected, each point is ~1.902 units from the origin.

In order to turn these points into coordinates of the unit icosahedron, we need to normalize them. This means we divide each point by its distance from the origin.

```
def normalize(n):
    distance = linalg.norm(n)
    # Watch out for division by zero!
    return n if distance == 0 else n / distance

ico_points = ico_points.apply(normalize, axis=1)
ico_points
```

|  | x | y | z |
|---|---|---|---|
| 0 | -0.525731 | -0.850651 | 0.000000 |
| 1 | -0.850651 | 0.000000 | -0.525731 |
| 2 | 0.000000 | -0.525731 | -0.850651 |
| 3 | -0.525731 | 0.850651 | 0.000000 |
| 4 | 0.850651 | 0.000000 | -0.525731 |
| 5 | 0.000000 | -0.525731 | 0.850651 |
| 6 | 0.525731 | -0.850651 | 0.000000 |
| 7 | -0.850651 | 0.000000 | 0.525731 |
| 8 | 0.000000 | 0.525731 | -0.850651 |
| 9 | 0.525731 | 0.850651 | 0.000000 |
| 10 | 0.850651 | 0.000000 | 0.525731 |
| 11 | 0.000000 | 0.525731 | 0.850651 |

Let’s just double check that our distances are what we want:

```
ico_points.apply(linalg.norm, axis=1)
```

```
0 1.0
1 1.0
2 1.0
3 1.0
4 1.0
5 1.0
6 1.0
7 1.0
8 1.0
9 1.0
10 1.0
11 1.0
dtype: float64
```

Perfect!

### Edges

Just knowing the points in our icosahedron isn’t enough. We also want to know about the edges. We can start by generating a dataframe containing all possible pairs of points in our icosahedron.

We know right away that we don’t want to pair any point with itself. We also don’t want any duplicates; a line segment starting at `a` and ending at `b` is effectively identical to one starting at `b` and ending at `a`. Fortunately, `itertools` comes in clutch. If we apply the `combinations()` function to our index, we get all possible pairs of points, in sorted order, with no repeats.
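You can see both properties with a quick experiment on a small index:

```python
from itertools import combinations

# All unordered pairs from a four-element index:
# no point paired with itself, and no (b, a) duplicates.
print(list(combinations(range(4), 2)))
# [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
```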

We’ll then use the sequence of tuples generated by `combinations()` to create a dataframe consisting of `start` and `end` columns, join each of these columns to `ico_points`, apply suffixes to the columns in the joined dataframe to distinguish their sources, and then drop our `start` and `end` columns.

```
from itertools import combinations

ico_edges = (
    pd.DataFrame(combinations(ico_points.index, 2), columns=['start', 'end'])
    .join(ico_points.add_suffix('_start'), on='start')
    .join(ico_points.add_suffix('_end'), on='end')
    .drop(columns=['start', 'end'])
)
ico_edges
```

|  | x_start | y_start | z_start | x_end | y_end | z_end |
|---|---|---|---|---|---|---|
| 0 | -0.525731 | -0.850651 | 0.000000 | -0.850651 | 0.000000 | -0.525731 |
| 1 | -0.525731 | -0.850651 | 0.000000 | 0.000000 | -0.525731 | -0.850651 |
| 2 | -0.525731 | -0.850651 | 0.000000 | -0.525731 | 0.850651 | 0.000000 |
| 3 | -0.525731 | -0.850651 | 0.000000 | 0.850651 | 0.000000 | -0.525731 |
| 4 | -0.525731 | -0.850651 | 0.000000 | 0.000000 | -0.525731 | 0.850651 |
| ... | ... | ... | ... | ... | ... | ... |
| 61 | 0.000000 | 0.525731 | -0.850651 | 0.850651 | 0.000000 | 0.525731 |
| 62 | 0.000000 | 0.525731 | -0.850651 | 0.000000 | 0.525731 | 0.850651 |
| 63 | 0.525731 | 0.850651 | 0.000000 | 0.850651 | 0.000000 | 0.525731 |
| 64 | 0.525731 | 0.850651 | 0.000000 | 0.000000 | 0.525731 | 0.850651 |
| 65 | 0.850651 | 0.000000 | 0.525731 | 0.000000 | 0.525731 | 0.850651 |

66 rows × 6 columns

That double set of coordinates is rather unwieldy, though. We can turn it into a `MultiIndex` dataframe with some trickery.

```
ico_edges.columns = ico_edges.columns.str.split('_', expand=True)
ico_edges
```

|  | x | y | z | x | y | z |
|---|---|---|---|---|---|---|
|  | start | start | start | end | end | end |
| 0 | -0.525731 | -0.850651 | 0.000000 | -0.850651 | 0.000000 | -0.525731 |
| 1 | -0.525731 | -0.850651 | 0.000000 | 0.000000 | -0.525731 | -0.850651 |
| 2 | -0.525731 | -0.850651 | 0.000000 | -0.525731 | 0.850651 | 0.000000 |
| 3 | -0.525731 | -0.850651 | 0.000000 | 0.850651 | 0.000000 | -0.525731 |
| 4 | -0.525731 | -0.850651 | 0.000000 | 0.000000 | -0.525731 | 0.850651 |
| ... | ... | ... | ... | ... | ... | ... |
| 61 | 0.000000 | 0.525731 | -0.850651 | 0.850651 | 0.000000 | 0.525731 |
| 62 | 0.000000 | 0.525731 | -0.850651 | 0.000000 | 0.525731 | 0.850651 |
| 63 | 0.525731 | 0.850651 | 0.000000 | 0.850651 | 0.000000 | 0.525731 |
| 64 | 0.525731 | 0.850651 | 0.000000 | 0.000000 | 0.525731 | 0.850651 |
| 65 | 0.850651 | 0.000000 | 0.525731 | 0.000000 | 0.525731 | 0.850651 |

66 rows × 6 columns

I was initially confused by this trick, because at first glance I thought we were just converting the column names to strings and using good old `str.split()` on them. But why would that work? And what’s with that `expand=True` argument we’re passing to `split()`? It turns out we’re actually calling pandas’s `.str.split()` string accessor method (our column labels live in a `pandas.Index`), which provides an analog of the built-in `str.split()` for pandas objects. The `expand=True` parameter tells it to expand dimensionality, making each part of our split its own index level.
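A quick demonstration on a bare `Index` shows where the extra level comes from (a standalone sketch, separate from our `ico_edges` dataframe):

```python
import pandas as pd

cols = pd.Index(['x_start', 'y_start', 'x_end'])
# On an Index, str.split with expand=True returns a MultiIndex:
# one level per part of the split.
split = cols.str.split('_', expand=True)
print(type(split).__name__)  # MultiIndex
print(list(split))           # [('x', 'start'), ('y', 'start'), ('x', 'end')]
```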

The way we’ve achieved this extra dimensionality doesn’t look very practical, though. It would be better to have `start` and `end` as the outermost level, with our `x`, `y`, `z` coordinates below:

```
ico_edges = ico_edges.swaplevel(axis=1)
ico_edges
```

|  | start | start | start | end | end | end |
|---|---|---|---|---|---|---|
|  | x | y | z | x | y | z |
| 0 | -0.525731 | -0.850651 | 0.000000 | -0.850651 | 0.000000 | -0.525731 |
| 1 | -0.525731 | -0.850651 | 0.000000 | 0.000000 | -0.525731 | -0.850651 |
| 2 | -0.525731 | -0.850651 | 0.000000 | -0.525731 | 0.850651 | 0.000000 |
| 3 | -0.525731 | -0.850651 | 0.000000 | 0.850651 | 0.000000 | -0.525731 |
| 4 | -0.525731 | -0.850651 | 0.000000 | 0.000000 | -0.525731 | 0.850651 |
| ... | ... | ... | ... | ... | ... | ... |
| 61 | 0.000000 | 0.525731 | -0.850651 | 0.850651 | 0.000000 | 0.525731 |
| 62 | 0.000000 | 0.525731 | -0.850651 | 0.000000 | 0.525731 | 0.850651 |
| 63 | 0.525731 | 0.850651 | 0.000000 | 0.850651 | 0.000000 | 0.525731 |
| 64 | 0.525731 | 0.850651 | 0.000000 | 0.000000 | 0.525731 | 0.850651 |
| 65 | 0.850651 | 0.000000 | 0.525731 | 0.000000 | 0.525731 | 0.850651 |

66 rows × 6 columns

Looking good!

There’s still a problem, though: An icosahedron only has 30 edges, but we’ve got 66! What’s the problem?

As you’ve no doubt surmised, in our haste to generate every possible line segment between two points on our icosahedron, we wound up generating a lot of line segments that weren’t actually edges. Instead, they’re segments between nonadjacent points.

We can easily filter these out, though. Since this is a regular icosahedron, the edges will all be the same length, and they’ll all be tied for the shortest line segment. So we just filter out any segments that aren’t the shortest length.

We first define a `length()` function that operates on a row of our dataframe. It calculates the length of the associated line segment by subtracting the start coordinates from the end coordinates and taking the norm of the difference.

```
def length(n):
    return linalg.norm(n['end'] - n['start']).round(3)
```

(If you’re wondering why we call `.round(3)` at the end, it’s because floating-point math can be tricky. Two operations that should return the same result might instead return something like `0.99999999` and `1.00000000`. Rounding masks this floating-point error, which will be important in the next step.)
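The classic demonstration of this, in plain Python:

```python
# 0.1 + 0.2 is not exactly 0.3 in binary floating point...
print(0.1 + 0.2 == 0.3)  # False
print(0.1 + 0.2)         # 0.30000000000000004
# ...but after rounding to a few decimal places, the two compare equal.
print(round(0.1 + 0.2, 3) == round(0.3, 3))  # True
```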

Next we define a `keep_shortest()` function that takes a `DataFrame` containing a `length` column and returns only the rows for which `length` is the minimum. It also drops the now-superfluous `length` column.

```
def keep_shortest(df):
    return (
        df[df.length == min(df.length)]
        .reset_index(drop=True)
        .drop(columns=['length'], level=0)
    )
```

Finally, we apply `keep_shortest()` to our `ico_edges` dataframe with the `length` column added.

(As you might have guessed, when I first did this without rounding in the `length()` function, I got far too few edges, because a handful of segments were infinitesimally shorter than others thanks to floating-point weirdness.)

```
ico_edges = keep_shortest(ico_edges.assign(length=ico_edges.apply(length, axis=1)))
ico_edges
```

|  | start | start | start | end | end | end |
|---|---|---|---|---|---|---|
|  | x | y | z | x | y | z |
| 0 | -0.525731 | -0.850651 | 0.000000 | -0.850651 | 0.000000 | -0.525731 |
| 1 | -0.525731 | -0.850651 | 0.000000 | 0.000000 | -0.525731 | -0.850651 |
| 2 | -0.525731 | -0.850651 | 0.000000 | 0.000000 | -0.525731 | 0.850651 |
| 3 | -0.525731 | -0.850651 | 0.000000 | 0.525731 | -0.850651 | 0.000000 |
| 4 | -0.525731 | -0.850651 | 0.000000 | -0.850651 | 0.000000 | 0.525731 |
| 5 | -0.850651 | 0.000000 | -0.525731 | 0.000000 | -0.525731 | -0.850651 |
| 6 | -0.850651 | 0.000000 | -0.525731 | -0.525731 | 0.850651 | 0.000000 |
| 7 | -0.850651 | 0.000000 | -0.525731 | -0.850651 | 0.000000 | 0.525731 |
| 8 | -0.850651 | 0.000000 | -0.525731 | 0.000000 | 0.525731 | -0.850651 |
| 9 | 0.000000 | -0.525731 | -0.850651 | 0.850651 | 0.000000 | -0.525731 |
| 10 | 0.000000 | -0.525731 | -0.850651 | 0.525731 | -0.850651 | 0.000000 |
| 11 | 0.000000 | -0.525731 | -0.850651 | 0.000000 | 0.525731 | -0.850651 |
| 12 | -0.525731 | 0.850651 | 0.000000 | -0.850651 | 0.000000 | 0.525731 |
| 13 | -0.525731 | 0.850651 | 0.000000 | 0.000000 | 0.525731 | -0.850651 |
| 14 | -0.525731 | 0.850651 | 0.000000 | 0.525731 | 0.850651 | 0.000000 |
| 15 | -0.525731 | 0.850651 | 0.000000 | 0.000000 | 0.525731 | 0.850651 |
| 16 | 0.850651 | 0.000000 | -0.525731 | 0.525731 | -0.850651 | 0.000000 |
| 17 | 0.850651 | 0.000000 | -0.525731 | 0.000000 | 0.525731 | -0.850651 |
| 18 | 0.850651 | 0.000000 | -0.525731 | 0.525731 | 0.850651 | 0.000000 |
| 19 | 0.850651 | 0.000000 | -0.525731 | 0.850651 | 0.000000 | 0.525731 |
| 20 | 0.000000 | -0.525731 | 0.850651 | 0.525731 | -0.850651 | 0.000000 |
| 21 | 0.000000 | -0.525731 | 0.850651 | -0.850651 | 0.000000 | 0.525731 |
| 22 | 0.000000 | -0.525731 | 0.850651 | 0.850651 | 0.000000 | 0.525731 |
| 23 | 0.000000 | -0.525731 | 0.850651 | 0.000000 | 0.525731 | 0.850651 |
| 24 | 0.525731 | -0.850651 | 0.000000 | 0.850651 | 0.000000 | 0.525731 |
| 25 | -0.850651 | 0.000000 | 0.525731 | 0.000000 | 0.525731 | 0.850651 |
| 26 | 0.000000 | 0.525731 | -0.850651 | 0.525731 | 0.850651 | 0.000000 |
| 27 | 0.525731 | 0.850651 | 0.000000 | 0.850651 | 0.000000 | 0.525731 |
| 28 | 0.525731 | 0.850651 | 0.000000 | 0.000000 | 0.525731 | 0.850651 |
| 29 | 0.850651 | 0.000000 | 0.525731 | 0.000000 | 0.525731 | 0.850651 |

At long last, we have a dataframe containing just the edges of our unit icosahedron.

We can actually visualize our icosahedron now, using `matplotlib`. The following code is perhaps not the most efficient way to do it, but it gets the job done. It simply draws one line segment per row in `ico_edges`. This is something of an abuse of the `.apply()` method, because we’re discarding the resulting dataframe and are interested only in the side effects (the lines being drawn by `ax.plot3D`).

```
%matplotlib inline
from mpl_toolkits import mplot3d
import numpy as np
import matplotlib.pyplot as plt

fig = plt.figure()
ax = plt.axes(projection='3d')
_ = ico_edges.apply(
    lambda r: ax.plot3D(*[(r['start'][c], r['end'][c]) for c in ('x', 'y', 'z')], 'green'),
    axis=1,
)
```

## Tessellation

Now that we’ve got a unit icosahedron, and a way to display it, let’s see about tessellating it to create a geodesic polyhedron. This turned out to be rather difficult, and I made a couple of false starts before figuring out a good method.

### False start #1: Recursive hell

For our first attempt, let’s take advantage of the fact that a 2-frequency tessellation is relatively easy: We find the midpoint of each edge, replace each edge with two shorter lines from midpoint to each vertex, and add three new lines from midpoint to midpoint. This results in a Triforce-like shape.

First we’ll add `midpoint` columns to our `ico_edges` dataframe:

```
midpoints = (
    ico_edges
    .apply(lambda n: n['start'] + ((n['end'] - n['start']) / 2), axis=1)
    .add_suffix('_midpoint')
    .apply(normalize, axis=1)
)
midpoints.columns = midpoints.columns.str.split('_', expand=True).swaplevel()
ico_edges = pd.concat([ico_edges, midpoints], axis=1)
ico_edges
```

|  | start | start | start | end | end | end | midpoint | midpoint | midpoint |
|---|---|---|---|---|---|---|---|---|---|
|  | x | y | z | x | y | z | x | y | z |
| 0 | -0.525731 | -0.850651 | 0.000000 | -0.850651 | 0.000000 | -0.525731 | -0.809017 | -0.500000 | -0.309017 |
| 1 | -0.525731 | -0.850651 | 0.000000 | 0.000000 | -0.525731 | -0.850651 | -0.309017 | -0.809017 | -0.500000 |
| 2 | -0.525731 | -0.850651 | 0.000000 | 0.000000 | -0.525731 | 0.850651 | -0.309017 | -0.809017 | 0.500000 |
| 3 | -0.525731 | -0.850651 | 0.000000 | 0.525731 | -0.850651 | 0.000000 | 0.000000 | -1.000000 | 0.000000 |
| 4 | -0.525731 | -0.850651 | 0.000000 | -0.850651 | 0.000000 | 0.525731 | -0.809017 | -0.500000 | 0.309017 |
| 5 | -0.850651 | 0.000000 | -0.525731 | 0.000000 | -0.525731 | -0.850651 | -0.500000 | -0.309017 | -0.809017 |
| 6 | -0.850651 | 0.000000 | -0.525731 | -0.525731 | 0.850651 | 0.000000 | -0.809017 | 0.500000 | -0.309017 |
| 7 | -0.850651 | 0.000000 | -0.525731 | -0.850651 | 0.000000 | 0.525731 | -1.000000 | 0.000000 | 0.000000 |
| 8 | -0.850651 | 0.000000 | -0.525731 | 0.000000 | 0.525731 | -0.850651 | -0.500000 | 0.309017 | -0.809017 |
| 9 | 0.000000 | -0.525731 | -0.850651 | 0.850651 | 0.000000 | -0.525731 | 0.500000 | -0.309017 | -0.809017 |
| 10 | 0.000000 | -0.525731 | -0.850651 | 0.525731 | -0.850651 | 0.000000 | 0.309017 | -0.809017 | -0.500000 |
| 11 | 0.000000 | -0.525731 | -0.850651 | 0.000000 | 0.525731 | -0.850651 | 0.000000 | 0.000000 | -1.000000 |
| 12 | -0.525731 | 0.850651 | 0.000000 | -0.850651 | 0.000000 | 0.525731 | -0.809017 | 0.500000 | 0.309017 |
| 13 | -0.525731 | 0.850651 | 0.000000 | 0.000000 | 0.525731 | -0.850651 | -0.309017 | 0.809017 | -0.500000 |
| 14 | -0.525731 | 0.850651 | 0.000000 | 0.525731 | 0.850651 | 0.000000 | 0.000000 | 1.000000 | 0.000000 |
| 15 | -0.525731 | 0.850651 | 0.000000 | 0.000000 | 0.525731 | 0.850651 | -0.309017 | 0.809017 | 0.500000 |
| 16 | 0.850651 | 0.000000 | -0.525731 | 0.525731 | -0.850651 | 0.000000 | 0.809017 | -0.500000 | -0.309017 |
| 17 | 0.850651 | 0.000000 | -0.525731 | 0.000000 | 0.525731 | -0.850651 | 0.500000 | 0.309017 | -0.809017 |
| 18 | 0.850651 | 0.000000 | -0.525731 | 0.525731 | 0.850651 | 0.000000 | 0.809017 | 0.500000 | -0.309017 |
| 19 | 0.850651 | 0.000000 | -0.525731 | 0.850651 | 0.000000 | 0.525731 | 1.000000 | 0.000000 | 0.000000 |
| 20 | 0.000000 | -0.525731 | 0.850651 | 0.525731 | -0.850651 | 0.000000 | 0.309017 | -0.809017 | 0.500000 |
| 21 | 0.000000 | -0.525731 | 0.850651 | -0.850651 | 0.000000 | 0.525731 | -0.500000 | -0.309017 | 0.809017 |
| 22 | 0.000000 | -0.525731 | 0.850651 | 0.850651 | 0.000000 | 0.525731 | 0.500000 | -0.309017 | 0.809017 |
| 23 | 0.000000 | -0.525731 | 0.850651 | 0.000000 | 0.525731 | 0.850651 | 0.000000 | 0.000000 | 1.000000 |
| 24 | 0.525731 | -0.850651 | 0.000000 | 0.850651 | 0.000000 | 0.525731 | 0.809017 | -0.500000 | 0.309017 |
| 25 | -0.850651 | 0.000000 | 0.525731 | 0.000000 | 0.525731 | 0.850651 | -0.500000 | 0.309017 | 0.809017 |
| 26 | 0.000000 | 0.525731 | -0.850651 | 0.525731 | 0.850651 | 0.000000 | 0.309017 | 0.809017 | -0.500000 |
| 27 | 0.525731 | 0.850651 | 0.000000 | 0.850651 | 0.000000 | 0.525731 | 0.809017 | 0.500000 | 0.309017 |
| 28 | 0.525731 | 0.850651 | 0.000000 | 0.000000 | 0.525731 | 0.850651 | 0.309017 | 0.809017 | 0.500000 |
| 29 | 0.850651 | 0.000000 | 0.525731 | 0.000000 | 0.525731 | 0.850651 | 0.500000 | 0.309017 | 0.809017 |

Next we’ll define a `get_edges()` function that takes a dataframe consisting of 3D points, and returns another dataframe with `start` and `end` columns representing the endpoints of all the edges. This function works similarly to the way we defined our icosahedron: it creates all the line segments between any two points in the dataframe, and keeps only the shortest ones. (This is, in fact, a rather inefficient way to do things, but it works, and at this point I was going more for conceptual simplicity than efficiency. As it turns out, it wasn’t very conceptually simple either; it just happens to be the first way my brain tried to solve the problem.)

We’ll use `get_edges()` to create `mid_to_mids`, a dataframe consisting of all the lines from midpoint to midpoint of our polygon faces.

```
def get_edges(vertices):
    edges = (
        pd.DataFrame(combinations(vertices.index, 2), columns=['start', 'end'])
        .join(vertices.add_suffix('_start'), on='start')
        .join(vertices.add_suffix('_end'), on='end')
        .drop(columns=['start', 'end'])
    )
    edges.columns = edges.columns.str.split('_', expand=True)
    edges = edges.swaplevel(axis=1)
    return keep_shortest(edges.assign(length=edges.apply(length, axis=1)))

mid_to_mids = get_edges(ico_edges.xs('midpoint', axis=1))
mid_to_mids
```

|  | start | start | start | end | end | end |
|---|---|---|---|---|---|---|
|  | x | y | z | x | y | z |
| 0 | -0.809017 | -0.500000 | -0.309017 | -0.309017 | -0.809017 | -0.500000 |
| 1 | -0.809017 | -0.500000 | -0.309017 | -0.809017 | -0.500000 | 0.309017 |
| 2 | -0.809017 | -0.500000 | -0.309017 | -0.500000 | -0.309017 | -0.809017 |
| 3 | -0.809017 | -0.500000 | -0.309017 | -1.000000 | 0.000000 | 0.000000 |
| 4 | -0.309017 | -0.809017 | -0.500000 | 0.000000 | -1.000000 | 0.000000 |
| 5 | -0.309017 | -0.809017 | -0.500000 | -0.500000 | -0.309017 | -0.809017 |
| 6 | -0.309017 | -0.809017 | -0.500000 | 0.309017 | -0.809017 | -0.500000 |
| 7 | -0.309017 | -0.809017 | 0.500000 | 0.000000 | -1.000000 | 0.000000 |
| 8 | -0.309017 | -0.809017 | 0.500000 | -0.809017 | -0.500000 | 0.309017 |
| 9 | -0.309017 | -0.809017 | 0.500000 | 0.309017 | -0.809017 | 0.500000 |
| 10 | -0.309017 | -0.809017 | 0.500000 | -0.500000 | -0.309017 | 0.809017 |
| 11 | 0.000000 | -1.000000 | 0.000000 | 0.309017 | -0.809017 | -0.500000 |
| 12 | 0.000000 | -1.000000 | 0.000000 | 0.309017 | -0.809017 | 0.500000 |
| 13 | -0.809017 | -0.500000 | 0.309017 | -1.000000 | 0.000000 | 0.000000 |
| 14 | -0.809017 | -0.500000 | 0.309017 | -0.500000 | -0.309017 | 0.809017 |
| 15 | -0.500000 | -0.309017 | -0.809017 | -0.500000 | 0.309017 | -0.809017 |
| 16 | -0.500000 | -0.309017 | -0.809017 | 0.000000 | 0.000000 | -1.000000 |
| 17 | -0.809017 | 0.500000 | -0.309017 | -1.000000 | 0.000000 | 0.000000 |
| 18 | -0.809017 | 0.500000 | -0.309017 | -0.500000 | 0.309017 | -0.809017 |
| 19 | -0.809017 | 0.500000 | -0.309017 | -0.809017 | 0.500000 | 0.309017 |
| 20 | -0.809017 | 0.500000 | -0.309017 | -0.309017 | 0.809017 | -0.500000 |
| 21 | -1.000000 | 0.000000 | 0.000000 | -0.809017 | 0.500000 | 0.309017 |
| 22 | -0.500000 | 0.309017 | -0.809017 | 0.000000 | 0.000000 | -1.000000 |
| 23 | -0.500000 | 0.309017 | -0.809017 | -0.309017 | 0.809017 | -0.500000 |
| 24 | 0.500000 | -0.309017 | -0.809017 | 0.309017 | -0.809017 | -0.500000 |
| 25 | 0.500000 | -0.309017 | -0.809017 | 0.000000 | 0.000000 | -1.000000 |
| 26 | 0.500000 | -0.309017 | -0.809017 | 0.809017 | -0.500000 | -0.309017 |
| 27 | 0.500000 | -0.309017 | -0.809017 | 0.500000 | 0.309017 | -0.809017 |
| 28 | 0.309017 | -0.809017 | -0.500000 | 0.809017 | -0.500000 | -0.309017 |
| 29 | 0.000000 | 0.000000 | -1.000000 | 0.500000 | 0.309017 | -0.809017 |
| 30 | -0.809017 | 0.500000 | 0.309017 | -0.309017 | 0.809017 | 0.500000 |
| 31 | -0.809017 | 0.500000 | 0.309017 | -0.500000 | 0.309017 | 0.809017 |
| 32 | -0.309017 | 0.809017 | -0.500000 | 0.000000 | 1.000000 | 0.000000 |
| 33 | -0.309017 | 0.809017 | -0.500000 | 0.309017 | 0.809017 | -0.500000 |
| 34 | 0.000000 | 1.000000 | 0.000000 | -0.309017 | 0.809017 | 0.500000 |
| 35 | 0.000000 | 1.000000 | 0.000000 | 0.309017 | 0.809017 | -0.500000 |
| 36 | 0.000000 | 1.000000 | 0.000000 | 0.309017 | 0.809017 | 0.500000 |
| 37 | -0.309017 | 0.809017 | 0.500000 | -0.500000 | 0.309017 | 0.809017 |
| 38 | -0.309017 | 0.809017 | 0.500000 | 0.309017 | 0.809017 | 0.500000 |
| 39 | 0.809017 | -0.500000 | -0.309017 | 1.000000 | 0.000000 | 0.000000 |
| 40 | 0.809017 | -0.500000 | -0.309017 | 0.809017 | -0.500000 | 0.309017 |
| 41 | 0.500000 | 0.309017 | -0.809017 | 0.809017 | 0.500000 | -0.309017 |
| 42 | 0.500000 | 0.309017 | -0.809017 | 0.309017 | 0.809017 | -0.500000 |
| 43 | 0.809017 | 0.500000 | -0.309017 | 1.000000 | 0.000000 | 0.000000 |
| 44 | 0.809017 | 0.500000 | -0.309017 | 0.309017 | 0.809017 | -0.500000 |
| 45 | 0.809017 | 0.500000 | -0.309017 | 0.809017 | 0.500000 | 0.309017 |
| 46 | 1.000000 | 0.000000 | 0.000000 | 0.809017 | -0.500000 | 0.309017 |
| 47 | 1.000000 | 0.000000 | 0.000000 | 0.809017 | 0.500000 | 0.309017 |
| 48 | 0.309017 | -0.809017 | 0.500000 | 0.500000 | -0.309017 | 0.809017 |
| 49 | 0.309017 | -0.809017 | 0.500000 | 0.809017 | -0.500000 | 0.309017 |
| 50 | -0.500000 | -0.309017 | 0.809017 | 0.000000 | 0.000000 | 1.000000 |
| 51 | -0.500000 | -0.309017 | 0.809017 | -0.500000 | 0.309017 | 0.809017 |
| 52 | 0.500000 | -0.309017 | 0.809017 | 0.000000 | 0.000000 | 1.000000 |
| 53 | 0.500000 | -0.309017 | 0.809017 | 0.809017 | -0.500000 | 0.309017 |
| 54 | 0.500000 | -0.309017 | 0.809017 | 0.500000 | 0.309017 | 0.809017 |
| 55 | 0.000000 | 0.000000 | 1.000000 | -0.500000 | 0.309017 | 0.809017 |
| 56 | 0.000000 | 0.000000 | 1.000000 | 0.500000 | 0.309017 | 0.809017 |
| 57 | 0.809017 | 0.500000 | 0.309017 | 0.309017 | 0.809017 | 0.500000 |
| 58 | 0.809017 | 0.500000 | 0.309017 | 0.500000 | 0.309017 | 0.809017 |
| 59 | 0.309017 | 0.809017 | 0.500000 | 0.500000 | 0.309017 | 0.809017 |

Next, we define two more dataframes, `start_to_mids` and `mid_to_ends`, that are much simpler to create: they’re just the `start`, `midpoint`, and `end` columns of `ico_edges`, renamed to `start` and `end` as appropriate. Finally, we create a `geodesic_edges` dataframe that just concatenates `start_to_mids`, `mid_to_ends`, and `mid_to_mids`.

```
foo = {}
foo['start'] = ico_edges.xs('start', axis=1)
foo['end'] = ico_edges.xs('midpoint', axis=1)
start_to_mids = pd.concat(foo, axis=1)

foo['start'] = ico_edges.xs('midpoint', axis=1)
foo['end'] = ico_edges.xs('end', axis=1)
mid_to_ends = pd.concat(foo, axis=1)

geodesic_edges = pd.concat([start_to_mids, mid_to_ends, mid_to_mids]).reset_index(drop=True)
geodesic_edges
```

|  | start | start | start | end | end | end |
|---|---|---|---|---|---|---|
|  | x | y | z | x | y | z |
| 0 | -0.525731 | -0.850651 | 0.000000 | -0.809017 | -0.500000 | -0.309017 |
| 1 | -0.525731 | -0.850651 | 0.000000 | -0.309017 | -0.809017 | -0.500000 |
| 2 | -0.525731 | -0.850651 | 0.000000 | -0.309017 | -0.809017 | 0.500000 |
| 3 | -0.525731 | -0.850651 | 0.000000 | 0.000000 | -1.000000 | 0.000000 |
| 4 | -0.525731 | -0.850651 | 0.000000 | -0.809017 | -0.500000 | 0.309017 |
| ... | ... | ... | ... | ... | ... | ... |
| 115 | 0.000000 | 0.000000 | 1.000000 | -0.500000 | 0.309017 | 0.809017 |
| 116 | 0.000000 | 0.000000 | 1.000000 | 0.500000 | 0.309017 | 0.809017 |
| 117 | 0.809017 | 0.500000 | 0.309017 | 0.309017 | 0.809017 | 0.500000 |
| 118 | 0.809017 | 0.500000 | 0.309017 | 0.500000 | 0.309017 | 0.809017 |
| 119 | 0.309017 | 0.809017 | 0.500000 | 0.500000 | 0.309017 | 0.809017 |

120 rows × 6 columns

It seems to have the right number of edges: Each of the original 30 icosahedron edges is replaced by 2 edges, and then we have 3 new midpoint-to-midpoint edges for each of the 20 faces. 30 * 2 + 20 * 3 = 60 + 60 = 120.

Let’s take a look at the results! We’ll use the same code we used to visualize the icosahedron, but this time we’ll wrap the drawing instructions in a `draw_edges()` function, since we’ll likely have to reuse it a lot.

```
fig = plt.figure()
ax = plt.axes(projection='3d')

def draw_edges(df, color='green'):
    df.apply(lambda r: ax.plot3D(*[(r['start'][c], r['end'][c]) for c in ('x', 'y', 'z')], color), axis=1)

draw_edges(geodesic_edges)
```

That, my friends, is a frequency-2 geodesic polyhedron! Success!

Just for fun, we can even display our intermediate dataframes in different colors.

```
fig = plt.figure()
ax = plt.axes(projection='3d')
draw_edges(start_to_mids, 'green')
draw_edges(mid_to_ends, 'green')
draw_edges(mid_to_mids, 'red')
```

Edges of the same color should be the same length. We can see our red mid-to-mid edges forming triangles at the center of each green “triangle” that was built from the original icosahedron edges. There’s also a nice red equator-like polygon in the middle. This is because an icosahedron can be thought of as a gyroelongated bipyramid consisting of two pentagonal pyramids with an antiprism in the middle. Said antiprism is composed of alternating up- and down-pointing triangles. The midpoints of the slanted edges of these triangles, once projected to the circumscribed sphere, will all fall along the same great circle. This is a nice property, because if we want to build a geodesic dome, we can neatly divide our geodesic polyhedron in half at the equator, with that great circle becoming the base of the dome.

I can imagine that many of you are now slumping in your seats like I’m Colin Robinson feeding on your energy, so I’ll move on. I just thought that was interesting, and I learned a lot of new geometric terminology in my Wikipedia deep-dive.

Moving on: We can now calculate the lengths of edges we need for our geodesic sphere! Just calculate the length of each edge, group by the length, and count:

```
def get_strut_lengths(edges):
    return edges.assign(length=edges.apply(length, axis=1))[['length']].assign(count=0).groupby('length').count()
get_strut_lengths(geodesic_edges)
```

| length | count |
|---|---|
| 0.547 | 60 |
| 0.618 | 60 |

Sixty 0.547-length struts, and sixty 0.618-length struts. We can check our results by entering a spherical radius of 1 into this calculator and getting the same lengths.
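We can also cross-check with a little `numpy`, computing the two chord lengths directly from a single icosahedron face (a standalone sketch; the vertex coordinates are the usual cyclic permutations of (0, ±1, ±φ), rescaled to the unit sphere):

```python
import numpy as np

def unit(v):
    return v / np.linalg.norm(v)

phi = (1 + 5 ** 0.5) / 2
# Three vertices of one icosahedron face, projected to the unit sphere
a, b, c = (unit(np.array(p, dtype=float)) for p in ((0, 1, phi), (0, -1, phi), (phi, 0, 1)))
m_ab, m_ac = unit((a + b) / 2), unit((a + c) / 2)  # projected edge midpoints
print(round(np.linalg.norm(a - m_ab), 3))     # vertex-to-midpoint strut: 0.547
print(round(np.linalg.norm(m_ab - m_ac), 3))  # midpoint-to-midpoint strut: 0.618
```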

What about for frequencies greater than 2? A naïve approach would be a recursive one: subdivide the icosahedron to get a geodesic polyhedron, subdivide that one to get another, and so on. So let’s define a `tesselate()` function that takes our starting icosahedron dataframe and loops `num_times` times, each time subdividing its triangles further:

```
def tesselate(df, num_times):
    ico_edges = df.copy(deep=True)
    for _ in range(num_times):
        midpoints = ico_edges.apply(lambda n: n['start'] + ((n['end'] - n['start']) / 2), axis=1).add_suffix('_midpoint')
        midpoints.columns = midpoints.columns.str.split('_', expand=True).swaplevel()
        ico_edges = pd.concat([ico_edges, midpoints], axis=1)
        mid_to_mids = get_edges(ico_edges.xs('midpoint', axis=1))
        cols = {}
        cols['start'] = ico_edges.xs('start', axis=1)
        cols['end'] = ico_edges.xs('midpoint', axis=1)
        start_to_mids = pd.concat(cols, axis=1)
        cols['start'] = ico_edges.xs('midpoint', axis=1)
        cols['end'] = ico_edges.xs('end', axis=1)
        mid_to_ends = pd.concat(cols, axis=1)
        ico_edges = pd.concat([start_to_mids, mid_to_ends, mid_to_mids]).reset_index(drop=True)
    ico_edges['start'] = ico_edges['start'].apply(normalize, axis=1)
    ico_edges['end'] = ico_edges['end'].apply(normalize, axis=1)
    return ico_edges
```

(Some will no doubt nitpick that this is not really recursive since it uses a loop rather than calling the function within itself, but I would argue that it’s still conceptually recursive in that each iteration performs the same operation on the output of the previous iteration.)

Let’s see what it looks like if we go 3 times:

```
fig = plt.figure()
ax = plt.axes(projection='3d')
draw_edges(tesselate(ico_edges.drop(columns=['midpoint'], level=0), 3))
```

There’s a problem, though: this only generates geodesic polyhedra whose frequencies are powers of two. `num_times = 1` gets you a 2-frequency sphere, `num_times = 2` gets you a 4-frequency sphere, and so on. We want to be able to do 3- and 5- and whatever-frequency spheres.

There’s another problem, too, which is that it’s incredibly slow.

Can we do better?

### False start #2: ‘Til We Have Faces

I realized that one of my mistakes had been to consider the tessellation problem starting from an *edge* standpoint, when I should have been considering it from a *face* standpoint. After all, when we tessellate a triangle, we’re not just subdividing its edges: for any frequency greater than two, we’re also creating new points *inside* the triangle that we need to address. So the first thing I did was create a `faces` dataframe consisting of three columns, `a`, `b`, and `c`. The three points in each row define a triangle.

The way I went about creating `faces` was a bit roundabout, and this may just be showing my ignorance of `pandas`. I’m more used to SQL queries, which allow you to write `JOIN` statements with complex `ON` conditions. In `pandas`, as far as I can tell, you can only join (or `merge`; `join` is a special case of `merge`) two dataframes according to whether a column in one dataframe is equal to a column in the other. For anything more complicated, you have to filter the result of the `merge`.
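For example, something like SQL’s `JOIN ... ON t1.stop = t2.go AND t1.id < t2.id` becomes a plain equality `merge` followed by a boolean filter. The toy dataframes and column names here are hypothetical, just to show the pattern:

```python
import pandas as pd

t1 = pd.DataFrame({'id': [1, 2], 'stop': ['a', 'b']})
t2 = pd.DataFrame({'id': [3, 1], 'go': ['a', 'b']})

# The equality part of the ON clause goes into the merge...
joined = t1.merge(t2, left_on='stop', right_on='go', suffixes=('_1', '_2'))
# ...and any extra conditions become a filter on the result
filtered = joined[joined['id_1'] < joined['id_2']]
print(len(filtered))  # 1
```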

So first we create a dataframe called `two_edges` by joining `ico_edges` to itself, where the `end` of one edge equals the `start` of the other. Because multi-indices in `pandas` are somewhat bizarre, and I expect were added as something of a hack, we can’t just say `left_on='end', right_on='start'`. That’s because `start` and `end` aren’t actually columns, even though *sometimes* we can retrieve them via the same syntax we’d use to retrieve columns. Instead, each of those names actually refers to three columns, each of whose names is a tuple: `('start', 'x'), ('start', 'y'),` and so forth. Hence the crazy list comprehensions, which gather all three columns for each higher-level column.

The `two_edges` dataframe contains every set of three points `a`, `b`, and `c` such that `ab` and `bc` are both edges in `ico_edges`. This includes every face of our icosahedron, but it also includes many more cases, such as edges belonging to two different faces that meet at a vertex. We need to filter this down so that only faces are included. Thus we join `two_edges` to `ico_edges` again to create `three_edges`. This time, the `c` column of `two_edges` is joined to the `end` column of `ico_edges`. We can think of `three_edges` as containing every set of three contiguous edges.

And what is a face but a set of three contiguous edges that’s closed? In other words, the `end` of the third edge is identical to the `start` of the first edge. So the final step is to create `faces` from `three_edges` by selecting only those rows where `start` (from the second join to `ico_edges`) is equal to `a` (the first point in `two_edges`). Then we select our `a`, `b`, and `c` columns, giving us twenty rows, each containing three points defining a face.

```
ico_edges = ico_edges.drop(columns=['midpoint'], level=0)
two_edges = (
    ico_edges
    .merge(
        ico_edges,
        left_on=[x for x in ico_edges.columns if x[0] == 'end'],
        right_on=[x for x in ico_edges.columns if x[0] == 'start'],
        suffixes=['_a', '_b']
    )
    [['start_a', 'end_a', 'end_b']]
    .rename(columns={'start_a': 'a', 'end_a': 'b', 'end_b': 'c'})
)
three_edges = (
    two_edges
    .merge(
        ico_edges,
        left_on=[x for x in two_edges.columns if x[0] == 'c'],
        right_on=[x for x in ico_edges.columns if x[0] == 'end']
    )
)
faces = three_edges[
    (three_edges[('start', 'x')] == three_edges[('a', 'x')]) &
    (three_edges[('start', 'y')] == three_edges[('a', 'y')]) &
    (three_edges[('start', 'z')] == three_edges[('a', 'z')])
].reset_index(drop=True)[['a', 'b', 'c']]
faces
```

| | a x | a y | a z | b x | b y | b z | c x | c y | c z |
|---|---|---|---|---|---|---|---|---|---|
| 0 | -0.525731 | -0.850651 | 0.000000 | -0.850651 | 0.000000 | -0.525731 | 0.000000 | -0.525731 | -0.850651 |
| 1 | -0.525731 | -0.850651 | 0.000000 | -0.850651 | 0.000000 | -0.525731 | -0.850651 | 0.000000 | 0.525731 |
| 2 | -0.525731 | -0.850651 | 0.000000 | 0.000000 | -0.525731 | 0.850651 | -0.850651 | 0.000000 | 0.525731 |
| 3 | -0.850651 | 0.000000 | -0.525731 | -0.525731 | 0.850651 | 0.000000 | -0.850651 | 0.000000 | 0.525731 |
| 4 | -0.850651 | 0.000000 | -0.525731 | 0.000000 | -0.525731 | -0.850651 | 0.000000 | 0.525731 | -0.850651 |
| 5 | -0.850651 | 0.000000 | -0.525731 | -0.525731 | 0.850651 | 0.000000 | 0.000000 | 0.525731 | -0.850651 |
| 6 | 0.000000 | -0.525731 | -0.850651 | 0.850651 | 0.000000 | -0.525731 | 0.000000 | 0.525731 | -0.850651 |
| 7 | -0.525731 | -0.850651 | 0.000000 | 0.000000 | -0.525731 | -0.850651 | 0.525731 | -0.850651 | 0.000000 |
| 8 | -0.525731 | -0.850651 | 0.000000 | 0.000000 | -0.525731 | 0.850651 | 0.525731 | -0.850651 | 0.000000 |
| 9 | 0.000000 | -0.525731 | -0.850651 | 0.850651 | 0.000000 | -0.525731 | 0.525731 | -0.850651 | 0.000000 |
| 10 | 0.850651 | 0.000000 | -0.525731 | 0.525731 | -0.850651 | 0.000000 | 0.850651 | 0.000000 | 0.525731 |
| 11 | 0.000000 | -0.525731 | 0.850651 | 0.525731 | -0.850651 | 0.000000 | 0.850651 | 0.000000 | 0.525731 |
| 12 | 0.850651 | 0.000000 | -0.525731 | 0.525731 | 0.850651 | 0.000000 | 0.850651 | 0.000000 | 0.525731 |
| 13 | -0.525731 | 0.850651 | 0.000000 | -0.850651 | 0.000000 | 0.525731 | 0.000000 | 0.525731 | 0.850651 |
| 14 | 0.000000 | -0.525731 | 0.850651 | -0.850651 | 0.000000 | 0.525731 | 0.000000 | 0.525731 | 0.850651 |
| 15 | -0.525731 | 0.850651 | 0.000000 | 0.525731 | 0.850651 | 0.000000 | 0.000000 | 0.525731 | 0.850651 |
| 16 | 0.000000 | -0.525731 | 0.850651 | 0.850651 | 0.000000 | 0.525731 | 0.000000 | 0.525731 | 0.850651 |
| 17 | 0.525731 | 0.850651 | 0.000000 | 0.850651 | 0.000000 | 0.525731 | 0.000000 | 0.525731 | 0.850651 |
| 18 | -0.525731 | 0.850651 | 0.000000 | 0.000000 | 0.525731 | -0.850651 | 0.525731 | 0.850651 | 0.000000 |
| 19 | 0.850651 | 0.000000 | -0.525731 | 0.000000 | 0.525731 | -0.850651 | 0.525731 | 0.850651 | 0.000000 |

Just now, as I’m re-reading my code, it occurs to me to wonder how `pandas` knows to join `('end', 'x')` to `('start', 'x')` rather than e.g. `('start', 'y')`. I doubt it’s because of the similarity of the names, because I could just as easily have named the columns in the right dataframe `a, b, c` instead of `x, y, z`. I suspect it’s the order of the keys in `left_on` and `right_on`, such that the first column in `left_on` is compared to the first column in `right_on` and so forth. If that’s the case, we’re assuming that the `start` and `end` columns in our dataframe will always both be in `x`, `y`, `z` order, which may not always be a valid assumption. If this were going to be production software, I’d need to do something to ensure the columns were always in the proper order, such as creating a `Vertices` class that wraps each dataframe containing coordinate information, and whose constructor explicitly sets the order. But for our purposes, we know that we’re only going to be passing in properly ordered dataframes.
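(For the record, `pandas` does pair the keys positionally: the first entry of `left_on` is matched against the first entry of `right_on`, and so on.) A minimal sketch of what such a `Vertices` wrapper might look like, with the class name and API invented for illustration:

```python
import pandas as pd

class Vertices:
    """Wraps a dataframe of 3D points, guaranteeing x, y, z column order."""

    COLUMNS = ['x', 'y', 'z']

    def __init__(self, df):
        missing = set(self.COLUMNS) - set(df.columns)
        if missing:
            raise ValueError(f'missing coordinate columns: {missing}')
        # Reindex the columns so positional joins downstream can rely on the order
        self.df = df[self.COLUMNS].copy()

# Columns come back in x, y, z order no matter how they were passed in
v = Vertices(pd.DataFrame({'z': [0.0], 'x': [1.0], 'y': [2.0]}))
print(list(v.df.columns))  # ['x', 'y', 'z']
```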

Now we can define a `plot_faces()` function that takes an `a, b, c` dataframe like this as its argument:

```
fig = plt.figure()
ax = plt.axes(projection='3d')
def plot_faces(df):
    df.apply(lambda r: ax.plot3D(*[(r['a'][c], r['b'][c], r['c'][c], r['a'][c]) for c in ('x', 'y', 'z')], 'green'), axis=1)
plot_faces(faces)
```

#### Too many lines

Once I had a dataframe representing the icosahedron’s faces, I hatched the following convoluted plan. For a frequency-`n` geodesic polyhedron:

- Divide each side of each face into `n` pieces. This involves creating `n - 1` new evenly spaced points between each pair of vertices.
- Take two of the sides and draw lines between their intermediate points, parallel to the face’s third side.
- Through some twisted logic, divide each of these lines into one or more mini-lines, each the same length as the divisions of the original sides.
- Connect the intermediate points of the parallel lines through even more twisted logic.

I’m not even going to share the bits of real or pseudocode I developed to do this, because, fortunately, midway through implementing this eldritch horror of an algorithm, I realized that I was needlessly overcomplicating things. Everything fell into place when I remembered that *it’s all just triangles, and they’re all the same size.*

### Mini-Triangles: This is the way

Consider a face that we want to tessellate. The original face is an equilateral triangle. And, once we’ve finished the subdivision (but before we project the new points onto the circumscribed sphere), each of the subdivisions is also an equilateral triangle, oriented the exact same way as the original triangle. The only difference is, each mini-triangle has shorter sides, and is translated in three dimensions relative to the original.

If you look at a tessellated triangle, you’ll see one mini-triangle at the very top. Below it are two mini-triangles, whose top vertices are just the two bottom vertices of the one above. (There’s a third, inverted triangle between them, but we can ignore it; it’s just what falls out automatically when we consider the space that’s left between the top triangle and each of the next two.) The next row contains three up-pointing mini-triangles: one for the bottom-left vertex of the leftmost triangle above, and then one for each bottom-right vertex. (Since the triangles in each row all touch at the base, the bottom-right of one is identical to the bottom-left of its neighbor; thus we only have to worry about the bottom-left vertex of the first triangle, and can look at only bottom-right vertices after that.)
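This row structure implies that row `i` (counting from zero) holds `i + 1` upward-pointing mini-triangles: n(n + 1)/2 of them in total, and adding the n(n − 1)/2 inverted ones we’re ignoring gives the n² mini-triangles a frequency-`n` subdivision should have. A quick arithmetic check:

```python
def upward_triangles(n):
    # one mini-triangle in the first row, two in the second, and so on
    return sum(i + 1 for i in range(n))

for n in (2, 3, 7):
    up = upward_triangles(n)
    down = up - n  # the inverted triangles that fill the gaps between rows
    assert up + down == n * n

print(upward_triangles(7))  # 28 upward mini-triangles per face at frequency 7
```

Since we only ever build the upward-pointing triangles explicitly, a frequency-7 tessellation of the icosahedron’s 20 faces should produce 20 × 28 = 560 rows, which matches the dataframe we get further down.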

So the first thing we’ll do is create a `get_little_triangles()` function that takes a faces dataframe and a frequency `n`, and returns one little triangle for each face. Each little triangle will be translated such that its `a` vertex is at `(0, 0, 0)`. That way we can translate each little triangle wherever we want just by adding it to the point we want to be the new `a` vertex.

```
def get_little_triangles(faces, n):
    new_vectors = {}
    new_vectors['a'] = 0.0 * faces['a']
    original_length = None
    # To get a triangle, we only need to find the ab and ac vectors
    for end in 'bc':
        start_minus_end = faces[end] - faces['a']
        # Since all edges are the same length, we only need to calculate original_length once
        if original_length is None:
            original_length = start_minus_end.head(1).apply(linalg.norm, axis=1)[0]
            new_length = original_length / n
        new_vectors[end] = start_minus_end.apply(normalize, axis=1) * new_length
    return pd.concat(new_vectors, axis=1)
little_triangles = get_little_triangles(faces, 2)
little_triangles
```

| | a x | a y | a z | b x | b y | b z | c x | c y | c z |
|---|---|---|---|---|---|---|---|---|---|
| 0 | -0.0 | -0.0 | 0.0 | -0.162460 | 0.425325 | -0.262866 | 0.262866 | 0.162460 | -0.425325 |
| 1 | -0.0 | -0.0 | 0.0 | -0.162460 | 0.425325 | -0.262866 | -0.162460 | 0.425325 | 0.262866 |
| 2 | -0.0 | -0.0 | 0.0 | 0.262866 | 0.162460 | 0.425325 | -0.162460 | 0.425325 | 0.262866 |
| 3 | -0.0 | 0.0 | -0.0 | 0.162460 | 0.425325 | 0.262866 | 0.000000 | 0.000000 | 0.525731 |
| 4 | -0.0 | 0.0 | -0.0 | 0.425325 | -0.262866 | -0.162460 | 0.425325 | 0.262866 | -0.162460 |
| 5 | -0.0 | 0.0 | -0.0 | 0.162460 | 0.425325 | 0.262866 | 0.425325 | 0.262866 | -0.162460 |
| 6 | 0.0 | -0.0 | -0.0 | 0.425325 | 0.262866 | 0.162460 | 0.000000 | 0.525731 | 0.000000 |
| 7 | -0.0 | -0.0 | 0.0 | 0.262866 | 0.162460 | -0.425325 | 0.525731 | 0.000000 | 0.000000 |
| 8 | -0.0 | -0.0 | 0.0 | 0.262866 | 0.162460 | 0.425325 | 0.525731 | 0.000000 | 0.000000 |
| 9 | 0.0 | -0.0 | -0.0 | 0.425325 | 0.262866 | 0.162460 | 0.262866 | -0.162460 | 0.425325 |
| 10 | 0.0 | 0.0 | -0.0 | -0.162460 | -0.425325 | 0.262866 | 0.000000 | 0.000000 | 0.525731 |
| 11 | 0.0 | -0.0 | 0.0 | 0.262866 | -0.162460 | -0.425325 | 0.425325 | 0.262866 | -0.162460 |
| 12 | 0.0 | 0.0 | -0.0 | -0.162460 | 0.425325 | 0.262866 | 0.000000 | 0.000000 | 0.525731 |
| 13 | -0.0 | 0.0 | 0.0 | -0.162460 | -0.425325 | 0.262866 | 0.262866 | -0.162460 | 0.425325 |
| 14 | 0.0 | -0.0 | 0.0 | -0.425325 | 0.262866 | -0.162460 | 0.000000 | 0.525731 | 0.000000 |
| 15 | -0.0 | 0.0 | 0.0 | 0.525731 | 0.000000 | 0.000000 | 0.262866 | -0.162460 | 0.425325 |
| 16 | 0.0 | -0.0 | 0.0 | 0.425325 | 0.262866 | -0.162460 | 0.000000 | 0.525731 | 0.000000 |
| 17 | 0.0 | 0.0 | 0.0 | 0.162460 | -0.425325 | 0.262866 | -0.262866 | -0.162460 | 0.425325 |
| 18 | -0.0 | 0.0 | 0.0 | 0.262866 | -0.162460 | -0.425325 | 0.525731 | 0.000000 | 0.000000 |
| 19 | 0.0 | 0.0 | -0.0 | -0.425325 | 0.262866 | -0.162460 | -0.162460 | 0.425325 | 0.262866 |

Here’s an example of how we would generate the topmost little triangle for each face (where “topmost” means the one that includes the face’s `a` vertex; the choice is arbitrary, and does not imply that said triangle is “above” the others in a literal spatial sense).

When we add `little_triangles` to a vertex in order to translate it, what we actually need to do is add it to a dataframe of the same shape, whose `a`, `b`, and `c` columns all have the same `x, y, z` values. Essentially, we’re saying “let’s add the same vector to each of the three vertices”. In order to generate said dataframe, I made use of a way of calling `pd.concat()` that I hadn’t known about before, and which I might have been able to use to simplify some of my earlier code. It turns out `pd.concat()` will accept a dictionary as its first argument, whose keys are the column names (if we’re using `axis=1`). If the values of this dictionary are themselves dataframes, the result is a nice multi-index dataframe. So I use a dictionary comprehension to assign `faces['a']` to all three keys, pass it to `pd.concat()`, and add the result to `little_triangles`:

```
pd.concat({k: faces['a'] for k in 'abc'}, axis=1) + little_triangles
```

| | a x | a y | a z | b x | b y | b z | c x | c y | c z |
|---|---|---|---|---|---|---|---|---|---|
| 0 | -0.525731 | -0.850651 | 0.000000 | -0.688191 | -0.425325 | -0.262866 | -0.262866 | -0.688191 | -0.425325 |
| 1 | -0.525731 | -0.850651 | 0.000000 | -0.688191 | -0.425325 | -0.262866 | -0.688191 | -0.425325 | 0.262866 |
| 2 | -0.525731 | -0.850651 | 0.000000 | -0.262866 | -0.688191 | 0.425325 | -0.688191 | -0.425325 | 0.262866 |
| 3 | -0.850651 | 0.000000 | -0.525731 | -0.688191 | 0.425325 | -0.262866 | -0.850651 | 0.000000 | 0.000000 |
| 4 | -0.850651 | 0.000000 | -0.525731 | -0.425325 | -0.262866 | -0.688191 | -0.425325 | 0.262866 | -0.688191 |
| 5 | -0.850651 | 0.000000 | -0.525731 | -0.688191 | 0.425325 | -0.262866 | -0.425325 | 0.262866 | -0.688191 |
| 6 | 0.000000 | -0.525731 | -0.850651 | 0.425325 | -0.262866 | -0.688191 | 0.000000 | 0.000000 | -0.850651 |
| 7 | -0.525731 | -0.850651 | 0.000000 | -0.262866 | -0.688191 | -0.425325 | 0.000000 | -0.850651 | 0.000000 |
| 8 | -0.525731 | -0.850651 | 0.000000 | -0.262866 | -0.688191 | 0.425325 | 0.000000 | -0.850651 | 0.000000 |
| 9 | 0.000000 | -0.525731 | -0.850651 | 0.425325 | -0.262866 | -0.688191 | 0.262866 | -0.688191 | -0.425325 |
| 10 | 0.850651 | 0.000000 | -0.525731 | 0.688191 | -0.425325 | -0.262866 | 0.850651 | 0.000000 | 0.000000 |
| 11 | 0.000000 | -0.525731 | 0.850651 | 0.262866 | -0.688191 | 0.425325 | 0.425325 | -0.262866 | 0.688191 |
| 12 | 0.850651 | 0.000000 | -0.525731 | 0.688191 | 0.425325 | -0.262866 | 0.850651 | 0.000000 | 0.000000 |
| 13 | -0.525731 | 0.850651 | 0.000000 | -0.688191 | 0.425325 | 0.262866 | -0.262866 | 0.688191 | 0.425325 |
| 14 | 0.000000 | -0.525731 | 0.850651 | -0.425325 | -0.262866 | 0.688191 | 0.000000 | 0.000000 | 0.850651 |
| 15 | -0.525731 | 0.850651 | 0.000000 | 0.000000 | 0.850651 | 0.000000 | -0.262866 | 0.688191 | 0.425325 |
| 16 | 0.000000 | -0.525731 | 0.850651 | 0.425325 | -0.262866 | 0.688191 | 0.000000 | 0.000000 | 0.850651 |
| 17 | 0.525731 | 0.850651 | 0.000000 | 0.688191 | 0.425325 | 0.262866 | 0.262866 | 0.688191 | 0.425325 |
| 18 | -0.525731 | 0.850651 | 0.000000 | -0.262866 | 0.688191 | -0.425325 | 0.000000 | 0.850651 | 0.000000 |
| 19 | 0.850651 | 0.000000 | -0.525731 | 0.425325 | 0.262866 | -0.688191 | 0.688191 | 0.425325 | -0.262866 |

Now we have all we need to define a better `tessellate()` function. We’ll pass it our `faces` dataframe and the frequency `n`. Then we’ll iteratively build an accumulator (`accum`), which will be an `n`-length list of lists, wherein each of the sub-lists represents a row of triangles. The first row just contains a little triangle at the `a` vertex of the face, and each row thereafter adds a triangle at the `b` vertex of the first triangle in the previous row, plus one triangle for each `c` vertex of the previous row’s triangles.

After we’ve built `accum`, we flatten it by using `itertools.chain.from_iterable()` to turn it from a list of lists into a list, and use `pd.concat()` to turn the list into a single dataframe. We’ll call that one `denormalized` because it contains only the tessellated triangles, but they haven’t yet been normalized. The final step is to apply `normalize` to each of `denormalized`’s `a, b, c` columns and return the result.

```
from itertools import chain
def tessellate(faces, n):
    little_triangles = get_little_triangles(faces, n)
    accum = []
    for _ in range(n):
        if len(accum) == 0:
            accum.append([pd.concat({k: faces['a'] for k in 'abc'}, axis=1) + little_triangles])
        else:
            next_row = [pd.concat({k: accum[-1][0]['b'] for k in 'abc'}, axis=1) + little_triangles]
            next_row += [pd.concat({k: triangle['c'] for k in 'abc'}, axis=1) + little_triangles for triangle in accum[-1]]
            accum.append(next_row)
    denormalized = pd.concat(chain.from_iterable(accum)).reset_index(drop=True)
    return pd.concat({k: denormalized[k].apply(normalize, axis=1) for k in 'abc'}, axis=1)
geodesic = tessellate(faces, 7)
geodesic
```

| | a x | a y | a z | b x | b y | b z | c x | c y | c z |
|---|---|---|---|---|---|---|---|---|---|
| 0 | -0.525731 | -0.850651 | 0.000000 | -0.615311 | -0.784135 | -0.080770 | -4.846222e-01 | -0.864906 | -1.306892e-01 |
| 1 | -0.525731 | -0.850651 | 0.000000 | -0.615311 | -0.784135 | -0.080770 | -6.153114e-01 | -0.784135 | 8.077037e-02 |
| 2 | -0.525731 | -0.850651 | 0.000000 | -0.484622 | -0.864906 | 0.130689 | -6.153114e-01 | -0.784135 | 8.077037e-02 |
| 3 | -0.850651 | 0.000000 | -0.525731 | -0.864906 | 0.130689 | -0.484622 | -9.148244e-01 | 0.000000 | -4.038518e-01 |
| 4 | -0.850651 | 0.000000 | -0.525731 | -0.784135 | -0.080770 | -0.615311 | -7.841352e-01 | 0.080770 | -6.153114e-01 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 555 | -0.080770 | 0.615311 | 0.784135 | 0.080770 | 0.615311 | 0.784135 | 8.326673e-17 | 0.525731 | 8.506508e-01 |
| 556 | 0.000000 | 0.403852 | 0.914824 | 0.130689 | 0.484622 | 0.864906 | 0.000000e+00 | 0.525731 | 8.506508e-01 |
| 557 | 0.080770 | 0.615311 | 0.784135 | 0.130689 | 0.484622 | 0.864906 | -8.326673e-17 | 0.525731 | 8.506508e-01 |
| 558 | 0.403852 | 0.914824 | 0.000000 | 0.484622 | 0.864906 | -0.130689 | 5.257311e-01 | 0.850651 | 0.000000e+00 |
| 559 | 0.615311 | 0.784135 | -0.080770 | 0.484622 | 0.864906 | -0.130689 | 5.257311e-01 | 0.850651 | 8.326673e-17 |

560 rows × 9 columns

The results are quite pleasing!

```
fig = plt.figure()
ax = plt.axes(projection='3d')
plot_faces(geodesic)
```

We can now turn our faces dataframe into an edges dataframe and pass it to our `get_strut_lengths()` function to find out how many struts of each length we’d need to build a model:

```
def faces_to_edges(faces):
    return pd.concat([
        pd.concat({k: faces[v] for k, v in (('start', 'a'), ('end', 'b'))}, axis=1),
        pd.concat({k: faces[v] for k, v in (('start', 'a'), ('end', 'c'))}, axis=1),
        pd.concat({k: faces[v] for k, v in (('start', 'b'), ('end', 'c'))}, axis=1),
    ]).reset_index(drop=True)
get_strut_lengths(faces_to_edges(geodesic))
```

| length | count |
|---|---|
| 0.138 | 120 |
| 0.152 | 120 |
| 0.157 | 120 |
| 0.162 | 60 |
| 0.165 | 120 |
| 0.171 | 360 |
| 0.174 | 120 |
| 0.176 | 180 |
| 0.182 | 300 |
| 0.185 | 120 |
| 0.188 | 60 |

Unfortunately, these counts are probably off, because we’re double-counting any edges that are shared between two faces. We’ll need to drop any duplicate rows.

At first glance, this should be simple: just use `pandas`’ built-in `drop_duplicates()` method.

```
def dedupe_edges(df):
    return df.drop_duplicates().reset_index(drop=True)
get_strut_lengths(dedupe_edges(faces_to_edges(geodesic)))
```

| length | count |
|---|---|
| 0.138 | 92 |
| 0.152 | 120 |
| 0.157 | 92 |
| 0.162 | 60 |
| 0.165 | 120 |
| 0.171 | 332 |
| 0.174 | 120 |
| 0.176 | 166 |
| 0.182 | 300 |
| 0.185 | 120 |
| 0.188 | 60 |

That looks better, but it’s possible that we still have some dupes, just not the kind that `pandas` would recognize as such. Consider that we have two top-level columns, `start` and `end`, but those names are arbitrarily assigned: a line segment from `start` to `end` is the same line segment as one from `end` to `start`.

One easy way to ensure that we’re truly deduplicating our dataframe is to make sure that `start` and `end` are always in the same order. If we were dealing with integers, we could just sort each row, ensuring that `start` is always less than `end`. It’s not that simple with 3D vectors, which don’t have an inherent ordering. Fortunately, since all we want to do is deduplicate, we can use an arbitrary ordering. We can convert each row’s `start` and `end` columns to lists and use Python’s built-in list comparison. All this does is first compare the `x` values, then the `y` values if the `x` values are equal, then the `z` values if the `y`s are equal.
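Python’s lexicographic comparison in action:

```python
# Lists compare element by element; the first unequal pair decides the result
assert [0.0, 0.5, 0.9] < [0.0, 0.6, 0.1]          # x equal, so y breaks the tie
assert [0.0, 0.5, 0.1] < [0.0, 0.5, 0.9]          # x and y equal, so z decides
assert not ([1.0, 0.0, 0.0] < [0.5, 9.9, 9.9])    # x decides immediately
```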

We’ll define a `disordered()` function that takes a row from an edge dataframe as its argument. If that row’s `start` is greater than its `end`, it returns `True`.

```
def disordered(row):
    return list(row['start']) > list(row['end'])
```

Next we’ll define a `flip_row()` function that swaps the values of `start` and `end` in a row. I couldn’t find a simple or easy way to do this. Instead, there’s this vaguely monstrous function that creates a new data dictionary for the row with the top-level columns reversed, and passes it to the `pd.Series()` constructor.

```
def flip_row(row):
    data = {('start', n): row['end'][n] for n in 'xyz'}
    data.update({('end', n): row['start'][n] for n in 'xyz'})
    return pd.Series(data=data, index=row.index)
```

Finally, we’ll write a function that takes an edge dataframe and uses `apply` along the `1` axis to flip those rows where `start` is greater than `end`.

```
def sort_edges(df):
    return df.apply(lambda row: flip_row(row) if disordered(row) else row, axis=1)
```

```
get_strut_lengths(dedupe_edges(sort_edges(faces_to_edges(geodesic))))
```

| length | count |
|---|---|
| 0.138 | 92 |
| 0.152 | 120 |
| 0.157 | 92 |
| 0.162 | 60 |
| 0.165 | 120 |
| 0.171 | 332 |
| 0.174 | 120 |
| 0.176 | 166 |
| 0.182 | 300 |
| 0.185 | 120 |
| 0.188 | 60 |

That was a fair amount of work for no obvious change. We probably should have checked first to see whether there even were any pairs of edges in our dataframe where the start of one was the end of the other and vice versa. Well, let’s do that now:

```
edges = dedupe_edges(faces_to_edges(geodesic))
edges.merge(
    edges,
    left_on=[(a, b) for a in ('end', 'start') for b in 'xyz'],
    right_on=[(a, b) for a in ('start', 'end') for b in 'xyz'],
    suffixes=['_first', '_second']
)
```

| | end x | end y | end z | start x | start y | start z | start_first x | start_first y | start_first z | end_first x | end_first y | end_first z | start_second x | start_second y | start_second z | end_second x | end_second y | end_second z |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|

0 rows × 18 columns

Okay, so there weren’t. Still, it wasn’t a waste of time to think about the possibility. We probably just got lucky; we didn’t do anything when defining our edge dataframes that was explicitly designed to *guarantee* they wouldn’t contain mirrored line segments. And if we were writing production software, we’d definitely want to do that, because who knows whether, somewhere down the line, someone might make upstream changes that *do* result in mirrored line segments, thus breaking `dedupe_edges()`. At the very least, we should add unit tests that check for mirrored line segments in the input. That way, if someone makes such a change, it’ll cause the tests to fail, and they can look at the tests and know exactly what inputs were expected.
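Such a test might look something like this sketch: a hypothetical `has_mirrored_edges()` helper (built from the same merge as above) run against a deliberately mirrored toy input, using the same multi-indexed `start`/`end` layout as our edge dataframes:

```python
import pandas as pd

def has_mirrored_edges(edges):
    """True if any edge also appears with its start and end swapped."""
    mirrored = edges.merge(
        edges,
        left_on=[(a, b) for a in ('end', 'start') for b in 'xyz'],
        right_on=[(a, b) for a in ('start', 'end') for b in 'xyz'],
        suffixes=('_a', '_b'),
    )
    return len(mirrored) > 0

cols = pd.MultiIndex.from_product([('start', 'end'), ('x', 'y', 'z')])
# The second row is the first row reversed: a mirrored pair the test should catch
bad = pd.DataFrame([[0, 0, 0, 1, 1, 1], [1, 1, 1, 0, 0, 0]], columns=cols, dtype=float)
good = bad.head(1)

assert has_mirrored_edges(bad)
assert not has_mirrored_edges(good)
```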

## Other starting shapes

We’ve made great strides! We created a dataframe representing an icosahedron, a function that can turn it into an *n*-frequency geodesic polyhedron, a function for displaying said polyhedron, and a function that can count the number of struts we’d need to build a model. But what if we wanted to start with a shape other than an icosahedron?

Fortunately, our functions should work on any dataframe that meets the following criteria:

- three top-level columns named `a`, `b`, and `c`
- each top-level column contains sub-columns `x`, `y`, `z`
- each set of `x, y, z` values represents a point 1 unit from the origin

So any set of face data for a regular triangle-faced unit polyhedron should work, as well as any triangle-faced “pseudo-polyhedron” we might make by subdividing the faces of a cube or dodecahedron into triangles. (It should even work for a random assortment of triangles that don’t even define a polyhedron, though the shapes that are ultimately created won’t be geodesic polyhedra.)
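If we wanted to be defensive about it, a hypothetical `validate_faces()` guard could check those three criteria before any of the other functions touch the dataframe (the function name and the toy input here are mine, not from the code above):

```python
import numpy as np
import pandas as pd

def validate_faces(df, tol=1e-9):
    # Criterion 1: top-level columns a, b, c
    assert set(df.columns.get_level_values(0)) == {'a', 'b', 'c'}
    for corner in 'abc':
        # Criterion 2: sub-columns x, y, z, in order
        assert list(df[corner].columns) == ['x', 'y', 'z']
        # Criterion 3: every vertex lies on the unit sphere
        radii = np.sqrt((df[corner] ** 2).sum(axis=1))
        assert np.allclose(radii, 1.0, atol=tol)

# A one-face toy dataframe whose vertices are all unit vectors
toy = pd.concat({
    'a': pd.DataFrame({'x': [1.0], 'y': [0.0], 'z': [0.0]}),
    'b': pd.DataFrame({'x': [0.0], 'y': [1.0], 'z': [0.0]}),
    'c': pd.DataFrame({'x': [0.0], 'y': [0.0], 'z': [1.0]}),
}, axis=1)
validate_faces(toy)  # passes silently
```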

Let’s try one, then. The unit octahedron should be fairly simple to construct. And this time, let’s not bother with defining a set of points, creating edges between them, and assembling those into shapes. Let’s just go straight to shapes.

The vertices of a unit octahedron are as follows: a point at the north pole `(0, 1, 0)`, one at the south pole `(0, -1, 0)`, and four points around the equator: `(0, 0, 1), (1, 0, 0), (0, 0, -1), (-1, 0, 0)`. For each pair of adjacent equatorial points, there are two triangles: one including the north pole, and one including the south.

```
def coords(x, y, z):
    return pd.DataFrame({'x': [x], 'y': [y], 'z': [z]})
north = coords(0.0, 1.0, 0.0)   # north pole
south = coords(0.0, -1.0, 0.0)  # south pole
equator = [coords(*n) for n in ((0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (0.0, 0.0, -1.0), (-1.0, 0.0, 0.0))]
equator.append(equator[0])  # For closure
octahedron = pd.concat([
    pd.concat({'a': pole, 'b': equator[i], 'c': equator[i + 1]}, axis=1)
    for pole in (north, south)
    for i in range(len(equator) - 1)
]).reset_index(drop=True)
octahedron
```

| | a x | a y | a z | b x | b y | b z | c x | c y | c z |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 |
| 1 | 0.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | -1.0 |
| 2 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | -1.0 | -1.0 | 0.0 | 0.0 |
| 3 | 0.0 | 1.0 | 0.0 | -1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| 4 | 0.0 | -1.0 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 |
| 5 | 0.0 | -1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | -1.0 |
| 6 | 0.0 | -1.0 | 0.0 | 0.0 | 0.0 | -1.0 | -1.0 | 0.0 | 0.0 |
| 7 | 0.0 | -1.0 | 0.0 | -1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |

```
fig = plt.figure()
ax = plt.axes(projection='3d')
plot_faces(octahedron)
```

```
fig = plt.figure()
ax = plt.axes(projection='3d')
plot_faces(tessellate(octahedron, 5))
```

```
get_strut_lengths(dedupe_edges(sort_edges(faces_to_edges(tessellate(octahedron, 5)))))
```

| length | count |
|---|---|
| 0.244 | 32 |
| 0.314 | 48 |
| 0.341 | 32 |
| 0.343 | 24 |
| 0.389 | 48 |
| 0.392 | 16 |
| 0.400 | 48 |
| 0.437 | 48 |
| 0.471 | 24 |

Looks great!

## Post-mortem

I’m not going to go through the whole list of Platonic solids and demonstrate how to build geodesic polyhedra from them. You get the point, and I can leave the other shapes as an exercise for the reader. Instead, I’d like to review the project and see what worked, what didn’t, and what we could do better if we wanted to implement this in a production-ready application.

### Structure

Early on in this project, I envisioned everything taking place inside `pandas` dataframes, with my code consisting of a set of functions taking dataframes as their arguments. I’ve said before that Python is not really an object-oriented language, and I don’t really see the need to define classes if functions will get you where you want to go. And at first, that approach worked. Our first dataframe, with the `x, y, z` columns, worked pretty well. The Fortran-like superpowers built into `numpy` and `pandas` meant that we could manipulate whole tables of coordinates almost as simply as we could perform arithmetic on scalars.
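As a small illustration of that vectorization (a sketch with made-up data, not the post’s actual code), here is how a whole `x, y, z` dataframe can be projected onto the unit sphere without any explicit loop over rows:

```python
import numpy as np
import pandas as pd

# A few vertices as an x, y, z dataframe (illustrative data only).
points = pd.DataFrame({
    "x": [1.0, 0.0, 3.0],
    "y": [0.0, 2.0, 0.0],
    "z": [0.0, 0.0, 4.0],
})

# Compute each row's distance from the origin, then divide the whole
# table by it row-wise -- one expression normalizes every vertex.
radii = np.sqrt((points ** 2).sum(axis=1))
normalized = points.div(radii, axis=0)
```

Every row of `normalized` now lies on the unit sphere; that one `div` call is doing the work that would otherwise be a loop over vertices.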

Since dataframe magic worked pretty well for sets of points, I naturally thought I should preserve the basic dataframe structure when moving on to line segments and then faces. That’s why I chose to use the multi-indexed columns. However, this approach wound up causing more problems than it solved. Everything that worked pretty easily with the `x, y, z` dataframe became ten times as complex with the multi-index dataframe. I thought I could just do `.apply(normalize, axis=1)` in order to normalize all the vectors in my edge dataframe, but that didn’t work at all; instead, I had to break `start` and `end` out into their own dataframes, normalize each of them, and then put them back together. I also ran into a lot of weird issues with indexing, stemming from the fact that those multi-level columns are really just a bunch of regular columns with tuples for names and some – but not enough – syntactic sugar on top.
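For the record, that workaround looked roughly like this. This is a sketch with a one-row hypothetical edge dataframe; `normalize` here is my stand-in for the post’s normalization function, not its exact code:

```python
import numpy as np
import pandas as pd

# A hypothetical edge dataframe with multi-level columns:
# ('start', 'x') ... ('end', 'z'). One edge from (1,0,0) to (0,2,0).
cols = pd.MultiIndex.from_product([["start", "end"], ["x", "y", "z"]])
edges = pd.DataFrame([[1.0, 0.0, 0.0, 0.0, 2.0, 0.0]], columns=cols)

def normalize(df):
    """Project each x, y, z row onto the unit sphere."""
    radii = np.sqrt((df ** 2).sum(axis=1))
    return df.div(radii, axis=0)

# .apply(normalize, axis=1) on the multi-index frame didn't work, so:
# pull each endpoint out into its own plain x, y, z dataframe,
# normalize it, and stitch the two halves back together.
start = normalize(edges["start"])
end = normalize(edges["end"])
normalized_edges = pd.concat({"start": start, "end": end}, axis=1)
```

`pd.concat` with a dict and `axis=1` rebuilds the two-level columns, so the result has the same shape as the input.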

There’s also the problem that my functions all accepted dataframes, but would only work on certain *types* of dataframes. And this isn’t a problem that type-hinting would solve, either, since they’re all just instances of `pandas.DataFrame`. But if you pass a dataframe containing edges to a function expecting a dataframe containing faces, it’ll fail. If you’re lucky, it’ll error out, but in some cases you might actually get a result, albeit a meaningless one.

So I think a bit of object-oriented wrapping would help. The only dataframes we should use are the good old single-indexed kind, where each row contains `x, y, z` coordinates. And the user of this library should never actually have to interact with those dataframes directly. Instead, wrap them in a `Vertices` class, then implement an `Edges` class that has a pair of `Vertices` objects as members, and a `Faces` class that has three `Vertices` objects as members. The constructors would all ensure you didn’t wind up with pathological dataframes, and the public methods would only take `Vertices`, `Edges`, or `Faces` objects as arguments, never dataframes.
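A minimal sketch of what that wrapping might look like. The class names come from the discussion above, but the validation rules and method names are my speculation, not an existing API:

```python
import numpy as np
import pandas as pd

class Vertices:
    """A thin wrapper guaranteeing a plain, single-indexed x, y, z dataframe."""
    def __init__(self, df: pd.DataFrame):
        if list(df.columns) != ["x", "y", "z"]:
            raise ValueError("expected exactly x, y, z columns")
        self.df = df.reset_index(drop=True)

    def normalized(self) -> "Vertices":
        """Project every vertex onto the unit sphere."""
        radii = np.sqrt((self.df ** 2).sum(axis=1))
        return Vertices(self.df.div(radii, axis=0))

class Edges:
    """An edge set: two Vertices objects; row i of each forms edge i."""
    def __init__(self, start: Vertices, end: Vertices):
        if len(start.df) != len(end.df):
            raise ValueError("start and end must have the same length")
        self.start, self.end = start, end

class Faces:
    """A face set: three Vertices objects; row i of each forms face i."""
    def __init__(self, a: Vertices, b: Vertices, c: Vertices):
        if not (len(a.df) == len(b.df) == len(c.df)):
            raise ValueError("a, b, and c must have the same length")
        self.a, self.b, self.c = a, b, c
```

The point is that a malformed dataframe fails loudly at construction time, instead of producing a silently meaningless result deep inside some geometry function.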

In fact, the more I think about it, the more I think that “edges” might be superfluous. We wound up implementing pretty much everything in terms of faces. We only had to convert faces to edges when we wanted to count the struts by length, and we could probably re-implement our strut-counting algorithm to work on face data directly.
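Counting struts straight from face data might look something like this sketch: walk the faces, emit each face’s three endpoint pairs, sort each pair so a shared edge is only counted once, then tally by rounded length. This is illustrative code against the post’s multi-level `a`/`b`/`c` column layout, not the actual implementation:

```python
import numpy as np
import pandas as pd

# A single illustrative face: multi-level columns a, b, c, each x, y, z.
cols = pd.MultiIndex.from_product([["a", "b", "c"], ["x", "y", "z"]])
faces = pd.DataFrame(
    [[0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0]], columns=cols
)

def strut_lengths_from_faces(faces: pd.DataFrame) -> pd.Series:
    """Tally edge lengths directly from faces, skipping an edge table."""
    edges = set()
    for _, row in faces.iterrows():
        a, b, c = (tuple(row[v].round(6)) for v in ("a", "b", "c"))
        for p, q in ((a, b), (b, c), (c, a)):
            # Sort the endpoint pair so each shared edge counts once.
            edges.add(tuple(sorted((p, q))))
    lengths = [
        round(np.linalg.norm(np.subtract(p, q)), 3) for p, q in edges
    ]
    return pd.Series(lengths).value_counts().sort_index()
```

A Python-level loop over rows gives up the vectorization praised above, so a production version would want to do the deduplication with dataframe operations instead; this just shows that the edge table itself is dispensable.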

### Alternate approaches

One way we could have approached the problem would have been to define only a single face of a regular polyhedron, tessellate and re-normalize it, and then just rotate and translate it to make all the other faces. I don’t know whether this would be more efficient or not. I didn’t take this route mainly because it would have involved using rotation matrices, and also figuring out the exact angles you needed to rotate by in order to turn one face into another. The approach I wound up using had the advantage that I could figure most of it out in my head without having to break out my old linear algebra and geometry textbooks (or, who am I kidding, Wikipedia pages). Plus I kind of like the fact that we were applying our tessellation and normalization functions to all the faces at once. (I don’t know how literally to interpret that “all at once”; I believe `pandas` does take advantage of multiple cores where available, so it may actually be calculating more than one face’s data during the same processor cycles, but I haven’t confirmed that.)
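For anyone curious, the rotation-matrix route would start with something like the sketch below: one face of the octahedron mapped onto a neighbor by a 90-degree rotation about the z axis. Working out the correct axis and angle for *every* face is exactly the bookkeeping I wanted to avoid:

```python
import numpy as np

def rotation_about_z(theta: float) -> np.ndarray:
    """Standard 3x3 rotation matrix about the z axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([
        [c, -s, 0.0],
        [s,  c, 0.0],
        [0.0, 0.0, 1.0],
    ])

# One octahedron face, one vertex per row.
face = np.array([
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 0.0, 0.0],
])

# Rotating the rows by 90 degrees about z lands on a neighboring face.
rotated = face @ rotation_about_z(np.pi / 2).T
```

You would tessellate and normalize `face` once, then apply a different precomputed rotation per face; for the icosahedron those rotations are considerably less obvious than this axis-aligned one.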

Another thing that jumps out is that both our starting shape (the unit Platonic solid) and our ultimate goal (the geodesic polyhedron) consist entirely of points on the circumscribed sphere, plus information about which points have edges drawn between them. The only time we use any points not on that sphere is in the intermediate phase, after we tessellate but before we normalize the tessellated points. It might be possible to implement this algorithm entirely using polar coordinates. I’m not sure if that gains us anything; it might simplify things, or it might make things way too complex. But it’s something to consider.
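In spherical terms, every on-sphere point reduces to two angles plus a radius that is always 1, and normalization becomes trivial: just set `r = 1`. A quick sketch of the round trip, using the physics convention of polar angle theta and azimuth phi (this is exploratory, not part of the project’s code):

```python
import numpy as np

def to_spherical(xyz):
    """Cartesian (x, y, z) -> spherical (r, theta, phi)."""
    x, y, z = xyz
    r = np.sqrt(x**2 + y**2 + z**2)
    theta = np.arccos(z / r)   # polar angle, measured from +z
    phi = np.arctan2(y, x)     # azimuth in the x-y plane
    return np.array([r, theta, phi])

def to_cartesian(rtp):
    """Spherical (r, theta, phi) -> Cartesian (x, y, z)."""
    r, theta, phi = rtp
    return np.array([
        r * np.sin(theta) * np.cos(phi),
        r * np.sin(theta) * np.sin(phi),
        r * np.cos(theta),
    ])

# "Normalizing" a tessellated vertex: keep the angles, force r to 1.
p = np.array([1.0, 2.0, 2.0])          # length 3, inside the sphere of radius 3
r, theta, phi = to_spherical(p)
on_sphere = to_cartesian([1.0, theta, phi])
```

The catch is that the tessellation step itself (midpoints and subdivision) is linear in Cartesian coordinates and decidedly not linear in angles, which is where the "way too complex" risk comes in.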

### Next steps

Next up, I’d like to finish the improvements I’ve outlined here and turn this into a real library, with unit tests and everything. Then I’d like to finish implementing the other Platonic solids so that we could truly use this to generate any Class I geodesic polyhedron. (There are two other classes, which we haven’t even discussed, involving different ways to tessellate the faces; I have no interest at the moment in implementing those, as they’re more complex, and I don’t really see what advantages they have.) Finally, I’d like to make this truly interactive with some sort of web front-end. I’ve been meaning to try out PyScript ever since it came out, and this might be the perfect project for that.

I’d also like to try some more advanced `matplotlib` visualizations. First, I’d like to try animating the plot, rotating the shape on its axis. And second, I’d like to add hidden-surface removal so that you only see the front of the shape (or at least so that hidden edges are a different color).

Last modified on 2023-10-25