Recently, I listened to a story from the New York Times about how the world is entering a new space race. The US, China, and Russia are all vying for military dominance and developing new weapons to target satellites. The story also touched on the increasing role of private companies in space.
I was curious to see just how much satellite launches are accelerating. I found a few nice visualizations, like this one from the Economist. Let’s see if we can recreate these figures, but with an x-axis that extends to the present day.
For the data, we’ll turn to Jonathan’s Space Report, a fantastic (and old-school) website with a massive amount of information on the entire history of spaceflight. It also has tons of cool data visualizations. This website appears to be the passion project of Jonathan McDowell, a researcher at the Harvard-Smithsonian Center for Astrophysics. So thank you, Jonathan!
First, let’s download and read in the data.
import os.path

import pandas as pd
import wget

# download database of launches, if it doesn't already exist
if not os.path.isfile("satcat.tsv"):
    print("downloading satcat.tsv")
    url = 'https://planet4589.org/space/gcat/tsv/cat/satcat.tsv'
    wget.download(url)

# read in tsv file to data frame
sats = pd.read_table("satcat.tsv", sep='\t')

# we need to remove the first row, since it does not contain data
sats = sats.drop(index=0)
print("there are " + str(len(sats.index)) + " satellites in this dataset")

# examine data table
sats.head()
there are 59046 satellites in this dataset
C:\Users\DanJuliaPC\AppData\Local\Temp\ipykernel_33912\2828006063.py:12: DtypeWarning:
Columns (1,18,20,22,24,26,28,32,36) have mixed types. Specify dtype option on import or set low_memory=False.
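|   | #JCAT | Satcat | Piece | Type | Name | PLName | LDate | Parent | SDate | Primary | ... | ODate | Perigee | PF | Apogee | AF | Inc | IF | OpOrbit | OQUAL | AltNames |
|---|-------|--------|-------|------|------|--------|-------|--------|-------|---------|-----|-------|---------|----|--------|----|-----|----|---------|-------|----------|
| 1 | S00001 | 1.0 | 1957 ALP 1 | R2 | 8K71PS No. M1-10 Stage 2 | 8K71A M1-10 (M1-1PS) | 1957 Oct 4 | - | 1957 Oct 4 1933 | Earth | ... | 1957 Oct 4 | 214.0 |  | 938 |  | 65.1 |  | LLEO/I | - | - |
| 2 | S00002 | 2.0 | 1957 ALP 2 | P | 1-y ISZ | PS-1 | 1957 Oct 4 | S00001 | 1957 Oct 4 1933 | Earth | ... | 1957 Oct 4 | 214.0 |  | 938 |  | 65.1 |  | LLEO/I | - | :RE,:RC |
| 3 | S00003 | 3.0 | 1957 BET 1 | P A | 2-y ISZ | PS-2 | 1957 Nov 3 | A00002 | 1957 Nov 3 0235 | Earth | ... | 1957 Nov 3 | 211.0 |  | 1659 |  | 65.33 |  | LEO/I | - | :RE,:RC |
| 4 | S00004 | 4.0 | 1958 ALP | P A | Explorer I | Explorer 1 | 1958 Feb 1 | A00004 | 1958 Feb 1 0355 | Earth | ... | 1958 Feb 1 | 359.0 |  | 2542 |  | 33.18 |  | LEO/I | - | :UA,:UB,DEAL I:IA |
| 5 | S00005 | 5.0 | 1958 BET 2 | P | Vanguard I | Vanguard Test Satellite H | 1958 Mar 17 | S00016 | 1958 Mar 17 1224 | Earth | ... | 1959 May 23 | 657.0 |  | 3935 |  | 34.25 |  | MEO | - | :UA,:VA |

5 rows × 41 columns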
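A quick aside on the DtypeWarning in the output above. By default pandas reads a large file in chunks and infers each column's type chunk by chunk, so a column that looks numeric in one chunk and textual in another ends up with mixed types. Here is a minimal sketch of one way around that: read everything as text, then convert only the columns you need. The tiny in-memory table and the `Apogee_num` column name are made up for illustration and are not the real catalogue.

```python
import io

import pandas as pd

# Hypothetical miniature stand-in for satcat.tsv (values invented for illustration)
tsv = "#JCAT\tSatcat\tApogee\nS00001\t1\t938\nA00002\t-\t1659\nS00005\t5\t-\n"

# dtype=str keeps every column as text, so pandas never has to guess a type
df = pd.read_table(io.StringIO(tsv), sep="\t", dtype=str)

# convert selected columns explicitly; errors="coerce" turns non-numeric
# entries such as "-" into NaN instead of raising
df["Apogee_num"] = pd.to_numeric(df["Apogee"], errors="coerce")

print(df)
```

Passing `low_memory=False` to `pd.read_table`, as the warning itself suggests, is the other common option; it lets pandas infer types from the whole file at once.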
Let’s plot the total number of satellites launched through time. First, we’ll need to reformat the LDate (launch date) column and deal with problematic entries.
import dateutil.parser as dateparser
import numpy as np

launch_dates = sats['LDate']
LDate_fmt = []
probs = 0
for i in launch_dates:
    try:
        LDate_fmt.append(dateparser.parse(i).strftime("%Y-%m-%d"))
    except Exception:
        # anything dateutil cannot parse is recorded as missing
        LDate_fmt.append(np.nan)
        probs += 1

# add new formatted column
sats['LDate_fmt'] = LDate_fmt

# remove rows with NaN for date
sats_noNa = sats[sats['LDate_fmt'].notna()].copy()

# sort data frame by date
sats_noNa.sort_values(by=['LDate_fmt'], inplace=True)

# add cumulative sum column (renamed from `max` to avoid shadowing the built-in)
stop = int(len(sats_noNa.index)) + 1
print(stop)
sats_noNa['cumsum'] = list(range(1, stop, 1))
print(str(probs) + " rows had problematic launch dates, replaced with NaN")
58966
81 rows had problematic launch dates, replaced with NaN
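For comparison, the same clean-up can be sketched without an explicit loop by using `pd.to_datetime` with a strict format and `errors="coerce"`. The sample strings below are made up to mimic the catalogue's "year month day" style. Note that a strict format rejects entries that dateutil's more forgiving parser might accept, so on the real data the number of flagged rows would not necessarily match the count printed above.

```python
import pandas as pd

# Made-up sample in "YYYY Mon D" style, plus two entries that cannot be parsed
ldate = pd.Series(["1957 Oct 4", "1958 Feb 1", "-", "1999 May 10", "1962?"])

# anything that does not match the format exactly becomes NaT
parsed = pd.to_datetime(ldate, format="%Y %b %d", errors="coerce")

n_bad = int(parsed.isna().sum())
clean = parsed.dropna().sort_values()

print(f"{n_bad} entries could not be parsed")
print(clean.dt.strftime("%Y-%m-%d").tolist())
```

Keeping the column as real datetimes (rather than formatted strings) also makes later steps such as extracting the year a one-liner with `.dt.year`.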
Now for a quick plot.
import plotly.express as px

fig = px.line(sats_noNa, x='LDate_fmt', y="cumsum",
              title="Cumulative number of global satellite launches",
              template="plotly_dark",
              line_shape='hv')  # line_shape will plot lines as steps
fig.update_xaxes(title_text="year")
fig.update_yaxes(title_text="satellites")
fig.update_traces(line_color='cyan', line_width=3)

# reduce margins for better viewing on mobile
fig.update_layout(margin=dict(l=20, r=20, b=20))
fig.show()
Let’s break it down into a yearly bar plot.
import collections

# add column with just launch year
dates = sats_noNa['LDate_fmt']
LDate_year = []
for i in dates:
    LDate_year.append(dateparser.parse(i).strftime("%Y"))
sats_noNa['LYear'] = LDate_year

# get table of launches by year
LYear_table = dict(collections.Counter(sats_noNa['LYear'].tolist()))
df_data = []
for key in LYear_table:
    df_data.append([key, LYear_table[key]])

# convert lists to dataframe
LYear_table_df = pd.DataFrame(df_data, columns=['year', 'launches'])

# now make a bar plot
fig = px.bar(LYear_table_df, x='year', y='launches',
             title="Global satellite launches per year",
             template="plotly_dark")
fig.update_traces(marker_color='cyan')
fig.show()
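As a side note, the per-year table can also be built with pandas alone. This is a minimal sketch on a handful of made-up dates standing in for `sats_noNa['LDate_fmt']`; the `launches_per_year` name is just for this example.

```python
import pandas as pd

# Made-up formatted launch dates (stand-in for sats_noNa['LDate_fmt'])
dates = pd.Series(
    ["1957-10-04", "1957-11-03", "1958-02-01", "1958-03-17", "1958-03-17"]
)

# take the first four characters as the year, then count rows per year
launches_per_year = (
    dates.str[:4]
    .value_counts()
    .sort_index()
    .rename_axis("year")
    .reset_index(name="launches")
)

print(launches_per_year)
```

The result has the same `year` and `launches` columns as the table built with `collections.Counter`, so it could be passed to `px.bar` in the same way.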
Our plot looks a little strange. There are massive peaks that seem out of place in 1999, 1982, and a few other years. Were there really more satellites launched into orbit in 1999 than in 2023?
We can see that the issue also shows up in our earlier line plot of cumulative launches. There are a few points where the line goes vertical, indicating that many satellites were launched on the exact same day. The most obvious of these spikes occurred on May 10, 1999.
Is this real? Or is this an artifact of the data structure we downloaded from Jonathan’s Space Report?
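One way to start checking is to count how many rows share each launch date and then look at what those rows actually are. The sketch below uses a small made-up frame with the same `LDate_fmt` and `Type` column names as the catalogue; the rows and type codes are placeholders, not real records, so it shows only the mechanics and not the answer.

```python
import pandas as pd

# Made-up rows standing in for the catalogue (dates and type codes are placeholders)
df = pd.DataFrame({
    "LDate_fmt": ["1999-05-10"] * 4 + ["1999-06-01", "2000-01-15"],
    "Type": ["P", "R2", "P A", "P A", "P", "P"],
})

# rows per launch date, largest first
per_day = df["LDate_fmt"].value_counts()
busiest = per_day.index[0]
print(per_day.head(3))

# which kinds of entries share the busiest date?
type_counts = df.loc[df["LDate_fmt"] == busiest, "Type"].value_counts()
print(type_counts)
```

Run against `sats_noNa`, the same two steps would list the dates behind the vertical jumps and show the mix of entry types recorded on each.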