Martial arts masters are responsible for transmitting the skills and ideology of the art to their students. This master-student relationship is codified when a student receives his/her black belt, thereby becoming a master. The student’s future as a master of the art is forever tied to the person by whom he/she is promoted. Information in a martial art is transmitted through a sequence of master-student relationships: a master promotes students to masters, who in turn eventually promote students of their own.
While we normally think about the transference of information from masters to students, it is also informative to think about the flow of information as we move backwards in time. From this perspective, a student learns from his/her master, who learned from his/her master, and so on. This sequence of master-student relationships tracing back to the genesis of the art defines a martial artist’s lineage.
Here, I will visualize the lineages of hundreds of elite BJJ practitioners drawn from the website BJJ Heroes.
I feel that lineages are intrinsically interesting; they allow BJJ practitioners, such as myself, to better understand our place in the context of an increasingly popular sport. Moreover, to the extent that style is transmitted through master-student relationships, practitioners who are nearby in a lineage-network are likely to be more stylistically similar than those from distant lineages.
Obtaining raw lineage data for ~800 BJJ heroes
To investigate BJJ fighter lineages, I used data from BJJ Heroes, a website containing biographies, lineages, fight records and more for around 800 elite BJJ practitioners. While these practitioners are among the more elite members of the sport (as defined by the BJJ Heroes crew), this list seems like a reasonable cross-section of the sport that is not greatly biased towards a subset of lineages or countries.
To extract lineages from this website, I first needed a list of all of the BJJ heroes that we are interested in. These fighters are conveniently located on a single page, so I could associate fighters’ names with links to their pages.
| First Name | Last Name | Nickname | Team | Full_name | Fighter_link |
|---|---|---|---|---|---|
| Alan | Moraes | | Carlson Gracie | Alan Moraes | http://www.bjjheroes.com/?p=2583 |
| Adilson | Lima | Bitta | Academia Pitbull | Adilson Lima | http://www.bjjheroes.com/?p=6768 |
| Ricardo | Rezende | | Fight Sports | Ricardo Rezende | http://www.bjjheroes.com/?p=4859 |
| Marcos | de Souza | | Bonsai | Marcos de Souza | http://www.bjjheroes.com/?p=1848 |
| Maximiliano | Trombini | | Cia Paulista | Maximiliano Trombini | http://www.bjjheroes.com/?p=872 |
After identifying all of the pages of BJJ heroes, I programmatically followed each hero’s link and saved the lineage field from each hero.
| First.Name | Last.Name | Nickname | Team | Full_name | Fighter_link | Lineage |
|---|---|---|---|---|---|---|
| Ana Carolina | Vieira | | GF Team | Ana Carolina Vieira | http://www.bjjheroes.com/?p=6979 | Lineage: Mitsuyo Maeda > Luis França > Oswaldo Fadda > Monir Salomão > Julio Cesar > Ana Carolina Vieira |
| Kayron | Gracie | | Gracie Barra | Kayron Gracie | http://www.bjjheroes.com/?p=901 | Lineage: Mitsuyo Maeda > Carlos Gracie Sr. > Helio Gracie > Carlos Gracie Junior > Kayron Gracie |
| Marcus | Bello | | GF Team | Marcus Bello | http://www.bjjheroes.com/?p=1619 | Lineage: Mitsuyo Maeda > Luis França > Oswaldo Fadda> Monir Salomão > Julio Cesar Pereira > Marcus Bello |
| Guilherme | Augusto | | Alliance | Guilherme Augusto | http://www.bjjheroes.com/?p=5417 | Lineage: Mitsuyo Maeda > Carlos Gracie > George Gracie > Octávio de Almeida > Moises Murad > Everdan Olegário > Guilherme Augusto |
| Gary | Tonon | | Renzo Gracie Academy | Gary Tonon | http://www.bjjheroes.com/?p=5649 | Lineage: Mitsuyo Maeda > Carlos Gracie Sr. > Helio Gracie > Carlos Gracie Junior > Renzo Gracie > Ricardo Almeida (> Tom deBlass) > Garry Tonon |
Each hero’s lineage is stored as a single string of the form “Lineage: master’s master > master > student” (with some variations). It is useful to think about these lineages as networks where direct connections between masters and their students are parent-child relationships. Links that span multiple parent-child links are termed ancestor-descendent relationships.
Combining the lineages of all individual fighters into an overall family tree can be accomplished once I have identified all of the master-student relationships in this database.
Summarizing lineages based on master-student relationships
To identify the master-student relationships that will form individual branches of the BJJ family tree, I first need to unpack each hero’s lineage by removing extraneous text and storing each ancestor as a distinct field.
| Fighter_link | Lineage | Level |
|---|---|---|
| http://www.bjjheroes.com/?p=1000 | Mitsuyo Maeda | 1 |
| http://www.bjjheroes.com/?p=1000 | Carlos Gracie | 2 |
| http://www.bjjheroes.com/?p=1000 | Carlson Gracie | 3 |
| http://www.bjjheroes.com/?p=1000 | Ze Mario Sperry | 4 |
| http://www.bjjheroes.com/?p=1000 | Caroline de Lazzer | 5 |
| http://www.bjjheroes.com/?p=1003 | Mitsuyo Maeda | 1 |
| http://www.bjjheroes.com/?p=1003 | Carlos Gracie | 2 |
| http://www.bjjheroes.com/?p=1003 | Helio Gracie | 3 |
| http://www.bjjheroes.com/?p=1003 | Francisco Mansor | 4 |
| http://www.bjjheroes.com/?p=1003 | Augusto Mendes | 5 |
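The unpacking step can be sketched in a few lines of Python. This is only an illustration, not the code used for this post: the function name `unpack_lineage` is hypothetical, and treating a parenthesised entry such as "(> Tom deBlass)" as an ordinary ancestor is an assumption about how such variations should be handled.

```python
import re

def unpack_lineage(fighter_link, lineage):
    """Split a raw lineage string into (fighter_link, ancestor, level) rows."""
    # Drop the leading "Lineage:" label, whatever its capitalisation.
    text = re.sub(r"^\s*lineage\s*:\s*", "", lineage, flags=re.IGNORECASE)
    # Assumption: parentheses are dropped but their contents are kept.
    text = text.replace("(", " ").replace(")", " ")
    # Split on ">" (with or without surrounding spaces) and tidy whitespace.
    names = [" ".join(part.split()) for part in text.split(">")]
    names = [name for name in names if name]
    # Level 1 is the root of the lineage; the hero is the last row.
    return [(fighter_link, name, level) for level, name in enumerate(names, start=1)]

rows = unpack_lineage(
    "http://www.bjjheroes.com/?p=1619",
    "Lineage: Mitsuyo Maeda > Luis França > Oswaldo Fadda> Monir Salomão "
    "> Julio Cesar Pereira > Marcus Bello",
)
for row in rows:
    print(row)
```

Splitting on the bare ">" character rather than on " > " is what lets the sketch cope with the missing space in "Oswaldo Fadda> Monir Salomão".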
Having separated each lineage into a separate entry for each ancestor, these lineages can now be used to form master-student relationships. Master-student relationships may be shared by multiple descendents. For example, more than 400 lineages contain the master-student relationship between Mitsuyo Maeda and his student Carlos Gracie. Each of these shared master-student relationships only needs to be stored once (although I also keep track of how many descendents each fighter has).
| Fighter_link | Master | Student | Level | n |
|---|---|---|---|---|
| http://www.bjjheroes.com/?p=781 | Ricardo De La Riva | Rodrigo Nogueira | 4 | 1 |
| http://www.bjjheroes.com/?p=1577 | Leonardo Vieira | Rafael Heck | 6 | 1 |
| http://www.bjjheroes.com/?p=1604 | Leonardo Vieira | Nivaldo Oliveira | 6 | 1 |
| http://www.bjjheroes.com/?p=6046 | John Lewis | Gazzy Parman | 5 | 1 |
| http://www.bjjheroes.com/?p=6064 | John Lewis | Steve da Silva | 5 | 1 |
| http://www.bjjheroes.com/?p=4430 | Cesar Guimaraes | Fabiano Gaudio | 6 | 2 |
| http://www.bjjheroes.com/?p=2560 | Leoni Nascimento | Luiz Dias | 4 | 1 |
| http://www.bjjheroes.com/?p=2206 | Julio Lima | Sandro Lima | 5 | 1 |
| http://www.bjjheroes.com/?p=1174 | Kazuo Yoshida | Evaldo Luiz “Serrinha | 1 | 1 |
| http://www.bjjheroes.com/?p=7077 | Wilson Mattos | Manoel Costa | 4 | 1 |
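A rough Python sketch of this aggregation is shown here. It pairs each ancestor with the next name in the lineage and counts how many lineages contain each pair. The function name is hypothetical, and taking Level to be the master's level and n to be the number of lineages containing the pair are assumptions about the table above.

```python
from collections import Counter

def master_student_pairs(lineages):
    """Count each (master, student, master_level) triple across all lineages."""
    counts = Counter()
    for names in lineages:
        # zip(names, names[1:]) yields every adjacent master -> student pair.
        for level, (master, student) in enumerate(zip(names, names[1:]), start=1):
            counts[(master, student, level)] += 1
    return counts

# The two unpacked lineages shown earlier in this section.
lineages = [
    ["Mitsuyo Maeda", "Carlos Gracie", "Carlson Gracie", "Ze Mario Sperry", "Caroline de Lazzer"],
    ["Mitsuyo Maeda", "Carlos Gracie", "Helio Gracie", "Francisco Mansor", "Augusto Mendes"],
]
pairs = master_student_pairs(lineages)
for (master, student, level), n in pairs.items():
    print(master, "->", student, "| level", level, "| n", n)
```

Because the shared Maeda-to-Carlos-Gracie link is stored once with a count of two, the eight links in these two lineages collapse to seven distinct relationships.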
Having aggregated all master-student relationships, I could move to visualizing the overlapping lineages of heroes; however, there are currently inconsistencies in the data that would muddy the results. This problem can be seen by looking at Mitsuyo Maeda, one of the founders of the sport who is an ancestor of almost all fighters in this dataset.
| Master | Level | n |
|---|---|---|
| Mitsuyo Maeda | 1 | 720 |
| M. Maeda | 1 | 13 |
| Takeo Iano | 1 | 6 |
| Kazuo Yoshida | 1 | 1 |
| Takeo Yano | 1 | 1 |
Most lineages refer to Mitsuyo Maeda as “Mitsuyo Maeda” but some list him as “M. Maeda.” We see a similar problem with another of the sport’s founders: “Takeo Yano,” also referred to as “Takeo Iano.” Without consolidating such records, some fighters with the same lineage would be artificially separated. In the case of Mitsuyo Maeda, we would need to allow for the insertion or removal of characters to achieve a match. For Takeo Yano, we would need to allow for the substitution of characters.
Dealing with inconsistent hero names is a tricky challenge; we want to match similar strings without inappropriately lumping heroes together. This is especially challenging given the extremely similar names within the Gracie family! With simple string matching approaches, it would be very difficult not to combine names like Rolls, Royce, Royler and Renzo Gracie, while also still appropriately combining alternative spellings such as “Luis Franca” with “Luiz França”.
To combine a set of fighter names, I relied upon a few rules:
1. Names can only be combined if they have the same master. (These master names may have already been merged.)
2. Names that exactly match the BJJ heroes’ entries that I found above are first matched. (This deals with the Gracies’ similar names.)
3. Unmatched names are sorted according to how many ancestors they have.
4. Each unmatched name is tested to see if it matches one of the earlier student names (rule 3) using either of two approaches:
- Maximum subsequence: to identify insertions or deletions, I find the longest sequence of characters that the unmatched name shares, in order, with each previous student (the longest common subsequence) and normalize its length by the length of the shorter string. This approach is good at catching abbreviations such as M. Maeda.
- Levenshtein distance: to identify alternative spellings, I determine how many characters need to be either inserted, deleted or changed to another character in order to turn one name into another. This approach is good at combining alternative spellings of a name such as with Luis Franca.
5. Renamed athletes are manually checked and renaming is overruled when appropriate, using a list of manual corrections.
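The two string scores in rule 4 can be sketched in Python as follows. The function names, the normalizations and the substitution weight of 0.25 are all assumptions rather than the code behind this post. They were chosen because, for "M. Maeda" against "Mitsuyo Maeda", they give 0.125 and roughly 0.404, in line with the score_indel and score_sub values reported in this post. Both scores assume non-empty names.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of strings a and b."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]

def indel_score(a, b):
    """Share of the shorter name NOT covered by the common subsequence (0 = match)."""
    return 1 - lcs_length(a, b) / min(len(a), len(b))

def weighted_edit_distance(a, b, indel_cost=1.0, sub_cost=0.25):
    """Levenshtein distance with separate costs for indels and substitutions."""
    prev = [j * indel_cost for j in range(len(b) + 1)]
    for i, ch_a in enumerate(a, start=1):
        curr = [i * indel_cost]
        for j, ch_b in enumerate(b, start=1):
            curr.append(min(
                prev[j] + indel_cost,                            # delete ch_a
                curr[j - 1] + indel_cost,                        # insert ch_b
                prev[j - 1] + (0 if ch_a == ch_b else sub_cost), # match or substitute
            ))
        prev = curr
    return prev[-1]

def sub_score(a, b):
    """Weighted edit distance normalized by the longer name (0 = match)."""
    return weighted_edit_distance(a, b) / max(len(a), len(b))

for old, new in [("M. Maeda", "Mitsuyo Maeda"), ("Takeo Yano", "Takeo Iano")]:
    print(old, "->", new, indel_score(old, new), sub_score(old, new))
```

Under these assumptions an abbreviation scores low on `indel_score` but high on `sub_score`, while a respelling shows the opposite pattern on `sub_score`, which is why a name is accepted if either score falls under its threshold.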
In addition to alternative names, some individuals with similar lineages list different master-student relationships in their lineages. To make sense of these inconsistencies, I assumed that each student has a single master (choosing the master with the most descendents). This simplification neglects that some students should legitimately have multiple masters, but by enforcing a one-to-many master-to-students relationship, I impose a strict hierarchy on the lineages that will improve clarity later on.
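As a minimal Python sketch of this single-master rule, each student keeps the candidate master with the most descendents. The function name is hypothetical, the descendent counts in the example are made-up placeholders rather than values from the dataset, and keeping the first candidate on a tie is an assumption.

```python
def pick_single_master(pairs, n_descendents):
    """Return {student: master}, keeping the candidate master with most descendents."""
    chosen = {}
    for master, student in pairs:
        current = chosen.get(student)
        # Replace the current choice only if this master has strictly more descendents.
        if current is None or n_descendents.get(master, 0) > n_descendents.get(current, 0):
            chosen[student] = master
    return chosen

# Placeholder counts for illustration only.
n_descendents = {"Carlos Gracie": 400, "Helio Gracie": 150}
pairs = [
    ("Helio Gracie", "Rickson Gracie"),
    ("Carlos Gracie", "Rickson Gracie"),
    ("Carlos Gracie", "Helio Gracie"),
]
print(pick_single_master(pairs, n_descendents))
```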
Before broadly combining names, I first wrote functions for aggregating names. I applied these functions to the root masters to resolve the inconsistent naming of Mitsuyo Maeda and Takeo Yano.
| old | counts | score_indel | score_sub | new | input_order |
|---|---|---|---|---|---|
| M. Maeda | 13 | 0.125 | 0.4038462 | Mitsuyo Maeda | 2 |
| Takeo Yano | 1 | 0.100 | 0.0250000 | Takeo Iano | 5 |
Using this approach, the variants of Mitsuyo Maeda and Takeo Yano were combined. Carrying out such string matching across all fighters simultaneously would inappropriately group together a large number of names. Instead, I separately combined the students of each master using the criteria outlined above, sequentially moving to lower levels of the lineage hierarchy once higher level master names were aggregated. I also kept track of all of the combined names for manual verification and correction (to ensure that I did not do anything too heretical).
| Master | old | counts | score_indel | score_sub | new | input_order |
|---|---|---|---|---|---|---|
| NA | M. Maeda | 13 | 0.1250000 | 0.4038462 | Mitsuyo Maeda | 2 |
| NA | Takeo Yano | 1 | 0.1000000 | 0.0250000 | Takeo Iano | 5 |
| Mitsuyo Maeda | Carlos Gracie Sr | 133 | 0.0000000 | 0.1875000 | Carlos Gracie | 2 |
| Mitsuyo Maeda | Luiz França | 17 | 0.0909091 | 0.0227273 | Luis França | 4 |
| Mitsuyo Maeda | C. Gracie | 13 | 0.1111111 | 0.3269231 | Carlos Gracie | 5 |
| Mitsuyo Maeda | Carlos Gracie sr | 13 | 0.0000000 | 0.1875000 | Carlos Gracie | 6 |
| Mitsuyo Maeda | Luis Franca | 5 | 0.0909091 | 0.0227273 | Luis França | 7 |
| Mitsuyo Maeda | C. Gracie Sr | 1 | 0.0000000 | 0.2500000 | Carlos Gracie | 8 |
| Mitsuyo Maeda | Carlos Gracie/Helio Gracie | 1 | 0.0000000 | 0.5000000 | Carlos Gracie | 9 |
| Carlos Gracie | Carlinhos Gracie | 1 | NA | NA | Carlos Gracie Junior | 17 |
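The top-down merging can be sketched in Python as follows. This is a simplified stand-in, not the procedure used here: it compares names with `difflib.SequenceMatcher` from the standard library instead of the two scores described above, the 0.85 threshold is arbitrary, and it omits both the exact-match step against known hero names and the manual corrections. A plain similarity ratio like this would not catch abbreviations such as "M. Maeda".

```python
from collections import Counter, defaultdict
from difflib import SequenceMatcher

def similar(a, b, threshold=0.85):
    """Stand-in name matcher based on difflib's similarity ratio."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold

def merge_level_by_level(lineages):
    """Canonicalize names one level at a time, only within the same master."""
    canon = [list(names) for names in lineages]  # work on copies
    depth = max((len(names) for names in canon), default=0)
    for level in range(depth):
        # Group the names at this level by their (already merged) master.
        groups = defaultdict(Counter)
        for names in canon:
            if len(names) > level:
                master = names[level - 1] if level else None
                groups[master][names[level]] += 1
        # Within each group, fold rarer spellings into more common ones.
        rename = {}
        for master, counts in groups.items():
            accepted = []
            for name, _ in counts.most_common():
                match = next((a for a in accepted if similar(a, name)), None)
                if match is None:
                    accepted.append(name)
                else:
                    rename[(master, name)] = match
        for names in canon:
            if len(names) > level:
                master = names[level - 1] if level else None
                names[level] = rename.get((master, names[level]), names[level])
    return canon

lineages = [
    ["Mitsuyo Maeda", "Carlos Gracie", "Helio Gracie"],
    ["Mitsuyo Maeda", "Carlos Gracie", "Helio Gracie"],
    ["Mitsuyo Maeda", "Carlos Gracie Sr", "Carlson Gracie"],
    ["Mitsuyo Maeda", "Luis França", "Oswaldo Fadda"],
    ["Mitsuyo Maeda", "Luis França", "Oswaldo Fadda"],
    ["Mitsuyo Maeda", "Luiz França", "Oswaldo Fadda"],
]
merged = merge_level_by_level(lineages)
```

Processing levels in order means that by the time students are grouped, their master's name has already been merged, so students of "Carlos Gracie Sr" and "Carlos Gracie" end up in the same group.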
Since forming the original master-student relationships, I combined the names of some fighters and assigned each student to a single master. Because of the original lineage ambiguities, some masters may currently reside at multiple levels of our lineages. For example, Rickson Gracie is listed as a student of Carlos Gracie in some lineages and of Helio Gracie in others. Because Helio is a student of Carlos, Rickson occurs on multiple lineage levels. To correct this problem, I re-leveled the lineages based on the consensus master-student relationships so that they form a proper hierarchy.
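One way to recompute levels, sketched in Python under the assumption that the consensus relationships are available as a student-to-master mapping: a fighter's level is one more than their master's, and anyone without a master is a root at level 1. The function name is hypothetical.

```python
def relevel(master_of, root_level=1):
    """Derive each fighter's level from a {student: master} mapping."""
    levels = {}

    def level_of(name, seen=()):
        if name in levels:
            return levels[name]
        if name in seen:
            raise ValueError(f"cycle in lineage at {name!r}")
        master = master_of.get(name)
        if master is None:
            level = root_level  # no recorded master: a root of the tree
        else:
            level = level_of(master, seen + (name,)) + 1
        levels[name] = level
        return level

    for student in master_of:
        level_of(student)
    return levels

master_of = {
    "Carlos Gracie": "Mitsuyo Maeda",
    "Carlson Gracie": "Carlos Gracie",
    "Ze Mario Sperry": "Carlson Gracie",
    "Caroline de Lazzer": "Ze Mario Sperry",
}
print(relevel(master_of))
```

Because every student has exactly one master, each name receives exactly one level, which is what makes the result a proper hierarchy.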
Once this is done, I have a set of 849 unambiguous master-student relationships from which to construct the BJJ family tree.
Static BJJ family tree
To first visualize the BJJ family tree, I was interested in generating a plot that showed all students in a hierarchical network, emphasizing masters who have a large number of descendents. Here, the founders of the sport are located in the middle of the plot, and students form sequential layers on the outside of their masters. When I originally generated this plot, I did not have a good tool for visualizing hierarchical data that falls in discrete levels (in retrospect, data.tree is probably the way to go), so I wrote a method for generating the layout from scratch.
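A radial layout of this kind can be sketched in Python as follows. This is a generic approach and not the method written for this post: each node gets an angular wedge proportional to the number of leaves beneath it, and its radius equals its depth. The function name and the `children` mapping are hypothetical, and the sketch assumes a single root (several founders could be attached to a dummy root first).

```python
import math

def radial_layout(children, root):
    """Return {name: (x, y)} with the root at the origin and one ring per level."""
    def n_leaves(node):
        kids = children.get(node, [])
        return 1 if not kids else sum(n_leaves(kid) for kid in kids)

    positions = {}

    def place(node, depth, start, end):
        # Put the node in the middle of its wedge, at a radius equal to its depth.
        angle = (start + end) / 2
        positions[node] = (depth * math.cos(angle), depth * math.sin(angle))
        kids = children.get(node, [])
        total = sum(n_leaves(kid) for kid in kids)
        cursor = start
        for kid in kids:
            # Masters with more descendents receive a wider slice of the circle.
            width = (end - start) * n_leaves(kid) / total
            place(kid, depth + 1, cursor, cursor + width)
            cursor += width

    place(root, 0, 0.0, 2 * math.pi)
    return positions

children = {
    "Mitsuyo Maeda": ["Carlos Gracie", "Luis França"],
    "Carlos Gracie": ["Carlson Gracie", "Helio Gracie"],
}
positions = radial_layout(children, "Mitsuyo Maeda")
```

The resulting coordinates can be passed to any plotting library; sizing wedges by leaf count is one simple way to emphasize masters with many descendents.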
Interactive BJJ family tree
While the above static plot is useful for understanding the broad divisions in BJJ, I was only able to show about 30 heroes while maintaining legibility. To visualize the lineages of over 800 fighters, I needed an approach that allowed people to focus on subsets of the network. To achieve this goal, I used networkD3, which provides methods for generating interactive hierarchical radial plots using D3. I then added CSS to alter the plot’s appearance. This interactive plot (each master’s students can be collapsed by clicking on the master) can be found here.
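Hierarchical D3 plots are conventionally fed a nested structure in which every node has a name and an optional list of children. networkD3 itself is an R package, so the Python sketch here only illustrates how a flat student-to-master mapping could be reshaped into that nesting. The function name is hypothetical, and sorting siblings alphabetically is an arbitrary choice.

```python
import json
from collections import defaultdict

def to_nested(master_of, root):
    """Convert a {student: master} mapping into nested name/children dicts."""
    children = defaultdict(list)
    for student, master in master_of.items():
        children[master].append(student)

    def build(name):
        node = {"name": name}
        kids = sorted(children.get(name, []))
        if kids:
            node["children"] = [build(kid) for kid in kids]
        return node

    return build(root)

master_of = {
    "Carlos Gracie": "Mitsuyo Maeda",
    "Luis França": "Mitsuyo Maeda",
    "Helio Gracie": "Carlos Gracie",
    "Carlson Gracie": "Carlos Gracie",
}
tree = to_nested(master_of, "Mitsuyo Maeda")
print(json.dumps(tree, ensure_ascii=False, indent=2))
```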