NLP: Analysis of Nietzsche's Ecce Homo in Python using BookNLP

Scholars and readers alike often refer to Nietzsche's Ecce Homo as his autobiography. Nietzsche's early translator Anthony M. Ludovici went so far as to replace Nietzsche's subtitle "How one becomes what one is [Wie man wird, was man ist]" with "autobiography."

Although a wonderful book, Ecce Homo is not a particularly good autobiography. Sarah Kofman calls Ecce Homo “the most 'depersonalized' autobiography there is” (“l'autobiographie la plus 'dépersonnalisée' qui soit”).[1] Nietzsche's American translator, Walter Kaufmann, apparently thinks this is a good thing, as he lauds the book as "Nietzsche's own interpretation of his development, his works, and his significance; and we should gladly trade the whole vast literature on Nietzsche for this one small book."[2] Perhaps, but that does not make it a good autobiography. As Nietzsche's British translator, R.J. Hollingdale, puts it: “as autobiography, it is a plain failure.”[3] Today several scholars agree with Hollingdale but, rather than label Ecce Homo a failure, call it Nietzsche's parody of the genre. While that avoids the issue of its being a poor example of an autobiography, it's not always clear why Nietzsche would set his sights on this particular genre or what such a satire accomplishes.

I would like to take a step back and ask a more fundamental question. While there is no doubt that Ecce Homo is autobiographical, why do so many assume that it is an autobiography? Traditionally, one would argue this point by establishing the criteria for a text's being an autobiography and then showing that Ecce Homo does or does not meet these criteria. The problem with this approach is that the number of variations of autobiography is exceedingly large. For example, Sidonie Smith and Julia Watson list "Fifty-two Genres of Life Narrative" in an appendix to their Reading Autobiography: A Guide for Interpreting Life Narratives.[4]

I want to show how we might apply some basic methods of natural language processing (NLP) to address this issue. To illustrate how this might work, let's look at a very simple frequency analysis (i.e., how often particular words are used). At least in a straightforward sense, one would expect an autobiography to have a high percentage of first-person singular pronouns.

Having a computer do this task for us is surprisingly complex. To begin with, what delimits a word? Not all words are surrounded by spaces, for example. While you can ask a child to count the number of times that Nietzsche writes the word "ich," simply searching for "ich" would include words that contain the letters "ich" but that we would not want counted, most notably including words with the adjectival and adverbial suffix -lich. Conversely, a faithful text would recreate Nietzsche's means of emphasis, and thus an emphatic "ich" would be "i c h" and would be ignored. Fortunately, sophisticated tools have been developed to solve these and other such problems.
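To make the "ich" problem concrete, here is a minimal sketch in Python using only the standard library. It is not how BookNLP tokenizes text; it only illustrates why a naive substring search overcounts and how a word-boundary pattern can also catch the letter-spaced emphatic form. The sample sentence and the function name count_ich are my own inventions for illustration.

```python
import re

# Matches "ich" as a standalone word, or letter-spaced as "i c h" (emphasis).
# The \b word boundaries keep the "ich" inside words such as "eigentlich"
# from matching.
ICH_PATTERN = re.compile(r"\b(?:ich|i c h)\b", re.IGNORECASE)


def count_ich(text: str) -> int:
    """Count standalone occurrences of "ich", including the spaced form."""
    return len(ICH_PATTERN.findall(text))


# Invented sample sentence (not a quotation from Nietzsche).
SAMPLE = "Ich weiss, dass ich eigentlich glücklich bin; i c h sage es."

# Naive substring search: also counts "eigentlich" and "glücklich",
# and misses the spaced "i c h".
naive_count = SAMPLE.lower().count("ich")

# Word-boundary search: counts "Ich", "ich", and "i c h" only.
boundary_count = count_ich(SAMPLE)

print(naive_count, boundary_count)
```

The two counts differ in both directions: the naive search picks up the two -lich words and overlooks the spaced form, while the pattern does the reverse. A real edition may mark emphasis in other ways (italics, markup), so the spaced alternative is an assumption about how the text was transcribed.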

One of the problems for humanists has been scaling NLP to book-length documents. Much of the work in NLP uses documents of such short length, like tweets and customer-submitted reviews, that most humanities scholars would not consider them to be documents in the proper sense. While there are ways to work around such limitations, I was happy to discover David Bamman's BookNLP, which greatly simplifies the programming that needs to be done to analyze book-length documents. I am grateful to William Mattingly for his Introduction to BookNLP.

One type of analysis that I plan to perform on Ecce Homo is entity recognition. Specifically, I want to look at which people and places Nietzsche names in Ecce Homo and whether they differ from those he names in his other books. What is particularly amazing about entity recognition is that the software does not rely on a large list of people and places. Rather, the computer "figures out" which words refer to people and to places. This should allow me to further refine the analysis by looking at entities that Nietzsche knew firsthand. That is, one would expect that an autobiography would mention more people that the author knew and more places that the author had visited than a non-autobiography. At the same time, this is an oversimplification. If I mention Paris now, this short reflection does not thereby become an autobiography, even though I have been to Paris. Similarly, I can imagine that talking about a place that one has never visited could be more revealing. Nonetheless, I am hopeful that such analyses will at least be an interesting beginning and another way to approach the question of genre in the case of Ecce Homo.
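As a rough sketch of what working with recognized entities might look like, the snippet below tallies named people and places from a tab-separated entity listing. I am assuming the column layout that BookNLP uses for its ".entities" output (COREF, start_token, end_token, prop, cat, text); the rows themselves are invented toy data, not actual output for Ecce Homo, and tally_mentions is a hypothetical helper.

```python
import csv
import io
from collections import Counter

# Invented toy rows in the tab-separated layout assumed for a BookNLP
# ".entities" file. "prop" is the mention type (PROP = proper noun,
# PRON = pronoun); "cat" is the entity category (PER, GPE, ...).
SAMPLE_ENTITIES = (
    "COREF\tstart_token\tend_token\tprop\tcat\ttext\n"
    "0\t3\t3\tPRON\tPER\tI\n"
    "1\t10\t10\tPROP\tPER\tWagner\n"
    "1\t42\t42\tPROP\tPER\tWagner\n"
    "2\t57\t57\tPROP\tGPE\tZarathustra\n"
    "2\t80\t80\tPROP\tPER\tZarathustra\n"
    "3\t95\t95\tPROP\tGPE\tTurin\n"
    "0\t99\t99\tPRON\tPER\tmy\n"
)


def tally_mentions(tsv_text, prop_types=("PROP",)):
    """Count (category, text) pairs for the requested mention types."""
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    counts = Counter()
    for row in reader:
        if row["prop"] in prop_types:
            counts[(row["cat"], row["text"])] += 1
    return counts


named = tally_mentions(SAMPLE_ENTITIES)
for (category, text), n in named.most_common():
    print(category, text, n)
```

Keeping the category in the key means that a name the model labels inconsistently shows up as separate rows, which makes such cases easy to spot before deciding how to count them.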

While I am still working on this project, let me share one simple result. Here is a frequency analysis comparison between Ecce Homo and The Gay Science for pronouns and proper nouns using BookNLP:

Comparing Ecce Homo with The Gay Science
Text           EH      FW
I              20.6    5.9
my             11.6    2.6
me             6.9     1.8
one            3.7     3.3
he             3.3     5.4
they           2.6     4.9
his            2.4     5.4
you            2.0     4.7
their          1.7     4.3
myself         1.7     0.4
Wagner         1.4     0.1
him            1.3     1.7
Zarathustra    1.1     0.0[5]
them           1.0     2.2
we             0.9     7.3
oneself        0.8     0.4
man            0.8     0.8
us             0.8     3.1
our            0.8     3.8
himself        0.6     1.4

This table lists the twenty most frequent pronouns and proper nouns in Ecce Homo (EH) and compares them with the frequency of those terms in The Gay Science (FW). Nietzsche's significantly greater usage of first-person pronouns in EH than in FW would seem to confirm that EH is an autobiography.
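For readers who want to try a comparison of this kind, here is a minimal sketch of one way to compute it. I am assuming that each figure is a term's share of all counted mentions in a text, expressed as a percentage and rounded to one decimal; the mention lists are invented toy data, and percentages and compare are hypothetical helper names.

```python
from collections import Counter


def percentages(mentions):
    """Share of each term among all mentions, as a percentage (1 decimal)."""
    counts = Counter(mentions)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {term: round(100 * n / total, 1) for term, n in counts.items()}


def compare(mentions_a, mentions_b, top_n=20):
    """Rows of (term, % in A, % in B) for the top_n terms of text A."""
    pct_a = percentages(mentions_a)
    pct_b = percentages(mentions_b)
    top_terms = sorted(pct_a, key=pct_a.get, reverse=True)[:top_n]
    return [(term, pct_a[term], pct_b.get(term, 0.0)) for term in top_terms]


# Invented toy mention lists standing in for two books.
text_a = ["I"] * 5 + ["my"] * 3 + ["he"] * 2
text_b = ["I"] * 1 + ["he"] * 3 + ["we"] * 6

rows = compare(text_a, text_b)
for term, a, b in rows:
    print(f"{term:<6}{a:>6}{b:>6}")
```

Because the rows are selected from the first text's most frequent terms, a term that is common only in the second text (here "we") does not appear at all, which is worth keeping in mind when reading a table built this way.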

Another apparent confirmation comes from running the same comparison but this time using Hume's autobiography, My Own Life:

Comparing Ecce Homo with Hume's autobiography
Text            EH      Hume
I               20.6    29.8
my              11.6    24.0
me              6.9     5.0
he              3.3     0.2
they            2.6     1.2
his             2.4     2.2
their           1.7     1.5
myself          1.7     1.8
him             1.3     1.5
them            1.0     0.5
we              0.9     0.2
our             0.8     0.2
himself         0.6     0.5
men             0.4     0.2
a man           0.3     0.2
her             0.3     0.8
my friends      0.2     0.5
my father       0.2     0.5
mine            0.1     0.2
the person      0.1     0.2
my mother       0.1     0.8
my ancestors    0.0     0.2
herself         0.0     0.2

I chose Hume's autobiography because in several ways it is similar to Ecce Homo. Both are relatively short, and, more importantly, as you can see from the lower half of the table, neither philosopher says much about his father, mother, or friends. Despite such similarities, it is of course true that a preponderance of first-person pronouns does not an autobiography make. To take one example, Descartes's Discourse is not an autobiography. It may be slightly autobiographical, but even then it is less so than Ecce Homo is. Yet Descartes uses the first person slightly more in the Discourse than Nietzsche does in Ecce Homo, as the table below makes clear.

Comparing Ecce Homo with Descartes's Discourse
Text          EH      Discourse
I             20.6    31.7
my            11.6    9.1
me            6.9     5.3
one           3.7     0.2
he            3.3     1.6
they          2.6     6.4
his           2.4     1.2
you           2.0     0.1
their         1.7     4.7
myself        1.7     3.3
him           1.3     0.8
them          1.0     3.2
we            0.9     5.1
man           0.8     0.4
us            0.8     2.3
our           0.8     3.9
himself       0.6     0.5
themselves    0.5     0.9
men           0.4     0.5
your          0.4     0.1

These are only some of my initial findings. As I explore BookNLP, I'll be updating this page.

 

References

[1] S. Kofman. Explosion I: De l' "Ecce Homo" de Nietzsche, p. 29. Paris: Galilée, 1992.

[2] W. Kaufmann. "Editor's Introduction," in his translation of On the Genealogy of Morals and Ecce Homo, p. 201. Vintage, 1966.

[3] R. J. Hollingdale, "Introduction," to Nietzsche: Ecce Homo, p. 7. Harmondsworth: Penguin, 1979.

[4] S. Smith and J. Watson. Reading Autobiography: A Guide for Interpreting Life Narratives. Minneapolis: University of Minnesota Press, 2002.

[5] Zarathustra is mentioned in The Gay Science at §342, §381, and the Song of Prince Vogelfrei entitled "Sils-Maria." The model does recognize Zarathustra as an entity (specifically, a GPE or geo-political entity). Similar miscategorizations happen in Ecce Homo as well. For example, when Nietzsche mentions Zarathustra II and Zarathustra III (EH Z 4), the model identifies both as "facilities." Twenty-one other references to Zarathustra are labeled as GPEs, and two are labeled as locations. Nonetheless, the majority are correctly identified as a character.