Survival Analysis


Finally, one of the most interesting novelties in my PhD thesis is the application of Survival Analysis to the demographic study of the different communities of authors in Wikipedia. This is a well-known statistical technique that has already been applied successfully in other scientific areas such as Medicine (especially in epidemiological studies) and Demography, and even in industrial settings (where it is better known as Failure Time Analysis).

This tool has been particularly useful in this case, since the huge amount of quantitative data gathered for each community lets us obtain remarkably precise results (allowing us, for instance, to calculate the median lifetime of authors in some language versions with a 95% C.I. of approx. 2-3 days around the estimated value). It is worth noting, however, that the technique must be applied carefully here. Unlike other situations, in which the "deaths" in the study can be counted unambiguously, members of online communities may sometimes disappear from the project for several months (or even several years), only to show up again at some point in the future. Therefore, we have to account for this limitation in the analysis, bearing in mind that some of these "apparently dead" members (in terms of their contribution to the community) might in fact be taking a (sometimes long) break from participating.
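As a rough illustration of how this limitation can be handled, the following Python sketch marks an author as "dead" only after a long period of silence and treats everyone else as right-censored, then computes a Kaplan-Meier estimate of the survival curve and its median. It is a minimal sketch under stated assumptions, not the procedure used in the thesis: the function names, the input format (one pair of first and last edit dates per author) and the 365-day inactivity threshold are all hypothetical choices made for the example, and the confidence interval is not computed.

```python
from datetime import date


def author_durations(edit_history, study_end, inactivity_days=365):
    """Turn (first_edit, last_edit) date pairs into (duration_days, event) pairs.

    event is True when the author has been silent for more than
    inactivity_days at study_end (counted as a "death"). Otherwise the
    observation is right-censored: the author may simply be on a break.
    The 365-day default is an assumption made for this example.
    """
    records = []
    for first_edit, last_edit in edit_history:
        duration = (last_edit - first_edit).days
        silent_for = (study_end - last_edit).days
        records.append((duration, silent_for > inactivity_days))
    return records


def kaplan_meier(records):
    """Return [(time, survival_probability)] at each observed event time."""
    curve = []
    survival = 1.0
    at_risk = len(records)
    for t in sorted({d for d, _ in records}):
        deaths = sum(1 for d, e in records if d == t and e)
        leaving = sum(1 for d, _ in records if d == t)
        if deaths:
            # Product-limit step: survive past t with probability 1 - d/n.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored authors both leave the risk set after t.
        at_risk -= leaving
    return curve


def median_lifetime(curve):
    """First time at which the survival estimate drops to 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # the curve never reaches 0.5: median not observed
```

Calling `author_durations` on the edit history, passing the result to `kaplan_meier`, and then calling `median_lifetime` on the curve gives the estimated median lifetime in days. The choice of threshold matters: a short one misclassifies authors on a long break as dead, while a long one censors more of the recent observations.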

The best book I can recommend as an introduction to this topic is "Survival Analysis: A Self-Learning Text". The survival package provides all the necessary support for these techniques in GNU R.