I was working on a project where I needed to compare categorical data in order to determine if there is any association between them. For my particular problem, a chi2 test wouldn’t work, so I needed a Fisher’s exact test. Since there aren’t any code in python to perform the Fisher’s exact test for larger than 2x2 table, I decided to write my own. You can find it here : FisherExact. s Usually, you would use a chi2 test for independence (which is implemented in most programming languages) to compare association between categorical data. The problem was that I had low count in my contingency table.
Although it’s commonly accepted that the Cochran’s rule of at least 5 count in each cell of the expected table is too strict, I actually had really low total counts and cells with 0 count in my observed table. Thus, using an exact test is probably more suited. In fact, McDonald recommend, as a rule of thumb, to use an exact test when the total count is lower than 1000.
So I decided to perform a Fisher’s exact test, despite its controversies (see Frank Harrell’s post with detailled references on stats.stackexchange.com). Unfortunately, I didn’t find any implementation of the Fisher’s exact test for larger than 2x2 table in python. To my knowledge, only R offer a Fisher’s exact test for table larger than 2x2. Since the main part of my project was already written in python, I decided to rewrite the R function in python for consistency. How difficult would have that been ? right ?
It turns out that the R function actually use a heavily edited C version of the FORTRAN subroutine FEXACT (algorithm 643 by Mehta and Patel). My plan was to first find the fortran 90 source code of FEXACT and then use f2py to import the fortran module in my python code. I quickly learned that, although this sound simple, it was nearly impossible without any basic notion in fortran.
Since f2py couldn’t handle real precision when SELECTED_REAL_KIND
is used, I first created a .f2py_f2cmap file with {'real':{'dp':'double'}}
in it in order to map any instance of dp
to double. This wasn’t enough, so I changed the structure of the FEXACT module to this :
I also made the input and output more explicit, using INTENT(OUT)
and INTENT(IN)
.
I wrote the python code based on the R version, so it should essentially do the same thing. For 2x2 contingency table, the fisher_exact function from scipy.stats is used. Since R also provide p-values computation by Monte Carlo simulation, I tried to offer the same thing in my knockoff version. A quick glance at R sources, reveal that the fisher_exact simulation use a C version of the rcont2 subroutine (written in Fortran90). After some struggle, during which, I learn about array indexing order in C vs Fortran, in-place modification of argument with INTENT(INPLACE)
and many more, I was able to make it work.
I also added mid-P correction (see this document, this one and Lydersen, Fagerland and Laake, 2009 for reference), which is not present in R.
The source code is available here : https://github.com/maclandrol/FisherExact.
Comparing R’s fisher.test to my fisher_exact.
Data | Parameters | pval : fisher.test (R) | pval : FisherExact (python) | ||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| default (exact) | 0.4857 | 0.4857 | ||||||||||||||||
| default (exact) | 0.7827 | 0.78268 | ||||||||||||||||
| simulated p-values, replicates=1e5 | 0.7829 | 0.7823 | ||||||||||||||||
| hybrid | 0.7827 | 0.78268 |
As expected, we have the same p-values for 2x2 table (here the scipy fisher_exact is used) and for tables larger than 2x2 with the default parameters. For the simulated p-value, R doesn't offer a seed option, so it's difficult to compare the two p-values, but I think it's working because, even after several runs, the results were always consistent. In hybrid mode, an approximation based upon asymptotic chi-squared probabilities is used instead of Fisher exact test probabilities. You can set some arguments in FEXACT (expect=5.0, percnt=80.0 and emin=1.0
), to obtain the ‘Cochran’ condition. In R’s source code, percnt=180.0
is used instead :
I’m not sure if this is an error, but the comment preceding said “Cochran condition”, so I guess something is not right. I report it as a bug, although I’m not sure if this is a real bug or I’m just too stupid… Anyway, in my implementation I use percnt=80.0
, and unfortunately, on my test data, the hybrid and the exact mode return the same p-value.
In conclusion, you can now do Fisher’s exact test on any mxn contingency table in python. I hope this will be useful to someone.