I ditched statistical software packages like STATA and SPSS a while ago after falling in love (at first sight) with mighty Python. Now I use Python every day and I love it! While I would love to tell you why Python is better than STATA and SPSS, I will save that for another day. Today, I was compelled to write after I came across this article, which seems to undermine Python's scientific computing prowess. In summary, the premise of the article is that STATA is more accurate than Python, specifically when it comes to simulating distributions. In this post, I demonstrate with hard evidence that this conclusion is false.
To demonstrate that the results in Python aren't accurate, the author used the NumPy package to simulate a normal distribution with mean 0 (mu = 0) and standard deviation 1 (sigma = 1). The expectation is that the mean of each resulting dataset should be pretty close to 0 (around 0.00). The author ran this simple experiment 1000 times and found that, on the machine he worked on, the mean of a dataset could be as far off as -3 or even -9. On the other hand, he didn't find similar errors in STATA on the same machine.
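If you want to try this kind of experiment yourself, here is a minimal sketch of what it might look like. It is not the original author's code: the number of draws per run, the seed, and the function name `simulate_means` are assumptions made for illustration, since the only details given are mu = 0, sigma = 1 and 1000 repetitions.

```python
import numpy as np


def simulate_means(n_runs=1000, sample_size=10_000, seed=0):
    """Draw `n_runs` standard-normal samples and return the mean of each.

    `sample_size` and `seed` are illustrative assumptions, not values
    taken from the experiment being discussed.
    """
    rng = np.random.default_rng(seed)
    means = np.empty(n_runs)
    for i in range(n_runs):
        # mu = 0 (loc) and sigma = 1 (scale), as in the experiment
        sample = rng.normal(loc=0.0, scale=1.0, size=sample_size)
        means[i] = sample.mean()
    return means


means = simulate_means()
print("runs:", means.size)
print("smallest mean:", means.min())
print("largest mean:", means.max())
print("largest absolute deviation from 0:", np.abs(means).max())
```

For n draws per run, the standard error of the mean is sigma / sqrt(n), so with 10,000 draws each mean should typically land within a few hundredths of 0. A mean of -3 or -9 would be wildly out of line with that.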
I ran the same experiment on my machine, and the results are shown in the picture above. As you can see, the results are as accurate as we expected them to be: the mean is within 0.001 of 0. In conclusion, Python is as good as STATA when it comes to simulating distributions. I rest my case. Now I will go back to wrangling data in Python.