Tracking down the assumptions made by SciPy's `ttest_ind()` function

2023-09-11 04:36:25 By 颜值主播乔碧萝


I'm trying to write my own Python code to compute t-statistics and p-values for one- and two-tailed independent-samples t-tests. I can use the normal approximation, but for the moment I am trying to use just the t-distribution. I've been unsuccessful in matching the results of SciPy's stats library on my test data. I could use a fresh pair of eyes to see if I'm just making a dumb mistake somewhere.


Note: this is cross-posted from Cross Validated because it's been up for a while over there with no responses, so I thought it couldn't hurt to also get some software developers' opinions. I'm trying to understand whether there's an error in the algorithm I'm using, which should reproduce SciPy's result. It's a simple algorithm, so it's puzzling that I can't locate the mistake.

My code:

import numpy as np
import scipy.stats as st

def compute_t_stat(pop1, pop2):

    num1 = pop1.shape[0]
    num2 = pop2.shape[0]

    # Use the unbiased sample variances (ddof=1). Note np.var defaults to
    # ddof=0, the biased population variance, which skews the t-statistic
    # for small samples.
    var1 = np.var(pop1, ddof=1)
    var2 = np.var(pop2, ddof=1)

    # The formula for the t-stat when population variances differ.
    t_stat = (np.mean(pop1) - np.mean(pop2)) / np.sqrt(var1/num1 + var2/num2)

    # ADDED: The Welch-Satterthwaite degrees of freedom.
    df = (var1/num1 + var2/num2)**2 / ((var1/num1)**2/(num1 - 1) + (var2/num2)**2/(num2 - 1))

    # Am I computing this wrong?
    # It should just come from the CDF like this, right?
    # The extra parameter is the degrees of freedom.

    one_tailed_p_value = 1.0 - st.t.cdf(t_stat,df)
    two_tailed_p_value = 1.0 - ( st.t.cdf(np.abs(t_stat),df) - st.t.cdf(-np.abs(t_stat),df) )    


    # Computing with SciPy's built-ins
    # My results don't match theirs.
    t_ind, p_ind = st.ttest_ind(pop1, pop2)

    return t_stat, one_tailed_p_value, two_tailed_p_value, t_ind, p_ind
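One sanity check I can do on the p-value formulas: since the t-distribution is symmetric, the two-tailed expression `1 - (cdf(|t|) - cdf(-|t|))` should equal `2 * sf(|t|)`, twice the upper-tail survival function. A quick check with hypothetical values for the t-statistic and degrees of freedom:

```python
import numpy as np
import scipy.stats as st

t_stat, df = -1.83, 42.0  # hypothetical values, just for the check

# Two-tailed p-value exactly as written in the function above.
p_a = 1.0 - (st.t.cdf(np.abs(t_stat), df) - st.t.cdf(-np.abs(t_stat), df))

# Equivalent form via symmetry; sf() also avoids cancellation far in the tail.
p_b = 2.0 * st.t.sf(np.abs(t_stat), df)

print(np.isclose(p_a, p_b))  # True
```

So whatever is causing the mismatch with SciPy, it isn't this algebraic step.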

Update:


After reading a bit more about Welch's t-test, I saw that I should be using the Welch-Satterthwaite formula to calculate the degrees of freedom. I updated the code above to reflect this.


With the new degrees of freedom, I get a closer result. My two-sided p-value is off by about 0.008 from SciPy's... but that is still much too big an error, so I must still be doing something incorrect (or SciPy's distribution functions are very bad, but it's hard to believe they are only accurate to 2 decimal places).

Second update:


While continuing to try things, I thought maybe SciPy's version automatically computes the Normal approximation to the t-distribution when the degrees of freedom are high enough (roughly > 30). So I re-ran my code using the Normal distribution instead, and the computed results are actually further away from SciPy's than when I use the t-distribution.
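That direction of movement actually makes sense: the t-distribution has heavier tails than the normal, so for any finite degrees of freedom its two-sided p-values are strictly larger than the normal ones, and the gap closes only slowly. A small sketch (the statistic value 2.0 is arbitrary):

```python
import scipy.stats as st

t = 2.0  # an arbitrary test statistic
p_norm = 2 * st.norm.sf(t)  # normal-approximation two-sided p-value

for df in (10, 30, 100, 1000):
    p_t = 2 * st.t.sf(t, df)  # t-based two-sided p-value
    # p_t is always larger than p_norm, and the gap shrinks as df grows.
    print(df, p_t, p_t - p_norm)
```

So switching to the normal approximation can only push my p-value further down, away from SciPy's answer.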


Bonus question :) (More statistical theory related; feel free to ignore)


Also, the t-statistic is negative. I was just wondering what this means for the one-sided t-test. Does this typically mean that I should be looking in the negative axis direction for the test? In my test data, population 1 is a control group who did not receive a certain employment training program. Population 2 did receive it, and the measured data are wage differences before/after treatment.


So I have some reason to think that the mean for population 2 will be larger. But from a statistical theory point of view, it doesn't seem right to concoct a test this way. How could I have known to check (for the one-sided test) in the negative direction without relying on subjective knowledge about the data? Or is this just one of those frequentist things that, while not philosophically rigorous, needs to be done in practice?
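Mechanically, at least, the two directions are just the two tails of the same distribution, and the one to use is fixed by the alternative hypothesis stated before looking at the data. A sketch with hypothetical numbers (here t = (mean1 - mean2)/se, so "population 2 is larger" corresponds to the lower tail):

```python
import scipy.stats as st

t_stat, df = -2.1, 50.0  # hypothetical values

# H1: mean1 > mean2  ->  upper tail
p_greater = st.t.sf(t_stat, df)

# H1: mean1 < mean2  ->  lower tail (the relevant one if the treated
# group, population 2, is expected to have the larger mean)
p_less = st.t.cdf(t_stat, df)

# The two one-sided p-values sum to 1 (up to rounding).
print(p_less, p_greater)
```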

Answer


By using the SciPy built-in function source(), I could see a printout of the source code for the function ttest_ind(). Based on the source code, the SciPy built-in is performing the t-test assuming that the variances of the two samples are equal. It is not using the Welch-Satterthwaite degrees of freedom. SciPy assumes equal variances but does not state this assumption.


I just want to point out that, crucially, this is why you should not just trust library functions. In my case, I actually do need the t-test for populations of unequal variances, and the degrees of freedom adjustment might matter for some of the smaller data sets I will run this on.
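If you control the SciPy version, note that newer releases (0.11 and later) expose exactly this choice: `ttest_ind()` takes an `equal_var` flag, and `equal_var=False` performs the Welch test with Satterthwaite degrees of freedom. A quick comparison sketch (the data are made up for illustration):

```python
import numpy as np
import scipy.stats as st

rng = np.random.default_rng(0)
pop1 = rng.normal(0.0, 1.0, size=40)  # smaller spread
pop2 = rng.normal(0.5, 2.0, size=60)  # larger spread: unequal variances

t_pooled, p_pooled = st.ttest_ind(pop1, pop2)                   # pooled (equal-variance) test
t_welch,  p_welch  = st.ttest_ind(pop1, pop2, equal_var=False)  # Welch test

# With unequal variances and unequal n, the two answers differ.
print(p_pooled, p_welch)
```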


As I mentioned in some comments, the discrepancy between my code and SciPy's is about 0.008 for sample sizes between 30 and 400, and then slowly goes to zero for larger sample sizes. This is an effect of the extra (1/n1 + 1/n2) term in the denominator of the equal-variances t-statistic. Accuracy-wise, this is pretty important, especially for small sample sizes. It definitely confirms that I need to write my own function. (There may be other, better Python libraries, but this assumption at least should be known. Frankly, it's surprising that it isn't front and center in the SciPy documentation for ttest_ind().)
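For reference, the equal-variance denominator being described here is sqrt(sp^2 * (1/n1 + 1/n2)), where sp^2 pools the two sample variances. A minimal sketch:

```python
import numpy as np

def pooled_denominator(pop1, pop2):
    """Denominator of the equal-variance (pooled) t-statistic."""
    n1, n2 = len(pop1), len(pop2)
    # Pooled sample variance: a (n-1)-weighted average of the two sample variances.
    sp2 = ((n1 - 1) * np.var(pop1, ddof=1) + (n2 - 1) * np.var(pop2, ddof=1)) / (n1 + n2 - 2)
    return np.sqrt(sp2 * (1.0 / n1 + 1.0 / n2))
```

When the two sample variances happen to be equal, this coincides with the Welch denominator sqrt(s1^2/n1 + s2^2/n2) for any n1 and n2; it is the unequal-variance case where the two tests drift apart.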