Multiple comparisons

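When you run an ANOVA, you get an F-test. This test only tells you whether the model as a whole is significant (i.e., the population means are not all equal); it does not tell you which specific groups differ from each other. There are many methods for comparing group means against one another, and these methods are called multiple comparisons (note: this post covers only post hoc comparisons, not a priori comparisons).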

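So why are there so many methods instead of just one? Good question. The core issue is Type I error, and different methods make different assumptions and handle Type I error in different ways. Why is Type I error the core issue? With the overall F-test of the model you make only one comparison, so you run the risk of a Type I error only once. When you make multiple comparisons, you run that risk many times (three groups require three pairwise comparisons, four groups require six), and the chance of at least one Type I error across the whole family of tests grows with every added comparison. The error rate therefore has to be adjusted to keep that chance under control.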
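To make the growth in error rate concrete, here is a minimal sketch. It assumes the comparisons are independent, which pairwise comparisons on the same data are not, so the numbers are only a rough guide. The function names are made up for illustration.

```python
from math import comb


def n_pairwise(k: int) -> int:
    """Number of pairwise comparisons among k groups: k(k-1)/2."""
    return comb(k, 2)


def familywise_error_rate(alpha: float, m: int) -> float:
    """Chance of at least one Type I error in m independent tests at level alpha."""
    return 1.0 - (1.0 - alpha) ** m


def bonferroni_alpha(alpha: float, m: int) -> float:
    """Per-comparison level that keeps the familywise rate at or below alpha."""
    return alpha / m


for k in (3, 4, 5):
    m = n_pairwise(k)
    print(k, m, round(familywise_error_rate(0.05, m), 4), round(bonferroni_alpha(0.05, m), 4))
```

With four groups and six unadjusted tests at .05, the chance of at least one false positive under this assumption is already about 26%.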

If you want to do post hoc comparisons (some people call them a posteriori tests), there are many options: 1) Fisher’s Least Significant Difference (also known as LSD); 2) Tukey’s Test; 3) The Ryan Procedure (REGWQ); 4) The Scheffé Test; 5) Dunnett’s test for comparing all treatments with a control.

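There are so many methods mainly because they make different kinds of comparisons. Some test a hypothesis about the means as a whole (i.e., that all group means are equal), some make pairwise comparisons (e.g., each group against the control group, or against the best group), and some test contrasts (e.g., a comparison between two specific groups).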

LSD (least significant difference)

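LSD simplifies the comparison procedure. It has the smallest critical value, so it reaches significance most easily (which also means it is the most prone to Type I errors). LSD should be used only when the overall F-test of the model is significant.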
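As a rough sketch of what LSD does: each pair of groups is compared with an ordinary t test whose standard error is built from the pooled error term of the ANOVA. The data and function names below are hypothetical. The critical t is passed in from a table, and the sketch assumes the overall F-test has already been checked and found significant.

```python
from itertools import combinations
from math import sqrt
from statistics import mean


def pooled_mse(groups):
    """Return (MS_error, df_error) from the one-way ANOVA error term."""
    ss_within = 0.0
    for values in groups.values():
        m = mean(values)
        ss_within += sum((x - m) ** 2 for x in values)
    df_error = sum(len(v) for v in groups.values()) - len(groups)
    return ss_within / df_error, df_error


def lsd_pairwise(groups, t_crit):
    """Compare every pair of groups with a t test built on the pooled MS_error.

    t_crit is the two-sided critical t for df_error, supplied by the caller.
    """
    mse, df_error = pooled_mse(groups)
    results = {}
    for (a, xa), (b, xb) in combinations(groups.items(), 2):
        se = sqrt(mse * (1 / len(xa) + 1 / len(xb)))
        t = (mean(xa) - mean(xb)) / se
        results[(a, b)] = (t, abs(t) > t_crit)
    return results, df_error


# Hypothetical data: three groups of three observations each.
data = {"A": [1, 2, 3], "B": [2, 3, 4], "C": [5, 6, 7]}
# 2.447 is the tabled two-sided 5% critical t for 6 degrees of freedom.
results, df_error = lsd_pairwise(data, t_crit=2.447)
for pair, (t, significant) in results.items():
    print(pair, round(t, 3), significant)
```

Because no adjustment is made to the critical value, every pair is tested at the full per-comparison level, which is why LSD reaches significance so easily.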

Tukey’s Test (also known as Tukey’s Honestly Significant Difference Test, or Tukey’s HSD for short)

Tukey’s HSD tests all possible pairwise differences between means and determines whether any of them differs from 0.

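The assumptions of Tukey’s test are: 1) all observations are independent; 2) the observations in each group come from a normally distributed population; 3) homogeneity of variance (equal variance across groups); 4) equal sample sizes (see Wikipedia).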

If the sample sizes differ, use the Tukey–Kramer method.
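For reference, the usual textbook form of the two statistics is sketched below. The symbols are not from the original post: k is the number of groups, n the common group size, and MS_error and df_error come from the ANOVA table.

```latex
% Tukey's HSD (equal n): studentized range statistic for groups i and j
q = \frac{\bar{X}_i - \bar{X}_j}{\sqrt{MS_{\text{error}} / n}}

% Tukey--Kramer (unequal n): only the standard error changes
q = \frac{\bar{X}_i - \bar{X}_j}
         {\sqrt{\dfrac{MS_{\text{error}}}{2}\left(\dfrac{1}{n_i} + \dfrac{1}{n_j}\right)}}

% In both cases the pair differs significantly when
|q| > q_{\alpha}(k,\ df_{\text{error}})
```

When all groups have the same size, the Tukey–Kramer standard error reduces to the HSD one.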

REGWQ

(To be added)

Scheffé Test (also called a simultaneous test procedure)

It tests all possible contrasts simultaneously and determines whether any of them differs from 0.

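It has the largest critical value, so it is the hardest to reach significance with (which also means it is the least prone to Type I errors, at the cost of lower power, i.e., more Type II errors).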

It can be used with unequal group sizes and with non-normal distributions.
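A minimal sketch of a Scheffé test for one contrast follows. The contrast gets an ordinary F statistic, but it is compared with (k - 1) times the tabled critical F, which is what makes the test so conservative. The data, the function name, and the choice of contrast are all hypothetical, and the critical F is supplied from a table.

```python
from statistics import mean


def scheffe_contrast(groups, coeffs, f_crit):
    """F test of one contrast against the Scheffé critical value.

    coeffs maps group name -> contrast coefficient (must sum to 0).
    f_crit is the tabled critical F for (k - 1, N - k) degrees of freedom.
    """
    if abs(sum(coeffs.values())) > 1e-9:
        raise ValueError("contrast coefficients must sum to 0")
    k = len(groups)
    n_total = sum(len(v) for v in groups.values())
    # Pooled within-group variance (MS_error).
    ss_within = 0.0
    for v in groups.values():
        m = mean(v)
        ss_within += sum((x - m) ** 2 for x in v)
    mse = ss_within / (n_total - k)
    # Value of the contrast and its F statistic.
    psi = sum(coeffs[g] * mean(v) for g, v in groups.items())
    scale = sum(coeffs[g] ** 2 / len(v) for g, v in groups.items())
    f_contrast = psi ** 2 / (mse * scale)
    scheffe_crit = (k - 1) * f_crit
    return f_contrast, scheffe_crit, f_contrast > scheffe_crit


# Hypothetical data; the contrast compares the average of A and B with C.
data = {"A": [1, 2, 3], "B": [2, 3, 4], "C": [5, 6, 7]}
# 5.14 is the tabled 5% critical F for (2, 6) degrees of freedom.
f, crit, significant = scheffe_contrast(
    data, {"A": 0.5, "B": 0.5, "C": -1.0}, f_crit=5.14
)
print(round(f, 2), round(crit, 2), significant)
```

The same critical value applies to every contrast you could form, pairwise or not, so you can look at as many contrasts as you like without inflating the familywise error rate.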

Dunnett’s test

If you are comparing every treatment against a control, this is the recommended method.

Comparison table (from Statistical Methods for Psychology by David C. Howell, 6th edition, p. 375):

Test	Error rate	Comparison	Type	A priori / post hoc	Remarks
Individual t tests	PC^a	Pairwise	t	A priori
Linear contrast	PC^a	Any contrasts	F	A priori
Bonferroni t	FW^b	Any contrasts	t^^	A priori
Holm: Larzelere & Mulaik	FW^b	Any contrasts	t^^	Both
Fisher’s LSD	FW^*	Pairwise	t	Post hoc
Newman-Keuls test	FW^*	Pairwise	Range	Post hoc	Highly controversial
Ryan (REGWQ)	FW	Pairwise	Range	Post hoc
Tukey HSD	FW	Pairwise***	Range	Post hoc	When you only want to test particular pairs of groups
Scheffé Test	FW	Any contrasts	F^^	Post hoc	When you want to test all differences
Dunnett’s test	FW	With control	F^^	Post hoc

Note:
^a: Error rate per comparison (PC).
^b: Familywise error rate (FW).
FW^*: against complete null hypothesis.
t^^: modified t test.
F^^: modified F test.
Pairwise***: Tukey HSD can be used for all contrasts, but is poor for this purpose.
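The table lists the Bonferroni and Holm procedures without describing them. As a hedged sketch of how they are usually described (not taken from Howell's text): both work on a list of p-values. Bonferroni compares every p-value with alpha / m, while Holm sorts the p-values and relaxes the threshold step by step. The function names and p-values below are made up for illustration.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Reject each hypothesis whose p-value is at most alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def holm_reject(p_values, alpha=0.05):
    """Holm step-down procedure.

    Sort p-values ascending; compare the smallest with alpha / m, the next
    with alpha / (m - 1), and so on; stop at the first non-rejection.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break
    return reject


# Hypothetical p-values from three pairwise comparisons.
p = [0.01, 0.02, 0.04]
print(bonferroni_reject(p))
print(holm_reject(p))
```

In this made-up example Holm rejects all three hypotheses while Bonferroni rejects only the first, which illustrates why Holm is never less powerful than Bonferroni at the same familywise level.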

Additions and corrections from experts are welcome!

agri521求学在南京

August 14, 2010 at 5:31 am

Thank you for your post.

Miles

September 17, 2010 at 8:10 am

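Could you explain what this sentence means: "It has the largest critical value, so it is the hardest to reach significance with (which also means it is less prone to Type II errors)"?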

sunny

November 19, 2010 at 9:09 am

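I have a question. I am running a one-way ANOVA to test, for example, whether different age groups differ in involvement. The test on the construct came out significant at p < .05, but when I used the Scheffé post hoc test to see which groups differ, nothing reached significance. With LSD, however, many groups were significant. How should I use these results in my interpretation, or did I make a mistake at some step? I hope you can help me with this. Thank you.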

ethan

January 8, 2011 at 6:42 pm

After running a stratified log-rank test for equality of survivor functions in Stata, is it possible to run pairwise comparisons between the survivor functions?
