Manipulating the Alpha Level Cannot Cure Significance Testing
When evaluating the strength of evidence, we should consider auxiliary assumptions, the strength of the experimental design, and implications for applications. Boiling all of this down to a binary decision based on a p-value threshold is not acceptable.
"Classical peer review" has been intensely criticized for slowing down the publication process, for bias against specific categories of paper and author, for unreliability, for its inability to detect errors and fraud, for unethical practices, and for the lack of recognition it gives unpaid reviewers. This paper surveys innovative forms of peer review that attempt to address these issues.