Developing and commercializing novel ideas is central to innovation processes. Because the outcomes of such ideas cannot be fully foreseen, evaluating them is crucial. With the rise of the internet and information and communication technology (ICT), more evaluations, and new kinds of evaluations, are carried out by crowds. This raises the question of whether individuals in crowds possess the capabilities necessary to evaluate ideas and whether their evaluation outcomes are valid. As empirical insights are not yet available, this paper examines evaluation processes and their general components, and discusses the underlying characteristics and mechanisms of these components that affect evaluation outcomes (i.e., evaluation validity). We further investigate differences between firm-based and crowd-based evaluation using different application cases, and develop a theoretical framework of evaluation validity that distinguishes validity by numbers from validity by expertise. We identify five factors that influence the validity of evaluations: (1) the number of evaluation tasks, (2) complexity, (3) expertise, (4) costs, and (5) time to outcome. For each factor, we develop hypotheses based on theoretical arguments. We conclude with implications and propose a model of evaluation validity.