Readings:

Heneman, H. H., Judge, T. A., & Kammeyer-Mueller, J. (2011). Staffing Organizations (7th ed.). New York: McGraw-Hill. Chapters 8 & 9. (These chapters will be used for Modules 7, 8, and 9.)

Ryan, A. M., & Lasek, M. (1991). Negligent hiring and defamation: Areas of liability related to pre-employment inquiries. Personnel Psychology, 44, 293-319.

Judge, T. A., & Higgins, C. A. (1998). Affective disposition and the letter of reference. Organizational Behavior and Human Decision Processes, 76, 207-221.

Fritzsche, B. A., & Brannick, M. T. (2002). The importance of representative design in judgment tasks: The case of resume screening. Journal of Occupational and Organizational Psychology, 75, 163-169.

Chapman, D. S., Uggerslev, K. L., & Webster, J. (2003). Applicant reactions to face-to-face and technology-mediated interviews: A field investigation. Journal of Applied Psychology, 88, 944-953.

Overview:

This module focuses on several selection methods that are, essentially, screening tools; that is, they are often used to screen people out rather than to select people in. Biographical information inventories (biodata), background checks, reference checks, resumes, and letters of recommendation (LORs) are all methods of gaining insight into the past behaviors and life or work experiences of an applicant. The underlying assumption of these methods is that what an applicant has done or experienced in the past may predict what he or she will do (or how he or she will perform) in the future. Although each of these methods focuses on past behaviors or experiences, there are differences between them.

Letters of recommendation are exactly that: a recommendation by a third party on behalf of the applicant, and they are typically considered a form of reference check. LORs can be structured (the recommending party completes a standard form or answers specific questions) or unstructured (the recommending party submits whatever they wish on behalf of the applicant).
While most often subjectively scored, objective scoring methods are available for LORs (for example, Peres and Garcia's adjective scoring method). An adjective scoring method involves obtaining and sorting adjectives from a large number of LORs submitted for a particular job. The categories of adjectives can then be examined for their relation to a criterion, and subsequent LORs can be evaluated based on the type and number of adjectives used.

Reference and background checks evaluate the applicant by gathering information from individuals or organizations that have had contact with the applicant. The information solicited in a reference check is used to verify information provided by the applicant, to predict job success ("How would the applicant perform in a specific environment?"), or to uncover additional background information not provided by the applicant. Background checks, by contrast, focus on the criminal record of an applicant. Checking references is one of the most common methods of screening applicants, and both reference and background checks are used as screen-out selection methods.

Biodata is also an assessment designed to predict a criterion (most commonly a measure of turnover) from the experiences or past behaviors of an applicant. Although there is disagreement in the literature as to what constitutes biodata, this information is usually obtained through questions concerning the personal backgrounds and life experiences of applicants. Examples of biodata questions include: "Did you graduate from college?" "How much did you like school?" "Which of the following hobbies do you enjoy?" "Have you ever repaired a broken appliance so that it later worked?"

Scoring techniques for biodata are much more complicated than those used for LORs. Empirical keying of biodata instruments involves the assignment of "optimal" weights to items or responses in an effort to predict performance (Dean, Russell, & Muchinsky, 1999).
The "optimal" weights are obtained by either of two methods: (1) establishing a relationship between the item or response option and the criterion, or (2) establishing a relationship between the item or response option and other items or response options.

Method or Construct?

Biodata, background and reference checks, and LORs are all methods of collecting information about an applicant's past. Pinning down what construct these methods measure is more difficult. These methods all attempt to measure past behavior or experiences, with the underlying assumption that past behavior or experience predicts future behavior. Whether it is biodata, background or reference checks, or LORs, each of these focuses on what has occurred. Background and reference checks (including LORs) typically investigate the applicant's relationships with a third party and that third party's evaluation of the applicant. Biodata looks at the behaviors or experiences of a person, with the expectation that those behaviors or experiences have led to the development of relevant KSAOs.

One of the central criticisms of these methods, most specifically biodata, is that they are "atheoretical," the result of "dustbowl empiricism" (Dean et al., 1999; Gatewood & Feild, 1998). However, advocates of these methods argue that other selection technologies, such as cognitive ability tests and work samples, also lack a theoretical basis (Dean et al., 1999). In defense of biodata, Mumford, Stokes, and Owens (1992) present a rationale for biodata based on the ecology model (as cited in Dean et al., 1999). The ecology model suggests that it is not only experiences that shape the person, but also the choices that lead a person to gain certain experiences. This model proposes an iterative process of choice, development, and adaptation (Dean et al., 1999).

Questions:

Please answer all of the following questions. This assignment is worth 15 points. Please submit your answers as a Word document attachment.
Structure your answer in any format. That is, as long as you address each of the questions below, your answer can be in any format you choose (e.g., paragraph form, bullets, a combination, etc.).

1. What are the possible legal implications of resumes, cover letters, biodata, and reference checks for the employer? Provide concrete examples. Hint: Refer to the legal materials/readings from earlier in the semester and provide citations and explanations for your answer. (5 pts)

2. What recommendations would you provide to an organization on the use of these selection methods? (4 pts)

3. What constructs are measured by each of these methods? (3 pts)

4. Do you think the organization should use these selection/screening methods? (3 pts)
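As a concrete illustration of the empirical keying approach described in the overview, the sketch below weights each biodata item by its correlation with a retention criterion (the first of the two weighting methods mentioned there). All item names, response data, and the criterion here are invented for illustration; this is not any published scoring key.

```python
import numpy as np

# Invented, scored biodata responses: 6 applicants x 3 items
# (e.g. item 1 coded 1 = graduated from college, 0 = did not)
X = np.array([
    [1, 0, 1],
    [0, 1, 0],
    [1, 1, 1],
    [1, 0, 0],
    [0, 0, 1],
    [0, 1, 0],
], dtype=float)
# Hypothetical criterion: 1 = applicant stayed past one year, 0 = turnover
y = np.array([1, 0, 1, 1, 0, 0], dtype=float)

def empirical_key_weights(items, criterion):
    """Weight each item by its correlation with the criterion
    (item-criterion keying)."""
    return np.array([
        np.corrcoef(items[:, j], criterion)[0, 1]
        for j in range(items.shape[1])
    ])

def biodata_score(responses, weights):
    """An applicant's biodata score: weighted sum of item responses."""
    return float(responses @ weights)

weights = empirical_key_weights(X, y)
```

On these toy data, items whose responses track the retention criterion receive larger positive weights, so applicants whose response patterns resemble those of retained employees score higher. In practice the weights would be derived on one sample and cross-validated on another before operational use.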
Journal of Occupational and Organizational Psychology (2002), 75, 163–169
© 2002 The British Psychological Society
The importance of representative design in
judgment tasks: The case of résumé screening
Barbara A. Fritzsche1* and Michael T. Brannick2
University of Central Florida, USA
University of South Florida, USA
A policy capturing study was conducted to determine if résumé profile judgments are generalizable to judgments of actual résumés. Forty recruiters judged 60 résumés or corresponding profiles on interview suitability. When profiles were judged, more variance in suitability judgments was accounted for, there was higher agreement among recruiters, the judgments were more favourable, and cue usage was different than when actual résumés were judged. Thus, inferences based on profiles were not generalizable to actual résumés. The importance of representative design and limitations of policy capturing for understanding résumé screening judgments are discussed.
Résumé screening is a widely used human resource selection technique (Gatewood &
Feild, 2001), and considerable efforts have been made to identify the determinants of
résumé screening decisions (e.g. Fox, Bizman, Hoffman, & Oren, 1995; Gardner,
Kosloski, & Hults, 1991; Glick, Zion, & Nelson, 1988; Graves & Karren, 1992; Mazen,
1990). Although some studies have examined reactions to real-world pre-screening
criteria (e.g. Campion, 1978; Werbel, Phillips, & Carney, 1989), most of what is known
about résumé screening is based on studies using laboratory-created stimuli. Bogus
résumés are created to simulate actual résumés in appearance and content (e.g. Work
Experience: Assistant Manager at Joe’s Deli, 1/90–1/92), or résumé profiles are created
to contain abstractions of information that might appear on résumés (e.g. work experience is a ‘3’ on a 4-point scale). Typically, a few items within these résumés are
manipulated, and participants are provided with brief job descriptions to judge the
suitability of one or more applicants for employment.
An important reason for creating bogus résumés is that use of a factorial design
facilitates the presentation of all possible profiles and ensures that the variables
manipulated (i.e. cues) are orthogonal. In studies designed to capture judgment
policies (see Hammond, 1980), having orthogonal cues provides confidence in the
interpretation of standardized beta weights as an indication of the relative importance
of the cues (Cooksey, 1996).
However, little is known about how well bogus résumé judgments generalize to
judgments of actual résumés. Judgments of real-world résumés are likely to be different, particularly if the cues used to make judgments are correlated (Stevenson,
Busemeyer, & Naylor, 1990), and if the information presented on laboratory-made résumés does not reflect variations that are typical on real-world résumés (Dougherty, Ebert, & Callender, 1986). Real-world résumé screening involves the more complex task of identifying important cues and cue values, in addition to combining cue values into judgments. Thus, the judgment task is likely to be different. Our review found only two studies (Campion, 1978; Werbel et al., 1989) that examined real-world prescreening judgments and no studies that compared laboratory-made résumés to their real-world counterparts. The purpose of this study was to make this comparison.

*Requests for reprints should be addressed to Barbara Fritzsche, Department of Psychology, University of Central Florida, PO Box 161390, Orlando, FL 32816-1390, USA (e-mail: [email protected]).
We designed the study to mimic judgments that recruiters make in day-to-day work.
The judges were college recruiters screening for jobs that they knew well, the number
of résumés was fairly large, and the résumés were sampled from a population of
appropriate job applicants rather than constructed to have orthogonal cues. The
judgment policies of those who viewed actual résumés were compared to policies of
those who viewed résumé profiles. The comparisons addressed the recruiters' agreement about the suitability of the résumés, the favourability and predictability of the
judgments, and the relative importance of each cue.
Forty recruiters who interview seniors at a large US university participated. Of these,
33 recruiters (18 males, 15 females) with an average of 7.78 years of experience in
personnel selection screened actual résumés (Sample 1). Seven college recruiters (five
males, two females) with an average of 5.3 years of experience screened résumé
profiles (Sample 2).[1] The recruiters typically screened résumés for entry-level management in department stores, financial services, retail clothing, business supplies, engineering, medical supplies and furniture stores.
Sixty résumés of recent graduates applying for management positions were gathered at
a job fair and used in this study. To preserve anonymity, identifying information was
omitted from the résumés. Because six relevant cues were identified, 60 résumés
resulted in 10 résumés per cue, as recommended by Cooksey (1996).
To determine the important cues, three individuals with 10–17 years of experience in résumé screening were interviewed, and a résumé guide (Bostwick, 1985) was consulted. There was high consensus among the experts and the guide with regard to important cues for recent graduates' résumés. The cues identified were similar to those
identified by Gardner et al. (1991), Graves and Karren (1992) and Werbel et al. (1989), and included: (1) a targeted career objective; (2) relevant education and training; (3) relevant work experience; (4) interests, activities and special skills; (5) references; and (6) format, visual appeal and spelling.

[1] One of the original purposes of this study was to examine the importance of gender in actual résumé screening. Hence, recruiters were deliberately oversampled for actual résumé screening. Regardless of condition, recruiters were sampled from the same population and no differential mortality occurred.
Based on information gathered from the experts, a detailed scoring form was developed to assess each cue on each of the 60 résumés. Twelve graduate students in
industrial/organizational psychology and two recruiters rated each cue independently
for entry-level management positions (1 = low quality, 4 = high quality); at least four
raters rated each cue. One cue, references, was not rated; rather, a reference score was
assigned to each résumé based on whether references appeared on the résumé.
The mean graduate student cue ratings ranged from 2.31 (SD = .74) for format to 2.59 (SD = .98) for interests. Cronbach's alpha ranged from .72 to .97 for the cue ratings, suggesting moderate to high reliability. Correlations between the average student ratings and the professionals' ratings ranged from .47 to .80, suggesting moderate agreement.

Because the cues were not chosen to be orthogonal, Pearson correlations among the cues were calculated. The correlation between work experience and interests (r = .34) was the only significant correlation. Because the cues were basically orthogonal, regression weights can be treated essentially as zero-order correlations with suitability judgments.
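The claim that nearly orthogonal cues let regression weights be read as zero-order correlations can be checked numerically. A minimal sketch with fabricated, exactly orthogonal cues and invented judgment values (none of this is the study's data):

```python
import numpy as np

# Two exactly orthogonal, mean-centred cues (a 2x2 factorial pattern)
x1 = np.array([1.0, 1.0, -1.0, -1.0])
x2 = np.array([1.0, -1.0, 1.0, -1.0])
y = 0.6 * x1 + 0.3 * x2  # invented suitability judgments

def standardize(v):
    return (v - v.mean()) / v.std()

Z = np.column_stack([standardize(x1), standardize(x2)])
zy = standardize(y)

# Standardized regression weights (betas) from least squares
betas, *_ = np.linalg.lstsq(Z, zy, rcond=None)
# Zero-order correlations of each cue with the judgments
r = np.array([np.corrcoef(Z[:, j], zy)[0, 1] for j in range(Z.shape[1])])
# With orthogonal cues, the betas and the correlations coincide
```

When the cues are correlated, as real résumé cues typically are, the two sets of coefficients diverge, which is exactly why factorial designs with orthogonal cues are attractive in the lab.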
Profile construction
The average graduate student cue ratings were used to construct résumé profiles that
corresponded to the 60 actual résumés. For example, the career objective on one
résumé was ‘To pursue a career in which I can use my marketing and management
skills to actively participate as a team member.’ The rating of this career objective was
2.0 on a 4-point scale because a career objective was stated, but it was not well-targeted. On the résumé profile, therefore, a 2.0 appears as the rating for career
objective. Because profiles were created by substituting written cues with their
associated cue values, the résumés and profiles have the same quantitative cue values
and, therefore, the same cue intercorrelations.
The actual résumés were mailed to Sample 1 recruiters, and the profiles were mailed to
Sample 2 recruiters. Each recruiter evaluated the 60 résumés or profiles for the management position for which they typically screen résumés. Each résumé or profile was
rated on the following 7-point suitability scale: ‘How likely would you be to offer
this applicant an interview (1 = extremely unlikely; 7 = extremely likely)?' During
debriefing, recruiters suggested that the résumés and the research task were similar to
what they encounter in their jobs.
The mean suitability rating was 3.60 (SD = .72) for Sample 1 and 4.25 (SD = .67) for Sample 2. Recruiters who viewed profiles gave significantly higher suitability ratings
than recruiters who viewed actual résumés (F(1,38) = 4.73, p < .05, R2 = .11).

The correlations between recruiters' judgments indicate the extent to which they agree about the applicants' suitability. The average correlation between Sample 1 recruiters' judgments (M = .10, SD = .18) was lower than that for Sample 2 (M = .32, SD = .20) and suggests low agreement among recruiters. Agreement was low regardless of whether the recruiters judged the résumés for similar entry-level management positions.

Table 1. Regression results for mean judgments of résumés and profiles (N = 60)

                        Standardized beta weights
Cue                  Résumé judgments   Profile judgments
Career objective          .20a               .37*b
Education                −.01a               .47*b
Work experience           .32*a              .66*b
Activities                .10                .19*
References                .31*a             −.01b
Format                    .16a               .31*b

*p < .05. Note. Overall R2 = .25 (F(6,53) = 2.99, p < .05) for the résumé judgments and R2 = .81 (F(6,53) = 36.72, p < .05) for the profile judgments. Beta weights with different subscripts differ significantly at p < .05.

Individual judgment policies were calculated by regressing each recruiter's judgment on the cue ratings. The squared multiple correlation coefficients (R2) reflect the predictability of each judge from the cues. For Sample 1, R2 ranged from .04 to .31 (M = .16). Of the 33 coefficients, only three were significantly different from zero. This suggests a general inability to capture policies when actual résumés were rated. For Sample 2, R2 ranged from .40 to .80 (M = .65). All coefficients were statistically significant and, on average, 65% of the variance in suitability judgments was explained. This is similar to the amount of variance found in previous judgment analysis studies (see Cooksey, 1996). Thus, policies were adequately captured when profile judgments were made.
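The individual policy-capturing step, regressing each recruiter's judgments on the cue ratings and reading off R2, can be sketched as follows. The recruiter and cue data below are simulated for illustration only, not the study's data:

```python
import numpy as np

def policy_r2(cues, judgments):
    """Capture one judge's policy: regress judgments on the cue ratings
    (with an intercept) and return the squared multiple correlation R^2."""
    X = np.column_stack([np.ones(len(judgments)), cues])
    coef, *_ = np.linalg.lstsq(X, judgments, rcond=None)
    resid = judgments - X @ coef
    dev = judgments - judgments.mean()
    return 1.0 - (resid @ resid) / (dev @ dev)

# Simulated task: 60 resumes rated on 6 cues (1-4 scale), plus one
# hypothetical recruiter whose judgments lean mostly on work experience
rng = np.random.default_rng(1)
cues = rng.uniform(1, 4, size=(60, 6))
judgments = 1.0 + 1.2 * cues[:, 2] + rng.normal(0, 0.8, size=60)
r2 = policy_r2(cues, judgments)
```

A recruiter whose judgments are a consistent linear function of the cues yields an R2 near 1; noisy or cue-independent judgments, like those observed in the actual-résumé condition, yield R2 near 0.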
To test the difference between Sample 1 and Sample 2 multiple correlation coefficients, they were transformed from R to z, and a one-way ANOVA indicated that they were significantly different from each other (F(1,38) = 197.89, p < .05; R2 = .84). Mean variance accounted for was greater when profiles were evaluated than when résumés were evaluated.

Finally, recruiters' judgments for each résumé or profile were averaged and then regressed on the cue ratings. This removes individual differences in judgments from the error term. These results, suggesting different cue usage between the conditions, are shown in Table 1.

Discussion

This study examined whether résumé profile judgments are generalizable to actual résumé judgments. This is important because our knowledge of initial employment screening judgments is based primarily on findings from studies that use profiles constructed specifically for research rather than actual résumés of job applicants. The evidence overwhelmingly suggests that inferences based on profiles are not generalizable to inferences based on actual résumés. For résumé profiles, judgments were more favourable, predictable and consistent among recruiters. And, perhaps most importantly, cue usage was different for the two tasks.

The low R2s when recruiters judged actual résumés could be due to difficulty in translating bits of information on the résumé into subjective cue values, combining cues into judgments, or both. Less than perfect inter-rater reliability among cue ratings suggests that part of the problem is mapping observed information into subjective values. However, additional post hoc data collected suggest that the major problem is combining information. Specifically, we collected test–retest data to examine consistency within recruiters over time. Test–retest correlations among three original recruiters (with approximately two months' delay) were low (rs = .34, .49 and .45).
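The R-to-z step used before the ANOVA above is the standard Fisher transform. A minimal sketch, with invented correlation values rather than the authors' data:

```python
import math

def fisher_z(r):
    """Fisher r-to-z transform, z = arctanh(r); approximately normalises
    the sampling distribution of a correlation so means can be compared."""
    return math.atanh(r)

# Invented multiple correlations for two hypothetical recruiter samples
resume_rs = [0.20, 0.40, 0.55]   # actual-resume condition
profile_rs = [0.63, 0.80, 0.89]  # profile condition
resume_zs = [fisher_z(r) for r in resume_rs]
profile_zs = [fisher_z(r) for r in profile_rs]
# The z values, not the raw Rs, would then be entered into the one-way ANOVA
```

The transform matters because raw correlations are bounded at 1 and their sampling variance shrinks as r grows, so averaging or comparing them directly biases the comparison.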
In addition, we asked one professional to identify important cues on the résumés, rate each cue individually, and rate each résumé on overall suitability. Thus, a policy was calculated based solely on his judgments, thereby eliminating any possibility that cues were misidentified or that cue ratings were inaccurate. Using his own judgments exclusively, the R2 was only .28. Even in this ideal situation, we were unable to adequately capture his policy. Findings suggest that recruiters use unstable strategies or that their strategies change over time. Time constraints, boredom and the need to complete a complex task in an efficient manner may encourage recruiters to base judgments on whatever is salient on each résumé.

Because of the need to translate written information into subjective cue values, the résumé screening task was more complex than the profile screening task. The lower favourability ratings for actual résumés may, therefore, result from greater opportunity to find and give more weight to negative information. In addition, the lower agreement among recruiters in the résumé condition may be due to the extra task of having to evaluate the suitability of the written information for entry-level management in their specific industry (which was already preassessed in the profiles).[2]

In complex tasks, judges tend to rely on strategies that enable them to reduce cognitive effort while maintaining coherence in their choices. In fact, judges who achieve good accuracy-to-effort ratios change their strategy as the situation demands (Payne, Bettman, & Johnson, 1997). Policy capturing is not well suited to this because multiple regression assumes that observations are replicates. Process tracing methods (Payne, Bettman, & Luce, 1998; Stevenson et al., 1990; Svenson, 1979) may help identify whether unreliability occurs when recruiters encode cues or when they aggregate them into judgments.

The practical implications of the results are different for applicants and for recruiters.
Because résumés are screened unreliably, applicants should blanket employers with résumés; interviews appear to be granted as much by luck and whim as by merit. For recruiters, interventions are warranted to boost the reliability of résumé evaluations. One possible method would be frame of reference training (e.g. Sulsky & Day, 1994).

The present study did not reflect actual résumé screening in two ways: (1) the recruiters were asked to judge the suitability of the résumés on a 7-point scale; and (2) the judgments had no consequences for the recruiters. Neither point, however, explains the difference between the Sample 1 and Sample 2 judgments.[3] The two points do limit the generalizability of our results for predicting what recruiters will do on the job. It is interesting to note, though, that our results were similar to those found in a real-world context (Campion, 1978), in that he also found low R2s (ranging from .126 to .157, collapsed across interviewers).

[2] We are indebted to an anonymous reviewer for this suggestion.

[3] To provide some insight into whether the suitability judgments made on a 7-point scale artificially increase the variance that would naturally be generated by recruiters' binary decisions, we recoded the 7-point scale into a 2-point scale (ratings of 1–3 were coded as '0' and ratings of 5–7 were coded as '1') and recalculated the regression analyses. Using the recoded criterion measure, Sample 1 R2s ranged from .04 to .37 (M = .17), and Sample 2 R2s ranged from .29 to .70 (M = .54). These results are very similar to the results found when the 7-point scale was used.

Others (e.g. Cooksey, 1996) have argued that simulated profiles should maintain important characteristics of the judgment task, such as cue intercorrelations and cue distribution parameters.
In this study, the profiles did have the same cue intercorrelations and cue distribution parameters as the actual résumés, but there were other aspects of the task that were not maintained through the use of profiles. Real-world résumé screening involves the more complex task of identifying important cues and cue values, in addition to combining cue values into judgments. Thus, our results emphasize the importance of lab stimuli mirroring actual stimuli if the goal of the research is to model judgments.

Acknowledgements

We would like to thank Daniel Van Hoose of the Career Resource Center at USF for his assistance with data collection.

References

Bostwick, B. E. (1985). Resume writing: A comprehensive how-to-do-it guide (3rd ed.). New York: Wiley.

Campion, M. A. (1978). Identification of variables most influential in determining interviewers' evaluations of applicants in a college placement center. Psychological Reports, 42, 947–952.

Cooksey, R. W. (1996). Judgment analysis: Theory, methods, and applications. San Diego: Academic Press.

Dougherty, T. W., Ebert, R. J., & Callender, J. C. (1986). Policy capturing in the employment interview. Journal of Applied Psychology, 71, 9–15.

Fox, S., Bizman, A., Hoffman, M., & Oren, L. (1995). The impact of variability in candidate profiles on rater confidence and judgments regarding stability and job suitability. Journal of Occupational and Organizational Psychology, 68, 13–23.

Gardner, P. D., Kozloski, S. W. J., & Hults, B. H. (1991). Will the real prescreening criteria please stand up? Journal of Career Planning and Employment, 51, 57–60.

Gatewood, R. D., & Feild, H. S. (2001). Human resource selection (5th ed.). Fort Worth, TX: Dryden.

Glick, P., Zion, C., & Nelson, C. (1988). What mediates sex discrimination in hiring decisions? Journal of Personality and Social Psychology, 55, 178–186.

Graves, L. M., & Karren, R. J. (1992). Interviewer decision processes and effectiveness: An experimental policy-capturing investigation.
Personnel Psychology, 45, 313–340.

Hammond, K. R. (1980). Introduction to Brunswikian theory and methods. In K. R. Hammond & N. E. Wascoe (Eds.), New directions for methodology of social and behavioral science: Realizations of Brunswik's representative design (pp. 1–11). San Francisco, CA: Jossey-Bass.

Mazen, A. M. (1990). The moderating role of social desirability, age, and experience in human judgment: Just how indirect is policy capturing? Organizational Behavior and Human Decision Processes, 45, 19–40.

Payne, J. W., Bettman, J. R., & Johnson, E. J. (1997). The adaptive decision maker: Effort and accuracy in choice. In W. M. Goldstein & R. M. Hogarth (Eds.), Research on judgment and decision making: Currents, connections, and controversies (pp. 181–204). Cambridge: Cambridge University Press.

Payne, J. W., Bettman, J. R., & Luce, M. F. (1998). Behavioral decision research: An overview. In E. C. Carterette & M. P. Friedman (Series Eds.) & M. H. Birnbaum (Vol. Ed.), Handbook of perception and cognition (2nd ed.): Measu ...