Session 2F Agent Societies and Societal Issues 2 AAMAS 2019 May 13 17 2019 Montr al Canada. work builds on these contributions by adding varied in terms of bound rad ai Once arm ai is pulled e g application reviewed or. cost and reward arm pulls applicant interviewed u ai and rad ai are updated . Several MAB formulations select an optimal subset using a single The set of potential cohorts or subsets of arms is defined by. type of arm pull modeling decisions with focuses on different a decision class M 2 n Note that M need not be the power. problem features Cao et al 8 solve the top K problem with MABs set of arms but can include cardinality and other constraints The. for linear objectives Locatelli et al 22 address the thresholding total utility for a cohort is given by some linear function w Rn . bandit problem finding the arms above and below threshold with M R that takes as input the unknown true utilities u of the. precision Jun et al 17 identify the top K set while pulling arms arms and the selected cohort Throughout the paper we assume. in batches Singla et al 31 propose an algorithm for crowdsourcing a maximization oracle defined as Oracle v arg maxM M w M . that hires a team for specific tasks treating types of workers as where v Rn is a vector of weights in this case estimated or true. separate problems and an arm pull as a worker performing an action utilities for each arm Our overall goal is to accurately estimate the. with uniform cost true utilities of arms and then select the optimal subset of arms. To select the best subset while satisfying a submodular function using the maximization oracle . Singla et al 32 propose an algorithm maximizing an unknown. function accessed through noisy evaluations Radlinski et al 27 Problem hardness Following the notation of Chen et al 9 we. learn a diverse ranking from the behavior patterns of different users define a gap score for each arm For each arm a that is in the optimal. and then greedily select the next document to rank They treat each cohort M the gap is the difference in optimality between M and. rank as a separate MAB instance rather than our approach using the best set without a For each arm a that is not in the optimal. a single MAB to model the whole system Yue and Guestrin 36 set M the gap is the sub optimality of the best set that includes a . introduce the linear submodular bandits problem to select diverse Formally the gap is defined as. sets of content in an online learning setting for optimizing a class . w M maxM M a M w M if a M ,of feature rich submodular utility models a 1 . We are motivated by the observation that in many real world w M maxM M a M w M if a M . settings different levels of information gathering can be performed. at varying costs Previous work uses stochastic costs in the MAB This gap score serves as a useful signal for problem hardness . setting However our costs are fixed for specific types of arm pulls which we use in our theoretical analysis Formally the hardness of. Ding et al 11 look at a MAB problem with variable rewards and the problem can be defined as the sum of inverse squared gaps. cost with budget constraints When an arm is pulled a random . reward is received and a random cost is taken from the budget H a 2 2 . Similarly Xia et al 35 propose a batch arm pull MAB solution. to a problem with variable random rewards and costs Jain et al Chen et al defined the concept of width M When comparing. 15 use MABs with variable rewards and costs to select individual all combinations of two sets A A M where A A define. workers in a crowdsourcing setting They select workers to do dist A A A A A A Therefore define width M . binary tasks with an assured accuracy for each where workers min A A A A M A A dist A A In other words the width is. costs are unknown the smallest distance between any two sets in M See Chen et al . Lux et al 23 and Waters and Miikkulainen 33 use supervised for an in depth explanation of width M . learning to model admissions decisions They develop accurate. classifiers none decide how to allocate interviewing resources or Strong and weak pulls In reality there is more than one way. maximize a certain objective unlike our aim to select a more diverse to gather information or receive rewards Therefore we introduce. cohort via a principled semi automated system two kinds of arm pulls which vary in cost j and information gain. The behavioral science literature shows that scoring candidates s Information gain s is defined as how sure one is the reward is. via the same rubric asking the same questions and spending the close to the true utility We model the information gain as s parallel. same amount of time are interviewing best practices 2 13 29 34 arm pulls with the resulting rewards being averaged together A. Such structured interviews reduce bias and provide better job success weak arm pull has cost j 1 but results in a small amount of. predictors 18 25 We incorporate these results into our model information s 1 In our domain of graduate admissions weak. through our assumption that we can spend the same budget and arm pulls are standard application reviews which involve reading. get the same information gain across different arms submitted materials and then making a recommendation A strong. arm pull in contrast has cost j 1 but results in s 1 times. the information as a weak arm pull In our domain strong arm. 3 PROBLEM FORMULATION pulls extend reading submitted materials with a structured Skype. We now formally describe the stochastic multi armed bandit set interview followed by note taking and a recommendation . ting in which we operate For exposition s sake we do so in the In our experience the latter can reduce uncertainty considerably . context of a decision maker reviewing a set of job applicants How which we quantify and discuss in Section 5 However due to their. ever the formulation itself is fully general We represent a set of high cost such interviews are allocated relatively sparingly We. n applications A as arms ai A for i n Each arm has a true formally explore this problem in Section 4 and provide an algorithm. utility u ai 0 1 which is unknown an empirical estimate for selecting which arms to pull along with nonasymptotic upper. u ai 0 1 of that underlying true utility and an uncertainty bounds on total cost . Session 2F Agent Societies and Societal Issues 2 AAMAS 2019 May 13 17 2019 Montr al Canada. 4 SWAP AN ALGORITHM FOR ALLOCATING t a1 , INTERVIEW RESOURCES. In this section we propose a new multi armed bandit algorithm t a2 . strong weak arm pulls SWAP that is parameterized by s and j t a3 . SWAP uses a combination of strong and weak arm pulls to gain. information about the true utility of arms and then selects the. optimal cohort Our setting and the algorithm we present generalize Figure 1 Example with n 3 after running SWAP for t steps . the CLUCB algorithm proposed by Chen et al 9 which can be Dots are the empirical utility ut a while flags represent the. viewed as a special case with s j 1 radius of confidence rad t a Here rad t a 2 and rad t a 3 over . lap SWAP may pull a 3 ,Algorithm 1 Strong Weak Arm Pulls SWAP . is in the set Mt then the worst case is when the true utility of a is. Require Confidence 0 1 Maximization oracle Oracle . less than our estimate a might not be in the true optimal set M . Alternatively if an arm is not in the set Mt then the worst case is. 1 Weak pull each arm a n once to initialize empirical means. when the true utility of a is greater than our estimate a might be. in the true optimal set M Using the worst case estimates SWAP. 2 i n set Tn ai 1 , computes an alternate subset of arms M t If the utility of the initial. 3 Cost n n total resources spent, set Mt and the worst case set M t are equal then SWAP terminates. 4 for t n n 1 do, with output Mt which is correct with probability 1 as we show. 5 Mt Oracle u t , in Theorems 4 2 and 4 4 If w Mt and w M t differ SWAP looks at a. 6 for ai 1 nr, set of candidate arms in the symmetric difference of Mt and M t and. 7 rad t ai 2 log Tt ai chooses the arm pt with the largest uncertainty bound rad t pt . 8 if ai Mt then SWAP then chooses to either strong or weak pull the selected. 9 u t ai u t ai rad t ai arm pt using a strong pull policy depending on parameters s and j . 10 else A strong pull policy is defined as spp R 1 R 1 0 1 . 11 u t ai u t ai rad t ai For example in the experiments in Section 5 we use the following. pull policy ,12 M t Oracle u t s j,13 if w M t w Mt then spp s j 3 . 14 Out Mt This policy tries to balance information gain and cost When. 15 return Out the strong pull gain is high relative to cost then many more strong. 16 pt arg maxa M M M M rad t a pulls will be performed When the weak pull gain is low relative. t t t t, 17 spp s j to cost then fewer strong pulls will be performed as discussed in. 18 with probability do Example 4 1 , 19 Strong pull pt Once an arm is pulled the empirical mean u t 1 pt and the. 20 Tt 1 pt Tt pt s information gain Tt 1 pt is updated A reward from a strong arm. 21 Cost t 1 Cost t j is counted s times more than a weak pull . 22 else, Example 4 1 Suppose we wish to find a cohort of size K 2 from. 23 Weak pull pt, three arms A a 1 a 2 a 3 Run SWAP for t iterations Figure 1. 24 Tt 1 pt Tt pt 1, shows that SWAP maintains empirical utilities u t and uncertainty. 25 Cost t 1 Cost t 1, bounds rad t In this case M a 1 a 2 and M a 1 a 3 Arm a 3 . 26 Update empirical mean u t 1 using observed reward. therefore is the arm in the symmetric difference a 2 a 3 with the. 27 Tt 1 a Tt a a pt, highest uncertainty which therefore needs to be pulled Further . assume that a 3 needs x information gain for SWAP to end When. Algorithm 1 gives pseudocode for SWAP It starts by weak pulling j 1 and s 1 the best pulling strategy would be to weak pull a 3. all arms once to initialize an empirical estimate of the true under for x times When j 1 and s y where y 1 the best pulling. lying utility of each arm It then iteratively pulls arms chooses strategy would be to strong pull a 3 for ceil yx times Finally when. to weak or strong pull based on a general strategy updates em j z and s y where y z 1 the best pulling strategy would be. pirical estimates of arms and terminates with the optimal i e to strong pull a 3 for floor yx 1 z x mod y times and weak. objective maximizing subset of arms with probability 1 for pull a 3 for 1 z x mod y x mod y times where 1 a 1. some user supplied parameter when a 0 and 0 otherwise In reality we do not know how many. During each iteration t SWAP starts by finding the set of arms times an arm needs to be pulled which is why we introduce a. Mt that according to current empirical estimates of their means probabilistic strong pull policy like that in Equation 3 . maximizes the objective function via an oracle It then computes a. confidence radius rad t a for each arm a and estimates the worst Analysis We now formally analyze SWAP We define X Cost . case utility of that arm with the corresponding bound If an arm a E Cost as the expected cost or expected j value and X Gain . Session 2F Agent Societies and Societal Issues 2 AAMAS 2019 May 13 17 2019 Montr al Canada. E Gain as the expected gain or the expected s value Assume. that each arm a n has mean u a withr an sub Gaussian tail T vs Hardness. Theoretical bound, 4nCost 3t 10 9, Following Chen et al set rad t a 2 log Tt a for Actual cost. all t 0 107, Notice that if we use strong pull policy spp s j 0 then we. only perform weak arm pulls and SWAP reduces to Chen et al s 105. CLUCB We call this reduction the weak only pull problem Chen et. al proved that CLUCB returns the optimal set M and uses at most 103 105 107. O width M 2 H samples Similarly if we set spp s j 1 then we H. only perform strong arm pulls dubbed the strong only pull problem . We show that this version of SWAP returns the optimal set M and Figure 2 Exploration of bounds in practice vs the theoret . costs at most O width M 2 H s ical bounds of Theorem 4 4 with respect to hardness note. that both axes are a log scale , Theorem 4 2 Given any 0 1 any decision class M . 2 n and any expected rewards u Rn assume that the reward. distribution a for each arm a n has mean u a with an sub It is nontrivial to determine where the general version of SWAP is. Gaussian tail Let arg maxM M w M denote the optimal set better than both the SWAP algorithm with only strong pulls and the. SWAP algorithm with only weak pulls given the non asymptotic. Set rad t a 2 log Tt a for all t 0 and a n Then nature of all three bounds Chen et al results and Theorems 4 2. with probability at least 1 the SWAP algorithm with only strong and 4 4 Based on our experiments 5 we conjecture that there is. pulls where j 1 and s j returns the optimal set Out M and a of s and j pairs where SWAP is the optimal algorithm even for. 2 relatively low numbers of arm pulls though it is problem specific . width M 2 H log nj 3 2 H This is discussed more in Section 7 3 . T O 4 , where T denotes the total cost used by the SWAP algorithm and H is. 5 TOP K EXPERIMENTS, defined in Eq 2 In this section we experimentally validate the SWAP algorithm. under a variety of arm pull strategies We first explore 5 1 the. Although s and j are problem specific it is important to know efficacy of our bounds in Theorem 4 4 and Corollary 4 3 in simu . when to use the strong only pull problem over the weak only pull lation Then we deploy SWAP on real data 5 2 drawn from one. problem Corollary 4 3 provides weak bounds for s and j for the of the largest computer science graduate programs in the United. strong only pull problem We also explore its ramifications experi States We show that SWAP provides a higher overall utility with. mentally in Figure 3a as discussed in Section 5 1 equivalent cost to the actual admissions process . Corollary 4 3 SWAP with only strong pulls is equally or more. efficient than SWAP with only weak pulls when s 0 and 0 j . 5 1 Gaussian Arm Experiment, C 3 3 where C 4n H We begin by validating the tightness of our theoretical results in a. simulation setting that mimics the assumptions made in Section 4 . We now address the general case of SWAP for any probabilistic We pull from a Gaussian distribution around each arm When arm. strong pull policy parameterized by s and j In Theorem 4 4 we a is weak pulled a reward is pulled from a Gaussian distribution. show that SWAP returns M in O width M 2 H X Gain samples . with mean ua the arm s true utility and standard deviation . Similarly when arm a is strong pulled the algorithm is charged j. Theorem 4 4 Given any 1 2 3 0 1 any decision class cost and a reward is pulled from a distribution with mean ua and. M 2 n and any expected rewards u Rn assume that the reward . standard deviation s This strong pull distribution is equivalent. distribution a for each arm a n has mean u a with an sub to pulling the arm s times and averaging the reward thus ensuring. r M arg max,Gaussian tail Let , M M w M denote the optimal set an information gain of s . Set rad t a 2 log Tt a for all t 0 and a n set We ran all three algorithms SWAP with the strong pull policy. r r defined in Equation 3 SWAP with only strong pulls and SWAP with. only weak pulls while varying s and j For each s and j pair we. , 1 2 log 12 2 T and set 2 2 log 21 3 n Then with. ran the algorithms at least 4 000 times with a randomly generated. probability at least 1 1 1 2 1 3 the SWAP algorithm set of arm values Random seeds were maintained across policies . Algorithm 1 returns the optimal set Out M and We then compared the cost of running each of the algorithms 1. 2 2, 2 3 To test Corollary 4 3 Figure 3a compares SWAP with only weak. width M H log n X Cost 1 H 1 pulls to SWAP with only strong pulls We found that Corollary 4 3. T O 5 , X Gain 2 is a weak bound on the boundary value of j The general version. of SWAP should be used when it performs better costs less than. where T denotes the total cost used by Algorithm 1 and H is defined 1 Allcode to replicate this experiment can be found here https github com . in Eq 2 principledhiring SWAP , Session 2F Agent Societies and Societal Issues 2 AAMAS 2019 May 13 17 2019 Montr al Canada. w T, Optimal zone of SWAP, Strong vs Weak Pull, SWAP 80 1 0 5 1978 53 . 5 Actual 73 96 2000, Table 1 Graduate Admissions Simulation of SWAP Compar . 10 0 10 ison of top K utility w and cost T of SWAP with results of the. actual admissions process The values in parentheses are the. 15 15, standard deviations , 5 10 15 5 10 15, s values s values. By administering an anonymous survey of past admissions com . a Weak vs Strong b SWAP Optimal Zone mittee members we estimated that interviews are approximately. six times longer than reviewing a written application Therefore . Figure 3 Cost comparisons Figure 3a compares only strong we set our j value the cost of a strong pull to be 6 The gain of an. to only weak pulls Green indicates better performance by interview is uncertain so we ran tests over a wide range of s values. strong pulls and intensity indicates magnitude The blue the information gain of a strong pull The number of reviews and. line is the Corollary 4 3 bound on j Figure 3b shows where interviews 6 were summed to get a cost T of the actual review. the general version of SWAP outperformed green both process . SWAP with only strong pulls as well as SWAP with only Experimental Setup We simulate an arm pull by returning a real. weak pulls and maroon where it outperformed at least one score that a reviewer gave during the admissions process in the. of the latter order of the original reviews or a score from a probabilistic classifier. if all committee members reviews have been used An arm pull. both the strong only and weak only versions of SWAP The zone returns a score drawn from a distribution around the probabilistic. where SWAP is effective varies with the problem See 7 3 for result from the classifier to simulate some human error or bias . a deeper discussion Figure 3b shows the optimal zone for the We ran SWAP using the strong pull policy defined in Eq 3 where. Gaussian Arm Experiment we define the utility of each arm by the probabilistic result from. the classifier For our results we compare SWAP s selections with. 5 2 Graduate Admissions Experiment the real decisions made during the admissions process . Finally we describe a preliminary exploration of SWAP on real Results Running SWAP consistently resulted in a higher overall. graduate admissions data from one of the largest CS graduate pro utility than the actual admissions process while using roughly. grams in the United States The experiment was approved by the equivalent cost Table 1 We see that the overall top K utility w. university s Institutional Review Board Our dataset consists of is higher in SWAP than in practice We also see that SWAP uses. three years of graduate admissions applications graduate com roughly equivalent resources T than what is used in practice This. mittee application review text and ratings and final admissions suggests that SWAP is a viable option for admissions There are . decisions Information was gathered from the first two academic however some limitations of only using a top K policy such as. years treated as a training set while the data from last academic potentially overlooking the value diverse candidates bring to a. year was used to evaluate the performance of SWAP treated as a cohort For instance when hiring a software engineering team if. test set the top candidates are all back end developers it may be worthwhile. to hire a front end developer with slightly lower utility . Dataset During the admissions process potential students from. all over the world send in their applications A single applica 6 PROMOTING DIVERSITY THROUGH A. tion consists of quantitative information such as GPA GRE scores . TOEFL scores nationality gender previous degrees and so on as. SUBMODULAR FUNCTION, well as qualitative information in the form of recommendation Motivated by recent evidence that diversity in the workforce can. letters and statements of purpose In the 2016 17 academic year increase productivity 10 14 we explore the effect of formally. the department received approximately 1 600 applications with promoting diversity in the cohort selection problem First we define. roughly 4 500 applications over all three years The most recent a submodular function that promotes diversity Section 6 1 Then. 1 600 applications are roughly split into 1 000 Master s applications empirically we show that SWAP performs well with a submodular. and 600 Ph D applications The acceptance rate is 3 for Masters objective function Section 6 2 In experiments on real data we. students and 20 for Ph D students show a significant increase in diversity with little loss in fit while. Once all applications are submitted they are sent to a review using roughly the same resources as in practice Section 6 3 . committee Generally applicants at the top who far exceed ex . pectations and applicants at the bottom who do not fulfill the 6 1 Diversity Function. program s strict requirements only need one review Applicants Quantifying the diversity of a set of elements is of interest to a vari . on the boundary however may go through multiple reviews with ety of fields including recommender systems information retrieval . different committee members Once all reviews have been made computer vision and others 3 26 27 30 For our experiments we. the graduate chair chooses the final applicants to admit choose a recent formalization from Lin and Bilmes 20 and apply it. Session 2F Agent Societies and Societal Issues 2 AAMAS 2019 May 13 17 2019 Montr al Canada. to both simulated and real data Their formulation assumes that the. T vs Hardness T vs Hardness, arms can be split into L partitions where a partition is denoted as Pi 109 Theoretical bound 109 Theoretical bound. and a cohort is defined as M P1 P2 P L At a high level the Actual cost Actual cost. diversity function w div is defined as w div M i 1 a Pi u a . 107 107, Lin and Bilmes showed that w div is submodular and monotone 105 105. Under w div M there is typically more benefit to selecting an arm. 103 105 107 103 105 107, from a class that is not already represented in the cohort if the H H. empirical utility of an arm is not substantially low As soon as an. arm is selected from a class other arms from that class experience a SWAP with w div b PAC relaxation with w div. diminishing gain due to the square root function Example 6 1 il . lustrates when w div results in a different cohort selection than the Figure 4 Exploration of bounds in practice for SWAP with. top K function w top M a M u a , w div 4a and the PAC relaxation of SWAP with w div 4b vs . Example 6 1 Return to a similar setting to Example 4 1 with the theoretical bounds of Theorem 4 4 with respect to hard . three arms a 1 a 2 a 3 A and true utilities u a 1 0 6 u a 2 ness Note that both axes are a log scale . 0 5 and u a 3 0 3 Assume there exist L 2 classes and let arms. a 1 and a 2 belong to class 1 and arm a 3 belong to class 2 Then for a Gender Actual Gender SWAP. cohort of size K 2 w top will select cohort M top a a while. w div will select cohort M div a 1 a 3 Indeed w top M top. , 1 1 0 9 w top M div while w div M top 1 1 1 05 1 3 . , , 0 6 0 3 w div M div , Maximizing a general submodular function is computationally. difficult Nemhauser, et al 24 proved that a close to optimal that is F. w div M 1 e1 OPT greedy algorithm exists for submodular . monotone functions that are subject to a cardinality constraint We. a Actual b SWAP, use that standard greedy packing algorithm in our implementation. of the oracle Region Actual Region SWAP, N America China. 6 2 Diverse Gaussian Arm Experiments N America, To determine if SWAP works in this submodular setting we ran. Africa India, Other Americas, simulations over a variety of hardness levels We instantiated the Africa. China Europe, problem similarly to that of Section 5 1 with the added complexity Middle Other. of dividing the arms into three partitions East East Europe. India Asia, Figure 4a shows the cost of running SWAP compared to the Asia. theoretical bounds of the linear model over increasing hardness. levels The results show that SWAP performs well for the majority c Actual d SWAP. of cases However for some cases the cost becomes very large To. deal with those situations we can use a probably approximately Figure 5 Comparison of true and SWAP simulated admis . PAC relaxation of Algorithm 1 where Line 13 becomes sions gender 5a 5b region 5c 5d . If w M t w Mt The results from this PAC relaxation. where 0 01 can be found in Figure 4b Note that the definition Gender Region of Origin. , of hardness found in Equation 2 does not quite fit this situation w top w div w top w div. since the graphs in Figure 4 have higher costs for some lower hard . SWAP 8 5 0 03 12 1 0 06 8 0 0 03 22 1 0 03 , ness problems while having lower cost for some higher hardness. Actual 8 6 11 8 8 6 20 47, problems Given that the PAC relaxation performs well with low. costs over all of the tested hardness problems we propose that Table 2 SWAP s average gain in diversity over different. SWAP can be used with w div and perhaps other submodular and classes . monotone functions ,6 3 Diverse Graduate Admissions Experiment. Using the same setting as described in Section 5 2 we simulate a Results We compare two objective functions w top and w div . SWAP admissions process with the submodular function w div We w top treats all applicants as members of one global class This. partition groups by gender which is binary in our dataset and mimics a top K objective where applicants are valued based on. multi class region of origin We found that we did not have to resort individual merit alone w div promotes diversity using reported. to the PAC version of SWAP to tractably run the simulation over gender and region of origin for class memberships We use those. various partitions of the graduate admissions data classes as our objective during separate runs of SWAP . Session 2F Agent Societies and Societal Issues 2 AAMAS 2019 May 13 17 2019 Montr al Canada. allocation can negatively affect applicants from traditionally un . Random and Uniform derrepresented minority groups We suggest a formally structured. vs SWAP and Actual, process to help prevent disadvantaged people from falling through. wTOP Actual the cracks We discuss benefits Section 7 1 and limitations Sec . 8 SWAP, tion 7 2 to this approach as well as mechanism design suggestions. 7 Uniform for deploying SWAP in practice Section 7 3 . 0 10000 20000 30000, Cost 7 1 Benefits, We established SWAP a clear cut way to model a sequential decision . making process where the aim is to select a subset using two kinds. of information gathering strategies as a multi armed bandit algo . Random and Uniform, vs SWAP and Actual rithm This process could have a number of benefits when used in. practical hiring admissions settings , Over the course of designing and running our experiments we. Random noticed what seemed like bias in the application materials of candi . 10 Uniform dates belonging to underrepresented minority groups Our initial. 0 10000 20000 30000 observations were similar to those of scholars such as Schmader. Cost et al 28 who found that recommendation letters for female ap . plicants to faculty jobs contained fewer work specific terms than. b w div male applicants After revisiting and coding application materials. in our experiments we found similar results for female and other. Figure 6 Cost vs utility function comparisons of Actual minority candidates . SWAP Random and Uniform Our process hopes to mitigate this bias by providing a completely. structured process informed by the many studies showing that. structured interviewing reduces bias see Section 2 As we showed. in our experiments one can take additional steps to encourage. Table 2 and Figure 5 show experimental results on the test set diversity by using w div to select a more diverse team which can. most recent year of real admissions data We report w top instead result in a less biased more productive work environment 14 . of w top to align units across objective functions Because the square Furthermore by including a diversity measure in the objective. root function is monotonic this conversion does not impact the function candidates from disadvantaged groups are given a higher. maximum utility cohort Since SWAP uses a diversity oracle 6 1 chance of being pulled through the cracks since we prioritize rec . we notice a slight drop in top K utility However there is a large ommending diverse candidates for additional resource allocation . gain in diversity A practical benefit to SWAP is that it avoids spending unneces . SWAP on average used 1 17 pulls per arm of which 5 were sary resources on outlier candidates and quickly finds uncertain. strong During the last admissions decision process each applicant candidates This give us more information about the applicant pool. was reviewed on average 1 21 times Interviews were not consis as whole allowing us to make better decisions when choosing a. tently documented SWAP performed more strong pulls interviews cohort while using roughly equivalent resources . of applicants than our estimation of interviews by the graduate ad Finally in our simulations of running SWAP during the graduate. missions committee but did fewer weak pulls SWAP spent roughly admissions process we also select a more diverse student cohort at. the same amount of total resources as the committee did with strong low cost to cohort utility . pull cost j 6 and weak pull cost of 1 Given the gains in diversity . this supports SWAP s potential use in practice , We also compare SWAP to both uniform and random pulling 7 2 Limitations. strategies shown in Table 6 The uniform strategy weak pulls each One significant limitation of a large scale system like SWAP is that. arm once and strong pulls each arm once This had a cost approxi it relies on having a utility score for each applicant In our graduate. mately 9 times that of SWAP and resulted in a general utility of 8 3 admissions experiment we assume the true utility of an applicant. and a diversity value of 11 8 The random strategy weak or strong can be modeled by our classifier which is not entirely accurate In. pulls arms randomly Even when spending 10 times the cost of reality the true utility of an applicant is nontrivial to estimate as. running SWAP the random strategy has only a general utility of it is subjective and depends on a wide range of factors Finding. 7 9 and a diversity value of 11 16 SWAP significantly outperforms an applicant s true utility would require following and evaluating. both of these strategies the applicant through the end of the program perhaps even after. they have left the university Even if that were possible being. 7 DISCUSSION able to quantify true utility is nontrivial due to the subjectivity of. Admissions and hiring are extremely important processes that affect success and its qualitative properties This problem is not limited. individuals in very real ways Lack of structure and systematic bias to SWAP it is present in any admissions hiring peer review and. in these processes present in application materials or in resource other processes that attempt to quantify the value of qualitative. Session 2F Agent Societies and Societal Issues 2 AAMAS 2019 May 13 17 2019 Montr al Canada. properties Therefore in these settings there is no choice but to rely The choice of and could be determined via a sensitivity . on proxy values for the true utility such as reviewer scores analysis style study where simulations are run using various set . Similarly even though the cost of a resource j may be inherently tings of and Policymakers can then judge the simulated risks. quantifiable the information gain s is harder to define in such a and rewards to define the parameters . process For example how much more information one gains from Once the hyper parameters have been found simulations can. an interview over a resume review is subjective and by nature be performed to find the optimal zone as discussed in Section 5 1 . more qualitative than quantitative Also the information gain from This will allow the designer to determine the best strong pull policy . expending the same resource may vary over applicants though this Ideally both studies should include a run focused on past de . is slightly mitigated by using structured interviews cisions and one run every time the selection process occurs to. Another limiting factor is that not every admitted applicant will ensure SWAP s parameters align with the experiences and values. matriculate into the program We assume that all applicants will of human decision makers . accept our offer but in reality that is not the case Therefore we. potentially reject applicants that would matriculate as opposed to 8 CONCLUSION. accepting higher quality applicants that will ultimately not In this paper we modeled the allocation of interviewing resources. Finally our graduate admissions experiment simulated strong and subsequent selection of a cohort of applicants as a combina . arm pulls reviewers did not give additional interviews of applicants torial pure exploration CPE problem in the multi armed bandit. during the experiment Although our results are promising SWAP setting We generalized a recent CPE algorithm to the setting where. should be run in conjunction with an actual admissions process to arm pulls can have different costs where a decision maker can. assess its true performance perform strong and weak pulls with the former costing more than. the latter but also resulting in a less noisy signal We presented the. strong weak arm pulls SWAP algorithm and proved theoretical. 7 3 Design Choices upper bounds for a general class of arm pulling strategies in that. Our motivation in designing SWAP and exploring related exten setting We also provided simulation results to test the tightness. sions is to aid hiring and admissions processes that use structured of these bounds We then applied SWAP to a real world problem. interviewing practices and aim to hire a diverse cohort of work with combinatorial structure incorporating diversity into univer . ers As with any algorithm deployed in practice actually running sity admissions On real admissions data from one of the largest. SWAP alongside a hiring process requires adaptation to the specific US based computer science graduate programs we showed that. environment in which it will be used e g batch versus sequential SWAP produces more diverse student cohorts at low cost to student. review as well as estimation of parameters involving correctness quality while spending a budget comparable to that of the current. guarantees e g and or population estimates e g admissions process . In general we recommend that the policymaker or mechanism It would be of both practical and theoretical interest to tighten. designer tasked with setting parameters for SWAP or a SWAP style the upper bounds on convergence for SWAP either for a reduced or. algorithm should conduct a study on past admissions hiring de general set of arm pulling strategies We would also like to extend. cisions This study should include quantitative information e g SWAP to include more than two types of pulls or information gath . how many people applied how many were accepted how many ering strategies We aim to incorporate a more realistic version of. were interviewed how long did interviews take and qualitative diversity and achieve a provably fair multi armed bandit algorithm . information e g how confident was reviewer A after reviewing an as formulated by Joseph et al 16 and Liu et al 21 Additionally . applicant B From this a mechanism designer could determine esti we aim to create a version of SWAP that incorporates applicant ma . mates of population parameters like information gain parameters triculation into the candidate recommending and selection process . s and interview cost parameter j An interesting direction that may be worth pursuing is drawing. To estimate a policymaker could perform a study on past connections between our work the selection of a diverse subset of. reviews and interviews to determine the range of scores for arms arms to recent work in multi winner voting 12 a setting in social. However this method could incorporate various biases that may choice where a subset of alternatives are selected instead of a single. already exist in prior review and scoring processes That consider winner Recent work in that space looks at selecting a diverse. ation should be taken into account but exactly how is situation but good committee of alternatives via social choice methods 4 . specific The introduction of and strict adherence to the structured 6 Similarly drawing connections to diversity in allocation and. interview paradigm is a general method to alleviate some of these matching problems 1 5 19 is also potentially of interest . To estimate the value of s the information gain of a strong 9 ACKNOWLEDGEMENTS. pull one could quantify the difference in confidence level for a Schumann and Dickerson were supported by NSF IIS RI CAREER. particular applicant after performing weak and strong pulls e g Award 1846237 Counts was supported by NSF REU CAAR Com . how confident was reviewer A after reviewing an applicant B how binatorics and Algorithms for Real Problems CNS 1560193 hosted. much more confident was A after interviewing B and so on For at the University of Maryland We thank Google for gift support . j policy makers could use the average relative difference in time and the anonymous reviewers for helpful comments . and possibly monetary resources spent on different information. gathering strategies , Session 2F Agent Societies and Societal Issues 2 AAMAS 2019 May 13 17 2019 Montr al Canada. REFERENCES 19 Jing Wu Lian Nicholas Mattei Renee Noble and Toby Walsh 2018 The Con . 1 Faez Ahmed John P Dickerson and Mark Fuge 2017 Diverse Weighted Bipartite ference Paper Assignment Problem Using Order Weighted Averages to Assign. b Matching In Proceedings of the International Joint Conference on Artificial Indivisible Goods In AAAI Conference on Artificial Intelligence AAAI . Intelligence IJCAI 20 Hui Lin and Jeff Bilmes 2011 A class of submodular functions for document. 2 Richard D Arvey and James E Campion 1982 The employment interview A summarization In ACL HLT 510 520 . summary and review of recent research Personal Psychology 35 2 1982 281 21 Yang Liu Goran Radanovic Christos Dimitrakakis Debmalya Mandal and. 322 David C Parkes 2017 Calibrated Fairness in Bandits In FATML . 3 Azin Ashkan Branislav Kveton Shlomo Berkovsky and Zheng Wen 2015 Opti 22 Andrea Locatelli Maurilio Gutzeit and Alexandra Carpentier 2016 An optimal. mal Greedy Diversity for Recommendation In Proceedings of the International algorithm for the Thresholding Bandit Problem In International Conference on. Joint Conference on Artificial Intelligence IJCAI 1742 1748 Machine Learning ICML . 4 Haris Aziz 2018 A Rule for Committee Selection with Soft Diversity Constraints 23 Thomas Lux Randall Pittman Maya Shende and Anil Shende 2016 Applications. arXiv preprint arXiv 1803 11437 2018 of Supervised Learning Techniques on Undergraduate Admissions Data In CF . 5 Nawal Benabbou Mithun Chakraborty Xuan Vinh Ho Jakub Sliwinski and Yair 24 George L Nemhauser Laurence A Wolsey and Marshall L Fisher 1978 An analysis. Zick 2018 Diversity constraints in public housing allocation In International of approximations for maximizing submodular set functions I Mathematical. Conference on Autonomous Agents and Multi Agent Systems AAMAS 973 981 Programming 14 1 1978 265 294 . 6 Robert Bredereck Piotr Faliszewski Ayumi Igarashi Martin Lackner and Pi 25 Richard A Posthuma Frederick P Morgeson and Michael A Campion 2002 . otr Skowron 2018 Multiwinner elections with diversity constraints In AAAI Beyond employment interview validity A comprehensive narrative review of. Conference on Artificial Intelligence AAAI recent research and trends over time Personal Psychology 55 1 2002 1 81 . 7 S bastien Bubeck Nicolo Cesa Bianchi et al 2012 Regret analysis of stochastic 26 Lijing Qin and Xiaoyan Zhu 2013 Promoting diversity in recommendation. and nonstochastic multi armed bandit problems Foundations and Trends in by entropy regularizer In Proceedings of the International Joint Conference on. Machine Learning 5 1 2012 1 122 Artificial Intelligence IJCAI . 8 Wei Cao Jian Li Yufei Tao and Zhize Li 2015 On top k selection in multi armed 27 Filip Radlinski Robert Kleinberg and Thorsten Joachims 2008 Learning di . bandits and hidden bipartite graphs In Proceedings of the Annual Conference on verse rankings with multi armed bandits In International Conference on Machine. Neural Information Processing Systems NIPS 1036 1044 Learning ICML 784 791 . 28 Toni Schmader Jessica Whitehead and Vicki H Wysocki 2007 A Linguistic. 9 Shouyuan Chen Tian Lin Irwin King Michael R Lyu and Wei Chen 2014 . Comparison of Letters of Recommendation for Male and Female Chemistry and. Combinatorial pure exploration of multi armed bandits In Proceedings of the. Biochemistry Job Applicants Sex Roles 57 7 8 2007 509 514 . Annual Conference on Neural Information Processing Systems NIPS 379 387 . 29 Neal Schmitt 1976 Social and situational determinants of interview decisions . 10 Pierre Desrochers 2001 Local diversity human creativity and technological. Implications for the employment interview Personal Psychology 29 1 1976 . innovation Growth and Change 32 3 2001 369 394 , 11 Wenkui Ding Tao Qin Xu Dong Zhang and Tie Yan Liu 2013 Multi Armed Ban . 30 Chaofeng Sha Xiaowei Wu and Junyu Niu 2016 A Framework for Recom . dit with Budget Constraint and Variable Costs In AAAI Conference on Artificial. mending Relevant and Diverse Items In Proceedings of the International Joint. Intelligence AAAI , Conference on Artificial Intelligence IJCAI 3868 3874 . 12 Piotr Faliszewski Piotr Skowron Arkadii Slinko and Nimrod Talmon 2017 Multi . 31 Adish Singla Eric Horvitz Pushmeet Kohli and Andreas Krause 2015 Learning. winner voting A new challenge for social choice theory Trends in Computational. to Hire Teams In HCOMP , Social Choice 74 2017 , 32 Adish Singla Sebastian Tschiatschek and Andreas Krause 2016 Noisy Submod . 13 Michael M Harris 1989 Reconsidering the employment interview A review of. ular Maximization via Adaptive Sampling with Applications to Crowdsourced. recent literature and suggestions for future research Personal Psychology 42 4. Image Collection Summarization In AAAI Conference on Artificial Intelligence. 1989 691 726 , 14 Vivian Hunt Dennis Layton and Sara Prince 2015 Diversity matters McKinsey. 33 Austin Waters and Risto Miikkulainen 2013 GRADE Machine Learning Support. Company 2015 , for Graduate Admissions In AAAI Conference on Artificial Intelligence AAAI . 15 Shweta Jain Sujit Gujar Onno Zoeter and Y Narahari 2014 A Quality Assur . 1479 1486 , ing Multi armed Bandit Crowdsourcing Mechanism with Incentive Compatible. 34 Laura Gollub Williamson James E Campion Stanley B Malos Mark V Roehling . Learning In International Conference on Autonomous Agents and Multi Agent. and Michael A Campion 1997 Employment interview on trial Linking interview. Systems AAMAS , structure with litigation outcomes Journal of Applied Psychology 82 6 1997 . 16 Matthew Joseph Michael Kearns Jamie H Morgenstern and Aaron Roth 2016 . Fairness in Learning Classic and Contextual Bandits In Proceedings of the Annual. 35 Yingce Xia Tao Qin Weidong Ma Nenghai Yu and Tie Yan Liu 2016 Budgeted. Conference on Neural Information Processing Systems NIPS 325 333 . multi armed bandits with multiple plays In Proceedings of the International Joint. 17 Kwang Sung Jun Kevin Jamieson Robert Nowak and Xiaojin Zhu 2016 Top. Conference on Artificial Intelligence IJCAI , Arm Identification in Multi Armed Bandits with Batch Arm Pulls In AISTATS . 36 Yisong Yue and Carlos Guestrin 2011 Linear submodular bandits and their. 18 Julia Levashina Christopher J Hartwell Frederick P Morgeson and Michael A. application to diversified retrieval In Proceedings of the Annual Conference on. Campion 2014 The structured employment interview Narrative and quantitative. Neural Information Processing Systems NIPS 2483 2491 . review of the research literature Personnel Psychology 67 1 2014 241 293 .

Ljepota Najnovije vijesti iz kozmeti?ke industrije, novosti iz doma?ih i inostranih kozmeti?kih prodavnica i salona, fenomeni ljepote, brzi savjeti koji

Thank you for purchasing the Edelbrock 5.4L Ford / Lincoln Supercharger System for the F-150, Expedition, and Navigator. The Edelbrock E-Force Supercharger System for the 2009-2010 5.4L 3V F-150 and 2007-2011 3V Expedition/Navigator

discourse analysis of Donald Trump's negative evaluation of women. It sheds light on his sexist ideology to negatively represent and underestimate women. It aims to investigate the structural, lexical, and rhetorical strategies that are utilized for this purpose. For this end, the researcher will analyze some of Trump's opinions

Vector Field Analysis and Visualization through Variational Clustering Alexander McKenzie1, Santiago V. Lombeyda2 and Mathieu Desbrun2 1 University College London 2 California Institute of Technology Abstract Scienti?c computing is an increasingly crucial component o f research in various disciplines. Despite its potential,

Atlhough there is an increasing interest in using eel ctrci propulsion for AHTS vessels, most anchor handlers are today built with conventional diesel-mechanic

ahts and ahts for sale and charter ahts (for sale) built 1980, samsung, korea loa 64.66m, breadth 13.8m depth 6.90m draft 6.01m (loaded) bollard pull 87t

Self Directed Learning & Resource Book Registered Nurses and ... Certification is a second level competency. A ... Only nurses with a Portacath competency may ...

Touchdown: The Development of Propulsion Controlled Aircraft at NASA Dryden At 30,000 feet altitude flying to St. Louis on a business trip, Bill Burcham, then Chief Propulsion Engineer at NASNs Dryden Flight Research Center, had an idea that would change his life. It was a late summer day in 1989. Burcham pushed aside his well-thumbed copy of

must fill out and present a Public Comment Card to the Assistant prior to speaking. Comments will be limited to three (3) minutes per speaker. The President has the discretion to reduce this time limit based upon the number of speakers. The President may limit the total time for all public comments to twenty (20) minutes.