Mentoring Measurement Musings
APRIL 10, 2017
BY: HEATHER TAUSSIG, PH.D., NMRC RESEARCH BOARD MEMBER
Editor's Note: Several members of the NMRC Research Board participated in the 2017 National Mentoring Summit this past February, leading a research track that featured OJJDP-funded research and totaled 13 workshops across the multi-day event. We asked several Research Board members to share their key insights from the event based on a workshop they led, an innovation they learned about, or a conversation they had with an attendee that made them think about the mentoring field in a new light. We will run several of these stories over the months of March and April in the NMRC blog to bring the Summit to life for those who could not attend.
As I left the Research Board meeting of the National Mentoring Resource Center at the 2017 Summit, my head was spinning with one of the Board’s charges: to identify the most important outcome measures to add to the “Measurement Guidance Toolkit for Mentoring Programs.” The measurement toolkit is an invaluable resource for everyone involved in any type of mentoring evaluation. And I urge you: don’t just go to the site to quickly find a tool to measure depression or parent-child relationship quality. You should read, and re-read, the “Evaluation Guidance and Resources” section, which helps programs think through all the important elements involved in selecting and administering measures.
Although I am both a clinician and a researcher, as well as a mentoring program developer and evaluator, I still struggle to make decisions about measurement in our own work. Yes, we have a logic model (or a few dozen created over the years for different funding mechanisms). Yes, we measure fidelity to ensure that our program is delivered with the intended rigor. Yes, we ensure that there are high rates of program engagement and completion (ours is a time-limited program). And yes, we evaluate children’s functioning both pre- and post-program using a variety of measures and informants (children, parents, mentors, and system records). Despite 15 years of doing this work, however, I could not easily identify what measures should be added to the Toolkit. “It depends,” I kept thinking.
It depends on: (1) which domain is most important, in terms of impact, for each unique program and the children it serves; (2) how and when we conduct our post-intervention assessments; and (3) what the funder requires.
1. So, first, what do you want to impact?
What do you hope all the harrowing data collection efforts will yield? What did you put in that logic model? As you all know, the beauty of mentoring is that it can be so individualized, so why would we expect that across the board, youth would improve in their academic functioning or cultural identity? When I designed the measurement strategy for our mentoring program, Fostering Healthy Futures, I put in measures for every outcome I could think of. Why? Well, it was a pilot study and I thought that we might be able to improve everything from psychotropic medication use to school attachment. Of course, these were not the ultimate outcomes, not the things that go in the far-right box of the logic model. For those more distal (i.e., longer-term) outcomes, we would need to wait longer. We were enrolling 9- to 11-year-old children in foster care, so if we wanted to examine teen pregnancy or high school graduation, we needed to follow the young people much longer. To that end, we added measures that were precursors to those more distal outcomes — for example, measures of impulse control, grades, and disciplinary referrals. We didn’t know what we would find, but we have been pleasantly surprised by some of the unexpected outcomes we have achieved.
In the seminal Role of Risk study (Herrera, DuBois, & Grossman, 2013), the researchers examined whether mentored youth improved across a wide set of outcomes, since individual youth might improve in one domain but not another. They found that mentored youth improved in more domains than youth in the comparison group. I think this methodology should be used in more mentoring studies, as it is synergistic with the goals of most mentoring programs: to improve or enhance functioning or well-being. What does that mean for your evaluation? Well, although you probably can’t evaluate everything under the sun, perhaps being more inclusive of different domains and specific outcomes will help you find positive program impacts. Having measures of multiple outcomes will also enable you to follow the methodology employed in the Role of Risk study and examine improvement more globally.
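To make the counting approach concrete, here is a minimal, purely illustrative sketch (all scores are made up, and the function name is my own) of how one might tally, for each youth, the number of domains in which a post-program score exceeded the pre-program score, and then average that count across a group:

```python
# Illustrative sketch of a "count of improved domains" summary,
# in the spirit of the global-improvement analysis described above.
# All data below are hypothetical.

def domains_improved(pre, post):
    """Count domains where the post-program score exceeds the pre score."""
    return sum(1 for before, after in zip(pre, post) if after > before)

# Hypothetical pre/post scores across 4 outcome domains for 3 mentored youth
mentored = [
    ([2, 3, 1, 4], [3, 3, 2, 5]),  # improved in 3 of 4 domains
    ([1, 1, 2, 2], [2, 2, 2, 3]),  # improved in 3 of 4 domains
    ([4, 2, 3, 1], [4, 3, 3, 2]),  # improved in 2 of 4 domains
]

avg = sum(domains_improved(pre, post) for pre, post in mentored) / len(mentored)
print(avg)  # average number of improved domains per mentored youth
```

In an actual evaluation, the same average would be computed for the comparison group and the two group means tested against each other; the point of the sketch is only the per-youth count.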
2. The second “It depends” in outcome measurement relates to how and when we conduct the post-assessments.
What does it even mean to have a “post-assessment” in an ongoing mentoring program? How about if youth started the program at different times or at different ages? What if you don’t have the resources to collect one-year follow-up assessments all year long? In our study, all youth are enrolled at the same time and mentoring ends after 9 months, so one would think it would be simple to decide when to conduct a post-intervention assessment. Not so. Initially, we conducted the post-intervention assessments immediately after the program ended. We hypothesized, however, that we might not see the effects of the program until several months after it ended, especially in the mental health realm. Indeed, we did not find evidence of many positive outcomes immediately post-intervention, but we did 6 months later. Unfortunately, most evidence-based registries require findings one year post-intervention, so we continued to search for funds to evaluate the program’s longer-term effects and are currently interviewing the participants in young adulthood!
Another issue related to the timing of the assessments concerns the mentees’ perceptions of the rationale for the assessment and how the information will be used. In other words, what do they think about being asked these assessment questions? We have always been concerned about assessing mentees during the program for fear that they would think: (1) that they were supposed to be doing “well” or “better” since they were in a mentoring program, and (2) that we would share their assessment information with their mentors.
Therefore, we have been careful to wait until the end of the program to conduct assessments, especially in gathering information about the mentoring relationship. This, of course, affects the interpretation of the information. For example, I just presented data at the Summit related to mentors’ and mentees’ ratings of how much their interests matched. We found that there was no relationship (i.e., no significant correlation) between the mentors’ and mentees’ ratings, but both ratings were associated with the number of mentoring visits they had over the course of the program. Although this finding was consistent with other findings showing that matching on interests produces better outcomes, we were very surprised because we do not match mentors and mentees on interests. After taking a step back and considering the timing of this assessment (which came after the program ended), we realized that the mentee/mentor dyads who had more visits over the course of the program had more time to develop shared interests. Their ratings of how much their interests matched may therefore have been a product of the amount of time they spent together, not of how much their interests matched when we paired them. Regardless, thinking about the timing of the data gathered relative to this chicken-and-egg problem is always important.
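For readers who want to see the shape of this analysis, here is a small, purely illustrative sketch (all ratings and visit counts are invented, and the variable names are my own) of the two correlations described above: dyad agreement on interest match, and each rating against the number of visits:

```python
# Illustrative Pearson correlations for the interest-match example above.
# All data are hypothetical; only the structure of the analysis is the point.
from math import sqrt

def pearson(x, y):
    """Plain-Python Pearson correlation coefficient (no external libraries)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

# Hypothetical data for 5 dyads: interest-match ratings (1-5) and visit counts
mentor_rating = [3, 4, 2, 5, 3]
mentee_rating = [4, 2, 3, 5, 2]
visits        = [20, 24, 12, 30, 15]

print(pearson(mentor_rating, mentee_rating))  # dyad agreement on interest match
print(pearson(mentor_rating, visits))         # mentor rating vs. number of visits
print(pearson(mentee_rating, visits))         # mentee rating vs. number of visits
```

The pattern described in the post would appear as a near-zero first correlation alongside positive correlations with visits; with real data one would also test each coefficient for statistical significance.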
3. The final “It depends” is related to the realities of funding requirements.
Even though I run randomized controlled trials, local foundation funding to provide the program often comes with the stipulation that we identify measurable outcomes that we can report at program end. As I already mentioned above, we don’t anticipate finding changes immediately post-program, but foundation funders are typically not interested in hearing about control/intervention differences, nor in waiting a year post-program to get the findings (and neither are we, especially as we want to apply to them for more funding). They are also not interested in hearing that the mean score on a measure of positive attitudes towards violence was reduced by 0.24 (even if that seems meaningful to us researchers). So, what to do? It is too costly to interview all the children and their parents at program completion (as we conduct paid, in-home interviews), so we typically rely on mentors’ reports of youth functioning to report on indices such as school suspensions, extracurricular activity involvement, and being promoted to the next grade level. While the methods are not the strongest, and we will never publish these data, this measurement strategy allows us to be responsive to the funder by identifying clear targets and measuring them in a straightforward, low-cost way.
A week after returning from the Summit, the Board members were asked to complete a survey about which measures would be the most important to add to the Measurement Toolkit. I had an incredibly difficult time rating them, as I wanted to say “It depends!” for each item. When would these domains be measured in the program? Who would you ask to report on these? What would the interpretation of such an assessment be by the mentees? What will program funders request?
As you use the Measurement Toolkit or think about what to measure in your program, I encourage you to reflect on your own measurement strategies and remember that your answers to the “It depends” factors will help you design a measurement protocol that best meets your needs and constraints, and ultimately demonstrates the success of your program!
Heather Taussig, PhD is a Professor and Associate Dean for Research at the University of Denver’s Graduate School of Social Work, and a member of the NMRC Research Board. Read more about her work here.