An Open Letter to Vanguard's Steve Utkus

Steve Utkus is the head of retirement research at Vanguard. Though we have never met personally, we have been fortunate enough to appear alongside Steve in articles, including this article about BrightScope that appeared in BusinessWeek. His comments in the press are generally wise and insightful, which makes sense given his 25+ years of industry experience.

On Monday of this week, Mr. Utkus wrote about BrightScope on his Vanguard blog. In his post, Mr. Utkus was warm to the idea of ranking the quality of 401k plans, and because he discussed BrightScope exclusively, it is clear that he views BrightScope as the market leader in this new space. However, he did point out one problem he has with the BrightScope Rating system:

All of this seems like a useful exercise until you dig a bit further. BrightScope’s ranking system reveals one problem. The top of the scale is dominated by plans for doctors, money managers, and pilots; the bottom, by plans for retail workers. So at least from this perspective, it seems like the rating is less about your 401(k), and more about your income—whether you’re a pilot, a money manager, or a retail clerk.

The good news is that if this is Mr. Utkus’ only problem with our rating system, then we are in a really great spot: it is just one problem, and it is easily addressed. So how did he decide this was a problem? He bases his entire argument on an assumption that turns out to be factually inaccurate:

Meanwhile, in the marketplace, a few start-up companies are looking to profit from the trend. One is BrightScope, which provides a rating of many 401(k) plans in the U.S. If your employer plan is rated 90 or above, it’s stellar in their estimation. If it’s rated 50 or less, it’s not.

Given that Mr. Utkus is the head of retirement research at one of the largest and most respected providers in the 401k space, I would have expected him to do a little more research before reaching this erroneous conclusion. BrightScope has never said that “90 or above is stellar” and “50 or less” is not. What really matters in our rating system is a plan's rating relative to a relevant peer group. Every single BrightScope Rating chart has the BrightScope Rating on the top of the chart and the lowest, average and highest rated peers in the peer group on the bottom of the chart (see below):

Peer plans are selected based on their industry (as a proxy for demographics) and on their size in assets and participants. By comparing plans to their peer groups, we provide the context needed to determine whether a plan is performing well, which is why some plans that score a 50 are above average in their peer groups and some plans that score a 65 are below average in theirs.
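To make the peer-group idea concrete, here is a minimal sketch in Python. It is not BrightScope's actual algorithm: the plan names, ratings, the `peer_summary` helper and the `size_band` cutoff are all hypothetical, and for simplicity it matches peers on industry and asset size only, leaving out participant counts. It only illustrates how a 50 can sit above its peer average while a 65 sits below its own.

```python
from statistics import mean

# Hypothetical plans: (name, industry, assets in $ millions, overall rating).
# All values are invented for illustration only.
PLANS = [
    ("Airline A", "airlines", 900, 65),
    ("Airline B", "airlines", 1100, 88),
    ("Airline C", "airlines", 1000, 91),
    ("Retailer A", "retail", 950, 50),
    ("Retailer B", "retail", 1050, 42),
    ("Retailer C", "retail", 1000, 46),
]

def peer_summary(target_name, plans, size_band=0.5):
    """Compare one plan's rating with peers in the same industry and size range.

    size_band is an assumed cutoff: a peer's assets must be within this
    fraction of the target plan's assets.
    """
    _, industry, assets, rating = next(p for p in plans if p[0] == target_name)
    peer_ratings = [
        p[3]
        for p in plans
        if p[1] == industry and abs(p[2] - assets) <= size_band * assets
    ]
    average = mean(peer_ratings)
    return {
        "rating": rating,
        "lowest": min(peer_ratings),
        "average": average,
        "highest": max(peer_ratings),
        "above_average": rating > average,
    }

retailer = peer_summary("Retailer A", PLANS)
airline = peer_summary("Airline A", PLANS)
```

In this made-up data, the retail plan rated 50 is above its peer average while the airline plan rated 65 is below its own, so the absolute number by itself does not tell you whether a plan is doing well.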

Secondly, focusing only on the overall BrightScope Rating ignores the second part of our rating system, which is the plan component ratings. While a plan might have a higher score due to high funding (salary deferrals and company contributions), it still might get a bad rating on fees, investment menu quality or participation rate (see example below):

If the plan sponsor (above) has a high BrightScope Rating due in part to high salary deferrals and company contributions, our component ratings make it clear that there is still room for improvement. While the BrightScope Rating looks at things holistically, a high rating might mean that great demographics are disguising other problems with the plan, which is why the component ratings are so valuable. In fact, for most of the plan sponsors and advisors who work with us, the component ratings are a critical piece of our rating system.

I think that, fundamentally, Mr. Utkus has one question about our rating system: why did we design it to show both the relative rating of a plan within its peer group and the absolute differences between plans that aren’t in the same peer group? In other words, why do we let company demographics impact the BrightScope Rating?

There are both philosophical and practical answers to this question:

The philosophical answer is that BrightScope believes that the primary purpose of a retirement plan is to create retirement income security for the highest proportion of a company’s employees. As a result, we believe the best measure of a plan’s success is how well it does at creating retirement income security for its participants. At BrightScope we call this “retirement outcomes” and it pervades everything that we do. When you rate a plan based on how quickly it is getting its participants to retirement, you see ratings differences based on demographic factors as well as on things like fees and investments. Tempting as it is, it is impossible to cleanly separate retirement outcomes from demographics. In addition, by peer grouping after we calculate the ratings, our rating system has the flexibility to allow advisors and plan sponsors to create custom peer groups based on their unique knowledge of their firms and industries. If we had built the peer groups into the rating system, we would be dictating how peer groups are constructed and not respecting the complexity of the marketplace and the unique knowledge and experience advisors and sponsors have accumulated. In fact, one of the parts of our system that advisors and sponsors like the most is the flexibility and customizability of our peer grouping.

The practical answer is that in order for a data point to be an input to our calculations, we must have that data in a uniform way across all plans in our dataset. At present no one has demographic information for every 401k plan in the country, and without consistent information for every single plan it can’t be used as an input to our calculations. The good news is that corporations currently do plenty of benchmarking of their benefits (salaries, bonuses, executive compensation, health plans etc.) even though they lack perfect demographic information on all of their peers. How do they overcome not having perfect demographic information? The answer is simple: they construct peer groups. By using industry and size as a proxy, and even defining a custom group of peers, they are able to benchmark their other benefits. BrightScope is doing nothing different from what corporate executives have been doing for years, which is using peer groups to compare and benchmark their benefits. While 401k plans are complex, so is just about every other benefit they are benchmarking. I would argue that the more complex a benefit is, the more important it is for a plan sponsor to engage in benchmarking, because complex products are more likely to have larger pricing and value discrepancies.

The second issue that Mr. Utkus mentioned is that some plan sponsors offer other retirement plans and that looking at the DC plan alone may not paint the entire picture. This is a reasonable critique and one that several plan sponsors that offer DB plans have brought up in the past. To address this in the short term, we have placed a disclaimer on every plan page highlighting that the rating applies to just the DC plan and not the entire benefits package. The good news is that BrightScope is hoping to increase its coverage of other retirement plans in the near future to supplement our existing work in the DC market. We are excited to launch these new ratings and are confident they will help address this lingering concern. Even without any additional coverage, however, our advisors and plan sponsor clients have gained significant value from the comparisons that are available, and with our guidance most take other retirement plans into account when constructing peer groups.

The third issue that Mr. Utkus mentions is his analysis of a single plan’s investment menu and his determination (without any knowledge of our algorithms) that the plan was rated poorly because of poor recent investment performance. Mr. Utkus makes several assumptions that are factually wrong and are directly addressed in the FAQ section of our website. We are not doing individual fund ratings. So when a plan has a low “investment menu quality” rating, it does not mean that each of the individual funds is “low quality,” which is Mr. Utkus’ assumption. Rather, it could be that the menu was missing a small cap fund, or had only 1 small cap fund that was underperforming. I think even Mr. Utkus would agree that even if the rest of the funds are great, without adequate small cap exposure a plan participant loses a substantial amount of diversification benefit and ends up with a portfolio with a sub-optimal Sharpe Ratio. In this way we are rating the ability of a participant in the plan to build a high quality diversified portfolio. I must also point out that while 5 years of investment underperformance may not mean a fund is “low quality”, I don’t think it is unreasonable to say that a fair share of advisors who work in the DC space and use an investment policy statement and watch list would have removed that fund from the menu sometime during that 5 year period. If you are an advisor who uses a lot of actively managed funds and doesn’t remove funds after 5 years of underperformance, I would love to hear from you so that I can be corrected in my assumptions.
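The diversification point about small cap exposure can be illustrated with a short Python sketch. The return, volatility and correlation figures below are hypothetical numbers chosen only for illustration; they are not BrightScope data and not part of our rating algorithm. The sketch compares the Sharpe Ratio of a large-cap-only portfolio with the best large/small cap mix found by a simple grid search.

```python
import math

# Hypothetical annual figures (excess return over the risk-free rate,
# and volatility). Invented for illustration only.
LARGE_CAP = {"excess_return": 0.06, "volatility": 0.15}
SMALL_CAP = {"excess_return": 0.08, "volatility": 0.20}
CORRELATION = 0.7  # assumed correlation between the two funds

def sharpe(weight_large):
    """Sharpe Ratio of a two-fund portfolio with the given large cap weight."""
    w_l, w_s = weight_large, 1.0 - weight_large
    excess = w_l * LARGE_CAP["excess_return"] + w_s * SMALL_CAP["excess_return"]
    variance = (
        (w_l * LARGE_CAP["volatility"]) ** 2
        + (w_s * SMALL_CAP["volatility"]) ** 2
        + 2 * w_l * w_s * CORRELATION
        * LARGE_CAP["volatility"] * SMALL_CAP["volatility"]
    )
    return excess / math.sqrt(variance)

# A menu with no small cap fund forces a 100% large cap allocation.
large_only = sharpe(1.0)

# With a small cap fund available, search weights from 0% to 100% in 1% steps.
best_weight = max((w / 100 for w in range(101)), key=sharpe)
best_mix = sharpe(best_weight)
```

Under these assumed numbers, the best mix holds some of each fund and has a higher Sharpe Ratio than the large-cap-only portfolio, which is the kind of diversification benefit a participant cannot reach when the menu lacks an asset class.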

Fortunately there is at least one point where Steve and I are in complete agreement:

What it seems we need is not just a ranking system, but an interpreter, or a user guide, providing us with all of the caveats and qualifications for a given ranking.

The BrightScope Rating system and our benchmarking tools are designed to provide not just a rating system, but also the tools necessary to customize the comparisons, understand the differences and complexities, and ultimately make a decision about what changes, if any, should be made to address the information that comes to light. Without taking a look at the tools or really understanding the calculations, it would be impossible for Steve to know that the “user guide” he so desires already exists. But I do think he would be happy to know that we agree completely and that there are many in this industry who are committed to transparency and retirement outcomes who want the same things he does . . . many of those people are advisors and plan sponsors who currently subscribe to our tools.