Below is the full transcript of the interview with Marissa Mayer on personalization of search results. For commentary, see the Just Behave column on Search Engine Land.
Gord: It's been a little more than two weeks since Google announced that personalization would become more of a default standard for more users on Google. Why did you move towards making that call?
Marissa: We’ve had a very impressive suite of personalized products for a while now: personalized homepage, search history, the personalized webpage. But we haven't had them integrated, which I think has made it somewhat confusing for users. A lot of people didn't know if they had signed up for search history or personalized search, or whether or not it was on. What we really wanted to do was move to a signed-in version of Google and a signed-out version of Google. So if you're signed in, you have access to the personalized home page, the personalized search results and search history. You know all three of those are working for you when you're signed in. And if you're signed out, meaning that you don’t see an email address in the upper right-hand corner, personalized search isn't turned on. If anything, it’s a cleaning up of the user model, to make it clearer to users which services they’re using and when they're using them.
Gord: But some of the criticism actually runs counter to that. One of the criticisms is that it used to be clearer, as far as the user went, when you were signed in and when you were signed out. There were more indicators on the Google results page of whether you were getting personalized results or not. Some of those seem to have disappeared, so personalized results have become more of a default now, rather than an option that's available to the user.
Marissa: If you think about it as default-on when you're signed in, I think that it's still as clear on the search results page. We removed the “turn off the personalized search results” link, but you still see very clearly up in the upper right-hand corner whether or not you're signed in. Your e-mail address appears, and that's your clue that Google has personalized your results; that's why the e-mail address is there. I do think, based on our user studies and our own usage at Google, that we've made the model clearer. We actually ended up at a stage with our personalized product earlier this year where, at one point, Eric (Schmidt) asked, "Am I using personalized search?" And the team’s answer as to whether or not he was currently using it was so complicated that even he couldn't follow it. You’d have to go to “my account”, see whether or not you were signed up for personalized search, make sure that your toggle hadn't been turned off or on, and there was no way to just glance at the search results page and easily tell whether or not it was invoked. So now it's very easy: if you see your username and e-mail address up in the upper right-hand corner, you're getting personalized results, and if you don't, you're not. So effectively there are two parallel universes of Google, per se. One if you're signed out, where you see the classic homepage and the classic search results, and one where you're signed in, where you get the personalized home page and…you’ll be able to toggle back and forth, of course…and then the personalized search results page, and the search history becomes coupled with all that because that's how we personalize your search.
Gord: So, to sum up, it's fair to say that really the search experience hasn't changed that dramatically, it's just cleaning up the user experience about whether you're signed in or signed out and that's been the primary change.
Marissa: That's right. Before you could be signed in and be using one of the three products or two of the three products but not all and, of course, because people like to experiment with a new product, they forget whether they signed up for personalized search. Had they signed up for search history? This just makes it cleaner. If you're signed in you’re using and/or have access to all three, if you're signed out, you're on the anonymous version of Google that doesn't have personalization.
Gord: We can say that it cleans up the user experience because it makes it easier to know when you're signed in or signed out, but having done the eye tracking studies, we know that the e-mail address shows in a location that's not prominently scanned as part of the page. Do the changes mean that more people are going to be looking at personalized search results, just because we've made that more of a default opt-in and we've moved the signals that you're signed in a little bit out of the scanned area of the page? Once people fixate on their task, they are looking further down the page. This should mean that a lot more people are looking at personalized search results than previously.
Marissa: Actually, I don't think it will change the volume of personalized search all that much, not based on what we've seen in our logs and usage. It makes it cleaner to understand whether or not you're using it, and I do think that over time, what it does is push the envelope of search more, such that you expect personalized results by default. And we think that the search engines of the future will become better for a lot of different reasons, but one of the reasons will be that we understand the user better. And so when we think about how we can advance towards that search engine of the future that we’re building, part of that will be personalization. I do think that when we look five years out, 10 years out, users will have an expectation of better results. One of the reasons that they have that expectation is that search engines will have become more personalized. I think that in the future, working with a search engine that understands something about you will become the expectation. But you're right in that we believe that, over time, users who are signed in and who find value in the personalized search results will know that they are signed in, that their search history is being kept track of, and that their search results are being personalized. They won't need to check on every single search task whether or not they are signed in, because that's their expectation and they're expecting personalized results. So I do think we won't see a drastic increase in the volume of personalized search use right now, but that it will hopefully change the user's disposition over time to become more comfortable that personalization is a benefit for them and something they come to expect.
Gord: There are a number of aspects of that question that I'd like to get into, and leave behind the question of whether you're signed in or signed out of personalized search, but I have one question before we move on. We've been talking a lot about existing users. The other change was where people were creating a new Google account and they got personalized search and search history by default. The opt-out box is tucked into an area where most users would go right past it. The placement of that opt-out box seems to indicate that Google would much rather have people opting into personalized search.
Marissa: I think that falls in with the philosophy that I just outlined. We believe that the search engines of the future will be personalized and that it will offer users better results. And the way for us to get that benefit to our users is to try and have as many users signed up for personalized search as possible. And so certainly we’re offering it to all of our users, and we’re going to be reasonably aggressive about getting them to try it out. Of course, we try to make sure they're well-educated about how to turn it off if that's what they prefer to do.
Gord: When this announcement came out I saw it as a pretty significant announcement for Google because it lays the foundation for the future. I would think from Google's perspective the challenge would be knowing what personalized search could be 5 to 10 years down the road, what it would mean for the user experience and how do you start adding that incrementally to the user experience in the meantime? From Google's side, you have invested in algorithmic work to categorize content online. I would think the challenge would be just as significant to introduce the technology required to disambiguate intent and get to know more about users. You're not going to hit that out of the park on the first pitch. That’s going to be a continuing trial and error process. How do you maintain a fairly consistent user experience as you start to introduce personalization without negatively impacting that user experience?
Marissa: I will say that there are a lot of challenges there and a lot of this is something that's going to be a pragmatic evolution for us. You have to know that this is not a new development for us. We've been working on personalized search now for almost 4 years. It goes back to the Kaltix acquisition. So we've been working on it for a while and our standards are really high. We only want to offer personalized search if it offers a huge amount of end user benefit. So we’re very comfortable and confident in the relevance seen from those technologies in order to offer them at all, let alone have them veered more towards the results, as we’re doing today. We acquired a very talented team in March of 2003 from Kaltix. It was a group of three students from Stanford doing their Ph.D.s, headed up by a guy named Sep Kamvar, who is the fellow who cosigned the post with me to the blog. Sep and his team did a lot of PageRank-style work at Stanford. Interestingly enough, one of the papers they produced was on how to compute PageRank faster. That paper caused a huge media stir around the web, because everyone said there were these students at Stanford who had created an even faster version of Google. The press obviously doesn't understand search engines and thinks that we actually do the PageRank calculation on the fly on each query, as opposed to pre-computing it. Their advance was actually significant, but not because it helps you prepare an index faster, which is what the press thought was significant. The reason they were interested in building a faster version of PageRank was that they wanted to be able to build a PageRank for each user. So, based on seed data about which pages were important to you and which pages you seemed to visit often, they would re-compute PageRank values. PageRank as an algorithm is very sensitive to the seed pages.
And so, what they had done was figure out a way to sort by host and, as a result of sorting by host, compute PageRank in a much more computationally efficient way, making it feasible to compute a PageRank per user, or a vector of values that are different from the base PageRank. The reason we were really interested in them was: one, because they really grasped and grokked all of Google's technology really easily; and, two, because we really felt they were on the cutting edge of how personalization would be done on the web, and they were capable of looking at things like a searcher’s history and their past clicks, their past searches, the websites that matter to them, and ultimately building a vector of PageRank that can be used to enhance the search results.
We acquired them in 2003 and we've worked for some time since to outfit our production system to be capable of doing that computation and holding a vector for each user in parallel to the base computation. We've been very responsible in the way that we've personalized Search Labs and we also did what we called Site Flavored Search on Labs where you can put a search box on your page and that is geared towards a page of interests that you’ve selected. So if you have a site about baseball you can say you want to base it on these three of your favorite baseball sites and have a search box that has a PageRank that’s veered in that direction for baseball queries.
So, the Kaltix team has been really successful at integrating all these Google technologies and taking this piece of theoretical research and ultimately bringing it to life on the Web. And as it has grown stronger and stronger and our confidence in the Kaltix technology has grown, we've been putting it forward more and more. We started off on Labs through a sign-up process, then we transitioned it over to Google.com, and now we are in effect leaning towards a model where people who use Google.com and have a Google account get personalized search basically by default. If you look at the historical reviews of the Kaltix work, it's gotten pretty rave reviews. The users that have noticed it and have been using it for a long time, like Danny (Sullivan), will say that they think it's one of the biggest advances in relevance that they've seen in the past three years.
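Editor's note: the per-user PageRank described in this answer can be illustrated with the textbook "personalized PageRank" formulation, in which the random surfer teleports back to a user's seed pages instead of to a uniformly random page. The sketch below is only that textbook version, not Google's or Kaltix's implementation; the host-sorting speed-up mentioned above is not shown, and the function name, the toy link graph, and the damping value of 0.85 are all assumptions made for illustration.

```python
def personalized_pagerank(links, seeds, damping=0.85, iterations=50):
    """Power-iteration PageRank whose teleport step is biased towards seed pages.

    links: dict mapping a page to the list of pages it links to (hypothetical data).
    seeds: dict mapping a page to a non-negative weight, standing in for
           "pages that matter to this user".
    """
    pages = sorted(set(links) | {q for outs in links.values() for q in outs} | set(seeds))
    total = sum(seeds.values())
    teleport = {p: seeds.get(p, 0.0) / total for p in pages}

    rank = dict(teleport)  # start from the seed distribution
    for _ in range(iterations):
        new_rank = {p: (1 - damping) * teleport[p] for p in pages}
        for p in pages:
            outs = links.get(p, [])
            if outs:
                share = damping * rank[p] / len(outs)
                for q in outs:
                    new_rank[q] += share
            else:
                # Dangling page: hand its rank back to the teleport distribution.
                for q in pages:
                    new_rank[q] += damping * rank[p] * teleport[q]
        rank = new_rank
    return rank


# Toy link graph (invented for this example).
links = {
    "news": ["baseball", "cooking"],
    "baseball": ["stats"],
    "cooking": ["recipes"],
    "stats": ["baseball"],
    "recipes": ["cooking"],
}

# Base ranking: every page is an equally likely teleport target.
base = personalized_pagerank(links, {p: 1.0 for p in links})
# One user's ranking: teleports always land on the page this user cares about.
fan = personalized_pagerank(links, {"baseball": 1.0})

for page in sorted(links):
    print(f"{page:10s} base={base[page]:.3f} fan={fan[page]:.3f}")
```

Changing only the seed weights shifts the whole ranking towards the baseball pages, which is the sensitivity to seed pages mentioned above. Seeding with a handful of chosen sites rather than a user's history would give a rough analogue of the Site Flavored Search idea.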
Gord: So when you have the Kaltix technology working over and above the base algorithm, obviously that's going to be as good as the signals you’re picking up on the individual. And right now the signals are past sites they visited, perhaps what they put on their personalized homepage and sites that they've bookmarked. But obviously the data that you can include to help create that on-the-fly, individual index improves as you get more signals to watch. In our previous interview you said one thing that was really interesting to you was looking at the context of the task you are engaged in, for example, if you're composing an e-mail in Gmail. So is contextual relevance another factor to look at? Are those things that could potentially be rolled into this in the future?
Marissa: I think so. I think that overall, we really feel that personalized search is something that holds a lot of promise, and we're not exactly sure of the signals that will yield the best results. We know that search history, your clicks and your searches together provide a really rich set of signals but it's possible that some of the other data that Google gathers could also be useful. It's a matter of understanding how. There's an interesting trade off around personalized search for the user which is, as you point out, the more signals that you have and the more data you have about the user, the better it gets. It's a hard sell sometimes, we’re asking them to sign up for a service where we begin to collect data in the form of search history yet they don’t see the benefits of that, at least in its fullest form, for some time. It's one of those things that we think about and struggle with. And that’s one reason why we're trying to enter a model where search history and personalized search are, in fact, more expected. And I should also note that as we look at reading some of the signals across different services we will obviously abide by the posted privacy policies. So there are certain services where we’ve made it very clear we won't cross correlate data. For example on Gmail, we've made it very clear that we won't cross correlate that data with searches without being very, very explicit with the end user. You don't have to worry about things like that.
Gord: One of the points of concern seems to be how smart that algorithm will get, and whether we lose control. For example, when we're exploring new territory online and we’re trying to find answers, we refine our results based on our search experience. So, at the beginning, we use very generic terms that cast a very wide net and then we narrow our search queries as we go. Somebody said to me, “Well, if we become better searchers, does that decrease the need for personalization?” Do we lose some control in that? Do we lose the ability to say "No, I want to see everything, and I will decide how I narrow or filter that query. I don't want Google filtering that query on the front end"?
Marissa: I think it really depends on how forcefully we’re putting forth personalization. And right now we might be very forceful in getting people to sign up for it, or at least more forceful than we were. The actual implementation of personalized search is that as many as two pieces of content that are personalized to you could be lifted onto the first page, and I believe they never displace the first result in our current instantiation, because that's a level of relevance that we feel comfortable with. So right now, at least eight of the results on your first page will be generic, vanilla Google results for that query and only up to two of them will be results from the personalized algorithm. We’re introducing it in a fairly limited form for exactly the reason that you point out. And I think if we tend to veer towards a model where there are more results that are personalized, we would have ways of making it clearer: “Do you want to explore this topic as a novice or with the personalization in place?” So the user will be able to toggle in a different filter form. I think the other thing to remember is, even when personalization happens and lifts those two results onto the page, for most users it happens one out of every five times. When you think about it, 20% of the queries are much better by doing that, but for 80% of the queries, people are, in fact, exploring topics that are unknown to them, and we can tell from their search history that they haven't searched for anything in this sphere before. There’s no other search like it. They've never clicked on any results that are related to this topic, and, as a result, we actually don't change their query set at all because we know that they need the basic Google results. The search history is valuable not only because it can help personalize the results but also because we can tell when not to.
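Editor's note: the blending rule described in this answer (at most two personalized results lifted onto a ten-result first page, the top generic result never displaced, and no change at all when the user's history shows nothing related to the query) can be sketched as below. This is a hypothetical illustration, not Google's code: the function name, the boolean `has_related_history` signal, and the choice to place lifted results directly after the first result are all assumptions, since the interview does not say where lifted results appear.

```python
def blend_first_page(generic, personalized, has_related_history,
                     max_lifted=2, page_size=10):
    """Return a first page mixing generic and personalized results.

    generic: generic results, best first.
    personalized: results from a (hypothetical) per-user ranking, best first.
    has_related_history: assumed signal for "this user has searched or clicked
        on this topic before"; when False the generic page is left untouched.
    """
    page = generic[:page_size]
    if not has_related_history or not page:
        return page

    # Only results that are not already on the page can be "lifted" onto it.
    lifted = [r for r in personalized if r not in page][:max_lifted]
    if not lifted:
        return page

    # Keep the top generic result in place, make room by dropping the tail.
    kept = page[:page_size - len(lifted)]
    return kept[:1] + lifted + kept[1:]


generic = [f"g{i}" for i in range(1, 11)]
personalized = ["p1", "g3", "p2", "p3"]

blended = blend_first_page(generic, personalized, has_related_history=True)
untouched = blend_first_page(generic, personalized, has_related_history=False)

print(blended)
print(untouched)
```

With these inputs the blended page keeps the first generic result on top and still contains eight generic results, matching the "at least eight" figure quoted above.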
Gord: There's two parts to that: one is the intelligence of the algorithm to know when to push personalization and when not to push personalization, and two, as you said, right now this is only impacting one out of five searches where you may have a couple of new results being introduced into the top 10 as a result of personalization. But that's got to be a moving target. As you become more confident in the technology and that it's adding to the user experience, personalization will creep higher and higher up the fold and increasingly take over more of the search results page, right?
Marissa: Possibly. I think that's one of many things that could possibly happen, and I think that's a pretty aggressive stance. I look at our evolution and our foray into personalization, where we're sitting here three or four years in, with some base technology that's several years old already, and it still has been very slight in the way that we have it interact with the user experience, mostly because we think that base Google is pretty good. As it becomes more aggressive, certainly I would be pushing for the user to be able to understand that these results are, in fact, coming from personalization and not the background, and if they want to filter them out and get back to basics, that would be possible. One thing that we've struggled with is whether we should actually mark the results that are entering the page as a result of personalization, but because the team is currently and frequently doing experiments, we didn't want to settle on a particular model or marker at this exact moment.
Gord: The challenge there is as you roll more personal results into the results page and get feedback from some users that they would want more control over what on the page is personalized and the degree of personalization and introduce more filters or more sophisticated toggles, it complicates the user experience. And as we know, that user experience needs to be very simple. Is it a delicate balance of how much control you give the user versus how much do you impact the 95% of the searches that are just a few seconds in duration and have to be really simple to do?
Marissa: There are two thoughts there. One, even if we introduce them to filtering on the results page, it wouldn't be any more complicated than what you had two weeks ago, so we already have that filter. Two, we put the user first, and people have varying opinions about whether their search results page is too complicated, but the same people who designed that user experience will be the people who will be tackling this for Google, so I think you can expect results of a similar style and direction.
Gord: In the last few weeks, Google has introduced some new functionality, related searches and refine search suggestions, that are appearing at the bottom of the page for a number of searches. To me that would seem to be a prime area that could be impacted by personalization opportunities that are coming. As you make suggestions about other queries that you could be using, using that personalization data to refine those. Is that something you’re considering? And how long before personalization starts impacting the ads that are being presented to you on a search results page?
Marissa: Refinement is an interesting but neophyte technology from our perspective. We are finally now just beginning to develop some refining technologies that we believe in enough to use on the search results page. A lot of people have been doing it for a lot longer. When you look at the overall utility, probably 1 to 5% of people will click those query refinements on any given search, whereas most users, probably more than two thirds of users, end up using one of our results. So in terms of utility and value that is delivered to the end user, the search results themselves and personalizing those are an order of magnitude more impactful than personalizing a query refinement. So part of it is that it’s such a new technology that we really haven't looked at how personalization could make it work more effectively. But the other thing is that on a “bang for the buck” basis, personalizing those search results gets us a lot more.
And as to ads, I think there are some easy ways to personalize ads that we've known for some time, but we've chosen at this point to focus on personalizing the search results because we wanted to make sure to deliver the end-user value on that, because that's our focus, before we look at personalizing ads.
Gord: So, no immediate plans for the personalization of ads?
Marissa: That's right.
Gord: Thank you so much for your time, Marissa.