I recently imported a bunch of email into Outlook for OS X and was disappointed that I was unable to search its contents. Outlook uses Apple's native spotlight search so, in theory, all I need to do is wait for spotlight to churn through the new material and I should be done. Hours passed and nothing seemed to happen.
The first thing I tried was to simply force a re-index of my hard-disk. I just added my main drive to the list of places to exclude from spotlight searching and then after waiting a minute or two (for superstitious reasons) I removed that item again and sat back waiting for the inevitable slowdown as mdworker launches into life and starts scanning all my data.
Nothing.
The next step I took was to try to figure out how to see if Spotlight was actually doing any indexing at all. There's no simple control panel or dashboard view of the indexing process. The only way I could find was to press command-space and search for something. It should show the indexing progress-bar if it is indexing (but if it is complete you'll see nothing). I was still getting nowhere and now I'd lost the ability to search for anything.
Check the Console
In these situations it is always worth checking the console. I don't mean the termainal, just the console utility app that spools system messages onto your screen and allows you to see what is happening on your Mac. There's a handy search box (which doesn't use Spotlight!) at the top which filters the current day's logs. Putting just 'md' in that box was enough to filter all the other stuff enabling me to see a constant stream of output from Spotlight's indexing application: mdworker.
Here's a sample:
Jan 12 23:06:45 LernaBookPro com.apple.xpc.launchd[1] (com.apple.mdworker.bundles[24329]): Could not find uid associated with service: 0: Undefined error: 0 1422 Jan 12 23:06:45 LernaBookPro com.apple.xpc.launchd[1] (com.apple.mdworker.bundles): Service only ran for 0 seconds. Pushing respawn out by 10 seconds.
You don't need to be an expert to see that this is some type of unexpected condition and the second line tells me that the resulting process exited out straight away. True to its word, every 10 seconds I got a pair of lines like this in my console log. Interestingly, the problem had been going on for some time, when I searched back through my logs I realised that Spotlight had probably not been indexing properly for ages. I have a feeling that things like new mail that arrives in your inbox gets indexed through some other mechanism and this can mask the fact that the long running indexer has not made a proper index of your hard disk. But that's just a hunch.
Check the web
Armed with this information I was able to find a thread on the internet that clearly dealt with the problem I was having: [...] com.apple.mdworker.bundles pollute logs with errors. Clearly the person who posted this thread didn't think (or perhaps realise) that Spotlight had actually failed and was chasing up unexpected slowness.
Interestingly, from this thread it is clear that the last number in the log entry is a numeric user id. In the case of the poster this was 502 which is typical of the range Apple uses for real users you add to your machine from the control panel. I think they start at 501. If you delete a user from your machine but leave lots of data lying around that is owned by that user then Yosemite seems to be having trouble and it is killing the Spotlight indexer.
The user id causing trouble for me is 1422, which is outside of this range so although the remedy might be similar the origin of my problem is different. I put 1422 into my search and found this thread: 27 inch iMac suddenly running very slowly where the person erases their disk and re-installs (yeah, that fixed it!).
Now use the Terminal
With the clues in the first thread it seems like I need to find some files on my disk owned by user id 1422. The terminal can do that using the Unix find command:
cd / sudo find . -user 1422
This type of thing takes ages, especially if you have a large backup drive.
The Culprit
Turns out that the files owned by user 1422 are all part of the HP printer drivers. These may have been inherited from a previous installation, I'm not that sure, but either way there was information in /Library and inside the Application bundles themselves from HP that were identified this way. I had to use chown -R to change those to root ownership instead (I'm not sure what they are supposed to be).
You'll also find a few files in /private/var/ with paths similar to:
/private/var/folders/zz/zyxvpxvq6csfxvn_n0000000000000/0/com.apple.Spotlight/1422
The exact names will be different but the critical thing is the end part, which is a directory created just for spotlight. It is named after the missing user and it has its ownership set to this non-existent user. It is actually these latter files which are causing the problem, Spotlight is trying to create an index for that user but is surprised when it finds the user doesn't exist. Just removing this directory isn't enough though because Spotlight will re-index your disk and as soon as it finds a file owned by 1422 again it will create this folder in /private/var and grind to a halt again. You must remove or re-own everything that spotlight might see: a real hassle if you have a large backup drive because of the way Time Machine works. I've solved that problem by just excluding my backup drive from Spotlight.
FWIW, unix systems are usually very tolerant of non-existent user ids. Many archive programs will restore files from other machines and systems and, if run as root, will update their ownership to match the original ownership before the files were archived. On networks of Unix workstations that share a user directory this is useful because you can tar up files on one machine and transfer them to another and all the ownership information comes across too. On a personal machine this is less useful and perhaps even dangerous, hence the '--insecure' option on tar.
I considered removing and reinstalling the HP software but I'm not convinced that it isn't a problem with the installer itself. It works fine on the machine I upgraded from 10.8, through 10.9 to 10.10 (after these fixes) but I noticed that on a different machine that came with Mavericks and was upgraded to Yosemite I had to re-install the driver even though I used the migration assistant to set it up from its predecessor (running 10.8) which did have the HP drivers installed. I struggled to find the right download for my HP Officejet 6310 and perhaps now I know why!
Restart
I couldn't find any way of telling mdworker to give up on user id 1422. Removing it's special directory didn't seem to help so I assume that somewhere a process has a cache of that information and the only way I could figure to get it going again was a restart.
Conclusion
If you are using Yosemite and have an HP printer or scanner check your console just in case Spotlight has died for you too. Do battle with the terminal. Restart. Enjoy Spotlight indexing and smoother performance from your Mac.