Moving Immovable Users

•2014/09/17 • Leave a Comment

This is probably the first of a few blog posts regarding a problem we are facing with our Lync 2013 environment. In short, we have 2 corrupt routing groups right now. Users assigned to those routing groups are unable to add a contact to their buddy list and they cannot change their status.

This tip isn’t anything too special and a lot of you may already know this but I’m putting it out there in case someone else runs into this situation.

Our initial thought was to move the users to a different pool which will remove them from one of the bad routing groups. However, we cannot move the users to a different pool. When doing so, we get the errors seen below.

PS C:\Users\flinchbot> Move-CsUser "" -Target poo
[Y] Yes [A] Yes to All [N] No [L] No to All [S] Suspend [?] Help
(default is "Y"):
Move-CsUser : Distributed Component Object Model (DCOM) operation begin move
away failed.
At line:1 char:1
+ Move-CsUser "" -Target
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 + CategoryInfo : InvalidResult: (:) [Move-CsUser], MoveUserExcept
 + FullyQualifiedErrorId : FAILED::MoveRetry,Microsoft.Rtc.Management.AD.Cm
Move-CsUser : Distributed Component Object Model (DCOM) operation
RollbackMoveAway failed "-1007781356".
At line:1 char:1
+ Move-CsUser "" -Target
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 + CategoryInfo : InvalidResult: (:) [Move-CsUser], MoveUserExcept
 + FullyQualifiedErrorId : FAILED::MoveRetry,Microsoft.Rtc.Management.AD.Cm
Move-CsUser : Distributed Component Object Model (DCOM) operation begin move
away failed.
At line:1 char:1
+ Move-CsUser "" -Target
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 + CategoryInfo : InvalidOperation: (CN=Uk\,lre poc\,DC
 =com:OCSADUser) [Move-CsUser], MoveUserException
 + FullyQualifiedErrorId : MoveError,Microsoft.Rtc.Management.AD.Cmdlets.Mo

So that wasn’t going to work. So we decided to try a force-move of the users. In general a force-move is to be avoided as this process will move the user but it will throw away, among other things, any contact list entries.

So we did an Export-CsUserData of the user’s information first:

PS C:\Users\flinchbot> Export-CsUserData -UserFilter "user@flinchbot.c
om" -Poolfqdn -filename "e:\temp\"

We verified that the data was correct by extracting the .zip file and looking at the .xml file. In there we could see the contact list entries that the user already had.

Next we did the force-move.

PS C:\Users\flinchbot> Move-CsUser "" -Target poo -force
Move-CsUser [Using Force will cause data loss!]
[Y] Yes [A] Yes to All [N] No [L] No to All [S] Suspend [?] Help
(default is "Y"):

This moved the user. Finally we restored the data using the Update-CsUserData cmdlet:

PS C:\Users\flinchbot> Update-CsUserData -UserFilter "user@flinchbot.c
om" -FileName "e:\temp\" -verbose
VERBOSE: Processing input file e:\temp\
VERBOSE: Opening file
VERBOSE: Opening file e:\temp\
VERBOSE: Processed 1 users so far.
VERBOSE: User specified in User Filter processed.
VERBOSE: Output file C:\Users\flinchbot\AppData\Local\ImportUserDataTemp.Xml
 generated successfully.
VERBOSE: Processing user
VERBOSE: Processed 1 users so far.
Are you sure you want to perform this action?
Performing operation "Update-CsUserData" on Target "".
[Y] Yes [A] Yes to All [N] No [L] No to All [S] Suspend [?] Help
(default is "Y"):

After signing out of the user account and signing back in we saw the contacts had been restored. We were also now able to add new users to the contact list as well as update the Lync status.
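The export, force-move, and restore steps above can be strung together. What follows is only a minimal sketch, not the exact commands we ran: the function name Move-CsUserWithBackup and its parameter names are made up for illustration, and it assumes the Lync Server Management Shell is loaded. It refuses to force-move unless the export file actually exists.

```powershell
# Hypothetical helper: export a user's data, force-move the user, then restore the data.
# Assumes the Lync Server 2013 Management Shell (Export-CsUserData, Move-CsUser,
# Update-CsUserData) is available. All names and paths passed in are placeholders.
function Move-CsUserWithBackup {
    param(
        [Parameter(Mandatory = $true)][string]$SipUser,     # e.g. user@example.com
        [Parameter(Mandatory = $true)][string]$SourcePool,  # FQDN of the pool the user is on now
        [Parameter(Mandatory = $true)][string]$TargetPool,  # FQDN of the pool to move to
        [Parameter(Mandatory = $true)][string]$BackupFile   # .zip file for the exported data
    )

    # 1. Export contacts and other user data before anything destructive happens.
    Export-CsUserData -PoolFqdn $SourcePool -FileName $BackupFile -UserFilter $SipUser

    # Stop here if the export did not produce a file; -Force would lose the data for good.
    if (-not (Test-Path -LiteralPath $BackupFile)) {
        throw "Export file '$BackupFile' was not created; aborting before the force-move."
    }

    # 2. Force-move the user. This discards the user's data on the source pool.
    Move-CsUser -Identity "sip:$SipUser" -Target $TargetPool -Force -Confirm:$false

    # 3. Restore the exported data to the user in the new pool.
    Update-CsUserData -FileName $BackupFile -UserFilter $SipUser -Confirm:$false
}
```

In a real run you would still want to open the .zip and look at the .xml by hand between steps 1 and 2, as we did, because the existence of the file says nothing about what is in it.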

Moving the user back to their original pool gave the same errors as in the first example above. We need to figure that issue out but at least our users can have full Lync client functionality again even if they are now in the wrong pool.

Quick & Dirty – Gather Shutdown Tracker Events

•2014/08/25 • Leave a Comment

Today I had the need to see if my Front End servers were shut down “dirty” and when. So I kicked out the following script.

$banana = Get-CsComputer -Pool
foreach ($Server in $banana)
{
    Write-Host $Server.Fqdn
    # Event 41 in the System log is written after an unclean shutdown
    Get-EventLog -ComputerName $Server.Fqdn -LogName System -InstanceId 41 | Export-Csv shutdowns.csv -Append
}

Port 5088 Missing from Lync 2013 Documentation

•2014/07/01 • 8 Comments

If they had the other Harry Caray, a whole lot of Budweiser would be missing too.

We had an issue where users were able to sign in with Lync mobility but were unable to send and receive IMs. There are 2 things to note about this scenario:

1. The users are homed on an SBA

2. There are firewalls between the SBA and the parent pool.

So if you don’t have this scenario then you can quit reading now as you won’t ever have this problem.

In order to troubleshoot why our users were unable to successfully use Lync mobility, we jumped into the logs. We reviewed the log from the mobile phone and it showed nothing useful. We enabled the Lync Logging tool on the SBA and had a user log in and try to send an instant message.

Reviewing this log, we saw a request for port 5088 from the SBA to the parent pool. The request was to a specific server in the parent pool and it was from our Survivable Branch Appliance.

If you look at the image below you’ll see this in the Snooper view of the collected log file. The ms-diagnostics line pretty much spells this out as clearly as you could expect.

Look at the circle. It’s 5088!

Port 5088 does not currently exist on the Lync Ports and Protocols page on TechNet. Searching for this port turns up very little outside of this one TechNet article. That article points to the set-cswebserver PowerShell cmdlet which is used to define the web server settings in Lync. If you expand the Parameters section in the article and scroll down to the UcwaSipExternalListeningPort section you will see that this is set to use 5088/tcp by default. This is incorrect as this is the port used by UcwaSipPrimaryListeningPort. This TechNet article has the two ports switched in their documentation (The same error is seen when running get-help set-cswebserver -detailed).

Run get-csservice -Webserver and you will see the default ports. Note that they don’t match the documentation.


In other words, even when Microsoft has documented this port in TechNet, they got it wrong. We didn’t see port 5089 in any of our traces so we couldn’t figure out when this port gets used.

After we updated the firewalls in front of our parent pool Lync servers, the problem immediately disappeared and our SBA users were able to successfully IM via their mobile clients.
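If you want to confirm that the firewall change took effect, a quick TCP connection test from the SBA to each Front End in the parent pool is enough. The following is a rough sketch rather than anything we ran at the time: Test-TcpPort is a made-up helper name, the host names in the usage comment are placeholders, and it uses the .NET TcpClient class so it does not depend on a particular PowerShell version.

```powershell
# Sketch: check whether a TCP port is reachable, e.g. 5088 from the SBA to a
# Front End server in the parent pool. Host names below are placeholders.
function Test-TcpPort {
    param(
        [Parameter(Mandatory = $true)][string]$ComputerName,
        [Parameter(Mandatory = $true)][int]$Port,
        [int]$TimeoutMs = 3000
    )

    $client = New-Object System.Net.Sockets.TcpClient
    try {
        # Start the connection attempt and wait up to $TimeoutMs for it to finish.
        $async = $client.BeginConnect($ComputerName, $Port, $null, $null)
        if (-not $async.AsyncWaitHandle.WaitOne($TimeoutMs)) {
            return $false    # timed out: typically a firewall silently dropping packets
        }
        $client.EndConnect($async)   # throws if the connection was refused
        return $true
    }
    catch {
        return $false
    }
    finally {
        $client.Close()
    }
}

# Example: run this on the SBA against each Front End in the parent pool.
# foreach ($fe in 'fe01.example.com', 'fe02.example.com') {
#     '{0}:5088 reachable = {1}' -f $fe, (Test-TcpPort -ComputerName $fe -Port 5088)
# }
```

A result of True only tells you that something accepted the connection on that port. It does not prove that mobility IM works, but a False result from the SBA is a strong hint that a firewall is still in the way.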

Our contact at Microsoft has forwarded this omission to the relevant teams so hopefully at some point this will be added to the Lync ports and protocols page.

Credit for figuring this out goes to Antwan who is resurrecting his UC Playa blog. I’m just the one who wrote the article.

Lync 2013 and Useless(?) Topology Updates

•2014/05/13 • 1 Comment

We noticed today (and a few days ago, for that matter) that our CMS Replication state was “False” an awful lot of the time. So much so that we thought our CMS Replication was broken. We failed over our CMS role[1] the other day and, after coming back from lunch, all of our replicas were “True”. Well, we tried the same trick today and it didn’t fix the problem. We dug deep into the logs and it appeared that everything was actually working correctly. We even went so far as making a simple change (New-csUserPolicy “Delete This Policy”) and verifying after a few minutes that it showed up on a few of our other Lync servers[2]. So we turned our focus to why the replication status was never “True”.[3]

I’ll skip ahead a little here and get to the point where we made our little discovery. We exported a topology, then waited a random amount of time – say 5 minutes. Then we exported another copy of topology. We took the DocItemSet.xml file from each export and did a text comparison between the two files. Lo and behold there was a change. What was this Topology change?

A user migration.

Yes, moving a user from one pool to another caused a topo refresh to our servers. What the???

Our production environment is pretty big. As such, there are almost constant changes in the environment – be it updating a dial plan or disabling a user. In other words, it’s essentially dumb luck if we ever see our replication status set to “True” on all of our servers.

I was able to replicate this in my lab which has no automated systems enabling users or other system admins editing dial plans or the like. I can control the environment very tightly.

I exported a copy of the topology. I then ran “Move-csuser flinchbot -Target”. I then waited 5 minutes and exported the topology a second time. Next I went to this site and copied the first topology file into the left pane and the updated topology file in the right pane. It found 5 changes.


Look at the bottom right of this image.
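If you would rather not paste topology files into a web page, the same comparison can be done locally. This is a minimal sketch under a few assumptions: Compare-TopologyFile is a made-up function name, the two DocItemSet.xml files have already been extracted from their exports, and the XML is spread over multiple lines (if an export is one long line, a line-based comparison will just report the whole file as changed).

```powershell
# Sketch: line-by-line comparison of two DocItemSet.xml files taken from two
# topology exports. The paths in the usage example are placeholders.
function Compare-TopologyFile {
    param(
        [Parameter(Mandatory = $true)][string]$Before,
        [Parameter(Mandatory = $true)][string]$After
    )

    $old = @(Get-Content -LiteralPath $Before)
    $new = @(Get-Content -LiteralPath $After)

    # '<=' marks lines only in the first export, '=>' lines only in the second.
    Compare-Object -ReferenceObject $old -DifferenceObject $new |
        Select-Object SideIndicator, @{ Name = 'Line'; Expression = { $_.InputObject } }
}

# Usage:
# Compare-TopologyFile -Before 'e:\temp\export1\DocItemSet.xml' -After 'e:\temp\export2\DocItemSet.xml' |
#     Format-Table -AutoSize -Wrap
```

Each changed line shows up twice, once with its old content and once with its new content, so the five changes described here would appear as roughly ten rows.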

The first is (and I am guessing here) a hash of some sort letting the recipient servers know that there has been a change to the following section (XML node). This is found at a root node in the XML document (I think that’s the right term).  The next change is similar. Like above, I think it’s a marker to point out that within the root node above, this is the specific entry that has changed.


Finally we get to the actual change. Notice that the usercount decrements from 320 to 319. This is the move of the user FROM the source pool.

The fourth change is similar to the second change above – I think it’s just pointing out that “here be changes”.

I have no users on the destination pool (well maybe a random account or two). As such, you can see that the usercount going from 0 to 1 is completely expected if a new user is moved to this specific pool.

So….the question is why is there a topology update sent out for a user move?

All signs point to Windows Fabric and/or pool pairing being the reason. But why would you spam all of the Lync servers in your entire infrastructure with a change that is only relevant to a subset and then only if they are using Windows Fabric?

And then the change is only the number of users?

If the user count for a pool is set to 1501 in one of these files, is this the event that triggers Windows Fabric to create a new user routing group or to re-balance its groups? It seems an awful brute-force kind of way to do this.

Consider an environment with tens of thousands or hundreds of thousands of users. Users are being created/deleted/moved all the time. Now files are being blasted around the network constantly to inform all of your servers that a user was moved. Admittedly these files tend to be fairly small. In my lab they are 30K in size. In the production environment I help manage these files are much larger.

As a fun side effect, all of these topo pushes will account for additional writes to the SQL XDS Database which will fill up your SQL Logs faster.

So I don’t know why Microsoft architected it this way. But if you see that your CMS state is False an awful lot then it may very well be normal for your environment.
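One way to tell “busy” apart from “broken” is to look at when each out-of-date replica last reported in. The sketch below assumes the objects come from Get-CsManagementStoreReplicationStatus and relies on its ReplicaFqdn, UpToDate and LastStatusReport properties; Select-StaleReplica is a made-up helper name, not a Lync cmdlet.

```powershell
# Sketch: list only the replicas that are not up to date, oldest report first.
# Assumes the input objects come from Get-CsManagementStoreReplicationStatus,
# which exposes ReplicaFqdn, UpToDate and LastStatusReport properties.
function Select-StaleReplica {
    param(
        [Parameter(ValueFromPipeline = $true)]$Status
    )
    begin { $stale = @() }
    process {
        # Collect every replica whose UpToDate flag is False.
        if (-not $Status.UpToDate) { $stale += $Status }
    }
    end {
        $stale | Sort-Object LastStatusReport |
            Select-Object ReplicaFqdn, UpToDate, LastStatusReport
    }
}

# Usage (in the Lync Server Management Shell):
# Get-CsManagementStoreReplicationStatus | Select-StaleReplica
```

If the replicas listed as False all reported within the last few minutes, replication is probably just chasing a steady stream of changes. A replica whose last report is hours old deserves a closer look.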



[1] You can move the active CMS host(s) by stopping the Lync Server File Transfer Agent, Lync Server Master Replicator Agent, and Lync Server Replica Replicator Agent on the current active CMS host(s). This forces an election and one of the other Front End servers will pick up one or both of the roles.

[2] For reference, this was done by running Export-CsConfiguration -Filename -LocalStore. Looking in the returned file at the DocItemSet.xml file we found that the change had indeed replicated.

[3] For the record, to check your replication status run Get-CsManagementStoreReplicationStatus.

New Windows Phone UC App

•2014/05/05 • Leave a Comment

About 2 years ago I released the Lync News app for Windows Phone. Today that app has been retired and replaced with “flinchböt on UC”, an app which covers Lync as well as Exchange and has a fairly terrible name (I was in a hurry and didn’t give the name any thought). The new app is streamlined from the previous one partially because it was done with App Studio instead of native Visual C++ and partially because the older one was a bloated mess.

So if you have been using the Lync News app on Windows Phone, thanks – but it’s time to uninstall it! This version has way better load times for not only the app but for the Lync feed as well. The Exchange feed is a bit laggy but since I rarely have to deal with Exchange in my job I don’t care that it’s slow.

The app is fairly self-explanatory. The one thing to point out is that to see the full, original post you click the url link at the top of a given article. Otherwise you can read it in a slightly-less readable format within the app. You can also pin an article to your start screen. If you have an article open, tap on the menu then Share. Pick “Share Link” and then you can save to OneNote which is hot. That would be a really cool way to save articles.

Here is the link to download the app to your Windows 8 phone.

As a reminder, there is also a similar app for Android that can be found here.

Below are some screenshots.








Fun with KHI and Performance Monitor

•2014/05/01 • 1 Comment

A few weeks ago I wrote a post basically saying that the Lync Stress Tool was worthless. In it I said you should really monitor the progress of your Lync deployment using Performance Monitor. I also pointed to the Key Health Indicators  that Microsoft recommends you use to monitor your Lync installs. Heck, they even have a script to easily install the KHI Data Collector Set into Performance Monitor for you.

As we built our Lync 2013 servers, we installed the KHI Data Collector Set on each server as part of our standard build process. As we have about 40 Lync servers it’s a pain to go back to 40 servers and update the KHI Data Collector Set configuration. For example, we want to change the logging directory off of our c: drive and to the e: drive. We’d also like to launch the performance monitor collection every so often, have it run for a week, and then stop. Manually starting Performance Monitor on 40 servers? This is where PowerShell comes in.

I cobbled together a script to change the settings of the KHI Data Collector Set in Performance Monitor. If the KHI Data Collector Set was not installed on the server, the script installs it. After updating (or installing) the KHI Data Collector Set, it starts it on all of the servers. This is a total time saver. I won’t share the entire script here because I copied the entire Microsoft-written KHI script and buried it into mine. Copyright, plagiarism, etc.

But I will give you enough information to build your own script.

At the top of the script is this:

$arrServers=import-csv e:\scripts\servers.csv

This reads in a simple list of all of the servers I want to manipulate. Set the Header in the file to “ServerName”.
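For clarity, here is a small sketch of what that file is assumed to look like and how its rows come back as objects. The host names are placeholders, and ConvertFrom-Csv is used on inline text only so the example can be run without creating a file; Import-Csv gives the same objects when reading servers.csv.

```powershell
# Sketch of the expected servers.csv layout: one "ServerName" header, then one
# server per line. The host names are placeholders.
$csvText = @'
ServerName
lyncfe01.example.com
lyncfe02.example.com
'@

# ConvertFrom-Csv parses text the same way Import-Csv parses a file.
$arrServers = @(($csvText -split '\r?\n') | ConvertFrom-Csv)

foreach ($server in $arrServers) {
    # Each row is an object with a ServerName property, which is what the
    # rest of the script refers to as $server.ServerName.
    Write-Output $server.ServerName
}
```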

Next, I pasted in the two functions at the top of the Microsoft Script. I edited the CreateDataCollector function to look like this:

Function CreateDataCollector
{
    Write-Host -ForegroundColor Green "Creating Lync Server 2013 KHI Data Collector on $($server.ServerName).."

    # Create the KHI counter set on the remote server, logging to e:\Perflogs
    Invoke-Expression "logman.exe create counter KHI -o e:\Perflogs\KHI_$($server.ServerName) -f csv -si 15 -v mmddhhmm -cf .\LyncServer2013KHIs.config -s $($server.ServerName)"
    Remove-Item .\LyncServer2013KHIs.config
}

I edited the Write-Host line to properly display the Server name as it comes from the text file we are using. I then deleted a few lines and built my own Invoke-Expression command. Note that in this one I am slipping the server name into the name of the logfile. I am also pointing the logfile to an e:\Perflogs directory.

The CreateKHIsTextFile function is left unchanged.

And then after those 2 functions is the code I cobbled together.

Function StartKHI {
    $datacollectorset.Query("KHI", $Server.ServerName)
    # Change already-installed KHI Collector set to log to e: drive instead of default c: drive
    Invoke-Expression "logman.exe update KHI -o e:\Perflogs\KHI_$($Server.ServerName) -b 5/1/2014 17:00:00 -e 5/8/2014 17:00:00 -s $($Server.ServerName)"
    # Start the Collector Set (this call is reconstructed from the description below)
    $datacollectorset.Start($false)
}

foreach ($Server in $arrServers) {
    Write-Host "Working on" $Server.ServerName "..." -ForegroundColor Green

    $datacollectorset = New-Object -COM Pla.DataCollectorSet
    # If the collector set is not already installed, it errors. If no error, start the collector
    try {
        StartKHI
    }
    catch {
        # Starting the collector crashed, so it's probably not installed. Install it, then start it.
        Write-Host ("KHI counters not installed on {0}" -f $Server.ServerName) -ForegroundColor Green
        Write-Host "Installing...." -ForegroundColor Green
        CreateKHIsTextFile
        CreateDataCollector
        StartKHI
    }
}

I’ll assume you are fairly well versed in PowerShell. So let me point out the one bit of creativity I had to use. No value is returned by the “$datacollectorset.Query("KHI", $Server.Servername)” call. Instead, it returns nothing if it worked. If it fails it blows up and scrawls PowerShell blood all over your screen. So the way to tell if the KHI is already installed or not is to use a Try/Catch construct. If the try works, it starts the KHI Data Collector successfully. If it fails, then I assume that the KHI Data Collectors haven’t been installed. So I call the Microsoft-written (and slightly edited by me) functions to install it. Once those are done, I go ahead and start the Data Collector.

So using this script, I am able to either install the KHI Data Collectors or to update them with values I want. If you look at the Invoke-Expression line in the StartKHI function, I use the -b and -e parameters. This sets a begin and end time for the collector to run. In this case it is one week. You will probably have to edit this before running your copy.
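Rather than editing the dates by hand each time, the begin and end times can be computed. This is a hedged sketch only: Get-KhiUpdateCommand is a made-up helper, it builds the same logman command line as StartKHI, and it assumes logman accepts the M/d/yyyy HH:mm:ss form used above. The date format logman expects can depend on the server’s regional settings, so check it on one server first.

```powershell
# Sketch: build the logman "update" command with a begin time and an end time
# a number of days later, instead of hard-coding the dates.
# Assumes logman accepts M/d/yyyy HH:mm:ss as in the command above; the
# accepted date format can depend on the server's regional settings.
function Get-KhiUpdateCommand {
    param(
        [Parameter(Mandatory = $true)][string]$ServerName,
        [datetime]$Begin = (Get-Date),
        [int]$Days = 7
    )

    # Format with the invariant culture so "/" and ":" are not localized.
    $culture = [System.Globalization.CultureInfo]::InvariantCulture
    $format = 'M/d/yyyy HH:mm:ss'
    $b = $Begin.ToString($format, $culture)
    $e = $Begin.AddDays($Days).ToString($format, $culture)

    "logman.exe update KHI -o e:\Perflogs\KHI_$ServerName -b $b -e $e -s $ServerName"
}

# Usage inside the foreach loop:
# Invoke-Expression (Get-KhiUpdateCommand -ServerName $Server.ServerName)
```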

Below is a short script to stop the Data Collector Set. It’s useful when testing.

$arrServers = Import-Csv e:\scripts\servers.csv

foreach ($Server in $arrServers) {
    Write-Host "Working on" $Server.ServerName "..." -ForegroundColor Green

    $datacollectorset = New-Object -COM Pla.DataCollectorSet
    try {
        $datacollectorset.Query("KHI", $Server.ServerName)
        # Stop the Collector Set (this call is reconstructed; it errors if already stopped)
        $datacollectorset.Stop($false)
    }
    catch {
        Write-Host "KHI counters already stopped on $($Server.ServerName)" -ForegroundColor Green
    }
}

In the above you don’t really have to use the Try/Catch. It’s just to make things prettier (i.e., less PowerShell blood).

So if you cobble the full script together, you can install the KHI Data Collector set, edit its settings, and start and stop the collector. Pretty useful, especially if you have a lot of servers. Now the next challenge: What do you do with 40 servers-worth of logs?

Find SIP Addresses with Illegal Characters

•2014/04/24 • 2 Comments

One of my peers had a Lync 2013 pool-failover scenario. Just about everything worked right except that apparently the Lync Backup Service had been getting hung up and not completing its replication cycles. They opened a case with Microsoft and one of the issues discovered was that Lync Backup Service was hanging on users whose SIP Address had illegal characters. Once they manually fixed these SIP Addresses, the Backup Service was able to complete successfully.

So what characters are illegal in a SIP address (at least so far as Lync is concerned)?

~ | { } [ ] < > ` # ^ & @ \

We can convert that to a Regular Expression:


Once that is done, a quick and dirty script can be written to compare every user against this Regular Expression. The expression describes a name made up only of legal characters, so if the user portion of a SIP Address fails to match it, we can be notified of this.

# These are the invalid characters ~|{}[]<>`#^&@\

$AddressToTest = Get-CsUser
# The doubled backtick is a literal backtick inside a double-quoted string
$regex = "^([^~|{}\[\]<>``#^'&@\\]+)$"

foreach ($User in $AddressToTest) {
    if (($User.SipAddress -split "@")[0].Substring(4) -notmatch $regex) {
        Write-Host "Invalid username specified." $User.SipAddress
    }
}

The only fancy part of this script is in the If statement. We can’t compare the entire SIP Address against the regex because the “@” will always be a match. So the Split is used to grab the left hand side of the SIP Address which is the portion that will (most likely) have illegal characters. You’ll also note the “substring” portion in the if statement. This means begin the comparison 4 characters in; skip the “sip:” portion of the returned SIP Address.
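To try the check without a Lync environment, the same test can be wrapped in a function that takes a plain string. This is a sketch under assumptions: Test-SipUserPart is a made-up name, the sample addresses are invented, and the pattern covers the characters listed above plus the straight apostrophe.

```powershell
# Sketch: the same check as a function that takes a SIP address string, so it
# can be tried without a Lync environment. The sample addresses are made up.
function Test-SipUserPart {
    param([Parameter(Mandatory = $true)][string]$SipAddress)

    # Pattern for a name made only of legal characters. The doubled backtick
    # is a literal backtick inside a double-quoted PowerShell string.
    $regex = "^([^~|{}\[\]<>``#^'&@\\]+)$"

    # Drop the leading "sip:" and keep everything left of the first "@".
    $userPart = ($SipAddress -split '@')[0].Substring(4)

    # True when the user part contains none of the illegal characters.
    return ($userPart -match $regex)
}

# Usage:
# Test-SipUserPart 'sip:john.doe@example.com'    # legal user part
# Test-SipUserPart 'sip:john#doe@example.com'    # contains an illegal "#"
```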

Note that if you want to test out this script in a lab environment, you can force a user to have any illegal character if you edit their SIP Address via ADSIEdit. Also note that set-csuser will permit you to edit a SIP Address and insert a few of the above characters.

Here is sample output from the script:



Tom Arbuthnot points to a TechNet document specifically calling out the unsupported usage of the hyphen and apostrophe here:
