On our IBM DS3500 SAN we started seeing VDD errors and repairs, coupled with a drive returning a Check Condition and a destination drive error. According to vendor support, these errors occur when the controller reads data off the hard drive and it fails its checksum. The array will automatically recover. While these errors can be transient, they can also signal a failing drive, especially when a drive error accompanies them.
To be on the safe side we replaced the drive. When replacing a drive it's recommended to unassign any global hot spares until the new drive is inserted and the rebuild starts. If you have hot spares assigned when you fail the drive, the array will immediately start rebuilding onto the hot spare.
Wednesday, November 28, 2012
Problems with Exchange 2010 Update Rollup 4 and 5
Currently you shouldn't install any Exchange 2010 Update Rollup past UR3.
Update Rollup 4 and 4v.2 both break content indexing.
Update Rollup 5 broke DAGs and Microsoft has since stopped offering it for download.
These issues are supposed to be fixed in the upcoming Update Rollup 6, but we'll have to wait and see. Considering the obvious lack of QA testing Microsoft did on UR4, 4v.2, and 5 I'm going to hold off installing UR6 when it comes out to make sure it doesn't come with a bug as well.
Sonicwall appliance not filtering spam
The Sonicwall Email Security spam filtering appliance is a great product. I've thoroughly enjoyed having one. However, with the 7.4.1 firmware they introduced a MySQL backend which holds thumbprints (spam signatures) and report information. Under certain circumstances (like powering off the appliance during a thumbprint update) the database can become corrupted. With a corrupt database the appliance will continue to filter spam, but at reduced efficiency (we saw it drop from about 95% to about 70%). The reports will also stop working or work only intermittently.
If you log into your appliance and you see the Good/Junk and Spam Breakdown charts reading zero, suspect a corrupt database. You can check it by going to System > Advanced and clicking the Check Connectivity button. You can also check your thumbprints by logging into the appliance and then going to http://<appliance>/diag.html and selecting Thumbprint info.
A corrupt database is not an end user fixable problem. Contact Sonicwall to have them repair it. If your appliance is behind a firewall, then you'll need to temporarily open port 22 to give them access to the appliance.
UPDATE: If your database keeps getting corrupted you're probably looking at hardware failure, at which point you should contact Sonicwall support to have a new unit shipped to you.
Forwarded emails in Exchange 2010 don't show email address
If you have a version of Exchange prior to 2010 and have a rule set up in an email account to forward all incoming emails, when you migrate to Exchange 2010 the emails will still be forwarded, but they won't show the email address they're from. This is a known issue and is not one Microsoft plans to fix. There are a couple of workarounds though:
- Delete the account, recreate the account, and recreate the rule.
- Change it from forwarding to redirecting or forwarding as an attachment.
- If it's forwarding to only one person, you can set forwarding in the account properties.
Friday, November 2, 2012
Dell PERC S300 RAID not rebuilding
First, let me say the PERC S300 isn't a good RAID controller. It gives you the capacity for RAID 0, 1, and 5, but the processing is done by the operating system, so it's not a true hardware RAID controller. It also doesn't provide write back support, which can have a big performance impact depending on your application.
If one of the drives fails on a PERC S300 RAID controller and you replace it with a new drive (the S300 supports hot swap) the controller may not automatically start rebuilding the array onto the new drive. Open your Server Administrator console and first check to make sure the new drive shows up under the "Physical Disks" listing. If it does then check the "Virtual Disks" listing and see if you suddenly have a new virtual disk the same capacity as the replacement drive. If you do, go ahead and verify it only contains the replacement drive, and then go ahead and delete it. Once it's deleted go back to "Physical Disks" and set the replacement drive to "Global Hot Spare". Once applied the controller should start rebuilding the affected arrays onto the new drive. Go back to "Virtual Disks" to check the progress.
Wednesday, September 26, 2012
Network drives not mapping on Cisco connected clients
We run all Cisco network hardware and ran into an interesting problem where certain client computers, running a mix of Windows XP Pro and Windows 7 Pro, wouldn't map network drives per our logon script. They could still access network resources and login successfully, but the drives simply wouldn't map.
We tried updating drivers, changing network cards, and a variety of group policy settings per Microsoft technical support's recommendation, but nothing seemed to help. We finally found that putting a dumb hub between the computer and the wall completely fixed the issue. Working from that, we implemented Portfast on our client switches. So far it seems to have fixed the problem.
Tuesday, July 24, 2012
Updates on reboot / shutdown on Windows 2008
Windows 2008 has a very annoying habit of installing updates on a shutdown or reboot. On a server this can be unwanted behavior, and at a minimum it should be configurable. The workaround is to go to Start > Run and use "shutdown /r /t 0" instead of the restart option and "shutdown /s /t 0" instead of the shutdown option.
Monday, July 23, 2012
Exchange 2010 public folders not forwarding emails or tasks
When you've migrated from Exchange 2000 or Exchange 2003 to Exchange 2010 and have decommissioned the older servers, you can run into a situation where your public folders won't accept or forward emails. There are already several posts regarding this on the Internet, but all the ones I've seen were either for email public folders or public folders accepting emails. However, it also prevents non-email public folders from forwarding emails.
The first thing is to see if a public folder will accept an incoming email. If you don't have one set up for it, set up a test folder. Send an email to it and you should get an NDR saying:
554 5.2.0 STOREDRV.Deliver.Exception:ObjectNotFoundException; Failed to process message due to a permanent exception with message The Active Directory user wasn't found. ObjectNotFoundException: The Active Directory user wasn't found. ##
This issue is caused because Exchange will look at all Exchange server AD containers, including the empty Exchange 2000/2003 container, for a System Attendant to process the message, and errors out on the empty container. The solution is to delete the empty container using ADSIedit. As always, be very careful using ADSIedit as you can cause a lot of damage. If you see any servers in the container STOP because you're probably looking at the 2010 container. Once it's deleted, give Exchange 2010 some time to see the change and start processing correctly.
Friday, July 13, 2012
Routing Group connectors in Exchange 2003
When you're removing Exchange 2003 servers in an Exchange 2010 organization, one of the steps is removing the routing group connector which connects the 2003 and 2010 administrative groups. When going to the 2003 ESM to do so you'll probably see one or more other connectors there as well. Before removing the last Exchange 2003 server you should remove the other routing group connectors in your 2003 administrative group. If you don't your 2010 transport server will still see the connector but won't see a route to it, and will start logging errors 5015 and 5016 and warning 5006.
Fortunately, if you've already uninstalled your last 2003 server without removing those connectors and are receiving those warnings and errors, you can simply remove the old connector from the "Send Connectors" tab under the "Hub Transport" organization configuration in 2010 EMC.
Wednesday, July 4, 2012
WARNING: orphaned scrubbed LV(673) detected
If you run an integrity check on an Exchange 2010 database using Eseutil /G then you might find entries in the output similar to "WARNING: orphaned scrubbed LV(673) detected". It seems these can be generated if the server loses connection to the database without dismounting it first, if the server crashes, or if it's powered off without shutting down. In speaking with Microsoft they said a few of them aren't a danger to the database. However, a lot of them suggests there might be other issues with the database as well, and you should consider moving the mailboxes to a new database and removing the old one.
Friday, June 29, 2012
Public folders not visible in OWA
During an Exchange 2003 to Exchange 2010 migration, when you try to view public folders in OWA you might receive the message "The public folder you're trying to access couldn't be found. If the problem continues, contact your helpdesk and tell them the following: The public folder couldn't be found because there is no Exchange 2010 public folder server." You'll find several possible solutions online, but none of them worked for us. What finally fixed the problem was moving the public folder hierarchy from the Exchange 2003 administrative group to the Exchange 2010 administrative group, restarting the OWA server, and waiting for everything to sync up. Here are directions on how:
Wednesday, June 27, 2012
Monitoring Exchange 2010 database maintenance
By default Exchange 2010 will perform maintenance activities throughout the day. If you're monitoring your storage performance (and if it's virtualized you should be), you can see a higher level of disk utilization than normal, especially during off hours. To make sure it's just database maintenance and not an issue:
- Start Performance Monitor by going to Start>Run>Perfmon
- Under "Monitoring Tools" select "Performance Monitor"
- Click the green plus sign to add a monitor.
- Select the local computer, scroll down to "MSExchange Database Instances", select "Database Maintenance Duration", "I/O Database Reads / Sec", and "I/O Database Writes / Sec", and select all database instances in the pane below it.
- Right click in the graph, select properties, select the "graph" tab, and change the view to report.
Now you can see how long the maintenance has been running, and the load it's placing on your storage system. Maintenance appears to generate about 30-35 IOPS and about 7-8 Mbps per database being processed.
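For rough capacity planning, the per-database figures above can be multiplied out. This is only a back-of-the-envelope Python sketch: it assumes the load scales linearly with the number of databases in maintenance at the same time, which I haven't verified, and the function name is made up for illustration.

```python
# Per-database figures observed above; rough and environment-specific.
IOPS_PER_DB = (30, 35)
MBPS_PER_DB = (7, 8)  # unit as stated in the post


def maintenance_load(databases_in_maintenance):
    """Return ((min_iops, max_iops), (min_mbps, max_mbps)) for N databases.

    Assumes linear scaling per database, which is an assumption, not a measurement.
    """
    n = databases_in_maintenance
    iops = (IOPS_PER_DB[0] * n, IOPS_PER_DB[1] * n)
    mbps = (MBPS_PER_DB[0] * n, MBPS_PER_DB[1] * n)
    return iops, mbps


# Example: four databases in maintenance at once.
iops_range, mbps_range = maintenance_load(4)
print("IOPS:", iops_range, "Mbps:", mbps_range)
```

If the numbers you see in Performance Monitor are far above this kind of estimate, something other than maintenance is probably generating the load.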
Thursday, June 14, 2012
245 error creating database copy in Exchange 2010 DAG
Here's an annoying one which Microsoft technical support had a hard time with. When you try to create a database copy in a DAG the initial seed fails with a 245 error on the source server. You can work around the problem by dismounting and remounting the database and then telling it to resume replication. The problem is the system essentially tries to start replication before the database is even created on the target server. While Microsoft wasn't able to answer why it's trying to do that, they did give me the correct shell cmdlet to create a database copy with postponed seeding, which allows you to skip the error and the whole procedure to start initial seeding:
Add-MailboxDatabaseCopy -Identity "Database Name" -MailboxServer "Server Name" -ActivationPreference 2 -SeedingPostponed
That said, even doing it this way you might still get an error on the target mailbox server, but it's easy to ignore and doesn't stop the initial seeding.
The "SeedingPostponed" option is normally used for an existing database, and will create the database copy but leaves replication suspended so you can manually seed it. However, when used for a brand new database the initial log with the database creation command in it is copied over, and that command causes the database copy to be created and begin replication. So the end result is different when using it on a new database like in this scenario.
Errors with a dismounted database in Exchange 2010
If you keep a database dismounted in Exchange 2010 you can keep getting indexing errors in the application log. This is normal, but annoying. To remedy the situation drop to your shell and disable indexing on the dismounted database using the following cmdlet then restart the indexing service:
Set-MailboxDatabase "Database Name" -IndexEnabled $false
1114 Warnings in Exchange 2010
We've kept getting clusters of 1114 Warnings in the application log for certain databases in Exchange 2010. After working with Microsoft support on the problem, the final conclusion was some 1114 warnings are normal and to be expected.
UPDATE: After 1 year we still get periodic 1114 warnings on all of our databases, but they haven't affected performance in any way. Just as Microsoft support said, it's safe to ignore them.
Monday, June 4, 2012
Physical size discrepancy between DAG databases
I noted something interesting. With an Exchange 2010 DAG you have one active database and one or more logical copies. All the databases are logically identical. However, they can vary in physical size by several hundred megabytes.
Thursday, May 31, 2012
Exchange 2010: Removing a stuck move request
When you create a move request in Exchange 2010, an entry for that account is created in the "Move Request" under "Recipient Configuration". Once it's completed you cannot create another move request until you clear the existing move request. However, there are certain situations where the move request can get stuck and cannot be deleted, for example if the user is moved to a different OU after the move, but before deleting the move request.
To clear a stuck move request you need to use ADSIedit to clear the attributes msExchMailboxMoveSourceMDBLink, msExchMailboxMoveTargetMDBLink, msExchMailboxMoveFlags, and msExchMailboxMoveStatus from the user's account. As always, save the attributes to notepad before clearing them. Once they're cleared, synchronize your domain controllers, and the move request should disappear from the Exchange 2010 EMC. You may need to close and reopen EMC.
Sunday, May 27, 2012
Hitting the 4GB database limit in VMWare vCenter
When you install VMWare vCenter to manage your virtual environment, unless you specify an existing database, it will automatically install MS SQL 2005 Express. This is great and works fine. However, one of the important limitations of SQL 2005 Express is a 4GB database size limit. As you use VMWare, performance data, task history, and event history will all be recorded to the database and will slowly fill it up. Depending on the size and complexity of your environment, some time later your DB will hit 4GB, the SQL server will stop accepting data, and vCenter will simply stop. As always, before making changes to the DB make sure you have a good backup.
The solution is to quickly shrink your DB size, restart SQL, and start the vCenter service again. The easiest way to shrink the DB is to clear out all the old performance, task, and event data. If you haven't already, download and install the SQL Management Studio. Open up your vCenter database, and then run the following query to clear performance data:
truncate table VPX_HIST_STAT1;
truncate table VPX_SAMPLE_TIME1;
truncate table VPX_HIST_STAT2;
truncate table VPX_SAMPLE_TIME2;
truncate table VPX_HIST_STAT3;
truncate table VPX_SAMPLE_TIME3;
truncate table VPX_HIST_STAT4;
truncate table VPX_SAMPLE_TIME4;
Then open the table in the vCenter database called dbo.VPX_PARAMETER, modify "event.maxAge" to how many days of events you want to keep (I usually keep 90 days), modify "event.maxAgeEnabled" to "true", modify "task.maxAge" to how many days of tasks you want to keep (again I usually keep 90 days), and finally set "task.maxAgeEnabled" to "true". Once you're done open the "programmability" folder for the vCenter DB in Management Studio, open "stored procedures", and execute the one called "dbo.cleanup_events_tasks_proc".
The final step is right clicking on the database, going to tasks, and selecting shrink database (not file). Let it run, and your DB should suddenly be well under 4GB. Restart your vCenter service.
You can either monitor the growth of your DB and clean it up periodically, or you can upgrade to SQL standard. Because of Microsoft's SQL licensing, and because you're probably already running SQL in your virtual environment, you can probably upgrade to standard for no cost to you.
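If you go the monitoring route, a rough projection of when the DB will hit the limit helps decide how often to clean up. The Python sketch below assumes you record the database size yourself every so often (for example from Management Studio) and that growth is roughly linear. The function name and the sample figures are made up for illustration.

```python
from datetime import date

LIMIT_GB = 4.0  # SQL 2005 Express database size limit mentioned above


def days_until_limit(samples, limit_gb=LIMIT_GB):
    """Linearly project the days left until the database reaches limit_gb.

    samples: list of (date, size_gb) tuples, oldest first.
    Returns None with fewer than two samples or if the database is not growing.
    """
    if len(samples) < 2:
        return None
    (first_day, first_size), (last_day, last_size) = samples[0], samples[-1]
    elapsed_days = (last_day - first_day).days
    growth = last_size - first_size
    if elapsed_days <= 0 or growth <= 0:
        return None
    rate_per_day = growth / elapsed_days
    remaining = max(limit_gb - last_size, 0.0)
    return remaining / rate_per_day


# Hypothetical readings: 2.0 GB growing to 2.5 GB over 30 days.
readings = [(date(2012, 1, 1), 2.0), (date(2012, 1, 31), 2.5)]
print(round(days_until_limit(readings)), "days left at the current growth rate")
```

Growth is rarely perfectly linear, so treat the result as a reminder to check the DB, not as a deadline.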
Wednesday, May 16, 2012
Fast way to check an ESEUTIL /G log
You can run ESEUTIL /G against an unmounted Exchange 2010 database to check it for errors. It's a good tool, especially if you can bring up your Exchange 2010 backup in a test environment to run checks. The problem is the log is very verbose, and there's no summary regarding if it found any errors. All you get is a line saying "finishes with error 0 (0x0)" if it didn't find an error. If it finds an error it'll say "finishes with error" and then an error code, so you need to parse the log file looking for any line which isn't "finishes with error 0 (0x0)", and even for a small database you'll have thousands of those entries to look at.
The trick is to open the log file in notepad, run Find and Replace, and replace all instances of "finishes with error 0 (0x0)" with something like "CHECK OK". Now you can search the log for "error" and quickly find the entries, if any, where it finished with an error.
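If you'd rather not edit the log by hand, the same idea can be scripted. Here is a minimal Python sketch that lists only the "finishes with error" lines whose code isn't 0. It assumes the log lines contain the phrase exactly as quoted above; the function name and the sample lines are made up for illustration and are not real ESEUTIL output.

```python
import re

# Matches "finishes with error <code>" as quoted in the post. The exact log
# layout is an assumption; adjust the pattern to match your own output.
FINISH_RE = re.compile(r"finishes with error\s+(-?\d+)")


def find_failed_checks(lines):
    """Return (line_number, text) for each 'finishes with error' line with a non-zero code."""
    failures = []
    for number, line in enumerate(lines, start=1):
        match = FINISH_RE.search(line)
        if match and int(match.group(1)) != 0:
            failures.append((number, line.strip()))
    return failures


# Made-up sample lines, for illustration only.
sample_log = [
    "table one finishes with error 0 (0x0)",
    "table two finishes with error -1 (0xffffffff)",
    "some other verbose line",
]
for number, text in find_failed_checks(sample_log):
    print(f"line {number}: {text}")
```

To run it against a real log, read the file with open() and pass the file object to find_failed_checks in place of sample_log.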
Monday, May 14, 2012
Monitoring an Exchange 2010 index crawl
If you need to rebuild an Exchange 2010 DB search index, the index state for the DB changes to "crawling". Unfortunately it doesn't give any indication if it's actually doing anything. Considering it's possible to get the status stuck as crawling, you cannot activate the database in a DAG while it's crawling, and it can take a long time to complete, the ability to actually monitor its progress is important.
So here's a trick to help monitor the progress of the index crawl:
- Start Performance Monitor by going to Start>Run>Perfmon
- Under "Monitoring Tools" select "Performance Monitor"
- Click the green plus sign to add a monitor.
- Select the local computer, scroll down to "MSExchange Search Indices", select "Number of Indexed Recipients" and "Number of Mailboxes Left to Crawl", and hit OK.
- Right click in the graph, select properties, select the "graph" tab, and change the view to report.
Now you can see as the number of mailboxes remaining drops, making it much easier to monitor its progress and ensure it's working. Be aware some mailboxes take much longer than other mailboxes.
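If you write down the "Number of Mailboxes Left to Crawl" reading at regular intervals, a simple check can flag a crawl that looks stuck. This is a Python sketch under that assumption. The function name, the window size, and the sample readings are made up for illustration, and because some mailboxes take much longer than others the window should be generous.

```python
def crawl_is_stalled(samples, window=5):
    """Return True if the last `window` readings show no drop in mailboxes left to crawl.

    samples: readings of "Number of Mailboxes Left to Crawl", oldest first.
    """
    if len(samples) < window:
        return False  # not enough readings to judge
    recent = samples[-window:]
    # Stalled means work remains and the count has not dropped across the window.
    return recent[-1] > 0 and recent[-1] >= recent[0]


# Hypothetical readings taken at a fixed interval.
print(crawl_is_stalled([40, 39, 38, 37, 36]))      # count is still dropping
print(crawl_is_stalled([40, 38, 38, 38, 38, 38]))  # no progress over the window
```

A True result is only a hint to take a closer look, since one large mailbox can hold the counter steady for a while.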
Restoring vCenter
After our SAN reset itself and lost connection with our VMWare 4.1 ESX hosts there were several servers which were understandably not happy. However, vCenter did not appear to be one of them. In fact, we used it extensively while fixing the various issues caused by the SAN outage, and it never caused an issue. It was therefore a little surprising today when the vCenterServer Service stopped on it. When I restarted the service, the SQL Server 2005 Express instance shot to 99% CPU usage and the tempdb system database started growing. It wouldn't stop until it had exhausted all available storage space, and then it would crash. We found restarting the SQL server would free up the space, but the cycle would then start again. We also found stopping the vCenterServer Service would drop the SQL service CPU utilization back to 0 and stop the tempdb file growth.
Naturally I engaged VMWare technical support on this, and we worked the problem for four hours without resolution. We tried truncating the temp tables and history tables in the vCenter DB, reset the credentials for the SQL services, and looked through extensive logs. No luck.
Finally, I tried an Instant Recovery of the vCenter VM from our Veeam backup system, and it worked beautifully; CPU utilization stayed down, and storage stayed stable. I immediately did a standard restore of it, and was back up and running.
All vCenter stores is historical information like performance statistics and your higher level configuration like HA. If you're a smaller shop and you keep your configuration basic, then vCenter isn't storing much of anything, and rolling it back a couple days won't harm anything. In fact, you could do a completely new installation of vCenter and reconfigure it manually pretty fast.