Sep 2


A quick way to find nodes that are not in a collocation group and how much data they are using. I have one onsite stgpool that uses collocation, so I specify that I only want to look at data in that stgpool. Nodes that show no collocation group name in the output are the ones that are not in a group.
select nod.node_name, nod.collocgroup_name, sum(oc.physical_mb/1024) as GB from nodes nod, occupancy oc where nod.node_name=oc.node_name and oc.stgpool_name='MYSTGPOOL' group by nod.node_name, nod.collocgroup_name
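The select above returns every node along with its collocation group, so the ungrouped nodes still have to be picked out of the output. Here is a minimal Python sketch of that filtering step. It assumes the output has been saved as comma-separated rows of node name, collocation group name and GB. The function name and the sample rows are hypothetical, not real data.

```python
import csv
import io

def ungrouped_nodes(csv_text):
    """Return (node_name, gb) pairs for nodes with an empty collocation group, largest first."""
    result = []
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) != 3:
            continue  # skip blank or malformed lines
        node, group, gb = (field.strip() for field in row)
        if group == "":
            result.append((node, float(gb)))
    return sorted(result, key=lambda pair: pair[1], reverse=True)

# Hypothetical sample rows in the assumed node,group,GB layout
sample = """NODE1,,120.5
NODE2,GROUP_A,80.0
NODEX,,310.25
"""

for node, gb in ungrouped_nodes(sample):
    print(f"{node}: {gb:.2f} GB")
```

Sorting by size puts the nodes that would free the most tape space at the top of the list.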

After running the above query and defining nodes into a collocation group when they previously were not in one, this is how I find out which tapes I need to move data off of. This way I can reclaim tapes immediately rather than waiting for data to expire off the tape and space reclamation to run against it.
select distinct volume_name from volumeusage where node_name IN ('NODE1', 'NODE2', 'NODEX') and stgpool_name='MYSTGPOOL'
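If the node list is long, the select and the follow-up commands can be generated instead of typed. The sketch below is one way to do that in Python, and the helper names are hypothetical. It assumes you will paste the results into an administrative session yourself, and it emits only the bare `move data` command with a volume name, so check any extra options you need against the Administrator's Reference for your server level.

```python
def volume_query(nodes, stgpool):
    """Build the volumeusage select for the given nodes and storage pool."""
    in_list = ", ".join(f"'{name}'" for name in nodes)
    return (
        "select distinct volume_name from volumeusage "
        f"where node_name IN ({in_list}) and stgpool_name='{stgpool}'"
    )

def move_data_commands(volumes):
    """Return one 'move data' command per volume, de-duplicated and sorted."""
    return [f"move data {volume}" for volume in sorted(set(volumes))]

print(volume_query(["NODE1", "NODE2", "NODEX"], "MYSTGPOOL"))

# Hypothetical volume names standing in for the query result
for command in move_data_commands(["A00002", "A00001", "A00002"]):
    print(command)
```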

I hope to update this list as I think of other selects I use often.

Mar 14

Every morning I pick up my BlackBerry and look at an email from TSM Operational Reporting. If you are responsible for a Tivoli Storage Manager install, you should be looking at a report from TSM Operational Reporting at some point during your day as well. TSMOR has two key reports that can be emailed to you. The daily and hourly reports work in tandem to give you a wide range of information when it is needed and focused information if there is ever a problem. In addition, TSMOR can store the current report as well as previous reports, in HTML format, in a directory. Until TSMOR became available there was hardly any easy way to see the health of your TSM environment.

TSM Operational Reporting’s daily report is about as complete a report as you can get. It begins with a general summary of the target TSM server. Some of the items monitored are shown below.

These are simple counts or numbers:

    Administrative Schedules – Success, Error, Fail, Missed
    Client Schedules – No Errors, Skipped Files, Warnings, Errors, Failed, Missed
    Total GB – Backed up, Restored, Archived, Retrieved
    Database and Log utilization
    DB Cache Hit Ratio
    Diskpool Utilization
    Scratch and Unavailable Volumes

Then it gives you detailed information on the Administrative Schedules and Client Schedules, such as what missed or failed. At this point I have a solid idea of the health and success of the previous backup cycle. From here I can move on to troubleshooting problems if needed, or safely move on to other tasks if everything was successful. While scrolling down I pass some awesome-looking graphs of different load summaries that are often not useful to me. I’ll list them to see if any catch your attention.

    Session Load Summary
    Tape Mount Load Summary
    Migration Load Summary
    Reclamation Load Summary
    Database Backup Load Summary
    Storage Pool Backup Load Summary
    Expiration Load Summary

These graphs are more useful as a 50,000 ft overview of the loads your server is under throughout the day. If I want a bit more detail on the clients, such as Bytes Transferred or Node versions, I can simply scroll down. The Node Activity Summary is one section I watch frequently. It gives a list of nodes and their version of the TSM BAClient. This is very useful in my environment right now, as I am phasing out 5.3.x.x and moving to 5.5.0.4. Support for 5.3.x.x is not going to be available after April 30th, 2008.

The next section I have recently had to turn off. Activity Log Details is the output of your activity log for the past 24 hours. I have a few HSM for Windows clients, and these have been generating a lot of information in the activity log. So much, in fact, that it made my daily report fail. After I disabled this section, the daily report runs just fine.

The Missed File Summary is useful at times. An example would be a new agent on all of your machines that has a file that is being skipped on every one of them. You will see the number of occurrences of skipped files and the names of those files. The next section, Missed File Details, is what I actually use to troubleshoot missed files. It gives you two key pieces of information: the node name and the UNC path to the file. A third piece of information is the time at which the file was skipped. This time can be useful if you know some other job doesn’t finish or start until a certain time to release the file. The first two pieces should give you enough information to know if you need to edit your include/excludes.

The Session Summary section is awesome, but mostly used for bragging. If you are a backup administrator, the only thing that comes close to being able to say you can restore anything, and have, is how fast you can do it. This section lists for each node:

    Objects Inspected, Backed Up, Updated, Rebound, Failed
    Bytes Moved
    Elapsed Time
    Aggregated Rate KB/Sec
    Percent Compressed

If you are like me, knowing how many objects moved, how fast they moved, and the total size of that data is very useful. For instance, for nodes with lots of objects it may be worth it to have the Tivoli journal engine running. A slow aggregate rate combined with high bytes moved can sometimes reveal network bottlenecks. Session Summary is available for backup and archive sessions as well as restores and retrieves. The last section is Timing Information, which shows how long it took to gather the data for each section.
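To make the bottleneck check concrete, here is a small Python sketch that flags sessions that moved a lot of data at a slow rate. It assumes the rate is simply bytes moved divided by elapsed seconds, expressed in KB/sec. The function names, field names, thresholds and sample figures are all hypothetical and are not taken from TSMOR itself.

```python
def aggregate_rate_kb_per_sec(bytes_moved, elapsed_seconds):
    """Average transfer rate over the whole session, in KB/sec."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return bytes_moved / 1024 / elapsed_seconds

def slow_large_sessions(sessions, min_bytes, max_rate_kb):
    """Return names of nodes that moved at least min_bytes at or below max_rate_kb."""
    flagged = []
    for session in sessions:
        rate = aggregate_rate_kb_per_sec(session["bytes"], session["seconds"])
        if session["bytes"] >= min_bytes and rate <= max_rate_kb:
            flagged.append(session["node"])
    return flagged

# Hypothetical sessions: node name, bytes moved, elapsed seconds
sessions = [
    {"node": "NODE1", "bytes": 50 * 1024**3, "seconds": 36000},
    {"node": "NODE2", "bytes": 50 * 1024**3, "seconds": 3600},
    {"node": "NODEX", "bytes": 10 * 1024**2, "seconds": 600},
]

print(slow_large_sessions(sessions, min_bytes=10 * 1024**3, max_rate_kb=5000))
```

The size threshold keeps small sessions out of the result, since their rate is dominated by startup overhead rather than the network.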

I hope this review and summary has informed you a bit. There are many other features which I did not cover but may do so at a later time. In case you would like to research them on your own, I will point you in the right direction. You can create multiple daily and hourly reports. You can create your own custom select statements to pull data you need for your environment. You can also change any of the parameters that cause the hourly report to notify you or show errors. One I change is the number of scratch tapes required to be considered healthy, from 5 to 3. I also have the hourly report email me only if there is a problem, such as running out of scratch tapes or a log filling up. You may also want to look at per-node notifications, which would be very handy in larger IT organizations where backups are done on servers you do not care about but someone else does.

Mar 9


VCB is slowly evolving into a product I may actually feel safe using. Friday I had a ticket opened with VMware because I could not get VCB to correctly mount a virtual machine's vmdk for backup. At first it looked to me as though VCB did not actually see the LUN. I double-checked my SAN configuration and was confident it was a VMware problem. One test we used was manually mounting the LUN with vcbmounter. When we started to use mountvm to see if we could mount the vmdk, someone in the background asked what backup agent and version we used. I told the engineer to tell them we used Tivoli Storage Manager 5.4.0.2.

The distant voice let us know that 5.5 was out and had some special support for VCB. I already had 5.5.0.0 deployed for some clients, and the server was running 5.5.0.0 as well. The distant voice came on the line and let me know how cool the support in 5.5.0.0 was. I had already begun taking over the WebEx session and installing the 5.5.0.0 client on the VCB server. He let me know of a problem where TSM baclient 5.5.0.0 and VCB 1.1.0.64559 do not automatically create the directories needed for mounting the LUN. If you manually create the directories, they are deleted when VCB cleans up, leaving you with no directories again for the next backup. I found the fix for this issue: it’s a maintenance release for 5.5.0.0. TSM baclient 5.5.0.4 is what you need to upgrade to. This PDF talks about using VCB with Tivoli Storage Manager 5.5. Check out page 329 in the document, which is 347 of 402 in the PDF.

The fix for APAR IC54709 is included in 5.5.0.4. You can download the patch here that will take you to 5.5.0.4.

A few features I feel are key:
1. The VCB server backs up the virtual machines using proxynode. This means that you can install an agent on the virtual machines and do file level restores as if they were backed up like normal.
2. License savings for TSM if your VCB server requires less licensing than your physical ESX servers.

Once you set up everything and create a vmlist file with all of your virtual machines, the only thing left is using a macro in your schedules to run “dsmc backup vm”. The product seems to work pretty well after that. I’ll try to test some restores next week and let you all know how that turns out. I’m very interested in keeping an eye on VCB as it develops, and I certainly appreciate the integration that Tivoli and VMware are providing.
