Feb 24

In the past I have had to create on-call rotations for use in an operations environment. The goal of a rotation like this is to let employees plan for the times when they may be required to be close, available, and on alert. This translates into knowing when best to take that camping trip, vacation, or simply “disconnect” to a certain degree (like turning off email alerts, but still answering phone calls).

These 9 guiding statements were used with the following high-level method. The purpose of the guidelines was to explain the responsibilities and make clear that resolution was not only the on-call person’s responsibility.

A list was used to say who is primary on-call and secondary on-call. The rotation was simply alphabetical: A, then B, then C, then D. The individuals on-call were on-call for the team and might not be subject matter experts on whatever had an issue.

  1. Be Point of Contact until not needed
  2. Be on-call secondary after on-call
  3. Maintain your status in on-call list
  4. Long running issue, use secondary in list
  5. Update status in on-call list on Monday
  6. Respond to alerts
  7. Respond to phone call contact
  8. Stay Rested
  9. Resolve if possible
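
The alphabetical rotation, combined with guideline 2 (the outgoing primary becomes the next secondary), can be sketched in a few lines of Python. This is only an illustration under assumptions: the weekly cadence, the team names, and the `on_call_for_week` function are all hypothetical and not taken from the original list.

```python
def on_call_for_week(team, week):
    """Return (primary, secondary) for a zero-based week number.

    Assumes a weekly rotation in alphabetical order, where last week's
    primary serves as this week's secondary (guideline 2).
    """
    if len(team) < 2:
        raise ValueError("need at least two people for primary and secondary")
    roster = sorted(team)
    primary = roster[week % len(roster)]
    # The person before the primary in the roster was primary last week.
    secondary = roster[(week - 1) % len(roster)]
    return primary, secondary


# Hypothetical team; the names are placeholders.
team = ["Dana", "Alex", "Casey", "Blake"]
for week in range(5):
    primary, secondary = on_call_for_week(team, week)
    print(f"week {week}: primary={primary}, secondary={secondary}")
```

Publishing a few weeks of this output ahead of time is what lets people plan time away around their turn.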

Feb 21
Site Issues Resolved
Trace | Technical | February 21st, 2014

I found that the site was showing a blank white page for many people. This turned out to be a technical issue with OSE Firewall for WordPress. I have disabled this plugin for now and I am using an alternative. I apologize for the inconvenience.


Aug 29

This is a brand new session this year covering a newer offering from VMware. Logs, logs, logs: we all have them. Common problems with logs are not having the knowledge to interpret them, time synchronization, and honestly just having too many of them. I chose this session to see how this solution could fit in my toolbox and whether it is ready for use.

First, let's take a quick peek at the license model from VMware’s website while we wait for the session to start.

VMware vCenter Log Insight is licensed on a per operating system instance (OSI) basis, which is defined as any server, virtual or physical, with an IP address that generates logs, including network devices and storage arrays.

You can find more info on this right here.

An individual in the office of the CTO realized we were generating logs at an alarming rate. He justified a small team of developers to create Log Insight. It accepts logs via syslog and allows you to search them. This session will have a slide deck to view at home that has an appendix with dozens of additional slides not covered today.
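
Since the product takes its input over syslog, a quick way to prove a source can reach a collector is to send one test message. The sketch below uses Python's standard `logging.handlers.SysLogHandler` over UDP. It is not from the session; the `make_syslog_logger` helper and the logger name are made up for illustration, and the demo sends to a throwaway listener on localhost so it runs without a real collector. For a real test you would pass your own collector's FQDN and the standard syslog port 514.

```python
import logging
import logging.handlers
import socket


def make_syslog_logger(host, port=514):
    """Build a logger that forwards records to a syslog collector over UDP."""
    logger = logging.getLogger("loginsight-test")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    handler = logging.handlers.SysLogHandler(address=(host, port))
    logger.addHandler(handler)
    return logger


# Stand-in collector: a UDP socket on localhost (assumption for the demo).
listener = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
listener.settimeout(2)

log = make_syslog_logger("127.0.0.1", listener.getsockname()[1])
log.info("test message from a new log source")

# Read back what a collector would have received.
data, _ = listener.recvfrom(4096)
print(data.decode(errors="replace"))
listener.close()
```

UDP gives no delivery confirmation, so the real check is searching for the test message on the collector side.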

When installing Log Insight, you will appreciate these tips.

Use FQDN
Before power on, add disk
Add 100 GB to start and go from there
Have at least one source configured before install
No spell checker in the network info area, double check
Data-core should be the disk you added + 97GB; this is storage for events

Configure
Log in and change the password for root, which will enable SSH
Now you can connect to the URL
With no license you get 60 days of use
NTP is very important to configure - see my other blog posts on NTP
A read only vCenter account will be needed
With the current release, you may wish to limit to 2 vCenters
For the vCOPs account you will need admin
Make sure to click enable launch in context

Sources
Whole stack is key
Storage
Networking
ESXi
vCenter - might need some extra attention, look at appendix
View - treat like vCenter above

Tagging
Use tagging to make search easier

Content Packs

This is the secret sauce
A content pack contains the KNOWLEDGE!!!!!!
It contains queries, alerts, dashboards, and field extractions.
HyTrust, NetFlow Logic, Puppet, Cisco, EMC, NetApp, VCE all have packs today

Scalability
For the first few days, look at the utilization
On a regular basis check for dropped packets inside of the GUI
Enable Data Archiving (output to NFS directory)
If you use an archive to review, use a new instance of vC LI
Rough guide of 250MB per day per ESX host and 50MB for other devices
About 16 GB needed to retain 30 days of logs per node.
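
The rough guide can be turned into a quick back-of-the-envelope calculation. The sketch below reads the guide as 250MB per day per ESX host and 50MB per day for each other device, which is an interpretation of the notes rather than a vendor formula. The function name and the example host counts are hypothetical, and the result is raw ingest only: it ignores any compression or indexing overhead.

```python
ESX_MB_PER_DAY = 250    # rough per-host figure from the session notes
OTHER_MB_PER_DAY = 50   # rough per-device figure from the session notes


def estimate_raw_log_gb(esx_hosts, other_devices, retention_days):
    """Estimate raw log volume in GB over the retention window."""
    daily_mb = esx_hosts * ESX_MB_PER_DAY + other_devices * OTHER_MB_PER_DAY
    return daily_mb * retention_days / 1024


# Hypothetical environment: 10 ESX hosts, 20 other devices, 30 days retained.
print(f"{estimate_raw_log_gb(10, 20, 30):.1f} GB")
```

Treat the number as a starting point and compare it against actual utilization during the first few days.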

Miscellaneous

When calling support, log in on the CLI and run loginsight-support
You can also use the UI, at the bottom of the health page under system administration
Backup via Image level
vC Ops can launch in context
You can create alerts via log queries
Updates require a short outage but are done in place.
Version 1.0 does updates via rpm on the CLI: "rpm -Uvh file_name"
If you used the wrong IP address, use vApp options to fix it or reinstall.
Product does not have any compatibility restrictions so developers are releasing content fast

Summary

Source working first
Add disk at beginning
Make sure SMTP, vC, and vCOPs are all working correctly
Get a good system email address for alerts to go to
Monitor disk and processor in the beginning
Use data archiving
Make sure entire technology stack is reporting
Update as often as you can

Point the product at the vCOPs UI VM.

Here are some screenshots of the product in action.

20130829-150520.jpg
20130829-150420.jpg
20130829-150433.jpg

Listen to the recording for a cameo by me when I say “dashboards” to get the presenter back onto his train of thought. Michael White is an excellent presenter and it was my pleasure to help.

