I found that the site was showing a white page for many people. This turned out to be a technical issue with OSE Firewall for WordPress. I have disabled the plugin for now and am using an alternative. I apologize for the inconvenience.
This is a brand new session this year covering a newer offering from VMware. Logs, logs, logs: we all have them. Common problems with logs are not having the knowledge to interpret them, time synchronization, and honestly just having too many of them. I chose this session to see how this solution could fit in my toolbox and whether it is ready for use.
First, let's take a quick peek at the license model from VMware’s website while we wait for the session to start.
VMware vCenter Log Insight is licensed on a per operating system instance (OSI) basis, which is defined as any server, virtual or physical, with an IP address that generates logs, including network devices and storage arrays.
You can find more info on this right here.
An individual in the office of the CTO realized we were generating logs at an alarming rate. He justified a small team of developers to create Log Insight. It accepts logs via syslog and allows you to search them. The slide deck for this session will be available to view at home, and it has an appendix with dozens of additional slides not covered today.
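Since the product takes its input over syslog, a quick way to confirm a source is reaching it is to send a test event yourself. Here is a minimal sketch using Python's standard `logging.handlers.SysLogHandler` over UDP. The function name, the `li-test` tag, and the default port 514 are my own assumptions for illustration and are not from the session.

```python
import logging
import logging.handlers


def make_syslog_logger(host: str, port: int = 514) -> logging.Logger:
    """Return a logger that forwards records to a syslog listener over UDP.

    host/port point at the log collector; 514 is the conventional syslog
    port and is an assumption here, not something stated in the session.
    """
    logger = logging.getLogger(f"li-test-{host}-{port}")
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep test events out of the root logger
    if not logger.handlers:
        # A (host, port) tuple with no socktype means UDP datagrams.
        handler = logging.handlers.SysLogHandler(address=(host, port))
        # "li-test" is a hypothetical tag that makes the event easy to search for.
        handler.setFormatter(logging.Formatter("li-test: %(message)s"))
        logger.addHandler(handler)
    return logger
```

Calling `make_syslog_logger("loginsight.example.com").info("test event")` (the hostname is a placeholder) should produce one event you can then search for in the UI.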
These tips will help when getting Log Insight installed.
Before power on, add disk
Add 100 GB to start and go from there
Have at least one source configured before install
There is no spell checker in the network info area, so double check what you type
Data-core should be the disk you added + 97GB; this is the storage for events
Log in and change the password for root, which will enable SSH
Now you can connect to the URL
With no license you get 60 days of use
NTP is very important to configure - see my other blog posts on NTP
A read only vCenter account will be needed
With the current release, you may wish to limit this to 2 vCenters
For the vCOPs account you will need admin
Make sure to click enable launch in context
Getting logs from the whole stack is key
vCenter - might need some extra attention, look at appendix
View - treat like vCenter above
Use tagging to make search easier
This is the secret sauce
A content pack contains the KNOWLEDGE!!!!!!
It contains queries, alerts, dashboards, and field extractions.
HyTrust, NetFlow Logic, Puppet, Cisco, EMC, NetApp, and VCE all have packs today
For the first few days, look at the utilization
On a regular basis check for dropped packets inside of the GUI
Enable Data Archiving (output to NFS directory)
To review an archive, use a new instance of vC LI
Rough guide: 250MB per day per ESX host and 50MB per day for other devices
About 16 GB needed to retain 30 days of logs per node.
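The per-day figures from the rough guide can be turned into a quick sizing estimate. This is only a sketch: it uses the 250MB and 50MB numbers from the session, reads the 50MB as a per-day figure for each non-ESX device, and assumes 1024 MB per GB. The function and constant names are hypothetical, and it does not try to reproduce the 16 GB rule of thumb.

```python
# Per-day log volume from the session's rough guide.
MB_PER_DAY_ESX = 250
MB_PER_DAY_OTHER = 50  # assumption: per device, per day


def estimate_storage_gb(esx_hosts: int, other_devices: int,
                        retention_days: int) -> float:
    """Estimate raw log storage in GB for the given retention window."""
    daily_mb = esx_hosts * MB_PER_DAY_ESX + other_devices * MB_PER_DAY_OTHER
    return daily_mb * retention_days / 1024  # assumption: 1 GB = 1024 MB
```

For example, 4 ESX hosts and 10 other devices kept for 30 days works out to roughly 44 GB of raw logs, which is a useful sanity check against the disk you added before power on.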
When calling support, log in on the CLI and run loginsight-support
You can also use the UI, at the bottom of the health page under system administration
Back up at the image level
vC Ops can launch in context
You can create alerts via log queries
Updates require a short outage but are done in place.
1.0 will do updates via rpm on the CLI: "rpm -Uvh file_name"
If you used the wrong IP address, use vApp options to fix it or reinstall.
The product does not have any compatibility restrictions, so developers are releasing content fast
Get a source working first
Add disk at the beginning
Make sure SMTP, vC, and vCOPs are all working correctly
Get a good system email address for alerts to go to
Monitor disk and processor in the beginning
Use data archiving
Make sure the entire technology stack is reporting
Update as often as you can
Point the product at the vCOPs UI VM.
Listen to the recording for a cameo by me when I say “dashboards” to get the presenter back on his train of thought. Michael White is an excellent presenter and it was my pleasure to help.
vCOPs provides me with excellent data and information. The problem is building enough knowledge to understand and translate what I am seeing.
An example is shown of peak disk IO being very different from average IO over 7 days. Commands per second is what vCOPs calls IO. Drilling into the peak shows it lasts for a few minutes within a 1 hour window. Then the metric graphs are used to add vCenter commands per second for the top 5 VMs. This is compared for each peak to find a common VM which is causing the peak in disk IO usage.
Next we look at the suspicious VM and compare it with the vCenter total IOPS report. The lines align, so this is the VM. The question now is whether it is read or write intensive. The result: a SQL box with write-intensive peak usage every day. Knowledge is needed to suspect a SQL agent on the box. Every day at 4:45 this SQL agent is configured to run multiple scheduled jobs. These jobs could be divided to run over multiple time slots at non-peak times.
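The "compare the top 5 for each peak" step is really a set intersection. Here is a minimal sketch of that idea. It assumes you have already pulled the top VM names for each peak by hand or by export; the function name and the VM names in the example are made up, and nothing here calls a vCOPs API.

```python
def common_vms(peaks: list[list[str]]) -> list[str]:
    """Return the VMs that show up in the top list of every peak, sorted by name."""
    if not peaks:
        return []
    common = set(peaks[0])
    for top_vms in peaks[1:]:
        common &= set(top_vms)  # keep only VMs seen in this peak too
    return sorted(common)
```

With hypothetical input such as `common_vms([["sql01", "web02", "app03"], ["sql01", "file01"], ["web09", "sql01"]])`, the only VM left standing is `sql01`, which is the one to chase with the read versus write question.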
The VMware management blog has the recording of this demo to reuse in your own environments. It was called Analyze and Optimize. Click Here for the Blog Post
Several useful dashboards are displayed. These have scoreboards showing things like the total capacity of clusters and current memory and CPU usage, colored by health and including reserve space for things like failover or procurement time buffers.
Capacity risk is reported based on how you set your knobs. I have a general rule I use in my environments: 6, 7, 8, 9, it’s resume time. It is catchy and helps me remember its purpose and value.
60% – Analyze and attempt to reduce.
70% – Begin procurement of additional resources.
80% – Stop provisioning new workloads.
90% – Watch closely, actively move workloads out.
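The rule above is easy to encode if you want to drive a report or a script from it. This is a small sketch of the thresholds as a lookup; the names are my own, and the "no action" answer below 60% is an assumption, since the rule does not say anything about that range.

```python
# Thresholds and actions from the "6, 7, 8, 9" rule above,
# listed from the highest threshold down so the first match wins.
CAPACITY_RULES = [
    (90, "Watch closely, actively move workloads out"),
    (80, "Stop provisioning new workloads"),
    (70, "Begin procurement of additional resources"),
    (60, "Analyze and attempt to reduce"),
]


def capacity_action(utilization_pct: float) -> str:
    """Map a utilization percentage to the action the rule calls for."""
    for threshold, action in CAPACITY_RULES:
        if utilization_pct >= threshold:
            return action
    return "No action needed"  # assumption: the rule is silent below 60%
```

Checking from the top down means a cluster at 85% gets the 80% action only, not all three of the actions it has already passed.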