These are a few of my favorite vROps dashboards.

The job of a VMware TAM is, among many things, to help guide our customers through various challenges such as monitoring and managing their environments. To some, monitoring may sound like a mundane and uncomplicated slice of your IT infrastructure, but it is not the same Boolean up-or-down gadget that we got by with 10 to 15 years ago. That method worked well enough when data centers were stocked with dozens of servers, not the thousands we see following the virtualization boom.

Let’s face it, managing and monitoring today’s data centers is a challenge that’s not for the faint of heart. Hardware is no longer dedicated to a single application or, thanks to hyperconvergence, a single purpose. The data center of the last decade could quite feasibly be consolidated down to just a handful of ESXi hosts. That is why it is more important than ever to continuously and meticulously comb over your environment. If you lose 20 critical virtual machines because you were unaware the datastore they were living on was filling up, how much would that cost your company? Moreover, would it send you on the hunt for a new job?

vRealize Operations Manager is a great tool to turn to when you are looking for trouble… in your environment, that is! vROps 6.6 introduces an entirely new HTML5-based Clarity UI that is designed to help you get the information you need faster, with persona-based dashboards and an entirely new Getting Started dashboard. I will take you on a tour through a few of my favorite dashboards that will help improve efficiency and reduce risk in your environment.



After logging in, the first thing you will see is the Recommended Actions home screen. Think of this dashboard as your daily task list. It is organized by object so you can see recommendations for your hosts, VMs, clusters, datastores, and more. You can also change the scope to look at objects across all vCenters or just one. These are all recommendations that can improve your environment’s health, risk, and efficiency.



Selecting health alerts shows you issues that need to be addressed immediately: for example, guests that are nearly out of disk space, physical networks that are down, or anything else that is going to cause you to have a bad day.

The risk alerts point out potential issues that could degrade your environment. Risk could be guest contention, vDS configuration issues, or hardening guide violations (if you have these enabled in your monitoring policy).

Efficiency alerts will point out “optimization opportunities” such as oversized VMs, large snapshots, and idle VMs.

If you only look at one dashboard in vROps (which would be a terrible idea), then this is the one. There’s a reason it is the first to appear in vROps after all. If you commit yourself to fixing at least one or two alerts a day or a week, depending on the scope of the issue, you will be well on your way to having a robust and optimized environment. After a few weeks, challenge yourself to fix even more! That way you will ace your next TAM Best Practice review!

Virtualization helps us squeeze every ounce of horsepower out of our hardware investments. However, there’s a fine line between maximizing your hardware investments and packing them tighter than a Beijing subway. One of my favorite dashboards that can quickly point out areas of opportunity is the Workload Balance dashboard. This dashboard is available by selecting Workload Balance from the left-hand menu on the home screen.



The Workload Balance dashboard is a great place to look at how well your datacenter resources are being utilized. This is where you would go to see how well your clusters are balanced and whether you have any hotspots in your environment. If necessary, you can also take action from here, such as rebalancing workloads between clusters! Yes, you read that right!



You can also check on the CPU and memory workloads for each cluster, as well as its DRS settings, which can be modified from within vROps.




My favorite widget is Capacity Utilization. It is split into three columns: underutilized, optimal, and overutilized. Your hosts and clusters will appear on the map based on their workload score, which is a single number that represents the percentage of that object’s most consumed resource. In this case, I have hovered over the east-mgmt cluster, which has a workload score of 93% based on its memory usage.
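To make the workload score concrete, here is a minimal Python sketch of the idea as described above: take the utilization percentage of each resource and report the highest one. The function name and the CPU figure are hypothetical (only the 93% memory figure comes from the example above), and this is an illustration of the concept rather than how vROps computes the score internally.

```python
def workload_score(utilization_pct):
    """Return (score, resource) for the most consumed resource.

    utilization_pct maps a resource name to its utilization as a
    percentage. The score is simply the highest of those percentages.
    """
    resource = max(utilization_pct, key=utilization_pct.get)
    return utilization_pct[resource], resource


# Hypothetical cluster: memory is at 93% (as in the east-mgmt example),
# while the CPU value is made up for illustration.
east_mgmt = {"cpu": 61.0, "memory": 93.0}
score, driver = workload_score(east_mgmt)
print(f"workload score: {score:.0f}% (driven by {driver})")
```

The point of collapsing everything into one number is that a cluster is only as healthy as its most constrained resource, so a cluster with plenty of CPU but almost no free memory still lands in the overutilized column.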



Clicking the details hyperlink for this object will take you to its analysis dashboards.



These are some of the most commonly used dashboards when you have questions about a specific object in your environment.



The workload dashboard breaks down resources and helps us understand how they are consumed. This view is a lot more granular than the performance overview in vCenter and can help you quickly identify problems. For example, there are two metrics to pay attention to when looking for contention: demand and usage. Demand is how much of a resource your virtual machines are asking for, and usage is the amount of that resource that the host is providing. If your VMs are demanding more resources than your host can provide, then you have contention.
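The relationship between demand, usage, and contention can be sketched in a few lines of Python. This is a deliberately simplified model, assuming a single resource measured in one unit (GHz here) and a fixed host capacity. The function names and numbers are hypothetical, and the contention metrics vROps actually reports are derived differently.

```python
def usage(demand, capacity):
    """What the host can actually deliver: demand, capped at capacity."""
    return min(demand, capacity)


def unmet_demand(demand, capacity):
    """Demand the host cannot satisfy; anything above zero means contention."""
    return max(0.0, demand - capacity)


# Hypothetical host with 48 GHz of CPU and VMs asking for 52 GHz in total.
host_capacity_ghz = 48.0
vm_demand_ghz = 52.0

print("usage:", usage(vm_demand_ghz, host_capacity_ghz))
print("unmet demand:", unmet_demand(vm_demand_ghz, host_capacity_ghz))
print("contention:", unmet_demand(vm_demand_ghz, host_capacity_ghz) > 0)
```

When demand stays at or below capacity, usage equals demand and the unmet demand is zero. The gap only opens up once the VMs collectively ask for more than the host has.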



I also like this view because it shows us how the resource is being consumed by breaking the bar graph up into the individual VMs consuming it. This makes it easier to identify the heavy hitters.

Now, let’s focus our attention on capacity. Let’s say your manager comes to you, says she needs 12 VMs deployed in the east-mgmt cluster, and hands you the specs. There are two types of administrators in this scenario: those who will deploy the VMs without looking to see if there is space, and those who carefully consider the requirements and available capacity. Which one are you?

If you are the type who just wings it and you have vROps, then I am sorry to say you are out of excuses. The capacity remaining dashboard takes the guesswork out of deciding whether you have the capacity to deploy more VMs. The capacity remaining badge represents the remaining capacity of your most consumed resource. In this case, it would not be a good idea to deploy any more VMs in this cluster because its memory and CPU have been completely consumed.



The capacity remaining badge is great, but vROps makes this even easier by specifically calling out how many VMs you can deploy (or in this case, how many VMs you are over by). vROps includes several VM configuration profiles out of the box, and creating your own is just a matter of clicking the plus sign.



Creating your own VM configuration profile is handy if you have a standard configuration for, say, your SQL VMs. You can simply create a new profile based on your standard configuration, and vROps will tell you how many of those VMs you can deploy. If you have a specific request for multiple VMs with unique CPU, memory, and storage requirements, then you could leverage vROps projects to perform what-if scenarios.
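The arithmetic behind "how many of these VMs fit" is worth seeing once. The Python sketch below assumes remaining capacity is already known per resource and that a profile is just a vCPU, memory, and disk requirement. `VMProfile`, `vms_remaining`, and all of the numbers are hypothetical, and vROps applies its own capacity models and buffers on top of anything this simple.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class VMProfile:
    """A hypothetical VM configuration profile."""
    vcpus: int
    memory_gb: float
    disk_gb: float


def vms_remaining(remaining, profile):
    """Number of VMs of this profile that fit in the remaining capacity.

    The most constrained resource decides the answer. A negative result
    means the cluster is already over capacity by that many VMs.
    """
    needs = {
        "vcpus": profile.vcpus,
        "memory_gb": profile.memory_gb,
        "disk_gb": profile.disk_gb,
    }
    return min(
        math.floor(remaining[name] / need)
        for name, need in needs.items()
        if need > 0
    )


sql_vm = VMProfile(vcpus=4, memory_gb=16.0, disk_gb=200.0)

# Hypothetical cluster with headroom: memory is the limiting resource.
print(vms_remaining({"vcpus": 40, "memory_gb": 96.0, "disk_gb": 2000.0}, sql_vm))

# Hypothetical cluster that is already 32 GB over on memory.
print(vms_remaining({"vcpus": 8, "memory_gb": -32.0, "disk_gb": 500.0}, sql_vm))
```

Taking the minimum across resources mirrors the badge logic described earlier: it does not matter how much disk is free if memory is already gone.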



Knowing how much capacity you have left is incredibly useful, but how do you know when it is time to expand and purchase more hardware? The time remaining dashboard will look at your consumption trend over the last 30 days (configurable by policy) and calculate how much time you have before you run out of resources.

What’s great is that it includes a customizable provisioning time buffer. Let’s say it takes you 60 days from the time you request new hardware, get a quote, submit the RFP, issue a PO, receive the hardware, and install it. You would set your provisioning time buffer to 60 days (or more), and when your remaining time dips below this threshold, your badge score will drop to 0, indicating it is time to expand. You can also set up alerts based on this.
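As a rough illustration of the idea, the Python sketch below fits a straight line to 30 days of usage samples, projects when that line reaches capacity, and compares the result to a provisioning buffer. The straight-line fit is an assumption made for this example (the actual vROps trending algorithm is not described here), and the function names and sample data are hypothetical.

```python
import math


def days_remaining(daily_usage, capacity):
    """Estimate days until usage reaches capacity using a linear trend.

    daily_usage is a list of samples, one per day, oldest first.
    Returns math.inf when the trend is flat or decreasing.
    """
    n = len(daily_usage)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(daily_usage) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(daily_usage))
    slope = sxy / sxx  # growth per day
    if slope <= 0:
        return math.inf
    fitted_today = mean_y + slope * ((n - 1) - mean_x)
    return max(0.0, (capacity - fitted_today) / slope)


def time_to_expand(daily_usage, capacity, buffer_days=60):
    """True when the projected time remaining is below the provisioning buffer."""
    return days_remaining(daily_usage, capacity) < buffer_days


# Hypothetical memory usage in GB over the last 30 days, growing 2 GB per day.
usage_gb = [500 + 2 * day for day in range(30)]

print(days_remaining(usage_gb, capacity=700))
print(time_to_expand(usage_gb, capacity=700, buffer_days=60))
print(time_to_expand(usage_gb, capacity=660, buffer_days=60))
```

The buffer is what turns a forecast into a decision: the same trend is fine against a 700 GB cluster but already late against a 660 GB one if procurement takes 60 days.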

Throwing money at a problem might be fun, but it is not always the most responsible approach. The last stop on our tour de dashboards is reclaimable capacity.



This dashboard gives us a great high-level overview of what resources can be reclaimed, that is to say, resources that have been allocated to our VMs but that they do not need. This dashboard, like the others, includes a badge score. This score is based on the resource with the highest percentage of reclaimable capacity. In this case, we can reclaim 22 vCPUs, or 22%. This is pretty exciting, considering the capacity remaining dashboard tells us we are out of CPU! When it is time to take action, click on the “virtual machine reclaimable capacity” hyperlink highlighted above. This link will take you to an actionable list of what resources can be reclaimed for each of your virtual machines.
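The badge logic described above can be sketched as follows. The 22 reclaimable vCPUs come from the example, and the total of 100 provisioned vCPUs is inferred from 22 being 22%. The memory and disk figures, along with the function names, are hypothetical and only there to show that the highest percentage wins.

```python
def reclaimable_pct(reclaimable, provisioned):
    """Reclaimable amount as a percentage of what has been provisioned."""
    return 100.0 * reclaimable / provisioned


def top_reclaimable(resources):
    """Return (resource, percentage) for the most reclaimable resource.

    resources maps a resource name to a (reclaimable, provisioned) pair.
    """
    percentages = {
        name: reclaimable_pct(reclaimable, provisioned)
        for name, (reclaimable, provisioned) in resources.items()
    }
    name = max(percentages, key=percentages.get)
    return name, percentages[name]


cluster = {
    "vcpu": (22, 100),        # 22 of 100 vCPUs, as in the example above
    "memory_gb": (40, 512),   # hypothetical
    "disk_gb": (300, 4000),   # hypothetical
}

resource, pct = top_reclaimable(cluster)
print(f"{resource}: {pct:.0f}% reclaimable")
```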



You just saved your company a boatload of money! Go ahead and take the rest of the day off. You have earned it!

This handful of dashboards will be sure to keep you busy, but they will also lead you down the path of having a super-efficient, well-managed vSphere environment. However, don’t stop here. Be sure to check out the new Getting Started dashboard, which will guide you through the most common vROps-related tasks. If you are new to vROps and don’t want to get your hands dirty in your production environment, then check out VMware’s Hands-on Labs today.


Matt Bradford is a VMware Healthcare Technical Account Manager based in New England. He loves the VMware community and has been a VMware vExpert since 2015. Matt also helps run the Boston VMware User Group (VMUG) as a member of the steering committee. Be sure to check out Matt’s personal blog at and say hi to him on Twitter @VMSpot.

The post These are a few of my favorite vROps dashboards. appeared first on VMware Professional Services and Education Insights.
