|
|
It would be nice to be able to cluster Nagios instances, perhaps with something built around MPI that would allow horizontal scaling without passive checks.
|
|
|
Have integrated performance graphs. Maybe talk to the developers of the various performance graphing addons and select one to be integrated into the service check display page.
Drill-downs would be nice (if technically possible).
|
|
|
The status map needs to be upgraded to a map with web-based features like configuration and design.
* Wouldn't it be great to have a status map where you can drag your hosts around and place them anywhere you want (of course linked by the parent/child relationship defined in the Nagios configuration)?
* Wouldn't it be great to have a global map and several sub-maps that you can configure via a web-based interface?
|
|
|
A modern web interface with better search features and the possibility to manage the position of hostgroups and servicegroups.
|
|
|
Configuration via a Web based interface
|
|
|
|
|
|
The Nagios core development team needs to be expanded to enhance future development
|
|
|
An easy engine to receive SNMP traps.
|
|
|
On the host information page, show the contacts associated with the host (helpful in large companies where the NOC is not always aware of who is responsible for a specific host).
|
|
|
Give Nagios a dashboard where alarms can be easily shown, where comments can be added (a ticket number, for example) and where alarms can be acknowledged. Everything should be on the same page or in custom filtered tabs. Widget use: the appearance of an alarm can change (e.g. blink) based on a criterion (more than 10 minutes old, escalation, ...).
|
|
|
It would be nice if you could create groups in the web interface, assign servers to those groups, and assign users to those groups. The second part of this would be to have the web interface show only the servers in the groups the user has been assigned to, unless they are a top-level user.
Update (10-Sep-2009): I have found that if you add a user with the same name as a contact in the Nagios config, that user will only see services/hosts that have them as a contact, either directly or through a contact group. It would be a good idea to make this more visible in the documentation.
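The visibility rule described in the update can be sketched as follows. This is only an illustration of the rule, not how the Nagios CGIs implement it; the data structures, names and the `top_level_users` parameter are all hypothetical.

```python
# Hypothetical data, for illustration only.
HOST_CONTACTS = {"web1": {"alice"}, "db1": set()}
HOST_CONTACT_GROUPS = {"web1": set(), "db1": {"dba"}}
CONTACT_GROUP_MEMBERS = {"dba": {"bob"}}


def visible_hosts(user, top_level_users=frozenset()):
    """Return the sorted list of hosts a web user is allowed to see."""
    if user in top_level_users:
        return sorted(HOST_CONTACTS)
    visible = []
    for host, contacts in HOST_CONTACTS.items():
        groups = HOST_CONTACT_GROUPS.get(host, set())
        # A user sees a host if they are a direct contact or belong to
        # one of the host's contact groups.
        in_group = any(user in CONTACT_GROUP_MEMBERS.get(g, set()) for g in groups)
        if user in contacts or in_group:
            visible.append(host)
    return sorted(visible)
```

A user who is neither a contact nor a group member gets an empty list, which matches the "only show servers in the group the user has been assigned to" request.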
|
|
|
Escalations are currently based on the number of failed checks. In some situations, however (need I spell SLA?), you have fixed times after which notifications for failed services must be escalated. It would be nice to have some means to specify this in Nagios instead of adding it via event handlers. Calculating an iteration count from the specified check interval is insufficient, because checks might be delayed.
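A minimal sketch of what time-based escalation could look like, assuming the escalation level is chosen from the wall-clock time since the problem started rather than from a check count. The table, group names and function are hypothetical and not part of Nagios.

```python
from datetime import datetime, timedelta

# Hypothetical escalation table: (minimum elapsed time, contact group).
# Entries must be sorted by ascending threshold.
ESCALATION_STEPS = [
    (timedelta(minutes=0), "oncall"),
    (timedelta(minutes=30), "team-lead"),
    (timedelta(hours=2), "management"),
]


def escalation_target(problem_start, now, steps=ESCALATION_STEPS):
    """Return the contact group for the time elapsed since the failure."""
    elapsed = now - problem_start
    target = steps[0][1]
    for threshold, group in steps:
        if elapsed >= threshold:
            target = group
    return target
```

Because the decision depends only on timestamps, a delayed check does not shift the escalation point.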
|
|
|
Nagios' dependency support is excellent, but it doesn't expose dependencies for view within the web interface. It would be great if the web interface would show dependencies.
This is similar to the benefit of showing host/child relationships in the web interface -- can you even imagine dealing with Nagios if your only view of the structure was a single flat table of all hosts and services?
I realize this view could get complicated quickly -- but how about a simple pop-up browser window next to hosts and services with dependencies? In that window, you'd show the dependency tree that Nagios already builds. This would also make it a great dependency-debugging tool.
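As a rough sketch of the pop-up content, the dependency tree could be rendered as indented text. The `DEPENDS_ON` map and the `host/service` naming are hypothetical; a real implementation would read the dependency definitions Nagios already parses.

```python
# Hypothetical dependency map: object -> objects it depends on.
DEPENDS_ON = {
    "web1/HTTP": ["db1/MySQL", "web1/PING"],
    "db1/MySQL": ["db1/PING"],
}


def render_tree(name, depends_on, depth=0, path=frozenset()):
    """Return indented lines showing what `name` depends on."""
    line = "  " * depth + name
    if name in path:
        # Guard against circular dependencies in the configuration.
        return [line + " (cycle)"]
    lines = [line]
    for dep in depends_on.get(name, []):
        lines.extend(render_tree(dep, depends_on, depth + 1, path | {name}))
    return lines
```

Marking cycles instead of recursing forever is what would make such a view useful for debugging a broken dependency configuration.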
|
|
|
It would be very helpful if you could provide an API to communicate with the back end. Front-end developers would then have much more freedom to change the front end as they want.
|
|
|
The ability to apply a schedule (daily, weekly, monthly, yearly) to downtimes would provide a valuable way to add known exceptions to monitoring. It saves creating more and more timeperiods for known outages that occur every week.
For example, on Sunday morning the APP A servers are all rebooted and it takes one hour. Instead of adding a new time period which excludes that hour, you could put in a recurring downtime for every Sunday.
Additionally, downtimes should be applicable to hostgroups and servicegroups, so that in the case above, once a server is added to the APP A hostgroup it inherits the related downtime.
Additional benefits are:
* the downtime and its comment (which explains the reasoning) are easily viewable for the server
* there are events in the logs to show the server/service was in downtime
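The Sunday reboot example can be sketched as a weekly recurring downtime attached to a hostgroup. Everything here (the `DOWNTIMES` structure, the group and host names, the 02:00 to 03:00 window) is a hypothetical illustration, not an existing Nagios feature.

```python
from datetime import datetime, time

# Hypothetical recurring downtime: every Sunday 02:00-03:00 for hostgroup "app-a".
# weekday follows datetime.weekday(): Monday is 0, Sunday is 6.
DOWNTIMES = [
    {"hostgroup": "app-a", "weekday": 6, "start": time(2, 0), "end": time(3, 0),
     "comment": "Weekly reboot of APP A servers"},
]
HOSTGROUPS = {"app-a": {"appa01", "appa02"}}


def active_downtime(host, when, downtimes=DOWNTIMES, hostgroups=HOSTGROUPS):
    """Return the comment of the downtime covering `host` at `when`, or None."""
    for d in downtimes:
        in_group = host in hostgroups.get(d["hostgroup"], set())
        in_window = (when.weekday() == d["weekday"]
                     and d["start"] <= when.time() < d["end"])
        if in_group and in_window:
            return d["comment"]
    return None
```

Because membership is resolved at lookup time, a server added to the hostgroup later picks up the downtime without any further configuration.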
|
|
|
Plug-ins should be able to be made aware of previous values, not just previous states.
This is the NUMBER ONE BIGGEST deficiency with the current monitoring architecture of Nagios.
Nagios can't monitor RATES, it can only monitor STATES. Plugins have worked around this by keeping their own separate cache of values, stored in databases or text files. But some mechanism should exist in the core.
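The workaround mentioned above, where a plugin keeps its own cache of the previous value, might look like the sketch below. The state file layout and the function name are assumptions made for illustration; the request is for the core to provide the previous value so that plugins do not need this.

```python
import json
import os
import time


def read_rate(state_path, counter_value, now=None):
    """Return the per-second rate since the previous call, or None if unknown."""
    now = time.time() if now is None else now
    previous = None
    if os.path.exists(state_path):
        with open(state_path) as fh:
            previous = json.load(fh)
    # Store the current sample for the next run.
    with open(state_path, "w") as fh:
        json.dump({"value": counter_value, "time": now}, fh)
    if previous is None or now <= previous["time"]:
        return None
    delta = counter_value - previous["value"]
    if delta < 0:
        # The counter was reset or wrapped; no meaningful rate this time.
        return None
    return delta / (now - previous["time"])
```

The first run can only record a sample, so it returns None; a plugin would typically report UNKNOWN or OK in that case.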
|
|
|
Should be able to create reports on the fly in CSV, PDF or any other rich text format.
|
|
|
Bring back the DB backend, along with a proper admin tool, preferably web-based or a GUI. This is critical for Nagios to flourish in enterprises.
|
|
|
Currently the NAGIOS configuration for hosts supports the definition of a network hierarchy via the "parents" option. For example this would normally relate to the physical switch that a server is plugged into. In most enterprise environments there is a difference between the physical (layer 2) and logical (layer 3) topologies. Improving NAGIOS to have the capability to support multiple (arbitrary) network topology layers would be a great advantage and help it in the enterprise space. This would allow more accurate identification of hosts that are down or unreachable in the event of certain network failures.
To implement this would require changes to the configuration definition to allow parents to be defined for different topologies. The map view should also provide the option to select the specific topology to view. The logic used to classify if a host is unreachable instead of down would also need to be modified to take into account the different topologies. Ideally notifications could be sent detailing the specific parents/topology layer causing a host to be unreachable.
e.g. in nagios.cfg:
    label_parents_2 Layer 2
    label_parents_3 Layer 3
Host definition:
    parents_2 switch1,switch2
    parents_3 router1,router2
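A simplified sketch of how the down/unreachable decision could take several topology layers into account. It assumes a host counts as unreachable when every parent in at least one layer is down, and it ignores the recursive parent checks that the real reachability logic performs. The `PARENTS` structure mirrors the proposed `parents_2` / `parents_3` options and is hypothetical.

```python
# Hypothetical per-layer parents, mirroring the proposed parents_2 / parents_3.
PARENTS = {
    "web1": {"Layer 2": ["switch1", "switch2"], "Layer 3": ["router1", "router2"]},
}


def classify(host, parents, down):
    """Return (state, blocking_layers) for `host` given the set of down hosts."""
    if host not in down:
        return "UP", []
    # A layer blocks the host when all of its parents in that layer are down.
    blocking = [
        layer
        for layer, layer_parents in parents.get(host, {}).items()
        if layer_parents and all(p in down for p in layer_parents)
    ]
    return ("UNREACHABLE" if blocking else "DOWN"), blocking
```

Returning the blocking layers is what would allow a notification to name the specific topology layer that makes the host unreachable.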
|
|
|
An addition to the status maps to view cabinet space and layout, maybe something in AJAX, where you can drag and drop your server icons.
|
|
|
It would be great to be able to easily acknowledge multiple alerts in Nagios by selecting them all from one screen and just entering the acknowledgement details once.
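One way a front end could do this today is to generate one `ACKNOWLEDGE_SVC_PROBLEM` external command per selected service from a single form submission. The sketch below only builds the command lines; the field order (host, service, sticky, notify, persistent, author, comment) follows the external command documentation as I understand it and should be checked against your Nagios version. The function name and defaults are hypothetical.

```python
import time


def ack_commands(problems, author, comment,
                 sticky=1, notify=1, persistent=1, now=None):
    """Build one acknowledgement command line per (host, service) pair."""
    ts = int(time.time() if now is None else now)
    return [
        f"[{ts}] ACKNOWLEDGE_SVC_PROBLEM;{host};{service};"
        f"{sticky};{notify};{persistent};{author};{comment}"
        for host, service in problems
    ]
```

Each resulting line would then be written to the Nagios external command file, so the acknowledgement details are entered once and applied to every selected alert.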
|
|
|
Give Nagios a web service interface, such as JSON (e.g. Nagios2JSON, www.yannj.fr), XML, or even a complete SOAP implementation.
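As a rough idea of what a JSON status payload could contain, here is a minimal sketch. The field names and the tuple-based input are assumptions for illustration and do not reflect the output of Nagios2JSON or any existing interface.

```python
import json


def status_to_json(services):
    """Serialize (host, service, state) tuples into a JSON document."""
    payload = {
        "services": [
            {"host": host, "service": service, "state": state}
            for host, service, state in services
        ]
    }
    # sort_keys gives stable output, which makes responses easier to diff.
    return json.dumps(payload, sort_keys=True)
```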
|
|
|
Currently only one address is supported for host entries. But a lot of modern servers have more than one network interface (bonding, teaming, failover, etc.). On some operating systems, like Linux, a failover configuration has only one IP address. On other systems (Windows, Solaris) you have one address per interface and a virtual one added as an alias to the physical interface.
Additionally, we have hosts with interfaces in several networks. But each is still one host, so it should need only one configuration.
|
|
|
I believe bundling all the functionality into Nagios itself is a poor approach to innovation. I believe the Nagios core should be a small agile core component (to a degree it is). I believe building on that and creating further rich APIs to support external 3rd party tool development will be best. That way people can choose which configuration tool they want to use. They can choose what web UI to use (or desktop app, or phone app, or what-not). As long as the APIs are in place, this will help facilitate 3rd party development which will in turn improve adoption and user retention.
|
|
|
Writing HTML in C not only makes it extremely difficult for web developers to customise the look and feel of the web interface, it also makes it difficult to follow exactly where a particular piece of HTML comes from. Separating presentation and code logic is one of the big goals of modern application development.
Using a library like ClearSilver, Nagios can provide fast, powerful templates using native C. ClearSilver is a templating language already proven in use by Trac, Yahoo, Google and others, and it's easy for web developers to read. Best of all, the Hierarchical Data Format (HDF) can be read and written as plain text, which makes testing and debugging easier and also decouples the web front end from the polling system. It would be possible to switch from the CGI polling Nagios every time it is loaded, to Nagios producing an HDF file each time the state changes and the CGI simply using those static HDF files to generate its content.
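To give a feel for the static HDF idea, the sketch below flattens a nested status structure into `name = value` lines with dotted hierarchical names, which is my understanding of the simplest form of HDF. It does not use the ClearSilver library, does not handle multi-line values, and the `Hosts.0.Name` key layout is a hypothetical choice.

```python
def to_hdf(data, prefix=""):
    """Flatten a nested dict into dotted 'name = value' lines."""
    lines = []
    for key, value in data.items():
        name = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict):
            lines.extend(to_hdf(value, name))
        else:
            lines.append(f"{name} = {value}")
    return lines
```

Hosts are keyed by index rather than by name here because dots in a host name would otherwise be read as extra hierarchy levels.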
|