Welcome to Surveil’s documentation!¶
Surveil¶
Monitoring as a Service
An OpenStack-related project designed to provide highly available, scalable and flexible monitoring for OpenStack.
Project Info¶
- Documentation: https://surveil.readthedocs.org/
- IRC: #surveil at Freenode
- Issue tracker: https://waffle.io/surveil/surveil-meta (Also on GitHub)
- Open Gerrit Changesets: https://review.openstack.org/#/q/status:open+surveil,n,z
Tutorials¶
Using Surveil¶
Installing Surveil¶
Surveil is currently packaged for CentOS 7. You can install it via our custom repositories.
0. Installing the repositories¶
Install the RDO repositories with the following command:
yum install -y https://rdoproject.org/repos/rdo-release.rpm
Install the Surveil repositories with the following command:
yum install -y yum-utils
yum-config-manager --add-repo http://yum.surveil.io/centos_7/
1. Installing Surveil¶
All-in-One installation: surveil-full¶
Surveil does not work with SELinux yet. To disable it, use the following commands:
echo 0 > /sys/fs/selinux/enforce
sed -i 's/SELINUX=.*/SELINUX=disabled/g' /etc/selinux/config
Install surveil-full with the following command:
yum install -y surveil-full --nogpgcheck
Due to an issue with MongoDB presenting itself as running before it is ready, start it 20 seconds before the other services:
systemctl start mongod.service
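As an alternative to a fixed 20-second wait, the delay can be replaced by polling. The sketch below is only an illustration: the helper name wait_for_port is hypothetical, 27017 is MongoDB's default port, and an open TCP port is only a rough proxy for readiness.

```python
import socket
import time


def wait_for_port(host, port, timeout=20.0, interval=0.5):
    """Return True once a TCP connection to (host, port) succeeds.

    Returns False if no connection could be made before `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError while nothing is listening.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, wait_for_port("127.0.0.1", 27017, timeout=60.0) could be called after starting mongod.service and before starting surveil-full.target.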
Launch all surveil services with the following command:
systemctl start surveil-full.target
The surveil-init command will flush existing MongoDB Alignak config, create an InfluxDB database and upload configuration templates to Alignak:
surveil-init --mongodb --influxdb --packs
The surveil-webui-init command will pre-create data sources in Grafana:
surveil-webui-init -H localhost -U root -P root -p 8086 -N db -g "http://localhost/surveil/grafana"
2. Testing the API¶
You should now be able to use the API:
surveil status-host-list
surveil config-host-list
3. Surveil Web UI¶
Access the Surveil Web UI at http://localhost:80/surveil
Monitoring a host with passive checks¶
Surveil allows for both passive monitoring and polling. In this guide, we will create a host and send passive check results to it.
0. Creating the host and service¶
With the Surveil CLI:
surveil config-host-create --host_name passive_check_host --address 127.0.0.1
surveil config-service-create --host_name passive_check_host --service_description passive_check_service --passive_checks_enabled 1 --check_command _echo --max_check_attempts 4 --check_interval 5 --retry_interval 3 --check_period "24x7" --notification_interval 30 --notification_period "24x7" --contacts admin --contact_groups admins
surveil config-reload
1. Sending check results¶
With the Surveil CLI:
surveil status-submit-check-result --host_name passive_check_host --service_description passive_check_service --output "Hello!" --return_code 0
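The same check result can also be submitted over HTTP. The sketch below only builds the request for the POST /v2/status/hosts/(host_name)/services/(service_description)/results endpoint listed in the V2 Web API reference, using the CheckResult fields documented there. The function name is hypothetical, the base URL is assumed to be the SURVEIL_API_URL value used elsewhere in this documentation, and authentication is left out.

```python
import json
import urllib.request
from urllib.parse import quote


def build_check_result_request(api_url, host_name, service_description,
                               output, return_code):
    """Build (without sending) a POST request carrying a CheckResult body."""
    url = "{}/status/hosts/{}/services/{}/results".format(
        api_url.rstrip("/"),
        quote(host_name, safe=""),
        quote(service_description, safe=""),
    )
    # Only the fields shown in the CheckResult data sample are used.
    body = json.dumps({"output": output, "return_code": return_code})
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would send it. Whether extra headers (for example a token) are required depends on how the API is deployed.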
Monitoring with your custom plugin¶
Surveil is compatible with Nagios plugins, and writing a custom plugin to monitor your application is straightforward. In this guide, we will create a new plugin and configure a new host that uses it in Surveil.
0. Install the plugin¶
Surveil supports Nagios plugins. For more information about them, please refer to the Nagios plugin API documentation.
There are many plugins available on the web. For example, the nagios-plugins project contains many plugins written in C and the monitoring-tools project contains many plugins written in Python.
Surveil loads plugins from /usr/lib/monitoring/plugins/. In this example, we will install a simple fake plugin written in Bash:
echo -e '#!/bin/bash\necho "DISK $1 OK - free space: / 3326 MB (56%); | /=2643MB;5948;5958;0;5968"' | sudo tee /usr/lib/monitoring/plugins/custom/check_example
sudo chmod +x /usr/lib/monitoring/plugins/custom/check_example
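A plugin can be written in any language as long as it follows the Nagios conventions: one line of output, optional performance data after a pipe, and an exit code of 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN). The following is a minimal Python sketch of the same idea as the Bash example. The function names and the hard-coded measurement are illustrative assumptions, not part of Surveil.

```python
import sys

# Exit codes defined by the Nagios plugin conventions.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATE_NAMES = {OK: "OK", WARNING: "WARNING",
               CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def check_usage(label, used_percent, warn, crit):
    """Return (exit_code, output_line) for a percentage-style measurement."""
    if used_percent >= crit:
        code = CRITICAL
    elif used_percent >= warn:
        code = WARNING
    else:
        code = OK
    # Performance data after the pipe: label=value[UOM];warn;crit;min;max
    line = "{} {} - {}% used | {}={}%;{};{};0;100".format(
        label.upper(), STATE_NAMES[code], used_percent,
        label, used_percent, warn, crit)
    return code, line


def main():
    # The measured value is hard-coded, as in the fake Bash plugin.
    code, line = check_usage("disk", 56, 85, 90)
    print(line)
    sys.exit(code)
```

Saved as an executable file that calls main(), this would behave like the Bash plugin: it prints one status line and exits with the matching code.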
1. Create a host using this plugin¶
Now that you are done developing your plugin, it is time to use it in Surveil.
Creating a command¶
Before you can use your plugin in a host/service configuration, you need to create an Alignak command:
surveil config-command-create --command_name check_example --command_line '$CUSTOMPLUGINSDIR$/check_example $HOSTADDRESS$'
Creating a host¶
Create a host with the following command:
surveil config-host-create --host_name check_example_host --address savoirfairelinux.com --use generic-host
Creating a Service¶
Create a service with the following command:
surveil config-service-create --host_name check_example_host --service_description check_example_service --check_command "check_example" --max_check_attempts 4 --check_interval 5 --retry_interval 3 --check_period "24x7" --notification_interval 30 --notification_period "24x7" --contacts admin --contact_groups admins
Reload the config¶
Reload the config. This tells Alignak to load the new configuration, including the new host and service:
surveil config-reload
Show the new service¶
Show the service list with this command:
surveil status-service-list
You should see the service you just added in the list with the correct status (it could take a minute or two for the result to show).
Heat AutoScaling with Surveil¶
When used with OpenStack integration, Surveil exports metrics to Ceilometer. This allows for auto scaling based on application metrics with Heat.
For example, the autoscaling.yml template below scales the stack up when the average number of users connected (via SSH) to its machines exceeds three, and scales it down when that average drops below one.
autoscaling.yml¶
heat_template_version: 2013-05-23
description: Creates an autoscaling group based on Surveil's metrics
parameters:
  image:
    type: string
    default: rhel7-updated
    description: Image used for servers
  key:
    type: string
    default: < USER KEY HERE >
    description: SSH key to connect to the servers
  flavor:
    type: string
    default: c1.small
    description: flavor used by the web servers
  network_public:
    type: string
    default: public-01
    description: Public network used by the server
  network_private:
    type: string
    default: private-01
    description: Private network used by the server
  monitoring_server:
    type: string
    default: < SURVEIL SERVER IP HERE >
    description: Monitoring server address to allow connections from
resources:
  asg:
    type: OS::Heat::AutoScalingGroup
    properties:
      min_size: 1
      max_size: 6
      resource:
        type: OS::Nova::Server
        properties:
          flavor: {get_param: flavor}
          image: {get_param: image}
          key_name: {get_param: key}
          networks:
            - network: {get_param: network_public}
            - network: {get_param: network_private}
          security_groups:
            - default
            - sysadmin
            - insecure
          metadata:
            metering.stack: {get_param: "OS::stack_id"}
            surveil_tags: linux-system-nrpe
          user_data_format: RAW
          user_data:
            str_replace:
              template: |
                #!/bin/bash -v
                rpm -Uvh http://dl.fedoraproject.org/pub/epel/7/x86_64/e/epel-release-7-5.noarch.rpm
                yum install -y nrpe wget bc svn
                yum install -y nagios-plugins-users nagios-plugins-disk nagios-plugins-load --disablerepo=rhel-7-server-openstack-6.0-rpms
                mkdir -p /usr/lib64/nagios/plugins/sfl-monitoring-tools/check_users
                svn checkout https://github.com/savoirfairelinux/monitoring-tools/tags/0.3.2/plugins/check-cpu /usr/lib64/nagios/plugins/sfl-monitoring-tools/check_cpu
                svn checkout https://github.com/savoirfairelinux/monitoring-tools/tags/0.3.2/plugins/check-mem /usr/lib64/nagios/plugins/sfl-monitoring-tools/check_mem
                wget https://raw.githubusercontent.com/fpeyre/nagios-plugins/master/check_swap -P /usr/lib64/nagios/plugins/sfl-monitoring-tools/check_swap/
                chmod +x /usr/lib64/nagios/plugins/sfl-monitoring-tools/check_swap/check_swap
                chmod +x /usr/lib64/nagios/plugins/sfl-monitoring-tools/check_users/check_users.sh
                sed -i 's/^allowed_hosts=.*$/allowed_hosts=$monitoring_server/' /etc/nagios/nrpe.cfg
                echo "command[check_disk]=/usr/lib64/nagios/plugins/check_disk -w 85 -c 90 " >> /etc/nagios/nrpe.cfg
                echo "command[check_cpu]=/usr/lib64/nagios/plugins/sfl-monitoring-tools/check_cpu/check_cpu -w 80 -c 90 " >> /etc/nagios/nrpe.cfg
                echo "command[check_memory]=/usr/lib64/nagios/plugins/sfl-monitoring-tools/check_mem/check_mem -u -w 80.0 -c 90.0 " >> /etc/nagios/nrpe.cfg
                echo "command[check_swap]=/usr/lib64/nagios/plugins/sfl-monitoring-tools/check_swap/check_swap 20 10 " >> /etc/nagios/nrpe.cfg
                echo "command[check_users]=/usr/lib64/nagios/plugins/check_users -w 2 -c 4 " >> /etc/nagios/nrpe.cfg
                systemctl enable nrpe
                systemctl start nrpe
              params:
                $monitoring_server: {get_param: monitoring_server}
  server_scaleup_policy:
    type: OS::Heat::ScalingPolicy
    properties:
      adjustment_type: change_in_capacity
      auto_scaling_group_id: {get_resource: asg}
      cooldown: 30
      scaling_adjustment: 1
  server_scaledown_policy:
    type: OS::Heat::ScalingPolicy
    properties:
      adjustment_type: change_in_capacity
      auto_scaling_group_id: {get_resource: asg}
      cooldown: 30
      scaling_adjustment: -1
  users_alarm_high:
    type: OS::Ceilometer::Alarm
    properties:
      description: Scale-up if the average connected users is > 3 for 1 minute
      meter_name: SURVEIL_users
      statistic: avg
      period: 60
      evaluation_periods: 1
      threshold: 3
      alarm_actions:
        - {get_attr: [server_scaleup_policy, alarm_url]}
      matching_metadata: {'stack': {get_param: "OS::stack_id"}}
      comparison_operator: gt
  users_alarm_low:
    type: OS::Ceilometer::Alarm
    properties:
      description: Scale-down if the average connected users is < 1 for 1 minute
      meter_name: SURVEIL_users
      statistic: avg
      period: 60
      evaluation_periods: 1
      threshold: 1
      alarm_actions:
        - {get_attr: [server_scaledown_policy, alarm_url]}
      matching_metadata: {'stack': {get_param: "OS::stack_id"}}
      comparison_operator: lt
outputs:
  scale_up_url:
    description: >
      This URL is the webhook to scale up the autoscaling group. You
      can invoke the scale-up operation by doing an HTTP POST to this
      URL; no body nor extra headers are needed.
    value: {get_attr: [server_scaleup_policy, alarm_url]}
  scale_dn_url:
    description: >
      This URL is the webhook to scale down the autoscaling group.
      You can invoke the scale-down operation by doing an HTTP POST to
      this URL; no body nor extra headers are needed.
    value: {get_attr: [server_scaledown_policy, alarm_url]}
  ceilometer_query:
    value:
      str_replace:
        template: >
          ceilometer statistics -m SURVEIL_users
          -q metadata.user_metadata.stack=$stackval -p 600 -a avg
        params:
          $stackval: { get_param: "OS::stack_id" }
    description: >
      This is a Ceilometer query for statistics on the SURVEIL_users meter
      Samples about OS::Nova::Server instances in this stack. The -q
      parameter selects Samples according to the subject's metadata.
      When a VM's metadata includes an item of the form metering.X=Y,
      the corresponding Ceilometer resource has a metadata item of the
      form user_metadata.X=Y and samples about resources so tagged can
      be queried with a Ceilometer query term of the form
      metadata.user_metadata.X=Y. In this case the nested stacks give
      their VMs metadata that is passed as a nested stack parameter,
      and this stack passes a metadata of the form metering.stack=Y,
      where Y is this stack's ID.
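To make the scaling rules in the template easier to follow, here is a simplified Python sketch of the decision they encode: average the SURVEIL_users samples of one period, compare against the two thresholds (gt 3, lt 1), and keep the group between min_size and max_size. It is not how Ceilometer or Heat evaluate alarms internally. Cooldown and alarm state transitions are ignored, and the function names are hypothetical.

```python
def scaling_adjustment(samples, high=3, low=1):
    """Return +1, -1 or 0 for one evaluation period of SURVEIL_users samples."""
    if not samples:
        # No data: leave the group untouched (a choice made for this sketch).
        return 0
    avg = sum(samples) / len(samples)
    if avg > high:   # users_alarm_high: comparison_operator gt, threshold 3
        return 1
    if avg < low:    # users_alarm_low: comparison_operator lt, threshold 1
        return -1
    return 0


def next_group_size(current, samples, min_size=1, max_size=6):
    """Apply the adjustment and clamp to the group's min_size/max_size."""
    wanted = current + scaling_adjustment(samples)
    return max(min_size, min(max_size, wanted))
```

Note that an average of exactly three users does not trigger a scale-up, because the comparison is strict.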
Contributing¶
Getting started with Surveil¶
0. Prerequisite¶
Surveil’s development environment is based on Docker and docker-compose.
First you need to install Docker. Refer to the project installation documentation.
You can install docker-compose with the following command:
sudo pip install -U docker-compose
1. Starting the containers¶
You will then be able to use the environment with the following commands:
- sudo docker-compose up: Launch Surveil and its dependencies in containers.
- sudo docker-compose down: Kill the active docker containers, if any.
- sudo docker-compose rm: Remove all containers, if any.
- sudo docker-compose build: Build the docker images.
Configuration for the different services running in the Docker containers is stored in tools/docker.
After running sudo docker-compose up, you should be able to access all services at the ports configured in the docker-compose.yml file.
- Surveil API: http://localhost:5311/v1/hello
- Bansho (surveil web interface): http://localhost:8888 (any login info is fine)
- InfluxDB: http://localhost:8083 (user:root pw:root)
- Grafana: http://localhost:80 (user:admin pw:admin)
- Shinken WebUI: http://localhost:7767/all (user:admin pw:admin)
After about 40 seconds, a script will be executed to create fake hosts in the Surveil configuration. You should see it in the docker-compose logs.
The Surveil container mounts your local project folder and pecan reloads every time the project files change thus providing a proper development environment.
Note: Fedora users might want to uncomment the privileged: true line in docker-compose.yml if they face permission issues.
2. Interacting with the API¶
You can use the python-surveilclient CLI to interact with the API.
Install it with the following command:
sudo pip install -U python-surveilclient
You’ll need to provide the Surveil API URL. You can do this with the --surveil-api-url parameter, but it’s easier to set it as an environment variable:
export SURVEIL_API_URL=http://localhost:5311/v2
export SURVEIL_AUTH_URL=http://localhost:5311/v2/auth
Viewing host status¶
You can use the CLI to view the status of the currently monitored hosts and services with surveil status-host-list and surveil status-service-list.
Example output:
+-------------------------------+---------------+-------+------------+-----------------------------------+
| host_name                     | address       | state | last_check | plugin_output                     |
+-------------------------------+---------------+-------+------------+-----------------------------------+
| srv-ldap-01                   | 127.0.0.1     | UP    | 1431712968 | OK - 127.0.0.1: rta 0.036ms, l... |
| sw-iwebcore-01                | 127.0.0.1     | UP    | 1431712971 | OK - 127.0.0.1: rta 0.041ms, l... |
| os-controller-1.cloud.mtl.sfl | 145.50.1.61   | UP    | 1431713146 | OK - 172.20.1.21: rta 0.453ms,... |
| os-compute-1.cloud.mtl.sfl    | 145.50.1.62   | UP    | 1431713144 | OK - 172.20.1.31: rta 0.318ms,... |
| os-compute-2.cloud.mtl.sfl    | 145.50.1.63   | UP    | 1431713144 | OK - 172.20.1.32: rta 0.378ms,... |
| os-compute-3.cloud.mtl.sfl    | 145.50.1.64   | UP    | 1431713146 | OK - 172.20.1.33: rta 0.373ms,... |
| os-compute-4.cloud.mtl.sfl    | 145.50.1.65   | UP    | 1431713146 | OK - 172.20.1.34: rta 0.337ms,... |
+-------------------------------+---------------+-------+------------+-----------------------------------+
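If you want to post-process this output in a script, the table can be turned into records. The helper below is a hypothetical sketch, not part of python-surveilclient. It assumes the layout shown above (border lines starting with "+", rows delimited by "|") and does not handle cells that themselves contain a "|" character.

```python
def parse_cli_table(text):
    """Parse an ASCII table as printed by the CLI into a list of dicts."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip +---+ border lines and blank lines
        rows.append([cell.strip() for cell in line.strip("|").split("|")])
    if not rows:
        return []
    # The first row holds the column names; the rest are data rows.
    header, data = rows[0], rows[1:]
    return [dict(zip(header, row)) for row in data]
```

Each returned dict maps a column name such as host_name or state to the text of the corresponding cell.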
You can also use the CLI to view the configured hosts and services in the API with surveil config-host-list and surveil config-service-list.
Adding a new host¶
The Surveil CLI provides a command to add hosts:
surveil config-host-create --host_name openstackwebsite --address openstack.org
This will configure a new host in Surveil. However, it won’t be monitored until Surveil’s config is reloaded. You can do this with the CLI:
surveil config-reload
It will take from 5 to 10 seconds for Surveil to start monitoring the host. After this delay, you will be able to consult the host status with the CLI:
surveil status-host-list
Using Bansho, the web interface¶
The Surveil client uses the Surveil API to query information concerning hosts and services. Bansho (Surveil’s web interface) also uses this API. To use Bansho, simply open a browser at http://localhost:8888 and press login.
Developing the API¶
Launching the stack¶
If you have completed the Getting started with Surveil tutorial, you should know how to launch the stack:
sudo docker-compose up
Editing the code¶
The Surveil container mounts your local project folder, and pecan reloads every time the project files change, thus providing a proper development environment.
For example, edit the surveil/api/controllers/v2/hello.py file and replace "Hello World!" with "Hello Devs!".
After you save the file, the following logs will appear in Surveil’s output:
surveil_1 | Some source files have been modified
surveil_1 | Restarting server...
You should be able to test your modification by accessing http://localhost:5311/v2/hello with your browser.
Disabling permissions¶
Depending on what you are working on, it might be practical to disable permissions. This can be done by editing the policy.json file found at etc/surveil/policy.json.
For example, you could modify the following lines:
"admin_required": "role:admin or is_admin:1",
"surveil_required": "role:surveil or rule:admin_required",
"surveil:admin": "rule:admin_required",
"surveil:authenticated": "rule:surveil_required",
to:
"admin_required": "@",
"surveil_required": "@",
"surveil:admin": "@",
"surveil:authenticated": "@",
This will modify permissions so that all API calls that require the admin rule now pass without any verification.
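For a throwaway development setup, the same change can be scripted. This is a minimal sketch under stated assumptions: the helper names are hypothetical, the rule names are the four shown above, and "@" is the policy syntax for "always allow". It should never be run against a production policy file.

```python
import json

# The four rules shown above.
RULES_TO_OPEN = ("admin_required", "surveil_required",
                 "surveil:admin", "surveil:authenticated")


def open_policy(policy, rules=RULES_TO_OPEN):
    """Return a copy of the policy dict with the given rules set to "@"."""
    opened = dict(policy)
    for rule in rules:
        if rule in opened:
            opened[rule] = "@"
    return opened


def rewrite_policy_file(path):
    """Load a policy.json file, open the rules, and write it back."""
    with open(path) as f:
        policy = json.load(f)
    with open(path, "w") as f:
        json.dump(open_policy(policy), f, indent=4, sort_keys=True)
```

Calling rewrite_policy_file("etc/surveil/policy.json") from the project root would apply the edit in place. Rules not listed in RULES_TO_OPEN are left unchanged.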
Developing the API without Docker¶
You can also set up a development environment without Docker:
git clone https://review.openstack.org/stackforge/surveil
cd surveil
virtualenv env
source env/bin/activate
pip install -r requirements.txt
python setup.py develop
python setup.py install_data
surveil-api -p env/etc/surveil/config.py -a env/etc/surveil/api_paste.ini -c env/etc/surveil/surveil.cfg -r
Edit your config files:
vim env/etc/surveil/config.py
vim env/etc/surveil/surveil.cfg
vim env/etc/surveil/policy.json
vim env/etc/surveil/api_paste.ini
Don’t forget to start your databases (MongoDB and InfluxDB).
Running the tests¶
Using tox¶
Surveil is tested and supported on Python 2.7 and Python 3.4. The project uses tox to manage tests.
The following command will run the tests for Python 3.4, Python 2.7, Flake8 and Docs:
tox
You can also run only one set of tests by specifying the tox environment to run (see tox.ini for more details):
tox -epy27
Building the docs¶
To build the docs, simply run tox -edocs. The docs will be available in the doc/build/html folder. After every commit, docs are automatically built on readthedocs and hosted on surveil.readthedocs.org.
Integration tests¶
Integration tests are run nightly on test.savoirfairelinux.net. You can run them on your machine with tox -eintegration. Before you launch the command, make sure that you don’t have any other Surveil containers running, as they may interfere with the integration tests. Integration tests will create multiple containers on your machine.
Web API¶
V1 Web API¶
Hosts¶
- POST /v1/hosts
  Create a new host.
  Parameters: data (Host) – a host within the request body.
- PUT /v1/hosts/(host_name)
  Modify this host.
  Parameters: data (Host) – a host within the request body.
- DELETE /v1/hosts/(host_name)
  Delete this host.
- GET /v1/hosts/(host_name)/services
  Returns all services associated with this host.
  Return type: list(Service)
- GET /v1/hosts/(host_name)/services/(service_name)/(service_description)
  Returns a specific service.
  Return type: Service
- POST /v1/hosts/(host_name)/results
  Submit a new check result.
  Parameters: data (CheckResult) – a check result within the request body.
- POST /v1/hosts/(host_name)/services/(service_description)/results
  Submit a new check result.
  Parameters: data (CheckResult) – a check result within the request body.
type CheckResult
Data samples:
- Json
{ "output": "CPU Usage 98%|c[cpu]=98%;80;95;0;100", "return_code": 0, "time_stamp": "1409087486" }
- XML
<value> <time_stamp>1409087486</time_stamp> <return_code>0</return_code> <output>CPU Usage 98%|c[cpu]=98%;80;95;0;100</output> </value>
Attributes:
- output (unicode): The output of the check.
- return_code (int): The return code of the check.
- time_stamp (unicode): The time the check was executed. Defaults to now.
type Host
Data samples:
- Json
{ "address": "192.168.1.254", "check_period": "24x7", "contact_groups": "router-admins", "contacts": "admin,carl", "custom_fields": { "OS_AUTH_URL": "http://localhost:8080/v2" }, "host_name": "bogus-router", "max_check_attempts": 5, "notification_interval": 30, "notification_period": "24x7", "use": "generic-host" }
- XML
<value> <host_name>bogus-router</host_name> <address>192.168.1.254</address> <max_check_attempts>5</max_check_attempts> <check_period>24x7</check_period> <contacts>admin,carl</contacts> <contact_groups>router-admins</contact_groups> <notification_interval>30</notification_interval> <notification_period>24x7</notification_period> <use>generic-host</use> <custom_fields> <item> <key>OS_AUTH_URL</key> <value>http://localhost:8080/v2</value> </item> </custom_fields> </value>
Attributes:
- address (unicode): The address of the host. Normally, this is an IP address.
- check_period (unicode): The time period during which active checks of this host can be made.
- contact_groups (unicode): List of the short names of the contact groups that should be notified.
- contacts (unicode): A list of the short names of the contacts that should be notified.
- custom_fields (dict(unicode: unicode)): Custom fields for the host.
- host_name (unicode): The name of the host.
- use (unicode): The template to use for this host.
Services¶
- POST /v1/services
  Create a new service.
  Parameters: data (Service) – a service within the request body.
type Service
Data samples:
- Json
{ "check_command": "check-disk!/dev/sdb1", "check_interval": 5, "check_period": "24x7", "contact_groups": "linux-admins", "contacts": "surveil-ptl,surveil-bob", "host_name": "sample-server", "max_check_attempts": 5, "notification_interval": 3, "notification_period": "24x7", "retry_interval": 3, "service_description": "check-disk-sdb" }
- XML
<value> <host_name>sample-server</host_name> <service_description>check-disk-sdb</service_description> <check_command>check-disk!/dev/sdb1</check_command> <max_check_attempts>5</max_check_attempts> <check_interval>5</check_interval> <retry_interval>3</retry_interval> <check_period>24x7</check_period> <notification_interval>3</notification_interval> <notification_period>24x7</notification_period> <contacts>surveil-ptl,surveil-bob</contacts> <contact_groups>linux-admins</contact_groups> </value>
Commands¶
- POST /v1/commands
  Create a new command.
  Parameters: data (Command) – a command within the request body.
- PUT /v1/commands/(command_name)
  Modify this command.
  Parameters: data (Command) – a command within the request body.
- DELETE /v1/commands/(command_name)
  Delete this command.
type Command
Data samples:
- Json
{ "command_line": "/bin/check_http", "command_name": "check_http" }
- XML
<value> <command_name>check_http</command_name> <command_line>/bin/check_http</command_line> </value>
Attributes:
- command_line (unicode): This directive is used to define what is actually executed by Shinken.
- command_name (unicode): The name of the command.
V2 Web API¶
Config¶
Hosts¶
- PUT /v2/config/hosts
  Create a new host.
  Parameters: data (Host) – a host within the request body.
- PUT /v2/config/hosts/(host_name)
  Modify this host.
  Parameters: data (Host) – a host within the request body.
- DELETE /v2/config/hosts/(host_name)
  Delete this host.
- GET /v2/config/hosts/(host_name)/services
  Returns all services associated with this host.
  Return type: list(Service)
- GET /v2/config/hosts/(host_name)/services/(service_name)/(service_description)
  Returns a specific service.
  Return type: Service
- DELETE /v2/config/hosts/(host_name)/services/(service_name)/(service_description)
  Delete a specific service.
Services¶
- PUT /v2/config/services
  Create a new service.
  Parameters: data (Service) – a service within the request body.
type Service
Data samples:
- Json
{ "check_command": "check-disk!/dev/sdb1", "check_interval": 5, "check_period": "24x7", "contact_groups": [ "linux-admins" ], "contacts": [ "surveil-ptl", "surveil-bob" ], "host_name": [ "sample-server" ], "max_check_attempts": 5, "notification_interval": 3, "notification_period": "24x7", "passive_checks_enabled": "1", "retry_interval": 3, "service_description": "check-disk-sdb" }
- XML
<value> <host_name> <item>sample-server</item> </host_name> <service_description>check-disk-sdb</service_description> <check_command>check-disk!/dev/sdb1</check_command> <max_check_attempts>5</max_check_attempts> <check_interval>5</check_interval> <retry_interval>3</retry_interval> <check_period>24x7</check_period> <notification_interval>3</notification_interval> <notification_period>24x7</notification_period> <contacts> <item>surveil-ptl</item> <item>surveil-bob</item> </contacts> <contact_groups> <item>linux-admins</item> </contact_groups> <passive_checks_enabled>1</passive_checks_enabled> </value>
Commands¶
- PUT /v2/config/commands
  Create a new command.
  Parameters: data (Command) – a command within the request body.
- PUT /v2/config/commands/(command_name)
  Modify this command.
  Parameters: data (Command) – a command within the request body.
- DELETE /v2/config/commands/(command_name)
  Delete this command.
Business impact modulations¶
- POST /v2/config/businessimpactmodulations
  Returns all business impact modulations.
  Parameters: data (LiveQuery)
  Return type: list(BusinessImpactModulation)
- PUT /v2/config/businessimpactmodulations
  Create a new business impact modulation.
  Parameters: data (BusinessImpactModulation) – a business impact modulation within the request body.
  Return type: BusinessImpactModulation
Check modulations¶
- POST /v2/config/checkmodulations
  Returns all check modulations.
  Parameters: data (LiveQuery)
  Return type: list(CheckModulation)
- PUT /v2/config/checkmodulations
  Create a new check modulation.
  Parameters: data (CheckModulation) – a check modulation within the request body.
Notification ways¶
- POST /v2/config/notificationways
  Returns all notification ways.
  Parameters: data (LiveQuery)
  Return type: list(NotificationWay)
- PUT /v2/config/notificationways
  Create a new notification way.
  Parameters: data (NotificationWay) – a notification way within the request body.
types documentation¶
type Command
Data samples:
- Json
{ "command_line": "/bin/check_http", "command_name": "check_http" }
- XML
<value> <command_name>check_http</command_name> <command_line>/bin/check_http</command_line> </value>
Attributes:
- command_line (unicode): This directive is used to define what is actually executed by Shinken.
- command_name (unicode): The name of the command.
type Host
Data samples:
- Json
{ "address": "192.168.1.254", "check_period": "24x7", "contact_groups": [ "router-admins" ], "contacts": [ "admin", "carl" ], "custom_fields": { "OS_AUTH_URL": "http://localhost:8080/v2" }, "host_name": "bogus-router", "max_check_attempts": 5, "notification_interval": 30, "notification_period": "24x7", "use": [ "generic-host" ] }
- XML
<value> <host_name>bogus-router</host_name> <address>192.168.1.254</address> <max_check_attempts>5</max_check_attempts> <check_period>24x7</check_period> <contacts> <item>admin</item> <item>carl</item> </contacts> <contact_groups> <item>router-admins</item> </contact_groups> <notification_interval>30</notification_interval> <notification_period>24x7</notification_period> <use> <item>generic-host</item> </use> <custom_fields> <item> <key>OS_AUTH_URL</key> <value>http://localhost:8080/v2</value> </item> </custom_fields> </value>
Attributes:
- address (unicode): The address of the host. Normally, this is an IP address.
- check_period (unicode): The time period during which active checks of this host can be made.
- contact_groups (list(unicode)): List of the short names of contact groups that should be notified.
- contacts (list(unicode)): A list of the short names of the contacts that should be notified.
- custom_fields (dict(unicode: unicode)): Custom fields for the host.
- host_name (unicode): The name of the host.
- use (list(unicode)): The template to use for this host.
type CheckResult
Data samples:
- Json
{ "output": "CPU Usage 98%|c[cpu]=98%;80;95;0;100", "return_code": 0, "time_stamp": "1409087486" }
- XML
<value> <time_stamp>1409087486</time_stamp> <return_code>0</return_code> <output>CPU Usage 98%|c[cpu]=98%;80;95;0;100</output> </value>
Attributes:
- output (unicode): The output of the check.
- return_code (int): The return code of the check.
- time_stamp (unicode): The time the check was executed. Defaults to now.
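The output field in the sample follows the Nagios plugin convention of putting performance data after a pipe, in the form label=value[UOM];warn;crit;min;max. The sketch below shows one way to split such a string on the client side. It is an illustration only, not Surveil's own parser: the function name is hypothetical, and quoted labels containing spaces are not handled.

```python
import re

# A number (optionally negative or decimal) followed by an optional unit.
_VALUE_RE = re.compile(r"^(-?\d+(?:\.\d+)?)(.*)$")


def parse_perfdata(output):
    """Split plugin output into (text, metrics).

    metrics maps each perfdata label to a dict with a float "value", a
    "unit", and whichever of warning/critical/min/max were present
    (kept as strings).
    """
    text, _, perf = output.partition("|")
    metrics = {}
    for item in perf.split():
        label, sep, rest = item.partition("=")
        if not sep:
            continue  # not a label=value pair
        fields = rest.split(";")
        match = _VALUE_RE.match(fields[0])
        if not match:
            continue  # value is not numeric
        metric = {"value": float(match.group(1)), "unit": match.group(2)}
        names = ("warning", "critical", "min", "max")
        for name, raw in zip(names, fields[1:]):
            metric[name] = raw
        metrics[label] = metric
    return text.strip(), metrics
```

Applied to the sample output above, this yields the text "CPU Usage 98%" and a single metric labelled c[cpu] with value 98.0 and unit "%".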
type CheckModulation
Data samples:
- Json
{ "check_command": "check_ping_night", "check_period": "night", "checkmodulation_name": "ping_night" }
- XML
<value> <checkmodulation_name>ping_night</checkmodulation_name> <check_command>check_ping_night</check_command> <check_period>night</check_period> </value>
type NotificationWay
Data samples:
- Json
{ "host_notification_commands": [ "notify-host" ], "host_notification_options": [ "d", "u", "r", "f", "s" ], "host_notification_period": "24x7", "notificationway_name": "email_in_day", "service_notification_commands": [ "notify-service" ], "service_notification_options": [ "w", "u", "c", "r", "f" ], "service_notification_period": "24x7" }
- XML
<value> <notificationway_name>email_in_day</notificationway_name> <host_notification_period>24x7</host_notification_period> <service_notification_period>24x7</service_notification_period> <host_notification_options> <item>d</item> <item>u</item> <item>r</item> <item>f</item> <item>s</item> </host_notification_options> <service_notification_options> <item>w</item> <item>u</item> <item>c</item> <item>r</item> <item>f</item> </service_notification_options> <host_notification_commands> <item>notify-host</item> </host_notification_commands> <service_notification_commands> <item>notify-service</item> </service_notification_commands> </value>
Status¶
Events¶
webprefix: /v2/status/events/
Hosts¶
- POST /v2/status/hosts
  Given a LiveQuery, returns all matching hosts.
  Parameters: query (LiveQuery)
  Return type: list(LiveHost)
- GET /v2/status/hosts/(host_name)/config
  Returns config from a specific host.
- POST /v2/status/hosts/(host_name)/results
  Submit a new check result.
  Parameters: data (CheckResult) – a check result within the request body.
- GET /v2/status/hosts/(host_name)/metrics
  Returns all metric names for a host.
  Return type: list(Metric)
- GET /v2/status/hosts/(host_name)/metrics/(metric_name)
  Returns the last measure for the metric name on the host.
  Return type: Metric
- POST /v2/status/hosts/(host_name)/metrics/(metric_name)
  Given a live query, returns all matching metrics.
  Parameters: live_query – a live query within the request body.
  Return type: list(Metric)
- POST /v2/status/hosts/(host_name)/services/(service_description)/results
  Submit a new check result.
  Parameters: data (CheckResult) – a check result within the request body.
Services¶
- GET /v2/status/services
  Returns all services.
  Return type: list(LiveService)
- POST /v2/status/services
  Given a LiveQuery, returns all matching services.
  Parameters: query (LiveQuery)
  Return type: list(LiveService)
types documentation¶
type LiveService
Data samples:
- Json
{ "acknowledged": true, "description": "Serves Stuff", "host_name": "Webserver", "last_check": 1429220785, "last_state_change": 1429220785.481679, "long_output": "Serves /var/www/\nServes /home/webserver/www/", "plugin_output": "HTTP OK - GOT NICE RESPONSE", "service_description": "Apache", "state": "OK" }
- XML
<value> <host_name>Webserver</host_name> <service_description>Apache</service_description> <description>Serves Stuff</description> <state>OK</state> <acknowledged>true</acknowledged> <last_check>1429220785</last_check> <last_state_change>1429220785.48</last_state_change> <plugin_output>HTTP OK - GOT NICE RESPONSE</plugin_output> <long_output>Serves /var/www/ Serves /home/webserver/www/</long_output> </value>
Attributes:
- acknowledged (bool): Whether or not the problem, if any, has been acknowledged.
- description (unicode): The description of the service.
- host_name (unicode): The host for the service.
- last_check (int): The last time the service was checked.
- last_state_change (float): The last time the state has changed.
- long_output (unicode): Plugin long output of the last check.
- plugin_output (unicode): Plugin output of the last check.
- service_description (unicode): The name of the service.
- state (unicode): The current state of the service.
type LiveHost
Data samples:
- Json
{ "acknowledged": true, "address": "127.0.0.1", "childs": [ "surveil.com" ], "description": "Very Nice Host", "host_name": "CoolHost", "last_check": 1429220785, "last_state_change": 1429220785, "long_output": "The ping was great\nI love epic ping-pong games", "parents": [ "parent.com" ], "plugin_output": "PING OK - Packet loss = 0%, RTA = 0.02 ms", "services": [ "load", "cpu", "disk_usage" ], "state": "OK" }
- XML
<value> <host_name>CoolHost</host_name> <address>127.0.0.1</address> <childs> <item>surveil.com</item> </childs> <parents> <item>parent.com</item> </parents> <description>Very Nice Host</description> <state>OK</state> <acknowledged>true</acknowledged> <last_check>1429220785</last_check> <last_state_change>1429220785</last_state_change> <plugin_output>PING OK - Packet loss = 0%, RTA = 0.02 ms</plugin_output> <long_output>The ping was great I love epic ping-pong games</long_output> <services> <item>load</item> <item>cpu</item> <item>disk_usage</item> </services> </value>
-
acknowledged
¶ Type: bool Whether or not the problem, if any, has been acknowledged
-
address
¶ Type: unicode The address of the host
-
childs
¶ Type: list(unicode) The children of the host
-
description
¶ Type: unicode The description of the host
-
host_name
¶ Type: unicode The name of the host
-
last_check
¶ Type: int The last time the host was checked
-
last_state_change
¶ Type: int The last time the state has changed
-
long_output
¶ Type: unicode Plugin long output of the last check
-
parents
¶ Type: list(unicode) The parents of the host
-
plugin_output
¶ Type: unicode Plugin output of the last check
-
services
¶ Type: list(unicode) The services of the host
-
state
¶ Type: unicode The current state of the host
-
type
Metric
¶ Data samples:
- Json
{ "critical": "100", "max": "100", "metric_name": "pl", "min": "0", "unit": "s", "value": "0", "warning": "100" }
- XML
<value> <metric_name>pl</metric_name> <max>100</max> <min>0</min> <critical>100</critical> <warning>100</warning> <value>0</value> <unit>s</unit> </value>
-
critical
¶ Type: unicode Critical value for the metric
-
max
¶ Type: unicode Maximum value for the metric
-
metric_name
¶ Type: unicode Name of the metric
-
min
¶ Type: unicode Minimal value for the metric
-
unit
¶ Type: unicode Unit of the metric
-
value
¶ Type: unicode Current value of the metric
-
warning
¶ Type: unicode Warning value for the metric
-
type
TimeInterval
¶ Hold a time.
Data samples:
- Json
{ "end_time": "2015-01-29T22:50:44Z", "start_time": "2015-01-29T21:50:44Z" }
- XML
<value> <start_time>2015-01-29T21:50:44Z</start_time> <end_time>2015-01-29T22:50:44Z</end_time> </value>
-
end_time
¶ Type: unicode The ending time.
-
start_time
¶ Type: unicode The starting time.
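The TimeInterval timestamps in the sample are ISO 8601 strings with a trailing `Z` (UTC). A small Python sketch of reading and producing that format follows; the helper names are illustrative and not part of Surveil.

```python
from datetime import datetime, timedelta, timezone

# Format of the timestamps in the TimeInterval data sample.
TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def parse_time(value: str) -> datetime:
    """Parse a timestamp such as 2015-01-29T22:50:44Z as UTC."""
    return datetime.strptime(value, TIME_FORMAT).replace(tzinfo=timezone.utc)


def interval_length(interval: dict) -> timedelta:
    """Return end_time minus start_time for a TimeInterval dict."""
    return parse_time(interval["end_time"]) - parse_time(interval["start_time"])


def last_hour(now: datetime) -> dict:
    """Build a TimeInterval covering the hour that ends at `now`."""
    start = now - timedelta(hours=1)
    return {
        "start_time": start.strftime(TIME_FORMAT),
        "end_time": now.strftime(TIME_FORMAT),
    }


sample = {
    "end_time": "2015-01-29T22:50:44Z",
    "start_time": "2015-01-29T21:50:44Z",
}
print(interval_length(sample))
```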
-
type
Event
¶ Data samples:
- Json
{ "alert_type": "SERVICE", "attempts": 4, "event_type": "ALERT", "host_name": "CoolHost", "notification_method": "notify-service-by-email", "notification_type": "", "output": "WARNING - load average: 5.04, 4.67, 5.04", "service_description": "Apache Service", "state": "CRITICAL", "state_type": "HARD", "time": "2015-06-04T18:55:12Z" }
- XML
<value> <time>2015-06-04T18:55:12Z</time> <event_type>ALERT</event_type> <host_name>CoolHost</host_name> <service_description>Apache Service</service_description> <state>CRITICAL</state> <state_type>HARD</state_type> <attempts>4</attempts> <notification_type /> <notification_method>notify-service-by-email</notification_method> <alert_type>SERVICE</alert_type> <output>WARNING - load average: 5.04, 4.67, 5.04</output> </value>
-
alert_type
¶ Type: unicode Type of alert. This is only HOST or SERVICE
-
attempts
¶ Type: int Number of attempts to confirm state
-
downtime_type
¶ Type: unicode Type of alert. This is only HOST or SERVICE
-
event_type
¶ Type: unicode Type of event. This is only ALERT
-
host_name
¶ Type: unicode Host which the alert is from.
-
output
¶ Type: unicode Additional output of the alert.
-
service_description
¶ Type: unicode Service which raised the alert
-
state
¶ Type: unicode State of the service or host that raised the alert
-
state_type
¶ Type: unicode Confirmation level of the state [SOFT|HARD]
-
time
¶ Type: unicode Timestamp of the alert
Actions¶
acknowledge¶
-
POST
/v2/actions/acknowledge
¶ Acknowledge a host/service.
Parameters: - ack (
Acknowledgement
) – the acknowledgement to submit.
Return type: Info
-
DELETE
/v2/actions/acknowledge
¶ Delete a host/service acknowledgement.
Parameters: - ack (
Acknowledgement
) – the acknowledgement to delete.
Return type: Info
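The two acknowledge endpoints differ only in their HTTP method. The Python sketch below builds such a request without sending it. It assumes the API listens on `http://localhost:5311` (the default port listed in the Administration section), that the Acknowledgement is sent as a JSON body, and that no authentication header is needed; none of these are confirmed here, so adjust them for a real deployment.

```python
import json
import urllib.request

# Assumption: a local API on the default port, without authentication.
BASE_URL = "http://localhost:5311"


def acknowledge_request(ack: dict, delete: bool = False,
                        base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a POST (or DELETE) request for /v2/actions/acknowledge."""
    return urllib.request.Request(
        url=base_url + "/v2/actions/acknowledge",
        data=json.dumps(ack).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="DELETE" if delete else "POST",
    )


# Values taken from the Acknowledgement data sample.
ack = {
    "author": "aviau",
    "comment": "Working on it.",
    "host_name": "localhost",
    "notify": 0,
    "persistent": 1,
    "service_description": "ws-arbiter",
    "sticky": 1,
    "time_stamp": "",
}

request = acknowledge_request(ack)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```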
downtime¶
types documentation¶
-
type
Acknowledgement
¶ Data samples:
- Json
{ "author": "aviau", "comment": "Working on it.", "host_name": "localhost", "notify": 0, "persistent": 1, "service_description": "ws-arbiter", "sticky": 1, "time_stamp": "" }
- XML
<value> <host_name>localhost</host_name> <service_description>ws-arbiter</service_description> <time_stamp /> <sticky>1</sticky> <notify>0</notify> <persistent>1</persistent> <author>aviau</author> <comment>Working on it.</comment> </value>
-
host_name
¶ Type: unicode The name of the host
-
type
Downtime
¶ Data samples:
- Json
{ "author": "aviau", "comment": "No comment.", "duration": 86400, "end_time": 1430150469, "fixed": 1, "host_name": "localhost", "service_description": "ws-arbiter", "start_time": 1430150469, "time_stamp": 1430150469, "trigger_id": 0 }
- XML
<value> <host_name>localhost</host_name> <service_description>ws-arbiter</service_description> <time_stamp>1430150469</time_stamp> <start_time>1430150469</start_time> <end_time>1430150469</end_time> <fixed>1</fixed> <duration>86400</duration> <trigger_id>0</trigger_id> <author>aviau</author> <comment>No comment.</comment> </value>
-
author
¶ Type: unicode The author of the downtime
-
comment
¶ Type: unicode Comment for the downtime
-
duration
¶ Type: int The duration of the downtime, in seconds
-
end_time
¶ Type: int When to end the downtime
-
host_name
¶ Type: unicode The name of the host
-
service_description
¶ Type: unicode The service description
-
start_time
¶ Type: int When to start the downtime
-
time_stamp
¶ Type: int Time stamp for the downtime
Web UI¶
Layout configuration¶
The layout configuration is a JSON file containing the configuration of every page.
For example, the following page would be available at: /#/view?view=myPageUrl.
{
"myPageUrl": {
"template": "page",
"components": [...]
}
}
- Template [ page || drupal || drupal_dashboard ]
- This corresponds to the template that will be loaded by the webUI.
- Components
Components is an array of custom directives that define the layout of the page. See Custom directives.
The available custom directives are:
Alternatively, you can use any custom directive, but the layout of the Web UI may then look slightly off.
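To make the mapping between layout keys and page URLs concrete, here is a minimal Python sketch that reads a layout configuration, checks each page's template against the three names listed above, and prints the URL of each page. The function name is illustrative, and the sample uses an empty `components` list in place of the elided `[...]`.

```python
import json

# Template names listed in the layout documentation.
KNOWN_TEMPLATES = {"page", "drupal", "drupal_dashboard"}


def page_urls(layout_json: str) -> dict:
    """Map every page key of a layout configuration to its view URL."""
    layout = json.loads(layout_json)
    urls = {}
    for page_key, page in layout.items():
        template = page.get("template")
        if template not in KNOWN_TEMPLATES:
            raise ValueError(
                "page %r uses unknown template %r" % (page_key, template)
            )
        urls[page_key] = "/#/view?view=" + page_key
    return urls


SAMPLE_LAYOUT = """
{
    "myPageUrl": {
        "template": "page",
        "components": []
    }
}
"""

for key, url in page_urls(SAMPLE_LAYOUT).items():
    print(key, "->", url)
```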
Custom directives¶
Action Bar¶
The action bar is the bar containing components that act on data. These components can apply filters, recheck selected data, and so on, for the datasources listed in datasourceId.
{
"type": "actionbar",
"attributes": { "datasourceId": [ 0, 1 ] },
"components": [...]
}
- datasourceId (required, type: array of int)
- The datasources on which the actionbar components will act.
- Components
- The list of actionbar components.
Components of an actionbar¶
Acknowledge¶
Adds a button that will open an acknowledgement form for all selected entries. (see table checkbox attribute)
{
"type": "actionbar-acknowledge",
"attributes": {}
}
Downtime¶
Adds a button that will open a downtime form for all selected entries. (see table checkbox attribute)
{
"type": "actionbar-downtime",
"attributes": {}
}
Filter¶
Creates a customizable, collapsed menu of filters
{
"type": "actionbar-filter",
"attributes": {
"filters": [
{
"location": "componentsConfig",
"content": "componentsConfigFilterKey"
}
]
}
}
- location (required) [ inline || componentsConfig ]
- Where the filter is loaded. Inline will directly load content as a filter.
- content (required)
Depends on the value of location.
location | content |
inline | An inline filter |
componentsConfig | A filters key defined in componentsConfig.json |
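The two required keys of a filter entry can be checked mechanically before a layout is deployed. The following Python sketch does only that; it is an illustrative helper, not part of the Web UI, and it does not check what an inline filter must contain.

```python
ALLOWED_LOCATIONS = ("inline", "componentsConfig")


def check_filter_entry(entry: dict) -> list:
    """Return a list of problems found in one actionbar-filter entry."""
    problems = []
    if entry.get("location") not in ALLOWED_LOCATIONS:
        problems.append(
            "location must be one of: " + ", ".join(ALLOWED_LOCATIONS)
        )
    if "content" not in entry:
        problems.append("content is required")
    return problems


good = {"location": "componentsConfig", "content": "componentsConfigFilterKey"}
bad = {"location": "somewhere"}

print(check_filter_entry(good))
print(check_filter_entry(bad))
```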
More¶
Unused for the moment
Recheck¶
Adds a recheck button that will launch a recheck command for all selected items (see table checkbox attribute)
{
"type": "actionbar-recheck",
"attributes": {}
}
Search-filter¶
Adds a search field inside the actionbar that searches the data linked to the parent actionbar by datasourceId.
{
"type": "actionbar-search-filter",
"attributes": {}
}
Components of a container¶
info¶
- Show all information of a Surveil object
{ "type": "info", "attributes": { "inputSource": { "MyTileTitle": "myInputSource" } } }
- MyTileTitle (required)
- Title of the tile
- myInputSource
- Key of the fillParams variable in the container.js file. This key selects the type of object shown in the tile
host main¶
- Show inside a tile the address and the alias of a host
{ "type": "host-main", "attributes": {} }
host live¶
- Show inside a tile the host state, its output and its state icon
{ "type": "host-live", "attributes": {} }
host load¶
- Show inside a tile the load metrics state, its output and its state icon for a host
{ "type": "host-load", "attributes": {} }
host cpu¶
- Show inside a tile the cpu metrics state, its output and its state icon for a host
{ "type": "host-cpu", "attributes": {} }
host service list¶
- Show inside a tile the service description, its acknowledgement and its status for all services of a host
{ "type": "host-services-list", "attributes": {} }
service main¶
- Show inside a tile the host attached to a service
{ "type": "service-main", "attributes": {} }
service live¶
- Show inside a tile the service state, its output and its state icon
{ "type": "service-live", "attributes": {} }
service info¶
- Show inside a tile the service description, its acknowledgement and its status
{ "type": "service-info", "attributes": {} }
service graphs¶
- Show a grafana graph for each service metric
{ "type": "service-graphs", "attributes": {} }
Tabpanel and panel¶
Panels are used to put components in a section.
Tabpanels are a mechanism to show and hide panels according to a panelId.
Panel¶
{
"type": "panel",
"attributes": {
"panelId": "mySuperPanel"
},
"components": [...]
}
- panelId
- The id of the panel, used by the tabpanel.
- Components
- The list of components of the panel.
Tabpanel¶
{
"type": "tabpanel",
"attributes": {
"navigation": {
"mySuperPanel": {
"title": "My super panel",
"provider": "Provider"
},
"anotherPanelId": {
"title": "All my problems",
"provider": "nbProblemsProvider"
}
}
},
"components": [...]
}
- navigation (required)
Contains keys of every panelId managed by the tabpanel.
- title
- The title of the tab.
- provider
- A provider to show a number next to the title.
- components
- The list of panels managed by the tabpanel.
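Because the tabpanel's navigation keys must line up with the panelId of the panels it manages, a mismatch is easy to introduce by hand. A small Python sketch of a consistency check follows; the helper and its result keys are illustrative and not part of the Web UI.

```python
def check_tabpanel(tabpanel: dict) -> dict:
    """Compare navigation keys with the panelIds of the child panels."""
    tabs = set(tabpanel["attributes"]["navigation"])
    panels = {
        component["attributes"]["panelId"]
        for component in tabpanel.get("components", [])
        if component.get("type") == "panel"
    }
    return {
        "tabs_without_panel": sorted(tabs - panels),
        "panels_without_tab": sorted(panels - tabs),
    }


# The tabpanel from the example above, with only one of its panels defined.
tabpanel = {
    "type": "tabpanel",
    "attributes": {
        "navigation": {
            "mySuperPanel": {"title": "My super panel", "provider": "Provider"},
            "anotherPanelId": {
                "title": "All my problems",
                "provider": "nbProblemsProvider",
            },
        }
    },
    "components": [
        {
            "type": "panel",
            "attributes": {"panelId": "mySuperPanel"},
            "components": [],
        }
    ],
}

print(check_tabpanel(tabpanel))
```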
Components of a table¶
Table components represent its columns. The columns are named after the types of cell they will contain. For example: cell-single.
Common column attributes:¶
All columns may define the following attributes.
- title (required):
- Title of the column
- class
- Width of the column. Choose between xsmall, small, medium and large
- url
- Creates a link to another bansho view
"url": { "view": "service", "params": [ { "urlParam": "host_name", "entryKey": "host_name" }, { "urlParam": "service_description", "entryKey": "service_description" } ] }
- view (required):
- the view to redirect to
- params:
- a list of objects that will be used to generate the URL
- urlParam:
- name of the url parameter
- entryKey (required):
- a key of the parent inputSource’s table object. Its value becomes the value of the URL parameter
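The way view, urlParam and entryKey combine can be sketched in Python as follows. The exact URL that the Web UI generates is not stated in this section; the sketch assumes it follows the `/#/view?view=...` pattern from the layout configuration, with each urlParam appended as a query parameter. The function name and the sample entry are hypothetical.

```python
from urllib.parse import urlencode


def build_link(url_config: dict, entry: dict) -> str:
    """Build a link to another view from a column's url attribute.

    Assumption: the target has the form /#/view?view=<view>&<urlParam>=<value>.
    """
    query = [("view", url_config["view"])]
    for param in url_config.get("params", []):
        # entryKey selects the value in the table entry,
        # urlParam names it in the URL.
        query.append((param["urlParam"], entry[param["entryKey"]]))
    return "/#/view?" + urlencode(query)


url_config = {
    "view": "service",
    "params": [
        {"urlParam": "host_name", "entryKey": "host_name"},
        {"urlParam": "service_description", "entryKey": "service_description"},
    ],
}
# Hypothetical table entry.
entry = {"host_name": "Webserver", "service_description": "Apache"}

print(build_link(url_config, entry))
```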
cell-single¶
Column for a specific value of the parent inputSource’s table object
{ "type": "cell-single", "attributes": { "title": "Service Description", "entryKey": "service_description", "url": { "view": "service", "params": [ { "urlParam": "host_name", "entryKey": "host_name" }, { "urlParam": "service_description", "entryKey": "service_description" } ] }, "class": "medium" } }
Attributes:
- entryKey(required):
- Key of the parent inputSource’s table object whose value is printed in the column
cell-other-fields¶
A column that groups values from the parent inputSource’s table object.
{ "type": "cell-other-fields", "attributes": { "title": "Period", "skipFields": [ "contact_name", "email", "host_notification_commands", "service_notification_commands" ], "class": "large" } }
Attributes:
- skipFields:
- Fields to exclude from the cell
cell-status-duration¶
- Only used inside a status service object table. Prints the time of the last service check
{ "type": "cell-status-duration", "attributes": { "title": "Duration" } }
cell-status-last-check¶
- Only used inside a status host object table. Prints the date of the last host check
{ "type": "cell-status-last-check", "attributes": { "title": "Last Check" } }
cell-status-host-status¶
- Only used inside a status host object table. Prints the host state with a specific icon for its current state
{ "type": "cell-status-host-status", "attributes": { "title": "Host Status" } }
cell-status-host¶
- Only used inside a status host object table. Prints the hostName with a specific icon for its current state
{ "type": "cell-status-host", "attributes": { "title": "Hosts", "url": { "view": "host", "params": [ { "urlParam": "host_name", "entryKey": "host_host_name" } ] } } }
cell-status-service-check¶
- Only used inside a status service table. Prints a service name, its current output and an icon for its state
{ "type": "cell-status-service-check", "attributes": { "title": "Service Check", "url": { "view": "service", "params": [ { "urlParam": "host_name", "entryKey": "host_host_name" }, { "urlParam": "service_description", "entryKey": "service_service_description" } ] } } }
cell-config-host-register¶
- Only used inside a config host object table. Prints a valid icon if the host is registered, and an invalid icon if it is not
{ "type": "cell-config-host-register", "attributes": { "title": "Register", "class": "xsmall" } }
Administration¶
This section covers the administration and configuration of the Surveil services.
Surveil API¶
The Surveil API provides Surveil’s REST API.
package name (RPM) | surveil |
services | surveil-api.service |
Default port | 5311 |
configuration (API) | /etc/surveil/surveil.cfg |
configuration (permissions) | /etc/surveil/policy.json |
configuration (API - pipeline) | /etc/surveil/api_paste.ini |
The Surveil API needs access to InfluxDB, Alignak and MongoDB. If Keystone authentication is enabled, it needs access to Keystone (see api_paste.ini).
Configuration samples¶
/etc/surveil/surveil.cfg¶
[surveil]
# mongodb_uri is used to connect to MongoDB. Uses the MongoDB Connection
# String URI Format
mongodb_uri= mongodb://mongo:27017
# ws_arbiter_url is the endpoint of the ws-arbiter module of Alignak. It is
# used to send commands to Alignak
ws_arbiter_url= http://alignak:7760
# influxdb_uri is used to connect to InfluxDB. Uses the python-influxdb
# connection string format
influxdb_uri= influxdb://root:root@influxdb:8086/db
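When troubleshooting connectivity, it can help to see which hosts and ports this file points at. The Python sketch below reads the sample with the standard `configparser` module and splits each URI. It is only an inspection aid under the assumption that the file is INI-formatted as shown; it says nothing about how Surveil itself loads the file.

```python
import configparser
from urllib.parse import urlsplit

SAMPLE_CFG = """\
[surveil]
mongodb_uri= mongodb://mongo:27017
ws_arbiter_url= http://alignak:7760
influxdb_uri= influxdb://root:root@influxdb:8086/db
"""

URI_OPTIONS = ("mongodb_uri", "ws_arbiter_url", "influxdb_uri")


def load_endpoints(text: str) -> dict:
    """Parse the [surveil] section and split each URI into its parts."""
    # interpolation=None keeps any '%' in a password from being interpreted.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    section = parser["surveil"]
    return {name: urlsplit(section[name]) for name in URI_OPTIONS}


endpoints = load_endpoints(SAMPLE_CFG)
for name, uri in endpoints.items():
    print(name, uri.hostname, uri.port)
```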
/etc/surveil/policy.json¶
For documentation on this configuration file, refer to the OpenStack documentation.
{
"admin_required": "role:admin or is_admin:1",
"surveil_required": "role:surveil or rule:admin_required",
"surveil:admin": "rule:admin_required",
"surveil:authenticated": "rule:surveil_required",
"surveil:break": "!",
"surveil:pass": "@"
}
/etc/surveil/api_paste.ini¶
# Surveil API WSGI Pipeline
# Define the filters that make up the pipeline for processing WSGI requests
# Replace `surveil-auth` by `authtoken` to enable Keystone authentication.
[pipeline:main]
pipeline = surveil-auth api-server
[app:api-server]
paste.app_factory = surveil.api.app:app_factory
[filter:surveil-auth]
paste.filter_factory = surveil.api.authmiddleware.auth:filter_factory
[filter:authtoken]
paste.filter_factory = keystonemiddleware.auth_token:filter_factory
# Keystone auth settings
auth_host=198.72.123.131
auth_protocol=http
admin_user=admin
admin_password=password
admin_tenant_name=admin
Surveil Openstack Interface¶
surveil-os-interface is a daemon that connects to the OpenStack message queue. It reacts to various events and automatically configures Surveil monitoring. For example, instances created in Nova will automatically be monitored by Surveil.
package name (RPM) | surveil |
services | surveil-os-interface.service |
configuration | /etc/surveil/surveil_os_interface.cfg |
Surveil-os-interface needs access to OpenStack’s message queue. The following options must be set in /etc/nova/nova.conf:
notification_driver=nova.openstack.common.notifier.rpc_notifier
notification_topics=notifications,surveil
notify_on_state_change=vm_and_task_state
notify_on_any_change=True
Configuration samples¶
/etc/surveil/surveil_os_interface.cfg¶
[surveil-os-interface]
# Surveil API URL
SURVEIL_API_URL=http://surveil:8080/v2
# Surveil Auth URL
SURVEIL_AUTH_URL=http://surveil:8080/v2/auth
# Surveil version
SURVEIL_VERSION=2_0
# OpenStack Credentials. Used for creating hosts in Surveil.
SURVEIL_OS_AUTH_URL=http://localhost/v2.0
SURVEIL_OS_USERNAME=admin
SURVEIL_OS_PASSWORD=password
SURVEIL_OS_TENANT_NAME=admin
# Default monitoring pack to use with all OpenStack instances
SURVEIL_DEFAULT_TAGS=openstack-host
# Network used to monitor hosts. Surveil must have access to this network.
SURVEIL_NETWORK_LABEL=surveil
# AMQP credentials
RABBIT_HOST=192.168.49.239
RABBIT_PORT=5672
QUEUE=surveil
RABBIT_USER=admin
RABBIT_PASSWORD=admin
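The option names in this file are upper-case, which matters if the file is inspected with Python's `configparser`: by default it lower-cases option names. The sketch below keeps the original case and reports missing AMQP options. The list of required options is an assumption drawn from the sample above, and the helper names are illustrative.

```python
import configparser

# Assumption: the AMQP options from the sample are all required.
AMQP_OPTIONS = ("RABBIT_HOST", "RABBIT_PORT", "QUEUE",
                "RABBIT_USER", "RABBIT_PASSWORD")


def load_os_interface_config(text: str) -> dict:
    """Read the [surveil-os-interface] section, preserving option case."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # default behaviour would lower-case the names
    parser.read_string(text)
    return dict(parser["surveil-os-interface"])


def missing_amqp_options(config: dict) -> list:
    """List the AMQP options that are absent from the configuration."""
    return [name for name in AMQP_OPTIONS if name not in config]


SAMPLE_CFG = """\
[surveil-os-interface]
SURVEIL_API_URL=http://surveil:8080/v2
RABBIT_HOST=192.168.49.239
RABBIT_PORT=5672
QUEUE=surveil
RABBIT_USER=admin
"""

config = load_os_interface_config(SAMPLE_CFG)
print(missing_amqp_options(config))
```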
Surveil Web UI¶
The Surveil Web UI is a web interface for Surveil.
package name (RPM) | surveil-webui |
required services | httpd.service |
Default port | 80 |
configuration (global) | /etc/surveil-webui/config.json |
configuration (user config) | /etc/surveil-webui/default_user_config.json |
surveil-webui implements the Surveil API. It needs access to the Surveil API endpoint and Grafana. By default, it is packaged with a reverse proxy in /etc/http/conf.d/surveil:
ProxyPass /surveil/surveil/v2/auth/ http://localhost:5311/v2/auth/
ProxyPassReverse /surveil/surveil/v2/auth/ http://localhost:5311/v2/auth/
ProxyPass /surveil/surveil/ http://localhost:5311/
ProxyPassReverse /surveil/surveil/ http://localhost:5311/
RequestHeader set GRAFANA-USER "admin"
ProxyPass /surveil/grafana/ http://localhost:3000/
ProxyPassReverse /surveil/grafana/ http://localhost:3000/