davos: FTP Download Automation¶
This is the documentation for davos, a web-based tool for automating and managing file downloads over FTP, FTPS and SFTP. Davos was born from the idea that even today, FTP still has relevance in many different markets, but there weren’t many web-based solutions that provided an easy way to manage the movement of files (outside of a command line cron job) from one place to another.
For those new to davos, look through the Installation and Getting Started guides. They will run you through how to get and set up davos for the first time.
davos also provides a basic HTTP API that can be used to hook in to the application to manage things like schedules, hosts, filters, and even to stop or start individual schedules.
Guides¶
This section will run you through the aspects of the application itself, including installation, first time use, and the concept of schedules (what they consist of), hosts, and how they tie together.
Installation¶
Note
davos has been written with Docker at the forefront regarding installation and deployment. This means that you should consider using the pre-built Docker image that LinuxServer have provided for this application.
With Docker¶
This is the recommended method of installation and deployment.
Install Docker¶
Firstly, you’ll need to install Docker, a container engine used to run isolated application containers. I recommend following Docker’s official guide on installing the latest version of Docker CE on your machine, as the steps differ depending on your platform.
Build the container¶
Create a new container from LinuxServer’s image.
docker create \
--name=davos \
-v <path to config>:/config \
-v <path to downloads folder>:/download \
-e PGID=<gid> -e PUID=<uid> \
-p 8080:8080 \
linuxserver/davos
Params¶
<path to config>
- The folder on your machine where davos will place its configuration and log files. Typically this will be somewhere like /home/me/davos, but it can be anywhere.
<path to downloads folder>
- The folder on your machine that davos can download files to. This is the volume mount point that davos is aware of for all file downloads.
<uid>
- The id of the user you’d like davos to run as. All files downloaded by davos will be owned by this user.
<gid>
- The id of the group you’d like to attribute to the user davos runs as. All files downloaded by davos will be owned by this group.
Warning
Docker will run all containers as root by default. Omitting PUID and PGID is not recommended.
Run the container¶
Once the container has been created, you can run it.
docker start davos
After about 30 seconds, the application will be running and accessible at http://localhost:8080. If you are running davos on a remote server, substitute localhost with the server’s IP address.
Without Docker¶
This is not the recommended method of installation and deployment, but it has the potential to be the most configurable and flexible. davos does not have any prebuilt binaries, so you’ll need to get the source and build it yourself (another reason to use Docker instead).
Get the source¶
wget https://github.com/linuxserver/davos/archive/LatestRelease.zip
unzip LatestRelease.zip -d davos
Configure the application¶
By default, davos is configured to place all of its configuration in /config, which may not be preferable if you’re running the application on bare metal. Firstly, reconfigure davos to use your own defined directory for its database.
In conf/release/application.properties, change spring.datasource.url, e.g.:
spring.datasource.url=jdbc:h2:file:/home/me/davos
You’ll also need to do the same in conf/release/log4j2.xml, this time for the appender:
<RollingFile name="File" fileName="/home/me/davos/logs/davos.log" filePattern="/home/me/davos/logs/${date:yyyy-MM}/app-%d{yyyy-MM-dd-HH}-%i.log">
Build davos¶
Note
davos requires the Java 8 SDK to build.
Once you’ve updated the configuration locations, you can build the binary.
./gradlew build -Penv=release
This will create “davos-2.2.0.jar” in build/libs. You should move this somewhere more fitting for an executable (/var/lib, for example). It may also be worth renaming the .jar to “davos.jar”, although this is not necessary.
Run davos¶
Note
davos requires the Java 8 JRE to run. This is not required if you already have the SDK installed.
To run the application, run the following command:
java -jar davos.jar
Getting Started¶
This section aims to help you understand how davos is pieced together, and shows you how it can be configured to meet your needs. It is recommended that you follow the below guides.
Hosts¶
A Host configuration provides one or more Schedules with information pertaining to the FTP server to connect to when scanning for files. Hosts are separate from the Schedule configuration so that multiple Schedules can use the same Host configuration without having to input the same data multiple times.
Under Settings -> New Host, you will be prompted to enter all of the relevant information.
- Name [REQUIRED]
- The friendly name for this Host. This is what will be visible when creating a schedule, so make it indicative of the Host you’re making.
- Protocol
- Which type of connection to be made. This has no bearing on how you configure the host, but will direct davos to build the specific client when connecting.
- Host Address [REQUIRED]
- The IP address (or hostname) of the server.
- Port
- FTP and FTPS are usually on 21, while SFTP is usually on 22. If your server has been configured to run on a different port, this is where you reference it.
- Username [REQUIRED]
- Name of the user to connect as.
- Password
- Password of the user to connect as.
- Use Identity File
- Only available when SFTP is selected. Choose this if the SFTP server requires an identity file to authenticate the user.
- Identity File
- Displayed when Use Identity File is checked, replacing the Password field. Enter the location of the file.
Note
The location of the identity file will be relative to the container’s filesystem, so should ideally be under /config as this is the directory exposed by the Docker volume mapping.
It is also possible to create, manage, and delete a Host via the HTTP API. See API for more details.
Schedules¶
A Schedule is the configuration that tells davos when to run, where to connect, what to look for, and what to do once it has finished downloading. Schedules are the heart of davos and are powered by its workflow engine.
To create a new Schedule, go to Settings -> New Schedule. Schedules are split into multiple sections, each with their own part to play in the process.
General¶
This defines the metadata and connection information of the Schedule. The General section allows you to name the Schedule, as well as define how often it should run, and where files should be managed.
- Name [REQUIRED]
- The name of the Schedule. This should be relevant to the task this schedule is performing. E.g. “Nightly Feed”
- Interval
- How often the schedule should run. The rate at which the schedule runs begins when the schedule is started for the first time. So, if it is started at 14:05, with an interval of “Every 30 minutes”, it will run again at 14:35, then 15:05, and so on.
Note
If you change the interval for an already running Schedule, you’ll need to restart it before the change takes effect.
- Host
- The Host configuration to use for this Schedule. It will default to the first Host in the list. You cannot create a Schedule if no Hosts have been created.
- Host Directory [REQUIRED]
- This is the directory on the host (relative to the connection entry point) that the Schedule should use for file scanning. Absolute paths are also supported.
- Local Directory [REQUIRED]
- The directory where this schedule should place file downloads.
Note
The local directory must be relative to the container’s filesystem, so should be under /download.
- Transfer Type
- Informs the Schedule whether it should only download matching files (FILE) or also scan matching directories (RECURSIVE). This can be useful if the server contains sub-directories that may match in a scan, but should not be downloaded.
- Start Automatically
- If checked, the Schedule will automatically start when davos is started. Useful if you have a restart policy enabled in Docker and your machine requires a restart.
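The Interval behaviour described above can be sketched with a small helper (hypothetical, for illustration only; davos computes this internally):

```python
from datetime import datetime, timedelta

def next_run_times(started_at, interval_minutes, count):
    # Runs recur at a fixed rate from the moment the Schedule is started.
    return [started_at + timedelta(minutes=interval_minutes * i)
            for i in range(1, count + 1)]

# Started at 14:05 with an interval of "Every 30 minutes":
runs = next_run_times(datetime(2024, 1, 1, 14, 5), 30, 2)
# runs are at 14:35, then 15:05
```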
Filtering¶
This is a process that allows you to narrow down file scanning so only relevant files are processed. Filters can be exceptionally useful for host directories that are used by multiple processes or contain large numbers of files.
- Mandatory
- If checked, the Schedule will only consider scanning files if at least one filter has been defined. If checked and no filters are defined, nothing will be scanned, so nothing will be downloaded.
- Invert
- The default behaviour is to match all files on the host with the defined filters. Checking this option will invert that behaviour, so all files not matching the defined filters will be downloaded.
- Filters
- A list of strings that will be used to scan the host directory. Each file on the host is compared to this list: if it matches at least one filter, it will be downloaded. Filters can also be wildcarded using ? (single character) and * (multiple characters).
For example, for a file called “my_file_name.txt”:
my?file?name.txt = MATCH
my*name.txt = MATCH
my_file.name.txt = NO MATCH
*file_name* = MATCH
*file_name = NO MATCH
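The wildcard rules above happen to line up with Python’s fnmatch module (? matches one character, * matches any run of characters), so the filter logic, including the Invert option, can be sketched as follows. Note that fnmatch additionally treats [] as a character class, which davos filters do not:

```python
from fnmatch import fnmatchcase

def should_download(filename, filters, invert=False):
    # A file is downloaded if it matches at least one filter;
    # Invert flips that decision.
    matched = any(fnmatchcase(filename, f) for f in filters)
    return not matched if invert else matched

should_download("my_file_name.txt", ["my?file?name.txt"])  # True
should_download("my_file_name.txt", ["*file_name"])        # False
```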
File Management¶
davos also provides a way to tidy processed files upon completion. You can delete the file from the host once it has downloaded (effectively making it a move operation), and you can also move the file locally.
- Delete from Host
- If checked, all matched and downloaded files will be deleted from the Host. This logic will run after each individual download has completed.
Warning
If the FTP user does not have permission to delete files on the Host, this step will fail and the Schedule will cancel the current run. A future run of the Schedule will skip all files previously scanned.
- Move Downloaded File
- The location to move each successfully downloaded file to. This will occur after each individual download has completed. A common use-case for this feature is to separate in-progress files from completed files (e.g. /download/doing and /download/done).
Note
The “move to” directory must be relative to the container’s filesystem, so should be under /download. Advanced users may create additional volume mappings if need be.
Note
If davos is unable to move the file, it will remain in its originating directory, and will continue on to the Schedule’s next step without failure.
Downstream Actions¶
One of the unique aspects of davos with respect to FTP management is its ability to create hooks into other applications that may be interested in the downloaded files. This may be useful when the download action is part of a wider workflow that must be continued outside the scope of davos.
Actions defined against a Schedule will run for each individually downloaded file after the File Management step previously mentioned has run.
There are two types of Downstream Action: Notifications and API Calls.
Notifications¶
Notifications are useful if you’d like to know whenever davos has successfully downloaded a file. Generally speaking, no further action is taken after a notification is sent, but SNS may be configured to include a subscriber to a topic that performs a further action.
Note
There is no limit to the number of notifications you can have.
You will need an account with Pushbullet in order to use this feature. In your Pushbullet account, create an Access Token.
- Access Token
- Your Pushbullet account’s access token. This will be used to authenticate notification push requests to the Pushbullet API.
You will need an Amazon AWS account to use this feature.
- Topic Arn
- The Amazon Resource Name for an SNS Topic created under your AWS account. This will be the topic that notifications are sent to.
- Region
- The region that the topic was created under. While regions are not mandatory for Topic Arns, this will be used to authenticate your account and create an SNS client in the correct region.
- Access Key
- The access key for an IAM User under your AWS account.
- Secret Access Key
- The second half of authentication with AWS. This is the secret key for the same IAM User.
Warning
Be careful with IAM User permissions! You should create a new IAM User with permissions only to publish messages to your notification topic, nothing more! See FAQ for more details on best practice regarding IAM Users.
API Calls¶
API Calls are a great way to create hooks into other applications via their own HTTP API.
- URL
- The URL of the API you wish to call
- Method
- Available options are GET, POST, PUT and DELETE
- Content-Type
- Informs the target API what type of body you’re sending (if any), e.g. “application/json”
- Message Body
- The request payload being sent to the target API
Note
If you need to reference the downloaded file in an HTTP request, use $filename. This will resolve to the file or folder that was matched and subsequently downloaded.
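The $filename token behaves like simple template substitution. A sketch of the idea (davos’s own implementation may differ):

```python
from string import Template

def render_body(body_template, downloaded_name):
    # Resolve $filename against the file that was just downloaded.
    return Template(body_template).safe_substitute(filename=downloaded_name)

body = render_body('{"file": "$filename"}', "my_file_name.txt")
# body == '{"file": "my_file_name.txt"}'
```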
How it works¶
The Schedules in davos are powered by a basic workflow engine that runs a series of steps to ensure each run processes files properly. The order of this workflow is as follows:
- Connect to the host.
- List all files in the provided remote directory.
- Filter all files in the remote directory so only the relevant ones remain.
- Remove any files that have been previously scanned.
- For each matched file, download it. Once downloaded, run any actions required by the schedule.
- Store the list of scanned files against the Schedule.
- Disconnect.
There is no theoretical limit to the number of schedules you can have running at the same time; however, it is advised you keep it below 10, as memory usage can become quite high.
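The numbered steps above can be approximated as a pure function (a sketch of the idea, not davos’s actual engine):

```python
from fnmatch import fnmatchcase

def run_once(remote_files, filters, previously_scanned):
    # Step 3: keep only files matching a filter.
    matched = [f for f in remote_files
               if any(fnmatchcase(f, p) for p in filters)]
    # Step 4: drop files scanned on a previous run.
    fresh = [f for f in matched if f not in previously_scanned]
    # Step 6: the scanned list is stored for the next run.
    return fresh, previously_scanned | set(matched)

to_get, seen = run_once(["a.csv", "b.txt", "old.csv"], ["*.csv"], {"old.csv"})
# to_get == ["a.csv"]: "old.csv" was already scanned, "b.txt" never matched
```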
App Settings¶
Under Settings -> App Settings, you can configure the log level that davos will output to its log file.
Logging¶
All logs are written to davos.log, located in the /config/logs directory. When mapping the /config directory in the container to a host directory, logs will be made available in that host directory.
The log level can be changed at any time while the application is running. The available levels are:
- DEBUG
- INFO
- WARN
- ERROR
The higher the level (DEBUG is lowest, ERROR is highest), the fewer logs will be written. By default, davos logs at INFO level. If you are experiencing issues with davos and wish to understand the area of failure, change the level to DEBUG. Under this setting, the most logs will be written.
Warning
When setting the log level to DEBUG, any secure credentials used in connections to the FTP host or notification systems will be logged.
Reference¶
API¶
davos provides an HTTP API that exposes Schedules and Hosts so they can be managed outside the scope of the web application. This API is also used by the web application’s AJAX calls.
Warning
This API is completely unauthenticated, so anyone on your network can use it.
/schedule¶
POST¶
Creates a single Schedule.
POST /api/v2/schedule HTTP 1.0
Host: localhost:8080
Content-Type: application/json
Accept: application/json
{
"name": String,
"interval": Integer,
"host": Integer,
"hostDirectory": String,
"localDirectory": String,
"transferType": String [ FILE | RECURSIVE ],
"automatic": Boolean,
"moveFileTo": String,
"filtersMandatory": Boolean,
"invertFilters": Boolean,
"deleteHostFile": Boolean,
"filters": [
{
"value": String
}
],
"notifications": {
"pushbullet": [
{
"apiKey": String
}
],
"sns": [
{
"topicArn": String,
"region": String,
"accessKey": String,
"secretAccessKey": String
}
]
},
"apis": [
{
"url": String,
"method": String [ POST | GET | PUT | DELETE ],
"contentType": String,
"body": String
}
]
}
For more information regarding what each field represents, see the Schedules documentation in Getting Started.
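As a concrete example, a minimal create request can be driven from Python’s standard library. The field names come from the syntax above; the host id, directories, and filter value are assumptions for illustration:

```python
import json
import urllib.request

payload = {
    "name": "Nightly Feed",
    "interval": 30,
    "host": 1,                      # id of an existing Host (assumed)
    "hostDirectory": "feeds/",
    "localDirectory": "/download",
    "transferType": "FILE",
    "automatic": True,
    "filtersMandatory": True,
    "invertFilters": False,
    "deleteHostFile": False,
    "filters": [{"value": "*.csv"}],
}

req = urllib.request.Request(
    "http://localhost:8080/api/v2/schedule",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json", "Accept": "application/json"},
    method="POST",
)
# urllib.request.urlopen(req)  # uncomment to send to a running davos instance
```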
Response¶
See: Schedule Response Syntax.
/schedule/{id}¶
GET¶
Retrieves a single Schedule based on the supplied {id}.
GET /api/v2/schedule/{id} HTTP 1.0
Host: localhost:8080
Accept: application/json
Response¶
See: Schedule Response Syntax.
PUT¶
Updates a single Schedule based on the given {id}. All fields must be supplied, even if only a subset is being updated. Use a GET to first obtain the most up-to-date payload before performing a PUT.
PUT /api/v2/schedule/{id} HTTP 1.0
Host: localhost:8080
Content-Type: application/json
Accept: application/json
{
"name": String,
"interval": Integer,
"host": Integer,
"hostDirectory": String,
"localDirectory": String,
"transferType": String [ FILE | RECURSIVE ],
"automatic": Boolean,
"moveFileTo": String,
"filtersMandatory": Boolean,
"invertFilters": Boolean,
"deleteHostFile": Boolean,
"filters": [
{
"id": Integer,
"value": String
}
],
"notifications": {
"pushbullet": [
{
"id": Integer,
"apiKey": String
}
],
"sns": [
{
"id": Integer,
"topicArn": String,
"region": String,
"accessKey": String,
"secretAccessKey": String
}
]
},
"apis": [
{
"url": String,
"method": String [ POST | GET | PUT | DELETE ],
"contentType": String,
"body": String
}
]
}
Note
If you are updating a listed object, you must provide the object’s id. If you do not, the API will remove the old reference and create a new one. To add a new item to the list, provide the new item (without an id) alongside the existing one.
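A hypothetical client-side helper illustrating the id rule above: keep the existing id when changing a listed object so the API updates it in place rather than replacing it:

```python
def set_filter_value(schedule, filter_id, new_value):
    # Preserve the filter's id so the API treats this as an update,
    # not a delete-and-recreate.
    for f in schedule["filters"]:
        if f.get("id") == filter_id:
            f["value"] = new_value
    return schedule

s = {"filters": [{"id": 4, "value": "*.csv"}]}
set_filter_value(s, 4, "*.txt")
# s["filters"][0] == {"id": 4, "value": "*.txt"}
```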
Response¶
See: Schedule Response Syntax.
/schedule/{id}/scannedFiles¶
/schedule/{id}/execute¶
/host¶
POST¶
Creates a new Host.
POST /api/v2/host
Host: localhost:8080
Content-Type: application/json
Accept: application/json
{
"name": String,
"address": String,
"port": Integer,
"protocol": String [ FTP | FTPS | SFTP ],
"username": String,
"password": String,
"identityFile": String,
"identityFileEnabled": Boolean
}
Note
If identityFileEnabled is set to TRUE, you must also provide identityFile; otherwise provide password.
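The credential rule in the note can be expressed as a small client-side check (hypothetical, not part of davos):

```python
def credentials_for(host):
    # identityFileEnabled decides which credential field the API expects.
    if host.get("identityFileEnabled"):
        if not host.get("identityFile"):
            raise ValueError("identityFileEnabled is true but identityFile is missing")
        return {"identityFile": host["identityFile"]}
    return {"password": host.get("password", "")}

credentials_for({"identityFileEnabled": True, "identityFile": "/config/id_rsa"})
# → {"identityFile": "/config/id_rsa"}
```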
/host/{id}¶
GET¶
Retrieves a single Host based on the given {id}.
GET /api/v2/host/{id}
Host: localhost:8080
Accept: application/json
Response¶
See: Host Response Syntax.
PUT¶
Updates a Host with the given {id}.
PUT /api/v2/host/{id}
Host: localhost:8080
Content-Type: application/json
Accept: application/json
{
"name": String,
"address": String,
"port": Integer,
"protocol": String [ FTP | FTPS | SFTP ],
"username": String,
"password": String,
"identityFile": String,
"identityFileEnabled": Boolean
}
Note
If identityFileEnabled is set to TRUE, you must also provide identityFile; otherwise provide password.
Response¶
See: Host Response Syntax.
DELETE¶
Deletes a single Host with the given {id}.
DELETE /api/v2/host/{id} HTTP 1.0
Host: localhost:8080
Accept: application/json
Response¶
{
"status": String [ OK | Failure ],
"body": String
}
Warning
If the Host you are attempting to delete is being used by an active Schedule, the DELETE call will fail.
/testConnection¶
POST¶
Allows you to assert whether or not the provided payload contains valid Host information.
POST /api/v2/testConnection
Host: localhost:8080
Content-Type: application/json
{
"id": Integer,
"name": String,
"address": String,
"port": Integer,
"protocol": String [ FTP | FTPS | SFTP ],
"username": String,
"password": String,
"identityFile": String,
"identityFileEnabled": Boolean
}
Response¶
{
"status": String [ OK | Failed ],
"body": String
}
/settings/log¶
POST¶
Changes the logging level of the application’s core code. Unlike other POST calls, there is no payload body. The level is passed in as a request parameter.
- level
- The level to change the logging to. Available options are DEBUG, INFO, WARN, ERROR, FATAL
POST /api/v2/settings/log?level={LEVEL}
Host: localhost:8080
Accept: application/json
Response¶
{
"status": String [ OK | Failed ],
"body": String
}
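Because the level travels as a request parameter rather than a body, the call can be built like so (a sketch; the base URL is an assumption):

```python
import urllib.parse
import urllib.request

def log_level_request(level, base="http://localhost:8080"):
    # The documented levels for this endpoint.
    if level not in {"DEBUG", "INFO", "WARN", "ERROR", "FATAL"}:
        raise ValueError("unsupported level: " + level)
    url = base + "/api/v2/settings/log?" + urllib.parse.urlencode({"level": level})
    return urllib.request.Request(url, method="POST",
                                  headers={"Accept": "application/json"})

req = log_level_request("DEBUG")
# urllib.request.urlopen(req)  # uncomment to send to a running davos instance
```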
Responses¶
Schedule Response Syntax¶
{
"status": String [ OK ],
"body": {
"id": Integer,
"name": String,
"interval": Integer,
"host": Integer,
"hostDirectory": String,
"localDirectory": String,
"transferType": String [ FILE | RECURSIVE ],
"automatic": Boolean,
"moveFileTo": String,
"running": Boolean,
"filtersMandatory": Boolean,
"invertFilters": Boolean,
"lastRunTime": String,
"deleteHostFile": Boolean,
"lastScannedFiles": [
String
],
"filters": [
{
"id": Integer,
"value": String
}
],
"notifications": {
"pushbullet": [
{
"id": Integer,
"apiKey": String
}
],
"sns": [
{
"id": Integer,
"topicArn": String,
"region": String,
"accessKey": String,
"secretAccessKey": String
}
]
},
"transfers": [
{
"fileName": String,
"fileSize": Integer,
"directory": Boolean,
"progress": {
"percentageComplete": Double,
"transferSpeed": Double
},
"status": String [ DOWNLOADING | SKIPPED | PENDING | FINISHED ]
}
],
"apis": [
{
"id": Integer,
"url": String,
"method": String [ POST | GET | PUT | DELETE ],
"contentType": String,
"body": String
}
]
}
}
Note
running, lastScannedFiles, lastRunTime and transfers are immutable metadata fields and can’t be used in PUT or POST requests. If supplied, they will be ignored.
- host
- References the id of the linked Host.
- running
- Describes whether or not the Schedule is running.
- lastRunTime
- The time recorded when the Schedule last finished running.
- lastScannedFiles
- A list of Strings that represent the files/folders found in the last run of the Schedule.
- transfers
- A list of transfer objects that describe all files being actioned. This list will only be populated when the Schedule is running and is actively downloading.
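Since transfers carries per-file progress, a client could derive an overall figure. This is a hypothetical convenience, not a field davos returns:

```python
def overall_progress(transfers):
    # Average percentageComplete across transfers that are downloading.
    active = [t for t in transfers if t["status"] == "DOWNLOADING"]
    if not active:
        return 0.0
    return sum(t["progress"]["percentageComplete"] for t in active) / len(active)

pct = overall_progress([
    {"status": "DOWNLOADING", "progress": {"percentageComplete": 40.0}},
    {"status": "DOWNLOADING", "progress": {"percentageComplete": 60.0}},
    {"status": "PENDING", "progress": {"percentageComplete": 0.0}},
])
# pct == 50.0
```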
Host Response Syntax¶
Success¶
{
"status": String [ OK ],
"body": {
"id": Integer,
"name": String,
"address": String,
"port": Integer,
"protocol": String [ FTP | FTPS | SFTP ],
"username": String,
"password": String,
"identityFile": String,
"identityFileEnabled": Boolean
}
}
Failure¶
{
"status": String [ Failed ],
"body": String
}
FAQ¶
Can davos be used to upload files?¶
No, davos only downloads files. There are currently no plans to implement the ability to upload files, as this would require a rework of the schedule workflow engine.
How many schedules can I have?¶
There is no theoretical limit to the number of schedules you can have. davos creates an initial thread pool of 10 worker threads, but this gets extended if more than 10 schedules are created.
How many hosts can I have?¶
Unlimited.
Are host credentials hashed in the database?¶
No, all host usernames and passwords are stored in plain text in the H2 database. This is because the application needs to query the hosts table every time a schedule runs, and would have no way to compare a hash with a valid password.
How do I use an identity file for SFTP connections?¶
On the Host configuration page for your Host, make sure Use Identity File is checked. Then enter the absolute path of the identity file. If you’re running davos in a Docker container (recommended), the value should be something like “/config/id_rsa”, assuming you are using an SSH private key called “id_rsa” and have placed it in your mapped host directory on your machine.
Any form of private identity is applicable, for example if your host server uses .pem files for authentication, use “/config/my_identity.pem”.
Note
Remember, davos can’t see files outside of its /download and /config directories when running in a Docker container, so place your identity file(s) in the mapped directory on the host (e.g. /home/user/davos).
I’ve just updated davos. The application is behaving strangely.¶
Some version updates include changes to the JavaScript sources for the website side of the application. Modern browsers like Chrome tend to cache these types of sources for the sake of performance. It is likely your browser has not re-cached the latest version of the JavaScript code.
To remedy this, hard-refresh the app: CTRL + F5.
How can I use SNS to notify me by email?¶
To use SNS, you’ll need an Amazon AWS Account. Once set up, you should go to Services -> Simple Notification Service, then Create topic. For Topic name, enter something like “davos-notifications”, and click Create topic. The first thing you’ll notice is that it has generated a Topic ARN. You’ll need this for the notification configuration later.
Now create a subscription to your topic by clicking on Create subscription, choosing “Email” as the Protocol, and your preferred email address as the Endpoint. Click Create subscription. You’ll receive an email asking you to confirm the subscription request.
Once your topic has been configured, you should create an IAM User that can publish messages to it. It is this user’s credentials that davos needs to perform the publish.
Go to Services -> IAM, then Users. Click Add user. For User name, enter something sensible, then select “Programmatic access” as the Access Type. Click Next: Permissions. This user should only have permission to publish to this topic, nothing more. So, under “Add user to group”, click Create group, and then Create policy.
Note
A user can be in many groups. Groups can have many policies. A policy is a set of permissions for access to various things in AWS.
You should be directed to the policy creation tool. Select the Policy Generator and set the following:
- Effect
- Allow
- AWS Service
- Amazon SNS
- Actions
- Publish
- Amazon Resource Name (ARN)
- {YOUR_TOPIC_ARN}
Then click Add Statement. You should see it added underneath. Click Next Step. The generated policy will be shown to you on screen (it’s formatted as JSON, and contains a Statement array). Update the Policy Name to something sensible (e.g. “DavosTopicPublishAccess”) then click Create Policy. You’ll be redirected back to the IAM console, but you can close this.
Go back to the previous tab and under the Filter, type in the name of the policy you just created. Select it. Now, for the Group name, give it a sensible name (e.g. DavosNotifications), and click Create group. The group should now be selected under the IAM user console. Click Next: Review, make sure you’re happy, and then click Create user.
You should see a table showing the user’s Access key ID and Secret access key. You’ll need these for the SNS configuration in davos, so keep them safe somewhere (you can download a .csv with the credentials in).
Warning
The Secret access key will only be shown once in the console, so make sure you store it somewhere safe.
Developers¶
If you wish to contribute to davos (and help me tidy up some of its rather messy code!), you will need to be able to build and run it locally. davos is written almost completely in Java using the Spring Framework, utilising the Thymeleaf rendering engine. The project is unit and integration tested using JUnit and Cucumber JVM, respectively.
Setup¶
Download and install the Java 8 JDK. I’d also recommend using Spring Tool Suite (STS) as it is a prebuilt version of Eclipse IDE with all of the necessary plugins installed for working with a Spring application.
Building¶
Note
You do not need to pre-install Gradle for this application as it comes with Gradle Wrapper, which does all the work for you.
To build the application, use Gradle:
./gradlew clean build -Penv={release|local}
This will download all necessary dependencies, run tests, then package up the application.
The resulting .jar file will be in build/libs. If you pass through a -Penv=release when running this command, the packaged application will use the config under conf/release, which tells davos to use a file-based database. By default (i.e. if you do not pass this switch through), it will use the conf/local configuration, which makes use of an in-memory database.
Running the app¶
It is recommended to build the app first before running, so you know your latest changes are built:
./gradlew clean build && java -jar build/libs/davos-2.2.0.jar
Development¶
Classpath¶
When using Eclipse (or STS), a separate Gradle command is required in order to update the project’s classpath files so Eclipse is aware of the downloaded dependencies:
./gradlew cleanEclipse eclipse
Code Structure¶
The code of davos is split into four main sections:
src/main/java
- The core functional code. This contains all logic for the workflow, API, connectivity, and object persistence (database).
src/main/resources
- The front-end code, including all JavaScript, CSS, images, and Thymeleaf templates.
src/test/java
- All unit tests for the core code.
src/cucumber/java
- Integration test code. This is separate from the main project code and does not get packaged into the released application.
Running Tests¶
To run all unit tests, use Gradle:
./gradlew test
To run all integration tests:
./gradlew cucumber
Managing the version¶
The version of the application is referenced in three files:
version.txt in the project root directory
conf/local/application.properties as a property called davos.version
conf/release/application.properties as a property called davos.version
All three of these need to be updated if you are changing the version number.