HCP Metadata Query Tool (HCPmqt) queries Hitachi Content Platform (HCP) for information about object transactions, that is, the ingestion and the deletion of objects within HCP. Here, the term deletion also covers disposition, purging and pruning of objects.
Warning
Using HCPmqt can put severe load on HCP, especially if there is a huge number of objects stored. You might want to monitor HCP performance during a query and tune the load parameters accordingly.
The output file generated by this tool can become huge, depending on the number of objects (expect roughly 20 GB for 100 million objects found). If you use the ‘Dirtree’ feature, expect the tool to claim up to 512 MB of memory per 100 million objects. The tool will fail if it runs out of memory; to add insult to injury, this may affect other applications running on your system!
Handle with care!
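As a rough planning aid, the figures from the warning above can be turned into a small calculation. This is only a sketch: the function name `estimate_resources` is hypothetical, it assumes that both figures scale linearly with the number of objects, and it reads GB as 10^9 bytes and MB as 2^20 bytes, none of which is stated by the tool itself.

```python
# Rule-of-thumb figures from the warning: about 20 GB of output and up to
# 512 MB of Dirtree memory per 100 million objects.
# Assumptions: linear scaling, GB = 10**9 bytes, MB = 2**20 bytes.
OBJECTS_PER_UNIT = 100_000_000
OUTPUT_BYTES_PER_UNIT = 20 * 10**9
DIRTREE_BYTES_PER_UNIT = 512 * 2**20


def estimate_resources(object_count: int, dirtree: bool = False) -> dict:
    """Return rough estimates (in bytes) for output size and Dirtree memory."""
    output_bytes = object_count * OUTPUT_BYTES_PER_UNIT // OBJECTS_PER_UNIT
    memory_bytes = 0
    if dirtree:
        memory_bytes = object_count * DIRTREE_BYTES_PER_UNIT // OBJECTS_PER_UNIT
    return {"output_bytes": output_bytes, "dirtree_memory_bytes": memory_bytes}


if __name__ == "__main__":
    estimate = estimate_resources(250_000_000, dirtree=True)
    print(f"output file: about {estimate['output_bytes'] / 10**9:.0f} GB")
    print(f"Dirtree memory: up to {estimate['dirtree_memory_bytes'] / 2**20:.0f} MB")
```

Treat the result as an order-of-magnitude estimate only; actual sizes depend on path lengths and on whether verbose output is selected.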
To be able to use HCPmqt, there are several prerequisites:
Tip
Tenants do not need to have the Search feature enabled to be queried by HCPmqt, nor is there a need to index the content!
This is a step-by-step guide on how to use HCP Metadata Query Tool.
In these fields, you specify the parameters needed to access HCP and your area of interest within it.
Tip
Depending on the access rights you have for HCP, use different names:
You may further restrict the result by defining the folders that should be queried; all other folders will be skipped.
Warning
Using HCPmqt can put severe load on HCP, especially if there is a huge number of objects stored. You might want to monitor HCP performance during a query and tune the load parameters accordingly.
These values are intended to tune the load generated within HCP when running HCPmqt.
Use the Records / page field to specify the number of records fetched from HCP with a single call. Larger values speed things up a bit but need more local memory, whereas smaller values slow things down a bit but need less memory. Values between 5,000 and 10,000 are known to work well.
The Throttle (sec/page) field asks the tool to pause for the defined number of seconds between subsequent page requests.
Tip
Both values may be changed while a query is running. Please note that changes won’t take effect until the page currently being processed has been completed.
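The interplay of the two load parameters can be illustrated with a minimal paging loop. This is a sketch, not HCPmqt's actual implementation: `LoadSettings`, `run_paged_query` and the `fetch_page` callable (which stands in for a single page request to HCP) are hypothetical names introduced for illustration.

```python
import time
from dataclasses import dataclass
from typing import Callable, Iterator, List, Optional, Tuple


@dataclass
class LoadSettings:
    """Mutable load parameters; may be changed while a query is running."""
    records_per_page: int = 5000
    throttle_seconds: float = 0.0


# Hypothetical page request: (count, cursor) -> (records, next cursor or None).
FetchPage = Callable[[int, Optional[str]], Tuple[List[dict], Optional[str]]]


def run_paged_query(
    fetch_page: FetchPage,
    settings: LoadSettings,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[dict]:
    """Yield records page by page, pausing between page requests."""
    cursor: Optional[str] = None
    while True:
        # Settings are read once per page, so a change made while a page is
        # being processed only affects the following pages.
        records, cursor = fetch_page(settings.records_per_page, cursor)
        yield from records
        if cursor is None:  # no further pages
            break
        if settings.throttle_seconds > 0:
            sleep(settings.throttle_seconds)
```

Reading the settings at the top of each iteration mirrors the behaviour described in the tip: a changed value is picked up only once the page in progress has been handled.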
Select the type of operational records you want to get.
Transaction type | Description |
---|---|
create | existing (!) objects |
delete | objects that have been deleted |
dispose | objects that have been automatically deleted by HCP after the object’s retention had expired [1] |
prune | object versions that have been automatically deleted after their lifetime has passed [2] |
purge | object versions that have been deleted when the head object (the newest version) was deleted |
Footnotes
[1] | Disposition will take place if the Disposition Service is enabled both in the System Console and for the Namespace. |
[2] | Namespaces that are enabled for Versioning define a period of time during which versions of objects are kept. Once a version is older than this period, it will be pruned (deleted) automatically. |
You can specify a time range for the query.
By default, values are provided for a full query, which means anything from Jan. 1st, 1970 until now (use the Reset button to reset the fields).
Tip
Normally, you need to enter a timestamp in exactly the given format. Alternatively, a number of seconds counted from Jan. 1st, 1970 (the Unix epoch) is accepted as well.
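Accepting either a formatted timestamp or epoch seconds could be handled roughly as sketched below. The timestamp format used by the tool is not reproduced in this text, so `ASSUMED_FORMAT` is purely a placeholder, as is the choice to interpret the timestamp as UTC; `parse_time_field` is a hypothetical helper name.

```python
from datetime import datetime, timezone

# Assumption: placeholder format only; substitute the format shown in the tool.
ASSUMED_FORMAT = "%Y-%m-%d %H:%M:%S"


def parse_time_field(text: str) -> int:
    """Return seconds since the Unix epoch for a time-range field value."""
    text = text.strip()
    if text.isdigit():
        # A plain number is taken as seconds since Jan. 1st, 1970.
        return int(text)
    # Otherwise the value must match the expected format exactly;
    # strptime raises ValueError if it does not.
    parsed = datetime.strptime(text, ASSUMED_FORMAT)
    # Assumption: the timestamp is interpreted as UTC.
    return int(parsed.replace(tzinfo=timezone.utc).timestamp())


if __name__ == "__main__":
    print(parse_time_field("1420070400"))
    print(parse_time_field("2015-01-01 00:00:00"))
```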
Two different output types are available:
Normally, the output will hold selected information only: urlName, version, operation and changeTimeMilliseconds.
If Verbose is checked, all information will be provided: urlName, objectPath, utf8Name, version, namespace, operation, type, size, retention, retentionString, retentionClass, ingestTimeString, ingestTime, accessTimeString, accessTime, changeTimeString, changeTimeMilliseconds, updateTimeString, updateTime, hashScheme, hash, acl, dpl, customMetadata, hold, index, replicated, shred, permissions, owner, uid, gid
Tip
If you need statistical data for a Namespace (or for an entire HCP system), check Dirtree. This will write an additional file holding a JSON structure that contains the complete directory tree, including the number of files and subfolders per folder.
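To illustrate the kind of aggregation a directory tree with per-folder counts involves, here is a minimal sketch that builds such a tree from object paths. The layout of the JSON file actually written by the Dirtree feature is not documented here, so the keys `files` and `subfolders` and the function names are assumptions made for this example only.

```python
import json
from typing import Iterable


def build_dirtree(paths: Iterable[str]) -> dict:
    """Build a nested folder tree with a file count per folder."""
    root = {"files": 0, "subfolders": {}}
    for path in paths:
        parts = [part for part in path.split("/") if part]
        if not parts:
            continue
        node = root
        # Walk (and create) every folder on the way to the object.
        for folder in parts[:-1]:
            node = node["subfolders"].setdefault(
                folder, {"files": 0, "subfolders": {}}
            )
        node["files"] += 1  # the last path element is the object itself
    return root


def subfolder_count(node: dict) -> int:
    """Number of direct subfolders of a folder node."""
    return len(node["subfolders"])


if __name__ == "__main__":
    tree = build_dirtree(["/a/x.txt", "/a/b/y.txt", "/a/b/z.txt", "/top.txt"])
    print(json.dumps(tree, indent=2))
```

Because the whole tree is held in memory until it is written, memory use grows with the number of distinct folders and objects, which is consistent with the memory warning given for this feature.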
Footnotes
[1] | SQLite3 databases can be used from most programming languages. You can also explore them with the SQLite shell available from sqlite.org if you like to use the command line; if you prefer a GUI, try the SQLite Manager add-on for the Firefox web browser. |
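As one example of using such a database from a programming language, the following sketch uses Python's standard `sqlite3` module to list the tables in a database file together with their row counts. The schema written by HCPmqt is not described here, so the code discovers table names instead of assuming them; `describe_database` and the file name in the usage example are hypothetical.

```python
import sqlite3
from contextlib import closing


def describe_database(db_path: str) -> dict:
    """Map each table name in the SQLite database to its row count."""
    with closing(sqlite3.connect(db_path)) as conn:
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        counts = {}
        for name in names:
            # Quote the identifier; table names cannot be bound as parameters.
            quoted = '"' + name.replace('"', '""') + '"'
            counts[name] = conn.execute(
                f"SELECT COUNT(*) FROM {quoted}"
            ).fetchone()[0]
        return counts


if __name__ == "__main__":
    # Hypothetical file name; use the output file you selected in the tool.
    print(describe_database("hcpmqt_output.sqlite3"))
```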
After pressing the Run Query button, the status frame will show information about the progress of a query.
You can pause or cancel a query at any time; however, you need to wait for the current page request to finish before the pause or cancellation takes effect.
This is what happens when you hit the Run Query button:
This runs in a loop until all requested records have been received.
HCP Metadata Query Tool is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
HCP Metadata Query Tool is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with HCP Metadata Query Tool. If not, see the license page at gnu.org.
Copyright 2012-2015 Thorsten Simons