|
|
[[_TOC_]]
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### How-to post-install CII services
|
|
|
|
|
|
Most CII services (exceptions: telemetry and alarms) come as part of the DevEnv installation; however, some post-installation set-up is needed before they can be used.
|
|
|
|
|
|
Post-installation is always necessary when you start working on a newly installed DevEnv host. After an upgrade of the DevEnv on an existing host, post-install may be necessary, too; in that case, the Release Notes of the DevEnv version will say so.
|
|
|
|
|
|
_Note on DevEnv versions before 3.6_: On older DevEnv versions, you need to download the post-install package first. On DevEnv 3.4, first execute this command (as root): `yum -y install elt-ciisrv-postinstall`. On DevEnv 3.5, execute this command (as root): `yum -y update elt-ciisrv-postinstall`.
|
|
|
|
|
|
To run post-install, execute this command (as root):
|
|
|
|
|
|
```plaintext
|
|
|
# /elt/ciisrv/postinstall/cii-postinstall <choose a role>
|
|
|
```
|
|
|
|
|
|
To learn about the options, run the command without arguments.
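For example, to give the host the "groupclient" role mentioned below (the exact role argument spelling is taken from the list the command prints when run without arguments):

```plaintext
# /elt/ciisrv/postinstall/cii-postinstall role_groupclient
```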
|
|
|
|
|
|
For more details and examples, see the [cii-postinstall user manual](http://www.eso.org/~eltmgr/CII/latest/manuals/html/docs/services.html).
|
|
|
|
|
|
After post-install, you will want to start the CII services on your host (unless you have assigned the "groupclient" role to the host, which means you will use the CII services running on another host).
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### How-to start/stop CII services
|
|
|
|
|
|
The cii-services utility lets you start/stop/monitor the CII services. For some operations it requires root privileges and can be run with `sudo`. If you don't use sudo, it will show a root password prompt when needed.
|
|
|
|
|
|
Note: Before using CII services, you or an administrator have to run the CII-post-installation, see [How-to post-install CII services](#how-to-post-install-cii-services)
|
|
|
|
|
|
_status_ This is a feature-centric view that tells you which features are available. For example, "Blob Values" means that the distributed file system MinIO (in previous versions: Hadoop) is available for storage of large values and binaries.
|
|
|
|
|
|
```plaintext
|
|
|
$ cii-services status
|
|
|
```
|
|
|
|
|
|
_info_ This is a deployment-centric view that tells you whether the services are running and where they are.
|
|
|
|
|
|
```plaintext
|
|
|
$ cii-services info
|
|
|
```
|
|
|
|
|
|
_start / stop_
|
|
|
|
|
|
```plaintext
|
|
|
$ sudo cii-services start <services>
|
|
|
```
|
|
|
|
|
|
To learn about the options, run the command without arguments.
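For example, to stop the ElasticSearch service (the same command is used in the g++ memory article below):

```plaintext
$ sudo cii-services stop elasticsearch
```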
|
|
|
|
|
|
For more details and examples, see the [cii-services user manual](http://www.eso.org/~eltmgr/CII/latest/manuals/html/docs/services.html).
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### How-to get CII Demo apps
|
|
|
|
|
|
CII has demo apps that you can download as source, modify, and build yourself. They demonstrate the use of the CII services.
|
|
|
|
|
|
```plaintext
|
|
|
$ git clone https://oauth2:ujak_jA2BjkL2UDW6v5h@gitlab.eso.org/cii/info/cii-demo.git
|
|
|
Cloning into 'cii-demo'...
|
|
|
[...]
|
|
|
|
|
|
$ cd cii-demo
|
|
|
$ ./cii-demo.sh
|
|
|
|
|
|
Building (this may take some minutes) ...
|
|
|
Installing into INTROOT: /home/eltdev/INTROOT
|
|
|
PREFIX is set to: /home/eltdev/INTROOT
|
|
|
Find the build output in ./cii-demo.sh.build.log
|
|
|
[...]
|
|
|
```
|
|
|
|
|
|
After this, you find the list of available demo apps in the (generated) README file. You can modify the sources at any time and rebuild them with `waf build install`. Look inside `cii-demo.sh` if unsure.
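For example, after modifying a demo app's sources:

```plaintext
$ cd cii-demo
$ waf build install
```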
|
|
|
|
|
|
Most demo apps require CII services to be running on your host or on another host, so check that the related CII services (e.g. the config service for a config demo app) are accessible: see [How-to start/stop CII Services](#how-to-start-stop-cii-services)
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### Insufficient Manuals, Contributions
|
|
|
|
|
|
**Problem**
|
|
|
|
|
|
I found that some information in the user manuals is incomplete, unclear, outdated, or misleading.
|
|
|
|
|
|
**Solution 1**
|
|
|
|
|
|
If you're not sure exactly what you're looking for, go and [create a CII ticket](https://jira.eso.org/secure/CreateIssue!default.jspa); we can help with the problem at hand and discuss how to improve the documentation for others.
|
|
|
|
|
|
**Solution 2**
|
|
|
|
|
|
If you have a fairly clear idea what should be added to the manuals, you can propose and make changes to the documentation source files directly. Contributions are welcome!
|
|
|
|
|
|
The CII all-in-one User Manual is composed of a dozen sub-documents in reStructuredText format (.rst). The only (minor) challenge is therefore to identify which sub-document you want to edit.
|
|
|
|
|
|
The documentation resides in `https://gitlab.eso.org/cii/info/cii-docs.git`, which you can clone as usual to a local workspace, then modify and create a merge request.
|
|
|
|
|
|
Alternatively (and recommended), you can edit the files directly in the browser using the File Editor built into GitLab:
|
|
|
|
|
|
_Note: If any of the buttons mentioned below is greyed out, you lack permissions. If so, please contact us first to request permission._
|
|
|
|
|
|
1. Browse to <https://gitlab.eso.org/cii/info/cii-docs>,
|
|
|
and navigate to folder `userManual/ciiman/src/docs`
|
|
|
2. Find the correct .rst file, select it, and press "Edit",

and from the pop-up list of choices, choose "Edit single file".
|
|
|
3. Make your changes in the content editing page.
|
|
|
Note you can toggle between Write and Preview mode.
|
|
|
|
|
|
When done with editing, see the lower part of the screen:
|
|
|
|
|
|
4. Describe your change:
|
|
|
* Commit message: please write some rationale for your contribution
|
|
|
Info: this text will re-appear later as the Title of your Merge Request.
|
|
|
* Target branch:
|
|
|
* DO NOT use the default (master), instead:
|
|
|
* empty the field and write e.g.
|
|
|
"hints-on-creating-document", or
|
|
|
"more-details-on-network-interface"
|
|
|
Note that the name cannot contain whitespace.
|
|
|
* Checkbox "Start a new merge request": leave default (YES)
|
|
|
5. Commit changes (blue button on lower left).
|
|
|
|
|
|
The Merge Request window appears:
|
|
|
|
|
|
6. Everything is prefilled. If you want, you can make changes:
|
|
|
* The Title. It is prefilled from your commit message, but you can change it.
|
|
|
* Leave all the defaults (Mark as draft: NO, Delete source branch: YES, Squash: NO)
|
|
|
7. Create Merge Request (blue button)
|
|
|
|
|
|
Inspect the result:
|
|
|
|
|
|
8. Wait for the pipeline to finish:
|
|
|
* After the pipeline finishes, Jenkins will add its comment to your merge request page.
|
|
|
* and in that comment follow the link under "Artifacts List" to see your change applied.
|
|
|
|
|
|
9. To refine the applied change, go back to step 1.
|
|
|
Use the same branch name in step 4, so that you do not create a 2nd Merge Request
|
|
|
(what you want is to update your Merge Request, not create another one).
|
|
|
|
|
|
We'll check the merge request, and your change can make it to the next release. Thanks in advance!
|
|
|
|
|
|
### ------------------------------------------------------
|
|
|
|
|
|
### \[ICD\]
|
|
|
|
|
|
### Failure to build ZPB from ICD \[ICD waf build\]
|
|
|
|
|
|
**Problem**
|
|
|
|
|
|
Building an ICD, I get the following error:
|
|
|
|
|
|
```plaintext
|
|
|
error: return type specification for constructor invalid
|
|
|
```
|
|
|
|
|
|
**Solution**
|
|
|
|
|
|
You have defined a struct containing a member of the same name ("feedback" in the example below). Rename one of the two.
|
|
|
|
|
|
```xml
|
|
|
<struct name="feedback">
|
|
|
<member name="feedback" type="float" arrayDimensions="(10)"/>
|
|
|
<member name="counter" type="uint32_t"/>
|
|
|
</struct>
|
|
|
```
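For example, a renamed version that builds (the new member name is illustrative):

```xml
<struct name="feedback">
    <member name="feedbackValues" type="float" arrayDimensions="(10)"/>
    <member name="counter" type="uint32_t"/>
</struct>
```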
|
|
|
|
|
|
Or, you have used a struct name that is a reserved word in Protobuf. Rename it.
|
|
|
|
|
|
```xml
|
|
|
<struct name="Swap">
|
|
|
<member name="counter" type="uint32_t"/>
|
|
|
</struct>
|
|
|
```
|
|
|
|
|
|
**Background**
|
|
|
|
|
|
In the first case, the code that gets generated for the member looks to the compiler like a mal-formed constructor. In the second case, you have used a struct name that is already taken by a method name that protobuf silently generates into the code to be compiled, which leads to the same problem.
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### No cpp/python from ICD \[ICD waf build\]
|
|
|
|
|
|
**Problem**
|
|
|
|
|
|
When I build my ICD, the build completes without errors or warnings, but it has not generated any cpp and python classes for my ICD. I do see the generated protobuf files, though.
|
|
|
|
|
|
**Solution**
|
|
|
|
|
|
Check that you have specified all needed dependencies in your waf script. The "requires" list needs to contain the following entries: `requires='cxx python protoc fastdds boost cii gtest nosetests pytest'`
|
|
|
|
|
|
The full waf script for your project would look something like this:
|
|
|
|
|
|
```python
|
|
|
declare_project(name='mytests',
|
|
|
version='1.0.0',
|
|
|
requires='cxx python protoc fastdds boost cii gtest nosetests pytest',
|
|
|
boost_libs='program_options',
|
|
|
cstd='gnu11',
|
|
|
cxx_std='gnu++14',
|
|
|
recurse='myconfig mylsvsim icd tests')
|
|
|
```
|
|
|
|
|
|
Also, note that in DevEnv 3.x the order of dependencies matters:
|
|
|
|
|
|
```plaintext
|
|
|
# Will not work, see the warnings during the 'waf configure' step.
|
|
|
requires='cxx python protoc fastdds cii boost gtest nosetests pytest pyqt5 sphinx'
|
|
|
|
|
|
# This order works
|
|
|
requires='cxx python protoc fastdds boost cii gtest nosetests pytest pyqt5 sphinx'
|
|
|
```
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### PYBIND errors \[ICD waf build\]
|
|
|
|
|
|
**Problem**
|
|
|
|
|
|
Trying to build your MAL Application, you get errors like below related to the PYBIND module.
|
|
|
|
|
|
```plaintext
|
|
|
icd/python/bindings/src/ModProto-benchmark.cpp:18:25: error: expected initializer before ‘-’ token
|
|
|
PYBIND11_MODULE(ModProto-benchmark, modproto-benchmark) {
|
|
|
```
|
|
|
|
|
|
**Solution**
|
|
|
|
|
|
Check the name of your ICD file:
|
|
|
|
|
|
```plaintext
|
|
|
> find icd
|
|
|
icd
|
|
|
icd/wscript
|
|
|
icd/src
|
|
|
icd/src/proto-benchmark.xml
|
|
|
```
|
|
|
|
|
|
The ICD file name contains a minus sign, which is reflected in the above error message.
|
|
|
|
|
|
Rename the file to something like this:
|
|
|
|
|
|
```plaintext
|
|
|
> find icd
|
|
|
icd
|
|
|
icd/wscript
|
|
|
icd/src
|
|
|
icd/src/protobenchmark.xml
|
|
|
```
|
|
|
|
|
|
In general, due to the many code generation steps taking place, your freedom in ICD file naming is limited.
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### multiple XMLs found \[ICD waf build\]
|
|
|
|
|
|
**Problem**
|
|
|
|
|
|
Trying to build your ICD module, you see this error:
|
|
|
|
|
|
```plaintext
|
|
|
Waf: Entering directory `/home/eltdev/repos/hlcc/build'
|
|
|
Error: multiple XMLs found, just one supported.
|
|
|
```
|
|
|
|
|
|
while in fact you have only one XML file in your ICD directory.
|
|
|
|
|
|
**Solution**
|
|
|
|
|
|
Check the file name of your ICD file: make sure it starts with an uppercase letter.
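For example (file names borrowed from the PYBIND article above; adjust to your module):

```plaintext
$ mv icd/src/protobenchmark.xml icd/src/Protobenchmark.xml
```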
|
|
|
|
|
|
**Background**
|
|
|
|
|
|
The error message is misleading (will be improved, [ECII-426](https://jira.eso.org/browse/ECII-426)).
|
|
|
|
|
|
The code generator for malicd_topics fails when the ICD file name starts with lowercase.
|
|
|
|
|
|
For more information, see also: [PYBIND errors \[ICD waf build\]](#pybind-errors-icd-waf-build)
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### g++: internal compiler error, g++ fatal error \[ICD waf build\]
|
|
|
|
|
|
**Problem**
|
|
|
|
|
|
Trying to build an ICD-module or MAL-application, the build takes a long time, and/or fails with an error message like this:
|
|
|
|
|
|
```plaintext
|
|
|
g++: fatal error: Killed signal terminated program cc1plus
|
|
|
compilation terminated.
|
|
|
```
|
|
|
|
|
|
```plaintext
|
|
|
Software/CcsLibs/CcsTestData/python/bindings/src/ModCcstestdata.cpp:18:1: note:
|
|
|
in expansion of macro ‘PYBIND11_MODULE’
|
|
|
PYBIND11_MODULE(ModCcstestdata, modccstestdata) {
|
|
|
^
|
|
|
g++: internal compiler error: Killed (program cc1plus)
|
|
|
```
|
|
|
|
|
|
**Background**
|
|
|
|
|
|
The cpp compiler runs out of memory and crashes. You can see the effect by running htop in a separate terminal, all memory (including swap space) is consumed by the g++ compiler, which consequently crashes.
|
|
|
|
|
|
_Memory needed for building a given ICD module_
|
|
|
|
|
|
There is a base load that is the same for all ICD modules. On top of that, the ICD file contents determine how much memory is needed to build the module.
|
|
|
|
|
|
_Rule of Thumb_
|
|
|
| MAL version | Base Load | Mem per ICD-Struct |
|-------------|-----------|--------------------|
| MAL 1.x | 650 MB (2/3 GB) | 320 MB (1/3 GB) |
| MAL 2.0 | 650 MB (2/3 GB) | 110 MB (1/8 GB) |
|
|
|
|
|
|
Thus, if your biggest ICD contains 20 structs, building under MAL 1.x will require around 7 GB of available free memory.
|
|
|
|
|
|
_Measuring_
|
|
|
|
|
|
Record metrics of the ICD build with this time-command:
|
|
|
|
|
|
```plaintext
|
|
|
$ alias time='TIME="real\t%E\nmem\t%Mk\ncpu\t%P\npf\t%F" time'
|
|
|
$ time waf build
|
|
|
|
|
|
[...]
|
|
|
real 10:28.19
|
|
|
mem 7206676k
|
|
|
cpu 765%
|
|
|
pf 166
|
|
|
```
|
|
|
|
|
|
- If the build crashes, the time-command's output will not be fully reliable (the real memory need is higher than what the output shows).
|
|
|
- High page fault counts (`pf 1168635`) generally indicate you should reduce the module's footprint, see Solutions below.
|
|
|
|
|
|
More info is available at [ECII-109](https://jira.eso.org/browse/ECII-109)
|
|
|
|
|
|
**Solution 1: Decrease the module's footprint**
|
|
|
|
|
|
1. Remove unnecessary middlewares
|
|
|
|
|
|
Use the `xyz_disabled` options:
|
|
|
|
|
|
```python
|
|
|
from wtools import module
|
|
|
# Disable OPCUA and DDS, since not part of this interface.
|
|
|
module.declare_malicd(mal_opts={'opcua_disabled': True, 'dds_disabled': True})
|
|
|
```
|
|
|
|
|
|
2. Reduce build parallelism
|
|
|
|
|
|
By default the build system uses all cores on the host. Less parallelism means less memory consumers during the build. This is controlled by the waf `-j` option.
|
|
|
|
|
|
To build with only 4 cores: `$ time waf -j4 build`
|
|
|
|
|
|
As a rough estimate, each waf build task will consume around 2 GB of RAM, so on a 12-core host with 16 GB RAM, a parallelism of 8 may be a good choice. Try different numbers of cores and use the output of the time command (see above) to find an optimum between real time, page faults, and CPU load.
|
|
|
|
|
|
3. Adjust the compiler flags
|
|
|
|
|
|
The default set of compiler flags applied by the build system consumes significant memory. We recommend using "-O2 -flto -pipe" (_to be confirmed_) instead. This is how you pass custom compiler flags for your ICD-module:
|
|
|
|
|
|
In your project wscript:
|
|
|
|
|
|
```python
|
|
|
from wtools import project
|
|
|
[...]
|
|
|
def configure(cnf):
    cnf.env.CXXFLAGS_MALPYTHON = '-O2 -flto -pipe'
|
|
|
[...]
|
|
|
```
|
|
|
|
|
|
4. Refactor your ICD
|
|
|
|
|
|
Reduce the memory need by splitting the big ICDs up into 2 or more smaller ICD modules.
|
|
|
|
|
|
**Solution 2: Increase the available memory**
|
|
|
|
|
|
1. Find RAM consumers and stop them, at least temporarily. For example, ElasticSearch uses a significant amount of RAM: `sudo cii-services stop elasticsearch`
|
|
|
2. Add temporary swap space to your host
|
|
|
|
|
|
```shell
|
|
|
# As root:
|
|
|
fallocate -l 8G /swapfile
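# dd fills the file with zeros; fallocate alone may not be supported for swap files on all filesystems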
|
|
|
dd if=/dev/zero of=/swapfile bs=1024 count=8388608
|
|
|
chmod 600 /swapfile
|
|
|
mkswap /swapfile
|
|
|
swapon /swapfile
|
|
|
|
|
|
# and to remove it:
|
|
|
swapoff -v /swapfile
|
|
|
rm -f /swapfile
|
|
|
```
|
|
|
|
|
|
3. Add permanent memory to your VM
|
|
|
|
|
|
Increase your RAM, or ask your system administrator to do it. Assess the necessary amount by using the "Rule of Thumb" above.
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### Choose middlewares to build \[MAL ICD\]
|
|
|
|
|
|
**Problem**
|
|
|
|
|
|
I am certain that my ICD will never be used over OPC UA. Nonetheless, the ICD-compilation builds OPC UA mappings for my ICD. This is unnecessarily extending the compilation time for my application.
|
|
|
|
|
|
**Solution**
|
|
|
|
|
|
By default the ICD-compilation builds mappings for all middlewares. But it is possible to exclude certain middleware mappings from compilation, which will reduce compilation time. You do this by passing mal options to the icd-generator.
|
|
|
|
|
|
**Example**
|
|
|
|
|
|
wscript
|
|
|
|
|
|
```python
|
|
|
declare_malicd(use='icds.base', mal_opts = { 'opcua_disabled': True } )
|
|
|
```
|
|
|
|
|
|
The available options are:
|
|
|
|
|
|
- `opcua_disabled` = if True, disable OPCUA middleware generation
- `dds_disabled` = if True, disable DDS middleware generation
- `zpb_disabled` = if True, disable ZEROMQ middleware generation
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### Variable Tracking exceeded \[ICD\]
|
|
|
|
|
|
**Problem**
|
|
|
|
|
|
When building an ICD using CII-MAL, you see this warning message:
|
|
|
|
|
|
```plaintext
|
|
|
variable tracking size limit exceeded
|
|
|
```
|
|
|
|
|
|
**Background**
|
|
|
|
|
|
The `-fvar-tracking-assignments` option is automatically enabled by GCC when optimizations are enabled.
|
|
|
|
|
|
There is a limit on how many variables can be tracked by the compiler. The warning tells you that more variables would need to be tracked than the limit allows.
|
|
|
|
|
|
You can disable the tracking manually with `-fno-var-tracking-assignments`.
|
|
|
|
|
|
There are two easy ways to apply this to the whole project.
|
|
|
|
|
|
**Solution A**
|
|
|
|
|
|
```plaintext
|
|
|
export CXXFLAGS=-fno-var-tracking-assignments
|
|
|
```
|
|
|
|
|
|
And then rerun “waf configure” and continue with the build.
|
|
|
|
|
|
Note: the variable must be exported each time you run "waf configure", as that is the point at which such flags are saved.
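Putting the steps together:

```plaintext
$ export CXXFLAGS=-fno-var-tracking-assignments
$ waf configure
$ waf build
```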
|
|
|
|
|
|
**Solution B**
|
|
|
|
|
|
If you want it to be part of the project itself, put the flags inside your project: in the top-level wscript (where you define the project), add this configure section before the project declaration:
|
|
|
|
|
|
```python
def configure(cnf):
    cnf.env.append_value('CXXFLAGS', ['-fno-var-tracking-assignments'])
```
|
|
|
|
|
|
### ------------------------------------------------------
|
|
|
|
|
|
### \[MAL\]
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### Avoid ephemeral ports \[MAL\]
|
|
|
|
|
|
**Problem**
|
|
|
|
|
|
When you leave a client running for a while (say 20 minutes) without a server available, you start getting errors:
|
|
|
|
|
|
```plaintext
|
|
|
Message=Malformed message received, missing frames.
|
|
|
```
|
|
|
|
|
|
Starting the server does not fix the situation, and gives this error:
|
|
|
|
|
|
```plaintext
|
|
|
Errno 48 : Address already in use
|
|
|
```
|
|
|
|
|
|
**Solution**
|
|
|
|
|
|
The solution is to not use ports from the ephemeral range (a.k.a. local port range). The port numbers in the ephemeral range can be found with `cat /proc/sys/net/ipv4/ip_local_port_range`. For DevEnv 3.x with CentOS 8 they are:
|
|
|
| Port Range | Usable for MAL |
|------------|----------------|
| 1 - 1023 | 🟠 No |
| 1024 - 32767 | 🟢 Yes |
| 32768 - 60999 | 🟠 No |
| 61000 - 65535 | 🟢 Yes |
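To verify the range on your host:

```plaintext
$ cat /proc/sys/net/ipv4/ip_local_port_range
32768   60999
```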
|
|
|
|
|
|
With the implementation of [ECII-402](https://jira.eso.org/browse/ECII-402), the MAL library will write a warning log (and can also be configured to throw an exception) if an application runs a MAL instance on one of the ephemeral ports.
|
|
|
|
|
|
**Background**
|
|
|
|
|
|
(Explanation provided by M. Sekoranja)
|
|
|
|
|
|
After a while of (re-)connection attempts, the client will manage to connect... to itself! Because a client never expects messages from a client, errors about malformed messages are emitted.
|
|
|
|
|
|
This happens because the TCP design allows for a 'simultaneous connect' feature: if a client is trying to connect to a local port, and the port is from the ephemeral range, it can occasionally connect to itself. The client thinks it is connected to a server, while it is actually connected to itself. Moreover, the server cannot bind to its server port anymore.
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### Req/Rep Connection Listeners \[MAL Python\]
|
|
|
|
|
|
**Problem**
|
|
|
|
|
|
MAL includes a callback registration method to allow monitoring of the MAL service connection status.
|
|
|
|
|
|
This is done with the method `registerConnectionListener()`.
|
|
|
|
|
|
Given the example below, this does not work, and the `listenerMethod()` is never called.
|
|
|
|
|
|
|
|
|
|
|
|
```python
|
|
|
import sys
|
|
|
import datetime
|
|
|
import signal
|
|
|
import traceback
|
|
|
import logging
|
|
|
|
|
|
import elt.pymal as mal
|
|
|
from pymalcpp import TimeoutException
|
|
|
from ModTrk.Trk.StdCmds import StdCmdsSync
|
|
|
from ModTrk.Trk.StdCmds import StdCmdsAsync
|
|
|
from ModTrk.Trk import TelPosition
|
|
|
from ModTrk.Trk import AxesPosition
|
|
|
|
|
|
THREE_SECONDS = datetime.timedelta(seconds=3)
|
|
|
MINUTE = datetime.timedelta(seconds=60)
|
|
|
MY_SERVER_URL='localhost'
|
|
|
MY_SERVER_PORT='44444'
|
|
|
|
|
|
def listenerMethod(state):
|
|
|
print("listenerMethod: registerConnectionListener() response :" + str(state))
|
|
|
|
|
|
|
|
|
|
|
|
uri = 'zpb.rr://' + MY_SERVER_URL + ':' + str(MY_SERVER_PORT) + '/m1/' + 'TrkLsvServer'
|
|
|
print('MAL URI: ' + uri)
|
|
|
zpbMal = mal.loadMal('zpb', {})
|
|
|
factory = mal.CiiFactory.getInstance()
|
|
|
factory.registerMal('zpb', zpbMal )
|
|
|
stdcmds = factory.getClient(uri, StdCmdsAsync, qos=mal.rr.qos.ReplyTime(THREE_SECONDS))
|
|
|
stdcmds.registerConnectionListener(listenerMethod)
|
|
|
|
|
|
connectionFuture = stdcmds.asyncConnect()
|
|
|
|
|
|
connectionFuture.wait_for(THREE_SECONDS)
|
|
|
rtn = stdcmds.Status()
|
|
|
rtn.wait()
|
|
|
print( str(rtn.get() ))
|
|
|
```
|
|
|
|
|
|
_From <_[_https://jira.eso.org/browse/ECII-212_](https://jira.eso.org/browse/ECII-212)_>_
|
|
|
|
|
|
**Solution**
|
|
|
|
|
|
The developer must keep a reference to the object returned by the `registerConnectionListener()` invocation.
|
|
|
|
|
|
```python
|
|
|
stdcmds = factory.getClient(uri, StdCmdsAsync, qos=mal.rr.qos.ReplyTime(THREE_SECONDS))
|
|
|
listenerRegistration = stdcmds.registerConnectionListener(listenerMethod)
|
|
|
```
|
|
|
|
|
|
**Background**
|
|
|
|
|
|
The documentation states a return value, but it is the responsibility of the developer to keep the reference. Otherwise, the object will be deleted when exiting the block of code.
|
|
|
|
|
|
Remember to delete this object (assign None to it) when closing the connection to the MAL service.
|
|
|
|
|
|
_From <_[_https://jira.eso.org/browse/ECII-212_](https://jira.eso.org/browse/ECII-212)_>_
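A minimal sketch of the lifecycle, using the names from the example above:

```python
# keep the returned registration object alive for the lifetime of the connection
listenerRegistration = stdcmds.registerConnectionListener(listenerMethod)

# ... use the client ...

# release the registration when closing the connection to the MAL service
listenerRegistration = None
```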
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### Latency on Pub/Sub \[MAL ZMQ\]
|
|
|
|
|
|
**Problem**
|
|
|
|
|
|
We see latencies of 400-1000ms in MAL ZMQ pub/sub communication, for sending a 12k x 12k image blob.
|
|
|
|
|
|
For the first 2 transmissions, there is somewhere between 400 and 500 ms of latency on the publisher side, between just before calling the "publish" method and when we see the first packets on the wire.
|
|
|
|
|
|
This affects all messages, not only the first ones: when we let the program run for several minutes, all messages arrive on the subscriber side with a consistent delay during that period.
|
|
|
|
|
|
**Solution**
|
|
|
|
|
|
From MAL 1.0.4 on, MAL supports the MAL-specific property below to limit the queue size for large-message publishers.
|
|
|
|
|
|
```cpp
|
|
|
mal::Mal::Properties m_malProperties1;
|
|
|
m_malProperties1["zpb.ps.zmq.sndhwm"] = "1";
|
|
|
|
|
|
auto publisher = factory.getPublisher<mal::example::Sample>(uri,
|
|
|
{ std::make_shared<mal::ps::qos::Latency>(
|
|
|
std::chrono::milliseconds(100)),
|
|
|
std::make_shared<mal::ps::qos::Deadline>(
|
|
|
std::chrono::seconds(1)) }, m_malProperties1);
|
|
|
```
|
|
|
|
|
|
For more information on how to use MAL-specific properties, see the MAL Binding Manual.
|
|
|
|
|
|
**Background**
|
|
|
|
|
|
The issue has been first reported in [ECII-159](https://jira.eso.org/browse/ECII-159).
|
|
|
|
|
|
The problem lies in the ZMQ send queues. The default size is 1000, and with 144 MB per message (in this case) this means 144 GB. With the queue limited to 1 (for a test), a publisher can handle 20 subscribers (tested) without any problems. The solution is to reconfigure the send-queue size appropriately.
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### Sending an array of unions \[MAL CPP\]
|
|
|
|
|
|
**Problem**
|
|
|
|
|
|
Trying to send a msg, I get the following error message from CII:
|
|
|
|
|
|
```plaintext
|
|
|
[libprotobuf ERROR google/protobuf/message_lite.cc:121] Can't parse message of type "generated.zpb.fcfif.StdCmds_Request" because it is missing required fields: data.Setup.payload[0].piezoData.input
|
|
|
```
|
|
|
|
|
|
My ICD definition looks like this
|
|
|
|
|
|
```xml
|
|
|
<enum name="PiezoInput">
|
|
|
<enumerator name="PIEZO_INPUT_SETMODE" />
|
|
|
<enumerator name="PIEZO_INPUT_MOVE" />
|
|
|
</enum>
|
|
|
|
|
|
<enum name="PiezoMode">
|
|
|
<enumerator name="PIEZO_MODE_1" />
|
|
|
<enumerator name="PIEZO_MODE_2" />
|
|
|
</enum>
|
|
|
|
|
|
<struct name="PiezoModeStruct">
|
|
|
<member name="mode" type="nonBasic" nonBasicTypeName="PiezoMode" />
|
|
|
</struct>
|
|
|
|
|
|
<enum name="PiezoMove">
|
|
|
<enumerator name="PIEZO_MOVE_1" />
|
|
|
<enumerator name="PIEZO_MOVE_2" />
|
|
|
</enum>
|
|
|
|
|
|
<union name="PiezoUnion">
|
|
|
<discriminator type="nonBasic" nonBasicTypeName="PiezoInput" />
|
|
|
<case>
|
|
|
<caseDiscriminator value ="PIEZO_INPUT_SETMODE"/>
|
|
|
<member name="piezoModeData" type="nonBasic" nonBasicTypeName="PiezoModeStruct" />
|
|
|
</case>
|
|
|
<case>
|
|
|
<caseDiscriminator value ="PIEZO_INPUT_MOVE"/>
|
|
|
<member name="piezoMoveData" type="nonBasic" nonBasicTypeName="PiezoMove" />
|
|
|
</case>
|
|
|
</union>
|
|
|
|
|
|
<struct name="Piezo">
|
|
|
<member name="id" type="string" />
|
|
|
<member name="input" type="nonBasic" nonBasicTypeName="PiezoUnion" />
|
|
|
</struct>
|
|
|
|
|
|
<interface name="PiezoTest">
|
|
|
<method name="test" returnType="void">
|
|
|
<argument name="arr" type="nonBasic" nonBasicTypeName="Piezo" arrayDimensions="(10)" />
|
|
|
</method>
|
|
|
</interface>
|
|
|
```
|
|
|
|
|
|
My code looks like this
|
|
|
|
|
|
```cpp
|
|
|
[...]
|
|
|
auto piezo = mal->createDataEntity<::fcfif::Piezo>();
|
|
|
piezo->setId("foo");
|
|
|
|
|
|
auto input = piezo->getInput();
|
|
|
auto mode = input->getPiezoModeData();
|
|
|
mode->setAction(::fcfif::ActionPiezoMode::SET_AUTO);
|
|
|
|
|
|
auto fcsUnion = mal->createDataEntity<::fcfif::FcsUnion>();
fcsUnion->setPiezoData(piezo);
|
|
|
|
|
|
[...]
|
|
|
```
|
|
|
|
|
|
**Solution**
|
|
|
|
|
|
```cpp
|
|
|
auto p = factory.getClient<::fcfif::PiezoTestSync>(uri,
|
|
|
{std::make_shared<mal::rr::qos::ReplyTime>
|
|
|
(std::chrono::seconds(3))},
|
|
|
{});
|
|
|
auto mal = p->getMal();
|
|
|
|
|
|
auto piezo = mal->createDataEntity<::fcfif::Piezo>();
|
|
|
piezo->setId("foo");
|
|
|
|
|
|
auto input = piezo->getInput();
|
|
|
auto piezoModeStruct = mal->createDataEntity<::fcfif::PiezoModeStruct>();
|
|
|
piezoModeStruct->setMode(::fcfif::PiezoMode::PIEZO_MODE_1);
|
|
|
|
|
|
input->setPiezoModeData(piezoModeStruct);
|
|
|
|
|
|
auto piezo2 = mal->createDataEntity<::fcfif::Piezo>();
|
|
|
piezo2->setId("foo2");
|
|
|
|
|
|
auto input2 = piezo2->getInput();
|
|
|
|
|
|
input2->setPiezoMoveData(::fcfif::PiezoMove::PIEZO_MOVE_1);
|
|
|
|
|
|
std::vector<std::shared_ptr<::fcfif::Piezo>> sa;
|
|
|
sa.push_back(piezo);
|
|
|
sa.push_back(piezo2);
|
|
|
p->test(sa);
|
|
|
```
|
|
|
|
|
|
**Background**
|
|
|
|
|
|
Your code does not work because you do not use the union instance provided by the parent structure. You need to obtain nested structures/unions via their accessors; do not try to create your own (detached) instances.
|
|
|
|
|
|
This issue was first described in [ECII-154](https://jira.eso.org/browse/ECII-154)
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### Failed to send request, send queue full \[MAL\]
|
|
|
|
|
|
**Problem**
|
|
|
|
|
|
You are intending to update a config, but you get an IllegalStateException where the last line is
|
|
|
|
|
|
```plaintext
|
|
|
elt.mal.zpb.rr.ClientAsyncImpl:183
|
|
|
```
|
|
|
|
|
|
which corresponds to `throw new MalException("Failed to send request, send queue full");`
|
|
|
|
|
|
**Solution**
|
|
|
|
|
|
Probably you have called close() on the CiiConfigClient instance somewhere, maybe also implicitly through a try-with-resources block.
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### Getting More Logs \[MAL Log\]
|
|
|
|
|
|
**Problem**
|
|
|
|
|
|
The MAL seems to misbehave. How can I get more log messages from the MAL used in my application?
|
|
|
|
|
|
**Solution A**
|
|
|
|
|
|
**Java**
|
|
|
|
|
|
From MAL 1.1.0, edit the MAL log4j config xml and specify the MAL log levels:
|
|
|
|
|
|
```xml
|
|
|
<Logger name="elt.mal" level="TRACE" />
|
|
|
```
|
|
|
|
|
|
**Cpp**
|
|
|
|
|
|
Example for **Zpb**. For other middlewares, see below.
|
|
|
|
|
|
1. Put a log-config file into your file system:
|
|
|
|
|
|
```plaintext
|
|
|
log4cplus.rootLogger=TRACE, stdout
|
|
|
|
|
|
log4cplus.logger.malDds=TRACE, MyFileAppender
|
|
|
log4cplus.logger.malDdsBasePubSub=TRACE, MyFileAppender
|
|
|
log4cplus.logger.malDdsPublisher=TRACE, MyFileAppender
|
|
|
log4cplus.logger.malDdsInstancePublisher=TRACE, MyFileAppender
|
|
|
log4cplus.logger.malDdsSubscriptionManager=TRACE, MyFileAppender
|
|
|
log4cplus.logger.malDdsSubscriptionReaderListener=TRACE, MyFileAppender
|
|
|
log4cplus.logger.malDdsSubscriber=TRACE, MyFileAppender
|
|
|
log4cplus.logger.malDdsMrvSubscriber=TRACE, MyFileAppender
|
|
|
log4cplus.logger.malDdsRequesterImpl=TRACE, MyFileAppender
|
|
|
|
|
|
log4cplus.additivity.malDds=True
|
|
|
log4cplus.additivity.malDdsBasePubSub=True
|
|
|
log4cplus.additivity.malDdsPublisher=True
|
|
|
log4cplus.additivity.malDdsInstancePublisher=True
|
|
|
log4cplus.additivity.malDdsSubscriptionManager=True
|
|
|
log4cplus.additivity.malDdsSubscriptionReaderListener=True
|
|
|
log4cplus.additivity.malDdsSubscriber=True
|
|
|
log4cplus.additivity.malDdsMrvSubscriber=True
|
|
|
log4cplus.additivity.malDdsRequesterImpl=True
|
|
|
|
|
|
|
|
|
log4cplus.logger.malZpb=TRACE, MyFileAppender
|
|
|
log4cplus.logger.malZpbBasePubSub=TRACE, MyFileAppender
|
|
|
log4cplus.logger.malZpbPublisher=TRACE, MyFileAppender
|
|
|
log4cplus.logger.malZpbInstancePublisher=TRACE, MyFileAppender
|
|
|
log4cplus.logger.malZpbSubscriber=TRACE, MyFileAppender
|
|
|
log4cplus.logger.malZpbMrvSubscriber=TRACE, MyFileAppender
|
|
|
log4cplus.logger.malZpbServer=TRACE, MyFileAppender
|
|
|
log4cplus.logger.malZpbClientAsyncImpl=TRACE, MyFileAppender
|
|
|
|
|
|
log4cplus.additivity.malZpb=True
|
|
|
log4cplus.additivity.malZpbBasePubSub=True
|
|
|
log4cplus.additivity.malZpbPublisher=True
|
|
|
log4cplus.additivity.malZpbInstancePublisher=True
|
|
|
log4cplus.additivity.malZpbSubscriber=True
|
|
|
log4cplus.additivity.malZpbMrvSubscriber=True
|
|
|
log4cplus.additivity.malZpbServer=True
|
|
|
log4cplus.additivity.malZpbClientAsyncImpl=True
|
|
|
|
|
|
|
|
|
log4cplus.logger.malOpcua=TRACE, MyFileAppender
|
|
|
log4cplus.logger.malOpcuaBasePubSub=TRACE, MyFileAppender
|
|
|
log4cplus.logger.malOpcuaPublisher=TRACE, MyFileAppender
|
|
|
log4cplus.logger.malOpcuaInstancePublisher=TRACE, MyFileAppender
|
|
|
log4cplus.logger.malOpcuaSubscriber=TRACE, MyFileAppender
|
|
|
log4cplus.logger.malOpcuaMrvSubscriber=TRACE, MyFileAppender
|
|
|
log4cplus.logger.malOpcuaMrvDataMonitor=TRACE, MyFileAppender
|
|
|
log4cplus.logger.malOpcuaDataPoller=TRACE, MyFileAppender
|
|
|
log4cplus.logger.malOpcuaDataMonitor=TRACE, MyFileAppender
|
|
|
log4cplus.logger.malOpcuaClient=TRACE, MyFileAppender
|
|
|
log4cplus.logger.malOpcuaClientEventLoop=TRACE, MyFileAppender
|
|
|
|
|
|
log4cplus.additivity.malOpcua=True
|
|
|
log4cplus.additivity.malOpcuaBasePubSub=True
|
|
|
log4cplus.additivity.malOpcuaPublisher=True
|
|
|
log4cplus.additivity.malOpcuaInstancePublisher=True
|
|
|
log4cplus.additivity.malOpcuaSubscriber=True
|
|
|
log4cplus.additivity.malOpcuaMrvSubscriber=True
|
|
|
log4cplus.additivity.malOpcuaMrvDataMonitor=True
|
|
|
log4cplus.additivity.malOpcuaDataPoller=True
|
|
|
log4cplus.additivity.malOpcuaDataMonitor=True
|
|
|
log4cplus.additivity.malOpcuaClient=True
|
|
|
log4cplus.additivity.malOpcuaClientEventLoop=True
|
|
|
|
|
|
|
|
|
log4cplus.appender.stdout=log4cplus::ConsoleAppender
|
|
|
log4cplus.appender.stdout.layout=log4cplus::PatternLayout
|
|
|
log4cplus.appender.stdout.layout.ConversionPattern=%5p [%t] (%F:%L) - %m%n
|
|
|
|
|
|
log4cplus.appender.MyFileAppender=log4cplus::RollingFileAppender
|
|
|
log4cplus.appender.MyFileAppender.File=/tmp/elt-mal-cpp-trace.log
|
|
|
log4cplus.appender.MyFileAppender.layout=log4cplus::PatternLayout
|
|
|
log4cplus.appender.MyFileAppender.layout.ConversionPattern=[%-5p][%D{%Y/%m/%d %H:%M:%S:%q}][%-l][%t] %m%n
|
|
|
```
|
|
|
|
|
|
2. Configure log4cplus prior to using CII MAL:
|
|
|
```cpp
|
|
|
|
|
|
#include <log4cplus/configurator.h>
|
|
|
|
|
|
std::string pathToLogPropFile = "...";
|
|
|
if (pathToLogPropFile.size() > 0) {
  log4cplus::PropertyConfigurator::doConfigure(pathToLogPropFile);
|
|
|
}
|
|
|
```
|
|
|
|
|
|
3. Pass the path to the log-config to the specific MAL being loaded:
|
|
|
|
|
|
MAL logging is initialized from a configuration file, the path of which is read from the MAL properties with the key mal::PROP_LOG_CONFIG_FILENAME. When loading the MAL, set mal::PROP_LOG_CONFIG_FILENAME in the MAL properties.
|
|
|
|
|
|
For example:
|
|
|
|
|
|
```cpp
|
|
|
auto zpbMal = mal::loadMal("zpb",
|
|
|
mal::Mal::Properties{{mal::PROP_LOG_CONFIG_FILENAME,"/path/to/mal-log4cplus.conf"}}
|
|
|
);
|
|
|
```
|
|
|
|
|
|
or in **python:**
|
|
|
|
|
|
```python
|
|
|
import elt.pymal as mal
|
|
|
zpbMal = mal.loadMal ("zpb", {"zpb.log4cplus.filename":"/path/to/mal-log4cplus.conf"})
|
|
|
```
|
|
|
|
|
|
python example with inline config
|
|
|
|
|
|
```python
|
|
|
import elt.pymal
|
|
|
with open ("/tmp/mal.log.conf", 'w') as f:
|
|
|
f.write('''
|
|
|
log4cplus.appender.stdout=log4cplus::ConsoleAppender
|
|
|
log4cplus.logger.malDds=TRACE, stdout
|
|
|
log4cplus.logger.malDdsBasePubSub=TRACE, stdout
|
|
|
log4cplus.logger.malDdsSubscriber=TRACE, stdout
|
|
|
''')
|
|
|
ddsMalProps = {"dds.log4cplus.filename" : "/tmp/mal.log.conf"}
|
|
|
ddsMal = elt.pymal.loadMalForUri("dds.ps://", ddsMalProps)
|
|
|
```
|
|
|
|
|
|
|
|
|
**Solution B**
|
|
|
|
|
|
With [ECII-246](https://jira.eso.org/browse/ECII-246), it is possible to change the log levels of the MAL loggers at run-time via a method call:
|
|
|
|
|
|
```cpp
|
|
|
#include <mal/util/MalLoggingUtil.hpp>
|
|
|
::elt::mal::util::logging::setLogLevelForLoggers( ... )
|
|
|
|
|
|
// set all mal loggers to INFO...
|
|
|
std::vector<elt::mal::util::logging::LoggerInfo> loggers = elt::mal::util::logging::getLoggers();
|
|
|
std::vector<elt::mal::util::logging::LoggerInfo> info;
|
|
|
for (auto const& logger : loggers) {
|
|
|
info.push_back(elt::mal::util::logging::LoggerInfo(logger.loggerName, ::log4cplus::INFO_LOG_LEVEL));
|
|
|
}
|
|
|
elt::mal::util::logging::setLogLevelForLoggers(info);
|
|
|
|
|
|
// print all mal loggers and log level
|
|
|
loggers = elt::mal::util::logging::getLoggers();
|
|
|
std::cout << "Loggers:\n";
|
|
|
for (auto const& logger : loggers) {
|
|
|
std::cout << "\t" << logger.loggerName << ": " << logger.logLevel << std::endl;
|
|
|
}
|
|
|
|
|
|
```
|
|
|
|
|
|
The allowed logger names are listed below.
|
|
|
|
|
|
**Background**
|
|
|
|
|
|
MAL does not use the Cii Logging System (CiiLogManager etc.) directly. Instead, MAL expects logging to be initialized from a configuration file. Note that the format of the log-config differs per programming language.
|
|
|
|
|
|
List of logger names available in the **Cpp MAL / Python MAL:**
|
|
|
|
|
|
Loggers for **mal-zpb**
|
|
|
|
|
|
- malZpbInstancePublisher
|
|
|
- malZpbPublisher
|
|
|
- malZpbClientAsyncImpl
|
|
|
- malZpbServer
|
|
|
- malZpb
|
|
|
- malZpbSubscriber
|
|
|
- malZpbMrvSubscriber
|
|
|
- malZpbBasePubSub
|
|
|
|
|
|
Loggers for **mal-dds**
|
|
|
|
|
|
- malDds
|
|
|
- malDdsSubscriptionManager
|
|
|
- malDdsSubscriptionReaderListener
|
|
|
- malDdsSubscriber
|
|
|
- malDdsPublisher
|
|
|
- malDdsBasePubSub
|
|
|
- malDdsMrvSubscriber
|
|
|
- malDdsRequesterImpl
|
|
|
- malDdsInstancePublisher
|
|
|
|
|
|
Loggers for **mal-opcua**
|
|
|
|
|
|
- malOpcua
|
|
|
- malOpcuaBasePubSub
|
|
|
- malOpcuaMrvSubscriber
|
|
|
- malOpcuaMrvDataMonitor
|
|
|
- malOpcuaInstancePublisher
|
|
|
- malOpcuaPublisher
|
|
|
- malOpcuaSubscriber
|
|
|
- malOpcuaDataPoller
|
|
|
- malOpcuaDataMonitor
|
|
|
- malOpcuaClient
|
|
|
- malOpcuaClientEventLoop
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### More Frames expected \[MAL ZMQ\]
|
|
|
|
|
|
**Problem**
|
|
|
|
|
|
In your application you get errors like this:
|
|
|
|
|
|
```plaintext
|
|
|
Oct 16, 2019 11:26:21 AM elt.mal.zpb.ps.ZpbSubscriber events
|
|
|
|
|
|
WARNING: Remote data entity type hash does not match (1040672065 != 1708154137).
|
|
|
|
|
|
Oct 16, 2019 11:26:21 AM elt.mal.zpb.ps.ZpbSubscriber events
|
|
|
|
|
|
WARNING: Failed to process message.
|
|
|
|
|
|
java.lang.RuntimeException: more frames expected
|
|
|
|
|
|
at elt.mal.zpb.ps.ZpbSubscriber.requireMoreFrames(ZpbSubscriber.java:82)
|
|
|
|
|
|
at elt.mal.zpb.ps.ZpbSubscriber.events(ZpbSubscriber.java:114)
|
|
|
```
|
|
|
|
|
|
**Background**
|
|
|
|
|
|
The first warning message indicates that your client has received a piece of data (= MAL entity type) on a channel that should not carry such data. This means you are running two publishers, publishing different types of data, on the same channel.
|
|
|
|
|
|
**Example of topic definition with port-clash**
|
|
|
|
|
|
```xml
|
|
|
<pubsub_topic>
|
|
|
<topic_name>sm:current_pos</topic_name>
|
|
|
<topic_type>sm_current_pos</topic_type>
|
|
|
<address_uri>zpb.ps://134.171.2.220:57110/test</address_uri>
|
|
|
<qos latency_ms="1" deadline_ms="100"/>
|
|
|
<performance rate_hz="10" latency_ms="1" synchronous="false" />
|
|
|
<mal>
|
|
|
<zpb />
|
|
|
</mal>
|
|
|
</pubsub_topic>
|
|
|
|
|
|
<pubsub_topic>
|
|
|
<topic_name>hp:global_status</topic_name>
|
|
|
<topic_type>hp_global_status</topic_type>
|
|
|
<address_uri>zpb.ps://134.171.2.220:57110/test</address_uri>
|
|
|
<qos latency_ms="10" deadline_ms="100"/>
|
|
|
<performance rate_hz="1" latency_ms="10" synchronous="false" />
|
|
|
<mal>
|
|
|
<zpb />
|
|
|
</mal>
|
|
|
</pubsub_topic>
|
|
|
```
|
|
|
|
|
|
The second warning and the error trace are just a consequence of the first warning.
|
|
|
|
|
|
**Solution**
|
|
|
|
|
|
Check your topics.xml file, and make sure each topic has its own exclusive channel (address URI).
|
|
|
|
|
|
In the above example, e.g.:
|
|
|
|
|
|
```plaintext
|
|
|
zpb.ps://134.171.2.220:57110/test1
|
|
|
```
|
|
|
|
|
|
and
|
|
|
|
|
|
```plaintext
|
|
|
zpb.ps://134.171.2.220:57110/test2
|
|
|
```
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### Address in use \[MAL ZMQ\]
|
|
|
|
|
|
**Problem**
|
|
|
|
|
|
Running your application, you see this error message:
|
|
|
|
|
|
```plaintext
|
|
|
ZMQException: Errno 48 : Address already in use
|
|
|
```
|
|
|
|
|
|
**Solution A**
|
|
|
|
|
|
Another instance of your application is still running.
|
|
|
|
|
|
**Solution B**
|
|
|
|
|
|
Another instance of your application has non-gracefully terminated without freeing the network port.
|
|
|
|
|
|
Ensure your application always performs a call to `mal.close()` on shutdown.
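A minimal sketch in Python, assuming a MAL instance loaded as in the logging article above; the cleanup pattern is the point, not the names:

```python
import elt.pymal as mal

zpbMal = mal.loadMal('zpb', {})
try:
    pass  # create publishers/subscribers and run the application here
finally:
    zpbMal.close()  # always free the network ports, even on errors
```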
|
|
|
|
|
|
**Solution C**
|
|
|
|
|
|
This could really be a usage error due to a wrong configuration.

The error message is in fact misleading.
|
|
|
|
|
|
**Example**
|
|
|
|
|
|
```plaintext
|
|
|
eltcii33 [09:38:27] eeltdev:~/mschilli > mal-esotests-testclient1 pub sAddr=zpb://eltcii28:12333/Sample tSlow=100 nSamp=100
|
|
|
pub:sys: Available MAL Flavours loaded: [dds, opc, zpb]
|
|
|
pub:config: sAddr=zpb://eltcii28:12333/Sample
|
|
|
pub:config: nSamp=100
|
|
|
pub:config: tSlow=100
|
|
|
Internal Error: org.eso.elt.mal.MalException: org.zeromq.ZMQException: Errno 48 : Address already in use
|
|
|
```
|
|
|
|
|
|
**Reason**
|
|
|
|
|
|
The above code is trying, on host eltcii33, to publish with an endpoint eltcii28.
|
|
|
|
|
|
**Fix**
|
|
|
|
|
|
On eltcii33, the endpoint must be eltcii33.
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### Scheme not supported \[MAL\]
|
|
|
|
|
|
**Problem**
|
|
|
|
|
|
Running your application, you get this error message:
|
|
|
|
|
|
```plaintext
|
|
|
elt.mal.SchemeNotSupportedException: middleware not supported
|
|
|
```
|
|
|
|
|
|
**Solution A**
|
|
|
|
|
|
In your code, you've misspelled the middleware name, e.g. "opc" instead of "opcua"
|
|
|
|
|
|
**Solution B**
|
|
|
|
|
|
The middleware is in fact supported, but failed to load.
|
|
|
|
|
|
- E.g. in DDS, you are using a QoS profile XML file which contains invalid syntax.
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### Choosing a NIC \[MAL DDS\]
|
|
|
|
|
|
**Problem**
|
|
|
|
|
|
I'm using MAL-DDS (or MAL-MUDPI), and I have two network cards (NICs) installed. MAL uses the wrong one, i.e. my network traffic goes into the "office" network, but should go into the "control" network.
|
|
|
|
|
|
**Background**
|
|
|
|
|
|
As multicast addresses are by definition not associated with hardware (i.e. they map to MAC addresses which have no corresponding Ethernet card), there is no means for the OS to resolve which NIC the IGMP subscription should be sent down. Thus the NIC must be specified, or the default is used (which is the office network).
|
|
|
|
|
|
The multicast middlewares (DDS and MUDPI) supported by the MAL allow you to specify which NIC you want to use for outgoing traffic. Thus, this boils down to configuring the middleware.
|
|
|
|
|
|
**Solution (DDS)**
|
|
|
|
|
|
Get the XML file shown in Solution #1 on this page: <https://community.rti.com/howto/control-or-restrict-network-interfaces-nics-used-discovery-and-data-distribution>, and continue with this article: [KB: Configuring DDS](https://gitlab.eso.org/ecs/eltsw-docs/-/wikis/KnowledgeBase/CII#configuring-dds-mal-dds)
|
|
|
|
|
|
**Solution (MUDPI)**
|
|
|
|
|
|
Set the "mudpi.ps.interfaceName" mal property when creating the MAL:
|
|
|
|
|
|
```cpp
|
|
|
auto &factory = ::elt::mal::loadMalForUri("mudpi.ps://",
|
|
|
{ {"mudpi.ps.interfaceName","192.168.100.165"} } );
|
|
|
```
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### Configuring DDS \[MAL DDS\]
|
|
|
|
|
|
**Problem**
|
|
|
|
|
|
Some of the middlewares usable through MAL offer a variety of configuration options.
|
|
|
|
|
|
This article explains how to define and use configuration for the DDS middleware.
|
|
|
|
|
|
**Solution**
|
|
|
|
|
|
To configure DDS, 3 things are necessary:
|
|
|
|
|
|
- put the desired config into an external XML file (see Fast DDS documentation/examples)
|
|
|
- set the FASTRTPS_DEFAULT_PROFILES_FILE (Connext: NDDS_QOS_PROFILES) environment variable, so DDS finds the XML file (alternatively, the environment variable need not be set: the path can be passed directly in the MAL properties, `malprops` in the example below)
|
|
|
- pass 2 properties to the MAL factory, so DDS finds the right profile in the XML file
|
|
|
|
|
|
**Example: How to restrict DDS traffic to your own host**
|
|
|
|
|
|
XML file (see <https://fast-dds.docs.eprosima.com/en/latest/fastdds/discovery/general_disc_settings.html>)
|
|
|
|
|
|
```xml
|
|
|
<?xml version="1.0" encoding="UTF-8" ?>
|
|
|
<dds>
|
|
|
<profiles xmlns="http://www.eprosima.com/XMLSchemas/fastRTPS_Profiles">
|
|
|
<participant profile_name="MyApp_Default">
|
|
|
<rtps>
|
|
|
<builtin>
|
|
|
<discovery_config>
|
|
|
<ignoreParticipantFlags>FILTER_DIFFERENT_HOST</ignoreParticipantFlags>
|
|
|
</discovery_config>
|
|
|
</builtin>
|
|
|
</rtps>
|
|
|
</participant>
|
|
|
</profiles>
|
|
|
</dds>
|
|
|
```
|
|
|
|
|
|
Code (C++)
|
|
|
|
|
|
```cpp
|
|
|
// Create DDS-MAL with custom mal properties
|
|
|
|
|
|
// With Fast DDS, profile.library prop and env var *must* have same value!
|
|
|
// Here the env var precedes, but you could do the inverse (using setenv).
|
|
|
char* env_var = std::getenv("FASTRTPS_DEFAULT_PROFILES_FILE");
|
|
|
const ::elt::mal::Mal::Properties malprops { {"dds.qos.profile.library", env_var},
|
|
|
{"dds.qos.profile.name", "MyApp_Default"} };
|
|
|
auto &factory = ::elt::mal::loadMalForUri ("dds.ps://", malprops);
|
|
|
|
|
|
// Publishers created from here on will have the setting applied
|
|
|
auto malpub = factory.getPublisher<AltAz> (pubsuburi, qos, {});
|
|
|
```
|
|
|
|
|
|
Before running your code:
|
|
|
|
|
|
```plaintext
|
|
|
export FASTRTPS_DEFAULT_PROFILES_FILE=<path of XML file>
|
|
|
```
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
### Using DDS Monitor \[MAL DDS\]
|
|
|
|
|
|
**Problem**
|
|
|
|
|
|
How do I make the DDS monitor _fastdds_monitor_ show all DDS Participants (peers) and their statistics for the selected domain?
|
|
|
|
|
|
**Solution**
|
|
|
|
|
|
Important: to enable publishing of this meta data, it must be enabled either via an environment variable in the shell that the DDS application runs in (i.e. in the publishers and subscribers, not in the shell where the fastdds_monitor is running), or (preferably) set in the XML QoS file.
|
|
|
|
|
|
Using the environment variable:
|
|
|
|
|
|
```plaintext
|
|
|
export FASTDDS_STATISTICS="HISTORY_LATENCY_TOPIC;NETWORK_LATENCY_TOPIC;PUBLICATION_THROUGHPUT_TOPIC;\
|
|
|
RTPS_SENT_TOPIC;RTPS_LOST_TOPIC;HEARTBEAT_COUNT_TOPIC;ACKNACK_COUNT_TOPIC;NACKFRAG_COUNT_TOPIC;\
|
|
|
GAP_COUNT_TOPIC;DATA_COUNT_TOPIC;RESENT_DATAS_TOPIC;SAMPLE_DATAS_TOPIC;PDP_PACKETS_TOPIC;EDP_PACKETS_TOPIC;\
|
|
|
DISCOVERY_TOPIC;PHYSICAL_DATA_TOPIC"
|
|
|
```
|
|
|
|
|
|
Setting in the QoS XML file:
|
|
|
```xml
|
|
|
<participant profile_name="MyApp_Default_Participant">
|
|
|
<rtps>
|
|
|
<propertiesPolicy>
|
|
|
<properties>
|
|
|
<!-- Activate Fast DDS Statistics Module -->
|
|
|
<property>
|
|
|
<name>fastdds.statistics</name>
|
|
|
<value>HISTORY_LATENCY_TOPIC;NETWORK_LATENCY_TOPIC;PUBLICATION_THROUGHPUT_TOPIC;RTPS_SENT_TOPIC;RTPS_LOST_TOPIC;HEARTBEAT_COUNT_TOPIC;ACKNACK_COUNT_TOPIC;NACKFRAG_COUNT_TOPIC;GAP_COUNT_TOPIC;DATA_COUNT_TOPIC;RESENT_DATAS_TOPIC;SAMPLE_DATAS_TOPIC;PDP_PACKETS_TOPIC;EDP_PACKETS_TOPIC;DISCOVERY_TOPIC;PHYSICAL_DATA_TOPIC</value>
|
|
|
</property>
|
|
|
</properties>
|
|
|
</propertiesPolicy>
</rtps>
</participant>
|
|
|
|
|
|
```
|
|
|
|
|
|
Once this is done, the statistics are visible. Note that not all statistics (e.g. the QoS of a participant) are correctly displayed by the DDS Monitor; this is slowly being improved with each release.
|
|
|
|
|
|
We have already given some comments/feedback to eProsima, and welcome any feedback from your tests as well.
|
|
|
|
|
|
By default all MAL DDS peers have the name “RTPSParticipant”.
|
|
|
There are two ways to set a custom name:
|
|
|
|
|
|
A participant name can be assigned in the XML QoS file as follows:
|
|
|
|
|
|
```xml
|
|
|
<participant profile_name="MyApp_Default_Participant">
|
|
|
<rtps>
|
|
|
<name>MyApp_Participant</name>
|
|
|
</rtps>
</participant>
|
|
|
```
|
|
|
|
|
|
However this means all participants sharing this profile have the same name.
|
|
|
|
|
|
Alternatively, use the MAL property `dds.qos.participant.name`, which will set and/or override any participant name read from the QoS file. For example, it may be used as follows:
|
|
|
|
|
|
```cpp
|
|
|
const ::elt::mal::Mal::Properties pubprops {
|
|
|
{"dds.qos.profile.library", env_var},
|
|
|
{"dds.qos.profile.name.publisher", "MyApp_Default_Publisher"},
|
|
|
{"dds.qos.profile.name.writer", "MyApp_Default_Writer"},
|
|
|
{"dds.qos.profile.name.topic", "MyApp_Default_Topic"},
|
|
|
{"dds.qos.participant.name", "icd-demo-publisher"},
|
|
|
{"dds.qos.profile.name.participant", "MyApp_Default_Participant"}
|
|
|
};
|
|
|
```
|
|
|
|
|
|
|
|
|
***Example***
|
|
|
build and install icd-demo:
|
|
|
- git clone https://gitlab.eso.org/cii/mal/icd-demo.git
|
|
|
- cd icd-demo/
|
|
|
- waf configure build install
|
|
|
|
|
|
- set the environment variable above and run the demo publisher and subscriber:
|
|
|
- mal-api-demo-publisher --uri "dds.ps:///m1"
|
|
|
- mal-api-demo-subscriber --uri "dds.ps:///m1"
|
|
|
- finally, run fastdds_monitor to observe the statistics.
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### Topic History a.k.a. late joiners \[MAL DDS\]
|
|
|
|
|
|
Many PubSub topics need to cater for late-joining subscribers. That is, subscribers typically need to receive the last value published, and then all values published from the time of joining on.
|
|
|
|
|
|
There are a few key aspects of DDS that must be configured to enable this:
|
|
|
1. The topic must publish a type that has a key. History is stored in the publisher on an instance (i.e. key value) basis (i.e. one sample per instance). In ICD XML, the type struct must contain a member with key="true" set, e.g.:
|
|
|
|
|
|
```xml
|
|
|
<struct name="Sample" trace="true">
|
|
|
<member name="daqId" type="int64_t" key="true" />
|
|
|
<member name="value" type="double" />
|
|
|
</struct>
|
|
|
```
|
|
|
2. The topic QoS must be set to have historyQos set to KEEP_LAST, with depth 1. For example:
|
|
|
|
|
|
```xml
|
|
|
<topic profile_name="MyApp_Default_Topic">
|
|
|
<historyQos>
|
|
|
<kind>KEEP_LAST</kind>
|
|
|
<depth>1</depth>
|
|
|
</historyQos>
|
|
|
</topic>
|
|
|
```
|
|
|
|
|
|
3. Both data_writer and data_reader QoS must be set to reliability RELIABLE. Reliable communications is required to receive historical data.
|
|
|
|
|
|
```xml
|
|
|
<reliability>
|
|
|
<kind>RELIABLE</kind>
|
|
|
<max_blocking_time>
|
|
|
<sec>1</sec>
|
|
|
</max_blocking_time>
|
|
|
</reliability>
|
|
|
```
|
|
|
|
|
|
4. The data_reader QoS must have durability set to TRANSIENT_LOCAL. This means it will request missed data samples, but not beyond the life of the system (i.e. no persistence to disk). Without this setting the subscriber will not inquire about missed data.
|
|
|
|
|
|
```xml
|
|
|
<durability>
|
|
|
<kind>TRANSIENT_LOCAL</kind>
|
|
|
</durability>
|
|
|
```
|
|
|
|
|
|
With the above settings in place, late-joining subscribers should receive the last data published for each instance, for each topic, from connected publishers.

More details on Fast DDS QoS: <https://fast-dds.docs.eprosima.com/en/latest/fastdds/api_reference/dds_pim/core/policy/historyqospolicykind.html>
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
### DDS SHM Shared Memory Startup Errors \[MAL DDS\]
|
|
|
|
|
|
**Problem**
|
|
|
|
|
|
On startup of a DDS application, an error about Shared Memory (SHM) is displayed, for example: `RTPS SHM: "port marked as not ok"`
|
|
|
|
|
|
**Solution**
|
|
|
|
|
|
The SHM transport is one of the default transports of a DDS application and is used to communicate with peers on the same host. The relevant files created by DDS are visible in `/dev/shm/*fast*`.

If a DDS application does not exit cleanly, it may leave SHM files behind, possibly leading to errors when the application restarts, and in any case polluting the /dev/shm/ folder.
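To check for leftover segments:

```plaintext
$ ls /dev/shm/*fast*
```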
|
|
|
|
|
|
To clean up the SHM files used by DDS the following command is provided:
|
|
|
```plaintext
|
|
|
fastdds shm clean
|
|
|
```
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
### Discovery over multiple NICs \[MAL DDS\]
|
|
|
|
|
|
**Problem**
|
|
|
|
|
|
How do DataReaders connect over multiple NICs?
|
|
|
|
|
|
**Explanation**
|
|
|
Let's say we have 3 hosts:
|
|
|
- A: one NIC on 192.168.1.10 and another NIC on 10.10.10.10
|
|
|
- B: one NIC on 192.168.1.11
|
|
|
- C: one NIC on 10.10.10.12
|
|
|
|
|
|
On host A we have a participant with FASTDDS_STATISTICS="PUBLICATION_THROUGHPUT_TOPIC" and a DataWriter on topic _important_high_frequency_data_.
|
|
|
|
|
|
On host B we have a participant with FASTDDS_STATISTICS="SUBSCRIPTION_THROUGHPUT_TOPIC" and a DataReader on topic _important_high_frequency_data_
|
|
|
|
|
|
On host C we have the Fast DDS monitor.
|
|
|
|
|
|
Participants B and C will **not** discover each other, since they are on different LANs.
|
|
|
|
|
|
Participant A discovery will announce:
|
|
|
"I have a DataWriter for topic _important_high_frequency_data_ communicating through any of 192.168.1.10, 10.10.10.10.
|
|
|
I also have a DataWriter for topic _fastdds_statistics_publication_throughput_ communicating through any of 192.168.1.10, 10.10.10.10."
|
|
|
|
|
|
Participant B discovery will announce:
|
|
|
"I have a DataReader for topic _important_high_frequency_data_ listening on 192.168.1.11.
|
|
|
I also have a DataWriter for topic _fastdds_statistics_subscription_throughput_ communicating through any of 192.168.1.11."
|
|
|
|
|
|
The DataWriter on participant A will then send data for topic _important_high_frequency_data_ to the DataReader listening on 192.168.1.11 (through NIC 192.168.1.10).
|
|
|
|
|
|
Participant C discovery will announce:
|
|
|
"I have a DataReader for topic _fastdds_statistics_publication_throughput_ listening on 10.10.10.12.
|
|
|
I also have a DataReader for topic _fastdds_statistics_subscription_throughput_ listening on 10.10.10.12."
|
|
|
|
|
|
The statistics DataWriter on participant A will then send data for topic _fastdds_statistics_publication_throughput_ to 10.10.10.12 (through NIC 10.10.10.10) and be visible on DDS Monitor.
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### Summary of OPC/UA MAL in C++
|
|
|
|
|
|
This article covers the integration of OPC/UA in CII MAL, specifically the OPC/UA Data Access and Subscription profiles. OPC/UA method invocation is also supported in CII MAL but is not described in this article; likewise, details of the Python (and Java) support are not provided. Only C++ is considered here.
|
|
|
|
|
|
OPC/UA communication middleware is exposed in CII MAL as either Publish/Subscribe or Request/Reply APIs.
|
|
|
|
|
|
The XML ICD definition of types in CII is used to map sets of data points together that are read/written as a group.
|
|
|
|
|
|
Each attribute in the defined type is connected to a corresponding data point in the OPC/UA data space via a URI. Thus the CII URI for a complex type will contain specific addresses of multiple nodes in the OPC/UA data space.
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
#### Pub/Sub API for OPC/UA Clients:
|
|
|
|
|
|
The CII MAL Pub/Sub API utilizes OPC/UA Data Access reads and writes, as well as OPC/UA subscriptions. A Publisher will directly trigger an OPC DA write, while a Subscriber will work in one of two ways, depending on the type associated with the subscriber:
|
|
|
|
|
|
- If the subscriber's CII URI contains only a single node (i.e. the XML ICD type contains only a single attribute), the Subscriber will create an OPC/UA subscription on that data point. The subscription will trigger notification of updates to the data point node, which will then be queued for notification via the CII Subscriber API.
|
|
|
- If the Subscriber is using subscription, the opc.ps.outstandingPublishRequests property should not be zero (e.g. set it to 5), see the example code below. The reason is that the publish queue is used to store and send subscription notification events, and if the queue is small the event notifications may simply be dropped.
|
|
|
- If the subscriber's CII URI contains multiple nodes (i.e. the XML ICD type contains multiple attributes), the Subscriber launches a thread to perform periodic polling of data from the OPC/UA server. The rate is based on the properties passed when creating the subscriber, e.g.:
|
|
|
|
|
|
```cpp
|
|
|
try {
  // The properties tune the OPC/UA transport; opc.ps.pollingPeriodMs applies to
  // multi-node (polling) subscribers, opc.ps.outstandingPublishRequests to
  // single-node (subscription) subscribers.
  subscriber = factory.getSubscriber<T>(opcua_uri, ::elt::mal::ps::qos::QoS::DEFAULT,
      { {"opc.ps.outstandingPublishRequests", "5"},
        {"opc.asyncLoopExecutionPeriodMs", "50"},
        {"opc.asyncCallSubmitTimeoutMs", "1000"},
        {"opc.ps.pollingPeriodMs", "20000"},
        {"opc.asyncCallRetryPeriodMs", "250"} });
} catch (...) {
  throw;  // rethrow; handle or log as appropriate for your application
}
|
|
|
```
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
#### Request/Reply API for OPC/UA Clients:
|
|
|
|
|
|
As OPC/UA Data Access read and write essentially follow a synchronous request/reply pattern, CII MAL also provides this interface for OPC/UA clients.
|
|
|
|
|
|
The ICD is termed "virtual" in CII nomenclature as it does not require definition as an XML ICD using the service syntax; rather, the same types defined for the Pub/Sub API may be used with a CII MAL OPC/UA Request/Reply.
|
|
|
|
|
|
This approach means OPC/UA Data Access read (and write) are synchronous, and may be called as needed by the application.
|
|
|
|
|
|
```cpp
|
|
|
namespace mal {
namespace rr {
namespace da {

class DataAccess : public ::elt::mal::rr::RrEntity {
 public:
  [...]
  template <typename T>
  void read(::elt::mal::ps::DataEntity<T>& value) {
    readUnsafe(&value);
  }

  template <typename T>
  void write(const ::elt::mal::ps::DataEntity<T>& value) {
    writeUnsafe(&value);
  }
  [...]
};

}  // namespace da
}  // namespace rr
}  // namespace mal
|
|
|
```
|
|
|
|
|
|
A test application showing its use is here:
|
|
|
|
|
|
<https://gitlab.eso.org/cosylab/elt-cii/mal/mal-test/-/blob/develop/cpp/mal-test-performance/opcua/mal-opcua-da-speed/src/common.cpp>
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### Request-Reply Python Snippets \[MAL Python\]
|
|
|
|
|
|
To interact with any MAL-based remote service, you can use the Python shell to connect to the remote object, invoke its methods, and process the return values.
|
|
|
|
|
|
The whole lifecycle (including clean-up) looks like this:
|
|
|
|
|
|
```python
|
|
|
# connect
|
|
|
import elt.pymal
|
|
|
malfact = elt.pymal.loadMalForUri("zpb.rr://", {})
|
|
|
import ModStdif.Stdif.StdCmds
|
|
|
client = malfact.getClient("zpb.rr://127.0.0.1:12081/StdCmds",
|
|
|
ModStdif.Stdif.StdCmds.StdCmdsSync,
|
|
|
elt.pymal.rr.qos.DEFAULT, {})
|
|
|
# interact
|
|
|
print (client.GetState())
|
|
|
|
|
|
# disconnect
|
|
|
malfact.unregisterMal ("zpb")
|
|
|
```
|
|
|
|
|
|
**Side-note**: For what it's worth, this can be done as a one-liner:
|
|
|
```plaintext
|
|
|
python <<< 'import elt.pymal ; malfact = elt.pymal.loadMalForUri("zpb.rr://", {}) ; import ModStdif.Stdif.StdCmds ; client = malfact.getClient("zpb.rr://127.0.0.1:12081/StdCmds", ModStdif.Stdif.StdCmds.StdCmdsSync, elt.pymal.rr.qos.DEFAULT, {}) ; print (client.GetState()) ; malfact.unregisterMal ("zpb") '
|
|
|
```
|
|
|
... which would be equivalent to this msgsend call:
|
|
|
|
|
|
```plaintext
|
|
|
msgsend -t 60 -u zpb.rr://127.0.0.1:12081/StdCmds ::stdif::StdCmds::GetState
|
|
|
```
|
|
|
|
|
|
|
|
|
### ------------------------------------------------------
|
|
|
|
|
|
### \[OLDB\]
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### Less or More Logs \[OLDB Log\]
|
|
|
|
|
|
**Problem**
|
|
|
|
|
|
You are using the OLDB API, and you are getting more logs than you asked for. For example, simply testing for the existence of a datapoint: `[ERROR] Path not found config exception occurred: Path oldb/datapoints/myroot/somedp, version 1 not found`
|
|
|
|
|
|
or, the OLDB API seems to be misbehaving, and you want more logs.
|
|
|
|
|
|
**Solution**
|
|
|
|
|
|
Please read the article "[Adjust CII Log Levels \[Log\]](#user-content-adjust-cii-log-levels-log)" in the Log section of this Knowledge Base. The relevant logger names for the OLDB API are:
|
|
|
|
|
|
```plaintext
|
|
|
CiiOldb
|
|
|
CiiOldbDataPoint
|
|
|
CiiOldbDirectoryTreeProvider
|
|
|
CiiOldbFactory
|
|
|
CiiOldbRedisDataPointProvider
|
|
|
CiiOldbRemoteDataPointProvider
|
|
|
CiiOldbRemoteFileProvider
|
|
|
```
|
|
|
|
|
|
Example: to turn off ERROR logs for testing the existence of non-existing datapoints, use:
|
|
|
|
|
|
```plaintext
|
|
|
log4cplus.logger.CiiOldbRemoteDataPointProvider=FATAL
|
|
|
```
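In context, this line goes into your application's log config file alongside the root logger and appender definitions. A minimal sketch, assembled from the log config shown in the "Adjust CII Log Levels" article of this Knowledge Base:

```plaintext
log4cplus.rootLogger=INFO, STDOUT
log4cplus.appender.STDOUT=log4cplus::ConsoleAppender
log4cplus.appender.STDOUT.layout=elt::log::layout::CiiSimpleLayout

# silence the ERROR logs from datapoint existence tests
log4cplus.logger.CiiOldbRemoteDataPointProvider=FATAL
```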
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### Cannot create Datapoints \[OLDB\]
|
|
|
|
|
|
**Problem**
|
|
|
|
|
|
Trying to create a datapoint, you get errors such as:
|
|
|
|
|
|
```plaintext
|
|
|
Cannot save to zpb.rr://ciiconfservicehost:9116/configuration/service/clientApi
|
|
|
|
|
|
[ERROR][CiiOldb] Unknown error occurred ::elt::error::icd::CiiSerializableException
|
|
|
```
|
|
|
|
|
|
This is often caused by low disk space (95% or more used) on the partition holding the oldb permanent store; check with:
|
|
|
|
|
|
```plaintext
|
|
|
df -h /var/lib/elasticsearch
|
|
|
```
|
|
|
|
|
|
The disk space taken by the permanent store database itself is in the vast majority of cases dominated by the log records stored in it. The solutions aim at decreasing this space.
|
|
|
|
|
|
**Solution 1**
|
|
|
|
|
|
```plaintext
|
|
|
# Remove old log files:
|
|
|
find /var/log/elasticsearch -type f -mtime +30 -delete
|
|
|
|
|
|
# Put database back into read-write mode:
|
|
|
curl -X PUT -H "Content-Type: application/json" localhost:9200/_all/_settings -d '
|
|
|
{ "index.blocks.read_only_allow_delete": null }'
|
|
|
|
|
|
# Remove old log records:
|
|
|
curl -X POST "localhost:9200/cii_log_default_index/_delete_by_query?pretty" -H 'Content-Type: application/json' -d '
|
|
|
{ "query": { "range" : { "@timestamp" : { "lte": "now-30d/d" } } } }'
|
|
|
|
|
|
# Note: Removal is a background operation, and can take several minutes until it shows an effect.
|
|
|
# Run the below command repeatedly to monitor the removal, the docs.count should be decreasing.
|
|
|
|
|
|
# See number of logs stored in permanent store ("docs.count"):
|
|
|
curl http://localhost:9200/_cat/indices/cii_log_default_index?v\&s=store.size
|
|
|
```
|
|
|
|
|
|
**Solution 2 (brute-force)**
|
|
|
|
|
|
If you could not bring your disk usage below 95%, you can also remove all logs from the permanent store. In this case, you may also want to prevent permanent log storage in the future.
|
|
|
|
|
|
```plaintext
|
|
|
# Prevent storing logs in the permanent store:
|
|
|
sudo cii-services stop log
|
|
|
|
|
|
# Remove all log records:
|
|
|
curl -X DELETE "localhost:9200/cii_log_default_index?pretty"
|
|
|
|
|
|
# Put database back into read-write mode:
|
|
|
curl -X PUT -H "Content-Type: application/json" localhost:9200/_all/_settings -d '
|
|
|
{ "index.blocks.read_only_allow_delete": null }'
|
|
|
|
|
|
# Recreate empty log index
|
|
|
curl -X PUT "localhost:9200/cii_log_default_index?pretty"
|
|
|
```
|
|
|
|
|
|
After that you can restart logging:
|
|
|
|
|
|
```plaintext
|
|
|
sudo cii-services start log
|
|
|
```
|
|
|
|
|
|
**Background**
|
|
|
|
|
|
If disk usage is 95% or more, elasticsearch goes into read-only mode, and creating new datapoints is no longer possible. To remove old content from the database, it is first necessary to create some free space on the disk (since the database needs space to perform deletion operations), then unlock the database, and then remove unnecessary old content from it.
|
|
|
|
|
|
To check the read-only status:
|
|
|
```plaintext
|
|
|
# Check read-only status:
|
|
|
# If the output contains any "true" values, you are facing the problem.
|
|
|
curl -XGET -H "Content-Type: application/json" localhost:9200/_all/_settings/ | jq -r '.[][][]["blocks"]'
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### Exception while connecting to OLDB service
|
|
|
|
|
|
When starting an application using the OLDB API, the following exception is received:
|
|
|
|
|
|
```plaintext
|
|
|
<date/time>, ERROR, CiiOldbFactory/140635105143296, Unexpected config exception occurred while retrieving configuration for cii.config://remote/oldb/configurations/oldbClientConfig What:Path oldb/configurations/oldbClientConfig, version -1 not found
|
|
|
terminate called after throwing an instance of 'elt::oldb::CiiOldbException'
|
|
|
what(): Unexpected config exception occurred while retrieving configuration for cii.config://remote/oldb/configurations/oldbClientConfig What:Path oldb/configurations/oldbClientConfig, version -1 not found
|
|
|
```
|
|
|
|
|
|
**Solution** This can indicate that elasticsearch on the config/oldb server is not running, or has crashed. Use the cii-services utility to check the status on the server where the (CII-internal) config service is running.
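For example (run on the server hosting the services):

```plaintext
$ cii-services info
$ cii-services status
```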
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### Connecting to OLDB takes long, then fails \[cpp OLDB\]
|
|
|
|
|
|
**Question** My application blocks a long time on first OLDB access, and eventually fails with a timeout.
|
|
|
|
|
|
**Answer**
|
|
|
|
|
|
Reconfigure the communication timeout (default: 60 seconds), either:
|
|
|
|
|
|
a) through an environment variable and a properties file
|
|
|
|
|
|
```plaintext
|
|
|
$ cat <<EOF >/tmp/cii_client.ini
|
|
|
connection_timeout = 5
|
|
|
EOF
|
|
|
$ export CONFIG_CLIENT_INI=/tmp/cii_client.ini
|
|
|
```
|
|
|
|
|
|
b) programmatically (available since CII 2.0/DevEnv 3.9)
|
|
|
|
|
|
C++
|
|
|
```cpp
|
|
|
CiiClientConfiguration config_client_ini = { .connection_timeout = 5, };
|
|
|
elt::config::CiiConfigClient::SetDevClientConfig (config_client_ini);
|
|
|
```
|
|
|
|
|
|
Python
|
|
|
```python
|
|
|
import elt.config
|
|
|
config_client_ini = elt.config.CiiClientConfiguration()
|
|
|
config_client_ini.connection_timeout=5
|
|
|
elt.config.CiiConfigClient.set_dev_client_config(config_client_ini)
|
|
|
```
|
|
|
|
|
|
|
|
|
**Background**
|
|
|
|
|
|
The actual stalling comes from a failed MAL communication with the CII Internal Configuration System, which likely is not running. Setting the timeout for the CiiConfigClient is therefore the right fix. Note that the properties file (aka. "deployment config") takes precedence and will, where they overlap, overrule the programmatic (aka. "developer config") settings.
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### Mock OLDB for unit tests \[OLDB cpp python\]
|
|
|
|
|
|
**Question**
|
|
|
|
|
|
Is there already a faked OLDB that I can use in my unit tests in cpp?
|
|
|
|
|
|
**Answer**
|
|
|
|
|
|
(by D. Kumar)
|
|
|
|
|
|
You can create an in-memory OLDB by providing a cached config oldb implementation and using the local file system for blob data.
|
|
|
|
|
|
The oldb-client cpp module provides:
|
|
|
|
|
|
- a pure virtual (`= 0`) interface elt::oldb::CiiOldbDataPointProvider<sup>\[1\]</sup>, and two implementations:
  - an in-memory data point provider storing data points in memory
    (this is an empty implementation which provides a minimal operational fake oldb)
  - a redis data point provider storing data points in redis.
- a remote filesystem interface elt::oldb::impl::ciiOldbRemoteFileProvider.hpp<sup>\[2\]</sup>, and two implementations:
  - an S3 implementation
  - a local file system implementation: ciiOldbLocalFileProvider<sup>\[3\]</sup> _\[Note: not before DevEnv 3.4\]_
|
|
|
|
|
|
Here are complete examples of unit tests showing the main use cases how to use oldb (with subscriptions) and metadata creation:
|
|
|
|
|
|
<https://gitlab.eso.org/cii/srv/cii-srv/-/blob/master/oldb-client/cpp/oldb/test/oldbInMemoryTest.cpp>
|
|
|
|
|
|
The same exists in python:
|
|
|
|
|
|
Example in <https://gitlab.eso.org/ahoffsta/cii-srv/-/blob/oldb-in-memory-missing-python-binding/oldb-client/python/oldb/test/oldbInMemoryTest.py>
|
|
|
|
|
|
References
|
|
|
|
|
|
\[1\] <https://gitlab.eso.org/cii/srv/cii-srv/-/blob/master/oldb-client/cpp/oldb/src/include/ciiOldbDataPointProvider.hpp>
|
|
|
|
|
|
\[2\] <https://gitlab.eso.org/cii/srv/cii-srv/-/blob/master/oldb-client/cpp/oldb/src/include/provider/ciiOldbRemoteFileProvider.hpp>
|
|
|
|
|
|
\[3\] <https://gitlab.eso.org/cii/srv/cii-srv/-/blob/master/oldb-client/cpp/oldb/src/provider/ciiOldbLocalFileProvider.hpp> _\[Note: not before DevEnv 3.4\]_
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### access_key empty (DevEnv 3.2.0) \[OLDB\]
|
|
|
|
|
|
**Problem**
|
|
|
|
|
|
Trying to use the OLDB on DevEnv 3.2.0, I'm getting this error:
|
|
|
|
|
|
Unexpected exception occurred. What:Configuration invalid: access_key empty
|
|
|
|
|
|
**Solution**
|
|
|
|
|
|
Run the following commands (you will be asked for the root password):
|
|
|
|
|
|
```plaintext
|
|
|
wget -q www.eso.org/~mschilli/download/cii/postinstall/cii-postinstall-20210610
|
|
|
|
|
|
cii-services stop config
|
|
|
|
|
|
su -c "bash cii-postinstall-20210610 schemas"
|
|
|
|
|
|
# You should see the following output:
|
|
|
|
|
|
Password:
|
|
|
CII PostInstall (20210610)
|
|
|
schemas: applying fix ECII397
|
|
|
/home/eltdev/
|
|
|
schemas: populating elasticsearch
|
|
|
schemas: skipping telemetry
|
|
|
schemas: skipping alarms
|
|
|
|
|
|
cii-services start config
|
|
|
|
|
|
```
|
|
|
|
|
|
**Background**
|
|
|
|
|
|
The OLDB settings coming with 3.2.0 are buggy.
|
|
|
|
|
|
The CII post-install procedure is able to hotfix the settings (ECII397).
|
|
|
|
|
|
The problem will be fixed in DevEnv 3.4.
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### Datapoint already exists \[OLDB\]
|
|
|
|
|
|
**Problem**
|
|
|
|
|
|
My application tries to create an OLDB datapoint.
|
|
|
|
|
|
This fails because the datapoint "already exists":
|
|
|
|
|
|
```plaintext
|
|
|
ERROR, CiiOldbRedisDataPointProvider/140709706681216, Data point uri: cii.oldb:/tcs/hb/tempser3 in Redis already exists.
|
|
|
```
|
|
|
|
|
|
In response, my application skips the creation step, and wants to use the reportedly existing datapoint.
|
|
|
|
|
|
However, when doing this, I get the error "datapoint doesn't exist":
|
|
|
|
|
|
```plaintext
|
|
|
Dynamic exception type: elt::oldb::CiiOldbDpUndefinedException
|
|
|
std::exception::what: The data point cii.oldb:///tcs/hb/tempser3 with this name does not exist.
|
|
|
```
|
|
|
|
|
|
Likewise, when I run the oldb-gui database browser, it does not show this data point in the OLDB.
|
|
|
|
|
|
**Variant 2 of the Problem**
|
|
|
|
|
|
I try to access an OLDB datapoint, and I see two errors like this:
|
|
|
|
|
|
```plaintext
|
|
|
Target configuration does not exist: Failed to retrieve configuration from elastic search: Configuration
|
|
|
[…]
|
|
|
elt.oldb.exceptions.CiiOldbDpExistsException: Data point cii.oldb:/alarm/alarm/device/motor/input_int_dp_alarm already exisits.
|
|
|
```
|
|
|
|
|
|
Go directly to Solution 2 below.
|
|
|
|
|
|
**Variant 3 of the Problem**
|
|
|
|
|
|
I try to delete an OLDB datapoint and I see an error like this:
|
|
|
|
|
|
```plaintext
|
|
|
CiiOldbPyB.CiiOldbException: De-serialization error:sizeof(T)\*count is greater then remaining
|
|
|
```
|
|
|
|
|
|
Go directly to Solution 2 below.
|
|
|
|
|
|
**Background**
|
|
|
|
|
|
The two errors contradict each other.
|
|
|
|
|
|
Datapoints are stored in two databases: a document-database (permanent store) for its metadata, and a key-value-database (volatile store) for its current value. The above symptoms indicate that the two databases are out-of-sync, meaning the datapoint exists only "half".
|
|
|
|
|
|
**Solution 1**
|
|
|
|
|
|
With DevEnv 4, which contains [ECII-500](https://jira.eso.org/browse/ECII-500), you can probably delete the datapoint to clean up the situation:
|
|
|
|
|
|
```python
|
|
|
#!/usr/bin/env python
|
|
|
import elt.config
|
|
|
import elt.oldb
|
|
|
|
|
|
oldb_client = elt.oldb.CiiOldbFactory.get_instance()
|
|
|
elt.oldb.CiiOldbGlobal.set_write_enabled(True)
|
|
|
|
|
|
uri = elt.config.Uri("cii.oldb:/tcs/hb/tempser3")
|
|
|
oldb_client.delete_data_point(uri)
|
|
|
|
|
|
```
|
|
|
|
|
|
**Solution 2**
|
|
|
|
|
|
If the above didn't help, find out which "half" of the datapoint exists.
|
|
|
|
|
|
1. The current value exists, and the metadata is missing. This is the case when upgrading DevEnv/CII without deleting the Redis cache.
|
|
|
2. The metadata exists, and the current value is missing
|
|
|
|
|
|
Define the following shell functions (note: not applicable to redis-clusters):
|
|
|
|
|
|
```plaintext
|
|
|
function oldb_ela_list { curl -s -X GET localhost:9200/configuration_instance/_search?size=2000\&q=data.uri.value:\"$1\" | jq -r '.hits.hits[]._id' | sort ; }
|
|
|
|
|
|
function oldb_ela_del { curl -s -X POST localhost:9200/configuration_instance/_delete_by_query?q=data.uri.value:\"$1\" | jq -r '.deleted' ; }
|
|
|
|
|
|
function oldb_red_list { redis-cli --scan --pattern "*$1*" ; }
|
|
|
|
|
|
function oldb_red_del { redis-cli --scan --pattern "*$1*" | xargs redis-cli del ; }
|
|
|
```
|
|
|
|
|
|
Then check if the problematic key is in the volatile store:
|
|
|
|
|
|
```plaintext
|
|
|
# Search for path component of dp-uri (here: "device")
|
|
|
$ oldb_red_list device
|
|
|
... output will be e.g.:
|
|
|
/sampleroot/child/device/doubledp444
|
|
|
/sampleroot/child/device/doubledp445
|
|
|
/sampleroot/child/device/doubledp111
|
|
|
/sampleroot/child/device/doubledp2222
|
|
|
|
|
|
# If the problematic key is in the list, delete it:
|
|
|
$ oldb_red_del device/doubledp444
|
|
|
```
|
|
|
|
|
|
Otherwise, check if the problematic key is in the permanent store:
|
|
|
|
|
|
```plaintext
|
|
|
# Search for path component of dp-uri (whole-word search, e.g. "dev" would not match)
|
|
|
$ oldb_ela_list device
|
|
|
... output e.g.:
|
|
|
oldb___datapoints___sampleroot___child___device___doubledp446___1
|
|
|
|
|
|
# Delete the offending metadata
|
|
|
$ oldb_ela_del doubledp446
|
|
|
|
|
|
# After deletion, restart the internal config server
|
|
|
$ sudo cii-services stop config ; sudo cii-services start config
|
|
|
```
|
|
|
|
|
|
**Solution 3**
|
|
|
|
|
|
If none of the above helped, another possibility is to clean up the metadata.
|
|
|
|
|
|
WARNING: This is an invasive operation. It deletes all datapoints in the OLDB.
|
|
|
|
|
|
```plaintext
|
|
|
# Clean up the OLDB databases
|
|
|
config-initEs.sh
|
|
|
oldb-initEs
|
|
|
redis-cli flushall
|
|
|
sudo cii-services stop config
|
|
|
sudo cii-services start config
|
|
|
```
|
|
|
|
|
|
If you are dealing with a multi-user oldb ("role_groupserver", meaning it serves an OLDB to a team of developers), after executing the above commands you need to additionally execute (with privileges):
|
|
|
|
|
|
```plaintext
|
|
|
/elt/ciisrv/postinstall/cii-postinstall role_groupserver
|
|
|
```
|
|
|
|
|
|
If you have doubts, please contact us.
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### Command Line Tools and Python snippets \[OLDB\]
|
|
|
|
|
|
**Problem**
|
|
|
|
|
|
I need to inspect or modify the content of the OLDB from the command line or a shell script.
|
|
|
|
|
|
**Solution**
|
|
|
|
|
|
**cii-oldb-traversal-tool** for searching through the OLDB
|
|
|
|
|
|
```plaintext
|
|
|
$ cii-oldb-traversal-tool --file output --quality OK
|
|
|
$ cat output
|
|
|
cii.oldb:///root/trklsv/cfg/log/level|OK|WARNING|2020-09-11T15:25:08Z
|
|
|
cii.oldb:///root/trklsv/cfg/req/endpoint|OK|zpb.rr://localhost:44444/m1/TrkLsvServer|2020-09-11T15:25:08Z
|
|
|
cii.oldb:///root/trklsv/ctr/current/altaz/alt|OK|0.000000|2020-09-11T15:24:25Z
|
|
|
cii.oldb:///root/trklsv/ctr/current/altaz/az|OK|0.000000|2020-09-11T15:24:25Z
|
|
|
cii.oldb:///root/trklsv/ctr/current/radec/dec|OK|0.000000|2020-09-11T15:24:27Z
|
|
|
cii.oldb:///root/trklsv/ctr/current/radec/ra|OK|0.000000|2020-09-11T15:24:27Z
|
|
|
cii.oldb:///root/trklsv/ctr/poserr|OK|0.000000|2020-09-11T15:24:27Z
|
|
|
cii.oldb:///root/trklsv/ctr/status|OK|UNKNOWN|2020-09-11T15:23:55Z
|
|
|
cii.oldb:///root/trklsv/ctr/substate|OK|UNKNOWN|2020-09-11T15:23:49Z
|
|
|
cii.oldb:///root/trklsv/ctr/target/altaz/alt|OK|0.000000|2020-09-11T15:24:24Z
|
|
|
cii.oldb:///root/trklsv/ctr/target/altaz/az|OK|0.000000|2020-09-11T15:24:25Z
|
|
|
cii.oldb:///root/trklsv/ctr/target/radec/dec|OK|0.000000|2020-09-11T15:24:26Z
|
|
|
cii.oldb:///root/trklsv/ctr/target/radec/ra|OK|0.000000|2020-09-11T15:24:26Z
|
|
|
```
|
|
|
|
|
|
**oldb-cli** for reading, writing, subscribing to, creating, deleting an OLDB-datapoint
|
|
|
|
|
|
```plaintext
|
|
|
$ oldb-cli read cii.oldb:///root/trklsv/cfg/req/endpoint
|
|
|
SLF4J: Class path contains multiple SLF4J bindings.
|
|
|
SLF4J: Found binding in [jar:file:/eelt/ciisrv/1.0-RC3-20201030/lib/srv-support-libs/slf4j-nop-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
|
|
|
SLF4J: Found binding in [jar:file:/eelt/mal/1.1.0-2.2.3-20201027/lib/mal-opcua/slf4j-nop-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
|
|
|
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
|
|
|
SLF4J: Actual binding is of type [org.slf4j.helpers.NOPLoggerFactory]
|
|
|
log4j:WARN No appenders could be found for logger (io.netty.util.internal.logging.InternalLoggerFactory).
|
|
|
log4j:WARN Please initialize the log4j system properly.
|
|
|
Timestamp: 2020-09-11T15:25:08.648Z
|
|
|
Quality: OK
|
|
|
Value: zpb.rr://localhost:44444/m1/TrkLsvServer
|
|
|
```
|
|
|
|
|
|
|
|
|
**oldbReset** for putting an OLDB back to its initial state by removing all custom metadata and removing all datapoints. The tool can only be run on the server that hosts the OLDB.
|
|
|
|
|
|
```plaintext
|
|
|
$ oldbReset
|
|
|
```
|
|
|
|
|
|
See "-h" for options. Note there is no command line option to bypass the mandatory interactive security question.
|
|
|
|
|
|
|
|
|
|
|
|
**oldb Python API** for reading, writing, subscribing to, creating, and deleting an OLDB-datapoint
|
|
|
|
|
|
```plaintext
|
|
|
$ python
|
|
|
Python 3.7.6 (default, Jan 8 2020, 19:59:22)
|
|
|
[GCC 7.3.0] :: Anaconda, Inc. on linux
|
|
|
Type "help", "copyright", "credits" or "license" for more information.
|
|
|
>>>
|
|
|
>>> import elt.oldb
|
|
|
>>> from elt.config import Uri
|
|
|
>>> oldb = elt.oldb.CiiOldbFactory.get_instance()
|
|
|
```
|
|
|
|
|
|
… and, to create a datapoint:
|
|
|
|
|
|
```plaintext
|
|
|
>>> elt.oldb.CiiOldbGlobal.set_write_enabled(True)
|
|
|
>>> if not oldb.data_point_exists(Uri("cii.oldb:/ccs/tst/tmp1")):
...     oldb.create_data_point_by_value(Uri("cii.oldb:/ccs/tst/tmp1"), "my text")
|
|
|
```
|
|
|
|
|
|
… and, to write a datapoint:
|
|
|
|
|
|
```plaintext
|
|
|
>>> elt.oldb.CiiOldbGlobal.set_write_enabled(True)
|
|
|
>>> oldb.get_data_point (Uri("cii.oldb:/ccs/tst/tmp1")).write_value("my text")
|
|
|
```
|
|
|
|
|
|
… and, to read a datapoint:
|
|
|
|
|
|
```plaintext
|
|
|
>>> oldb.get_data_point(Uri("cii.oldb:/ccs/tst/tmp1")).read_value().get_value()
|
|
|
'my text'
|
|
|
```
|
|
|
|
|
|
… and, to subscribe to a datapoint:
|
|
|
|
|
|
```plaintext
|
|
|
>>> class CB:
...     def new_value(self, value, uri):
...         print("value:", value.get_value())
...
>>> sub = elt.oldb.typesupport.STRING.get_new_subscription_instance(CB())
>>> oldb.get_data_point(Uri("cii.oldb:/ccs/tst/tmp1")).subscribe(sub)
|
|
|
```
|
|
|
|
|
|
… and, to delete a datapoint:
|
|
|
|
|
|
```plaintext
|
|
|
>>> elt.oldb.CiiOldbGlobal.set_write_enabled(True)
|
|
|
>>> oldb.delete_data_point (Uri("cii.oldb:/ccs/tst/tmp1"))
|
|
|
```
|
|
|
|
|
|
### ------------------------------------------------------
|
|
|
|
|
|
### \[Log\]
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### Change Log Levels at Run-time \[Log\]
|
|
|
|
|
|
**Problem**
|
|
|
|
|
|
I want to modify the log levels of my application programmatically, without having to reload the full log configuration.
|
|
|
|
|
|
**Solution**
|
|
|
|
|
|
With [ECII-282](https://jira.eso.org/browse/ECII-282), the CiiLogManager was extended in all three languages to allow dynamically changing log levels.
|
|
|
|
|
|
**C++** added methods:
|
|
|
|
|
|
```cpp
|
|
|
void elt::log::CiiLogManager::SetLogLevel(const std::string logger_name, log4cplus::LogLevel level)
|
|
|
void elt::log::CiiLogManager::SetLogLevel(log4cplus::Logger logger, log4cplus::LogLevel level)
|
|
|
```
|
|
|
|
|
|
**Java** added methods:
|
|
|
|
|
|
```java
void elt.log.CiiLogManager.setLogLevel(
    final String loggerName, final org.apache.logging.log4j.Level level);
void elt.log.CiiLogManager.setLogLevel(
    org.apache.logging.log4j.Logger logger, final org.apache.logging.log4j.Level level);
```
|
|
|
|
|
|
**Python** added methods:
|
|
|
|
|
|
```python
elt.log.CiiLogManager.set_log_level(name_or_logger: Union[str, logging.Logger], level: logging.Level)
```
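For illustration, a minimal Python sketch based on the signature above; the logger name "CiiOldb" is taken from the OLDB section of this Knowledge Base:

```python
import logging

import elt.log

# From now on, only ERROR and above will be logged by the OLDB API logger.
elt.log.CiiLogManager.set_log_level("CiiOldb", logging.ERROR)
```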
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### Adjust CII Log Levels \[Log\]
|
|
|
|
|
|
With plain CII (no application frameworks on top), you define a log config file (myapp.logconfig):
|
|
|
|
|
|
```plaintext
|
|
|
log4cplus.rootLogger=INFO, STDOUT
|
|
|
log4cplus.appender.STDOUT=log4cplus::ConsoleAppender
|
|
|
log4cplus.appender.STDOUT.layout=elt::log::layout::CiiSimpleLayout
|
|
|
|
|
|
# other loggers, e.g. OLDB or MAL
|
|
|
log4cplus.logger.CiiOldb=FATAL
|
|
|
```
|
|
|
|
|
|
The name of the log config file is your choice, but to comply with the rules of "waf install", it is best to use a project structure like this:
|
|
|
|
|
|
```plaintext
|
|
|
myapp/
|
|
|
├── resource
|
|
|
│ └── config
|
|
|
│ └── myapp.logconfig
|
|
|
├── src
|
|
|
│ ├── myapp.cpp
|
|
|
└── wscript
|
|
|
```
|
|
|
|
|
|
Then apply the log config from your application (myapp.cpp):
|
|
|
|
|
|
```cpp
#include <ciiLogManager.hpp>

int main(int ac, char *av[]) {
  ::elt::log::CiiLogManager::Configure("resource/config/myapp.logconfig");
  log4cplus::Logger root_logger = ::elt::log::CiiLogManager::GetLogger();
  root_logger.log(log4cplus::INFO_LOG_LEVEL, "Message via root logger");
  return 0;
}
```
|
|
|
|
|
|
Side-note: To configure the logging fully programmatically, without a file, you would do:
|
|
|
```cpp
::elt::log::CiiLogManager::Configure({
  {"log4cplus.appender.ConsoleAppender", "log4cplus::ConsoleAppender"},
  {"log4cplus.appender.ConsoleAppender.layout", "elt::log::layout::CiiSimpleLayout"},
  {"log4cplus.rootLogger", "FATAL, ConsoleAppender"},
});
```
|
|
|
|
|
|
**List of Loggers** Generally, to learn about all loggers that are active in your application, add this (temporarily) to your application:
|
|
|
|
|
|
```cpp
#include <ciiLogManager.hpp>
[...]
std::cout << "Current loggers (in addition to root logger):" << std::endl;
std::vector<log4cplus::Logger> list = log4cplus::Logger::getCurrentLoggers();
for (std::size_t i = 0, n = list.size(); i < n; i++) {
  log4cplus::Logger elem = list[i];
  std::cout << elem.getName() << std::endl;
}
```
|
|
|
|
|
|
Note: With the next version of CII, the logger name will be included in log messages by default.
|
|
|
|
|
|
**Log Format** To use a different log format (which CII allows, but the Control System guidelines do not), you can modify the above config like this:
|
|
|
|
|
|
```plaintext
|
|
|
#log4cplus.appender.STDOUT.layout=elt::log::layout::CiiSimpleLayout
|
|
|
log4cplus.appender.STDOUT.layout=log4cplus::PatternLayout
|
|
|
log4cplus.appender.STDOUT.layout.ConversionPattern=[%-5p][%D{%Y/%m/%d %H:%M:%S:%q}][%-l][%t] %m%n
|
|
|
```
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### wscript packages for CII Log \[Log\]
|
|
|
|
|
|
**Problem**
|
|
|
|
|
|
I want to write a cxx application using CII Log and no other CII services, nor CII MAL. Which packages do I need in my wscript?
|
|
|
|
|
|
**Solution**
|
|
|
|
|
|
1) Include this in the `use` clause in the wscript of the cxx program folder:
|
|
|
```plaintext
|
|
|
client-api.elt-common.cpp.log
|
|
|
```
|
|
|
|
|
|
2) and put this into the higher-level project or package wscript:
|
|
|
```python
def configure(cnf):
    cnf.check_wdep(wdep_name="client-api.elt-common.cpp.log",
                   uselib_store="client-api.elt-common.cpp.log")
```
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
|
|
|
### Too many files errors \[Log\]
|
|
|
|
|
|
If an application (for example a UI) fails with "too many files" or "too many open files" errors, check the /var/log/elt and $CII_LOGS folders. There might be too many files there, typically produced by the logging system.
|
|
|
|
|
|
In particular, there could be a number of 0-byte files.
|
|
|
|
|
|
Clean up the folders (as root) to solve the problem.
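A minimal clean-up sketch (it only drops the 0-byte files; adjust paths and criteria to your situation):

```plaintext
# as root: delete empty log files in the common log locations
find /var/log/elt -type f -size 0 -delete
find $CII_LOGS -type f -size 0 -delete
```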
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
|
|
|
|
|
|
### ------------------------------------------------------
|
|
|
|
|
|
### \[Lang\]
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### Catching API Exceptions \[Lang Python\]
|
|
|
|
|
|
**Problem**
|
|
|
|
|
|
My application contains a call to the CII Python API.
|
|
|
|
|
|
When I ran it, it threw an exception with the following backtrace:
|
|
|
|
|
|
```plaintext
|
|
|
Top Level Unexpected exception:
|
|
|
Traceback (most recent call last):
|
|
|
File "/home/eltdev/MODULES/test/app.py", line 91, in instantiateDP
|
|
|
double_dp = self.oldb_client.create_data_point(uri, metadataInstName)
|
|
|
CiiOldbPyB.CiiOldbDpExistsException: The Data point cii.oldb:/root/test/xxxdp already exists.
|
|
|
```
|
|
|
|
|
|
Therefore, I added a corresponding try-catch around my call:
|
|
|
|
|
|
```python
try:
    ...
except CiiOldbPyB.CiiOldbDpExistsException as e:
```
|
|
|
|
|
|
When I run it, the try-catch doesn't work.
|
|
|
|
|
|
Moreover, I now get two backtraces:
|
|
|
|
|
|
```plaintext
|
|
|
Top Level Unexpected exception:
|
|
|
Traceback (most recent call last):
|
|
|
File "/home/eltdev/MODULES/test/app.py", line 91, in instantiateDP
|
|
|
double_dp = self.oldb_client.create_data_point(uri, metadataInstName)
|
|
|
CiiOldbPyB.CiiOldbDpExistsException: The Data point cii.oldb:/root/test/xxxdp already exists.
|
|
|
|
|
|
During handling of the above exception, another exception occurred:
|
|
|
Traceback (most recent call last):
|
|
|
File "/home/eltdev/MODULES/test/app.py", line 108, in main
|
|
|
oldbCreator.instantiateOLDB_exception()
|
|
|
File "/home/eltdev/MODULES/test/app.py", line 81, in instantiateOLDB_exception
|
|
|
self.instantiateDP(double_dp_uri, double_dp_meta.get_instance_name())
|
|
|
File "/home/eltdev/MODULES/test/app.py", line 94, in instantiateDP
|
|
|
except CiiOldbPyB.CiiOldbDpExistsException:
|
|
|
NameError: name 'CiiOldbPyB' is not defined
|
|
|
```
|
|
|
|
|
|
**Solution**
|
|
|
|
|
|
You were misled by the first backtrace: the exception name shown in the backtrace is not what you should catch.
|
|
|
|
|
|
In your code, replace "**CiiOldbPyB**" with "**elt.oldb**":
|
|
|
|
|
|
```python
try:
    ....
except elt.oldb.CiiOldbDpExistsException as e:
```
|
|
|
|
|
|
For completeness - do not forget this statement:
|
|
|
|
|
|
```python
|
|
|
import elt.oldb
|
|
|
```
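Putting it together, a minimal sketch; `oldb_client`, `uri` and `metadataInstName` are the names from the problem description above:

```python
import elt.oldb

try:
    double_dp = oldb_client.create_data_point(uri, metadataInstName)
except elt.oldb.CiiOldbDpExistsException:
    # the datapoint already exists - reuse it instead of failing
    double_dp = oldb_client.get_data_point(uri)
```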
|
|
|
|
|
|
**Background**
|
|
|
|
|
|
The CII Python API is mostly a binding to the CII C++ API.
|
|
|
|
|
|
The CiiOldbPyB.CiiOldbDpExistsException is the original binding class.
|
|
|
|
|
|
This binding class is re-exported under the name elt.oldb.CiiOldbDpExistsException.
|
|
|
|
|
|
The elt.oldb module internally loads the C++ binding module CiiOldbPyB. So both are the same exception.
|
|
|
|
|
|
Nonetheless, you should use the re-exported name, not the original name in your application. We discourage the use of the original name because the structure of the CiiOldbPyB module is more "chaotic" and not equivalent to elt.oldb.
|
|
|
|
|
|
Unfortunately, in the backtraces you will always see the original name instead of the re-exported name.
|
|
|
|
|
|
This question was originally asked in [ECII-422](https://jira.eso.org/browse/ECII-422).
|
|
|
|
|
|
### ------------------------------------------------------
|
|
|
|
|
|
### \[IntCfg\]
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### Elasticsearch disk usage and house-keeping \[IntCfg\]
|
|
|
|
|
|
The Elasticsearch database is used by many CII services, e.g. to store CII log messages, tracing data, and Oldb metadata. If you run a local elasticsearch database on your host (i.e. you have set up a "role_ownserver" during post-install), it is advisable to do some house-keeping on this database from time to time.
|
|
|
|
|
|
Some house-keeping is automated (e.g. ".monitoring-es" indices are automatically rolled over every few days); other tasks may be automated in the future, but currently are not. Instead, you should perform the tasks below at your own discretion.
|
|
|
|
|
|
**SOS - Disk Full** When disk usage (`df -h /var/lib/elasticsearch`) goes above 95%, elasticsearch goes into read-only mode. You will see this reported in /var/log/messages and the elastic logs, and by getting exceptions from CII operations like *oldb.CreateDataPoint()*. This will prevent you from doing any clean-up operations on elasticsearch. First, bring disk usage below 95% (e.g. by removing elastic logs with `find /var/log/elasticsearch -type f -mtime +10 -delete`, or by temporarily moving some files from the full partition to another partition), then put elasticsearch back into read-write mode with this command:
|
|
|
`curl -XPUT -H "Content-Type: application/json" localhost:9200/_all/_settings -d '{ "index.blocks.read_only_allow_delete": null }'`. After this, you can proceed normally with the house-keeping operations described next.
|
|
|
|
|
|
|
|
|
|
|
|
1. Check which indices you have and how much disk space they consume:
|
|
|
|
|
|
```plaintext
|
|
|
curl localhost:9200/_cat/indices/_all?v\&s=store.size
|
|
|
```
|
|
|
|
|
|
2. To delete diagnostic indices that are older than X days:
|
|
|
|
|
|
```plaintext
|
|
|
function ela_purge_idx { name=$1; age=$2; limit=$(date -d "$age days ago" +"%Y%m%d") ; for a in `curl -s localhost:9200/_aliases | jq -r 'keys | .[]'` ; do [[ $a == *$name* ]] && [[ "${a//[!0-9]/}" -lt $limit ]] && curl -X DELETE localhost:9200/$a ; done }
|
|
|
|
|
|
ela_purge_idx jaeger 10 # delete *jaeger* indices older than 10 days
|
|
|
```
|
|
|
|
|
|
3. To delete CII log messages that are older than 30 days:
|
|
|
|
|
|
```plaintext
|
|
|
curl -X POST "localhost:9200/cii_log_default_index/_delete_by_query?pretty" -H 'Content-Type: application/json' -d' {"query": {"range" : {"@timestamp" : {"lte": "now-30d/d" } } } } '
|
|
|
```
|
|
|
|
|
|
4. Or brute-force, delete all CII log messages:
|
|
|
```plaintext
|
|
|
curl -X DELETE "localhost:9200/cii_log_default_index?pretty"
|
|
|
curl -X PUT "localhost:9200/cii_log_default_index?pretty"
|
|
|
```
|
|
|
|
|
|
5. To free your disk from elastic logs older than 10 days, do (as root):
|
|
|
|
|
|
```plaintext
|
|
|
find /var/log/elasticsearch -type f -mtime +10 -delete
|
|
|
```
|
|
|
|
|
|
Finally, if you do not need CII logs stored in elasticsearch (= you don't use kibana), note that you can stop the log transport and log analysis engine. This way, elasticsearch will grow much slower.
|
|
|
```plaintext
|
|
|
sudo cii-services stop log
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### config not found on remote db \[IntCfg\]
|
|
|
|
|
|
**Problem**
|
|
|
|
|
|
You are intending to read a config from the local config database ("localdb"), but you see an error message like this:
|
|
|
|
|
|
```plaintext
|
|
|
elt.config.exceptions.CiiConfigNoTcException: Target configuration does not exist: Failed to retrieve configuration from elastic search: Configuration cii.config://*/supervisoryapp/TrkLsvDeploy on the remote db was not found
|
|
|
at elt.config.client.ConfigRemoteDatabase.retrieveConfig(ConfigRemoteDatabase.java:191)
|
|
|
at elt.config.client.CiiConfigClient.retrieveConfig(CiiConfigClient.java:354)
|
|
|
at elt.config.client.CiiConfigClient.retrieveConfig(CiiConfigClient.java:310)
|
|
|
at trkLsv.DataContext.loadConfig(DataContext.java:324)
|
|
|
at trkLsv.DataContext.<init>(DataContext.java:190)
|
|
|
at trkLsv.TrkLsv.go(TrkLsv.java:72)
|
|
|
at trkLsv.TrkLsv.main(TrkLsv.java:41)
|
|
|
Caused by: elt.error.icd.CiiSerializableException
|
|
|
at elt.config.service.client.icd.zpb.ServiceClientApiInterfaceAsyncImpl.processRequest(ServiceClientApiInterfaceAsyncImpl.java:73)
|
|
|
at elt.mal.zpb.rr.ClientAsyncImpl.events(ClientAsyncImpl.java:261)
|
|
|
at org.zeromq.ZPoller.dispatch(ZPoller.java:537)
|
|
|
at org.zeromq.ZPoller.poll(ZPoller.java:488)
|
|
|
at org.zeromq.ZPoller.poll(ZPoller.java:461)
|
|
|
at elt.mal.zpb.ZpbMal.processThread(ZpbMal.java:459)
|
|
|
at elt.mal.zpb.ZpbMal.lambda$new$0(ZpbMal.java:119)
|
|
|
at java.lang.Thread.run(Thread.java:748)
|
|
|
```
|
|
|
|
|
|
**Solution A**
|
|
|
|
|
|
Your local file may be invalid (e.g. illegal format, or doesn't match the config class definition).
|
|
|
|
|
|
Look at the content of your local config database, e.g. with
|
|
|
|
|
|
```plaintext
|
|
|
$ find $INTROOT/localdb
|
|
|
```
|
|
|
|
|
|
and correct the file in place, or fix the source yaml and then redeploy it from source to the localdb.
|
|
|
|
|
|
**Background**
|
|
|
|
|
|
You may have a malformed json file in your local db, which the config service failed to read.
|
|
|
|
|
|
Because of the use of the location wildcard "\*" in your code (in "cii.config://\*/supervisoryapp/TrkLsvDeploy"), the config service has consequently tried to load the config from the remote config database, where no such config exists.
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### Insufficient Manuals, Contributions
|
|
|
|
|
|
**Problem**
|
|
|
|
|
|
I found that some information in the user manuals is incomplete, unclear, outdated, or misleading.
|
|
|
|
|
|
**Solution 1**
|
|
|
|
|
|
If you're not even really sure what you're looking for, go and [create a CII ticket](https://jira.eso.org/secure/CreateIssue!default.jspa), and we can help with the problem at hand, and discuss how to improve the documentation for others.
|
|
|
|
|
|
**Solution 2**
|
|
|
|
|
|
If you have a fairly clear idea what should be added to the manuals, you can propose and make changes to the documentation source files directly. Contributions are welcome!
|
|
|
|
|
|
The CII all-in-one User Manual is composed of a dozen sub-documents in reStructuredText format (.rst). The only (minor) challenge is therefore to identify which sub-document you want to edit.
|
|
|
|
|
|
The documentation resides in `https://gitlab.eso.org/cii/info/cii-docs.git` that you can clone as usual to local workspace, then modify and create a merge request.
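For the clone-and-merge-request route, the usual git workflow applies (branch name and commit message below are illustrative):

```plaintext
git clone https://gitlab.eso.org/cii/info/cii-docs.git
cd cii-docs
git checkout -b more-details-on-network-interface
# ... edit the .rst files under userManual/ciiman/src/docs ...
git commit -a -m "More details on network interface"
git push -u origin more-details-on-network-interface
# then create the merge request in the gitlab web UI
```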
|
|
|
|
|
|
Alternatively, and recommended, you can edit the files directly in the browser using the File Editor built into GitLab:
|
|
|
|
|
|
_Note: If any of the buttons mentioned below is greyed out, you're lacking permissions. If so, please contact us first to request permission._
|
|
|
|
|
|
1. Browse to <https://gitlab.eso.org/cii/info/cii-docs>,
|
|
|
and navigate to folder `userManual/ciiman/src/docs`
|
|
|
2. Find the correct .rst file, select it, and press "Edit",
|
|
|
and from the pop-up list of choices, prefer the "Edit single file"
|
|
|
3. Make your changes in the content editing page.
|
|
|
Note you can toggle between Write and Preview mode.
|
|
|
|
|
|
When done with editing, see the lower part of the screen:
|
|
|
|
|
|
4. Describe your change:
|
|
|
* Commit message: please write some rationale for your contribution
|
|
|
Info: this text will re-appear later as the Title of your Merge Request.
|
|
|
* Target branch:
|
|
|
* DO NOT use the default (master), instead:
|
|
|
* empty the field and write e.g.
|
|
|
"hints-on-creating-document", or
|
|
|
"more-details-on-network-interface"
|
|
|
Note there cannot be whitespace in this name
|
|
|
* Checkbox "Start a new merge request": leave default (YES)
|
|
|
5. Commit changes (blue button on lower left),
|
|
|
|
|
|
The Merge Request window appears:
|
|
|
|
|
|
6. It is all prefilled. If you want you can make changes:
|
|
|
* The Title. It is prefilled from your commit message, but you can change it
|
|
|
* Leave all the defaults (Mark as draft: NO, Delete source branch: YES, Squash: NO)
|
|
|
7. Create Merge Request (blue button)
|
|
|
|
|
|
Inspect the result:
|
|
|
|
|
|
8. Wait for the pipeline to finish:
|
|
|
* After the pipeline finishes, Jenkins will add its comment to your merge request page.
|
|
|
* and in that comment follow the link under "Artifacts List" to see your change applied.
|
|
|
|
|
|
9. To refine the applied change, go back to step 1.
|
|
|
Use the same branch name in step 4, so that you do not create a 2nd Merge Request
|
|
|
(what you want is to update your Merge Request, not create another one).
|
|
|
|
|
|
We'll check the merge request, and your change can make it to the next release. Thanks in advance!
|
|
|
|
|
|
### ------------------------------------------------------
|
|
|
|
|
|
### \[ICD\]
|
|
|
|
|
|
### Failure to build ZPB from ICD \[ICD waf build\]
|
|
|
|
|
|
**Problem**
|
|
|
|
|
|
Building an ICD, I get the following error:
|
|
|
|
|
|
```plaintext
|
|
|
error: return type specification for constructor invalid
|
|
|
```
|
|
|
|
|
|
**Solution**
|
|
|
|
|
|
You have defined a struct containing a member of the same name ("feedback" in the example below). Rename one of the two.
|
|
|
|
|
|
```plaintext
|
|
|
<struct name="feedback">
|
|
|
<member name="feedback" type="float" arrayDimensions="(10)"/>
|
|
|
<member name="counter" type="uint32_t"/>
|
|
|
</struct>
|
|
|
```
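For illustration, renaming the member (the new name is an arbitrary choice) resolves this first case:

```plaintext
<struct name="feedback">
    <member name="feedback_values" type="float" arrayDimensions="(10)"/>
    <member name="counter" type="uint32_t"/>
</struct>
```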
|
|
|
|
|
|
Alternatively, you may have used a struct name that is a reserved word in Protobuf. Rename it.
|
|
|
|
|
|
```plaintext
|
|
|
<struct name="Swap">
|
|
|
<member name="counter" type="uint32_t"/>
|
|
|
</struct>
|
|
|
```
|
|
|
|
|
|
**Background**
|
|
|
|
|
|
In the first case, the code generated for the member looks to the compiler like a malformed constructor. In the second case, you have used a struct name that is already taken by a method name that protobuf silently generates into the code to be compiled, which then leads to the same problem.
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### No cpp/python from ICD \[ICD waf build\]
|
|
|
|
|
|
**Problem**
|
|
|
|
|
|
When I build my ICD, the build completes without errors or warnings, but it has not generated any cpp or python classes for my ICD. I do see the protoc-generated files, though.
|
|
|
|
|
|
**Solution**
|
|
|
|
|
|
Check that you have specified all needed dependencies in your waf script. The "requires" list needs to contain the following entries: requires='cxx python protoc fastdds boost cii gtest nosetests pytest'
|
|
|
|
|
|
The full waf script for your project would look something like this:
|
|
|
|
|
|
```python
|
|
|
declare_project(name='mytests',
|
|
|
version='1.0.0',
|
|
|
requires='cxx python protoc fastdds boost cii gtest nosetests pytest',
|
|
|
boost_libs='program_options',
|
|
|
cstd='gnu11',
|
|
|
cxx_std='gnu++14',
|
|
|
recurse='myconfig mylsvsim icd tests')
|
|
|
```
|
|
|
|
|
|
Also, note that in DevEnv 3.x the order of dependencies matters:
|
|
|
|
|
|
```plaintext
|
|
|
# Will not work, see the warnings during the 'waf configure' step.
|
|
|
requires='cxx python protoc fastdds cii boost gtest nosetests pytest pyqt5 sphinx'
|
|
|
|
|
|
# This order works
|
|
|
requires='cxx python protoc fastdds boost cii gtest nosetests pytest pyqt5 sphinx'
|
|
|
```
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### PYBIND errors \[ICD waf build\]
|
|
|
|
|
|
**Problem**
|
|
|
|
|
|
Trying to build your MAL Application, you get errors like below related to the PYBIND module.
|
|
|
|
|
|
```plaintext
|
|
|
icd/python/bindings/src/ModProto-benchmark.cpp:18:25: error: expected initializer before ‘-’ token
|
|
|
PYBIND11_MODULE(ModProto-benchmark, modproto-benchmark) {
|
|
|
```
|
|
|
|
|
|
**Solution**
|
|
|
|
|
|
Check the name of your ICD file:
|
|
|
|
|
|
```plaintext
|
|
|
> find icd
|
|
|
icd
|
|
|
icd/wscript
|
|
|
icd/src
|
|
|
icd/src/proto-benchmark.xml
|
|
|
```
|
|
|
|
|
|
The icd file name contains a minus, which is actually reflected in the above error message.
|
|
|
|
|
|
Rename the file to something like this:
|
|
|
|
|
|
```plaintext
|
|
|
> find icd
|
|
|
icd
|
|
|
icd/wscript
|
|
|
icd/src
|
|
|
icd/src/protobenchmark.xml
|
|
|
```
|
|
|
|
|
|
In general, due to the many code generation steps taking place, your freedom in ICD file naming is limited.
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### multiple XMLs found \[ICD waf build\]
|
|
|
|
|
|
**Problem**
|
|
|
|
|
|
Trying to build your ICD module, you see this error:
|
|
|
|
|
|
```plaintext
|
|
|
Waf: Entering directory \`/home/eltdev/repos/hlcc/build'
|
|
|
Error: multiple XMLs found, just one supported.
|
|
|
```
|
|
|
|
|
|
while in fact you have only one XML file in your ICD directory.
|
|
|
|
|
|
**Solution**
|
|
|
|
|
|
Check the file name of your ICD file: make sure it starts with an uppercase letter.
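For example (illustrative file names):

```plaintext
icd/src/trkicd.xml     # fails with "multiple XMLs found"
icd/src/TrkIcd.xml     # builds
```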
|
|
|
|
|
|
**Background**
|
|
|
|
|
|
The error message is misleading (will be improved, [ECII-426](https://jira.eso.org/browse/ECII-426)).
|
|
|
|
|
|
The code generator for malicd_topics fails when the ICD file name starts with lowercase.
|
|
|
|
|
|
For more information, see also: [PYBIND errors \[ICD waf build\]](#user-content-pybind-errors-icd-waf-build)
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### g++: internal compiler error, g++ fatal error \[ICD waf build\]
|
|
|
|
|
|
**Problem**
|
|
|
|
|
|
Trying to build an ICD-module or MAL-application, the build takes a long time, and/or fails with an error message like this:
|
|
|
|
|
|
```plaintext
|
|
|
g++: fatal error: Killed signal terminated program cc1plus
|
|
|
compilation terminated.
|
|
|
```
|
|
|
|
|
|
```plaintext
|
|
|
Software/CcsLibs/CcsTestData/python/bindings/src/ModCcstestdata.cpp:18:1: note:
|
|
|
in expansion of macro ‘PYBIND11_MODULE’
|
|
|
PYBIND11_MODULE(ModCcstestdata, modccstestdata) {
|
|
|
^
|
|
|
g++: internal compiler error: Killed (program cc1plus)
|
|
|
```
|
|
|
|
|
|
**Background**
|
|
|
|
|
|
The cpp compiler runs out of memory and crashes. You can see the effect by running htop in a separate terminal: all memory (including swap space) is consumed by the g++ compiler, which consequently crashes.
|
|
|
|
|
|
_Memory needed for building a given ICD module_
|
|
|
|
|
|
There is a base load that is the same for all ICD modules. On top of that, the ICD file contents determine how much memory is needed to build the module.
|
|
|
|
|
|
_Rule of Thumb_
|
|
|
| MAL version | Base Load | Mem per ICD-Struct |
|
|
|
|-------------|-----------|--------------------|
|
|
|
| MAL 1.x | 650 MB (2/3 GB) | 320 MB (1/3 GB) |
|
|
|
| MAL 2.0 | 650 MB (2/3 GB) | 110 MB (1/8 GB) |
|
|
|
|
|
|
Thus, if your biggest ICD contains 20 structs, building under MAL 1.x will require around 7 GB of available free memory.
|
|
|
|
|
|
_Measuring_
|
|
|
|
|
|
Record metrics of the ICD build with this time-command:
|
|
|
|
|
|
```plaintext
|
|
|
$ alias time='TIME="real\t%E\nmem\t%Mk\ncpu\t%P\npf\t%F" time'
|
|
|
$ time waf build
|
|
|
|
|
|
[...]
|
|
|
real 10:28.19
|
|
|
mem 7206676k
|
|
|
cpu 765%
|
|
|
pf 166
|
|
|
```
|
|
|
|
|
|
- If the build crashes, the time-command's output will not be fully reliable (the real memory need is higher than what the output shows).
|
|
|
- High page fault counts (`pf 1168635`) generally indicate you should reduce the module's footprint, see Solutions below.
|
|
|
|
|
|
More info is available at [ECII-109](https://jira.eso.org/browse/ECII-109)
|
|
|
|
|
|
**Solution 1: Decrease the module's footprint**
|
|
|
|
|
|
1. Remove unnecessary middlewares
|
|
|
|
|
|
Use the `xyz_disabled` options:
|
|
|
|
|
|
```python
|
|
|
from wtools import module
|
|
|
# Disable OPCUA and DDS, since not part of this interface.
|
|
|
module.declare_malicd(mal_opts={'opcua_disabled': True, 'dds_disabled': True})
|
|
|
```
|
|
|
|
|
|
2. Reduce build parallelism
|
|
|
|
|
|
By default the build system uses all cores on the host. Less parallelism means less memory consumers during the build. This is controlled by the waf `-j` option.
|
|
|
|
|
|
To build with only 4 cores: `$ time waf -j4 build`
|
|
|
|
|
|
As a rough estimate, each waf build task will consume around 2 GB RAM, so on a 12 core host with 16 GB RAM, a parallelism of 8 may be a good choice. Try different numbers of cores and use the output from the time-command (see above) to find an optimum between real time, page faults, and cpu load.
|
|
|
|
|
|
3. Adjust the compiler flags
|
|
|
|
|
|
The default set of compiler flags applied by the build system consumes significant memory. We recommend using "-O2 -flto -pipe" (_to be confirmed_) instead. This is how you pass custom compiler flags for your ICD-module:
|
|
|
|
|
|
In your project wscript:
|
|
|
|
|
|
```python
|
|
|
from wtools import project
|
|
|
[...]
|
|
|
def configure(cnf):
|
|
|
cnf.env.CXXFLAGS_MALPYTHON = '-O2 -flto -pipe'
|
|
|
[...]
|
|
|
```
|
|
|
|
|
|
4. Refactor your ICD
|
|
|
|
|
|
Reduce the memory need by splitting big ICDs into two or more smaller ICD modules.
|
|
|
|
|
|
**Solution 2: Increase the available memory**
|
|
|
|
|
|
1. Find RAM consumers and stop them, at least temporarily. For example, ElasticSearch uses a significant amount of RAM: `sudo cii-services stop elasticsearch`
|
|
|
2. Add temporary swap space to your host
|
|
|
|
|
|
```shell
|
|
|
# As root: create an 8 GB swap file.
# (fallocate is fast; if your filesystem does not support it, use the dd line instead)
fallocate -l 8G /swapfile
# dd if=/dev/zero of=/swapfile bs=1024 count=8388608
chmod 600 /swapfile
mkswap /swapfile
swapon /swapfile

# and to remove it:
swapoff -v /swapfile
rm -f /swapfile
|
|
|
```
|
|
|
|
|
|
3. Add permanent memory to your VM
|
|
|
|
|
|
Increase your RAM, or ask your system administrator to do it. Assess the necessary amount by using the "Rule Of Thumb" above.
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### Choose middlewares to build \[MAL ICD\]
|
|
|
|
|
|
**Problem**
|
|
|
|
|
|
I am certain that my ICD will never be used over OPC UA. Nonetheless, the ICD-compilation builds OPC UA mappings for my ICD. This is unnecessarily extending the compilation time for my application.
|
|
|
|
|
|
**Solution**
|
|
|
|
|
|
By default the ICD-compilation builds mappings for all middlewares. But it is possible to exclude certain middleware mappings from compilation, which will reduce compilation time. You do this by passing mal options to the icd-generator.
|
|
|
|
|
|
**Example**
|
|
|
|
|
|
wscript
|
|
|
|
|
|
```python
|
|
|
declare_malicd(use='icds.base', mal_opts = { 'opcua_disabled': True } )
|
|
|
```
|
|
|
|
|
|
The available options are:
|
|
|
|
|
|
- opcua_disabled = if True, disable OPCUA middleware generation
|
|
|
- dds_disabled = if True, disable DDS middleware generation
|
|
|
- zpb_disabled = if True, disable ZEROMQ middleware generation
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### Variable Tracking exceeded \[ICD\]
|
|
|
|
|
|
**Problem**
|
|
|
|
|
|
When building an ICD using CII-MAL, you see this warning message:
|
|
|
|
|
|
```plaintext
|
|
|
variable tracking size limit exceeded
|
|
|
```
|
|
|
|
|
|
**Background**
|
|
|
|
|
|
The -fvar-tracking-assignments option is automatically enabled by GCC when optimizations are enabled.
|
|
|
|
|
|
There is a limit on how many variables can be tracked by the compiler. The warning tells you that more variables would need to be tracked than the limit allows.
|
|
|
|
|
|
You can disable the tracking manually with -fno-var-tracking-assignments.
|
|
|
|
|
|
There are two easy ways to do it on the overall project.
|
|
|
|
|
|
**Solution A**
|
|
|
|
|
|
```plaintext
|
|
|
export CXXFLAGS=-fno-var-tracking-assignments
|
|
|
```
|
|
|
|
|
|
And then rerun “waf configure” and continue with the build.
|
|
|
|
|
|
Note: you have to have the variable exported each time you do a “waf configure”, as that is the point at which such flags are saved.
|
|
|
|
|
|
**Solution B**
|
|
|
|
|
|
If you want the flags to be fixed inside the project itself, then in the top-level wscript (where you define the project) add this configure section before the project declaration:
|
|
|
|
|
|
```python
def configure(cnf):
    cnf.env.append_value('CXXFLAGS', ['-fno-var-tracking-assignments'])
```
|
|
|
|
|
|
### ------------------------------------------------------
|
|
|
|
|
|
### \[MAL\]
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### Avoid ephemeral ports \[MAL\]
|
|
|
|
|
|
**Problem**
|
|
|
|
|
|
When you leave a client running for a while (say 20 minutes) without a server available, you start getting errors:
|
|
|
|
|
|
```plaintext
|
|
|
Message=Malformed message received, missing frames.
|
|
|
```
|
|
|
|
|
|
Starting the server does not fix the situation, and gives this error:
|
|
|
|
|
|
```plaintext
|
|
|
Errno 48 : Address already in use
|
|
|
```
|
|
|
|
|
|
**Solution**
|
|
|
|
|
|
The solution is to not use ports from the ephemeral range (aka local port range). The port numbers in the ephemeral range can be found with `cat /proc/sys/net/ipv4/ip_local_port_range`. For DevEnv 3.x with CentOS 8 they are:
|
|
|
| Port Range | Usable for MAL |
|
|
|
|------------|----------------|
|
|
|
| 1 - 1023 | 🟠 No |
|
|
|
| 1024 - 32767 | 🟢 Yes |
|
|
|
| 32768 - 60999 | 🟠 No |
|
|
|
| 61000 - 65535 | 🟢 Yes |
|
|
|
|
|
|
With the implementation of ECII-402, the MAL library will write a warning log (and can also be configured to throw an exception) if an application runs a MAL instance on one of the ephemeral ports.
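If you want to guard against this in your own tooling, the ephemeral range can be checked programmatically. A minimal Python sketch (it reads the same kernel file as above; the port number is just an example):

```python
# Sketch: check whether a chosen server port lies in the kernel's ephemeral range
def in_ephemeral_range(port: int) -> bool:
    with open("/proc/sys/net/ipv4/ip_local_port_range") as f:
        low, high = map(int, f.read().split())
    return low <= port <= high

if in_ephemeral_range(44444):
    print("WARNING: port 44444 is in the ephemeral range; choose another port")
```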
|
|
|
|
|
|
**Background**
|
|
|
|
|
|
(Explanation provided by M. Sekoranja)
|
|
|
|
|
|
After a while of (re-)connection attempts, the client will manage to connect... to itself! Because a client never expects messages from another client, malformed-message errors are emitted.
|
|
|
|
|
|
This happens because the TCP design allows for a 'simultaneous connect' feature: if a client is trying to connect to a local port, and the port is from the ephemeral range, it can occasionally connect to itself. The client thinks it is connected to a server, but it is actually connected to itself. Moreover, the server cannot bind to its server port anymore.
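The effect can be reproduced without MAL. A Python sketch (it assumes no server listens on PORT and that PORT lies inside the ephemeral range; it may take many attempts before the self-connection occurs):

```python
import socket

PORT = 35000  # assumption: inside /proc/sys/net/ipv4/ip_local_port_range

for attempt in range(1_000_000):
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        s.connect(("127.0.0.1", PORT))  # no server is listening here
    except ConnectionRefusedError:
        s.close()
        continue
    # Success without a server: local and peer address are identical,
    # i.e. the client has connected to itself.
    print(f"attempt {attempt}: {s.getsockname()} -> {s.getpeername()}")
    break
```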
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### Req/Rep Connection Listeners \[MAL Python\]
|
|
|
|
|
|
**Problem**
|
|
|
|
|
|
MAL includes a callback registration method to allow monitoring of the MAL service connection status.
|
|
|
|
|
|
This is done with the method `registerConnectionListener()`.
|
|
|
|
|
|
In the example below, this does not work: the `listenerMethod()` is never called.
|
|
|
|
|
|
|
|
|
|
|
|
```python
|
|
|
import sys
|
|
|
import datetime
|
|
|
import signal
|
|
|
import traceback
|
|
|
import logging
|
|
|
|
|
|
import elt.pymal as mal
|
|
|
from pymalcpp import TimeoutException
|
|
|
from ModTrk.Trk.StdCmds import StdCmdsSync
|
|
|
from ModTrk.Trk.StdCmds import StdCmdsAsync
|
|
|
from ModTrk.Trk import TelPosition
|
|
|
from ModTrk.Trk import AxesPosition
|
|
|
|
|
|
THREE_SECONDS = datetime.timedelta(seconds=3)
|
|
|
MINUTE = datetime.timedelta(seconds=60)
|
|
|
MY_SERVER_URL='localhost'
|
|
|
MY_SERVER_PORT='44444'
|
|
|
|
|
|
def listenerMethod(state):
|
|
|
print("listenerMethod: registerConnectionListener() response :" + str(state))
|
|
|
|
|
|
|
|
|
|
|
|
uri = 'zpb.rr://' + MY_SERVER_URL + ':' + str(MY_SERVER_PORT) + '/m1/' + 'TrkLsvServer'
|
|
|
print('MAL URI: ' + uri)
|
|
|
zpbMal = mal.loadMal('zpb', {})
|
|
|
factory = mal.CiiFactory.getInstance()
|
|
|
factory.registerMal('zpb', zpbMal )
|
|
|
stdcmds = factory.getClient(uri, StdCmdsAsync, qos=mal.rr.qos.ReplyTime(THREE_SECONDS))
|
|
|
stdcmds.registerConnectionListener(listenerMethod)
|
|
|
|
|
|
connectionFuture = stdcmds.asyncConnect()
|
|
|
|
|
|
connectionFuture.wait_for(THREE_SECONDS)
|
|
|
rtn = stdcmds.Status()
|
|
|
rtn.wait()
|
|
|
print( str(rtn.get() ))
|
|
|
```
|
|
|
|
|
|
_From <_[_https://jira.eso.org/browse/ECII-212_](https://jira.eso.org/browse/ECII-212)_>_
|
|
|
|
|
|
**Solution**
|
|
|
|
|
|
The developer must keep a reference to the object returned by the `registerConnectionListener()` invocation.
|
|
|
|
|
|
```plaintext
|
|
|
stdcmds = factory.getClient(uri, StdCmdsAsync, qos=mal.rr.qos.ReplyTime(THREE_SECONDS))
|
|
|
listenerRegistration = stdcmds.registerConnectionListener(listenerMethod)
|
|
|
```
|
|
|
|
|
|
**Background**
|
|
|
|
|
|
The documentation states a return value, but it is the responsibility of the developer to keep the reference. Otherwise, the object will be deleted when execution exits the block of code.
|
|
|
|
|
|
Remember to delete this object (assign None to it) when closing the connection to the MAL service.
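Putting it together, a minimal sketch of the full lifecycle, based on the client snippet above:

```python
# Keep the registration object alive while the connection is in use...
listenerRegistration = stdcmds.registerConnectionListener(listenerMethod)

# ... interact with the service ...

# ... and release it (assign None) before closing the MAL connection,
# so the registration object can be deleted.
listenerRegistration = None
```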
|
|
|
|
|
|
_From <_[_https://jira.eso.org/browse/ECII-212_](https://jira.eso.org/browse/ECII-212)_>_
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### Latency on Pub/Sub \[MAL ZMQ\]
|
|
|
|
|
|
**Problem**
|
|
|
|
|
|
We see latencies of 400-1000ms in MAL ZMQ pub/sub communication, for sending a 12k x 12k image blob.
|
|
|
|
|
|
For the first two transmissions, there is somewhere between 400 and 500 ms latency on the publisher side between just before calling the "publish" method and when we see the first packets on the wire.
|
|
|
|
|
|
This affects all messages, not only the first ones: when we let the program run for several minutes, all messages arrive on the subscriber side with a consistent delay.
|
|
|
|
|
|
**Solution**
|
|
|
|
|
|
From MAL 1.0.4 on, MAL supports the MAL-specific property below to limit the send-queue size for publishers of large messages.
|
|
|
|
|
|
```cpp
|
|
|
mal::Mal::Properties m_malProperties1;
|
|
|
m_malProperties1["zpb.ps.zmq.sndhwm"] = "1";
|
|
|
|
|
|
auto publisher = factory.getPublisher<mal::example::Sample>(uri,
|
|
|
{ std::make_shared<mal::ps::qos::Latency>(
|
|
|
std::chrono::milliseconds(100)),
|
|
|
std::make_shared<mal::ps::qos::Deadline>(
|
|
|
std::chrono::seconds(1)) }, m_malProperties1);
|
|
|
```
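For illustration, the same property can presumably be passed from Python when loading the zpb MAL, following the loadMal pattern used elsewhere in this Knowledge Base (a sketch; whether the Python binding honors this particular key is an assumption):

```python
import elt.pymal as mal

# Sketch: limit the ZMQ send queue to 1 message for a large-message publisher
zpbMal = mal.loadMal("zpb", {"zpb.ps.zmq.sndhwm": "1"})
```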
|
|
|
|
|
|
For more information on how to use MAL-specific properties, see the MAL Binding Manual.
|
|
|
|
|
|
**Background**
|
|
|
|
|
|
The issue has been first reported in [ECII-159](https://jira.eso.org/browse/ECII-159).
|
|
|
|
|
|
The problem lies in the ZMQ send queues. The default queue size is 1000 messages, and with 144 MB per message (in this case) this means up to 144 GB. With the queue size limited to 1 (for a test), a publisher can handle 20 subscribers (tested) without any problems. The solution is to reconfigure the send-queue size appropriately.
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### Sending an array of unions \[MAL CPP\]
|
|
|
|
|
|
**Problem**
|
|
|
|
|
|
Trying to send a message, I get the following error message from CII:
|
|
|
|
|
|
```plaintext
|
|
|
[libprotobuf ERROR google/protobuf/message_lite.cc:121] Can't parse message of type "generated.zpb.fcfif.StdCmds_Request" because it is missing required fields: data.Setup.payload[0].piezoData.input
|
|
|
```
|
|
|
|
|
|
My ICD definition looks like this
|
|
|
|
|
|
```xml
|
|
|
<enum name="PiezoInput">
|
|
|
<enumerator name="PIEZO_INPUT_SETMODE" />
|
|
|
<enumerator name="PIEZO_INPUT_MOVE" />
|
|
|
</enum>
|
|
|
|
|
|
<enum name="PiezoMode">
|
|
|
<enumerator name="PIEZO_MODE_1" />
|
|
|
<enumerator name="PIEZO_MODE_2" />
|
|
|
</enum>
|
|
|
|
|
|
<struct name="PiezoModeStruct">
|
|
|
<member name="mode" type="nonBasic" nonBasicTypeName="PiezoMode" />
|
|
|
</struct>
|
|
|
|
|
|
<enum name="PiezoMove">
|
|
|
<enumerator name="PIEZO_MOVE_1" />
|
|
|
<enumerator name="PIEZO_MOVE_2" />
|
|
|
</enum>
|
|
|
|
|
|
<union name="PiezoUnion">
|
|
|
<discriminator type="nonBasic" nonBasicTypeName="PiezoInput" />
|
|
|
<case>
|
|
|
<caseDiscriminator value ="PIEZO_INPUT_SETMODE"/>
|
|
|
<member name="piezoModeData" type="nonBasic" nonBasicTypeName="PiezoModeStruct" />
|
|
|
</case>
|
|
|
<case>
|
|
|
<caseDiscriminator value ="PIEZO_INPUT_MOVE"/>
|
|
|
<member name="piezoMoveData" type="nonBasic" nonBasicTypeName="PiezoMove" />
|
|
|
</case>
|
|
|
</union>
|
|
|
|
|
|
<struct name="Piezo">
|
|
|
<member name="id" type="string" />
|
|
|
<member name="input" type="nonBasic" nonBasicTypeName="PiezoUnion" />
|
|
|
</struct>
|
|
|
|
|
|
<interface name="PiezoTest">
|
|
|
<method name="test" returnType="void">
|
|
|
<argument name="arr" type="nonBasic" nonBasicTypeName="Piezo" arrayDimensions="(10)" />
|
|
|
</method>
|
|
|
</interface>
|
|
|
```
|
|
|
|
|
|
My code looks like this
|
|
|
|
|
|
```cpp
|
|
|
[...]
|
|
|
auto piezo = mal->createDataEntity<::fcfif::Piezo>();
|
|
|
piezo->setId("foo");
|
|
|
|
|
|
auto input = piezo->getInput();
|
|
|
auto mode = input->getPiezoModeData();
|
|
|
mode->setAction(::fcfif::ActionPiezoMode::SET_AUTO);
|
|
|
|
|
|
auto union = mal->createDataEntity<::fcfif::FcsUnion>();
|
|
|
union->setPiezoData(piezo);
|
|
|
|
|
|
[...]
|
|
|
```
|
|
|
|
|
|
**Solution**
|
|
|
|
|
|
```cpp
|
|
|
auto p = factory.getClient<::fcfif::PiezoTestSync>(uri,
|
|
|
{std::make_shared<mal::rr::qos::ReplyTime>
|
|
|
(std::chrono::seconds(3))},
|
|
|
{});
|
|
|
auto mal = p->getMal();
|
|
|
|
|
|
auto piezo = mal->createDataEntity<::fcfif::Piezo>();
|
|
|
piezo->setId("foo");
|
|
|
|
|
|
auto input = piezo->getInput();
|
|
|
auto piezoModeStruct = mal->createDataEntity<::fcfif::PiezoModeStruct>();
|
|
|
piezoModeStruct->setMode(::fcfif::PiezoMode::PIEZO_MODE_1);
|
|
|
|
|
|
input->setPiezoModeData(piezoModeStruct);
|
|
|
|
|
|
auto piezo2 = mal->createDataEntity<::fcfif::Piezo>();
|
|
|
piezo2->setId("foo2");
|
|
|
|
|
|
auto input2 = piezo2->getInput();
|
|
|
|
|
|
input2->setPiezoMoveData(::fcfif::PiezoMove::PIEZO_MOVE_1);
|
|
|
|
|
|
std::vector<std::shared_ptr<::fcfif::Piezo>> sa;
|
|
|
sa.push_back(piezo);
|
|
|
sa.push_back(piezo2);
|
|
|
p->test(sa);
|
|
|
```
|
|
|
|
|
|
**Background**
|
|
|
|
|
|
Your code does not work because you do not use the union instance provided by the parent structure. You need to obtain nested structures/unions via the accessors; do not try to create your own (detached) instances.
|
|
|
|
|
|
This issue was first described in [ECII-154](https://jira.eso.org/browse/ECII-154)
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### Failed to send request, send queue full \[MAL\]
|
|
|
|
|
|
**Problem**
|
|
|
|
|
|
You are intending to update a config, but you get an IllegalStateException where the last line is
|
|
|
|
|
|
```plaintext
|
|
|
elt.mal.zpb.rr.ClientAsyncImpl:183
|
|
|
```
|
|
|
|
|
|
`throw new MalException("Failed to send request, send queue full");`
|
|
|
|
|
|
**Solution**
|
|
|
|
|
|
Probably you have called close() on the CiiConfigClient instance somewhere, maybe implicitly at the end of a try-with-resources block.
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### Getting More Logs \[MAL Log\]
|
|
|
|
|
|
**Problem**
|
|
|
|
|
|
The MAL seems to misbehave. How can I get more log messages from the MAL used in my application?
|
|
|
|
|
|
**Solution A**
|
|
|
|
|
|
**Java**
|
|
|
|
|
|
From MAL 1.1.0, edit the MAL log4j config xml and specify the MAL log levels:
|
|
|
|
|
|
```xml
|
|
|
<Logger name="elt.mal" level="TRACE" />
|
|
|
```
|
|
|
|
|
|
**Cpp**
|
|
|
|
|
|
Example for **Zpb.** For other middlewares, see below
|
|
|
|
|
|
1. Put a log-config file into your file system:
|
|
|
|
|
|
```plaintext
|
|
|
log4cplus.rootLogger=TRACE, stdout
|
|
|
|
|
|
log4cplus.logger.malDds=TRACE, MyFileAppender
|
|
|
log4cplus.logger.malDdsBasePubSub=TRACE, MyFileAppender
|
|
|
log4cplus.logger.malDdsPublisher=TRACE, MyFileAppender
|
|
|
log4cplus.logger.malDdsInstancePublisher=TRACE, MyFileAppender
|
|
|
log4cplus.logger.malDdsSubscriptionManager=TRACE, MyFileAppender
|
|
|
log4cplus.logger.malDdsSubscriptionReaderListener=TRACE, MyFileAppender
|
|
|
log4cplus.logger.malDdsSubscriber=TRACE, MyFileAppender
|
|
|
log4cplus.logger.malDdsMrvSubscriber=TRACE, MyFileAppender
|
|
|
log4cplus.logger.malDdsRequesterImpl=TRACE, MyFileAppender
|
|
|
|
|
|
log4cplus.additivity.malDds=True
|
|
|
log4cplus.additivity.malDdsBasePubSub=True
|
|
|
log4cplus.additivity.malDdsPublisher=True
|
|
|
log4cplus.additivity.malDdsInstancePublisher=True
|
|
|
log4cplus.additivity.malDdsSubscriptionManager=True
|
|
|
log4cplus.additivity.malDdsSubscriptionReaderListener=True
|
|
|
log4cplus.additivity.malDdsSubscriber=True
|
|
|
log4cplus.additivity.malDdsMrvSubscriber=True
|
|
|
log4cplus.additivity.malDdsRequesterImpl=True
|
|
|
|
|
|
|
|
|
log4cplus.logger.malZpb=TRACE, MyFileAppender
|
|
|
log4cplus.logger.malZpbBasePubSub=TRACE, MyFileAppender
|
|
|
log4cplus.logger.malZpbPublisher=TRACE, MyFileAppender
|
|
|
log4cplus.logger.malZpbInstancePublisher=TRACE, MyFileAppender
|
|
|
log4cplus.logger.malZpbSubscriber=TRACE, MyFileAppender
|
|
|
log4cplus.logger.malZpbMrvSubscriber=TRACE, MyFileAppender
|
|
|
log4cplus.logger.malZpbServer=TRACE, MyFileAppender
|
|
|
log4cplus.logger.malZpbClientAsyncImpl=TRACE, MyFileAppender
|
|
|
|
|
|
log4cplus.additivity.malZpb=True
|
|
|
log4cplus.additivity.malZpbBasePubSub=True
|
|
|
log4cplus.additivity.malZpbPublisher=True
|
|
|
log4cplus.additivity.malZpbInstancePublisher=True
|
|
|
log4cplus.additivity.malZpbSubscriber=True
|
|
|
log4cplus.additivity.malZpbMrvSubscriber=True
|
|
|
log4cplus.additivity.malZpbServer=True
|
|
|
log4cplus.additivity.malZpbClientAsyncImpl=True
|
|
|
|
|
|
|
|
|
log4cplus.logger.malOpcua=TRACE, MyFileAppender
|
|
|
log4cplus.logger.malOpcuaBasePubSub=TRACE, MyFileAppender
|
|
|
log4cplus.logger.malOpcuaPublisher=TRACE, MyFileAppender
|
|
|
log4cplus.logger.malOpcuaInstancePublisher=TRACE, MyFileAppender
|
|
|
log4cplus.logger.malOpcuaSubscriber=TRACE, MyFileAppender
|
|
|
log4cplus.logger.malOpcuaMrvSubscriber=TRACE, MyFileAppender
|
|
|
log4cplus.logger.malOpcuaMrvDataMonitor=TRACE, MyFileAppender
|
|
|
log4cplus.logger.malOpcuaDataPoller=TRACE, MyFileAppender
|
|
|
log4cplus.logger.malOpcuaDataMonitor=TRACE, MyFileAppender
|
|
|
log4cplus.logger.malOpcuaClient=TRACE, MyFileAppender
|
|
|
log4cplus.logger.malOpcuaClientEventLoop=TRACE, MyFileAppender
|
|
|
|
|
|
log4cplus.additivity.malOpcua=True
|
|
|
log4cplus.additivity.malOpcuaBasePubSub=True
|
|
|
log4cplus.additivity.malOpcuaPublisher=True
|
|
|
log4cplus.additivity.malOpcuaInstancePublisher=True
|
|
|
log4cplus.additivity.malOpcuaSubscriber=True
|
|
|
log4cplus.additivity.malOpcuaMrvSubscriber=True
|
|
|
log4cplus.additivity.malOpcuaMrvDataMonitor=True
|
|
|
log4cplus.additivity.malOpcuaDataPoller=True
|
|
|
log4cplus.additivity.malOpcuaDataMonitor=True
|
|
|
log4cplus.additivity.malOpcuaClient=True
|
|
|
log4cplus.additivity.malOpcuaClientEventLoop=True
|
|
|
|
|
|
|
|
|
log4cplus.appender.stdout=log4cplus::ConsoleAppender
|
|
|
log4cplus.appender.stdout.layout=log4cplus::PatternLayout
|
|
|
log4cplus.appender.stdout.layout.ConversionPattern=%5p [%t] (%F:%L) - %m%n
|
|
|
|
|
|
log4cplus.appender.MyFileAppender=log4cplus::RollingFileAppender
|
|
|
log4cplus.appender.MyFileAppender.File=/tmp/elt-mal-cpp-trace.log
|
|
|
log4cplus.appender.MyFileAppender.layout=log4cplus::PatternLayout
|
|
|
log4cplus.appender.MyFileAppender.layout.ConversionPattern=[%-5p][%D{%Y/%m/%d %H:%M:%S:%q}][%-l][%t] %m%n
|
|
|
```
|
|
|
|
|
|
2. Configure log4cplus prior to using CII MAL:
|
|
|
```cpp
#include <log4cplus/configurator.h>

std::string pathToLogPropFile = "...";
if (!pathToLogPropFile.empty()) {
  log4cplus::PropertyConfigurator::doConfigure(pathToLogPropFile);
}
```
|
|
|
|
|
|
3. Pass the path to the log-config to the specific MAL being loaded:
|
|
|
|
|
|
MAL logging is initialized from a configuration file, the path of which is read from the MAL properties with the key mal::PROP_LOG_CONFIG_FILENAME. When loading a MAL, set mal::PROP_LOG_CONFIG_FILENAME in the MAL properties.
|
|
|
|
|
|
For example:
|
|
|
|
|
|
```cpp
|
|
|
auto zpbMal = mal::loadMal("zpb",
|
|
|
mal::Mal::Properties{{mal::PROP_LOG_CONFIG_FILENAME,"/path/to/mal-log4cplus.conf"}}
|
|
|
);
|
|
|
```
|
|
|
|
|
|
or in **python:**
|
|
|
|
|
|
```python
|
|
|
import elt.pymal as mal
|
|
|
zpbMal = mal.loadMal ("zpb", {"zpb.log4cplus.filename":"/path/to/mal-log4cplus.conf"})
|
|
|
```
|
|
|
|
|
|
Python example with inline config:
|
|
|
|
|
|
```python
import elt.pymal

with open("/tmp/mal.log.conf", 'w') as f:
    f.write('''
log4cplus.appender.stdout=log4cplus::ConsoleAppender
log4cplus.logger.malDds=TRACE, stdout
log4cplus.logger.malDdsBasePubSub=TRACE, stdout
log4cplus.logger.malDdsSubscriber=TRACE, stdout
''')

ddsMalProps = {"dds.log4cplus.filename": "/tmp/mal.log.conf"}
ddsMal = elt.pymal.loadMalForUri("dds.ps://", ddsMalProps)
```
|
|
|
|
|
|
|
|
|
**Solution B**
|
|
|
|
|
|
With [ECII-246](https://jira.eso.org/browse/ECII-246), it is possible to change the log levels of the MAL loggers at run-time via a method call:
|
|
|
|
|
|
```cpp
|
|
|
#include <mal/util/MalLoggingUtil.hpp>
|
|
|
::elt::mal::util::logging::setLogLevelForLoggers( ... )
|
|
|
|
|
|
// set all mal loggers to INFO...
|
|
|
std::vector<elt::mal::util::logging::LoggerInfo> loggers = elt::mal::util::logging::getLoggers();
|
|
|
std::vector<elt::mal::util::logging::LoggerInfo> info;
|
|
|
for (auto const& logger : loggers) {
|
|
|
info.push_back(elt::mal::util::logging::LoggerInfo(logger.loggerName, ::log4cplus::INFO_LOG_LEVEL));
|
|
|
}
|
|
|
elt::mal::util::logging::setLogLevelForLoggers(info);
|
|
|
|
|
|
// print all mal loggers and log level
|
|
|
loggers = elt::mal::util::logging::getLoggers();
|
|
|
std::cout << "Loggers:\n";
|
|
|
for (auto const& logger : loggers) {
|
|
|
std::cout << "\t" << logger.loggerName << ": " << logger.logLevel << std::endl;
|
|
|
}
|
|
|
|
|
|
```
|
|
|
|
|
|
The allowed logger names are listed below.
|
|
|
|
|
|
**Background**
|
|
|
|
|
|
MAL does not use the Cii Logging System (CiiLogManager etc.) directly. Instead, MAL expects logging to be initialized from a configuration file. Note that the format of the log-config differs per programming language.
|
|
|
|
|
|
List of logger names available in the **Cpp MAL / Python MAL:**
|
|
|
|
|
|
Loggers for **mal-zpb**
|
|
|
|
|
|
- malZpbInstancePublisher
|
|
|
- malZpbPublisher
|
|
|
- malZpbClientAsyncImpl
|
|
|
- malZpbServer
|
|
|
- malZpb
|
|
|
- malZpbSubscriber
|
|
|
- malZpbMrvSubscriber
|
|
|
- malZpbBasePubSub
|
|
|
|
|
|
Loggers for **mal-dds**
|
|
|
|
|
|
- malDds
|
|
|
- malDdsSubscriptionManager
|
|
|
- malDdsSubscriptionReaderListener
|
|
|
- malDdsSubscriber
|
|
|
- malDdsPublisher
|
|
|
- malDdsBasePubSub
|
|
|
- malDdsMrvSubscriber
|
|
|
- malDdsRequesterImpl
|
|
|
- malDdsInstancePublisher
|
|
|
|
|
|
Loggers for **mal-opcua**
|
|
|
|
|
|
- malOpcua
|
|
|
- malOpcuaBasePubSub
|
|
|
- malOpcuaMrvSubscriber
|
|
|
- malOpcuaMrvDataMonitor
|
|
|
- malOpcuaInstancePublisher
|
|
|
- malOpcuaPublisher
|
|
|
- malOpcuaSubscriber
|
|
|
- malOpcuaDataPoller
|
|
|
- malOpcuaDataMonitor
|
|
|
- malOpcuaClient
|
|
|
- malOpcuaClientEventLoop
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### More Frames expected \[MAL ZMQ\]
|
|
|
|
|
|
**Problem**
|
|
|
|
|
|
In your application you get errors like this:
|
|
|
|
|
|
```plaintext
|
|
|
Oct 16, 2019 11:26:21 AM elt.mal.zpb.ps.ZpbSubscriber events
|
|
|
|
|
|
WARNING: Remote data entity type hash does not match (1040672065 != 1708154137).
|
|
|
|
|
|
Oct 16, 2019 11:26:21 AM elt.mal.zpb.ps.ZpbSubscriber events
|
|
|
|
|
|
WARNING: Failed to process message.
|
|
|
|
|
|
java.lang.RuntimeException: more frames expected
|
|
|
|
|
|
at elt.mal.zpb.ps.ZpbSubscriber.requireMoreFrames(ZpbSubscriber.java:82)
|
|
|
|
|
|
at elt.mal.zpb.ps.ZpbSubscriber.events(ZpbSubscriber.java:114)
|
|
|
```
|
|
|
|
|
|
**Background**
|
|
|
|
|
|
The first warning message indicates that your client has received a piece of data (= MAL entity type) on a channel that should not carry such data. This means you are running two publishers, publishing different types of data, on the same channel.
|
|
|
|
|
|
**Example of topic definition with port-clash**
|
|
|
|
|
|
```xml
|
|
|
<pubsub_topic>
|
|
|
<topic_name>sm:current_pos</topic_name>
|
|
|
<topic_type>sm_current_pos</topic_type>
|
|
|
<address_uri>zpb.ps://134.171.2.220:57110/test</address_uri>
|
|
|
<qos latency_ms="1" deadline_ms="100"/>
|
|
|
<performance rate_hz="10" latency_ms="1" synchronous="false" />
|
|
|
<mal>
|
|
|
<zpb />
|
|
|
</mal>
|
|
|
</pubsub_topic>
|
|
|
|
|
|
<pubsub_topic>
|
|
|
<topic_name>hp:global_status</topic_name>
|
|
|
<topic_type>hp_global_status</topic_type>
|
|
|
<address_uri>zpb.ps://134.171.2.220:57110/test</address_uri>
|
|
|
<qos latency_ms="10" deadline_ms="100"/>
|
|
|
<performance rate_hz="1" latency_ms="10" synchronous="false" />
|
|
|
<mal>
|
|
|
<zpb />
|
|
|
</mal>
|
|
|
</pubsub_topic>
|
|
|
```
|
|
|
|
|
|
The second warning and the error trace are just a consequence of the first warning.
|
|
|
|
|
|
**Solution**
|
|
|
|
|
|
Check your topics.xml file, and make sure each channel has its own exclusive topic name.
|
|
|
|
|
|
In the above example, e.g.:
|
|
|
|
|
|
```plaintext
|
|
|
zpb.ps://134.171.2.220:57110/test1
|
|
|
```
|
|
|
|
|
|
and
|
|
|
|
|
|
```plaintext
|
|
|
zpb.ps://134.171.2.220:57110/test2
|
|
|
```
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### Address in use \[MAL ZMQ\]
|
|
|
|
|
|
**Problem**
|
|
|
|
|
|
Running your application, you see this error message:
|
|
|
|
|
|
```plaintext
|
|
|
ZMQException: Errno 48 : Address already in use
|
|
|
```
|
|
|
|
|
|
**Solution A**
|
|
|
|
|
|
Another instance of your application is still running.
|
|
|
|
|
|
**Solution B**
|
|
|
|
|
|
Another instance of your application has non-gracefully terminated without freeing the network port.
|
|
|
|
|
|
Ensure your application always performs a call to "mal.close()" on shutdown.
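A defensive sketch in Python, using the cleanup call shown in the "Request-Reply Python Snippets" article below (whether this call frees the port in your binding is an assumption; the point is to guarantee cleanup on every exit path):

```python
import elt.pymal

malfact = elt.pymal.loadMalForUri("zpb.rr://", {})
try:
    pass  # ... create clients/servers and interact with them ...
finally:
    malfact.unregisterMal("zpb")  # always release network resources on shutdown
```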
|
|
|
|
|
|
**Solution C**
|
|
|
|
|
|
This could really be a usage error due to a wrong configuration.

The error message is in fact misleading.
|
|
|
|
|
|
**Example**
|
|
|
|
|
|
```plaintext
|
|
|
eltcii33 [09:38:27] eeltdev:~/mschilli > mal-esotests-testclient1 pub sAddr=zpb://eltcii28:12333/Sample tSlow=100 nSamp=100
|
|
|
pub:sys: Available MAL Flavours loaded: [dds, opc, zpb]
|
|
|
pub:config: sAddr=zpb://eltcii28:12333/Sample
|
|
|
pub:config: nSamp=100
|
|
|
pub:config: tSlow=100
|
|
|
Internal Error: org.eso.elt.mal.MalException: org.zeromq.ZMQException: Errno 48 : Address already in use
|
|
|
```
|
|
|
|
|
|
**Reason**
|
|
|
|
|
|
The above command, running on host eltcii33, is trying to publish with endpoint eltcii28.
|
|
|
|
|
|
**Fix**
|
|
|
|
|
|
On eltcii33, the endpoint must be eltcii33.
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### Scheme not supported \[MAL\]
|
|
|
|
|
|
**Problem**
|
|
|
|
|
|
Running your application, you get this error message:
|
|
|
|
|
|
```plaintext
|
|
|
elt.mal.SchemeNotSupportedException: middleware not supported
|
|
|
```
|
|
|
|
|
|
**Solution A**
|
|
|
|
|
|
In your code, you've misspelled the middleware name, e.g. "opc" instead of "opcua".
|
|
|
|
|
|
**Solution B**
|
|
|
|
|
|
The middleware is in fact supported, but failed to load.
|
|
|
|
|
|
- E.g. in DDS, you are using a QoS profile XML file which has some illegal syntax inside.
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### Choosing a NIC \[MAL DDS\]
|
|
|
|
|
|
**Problem**
|
|
|
|
|
|
I'm using MAL-DDS (or MAL-MUDPI), and I have two network cards (NICs) installed. MAL uses the wrong one, i.e. my network traffic goes into the "office" network, but should go into the "control" network.
|
|
|
|
|
|
**Background**
|
|
|
|
|
|
As multicast addresses are by definition not associated with hardware (i.e. they map to MAC addresses which have no corresponding Ethernet card), there is no means for the OS to resolve which NIC the IGMP subscription should be sent down. Thus the NIC must be specified, or the default is used (which is the office network).
|
|
|
|
|
|
The multicast middlewares (DDS and MUDPI) supported by the MAL allow you to specify which NIC you want to use for outgoing traffic. Thus, this boils down to configuring the middleware.
|
|
|
|
|
|
**Solution (DDS)**
|
|
|
|
|
|
Get the XML file shown in Solution #1 on this page: <https://community.rti.com/howto/control-or-restrict-network-interfaces-nics-used-discovery-and-data-distribution>, and continue with this article: [KB: Configuring DDS](https://gitlab.eso.org/ecs/eltsw-docs/-/wikis/KnowledgeBase/CII#configuring-dds-mal-dds)
|
|
|
|
|
|
**Solution (MUDPI)**
|
|
|
|
|
|
Set the "mudpi.ps.interfaceName" mal property when creating the MAL:
|
|
|
|
|
|
```cpp
|
|
|
auto &factory = ::elt::mal::loadMalForUri("mudpi.ps://",
|
|
|
{ {"mudpi.ps.interfaceName","192.168.100.165"} } );
|
|
|
```
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### Configuring DDS \[MAL DDS\]
|
|
|
|
|
|
**Problem**
|
|
|
|
|
|
Some of the middlewares usable through MAL offer a variety of configuration options.
|
|
|
|
|
|
This article explains how to define and use configuration for the DDS middleware.
|
|
|
|
|
|
**Solution**
|
|
|
|
|
|
To configure DDS, 3 things are necessary:
|
|
|
|
|
|
- put the desired config into an external XML file (see Fast DDS documentation/examples)
|
|
|
- set the FASTRTPS_DEFAULT_PROFILES_FILE (Connext: NDDS_QOS_PROFILES) environment variable, so DDS finds the XML file (alternatively, this environment variable need not be set: the path can be passed directly in the MAL properties, malprops in the example below).
|
|
|
- pass 2 properties to the MAL factory, so DDS finds the right profile in the XML file
|
|
|
|
|
|
**Example: How to restrict DDS traffic to your own host**
|
|
|
|
|
|
XML file (see <https://fast-dds.docs.eprosima.com/en/latest/fastdds/discovery/general_disc_settings.html>)
|
|
|
|
|
|
```xml
|
|
|
<?xml version="1.0" encoding="UTF-8" ?>
|
|
|
<dds>
|
|
|
<profiles xmlns="http://www.eprosima.com/XMLSchemas/fastRTPS_Profiles">
|
|
|
<participant profile_name="MyApp_Default">
|
|
|
<rtps>
|
|
|
<builtin>
|
|
|
<discovery_config>
|
|
|
<ignoreParticipantFlags>FILTER_DIFFERENT_HOST</ignoreParticipantFlags>
|
|
|
</discovery_config>
|
|
|
</builtin>
|
|
|
</rtps>
|
|
|
</participant>
|
|
|
</profiles>
|
|
|
</dds>
|
|
|
```
|
|
|
|
|
|
Code (C++)
|
|
|
|
|
|
```cpp
|
|
|
// Create DDS-MAL with custom mal properties
|
|
|
|
|
|
// With Fast DDS, profile.library prop and env var *must* have same value!
|
|
|
// Here the env var precedes, but you could do the inverse (using setenv).
|
|
|
char* env_var = std::getenv("FASTRTPS_DEFAULT_PROFILES_FILE");
|
|
|
const ::elt::mal::Mal::Properties malprops { {"dds.qos.profile.library", env_var},
|
|
|
{"dds.qos.profile.name", "MyApp_Default"} };
|
|
|
auto &factory = ::elt::mal::loadMalForUri ("dds.ps://", malprops);
|
|
|
|
|
|
// Publishers created from here on will have the setting applied
|
|
|
auto malpub = factory.getPublisher<AltAz> (pubsuburi, qos, {});
|
|
|
```
|
|
|
|
|
|
Before running your code:
|
|
|
|
|
|
```plaintext
|
|
|
export FASTRTPS_DEFAULT_PROFILES_FILE=<path of XML file>
|
|
|
```
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
### Using DDS Monitor \[MAL DDS\]
|
|
|
|
|
|
**Problem**
|
|
|
|
|
|
The DDS monitor _fastdds_monitor_ should show all DDS Participants (peers) for the selected domain. How do I enable this?
|
|
|
|
|
|
**Solution**
|
|
|
|
|
|
Important: in order to enable publishing of this meta data, it must be enabled either via an environment variable in the shell that the DDS application is run in (i.e. in publishers and subscribers, not in the shell where the fastdds_monitor is running), OR (preferably) set in the XML QoS file.
|
|
|
|
|
|
Using the environment variable:
|
|
|
|
|
|
```plaintext
|
|
|
export FASTDDS_STATISTICS="HISTORY_LATENCY_TOPIC;NETWORK_LATENCY_TOPIC;PUBLICATION_THROUGHPUT_TOPIC;\
|
|
|
RTPS_SENT_TOPIC;RTPS_LOST_TOPIC;HEARTBEAT_COUNT_TOPIC;ACKNACK_COUNT_TOPIC;NACKFRAG_COUNT_TOPIC;\
|
|
|
GAP_COUNT_TOPIC;DATA_COUNT_TOPIC;RESENT_DATAS_TOPIC;SAMPLE_DATAS_TOPIC;PDP_PACKETS_TOPIC;EDP_PACKETS_TOPIC;\
|
|
|
DISCOVERY_TOPIC;PHYSICAL_DATA_TOPIC"
|
|
|
```
|
|
|
|
|
|
Setting in the QoS XML file:
|
|
|
```xml
|
|
|
<participant profile_name="MyApp_Default_Participant">
|
|
|
<rtps>
|
|
|
<propertiesPolicy>
|
|
|
<properties>
|
|
|
<!-- Activate Fast DDS Statistics Module -->
|
|
|
<property>
|
|
|
<name>fastdds.statistics</name>
|
|
|
<value>HISTORY_LATENCY_TOPIC;NETWORK_LATENCY_TOPIC;PUBLICATION_THROUGHPUT_TOPIC;RTPS_SENT_TOPIC;RTPS_LOST_TOPIC;HEARTBEAT_COUNT_TOPIC;ACKNACK_COUNT_TOPIC;NACKFRAG_COUNT_TOPIC;GAP_COUNT_TOPIC;DATA_COUNT_TOPIC;RESENT_DATAS_TOPIC;SAMPLE_DATAS_TOPIC;PDP_PACKETS_TOPIC;EDP_PACKETS_TOPIC;DISCOVERY_TOPIC;PHYSICAL_DATA_TOPIC</value>
|
|
|
</property>
|
|
|
</properties>
|
|
|
</propertiesPolicy>
</rtps>
</participant>
```
|
|
|
|
|
|
Once this is done, the statistics are visible. Note that not all statistics (e.g. the QoS of a participant) are correctly displayed by the DDS Monitor; this is slowly being improved with each release.
|
|
|
|
|
|
I already have some comments/feedback to eProsima. I welcome any feedback from your tests as well.
|
|
|
|
|
|
By default all MAL DDS peers have the name “RTPSParticipant”.
|
|
|
There are two ways to set a custom name:
|
|
|
|
|
|
A participant name can be assigned in the XML QoS file as follows:
|
|
|
|
|
|
```xml
|
|
|
<participant profile_name="MyApp_Default_Participant">
|
|
|
<rtps>
|
|
|
<name>MyApp_Participant</name>
|
|
|
</rtps>
</participant>
```
|
|
|
|
|
|
However, this means all participants sharing this profile have the same name.
|
|
|
|
|
|
Alternatively, use the MAL property to set the participant name; it will set and/or override any participant name read from the QoS file.

The property is: dds.qos.participant.name

For example, it may be used as follows:
|
|
|
|
|
|
```cpp
|
|
|
const ::elt::mal::Mal::Properties pubprops {
|
|
|
{"dds.qos.profile.library", env_var},
|
|
|
{"dds.qos.profile.name.publisher", "MyApp_Default_Publisher"},
|
|
|
{"dds.qos.profile.name.writer", "MyApp_Default_Writer"},
|
|
|
{"dds.qos.profile.name.topic", "MyApp_Default_Topic"},
|
|
|
{"dds.qos.participant.name", "icd-demo-publisher"},
|
|
|
{"dds.qos.profile.name.participant", "MyApp_Default_Participant"}
|
|
|
};
|
|
|
```
|
|
|
|
|
|
|
|
|
***Example***
|
|
|
build and install icd-demo:
|
|
|
- git clone https://gitlab.eso.org/cii/mal/icd-demo.git
|
|
|
- cd icd-demo/
|
|
|
- waf configure build install
|
|
|
|
|
|
- set the environment variable above and run the demo publisher and subscriber
|
|
|
- mal-api-demo-publisher --uri "dds.ps:///m1"
|
|
|
- mal-api-demo-subscriber --uri "dds.ps:///m1"
|
|
|
- finally, run fastdds_monitor to observe the statistics.
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### Topic History a.k.a. late joiners \[MAL DDS\]
|
|
|
|
|
|
Many PubSub topics need to cater for late-joining subscribers. That is, subscribers typically need to receive the last value published, and then all values published from the time of joining on.
|
|
|
|
|
|
There are a few key aspects of DDS that must be configured to enable this:
|
|
|
1. The topic must publish a type that has a key. History is stored in the publisher on an instance (i.e. key value) basis (i.e. one sample per instance). In ICD XML, the type struct must contain a member with key="true" set, e.g.:
|
|
|
|
|
|
```xml
|
|
|
<struct name="Sample" trace="true">
|
|
|
<member name="daqId" type="int64_t" key="true" />
|
|
|
<member name="value" type="double" />
|
|
|
</struct>
|
|
|
```
|
|
|
2. The topic QoS must be set to have historyQos set to KEEP_LAST, with depth 1. For example:
|
|
|
|
|
|
```xml
|
|
|
<topic profile_name="MyApp_Default_Topic">
|
|
|
<historyQos>
|
|
|
<kind>KEEP_LAST</kind>
|
|
|
<depth>1</depth>
|
|
|
</historyQos>
|
|
|
</topic>
|
|
|
```
|
|
|
|
|
|
3. Both data_writer and data_reader QoS must be set to reliability RELIABLE. Reliable communications is required to receive historical data.
|
|
|
|
|
|
```xml
|
|
|
<reliability>
|
|
|
<kind>RELIABLE</kind>
|
|
|
<max_blocking_time>
|
|
|
<sec>1</sec>
|
|
|
</max_blocking_time>
|
|
|
</reliability>
|
|
|
```
|
|
|
|
|
|
4. The data_reader QoS must be set durability to TRANSIENT_LOCAL. This means it will request missed data samples, but not beyond the life of the system (i.e. no persistence to disk). Without this setting the subscriber will not inquire about missed data.
|
|
|
|
|
|
```xml
|
|
|
<durability>
|
|
|
<kind>TRANSIENT_LOCAL</kind>
|
|
|
</durability>
|
|
|
```
|
|
|
|
|
|
With the above settings in place, late joining subscribers should receive the last data published for each instance, for each topic, from connected publishers.
|
|
|
More details on Fast DDS QoS: <https://fast-dds.docs.eprosima.com/en/latest/fastdds/api_reference/dds_pim/core/policy/historyqospolicykind.html>
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
### DDS SHM Shared Memory Startup Errors \[MAL DDS\]
|
|
|
|
|
|
**Problem**
|
|
|
|
|
|
On startup of a DDS application, an error about Shared Memory (SHM) is displayed, for example:

```plaintext
RTPS SHM: "port marked as not ok"
```
|
|
|
|
|
|
**Solution**
|
|
|
|
|
|
The SHM transport is one of the default transports of a DDS application and is used to communicate with peers on the same host. The relevant files created by DDS are visible in /dev/shm/\*fast\*.

If a DDS application does not exit cleanly, it may leave SHM files behind, possibly leading to errors when the application restarts, and in any case polluting the /dev/shm/ folder.
|
|
|
|
|
|
To clean up the SHM files used by DDS, the following command is provided:
|
|
|
```plaintext
|
|
|
fastdds shm clean
|
|
|
```
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
### Discovery over multiple NICs \[MAL DDS\]
|
|
|
|
|
|
**Problem**
|
|
|
|
|
|
How do DataReaders connect over multiple NICs?
|
|
|
|
|
|
**Explanation**
|
|
|
Let's say we have 3 hosts:
|
|
|
- A: one NIC on 192.168.1.10 and another NIC on 10.10.10.10
|
|
|
- B: one NIC on 192.168.1.11
|
|
|
- C: one NIC on 10.10.10.12
|
|
|
|
|
|
On host A we have a participant with FASTDDS_STATISTICS="PUBLICATION_THROUGHPUT_TOPIC" and a DataWriter on topic _important_high_frequency_data_.
|
|
|
|
|
|
On host B we have a participant with FASTDDS_STATISTICS="SUBSCRIPTION_THROUGHPUT_TOPIC" and a DataReader on topic _important_high_frequency_data_
|
|
|
|
|
|
On host C we have the Fast DDS monitor.
|
|
|
|
|
|
Participants B and C will **not** discover each other, since they are on different LANs.
|
|
|
|
|
|
Participant A discovery will announce:
|
|
|
"I have a DataWriter for topic _important_high_frequency_data_ communicating through any of 192.168.1.10, 10.10.10.10.
|
|
|
I also have a DataWriter for topic _fastdds_statistics_publication_throughput_ communicating through any of 192.168.1.10, 10.10.10.10."
|
|
|
|
|
|
Participant B discovery will announce:
|
|
|
"I have a DataReader for topic _important_high_frequency_data_ listening on 192.168.1.11.
|
|
|
I also have a DataWriter for topic _fastdds_statistics_subscription_throughput_ communicating through any of 192.168.1.11."
|
|
|
|
|
|
The DataWriter on participant A will then send data for topic _important_high_frequency_data_ to 192.168.1.11 (through NIC 192.168.1.10).
|
|
|
|
|
|
Participant C discovery will announce:
|
|
|
"I have a DataReader for topic _fastdds_statistics_publication_throughput_ listening on 10.10.10.12.
|
|
|
I also have a DataReader for topic _fastdds_statistics_subscription_throughput_ listening on 10.10.10.12."
|
|
|
|
|
|
The statistics DataWriter on participant A will then send data for topic _fastdds_statistics_publication_throughput_ to 10.10.10.12 (through NIC 10.10.10.10) and be visible on DDS Monitor.
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### Summary of OPC/UA MAL in C++
|
|
|
|
|
|
This article covers integration of OPC/UA in CII MAL, specifically for the OPC/UA Data Access and Subscription profiles. OPC/UA method invocation is also supported in CII MAL but is not described in this article; likewise, details of the Python (and Java) support are not provided. Only C++ is considered here.
|
|
|
|
|
|
OPC/UA communication middleware is exposed in CII MAL as either Publish/Subscribe or Request/Reply APIs.
|
|
|
|
|
|
The XML ICD definition of types in CII is used to map sets of data points together that are read/written as a group.
|
|
|
|
|
|
Each attribute in the defined type is connected to a corresponding data point in the OPC/UA data space via a URI. Thus the CII URI for a complex type will contain specific addresses of multiple nodes in the OPC/UA data space.
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
#### Pub/Sub API for OPC/UA Clients:
|
|
|
|
|
|
The CII MAL Pub/Sub API utilizes OPC/UA Data Access Reads and Writes, as well as OPC/UA subscriptions. A Publisher will directly trigger an OPC DA write, while a Subscriber will work in one of two ways, depending on the type associated with the subscriber:
|
|
|
|
|
|
- If the subscriber's CII URI contains only a single node (i.e. the XML ICD type contains only a single attribute), then the Subscriber will create an OPC/UA subscription on that data point. The subscription will trigger notification of updates to the data point node, which will then be queued for notification via the CII Subscriber API.
|
|
|
- If the Subscriber is using subscription, the opc.ps.outstandingPublishRequests property should not be zero (e.g. set it to 5), see the example code below. The reason is that the publish queue is used to store and send subscription notification events, and if the queue is small the event notifications may simply be dropped.
|
|
|
- If the subscriber's CII URI contains multiple nodes (i.e. the XML ICD type contains multiple attributes), then the Subscriber launches a thread to perform periodic polling of data from the OPC/UA server. The rate is based on the properties passed when creating the subscriber, e.g.:
|
|
|
|
|
|
```cpp
|
|
|
try{
|
|
|
subscriber = factory.getSubscriber<T>(opcua_uri, ::elt::mal::ps::qos::QoS::DEFAULT,
|
|
|
{ {"opc.ps.outstandingPublishRequests","5"},
|
|
|
{"opc.asyncLoopExecutionPeriodMs","50"},
|
|
|
{"opc.asyncCallSubmitTimeoutMs","1000"},
|
|
|
{"opc.ps.pollingPeriodMs","20000"},
|
|
|
{"opc.asyncCallRetryPeriodMs","250"} } );
|
|
|
|
|
|
} catch(...) {
|
|
|
throw;
|
|
|
}
|
|
|
```
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
#### Request/Reply API for OPC/UA Clients:
|
|
|
|
|
|
As OPC/UA Data Access read and write essentially follow a synchronous request/reply pattern, CII MAL also provides this interface for OPC/UA clients.
|
|
|
|
|
|
The ICD is termed "virtual" in CII nomenclature as it does not require definition as an XML ICD using the service syntax; rather, the same types defined for the Pub/Sub API may be used with a CII MAL OPC/UA Request/Reply.
|
|
|
|
|
|
This approach means OPC/UA Data Access read (and write) are synchronous, and may be called as needed by the application.
|
|
|
|
|
|
```cpp
|
|
|
namespace mal {
|
|
|
namespace rr {
|
|
|
namespace da {
|
|
|
|
|
|
class DataAccess : public ::elt::mal::rr::RrEntity {
|
|
|
public:
|
|
|
[...]
|
|
|
template <typename T>
|
|
|
void read(::elt::mal::ps::DataEntity<T>& value) {
|
|
|
readUnsafe(&value);
|
|
|
}
|
|
|
|
|
|
template <typename T>
|
|
|
void write(const ::elt::mal::ps::DataEntity<T>& value) {
|
|
|
writeUnsafe(&value);
|
|
|
}
|
|
|
```
|
|
|
|
|
|
A test application showing its use is here:
|
|
|
|
|
|
<https://gitlab.eso.org/cosylab/elt-cii/mal/mal-test/-/blob/develop/cpp/mal-test-performance/opcua/mal-opcua-da-speed/src/common.cpp>
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### Request-Reply Python Snippets \[MAL Python\]
|
|
|
|
|
|
To interact with any MAL-based remote service, you can use the Python shell to connect to the remote object, invoke its methods, and process the return values.
|
|
|
|
|
|
The whole lifecycle (including clean-up) looks like this:
|
|
|
|
|
|
```python
|
|
|
# connect
|
|
|
import elt.pymal
|
|
|
malfact = elt.pymal.loadMalForUri("zpb.rr://", {})
|
|
|
import ModStdif.Stdif.StdCmds
|
|
|
client = malfact.getClient("zpb.rr://127.0.0.1:12081/StdCmds",
|
|
|
ModStdif.Stdif.StdCmds.StdCmdsSync,
|
|
|
elt.pymal.rr.qos.DEFAULT, {})
|
|
|
# interact
|
|
|
print (client.GetState())
|
|
|
|
|
|
# disconnect
|
|
|
malfact.unregisterMal ("zpb")
|
|
|
```
|
|
|
|
|
|
**Side-note**: For what it's worth, this can be done as a one-liner:
|
|
|
```plaintext
|
|
|
python <<< 'import elt.pymal ; malfact = elt.pymal.loadMalForUri("zpb.rr://", {}) ; import ModStdif.Stdif.StdCmds ; client = malfact.getClient("zpb.rr://127.0.0.1:12081/StdCmds", ModStdif.Stdif.StdCmds.StdCmdsSync, elt.pymal.rr.qos.DEFAULT, {}) ; print (client.GetState()) ; malfact.unregisterMal ("zpb") '
|
|
|
```
|
|
|
... which would be equivalent to this msgsend call:
|
|
|
|
|
|
```
|
|
|
msgsend -t 60 -u zpb.rr://127.0.0.1:12081/StdCmds ::stdif::StdCmds::GetState
|
|
|
```
|
|
|
|
|
|
|
|
|
### ------------------------------------------------------
|
|
|
|
|
|
### \[OLDB\]
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### Less or More Logs \[OLDB Log\]
|
|
|
|
|
|
**Problem**
|
|
|
|
|
|
You are using the OLDB API, and you are getting more logs than you asked for. For example, simply testing for the existence of a datapoint: `[ERROR] Path not found config exception occurred: Path oldb/datapoints/myroot/somedp, version 1 not found`
|
|
|
|
|
|
or, the OLDB API seems to be misbehaving, and you want more logs.
|
|
|
|
|
|
**Solution**
|
|
|
|
|
|
Please read the article "[Adjust CII Log Levels \[Log\]](#user-content-adjust-cii-log-levels-log)" in the Log section of this Knowledge Base. The relevant logger names for the OLDB API are:
|
|
|
|
|
|
```plaintext
|
|
|
CiiOldb
|
|
|
CiiOldbDataPoint
|
|
|
CiiOldbDirectoryTreeProvider
|
|
|
CiiOldbFactory
|
|
|
CiiOldbRedisDataPointProvider
|
|
|
CiiOldbRemoteDataPointProvider
|
|
|
CiiOldbRemoteFileProvider
|
|
|
CiiRedisClient
|
|
|
ThreadSubscriberConsumer
|
|
|
config.CiiConfigClient
|
|
|
```
|
|
|
|
|
|
Example: to turn off ERROR logs for testing the existence of non-existing datapoints, use:
|
|
|
|
|
|
```plaintext
|
|
|
log4cplus.logger.CiiOldbRemoteDataPointProvider=FATAL
|
|
|
```
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### Cannot create Datapoints \[OLDB\]
|
|
|
|
|
|
**Problem**
|
|
|
|
|
|
Trying to create a datapoint, you get errors such as:
|
|
|
|
|
|
```plaintext
|
|
|
Cannot save to zpb.rr://ciiconfservicehost:9116/configuration/service/clientApi
|
|
|
|
|
|
[ERROR][CiiOldb] Unknown error occurred ::elt::error::icd::CiiSerializableException
|
|
|
```
|
|
|
|
|
|
This is often caused by low disk space (95% or more used) available for the oldb permanent store:
|
|
|
|
|
|
```plaintext
|
|
|
df -h /var/lib/elasticsearch
|
|
|
```
|
|
|
|
|
|
The disk space taken by the permanent store database itself is in the vast majority of cases dominated by the log records stored in it. The solutions aim at decreasing this space.
|
|
|
|
|
|
**Solution 1**
|
|
|
|
|
|
```plaintext
|
|
|
# Remove old log files:
|
|
|
find /var/log/elasticsearch -type f -mtime +30 -delete
|
|
|
|
|
|
# Put database back into read-write mode:
|
|
|
curl -X PUT -H "Content-Type: application/json" localhost:9200/_all/_settings -d '
|
|
|
{ "index.blocks.read_only_allow_delete": null }'
|
|
|
|
|
|
# Remove old log records:
|
|
|
curl -X POST "localhost:9200/cii_log_default_index/_delete_by_query?pretty" -H 'Content-Type: application/json' -d '
|
|
|
{ "query": { "range" : { "@timestamp" : { "lte": "now-30d/d" } } } }'
|
|
|
|
|
|
# Note: Removal is a background operation, and can take several minutes until it shows an effect.
|
|
|
# Run the below command repeatedly to monitor the removal, the docs.count should be decreasing.
|
|
|
|
|
|
# See number of logs stored in permanent store ("docs.count"):
|
|
|
curl http://localhost:9200/_cat/indices/cii_log_default_index?v\&s=store.size
|
|
|
```
|
|
|
|
|
|
**Solution 2 (brute-force)**
|
|
|
|
|
|
If you could not bring your disk usage below 95%, you can also remove all logs from the permanent store. In this case, you may also want to prevent permanent log storage in the future.
|
|
|
|
|
|
```plaintext
|
|
|
# Prevent storing logs in the permanent store:
|
|
|
sudo cii-services stop log
|
|
|
|
|
|
# Remove all log records:
|
|
|
curl -X DELETE "localhost:9200/cii_log_default_index?pretty"
|
|
|
|
|
|
# Put database back into read-write mode:
|
|
|
curl -X PUT -H "Content-Type: application/json" localhost:9200/_all/_settings -d '
|
|
|
{ "index.blocks.read_only_allow_delete": null }'
|
|
|
|
|
|
# Recreate empty log index
|
|
|
curl -X PUT "localhost:9200/cii_log_default_index?pretty"
|
|
|
```
|
|
|
|
|
|
After that you can restart logging:
|
|
|
|
|
|
```plaintext
|
|
|
sudo cii-services start log
|
|
|
```
|
|
|
|
|
|
**Background**
|
|
|
|
|
|
If disk usage is 95% or more, elasticsearch goes into read-only mode, and creating new datapoints is not possible any more. To remove old content from the database, it is first necessary to create some free space on the disk (since the database needs space to perform deletion operations), then unlock the database, and then remove unnecessary old content from it.
|
|
|
|
|
|
To check the read-only status:
|
|
|
```plaintext
|
|
|
# Check read-only status:
|
|
|
# If the output contains any "true" values, you are facing the problem.
|
|
|
curl -XGET -H "Content-Type: application/json" localhost:9200/_all/_settings/ | jq -r '.[][][]["blocks"]'
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### Exception while connecting to OLDB service
|
|
|
|
|
|
When starting an application using the OLDB API, the following exception is received:
|
|
|
|
|
|
```plaintext
|
|
|
<date/time>, ERROR, CiiOldbFactory/140635105143296, Unexpected config exception occurred while retrieving configuration for cii.config://remote/oldb/configurations/oldbClientConfig What:Path oldb/configurations/oldbClientConfig, version -1 not found
|
|
|
terminate called after throwing an instance of 'elt::oldb::CiiOldbException'
|
|
|
what(): Unexpected config exception occurred while retrieving configuration for cii.config://remote/oldb/configurations/oldbClientConfig What:Path oldb/configurations/oldbClientConfig, version -1 not found
|
|
|
```
|
|
|
|
|
|
**Solution** This can indicate that ElasticSearch on the config/oldb server is not running, or has crashed. Use the cii-services command to check the status on the server where the (cii-internal) config is running.
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### Connecting to OLDB takes long, then fails \[cpp OLDB\]
|
|
|
|
|
|
**Question** My application blocks for a long time on first OLDB access, and eventually fails with a timeout.
|
|
|
|
|
|
**Answer**
|
|
|
|
|
|
Reconfigure the communication timeout (default: 60 seconds)
|
|
|
|
|
|
a) through an environment variable and a properties file
|
|
|
|
|
|
```plaintext
|
|
|
$ cat <<EOF >/tmp/cii_client.ini
|
|
|
connection_timeout = 5
|
|
|
EOF
|
|
|
$ export CONFIG_CLIENT_INI=/tmp/cii_client.ini
|
|
|
```
|
|
|
|
|
|
b) programmatically (available since CII 2.0/DevEnv 3.9)
|
|
|
|
|
|
C++
|
|
|
```cpp
|
|
|
CiiClientConfiguration config_client_ini = { .connection_timeout = 5, };
|
|
|
elt::config::CiiConfigClient::SetDevClientConfig (config_client_ini);
|
|
|
```
|
|
|
|
|
|
Python
|
|
|
```python
|
|
|
import elt.config
|
|
|
config_client_ini = elt.config.CiiClientConfiguration()
|
|
|
config_client_ini.connection_timeout=5
|
|
|
elt.config.CiiConfigClient.set_dev_client_config(config_client_ini)
|
|
|
```
|
|
|
|
|
|
|
|
|
**Background**
|
|
|
|
|
|
The actual stalling comes from a failed MAL-communication with the CII Internal Configuration System, which likely is not running. Setting the timeout for the CiiConfigClient is therefore the thing to do. Note that the properties file (aka. "deployment config") takes precedence and will, if they overlap, overrule the programmatic (aka. "developer config") settings.
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### Mock OLDB for unit tests \[OLDB cpp python\]
|
|
|
|
|
|
**Question**
|
|
|
|
|
|
Is there already a fake OLDB that I can use in my unit tests in cpp?
|
|
|
|
|
|
**Answer**
|
|
|
|
|
|
(by D. Kumar)
|
|
|
|
|
|
You can create an in-memory OLDB by providing a cached config oldb implementation and using the local file system for blob data.
|
|
|
|
|
|
The oldb-client cpp module provides a
|
|
|
|
|
|
- pure (virtual = 0) interface elt::oldb::CiiOldbDataPointProvider<sup>\[1\]</sup>, and two implementations:
|
|
|
- in-memory data point provider storing data points in memory
|
|
|
|
|
|
(this is an empty implementation which provides a minimal operational fake oldb)
|
|
|
|
|
|
- a Redis data point provider storing data points in Redis.
|
|
|
- a remote filesystem interface elt::oldb::impl::ciiOldbRemoteFileProvider.hpp<sup>\[2\]</sup>, and two implementations:
|
|
|
- S3 implementation
|
|
|
- local file system implementation: ciiOldbLocalFileProvider<sup>\[3\]</sup> _\[Note: not before DevEnv 3.4\]_
|
|
|
|
|
|
Here are complete examples of unit tests showing the main use cases how to use oldb (with subscriptions) and metadata creation:
|
|
|
|
|
|
<https://gitlab.eso.org/cii/srv/cii-srv/-/blob/master/oldb-client/cpp/oldb/test/oldbInMemoryTest.cpp>
|
|
|
|
|
|
The same exists in python:
|
|
|
|
|
|
Example in <https://gitlab.eso.org/ahoffsta/cii-srv/-/blob/oldb-in-memory-missing-python-binding/oldb-client/python/oldb/test/oldbInMemoryTest.py>
|
|
|
|
|
|
References
|
|
|
|
|
|
\[1\] <https://gitlab.eso.org/cii/srv/cii-srv/-/blob/master/oldb-client/cpp/oldb/src/include/ciiOldbDataPointProvider.hpp>
|
|
|
|
|
|
\[2\] <https://gitlab.eso.org/cii/srv/cii-srv/-/blob/master/oldb-client/cpp/oldb/src/include/provider/ciiOldbRemoteFileProvider.hpp>
|
|
|
|
|
|
\[3\] <https://gitlab.eso.org/cii/srv/cii-srv/-/blob/master/oldb-client/cpp/oldb/src/provider/ciiOldbLocalFileProvider.hpp> _\[Note: not before DevEnv 3.4\]_
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### access_key empty (DevEnv 3.2.0) \[OLDB\]
|
|
|
|
|
|
**Problem**
|
|
|
|
|
|
Trying to use the OLDB on DevEnv 3.2.0, I'm getting this error:

```plaintext
Unexpected exception occurred. What:Configuration invalid: access_key empty
```
|
|
|
|
|
|
**Solution**
|
|
|
|
|
|
Run the following commands (you will be asked for the root pw):
|
|
|
|
|
|
```plaintext
|
|
|
wget -q www.eso.org/~mschilli/download/cii/postinstall/cii-postinstall-20210610
|
|
|
|
|
|
cii-services stop config
|
|
|
|
|
|
su -c "bash cii-postinstall-20210610 schemas"
|
|
|
|
|
|
# If the script succeeds, you should see the following output:
|
|
|
|
|
|
Password:
|
|
|
CII PostInstall (20210610)
|
|
|
schemas: applying fix ECII397
|
|
|
/home/eltdev/
|
|
|
schemas: populating elasticsearch
|
|
|
schemas: skipping telemetry
|
|
|
schemas: skipping alarms
|
|
|
|
|
|
cii-services start config
|
|
|
|
|
|
```
|
|
|
|
|
|
**Background**
|
|
|
|
|
|
The OLDB settings coming with 3.2.0 are buggy.
|
|
|
|
|
|
The CII post-install procedure is able to hotfix the settings (ECII397).
|
|
|
|
|
|
The problem will be fixed in DevEnv 3.4.
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### Datapoint already exists \[OLDB\]
|
|
|
|
|
|
**Problem**
|
|
|
|
|
|
My application tries to create an OLDB datapoint.
|
|
|
|
|
|
This fails because the datapoint "already exists":
|
|
|
|
|
|
```plaintext
|
|
|
ERROR, CiiOldbRedisDataPointProvider/140709706681216, Data point uri: cii.oldb:/tcs/hb/tempser3 in Redis already exists.
|
|
|
```
|
|
|
|
|
|
In response, my application skips the creation step, and wants to use the reportedly existing datapoint.
|
|
|
|
|
|
However, when doing this, I get the error "datapoint doesn't exist":
|
|
|
|
|
|
```plaintext
|
|
|
Dynamic exception type: elt::oldb::CiiOldbDpUndefinedException
|
|
|
std::exception::what: The data point cii.oldb:///tcs/hb/tempser3 with this name does not exist.
|
|
|
```
|
|
|
|
|
|
Likewise, when I run the oldb-gui database browser, it does not show this data point in the OLDB.
|
|
|
|
|
|
**Variant 2 of the Problem**
|
|
|
|
|
|
I try to access an OLDB datapoint, and I see two errors like this:
|
|
|
|
|
|
```plaintext
|
|
|
Target configuration does not exist: Failed to retrieve configuration from elastic search: Configuration
|
|
|
[…]
|
|
|
elt.oldb.exceptions.CiiOldbDpExistsException: Data point cii.oldb:/alarm/alarm/device/motor/input_int_dp_alarm already exisits.
|
|
|
```
|
|
|
|
|
|
Go directly to Solution 2 below.
|
|
|
|
|
|
**Variant 3 of the Problem**
|
|
|
|
|
|
I try to delete an OLDB datapoint and I see an error like this:
|
|
|
|
|
|
```plaintext
|
|
|
CiiOldbPyB.CiiOldbException: De-serialization error:sizeof(T)\*count is greater then remaining
|
|
|
```
|
|
|
|
|
|
Go directly to Solution 2 below.
|
|
|
|
|
|
**Background**
|
|
|
|
|
|
The two errors contradict each other.

Datapoints are stored in two databases: a document-database (permanent store) for the metadata, and a key-value-database (volatile store) for the current value. The above symptoms indicate that the two databases are out of sync, meaning the datapoint only "half" exists.
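To see which view your application actually gets, you can query the OLDB API directly. A sketch using the Python API described in "Command Line Tools and Python snippets \[OLDB\]" below (the datapoint URI is the one from the example above):

```python
import elt.oldb
from elt.config import Uri

# Ask the OLDB API whether the datapoint exists from its point of view
oldb = elt.oldb.CiiOldbFactory.get_instance()
print(oldb.data_point_exists(Uri("cii.oldb:/tcs/hb/tempser3")))
```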
|
|
|
|
|
|
**Solution 1**
|
|
|
|
|
|
With DevEnv 4, which contains [ECII-500](https://jira.eso.org/browse/ECII-500), you can probably delete the datapoint to clean up the situation:
|
|
|
|
|
|
```python
|
|
|
#!/usr/bin/env python
|
|
|
import elt.config
|
|
|
import elt.oldb
|
|
|
|
|
|
oldb_client = elt.oldb.CiiOldbFactory.get_instance()
|
|
|
elt.oldb.CiiOldbGlobal.set_write_enabled(True)
|
|
|
|
|
|
uri = elt.config.Uri("cii.oldb:/tcs/hb/tempser3")
|
|
|
oldb_client.delete_data_point(uri)
|
|
|
|
|
|
```
|
|
|
|
|
|
**Solution 2**
|
|
|
|
|
|
If the above didn't help, find out which "half" of the datapoint exists.
|
|
|
|
|
|
1. The current value exists, and the metadata is missing. This is the case when upgrading DevEnv/CII without deleting the Redis cache.
|
|
|
2. The metadata exists, and the current value is missing
|
|
|
|
|
|
Define the following shell functions (note: not applicable to redis-clusters):
|
|
|
|
|
|
```plaintext
|
|
|
function oldb_ela_list { curl -s -X GET localhost:9200/configuration_instance/_search?size=2000\&q=data.uri.value:\"$1\" | jq -r '.hits.hits[]._id' | sort ; }
|
|
|
|
|
|
function oldb_ela_del { curl -s -X POST localhost:9200/configuration_instance/_delete_by_query?q=data.uri.value:\"$1\" | jq -r '.deleted' ; }
|
|
|
|
|
|
function oldb_red_list { redis-cli --scan --pattern "*$1*" ; }
|
|
|
|
|
|
function oldb_red_del { redis-cli --scan --pattern "*$1*" | xargs redis-cli del ; }
|
|
|
```
|
|
|
|
|
|
Then check if the problematic key is in the volatile store:
|
|
|
|
|
|
```plaintext
|
|
|
# Search for path component of dp-uri (here: "device")
|
|
|
$ oldb_red_list device
|
|
|
... output will be e.g.:
|
|
|
/sampleroot/child/device/doubledp444
|
|
|
/sampleroot/child/device/doubledp445
|
|
|
/sampleroot/child/device/doubledp111
|
|
|
/sampleroot/child/device/doubledp2222
|
|
|
|
|
|
# If the problematic key is in the list, delete it:
|
|
|
$ oldb_red_del device/doubledp444
|
|
|
```
|
|
|
|
|
|
Otherwise, check if the problematic key is in the permanent store:
|
|
|
|
|
|
```plaintext
|
|
|
# Search for path component of dp-uri (whole-word search, e.g. "dev" would not match)
|
|
|
$ oldb_ela_list device
|
|
|
... output e.g.:
|
|
|
oldb___datapoints___sampleroot___child___device___doubledp446___1
|
|
|
|
|
|
# Delete the offending metadata
|
|
|
$ oldb_ela_del doubledp446
|
|
|
|
|
|
# After deletion, restart the internal config server
|
|
|
$ sudo cii-services stop config ; sudo cii-services start config
|
|
|
```
|
|
|
|
|
|
**Solution 3**
|
|
|
|
|
|
If none of the above helped, another possibility is to clean up the metadata.
|
|
|
|
|
|
WARNING: This is an invasive operation. It deletes all datapoints in the OLDB.
|
|
|
|
|
|
```plaintext
|
|
|
# Clean up the OLDB databases
|
|
|
config-initEs.sh
|
|
|
oldb-initEs
|
|
|
redis-cli flushall
|
|
|
sudo cii-services stop config
|
|
|
sudo cii-services start config
|
|
|
```
|
|
|
|
|
|
If you are dealing with a multi-user oldb ("role_groupserver", meaning it serves an OLDB to a team of developers), after executing the above commands you need to additionally execute (with privileges):
|
|
|
|
|
|
```plaintext
|
|
|
/elt/ciisrv/postinstall/cii-postinstall role_groupserver
|
|
|
```
|
|
|
|
|
|
If you have doubts, please contact us.
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### Command Line Tools and Python snippets \[OLDB\]
|
|
|
|
|
|
**Problem**
|
|
|
|
|
|
I need to inspect or modify the content of the OLDB from the command line or a shell script.
|
|
|
|
|
|
**Solution**
|
|
|
|
|
|
**cii-oldb-traversal-tool** for searching through the OLDB
|
|
|
|
|
|
```plaintext
|
|
|
$ cii-oldb-traversal-tool --file output --quality OK
|
|
|
$ cat output
|
|
|
cii.oldb:///root/trklsv/cfg/log/level|OK|WARNING|2020-09-11T15:25:08Z
|
|
|
cii.oldb:///root/trklsv/cfg/req/endpoint|OK|zpb.rr://localhost:44444/m1/TrkLsvServer|2020-09-11T15:25:08Z
|
|
|
cii.oldb:///root/trklsv/ctr/current/altaz/alt|OK|0.000000|2020-09-11T15:24:25Z
|
|
|
cii.oldb:///root/trklsv/ctr/current/altaz/az|OK|0.000000|2020-09-11T15:24:25Z
|
|
|
cii.oldb:///root/trklsv/ctr/current/radec/dec|OK|0.000000|2020-09-11T15:24:27Z
|
|
|
cii.oldb:///root/trklsv/ctr/current/radec/ra|OK|0.000000|2020-09-11T15:24:27Z
|
|
|
cii.oldb:///root/trklsv/ctr/poserr|OK|0.000000|2020-09-11T15:24:27Z
|
|
|
cii.oldb:///root/trklsv/ctr/status|OK|UNKNOWN|2020-09-11T15:23:55Z
|
|
|
cii.oldb:///root/trklsv/ctr/substate|OK|UNKNOWN|2020-09-11T15:23:49Z
|
|
|
cii.oldb:///root/trklsv/ctr/target/altaz/alt|OK|0.000000|2020-09-11T15:24:24Z
|
|
|
cii.oldb:///root/trklsv/ctr/target/altaz/az|OK|0.000000|2020-09-11T15:24:25Z
|
|
|
cii.oldb:///root/trklsv/ctr/target/radec/dec|OK|0.000000|2020-09-11T15:24:26Z
|
|
|
cii.oldb:///root/trklsv/ctr/target/radec/ra|OK|0.000000|2020-09-11T15:24:26Z
|
|
|
```
|
|
|
|
|
|
**oldb-cli** for reading, writing, subscribing to, creating, and deleting OLDB datapoints
|
|
|
|
|
|
```plaintext
|
|
|
$ oldb-cli read cii.oldb:///root/trklsv/cfg/req/endpoint
|
|
|
SLF4J: Class path contains multiple SLF4J bindings.
|
|
|
SLF4J: Found binding in [jar:file:/eelt/ciisrv/1.0-RC3-20201030/lib/srv-support-libs/slf4j-nop-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
|
|
|
SLF4J: Found binding in [jar:file:/eelt/mal/1.1.0-2.2.3-20201027/lib/mal-opcua/slf4j-nop-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
|
|
|
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
|
|
|
SLF4J: Actual binding is of type [org.slf4j.helpers.NOPLoggerFactory]
|
|
|
log4j:WARN No appenders could be found for logger (io.netty.util.internal.logging.InternalLoggerFactory).
|
|
|
log4j:WARN Please initialize the log4j system properly.
|
|
|
Timestamp: 2020-09-11T15:25:08.648Z
|
|
|
Quality: OK
|
|
|
Value: zpb.rr://localhost:44444/m1/TrkLsvServer
|
|
|
```
|
|
|
|
|
|
|
|
|
**oldbReset** for putting an OLDB back to its initial state by removing all custom metadata and all datapoints. The tool can only be run on the server that hosts the OLDB.
|
|
|
|
|
|
```plaintext
|
|
|
$ oldbReset
|
|
|
```
|
|
|
|
|
|
See "-h" for options. Note there is no command line option to bypass the mandatory interactive security question.
|
|
|
|
|
|
|
|
|
|
|
|
**oldb Python API** for creating, reading, writing, subscribing to, and deleting OLDB datapoints
|
|
|
|
|
|
```plaintext
|
|
|
$ python
|
|
|
Python 3.7.6 (default, Jan 8 2020, 19:59:22)
|
|
|
[GCC 7.3.0] :: Anaconda, Inc. on linux
|
|
|
Type "help", "copyright", "credits" or "license" for more information.
|
|
|
>>>
|
|
|
>>> import elt.oldb
|
|
|
>>> from elt.config import Uri
|
|
|
>>> oldb = elt.oldb.CiiOldbFactory.get_instance()
|
|
|
```
|
|
|
|
|
|
… and, to create a datapoint:
|
|
|
|
|
|
```plaintext
|
|
|
>>> elt.oldb.CiiOldbGlobal.set_write_enabled(True)
|
|
|
>>> if not oldb.data_point_exists(Uri("cii.oldb:/ccs/tst/tmp1")):
|
|
|
...     oldb.create_data_point_by_value(Uri("cii.oldb:/ccs/tst/tmp1"), "my text")
|
|
|
```
|
|
|
|
|
|
… and, to write a datapoint:
|
|
|
|
|
|
```plaintext
|
|
|
>>> elt.oldb.CiiOldbGlobal.set_write_enabled(True)
|
|
|
>>> oldb.get_data_point (Uri("cii.oldb:/ccs/tst/tmp1")).write_value("my text")
|
|
|
```
|
|
|
|
|
|
… and, to read a datapoint:
|
|
|
|
|
|
```plaintext
|
|
|
>>> oldb.get_data_point(Uri("cii.oldb:/ccs/tst/tmp1")).read_value().get_value()
|
|
|
'my text'
|
|
|
```
|
|
|
|
|
|
… and, to subscribe to a datapoint:
|
|
|
|
|
|
```plaintext
|
|
|
>>> class CB:
|
|
|
...     def new_value(self, value, uri):
|
|
|
...         print("value:", value.get_value())
|
|
|
>>> sub = elt.oldb.typesupport.STRING.get_new_subscription_instance(CB())
|
|
|
>>> oldb.get_data_point(Uri("cii.oldb:/ccs/tst/tmp1")).subscribe(sub)
|
|
|
```
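To see the subscription in action, write a new value to the datapoint; the callback defined above should then print it (an illustrative transcript, reusing only the calls shown earlier):

```plaintext
>>> elt.oldb.CiiOldbGlobal.set_write_enabled(True)
>>> oldb.get_data_point(Uri("cii.oldb:/ccs/tst/tmp1")).write_value("updated text")
value: updated text
```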
|
|
|
|
|
|
… and, to delete a datapoint:
|
|
|
|
|
|
```plaintext
|
|
|
>>> elt.oldb.CiiOldbGlobal.set_write_enabled(True)
|
|
|
>>> oldb.delete_data_point (Uri("cii.oldb:/ccs/tst/tmp1"))
|
|
|
```
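To verify the deletion, re-use the existence check from the create example above:

```plaintext
>>> oldb.data_point_exists(Uri("cii.oldb:/ccs/tst/tmp1"))
False
```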
|
|
|
|
|
|
### ------------------------------------------------------
|
|
|
|
|
|
### \[Log\]
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### Change Log Levels at Run-time \[Log\]
|
|
|
|
|
|
**Problem**
|
|
|
|
|
|
I want to modify the log levels of my application programmatically, without having to reload the full log configuration.
|
|
|
|
|
|
**Solution**
|
|
|
|
|
|
With [ECII-282](https://jira.eso.org/browse/ECII-282), the CiiLogManager was extended in all three languages to allow changing log levels dynamically.
|
|
|
|
|
|
**C++** added methods:
|
|
|
|
|
|
```plaintext
|
|
|
void elt::log::CiiLogManager::SetLogLevel(const std::string logger_name, log4cplus::LogLevel level)
|
|
|
void elt::log::CiiLogManager::SetLogLevel(log4cplus::Logger logger, log4cplus::LogLevel level)
|
|
|
```
|
|
|
|
|
|
**Java** added methods:
|
|
|
|
|
|
```plaintext
|
|
|
void elt.log.CiiLogManager.setLogLevel(
|
|
|
final String loggerName, final org.apache.logging.log4j.Level level);
|
|
|
void elt.log.CiiLogManager.setLogLevel(
|
|
|
org.apache.logging.log4j.Logger logger, final org.apache.logging.log4j.Level level)
|
|
|
```
|
|
|
|
|
|
**Python** added methods:
|
|
|
|
|
|
```plaintext
|
|
|
elt.log.CiiLogManager.set_log_level(name_or_logger: Union[str, logging.Logger], level: logging.Level)
|
|
|
```
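For example, from Python (a minimal sketch; the logger name "CiiOldb" is taken from the log config examples below, and the level constant from the standard logging module):

```plaintext
import logging
from elt.log import CiiLogManager

# Raise the OLDB client logger to DEBUG at run-time,
# without reloading the full log configuration.
CiiLogManager.set_log_level("CiiOldb", logging.DEBUG)
```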
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### Adjust CII Log Levels \[Log\]
|
|
|
|
|
|
With plain CII (no application frameworks on top), you define a log config file (myapp.logconfig):
|
|
|
|
|
|
```plaintext
|
|
|
log4cplus.rootLogger=INFO, STDOUT
|
|
|
log4cplus.appender.STDOUT=log4cplus::ConsoleAppender
|
|
|
log4cplus.appender.STDOUT.layout=elt::log::layout::CiiSimpleLayout
|
|
|
|
|
|
# other loggers, e.g. OLDB or MAL
|
|
|
log4cplus.logger.CiiOldb=FATAL
|
|
|
```
|
|
|
|
|
|
The name of the log config file is your choice, but to comply with the rules of "waf install", it is best to use a project structure like this:
|
|
|
|
|
|
```plaintext
|
|
|
myapp/
|
|
|
├── resource
|
|
|
│   └── config
|
|
|
│       └── myapp.logconfig
|
|
|
├── src
|
|
|
│   └── myapp.cpp
|
|
|
└── wscript
|
|
|
```
|
|
|
|
|
|
Then apply the log config from your application (myapp.cpp):
|
|
|
|
|
|
```plaintext
|
|
|
#include <ciiLogManager.hpp>
|
|
|
int main(int ac, char *av[]) {
|
|
|
::elt::log::CiiLogManager::Configure("resource/config/myapp.logconfig");
|
|
|
log4cplus::Logger root_logger = ::elt::log::CiiLogManager::GetLogger();
|
|
|
root_logger.log(log4cplus::INFO_LOG_LEVEL, "Message via root logger");
|
|
|
return 0;
|
|
|
}
|
|
|
```
|
|
|
|
|
|
Side-note: To configure the logging fully programmatically, without a file, you would do:
|
|
|
```plaintext
|
|
|
::elt::log::CiiLogManager::Configure({
|
|
|
{"log4cplus.appender.ConsoleAppender", "log4cplus::ConsoleAppender"},
|
|
|
{"log4cplus.appender.ConsoleAppender.layout", "elt::log::layout::CiiSimpleLayout"},
|
|
|
{"log4cplus.rootLogger", "FATAL, ConsoleAppender"},
|
|
|
});
|
|
|
```
|
|
|
|
|
|
Side-note 2: To configure the OLDB logging from Python, you would do:
|
|
|
```plaintext
|
|
|
from elt.log import CiiLogManager
|
|
|
|
|
|
logconf={
|
|
|
'log4cplus.appender.STDOUT': 'log4cplus::ConsoleAppender',
|
|
|
'log4cplus.appender.STDOUT.layout': 'elt::log::layout::CiiSimpleLayout',
|
|
|
'log4cplus.logger.CiiOldb': 'DEBUG, STDOUT',
|
|
|
'log4cplus.logger.CiiOldbDataPoint': 'DEBUG, STDOUT',
|
|
|
'log4cplus.logger.CiiOldbDirectoryTreeProvider': 'DEBUG, STDOUT',
|
|
|
'log4cplus.logger.CiiOldbFactory': 'TRACE, STDOUT',
|
|
|
'log4cplus.logger.CiiOldbRedisDataPointProvider': 'DEBUG, STDOUT',
|
|
|
'log4cplus.logger.CiiOldbRemoteDataPointProvider': 'DEBUG, STDOUT',
|
|
|
'log4cplus.logger.CiiOldbRemoteFileProvider': 'DEBUG, STDOUT',
|
|
|
'log4cplus.logger.CiiRedisClient': 'DEBUG, STDOUT',
|
|
|
'log4cplus.logger.ThreadSubscriberConsumer': 'DEBUG, STDOUT',
|
|
|
'log4cplus.logger.config.CiiConfigClient': 'TRACE, STDOUT',
|
|
|
}
|
|
|
CiiLogManager.configure(logconf, cpp_logging=True)
|
|
|
```
|
|
|
|
|
|
**List of Loggers** To learn which loggers are active in your application, temporarily add this snippet:
|
|
|
|
|
|
```plaintext
|
|
|
#include <ciiLogManager.hpp>
|
|
|
[...]
|
|
|
std::cout << "Current loggers (in addition to root logger):" << std::endl;
|
|
|
for (const log4cplus::Logger& logger : log4cplus::Logger::getCurrentLoggers()) {
    std::cout << logger.getName() << std::endl;
}
|
|
|
```
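For Python applications, the standard logging module (on which the CII Python logging is based, cf. the set_log_level signature above) can enumerate the active loggers similarly:

```plaintext
import logging

# Print the names of all loggers currently registered in the
# Python logging hierarchy (which loggers appear depends on the
# CII modules your application has imported).
for name in sorted(logging.root.manager.loggerDict):
    print(name)
```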
|
|
|
|
|
|
**Log Format** To use a different log format (which CII allows, but the Control System guidelines do not), you can modify the above config like this:
|
|
|
|
|
|
```plaintext
|
|
|
#log4cplus.appender.STDOUT.layout=elt::log::layout::CiiSimpleLayout
|
|
|
log4cplus.appender.STDOUT.layout=log4cplus::PatternLayout
|
|
|
log4cplus.appender.STDOUT.layout.ConversionPattern=[%-5p][%D{%Y/%m/%d %H:%M:%S:%q}][%-l][%t] %m%n
|
|
|
```
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### wscript packages for CII Log \[Log\]
|
|
|
|
|
|
**Problem**
|
|
|
|
|
|
I want to write a cxx application using CII Log, and no other CII services nor CII MAL. Which packages do I need in my wscript?
|
|
|
|
|
|
**Solution**
|
|
|
|
|
|
1) include this in the "use" clause of the wscript in the cxx program folder (see the sketch after this list):
|
|
|
```plaintext
|
|
|
client-api.elt-common.cpp.log
|
|
|
```
|
|
|
|
|
|
2) and put this into the higher-level project or package wscript:
|
|
|
```plaintext
|
|
|
def configure(cnf):
|
|
|
cnf.check_wdep(wdep_name="client-api.elt-common.cpp.log", uselib_store="client-api.elt-common.cpp.log")
|
|
|
```
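For step 1, the program declaration in the cxx folder's wscript might then look like this (a sketch only; it assumes the declare_cprogram helper from the DevEnv's wtools, adjust the target name to your project):

```plaintext
# src/wscript -- assumption: wtools' declare_cprogram helper is available
from wtools.module import declare_cprogram

# Build a C++ program that links against the CII Log client API only
declare_cprogram(target='myapp',
                 use='client-api.elt-common.cpp.log')
```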
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
|
|
|
### Too many files errors \[Log\]
|
|
|
|
|
|
If an application (for example a UI) fails with "too many files" or "too many open files" errors, check the /var/log/elt and $CII_LOGS folders. There might be too many files there, typically produced by the logging system.
|
|
|
|
|
|
In particular, there could be a number of 0-byte files.
|
|
|
|
|
|
Clean up the folders (as root) to solve the problem.
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
|
|
|
|
|
|
### ------------------------------------------------------
|
|
|
|
|
|
### \[Lang\]
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### Catching API Exceptions \[Lang Python\]
|
|
|
|
|
|
**Problem**
|
|
|
|
|
|
My application contains a call to the CII Python API.
|
|
|
|
|
|
When I ran it, it threw an exception with the following backtrace:
|
|
|
|
|
|
```plaintext
|
|
|
Top Level Unexpected exception:
|
|
|
Traceback (most recent call last):
|
|
|
File "/home/eltdev/MODULES/test/app.py", line 91, in instantiateDP
|
|
|
double_dp = self.oldb_client.create_data_point(uri, metadataInstName)
|
|
|
CiiOldbPyB.CiiOldbDpExistsException: The Data point cii.oldb:/root/test/xxxdp already exists.
|
|
|
```
|
|
|
|
|
|
Therefore, I added a corresponding try-catch around my call:
|
|
|
|
|
|
```plaintext
|
|
|
try:
|
|
|
...
|
|
|
except CiiOldbPyB.CiiOldbDpExistsException as e:
|
|
|
```
|
|
|
|
|
|
When I run it, the try-catch doesn't work.
|
|
|
|
|
|
Moreover, I now get two backtraces:
|
|
|
|
|
|
```plaintext
|
|
|
Top Level Unexpected exception:
|
|
|
Traceback (most recent call last):
|
|
|
File "/home/eltdev/MODULES/test/app.py", line 91, in instantiateDP
|
|
|
double_dp = self.oldb_client.create_data_point(uri, metadataInstName)
|
|
|
CiiOldbPyB.CiiOldbDpExistsException: The Data point cii.oldb:/root/test/xxxdp already exists.
|
|
|
|
|
|
During handling of the above exception, another exception occurred:
|
|
|
Traceback (most recent call last):
|
|
|
File "/home/eltdev/MODULES/test/app.py", line 108, in main
|
|
|
oldbCreator.instantiateOLDB_exception()
|
|
|
File "/home/eltdev/MODULES/test/app.py", line 81, in instantiateOLDB_exception
|
|
|
self.instantiateDP(double_dp_uri, double_dp_meta.get_instance_name())
|
|
|
File "/home/eltdev/MODULES/test/app.py", line 94, in instantiateDP
|
|
|
except CiiOldbPyB.CiiOldbDpExistsException:
|
|
|
NameError: name 'CiiOldbPyB' is not defined
|
|
|
```
|
|
|
|
|
|
**Solution**
|
|
|
|
|
|
You were misled by the first backtrace: the exception name shown in the backtrace is not what you should catch.
|
|
|
|
|
|
In your code, replace "**CiiOldbPyB**" with "**elt.oldb**":
|
|
|
|
|
|
```plaintext
|
|
|
try:
|
|
|
....
|
|
|
except elt.oldb.CiiOldbDpExistsException as e:
|
|
|
```
|
|
|
|
|
|
For completeness - do not forget this statement:
|
|
|
|
|
|
```plaintext
|
|
|
import elt.oldb
|
|
|
```
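Putting it together, a minimal sketch (datapoint URI and API calls reused from the OLDB examples above):

```plaintext
import elt.oldb
from elt.config import Uri

oldb = elt.oldb.CiiOldbFactory.get_instance()
elt.oldb.CiiOldbGlobal.set_write_enabled(True)
try:
    # Creating a datapoint that already exists raises the exception
    oldb.create_data_point_by_value(Uri("cii.oldb:/ccs/tst/tmp1"), "my text")
except elt.oldb.CiiOldbDpExistsException:
    # Catch it under its re-exported name, not as CiiOldbPyB.*
    print("datapoint already exists, continuing")
```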
|
|
|
|
|
|
**Background**
|
|
|
|
|
|
The CII Python API is mostly a binding to the CII C++ API.
|
|
|
|
|
|
The CiiOldbPyB.CiiOldbDpExistsException is the original binding class.
|
|
|
|
|
|
This binding class is re-exported under the name elt.oldb.CiiOldbDpExistsException.
|
|
|
|
|
|
The elt.oldb module internally loads the C++ binding module CiiOldbPyB. So both are the same exception.
|
|
|
|
|
|
Nonetheless, you should use the re-exported name, not the original name, in your application. We discourage the original name because the structure of the CiiOldbPyB module is more "chaotic" and not equivalent to elt.oldb.
|
|
|
|
|
|
Unfortunately, in the backtraces you will always see the original name instead of the re-exported name.
|
|
|
|
|
|
This question was originally asked in [ECII-422](https://jira.eso.org/browse/ECII-422).
|
|
|
|
|
|
### ------------------------------------------------------
|
|
|
|
|
|
### \[IntCfg\]
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### Elasticsearch disk usage and house-keeping \[IntCfg\]
|
|
|
|
|
|
The Elasticsearch database is used by many CII services, e.g. to store CII log messages, tracing data, and Oldb metadata. If you run a local elasticsearch database on your host (i.e. you have set up a "role_ownserver" during post-install), it is advisable to do some house-keeping on this database from time to time.
|
|
|
|
|
|
Some house-keeping is automated (e.g. ".monitoring-es" indices are automatically rolled over every few days); other tasks may be automated in the future, but currently are not. Until then, perform the tasks below at your own discretion.
|
|
|
|
|
|
**SOS - Disk Full** When disk usage (`df -h /var/lib/elastic`) goes above 95%, elasticsearch goes into read-only mode. You will see this reported in /var/log/messages and the elastic logs, and by getting exceptions from CII operations like *oldb.CreateDataPoint()*. This will prevent you from doing any clean-up operations on elasticsearch. First, bring disk usage below 95% (e.g. by removing elastic logs with `find /var/log/elasticsearch -type f -mtime +10 -delete`, or by temporarily moving some files from the full partition to another partition), then put elasticsearch back into read-write mode with this command:
|
|
|
`curl -XPUT -H "Content-Type: application/json" localhost:9200/_all/_settings -d '{ "index.blocks.read_only_allow_delete": null }'`. After this, you can proceed normally with the house-keeping operations described next.
|
|
|
|
|
|
|
|
|
|
|
|
1. Check which indices you have and how much memory they consume:
|
|
|
|
|
|
```plaintext
|
|
|
curl localhost:9200/_cat/indices/_all?v\&s=store.size
|
|
|
```
|
|
|
|
|
|
2. To delete diagnostic indices that are older than X days:
|
|
|
|
|
|
```plaintext
|
|
|
function ela_purge_idx {
    name=$1; age=$2
    limit=$(date -d "$age days ago" +"%Y%m%d")
    for a in $(curl -s localhost:9200/_aliases | jq -r 'keys | .[]'); do
        [[ $a == *$name* ]] && [[ "${a//[!0-9]/}" -lt $limit ]] && curl -X DELETE localhost:9200/$a
    done
}
|
|
|
|
|
|
ela_purge_idx jaeger 10 # delete *jaeger* indices older than 10 days
|
|
|
```
|
|
|
|
|
|
3. To delete CII log messages that are older than 30 days:
|
|
|
|
|
|
```plaintext
|
|
|
curl -X POST "localhost:9200/cii_log_default_index/_delete_by_query?pretty" -H 'Content-Type: application/json' -d' {"query": {"range" : {"@timestamp" : {"lte": "now-30d/d" } } } } '
|
|
|
```
|
|
|
|
|
|
4. Or brute-force, delete all CII log messages:
|
|
|
```plaintext
|
|
|
curl -X DELETE "localhost:9200/cii_log_default_index?pretty"
|
|
|
curl -X PUT "localhost:9200/cii_log_default_index?pretty"
|
|
|
```
|
|
|
|
|
|
5. To free your disk from elastic logs older than 10 days, do (as root):
|
|
|
|
|
|
```plaintext
|
|
|
find /var/log/elasticsearch -type f -mtime +10 -delete
|
|
|
```
|
|
|
|
|
|
Finally, if you do not need CII logs stored in elasticsearch (i.e. you don't use kibana), note that you can stop the log transport and log analysis engine. This way, the elasticsearch database will grow much more slowly.
|
|
|
```plaintext
|
|
|
sudo cii-services stop log
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
### config not found on remote db \[IntCfg\]
|
|
|
|
|
|
**Problem**
|
|
|
|
|
|
You are intending to read a config from the local config database ("localdb"), but you see an error message like this:
|
|
|
|
|
|
```plaintext
|
|
|
elt.config.exceptions.CiiConfigNoTcException: Target configuration does not exist: Failed to retrieve configuration from elastic search: Configuration cii.config://*/supervisoryapp/TrkLsvDeploy on the remote db was not found
|
|
|
at elt.config.client.ConfigRemoteDatabase.retrieveConfig(ConfigRemoteDatabase.java:191)
|
|
|
at elt.config.client.CiiConfigClient.retrieveConfig(CiiConfigClient.java:354)
|
|
|
at elt.config.client.CiiConfigClient.retrieveConfig(CiiConfigClient.java:310)
|
|
|
at trkLsv.DataContext.loadConfig(DataContext.java:324)
|
|
|
at trkLsv.DataContext.<init>(DataContext.java:190)
|
|
|
at trkLsv.TrkLsv.go(TrkLsv.java:72)
|
|
|
at trkLsv.TrkLsv.main(TrkLsv.java:41)
|
|
|
Caused by: elt.error.icd.CiiSerializableException
|
|
|
at elt.config.service.client.icd.zpb.ServiceClientApiInterfaceAsyncImpl.processRequest(ServiceClientApiInterfaceAsyncImpl.java:73)
|
|
|
at elt.mal.zpb.rr.ClientAsyncImpl.events(ClientAsyncImpl.java:261)
|
|
|
at org.zeromq.ZPoller.dispatch(ZPoller.java:537)
|
|
|
at org.zeromq.ZPoller.poll(ZPoller.java:488)
|
|
|
at org.zeromq.ZPoller.poll(ZPoller.java:461)
|
|
|
at elt.mal.zpb.ZpbMal.processThread(ZpbMal.java:459)
|
|
|
at elt.mal.zpb.ZpbMal.lambda$new$0(ZpbMal.java:119)
|
|
|
at java.lang.Thread.run(Thread.java:748)
|
|
|
```
|
|
|
|
|
|
**Solution A**
|
|
|
|
|
|
Your local file may be invalid (e.g. illegal format, or doesn't match the config class definition).
|
|
|
|
|
|
Look at the content of your local config database, e.g. with
|
|
|
|
|
|
```plaintext
|
|
|
$ find $INTROOT/localdb
|
|
|
```
|
|
|
|
|
|
and correct the file in place, or fix the source YAML and then redeploy it from source to the localdb.
|
|
|
|
|
|
**Background**
|
|
|
|
|
|
You may have a malformed JSON file in your local db, which the config service failed to read.
|
|
|
|
|
|
Because of the use of the location wildcard "\*" in your code (in "cii.config://\*/supervisoryapp/TrkLsvDeploy"),
|
|
|
|
|
|
the config service has consequently tried to load the config from the remote config database, where no such config exists.
|
|
|
|
|
|
The error message is therefore misleading, and should be improved (ticket [ECII-208](https://jira.eso.org/browse/ECII-208)).
|
|