Detecting Operational Anomalies

As a general rule, when performance is deteriorating or when your system is running out of memory, restarting the Data Server should be the last resort.

Check the indicators described below and attempt to remedy the anomalies using the available tools.
In general, first diagnose the client application using the tools provided in the client, to determine if the anomaly originates from the client. More tools are available in Web Admin to diagnose client applications, in particular with respect to database usage.

If the anomaly comes from a client application, you can stop that client from Web Admin, if needed.

If the anomaly does not come from the client application, you should then diagnose the servers and the engines using Web Admin.

 

The diagnostic tools help you detect operational anomalies in the following areas:

Client Performance
Memory
Risk Infrastructure
Database
Data Server Performance

 

1. Client Performance

Tips for maximizing the performance of client applications:

When loading trades, use trade filters with books specified as part of the selection criteria. Trades are cached by book in the Data Server and loading by book will cache the trades. This will result in much faster retrieval if someone else has already loaded the trade, or the next time you need the trade.
Task Station - Turn off the following options: Configure > Load Trades, Configure > Load Messages, Configure > Load Transfers. To load a trade, a message, or a transfer, double-click a task, and the corresponding trade, message, or transfer will be loaded individually.
For custom applications:
Use the local caches BOCache and LocalCache for retrieving static data, instead of accessing the Data Server directly.
Whenever possible, use bulk loading instead of loading items one at a time.
Consider implementing your own externalization instead of using Java’s default serialization.

Refer to the Calypso Developer’s Guide for details.
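
As an illustration of the externalization tip above, the following is a minimal sketch using the standard java.io.Externalizable interface. The TradeSummary class and its fields are hypothetical and are not part of the Calypso API; the sketch only shows the general technique of writing fields explicitly instead of relying on default serialization.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

// Hypothetical value object, for illustration only (not a Calypso class).
final class TradeSummary implements Externalizable {
    private long tradeId;
    private String book;
    private double notional;

    // Externalizable requires a public no-argument constructor.
    public TradeSummary() { }

    TradeSummary(long tradeId, String book, double notional) {
        this.tradeId = tradeId;
        this.book = book;
        this.notional = notional;
    }

    long getTradeId() { return tradeId; }
    String getBook() { return book; }
    double getNotional() { return notional; }

    // Write each field explicitly, in a fixed order.
    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeLong(tradeId);
        out.writeUTF(book); // assumes book is never null
        out.writeDouble(notional);
    }

    // Read the fields back in exactly the order they were written.
    @Override
    public void readExternal(ObjectInput in) throws IOException {
        tradeId = in.readLong();
        book = in.readUTF();
        notional = in.readDouble();
    }
}

final class ExternalizationDemo {
    static byte[] toBytes(Object value) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        }
        return buffer.toByteArray();
    }

    static Object fromBytes(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        TradeSummary original = new TradeSummary(42L, "BOOK_A", 1_000_000.0);
        TradeSummary copy = (TradeSummary) fromBytes(toBytes(original));
        System.out.println(copy.getTradeId() + " " + copy.getBook() + " " + copy.getNotional());
    }
}
```

Because the class controls its own wire format, any field added later must be added to both writeExternal and readExternal in the same position.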

To help diagnose certain issues, you can enable a tracer that monitors Calypso caches. The tracer is disabled by default. It is only enabled for a specific Calypso cache when the corresponding variable is set in the environment.

The conditions for the cache monitoring to be activated are: 

The environment variable <cache_name> + .TRACE is set to true.

For example, for BO Messaging, the environment variable is BOMessageSQL._cache.TRACE=true

Log category com.calypso.tk.util.cache.CalypsoCacheTracer is enabled.

The traces have the following format: 

For global operations (start cache, committing, commit, rollback): <cache_name>/<requested>: <operation> (txStartTX=unix_timestamp)
For cache item operations (put, remove, eviction, etc.): <cache_name>/<requested>: <Operation> item_key (version=x) (old_version=y)
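
When analyzing a log offline, trace lines of this shape can be split into their parts with a small parser. The sketch below assumes that lines follow the template literally: cache name, a slash, the requested part, a colon and a space, then free text followed by parenthesized key=value pairs. The class name CacheTraceLine is hypothetical, and the sample lines in main are built by hand from the template with made-up values; they are not actual tracer output.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for offline log analysis (not part of Calypso).
final class CacheTraceLine {
    // "<cache_name>/<requested>: <text> (key=value) (key=value)"
    private static final Pattern LINE = Pattern.compile("^([^/]+)/([^:]*): (.*)$");
    private static final Pattern ATTRIBUTE = Pattern.compile("\\((\\w+)=([^)]*)\\)");

    final String cacheName;
    final String requested;
    // The operation name, followed by the item key for item operations.
    final String action;
    final Map<String, String> attributes;

    private CacheTraceLine(String cacheName, String requested, String action,
                           Map<String, String> attributes) {
        this.cacheName = cacheName;
        this.requested = requested;
        this.action = action;
        this.attributes = attributes;
    }

    static CacheTraceLine parse(String line) {
        Matcher m = LINE.matcher(line.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a cache trace line: " + line);
        }
        String rest = m.group(3);
        Map<String, String> attributes = new LinkedHashMap<>();
        Matcher a = ATTRIBUTE.matcher(rest);
        int firstAttribute = -1;
        while (a.find()) {
            if (firstAttribute < 0) {
                firstAttribute = a.start();
            }
            attributes.put(a.group(1), a.group(2));
        }
        // Everything before the first "(key=value)" group is the action text.
        String action = (firstAttribute < 0 ? rest : rest.substring(0, firstAttribute)).trim();
        return new CacheTraceLine(m.group(1), m.group(2), action, attributes);
    }

    // Global operations carry txStartTX; item operations carry version information.
    boolean isGlobalOperation() {
        return attributes.containsKey("txStartTX");
    }

    public static void main(String[] args) {
        // Made-up sample values, constructed from the documented template.
        String[] samples = {
            "BOMessageSQL._cache/requested: commit (txStartTX=1700000000)",
            "BOMessageSQL._cache/requested: put 12345 (version=2) (old_version=1)"
        };
        for (String sample : samples) {
            CacheTraceLine parsed = parse(sample);
            System.out.println(parsed.cacheName + " | " + parsed.action + " | "
                    + parsed.attributes + " | global=" + parsed.isGlobalOperation());
        }
    }
}
```

The parser keeps the operation and item key together in one field because global operation names such as "start cache" contain spaces, so the two cannot be separated reliably on whitespace alone.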


 

2. Memory Usage

You can check the memory using the following tools:

Check the Data Server memory using Web Admin > Data Server > Server > Information.

See Data Server Web Admin for details.

Use Web Admin Alerts to monitor specific indicators.

See Alerts for details.

 

To free the memory, you can:

Call Garbage Collection multiple times using Utilities > Maintenance > Cache/Memory > Garbage Collection from the Calypso Navigator.
Check and release unused database connections using Web Admin > Data Server > Monitoring > SQL Statements.

See Data Server Web Admin for details. Call Garbage Collection after this operation.

Clear the caches, and lower the cache limits using Web Admin > Data Server > Metrics > Caches.

See Data Server Web Admin for details.

Allocate more memory to the application. The allocated memory for a given application is specified at startup.
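
For a custom Java application, heap usage can also be inspected from inside the process with the standard java.lang.Runtime API. The sketch below is a generic JVM illustration, not a Calypso tool; the HeapSnapshot class is hypothetical. On a standard JVM the maximum heap is typically fixed at startup with the -Xmx option; how a given Calypso launcher passes that option is not covered here.

```java
// Hypothetical helper that captures JVM heap figures at one point in time.
final class HeapSnapshot {
    final long usedBytes;
    final long committedBytes;
    final long maxBytes;

    private HeapSnapshot(long usedBytes, long committedBytes, long maxBytes) {
        this.usedBytes = usedBytes;
        this.committedBytes = committedBytes;
        this.maxBytes = maxBytes;
    }

    static HeapSnapshot take() {
        Runtime runtime = Runtime.getRuntime();
        long committed = runtime.totalMemory();
        // The two reads are not atomic, so clamp at zero to be safe.
        long used = Math.max(0L, committed - runtime.freeMemory());
        return new HeapSnapshot(used, committed, runtime.maxMemory());
    }

    // Fraction of the maximum heap currently in use.
    double usedFractionOfMax() {
        return (double) usedBytes / (double) maxBytes;
    }

    private static long toMegabytes(long bytes) {
        return bytes / (1024L * 1024L);
    }

    @Override
    public String toString() {
        return String.format("%d MB used, %d MB committed, %d MB max",
                toMegabytes(usedBytes), toMegabytes(committedBytes), toMegabytes(maxBytes));
    }

    public static void main(String[] args) {
        System.out.println("Before GC: " + take());
        // System.gc() is only a request; the JVM may ignore it.
        System.gc();
        System.out.println("After GC:  " + take());
    }
}
```

If usage stays close to the maximum even after garbage collection, that suggests the memory is genuinely in use, and clearing caches or allocating more memory is more likely to help than repeated collections.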

 

3. Risk Infrastructure

The Risk Server Web Admin provides monitoring capabilities for the Calculation Server and the Presentation Server.

See Risk Server Web Admin for details.

 

4. Database Performance

Check SQL Statements to identify queries that take too long to execute. You can monitor the SQL queries to diagnose the source of the anomaly using Web Admin > Data Server > Monitoring > SQL Statements.

A slow query can indicate that an index is missing in the database schema, that the query is improperly configured, or that you should archive unused data from the database tables. If none of those conditions applies, the cause may be a defective application, in which case you must kill the corresponding process.

See Data Server Web Admin for details.

 

5. Tips for Improving Performance

Performance can be impacted by unconsumed events. You can use Web Admin > Data Server > Metrics > Pending Events to detect unconsumed events.

 

In general, to improve the performance of the system you should perform the recommended maintenance routine. Calypso offers a number of scheduled tasks to archive and delete unused objects.

See Recommended Maintenance Routine for details.

 

The following environment properties can improve the performance of the Data Server:

AUDIT_PRICER_CONFIG — True or False. Set to true to enable Pricer Config Audit. Setting this to False results in no audit of Pricer Config modifications, and a speed up of Pricer Config saves. Default is true.
COMPRESS_FLOWS_IN_MEMORY — True or False. Set to true to save customized cashflows for Swap, Cap, Floor and Swaption trades in a compressed form in memory until they are used. Should be used in conjunction with SAVE_FLOWS_AS_BLOB. Default is true.
COMPRESS_RMI_PACKETS - True or False. Set to true to compress RMI packets sent and received, or False otherwise. Default is False.
DS_EVENT_BUFFER_POOL_MAX_SIZE — The events in the Event publisher queue are only published if the number of events does not exceed DS_EVENT_BUFFER_POOL_MAX_SIZE.

If the number of events exceeds DS_EVENT_BUFFER_POOL_MAX_SIZE, they are not published.

This feature prevents the Data Server from blocking the Event Server. Events are not published to the Event Server by waiting for a handshake. Instead, a pool is set up to store the events, and a separate thread is used to process the events.

This number varies from installation to installation, and some adjustment will be necessary as the system is deployed. The initial value should be 10,000.

This feature is also available at the command line using the -eventbuffersize <size> option.

JDBC_CACHE_STATEMENT — True or False. Set to true to use cache for Java PreparedStatements in Data Server connections. This saves time while building strings for PreparedStatements. Default is false.
KEEP_CURVE_AS_BLOB — True or False. Set to true to compress curves in memory. The in-memory compression happens after a curve is saved or loaded. Whenever a compressed curve is needed, it will be uncompressed. Default is true.
KEEP_VOLATILITY_AS_BLOB — True or False. Set to true to compress volatility surfaces in-memory. The in-memory compression happens after a surface is saved or loaded. Whenever a compressed surface is needed, it will be uncompressed. Overall, the compression results in less memory consumption. Default is true.
LEGAL_ENTITY_MAX_NAMES — The maximum number of legal entity names allowed per query. Since most users will not need to browse more than a fixed number of counterparties or legal entities at one time, setting this limit results in a faster load. Default is 1000.
MAX_DOCUMENTS_PER_USER — The maximum number of documents that a user can bulk load at once. If set, the number of documents returned by the Data Server is limited to this value. This prevents one large query from using an inordinate amount of memory in the Data Server during the load. Default is 100000.
MAX_MESSAGES_PER_USER — The maximum number of messages that a user can bulk load at once. If set, the number of messages returned by the Data Server is limited to this value. This prevents one large query from using an inordinate amount of memory in the Data Server during the load. Default is 100000.
MAX_TASKS_PER_USER — The maximum number of tasks that a user can bulk load at once. If set, the number of tasks returned by the Data Server is limited to this value. This prevents one large query from using an inordinate amount of memory in the Data Server during the load. Default is 500000.
MAX_TRADES_PER_USER — The maximum number of trades that a user can bulk load at once. If the limit is reached, a PersistenceException is thrown and the query is canceled. This prevents one large query from using an inordinate amount of memory in the Data Server during the load. Default is 0, no limit.
MAX_TRANSFERS_PER_USER — The maximum number of transfers that a user can bulk load at once. If set, the number of transfers returned by the Data Server will be limited to this value. This prevents one large query from using an inordinate amount of memory in the Data Server during the load. Default is 100000.
SAVE_AN_OUTPUT_AS_BLOB - True or False. Set to true to save Risk Analysis reports in a compressed form in the database. If you do not need to query your report results directly from the database, you should always save your reports in blob format for performance reasons. Saving a report this way can be faster by a factor of ten.
SAVE_FLOWS_AS_BLOB — True or False. Set to true to save and load customized cashflows of Swap, Cap, Floor and Swaption trades in a compressed form. Cashflows saved in the compressed form will not need to be compressed prior to the trade’s insertion into cache, therefore this option should always be used in conjunction with COMPRESS_FLOWS_IN_MEMORY. Default is false.
STORE_EVENT_TIMESTAMP - True or False. Set to true to store a timestamp with each event. This has a large impact on performance, so it is recommended to leave it set to False under normal circumstances. Default is false.
TASK_MIN_PRODUCT_INFO — True or False. Set to true to load minimum product information in Task Station, Task Selector, Payment Report, Advice Report, Posting Report. Only the keywords, fees and a small product image are loaded with the Tasks. It is only when you access trade details, by double-clicking in the respective GUI, that all the remaining information is loaded. This results in a much faster initial load. Default is true.
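
The behavior described for DS_EVENT_BUFFER_POOL_MAX_SIZE (a bounded pool of events drained by a separate thread, so that the producer never waits on the consumer) can be pictured with the following minimal sketch. It is a generic illustration built on java.util.concurrent, not Calypso's implementation; the BoundedEventBuffer class, its method names, and the use of strings as events are all assumptions. This section does not say what happens to events that exceed the limit beyond not being published, so the sketch simply counts them.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Consumer;

// Hypothetical illustration of a bounded event pool with a publisher thread.
final class BoundedEventBuffer implements AutoCloseable {
    private final BlockingQueue<String> pool;
    private final AtomicLong rejected = new AtomicLong();
    private final Thread publisher;
    private volatile boolean running = true;

    BoundedEventBuffer(int maxSize, Consumer<String> sink) {
        this.pool = new ArrayBlockingQueue<>(maxSize);
        // A separate thread drains the pool, so submitters never wait for the sink.
        this.publisher = new Thread(() -> {
            try {
                while (running || !pool.isEmpty()) {
                    String event = pool.poll(50, TimeUnit.MILLISECONDS);
                    if (event != null) {
                        sink.accept(event);
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "event-publisher");
        this.publisher.setDaemon(true);
        this.publisher.start();
    }

    // Never blocks: returns false when the pool already holds maxSize events.
    boolean submit(String event) {
        boolean accepted = pool.offer(event);
        if (!accepted) {
            rejected.incrementAndGet();
        }
        return accepted;
    }

    long rejectedCount() {
        return rejected.get();
    }

    // Stops accepting work and waits until the remaining events are drained.
    @Override
    public void close() throws InterruptedException {
        running = false;
        publisher.join();
    }

    public static void main(String[] args) throws InterruptedException {
        // 10,000 is the initial value suggested in the text above.
        try (BoundedEventBuffer buffer =
                 new BoundedEventBuffer(10_000, e -> System.out.println("published " + e))) {
            for (int i = 0; i < 5; i++) {
                buffer.submit("event-" + i);
            }
            System.out.println("rejected so far: " + buffer.rejectedCount());
        }
    }
}
```

A limit that is too small causes events to be rejected during bursts, while a very large one lets the pool consume more memory, which is one way to read the advice that the value needs adjusting per installation.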

 

A number of engine parameters can improve the performance of the engines.

Refer to Calypso Engine Parameters documentation for details.