Skip to end of metadata
Go to start of metadata

You are viewing an old version of this page. View the current version.

Compare with Current View Version History

« Previous Version 16 Next »

Motivation

There have been situations where end users need to review a sample, check relevant metrics and trigger a specific Action - for example "for a given sample, IF RapidQC has completed AND LOD>0 THEN submit "FinalAggregation" request in Pipeline's JMS queue". All of these actions to be done/initiated from within Tableau. Beyond visualization, Tableau can't do much else in terms of maintaining a "workflow" in DB, talking to external web-services like JMS and so on. That's why these job is being outsourced to our "SampleAnalyticsWorkflow" web-service - it's written in Scala so there is virtually no limit what tasks it can do for you. Tableau utilize it's UrlAction gadget to make a URL call to SAW service when appropriate.

How sample workflow is maintained in the DB

ANALYTICS.SAMPLE_ANALYTICS_WORKFLOW table (in SEQPROD db) is defined like this

Column nameData typeDescription
PDOVARCHAR2PDO
PDO_SAMPLEVARCHAR2PDO SAMPLE
DATA_TYPEVARCHAR2Exome, WGS
REQUEST_TYPENUMBER
"RequestFinalAggregation", "RequestTopOff", "RequestTopOffGroup", "RequestPoolCreated", "RequestSendToRework", "RequestHoldForTopOff", "RequestWaitingForNewRwData", "RequestReset", "RequestPCHoldForTopOff","RequestVolumeFinal"
1 encodes RequestFinalAggregation, 2 encodes RequestTopOff and so on
USER_NAMEVARCHAR2who triggered this action/transition
PARAMSVARCHAR2This is a free text where Tableau developers can store additional info re this transition which later can be used. Usually it's semicolon-separated-list-of-key=value-pairs
TIMESTAMPDATEwhen this action was triggered
IS_LATESTNUMBERBoolean (1 or 0), Is this the latest transition a given (PDO, PDO_SAMPLE, DATA_TYPE) has gone through. Every time new transition is recorded it comes with IS_LATEST=1 whereas previous latest transition is reset IS_LATEST=0
IP_ADDRESSVARCHAR2where (which IP) this request came from

The SAW web-service

In general, SAW looks like this

http://analytics:8090/api/sampleAnalyticsWorkflow?request_type={requestType}&items={comma-separated-list-of-items}&EOURL=1

where 

  • requestType is the text-representation of this request (RequestFinalAggregation, RequestTopOff, etc)
  • each individual item is a quintuple representing a single sample like this pdo:pdo_sample:data_type:user_name:parameters (greens are optional, trailing colons are not required)

parameters represents any piece of data you might want to attach to a given sample/transition. It is a semicolon-separated-list-of-key=value-pairs. Gets recorded in PARAMS column.

Authentication

There are 3 alternative ways to provide a "user_name"

  1. repeat it in the quintuple. Very wasteful.
  2. standalone upfront URL-parameter "&user_name=JohnDoe" (economical and solid). Unfortunately as of now, Tableau's urlAction can't inject the username like this.
  3. 2-step cookie-dance (economical but might be tricky to use from within Tableau)
    1. record the user_name-cookie in the browser environment by calling this URL http://analytics:8090/api/setCookie?cookies=sampleAnalyticsWorkflow.user_name=JohnDoe (you only need to do this once for your entire session). Cookie/credentials are configured to expire after 10 minutes.
    2. and then make the main SAW call as many times as you need

URL-length limitations

Two players are involved in this issue

Your environment (browser, OS, versions, etc) implicitly impose a limit on how long a URL can be. On top of that, our web-server provider (Akka-http) also impose limits on what URL-length is acceptable. (Currently akka-http limit is configured to 8192 chars and can possibly be extended further but this should be done rather coservatively since it might lead to more memory consumption). So, best way to answer the question "What URL-size is safe to pass" is to try/test and see. Fortunately, if any truncation happens upstream the last EOURL parameter would definitely not make it and SAW-service can safely detect and report accordingly.

Special provisions for RequestFinalAggregation request

All SAW requests insert record into SAMPLE_ANALYTICS_WORKFLOW table.

RequestFinalAggregation is the only request which also sends message to Pipeline JMS queue. This is done by SAW in 2 steps

  1. JOIN cognos.slxre_readgroup_metadata ON (pdo, pdo_sample, data_type) and lookup (research_project_id,collaborator_sample_id) for a given sample
  2. send JSON message to appropriate JMS queue

Environments

2 separate environments are provided


ProductionDevelopment/Test
tableANALYTICS.SAMPLE_ANALYTICS_WORKFLOW ANALYTICS.SAMPLE_ANALYTICS_WORKFLOW_TEST
web-servicehttp://analytics:8090/api/sampleAnalyticsWorkflowhttp://analytics:8090/api/sampleAnalyticsWorkflowTest
JMS queue

https://picard-jms.broadinstitute.org:8161/admin/browse.jsp?JMSDestination=broad.pushtocloud.enqueue

Pipeline is picking up messages from this queue.

https://picard-jms-dev.broadinstitute.org:8161/admin/browse.jsp?JMSDestination=broad.pushtocloud.analytics

No listener is registered on this queue.

JMS browsehttps://picard-jms.broadinstitute.org:8161/admin/browse.jsphttps://picard-jms-dev.broadinstitute.org:8161/admin/browse.jsp

Security

SAW is not very secure at this time. Everyone who knows the URL can call the service. IP-based restrictions (relatively easy to implement) can vastly reduce the number of machines allowed to access SAW.

EXTRACT_PARAM utility

PARAMS column may become very crowded and extraction of specific field would be tricky. There is a special custom function in Oracle which streamline this task. It allows you to ask for specific fields by "key" (not by position which is fragile)

EXTRACT_PARAM
SELECT 
	extract_param(a.params, 'source') source,
	extract_param(a.params, 'contamination') contamination,
	a.* 
FROM analytics.sample_analytics_workflow_test a 
ORDER BY a.TIMESTAMP desc

Tricks to maximize the number of samples submitted

Tricks are all variations of "how to make The Item (sample description) as short as possible".

  1. take user_name out of items (see authentication above)
  2. take PDO-, SM- prefixes out of resp. PDO, PDO_SAMPLE. SAW will automatically put them back as needed.
  3. parameters which are pertinent to all samples (for example source=RQC) should be moved to standalone extra_info parameter. SAW will disperse it to PARAMS column of all samples involved.
  4. regarding parameters which are sample-specific (for example xShort=123;yLong=567)
    1. move all keys to standalone params_fields url-parameter as colon-separated-list-of-keys
    2. in the items/samples parameter leave only the values as semicolon-separated-list-of-values. SAW will pair back keys/values accordingly and insert it into PARAMS column.
  5. Use single-letter encoding for data_type: (E)xome, (W)GS.

Following example employs all the tricks (implies cookie-dance 1st step has been completed):

http://analytics:8090/api/sampleAnalyticsWorkflowTest?request_type=RequestFinalAggregation&extra_info=source=RQC&params_fields=zShort:xLong&items=16653:HP3LY:W::0.111;55,16653:HP3L9:W::0.222;66,16485:HE5F1:W::0.333;77,16485:HF32S:W::0.444;88&EOURL=1

  • No labels