Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

Motivation

...

ANALYTICS.SAMPLE_ANALYTICS_WORKFLOW table (in SEQPROD db) is defined like this

Column nameData typeDescription
PDOVARCHAR2PDO
PDO_SAMPLEVARCHAR2PDO SAMPLE
DATA_TYPEVARCHAR2Exome, WGS
REQUEST_TYPENUMBER
"RequestFinalAggregation", "RequestTopOff", "RequestTopOffGroup", "RequestPoolCreated", "RequestSendToRework", "RequestHoldForTopOff", "RequestWaitingForNewRwData", "RequestReset", "RequestPCHoldForTopOff","RequestVolumeFinal"
1 encodes RequestFinalAggregation, 2 encodes RequestTopOff and so on
USER_NAMEVARCHAR2who triggered this action/transition
PARAMSVARCHAR2This is a free text where Tableau developers can store additional info re this transition which later can be used. Usually it's semicolon-separated-list-of-key=value-pairs
TIMESTAMPDATEwhen this action was triggered
IS_LATESTNUMBERBoolean (1 or 0), Is this the latest transition a given (PDO, PDO_SAMPLE, DATA_TYPE) has gone through. Every time new transition is recorded it comes with IS_LATEST=1 whereas previous latest transition is reset IS_LATEST=0
IP_ADDRESSVARCHAR2where (which IP) this request came from

...

  • requestType is the text-representation of this request (RequestFinalAggregation, RequestTopOff, etc)
  • each individual item is a quintuple representing a single sample like this pdo:pdo_sample:data_type:user_name:parameters (greens are optional, trailing colons are not required)

parameters represents any piece of data you might want to attach to a given sample/transition. It is a semicolon-separated-list-of-key-=value-pairs. Gets recorded in PARAMS column.

Authentication

There are 3 alternative ways to provide a "user_name"

  1. repeat it in the quintuple. Very wasteful.
  2. standalone upfront URL-parameter "&user_name=JohnDoe" (economical and solid). Unfortunately as of now, Tableau's urlAction can't inject the username like this.
  3. 2-step the cookie-dance (economical but might be tricky to use from within Tableau)
    1. record the user_name-cookie in the browser environment by calling this URL http://analytics:8090/api/setCookie?cookies=sampleAnalyticsWorkflow.user_name=JohnDoe (you only need to do this once for your entire session). Cookie/credentials are configured to expire after 10 minutes.
    2. and then make the main SAW call as many times as you need

URL-length limitations

...

Two sides players are involved in this issue

...

RequestFinalAggregation is the only request which also sends message to Pipeline JMS queue. This is done by SAW in 2 steps

...

  1. JOIN cognos.slxre_readgroup_metadata ON (pdo, pdo_sample, data_type) and lookup (research_project_id,collaborator_sample_id) for a given sample

...

  1. send JSON message to appropriate JMS queue

Environments

2 separate environments are provided

...


ProductionDevelopment/Test
tableANALYTICS.SAMPLE_ANALYTICS_WORKFLOW ANALYTICS.SAMPLE_ANALYTICS_WORKFLOW_TEST
web-servicehttp://analytics:8090/api/sampleAnalyticsWorkflowhttp://analytics:8090/api/sampleAnalyticsWorkflowTest
JMS queue

https://picard-jms.broadinstitute.org:8161/admin/browse.jsp?JMSDestination=broad.pushtocloud.enqueue

Pipeline is picking up messages from this queue.

https://picard-jms-dev.broadinstitute.org:8161/admin/browse.jsp?JMSDestination=broad.pushtocloud.analytics

No listener is registered on this queue.

JMS browsehttps://picard-jms.broadinstitute.org:8161/admin/browse.jsphttps://picard-jms-dev.broadinstitute.org:8161/admin/browse.jsp

...

Code Block
languagesql
titlePARAMS-extractEXTRACT_PARAM
SELECT 
	extract_param(a.params, 'source') source,
	extract_param(a.params, 'contamination') contamination,
	a.* 
FROM analytics.sample_analytics_workflow_test a 
ORDER BY a.TIMESTAMP desc

...

Tricks are all variations of "how to make The Item (sample description) as short as possible".

  1. take user_name out of items (see authentication above)
  2. take out take PDO-, SM- prefixes out of resp from . PDO, PDO_SAMPLE. SAW will automatically add put them back as needed.
  3. parameters which are pertinent to all samples (for example source=RQC) should be moved to standalone extra_info parameter. SAW will disperse it to PARAMS column of all samples involved.
  4. regarding parameters which are sample-specific (for example xShort=123;yLong=567)
    1. move all keys to standalone params_fields url-parameter as colon-separated-list-of-keys
    2. in the items/samples parameter leave only the values as semicolon-separated-list-of-values. SAW will pair back keys/values accordingly and insert it into PARAMS column.
  5. In item's Use single-letter encoding for data_type, encode Exome with E (resp.WGS with W): (E)xome, (W)GS.

Following example employs all the tricks (implies cookie-dance 1st step has been completed):

http://analytics:8090/api/sampleAnalyticsWorkflowTest?request_type=RequestFinalAggregation&extra_info=source=RQC&params_fields=zShort:xLong&items=16653:HP3LY:W::0.111;55,16653:HP3L9:W::0.222;66,16485:HE5F1:W::0.333;77,16485:HF32S:W::0.444;88&EOURL=1Image Removed

Image Added

SAW extensions

  1. Exome Autopush ETL agent (click for more details)

  2. Manual Push for Final Aggregation using pre-compiled list of samples

    1. In EtlShell: copy/paste comma-separated list of (pdo:pdo_samples:datatype) triplets (note that within each triplet elements are separated by colon)

      Code Block
      languagescala
      import analytics.tiger.agents.SAW
      CognosDB("analytics.tiger.agents.SAW.PushDev.Manual", SAW.manual("PDO-111:SM-111:E,PDO-222:SM-222:E") flatMap SAW.finalAggDev("nasko"))

      Make sure you provide appropriate user_name and Dev/Prod labels.

  3. Manual Push for Final Aggregation using SQL to get list of samples

    1. In EtlShell: load (pdo:pdo_samples:datatype) triplets as a single column via SQL

      Code Block
      languagescala
      import analytics.tiger.agents.SAW
      val sql = """SELECT m.pdo || ':' || m.pdo_sample || ':' || m.data_type data FROM cognos.rapidqc_agg m WHERE m.pdo='PDO-17350' AND m.pdo_sample IN ('SM-I3797','SM-I2NWY','SM-I371P','SM-I2P3C')"""
      CognosDB("analytics.tiger.agents.SAW.PushDev.viaSQL", SAW.viaSQL(sql) flatMap SAW.finalAggDev("nasko"))