Motivation
Jira data extracted via JqlTableauExtractService is currently written to Tableau hyper extracts. Unfortunately, Tableau’s support for delta-driven updates is very limited (it only works on datasets with a timeUpdated field). This has encouraged Analytics engineers to implement “grab-everything” jobs that do NOT have any time-based filter. As a result, these jobs grow in size and run time every day, with no upper bound, and Jira might start showing signs of (unnecessary) overload.
It would be better to reroute the Jira extract into a regular datamart in the DB, which offers many options for advanced delta-driven ETL handling.
Step 1: Use Einstein to cook up your NormalizedUrl (aka jqlLink)
Step 2: Create a corresponding table in the DB
Step 3: Register a new JqlTask in the “jqlTasks.conf” file
Step 4: Test your ETL in manual mode
Step 5: Prepare a delta-tracker
Run the following INSERT statement and commit. Make sure you plug your TASK_NAME into the appropriate place. This tracker will drive your ETL in automatic delta-driven mode. Pick the timestamp from which you want your ETL to start.
INSERT INTO cognos.etl_property VALUES('analytics.tiger.agents.JqlTask.YOUR_TASK_NAME_HERE','2000-Jan-01 00:00:00')
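To illustrate how such a tracker can drive a delta-driven run, here is a minimal Python sketch against an in-memory SQLite database. It is not the actual JqlTask implementation. The column names (name, value), the ISO timestamp format (chosen so that plain string comparison orders correctly), and the helpers read_watermark, advance_watermark and run_delta are all assumptions made for illustration. The real cognos.etl_property layout may differ.

```python
import sqlite3

# Hypothetical tracker name, following the pattern from the INSERT above.
TRACKER = "analytics.tiger.agents.JqlTask.YOUR_TASK_NAME_HERE"


def make_demo_db():
    """Build an in-memory stand-in for the tracker table (assumed layout)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE etl_property (name TEXT PRIMARY KEY, value TEXT)")
    conn.execute(
        "INSERT INTO etl_property VALUES (?, ?)",
        (TRACKER, "2000-01-01 00:00:00"),
    )
    conn.commit()
    return conn


def read_watermark(conn, tracker):
    """Return the timestamp of the last successfully processed change."""
    row = conn.execute(
        "SELECT value FROM etl_property WHERE name = ?", (tracker,)
    ).fetchone()
    if row is None:
        raise LookupError(f"no delta-tracker registered for {tracker}")
    return row[0]


def advance_watermark(conn, tracker, new_value):
    """Move the tracker forward and commit, so the next run starts from here."""
    conn.execute(
        "UPDATE etl_property SET value = ? WHERE name = ?", (new_value, tracker)
    )
    conn.commit()


def run_delta(conn, tracker, tickets):
    """Pick only tickets updated after the watermark, then advance it."""
    watermark = read_watermark(conn, tracker)
    changed = [t for t in tickets if t["updated"] > watermark]
    if changed:
        advance_watermark(conn, tracker, max(t["updated"] for t in changed))
    return changed


conn = make_demo_db()
tickets = [
    {"key": "ABC-1", "updated": "2024-03-01 10:00:00"},
    {"key": "ABC-2", "updated": "2024-03-02 09:30:00"},
]
first = run_delta(conn, TRACKER, tickets)   # both tickets are newer than the tracker
second = run_delta(conn, TRACKER, tickets)  # nothing changed since the first run
print(len(first), len(second))  # prints "2 0"
```

Note that the sketch uses a strict ">" comparison, so a ticket updated at exactly the watermark timestamp is skipped. A real job would have to decide whether to use ">=" and tolerate reprocessing the boundary rows.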
Step 6: Test your ETL in delta-driven mode
Step 7: Schedule a cronjob to reach full automation
HAPPY END
Some thoughts:
The so-called “JQL-explosion” (splitting a given field, say “SampleIDs”, into items and combining each item with the rest of the fields of the ticket) seems convenient, but it is very wasteful: all non-exploded fields are duplicated once per sample found on the ticket. This could lead to performance problems.
An alternative, normalized approach worth looking into is “having 2 ETL tasks”:
The 1st ETL would take care of all non-exploded fields. There would be no explosion (and therefore no duplication), and “key” would naturally be the PrimaryKey.
The 2nd ETL would include only 2 fields (key and Sample). This would produce a very lean table holding all (key → Sample) links.
Both ETLs could be placed in the same task in the jqlTasks.conf file.
These 2 tables would be equipped with indices/PKs, so JOINs should perform fast.
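The difference between the two layouts can be sketched in Python with an in-memory SQLite database. The table names (jira_ticket, jira_ticket_sample), the column names, the sample tickets, and the comma-separated format of the samples field are hypothetical and only serve the illustration. The actual Jira fields and DB tables may look different.

```python
import sqlite3

# Hypothetical tickets; "samples" stands in for a multi-valued field such as SampleIDs.
tickets = [
    {"key": "ABC-1", "summary": "Run batch A", "status": "Done", "samples": "S1,S2,S3"},
    {"key": "ABC-2", "summary": "Run batch B", "status": "Open", "samples": "S4"},
]


def explode(tickets):
    """Exploded layout: every non-exploded field is repeated once per sample."""
    return [
        (t["key"], t["summary"], t["status"], s)
        for t in tickets
        for s in t["samples"].split(",")
    ]


def normalize(tickets):
    """Normalized layout: one row per ticket plus a lean (key, sample) link table."""
    ticket_rows = [(t["key"], t["summary"], t["status"]) for t in tickets]
    sample_rows = [(t["key"], s) for t in tickets for s in t["samples"].split(",")]
    return ticket_rows, sample_rows


exploded = explode(tickets)
ticket_rows, sample_rows = normalize(tickets)

conn = sqlite3.connect(":memory:")
# 1st ETL target: the ticket key is the primary key, no duplication.
conn.execute(
    "CREATE TABLE jira_ticket ("
    "ticket_key TEXT PRIMARY KEY, summary TEXT, status TEXT)"
)
# 2nd ETL target: only the (key, sample) links, keyed on both columns.
conn.execute(
    "CREATE TABLE jira_ticket_sample ("
    "ticket_key TEXT REFERENCES jira_ticket(ticket_key), "
    "sample TEXT, PRIMARY KEY (ticket_key, sample))"
)
conn.executemany("INSERT INTO jira_ticket VALUES (?, ?, ?)", ticket_rows)
conn.executemany("INSERT INTO jira_ticket_sample VALUES (?, ?)", sample_rows)

# The exploded view can still be rebuilt on demand with a JOIN.
joined = conn.execute(
    "SELECT t.ticket_key, t.summary, t.status, s.sample "
    "FROM jira_ticket t JOIN jira_ticket_sample s ON s.ticket_key = t.ticket_key "
    "ORDER BY t.ticket_key, s.sample"
).fetchall()

print(len(exploded), len(ticket_rows), len(sample_rows))  # prints "4 2 4"
```

In this toy data the summary and status of ABC-1 are stored three times in the exploded layout but only once in the normalized one, and the JOIN returns the same rows as the explosion whenever they are actually needed.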