ibm-ia-rest v0.3.0
README
ibm-ia-rest
Re-usable functions for interacting with Information Analyzer's REST API
Examples
// runs column analysis for any objects in Automated Profiling that have not been analyzed since the moment the script is run (new Date())
var iarest = require('ibm-ia-rest');
var commons = require('ibm-iis-commons');
var restConnect = new commons.RestConnection("isadmin", "isadmin", "hostname", "9445");
iarest.setConnection(restConnect);
iarest.getStaleAnalysisResults("Automated Profiling", new Date(), function(errStale, projectRID, aStaleSources) {
  iarest.runColumnAnalysisForDataSources(projectRID, aStaleSources, function(errExec, tamsAnalyzed) {
    // Note that the API returns async; if you want to busy-wait you need to poll events on Kafka
  });
});
Meta
- license: Apache-2.0
setConnection
Set the connection for the REST API
Parameters
- restConnect (RestConnection): RestConnection object, from ibm-iis-commons
makeRequest
Make a request against IA's REST API
Parameters
- method (string): type of request, one of 'GET', 'PUT', 'POST', 'DELETE'
- path (string): the path to the end-point (e.g. /ibm/iis/ia/api/...)
- input (string?): any input for the request, i.e. for PUT, POST
- inputType (string?): the type of input, if any is provided: 'text/xml', 'application/json'
- callback (requestCallback): callback that handles the response
- Throws any: an error is thrown if connectivity details are incomplete or there is a fatal error during the request
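Since the callbacks documented here take the error first and the result second, a function such as makeRequest can be wrapped in a Promise. The following is a minimal sketch using Node's util.promisify; `fakeMakeRequest` is a hypothetical stand-in (not part of ibm-ia-rest) so that the snippet runs without a server, and the response it builds is invented for illustration.

```javascript
const util = require('util');

// Hypothetical stand-in for iarest.makeRequest so this sketch runs without a server;
// it only mimics the documented parameter order (method, path, input, inputType, callback).
function fakeMakeRequest(method, path, input, inputType, callback) {
  const allowed = ['GET', 'PUT', 'POST', 'DELETE'];
  if (allowed.indexOf(method) === -1) {
    callback('Unsupported method: ' + method, null);
    return;
  }
  // Invented response body, purely for illustration
  callback(null, '<response method="' + method + '" path="' + path + '"/>');
}

// util.promisify converts an error-first callback function into one that returns a Promise
const makeRequestAsync = util.promisify(fakeMakeRequest);

const demo = makeRequestAsync('GET', '/ibm/iis/ia/api/...', null, null)
  .then(function (responseXML) {
    console.log(responseXML);
    return responseXML;
  });
```

With the real module, the same wrapper would presumably be applied to iarest.makeRequest after calling setConnection; a non-null errorMessage then surfaces as a rejected Promise.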
getAllItemsToIgnore
Retrieves a list of all items that should be ignored, i.e. where they are labelled with "Information Analyzer Ignore List"
Parameters
- callback (itemsToIgnoreCallback)
addIADBToIgnoreList
Adds the IADB schema to a list of objects for Information Analyzer to ignore (to prevent them being added to projects or being analysed); this is accomplished by creating a label 'Information Analyzer Ignore List'
Parameters
- callback (requestCallback): callback that handles the response
createOrUpdateAnalysisProject
Create or update an analysis project, to include ALL objects known to IGC that were updated after the date received -- necessary before any tasks can be executed
Parameters
- name (string): name of the project
- description (string): description of the project
- updatedAfter (Date?): include in the project any objects in IGC last updated after this date
- callback (requestCallback): callback that handles the response
getProjectList
Get a list of Information Analyzer projects
Parameters
- callback (listCallback): callback that handles the response
getProjectDataSourceList
Get a list of all of the data sources in the specified Information Analyzer project
Parameters
- projectName (string)
- callback (dataSourceListCallback): callback that handles the response (will be entries with HOST||DB.SCHEMA.TABLE and HOST||PATH:FILE)
runColumnAnalysisForDataSources
Run a full column analysis against the list of data sources specified (based on TAM RIDs)
Parameters
- projectRID (string): the RID of the project in which to execute the analysis
- aDataSources (Array<Object>): an array of data sources, as returned by getProjectDataSourceList
- callback (columnAnalysisCallback): callback that handles the response
publishResultsForDataSources
Publish analysis results for the list of data sources specified
Parameters
- projectRID (string): RID of the IA project
- aTAMs (Array<string>): an array of TAM RIDs whose analysis should be published
- callback (requestCallback): callback that handles the response
getStaleAnalysisResults
Retrieve previously published analysis results and report which data sources are stale, i.e. whose results predate the time provided
Parameters
- projectName (string): name of the IA project
- timeToConsiderStale (Date): the time before which any analysis results should be considered stale
- callback (staleAnalysisCallback): callback that handles the response
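The timeToConsiderStale argument is an ordinary Date, so a cutoff such as "anything not analyzed in the last week" can be computed before the call. A minimal sketch follows; the helper name `daysAgo` is hypothetical and not part of ibm-ia-rest.

```javascript
// Hypothetical helper: returns the Date that lies `days` days before `now`
// (`now` defaults to the current time and is a parameter only to keep the helper testable).
function daysAgo(days, now) {
  const base = now || new Date();
  const millisPerDay = 24 * 60 * 60 * 1000;
  return new Date(base.getTime() - days * millisPerDay);
}

const cutoff = daysAgo(7);
console.log('Treating results older than ' + cutoff.toISOString() + ' as stale');
```

The resulting `cutoff` could then be passed as the second argument, for example iarest.getStaleAnalysisResults("Automated Profiling", cutoff, callback).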
reindexThinClient
Issues a request to reindex Solr so that any results appear correctly in the IA Thin Client
Parameters
- batchSize (int): The batch size to retrieve information from the database. Increasing this size may improve performance but there is a possibility of reindex failure. The default is 25. The maximum value is 1000.
- solrBatchSize (int): The batch size to use for Solr indexing. Increasing this size may improve performance. The default is 100. The maximum value is 1000.
- upgrade (boolean): Specifies whether to upgrade the index schema from a previous version, and is a one time requirement when upgrading from one version of the thin client to another. The schema upgrade can be used to upgrade from any previous version of the thin client. The value true will upgrade the index schema. The value false is the default, and will not upgrade the index schema.
- force (boolean): Specifies whether to force reindexing if indexing is already in process. The value true will force a reindex even if indexing is in process. The value false is the default, and prevents a reindex if indexing is already in progress. This option should be used if a previous reindex request is aborted for any reason. For example, if InfoSphere Information Server services tier system went offline, you would use this option.
- callback (reindexCallback): status of the reindex "REINDEX_SUCCESSFUL"
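The defaults and maximums described above can be checked before a reindex is requested. This is a minimal sketch: the helper `buildReindexOptions` is hypothetical, and the lower bound of 1 is an assumption, since only the defaults (25 and 100) and the maximum (1000) are documented.

```javascript
// Hypothetical helper that applies the documented defaults and upper limits.
// Assumption: batch sizes must be integers of at least 1 (the lower bound is not documented).
function buildReindexOptions(options) {
  const opts = options || {};

  function boundedInt(value, defaultValue, name) {
    if (value === undefined) {
      return defaultValue;
    }
    if (!Number.isInteger(value) || value < 1 || value > 1000) {
      throw new RangeError(name + ' must be an integer between 1 and 1000');
    }
    return value;
  }

  return {
    batchSize: boundedInt(opts.batchSize, 25, 'batchSize'),             // documented default: 25
    solrBatchSize: boundedInt(opts.solrBatchSize, 100, 'solrBatchSize'), // documented default: 100
    upgrade: opts.upgrade === true,                                     // documented default: false
    force: opts.force === true                                          // documented default: false
  };
}

const reindexOptions = buildReindexOptions({ solrBatchSize: 500, force: true });
console.log(reindexOptions);
```

Following the documented parameter order, the values would then be passed as iarest.reindexThinClient(reindexOptions.batchSize, reindexOptions.solrBatchSize, reindexOptions.upgrade, reindexOptions.force, callback).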
getRuleExecutionFailedRecordsFromLastRun
Retrieves a listing of any records that failed a particular Data Rule or Data Rule Set (its latest execution)
Parameters
- projectName (string): The name of the Information Analyzer project in which the Data Rule or Data Rule Set exists
- ruleOrSetName (string): The name of the Data Rule or Data Rule Set
- numRows (int): The maximum number of rows to retrieve (if unspecified will default to 100)
- callback (recordsCallback): the records that failed
getRuleExecutionResults
Retrieves the statistics of the executions of a particular Data Rule or Data Rule Set
Parameters
- projectName (string): The name of the Information Analyzer project in which the Data Rule or Data Rule Set exists
- ruleOrSetName (string): The name of the Data Rule or Data Rule Set
- bLatestOnly (boolean): If true, returns only the statistics from the latest execution (otherwise full history)
- callback (statsCallback): the statistics of the historical execution(s)
listCallback
This callback is invoked as the result of an IA REST API call, providing the response of that request.
Type: Function
Parameters
- errorMessage (string): any error message, or null if no errors
- aResponse (Array<string>): the response of the request, in the form of an array
requestCallback
This callback is invoked as the result of an IA REST API call, providing the response of that request.
Type: Function
Parameters
- errorMessage (string): any error message, or null if no errors
- responseXML (string): the XML of the response
itemsToIgnoreCallback
This callback is invoked as the result of retrieving a list of items that Information Analyzer should ignore
Type: Function
Parameters
- errorMessage (string): any error message, or null if no errors
- typeToIdentities (Object): dictionary keyed by object type, with each value being an array of objects of that type to ignore (as identity strings, /-delimited)
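The typeToIdentities dictionary lends itself to a simple membership check. The sketch below is illustrative only: the helper `isIgnored` is hypothetical, and the type key and identity strings in the sample are made up, since the actual key names are not listed here.

```javascript
// Hypothetical helper: true if `identity` appears in the ignore list for `type`.
function isIgnored(typeToIdentities, type, identity) {
  if (!Object.prototype.hasOwnProperty.call(typeToIdentities, type)) {
    return false;
  }
  return typeToIdentities[type].indexOf(identity) !== -1;
}

// Made-up sample in the documented shape (object type -> array of /-delimited identities)
const sampleIgnoreList = {
  database_schema: ['HOSTA/IADB/IAUSER']
};

console.log(isIgnored(sampleIgnoreList, 'database_schema', 'HOSTA/IADB/IAUSER'));
console.log(isIgnored(sampleIgnoreList, 'database_schema', 'HOSTA/SALES/PUBLIC'));
```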
statsCallback
This callback is invoked as the result of an IA REST API call to retrieve historical statistics on Data Rule executions
Type: Function
Parameters
- errorMessage (string): any error message, or null if no errors
- stats (Array<Object>): an array of stats, each stat being a JSON object with ???
recordsCallback
This callback is invoked as the result of an IA REST API call to retrieve records that failed Data Rules
Type: Function
Parameters
- errorMessage (string): any error message, or null if no errors
- records (Array<Object>): an array of records, each record being a JSON object keyed by column name and with the value of the column for that row
- columnMap (Object): key-value pairs mapping column names to their context (e.g. full identity in the case of database columns like RecordPK)
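One way to use records and columnMap together is to re-key each failed record by the context of its columns. This is a minimal sketch under assumptions: the helper `labelRecords` is hypothetical, and the sample values (including the form of the identity string) are invented.

```javascript
// Hypothetical helper: re-keys each record by the column's context from columnMap,
// falling back to the plain column name when no context is available.
function labelRecords(records, columnMap) {
  return records.map(function (record) {
    const labelled = {};
    Object.keys(record).forEach(function (columnName) {
      const hasContext = Object.prototype.hasOwnProperty.call(columnMap, columnName);
      const key = hasContext ? columnMap[columnName] : columnName;
      labelled[key] = record[columnName];
    });
    return labelled;
  });
}

// Made-up sample data in the documented shape
const sampleRecords = [{ RecordPK: '42', STATUS: 'X' }];
const sampleColumnMap = { RecordPK: 'HOSTA/SALES/PUBLIC/ORDERS/ID' };

console.log(labelRecords(sampleRecords, sampleColumnMap));
```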
reindexCallback
This callback is invoked as the result of an IA REST API call to re-index Solr for IATC
Type: Function
Parameters
- errorMessage (string): any error message, or null if no errors
- status (string): the status of the reindex operation "REINDEX_SUCCESSFUL"
statusCallback
This callback is invoked as the result of an IA REST API call, providing the response of that request.
Type: Function
Parameters
- errorMessage (string): any error message, or null if no errors
- status (Object): the response of the request, in the form of an object keyed by execution ID, with subkeys for executionTime, progress and status "running", "successful", "failed", "cancelled"
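The status object described above can be grouped by state to decide whether anything is still running. A minimal sketch follows; the helper `summariseExecutions` is hypothetical, and the execution IDs, executionTime and progress values in the sample are invented, since their exact formats are not documented here.

```javascript
// Hypothetical helper: groups execution IDs by their documented status value.
function summariseExecutions(status) {
  const summary = { running: [], successful: [], failed: [], cancelled: [], unknown: [] };
  Object.keys(status).forEach(function (executionId) {
    const state = status[executionId].status;
    if (Object.prototype.hasOwnProperty.call(summary, state) && state !== 'unknown') {
      summary[state].push(executionId);
    } else {
      // Any value outside the documented four is kept separately rather than guessed at
      summary.unknown.push(executionId);
    }
  });
  summary.allFinished = summary.running.length === 0 && summary.unknown.length === 0;
  return summary;
}

// Made-up sample in the documented shape (keyed by execution ID)
const sampleStatus = {
  'exec-1': { executionTime: 120, progress: 100, status: 'successful' },
  'exec-2': { executionTime: 30, progress: 40, status: 'running' }
};

console.log(summariseExecutions(sampleStatus));
```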
columnAnalysisCallback
This callback is invoked as the result of an IA REST API call to execute column analysis.
Type: Function
Parameters
- errorMessage (string): any error message, or null if no errors
- tamsToSources (Object): a dictionary of TAM RIDs to data sources
- schedule (Object): an object containing 'scheduleRids', which is an array of scheduler execution IDs
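Because tamsToSources is keyed by TAM RID, its keys have the shape that publishResultsForDataSources expects for aTAMs. The sketch below shows that hand-over under assumptions: `planPublish` is a hypothetical helper, and the RIDs, execution IDs and data source values in the sample are invented.

```javascript
// Hypothetical helper: derives the TAM RIDs to publish and the scheduler executions
// that would need to finish first, from the two columnAnalysisCallback arguments.
function planPublish(tamsToSources, schedule) {
  const scheduleRids = (schedule && Array.isArray(schedule.scheduleRids)) ? schedule.scheduleRids : [];
  return {
    aTAMs: Object.keys(tamsToSources),
    executionsToAwait: scheduleRids.slice()
  };
}

// Made-up sample in the documented shape
const sampleTamsToSources = {
  'tam-rid-1': 'HOSTA||SALES.PUBLIC.ORDERS',
  'tam-rid-2': 'HOSTA||/data/in:orders.csv'
};
const sampleSchedule = { scheduleRids: ['exec-1', 'exec-2'] };

const plan = planPublish(sampleTamsToSources, sampleSchedule);
console.log(plan);
```

Since the analysis itself runs asynchronously, `plan.aTAMs` would only be passed to iarest.publishResultsForDataSources(projectRID, plan.aTAMs, callback) once the listed executions have completed.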
staleAnalysisCallback
This callback is invoked as the result of an IA REST API call to determine which data sources have not been refreshed within a provided time period.
Type: Function
Parameters
- errorMessage (string): any error message, or null if no errors
- projectRID (string): the RID of the Information Analyzer project
- aDataSources (Array<string>): an array of entries with HOST||DB.SCHEMA.TABLE for database tables and HOST||PATH:FILE for data files, only for those that are stale
dataSourceListCallback
This callback is invoked as the result of an IA REST API call to retrieve a list of data sources within a project.
Type: Function
Parameters
- errorMessage (string): any error message, or null if no errors
- projectRID (string): the RID of the Information Analyzer project
- aDataSources (Array<string>): an array of entries with HOST||DB.SCHEMA.TABLE for database tables and HOST||PATH:FILE for data files
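The two entry formats, HOST||DB.SCHEMA.TABLE and HOST||PATH:FILE, can be split into their parts with plain string handling. This is a minimal sketch: `parseDataSourceEntry` is a hypothetical helper, and it assumes that only file entries contain a ':' after the '||' and that database, schema and table names contain no '.'.

```javascript
// Hypothetical helper: splits one entry of the form HOST||DB.SCHEMA.TABLE or HOST||PATH:FILE.
// Assumptions: an entry whose remainder contains ':' is a file, and table entries
// consist of exactly three '.'-separated parts.
function parseDataSourceEntry(entry) {
  const separatorIndex = entry.indexOf('||');
  if (separatorIndex === -1) {
    throw new Error('Unexpected data source entry: ' + entry);
  }
  const host = entry.substring(0, separatorIndex);
  const rest = entry.substring(separatorIndex + 2);

  const colonIndex = rest.lastIndexOf(':');
  if (colonIndex !== -1) {
    return {
      type: 'file',
      host: host,
      path: rest.substring(0, colonIndex),
      file: rest.substring(colonIndex + 1)
    };
  }

  const parts = rest.split('.');
  if (parts.length !== 3) {
    throw new Error('Expected DB.SCHEMA.TABLE but got: ' + rest);
  }
  return { type: 'table', host: host, database: parts[0], schema: parts[1], table: parts[2] };
}

// Made-up sample entries in the documented formats
console.log(parseDataSourceEntry('HOSTA||SALES.PUBLIC.ORDERS'));
console.log(parseDataSourceEntry('HOSTA||/data/in:orders.csv'));
```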
Project
Project class -- for handling Information Analyzer projects
getProjectDoc
Retrieve the Project document
setDescription
Set the description of the project
Parameters
desc
addTable
Add the specified table to the project
Parameters
- datasource (string): the database name
- schema (string)
- table (string)
- aColumns (Array<string>): array of column names
addFile
Add the specified file to the project
Parameters
- datasource (string): the host name?
- folder (string): the full path to the file
- file (string): the name of the file
- aFields (Array<string>): array of field names within the file
ColumnAnalysis
ColumnAnalysis class -- for handling Information Analyzer column analysis tasks
constructor
Parameters
- project (Project): the project in which to create the column analysis task
- analyzeColumnProperties (boolean): whether or not to analyze column properties
- captureResultsType (string): specifies the type of frequency distribution results that are written to the analysis database: "CAPTURE_NONE", "CAPTURE_ALL", "CAPTURE_N"
- minCaptureSize (int): the minimum number of results that are written to the analysis database, including both typical and atypical values
- maxCaptureSize (int): the maximum number of results that are written to the analysis database
- analyzeDataClasses (boolean): whether or not to analyze data classes
setSampleOptions
Use to (optionally) set any sampling options for the column analysis
Parameters
- type (string): the sampling type: "random", "sequential", "every_nth"
- size (number): if less than 1.0, the percentage of values to use in the sample; otherwise the maximum number of records in the sample. If you use the "random" type of data sample, specify the sample size that is the same number as the number of records that will be in the result, based on the value that you specify in the Percent field. Otherwise, the results might be skewed.
- seed (string): if type is "random", this value is used to initialize the random generators (two samplings that use the same seed value will contain the same records)
- step (int): if type is "every_nth", this value indicates the step to apply (one row will be kept out of every nth value rows)
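The way size, seed and step depend on type can be spelled out as a small check run before calling setSampleOptions. This is a minimal sketch: `describeSample` is a hypothetical helper that only interprets the arguments as documented above and does not call the library; the requirement that size be positive and step at least 1 is an assumption.

```javascript
const SAMPLE_TYPES = ['random', 'sequential', 'every_nth'];

// Hypothetical helper: validates sampling arguments and reports how they would be interpreted.
// Assumptions: size must be a positive number and step a positive integer.
function describeSample(type, size, seed, step) {
  if (SAMPLE_TYPES.indexOf(type) === -1) {
    throw new Error('type must be one of ' + SAMPLE_TYPES.join(', '));
  }
  if (typeof size !== 'number' || !(size > 0)) {
    throw new RangeError('size must be a positive number');
  }

  const description = { type: type };
  if (size < 1.0) {
    description.fraction = size;   // below 1.0: share of values to sample
  } else {
    description.maxRecords = size; // otherwise: maximum number of records in the sample
  }
  if (type === 'random' && seed !== undefined) {
    description.seed = seed;       // only meaningful for "random"
  }
  if (type === 'every_nth') {
    if (!Number.isInteger(step) || step < 1) {
      throw new RangeError('step must be a positive integer for "every_nth"');
    }
    description.step = step;       // only meaningful for "every_nth"
  }
  return description;
}

console.log(describeSample('random', 0.1, '42'));
console.log(describeSample('every_nth', 1000, undefined, 10));
```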
setEngineOptions
Use to (optionally) set any engine options to use when running the column analysis
Parameters
- retainOSH (boolean): whether to retain the generated DataStage job or not
- retainData (boolean): whether to retain generated data sets (ignored when data rules are running)
- config (string): specifies an alternative configuration file to use with the DataStage engine during this run
- gridEnabled (string): whether or not the grid view will be enabled
- requestedNodes (string): the name of requested nodes
- minNodes (string): the minimum number of nodes you want in the analysis
- partitionsPerNode (string): the number of partitions for each node in the analysis
setJobOptions
Use to (optionally) set any job options to use when running the column analysis
Parameters
- debugEnabled (boolean): whether to generate a debug table containing the evaluation results of all functions and tests contained in the expression (only used for running data rules)
- numDebuggedRecords (int): how many rows should be debugged, if debugEnabled is "true"
- arraySize (int): the size of the array (?)
- autoCommit (boolean)
- isolationLevel (int)
- updateExistingTables (boolean): whether to update existing tables in IADB or create new ones (only used for column analysis)
addColumn
Use to add a column to the column analysis task -- both table and column can be '*' to specify all tables or all columns
Parameters
addFileField
Use to add a file field to the column analysis task -- column can be '*' to specify all fields within the file
Parameters
- connection (string): e.g. "HDFS"
- path (string): directory path, not including the filename
- filename (string)
- column (string): name of the field within the file
- hostname (string?)
PublishResults
PublishResults class -- for handling Information Analyzer results publishing tasks
constructor
Parameters
- project (Project): the project from which to publish analysis results
addTable
Use to add a table whose results should be published -- the table can be '*' to specify all tables
Parameters
addFile
Use to add a file whose results should be published -- file can be '*' to specify all files
Parameters