2.4.1 • Published 3 years ago

@mydiem/diem-util v2.4.1

License
MIT

DIEM

Python, Spark, REST, Scala, Pipelines, Scheduling, API, Custom Jobs, SQL Statements, OpenShift, Cloud Native, Machine Learning, SendGrid, Kubernetes, Slack, Cloud Object Storage, JDBC, Box

Diem can be used to create, display, execute and maintain data transfers between hardware and database platforms. Transfers can be created, managed and assigned to a schedule so that they execute regularly without human intervention.

Diem provides a front end for Spark ETL (Extract, Transform, Load): an SQL data pipeline that can be used to synchronize data between RDBMS platforms. The pipeline is composed of individual transfer operations called jobs. Each job executes SQL statements that select data from a source system and insert or replicate it on a target system.
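The select-then-insert shape of a transfer job can be illustrated with a minimal sketch. This is not Diem's implementation or API: the `transfer` function, the table names and the use of two in-memory SQLite databases as stand-ins for the source and target systems are all assumptions made for illustration.

```python
import sqlite3


def transfer(source, target, select_sql, insert_sql, batch_size=1000):
    """Copy the rows returned by select_sql on source into target.

    Hypothetical helper for illustration; returns the number of rows copied.
    """
    cursor = source.execute(select_sql)
    copied = 0
    while True:
        # Read in batches so large result sets are never fully held in memory.
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        target.executemany(insert_sql, rows)
        copied += len(rows)
    target.commit()
    return copied


# Two in-memory SQLite databases stand in for the source and target systems.
source = sqlite3.connect(":memory:")
target = sqlite3.connect(":memory:")

source.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
source.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "Ada"), (2, "Grace"), (3, "Edsger")],
)
source.commit()
target.execute("CREATE TABLE customers_copy (id INTEGER, name TEXT)")

copied = transfer(
    source,
    target,
    "SELECT id, name FROM customers WHERE id > 1",
    "INSERT INTO customers_copy VALUES (?, ?)",
    batch_size=2,
)
print(copied)
```

In a real deployment the two connections would point at different database platforms (for example through JDBC), but the job keeps the same two-statement structure.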

Diem allows the user to create scripts in Python, an interpreted programming language, and to create sophisticated schedules using cron (a job scheduler for Unix systems). The combination of Python and cron, along with the built-in ability to define and execute custom SQL statements, supports a range of activities from simple data transfers to more sophisticated job streams.
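To make the scheduling idea concrete, here is a small sketch of how a five-field cron expression can be checked against a point in time. It is a simplified illustration, not Diem's scheduler: `field_matches` and `cron_matches` are hypothetical names, only `*`, single values, ranges, lists and `/step` on `*` or a range are handled, and all five fields must match (classic cron treats day-of-month and day-of-week slightly differently when both are restricted).

```python
from datetime import datetime


def field_matches(field, value, low, high):
    """Return True if value satisfies one cron field such as '*/15', '1-5' or '0,30'."""
    for part in field.split(","):
        rng, _, step_text = part.partition("/")
        step = int(step_text) if step_text else 1
        if rng == "*":
            start, end = low, high
        elif "-" in rng:
            start, end = (int(x) for x in rng.split("-"))
        else:
            start = end = int(rng)
        if start <= value <= end and (value - start) % step == 0:
            return True
    return False


def cron_matches(expr, when):
    """Check a 'minute hour day-of-month month day-of-week' expression against a datetime."""
    minute, hour, dom, month, dow = expr.split()
    # Python counts Monday as 0; cron counts Sunday as 0.
    cron_dow = (when.weekday() + 1) % 7
    return (
        field_matches(minute, when.minute, 0, 59)
        and field_matches(hour, when.hour, 0, 23)
        and field_matches(dom, when.day, 1, 31)
        and field_matches(month, when.month, 1, 12)
        and field_matches(dow, cron_dow, 0, 6)
    )


# "30 2 * * 1-5" means 02:30 on Monday through Friday.
weekday_run = cron_matches("30 2 * * 1-5", datetime(2024, 1, 1, 2, 30))   # a Monday
weekend_run = cron_matches("30 2 * * 1-5", datetime(2024, 1, 6, 2, 30))   # a Saturday
print(weekday_run, weekend_run)
```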

Diem also allows quick and easy definition of connections, and provides a scheduler view and a log display. An interface to Slack can be used to send the results of jobs to a specified Slack channel.
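As a rough illustration of reporting a job result to Slack, the sketch below builds the kind of request a Slack incoming webhook accepts: a JSON body with a `text` field sent by HTTP POST. The `build_slack_request` function, the message wording and the webhook URL are placeholders, not part of Diem; the request is constructed but deliberately not sent.

```python
import json
import urllib.request


def build_slack_request(webhook_url, job_name, status, detail=""):
    """Build (but do not send) a POST request for a Slack incoming webhook.

    Hypothetical helper; the message format is an assumption for illustration.
    """
    text = f"Job {job_name} finished with status {status}"
    if detail:
        text += f": {detail}"
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder URL; a real webhook URL is issued by Slack for a specific channel.
request = build_slack_request(
    "https://hooks.slack.com/services/PLACEHOLDER",
    "nightly_sync",
    "Completed",
    "42 rows copied",
)
print(request.data.decode("utf-8"))
# Sending would be: urllib.request.urlopen(request)
```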

Application Features

| Feature | Feature Summary | Benefits |
| --- | --- | --- |
| Spaces | Support for Multiple Organisations | Multiple organisations can make use of DIEM, and each org can have its own space. You can even have multiple spaces per org and use them for test, pre-prod or production. |
| Data Transfer | NodePy | Fast transfer of small data sets (<100 k) using pandas, JDBC and SQLAlchemy. |
| Data Transfer | Spark | Bulk transfer of big data using Spark, with both PySpark and Scala. Partition your data for parallel inserts. Write your SQL online and easily manage your job. Include it in a pipeline. Get notified via Slack or mail. |
| Custom Code | Write your own Python code | Write your own code using PySpark or Python. Integrate your favorite library, use your JDBC connection, and integrate your config maps, code snippets and webhooks all in one place, creating a unique experience. |
| API Services | REST services for external use | Create jobs that can provide REST services. Connect external applications to your code and provide REST services for them. |
| Machine Learning | Embed Machine Learning in your code | Make use of the latest ML libraries like SciPy, matplotlib, seaborn, pandas etc. to create machine learning models that can be used in your code. |
| Connections | DB2, Netezza, PostgreSQL, many more | JDBC connections into various sources, easy to add and manage. Secrets kept secure if personal. |
| Webhooks | Bring in your own webhook | Webhooks can be used to integrate into your applications. You can bring in your Git or Slack webhook and use it in your applications. |
| Slack | Slack Integration | Either use the default Slack channels or bring in your own Slack API key. All job progress is logged to your Slack channels. You can even integrate them in your custom jobs. Provide custom content and subject messages. |
| Pipelines | Pipelines of Jobs | Group your jobs together to form a pipeline. Start each job at the same time or in order. Manage dependencies and organize them in steps. |
| Scheduling | Cron Schedule | Schedule your jobs to run using an advanced cron schedule that can handle any type of timeframe and schedule. |
| Mail | Mail Functionality | Send mail to your audience on completion or failure of your job. |
| Mail Integration | Mail Functionality for your code | Integrate mail functionality in your code and send data reports as HTML, CSV or XLS to your audience based on your query. Customize headers and body content. |
| Files | Upload, Download or integrate files | Each space is connected to its own Cloud Object Storage bucket, which can be integrated in your code. You can also specify any other COS instance. |
| Box | Upload, Download from Box | You can now directly download and upload files from Box. |
| Config Maps | Manage parameters and config values | Config maps are very useful as you can separate your code from its values. They can be kept private and secure, so you can use them for storing your own tokens. |
| Tags | Define your own tags | You can set up your own tags for easy job search, classification and job management. |
| Templates | Reusable or shared Templates | Your code can be based on a template that you can clone from. You can also have shared code that is the same across your jobs and differs only in configuration. |
| Code Snippets | Reusable and sharable code | Create reusable code and share it across your jobs. This allows you to reuse your code in multiple jobs while maintaining key code centrally. |
| Job Log | Audit trails of completed jobs | Each started job has its own audit trail, so you can go back to view errors and integrate it in your reporting for performance review. |
| Organization | Organization Profile | View your profile and your access rights in the organisation. |
| Organizations | Organizations you belong to | See all organisations you belong to. |
| Space Selector | Easy move between spaces | You can at any time easily switch between the organisations you belong to. |
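
The Pipelines feature describes grouping jobs into steps and managing their dependencies. One way to picture this, assuming nothing about how Diem resolves dependencies internally, is a topological ordering in which jobs with no unmet dependencies share a step and can start at the same time. The `plan_steps` function and the job names below are hypothetical; the sketch uses Python's standard `graphlib` module (Python 3.9+).

```python
from graphlib import TopologicalSorter


def plan_steps(dependencies):
    """Group jobs into ordered steps.

    dependencies maps each job name to the set of jobs it must wait for.
    Jobs in the same step have no unmet dependencies and could run in parallel.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if jobs depend on each other in a loop
    steps = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        steps.append(ready)
        sorter.done(*ready)
    return steps


# Hypothetical pipeline: two loads run together, then a join, then a report.
pipeline = {
    "load_customers": set(),
    "load_orders": set(),
    "join_orders": {"load_customers", "load_orders"},
    "send_report": {"join_orders"},
}

steps = plan_steps(pipeline)
for number, jobs in enumerate(steps, start=1):
    print(number, jobs)
```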