Data Wrangling for Extracting More Value from Data
The client is a digital health and therapy device development company. Its therapy products are portable, rechargeable, efficient, safe and user-friendly devices that have none of the side effects associated with anti-hypertension drugs.
The client used a traditional ETL tool to ingest and prepare the raw data for all equipment (as opposed to users) and to generate consumer statistics from it. In the hybrid environment, XML data was loaded into HDFS as text and then transformed for business cases. The older technology prevented users from producing self-service reports. Complex use cases had longer delivery cycles and always depended on big data developers for any changes or new additions.
The goal was to provide the end users and their caregivers with historical personalized insights to make hypertension reduction and BP monitoring manageable on a mobile app.
The proposed solution had to meet the following requirements:
- Familiar, intuitive and easy for analysts to use.
- Support a large number of active, concurrent users.
- Provide flexibility to work with a variety of data types, without constraining data shape.
- Provide instant previews of data state.
- Self-document transformation steps.
- Enable collaboration between SMEs.
- Offer scalability and faster time to deliver.
The team used Trifacta’s data wrangler to implement the complex use case and to enable self-service data wrangling for all future changes or new additions. The data wrangler provided a platform that could load XML files into HDFS (saved as text) and move the data from the raw zone through the discovery zone to the shared zone for users. The following transformations could be applied using the wrangler:
- Split unstructured data into columns.
- Extract list to create a column containing an array of products and timestamps.
- Date difference to determine the length of product usage time.
- Unnest the arrays.
- Deduplicate records.
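The transformations listed above can be sketched outside the wrangler in plain Python. This is a minimal illustration, not the client's actual recipe: the pipe-delimited record layout, the `product@start>end` event syntax, and the names `RAW_LINES`, `EVENT_PATTERN` and `wrangle` are all assumptions made for the example, and the steps are applied in a slightly different order than listed (unnesting happens before the date difference).

```python
import re
from datetime import datetime

# Hypothetical raw lines, standing in for XML that was saved to HDFS as text.
# Assumed layout: device_id|user_id|product@start>end;product@start>end
RAW_LINES = [
    "DEV-001|user-17|cuff@2021-03-01T08:00:00>2021-03-01T08:12:30;"
    "band@2021-03-01T09:00:00>2021-03-01T09:05:00",
    "DEV-002|user-42|cuff@2021-03-02T07:30:00>2021-03-02T07:45:00",
    "DEV-002|user-42|cuff@2021-03-02T07:30:00>2021-03-02T07:45:00",  # duplicate
]

# One match per usage event: (product, start timestamp, end timestamp).
EVENT_PATTERN = re.compile(r"(\w+)@([^>;]+)>([^;]+)")


def wrangle(lines):
    """Return one deduplicated row per product usage event."""
    rows = []
    seen = set()
    for line in lines:
        # Split unstructured data into columns.
        device_id, user_id, events_text = line.strip().split("|", 2)

        # Extract list: an array of products and timestamps.
        events = EVENT_PATTERN.findall(events_text)

        # Unnest the array: one output row per event.
        for product, start, end in events:
            # Date difference: length of product usage in minutes.
            delta = datetime.fromisoformat(end) - datetime.fromisoformat(start)
            minutes = delta.total_seconds() / 60

            row = (device_id, user_id, product, start, end, minutes)

            # Deduplicate records, keeping the first occurrence.
            if row not in seen:
                seen.add(row)
                rows.append(row)
    return rows


if __name__ == "__main__":
    for row in wrangle(RAW_LINES):
        print(row)
```

Keeping each step as a separate, commented statement mirrors how a wrangling recipe documents its own transformation steps, which was one of the stated requirements.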
This self-service data preparation lets users cope with the growing complexity of data while remaining fully compliant with data security and data governance requirements.
Faster time to deliver
Faster data standardization
Faster data transformation
The data wrangler is an intelligent tool that lets users prepare data visually and interactively. It offers intuitive, suggestion-driven models that users can apply immediately.