Data Engineering White Paper: Flat File Load in Vertica Database with Dissimilar Patterns of Record and Header Layout

Learn about SQL-only consumption and integration of CSV files with mixed data layouts in Vertica.

About the Product

Vertica is a SQL analytics database built to manage rapidly growing volumes of data in a data warehouse. CSV files received from internal upstream systems or external vendor systems often need to be cleansed, consumed, integrated, and reported on in the data warehouse. Integration becomes considerably more complex when an incoming CSV file mixes dissimilar record and header layouts.

This white paper presents a practical, SQL-only approach to consuming and integrating CSV files with dissimilar data layouts in the Vertica analytics database, without user-defined functions, shell scripts, or other programming scripts.

What will you learn in this white paper?

Business scenario
Header row specification
- Header row in CSV format
- Header row in tabular format
Data row specification
- Standard data row in CSV format
- Detail data row in CSV format
- Data row in tabular format
Sample data files
Business requirement
- Load the data files above, merge header row information into the data rows, and store the result in a raw table in the integration schema
- Load the final integration table with clean customer data: update existing records and insert new ones
Prerequisites
- Copy files to a folder on the database server
- Create schema in the database
- Create staging and integration tables
Load and transform staging raw table
- Load file data
- Replace nulls, double quotes, non-breaking spaces, and the folder name in the file path with an empty string
- Split fields based on comma delimiter and trim
- Copy standard header row to standard data row based on file name join
- Copy detail header row to detail data row based on file name join
- Delete header rows
- Preview of loaded table data
Load integration raw table
- Preview of loaded table data
Load integration master table
- Update existing records based on the key column
- Insert new records based on new key values
- Preview of loaded table data
Scope for enhancement: adding reject logic and reject reprocessing
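
The staging and integration steps in the outline above can be sketched end to end. This is a minimal illustration and not the SQL from the white paper: SQLite (through Python's sqlite3 module) stands in for Vertica so the sketch can run anywhere, and the table names, column names, file path, and the H/D row-type markers are all hypothetical. SQLite has no built-in field-splitting function, so a small split_part helper is registered here only to keep the SQL readable; in Vertica, the bulk load and the built-in string functions would presumably take the place of the Python insert and this helper.

```python
import sqlite3

def split_part(text, delim, n):
    """Stand-in helper (assumption): return the n-th (1-based) field, or ''."""
    parts = text.split(delim)
    return parts[n - 1] if 0 < n <= len(parts) else ""

conn = sqlite3.connect(":memory:")
conn.create_function("split_part", 3, split_part)

# Hypothetical staging and integration tables.
conn.executescript("""
CREATE TABLE stg_raw (file_name TEXT, raw_line TEXT, row_type TEXT,
                      f1 TEXT, f2 TEXT, batch_date TEXT);
CREATE TABLE int_customer (customer_id TEXT PRIMARY KEY,
                           customer_name TEXT, batch_date TEXT);
INSERT INTO int_customer VALUES ('C1', 'Old Name', '2023-12-31');
""")

# Hypothetical file: one header row (H) with a batch date, then data rows (D).
lines = [
    ("/data/in/cust_a.csv", 'H,"2024-01-15"'),
    ("/data/in/cust_a.csv", 'D,"C1",\u00a0"Alice" '),
    ("/data/in/cust_a.csv", 'D,"C2","Bob"'),
]
conn.executemany("INSERT INTO stg_raw (file_name, raw_line) VALUES (?, ?)", lines)

conn.executescript("""
-- Cleanse: drop double quotes and non-breaking spaces, strip the folder name.
UPDATE stg_raw
SET raw_line  = REPLACE(REPLACE(raw_line, '"', ''), char(160), ''),
    file_name = REPLACE(file_name, '/data/in/', '');

-- Split on the comma delimiter and trim each field.
UPDATE stg_raw
SET row_type = TRIM(split_part(raw_line, ',', 1)),
    f1       = TRIM(split_part(raw_line, ',', 2)),
    f2       = TRIM(split_part(raw_line, ',', 3));

-- Copy header information onto the data rows of the same file.
UPDATE stg_raw
SET batch_date = (SELECT h.f1 FROM stg_raw h
                  WHERE h.file_name = stg_raw.file_name AND h.row_type = 'H')
WHERE row_type = 'D';

-- Header rows are no longer needed once their values are copied.
DELETE FROM stg_raw WHERE row_type = 'H';

-- Update existing customers by key, then insert customers with new keys.
UPDATE int_customer
SET customer_name = (SELECT s.f2 FROM stg_raw s
                     WHERE s.f1 = int_customer.customer_id),
    batch_date    = (SELECT s.batch_date FROM stg_raw s
                     WHERE s.f1 = int_customer.customer_id)
WHERE customer_id IN (SELECT f1 FROM stg_raw);

INSERT INTO int_customer (customer_id, customer_name, batch_date)
SELECT s.f1, s.f2, s.batch_date FROM stg_raw s
WHERE NOT EXISTS (SELECT 1 FROM int_customer c WHERE c.customer_id = s.f1);
""")

staged = conn.execute("SELECT COUNT(*) FROM stg_raw").fetchone()[0]
rows = conn.execute("SELECT * FROM int_customer ORDER BY customer_id").fetchall()
print(staged, rows)
```

In this sketch the existing customer C1 is updated in place and C2 is inserted as a new record, both carrying the batch date taken from the file's header row. A file with several header layouts would need one copy step per layout, as the outline suggests for standard and detail rows.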

Requirements

You should have a basic understanding of databases and SQL
You should be familiar with data engineering concepts
You should be working, or planning to work, as an ETL developer or data engineer in an OLTP or data warehouse environment

Authors

School

DataPad OÜ's School

One-time Fee

$25
What's Included

File Size: 533K

Pages: 10

Language: English

Level: Advanced

Skills: Data Transportation, Vertica, Data Massaging, Batch Integration, Data Engineering, CSV File, White Paper

Age groups: 25-34 years, 35-44 years, 45-54 years, 55-64 years
